fix(k8s): add scaleUp/scaleDown stabilization to schemas HPA #844

Merged
argoyle merged 1 commits from tune-hpa-behavior into main 2026-05-21 17:10:19 +00:00
Owner

Even with the schema cache (#841) and startup warmup (#843), residual CPU bursts (metrics-server occasionally samples a request mid-flight) were enough to trip a brief scale-up, after which the default 5min scaleDown stabilization pinned the deployment at maxReplicas long after the spike had subsided.

Tune both directions

  • scaleUp.stabilizationWindowSeconds: 120 — a transient spike must persist for two consecutive minutes before any pod is added. Brief metric anomalies no longer move replicas.
  • scaleUp policy: add at most 1 pod per 60s. Smooths reaction.
  • scaleDown.stabilizationWindowSeconds: 120 (default 300) — once the workload calms, return to minReplicas faster.
  • scaleDown policy: remove at most 1 pod per 60s. Avoids thundering-herd scale-down.

Net effect: replicas now spend most of the time at minReplicas (2) and only grow under sustained load.

Even with the schema cache (#841) and startup warmup (#843), residual CPU bursts (metrics-server occasionally samples a request mid-flight) were enough to trip a brief scale-up, after which the default 5min scaleDown stabilization pinned the deployment at maxReplicas long after the spike had subsided. ## Tune both directions - `scaleUp.stabilizationWindowSeconds: 120` — a transient spike must persist for two consecutive minutes before any pod is added. Brief metric anomalies no longer move replicas. - `scaleUp` policy: add at most 1 pod per 60s. Smooths reaction. - `scaleDown.stabilizationWindowSeconds: 120` (default 300) — once the workload calms, return to minReplicas faster. - `scaleDown` policy: remove at most 1 pod per 60s. Avoids thundering-herd scale-down. Net effect: replicas now spend most of the time at minReplicas (2) and only grow under sustained load.
argoyle added 1 commit 2026-05-21 16:56:25 +00:00
fix(k8s): add scaleUp/scaleDown stabilization to schemas HPA
schemas / vulnerabilities (pull_request) Successful in 2m2s
schemas / check (pull_request) Successful in 2m42s
schemas / check-release (pull_request) Successful in 2m59s
pre-commit / pre-commit (pull_request) Successful in 6m18s
schemas / build (pull_request) Successful in 5m47s
schemas / deploy-prod (pull_request) Has been skipped
eefab471d6
Even with the schema cache and startup warmup, residual CPU bursts
(metrics-server occasionally samples a request mid-flight) were enough
to trip a brief scale-up, after which the default 5min scaleDown
stabilization pinned the deployment at maxReplicas long after the
spike had subsided.

Tune both directions:

- scaleUp.stabilizationWindowSeconds: 120 — a transient spike must
  persist for two consecutive minutes before any pod is added.
  Brief metric anomalies no longer move replicas.
- scaleUp policy: add at most 1 pod per 60s. Smooths reaction.
- scaleDown.stabilizationWindowSeconds: 120 (default 300) — once
  the workload calms, return to minReplicas faster.
- scaleDown policy: remove at most 1 pod per 60s. Avoids
  thundering-herd scale-down.
argoyle scheduled this pull request to auto merge when all checks succeed 2026-05-21 16:56:38 +00:00
argoyle merged commit 111c2e4b19 into main 2026-05-21 17:10:19 +00:00
argoyle deleted branch tune-hpa-behavior 2026-05-21 17:10:20 +00:00
Sign in to join this conversation.
No Reviewers
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: unboundsoftware/schemas#844