fix(k8s): add scaleUp/scaleDown stabilization to schemas HPA #844
Even with the schema cache (#841) and startup warmup (#843), residual CPU bursts (metrics-server occasionally samples a request mid-flight) were enough to trip a brief scale-up. The default 5-minute (300s) scaleDown stabilization then pinned the deployment at maxReplicas long after the spike had subsided.
Tune both directions
- scaleUp.stabilizationWindowSeconds: 120. A spike must persist for two consecutive minutes before any pod is added. Brief metric anomalies no longer move replicas.
- scaleUp policy: add at most 1 pod per 60s. Smooths reaction.
- scaleDown.stabilizationWindowSeconds: 120 (default 300). Once the workload calms, return to minReplicas faster.
- scaleDown policy: remove at most 1 pod per 60s. Avoids thundering-herd scale-down.

Net effect: replicas now spend most of the time at minReplicas (2) and only grow under sustained load.
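
For reference, the four settings above would look roughly like this in an autoscaling/v2 HorizontalPodAutoscaler spec. This is a sketch reconstructed from the description, not the actual diff: only the behavior values and minReplicas come from this PR, and the rest of the manifest (name, target, metrics, maxReplicas) is deliberately left out because it is not stated here.

```yaml
# Fragment of an autoscaling/v2 HorizontalPodAutoscaler spec (sketch, not the real diff).
spec:
  minReplicas: 2                        # stated in this PR
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 120   # Kubernetes default is 0 (react immediately)
      policies:
        - type: Pods
          value: 1                      # add at most 1 pod...
          periodSeconds: 60             # ...per 60s
    scaleDown:
      stabilizationWindowSeconds: 120   # Kubernetes default is 300
      policies:
        - type: Pods
          value: 1                      # remove at most 1 pod...
          periodSeconds: 60             # ...per 60s
```

With a single policy per direction, selectPolicy can stay at its default (Max), since there is only one candidate to choose from.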