perf(graph): cache merged SDL and SchemaUpdate per ref #841

Merged

argoyle merged 1 commit from precompute-schema-cache into main

2026-05-19 07:51:51 +00:00


Author: argoyle
SHA1: d652c1e446

perf(graph): cache merged SDL and SchemaUpdate per ref

schemas / vulnerabilities (pull_request) Successful in 2m8s
schemas / check (pull_request) Successful in 3m5s
schemas / check-release (pull_request) Successful in 5m14s
pre-commit / pre-commit (pull_request) Successful in 6m55s
schemas / build (pull_request) Successful in 5m44s
schemas / deploy-prod (pull_request) Has been skipped

Both Supergraph and LatestSchema resolvers recomputed their result on
every request. The work is non-trivial:

- Supergraph: sdlmerge.MergeSDLs() runs AST validation + normalization
  + custom merge walkers over all subgraph SDLs.
- LatestSchema: CosmoGenerator.Generate() shells out to wgc router
  compose (Node via npx), spending 100-300m (millicores) of CPU per call.

Because the output is fully determined by the set of subgraph SDLs and
their lastUpdate timestamp, the result can be cached and reused across
requests until a SubGraphUpdated event bumps the lastUpdate for the
(orgId, ref) key.

Add two precomputation caches to cache.Cache, both versioned by the
existing lastUpdate map so a single timestamp comparison invalidates
stale entries implicitly:

- mergedSDLs: cached MergeSDLs output for Supergraph
- schemaUpdates: cached SchemaUpdate (subgraphs + cosmo config) for
  LatestSchema

The UpdateSubGraph debounce already computes the cosmo config to
publish through PubSub; it now also stores the SchemaUpdate so the
next LatestSchema query is warm. OrganizationRemoved evicts both
caches alongside lastUpdate.

This eliminates the per-request CPU bursts that were tripping the
HPA into TooManyReplicas territory.

2026-05-19 09:37:43 +02:00
