Both Supergraph and LatestSchema resolvers recomputed their result on
every request. The work is non-trivial:
- Supergraph: sdlmerge.MergeSDLs() runs AST validation, normalization,
  and custom merge walkers over all subgraph SDLs.
- LatestSchema: CosmoGenerator.Generate() shells out to wgc router
  compose (Node via npx), spending 100-300m (millicores) of CPU per
  call.
Because the output is fully determined by the set of subgraph SDLs and
their lastUpdate timestamp, the result can be cached and reused across
requests until a SubGraphUpdated event bumps the lastUpdate for the
(orgId, ref) key.
Add two precomputation caches to cache.Cache, both versioned by the
existing lastUpdate map so a single timestamp comparison invalidates
stale entries implicitly:
- mergedSDLs: cached MergeSDLs output for Supergraph
- schemaUpdates: cached SchemaUpdate (subgraphs + cosmo config) for
  LatestSchema
The UpdateSubGraph debounce already computes the cosmo config to
publish through PubSub; it now also stores the SchemaUpdate so the
next LatestSchema query is warm. OrganizationRemoved evicts both
caches alongside lastUpdate.
This eliminates the per-request CPU bursts that were tripping the
HPA into TooManyReplicas territory.