Cache Layer Architecture
Poorly defined cache boundaries are the root cause of most stale-data bugs and excess network traffic in SaaS frontends: components fetch redundantly, mutations propagate to some views but not others, and memory grows unbounded in long-lived SPAs. This page, a sub-topic within State Architecture & Cache Fundamentals, details how to construct a normalized cache layer that separates server state from transient UI state, configures framework adapters consistently, and defines explicit lifecycle rules that scale as your data graph grows. If you are first evaluating where cache state should live, read Client vs Server State Boundaries before continuing here; for the specific decision between React Query and Redux, see React Query vs Redux for Server State.
Diagnostic checklist
You are in the right place if you observe any of these symptoms:
- The same API endpoint is called multiple times per page render with identical parameters.
- A mutation updates the database but sibling components continue displaying stale data.
- Browser memory climbs steadily over a user session without a visible leak in component code.
- Optimistic UI updates flicker or snap back inconsistently after server confirmation.
- Invalidating one resource accidentally clears unrelated cached queries.
- Hydration mismatches appear during SSR because the client re-fetches data the server already had.
Prerequisites
Before implementing the patterns below you should be comfortable with:
- Reference vs Value Storage Models — understanding when the cache holds a pointer to a shared entity vs a deep-copied payload is essential before choosing normalization depth.
- Normalization Principles for UI — the difference between a flat entity table and a nested response shape, and why denormalization should happen at render time, not at storage time.
- TanStack Query v5 QueryClient API (staleTime, gcTime, structuralSharing, setQueryData, invalidateQueries) or Apollo Client v3 InMemoryCache (cache.modify, cache.identify, readFragment/writeFragment).
Data-flow overview
The diagram below models how a SaaS frontend cache layer sits between the network and the component tree. API responses are normalized into a flat entity store before they reach any component; components read denormalized views reconstructed at render time.
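As a rough illustration of that round trip, the sketch below normalizes a response into a flat store and rebuilds the ordered view on read. It assumes every entity carries a string id; the helper names normalize and denormalize are illustrative and do not belong to any library.

```typescript
interface Entity {
  id: string;
}

interface NormalizedList<T> {
  ids: string[];
  entities: Record<string, T>;
}

// Write path: flatten the response into an id-keyed table plus an ordered id list.
function normalize<T extends Entity>(items: T[]): NormalizedList<T> {
  const entities: Record<string, T> = {};
  for (const item of items) {
    entities[item.id] = item;
  }
  return { ids: items.map((item) => item.id), entities };
}

// Read path: rebuild the ordered array a component renders, skipping ids
// whose entity is missing from the table.
function denormalize<T>(list: NormalizedList<T>): T[] {
  const view: T[] = [];
  for (const id of list.ids) {
    const entity = list.entities[id];
    if (entity !== undefined) {
      view.push(entity);
    }
  }
  return view;
}
```

The ordered ids array is what preserves list sequencing; the entities table is what lets two views share one copy of an entity.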
Implementation: Normalized Entity Graph
Raw API responses must be transformed into a keyed entity graph before components consume them. This prevents duplicated copies of the same entity across views and ensures that a single setQueryData call on the shared query key propagates to every component that reads the same entity.
Steps
- Extract unique identifiers. Parse each item in the response for its primary key (id, uuid, or a composite like ${type}:${id}) before writing to the cache.
- Build a flat lookup table. Store entities as Record<string, Entity> and maintain a parallel ordered string[] of IDs to preserve list sequencing without re-sorting on every render.
- Decouple component reads from payload shape. Components should receive normalized references and reconstruct their denormalized view via a select transform at render time, not at write time.
// queryClient.ts — TanStack Query v5
import { QueryClient } from '@tanstack/react-query';
export interface NormalizedList<T> {
ids: string[];
entities: Record<string, T>;
}
export function normalizeList<T extends { id: string }>(
items: T[]
): NormalizedList<T> {
const ids: string[] = [];
const entities: Record<string, T> = {};
for (const item of items) {
ids.push(item.id);
entities[item.id] = item;
}
return { ids, entities };
}
export const queryClient = new QueryClient({
defaultOptions: {
queries: {
// 5-minute stale window: safe for most SaaS entity lists
staleTime: 1000 * 60 * 5,
// 30-minute garbage-collection window for inactive queries
gcTime: 1000 * 60 * 30,
// Skip re-render when the returned reference is structurally equal
structuralSharing: true,
},
},
});
// useProjects.ts — apply normalization via the select option
import { useQuery } from '@tanstack/react-query';
import { normalizeList, NormalizedList } from './queryClient';
interface Project { id: string; name: string; status: 'active' | 'archived'; }
export function useProjects() {
return useQuery<Project[], Error, NormalizedList<Project>>({
queryKey: ['projects'],
queryFn: () => fetch('/api/projects').then(r => r.json()),
// select runs after every successful fetch and after cache reads.
// TanStack memoizes it: if the raw data reference is unchanged,
// select does not re-run and the previous normalized map is returned.
select: normalizeList,
});
}
Cache Behavior Impact: structuralSharing: true (the v5 default) compares incoming JSON-compatible data with the previous snapshot before committing it to the cache and reuses every reference that is deeply equal. If the server response is structurally identical to the last snapshot, TanStack Query keeps the previous reference and no subscriber re-renders. The select transform receives the raw cached value, not the normalized one, so structural sharing operates on the raw API response. Because select is memoized on that raw reference, an unchanged response also avoids re-normalization on focus-triggered background refetches.
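To make the reference-reuse idea concrete, here is a simplified illustration of structural sharing over plain JSON values. It is not TanStack Query's implementation; the function name shareStructure and the Json type are assumptions made for this sketch, and it ignores edge cases such as class instances.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function isPlainObject(value: Json): value is { [key: string]: Json } {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Returns `prev` when `next` is deeply equal; otherwise returns a new value
// that reuses every unchanged sub-tree of `prev`.
function shareStructure(prev: Json, next: Json): Json {
  if (prev === next) return prev;

  if (Array.isArray(prev) && Array.isArray(next)) {
    const merged = next.map((item, i) =>
      i < prev.length ? shareStructure(prev[i], item) : item
    );
    const unchanged =
      prev.length === next.length && merged.every((item, i) => item === prev[i]);
    return unchanged ? prev : merged;
  }

  if (isPlainObject(prev) && isPlainObject(next)) {
    const prevKeys = Object.keys(prev);
    const nextKeys = Object.keys(next);
    const merged: { [key: string]: Json } = {};
    let reused = 0;
    for (const key of nextKeys) {
      const hasPrev = Object.prototype.hasOwnProperty.call(prev, key);
      merged[key] = hasPrev ? shareStructure(prev[key], next[key]) : next[key];
      if (hasPrev && merged[key] === prev[key]) reused++;
    }
    const unchanged = prevKeys.length === nextKeys.length && reused === prevKeys.length;
    return unchanged ? prev : merged;
  }

  // Primitives that differ, or values whose shape changed: take the new value.
  return next;
}
```

The recursive walk is also why the cost grows with payload size: every node of the new response is visited once.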
Configuration Trade-offs
- staleTime vs data volatility. A 5-minute staleTime is safe for static entity lists but wrong for real-time inventory or presence data. Set staleTime: 0 for those queries and use refetchInterval or a WebSocket channel alongside.
- gcTime and memory pressure. A 30-minute gcTime means inactive queries remain in memory for 30 minutes. In SPAs with hundreds of unique query keys, this can accumulate to tens of MB. Profile with queryClient.getQueryCache().getAll().length in development to detect key sprawl.
- Composite key overhead. Normalization requires consistent ID extraction. A backend that returns id for some resources and _id for others silently breaks the lookup graph. Enforce a mapping layer at the API boundary.
- structuralSharing cost on large payloads. The deep equality pass runs synchronously on the main thread. For payloads exceeding ~5,000 objects, disable structuralSharing on those specific queries and rely on mutation-driven invalidation instead.
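A mapping layer for the id versus _id case could be as small as the sketch below. The function name withCanonicalId is hypothetical, and the rule that id wins when both fields are present is an assumption of this example rather than anything a backend guarantees.

```typescript
// Hypothetical boundary mapper: every record leaves the API layer with a
// string `id`, regardless of whether the backend sent `id` or `_id`.
function withCanonicalId(
  raw: Record<string, unknown>
): Record<string, unknown> & { id: string } {
  const candidate = raw['id'] ?? raw['_id'];
  if (typeof candidate !== 'string' && typeof candidate !== 'number') {
    // Fail loudly instead of writing an entity under the key "undefined".
    throw new Error('Entity has no usable id or _id field');
  }
  // Drop `_id` so downstream code only ever sees one identifier field.
  const { _id, ...rest } = raw;
  return { ...rest, id: String(candidate) };
}
```

Running every response through a mapper like this before normalization keeps the lookup table keyed consistently.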
Implementation: Framework Adapter Configuration
Both TanStack Query and Apollo Client expose low-level configuration that most teams leave at defaults long after those defaults become a liability. Explicit adapter configuration standardizes retry behaviour, deduplication windows, and fetch cancellation across every query in the application.
Steps
- Standardize the fetch function signature. Wrap fetch or axios in a single adapter that handles auth headers, request cancellation, and non-2xx error throwing uniformly. Pass the adapter to QueryClient via queryFn defaults or to Apollo via HttpLink.
- Configure exponential back-off with jitter. Retry only on network failures and 5xx errors; fail immediately on 4xx. Spread retry delay with jitter to prevent thundering-herd reconnect bursts.
- Set deduplication and window consistency. In TanStack Query, concurrent calls with identical query keys share one in-flight request automatically — verify you have not accidentally diverged query keys for the same resource.
// fetchAdapter.ts — shared fetch wrapper for TanStack Query queryFn
export class ApiError extends Error {
constructor(public status: number, message: string) {
super(message);
this.name = 'ApiError';
}
}
export async function apiFetch<T>(
endpoint: string,
signal?: AbortSignal
): Promise<T> {
const response = await fetch(`/api${endpoint}`, {
signal,
headers: {
'Content-Type': 'application/json',
// In production, inject auth token from a secure store, not here.
},
});
if (!response.ok) {
// Throw on every non-2xx status; the retry policy on the QueryClient decides by status code whether to retry
throw new ApiError(response.status, `API error: ${response.status}`);
}
return response.json() as Promise<T>;
}
// queryClient.ts — configure retry policy
import { QueryClient } from '@tanstack/react-query';
import { ApiError } from './fetchAdapter';
export const queryClient = new QueryClient({
defaultOptions: {
queries: {
staleTime: 1000 * 60 * 5,
gcTime: 1000 * 60 * 30,
structuralSharing: true,
// Retry only on non-4xx errors, up to 3 attempts with exponential back-off
retry: (failureCount, error) => {
if (error instanceof ApiError && error.status < 500) return false;
return failureCount < 3;
},
retryDelay: (attempt) =>
Math.min(1000 * 2 ** attempt + Math.random() * 200, 30_000),
},
mutations: {
retry: 0, // Never auto-retry mutations: side effects are not idempotent
},
},
});
Cache Behavior Impact: TanStack Query’s request deduplication operates at the query key level: if two components mount simultaneously and both trigger a ['projects'] query, only one HTTP request fires. The second subscriber attaches to the in-flight promise via the internal QueryObserver mechanism. Configuring retry as a function gives you per-error-type control — ApiError with status < 500 short-circuits the retry loop entirely, so 401 and 403 responses surface to the UI immediately rather than after three delayed retries.
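The retry policy can be exercised in isolation, which makes the back-off schedule easier to reason about. The sketch below restates the same rules as two pure functions; the names shouldRetry and retryDelayMs are illustrative, and the injectable random parameter is an assumption added here so the jitter can be pinned down when checking values.

```typescript
// Same decision rule as the retry option: never retry below 500,
// otherwise allow up to three retries. `status` is undefined for
// network failures that never produced a response.
function shouldRetry(failureCount: number, status: number | undefined): boolean {
  if (status !== undefined && status < 500) return false;
  return failureCount < 3;
}

// Same schedule as the retryDelay option: exponential growth from 1 s,
// up to 200 ms of jitter, capped at 30 s.
function retryDelayMs(attempt: number, random: () => number = Math.random): number {
  return Math.min(1000 * 2 ** attempt + random() * 200, 30_000);
}
```

With the jitter fixed at zero, attempts 0, 1, and 2 wait 1000 ms, 2000 ms, and 4000 ms respectively; the cap only matters once the exponent pushes the delay past 30 seconds.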
Configuration Trade-offs
- Retry on mutations. Setting retry > 0 on mutations risks duplicate side effects (a payment charge, a sent email) if the server processed the request but the response was lost in transit. Keep mutations.retry: 0 as a hard rule.
- AbortSignal propagation. Passing signal from TanStack Query's queryFn argument to fetch enables automatic request cancellation when a component unmounts mid-fetch. Without it, the network request runs to completion even though no component is waiting for it, wasting bandwidth and keeping server threads occupied.
- Shared QueryClient vs per-route instances. A single QueryClient mounted at the application root deduplicates correctly. Multiple instances (e.g., one per page in a micro-frontend setup) cannot share cache entries; this is the most common cause of redundant fetches after navigation.
- HttpLink vs BatchHttpLink in Apollo. BatchHttpLink reduces round-trips by merging operations, but increases perceived latency for individual queries because each request waits for the batch window to close. Prefer HttpLink with Apollo's built-in deduplication unless you have measured network overhead that justifies batching.
Implementation: Cache Lifecycle and Invalidation Boundaries
Once the entity graph is normalized and the adapter is configured, you need explicit rules for when data ages out and how mutations trigger targeted refreshes. The two failure modes here are over-invalidation (resetting the entire cache on every mutation) and under-invalidation (relying solely on TTL expiry and missing server-side changes).
Steps
- Implement stale-while-revalidate windows. Serve the cached entity immediately while a background refetch runs. Tune staleTime per resource volatility tier: 0 s for real-time, 60 s for frequently-edited entities, 300 s for reference data.
- Scope invalidation by resource prefix. After a mutation, call invalidateQueries({ queryKey: ['projects'] }) rather than invalidateQueries() to flush only the affected resource family. For cross-resource dependencies (e.g., a team mutation that affects project membership), invalidate both keys in a single Promise.all.
- Apply optimistic updates with rollback. Set the new entity state immediately on mutation dispatch; roll back to the pre-mutation snapshot on failure using the onError context pattern.
// useUpdateProject.ts: TanStack Query v5 optimistic mutation
import { useMutation, useQueryClient } from '@tanstack/react-query';
import { ApiError } from './fetchAdapter';
interface Project { id: string; name: string; status: 'active' | 'archived'; }
interface UpdateProjectInput { id: string; status: 'active' | 'archived'; }
export function useUpdateProject() {
  const queryClient = useQueryClient();
  return useMutation<Project, Error, UpdateProjectInput, { previous: Project[] | undefined }>({
    // Assumption: the API accepts PATCH /api/projects/:id with a JSON body; adjust the verb and path to your backend.
    mutationFn: async ({ id, status }) => {
      const response = await fetch(`/api/projects/${id}`, {
        method: 'PATCH',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ status }),
      });
      if (!response.ok) {
        throw new ApiError(response.status, `API error: ${response.status}`);
      }
      return response.json() as Promise<Project>;
    },
    // 1. Snapshot the current cache before mutating
    onMutate: async ({ id, status }) => {
      // Cancel any outgoing refetches so they don't overwrite our optimistic update
      await queryClient.cancelQueries({ queryKey: ['projects'] });
      // The cache holds the raw Project[]; select only normalizes what observers receive
      const previous = queryClient.getQueryData<Project[]>(['projects']);
      // 2. Apply the optimistic change to the cached list; select re-normalizes it for subscribers
      queryClient.setQueryData<Project[]>(['projects'], (old) =>
        old?.map((project) => (project.id === id ? { ...project, status } : project))
      );
      return { previous };
    },
    // 3. On server confirmation, invalidate to sync any server-side derived fields
    onSuccess: () => {
      queryClient.invalidateQueries({ queryKey: ['projects'] });
    },
    // 4. On failure, restore the snapshot
    onError: (_err, _vars, context) => {
      if (context?.previous) {
        queryClient.setQueryData(['projects'], context.previous);
      }
    },
  });
}
Cache Behavior Impact: cancelQueries issues an abort signal to any in-flight ['projects'] fetch via the AbortController associated with the query. This prevents a race condition where a background refetch resolves after the optimistic write and overwrites it with stale server data. The onMutate → onError → rollback pattern is the canonical TanStack Query v5 approach: context passes the pre-mutation snapshot through the mutation lifecycle without any global variable. When onSuccess fires invalidateQueries, TanStack Query marks the ['projects'] query stale and schedules a background refetch — components see the optimistic value until the refetch resolves, then receive the authoritative server state.
For tag-based invalidation systems in Apollo, the equivalent uses cache.modify to write the confirmed value directly into the normalized graph:
// Apollo Client v3 — targeted cache patch after mutation confirmation
import { useMutation, gql } from '@apollo/client';
const UPDATE_PROJECT = gql`
mutation UpdateProject($id: ID!, $status: String!) {
updateProject(id: $id, status: $status) {
id
status
}
}
`;
function useUpdateProjectApollo() {
return useMutation(UPDATE_PROJECT, {
// Optimistic response tells Apollo what the mutation result looks like
optimisticResponse: ({ id, status }) => ({
updateProject: { __typename: 'Project', id, status },
}),
// update patches the normalized InMemoryCache directly
update: (cache, { data }) => {
if (!data?.updateProject) return;
cache.modify({
id: cache.identify(data.updateProject),
fields: {
status: () => data.updateProject.status,
},
});
},
});
}
Cache Behavior Impact: cache.identify resolves the Apollo InMemoryCache's normalized key for the entity (Project:${id}) so cache.modify targets exactly that node. The optimistic response is written immediately to the normalized entry for Project:${id}, and the cache broadcasts the change to every active useQuery whose result includes that entity, without an additional network round-trip. On server confirmation, Apollo replaces the optimistic entry with the real response; on failure, it reverts to the pre-optimistic snapshot automatically.
For background refetch strategies and tuning revalidation intervals at scale, see the sibling topic on optimizing SWR revalidation intervals.
Configuration Trade-offs
- Memory pressure with long gcTime. A gcTime of 30 minutes means every inactive query key stays in the in-memory cache for 30 minutes after its last subscriber unmounts. In SPAs where users navigate across many resource types, measure total cache size in development: queryClient.getQueryCache().getAll().reduce((n, q) => n + JSON.stringify(q.state.data ?? '').length, 0).
- Race conditions during concurrent mutations. If two mutations target the same entity simultaneously (e.g., two users editing the same record, or a bulk-update operation running alongside a single-row edit), their optimistic writes will overwrite each other unless serialized via a mutation queue or version vector. TanStack Query does not serialize concurrent mutations automatically.
- Aggressive invalidation and waterfall refetches. Invalidating broad query key families (e.g., ['projects'] when only one project changed) triggers a refetch of every matching query. In a dashboard with 20 active queries under the projects key family, this creates 20 simultaneous network requests. Prefer surgical setQueryData patches over broad invalidateQueries wherever the mutation response includes the full updated entity.
- cancelQueries scope. cancelQueries({ queryKey: ['projects'] }) cancels all in-flight queries whose key starts with ['projects']. If you have nested key families (['projects', teamId]), the cancel will propagate to all of them. This is usually the right behaviour but can create subtle ordering issues if team-scoped queries have independent refetch schedules.
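One way to serialize same-entity mutations on the client is a per-entity promise chain. The sketch below is a minimal version of that idea, not a TanStack Query feature; enqueueMutation and the module-level tails map are names invented for this example, and it only orders mutations issued from one browser tab.

```typescript
// Tail of the pending chain for each entity id.
const tails = new Map<string, Promise<unknown>>();

// Runs `task` only after every previously enqueued task for the same
// entity has settled. A failed task does not block the ones behind it.
function enqueueMutation<T>(entityId: string, task: () => Promise<T>): Promise<T> {
  const previous = tails.get(entityId) ?? Promise.resolve();
  const run = previous.catch(() => undefined).then(task);
  tails.set(entityId, run);

  // Drop the map entry once this task is the last one and has settled,
  // so the map does not grow with every entity ever edited.
  const cleanup = () => {
    if (tails.get(entityId) === run) tails.delete(entityId);
  };
  run.then(cleanup, cleanup);

  return run;
}
```

A mutationFn could wrap its request as enqueueMutation(id, () => sendUpdate(id, input)), where sendUpdate stands in for whatever request function the application already has. Conflicts between different users still need server-side handling such as version checks.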
Common Pitfalls & Resolutions
| Observable Issue | Root Cause | Diagnostic Resolution |
|---|---|---|
| Identical API requests firing for every component mount | Multiple components sharing the same logical resource are using divergent query keys (e.g., ['project', id] vs ['projects', id]) | Audit query keys with queryClient.getQueryCache().getAll().map(q => q.queryKey) in development; enforce a query key factory to guarantee key consistency |
| UI flickers back to stale state after a mutation succeeds | onSuccess calls invalidateQueries before the optimistic write; or a background refetch resolves between onMutate and onSuccess without cancelQueries | Add await queryClient.cancelQueries in onMutate; verify onSuccess fires invalidateQueries after server confirmation, not alongside it |
| Memory footprint grows across user sessions | gcTime is set too high (or at Infinity) for queries that accumulate unique keys (e.g., per-user-action query keys) | Set gcTime to session-appropriate durations and use a query key factory that limits cardinality; monitor with getQueryCache().getAll().length |
| Apollo cache modifications not reflecting in a sibling component | The sibling component's query result references a different entity key or uses a fetchPolicy that bypasses the normalized cache | Confirm cache.identify resolves correctly; ensure the sibling query uses cache-first or cache-and-network; check that __typename is present in all responses |
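The query key factory mentioned in the first row can be a plain object of key builders. The sketch below shows one possible shape; the name projectKeys and the 'list' and 'detail' segments are conventions chosen for this example, not anything TanStack Query requires.

```typescript
type ProjectFilters = { status?: 'active' | 'archived' };

// Single source of truth for every key in the projects family.
// All keys share the ['projects'] prefix, so prefix-based invalidation
// and cancellation reach lists and details alike.
const projectKeys = {
  all: ['projects'] as const,
  list: (filters: ProjectFilters = {}) =>
    [...projectKeys.all, 'list', filters] as const,
  detail: (id: string) => [...projectKeys.all, 'detail', id] as const,
};

// Intended usage (shown as comments to keep the sketch dependency-free):
//   useQuery({ queryKey: projectKeys.detail(id), queryFn: ... })
//   queryClient.invalidateQueries({ queryKey: projectKeys.all })
```

Because components never spell keys by hand, a typo such as ['project', id] versus ['projects', id] cannot split one resource into two cache entries.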
Frequently Asked Questions
How do I handle cache invalidation for deeply nested relational data without flushing the whole cache?
Use entity-level cache modification — cache.modify in Apollo or setQueryData in TanStack Query — to patch specific nodes by their normalized ID. Scope invalidation to the parent resource ID; the normalized lookup table propagates changes to every consuming component without a full cache reset. Only fall back to invalidateQueries when the server response includes derived fields that cannot be reconstructed client-side.
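For a cache entry that stores the ids/entities shape described on this page, an entity-level patch might look like the following sketch. patchEntity is a hypothetical helper; it could serve as the updater passed to setQueryData, assuming the cached value really is the normalized shape.

```typescript
interface NormalizedList<T> {
  ids: string[];
  entities: Record<string, T>;
}

// Returns a new list in which only the targeted entity is replaced.
// Every other entity, and the ids array, keep their original references,
// so components reading untouched entities have nothing to re-render for.
function patchEntity<T extends object>(
  list: NormalizedList<T>,
  id: string,
  patch: Partial<T>
): NormalizedList<T> {
  const current = list.entities[id];
  if (current === undefined) {
    return list; // unknown id: leave the cache entry untouched
  }
  return {
    ...list,
    entities: { ...list.entities, [id]: { ...current, ...patch } },
  };
}
```

The patch is immutable on purpose: returning a new object for the changed entity is what lets reference comparisons detect exactly which node changed.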
When should I prefer reference storage over value storage in the frontend cache?
Prefer reference storage whenever two or more components consume the same entity. Reference storage ensures a single mutation propagates atomically across the entire UI tree and eliminates duplicate memory consumption from deep-cloned payloads. Value storage is appropriate only for ephemeral, component-local state that never needs to be shared (e.g., a form draft that has not yet been submitted).
What staleTime is appropriate for SaaS dashboards that need near-real-time data?
Set staleTime: 0 for actively-viewed real-time metrics, combined with refetchInterval: 10_000 or a WebSocket push channel for critical updates. For entity detail pages that a user is actively editing, use staleTime: 0 plus refetchOnWindowFocus: false to prevent background refetches from overwriting in-progress edits. Static reference data (countries, plan tiers, feature flags) can tolerate staleTime: 1000 * 60 * 60 (1 hour) without impacting user experience.
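These tiers can be captured once and reused, so individual hooks do not each pick their own numbers. The sketch below is one way to lay that out; the tier names and the FreshnessPolicy type are assumptions of this example, while staleTime, refetchInterval, and refetchOnWindowFocus are the query options named above.

```typescript
interface FreshnessPolicy {
  staleTime: number;
  refetchInterval?: number;
  refetchOnWindowFocus?: boolean;
}

type FreshnessTier = 'realTimeMetrics' | 'activeEdit' | 'staticReference';

// Values mirror the guidance in the answer above.
const freshness: Record<FreshnessTier, FreshnessPolicy> = {
  // Always stale, polled every 10 seconds.
  realTimeMetrics: { staleTime: 0, refetchInterval: 10_000 },
  // Always stale, but never refetched just because the tab regained focus.
  activeEdit: { staleTime: 0, refetchOnWindowFocus: false },
  // Reference data that tolerates an hour of staleness.
  staticReference: { staleTime: 1000 * 60 * 60 },
};
```

A hook would then spread the tier into its options, for example useQuery({ queryKey, queryFn, ...freshness.activeEdit }).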
How does normalization affect SSR hydration performance?
Normalizing on the server before serialization (e.g., before calling dehydrate for TanStack Query or client.extract() for Apollo) reduces payload size and shifts the graph-reconstruction work to where it is cheapest: the server. On the client, HydrationBoundary restores the pre-built normalized map directly into the QueryClient without a second parse pass. This prevents React hydration mismatches caused by structural differences between the server-rendered snapshot and the client-side initial fetch. If you cannot normalize on the server, ensure the client-side select transform is deterministic, so that the same raw data always produces a structurally identical normalized result during hydration.
Related
- State Architecture & Cache Fundamentals — the parent topic covering the full spectrum of frontend cache design, from client/server state separation through normalization and synchronization.
- Client vs Server State Boundaries — how to decide which state belongs in the query cache vs local component state vs a global store, with concrete decision criteria for SaaS applications.
- Reference vs Value Storage Models — understanding entity reference graphs vs deep-copied value payloads, which underpins the normalization approach on this page.
- Stale-While-Revalidate Implementation — a detailed recipe for configuring background refetch windows in TanStack Query and SWR so users always see an instant response without sacrificing data freshness.
- Tag-Based Invalidation Systems — how to group cache entries by resource type and invalidate them surgically after mutations, avoiding the over-invalidation trap covered in the pitfalls table above.
- Mutation Sync & Rollback — advanced patterns for handling concurrent mutations, conflict resolution, and rollback in collaborative SaaS UIs.