Nested Data Flattening Techniques
Deeply nested API payloads cause three compounding problems in client-side caches: structural duplication (the same entity stored in multiple query keys), cascade invalidation failures (updating a leaf node doesn’t reach its duplicates), and diffing overhead (React Query’s structuralSharing must walk the full tree on every revalidation). This page shows how to solve all three with deterministic flattening pipelines that convert hierarchical JSON into a flat entity map before it ever enters the query cache.
This is a direct sub-topic of Data Normalization & Query Key Design, which establishes the architectural rationale for normalization boundaries. If you need to reconstruct nested views from an already-flat cache, see Relationship Stitching in Cache. For the specific challenge of GraphQL __typename-keyed payloads, see Flattening Deeply Nested GraphQL Responses.
Diagnostic Checklist
You need a flattening pipeline if you observe any of these:
- The same entity (e.g. a User object) appears in two different query caches, and mutating one does not update the other
- queryClient.invalidateQueries triggers a refetch, but the updated value doesn’t surface in all components until a second invalidation
- React DevTools profiler shows re-renders on components whose data did not logically change
- Payload logs show the same id under multiple nested paths, e.g. order.customer.id === comment.author.id
- Cache memory grows unbounded during navigation: previously-fetched entities accumulate as isolated snapshots rather than merging into shared records
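To check the duplicate-ID symptom without reading payload logs by hand, a small diagnostic can walk a response and report every ID that occurs under more than one path. The sketch below is a hypothetical helper (findDuplicateIds is not part of any library); it assumes the identity field is named id and that IDs are strings or numbers.

```typescript
// Hypothetical diagnostic helper: maps each duplicated ID to the paths where it occurs.
type IdPaths = Map<string, string[]>;

function findDuplicateIds(payload: unknown, idKey = 'id'): IdPaths {
  const seen: IdPaths = new Map();
  const visited = new WeakSet<object>();
  const stack: Array<{ node: unknown; path: string }> = [
    { node: payload, path: 'root' },
  ];

  while (stack.length > 0) {
    const { node, path } = stack.pop()!;
    if (node === null || typeof node !== 'object') continue; // skip primitives
    if (visited.has(node)) continue; // cycle guard
    visited.add(node);

    if (!Array.isArray(node)) {
      const id = (node as Record<string, unknown>)[idKey];
      if (typeof id === 'string' || typeof id === 'number') {
        const paths = seen.get(String(id)) ?? [];
        paths.push(path);
        seen.set(String(id), paths);
      }
    }
    // Descend into every child, including array slots and ID-less wrappers.
    for (const [key, child] of Object.entries(node)) {
      stack.push({ node: child, path: `${path}.${key}` });
    }
  }

  // Keep only the IDs that were found under more than one path.
  const duplicates: IdPaths = new Map();
  seen.forEach((paths, id) => {
    if (paths.length > 1) duplicates.set(id, paths);
  });
  return duplicates;
}
```

Because the sketch keys on the bare ID, two different entity types that happen to share an ID value are reported as duplicates too; treat the output as a list of candidates to inspect.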
Prerequisites
Before implementing a flattening pipeline, confirm that you understand:
- How select transforms data in TanStack Query v5: the raw response is stored internally; select output is what components receive. Normalization belongs in select, not in queryFn.
- Entity Mapping Strategies: the schema contracts that govern which fields constitute an entity’s identity key.
- Relationship Stitching in Cache: once data is flat, stitching reconstructs the relational graph at read time; understand both sides before you flatten.
- structuralSharing in React Query: enabled by default, it prevents re-renders by returning the previous object reference when data hasn’t changed by value. Your flattened output must produce stable references across revalidations.
Data Flow: Nested Payload to Flat Entity Map
The diagram below shows how a deeply nested API response moves through the flattening pipeline before entering the React Query cache. Each arrow represents a transformation boundary where normalization occurs.
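As a concrete illustration of that transformation, the sketch below places a nested payload next to the flat entity map the pipeline aims to produce. The order, customer, and line-item values are hypothetical; only the shape matters.

```typescript
// Hypothetical order payload as it might arrive from an API.
const nestedOrder = {
  id: 'o1',
  total: 40,
  customer: { id: 'u42', name: 'Ada' },
  lineItems: [
    { id: 'li1', sku: 'A-100' },
    { id: 'li2', sku: 'B-200' },
  ],
};

// Target shape: every object with its own ID becomes a top-level entry,
// and the place where it used to be nested holds only its ID string.
const flatOrder = {
  rootRef: 'o1',
  entities: {
    o1: { id: 'o1', total: 40, customer: 'u42', lineItems: ['li1', 'li2'] },
    u42: { id: 'u42', name: 'Ada' },
    li1: { id: 'li1', sku: 'A-100' },
    li2: { id: 'li2', sku: 'B-200' },
  },
};
```

The rootRef field records which entity the payload started from, so a consumer can find its entry point without knowing the original nesting.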
Implementation 1 — Iterative Payload Traversal
Recursive traversal is the natural first approach, but it risks call stack overflow on payloads that exceed ~500 nesting levels (lower on constrained runtimes such as Cloudflare Workers or React Native’s Hermes). An explicit stack on the heap eliminates that ceiling while giving you precise control over visit ordering and early termination.
Steps:
- Push the root payload onto a work stack as { node, path }.
- Pop each entry; skip primitives, null, and already-visited objects (cycle guard via WeakSet).
- Assign an entity key: node[idKey] if present, otherwise the joined path string.
- Shallow-copy the node into entities[entityKey]; replace any nested object child that carries its own idKey with that child’s ID string, then push the child onto the stack.
- For array children, map each item: push objects with IDs onto the stack and replace the slot with the ID string; leave primitives in place.
- Return { entities, rootRef }.
// normalizer.ts
export interface FlattenResult {
  entities: Record<string, Record<string, unknown>>;
  rootRef: string;
}

// An ID counts as an identity key only when it is a string or a number.
// Using one check everywhere keeps entity keys, references, and rootRef aligned.
const hasId = (node: Record<string, unknown>, idKey: string): boolean =>
  typeof node[idKey] === 'string' || typeof node[idKey] === 'number';

export function flattenPayload(
  payload: Record<string, unknown>,
  idKey = 'id',
): FlattenResult {
  const entities: Record<string, Record<string, unknown>> = {};
  const visited = new WeakSet<object>();
  const stack: Array<{ node: Record<string, unknown>; path: string[] }> = [
    { node: payload, path: ['root'] },
  ];
  while (stack.length > 0) {
    // Non-null assertion safe: loop condition guarantees length > 0
    const { node, path } = stack.pop()!;
    if (visited.has(node)) continue; // cycle guard
    visited.add(node);
    // Fall back to the positional path when the node has no usable ID
    const entityKey = hasId(node, idKey) ? String(node[idKey]) : path.join('.');
    const flattened: Record<string, unknown> = { ...node };
    for (const key of Object.keys(node)) {
      const val = node[key];
      if (val !== null && typeof val === 'object' && !Array.isArray(val)) {
        const child = val as Record<string, unknown>;
        if (hasId(child, idKey)) {
          // Replace nested object with its ID reference
          flattened[key] = String(child[idKey]);
          stack.push({ node: child, path: [...path, key] });
        }
      } else if (Array.isArray(val)) {
        flattened[key] = val.map((item, i) => {
          if (item !== null && typeof item === 'object' && hasId(item as Record<string, unknown>, idKey)) {
            const child = item as Record<string, unknown>;
            stack.push({ node: child, path: [...path, key, String(i)] });
            return String(child[idKey]);
          }
          // Primitives and ID-less items stay in place
          return item;
        });
      }
    }
    entities[entityKey] = flattened;
  }
  const rootKey = hasId(payload, idKey) ? String(payload[idKey]) : 'root';
  return { entities, rootRef: rootKey };
}
Cache Behavior Impact: flattenPayload runs synchronously inside select, after the raw response has been stored in React Query’s cache and before components receive the result. Every child object that carries its own ID becomes a top-level entry in entities, so a component reading entities["u42"] from this query’s result sees a single record for that user no matter how many places the payload nested it. structuralSharing can then short-circuit re-renders: if entities["u42"] was not touched by the latest revalidation, its reference is unchanged and consumers skip the render cycle.
Configuration Trade-offs:
- Setting staleTime: 0 marks data stale immediately, so every mount triggers a refetch, and every refetch that changes the data re-runs flattenPayload; set staleTime to match your API’s cache-control headers to avoid redundant traversals.
- WeakSet cycle detection is O(1) per node and adds no measurable overhead; skipping it to “save memory” is a false economy, because a single circular payload hangs the event loop.
- Copying with { ...node } is a shallow clone: nested objects without their own ID stay shared references to the raw payload rather than copies. This is intentional: values that are not entities don’t need deduplication, and copying them wastes heap.
- gcTime (formerly cacheTime) should be set to at least 2× staleTime so that background revalidations have a warm cache to merge into.
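The shallow-clone behaviour of object spread is easy to confirm in isolation. A minimal sketch with a hypothetical node, showing which parts are copied and which are shared:

```typescript
// Hypothetical node: 'meta' has no ID of its own, 'tags' is an array of primitives.
const rawNode = { id: 'o1', meta: { priority: 'high' }, tags: ['rush', 'gift'] };
const shallowCopy = { ...rawNode };

// Top-level fields are copied, so reassigning one leaves the original untouched.
shallowCopy.id = 'o1-copy';

// Nested values are not cloned by the spread: both variables point at the same objects.
const sharesMeta = shallowCopy.meta === rawNode.meta;
const sharesTags = shallowCopy.tags === rawNode.tags;
```

The practical consequence is that mutating a shared nested value in place would also change the raw payload, so treat flattened entities as read-only.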
Implementation 2 — React Query select Adapter
Applying normalization inside select keeps the raw server response intact in React Query’s internal cache (accessible via queryClient.getQueryData) while delivering a flat entity map to components. This is the safest integration point: it doesn’t intercept unrelated queries and is trivially testable in isolation.
Steps:
- Import flattenPayload into a custom hook.
- Pass the normalization call as the select option of useQuery.
- Do not include client-generated timestamps in the select return; derive time values from server fields or omit them.
- Set staleTime and gcTime based on your API’s actual freshness guarantee.
// useNormalizedQuery.ts
import { useQuery, type UseQueryOptions } from '@tanstack/react-query';
import { flattenPayload, type FlattenResult } from './normalizer';

type NormalizedResult = FlattenResult & { queryKey: readonly unknown[] };

export function useNormalizedQuery<TRaw extends Record<string, unknown>>(
  queryKey: readonly unknown[],
  queryFn: () => Promise<TRaw>,
  options?: Omit<UseQueryOptions<TRaw, Error, NormalizedResult>, 'queryKey' | 'queryFn' | 'select'>,
) {
  return useQuery<TRaw, Error, NormalizedResult>({
    queryKey,
    queryFn,
    select: (raw): NormalizedResult => {
      // select re-runs when the raw data reference changes or when the
      // select function's identity changes. This inline arrow is recreated
      // on every render, so wrap it in useCallback if flattening is costly.
      const result = flattenPayload(raw);
      return { ...result, queryKey };
    },
    staleTime: 1000 * 60 * 5, // 5 min; tune to API cache-control
    gcTime: 1000 * 60 * 15, // 15 min; 3× staleTime keeps warm cache
    ...options,
  });
}
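Cache Behavior Impact: React Query re-runs select when the cached raw data reference changes or when the select function’s identity changes. structuralSharing keeps the raw reference stable when a background refetch returns an identical response (same JSON by value), so a select with a stable identity (a module-level function, or one wrapped in useCallback) is skipped entirely; an inline arrow function is evaluated again on each render. Structural sharing is also applied to the select output, so components still receive the previous entities reference when nothing changed by value. By returning { ...result, queryKey } instead of { ...result, timestamp: Date.now() }, the output remains structurally stable across identical fetches.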
Configuration Trade-offs:
- staleTime controls how long data counts as fresh before a background refetch can fire; flattenPayload runs on every actual data change, not on every mount. Higher staleTime means fewer flattening passes at the cost of serving potentially stale entities.
- Passing structuralSharing: false disables React Query’s deep-equality check and causes select to run on every refetch regardless of whether data changed, a significant CPU overhead on large entity maps.
- If the same entity appears under multiple query keys (e.g. in a list and in a detail query), each will maintain its own flattened copy. To share a single live record, extract the entity into a separate ['entity', id] query key and use initialData from the list query to seed it; see Entity Mapping Strategies for the full pattern.
- Avoid running flattenPayload inside queryFn: that stores the normalized output as the raw cache value, which breaks structuralSharing for nested refetches and makes the raw response unreachable via getQueryData.
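The rule for when a selector is re-run can be illustrated with a small library-free model. This is not TanStack Query’s code: createSelectRunner is a hypothetical stand-in that assumes the selector is re-evaluated whenever the data reference or the selector’s own identity changes, and reuses the previous output otherwise.

```typescript
// Simplified, hypothetical model of selector caching. NOT the library's implementation.
function createSelectRunner<TData, TOut>() {
  let hasRun = false;
  let lastData: TData | undefined;
  let lastSelect: ((data: TData) => TOut) | undefined;
  let lastOut: TOut | undefined;

  return (data: TData, select: (data: TData) => TOut): TOut => {
    // Cache hit: same data reference AND same selector function identity.
    if (hasRun && data === lastData && select === lastSelect) {
      return lastOut as TOut;
    }
    // Otherwise run the selector and remember the inputs for next time.
    hasRun = true;
    lastData = data;
    lastSelect = select;
    lastOut = select(data);
    return lastOut;
  };
}
```

Under this model, passing a fresh arrow function on every call defeats the cache even when the data reference is unchanged, which is why an expensive flattening pass benefits from a selector with a stable identity.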
Implementation 3 — Relational Reference Maintenance
Flattening severs parent-child nesting, replacing child objects with ID strings. For most read paths this is sufficient: render components join entity IDs at display time. But when a component needs to traverse relationships in both directions — e.g. “all orders for this customer” — you need an inverse relationship map built during the traversal pass.
Steps:
- Extend flattenPayload to emit an inverseMap: Record<string, string[]> (childId → parentIds[]) alongside entities.
- During traversal, whenever you replace a child with its ID reference, record the parent’s key in the inverse map under inverseMap[childId].
- Expose the inverse map from select so components can read inverseMap["u42"] to get the keys of every entity that references that user, rather than scanning all orders.
// normalizer-with-inverse.ts
export interface FlattenWithInverseResult {
  entities: Record<string, Record<string, unknown>>;
  inverseMap: Record<string, string[]>; // childId → parentIds[]
  rootRef: string;
}

export function flattenWithInverse(
  payload: Record<string, unknown>,
  idKey = 'id',
): FlattenWithInverseResult {
  const entities: Record<string, Record<string, unknown>> = {};
  const inverseMap: Record<string, string[]> = {};
  const visited = new WeakSet<object>();
  const stack: Array<{
    node: Record<string, unknown>;
    path: string[];
    parentKey: string | null;
  }> = [{ node: payload, path: ['root'], parentKey: null }];
  while (stack.length > 0) {
    const { node, path, parentKey } = stack.pop()!;
    const entityKey =
      node[idKey] != null ? String(node[idKey]) : path.join('.');
    // Register inverse relationship (child → parent) before the cycle guard,
    // so a child object shared by several parents records each of them.
    if (parentKey !== null) {
      if (!inverseMap[entityKey]) inverseMap[entityKey] = [];
      if (!inverseMap[entityKey].includes(parentKey)) {
        inverseMap[entityKey].push(parentKey);
      }
    }
    if (visited.has(node)) continue; // cycle guard
    visited.add(node);
    const flattened: Record<string, unknown> = { ...node };
    for (const key of Object.keys(node)) {
      const val = node[key];
      if (val !== null && typeof val === 'object' && !Array.isArray(val)) {
        const child = val as Record<string, unknown>;
        if (child[idKey] != null) {
          // Replace nested object with its ID reference
          flattened[key] = String(child[idKey]);
          stack.push({ node: child, path: [...path, key], parentKey: entityKey });
        }
      } else if (Array.isArray(val)) {
        flattened[key] = val.map((item, i) => {
          if (item !== null && typeof item === 'object') {
            const child = item as Record<string, unknown>;
            if (child[idKey] != null) {
              stack.push({ node: child, path: [...path, key, String(i)], parentKey: entityKey });
              return String(child[idKey]);
            }
          }
          return item;
        });
      }
    }
    entities[entityKey] = flattened;
  }
  return {
    entities,
    inverseMap,
    rootRef: payload[idKey] != null ? String(payload[idKey]) : 'root',
  };
}
Cache Behavior Impact: The inverse map is rebuilt on every normalization pass, but because it is derived deterministically from the entity graph, structuralSharing can still detect when it hasn’t changed. If you need the inverse map to be stable across refetches that return identical data, memoize it with a deep-equality check or use useMemo in the consuming component. For patterns involving bidirectional foreign-key resolution at render time, see Relationship Stitching in Cache.
Configuration Trade-offs:
- Building inverse maps doubles the traversal constant factor: for payloads with 1000+ nodes, consider building the inverse map lazily (on first access) rather than eagerly on every normalization pass.
- When mutations update a child entity, both the direct entities[childId] entry and the inverseMap[childId] parent list may need updating. Forgetting the inverse map during optimistic updates causes components that navigate via inverseMap to show stale relationships.
- gcTime settings govern how long inverseMap survives in the select output after the query becomes inactive; align it with the main entity’s freshness window so relationship maps don’t outlive their entities.
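One way to defer the inverse-map cost until a component actually needs it is to derive the map from the flat entities on first access and cache the result. The sketch below is a hypothetical helper, not part of the pipeline above. It assumes the caller names the reference fields explicitly, because after flattening an ID reference is just a string and cannot be told apart from ordinary string data.

```typescript
type EntityMap = Record<string, Record<string, unknown>>;

// Hypothetical helper: builds childId → parentIds[] on first call, then reuses it.
function lazyInverseMap(
  entities: EntityMap,
  refFields: string[], // assumption: fields known to hold ID references
): () => Record<string, string[]> {
  let cached: Record<string, string[]> | null = null;

  return () => {
    if (cached !== null) return cached; // already built: no second pass

    const inverse: Record<string, string[]> = {};
    for (const [parentId, entity] of Object.entries(entities)) {
      for (const field of refFields) {
        const ref = entity[field];
        // A reference field holds either one ID or an array of IDs.
        const childIds: unknown[] = Array.isArray(ref) ? ref : [ref];
        for (const childId of childIds) {
          if (typeof childId !== 'string') continue; // missing or non-ID value
          const parents = inverse[childId] ?? [];
          if (!parents.includes(parentId)) parents.push(parentId);
          inverse[childId] = parents;
        }
      }
    }
    cached = inverse;
    return inverse;
  };
}
```

Calling the returned getter twice yields the same object reference, so consumers that never traverse relationships pay nothing and the rest pay once per entity map.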
Common Pitfalls & Resolutions
| Observable Issue | Root Cause | Diagnostic Resolution |
|---|---|---|
| Mutating a user in one query key doesn’t update the same user in a list query | Same entity stored under two isolated query keys; no shared flat map | Normalize both queries into the same entities registry and read from that registry in components, or move the entity to a dedicated ['entity', 'User', id] query key seeded by initialData from the list query |
| select runs on every refetch even when the server returns the same payload | structuralSharing disabled, or select returns a new object containing Date.now() | Remove client-generated timestamps from select output; ensure structuralSharing is not set to false in QueryClient defaults |
| Circular reference in payload causes the normalization loop to hang | Bidirectional API relationships (e.g. order → customer → orders) without cycle detection | Confirm WeakSet is tracking visited nodes; add a debug log before visited.add(node) to count visits per entity key and confirm no key is processed twice |
| Orphaned entities accumulate in the flat map after a mutation removes them | Normalization adds to entities on every fetch but never removes deleted records | Diff the previous entities against the new result in the select callback and remove keys absent from the latest response; or use entity-scoped gcTime to let stale entries expire |
| Array order changes trigger re-renders even though entity data is unchanged | Array of ID strings is a new reference on every traversal even if contents are identical | Sort the ID array before storing it, or use a custom structuralSharing function that compares arrays by sorted element equality |
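For the last row, an order-preserving alternative to sorting is to hand back the previous ID array whenever the new one has the same elements in the same positions. This is a swapped-in technique rather than the resolution listed in the table, and reuseIfEqual is a hypothetical helper name.

```typescript
// Hypothetical helper: keeps the old reference when the ID list is unchanged.
function reuseIfEqual(
  prev: readonly string[] | undefined,
  next: readonly string[],
): readonly string[] {
  const unchanged =
    prev !== undefined &&
    prev.length === next.length &&
    prev.every((id, i) => id === next[i]);
  // Same reference means reference-equality checks see "no change".
  return unchanged ? prev : next;
}
```

Unlike sorting, this keeps the server’s ordering intact, which matters when the order of the list is itself meaningful (for example a ranked feed).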
Frequently Asked Questions
When should I flatten at the network layer versus inside the React Query select callback?
Flatten inside select for query-specific normalization: it keeps the raw response intact in React Query’s internal cache (useful for debugging and for passing to other select transforms), and it scopes the normalization to the queries that need it. Use a global axios interceptor only when every API endpoint returns the same shape and you need normalized output across all caches simultaneously. Mixing both strategies — interceptor plus select — causes double-normalization: the select function receives an already-flattened payload and tries to flatten it again, producing incorrect entity keys and duplicate entries.
How do I handle missing IDs in deeply nested payloads?
Generate deterministic composite keys from the node’s position in the payload: concatenate parent entity ID + field name + array index, e.g. "o1.lineItems.0". Alternatively, hash the object’s stable scalar fields using a deterministic algorithm (e.g. FNV-1a on the sorted key-value pairs). The critical constraint is strict determinism across requests: if the same logical entity produces different keys on two fetches, React Query treats them as different records, leaving the old entry in the cache until gcTime expires and causing a flash of stale content.
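Both key strategies can be sketched in a few lines. The helper names below (compositeKey, fnv1a, scalarHashKey) are hypothetical. The hash is 32-bit FNV-1a applied to the string’s UTF-16 code units, which matches the byte-oriented algorithm for ASCII input only, and a 32-bit hash can collide on large datasets.

```typescript
// Positional key: parent entity ID + field name + optional array index.
function compositeKey(parentId: string, field: string, index?: number): string {
  return index === undefined
    ? `${parentId}.${field}`
    : `${parentId}.${field}.${index}`;
}

// 32-bit FNV-1a over UTF-16 code units (assumption: input is mostly ASCII).
function fnv1a(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime, mod 2^32
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Content key: hash the node's scalar fields in sorted key order,
// so property order in the payload does not affect the result.
function scalarHashKey(node: Record<string, unknown>): string {
  const parts = Object.keys(node)
    .sort()
    .filter((key) => ['string', 'number', 'boolean'].includes(typeof node[key]))
    .map((key) => `${key}=${String(node[key])}`);
  return fnv1a(parts.join('&')).toString(16).padStart(8, '0');
}
```

Positional keys change when an array is reordered, while content hashes change when any scalar field changes, so the better choice depends on which of the two is more stable in your API.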
Does including Date.now() in a select callback break structural sharing?
Yes. React Query’s structuralSharing compares the previous and next select output by deep value equality. A timestamp generated client-side will always differ, producing a new object reference on every revalidation — including background refetches that return the same server data. Every component subscribed to the query re-renders unnecessarily. Only include time values derived from the server response (e.g. the payload’s updatedAt field), never from Date.now() or new Date() inside select.
Can flattening payloads break optimistic updates?
No — it improves them. With a nested cache, an optimistic mutation must traverse the tree to find and patch the right node. With a flat entity map, queryClient.setQueryData(['entity', 'Order', 'o1'], updater) targets the entity directly. Rollback is equally precise: the onError callback in useMutation restores the pre-mutation snapshot via queryClient.setQueryData using the same key, without needing to know where in the original hierarchy the entity lived. This is covered in depth in Relationship Stitching in Cache.
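The snapshot, patch, and rollback sequence can be sketched against a plain Map standing in for the cache. EntityStore and applyOptimistic are hypothetical names; in a real application the same three steps would sit in the mutation’s lifecycle callbacks and go through queryClient.setQueryData.

```typescript
type Entity = Record<string, unknown>;
type EntityStore = Map<string, Entity>; // hypothetical stand-in for the query cache

// Applies a patch to one entity and returns a function that undoes it.
function applyOptimistic(store: EntityStore, id: string, patch: Entity): () => void {
  const snapshot = store.get(id); // undefined if the entity was not cached yet
  store.set(id, { ...snapshot, ...patch });

  return () => {
    // Rollback restores exactly what was there before the patch.
    if (snapshot === undefined) store.delete(id);
    else store.set(id, snapshot);
  };
}
```

Because the entity is addressed by its key alone, neither the patch nor the rollback needs to know where the entity sat in the original hierarchy.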
Related
- Data Normalization & Query Key Design: the parent topic covering the full architectural rationale for normalization boundaries, query key design, and cache synchronization strategy.
- Flattening Deeply Nested GraphQL Responses: applies the techniques on this page to __typename-keyed GraphQL payloads from Apollo Client, where polymorphic type routing requires entity-registry partitioning.
- Entity Mapping Strategies: schema contracts and DTO transformation boundaries that define which fields constitute an entity’s identity key, directly upstream of the flattening step.
- Relationship Stitching in Cache: the read-time complement to flattening, reconstructing relational object graphs from the flat entity map without issuing additional network requests.
- Pagination Normalization Patterns: extends flat entity maps to paginated list queries, covering cursor-based and offset pagination merge strategies without duplicating entity records across pages.