Nested Data Flattening Techniques

Deeply nested API payloads cause three compounding problems in client-side caches: structural duplication (the same entity stored in multiple query keys), cascade invalidation failures (updating a leaf node doesn’t reach its duplicates), and diffing overhead (React Query’s structuralSharing must walk the full tree on every revalidation). This page shows how to address all three with deterministic flattening pipelines that convert hierarchical JSON into a flat entity map before components ever read it.

This is a direct sub-topic of Data Normalization & Query Key Design, which establishes the architectural rationale for normalization boundaries. If you need to reconstruct nested views from an already-flat cache, see Relationship Stitching in Cache. For the specific challenge of GraphQL __typename-keyed payloads, see Flattening Deeply Nested GraphQL Responses.


Diagnostic Checklist

You need a flattening pipeline if you observe any of these:

  • The same entity (e.g. a User object) appears in two different query caches, and mutating one does not update the other
  • queryClient.invalidateQueries triggers a refetch, but the updated value doesn’t surface in all components until a second invalidation
  • React DevTools profiler shows re-renders on components whose data did not logically change
  • Payload logs show the same id under multiple nested paths — e.g. order.customer.id === comment.author.id
  • Cache memory grows unbounded during navigation: previously-fetched entities accumulate as isolated snapshots rather than merging into shared records
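The "same id under multiple nested paths" symptom can be checked mechanically. The sketch below is a hypothetical diagnostic helper (findDuplicateIds and the Json type are not part of any library) and assumes an acyclic, JSON-parsed payload whose identity field is named id.

```typescript
// Hypothetical diagnostic helper: reports every id found under more than one path.
// Assumes an acyclic, JSON-parsed payload, so no cycle guard is included.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function findDuplicateIds(payload: Json, idKey = 'id'): Map<string, string[]> {
  const pathsById = new Map<string, string[]>();
  const stack: Array<{ node: Json; path: string }> = [{ node: payload, path: '$' }];

  while (stack.length > 0) {
    const { node, path } = stack.pop()!;
    if (node === null || typeof node !== 'object') continue;

    if (Array.isArray(node)) {
      node.forEach((item, i) => {
        stack.push({ node: item, path: `${path}[${i}]` });
      });
      continue;
    }

    const id = node[idKey];
    if (typeof id === 'string' || typeof id === 'number') {
      const key = String(id);
      const paths = pathsById.get(key) ?? [];
      paths.push(path);
      pathsById.set(key, paths);
    }
    for (const key of Object.keys(node)) {
      stack.push({ node: node[key], path: `${path}.${key}` });
    }
  }

  // Keep only ids that were seen at two or more locations.
  return new Map(Array.from(pathsById).filter(([, paths]) => paths.length > 1));
}

const dupes = findDuplicateIds({
  order: { id: 'o1', customer: { id: 'u42', name: 'Jo' } },
  comment: { id: 'c7', author: { id: 'u42', name: 'Jo' } },
});
console.log(dupes.has('u42')); // true: u42 appears under both order and comment
```

Running a helper like this against logged payloads in development gives a quick list of entities that are worth normalizing first.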

Prerequisites

Before implementing a flattening pipeline, confirm that you understand:

  • How select transforms data in TanStack Query v5 — the raw response is stored internally; select output is what components receive. Normalization belongs in select, not in queryFn.
  • Entity Mapping Strategies — the schema contracts that govern which fields constitute an entity’s identity key.
  • Relationship Stitching in Cache — once data is flat, stitching reconstructs the relational graph at read time; understand both sides before you flatten.
  • structuralSharing in React Query — enabled by default, it prevents re-renders by returning the previous object reference when data hasn’t changed by value. Your flattened output must produce stable references across revalidations.
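To make the stable-reference requirement concrete, here is a deliberately simplified sketch of what structural sharing does. It is not TanStack Query's implementation; shareStructure and isPlainObject are hypothetical names, and only plain objects and arrays are handled.

```typescript
// Simplified illustration of structural sharing; NOT the library's implementation.
// Returns `prev` wherever `next` is deep-equal to it, and new containers elsewhere.
function shareStructure(prev: unknown, next: unknown): unknown {
  if (prev === next) return prev;

  const bothArrays = Array.isArray(prev) && Array.isArray(next);
  const bothPlain = isPlainObject(prev) && isPlainObject(next);
  if (!bothArrays && !bothPlain) return next;

  const prevRec = prev as Record<string, unknown>;
  const nextRec = next as Record<string, unknown>;
  const nextKeys = Object.keys(nextRec);
  const copy: Record<string, unknown> | unknown[] = bothArrays ? [] : {};
  let unchanged = Object.keys(prevRec).length === nextKeys.length;

  for (const key of nextKeys) {
    const merged = shareStructure(prevRec[key], nextRec[key]);
    (copy as Record<string, unknown>)[key] = merged;
    if (merged !== prevRec[key] || !(key in prevRec)) unchanged = false;
  }
  return unchanged ? prev : copy;
}

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

const prev = { entities: { u42: { id: 'u42', name: 'Jo' }, o1: { id: 'o1', total: 10 } } };
const next = { entities: { u42: { id: 'u42', name: 'Jo' }, o1: { id: 'o1', total: 12 } } };
const shared = shareStructure(prev, next) as typeof next;

console.log(shared.entities.u42 === prev.entities.u42); // true: unchanged entity keeps its reference
console.log(shared.entities.o1 === prev.entities.o1);   // false: changed entity gets a new object
```

This is why a flat entity map pairs well with structural sharing: a change to one entity replaces only that entry and its ancestors, while every other entity keeps its previous reference.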

Data Flow: Nested Payload to Flat Entity Map

The diagram below shows how a deeply nested API response moves through the flattening pipeline on its way from the network to the components that read it. Each arrow represents a transformation boundary where normalization occurs.

[Figure: Nested Payload Flattening Pipeline. ① Raw payload: API response { order: { id: "o1", customer: { id: "u42", name: "Jo" } } } → ② Traversal: flattenPayload() (iterative, WeakSet cycle detection) → ③ Entity map: entities { "o1": { id: "o1", customer: "u42" }, "u42": { id: "u42", name: "Jo" } } plus a rootRef → ④ Cache entry: select() output with O(1) lookups.]

Implementation 1 — Iterative Payload Traversal

Recursive traversal is the natural first approach, but it risks call stack overflow on very deep payloads. The exact depth limit depends on the runtime and on the size of each stack frame, and constrained runtimes such as Cloudflare Workers or React Native’s Hermes tend to hit it sooner. An explicit stack on the heap removes that ceiling while giving you precise control over visit ordering and early termination.
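As a minimal illustration of the explicit-stack idea, the sketch below measures the depth of a 100,000-level chain without recursion. buildChain and maxDepth are hypothetical helpers written for this example, and the input is assumed to be acyclic.

```typescript
type Nested = { child?: Nested };

// Builds a chain `levels` deep using a loop, so construction itself cannot overflow.
function buildChain(levels: number): Nested {
  const root: Nested = {};
  let cursor = root;
  for (let i = 1; i < levels; i++) {
    cursor.child = {};
    cursor = cursor.child;
  }
  return root;
}

// Depth measurement with a heap-allocated work stack instead of recursive calls.
// Assumes an acyclic input; a production traversal would add a WeakSet guard.
function maxDepth(root: object): number {
  let deepest = 0;
  const stack: Array<{ node: object; depth: number }> = [{ node: root, depth: 1 }];

  while (stack.length > 0) {
    const { node, depth } = stack.pop()!;
    if (depth > deepest) deepest = depth;

    for (const key of Object.keys(node)) {
      const value = (node as Record<string, unknown>)[key];
      if (value !== null && typeof value === 'object') {
        stack.push({ node: value, depth: depth + 1 });
      }
    }
  }
  return deepest;
}

console.log(maxDepth(buildChain(100000))); // 100000
```

The work stack here never holds more than a handful of entries for a chain, and for wider trees it grows on the heap, where the limit is available memory rather than call depth.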

Steps:

  1. Push the root payload onto a work stack as { node, path }.
  2. Pop each entry; skip primitives, null, and already-visited objects (cycle guard via WeakSet).
  3. Assign an entity key: node[idKey] if present, otherwise the joined path string.
  4. Shallow-copy the node into entities[entityKey]; replace any nested object child that carries its own idKey with that child’s ID string, then push the child onto the stack.
  5. For array children, map each item: push objects with IDs onto the stack and replace the slot with the ID string; leave primitives in place.
  6. Return { entities, rootRef }.
// normalizer.ts
export interface FlattenResult {
  entities: Record<string, Record<string, unknown>>;
  rootRef: string;
}

export function flattenPayload(
  payload: Record<string, unknown>,
  idKey = 'id',
): FlattenResult {
  const entities: Record<string, Record<string, unknown>> = {};
  const visited = new WeakSet<object>();
  const stack: Array<{ node: Record<string, unknown>; path: string[] }> = [
    { node: payload, path: ['root'] },
  ];

  while (stack.length > 0) {
    // Non-null assertion safe: loop condition guarantees length > 0
    const { node, path } = stack.pop()!;

    if (visited.has(node)) continue;
    visited.add(node);

    const entityKey =
      typeof node[idKey] === 'string' || typeof node[idKey] === 'number'
        ? String(node[idKey])
        : path.join('.');

    const flattened: Record<string, unknown> = { ...node };

    for (const key of Object.keys(node)) {
      const val = node[key];

      if (val !== null && typeof val === 'object' && !Array.isArray(val)) {
        const child = val as Record<string, unknown>;
        if (child[idKey] != null) {
          // Replace nested object with its ID reference
          flattened[key] = String(child[idKey]);
          stack.push({ node: child, path: [...path, key] });
        }
      } else if (Array.isArray(val)) {
        flattened[key] = val.map((item, i) => {
          if (item !== null && typeof item === 'object' && (item as Record<string, unknown>)[idKey] != null) {
            const child = item as Record<string, unknown>;
            stack.push({ node: child, path: [...path, key, String(i)] });
            return String(child[idKey]);
          }
          return item;
        });
      }
    }

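    // Merge rather than overwrite: the same ID may appear more than once with
    // different subsets of fields (e.g. a full record and a partial reference).
    entities[entityKey] = { ...entities[entityKey], ...flattened };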
  }

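  // Use the same ID test as the traversal so rootRef always matches the key
  // under which the root entity was stored.
  const rootId = payload[idKey];
  const rootKey =
    typeof rootId === 'string' || typeof rootId === 'number' ? String(rootId) : 'root';

  return { entities, rootRef: rootKey };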
}

Cache Behavior Impact: flattenPayload is synchronous and has no React Query dependency, so where you call it determines what the cache holds (Implementation 2 places it in select, which leaves the raw response in the cache). Every child object that carries its own ID becomes a top-level entry in entities, so within one result a component reads entities["u42"] directly instead of walking the tree; sharing that record across different queries requires a common registry or a dedicated entity query key. structuralSharing can then short-circuit re-renders: if entities["u42"] is deep-equal to its previous value after a revalidation, its reference is unchanged and consumers that depend only on it can skip the render.

Configuration Trade-offs:

  • A staleTime of 0 (the default) triggers a refetch on every mount, and flattenPayload runs again whenever that refetch changes the data; set staleTime to match your API’s cache-control headers to avoid redundant fetches and traversals.
  • WeakSet cycle detection is O(1) per node and adds no measurable overhead; skipping it to “save memory” is a false economy: a single circular payload hangs the event loop.
  • { ...node } is a shallow clone: nested objects without their own idKey stay in place and are shared with the raw payload rather than copied, while arrays are rebuilt by map on every pass. Treat the flattened output as read-only so that shared references never mutate the cached response.
  • gcTime (formerly cacheTime) only applies once a query has no active observers. A common rule of thumb is to keep it at or above staleTime (for example 2×) so that a remount shortly after unmount finds cached data to show while revalidating.

Implementation 2 — React Query select Adapter

Applying normalization inside select keeps the raw server response intact in React Query’s internal cache (accessible via queryClient.getQueryData) while delivering a flat entity map to components. This is the safest integration point: it doesn’t intercept unrelated queries and is trivially testable in isolation.

Steps:

  1. Import flattenPayload into a custom hook.
  2. Pass the normalization call as the select option of useQuery.
  3. Do not include client-generated timestamps in the select return — derive time values from server fields or omit them.
  4. Set staleTime and gcTime based on your API’s actual freshness guarantee.
// useNormalizedQuery.ts
import { useQuery, type UseQueryOptions } from '@tanstack/react-query';
import { flattenPayload, type FlattenResult } from './normalizer';

type NormalizedResult = FlattenResult & { queryKey: readonly unknown[] };

export function useNormalizedQuery<TRaw extends Record<string, unknown>>(
  queryKey: readonly unknown[],
  queryFn: () => Promise<TRaw>,
  options?: Omit<UseQueryOptions<TRaw, Error, NormalizedResult>, 'queryKey' | 'queryFn' | 'select'>,
) {
  return useQuery<TRaw, Error, NormalizedResult>({
    queryKey,
    queryFn,
    select: (raw): NormalizedResult => {
      // React Query re-runs select when the raw data reference changes or when
      // the select function itself changes identity. structuralSharing keeps
      // the raw reference stable when a refetch returns deep-equal data, but
      // this inline arrow is recreated on every render; hoist it or wrap it in
      // useCallback if the traversal cost matters.
      const result = flattenPayload(raw);
      return { ...result, queryKey };
    },
    staleTime: 1000 * 60 * 5,   // 5 min — tune to API cache-control
    gcTime: 1000 * 60 * 15,     // 15 min — 3× staleTime keeps warm cache
    ...options,
  });
}

Cache Behavior Impact: React Query re-runs select when the raw data reference changes or when the select function itself changes identity. If a background refetch returns an identical response (same JSON by value), structuralSharing keeps the raw data reference, so a select function with a stable identity is skipped and the component receives the previous entities reference unchanged. An inline select is recreated on each render and therefore re-runs, but structuralSharing is also applied to its output, so unchanged entities still keep their references. By returning { ...result, queryKey } instead of { ...result, timestamp: Date.now() }, the output remains structurally stable across identical fetches.

Configuration Trade-offs:

  • staleTime controls when cached data is considered stale and eligible for a background refetch. Each observer runs select when it first receives data and again when the raw data changes, so a higher staleTime means fewer refetches and fewer flattening passes, at the cost of serving potentially stale entities.
  • Passing structuralSharing: false disables React Query’s deep-equality check and causes select to run on every refetch regardless of whether data changed — a significant CPU overhead on large entity maps.
  • If the same entity appears under multiple query keys (e.g. in a list and in a detail query), each will maintain its own flattened copy. To share a single live record, extract the entity into a separate ['entity', id] query key and use initialData from the list query to seed it — see Entity Mapping Strategies for the full pattern.
  • Avoid running flattenPayload inside queryFn: that stores the normalized output as the cached value, so the raw response is no longer reachable via getQueryData and other select transforms can no longer derive different views from it.
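One way to keep the normalizer from re-running for an unchanged raw reference is to give select a stable identity and memoize it by input reference. The sketch below is an assumption-laden illustration: memoizeLast is a hypothetical helper, and normalize is a stand-in for an expensive normalizer such as flattenPayload.

```typescript
// Hypothetical helper: remembers the last input reference and its result.
function memoizeLast<TIn extends object, TOut>(fn: (input: TIn) => TOut): (input: TIn) => TOut {
  let lastInput: TIn | undefined;
  let lastOutput: TOut | undefined;
  let primed = false;

  return (input) => {
    if (primed && input === lastInput) return lastOutput as TOut;
    lastInput = input;
    lastOutput = fn(input);
    primed = true;
    return lastOutput;
  };
}

// Stand-in for an expensive normalizer; counts how often it actually runs.
let passes = 0;
const normalize = (raw: { id: string }) => {
  passes += 1;
  return { entities: { [raw.id]: raw }, rootRef: raw.id };
};

// Created once at module scope, so its identity never changes between renders.
const selectNormalized = memoizeLast(normalize);

const raw = { id: 'o1' };
selectNormalized(raw);
selectNormalized(raw);          // same reference: served from the memo
selectNormalized({ id: 'o1' }); // new reference: the normalizer runs again
console.log(passes); // 2
```

A function like selectNormalized could then be passed as the select option. The trade-off is that the memo holds a single entry, so it suits one selector per hook or per query rather than one shared across unrelated queries.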

Implementation 3 — Relational Reference Maintenance

Flattening severs parent-child nesting, replacing child objects with ID strings. For most read paths this is sufficient: render components join entity IDs at display time. But when a component needs to traverse relationships in both directions — e.g. “all orders for this customer” — you need an inverse relationship map built during the traversal pass.

Steps:

  1. Add an inverseMap: Record<string, string[]> (child ID → parent IDs) to the result alongside entities; the code below does this in a flattenWithInverse variant of flattenPayload.
  2. During traversal, whenever you replace a child with its ID reference, record the parent’s entity key in inverseMap[childId].
  3. Expose the inverse map from select so components can read inverseMap["u42"] to get the IDs of every entity that references that user, rather than scanning all orders.
// normalizer-with-inverse.ts
export interface FlattenWithInverseResult {
  entities: Record<string, Record<string, unknown>>;
  inverseMap: Record<string, string[]>; // childId → parentIds[]
  rootRef: string;
}

export function flattenWithInverse(
  payload: Record<string, unknown>,
  idKey = 'id',
): FlattenWithInverseResult {
  const entities: Record<string, Record<string, unknown>> = {};
  const inverseMap: Record<string, string[]> = {};
  const visited = new WeakSet<object>();
  const stack: Array<{
    node: Record<string, unknown>;
    path: string[];
    parentKey: string | null;
  }> = [{ node: payload, path: ['root'], parentKey: null }];

  while (stack.length > 0) {
    const { node, path, parentKey } = stack.pop()!;

    const entityKey =
      node[idKey] != null ? String(node[idKey]) : path.join('.');

    // Register inverse relationship: child → parent. This runs before the
    // cycle guard so that an object shared by several parents records all of them.
    if (parentKey !== null) {
      if (!inverseMap[entityKey]) inverseMap[entityKey] = [];
      if (!inverseMap[entityKey].includes(parentKey)) {
        inverseMap[entityKey].push(parentKey);
      }
    }

    if (visited.has(node)) continue;
    visited.add(node);

    const flattened: Record<string, unknown> = { ...node };

    for (const key of Object.keys(node)) {
      const val = node[key];
      if (val !== null && typeof val === 'object' && !Array.isArray(val)) {
        const child = val as Record<string, unknown>;
        if (child[idKey] != null) {
          flattened[key] = String(child[idKey]);
          stack.push({ node: child, path: [...path, key], parentKey: entityKey });
        }
      } else if (Array.isArray(val)) {
        flattened[key] = val.map((item, i) => {
          if (item !== null && typeof item === 'object') {
            const child = item as Record<string, unknown>;
            if (child[idKey] != null) {
              stack.push({ node: child, path: [...path, key, String(i)], parentKey: entityKey });
              return String(child[idKey]);
            }
          }
          return item;
        });
      }
    }

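    // Merge rather than overwrite so repeated IDs with partial fields are combined.
    entities[entityKey] = { ...entities[entityKey], ...flattened };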
  }

  return {
    entities,
    inverseMap,
    rootRef: payload[idKey] != null ? String(payload[idKey]) : 'root',
  };
}

Cache Behavior Impact: The inverse map is rebuilt on every normalization pass, but because it is derived deterministically from the entity graph, structuralSharing can still detect when it hasn’t changed. If you need the inverse map to be stable across refetches that return identical data, memoize it with a deep-equality check or use useMemo in the consuming component. For patterns involving bidirectional foreign-key resolution at render time, see Relationship Stitching in Cache.

Configuration Trade-offs:

  • Building inverse maps doubles the traversal constant factor: for payloads with 1000+ nodes, consider building the inverse map lazily (on first access) rather than eagerly on every normalization pass.
  • When mutations update a child entity, both the direct entities[childId] entry and the inverseMap[childId] parent list may need updating. Forgetting the inverse map during optimistic updates causes components that navigate via inverseMap to show stale relationships.
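  • The inverse map is part of the select output, so it exists only while an observer is mounted and is rebuilt from the cached raw response on the next mount. gcTime governs how long that raw response survives after the query becomes inactive; because entities and inverseMap are derived from the same snapshot, relationship maps cannot outlive their entities.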
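Keeping the inverse map in step with an optimistic update can be sketched as a small pure function. moveReference and the InverseMap alias are hypothetical names for this example, and the map shape matches the childId → parentIds[] layout used above.

```typescript
type InverseMap = Record<string, string[]>; // childId → parentIds[]

// Returns a new map in which `parentId` references `toChild` instead of `fromChild`.
// The input map is left untouched, so it can double as the rollback snapshot.
function moveReference(
  map: InverseMap,
  parentId: string,
  fromChild: string,
  toChild: string,
): InverseMap {
  const fromParents = (map[fromChild] ?? []).filter((id) => id !== parentId);
  const toParents = map[toChild] ?? [];

  return {
    ...map,
    [fromChild]: fromParents,
    [toChild]: toParents.includes(parentId) ? toParents : [...toParents, parentId],
  };
}

// Order o1 is optimistically reassigned from customer u42 to customer u7.
const before: InverseMap = { u42: ['o1', 'o2'] };
const after = moveReference(before, 'o1', 'u42', 'u7');
console.log(after); // { u42: ['o2'], u7: ['o1'] }
```

The matching entity update (setting the order's customer field to the new ID) has to be applied in the same step, otherwise the forward and inverse views disagree.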

Common Pitfalls & Resolutions

| Observable Issue | Root Cause | Diagnostic Resolution |
| --- | --- | --- |
| Mutating a user in one query key doesn’t update the same user in a list query | Same entity stored under two isolated query keys; no shared flat map | Normalize both queries into the same entities registry and read from that registry in components, or move the entity to a dedicated ['entity', 'User', id] query key seeded by initialData from the list query |
| select runs on every refetch even when the server returns the same payload | structuralSharing disabled, or select returns a new object containing Date.now() | Remove client-generated timestamps from select output; ensure structuralSharing is not set to false in QueryClient defaults |
| Circular reference in payload causes the normalization loop to hang | Bidirectional API relationships (e.g. order → customer → orders) without cycle detection | Confirm WeakSet is tracking visited nodes; add a debug log before visited.add(node) to count visits per entity key and confirm no key is processed twice |
| Orphaned entities accumulate in a shared entity registry after a mutation removes them | The registry adds to entities on every fetch but never removes deleted records | Diff the previous entities against the latest normalized result when writing to the registry and remove keys absent from the latest response; or use entity-scoped gcTime to let stale entries expire |
| Array order changes trigger re-renders even though entity data is unchanged | Arrays of ID strings are compared position by position, so a reordered array counts as changed even when its members are identical | If order is not meaningful, sort the ID array before storing it, or use a custom structuralSharing function that compares arrays by sorted element equality |
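For the array-order pitfall, the comparison by sorted element equality could look like the sketch below. sameMembers and keepOrderInsensitive are hypothetical helpers; the second is shaped like a function-valued structuralSharing option (old data in, data to keep out), which is an assumption about how you would wire it in.

```typescript
// True when both arrays hold the same ids regardless of order (duplicates counted).
function sameMembers(a: readonly string[], b: readonly string[]): boolean {
  if (a.length !== b.length) return false;
  const sortedA = a.slice().sort();
  const sortedB = b.slice().sort();
  return sortedA.every((id, i) => id === sortedB[i]);
}

// Keeps the previous array reference when only the order differs.
function keepOrderInsensitive(
  oldIds: readonly string[] | undefined,
  newIds: readonly string[],
): readonly string[] {
  return oldIds !== undefined && sameMembers(oldIds, newIds) ? oldIds : newIds;
}

const previousIds = ['u1', 'u2', 'u3'];
console.log(keepOrderInsensitive(previousIds, ['u3', 'u1', 'u2']) === previousIds); // true
console.log(keepOrderInsensitive(previousIds, ['u1', 'u2', 'u4']) === previousIds); // false
```

This only makes sense where the server's ordering carries no meaning; for ranked or chronologically sorted lists, a reorder is a real data change and should re-render.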

Frequently Asked Questions

When should I flatten at the network layer versus inside the React Query select callback?

Flatten inside select for query-specific normalization: it keeps the raw response intact in React Query’s internal cache (useful for debugging and for passing to other select transforms), and it scopes the normalization to the queries that need it. Use a global axios interceptor only when every API endpoint returns the same shape and you need normalized output across all caches simultaneously. Mixing both strategies — interceptor plus select — causes double-normalization: the select function receives an already-flattened payload and tries to flatten it again, producing incorrect entity keys and duplicate entries.

How do I handle missing IDs in deeply nested payloads?

Generate deterministic composite keys from the node’s position in the payload: concatenate parent entity ID + field name + array index, e.g. "o1.lineItems.0". Alternatively, hash the object’s stable scalar fields using a deterministic algorithm (e.g. FNV-1a on the sorted key-value pairs). The critical constraint is strict determinism across requests: if the same logical entity produces different keys on two fetches, the entity map treats it as two different records, so consumers holding the old key lose their entry and the entity appears to be replaced rather than updated.
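Both key strategies can be sketched in a few lines. positionalKey, contentKey and fnv1a are hypothetical helpers written for this example; the hash is a 32-bit FNV-1a variant that works on UTF-16 code units rather than bytes, which matches byte-wise FNV-1a only for ASCII input.

```typescript
// 32-bit FNV-1a over UTF-16 code units (equals byte-wise FNV-1a for ASCII text).
function fnv1a(text: string): string {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime
  }
  return (hash >>> 0).toString(16);
}

// Position-based key: parent entity ID + field name + optional array index.
function positionalKey(parentId: string, field: string, index?: number): string {
  return index === undefined ? `${parentId}.${field}` : `${parentId}.${field}.${index}`;
}

// Content-based key: hash of the node's scalar fields, sorted by field name so
// that property order in the payload does not affect the result.
function contentKey(node: Record<string, unknown>): string {
  const pairs: Array<[string, string | number | boolean]> = [];
  for (const key of Object.keys(node).sort()) {
    const value = node[key];
    if (typeof value === 'string' || typeof value === 'number' || typeof value === 'boolean') {
      pairs.push([key, value]);
    }
  }
  return fnv1a(JSON.stringify(pairs));
}

console.log(positionalKey('o1', 'lineItems', 0)); // "o1.lineItems.0"
console.log(contentKey({ sku: 'A-1', qty: 2 }) === contentKey({ qty: 2, sku: 'A-1' })); // true
```

Positional keys change when array items are reordered, and content keys change when any hashed field changes, so pick the one whose instability matches how the API actually behaves. A 32-bit hash can also collide on large maps, which is worth checking before relying on it.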

Does including Date.now() in a select callback break structural sharing?

Yes. React Query’s structuralSharing compares the previous and next select output by deep value equality. A timestamp generated client-side will always differ, producing a new object reference on every revalidation — including background refetches that return the same server data. Every component subscribed to the query re-renders unnecessarily. Only include time values derived from the server response (e.g. the payload’s updatedAt field), never from Date.now() or new Date() inside select.

Can flattening payloads break optimistic updates?

No — it improves them. With a nested cache, an optimistic mutation must traverse the tree to find and patch the right node. With a flat entity map, queryClient.setQueryData(['entity', 'Order', 'o1'], updater) targets the entity directly. Rollback is equally precise: the onError callback in useMutation restores the pre-mutation snapshot via queryClient.setQueryData using the same key, without needing to know where in the original hierarchy the entity lived. This is covered in depth in Relationship Stitching in Cache.
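The patch-and-rollback flow on a flat map can be sketched without any library code. patchEntity and the EntityMap alias are hypothetical; in a React Query setup the next map would typically be written with queryClient.setQueryData in onMutate and the rollback applied in onError.

```typescript
type EntityMap = Record<string, Record<string, unknown>>;

// Applies a patch to one entity and returns a rollback that restores the snapshot.
// Neither function mutates its input, so earlier references stay valid.
function patchEntity(entities: EntityMap, id: string, patch: Record<string, unknown>) {
  const snapshot = entities[id]; // undefined when the entity did not exist yet
  const next: EntityMap = { ...entities, [id]: { ...snapshot, ...patch } };

  const rollback = (current: EntityMap): EntityMap => {
    const restored = { ...current };
    if (snapshot === undefined) {
      delete restored[id]; // the entity was created optimistically: remove it
    } else {
      restored[id] = snapshot;
    }
    return restored;
  };

  return { next, rollback };
}

const entities: EntityMap = { o1: { id: 'o1', status: 'pending' } };
const { next, rollback } = patchEntity(entities, 'o1', { status: 'paid' });

console.log(next['o1']?.['status']);           // "paid"
console.log(rollback(next)['o1']?.['status']); // "pending"
```

Because the snapshot is captured per entity, a rollback restores only the patched record and leaves any other entities that changed in the meantime as they are.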


  • Data Normalization & Query Key Design — the parent topic covering the full architectural rationale for normalization boundaries, query key design, and cache synchronization strategy.
  • Flattening Deeply Nested GraphQL Responses — applies the techniques on this page to __typename-keyed GraphQL payloads from Apollo Client, where polymorphic type routing requires entity-registry partitioning.
  • Entity Mapping Strategies — schema contracts and DTO transformation boundaries that define which fields constitute an entity’s identity key, directly upstream of the flattening step.
  • Relationship Stitching in Cache — the read-time complement to flattening: reconstructing relational object graphs from the flat entity map without issuing additional network requests.
  • Pagination Normalization Patterns — extends flat entity maps to paginated list queries, covering cursor-based and offset pagination merge strategies without duplicating entity records across pages.