Normalization Principles for UI Caches

When multiple UI components consume overlapping server entities — a user record displayed in a sidebar, a post feed, and an edit form simultaneously — keeping those entities in sync requires more than just re-fetching. Without a normalized structure, each component holds its own copy of the same data. A mutation to one copy does not propagate to the others, leaving the UI in an inconsistent state. This is the specific failure mode that normalization addresses: it eliminates duplicated entity storage by replacing embedded objects with foreign key references, so a single mutation updates every dependent view at once.

This page is part of State Architecture & Cache Fundamentals, which covers the structural decisions that govern how server state is modeled in the browser. For the architectural context on how normalized stores fit within a layered cache design, see Cache Layer Architecture. If you are deciding whether to normalize at all, the Client vs Server State Boundaries topic explains which payloads warrant a normalized cache versus local-only ephemeral storage.

Diagnostic Checklist

You are in the right place if you are observing one or more of these symptoms:

A mutation to a user profile updates the edit form but leaves an avatar or post author display showing the old value.
Fetching a list of posts causes n redundant author objects to be stored in cache, one per post, even though most authors repeat.
A useQuery subscriber re-renders even though the data it consumes did not change — caused by new object references being allocated for unchanged nested entities.
Pagination appends create flicker because the entire post collection is overwritten rather than merged into the existing entity map.
Rollback after an optimistic mutation is inconsistent: some views revert, others do not.

Prerequisites

Before implementing normalization, make sure you understand:

Reference vs Value Storage Models — the distinction between reference-stable entity graphs and copied value trees, which determines whether two components share a single object or hold independent copies.
Cache Layer Architecture — the extraction boundaries between the network layer, the raw response cache, and the normalized entity store.
Client vs Server State Boundaries — which payloads belong in a normalized cache and which belong in local component state.

Implementation 1: Entity Extraction with Deterministic Key Generation

Normalization begins at the extraction step. Every nested API payload must be recursively traversed to isolate discrete records before they reach the cache.

Steps:

Define entity schemas as TypeScript interfaces with a mandatory id: string field.
Write an extractor that traverses a raw payload depth-first, pulling named entity types into separate flat maps keyed by entity.id.
Replace embedded sub-objects with their foreign key: authorId: item.author.id instead of author: item.author.
Build a parallel collection array of ordered IDs: ids: data.map(d => d.id).
For polymorphic responses, inject a __typename discriminator during extraction so the selector layer can resolve entity type dynamically.

// normalizer.ts — framework-agnostic entity extractor

interface RawPost {
  id: string;
  content: string;
  author: { id: string; name: string; avatarUrl: string };
}

interface NormalizedPost {
  id: string;
  content: string;
  authorId: string;
}

interface NormalizedUser {
  id: string;
  name: string;
  avatarUrl: string;
}

export interface NormalizedPostsResult {
  entities: {
    users: Record<string, NormalizedUser>;
    posts: Record<string, NormalizedPost>;
  };
  ids: string[];
}

export function normalizePosts(raw: RawPost[]): NormalizedPostsResult {
  const users: Record<string, NormalizedUser> = {};
  const posts: Record<string, NormalizedPost> = {};
  const ids: string[] = [];

  for (const item of raw) {
    // Author is upserted by ID — duplicate authors across posts collapse to one entry
    users[item.author.id] = {
      id: item.author.id,
      name: item.author.name,
      avatarUrl: item.author.avatarUrl,
    };

    posts[item.id] = {
      id: item.id,
      content: item.content,
      authorId: item.author.id, // foreign key reference, not embedded object
    };

    ids.push(item.id);
  }

  return { entities: { users, posts }, ids };
}

Cache Behavior Impact: The extractor runs synchronously before any cache write. Entities are de-duplicated at write time: if fifty posts share the same author, users[authorId] is upserted once. Every subsequent read resolves the author via users[post.authorId] — there is one canonical object in memory, so a single mutation propagates to every selector that resolves through that author ID.
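To make the propagation claim concrete, here is a minimal sketch in which two posts resolve through the same author entry. The `EntityStore` shape mirrors the normalized result above; `resolvePost` and `renameUser` are hypothetical helpers introduced only for this illustration.

```typescript
// Sketch: two posts share one author entry, so a single write reaches both views.
// `resolvePost` and `renameUser` are hypothetical helpers, not part of any library.

interface UserEntity { id: string; name: string; avatarUrl: string }
interface PostEntity { id: string; content: string; authorId: string }

interface EntityStore {
  users: Record<string, UserEntity>;
  posts: Record<string, PostEntity>;
}

// Read path: follow the foreign key at read time instead of storing a copy.
function resolvePost(store: EntityStore, postId: string) {
  const post = store.posts[postId];
  return { ...post, author: store.users[post.authorId] };
}

// Write path: replace exactly one user entry; the posts map is left untouched.
function renameUser(store: EntityStore, userId: string, newName: string): EntityStore {
  return {
    ...store,
    users: { ...store.users, [userId]: { ...store.users[userId], name: newName } },
  };
}

const storeBefore: EntityStore = {
  users: { u1: { id: 'u1', name: 'Ada', avatarUrl: '/ada.png' } },
  posts: {
    p1: { id: 'p1', content: 'First', authorId: 'u1' },
    p2: { id: 'p2', content: 'Second', authorId: 'u1' },
  },
};

const storeAfter = renameUser(storeBefore, 'u1', 'Ada L.');
```

Both `resolvePost(storeAfter, 'p1')` and `resolvePost(storeAfter, 'p2')` now see the new name, while `storeAfter.posts` is still the same object as `storeBefore.posts`, so selectors that read only posts have nothing to recompute.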

Configuration Trade-offs:

Composite keys (e.g., ${userId}_${roleId}) work for join-table entities but must be generated deterministically both on write and on read; a mismatch causes phantom duplicates.
Extracting a __typename discriminator enables a single entities map keyed by __typename + id — useful for Apollo-style caches — but adds branching logic that couples the extractor to the schema.
Avoid JSON.stringify for key hashing: its output depends on property insertion order, so two structurally equal objects can serialize to different keys and defeat equality checks on the key.
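The key-generation trade-offs above can be sketched as follows. The helper names and the `:` separator are illustrative assumptions rather than a library convention; the `_` separator follows the composite-key example in the text.

```typescript
// Sketch of deterministic key builders. `userRoleKey` and `typedKey` are
// hypothetical helper names chosen for this example.

// Join-table key: fixed argument order, so write and read paths always agree.
function userRoleKey(userId: string, roleId: string): string {
  return `${userId}_${roleId}`;
}

// Polymorphic key: one flat map can hold several entity types side by side.
function typedKey(typename: string, id: string): string {
  return `${typename}:${id}`;
}

// Why JSON.stringify is a poor key: property order leaks into the output.
const keyA = JSON.stringify({ userId: 'u1', roleId: 'r9' });
const keyB = JSON.stringify({ roleId: 'r9', userId: 'u1' });
const sameEntityDifferentKeys = keyA !== keyB; // true: a phantom duplicate
```

One caveat with any separator-based key: if an ID can itself contain the separator, two different pairs can collide, so pick a character that cannot occur in your IDs or escape it.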

Implementation 2: TanStack Query with `select` for Per-Subscriber Normalization

TanStack Query’s select option transforms the raw fetch result at read time, before it reaches the component. It is the idiomatic place to apply normalization in a React Query setup without changing what is stored in the internal QueryCache.

Steps:

Write a select function that accepts the raw query data and returns a normalized shape.
Pass it to useQuery; each subscriber receives the transformed result independently.
Wire a queryClient.setQueryData call inside useMutation’s onSuccess to update only the affected entity keys, avoiding a full collection refetch.

// usePosts.ts
import { useQuery, useMutation, useQueryClient } from '@tanstack/react-query';
import { normalizePosts, type NormalizedPostsResult } from './normalizer';

type RawPost = { id: string; content: string; author: { id: string; name: string; avatarUrl: string } };

export function usePosts() {
  return useQuery<RawPost[], Error, NormalizedPostsResult>({
    queryKey: ['posts'],
    queryFn: (): Promise<RawPost[]> => fetch('/api/posts').then((r) => r.json()),
    select: normalizePosts,
    // structuralSharing: true is the default — TanStack Query deep-compares
    // the returned object and reuses unchanged entity references, preventing
    // downstream re-renders for unmodified posts.
    staleTime: 30_000,
    gcTime: 5 * 60_000,
  });
}

export function useUpdateUser() {
  const queryClient = useQueryClient();

  return useMutation({
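    mutationFn: async (payload: { id: string; name: string }) => {
      const res = await fetch(`/api/users/${payload.id}`, {
        method: 'PATCH',
        // Declare the body type so the server parses it as JSON
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ name: payload.name }),
      });
      // fetch resolves on HTTP error statuses, so reject explicitly;
      // otherwise onError (and the rollback) never runs on server rejection
      if (!res.ok) throw new Error(`PATCH failed with status ${res.status}`);
      return res.json();
    },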

    onMutate: async (payload) => {
      // Snapshot and optimistically patch the raw cache
      await queryClient.cancelQueries({ queryKey: ['posts'] });
      const snapshot = queryClient.getQueryData<RawPost[]>(['posts']);

      queryClient.setQueryData<RawPost[]>(['posts'], (old) =>
        old?.map((post) =>
          post.author.id === payload.id
            ? { ...post, author: { ...post.author, name: payload.name } }
            : post,
        ) ?? [],
      );

      return { snapshot };
    },

    onError: (_err, _payload, ctx) => {
      // Roll back to snapshot on server rejection
      if (ctx?.snapshot) {
        queryClient.setQueryData(['posts'], ctx.snapshot);
      }
    },

    onSettled: () => {
      queryClient.invalidateQueries({ queryKey: ['posts'] });
    },
  });
}

Cache Behavior Impact: The select callback fires inside TanStack Query’s subscriber notification path. Because structuralSharing is true by default, TanStack Query performs a recursive structural equality check on the transformed output before notifying subscribers. If a post’s fields did not change, the reference for that post entry is reused from the previous selector output, preventing the <PostCard /> that renders it from re-rendering. Only the mutated author’s entry gets a new object reference, limiting reconciliation to the components that actually display that author.

Configuration Trade-offs:

Setting structuralSharing: false disables reference reuse entirely — every query result allocates fresh objects. Avoid this in normalized setups; it negates the primary re-render benefit.
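staleTime and gcTime are independent settings. gcTime (default 5 minutes) controls how long the raw result of an unobserved query stays in the cache; staleTime controls whether a remount triggers a refetch. If a component unmounts and remounts within the gcTime window, TanStack Query serves the cached data immediately, re-running select over it, and refetches in the background only if that data is older than staleTime.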
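select runs per subscriber, not once per cache entry. If ten components call usePosts(), normalizePosts runs ten times on the same raw data each time that data changes. When subscriber count is high, cache the extractor's output at module level, keyed on the identity of the raw array (a hook-level useMemo is scoped to one component and cannot share work between subscribers), or move the transformation upstream into a queryFn wrapper that stores the normalized result directly.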
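One way to share a single normalization pass across subscribers is a module-level cache keyed on the identity of the raw data. The sketch below assumes that every observer of the same query key receives the same raw array reference between fetches; `memoizeByIdentity` is a hypothetical helper, and `countIds` is a stand-in for `normalizePosts` that counts its own invocations.

```typescript
// Sketch: cache a transform's output on the identity of its input object.
// `memoizeByIdentity` is a hypothetical helper, not a TanStack Query API.

function memoizeByIdentity<In extends object, Out>(
  fn: (input: In) => Out,
): (input: In) => Out {
  // WeakMap entries are released once the raw array itself is garbage-collected.
  const cache = new WeakMap<In, Out>();
  return (input) => {
    if (cache.has(input)) return cache.get(input) as Out;
    const result = fn(input);
    cache.set(input, result);
    return result;
  };
}

// Stand-in for normalizePosts that records how often it actually runs.
let transformRuns = 0;
const countIds = (raw: Array<{ id: string }>) => {
  transformRuns += 1;
  return { ids: raw.map((r) => r.id) };
};

const memoizedSelect = memoizeByIdentity(countIds);

const rawPage = [{ id: 'p1' }, { id: 'p2' }];
const firstResult = memoizedSelect(rawPage);  // runs the transform
const secondResult = memoizedSelect(rawPage); // same array identity: cache hit
```

Under that assumption, passing the memoized function as `select` means ten subscribers trigger one transform per data change rather than ten, and all of them receive the same normalized object.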

Implementation 3: Redux Toolkit `createEntityAdapter` for Normalized Slices

When your architecture stores server state in Redux — common in apps that mix normalized server entities with complex client-side derived state — Redux Toolkit’s createEntityAdapter provides built-in upsertMany, removeMany, and stable selector factories that integrate directly with the normalized slice pattern.

Steps:

Define the entity interface and create an adapter with an optional sortComparer.
Initialise slice state with adapter.getInitialState().
In async thunks or RTK Query transformResponse, call upsertMany to merge server payloads into the existing entity map without overwriting untouched entries.
Use adapter.getSelectors() to create memoized selectAll, selectById, and selectIds selectors — pass a root-state accessor so they are composable with createSelector.

// postsSlice.ts
import {
  createEntityAdapter,
  createSlice,
  createAsyncThunk,
  PayloadAction,
} from '@reduxjs/toolkit';

interface Post {
  id: string;
  content: string;
  authorId: string;
  updatedAt: string;
}

const postsAdapter = createEntityAdapter<Post>({
  sortComparer: (a, b) => b.updatedAt.localeCompare(a.updatedAt),
});

export const fetchPosts = createAsyncThunk('posts/fetchAll', async () => {
  const raw: Array<{ id: string; content: string; author: { id: string }; updatedAt: string }> =
    await fetch('/api/posts').then((r) => r.json());

  // Normalize at thunk boundary — only NormalizedPost shapes enter the adapter
  return raw.map((p) => ({
    id: p.id,
    content: p.content,
    authorId: p.author.id,
    updatedAt: p.updatedAt,
  }));
});

const postsSlice = createSlice({
  name: 'posts',
  initialState: postsAdapter.getInitialState<{ status: 'idle' | 'loading' | 'error' }>({
    status: 'idle',
  }),
  reducers: {
    // Partial update: merge only the fields that changed, preserve the rest
    patchPost: (state, action: PayloadAction<Partial<Post> & { id: string }>) => {
      postsAdapter.updateOne(state, {
        id: action.payload.id,
        changes: action.payload,
      });
    },
  },
  extraReducers: (builder) => {
    builder
      .addCase(fetchPosts.pending, (state) => { state.status = 'loading'; })
      .addCase(fetchPosts.fulfilled, (state, action) => {
        // upsertMany merges incoming posts by ID — existing entries that are
        // absent from the new payload are NOT evicted, preventing flicker on
        // paginated fetches where only one page is refreshed.
        postsAdapter.upsertMany(state, action.payload);
        state.status = 'idle';
      })
      .addCase(fetchPosts.rejected, (state) => { state.status = 'error'; });
  },
});

export const { patchPost } = postsSlice.actions;
export default postsSlice.reducer;

// Memoized selectors — root state accessor keeps them composable
type RootState = { posts: ReturnType<typeof postsSlice.reducer> };
export const {
  selectAll: selectAllPosts,
  selectById: selectPostById,
  selectIds: selectPostIds,
} = postsAdapter.getSelectors<RootState>((state) => state.posts);

Cache Behavior Impact: upsertMany runs inside an Immer producer: it merges each incoming entity into the existing entry with the same ID and inserts entities whose IDs are new. Entries absent from the incoming payload are never touched, so they keep their existing object identities once Immer finalizes the next state. Selectors created by getSelectors use createSelector under the hood, so selectAll recomputes only when the state.posts.ids or state.posts.entities reference changes. Because any upsert replaces the entities reference, selectAll subscribers do recompute on every entity write; components that read a single entity through selectById, or that read other slices, do not re-render on unrelated entity changes.

Configuration Trade-offs:

updateOne vs upsertOne: updateOne fails silently if the entity does not exist yet; use upsertOne when processing partial server payloads that might arrive before the full collection is loaded.
sortComparer re-runs on every upsertMany call; for large entity maps, derive sorted order at selector time with a memoized createSelector rather than paying the sort cost on every write.
The adapter adds approximately 3 kB gzip to bundle size. For applications with fewer than three distinct entity types, a hand-rolled normalized reducer is leaner, but loses the built-in CRUD helpers and selector factories.
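For comparison, a hand-rolled normalized upsert might look like the sketch below. This is not createEntityAdapter's implementation; `NormalizedState` and this `upsertMany` are illustrative assumptions showing the reference-preservation property the adapter is used for.

```typescript
// Sketch of a hand-rolled normalized upsert. Names and shapes are assumptions
// for illustration and do not reproduce Redux Toolkit internals.

interface HasId { id: string }

interface NormalizedState<T extends HasId> {
  ids: string[];
  entities: Record<string, T>;
}

function upsertMany<T extends HasId>(
  state: NormalizedState<T>,
  incoming: T[],
): NormalizedState<T> {
  const entities = { ...state.entities };
  const ids = [...state.ids];

  for (const item of incoming) {
    const existing: T | undefined = entities[item.id];
    if (existing === undefined) ids.push(item.id); // new ID: append, keep prior order
    // Shallow-merge so fields already stored are not erased by the new payload.
    entities[item.id] = existing ? { ...existing, ...item } : item;
  }

  return { ids, entities };
}

type Todo = { id: string; title: string };

const todosBefore: NormalizedState<Todo> = {
  ids: ['t1', 't2'],
  entities: { t1: { id: 't1', title: 'Keep' }, t2: { id: 't2', title: 'Old' } },
};

const todosAfter = upsertMany(todosBefore, [
  { id: 't2', title: 'New' },
  { id: 't3', title: 'Added' },
]);
```

Here `todosAfter.entities.t1` is the very same object as before the write, which is what lets a per-entity selector skip recomputation; what the hand-rolled version gives up is sorting, removal helpers, and selector factories.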

Configuration Trade-offs Summary

Dimension	Impact	Mitigation
`structuralSharing` in TanStack Query	Disabled by setting `structuralSharing: false` — all references refresh on every fetch, eliminating re-render savings	Leave enabled; disable only in benchmarks to measure baseline overhead
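`select` per-subscriber overhead	`normalizePosts` runs once per `useQuery` subscriber each time the cached data changes	Cache the normalized result at module level, keyed on the identity of the raw query data, if subscriber count exceeds ~10; a `useRef`-backed cache is per component and cannot share work across subscribers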
`gcTime` and orphaned entities	Entities from unmounted queries linger in `QueryCache` until `gcTime` expires; related entities may be stale when the query remounts	Set `gcTime` to match session freshness requirements; call `removeQueries` explicitly on logout
`upsertMany` vs full collection replace	Replacing the entire `ids` array causes every `selectPostIds` consumer to re-render	Use `upsertMany` (or `addMany` for strictly new rows) with pagination; use `setAll` only when intentionally replacing the complete dataset
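SSR rehydration of normalized graphs	`JSON.stringify` preserves the order of the `ids` array, but the key order of the `entities` object is not a reliable ordering: integer-like keys are enumerated in ascending numeric order regardless of insertion order, so an order inferred from `Object.keys(entities)` can differ from the server's	Serialize `ids` alongside `entities` and derive display order only from `ids`; do not reconstruct ordering from a key scan on the client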

Common Pitfalls & Resolutions

Observable Issue	Root Cause	Diagnostic Resolution
Circular reference error during state serialization	Bidirectional relationships stored as direct object references (`user.posts[0].author === user`) rather than ID foreign keys	Enforce ID-only foreign keys throughout the entity schema; use memoized selectors that resolve references lazily at read time
Cache thrashing on infinite scroll pagination	Each page fetch calls `setQueryData` with a concatenated raw array, allocating new objects for unchanged entities	Switch to `upsertMany` on the Redux adapter, or accumulate page data in `useInfiniteQuery` and apply `select` normalization to the aggregated `pages` array
Stale nested data after partial server update	Server PATCH returns only modified fields; client upsert replaces the whole entity, losing unrelated fields not included in the response	Use `updateOne` with `changes: partialPayload` (RTK) or deep-merge the partial payload into the existing `getQueryData` snapshot before calling `setQueryData`
Selector thrash: `selectAllPosts` re-runs on unrelated entity change	`selectAll` derives its output from `state.posts.entities` — any entity upsert updates the `entities` reference even if the component’s target entity is unchanged	Use `selectPostById` for single-entity selectors; compose `createSelector` chains that depend only on the specific entity subtree the component actually renders
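For the partial-update row above, a recursive merge of the PATCH response into the cached entity could be sketched like this. `mergePartial` is a hypothetical helper; it assumes plain JSON-like objects and deliberately replaces arrays wholesale rather than merging them element by element.

```typescript
// Sketch: merge a partial server payload into a cached entity without
// dropping fields the payload omits. `mergePartial` is a hypothetical helper.

type Plain = Record<string, unknown>;

function isPlainObject(value: unknown): value is Plain {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

function mergePartial<T extends Plain>(existing: T, patch: Plain): T {
  const result: Plain = { ...existing };
  for (const [key, value] of Object.entries(patch)) {
    if (value === undefined) continue; // an absent field means "unchanged"
    const current = result[key];
    result[key] =
      isPlainObject(current) && isPlainObject(value)
        ? mergePartial(current, value) // recurse into nested objects
        : value;                       // primitives and arrays are replaced
  }
  return result as T;
}

type CachedUser = {
  id: string;
  name: string;
  profile: { bio: string; avatarUrl: string };
};

const cachedUser: CachedUser = {
  id: 'u1',
  name: 'Ada',
  profile: { bio: 'Old bio', avatarUrl: '/ada.png' },
};

// The server echoed back only the field that changed.
const patchedUser = mergePartial(cachedUser, { profile: { bio: 'New bio' } });
```

After the merge, `patchedUser.profile.avatarUrl` and `patchedUser.name` are still present, whereas a whole-entity replace with the PATCH response would have discarded them.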

Frequently Asked Questions

When should I normalize versus keep nested API responses intact?

Normalize when multiple UI components consume overlapping entities or when referential integrity is critical across views — a feed, a sidebar, and an edit form all rendering the same user record is the canonical case. Keep nested structures intact for isolated, single-consumer views where the lookup overhead and extractor verbosity outweigh the deduplication benefit. A rule of thumb: if only one component ever reads a given entity shape, normalization adds complexity without measurable gain.

Does TanStack Query's select option affect what is stored in the raw cache?

No. The select transform runs at read time per subscriber — it does not mutate what TanStack Query stores in its internal QueryCache. Every subscriber receives the transformed output independently; two subscribers can apply different select transforms to the same raw cache entry. The raw fetch result persists unchanged, meaning queryClient.getQueryData(['posts']) always returns the original server payload, not the normalized shape.

How do I invalidate a normalized entity without refetching its entire collection?

Invalidate by entity ID using queryClient.invalidateQueries with a fine-grained queryKey that includes the entity ID: queryClient.invalidateQueries({ queryKey: ['user', userId] }). Pair this with structuralSharing: true (the default) so only the changed entity reference updates — components that subscribe to other entities in the same collection remain stable. For cross-entity cascades, maintain a dependency map that links an entity ID to the query keys that depend on it, then invalidate each in a single Promise.all call inside onSuccess.
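The dependency map mentioned above could be sketched as follows. `DependencyMap` and `invalidateEntity` are hypothetical names, and the `Invalidate` type is a stand-in with the same call shape as `queryClient.invalidateQueries({ queryKey })`, so the sketch stays runnable without the library.

```typescript
// Sketch: link entity IDs to the query keys that render them, then invalidate
// every dependent key together. All names here are illustrative assumptions.

type QueryKey = readonly unknown[];
type Invalidate = (filters: { queryKey: QueryKey }) => Promise<void>;

class DependencyMap {
  private deps = new Map<string, QueryKey[]>();

  // Record that `queryKey` renders data derived from `entityId`.
  register(entityId: string, queryKey: QueryKey): void {
    const keys = this.deps.get(entityId) ?? [];
    keys.push(queryKey);
    this.deps.set(entityId, keys);
  }

  keysFor(entityId: string): QueryKey[] {
    return this.deps.get(entityId) ?? [];
  }
}

// Invalidate all dependents of one entity in a single Promise.all.
async function invalidateEntity(
  deps: DependencyMap,
  entityId: string,
  invalidate: Invalidate,
): Promise<number> {
  const keys = deps.keysFor(entityId);
  await Promise.all(keys.map((queryKey) => invalidate({ queryKey })));
  return keys.length;
}

const userDeps = new DependencyMap();
userDeps.register('u1', ['user', 'u1']);
userDeps.register('u1', ['posts']);
```

Inside a mutation's `onSuccess`, the third argument would be something like `(filters) => queryClient.invalidateQueries(filters)`, so each registered key is marked stale in one pass.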

What causes cache thrashing during infinite scroll and how do I fix it?

Overwriting the entire entity array on each page fetch allocates new object references for every post entry, causing selectAll or useInfiniteQuery’s aggregated data to produce new arrays on each page load, even for pages the user already scrolled past. Fix this in Redux by calling upsertMany rather than setAll; in TanStack Query, use useInfiniteQuery with a select that normalizes across data.pages. Memoize that transform on the identity of the pages data rather than on the page count, because a refetch can change page contents without changing how many pages there are.
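A `select`-style transform over paginated data might look like the sketch below. The `{ pages: PostRow[][] }` input mirrors the `pages` field of useInfiniteQuery's data; `normalizePages` and `PostRow` are hypothetical names for this example.

```typescript
// Sketch: flatten loaded pages into one entity map plus an ordered ID list.
// `normalizePages` is a hypothetical helper, not a TanStack Query API.

interface PostRow { id: string; content: string }

interface NormalizedPages {
  ids: string[];
  entities: Record<string, PostRow>;
}

function normalizePages(data: { pages: PostRow[][] }): NormalizedPages {
  const entities: Record<string, PostRow> = {};
  const ids: string[] = [];
  const seen = new Set<string>();

  for (const page of data.pages) {
    for (const row of page) {
      // A row that reappears on a later page (for example after an insert
      // shifted offsets) keeps its first position and takes the latest values.
      if (!seen.has(row.id)) {
        seen.add(row.id);
        ids.push(row.id);
      }
      entities[row.id] = row;
    }
  }

  return { ids, entities };
}

const loadedPages = {
  pages: [
    [{ id: 'a', content: 'A' }, { id: 'b', content: 'B (page 1)' }],
    [{ id: 'b', content: 'B (page 2)' }, { id: 'c', content: 'C' }],
  ],
};

const normalizedFeed = normalizePages(loadedPages);
```

Because rows are stored by reference, an entry whose page array was not replaced by the latest fetch keeps its object identity in `entities`, so list items keyed by ID have no reason to re-render.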

State Architecture & Cache Fundamentals — the parent topic covering every structural decision that governs how server state is modeled in the browser, from cache layer design to hydration.
How to Design a Normalized State Tree — a step-by-step guide to mapping server entity schemas to stable in-memory reference graphs, including partial update and rollback patterns.
Reference vs Value Storage Models — explains why reference stability matters for memoized selectors and how to audit your cache for accidental value-copy semantics.
Cache Layer Architecture — details the extraction boundaries between the network layer and the normalized entity store, including where to intercept fetches for transformation.
Entity Mapping Strategies — covers deterministic key generation, composite keys, and polymorphic entity resolution in depth with production examples.
Mutation Sync & Rollback — covers optimistic update patterns and transactional rollback strategies when server mutations conflict with normalized cache state.