Normalization Principles for UI Caches
When multiple UI components consume overlapping server entities — a user record displayed in a sidebar, a post feed, and an edit form simultaneously — keeping those entities in sync requires more than just re-fetching. Without a normalized structure, each component holds its own copy of the same data. A mutation to one copy does not propagate to the others, leaving the UI in an inconsistent state. This is the specific failure mode that normalization addresses: it eliminates duplicated entity storage by replacing embedded objects with foreign key references, so a single mutation updates every dependent view at once.
This page is part of State Architecture & Cache Fundamentals, which covers the structural decisions that govern how server state is modeled in the browser. For the architectural context on how normalized stores fit within a layered cache design, see Cache Layer Architecture. If you are deciding whether to normalize at all, the Client vs Server State Boundaries topic explains which payloads warrant a normalized cache versus local-only ephemeral storage.
Diagnostic Checklist
You are in the right place if you are observing one or more of these symptoms:
- A mutation to a user profile updates the edit form but leaves an avatar or post author display showing the old value.
- Fetching a list of posts causes n redundant author objects to be stored in cache, one per post, even though most authors repeat.
- A useQuery subscriber re-renders even though the data it consumes did not change, because new object references are allocated for unchanged nested entities.
- Pagination appends create flicker because the entire post collection is overwritten rather than merged into the existing entity map.
- Rollback after an optimistic mutation is inconsistent: some views revert, others do not.
Prerequisites
Before implementing normalization, make sure you understand:
- Reference vs Value Storage Models — the distinction between reference-stable entity graphs and copied value trees, which determines whether two components share a single object or hold independent copies.
- Cache Layer Architecture — the extraction boundaries between the network layer, the raw response cache, and the normalized entity store.
- Client vs Server State Boundaries — which payloads belong in a normalized cache and which belong in local component state.
Implementation 1: Entity Extraction with Deterministic Key Generation
Normalization begins at the extraction step. Every nested API payload must be recursively traversed to isolate discrete records before they reach the cache.
Steps:
- Define entity schemas as TypeScript interfaces with a mandatory id: string field.
- Write an extractor that traverses a raw payload depth-first, pulling named entity types into separate flat maps keyed by entity.id.
- Replace embedded sub-objects with their foreign key: authorId: item.author.id instead of author: item.author.
- Build a parallel collection array of ordered IDs: ids: data.map(d => d.id).
- For polymorphic responses, inject a __typename discriminator during extraction so the selector layer can resolve entity type dynamically.
```typescript
// normalizer.ts: framework-agnostic entity extractor
interface RawPost {
  id: string;
  content: string;
  author: { id: string; name: string; avatarUrl: string };
}

interface NormalizedPost {
  id: string;
  content: string;
  authorId: string;
}

interface NormalizedUser {
  id: string;
  name: string;
  avatarUrl: string;
}

export interface NormalizedPostsResult {
  entities: {
    users: Record<string, NormalizedUser>;
    posts: Record<string, NormalizedPost>;
  };
  ids: string[];
}

export function normalizePosts(raw: RawPost[]): NormalizedPostsResult {
  const users: Record<string, NormalizedUser> = {};
  const posts: Record<string, NormalizedPost> = {};
  const ids: string[] = [];

  for (const item of raw) {
    // Author is upserted by ID, so duplicate authors across posts collapse to one entry
    users[item.author.id] = {
      id: item.author.id,
      name: item.author.name,
      avatarUrl: item.author.avatarUrl,
    };
    posts[item.id] = {
      id: item.id,
      content: item.content,
      authorId: item.author.id, // foreign key reference, not embedded object
    };
    ids.push(item.id);
  }

  return { entities: { users, posts }, ids };
}
```
Cache Behavior Impact: The extractor runs synchronously before any cache write. Entities are de-duplicated at write time: if fifty posts share the same author, the fifty writes to users[authorId] collapse into a single entry. Every subsequent read resolves the author via users[post.authorId]. There is one canonical object in memory, so a single mutation propagates to every selector that resolves through that author ID.
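To make the read path concrete, here is a minimal sketch of a selector that resolves a post's author through the foreign key. The names EntityStore and selectPostWithAuthor, and the sample data, are illustrative assumptions rather than part of any library.

```typescript
// Types mirror the normalized shapes produced by the extractor.
type NormalizedUser = { id: string; name: string; avatarUrl: string };
type NormalizedPost = { id: string; content: string; authorId: string };

type EntityStore = {
  users: Record<string, NormalizedUser>;
  posts: Record<string, NormalizedPost>;
};

// Hypothetical read-side helper: joins a post to its author at read time.
function selectPostWithAuthor(store: EntityStore, postId: string) {
  const post = store.posts[postId];
  if (!post) return undefined;
  const author = store.users[post.authorId];
  return author ? { ...post, author } : undefined;
}

const store: EntityStore = {
  users: { u1: { id: "u1", name: "Ada", avatarUrl: "/a.png" } },
  posts: {
    p1: { id: "p1", content: "first", authorId: "u1" },
    p2: { id: "p2", content: "second", authorId: "u1" },
  },
};

// One write to the canonical user record...
store.users.u1 = { ...store.users.u1, name: "Ada L." };

// ...is visible through every post that references it.
const first = selectPostWithAuthor(store, "p1");
const second = selectPostWithAuthor(store, "p2");
```

Both results point at the same user object, which is the property that lets one mutation reach every dependent view.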
Configuration Trade-offs:
- Composite keys (e.g., ${userId}_${roleId}) work for join-table entities but must be generated deterministically both on write and on read; a mismatch causes phantom duplicates.
- Extracting a __typename discriminator enables a single entities map keyed by __typename + id, which is useful for Apollo-style caches, but adds branching logic that couples the extractor to the schema.
- Avoid JSON.stringify for key hashing: it is not stable across object property orderings and breaks structural equality checks.
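The key-generation trade-offs above can be sketched as a pair of small helpers. The function names, the underscore and colon separators, and the sample IDs are assumptions for illustration; the underscore form is only unambiguous if IDs never contain an underscore themselves.

```typescript
type EntityRef = { __typename: string; id: string };

// Composite key for a join-table entity. The field order is fixed in code,
// so the same (userId, roleId) pair yields the same key on write and on read.
function membershipKey(userId: string, roleId: string): string {
  return `${userId}_${roleId}`;
}

// Type-qualified key for a single entities map shared by all entity types.
function entityKey(ref: EntityRef): string {
  return `${ref.__typename}:${ref.id}`;
}

// JSON.stringify leaks property order into the key:
const a = JSON.stringify({ userId: "u1", roleId: "r9" });
const b = JSON.stringify({ roleId: "r9", userId: "u1" });
const stringifyIsStable = a === b; // false: same data, different strings

const writeKey = membershipKey("u1", "r9");
const readKey = membershipKey("u1", "r9");
const userKey = entityKey({ __typename: "User", id: "u1" });
```

Routing every write and every read through the same helper is what prevents the phantom duplicates mentioned above.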
Implementation 2: TanStack Query with select for Per-Subscriber Normalization
TanStack Query’s select option transforms the raw fetch result at read time, before it reaches the component. It is the idiomatic place to apply normalization in a React Query setup without changing what is stored in the internal QueryCache.
Steps:
- Write a select function that accepts the raw query data and returns a normalized shape.
- Pass it to useQuery; each subscriber receives the transformed result independently.
- Wire a queryClient.setQueryData call inside useMutation's onMutate to patch only the affected entries optimistically, restore the snapshot in onError, and invalidate in onSettled to reconcile with the server.
```typescript
// usePosts.ts
import { useQuery, useMutation, useQueryClient } from '@tanstack/react-query';
import { normalizePosts, type NormalizedPostsResult } from './normalizer';

type RawPost = {
  id: string;
  content: string;
  author: { id: string; name: string; avatarUrl: string };
};

export function usePosts() {
  return useQuery<RawPost[], Error, NormalizedPostsResult>({
    queryKey: ['posts'],
    queryFn: (): Promise<RawPost[]> => fetch('/api/posts').then((r) => r.json()),
    select: normalizePosts,
    // structuralSharing: true is the default. TanStack Query deep-compares
    // the returned object and reuses unchanged entity references, preventing
    // downstream re-renders for unmodified posts.
    staleTime: 30_000,
    gcTime: 5 * 60_000,
  });
}

export function useUpdateUser() {
  const queryClient = useQueryClient();

  return useMutation({
    mutationFn: (payload: { id: string; name: string }) =>
      fetch(`/api/users/${payload.id}`, {
        method: 'PATCH',
        // Declare the body format so the server parses it as JSON
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ name: payload.name }),
      }).then((r) => r.json()),
    onMutate: async (payload) => {
      // Snapshot and optimistically patch the raw cache
      await queryClient.cancelQueries({ queryKey: ['posts'] });
      const snapshot = queryClient.getQueryData<RawPost[]>(['posts']);
      // Returning undefined when nothing is cached leaves the cache untouched
      // instead of seeding it with an empty list.
      queryClient.setQueryData<RawPost[]>(['posts'], (old) =>
        old?.map((post) =>
          post.author.id === payload.id
            ? { ...post, author: { ...post.author, name: payload.name } }
            : post,
        ),
      );
      return { snapshot };
    },
    onError: (_err, _payload, ctx) => {
      // Roll back to snapshot on server rejection
      if (ctx?.snapshot) {
        queryClient.setQueryData(['posts'], ctx.snapshot);
      }
    },
    onSettled: () => {
      // Reconcile with the server regardless of outcome
      queryClient.invalidateQueries({ queryKey: ['posts'] });
    },
  });
}
```
Cache Behavior Impact: The select callback fires inside TanStack Query’s subscriber notification path. Because structuralSharing is true by default, TanStack Query performs a recursive structural equality check on the transformed output before notifying subscribers. If a post’s fields did not change, the reference for that post entry is reused from the previous selector output, preventing the <PostCard /> that renders it from re-rendering. Only the mutated author’s entry gets a new object reference, limiting reconciliation to the components that actually display that author.
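The reference reuse described here can be illustrated with a simplified stand-in for structural sharing. This is a sketch of the idea only, not TanStack Query's actual implementation; shareStructure and the sample store are hypothetical names.

```typescript
type Plain = Record<string, unknown>;
const isPlain = (v: unknown): v is Plain =>
  typeof v === "object" && v !== null && !Array.isArray(v);

// Returns `prev` wherever `next` is deeply equal to it, so unchanged
// subtrees keep their old object identity.
function shareStructure(prev: unknown, next: unknown): unknown {
  if (prev === next) return prev;
  if (Array.isArray(prev) && Array.isArray(next)) {
    const out = next.map((item, i) => shareStructure(prev[i], item));
    const same = prev.length === next.length && out.every((item, i) => item === prev[i]);
    return same ? prev : out;
  }
  if (isPlain(prev) && isPlain(next)) {
    const nextKeys = Object.keys(next);
    const out: Plain = {};
    let unchanged = Object.keys(prev).length === nextKeys.length;
    for (const key of nextKeys) {
      out[key] = shareStructure(prev[key], next[key]);
      if (!(key in prev) || out[key] !== prev[key]) unchanged = false;
    }
    return unchanged ? prev : out;
  }
  return next;
}

type Store = {
  users: Record<string, { id: string; name: string }>;
  posts: Record<string, { id: string; content: string; authorId: string }>;
};

const prevStore: Store = {
  users: { u1: { id: "u1", name: "Ada" } },
  posts: { p1: { id: "p1", content: "hi", authorId: "u1" } },
};
// A fresh selector output: every object is newly allocated, only the name differs.
const nextStore: Store = {
  users: { u1: { id: "u1", name: "Ada L." } },
  posts: { p1: { id: "p1", content: "hi", authorId: "u1" } },
};

const shared = shareStructure(prevStore, nextStore) as Store;
```

In this run the posts subtree keeps its previous identity while the users subtree is replaced, which is why only components reading the renamed author need to reconcile.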
Configuration Trade-offs:
- Setting structuralSharing: false disables reference reuse entirely: every query result allocates fresh objects. Avoid this in normalized setups; it negates the primary re-render benefit.
- staleTime and gcTime must be set independently. gcTime defaults to 5 minutes; if a component unmounts and remounts within that window, TanStack Query serves the cached data immediately and refetches in the background only if that data has gone stale.
- select runs per subscriber, not once per cache entry. If ten components call usePosts(), the normalizePosts function runs ten times on the same raw data. useMemo does not help across subscribers because it is scoped to one component instance; memoize the extractor at module level on the identity of the raw data, or move the transformation upstream (into a queryFn wrapper that stores the normalized result directly) when subscriber count is high.
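One way to share the extractor's work across subscribers is a module-level cache keyed on the identity of the raw data. The sketch below assumes the raw data reference stays the same until the cache entry actually changes; memoizeByReference and the call counter are hypothetical names, and normalize is a reduced stand-in for normalizePosts.

```typescript
// Caches a transform's output per input object identity.
function memoizeByReference<In extends object, Out>(
  transform: (input: In) => Out,
): (input: In) => Out {
  const cache = new WeakMap<In, Out>();
  return (input) => {
    if (cache.has(input)) return cache.get(input) as Out;
    const result = transform(input);
    cache.set(input, result);
    return result;
  };
}

type RawPost = { id: string; content: string; author: { id: string; name: string } };

// Stand-in extractor; counts calls so the effect is observable.
let calls = 0;
function normalize(raw: RawPost[]) {
  calls += 1;
  const posts: Record<string, { id: string; content: string; authorId: string }> = {};
  for (const p of raw) {
    posts[p.id] = { id: p.id, content: p.content, authorId: p.author.id };
  }
  return { posts, ids: raw.map((p) => p.id) };
}

const selectNormalized = memoizeByReference(normalize);

const raw: RawPost[] = [{ id: "p1", content: "hi", author: { id: "u1", name: "Ada" } }];
const firstResult = selectNormalized(raw); // runs normalize
const secondResult = selectNormalized(raw); // same input reference: cache hit
```

Passing a function like selectNormalized as the select option keeps the per-subscriber call cheap, and the WeakMap lets old raw payloads be garbage-collected once nothing else references them.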
Implementation 3: Redux Toolkit createEntityAdapter for Normalized Slices
When your architecture stores server state in Redux — common in apps that mix normalized server entities with complex client-side derived state — Redux Toolkit’s createEntityAdapter provides built-in upsertMany, removeMany, and stable selector factories that integrate directly with the normalized slice pattern.
Steps:
- Define the entity interface and create an adapter with an optional sortComparer.
- Initialise slice state with adapter.getInitialState().
- In the reducers that handle async thunk results, or in RTK Query transformResponse, call upsertMany to merge server payloads into the existing entity map without overwriting untouched entries.
- Use adapter.getSelectors() to create memoized selectAll, selectById, and selectIds selectors; pass a root-state accessor so they are composable with createSelector.
```typescript
// postsSlice.ts
import {
  createEntityAdapter,
  createSlice,
  createAsyncThunk,
  type PayloadAction,
} from '@reduxjs/toolkit';

interface Post {
  id: string;
  content: string;
  authorId: string;
  updatedAt: string;
}

const postsAdapter = createEntityAdapter<Post>({
  sortComparer: (a, b) => b.updatedAt.localeCompare(a.updatedAt),
});

export const fetchPosts = createAsyncThunk('posts/fetchAll', async () => {
  const raw: Array<{ id: string; content: string; author: { id: string }; updatedAt: string }> =
    await fetch('/api/posts').then((r) => r.json());
  // Normalize at thunk boundary: only NormalizedPost shapes enter the adapter
  return raw.map((p) => ({
    id: p.id,
    content: p.content,
    authorId: p.author.id,
    updatedAt: p.updatedAt,
  }));
});

const postsSlice = createSlice({
  name: 'posts',
  initialState: postsAdapter.getInitialState<{ status: 'idle' | 'loading' | 'error' }>({
    status: 'idle',
  }),
  reducers: {
    // Partial update: merge only the fields that changed, preserve the rest
    patchPost: (state, action: PayloadAction<Partial<Post> & { id: string }>) => {
      postsAdapter.updateOne(state, {
        id: action.payload.id,
        changes: action.payload,
      });
    },
  },
  extraReducers: (builder) => {
    builder
      .addCase(fetchPosts.pending, (state) => { state.status = 'loading'; })
      .addCase(fetchPosts.fulfilled, (state, action) => {
        // upsertMany merges incoming posts by ID. Existing entries that are
        // absent from the new payload are NOT evicted, preventing flicker on
        // paginated fetches where only one page is refreshed.
        postsAdapter.upsertMany(state, action.payload);
        state.status = 'idle';
      })
      .addCase(fetchPosts.rejected, (state) => { state.status = 'error'; });
  },
});

export const { patchPost } = postsSlice.actions;
export default postsSlice.reducer;

// Memoized selectors: the root state accessor keeps them composable
type RootState = { posts: ReturnType<typeof postsSlice.reducer> };

export const {
  selectAll: selectAllPosts,
  selectById: selectPostById,
  selectIds: selectPostIds,
} = postsAdapter.getSelectors<RootState>((state) => state.posts);
```
Cache Behavior Impact: upsertMany runs inside an Immer-powered reducer and merges incoming entities into the existing map by ID. Entries absent from the incoming payload are never touched, so they keep their existing object identities after the reducer finishes. Entries present in the payload are shallow-merged and should be treated as possibly receiving a new reference. Selectors created by getSelectors are memoized with createSelector, so selectAll recomputes only when the state.posts.ids or state.posts.entities reference changes, while a component that reads a single post through selectById re-renders only when that post's reference changes.
Configuration Trade-offs:
- updateOne vs upsertOne: updateOne is a silent no-op if the entity does not exist yet; use upsertOne when processing partial server payloads that might arrive before the full collection is loaded.
- sortComparer re-runs on every upsertMany call; for large entity maps, derive sorted order at selector time with a memoized createSelector rather than paying the sort cost on every write.
- The adapter adds approximately 3 kB gzip to bundle size. For applications with fewer than three distinct entity types, a hand-rolled normalized reducer is leaner, but loses the built-in CRUD helpers and selector factories.
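The selector-time sorting mentioned above can be sketched without any library. The factory below is a hand-rolled, single-input stand-in for what createSelector would provide; makeSelectPostsByRecency and the sample state are assumptions for illustration.

```typescript
type Post = { id: string; content: string; updatedAt: string };
type PostsState = { ids: string[]; entities: Record<string, Post> };

// Recomputes the sorted list only when the entities reference changes,
// so writes stay cheap and reads pay the sort cost at most once per change.
function makeSelectPostsByRecency() {
  let lastEntities: PostsState["entities"] | undefined;
  let lastResult: Post[] = [];
  return (state: PostsState): Post[] => {
    if (state.entities === lastEntities) return lastResult;
    lastEntities = state.entities;
    // Object.values returns a fresh array, so sorting in place is safe here.
    lastResult = Object.values(state.entities).sort((a, b) =>
      b.updatedAt.localeCompare(a.updatedAt),
    );
    return lastResult;
  };
}

const selectPostsByRecency = makeSelectPostsByRecency();

const postsState: PostsState = {
  ids: ["p1", "p2"],
  entities: {
    p1: { id: "p1", content: "older", updatedAt: "2024-01-01T00:00:00Z" },
    p2: { id: "p2", content: "newer", updatedAt: "2024-06-01T00:00:00Z" },
  },
};

const sorted = selectPostsByRecency(postsState);
const sortedAgain = selectPostsByRecency(postsState); // cache hit, same array
```

With this approach the adapter can be created without a sortComparer, and each distinct ordering gets its own memoized selector.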
Configuration Trade-offs Summary
| Dimension | Impact | Mitigation |
|---|---|---|
| structuralSharing in TanStack Query | Setting structuralSharing: false makes every fetch produce fresh references, eliminating re-render savings | Leave enabled; disable only in benchmarks to measure baseline overhead |
| select per-subscriber overhead | normalizePosts runs once per useQuery subscriber per cache invalidation cycle | Memoize at module level, keyed on query data identity, if subscriber count exceeds ~10 |
| gcTime and orphaned entities | Entities from unmounted queries linger in QueryCache until gcTime expires; related entities may be stale when the query remounts | Set gcTime to match session freshness requirements; call removeQueries explicitly on logout |
| upsertMany vs full collection replace | Replacing the entire ids array causes every selectPostIds consumer to re-render | Use upsertMany (or setMany) with pagination; use setAll only when replacing the complete dataset intentionally |
| SSR rehydration of normalized graphs | JSON.stringify preserves the order of the ids array, but the key order of the entities object is not a usable ordering source (integer-like keys enumerate in ascending numeric order) | Serialize ids alongside the entity map and treat it as the only source of order; if ids must be rebuilt on the client, sort by an explicit field rather than relying on key enumeration order |
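For the SSR rehydration case, a quick JSON round trip shows which part of a normalized payload can be trusted to carry order. The state shape and sample IDs below are assumptions chosen to expose the behaviour of integer-like keys.

```typescript
type User = { id: string; name: string };

// Server side: display order lives in `ids`, not in the entities map.
const serverState = {
  ids: ["10", "2"], // intended display order
  entities: {} as Record<string, User>,
};
serverState.entities["10"] = { id: "10", name: "Ada" };
serverState.entities["2"] = { id: "2", name: "Grace" };

// Integer-like keys enumerate in ascending numeric order,
// regardless of the order in which they were inserted.
const keyOrder = Object.keys(serverState.entities);

// Round trip through JSON, as a dehydrate/rehydrate step would.
const rehydrated = JSON.parse(JSON.stringify(serverState)) as typeof serverState;

// Resolve the ordered list on the client from ids only.
const ordered = rehydrated.ids.map((id) => rehydrated.entities[id]);
```

Here keyOrder comes out as 2 then 10, while ordered still starts with the entity whose ID is 10, so any client code that derives order from the entity map's keys would silently reorder the list.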
Common Pitfalls & Resolutions
| Observable Issue | Root Cause | Diagnostic Resolution |
|---|---|---|
| Circular reference error during state serialization | Bidirectional relationships stored as direct object references (user.posts[0].author === user) rather than ID foreign keys | Enforce ID-only foreign keys throughout the entity schema; use memoized selectors that resolve references lazily at read time |
| Cache thrashing on infinite scroll pagination | Each page fetch calls setQueryData with a concatenated raw array, allocating new objects for unchanged entities | Switch to upsertMany on the Redux adapter, or accumulate page data in useInfiniteQuery and apply select normalization to the aggregated pages array |
| Stale nested data after partial server update | Server PATCH returns only modified fields; the client write replaces the whole entity (for example setOne, or setQueryData with the partial object), losing unrelated fields not included in the response | Use updateOne with changes: partialPayload (RTK) or deep-merge the partial payload into the existing getQueryData snapshot before calling setQueryData |
| Selector thrash: selectAllPosts re-runs on unrelated entity change | selectAll derives its output from state.posts.entities, and any entity upsert updates the entities reference even if the component's target entity is unchanged | Use selectPostById for single-entity selectors; compose createSelector chains that depend only on the specific entity subtree the component actually renders |
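For the partial-update pitfall, the deep-merge step might look like the following sketch. deepMerge is a hypothetical helper, not a library function; it assumes plain JSON-style objects and treats arrays and primitives in the patch as wholesale replacements.

```typescript
type Plain = { [key: string]: unknown };

function isPlainObject(value: unknown): value is Plain {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merges `patch` into `base` without mutating either argument.
function deepMerge<T extends Plain>(base: T, patch: Plain): T {
  const out: Plain = { ...base };
  for (const [key, value] of Object.entries(patch)) {
    if (key === "__proto__") continue; // never let a payload rewrite the prototype
    const existing = out[key];
    out[key] =
      isPlainObject(existing) && isPlainObject(value)
        ? deepMerge(existing, value)
        : value;
  }
  return out as T;
}

type User = { id: string; name: string; profile: { bio: string; location: string } };

const cachedUser: User = {
  id: "u1",
  name: "Ada",
  profile: { bio: "Engineer", location: "London" },
};

// A PATCH response that carries only the modified field.
const patchResponse = { id: "u1", profile: { bio: "Mathematician" } };

const mergedUser = deepMerge(cachedUser, patchResponse);
```

The merged record keeps name and profile.location from the cached copy, and the cached copy itself is left unmodified, so it can still serve as a rollback snapshot.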
Frequently Asked Questions
When should I normalize versus keep nested API responses intact?
Normalize when multiple UI components consume overlapping entities or when referential integrity is critical across views — a feed, a sidebar, and an edit form all rendering the same user record is the canonical case. Keep nested structures intact for isolated, single-consumer views where the lookup overhead and extractor verbosity outweigh the deduplication benefit. A rule of thumb: if only one component ever reads a given entity shape, normalization adds complexity without measurable gain.
Does TanStack Query's select option affect what is stored in the raw cache?
No. The select transform runs at read time per subscriber — it does not mutate what TanStack Query stores in its internal QueryCache. Every subscriber receives the transformed output independently; two subscribers can apply different select transforms to the same raw cache entry. The raw fetch result persists unchanged, meaning queryClient.getQueryData(['posts']) always returns the original server payload, not the normalized shape.
How do I invalidate a normalized entity without refetching its entire collection?
Invalidate by entity ID using queryClient.invalidateQueries with a fine-grained queryKey that includes the entity ID: queryClient.invalidateQueries({ queryKey: ['user', userId] }). Pair this with structuralSharing: true (the default) so only the changed entity reference updates — components that subscribe to other entities in the same collection remain stable. For cross-entity cascades, maintain a dependency map that links an entity ID to the query keys that depend on it, then invalidate each in a single Promise.all call inside onSuccess.
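A dependency map for cross-entity cascades could be sketched as below. DependencyMap and invalidateDependents are hypothetical names, and the invalidate callback is injected so the sketch stays library-agnostic; with TanStack Query it would typically wrap queryClient.invalidateQueries({ queryKey }).

```typescript
type QueryKey = readonly unknown[];

// Registry linking an entity key to the query keys that render it.
class DependencyMap {
  private deps = new Map<string, QueryKey[]>();

  register(entityKey: string, queryKey: QueryKey): void {
    const list = this.deps.get(entityKey) ?? [];
    list.push(queryKey);
    this.deps.set(entityKey, list);
  }

  keysFor(entityKey: string): QueryKey[] {
    return this.deps.get(entityKey) ?? [];
  }
}

// Invalidates every dependent query in parallel and resolves when all settle.
async function invalidateDependents(
  map: DependencyMap,
  entityKey: string,
  invalidate: (queryKey: QueryKey) => Promise<void>,
): Promise<void> {
  await Promise.all(map.keysFor(entityKey).map((key) => invalidate(key)));
}

const deps = new DependencyMap();
deps.register("user:u1", ["user", "u1"]);
deps.register("user:u1", ["posts"]);
deps.register("user:u2", ["user", "u2"]);

const dependents = deps.keysFor("user:u1");
```

Calling invalidateDependents(deps, "user:u1", invalidate) from onSuccess would then touch only the two registered keys and leave queries for other users alone.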
What causes cache thrashing during infinite scroll and how do I fix it?
Overwriting the entire entity array on each page fetch allocates new object references for every post entry, causing selectAll or useInfiniteQuery's aggregated data to produce new arrays on each page load, even for pages the user already scrolled past. Fix this in Redux by calling upsertMany rather than setAll; in TanStack Query, use useInfiniteQuery with a select that normalizes across data.pages, and keep the select function reference stable so it reruns only when the cached pages change.
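A select function that normalizes across pages might look like this sketch. The local InfiniteData type is a stand-in for the pages/pageParams shape that useInfiniteQuery exposes, and normalizePages plus the sample pages are assumptions for illustration.

```typescript
type RawPost = { id: string; content: string; author: { id: string; name: string } };
type InfiniteData<TPage> = { pages: TPage[]; pageParams: unknown[] };

// Flattens every loaded page into one entity map plus one ordered ID list.
function normalizePages(data: InfiniteData<RawPost[]>) {
  const users: Record<string, { id: string; name: string }> = {};
  const posts: Record<string, { id: string; content: string; authorId: string }> = {};
  const ids: string[] = [];

  for (const page of data.pages) {
    for (const item of page) {
      users[item.author.id] = { id: item.author.id, name: item.author.name };
      // A post that appears on two pages (for example after an insert shifts
      // offsets) keeps its first position and is listed only once.
      if (!(item.id in posts)) ids.push(item.id);
      posts[item.id] = { id: item.id, content: item.content, authorId: item.author.id };
    }
  }

  return { entities: { users, posts }, ids };
}

const infiniteData: InfiniteData<RawPost[]> = {
  pages: [
    [
      { id: "p1", content: "one", author: { id: "u1", name: "Ada" } },
      { id: "p2", content: "two", author: { id: "u2", name: "Grace" } },
    ],
    [
      { id: "p2", content: "two", author: { id: "u2", name: "Grace" } },
      { id: "p3", content: "three", author: { id: "u1", name: "Ada" } },
    ],
  ],
  pageParams: [0, 1],
};

const normalizedPages = normalizePages(infiniteData);
```

Because normalizePages is declared at module level, its reference is stable across renders, and with structural sharing enabled the entries for already-loaded posts can keep their previous identities when a new page arrives.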
Related
- State Architecture & Cache Fundamentals — the parent topic covering every structural decision that governs how server state is modeled in the browser, from cache layer design to hydration.
- How to Design a Normalized State Tree — a step-by-step guide to mapping server entity schemas to stable in-memory reference graphs, including partial update and rollback patterns.
- Reference vs Value Storage Models — explains why reference stability matters for memoized selectors and how to audit your cache for accidental value-copy semantics.
- Cache Layer Architecture — details the extraction boundaries between the network layer and the normalized entity store, including where to intercept fetches for transformation.
- Entity Mapping Strategies — covers deterministic key generation, composite keys, and polymorphic entity resolution in depth with production examples.
- Mutation Sync & Rollback — covers optimistic update patterns and transactional rollback strategies when server mutations conflict with normalized cache state.