Entity Mapping Strategies
When raw API responses reach the client as deeply nested JSON trees, the cache stores multiple copies of the same logical entity under different query keys. The result is referential drift: a user record updated via mutation stays stale inside every query that embedded it. This page covers the transformation layer — entity mappers — that converts server payloads into flat, ID-indexed structures before they touch the store.
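The drift described above can be reproduced without any library. The sketch below is only an illustration: the cache shapes and the names nestedCache and flatCache are hypothetical stand-ins for a query cache, not any library's API.

```typescript
// Hypothetical illustration: two "query results" that each embed their own copy of user u1.
type UserRecord = { id: string; name: string };

const nestedCache: Record<string, unknown> = {
  "user:u1": { id: "u1", name: "Ada" },
  "post:p7": { id: "p7", title: "Hello", author: { id: "u1", name: "Ada" } },
};

// A mutation that only knows about the "user:u1" key leaves the embedded copy untouched.
nestedCache["user:u1"] = { id: "u1", name: "Ada L." };
const staleAuthor = (nestedCache["post:p7"] as { author: UserRecord }).author.name;

// Flat, ID-indexed alternative: the post stores a foreign key, not a copy.
const flatCache = {
  users: { u1: { id: "u1", name: "Ada" } } as Record<string, UserRecord>,
  posts: { p7: { id: "p7", title: "Hello", authorId: "u1" } },
};

flatCache.users["u1"] = { ...flatCache.users["u1"], name: "Ada L." };
const freshAuthor = flatCache.users[flatCache.posts.p7.authorId].name;

console.log(staleAuthor); // the embedded copy still holds the old name
console.log(freshAuthor); // the ID lookup sees the update
```

In the flat shape there is exactly one place to write a user, so every reader that resolves the user by ID sees the same value.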
Entity mapping sits inside the broader Data Normalization & Query Key Design system. It is the step that makes relationship stitching in cache possible, and it directly determines whether pagination normalization patterns merge list pages cleanly or duplicate records. For a comparison of the tradeoffs this choice implies at the state-architecture level, see client vs server state boundaries.
Diagnostic checklist
You are in the right place if you observe any of these symptoms:
- A mutation updates a record on the server, but some components still display the old value until a full page reload.
- Paginating through a list inserts duplicate items because the same entity appears in two query key buckets.
- DevTools shows the same object stored under three different keys (['user', 1], ['post', 7, 'author'], ['team', 3, 'members', 0]).
- Invalidating a single query key causes a cascade of unrelated UI re-renders because child objects are embedded rather than referenced by ID.
- Memory grows monotonically over a long session as normalized slices accumulate without garbage collection.
Prerequisites
Before implementing entity mappers, understand the following concepts (each is linked to its dedicated page):
- Normalization principles for UI: why flat ID-indexed stores outperform nested object trees for read performance and cache coherence.
- Reference vs value storage models: how structural sharing works in React Query’s structuralSharing option and Apollo’s normalized InMemoryCache, and why the distinction matters before you choose a mapping depth.
- Designing stable query keys for React Query: query key structure must be derived from normalized entity IDs, not raw endpoint parameters, or the mapper and the cache will drift apart.
Architecture overview
The diagram below shows the transformation pipeline from network response to cache store. Each stage is a discrete boundary where a mapping or validation function operates before data moves to the next layer.
Implementation 1 — Deterministic payload transformation with Zod
The goal of the first stage is to convert a validated server payload into a flat { entities, ids } slice ready for direct cache insertion.
Steps:
- Define a Zod schema that mirrors the raw API contract exactly, including nested relations.
- Parse the incoming array through the schema at the start of your queryFn (not inside the component). Zod throws synchronously on a schema mismatch, which React Query surfaces as an error state, preventing a malformed payload from reaching the cache.
- Iterate the validated array and build two structures: an entities record keyed by entity ID, and an ordered ids array that preserves server-side sort.
- Replace embedded child objects with arrays of their IDs (foreign keys). Store the child entities in a parallel slice.
// src/mappers/users.ts
import { z } from 'zod';
const PostSchema = z.object({
id: z.string(),
title: z.string(),
publishedAt: z.string().nullable(),
});
const RawUserSchema = z.object({
id: z.string(),
name: z.string(),
email: z.string().email(),
posts: z.array(PostSchema),
});
type RawUser = z.infer<typeof RawUserSchema>;
export interface NormalizedUsers {
entities: Record<string, { id: string; name: string; email: string; postIds: string[] }>;
ids: string[];
}
export interface NormalizedPosts {
entities: Record<string, z.infer<typeof PostSchema>>;
ids: string[];
}
export function normalizeUsersPayload(raw: unknown): {
users: NormalizedUsers;
posts: NormalizedPosts;
} {
// Throws ZodError on schema mismatch — React Query converts this to error state
const validated = z.array(RawUserSchema).parse(raw);
const userEntities: NormalizedUsers['entities'] = {};
const userIds: string[] = [];
const postEntities: NormalizedPosts['entities'] = {};
const postIds: string[] = [];
for (const user of validated) {
// Replace embedded post objects with IDs (foreign-key flattening)
userEntities[user.id] = {
id: user.id,
name: user.name,
email: user.email,
postIds: user.posts.map((p) => p.id),
};
userIds.push(user.id);
for (const post of user.posts) {
if (!postEntities[post.id]) {
postEntities[post.id] = post;
postIds.push(post.id);
}
}
}
return {
users: { entities: userEntities, ids: userIds },
posts: { entities: postEntities, ids: postIds },
};
}
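Cache Behavior Impact: React Query stores whatever the queryFn resolves with. Because normalizeUsersPayload returns plain objects and arrays, React Query’s structuralSharing (enabled by default) compares each refetch result with the cached value and reuses the previous reference for every unchanged subtree. If only one user’s name changed, that user entry and its ancestors (the entities record and the root object) get new references, while every unmodified user entry and the ids array keep their identity, so components subscribed to unchanged entities do not re-render. Zod’s parse runs before the queryFn promise resolves, so a malformed response never enters the cache: the ZodError rejects the promise, React Query increments failureCount and applies your retry policy (three retries by default), and the query reaches status: 'error' only after the retries are exhausted. Because a schema mismatch is deterministic, consider returning false from a retry function for ZodError instances so the failure surfaces immediately.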
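To see why unchanged entries keep their identity, here is a simplified sketch of structural sharing. It is not React Query's actual implementation: the function names share and shareStructure are hypothetical, and the sketch handles only plain objects, arrays, and primitives.

```typescript
// Simplified sketch of structural sharing; NOT React Query's actual implementation.
function isPlainObject(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && Object.getPrototypeOf(v) === Object.prototype;
}

function share(prev: unknown, next: unknown): unknown {
  if (prev === next) return prev;
  const bothArrays = Array.isArray(prev) && Array.isArray(next);
  if (!bothArrays && !(isPlainObject(prev) && isPlainObject(next))) return next;

  const prevObj = prev as Record<string, unknown>;
  const nextObj = next as Record<string, unknown>;
  const nextKeys = Object.keys(nextObj);
  const result: Record<string, unknown> = bothArrays
    ? ([] as unknown as Record<string, unknown>)
    : {};
  let reused = 0;

  for (const key of nextKeys) {
    // Recurse so unchanged subtrees come back as the previous reference.
    result[key] = share(prevObj[key], nextObj[key]);
    if (key in prevObj && result[key] === prevObj[key]) reused++;
  }

  // Same key count and every value reused: hand back the old reference.
  const sameShape = Object.keys(prevObj).length === nextKeys.length;
  return sameShape && reused === nextKeys.length ? prev : result;
}

function shareStructure<T>(prev: unknown, next: T): T {
  return share(prev, next) as T;
}

const before = {
  entities: { u1: { id: "u1", name: "Ada" }, u2: { id: "u2", name: "Grace" } },
  ids: ["u1", "u2"],
};
// A refetch returns brand-new objects; only u1's name actually changed.
const refetched = {
  entities: { u1: { id: "u1", name: "Ada L." }, u2: { id: "u2", name: "Grace" } },
  ids: ["u1", "u2"],
};
const after = shareStructure(before, refetched);

console.log(after.entities.u2 === before.entities.u2); // true: unchanged user keeps its reference
console.log(after.entities.u1 === before.entities.u1); // false: changed user is a new object
console.log(after.ids === before.ids); // true: the ids array is reused
```

The changed entity and its ancestors are new objects; everything else is handed back by reference, which is what lets reference-equality checks in subscribers skip work.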
Configuration trade-offs:
- Setting staleTime: Infinity on the normalized query prevents background refetches from overwriting optimistic cache patches mid-flight. Pair it with explicit invalidateQueries after mutation settlement.
- gcTime (formerly cacheTime) controls how long the normalized slice persists after all subscribers unmount. Increase it (e.g. gcTime: 10 * 60 * 1000) on low-churn entities like org settings to survive route transitions without a refetch.
- structuralSharing: true (default) is essential here: disabling it would cause every refetch to produce new object references even when data is unchanged, re-rendering every subscriber.
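One way to keep these settings consistent is to group them into reusable presets. The option names staleTime, gcTime, and structuralSharing are React Query v5 query options; the preset names and the split into two presets are hypothetical.

```typescript
// Hypothetical presets; spread one into a useQuery call next to queryKey and queryFn.
const TEN_MINUTES_MS = 10 * 60 * 1000;

const normalizedEntityDefaults = {
  staleTime: Infinity, // no background refetch; rely on explicit invalidateQueries
  gcTime: 5 * 60 * 1000, // matches the React Query v5 default of five minutes
  structuralSharing: true, // the default, stated explicitly for clarity
};

const lowChurnEntityDefaults = {
  ...normalizedEntityDefaults,
  gcTime: TEN_MINUTES_MS, // survive route transitions for rarely-changing entities
};

console.log(lowChurnEntityDefaults);
```

A call site would then look like useQuery({ queryKey, queryFn, ...lowChurnEntityDefaults }), keeping the trade-off decision in one place.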
Implementation 2 — React Query adapter with select
The select option applies a read-time projection on the cached raw response without altering what is stored. This is the right pattern when you need different components to derive different shapes from the same server payload.
Steps:
- Write a queryFn that returns the raw validated payload.
- Pass select a pure function that maps the raw cache value to the shape the component needs.
- Derive the queryKey from the logical entity scope, not from URL parameters; see designing stable query keys for React Query for the full key structure guide.
- For mutations, write the normalized result directly to the affected query keys via queryClient.setQueryData to avoid a round-trip.
// src/hooks/useNormalizedUsers.ts
import { useQuery, useQueryClient, useMutation } from '@tanstack/react-query';
import { normalizeUsersPayload, type NormalizedUsers } from '../mappers/users';
// Raw fetch — stores the full normalizeUsersPayload result in cache
async function fetchUsers() {
const res = await fetch('/api/v2/users');
if (!res.ok) throw new Error(`HTTP ${res.status}`);
return normalizeUsersPayload(await res.json());
}
// Hook A: component only needs the ID-ordered list
export function useUserIds() {
return useQuery({
queryKey: ['users', 'normalized'],
queryFn: fetchUsers,
staleTime: 30_000,
// select runs per-subscriber at read time; does NOT alter cached value
select: (data) => data.users.ids,
});
}
// Hook B: component needs a single user by ID
export function useUser(userId: string) {
return useQuery({
queryKey: ['users', 'normalized'],
queryFn: fetchUsers,
staleTime: 30_000,
select: (data) => data.users.entities[userId] ?? null,
});
}
// Mutation: patch a user and update the cache without a refetch
export function useUpdateUser() {
const queryClient = useQueryClient();
return useMutation({
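mutationFn: async (patch: { id: string; name: string }) => {
const res = await fetch(`/api/v2/users/${patch.id}`, {
method: 'PATCH',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(patch),
});
// fetch resolves on HTTP error statuses, so reject explicitly to keep error bodies out of the cache
if (!res.ok) throw new Error(`HTTP ${res.status}`);
return (await res.json()) as { id: string; name: string };
},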
onSuccess: (serverUser) => {
// Write the single updated entity directly into the normalized cache slice
queryClient.setQueryData(
['users', 'normalized'],
(prev: ReturnType<typeof normalizeUsersPayload> | undefined) => {
if (!prev) return prev;
return {
...prev,
users: {
...prev.users,
entities: {
...prev.users.entities,
[serverUser.id]: {
...prev.users.entities[serverUser.id],
name: serverUser.name,
},
},
},
};
},
);
},
});
}
Cache Behavior Impact: Both useUserIds and useUser share a single underlying cache entry at ['users', 'normalized']. React Query deduplicates fetches by query key, so the network is hit once regardless of how many subscribers are active, and select runs on the cached value at read time rather than before it is stored. Each subscriber’s select result is tracked independently: useUser('u1') and useUser('u2') each compare their derived value with the previous one and skip a re-render if their entity did not change. The updater passed to setQueryData in onSuccess copies only the path to the modified entity, and structural sharing keeps every other user’s reference stable.
Configuration trade-offs:
- React Query re-runs select when the cached data changes or when the select function’s identity changes. An inline selector (e.g. select: (d) => d.users.ids.slice(0, 10)) is a new function on every render, so it re-runs each time; wrap it in useCallback or define it outside the component to avoid the repeated work. With structuralSharing disabled, the fresh result reference would also trigger a re-render.
- Avoid storing the select output in state. React Query manages the derived value internally; double-storing it creates two sources of truth.
- When multiple query keys need the same entity (e.g. ['users', 'normalized'] and ['dashboard']), prefer a single canonical key and derive from it rather than maintaining two normalized slices. The relationship stitching in cache patterns cover cross-key entity resolution in depth.
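A minimal sketch of selector stabilization without React: selectors are either defined once at module level or handed out by a small factory that returns the same function for the same argument. The names UsersSlice, selectUserById, and selectFirstTenIds are hypothetical; the slice shape mirrors the output of normalizeUsersPayload in reduced form.

```typescript
// Reduced version of the normalized slice shape used on this page.
type CachedUser = { id: string; name: string };
type UsersSlice = { users: { entities: Record<string, CachedUser>; ids: string[] } };
type UserSelector = (data: UsersSlice) => CachedUser | null;

// Hypothetical factory: one selector function per userId, created lazily and reused,
// so the identity passed to `select` stays stable across renders.
const selectorCache = new Map<string, UserSelector>();

function selectUserById(userId: string): UserSelector {
  let selector = selectorCache.get(userId);
  if (!selector) {
    selector = (data: UsersSlice) => data.users.entities[userId] ?? null;
    selectorCache.set(userId, selector);
  }
  return selector;
}

// Module-level selector: defined once, so its identity never changes.
const selectFirstTenIds = (data: UsersSlice): string[] => data.users.ids.slice(0, 10);

const sample: UsersSlice = {
  users: {
    entities: { u1: { id: "u1", name: "Ada" }, u2: { id: "u2", name: "Grace" } },
    ids: ["u1", "u2"],
  },
};

console.log(selectUserById("u1") === selectUserById("u1")); // true: same function identity
console.log(selectUserById("u1")(sample));
console.log(selectFirstTenIds(sample));
```

Inside a hook, select: selectUserById(userId) then plays the role that useCallback would, without tying the selector to a component. The cache grows with the number of distinct IDs, so this fits bounded ID sets best.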
Implementation 3 — Apollo Client InMemoryCache with keyFields
Apollo’s InMemoryCache normalizes by __typename + id automatically. keyFields overrides the identity key when your server uses a non-standard field.
Steps:
- Configure InMemoryCache with typePolicies that declare keyFields for each type.
- Add merge functions for fields that return arrays to prevent Apollo from clobbering existing list entries on a partial update.
- Use read functions to derive computed values at read time (equivalent to React Query’s select).
- For polymorphic unions, add possibleTypes so Apollo can normalize interface-typed responses to the correct concrete type.
// src/apolloClient.ts
import { ApolloClient, InMemoryCache, gql } from '@apollo/client';
const cache = new InMemoryCache({
typePolicies: {
User: {
// Use 'uuid' as the cache key instead of the default 'id'
keyFields: ['uuid'],
fields: {
posts: {
// Merge incoming posts with existing cached posts (pagination / refetch safety)
merge(existing: readonly unknown[] = [], incoming: readonly unknown[]) {
// Build a Set of existing refs to de-duplicate on append
const seen = new Set(existing.map((ref) => (ref as { __ref: string }).__ref));
const merged = [...existing];
for (const item of incoming) {
const ref = (item as { __ref: string }).__ref;
if (!seen.has(ref)) {
seen.add(ref);
merged.push(item);
}
}
return merged;
},
},
},
},
Post: {
keyFields: ['uuid'],
fields: {
// Computed read field: derive display title without storing it
displayTitle: {
read(_, { readField }) {
const title = readField<string>('title');
const publishedAt = readField<string | null>('publishedAt');
return publishedAt ? title : `[Draft] ${title}`;
},
},
},
},
},
possibleTypes: {
// Polymorphic interface — Apollo needs the concrete types to normalize correctly
ContentNode: ['Post', 'Video', 'Poll'],
},
});
export const client = new ApolloClient({
uri: '/graphql',
cache,
});
// Usage in a component
export const USERS_QUERY = gql`
query GetUsers {
users {
uuid
name
email
posts {
uuid
title
publishedAt
displayTitle @client
}
}
}
`;
Cache Behavior Impact: When Apollo writes a GetUsers result, InMemoryCache walks every object in the response, reads __typename plus the configured keyFields, and stores each entity at a stable cache ID. With the default id key that ID looks like User:abc-123; with keyFields: ['uuid'] as configured above, it is serialized as User:{"uuid":"abc-123"}. If a mutation response includes the same user with an updated name, Apollo merges only that field, and all other queries that reference the entity by cache ID reflect the change without an additional network request. The merge function on posts prevents the common bug where a partial list fetch replaces the full cached list; instead, it appends only new post refs. The @client directive on displayTitle triggers the local read function, which computes the value from the cached fields on each read without storing a derived copy.
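The write path can be pictured with a small reference-based normalizer. This is a simplified sketch, not Apollo's implementation: cacheIdFor, writeEntity, and the KeyFieldsConfig shape are hypothetical, list fields are simply replaced rather than merged, and the serialized ID format for custom key fields is an assumption modeled on the Typename:{"field":"value"} form.

```typescript
// Simplified sketch of reference-based normalization; NOT Apollo's implementation.
type KeyFieldsConfig = Record<string, string[]>; // typename -> key field names (hypothetical shape)
type StoreRef = { __ref: string };
type FlatStore = Record<string, Record<string, unknown>>;

function cacheIdFor(obj: Record<string, unknown>, keyFields: KeyFieldsConfig): string | null {
  const typename = obj.__typename;
  if (typeof typename !== "string") return null;
  const custom = keyFields[typename];
  if (!custom) {
    // Default identity: typename plus id.
    return obj.id === undefined ? null : `${typename}:${String(obj.id)}`;
  }
  // Custom identity: serialize the configured key fields (assumed format).
  const keyObject: Record<string, unknown> = {};
  for (const field of custom) keyObject[field] = obj[field];
  return `${typename}:${JSON.stringify(keyObject)}`;
}

function writeEntity(value: unknown, store: FlatStore, keyFields: KeyFieldsConfig): unknown {
  if (Array.isArray(value)) return value.map((v) => writeEntity(v, store, keyFields));
  if (typeof value !== "object" || value === null) return value;

  const source = value as Record<string, unknown>;
  const fields: Record<string, unknown> = {};
  for (const [k, v] of Object.entries(source)) fields[k] = writeEntity(v, store, keyFields);

  const id = cacheIdFor(source, keyFields);
  if (id === null) return fields; // no identity: stays embedded as a value object
  store[id] = { ...store[id], ...fields }; // field-level merge into the existing entity
  const ref: StoreRef = { __ref: id };
  return ref;
}

const store: FlatStore = {};
const keyFields: KeyFieldsConfig = { User: ["uuid"], Post: ["uuid"] };

writeEntity(
  {
    __typename: "User",
    uuid: "abc-123",
    name: "Ada",
    posts: [{ __typename: "Post", uuid: "p1", title: "Hello" }],
  },
  store,
  keyFields,
);
// A later mutation response carrying only the changed field merges into the same entry.
writeEntity({ __typename: "User", uuid: "abc-123", name: "Ada L." }, store, keyFields);

console.log(Object.keys(store));
console.log(store['User:{"uuid":"abc-123"}']);
```

Because the user's posts field holds refs rather than copies, the second write updates the name without touching the post entity or the list that points to it.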
Configuration trade-offs:
- Omitting a merge function on a list field makes Apollo replace the cached array entirely on every partial refetch (in development it may also log a warning that cache data may be lost), which causes data loss under pagination.
- keyFields: false disables normalization for a type, embedding it inline as a value object. Use this deliberately for immutable value types (currency amounts, timestamps) to avoid polluting the normalized store with objects that will never be updated individually.
- Apollo’s fetchPolicy: 'cache-and-network' always fires a network request after serving the cache, which is useful for high-churn entities but means every cache hit still costs a request. Use cache-first with explicit refetchQueries in mutations for entities that change only via user action.
Common Pitfalls & Resolutions
| Observable Issue | Root Cause | Diagnostic Resolution |
|---|---|---|
| Mutation updates a user but some components still show the stale name | Mutation response bypasses the entity mapper and writes a partial object directly to a different query key | Route all mutation responses through queryClient.setQueryData using the same normalized key and mapper shape; alternatively use Apollo’s automatic cache normalization via keyFields so any mutation including the entity ID triggers a merge |
| Paginated list shows duplicate rows after fetching page 2 | New page entities are appended raw rather than merged through an ID-deduplication step | Guard inserts with an ID check in the mapper loop (Implementation 1 checks postEntities[post.id] before pushing to ids); in Apollo, add a merge function that guards on __ref identity |
| select callback causes re-render on every background refetch even when data did not change | select returns a new array/object reference on each call (e.g. .filter(...) inline) | Stabilize the selector: move it outside the component, memoize it with useCallback, or use a stable selector library like reselect |
| Memory grows over a long session until the tab crashes | Normalized entities are inserted without gcTime or garbage collection; no subscriber eviction | Set a finite gcTime (default 5 min in React Query v5); for Apollo, call cache.evict({ id }) with the entity’s cache ID (cache.identify(obj) returns it) followed by cache.gc() after mutations that delete entities |
| Zod parse fails in production for a valid-looking payload | The server contract drifted from the local schema: a field changed type, became nullable, or was removed. A newly added field alone does not fail z.object, which strips unknown keys by default; it fails only under .strict() | Log ZodError.issues to find the failing path; relax the field with .nullable() or .optional() to unblock, or drop .strict() if the failure is an unrecognized key; schedule a schema sync; use .passthrough() only when you need unknown fields preserved in the output |
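For the duplicate-rows case in the table above, the ID-deduplication step can be sketched as a pure merge over the { entities, ids } slice shape. The names Slice and mergePage are hypothetical, and "newest copy wins" is an assumed policy that you may want to invert for your data.

```typescript
type Slice<T extends { id: string }> = { entities: Record<string, T>; ids: string[] };

// Merge one fetched page into an existing normalized slice without duplicating IDs.
function mergePage<T extends { id: string }>(existing: Slice<T>, page: T[]): Slice<T> {
  const entities = { ...existing.entities };
  const ids = [...existing.ids];
  const seen = new Set(existing.ids);

  for (const item of page) {
    entities[item.id] = item; // assumed policy: the newest copy of an entity wins
    if (!seen.has(item.id)) {
      seen.add(item.id);
      ids.push(item.id); // append only IDs not already in the list
    }
  }
  return { entities, ids };
}

type Row = { id: string; title: string };
const pageOne: Slice<Row> = mergePage({ entities: {}, ids: [] }, [
  { id: "a", title: "First" },
  { id: "b", title: "Second" },
]);
// Page 2 overlaps with page 1 on "b", as happens when rows shift between requests.
const merged = mergePage(pageOne, [
  { id: "b", title: "Second (edited)" },
  { id: "c", title: "Third" },
]);

console.log(merged.ids);
console.log(merged.entities["b"].title);
```

The function returns new objects instead of mutating its input, so it can be used directly as a setQueryData updater or inside an infinite-query page reducer.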
Frequently Asked Questions
Should I normalize at the fetcher level or inside the UI component?
Always normalize at the fetcher or query adapter level. Component-level normalization runs per subscriber, producing redundant parse cycles for every component that calls the same hook. More critically, it breaks the single-source-of-truth guarantee: two components that independently normalize the same raw cache entry can produce different shapes depending on their rendering order or filter conditions, causing divergent UI state that is nearly impossible to debug.
Does the React Query select option store the transformed result in the cache?
No. The raw value returned by queryFn is stored. select is applied per subscriber at read time, after React Query retrieves the raw value from its internal store. This means: (a) multiple components can derive different shapes from a single cache entry without additional network requests; (b) invalidating the underlying key refetches and renormalizes once, then re-applies each subscriber’s select independently; (c) if you need the transformed result in another query’s queryFn, you must call queryClient.getQueryData and apply the mapper manually, because select output is not visible to the query client itself.
How do optimistic updates interact with entity mapping?
Optimistic payloads must follow the exact same normalized schema as the eventual server response. In React Query, write the optimistic value via queryClient.setQueryData using the same mapper output type before the mutation fires (onMutate), then roll back with the snapshot in onError. If the optimistic shape diverges from the server response shape — for example, the optimistic update omits postIds — the rollback will produce a structural mismatch that structuralSharing cannot resolve, causing a visible flash. In Apollo, pass optimisticResponse with __typename included so InMemoryCache can apply the same merge policy it would use for a real server response.
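The snapshot-and-rollback flow can be sketched as a pure function over the normalized slice, independent of any library. The helper applyOptimisticRename and the local cacheValue variable are hypothetical; in React Query the three marked steps would correspond to onMutate (read with getQueryData, write with setQueryData) and onError (write the snapshot back).

```typescript
type UserEntity = { id: string; name: string; email: string; postIds: string[] };
type UsersState = { entities: Record<string, UserEntity>; ids: string[] };

// Returns the patched state plus the snapshot needed to roll back.
// The patch spreads the existing entity, so postIds and email are preserved
// and the optimistic shape matches the mapper output exactly.
function applyOptimisticRename(state: UsersState, id: string, name: string) {
  const current = state.entities[id];
  if (!current) return { next: state, snapshot: state };
  const next: UsersState = {
    ...state,
    entities: { ...state.entities, [id]: { ...current, name } },
  };
  return { next, snapshot: state };
}

let cacheValue: UsersState = {
  entities: { u1: { id: "u1", name: "Ada", email: "ada@example.com", postIds: ["p1"] } },
  ids: ["u1"],
};

// Step 1 (onMutate): take the snapshot and write the optimistic value.
const { next, snapshot } = applyOptimisticRename(cacheValue, "u1", "Ada L.");
cacheValue = next;
const optimisticName = cacheValue.entities["u1"].name;

// Step 2 (onError): the request failed, so restore the snapshot.
cacheValue = snapshot;
const rolledBackName = cacheValue.entities["u1"].name;

console.log(optimisticName, rolledBackName);
```

Because the snapshot is the untouched previous object, rollback restores the exact references subscribers held before the mutation.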
When should I use Apollo's InMemoryCache keyFields over a manual mapper function?
Use keyFields when your API consistently provides __typename + a stable unique ID for every object type and you want normalization to be automatic across all queries and mutations. Use a manual mapper function (as in Implementations 1 and 2) when: the server omits __typename (REST, non-GraphQL); you need to rename, split, or derive fields during normalization (not just identify entities); the normalization shape differs from the wire format for business logic reasons; or you are using React Query, SWR, or RTK Query rather than Apollo. The two approaches are not mutually exclusive — you can wrap Apollo responses in a mapper for shape transformation and still rely on InMemoryCache for entity identity and merge.
Related
- Data Normalization & Query Key Design — the parent reference covering the full normalization system, including query key structure, entity lifecycle, and cross-framework patterns.
- Designing Stable Query Keys for React Query — a focused implementation guide: how to structure query keys so they derive from normalized entity IDs rather than raw URL parameters, preventing cache key drift after mapping.
- Relationship Stitching in Cache — how to resolve foreign-key references across normalized slices at read time, the step that follows entity mapping in the data pipeline.
- Pagination Normalization Patterns — extending entity mapping to cursor-based and offset pagination, including the merge strategies that prevent duplicate records across pages.
- Nested Data Flattening Techniques — sibling coverage for deeply nested GraphQL responses where multiple levels of embedding must be collapsed before ID-indexed storage.
- Cache Layer Architecture — the foundational layer below entity mapping: understanding where the normalized store sits relative to the network, service worker, and rendering layers.