Pagination Normalization Patterns
Paginated APIs expose a structural tension that breaks naive cache implementations: the same entity can appear in multiple pages, traversal metadata mutates independently of entity data, and rapid scrolling generates overlapping concurrent fetches that race to write the same cache slots. Resolving this requires applying the Data Normalization & Query Key Design discipline specifically to list traversal — separating cursor state, page metadata, and entity storage into distinct cache layers with different invalidation lifecycles.
This guide addresses the concrete implementation decisions: how to build a unified adapter that handles both offset and cursor APIs, how to architect the resultIds / entities split that prevents re-render storms, and how to enforce deduplication merge semantics that survive concurrent fetches. For the cross-cutting normalization theory that underpins this work, see Entity Mapping Strategies and Relationship Stitching in Cache.
Diagnostic Checklist
You are likely hitting a pagination normalization problem if you observe:
- Duplicate rows appearing in an infinite-scroll list after rapid scrolling or a page refresh
- A single entity mutation (e.g. updating a user’s name) causing the entire paginated list to re-render from scratch
- Paginated lists silently skipping records when a backend delete shifts item positions between page fetches (typical of offset pagination, and of cursors that encode a position rather than a sort key)
- `useInfiniteQuery` returning `undefined` for `data.pages[n]` mid-scroll when a concurrent fetch resolves out of order
- Memory footprint growing unbounded during infinite scroll because full entity payloads are stored inside each page array rather than a flat entity dictionary
- Inconsistent `totalCount` values across different list views that share the same underlying entities
Prerequisites
Before implementing the patterns below, you should be comfortable with:
- Flat entity stores: understand the `resultIds` + `entities` split and why nested array storage causes re-render cascades (covered in Entity Mapping Strategies)
- Query key hierarchies: know how TanStack Query v5 matches invalidation patterns against key arrays (covered in Designing Stable Query Keys for React Query)
- Foreign key resolution: understand how the normalized store resolves entity references at read-time rather than write-time (covered in Relationship Stitching in Cache)
- TanStack Query v5 `useInfiniteQuery`: familiarity with `initialPageParam`, `getNextPageParam`, `fetchNextPage`, and `isFetchingNextPage`
Implementation 1 — Unified Offset/Cursor Adapter
The first decision point is whether to write two separate fetch functions (one for offset APIs, one for cursor APIs) or to route both through a shared adapter that normalizes them to a common PageData contract. The unified approach pays off when a single UI must talk to multiple endpoints that use different pagination schemes — common in microservice architectures.
Steps:
- Define a `PageData<T>` interface that captures both cursor and offset metadata alongside the item array.
- Write two thin protocol adapters (`cursorAdapter` and `offsetAdapter`) that each implement the shared `PaginationAdapter` contract.
- Pass the active adapter to `useInfiniteQuery` so the hook remains agnostic to the underlying API shape.
- Use the `select` projection to strip traversal metadata before the data reaches the component, preventing metadata changes from triggering entity re-renders.
import { useInfiniteQuery, InfiniteData } from '@tanstack/react-query';
// ── Shared contract ────────────────────────────────────────────────────────
interface PageData<T> {
items: T[];
nextCursor: string | null;
cursor: string; // the cursor used to fetch this page
hasNextPage: boolean;
totalCount?: number;
}
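type PaginationAdapter<T> = {
// TanStack Query passes a context object; the adapters destructure pageParam from it
queryFn: (context: { pageParam: unknown }) => Promise<PageData<T>>;
// Cursor APIs return the next cursor string, offset APIs the next numeric offset
getNextPageParam: (
lastPage: PageData<T>,
allPages: PageData<T>[],
lastPageParam: unknown
) => string | number | undefined;
initialPageParam: unknown;
};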
// ── Cursor adapter ─────────────────────────────────────────────────────────
function cursorAdapter<T>(endpoint: string): PaginationAdapter<T> {
return {
initialPageParam: null,
getNextPageParam: (last) => last.nextCursor ?? undefined,
queryFn: ({ pageParam }) =>
fetch(`${endpoint}?cursor=${pageParam ?? ''}`).then((r) => {
if (!r.ok) throw new Error(`Fetch failed: ${r.status}`);
return r.json() as Promise<PageData<T>>;
}),
};
}
// ── Offset adapter ─────────────────────────────────────────────────────────
const PAGE_SIZE = 25;
function offsetAdapter<T>(endpoint: string): PaginationAdapter<T> {
return {
initialPageParam: 0,
getNextPageParam: (last, _, lastPageParam) =>
last.hasNextPage ? (lastPageParam as number) + PAGE_SIZE : undefined,
queryFn: ({ pageParam }) =>
fetch(`${endpoint}?offset=${pageParam}&limit=${PAGE_SIZE}`).then((r) => {
if (!r.ok) throw new Error(`Fetch failed: ${r.status}`);
return r.json() as Promise<PageData<T>>;
}),
};
}
// ── Hook (adapter-agnostic) ────────────────────────────────────────────────
interface User { id: string; name: string; email: string }
export function useNormalizedList(endpoint: string, mode: 'cursor' | 'offset') {
const adapter =
mode === 'cursor' ? cursorAdapter<User>(endpoint) : offsetAdapter<User>(endpoint);
return useInfiniteQuery<PageData<User>, Error, { ids: string[]; hasNext: boolean }>({
queryKey: ['entities', endpoint, mode],
queryFn: adapter.queryFn,
initialPageParam: adapter.initialPageParam,
getNextPageParam: adapter.getNextPageParam,
staleTime: 30_000,
gcTime: 5 * 60_000,
structuralSharing: true,
// select strips cursor metadata — component only sees IDs and a boolean
select: (data: InfiniteData<PageData<User>>) => ({
ids: data.pages.flatMap((p) => p.items.map((item) => item.id)),
hasNext: data.pages[data.pages.length - 1]?.hasNextPage ?? false,
}),
});
}
Cache Behavior Impact: TanStack Query re-runs `select` when the cached data changes or when the identity of the `select` function changes; because the function above is declared inline, it gets a new identity on every render and therefore re-runs on every render. By returning only `ids` and `hasNext`, the component’s dependency on cursor strings is severed: structural sharing is also applied to the `select` output, so when a refetch changes `nextCursor` but no new IDs arrive, `data` keeps its previous reference and a component that reads only `data` does not re-render. `structuralSharing: true` is the default; TanStack Query deep-compares incoming data against the cached data and reuses the old reference for every page that is unchanged, so only pages with new data get a new object reference.
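The reference-preserving behaviour described above can be illustrated with a deliberately simplified sketch. `shareStructure` is a hypothetical helper written for this example; it is not TanStack Query's implementation and ignores cases such as `Date` or class instances.

```typescript
// Simplified illustration only: `shareStructure` is a hypothetical helper,
// not TanStack Query's actual implementation.
function shareStructure<T>(prev: unknown, next: T): T {
  if (prev === next) return next;

  const isPlain = (v: unknown): v is Record<string, unknown> =>
    typeof v === 'object' && v !== null && !Array.isArray(v);
  const bothArrays = Array.isArray(prev) && Array.isArray(next);
  // Primitives and mismatched shapes cannot be shared: take the new value
  if (!bothArrays && !(isPlain(prev) && isPlain(next))) return next;

  const prevRec = prev as Record<string, unknown>;
  const nextRec = next as unknown as Record<string, unknown>;
  const nextKeys = Object.keys(nextRec);
  const out = (bothArrays ? [] : {}) as unknown as Record<string, unknown>;
  let reused = 0;

  for (const key of nextKeys) {
    // Recurse so unchanged children keep their previous references
    out[key] = shareStructure(prevRec[key], nextRec[key]);
    if (key in prevRec && out[key] === prevRec[key]) reused++;
  }

  // If every child was reused and no key was added or removed, reuse the parent too
  const sameShape = Object.keys(prevRec).length === nextKeys.length;
  return (sameShape && reused === nextKeys.length ? prev : out) as T;
}
```

With this helper, appending a page to a `{ pages: [...] }` object yields a new outer object and a new `pages` array, while each previously loaded page that is deeply equal keeps its old reference.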
Configuration Trade-offs:
- Setting `staleTime: 30_000` prevents immediate background refetches when the user navigates away and returns, but means list order can be up to 30 seconds stale after a backend mutation; tune it against your update frequency.
- `gcTime: 5 * 60_000` keeps all loaded pages in memory for five minutes after the last subscriber unmounts. For very long lists (200+ pages), consider reducing `gcTime` or implementing virtual scroll with page eviction.
- An inline `select` function gets a new identity on every render, so the projection re-executes on every render. If `ids` is large (500+), move `select` to a dedicated module-level selector function or wrap it in `useCallback` so it only re-runs when the cached data changes. Hooks such as `useMemo` cannot be called inside `select`.
- Mixing cursor and offset adapters under the same `queryKey` prefix (`['entities', endpoint]`) allows `queryClient.invalidateQueries({ queryKey: ['entities', endpoint] })` to flush both schemes simultaneously, which is useful after a mutation that might affect either list view.
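As a rough sketch of the dedicated-selector option, the projection can be hoisted to module scope so that its identity is stable across renders. The types below are local stand-ins (assumptions for this example), and `selectIdsAndHasNext` is a hypothetical name.

```typescript
// Local stand-ins; InfiniteDataLike mirrors the { pages, pageParams } shape
// of TanStack Query's InfiniteData.
interface UserRow { id: string; name: string }
interface UserPage { items: UserRow[]; hasNextPage: boolean }
interface InfiniteDataLike<P> { pages: P[]; pageParams: unknown[] }

// Declared once at module scope, so the function reference never changes
function selectIdsAndHasNext(
  data: InfiniteDataLike<UserPage>
): { ids: string[]; hasNext: boolean } {
  const ids: string[] = [];
  for (const page of data.pages) {
    for (const item of page.items) ids.push(item.id);
  }
  const last = data.pages[data.pages.length - 1];
  return { ids, hasNext: last ? last.hasNextPage : false };
}
```

Passing `select: selectIdsAndHasNext` to the hook means the projection is recomputed only when the cached pages change, not on every render.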
Implementation 2 — Normalized List State Architecture
Raw useInfiniteQuery stores full entity payloads inside each page array. When entity A appears on page 1 and page 3, two copies exist in cache. When a mutation updates entity A, you must either flush the entire list (expensive) or leave stale copies in other pages (incorrect). The resultIds / entities split eliminates both problems by ensuring each entity has exactly one canonical record.
Steps:
- After fetching, extract entity records into a flat `entities` dictionary keyed by ID.
- Store only the ordered ID array in the list slice, never the full entity objects.
- Write a memoized selector that joins `resultIds` against `entities` at read-time, analogous to the lazy stitching pattern from Relationship Stitching in Cache.
- On mutation, update only the entity’s record in the flat dictionary; `resultIds` is untouched, so every other row keeps its object reference and memoized row components can skip re-rendering.
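The example that follows assumes the server already returns the normalized shape. When the API returns a plain array instead, the first two steps might look roughly like this; `normalizeUsers` and `NormalizedSlice` are hypothetical names introduced for the sketch.

```typescript
interface User { id: string; name: string; avatar: string }
interface NormalizedSlice {
  resultIds: string[];
  entities: Record<string, User>;
}

// Hypothetical helper: split a raw array into an ordered ID list plus a flat dictionary
function normalizeUsers(items: User[]): NormalizedSlice {
  const seen = new Set<string>();
  const resultIds: string[] = [];
  const entities: Record<string, User> = {};

  for (const user of items) {
    // Keep the first position of each ID, but let the latest record win
    if (!seen.has(user.id)) {
      seen.add(user.id);
      resultIds.push(user.id);
    }
    entities[user.id] = user;
  }
  return { resultIds, entities };
}
```

A `queryFn` could call this on the parsed response before returning, so that only the normalized shape ever enters the cache.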
import { useQueryClient, useQuery, useMutation } from '@tanstack/react-query';
interface User { id: string; name: string; avatar: string }
interface NormalizedListState {
resultIds: string[];
entities: Record<string, User>;
nextCursor: string | null;
hasNextPage: boolean;
}
// ── Selector: join IDs → entities at read-time ─────────────────────────────
function selectOrderedUsers(state: NormalizedListState): User[] {
return state.resultIds.map((id) => state.entities[id]).filter(Boolean) as User[];
}
// ── Read hook ──────────────────────────────────────────────────────────────
export function useNormalizedUsers(listKey: string) {
return useQuery<NormalizedListState, Error, User[]>({
queryKey: ['normalizedList', listKey],
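queryFn: () =>
fetch(`/api/users?key=${listKey}`).then((r) => {
// fetch resolves on HTTP errors, so surface them explicitly
if (!r.ok) throw new Error(`Fetch failed: ${r.status}`);
return r.json() as Promise<NormalizedListState>;
}),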
select: selectOrderedUsers,
staleTime: 60_000,
});
}
// ── Mutation: update entity without touching resultIds ─────────────────────
export function useUpdateUser(listKey: string) {
const queryClient = useQueryClient();
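// The fourth generic types the context returned by onMutate so onError can read it
return useMutation<User, Error, { id: string; name: string }, { previous?: NormalizedListState }>({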
mutationFn: ({ id, name }) =>
fetch(`/api/users/${id}`, {
method: 'PATCH',
body: JSON.stringify({ name }),
headers: { 'Content-Type': 'application/json' },
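}).then((r) => {
// Throw on HTTP errors so onError runs and the optimistic patch is rolled back
if (!r.ok) throw new Error(`Update failed: ${r.status}`);
return r.json() as Promise<User>;
}),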
// Optimistic: update only the entity slot
onMutate: async ({ id, name }) => {
await queryClient.cancelQueries({ queryKey: ['normalizedList', listKey] });
const previous = queryClient.getQueryData<NormalizedListState>(['normalizedList', listKey]);
queryClient.setQueryData<NormalizedListState>(['normalizedList', listKey], (old) => {
if (!old) return old;
return {
...old,
entities: {
...old.entities,
[id]: { ...old.entities[id], name },
},
// resultIds is untouched — list order does not change
};
});
return { previous };
},
onError: (_err, _vars, context) => {
if (context?.previous) {
queryClient.setQueryData(['normalizedList', listKey], context.previous);
}
},
onSettled: () => {
// Refetch this list so the optimistic value is reconciled with the server
queryClient.invalidateQueries({ queryKey: ['normalizedList', listKey] });
},
});
}
Cache Behavior Impact: The updater passed to `queryClient.setQueryData` spreads `...old` and replaces only `entities[id]`, so `resultIds` and every other entity record keep their previous references. `selectOrderedUsers` therefore produces the same ID sequence, and every unchanged `User` object in its output is the same reference as before. The returned array itself is new, so the component that calls `useNormalizedUsers` re-renders, but row components wrapped in `React.memo` re-render only for the updated entity. The `onSettled` invalidation refetches the list from the server to confirm the optimistic value, but the visual update is instant.
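To make the reference behaviour concrete, the optimistic patch can be written as a pure function and checked in isolation. This is a sketch, not the only way to structure it; `patchEntity` is a hypothetical helper, and unlike the `onMutate` updater above it returns the state unchanged when the ID is unknown.

```typescript
interface User { id: string; name: string; avatar: string }
interface NormalizedListState {
  resultIds: string[];
  entities: Record<string, User>;
  nextCursor: string | null;
  hasNextPage: boolean;
}

// Hypothetical helper: immutable patch of a single entity slot
function patchEntity(
  state: NormalizedListState,
  id: string,
  patch: Partial<User>
): NormalizedListState {
  const current = state.entities[id];
  if (!current) return state; // unknown ID: leave the state untouched

  return {
    ...state, // resultIds, nextCursor and hasNextPage are carried over by reference
    entities: { ...state.entities, [id]: { ...current, ...patch } },
  };
}
```

Only the patched entity and the `entities` container receive new references; `resultIds` and all other entity objects are reused as-is.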
Configuration Trade-offs:
- Storing entities as a flat `Record<string, User>` enables O(1) mutation updates but requires the selector to perform `resultIds.length` lookups each time it runs. For lists under ~1000 items this is negligible; for larger lists, use `useMemo` to cache the join result between renders.
- Filtering out missing entities (via `.filter(Boolean)`) silently drops items deleted by concurrent users. In production, make the missing-entity case explicit: return a placeholder and trigger a targeted refetch for that ID.
- If entities already live in a normalized store, such as a Redux store built with `normalizr` or Redux Toolkit’s `createEntityAdapter`, or Apollo’s `InMemoryCache`, you can skip manual entity extraction and read entity updates directly from that layer, but you still need an explicit `resultIds` slice to maintain list order. Note that RTK Query itself caches whole responses per endpoint and does not normalize them.
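One possible way to make the missing-entity case explicit is to return a tagged row for each ID and collect the unresolved IDs for the caller. `joinWithPlaceholders` and the `Row` union are hypothetical names for this sketch; how the missing IDs are refetched is left to the caller.

```typescript
interface User { id: string; name: string; avatar: string }
type Row =
  | { kind: 'entity'; user: User }
  | { kind: 'missing'; id: string };

// Hypothetical selector: join IDs against entities without silently dropping gaps
function joinWithPlaceholders(
  resultIds: string[],
  entities: Record<string, User>
): { rows: Row[]; missingIds: string[] } {
  const rows: Row[] = [];
  const missingIds: string[] = [];

  for (const id of resultIds) {
    const user = Object.prototype.hasOwnProperty.call(entities, id)
      ? entities[id]
      : undefined;
    if (user) {
      rows.push({ kind: 'entity', user });
    } else {
      rows.push({ kind: 'missing', id }); // render a placeholder for this row
      missingIds.push(id);                // caller may refetch or invalidate these
    }
  }
  return { rows, missingIds };
}
```

The list can then render a skeleton row for each `missing` entry while a targeted refetch or list invalidation resolves the gap.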
Implementation 3 — Duplicate-Free Infinite Scroll Merge
Merging Paginated Lists Without Duplicates is the most operationally risky step: a backend that uses offset pagination may return the same item on consecutive pages when an insert shifts item positions between fetches, and keyset pagination can do the same when its sort key is non-unique or changes between fetches. Your client-side merge must be the last line of defence.
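To see how such an overlap can arise, the sketch below simulates an offset-paginated, newest-first backend with an in-memory array. That backend model is an assumption made for illustration, and `fetchByOffset` and `simulateOverlap` are hypothetical names.

```typescript
// Simulated backend: rows are newest-first and served by offset/limit
function fetchByOffset(rows: string[], offset: number, limit: number): string[] {
  return rows.slice(offset, offset + limit);
}

function simulateOverlap(): { firstPage: string[]; secondPage: string[]; merged: string[] } {
  let rows = ['d', 'c', 'b', 'a'];
  const firstPage = fetchByOffset(rows, 0, 2); // ['d', 'c']

  rows = ['e'].concat(rows); // a new row is inserted at the head between fetches
  const secondPage = fetchByOffset(rows, 2, 2); // ['c', 'b'], so 'c' repeats

  // Client-side defence: append only IDs that have not been seen yet
  const seen = new Set<string>();
  const merged: string[] = [];
  for (const id of firstPage.concat(secondPage)) {
    if (!seen.has(id)) {
      seen.add(id);
      merged.push(id);
    }
  }
  return { firstPage, secondPage, merged }; // merged: ['d', 'c', 'b']
}
```

The new row `'e'` is never seen by this client until the list is refetched from the start, which is why deduplication handles duplicates but not missed inserts.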
Steps:
- Track the IDs already present in `resultIds` in a `Set<string>` to make membership checks O(1) regardless of list length.
- Before appending any incoming IDs, filter them through the existing set.
- Merge entity records so that later pages overwrite earlier ones (for example by spreading `incoming.entities` onto `existing.entities`); later pages’ data wins, which is usually correct because it was fetched more recently.
- Update `nextCursor` from the incoming page, never from the existing state.
- Guard `fetchNextPage` with `isFetchingNextPage` and `hasNextPage` checks to prevent duplicate in-flight requests.
import { useInfiniteQuery } from '@tanstack/react-query';
import { useMemo, useCallback } from 'react';
interface Item { id: string; title: string; updatedAt: string }
interface PageData {
items: Item[];
nextCursor: string | null;
hasNextPage: boolean;
}
// ── Merge function ─────────────────────────────────────────────────────────
interface MergedState {
resultIds: string[];
entities: Record<string, Item>;
nextCursor: string | null;
}
function mergePages(pages: PageData[]): MergedState {
const entities: Record<string, Item> = {};
const seenIds = new Set<string>();
const resultIds: string[] = [];
for (const page of pages) {
for (const item of page.items) {
// Always update entities (later page = fresher data)
entities[item.id] = item;
// Only add to ordered list once
if (!seenIds.has(item.id)) {
seenIds.add(item.id);
resultIds.push(item.id);
}
}
}
const lastPage = pages[pages.length - 1];
return {
resultIds,
entities,
nextCursor: lastPage?.nextCursor ?? null,
};
}
// ── Hook ───────────────────────────────────────────────────────────────────
export function useInfiniteItems(endpoint: string) {
const query = useInfiniteQuery<PageData, Error>({
queryKey: ['infiniteItems', endpoint],
queryFn: ({ pageParam }) =>
fetch(`${endpoint}?cursor=${pageParam ?? ''}`).then((r) => {
if (!r.ok) throw new Error(`${r.status}`);
return r.json() as Promise<PageData>;
}),
initialPageParam: null,
getNextPageParam: (last) => last.nextCursor ?? undefined,
staleTime: 20_000,
gcTime: 10 * 60_000,
structuralSharing: true,
refetchOnWindowFocus: false, // avoid mid-scroll refetches resetting cursor chain
});
// Derive merged state outside select to keep the raw pages available
const merged = useMemo(
() => (query.data ? mergePages(query.data.pages) : null),
[query.data]
);
const loadMore = useCallback(() => {
if (!query.isFetchingNextPage && query.hasNextPage) {
query.fetchNextPage();
}
}, [query.isFetchingNextPage, query.hasNextPage, query.fetchNextPage]);
return { merged, loadMore, isLoading: query.isLoading, isError: query.isError };
}
Cache Behavior Impact: `useInfiniteQuery` stores each page as a discrete entry inside `data.pages`. When page N arrives, TanStack Query appends it to the array and applies structural sharing to the full `InfiniteData` object: pages 0 through N-1 retain their previous object references, and only the new page slot gets a new reference. The `mergePages` call in `useMemo` then re-runs because `query.data` is a new reference, and it builds new `resultIds` and `entities` containers, but the individual item objects stored in `merged.entities` are the same references as before for every unchanged entity, so memoized row components can skip re-rendering. Setting `refetchOnWindowFocus: false` prevents TanStack Query from refetching every loaded page when the user returns to the tab with stale data, which would otherwise re-run `mergePages` over potentially hundreds of pages.
Configuration Trade-offs:
- `refetchOnWindowFocus: false` improves UX for long scroll sessions but means entity data can drift from the server during an extended session. Consider a targeted invalidation after 5 minutes of inactivity instead of disabling window-focus refetches entirely.
- `gcTime: 10 * 60_000` (10 minutes) retains all loaded pages after unmount, enabling instant hydration when the user navigates back, but many pages of large entities held for 10 minutes can exhaust memory on low-end devices. Profile with the Performance tab before deploying to mobile-first audiences.
- The `mergePages` loop runs in O(n) where n is the total item count across all loaded pages. For feeds exceeding 500 items, if profiling shows the merge blocking the main thread during scroll events, move this computation to a Web Worker or use a memoized reducer that only processes newly arrived pages.
- Setting `staleTime: 20_000` allows a 20-second window where navigating back to the list shows cached data without a refetch. Adjust based on how frequently your backend adds new items; real-time feeds should use a shorter `staleTime` or a push-based invalidation strategy (see Background Refetch Strategies).
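A memoized reducer for the merge could look roughly like the following. It relies on the assumption that previously loaded pages keep their object references between calls (as structural sharing provides), and `createIncrementalMerger` is a hypothetical name. The shallow copies are still linear in list size, but per-item deduplication work is limited to the new pages.

```typescript
interface Item { id: string; title: string; updatedAt: string }
interface PageData { items: Item[]; nextCursor: string | null; hasNextPage: boolean }
interface MergedState {
  resultIds: string[];
  entities: Record<string, Item>;
  nextCursor: string | null;
}

// Hypothetical factory: remembers the pages it has already merged
function createIncrementalMerger(): (pages: PageData[]) => MergedState {
  let cachedPages: PageData[] = [];
  let seenIds = new Set<string>();
  let state: MergedState = { resultIds: [], entities: {}, nextCursor: null };

  return function merge(pages: PageData[]): MergedState {
    // Previous work is reusable only if every cached page is unchanged and in place
    let canReuse = pages.length >= cachedPages.length;
    for (let i = 0; canReuse && i < cachedPages.length; i++) {
      if (pages[i] !== cachedPages[i]) canReuse = false;
    }
    if (canReuse && pages.length === cachedPages.length) return state; // nothing new

    const start = canReuse ? cachedPages.length : 0;
    const resultIds = canReuse ? state.resultIds.slice() : [];
    const entities: Record<string, Item> = canReuse ? { ...state.entities } : {};
    if (!canReuse) seenIds = new Set<string>(); // full rebuild after a refetch

    for (let i = start; i < pages.length; i++) {
      for (const item of pages[i].items) {
        entities[item.id] = item; // later page wins
        if (!seenIds.has(item.id)) {
          seenIds.add(item.id);
          resultIds.push(item.id);
        }
      }
    }

    cachedPages = pages.slice();
    const last = pages[pages.length - 1];
    state = { resultIds, entities, nextCursor: last ? last.nextCursor : null };
    return state;
  };
}
```

In a component, the merger instance would need to live for the lifetime of the list, for example in a `useRef`, and be called with `query.data.pages` whenever the data changes.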
Common Pitfalls & Resolutions
| Observable Issue | Root Cause | Diagnostic Resolution |
|---|---|---|
| Duplicate rows in infinite scroll after rapid downward scroll | `fetchNextPage` called before previous page committed; two pages share items due to backend keyset overlap | Add `if (isFetchingNextPage \|\| !hasNextPage) return;` guard before `fetchNextPage`; add Set-based deduplication in `mergePages` |
| Entire list re-renders when a single entity name is updated via mutation | Entity payload stored directly inside the `pages` array rather than in a flat `entities` dictionary | Migrate to the `resultIds` + `entities` split from Implementation 2; update only the entity slot in `setQueryData` |
| `nextCursor` from page N is used to fetch page N instead of page N+1 | `getNextPageParam` reads from `firstPage` rather than `lastPage`, or `pageParam` is not forwarded correctly to the `queryFn` | Verify `getNextPageParam` receives `(lastPage, allPages, lastPageParam)` and returns `lastPage.nextCursor ?? undefined`, not `firstPage.nextCursor` |
| Cursor-based list skips records after a backend delete | Keyset cursor points to a deleted record; backend advances the cursor beyond the gap | After mutations that delete items, call `queryClient.invalidateQueries({ queryKey: ['infiniteItems', endpoint] })` to restart pagination from the beginning rather than patching cursors client-side |
| `totalCount` on the first page is stale after new items are added | `totalCount` is cached with the page data but not invalidated when the list grows | Exclude `totalCount` from `staleTime` caching by refetching the count separately on a shorter `staleTime`, or derive it from the flat `entities` dictionary length after full load |
Frequently Asked Questions
Should pagination metadata be normalized into the entity store alongside item data?
No. Metadata like nextCursor, hasNextPage, and totalCount belongs to the list traversal context, not the entity graph. Store it in a dedicated list-scoped slice keyed by the pagination query key. This prevents metadata mutations from triggering entity cache evictions and allows independent invalidation of the cursor chain when a single entity changes.
How do I prevent stale cursor references after a mutation that reorders or removes items?
After any mutation that affects list ordering (deletes, re-ranks, status changes), call queryClient.invalidateQueries({ queryKey: ['entities', endpoint, 'cursor'] }) to flush all pages. Do not attempt surgical cursor patching — the backend cannot guarantee cursor stability across mutations, so a full list invalidation is the only safe recovery path. If invalidating all pages is too expensive, surface a “Refresh list” prompt to the user instead of auto-refetching.
What happens when two concurrent infinite-scroll fetches return overlapping pages in TanStack Query v5?
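TanStack Query v5 keeps a single cache entry for all pages of an infinite query and allows only one ongoing fetch for it. By default (`cancelRefetch: true`), calling `fetchNextPage` again while a page is in flight invokes the `queryFn` again and ignores the result of the earlier invocation, so rapid scroll events can still trigger redundant requests. Guard against this by checking `isFetchingNextPage` before calling `fetchNextPage`, and implement a Set-based deduplication merge (in a `select` projection or in a memoized merge such as `mergePages` from Implementation 3) so that duplicate IDs from overlapping pages are filtered before the result array reaches your component.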
How does structuralSharing interact with merged infinite-scroll pages?
TanStack Query’s structuralSharing (enabled by default) performs a deep equality check on each page object before replacing its reference. When a new page arrives, only the changed page slice gets a new reference — existing pages retain their identity. This means React can skip reconciliation for all stable page entries, making structuralSharing critical for large lists where a single append should not re-render all prior rows.
Related
- Data Normalization & Query Key Design — the parent section covering entity mapping, query key hierarchies, and cache topology decisions that this work builds on.
- Merging Paginated Lists Without Duplicates — deep-dive into the Set-based merge algorithm, including edge cases for real-time feeds where items can shift between pages between fetches.
- Normalizing Cursor-Based Pagination — specific recipes for opaque cursor formats, Base64-encoded keysets, and cursor validation at the adapter boundary.
- Entity Mapping Strategies — the upstream normalization pass that extracts entity records from raw API responses before they reach the pagination layer.
- Relationship Stitching in Cache — how to resolve foreign keys between entities at read-time so that paginated lists composed of relational records stay consistent without re-fetching related resources.
- Background Refetch Strategies: when and how to configure `refetchInterval`, `refetchOnWindowFocus`, and SWR-style revalidation for paginated endpoints that must stay fresh without blocking scroll.