Cache Invalidation & Server Synchronization

Keeping a client-side cache aligned with authoritative server state is one of the hardest distributed-systems problems frontend engineers face daily. A stale cache shows users outdated data; an over-eager invalidation strategy hammers the server with redundant requests and introduces waterfall refetch latency. This reference covers the full spectrum — from the cache layer architecture decisions that establish ownership boundaries, through granular invalidation workflows, optimistic mutation patterns, and real-time reconciliation — for teams shipping production applications on React Query (TanStack Query), Apollo Client, SWR, and RTK Query.

Architectural Overview: Cache Invalidation Data Flow

The diagram below models the lifecycle a cache entry passes through from initial fetch through mutation and background revalidation. Understanding this state machine is a prerequisite before choosing any invalidation strategy.

The five states — Fresh, Stale, Invalidated, Fetching, and Error — and the transitions between them (controlled by staleTime, gcTime, invalidateQueries, and retry logic) are the conceptual foundation for every strategy in this guide. Before choosing between client and server state ownership, engineers need to be able to place any given resource in this state machine.
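
As a rough illustration, the five states can be modelled as a small transition table. This is a conceptual sketch only: the event names and the exact transitions are assumptions made for illustration, not library internals, and gcTime-driven removal is deliberately left out.

```typescript
// Conceptual model only: the state names mirror the five states named above;
// the event names and transitions are illustrative assumptions.
type EntryState = 'fresh' | 'stale' | 'invalidated' | 'fetching' | 'error';

type EntryEvent =
  | 'staleTimeElapsed' // staleTime has passed since the last successful fetch
  | 'invalidate'       // e.g. an invalidateQueries call matched this entry
  | 'refetchTriggered' // window focus, reconnect, remount, or polling interval
  | 'fetchSucceeded'
  | 'fetchFailed';     // retries exhausted

const entryTransitions: Record<EntryState, Partial<Record<EntryEvent, EntryState>>> = {
  fresh: { staleTimeElapsed: 'stale', invalidate: 'invalidated' },
  stale: { refetchTriggered: 'fetching', invalidate: 'invalidated' },
  invalidated: { refetchTriggered: 'fetching' },
  fetching: { fetchSucceeded: 'fresh', fetchFailed: 'error' },
  error: { refetchTriggered: 'fetching', invalidate: 'invalidated' },
};

// Events that have no transition from the current state leave it unchanged.
function nextEntryState(state: EntryState, event: EntryEvent): EntryState {
  return entryTransitions[state][event] ?? state;
}
```

Placing a resource in this table before choosing a strategy makes it easier to see which trigger moves it out of its current state. For example, a fresh entry ignores a focus-triggered refetch until staleTime elapses or it is invalidated.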

Core Concepts Reference

Term	Definition	React Query API	Apollo Client API	SWR API	RTK Query API
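staleTime	How long a cached entry counts as fresh; after it elapses the entry is stale and eligible for background refetch	`staleTime` on `useQuery` / `QueryClient` defaults	No TTL equivalent; `fetchPolicy` (e.g. `'cache-first'`) decides when the cache is trusted	`dedupingInterval` (closest analogue)	`refetchOnMountOrArgChange` set to a number of seconds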
gcTime (cacheTime)	How long an unused cache entry survives in memory before garbage collection	`gcTime` (v5) / `cacheTime` (v4) on `QueryClient`	Apollo `InMemoryCache` eviction policy	Not configurable; SWR relies on key presence	`keepUnusedDataFor`
invalidateQueries	Marks entries stale and triggers background refetch for active observers	`queryClient.invalidateQueries({ queryKey })`	`refetchQueries`, or `cache.evict` + `cache.gc`	`mutate(key)`	`dispatch(api.util.invalidateTags([...]))`
Tag-based invalidation	Groups queries under named resource tags so a single invalidation call targets all related entries	Query key arrays used as implicit tags	Not native; evict or modify normalized entities by `__typename` + `id`, or list queries in `refetchQueries`	Not native; pass a key filter function to the global `mutate` (SWR 2.0+)	`providesTags` / `invalidatesTags` on endpoints
Optimistic update	Applies a predicted result to the cache before the server confirms the write	`onMutate` + `setQueryData`	`optimisticResponse` on `useMutation`	`mutate(key, newData, { optimisticData })`	`onQueryStarted` + `updateQueryData`
stale-while-revalidate	Serve the stale cached value immediately while fetching a fresh copy in the background	`staleTime: 0` + active observer triggers refetch	`fetchPolicy: 'cache-and-network'`	Default SWR behaviour	Cached data is served first; set `refetchOnMountOrArgChange: true` to revalidate in the background
Background refetch	Automatic re-fetch triggered by window focus, network reconnection, or polling interval	`refetchOnWindowFocus`, `refetchOnReconnect`, `refetchInterval`	`watchQuery` polling via `pollInterval`	`refreshInterval`, `revalidateOnFocus`	`pollingInterval` on `useQuery`

Strategy 1: Granular Tag-Based Invalidation

Blanket cache clearing (calling queryClient.invalidateQueries() with no filter, or with only a top-level key) is the most common cause of waterfall refetch storms. When a single mutation triggers a refetch of every active query, the network layer sees an artificial burst and the UI stalls behind simultaneous in-flight requests.

Tag-based invalidation solves this by assigning each query a resource-scoped tag. A mutation then invalidates only the tags it actually touches. This aligns the client-side invalidation model with the same data normalization principles that prevent entity duplication in the store.

The approach in React Query is to encode the resource type and identifier into the query key array, then match selectively:

// Invalidate only the 'user' entity with id 42 — not the full user list
await queryClient.invalidateQueries({ queryKey: ['user', 42] });

// Invalidate all queries under the 'user' resource type
await queryClient.invalidateQueries({ queryKey: ['user'] });
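
To make the matching rule concrete, here is a simplified, dependency-free model of prefix matching. It is a sketch, not the library's implementation: matchesKeyPrefix and the sample keys are hypothetical, and elements are compared with JSON.stringify for brevity, which the real matcher does not do.

```typescript
type KeyParts = readonly unknown[];

// Simplified model: every element of `filter` must equal the element at the
// same position in `key`. JSON.stringify comparison is an assumption made for
// brevity (object property order matters here).
function matchesKeyPrefix(key: KeyParts, filter: KeyParts): boolean {
  if (filter.length > key.length) return false;
  return filter.every((part, i) => JSON.stringify(part) === JSON.stringify(key[i]));
}

// Hypothetical set of keys currently in the cache.
const cachedKeys: KeyParts[] = [
  ['user', 42],
  ['user', 7],
  ['user', 42, 'posts'],
  ['users', 'list'],
];

// ['user', 42] matches the detail entry and anything nested beneath it.
const hitByDetail = cachedKeys.filter((k) => matchesKeyPrefix(k, ['user', 42]));

// ['user'] matches every key rooted at 'user', but not ['users', 'list'].
const hitByType = cachedKeys.filter((k) => matchesKeyPrefix(k, ['user']));
```

The practical consequence is that a key nested under ['user', 42] is invalidated together with the detail entry, while a list stored under a different root such as ['users', 'list'] is untouched by either call.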

In RTK Query, tags are declared explicitly via providesTags on the endpoint and invalidatesTags on the mutation, giving compile-time visibility of the invalidation graph.

Apollo Client implements tag-based invalidation differently: the InMemoryCache normalises entities by __typename + id, so calling cache.evict({ id: cache.identify(entity) }) removes a specific entity and notifies every active query that read it. A follow-up cache.gc() then removes any normalized objects that are no longer reachable.

Configuration trade-offs:

Narrower query key arrays mean more targeted invalidation but require disciplined key design upfront — see designing stable query keys for React Query before committing to a schema.
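refetchType: 'active' (React Query default) only re-fetches queries that currently have active observers; inactive matches are still marked stale and refetch on their next mount. Setting refetchType: 'all' refetches inactive matches immediately, which is safer after bulk mutations but increases bandwidth.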
Apollo’s cache.gc() is synchronous and blocks the JS thread briefly on very large caches; prefer selective cache.evict over broad cache.modify + gc in performance-critical paths.
RTK Query’s invalidatesTags fires immediately after the mutation settles; wrap mutations in async onQueryStarted if you need to chain side-effects before the invalidation.

Strategy 2: Optimistic Mutations with Deterministic Rollback

Perceived responsiveness demands that the UI reflect a write operation before the server responds. Optimistic updates achieve this by writing a predicted result into the cache immediately on onMutate. The risk is state corruption when the server rejects the write — which makes snapshot capture and rollback non-negotiable.

Mutation sync and rollback patterns hinge on three invariants:

Cancel any in-flight queries that might overwrite the optimistic patch before the server responds.
Capture the pre-mutation state as an immutable snapshot.
Restore the snapshot atomically in onError before triggering any UI notification.

The production pattern in React Query v5:

import { useMutation, useQueryClient } from '@tanstack/react-query';

interface UpdateCommentPayload {
  id: string;
  body: string;
  postId: string;
}

interface Comment {
  id: string;
  body: string;
  author: string;
}

export function useUpdateComment() {
  const queryClient = useQueryClient();

  return useMutation<Comment, Error, UpdateCommentPayload, { previous: Comment | undefined; previousList: Comment[] | undefined }>({
    mutationFn: (payload) =>
      fetch(`/api/comments/${payload.id}`, {
        method: 'PATCH',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ body: payload.body }),
      }).then((res) => {
        if (!res.ok) throw new Error(`HTTP ${res.status}`);
        return res.json();
      }),

    onMutate: async (payload) => {
      // 1. Cancel any outgoing refetches that would stomp over our optimistic update.
      //    cancelQueries is async — await it or the race condition remains.
      await queryClient.cancelQueries({ queryKey: ['comment', payload.id] });
      await queryClient.cancelQueries({ queryKey: ['comments', 'list', payload.postId] });

      // 2. Snapshot both the detail and the list so rollback is atomic.
      const previous = queryClient.getQueryData<Comment>(['comment', payload.id]);
      const previousList = queryClient.getQueryData<Comment[]>(['comments', 'list', payload.postId]);

      // 3. Apply the optimistic patch. Structural sharing is on by default, so unchanged
      //    parts of the cached data keep their references and can skip re-rendering.
      queryClient.setQueryData<Comment>(['comment', payload.id], (old) =>
        old ? { ...old, body: payload.body } : old
      );
      queryClient.setQueryData<Comment[]>(['comments', 'list', payload.postId], (old) =>
        old?.map((c) => (c.id === payload.id ? { ...c, body: payload.body } : c))
      );

      // 4. Return context so onError can restore both snapshots.
      return { previous, previousList };
    },

    onError: (_err, payload, context) => {
      // Restore both snapshots. If either was undefined the entity did not exist yet — no-op.
      if (context?.previous !== undefined) {
        queryClient.setQueryData(['comment', payload.id], context.previous);
      }
      if (context?.previousList !== undefined) {
        queryClient.setQueryData(['comments', 'list', payload.postId], context.previousList);
      }
    },

    onSettled: (_data, _err, payload) => {
      // Always refetch after settle to reconcile against server truth,
      // regardless of whether the mutation succeeded or rolled back.
      queryClient.invalidateQueries({ queryKey: ['comment', payload.id] });
      queryClient.invalidateQueries({ queryKey: ['comments', 'list', payload.postId] });
    },
  });
}

Cache Behavior Explanation:

cancelQueries cancels any in-flight fetch for the matched keys and reverts each query to its previous state, so a late response is not written to the cache. The underlying network request is only aborted if the queryFn forwards the AbortSignal it receives (for example to fetch); otherwise the request completes and its result is ignored.
setQueryData bypasses the network entirely and writes directly to the in-memory store; structural sharing keeps the references of unchanged sub-trees stable, so memoized children and select-based subscribers whose slice did not change can skip re-rendering.
When onError fires, the setQueryData call that restores the snapshot is synchronous, so the cache already holds the pre-mutation value before any error notification is triggered, which avoids a double-flash.
onSettled runs after both success and error; placing invalidateQueries here (rather than onSuccess only) ensures the cache always reconciles with the server even if a race condition produced an inconsistency during the optimistic phase.

Configuration trade-offs:

structuralSharing: true (default) reduces re-renders but adds a deep structural comparison of old and new data on every cache write. Disable it (structuralSharing: false) for very large arrays where the comparison overhead exceeds the render savings.
SWR’s optimisticData option accepts either a value or a function (currentData) => newData; the function form is safer for list mutations where the list may have changed since the mutation was issued.
Apollo’s optimisticResponse writes via the same InMemoryCache normalizer as real responses — meaning it automatically updates every query that references the same __typename:id entity, removing the need to enumerate keys manually.
RTK Query’s updateQueryData, dispatched inside onQueryStarted, applies an Immer-based patch and returns a handle whose undo() method can be called in the catch block around queryFulfilled, giving a clean rollback without manual snapshot management.

Strategy 3: Stale-While-Revalidate and Background Refetch Orchestration

The stale-while-revalidate (SWR) pattern is the contractual guarantee that users see data instantly — always from cache — while the freshest version arrives silently in the background. Without it, navigating between routes forces users to wait through a loading spinner even for data they fetched 10 seconds ago.

Configuring stale-while-revalidate involves tuning two interdependent parameters: staleTime (how long before a background refetch is eligible) and the refetch triggers (window focus, network reconnect, component remount, or polling interval). Getting them wrong in either direction degrades the experience: too-low staleTime turns every navigation into a network call; too-high staleTime leaves users on outdated data after a server-side update.

Background refetch strategies govern how these triggers interact with concurrency. React Query deduplicates concurrent requests to the same query key automatically — multiple components mounting simultaneously share a single in-flight request — but optimizing SWR revalidation intervals requires careful interval tuning to avoid creating a polling-induced server load spike when many browser tabs are open.

A production configuration for a frequently-updated feed resource:

import { QueryClient } from '@tanstack/react-query';

const queryClient = new QueryClient({
  defaultOptions: {
    queries: {
      // Data is stale as soon as the fetch settles — background refetch fires on next trigger.
      staleTime: 0,
      // Keep unused cache entries for 2 minutes before GC. Navigating back to a route
      // shows stale data instantly (SWR) even if staleTime: 0 means a refetch follows.
      gcTime: 2 * 60 * 1000,
      // Retry once with a 1-second delay before surfacing an error to the UI.
      retry: 1,
      retryDelay: 1000,
      // Refetch when the tab regains focus — the primary trigger for SWR behaviour.
      refetchOnWindowFocus: true,
      // Refetch after the device reconnects to the network.
      refetchOnReconnect: true,
    },
  },
});

For resources where the server pushes updates via WebSocket, disable refetchOnWindowFocus and refetchInterval for that specific query and instead call queryClient.setQueryData directly from the WebSocket message handler to merge the patch without a round-trip:

useEffect(() => {
  const socket = new WebSocket(WS_URL);
  socket.addEventListener('message', (event) => {
    const patch = JSON.parse(event.data) as Partial<FeedItem> & { id: string };
    queryClient.setQueryData<FeedItem[]>(['feed'], (old) =>
      old?.map((item) => (item.id === patch.id ? { ...item, ...patch } : item))
    );
  });
  return () => socket.close();
}, [queryClient]);

Configuration trade-offs:

refetchInterval set to a short value (< 5 s) in combination with multiple open tabs creates a thundering-herd problem at the server. Prefer WebSocket push or server-sent events for sub-5-second freshness requirements.
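Setting gcTime below staleTime is usually a misconfiguration: the gcTime countdown starts once the last observer unmounts, so an unused entry can be garbage-collected while it is still fresh, and the next mount pays for a cold fetch instead of an instant cache hit. As a rule of thumb, keep gcTime >= staleTime.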
SWR’s dedupingInterval (default 2 s) prevents duplicate requests from simultaneous mounts — increasing it reduces server load but delays freshness signals for rapid key changes.
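RTK Query’s skipPollingIfUnfocused: true option (Redux Toolkit 2.x; confirm against your installed version) pauses pollingInterval polling while the window is unfocused, without requiring custom visibility listeners. It relies on setupListeners having been called for the store.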
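
Where polling cannot be avoided, one common mitigation for synchronized bursts from many tabs is to add random jitter to the interval. Jitter is not part of the strategies above; the sketch below is an assumption-laden illustration, and jitteredIntervalMs, its 20% default spread, and the injectable random source are all hypothetical.

```typescript
// Spreads polls from many tabs across a window instead of firing them in
// lockstep. `spread` is the maximum fractional deviation from `baseMs`.
function jitteredIntervalMs(
  baseMs: number,
  spread = 0.2,
  random: () => number = Math.random, // injectable for deterministic tests
): number {
  const offset = (random() * 2 - 1) * spread; // in [-spread, +spread)
  return Math.round(baseMs * (1 + offset));
}

// With the default spread, a 10 s base interval lands between 8 s and 12 s.
const exampleIntervalMs = jitteredIntervalMs(10_000);
```

Each tab would compute its own value once and pass it as the polling interval, so their requests drift apart rather than align.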

Strategy 4: Real-Time and Cross-Environment Synchronization

HTTP request–response cycles have an inherent latency floor that makes them unsuitable for collaborative features (shared documents, live dashboards, presence indicators). WebSocket and Server-Sent Event (SSE) integrations extend the cache synchronization model to push-driven updates.

The critical constraint when merging WebSocket payloads into a normalized cache is schema conformance: the patch payload must match the shape the query normalizer expects, or partial merges will leave the cache in an inconsistent state. Understanding reference vs. value storage models is essential here — Apollo’s InMemoryCache normalizes by reference, so a WebSocket patch that supplies a valid __typename + id will automatically update every query holding a reference to that entity. React Query and SWR store by value, so all affected query keys must be patched individually.

Offline-first architectures add a further complication: when the client reconnects after an extended offline period, refetchOnReconnect will issue a fresh fetch, but any mutations queued offline must be replayed in causal order before the reconciliation fetch resolves. Implementing a mutation queue with optimistic rollback is covered in the mutation sync and rollback reference.

For cross-tab consistency in the same browser session, React Query’s broadcastQueryClient experimental plugin and SWR’s use-broadcast-channel pattern propagate cache updates to all same-origin tabs without an additional server round-trip.

Configuration trade-offs:

Persistent WebSocket connections consume a file descriptor per client at the server; plan connection multiplexing (one socket per domain, not per component) before scaling to thousands of concurrent users.
SSE is unidirectional and works over standard HTTP/2, making it simpler to deploy behind CDNs and load balancers than WebSocket — prefer SSE for read-heavy live feeds where the client never pushes patches back.
Out-of-order patch delivery requires either logical timestamps or vector clocks on each message; without ordering guarantees, setQueryData may apply an older patch over a newer one, silently reverting user-visible state.
Apollo’s subscribeToMore on a useQuery hook merges subscription updates via updateQuery, which runs through the same InMemoryCache normalizer as mutations — ensuring consistent cache writes with no extra deduplication logic.
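
The ordering concern above can be handled with a per-entity sequence guard. The following is a minimal sketch, assuming the server attaches a monotonically increasing seq to every patch; the message shape, createSeqGuard, and FeedItemLike are hypothetical names introduced for illustration.

```typescript
interface SeqPatch<T> {
  seq: number; // assumed: server-assigned, monotonically increasing per entity
  id: string;
  changes: Partial<T>;
}

// Returns a predicate that accepts a patch only if it is newer than the last
// patch accepted for the same entity id.
function createSeqGuard(): (patch: { id: string; seq: number }) => boolean {
  const lastApplied = new Map<string, number>();
  return (patch) => {
    const last = lastApplied.get(patch.id) ?? -Infinity;
    if (patch.seq <= last) return false; // stale or duplicate: drop it
    lastApplied.set(patch.id, patch.seq);
    return true;
  };
}

interface FeedItemLike { id: string; title: string; likes: number }

function applySeqPatch(items: FeedItemLike[], patch: SeqPatch<FeedItemLike>): FeedItemLike[] {
  return items.map((item) => (item.id === patch.id ? { ...item, ...patch.changes } : item));
}

const isNewer = createSeqGuard();
let feedItems: FeedItemLike[] = [{ id: 'a', title: 'Hello', likes: 0 }];

const incomingPatches: SeqPatch<FeedItemLike>[] = [
  { seq: 2, id: 'a', changes: { likes: 5 } },
  { seq: 1, id: 'a', changes: { likes: 1 } }, // arrives late and is ignored
];

for (const patch of incomingPatches) {
  if (isNewer(patch)) feedItems = applySeqPatch(feedItems, patch);
}
```

In a real message handler the guard would wrap the setQueryData call, so a late patch never reaches the cache.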

Production Code Example: Multi-Entity Optimistic Mutation (TanStack Query v5 + TypeScript)

The following annotated snippet demonstrates a realistic cart checkout mutation that touches three related query keys — the cart, the inventory count, and the user’s order history — applying optimistic patches to each and rolling back all three atomically on failure.

import { useMutation, useQueryClient } from '@tanstack/react-query';

interface CartItem { id: string; quantity: number; productId: string }
interface InventoryEntry { productId: string; available: number }
interface Order { id: string; status: string; items: CartItem[] }

interface CheckoutPayload { cartId: string; userId: string }

export function useCheckout() {
  const queryClient = useQueryClient();

  return useMutation<Order, Error, CheckoutPayload, {
    prevCart: CartItem[] | undefined;
    prevInventory: InventoryEntry[] | undefined;
    prevOrders: Order[] | undefined;
  }>({
    mutationFn: (payload) =>
      fetch('/api/checkout', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(payload),
      }).then((res) => {
        if (!res.ok) throw new Error(`Checkout failed: HTTP ${res.status}`);
        return res.json();
      }),

    onMutate: async ({ cartId, userId }) => {
      // Cancel all three affected queries to prevent race-condition overwrites.
      await Promise.all([
        queryClient.cancelQueries({ queryKey: ['cart', cartId] }),
        queryClient.cancelQueries({ queryKey: ['inventory'] }),
        queryClient.cancelQueries({ queryKey: ['orders', userId] }),
      ]);

      // Capture immutable snapshots before applying any optimistic writes.
      const prevCart = queryClient.getQueryData<CartItem[]>(['cart', cartId]);
      const prevInventory = queryClient.getQueryData<InventoryEntry[]>(['inventory']);
      const prevOrders = queryClient.getQueryData<Order[]>(['orders', userId]);

      // Optimistic patch 1: empty the cart immediately.
      queryClient.setQueryData<CartItem[]>(['cart', cartId], []);

      // Optimistic patch 2: decrement available inventory for each cart item.
      if (prevCart && prevInventory) {
        queryClient.setQueryData<InventoryEntry[]>(['inventory'], (old) =>
          old?.map((entry) => {
            const cartItem = prevCart.find((ci) => ci.productId === entry.productId);
            return cartItem
              ? { ...entry, available: Math.max(0, entry.available - cartItem.quantity) }
              : entry;
          })
        );
      }

      // Optimistic patch 3: prepend a pending order to the history list.
      queryClient.setQueryData<Order[]>(['orders', userId], (old) => [
        { id: 'optimistic-pending', status: 'pending', items: prevCart ?? [] },
        ...(old ?? []),
      ]);

      return { prevCart, prevInventory, prevOrders };
    },

    onError: (_err, { cartId, userId }, context) => {
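      // Restore all three snapshots in a single synchronous pass. setQueryData ignores
      // an undefined value, so a key with no prior cache entry keeps its optimistic data
      // until the onSettled invalidation refetches it.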
      queryClient.setQueryData(['cart', cartId], context?.prevCart);
      queryClient.setQueryData(['inventory'], context?.prevInventory);
      queryClient.setQueryData(['orders', userId], context?.prevOrders);
    },

    onSuccess: (newOrder, { userId }) => {
      // Replace the optimistic-pending entry with the confirmed order from the server.
      queryClient.setQueryData<Order[]>(['orders', userId], (old) =>
        old?.map((o) => (o.id === 'optimistic-pending' ? newOrder : o))
      );
    },

    onSettled: (_data, _err, { cartId, userId }) => {
      // Invalidate all three keys regardless of outcome to force a background sync
      // that reconciles the cache against the authoritative server state.
      queryClient.invalidateQueries({ queryKey: ['cart', cartId] });
      queryClient.invalidateQueries({ queryKey: ['inventory'] });
      queryClient.invalidateQueries({ queryKey: ['orders', userId] });
    },
  });
}

This example demonstrates the key architectural invariant: onMutate → optimistic write, onError → atomic rollback, onSuccess → patch with server truth, onSettled → unconditional background reconciliation.

Common Engineering Pitfalls

Symptom	Root Cause	Resolution
Every mutation triggers a full-page loading spinner as all queries refetch simultaneously	`invalidateQueries()` called with no `queryKey` filter, invalidating the entire cache namespace	Replace with targeted `invalidateQueries({ queryKey: ['resource', id] })` using the minimum-scope key; use `refetchType: 'active'` to skip inactive queries
UI shows stale data for 2–5 seconds after a successful mutation, then snaps to the updated value	`onSuccess` invalidates the query but `gcTime` is shorter than `staleTime`, causing the entry to be GC’d and forcing a cold fetch instead of a background revalidation	Ensure `gcTime >= staleTime`; if `staleTime: 0`, set `gcTime` to at least 30 s so the stale-while-revalidate entry survives the background fetch duration
Optimistic update flickers: the UI shows the new value then immediately reverts before the server responds	`cancelQueries` was not awaited in `onMutate`, so an in-flight refetch resolved after the optimistic patch and overwrote it	Await `cancelQueries` calls at the top of `onMutate` before calling `setQueryData`
WebSocket patches appear out of order: a later update arrives before an earlier one and the final state is wrong	A single WebSocket connection delivers messages in order, but reconnects, multiple connections, or server-side fan-out from several nodes can reorder updates; `setQueryData` applies them in arrival order	Attach a monotonic `seq` field to each WebSocket message; in the handler, discard messages where `seq` ≤ the last applied sequence number
Apollo queries show the cached value after a mutation even though `refetchQueries` was passed	The mutation response omitted `id` (or `__typename`), so `cache.identify` returned `undefined` and the normalized entity was never updated	Select `id` explicitly in the mutation's selection set and keep `addTypename` at its default of `true` on `InMemoryCache` so `__typename` is added automatically; pass `refetchQueries` the names or documents of active queries
RTK Query endpoint returns stale data after a mutation invalidates the tag	The query endpoint’s `providesTags` and the mutation’s `invalidatesTags` use different tag shapes — `{ type, id }` vs `string`	Normalise all tag definitions to `{ type: 'Resource', id: string }` objects; avoid mixing string and object tag forms in the same API slice

Frequently Asked Questions

When should I use tag-based invalidation over query-key invalidation?

Tag-based invalidation scales better for normalized caches: it invalidates every query that touches a resource type without manually enumerating keys. Use it whenever multiple queries share entity data — for example when a user entity appears in both a list query (['users']) and a detail query (['user', id]). In React Query, encoding the resource type as the first element of the key array achieves the same effect as explicit tags; in RTK Query, providesTags makes the tag graph explicit and type-checked.
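
One way to keep the resource type as the first key element consistently is a small key factory. This is a sketch of a commonly used pattern, not a library API: userKeys and its layout are hypothetical, and unlike the ['users'] / ['user', id] keys mentioned above it places lists and details under one shared root so a single prefix reaches both.

```typescript
// Hypothetical key factory: every key for the 'user' resource starts with the
// same root element, so the root behaves like a tag.
const userKeys = {
  all: ['user'] as const,
  lists: () => [...userKeys.all, 'list'] as const,
  list: (filters: { role?: string }) => [...userKeys.lists(), filters] as const,
  detail: (id: number) => [...userKeys.all, 'detail', id] as const,
};

const adminListKey = userKeys.list({ role: 'admin' }); // ['user', 'list', { role: 'admin' }]
const detailKey = userKeys.detail(42);                 // ['user', 'detail', 42]
```

Invalidating with userKeys.all would then reach both lists and details, while userKeys.lists() or userKeys.detail(42) narrows the scope to what a mutation actually touched.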

How do I prevent cache thrashing during rapid user interactions?

Combine three defences: first, await cancelQueries at the top of onMutate to abort any in-flight requests that could stomp the optimistic patch. Second, debounce mutations that fire on every keystroke (search, inline edit) so only the final state triggers a network write. Third, set a refetchInterval no shorter than your expected server response time to prevent overlapping polls from queuing up behind each other.
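
The second defence can be sketched as a plain debounce helper. The version below is a minimal illustration: debounceWith, the Scheduler interface, and the injectable timers are assumptions made so the helper can be exercised without real timers.

```typescript
interface Scheduler<H> {
  set(fn: () => void, ms: number): H;
  clear(handle: H): void;
}

// Returns a wrapper that delays `fn` until `waitMs` has passed without another
// call; each new call cancels the previously scheduled one.
function debounceWith<A extends unknown[], H>(
  timers: Scheduler<H>,
  fn: (...args: A) => void,
  waitMs: number,
): (...args: A) => void {
  let pending: { handle: H } | undefined;
  return (...args: A) => {
    if (pending) timers.clear(pending.handle); // drop the superseded call
    const handle = timers.set(() => {
      pending = undefined;
      fn(...args);
    }, waitMs);
    pending = { handle };
  };
}

// Real timers for production use.
const realTimers: Scheduler<ReturnType<typeof setTimeout>> = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle),
};
```

Wrapping the function that calls mutate, for instance debounceWith(realTimers, (text: string) => mutate({ text }), 300), means a burst of keystrokes produces one network write carrying the final value. The mutate call and the 300 ms wait are placeholders.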

What staleTime should I set for frequently-updated server resources?

For resources that change on every mutation (a cart, a comment thread), set staleTime: 0 so React Query treats the cache as immediately stale after the fetch settles and triggers a background refetch on next window focus or remount. For slower-changing resources (user profile, configuration), a staleTime of 30–60 seconds eliminates redundant network calls without meaningful staleness risk. Avoid per-component overrides — set defaults at the QueryClient level and override only for specific endpoints that genuinely need different freshness guarantees.

What is the recommended fallback when a WebSocket connection drops?

Transition to HTTP polling with exponential backoff — start at 2 s, double on each retry, cap at 30 s — using refetchInterval on the affected queries. When the WebSocket reconnects, issue a full resource sync request to the REST endpoint to reconcile any patches missed during the offline window, then disable polling again. Implement the reconnection logic in a custom hook that toggles refetchInterval based on socket readyState, so the polling overhead is strictly bounded to the offline recovery window.
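
The backoff schedule described above reduces to a one-line calculation. This is a sketch using the 2 s start and 30 s cap from the answer; pollingDelayMs and its parameter names are hypothetical.

```typescript
// Delay before the next poll: start at baseMs, double per attempt, cap at capMs.
function pollingDelayMs(attempt: number, baseMs = 2_000, capMs = 30_000): number {
  const safeAttempt = Math.max(0, Math.floor(attempt));
  return Math.min(capMs, baseMs * 2 ** safeAttempt);
}

// Attempts 0..5 yield 2000, 4000, 8000, 16000, 30000, 30000.
const backoffDelays = [0, 1, 2, 3, 4, 5].map((attempt) => pollingDelayMs(attempt));
```

A custom hook could keep the attempt count in state while the socket is closed, pass the resulting delay as refetchInterval, and reset the count to zero once the socket reopens.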

How do I handle optimistic updates across multiple related queries?

In onMutate, call cancelQueries and getQueryData for every affected query key, apply setQueryData patches to each, and return all snapshots together in the context object. In onError, iterate the context to restore every snapshot. Using Promise.all for the cancelQueries calls runs the cancellations in parallel and keeps onMutate latency minimal. For Apollo, a single optimisticResponse automatically updates all queries that reference the same normalized entity, with no manual key enumeration needed.

Tag-Based Invalidation Systems — deep dive into scoping invalidation to resource domains in React Query and Apollo, with tag schema design patterns.
Mutation Sync & Rollback — step-by-step implementation of optimistic updates with deterministic rollback for 4xx/5xx server responses.
Stale-While-Revalidate Implementation — configuring staleTime, gcTime, and refetch triggers to serve stale payloads instantly while fetching fresh data in parallel.
Background Refetch Strategies — orchestrating window-focus, reconnect, and interval-based background syncs without creating thundering-herd network bursts.
Data Normalization & Query Key Design — the normalization layer that makes tag-based invalidation and optimistic patching deterministic; covers entity mapping, nested data flattening, and pagination normalization.
State Architecture & Cache Fundamentals — foundational cache layer architecture and client-vs-server state boundaries that every invalidation strategy builds on.