Vercel Caching
You are an expert in understanding Vercel's caching infrastructure, and how the CDN Cache, ISR, and PPR work.
Core Knowledge
- ISR (and PPR, a rendering strategy built on it) is a framework feature — Next.js, SvelteKit, Nuxt, and Astro all use it on Vercel, and the layers, metrics, and CLI here apply regardless. (For caching data between your function and a backend, that's the Runtime Cache — a separate layer; see References.)
- PPR (Partial Prerendering) — a rendering strategy, not a cache layer: the static shell lives in the ISR cache while a function renders the dynamic holes per request and streams them into the same response. A route with holes still invokes the function on a shell hit; a holeless route is just ISR (a pure HIT).
How caching works
Vercel caches at multiple layers between the visitor and your backend. A request reaches the nearest PoP, which routes to a Vercel region; the CDN then checks each layer in order and returns a cached response as soon as one is available, so your function runs only when nothing upstream has a valid copy.
Cache layers
- CDN cache — regional, ephemeral. On a hit the region returns the response with no function call. Reads/writes are free.
- ISR cache — durable, in a single Function region. On a CDN miss, Vercel reads here before invoking your function (cache shielding), then replicates the result back to the CDN. Survives deploys for 31 days or until revalidated; reads/writes are billed in 8 KB units.
- Function invocation — runs only if neither cache has a valid copy. It may read the Runtime/data cache (a separate layer; see References) and your backend, then Vercel stores the response in the ISR cache.
- Image cache — optimized images, cached on the CDN after the first transform.
- Purges propagate globally in ~300 ms.
Request collapsing: when many requests hit the same uncached path at once, Vercel collapses them into one function invocation per region to protect the origin.
Key concepts
-
Cache hit rate — share served from cache (
/
/
) versus origin (
/
). Measure it over
cacheable requests — exclude
and
(redirects, errors, uncacheable methods), or they drag the ratio down for non-cache reasons. Low hit rate means more origin load and higher latency.
-
Revalidation — refreshing cached content. Time-based runs automatically after an interval; on-demand runs when you call an API. Both use stale-while-revalidate: visitors keep getting the cached version while the new one regenerates in the background.
-
Invalidate vs. dangerously-delete — two ways to clear content, with very different blast on hit rate:
- Invalidate (, Next.js /) = stale-while-revalidate. Keeps serving stale while refreshing in the background → response shows .
- Dangerously-delete (, Next.js or a revalidate with no lifetime) = hard removal. The next request blocks in the foreground to regenerate →
x-vercel-cache: REVALIDATED
.
-
Cache tags & blast radius — tags group cached entries so one call can clear many. A coarse tag attached to thousands of paths has a large
blast radius: a single write drops them all and the hit rate collapses until they re-warm. Prefer granular tags (
) plus a roll-up tag.
-
Cache status / cache reason (
response header):
| Value | Meaning |
|---|
| Served from cache; no function ran |
| Not cached; origin/function ran |
| Served stale while revalidating in background (SWR / invalidate) |
| Served a prerendered ISR/PPR shell |
| Foreground revalidation after a delete (or ) |
| Caching skipped (, , cookies, etc.) |
Investigating cache issues
Reach for the Vercel CLI.
gives aggregate numbers (requires
Observability Plus);
shows per-request behavior.
Metrics need to be queried by team and project (
). Filter production with
-f "environment eq 'production'"
(there is no
flag). Run
vercel metrics schema <metric>
to discover dimensions; use
for machine-readable output. With
, remember
is per time bucket — omit
when you need totals across the whole window.
Cache hit rate
Start here for an overall picture of how well caching is working.
Step 1 — overall split. Group
by
. Treat
,
, and
as cache-served; focus investigation on
. Exclude
and
when computing a hit rate over
cacheable traffic (see
Debugging BYPASS traffic).
means stale-while-revalidate is working — dig into revalidation frequency in
Analyzing ISR costs, not here.
bash
vercel metrics vercel.request.count -S <team> -p <project> \
-f "environment eq 'production'" --group-by cache_result --since 24h
Step 2 — where misses concentrate. Split the
bucket (and optionally
) by
, then by
or
:
bash
vercel metrics vercel.request.count -S <team> -p <project> \
-f "environment eq 'production' and cache_result eq 'MISS'" \
--group-by path_type --since 24h
vercel metrics vercel.request.count -S <team> -p <project> \
-f "environment eq 'production' and cache_result eq 'MISS' and path_type eq 'prerender'" \
--group-by request_path --since 24h
What to expect: routes (static shells, ISR pages) should show a high share of
/
. A
path with a disproportionate
count is your short list for per-path header inspection (
above) and code review.
routes render dynamically by default, but you can still cache them with
headers — matching requests are cached on the CDN. Each cache entry varies by
headers (cookies, RSC, etc.) as well as path and query parameters, so expect more cache keys and a lower hit rate than a fully static
route.
Analyzing ISR costs
Once you know hit rate, quantify ISR spend and whether revalidation — not traffic volume — is driving it.
Utilization vs. ISR billing. Utilization is
— total request volume.
ISR cost is billed separately in 8 KB units:
when the regional CDN misses and falls through to the ISR cache, and
on every revalidation/regeneration. The regional CDN shields ISR heavily — most requests never touch the ISR layer, so
read_units will be far below request count. Do not compare read_units to write_units as a utilization check; focus on
write_units (revalidation cost) and how they relate to total traffic.
bash
vercel metrics vercel.request.count -S <team> -p <project> -a sum --since 24h
vercel metrics vercel.isr_operation.write_units -S <team> -p <project> -a sum --since 24h
Which routes revalidate most. Break write units down by
and
to find paths that regenerate often relative to traffic:
bash
vercel metrics vercel.isr_operation.write_units -S <team> -p <project> \
-a sum --group-by route --since 24h
vercel metrics vercel.isr_operation.write_units -S <team> -p <project> \
-a sum --group-by request_path --since 24h
Regeneration vs. serving. Group write units by
— concentration in
confirms revalidation (not per-request dynamic work) is the cost driver.
Time-based vs. tag-based revalidation. Time-based intervals regenerate on a schedule whether or not content changed — often inefficient. Tag-based on-demand revalidation is usually better, but an overly broad tag has a large blast radius: one invalidate drops every entry that carries it.
- Tag blast radius — group write units by . If many unrelated routes show near-identical write counts, a shared hot tag is invalidating them in lockstep (e.g. every blog post rewriting at the same rate because they share one broad tag):
bash
vercel metrics vercel.isr_operation.write_units -S <team> -p <project> \
-a sum --group-by cache_tags --since 24h
- What triggered revalidation — group by to see which tags fire most often ( is on request count only, not ISR operation metrics. It is one of the tags that triggered the page to be stale):
bash
vercel metrics vercel.request.count -S <team> -p <project> \
-f "triggering_tag ne null" --group-by triggering_tag --since 24h
Tags with a large blast radius that revalidate frequently are the usual root cause of high write_units. Prefer granular tags (
) and on-demand invalidation over short time-based intervals for event-driven content.
Confirm in code. Metrics tell you
which tag is hot; the repo tells you
why. Grep for the tag's invalidation call site —
,
,
,
— and read the trigger. A CMS webhook or a sync cron that invalidates a
broad tag on every event (instead of a specific
) is the classic amplifier.
Debugging BYPASS traffic
The largest legitimate sources of
are
Draft Mode and
SEO crawlers. Draft Mode must bypass cache so editors see live content. SEO bots must receive the
full response — especially on PPR routes where the static shell and dynamic holes are assembled at request time — so crawlers index what users actually see. That BYPASS is expected, not a misconfiguration.
Before tuning headers or revalidate intervals, confirm what's left after those two buckets:
bash
vercel metrics vercel.request.count -S <team> -p <project> \
-f "cache_result eq 'BYPASS'" --group-by bot_category --since 24h
vercel metrics vercel.request.count -S <team> -p <project> \
-f "cache_result eq 'BYPASS'" --group-by user_agent --since 24h
vercel metrics vercel.request.count -S <team> -p <project> \
-f "cache_result eq 'BYPASS'" --group-by request_method --since 24h
The
Firewall/WAF with the
skill can be used to manage verified SEO crawlers, block abusive bots, and rate-limit junk traffic before it distorts your hit-rate picture.
Reducing ISR cost
- Prefer tag-based over time-based revalidation. Replace short intervals with on-demand / when content changes — time-based regeneration runs whether or not anything changed. If using Cache Components, analyze calls with the skill.
- Scope tags to specific IDs. Invalidate , not a generic / tag — one broad invalidate regenerates everything that carries it.
- Tune the revalidate interval where your framework declares it (Next.js / , SvelteKit , Nuxt , Astro). For Next.js Cache Components, see the skill.
- Use headers to cache dynamic functions.
Inspect one path
bash
curl -sSI https://<host>/<path> | grep -iE 'x-vercel-cache|x-matched-path|cache-control|vary|age|set-cookie'
This zero-dependency first reach shows the status (
), the cache directives (
/
/
), and — crucially —
, which reveals rewrites like
that expose experiment/flag precomputation.
flags personalization (RSC, cookies);
forces
. For a per-phase timing breakdown,
vercel httpstat /some/path
(CLI v48.9.0+; needs the
tool installed) adds latency stats. A path that should cache but shows
/
usually has
,
,
, a per-request input (cookies/headers/
), or an uncacheable method (see FAQ).
Inspect one request. When metrics or headers give you a request ID, pull the full log record:
bash
vercel logs --request-id <request-id> --json
Use
so the agent can parse cache status, path, and timing fields programmatically.
FAQ
- What are prerender variant misses? When a route uses a dynamic param, each distinct cache-key variant is prerendered and cached separately, so each variant misses on its first hit per region and low-traffic ones rarely stay warm. The most common modern cause is feature-flag / experiment precomputation — middleware picks a variant per request ( paths), and flags × routes × PPR segments multiply into thousands of ISR entries (also a middleware-invocation cost). Fix: collapse the variant matrix (retire finished experiments), or accept the cost.
- Does PPR avoid function invocations? No — a PPR route has dynamic holes by definition, so the cached shell hit still runs the function to fill them. (A route with no holes is just ISR and serves a pure HIT — see Key concepts.)
- Why are there more function invocations than PPR requests? PPR requests have a static shell and a dynamic function invocation. When the static shell needs to be regenerated, it incurs a function invocation on top of the dynamic function for the content.
Related skills
- — manage verified SEO crawlers, block abusive bots, and rate-limit junk BYPASS traffic.
- — caching data between your function and a backend (per-region key-value / data cache). A different layer from the CDN/ISR caches; use it to cache an API response or query result inside a function.
- — Next.js , , , and tuning (one framework's ISR/PPR controls).
References: