Service Runtime & Execution Model
When this skill applies
Use this skill when the main decision is how a VTEX IO backend app runs inside the node builder: how the node/index.ts entrypoint is structured, how runtime configuration is declared, and how routes, events, or GraphQL handlers are registered into the service.
- Creating a new backend app under node/
- Structuring node/index.ts as the service entrypoint
- Defining typed Context, State, and params contracts for handlers
- Configuring service.json for timeout, memory, workers, and replicas
- Troubleshooting runtime issues caused by service registration or execution model mismatches
- Registering GraphQL handlers at the runtime level, while keeping schema and resolver design in a separate skill
Do not use this skill for:
- deciding the app contract in manifest.json
- designing custom clients or integration transport layers
- detailed HTTP route handler behavior
- event-specific business workflows
- GraphQL schema or resolver modeling beyond runtime registration
Decision rules
- Treat node/index.ts as the runtime composition root of the backend app.
- Use the Service definition to register runtime surfaces such as routes, events, and GraphQL handlers, not to hold business logic directly.
- Keep runtime wiring explicit: context typing, client typing, route registration, and event registration should be visible at the service boundary.
- Put execution knobs such as timeout, ttl, memory, workers, and replica limits in service.json, not inside handler code.
- Use service.json to declare the runtime parameters the platform uses to execute the service, especially memory, timeout, ttl, minReplicas, maxReplicas, workers, routes, events, and rateLimitPerReplica.
- Use routes in service.json to expose HTTP entrypoints. Routes are private by default, so set public: true explicitly for routes that must be externally reachable.
- Use smartcache only on idempotent, cacheable routes where the same response can be safely reused across repeated requests. Avoid it on personalized, authenticated, or write-oriented endpoints.
- Use events in service.json to declare which event sources and handlers are part of the service runtime. Keep event registration in the runtime layer and event-specific business rules in dedicated event modules.
- Use rateLimitPerReplica to shape throughput per replica for requests and events. Set a global baseline only when the service needs it, then add small explicit overrides only for expensive routes or noisy event sources.
- Do not use rateLimitPerReplica as a substitute for redesigning expensive routes, queueing work, or moving slow operations to async processing.
- Keep handlers focused on request or event behavior; keep runtime structure focused on bootstrapping and registration.
- Model Context, State, and params types deliberately so middlewares and handlers share a stable contract. Apply the same typed Context and State to middlewares so they can safely manipulate context, state, and params without falling back to any.
- If a backend app starts mixing runtime wiring, client implementation, and business rules in the same file, split those concerns before expanding the service further.
- Although some authorization-related fields may live in service.json, they are primarily authorization concerns and belong in auth or security-focused skills rather than this runtime skill.
Runtime sizing heuristics:
- These ranges are intended for partner and account-level apps. Native VTEX core services may legitimately use much higher values such as thousands of MB of memory or hundreds of replicas, but those values should not be used as defaults for custom apps.
Suggested defaults:
- Start synchronous HTTP services with timeout between 10 and 30 seconds. For UX-facing routes, prefer 5 to 15 seconds.
- Start memory at 256 MB.
- Start workers at 1.
- Use as the default for installed apps, and reserve for linked-app development contexts where the platform allows it.
- Use as the lowest practical starting point, since the documented minimum is .
- Use ttl intentionally. In VTEX IO, ttl is measured in minutes, with platform defaults and limits that differ from those of timeout. For partner apps, start from the default and increase it toward the documented maximum only when reducing cold starts matters more than allowing idle instances to sleep sooner.
Scaling ranges and exceptions:
- Use 128 to 256 MB for simpler IO-bound services, and move to 512 MB only when there is evidence of OOM, large payload processing, or heavier libraries.
- Increase workers to between 2 and 4 only for high-throughput IO-bound workloads after measuring benefit. Avoid using more than 4 workers per instance as a default.
- Increase maxReplicas above its starting value only when public traffic or predictable peaks justify it. Treat values above 10 as exceptions that require explicit justification and monitoring in partner apps.
- Avoid timeout values above 60 seconds for HTTP routes; if more time is needed, redesign the flow as async work.
- Remember that ttl has a documented minimum and maximum, both expressed in minutes. Use higher values intentionally to reduce cold starts on low-traffic or bursty services, and avoid treating ttl like a per-request timeout.
- For partner apps, give rateLimitPerReplica.perMinute one starting range for normal routes and a lower one for more expensive routes, and choose a separate starting range for rateLimitPerReplica.concurrent.
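The sizing heuristics above can be expressed as a small lint-style check. This is only a sketch: `ServiceSizing` and `checkSizing` are hypothetical names, not part of any VTEX tooling, and the thresholds are the partner-app heuristics stated in this section rather than platform limits.

```typescript
// Hypothetical shape covering only the sizing fields discussed above.
interface ServiceSizing {
  memory?: number // MB
  timeout?: number // seconds
  workers?: number
  maxReplicas?: number
}

// Returns human-readable warnings for values outside the heuristics in this
// section. An empty array means nothing was flagged.
function checkSizing(config: ServiceSizing): string[] {
  const warnings: string[] = []

  if (config.timeout !== undefined && config.timeout > 60) {
    warnings.push('timeout above 60 seconds: consider redesigning the flow as async work')
  }
  if (config.memory !== undefined && config.memory > 256) {
    warnings.push('memory above 256 MB: justify with evidence of OOM, large payloads, or heavier libraries')
  }
  if (config.workers !== undefined && config.workers > 4) {
    warnings.push('more than 4 workers per instance: avoid as a default')
  }
  if (config.maxReplicas !== undefined && config.maxReplicas > 10) {
    warnings.push('maxReplicas above 10: requires explicit justification and monitoring')
  }

  return warnings
}
```

A check like this could run in CI against the parsed service.json so that exceptions to the heuristics are at least visible in review.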
Hard constraints
Constraint: The Service entrypoint must stay a runtime composition root
node/index.ts MUST define and export the VTEX IO service runtime structure, not become a catch-all file for business logic, data transformation, or transport implementation.
Why this matters
When the entrypoint mixes registration with business logic, the execution model becomes harder to reason about, handlers become tightly coupled, and changes to routes, events, or GraphQL surfaces become risky.
Detection
If node/index.ts contains large handler bodies, external API calls, complex branching, or data-mapping logic, STOP and move that logic into dedicated modules. Keep the entrypoint focused on typing and registration.
Correct
typescript
import type { ClientsConfig, RecorderState, ServiceContext } from '@vtex/api'
import { Service } from '@vtex/api'

import { clients, Clients } from './clients'
import { routes } from './routes'

export interface State extends RecorderState {}

export type Context = ServiceContext<Clients, State>

const clientsConfig: ClientsConfig<Clients> = {
  implementation: clients,
  options: {},
}

export default new Service<Clients, State>({
  clients: clientsConfig,
  routes,
})
Wrong
typescript
import { Service } from '@vtex/api'
import axios from 'axios'

export default new Service({
  routes: {
    reviews: async (ctx: any) => {
      const response = await axios.get('https://example.com/data')
      const transformed = response.data.items.map((item: any) => ({
        ...item,
        extra: true,
      }))

      ctx.body = transformed.filter((item: any) => item.active)
    },
  },
})
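One possible way to repair the wrong example is to move the transformation into a pure function and hide the HTTP call behind a client, so the handler stays thin and the entrypoint only registers it. The sketch below uses stand-in types so it runs on its own; `ReviewItem`, `ReviewsSource`, `HandlerContext`, `selectActiveItems`, and `listReviews` are hypothetical names, and a real app would use ServiceContext from @vtex/api with a client registered in ./clients.

```typescript
// Stand-in types so the sketch is self-contained (assumptions, not VTEX APIs).
interface ReviewItem {
  id: string
  active: boolean
}

interface ReviewsSource {
  fetchItems(): Promise<ReviewItem[]>
}

interface HandlerContext {
  clients: { reviews: ReviewsSource }
  body?: unknown
}

// Pure business rule: easy to unit test without any runtime.
function selectActiveItems(items: ReviewItem[]): Array<ReviewItem & { extra: true }> {
  return items
    .filter((item) => item.active)
    .map((item) => ({ ...item, extra: true as const }))
}

// Thin handler: delegates transport to the client and logic to the function.
async function listReviews(ctx: HandlerContext): Promise<void> {
  const items = await ctx.clients.reviews.fetchItems()
  ctx.body = selectActiveItems(items)
}
```

With this split, the entrypoint would only import `listReviews` and register it under a route name.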
Constraint: Runtime configuration must be expressed in service.json, not improvised in code
Resource and execution settings such as timeout, ttl, memory, workers, and replica behavior MUST be configured in service.json when the app depends on them.
The service.json file resides inside the node/ folder and centralizes runtime parameters such as routes, events, memory, timeout, ttl, workers, replicas, and rate limits for this service.
Why this matters
These settings are part of the service runtime contract with the platform. Hiding them in assumptions or spreading them across code makes behavior harder to predict and can cause timeouts, cold-start churn, underprovisioning, or scaling mismatches. In VTEX IO, ttl is especially important because it is measured in minutes and influences how aggressively service infrastructure can go idle between requests.
Using the minimum ttl on low-traffic services can increase cold starts, because the platform is allowed to scale the service down more aggressively between bursts.
Detection
If the app depends on long-running work, concurrency, warm capacity, or specific route exposure behavior, STOP and verify that the relevant service.json settings are present and intentional. If the behavior is only implied in code comments or handler logic, move it into runtime configuration.
Correct
json
{
  "memory": 256,
  "timeout": 30,
  "ttl": 10,
  "minReplicas": 2,
  "maxReplicas": 10,
  "workers": 4,
  "rateLimitPerReplica": {
    "perMinute": 300,
    "concurrent": 10
  },
  "routes": {
    "reviews": {
      "path": "/_v/api/reviews",
      "public": false
    }
  }
}
Wrong
json
{
  "routes": {
    "reviews": {
      "path": "/_v/api/reviews"
    }
  }
}
This runtime configuration is incomplete for a service that depends on explicit timeout, concurrency, rate limiting, or replica behavior, and it leaves execution characteristics undefined.
Constraint: Route exposure must be explicit in the runtime contract
Every HTTP route exposed by the service MUST be declared in service.json with an intentional visibility choice. Do not rely on implicit defaults when the route should be private or public.
Routes are private by default, so always set public explicitly when the route must be externally reachable.
Why this matters
Route visibility is part of the runtime contract of the service. If exposure is ambiguous, a route can be published with the wrong accessibility, which creates security risk for private handlers and integration failures for routes expected to be public.
Detection
If a route exists in the service runtime, STOP and verify that it is declared in service.json and that public matches the intended exposure. If the route is consumed only by trusted backoffice or app-to-app flows, default to checking that it is private before expanding access.
Correct
json
{
  "routes": {
    "status": {
      "path": "/_v/status/health",
      "public": true,
      "smartcache": true
    },
    "reviews": {
      "path": "/_v/api/reviews",
      "public": false
    }
  }
}
Wrong
json
{
  "routes": {
    "reviews": {
      "path": "/_v/api/reviews"
    }
  }
}
This route leaves visibility implicit, so the runtime contract does not clearly communicate whether the endpoint is meant to be public or protected.
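The detection step for this constraint can be automated with a small helper that lists routes whose visibility is left implicit. This is a sketch under assumptions: `RouteConfig` is a minimal stand-in for the routes section of service.json, and `findImplicitVisibility` is a hypothetical helper, not a VTEX tool.

```typescript
// Minimal stand-in for one entry in the routes section of service.json.
interface RouteConfig {
  path: string
  public?: boolean
  smartcache?: boolean
}

// Returns the names of routes that do not state "public" explicitly.
function findImplicitVisibility(routes: Record<string, RouteConfig>): string[] {
  return Object.entries(routes)
    .filter(([, route]) => typeof route.public !== 'boolean')
    .map(([name]) => name)
}

const implicitRoutes = findImplicitVisibility({
  status: { path: '/_v/status/health', public: true, smartcache: true },
  reviews: { path: '/_v/api/reviews' },
})

console.log(implicitRoutes) // only "reviews" is missing an explicit "public"
```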
Constraint: Typed context and state must match the handlers registered in the runtime
The service MUST define Context, State, and handler contracts that match the routes, events, or GraphQL handlers it registers.
Why this matters
Untyped or inconsistent runtime contracts make middleware composition fragile and allow handlers to rely on state or params that are never guaranteed to exist.
Detection
If middlewares or handlers use context, state, or params fields without a shared typed contract, STOP and introduce or fix the runtime types before adding more handlers.
Correct
typescript
import type { ParamsContext, RecorderState, ServiceContext } from '@vtex/api'
import type { Clients } from './clients'

interface State extends RecorderState {
  reviewId?: string
}

type CustomContext = ServiceContext<Clients, State, ParamsContext>

export async function getReview(ctx: CustomContext) {
  // Route params may arrive as string or string[]; keep a single string in state.
  const { id } = ctx.vtex.route.params
  ctx.state.reviewId = Array.isArray(id) ? id[0] : id
  ctx.body = { id: ctx.state.reviewId }
}
Wrong
typescript
export async function getReview(ctx: any) {
  ctx.state.reviewId = ctx.params.review
  ctx.body = { id: ctx.state.missingField.value }
}
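The same typed contract can be shared between a middleware and the handler that follows it. The sketch below is self-contained and uses stand-in `State`, `Context`, and `Middleware` types in place of the @vtex/api ones; `validate` and `run` are hypothetical, and the tiny runner exists only so the chain can execute outside the platform.

```typescript
// Stand-ins for the @vtex/api types so the sketch runs on its own.
interface State {
  reviewId?: string
}

interface Context {
  state: State
  vtex: { route: { params: Record<string, string | string[] | undefined> } }
  status?: number
  body?: unknown
}

type Middleware = (ctx: Context, next: () => Promise<void>) => Promise<void>

// Middleware: validates the route param and stores it in typed state.
const validate: Middleware = async (ctx, next) => {
  const { id } = ctx.vtex.route.params
  if (typeof id !== 'string' || id.length === 0) {
    ctx.status = 400
    ctx.body = { error: 'missing review id' }
    return
  }
  ctx.state.reviewId = id
  await next()
}

// Handler: relies on the same State contract the middleware wrote to.
const getReview: Middleware = async (ctx) => {
  ctx.status = 200
  ctx.body = { id: ctx.state.reviewId }
}

// Minimal runner for this sketch only; it calls each middleware in order.
async function run(ctx: Context, chain: Middleware[]): Promise<void> {
  const dispatch = async (index: number): Promise<void> => {
    if (index < chain.length) {
      await chain[index](ctx, () => dispatch(index + 1))
    }
  }
  await dispatch(0)
}
```

Because both functions share `Context`, a typo such as `ctx.state.missingField` would fail at compile time instead of at request time.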
Preferred pattern
Recommended file layout:
text
node/
├── index.ts
├── clients/
│   └── index.ts
├── routes/
│   └── index.ts
├── events/
│   └── index.ts
├── graphql/
│   └── index.ts
└── middlewares/
    └── validate.ts
Minimal service runtime pattern:
typescript
import type { ClientsConfig, RecorderState, ServiceContext } from '@vtex/api'
import { Service } from '@vtex/api'

import { clients, Clients } from './clients'
import { routes } from './routes'

export interface State extends RecorderState {}

export type Context = ServiceContext<Clients, State>

const clientsConfig: ClientsConfig<Clients> = {
  implementation: clients,
  options: {},
}

export default new Service<Clients, State>({
  clients: clientsConfig,
  routes,
})
json
{
  "memory": 256,
  "timeout": 30,
  "ttl": 10,
  "minReplicas": 2,
  "maxReplicas": 5,
  "workers": 1,
  "rateLimitPerReplica": {
    "perMinute": 120,
    "concurrent": 4
  },
  "routes": {
    "status": {
      "path": "/_v/status/health",
      "public": true,
      "smartcache": true
    },
    "reviews": {
      "path": "/_v/api/reviews",
      "public": false
    }
  },
  "events": {
    "orderCreated": {
      "sender": "vtex.orders-broadcast",
      "topics": ["order-created"],
      "rateLimitPerReplica": {
        "perMinute": 60,
        "concurrent": 2
      }
    }
  }
}
Use the service entrypoint to compose runtime surfaces, then push business behavior into handlers, clients, and other focused modules.
If a registry file such as routes/index.ts or events/index.ts grows too large, split it by domain and keep the index file as a small registry.
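A registry that merges per-domain handler maps could look like the sketch below. The names `Handler`, `reviewRoutes`, `statusRoutes`, and `mergeRoutes` are hypothetical, and the handler type is a simplified stand-in; in a real app each map would live in its own file next to the index.

```typescript
// Simplified stand-in for a route handler signature.
type Handler = (ctx: { body?: unknown }) => Promise<void>

// Per-domain maps; in a real app each would be exported from its own module.
const reviewRoutes: Record<string, Handler> = {
  reviews: async (ctx) => {
    ctx.body = []
  },
}

const statusRoutes: Record<string, Handler> = {
  status: async (ctx) => {
    ctx.body = { ok: true }
  },
}

// Merges domain maps and fails fast if two domains claim the same route name.
function mergeRoutes(...groups: Array<Record<string, Handler>>): Record<string, Handler> {
  const merged: Record<string, Handler> = {}
  for (const group of groups) {
    for (const [name, handler] of Object.entries(group)) {
      if (Object.prototype.hasOwnProperty.call(merged, name)) {
        throw new Error(`duplicate route name: ${name}`)
      }
      merged[name] = handler
    }
  }
  return merged
}

// The index file stays a small registry that only composes domain maps.
const routes = mergeRoutes(reviewRoutes, statusRoutes)
```

The duplicate check is a design choice for this sketch: a plain object spread would silently let a later domain override an earlier one.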
Common failure modes
- Putting business logic directly into node/index.ts.
- Treating service.json as optional when runtime behavior depends on explicit resource settings.
- Setting ttl too low and causing the service to sleep too aggressively between bursts of traffic.
- Enabling smartcache on personalized or write-oriented routes and risking incorrect cache reuse across requests.
- Registering routes, events, or GraphQL handlers without a clear typed Context and State.
- Mixing runtime composition with client implementation details.
- Letting one service entrypoint accumulate unrelated responsibilities across HTTP, events, and GraphQL without clear module boundaries.
Review checklist
Reference
- Service - VTEX IO service runtime structure and registration
- Service JSON - Runtime configuration for VTEX IO services
- Node Builder - Backend app structure under the node builder
- Developing an App - General backend app development flow