api-monetization

When this skill is activated, always start your first response with the 🧢 emoji.

API Monetization

API monetization is the practice of turning API usage into a revenue stream through pricing models, metering, and billing infrastructure. It spans from defining developer tiers and rate limits to integrating with payment providers like Stripe for usage-based billing. This skill covers the full stack: pricing model design, quota enforcement, metered usage tracking, and Stripe integration for automated invoicing.

When to use this skill

Trigger this skill when the user:

- Wants to design a pricing model for a public or partner API
- Needs to implement usage-based or metered billing for API calls
- Asks about rate limiting strategies tied to paid tiers
- Wants to integrate Stripe metering or usage records into an API
- Needs to build a developer tier system (free, pro, enterprise)
- Asks about tracking API consumption per customer
- Wants to handle overage billing or throttling for quota breaches
- Needs to design a developer portal with tiered access

Do NOT trigger this skill for:

- General Stripe payments unrelated to API billing (use a Stripe skill)
- API gateway configuration without a monetization component

Key principles

- Meter before you bill - Never charge for usage you cannot accurately measure. Instrument every billable endpoint with reliable counters before enabling paid tiers. Lost meter events mean lost revenue or customer disputes.
- Tiers define the product, not just the price - Each developer tier should differ in meaningful capabilities (rate limits, endpoints available, SLA, support level), not just volume. This prevents a race to the bottom on price.
- Rate limits are a feature, not just protection - Rate limits serve dual duty: they protect infrastructure AND enforce tier boundaries. Design them as a first-class part of the product, with clear headers and upgrade paths.
- Idempotent usage reporting - Usage records must be idempotent. Network retries, duplicate webhook deliveries, and reprocessed queues should never double-count usage. Use idempotency keys on every usage report call.
- Graceful degradation over hard cutoffs - When a customer hits a quota, prefer throttling or overage billing over immediately blocking access. Hard cutoffs break production systems and destroy trust.
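
The last principle can be made concrete with a small decision function. This is only a sketch: the name `enforceQuota`, the action labels, and the choice to throttle to one tenth of the normal rate are illustrative assumptions, not part of any library.

```javascript
// Hypothetical helper: decide what to do with a customer relative to their quota.
// Throttling to 1/10 of the normal rate is an arbitrary illustrative choice.
function enforceQuota({ used, included, overageEnabled, rateLimitPerMin }) {
  if (used < included) {
    return { action: 'allow', rateLimitPerMin };
  }
  if (overageEnabled) {
    // Keep serving at full speed and bill the extra calls
    return { action: 'bill_overage', rateLimitPerMin };
  }
  // No overage plan: slow the customer down instead of cutting them off
  return {
    action: 'throttle',
    rateLimitPerMin: Math.max(1, Math.floor(rateLimitPerMin / 10)),
  };
}

console.log(
  enforceQuota({ used: 1000, included: 1000, overageEnabled: false, rateLimitPerMin: 100 })
);
```

Note that no branch returns a hard block; access degrades but never stops outright.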

Core concepts

Pricing models fall into three categories: flat-rate (fixed monthly fee per tier), usage-based (pay per API call or resource unit), and hybrid (base fee plus usage overage). Most successful API businesses use hybrid pricing because it provides revenue predictability while rewarding growth.
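
As a rough illustration of hybrid pricing, the sketch below computes a monthly charge from a base fee plus overage. The plan values mirror the Pro tier figures used elsewhere in this document ($49 base, 50,000 included calls, $0.002 per extra call). The names `PRO_PLAN`, `monthlyCharge`, and `formatUsd` are hypothetical. Amounts are kept in hundredths of a cent so that the sub-cent overage rate stays an integer.

```javascript
// Hypothetical plan definition; all amounts are in hundredths of a cent.
const PRO_PLAN = {
  baseFee: 490000,      // $49.00
  includedCalls: 50000,
  overagePerCall: 20,   // $0.002
};

// Base fee plus a per-call charge for anything beyond the included quota
function monthlyCharge(plan, callsUsed) {
  const overageCalls = Math.max(0, callsUsed - plan.includedCalls);
  return plan.baseFee + overageCalls * plan.overagePerCall;
}

function formatUsd(amount) {
  return `$${(amount / 10000).toFixed(2)}`;
}

console.log(formatUsd(monthlyCharge(PRO_PLAN, 42000))); // $49.00
console.log(formatUsd(monthlyCharge(PRO_PLAN, 62500))); // $74.00
```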

Metering is the infrastructure that counts billable events. A meter sits between the API gateway and the billing system. It must be durable (no lost events), idempotent (no double-counts), and near-real-time (customers see current usage). Common implementations use a message queue (Kafka, SQS) feeding an aggregation service that reports to Stripe.
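
The idempotency requirement can be sketched with an in-memory meter that ignores events it has already seen. The `UsageMeter` class and the `eventId` field are assumptions made for illustration; a real meter would keep this state in durable storage rather than process memory.

```javascript
// Hypothetical in-memory meter that deduplicates by event ID.
class UsageMeter {
  constructor() {
    this.seenEventIds = new Set(); // event IDs already counted
    this.totals = new Map();       // customerId -> billable units
  }

  // Returns true if the event was counted, false if it was a duplicate
  record({ eventId, customerId, units = 1 }) {
    if (this.seenEventIds.has(eventId)) return false;
    this.seenEventIds.add(eventId);
    this.totals.set(customerId, (this.totals.get(customerId) || 0) + units);
    return true;
  }

  usageFor(customerId) {
    return this.totals.get(customerId) || 0;
  }
}

const meter = new UsageMeter();
meter.record({ eventId: 'req-1', customerId: 'cus_a' });
meter.record({ eventId: 'req-2', customerId: 'cus_a' });
meter.record({ eventId: 'req-1', customerId: 'cus_a' }); // redelivered, ignored
console.log(meter.usageFor('cus_a')); // 2
```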

Developer tiers are named bundles of quotas, rate limits, and feature flags. A typical structure is Free (heavily rate-limited, basic endpoints), Pro (higher limits, all endpoints, email support), and Enterprise (custom limits, SLA, dedicated support). Each tier maps to a Stripe Price with optional metered components.

Rate limiting enforces tier boundaries at the API gateway level. The standard approach is token bucket or sliding window per API key, returning `429 Too Many Requests` with `Retry-After`, `X-RateLimit-Limit`, `X-RateLimit-Remaining`, and `X-RateLimit-Reset` headers.
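
For the token bucket variant, a minimal single-process sketch might look like the following. The `TokenBucket` class and its return shape are hypothetical, and a production limiter would keep bucket state in a shared store so that all gateway instances see the same counts. The clock is passed in so the behaviour is easy to check.

```javascript
// Hypothetical token bucket: holds up to `capacity` tokens, refilled continuously.
class TokenBucket {
  constructor(capacity, refillPerSecond, now = Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity;
    this.lastRefill = now;
  }

  // Tries to spend one token. `now` is in milliseconds and injectable for tests.
  tryConsume(now = Date.now()) {
    const elapsedSeconds = Math.max(0, (now - this.lastRefill) / 1000);
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.lastRefill = now;

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return { allowed: true, remaining: Math.floor(this.tokens), retryAfterSeconds: 0 };
    }
    // Time until one full token is available again
    const retryAfterSeconds = Math.ceil((1 - this.tokens) / this.refillPerSecond);
    return { allowed: false, remaining: 0, retryAfterSeconds };
  }
}

const bucket = new TokenBucket(2, 1, 0); // burst of 2, refills 1 token per second
console.log(bucket.tryConsume(0).allowed);    // true
console.log(bucket.tryConsume(0).allowed);    // true
console.log(bucket.tryConsume(0));            // denied, retryAfterSeconds: 1
console.log(bucket.tryConsume(1000).allowed); // true
```

The `remaining` and `retryAfterSeconds` values map naturally onto the `X-RateLimit-Remaining` and `Retry-After` headers.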

Stripe metering connects API usage to invoices. The flow is: create a metered Price on a Product, subscribe customers, then report usage via `stripe.subscriptionItems.createUsageRecord()`. Stripe aggregates usage and generates invoices at the billing cycle end.

Common tasks

Design a tier structure

Define tiers based on target customer segments. Each tier needs: a name, monthly base price, included API calls, rate limit (requests/minute), overage rate, and available endpoints.

```yaml
tiers:
  free:
    price: 0
    included_calls: 1000/month
    rate_limit: 10/min
    endpoints: [/v1/basic/*]
    support: community
  pro:
    price: 49/month
    included_calls: 50000/month
    rate_limit: 100/min
    endpoints: [/v1/*]
    support: email
    overage: $0.002/call
  enterprise:
    price: custom
    included_calls: custom
    rate_limit: custom
    endpoints: [/v1/*, /v1/admin/*]
    support: dedicated
    sla: 99.9%
```
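
One way the `endpoints` patterns above could be enforced is a simple prefix check per tier. This is a sketch under assumptions: `TIER_ENDPOINTS`, `matchesPattern`, and `canAccess` are hypothetical names, and only a trailing `/*` wildcard is supported.

```javascript
// Hypothetical copy of the endpoint patterns from the tier definition
const TIER_ENDPOINTS = {
  free: ['/v1/basic/*'],
  pro: ['/v1/*'],
};

// Supports exact paths and a trailing "/*" wildcard only
function matchesPattern(pattern, path) {
  if (pattern.endsWith('/*')) {
    return path.startsWith(pattern.slice(0, -1)); // keep the trailing slash
  }
  return pattern === path;
}

function canAccess(tier, path) {
  const patterns = TIER_ENDPOINTS[tier] || []; // unknown tiers get no access
  return patterns.some((pattern) => matchesPattern(pattern, path));
}

console.log(canAccess('free', '/v1/basic/ping')); // true
console.log(canAccess('free', '/v1/reports'));    // false
console.log(canAccess('pro', '/v1/reports'));     // true
```

With plain prefix matching, `/v1/*` also covers paths under `/v1/admin/`, so a configuration that reserves admin endpoints for one tier would need an explicit exclusion for the others.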

Set up Stripe metered billing

Create a Product and a metered Price, then subscribe a customer.

```javascript
const stripe = require('stripe')(process.env.STRIPE_SECRET_KEY);

// Wrapped in an async function because CommonJS modules do not allow top-level await
async function setUpProTier(customerId) {
  // 1. Create product
  const product = await stripe.products.create({
    name: 'API Access - Pro Tier',
  });

  // 2. Create metered price (per-unit usage).
  // Note: a per-unit metered price bills every reported call. To bill only calls
  // beyond the included quota, report only the overage or use graduated tiers.
  const meteredPrice = await stripe.prices.create({
    product: product.id,
    currency: 'usd',
    recurring: {
      interval: 'month',
      usage_type: 'metered',
      aggregate_usage: 'sum',
    },
    // unit_amount must be an integer number of cents; sub-cent prices go in
    // unit_amount_decimal as a string. 0.2 cents = $0.002 per call.
    unit_amount_decimal: '0.2',
    billing_scheme: 'per_unit',
  });

  // 3. Create base price for the tier
  const basePrice = await stripe.prices.create({
    product: product.id,
    currency: 'usd',
    recurring: { interval: 'month' },
    unit_amount: 4900, // $49.00
  });

  // 4. Subscribe the customer to both prices
  return stripe.subscriptions.create({
    customer: customerId, // e.g. 'cus_xxx'
    items: [
      { price: basePrice.id },
      { price: meteredPrice.id },
    ],
  });
}
```

Report usage to Stripe

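Report API call counts to Stripe periodically (hourly or daily). Use `action: 'increment'` and pass an idempotency key with each call so that a retried request is not counted twice.

```javascript
// Find the metered subscription item
const subscription = await stripe.subscriptions.retrieve('sub_xxx');
const meteredItem = subscription.items.data.find(
  (item) => item.price.recurring?.usage_type === 'metered'
);

// Start of the reporting period in Unix seconds (placeholder; track this yourself)
const periodStart = 1700000000;

// Report usage - increment by the count since last report
await stripe.subscriptionItems.createUsageRecord(
  meteredItem.id,
  {
    quantity: 1250, // API calls in this reporting period
    timestamp: Math.floor(Date.now() / 1000),
    action: 'increment',
  },
  // Reusing the same key on a retry stops Stripe from applying the increment twice
  { idempotencyKey: `usage-${meteredItem.id}-${periodStart}` }
);
```

Use `action: 'increment'` rather than `action: 'set'` when reporting per-period counts. With `'set'`, each report overwrites the quantity already recorded for that timestamp instead of adding to it, so earlier usage can be lost. `'increment'` is not idempotent by itself; the idempotency key is what protects against double-counting on retries.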

Implement rate limiting middleware

Express middleware that enforces per-tier rate limits using a sliding window with Redis.

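```javascript
const Redis = require('ioredis');
const redis = new Redis(process.env.REDIS_URL);

// Requests allowed per window for each tier; window is in seconds
const TIER_LIMITS = {
  free: { rpm: 10, window: 60 },
  pro: { rpm: 100, window: 60 },
  enterprise: { rpm: 1000, window: 60 },
};

async function rateLimiter(req, res, next) {
  try {
    const apiKey = req.headers['x-api-key'];
    if (!apiKey) {
      return res.status(401).json({ error: 'Missing API key' });
    }
    const tier = await getTierForApiKey(apiKey); // your lookup
    const limit = TIER_LIMITS[tier];
    if (!limit) {
      return res.status(403).json({ error: 'Unknown tier' });
    }
    const key = `ratelimit:${apiKey}`;
    const now = Date.now();

    // Sliding window using a sorted set: drop entries older than the window.
    // The check below and the later zadd are not atomic; use MULTI or a Lua
    // script if concurrent requests must never exceed the limit.
    await redis.zremrangebyscore(key, 0, now - limit.window * 1000);
    const count = await redis.zcard(key);

    res.set('X-RateLimit-Limit', String(limit.rpm));
    // Upper bound: the window is fully clear at most one window from now
    res.set('X-RateLimit-Reset', String(Math.ceil(now / 1000) + limit.window));

    if (count >= limit.rpm) {
      res.set('Retry-After', String(limit.window));
      res.set('X-RateLimit-Remaining', '0');
      return res.status(429).json({ error: 'Rate limit exceeded' });
    }

    // Record this request; the random suffix keeps members unique
    await redis.zadd(key, now, `${now}-${Math.random()}`);
    await redis.expire(key, limit.window);

    res.set('X-RateLimit-Remaining', String(limit.rpm - count - 1));
    next();
  } catch (err) {
    next(err); // hand Redis or lookup failures to the Express error handler
  }
}
```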

Track usage for billing

Middleware that counts API calls per customer and flushes to Stripe on a schedule.

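```javascript
// Assumes `stripe` is the client initialized in the metered billing setup
const usageBuffer = new Map(); // apiKey -> count

function usageTracker(req, res, next) {
  const apiKey = req.headers['x-api-key'];
  usageBuffer.set(apiKey, (usageBuffer.get(apiKey) || 0) + 1);
  next();
}

// Flush every hour
setInterval(async () => {
  // Snapshot and clear first so calls counted during the awaits below are kept
  const pending = new Map(usageBuffer);
  usageBuffer.clear();
  const timestamp = Math.floor(Date.now() / 1000);

  for (const [apiKey, count] of pending.entries()) {
    try {
      const subItemId = await getMeteredSubItemForKey(apiKey);
      if (subItemId && count > 0) {
        await stripe.subscriptionItems.createUsageRecord(
          subItemId,
          { quantity: count, timestamp, action: 'increment' },
          { idempotencyKey: `usage-${subItemId}-${timestamp}` }
        );
      }
    } catch (err) {
      // Put the count back so it is retried on the next flush. If Stripe recorded
      // the request before the error surfaced, this can double-count.
      usageBuffer.set(apiKey, (usageBuffer.get(apiKey) || 0) + count);
    }
  }
}, 60 * 60 * 1000);
```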

In production, use a durable queue (SQS, Kafka) instead of an in-memory buffer to avoid losing usage data on process restarts.

Handle overage notifications

Notify customers when they approach or exceed their included quota.

```javascript
async function checkUsageThresholds(customerId, currentUsage, includedCalls) {
  const percentage = (currentUsage / includedCalls) * 100;
  const thresholds = [80, 100, 120];

  for (const threshold of thresholds) {
    if (percentage >= threshold) {
      const alreadyNotified = await hasNotifiedThreshold(customerId, threshold);
      if (!alreadyNotified) {
        await sendUsageAlert(customerId, {
          currentUsage,
          includedCalls,
          percentage,
          threshold,
          message: threshold >= 100
            ? `You have exceeded your included ${includedCalls} API calls. Overage billing is active.`
            : `You have used ${percentage.toFixed(0)}% of your included API calls.`,
        });
        await markThresholdNotified(customerId, threshold);
      }
    }
  }
}
```

Anti-patterns / common mistakes

| Mistake | Why it's wrong | What to do instead |
|---|---|---|
| Billing on gateway logs alone | Gateway logs can be incomplete or delayed; disputes become unresolvable | Use a dedicated metering service with durable event ingestion |
| Hard-cutting access at quota | Breaks customer production systems, causes churn | Throttle or enable overage billing with clear notifications |
| Using `action: 'set'` to report per-period counts in Stripe usage records | Each report overwrites the quantity already recorded for that timestamp, causing under-billing | Use `action: 'increment'` with an idempotency key so retries are not double-counted |
| Same rate limit for all endpoints | Expensive endpoints (ML inference) subsidized by cheap ones (health check) | Weight rate limits by endpoint cost or use separate quotas |
| No rate limit headers in 429 responses | Clients cannot implement proper backoff | Always return `Retry-After`, `X-RateLimit-Limit`, `X-RateLimit-Remaining` |
| Reporting usage in real-time per request | Creates enormous Stripe API load, risks rate limiting from Stripe itself | Batch usage reports hourly or daily |
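
Weighting rate limits by endpoint cost could be sketched as a shared budget that each request draws from in proportion to how expensive it is. The paths, weights, and the `chargeRequest` helper below are illustrative assumptions only.

```javascript
// Hypothetical per-endpoint weights: units deducted from the budget per request
const ENDPOINT_COST = {
  '/v1/health': 0,
  '/v1/search': 1,
  '/v1/inference': 10,
};

// Deducts the endpoint's cost from the budget if enough remains
function chargeRequest(budget, path) {
  const cost = ENDPOINT_COST[path] ?? 1; // unlisted endpoints cost 1 unit
  if (cost > budget.remaining) {
    return { allowed: false, remaining: budget.remaining };
  }
  budget.remaining -= cost;
  return { allowed: true, remaining: budget.remaining };
}

const budget = { remaining: 15 };
console.log(chargeRequest(budget, '/v1/inference')); // allowed, 5 left
console.log(chargeRequest(budget, '/v1/inference')); // denied, 5 left
console.log(chargeRequest(budget, '/v1/search'));    // allowed, 4 left
```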

References

For detailed content on specific sub-domains, read the relevant file from the `references/` folder:

- `references/stripe-metering.md` - Deep dive into Stripe metered billing setup, tiered pricing, and invoice lifecycle
- `references/rate-limiting-patterns.md` - Advanced rate limiting algorithms, Redis implementations, and distributed rate limiting

Only load a references file if the current task requires it - they are long and will consume context.
