analytics-engine

Analytics Engine


Write high-cardinality event data at scale and query it with SQL. Perfect for user events, billing metrics, per-tenant analytics, and custom telemetry.

FIRST: Create Dataset


bash
wrangler analytics-engine create my-dataset
Add binding in wrangler.jsonc:
jsonc
{
  "analytics_engine_datasets": [
    {
      "binding": "USER_EVENTS",
      "dataset": "my-dataset"
    }
  ]
}

When to Use


| Use Case | Why Analytics Engine |
|---|---|
| User behavior tracking | High-cardinality data (userId, sessionId, etc.) |
| Billing/usage metrics | Per-tenant aggregation with doubles |
| Custom telemetry | Non-blocking writes, queryable with SQL |
| A/B test metrics | Index by experiment ID, query results |
| API usage tracking | Count requests per customer/endpoint |

Quick Reference


| Operation | API | Notes |
|---|---|---|
| Write event | env.DATASET.writeDataPoint({ ... }) | Non-blocking, do NOT await |
| Metrics | doubles: [value1, value2] | Up to 20 numeric values |
| Labels | blobs: [label1, label2] | Up to 20 text values |
| Grouping | indexes: [userId] | 1 index per datapoint (max 96 bytes) |
| Query data | SQL API via REST | GraphQL also available |

Data Model


Analytics Engine stores datapoints with three types of fields:
| Field Type | Purpose | Limit | Example |
|---|---|---|---|
| doubles | Numeric metrics (counters, gauges, latency) | 20 per datapoint | [response_time, bytes_sent] |
| blobs | Text labels (URLs, names, IDs) | 20 per datapoint | [path, event_name] |
| indexes | Grouping key (userId, tenantId, etc.) | 1 per datapoint | [userId] |
Key concept: The index is the primary key that represents your app, customer, merchant, or tenant. Use it to group and filter data efficiently in SQL queries. For multiple dimensions, use blobs or create a composite index.
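When two dimensions both matter for grouping, the single-index limit can be worked around with a composite key. A minimal sketch, assuming a ":" delimiter convention; the compositeIndex helper is illustrative and not part of the Analytics Engine API:

```typescript
// Hypothetical helper: joins dimensions into one index value.
// Analytics Engine only sees the final string; the ":" delimiter is a convention.
function compositeIndex(...parts: string[]): string {
  const key = parts.join(":");
  // Indexes are capped at 96 bytes; fail loudly instead of writing a truncated key.
  if (new TextEncoder().encode(key).length > 96) {
    throw new Error(`index exceeds 96 bytes: ${key}`);
  }
  return key;
}

// Usage (sketch): group by tenant AND region with one index.
// env.EVENTS.writeDataPoint({
//   blobs: [path],
//   indexes: [compositeIndex(tenantId, region)],
// });
```

In SQL, such a key can later be split back apart or matched with a prefix filter on index1.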

Write Events Example


typescript
interface Env {
  USER_EVENTS: AnalyticsEngineDataset;
}

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const url = new URL(req.url);
    const path = url.pathname;
    // searchParams.get() returns string | null, so default missing IDs
    const userId = url.searchParams.get("userId") ?? "anonymous";

    // Write a datapoint for this visit, associating the data with
    // the userId as our Analytics Engine 'index'
    env.USER_EVENTS.writeDataPoint({
      // Write metrics data: counters, gauges or latency statistics
      doubles: [],
      // Write text labels - URLs, app names, event_names, etc
      blobs: [path],
      // Provide an index that groups your data correctly.
      indexes: [userId],
    });

    return Response.json({
      hello: "world",
    });
  },
};

API Usage Tracking Example


typescript
interface Env {
  API_METRICS: AnalyticsEngineDataset;
}

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const start = Date.now();
    const url = new URL(req.url);
    const apiKey = req.headers.get("x-api-key") || "anonymous";
    const endpoint = url.pathname;

    try {
      // Handle API request...
      const response = await handleApiRequest(req);
      const duration = Date.now() - start;

      // Track successful request
      env.API_METRICS.writeDataPoint({
        doubles: [duration, Number(response.headers.get("content-length") ?? 0)],
        blobs: [endpoint, "success", response.status.toString()],
        indexes: [apiKey],
      });

      return response;
    } catch (error) {
      const duration = Date.now() - start;

      // Track failed request
      env.API_METRICS.writeDataPoint({
        doubles: [duration, 0],
        blobs: [endpoint, "error", error instanceof Error ? error.message : String(error)],
        indexes: [apiKey],
      });

      return new Response("Error", { status: 500 });
    }
  },
};

Non-Blocking Writes


IMPORTANT: Do NOT await calls to writeDataPoint(). It is non-blocking and returns immediately.
typescript
// ❌ WRONG - Do not await
await env.USER_EVENTS.writeDataPoint({ ... });

// ✅ CORRECT - Fire and forget
env.USER_EVENTS.writeDataPoint({ ... });
This allows your Worker to respond quickly without waiting for the write to complete.

Querying with SQL API


Analytics Engine data is accessible via REST API with SQL queries:
Endpoint:
https://api.cloudflare.com/client/v4/accounts/{account_id}/analytics_engine/sql
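The request to that endpoint can be assembled in a Worker before handing it to fetch. A sketch of a request builder; the accountId and apiToken parameters are assumed to come from your own secrets or vars:

```typescript
// Build a request against the Analytics Engine SQL endpoint.
// The SQL text goes in the raw POST body; auth is a Bearer token.
function buildSqlRequest(accountId: string, apiToken: string, sql: string) {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/analytics_engine/sql`,
    init: {
      method: "POST" as const,
      headers: { Authorization: `Bearer ${apiToken}` },
      body: sql,
    },
  };
}

// Usage (sketch):
// const { url, init } = buildSqlRequest(env.ACCOUNT_ID, env.API_TOKEN, "SHOW TABLES");
// const res = await fetch(url, init);
```

Keeping the builder pure makes it easy to unit-test the URL and headers without hitting the network.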

Example: Query Recent Events


sql
SELECT
  timestamp,
  blob1 AS path,
  index1 AS userId
FROM USER_EVENTS
WHERE timestamp > NOW() - INTERVAL '1' DAY
ORDER BY timestamp DESC
LIMIT 100

Example: Aggregate Metrics


sql
SELECT
  index1 AS apiKey,
  COUNT(*) AS request_count,
  AVG(double1) AS avg_duration_ms,
  SUM(double2) AS total_bytes
FROM API_METRICS
WHERE timestamp > NOW() - INTERVAL '7' DAY
GROUP BY apiKey
ORDER BY request_count DESC

Example: List Datasets


bash
curl "https://api.cloudflare.com/client/v4/accounts/{account_id}/analytics_engine/sql" \
  --header "Authorization: Bearer <API_TOKEN>" \
  --data "SHOW TABLES"

Field Naming in SQL


Fields are automatically numbered based on write order:
  • double1, double2, ... double20
  • blob1, blob2, ... blob20
  • index1 (only 1 index per datapoint, so only index1 is populated)
Use AS aliases to make queries readable:
sql
SELECT
  double1 AS response_time,
  blob1 AS endpoint,
  index1 AS user_id
FROM my_dataset

wrangler.jsonc Configuration


jsonc
{
  "name": "analytics-engine-example",
  "main": "src/index.ts",
  "compatibility_date": "2025-02-11",
  "analytics_engine_datasets": [
    {
      "binding": "USER_EVENTS",
      "dataset": "user-events"
    },
    {
      "binding": "API_METRICS",
      "dataset": "api-metrics"
    }
  ]
}

TypeScript Types


typescript
interface Env {
  // Analytics Engine dataset binding
  USER_EVENTS: AnalyticsEngineDataset;
}

// Datapoint structure
interface AnalyticsEngineDataPoint {
  doubles?: number[];  // Up to 20 numeric values
  blobs?: string[];    // Up to 20 text values
  indexes?: string[];  // Only 1 index per datapoint (max 96 bytes)
}

Detailed References


  • references/writing.md - Writing datapoints, field types, patterns
  • references/querying.md - SQL API, GraphQL, aggregations, time series
  • references/limits.md - Comprehensive limits, quotas, free tier, sampling behavior
  • references/testing.md - Mocking strategies (no local simulation available)

Best Practices


  1. Design indexes first: Choose grouping keys (userId, tenantId) that match your query patterns
  2. Never await writes: writeDataPoint() is non-blocking for maximum performance
  3. Use doubles for metrics: Numeric data enables aggregations (AVG, SUM, COUNT)
  4. Use blobs for dimensions: Text labels for filtering and grouping
  5. Consistent field order: Keep doubles/blobs/indexes in same order across all writes for consistent SQL queries
  6. Handle missing data: Use default values or filter NULL in SQL queries
  7. Monitor cardinality: Too many unique indexes can impact query performance
  8. Use intervals wisely: Query with time ranges to limit data scanned
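Practice 5 (consistent field order) is easiest to keep when the mapping from named fields to positions lives in one place. A minimal sketch; the ApiEvent shape and field names are illustrative, not part of any API:

```typescript
// Fix the field order once so double1/blob1 always mean the same thing in SQL.
interface ApiEvent {
  durationMs: number;
  bytesSent: number;
  endpoint: string;
  outcome: "success" | "error";
}

// Hypothetical helper: every write site goes through this, so positions never drift.
function toDataPoint(e: ApiEvent, apiKey: string) {
  return {
    doubles: [e.durationMs, e.bytesSent], // double1, double2
    blobs: [e.endpoint, e.outcome],       // blob1, blob2
    indexes: [apiKey],                    // index1
  };
}

// Usage (sketch):
// env.API_METRICS.writeDataPoint(toDataPoint(event, apiKey));
```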

Common Patterns


Pattern 1: User Session Tracking


typescript
env.SESSIONS.writeDataPoint({
  doubles: [sessionDuration, pageViews, eventsCount],
  blobs: [browser, country, deviceType, sessionId],
  indexes: [userId], // only 1 index per datapoint; sessionId moved to blobs
});

Pattern 2: Error Tracking


typescript
env.ERRORS.writeDataPoint({
  doubles: [1], // Error count
  blobs: [errorType, errorMessage.slice(0, 256), endpoint, appVersion],
  indexes: [userId], // only 1 index per datapoint; appVersion moved to blobs
});

Pattern 3: Revenue Events


typescript
env.REVENUE.writeDataPoint({
  doubles: [amountCents, taxCents, discountCents],
  blobs: [productId, currency, paymentMethod, customerId],
  indexes: [merchantId], // only 1 index per datapoint; customerId moved to blobs
});

Limits and Considerations


  • Write rate: Up to 250 data points per Worker invocation
  • Field limits: 20 doubles, 20 blobs, 1 index per datapoint
  • Blob size: Total blobs limited to 16 KB per datapoint (increased from 5 KB in June 2025)
  • Index size: 96 bytes maximum
  • Free tier: 100,000 writes/day, 10,000 queries/day (not yet enforced)
  • Query performance: ~100ms average, ~300ms p99
  • Retention: Data retained for 3 months
  • Eventual consistency: Small delay between write and query visibility
See references/limits.md for complete details.
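One way to stay inside the per-datapoint limits above is to clamp writes before they happen. A defensive sketch; the truncation strategy (and the ASCII assumption for the 96-byte cut) is a choice made here, not library behavior:

```typescript
// Clamp a datapoint to the documented limits before writing:
// 20 doubles, 20 blobs, 1 index of at most 96 bytes.
function clampDataPoint(p: { doubles?: number[]; blobs?: string[]; indexes?: string[] }) {
  return {
    doubles: (p.doubles ?? []).slice(0, 20),
    blobs: (p.blobs ?? []).slice(0, 20),
    // Keep only the first index and cut it at 96 characters
    // (equals 96 bytes only for ASCII keys -- an assumption here).
    indexes: (p.indexes ?? []).slice(0, 1).map((i) => i.slice(0, 96)),
  };
}

// Usage (sketch):
// env.EVENTS.writeDataPoint(clampDataPoint(candidate));
```

Silently dropping excess fields trades data loss for write reliability; logging when a clamp actually fires may be preferable in production.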

Migration from Other Solutions


From Custom D1 Tables


typescript
// Before: D1
await env.DB.prepare("INSERT INTO events (userId, event) VALUES (?, ?)")
  .bind(userId, event)
  .run();

// After: Analytics Engine
env.EVENTS.writeDataPoint({
  blobs: [event],
  indexes: [userId],
}); // Non-blocking, no await

From Third-Party Analytics


Analytics Engine provides:
  • ✅ No data sampling
  • ✅ Full SQL access to raw data
  • ✅ No per-event cost
  • ✅ Integrated with Workers (no external HTTP calls)
  • ✅ High-cardinality data support