cloudflare-workers-ai
Cloudflare Workers AI - Complete Reference
Production-ready knowledge domain for building AI-powered applications with Cloudflare Workers AI.
Status: Production Ready ✅
Last Updated: 2025-10-21
Dependencies: cloudflare-worker-base (for Worker setup)
Latest Versions: wrangler@4.43.0, @cloudflare/workers-types@4.20251014.0
Table of Contents

Quick Start (5 minutes)

1. Add AI Binding
wrangler.jsonc:

```jsonc
{
  "ai": {
    "binding": "AI"
  }
}
```

2. Run Your First Model
```typescript
export interface Env {
  AI: Ai;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const response = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', {
      prompt: 'What is Cloudflare?',
    });
    return Response.json(response);
  },
};
```

3. Add Streaming (Recommended)
```typescript
const stream = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', {
  messages: [{ role: 'user', content: 'Tell me a story' }],
  stream: true, // Always use streaming for text generation!
});

return new Response(stream, {
  headers: { 'content-type': 'text/event-stream' },
});
```

Why streaming?
- Prevents buffering large responses in memory
- Faster time-to-first-token
- Better user experience for long-form content
- Avoids Worker timeout issues
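On the client side, the streamed body arrives as server-sent events. As a sketch of how the generated text can be reassembled (assuming each event is a `data:` line carrying JSON with a `response` field and the stream ends with `data: [DONE]` — verify against the actual payload for your model):

```typescript
// Hypothetical helper: pull generated text fragments out of a chunk of the
// SSE stream. A production parser would also buffer lines split across
// chunk boundaries; this sketch ignores incomplete JSON.
function extractTokens(chunk: string): string[] {
  const tokens: string[] = [];
  for (const line of chunk.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data: ')) continue; // skip blank/comment lines
    const payload = trimmed.slice('data: '.length);
    if (payload === '[DONE]') break; // end-of-stream sentinel
    try {
      const parsed = JSON.parse(payload) as { response?: string };
      if (typeof parsed.response === 'string') tokens.push(parsed.response);
    } catch {
      // Partial JSON across a chunk boundary - ignore in this sketch.
    }
  }
  return tokens;
}

// Example: two events followed by the terminator.
const sample = 'data: {"response":"Hello"}\ndata: {"response":" world"}\ndata: [DONE]\n';
console.log(extractTokens(sample).join('')); // "Hello world"
```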
Workers AI API Reference
env.AI.run()

Run an AI model inference.
Signature:

```typescript
async env.AI.run(
  model: string,
  inputs: ModelInputs,
  options?: { gateway?: { id: string; skipCache?: boolean } }
): Promise<ModelOutput | ReadableStream>
```

Parameters:
- model (string, required) - Model ID (e.g., @cf/meta/llama-3.1-8b-instruct)
- inputs (object, required) - Model-specific inputs
- options (object, optional) - Additional options
  - gateway (object) - AI Gateway configuration
    - id (string) - Gateway ID
    - skipCache (boolean) - Skip AI Gateway cache

Returns:
- Non-streaming: Promise<ModelOutput> - JSON response
- Streaming: ReadableStream - Server-sent events stream
Text Generation Models
Input Format:

```typescript
{
  messages?: Array<{ role: 'system' | 'user' | 'assistant'; content: string }>;
  prompt?: string; // Deprecated, use messages
  stream?: boolean; // Default: false
  max_tokens?: number; // Max tokens to generate
  temperature?: number; // 0.0-1.0, default varies by model
  top_p?: number; // 0.0-1.0
  top_k?: number;
}
```

Output Format (Non-Streaming):

```typescript
{
  response: string; // Generated text
}
```

Example:

```typescript
const response = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', {
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is TypeScript?' },
  ],
  stream: false,
});

console.log(response.response);
```

Text Embeddings Models
Input Format:

```typescript
{
  text: string | string[]; // Single text or array of texts
}
```

Output Format:

```typescript
{
  shape: number[]; // [batch_size, embedding_dimensions]
  data: number[][]; // Array of embedding vectors
}
```

Example:

```typescript
const embeddings = await env.AI.run('@cf/baai/bge-base-en-v1.5', {
  text: ['Hello world', 'Cloudflare Workers'],
});

console.log(embeddings.shape); // [2, 768]
console.log(embeddings.data[0]); // [0.123, -0.456, ...]
```

Image Generation Models
Input Format:

```typescript
{
  prompt: string; // Text description
  num_steps?: number; // Default: 20
  guidance?: number; // CFG scale, default: 7.5
  strength?: number; // For img2img, default: 1.0
  image?: number[][]; // For img2img (base64 or array)
}
```

Output Format:
- Binary image data (PNG/JPEG)

Example:

```typescript
const imageStream = await env.AI.run('@cf/black-forest-labs/flux-1-schnell', {
  prompt: 'A beautiful sunset over mountains',
});

return new Response(imageStream, {
  headers: { 'content-type': 'image/png' },
});
```

Vision Models
Input Format:

```typescript
{
  messages: Array<{
    role: 'user' | 'assistant';
    content: Array<{ type: 'text' | 'image_url'; text?: string; image_url?: { url: string } }>;
  }>;
}
```

Example:

```typescript
const response = await env.AI.run('@cf/meta/llama-3.2-11b-vision-instruct', {
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'What is in this image?' },
        { type: 'image_url', image_url: { url: 'data:image/png;base64,iVBOR...' } },
      ],
    },
  ],
});
```

Model Selection Guide
Text Generation (LLMs)
| Model | Best For | Rate Limit | Size |
|---|---|---|---|
| | General purpose, fast | 300/min | 8B |
| | Ultra-fast, simple tasks | 300/min | 1B |
| | High quality, complex reasoning | 150/min | 14B |
| | Coding, technical content | 300/min | 32B |
| | Fast, efficient | 400/min | 7B |
Text Embeddings
| Model | Dimensions | Best For | Rate Limit |
|---|---|---|---|
| | 768 | General purpose RAG | 3000/min |
| | 1024 | High accuracy search | 1500/min |
| | 384 | Fast, low storage | 3000/min |
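Embedding vectors from these models are typically compared with cosine similarity, e.g. when ranking candidate texts against a query outside of Vectorize. A minimal, self-contained sketch:

```typescript
// Cosine similarity between two embedding vectors, e.g. rows of the
// `data` array returned by an embeddings model such as bge-base-en-v1.5.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 0], [1, 0])); // 1 (identical direction)
console.log(cosineSimilarity([1, 0], [0, 1])); // 0 (orthogonal)
```

Values close to 1 mean the texts are semantically similar; Vectorize performs the same comparison internally when an index is configured with the cosine metric.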
Image Generation
| Model | Best For | Rate Limit | Speed |
|---|---|---|---|
| | High quality, photorealistic | 720/min | Fast |
| | General purpose | 720/min | Medium |
| | Artistic, stylized | 720/min | Fast |
Vision Models
| Model | Best For | Rate Limit |
|---|---|---|
| | Image understanding | 720/min |
| | Fast image captioning | 720/min |
Common Patterns
Pattern 1: Chat Completion with History
```typescript
app.post('/chat', async (c) => {
  const { messages } = await c.req.json<{
    messages: Array<{ role: string; content: string }>;
  }>();

  const response = await c.env.AI.run('@cf/meta/llama-3.1-8b-instruct', {
    messages,
    stream: true,
  });

  return new Response(response, {
    headers: { 'content-type': 'text/event-stream' },
  });
});
```

Pattern 2: RAG (Retrieval Augmented Generation)
```typescript
// Step 1: Generate an embedding for the user query
const embeddings = await env.AI.run('@cf/baai/bge-base-en-v1.5', {
  text: [userQuery],
});
const vector = embeddings.data[0];

// Step 2: Search Vectorize
const matches = await env.VECTORIZE.query(vector, { topK: 3 });

// Step 3: Build context from matches
const context = matches.matches.map((m) => m.metadata.text).join('\n\n');

// Step 4: Generate response with context
const response = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', {
  messages: [
    {
      role: 'system',
      content: `Answer using this context:\n${context}`,
    },
    { role: 'user', content: userQuery },
  ],
  stream: true,
});

return new Response(response, {
  headers: { 'content-type': 'text/event-stream' },
});
```

Pattern 3: Structured Output with Zod
```typescript
import { z } from 'zod';

const RecipeSchema = z.object({
  name: z.string(),
  ingredients: z.array(z.string()),
  instructions: z.array(z.string()),
  prepTime: z.number(),
});

app.post('/recipe', async (c) => {
  const { dish } = await c.req.json<{ dish: string }>();

  // Note: describe the schema in plain text - JSON.stringify on Zod's
  // internal shape does not produce a usable description.
  const response = await c.env.AI.run('@cf/meta/llama-3.1-8b-instruct', {
    messages: [
      {
        role: 'user',
        content: `Generate a recipe for ${dish}. Return ONLY valid JSON with these fields: name (string), ingredients (string[]), instructions (string[]), prepTime (number of minutes).`,
      },
    ],
  });

  // Parse and validate
  const recipe = RecipeSchema.parse(JSON.parse(response.response));
  return c.json(recipe);
});
```

Pattern 4: Image Generation + R2 Storage
```typescript
app.post('/generate-image', async (c) => {
  const { prompt } = await c.req.json<{ prompt: string }>();

  // Generate image
  const imageStream = await c.env.AI.run('@cf/black-forest-labs/flux-1-schnell', {
    prompt,
  });
  const imageBytes = await new Response(imageStream).bytes();

  // Store in R2
  const key = `images/${Date.now()}.png`;
  await c.env.BUCKET.put(key, imageBytes, {
    httpMetadata: { contentType: 'image/png' },
  });

  return c.json({
    success: true,
    url: `https://your-domain.com/${key}`,
  });
});
```

AI Gateway Integration
AI Gateway provides caching, logging, and analytics for AI requests.

Setup:

```typescript
const response = await env.AI.run(
  '@cf/meta/llama-3.1-8b-instruct',
  { prompt: 'Hello' },
  {
    gateway: {
      id: 'my-gateway', // Your gateway ID
      skipCache: false, // Use cache
    },
  }
);
```

Benefits:
- ✅ Cost Tracking - Monitor neuron usage per request
- ✅ Caching - Reduce duplicate inference costs
- ✅ Logging - Debug and analyze AI requests
- ✅ Rate Limiting - Additional layer of protection
- ✅ Analytics - Request patterns and performance

Access Gateway Logs:

```typescript
const gateway = env.AI.gateway('my-gateway');
const logId = env.AI.aiGatewayLogId;

// Send feedback
await gateway.patchLog(logId, {
  feedback: { rating: 1, comment: 'Great response' },
});
```

Rate Limits & Pricing
Rate Limits (per minute)
| Task Type | Default Limit | Notes |
|---|---|---|
| Text Generation | 300/min | Some fast models: 400-1500/min |
| Text Embeddings | 3000/min | BGE-large: 1500/min |
| Image Generation | 720/min | All image models |
| Vision Models | 720/min | Image understanding |
| Translation | 720/min | M2M100, Opus MT |
| Classification | 2000/min | Text classification |
| Speech Recognition | 720/min | Whisper models |
Pricing (Neurons-Based)
Free Tier:
- 10,000 neurons per day
- Resets daily at 00:00 UTC
Paid Tier:
- $0.011 per 1,000 neurons
- 10,000 neurons/day included
- Unlimited usage above free allocation
Example Costs:
| Model | Input (1M tokens) | Output (1M tokens) |
|---|---|---|
| Llama 3.2 1B | $0.027 | $0.201 |
| Llama 3.1 8B | $0.088 | $0.606 |
| BGE-base embeddings | $0.005 | N/A |
| Flux image generation | ~$0.011/image | N/A |
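The free allocation and per-neuron price above can be turned into a quick daily cost estimate. A sketch using only the published numbers ($0.011 per 1,000 neurons, with the first 10,000 neurons per day included):

```typescript
// Published pricing constants (check the Cloudflare dashboard for current values).
const FREE_NEURONS_PER_DAY = 10_000;
const USD_PER_1000_NEURONS = 0.011;

// Daily cost in USD for a given neuron count: only usage above the free
// allocation is billed.
function estimateDailyCostUSD(neuronsUsed: number): number {
  const billable = Math.max(0, neuronsUsed - FREE_NEURONS_PER_DAY);
  return (billable / 1000) * USD_PER_1000_NEURONS;
}

console.log(estimateDailyCostUSD(8_000));   // 0 (within free tier)
console.log(estimateDailyCostUSD(110_000)); // ~1.10
```

Actual neuron consumption per request varies by model and input size; read it from AI Gateway logs or the dashboard rather than estimating from token counts alone.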
Production Checklist
Before Deploying
- Enable AI Gateway for cost tracking and logging
- Implement streaming for all text generation endpoints
- Add rate limit retry with exponential backoff
- Validate input length to prevent token limit errors
- Set appropriate timeouts (Workers: 30s CPU default, 5m max)
- Monitor neurons usage in Cloudflare dashboard
- Test error handling for model unavailable, rate limits
- Add input sanitization to prevent prompt injection
- Configure CORS if using from browser
- Plan for scale - upgrade to Paid plan if needed
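For the "validate input length" item: exact token counts are tokenizer-specific and not exposed before the call, so a rough pre-flight check is common. The ~4-characters-per-token ratio below is a heuristic (an assumption, not an exact tokenizer), so keep a safety margin below the model's real context window:

```typescript
// Heuristic token estimate: roughly 4 characters per token for English text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Reject a conversation that likely exceeds the input budget before
// spending neurons on a request that would fail.
function assertWithinBudget(
  messages: Array<{ role: string; content: string }>,
  maxInputTokens: number
): void {
  const total = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  if (total > maxInputTokens) {
    throw new Error(`Input too long: ~${total} tokens exceeds budget of ${maxInputTokens}`);
  }
}

// Example: a short conversation against a hypothetical 4096-token budget.
assertWithinBudget([{ role: 'user', content: 'What is Cloudflare?' }], 4096);
```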
Error Handling
```typescript
async function runAIWithRetry(
  env: Env,
  model: string,
  inputs: any,
  maxRetries = 3
): Promise<any> {
  let lastError: Error;

  for (let i = 0; i < maxRetries; i++) {
    try {
      return await env.AI.run(model, inputs);
    } catch (error) {
      lastError = error as Error;
      const message = lastError.message.toLowerCase();

      // Rate limit - retry with exponential backoff
      if (message.includes('429') || message.includes('rate limit')) {
        const delay = Math.pow(2, i) * 1000;
        await new Promise((resolve) => setTimeout(resolve, delay));
        continue;
      }

      // Other errors - throw immediately
      throw error;
    }
  }

  throw lastError!;
}
```

Monitoring
```typescript
app.use('*', async (c, next) => {
  const start = Date.now();
  await next();

  // Log AI usage
  console.log({
    path: c.req.path,
    duration: Date.now() - start,
    logId: c.env.AI.aiGatewayLogId,
  });
});
```

OpenAI Compatibility
Workers AI supports OpenAI-compatible endpoints.

Using OpenAI SDK:

```typescript
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: env.CLOUDFLARE_API_KEY,
  baseURL: `https://api.cloudflare.com/client/v4/accounts/${env.CLOUDFLARE_ACCOUNT_ID}/ai/v1`,
});

// Chat completions
const completion = await openai.chat.completions.create({
  model: '@cf/meta/llama-3.1-8b-instruct',
  messages: [{ role: 'user', content: 'Hello!' }],
});

// Embeddings
const embeddings = await openai.embeddings.create({
  model: '@cf/baai/bge-base-en-v1.5',
  input: 'Hello world',
});
```

Endpoints:
- /v1/chat/completions - Text generation
- /v1/embeddings - Text embeddings
Vercel AI SDK Integration
```bash
npm install workers-ai-provider ai
```

```typescript
import { createWorkersAI } from 'workers-ai-provider';
import { generateText, streamText } from 'ai';

const workersai = createWorkersAI({ binding: env.AI });

// Generate text
const result = await generateText({
  model: workersai('@cf/meta/llama-3.1-8b-instruct'),
  prompt: 'Write a poem',
});

// Stream text
const stream = streamText({
  model: workersai('@cf/meta/llama-3.1-8b-instruct'),
  prompt: 'Tell me a story',
});
```

Limits Summary
| Feature | Limit |
|---|---|
| Concurrent requests | No hard limit (rate limits apply) |
| Max input tokens | Varies by model (typically 2K-128K) |
| Max output tokens | Varies by model (typically 512-2048) |
| Streaming chunk size | ~1 KB |
| Image size (output) | ~5 MB |
| Request timeout | Workers timeout applies (30s default, 5m max CPU) |
| Daily free neurons | 10,000 |
| Rate limits | See "Rate Limits & Pricing" section |