Cloudflare-R2 Skill


Comprehensive assistance with cloudflare-r2 development, generated from official documentation.

When to Use This Skill


This skill should be triggered when:
  • Working with cloudflare-r2
  • Asking about cloudflare-r2 features or APIs
  • Implementing cloudflare-r2 solutions
  • Debugging cloudflare-r2 code
  • Learning cloudflare-r2 best practices

Quick Reference


Common Patterns


R2 Data Catalog - Apache Iceberg Integration


```bash
# Enable Data Catalog on a bucket (Wrangler CLI)
npx wrangler r2 bucket catalog enable my-bucket

# Or enable via Dashboard: R2 > Bucket > Settings > R2 Data Catalog > Enable
# Note the Warehouse and Catalog URI values shown after enabling
```

Python - PyIceberg Catalog Connection


```python
from pyiceberg.catalog.rest import RestCatalog

# Connection configuration
catalog = RestCatalog(
    name="my_catalog",
    warehouse="<WAREHOUSE_ID>",  # From catalog settings
    uri="<CATALOG_URI>",         # From catalog settings
    token="<API_TOKEN>",         # Admin Read & Write token
)

# Create namespace
catalog.create_namespace_if_not_exists("default")

# Create table with schema (df is an existing PyArrow table defined elsewhere)
test_table = ("default", "people")
table = catalog.create_table(test_table, schema=df.schema)

# Append data
table.append(df)

# Query data
result = table.scan().to_arrow()

# Drop table
catalog.drop_table(test_table)
```

Data Catalog - API Token Setup


1. Navigate to: R2 > Manage API tokens > Create API token
2. Select "Admin Read & Write" permission (required for catalog access)
3. Save the token for authentication

The token must grant both R2 and catalog permissions for Iceberg clients.

S3-Compatible SDK Usage (boto3)


```python
import boto3

# Configure S3 client for R2
s3_client = boto3.client(
    's3',
    endpoint_url='https://<ACCOUNT_ID>.r2.cloudflarestorage.com',
    aws_access_key_id='<ACCESS_KEY_ID>',
    aws_secret_access_key='<SECRET_ACCESS_KEY>',
    region_name='auto',
)

# Upload object
s3_client.put_object(
    Bucket='my-bucket',
    Key='path/to/file.txt',
    Body=b'File contents',
)

# Download object
response = s3_client.get_object(Bucket='my-bucket', Key='path/to/file.txt')
data = response['Body'].read()

# List objects
response = s3_client.list_objects_v2(Bucket='my-bucket', Prefix='path/')
for obj in response.get('Contents', []):
    print(obj['Key'])

# Delete object
s3_client.delete_object(Bucket='my-bucket', Key='path/to/file.txt')
```
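Note that `list_objects_v2` returns at most 1,000 keys per call. For larger buckets, page with `ContinuationToken`; a minimal sketch (the helper name is ours, and it works against any boto3-style client):

```python
def list_all_keys(client, bucket, prefix=''):
    """Collect every key under a prefix by following ContinuationToken pages."""
    keys, token = [], None
    while True:
        kwargs = {'Bucket': bucket, 'Prefix': prefix}
        if token:
            kwargs['ContinuationToken'] = token
        resp = client.list_objects_v2(**kwargs)
        keys.extend(obj['Key'] for obj in resp.get('Contents', []))
        if not resp.get('IsTruncated'):
            return keys
        token = resp['NextContinuationToken']
```

Call as `list_all_keys(s3_client, 'my-bucket', 'path/')`; boto3's built-in `get_paginator('list_objects_v2')` achieves the same.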

Presigned URLs


```python
# Generate presigned URL for upload (expires in 1 hour)
presigned_url = s3_client.generate_presigned_url(
    'put_object',
    Params={'Bucket': 'my-bucket', 'Key': 'uploads/file.txt'},
    ExpiresIn=3600,
)

# Generate presigned URL for download
download_url = s3_client.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'my-bucket', 'Key': 'path/to/file.txt'},
    ExpiresIn=3600,
)
```
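A presigned PUT URL can be used by any HTTP client without AWS credentials. A stdlib-only sketch of building the upload request (the helper name and the Content-Type are illustrative choices):

```python
import urllib.request

def build_presigned_put(presigned_url: str, data: bytes) -> urllib.request.Request:
    """Build a PUT request targeting a presigned upload URL."""
    req = urllib.request.Request(presigned_url, data=data, method='PUT')
    req.add_header('Content-Type', 'text/plain')
    return req

# To execute the upload:
# with urllib.request.urlopen(build_presigned_put(presigned_url, b'hello')) as resp:
#     print(resp.status)  # 200 on success
```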

Multipart Upload


```python
# Initiate multipart upload
multipart = s3_client.create_multipart_upload(
    Bucket='my-bucket',
    Key='large-file.bin',
)
upload_id = multipart['UploadId']

# Upload parts
parts = []
for i, chunk in enumerate(file_chunks, start=1):
    part = s3_client.upload_part(
        Bucket='my-bucket',
        Key='large-file.bin',
        PartNumber=i,
        UploadId=upload_id,
        Body=chunk,
    )
    parts.append({'PartNumber': i, 'ETag': part['ETag']})

# Complete multipart upload
s3_client.complete_multipart_upload(
    Bucket='my-bucket',
    Key='large-file.bin',
    UploadId=upload_id,
    MultipartUpload={'Parts': parts},
)
```
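The loop above assumes a `file_chunks` iterable. One way to produce it is a generator that reads fixed-size parts (the name and the 10 MiB part size are our choices; R2 requires all parts except the last to be the same size):

```python
def iter_chunks(path, part_size=10 * 1024 * 1024):
    """Yield fixed-size byte chunks from a file; the final chunk may be shorter."""
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(part_size)
            if not chunk:
                break
            yield chunk

# file_chunks = iter_chunks('large-file.bin')
```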

Workers Integration


```javascript
export default {
  async fetch(request, env) {
    const bucket = env.MY_BUCKET; // R2 bucket binding

    // Upload to R2
    await bucket.put('key', 'value', {
      httpMetadata: {
        contentType: 'text/plain',
      },
      customMetadata: {
        user: 'example',
      },
    });

    // Retrieve from R2
    const object = await bucket.get('key');

    if (object === null) {
      return new Response('Object Not Found', { status: 404 });
    }

    // Return object with metadata
    return new Response(object.body, {
      headers: {
        'Content-Type': object.httpMetadata.contentType,
        'ETag': object.httpEtag,
      },
    });
  },
};
```
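The `env.MY_BUCKET` binding above must be declared in the Worker's Wrangler configuration; a minimal sketch (binding and bucket names are examples):

```toml
# wrangler.toml
name = "my-worker"
main = "src/index.js"
compatibility_date = "2024-01-01"

[[r2_buckets]]
binding = "MY_BUCKET"      # exposed as env.MY_BUCKET in the Worker
bucket_name = "my-bucket"
```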

Bucket CORS Configuration


```python
# Set the bucket CORS policy via the S3 API (boto3)
cors_config = {
    'CORSRules': [
        {
            'AllowedOrigins': ['https://example.com'],
            'AllowedMethods': ['GET', 'PUT', 'POST', 'DELETE'],
            'AllowedHeaders': ['*'],
            'MaxAgeSeconds': 3000,
        },
    ],
}

s3_client.put_bucket_cors(
    Bucket='my-bucket',
    CORSConfiguration=cors_config,
)
```

Object Metadata


```python
# Upload with custom metadata
s3_client.put_object(
    Bucket='my-bucket',
    Key='document.pdf',
    Body=file_data,
    Metadata={
        'author': 'John Doe',
        'department': 'Engineering',
        'classification': 'internal',
    },
    ContentType='application/pdf',
)

# Retrieve metadata without downloading the object
response = s3_client.head_object(Bucket='my-bucket', Key='document.pdf')
metadata = response['Metadata']
content_type = response['ContentType']
```
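Custom metadata is carried as `x-amz-meta-*` HTTP headers, so keys are case-insensitive and come back lowercased from `head_object`. A small illustration of the header mapping:

```python
metadata = {'Author': 'John Doe', 'Department': 'Engineering'}

# On the wire, each key is prefixed and lowercased:
wire_headers = {f'x-amz-meta-{key.lower()}': value for key, value in metadata.items()}
```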

Data Catalog - Apache Spark Integration


```python
from pyspark.sql import SparkSession

# Configure Spark with R2 Data Catalog
spark = (
    SparkSession.builder
    .config("spark.sql.catalog.r2", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.r2.catalog-impl", "org.apache.iceberg.rest.RESTCatalog")
    .config("spark.sql.catalog.r2.uri", "<CATALOG_URI>")
    .config("spark.sql.catalog.r2.warehouse", "<WAREHOUSE_ID>")
    .config("spark.sql.catalog.r2.token", "<API_TOKEN>")
    .getOrCreate()
)

# Create table
spark.sql("""
    CREATE TABLE r2.default.events (
        event_id STRING,
        timestamp TIMESTAMP,
        user_id STRING,
        action STRING
    ) USING iceberg
""")

# Insert data
spark.sql("""
    INSERT INTO r2.default.events
    VALUES ('evt123', current_timestamp(), 'user456', 'login')
""")

# Query data
df = spark.sql("SELECT * FROM r2.default.events WHERE action = 'login'")
df.show()
```

R2 SQL - Serverless Analytics Query Engine


```bash
# Authentication setup (required before querying)
export WRANGLER_R2_SQL_AUTH_TOKEN="YOUR_API_TOKEN"

# Query R2 Data Catalog tables with R2 SQL (Wrangler CLI)
npx wrangler r2 sql query "YOUR_WAREHOUSE_NAME" "SELECT * FROM default.table LIMIT 10"

# The API token needs: Admin Read & Write + R2 SQL Read permissions
# Create at: R2 > Manage API tokens > Create API token
```
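For scripting, the CLI call above can be wrapped in Python; a hypothetical helper (assumes `npx`/Wrangler are installed and `WRANGLER_R2_SQL_AUTH_TOKEN` is exported):

```python
import subprocess

def build_r2_sql_cmd(warehouse: str, sql: str) -> list:
    """Assemble the Wrangler CLI invocation for an R2 SQL query."""
    return ['npx', 'wrangler', 'r2', 'sql', 'query', warehouse, sql]

def r2_sql(warehouse: str, sql: str) -> str:
    """Run the query and return stdout; raises CalledProcessError on failure."""
    result = subprocess.run(build_r2_sql_cmd(warehouse, sql),
                            capture_output=True, text=True, check=True)
    return result.stdout
```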

R2 SQL - Basic Query Patterns


```bash
# Select with filtering
npx wrangler r2 sql query "warehouse-123" \
  "SELECT user_id, event_type, product_id, amount FROM default.ecommerce WHERE event_type = 'purchase' LIMIT 10"

# Aggregation queries
npx wrangler r2 sql query "warehouse-123" \
  "SELECT user_id, COUNT(*) as transaction_count, SUM(amount) as total_spent FROM default.transactions GROUP BY user_id HAVING total_spent > 1000 ORDER BY total_spent DESC"

# Time-based filtering
npx wrangler r2 sql query "warehouse-123" \
  "SELECT * FROM default.events WHERE timestamp >= '2024-01-01' AND timestamp < '2024-02-01' ORDER BY timestamp DESC"

# Join operations
npx wrangler r2 sql query "warehouse-123" \
  "SELECT u.user_id, u.name, t.transaction_id, t.amount FROM default.users u JOIN default.transactions t ON u.user_id = t.user_id WHERE t.fraud_flag = false"
```

R2 SQL - Stream Processing


```bash
# Insert from stream to sink table (continuous processing)
npx wrangler r2 sql query "warehouse-123" \
  "INSERT INTO ecommerce_sink SELECT * FROM ecommerce_stream"

# Filtered stream transformation
npx wrangler r2 sql query "warehouse-123" \
  "INSERT INTO high_value_transactions SELECT transaction_id, user_id, amount, timestamp FROM transaction_stream WHERE amount > 10000"
```

R2 SQL - Advanced Analytics


```bash
# Window functions for ranked queries
npx wrangler r2 sql query "warehouse-123" \
  "SELECT user_id, product_id, amount, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY amount DESC) as purchase_rank FROM default.purchases QUALIFY purchase_rank <= 5"

# Time-series aggregations
npx wrangler r2 sql query "warehouse-123" \
  "SELECT DATE_TRUNC('hour', timestamp) as hour, COUNT(*) as event_count, AVG(response_time_ms) as avg_response_time FROM default.api_logs GROUP BY hour ORDER BY hour DESC LIMIT 24"

# Fraud detection pattern
npx wrangler r2 sql query "warehouse-123" \
  "SELECT user_id, COUNT(*) as transaction_count, SUM(CASE WHEN fraud_flag THEN 1 ELSE 0 END) as fraud_count, AVG(amount) as avg_amount FROM default.transactions WHERE timestamp >= CURRENT_DATE - INTERVAL '7' DAY GROUP BY user_id HAVING fraud_count > 0 ORDER BY fraud_count DESC"
```

Reference Files


This skill includes comprehensive documentation in references/:
  • api.md - API documentation
  • buckets.md - Buckets documentation
  • getting_started.md - Getting Started documentation
  • other.md - Other documentation

Use view to read specific reference files when detailed information is needed.

Working with This Skill

For Beginners

Start with the getting_started or tutorials reference files for foundational concepts.

For Specific Features

Use the appropriate category reference file (api, guides, etc.) for detailed information.

For Code Examples

The quick reference section above contains common patterns extracted from the official docs.

Resources

references/

Organized documentation extracted from official sources. These files contain:
  • Detailed explanations
  • Code examples with language annotations
  • Links to original documentation
  • Table of contents for quick navigation

scripts/

Add helper scripts here for common automation tasks.

assets/

Add templates, boilerplate, or example projects here.

Notes

  • This skill was automatically generated from official documentation
  • Reference files preserve the structure and examples from source docs
  • Code examples include language detection for better syntax highlighting
  • Quick reference patterns are extracted from common usage examples in the docs

Updating


To refresh this skill with updated documentation:
  1. Re-run the scraper with the same configuration
  2. The skill will be rebuilt with the latest information