Cloudflare-R2 Skill


Comprehensive assistance with cloudflare-r2 development, generated from official documentation.

When to Use This Skill


This skill should be triggered when:
  • Working with cloudflare-r2
  • Asking about cloudflare-r2 features or APIs
  • Implementing cloudflare-r2 solutions
  • Debugging cloudflare-r2 code
  • Learning cloudflare-r2 best practices

Quick Reference


Common Patterns


R2 Data Catalog - Apache Iceberg Integration


```bash
# Enable Data Catalog on a bucket (Wrangler CLI)
npx wrangler r2 bucket catalog enable my-bucket

# Or enable via Dashboard: R2 > Bucket > Settings > R2 Data Catalog > Enable
# Note the Warehouse and Catalog URI values shown after enabling
```

Python - PyIceberg Catalog Connection


```python
from pyiceberg.catalog.rest import RestCatalog

# Connection configuration
catalog = RestCatalog(
    name="my_catalog",
    warehouse="<WAREHOUSE_ID>",  # From catalog settings
    uri="<CATALOG_URI>",         # From catalog settings
    token="<API_TOKEN>",         # Admin Read & Write token
)

# Create namespace
catalog.create_namespace_if_not_exists("default")

# Create table with schema (df is an existing PyArrow table defined elsewhere)
test_table = ("default", "people")
table = catalog.create_table(test_table, schema=df.schema)

# Append data
table.append(df)

# Query data
result = table.scan().to_arrow()

# Drop table
catalog.drop_table(test_table)
```

Data Catalog - API Token Setup


1. Navigate to: R2 > Manage API tokens > Create API token
2. Select "Admin Read & Write" permission (required for catalog access)
3. Save the token for authentication

The token must grant both R2 and catalog permissions for Iceberg clients.

S3-Compatible SDK Usage (boto3)


```python
import boto3

# Configure S3 client for R2
s3_client = boto3.client(
    's3',
    endpoint_url='https://<ACCOUNT_ID>.r2.cloudflarestorage.com',
    aws_access_key_id='<ACCESS_KEY_ID>',
    aws_secret_access_key='<SECRET_ACCESS_KEY>',
    region_name='auto',
)

# Upload object
s3_client.put_object(
    Bucket='my-bucket',
    Key='path/to/file.txt',
    Body=b'File contents',
)

# Download object
response = s3_client.get_object(Bucket='my-bucket', Key='path/to/file.txt')
data = response['Body'].read()

# List objects
response = s3_client.list_objects_v2(Bucket='my-bucket', Prefix='path/')
for obj in response.get('Contents', []):
    print(obj['Key'])

# Delete object
s3_client.delete_object(Bucket='my-bucket', Key='path/to/file.txt')
```
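Note that `list_objects_v2` returns at most 1,000 keys per call. For larger buckets, page with `ContinuationToken`; a minimal sketch (the helper name is ours, and it works against any boto3-style client):

```python
def list_all_keys(client, bucket, prefix=''):
    """Collect every key under a prefix by following ContinuationToken pages."""
    keys, token = [], None
    while True:
        kwargs = {'Bucket': bucket, 'Prefix': prefix}
        if token:
            kwargs['ContinuationToken'] = token
        resp = client.list_objects_v2(**kwargs)
        keys.extend(obj['Key'] for obj in resp.get('Contents', []))
        if not resp.get('IsTruncated'):
            return keys
        token = resp['NextContinuationToken']
```

Call as `list_all_keys(s3_client, 'my-bucket', 'path/')`; boto3's built-in `get_paginator('list_objects_v2')` achieves the same.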

Presigned URLs


```python
# Generate presigned URL for upload (expires in 1 hour)
presigned_url = s3_client.generate_presigned_url(
    'put_object',
    Params={'Bucket': 'my-bucket', 'Key': 'uploads/file.txt'},
    ExpiresIn=3600,
)

# Generate presigned URL for download
download_url = s3_client.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'my-bucket', 'Key': 'path/to/file.txt'},
    ExpiresIn=3600,
)
```
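A presigned PUT URL can be used by any HTTP client without AWS credentials. A stdlib-only sketch of building the upload request (the helper name and the Content-Type are illustrative choices):

```python
import urllib.request

def build_presigned_put(presigned_url: str, data: bytes) -> urllib.request.Request:
    """Build a PUT request targeting a presigned upload URL."""
    req = urllib.request.Request(presigned_url, data=data, method='PUT')
    req.add_header('Content-Type', 'text/plain')
    return req

# To execute the upload:
# with urllib.request.urlopen(build_presigned_put(presigned_url, b'hello')) as resp:
#     print(resp.status)  # 200 on success
```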

Multipart Upload


```python
# Initiate multipart upload
multipart = s3_client.create_multipart_upload(
    Bucket='my-bucket',
    Key='large-file.bin',
)
upload_id = multipart['UploadId']

# Upload parts
parts = []
for i, chunk in enumerate(file_chunks, start=1):
    part = s3_client.upload_part(
        Bucket='my-bucket',
        Key='large-file.bin',
        PartNumber=i,
        UploadId=upload_id,
        Body=chunk,
    )
    parts.append({'PartNumber': i, 'ETag': part['ETag']})

# Complete multipart upload
s3_client.complete_multipart_upload(
    Bucket='my-bucket',
    Key='large-file.bin',
    UploadId=upload_id,
    MultipartUpload={'Parts': parts},
)
```
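The loop above assumes a `file_chunks` iterable. One way to produce it is a generator that reads fixed-size parts (the name and the 10 MiB part size are our choices; R2 requires all parts except the last to be the same size):

```python
def iter_chunks(path, part_size=10 * 1024 * 1024):
    """Yield fixed-size byte chunks from a file; the final chunk may be shorter."""
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(part_size)
            if not chunk:
                break
            yield chunk

# file_chunks = iter_chunks('large-file.bin')
```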

Workers Integration


```javascript
export default {
  async fetch(request, env) {
    const bucket = env.MY_BUCKET; // R2 bucket binding

    // Upload to R2
    await bucket.put('key', 'value', {
      httpMetadata: {
        contentType: 'text/plain',
      },
      customMetadata: {
        user: 'example',
      },
    });

    // Retrieve from R2
    const object = await bucket.get('key');

    if (object === null) {
      return new Response('Object Not Found', { status: 404 });
    }

    // Return object with metadata
    return new Response(object.body, {
      headers: {
        'Content-Type': object.httpMetadata.contentType,
        'ETag': object.httpEtag,
      },
    });
  },
};
```
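The `env.MY_BUCKET` binding above must be declared in the Worker's Wrangler configuration; a minimal sketch (binding and bucket names are examples):

```toml
# wrangler.toml
name = "my-worker"
main = "src/index.js"
compatibility_date = "2024-01-01"

[[r2_buckets]]
binding = "MY_BUCKET"      # exposed as env.MY_BUCKET in the Worker
bucket_name = "my-bucket"
```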

Bucket CORS Configuration


```python
# Set the bucket CORS policy via the S3 API (boto3)
cors_config = {
    'CORSRules': [
        {
            'AllowedOrigins': ['https://example.com'],
            'AllowedMethods': ['GET', 'PUT', 'POST', 'DELETE'],
            'AllowedHeaders': ['*'],
            'MaxAgeSeconds': 3000,
        },
    ],
}

s3_client.put_bucket_cors(
    Bucket='my-bucket',
    CORSConfiguration=cors_config,
)
```

Object Metadata


```python
# Upload with custom metadata
s3_client.put_object(
    Bucket='my-bucket',
    Key='document.pdf',
    Body=file_data,
    Metadata={
        'author': 'John Doe',
        'department': 'Engineering',
        'classification': 'internal',
    },
    ContentType='application/pdf',
)

# Retrieve metadata without downloading the object
response = s3_client.head_object(Bucket='my-bucket', Key='document.pdf')
metadata = response['Metadata']
content_type = response['ContentType']
```
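Custom metadata is carried as `x-amz-meta-*` HTTP headers, so keys are case-insensitive and come back lowercased from `head_object`. A small illustration of the header mapping:

```python
metadata = {'Author': 'John Doe', 'Department': 'Engineering'}

# On the wire, each key is prefixed and lowercased:
wire_headers = {f'x-amz-meta-{key.lower()}': value for key, value in metadata.items()}
```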

Data Catalog - Apache Spark Integration


```python
from pyspark.sql import SparkSession

# Configure Spark with R2 Data Catalog
spark = (
    SparkSession.builder
    .config("spark.sql.catalog.r2", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.r2.catalog-impl", "org.apache.iceberg.rest.RESTCatalog")
    .config("spark.sql.catalog.r2.uri", "<CATALOG_URI>")
    .config("spark.sql.catalog.r2.warehouse", "<WAREHOUSE_ID>")
    .config("spark.sql.catalog.r2.token", "<API_TOKEN>")
    .getOrCreate()
)

# Create table
spark.sql("""
    CREATE TABLE r2.default.events (
        event_id STRING,
        timestamp TIMESTAMP,
        user_id STRING,
        action STRING
    ) USING iceberg
""")

# Insert data
spark.sql("""
    INSERT INTO r2.default.events
    VALUES ('evt123', current_timestamp(), 'user456', 'login')
""")

# Query data
df = spark.sql("SELECT * FROM r2.default.events WHERE action = 'login'")
df.show()
```

R2 SQL - Serverless Analytics Query Engine


```bash
# Authentication setup (required before querying)
export WRANGLER_R2_SQL_AUTH_TOKEN="YOUR_API_TOKEN"

# Query R2 Data Catalog tables with R2 SQL (Wrangler CLI)
npx wrangler r2 sql query "YOUR_WAREHOUSE_NAME" "SELECT * FROM default.table LIMIT 10"

# The API token needs: Admin Read & Write + R2 SQL Read permissions
# Create at: R2 > Manage API tokens > Create API token
```
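For scripting, the CLI call above can be wrapped in Python; a hypothetical helper (assumes `npx`/Wrangler are installed and `WRANGLER_R2_SQL_AUTH_TOKEN` is exported):

```python
import subprocess

def build_r2_sql_cmd(warehouse: str, sql: str) -> list:
    """Assemble the Wrangler CLI invocation for an R2 SQL query."""
    return ['npx', 'wrangler', 'r2', 'sql', 'query', warehouse, sql]

def r2_sql(warehouse: str, sql: str) -> str:
    """Run the query and return stdout; raises CalledProcessError on failure."""
    result = subprocess.run(build_r2_sql_cmd(warehouse, sql),
                            capture_output=True, text=True, check=True)
    return result.stdout
```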

R2 SQL - Basic Query Patterns


```bash
# Select with filtering
npx wrangler r2 sql query "warehouse-123" \
  "SELECT user_id, event_type, product_id, amount FROM default.ecommerce WHERE event_type = 'purchase' LIMIT 10"

# Aggregation queries
npx wrangler r2 sql query "warehouse-123" \
  "SELECT user_id, COUNT(*) as transaction_count, SUM(amount) as total_spent FROM default.transactions GROUP BY user_id HAVING total_spent > 1000 ORDER BY total_spent DESC"

# Time-based filtering
npx wrangler r2 sql query "warehouse-123" \
  "SELECT * FROM default.events WHERE timestamp >= '2024-01-01' AND timestamp < '2024-02-01' ORDER BY timestamp DESC"

# Join operations
npx wrangler r2 sql query "warehouse-123" \
  "SELECT u.user_id, u.name, t.transaction_id, t.amount FROM default.users u JOIN default.transactions t ON u.user_id = t.user_id WHERE t.fraud_flag = false"
```

R2 SQL - Stream Processing


```bash
# Insert from stream to sink table (continuous processing)
npx wrangler r2 sql query "warehouse-123" \
  "INSERT INTO ecommerce_sink SELECT * FROM ecommerce_stream"

# Filtered stream transformation
npx wrangler r2 sql query "warehouse-123" \
  "INSERT INTO high_value_transactions SELECT transaction_id, user_id, amount, timestamp FROM transaction_stream WHERE amount > 10000"
```

R2 SQL - Advanced Analytics


```bash
# Window functions for ranked queries
npx wrangler r2 sql query "warehouse-123" \
  "SELECT user_id, product_id, amount, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY amount DESC) as purchase_rank FROM default.purchases QUALIFY purchase_rank <= 5"

# Time-series aggregations
npx wrangler r2 sql query "warehouse-123" \
  "SELECT DATE_TRUNC('hour', timestamp) as hour, COUNT(*) as event_count, AVG(response_time_ms) as avg_response_time FROM default.api_logs GROUP BY hour ORDER BY hour DESC LIMIT 24"

# Fraud detection pattern
npx wrangler r2 sql query "warehouse-123" \
  "SELECT user_id, COUNT(*) as transaction_count, SUM(CASE WHEN fraud_flag THEN 1 ELSE 0 END) as fraud_count, AVG(amount) as avg_amount FROM default.transactions WHERE timestamp >= CURRENT_DATE - INTERVAL '7' DAY GROUP BY user_id HAVING fraud_count > 0 ORDER BY fraud_count DESC"
```

Reference Files


This skill includes comprehensive documentation in references/:
  • api.md - API documentation
  • buckets.md - Buckets documentation
  • getting_started.md - Getting Started documentation
  • other.md - Other documentation

Use view to read specific reference files when detailed information is needed.

Working with This Skill

For Beginners

Start with the getting_started or tutorials reference files for foundational concepts.

For Specific Features

Use the appropriate category reference file (api, guides, etc.) for detailed information.

For Code Examples

The quick reference section above contains common patterns extracted from the official docs.

Resources

references/

Organized documentation extracted from official sources. These files contain:
  • Detailed explanations
  • Code examples with language annotations
  • Links to original documentation
  • Table of contents for quick navigation

scripts/

Add helper scripts here for common automation tasks.

assets/

Add templates, boilerplate, or example projects here.

Notes

  • This skill was automatically generated from official documentation
  • Reference files preserve the structure and examples from source docs
  • Code examples include language detection for better syntax highlighting
  • Quick reference patterns are extracted from common usage examples in the docs

Updating


To refresh this skill with updated documentation:
  1. Re-run the scraper with the same configuration
  2. The skill will be rebuilt with the latest information