developing-datacloud-code-extension Skill

Overview
This skill provides a complete workflow for developing, testing, and deploying custom Python code extensions to Salesforce Data Cloud. Code extensions allow you to write Python transformations that read from and write to Data Lake Objects (DLOs) and Data Model Objects (DMOs).
When to Use
- User wants to create a new code extension project
- User needs to test a code extension locally
- User wants to scan code for required permissions
- User needs to deploy a code extension to Data Cloud
- User is working with Data Cloud transformations
- User wants to read/write DLO or DMO data programmatically
Prerequisites Check
Before executing any code extension commands, verify prerequisites:
- SF CLI with plugin installed

  ```bash
  sf plugins --core | grep data-code-extension
  ```

  If not installed:

  ```bash
  sf plugins install @salesforce/plugin-data-codeextension
  ```

- Python 3.11

  ```bash
  python --version  # Should show 3.11.x
  ```

- Data Cloud Custom Code SDK

  ```bash
  pip list | grep salesforce-data-customcode
  ```

  If not installed:

  ```bash
  pip install salesforce-data-customcode
  ```

- Docker running (for deploy only)

  ```bash
  docker ps
  ```

- Authenticated org

  ```bash
  sf org display --target-org <org_alias> --json
  ```
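If you run these checks often, you can wrap them in a small script. The sketch below is a convenience helper (not part of the plugin or SDK) that shells out to the same commands listed above:

```python
"""Prerequisite check for the code extension workflow (convenience sketch)."""
import subprocess
import sys


def check(label: str, cmd: list, expect: str = "") -> bool:
    """Run a command; pass if it exits 0 and its output contains `expect`."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=60)
        ok = result.returncode == 0 and expect in result.stdout
    except (FileNotFoundError, subprocess.TimeoutExpired):
        ok = False
    print(f"{'OK  ' if ok else 'FAIL'} {label}")
    return ok


checks = [
    ("SF CLI plugin", ["sf", "plugins", "--core"], "data-code-extension"),
    ("Python 3.11", [sys.executable, "--version"], "3.11"),
    ("Custom Code SDK", [sys.executable, "-m", "pip", "list"], "salesforce-data-customcode"),
    ("Docker running", ["docker", "ps"], ""),
]
# Build a list (not a generator) so every check runs and reports before exiting
results = [check(*c) for c in checks]
sys.exit(0 if all(results) else 1)
```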
Skill Workflow

Phase 1: Initialize Project
Create a new code extension project with scaffolding.
Commands:
For script-based code extensions (batch transformations):
bash
sf data-code-extension script init --package-dir <directory>For function-based code extensions (real-time):
bash
sf data-code-extension function init --package-dir <directory>Required Option:
- - Directory path where the package will be created
--package-dir, -p
What it creates:
my-transform/ # Project root
├── payload/ # CRITICAL: This is what --package-dir must point to for deploy
│ ├── entrypoint.py # Main transformation code
│ └── config.json # Code extension configuration
├── requirements.txt # Python dependencies
└── README.md创建带有脚手架的新代码扩展项目。
命令:
针对脚本型代码扩展(批量转换):
bash
sf data-code-extension script init --package-dir <directory>针对函数型代码扩展(实时处理):
bash
sf data-code-extension function init --package-dir <directory>必填选项:
- - 用于创建包的目录路径
--package-dir, -p
生成的目录结构:
my-transform/ # 项目根目录
├── payload/ # 关键:部署时--package-dir必须指向此目录
│ ├── entrypoint.py # 主转换代码
│ └── config.json # 代码扩展配置文件
├── requirements.txt # Python依赖项
└── README.mdDirectory Context During Workflow
IMPORTANT: Understanding the directory structure is critical for successful deployment.
Commands and their directory requirements:
| Command | Run From | Path/File Argument |
|---|---|---|
| `init` | Parent directory | `--package-dir <directory>` |
| `scan` | Project root | `--entrypoint ./payload/entrypoint.py` |
| `run` | Project root | `--entrypoint ./payload/entrypoint.py` |
| `deploy` | Project root | `--package-dir ./payload` |

CRITICAL: The `--package-dir` argument in the deploy command MUST point to the `payload` directory, not the project root.

Phase 2: Develop Transformation
Edit `payload/entrypoint.py` with transformation logic.

Script Example (Batch):

```python
from datacustomcode import Client

client = Client()

# Read from DLO
df = client.read_dlo('Employee__dll')

# Transform data (uppercase position field)
df['position_upper'] = df['position'].str.upper()

# Write to output DLO
client.write_to_dlo('Employee_Upper__dll', df, 'overwrite')
```
**Function Example (Real-time):**

```python
from datacustomcode import FunctionClient

def transform(event, context):
    client = FunctionClient(context)
    input_data = event['data']
    output = {
        'name': input_data['name'].upper(),
        'status': 'processed'
    }
    return output
```

Common Operations:
- `client.read_dlo('DLO_Name__dll')` - Read from DLO
- `client.read_dmo('DMO_Name')` - Read from DMO
- `client.write_to_dlo('DLO_Name__dll', df, 'overwrite')` - Write to DLO
- `client.write_to_dmo('DMO_Name', df, 'upsert')` - Write to DMO
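Combining the operations above, a DMO cleanup pass might look like the following sketch. The object and field names are purely illustrative, and the pandas-style DataFrame operations assume the same interface shown in the batch example:

```python
from datacustomcode import Client

client = Client()

# Read source records from a DMO (name is illustrative)
profiles = client.read_dmo('Customer_Profile')

# Keep rows with a usable email and normalize casing before writing
profiles = profiles[profiles['email'].notna()]
profiles['email'] = profiles['email'].str.lower()

# Upsert the cleaned rows into a target DMO (per the Common Operations above)
client.write_to_dmo('Customer_Profile_Clean', profiles, 'upsert')
```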
Phase 3: Scan for Permissions
Scan the entrypoint file to detect required permissions and generate config.json.

Command:

```bash
sf data-code-extension script scan --entrypoint ./payload/entrypoint.py
```

What it detects:
- Read permissions for DLOs/DMOs
- Write permissions for DLOs/DMOs
- Python package dependencies
- Updates `config.json` and `requirements.txt`
Phase 4: Validate DLO Schema (Pre-Test Check)

CRITICAL: Before running tests locally, validate that all DLOs used in your code exist and have the expected fields.
Step 4a: Extract DLOs from config.json

After scanning, review the generated `config.json` to identify all DLOs:

```bash
cat payload/config.json
```
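If you want to list the DLOs programmatically instead of eyeballing the file, a short helper works. Note that the exact JSON shape assumed here (a permissions block with read and write lists) is an assumption; adjust the keys to whatever your generated config.json actually contains:

```python
import json

# Load the scanner-generated config (run from the project root)
with open('payload/config.json') as f:
    config = json.load(f)

# ASSUMPTION: permissions are recorded as read/write lists of object names.
# Inspect your own config.json and adapt these keys if the shape differs.
permissions = config.get('permissions', {})
for access in ('read', 'write'):
    objects = permissions.get(access, [])
    print(f"{access}: {sorted(objects)}")
```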
Step 4b: Validate Each DLO Schema
Use the getting-datacloud-schema skill to verify DLOs exist and check field names.

For each DLO referenced in your code:

- Verify DLO exists:

  ```bash
  python3 scripts/get_dlo_schema.py <org_alias> <dlo_name>
  ```

- Verify field names match — compare fields used in your `entrypoint.py` against the DLO schema.

- Check all DLOs:
  - Validate all DLOs in `read` permissions
  - Validate all DLOs in `write` permissions
  - Check field names match exactly (case-sensitive)
  - Verify data types are compatible with operations
Step 4c: Validation Checklist

Before proceeding to run, ensure:
- All DLOs in config.json exist in target org
- All field names used in code exist in DLO schemas
- Field data types match your transformation logic
- Primary key fields are correctly identified
- Write target DLOs are created and accessible
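As a quick programmatic spot check for the checklist above, you can compare the fields your code expects with what the DLO actually returns. This sketch assumes the Client interface from Phase 2 and reuses the example DLO from the batch script:

```python
from datacustomcode import Client

# Fields the transformation relies on (taken from the Phase 2 example;
# substitute the fields your own entrypoint.py actually touches)
EXPECTED_FIELDS = {'position'}

client = Client()
df = client.read_dlo('Employee__dll')

# Comparison is exact and case-sensitive, matching the checklist above
missing = EXPECTED_FIELDS - set(df.columns)
if missing:
    raise SystemExit(f"Employee__dll is missing fields: {sorted(missing)}")
print('All expected fields are present.')
```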
Phase 5: Test Locally
After validating DLO schemas, run the code extension locally against your Data Cloud org.

Command:

```bash
sf data-code-extension script run --entrypoint <entrypoint_file> --target-org <org_alias> [options]
```

Options:
- `--target-org, -o` - SF CLI org alias (required)
- `--config-file, -c` - Custom config file path

If you get errors:
- Re-validate DLO schemas
- Check field names are exact matches
- Verify data types are compatible
- Review error messages for field/DLO issues
Phase 6: Deploy to Data Cloud
Deploy the code extension to Data Cloud for scheduled or on-demand execution.
CRITICAL: You MUST specify --package-dir ./payload to point to the payload directory created by init.

Command:

```bash
sf data-code-extension script deploy --target-org <org_alias> --name <name> --package-dir ./payload --package-version <version> --description <description> [options]
```

Required Options:
- `--target-org, -o` - SF CLI org alias
- `--name, -n` - Name for code extension deployment
- `--package-dir` - Path to payload directory (REQUIRED - must be `./payload` when running from project root)
- `--package-version` - Version string (default: 0.0.1)
- `--description` - Description of code extension

Optional Options:
- `--cpu-size` - CPU size: CPU_L, CPU_XL, CPU_2XL (default), CPU_4XL
- `--function-invoke-opt` - Function invoke options (for function type)
- `--network` - Docker network (default: default)

After deployment:
- Navigate to Data Cloud in Salesforce UI
- Go to Data Transforms section
- Find your deployment by name
- Click "Run Now" to execute
- Schedule for recurring execution
Error Handling

Common Issues and Solutions
| Error | Solution |
|---|---|
| Plugin not installed | Run `sf plugins install @salesforce/plugin-data-codeextension` |
| SDK not installed | Run `pip install salesforce-data-customcode` |
| Wrong Python version | Use pyenv: `pyenv install 3.11 && pyenv local 3.11` |
| Docker not running | Start Docker Desktop |
| Org not authenticated | Run `sf org login web --alias <org_alias>` |
| Missing or stale config.json | Run the scan command again before testing |
| DLO not found | Verify DLO exists (use getting-datacloud-schema skill), check spelling and the `__dll` suffix |
| Write to DLO fails | Re-run scan, verify target DLO exists and is writable |
| Deploy cannot find the package | Ensure `--package-dir` points to `./payload`, not the project root |
Best Practices

Development
- Always scan before testing — run scan after code changes
- Test locally first — use the `run` command before deploying
- Use version control — git commit after each successful test
- Version your deployments — use semantic versioning (1.0.0, 1.1.0, etc.)
- Deploy from project root with `--package-dir ./payload`
Performance
- CPU_L: Small datasets (< 1M records)
- CPU_2XL: Medium datasets (1M-10M records)
- CPU_4XL: Large datasets (> 10M records)
Security
- No hardcoded credentials — use SF CLI authentication only
- Validate input data — check for nulls and data types (see the sketch below)
- Limit write permissions — only grant necessary DLO/DMO access
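For the input-validation bullet, here is a minimal sketch building on the Phase 2 batch example. Dropping incomplete rows is an assumption; your pipeline may prefer to fail loudly instead:

```python
from datacustomcode import Client

client = Client()
df = client.read_dlo('Employee__dll')

# Reject rows missing required values instead of writing bad records downstream
# (ASSUMPTION: silently dropping is acceptable; raising may suit your pipeline better)
df = df.dropna(subset=['position'])

# Coerce to string so .str methods cannot fail on unexpected types
df['position'] = df['position'].astype(str)
df['position_upper'] = df['position'].str.upper()

client.write_to_dlo('Employee_Upper__dll', df, 'overwrite')
```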
Integration with Other Skills
Use with getting-datacloud-schema skill (CRITICAL for validation):

The getting-datacloud-schema skill is required for validating DLOs before testing code extensions.

Use with Datakit Workflow:
- Create DLO via code extension
- Map DLO to DMO using datakit workflow
- Use DMO in segments and activations
Command Reference
| Command | Purpose | Required Args |
|---|---|---|
| `sf data-code-extension script init` | Create new script project | --package-dir |
| `sf data-code-extension function init` | Create new function project | --package-dir |
| `sf data-code-extension script scan` | Generate config | entrypoint file |
| `sf data-code-extension script run` | Test locally | entrypoint file, --target-org |
| `sf data-code-extension script deploy` | Deploy to Data Cloud | --target-org, --name, --package-dir, --package-version, --description |
Resources
- SF CLI Plugin: https://github.com/salesforcecli/plugin-data-code-extension
- Python SDK: https://github.com/forcedotcom/datacloud-customcode-python-sdk
- Data Cloud Docs: https://help.salesforce.com/s/articleView?id=sf.c360_a_intro.htm
- Python SDK PyPI: https://pypi.org/project/salesforce-data-customcode/
Notes
- Code extensions run in an isolated Python 3.11 environment
- Docker is required only for deployment, not for local testing
- Use SF CLI authentication only (no separate credential files)
- Scan command auto-detects permissions from code
- Local run uses actual Data Cloud data (not mocked)
- Deployments are versioned and can be rolled back in UI