
developing-datacloud-code-extension Skill


Overview


This skill provides a complete workflow for developing, testing, and deploying custom Python code extensions to Salesforce Data Cloud. Code extensions allow you to write Python transformations that read from and write to Data Lake Objects (DLOs) and Data Model Objects (DMOs).

When to Use


  • User wants to create a new code extension project
  • User needs to test a code extension locally
  • User wants to scan code for required permissions
  • User needs to deploy a code extension to Data Cloud
  • User is working with Data Cloud transformations
  • User wants to read/write DLO or DMO data programmatically

Prerequisites Check


Before executing any code extension commands, verify prerequisites:

1. SF CLI with the plugin installed:

   ```bash
   sf plugins --core | grep data-code-extension
   ```

   If not installed:

   ```bash
   sf plugins install @salesforce/plugin-data-codeextension
   ```

2. Python 3.11:

   ```bash
   python --version  # Should show 3.11.x
   ```

3. Data Cloud Custom Code SDK:

   ```bash
   pip list | grep salesforce-data-customcode
   ```

   If not installed:

   ```bash
   pip install salesforce-data-customcode
   ```

4. Docker running (for deploy only):

   ```bash
   docker ps
   ```

5. Authenticated org:

   ```bash
   sf org display --target-org <org_alias> --json
   ```
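The checks above can also be scripted. The sketch below is an illustrative helper (not part of the SDK or the plugin) that reports which required executables are missing from `PATH`:

```python
import shutil


def missing_tools(tools):
    """Return the subset of `tools` not found on PATH (illustrative helper, not part of the SDK)."""
    return [t for t in tools if shutil.which(t) is None]


# 'docker' is only needed for deploy, so it can be checked separately.
print(missing_tools(['sf', 'python3']))
```

An empty list means the base tooling is present; versions and plugin installation still need the CLI checks above.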

Skill Workflow


Phase 1: Initialize Project


Create a new code extension project with scaffolding.

Commands:

For script-based code extensions (batch transformations):

```bash
sf data-code-extension script init --package-dir <directory>
```

For function-based code extensions (real-time):

```bash
sf data-code-extension function init --package-dir <directory>
```

Required Option:
  • --package-dir, -p - Directory path where the package will be created

What it creates:

```
my-transform/              # Project root
├── payload/               # CRITICAL: this is what --package-dir must point to for deploy
│   ├── entrypoint.py      # Main transformation code
│   └── config.json        # Code extension configuration
├── requirements.txt       # Python dependencies
└── README.md
```

Directory Context During Workflow


IMPORTANT: Understanding the directory structure is critical for successful deployment.

Commands and their directory requirements:

| Command | Run From | Path/File Argument |
|---------|----------|--------------------|
| init | Parent directory | `<project-name>` or `.` |
| scan | Project root | `./payload/entrypoint.py` |
| run | Project root | `./payload/entrypoint.py` |
| deploy | Project root | `--package-dir ./payload` (REQUIRED) |

CRITICAL: The --package-dir argument of the deploy command MUST point to the payload directory, not the project root.

Phase 2: Develop Transformation


Edit payload/entrypoint.py with the transformation logic.

Script Example (Batch):

```python
from datacustomcode import Client

client = Client()

# Read from DLO
df = client.read_dlo('Employee__dll')

# Transform data (uppercase position field)
df['position_upper'] = df['position'].str.upper()

# Write to output DLO
client.write_to_dlo('Employee_Upper__dll', df, 'overwrite')
```

Function Example (Real-time):

```python
from datacustomcode import FunctionClient

def transform(event, context):
    client = FunctionClient(context)
    input_data = event['data']
    output = {
        'name': input_data['name'].upper(),
        'status': 'processed'
    }
    return output
```

Common Operations:
  • client.read_dlo('DLO_Name__dll') - Read from a DLO
  • client.read_dmo('DMO_Name') - Read from a DMO
  • client.write_to_dlo('DLO_Name__dll', df, 'overwrite') - Write to a DLO
  • client.write_to_dmo('DMO_Name', df, 'upsert') - Write to a DMO
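Because the core logic of the function example never touches the client, it can be exercised without Data Cloud at all. A minimal sketch, assuming the same event shape as the example (the FunctionClient instance is dropped here since the logic does not use it):

```python
def transform(event, context):
    # Same logic as the function example above, minus the unused FunctionClient.
    input_data = event['data']
    return {
        'name': input_data['name'].upper(),
        'status': 'processed',
    }


result = transform({'data': {'name': 'ada'}}, context=None)
print(result)  # {'name': 'ADA', 'status': 'processed'}
```

Keeping the transformation logic separable from the client like this makes it easy to unit-test before running against a real org.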

Phase 3: Scan for Permissions


Scan the entrypoint file to detect required permissions and generate config.json.

Command:

```bash
sf data-code-extension script scan --entrypoint ./payload/entrypoint.py
```

What it detects:
  • Read permissions for DLOs/DMOs
  • Write permissions for DLOs/DMOs
  • Python package dependencies
  • Updates config.json and requirements.txt

Phase 4: Validate DLO Schema (Pre-Test Check)


CRITICAL: Before running tests locally, validate that all DLOs used in your code exist and have the expected fields.

Step 4a: Extract DLOs from config.json


After scanning, review the generated config.json to identify all DLOs:

```bash
cat payload/config.json
```

Step 4b: Validate Each DLO Schema

步骤4b:验证每个DLO Schema

Use the getting-datacloud-schema skill to verify DLOs exist and check field names.

For each DLO referenced in your code:
  1. Verify the DLO exists:

     ```bash
     python3 scripts/get_dlo_schema.py <org_alias> <dlo_name>
     ```

  2. Verify field names match: compare the fields used in your entrypoint.py against the DLO schema.
  3. Check all DLOs:
     • Validate all DLOs in read permissions
     • Validate all DLOs in write permissions
     • Check that field names match exactly (case-sensitive)
     • Verify data types are compatible with the operations

Step 4c: Validation Checklist


Before proceeding to run, ensure:
  • All DLOs in config.json exist in target org
  • All field names used in code exist in DLO schemas
  • Field data types match your transformation logic
  • Primary key fields are correctly identified
  • Write target DLOs are created and accessible
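The field-name items in this checklist can be automated once the schema output is in hand. A hypothetical helper (the function and field names are illustrative, not part of the SDK):

```python
def missing_fields(used_fields, schema_fields):
    """Fields referenced in code but absent from the DLO schema (comparison is case-sensitive)."""
    return sorted(set(used_fields) - set(schema_fields))


schema = ['id__c', 'position__c', 'name__c']   # e.g. parsed from get_dlo_schema output
used = ['position__c', 'Position__c']          # note the case difference
print(missing_fields(used, schema))            # ['Position__c']
```

An empty result for every DLO means the field-name portion of the checklist passes; data types still need a manual look.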

Phase 5: Test Locally


After validating DLO schemas, run the code extension locally against your Data Cloud org.

Command:

```bash
sf data-code-extension script run --entrypoint <entrypoint_file> --target-org <org_alias> [options]
```

Options:
  • --target-org, -o - SF CLI org alias (required)
  • --config-file, -c - Custom config file path

If you get errors:
  • Re-validate DLO schemas
  • Check that field names are exact matches
  • Verify data types are compatible
  • Review error messages for field/DLO issues

Phase 6: Deploy to Data Cloud


Deploy the code extension to Data Cloud for scheduled or on-demand execution.

CRITICAL: You MUST specify --package-dir ./payload to point to the payload directory created by init.

Command:

```bash
sf data-code-extension script deploy --target-org <org_alias> --name <name> --package-dir ./payload --package-version <version> --description <description> [options]
```

Required Options:
  • --target-org, -o - SF CLI org alias
  • --name, -n - Name for the code extension deployment
  • --package-dir - Path to the payload directory (REQUIRED; must be ./payload when running from the project root)
  • --package-version - Version string (default: 0.0.1)
  • --description - Description of the code extension

Optional Options:
  • --cpu-size - CPU size: CPU_L, CPU_XL, CPU_2XL (default), CPU_4XL
  • --function-invoke-opt - Function invoke options (for function type)
  • --network - Docker network (default: default)

After deployment:
  • Navigate to Data Cloud in the Salesforce UI
  • Go to the Data Transforms section
  • Find your deployment by name
  • Click "Run Now" to execute
  • Schedule it for recurring execution

Error Handling


Common Issues and Solutions


| Error | Solution |
|-------|----------|
| command data-code-extension not found | `sf plugins install @salesforce/plugin-data-codeextension` |
| datacustomcode CLI not found | `pip install salesforce-data-customcode` |
| Python version mismatch | Use pyenv: `pyenv install 3.11.0 && pyenv local 3.11.0` |
| Cannot connect to Docker daemon | Start Docker Desktop |
| No org found for alias | `sf org login web --alias <org_alias>` |
| config.json not found | Re-run `sf data-code-extension script scan --entrypoint ./payload/entrypoint.py` |
| DLO not found | Verify the DLO exists (use the getting-datacloud-schema skill); check spelling and the `__dll` suffix |
| Permission denied writing | Re-run the scan; verify the target DLO exists and is writable |
| Deploy fails - wrong directory | Ensure `--package-dir` points to the `payload/` directory, not the project root |

Best Practices


Development


  1. Always scan before testing: run scan after code changes
  2. Test locally first: use the run command before deploying
  3. Use version control: git commit after each successful test
  4. Version your deployments: use semantic versioning (1.0.0, 1.1.0, etc.)
  5. Deploy from the project root with --package-dir ./payload
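Version bumps for --package-version can be mechanized so deployments never reuse a version string. A minimal sketch (the helper name is illustrative, not part of the CLI):

```python
def bump(version, part='patch'):
    """Bump a MAJOR.MINOR.PATCH version string, e.g. for --package-version."""
    major, minor, patch = (int(p) for p in version.split('.'))
    if part == 'major':
        return f'{major + 1}.0.0'
    if part == 'minor':
        return f'{major}.{minor + 1}.0'
    return f'{major}.{minor}.{patch + 1}'


print(bump('0.0.1'))           # 0.0.2
print(bump('1.0.3', 'minor'))  # 1.1.0
```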

Performance


  • CPU_L: Small datasets (< 1M records)
  • CPU_2XL: Medium datasets (1M-10M records)
  • CPU_4XL: Large datasets (> 10M records)
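This sizing guidance maps directly onto the --cpu-size flag. A hypothetical helper encoding those thresholds (the thresholds are this document's rules of thumb, not an official limit; CPU_XL is a valid flag value but is not covered by the guidance, so it is omitted):

```python
def suggest_cpu_size(record_count):
    """Pick a --cpu-size value from the record-count guidance above."""
    if record_count < 1_000_000:
        return 'CPU_L'
    if record_count <= 10_000_000:
        return 'CPU_2XL'
    return 'CPU_4XL'


print(suggest_cpu_size(250_000))     # CPU_L
print(suggest_cpu_size(50_000_000))  # CPU_4XL
```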

Security


  1. No hardcoded credentials: use SF CLI authentication only
  2. Validate input data: check for nulls and data types
  3. Limit write permissions: only grant necessary DLO/DMO access
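Input validation (point 2) can be a simple pre-write filter. An illustrative sketch over plain record dicts rather than a DataFrame (the helper name is hypothetical):

```python
def valid_records(records, required_fields):
    """Keep only records where every required field is present and non-null."""
    return [
        rec for rec in records
        if all(rec.get(field) is not None for field in required_fields)
    ]


rows = [
    {'id': 1, 'position': 'Engineer'},
    {'id': 2, 'position': None},   # dropped: null field
    {'id': 3},                     # dropped: missing field
]
print(valid_records(rows, ['id', 'position']))  # [{'id': 1, 'position': 'Engineer'}]
```

With a pandas DataFrame, `df.dropna(subset=[...])` achieves the same effect.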

Integration with Other Skills


Use with the getting-datacloud-schema skill (CRITICAL for validation): the getting-datacloud-schema skill is required for validating DLOs before testing code extensions.

Use with the Datakit workflow:
  1. Create a DLO via the code extension
  2. Map the DLO to a DMO using the datakit workflow
  3. Use the DMO in segments and activations

Command Reference


| Command | Purpose | Required Args |
|---------|---------|---------------|
| script init | Create a new script project | `--package-dir` |
| function init | Create a new function project | `--package-dir` |
| script scan | Generate config | entrypoint file |
| script run | Test locally | entrypoint file, `--target-org` |
| script deploy | Deploy to Data Cloud | `--target-org`, `--name`, `--package-dir`, `--package-version`, `--description` |


Notes


  • Code extensions run in isolated Python 3.11 environment
  • Docker is required only for deployment, not for local testing
  • Use SF CLI authentication only (no separate credential files)
  • Scan command auto-detects permissions from code
  • Local run uses actual Data Cloud data (not mocked)
  • Deployments are versioned and can be rolled back in UI