migrate

Migrate to AgentControl

You're using a skill that will guide you through migrating an application from hardcoded LLM prompts to a full LaunchDarkly AgentControl implementation. Your job is to run the migration in five stages, stopping at each stage for the user to confirm:

Audit the code — read-only scan that produces a structured list of everything hardcoded (prompt, model, parameters, tools, app-scoped knobs).
Wrap the call — install the SDK, create the config in LaunchDarkly with a fallback that mirrors the hardcoded values, and rewrite the call site to fetch the config fresh on every request.
Move the tools — extract each tool's JSON schema, attach it to the config, and swap every call site that references the old tool list.
Add tracking — wire the per-request tracker (duration, tokens, success/error) around the provider call.
Attach evaluators — either offline evals via the Playground + Datasets, or online judges that score sampled traffic automatically.

⚠️ Three first-run failure modes to avoid.
Tracker in the wrong scope. For an agent with a loop, mint `create_tracker()` once per user turn in a `setup_run` entry node, not inside `call_model`. Per-iteration factory calls produce N `runId`s and trip the at-most-once guards. See agent-mode-frameworks.md § Custom `StateGraph`.
`load_chat_model` wrapper reuse. Templates like `langchain-ai/react-agent` ship a `load_chat_model(f"{provider}/{name}")` helper that wraps `init_chat_model(...)` and silently drops every variation parameter. Delete it (don't just avoid using it) and replace call sites with `create_langchain_model(ai_config)`.
Fallthrough not flipped after `/configs-create`. A freshly-created config's fallthrough points at an auto-generated disabled variation, so the SDK returns `enabled=False` until `/configs-targeting` runs. Flip it before Stage 2 verification.

Coverage — which shapes are well-trodden vs require extrapolation

The skill is optimized for Python and Node.js / TypeScript; other languages are install-only. Within Python and Node the coverage tiers are:

| Shape | Python | Node.js | Reference |
| --- | --- | --- | --- |
| One-shot completion (direct OpenAI / Anthropic / Bedrock / Gemini call) | ✅ Worked example | ✅ Worked example | before-after-examples.md, per-provider docs in `built-in-metrics/references/` |
| Chat loop via managed runner (`ManagedModel`) | ✅ Tier 1 pattern | ✅ Tier 1 pattern | built-in-metrics SKILL.md |
| LangChain single-call | ✅ Worked example | ✅ Worked example | langchain-tracking.md |
| LangGraph prebuilt agent (Python `langchain.agents.create_agent`, Node `createReactAgent`) | ✅ Worked example | ✅ Worked example | agent-mode-frameworks.md § LangGraph |
| LangGraph custom `StateGraph` with run-scoped tracker (setup_run + call_model + finalize) | ✅ Deep worked example | ⚠️ Mentioned; translate from Python | agent-mode-frameworks.md § Custom `StateGraph` |
| CrewAI `Agent` | ✅ Worked example | n/a (not a Node framework) | agent-mode-frameworks.md § CrewAI |
| Strands `Agent` | ✅ Worked example | ⚠️ BedrockModel + OpenAIModel only (no Anthropic) | agent-mode-frameworks.md § Strands |
| Custom ReAct loop (hand-rolled, any framework or none) | ✅ Worked example | ⚠️ Apply framework-agnostic invariants; translate from Python | agent-mode-frameworks.md § Custom ReAct loop |
| Vercel AI SDK (`generateText` / `streamText`) | n/a (not a Python framework) | ⚠️ Provider package exists; no worked example in skill | `built-in-metrics` provider-package matrix |
| Streaming (SSE / WebSocket) | ⚠️ Delegated to `built-in-metrics` streaming doc | ⚠️ Same; use `trackStreamMetricsOf` + manual TTFT | streaming-tracking.md |
| Multi-agent graph (supervisor + workers) | ⚠️ Out of main scope; see reference | ⚠️ Out of main scope; see reference | agent-graph-reference.md |
| Non-LangGraph agent frameworks (Pydantic AI, DSPy, AutoGen, Haystack, LlamaIndex agents, Semantic Kernel) | ⚠️ Apply the three invariants; no framework-specific example | ⚠️ Same | agent-mode-frameworks.md § Framework-agnostic invariants |
| Go, Ruby, .NET | ℹ️ Install commands only | ℹ️ Install commands only | phase-1-analysis-checklist.md § SDK routing table |

Reading the key: ✅ = follow the skill verbatim; ⚠️ = the architecture applies but you'll translate idioms or cross-reference another skill; ℹ️ = skill doesn't go past the install step.

If the target app falls in a ⚠️ cell, start by reading agent-mode-frameworks.md § Framework-agnostic invariants. Those three rules (one `agent_config` per turn, one tracker per turn, at-most-once methods fire once at turn end) apply regardless of framework, and every code snippet in this skill is an instantiation of them. Translate the Python example's shape onto the target framework's primitives.
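
The three invariants can be pictured with a minimal, framework-free sketch. `FakeAgentConfig` and `FakeTracker` are hypothetical stand-ins written for this illustration, not SDK classes; only `create_tracker()` is a name taken from this skill, and the at-most-once guard is modeled here as a simple exception.

```python
from dataclasses import dataclass
from itertools import count

_run_ids = count(1)


@dataclass
class FakeTracker:
    """Hypothetical stand-in for the SDK tracker; models only an at-most-once guard."""
    run_id: int
    duration_calls: int = 0

    def track_duration(self, ms: int) -> None:
        if self.duration_calls >= 1:
            raise RuntimeError("at-most-once method fired twice")
        self.duration_calls += 1


@dataclass
class FakeAgentConfig:
    """Hypothetical stand-in for the object returned by agent_config()."""
    instructions: str = "You are a support agent."

    def create_tracker(self) -> FakeTracker:
        # Each factory call yields a new run id, which is why it must not sit in the loop.
        return FakeTracker(run_id=next(_run_ids))


def run_turn(user_input: str, max_iterations: int = 3) -> FakeTracker:
    # Invariant 1: fetch the config once per turn.
    config = FakeAgentConfig()
    # Invariant 2: mint the tracker once per turn, outside the loop.
    tracker = config.create_tracker()
    elapsed_ms = 0
    for _ in range(max_iterations):
        # The model call and any tool calls would go here; every iteration
        # reuses the same config and tracker.
        elapsed_ms += 10
    # Invariant 3: at-most-once methods fire once, at turn end.
    tracker.track_duration(elapsed_ms)
    return tracker
```

Moving `create_tracker()` inside the loop in this sketch would produce one run id per iteration, which is the first failure mode listed at the top of this skill.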

Prerequisites

This skill requires the remotely hosted LaunchDarkly MCP server to be configured in your environment, and an application that already calls an LLM provider with hardcoded model, prompt, and parameter values.

Required environment:

`LD_SDK_KEY`: server-side SDK key (starts with `sdk-`) from the target LaunchDarkly project

MCP tools used directly by this skill: none — every LaunchDarkly write happens in a focused sibling skill.

Check the SDK CHANGELOG before applying any pattern. The API surface described throughout this skill targets the SDK behavior at the time of the skill's last update; SDK releases can rename, remove, or split methods after that. Before you start, fetch the latest CHANGELOG for the SDK(s) you'll target and skim for anything that contradicts the pattern you're about to apply:

Python: https://github.com/launchdarkly/python-server-sdk-ai/blob/main/packages/sdk/server-ai/CHANGELOG.md (and per-provider CHANGELOGs under `packages/ai-providers/server-ai-{openai,langchain}/CHANGELOG.md`)
Node: https://github.com/launchdarkly/js-core/blob/main/packages/sdk/server-ai/CHANGELOG.md (and per-provider CHANGELOGs under `packages/ai-providers/server-ai-{openai,langchain,vercel}/CHANGELOG.md`)

If a CHANGELOG entry post-dates this skill and changes an API you're about to use, the CHANGELOG wins — and the skill should be updated.

Hand-off model. This skill does not auto-invoke other skills. At each stage that needs a LaunchDarkly write, this skill prepares the inputs (config key, mode, model, prompt, tool schemas, judge keys) and then tells the user to run the next slash-command themselves. After the user finishes that sibling skill, return to the next step here. Treat the "Delegate" lines below as next-step instructions, not auto-handoffs.

Sibling skills the user runs at each stage:

`projects`: pre-Stage 2, only if no project exists yet
`configs-create`: Stage 2 (creates the config and first variation)
`tools`: Stage 3 (creates tool definitions and attaches them)
`configs-targeting`: between Stage 2 and Stage 4 (promotes the new variation to fallthrough so the SDK actually serves it)
`online-evals`: Stage 5 (attaches judges, creates custom judges)

Core Principles

Inspect before you mutate. Every stage begins with a read-only audit. Do not touch code until Step 1 is confirmed by the user.
Replace config, not business logic. The SDK call is a drop-in for the place where the model, parameters, and prompt are defined — not for the provider call itself. OpenAI/Anthropic/Bedrock calls stay where they are.
Fallback mirrors current behavior. The fallback passed to `completion_config` / `agent_config` must preserve the hardcoded values you removed, so the app behaves the same if LaunchDarkly is unreachable.
Stages are ordered. Wrap before you add tools. Add tools before you track. Track before you add evals. Skipping ahead produces configs without traffic, metrics without context, and judges with nothing to score.
Hand off to focused skills, manually. Each stage that needs a LaunchDarkly write tells the user to run a sibling slash-command (`/configs-create`, `/tools`, `/configs-targeting`, `/online-evals`) and waits for them to come back. This skill does not auto-invoke other skills.
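
One way to keep the "fallback mirrors current behavior" principle honest is a small drift check between the Stage 1 manifest and the fallback. The sketch below uses plain dicts as a stand-in; the real fallback would be built with the SDK's default-config types, and `MANIFEST`, `FALLBACK`, `flatten`, and `fallback_drift` are hypothetical names introduced only for this illustration.

```python
# Values as recorded by the Stage 1 audit (hypothetical example data).
MANIFEST = {
    "model.name": "gpt-4o",
    "model.parameters.temperature": 0.7,
    "model.parameters.max_tokens": 2000,
    "messages[0].content": "You are a helpful assistant...",
}

# Plain-dict picture of the fallback, used here instead of the SDK types.
FALLBACK = {
    "model": {"name": "gpt-4o", "parameters": {"temperature": 0.7, "max_tokens": 2000}},
    "messages": [{"role": "system", "content": "You are a helpful assistant..."}],
}


def flatten(fallback: dict) -> dict:
    """Flatten the fallback into the same dotted field names the manifest uses."""
    flat = {"model.name": fallback["model"]["name"]}
    for key, value in fallback["model"].get("parameters", {}).items():
        flat[f"model.parameters.{key}"] = value
    for index, message in enumerate(fallback.get("messages", [])):
        flat[f"messages[{index}].content"] = message["content"]
    return flat


def fallback_drift(manifest: dict, fallback: dict) -> list[str]:
    """Return the field names whose values differ between manifest and fallback."""
    flat = flatten(fallback)
    return sorted(
        key for key in manifest.keys() | flat.keys() if manifest.get(key) != flat.get(key)
    )
```

An empty list means the fallback still reproduces the pre-migration values; any entry names a field that would change behavior when LaunchDarkly is unreachable.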

Workflow

Minimum viable migration

Stages 1–4 (audit, wrap, tools, tracker) are independently shippable. A migration that stops after Stage 4 is complete, production-ready, and delivers the core value — externalized prompts and model config, targeting, variation A/B testing, and Monitoring-tab metrics. Stage 5 (evaluators) is a quality-of-life addition, not a gate. Do not block a Stage-4 rollout on evaluators; ship the run-scoped tracker path, verify metrics flow, then come back for Stage 5 when the team has time to curate a dataset.

That said, do not skip Stage 4. A migration without the tracker gives you externalized prompts but no visibility, which leaves most of the payoff on the floor.

Step 1: Audit the codebase (Stage 1)

This is the first stage. It is read-only — no code writes, no LaunchDarkly resources created. The goal is to scan the repo and produce a structured manifest of every hardcoded value that needs to move, then hand the manifest back to the user for confirmation before any code is touched in Stage 2.

Use phase-1-analysis-checklist.md to scan:

Language and package manager: Python (pip/poetry/uv), TypeScript/JavaScript (npm/pnpm/yarn), Go, Ruby, .NET
LLM provider: OpenAI, Anthropic, Bedrock, Gemini, LangChain, LangGraph, CrewAI, Strands
Existing LaunchDarkly usage: any pre-existing `LDClient` or `ldclient` initialization to reuse
Hardcoded model configs: model name string literals, temperature / max_tokens / top_p, system prompts, instruction strings
Template placeholders in prompts: `.format()` calls, f-strings in prompt constants, JS/TS template literals, `%(var)s`, hand-rolled `str.replace("__VAR__", ...)`. Flag each placeholder name and its runtime-value source; all get rewritten to Mustache `{{ variable }}` in Stage 2.
Externalized prompt files: scan YAML / JSON / TOML / Markdown / `.prompt` / `.j2` files and prompt-template registries (`langchain.hub.pull(...)`, LangSmith `client.pull_prompt(...)`) for prompts loaded at runtime. Common shapes: CrewAI `agents.yaml` / `tasks.yaml`, LangChain Promptfiles, k8s ConfigMap overlays, Pydantic Settings classes with `prompt_*` fields. The same Mustache rewrite (sub-step 5 of Stage 2) applies if the placeholder syntax differs. See phase-1-analysis-checklist.md § 4.
Hardcoded app-scoped knobs: search-result limits, retry budgets, tool-timeout overrides, feature toggles, any config-dataclass field that isn't a prompt or model parameter but still governs agent behavior. These belong in `model.custom` on the variation (not `model.parameters`, which is forwarded to the provider SDK and will crash on unknown kwargs).
Mode decision: completion mode (chat messages array) or agent mode (single instructions string). Completion mode is the default and the only mode that supports judges attached in the UI.
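
The placeholder part of the scan can be approximated with a few regular expressions. This is a rough sketch under the assumption that placeholder names are simple identifiers; `PLACEHOLDER_PATTERNS` and `find_placeholders` are hypothetical helper names, not part of the checklist, and f-strings still need a manual look because their braces can hold arbitrary expressions.

```python
import re

# One pattern per placeholder style named in the audit checklist.
PLACEHOLDER_PATTERNS = {
    "mustache": re.compile(r"\{\{\s*(\w+)\s*\}\}"),
    "js_template": re.compile(r"\$\{(\w+)\}"),
    "percent": re.compile(r"%\((\w+)\)s"),
    "sentinel": re.compile(r"__([A-Z][A-Z0-9_]*?)__"),
    # Single-brace str.format fields; the lookarounds skip ${var} and {{ var }}.
    "format": re.compile(r"(?<![{$])\{(\w+)(?:![rsa])?\}(?!\})"),
}


def find_placeholders(prompt: str) -> list[tuple[str, str]]:
    """Return (style, name) pairs in the order they appear in the prompt."""
    hits = []
    for style, pattern in PLACEHOLDER_PATTERNS.items():
        for match in pattern.finditer(prompt):
            hits.append((match.start(), style, match.group(1)))
    return [(style, name) for _, style, name in sorted(hits)]
```

Anything reported with a style other than `mustache` is a candidate for the Stage 2 rewrite; the runtime-value source for each name still has to be traced by reading the call site.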

For each hardcoded target the audit finds, record:

File path and line range
Current value (model name, full prompt text, parameter dict)

Target config field (`model.name`, `model.parameters.temperature`, `messages[].content`, `instructions`)

Whether the surrounding call uses function calling / tools (drives Stage 3)
Whether the surrounding call has retry logic (affects where Stage 4 tracker calls go)

This manifest is the contract for the next four stages.
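
As an illustration of what one manifest record might hold, here is a minimal sketch. `ManifestEntry`, its field names, and the `PROVIDER_PARAMETERS` set are assumptions made for this example; the skill itself only specifies which facts to record, not a data structure.

```python
from dataclasses import dataclass

# Assumption: the provider accepts only these parameter names; adjust per provider.
PROVIDER_PARAMETERS = {"temperature", "max_tokens", "top_p"}


@dataclass(frozen=True)
class ManifestEntry:
    path: str                      # file path of the hardcoded value
    line_start: int                # first line of the range
    line_end: int                  # last line of the range
    name: str                      # what the value is, e.g. "model" or "temperature"
    current_value: object          # the literal found in the code
    uses_tools: bool = False       # drives Stage 3
    has_retry_logic: bool = False  # affects tracker placement in Stage 4

    @property
    def target_field(self) -> str:
        """Map the entry onto the config field it should move to."""
        if self.name == "model":
            return "model.name"
        if self.name in PROVIDER_PARAMETERS:
            return f"model.parameters.{self.name}"
        if self.name == "system_prompt":
            return "messages[].content"
        if self.name == "instructions":
            return "instructions"
        # Everything else is an app-scoped knob and goes to model.custom,
        # so it is never forwarded to the provider SDK.
        return f"model.custom.{self.name}"
```

The fall-through to `model.custom` reflects the rule from the scan list: only values the provider understands belong in `model.parameters`.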

Stage 1 output (return to user as a structured summary):

```
Language: Python 3.12
Package manager: uv
LLM provider: OpenAI
Existing LD SDK: none
Target mode: completion
Hardcoded targets:
  - src/chat.py:42   model="gpt-4o"
  - src/chat.py:43   temperature=0.7, max_tokens=2000
  - src/chat.py:45   system="You are a helpful assistant..."
Externalized prompt files: none (or e.g. "prompts/agents.yaml: CrewAI role/goal/backstory")
Prompt-template registries: none (or e.g. langchain.hub.pull("rlm/rag-prompt") at app.py:14)
Coverage totals: 3 hardcoded code targets · 0 externalized prompt files · 0 registry pulls
Proposed plan: single config key `chat-assistant`, mirror fallback, Stage 3 (tools) skipped (no function calling), Stage 4 (tracking) inline, Stage 5 (evals) attach built-in accuracy judge.
```

STOP. Present this summary, state the coverage totals out loud (e.g. "I found N hardcoded code targets and M externalized prompt files — does that match what you expected?"), and wait for the user to reply with one of four explicit forms:

`confirm`: proceed to Stage 2.
`add: <files or paths>`: re-run the audit with the new locations and present an updated summary.
`fix: <correction>`: update a target in the list (provider, mode, prompt content, etc.) and ask again.
`stop`: pause the migration here.

Do not interpret any other word, including `skip`, `next`, `go`, `ok`, or `proceed`, as confirmation; ask the user to pick one of the four forms. This is the most important checkpoint in the workflow: if the audit is wrong, every stage after this will be wrong. The user should cross-check the hardcoded-targets list against what they know is in the code before giving the go-ahead.
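
The strictness of this checkpoint can be made concrete with a small parser. This is a sketch, assuming replies are matched exactly and case-sensitively after trimming whitespace; `parse_checkpoint_reply` is a hypothetical helper, not something the skill ships.

```python
from typing import Optional


def parse_checkpoint_reply(reply: str) -> Optional[tuple[str, str]]:
    """Return (action, payload) for one of the four accepted forms, else None."""
    text = reply.strip()
    if text in ("confirm", "stop"):
        return (text, "")
    for prefix in ("add:", "fix:"):
        if text.startswith(prefix):
            payload = text[len(prefix):].strip()
            # "add:" or "fix:" with nothing after it is not actionable.
            return (prefix[:-1], payload) if payload else None
    # Anything else, including "ok" or "proceed", is not a confirmation.
    return None
```

A `None` result means the reply should be answered with a request to pick one of the four forms rather than treated as a go-ahead.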

Step 2: Wrap the call in the AI SDK (Stage 2)

This is the first stage that writes code. It has nine sub-steps.

1. Delete any hand-rolled model / tool wrappers the audit flagged. Do this before installing the new SDK so the replacement lands in a repo without confusing fallback imports. The two shapes the Stage 1 audit should have surfaced:
- `load_chat_model(f"{provider}/{name}")` or any `init_chat_model(...)` wrapper. Ships with `langchain-ai/react-agent` and many derivative repos. Delete the function and its module; the replacement is `create_langchain_model(ai_config)` (installed in the next sub-step). Leaving the wrapper in place means the next edit in this repo will import the familiar helper and silently drop variation parameters.
- Hand-rolled `resolve_tools` / `TOOL_REGISTRY` / `ALL_TOOLS` helpers that hard-code a static tool list. Delete them; `ldai_langchain.langchain_helper.build_structured_tools(ai_config, TOOL_REGISTRY_DICT)` is the canonical replacement and gets wired in Stage 3. If you leave the hand-rolled version, both shapes will live side-by-side and the next contributor will pick the familiar one.
Commit the deletion separately from the SDK install if the repo's review process benefits from it; otherwise bundle with sub-step 2.

2. Install the AI SDK. Detect the package manager from Step 1, then install:

Python: `launchdarkly-server-sdk`, `launchdarkly-server-sdk-ai>=0.20.0`
Node.js/TypeScript: `@launchdarkly/node-server-sdk`, `@launchdarkly/server-sdk-ai@^0.20.0`
Go: `github.com/launchdarkly/go-server-sdk/v7`, `github.com/launchdarkly/go-server-sdk/ldai`

Tier-2 provider packages (install in Stage 4, only if you're using the matching provider):

OpenAI: `launchdarkly-server-sdk-ai-openai>=0.4.0` (Python) / `@launchdarkly/server-sdk-ai-openai@^0.5.5` (Node)
LangChain / LangGraph: `launchdarkly-server-sdk-ai-langchain>=0.5.0` (Python) / `@launchdarkly/server-sdk-ai-langchain@^0.5.5` (Node)
Vercel AI SDK (Node only): `@launchdarkly/server-sdk-ai-vercel@^0.5.5`
Anthropic, Gemini, Bedrock: no provider package published; use Tier-3 custom extractor (see `built-in-metrics`)

3. Initialize `LDAIClient` once at startup. Reuse any existing `LDClient`; do not create a second base client. Place the initialization in the same module that owns existing app config.

Python:

```python
import logging
import os

import ldclient
from ldclient.config import Config
from ldai.client import LDAIClient

# Order matters: ldclient.get() raises if called before ldclient.set_config().
# The set_config call is what initializes the singleton; .get() just returns it.
sdk_key = os.environ.get("LD_SDK_KEY")
if sdk_key:
    ldclient.set_config(Config(sdk_key))
else:
    # Missing key: init in offline mode so the app still starts and the fallback
    # path runs on every call. Never raise at import time for a missing env var;
    # that turns a config gap into a boot failure.
    logging.getLogger(__name__).warning(
        "LD_SDK_KEY not set; configs will use fallback values only."
    )
    ldclient.set_config(Config("", offline=True))

ai_client = LDAIClient(ldclient.get())
```

Node.js/TypeScript:

```typescript
import { init } from '@launchdarkly/node-server-sdk';
import { initAi } from '@launchdarkly/server-sdk-ai';

// The Node SDK does not have an explicit offline mode. A missing or invalid
// key fails fast during waitForInitialization, and every agentConfig /
// completionConfig call returns the fallback. Log a warning; do not throw.
if (!process.env.LD_SDK_KEY) {
  console.warn('LD_SDK_KEY not set; configs will use fallback values only.');
}
const ldClient = init(process.env.LD_SDK_KEY ?? 'sdk-offline');
await ldClient.waitForInitialization({ timeout: 10 }).catch(() => {
  // Swallow init failures when the key is missing or invalid; the fallback path runs.
});
const aiClient = initAi(ldClient);
```

4. Hand off to `configs-create`. Print the extracted model, prompt/instructions, parameters, and mode from the Stage 1 manifest, then tell the user: "Run `/configs-create` with these inputs, then come back here." Supply the config key you want the code to call (e.g. `chat-assistant`). Do not attempt to auto-invoke the sibling skill; wait for the user to finish it before continuing.
After `configs-create` finishes, the user must also run `/configs-targeting` to promote the new variation to fallthrough. A freshly created variation returns `enabled=False` to every consumer until targeting is updated. Skip this and Stage 2 verification (sub-step 9 below) will silently take the fallback path on every request.
5. Rewrite template placeholders to Mustache syntax. If the hardcoded prompt interpolates runtime values with Python `.format()`, f-strings, JS template literals, or any other non-Mustache syntax (e.g. `{system_time}`, `${userName}`, `%(topic)s`), rewrite every placeholder to `{{ variable }}` Mustache form. Do this in both the file you're about to send to `/configs-create` and the fallback string you'll write in sub-step 6. The AI SDK interpolates variables through a Mustache renderer on both the LD-served path and the fallback path, using the `variables` dict passed as the fourth argument to `completion_config(...)` / `completionConfig(...)`. Leaving a Python-style `{system_time}` literal in the fallback ships a silent regression when LaunchDarkly is unreachable: the renderer won't match the single-brace form, and the literal `{system_time}` goes to the provider as part of the prompt.
Before:
```python
from datetime import datetime

SYSTEM_PROMPT = "You are a helpful assistant. The time is {system_time}."
prompt = SYSTEM_PROMPT.format(system_time=datetime.now().isoformat())
```
After (in source):
```python
from datetime import datetime

SYSTEM_PROMPT = "You are a helpful assistant. The time is {{ system_time }}."
# .format() is removed at the call site; the SDK interpolates via `variables`.
config = ai_client.completion_config(
    CONFIG_KEY,
    context,
    fallback,
    variables={"system_time": datetime.now().isoformat()},
)
```
Common shapes to rewrite:
- Python `"{var}"` / `"{var!s}"` / `"%(var)s"` → `"{{ var }}"`
- JS/TS `${var}` template literals inside prompt strings → `"{{ var }}"`
- Any hand-rolled `str.replace("__VAR__", value)` scheme → `"{{ var }}"`
See fallback-defaults-pattern.md § Template placeholders for the fallback-specific variant.
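
The mechanical part of this rewrite can be sketched with a few substitutions. `to_mustache` is a hypothetical helper, and it assumes placeholder names are plain identifiers. It deliberately skips `__VAR__` sentinels, since the target variable name is a judgment call, and it leaves existing double braces alone, so prompts that used `{{` as a `str.format` escape still need a manual pass.

```python
import re

# (pattern, replacement) pairs, applied in order.
_REWRITES = [
    (re.compile(r"\$\{(\w+)\}"), r"{{ \1 }}"),   # JS/TS ${var}
    (re.compile(r"%\((\w+)\)s"), r"{{ \1 }}"),   # Python %(var)s
    # Python {var} and {var!s}; the lookarounds leave {{ var }} untouched.
    (re.compile(r"(?<!\{)\{(\w+)(?:![rsa])?\}(?!\})"), r"{{ \1 }}"),
]


def to_mustache(prompt: str) -> str:
    """Rewrite common single-value placeholders to Mustache {{ var }} form."""
    for pattern, replacement in _REWRITES:
        prompt = pattern.sub(replacement, prompt)
    return prompt
```

Running the helper over both the prompt sent to `/configs-create` and the fallback string keeps the two in the same syntax; the result should still be reviewed by hand before it ships.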

6. Build the fallback. Mirror the hardcoded values you extracted. Use `AICompletionConfigDefault` / `AIAgentConfigDefault` in Python, plain object literals in Node. See fallback-defaults-pattern.md for inline, file-backed, and bootstrap-generated patterns.

Python fallback (completion mode):

```python
from ldai.client import AICompletionConfigDefault, ModelConfig, ProviderConfig, LDMessage

fallback = AICompletionConfigDefault(
    enabled=True,
    model=ModelConfig(name="gpt-4o", parameters={"temperature": 0.7, "max_tokens": 2000}),
    provider=ProviderConfig(name="openai"),
    messages=[LDMessage(role="system", content="You are a helpful assistant...")],
)
```

7. Replace the hardcoded call site. Swap the hardcoded model/prompt/params for a `completion_config` / `completionConfig` (or `agent_config` / `agentConfig`) call, then read the returned fields into the existing provider call. Keep the provider call intact.

Python — before:

```python
response = openai_client.chat.completions.create(
    model="gpt-4o",
    temperature=0.7,
    max_tokens=2000,
    messages=[
        {"role": "system", "content": "You are a helpful assistant..."},
        {"role": "user", "content": user_input},
    ],
)
```

Python — after:

```python
from ldclient import Context

context = Context.builder(user_id).set("email", user.email).build()
config = ai_client.completion_config("chat-assistant", context, fallback)

if not config.enabled:
    return disabled_response()

params = config.model.parameters or {}
response = openai_client.chat.completions.create(
    model=config.model.name,
    temperature=params.get("temperature"),
    max_tokens=params.get("max_tokens"),
    messages=[m.to_dict() for m in (config.messages or [])] + [
        {"role": "user", "content": user_input},
    ],
)
```
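
In the snippet above, `params.get("temperature")` yields `None` whenever a variation omits that parameter. Whether a provider SDK treats an explicit `None` the same as an omitted argument varies, so one cautious option is to forward only the parameters that are actually set. The helper below is a sketch; `provider_kwargs` and its `allowed` default are hypothetical and should be adjusted to the provider in use.

```python
def provider_kwargs(parameters, allowed=("temperature", "max_tokens", "top_p")):
    """Keep only provider-known parameters that the variation actually sets."""
    params = parameters or {}
    return {name: params[name] for name in allowed if params.get(name) is not None}
```

With this in place the provider call can take `**provider_kwargs(config.model.parameters)` instead of one keyword per parameter, and app-scoped knobs that were mistakenly placed in `model.parameters` are filtered out rather than forwarded.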

Python — after (agent mode) — for LangGraph, CrewAI, or any framework that takes a goal/instructions string:

```python
from ldclient import Context

context = Context.builder(user_id).kind("user").build()
config = ai_client.agent_config("support-agent", context, FALLBACK)

if not config.enabled:
    return disabled_response()

# config is a single AIAgentConfig object, NOT a (config, tracker) tuple.
# Obtain the tracker once per execution via the factory: tracker = config.create_tracker()
model_name = f"{config.provider.name}/{config.model.name}"
instructions = config.instructions
params = config.model.parameters or {}

# Pass model_name + instructions into your framework's agent constructor.
# Example: LangGraph prebuilt agent (Python: `from langchain.agents import create_agent`;
# this replaces `langgraph.prebuilt.create_react_agent`, deprecated in LangGraph 1.0
# and removed in 2.0. Same return shape; `prompt=` was renamed to `system_prompt=`.)
# agent = create_agent(
#     create_langchain_model(config),  # forwards every variation parameter
#     TOOLS,                            # Stage 3 will replace this with a config.tools loader
#     system_prompt=instructions,
# )
```

See before-after-examples.md for full Python OpenAI, Node Anthropic, and LangGraph agent-mode paired snippets.

8. Check `config.enabled`. If it returns `False`, handle the disabled path without crashing and without calling the provider. The check is required, not optional.
9. Verify. Run the app with a valid `LD_SDK_KEY`; confirm the call succeeds and the response matches pre-migration output. Then temporarily set `LD_SDK_KEY=sdk-invalid` (or unset it) and confirm the fallback path runs without error. Both paths must work before moving to Stage 3.
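
The `enabled` check is easy to exercise in a unit test without any network access. The sketch below uses a hypothetical `StubConfig` in place of the SDK's config object and a plain callable in place of the provider client; `answer` and `DISABLED_MESSAGE` are likewise illustrative names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StubConfig:
    """Hypothetical stand-in for the object returned by completion_config()."""
    enabled: bool
    model_name: str = "gpt-4o"


DISABLED_MESSAGE = "The assistant is currently unavailable."


def answer(config: StubConfig, call_provider: Callable[[str], str]) -> str:
    # Disabled path: return early and never touch the provider.
    if not config.enabled:
        return DISABLED_MESSAGE
    return call_provider(config.model_name)
```

Asserting that the provider callable was never invoked on the disabled path covers the "without calling the provider" half of the requirement, which a response-only check would miss.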

Delegate: `configs-create` (sub-step 4).

这是第一个修改代码的阶段，包含九个子步骤。

删除审计标记的任何手动实现的模型/工具封装。在安装新SDK前执行此操作，确保替代方案在无混淆回退导入的仓库中落地。第一阶段审计应发现两种形式：
- load_chat_model(f"{provider}/{name}")
  或任何
  init_chat_model(...)
  封装。随
```
langchain-ai/react-agent
```
  及许多衍生仓库发布。删除该函数及其模块；替代方案是
```
create_langchain_model(ai_config)
```
  （在下一个子步骤安装）。保留该封装会导致仓库中的下一次编辑导入熟悉的助手并静默丢弃变体参数。
- 手动实现的
  resolve_tools
  /
  TOOL_REGISTRY
  /
  ALL_TOOLS
  助手，硬编码静态工具列表。删除它们；
```
ldai_langchain.langchain_helper.build_structured_tools(ai_config, TOOL_REGISTRY_DICT)
```
  是标准替代方案，将在第三阶段接入。保留手动实现版本会导致两种形式并存，下一位贡献者会选择熟悉的版本。
如果仓库的评审流程需要，可将删除操作与SDK安装分开提交——否则与子步骤2合并。

安装AI SDK。根据步骤1检测包管理器，然后安装：

Python:

launchdarkly-server-sdk

launchdarkly-server-sdk-ai>=0.20.0

Node.js/TypeScript:

@launchdarkly/node-server-sdk

@launchdarkly/server-sdk-ai@^0.20.0

Go:

github.com/launchdarkly/go-server-sdk/v7

github.com/launchdarkly/go-server-sdk/ldai

二级提供商包（仅在使用对应提供商时于第四阶段安装）：

OpenAI:

launchdarkly-server-sdk-ai-openai>=0.4.0

（Python） /

@launchdarkly/server-sdk-ai-openai@^0.5.5

（Node）

LangChain / LangGraph:

launchdarkly-server-sdk-ai-langchain>=0.5.0

（Python） /

@launchdarkly/server-sdk-ai-langchain@^0.5.5

（Node）

Vercel AI SDK（仅Node）:

@launchdarkly/server-sdk-ai-vercel@^0.5.5

Anthropic、Gemini、Bedrock — 无提供商包发布；使用三级自定义提取器（详见
```
built-in-metrics
```
）

在启动时初始化一次
LDAIClient
。复用任何现有

LDClient

——不要创建第二个基础客户端。将初始化放在拥有现有应用配置的同一模块中。

Python：

python

import os
import ldclient
from ldclient.config import Config
from ldai.client import LDAIClient

# 顺序重要：ldclient.get()在ldclient.set_config()之前调用会抛出异常。
# set_config调用初始化单例；.get()仅返回单例。
sdk_key = os.environ.get("LD_SDK_KEY")
if sdk_key:
    ldclient.set_config(Config(sdk_key))
else:
    # 缺少密钥：以离线模式初始化，使应用仍能启动并在每次调用时执行回退路径。永远不要因缺少环境变量在导入时抛出异常——
    # 这会将配置缺口变为启动失败。
    import logging
    logging.getLogger(__name__).warning(
        "LD_SDK_KEY未设置；配置将仅使用回退值。"
    )
    ldclient.set_config(Config("", offline=True))

ai_client = LDAIClient(ldclient.get())

Node.js/TypeScript：

typescript

import { init } from '@launchdarkly/node-server-sdk';
import { initAi } from '@launchdarkly/server-sdk-ai';

// Node SDK没有显式离线模式——缺少或无效密钥会在waitForInitialization期间快速失败，每个agent_config /
// completion_config调用都会返回回退值。记录警告；不要抛出异常。
if (!process.env.LD_SDK_KEY) {
  console.warn('LD_SDK_KEY未设置；配置将仅使用回退值。');
}
const ldClient = init(process.env.LD_SDK_KEY ?? 'sdk-offline');
await ldClient.waitForInitialization({ timeout: 10 }).catch(() => {
  // 在离线模式下忽略初始化失败；执行回退路径。
});
const aiClient = initAi(ldClient);

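Hand off to `configs-create`. Print the model, prompt/instructions, parameters, and mode extracted in the Stage 1 manifest, then tell the user: "Run `/configs-create` with these inputs, then come back here." Provide the config key the code will call (for example `chat-assistant`). Do not try to auto-invoke the sibling skill; wait for the user to finish before continuing.
After `configs-create` finishes, the user also needs to run `/configs-targeting` to promote the new variation to fallthrough. A freshly created variation returns `enabled=False` to every consumer until targeting is updated. Skipping this step makes Stage 2 verification (sub-step 9 below) silently take the fallback path on every request.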
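Rewrite template placeholders to Mustache syntax. If the hardcoded prompt interpolates runtime values with Python `.format()`, f-strings, JS template literals, or any other non-Mustache syntax (for example `{system_time}`, `${userName}`, `%(topic)s`), rewrite every placeholder to the Mustache form `{{ variable }}`. Do this both in the file you are about to send to `/configs-create` and in the fallback string you will write in sub-step 6. The AI SDK interpolates variables through a Mustache renderer on both the LD-served path and the fallback path, using the `variables` dictionary passed as the fourth argument to `completion_config(...)` / `completionConfig(...)`. A Python-style `{system_time}` literal left in the fallback string causes a silent regression when LaunchDarkly is unreachable: the renderer does not match the single-brace form, so the literal `{system_time}` is sent to the provider as part of the prompt.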
Before:
python
```
SYSTEM_PROMPT = "You are a helpful assistant. The time is {system_time}."
prompt = SYSTEM_PROMPT.format(system_time=datetime.now().isoformat())
```
After (in source):
python
```
SYSTEM_PROMPT = "You are a helpful assistant. The time is {{ system_time }}."
# .format() is removed at the call site; the SDK interpolates via `variables`
config = ai_client.completion_config(
    CONFIG_KEY,
    context,
    fallback,
    variables={"system_time": datetime.now().isoformat()},
)
```
Common forms to rewrite:
- Python `"{var}"` / `"{var!s}"` / `"%(var)s"` → `"{{ var }}"`
- JS/TS `${var}` template literals inside prompt strings → `"{{ var }}"`
- Any hand-rolled `str.replace("__VAR__", value)` scheme → `"{{ var }}"`
See fallback-defaults-pattern.md § Template placeholders for the fallback-specific variants.
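The placeholder rewrite can be partly mechanized. The sketch below is a hypothetical helper (`to_mustache` is not part of any LaunchDarkly SDK) that converts simple Python `str.format` and `%(name)s` placeholders to the Mustache form. It assumes placeholders are plain identifiers and leaves anything more complex, such as format specs or attribute access, for manual review.

```python
import re

# Simple str.format fields such as {name} or {name!s}; the lookarounds skip
# braces that are already doubled, e.g. {{ name }}.
_FORMAT_FIELD = re.compile(r"(?<!\{)\{([A-Za-z_][A-Za-z0-9_]*)(?:![rsa])?\}(?!\})")
# %-style named fields such as %(name)s.
_PERCENT_FIELD = re.compile(r"%\(([A-Za-z_][A-Za-z0-9_]*)\)s")


def to_mustache(prompt: str) -> str:
    """Rewrite simple Python placeholders to the Mustache form {{ name }}."""
    prompt = _FORMAT_FIELD.sub(r"{{ \1 }}", prompt)
    return _PERCENT_FIELD.sub(r"{{ \1 }}", prompt)
```

Running the helper over both the prompt file and the fallback string keeps the two in the same syntax; already-converted `{{ name }}` placeholders pass through unchanged, so it is safe to run twice.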

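Build the fallback. Mirror the hardcoded values you extracted. Use `AICompletionConfigDefault` / `AIAgentConfigDefault` in Python and a plain object literal in Node. See fallback-defaults-pattern.md for the inline, file-driven, and bootstrap-generated patterns.

Python fallback (completion mode):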

python

from ldai.client import AICompletionConfigDefault, ModelConfig, ProviderConfig, LDMessage

fallback = AICompletionConfigDefault(
    enabled=True,
    model=ModelConfig(name="gpt-4o", parameters={"temperature": 0.7, "max_tokens": 2000}),
    provider=ProviderConfig(name="openai"),
    messages=[LDMessage(role="system", content="You are a helpful assistant...")],
)

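Replace the hardcoded call site. Swap the hardcoded model/prompt/parameters for a `completion_config` / `completionConfig` (or `agent_config` / `agentConfig`) call, then read the returned fields into the existing provider call. Leave the provider call itself unchanged.

Python (before):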

python

response = openai_client.chat.completions.create(
    model="gpt-4o",
    temperature=0.7,
    max_tokens=2000,
    messages=[
        {"role": "system", "content": "You are a helpful assistant..."},
        {"role": "user", "content": user_input},
    ],
)

Python (after):

python

from ldclient import Context

context = Context.builder(user_id).set("email", user.email).build()
config = ai_client.completion_config("chat-assistant", context, fallback)

if not config.enabled:
    return disabled_response()

params = config.model.parameters or {}
response = openai_client.chat.completions.create(
    model=config.model.name,
    temperature=params.get("temperature"),
    max_tokens=params.get("max_tokens"),
    messages=[m.to_dict() for m in (config.messages or [])] + [
        {"role": "user", "content": user_input},
    ],
)
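One detail the snippet above leaves open is what happens when a variation omits a parameter: `params.get("temperature")` then passes `None` through to the provider. A minimal sketch of one way to forward only the parameters that are actually set, assuming a hypothetical helper name (`provider_kwargs`) and an allow-list you would adapt to your provider:

```python
from typing import Any, Mapping, Optional

# Parameters this call site is willing to forward; adjust per provider.
ALLOWED_PARAMETERS = ("temperature", "max_tokens", "top_p")


def provider_kwargs(parameters: Optional[Mapping[str, Any]]) -> dict:
    """Return only the allow-listed parameters that have a non-None value."""
    params = parameters or {}
    return {
        name: params[name]
        for name in ALLOWED_PARAMETERS
        if params.get(name) is not None
    }
```

The result can be splatted into the provider call, for example `create(model=config.model.name, messages=..., **provider_kwargs(config.model.parameters))`, so an unset parameter falls back to the provider's own default.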

Python (after, agent mode), for LangGraph, CrewAI, or any framework that accepts a goal/instructions string:

python

context = Context.builder(user_id).kind("user").build()
config = ai_client.agent_config("support-agent", context, FALLBACK)

if not config.enabled:
    return disabled_response()

# config is a single AIAgentConfig object, not a (config, tracker) tuple.
# Get the tracker once per execution via the factory: tracker = config.create_tracker()
model_name = f"{config.provider.name}/{config.model.name}"
instructions = config.instructions
params = config.model.parameters or {}

# Pass model_name + instructions into the framework's agent constructor.
# Example: LangGraph prebuilt agent (Python: `from langchain.agents import create_agent`,
# which replaces `langgraph.prebuilt.create_react_agent`, deprecated in LangGraph 1.0 and
# removed in 2.0. The return shape is the same; `prompt=` is renamed to `system_prompt=`.)
# agent = create_agent(
#     create_langchain_model(config),  # forwards all variation parameters
#     TOOLS,                            # Stage 3 replaces this with a config.tools loader
#     system_prompt=instructions,
# )

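See before-after-examples.md for the full paired Python OpenAI, Node Anthropic, and LangGraph agent-mode snippets.

Check `config.enabled`. If it returns `False`, handle the disabled path without crashing and without calling the provider. The check is required, not optional.
Verify. Run the app with a valid `LD_SDK_KEY`; confirm the call succeeds and the response matches the pre-migration output. Then temporarily set `LD_SDK_KEY=sdk-invalid` (or unset it) and confirm the fallback path runs without errors. Both paths must work before moving to Stage 3.

Delegate: configs-create (sub-step 4).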

Step 3: Move tools into the config (Stage 3)

Skip this step if the audited app has no function calling / tools. Otherwise:

Enumerate the tools currently registered. Common shapes to look for:
- `openai.chat.completions.create(tools=[...])`: OpenAI direct
- `anthropic.messages.create(tools=[...])`: Anthropic direct
- `create_agent(llm, tools=[...], system_prompt=...)`: LangGraph prebuilt (Python, `langchain.agents`; replaces deprecated `langgraph.prebuilt.create_react_agent`)
- `createReactAgent({ llm, tools: [...] })`: LangGraph.js prebuilt (Node, `@langchain/langgraph/prebuilt`)
- `Agent(tools=[...])`: CrewAI
- `Agent(tools=[...])`: Strands (Python `@tool`-decorated callables passed through the constructor; TS SDK uses Zod-schema tools)
- Custom `StateGraph`: module-level `TOOLS = [...]` list referenced in both `model.bind_tools(TOOLS)` and `ToolNode(TOOLS)`. This is the `langchain-ai/react-agent` template shape; the list is usually in a `tools.py` module. Grep for `bind_tools(` and `ToolNode(` together; they will point at the same list.
Record each tool's name, description, and JSON schema. For LangChain/LangGraph tools defined with `@tool`, extract the schema via `tool.args_schema.model_json_schema()` (or the equivalent Pydantic `model_json_schema()` call). For plain async callables used as tools (common in custom StateGraph shapes), LangChain infers the schema from the function signature at bind time; extract it via `StructuredTool.from_function(fn).args_schema.model_json_schema()`. Do not hand-write the schema.
Hand off to `tools`. Print the extracted tool names, descriptions, and schemas, then tell the user: "Run `/tools` with these tools and the variation key, then come back here." The sibling skill creates tool definitions (`create-ai-tool`) and attaches them to the variation (`update-ai-config-variation`). Wait for the user to finish before proceeding to sub-step 3. Do not auto-invoke.
Replace the hardcoded tools array at the call site with a read from `config.tools` (or the SDK equivalent for your language). Load the actual implementation functions dynamically from the tool names; see agent-mode-frameworks.md for the dynamic-tool-factory pattern from the devrel agents tutorial.
For custom `StateGraph` shapes, you must update both call sites: `.bind_tools(TOOLS)` and `ToolNode(TOOLS)` must both read from the same `config.tools`-derived list. Forgetting one leaves the LLM seeing the new tools but the executor still running the old ones, or vice versa.
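The name-to-implementation lookup described above can be sketched with a plain registry. This is an illustration only, not the pattern from agent-mode-frameworks.md: the tool functions, `TOOL_REGISTRY`, and `resolve_tools` are hypothetical, and it assumes you have already reduced `config.tools` to a list of tool-name strings (the exact shape of `config.tools` is not shown here).

```python
from typing import Callable, Iterable


def get_weather(city: str) -> str:
    """Hypothetical tool implementation (stub)."""
    return f"(stub) weather for {city}"


def search_docs(query: str) -> str:
    """Hypothetical tool implementation (stub)."""
    return f"(stub) results for {query}"


# Single source of truth mapping tool names to local implementations.
TOOL_REGISTRY: dict = {
    "get_weather": get_weather,
    "search_docs": search_docs,
}


def resolve_tools(tool_names: Iterable[str]) -> list:
    """Return implementations for the named tools, failing loudly on unknown names."""
    resolved: list = []
    missing: list = []
    for name in tool_names:
        impl: Callable = TOOL_REGISTRY.get(name)
        if impl is None:
            missing.append(name)
        else:
            resolved.append(impl)
    if missing:
        raise KeyError(f"no local implementation for tools: {missing}")
    return resolved
```

Resolving the list once and passing the same result to both `.bind_tools(...)` and `ToolNode(...)` is what keeps the model's view and the executor's view in step.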
Verify. Run the app; confirm the tool flows still execute correctly. `get-ai-config` (via the delegate) confirms the tools are attached server-side.

Delegate: tools (sub-step 2).

Step 4: Instrument the tracker (Stage 4)

Delegate: built-in-metrics wires the per-request `tracker.track_*` calls (duration, tokens, success/error, feedback) around the provider call. Use custom-metrics alongside it if the app needs business metrics beyond the built-in agent ones. Note: do not confuse this with `launchdarkly-metric-instrument`, which is for `ldClient.track()` feature metrics, a different API. See sdk-ai-tracker-patterns.md for the full per-method Python + Node matrix that the delegate skill draws on.

Hand off: print the config key, variation key, provider, and whether the call is streaming, then tell the user: "Run `/built-in-metrics` with these inputs, then come back here." Do not auto-invoke. Return here for sub-step 5 (verify) once they're done.

Create the tracker. Obtain a per-execution tracker via the factory on the config returned in Stage 2: `tracker = config.create_tracker()` (Python) or `const tracker = aiConfig.createTracker();` (Node). Call the factory once per user turn and reuse the returned `tracker` for every tracking call in that turn. Each factory call mints a fresh `runId` that tags every event emitted through that tracker, so the events from one turn can be correlated via exported events or downstream queries. (The Monitoring tab shows aggregates today; run-level grouping is a downstream concern. But the `runId` is also what the SDK's at-most-once guards are keyed on, so minting a new one mid-turn breaks the guard semantics regardless of where the events end up.)
Where to call the factory depends on the call shape:
- Completion mode / one-shot provider call: mint the tracker right after `completion_config(...)` returns, in the same function that handles the request.
- Agent mode with a ReAct loop (LangGraph, LangChain, custom): mint the tracker in a dedicated `setup_run` entry node that executes once before the loop, stash it on graph state, and read it from state in `call_model` / tool handlers / a terminal `finalize` node. Emitting `track_duration` / `track_tokens` / `track_success` inside the loop body will trip the at-most-once guards. See agent-mode-frameworks.md § Custom `StateGraph` (run-scoped architecture) for the full `setup_run` + `call_model` + `finalize` pattern.
- Managed runner (Tier 1): skip this step entirely. `ManagedModel` mints the tracker internally per `run()` / `invoke()`. Move to sub-step 4 if that's what the app uses.
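To make the run-scoped rule concrete, here is a small simulation. `FakeTracker` is a hypothetical stand-in written for this example, not the LaunchDarkly tracker class, and its guard is only an approximation of the at-most-once behavior described above. The point it illustrates is the shape: mint once per turn, accumulate inside the loop, emit once at the end.

```python
from dataclasses import dataclass, field
from itertools import count

_run_ids = count(1)


@dataclass
class FakeTracker:
    """Hypothetical stand-in for an SDK tracker; not the real class."""
    run_id: int = field(default_factory=lambda: next(_run_ids))
    events: list = field(default_factory=list)
    seen: set = field(default_factory=set)

    def track_once(self, name: str, value: float) -> bool:
        # Simulated at-most-once guard: a repeat of the same metric
        # within the same run is dropped.
        if name in self.seen:
            return False
        self.seen.add(name)
        self.events.append((self.run_id, name, value))
        return True


def run_turn(iterations: int) -> FakeTracker:
    tracker = FakeTracker()              # setup_run: minted once per user turn
    total_tokens = 0
    for _ in range(iterations):          # call_model / tool loop: accumulate only
        total_tokens += 100              # pretend each model call used 100 tokens
    tracker.track_once("tokens", total_tokens)   # finalize: emit once
    tracker.track_once("success", 1)
    return tracker
```

Moving `FakeTracker()` inside the loop would produce one run ID per iteration, which is the failure mode the `setup_run` node exists to prevent.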
Pick a tier from the four-tier ladder. See sdk-ai-tracker-patterns.md § Tier decision table for the full table (chat loop → Tier 1; provider-package call → Tier 2; custom extractor → Tier 3; streaming/manual → Tier 4).

Wire the chosen tier. The delegate skill has full Python + Node examples for each tier plus per-provider files. A condensed Tier 2/3 example for reference — OpenAI via the provider package:

Python:

python

from ldai_openai import get_ai_metrics_from_response
import openai

client = openai.OpenAI()

tracker = config.create_tracker()

def call_openai():
    return client.chat.completions.create(
        model=config.model.name,
        messages=[{"role": "system", "content": config.messages[0].content},
                  {"role": "user", "content": user_prompt}],
    )

# Exceptions are tracked automatically — track_metrics_of catches
# exceptions, records tracker.track_error(), and re-raises. Wrap your
# own try/except only for local handling (logging, fallback).
response = tracker.track_metrics_of(get_ai_metrics_from_response, call_openai)

Node:

typescript

import { getAIMetricsFromResponse } from '@launchdarkly/server-sdk-ai-openai';

const tracker = aiConfig.createTracker();
// Exceptions are tracked automatically — trackMetricsOf catches
// exceptions, records tracker.trackError(), and re-throws.
const response = await tracker.trackMetricsOf(
  getAIMetricsFromResponse,
  () => openaiClient.chat.completions.create({
    model: aiConfig.model!.name,
    messages: [...aiConfig.messages, { role: 'user', content: userPrompt }],
  }),
);

For Anthropic direct, Bedrock (no provider package), Gemini, and custom HTTP, write a small extractor returning `LDAIMetrics`; see the delegate skill's anthropic-tracking.md, bedrock-tracking.md, and gemini-tracking.md. LangChain single-node and LangGraph go through the `launchdarkly-server-sdk-ai-langchain` / `@launchdarkly/server-sdk-ai-langchain` provider package. Build the model with `create_langchain_model(config)` (Python) / `createLangChainModel(config)` (Node), both of which forward all variation parameters, and track with `get_ai_metrics_from_response` / `getAIMetricsFromResponse`. See langchain-tracking.md.

Wire feedback tracking if the app has thumbs-up/down UI. Both SDKs expose a feedback method, `track_feedback` (Python) / `trackFeedback` (Node), that takes a `{kind}` argument.
Python:
python
```
from ldai.tracker import FeedbackKind
tracker.track_feedback({"kind": FeedbackKind.Positive})
```
Node:
typescript
```
import { LDFeedbackKind } from '@launchdarkly/server-sdk-ai';
tracker.trackFeedback({ kind: LDFeedbackKind.Positive });
```
Deferred feedback across processes. If the thumbs-up UI fires in a different process than the one that produced the response, do not call `create_tracker()` again in the consumer; that mints a new `runId`. Persist the tracker's resumption token (`tracker.resumption_token` in Python, `tracker.resumptionToken` in Node) alongside the message, then rehydrate the tracker with `LDAIConfigTracker.from_resumption_token(...)` (Python) or `aiClient.createTracker(token, context)` (Node) in the feedback handler.
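The persist-then-rehydrate flow can be sketched without the SDK. Everything below is hypothetical scaffolding: an in-memory dict stands in for your message store, and `rehydrate` is an injected callable that in a real app would wrap the SDK's resumption-token call named above.

```python
from typing import Callable, Protocol


class FeedbackTracker(Protocol):
    """Minimal shape this sketch needs from a rehydrated tracker."""
    def track_feedback(self, feedback: dict) -> None: ...


# Stand-in for a database column stored next to each message.
_tokens_by_message: dict = {}


def remember_token(message_id: str, resumption_token: str) -> None:
    """Called by the process that produced the response."""
    _tokens_by_message[message_id] = resumption_token


def record_feedback(
    message_id: str,
    kind: object,
    rehydrate: Callable[[str], FeedbackTracker],
) -> bool:
    """Called by the feedback handler; returns False if no token was stored."""
    token = _tokens_by_message.get(message_id)
    if token is None:
        return False
    # Rehydrate from the stored token instead of minting a new tracker.
    rehydrate(token).track_feedback({"kind": kind})
    return True
```

Injecting `rehydrate` keeps the handler testable and makes the one SDK-specific call easy to swap per language.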
Verify. Hit the wrapped endpoint in staging, then open the config in LaunchDarkly → Monitoring tab. Duration, token, and generation counts should appear within 1–2 minutes. If nothing shows up, walk the checklist in sdk-ai-tracker-patterns.md under "Troubleshooting."

Step 5: Attach evaluations (Stage 5)

Decide between three evaluation paths. This is the most commonly misunderstood stage — there are three paths, not two, and the right default for a migration context is often the one people skip.

Path	When to use	Supports agent mode?
Offline eval (recommended default for migration)	Pre-ship regression: run a fixed dataset through the new variation in the LD Playground and score against baseline. Best fit for migration because you want to prove the new config behaves at least as well as the hardcoded version before shipping.	Yes — all modes
UI-attached auto judges	Attach one or more judges to a variation in the LD UI; judges run on sampled live requests automatically. Zero code changes.	Completion mode only (the UI widget is completion-only today)
Programmatic direct-judge	Call `ai_client.create_judge(...)` inside the request handler and `judge.evaluate(input, output)` on each call. Adds per-request cost and code complexity. Best for continuous live scoring of workflows where sampled auto-judges aren't enough.	Yes — all modes (the SDK handles both identically)

Most migration users should start with offline eval, then add programmatic direct-judge only if they need continuous live scoring after the rollout is stable.

For agent-mode migrations, default to offline eval. UI-attached auto judges are completion-mode only today. The documented path for agent mode is either (a) offline regression via the LD Playground + Datasets (works for all modes), or (b) programmatic direct-judge wired into the call site. Generate a starter dataset CSV from the audit manifest (one representative input per row) and point the user at the Offline Evals guide for the Playground walkthrough. Only wire programmatic direct-judge into production code if the user explicitly asks for continuous live scoring.
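Generating the starter dataset can be as simple as writing one input per row. The sketch below is a guess at a reasonable starting shape, not the Playground's documented format: the single `input` column name and the `build_dataset_csv` helper are assumptions, so check the Offline Evals guide for the columns it actually expects.

```python
import csv
import io


def build_dataset_csv(inputs: list) -> str:
    """Render representative inputs as CSV text, one input per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["input"])        # assumed column name
    for text in inputs:
        writer.writerow([text])       # csv handles commas and quotes
    return buffer.getvalue()
```

Using the `csv` module instead of string joins matters here because real user prompts routinely contain commas, quotes, and newlines.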

Recommended offline-eval shape for a migration:
- Run the `default` variation (or whichever variation mirrors the pre-migration hardcoded behavior) against the dataset first. This is the baseline.
- Clone it into a second variation pointing at a different model family (e.g., if the baseline is `anthropic/claude-sonnet-4-5`, clone to `openai/gpt-4o` or `openai/gpt-4o-mini`). The comparison is most informative across families, not across siblings.
- Attach the built-in Accuracy judge with a pass threshold of 0.85, and run both variations against the same dataset.
- Promote the winner to fallthrough via `/configs-targeting` only if it beats the baseline on Accuracy and does not regress on Relevance or Toxicity.
Write this shape into the project's `datasets/README.md` (or equivalent) so the comparison pattern is reproducible after the migration ships.
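The promotion rule above can be written down as a small check for the README. This is a sketch under stated assumptions: the score dictionaries and `should_promote` are hypothetical, all three scores are assumed to be normalized so that higher is better (confirm the direction of the Toxicity score before relying on this), and requiring the candidate to also meet the 0.85 Accuracy threshold is one reading of the rule.

```python
def should_promote(
    baseline: dict,
    candidate: dict,
    accuracy_threshold: float = 0.85,
) -> bool:
    """Apply the promotion rule to two score dicts keyed by judge name."""
    beats_baseline = (
        candidate["accuracy"] > baseline["accuracy"]
        and candidate["accuracy"] >= accuracy_threshold
    )
    # Assumes higher is better for every score, including toxicity.
    no_regression = (
        candidate["relevance"] >= baseline["relevance"]
        and candidate["toxicity"] >= baseline["toxicity"]
    )
    return beats_baseline and no_regression
```

Keeping the rule in code next to the dataset makes later comparisons repeatable instead of a judgment call made once during the migration.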
Hand off to `online-evals`, but only for UI-attached judges (completion mode) or to create custom judge configs that will be referenced by the programmatic path. Tell the user: "Run `/online-evals` with these inputs, then come back here." Do not auto-invoke. Pass:
- The parent config key and variation key
- A list of built-in judges (Accuracy, Relevance, Toxicity) or custom judge keys to create/attach
- Target environment
The delegate handles creating custom judge configs, attaching them via the variation PATCH endpoint, and setting fallthrough on each judge config. Offline eval does not go through this delegate; it's a Playground workflow, not an API write.

For programmatic direct-judge: wire `create_judge` + `evaluate` + `track_judge_result`. This is the only path at Stage 5 that writes code. The Python shape:

python

from ldai.client import AIJudgeConfigDefault

judge = ai_client.create_judge(
    judge_key,                               # judge config key in LD
    ld_context,
    AIJudgeConfigDefault(enabled=False),     # fallback: skip eval on SDK miss
)

if judge and judge.enabled:
    result = await judge.evaluate(
        input_text,
        output_text,
        sampling_rate=0.25,                  # optional; default 1.0 (always eval)
    )
    if result.sampled:
        tracker.track_judge_result(result)

Four rules:

`create_judge` returns `Optional[Judge]`. Always guard with `if judge and judge.enabled:`. It returns `None` if the judge config is disabled for the context or the provider is missing. A direct `.evaluate()` on a `None` return will raise `AttributeError`.
Pass `AIJudgeConfigDefault`, not `AICompletionConfigDefault`. The `create_judge` `default` parameter is typed `Optional[AIJudgeConfigDefault]`; passing the completion type will not type-check and is a doc-level bug in some older examples.
`sampling_rate` is a parameter on `evaluate()`, not on `create_judge`. It defaults to `1.0` (evaluate every call). For live paths, pass something lower (0.1–0.25) to control cost.
`evaluate()` returns a `JudgeResult` (never `None`). Check `result.sampled` to know whether the evaluation actually ran, and call `track_judge_result(result)`. Node uses `trackJudgeResult(result)` and `LDJudgeResult` with the same `sampled` field.

Ask the user which judge config key to use. LaunchDarkly ships three built-in judges (Accuracy, Relevance, Toxicity), but the actual config keys for the built-ins are not canonical SDK constants and aren't documented. Have the user open AgentControl > Library in the LD UI and copy the key of the judge they want to reference, or create a custom judge config via `configs-create` first.

Verify.
- UI-attached auto judges: trigger a request in staging, open the Monitoring tab → "Evaluator metrics" dropdown. Scores appear within 1–2 minutes at the configured sampling rate.
- Programmatic direct-judge: hit the wrapped endpoint and confirm `track_judge_result` lands on the parent config's Monitoring tab.
- Offline eval: run the dataset through the LD Playground, compare baseline vs new-variation scores side by side. No runtime wiring required.

Delegate: online-evals (sub-step 3, optional; only for UI-attached judges or custom-judge creation; offline eval doesn't delegate).

Edge Cases

Situation	Action
App already initializes `LDClient` for feature flags	Reuse it — pass the existing client to `LDAIClient()` / `initAi()` , do not create a second client
App uses LangChain `ChatOpenAI(model=...)`	Replace the hand-rolled model construction with `create_langchain_model(config)` (Python) or `createLangChainModel(config)` (Node). Do not read `config.model.name` and pass it to `ChatOpenAI(model=...)` by hand — that pattern drops every variation parameter except the ones you explicitly name
Retry wrapper around the provider call	The tracker is minted once at the top of the user turn; the retry loop is inside that scope. Every retry attempt shares the same `runId` . Tracker calls ( `track_duration` / `track_tokens` / `track_success` / `track_error` ) live outside the retry body — one call at the end of the turn, on the success path or the final-failure path
App has no tools — Stage 3 skipped	Move directly from Stage 2 verification to Stage 4 (tracking)
Mode mismatch: user said agent, audit shows one-shot chat	Choose completion mode unless the app uses a LangGraph prebuilt agent ( `langchain.agents.create_agent` in Python or `createReactAgent` in Node), CrewAI `Agent` , Strands `Agent` , or a similar goal-driven framework
App uses Strands Agents (Python)	Agent mode. Build a `create_strands_model` dispatcher keyed on `agent_config.provider.name` that returns `AnthropicModel(model_id=..., max_tokens=...)` or `OpenAIModel(model_id=..., params=...)` . Drop `parameters.tools` before passing params to the model class — Strands receives tools via `Agent(tools=[...])` . Tracking is Tier 3: wrap `invoke_async` with `tracker.track_duration_of(...)` and record tokens from `result.metrics.accumulated_usage` . See agent-mode-frameworks.md § Strands Agent and strands-tracking.md
Strands app on TypeScript	TS SDK ships `BedrockModel` and `OpenAIModel` only — cannot serve Anthropic-backed variations. Use the Python SDK if multi-provider variations are required
TypeScript app using Anthropic SDK	No `trackAnthropicMetrics` helper exists. Use Tier 3: `trackMetricsOf` with a small custom extractor that reads `response.usage.input_tokens` / `response.usage.output_tokens` and returns `LDAIMetrics` . See anthropic-tracking.md in the `built-in-metrics` skill for the exact extractor
Fallback would silently crash because `LD_SDK_KEY` is missing	Log a startup warning; proceed with the fallback. Never raise at import time
Multi-agent graph (supervisor + workers)	Stop after migrating a single agent. Agent Graph Definitions are available in both SDKs — Python via `launchdarkly-server-sdk-ai.agent_graph` and Node via the graph API in `@launchdarkly/server-sdk-ai` . Read agent-graph-reference.md for the graph-level migration path — it is deliberately out of this skill's main scope
Single-agent (ReAct, tool loop) + agent mode	Default to offline eval via the LD Playground + Datasets for Stage 5. UI-attached judges are completion-only today, and programmatic direct-judge adds per-call cost that is usually not worth it until after the migration is live and stable. Point at the Offline Evals guide
Tool with a Pydantic `args_schema` (LangChain `@tool` )	Extract the schema via `tool.args_schema.model_json_schema()` ; do not hand-write the JSON schema for the delegate
Custom `StateGraph` with module-level `TOOLS` list bound via `.bind_tools(TOOLS)` and run through `ToolNode(TOOLS)` (e.g. the `langchain-ai/react-agent` template)	Find the `TOOLS` list (usually in a separate `tools.py` module). Extract schemas the same way. Swap both call sites — `.bind_tools(...)` and `ToolNode(...)` — to read from the same `config.tools` -derived list
App has already externalized config into a `Context` dataclass with env-var fallback (e.g. `react-agent` template's `context.py` )	Replace the consumers of `runtime.context.model` / `runtime.context.system_prompt` with `ai_client.agent_config(...)` and read from the returned `AIAgentConfig` . Empty the dataclass rather than keeping it as the fallback shape — the canonical fallback is `FALLBACK = AIAgentConfigDefault(...)` in Python (a top-level constant near the `agent_config` call), not a parallel Python dataclass. Two sources of truth for fallback values drift. An empty `Context` is a placeholder satisfying LangGraph's `context_schema` requirement only; `thread_id` and any other per-request plumbing comes through `config: RunnableConfig` instead (see agent-mode-frameworks.md § Custom `StateGraph` )
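The retry-wrapper row in the table above is easier to see in code. In this sketch `record` is a hypothetical callback standing in for the tracker calls (it is not an SDK function), and `call_with_retries` is a generic helper written for the example. Metrics are recorded once per turn, outside the retry body, on either the success path or the final-failure path.

```python
import time
from typing import Callable


def call_with_retries(call: Callable, attempts: int = 3, base_delay: float = 0.0):
    """Run `call`, retrying on any exception; re-raise the last error."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return call()
        except Exception as exc:  # retry on any failure in this sketch
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(base_delay * (2 ** attempt))
    raise last_error


def handle_turn(call: Callable, record: Callable):
    """One user turn: retries live inside, metric recording lives outside."""
    start = time.perf_counter()
    try:
        result = call_with_retries(call)
    except Exception:
        record("duration_ms", (time.perf_counter() - start) * 1000)
        record("error", 1)       # final-failure path, recorded once
        raise
    record("duration_ms", (time.perf_counter() - start) * 1000)
    record("success", 1)         # success path, recorded once
    return result
```

Because the duration is measured around the whole retry loop, it reflects what the user actually waited for, not just the last attempt.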

What NOT to Do

These are ordered by how likely they are to show up as a first-run failure. The first three rules — about tracker and config lifetime — account for most of the "migration looks done but the Monitoring tab is fragmented / wrong" reports.

Tracker and config lifetime (most common failure mode)

Don't call `create_tracker()` / `createTracker()` more than once per user turn. One turn = the full request/response cycle including every ReAct iteration, tool call, and retry. See Stage 4 Step 1 for the canonical placement in each app shape (completion / agent loop / managed runner).
Don't call `track_duration` / `track_tokens` / `track_success` / `track_error` / `track_time_to_first_token` inside a loop body. These are at-most-once per tracker; second calls are dropped. Accumulate inside the loop, emit once in a terminal/finalize node. Per-event methods (`track_tool_call`, `track_tool_calls`, `track_feedback`, `track_judge_result`) are safe to call repeatedly. Full matrix: sdk-ai-tracker-patterns.md § At-most-once guards.
Don't call `agent_config()` / `completion_config()` more than once per user turn. Each call is a flag evaluation and emits a `$ld:ai:agent:config` event. Re-fetching inside a loop step or a tool body inflates agent-config counts on the Monitoring tab and lets a mid-turn targeting change swap the variation between LLM calls in a single turn. Resolve once at the top, stash on state, and have every subsequent consumer read from state. Tools that need variation-scoped knobs should use the tool-factory pattern (`make_search(ai_config)` that closes over the knob at setup time); see agent-mode-frameworks.md § Getting knobs into tools.
Don't cache the config object across requests: resolve once per turn, yes, but still resolve once per turn. Caching at module scope defeats the targeting-change mechanism entirely.
Don't delete the fallback once LaunchDarkly is wired up. It is required for the `enabled=False` and SDK-unreachable paths.
Don't tuple-unpack the return of `completion_config` / `agent_config` / `completionConfig` / `agentConfig`. They return a single config object (e.g. `AIAgentConfig`, `AICompletionConfig`), not `(config, tracker)`. Obtain the tracker by calling `config.create_tracker()` / `aiConfig.createTracker()`. LLMs hallucinate both the tuple shape and a `config.tracker` property; the actual API is a factory.
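
The accumulate-then-emit-once shape from the rules above can be sketched with a stand-in tracker. `StubTracker` is hypothetical: it is not the LaunchDarkly SDK class, and its method arguments are simplified assumptions. It only imitates the at-most-once guard described above. `fake_llm_step` and `run_turn` are illustrative names as well.

```python
from dataclasses import dataclass, field


@dataclass
class StubTracker:
    """Hypothetical stand-in for an SDK tracker; NOT the real LaunchDarkly class.

    It imitates the at-most-once guard: a second track_tokens or
    track_success call on the same tracker is dropped.
    """
    events: list = field(default_factory=list)
    _seen: set = field(default_factory=set)

    def _once(self, name, payload):
        if name in self._seen:
            return  # guarded method already fired for this tracker: drop
        self._seen.add(name)
        self.events.append((name, payload))

    def track_tokens(self, total):
        self._once("tokens", total)

    def track_success(self):
        self._once("success", None)

    def track_tool_call(self, tool_key):
        # Per-event method: safe to call repeatedly.
        self.events.append(("tool_call", tool_key))


def fake_llm_step(i):
    """Illustrative stand-in for one ReAct iteration; returns (tokens, tool_key)."""
    return 10 * (i + 1), f"tool-{i}"


def run_turn(tracker, iterations=3):
    """One user turn: the tracker is created by the caller, once, and passed in."""
    total_tokens = 0
    for i in range(iterations):
        tokens, tool_key = fake_llm_step(i)
        total_tokens += tokens             # accumulate inside the loop
        tracker.track_tool_call(tool_key)  # per-event: fine inside the loop
    tracker.track_tokens(total_tokens)     # at-most-once: emit after the loop
    tracker.track_success()
    return total_tokens


tracker = StubTracker()          # minted once at the top of the turn
total = run_turn(tracker)
tracker.track_tokens(999)        # a stray second call is dropped by the guard
```

In a real app the tracker would come from the config object's factory method and the loop would be the agent's own; the point here is only where the guarded calls sit relative to the loop.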

LangChain / LangGraph patterns (second most common failure mode)

If the repo already contains a `load_chat_model(f"{provider}/{name}")` helper, delete it; don't just avoid using it. This exact shape ships with `langchain-ai/react-agent` and is copied into dozens of derivative repos; look for `utils.load_chat_model`, `utils.build_model`, or any one-arg `init_chat_model` wrapper that splits a `"provider/model"` string. Re-using it is the first-run failure mode: every variation parameter (temperature, max_tokens, top_p, stop sequences) is silently dropped because `init_chat_model` only receives the name and provider. `create_langchain_model(ai_config)` is a one-for-one replacement that forwards the whole `model.parameters` dict. Replace every call site, then delete the wrapper from its file so the next reader can't reach for it.
Same rule applies to hand-rolled `resolve_tools` / `TOOL_REGISTRY` / `ALL_TOOLS` helpers. If the template already has a `resolve_tools(tool_keys)` or an `ALL_TOOLS` module-level list, import `build_structured_tools` from `ldai_langchain.langchain_helper` and delete the hand-rolled version. `build_structured_tools(ai_config, TOOL_REGISTRY_DICT)` reads `ai_config.model.parameters.tools` and wraps the matching callables as LangChain `StructuredTool`s with the LD tool key as the `StructuredTool.name`, so `ToolNode` lookup works without a second mapping. Don't leave both in the repo.
Don't put app-scoped knobs directly in `model.parameters`. `create_langchain_model` forwards every key in `parameters` to the provider SDK via `init_chat_model`, so a `max_search_results` / `retry_budget` / `feature_toggle` entry will crash the provider with an unexpected-keyword-argument error. The correct home is `model.custom`, which the provider helpers ignore and the app reads via `ai_config.model.get_custom("key")`. The MCP `update-ai-config-variation` tool does not currently expose top-level `custom`, so pick one of two paths: (a) PATCH the variation via the REST API to set `model.custom` directly, or (b) set it via MCP inside `parameters.custom` (as a nested dict) and use a defensive accessor that reads both locations. Full walk-through with code samples in langchain-tracking.md § MCP caveat.
Don't re-encode tool schemas inside the fallback. When LaunchDarkly is unreachable the fallback should run without tools (or with whatever minimal provider-bound parameters the app needs to keep operating). Building a `_FALLBACK_TOOLS` array that duplicates the config's tool schema re-introduces the hardcoded config the migration was supposed to move out of code.
Don't import `LaunchDarklyCallbackHandler` from `ldai.langchain`: neither the class nor the dotted module path exists. The Python LangChain helper package is `ldai_langchain` (top-level module, underscore). Use `create_langchain_model(config)` + `track_metrics_of_async(get_ai_metrics_from_response, lambda: llm.ainvoke(messages))` as the canonical pattern.
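
For path (b) above, a defensive accessor might look like the following. This is a minimal sketch over plain dicts, not the SDK's config objects: `get_custom_knob` is a hypothetical helper name, and the dict layout (`custom` at the top of the model, or nested under `parameters`) is an assumption drawn from the description above rather than a confirmed wire format.

```python
def get_custom_knob(model, key, default=None):
    """Read an app-scoped knob, checking both places it may have been written.

    Assumed layout (illustrative only):
      model["custom"][key]                 # set via the REST API
      model["parameters"]["custom"][key]   # set via MCP as a nested dict
    """
    top_level = model.get("custom") or {}
    if key in top_level:
        return top_level[key]
    nested = (model.get("parameters") or {}).get("custom") or {}
    return nested.get(key, default)


# Variation written via the REST API: knobs live at model["custom"].
rest_model = {
    "parameters": {"temperature": 0.2},
    "custom": {"max_search_results": 5},
}

# Variation written via MCP: knobs nested under parameters["custom"].
mcp_model = {
    "parameters": {"temperature": 0.2, "custom": {"max_search_results": 3}},
}

rest_value = get_custom_knob(rest_model, "max_search_results")
mcp_value = get_custom_knob(mcp_model, "max_search_results")
missing_value = get_custom_knob(mcp_model, "retry_budget", default=2)
```

The top-level location wins when both are present, so a variation later fixed up through the REST API takes precedence over the nested workaround.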

Stage / handoff discipline

Don't skip Step 1 even when the user says "just wrap it." Without the audit, the fallback will drift from the hardcoded behavior.
Don't delegate to `configs-create` before extracting the prompt and model; the delegate needs them as inputs.
Don't try to attach tools during initial `setup-ai-config`. Tool attachment is a separate step owned by `tools`.
Don't claim you "delegated to `configs-create`" or any other sibling skill. This skill does not auto-invoke. At each handoff, print the inputs and tell the user to run the sibling slash-command, then wait. Anything else misleads the user about what just happened.
Don't skip the `/configs-targeting` step between Stage 2 and Stage 4. A freshly created variation returns `enabled=False` until targeting promotes it to fallthrough; until then, Stage 2 verification will silently take the fallback path on every request.
Don't attempt a multi-agent graph migration in one pass. Migrate a single agent first; use agent-graph-reference.md as the next-step read.

Stage 5 evaluations

Don't wire evals before the tracker is in place. Judges score traffic; without Stage 4 traffic, there is nothing to judge.
Don't frame Stage 5 as "either UI or programmatic." There are three paths: offline eval (recommended default for migration), UI-attached auto judges (completion-mode only), and programmatic direct-judge. Offline eval is the one most people skip and usually the right starting point.
Don't pass `sampling_rate` to `create_judge`; it's a parameter on `Judge.evaluate()`, not `create_judge()`.
Don't hardcode judge config keys (`"accuracy-judge"`, `"relevance-judge"`, etc). The built-in keys are not canonical SDK constants; ask the user to look them up in AgentControl > Library in the LD UI.
Don't forget the `if judge and judge.enabled:` guard after `create_judge`. It returns `Optional[Judge]` and returns `None` when the judge config is disabled for the context.
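
The guard and the placement of `sampling_rate` can be sketched with stand-ins. `StubJudge`, `create_judge_stub`, and `score_turn` below are hypothetical and synchronous for brevity; they are not the SDK's `Judge` or `create_judge`, whose exact signatures should be checked against the SDK itself. The judge key is a placeholder for whatever the user looks up in the LD UI.

```python
import random
from typing import Optional


class StubJudge:
    """Hypothetical stand-in for a judge object; NOT the SDK's Judge class."""

    def __init__(self, enabled):
        self.enabled = enabled

    def evaluate(self, input_text, output_text, sampling_rate=1.0):
        # Sampling is decided per evaluate() call, not at creation time.
        if random.random() >= sampling_rate:
            return None  # this request was not sampled
        return {"score": 1.0 if output_text else 0.0}


def create_judge_stub(judge_key, available=True, enabled=True) -> Optional[StubJudge]:
    """Illustrative factory that may return None, mirroring an Optional return."""
    return StubJudge(enabled) if available else None


def score_turn(judge, question, answer, sampling_rate=1.0):
    # Guard both cases: no judge at all, or a judge that is switched off.
    if judge and judge.enabled:
        return judge.evaluate(question, answer, sampling_rate=sampling_rate)
    return None


# Placeholder: the key is supplied by the user, not hardcoded in the app.
user_supplied_key = "<judge-key-from-ui>"

sampled = score_turn(create_judge_stub(user_supplied_key), "q", "a", sampling_rate=1.0)
skipped = score_turn(create_judge_stub(user_supplied_key, available=False), "q", "a")
```

Keeping the guard in one helper such as `score_turn` means call sites never touch a possibly-`None` judge directly.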

API surface gotchas

Don't use `launchdarkly-metric-instrument` for Stage 4 (tracking). That skill is for `ldClient.track()` feature metrics, not agent `tracker.track_*` calls; they are different APIs.
Don't use `track_request()` in Python; it does not exist in `launchdarkly-server-sdk-ai`. Use `track_metrics_of` with a provider-package or custom extractor, or drop to explicit `track_duration` + `track_tokens` + `track_success` / `track_error` if you're on the streaming path.
Don't pass `graph_key=...` to `tracker.track_*()` methods in Python; it is not an accepted argument. Trackers obtained inside a graph traversal are automatically configured with the correct graph key.
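
For the streaming path mentioned above, the explicit calls might be arranged as follows. `RecordingTracker` is a hypothetical stand-in that only records calls: the method names mirror those named in the text, but the argument shapes (plain integers) are assumptions, not the real SDK signatures. `consume_stream` and `broken_stream` are illustrative.

```python
import time


class RecordingTracker:
    """Hypothetical stand-in that records calls; NOT the real SDK tracker."""

    def __init__(self):
        self.calls = []

    def track_duration(self, millis):
        self.calls.append(("duration", millis))

    def track_tokens(self, total):
        self.calls.append(("tokens", total))

    def track_success(self):
        self.calls.append(("success",))

    def track_error(self):
        self.calls.append(("error",))


def consume_stream(tracker, chunks):
    """Drain a stream of (text, token_count) chunks, then emit each metric once."""
    start = time.monotonic()
    text, tokens = [], 0
    try:
        for piece, count in chunks:
            text.append(piece)
            tokens += count  # accumulate; do not track per chunk
    except Exception:
        tracker.track_duration(int((time.monotonic() - start) * 1000))
        tracker.track_error()
        raise
    tracker.track_duration(int((time.monotonic() - start) * 1000))
    tracker.track_tokens(tokens)
    tracker.track_success()
    return "".join(text)


def broken_stream():
    """Illustrative stream that fails after one chunk."""
    yield ("partial", 1)
    raise RuntimeError("provider dropped the connection")


ok_tracker = RecordingTracker()
reply = consume_stream(ok_tracker, [("Hel", 1), ("lo", 1)])

bad_tracker = RecordingTracker()
try:
    consume_stream(bad_tracker, broken_stream())
except RuntimeError:
    pass  # the error was tracked before being re-raised
```

Each path emits duration exactly once and exactly one of success or error, which keeps the sketch within the at-most-once guards.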

Related Skills

References

phase-1-analysis-checklist.md: Step 1 audit checklist, grep patterns, SDK routing table, mode decision tree
before-after-examples.md: Paired hardcoded-to-wrapped snippets for Python OpenAI, Node Anthropic, Python LangGraph
sdk-ai-tracker-patterns.md: Every `tracker.track_*` method in Python and Node side by side, auto-helper matrix, and common gotchas
agent-mode-frameworks.md: How to wire `agent_config` into LangGraph, CrewAI, and custom react loops; dynamic tool loading pattern
fallback-defaults-pattern.md: Three fallback patterns (inline, file-backed, bootstrap-generated) and when to use each
agent-graph-reference.md: Out-of-scope pointer doc for multi-agent migrations
