Arize Instrumentation Skill

Use this skill when the user wants to add Arize AX tracing to their application. Follow the two-phase, agent-assisted flow from the Agent-Assisted Tracing Setup and the Arize AX Tracing — Agent Setup Prompt.

Quick start (for the user)


If the user asks you to "set up tracing" or "instrument my app with Arize", you can start with:
Follow the instructions from https://arize.com/docs/PROMPT.md and ask me questions as needed.
Then execute the two phases below.
Then execute the two phases below.

Core principles


  • Prefer inspection over mutation — understand the codebase before changing it.
  • Do not change business logic — tracing is purely additive.
  • Use auto-instrumentation where available — add manual spans only for custom logic not covered by integrations.
  • Follow existing code style and project conventions.
  • Keep output concise and production-focused — do not generate extra documentation or summary files.

Phase 1: Analysis (read-only)


Do not write any code or create any files during this phase.

Steps


  1. Check dependency manifests to detect stack:
    • Python: `pyproject.toml`, `requirements.txt`, `setup.py`, `Pipfile`
    • TypeScript/JavaScript: `package.json`
    • Java: `pom.xml`, `build.gradle`, `build.gradle.kts`
  2. Scan import statements in source files to confirm what is actually used.
  3. Check for existing tracing/OTel — look for `TracerProvider`, `register()`, `opentelemetry` imports, `ARIZE_*`, `OTEL_*`, `OTLP_*` env vars, or other observability config (Datadog, Honeycomb, etc.).
  4. Identify scope — for monorepos or multi-service projects, ask which service(s) to instrument.
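The manifest scan in step 1 can be sketched as below. The signal table is an illustrative subset, not the skill's actual routing table, and `detect_stack` is a hypothetical helper name:

```python
# Illustrative sketch of the Phase 1 manifest scan.
from pathlib import Path

# Substring found in a manifest -> human-readable label (example subset only).
KNOWN_SIGNALS = {
    "openai": "OpenAI",
    "anthropic": "Anthropic",
    "litellm": "LiteLLM",
    "langchain": "LangChain",
    "llama-index": "LlamaIndex",
}

MANIFESTS = (
    "pyproject.toml", "requirements.txt", "setup.py", "Pipfile",  # Python
    "package.json",                                               # TS/JS
    "pom.xml", "build.gradle", "build.gradle.kts",                # Java
)

def detect_stack(project_root="."):
    """Return sorted labels for every known signal found in a dependency manifest."""
    found = set()
    for name in MANIFESTS:
        path = Path(project_root) / name
        if not path.exists():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        found.update(label for needle, label in KNOWN_SIGNALS.items() if needle in text)
    return sorted(found)
```

Remember that manifests can list unused packages, which is why step 2 cross-checks actual imports.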

What to identify


| Item | Examples |
| --- | --- |
| Language | Python, TypeScript/JavaScript, Java |
| Package manager | pip/poetry/uv, npm/pnpm/yarn, maven/gradle |
| LLM providers | OpenAI, Anthropic, LiteLLM, Bedrock, etc. |
| Frameworks | LangChain, LangGraph, LlamaIndex, Vercel AI SDK, Mastra, etc. |
| Existing tracing | Any OTel or vendor setup |
| Tool/function use | LLM tool use, function calling, or custom tools the app executes (e.g. in an agent loop) |
Key rule: When a framework is detected alongside an LLM provider, instrument both. Provider/framework instrumentors do not create spans for tool execution — only for LLM API calls. If the app runs tools (e.g. `check_loan_eligibility`, `run_fraud_detection`), add manual TOOL spans so each invocation appears with input/output (see Enriching traces below). The framework instrumentor does not trace the underlying LLM calls — you need the provider instrumentor too.

Phase 1 output


Return a concise summary:
  • Detected language, package manager, providers, frameworks
  • Proposed integration list (from the routing table in the docs)
  • Any existing OTel/tracing that needs consideration
  • If monorepo: which service(s) you propose to instrument
  • If the app uses LLM tool use / function calling: note that you will add manual CHAIN + TOOL spans so each tool call appears in the trace with input/output (avoids sparse traces).
STOP. Present your analysis and wait for user confirmation before proceeding to Phase 2.

Integration routing and docs


The canonical list of supported integrations and doc URLs is in the Agent Setup Prompt. Use it to map detected signals to implementation docs.
Fetch the matched doc pages from the full routing table in PROMPT.md for exact installation and code snippets. Use llms.txt as a fallback for doc discovery if needed.

Phase 2: Implementation


Proceed only after the user confirms the Phase 1 analysis.

Steps


  1. Fetch integration docs — Read the matched doc URLs and follow their installation and instrumentation steps.
  2. Install packages using the detected package manager before writing code:
    • Python: `pip install arize-otel` plus `openinference-instrumentation-{name}` (hyphens in the package name; underscores in the import, e.g. `openinference.instrumentation.llama_index`).
    • TypeScript/JavaScript: `@opentelemetry/sdk-trace-node` plus the relevant `@arizeai/openinference-*` package.
    • Java: the OpenTelemetry SDK plus `openinference-instrumentation-*` in `pom.xml` or `build.gradle`.
  3. Credentials — The user needs an Arize Space ID and API Key from Space API Keys. Set them as `ARIZE_SPACE_ID` and `ARIZE_API_KEY`.
  4. Centralized instrumentation — Create a single module (e.g. `instrumentation.py`, `instrumentation.ts`) and initialize tracing before any LLM client is created.
  5. Existing OTel — If there is already a TracerProvider, add Arize as an additional exporter (e.g. a BatchSpanProcessor with the Arize OTLP exporter). Do not replace the existing setup unless the user asks.
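Step 4 (centralized instrumentation) might look like this in Python. A minimal sketch: it assumes the OpenAI provider was detected in Phase 1 (substitute the instrumentors matched by the routing table) and `"my-app"` is a placeholder project name:

```python
# Sketch of a centralized instrumentation module (e.g. instrumentation.py).
# Fails gracefully when credentials are missing, per the implementation rules.
import os

def setup_tracing(project_name="my-app"):
    space_id = os.environ.get("ARIZE_SPACE_ID")
    api_key = os.environ.get("ARIZE_API_KEY")
    if not space_id or not api_key:
        # Warn, do not crash: the app should still run without tracing.
        print("WARNING: ARIZE_SPACE_ID / ARIZE_API_KEY not set; tracing disabled.")
        return None

    # Deferred imports so the app can still start if tracing deps are absent.
    from arize.otel import register
    from openinference.instrumentation.openai import OpenAIInstrumentor

    # register() sets the required project-name resource attribute for us.
    tracer_provider = register(
        space_id=space_id,
        api_key=api_key,
        project_name=project_name,
    )
    OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)
    return tracer_provider
```

Call `setup_tracing()` at the top of the application entrypoint, before any LLM client is constructed.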

Implementation rules


  • Use auto-instrumentation first; manual spans only when needed.
  • Fail gracefully if env vars are missing (warn, do not crash).
  • Import order: register tracer → attach instrumentors → then create LLM clients.
  • Project name attribute (required): Arize rejects spans with HTTP 500 if the project name is missing — `service.name` alone is not accepted. Set it as a resource attribute on the TracerProvider (recommended — one place, applies to all spans). Python: `register(project_name="my-app")` handles it automatically (sets `"openinference.project.name"` on the resource). TypeScript: Arize accepts both `"model_id"` (shown in the official TS quickstart) and `"openinference.project.name"` via `SEMRESATTRS_PROJECT_NAME` from `@arizeai/openinference-semantic-conventions` (shown in the manual instrumentation docs) — both work. For routing spans to different projects in Python, use `set_routing_context(space_id=..., project_name=...)` from `arize.otel`.
  • CLI/script apps — flush before exit: `provider.shutdown()` (TS) / `provider.force_flush()` then `provider.shutdown()` (Python) must be called before the process exits, otherwise async OTLP exports are dropped and no traces appear.
  • When the app has tool/function execution: add manual CHAIN + TOOL spans (see Enriching traces below) so the trace tree shows each tool call and its result — otherwise traces will look sparse (only LLM API spans, no tool input/output).
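The flush-before-exit rule for Python CLI apps can be wrapped in a small helper — a sketch, where `tracer_provider` is whatever `register()` returned (an OpenTelemetry SDK TracerProvider) and `flush_and_shutdown` is a hypothetical name:

```python
# Call just before the process exits so buffered OTLP exports are sent.
def flush_and_shutdown(tracer_provider):
    if tracer_provider is None:  # tracing was disabled; nothing to flush
        return
    tracer_provider.force_flush()  # push any batched spans to the exporter
    tracer_provider.shutdown()     # stop processors and release resources
```

Invoke it at the end of `main()` (or register it with `atexit`) so it runs on every exit path.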

Enriching traces: manual spans for tool use and agent loops


Why doesn't the auto-instrumentor do this?


Provider instrumentors (Anthropic, OpenAI, etc.) only wrap the LLM client — the code that sends HTTP requests and receives responses. They see:
  • One span per API call: request (messages, system prompt, tools) and response (text, tool_use blocks, etc.).
They cannot see what happens inside your application after the response:
  • Tool execution — Your code parses the response, calls `run_tool("check_loan_eligibility", {...})`, and gets a result. That runs in your process; the instrumentor has no hook into your `run_tool()` or the actual tool output. The next API call (sending the tool result back) is just another `messages.create` span — the instrumentor doesn't know that the message content is a tool result or what the tool returned.
  • Agent/chain boundary — The idea of "one user turn → multiple LLM calls + tool calls" is an application-level concept. The instrumentor only sees separate API calls; it doesn't know they belong to the same logical "run_agent" run.
So TOOL and CHAIN spans have to be added manually (or by a framework instrumentor like LangChain/LangGraph that knows about tools and chains). Once you add them, they appear in the same trace as the LLM spans because they use the same TracerProvider.

To avoid sparse traces where tool inputs/outputs are missing:
  1. Detect agent/tool patterns: a loop that calls the LLM, then runs one or more tools (by name + arguments), then calls the LLM again with tool results.
  2. Add manual spans using the same TracerProvider (e.g. `opentelemetry.trace.get_tracer(...)` after `register()`):
    • CHAIN span — Wrap the full agent run (e.g. `run_agent`): set `openinference.span.kind` = `"CHAIN"`, `input.value` = user message, `output.value` = final reply.
    • TOOL span — Wrap each tool invocation: set `openinference.span.kind` = `"TOOL"`, `input.value` = JSON of arguments, `output.value` = JSON of result. Use the tool name as the span name (e.g. `check_loan_eligibility`).
OpenInference attributes (use these so Arize shows spans correctly):

| Attribute | Use |
| --- | --- |
| `openinference.span.kind` | `"CHAIN"` or `"TOOL"` |
| `input.value` | string (e.g. user message or JSON of tool args) |
| `output.value` | string (e.g. final reply or JSON of tool result) |
Python pattern: Get the global tracer (same provider as Arize), then use context managers so tool spans are children of the CHAIN span and appear in the same trace as the LLM spans:

```python
from opentelemetry.trace import get_tracer

tracer = get_tracer("my-app", "1.0.0")
```

In your agent entrypoint:


```python
with tracer.start_as_current_span("run_agent") as chain_span:
    chain_span.set_attribute("openinference.span.kind", "CHAIN")
    chain_span.set_attribute("input.value", user_message)
    # ... LLM call ...
    for tool_use in tool_uses:
        with tracer.start_as_current_span(tool_use["name"]) as tool_span:
            tool_span.set_attribute("openinference.span.kind", "TOOL")
            tool_span.set_attribute("input.value", json.dumps(tool_use["input"]))
            result = run_tool(tool_use["name"], tool_use["input"])
            tool_span.set_attribute("output.value", result)
        # ... append tool result to messages, call LLM again ...
    chain_span.set_attribute("output.value", final_reply)
```

See [Manual instrumentation](https://arize.com/docs/ax/observe/tracing/setup/manual-instrumentation) for more span kinds and attributes.


Verification


After implementation:
  1. Run the application and trigger at least one LLM call.
  2. Use the `arize-trace` skill to confirm traces arrived. If empty, retry shortly. Verify spans have the expected `openinference.span.kind`, `input.value`/`output.value`, and parent-child relationships.
  3. If no traces: verify `ARIZE_SPACE_ID` and `ARIZE_API_KEY`, ensure the tracer is initialized before instrumentors and clients, and check connectivity to `otlp.arize.com:443`; for debugging, set `GRPC_VERBOSITY=debug` or pass `log_to_console=True` to `register()`. Common gotchas: (a) a missing project-name resource attribute causes HTTP 500 rejections — `service.name` alone is not enough; Python: pass `project_name` to `register()`; TypeScript: set `"model_id"` or `SEMRESATTRS_PROJECT_NAME` on the resource; (b) CLI/script processes exit before OTLP exports flush — call `provider.force_flush()` then `provider.shutdown()` before exit.
  4. If the app uses tools: confirm CHAIN and TOOL spans appear with `input.value`/`output.value` so tool calls and results are visible.
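The credential and connectivity checks in step 3 can be scripted as a quick diagnostic — a hypothetical helper (`diagnose` is not part of the skill), using a plain TCP connect to rule out DNS/firewall issues before debugging the exporter itself:

```python
# Quick pre-flight check for verification step 3: credentials set, endpoint reachable.
import os
import socket

def diagnose(endpoint="otlp.arize.com", port=443):
    problems = []
    for var in ("ARIZE_SPACE_ID", "ARIZE_API_KEY"):
        if not os.environ.get(var):
            problems.append(f"missing env var {var}")
    try:
        # A successful TCP connect rules out DNS and firewall problems;
        # it does not validate the API key itself.
        with socket.create_connection((endpoint, port), timeout=5):
            pass
    except OSError as exc:
        problems.append(f"cannot reach {endpoint}:{port} ({exc})")
    return problems
```

An empty return value means the basics look fine and the issue is more likely init order or a missing flush.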

Leveraging the Tracing Assistant (MCP)


For deeper instrumentation guidance inside the IDE, the user can enable:
  • Arize AX Tracing Assistant MCP — instrumentation guides, framework examples, and support. In Cursor: Settings → MCP → Add and use:

    ```json
    "arize-tracing-assistant": {
      "command": "uvx",
      "args": ["arize-tracing-assistant@latest"]
    }
    ```

  • Arize AX Docs MCP — searchable docs. In Cursor:

    ```json
    "arize-ax-docs": {
      "url": "https://arize.com/docs/mcp"
    }
    ```
Then the user can ask things like: "Instrument this app using Arize AX", "Can you use manual instrumentation so I have more control over my traces?", "How can I redact sensitive information from my spans?"
See the full setup at Agent-Assisted Tracing Setup.

Reference links


| Resource | URL |
| --- | --- |
| Agent-Assisted Tracing Setup | https://arize.com/docs/ax/alyx/tracing-assistant |
| Agent Setup Prompt (full routing + phases) | https://arize.com/docs/PROMPT.md |
| Arize AX Docs | https://arize.com/docs/ax |
| Full integration list | https://arize.com/docs/ax/integrations |
| Doc index (llms.txt) | https://arize.com/docs/llms.txt |