
# OTel Onboarding Style


Use native OpenTelemetry APIs. Do not invent helper APIs.

In TypeScript/JavaScript, use the published `@superlog/otel-helpers` `withSpan` helper for bounded business spans, and add `@superlog/otel-helpers` to `package.json` when it is not already present. This is required when the package can be installed. `withSpan` is the intended replacement for expanding a whole function into `tracer.startActiveSpan(...)` plus `try`/`catch`/`finally`. Do not use helpers to wrap provider SDK calls that OpenInference/provider instrumentation can observe directly.
Do:

```ts
const tracer = trace.getTracer("mugline.api");
const meter = metrics.getMeter("mugline.api");
const ordersSubmitted = meter.createCounter("orders.submitted");

await withSpan("order.submit", async (span) => {
  span.setAttributes({
    "tenant.id": tenantId,
    "order.id": orderId,
    outcome: "success",
  });
  ordersSubmitted.add(1, { "tenant.id": tenantId, outcome: "success" });
}, { tracer });
```
Do not:

```ts
await sendSuperlogSpan(...);
recordCounter(...);
withTelemetry(...);
```

## Naming


- Files/functions are provider-neutral: `telemetry.ts`, `observability.ts`, `initTelemetry()`, `initObservability()`.
- The word Superlog belongs only in endpoint/key setup comments or PR instructions.
- Span names are conventional and low-cardinality: `checkout.process`, `voice.session`, `llm.generate_copy`.
- Prefer semantic product-operation span names over provider transport names: `llm.generate_copy` or `llm.voice_response` is usually more useful than `llm.anthropic.messages.create`.

## Endpoint and key


Inline the endpoint and the project's ingest key directly in the bootstrap source: don't read from `OTEL_EXPORTER_OTLP_*` env vars and don't write `.env` files. The Superlog ingest key is project-scoped and write-only (shaped like a Sentry DSN), so source-level configuration is the right default; env-var indirection only adds deploy-time failure modes.

```text
SUPERLOG_ENDPOINT = "https://intake.superlog.sh"
SUPERLOG_KEY = "superlog_live_…"   # or "SUPERLOG_TEST" while pairing
```
Do not invent legacy names such as `SUPERLOG_API_KEY` or `SUPERLOG_INTAKE_URL`, even as placeholder text in docs or comments. While pairing is in flight, use the literal `SUPERLOG_TEST` sentinel: Superlog's ingest accepts it without forwarding events anywhere, so the bootstrap exercises the full code path before the real key arrives.
Pass the inline values to the SDK explicitly via the exporter constructor's `endpoint`/`headers` options. Do not configure the SDK off implicit env-var reads.

If the repo can call telemetry init from multiple paths, guard provider/exporter setup so repeated imports, tests, reloads, or framework callbacks do not install duplicate processors or log handlers. For a single-entrypoint app that starts cleanly, keep this simple.

Include standard resource attributes when values are available: `service.name`, `service.version`, `deployment.environment.name`, and the VCS attributes below.
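The multi-path guard can be sketched as a module-level flag. `initTelemetry` and the `start` callback are illustrative names, not a published API; the real callback body would construct the provider and exporter with the inline endpoint/key as described above.

```ts
// Module-level flag: provider/exporter setup must run at most once per
// process, no matter how many import paths, tests, reloads, or framework
// callbacks reach it.
let telemetryStarted = false;

export function initTelemetry(start: () => void): void {
  if (telemetryStarted) return; // repeated calls are harmless no-ops
  telemetryStarted = true;
  start(); // installs processors/log handlers exactly once
}
```

A second `initTelemetry(...)` call returns immediately, so duplicate processors or log handlers are never installed.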

## VCS resource attributes


Set `vcs.repository.url.full` on the OTel resource for every instrumented service. The value is the canonical https URL of the repo (e.g. `https://github.com/acme/api`), the same URL the user would paste in a browser; not an SSH URL, not a local working-tree path. This is the important one: it lets Superlog link telemetry back to the source of truth. It is fine to hardcode this string alongside `service.name` in the SDK init; if a build env already exposes the slug (e.g. `VERCEL_GIT_REPO_OWNER` + `VERCEL_GIT_REPO_SLUG`, `RAILWAY_GIT_REPO_OWNER` + `RAILWAY_GIT_REPO_NAME`), prefer reading from env so a fork or rename doesn't drift.
Also set `vcs.ref.head.revision` (the commit SHA) on a best-effort basis. Read it from whatever env var the runtime/build platform already injects: `VERCEL_GIT_COMMIT_SHA`, `RAILWAY_GIT_COMMIT_SHA`, `GITHUB_SHA`, `SOURCE_COMMIT`, `GIT_COMMIT`, `HEROKU_SLUG_COMMIT`, etc. Do not shell out to `git` from the running process; many production images do not have git or a working tree. If no env source is available, omit the attribute. Skipping the SHA is fine, skipping the URL is not.
Use `vcs.repository.url.full` and `vcs.ref.head.revision` exactly as named; these are the OTel semantic-convention keys. Do not invent parallel attributes like `git.repo` or `app.repo_url`.
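A sketch of assembling these two attributes, assuming a Vercel-style slug env plus the SHA sources listed above. The function name is illustrative, and the hardcoded fallback URL is this doc's example repo; call the function with `process.env` at SDK init and merge the result into the resource.

```ts
type Env = Record<string, string | undefined>;

// Platform-injected SHA sources, checked in order; extend for your platform.
const SHA_ENV_VARS = [
  "VERCEL_GIT_COMMIT_SHA",
  "RAILWAY_GIT_COMMIT_SHA",
  "GITHUB_SHA",
  "SOURCE_COMMIT",
  "GIT_COMMIT",
  "HEROKU_SLUG_COMMIT",
];

export function vcsResourceAttributes(env: Env): Record<string, string> {
  // Prefer the build env's slug so a fork or rename doesn't drift;
  // fall back to the hardcoded canonical https URL.
  const owner = env["VERCEL_GIT_REPO_OWNER"];
  const slug = env["VERCEL_GIT_REPO_SLUG"];
  const attrs: Record<string, string> = {
    "vcs.repository.url.full":
      owner && slug
        ? `https://github.com/${owner}/${slug}`
        : "https://github.com/acme/api",
  };

  // Best-effort SHA: read only from env, never shell out to git;
  // omit the attribute entirely when no env source exists.
  const sha = SHA_ENV_VARS.map((name) => env[name]).find(Boolean);
  if (sha) attrs["vcs.ref.head.revision"] = sha;
  return attrs;
}
```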

## Signals


- Traces: all critical operations have spans with relevant attributes.
- Logs: structured, concise, OTLP-forwarded, and trace/span-correlated.
- Metrics: critical operations have low-cardinality counters/histograms.
- Tenant/org/project information is included where available.
- Do not put raw user ids or request ids in metric tags unless the repo already treats them as bounded tenant-like ids.

## LLM Metrics


If the app uses LLMs, first look for provider instrumentation that already captures model/provider/token/error spans. In JavaScript/TypeScript, prefer OpenInference packages such as `@arizeai/openinference-instrumentation-anthropic` for supported SDKs. Keep the real provider call native and readable.

```ts
const response = await client.messages.create({
  model,
  max_tokens: 100,
  messages,
});
```
Every provider/call site still needs enough telemetry to answer usage questions. Let provider instrumentation own model/provider/token spans where it supports them. Do not duplicate those attributes at every application call site, and do not put provider pricing tables or cost math in product handlers. Superlog computes estimated LLM cost centrally in the UI/query layer from captured provider/model/token data.

```ts
llmInputTokens.add(inputTokens, {
  "tenant.id": tenantId,
  "gen_ai.provider.name": "anthropic",
  "gen_ai.request.model": model,
  "app.gen_ai.use_case": "voice.initial_greeting",
  "app.gen_ai.call_site": "_callMugCopyLlm",
  outcome: "success",
});
```
Use counters for additive totals only when provider instrumentation cannot capture token usage:

- `llm.tokens.input`
- `llm.tokens.output`

Token counters use `unit="tokens"` or the SDK equivalent. If OpenInference/provider instrumentation already captures token usage, do not duplicate token counters just to mirror it. Do not add `llm.cost_usd` or equivalent app-side cost metrics for normal LLM calls; cost belongs in Superlog's central pricing layer.
Use histograms for latency/duration distributions.
Prefer current `gen_ai.*` semantic-convention-style attribute names for LLM provider/model/token attributes, plus `app.gen_ai.*` for bounded application dimensions such as use case and call site. Avoid inventing parallel `llm.*` attributes unless the repo already standardizes on them.
If the app has OpenAI, Anthropic, and Google callers, instrument all three.
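One way to keep these attribute names consistent across several provider call sites is a small builder. The function name and option names here are illustrative, not part of any published API; the keys and example values come from the conventions above.

```ts
// Bounded, low-cardinality attribute set for LLM metrics.
// gen_ai.* keys follow the OTel semantic conventions; app.gen_ai.* keys
// carry bounded application dimensions (use case, call site).
export function llmMetricAttributes(opts: {
  tenantId: string;
  provider: string;  // e.g. "anthropic"
  model: string;     // the requested model id
  useCase: string;   // e.g. "voice.initial_greeting"
  callSite: string;  // e.g. "_callMugCopyLlm"
  outcome: "success" | "error";
}): Record<string, string> {
  return {
    "tenant.id": opts.tenantId,
    "gen_ai.provider.name": opts.provider,
    "gen_ai.request.model": opts.model,
    "app.gen_ai.use_case": opts.useCase,
    "app.gen_ai.call_site": opts.callSite,
    outcome: opts.outcome,
  };
}
```

A caller then writes `llmInputTokens.add(inputTokens, llmMetricAttributes({ ... }))`, so every OpenAI, Anthropic, and Google call site emits the same keys.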

## Smoke Checks


Add a durable smoke path when the repo has a natural place for it: README, TESTING guide, script, npm command, pytest, or checked-in command note.
The smoke should explicitly prove startup/import with OTel env vars present so provider setup, exporter construction, log bridging, and framework instrumentation initialize without errors. Then, where practical, exercise an actual instrumented span/log/metric or OTLP export attempt. A generic health route only proves the server responds; prefer an operation that crosses the instrumentation you added.