dd-apm

Datadog APM

Distributed tracing, service maps, and performance analysis.

Requirements

Install Datadog Labs Pup with:

```bash
go install github.com/datadog-labs/pup@latest
```

Quick Start

```bash
pup auth login
pup apm services list
pup apm traces list --service api-gateway --duration 1h
```

Services

List Services

```bash
pup apm services list
pup apm services list --env production
```

Service Details

```bash
pup apm services get api-gateway --json
```

Service Map

```bash
# View dependencies
pup apm service-map --service api-gateway --json
```

Traces

Search Traces

```bash
# By service
pup apm traces list --service api-gateway --duration 1h

# Errors only
pup apm traces list --service api-gateway --status error

# Slow traces (>1s)
pup apm traces list --service api-gateway --min-duration 1000ms

# With specific tag
pup apm traces list --query "@http.url:/api/users"
```

Get Trace Detail

```bash
pup apm traces get <trace_id> --json
```

Key Metrics

| Metric | What It Measures |
|--------|------------------|
| `trace.http.request.hits` | Request count |
| `trace.http.request.duration` | Latency |
| `trace.http.request.errors` | Error count |
| `trace.http.request.apdex` | User satisfaction |
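
The "user satisfaction" figure behind an Apdex metric is conventionally computed as (satisfied + tolerating / 2) / total, where requests are bucketed against a latency threshold. The sketch below only illustrates that arithmetic; `apdex` is a hypothetical helper and not a pup command, and the counts passed to it are made-up inputs.

```bash
# Hypothetical helper (not part of pup): standard Apdex score from request counts.
# Arguments: satisfied count, tolerating count, total count.
apdex() {
  local satisfied=$1 tolerating=$2 total=$3
  awk -v s="$satisfied" -v t="$tolerating" -v n="$total" \
    'BEGIN {
       if (n == 0) { print "n/a"; exit }   # no traffic: score is undefined
       printf "%.2f\n", (s + t / 2) / n    # tolerating requests count as half
     }'
}

apdex 80 10 100   # prints 0.85
```

A score of 1.00 means every request met the threshold; frustrated requests (total minus satisfied minus tolerating) contribute nothing to the numerator.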

⚠️ Trace Sampling

Not all traces are kept. Understand sampling:

| Mode | What's Kept |
|------|-------------|
| Head-based | Random % at start |
| Error/Slow | All errors, slow traces |
| Retention | What's indexed (billed) |

```bash
# Check retention filters
pup apm retention-filters list
```

Trace Retention Costs

| Retention | Cost |
|-----------|------|
| Indexed spans | $$$ per million |
| Ingested spans | $ per million |

**Best practice:** Only index what you need for search.

Service Level Objectives

Link APM to SLOs:

```bash
pup slos create \
  --name "API Latency p99 < 200ms" \
  --type metric \
  --numerator "sum:trace.http.request.hits{service:api,@duration:<200000000}" \
  --denominator "sum:trace.http.request.hits{service:api}" \
  --target 99.0
```
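
The `@duration` value in the numerator appears to be expressed in nanoseconds, so `200000000` lines up with the 200 ms threshold in the SLO name. A minimal sketch of that conversion, assuming nanosecond units; `ms_to_ns` is a hypothetical helper and not part of pup:

```bash
# Hypothetical helper (not part of pup): convert milliseconds to nanoseconds
# so a latency threshold can be written into an @duration filter.
ms_to_ns() {
  echo $(( $1 * 1000000 ))
}

ms_to_ns 200   # prints 200000000
```

Generating the threshold this way keeps the SLO name and the filter value from drifting apart when the target latency changes.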

Common Queries

| Goal | Query |
|------|-------|
| Slowest endpoints | `avg:trace.http.request.duration{*} by {resource_name}` |
| Error rate | `sum:trace.http.request.errors{*} / sum:trace.http.request.hits{*}` |
| Throughput | `sum:trace.http.request.hits{*}.as_rate()` |
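
The error-rate query divides errors by hits, which yields a fraction rather than a percentage. The sketch below shows the same arithmetic on two raw counts, scaled to a percentage; `error_rate_pct` is a hypothetical helper and the counts are illustrative, not output from pup.

```bash
# Hypothetical helper (not part of pup): error rate as a percentage.
# Arguments: error count, hit count.
error_rate_pct() {
  local errors=$1 hits=$2
  if [ "$hits" -eq 0 ]; then
    echo "0.00"   # no traffic: report zero instead of dividing by zero
    return
  fi
  awk -v e="$errors" -v h="$hits" 'BEGIN { printf "%.2f\n", 100 * e / h }'
}

error_rate_pct 25 1000   # prints 2.50
```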

Troubleshooting

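| Problem | Fix |
|---------|-----|
| No traces | Check that `ddtrace` is installed and `DD_TRACE_ENABLED=true` |
| Missing service | Verify the `DD_SERVICE` env var is set |
| Traces not linked | Check that trace headers are propagated between services |
| High cardinality | Don't tag with `user_id` or `request_id` |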
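
The first two rows can be checked from the shell before digging deeper. The sketch below is a hypothetical pre-flight function, not part of pup or ddtrace; it assumes that an unset `DD_TRACE_ENABLED` means tracing is on and only flags an explicit `false`.

```bash
# Hypothetical pre-flight check (not part of pup or ddtrace).
# Prints one line per problem and returns the number of problems found.
check_dd_env() {
  local problems=0
  if [ -z "${DD_SERVICE:-}" ]; then
    echo "DD_SERVICE is not set"
    problems=$((problems + 1))
  fi
  # Assumption: tracing is enabled unless explicitly set to "false".
  if [ "${DD_TRACE_ENABLED:-true}" = "false" ]; then
    echo "DD_TRACE_ENABLED is false; tracing is disabled"
    problems=$((problems + 1))
  fi
  return "$problems"
}
```

Run `check_dd_env` in the same environment as the instrumented process; a zero exit status means neither problem was detected.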
