Langfuse Observability


Instrument LLM applications with Langfuse tracing, following best practices and tailored to your use case.

When to Use


  • Setting up Langfuse in a new project
  • Auditing existing Langfuse instrumentation
  • Adding observability to LLM calls

Workflow


1. Assess Current State


Check the project:
  • Is Langfuse SDK installed?
  • What LLM frameworks are used? (OpenAI SDK, LangChain, LlamaIndex, Vercel AI SDK, etc.)
  • Is there existing instrumentation?
No integration yet: Set up Langfuse using a framework integration if available. Integrations capture more context automatically and require less code than manual instrumentation.
Integration exists: Audit against baseline requirements below.
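Whichever path applies, the SDK needs credentials before anything is traced. A minimal setup sketch — the key values below are placeholders; copy the real ones from your Langfuse project settings:

```shell
# Install the SDK, plus the OpenAI client if you use the drop-in integration
pip install langfuse openai

# Langfuse reads these when it initializes
export LANGFUSE_PUBLIC_KEY="pk-lf-..."   # placeholder; see Project Settings
export LANGFUSE_SECRET_KEY="sk-lf-..."   # placeholder
export LANGFUSE_HOST="https://cloud.langfuse.com"  # or your self-hosted URL
```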

2. Verify Baseline Requirements


Every trace should have these fundamentals:
| Requirement | Check | Why |
| --- | --- | --- |
| Model name | Is the LLM model captured? | Enables model comparison and filtering |
| Token usage | Are input/output tokens tracked? | Enables automatic cost calculation |
| Good trace names | Are names descriptive? (`chat-response`, not `trace-1`) | Makes traces findable and filterable |
| Span hierarchy | Are multi-step operations nested properly? | Shows which step is slow or failing |
| Correct observation types | Are generations marked as generations? | Enables model-specific analytics |
| Sensitive data masked | Is PII/confidential data excluded or masked? | Prevents data leakage |
| Trace input/output | Does the trace capture the full input data and the final result as output? | Enables debugging and understanding what was processed |
Framework integrations (OpenAI, LangChain, etc.) handle model name, tokens, and observation types automatically. Prefer integrations over manual instrumentation.
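One baseline item, sensitive-data masking, is framework-independent: data must be scrubbed before it reaches the tracer. A minimal sketch of a regex-based masker you could apply to inputs and outputs before tracing — the patterns and the `mask_pii` helper are illustrative, not part of the Langfuse SDK:

```python
import re

# Illustrative patterns only; real deployments need patterns tuned to their data.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def mask_pii(text: str) -> str:
    """Replace recognized PII with a typed placeholder before tracing."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text

print(mask_pii("Contact jane.doe@example.com, SSN 123-45-6789"))
# → Contact <email>, SSN <ssn>
```

The Langfuse SDK also supports registering a masking callback at client initialization so masking is applied consistently; see the masking docs for the exact parameter.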

3. Explore Traces First


Once baseline instrumentation is working, encourage the user to explore their traces in the Langfuse UI before adding more context:
"Your traces are now appearing in Langfuse. Take a look at a few of them—see what data is being captured, what's useful, and what's missing. This will help us decide what additional context to add."
This helps the user:
  • Understand what they're already getting
  • Form opinions about what's missing
  • Ask better questions about what they need

4. Discover Additional Context Needs


Determine what additional instrumentation would be valuable. Infer from the code when possible; ask the user only when it's unclear.
Infer from code:
| If you see in code... | Infer | Suggest |
| --- | --- | --- |
| Conversation history, chat endpoints, message arrays | Multi-turn app | `session_id` |
| User authentication, `user_id` variables | User-aware app | `user_id` on traces |
| Multiple distinct endpoints/features | Multi-feature app | `feature` tag |
| Customer/tenant identifiers | Multi-tenant app | `customer_id` or tier tag |
| Feedback collection, ratings | Has user feedback | Capture as scores |
Only ask when not obvious from code:
  • "How do you know when a response is good vs bad?" → Determines scoring approach
  • "What would you want to filter by in a dashboard?" → Surfaces non-obvious tags
  • "Are there different user segments you'd want to compare?" → Customer tiers, plans, etc.
Additions and their value:
| Addition | Why | Docs |
| --- | --- | --- |
| `session_id` | Groups conversations together | https://langfuse.com/docs/tracing-features/sessions |
| `user_id` | Enables user filtering and cost attribution | https://langfuse.com/docs/tracing-features/users |
| User feedback score | Enables quality filtering and trends | https://langfuse.com/docs/scores/overview |
| `feature` tag | Per-feature analytics | https://langfuse.com/docs/tracing-features/tags |
| `customer_tier` tag | Cost/quality breakdown by segment | https://langfuse.com/docs/tracing-features/tags |
These are NOT baseline requirements—only add what's relevant based on inference or user input.

5. Guide to UI


After adding context, point users to relevant UI features:
  • Traces view: See individual requests
  • Sessions view: See grouped conversations (if session_id added)
  • Dashboard: Build filtered views using tags
  • Scores: Filter by quality metrics

Framework Integrations


Prefer these over manual instrumentation:
| Framework | Integration | Docs |
| --- | --- | --- |
| OpenAI SDK | Drop-in replacement | https://langfuse.com/docs/integrations/openai |
| LangChain | Callback handler | https://langfuse.com/docs/integrations/langchain |
| LlamaIndex | Callback handler | https://langfuse.com/docs/integrations/llama-index |
| Vercel AI SDK | OpenTelemetry exporter | https://langfuse.com/docs/integrations/vercel-ai-sdk |
| LiteLLM | Callback or proxy | https://langfuse.com/docs/integrations/litellm |

Always Explain Why


When suggesting additions, explain the user benefit:
"I recommend adding session_id to your traces.

Why: This groups messages from the same conversation together.
You'll be able to see full conversation flows in the Sessions view,
making it much easier to debug multi-turn interactions.

Learn more: https://langfuse.com/docs/tracing-features/sessions"

Common Mistakes


| Mistake | Problem | Fix |
| --- | --- | --- |
| No `flush()` in scripts | Traces never sent | Call `langfuse.flush()` before exit |
| Flat traces | Can't see which step failed | Use nested spans for distinct steps |
| Generic trace names | Hard to filter | Use descriptive names: `chat-response`, `doc-summary` |
| Logging sensitive data | Data leakage risk | Mask PII before tracing |
| Manual instrumentation when integration exists | More code, less context | Use framework integration |
| Langfuse import before env vars loaded | Langfuse initializes with missing/wrong credentials | Import Langfuse AFTER loading environment variables (e.g., after `load_dotenv()`) |
| Wrong import order with OpenAI | Langfuse can't patch the OpenAI client | Import Langfuse and call its setup BEFORE importing the OpenAI client |
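The two import-order mistakes are easy to miss because nothing crashes; you simply get a client initialized without credentials, and traces silently go nowhere. A pure-Python sketch of the failure mode (no Langfuse dependency; `FakeTracingClient` is a stand-in for any SDK that reads credentials at construction time):

```python
import os

os.environ.pop("LANGFUSE_PUBLIC_KEY", None)  # start from a clean slate

class FakeTracingClient:
    """Stand-in for an SDK client that reads credentials when created."""
    def __init__(self):
        self.public_key = os.environ.get("LANGFUSE_PUBLIC_KEY")

too_early = FakeTracingClient()                    # created before env loaded
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-demo"   # what load_dotenv() does
in_order = FakeTracingClient()                     # created after env loaded

print(too_early.public_key)  # None  -> traces silently go nowhere
print(in_order.public_key)   # pk-lf-demo
```

The same ordering logic explains the OpenAI row: the drop-in integration patches the OpenAI client at import time, so Langfuse must be imported and set up first.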