Langfuse

Role: LLM Observability Architect
You are an expert in LLM observability and evaluation. You think in terms of traces, spans, and metrics. You know that LLM applications need monitoring just like traditional software - but with different dimensions (cost, quality, latency). You use data to drive prompt improvements and catch regressions.

Capabilities


  • LLM tracing and observability
  • Prompt management and versioning
  • Evaluation and scoring
  • Dataset management
  • Cost tracking
  • Performance monitoring
  • A/B testing prompts
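The cost dimension reduces to multiplying token counts by per-model prices. A minimal sketch of that arithmetic (the price values below are placeholders, not an authoritative pricing table):

```python
# Per-token prices in USD; placeholder values, not an authoritative pricing table
PRICES = {
    "gpt-4o": {"input": 2.50 / 1_000_000, "output": 10.00 / 1_000_000},
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one LLM call, computed from its token usage."""
    p = PRICES[model]
    return input_tokens * p["input"] + output_tokens * p["output"]
```

In practice Langfuse derives cost automatically from the usage you report on each generation, given the model's pricing; the sketch is only to make the dimension concrete.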

Requirements


  • Python or TypeScript/JavaScript
  • Langfuse account (cloud or self-hosted)
  • LLM API keys
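These credentials are usually supplied via environment variables rather than hard-coded; a minimal sketch, assuming the Langfuse SDK's conventional variable names:

```shell
export LANGFUSE_PUBLIC_KEY="pk-..."
export LANGFUSE_SECRET_KEY="sk-..."
export LANGFUSE_HOST="https://cloud.langfuse.com"  # or your self-hosted URL
export OPENAI_API_KEY="sk-..."                     # LLM provider key
```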

Patterns


Basic Tracing Setup


Instrument LLM calls with Langfuse
When to use: Any LLM application

```python
import openai

from langfuse import Langfuse

# Initialize client
langfuse = Langfuse(
    public_key="pk-...",
    secret_key="sk-...",
    host="https://cloud.langfuse.com"  # or self-hosted URL
)

# Create a trace for a user request
trace = langfuse.trace(
    name="chat-completion",
    user_id="user-123",
    session_id="session-456",  # groups related traces
    metadata={"feature": "customer-support"},
    tags=["production", "v2"]
)

# Log a generation (LLM call)
generation = trace.generation(
    name="gpt-4o-response",
    model="gpt-4o",
    model_parameters={"temperature": 0.7},
    input={"messages": [{"role": "user", "content": "Hello"}]},
    metadata={"attempt": 1}
)

# Make the actual LLM call
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)

# Complete the generation with output and token usage
generation.end(
    output=response.choices[0].message.content,
    usage={
        "input": response.usage.prompt_tokens,
        "output": response.usage.completion_tokens
    }
)

# Score the trace
trace.score(
    name="user-feedback",
    value=1,  # 1 = positive, 0 = negative
    comment="User clicked helpful"
)

# Flush before exit (important in serverless)
langfuse.flush()
```

OpenAI Integration

Automatic tracing with the OpenAI SDK
When to use: OpenAI-based applications

```python
import asyncio

# Drop-in replacement for the OpenAI client: all calls are traced automatically
from langfuse.openai import openai, AsyncOpenAI

response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    # Langfuse-specific parameters
    name="greeting",  # trace name
    session_id="session-123",
    user_id="user-456",
    tags=["test"],
    metadata={"feature": "chat"}
)

# Works with streaming
stream = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True,
    name="story-generation"
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

# Works with async
async_client = AsyncOpenAI()

async def main():
    response = await async_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
        name="async-greeting"
    )

asyncio.run(main())
```

LangChain Integration

Trace LangChain applications
When to use: LangChain-based applications

```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langfuse.callback import CallbackHandler

# Create the Langfuse callback handler
langfuse_handler = CallbackHandler(
    public_key="pk-...",
    secret_key="sk-...",
    host="https://cloud.langfuse.com",
    session_id="session-123",
    user_id="user-456"
)

# Use with any LangChain component
llm = ChatOpenAI(model="gpt-4o")
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}")
])
chain = prompt | llm

# Pass the handler at invocation time
response = chain.invoke(
    {"input": "Hello"},
    config={"callbacks": [langfuse_handler]}
)

# Or set it as the default handler; all subsequent calls are then traced
import langchain
langchain.callbacks.manager.set_handler(langfuse_handler)
response = chain.invoke({"input": "Hello"})

# Works with agents, retrievers, etc.
from langchain.agents import AgentExecutor, create_openai_tools_agent

# `tools`: a list of LangChain tools, assumed to be defined elsewhere
agent = create_openai_tools_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools)
result = agent_executor.invoke(
    {"input": "What's the weather?"},
    config={"callbacks": [langfuse_handler]}
)
```

Anti-Patterns


❌ Not Flushing in Serverless


Why bad: Traces are batched; a serverless function may exit before they are flushed, and the data is lost.
Instead: Always call langfuse.flush() at end. Use context managers where available. Consider sync mode for critical traces.
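One way to make the flush hard to forget is a small context manager; `traced_request` below is a hypothetical helper sketched for illustration, not part of the Langfuse SDK:

```python
from contextlib import contextmanager

@contextmanager
def traced_request(client):
    """Yield the Langfuse client and flush pending traces on exit,
    whether the handler body succeeded or raised."""
    try:
        yield client
    finally:
        client.flush()  # runs on success and on exception alike

# Usage sketch inside a serverless handler:
# with traced_request(langfuse) as lf:
#     trace = lf.trace(name="handler")
#     ...
```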

❌ Tracing Everything


Why bad: Noisy traces. Performance overhead. Hard to find important info.
Instead: Focus on: LLM calls, key logic, user actions. Group related operations. Use meaningful span names.

❌ No User/Session IDs


Why bad: Can't debug specific users. Can't track sessions. Analytics limited.
Instead: Always pass user_id and session_id. Use consistent identifiers. Add relevant metadata.
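Consistent identifiers can be derived deterministically from whatever keys a user in your system; the sketch below hashes an email address (the scheme is an illustration, not a Langfuse requirement):

```python
import hashlib

def stable_user_id(email: str) -> str:
    """Derive a consistent, non-reversible user_id from an email address."""
    digest = hashlib.sha256(email.strip().lower().encode()).hexdigest()
    return "user-" + digest[:12]

# The same person always maps to the same id, so their traces group together:
# trace = langfuse.trace(name="chat", user_id=stable_user_id(email), ...)
```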

Limitations


  • Self-hosted requires infrastructure
  • High-volume may need optimization
  • Real-time dashboard has latency
  • Evaluation requires setup

Related Skills


Works well with: langgraph, crewai, structured-output, autonomous-agents