See also: `exploring-llm-traces`, `exploring-llm-evaluations` (tools: `posthog:llma-evaluation-*`).

Before you start, confirm:

| Requirement | How to verify |
|---|---|
| (Pattern A) Feature emits `$ai_generation` events with a stable `$ai_trace_id` prefix, e.g. `session-summary:group:` or `replay-search:` | Group recent `$ai_generation` events by trace-ID prefix (query below) |
| (Pattern B) Recent `$ai_generation` events match `ai_product = 'posthog_ai' AND agent_mode = '<mode>'` | Same query, but on `agent_mode` and `supermode`; note some generations have `agent_mode=null`, so also check counts with `agent_mode IS NOT NULL` |
| Events carry a `$session_id` (needed for the replay link in Slack) | After the eval has run once, check `$session_id` on the resulting events |
| User has organisation-level AI data processing approval | Required for LLM-judge evaluations |
| Tool | Purpose |
|---|---|
| `posthog:query-llm-traces-list` | Find sample traces matching the feature's `$ai_trace_id` prefix or `agent_mode` |
| | Inspect a specific trace's contents end-to-end |
| | Verify trace volume, `session_id` coverage, eval result distributions |
| `posthog:llma-evaluation-create` | (often unexposed — UI fallback: LLM analytics → Evaluations → New) Create the LLM-judge eval (disabled at first) |
| `posthog:llma-evaluation-run` | (often unexposed — UI fallback: the eval's detail page has a "Run on event" button) Dry-run the eval against specific generations during prompt iteration |
| `posthog:llma-evaluation-update` | (often unexposed — UI fallback: edit the eval in LLM analytics → Evaluations) Tweak the prompt / enable when ready |
| `posthog:llma-evaluation-summary-create` | (often unexposed — UI fallback: the eval detail page has a "Summarize results" button) After the feed is running, get an AI summary of pass/N/A patterns to validate signal quality |
| `posthog:workflows-list` / `posthog:workflows-get` | (often unexposed — UI: Data pipeline → Workflows) Browse existing workflow configs — useful for cloning an existing feed's structure when setting up a new one. Read-only; no create/update tool is exposed yet, so step 6's Slack workflow setup is UI-only. |
For Pattern A, group recent generations by the first two segments of their trace IDs:

```sql
SELECT
    splitByChar(':', coalesce(properties.$ai_trace_id, ''))[1] AS root,
    splitByChar(':', coalesce(properties.$ai_trace_id, ''))[2] AS subtype,
    count() AS events
FROM events
WHERE timestamp > now() - INTERVAL 3 DAY
  AND event = '$ai_generation'
  AND properties.$ai_trace_id IS NOT NULL
GROUP BY root, subtype
ORDER BY events DESC
LIMIT 25
```

The `coalesce(..., '')` wrapper keeps `splitByChar` safe when a trace ID is missing.

For Pattern B, break recent PostHog AI generations down by mode:

```sql
SELECT
    properties.agent_mode AS agent_mode,
    properties.supermode AS supermode,
    count() AS events,
    count(DISTINCT properties.$ai_trace_id) AS traces
FROM events
WHERE timestamp > now() - INTERVAL 3 DAY
  AND event = '$ai_generation'
  AND properties.ai_product = 'posthog_ai'
GROUP BY agent_mode, supermode
ORDER BY events DESC
LIMIT 20
```

Typical `agent_mode` values are `error_tracking`, `product_analytics`, `sql`, `session_replay`, `flags`, `survey`, and `llm_analytics`; rows with a `null` mode and `supermode='plan'` come from planning turns.
For Pattern A, pull a random sample of matching traces with `posthog:query-llm-traces-list`:
```json
{
  "properties": [
    { "type": "event", "key": "$ai_trace_id", "operator": "icontains", "value": "<your-prefix-here>" }
  ],
  "limit": 10,
  "dateRange": { "date_from": "-2d" },
  "randomOrder": true
}
```

For Pattern B, sample by agent mode instead, again with `posthog:query-llm-traces-list`:

```json
{
  "properties": [
    { "type": "event", "key": "ai_product", "operator": "exact", "value": "posthog_ai" },
    { "type": "event", "key": "agent_mode", "operator": "exact", "value": "<mode-here>" }
  ],
  "limit": 10,
  "dateRange": { "date_from": "-2d" },
  "randomOrder": true
}
```

Keep `randomOrder: true` so `query-llm-traces-list` does not skew the `limit: 10` sample toward the newest traces. Skim each trace end-to-end, and check properties like `$current_url` and `agent_mode` to confirm you matched the right feature.
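To double-check the sample really is your feature and not another surface sharing the filter, it can also help to break the matched generations down by page. This is a sketch using the Pattern A prefix placeholder; under Pattern B, swap the `LIKE` clause for the `ai_product`/`agent_mode` filters:

```sql
-- Where did the matched generations come from?
SELECT properties.$current_url AS url, count() AS n
FROM events
WHERE event = '$ai_generation'
  AND properties.$ai_trace_id LIKE '<your-prefix-here>%'
  AND timestamp > now() - INTERVAL 2 DAY
GROUP BY url
ORDER BY n DESC
LIMIT 10
```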
Prompt template for the LLM judge (fill the bracketed placeholders):

You are analyzing a PostHog [FEATURE NAME] trace to extract its real use case.
Your reasoning text will be posted directly to a Slack channel as a notification.
Write it as a short, ready-to-post message — no preamble, no meta-description.
Step 1 — Classification:
- PASS = this trace is the [feature kind] you care about
- FAIL = a different LLM call or a false match
- N/A = ambiguous from the trace alone
Step 2 — Reasoning (only matters if PASS). Write 2-3 sentences in this exact format:
"[OPENER] [what they targeted/filtered for]. They were
trying to [understand X / debug Y / find Z]. The result surfaced [key pattern
or finding]."
Your output MUST start with the exact phrase "[OPENER]". No other opening is allowed.
Rules:
- No "This is a [feature]..." or "The input contains..." preamble
- No JSON, field names, system-prompt references, or meta-description
- Concrete > generic. "users hitting error tracking for the first time" beats "user behavior"
- If you cannot infer one of the three pieces from the trace, write "(unclear from trace)" in that slot — do not guess

Pick the forced "[OPENER]" phrase per feature:

| Feature / mode | OPENER |
|---|---|
| Session summary (group / single) | "A user ran a group summary on" |
| Replay AI search | |
| PostHog AI in error tracking mode | "A user asked PostHog AI about" |
| PostHog AI in session replay mode | |
| PostHog AI in SQL mode | |

For planning turns (`supermode='plan'`, any `agent_mode`), filter on `agent_mode='<mode>' AND supermode='plan'` and switch from the mode's usual "A user ran"-style opener to "A user asked PostHog AI to plan".
Create the eval disabled (`enabled: false`) with `posthog:llma-evaluation-create`:
```json
{
  "name": "[feature] use case feed",
  "description": "Extracts canonical use cases for [feature] for the #team-[area]-usage Slack feed",
  "evaluation_type": "llm_judge",
  "evaluation_config": {
    "prompt": "<full prompt from step 3>"
  },
  "output_type": "boolean",
  "output_config": { "allows_na": true },
  "model_configuration": {
    "provider": "<provider>",
    "model": "<model>"
  },
  "enabled": false,
  "conditions": {
    "filters": [
      // Pattern A — feature-native trace_id prefix:
      { "key": "$ai_trace_id", "operator": "icontains", "value": "<your-prefix>" }
      // Pattern B — PostHog AI agent mode (use these INSTEAD of the trace_id filter):
      // { "key": "ai_product", "operator": "exact", "value": "posthog_ai" },
      // { "key": "agent_mode", "operator": "exact", "value": "<mode>" }
    ]
  }
}
```

If `llma-evaluation-create` is not exposed, create it in the UI instead (evaluation type: LLM judge). Then dry-run it with `posthog:llma-evaluation-run`:
```json
{
  "evaluationId": "<uuid from create>",
  "target_event_id": "<a $ai_generation event id from step 2>",
  "timestamp": "<ISO timestamp of that event>"
}
```

Inspect the `$ai_evaluation_reasoning` it returns, adjust the prompt with `llma-evaluation-update`, and repeat. Common symptoms:

| Symptom | Fix |
|---|---|
| Reasoning starts with "This is a..." | Strengthen the forced opener instruction; add a counter-example |
| Reasoning is generic ("user behavior", "various patterns") | Add positive examples of concrete phrasing in the prompt |
| Model classifies everything as PASS | Tighten the FAIL definition; add an example of what a non-match looks like |
| Reasoning is too long for Slack | Add a hard sentence cap ("MAX 3 sentences, hard limit") |
When the output is Slack-ready, enable the eval with `posthog:llma-evaluation-update`:

```json
{
  "evaluationId": "<uuid>",
  "enabled": true
}
```

Once enabled, it runs automatically against every matching `$ai_generation` event.
Step 6, the Slack workflow, is UI-only for now; use `posthog:workflows-list` / `posthog:workflows-get` to inspect an existing feed's config as a reference. In Slack, `/invite @PostHog` into the target channel, then create a workflow named "<feature> use case feed":

- Trigger on the "AI evaluation (LLM)" event, i.e. `$ai_evaluation`, which carries the `$ai_evaluation_*` properties plus context from the triggering `$ai_generation`
- Filter: "AI Evaluation Name (LLM)" equals `<your eval name from step 4>`, and "AI Evaluation Result (LLM)" equals `true`
- Gotcha: `$ai_evaluation_result` is stored as the strings `'True'`, `'False'`, `'None'` (i.e. `'PASS'`, `'FAIL'`, `'N/A'`), and the workflow's `true` filter matches `'True'`, so `equals true` works as expected

Sanity-check the result distribution first:

```sql
SELECT DISTINCT toString(properties.$ai_evaluation_result) AS result, count() AS n
FROM events
WHERE event = '$ai_evaluation'
  AND properties.$ai_evaluation_name = '<your eval name>'
  AND timestamp > now() - INTERVAL 1 HOUR
GROUP BY result
```

Expect `True`, `False`, and possibly `None`.
In the workflow's Slack step, post as "PostHog Usage Feed" to `#<your-team>-usage-feed` (substitute your `<project_id>` in the links) with blocks like:

```json
[
  {
    "text": {
      "text": "<emoji> *{event.properties.$ai_evaluation_name}* triggered by *{person.name}*",
      "type": "mrkdwn"
    },
    "type": "section"
  },
  {
    "text": {
      "text": "{event.properties.$ai_evaluation_reasoning}",
      "type": "mrkdwn"
    },
    "type": "section"
  },
  {
    "type": "actions",
    "elements": [
      {
        "url": "https://us.posthog.com/project/<project_id>/llm-analytics/traces/{event.properties.$ai_trace_id}?event={event.properties.$ai_target_event_id}",
        "text": { "text": "View Trace", "type": "plain_text" },
        "type": "button"
      },
      {
        "url": "https://us.posthog.com/project/<project_id>/replay/{event.properties.$session_id}",
        "text": { "text": "View Trigger Session", "type": "plain_text" },
        "type": "button"
      },
      {
        "url": "{person.url}",
        "text": { "text": "View Person", "type": "plain_text" },
        "type": "button"
      }
    ]
  }
]
```
Template notes:

- Pick any `<emoji>`; `{event.properties.X}` and `{person.X}` are substituted per event at send time
- If a referenced property is `null` on the `$ai_evaluation` event, Slack rejects the message with `invalid_blocks`, so only template `{event.properties.$ai_*}` fields that are reliably set
- The replay link works because the `$session_id` of the triggering `$ai_generation` is available on the `$ai_evaluation` event

Worked example, Pattern A: `group_summary_use_case_feed` posting to `#<team>-usage-feed`, filtering on the `session-summary:group:` prefix with the opener "A user ran a group summary on". A message from the live feed looks like:

> 📊 group_summary_use_case_feed triggered by some user
> "A user ran a group summary on a company's onboarding sessions from the last 7 days. They were trying to understand why account activation rates are low. The summary surfaced that most users abandon at the company onboarding wizard after creating accounts."
> [View Trace] [View Trigger Session] [View Person]
(The trigger `$session_id` is carried from the `$ai_generation` onto the `$ai_evaluation` event, which is what the "View Trigger Session" button relies on.)

Worked example, Pattern B: PostHog AI in error tracking mode. Filter on `agent_mode = 'error_tracking'`, post to the same `#<team>-usage-feed`, opener "A user asked PostHog AI about". If you need to verify the instrumentation: `agent_mode` and `supermode` are set on `$ai_generation` events in `ee/hogai/core/agent_modes/executables.py` (`AgentExecutable._get_model` passes them as `posthog_properties` through `MaxChatMixin` in `ee/hogai/llm.py`), so generations from that surface arrive tagged `agent_mode=error_tracking`.

Once the feed has run for a while, validate signal quality with `posthog:llma-evaluation-summary-create`:
```json
{
  "evaluation_id": "<uuid>",
  "filter": "fail"
}
```

Use `filter: "fail"` to audit false matches; PASS results are stored as `'True'`. To read the raw PASS reasonings directly:

```sql
SELECT
    properties.$ai_evaluation_reasoning AS reasoning,
    properties.$ai_trace_id AS trace_id,
    timestamp
FROM events
WHERE event = '$ai_evaluation'
  AND properties.$ai_evaluation_name = '<your eval name>'
  AND properties.$ai_evaluation_result = 'True'
  AND timestamp > now() - INTERVAL 1 DAY
ORDER BY timestamp DESC
LIMIT 25
```
To retarget a live eval later, update its `model_configuration` or `conditions.filters` (for example, a new `$ai_trace_id` prefix) instead of recreating it; changes apply to `$ai_generation` events from then on. Use `filter: "pass"` on the summary tool when you want the canonical use cases rather than the noise.