diagnosing-sdk-health

Diagnosing SDK health

When a user asks about PostHog SDK versions, outdated SDKs, or whether they should upgrade, use the pre-digested SDK Doctor report rather than reasoning about versions yourself. The backend applies smart-semver rules (grace periods, minor-count thresholds, age-based detection), traffic-percentage thresholds, and provides user-facing copy that matches the SDK Doctor UI exactly.

Available tools

| Tool | Purpose |
| --- | --- |
| `posthog:sdk-doctor-get` | Returns a structured health report plus UI-matching copy and drill-in URLs per SDK/version. |
| `posthog:execute-sql` | (Optional) Run a `sql_query` from the report to show events captured by a specific outdated version. |

Workflow

Step 1 — Invoke the tool

```json
posthog:sdk-doctor-get
{}
```
Pass `force_refresh: true` only when the user explicitly asks for fresh data. By default the report is served from a Redis cache that is refreshed every 12 hours.
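The argument choice above can be sketched as a small helper. This is only an illustration: `build_sdk_doctor_args` is a hypothetical name, and the only detail taken from the tool description is the `force_refresh` flag.

```python
def build_sdk_doctor_args(user_asked_for_fresh_data: bool = False) -> dict:
    """Return the arguments for posthog:sdk-doctor-get.

    Hypothetical helper; only the force_refresh flag comes from the text.
    """
    # Default to the cached report; bypass the cache only on explicit request.
    return {"force_refresh": True} if user_asked_for_fresh_data else {}
```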

Step 2 — Read the top-level summary

```json
{
  "overall_health": "healthy" | "needs_attention",
  "health": "success" | "warning" | "danger",
  "needs_updating_count": 0,
  "team_sdk_count": 0,
  "sdks": [ /* per-SDK assessments */ ]
}
```
Lead with the headline:
  • `overall_health: healthy` means everything's current; say so and stop.
  • `health: warning` means some SDKs are outdated, but fewer than half of the project's SDKs. Flag these as upgrade recommendations.
  • `health: danger` means half or more of the project's SDKs are outdated. Treat as urgent.
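As a rough sketch of how the headline could be derived from the summary fields, assuming the report has already been parsed into a Python dict: the function name `headline` and the sentence wording are illustrative assumptions, not product copy.

```python
def headline(report: dict) -> str:
    """Pick an opening line from the top-level summary fields.

    The wording of each sentence is an assumption for illustration only.
    """
    if report.get("overall_health") == "healthy":
        return "All SDKs are current."
    outdated = report.get("needs_updating_count", 0)
    total = report.get("team_sdk_count", 0)
    if report.get("health") == "danger":
        return f"Urgent: {outdated} of {total} SDKs are outdated."
    # Anything else that needs attention is treated as a recommendation.
    return f"Upgrade recommended: {outdated} of {total} SDKs are outdated."
```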

Step 3 — Surface the banner text(s) verbatim

Each SDK's `banners` array contains zero or more sentences that match the SDK Doctor UI's "Time for an update!" alert exactly, e.g.:
Version 7.0.0 of the Python SDK has captured more than 10% of events in the last 7 days.
Quote these verbatim. They're what the user already sees (or would see) in the UI, and rewording creates drift between agent and product copy.
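A minimal sketch of gathering the banner text without altering it, assuming the report is a parsed dict shaped like the Step 2 summary with a `banners` list on each `sdks` entry. `collect_banners` is a hypothetical helper name.

```python
def collect_banners(report: dict) -> list[str]:
    """Return every banner sentence unchanged, in report order."""
    return [
        banner
        for sdk in report.get("sdks", [])
        for banner in sdk.get("banners", [])  # zero or more per SDK
    ]
```

The strings are passed through untouched so that whatever is shown to the user stays identical to the UI copy.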

Step 4 — Report per-SDK findings

For each entry in `sdks`, surface:
  • `readable_name` (e.g. `Python`, `Node.js`, `Web`): use this in prose, not the raw `lib`
  • `latest_version`
  • `severity` (`none` / `warning` / `danger`): use this to group or color findings
Group by severity (`danger` first, then `warning`, then `none`). Skip SDKs with `needs_updating: false` unless the user explicitly asked about the full state.
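The filtering and ordering described above might look like the following sketch. It assumes each `sdks` entry is a dict carrying the `readable_name`, `latest_version`, `severity`, and `needs_updating` fields; `findings` and `SEVERITY_ORDER` are names made up for this example.

```python
SEVERITY_ORDER = {"danger": 0, "warning": 1, "none": 2}


def findings(report: dict, include_current: bool = False) -> list[dict]:
    """List per-SDK findings, most severe first.

    SDKs with needs_updating false are skipped unless include_current is set
    (i.e. the user asked about the full state).
    """
    rows = [
        {
            "name": sdk["readable_name"],
            "latest_version": sdk["latest_version"],
            "severity": sdk["severity"],
        }
        for sdk in report.get("sdks", [])
        if include_current or sdk.get("needs_updating")
    ]
    # sorted() is stable, so SDKs of equal severity keep their report order.
    return sorted(rows, key=lambda r: SEVERITY_ORDER.get(r["severity"], len(SEVERITY_ORDER)))
```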

Step 5 — Per-version drill-down (when the user wants detail)

Each SDK's `releases` array has per-version rows. Each row includes UI-matching copy and ready-to-use links:
  • `status_reason`: badge tooltip text that closely matches the UI (e.g. `"Released 5 months ago. Upgrade recommended."`, `"You have the latest available. Click 'Releases ↗' above to check for any since."`, or `"Released 2 months ago. Upgrading is a good idea, but it's not urgent yet."`). Quote directly. Caveat: the relative-age segment ("5 months ago" etc.) is computed with Python's `humanize.naturaltime` on the backend and JavaScript's `dayjs().fromNow()` in the browser, and the two libraries have different thresholds at some boundaries (e.g. humanize says `"30 days ago"` where dayjs says `"a month ago"`; humanize says `"4 months ago"` at 148 days where dayjs says `"5 months ago"`). The overall template is identical; the age phrasing may be one threshold off. If a user cites an exact age from the UI that doesn't match, don't "correct" them: the UI is showing dayjs output, and both are internally consistent.
  • `released_ago`: human-readable relative age (e.g. `"5 months ago"`), with the same humanize-vs-dayjs caveat as above.
  • `is_outdated`, `is_old`, `is_current_or_newer`: booleans, if you need to branch.
  • `sql_query`: a complete SQL statement that returns the last 50 events captured by this version. Offer it as a copy-paste snippet or pass it to `posthog:execute-sql` to drill in.
  • `activity_page_url`: relative path (starts with `/project/<id>/`) to the Activity > Explore page pre-filtered to this lib + version. Combine with the user's PostHog host (e.g. `us.posthog.com`) for a clickable link.
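Combining the host with `activity_page_url` can be sketched as below. The helper name `activity_link` and the `https` scheme are assumptions; the path used in the usage example is made up to show the shape and is not a real PostHog route.

```python
from typing import Optional


def activity_link(host: str, activity_page_url: str) -> Optional[str]:
    """Join the user's PostHog host with the relative activity path.

    Returns None when the path is empty, so callers never build a link
    from a missing drill-in field.
    """
    if not activity_page_url:
        return None
    # Assumes HTTPS; normalise slashes so the two parts join exactly once.
    return f"https://{host.strip('/')}/{activity_page_url.lstrip('/')}"


# Illustrative path only; real values come from the report.
print(activity_link("us.posthog.com", "/project/123/example-path"))
```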

Step 6 — Link to the UI

Always close with a link to the SDK Doctor page: `/project/<project_id>/health/sdk-doctor`. The UI shows per-row event counts, last-event timestamps, release notes, and SDK docs links, which is more than the tool response includes.

Interpreting severity

The backend applies these rules (you don't need to re-check them):
  • Grace period: versions released within the last 7 days (14 days for web) are never flagged, even if major versions behind.
  • Minor-version rule: flag if 3+ minors behind OR > 180 days old.
  • Major-version rule: always flag if a major version behind (outside grace period).
  • Patch-version rule: never flagged, because patch differences are noise.
  • Age rule (separate "old" flag): desktop SDKs flagged at > 16 weeks old, mobile at > 24 weeks old (mobile is more lenient, since users don't auto-update apps).
  • Traffic threshold: an outdated version handling ≥10% of events (≥20% for web) surfaces as a traffic alert even if a newer version is also in use. Mobile SDKs are excluded from traffic alerts.
  • Overall severity: `danger` when half or more of the project's SDKs are outdated, `warning` when some are outdated but fewer than half.
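To make the version rules concrete, here is an illustrative sketch. It is not the backend's code: the function names, the way the rules are combined, and the inclusive grace-period boundary are all assumptions, and the age and traffic rules are left out.

```python
def is_flagged(majors_behind: int, minors_behind: int, age_days: int, is_web: bool = False) -> bool:
    """Approximate the version rules listed above (assumed combination)."""
    grace_days = 14 if is_web else 7
    if age_days <= grace_days:  # grace period wins, even for major gaps (boundary assumed inclusive)
        return False
    if majors_behind >= 1:  # major-version rule
        return True
    if minors_behind >= 1:  # minor-version rule: 3+ minors behind OR > 180 days old
        return minors_behind >= 3 or age_days > 180
    return False  # current, or only patch versions behind


def overall_severity(outdated: int, total: int) -> str:
    """Map outdated/total SDK counts to the top-level health value."""
    if outdated == 0:
        return "success"
    return "danger" if outdated * 2 >= total else "warning"
```

For example, a version one minor behind and 30 days old is not flagged under this reading, while the same version at 200 days old is.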

Copy-faithfulness

These response fields are user-facing UI copy. Quote them verbatim; don't reword:
  • `banners[]`: top-level "Time for an update!" alert text. Byte-for-byte match with UI.
  • `releases[].status_reason`: per-version badge tooltip text. Template matches the UI, but the relative-age phrasing (`"5 months ago"` etc.) can be one boundary threshold off because the backend uses `humanize` and the UI uses `dayjs`. See Step 5 caveat.
  • `readable_name`: human-readable SDK name. Byte-for-byte match with UI.
`reason` (per-SDK) is a programmatic summary meant for ranking/filtering, not for quoting to users. Prefer `banners[]` and `status_reason` for user-visible output.

Handling empty or errored drill-in fields

If `sql_query` or `activity_page_url` comes back as an empty string for a particular release, the backend sanitizer rejected the `lib_version` as potentially unsafe to interpolate (e.g. it contained quote characters or whitespace). When this happens:
  • Surface it: tell the user "the recorded SDK version string doesn't look safe to interpolate into a query, so I can't build a drill-in link for it." This is a signal worth noting (it could indicate instrumentation tampering or a library bug).
  • Do NOT retry: calling the tool again won't change the result.
  • Do NOT patch: don't rewrite the version or guess at a safe substitute and pipe that into `posthog:execute-sql`. That would defeat the sanitizer.
Similarly, if you pass `sql_query` to `posthog:execute-sql` and it errors, surface the error verbatim rather than rewriting the query. The query template is a verbatim mirror of what the SDK Doctor UI uses, so if the UI's SQL wouldn't run, something else is wrong.
Do not wrap, truncate, or modify `sql_query` in any way before passing it to `posthog:execute-sql`. No `SELECT * FROM (<sql_query>) LIMIT 10`, no adding `WHERE` clauses, no changing the `ORDER BY`, no dropping columns. If you need something different, build a fresh query from scratch with the user's help; don't derive it from `sql_query`.
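The handling above could be sketched as follows. `run_drill_in` is a hypothetical wrapper, and `execute_sql` stands in for whatever callable actually invokes `posthog:execute-sql`; neither is part of PostHog.

```python
from typing import Callable

UNSAFE_VERSION_NOTE = (
    "The recorded SDK version string doesn't look safe to interpolate into a query, "
    "so I can't build a drill-in link for it."
)


def run_drill_in(release: dict, execute_sql: Callable[[str], object]) -> object:
    """Run a release's sql_query exactly as provided, or explain why not."""
    sql = release.get("sql_query", "")
    if not sql:
        # Empty string: the sanitizer rejected lib_version. No retry, no patching.
        return UNSAFE_VERSION_NOTE
    try:
        return execute_sql(sql)  # passed through unwrapped and unmodified
    except Exception as err:
        return str(err)  # surface the error text as-is instead of rewriting the query
```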

Deferring to documentation for "why is it still outdated?" questions

When the user expresses confusion about an old SDK version still producing events after they believed it was updated — phrasings like "I thought I updated", "we already upgraded", "we deployed the new version but…", "why are users still on the old SDK?", or any variation of "why isn't it gone?" — do not improvise a list of causes. Point them to the canonical docs page instead:
That page is the product team's source of truth on why versions persist (HTML snippet pinning, lockfiles in separate apps, CDN or browser caching, service workers, build/deploy issues) and what to do about each one (auto vs. manual update paths per SDK). It has diagrams, product-specific language, and will stay up to date as the guidance evolves — your improvised version will drift.
Suggested response shape:
That's a common question with a few possible causes — cached bundles, pinned snippet versions, lockfiles in separate apps, service workers, build/deploy issues, etc. Rather than guess which one's biting you, have a look at Keeping SDKs current — it walks through each cause and the fix. Once you've skimmed it, I can help you narrow it down for your setup (e.g. by pulling the activity events for the outdated version to see whether it's one app/domain/subpath or spread across everything).
The trigger is intent, not content. Defer whenever the user expresses surprise or confusion about persistence ("still", "thought I updated", "why haven't users upgraded", "why is the old one still there"), even when the tool response technically contains the version's age, reason, or traffic breakdown. The docs page exists because the literal data doesn't answer why, just what.

When NOT to defer

Do not send the user to the docs page when:
  • The question is about a specific field in the response, e.g. "what does `is_old` mean?" or "how is severity calculated?" Answer directly using the report fields or the "Interpreting severity" rules above.
  • The user is asking for the raw data (events, versions in use, counts, persons). Pull it via the report or `posthog:execute-sql` and present it.
  • The user has already read the page and is asking a specific follow-up. Answer directly or pull data to help narrow things down.

Tips

  • If `team_sdk_count` is 0, the project isn't sending events with SDK metadata. Suggest checking that `posthog-js` (or another SDK) is actually installed and capturing events.
  • The report is per-project. If the user asks about multiple projects, invoke the tool once per project (after switching project context via `posthog:switch-project`).
  • The tool is read-only. No side effects, no rate limits, safe to call anytime.
  • For "show me the events from this outdated version" requests, you have three options, ordered from most to least interactive:
    1. Render a clickable link from `activity_page_url`, so the user explores in the PostHog UI.
    2. Pass `sql_query` to `posthog:execute-sql` and summarize the result inline.
    3. Quote the `sql_query` as a copy-paste snippet.

Phrasing the drill-in offer

When you offer to open `activity_page_url` or run `sql_query`, describe what the user will see in terms of the SDK being old. The data is the customer's end-users' events captured while an old SDK is loaded, so the old thing is the SDK, not the page or the person.
  • Good: "Want me to pull up the events captured by this old SDK so you can see which pages on your site are still loading it, and which end-users are hitting them?"
  • Good: "I can show you which URLs on the site are still serving the outdated SDK and who's generating events from them."
  • Avoid: "Want to see which pages/persons are still on the old version?" Pages don't run SDK versions; visitors of pages do. This phrasing makes the user think the old thing is the page or the person, which is wrong.
  • Avoid for web / server SDKs: "which users are on the old SDK". Users don't install these; the customer's deployed app/site does. For mobile SDKs (`posthog-ios`, `posthog-android`, `posthog-flutter`, `posthog-react-native`), though, "users who haven't updated the app" IS accurate: the SDK ships embedded in the app binary, and users control the update by updating the app. So the rule flips for mobile: phrasings like "end-users still running an older app version" or "users who haven't updated to the latest release" are correct for mobile drill-ins.
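One way to picture the mobile/non-mobile split is the sketch below. It assumes the raw `lib` value equals the SDK package names listed above, which the text does not confirm; `drill_in_offer` and the mobile sentence are illustrative, while the non-mobile sentence reuses the first "Good" example.

```python
MOBILE_LIBS = {"posthog-ios", "posthog-android", "posthog-flutter", "posthog-react-native"}


def drill_in_offer(lib: str) -> str:
    """Choose drill-in wording where the old thing is always the SDK."""
    if lib in MOBILE_LIBS:
        # Mobile: the SDK ships inside the app binary, so end-users control the update.
        return "Want me to pull up the events from end-users still running an older app version?"
    # Web / server: the customer's deployed site or app loads the SDK, not the visitor.
    return (
        "Want me to pull up the events captured by this old SDK so you can see which pages "
        "on your site are still loading it, and which end-users are hitting them?"
    )
```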