exploring-autocapture-events

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Exploring autocapture events

探索autocapture事件

if users opt in then posthog-js automatically captures clicks, form submissions, and page changes as
$autocapture
events. Each event records the clicked DOM element and its ancestors in the
elements_chain
column.
$autocapture
is intentionally excluded from the
posthog:read-data-schema
taxonomy because it is only useful with autocapture-specific filters (selector, tag, text, href). This skill fills that gap.
若用户选择启用,posthog-js会自动将点击、表单提交和页面变更捕获为
$autocapture
事件。 每个事件会在
elements_chain
列中记录被点击的DOM元素及其祖先元素。
$autocapture
被有意排除在
posthog:read-data-schema
分类之外, 因为它只有配合autocapture专属过滤器(选择器、标签、文本、href)才有用。 本技能填补了这一空白。

Materialized columns

物化列

The
events
table provides fast access to common element fields without parsing the full chain string.
ColumnTypeDescription
elements_chain
StringFull semicolon-separated element chain (see format reference)
elements_chain_href
StringLast href value from the chain
elements_chain_texts
Array(String)All text values from elements
elements_chain_ids
Array(String)All id attribute values
elements_chain_elements
Array(String)Useful tag names: a, button, input, select, textarea, label
Use materialized columns for exploration queries whenever possible — they avoid regex parsing.
events
表提供了对常见元素字段的快速访问,无需解析完整的链式字符串。
列名类型描述
elements_chain
String完整的分号分隔元素链(参见格式参考
elements_chain_href
String元素链中的最后一个href值
elements_chain_texts
Array(String)所有元素的文本值
elements_chain_ids
Array(String)所有id属性值
elements_chain_elements
Array(String)常用标签名:a、button、input、select、textarea、label
尽可能使用物化列进行探索查询——它们可以避免正则解析。

Canonical autocapture properties

标准autocapture属性

Every
$autocapture
event from posthog-js ships with a fixed set of properties. Do not query the schema to "look them up" — they are these:
PropertyExamplesNotes
$event_type
click
,
submit
,
change
the kind of interaction
$el_text
Sign up
,
Submit
text of the clicked element
$current_url
https://app.example.com/pricing
page the interaction happened on
$elements_chain
semicolon-separated chainparsed via the
elements_chain*
materialized columns above
Standard event properties (
$browser
,
$os
,
$device_type
, etc.) are also present.
posthog-js生成的每个
$autocapture
事件都带有一组固定的属性。 无需查询架构来“查找”它们——它们就是以下这些:
属性名称示例说明
$event_type
click
,
submit
,
change
交互类型
$el_text
Sign up
,
Submit
被点击元素的文本
$current_url
https://app.example.com/pricing
交互发生的页面
$elements_chain
分号分隔的元素链通过上述
elements_chain*
物化列解析
标准事件属性(
$browser
$os
$device_type
等)也会包含在内。

Workflow

工作流程

1. Confirm autocapture data exists

1. 确认autocapture数据存在

Run a count query before doing anything else. If the count is zero, autocapture may be disabled. There are two ways this happens:
  • Project settings — the team can set
    autocapture_opt_out
    in PostHog project settings
  • SDK config — the posthog-js
    init()
    call can pass
    autocapture: false
Tell the user if no data is found so they can check both settings.
sql
SELECT count() as cnt
FROM events
WHERE event = '$autocapture'
  AND timestamp > now() - INTERVAL 7 DAY
在进行任何操作前先运行计数查询。 如果计数为零,可能是autocapture已被禁用。有两种情况会导致禁用:
  • 项目设置——团队可在PostHog项目设置中设置
    autocapture_opt_out
  • SDK配置——posthog-js的
    init()
    调用可传入
    autocapture: false
如果未找到数据,请告知用户,以便他们检查这两项设置。
sql
SELECT count() as cnt
FROM events
WHERE event = '$autocapture'
  AND timestamp > now() - INTERVAL 7 DAY

2. Explore what users are interacting with

2. 探索用户的交互行为

Start broad using the materialized columns. The goal is to understand what users are clicking before narrowing down.
Useful explorations:
  • Top clicked tag names (via
    elements_chain_elements
    )
  • Top clicked text values (via
    elements_chain_texts
    )
  • Top clicked hrefs (via
    elements_chain_href
    )
  • Raw
    elements_chain
    values for a specific page (filtered by
    properties.$current_url
    )
See example queries for all patterns.
从使用物化列开始进行广泛探索。 目标是在缩小范围前先了解用户点击的内容。
有用的探索方向:
  • 点击量最高的标签名(通过
    elements_chain_elements
  • 点击量最高的文本值(通过
    elements_chain_texts
  • 点击量最高的href(通过
    elements_chain_href
  • 特定页面的原始
    elements_chain
    值(通过
    properties.$current_url
    过滤)
所有模式请参见示例查询

3. Find candidate selectors

3. 寻找候选选择器

Once the user identifies an interaction they care about, find a CSS selector that identifies it.
Priority order for selector attributes (best first):
  1. data-attr
    or other
    data-*
    attributes
    — highest specificity, stable across deploys, developer-intended anchors. Search with
    match(elements_chain, 'data-attr=')
    or
    extractAll
    .
  2. Element ID (
    attr_id
    ) — also highly stable, queryable via
    elements_chain_ids
    .
  3. Tag + class combination — moderately stable but classes change with CSS refactors.
  4. Text content — fragile (changes with copy edits, i18n) but sometimes the only option.
  5. Tag name alone — too broad on its own, useful as a qualifier.
When a
data-attr
value is found, construct a selector like
[data-attr="value"]
or
button[data-attr="value"]
.
一旦用户确定了他们关心的交互行为,就需要找到一个能识别该行为的CSS选择器。
选择器属性的优先级顺序(从优到劣):
  1. data-attr
    或其他
    data-*
    属性
    ——特异性最高,在部署中保持稳定,是开发者指定的锚点。 使用
    match(elements_chain, 'data-attr=')
    extractAll
    进行搜索。
  2. 元素ID
    attr_id
    )——同样高度稳定,可通过
    elements_chain_ids
    查询。
  3. 标签+类组合——稳定性中等,但类会随CSS重构而变化。
  4. 文本内容——脆弱(会随文案修改、国际化而变化)但有时是唯一的选择。
  5. 仅标签名——本身范围太广,用作限定符时有用。
找到
data-attr
值后,构造类似
[data-attr="value"]
button[data-attr="value"]
的选择器。

4. Evaluate selector uniqueness

4. 评估选择器的唯一性

A selector is only useful if it matches the intended interaction and not unrelated events.
Run a uniqueness check using
elements_chain =~
with the regex pattern for the selector. Then sample matching events to inspect what the selector actually captures. Compare the count against total autocapture volume to understand selectivity.
A good selector matches a single logical interaction. If it matches too many distinct elements, refine it in the next step.
只有当选择器匹配目标交互行为且不匹配无关事件时,它才有用。
使用
elements_chain =~
和选择器的正则模式运行唯一性检查。 然后抽样匹配的事件,检查选择器实际捕获的内容。 将计数与总autocapture量进行比较,以了解其选择性。
一个好的选择器应匹配单一逻辑交互行为。 如果它匹配太多不同元素,请在下一步中优化它。

5. Refine with additional filters

5. 通过额外过滤器优化

If the selector alone is not unique enough, layer on additional filters:
  • Text filter — match by element text content using
    elements_chain_texts
  • URL filter — restrict to a specific page using
    properties.$current_url
  • Href filter — match by link target using
    elements_chain_href
Re-run the uniqueness check after each refinement. Only include filters that are needed — fewer filters means more resilience to minor DOM changes.
如果仅使用选择器不够唯一,可以添加额外过滤器:
  • 文本过滤器——使用
    elements_chain_texts
    匹配元素文本内容
  • URL过滤器——使用
    properties.$current_url
    限制到特定页面
  • Href过滤器——使用
    elements_chain_href
    匹配链接目标
每次优化后重新运行唯一性检查。 只添加必要的过滤器——过滤器越少,对DOM微小变更的适应性越强。

6. Filter autocapture inside an insight query

6. 在洞察查询中过滤autocapture事件

When the user wants a funnel, trend, or other insight, the filter shape is different from HogQL. Each step in a
FunnelsQuery
/
TrendsQuery
is an
EventsNode
(or
ActionsNode
) with
event: "$autocapture"
and a
properties
array.
Two distinct property
type
values matter — they are not interchangeable:
  • type: "element"
    — keys:
    selector
    ,
    tag_name
    ,
    text
    ,
    href
    . Matched against the parsed
    elements_chain
    . Operator support is split:
    • selector
      and
      tag_name
      only support
      exact
      and
      is_not
      — anything else raises
      NotImplementedError
      in the query compiler (
      posthog/hogql/property.py
      ).
    • text
      and
      href
      accept the full string operator set (
      exact
      ,
      is_not
      ,
      icontains
      ,
      not_icontains
      ,
      regex
      ,
      not_regex
      ,
      is_set
      ,
      is_not_set
      ).
  • type: "event"
    — keys: any of the canonical autocapture properties (
    $event_type
    ,
    $el_text
    ,
    $current_url
    ) or anything else on the event. Standard event-property operators (
    exact
    ,
    icontains
    ,
    regex
    , etc.).
Example funnel from clicking one button to clicking another:
json
{
  "kind": "FunnelsQuery",
  "series": [
    {
      "kind": "EventsNode",
      "event": "$autocapture",
      "properties": [
        {
          "type": "element",
          "key": "selector",
          "value": ["[data-attr=\"autocapture-series-save-as-action-banner-shown\"]"],
          "operator": "exact"
        }
      ]
    },
    {
      "kind": "EventsNode",
      "event": "$autocapture",
      "properties": [
        {
          "type": "element",
          "key": "selector",
          "value": ["[data-attr=\"autocapture-save-as-action\"]"],
          "operator": "exact"
        }
      ]
    }
  ]
}
Two things easy to get wrong:
  • value
    is an array even when matching a single selector
  • The selector string includes the
    [data-attr="..."]
    wrapper — it is a CSS selector, not a bare attribute value
Decision rule: prefer an action (
ActionsNode
referencing an existing action — see Step 8) when the interaction will be referenced more than once; inline
type: "element"
/
type: "event"
filters when it's a one-off insight; raw HogQL (Step 7) when joining across events or doing custom aggregations.
当用户需要漏斗、趋势或其他洞察时,过滤器的形式与HogQL不同。
FunnelsQuery
/
TrendsQuery
中的每个步骤都是一个
EventsNode
(或
ActionsNode
),其中
event: "$autocapture"
且带有
properties
数组。
有两种不同的属性
type
值很重要——它们不能互换:
  • type: "element"
    ——键:
    selector
    tag_name
    text
    href
    。与解析后的
    elements_chain
    匹配。运算符支持分为两类:
    • selector
      tag_name
      仅支持
      exact
      is_not
      ——使用其他运算符会在查询编译器中引发
      NotImplementedError
      posthog/hogql/property.py
      )。
    • text
      href
      支持完整的字符串运算符集(
      exact
      is_not
      icontains
      not_icontains
      regex
      not_regex
      is_set
      is_not_set
      )。
  • type: "event"
    ——键:任何标准autocapture属性(
    $event_type
    $el_text
    $current_url
    )或事件上的其他属性。支持标准事件属性运算符(
    exact
    icontains
    regex
    等)。
示例:从点击一个按钮到点击另一个按钮的漏斗:
json
{
  "kind": "FunnelsQuery",
  "series": [
    {
      "kind": "EventsNode",
      "event": "$autocapture",
      "properties": [
        {
          "type": "element",
          "key": "selector",
          "value": ["[data-attr=\"autocapture-series-save-as-action-banner-shown\"]"],
          "operator": "exact"
        }
      ]
    },
    {
      "kind": "EventsNode",
      "event": "$autocapture",
      "properties": [
        {
          "type": "element",
          "key": "selector",
          "value": ["[data-attr=\"autocapture-save-as-action\"]"],
          "operator": "exact"
        }
      ]
    }
  ]
}
容易出错的两点:
  • 即使匹配单个选择器,
    value
    也是一个数组
  • 选择器字符串包含
    [data-attr="..."]
    包裹——它是CSS选择器,不是裸属性值
决策规则:当交互行为会被多次引用时,优先使用动作(引用现有动作的
ActionsNode
——参见步骤8);当是一次性洞察时,使用内联
type: "element"
/
type: "event"
过滤器;当需要跨事件关联或自定义聚合时,使用原始HogQL(步骤7)。

7. Use in ad-hoc queries

7. 在临时查询中使用

The discovered selector can be used directly in HogQL without creating an action.
Trends — count matching clicks over time:
sql
SELECT
  toStartOfDay(timestamp) as day,
  count() as clicks
FROM events
WHERE event = '$autocapture'
  AND timestamp > now() - INTERVAL 14 DAY
  AND elements_chain =~ '(^|;)button.*?data-attr="checkout"'
GROUP BY day
ORDER BY day
Funnel — pageview to click conversion:
sql
SELECT
  person_id,
  first_pageview,
  first_click_after
FROM (
  SELECT
    p.person_id,
    p.pageview_time as first_pageview,
    min(c.click_time) as first_click_after
  FROM (
    SELECT person_id, min(timestamp) as pageview_time
    FROM events
    WHERE event = '$pageview'
      AND timestamp > now() - INTERVAL 14 DAY
      AND properties.$current_url ILIKE '%/pricing%'
    GROUP BY person_id
  ) p
  INNER JOIN (
    SELECT person_id, timestamp as click_time
    FROM events
    WHERE event = '$autocapture'
      AND timestamp > now() - INTERVAL 14 DAY
      AND elements_chain =~ '(^|;)button.*?data-attr="signup"'
  ) c ON p.person_id = c.person_id AND c.click_time > p.pageview_time
  GROUP BY p.person_id, p.pageview_time
)
For recurring analysis, prefer creating an action (next step) or using
posthog:query-trends
/
posthog:query-funnel
with the action.
找到的选择器可直接在HogQL中使用,无需创建动作。
趋势——按时间统计匹配的点击量:
sql
SELECT
  toStartOfDay(timestamp) as day,
  count() as clicks
FROM events
WHERE event = '$autocapture'
  AND timestamp > now() - INTERVAL 14 DAY
  AND elements_chain =~ '(^|;)button.*?data-attr="checkout"'
GROUP BY day
ORDER BY day
漏斗——页面浏览到点击的转化率:
sql
SELECT
  person_id,
  first_pageview,
  first_click_after
FROM (
  SELECT
    p.person_id,
    p.pageview_time as first_pageview,
    min(c.click_time) as first_click_after
  FROM (
    SELECT person_id, min(timestamp) as pageview_time
    FROM events
    WHERE event = '$pageview'
      AND timestamp > now() - INTERVAL 14 DAY
      AND properties.$current_url ILIKE '%/pricing%'
    GROUP BY person_id
  ) p
  INNER JOIN (
    SELECT person_id, timestamp as click_time
    FROM events
    WHERE event = '$autocapture'
      AND timestamp > now() - INTERVAL 14 DAY
      AND elements_chain =~ '(^|;)button.*?data-attr="signup"'
  ) c ON p.person_id = c.person_id AND c.click_time > p.pageview_time
  GROUP BY p.person_id, p.pageview_time
)
对于重复分析,优先创建动作(下一步)或使用
posthog:query-trends
/
posthog:query-funnel
并关联该动作。

8. Create an action

8. 创建动作

Actions are the durable version of ad-hoc selector queries. Once the criteria uniquely identify the interaction, create an action using
posthog:action-create
.
Construct the step with only the filters needed for uniqueness:
json
{
  "name": "Clicked checkout button",
  "steps": [
    {
      "event": "$autocapture",
      "selector": "button[data-attr='checkout']",
      "text": "Complete Purchase",
      "text_matching": "exact",
      "url": "/checkout",
      "url_matching": "contains"
    }
  ]
}
Available step fields for
$autocapture
:
  • selector
    — CSS selector (e.g.
    button[data-attr='checkout']
    )
  • tag_name
    — HTML tag name (e.g.
    button
    ,
    a
    ,
    input
    )
  • text
    /
    text_matching
    — element text (
    exact
    ,
    contains
    , or
    regex
    )
  • href
    /
    href_matching
    — link href (
    exact
    ,
    contains
    , or
    regex
    )
  • url
    /
    url_matching
    — page URL (
    exact
    ,
    contains
    , or
    regex
    )
After creation, verify with
matchesAction()
:
sql
SELECT count() as matching_events
FROM events
WHERE matchesAction('Clicked checkout button')
  AND timestamp > now() - INTERVAL 7 DAY
动作是临时选择器查询的持久化版本。 一旦条件能唯一识别交互行为,使用
posthog:action-create
创建动作。
仅使用确保唯一性所需的过滤器来构造步骤:
json
{
  "name": "Clicked checkout button",
  "steps": [
    {
      "event": "$autocapture",
      "selector": "button[data-attr='checkout']",
      "text": "Complete Purchase",
      "text_matching": "exact",
      "url": "/checkout",
      "url_matching": "contains"
    }
  ]
}
$autocapture
可用的步骤字段:
  • selector
    ——CSS选择器(例如
    button[data-attr='checkout']
  • tag_name
    ——HTML标签名(例如
    button
    a
    input
  • text
    /
    text_matching
    ——元素文本(
    exact
    contains
    regex
  • href
    /
    href_matching
    ——链接href(
    exact
    contains
    regex
  • url
    /
    url_matching
    ——页面URL(
    exact
    contains
    regex
创建后,使用
matchesAction()
验证:
sql
SELECT count() as matching_events
FROM events
WHERE matchesAction('Clicked checkout button')
  AND timestamp > now() - INTERVAL 7 DAY

Tips

提示

  • Always set timestamp filters —
    $autocapture
    is high volume
  • Use
    LIMIT
    generously when sampling
    elements_chain
    — the strings can be long
  • The
    elements_chain =~
    operator matches CSS selectors as regex internally; prefer materialized columns when possible for performance
  • This workflow only applies to posthog-js — other SDKs do not capture elements
  • 始终设置时间戳过滤器——
    $autocapture
    数据量很大
  • 抽样
    elements_chain
    时尽量使用
    LIMIT
    ——字符串可能很长
  • elements_chain =~
    运算符内部将CSS选择器作为正则匹配; 为了性能,尽可能优先使用物化列
  • 本工作流程仅适用于posthog-js——其他SDK不会捕获元素