copilot-history-ingest

# Copilot History Ingest — Conversation Mining

You are extracting knowledge from the user's past GitHub Copilot CLI conversations and distilling it into the Obsidian wiki. Conversations are rich but messy — your job is to find the signal and compile it.

This skill can be invoked directly or via the `wiki-history-ingest` router (`/wiki-history-ingest copilot`).

## Before You Start

1. Read `.env` to get `OBSIDIAN_VAULT_PATH`, `COPILOT_HISTORY_PATH` (defaults to `~/.copilot/session-state`), and `COPILOT_VSCODE_STORAGE_PATH` (the VS Code `workspaceStorage` directory; platform-specific — ask the user if absent from `.env`)
2. Read `.manifest.json` at the vault root to check what's already been ingested
3. Read `index.md` at the vault root to know what the wiki already contains
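Step 1 of the checklist can be sketched in a few lines of Python — a minimal, hypothetical `.env` reader that assumes plain `KEY=value` lines (no `export` statements or multi-line values):

```python
import os
from pathlib import Path

def load_env(path=".env"):
    """Parse simple KEY=value lines from a .env file into a dict."""
    env = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"')
    return env

if Path(".env").exists():
    env = load_env()
    vault = env.get("OBSIDIAN_VAULT_PATH")
    # Fall back to the documented default when COPILOT_HISTORY_PATH is unset
    history = env.get("COPILOT_HISTORY_PATH",
                      os.path.expanduser("~/.copilot/session-state"))
```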

## Ingest Modes

### Append Mode (default)

Check `.manifest.json` for each source file (events JSONL, transcript JSONL, checkpoint, session-store DB). Only process:

- Sessions not in the manifest (new sessions)
- Sessions whose `updated_at` is newer than their `ingested_at` in the manifest

This is usually what you want — the user ran a few new sessions and wants to capture the delta.

### Full Mode

Process everything regardless of manifest. Use after a `wiki-rebuild` or if the user explicitly asks.

## GitHub Copilot Data Layout

Copilot stores data in three locations. Scan all three.

### Source 1: `~/.copilot/session-state/` (CLI sessions)

```
~/.copilot/session-state/
├── <session-uuid>/
│   ├── workspace.yaml           # Session metadata (id, cwd, summary_count, created_at, updated_at)
│   ├── vscode.metadata.json     # VS Code context (workspaceFolder, repositoryProperties, customTitle)
│   ├── events.jsonl             # Full event log — all turns, tool calls, reasoning
│   ├── session.db               # Per-session SQLite (todos/todo_deps only — skip for ingestion)
│   ├── index.md                 # Session summary written at session end
│   ├── checkpoints/             # Checkpoint JSON files (mid-session summaries)
│   │   └── <uuid>.json          # title, overview, history, work_done, technical_details,
│   │                            #   important_files, next_steps
│   ├── files/                   # Artifacts produced during session (plans, diagrams, etc.)
│   └── research/                # Research artifacts
└── ...
```

### Source 2: `~/.copilot/session-store.db` (Global SQLite)

The canonical cross-session database. This is the highest-value source: structured, queryable, and pre-summarised.

```
sessions       — id, cwd, repository, branch, summary, created_at, updated_at, host_type
turns          — session_id, turn_index, user_message, assistant_response, timestamp
checkpoints    — session_id, checkpoint_number, title, overview, history, work_done,
                 technical_details, important_files, next_steps, created_at
session_files  — session_id, file_path, tool_name, turn_index, first_seen_at
session_refs   — session_id, ref_type (commit/pr/issue), ref_value, turn_index, created_at
search_index   — FTS5 virtual table (content, session_id, source_type, source_id)
```
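As a sketch of what querying this store looks like — assuming the table and column names above, and opening the database read-only so ingestion can never mutate Copilot's own state:

```python
import sqlite3
from pathlib import Path

db_path = Path.home() / ".copilot" / "session-store.db"

def recent_sessions(conn, limit=10):
    """Most recently updated sessions with their checkpoint counts."""
    return conn.execute(
        """
        SELECT s.id, s.repository, s.summary, s.updated_at,
               COUNT(c.id) AS checkpoint_count
        FROM sessions s
        LEFT JOIN checkpoints c ON c.session_id = s.id
        GROUP BY s.id
        ORDER BY s.updated_at DESC
        LIMIT ?
        """,
        (limit,),
    ).fetchall()

if db_path.exists():
    # mode=ro guarantees a read-only connection
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    for row in recent_sessions(conn):
        print(row)
```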

### Source 3: VS Code Workspace Storage (`<workspaceStorage>/<hash>/GitHub.copilot-chat/`)

VS Code extension data, keyed by workspace hash. The path is platform-specific and must come from `.env` or user input.
```
<hash>/GitHub.copilot-chat/
├── transcripts/
│   └── <session-uuid>.jsonl     # Conversation transcripts (same JSONL format as events.jsonl)
├── memory-tool/
│   └── memories/
│       └── <base64-session-id>/ # Per-session saved artifacts (plan.md, etc.)
│           └── plan.md
└── codebase-external.sqlite     # Codebase index (skip — no conversation knowledge)
```

Key data sources ranked by value:

1. Checkpoints (`session-store.db` `checkpoints` table + per-session `checkpoints/*.json`) — Pre-distilled summaries with `overview`, `work_done`, `technical_details`, `important_files`, `next_steps`. Gold.
2. Session summaries (`session-store.db` `sessions.summary` + `index.md`) — One-paragraph synopsis per session.
3. Turns (`session-store.db` `turns` table + `events.jsonl` / transcript JSONL) — Full conversation. Rich but verbose.
4. Memory artifacts (`memory-tool/memories/<id>/plan.md` etc.) — Pre-written plans and structured notes the user saved explicitly. Worth importing verbatim (or lightly summarised).
5. File access patterns (`session_files` table + `tool.execution_*` events) — Which files the agent repeatedly touched — reveals high-value project files.
6. Session refs (`session_refs` table) — Commits, PRs, and issues linked to sessions.
7. `vscode.metadata.json` — Workspace folder path, branch, `customTitle` (user-set session label). Useful for grouping and naming.

## Step 1: Survey and Compute Delta

Scan all three data locations and compare against `.manifest.json`:

```bash
# --- Source 1: per-session directories ---
# Find all session directories (each has workspace.yaml)
ls ~/.copilot/session-state/
# For each session, read workspace.yaml for id/cwd/updated_at
# and vscode.metadata.json for customTitle / repositoryProperties

# --- Source 2: global database ---
# Query session-store.db with sqlite3 (or Python sqlite3)
sqlite3 ~/.copilot/session-store.db "
  SELECT s.id, s.cwd, s.repository, s.branch, s.summary, s.updated_at,
         COUNT(DISTINCT t.turn_index) AS turn_count,
         COUNT(DISTINCT c.id) AS checkpoint_count
  FROM sessions s
  LEFT JOIN turns t ON t.session_id = s.id
  LEFT JOIN checkpoints c ON c.session_id = s.id
  GROUP BY s.id
  ORDER BY s.updated_at DESC;"

# --- Source 3: VS Code workspace storage ---
# For each <hash> directory under workspaceStorage, check for GitHub.copilot-chat/
# Find transcript files
ls <workspaceStorage>/<hash>/GitHub.copilot-chat/transcripts/
```

Build a unified inventory — one entry per session UUID — and classify:

- **New** — not in manifest → needs ingesting
- **Modified** — in manifest but `updated_at` is newer → needs re-ingesting
- **Unchanged** — in manifest and not modified → skip in append mode

Report to the user: "Found X sessions in session-state, Y in session-store.db, Z VS Code transcript files. Checkpoints: A. Delta: B new, C modified."
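The classification rules above can be sketched as a small helper; the manifest shape (a map keyed by session id with an `ingested_at` field) is an assumption for illustration:

```python
def classify_session(session_id, updated_at, manifest):
    """Classify one session against the manifest (assumed keyed by session id)."""
    entry = manifest.get(session_id)
    if entry is None:
        return "new"          # never ingested
    if updated_at > entry["ingested_at"]:
        return "modified"     # source changed since last ingest
    return "unchanged"        # skip in append mode

# ISO-8601 timestamps in a uniform timezone compare correctly as strings,
# so no datetime parsing is needed for the delta check.
```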

## Step 2: Ingest Checkpoints and Summaries First

Checkpoints are already distilled — process them before touching raw turns.

From `session-store.db`:

```sql
SELECT s.id, s.cwd, s.repository, s.branch, s.summary,
       c.checkpoint_number, c.title, c.overview, c.work_done,
       c.technical_details, c.important_files, c.next_steps,
       c.created_at
FROM checkpoints c
JOIN sessions s ON c.session_id = s.id
ORDER BY s.updated_at DESC, c.checkpoint_number ASC;
```

From per-session `checkpoints/*.json`:

Each checkpoint file has: `title`, `overview`, `history`, `work_done`, `technical_details`, `important_files`, `next_steps`.

Read `index.md` (if present) as a session-level summary — it's typically written at session end and is already concise.

What to extract:

- `overview` → high-level description of what the session accomplished
- `work_done` → concrete tasks completed (good for skills / project pages)
- `technical_details` → implementation specifics (good for concepts pages)
- `important_files` → high-value files in the project (good for project pages)
- `next_steps` → open threads (good for linking to ongoing project work)
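Reading the per-session checkpoint files might look like this — a minimal sketch assuming the seven field names listed above and UTF-8 JSON:

```python
import json
from pathlib import Path

FIELDS = ("title", "overview", "history", "work_done",
          "technical_details", "important_files", "next_steps")

def load_checkpoints(session_dir):
    """Yield one dict per checkpoint file; missing fields default to None."""
    for path in sorted(Path(session_dir).glob("checkpoints/*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        yield {field: data.get(field) for field in FIELDS}
```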

## Step 3: Parse Session Turns

Read turns from `session-store.db` (preferred — already parsed) or from `events.jsonl` / transcript JSONL.

From `session-store.db`:

```sql
SELECT turn_index, user_message, assistant_response, timestamp
FROM turns
WHERE session_id = '<uuid>'
ORDER BY turn_index ASC;
```

From `events.jsonl` / transcript JSONL:

Each file is one session. Each line is a JSON event. See `references/copilot-data-format.md` for the full schema.

Relevant event types:

| `type` | What it is | Worth reading? |
| --- | --- | --- |
| `session.start` | Session metadata (cwd, branch, version) | Yes — establishes project context |
| `user.message` | User turn | Yes — `data.content` |
| `assistant.message` | Assistant turn | Yes — `data.content` (text) + `data.toolRequests` |
| `tool.execution_start` | Tool call | Skim — reveals what files/commands were used |
| `tool.execution_end` | Tool result | No — usually noise |

Extraction strategy for `assistant.message`:

- `data.content` is the assistant's text response — extract this
- `data.reasoningText` is internal reasoning — skip (it's the unpacked `reasoningOpaque` field)
- `data.toolRequests` lists tool calls — skim tool names and arguments for file access patterns
- Skip `type: "tool.execution_end"` entirely
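The extraction strategy can be sketched as a JSONL filter; event and field names follow the table above, and unknown event types are dropped defensively:

```python
import json

KEEP = {"session.start", "user.message", "assistant.message", "tool.execution_start"}

def extract_events(jsonl_path):
    """Yield (type, data) pairs worth reading, dropping noise and reasoning."""
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            event = json.loads(line)
            etype = event.get("type")
            if etype not in KEEP:
                continue  # drops tool.execution_end and anything unknown
            data = dict(event.get("data", {}))
            # Internal reasoning must never reach the wiki
            data.pop("reasoningText", None)
            data.pop("reasoningOpaque", None)
            yield etype, data
```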

## Step 3b: Process Memory Artifacts

For each session that has a `memory-tool/memories/<base64-id>/` directory in VS Code workspace storage, read any markdown files saved there (typically `plan.md`). These are documents the user explicitly saved — treat them as high-quality, user-authored content.

Decode the base64 directory name to get the session UUID:

```python
import base64
session_id = base64.b64decode(dir_name).decode('utf-8')
```

Memory artifacts map to project `skills/` or `concepts/` pages, depending on content type.

## Step 3c: Extract File and Ref Patterns

From `session-store.db` (note that `session_files` carries no `repository` column, so join through `sessions`):

```sql
-- Most-touched files per project
SELECT s.repository, f.file_path, COUNT(*) AS touch_count
FROM session_files f
JOIN sessions s ON f.session_id = s.id
GROUP BY s.repository, f.file_path
ORDER BY touch_count DESC;

-- Linked commits/PRs/issues per session
SELECT session_id, ref_type, ref_value, turn_index
FROM session_refs
ORDER BY session_id, turn_index;
```

File access patterns reveal which files are architecturally important — note them on project pages.

Session refs link Copilot sessions to git history — useful for connecting wiki knowledge to concrete code changes.

## Step 4: Cluster by Topic

Don't create one wiki page per session. Instead:

- Group extracted knowledge by topic across sessions
- A single session about "debugging auth + setting up CI" → two separate topics
- Three sessions across different days about "React performance" → one merged topic
- `cwd` / `repository` give you a natural first-level grouping; `vscode.metadata.json`'s `customTitle` gives a human-readable session label

## Step 5: Distill into Wiki Pages

Each Copilot project maps to a project directory in the vault. Derive the project name from `cwd` or `repository`:

```
C:\Users\name\git\my-project   → my-project
/Users/name/code/another-app   → another-app
```

Prefer `repository` (e.g., `owner/repo`) from `session-store.db` over raw `cwd` when available.
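The name derivation above might be sketched like this; the handling of trailing separators and the `owner/repo` split are illustrative assumptions:

```python
import re

def project_name(cwd=None, repository=None):
    """Derive a vault project name, preferring repository over raw cwd."""
    if repository:                      # e.g. "owner/repo" → "repo"
        return repository.split("/")[-1]
    # Normalise Windows and POSIX separators, take the last path component
    return re.split(r"[\\/]", cwd.rstrip("\\/"))[-1]
```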

### Project-specific vs. global knowledge

| What you found | Where it goes | Example |
| --- | --- | --- |
| Project architecture decisions | `projects/<name>/concepts/` | `projects/my-project/concepts/main-architecture.md` |
| Project-specific debugging patterns | `projects/<name>/skills/` | `projects/my-project/skills/api-rate-limiting.md` |
| General concept the user learned | `concepts/` (global) | `concepts/react-server-components.md` |
| Recurring problem across projects | `skills/` (global) | `skills/debugging-hydration-errors.md` |
| A tool/service used | `entities/` (global) | `entities/vercel-functions.md` |
| Patterns across many sessions | `synthesis/` (global) | `synthesis/common-debugging-patterns.md` |

For each project with content, create or update the project overview page at `projects/<name>/<name>.md` — named after the project, not `_project.md`. Obsidian's graph view uses the filename as the node label, so `_project.md` makes every project show up as `_project` in the graph. Naming it `<name>.md` gives each project a distinct, readable node name.
**Important:** Distill the _knowledge_, not the conversation. Don't write "In a session on March 15, the user asked about X." Write the knowledge itself, with the session as a source attribution.

Write a `summary:` frontmatter field on every new/updated page — 1–2 sentences, ≤200 chars, answering "what is this page about?" for a reader who hasn't opened it. `wiki-query`'s cheap retrieval path reads this field to avoid opening page bodies.

Mark provenance per the convention in `llm-wiki` (Provenance Markers section):

- Checkpoints and index.md are pre-distilled by the system — treat checkpoint-derived claims as extracted (the system wrote them from observed actions).
- Memory artifacts are user-authored — treat as extracted.
- Conversation turn distillation is mostly inferred. You're synthesizing a coherent claim from many turns. Apply `^[inferred]` liberally to synthesized patterns, generalizations across sessions, and "what the user really meant" interpretations.
- Use `^[ambiguous]` when the user changed direction mid-session or when the session ended unresolved.
- Write a `provenance:` frontmatter block on every new/updated page summarizing the rough mix.
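A hypothetical frontmatter block showing the `summary:` and `provenance:` fields together — the wording here is invented for illustration, not a prescribed format:

```yaml
---
summary: How my-project throttles outbound API calls; retry/backoff pattern distilled from three Copilot sessions.
provenance: mostly extracted (checkpoints); cross-session synthesis marked ^[inferred]
---
```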

## Step 6: Update Manifest, Journal, and Special Files

### Update `.manifest.json`

For each session processed, add/update its entry with:

- `ingested_at`, `session_id`, `updated_at`
- `source_type`: one of `"copilot_session"`, `"copilot_checkpoint"`, `"copilot_transcript"`, `"copilot_memory_artifact"`
- `project`: the decoded project name
- `pages_created` and `pages_updated` lists

Also update the `projects` section of the manifest:

```json
{
  "project-name": {
    "repository": "owner/repo",
    "cwd": "C:\\Users\\name\\git\\project-name",
    "vault_path": "projects/project-name",
    "last_ingested": "TIMESTAMP",
    "sessions_ingested": 5,
    "sessions_total": 8,
    "checkpoints_ingested": 12,
    "memory_artifacts_ingested": 3
  }
}
```
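A sketch of the per-session manifest update; the entry fields come from the list above, while the top-level `sessions` map and the temp-file-then-rename write are illustrative assumptions:

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def record_session(manifest_path, session_id, updated_at, source_type,
                   project, pages_created, pages_updated):
    """Add/update one session entry; assumes a top-level 'sessions' map."""
    path = Path(manifest_path)
    manifest = json.loads(path.read_text()) if path.exists() else {}
    sessions = manifest.setdefault("sessions", {})
    sessions[session_id] = {
        "session_id": session_id,
        "updated_at": updated_at,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
        "source_type": source_type,
        "project": project,
        "pages_created": pages_created,
        "pages_updated": pages_updated,
    }
    # Write via a temp file and atomic rename so a crash can't corrupt the manifest
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(manifest, indent=2))
    tmp.replace(path)
```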

### Create journal entry + update special files

Update `index.md` and `log.md` per the standard process:

```
- [TIMESTAMP] COPILOT_HISTORY_INGEST projects=N sessions=M checkpoints=C pages_updated=X pages_created=Y mode=append|full
```

`hot.md` — Read `$OBSIDIAN_VAULT_PATH/hot.md` (create from the template in `wiki-ingest` if missing). Update Recent Activity with a one-line summary — e.g. "Ingested 5 Copilot sessions across 2 projects; surfaced patterns in API design and testing strategy." Keep the last 3 operations. Update Active Threads if any ongoing project is now better understood. Update the `updated` timestamp.

## Privacy

- Distill and synthesize — don't copy raw conversation text verbatim
- Skip anything that looks like secrets, API keys, passwords, tokens
- `data.reasoningOpaque` / `data.reasoningText` in assistant events is internal reasoning — skip entirely, never copy to wiki
- If you encounter personal/sensitive content, ask the user before including it
- The user's conversations may reference other people — be thoughtful about what goes in the wiki
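Secret screening can be approximated with a few heuristic patterns — the regexes below are illustrative assumptions, not a complete scanner; anything flagged should go to the user for review rather than be silently included:

```python
import re

# Heuristic patterns only — a sketch, not an exhaustive secret detector
SECRET_PATTERNS = [
    re.compile(r"gh[pousr]_[A-Za-z0-9]{36,}"),                        # GitHub token shapes
    re.compile(r"(?i)(api[_-]?key|password|secret|token)\s*[:=]\s*\S+"),
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
]

def looks_sensitive(text):
    """Flag text for human review before it reaches the wiki."""
    return any(p.search(text) for p in SECRET_PATTERNS)
```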

## Reference

See `references/copilot-data-format.md` for detailed data structure documentation.