imessage-query

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

iMessage Database Query

iMessage数据库查询

Query the macOS iMessage SQLite database (
~/Library/Messages/chat.db
) to retrieve conversation history, decode messages stored in binary format, and build sourced timelines with precise timestamps.
查询macOS iMessage SQLite数据库(
~/Library/Messages/chat.db
)以获取对话历史记录,解码以二进制格式存储的消息,并构建带有精确时间戳的溯源时间线。

When to Use

适用场景

  • Retrieving iMessage conversation history for a specific contact
  • Building sourced timelines with timestamps from text messages
  • Searching for keywords across all conversations
  • Debugging messages that appear empty but contain recoverable text
  • Extracting message content that iOS stored in binary
    attributedBody
    format
  • 检索特定联系人的iMessage对话历史记录
  • 基于短信时间戳构建溯源时间线
  • 在所有对话中搜索关键词
  • 调试显示为空但包含可恢复文本的消息
  • 提取iOS以二进制
    attributedBody
    格式存储的消息内容

Prerequisites

前提条件

  1. macOS only
    chat.db
    is a macOS-specific database
  2. Full Disk Access — The terminal running Claude Code must have FDA granted in System Settings > Privacy & Security > Full Disk Access
  3. Read-only — Never write to
    chat.db
    . Always use read-only SQLite access.
  1. 仅支持macOS
    chat.db
    是macOS专属数据库
  2. 完全磁盘访问权限 — 运行Claude Code的终端必须在“系统设置 > 隐私与安全性 > 完全磁盘访问权限”中获得授权
  3. 只读访问 — 切勿写入
    chat.db
    。始终使用SQLite只读访问模式。

Critical Knowledge - The
text
vs
attributedBody
Problem

关键知识点 -
text
attributedBody
的问题

IMPORTANT: Many iMessage messages have a NULL or empty
text
column but contain valid, recoverable text in the
attributedBody
column. This is NOT because they are voice messages — iOS stores dictated messages, messages with rich formatting, and some regular messages in
attributedBody
as an NSAttributedString binary blob.
重要提示:许多iMessage消息的
text
列为空或NULL,但在
attributedBody
列中包含有效的可恢复文本。这并非因为它们是语音消息——iOS会将听写消息、带有富格式的消息以及部分普通消息以NSAttributedString二进制 blob的形式存储在
attributedBody
中。

How to detect

如何检测

sql
-- Messages with attributedBody but no text (these are NOT necessarily voice messages)
SELECT COUNT(*) as hidden_messages
FROM message m
JOIN chat_message_join cmj ON m.ROWID = cmj.message_id
JOIN chat c ON cmj.chat_id = c.ROWID
WHERE c.chat_identifier = '<CHAT_IDENTIFIER>'
AND (m.text IS NULL OR length(m.text) = 0)
AND m.attributedBody IS NOT NULL
AND length(m.attributedBody) > 100
AND m.associated_message_type = 0
AND m.cache_has_attachments = 0;
sql
-- 存在attributedBody但无text的消息(这些不一定是语音消息)
SELECT COUNT(*) as hidden_messages
FROM message m
JOIN chat_message_join cmj ON m.ROWID = cmj.message_id
JOIN chat c ON cmj.chat_id = c.ROWID
WHERE c.chat_identifier = '<CHAT_IDENTIFIER>'
AND (m.text IS NULL OR length(m.text) = 0)
AND m.attributedBody IS NOT NULL
AND length(m.attributedBody) > 100
AND m.associated_message_type = 0
AND m.cache_has_attachments = 0;

How to distinguish message types when
text
is NULL

text
为NULL时如何区分消息类型

cache_has_attachments
attributedBody
length
Likely type
0> 100 bytesDictated/rich text — recoverable via decode script
1anyAttachment (image, file, voice memo) — text may be in
attributedBody
too
0< 50 bytesTapback reaction or system message — usually noise
cache_has_attachments
attributedBody
长度
可能的类型
0> 100字节听写/富文本 — 可通过解码脚本恢复
1任意附件(图片、文件、语音备忘录) — 文本可能也存储在
attributedBody
0< 50字节Tapback反应或系统消息 — 通常是无效信息

How to decode

如何解码

Use the bundled decode script for reliable extraction:
bash
python3 <skill-path>/scripts/decode_attributed_body.py --chat "<CHAT_IDENTIFIER>" --limit 50
Or use inline Python for one-off decoding (see Known Pitfalls for the technique).
使用内置的解码脚本进行可靠提取:
bash
python3 <skill-path>/scripts/decode_attributed_body.py --chat "<CHAT_IDENTIFIER>" --limit 50
或者使用内嵌Python进行一次性解码(技术细节见已知陷阱)。

Date Formula

日期公式

iMessage stores dates as nanoseconds since Apple epoch (2001-01-01 00:00:00 UTC).
sql
datetime(m.date/1000000000 + 978307200, 'unixepoch', 'localtime') as timestamp
  • m.date / 1000000000
    — Convert nanoseconds to seconds
  • + 978307200
    — Add offset from Unix epoch (1970) to Apple epoch (2001)
  • 'unixepoch'
    — Tell SQLite this is a Unix timestamp
  • 'localtime'
    — Convert to local timezone (CRITICAL — omitting this gives UTC)
iMessage将日期存储为自Apple纪元(2001-01-01 00:00:00 UTC)以来的纳秒数
sql
datetime(m.date/1000000000 + 978307200, 'unixepoch', 'localtime') as timestamp
  • m.date / 1000000000
    — 将纳秒转换为秒
  • + 978307200
    — 添加从Unix纪元(1970)到Apple纪元(2001)的偏移量
  • 'unixepoch'
    — 告知SQLite这是一个Unix时间戳
  • 'localtime'
    — 转换为本地时区(至关重要 — 省略此参数将返回UTC时间)

Quick Start Queries

快速入门查询

1. List all conversations

1. 列出所有对话

sql
sqlite3 ~/Library/Messages/chat.db \
  "SELECT c.chat_identifier, c.display_name, COUNT(cmj.message_id) as msg_count
   FROM chat c
   JOIN chat_message_join cmj ON c.ROWID = cmj.chat_id
   GROUP BY c.ROWID
   ORDER BY msg_count DESC
   LIMIT 20"
sql
sqlite3 ~/Library/Messages/chat.db \
  "SELECT c.chat_identifier, c.display_name, COUNT(cmj.message_id) as msg_count
   FROM chat c
   JOIN chat_message_join cmj ON c.ROWID = cmj.chat_id
   GROUP BY c.ROWID
   ORDER BY msg_count DESC
   LIMIT 20"

2. Get conversation thread (text column only)

2. 获取对话线程(仅text列)

sql
sqlite3 ~/Library/Messages/chat.db \
  "SELECT datetime(m.date/1000000000 + 978307200, 'unixepoch', 'localtime') as ts,
          CASE WHEN m.is_from_me = 1 THEN 'Me' ELSE 'Them' END as sender,
          m.text
   FROM message m
   JOIN chat_message_join cmj ON m.ROWID = cmj.message_id
   JOIN chat c ON cmj.chat_id = c.ROWID
   WHERE c.chat_identifier = '<CHAT_IDENTIFIER>'
   AND length(m.text) > 0
   AND m.associated_message_type = 0
   ORDER BY m.date DESC
   LIMIT 50"
sql
sqlite3 ~/Library/Messages/chat.db \
  "SELECT datetime(m.date/1000000000 + 978307200, 'unixepoch', 'localtime') as ts,
          CASE WHEN m.is_from_me = 1 THEN 'Me' ELSE 'Them' END as sender,
          m.text
   FROM message m
   JOIN chat_message_join cmj ON m.ROWID = cmj.message_id
   JOIN chat c ON cmj.chat_id = c.ROWID
   WHERE c.chat_identifier = '<CHAT_IDENTIFIER>'
   AND length(m.text) > 0
   AND m.associated_message_type = 0
   ORDER BY m.date DESC
   LIMIT 50"

3. Get ALL messages including attributedBody (use decode script)

3. 获取所有消息(包括attributedBody,使用解码脚本)

bash
python3 <skill-path>/scripts/decode_attributed_body.py \
  --chat "<CHAT_IDENTIFIER>" \
  --after "2026-01-01" \
  --limit 100
bash
python3 <skill-path>/scripts/decode_attributed_body.py \
  --chat "<CHAT_IDENTIFIER>" \
  --after "2026-01-01" \
  --limit 100

Filtering Noise

过滤无效信息

Tapback reactions

Tapback反应

Tapback reactions (likes, loves, emphasis, etc.) are stored as separate message rows with
associated_message_type != 0
. Always filter:
sql
AND m.associated_message_type = 0
Tapback反应(点赞、爱心、强调等)作为独立的消息行存储,且
associated_message_type != 0
。请始终过滤:
sql
AND m.associated_message_type = 0

Shell escaping in zsh

zsh中的Shell转义问题

The
!=
operator can cause issues in zsh. Use positive assertions instead:
sql
-- BAD (breaks in zsh)
AND m.text != ''

-- GOOD (works everywhere)
AND length(m.text) > 0
!=
运算符在zsh中可能会导致问题。请改用正向断言:
sql
-- 错误写法(在zsh中会失效)
AND m.text != ''

-- 正确写法(在所有环境中有效)
AND length(m.text) > 0

Using the Decode Script

使用解码脚本

The bundled
decode_attributed_body.py
handles all edge cases:
bash
undefined
内置的
decode_attributed_body.py
可处理所有边缘情况:
bash
undefined

Basic usage - get last 50 messages from a contact

基础用法 - 获取联系人的最近50条消息

python3 <skill-path>/scripts/decode_attributed_body.py --chat "+1234567890" --limit 50
python3 <skill-path>/scripts/decode_attributed_body.py --chat "+1234567890" --limit 50

Search for keyword

搜索关键词

python3 <skill-path>/scripts/decode_attributed_body.py --chat "+1234567890" --search "meeting"
python3 <skill-path>/scripts/decode_attributed_body.py --chat "+1234567890" --search "meeting"

Date range

日期范围

python3 <skill-path>/scripts/decode_attributed_body.py --chat "+1234567890" --after "2026-01-01" --before "2026-02-01"
python3 <skill-path>/scripts/decode_attributed_body.py --chat "+1234567890" --after "2026-01-01" --before "2026-02-01"

Only messages from the other party

仅获取对方发送的消息

python3 <skill-path>/scripts/decode_attributed_body.py --chat "+1234567890" --sender them
python3 <skill-path>/scripts/decode_attributed_body.py --chat "+1234567890" --sender them

Only messages from me

仅获取我发送的消息

python3 <skill-path>/scripts/decode_attributed_body.py --chat "+1234567890" --sender me

Output format: `timestamp|sender|text` (pipe-delimited, one message per line)

**Note**: Replace `<skill-path>` with the actual installed skill path. To find it:

```bash
find ~/.claude -path "*/imessage-query/scripts/decode_attributed_body.py" 2>/dev/null
python3 <skill-path>/scripts/decode_attributed_body.py --chat "+1234567890" --sender me

输出格式:`timestamp|sender|text`(竖线分隔,每行一条消息)

**注意**:将`<skill-path>`替换为实际的技能安装路径。查找路径的方法:

```bash
find ~/.claude -path "*/imessage-query/scripts/decode_attributed_body.py" 2>/dev/null

Reference Documentation

参考文档

  • Schema Reference — Tables, columns, relationships
  • Query Patterns — Reusable SQL templates for common operations
  • Known Pitfalls — Every gotcha discovered and how to handle it

  • Schema参考 — 表、列、关系
  • 查询模式 — 可复用的SQL模板,适用于常见操作
  • 已知陷阱 — 已发现的所有问题及解决方法

TodoWrite Task Templates

TodoWrite任务模板

Template A - Retrieve Conversation Thread

模板A - 检索对话线程

1. Identify chat_identifier for the contact (phone number or email)
2. Run decode script with --chat and appropriate date range
3. Review output for attributedBody-decoded messages (marked with [decoded])
4. If searching for specific topic, add --search flag
5. Format results as needed for the task
1. 确定联系人的chat_identifier(电话号码或邮箱)
2. 使用--chat参数和合适的日期范围运行解码脚本
3. 查看输出中的attributedBody解码消息(标记为[decoded])
4. 如果搜索特定主题,添加--search参数
5. 根据任务需求格式化结果

Template B - Debug Empty Messages

模板B - 调试空消息

1. Query messages where text IS NULL but attributedBody IS NOT NULL
2. Check cache_has_attachments to distinguish voice/file from dictated text
3. Run decode script to extract hidden text content
4. Verify decoded content makes sense in conversation context
5. Document any new decode patterns in known-pitfalls.md
1. 查询text为NULL但attributedBody不为NULL的消息
2. 检查cache_has_attachments以区分语音/文件消息与听写文本
3. 运行解码脚本提取隐藏的文本内容
4. 验证解码后的内容在对话语境中是否合理
5. 在known-pitfalls.md中记录任何新的解码模式

Template C - Build Sourced Timeline

模板C - 构建溯源时间线

1. Identify all relevant chat_identifiers
2. Run decode script for each contact with date range
3. Merge and sort by timestamp
4. Format as sourced quotes with timestamps for documentation
5. Verify no messages were missed (compare total count vs decoded count)

1. 确定所有相关的chat_identifier
2. 为每个联系人运行解码脚本并指定日期范围
3. 合并结果并按时间戳排序
4. 将结果格式化为带有时间戳的溯源引用,用于文档
5. 验证是否有消息遗漏(比较总计数与解码计数)

Post-Change Checklist

变更后检查清单

After modifying this skill:
  1. YAML frontmatter valid (name, description with triggers)
  2. No private data (phone numbers, names, emails) in any file
  3. All SQL uses parameterized placeholders
  4. Decode script works with
    python3
    (no external deps)
  5. All reference links are relative paths
  6. Append changes to evolution-log.md
修改本技能后:
  1. YAML前置元数据有效(名称、包含触发词的描述)
  2. 所有文件中无私人数据(电话号码、姓名、邮箱)
  3. 所有SQL使用参数化占位符
  4. 解码脚本可通过
    python3
    运行(无外部依赖)
  5. 所有参考链接均为相对路径
  6. 将变更记录到evolution-log.md