entity-memory
Skill: Entity Memory
This is your persistent memory for everything you know about a record — contacts, companies, employees, members, locations, devices, and more. Store and retrieve: customer data, interaction history, inputs, actions you've taken, reports, notes, and observations. Always consult Memory before acting on a record. Always update Memory after meaningful interactions.
Internal principle: Bad data in = bad personalization out. Memory is the foundation. Get it right, and every downstream feature (emails, notifications, dashboards, agents) gets better automatically.
When This Skill is Activated
This skill gives you the ability to store and retrieve data using the Personize SDK's memory system.
If the developer hasn't given a specific instruction yet, introduce yourself:
"I have access to the Memory skill. I can help you store data into Personize memory (memorize) and retrieve it (recall) — including batch syncs, semantic search, entity digests, and data export. What data are you working with?"
If the developer says something about storing, syncing, importing, or ingesting data, jump to MEMORIZE.
If the developer says something about retrieving, querying, searching, or assembling context, jump to RECALL.
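The routing rules above can be sketched as a small keyword router. This is only an illustration: `routeRequest`, the `Action` type, and both keyword lists are hypothetical names chosen for this sketch and are not part of the Personize SDK. When a message matches both lists, the sketch arbitrarily prefers MEMORIZE.

```typescript
// Hypothetical router: the keyword lists mirror the two rules above and are
// illustrative only, not part of the Personize SDK.
type Action = 'MEMORIZE' | 'RECALL' | 'INTRODUCE';

const MEMORIZE_HINTS = ['store', 'storing', 'sync', 'import', 'ingest'];
const RECALL_HINTS = ['retrieve', 'retrieving', 'query', 'querying', 'search', 'context'];

function routeRequest(message: string): Action {
  const text = message.toLowerCase();
  // Storing, syncing, importing, or ingesting data maps to MEMORIZE.
  if (MEMORIZE_HINTS.some((hint) => text.includes(hint))) return 'MEMORIZE';
  // Retrieving, querying, searching, or assembling context maps to RECALL.
  if (RECALL_HINTS.some((hint) => text.includes(hint))) return 'RECALL';
  // No specific instruction yet: fall back to the introduction.
  return 'INTRODUCE';
}
```

A real agent would likely rely on the model's own intent understanding rather than substring matching; the sketch only makes the decision order explicit.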
When NOT to Use This Skill
- For CRM sync with deploy templates → see the CRM / Database Sync section below
- Need no-code visual workflows → use no-code-pipelines
- Need durable scheduled pipelines with retries → use code-pipelines
- Need to manage organizational rules, not entity data → use governance
- Need multi-agent coordination state → use collaboration
Works With Both SDK and MCP — One Skill, Two Interfaces
This skill works identically whether the LLM accesses memory via the SDK (code, scripts, IDE agents) or via MCP (Claude Desktop, ChatGPT, Cursor MCP connection).
| Interface | How it works | Best for |
|---|---|---|
| SDK (`@personize/sdk`) | `client.memory.*` methods | Scripts, CI/CD, IDE agents, recipes |
| MCP (Model Context Protocol) | `memory_store_pro`, `memory_recall_pro`, and related tools | Claude Desktop, ChatGPT, Cursor, any MCP-compatible client |
MCP tools map to SDK methods:
| SDK Method | MCP Tool | Purpose |
|---|---|---|
| `memorize()` | `memory_store_pro` | Store data with AI extraction |
| `smartRecall()` | `memory_recall_pro` | Semantic search (recommended) |
| `recall()` | (SDK only) | Direct DynamoDB lookup: properties + freeform memories (`type` required) |
| `smartDigest()` | (SDK only) | Compiled entity context (properties + memories) |
| `search()` | (SDK only) | Filter and export records |
| `memorizeBatch()` | (SDK only) | Batch sync with per-property control |
| `smartGuidelines()` | | Fetch guidelines by topic |
MCP-Only Feature: Self-Memory (`about='self'`)
MCP tools support an `about` parameter that the SDK does not expose directly:
- `about='lead'` (default): store/recall about a contact or company. Requires `email`, `website_url`, or `record_id`.
- `about='self'`: store/recall about the current user (preferences, working style, goals). No identifier needed; identity is resolved automatically.
// MCP: Store user preferences
memory_store_pro(content="I prefer formal communication. My timezone is PST.", about="self")
// MCP: Recall user preferences
memory_recall_pro(query="What are my preferences and working style?", about="self", generate_answer=true)
When reading this skill document:
- If you're connected via MCP, use the MCP tool names (`memory_store_pro`, `memory_recall_pro`, etc.)
- If you're running via SDK, use the `client.memory.*` methods
- All workflows, rules, and best practices apply equally to both interfaces
Actions
You have 2 actions. Use whichever matches what the developer needs.
| Action | When to Use | Reference |
|---|---|---|
| MEMORIZE | Developer needs to store data: single items, batch sync, CRM import, webhook data, generated outputs | `reference/memorize.md` |
| RECALL | Developer needs to retrieve data: semantic search, entity context, filtered exports, context assembly | `reference/recall.md` |
Before each action: Read the reference file for full method signatures, decision trees, code examples, and common mistakes.
Action: MEMORIZE
Store data into Personize memory. The right method depends on what you're storing and how much of it.
Which Method to Use
| Scenario | Method | Why |
|---|---|---|
| One item, with AI extraction | `memorize()` | Rich text (notes, transcripts, emails) → AI extracts facts and creates vectors |
| Batch sync from CRM/DB | `memorizeBatch()` | Multiple records with per-property `extractMemories` control |
| Structured data, no AI needed | `memorizeBatch()` with `extractMemories: false` | Store exact key-value pairs (email, plan_tier, login_count) without AI overhead |
The `extractMemories` Decision
| Data Type | `extractMemories` | Reasoning |
|---|---|---|
| Rich text (notes, transcripts, emails, descriptions) | `true` | AI extracts facts, creates vector embeddings for semantic search |
| Generated content (AI outputs you want to remember) | `true` | Enables the feedback loop: AI knows what it already said |
| ML outputs with explanations (churn reason, lead score rationale) | `true` | The explanation text benefits from extraction |
| Structured facts (email, name, plan, dates, counts) | `false` | Already structured; AI extraction wastes tokens and adds latency |
| Binary flags, IDs, URLs | `false` | No semantic content to extract |
Rule of thumb: Always set `extractMemories: true` on any field containing free-form text. If you skip it, those fields get stored as properties but produce zero memories, so `smartRecall()` and `smartDigest()` won't find them.
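The decision table above can be encoded once so that every property mapping follows the same rule. This is a minimal sketch: the `FieldKind` labels, `shouldExtractMemories`, and `mapProperty` are hypothetical helpers invented here. Only the shape of the returned object (`sourceField`, `collectionId`, `collectionName`, `extractMemories`) comes from the batch example in this document.

```typescript
// Hypothetical classification of source fields, one label per table row above.
type FieldKind =
  | 'rich_text'
  | 'generated_content'
  | 'ml_explanation'
  | 'structured_fact'
  | 'flag_or_id';

// Free-form text gets AI extraction; already-structured values do not.
function shouldExtractMemories(kind: FieldKind): boolean {
  switch (kind) {
    case 'rich_text':
    case 'generated_content':
    case 'ml_explanation':
      return true;
    case 'structured_fact':
    case 'flag_or_id':
      return false;
  }
}

// Same per-property shape as the memorizeBatch() mapping shown in this document.
interface PropertyMapping {
  sourceField: string;
  collectionId: string;
  collectionName: string;
  extractMemories: boolean;
}

function mapProperty(
  sourceField: string,
  kind: FieldKind,
  collectionId: string,
  collectionName: string,
): PropertyMapping {
  return { sourceField, collectionId, collectionName, extractMemories: shouldExtractMemories(kind) };
}
```

Classifying each field explicitly makes it harder to forget the flag on a free-text field, which is the failure the rule of thumb warns about.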
Quick Example
typescript
// Single item — AI extraction with identity hints
await client.memory.memorize({
content: 'Also extract First Name, Last Name, Company Name, and Job Title if mentioned.\n\nCall with Sarah Chen (VP Eng, Initech). She mentioned they are evaluating SOC2 compliance tools. Main pain point: manual audit prep taking 2 weeks per quarter. Budget approved for Q2.',
speaker: 'Sales Team',
email: 'sarah.chen@initech.com',
enhanced: true,
tags: ['call-notes', 'sales', 'source:manual'],
});
// Batch sync — per-property control
await client.memory.memorizeBatch({
source: 'Hubspot',
mapping: {
entityType: 'contact',
email: 'email',
runName: 'hubspot-contact-sync',
properties: {
full_name: { sourceField: 'firstname', collectionId: 'col_xxx', collectionName: 'Contacts', extractMemories: false },
job_title: { sourceField: 'jobtitle', collectionId: 'col_xxx', collectionName: 'Contacts', extractMemories: false },
last_notes: { sourceField: 'notes', collectionId: 'col_xxx', collectionName: 'Contacts', extractMemories: true },
},
},
rows: crmContacts, // array of objects from your CRM
});
// ⚠️ memorizeBatch() is async — records land in ~1-2 minutes (EventBridge → Lambda).
// Verify with search() or smartDigest() after processing completes.
Constraints
Keywords follow RFC 2119: MUST = non-negotiable, SHOULD = strong default (override with stated reasoning), MAY = agent discretion.
- MUST include at least one tag on every `memorize()` call (e.g. `tags: ['source:hubspot', 'type:interaction', 'team:sales']`) -- because tags enable filtering, attribution, and workspace scoping; untagged memories are unsearchable by category.
- SHOULD include a timestamp in the `content` or use the `timestamp` parameter -- because temporal ordering lets recall distinguish recent facts from stale ones.
- MUST NOT pre-process content with an LLM before calling `memorize()` with `enhanced: true` -- because double-processing wastes tokens and the extraction pipeline is optimized for raw input.
- MUST NOT manually deduplicate before memorizing -- because the platform deduplicates at cosine 0.92 similarity and runs background consolidation; client-side dedup adds complexity with no benefit.
- SHOULD memorize generated outputs (emails, notifications, reports) after delivery -- because the feedback loop lets future recalls see what was already sent, preventing repetition.
- SHOULD use `client.collections.create/update/delete()` or the web app for schema changes -- because collections define the extraction schema and ad-hoc creation risks inconsistency.
- MUST call `client.me()` before batch operations to read plan rate limits -- because exceeding limits causes 429 errors and partial syncs with no automatic resume.
- SHOULD prepend extraction hints for identity/demographic fields (name, company, title, location) when those fields may be empty for the record -- because the property selector uses embedding similarity, and generic identity fields score low against specific content; hints ensure they are selected alongside the content-relevant properties without limiting the selector. See `reference/memorize.md` → "Extraction Hints" for the full pattern.
Full guide: Read `reference/memorize.md` for complete method signatures, data mapping patterns, all source-specific recipes (CRM, database, webhook, CSV), batch strategies, error handling, and the feedback loop.
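Two of the constraints above (at least one tag, and a timestamp where available) are easy to enforce before the call is made. The sketch below assumes a hypothetical `buildMemorizePayload` helper. The field names follow the `memorize()` options listed in this document, while the ISO 8601 timestamp format is an assumption, since the document only types `timestamp` as a string.

```typescript
// Subset of the memorize() options shown in this document.
interface MemorizePayload {
  content: string;
  email?: string;
  tags: string[];
  timestamp?: string;
  enhanced: boolean;
}

// Hypothetical helper: rejects untagged payloads and attaches a timestamp.
function buildMemorizePayload(input: {
  content: string;
  email?: string;
  tags: string[];
  occurredAt?: Date;
}): MemorizePayload {
  if (input.tags.length === 0) {
    throw new Error('At least one tag is required on every memorize() call');
  }
  // Raw content is passed through untouched: no LLM pre-processing, no dedup.
  const payload: MemorizePayload = {
    content: input.content,
    tags: [...input.tags],
    enhanced: true,
  };
  if (input.email) payload.email = input.email;
  // Assumption: the API accepts ISO 8601 strings for `timestamp`.
  if (input.occurredAt) payload.timestamp = input.occurredAt.toISOString();
  return payload;
}
```

The result can then be passed to `client.memory.memorize(...)`, so the MUST rule fails fast on the client instead of producing unsearchable memories.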
CRM / Database Sync
For production-grade data sync from CRMs and databases (Salesforce, HubSpot, Postgres), this skill includes source-specific connector templates and deployment configs:
- Source templates: `templates/salesforce.md`, `templates/hubspot.md`, `templates/postgres.md`: fetch patterns, auth setup, field mapping for each source
- Deployment: `deploy/Dockerfile`, `deploy/render.yaml`, `deploy/github-action.yml`: scheduled sync on Render, GitHub Actions, or any container platform
- Advanced patterns: `reference/sync-advanced-patterns.md`: incremental sync with state tracking, multi-source architecture, batch export with pagination, complete end-to-end example
The integration pattern: initialize project → `client.me()` for auth + limits → fetch rows from source → `client.collections.list()` for collection IDs → build property mapping → `memorizeBatch()` in chunks with 429 retry → verify with `search()` or `smartDigest()`. See `recipes/data-sync.ts` for a runnable example.
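The "in chunks with 429 retry" step of the integration pattern can be sketched independently of the SDK. Everything here is illustrative: `chunk`, `sendInChunks`, and the assumption that a rate-limit error exposes a numeric `status` of 429 are hypothetical. In practice the `send` callback would wrap `client.memory.memorizeBatch(...)`, and the error shape should be checked against what the SDK actually throws.

```typescript
// Split rows into fixed-size batches.
function chunk<T>(rows: T[], size: number): T[][] {
  if (size < 1) throw new Error('size must be >= 1');
  const out: T[][] = [];
  for (let i = 0; i < rows.length; i += size) out.push(rows.slice(i, i + size));
  return out;
}

type Sleep = (ms: number) => Promise<void>;
const defaultSleep: Sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Assumption: rate-limit errors carry `status: 429`.
function isRateLimited(err: unknown): boolean {
  return typeof err === 'object' && err !== null && (err as { status?: number }).status === 429;
}

// Sends each batch in order, retrying only on 429 with exponential backoff.
async function sendInChunks<T>(
  rows: T[],
  size: number,
  send: (batch: T[]) => Promise<void>,
  maxRetries = 3,
  sleep: Sleep = defaultSleep,
): Promise<number> {
  let sent = 0;
  for (const batch of chunk(rows, size)) {
    for (let attempt = 0; ; attempt++) {
      try {
        await send(batch);
        sent += batch.length;
        break;
      } catch (err) {
        // Non-429 errors and exhausted retries surface to the caller.
        if (!isRateLimited(err) || attempt >= maxRetries) throw err;
        await sleep(1000 * 2 ** attempt); // 1s, 2s, 4s, ...
      }
    }
  }
  return sent;
}
```

Because a failed batch stops the loop, the caller knows exactly how many rows were sent before the failure, which helps when resuming a partial sync by hand.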
Action: RECALL
Retrieve data from Personize memory. The right method depends on what kind of answer you need.
Which Method to Use
| Need | Method | Returns |
|---|---|---|
| "What do we know about X topic?" | `smartRecall()` | Semantic search results with optional reflection/answers (recommended) |
| "Quick vector lookup, no frills" | `recall()` | Direct vector search (`type` required) |
| "Give me everything about this person/company" | `smartDigest()` | Compiled markdown context: all properties + memories for one entity |
| "List all contacts matching criteria X" | `search()` | Filtered records with property values |
| "What are our guidelines for X?" | `smartGuidelines()` | Governance variables matching a topic |
`smartRecall()` vs `recall()`: Use `smartRecall()` for most use cases. It supports reflection, answer generation, `fast_mode`, and infers `type` from email/website_url. Use `recall()` only for simple direct lookups, where `type` is required (e.g. `type: 'Contact'`).
Identifier behavior: how `email`, `websiteUrl`, `recordId`, `type`-only, and no identifier affect each endpoint (error vs empty vs org-wide search) → read `reference/identifier-scenarios.md`.
When to Use What
Need specific facts about a topic? → smartRecall()
Need full context about ONE entity? → smartDigest()
Need to filter/segment a list of records? → search()
Need organizational rules/guidelines? → smartGuidelines()
Building a generation prompt? → smartGuidelines() + smartDigest() + smartRecall()
(governance + entity + task-specific facts)
Quick Example
typescript
// Semantic search — find specific facts (recommended)
const results = await client.memory.smartRecall({
query: 'what pain points did this contact mention?',
email: 'sarah.chen@initech.com',
type: 'Contact',
limit: 10,
minScore: 0.4,
include_property_values: true,
});
// Fast recall — skip reflection, ~500ms response
const fast = await client.memory.smartRecall({
query: 'what do we know about this contact?',
email: 'sarah.chen@initech.com',
type: 'Contact',
fast_mode: true,
});
// Entity digest — compiled context for one person
const digest = await client.memory.smartDigest({
email: 'sarah.chen@initech.com',
type: 'Contact',
token_budget: 2000,
include_properties: true,
include_memories: true,
});
// digest.data.compiledContext → ready-to-inject markdown
// Filtered export — find all enterprise contacts
const exported = await client.memory.search({
type: 'Contact',
returnRecords: true,
pageSize: 50,
groups: [{
conditions: [
{ field: 'plan_tier', operator: 'EQUALS', value: 'enterprise' },
{ field: 'email', operator: 'IS_SET' },
],
}],
});
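The filtered export above requests a single page of 50 records. Walking every page might look like the sketch below. The `Page` shape (`records`, `hasMore`), 1-based page numbers, and the `fetchAllPages` helper are all assumptions made for illustration; the real response fields should be taken from `reference/recall.md`.

```typescript
// Hypothetical page shape; the actual search() response may differ.
interface Page<T> {
  records: T[];
  hasMore: boolean;
}

type FetchPage<T> = (page: number, pageSize: number) => Promise<Page<T>>;

// Collects records page by page, with a hard cap so a bad `hasMore`
// flag cannot cause an endless loop.
async function fetchAllPages<T>(
  fetchPage: FetchPage<T>,
  pageSize = 50,
  maxPages = 1000,
): Promise<T[]> {
  const all: T[] = [];
  for (let page = 1; page <= maxPages; page++) {
    const { records, hasMore } = await fetchPage(page, pageSize);
    all.push(...records);
    if (!hasMore || records.length === 0) break;
  }
  return all;
}
```

Injecting `fetchPage` keeps the loop testable; in a real pipeline it would wrap `client.memory.search(...)` with the same `groups` filter on every call.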
The Three-Layer Agent Operating Model
Memory is one of three layers every agent should assemble before acting: Guidelines (organizational rules via `smartGuidelines()`), Memory (entity knowledge via `smartDigest()`/`recall()`), and Workspace (coordination state via workspace-tagged `recall()`/`memorize()`). All three together: the agent acts within governance, with full context, in coordination with others.
Full architecture guide: See the `collaboration` skill's `reference/architecture.md` for the complete three-layer model, composition patterns, and adoption path.
Cross-Entity Context
Memory gives you everything about ONE entity. But agents often need context from related entities — the company a contact works at, other contacts at the same account, related deals or projects.
Pattern: Multi-entity context assembly
typescript
// When working on a contact, also pull their company context
const [contactDigest, companyDigest] = await Promise.all([
client.memory.smartDigest({ email: 'sarah@acme.com', type: 'Contact', token_budget: 1500 }),
client.memory.smartDigest({ website_url: 'https://acme.com', type: 'Company', token_budget: 1000 }),
]);
// Now you know Sarah AND you know Acme: funding stage, tech stack, team size, etc.
When to pull cross-entity context:
- Working on a contact → also pull their company
- Working on a deal → also pull the contact AND the company
- Generating account-level content → pull all contacts at that company
- Detecting patterns → export across entity types and cross-reference
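For the "contact → company" case, the company identifier has to come from somewhere. One possible heuristic, shown here purely as an assumption, is to derive a website URL from the contact's email domain. `companyUrlFromEmail` and the short free-mail list are hypothetical; a stored company website property, where one exists, would be a more reliable source.

```typescript
// Illustrative list only; a real deployment would need a fuller set.
const FREE_MAIL_DOMAINS = new Set(['gmail.com', 'outlook.com', 'yahoo.com']);

// Hypothetical heuristic: sarah@acme.com -> https://acme.com.
// Returns null when no usable company domain can be derived.
function companyUrlFromEmail(email: string): string | null {
  const at = email.lastIndexOf('@');
  if (at < 1 || at === email.length - 1) return null;
  const domain = email.slice(at + 1).trim().toLowerCase();
  // Personal mailboxes say nothing about the employer.
  if (FREE_MAIL_DOMAINS.has(domain)) return null;
  return `https://${domain}`;
}
```

When the helper returns a URL, it can be passed as `website_url` to the company-side `smartDigest()` call; when it returns null, the company lookup is simply skipped.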
The Context Assembly Pattern
Most generation pipelines combine multiple recall methods:
typescript
async function assembleContext(email: string, task: string): Promise<string> {
const sections: string[] = [];
// 1. Governance — rules and guidelines
// Use mode: 'fast' for real-time agents (~200ms), 'full' for deep analysis (~3s)
const governance = await client.ai.smartGuidelines({
message: `${task} — guidelines, tone, constraints`,
mode: 'fast', // embedding-only routing, no LLM overhead
});
if (governance.data?.compiledContext) {
sections.push('## Guidelines\n' + governance.data.compiledContext);
}
// 2. Entity context — everything about this person
const digest = await client.memory.smartDigest({
email,
type: 'Contact',
token_budget: 2000,
include_properties: true,
include_memories: true,
});
if (digest.data?.compiledContext) {
sections.push('## Recipient Context\n' + digest.data.compiledContext);
}
// 3. Task-specific facts — semantic search
const recalled = await client.memory.smartRecall({
query: task,
email,
type: 'Contact',
fast_mode: true,
limit: 10,
minScore: 0.3,
});
if (recalled.data && Array.isArray(recalled.data) && recalled.data.length > 0) {
sections.push('## Relevant Facts\n' + recalled.data.map((m: any) =>
`- ${m.text || m.content || JSON.stringify(m)}`
).join('\n'));
}
return sections.join('\n\n---\n\n');
}
Constraints
Keywords follow RFC 2119: MUST = non-negotiable, SHOULD = strong default (override with stated reasoning), MAY = agent discretion.
- MUST set an explicit `token_budget` on every `smartDigest()` call -- because the default (1000) may truncate critical context for deep personalization or waste tokens for simple lookups.
- SHOULD set `minScore` on `smartRecall()` (0.3 for broad context, 0.5+ for precision) -- because omitting it returns low-relevance noise that dilutes the context window.
- SHOULD use `fast_mode: true` for context injection, real-time UIs, and batch processing -- because it cuts recall latency from ~10-20s to ~500ms; override for exploratory queries where reflection adds value.
- SHOULD assemble context from all three layers (`smartGuidelines` + `smartDigest` + `smartRecall`) before generating -- because single-source context produces governance-blind, entity-ignorant, or task-irrelevant output. Use `mode: 'fast'` for real-time agent flows (~200ms), `mode: 'full'` for first-call or complex planning tasks (~3s).
- MAY set `include_property_values: true` on `smartRecall()` -- because it returns structured properties alongside semantic results, useful when the caller needs both.
- MUST paginate `export()` calls using `page` and `pageSize` -- because unbounded exports can time out or exceed memory limits on large datasets. Default pageSize is 50.
- MAY cache `smartDigest()` results within a single pipeline run when the same entity is referenced multiple times -- because redundant API calls waste tokens and add latency.
Full guide: Read `reference/recall.md` for complete method signatures, query writing strategies, token budget tuning, scoring thresholds, all context assembly patterns, export filtering, and performance optimization.
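The optional caching rule above can be sketched as a per-run memoizer around whatever function fetches a digest. `withRunCache` and `DigestFetcher` are hypothetical names, and normalizing the email to lowercase for the cache key is an assumption about how entities are matched.

```typescript
type DigestFetcher = (email: string) => Promise<string>;

// Wraps a fetcher so each entity is fetched at most once per pipeline run.
// Create one wrapper per run; discard it afterwards so stale digests
// never leak into the next run.
function withRunCache(fetchDigest: DigestFetcher): DigestFetcher {
  const cache = new Map<string, Promise<string>>();
  return (email) => {
    // Assumption: emails identify the same entity regardless of case.
    const key = email.trim().toLowerCase();
    let pending = cache.get(key);
    if (!pending) {
      // Cache the promise itself so concurrent callers share one request;
      // drop it on failure so a later call can retry.
      pending = fetchDigest(email).catch((err) => {
        cache.delete(key);
        throw err;
      });
      cache.set(key, pending);
    }
    return pending;
  };
}
```

In a real pipeline the wrapped function would call `client.memory.smartDigest(...)` with an explicit `token_budget` and return `data.compiledContext`.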
SDK Method Reference
typescript
import { Personize } from '@personize/sdk';
const client = new Personize({ secretKey: process.env.PERSONIZE_SECRET_KEY! });
Memorize Methods
| Method | Endpoint | Purpose |
|---|---|---|
| `memorize()` | | Store single item with AI extraction |
| `memorizeBatch()` | | Batch sync with per-property `extractMemories` control |
Recall Methods
| Method | Endpoint | Purpose |
|---|---|---|
| `smartRecall()` | | Semantic search with reflection + answer gen (recommended) |
| `recall()` | | Direct DynamoDB lookup: properties + freeform memories (`type` required) |
| `smartDigest()` | | Compiled entity context (properties + memories) |
| `search()` | | Filter and export records |
| `smartGuidelines()` | | Fetch governance variables by topic |
Key Type Signatures
typescript
// memorize() — single item
interface MemorizeProOptions {
content: string; // The text to memorize
speaker?: string; // Who said/wrote it
timestamp?: string; // When it happened
email?: string; // Match to contact by email
website_url?: string; // Match to company by website
record_id?: string; // Match to record by ID
enhanced?: boolean; // Enable AI extraction (default: false)
tags?: string[]; // Categorization tags
max_properties?: number; // Max properties to extract
schema?: Record<string, unknown>; // Extraction schema hint
actionId?: string; // Target collection ID
}
// memorizeBatch() — batch sync
interface BatchMemorizeOptions {
source: string; // Source system label ('Hubspot', 'Salesforce')
mapping: {
entityType: string; // 'contact', 'company'
email?: string; // Source field name for email
website?: string; // Source field name for website
runName?: string; // Tracking label
properties: Record<string, {
sourceField: string; // Source field name in row data
collectionId: string; // Target collection ID
collectionName: string; // Target collection name
extractMemories?: boolean; // AI extraction for this property
}>;
};
rows: Record<string, unknown>[]; // Source data rows
dryRun?: boolean; // Validate without writing
chunkSize?: number; // Rows per chunk (default: 1)
}
// smartRecall() — semantic search (recommended)
interface SmartRecallOptions {
query: string; // Natural language query
limit?: number; // Max results (default: 10)
minScore?: number; // Minimum relevance score (0-1)
email?: string; // Scope to one contact
website_url?: string; // Scope to one company
record_id?: string; // Scope to one record
type?: string; // Entity type filter (optional — inferred from email/website_url)
include_property_values?: boolean; // Include structured properties
enable_reflection?: boolean; // AI reflects on results
generate_answer?: boolean; // AI generates a direct answer
fast_mode?: boolean; // Skip reflection + answer gen, ~500ms (default: false)
min_score?: number; // Server-side score filter (in fast_mode, defaults to 0.3)
}
// recall() — direct lookup (simpler, type required)
interface RecallOptions {
query: string; // Natural language query
type: string; // Entity type — REQUIRED (e.g. 'Contact', 'Company')
record_id?: string; // Scope to one record
email?: string; // Scope to one contact
website_url?: string; // Scope to one company
filters?: Record<string, unknown>; // Additional filters
}
// smartDigest() — entity context
interface SmartDigestOptions {
email?: string; // Contact email
website_url?: string; // Company website
record_id?: string; // Record ID
type?: string; // Entity type ('Contact', 'Company')
token_budget?: number; // Max tokens for output (default: 1000)
max_memories?: number; // Max memories to include (default: 20)
include_properties?: boolean; // Include structured properties (default: true)
include_memories?: boolean; // Include free-form memories (default: true)
}
The Data Model
┌─────────────────────────────────────────────────────────────┐
│                      PERSONIZE MEMORY                       │
│                                                             │
│  ┌───────────────────┐  ┌──────────────────────────────┐    │
│  │  STRUCTURED DATA  │  │  SEMANTIC MEMORIES           │    │
│  │  (DynamoDB)       │  │  (LanceDB + Vectors)         │    │
│  │                   │  │                              │    │
│  │  Records:         │  │  AI-extracted facts from:    │    │
│  │  ├─ email: "..."  │  │  ├─ Call notes               │    │
│  │  ├─ plan: "pro"   │  │  ├─ Support tickets          │    │
│  │  ├─ title: "VP"   │  │  ├─ Email threads            │    │
│  │  └─ login_count:5 │  │  ├─ Meeting transcripts      │    │
│  │                   │  │  └─ Generated outputs        │    │
│  └───────────────────┘  └──────────────────────────────┘    │
│            │                           │                    │
│            ▼                           ▼                    │
│    search() filters         smartRecall() searches          │
│    memorize() writes        memorizeBatch() writes          │
│    smartDigest() reads both ───────────┘                    │
└─────────────────────────────────────────────────────────────┘
- Structured data = exact key-value pairs. Queryable by field, filterable, paginated.
- Semantic memories = AI-extracted facts with vector embeddings. Searchable by meaning.
- smartDigest combines both into a single, token-budgeted markdown block.
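As an illustration of what "token-budgeted" can mean in practice, the sketch below merges structured properties and free-form memories into one markdown block and stops adding memories once a budget is reached. Everything here is an assumption made for illustration: `assembleDigest`, `DigestInput`, and the 4-characters-per-token estimate are hypothetical, and the real `smartDigest()` may order, truncate, and count tokens differently.

```typescript
// Hypothetical input shape; the real SDK response types may differ.
interface DigestInput {
  properties: Record<string, string | number | boolean>;
  memories: string[]; // assumed to be ordered most relevant first
}

// Rough estimate of about 4 characters per token. This is an assumption,
// not the tokenizer the service uses.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Combines both stores into one markdown block, capped by count and by budget.
function assembleDigest(input: DigestInput, tokenBudget = 1000, maxMemories = 20): string {
  const lines: string[] = ['## Properties'];
  for (const key of Object.keys(input.properties)) {
    lines.push(`- ${key}: ${String(input.properties[key])}`);
  }
  lines.push('## Memories');

  let used = estimateTokens(lines.join('\n'));
  for (const memory of input.memories.slice(0, maxMemories)) {
    const line = `- ${memory}`;
    const cost = estimateTokens(line);
    if (used + cost > tokenBudget) break; // stop before the budget is exceeded
    lines.push(line);
    used += cost;
  }
  return lines.join('\n');
}

console.log(
  assembleDigest({
    properties: { plan: 'pro', title: 'VP', login_count: 5 },
    memories: ['Asked about SSO on the last call', 'Opened a ticket about export limits'],
  })
);
```

Properties are written first in this sketch so that exact facts survive even under a tight budget; that ordering is a design choice of the example, not documented behavior.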
Available Resources
| Resource | Contents |
|---|---|
| | Full memorize guide: method signatures, data mapping, extractMemories decision tree, source recipes, batch strategies, error handling, feedback loop |
| | Full recall guide: method signatures, query strategies, token budgets, scoring, context assembly, export filtering, performance tips |
| | How each endpoint (memorize, recall, smartRecall, smartDigest) behaves with email, websiteUrl, recordId, type-only, or no identifier: scenarios A–G with an error vs. empty vs. success table |
| | Batch sync from CRM/database with validation and error handling |
| | Complete context assembly pattern combining all recall methods |
Signal Memorization Patterns
@personize/signal uses entity memory for its feedback loop and deferred notification pipeline. Understanding these patterns helps when debugging Signal behavior or building custom integrations.
Tag Conventions
| Tag | Written by | Purpose |
|---|---|---|
| | Engine (step 8) | Tracks delivered notifications; recalled during context assembly to prevent repetition |
| | Engine (step 5) | Marks notifications scored 40-60 for later digest compilation |
| | Engine (step 5) | Paired with |
| | DigestBuilder | Marks compiled digest notifications |
| | Engine (step 7) | Workspace entries created on SEND |
| | Engine (step 7) | Workspace entries created on DEFER |
Feedback Loop
After every SEND decision, Signal memorizes what was sent:
typescript
await client.memory.memorize({
  content: `[SIGNAL] Sent "${subject}" via ${channel} (score: ${score}). ${reasoning}`,
  email,
  enhanced: true,
  tags: ['signal:sent', `signal:channel:${channel}`, `signal:type:${eventType}`],
});
On the next evaluation for the same entity, the engine recalls recent `signal:sent` memories (step 3, 4th parallel call). The AI sees what was recently sent and can SKIP to avoid repetition, even if the pre-check dedup window has expired.
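The feedback loop can be pictured with a small, self-contained sketch. `buildSentMemory` reproduces the payload passed to `memorize()` in the call above; `looksRepetitive` is a deliberately simple stand-in for the AI's judgement, which in Signal is made by the model and not by string matching. Both function names and the `SendDecision` and `MemorizePayload` shapes are hypothetical.

```typescript
// Hypothetical shapes; only the payload fields mirror the memorize() call above.
interface SendDecision {
  subject: string;
  channel: string;
  score: number;
  reasoning: string;
  eventType: string;
  email: string;
}

interface MemorizePayload {
  content: string;
  email: string;
  enhanced: boolean;
  tags: string[];
}

// Builds the same content string and tag set as the feedback-loop call.
function buildSentMemory(d: SendDecision): MemorizePayload {
  return {
    content: `[SIGNAL] Sent "${d.subject}" via ${d.channel} (score: ${d.score}). ${d.reasoning}`,
    email: d.email,
    enhanced: true,
    tags: ['signal:sent', `signal:channel:${d.channel}`, `signal:type:${d.eventType}`],
  };
}

// Stand-in for the AI's SKIP decision: a candidate counts as repetitive when a
// recalled signal:sent memory already quotes the same subject.
function looksRepetitive(recalled: MemorizePayload[], subject: string): boolean {
  return recalled.some(
    (m) => m.tags.indexOf('signal:sent') !== -1 && m.content.indexOf(`"${subject}"`) !== -1
  );
}

const history: MemorizePayload[] = [
  buildSentMemory({
    subject: 'Trial ends in 3 days',
    channel: 'email',
    score: 82,
    reasoning: 'Trial expiry is near.',
    eventType: 'trial_expiring',
    email: 'jane@acme.com',
  }),
];

console.log(looksRepetitive(history, 'Trial ends in 3 days'));
console.log(looksRepetitive(history, 'New invoice available'));
```

Because the sent memory stays in entity memory, this check keeps working after any short-lived dedup window has expired, which is the point the paragraph above makes.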
Deferred → Digest Pipeline
- Defer (score 40-60): `memorize()` with tags `['signal:deferred', 'signal:pending-digest', eventType]`
- Digest build: `smartRecall({ query: 'deferred notifications', tags: ['signal:deferred'] })` retrieves pending items
- Compile: `prompt()` generates a personalized digest from all deferred items + entity context
- Deliver: Channel sends the compiled digest
- Mark processed: `memorize()` with tag `signal:digest`; future digest builds skip already-compiled items
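The five steps above can be sketched end to end with an in-memory array standing in for entity memory. This is a simplified model under stated assumptions: `deferIfBorderline`, `pendingItems`, and `runDigestBuild` are hypothetical names, the 40-60 band is treated as inclusive, the compile step is a plain join in place of `prompt()`, and "already compiled" is detected by checking whether a `signal:digest` memory contains the item's text. The source does not say how the real pipeline performs that last check.

```typescript
// In-memory stand-in for entity memory. The real pipeline calls memorize()
// and smartRecall(); this array only models the tagging behaviour.
interface StoredMemory {
  content: string;
  tags: string[];
}

const store: StoredMemory[] = [];

// Step 1: defer. Scores inside the 40-60 band are stored instead of sent.
function deferIfBorderline(content: string, score: number, eventType: string): boolean {
  if (score < 40 || score > 60) return false;
  store.push({ content, tags: ['signal:deferred', 'signal:pending-digest', eventType] });
  return true;
}

// Step 2: digest build. Deferred items not yet covered by a signal:digest memory.
function pendingItems(): StoredMemory[] {
  const compiled = store
    .filter((m) => m.tags.indexOf('signal:digest') !== -1)
    .map((m) => m.content)
    .join('\n');
  return store.filter(
    (m) => m.tags.indexOf('signal:deferred') !== -1 && compiled.indexOf(m.content) === -1
  );
}

// Steps 3-5: compile, deliver, mark processed. Returns how many items were compiled.
function runDigestBuild(deliver: (body: string) => void): number {
  const pending = pendingItems();
  if (pending.length === 0) return 0;
  const body = pending.map((m) => `- ${m.content}`).join('\n'); // stands in for prompt()
  deliver(body); // stands in for the channel
  store.push({ content: body, tags: ['signal:digest'] });
  return pending.length;
}

const delivered: string[] = [];
deferIfBorderline('Usage dropped 20% this week', 55, 'usage_drop');
deferIfBorderline('Payment failed', 90, 'payment_failed'); // outside the band, not deferred
const firstRun = runDigestBuild((body) => delivered.push(body));
const secondRun = runDigestBuild((body) => delivered.push(body));
console.log(firstRun, secondRun, delivered);
```

The second build finds nothing to compile because the first build left a `signal:digest` memory behind, mirroring the "mark processed" step.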
Querying Signal History
typescript
// What notifications has Signal sent to this contact?
const sent = await client.memory.smartRecall({
  query: 'notifications sent by signal',
  email: 'jane@acme.com',
  type: 'Contact',
  fast_mode: true,
  limit: 10,
});

// What's pending in the digest queue?
const pending = await client.memory.smartRecall({
  query: 'deferred notifications pending digest',
  email: 'jane@acme.com',
  type: 'Contact',
  fast_mode: true,
  limit: 20,
});