neo4j-genai-plugin-skill

When to Use

  • Generating embeddings inside Cypher without external Python (ai.text.embed())
  • Batch-embedding nodes/chunks during ingestion (ai.text.embedBatch())
  • Calling LLMs directly in Cypher for completions or GraphRAG (ai.text.completion())
  • Extracting structured JSON maps from an LLM inside Cypher (ai.text.structuredCompletion())
  • Aggregating LLM summaries over grouped rows (ai.text.aggregateCompletion())
  • Stateful chat sessions in Cypher (ai.text.chat())
  • Counting tokens or chunking text by token limit (ai.text.tokenCount(), ai.text.chunkByTokenLimit())

When NOT to Use

  • Python-based GraphRAG pipelines (VectorCypherRetriever, HybridCypherRetriever) → neo4j-graphrag-skill
  • Vector index CREATE / kNN search / SEARCH clause → neo4j-vector-index-skill
  • GDS embeddings (FastRP, Node2Vec) → neo4j-gds-skill
  • Fulltext / keyword search → neo4j-cypher-skill

Prerequisites

CYPHER 25 is required for all ai.* functions. There are two ways to enable it:
cypher
// Per-query prefix (self-managed, no admin rights needed):
CYPHER 25 MATCH (n:Chunk) ...

// Per-database default (admin; applies to all sessions):
ALTER DATABASE neo4j SET DEFAULT LANGUAGE CYPHER 25
Installation:
  • Aura: GenAI plugin enabled by default, no action needed
  • Self-managed JAR: copy the plugin JAR to the plugins/ directory
  • Docker: --env NEO4J_PLUGINS='["genai"]'
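For the per-query route, a client-side helper can add the prefix when it is missing. The following is a minimal Python sketch; the name ensure_cypher25 is hypothetical and not part of any Neo4j driver. It only inspects the very start of the string, so a query that opens with a comment or with a different version selector would need extra handling.

```python
import re

# Matches a leading "CYPHER 25" version selector, ignoring case and whitespace.
_PREFIX = re.compile(r"^\s*CYPHER\s+25\b", re.IGNORECASE)


def ensure_cypher25(query: str) -> str:
    """Return the query with a CYPHER 25 prefix, adding it only if absent."""
    if _PREFIX.match(query):
        return query
    return "CYPHER 25\n" + query.lstrip()
```

Queries that already carry the prefix pass through unchanged, so the helper is safe to apply to every ai.* query before it is sent.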

Provider Config Quick Reference

All ai.text.* functions accept a configuration :: MAP as the last argument.

| Provider string | Required keys | Notes |
|---|---|---|
| 'openai' | token, model | token = OpenAI API key |
| 'azure-openai' | token, resource, model | token = OAuth2 bearer; resource = Azure resource name |
| 'vertexai' | model, project, region, token or apiKey | publisher defaults to 'google' |
| 'bedrock-titan' | model, region, accessKeyId, secretAccessKey | Embedding only |
| 'bedrock-nova' | model, region, accessKeyId, secretAccessKey | Completion only |

Optional for all: vendorOptions :: MAP passes provider-specific extras (e.g. { dimensions: 1024 } for OpenAI).
❌ Never hardcode API key literals. ✅ Always use $param passed via the driver parameters dict.
Full provider config table → references/providers.md
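The required keys can be checked on the client before a query is sent, and the API key can be read from the environment rather than written into the query. The sketch below is one way to do that in Python. The helper names and the OPENAI_API_KEY environment variable are assumptions; the key sets are copied from the table above.

```python
import os

# Required keys per provider string, copied from the table above.
# "vertexai" additionally needs either "token" or "apiKey".
REQUIRED_KEYS = {
    "openai": {"token", "model"},
    "azure-openai": {"token", "resource", "model"},
    "vertexai": {"model", "project", "region"},
    "bedrock-titan": {"model", "region", "accessKeyId", "secretAccessKey"},
    "bedrock-nova": {"model", "region", "accessKeyId", "secretAccessKey"},
}


def missing_keys(provider: str, config: dict) -> list[str]:
    """Return the required keys absent from config, sorted for stable output."""
    if provider not in REQUIRED_KEYS:
        raise ValueError(f"unknown provider string: {provider!r}")
    present = set(config)
    missing = sorted(REQUIRED_KEYS[provider] - present)
    if provider == "vertexai" and not ({"token", "apiKey"} & present):
        missing.append("token or apiKey")
    return missing


def driver_params(env_var: str = "OPENAI_API_KEY") -> dict:
    """Build the parameters dict so the key reaches Cypher as $openaiKey."""
    # The environment variable name is an assumption, not a plugin convention.
    return {"openaiKey": os.environ[env_var]}
```

The dict returned by driver_params() is what would be handed to the driver as query parameters, so the key never appears as a literal in the Cypher text.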

Embedding

Single embed [2025.12]

cypher
CYPHER 25
MATCH (c:Chunk)
WHERE c.embedding IS NULL
WITH c
CALL {
  WITH c
  SET c.embedding = ai.text.embed(c.text, 'openai', {
    token: $openaiKey,
    model: 'text-embedding-3-small'
  })
} IN TRANSACTIONS OF 500 ROWS
ai.text.embed() returns VECTOR, which can be stored directly and queried through a vector index.

Batch embed procedure [2025.12]

cypher
CYPHER 25
MATCH (c:Chunk) WHERE c.embedding IS NULL
// Collect the nodes once so each result can be mapped back by position
WITH collect(c) AS chunks
// Pass the whole list of texts to the procedure in a single call
CALL ai.text.embedBatch([c IN chunks | c.text], 'openai', { token: $openaiKey, model: 'text-embedding-3-small' })
YIELD index, resource, vector
// index is the position of the text in the input list
WITH chunks[index] AS c, vector
SET c.embedding = vector
Procedure signature:
CALL ai.text.embedBatch(resources, provider, config) YIELD index, resource, vector

List configured embed providers

cypher
CYPHER 25
CALL ai.text.embed.providers()
YIELD name, requiredConfigType, optionalConfigType, defaultConfig
RETURN name, requiredConfigType

Text Completion [2025.12]

cypher
CYPHER 25
RETURN ai.text.completion(
  'Summarize: ' + $text,
  'openai',
  { token: $openaiKey, model: 'gpt-4o-mini' }
) AS summary
Returns STRING.

Aggregate completion — summarize across rows

cypher
CYPHER 25
MATCH (c:Chunk)-[:PART_OF]->(a:Article {id: $articleId})
RETURN ai.text.aggregateCompletion(
  c.text,
  'Summarize the following article chunks in 3 sentences',
  'openai',
  { token: $openaiKey, model: 'gpt-4o-mini' }
) AS summary
The value parameter is each row's STRING fed to the LLM; non-string values are converted with toString().

Pure-Cypher GraphRAG Pattern [2025.12]

Embed question → vector search → graph traverse → LLM completion — all in one Cypher query:
cypher
CYPHER 25
WITH ai.text.embed($question, 'openai', { token: $openaiKey, model: 'text-embedding-3-small' }) AS qEmbedding
CALL db.index.vector.queryNodes('chunk_embedding', 10, qEmbedding) YIELD node AS chunk, score
MATCH (chunk)<-[:HAS_CHUNK]-(article:Article)
OPTIONAL MATCH path = shortestPath((article)-[*..3]-(other:Article))
WITH chunk, article, collect(DISTINCT other.title) AS related, score
ORDER BY score DESC LIMIT 5
WITH collect(chunk.text + '\n[Source: ' + article.title + ']' +
  '\n[Related: ' + reduce(r = '', t IN related | r + t + '; ') + ']') AS context, $question AS question
RETURN ai.text.completion(
  'Answer based on context:\n' + reduce(s='', c IN context | s + c + '\n') + '\nQuestion: ' + question,
  'openai',
  { token: $openaiKey, model: 'gpt-4o-mini' }
) AS answer
Key insight (Bergman): shortest path between seed nodes surfaces relationships not visible from direct neighbors alone.
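The prompt string built in the RETURN clause can be mirrored on the client, which makes its shape easy to unit-test before spending LLM calls. The function below is an illustrative Python sketch, not part of the plugin; build_prompt is a hypothetical name, and it assumes the context entries have already been formatted as strings.

```python
def build_prompt(context: list[str], question: str) -> str:
    """Mirror the Cypher expression: header, one context entry per line, question."""
    # Equivalent to reduce(s = '', c IN context | s + c + '\n') in the query.
    joined = "".join(entry + "\n" for entry in context)
    return "Answer based on context:\n" + joined + "\nQuestion: " + question
```

Each context entry ends with a newline, and one blank line separates the context from the question.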

Structured Output [2025.12]

Returns a MAP whose fields can be stored as node properties or used downstream in Cypher.
cypher
CYPHER 25
MATCH (p:Product {id: $productId})
WITH p,
  ai.text.structuredCompletion(
    'Extract key attributes from: ' + p.description,
    {
      type: 'object',
      properties: {
        category: { type: 'string' },
        tags: { type: 'array', items: { type: 'string' } },
        priceRange: { type: 'string', enum: ['budget', 'mid', 'premium'] }
      },
      required: ['category', 'tags', 'priceRange'],
      additionalProperties: false
    },
    'openai',
    { token: $openaiKey, model: 'gpt-4o-mini' }
  ) AS extracted
SET p.category = extracted.category,
    p.priceRange = extracted.priceRange
WITH p, extracted.tags AS tags
UNWIND tags AS tag
MERGE (t:Tag {name: tag})
MERGE (p)-[:TAGGED]->(t)
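If the extracted map is returned to the client instead of being written in the same query, it can be checked against the schema before any write. This is a minimal Python sketch of such a check for the product schema above. The function name is hypothetical, and the check is hand-written for this one schema rather than a general JSON Schema validator.

```python
REQUIRED = ("category", "tags", "priceRange")
PRICE_RANGES = ("budget", "mid", "premium")


def schema_problems(extracted: dict) -> list[str]:
    """Return human-readable problems; an empty list means the map fits the schema."""
    problems = [f"missing key: {k}" for k in REQUIRED if k not in extracted]
    # additionalProperties: false means no keys beyond the declared ones.
    problems += [f"unexpected key: {k}" for k in sorted(set(extracted) - set(REQUIRED))]
    if "category" in extracted and not isinstance(extracted["category"], str):
        problems.append("category must be a string")
    tags = extracted.get("tags")
    if "tags" in extracted and not (
        isinstance(tags, list) and all(isinstance(t, str) for t in tags)
    ):
        problems.append("tags must be a list of strings")
    if "priceRange" in extracted and extracted["priceRange"] not in PRICE_RANGES:
        problems.append("priceRange must be one of budget, mid, premium")
    return problems
```

An empty result is the signal to proceed with the SET and MERGE steps; anything else is worth logging alongside the product id.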

Aggregate structured completion — extract across multiple rows

cypher
CYPHER 25
MATCH (:User {id: $userId})-[:ORDERED]->(o:Order)-[:CONTAINS]->(p:Product)
RETURN ai.text.aggregateStructuredCompletion(
  p.name + ': ' + p.category,
  'Build a shopping profile for this user',
  {
    type: 'object',
    properties: {
      preferredCategories: { type: 'array', items: { type: 'string' } },
      spendingTier: { type: 'string', enum: ['economy', 'standard', 'premium'] }
    },
    required: ['preferredCategories', 'spendingTier'],
    additionalProperties: false
  },
  'openai',
  { token: $openaiKey, model: 'gpt-4o-mini' }
) AS profile

Chat [2025.12]

Supported providers: openai and azure-openai only.
cypher
// Start new conversation (chatId = null → new session)
CYPHER 25
WITH ai.text.chat(
  'Hello, who are you?',
  null,
  'openai',
  { token: $openaiKey, model: 'gpt-4o-mini' }
) AS result
RETURN result.message AS reply, result.chatId AS sessionId

// Continue conversation (pass returned chatId)
CYPHER 25
WITH ai.text.chat(
  'What did I just ask you?',
  $chatId,
  'openai',
  { token: $openaiKey, model: 'gpt-4o-mini' }
) AS result
RETURN result.message AS reply, result.chatId AS sessionId
Returns MAP { message: STRING, chatId: STRING }. Store chatId to continue the session.
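On the client side, carrying chatId between turns can be wrapped in a small class. The sketch below assumes a hypothetical run_query callable that executes a query with parameters and returns the first row as a dict; how that callable talks to Neo4j is left open. It also assumes that passing null through the $chatId parameter behaves like the literal null in the first query above.

```python
from typing import Callable, Optional

CHAT_QUERY = """
CYPHER 25
WITH ai.text.chat($message, $chatId, 'openai',
                  { token: $openaiKey, model: 'gpt-4o-mini' }) AS result
RETURN result.message AS reply, result.chatId AS sessionId
"""


class ChatSession:
    """Keeps the chatId returned by ai.text.chat() between turns."""

    def __init__(self, run_query: Callable[[str, dict], dict], api_key: str):
        # run_query is a hypothetical callable: (query, params) -> first row as a dict.
        self._run = run_query
        self._api_key = api_key
        self.chat_id: Optional[str] = None  # None starts a new session

    def send(self, message: str) -> str:
        row = self._run(CHAT_QUERY, {
            "message": message,
            "chatId": self.chat_id,
            "openaiKey": self._api_key,
        })
        self.chat_id = row["sessionId"]  # reuse on the next turn
        return row["reply"]
```

Injecting run_query keeps the class testable with a fake and avoids tying the sketch to one driver API.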

Tokenization & Chunking [2025.12]

cypher
// Count tokens before sending to LLM
CYPHER 25
RETURN ai.text.tokenCount($text, 'openai', { token: $openaiKey, model: 'gpt-4o-mini' }) AS tokenCount

// Chunk text by token limit (no external dependencies)
CYPHER 25
UNWIND ai.text.chunkByTokenLimit($longText, 512, 'gpt-4', 50) AS chunk
MERGE (c:Chunk { text: chunk })
Signature: ai.text.chunkByTokenLimit(input, limit, model='gpt-4', overlap=0)
model controls the tokenizer; overlap is the number of tokens shared between consecutive chunks.
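To make the limit and overlap parameters concrete, the following Python sketch chunks a list of already-tokenized items. It is an illustration of sliding-window chunking under the assumption that the last overlap tokens of one chunk are repeated at the start of the next; it is not the plugin's implementation, and the real function works on text with a model-specific tokenizer.

```python
def chunk_tokens(tokens: list, limit: int, overlap: int = 0) -> list[list]:
    """Split tokens into windows of at most `limit`, sharing `overlap` tokens."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    if not 0 <= overlap < limit:
        raise ValueError("overlap must satisfy 0 <= overlap < limit")
    step = limit - overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + limit])
        if start + limit >= len(tokens):
            break  # this window already reached the end
        start += step
    return chunks
```

With ten tokens, a limit of 4 and an overlap of 1, the windows start at positions 0, 3 and 6, so each chunk begins with the final token of the previous one.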

Write Gate

SET node.embedding = ai.text.embed(...) and SET node.* = ai.text.structuredCompletion(...) write to the graph.
Before bulk writes:
  1. Count nodes first: MATCH (c:Chunk) WHERE c.embedding IS NULL RETURN count(c)
  2. Verify config with one test node before batch
  3. Use CALL { ... } IN TRANSACTIONS OF 500 ROWS for batches > 1000 nodes
  4. Require explicit confirmation before executing
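The four steps can be sketched as a small client-side gate. Everything below is a hypothetical Python outline: the callables stand in for whatever runs the count query, tests one node, asks the operator, and performs the write, and the status strings are arbitrary.

```python
from typing import Callable

COUNT_QUERY = "MATCH (c:Chunk) WHERE c.embedding IS NULL RETURN count(c) AS pending"


def gated_bulk_write(
    run_count: Callable[[str], int],
    test_one: Callable[[], bool],
    confirm: Callable[[int], bool],
    write_all: Callable[[bool], None],
    batch_threshold: int = 1000,
) -> str:
    """Run the four checks in order and return a short status string."""
    pending = run_count(COUNT_QUERY)         # 1. count target nodes
    if pending == 0:
        return "nothing to do"
    if not test_one():                       # 2. verify config on one node
        return "config test failed"
    use_batches = pending > batch_threshold  # 3. batch large writes
    if not confirm(pending):                 # 4. explicit confirmation
        return "cancelled"
    write_all(use_batches)
    return "written"
```

write_all receives a flag telling it whether to wrap the write in IN TRANSACTIONS OF 500 ROWS, so the batching decision is made once, from the count.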

Deprecated — Do NOT Use

| Old function | Replacement |
|---|---|
| genai.vector.encode() [deprecated] | ai.text.embed() |
| genai.vector.encodeBatch() [deprecated] | CALL ai.text.embedBatch() |
| genai.vector.listEncodingProviders() [deprecated] | CALL ai.text.embed.providers() |

Common Errors

| Error | Cause | Fix |
|---|---|---|
| Unknown function 'ai.text.embed' | Missing CYPHER 25 prefix OR plugin not installed | Add CYPHER 25 prefix; verify plugin installed |
| Cypher version not supported | Using CYPHER 25 on Neo4j < 5.20 or missing plugin | Upgrade Neo4j; ensure GenAI plugin loaded |
| Configuration key 'token' missing | Provider config map incomplete | Check required keys for provider (see table above) |
| null returned from embed | Wrong model name or provider auth failed | Test with RETURN ai.text.embed('test', 'openai', {token:$k, model:'text-embedding-3-small'}) standalone |
| Unsupported provider | Provider string typo (case-sensitive, lowercase) | Use 'openai' not 'OpenAI'; run CALL ai.text.embed.providers() |
| ai.text.chat fails on VertexAI | Chat only supported on openai/azure-openai | Switch to openai/azure-openai for chat |

Checklist

  • CYPHER 25 prefix present on every ai.text.* query
  • GenAI plugin installed (Aura: automatic; self-managed: JAR in plugins/)
  • API key passed as $param, never as literal string
  • model key explicit in config (no silent defaults)
  • Provider string lowercase ('openai', 'vertexai', 'bedrock-titan')
  • Bulk writes use IN TRANSACTIONS OF 500 ROWS; count target nodes first
  • genai.vector.encode() replaced with ai.text.embed() [2025.12]
  • Chat sessions: store returned chatId for continuation; only openai/azure-openai supported
  • Structured output schema uses additionalProperties: false to prevent hallucination keys

References
