neo4j-genai-plugin-skill

When to Use

Generating embeddings inside Cypher without external Python (`ai.text.embed()`)
Batch-embedding nodes/chunks during ingestion (`ai.text.embedBatch()`)
Calling LLMs directly in Cypher for completions or GraphRAG (`ai.text.completion()`)
Extracting structured JSON maps from LLM inside Cypher (`ai.text.structuredCompletion()`)
Aggregating LLM summaries over grouped rows (`ai.text.aggregateCompletion()`)
Stateful chat sessions in Cypher (`ai.text.chat()`)
Counting tokens or chunking text by token limit (`ai.text.tokenCount()`, `ai.text.chunkByTokenLimit()`)

When NOT to Use

Python-based GraphRAG pipelines (VectorCypherRetriever, HybridCypherRetriever) → `neo4j-graphrag-skill`
Vector index CREATE / kNN search / SEARCH clause → `neo4j-vector-index-skill`
GDS embeddings (FastRP, Node2Vec) → `neo4j-gds-skill`
Fulltext / keyword search → `neo4j-cypher-skill`

Prerequisites

CYPHER 25 required for all `ai.*` functions. Two ways to enable:

cypher

// Per-query prefix (self-managed, no admin rights needed):
CYPHER 25 MATCH (n:Chunk) ...

// Per-database default (admin; applies to all sessions):
ALTER DATABASE neo4j SET DEFAULT LANGUAGE CYPHER 25

Installation:

Aura: GenAI plugin enabled by default, no action needed
Self-managed JAR: copy plugin JAR to `plugins/` directory
Docker: `--env NEO4J_PLUGINS='["genai"]'`

Provider Config Quick Reference

All `ai.text.*` functions accept a `configuration :: MAP` as last argument.

Provider string	Required keys	Notes
`'openai'`	`token` , `model`	`token` = OpenAI API key
`'azure-openai'`	`token` , `resource` , `model`	`token` = OAuth2 bearer; `resource` = Azure resource name
`'vertexai'`	`model` , `project` , `region` , `token` or `apiKey`	`publisher` defaults to `'google'`
`'bedrock-titan'`	`model` , `region` , `accessKeyId` , `secretAccessKey`	Embedding only
`'bedrock-nova'`	`model` , `region` , `accessKeyId` , `secretAccessKey`	Completion only

Optional for all: `vendorOptions :: MAP` passes provider-specific extras (e.g. `{ dimensions: 1024 }` for OpenAI).

❌ Never hardcode API key literals. ✅ Always use `$param` passed via driver parameters dict.

Full provider config table → references/providers.md
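
The required keys in the table above can be checked on the client before a query is sent, which turns a missing-key error from the provider into an early, readable failure. The following is a minimal Python sketch, assuming the table lists every required key. `REQUIRED_KEYS`, `missing_keys`, `query_parameters`, and the `OPENAI_API_KEY` environment variable name are illustrative and not part of the plugin.

```python
import os

# Required keys per provider, transcribed from the table above.
# 'vertexai' additionally needs one of `token` or `apiKey` (checked separately).
REQUIRED_KEYS = {
    "openai": {"token", "model"},
    "azure-openai": {"token", "resource", "model"},
    "vertexai": {"model", "project", "region"},
    "bedrock-titan": {"model", "region", "accessKeyId", "secretAccessKey"},
    "bedrock-nova": {"model", "region", "accessKeyId", "secretAccessKey"},
}


def missing_keys(provider: str, config: dict) -> list[str]:
    """Return the required keys absent from `config`, sorted for stable output."""
    if provider not in REQUIRED_KEYS:
        # Provider strings are case-sensitive and lowercase, e.g. 'openai'.
        raise ValueError(f"unknown provider string: {provider!r}")
    present = set(config)
    missing = sorted(REQUIRED_KEYS[provider] - present)
    if provider == "vertexai" and not ({"token", "apiKey"} & present):
        missing.append("token or apiKey")
    return missing


def query_parameters(env_var: str = "OPENAI_API_KEY") -> dict:
    """Read the API key from the environment so it never appears in query text."""
    return {"openaiKey": os.environ[env_var]}
```

The dict returned by `query_parameters()` is what would be handed to the driver as query parameters, so the Cypher text only ever contains `$openaiKey`.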

Embedding

Single embed [2025.12]

cypher

CYPHER 25
MATCH (c:Chunk)
WHERE c.embedding IS NULL
// The variable scope clause imports c into the subquery
CALL (c) {
  SET c.embedding = ai.text.embed(c.text, 'openai', {
    token: $openaiKey,
    model: 'text-embedding-3-small'
  })
} IN TRANSACTIONS OF 500 ROWS

`ai.text.embed()` returns `VECTOR`, directly storable and queryable in a vector index.

Batch embed procedure [2025.12]

cypher

CYPHER 25
MATCH (c:Chunk) WHERE c.embedding IS NULL
WITH collect(c) AS chunks
// Pass all texts as one list so the provider is called once per batch, not once per row
CALL ai.text.embedBatch([c IN chunks | c.text], 'openai', { token: $openaiKey, model: 'text-embedding-3-small' })
YIELD index, vector
// index is the position in the input list, so it maps back to the node even when texts repeat
WITH chunks[index] AS c, vector
SET c.embedding = vector

Procedure signature:

CALL ai.text.embedBatch(resources, provider, config) YIELD index, resource, vector

List configured embed providers

cypher

CYPHER 25
CALL ai.text.embed.providers()
YIELD name, requiredConfigType, optionalConfigType, defaultConfig
RETURN name, requiredConfigType

Text Completion [2025.12]

cypher

CYPHER 25
RETURN ai.text.completion(
  'Summarize: ' + $text,
  'openai',
  { token: $openaiKey, model: 'gpt-4o-mini' }
) AS summary

Returns `STRING`.

Aggregate completion — summarize across rows

cypher

CYPHER 25
MATCH (c:Chunk)-[:PART_OF]->(a:Article {id: $articleId})
RETURN ai.text.aggregateCompletion(
  c.text,
  'Summarize the following article chunks in 3 sentences',
  'openai',
  { token: $openaiKey, model: 'gpt-4o-mini' }
) AS summary

The `value` parameter is the per-row STRING fed to the LLM. Non-string values are converted with `toString()`.

Pure-Cypher GraphRAG Pattern [2025.12]

Embed question → vector search → graph traverse → LLM completion — all in one Cypher query:

cypher

CYPHER 25
WITH ai.text.embed($question, 'openai', { token: $openaiKey, model: 'text-embedding-3-small' }) AS qEmbedding
CALL db.index.vector.queryNodes('chunk_embedding', 10, qEmbedding) YIELD node AS chunk, score
MATCH (chunk)<-[:HAS_CHUNK]-(article:Article)
// Related articles within 3 hops; exclude the article itself as a path endpoint
OPTIONAL MATCH path = shortestPath((article)-[*..3]-(other:Article))
WHERE other <> article
WITH chunk, article, collect(DISTINCT other.title) AS related, score
ORDER BY score DESC LIMIT 5
// Carry the related titles into the context so the traversal result reaches the prompt
WITH collect(chunk.text + '\n[Source: ' + article.title + '; related: ' + reduce(r = '', t IN related | r + t + ', ') + ']') AS context, $question AS question
RETURN ai.text.completion(
  'Answer based on context:\n' + reduce(s='', c IN context | s + c + '\n') + '\nQuestion: ' + question,
  'openai',
  { token: $openaiKey, model: 'gpt-4o-mini' }
) AS answer

Key insight (Bergman): shortest path between seed nodes surfaces relationships not visible from direct neighbors alone.

Structured Output [2025.12]

Returns `MAP`. Its fields can be stored as node properties or used downstream in Cypher.

cypher

CYPHER 25
MATCH (p:Product {id: $productId})
WITH p,
  ai.text.structuredCompletion(
    'Extract key attributes from: ' + p.description,
    {
      type: 'object',
      properties: {
        category: { type: 'string' },
        tags: { type: 'array', items: { type: 'string' } },
        priceRange: { type: 'string', enum: ['budget', 'mid', 'premium'] }
      },
      required: ['category', 'tags', 'priceRange'],
      additionalProperties: false
    },
    'openai',
    { token: $openaiKey, model: 'gpt-4o-mini' }
  ) AS extracted
SET p.category = extracted.category,
    p.priceRange = extracted.priceRange
WITH p, extracted.tags AS tags
UNWIND tags AS tag
MERGE (t:Tag {name: tag})
MERGE (p)-[:TAGGED]->(t)
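
When the extracted map is read back through a driver, a small defensive check can confirm it matches the schema before anything downstream relies on it. The following is a minimal Python sketch covering only `required`, `additionalProperties: false`, and `enum`. It is not a full JSON Schema validator, and `schema_violations` is a hypothetical helper name.

```python
# Same schema as in the Cypher example above, expressed as a Python dict.
SCHEMA = {
    "type": "object",
    "properties": {
        "category": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
        "priceRange": {"type": "string", "enum": ["budget", "mid", "premium"]},
    },
    "required": ["category", "tags", "priceRange"],
    "additionalProperties": False,
}


def schema_violations(value: dict, schema: dict) -> list[str]:
    """List the ways `value` departs from `schema`; an empty list means it passes."""
    problems = []
    properties = schema.get("properties", {})
    # 1. Every required key must be present.
    for key in schema.get("required", []):
        if key not in value:
            problems.append(f"missing required key: {key}")
    # 2. With additionalProperties: false, undeclared keys are rejected.
    if schema.get("additionalProperties") is False:
        for key in value:
            if key not in properties:
                problems.append(f"unexpected key: {key}")
    # 3. Enum-constrained fields must hold one of the listed values.
    for key, spec in properties.items():
        if key in value and "enum" in spec and value[key] not in spec["enum"]:
            problems.append(f"{key} not in enum: {value[key]!r}")
    return problems
```

A non-empty result is a reasonable signal to skip the `SET` for that node and log the offending map instead.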

Aggregate structured completion — extract across multiple rows

cypher

CYPHER 25
MATCH (:User {id: $userId})-[:ORDERED]->(o:Order)-[:CONTAINS]->(p:Product)
RETURN ai.text.aggregateStructuredCompletion(
  p.name + ': ' + p.category,
  'Build a shopping profile for this user',
  {
    type: 'object',
    properties: {
      preferredCategories: { type: 'array', items: { type: 'string' } },
      spendingTier: { type: 'string', enum: ['economy', 'standard', 'premium'] }
    },
    required: ['preferredCategories', 'spendingTier'],
    additionalProperties: false
  },
  'openai',
  { token: $openaiKey, model: 'gpt-4o-mini' }
) AS profile

Chat [2025.12]

Supported providers: openai and azure-openai only.

cypher

// Start new conversation (chatId = null → new session)
CYPHER 25
WITH ai.text.chat(
  'Hello, who are you?',
  null,
  'openai',
  { token: $openaiKey, model: 'gpt-4o-mini' }
) AS result
RETURN result.message AS reply, result.chatId AS sessionId

// Continue conversation (pass returned chatId)
CYPHER 25
WITH ai.text.chat(
  'What did I just ask you?',
  $chatId,
  'openai',
  { token: $openaiKey, model: 'gpt-4o-mini' }
) AS result
RETURN result.message AS reply, result.chatId AS sessionId

Returns `MAP { message: STRING, chatId: STRING }`. Store `chatId` to continue session.
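
On the client side, continuing a session comes down to feeding the `chatId` from one turn into the next. The following is a minimal Python sketch, assuming a driver object that exposes the Neo4j Python driver's `execute_query(query, parameters_=...)` method. The `ChatSession` class and the `$prompt` parameter name are illustrative and not part of the plugin.

```python
CHAT_QUERY = """
CYPHER 25
WITH ai.text.chat($prompt, $chatId, 'openai',
                  { token: $openaiKey, model: 'gpt-4o-mini' }) AS result
RETURN result.message AS reply, result.chatId AS sessionId
"""


class ChatSession:
    """Carries the chatId returned by one turn into the next turn."""

    def __init__(self, driver, api_key: str):
        self._driver = driver   # anything with a neo4j-style execute_query()
        self._api_key = api_key
        self.chat_id = None     # None is sent as Cypher null, i.e. a new session

    def ask(self, prompt: str) -> str:
        result = self._driver.execute_query(
            CHAT_QUERY,
            parameters_={
                "prompt": prompt,
                "chatId": self.chat_id,
                "openaiKey": self._api_key,  # passed as a parameter, never inlined
            },
        )
        record = result.records[0]
        self.chat_id = record["sessionId"]  # remember the session for the next turn
        return record["reply"]
```

Calling `ask()` twice on the same `ChatSession` sends `null` on the first turn and the returned `chatId` on the second, matching the two queries above.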

Tokenization & Chunking [2025.12]

cypher

// Count tokens before sending to LLM
CYPHER 25
RETURN ai.text.tokenCount($text, 'openai', { token: $openaiKey, model: 'gpt-4o-mini' }) AS tokenCount

// Chunk text by token limit (no external dependencies)
CYPHER 25
UNWIND ai.text.chunkByTokenLimit($longText, 512, 'gpt-4', 50) AS chunk
MERGE (c:Chunk { text: chunk })

`ai.text.chunkByTokenLimit(input, limit, model='gpt-4', overlap=0)`: `model` controls the tokenizer; `overlap` = tokens of overlap between chunks.
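
To make the interplay of `limit` and `overlap` concrete, here is a minimal Python sketch of sliding-window chunking over an already tokenized list. It illustrates the idea only and is not the plugin's implementation: the real function tokenizes with the model's tokenizer and may treat boundaries differently, and `chunk_tokens` is a hypothetical name.

```python
def chunk_tokens(tokens: list[str], limit: int, overlap: int = 0) -> list[list[str]]:
    """Split `tokens` into windows of at most `limit`, sharing `overlap` tokens."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    if not 0 <= overlap < limit:
        raise ValueError("overlap must satisfy 0 <= overlap < limit")
    step = limit - overlap          # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + limit])
        if start + limit >= len(tokens):
            break                   # this window already reached the end
    return chunks
```

With 10 tokens, `limit=4`, and `overlap=1`, each window advances by 3 tokens, giving three chunks whose neighbours share one token.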

Write Gate

`SET node.embedding = ai.text.embed(...)` and `SET node.* = ai.text.structuredCompletion(...)` write to the graph.

Before bulk writes:

Count nodes first: `MATCH (c:Chunk) WHERE c.embedding IS NULL RETURN count(c)`
Verify config with one test node before batch
Use `CALL { ... } IN TRANSACTIONS OF 500 ROWS` for batches > 1000 nodes
Require explicit confirmation before executing
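
The gate above can be sketched as a small planning step that runs between the count query and the write. The following is a minimal Python sketch, assuming the node count has already been fetched. `WritePlan`, `plan_bulk_write`, and `confirmed` are hypothetical names, and the 500-row and 1000-node figures are the ones stated in this section.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class WritePlan:
    node_count: int
    use_transactions: bool  # wrap the write in CALL { ... } IN TRANSACTIONS
    batches: int            # number of inner transactions the write will need


def plan_bulk_write(node_count: int, batch_size: int = 500, threshold: int = 1000) -> WritePlan:
    """Decide how a bulk write should run, given the count of target nodes."""
    if node_count < 0 or batch_size <= 0:
        raise ValueError("node_count must be >= 0 and batch_size > 0")
    use_transactions = node_count > threshold
    if use_transactions:
        batches = math.ceil(node_count / batch_size)
    else:
        batches = 1 if node_count else 0  # a single transaction, or nothing to do
    return WritePlan(node_count, use_transactions, batches)


def confirmed(plan: WritePlan, answer: str) -> bool:
    """Explicit confirmation: only a literal 'yes' on a non-empty plan proceeds."""
    return plan.node_count > 0 and answer.strip().lower() == "yes"
```

For 2400 pending nodes the plan calls for batched transactions in 5 batches. For 800 nodes a single transaction is enough.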

Deprecated — Do NOT Use

Old function	Replacement
`genai.vector.encode()` [deprecated]	`ai.text.embed()`
`genai.vector.encodeBatch()` [deprecated]	`CALL ai.text.embedBatch()`
`genai.vector.listEncodingProviders()` [deprecated]	`CALL ai.text.embed.providers()`

Common Errors

Error	Cause	Fix
`Unknown function 'ai.text.embed'`	Missing CYPHER 25 prefix OR plugin not installed	Add `CYPHER 25` prefix; verify plugin installed
`Cypher version not supported`	Using `CYPHER 25` on Neo4j older than 2025.06 or missing plugin	Upgrade Neo4j; ensure GenAI plugin loaded
`Configuration key 'token' missing`	Provider config map incomplete	Check required keys for provider (see table above)
`null` returned from embed	Wrong model name or provider auth failed	Test with `RETURN ai.text.embed('test', 'openai', {token:$k, model:'text-embedding-3-small'})` standalone
`Unsupported provider`	Provider string typo (case-sensitive, lowercase)	Use `'openai'` not `'OpenAI'` ; run `CALL ai.text.embed.providers()`
`ai.text.chat` fails on VertexAI	Chat only supported on openai/azure-openai	Switch to openai/azure-openai for chat

Checklist

`CYPHER 25` prefix present on every ai.text.* query
GenAI plugin installed (Aura: automatic; self-managed: JAR in plugins/)
API key passed as `$param`, never as literal string
`model` key explicit in config (no silent defaults)
Provider string lowercase (`'openai'`, `'vertexai'`, `'bedrock-titan'`)
Bulk writes use `IN TRANSACTIONS OF 500 ROWS`; count target nodes first
`genai.vector.encode()` replaced with `ai.text.embed()` [2025.12]
Chat sessions: store returned `chatId` for continuation; only openai/azure-openai supported
Structured output schema uses `additionalProperties: false` to prevent hallucination keys

References

Full provider config — all required/optional keys per provider
Official docs
API reference
