neo4j-snowflake-graph-analytics-skill
Snowflake Native App that brings graph algorithm power inside Snowflake. Data stays in Snowflake; project it into a graph, run algorithms via SQL `CALL`, and results are written back to Snowflake tables.
When to Use
- Running graph algorithms / GDS in Snowflake
- Data in Snowflake tables
- On-demand / pipeline workloads — ephemeral sessions, pay per session-minute
- Full isolation from the live database during analytics
When NOT to Use
- Aura Pro with embedded GDS plugin → `neo4j-gds-skill`
- Aura Graph Analytics → `neo4j-aura-graph-analytics-skill`
- Self-managed Neo4j with embedded GDS plugin → `neo4j-gds-skill`
- Writing Cypher queries → `neo4j-cypher-skill`
Key Concepts
Project → Compute → Write
Every algorithm run follows three steps:
- Project — specify node/relationship tables; app builds in-memory graph
- Compute — run algorithm with config parameters
- Write — results written back to a Snowflake table
Required Table Columns
| Table type | Required columns | Optional columns |
|---|---|---|
| Node table | `nodeId` | Any additional columns become node properties |
| Relationship table | `sourceNodeId`, `targetNodeId` | Any additional columns become relationship properties |
If your tables use different column names, create a view that aliases them to `nodeId`, `sourceNodeId`, and `targetNodeId`.
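Before projecting, it can help to confirm that a table or view actually exposes the required columns. The query below is a minimal sketch against Snowflake's `INFORMATION_SCHEMA.COLUMNS`. `MY_DATABASE`, `MY_SCHEMA`, `NODES_VIEW`, and `RELS_VIEW` are placeholder names, and the comparison is upper-cased on the assumption that the columns were created as unquoted identifiers, which Snowflake stores in upper case.

```sql
-- Placeholder names: MY_DATABASE, MY_SCHEMA, NODES_VIEW, RELS_VIEW.
-- Lists which of the required columns each table or view exposes.
SELECT table_name, column_name, data_type
FROM MY_DATABASE.INFORMATION_SCHEMA.COLUMNS
WHERE table_schema = 'MY_SCHEMA'
  AND table_name IN ('NODES_VIEW', 'RELS_VIEW')
  AND UPPER(column_name) IN ('NODEID', 'SOURCENODEID', 'TARGETNODEID')
ORDER BY table_name, column_name;
```

A node table should return one row and a relationship table two; anything fewer points to a column that still needs aliasing in a view.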
Graph Orientation
When projecting relationships, you can set `orientation`:
- `NATURAL` (default): directed, source → target
- `UNDIRECTED`: treated as bidirectional
- `REVERSE`: direction flipped
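One way to picture the three settings is as plain SQL over a relationship table. This only illustrates the semantics described above and is not a description of how the app implements them. `MY_SCHEMA.RELS_VIEW` is a placeholder relationship table with the required columns.

```sql
-- NATURAL: rows are used as stored, source -> target.
SELECT sourceNodeId, targetNodeId
FROM MY_SCHEMA.RELS_VIEW;

-- REVERSE: the same rows with the two endpoints swapped.
SELECT targetNodeId AS sourceNodeId, sourceNodeId AS targetNodeId
FROM MY_SCHEMA.RELS_VIEW;

-- UNDIRECTED: every row can be traversed in both directions.
SELECT sourceNodeId, targetNodeId
FROM MY_SCHEMA.RELS_VIEW
UNION ALL
SELECT targetNodeId, sourceNodeId
FROM MY_SCHEMA.RELS_VIEW;
```

In practice you would set `orientation` in the projection config rather than build such views.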
Installation
- Go to the Snowflake Marketplace
- Install Neo4j Graph Analytics (default app name: `Neo4j_Graph_Analytics`)
- During install, enable Event sharing when prompted
- After install, go to Data Products → Apps → Neo4j Graph Analytics → Privileges → Grant
- Grant `CREATE COMPUTE POOL` and `CREATE WAREHOUSE` privileges, then click Activate
Privilege Setup (run once per database/schema)
```sql
-- Step 1: Use ACCOUNTADMIN to set up roles and grants
USE ROLE ACCOUNTADMIN;
-- Create a consumer role for users of the application
CREATE ROLE IF NOT EXISTS MY_CONSUMER_ROLE;
GRANT APPLICATION ROLE Neo4j_Graph_Analytics.app_user TO ROLE MY_CONSUMER_ROLE;
SET MY_USER = (SELECT CURRENT_USER());
GRANT ROLE MY_CONSUMER_ROLE TO USER IDENTIFIER($MY_USER);

-- Step 2: Create a database role and grant it to the app
USE DATABASE MY_DATABASE;
CREATE DATABASE ROLE IF NOT EXISTS MY_DB_ROLE;
GRANT USAGE ON DATABASE MY_DATABASE TO DATABASE ROLE MY_DB_ROLE;
GRANT USAGE ON SCHEMA MY_DATABASE.MY_SCHEMA TO DATABASE ROLE MY_DB_ROLE;
GRANT SELECT ON ALL TABLES IN SCHEMA MY_DATABASE.MY_SCHEMA TO DATABASE ROLE MY_DB_ROLE;
GRANT SELECT ON ALL VIEWS IN SCHEMA MY_DATABASE.MY_SCHEMA TO DATABASE ROLE MY_DB_ROLE;
GRANT SELECT ON FUTURE TABLES IN SCHEMA MY_DATABASE.MY_SCHEMA TO DATABASE ROLE MY_DB_ROLE;
GRANT SELECT ON FUTURE VIEWS IN SCHEMA MY_DATABASE.MY_SCHEMA TO DATABASE ROLE MY_DB_ROLE;
GRANT CREATE TABLE ON SCHEMA MY_DATABASE.MY_SCHEMA TO DATABASE ROLE MY_DB_ROLE;
GRANT DATABASE ROLE MY_DB_ROLE TO APPLICATION Neo4j_Graph_Analytics;

-- Step 3: Grant the consumer role access to output tables
GRANT USAGE ON DATABASE MY_DATABASE TO ROLE MY_CONSUMER_ROLE;
GRANT USAGE ON SCHEMA MY_DATABASE.MY_SCHEMA TO ROLE MY_CONSUMER_ROLE;
GRANT SELECT ON FUTURE TABLES IN SCHEMA MY_DATABASE.MY_SCHEMA TO ROLE MY_CONSUMER_ROLE;

-- Step 4: Switch to the consumer role to run algorithms
USE ROLE MY_CONSUMER_ROLE;
```
Replace `MY_DATABASE`, `MY_SCHEMA`, `MY_CONSUMER_ROLE`, and `MY_DB_ROLE` with your actual names throughout. The examples that follow use `P2P`, `PUBLIC`, and `GRAPH_USER_ROLE` for the database, schema, and consumer role.
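After running the setup, the grants can be inspected with standard Snowflake `SHOW` commands. This is a small sketch that reuses the placeholder names from the setup script; it changes nothing and only lists what each role and the application currently hold.

```sql
-- Run as a role that is allowed to see these grants, e.g. ACCOUNTADMIN.
USE ROLE ACCOUNTADMIN;

-- What the consumer role can do (expect the app_user application role and schema usage).
SHOW GRANTS TO ROLE MY_CONSUMER_ROLE;

-- What the database role can do (expect SELECT and CREATE TABLE on the schema).
SHOW GRANTS TO DATABASE ROLE MY_DATABASE.MY_DB_ROLE;

-- What has been granted to the application itself (expect the database role).
SHOW GRANTS TO APPLICATION Neo4j_Graph_Analytics;

-- Future grants defined on the schema for tables and views.
SHOW FUTURE GRANTS IN SCHEMA MY_DATABASE.MY_SCHEMA;
```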
Running an Algorithm — Full Example
```sql
-- Optional: set default database to avoid fully-qualified names
USE DATABASE Neo4j_Graph_Analytics;
USE ROLE GRAPH_USER_ROLE;

-- Call WCC (Weakly Connected Components)
-- nodeLabel is the node table name without the schema prefix (here USER_VW)
CALL Neo4j_Graph_Analytics.graph.wcc('CPU_X64_XS', {
    'defaultTablePrefix': 'P2P.PUBLIC',
    'project': {
        'nodeTables': ['USER_VW'],
        'relationshipTables': {
            'AGG_TRANSACTIONS_VW': {
                'sourceTable': 'P2P.PUBLIC.USER_VW',
                'targetTable': 'P2P.PUBLIC.USER_VW',
                'orientation': 'NATURAL'
            }
        }
    },
    'compute': { 'consecutiveIds': true },
    'write': [{
        'nodeLabel': 'USER_VW',
        'outputTable': 'USER_COMPONENTS'
    }]
});

-- Inspect results
SELECT * FROM P2P.PUBLIC.USER_COMPONENTS;
```
First argument is the compute pool size:
| Pool | Use |
|---|---|
| | Dev / small graphs |
| | Progressively larger |
| | Large graphs, lower CPU need |
| | Compute-intensive (GraphSAGE); GPU not available in all regions |
See Estimating Jobs to choose size.
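Once WCC has written its output, a common next step is to look at component sizes rather than raw rows. The query below is a sketch that assumes the output table exposes a `COMPONENT` column alongside the node id. That column name is an assumption, so check the `SELECT *` output first and adjust if it differs.

```sql
-- Assumes P2P.PUBLIC.USER_COMPONENTS has a COMPONENT column (verify with SELECT * first).
-- Shows the ten largest components and how many nodes each contains.
SELECT component, COUNT(*) AS member_count
FROM P2P.PUBLIC.USER_COMPONENTS
GROUP BY component
ORDER BY member_count DESC
LIMIT 10;
```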
Available Algorithms
Community Detection
| Algorithm | Procedure | Use case |
|---|---|---|
| Weakly Connected Components | | Find disconnected subgraphs |
| Louvain | | Community detection, modularity optimisation |
| Leiden | | Improved community detection (more stable than Louvain) |
| K-Means Clustering | | Cluster nodes by node properties |
| Triangle Count | | Measure local clustering / detect dense subgraphs |
Centrality
| Algorithm | Procedure | Use case |
|---|---|---|
| PageRank | | Rank nodes by influence |
| Article Rank | | PageRank variant, discounts high-degree neighbours |
| Betweenness Centrality | | Find bridge nodes in a network |
| Degree Centrality | | Count direct connections per node |
Pathfinding
| Algorithm | Procedure | Use case |
|---|---|---|
| Dijkstra Source-Target | | Shortest path between two nodes |
| Dijkstra Single-Source | | Shortest paths from one node to all others |
| Delta-Stepping SSSP | | Faster parallel shortest paths |
| Breadth First Search | | BFS traversal from a source node |
| Yen's K-Shortest Paths | | Top-K shortest paths between two nodes |
| Max Flow | | Maximum flow through a network |
| FastPath | | Fast approximate shortest paths |
Similarity
| Algorithm | Procedure | Use case |
|---|---|---|
| Node Similarity | | Find similar nodes based on shared neighbours |
| Filtered Node Similarity | | Node similarity with source/target filters |
| K-Nearest Neighbors | | Find K most similar nodes |
| Filtered KNN | | KNN with source/target filters |
Node Embeddings / ML
| Algorithm | Procedure | Use case |
|---|---|---|
| Fast Random Projection (FastRP) | | Fast node embeddings |
| Node2Vec | | Random-walk-based node embeddings |
| HashGNN | | GNN-inspired embeddings without training |
| GraphSAGE (train) | | Train inductive node embeddings |
| GraphSAGE (predict) | | Predict with a trained GraphSAGE model |
| Node Classification (train) | | Supervised node label prediction |
| Node Classification (predict) | | Apply trained node classifier |
Projection Configuration Reference
```json
{
  "project": {
    "nodeTables": [
      "DB.SCHEMA.TABLE_A",
      "DB.SCHEMA.TABLE_B"
    ],
    "relationshipTables": {
      "DB.SCHEMA.REL_TABLE": {
        "sourceTable": "DB.SCHEMA.TABLE_A",
        "targetTable": "DB.SCHEMA.TABLE_B",
        "orientation": "NATURAL"
      }
    }
  }
}
```
- `defaultTablePrefix`: use when all tables are in the same schema
- Multiple node/relationship tables are supported; each maps to a different label/type
- Extra columns become node/relationship properties (e.g. a `weight` column for weighted paths)
Write Configuration Reference
```json
{
  "write": [
    {
      "nodeLabel": "TABLE_A",
      "outputTable": "DB.SCHEMA.OUTPUT_TABLE",
      "nodeProperty": "score"
    }
  ]
}
```
- `nodeLabel`: node table name without schema prefix
- `outputTable`: created or overwritten
- `nodeProperty` (optional): which computed property to write if the algorithm produces multiple

For relationship results (KNN, Node Similarity):
```json
{
  "write": [
    {
      "relationshipType": "SIMILAR",
      "outputTable": "DB.SCHEMA.SIMILARITY_OUTPUT"
    }
  ]
}
```
Common Patterns
Chaining Algorithms
Results are written to tables, so one algorithm's output can feed the next. The `FUTURE TABLES` grant (done in setup) lets the app read tables it just created.
```sql
-- Step 1: Run FastRP to generate embeddings
CALL Neo4j_Graph_Analytics.graph.fastrp('CPU_X64_XS', { ... });
-- Step 2: Run KNN on the embedding output
CALL Neo4j_Graph_Analytics.graph.knn('CPU_X64_XS', { ... });
```
Using Views Instead of Renaming Columns
Create views with required column names and supported data types. Convert categorical data to numerical scores.
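The renaming views shown next cover the first sentence. For the second, a categorical column can be mapped to a number inside the view itself. The sketch below assumes a hypothetical `account_tier` column on `MY_SCHEMA.USERS` with the values `'basic'`, `'plus'`, and `'premium'`; the column, its values, the view name, and the scores are all illustrative.

```sql
-- Hypothetical column: account_tier ('basic' | 'plus' | 'premium').
-- Exposes the required nodeId plus numeric properties only.
CREATE OR REPLACE VIEW MY_SCHEMA.NODES_SCORED_VIEW AS
SELECT
    user_id AS nodeId,
    age,
    CASE account_tier
        WHEN 'basic'   THEN 1
        WHEN 'plus'    THEN 2
        WHEN 'premium' THEN 3
        ELSE 0  -- unknown or NULL tier
    END AS tier_score
FROM MY_SCHEMA.USERS;
```

An ordinal mapping like this only makes sense when the categories have a natural order; for unordered categories a separate 0/1 column per category is usually the safer choice.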
```sql
CREATE VIEW MY_SCHEMA.NODES_VIEW AS
SELECT user_id AS nodeId, name, age
FROM MY_SCHEMA.USERS;

CREATE VIEW MY_SCHEMA.RELS_VIEW AS
SELECT from_user AS sourceNodeId, to_user AS targetNodeId, weight
FROM MY_SCHEMA.CONNECTIONS;
```
Troubleshooting
| Problem | Solution |
|---|---|
| | Check the app has the required privileges on the table (see Privilege Setup) |
| | Your table is missing the required `nodeId` column; create a view that aliases it |
| | The pool may still be starting up; wait a minute and retry |
| Algorithm returns no results | Check your node/relationship tables are not empty and projections are correct |
Full troubleshooting guide: https://neo4j.com/docs/snowflake-graph-analytics/current/troubleshooting/
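For the "no results" case, two quick checks in plain SQL often narrow things down: whether the projected objects contain rows at all, and whether relationship endpoints actually match ids in the node table. This is a sketch using the placeholder views `MY_SCHEMA.NODES_VIEW` and `MY_SCHEMA.RELS_VIEW`; substitute your own node and relationship tables.

```sql
-- Row counts for the projected objects.
SELECT 'nodes' AS object_name, COUNT(*) AS row_count
FROM MY_SCHEMA.NODES_VIEW
UNION ALL
SELECT 'relationships', COUNT(*)
FROM MY_SCHEMA.RELS_VIEW;

-- Relationships whose source or target id does not appear in the node table.
SELECT r.sourceNodeId, r.targetNodeId
FROM MY_SCHEMA.RELS_VIEW r
LEFT JOIN MY_SCHEMA.NODES_VIEW s ON s.nodeId = r.sourceNodeId
LEFT JOIN MY_SCHEMA.NODES_VIEW t ON t.nodeId = r.targetNodeId
WHERE s.nodeId IS NULL OR t.nodeId IS NULL
LIMIT 20;
```

Rows returned by the second query are worth reviewing, since they reference nodes the projection will not know about.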