PyGraphistry GFQL
Doc routing (local + canonical)
- First route with `../pygraphistry/references/pygraphistry-readthedocs-toc.md`.
- Use `../pygraphistry/references/pygraphistry-readthedocs-top-level.tsv` for section-level shortcuts.
- Only scan `../pygraphistry/references/pygraphistry-readthedocs-sitemap.xml` when a needed page is missing.
- Use one batched discovery read before deep-page reads; avoid serial micro-reads.
- In user-facing answers, prefer canonical `https://pygraphistry.readthedocs.io/en/latest/...` links.
Two syntaxes, one entrypoint
`g.gfql()` accepts both chain-list (Python AST objects) and Cypher strings. It auto-detects the language from the argument type:
    from graphistry import n, e_forward

    # Chain-list syntax (Python AST objects)
    g2 = g.gfql([n({'type': 'person'}), e_forward(), n()])

    # Cypher string syntax (auto-detected)
    g2 = g.gfql("MATCH (p:Person)-[r:KNOWS]->(q:Person) RETURN p.name, q.name")

    # Explicit language parameter (optional)
    g2 = g.gfql(query_string, language="cypher")
**When to use which:**
- **Chain-list**: Programmatic composition, dynamic parameterization, when building queries from code
- **Cypher**: Readability, familiarity for Cypher users, complex pattern matching with RETURN/ORDER BY/LIMIT
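The auto-detection described above can be pictured as a dispatch on the Python type of the argument. The sketch below is a toy illustration, not PyGraphistry's implementation: `detect_gfql_language` is a hypothetical helper, and other accepted inputs such as `let(...)` objects are not modeled.

```python
from typing import Any


def detect_gfql_language(query: Any) -> str:
    """Hypothetical helper: classify a query by its Python type."""
    if isinstance(query, str):
        # Strings are treated as Cypher text.
        return "cypher"
    if isinstance(query, (list, tuple)):
        # Sequences are treated as chain-lists of AST objects.
        return "chain"
    raise TypeError(f"Unsupported query type: {type(query).__name__}")


cypher_kind = detect_gfql_language("MATCH (n) RETURN n")
chain_kind = detect_gfql_language([{"op": "n"}, {"op": "e_forward"}])
```

Passing `language="cypher"` explicitly, as in the snippet above, sidesteps any detection step when the argument type alone is ambiguous in your own wrapper code.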
Quick start — chain-list
    from graphistry import n, e_forward

    g2 = g.gfql([
        n({'type': 'person'}),
        e_forward({'relation': 'transfers_to'}, min_hops=1, max_hops=3),
        n({'risk': True})
    ])
Quick start — Cypher
    # Simple pattern match
    g2 = g.gfql("MATCH (p:Person)-[r:KNOWS]->(q:Person) WHERE p.age > 30 RETURN p.name, q.name")

    # Variable-length paths
    g2 = g.gfql("MATCH (a:Account)-[*1..3]->(m:Merchant) RETURN a, m")

    # Parameterized queries
    g2 = g.gfql(
        "MATCH (n) WHERE n.score > $cutoff RETURN n.id, n.score ORDER BY n.score DESC LIMIT $top_n",
        params={"cutoff": 50, "top_n": 10}
    )

    # Relationship type alternation
    g2 = g.gfql("MATCH (a:Person)-[:KNOWS|COLLABORATES_WITH]->(b:Person) RETURN a.name, b.name")
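For parameterized queries, a cheap pre-check is to confirm that every `$name` placeholder has a matching key in `params` before calling `g.gfql()`. The sketch below is a standalone assumption-level helper (`missing_params` is hypothetical, not part of PyGraphistry), and its regex also matches `$name` text inside string literals, so treat it as a conservative check.

```python
import re

# Matches $identifier placeholders such as $cutoff or $top_n.
PLACEHOLDER = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def missing_params(query: str, params: dict) -> set:
    """Hypothetical helper: return placeholder names with no entry in params."""
    names = set(PLACEHOLDER.findall(query))
    return names - set(params)


query = (
    "MATCH (n) WHERE n.score > $cutoff "
    "RETURN n.id, n.score ORDER BY n.score DESC LIMIT $top_n"
)
complete = missing_params(query, {"cutoff": 50, "top_n": 10})
incomplete = missing_params(query, {"cutoff": 50})
```

An empty result means every placeholder is covered; a non-empty set names the keys still to be supplied.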
Cypher node labels and DataFrame columns
GFQL Cypher maps `:Label` syntax to boolean `label__<Label>` columns, not string columns.
Prefer property filters (simpler, works with any column):
    # Recommended: property filter (works with any string/numeric column)
    g2 = g.gfql("MATCH (p) WHERE p.type = 'Person' AND p.age > 30 RETURN p.name")

    # Alternative: pre-create boolean label columns for Cypher :Label syntax
    nodes['label__Person'] = nodes['type'] == 'Person'
    g = graphistry.edges(edges, 'src', 'dst').nodes(nodes, 'id')
    g2 = g.gfql("MATCH (p:Person) WHERE p.age > 30 RETURN p.name")
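When several labels are needed, the pre-creation step above generalizes to one boolean flag per distinct type value. The sketch below shows that mapping on plain dict records so it runs without any dependencies; `add_label_flags` is a hypothetical helper, and with a pandas DataFrame the equivalent would be a loop assigning `nodes[f'label__{t}'] = nodes['type'] == t`.

```python
def add_label_flags(rows, type_key="type", prefix="label__"):
    """Hypothetical helper: add one boolean label flag per distinct type value."""
    labels = sorted({row[type_key] for row in rows})
    flagged_rows = []
    for row in rows:
        flagged = dict(row)  # copy so the input records are not mutated
        for label in labels:
            flagged[f"{prefix}{label}"] = row[type_key] == label
        flagged_rows.append(flagged)
    return flagged_rows


nodes_records = [
    {"id": "a", "type": "Person"},
    {"id": "b", "type": "Account"},
]
flagged_records = add_label_flags(nodes_records)
```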
Supported Cypher clauses
- Full: MATCH, WHERE, RETURN, WITH, ORDER BY, SKIP, LIMIT, DISTINCT, CALL graphistry.*, GRAPH {}, USE
- Partial: OPTIONAL MATCH (bounded subset), UNWIND (top-level), UNION/UNION ALL (direct g.gfql() only)
- Not supported: CREATE, MERGE, DELETE, SET, REMOVE (GFQL is read-only)
- Scalar: labels(), type(), keys(), properties(), abs(), sqrt(), coalesce(), substring(), tointeger(), tofloat(), toboolean(), tostring()
- Aggregation: count(), sum(), min(), max(), avg(), collect(), count(DISTINCT ...)
- Operators: =, <>, <, <=, >, >=, IN, STARTS WITH, ENDS WITH, CONTAINS, IS NULL, IS NOT NULL, AND, OR, NOT
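Because write clauses are listed above as unsupported, a query string can be screened for them before it is sent to `g.gfql()`. The following is a rough sketch under stated assumptions: `find_write_clauses` is a hypothetical helper, it only blanks out quoted string literals, and it will also flag a variable or property that happens to be named like a clause (for example `n.set`).

```python
import re

WRITE_CLAUSES = ("CREATE", "MERGE", "DELETE", "SET", "REMOVE")
_WRITE_RE = re.compile(r"\b(" + "|".join(WRITE_CLAUSES) + r")\b", re.IGNORECASE)
# Single- or double-quoted literals, allowing backslash escapes.
_STRING_LITERAL_RE = re.compile(r"'(?:[^'\\]|\\.)*'|\"(?:[^\"\\]|\\.)*\"")


def find_write_clauses(query: str) -> list:
    """Hypothetical helper: list write clauses found outside string literals."""
    without_literals = _STRING_LITERAL_RE.sub("''", query)
    found = {m.group(1).upper() for m in _WRITE_RE.finditer(without_literals)}
    return sorted(found)


read_only = find_write_clauses("MATCH (n) WHERE n.name = 'create' RETURN n")
has_write = find_write_clauses("MATCH (n) SET n.flag = true RETURN n")
```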
GRAPH constructor (Cypher extension)
    # Extract subgraph as a graph object (not a table)
    subgraph = g.gfql("GRAPH { MATCH (a)-[r]->(b) WHERE a.risk_score > 7 }")

    # Multi-stage pipeline with named GRAPH bindings and USE
    result = g.gfql("""
    GRAPH g1 = GRAPH { MATCH (a)-[r]->(b) WHERE a.event_count > 100 }
    GRAPH g2 = GRAPH { USE g1 CALL graphistry.degree.write() }
    USE g2 MATCH (n) RETURN n.id, n.degree ORDER BY n.degree DESC LIMIT 10
    """)
Let/DAG bindings
    from graphistry import n, e_forward, let, ref

    # Named bindings forming a DAG
    result = g.gfql(let({
        'high_risk': n({'risk_score': {'$gt': 0.8}}),
        'neighborhoods': ref('high_risk', [e_forward(max_hops=2), n()])
    }))

    # Select specific binding output
    result = g.gfql(let({...}), output='neighborhoods')

    # Multi-stage DAG: sequential refs build on each other
    result = g.gfql(let({
        'people': n({'type': 'person'}),
        'contacts': ref('people', [e_forward({'rel': 'contacts'}), n()]),
        'owned': ref('contacts', [e_forward({'rel': 'owns'}), n()])
    }), output='owned')

    # Nested let: inner DAGs execute as opaque units for parallel-friendly pipelines
    result = g.gfql(let({
        'social': let({
            'people': n({'type': 'person'}),
            'friends': ref('people', [e_forward({'rel': 'knows'}), n()]),
        }),
        'infra': let({
            'servers': n({'type': 'server'}),
            'traffic': ref('servers', [e_forward({'rel': 'serves'}), n()]),
        }),
        'combined': ref('social', [e_forward(), n()])
    }), output='combined')

    # Let + degree computation + visual encoding
    from graphistry import n, e_forward, let, ref, call
    result = g.gfql(let({
        'seeds': n({'risk_flag': True}),
        'neighborhood': ref('seeds', [e_forward(max_hops=2), n()]),
    }))

    # Then compute degrees and encode color
    result = result.get_degrees().encode_point_color('degree', as_continuous=True)
- **Independent bindings** operate on the root graph
- **ref()** bindings operate on the referenced binding's output
- **Nested let** scope rules (requires pygraphistry >= 0.53.7):
  - Inner bindings do NOT leak to outer scope
  - Inner bindings CAN read outer bindings (lexical closure)
  - Sibling nested lets may reuse names without collision
  - Each nested let is an opaque execution unit (parallel-friendly)
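The scope rules above can be modeled with a small toy evaluator. This is a sketch for intuition only, not the GFQL executor: bindings are plain strings, a `("ref", target, step)` tuple stands in for `ref()`, and the assumption that a nested let evaluates to its last binding is made up for the example.

```python
from collections import ChainMap


def run_let(bindings, outer=None):
    """Toy model of let scoping; returns only the bindings local to this let."""
    # A fresh local layer sits on top of the enclosing scope (lexical closure).
    scope = ChainMap({}) if outer is None else outer.new_child()
    for name, value in bindings.items():
        if isinstance(value, dict):
            # Nested let: evaluated as a unit; only its result is bound here.
            inner = run_let(value, scope)
            scope[name] = list(inner.values())[-1]
        elif isinstance(value, tuple):
            # ("ref", target, step): lookup checks local names, then outer ones.
            _, target, step = value
            scope[name] = f"{scope[target]}>{step}"
        else:
            scope[name] = value
    return dict(scope.maps[0])


toy_result = run_let({
    "root_seed": "seeds",
    "social": {
        "people": "persons",
        "friends": ("ref", "people", "knows"),
    },
    "infra": {
        # Same name as in 'social' without collision; reads an outer binding.
        "people": ("ref", "root_seed", "serves"),
    },
})
```

In this model the outer result holds `root_seed`, `social`, and `infra` only; the inner `people` and `friends` names never appear at the outer level.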
Targeted patterns (high signal)
    from graphistry import n, e_forward, col, compare

    # Edge query filtering
    g2 = g.gfql([n(), e_forward(edge_query="type == 'replied_to' and submolt == 'X'"), n()])

    # Same-path constraints with where + compare/col
    g2 = g.gfql(
        [n(name='a'), e_forward(name='e'), n(name='b')],
        where=[compare(col('a', 'owner_id'), '==', col('b', 'owner_id'))]
    )

    # Traverse 2-4 hops but only return hops 3-4
    g2 = g.gfql([e_forward(min_hops=2, max_hops=4, output_min_hops=3, output_max_hops=4)])
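To make the same-path constraint concrete, the sketch below applies the same idea to a tiny in-memory graph: keep a one-hop path only when the named start node `a` and end node `b` agree on `owner_id`. It is a plain-Python illustration with made-up data and a hypothetical `same_owner_paths` helper, not a description of how GFQL evaluates `where`.

```python
# Hypothetical node attributes and directed edges.
node_attrs = {
    "n1": {"owner_id": "u1"},
    "n2": {"owner_id": "u1"},
    "n3": {"owner_id": "u2"},
}
edge_list = [("n1", "n2"), ("n1", "n3"), ("n3", "n2")]


def same_owner_paths(nodes, edges, column="owner_id"):
    """Keep (a, b) paths whose endpoints share the same value in `column`."""
    return [(a, b) for a, b in edges if nodes[a][column] == nodes[b][column]]


matching_paths = same_owner_paths(node_attrs, edge_list)
```

A per-node filter cannot express this, because the condition compares two different steps of the same path rather than testing each node in isolation.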
Edge direction variants
- `e_forward()`: source-to-destination
- `e_reverse()`: destination-to-source
- `e_undirected()`: both directions
- `e()`: alias for any direction
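The difference between the direction variants shows up clearly on a two-edge example. The sketch below is a standalone illustration of one-hop neighbor lookup per direction; `neighbors` and its `direction` strings are hypothetical names for this example, not PyGraphistry API.

```python
def neighbors(edges, node, direction="forward"):
    """Return one-hop neighbors of `node` over (src, dst) pairs."""
    if direction not in ("forward", "reverse", "undirected"):
        raise ValueError(f"Unknown direction: {direction}")
    found = set()
    for src, dst in edges:
        if direction in ("forward", "undirected") and src == node:
            found.add(dst)  # follow the edge source-to-destination
        if direction in ("reverse", "undirected") and dst == node:
            found.add(src)  # follow the edge destination-to-source
    return found


sample_edges = [("a", "b"), ("c", "a")]
forward_hits = neighbors(sample_edges, "a", "forward")
reverse_hits = neighbors(sample_edges, "a", "reverse")
either_hits = neighbors(sample_edges, "a", "undirected")
```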
- `g.gfql()` is the unified entrypoint: pass chain-lists or Cypher strings.
- Never use the older deprecated query entrypoints; they emit warnings. Always use `g.gfql([...])` for chain-list syntax or `g.gfql("MATCH ...")` for Cypher.
- When the user explicitly asks for GFQL, final snippets must include an explicit `.gfql(...)` call.
- When the task says remote execution/dataset, use `gfql_remote(...)`.
- Use `name=` labels for intermediate matches when you need constraints.
- Use `where=[...]` for cross-step/path constraints.
- Use `min_hops`/`max_hops` and `output_min_hops`/`output_max_hops` for traversal vs returned slice.
- Use predicates (for example numeric/date predicates) for concise filtering.
- Use `engine='auto'` by default; force a specific engine (such as `'pandas'` or `'cudf'`) only when needed.
    import graphistry
    from graphistry import n, e_forward, let

    # Remote with chain-list
    rg = graphistry.bind(dataset_id='my-dataset')
    res = rg.gfql_remote([n(), e_forward(), n()], engine='auto')

    # Remote with Cypher string
    res = rg.gfql_remote("MATCH (n:Person)-[r]->(m) WHERE n.risk_level = 'critical' RETURN n, r, m")

    # Remote with Let/DAG
    res = rg.gfql_remote(let({...}))

    # Remote slim payload (only required columns)
    res = rg.gfql_remote([n(), e_forward(), n()], output_type='nodes', node_col_subset=['node_id', 'time'])

    # Post-process on remote side when you want trimmed transfer payloads
    res = rg.python_remote_table(lambda g: g._edges[['src', 'dst']].head(1000))
Validation and safety
- Validate user-derived query fragments before execution.
- Normalize datetime columns before temporal predicates.
- Prefer small column subsets for remote result transfer.
- Preflight Cypher: `from graphistry.compute.gfql.cypher import parse_cypher, compile_cypher`
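One way to apply the first rule above is to allowlist any user-supplied column name before splicing it into a Cypher string, and to pass user-supplied values through `params` rather than string formatting. The sketch below is an assumption-level example: `build_threshold_query` and the allowlist are hypothetical, and the query shape simply mirrors the parameterized example earlier in this page.

```python
import re

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_threshold_query(column: str, allowed_columns: set) -> str:
    """Hypothetical helper: build a query only for vetted column names."""
    if not _IDENTIFIER.fullmatch(column) or column not in allowed_columns:
        raise ValueError(f"Column not allowed: {column!r}")
    # The threshold value stays a $cutoff placeholder, supplied via params.
    return f"MATCH (n) WHERE n.{column} > $cutoff RETURN n.id, n.{column}"


safe_query = build_threshold_query("score", {"score", "age"})
```

The resulting string would then be run as `g.gfql(safe_query, params={"cutoff": 50})`, so the user-controlled value is never interpolated into the query text.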