pygraphistry-ai

PyGraphistry AI

Doc routing (local + canonical)

First route with `../pygraphistry/references/pygraphistry-readthedocs-toc.md`.
Use `../pygraphistry/references/pygraphistry-readthedocs-top-level.tsv` for section-level shortcuts.
Only scan `../pygraphistry/references/pygraphistry-readthedocs-sitemap.xml` when a needed page is missing.
Use one batched discovery read before deep-page reads; avoid `cat *` and serial micro-reads.
In user-facing answers, prefer canonical `https://pygraphistry.readthedocs.io/en/latest/...` links.
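
The lookup order above can be sketched as a tiered fallback. This is only an illustration: the function name `route_doc`, the in-memory inputs, and the assumption that each reference file has already been parsed into `(title, path)` pairs are all hypothetical, since the actual file layouts are not described here.

```python
from typing import Iterable, Optional, Tuple

# Canonical base for user-facing links (taken from the guidance above).
CANONICAL_BASE = "https://pygraphistry.readthedocs.io/en/latest/"

# Hypothetical parsed form of one reference entry: (title, docs-relative path).
Entry = Tuple[str, str]


def route_doc(query: str, toc: Iterable[Entry], top_level: Iterable[Entry],
              sitemap: Iterable[Entry]) -> Optional[str]:
    """Return a canonical URL for the first entry whose title contains `query`.

    Tiers are consulted in priority order: TOC, then top-level shortcuts,
    and the sitemap only when the earlier tiers have no match.
    """
    needle = query.lower()
    for tier in (toc, top_level, sitemap):
        for title, path in tier:
            if needle in title.lower():
                return CANONICAL_BASE + path.lstrip("/")
    return None


# Example tiers built from paths that appear in the canonical docs list.
toc = [("GFQL + AI combos", "gfql/combo.html")]
top_level = [("API AI reference", "api/ai.html")]
sitemap = [("AI notebook index", "notebooks/ai.html")]
```

Passing all three tiers in one call mirrors the "one batched discovery read" advice: the sources are loaded once, then searched in order rather than re-read per query.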

Typical workflow

Build graph from nodes/edges.
Run feature/embedding method (`umap`, `embed`, optional `dbscan`).
Inspect derived columns/features and visualize.
Iterate on feature columns and sampling strategy.

Baseline examples

```python
import graphistry

# Similarity embedding / projection
# df: node table with an 'id' column and feature columns f1..f3
g2 = graphistry.nodes(df, 'id').umap(X=['f1', 'f2', 'f3'])
g2.plot()
```

```python
# Fit/transform flow for consistent projection on new batches
g_train = graphistry.nodes(df_train, 'id').umap(X=['f1', 'f2'])
g_batch = g_train.transform_umap(df_batch, return_graph=True)
g_batch.plot()
```

```python
# Semantic search over embedded features
g2 = graphistry.nodes(df, 'id').umap(X=['text_col'])
results_df, query_vector = g2.search('suspicious login pattern')
```

```python
# Text-first workflow: featurize then search/cluster
g2 = (
    graphistry.nodes(df, 'id')
    .featurize(kind='nodes', X=['title', 'body'])
    .umap(kind='nodes')
    .dbscan()
)
hits, qv = g2.search('credential stuffing campaign')
```

```python
# Precomputed embedding columns
embedding_cols = [c for c in df.columns if c.startswith('emb_')]
g2 = graphistry.nodes(df, 'id').umap(X=embedding_cols)
g_new = g2.transform_umap(df_new, return_graph=True)
```

Practical guardrails

Start with small/representative samples before full runs.
Keep explicit feature lists (`X=...`) for reproducibility.
Track engine/dataframe type for CPU vs GPU behavior.
For anomaly workflows, document thresholds and false-positive assumptions.
For graph ML tasks, route deeper model workflows to RGCN/link-prediction references.
For text workflows, prefer `featurize(...).umap(...).search(...)` when queries are natural language.
If users already have embeddings, reuse them via explicit embedding column lists (`X=[...]`) before recomputing.
When a user asks for a concise workflow snippet, prefer one short code block and avoid long narrative wrappers.
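
Two of these guardrails, explicit feature lists and small reproducible samples, can be captured in small helpers. The sketch below uses only the standard library and is illustrative: `explicit_feature_list` and `seeded_sample` are hypothetical names, not PyGraphistry APIs, and the `emb_` prefix is an assumed naming convention borrowed from the precomputed-embedding example.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def explicit_feature_list(columns: Sequence[str], prefix: str = "emb_") -> List[str]:
    """Collect precomputed embedding columns into an explicit list.

    Column order is preserved, so the result can be logged and passed as X=[...].
    Raises ValueError when nothing matches, rather than silently returning [].
    """
    cols = [c for c in columns if c.startswith(prefix)]
    if not cols:
        raise ValueError(f"no columns start with {prefix!r}")
    return cols


def seeded_sample(rows: Sequence[T], n: int, seed: int = 0) -> List[T]:
    """Draw a reproducible sample of at most n rows for a quick trial run."""
    rng = random.Random(seed)  # fixed seed: the same rows come back every run
    return rng.sample(list(rows), min(n, len(rows)))
```

Failing loudly on an empty feature list is a deliberate choice here: an empty `X` would otherwise be easy to miss when a column prefix changes upstream.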

Canonical docs

GFQL + AI combos: https://pygraphistry.readthedocs.io/en/latest/gfql/combo.html
API AI reference: https://pygraphistry.readthedocs.io/en/latest/api/ai.html
AI notebook index: https://pygraphistry.readthedocs.io/en/latest/notebooks/ai.html
Example RGCN notebook: https://pygraphistry.readthedocs.io/en/latest/demos/more_examples/graphistry_features/embed/simple-ssh-logs-rgcn-anomaly-detector.html
