pygraphistry-ai
PyGraphistry AI
Doc routing (local + canonical)
- First route with `../pygraphistry/references/pygraphistry-readthedocs-toc.md`.
- Use `../pygraphistry/references/pygraphistry-readthedocs-top-level.tsv` for section-level shortcuts.
- Only scan `../pygraphistry/references/pygraphistry-readthedocs-sitemap.xml` when a needed page is missing.
- Use one batched discovery read before deep-page reads; avoid `cat *` and serial micro-reads.
- In user-facing answers, prefer canonical `https://pygraphistry.readthedocs.io/en/latest/...` links.
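
The last routing rule can be sketched as a small helper that turns a docs-relative page path into a canonical link. This is only an illustration: the base URL comes from the link pattern above, while the helper name `canonical_url` and the assumption that page paths are relative to the docs root are hypothetical.

```python
from urllib.parse import urljoin

# Base taken from the canonical link pattern in this section.
CANONICAL_BASE = "https://pygraphistry.readthedocs.io/en/latest/"


def canonical_url(page_path: str) -> str:
    """Map a docs-relative page path to its canonical ReadTheDocs URL.

    `page_path` is assumed to be relative to the docs root,
    for example "gfql/combo.html".
    """
    # Strip any leading slash so urljoin keeps the /en/latest/ prefix.
    return urljoin(CANONICAL_BASE, page_path.lstrip("/"))


print(canonical_url("gfql/combo.html"))
```

Stripping the leading slash matters because `urljoin` treats an absolute path as replacing everything after the host, which would drop `/en/latest/`.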
Typical workflow
- Build graph from nodes/edges.
- Run feature/embedding method (`umap`, `embed`, optional `dbscan`).
- Inspect derived columns/features and visualize.
- Iterate on feature columns and sampling strategy.
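
The four steps above might look roughly like the sketch below. It is a hedged illustration rather than a reference implementation: the function names `run_workflow` and `derived_columns` and the column names `'src'`, `'dst'`, and `'id'` are assumptions, and `graphistry` is imported lazily because the embedding methods need the library's AI dependencies installed.

```python
def derived_columns(before, after):
    """Return columns present in `after` but not in `before`, in `after` order."""
    seen = set(before)
    return [c for c in after if c not in seen]


def run_workflow(nodes_df, edges_df, feature_cols):
    """Hypothetical end-to-end pass: build, embed, cluster, inspect, plot."""
    import graphistry  # lazy import; requires the AI extras to be installed

    # 1. Build the graph. 'src', 'dst', and 'id' are assumed column names.
    g = graphistry.edges(edges_df, 'src', 'dst').nodes(nodes_df, 'id')

    # 2. Embed with an explicit feature list, then cluster.
    g2 = g.umap(X=feature_cols, kind='nodes').dbscan()

    # 3. Inspect what the pipeline added to the node table.
    new_cols = derived_columns(nodes_df.columns, g2._nodes.columns)
    print('derived node columns:', new_cols)

    # 4. Visualize. This assumes graphistry.register(...) was called earlier.
    g2.plot()
    return g2, new_cols
```

Comparing the node columns before and after avoids hard-coding the names of derived columns, which makes it easier to see what changes when the feature list or sample is iterated on.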
Baseline examples
```python
import graphistry

# Similarity embedding / projection
g2 = graphistry.nodes(df, 'id').umap(X=['f1', 'f2', 'f3'])
g2.plot()
```

```python
# Fit/transform flow for consistent projection on new batches
g_train = graphistry.nodes(df_train, 'id').umap(X=['f1', 'f2'])
g_batch = g_train.transform_umap(df_batch, return_graph=True)
g_batch.plot()
```

```python
# Semantic search over embedded features
g2 = graphistry.nodes(df, 'id').umap(X=['text_col'])
results_df, query_vector = g2.search('suspicious login pattern')
```

```python
# Text-first workflow: featurize then search/cluster
g2 = graphistry.nodes(df, 'id').featurize(kind='nodes', X=['title', 'body']).umap(kind='nodes').dbscan()
hits, qv = g2.search('credential stuffing campaign')
```

```python
# Precomputed embedding columns
embedding_cols = [c for c in df.columns if c.startswith('emb_')]
g2 = graphistry.nodes(df, 'id').umap(X=embedding_cols)
g_new = g2.transform_umap(df_new, return_graph=True)
```
Practical guardrails
- Start with small/representative samples before full runs.
- Keep explicit feature lists (`X=...`) for reproducibility.
- Track engine/dataframe type for CPU vs GPU behavior.
- For anomaly workflows, document thresholds and false-positive assumptions.
- For graph ML tasks, route deeper model workflows to RGCN/link-prediction references.
- For text workflows, prefer `featurize(...).umap(...).search(...)` when queries are natural language.
- If users already have embeddings, reuse them via explicit embedding column lists (`X=[...]`) before recomputing.
- When a user asks for a concise workflow snippet, prefer one short code block and avoid long narrative wrappers.
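
Several of these guardrails (sampling first, explicit feature lists, engine tracking, documented thresholds) can be supported by a few small standard-library helpers. The sketch below is one possible shape, not part of PyGraphistry: every function name and record field is hypothetical, and the sampling is plain seeded random sampling rather than any stratified scheme.

```python
import json
import random


def seeded_sample(rows, n, seed=0):
    """Draw up to `n` rows with a fixed seed so reruns see the same sample."""
    rows = list(rows)
    if n >= len(rows):
        return rows
    return random.Random(seed).sample(rows, n)


def dataframe_engine(df):
    """Best-effort engine tag based on the module that defines the dataframe type."""
    top = type(df).__module__.split('.')[0]
    return top if top in ('pandas', 'cudf') else 'unknown'


def run_record(feature_cols, engine, sample_size, seed, threshold=None):
    """Serialize the settings worth keeping next to each run's output."""
    return json.dumps({
        'X': list(feature_cols),          # explicit feature list
        'engine': engine,                 # CPU (pandas) vs GPU (cudf)
        'sample_size': sample_size,
        'seed': seed,
        'anomaly_threshold': threshold,   # document alongside assumptions
    }, sort_keys=True)


sample = seeded_sample(range(1000), 50, seed=7)
print(run_record(['f1', 'f2'], dataframe_engine(sample), len(sample), 7))
```

Storing the record with each run makes it possible to tell later whether a changed result came from a different feature list, sample, or engine.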
Canonical docs
- GFQL + AI combos: https://pygraphistry.readthedocs.io/en/latest/gfql/combo.html
- API AI reference: https://pygraphistry.readthedocs.io/en/latest/api/ai.html
- AI notebook index: https://pygraphistry.readthedocs.io/en/latest/notebooks/ai.html
- Example RGCN notebook: https://pygraphistry.readthedocs.io/en/latest/demos/more_examples/graphistry_features/embed/simple-ssh-logs-rgcn-anomaly-detector.html