etetoolkit

ETE Toolkit Skill

Overview

ETE (Environment for Tree Exploration) is a toolkit for phylogenetic and hierarchical tree analysis. Manipulate trees, analyze evolutionary events, visualize results, and integrate with biological databases for phylogenomic research and clustering analysis.

Core Capabilities

1. Tree Manipulation and Analysis

Load, manipulate, and analyze hierarchical tree structures with support for:

Tree I/O: Read and write Newick, NHX, PhyloXML, and NeXML formats
Tree traversal: Navigate trees using preorder, postorder, or levelorder strategies
Topology modification: Prune, root, collapse nodes, resolve polytomies
Distance calculations: Compute branch lengths and topological distances between nodes
Tree comparison: Calculate Robinson-Foulds distances and identify topological differences
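
The traversal strategies listed above differ only in the order in which nodes are visited. The following is a minimal pure-Python sketch of the three orders on a toy tree; the `(name, children)` tuple layout and the function names are illustrative assumptions, not part of the ETE API.

```python
from collections import deque

# Toy tree as (name, [children]); a hypothetical stand-in for an ETE Tree.
TOY = ("R", [("C", [("A", []), ("B", [])]),
             ("F", [("D", []), ("E", [])])])

def preorder(node):
    """Visit a node before its children (top-down)."""
    name, children = node
    yield name
    for child in children:
        yield from preorder(child)

def postorder(node):
    """Visit a node after all of its children (bottom-up)."""
    name, children = node
    for child in children:
        yield from postorder(child)
    yield name

def levelorder(node):
    """Visit nodes level by level, starting at the root."""
    queue = deque([node])
    while queue:
        name, children = queue.popleft()
        yield name
        queue.extend(children)

print(list(preorder(TOY)))    # ['R', 'C', 'A', 'B', 'F', 'D', 'E']
print(list(postorder(TOY)))   # ['A', 'B', 'C', 'D', 'E', 'F', 'R']
print(list(levelorder(TOY)))  # ['R', 'C', 'F', 'A', 'B', 'D', 'E']
```

Postorder suits computations that need child results first (for example, subtree sizes), while preorder suits passing information down from the root.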

Common patterns:

```python
from ete3 import Tree

# Load tree from file
tree = Tree("tree.nw", format=1)

# Basic statistics
print(f"Leaves: {len(tree)}")
print(f"Total nodes: {len(list(tree.traverse()))}")

# Prune to taxa of interest
taxa_to_keep = ["species1", "species2", "species3"]
tree.prune(taxa_to_keep, preserve_branch_length=True)

# Midpoint root
midpoint = tree.get_midpoint_outgroup()
tree.set_outgroup(midpoint)

# Save modified tree
tree.write(outfile="rooted_tree.nw")
```

Use `scripts/tree_operations.py` for command-line tree manipulation:

```bash
# Display tree statistics
python scripts/tree_operations.py stats tree.nw

# Convert format
python scripts/tree_operations.py convert tree.nw output.nw --in-format 0 --out-format 1

# Reroot tree
python scripts/tree_operations.py reroot tree.nw rooted.nw --midpoint

# Prune to specific taxa
python scripts/tree_operations.py prune tree.nw pruned.nw --keep-taxa "sp1,sp2,sp3"

# Show ASCII visualization
python scripts/tree_operations.py ascii tree.nw
```

2. Phylogenetic Analysis

Analyze gene trees with evolutionary event detection:

Sequence alignment integration: Link trees to multiple sequence alignments (FASTA, Phylip)
Species naming: Automatic or custom species extraction from gene names
Evolutionary events: Detect duplication and speciation events using Species Overlap or tree reconciliation
Orthology detection: Identify orthologs and paralogs based on evolutionary events
Gene family analysis: Split trees by duplications, collapse lineage-specific expansions
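
The idea behind species-overlap event detection can be sketched without ETE: an internal node is called a duplication when the species found under its two child branches overlap, and a speciation otherwise. The helper names below are hypothetical, the gene names are made up, and the sketch assumes the `species_gene` naming convention with a zero overlap threshold.

```python
def species_of(gene_name):
    """Species code = text before the first underscore (assumed naming convention)."""
    return gene_name.split("_")[0]

def classify_node(left_genes, right_genes):
    """Return "D" if the two child branches share any species, else "S"."""
    left_species = {species_of(g) for g in left_genes}
    right_species = {species_of(g) for g in right_genes}
    return "D" if left_species & right_species else "S"

# Two single-copy genes from different species: speciation
print(classify_node(["Hsa_g1"], ["Ptr_g1"]))            # S

# Hsa appears on both sides of the node: duplication
print(classify_node(["Hsa_g1", "Ptr_g1"], ["Hsa_g2"]))  # D
```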

Workflow for gene tree analysis:

```python
from ete3 import PhyloTree

# Load gene tree with alignment
tree = PhyloTree("gene_tree.nw", alignment="alignment.fasta")

# Set species naming function
def get_species(gene_name):
    return gene_name.split("_")[0]

tree.set_species_naming_function(get_species)

# Detect evolutionary events
events = tree.get_descendant_evol_events()

# Analyze events
for node in tree.traverse():
    if hasattr(node, "evoltype"):
        if node.evoltype == "D":
            print(f"Duplication at {node.name}")
        elif node.evoltype == "S":
            print(f"Speciation at {node.name}")

# Extract ortholog groups
# get_speciation_trees() returns (number of trees, number of duplications, trees)
ntrees, ndups, ortho_groups = tree.get_speciation_trees()
for i, ortho_tree in enumerate(ortho_groups):
    ortho_tree.write(outfile=f"ortholog_group_{i}.nw")
```


**Finding orthologs and paralogs:**

```python
# Find orthologs to query gene
query = tree & "species1_gene1"

orthologs = []
paralogs = []

# in_seqs and out_seqs hold leaf names, so compare against query.name
for event in events:
    if query.name in event.in_seqs:
        if event.etype == "S":
            orthologs.extend(s for s in event.out_seqs if s != query.name)
        elif event.etype == "D":
            paralogs.extend(s for s in event.out_seqs if s != query.name)
```

3. NCBI Taxonomy Integration

Integrate taxonomic information from NCBI Taxonomy database:

Database access: Automatic download and local caching of NCBI taxonomy (~300MB)
Taxid/name translation: Convert between taxonomic IDs and scientific names
Lineage retrieval: Get complete evolutionary lineages
Taxonomy trees: Build species trees connecting specified taxa
Tree annotation: Automatically annotate trees with taxonomic information

Building taxonomy-based trees:

```python
from ete3 import NCBITaxa

ncbi = NCBITaxa()

# Build tree from species names
species = ["Homo sapiens", "Pan troglodytes", "Mus musculus"]
name2taxid = ncbi.get_name_translator(species)
taxids = [name2taxid[sp][0] for sp in species]

# Get minimal tree connecting taxa
tree = ncbi.get_topology(taxids)

# Annotate nodes with taxonomy info
for node in tree.traverse():
    if hasattr(node, "sci_name"):
        print(f"{node.sci_name} - Rank: {node.rank} - TaxID: {node.taxid}")
```

**Annotating existing trees:**

```python
# Get taxonomy info for tree leaves
# extract_species_from_name() is a user-supplied helper mapping a leaf name to a species name
for leaf in tree:
    species = extract_species_from_name(leaf.name)
    taxid = ncbi.get_name_translator([species])[species][0]

    # Get lineage
    lineage = ncbi.get_lineage(taxid)
    ranks = ncbi.get_rank(lineage)
    names = ncbi.get_taxid_translator(lineage)

    # Add to node
    leaf.add_feature("taxid", taxid)
    leaf.add_feature("lineage", [names[t] for t in lineage])
```

4. Tree Visualization

Create publication-quality tree visualizations:

Output formats: PNG (raster), PDF, and SVG (vector) for publications
Layout modes: Rectangular and circular tree layouts
Interactive GUI: Explore trees interactively with zoom, pan, and search
Custom styling: NodeStyle for node appearance (colors, shapes, sizes)
Faces: Add graphical elements (text, images, charts, heatmaps) to nodes
Layout functions: Dynamic styling based on node properties

Basic visualization workflow:

```python
from ete3 import Tree, TreeStyle, NodeStyle

tree = Tree("tree.nw")

# Configure tree style
ts = TreeStyle()
ts.show_leaf_name = True
ts.show_branch_support = True
ts.scale = 50  # pixels per branch length unit

# Style nodes
for node in tree.traverse():
    nstyle = NodeStyle()

    if node.is_leaf():
        nstyle["fgcolor"] = "blue"
        nstyle["size"] = 8
    else:
        # Color by support
        if node.support > 0.9:
            nstyle["fgcolor"] = "darkgreen"
        else:
            nstyle["fgcolor"] = "red"
        nstyle["size"] = 5

    node.set_style(nstyle)

# Render to file
tree.render("tree.pdf", tree_style=ts)
tree.render("tree.png", w=800, h=600, units="px", dpi=300)
```

Use `scripts/quick_visualize.py` for rapid visualization:

```bash
# Basic visualization
python scripts/quick_visualize.py tree.nw output.pdf

# Circular layout with custom styling
python scripts/quick_visualize.py tree.nw output.pdf --mode c --color-by-support

# High-resolution PNG
python scripts/quick_visualize.py tree.nw output.png --width 1200 --height 800 --units px --dpi 300

# Custom title and styling
python scripts/quick_visualize.py tree.nw output.pdf --title "Species Phylogeny" --show-support
```

**Advanced visualization with faces:**

```python
from ete3 import Tree, TreeStyle, TextFace, CircleFace

tree = Tree("tree.nw")

# Add features to nodes
for leaf in tree:
    leaf.add_feature("habitat", "marine" if "fish" in leaf.name else "land")

# Layout function
def layout(node):
    if node.is_leaf():
        # Add colored circle
        color = "blue" if node.habitat == "marine" else "green"
        circle = CircleFace(radius=5, color=color)
        node.add_face(circle, column=0, position="aligned")

        # Add label
        label = TextFace(node.name, fsize=10)
        node.add_face(label, column=1, position="aligned")

ts = TreeStyle()
ts.layout_fn = layout
ts.show_leaf_name = False

tree.render("annotated_tree.pdf", tree_style=ts)
```

5. Clustering Analysis

Analyze hierarchical clustering results with data integration:

ClusterTree: Specialized class for clustering dendrograms
Data matrix linking: Connect tree leaves to numerical profiles
Cluster metrics: Silhouette coefficient, Dunn index, inter/intra-cluster distances
Validation: Test cluster quality with different distance metrics
Heatmap visualization: Display data matrices alongside trees
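
As background for the cluster metrics above, the silhouette of a single point compares its mean distance to its own cluster (a) with its mean distance to another cluster (b): s = (b - a) / max(a, b). The sketch below is a simplified two-cluster, pure-Python illustration with made-up coordinates; the function names are hypothetical, and ETE's own cluster-level computation may differ in detail.

```python
import math

def mean_distance(point, others):
    """Mean Euclidean distance from point to every point in others."""
    return sum(math.dist(point, o) for o in others) / len(others)

def silhouette(point, own_cluster, other_cluster):
    """Silhouette of one point, assuming exactly two clusters."""
    a = mean_distance(point, [p for p in own_cluster if p != point])
    b = mean_distance(point, other_cluster)
    return (b - a) / max(a, b)

cluster_x = [(0.0, 0.0), (0.0, 2.0)]
cluster_y = [(4.0, 0.0), (4.0, 2.0)]

score = silhouette(cluster_x[0], cluster_x, cluster_y)
print(f"{score:.3f}")  # 0.528
```

Values near 1 indicate a point that sits well inside its own cluster; values near 0 or below suggest it lies between clusters or is misassigned.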

Clustering workflow:

```python
from ete3 import ClusterTree

# Load tree with data matrix
matrix = """#Names\tSample1\tSample2\tSample3
Gene1\t1.5\t2.3\t0.8
Gene2\t0.9\t1.1\t1.8
Gene3\t2.1\t2.5\t0.5"""

tree = ClusterTree("((Gene1,Gene2),Gene3);", text_array=matrix)

# Evaluate cluster quality
for node in tree.traverse():
    if not node.is_leaf():
        silhouette = node.get_silhouette()
        dunn = node.get_dunn()

        print(f"Cluster: {node.name}")
        print(f"  Silhouette: {silhouette:.3f}")
        print(f"  Dunn index: {dunn:.3f}")

# Visualize with heatmap
tree.show("heatmap")
```

6. Tree Comparison

Quantify topological differences between trees:

Robinson-Foulds distance: Standard metric for tree comparison
Normalized RF: Scale-invariant distance (0.0 to 1.0)
Partition analysis: Identify unique and shared bipartitions
Consensus trees: Analyze support across multiple trees
Batch comparison: Compare multiple trees pairwise
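
To make the metric concrete: the Robinson-Foulds distance counts the bipartitions (splits) present in one tree but not the other, and the normalized form divides by the total number of splits considered. The following is a small pure-Python sketch on hand-written splits for two five-leaf trees; `canonical` and `rf_distance` are hypothetical helpers, not ETE functions.

```python
def canonical(side, leaves):
    """Represent a split by the side that excludes the alphabetically first leaf."""
    side = frozenset(side)
    return frozenset(leaves) - side if min(leaves) in side else side

def rf_distance(parts_a, parts_b):
    """Return (rf, max_rf) for two sets of canonical non-trivial splits."""
    rf = len(parts_a ^ parts_b)           # splits found in only one tree
    max_rf = len(parts_a) + len(parts_b)  # no split shared at all
    return rf, max_rf

LEAVES = {"A", "B", "C", "D", "E"}

# Non-trivial splits of ((A,B),C,(D,E)) and ((A,C),B,(D,E))
tree1_parts = {canonical(s, LEAVES) for s in ({"A", "B"}, {"D", "E"})}
tree2_parts = {canonical(s, LEAVES) for s in ({"A", "C"}, {"D", "E"})}

rf, max_rf = rf_distance(tree1_parts, tree2_parts)
print(rf, max_rf, rf / max_rf)  # 2 4 0.5
```

The two trees share the D,E split and disagree on the other, so half of all splits differ.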

Compare two trees:

```python
from ete3 import Tree

tree1 = Tree("tree1.nw")
tree2 = Tree("tree2.nw")

# Calculate RF distance
# robinson_foulds() may return more than five values depending on the ETE
# version, so keep only the first five
rf, max_rf, common_leaves, parts_t1, parts_t2 = tree1.robinson_foulds(tree2)[:5]

print(f"RF distance: {rf}/{max_rf}")
print(f"Normalized RF: {rf/max_rf:.3f}")
print(f"Common leaves: {len(common_leaves)}")

# Find unique partitions
unique_t1 = parts_t1 - parts_t2
unique_t2 = parts_t2 - parts_t1

print(f"Unique to tree1: {len(unique_t1)}")
print(f"Unique to tree2: {len(unique_t2)}")
```

**Compare multiple trees:**

```python
import numpy as np
from ete3 import Tree

trees = [Tree(f"tree{i}.nw") for i in range(4)]

# Create distance matrix
n = len(trees)
dist_matrix = np.zeros((n, n))

for i in range(n):
    for j in range(i + 1, n):
        rf, max_rf = trees[i].robinson_foulds(trees[j])[:2]
        norm_rf = rf / max_rf if max_rf > 0 else 0
        dist_matrix[i, j] = norm_rf
        dist_matrix[j, i] = norm_rf
```

Installation and Setup

Install ETE toolkit:

```bash
# Basic installation
uv pip install ete3

# With external dependencies for rendering (optional but recommended)
# On macOS:
brew install qt@5

# On Ubuntu/Debian:
sudo apt-get install python3-pyqt5 python3-pyqt5.qtsvg

# For full features including GUI (quoted so the shell does not expand the brackets)
uv pip install "ete3[gui]"
```

**First-time NCBI Taxonomy setup:**

The first time NCBITaxa is instantiated, it automatically downloads the NCBI taxonomy database (~300MB) to `~/.etetoolkit/taxa.sqlite`. This happens only once:

```python
from ete3 import NCBITaxa
ncbi = NCBITaxa()  # Downloads database on first run
```

Update taxonomy database:

```python
ncbi.update_taxonomy_database()  # Download latest NCBI data
```
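
A quick way to confirm which pieces of the setup are present is to ask Python whether the modules can be found, without importing them. This is a minimal sketch using only the standard library; `has_module` is a hypothetical helper, and checking `PyQt5` assumes that is the Qt binding your rendering setup relies on.

```python
import importlib.util

def has_module(name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(name) is not None

for module in ("ete3", "PyQt5"):
    status = "available" if has_module(module) else "missing"
    print(f"{module}: {status}")
```

If `ete3` is reported as missing, rerun the basic installation; if only `PyQt5` is missing, tree loading and analysis may still work while `render()` and `show()` fail.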

Common Use Cases

Use Case 1: Phylogenomic Pipeline

Complete workflow from gene tree to ortholog identification:

```python
from ete3 import PhyloTree, NCBITaxa

# 1. Load gene tree with alignment
tree = PhyloTree("gene_tree.nw", alignment="alignment.fasta")

# 2. Configure species naming
tree.set_species_naming_function(lambda x: x.split("_")[0])

# 3. Detect evolutionary events
tree.get_descendant_evol_events()

# 4. Annotate with taxonomy
# species_to_taxid: user-supplied dict mapping species codes to NCBI taxids
ncbi = NCBITaxa()
for leaf in tree:
    if leaf.species in species_to_taxid:
        taxid = species_to_taxid[leaf.species]
        lineage = ncbi.get_lineage(taxid)
        leaf.add_feature("lineage", lineage)

# 5. Extract ortholog groups
# get_speciation_trees() returns (number of trees, number of duplications, trees)
ntrees, ndups, ortho_groups = tree.get_speciation_trees()

# 6. Save and visualize
for i, ortho in enumerate(ortho_groups):
    ortho.write(outfile=f"ortho_{i}.nw")
```

Use Case 2: Tree Preprocessing and Formatting

Batch process trees for analysis:

```bash
# Convert format
python scripts/tree_operations.py convert input.nw output.nw --in-format 0 --out-format 1

# Root at midpoint
python scripts/tree_operations.py reroot input.nw rooted.nw --midpoint

# Prune to focal taxa
python scripts/tree_operations.py prune rooted.nw pruned.nw --keep-taxa taxa_list.txt

# Get statistics
python scripts/tree_operations.py stats pruned.nw
```

Use Case 3: Publication-Quality Figures

Create styled visualizations:

```python
from ete3 import Tree, TreeStyle, NodeStyle, TextFace

tree = Tree("tree.nw")

# Define clade colors
clade_colors = {
    "Mammals": "red",
    "Birds": "blue",
    "Fish": "green",
}

def layout(node):
    # Highlight clades
    if node.is_leaf():
        for clade, color in clade_colors.items():
            if clade in node.name:
                nstyle = NodeStyle()
                nstyle["fgcolor"] = color
                nstyle["size"] = 8
                node.set_style(nstyle)
    else:
        # Add support values
        if node.support > 0.95:
            support = TextFace(f"{node.support:.2f}", fsize=8)
            node.add_face(support, column=0, position="branch-top")

ts = TreeStyle()
ts.layout_fn = layout
ts.show_scale = True

# Render for publication
tree.render("figure.pdf", w=200, units="mm", tree_style=ts)
tree.render("figure.svg", tree_style=ts)  # Editable vector
```

Use Case 4: Automated Tree Analysis

Process multiple trees systematically:

```python
from ete3 import Tree
import os

input_dir = "trees"
output_dir = "processed"
os.makedirs(output_dir, exist_ok=True)  # Make sure the output directory exists

for filename in os.listdir(input_dir):
    if filename.endswith(".nw"):
        tree = Tree(os.path.join(input_dir, filename))

        # Standardize: midpoint root
        midpoint = tree.get_midpoint_outgroup()
        tree.set_outgroup(midpoint)

        # Filter low support branches; snapshot the nodes first because
        # delete() modifies the tree while it is being traversed
        for node in list(tree.traverse()):
            if node.support < 0.5 and not node.is_leaf() and not node.is_root():
                node.delete()

        # Resolve polytomies last: deleting nodes creates new polytomies
        tree.resolve_polytomy(recursive=True)

        # Save processed tree
        output_file = os.path.join(output_dir, f"processed_{filename}")
        tree.write(outfile=output_file)
```

Reference Documentation

For comprehensive API documentation, code examples, and detailed guides, refer to the following resources in the `references/` directory:

`api_reference.md`: Complete API documentation for all ETE classes and methods (Tree, PhyloTree, ClusterTree, NCBITaxa), including parameters, return types, and code examples
`workflows.md`: Common workflow patterns organized by task (tree operations, phylogenetic analysis, tree comparison, taxonomy integration, clustering analysis)
`visualization.md`: Comprehensive visualization guide covering TreeStyle, NodeStyle, Faces, layout functions, and advanced visualization techniques

Load these references when detailed information is needed:

```python
# To use API reference
# Read references/api_reference.md for complete method signatures and parameters

# To implement workflows
# Read references/workflows.md for step-by-step workflow examples

# To create visualizations
# Read references/visualization.md for styling and rendering options
```

Troubleshooting

**Import errors:**

```bash
# If "ModuleNotFoundError: No module named 'ete3'"
uv pip install ete3

# For GUI and rendering issues (quoted so the shell does not expand the brackets)
uv pip install "ete3[gui]"
```

**Rendering issues:**

If `tree.render()` or `tree.show()` fails with Qt-related errors, install system dependencies:

```bash
# macOS
brew install qt@5

# Ubuntu/Debian
sudo apt-get install python3-pyqt5 python3-pyqt5.qtsvg
```

**NCBI Taxonomy database:**

If database download fails or becomes corrupted:

```python
from ete3 import NCBITaxa
ncbi = NCBITaxa()
ncbi.update_taxonomy_database()  # Redownload database
```

**Memory issues with large trees:**

For very large trees (>10,000 leaves), use the iterator methods instead of methods that build a full list:

```python
# Memory-efficient iteration
for leaf in tree.iter_leaves():
    process(leaf)

# Instead of
for leaf in tree.get_leaves():  # Loads all into memory
    process(leaf)
```
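
The difference between the two iteration styles can be illustrated without ETE: a generator yields items on demand, while a list holds references to all of them at once. The sketch below uses made-up leaf names and hypothetical function names, and only measures the size of the container itself, not of the nodes.

```python
import sys

def leaf_names_list(n):
    """Build every name up front, in the way a get_*-style method returns a list."""
    return [f"leaf{i}" for i in range(n)]

def leaf_names_iter(n):
    """Yield names one at a time, in the way an iter_*-style method does."""
    return (f"leaf{i}" for i in range(n))

as_list = leaf_names_list(100_000)
as_iter = leaf_names_iter(100_000)

# The generator object stays small no matter how many items it will yield
print(sys.getsizeof(as_list) > sys.getsizeof(as_iter))  # True
```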

Newick Format Reference

ETE supports multiple Newick format specifications (0-9 and 100), selected with the `format` argument:

Format 0: Flexible with support values (default)
Format 1: Flexible with internal node names
Format 2: All branch lengths + leaf names + internal support values
Format 3: All branch lengths + all node names
Format 5: Internal and leaf branch lengths + leaf names
Format 8: All node names (no branch lengths)
Format 9: Leaf names only
Format 100: Topology only

Specify format when reading/writing:

```python
tree = Tree("tree.nw", format=1)
tree.write(outfile="output.nw", format=5)
```

NHX (New Hampshire eXtended) format preserves custom features:

```python
tree.write(outfile="tree.nhx", features=["habitat", "temperature", "depth"])
```
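
To get a feel for what the sparser formats keep, the sketch below strips a Newick string down to roughly "leaf names only" and "topology only" using regular expressions. It is an illustration on a simple made-up string, not how ETE writes these formats, and it assumes unquoted names without special characters.

```python
import re

# Branch lengths after ":" and a support value after the inner ")"
newick = "((A:0.1,B:0.2)0.95:0.3,C:0.4);"

# Drop branch lengths (":" followed by a number)
no_lengths = re.sub(r":[0-9.eE+-]+", "", newick)

# Drop labels that follow a closing parenthesis (internal names or support values)
leaf_names_only = re.sub(r"\)[^,();]+", ")", no_lengths)

# Drop every remaining label, leaving only the bracket structure
topology_only = re.sub(r"[^(),;]+", "", leaf_names_only)

print(no_lengths)       # ((A,B)0.95,C);
print(leaf_names_only)  # ((A,B),C);
print(topology_only)    # ((,),);
```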

Best Practices

Preserve branch lengths: Use `preserve_branch_length=True` when pruning for phylogenetic analysis
Cache content: Use `get_cached_content()` for repeated access to node contents on large trees
Use iterators: Employ `iter_*` methods for memory-efficient processing of large trees
Choose appropriate traversal: Postorder for bottom-up analysis, preorder for top-down
Validate monophyly: Always check returned clade type (monophyletic/paraphyletic/polyphyletic)
Vector formats for publication: Use PDF or SVG for publication figures (scalable, editable)
Interactive testing: Use `tree.show()` to test visualizations before rendering to file
PhyloTree for phylogenetics: Use PhyloTree class for gene trees and evolutionary analysis
Copy method selection: "newick" for speed, "cpickle" for full fidelity, "deepcopy" for complex objects
NCBI query caching: Store NCBI taxonomy query results to avoid repeated database access
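
One simple way to follow the query-caching advice is to memoize lookups with `functools.lru_cache`. In the sketch below, `slow_lineage_lookup` is a hypothetical stand-in for a database call such as `ncbi.get_lineage(taxid)`, and the value it returns is a placeholder rather than a real lineage.

```python
from functools import lru_cache

CALLS = {"n": 0}  # counts how often the slow lookup actually runs

def slow_lineage_lookup(taxid):
    """Hypothetical stand-in for a taxonomy database query."""
    CALLS["n"] += 1
    return (1, taxid)  # placeholder result, not a real lineage

@lru_cache(maxsize=None)
def cached_lineage(taxid):
    """Return the lineage for taxid, querying the backend only once per taxid."""
    return slow_lineage_lookup(taxid)

first = cached_lineage(9606)
second = cached_lineage(9606)  # served from the cache
print(CALLS["n"])  # 1
```

Returning a tuple instead of a list keeps the cached value immutable, so one caller cannot accidentally modify the result seen by another.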