| Task Category | Examples |
|---|---|
| Sequence work | DNA/RNA/protein manipulation, motif finding, translation |
| File handling | FASTA, FASTQ, GenBank, Newick, BIOM I/O |
| Alignments | Pairwise or multiple sequence alignment |
| Phylogenetics | Tree construction, manipulation, distance calculations |
| Diversity metrics | Alpha diversity (Shannon, Faith's PD), beta diversity (Bray-Curtis, UniFrac) |
| Ordination | PCoA, CCA, RDA for dimensionality reduction |
| Statistical tests | PERMANOVA, ANOSIM, Mantel tests |
| Microbiome analysis | Feature tables, rarefaction, community comparisons |

`uv pip install scikit-bio`

`DNA`, `RNA`, `Protein`

`import skbio`

**Metadata types:**
- Sequence-level: ID, description, source organism
- Positional: Per-base quality scores (from FASTQ)
- Interval: Feature annotations, gene boundaries
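
The three metadata levels can be pictured with a small plain-Python container. This is only an illustrative sketch: `AnnotatedSequence`, its field names, and `features_at` are hypothetical and are not scikit-bio classes or methods. It assumes 0-based positions and half-open intervals.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotatedSequence:
    """Hypothetical container for illustration; not a scikit-bio class."""

    residues: str
    # Sequence-level: one value for the whole sequence (ID, description, ...).
    metadata: dict = field(default_factory=dict)
    # Positional: one value per residue, e.g. FASTQ quality scores.
    positional: dict = field(default_factory=dict)
    # Interval: (start, end, annotation) tuples over half-open [start, end).
    intervals: list = field(default_factory=list)

    def __post_init__(self):
        # Positional metadata only makes sense with one value per residue.
        for name, values in self.positional.items():
            if len(values) != len(self.residues):
                raise ValueError(f"{name!r} must have one value per residue")

    def features_at(self, position):
        """Return annotations whose interval covers a 0-based position."""
        return [note for start, end, note in self.intervals
                if start <= position < end]


seq = AnnotatedSequence(
    "ATGCCGTAA",
    metadata={"id": "seq1", "description": "toy example"},
    positional={"quality": [38, 38, 37, 35, 30, 30, 28, 25, 20]},
    intervals=[(0, 9, "gene"), (0, 3, "start codon")],
)
print(seq.features_at(1))  # ['gene', 'start codon']
```

The length check in `__post_init__` reflects the one constraint that separates positional metadata from the other two levels.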

`from skbio.alignment import local_pairwise_align_ssw, TabularMSA`

**Notes:**
- `local_pairwise_align_ssw` provides fast SSW-based local alignment
- `StripedSmithWaterman` handles protein sequences with substitution matrices
- Affine gap penalties (opening a gap costs more than extending one) usually suit biological sequences better than linear penalties
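
To make the affine gap note concrete, the sketch below compares linear and affine scoring for the same number of gapped positions. The penalty values and function names are illustrative assumptions, not scikit-bio defaults or API. It also assumes the convention where the open penalty covers the first gapped position; some tools charge open plus extend for that position instead.

```python
def linear_gap_cost(length, per_position=2):
    """Linear model: every gapped position costs the same."""
    return per_position * length


def affine_gap_cost(length, gap_open=5, gap_extend=2):
    """Affine model: gap_open for the first position, gap_extend for each
    additional one (assumed convention; values are illustrative)."""
    if length <= 0:
        return 0
    return gap_open + gap_extend * (length - 1)


# Six gapped positions, either as one long gap or as three short gaps.
one_long_linear = linear_gap_cost(6)
three_short_linear = 3 * linear_gap_cost(2)
one_long_affine = affine_gap_cost(6)
three_short_affine = 3 * affine_gap_cost(2)

print(one_long_linear, three_short_linear)  # 12 12
print(one_long_affine, three_short_affine)  # 15 21
```

Under the linear model the two layouts score the same. The affine model prefers a single long gap, which matches the idea that one indel event often spans several residues.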

`from skbio import TreeNode`

`from skbio.tree import nj, upgma`

**Tree construction methods:**
| Method | Use case |
|--------|----------|
| `nj()` | Standard neighbor-joining |
| `upgma()` | Assumes molecular clock |
| `bme()` | Scalable for large datasets |
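
As a rough illustration of what UPGMA-style clustering does, here is a plain-Python sketch that repeatedly merges the two clusters with the smallest average pairwise distance. It is not scikit-bio's `upgma()` implementation: `upgma_newick` is a hypothetical helper, it ignores branch lengths, and it uses a simple quadratic search that would not scale to large inputs.

```python
def upgma_newick(dist, names):
    """Average-linkage clustering; returns a Newick topology without
    branch lengths. Hypothetical helper for illustration only."""
    # Map cluster id -> (Newick fragment, indices of member samples).
    clusters = {i: (name, [i]) for i, name in enumerate(names)}
    next_id = len(names)
    while len(clusters) > 1:
        keys = sorted(clusters)
        best = None
        for x in range(len(keys)):
            for y in range(x + 1, len(keys)):
                a = clusters[keys[x]][1]
                b = clusters[keys[y]][1]
                # Mean distance over all cross-cluster sample pairs.
                avg = sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))
                if best is None or avg < best[0]:
                    best = (avg, keys[x], keys[y])
        _, kx, ky = best
        left, left_members = clusters.pop(kx)
        right, right_members = clusters.pop(ky)
        clusters[next_id] = (f"({left},{right})", left_members + right_members)
        next_id += 1
    newick, _ = next(iter(clusters.values()))
    return newick + ";"


names = ["a", "b", "c", "d"]
dist = [
    [0, 2, 8, 8],
    [2, 0, 8, 8],
    [8, 8, 0, 4],
    [8, 8, 4, 0],
]
print(upgma_newick(dist, names))  # ((a,b),(c,d));
```

Because every merge is based on average distances alone, the result is only trustworthy when lineages evolve at similar rates, which is the molecular clock assumption noted in the table.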

`from skbio.diversity import alpha_diversity`

`from skbio.diversity import beta_diversity`

**Key points:**
- Input must be integer counts, not proportions
- Phylogenetic metrics require a tree matching feature IDs
- `partial_beta_diversity()` computes specific sample pairs efficiently
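
The plain-Python sketch below shows what two of the named metrics compute from integer counts. It is meant as an illustration of the formulas, not as scikit-bio's implementation. The helper names are hypothetical, and the natural logarithm in the Shannon index is an assumption (other bases are also common).

```python
import math


def check_counts(counts):
    """Reject proportions and other non-integer input."""
    if any(not isinstance(c, int) or c < 0 for c in counts):
        raise ValueError("expected non-negative integer counts, not proportions")
    if sum(counts) == 0:
        raise ValueError("sample has no observations")


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over observed features."""
    check_counts(counts)
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u - v| / sum(u + v)."""
    check_counts(u)
    check_counts(v)
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator


sample_a = [10, 10, 0, 0]
sample_b = [0, 0, 10, 10]
sample_c = [10, 10, 10, 10]

print(round(shannon(sample_a), 3))                # 0.693
print(round(shannon(sample_c), 3))                # 1.386
print(bray_curtis(sample_a, sample_b))            # 1.0
print(round(bray_curtis(sample_a, sample_c), 3))  # 0.333
```

Samples A and B share no features, so their dissimilarity is at the maximum of 1.0, while sample C overlaps with A and lands in between.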

`from skbio.stats.ordination import pcoa, cca`

**Methods:**
| Function | Input | Purpose |
|----------|-------|---------|
| `pcoa()` | Distance matrix | Unconstrained ordination |
| `cca()` | Abundance + environment | Constrained ordination (unimodal) |
| `rda()` | Abundance + environment | Constrained ordination (linear) |

`from skbio.stats.distance import permanova, anosim, mantel`

**Test overview:**
| Test | Purpose | Key output |
|------|---------|------------|
| PERMANOVA | Group differences | F-statistic, p-value |
| ANOSIM | Group differences (alternative) | R-statistic, p-value |
| PERMDISP | Dispersion homogeneity | Tests PERMANOVA assumption |
| Mantel | Matrix correlation | Correlation coefficient, p-value |
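
The logic behind a PERMANOVA-style test can be sketched in plain Python: compute a pseudo-F statistic from the distance matrix, then shuffle the group labels to see how often a value at least as large arises by chance. This is a simplified one-factor illustration, not scikit-bio's implementation, and `pseudo_f` and `permutation_test` are hypothetical helper names.

```python
import random
from itertools import combinations


def pseudo_f(dist, groups):
    """Pseudo-F for a single grouping factor from pairwise distances."""
    n = len(groups)
    labels = sorted(set(groups))
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for label in labels:
        members = [i for i, g in enumerate(groups) if g == label]
        ss_within += sum(
            dist[i][j] ** 2 for i, j in combinations(members, 2)
        ) / len(members)
    ss_among = ss_total - ss_within
    a = len(labels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permutation_test(dist, groups, permutations=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    # Count the observed labelling itself as one permutation.
    return observed, (hits + 1) / (permutations + 1)


# Toy data: distance 1 within a group, 3 between groups.
groups = ["x", "x", "x", "y", "y", "y"]
dist = [[0 if i == j else (1 if groups[i] == groups[j] else 3)
         for j in range(6)] for i in range(6)]

f_stat, p_value = permutation_test(dist, groups)
print(f_stat)  # 25.0
```

With only six samples there are few distinct ways to relabel the groups, so the p-value cannot become very small regardless of how clean the separation is.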

`import skbio`

**Supported formats:**
| Category | Formats |
|----------|---------|
| Sequences | FASTA, FASTQ, GenBank, EMBL, QSeq |
| Alignments | Clustal, PHYLIP, Stockholm |
| Trees | Newick |
| Tables | BIOM (HDF5/JSON) |
| Distances | Delimited matrices |
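
For readers unfamiliar with the simplest of these formats, the following sketch parses FASTA text with the standard library only. It is a minimal illustration of the record layout, not scikit-bio's reader: `read_fasta` is a hypothetical helper and does no validation of the sequence alphabet.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence data may be wrapped across several lines.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


example = io.StringIO(">seq1 toy record\nATGC\nCGTA\n>seq2\nGGCC\n")
records = list(read_fasta(example))
print(records)  # [('seq1 toy record', 'ATGCCGTA'), ('seq2', 'GGCC')]
```
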

`from skbio import DistanceMatrix`

`import numpy as np`

`from skbio import Table`

`from skbio.embedding import ProteinEmbedding`

`partial_beta_diversity()`

| Library | Integration |
|---|---|
| pandas | DataFrames from distance matrices, diversity results |
| numpy | Array conversions throughout |
| matplotlib/seaborn | Plot ordination results, heatmaps |
| scikit-learn | Distance matrices as input |
| QIIME 2 | Native BIOM, tree, distance matrix compatibility |
| File | Contents |
|---|---|
| references/api-reference.md | Complete method signatures, parameters, extended examples, and troubleshooting |