umap-learn
UMAP-Learn
Overview
UMAP (Uniform Manifold Approximation and Projection) is a dimensionality reduction technique for visualization and general non-linear dimensionality reduction. Apply this skill for fast, scalable embeddings that preserve local and global structure, supervised learning, and clustering preprocessing.
Quick Start
Installation
```bash
uv pip install umap-learn
```
Basic Usage
UMAP follows scikit-learn conventions and can be used as a drop-in replacement for t-SNE or PCA.
```python
import umap
from sklearn.preprocessing import StandardScaler

# Prepare data (standardization is essential)
scaled_data = StandardScaler().fit_transform(data)

# Method 1: single step (fit and transform)
embedding = umap.UMAP().fit_transform(scaled_data)

# Method 2: separate steps (for reusing the trained model)
reducer = umap.UMAP(random_state=42)
reducer.fit(scaled_data)
embedding = reducer.embedding_  # Access the trained embedding
```
**Critical preprocessing requirement:** Always standardize features to comparable scales before applying UMAP to ensure equal weighting across dimensions.
Typical Workflow
```python
import umap
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler

# 1. Preprocess data
scaler = StandardScaler()
scaled_data = scaler.fit_transform(raw_data)

# 2. Create and fit UMAP
reducer = umap.UMAP(
    n_neighbors=15,
    min_dist=0.1,
    n_components=2,
    metric='euclidean',
    random_state=42
)
embedding = reducer.fit_transform(scaled_data)

# 3. Visualize
plt.scatter(embedding[:, 0], embedding[:, 1], c=labels, cmap='Spectral', s=5)
plt.colorbar()
plt.title('UMAP Embedding')
plt.show()
```
Parameter Tuning Guide
UMAP has four primary parameters that control the embedding behavior. Understanding these is crucial for effective usage.
n_neighbors (default: 15)
Purpose: Balances local versus global structure in the embedding.
How it works: Controls the size of the local neighborhood UMAP examines when learning manifold structure.
Effects by value:
- Low values (2-5): Emphasizes fine local detail but may fragment data into disconnected components
- Medium values (15-20): Balanced view of both local structure and global relationships (recommended starting point)
- High values (50-200): Prioritizes broad topological structure at the expense of fine-grained details
Recommendation: Start with 15 and adjust based on results. Increase for more global structure, decrease for more local detail.
min_dist (default: 0.1)
Purpose: Controls how tightly points cluster in the low-dimensional space.
How it works: Sets the minimum distance apart that points are allowed to be in the output representation.
Effects by value:
- Low values (0.0-0.1): Creates clumped embeddings useful for clustering; reveals fine topological details
- High values (0.5-0.99): Prevents tight packing; emphasizes broad topological preservation over local structure
Recommendation: Use 0.0 for clustering applications, 0.1-0.3 for visualization, 0.5+ for loose structure.
n_components (default: 2)
Purpose: Determines the dimensionality of the embedded output space.
Key feature: Unlike t-SNE, UMAP scales well in the embedding dimension, enabling use beyond visualization.
Common uses:
- 2-3 dimensions: Visualization
- 5-10 dimensions: Clustering preprocessing (better preserves density than 2D)
- 10-50 dimensions: Feature engineering for downstream ML models
Recommendation: Use 2 for visualization, 5-10 for clustering, higher for ML pipelines.
metric (default: 'euclidean')
Purpose: Specifies how distance is calculated between input data points.
Supported metrics:
- Minkowski variants: euclidean, manhattan, chebyshev
- Spatial metrics: canberra, braycurtis, haversine
- Correlation metrics: cosine, correlation (good for text/document embeddings)
- Binary data metrics: hamming, jaccard, dice, russellrao, kulsinski, rogerstanimoto, sokalmichener, sokalsneath, yule
- Custom metrics: User-defined distance functions via Numba
Recommendation: Use euclidean for numeric data, cosine for text/document vectors, hamming for binary data.
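To see why cosine is recommended for text and document vectors, here is a minimal pure-NumPy sketch (the toy "document" vectors are hypothetical): cosine distance compares direction only, so two documents with the same term proportions but different lengths count as identical, while euclidean distance treats them as far apart.

```python
import numpy as np

def cosine_distance(a, b):
    """1 - cosine similarity: compares direction, ignores magnitude."""
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Two "documents" with identical term proportions but different lengths
doc_a = np.array([1.0, 2.0, 0.0])
doc_b = np.array([2.0, 4.0, 0.0])

print(cosine_distance(doc_a, doc_b))   # ~0.0: same direction, so "identical"
print(np.linalg.norm(doc_a - doc_b))   # ~2.236: euclidean sees them as distant
```

Passing `metric='cosine'` to UMAP applies this same normalization, which is why document length stops dominating the embedding.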
Parameter Tuning Example
```python
# For visualization with emphasis on local structure
umap.UMAP(n_neighbors=15, min_dist=0.1, n_components=2, metric='euclidean')

# For clustering preprocessing
umap.UMAP(n_neighbors=30, min_dist=0.0, n_components=10, metric='euclidean')

# For document embeddings
umap.UMAP(n_neighbors=15, min_dist=0.1, n_components=2, metric='cosine')

# For preserving global structure
umap.UMAP(n_neighbors=100, min_dist=0.5, n_components=2, metric='euclidean')
```
Supervised and Semi-Supervised Dimension Reduction
UMAP supports incorporating label information to guide the embedding process, enabling class separation while preserving internal structure.
Supervised UMAP
Pass target labels via the `y` parameter when fitting:
```python
# Supervised dimension reduction
embedding = umap.UMAP().fit_transform(data, y=labels)
```
**Key benefits:**
- Achieves cleanly separated classes
- Preserves internal structure within each class
- Maintains global relationships between classes
**When to use:** When you have labeled data and want to separate known classes while keeping meaningful point embeddings.
Semi-Supervised UMAP
For partial labels, mark unlabeled points with `-1`, following scikit-learn convention:
```python
# Create semi-supervised labels
semi_labels = labels.copy()
semi_labels[unlabeled_indices] = -1

# Fit with partial labels
embedding = umap.UMAP().fit_transform(data, y=semi_labels)
```
**When to use:** When labeling is expensive or you have more data than labels available.
Metric Learning with UMAP
Train a supervised embedding on labeled data, then apply to new unlabeled data:
```python
# Train on labeled data
mapper = umap.UMAP().fit(train_data, train_labels)

# Transform unlabeled test data
test_embedding = mapper.transform(test_data)

# Use as feature engineering for a downstream classifier
from sklearn.svm import SVC
clf = SVC().fit(mapper.embedding_, train_labels)
predictions = clf.predict(test_embedding)
```
**When to use:** For supervised feature engineering in machine learning pipelines.
UMAP for Clustering
UMAP serves as effective preprocessing for density-based clustering algorithms like HDBSCAN, overcoming the curse of dimensionality.
Best Practices for Clustering
Key principle: Configure UMAP differently for clustering than for visualization.
Recommended parameters:
- n_neighbors: Increase to ~30 (default 15 is too local and can create artificial fine-grained clusters)
- min_dist: Set to 0.0 (pack points densely within clusters for clearer boundaries)
- n_components: Use 5-10 dimensions (maintains performance while improving density preservation vs. 2D)
Clustering Workflow
```python
import umap
import hdbscan
from sklearn.preprocessing import StandardScaler

# 1. Preprocess data
scaled_data = StandardScaler().fit_transform(data)

# 2. UMAP with clustering-optimized parameters
reducer = umap.UMAP(
    n_neighbors=30,
    min_dist=0.0,
    n_components=10,  # Higher than 2 for better density preservation
    metric='euclidean',
    random_state=42
)
embedding = reducer.fit_transform(scaled_data)

# 3. Apply HDBSCAN clustering
clusterer = hdbscan.HDBSCAN(
    min_cluster_size=15,
    min_samples=5,
    metric='euclidean'
)
labels = clusterer.fit_predict(embedding)

# 4. Evaluate
from sklearn.metrics import adjusted_rand_score
score = adjusted_rand_score(true_labels, labels)
print(f"Adjusted Rand Score: {score:.3f}")
print(f"Number of clusters: {len(set(labels)) - (1 if -1 in labels else 0)}")
print(f"Noise points: {sum(labels == -1)}")
```
Visualization After Clustering
```python
# Create a 2D embedding for visualization (separate from the clustering embedding)
vis_reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, n_components=2, random_state=42)
vis_embedding = vis_reducer.fit_transform(scaled_data)

# Plot with cluster labels
import matplotlib.pyplot as plt
plt.scatter(vis_embedding[:, 0], vis_embedding[:, 1], c=labels, cmap='Spectral', s=5)
plt.colorbar()
plt.title('UMAP Visualization with HDBSCAN Clusters')
plt.show()
```
**Important caveat:** UMAP does not completely preserve density and can create artificial cluster divisions. Always validate and explore the resulting clusters.
Transforming New Data
UMAP enables preprocessing of new data through its `transform()` method, allowing trained models to project unseen data into the learned embedding space.
Basic Transform Usage
```python
# Train on training data
trans = umap.UMAP(n_neighbors=15, random_state=42).fit(X_train)

# Transform test data
test_embedding = trans.transform(X_test)
```
Integration with Machine Learning Pipelines
```python
import umap
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Split data
X_train, X_test, y_train, y_test = train_test_split(data, labels, test_size=0.2)

# Preprocess
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train UMAP
reducer = umap.UMAP(n_components=10, random_state=42)
X_train_embedded = reducer.fit_transform(X_train_scaled)
X_test_embedded = reducer.transform(X_test_scaled)

# Train a classifier on the embeddings
clf = SVC()
clf.fit(X_train_embedded, y_train)
accuracy = clf.score(X_test_embedded, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```
Important Considerations
Data consistency: The transform method assumes the overall distribution in the higher-dimensional space is consistent between training and test data. When this assumption fails, consider using Parametric UMAP instead.
Performance: Transform operations are efficient (typically <1 second), though initial calls may be slower due to Numba JIT compilation.
Scikit-learn compatibility: UMAP follows standard sklearn conventions and works seamlessly in pipelines:
```python
from sklearn.pipeline import Pipeline
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('umap', umap.UMAP(n_components=10)),
    ('classifier', SVC())
])
pipeline.fit(X_train, y_train)
predictions = pipeline.predict(X_test)
```
Advanced Features
Parametric UMAP
Parametric UMAP replaces direct embedding optimization with a learned neural network mapping function.
Key differences from standard UMAP:
- Uses TensorFlow/Keras to train encoder networks
- Enables efficient transformation of new data
- Supports reconstruction via decoder networks (inverse transform)
- Allows custom architectures (CNNs for images, RNNs for sequences)
Installation:
```bash
uv pip install umap-learn[parametric_umap]
```
Requires TensorFlow 2.x.
**Basic usage:**
```python
from umap.parametric_umap import ParametricUMAP

# Default architecture (3-layer, 100-neuron fully-connected network)
embedder = ParametricUMAP()
embedding = embedder.fit_transform(data)

# Transform new data efficiently
new_embedding = embedder.transform(new_data)
```
**Custom architecture:**
```python
import tensorflow as tf
from umap.parametric_umap import ParametricUMAP

# Define a custom encoder
encoder = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(input_dim,)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(2)  # Output dimension
])
embedder = ParametricUMAP(encoder=encoder, dims=(input_dim,))
embedding = embedder.fit_transform(data)
```
**When to use Parametric UMAP:**
- Need efficient transformation of new data after training
- Require reconstruction capabilities (inverse transforms)
- Want to combine UMAP with autoencoders
- Working with complex data types (images, sequences) that benefit from specialized architectures
**When to use standard UMAP:**
- Need simplicity and quick prototyping
- Dataset is small and computational efficiency isn't critical
- Don't require learned transformations for future data
Inverse Transforms
Inverse transforms enable reconstruction of high-dimensional data from low-dimensional embeddings.
Basic usage:
```python
reducer = umap.UMAP()
embedding = reducer.fit_transform(data)

# Reconstruct high-dimensional data from embedding coordinates
reconstructed = reducer.inverse_transform(embedding)
```
**Important limitations:**
- Computationally expensive operation
- Works poorly outside the convex hull of the embedding
- Accuracy decreases in regions with gaps between clusters
**Use cases:**
- Understanding the structure of embedded data
- Visualizing smooth transitions between clusters
- Exploring interpolations between data points
- Generating synthetic samples in embedding space
**Example: exploring the embedding space:**
```python
import numpy as np

# Create a grid of points in embedding space
x = np.linspace(embedding[:, 0].min(), embedding[:, 0].max(), 10)
y = np.linspace(embedding[:, 1].min(), embedding[:, 1].max(), 10)
xx, yy = np.meshgrid(x, y)
grid_points = np.c_[xx.ravel(), yy.ravel()]

# Reconstruct samples from the grid
reconstructed_samples = reducer.inverse_transform(grid_points)
```
AlignedUMAP
For analyzing temporal or related datasets (e.g., time-series experiments, batch data):
```python
from umap import AlignedUMAP

# List of related datasets
datasets = [day1_data, day2_data, day3_data]

# Create aligned embeddings
mapper = AlignedUMAP().fit(datasets)
aligned_embeddings = mapper.embeddings_  # List of embeddings
```
**When to use:** Comparing embeddings across related datasets while maintaining consistent coordinate systems.
Reproducibility
To ensure reproducible results, always set the `random_state` parameter:
```python
reducer = umap.UMAP(random_state=42)
```
UMAP uses stochastic optimization, so results will vary slightly between runs without a fixed random state.
Common Issues and Solutions
Issue: Disconnected components or fragmented clusters
- Solution: Increase `n_neighbors` to emphasize more global structure

Issue: Clusters too spread out or not well separated
- Solution: Decrease `min_dist` to allow tighter packing

Issue: Poor clustering results
- Solution: Use clustering-specific parameters (n_neighbors=30, min_dist=0.0, n_components=5-10)

Issue: Transform results differ significantly from training
- Solution: Ensure the test data distribution matches training, or use Parametric UMAP

Issue: Slow performance on large datasets
- Solution: Set `low_memory=True` (the default), or reduce dimensionality with PCA first

Issue: All points collapsed into a single cluster
- Solution: Check data preprocessing (ensure proper scaling), and increase `min_dist`
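For the large-dataset case, a common pattern is to compress the data with PCA first and hand the result to UMAP. A minimal scikit-learn-only sketch (the random array is a hypothetical stand-in for a real high-dimensional dataset):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Stand-in for a large, high-dimensional dataset
rng = np.random.default_rng(42)
data = rng.normal(size=(2000, 500))

# Scale, then compress to ~50 dimensions with PCA before UMAP
scaled = StandardScaler().fit_transform(data)
pca_embedding = PCA(n_components=50, random_state=42).fit_transform(scaled)
print(pca_embedding.shape)  # (2000, 50)

# pca_embedding would then be passed to umap.UMAP(low_memory=True).fit_transform(...)
```

Reducing 500 input dimensions to 50 shrinks the nearest-neighbor search that dominates UMAP's runtime while discarding mostly noise directions.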
Resources
references/
Contains detailed API documentation:
- `api_reference.md`: Complete UMAP class parameters and methods

Load these references when detailed parameter information or advanced method usage is needed.