cli-anything-unimol-tools

Uni-Mol Tools - Molecular Property Prediction CLI

Package: cli-anything-unimol-tools
Command: python3 -m cli_anything.unimol_tools

Description

An interactive CLI for training molecular property prediction models and running inference with Uni-Mol Tools. It supports 5 task types: binary classification, regression, multiclass classification, multilabel classification, and multilabel regression.

Key Features

  • Project Management: Organize experiments with named projects
  • 5 Task Types: Binary classification, regression, multiclass, multilabel classification, and multilabel regression
  • Model Tracking: Automatic performance history and rankings
  • Smart Storage: Analyze usage and clean up underperformers
  • JSON API: Full automation support with the --json flag

Common Commands

Project Management

Create a new project

project create --name drug_discovery

List all projects

project list

Switch to a project

project switch --name drug_discovery

Training

Train a classification model

train --data-path train.csv --target-col active --task-type classification --epochs 10

Train a regression model

train --data-path train.csv --target-col affinity --task-type regression --epochs 10
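
For scripted runs, the train invocations above can be assembled programmatically. The following is a minimal sketch, not part of the CLI: build_train_command is a hypothetical helper, and it assumes that subcommands can be passed directly after the module name, as the JSON Mode example in this document does. Only flags that appear in this document are used, and the task type is checked against the five names listed under Task Types.

```python
import shlex

# Task type names as listed in this document's "Task Types" section.
TASK_TYPES = (
    "classification",
    "regression",
    "multiclass",
    "multilabel_classification",
    "multilabel_regression",
)


def build_train_command(data_path, target_col, task_type, epochs=10):
    """Return the argv list for a non-interactive `train` invocation.

    Hypothetical helper for illustration; it only builds the command
    and does not run it.
    """
    if task_type not in TASK_TYPES:
        raise ValueError(f"unknown task type: {task_type!r}")
    if epochs < 1:
        raise ValueError("epochs must be a positive integer")
    return [
        "python3", "-m", "cli_anything.unimol_tools",
        "train",
        "--data-path", str(data_path),
        "--target-col", target_col,
        "--task-type", task_type,
        "--epochs", str(epochs),
    ]


if __name__ == "__main__":
    # Print a copy-pasteable shell command for the regression example.
    cmd = build_train_command("train.csv", "affinity", "regression")
    print(shlex.join(cmd))
```

Returning a list rather than a single string keeps the arguments safe to pass to subprocess.run without shell quoting issues.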

Model Management

List all trained models

models list

Show model details and performance

models show --model-id <id>

Rank models by performance

models rank

Storage & Cleanup

Analyze storage usage

storage analyze

Automatic cleanup of poor performers

cleanup auto

Manual cleanup with criteria

cleanup manual --max-models 10 --min-score 0.7
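
This document does not spell out how --max-models and --min-score interact. One plausible reading, shown here purely as an illustration and not as the CLI's actual algorithm, is to keep only models that score at least min_score, and of those keep at most max_models of the best. The function name, the score dictionary, and the "higher is better" convention are all assumptions.

```python
def select_models_to_remove(scores, max_models, min_score):
    """Return the ids of models that would be removed under the assumed rule.

    scores maps a model id to a single score where higher is better
    (an assumption; the real CLI may rank differently per task type).
    """
    # Best-scoring models first.
    ranked = sorted(scores, key=lambda model_id: scores[model_id], reverse=True)
    # Keep models that clear the threshold, capped at max_models.
    keep = [m for m in ranked if scores[m] >= min_score][:max_models]
    return sorted(set(scores) - set(keep))


if __name__ == "__main__":
    example = {"a": 0.9, "b": 0.65, "c": 0.8, "d": 0.75}
    print(select_models_to_remove(example, max_models=2, min_score=0.7))
```

Before relying on any such interpretation, it is worth checking the behavior against the CLI's own help output or on a throwaway project.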

Prediction

Make predictions with a trained model

predict --model-id <id> --data-path test.csv

Data Format

CSV files must contain:
  • SMILES column: Molecular structures in SMILES format
  • Target column(s): Values to predict (name specified via --target-col)
Example:
csv
SMILES,target
CCO,1
CCCO,0
CC(C)O,1
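
A quick pre-flight check can catch the "CSV not recognized" case before training starts. The sketch below uses only the Python standard library and is not part of the CLI: check_training_csv is a hypothetical helper, and it assumes the column name must match SMILES exactly (case-sensitive). It checks that both columns exist and are non-empty; it does not validate that the SMILES strings are chemically meaningful.

```python
import csv
import io


def check_training_csv(text, target_col, smiles_col="SMILES"):
    """Return a list of problems found in CSV text; an empty list means none."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    problems = [
        f"missing column: {col}"
        for col in (smiles_col, target_col)
        if col not in header
    ]
    if problems:
        return problems
    for row in reader:
        # reader.line_num is the physical line of the row just read.
        line_no = reader.line_num
        # Short rows yield None for missing fields, hence the `or ""`.
        if not (row[smiles_col] or "").strip():
            problems.append(f"line {line_no}: empty {smiles_col}")
        if not (row[target_col] or "").strip():
            problems.append(f"line {line_no}: empty {target_col}")
    return problems


if __name__ == "__main__":
    sample = "SMILES,target\nCCO,1\nCCCO,0\nCC(C)O,1\n"
    print(check_training_csv(sample, "target"))
```

To check a file on disk, read it with open(path, newline="", encoding="utf-8") and pass the contents to the function.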

Task Types

  1. classification: Binary classification (0/1)
  2. regression: Continuous value prediction
  3. multiclass: Classification into more than two classes
  4. multilabel_classification: Multiple binary labels
  5. multilabel_regression: Multiple continuous values

JSON Mode

Add the --json flag to any command for machine-readable output:
bash
python3 -m cli_anything.unimol_tools --json models list
Output format:
json
{
  "status": "success",
  "data": [...],
  "message": "..."
}
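
A script consuming this output might unwrap the envelope as sketched below. This is a hedged example rather than an official client: parse_cli_json, run_cli, and CliError are hypothetical names, the document only shows the "success" status so any other value is treated as a failure, and the JSON is assumed to be the only content written to stdout.

```python
import json
import subprocess


class CliError(RuntimeError):
    """Raised when the CLI reports a status other than "success"."""


def parse_cli_json(text):
    """Return the "data" field of a response, or raise CliError."""
    payload = json.loads(text)  # raises json.JSONDecodeError on non-JSON output
    if not isinstance(payload, dict):
        raise CliError("unexpected response shape")
    if payload.get("status") != "success":
        # Assumption: failures also carry a human-readable "message".
        raise CliError(payload.get("message") or "CLI did not report success")
    return payload.get("data")


def run_cli(*args):
    """Run one CLI command with --json and return its parsed data."""
    proc = subprocess.run(
        ["python3", "-m", "cli_anything.unimol_tools", "--json", *args],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_cli_json(proc.stdout)
```

With the CLI installed, run_cli("models", "list") would correspond to the command shown above.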

Interactive Mode

Launch without a command to start the interactive REPL:
bash
python3 -m cli_anything.unimol_tools
Features:
  • Tab completion
  • Command history
  • Contextual help
  • Project state persistence

Test Data

Test data is included for all 5 task types.

Requirements

  • Python 3.8+
  • PyTorch 1.12+
  • Uni-Mol Tools backend
  • 4GB+ RAM (8GB+ recommended for training)

Installation

bash
cd unimol_tools/agent-harness
pip install -e .

Documentation

  • SOP: UNIMOL_TOOLS.md
  • Quick Start: docs/guides/02-QUICK-START.md
  • Full Documentation: docs/README.md

Testing

bash
cd docs/test
bash run_tests.sh --unit -v    # Unit tests (67 tests)
bash run_tests.sh --full -v    # Full test suite

Performance Tips

  • Start with 10 epochs for initial experiments
  • Use smaller batch sizes if memory is limited
  • Monitor storage with storage analyze
  • Use models rank to identify best performers
  • Clean up regularly with cleanup auto

Troubleshooting

  • CUDA errors: Reduce batch size or use CPU mode
  • CSV not recognized: Verify that the SMILES column exists
  • Low accuracy: Try more epochs or adjust the learning rate
  • Storage full: Run cleanup auto to free space

Related
