Axolotl Skill

Comprehensive assistance with axolotl development, generated from official documentation.

When to Use This Skill

This skill should be triggered when:
  • Working with axolotl
  • Asking about axolotl features or APIs
  • Implementing axolotl solutions
  • Debugging axolotl code
  • Learning axolotl best practices

Quick Reference

Common Patterns

Pattern 1: To validate that your training job has acceptable data transfer speeds, run the NCCL tests to help pinpoint bottlenecks. For example:
./build/all_reduce_perf -b 8 -e 128M -f 2 -g 3
Pattern 2: Configure your model to use FSDP in the Axolotl YAML config. For example:
fsdp_version: 2
fsdp_config:
  offload_params: true
  state_dict_type: FULL_STATE_DICT
  auto_wrap_policy: TRANSFORMER_BASED_WRAP
  transformer_layer_cls_to_wrap: LlamaDecoderLayer
  reshard_after_forward: true
Pattern 3: The context_parallel_size should be a divisor of the total number of GPUs. For example, context_parallel_size: 4 with 8 GPUs.
Pattern 4: How context parallelism affects batch composition. For example:
  • With 8 GPUs and no sequence parallelism: 8 different batches processed per step
  • With 8 GPUs and context_parallel_size=4: only 2 different batches processed per step (each split across 4 GPUs)
  • If your per-GPU micro_batch_size is 2, the global batch size decreases from 16 to 4
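The arithmetic above can be sketched as a small helper. This is an illustrative sketch only, not an Axolotl API; the function names are hypothetical:

```python
def batches_per_step(num_gpus: int, context_parallel_size: int = 1) -> int:
    """Number of distinct batches processed per optimizer step when each
    batch is split across context_parallel_size GPUs."""
    if num_gpus % context_parallel_size != 0:
        raise ValueError("context_parallel_size must divide the GPU count")
    return num_gpus // context_parallel_size


def global_batch_size(num_gpus: int, micro_batch_size: int,
                      context_parallel_size: int = 1) -> int:
    """Effective global batch size under context parallelism."""
    return batches_per_step(num_gpus, context_parallel_size) * micro_batch_size


# 8 GPUs, no sequence parallelism, micro_batch_size=2 -> global batch 16
# 8 GPUs, context_parallel_size=4, micro_batch_size=2 -> global batch 4
```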
Pattern 5: Setting save_compressed: true in your configuration enables saving models in a compressed format, which:
  • Reduces disk space usage by approximately 40%
  • Maintains compatibility with vLLM for accelerated inference
  • Maintains compatibility with llmcompressor for further optimization (e.g. quantization)
Pattern 6: Note: it is not necessary to place your integration in the integrations folder. It can live in any location, as long as it is installed as a package in your Python environment. See this repo for an example: https://github.com/axolotl-ai-cloud/diff-transformer
Pattern 7: Handle both single-example and batched data in utils.trainer.drop_long_seq(sample, sequence_len=2048, min_sequence_len=2):
  • single example: sample['input_ids'] is a list[int]
  • batched data: sample['input_ids'] is a list[list[int]]
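The single-vs-batched dispatch described above can be sketched like this. The name mirrors axolotl's utils.trainer.drop_long_seq, but this standalone version is a hedged illustration of the shape handling, not the library's actual implementation:

```python
def drop_long_seq(sample, sequence_len=2048, min_sequence_len=2):
    """Return keep-flags for sequences whose length is within bounds.

    Handles both shapes the trainer may pass (illustrative sketch):
      - single example: sample["input_ids"] is a list[int]  -> returns bool
      - batched data:   sample["input_ids"] is a list[list[int]] -> returns list[bool]
    """
    ids = sample["input_ids"]
    batched = bool(ids) and isinstance(ids[0], list)
    if not batched:
        # Single example: one boolean decision.
        return min_sequence_len <= len(ids) <= sequence_len
    # Batched data: one decision per inner sequence.
    return [min_sequence_len <= len(seq) <= sequence_len for seq in ids]
```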

Example Code Patterns

Example 1 (python):
cli.cloud.modal_.ModalCloud(config, app=None)
Example 2 (python):
cli.cloud.modal_.run_cmd(cmd, run_folder, volumes=None)
Example 3 (python):
core.trainers.base.AxolotlTrainer(
    *_args,
    bench_data_collator=None,
    eval_data_collator=None,
    dataset_tags=None,
    **kwargs,
)
Example 4 (python):
core.trainers.base.AxolotlTrainer.log(logs, start_time=None)
Example 5 (python):
prompt_strategies.input_output.RawInputOutputPrompter()

Reference Files

This skill includes comprehensive documentation in references/:
  • api.md - API documentation
  • dataset-formats.md - Dataset formats documentation
  • other.md - Other documentation
Use view to read specific reference files when detailed information is needed.

Working with This Skill

For Beginners

Start with the getting_started or tutorials reference files for foundational concepts.

For Specific Features

Use the appropriate category reference file (api, guides, etc.) for detailed information.

For Code Examples

The quick reference section above contains common patterns extracted from the official docs.

Resources

references/

Organized documentation extracted from official sources. These files contain:
  • Detailed explanations
  • Code examples with language annotations
  • Links to original documentation
  • Table of contents for quick navigation

scripts/

Add helper scripts here for common automation tasks.

assets/

Add templates, boilerplate, or example projects here.

Notes

  • This skill was automatically generated from official documentation
  • Reference files preserve the structure and examples from source docs
  • Code examples include language detection for better syntax highlighting
  • Quick reference patterns are extracted from common usage examples in the docs

Updating

To refresh this skill with updated documentation:
  1. Re-run the scraper with the same configuration
  2. The skill will be rebuilt with the latest information