Axolotl Skill

Comprehensive assistance with axolotl development, generated from official documentation.
When to Use This Skill
This skill should be triggered when:
- Working with axolotl
- Asking about axolotl features or APIs
- Implementing axolotl solutions
- Debugging axolotl code
- Learning axolotl best practices
Quick Reference

Common Patterns
Pattern 1: To validate that acceptable data transfer speeds exist for your training job, running NCCL Tests can help pinpoint bottlenecks. For example:

```shell
./build/all_reduce_perf -b 8 -e 128M -f 2 -g 3
```

Pattern 2: Configure your model to use FSDP in the Axolotl YAML. For example:

```yaml
fsdp_version: 2
fsdp_config:
  offload_params: true
  state_dict_type: FULL_STATE_DICT
  auto_wrap_policy: TRANSFORMER_BASED_WRAP
  transformer_layer_cls_to_wrap: LlamaDecoderLayer
  reshard_after_forward: true
```

Pattern 3: The `context_parallel_size` should be a divisor of the total number of GPUs.

Pattern 4: For example:
- With 8 GPUs and no sequence parallelism: 8 different batches processed per step
- With 8 GPUs and `context_parallel_size=4`: only 2 different batches processed per step (each split across 4 GPUs)
- If your per-GPU `micro_batch_size` is 2, the global batch size decreases from 16 to 4
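The batch-size arithmetic in Pattern 4 can be checked with a few lines. This is a minimal sketch; the function and variable names are illustrative, not Axolotl config keys read at runtime:

```python
def global_batch_size(num_gpus, micro_batch_size, context_parallel_size=1):
    """Compute the number of distinct examples processed per step.

    Context parallelism splits each sequence across `context_parallel_size`
    GPUs, so only `num_gpus // context_parallel_size` data-parallel groups
    contribute distinct batches.
    """
    if num_gpus % context_parallel_size != 0:
        raise ValueError("context_parallel_size must divide the GPU count")
    data_parallel_groups = num_gpus // context_parallel_size
    return data_parallel_groups * micro_batch_size

# 8 GPUs, micro_batch_size=2: 16 without context parallelism, 4 with size 4.
```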
Pattern 5: Setting `save_compressed: true` in your configuration enables saving models in a compressed format, which:
- Reduces disk space usage by approximately 40%
- Maintains compatibility with vLLM for accelerated inference
- Maintains compatibility with llmcompressor for further optimization (for example, quantization)

Pattern 6: Note: it is not necessary to place your integration in the `integrations` folder. It can be in any location, as long as it is installed in a package in your Python env. See this repo for an example: https://github.com/axolotl-ai-cloud/diff-transformer

Pattern 7: Handle both single-example and batched data:
- single example: `sample['input_ids']` is a `list[int]`
- batched data: `sample['input_ids']` is a `list[list[int]]`

```python
utils.trainer.drop_long_seq(sample, sequence_len=2048, min_sequence_len=2)
```
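A minimal sketch of the single-vs-batched handling that Pattern 7 describes. This is illustrative only; the real `drop_long_seq` lives in `utils.trainer` and its implementation may differ:

```python
def keep_sample(sample, sequence_len=2048, min_sequence_len=2):
    """Return keep-decisions for a sample, handling both data layouts.

    - single example: sample['input_ids'] is a list[int] -> returns bool
    - batched data: sample['input_ids'] is a list[list[int]] -> returns list[bool]
    """
    ids = sample["input_ids"]
    if ids and isinstance(ids[0], list):  # batched: list of token lists
        return [min_sequence_len <= len(seq) <= sequence_len for seq in ids]
    return min_sequence_len <= len(ids) <= sequence_len  # single example
```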
Example Code Patterns
Example 1 (python):

```python
cli.cloud.modal_.ModalCloud(config, app=None)
```

Example 2 (python):

```python
cli.cloud.modal_.run_cmd(cmd, run_folder, volumes=None)
```

Example 3 (python):

```python
core.trainers.base.AxolotlTrainer(
    *_args,
    bench_data_collator=None,
    eval_data_collator=None,
    dataset_tags=None,
    **kwargs,
)
```

Example 4 (python):

```python
core.trainers.base.AxolotlTrainer.log(logs, start_time=None)
```

Example 5 (python):

```python
prompt_strategies.input_output.RawInputOutputPrompter()
```
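The `AxolotlTrainer` signature in Example 3 accepts separate collators for benchmark and eval data. As a rough, hypothetical illustration of that routing pattern (no Axolotl import; all names here are stand-ins, not the real implementation):

```python
# Hypothetical sketch of the collator-routing pattern suggested by the
# AxolotlTrainer signature above; class and function names are stand-ins.

def default_collator(batch):
    # Pad each example's input_ids to the longest sequence in the batch.
    longest = max(len(ex["input_ids"]) for ex in batch)
    return [ex["input_ids"] + [0] * (longest - len(ex["input_ids"])) for ex in batch]

class SketchTrainer:
    def __init__(self, *, data_collator=None, eval_data_collator=None,
                 bench_data_collator=None):
        self.data_collator = data_collator or default_collator
        # Fall back to the training collator when a specialised one is absent.
        self.eval_data_collator = eval_data_collator or self.data_collator
        self.bench_data_collator = bench_data_collator or self.data_collator

    def collate(self, batch, split="train"):
        collator = {
            "train": self.data_collator,
            "eval": self.eval_data_collator,
            "bench": self.bench_data_collator,
        }[split]
        return collator(batch)
```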
Reference Files
This skill includes comprehensive documentation in `references/`:
- api.md - API documentation
- dataset-formats.md - Dataset formats documentation
- other.md - Other documentation

Use `view` to read specific reference files when detailed information is needed.

Working with This Skill
For Beginners

Start with the getting_started or tutorials reference files for foundational concepts.
For Specific Features

Use the appropriate category reference file (api, guides, etc.) for detailed information.
For Code Examples

The quick reference section above contains common patterns extracted from the official docs.
Resources

references/
Organized documentation extracted from official sources. These files contain:
- Detailed explanations
- Code examples with language annotations
- Links to original documentation
- Table of contents for quick navigation
scripts/

Add helper scripts here for common automation tasks.
assets/

Add templates, boilerplate, or example projects here.
Notes
- This skill was automatically generated from official documentation
- Reference files preserve the structure and examples from source docs
- Code examples include language detection for better syntax highlighting
- Quick reference patterns are extracted from common usage examples in the docs
Updating
To refresh this skill with updated documentation:
- Re-run the scraper with the same configuration
- The skill will be rebuilt with the latest information