using-dynamic-architectures

Dynamic Architectures Meta-Skill

When to Use This Skill

Invoke this meta-skill when you encounter:
  • Growing Networks: Adding capacity during training (new layers, neurons, modules)
  • Pruning Networks: Removing capacity that isn't contributing
  • Continual Learning: Training on new tasks without forgetting old ones
  • Gradient Isolation: Training new modules without destabilizing existing weights
  • Modular Composition: Building networks from graftable, composable components
  • Lifecycle Management: State machines controlling when to grow, train, integrate, prune
  • Progressive Training: Staged capability expansion with warmup and cooldown
This is the entry point for dynamic/morphogenetic neural network patterns. It routes to 7 specialized reference sheets.

How to Access Reference Sheets

IMPORTANT: All reference sheets are located in the SAME DIRECTORY as this SKILL.md file.
When this skill is loaded from skills/using-dynamic-architectures/SKILL.md, reference sheets like continual-learning-foundations.md are at skills/using-dynamic-architectures/continual-learning-foundations.md, NOT at skills/continual-learning-foundations.md (wrong path).

Core Principle

Dynamic architectures grow capability, not just tune weights.
Static networks are a guess about capacity. Dynamic networks let training signal drive structure. The challenge is growing without forgetting, integrating without destabilizing, and knowing when to act.
Key tensions:
  • Stability vs. Plasticity: Preserve existing knowledge while adding new capacity
  • Isolation vs. Integration: Train new modules separately, then merge carefully
  • Exploration vs. Exploitation: When to add capacity vs. when to stabilize

The 7 Dynamic Architecture Skills

  1. continual-learning-foundations - EWC, PackNet, rehearsal strategies, catastrophic forgetting theory
  2. gradient-isolation-techniques - Freezing, gradient masking, stop_grad patterns, alpha blending
  3. peft-adapter-techniques - LoRA, QLoRA, DoRA, adapter placement, merging strategies
  4. dynamic-architecture-patterns - Grow/prune patterns, slot-based expansion, capacity scheduling
  5. modular-neural-composition - MoE, gating, grafting semantics, interface contracts
  6. ml-lifecycle-orchestration - State machines, quality gates, transition triggers, controllers
  7. progressive-training-strategies - Staged expansion, warmup/cooldown, knowledge transfer

Routing Decision Framework

Step 1: Identify the Core Problem

Diagnostic Questions:
  • "Are you trying to prevent forgetting when training on new data/tasks?"
  • "Are you trying to add new capacity to an existing trained network?"
  • "Are you designing how multiple modules combine?"
  • "Are you deciding WHEN to grow, prune, or integrate?"
Quick Routing:
| Problem | Primary Skill |
| --- | --- |
| "Model forgets old tasks when I train new ones" | continual-learning-foundations |
| "New module destabilizes existing weights" | gradient-isolation-techniques |
| "Fine-tune LLM efficiently without full training" | peft-adapter-techniques |
| "When should I add more capacity?" | dynamic-architecture-patterns |
| "How do module outputs combine?" | modular-neural-composition |
| "How do I manage the grow/train/integrate cycle?" | ml-lifecycle-orchestration |
| "How do I warm up new modules safely?" | progressive-training-strategies |

Step 2: Catastrophic Forgetting (Continual Learning)

Symptoms:
  • Performance on old tasks drops when training on new tasks
  • Model "forgets" previous capabilities
  • Fine-tuning overwrites learned features
Route to: continual-learning-foundations.md
Covers:
  • Why SGD causes forgetting (loss landscape geometry)
  • EWC, SI, MAS (regularization approaches)
  • Progressive Neural Networks, PackNet (architectural approaches)
  • Experience replay, generative replay (rehearsal approaches)
  • Measuring forgetting (backward/forward transfer)
When to Use:
  • Training sequentially on multiple tasks
  • Fine-tuning without forgetting base capabilities
  • Designing systems that accumulate knowledge over time
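To make the regularization family concrete, here is a minimal sketch of an EWC-style penalty, assuming a classification model and an iterable of (x, y) batches from the previous task. The function names and the lambda default are illustrative, not prescribed by the reference sheet:

```python
import torch
import torch.nn.functional as F

def fisher_diagonal(model, batches, n_batches=10):
    """Estimate the diagonal Fisher information as mean squared gradients
    of the old-task loss with respect to each parameter."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()
              if p.requires_grad}
    for i, (x, y) in enumerate(batches):
        if i >= n_batches:
            break
        model.zero_grad()
        F.cross_entropy(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2 / n_batches
    return fisher

def ewc_penalty(model, old_params, fisher, lam=100.0):
    """Quadratic penalty pulling important parameters back toward their
    old-task values; add this term to the new-task loss."""
    penalty = sum((fisher[n] * (p - old_params[n]) ** 2).sum()
                  for n, p in model.named_parameters() if n in fisher)
    return (lam / 2) * penalty
```

During new-task training the total loss becomes `task_loss + ewc_penalty(...)`; lambda is the stability/plasticity dial.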

Step 3: Gradient Isolation

Symptoms:
  • New module training affects host network stability
  • Want to train on host errors without backprop flowing to host
  • Need gradual integration of new capacity
Route to: gradient-isolation-techniques.md
Covers:
  • Freezing strategies (full, partial, scheduled)
  • detach() vs no_grad() semantics
  • Dual-path training (residual learning on errors)
  • Alpha blending for gradual integration
  • Hook-based gradient surgery
When to Use:
  • Training "seed" modules that learn from host errors
  • Preventing catastrophic interference during growth
  • Implementing safe module grafting
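The dual-path pattern can be sketched in a few lines of PyTorch: the host path is frozen and unchanged, the seed trains on detached host activations, and alpha blending controls how much of the seed's output is visible. Class and attribute names are illustrative:

```python
import torch
import torch.nn as nn

class GraftedSeed(nn.Module):
    """Dual-path wrapper: host output passes through unchanged; the seed
    sees host activations only via detach(), so its gradients can never
    reach host weights."""
    def __init__(self, host: nn.Module, seed: nn.Module):
        super().__init__()
        self.host, self.seed = host, seed
        for p in self.host.parameters():
            p.requires_grad_(False)      # full freeze of the host path
        self.alpha = 0.0                 # blend factor, ramped up externally

    def forward(self, x):
        h = self.host(x)
        s = self.seed(h.detach())        # stop-grad: isolates seed training
        return h + self.alpha * s        # alpha blending for gradual integration
```

A controller ramps alpha from 0 (seed invisible) toward 1 as the seed proves its contribution.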

Step 4: PEFT Adapters (LoRA, QLoRA)

Symptoms:
  • Want to fine-tune large pretrained models efficiently
  • Memory constraints prevent full fine-tuning
  • Need task-specific adaptation without modifying base weights
Route to: peft-adapter-techniques.md
Covers:
  • LoRA (low-rank adaptation) fundamentals
  • QLoRA (quantized base + LoRA adapters)
  • DoRA (weight-decomposed adaptation)
  • Adapter placement strategies
  • Merging adapters into base model
  • Multiple adapter management
When to Use:
  • Fine-tuning LLMs on limited compute
  • Creating task-specific model variants
  • Memory-efficient adaptation of large models
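The low-rank idea behind LoRA can be sketched as a frozen base linear plus a trainable rank-r update. This is a minimal illustration, not the reference implementation; the r and alpha defaults are placeholders:

```python
import math
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear plus trainable low-rank update:
    y = W x + (B A) x * (alpha / r)."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)       # base weights stay untouched
        self.A = nn.Parameter(torch.empty(r, base.in_features))
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero-init: no-op at start
        nn.init.kaiming_uniform_(self.A, a=math.sqrt(5))
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

    def merge(self):
        """Fold the adapter into the base weight for inference
        (disable the adapter afterward to avoid double counting)."""
        with torch.no_grad():
            self.base.weight += (self.B @ self.A) * self.scale
```

Because B is zero-initialized, the wrapped layer behaves exactly like the base at step 0, which is why LoRA fine-tuning starts from the pretrained model's behavior.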

Step 5: Dynamic Architecture Patterns

Symptoms:
  • Need to add capacity during training (not just before)
  • Want to prune underperforming components
  • Deciding when/where to grow the network
Route to: dynamic-architecture-patterns.md
Covers:
  • Growth patterns (slot-based, layer widening, depth extension)
  • Pruning patterns (magnitude, gradient-based, lottery ticket)
  • Trigger conditions (loss plateau, contribution metrics, budgets)
  • Capacity scheduling (grow-as-needed vs overparameterize-then-prune)
When to Use:
  • Building networks that expand during training
  • Implementing neural architecture search lite
  • Managing parameter budgets with dynamic allocation
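A growth trigger of the kind described above can be sketched as a small heuristic: fire when loss has plateaued for a patience window, but only while a parameter budget allows it. Names and thresholds are illustrative:

```python
class PlateauGrowthTrigger:
    """Fires when loss hasn't improved by min_delta for `patience` checks,
    subject to a total parameter budget. A real system would pair this
    with a contribution check before acting."""
    def __init__(self, patience=100, min_delta=1e-3, max_params=10_000_000):
        self.patience, self.min_delta, self.max_params = patience, min_delta, max_params
        self.best = float("inf")
        self.stale = 0

    def should_grow(self, loss: float, current_params: int) -> bool:
        if loss < self.best - self.min_delta:
            self.best, self.stale = loss, 0   # real improvement: reset patience
        else:
            self.stale += 1
        if self.stale >= self.patience and current_params < self.max_params:
            self.stale = 0                    # reset after firing (cooldown)
            return True
        return False
```

Note that a loss plateau alone is a weak signal; the lifecycle sheet adds contribution and stability gates on top of triggers like this.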

Step 6: Modular Composition

Symptoms:
  • Combining outputs from multiple modules
  • Designing gating/routing mechanisms
  • Need graftable, replaceable components
Route to: modular-neural-composition.md
Covers:
  • Combination mechanisms (additive, multiplicative, selective)
  • Mixture of Experts (sparse gating, load balancing)
  • Grafting semantics (input/output attachment points)
  • Interface contracts (shape matching, normalization boundaries)
  • Multi-module coordination (independent, competitive, cooperative)
When to Use:
  • Building modular architectures with interchangeable parts
  • Implementing MoE or gated architectures
  • Designing residual streams as module communication
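As one concrete combination mechanism, here is a top-k gated mixture sketch. It computes all experts densely for clarity; a production MoE dispatches each token only to its selected experts. The class and dimensions are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKGate(nn.Module):
    """Selective combination: keep the top-k gate scores per input,
    renormalize them, and take the weighted sum of expert outputs."""
    def __init__(self, dim, n_experts=4, k=2):
        super().__init__()
        self.gate = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.k = k

    def forward(self, x):                                  # x: (batch, dim)
        scores = self.gate(x)                              # (batch, n_experts)
        topv, topi = scores.topk(self.k, dim=-1)           # keep only top-k experts
        weights = torch.zeros_like(scores).scatter(
            -1, topi, F.softmax(topv, dim=-1))             # sparse, renormalized weights
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)  # (batch, n_experts, dim)
        return (weights.unsqueeze(-1) * expert_out).sum(dim=1)
```

The zero-scattered weight matrix is what makes the combination selective rather than additive: non-selected experts contribute exactly nothing.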

Step 7: Lifecycle Orchestration

Symptoms:
  • Need to decide WHEN to grow, train, integrate, prune
  • Building state machines for module lifecycle
  • Want quality gates before integration decisions
Route to: ml-lifecycle-orchestration.md
Covers:
  • State machine fundamentals (states, transitions, terminals)
  • Gate design patterns (structural, performance, stability, contribution)
  • Transition triggers (metric-based, time-based, budget-based)
  • Rollback and recovery (cooldown, hysteresis)
  • Controller patterns (heuristic, learned/RL, hybrid)
When to Use:
  • Designing grow/train/integrate/prune workflows
  • Implementing quality gates for safe integration
  • Building RL-controlled architecture decisions
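A minimal gated state machine for a module lifecycle might look like the following sketch. The state names and gate keys are hypothetical examples, not a prescribed schema:

```python
from enum import Enum, auto

class SeedState(Enum):
    DORMANT = auto()
    TRAINING = auto()
    INTEGRATING = auto()
    PERMANENT = auto()     # terminal: success
    CULLED = auto()        # terminal: rolled back

class SeedLifecycle:
    """Each transition is guarded by a named quality gate; the machine
    advances at most one state per step, and only when the gate for the
    current state reports True."""
    TRANSITIONS = {
        SeedState.DORMANT:     (SeedState.TRAINING,    "growth_triggered"),
        SeedState.TRAINING:    (SeedState.INTEGRATING, "seed_converged"),
        SeedState.INTEGRATING: (SeedState.PERMANENT,   "contribution_positive"),
    }

    def __init__(self):
        self.state = SeedState.DORMANT

    def step(self, gates: dict) -> "SeedState":
        edge = self.TRANSITIONS.get(self.state)
        if edge and gates.get(edge[1], False):
            self.state = edge[0]
        return self.state

    def cull(self):
        """Rollback path: any non-permanent seed can be removed."""
        if self.state is not SeedState.PERMANENT:
            self.state = SeedState.CULLED
```

The gate dict would be filled by metric checks (convergence, contribution, stability) rather than hard-coded booleans.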

Step 8: Progressive Training

Symptoms:
  • New modules cause instability when integrated
  • Need warmup/cooldown for safe capacity addition
  • Planning multi-stage training schedules
Route to: progressive-training-strategies.md
Covers:
  • Staged capacity expansion strategies
  • Warmup patterns (zero-init, LR warmup, alpha ramp)
  • Cooldown and stabilization (settling periods, consolidation)
  • Multi-stage schedules (sequential, overlapping, budget-aware)
  • Knowledge transfer between stages (inheritance, distillation)
When to Use:
  • Ramping new modules safely into production
  • Designing curriculum over architecture (not just data)
  • Preventing stage transition shock
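The warmup/cooldown idea reduces to a simple schedule: ramp a new module's blend factor linearly, then hold at full amplitude for a settling period before declaring the stage done. Function names and step counts are illustrative:

```python
def alpha_schedule(step: int, warmup_steps: int = 1000) -> float:
    """Linear warmup for a new module's blend factor: 0 -> 1 over warmup_steps,
    then held at 1."""
    return min(step / warmup_steps, 1.0)

def is_settled(step: int, warmup_steps: int = 1000, hold_steps: int = 500) -> bool:
    """A module counts as settled only after running at full amplitude
    for the cooldown/holding period."""
    return step >= warmup_steps + hold_steps
```

Pairing a ramp like this with a zero-init output layer gives two independent safety margins against stage transition shock.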

Common Multi-Skill Scenarios

Scenario: Building a Morphogenetic System

Need: Network that grows seeds, trains them in isolation, and grafts successful ones
Routing sequence:
  1. dynamic-architecture-patterns - Slot-based expansion, where seeds attach
  2. gradient-isolation-techniques - Train seeds on host errors without destabilizing host
  3. modular-neural-composition - How seed outputs blend into host stream
  4. ml-lifecycle-orchestration - State machine for seed lifecycle
  5. progressive-training-strategies - Warmup/cooldown for grafting

Scenario: Continual Learning Without Forgetting

Need: Train on sequence of tasks without catastrophic forgetting
Routing sequence:
  1. continual-learning-foundations - Understand forgetting, choose approach
  2. gradient-isolation-techniques - If using architectural approach (columns, modules)
  3. progressive-training-strategies - Staged training across tasks

Scenario: Neural Architecture Search (Lite)

Need: Grow/prune network based on training signal
Routing sequence:
  1. dynamic-architecture-patterns - Growth/pruning triggers and patterns
  2. ml-lifecycle-orchestration - Automation via heuristics or RL
  3. progressive-training-strategies - Stabilization between changes

Scenario: RL-Controlled Architecture

Need: RL agent deciding when to grow, prune, integrate
Routing sequence:
  1. ml-lifecycle-orchestration - Learned controller patterns
  2. dynamic-architecture-patterns - What actions the RL agent can take
  3. gradient-isolation-techniques - Safe exploration during training

Rationalization Resistance Table

| Rationalization | Reality | Counter-Guidance |
| --- | --- | --- |
| "Just train a bigger model from scratch" | Transfer + growth often beats from-scratch | Check continual-learning-foundations for why |
| "I'll freeze everything except the new layer" | Full freeze may be too restrictive | Check gradient-isolation-techniques for partial strategies |
| "I'll add capacity whenever loss plateaus" | Need more than loss plateau (contribution check) | Check ml-lifecycle-orchestration for proper gates |
| "Modules can just sum their outputs" | Naive summation can cause interference | Check modular-neural-composition for combination mechanisms |
| "I'll integrate immediately when training finishes" | Need warmup/holding period | Check progressive-training-strategies for safe integration |
| "EWC solves all forgetting problems" | EWC has limitations, may need architectural approach | Check continual-learning-foundations for trade-offs |

Red Flags Checklist

Watch for these signs of incorrect approach:
  • No Isolation: Training new modules without gradient isolation from host
  • No Warmup: Integrating new capacity at full amplitude immediately
  • No Gates: Integrating based only on time, not performance metrics
  • Naive Combination: Summing module outputs without gating or blending
  • Ignoring Forgetting: Adding new tasks without measuring old task performance
  • No Rollback: No plan for what happens if integration fails

Relationship to Other Packs

| Request | Primary Pack | Why |
| --- | --- | --- |
| "Implement PPO for architecture decisions" | yzmir-deep-rl | RL algorithm implementation |
| "Evaluate architecture changes without mutation" | yzmir-deep-rl/counterfactual-reasoning | Counterfactual simulation |
| "Debug PyTorch gradient flow" | yzmir-pytorch-engineering | Low-level PyTorch debugging |
| "Optimize training loop performance" | yzmir-training-optimization | General training optimization |
| "Design transformer architecture" | yzmir-neural-architectures | Static architecture design |
| "Deploy morphogenetic model" | yzmir-ml-production | Production deployment |
Intersection with deep-rl: If using RL to control architecture decisions (when to grow/prune), combine this pack's lifecycle orchestration with deep-rl's policy gradient or actor-critic methods.
Counterfactual evaluation: Before committing to a live mutation (grow/prune), use deep-rl's counterfactual-reasoning.md to simulate the change and evaluate outcomes without risk. This is critical for production morphogenetic systems.

Diagnostic Question Templates

Use these to route users:

Problem Classification

  • "Are you training on multiple tasks sequentially, or growing a single-task network?"
  • "Do you have an existing trained model you want to extend, or starting fresh?"
  • "Is the issue forgetting (old performance drops) or instability (training explodes)?"

Architectural Questions

  • "Where do new modules attach to the existing network?"
  • "How should new module outputs combine with existing outputs?"
  • "What triggers growth? Loss plateau, manual, or learned?"

Lifecycle Questions

  • "What states can a module be in? (training, integrating, permanent, removed)"
  • "What conditions must be met before integration?"
  • "What happens if a module fails to improve performance?"

Summary: Routing Decision Tree

START: Dynamic architecture problem

├─ Forgetting old tasks?
│  └─ → continual-learning-foundations

├─ New module destabilizes existing?
│  └─ → gradient-isolation-techniques

├─ Fine-tuning LLM efficiently?
│  └─ → peft-adapter-techniques

├─ When/where to add capacity?
│  └─ → dynamic-architecture-patterns

├─ How modules combine?
│  └─ → modular-neural-composition

├─ Managing grow/train/integrate cycle?
│  └─ → ml-lifecycle-orchestration

├─ Warmup/cooldown for new capacity?
│  └─ → progressive-training-strategies

└─ Building complete morphogenetic system?
   └─ → Start with dynamic-architecture-patterns
      → Then gradient-isolation-techniques
      → Then ml-lifecycle-orchestration

Reference Sheets

After routing, load the appropriate reference sheet:
  1. continual-learning-foundations.md - EWC, PackNet, rehearsal, forgetting theory
  2. gradient-isolation-techniques.md - Freezing, detach, alpha blending, hook surgery
  3. peft-adapter-techniques.md - LoRA, QLoRA, DoRA, adapter merging
  4. dynamic-architecture-patterns.md - Grow/prune patterns, triggers, scheduling
  5. modular-neural-composition.md - MoE, gating, grafting, interface contracts
  6. ml-lifecycle-orchestration.md - State machines, gates, controllers
  7. progressive-training-strategies.md - Staged expansion, warmup/cooldown