terragrunt
Terragrunt Infrastructure Skill
Manage bare-metal Kubernetes infrastructure from PXE boot to running clusters.
For architecture overview (units vs modules, config centralization), see infrastructure/CLAUDE.md. For detailed unit patterns, see infrastructure/units/CLAUDE.md.
Task Commands (Always Use These)
```bash
# Validation (run in order)
task tg:fmt              # Format HCL files
task tg:test-<module>    # Test specific module (e.g., task tg:test-config)
task tg:validate-<stack> # Validate stack (e.g., task tg:validate-integration)

# Operations
task tg:list             # List available stacks
task tg:plan-<stack>     # Plan (e.g., task tg:plan-integration)
task tg:apply-<stack>    # Apply (REQUIRES HUMAN APPROVAL)
task tg:gen-<stack>      # Generate stack files
task tg:clean-<stack>    # Clean generated files
```

**NEVER** run `terragrunt` or `tofu` directly; always use `task` commands.

How to Add a Machine
- Edit `inventory.hcl`:

  ```hcl
  node50 = {
    cluster = "live"
    type    = "worker"
    install = {
      selector     = "disk.model == 'Samsung'"
      architecture = "amd64"
    }
    interfaces = [{
      id           = "eth0"
      hardwareAddr = "aa:bb:cc:dd:ee:ff"        # VERIFY correct
      addresses    = [{ ip = "192.168.10.50" }] # VERIFY available
    }]
  }
  ```

- Run `task tg:plan-live`
- Review the plan; the config module auto-includes machines where `cluster == "live"`
- Request human approval before apply
How to Add a Feature Flag
- Add version to `versions.hcl` if needed
- Add feature detection in `modules/config/main.tf`:

  ```hcl
  locals {
    new_feature_enabled = contains(var.features, "new-feature")
  }
  ```

- Enable in stack's features list:

  ```hcl
  features = ["gateway-api", "longhorn", "new-feature"]
  ```

How to Create a New Unit
- Create `units/new-unit/terragrunt.hcl`:

  ```hcl
  include "root" {
    path = find_in_parent_folders("root.hcl")
  }

  terraform {
    source = "../../../.././/modules/new-unit"
  }

  dependency "config" {
    config_path  = "../config"
    mock_outputs = { new_unit = {} }
  }

  inputs = dependency.config.outputs.new_unit
  ```

- Create corresponding `modules/new-unit/` with `variables.tf`, `main.tf`, `outputs.tf`, `versions.tf`
- Add the corresponding output to the config module (the unit reads `dependency.config.outputs.new_unit`)
- Add `unit` block to stacks that need it
How to Write Module Tests
Tests use OpenTofu native testing in `modules/<name>/tests/*.tftest.hcl`:

```hcl
# Top-level variables set defaults for ALL run blocks
variables {
  name     = "test-cluster"
  features = ["gateway-api"]
  machines = {
    node1 = {
      cluster = "test-cluster"
      type    = "controlplane"
      # ... complete machine definition
    }
  }
}

run "feature_enabled" {
  command = plan

  variables {
    features = ["prometheus"] # Only override what differs
  }

  assert {
    condition     = output.prometheus_enabled == true
    error_message = "Prometheus should be enabled"
  }
}
```

Run a single module's tests with `task tg:test-config`, or all modules' tests with `task tg:test`.

Safety Rules
- NEVER run apply without explicit human approval
- NEVER use `--auto-approve` flags
- NEVER guess MAC addresses or IPs; verify against `inventory.hcl`
- NEVER commit `.terragrunt-cache/` or `.terragrunt-stack/`
- NEVER manually edit Terraform state
State Operations
When removing state entries with indexed resources (e.g., `this["rpi4"]`), `xargs` strips the quotes, causing errors. Use a `while` loop instead:

```bash
# WRONG - xargs mangles quotes in resource names
terragrunt state list | xargs -n 1 terragrunt state rm

# CORRECT - while loop preserves quotes
terragrunt state list | while read -r resource; do terragrunt state rm "$resource"; done
```

This applies to any state operation on resources with map keys like `data.talos_machine_configuration.this["rpi4"]`.
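The quoting difference can be reproduced safely without touching any state. This sketch substitutes `echo` and `printf` for `terragrunt state rm` (an assumption made purely for illustration) and feeds in a single sample address.

```shell
# Sample address with a quoted map key, as `state list` would print it.
address='data.talos_machine_configuration.this["rpi4"]'

# xargs treats double quotes as its own quoting syntax and drops them
# before handing the argument to the command (echo stands in for state rm).
via_xargs=$(printf '%s\n' "$address" | xargs -n 1 echo)

# read -r passes the line through unchanged, so the map key keeps its quotes.
via_while=$(printf '%s\n' "$address" | while read -r resource; do printf '%s\n' "$resource"; done)

printf 'xargs: %s\n' "$via_xargs"
printf 'while: %s\n' "$via_while"
```

The `xargs` line prints the address as `this[rpi4]`, which no longer matches any resource, while the `while` line keeps `this["rpi4"]` intact.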
Validation Checklist
Before requesting apply approval:
- `task tg:fmt` passes
- `task tg:test` passes (if module tests exist)
- `task tg:validate` passes for ALL stacks
- `task tg:plan-<stack>` reviewed
- No unexpected destroys in plan
- Network changes won't break connectivity
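The automatable part of the checklist can be chained so that a failure stops the sequence early. This is a minimal sketch, not an existing script: `run_prechecks` and the `TASK_BIN` override are hypothetical names introduced here, and the function deliberately never runs apply.

```shell
# Hypothetical wrapper: run the pre-approval checks in order for one stack,
# stopping at the first failure.
# TASK_BIN is an assumed override that exists only so the sketch can be dry-run.
run_prechecks() {
  stack="$1"
  runner="${TASK_BIN:-task}"
  for target in tg:fmt tg:test tg:validate "tg:plan-${stack}"; do
    "$runner" "$target" || { echo "FAILED: $target" >&2; return 1; }
  done
  echo "Checks passed; review the plan for ${stack}, then request human approval."
}
```

For example, `run_prechecks integration` would run the four tasks in sequence. Checking the plan for unexpected destroys and for network changes that could break connectivity remains a manual step.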
References
- stacks.md - Detailed Terragrunt stacks documentation
- units.md - Detailed Terragrunt units documentation