systems-architect

Systems Architect Skill

Purpose

Design robust, scalable architectures for bioinformatics software and pipelines.

When to Use This Skill

Use this skill when you need to:
  • Design software architecture for complex bioinformatics systems
  • Choose appropriate data structures (pandas, anndata, HDF5, databases)
  • Plan for scalability (memory, compute, storage)
  • Define APIs and interfaces between components
  • Design pipeline orchestration (Snakemake, Nextflow, custom)
  • Make technology stack decisions

Workflow Integration

Pattern: Requirements → Architecture Design → Implementation Spec
Biologist Commentator validates requirements
Systems Architect designs architecture
Produces technical specification
Software Developer implements from spec

Core Responsibilities

1. System Design

  • Component architecture (modular, extensible)
  • Data flow design
  • Error handling strategy
  • Scalability planning

2. Technology Selection

  • Data structures (when to use what)
  • Storage formats (CSV, HDF5, Parquet, databases)
  • Execution environments (local, HPC, cloud)
  • Pipeline orchestration tools

3. Performance Planning

  • Memory requirements estimation
  • Compute resource allocation
  • I/O optimization strategies
  • Parallelization approach

4. Integration Strategy

  • How to wrap existing tools
  • Container strategy (Docker/Singularity)
  • Dependency management
  • Version pinning

5. Architecture Context Document

  • Maintain persistent context document describing module structure
  • Track dependencies and modification order for safe incremental changes
  • Document intended usage patterns for each major component
  • Provide streaming/incremental change strategies
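
One way to derive a modification order from tracked dependencies is a topological sort, so that a module is changed only after everything it depends on. The following is a minimal sketch under that interpretation; the module names and the `DEPENDENCIES` map are hypothetical and only illustrate the idea.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each module maps to the modules it depends on.
DEPENDENCIES = {
    "io_utils": set(),
    "qc": {"io_utils"},
    "report": {"qc", "io_utils"},
}


def modification_order(dependencies):
    """Return modules ordered so each one appears after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = modification_order(DEPENDENCIES)
print(order)  # ['io_utils', 'qc', 'report']
```

A cyclic dependency makes `static_order()` raise `graphlib.CycleError`, which is a useful signal that the module structure needs attention before incremental changes are planned.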

Architecture Context Document

The Architecture Context Document (.architecture/context.md) is a persistent, version-controlled reference that captures architectural intent across sessions. Unlike ephemeral handoffs (deleted after workflow completion), this document survives to guide future development.
Purpose: Provide all agents with a bird's-eye view of the codebase structure, preventing scope creep and ensuring dependency-respecting changes.
Lifecycle:
  • Created: During Phase 3 (Architecture Design) of programming-pm workflow
  • Updated: When architectural changes occur (new modules, dependency changes, interface modifications)
  • Read: By senior-developer and junior-developer before starting implementation (pre-flight step)
Template and protocols: See references/architecture-context-template.md for:
  • Four-section template (Module Interconnections, Usage Patterns, Modification Order, Streaming Strategies)
  • Generation protocol (Phase 3, Bootstrap Mode for existing codebases, SIMPLE mode abbreviation)
  • Maintenance protocol (when to update, staleness detection, drift handling)
  • Merge conflict resolution

Bootstrap Mode

For existing codebases without an Architecture Context Document, systems-architect generates the document during Phase 3 using static analysis:
  1. List modules/components from directory structure (src/, modules/)
  2. Infer dependencies from import statements
  3. Mark unknowns explicitly with [TBD], [UNKNOWN], or [INFERRED] tags
  4. Document incomplete areas as "Known Gaps" at document end
Bootstrap Mode prioritizes incomplete but honest documentation over fabricated completeness. Developers are instructed to treat the code as ground truth and report discrepancies.
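
Step 2 (inferring dependencies from import statements) could be approached with Python's standard `ast` module. This is a minimal sketch, not the skill's actual tooling: the function name `imported_modules` is hypothetical, it only handles Python source, and it skips relative imports, which would need to be resolved against the package layout.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Return the top-level module names imported by a piece of Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 means a relative import ("from . import x"); skipped here.
            if node.level == 0 and node.module:
                modules.add(node.module.split(".")[0])
    return modules


example = "import numpy as np\nfrom pandas import DataFrame\nfrom . import helpers\n"
print(sorted(imported_modules(example)))  # ['numpy', 'pandas']
```

Dependencies found this way are static guesses, so tagging them [INFERRED] in the generated document keeps them distinguishable from relationships a developer has confirmed.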

Standard Architecture Template

Use assets/architecture_template.md:

System Architecture: [Project Name]

Overview

[1-2 sentence system description]

Components

Data Flow

[Input] → [Processing] → [Output]

Technology Stack

  • Language: Python 3.11
  • Key Libraries: pandas, numpy, scikit-learn
  • Storage: HDF5 for matrices, SQLite for metadata
  • Execution: Snakemake on HPC cluster

Scalability

  • Dataset size: [Expected range]
  • Memory: [Requirements]
  • Compute: [CPU cores, time estimates]
  • Storage: [Space requirements]

Error Handling

[Strategy for failures, retries, logging]

Deployment

[Installation, configuration, execution]

Data Structure Selection Guide

See references/data_structure_guide.md for full details.
Quick Reference:

| Use Case | Structure | When |
|---|---|---|
| Tabular data <1GB | pandas DataFrame | General analysis |
| Tabular data >1GB | Dask DataFrame | Out-of-core processing |
| Single-cell data | AnnData | scRNA-seq analysis |
| Large matrices | HDF5 | Persistent storage |
| Relational queries | SQLite/PostgreSQL | Complex joins |
| Genomic intervals | BED/GFF files | Standard interchange |
| Time series | pandas with DatetimeIndex | Temporal data |
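
The quick reference can also be read as a simple lookup. The sketch below encodes it as a function; the name `suggest_structure` and the use-case keys are hypothetical, and treating exactly 1 GB as "large" is an assumption, since the table leaves that boundary unspecified.

```python
ONE_GB = 1_000_000_000  # decimal gigabyte, used as the tabular-data threshold

# Non-tabular use cases from the quick reference (keys are illustrative).
STRUCTURE_BY_USE_CASE = {
    "single_cell": "AnnData",
    "large_matrix": "HDF5",
    "relational": "SQLite/PostgreSQL",
    "genomic_intervals": "BED/GFF files",
    "time_series": "pandas with DatetimeIndex",
}


def suggest_structure(use_case: str, size_bytes: int = 0) -> str:
    """Map a use case (and data size, for tabular data) to a suggested structure."""
    if use_case == "tabular":
        return "pandas DataFrame" if size_bytes < ONE_GB else "Dask DataFrame"
    if use_case not in STRUCTURE_BY_USE_CASE:
        raise ValueError(f"Unknown use case: {use_case}")
    return STRUCTURE_BY_USE_CASE[use_case]


print(suggest_structure("tabular", 200_000_000))    # pandas DataFrame
print(suggest_structure("tabular", 5_000_000_000))  # Dask DataFrame
print(suggest_structure("single_cell"))             # AnnData
```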

Scalability Considerations

Memory Estimation

RNA-seq count matrix: genes × samples × 8 bytes
  20,000 genes × 1,000 samples × 8 = 160 MB (fits in RAM)
  20,000 genes × 100,000 cells × 8 = 16 GB (need sparse or chunking)
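
The arithmetic above can be reproduced with a short helper. This is a sketch assuming a dense matrix of 8-byte values (for example float64) and decimal units; the function names are illustrative.

```python
def dense_matrix_bytes(n_genes: int, n_samples: int, bytes_per_value: int = 8) -> int:
    """Memory needed for a dense genes x samples matrix."""
    return n_genes * n_samples * bytes_per_value


def format_size(n_bytes: float) -> str:
    """Format a byte count using decimal units (1 GB = 1e9 bytes)."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if n_bytes < 1000:
            return f"{n_bytes:g} {unit}"
        n_bytes /= 1000
    return f"{n_bytes:g} PB"


print(format_size(dense_matrix_bytes(20_000, 1_000)))    # 160 MB
print(format_size(dense_matrix_bytes(20_000, 100_000)))  # 16 GB
```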

Compute Planning

DESeq2 analysis: O(n_genes × n_samples²)
  100 samples: ~5 minutes
  1,000 samples: ~8 hours
  Strategy: Subset for testing, full run overnight
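
Taking the quadratic scaling stated above at face value, a measured runtime on a subset can be extrapolated to the full dataset. This is a rough planning sketch only: it assumes the gene count stays fixed and that the O(n_samples²) model holds, while real runtimes also depend on the experimental design and hardware.

```python
def extrapolate_minutes(known_minutes: float, known_samples: int, target_samples: int) -> float:
    """Scale a measured runtime assuming cost grows with n_samples squared."""
    return known_minutes * (target_samples / known_samples) ** 2


minutes = extrapolate_minutes(5, 100, 1_000)
print(f"{minutes:.0f} minutes = {minutes / 60:.1f} hours")  # 500 minutes = 8.3 hours
```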

Storage Planning

FASTQ (compressed): 50-100 MB per million reads
  50M reads ≈ 2.5-5 GB
  100 samples × 50M reads ≈ 250-500 GB
  Strategy: Delete FASTQ after alignment, keep BAM
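
The same estimate as a small calculation, using the 50-100 MB per million reads range quoted above to produce a low and a high bound. The function name is illustrative and decimal units are assumed.

```python
def fastq_storage_gb(n_samples: int, million_reads_per_sample: float, mb_per_million_reads: float) -> float:
    """Estimated compressed FASTQ storage in GB (1 GB = 1000 MB)."""
    return n_samples * million_reads_per_sample * mb_per_million_reads / 1000


low = fastq_storage_gb(100, 50, 50)
high = fastq_storage_gb(100, 50, 100)
print(f"{low:.0f}-{high:.0f} GB")  # 250-500 GB
```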

Integration Patterns

Wrapping External Tools

```python
import subprocess

import pysam

# input_file, output_dir, and bam_file are paths supplied by the caller.

# Pattern 1: Subprocess call
result = subprocess.run(
    ['fastqc', input_file, '-o', output_dir],
    capture_output=True,
    check=True,
)

# Pattern 2: Python binding (preferred if available)
bam = pysam.AlignmentFile(bam_file, 'rb')
```
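
Pattern 1 is often wrapped in a helper so that a missing tool or a non-zero exit produces a readable error. The sketch below is one possible shape for such a wrapper; `run_tool` is a hypothetical name, and the usage line runs the Python interpreter itself as a stand-in for a real tool such as fastqc.

```python
import shutil
import subprocess
import sys


def run_tool(command: list[str]) -> str:
    """Run an external command and return its stdout; raise with stderr on failure."""
    executable = command[0]
    if shutil.which(executable) is None:
        raise FileNotFoundError(f"Required tool not found on PATH: {executable}")
    try:
        result = subprocess.run(command, capture_output=True, text=True, check=True)
    except subprocess.CalledProcessError as err:
        raise RuntimeError(
            f"{executable} failed (exit {err.returncode}): {err.stderr.strip()}"
        ) from err
    return result.stdout


# Stand-in command; a real call might look like run_tool(['fastqc', input_file, '-o', output_dir]).
print(run_tool([sys.executable, "-c", "print('ok')"]).strip())  # ok
```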

Container Strategy

```dockerfile
# Dockerfile approach for reproducibility
FROM python:3.11-slim
# Pin exact package versions here for reproducible builds
RUN pip install --no-cache-dir numpy pandas scikit-learn
COPY pipeline.py /app/
ENTRYPOINT ["python", "/app/pipeline.py"]
```

Output: Technical Specification

Deliverable to Software Developer includes:
  1. Architecture diagram (components + data flow)
  2. Component specifications (inputs, outputs, responsibilities)
  3. Technology stack (exact versions)
  4. Data structures (schemas, formats)
  5. Error handling (what to do when steps fail)
  6. Performance requirements (memory, time, storage)
  7. Testing strategy (unit, integration, validation)
  8. Architecture Context Document (.architecture/context.md - persistent context for incremental development)

References

For detailed guidance:
  • references/architecture_patterns.md - Common patterns with pros/cons
  • references/data_structure_guide.md - When to use which data structure
  • references/scalability_considerations.md - Memory, compute, storage planning
  • references/integration_patterns.md - How to wrap tools, containers, dependencies
  • references/architecture-context-template.md - Architecture Context Document template, generation, and maintenance protocols

Example Architecture

Project: QC Pipeline for 1,000 RNA-seq Samples

Architecture Specification

Overview

Parallel QC pipeline processing 1,000 bulk RNA-seq FASTQ files with automated report generation.

Components

  1. Validator: Check FASTQ integrity, format
  2. QC Runner: Execute FastQC in parallel
  3. Aggregator: Combine metrics with MultiQC
  4. Reporter: Generate summary statistics and plots

Data Flow

FASTQ files → Validator → QC Runner (parallel) → Aggregator → HTML Report

Technology Stack

  • Execution: Snakemake (manages dependencies, parallelization)
  • QC: FastQC 0.12.1
  • Aggregation: MultiQC 1.14
  • Custom code: Python 3.11, pandas, matplotlib
  • Storage: FASTQ (gzip), QC metrics (JSON), report (HTML)

Scalability

  • Data: 1,000 samples × 50M reads × 100 bp ≈ 2.5-5 TB compressed FASTQ (at 50-100 MB per million reads)
  • Compute: 100 parallel jobs on HPC cluster
  • Time: 30 min per sample → 300 min total (5 hours)
  • Memory: 4 GB per FastQC job = 400 GB total (distributed)
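
The time and memory figures above follow from simple arithmetic, sketched here. The sketch assumes jobs run in equal-length waves of 100 and ignores scheduler queueing; the function names are illustrative.

```python
import math


def wall_clock_minutes(n_samples: int, minutes_per_sample: float, parallel_jobs: int) -> float:
    """Total runtime when samples are processed in waves of parallel_jobs."""
    waves = math.ceil(n_samples / parallel_jobs)
    return waves * minutes_per_sample


def peak_memory_gb(parallel_jobs: int, gb_per_job: float) -> float:
    """Aggregate memory in use across the cluster while all jobs are running."""
    return parallel_jobs * gb_per_job


print(wall_clock_minutes(1_000, 30, 100))  # 300
print(peak_memory_gb(100, 4))              # 400
```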

Error Handling

  • Retry failed jobs (3 attempts)
  • Continue pipeline if individual samples fail
  • Log all errors with sample ID
  • Final report includes QC pass/fail status per sample
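
In custom Python code, the retry-and-continue behaviour listed above could look roughly like the following. This is a minimal sketch, not the pipeline's implementation: `run_with_retries`, `run_all`, and `flaky_job` are hypothetical names, and a workflow manager would normally handle retries through its own configuration instead.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("qc_pipeline")


def run_with_retries(sample_id, job, max_attempts=3):
    """Call job(sample_id) up to max_attempts times; return True on success."""
    for attempt in range(1, max_attempts + 1):
        try:
            job(sample_id)
            return True
        except Exception as err:
            # Every failure is logged together with the sample ID.
            logger.error("sample=%s attempt=%d/%d failed: %s",
                         sample_id, attempt, max_attempts, err)
    return False


def run_all(sample_ids, job):
    """Process every sample, recording pass/fail instead of stopping at the first failure."""
    return {sample_id: run_with_retries(sample_id, job) for sample_id in sample_ids}


def flaky_job(sample_id):
    """Hypothetical job that always fails for one sample."""
    if sample_id == "S2":
        raise RuntimeError("corrupt FASTQ")


status = run_all(["S1", "S2"], flaky_job)
print(status)  # {'S1': True, 'S2': False}
```

The returned dictionary is the per-sample pass/fail status that the final report would include.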

Deployment

  • Install: micromamba env from environment.yml
  • Config: samples.csv (list of FASTQ paths)
  • Execute: snakemake --cores 100 --cluster "sbatch -c 4 --mem=4GB"
  • Output: results/multiqc_report.html

Hands to Software Developer for implementation.

Success Criteria

Architecture is complete when:
  • All components clearly defined
  • Data flow unambiguous
  • Technology choices justified
  • Scalability analyzed (memory, compute, storage)
  • Error handling planned
  • Developer can implement without architecture questions