system-architecture

System Architecture

When to Use

Activate this skill when:
  • Designing a new module, service, or major feature that requires structural decisions
  • Choosing between architectural approaches (e.g., where to place logic, how to structure data flow)
  • Planning database schema changes or refactoring existing schema
  • Making frontend state management decisions (server state vs client state, context vs store)
  • Evaluating technology trade-offs for a new capability
  • Creating or reviewing Architecture Decision Records (ADRs)
  • Setting up a new project or major subsystem from scratch
Input: If `plan.md` exists (from `project-planner`), read it for context about the feature scope and affected modules. Otherwise, work from the user's request directly.
Output: Write architecture decisions to `architecture.md` and create ADRs in `docs/adr/ADR-NNN-<title>.md`. Tell the user: "Architecture written to `architecture.md`. Run `/api-design-patterns` for API contracts or `/task-decomposition` for implementation tasks."
Do NOT use this skill for:
  • Writing implementation code (use `python-backend-expert` or `react-frontend-expert`)
  • API contract design or endpoint specifications (use `api-design-patterns`)
  • Testing patterns or strategies (use `pytest-patterns` or `react-testing-patterns`)
  • Deployment or infrastructure decisions (use `docker-best-practices` or `deployment-pipeline`)

Instructions

Project Layer Architecture

The standard Python/React full-stack architecture follows a layered pattern with strict dependency direction.

Backend Layers (FastAPI)

HTTP Request
┌─────────────────────┐
│   Routers (routes/)  │  ← HTTP concerns: request parsing, response formatting, status codes
│                      │     Uses: Depends() for injection, Pydantic schemas for validation
├─────────────────────┤
│   Services           │  ← Business logic: orchestration, validation rules, domain operations
│   (services/)        │     No HTTP awareness. Raises domain exceptions, not HTTPException.
├─────────────────────┤
│   Repositories       │  ← Data access: queries, CRUD operations, database interactions
│   (repositories/)    │     No business logic. Returns model instances or None.
├─────────────────────┤
│   Models (models/)   │  ← SQLAlchemy ORM models: table definitions, relationships, indexes
│   Schemas (schemas/) │  ← Pydantic v2 models: request/response contracts, validation
└─────────────────────┘
Database
Dependency direction rules:
  • Routers depend on Services (never on Repositories directly)
  • Services depend on Repositories (never on Routers)
  • Repositories depend on Models (never on Services)
  • Schemas are shared across layers but define no dependencies themselves
  • Never skip layers: no direct database access from routes
Dependency injection pattern:

```python
# Router depends on Service via Depends()
@router.post("/users", response_model=UserResponse)
async def create_user(
    data: UserCreate,
    service: UserService = Depends(get_user_service),
) -> UserResponse:
    return await service.create_user(data)


# Service depends on Repository via constructor injection
class UserService:
    def __init__(self, repo: UserRepository) -> None:
        self.repo = repo


# Repository depends on AsyncSession via Depends()
class UserRepository:
    def __init__(self, session: AsyncSession) -> None:
        self.session = session
```
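To see the dependency direction end to end without a web framework, the sketch below wires the same three layers by hand. It is a minimal illustration, assuming an in-memory repository in place of `AsyncSession`; `InMemoryUserRepository`, `DuplicateEmailError`, `create_user_route`, and the dict-based user records are hypothetical names, not part of the skill.

```python
import asyncio
from typing import Optional


class DuplicateEmailError(Exception):
    """Domain exception (hypothetical): raised by the service, never an HTTPException."""


class InMemoryUserRepository:
    """Data access only: stores and looks up records, no business rules."""

    def __init__(self) -> None:
        self._users = {}

    async def get_by_email(self, email: str) -> Optional[dict]:
        return self._users.get(email)

    async def add(self, email: str) -> dict:
        user = {"id": len(self._users) + 1, "email": email}
        self._users[email] = user
        return user


class UserService:
    """Business logic: receives its repository through the constructor."""

    def __init__(self, repo: InMemoryUserRepository) -> None:
        self.repo = repo

    async def create_user(self, email: str) -> dict:
        if await self.repo.get_by_email(email) is not None:
            raise DuplicateEmailError(email)
        return await self.repo.add(email)


async def create_user_route(email: str, service: UserService) -> tuple:
    """Router stand-in: turns service results and domain errors into status codes."""
    try:
        return 201, await service.create_user(email)
    except DuplicateEmailError:
        return 409, {"detail": "email already registered"}


service = UserService(InMemoryUserRepository())
first = asyncio.run(create_user_route("a@example.com", service))
second = asyncio.run(create_user_route("a@example.com", service))
```

The router stand-in never touches the repository, and the service never mentions status codes, so each layer can be tested with the one below it replaced by a fake.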

Frontend Layers (React/TypeScript)

┌─────────────────────┐
│   Pages (pages/)     │  ← Route-level components: data fetching, layout composition
├─────────────────────┤
│   Layouts            │  ← Page structure: navigation, sidebars, content areas
│   (layouts/)         │
├─────────────────────┤
│   Features           │  ← Domain-specific: UserProfile, OrderList, ChatPanel
│   (features/)        │     Composed from shared components + hooks
├─────────────────────┤
│   Shared Components  │  ← Reusable UI: Button, Modal, Table, Form, Input
│   (components/)      │     No business logic. Configurable via props.
├─────────────────────┤
│   Hooks (hooks/)     │  ← Custom hooks: useAuth, usePagination, useDebounce
│   API (api/)         │  ← API client functions, TanStack Query configurations
├─────────────────────┤
│   Types (types/)     │  ← Shared TypeScript interfaces and type definitions
└─────────────────────┘
Component dependency direction:
  • Pages import Features and Layouts
  • Features import Shared Components and Hooks
  • Shared Components import only other Shared Components and Types
  • Hooks import API functions and Types
  • API functions import Types only
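The import rules above are mechanical enough to check automatically. The following is a rough sketch of such a check, assuming import edges have already been extracted as (importer layer, imported layer) pairs; `ALLOWED_IMPORTS`, `find_violations`, and the sample edges are hypothetical. The table transcribes the rules literally, so a real project would probably also allow `types` from every layer.

```python
# Allowed imports per frontend layer, transcribed from the rules above.
ALLOWED_IMPORTS = {
    "pages": {"features", "layouts"},
    "features": {"components", "hooks"},
    "components": {"components", "types"},
    "hooks": {"api", "types"},
    "api": {"types"},
}


def find_violations(imports):
    """Return the (importer, imported) pairs that break the layer rules."""
    return [
        (src, dst)
        for src, dst in imports
        if dst not in ALLOWED_IMPORTS.get(src, set())
    ]


# Hypothetical import edges, as a lint step might extract them from source files.
observed = [
    ("pages", "features"),
    ("features", "hooks"),
    ("components", "hooks"),  # shared components must not pull in data hooks
    ("api", "features"),      # API functions may import types only
]
violations = find_violations(observed)
```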

Decision Framework

When facing architectural decisions, follow this structured process:

Step 1: Define the Problem

  • What capability is needed?
  • What are the non-functional requirements? (performance, scalability, maintainability)
  • What constraints exist? (team size, timeline, existing infrastructure)

Step 2: Identify Options

  • List 2-3 viable architectural approaches
  • For each option, document:
    • How it works (brief technical description)
    • Advantages
    • Disadvantages
    • Risks

Step 3: Evaluate Against Criteria

| Criterion | Weight | Description |
|---|---|---|
| Maintainability | High | Can the team understand, modify, and debug this easily? |
| Testability | High | Can each component be tested in isolation? |
| Performance | Medium | Does it meet latency and throughput requirements? |
| Team familiarity | Medium | Does the team have experience with this approach? |
| Operational cost | Low | What are the infrastructure and maintenance costs? |
| Future flexibility | Low | How easily can this evolve as requirements change? |

Step 4: Decide and Document

  • Choose the option that best satisfies the weighted criteria
  • Document the decision in an ADR (see `references/architecture-decision-record-template.md`)
  • Record what was NOT chosen and why; this context is valuable for future decisions
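One way to make "best satisfies the weighted criteria" concrete is a simple weighted sum. The sketch below assumes numeric weights of 3, 2, and 1 for High, Medium, and Low, and ratings from 1 (poor) to 5 (good); both scales and the two sample options are illustrative assumptions, not part of the framework.

```python
# Numeric weights are an assumption: the criteria table only says High / Medium / Low.
WEIGHTS = {"High": 3, "Medium": 2, "Low": 1}

CRITERIA = {
    "maintainability": "High",
    "testability": "High",
    "performance": "Medium",
    "team_familiarity": "Medium",
    "operational_cost": "Low",
    "future_flexibility": "Low",
}


def weighted_score(ratings):
    """Sum of rating x weight over all criteria; ratings run from 1 (poor) to 5 (good)."""
    return sum(WEIGHTS[CRITERIA[name]] * ratings[name] for name in CRITERIA)


# Hypothetical ratings for two options, for illustration only.
options = {
    "option_a": {"maintainability": 4, "testability": 4, "performance": 3,
                 "team_familiarity": 5, "operational_cost": 3, "future_flexibility": 2},
    "option_b": {"maintainability": 3, "testability": 3, "performance": 5,
                 "team_familiarity": 2, "operational_cost": 2, "future_flexibility": 5},
}
scores = {name: weighted_score(ratings) for name, ratings in options.items()}
best = max(scores, key=scores.get)  # option_a: 45 vs option_b: 39
```

The score is an aid to discussion rather than a verdict; the ADR should still record why the losing option was rejected.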

Step 5: Communicate

  • Share the ADR with the team
  • Identify any migration or rollout steps needed
  • Flag reversibility: is this a one-way door or a two-way door?

Database Schema Design

Design Principles

  1. Start normalized (3NF): denormalize only for proven performance bottlenecks, not speculation
  2. One migration per logical change: each Alembic migration should represent a single, coherent schema modification
  3. Always include downgrade: every migration must have a working `downgrade()` function
  4. Index strategically:
    • Primary keys (automatic)
    • Foreign keys (always)
    • Columns in WHERE clauses of frequent queries
    • Composite indexes for multi-column lookups
    • Partial indexes for filtered queries (e.g., `WHERE is_active = true`)
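The index categories in principle 4 can be tried out quickly with the standard-library `sqlite3` module. This is only a sketch: SQLite stands in for the production database, and the `users` / `posts` tables and all index names are hypothetical. In a real project the same indexes would be declared on the SQLAlchemy models and created through an Alembic migration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,          -- primary key: indexed automatically
        email TEXT NOT NULL,
        is_active INTEGER NOT NULL DEFAULT 1
    );
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES users(id),
        status TEXT NOT NULL,
        created_at TEXT NOT NULL
    );
    -- Foreign key column: index it explicitly.
    CREATE INDEX ix_posts_author_id ON posts (author_id);
    -- Composite index for lookups that filter on both columns.
    CREATE INDEX ix_posts_status_created_at ON posts (status, created_at);
    -- Partial index: only rows matching the predicate are indexed.
    CREATE INDEX ix_users_email_active ON users (email) WHERE is_active = 1;
    """
)

# List the explicitly created indexes to confirm the schema is what was intended.
index_names = sorted(
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND name LIKE 'ix_%'"
    )
)
```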

SQLAlchemy 2.0 Async Patterns

```python
from datetime import datetime

from sqlalchemy import String, func
from sqlalchemy.orm import Mapped, mapped_column, relationship


# Model definition with Mapped types (SQLAlchemy 2.0 style).
# Base is the project's DeclarativeBase subclass.
class User(Base):
    __tablename__ = "users"

    id: Mapped[int] = mapped_column(primary_key=True)
    email: Mapped[str] = mapped_column(String(255), unique=True, index=True)
    is_active: Mapped[bool] = mapped_column(default=True)
    created_at: Mapped[datetime] = mapped_column(server_default=func.now())

    # Relationships: ALWAYS use eager loading with async
    posts: Mapped[list["Post"]] = relationship(
        back_populates="author",
        lazy="selectin",  # or "joined"; NEVER the default lazy="select" with async
    )
```

**Async session rules:**
- One `AsyncSession` per request; never share one across concurrent tasks
- Use the `async with` context manager for automatic cleanup
- Map session boundaries to transaction boundaries
- Use `selectin` or `joined` loading; implicit lazy loading is incompatible with asyncio
- Use `run_sync()` only as a last resort for legacy code

Migration Planning

  1. Schema change → Generate migration: `alembic revision --autogenerate -m "description"`
  2. Review generated migration: verify column types, indexes, constraints
  3. Test upgrade: `alembic upgrade head`
  4. Test downgrade: `alembic downgrade -1`
  5. Test data preservation: ensure existing data survives the round-trip
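Steps 3 to 5 amount to a round-trip test: upgrade, downgrade, then compare the data. The sketch below shows the shape of such a test using `sqlite3` in place of Alembic, so `upgrade()` and `downgrade()` here are hand-written stand-ins for a migration's functions, and the `users` table and `display_name` column are hypothetical.

```python
import sqlite3


def upgrade(conn):
    """Add a nullable column; existing rows are untouched."""
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")


def downgrade(conn):
    """Undo upgrade() by rebuilding the table without the new column."""
    conn.executescript(
        """
        CREATE TABLE users_old (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
        INSERT INTO users_old (id, email) SELECT id, email FROM users;
        DROP TABLE users;
        ALTER TABLE users_old RENAME TO users;
        """
    )


def columns(conn):
    """Column names of the users table, in definition order."""
    return [row[1] for row in conn.execute("PRAGMA table_info(users)")]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
before = conn.execute("SELECT id, email FROM users").fetchall()

upgrade(conn)
columns_after_upgrade = columns(conn)
downgrade(conn)
columns_after_downgrade = columns(conn)
after = conn.execute("SELECT id, email FROM users").fetchall()
```

With Alembic the two functions live in the revision file and the test would drive them through `alembic upgrade head` and `alembic downgrade -1`; the before/after comparison stays the same.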

Frontend Architecture

State Management Decision Tree

Is the data from the server?
├── YES → Use TanStack Query (useQuery, useMutation)
│         Configure staleTime, gcTime, query keys
└── NO → Is it needed across multiple components?
         ├── YES → Is it complex with actions/reducers?
         │         ├── YES → Use Zustand store
         │         └── NO  → Use React Context
         └── NO → Use useState / useReducer locally
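The tree reduces to three yes/no questions asked in order. As a quick way to sanity-check a decision, it can be written as a small function; this is an illustrative Python sketch, and `choose_state_tool` and the sample state names are hypothetical.

```python
def choose_state_tool(from_server, shared=False, complex_actions=False):
    """Walk the decision tree above and name the recommended tool."""
    if from_server:
        return "TanStack Query"
    if not shared:
        return "useState / useReducer"
    return "Zustand store" if complex_actions else "React Context"


recommendations = {
    "user list from API": choose_state_tool(from_server=True),
    "modal open flag": choose_state_tool(from_server=False),
    "theme preference": choose_state_tool(from_server=False, shared=True),
    "multi-step cart": choose_state_tool(from_server=False, shared=True, complex_actions=True),
}
```

Note that the first question short-circuits the rest: server data goes to TanStack Query even when many components share it.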
TanStack Query conventions:
  • Query keys: `[resource, ...identifiers]` (e.g., `["users", userId]`, `["posts", { page, limit }]`)
  • Use the `queryOptions()` factory to centralize key + fn definitions; this prevents copy-paste key errors
  • Set `staleTime` based on data freshness needs (default 0 is too aggressive for most cases)
  • Invalidate with `invalidateQueries()` after mutations, never a manual `refetch()`
  • Handle all states: `isPending`, `isError`, `data`
Component design rules:
  • Props for configuration, hooks for data
  • Lift state only as high as needed; no premature context creation
  • Keep components under 200 lines; extract sub-components or custom hooks when larger
  • Use `children` and composition over deep prop drilling

Routing Structure

Organize routes to mirror the URL structure:
src/
├── pages/
│   ├── HomePage.tsx           → /
│   ├── LoginPage.tsx          → /login
│   ├── users/
│   │   ├── UserListPage.tsx   → /users
│   │   └── UserDetailPage.tsx → /users/:id
│   └── settings/
│       └── SettingsPage.tsx   → /settings

Cross-Cutting Concerns

Authentication Flow

Login Request
Backend: Validate credentials → Generate JWT (access + refresh tokens)
Frontend: Store access token in memory, refresh token in httpOnly cookie
API Calls: Attach access token via Authorization header
Token Expired: Use refresh token to obtain new access token
Refresh Failed: Redirect to login
Architecture decisions for auth:
  • Access tokens: short-lived (15-30 min), stored in memory (not localStorage)
  • Refresh tokens: longer-lived (7-30 days), stored in httpOnly cookie
  • Backend: FastAPI `Depends()` chain for token validation → user extraction → permission check
  • Frontend: Auth context providing `user`, `login()`, `logout()`, `isAuthenticated`
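The two-token lifecycle can be illustrated with the standard library alone. The sketch below signs a payload with HMAC-SHA256 and checks expiry and token kind; it is a deliberately simplified stand-in, not a spec-compliant JWT (there is no header segment), and `SECRET`, `issue_token`, and `verify_token` are hypothetical names. A real backend would use a vetted JWT library instead.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SECRET = b"change-me"           # hypothetical; load from configuration in practice
ACCESS_TTL = 15 * 60            # 15 minutes, the low end of the range above
REFRESH_TTL = 7 * 24 * 60 * 60  # 7 days


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _sign(payload: str) -> str:
    return _b64(hmac.new(SECRET, payload.encode(), hashlib.sha256).digest())


def issue_token(user_id: int, kind: str, ttl: int, now: float) -> str:
    """Return 'payload.signature', signed with HMAC-SHA256."""
    claims = {"sub": user_id, "kind": kind, "exp": now + ttl}
    payload = _b64(json.dumps(claims).encode())
    return f"{payload}.{_sign(payload)}"


def verify_token(token: str, kind: str, now: float) -> Optional[int]:
    """Return the user id, or None if the token is forged, expired, or the wrong kind."""
    payload, _, signature = token.partition(".")
    if not hmac.compare_digest(signature, _sign(payload)):
        return None
    padded = payload + "=" * (-len(payload) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    if claims["kind"] != kind or claims["exp"] <= now:
        return None
    return claims["sub"]


t0 = 1_000_000.0  # fixed clock so the example is deterministic
access = issue_token(42, "access", ACCESS_TTL, now=t0)
refresh = issue_token(42, "refresh", REFRESH_TTL, now=t0)

still_valid = verify_token(access, "access", now=t0 + 60)
expired = verify_token(access, "access", now=t0 + ACCESS_TTL + 1)
user_for_refresh = verify_token(refresh, "refresh", now=t0 + ACCESS_TTL + 1)
tampered = verify_token(access + "x", "access", now=t0 + 60)
```

Once the access token has lapsed, the refresh token still verifies, which is the point at which the backend would issue a new access token; if the refresh check also fails, the frontend redirects to login.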

Error Handling Strategy

Errors should be handled at the appropriate layer:
| Layer | Error Type | Action |
|---|---|---|
| Router | `HTTPException` | Return HTTP error response with status code |
| Service | Domain exceptions | Raise custom exceptions (e.g., `UserNotFoundError`) |
| Repository | Database exceptions | Catch and re-raise as domain exceptions or let propagate |
| Frontend | API errors | Display user-friendly messages, retry where appropriate |
Backend exception hierarchy:
```python
class AppError(Exception):
    """Base application error."""

class NotFoundError(AppError):
    """Resource not found."""

class ConflictError(AppError):
    """Resource conflict (duplicate, version mismatch)."""

class ValidationError(AppError):
    """Business rule violation."""
```
Router-level exception handler maps domain exceptions to HTTP responses:
```python
from fastapi import Request
from fastapi.responses import JSONResponse

@app.exception_handler(NotFoundError)
async def not_found_handler(request: Request, exc: NotFoundError):
    return JSONResponse(status_code=404, content={"detail": str(exc)})
```
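Registering one handler per exception class scales poorly as the hierarchy grows. A possible alternative, sketched here without FastAPI, is a single lookup table resolved through the exception's class hierarchy, so that a subclass such as `UserNotFoundError` inherits its parent's status. `STATUS_BY_ERROR` and `to_http` are hypothetical names, and every status code except the 404 used by the handler above is an assumption.

```python
class AppError(Exception):
    """Base application error."""


class NotFoundError(AppError):
    """Resource not found."""


class ConflictError(AppError):
    """Resource conflict."""


class UserNotFoundError(NotFoundError):
    """Raised by the service layer when a user lookup fails."""


# Status codes are assumptions, except 404 for NotFoundError.
STATUS_BY_ERROR = {NotFoundError: 404, ConflictError: 409, AppError: 400}


def to_http(exc):
    """Pick the status of the closest ancestor in the exception's MRO."""
    for cls in type(exc).__mro__:
        if cls in STATUS_BY_ERROR:
            return STATUS_BY_ERROR[cls], {"detail": str(exc)}
    return 500, {"detail": "internal error"}


response = to_http(UserNotFoundError("user 7 not found"))
fallback = to_http(AppError("rule violated"))
unknown = to_http(RuntimeError("boom"))
```

In FastAPI the same function could back a single handler registered for `AppError`, keeping the mapping in one place.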

Logging Architecture

Backend (structlog):
  • Structured JSON logs in production
  • Human-readable console in development
  • Bind request context (request_id, user_id) at middleware level
  • Log at service layer (business events), not repository layer (too noisy)
  • Use log levels: DEBUG (development only), INFO (business events), WARNING (recoverable issues), ERROR (failures requiring attention)
Frontend:
  • `console.*` in development
  • Structured error reporting to backend or Sentry in production
  • Log user actions for debugging, not for analytics
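The backend idea of binding request context once and having every log line carry it can be sketched with the standard library. This is an approximation of what structlog provides, not its API: `contextvars` holds the request context, and a small `logging.Formatter` renders JSON. `JsonFormatter`, `handle_request`, and the logger name are hypothetical.

```python
import contextvars
import io
import json
import logging

# Request-scoped context; middleware would set these once per request.
request_id_var = contextvars.ContextVar("request_id", default=None)
user_id_var = contextvars.ContextVar("user_id", default=None)


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object, including the bound request context."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "event": record.getMessage(),
            "request_id": request_id_var.get(),
            "user_id": user_id_var.get(),
        })


stream = io.StringIO()  # stands in for stdout so the output can be inspected
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("app.services.orders")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False


def handle_request(request_id, user_id):
    """Middleware stand-in: bind context, then call into the service layer."""
    request_id_var.set(request_id)
    user_id_var.set(user_id)
    logger.info("order_created")          # business event at the service layer
    logger.debug("sql: INSERT INTO ...")  # filtered out at INFO level


handle_request("req-123", 42)
entry = json.loads(stream.getvalue())
```

The service code never passes `request_id` around explicitly, which is what makes it practical to log business events at the service layer only.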

Configuration Management

Backend (pydantic-settings):
```python
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    database_url: str
    redis_url: str = "redis://localhost:6379"
    jwt_secret: str
    debug: bool = False
```
Frontend (environment variables):
  • `VITE_API_URL` for API base URL
  • Build-time injection via Vite's `import.meta.env`
  • No secrets in frontend environment variables

Output Files

architecture.md

Write the architecture document to `architecture.md` at the project root:

```markdown
# Architecture: [Feature/System Name]

## Overview
[1-2 sentence summary of the architectural approach]

## Layer Structure
[Backend and frontend layer descriptions from this skill's patterns]

## Key Decisions
[Summary of decisions made, with links to ADRs]

## Database Schema
[Entity descriptions, relationships, key indexes]

## Cross-Cutting Concerns
[Auth, error handling, logging approach]

## Next Steps
- Run `/api-design-patterns` to define API contracts
- Run `/task-decomposition` to create implementation tasks
```

ADRs

For each significant decision, create an ADR in `docs/adr/`:

```markdown
# ADR-NNN: [Decision Title]

## Status
Accepted | Proposed | Superseded

## Context
[Why this decision is needed]

## Decision
[What we decided]

## Consequences
[Positive and negative outcomes]
```

Number ADRs sequentially (ADR-001, ADR-002, etc.).

Examples

Architecture Decision: Real-Time Notifications

Problem: The application needs real-time notifications for users (new messages, status updates).
Options evaluated:
| Option | Pros | Cons |
|---|---|---|
| WebSocket | True bidirectional, low latency | Complex connection management, harder to scale |
| Server-Sent Events (SSE) | Simple, HTTP-based, auto-reconnect | Unidirectional (server→client only), limited browser connections |
| Polling | Simplest implementation, works everywhere | Higher latency, unnecessary server load |
Decision: WebSocket for this use case.
Rationale: Notifications require low latency and the system will eventually need bidirectional communication (typing indicators, presence). SSE would work for notifications alone but would require a separate solution for future bidirectional needs. Polling introduces unacceptable latency for real-time UX.
Architecture:
  • Backend: FastAPI WebSocket endpoint with `ConnectionManager` class
  • Frontend: Custom `useWebSocket` hook with automatic reconnection
  • Scaling: Redis pub/sub for multi-instance message distribution
  • Persistence: Store notifications in database for offline users
  • Fallback: REST endpoint for notification history and initial load
See `references/architecture-decision-record-template.md` for the full ADR format.
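The source names a `ConnectionManager` class but does not show it, so the following is only a guess at its shape, kept runnable by using `asyncio.Queue` objects in place of real WebSocket connections and a dict in place of the notifications table. The method names and the persist-then-deliver order are assumptions.

```python
import asyncio
from collections import defaultdict


class ConnectionManager:
    """Tracks live connections per user and stores notifications for offline users."""

    def __init__(self):
        self.connections = defaultdict(list)  # user_id -> open connections
        self.stored = defaultdict(list)       # user_id -> messages (database stand-in)

    def connect(self, user_id):
        queue = asyncio.Queue()  # stands in for an accepted WebSocket
        self.connections[user_id].append(queue)
        return queue

    def disconnect(self, user_id, queue):
        self.connections[user_id].remove(queue)

    async def notify(self, user_id, message):
        # Persist first, so history and offline users are covered.
        self.stored[user_id].append(message)
        for queue in self.connections[user_id]:
            await queue.put(message)


async def demo():
    manager = ConnectionManager()
    socket = manager.connect(user_id=1)
    await manager.notify(1, "new message")     # user 1 is online
    await manager.notify(2, "status updated")  # user 2 is offline
    return await socket.get(), manager.stored[2]


delivered, history_for_offline_user = asyncio.run(demo())
```

A single in-process manager only reaches clients connected to the same instance, which is why the architecture above adds Redis pub/sub for multi-instance distribution.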

Edge Cases

Monolith vs Microservices

Default to modular monolith for teams smaller than 10 developers. A modular monolith provides:
  • Clear module boundaries without network overhead
  • Shared database with module-specific schemas
  • Easy refactoring and code navigation
  • Simple deployment and debugging
Consider microservices only when:
  • Independent scaling is required for specific components
  • Different modules need different technology stacks
  • Team size exceeds 10 and ownership boundaries are clear
  • Deployment independence is a business requirement
Migration path: Design module boundaries in the monolith as if they were services (no direct cross-module database access, communicate via service interfaces). This makes extraction to microservices straightforward when needed.
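A minimal sketch of what "communicate via service interfaces" could look like inside the monolith, assuming two hypothetical modules (billing and accounts) and using `typing.Protocol` for the interface:

```python
from typing import Protocol


class BillingService(Protocol):
    """Public interface of a hypothetical billing module."""

    def outstanding_balance(self, user_id: int) -> int: ...


class LocalBillingService:
    """In-process implementation; owns its data, which other modules never query."""

    def __init__(self) -> None:
        self._balances = {1: 2500}

    def outstanding_balance(self, user_id: int) -> int:
        return self._balances.get(user_id, 0)


class AccountsService:
    """A second module that depends only on the BillingService interface."""

    def __init__(self, billing: BillingService) -> None:
        self.billing = billing

    def can_close_account(self, user_id: int) -> bool:
        return self.billing.outstanding_balance(user_id) == 0


accounts = AccountsService(LocalBillingService())
blocked = accounts.can_close_account(1)  # user 1 still owes money
allowed = accounts.can_close_account(2)  # user 2 has no balance
```

If billing is later extracted, only the implementation behind `BillingService` changes (for example to an HTTP client); `AccountsService` stays as it is.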

When to Break the Layer Pattern

The strict Router → Service → Repository pattern should be followed for standard CRUD operations. Acceptable exceptions:
  • Background tasks: May call services directly without going through a router
  • Event handlers: Domain event listeners may call services from any context
  • CLI commands: Management scripts may access services or repositories directly
  • Migrations: Data migrations may access models directly (no service/repo layer needed)
  • Health checks: May access the database directly for simple connectivity verification
In all cases, business logic should still live in the service layer — these exceptions are about the entry point, not about bypassing business rules.

Evolving Architecture

When the architecture needs to change:
  1. Write an ADR documenting the motivation and the proposed change
  2. Identify all affected modules and their dependencies
  3. Plan an incremental migration — never big-bang rewrites
  4. Maintain backward compatibility during transition (strangler fig pattern)
  5. Set a deadline for completing the migration and removing legacy code