system-architecture

System Architecture

When to Use

Activate this skill when:

Designing a new module, service, or major feature that requires structural decisions
Choosing between architectural approaches (e.g., where to place logic, how to structure data flow)
Planning database schema changes or refactoring existing schema
Making frontend state management decisions (server state vs client state, context vs store)
Evaluating technology trade-offs for a new capability
Creating or reviewing Architecture Decision Records (ADRs)
Setting up a new project or major subsystem from scratch

Input: If `plan.md` exists (from `project-planner`), read it for context about the feature scope and affected modules. Otherwise, work from the user's request directly.

Output: Write architecture decisions to `architecture.md` and create ADRs in `docs/adr/ADR-NNN-<title>.md`. Tell the user: "Architecture written to `architecture.md`. Run `/api-design-patterns` for API contracts or `/task-decomposition` for implementation tasks."

Do NOT use this skill for:

Writing implementation code (use `python-backend-expert` or `react-frontend-expert`)
API contract design or endpoint specifications (use `api-design-patterns`)
Testing patterns or strategies (use `pytest-patterns` or `react-testing-patterns`)
Deployment or infrastructure decisions (use `docker-best-practices` or `deployment-pipeline`)

Instructions

Project Layer Architecture

The standard Python/React full-stack architecture follows a layered pattern with strict dependency direction.

Backend Layers (FastAPI)

HTTP Request
    ↓
┌─────────────────────┐
│   Routers (routes/)  │  ← HTTP concerns: request parsing, response formatting, status codes
│                      │     Uses: Depends() for injection, Pydantic schemas for validation
├─────────────────────┤
│   Services           │  ← Business logic: orchestration, validation rules, domain operations
│   (services/)        │     No HTTP awareness. Raises domain exceptions, not HTTPException.
├─────────────────────┤
│   Repositories       │  ← Data access: queries, CRUD operations, database interactions
│   (repositories/)    │     No business logic. Returns model instances or None.
├─────────────────────┤
│   Models (models/)   │  ← SQLAlchemy ORM models: table definitions, relationships, indexes
│   Schemas (schemas/) │  ← Pydantic v2 models: request/response contracts, validation
└─────────────────────┘
    ↓
Database

Dependency direction rules:

Routers depend on Services (never on Repositories directly)
Services depend on Repositories (never on Routers)
Repositories depend on Models (never on Services)
Schemas are shared across layers but define no dependencies themselves
Never skip layers: no direct database access from routes

Dependency injection pattern:

python

from fastapi import APIRouter, Depends
from sqlalchemy.ext.asyncio import AsyncSession

router = APIRouter()


# Router depends on Service via Depends()
@router.post("/users", response_model=UserResponse)
async def create_user(
    data: UserCreate,
    service: UserService = Depends(get_user_service),
) -> UserResponse:
    return await service.create_user(data)


# Service depends on Repository via constructor injection
class UserService:
    def __init__(self, repo: UserRepository) -> None:
        self.repo = repo


# Repository depends on AsyncSession via Depends()
class UserRepository:
    def __init__(self, session: AsyncSession) -> None:
        self.session = session
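One payoff of constructor injection is that the service layer can be exercised with no HTTP framework or database. The following is a minimal sketch using only the standard library; `InMemoryUserRepository`, `UserRepositoryProtocol`, `EmailAlreadyRegistered`, and the simplified `User` dataclass are hypothetical names introduced for illustration, not part of the project.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class User:
    id: int
    email: str


class EmailAlreadyRegistered(Exception):
    """Domain exception; deliberately unrelated to any HTTP type."""


class UserRepositoryProtocol(Protocol):
    """What the service needs from a repository (assumed interface)."""

    async def get_by_email(self, email: str) -> Optional[User]: ...
    async def add(self, email: str) -> User: ...


class InMemoryUserRepository:
    """Hypothetical test double standing in for a session-backed repository."""

    def __init__(self) -> None:
        self._users: dict[str, User] = {}

    async def get_by_email(self, email: str) -> Optional[User]:
        return self._users.get(email)

    async def add(self, email: str) -> User:
        user = User(id=len(self._users) + 1, email=email)
        self._users[email] = user
        return user


class UserService:
    """Business rules only; the repository arrives through the constructor."""

    def __init__(self, repo: UserRepositoryProtocol) -> None:
        self.repo = repo

    async def create_user(self, email: str) -> User:
        if await self.repo.get_by_email(email) is not None:
            raise EmailAlreadyRegistered(email)
        return await self.repo.add(email)


async def demo() -> tuple[User, bool]:
    service = UserService(InMemoryUserRepository())
    created = await service.create_user("ada@example.com")
    try:
        await service.create_user("ada@example.com")
        duplicate_rejected = False
    except EmailAlreadyRegistered:
        duplicate_rejected = True
    return created, duplicate_rejected


created, duplicate_rejected = asyncio.run(demo())
```

Because the service only sees the repository interface, swapping the in-memory double for a real session-backed repository should not require changes to the business rule.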

Frontend Layers (React/TypeScript)

┌─────────────────────┐
│   Pages (pages/)     │  ← Route-level components: data fetching, layout composition
├─────────────────────┤
│   Layouts            │  ← Page structure: navigation, sidebars, content areas
│   (layouts/)         │
├─────────────────────┤
│   Features           │  ← Domain-specific: UserProfile, OrderList, ChatPanel
│   (features/)        │     Composed from shared components + hooks
├─────────────────────┤
│   Shared Components  │  ← Reusable UI: Button, Modal, Table, Form, Input
│   (components/)      │     No business logic. Configurable via props.
├─────────────────────┤
│   Hooks (hooks/)     │  ← Custom hooks: useAuth, usePagination, useDebounce
│   API (api/)         │  ← API client functions, TanStack Query configurations
├─────────────────────┤
│   Types (types/)     │  ← Shared TypeScript interfaces and type definitions
└─────────────────────┘

Component dependency direction:

Pages import Features and Layouts
Features import Shared Components and Hooks
Shared Components import only other Shared Components and Types
Hooks import API functions and Types
API functions import Types only

Decision Framework

When facing architectural decisions, follow this structured process:

Step 1: Define the Problem

What capability is needed?
What are the non-functional requirements? (performance, scalability, maintainability)
What constraints exist? (team size, timeline, existing infrastructure)

Step 2: Identify Options

List 2-3 viable architectural approaches
For each option, document:
- How it works (brief technical description)
- Advantages
- Disadvantages
- Risks

Step 3: Evaluate Against Criteria

Criterion	Weight	Description
Maintainability	High	Can the team understand, modify, and debug this easily?
Testability	High	Can each component be tested in isolation?
Performance	Medium	Does it meet latency and throughput requirements?
Team familiarity	Medium	Does the team have experience with this approach?
Operational cost	Low	What are the infrastructure and maintenance costs?
Future flexibility	Low	How easily can this evolve as requirements change?
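One way to make the weighting concrete is a small scoring helper. This is a sketch under stated assumptions: the numeric mapping High=3, Medium=2, Low=1, the 1-5 rating scale, and the two sets of option ratings are all invented for illustration and are not prescribed by this skill.

```python
# Assumed numeric mapping for the High/Medium/Low weights in the table.
WEIGHT_VALUES = {"High": 3, "Medium": 2, "Low": 1}

# Criteria and weights as listed in the table above.
CRITERIA = {
    "Maintainability": "High",
    "Testability": "High",
    "Performance": "Medium",
    "Team familiarity": "Medium",
    "Operational cost": "Low",
    "Future flexibility": "Low",
}


def weighted_score(ratings: dict[str, int]) -> int:
    """Sum rating * weight over every criterion; ratings run 1 (poor) to 5 (strong)."""
    missing = CRITERIA.keys() - ratings.keys()
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(WEIGHT_VALUES[CRITERIA[name]] * ratings[name] for name in CRITERIA)


# Hypothetical ratings for two options, invented for illustration only.
options = {
    "Option A": {
        "Maintainability": 4, "Testability": 4, "Performance": 3,
        "Team familiarity": 5, "Operational cost": 3, "Future flexibility": 2,
    },
    "Option B": {
        "Maintainability": 3, "Testability": 3, "Performance": 5,
        "Team familiarity": 2, "Operational cost": 2, "Future flexibility": 4,
    },
}

scores = {name: weighted_score(ratings) for name, ratings in options.items()}
best = max(scores, key=scores.get)
```

With these made-up ratings, Option A scores 45 and Option B scores 38. The number is a conversation aid rather than a verdict; the ADR should still record the reasoning behind each rating.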

Step 4: Decide and Document

Choose the option that best satisfies the weighted criteria
Document the decision in an ADR (see `references/architecture-decision-record-template.md`)
Record what was NOT chosen and why; this context is valuable for future decisions

Step 5: Communicate

Share the ADR with the team
Identify any migration or rollout steps needed
Flag reversibility: is this a one-way door or a two-way door?

Database Schema Design

Design Principles

Start normalized (3NF): denormalize only for proven performance bottlenecks, not speculation
One migration per logical change: each Alembic migration should represent a single, coherent schema modification
Always include downgrade: every migration must have a working `downgrade()` function
Index strategically:
- Primary keys (automatic)
- Foreign keys (always)
- Columns in WHERE clauses of frequent queries
- Composite indexes for multi-column lookups
- Partial indexes for filtered queries (e.g., `WHERE is_active = true`)

SQLAlchemy 2.0 Async Patterns

python

from datetime import datetime

from sqlalchemy import String, func
from sqlalchemy.orm import Mapped, mapped_column, relationship


# Model definition with Mapped types (SQLAlchemy 2.0 style).
# Base is the project's declarative base class.
class User(Base):
    __tablename__ = "users"

    id: Mapped[int] = mapped_column(primary_key=True)
    email: Mapped[str] = mapped_column(String(255), unique=True, index=True)
    is_active: Mapped[bool] = mapped_column(default=True)
    created_at: Mapped[datetime] = mapped_column(server_default=func.now())

    # Relationships: ALWAYS use eager loading with async
    posts: Mapped[list["Post"]] = relationship(
        back_populates="author",
        lazy="selectin",  # or "joined"; NEVER the default lazy="select" with async
    )


**Async session rules:**
- One `AsyncSession` per request — never share across concurrent tasks
- Use `async with` context manager for automatic cleanup
- Map session boundaries to transaction boundaries
- Use `selectin` or `joined` loading — lazy loading is incompatible with asyncio
- Use `run_sync()` only as a last resort for legacy code

Migration Planning

Schema change → Generate migration: `alembic revision --autogenerate -m "description"`
Review generated migration: verify column types, indexes, constraints
Test upgrade: `alembic upgrade head`
Test downgrade: `alembic downgrade -1`
Test data preservation: ensure existing data survives the round-trip
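The round-trip check in the last step can be sketched without Alembic. The example below is only an illustration of the idea, using the standard library's `sqlite3` as a stand-in: the `upgrade`/`downgrade` functions, the `users` table, and the `display_name` column are hypothetical, and a real migration would use Alembic's own operations against the project database.

```python
import sqlite3


def upgrade(conn: sqlite3.Connection) -> None:
    """Add a nullable column; existing rows keep their data."""
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")


def downgrade(conn: sqlite3.Connection) -> None:
    """Remove the column by rebuilding the table, copying the surviving columns."""
    conn.execute("CREATE TABLE users_old (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
    conn.execute("INSERT INTO users_old (id, email) SELECT id, email FROM users")
    conn.execute("DROP TABLE users")
    conn.execute("ALTER TABLE users_old RENAME TO users")


def snapshot(conn: sqlite3.Connection) -> list[tuple[int, str]]:
    """Rows that must be identical before and after the round-trip."""
    return conn.execute("SELECT id, email FROM users ORDER BY id").fetchall()


def columns(conn: sqlite3.Connection) -> list[str]:
    """Column names of the users table, in declaration order."""
    return [row[1] for row in conn.execute("PRAGMA table_info(users)")]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("a@example.com",), ("b@example.com",)],
)

before = snapshot(conn)
upgrade(conn)
columns_after_upgrade = columns(conn)
downgrade(conn)
columns_after_downgrade = columns(conn)
after = snapshot(conn)
```

If `before` and `after` differ, the downgrade is lossy for pre-existing data and the migration needs rework before it ships.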

Frontend Architecture

State Management Decision Tree

Is the data from the server?
├── YES → Use TanStack Query (useQuery, useMutation)
│         Configure staleTime, gcTime, query keys
│
└── NO → Is it needed across multiple components?
         ├── YES → Is it complex with actions/reducers?
         │         ├── YES → Use Zustand store
         │         └── NO  → Use React Context
         │
         └── NO → Use useState / useReducer locally

TanStack Query conventions:

Query keys: `[resource, ...identifiers]` (e.g., `["users", userId]`, `["posts", { page, limit }]`)
Use `queryOptions()` factory to centralize key + fn definitions; this prevents copy-paste key errors
Set `staleTime` based on data freshness needs (default 0 is too aggressive for most cases)
Invalidate with `invalidateQueries()` after mutations, never manual `refetch()`
Handle all states: `isPending`, `isError`, `data`

Component design rules:

Props for configuration, hooks for data
Lift state only as high as needed; no premature context creation
Keep components under 200 lines; extract sub-components or custom hooks when larger
Use `children` and composition over deep prop drilling

Routing Structure

Organize routes to mirror the URL structure:

src/
├── pages/
│   ├── HomePage.tsx           → /
│   ├── LoginPage.tsx          → /login
│   ├── users/
│   │   ├── UserListPage.tsx   → /users
│   │   └── UserDetailPage.tsx → /users/:id
│   └── settings/
│       └── SettingsPage.tsx   → /settings

Cross-Cutting Concerns

Authentication Flow

Login Request
    ↓
Backend: Validate credentials → Generate JWT (access + refresh tokens)
    ↓
Frontend: Keep access token in memory; backend sets refresh token as an httpOnly cookie
    ↓
API Calls: Attach access token via Authorization header
    ↓
Token Expired: Use refresh token to obtain new access token
    ↓
Refresh Failed: Redirect to login

Architecture decisions for auth:

Access tokens: short-lived (15-30 min), stored in memory (not localStorage)
Refresh tokens: longer-lived (7-30 days), stored in httpOnly cookie
Backend: FastAPI `Depends()` chain for token validation → user extraction → permission check
Frontend: Auth context providing `user`, `login()`, `logout()`, `isAuthenticated`

Error Handling Strategy

Errors should be handled at the appropriate layer:

Layer	Error Type	Action
Router	`HTTPException`	Return HTTP error response with status code
Service	Domain exceptions	Raise custom exceptions (e.g., `UserNotFoundError`)
Repository	Database exceptions	Catch and re-raise as domain exceptions or let propagate
Frontend	API errors	Display user-friendly messages, retry where appropriate

Backend exception hierarchy:

python

class AppError(Exception):
    """Base application error."""

class NotFoundError(AppError):
    """Resource not found."""

class ConflictError(AppError):
    """Resource conflict (duplicate, version mismatch)."""

class ValidationError(AppError):
    """Business rule violation."""

Router-level exception handler maps domain exceptions to HTTP responses:

python

@app.exception_handler(NotFoundError)
async def not_found_handler(request: Request, exc: NotFoundError):
    return JSONResponse(status_code=404, content={"detail": str(exc)})
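Registering one handler per exception class scales poorly as the hierarchy grows. One alternative, sketched here with the standard library only, is a single lookup table that resolves the most specific registered base class. The status codes other than 404, the `UserNotFoundError` subclass placement, and the `status_for` helper are assumptions made for illustration.

```python
class AppError(Exception):
    """Base application error."""


class NotFoundError(AppError):
    """Resource not found."""


class ConflictError(AppError):
    """Resource conflict (duplicate, version mismatch)."""


class ValidationError(AppError):
    """Business rule violation."""


class UserNotFoundError(NotFoundError):
    """Hypothetical domain-specific subclass."""


# Assumed mapping; only the 404 entry comes from the handler shown above.
STATUS_BY_ERROR: dict[type, int] = {
    NotFoundError: 404,
    ConflictError: 409,
    ValidationError: 422,
    AppError: 500,
}


def status_for(exc: Exception) -> int:
    """Walk the exception's MRO so subclasses inherit their parent's status."""
    for cls in type(exc).__mro__:
        if cls in STATUS_BY_ERROR:
            return STATUS_BY_ERROR[cls]
    return 500  # anything outside the hierarchy is an unexpected failure


not_found_status = status_for(UserNotFoundError("user 7"))
conflict_status = status_for(ConflictError("duplicate email"))
unexpected_status = status_for(RuntimeError("boom"))
```

A single handler registered for `AppError` could then call `status_for(exc)` to pick the response code, keeping the mapping in one place.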

Logging Architecture

Backend (structlog):

Structured JSON logs in production
Human-readable console in development
Bind request context (request_id, user_id) at middleware level
Log at service layer (business events), not repository layer (too noisy)
Use log levels: DEBUG (development only), INFO (business events), WARNING (recoverable issues), ERROR (failures requiring attention)
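The idea of binding request context once and logging business events from the service layer can be sketched with the standard library alone. This is not structlog's API; it uses `logging` and `contextvars` as a stand-in, and `JsonFormatter`, `handle_request`, and the `order_placed` event name are hypothetical.

```python
import contextvars
import io
import json
import logging

# Request-scoped context, bound once per request (the middleware's job).
request_id_var = contextvars.ContextVar("request_id", default=None)
user_id_var = contextvars.ContextVar("user_id", default=None)


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per record, including the bound request context."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "event": record.getMessage(),
            "request_id": request_id_var.get(),
            "user_id": user_id_var.get(),
        })


def build_logger(stream: io.StringIO) -> logging.Logger:
    logger = logging.getLogger("app.services")
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def handle_request(logger: logging.Logger, request_id: str, user_id: int) -> None:
    # Middleware step: bind context, then always unbind when the request ends.
    request_token = request_id_var.set(request_id)
    user_token = user_id_var.set(user_id)
    try:
        logger.info("order_placed")  # service-layer business event
    finally:
        request_id_var.reset(request_token)
        user_id_var.reset(user_token)


stream = io.StringIO()
logger = build_logger(stream)
handle_request(logger, "req-123", 42)
entry = json.loads(stream.getvalue())
```

The service code never passes `request_id` around explicitly; it only names the event, and the context travels with the request.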

Frontend:

`console.*` in development
Structured error reporting to backend or Sentry in production
Log user actions for debugging, not for analytics

Configuration Management

Backend (pydantic-settings):

python

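from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    # Values are read from the environment, falling back to the .env file
    model_config = SettingsConfigDict(env_file=".env")

    database_url: str
    redis_url: str = "redis://localhost:6379"
    jwt_secret: str
    debug: bool = False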

Frontend (environment variables):

`VITE_API_URL` for API base URL
Build-time injection via Vite's `import.meta.env`
No secrets in frontend environment variables

Output Files

architecture.md

Write the architecture document to `architecture.md` at the project root:

markdown

Architecture: [Feature/System Name]

Overview

[1-2 sentence summary of the architectural approach]

Layer Structure

[Backend and frontend layer descriptions from this skill's patterns]

Key Decisions

[Summary of decisions made, with links to ADRs]

Database Schema

[Entity descriptions, relationships, key indexes]

Cross-Cutting Concerns

[Auth, error handling, logging approach]

Next Steps

Run `/api-design-patterns` to define API contracts
Run `/task-decomposition` to create implementation tasks

ADRs

For each significant decision, create an ADR in `docs/adr/`:

markdown

ADR-NNN: [Decision Title]

Status

Accepted | Proposed | Superseded

Context

[Why this decision is needed]

Decision

[What we decided]

Consequences

[Positive and negative outcomes]

Number ADRs sequentially (ADR-001, ADR-002, etc.).
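Sequential numbering is easy to get wrong by hand when several ADRs land close together. A small helper can derive the next filename from what already exists in `docs/adr/`. This is a sketch: the `next_adr_path` and `slugify` helpers and the lowercase-hyphen slug format are assumptions, since the skill only specifies the `ADR-NNN-<title>.md` shape.

```python
import re
import tempfile
from pathlib import Path

# Matches the ADR-NNN-<title>.md shape and captures the three-digit number.
ADR_PATTERN = re.compile(r"^ADR-(\d{3})-.+\.md$")


def slugify(title: str) -> str:
    """Lowercase the title and collapse non-alphanumeric runs into hyphens (assumed style)."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def next_adr_path(adr_dir: Path, title: str) -> Path:
    """Return the path for the next ADR, one past the highest existing number."""
    numbers = [
        int(match.group(1))
        for path in adr_dir.glob("ADR-*.md")
        if (match := ADR_PATTERN.match(path.name))
    ]
    next_number = max(numbers, default=0) + 1
    return adr_dir / f"ADR-{next_number:03d}-{slugify(title)}.md"


# Demonstration against a throwaway directory with two existing ADRs.
with tempfile.TemporaryDirectory() as tmp:
    adr_dir = Path(tmp) / "docs" / "adr"
    adr_dir.mkdir(parents=True)
    (adr_dir / "ADR-001-use-fastapi.md").touch()
    (adr_dir / "ADR-002-use-websockets.md").touch()
    new_name = next_adr_path(adr_dir, "Use Redis pub/sub").name
```

Taking the maximum rather than counting files keeps numbering stable even if an older ADR file was removed or renamed.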

Examples

Architecture Decision: Real-Time Notifications

Problem: The application needs real-time notifications for users (new messages, status updates).

Options evaluated:

Option	Pros	Cons
WebSocket	True bidirectional, low latency	Complex connection management, harder to scale
Server-Sent Events (SSE)	Simple, HTTP-based, auto-reconnect	Unidirectional (server→client only), limited browser connections
Polling	Simplest implementation, works everywhere	Higher latency, unnecessary server load

Decision: WebSocket for this use case.

Rationale: Notifications require low latency and the system will eventually need bidirectional communication (typing indicators, presence). SSE would work for notifications alone but would require a separate solution for future bidirectional needs. Polling introduces unacceptable latency for real-time UX.

Architecture:

Backend: FastAPI WebSocket endpoint with `ConnectionManager` class
Frontend: Custom `useWebSocket` hook with automatic reconnection
Scaling: Redis pub/sub for multi-instance message distribution
Persistence: Store notifications in database for offline users
Fallback: REST endpoint for notification history and initial load

See `references/architecture-decision-record-template.md` for the full ADR format.

Edge Cases

Monolith vs Microservices

Default to modular monolith for teams smaller than 10 developers. A modular monolith provides:

Clear module boundaries without network overhead
Shared database with module-specific schemas
Easy refactoring and code navigation
Simple deployment and debugging

Consider microservices only when:

Independent scaling is required for specific components
Different modules need different technology stacks
Team size exceeds 10 and ownership boundaries are clear
Deployment independence is a business requirement

Migration path: Design module boundaries in the monolith as if they were services (no direct cross-module database access, communicate via service interfaces). This makes extraction to microservices straightforward when needed.
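The migration path above can be illustrated with a typed module interface. In this sketch the `users` and `orders` modules, the `UsersApi` protocol, and every class name are hypothetical; the point is only that the orders code depends on an interface rather than on the users module's tables.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class UserSummary:
    """Data the users module agrees to expose across its boundary."""
    id: int
    email: str


class UsersApi(Protocol):
    """Public interface of the (hypothetical) users module."""

    def get_user_summary(self, user_id: int) -> UserSummary: ...


class LocalUsersApi:
    """In-process implementation used while everything ships as one deployable."""

    def __init__(self, rows: dict[int, str]) -> None:
        self._rows = rows  # stands in for the users module's own tables

    def get_user_summary(self, user_id: int) -> UserSummary:
        return UserSummary(id=user_id, email=self._rows[user_id])


class OrderService:
    """Orders module: talks to users through the interface, never its tables."""

    def __init__(self, users: UsersApi) -> None:
        self._users = users

    def confirmation_recipient(self, user_id: int) -> str:
        return self._users.get_user_summary(user_id).email


orders = OrderService(LocalUsersApi({1: "ada@example.com"}))
recipient = orders.confirmation_recipient(1)
```

If the users module is later extracted, an HTTP-backed class with the same `get_user_summary` method could replace `LocalUsersApi` without changes to `OrderService`.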

When to Break the Layer Pattern

The strict Router → Service → Repository pattern should be followed for standard CRUD operations. Acceptable exceptions:

Background tasks: May call services directly without going through a router
Event handlers: Domain event listeners may call services from any context
CLI commands: Management scripts may access services or repositories directly
Migrations: Data migrations may access models directly (no service/repo layer needed)
Health checks: May access the database directly for simple connectivity verification

In all cases, business logic should still live in the service layer — these exceptions are about the entry point, not about bypassing business rules.

Evolving Architecture

When the architecture needs to change:

Write an ADR documenting the motivation and the proposed change
Identify all affected modules and their dependencies
Plan an incremental migration — never big-bang rewrites
Maintain backward compatibility during transition (strangler fig pattern)
Set a deadline for completing the migration and removing legacy code
