postgresql

Compare original and translation side by side

🇺🇸

Original

English

🇨🇳

Translation

Chinese

PostgreSQL Table Design

PostgreSQL 表设计

Use this skill when

适用场景

Designing a schema for PostgreSQL
Selecting data types and constraints
Planning indexes, partitions, or RLS policies
Reviewing tables for scale and maintainability

为PostgreSQL设计数据库架构
选择数据类型与约束
规划索引、分区或RLS策略
针对可扩展性与可维护性评审表结构

Do not use this skill when

不适用场景

You are targeting a non-PostgreSQL database
You only need query tuning without schema changes
You require a DB-agnostic modeling guide

目标数据库非PostgreSQL
仅需查询调优而无需修改架构
需要与数据库无关的建模指南

Instructions

操作步骤

Capture entities, access patterns, and scale targets (rows, QPS, retention).
Choose data types and constraints that enforce invariants.
Add indexes for real query paths and validate with
```
EXPLAIN
```
.
Plan partitioning or RLS where required by scale or access control.
Review migration impact and apply changes safely.

明确实体、访问模式及规模目标（行数、每秒查询量QPS、数据保留周期）。
选择能强制数据一致性的数据类型和约束。
为实际查询路径添加索引，并使用
```
EXPLAIN
```
验证。
根据规模或访问控制需求规划分区或RLS。
评估迁移影响并安全应用变更。

Safety

注意事项

Avoid destructive DDL on production without backups and a rollback plan.
Use migrations and staging validation before applying schema changes.

生产环境中执行破坏性DDL前必须备份并制定回滚方案。
应用架构变更前，先通过迁移脚本在预发布环境验证。

Core Rules

核心规则

Define a PRIMARY KEY for reference tables (users, orders, etc.). Not always needed for time-series/event/log data. When used, prefer
```
BIGINT GENERATED ALWAYS AS IDENTITY
```
; use
```
UUID
```
only when global uniqueness/opacity is needed.
Normalize first (to 3NF) to eliminate data redundancy and update anomalies; denormalize only for measured, high-ROI reads where join performance is proven problematic. Premature denormalization creates maintenance burden.
Add NOT NULL everywhere it’s semantically required; use DEFAULTs for common values.
Create indexes for access paths you actually query: PK/unique (auto), FK columns (manual!), frequent filters/sorts, and join keys.
Prefer TIMESTAMPTZ for event time; NUMERIC for money; TEXT for strings; BIGINT for integer values, DOUBLE PRECISION for floats (or
```
NUMERIC
```
for exact decimal arithmetic).

为参考表（如users、orders等）定义PRIMARY KEY。时间序列/事件/日志数据并非必须。使用时优先选择
```
BIGINT GENERATED ALWAYS AS IDENTITY
```
；仅当需要全局唯一性/不透明ID时才使用
```
UUID
```
。
先规范化至3NF以消除数据冗余和更新异常；仅在已证实连接性能存在问题、且读取性能提升投资回报率高的情况下，才进行反规范化。过早反规范化会增加维护负担。
在所有语义上需要非空的字段添加NOT NULL约束；为常用值设置DEFAULT默认值。
为实际查询的访问路径创建索引：主键/唯一键（自动创建）、外键列（需手动创建！）、频繁过滤/排序的字段，以及连接键。
事件时间优先使用TIMESTAMPTZ；金额使用NUMERIC；字符串使用TEXT；整数值使用BIGINT；浮点数使用DOUBLE PRECISION（如需精确十进制运算则使用
```
NUMERIC
```
）。

PostgreSQL “Gotchas”

PostgreSQL 常见陷阱

Identifiers: unquoted → lowercased. Avoid quoted/mixed-case names. Convention: use
```
snake_case
```
for table/column names.
Unique + NULLs: UNIQUE allows multiple NULLs. Use
```
UNIQUE (...) NULLS NOT DISTINCT
```
(PG15+) to restrict to one NULL.
FK indexes: PostgreSQL does not auto-index FK columns. Add them.
No silent coercions: length/precision overflows error out (no truncation). Example: inserting 999 into
```
NUMERIC(2,0)
```
fails with error, unlike some databases that silently truncate or round.
Sequences/identity have gaps (normal; don't "fix"). Rollbacks, crashes, and concurrent transactions create gaps in ID sequences (1, 2, 5, 6...). This is expected behavior—don't try to make IDs consecutive.
Heap storage: no clustered PK by default (unlike SQL Server/MySQL InnoDB);
```
CLUSTER
```
is one-off reorganization, not maintained on subsequent inserts. Row order on disk is insertion order unless explicitly clustered.
MVCC: updates/deletes leave dead tuples; vacuum handles them—design to avoid hot wide-row churn.

标识符：未加引号的标识符会被转换为小写。避免使用带引号/大小写混合的名称。命名规范：表/列名使用
```
snake_case
```
。
唯一约束与NULL值：UNIQUE约束允许多个NULL值。使用
```
UNIQUE (...) NULLS NOT DISTINCT
```
（PostgreSQL 15+版本支持）限制仅允许一个NULL值。
外键索引：PostgreSQL不会自动为外键列创建索引，需要手动添加。
无隐式类型转换：长度/精度溢出会直接报错（不会截断）。例如：将999插入
```
NUMERIC(2,0)
```
字段会失败并报错，这与某些会自动截断或四舍五入的数据库不同。
序列/自增字段存在间隙（正常现象，无需“修复”）。回滚、崩溃及并发事务会导致ID序列出现间隙（如1,2,5,6...）。这是预期行为，不要尝试让ID连续。
堆存储：默认没有聚簇主键（与SQL Server/MySQL InnoDB不同）；
```
CLUSTER
```
命令仅执行一次性重组，后续插入不会维护聚簇顺序。磁盘上的行顺序为插入顺序，除非显式执行聚簇操作。
MVCC（多版本并发控制）：更新/删除操作会留下死元组；由vacuum进程清理——设计时应避免频繁更新宽行数据导致的膨胀。

Data Types

数据类型

IDs:
```
BIGINT GENERATED ALWAYS AS IDENTITY
```
preferred (
```
GENERATED BY DEFAULT
```
also fine);
```
UUID
```
when merging/federating/used in a distributed system or for opaque IDs. Generate with
```
uuidv7()
```
(preferred if using PG18+) or
```
gen_random_uuid()
```
(if using an older PG version).
Integers: prefer
```
BIGINT
```
unless storage space is critical;
```
INTEGER
```
for smaller ranges; avoid
```
SMALLINT
```
unless constrained.
Floats: prefer
```
DOUBLE PRECISION
```
over
```
REAL
```
unless storage space is critical. Use
```
NUMERIC
```
for exact decimal arithmetic.
Strings: prefer
```
TEXT
```
; if length limits needed, use
```
CHECK (LENGTH(col) <= n)
```
instead of
```
VARCHAR(n)
```
; avoid
```
CHAR(n)
```
. Use
```
BYTEA
```
for binary data. Large strings/binary (>2KB default threshold) automatically stored in TOAST with compression. TOAST storage:
```
PLAIN
```
(no TOAST),
```
EXTENDED
```
(compress + out-of-line),
```
EXTERNAL
```
(out-of-line, no compress),
```
MAIN
```
(compress, keep in-line if possible). Default
```
EXTENDED
```
usually optimal. Control with
```
ALTER TABLE tbl ALTER COLUMN col SET STORAGE strategy
```
and
```
ALTER TABLE tbl SET (toast_tuple_target = 4096)
```
for threshold. Case-insensitive: for locale/accent handling use non-deterministic collations; for plain ASCII use expression indexes on
```
LOWER(col)
```
(preferred unless column needs case-insensitive PK/FK/UNIQUE) or
```
CITEXT
```
.
Money:
```
NUMERIC(p,s)
```
(never float).
Time:
```
TIMESTAMPTZ
```
for timestamps;
```
DATE
```
for date-only;
```
INTERVAL
```
for durations. Avoid
```
TIMESTAMP
```
(without timezone). Use
```
now()
```
for transaction start time,
```
clock_timestamp()
```
for current wall-clock time.
Booleans:
```
BOOLEAN
```
with
```
NOT NULL
```
constraint unless tri-state values are required.
Enums:
```
CREATE TYPE ... AS ENUM
```
for small, stable sets (e.g. US states, days of week). For business-logic-driven and evolving values (e.g. order statuses) → use TEXT (or INT) + CHECK or lookup table.
Arrays:
```
TEXT[]
```
,
```
INTEGER[]
```
, etc. Use for ordered lists where you query elements. Index with GIN for containment (
```
@>
```
,
```
<@
```
) and overlap (
```
&&
```
) queries. Access:
```
arr[1]
```
(1-indexed),
```
arr[1:3]
```
(slicing). Good for tags, categories; avoid for relations—use junction tables instead. Literal syntax:
```
'{val1,val2}'
```
or
```
ARRAY[val1,val2]
```
.
Range types:
```
daterange
```
,
```
numrange
```
,
```
tstzrange
```
for intervals. Support overlap (
```
&&
```
), containment (
```
@>
```
), operators. Index with GiST. Good for scheduling, versioning, numeric ranges. Pick a bounds scheme and use it consistently; prefer
```
[)
```
(inclusive/exclusive) by default.
Network types:
```
INET
```
for IP addresses,
```
CIDR
```
for network ranges,
```
MACADDR
```
for MAC addresses. Support network operators (
```
<<
```
,
```
>>
```
,
```
&&
```
).
Geometric types:
```
POINT
```
,
```
LINE
```
,
```
POLYGON
```
,
```
CIRCLE
```
for 2D spatial data. Index with GiST. Consider PostGIS for advanced spatial features.
Text search:
```
TSVECTOR
```
for full-text search documents,
```
TSQUERY
```
for search queries. Index
```
tsvector
```
with GIN. Always specify language:
```
to_tsvector('english', col)
```
and
```
to_tsquery('english', 'query')
```
. Never use single-argument versions. This applies to both index expressions and queries.
Domain types:
```
CREATE DOMAIN email AS TEXT CHECK (VALUE ~ '^[^@]+@[^@]+$')
```
for reusable custom types with validation. Enforces constraints across tables.
Composite types:
```
CREATE TYPE address AS (street TEXT, city TEXT, zip TEXT)
```
for structured data within columns. Access with
```
(col).field
```
syntax.
JSONB: preferred over JSON; index with GIN. Use only for optional/semi-structured attrs. ONLY use JSON if the original ordering of the contents MUST be preserved.
Vector types:
```
vector
```
type by
```
pgvector
```
for vector similarity search for embeddings.

ID字段：优先选择
```
BIGINT GENERATED ALWAYS AS IDENTITY
```
（
```
GENERATED BY DEFAULT
```
也可）；在分布式系统中进行数据合并/联邦或需要不透明ID时使用
```
UUID
```
。使用
```
uuidv7()
```
（PostgreSQL 18+版本优先）或
```
gen_random_uuid()
```
（旧版本）生成。
整数：优先使用
```
BIGINT
```
，除非存储空间受限；小范围数值使用
```
INTEGER
```
；避免使用
```
SMALLINT
```
，除非有明确限制。
浮点数：优先使用
```
DOUBLE PRECISION
```
而非
```
REAL
```
，除非存储空间受限。如需精确十进制运算，使用
```
NUMERIC
```
。
字符串：优先使用
```
TEXT
```
；如果需要长度限制，使用
```
CHECK (LENGTH(col) <= n)
```
而非
```
VARCHAR(n)
```
；避免使用
```
CHAR(n)
```
。二进制数据使用
```
BYTEA
```
。大字符串/二进制数据（超过默认阈值2KB）会自动存储在TOAST中并压缩。TOAST存储策略：
```
PLAIN
```
（不使用TOAST）、
```
EXTENDED
```
（压缩+离线存储）、
```
EXTERNAL
```
（离线存储，不压缩）、
```
MAIN
```
（压缩，尽可能在线存储）。默认
```
EXTENDED
```
通常是最优选择。使用
```
ALTER TABLE tbl ALTER COLUMN col SET STORAGE strategy
```
和
```
ALTER TABLE tbl SET (toast_tuple_target = 4096)
```
调整阈值。大小写不敏感处理：针对本地化/重音处理使用非确定性排序规则；纯ASCII文本优先使用基于
```
LOWER(col)
```
的表达式索引（除非列需要大小写不敏感的PK/FK/UNIQUE约束），或使用
```
CITEXT
```
类型。
金额：使用
```
NUMERIC(p,s)
```
（绝不要用浮点数）。
时间：时间戳使用
```
TIMESTAMPTZ
```
；仅日期使用
```
DATE
```
；时长使用
```
INTERVAL
```
。避免使用
```
TIMESTAMP
```
（无时区）。使用
```
now()
```
获取事务开始时间，
```
clock_timestamp()
```
获取当前系统时间。
布尔值：使用
```
BOOLEAN
```
并添加
```
NOT NULL
```
约束，除非需要三态值。
枚举类型：使用
```
CREATE TYPE ... AS ENUM
```
定义小型、稳定的集合（如美国各州、星期几）。针对业务逻辑驱动且不断变化的值（如订单状态）→ 使用TEXT（或INT）+ CHECK约束或 lookup表。
数组类型：
```
TEXT[]
```
、
```
INTEGER[]
```
等。用于存储需要查询元素的有序列表。使用GIN索引支持包含查询（
```
@>
```
、
```
<@
```
）和重叠查询（
```
&&
```
）。访问方式：
```
arr[1]
```
（从1开始索引）、
```
arr[1:3]
```
（切片）。适用于标签、分类；避免用于关系数据——应使用关联表。字面量语法：
```
'{val1,val2}'
```
或
```
ARRAY[val1,val2]
```
。
范围类型：
```
daterange
```
、
```
numrange
```
、
```
tstzrange
```
用于表示区间。支持重叠（
```
&&
```
）、包含（
```
@>
```
）等操作符。使用GiST索引。适用于调度、版本控制、数值范围。选择一种边界方案并保持一致；默认优先使用
```
[)
```
（左闭右开）。
网络类型：
```
INET
```
用于IP地址，
```
CIDR
```
用于网络范围，
```
MACADDR
```
用于MAC地址。支持网络操作符（
```
<<
```
、
```
>>
```
、
```
&&
```
）。
几何类型：
```
POINT
```
、
```
LINE
```
、
```
POLYGON
```
、
```
CIRCLE
```
用于二维空间数据。使用GiST索引。如需高级空间功能，考虑使用PostGIS扩展。
全文搜索：
```
TSVECTOR
```
用于全文搜索文档，
```
TSQUERY
```
用于搜索查询。使用GIN索引
```
tsvector
```
字段。必须指定语言：
```
to_tsvector('english', col)
```
和
```
to_tsquery('english', 'query')
```
。绝不要使用单参数版本。此规则适用于索引表达式和查询语句。
域类型：
```
CREATE DOMAIN email AS TEXT CHECK (VALUE ~ '^[^@]+@[^@]+$')
```
用于定义可复用的带验证的自定义类型。可跨表强制约束。
复合类型：
```
CREATE TYPE address AS (street TEXT, city TEXT, zip TEXT)
```
用于在列中存储结构化数据。使用
```
(col).field
```
语法访问字段。
JSONB：优先于JSON使用；使用GIN索引。仅用于可选/半结构化属性。只有当必须保留内容原始顺序时才使用JSON。
向量类型：由
```
pgvector
```
扩展提供的
```
vector
```
类型，用于向量相似度搜索（如嵌入向量）。

Do not use the following data types

应避免使用的数据类型

DO NOT use
```
timestamp
```
(without time zone); DO use
```
timestamptz
```
instead.
DO NOT use
```
char(n)
```
or
```
varchar(n)
```
; DO use
```
text
```
instead.
DO NOT use
```
money
```
type; DO use
```
numeric
```
instead.
DO NOT use
```
timetz
```
type; DO use
```
timestamptz
```
instead.
DO NOT use
```
timestamptz(0)
```
or any other precision specification; DO use
```
timestamptz
```
instead
DO NOT use
```
serial
```
type; DO use
```
generated always as identity
```
instead.

不要使用
```
timestamp
```
（无时区）；应使用
```
timestamptz
```
替代。
不要使用
```
char(n)
```
或
```
varchar(n)
```
；应使用
```
text
```
替代。
不要使用
```
money
```
类型；应使用
```
numeric
```
替代。
不要使用
```
timetz
```
类型；应使用
```
timestamptz
```
替代。
不要使用
```
timestamptz(0)
```
或其他精度指定；应使用
```
timestamptz
```
替代。
不要使用
```
serial
```
类型；应使用
```
generated always as identity
```
替代。

Table Types

表类型

Regular: default; fully durable, logged.
TEMPORARY: session-scoped, auto-dropped, not logged. Faster for scratch work.
UNLOGGED: persistent but not crash-safe. Faster writes; good for caches/staging.

常规表：默认类型；完全持久化、已记录日志。
临时表（TEMPORARY）：会话作用域，自动销毁，不记录日志。适合临时计算。
非日志表（UNLOGGED）：持久化但不保证崩溃安全。写入速度更快；适用于缓存/临时 staging 数据。

Row-Level Security

行级安全（Row-Level Security）

Enable with

ALTER TABLE tbl ENABLE ROW LEVEL SECURITY

. Create policies:

CREATE POLICY user_access ON orders FOR SELECT TO app_users USING (user_id = current_user_id())

. Built-in user-based access control at the row level.

使用

ALTER TABLE tbl ENABLE ROW LEVEL SECURITY

启用。创建策略：

CREATE POLICY user_access ON orders FOR SELECT TO app_users USING (user_id = current_user_id())

。实现基于用户的行级内置访问控制。

Constraints

约束

PK: implicit UNIQUE + NOT NULL; creates a B-tree index.
FK: specify
```
ON DELETE/UPDATE
```
action (
```
CASCADE
```
,
```
RESTRICT
```
,
```
SET NULL
```
,
```
SET DEFAULT
```
). Add explicit index on referencing column—speeds up joins and prevents locking issues on parent deletes/updates. Use
```
DEFERRABLE INITIALLY DEFERRED
```
for circular FK dependencies checked at transaction end.
UNIQUE: creates a B-tree index; allows multiple NULLs unless
```
NULLS NOT DISTINCT
```
(PG15+). Standard behavior:
```
(1, NULL)
```
and
```
(1, NULL)
```
are allowed. With
```
NULLS NOT DISTINCT
```
: only one
```
(1, NULL)
```
allowed. Prefer
```
NULLS NOT DISTINCT
```
unless you specifically need duplicate NULLs.
CHECK: row-local constraints; NULL values pass the check (three-valued logic). Example:
```
CHECK (price > 0)
```
allows NULL prices. Combine with
```
NOT NULL
```
to enforce:
```
price NUMERIC NOT NULL CHECK (price > 0)
```
.
EXCLUDE: prevents overlapping values using operators.
```
EXCLUDE USING gist (room_id WITH =, booking_period WITH &&)
```
prevents double-booking rooms. Requires appropriate index type (often GiST).

PRIMARY KEY（主键）：隐式包含UNIQUE + NOT NULL约束；会创建B-tree索引。
FOREIGN KEY（外键）：指定
```
ON DELETE/UPDATE
```
动作（
```
CASCADE
```
、
```
RESTRICT
```
、
```
SET NULL
```
、
```
SET DEFAULT
```
）。为引用列显式添加索引——可加速连接并避免父表删除/更新时的锁问题。对于循环外键依赖，使用
```
DEFERRABLE INITIALLY DEFERRED
```
在事务结束时检查。
UNIQUE（唯一约束）：创建B-tree索引；允许多个NULL值，除非使用
```
NULLS NOT DISTINCT
```
（PostgreSQL 15+）。标准行为：
```
(1, NULL)
```
和
```
(1, NULL)
```
是允许的。使用
```
NULLS NOT DISTINCT
```
时：仅允许一个
```
(1, NULL)
```
。除非明确需要重复NULL值，否则优先使用
```
NULLS NOT DISTINCT
```
。
CHECK（检查约束）：行级约束；NULL值会通过检查（三值逻辑）。示例：
```
CHECK (price > 0)
```
允许price为NULL。结合
```
NOT NULL
```
约束强制非空：
```
price NUMERIC NOT NULL CHECK (price > 0)
```
。
EXCLUDE（排除约束）：使用操作符防止重叠值。
```
EXCLUDE USING gist (room_id WITH =, booking_period WITH &&)
```
可防止房间重复预订。需要对应的索引类型（通常是GiST）。

Indexing

索引

B-tree: default for equality/range queries (
```
=
```
,
```
<
```
,
```
>
```
,
```
BETWEEN
```
,
```
ORDER BY
```
)
Composite: order matters—index used if equality on leftmost prefix (
```
WHERE a = ? AND b > ?
```
uses index on
```
(a,b)
```
, but
```
WHERE b = ?
```
does not). Put most selective/frequently filtered columns first.
Covering:
```
CREATE INDEX ON tbl (id) INCLUDE (name, email)
```
- includes non-key columns for index-only scans without visiting table.

Partial: for hot subsets (

WHERE status = 'active'

→

CREATE INDEX ON tbl (user_id) WHERE status = 'active'

). Any query with

status = 'active'

can use this index.

Expression: for computed search keys (
```
CREATE INDEX ON tbl (LOWER(email))
```
). Expression must match exactly in WHERE clause:
```
WHERE LOWER(email) = 'user@example.com'
```
.
GIN: JSONB containment/existence, arrays (
```
@>
```
,
```
?
```
), full-text search (
```
@@
```
)
GiST: ranges, geometry, exclusion constraints
BRIN: very large, naturally ordered data (time-series)—minimal storage overhead. Effective when row order on disk correlates with indexed column (insertion order or after
```
CLUSTER
```
).

B-tree：默认索引，适用于等值/范围查询（
```
=
```
、
```
<
```
、
```
>
```
、
```
BETWEEN
```
、
```
ORDER BY
```
）
复合索引：列顺序很重要——如果查询对最左侧前缀列使用等值条件（如
```
WHERE a = ? AND b > ?
```
），则会使用
```
(a,b)
```
索引；但
```
WHERE b = ?
```
不会使用该索引。将选择性最高/过滤最频繁的列放在前面。
覆盖索引：
```
CREATE INDEX ON tbl (id) INCLUDE (name, email)
```
——包含非键列，支持无需访问表的仅索引扫描。
部分索引：针对热点子集（如
```
WHERE status = 'active'
```
→
```
CREATE INDEX ON tbl (user_id) WHERE status = 'active'
```
）。任何包含
```
status = 'active'
```
的查询都可以使用该索引。
表达式索引：针对计算后的搜索键（如
```
CREATE INDEX ON tbl (LOWER(email))
```
）。WHERE子句中的表达式必须与索引完全匹配：
```
WHERE LOWER(email) = 'user@example.com'
```
。
GIN：适用于JSONB包含/存在查询、数组（
```
@>
```
、
```
?
```
）、全文搜索（
```
@@
```
）
GiST：适用于范围、几何类型、排除约束
BRIN：适用于超大规模、自然有序的数据（如时间序列）——存储开销极小。当磁盘上的行顺序与索引列相关时（插入顺序或执行
```
CLUSTER
```
后）效果最佳。

Partitioning

分区

Use for very large tables (>100M rows) where queries consistently filter on partition key (often time/date).
Alternate use: use for tables where data maintenance tasks dictates e.g. data pruned or bulk replaced periodically
RANGE: common for time-series (
```
PARTITION BY RANGE (created_at)
```
). Create partitions:
```
CREATE TABLE logs_2024_01 PARTITION OF logs FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')
```
. TimescaleDB automates time-based or ID-based partitioning with retention policies and compression.

LIST: for discrete values (

PARTITION BY LIST (region)

). Example:

FOR VALUES IN ('us-east', 'us-west')

HASH: for even distribution when no natural key (
```
PARTITION BY HASH (user_id)
```
). Creates N partitions with modulus.
Constraint exclusion: requires
```
CHECK
```
constraints on partitions for query planner to prune. Auto-created for declarative partitioning (PG10+).
Prefer declarative partitioning or hypertables. Do NOT use table inheritance.
Limitations: no global UNIQUE constraints—include partition key in PK/UNIQUE. FKs from partitioned tables not supported; use triggers.

适用于超大型表（超过1亿行），且查询通常按分区键过滤（通常是时间/日期）。
另一种场景：数据维护需求驱动，例如定期清理或批量替换数据
RANGE分区：常用于时间序列数据（
```
PARTITION BY RANGE (created_at)
```
）。创建分区：
```
CREATE TABLE logs_2024_01 PARTITION OF logs FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')
```
。TimescaleDB扩展可自动实现基于时间或ID的分区，并提供保留策略和压缩功能。

LIST分区：针对离散值（

PARTITION BY LIST (region)

）。示例：

FOR VALUES IN ('us-east', 'us-west')

。

HASH分区：无自然键时实现均匀分布（
```
PARTITION BY HASH (user_id)
```
）。通过取模创建N个分区。
约束排除：需要为分区添加
```
CHECK
```
约束，以便查询规划器能裁剪分区。声明式分区（PostgreSQL 10+）会自动创建。
优先使用声明式分区或 hypertables。不要使用表继承。
限制：无全局UNIQUE约束——需将分区键包含在PK/UNIQUE约束中。分区表的外键不被支持；需使用触发器。

Special Considerations

特殊场景考量

Update-Heavy Tables

更新频繁的表

Separate hot/cold columns—put frequently updated columns in separate table to minimize bloat.
Use
fillfactor=90
to leave space for HOT updates that avoid index maintenance.
Avoid updating indexed columns—prevents beneficial HOT updates.
Partition by update patterns—separate frequently updated rows in a different partition from stable data.

分离冷热列——将频繁更新的列放在单独的表中，以减少数据膨胀。
设置
fillfactor=90
——预留空间以支持无需维护索引的HOT更新。
避免更新已索引的列——会阻碍有效的HOT更新。
按更新模式分区——将频繁更新的行与稳定数据放在不同分区。

Insert-Heavy Workloads

插入频繁的工作负载

Minimize indexes—only create what you query; every index slows inserts.
Use
COPY
or multi-row
INSERT
instead of single-row inserts.
UNLOGGED tables for rebuildable staging data—much faster writes.
Defer index creation for bulk loads—>drop index, load data, recreate indexes.
Partition by time/hash to distribute load. TimescaleDB automates partitioning and compression of insert-heavy data.
Use a natural key for primary key such as a (timestamp, device_id) if enforcing global uniqueness is important many insert-heavy tables don't need a primary key at all.
If you do need a surrogate key, Prefer
BIGINT GENERATED ALWAYS AS IDENTITY
over
UUID
.

最小化索引——仅创建实际需要的索引；每个索引都会减慢插入速度。
使用
COPY
或多行
INSERT
——替代单行插入。
非日志表（UNLOGGED）——用于可重建的staging数据——写入速度更快。
批量加载时延迟创建索引——先删除索引，加载数据，再重建索引。
按时间/哈希分区——分散负载。TimescaleDB可自动为插入频繁的数据实现分区和压缩。
使用自然键作为主键——如（timestamp, device_id），如果需要强制全局唯一性；许多插入频繁的表根本不需要主键。
如果确实需要代理键，优先选择
BIGINT GENERATED ALWAYS AS IDENTITY
而非
UUID
。

Upsert-Friendly Design

友好的Upsert设计

Requires UNIQUE index on conflict target columns—
```
ON CONFLICT (col1, col2)
```
needs exact matching unique index (partial indexes don't work).
Use
EXCLUDED.column
to reference would-be-inserted values; only update columns that actually changed to reduce write overhead.
DO NOTHING
faster than
```
DO UPDATE
```
when no actual update needed.

需要UNIQUE索引——冲突目标列必须有对应的唯一索引——
```
ON CONFLICT (col1, col2)
```
需要完全匹配的唯一索引（部分索引无效）。
使用
EXCLUDED.column
——引用待插入的值；仅更新实际变化的列以减少写入开销。
DO NOTHING
比
DO UPDATE
更快——当无需实际更新时。

Safe Schema Evolution

安全的架构演进

Transactional DDL: most DDL operations can run in transactions and be rolled back—
```
BEGIN; ALTER TABLE...; ROLLBACK;
```
for safe testing.
Concurrent index creation:
```
CREATE INDEX CONCURRENTLY
```
avoids blocking writes but can't run in transactions.
Volatile defaults cause rewrites: adding
```
NOT NULL
```
columns with volatile defaults (e.g.,
```
now()
```
,
```
gen_random_uuid()
```
) rewrites entire table. Non-volatile defaults are fast.
Drop constraints before columns:
```
ALTER TABLE DROP CONSTRAINT
```
then
```
DROP COLUMN
```
to avoid dependency issues.
Function signature changes:
```
CREATE OR REPLACE
```
with different arguments creates overloads, not replacements. DROP old version if no overload desired.

事务性DDL——大多数DDL操作可在事务中执行并回滚——
```
BEGIN; ALTER TABLE...; ROLLBACK;
```
用于安全测试。
并发创建索引——
```
CREATE INDEX CONCURRENTLY
```
不会阻塞写入，但不能在事务中执行。
** volatile默认值会导致全表重写**——添加带volatile默认值（如
```
now()
```
、
```
gen_random_uuid()
```
）的NOT NULL列会重写整个表。非volatile默认值则速度更快。
先删除约束再删除列——
```
ALTER TABLE DROP CONSTRAINT
```
然后
```
DROP COLUMN
```
以避免依赖问题。
函数签名变更——
```
CREATE OR REPLACE
```
使用不同参数会创建重载函数，而非替换原函数。如果不需要重载，需删除旧版本。

Generated Columns

生成列

```
... GENERATED ALWAYS AS (<expr>) STORED
```
for computed, indexable fields. PG18+ adds
```
VIRTUAL
```
columns (computed on read, not stored).

```
... GENERATED ALWAYS AS (<expr>) STORED
```
用于可计算、可索引的字段。PostgreSQL 18+新增
```
VIRTUAL
```
列（读取时计算，不存储）。

Extensions

扩展

pgcrypto
:
```
crypt()
```
for password hashing.
uuid-ossp
: alternative UUID functions; prefer
```
pgcrypto
```
for new projects.
pg_trgm
: fuzzy text search with
```
%
```
operator,
```
similarity()
```
function. Index with GIN for
```
LIKE '%pattern%'
```
acceleration.
citext
: case-insensitive text type. Prefer expression indexes on
```
LOWER(col)
```
unless you need case-insensitive constraints.
btree_gin
/
btree_gist
: enable mixed-type indexes (e.g., GIN index on both JSONB and text columns).
hstore
: key-value pairs; mostly superseded by JSONB but useful for simple string mappings.
timescaledb
: essential for time-series—automated partitioning, retention, compression, continuous aggregates.
postgis
: comprehensive geospatial support beyond basic geometric types—essential for location-based applications.
pgvector
: vector similarity search for embeddings.
pgaudit
: audit logging for all database activity.

pgcrypto
：提供
```
crypt()
```
函数用于密码哈希。
uuid-ossp
：提供替代UUID生成函数；新项目优先使用
```
pgcrypto
```
。
pg_trgm
：支持模糊文本搜索（
```
%
```
操作符、
```
similarity()
```
函数）。使用GIN索引加速
```
LIKE '%pattern%'
```
查询。
citext
：大小写不敏感文本类型。除非需要大小写不敏感的约束，否则优先使用基于
```
LOWER(col)
```
的表达式索引。
btree_gin
/
btree_gist
：支持混合类型索引（如同时包含JSONB和文本列的GIN索引）。
hstore
：键值对类型；大多已被JSONB取代，但适用于简单字符串映射。
timescaledb
：时间序列数据必备——自动分区、数据保留、压缩、连续聚合。
postgis
：提供超越基础几何类型的全面地理空间支持——位置类应用必备。
pgvector
：用于嵌入向量的相似度搜索。
pgaudit
：记录所有数据库活动的审计日志。

JSONB Guidance

JSONB 使用指南

Prefer
```
JSONB
```
with GIN index.

Default:

CREATE INDEX ON tbl USING GIN (jsonb_col);

→ accelerates:

Containment
```
jsonb_col @> '{"k":"v"}'
```
Key existence
```
jsonb_col ? 'k'
```
, any/all keys
```
?\|
```
,
```
?&
```
Path containment on nested docs

Disjunction

jsonb_col @> ANY(ARRAY['{"status":"active"}', '{"status":"pending"}'])

Heavy
```
@>
```
workloads: consider opclass
```
jsonb_path_ops
```
for smaller/faster containment-only indexes:
- ```
CREATE INDEX ON tbl USING GIN (jsonb_col jsonb_path_ops);
```
- Trade-off: loses support for key existence (
```
?
```
  ,
```
?|
```
  ,
```
?&
```
  ) queries—only supports containment (
```
@>
```
  )

Equality/range on a specific scalar field: extract and index with B-tree (generated column or expression):

ALTER TABLE tbl ADD COLUMN price INT GENERATED ALWAYS AS ((jsonb_col->>'price')::INT) STORED;

```
CREATE INDEX ON tbl (price);
```

Prefer queries like

WHERE price BETWEEN 100 AND 500

(uses B-tree) over

WHERE (jsonb_col->>'price')::INT BETWEEN 100 AND 500

without index.

Arrays inside JSONB: use GIN +
```
@>
```
for containment (e.g., tags). Consider
```
jsonb_path_ops
```
if only doing containment.
Keep core relations in tables; use JSONB for optional/variable attributes.
Use constraints to limit allowed JSONB values in a column e.g.
```
config JSONB NOT NULL CHECK(jsonb_typeof(config) = 'object')
```

优先使用
```
JSONB
```
并搭配GIN索引。

默认索引：

CREATE INDEX ON tbl USING GIN (jsonb_col);

可加速：

包含查询
```
jsonb_col @> '{"k":"v"}'
```
键存在查询
```
jsonb_col ? 'k'
```
、任意/所有键存在
```
?|
```
、
```
?&
```
嵌套文档的路径包含查询

析取查询

jsonb_col @> ANY(ARRAY['{"status":"active"}', '{"status":"pending"}'])

频繁进行
```
@>
```
查询的场景：考虑使用
```
jsonb_path_ops
```
操作类创建更小/更快的仅包含查询索引：
- ```
CREATE INDEX ON tbl USING GIN (jsonb_col jsonb_path_ops);
```
- 权衡：失去键存在查询（
```
?
```
  、
```
?|
```
  、
```
?&
```
  ）的支持——仅支持包含查询（
```
@>
```
  ）

特定标量字段的等值/范围查询：提取字段并使用B-tree索引（生成列或表达式）：

ALTER TABLE tbl ADD COLUMN price INT GENERATED ALWAYS AS ((jsonb_col->>'price')::INT) STORED;

```
CREATE INDEX ON tbl (price);
```

优先使用

WHERE price BETWEEN 100 AND 500

（使用B-tree索引）而非无索引的

WHERE (jsonb_col->>'price')::INT BETWEEN 100 AND 500

。

JSONB中的数组：使用GIN +
```
@>
```
进行包含查询（如标签）。如果仅进行包含查询，考虑使用
```
jsonb_path_ops
```
。
核心关系数据放在表中；将JSONB用于可选/可变属性。

使用约束限制列中允许的JSONB值，例如

config JSONB NOT NULL CHECK(jsonb_typeof(config) = 'object')

Examples

示例

Users

用户表

sql

CREATE TABLE users (
  user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  email TEXT NOT NULL UNIQUE,
  name TEXT NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE UNIQUE INDEX ON users (LOWER(email));
CREATE INDEX ON users (created_at);

sql

CREATE TABLE users (
  user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  email TEXT NOT NULL UNIQUE,
  name TEXT NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE UNIQUE INDEX ON users (LOWER(email));
CREATE INDEX ON users (created_at);

Orders

订单表

sql

CREATE TABLE orders (
  order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  user_id BIGINT NOT NULL REFERENCES users(user_id),
  status TEXT NOT NULL DEFAULT 'PENDING' CHECK (status IN ('PENDING','PAID','CANCELED')),
  total NUMERIC(10,2) NOT NULL CHECK (total > 0),
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX ON orders (user_id);
CREATE INDEX ON orders (created_at);

sql

CREATE TABLE orders (
  order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  user_id BIGINT NOT NULL REFERENCES users(user_id),
  status TEXT NOT NULL DEFAULT 'PENDING' CHECK (status IN ('PENDING','PAID','CANCELED')),
  total NUMERIC(10,2) NOT NULL CHECK (total > 0),
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX ON orders (user_id);
CREATE INDEX ON orders (created_at);

JSONB

JSONB示例

sql

CREATE TABLE profiles (
  user_id BIGINT PRIMARY KEY REFERENCES users(user_id),
  attrs JSONB NOT NULL DEFAULT '{}',
  theme TEXT GENERATED ALWAYS AS (attrs->>'theme') STORED
);
CREATE INDEX profiles_attrs_gin ON profiles USING GIN (attrs);

sql

CREATE TABLE profiles (
  user_id BIGINT PRIMARY KEY REFERENCES users(user_id),
  attrs JSONB NOT NULL DEFAULT '{}',
  theme TEXT GENERATED ALWAYS AS (attrs->>'theme') STORED
);
CREATE INDEX profiles_attrs_gin ON profiles USING GIN (attrs);