design-postgres-tables

Compare original and translation side by side

🇺🇸

Original

English

🇨🇳

Translation

Chinese

PostgreSQL Table Design

PostgreSQL表设计

Core Rules

核心规则

Define a PRIMARY KEY for reference tables (users, orders, etc.). Not always needed for time-series/event/log data. When used, prefer
```
BIGINT GENERATED ALWAYS AS IDENTITY
```
; use
```
UUID
```
only when global uniqueness/opacity is needed.
Normalize first (to 3NF) to eliminate data redundancy and update anomalies; denormalize only for measured, high-ROI reads where join performance is proven problematic. Premature denormalization creates maintenance burden.
Add NOT NULL everywhere it’s semantically required; use DEFAULTs for common values.
Create indexes for access paths you actually query: PK/unique (auto), FK columns (manual!), frequent filters/sorts, and join keys.
Prefer TIMESTAMPTZ for event time; NUMERIC for money; TEXT for strings; BIGINT for integer values, DOUBLE PRECISION for floats (or
```
NUMERIC
```
for exact decimal arithmetic).

为参考表（用户表、订单表等）定义PRIMARY KEY。时间序列/事件/日志数据并非总是需要主键。使用主键时，优先选择
```
BIGINT GENERATED ALWAYS AS IDENTITY
```
；仅当需要全局唯一性/不透明性时才使用
```
UUID
```
。
先规范化至3NF，以消除数据冗余和更新异常；仅在已证实连接性能存在问题、且读取操作能获得高投资回报率的情况下，才进行反规范化。过早反规范化会增加维护负担。
在所有语义上需要的地方添加NOT NULL约束；为常见值设置DEFAULT默认值。
为实际查询的访问路径创建索引：主键/唯一索引（自动创建）、外键列（需手动创建！）、频繁过滤/排序的列，以及连接键。
事件时间优先使用TIMESTAMPTZ；金额使用NUMERIC；字符串使用TEXT；整数值使用BIGINT；浮点数使用DOUBLE PRECISION（如需精确十进制运算则使用
```
NUMERIC
```
）。

PostgreSQL “Gotchas”

PostgreSQL 常见陷阱

Identifiers: unquoted → lowercased. Avoid quoted/mixed-case names. Convention: use
```
snake_case
```
for table/column names.
Unique + NULLs: UNIQUE allows multiple NULLs. Use
```
UNIQUE (...) NULLS NOT DISTINCT
```
(PG15+) to restrict to one NULL.
FK indexes: PostgreSQL does not auto-index FK columns. Add them.
No silent coercions: length/precision overflows error out (no truncation). Example: inserting 999 into
```
NUMERIC(2,0)
```
fails with error, unlike some databases that silently truncate or round.
Sequences/identity have gaps (normal; don't "fix"). Rollbacks, crashes, and concurrent transactions create gaps in ID sequences (1, 2, 5, 6...). This is expected behavior—don't try to make IDs consecutive.
Heap storage: no clustered PK by default (unlike SQL Server/MySQL InnoDB);
```
CLUSTER
```
is one-off reorganization, not maintained on subsequent inserts. Row order on disk is insertion order unless explicitly clustered.
MVCC: updates/deletes leave dead tuples; vacuum handles them—design to avoid hot wide-row churn.

标识符：未加引号的标识符会被转为小写。避免使用带引号/大小写混合的名称。惯例：表/列名使用
```
snake_case
```
格式。
唯一约束与NULL值：UNIQUE约束允许多个NULL值。使用
```
UNIQUE (...) NULLS NOT DISTINCT
```
（PG15及以上版本）来限制仅允许一个NULL值。
外键索引：PostgreSQL不会自动为外键列创建索引，需要手动添加。
无隐式类型转换：长度/精度溢出会直接报错（不会截断）。例如：将999插入
```
NUMERIC(2,0)
```
列会失败报错，不像部分数据库会自动截断或四舍五入。
序列/自增列存在间隙（属于正常现象；无需“修复”）。回滚、崩溃和并发事务会导致ID序列出现间隙（如1,2,5,6...）。这是预期行为——不要试图让ID连续。
堆存储：默认没有聚簇主键（与SQL Server/MySQL InnoDB不同）；
```
CLUSTER
```
是一次性重组操作，后续插入不会维持聚簇顺序。磁盘上的行顺序为插入顺序，除非显式执行聚簇操作。
MVCC多版本并发控制：更新/删除操作会留下死元组；由vacuum进程处理——设计时应避免频繁更新宽行数据导致的膨胀。

Data Types

数据类型

IDs:
```
BIGINT GENERATED ALWAYS AS IDENTITY
```
preferred (
```
GENERATED BY DEFAULT
```
also fine);
```
UUID
```
when merging/federating/used in a distributed system or for opaque IDs. Generate with
```
uuidv7()
```
(preferred if using PG18+) or
```
gen_random_uuid()
```
(if using an older PG version).
Integers: prefer
```
BIGINT
```
unless storage space is critical;
```
INTEGER
```
for smaller ranges; avoid
```
SMALLINT
```
unless constrained.
Floats: prefer
```
DOUBLE PRECISION
```
over
```
REAL
```
unless storage space is critical. Use
```
NUMERIC
```
for exact decimal arithmetic.
Strings: prefer
```
TEXT
```
; if length limits needed, use
```
CHECK (LENGTH(col) <= n)
```
instead of
```
VARCHAR(n)
```
; avoid
```
CHAR(n)
```
. Use
```
BYTEA
```
for binary data. Large strings/binary (>2KB default threshold) automatically stored in TOAST with compression. TOAST storage:
```
PLAIN
```
(no TOAST),
```
EXTENDED
```
(compress + out-of-line),
```
EXTERNAL
```
(out-of-line, no compress),
```
MAIN
```
(compress, keep in-line if possible). Default
```
EXTENDED
```
usually optimal. Control with
```
ALTER TABLE tbl ALTER COLUMN col SET STORAGE strategy
```
and
```
ALTER TABLE tbl SET (toast_tuple_target = 4096)
```
for threshold. Case-insensitive: for locale/accent handling use non-deterministic collations; for plain ASCII use expression indexes on
```
LOWER(col)
```
(preferred unless column needs case-insensitive PK/FK/UNIQUE) or
```
CITEXT
```
.
Money:
```
NUMERIC(p,s)
```
(never float).
Time:
```
TIMESTAMPTZ
```
for timestamps;
```
DATE
```
for date-only;
```
INTERVAL
```
for durations. Avoid
```
TIMESTAMP
```
(without timezone). Use
```
now()
```
for transaction start time,
```
clock_timestamp()
```
for current wall-clock time.
Booleans:
```
BOOLEAN
```
with
```
NOT NULL
```
constraint unless tri-state values are required.
Enums:
```
CREATE TYPE ... AS ENUM
```
for small, stable sets (e.g. US states, days of week). For business-logic-driven and evolving values (e.g. order statuses) → use TEXT (or INT) + CHECK or lookup table.
Arrays:
```
TEXT[]
```
,
```
INTEGER[]
```
, etc. Use for ordered lists where you query elements. Index with GIN for containment (
```
@>
```
,
```
<@
```
) and overlap (
```
&&
```
) queries. Access:
```
arr[1]
```
(1-indexed),
```
arr[1:3]
```
(slicing). Good for tags, categories; avoid for relations—use junction tables instead. Literal syntax:
```
'{val1,val2}'
```
or
```
ARRAY[val1,val2]
```
.
Range types:
```
daterange
```
,
```
numrange
```
,
```
tstzrange
```
for intervals. Support overlap (
```
&&
```
), containment (
```
@>
```
), operators. Index with GiST. Good for scheduling, versioning, numeric ranges. Pick a bounds scheme and use it consistently; prefer
```
[)
```
(inclusive/exclusive) by default.
Network types:
```
INET
```
for IP addresses,
```
CIDR
```
for network ranges,
```
MACADDR
```
for MAC addresses. Support network operators (
```
<<
```
,
```
>>
```
,
```
&&
```
).
Geometric types: avoid
```
POINT
```
,
```
LINE
```
,
```
POLYGON
```
,
```
CIRCLE
```
. Index with GiST. Consider PostGIS for spatial features.
Text search:
```
TSVECTOR
```
for full-text search documents,
```
TSQUERY
```
for search queries. Index
```
tsvector
```
with GIN. Always specify language:
```
to_tsvector('english', col)
```
and
```
to_tsquery('english', 'query')
```
. Never use single-argument versions. This applies to both index expressions and queries.
Domain types:
```
CREATE DOMAIN email AS TEXT CHECK (VALUE ~ '^[^@]+@[^@]+$')
```
for reusable custom types with validation. Enforces constraints across tables.
Composite types:
```
CREATE TYPE address AS (street TEXT, city TEXT, zip TEXT)
```
for structured data within columns. Access with
```
(col).field
```
syntax.
JSONB: preferred over JSON; index with GIN. Use only for optional/semi-structured attrs. ONLY use JSON if the original ordering of the contents MUST be preserved.
Vector types:
```
vector
```
type by
```
pgvector
```
for vector similarity search for embeddings.

ID列：优先选择
```
BIGINT GENERATED ALWAYS AS IDENTITY
```
（
```
GENERATED BY DEFAULT
```
也可）；在合并/联邦/分布式系统中使用或需要不透明ID时使用
```
UUID
```
。使用
```
uuidv7()
```
（PG18及以上版本优先）或
```
gen_random_uuid()
```
（旧版本PG）生成。
整数：优先使用
```
BIGINT
```
，除非存储空间至关重要；小范围数值使用
```
INTEGER
```
；避免使用
```
SMALLINT
```
，除非有明确限制。
浮点数：优先使用
```
DOUBLE PRECISION
```
而非
```
REAL
```
，除非存储空间至关重要。如需精确十进制运算使用
```
NUMERIC
```
。
字符串：优先使用
```
TEXT
```
；如需长度限制，使用
```
CHECK (LENGTH(col) <= n)
```
而非
```
VARCHAR(n)
```
；避免使用
```
CHAR(n)
```
。二进制数据使用
```
BYTEA
```
。大字符串/二进制数据（超过默认2KB阈值）会自动存储在TOAST中并压缩。TOAST存储策略：
```
PLAIN
```
（不使用TOAST）、
```
EXTENDED
```
（压缩+离线存储）、
```
EXTERNAL
```
（离线存储、不压缩）、
```
MAIN
```
（压缩、尽可能在线存储）。默认
```
EXTENDED
```
通常为最优选择。使用
```
ALTER TABLE tbl ALTER COLUMN col SET STORAGE strategy
```
和
```
ALTER TABLE tbl SET (toast_tuple_target = 4096)
```
来控制阈值。大小写不敏感：针对区域设置/重音处理使用非确定性排序规则；纯ASCII场景优先使用
```
LOWER(col)
```
的表达式索引（除非列需要大小写不敏感的PK/FK/UNIQUE约束）或
```
CITEXT
```
类型。
金额：使用
```
NUMERIC(p,s)
```
（绝不要用浮点数）。
时间：时间戳使用
```
TIMESTAMPTZ
```
；仅日期使用
```
DATE
```
；时长使用
```
INTERVAL
```
。避免使用
```
TIMESTAMP
```
（无时区）。事务开始时间使用
```
now()
```
，当前系统时间使用
```
clock_timestamp()
```
。
布尔值：使用
```
BOOLEAN
```
并添加
```
NOT NULL
```
约束，除非需要三态值。
枚举类型：对于小型、稳定的集合（如美国各州、星期几）使用
```
CREATE TYPE ... AS ENUM
```
。对于业务逻辑驱动且不断演变的值（如订单状态）→ 使用TEXT（或INT）+ CHECK约束或查找表。
数组：
```
TEXT[]
```
、
```
INTEGER[]
```
等。用于需要查询元素的有序列表。使用GIN索引支持包含查询（
```
@>
```
、
```
<@
```
）和重叠查询（
```
&&
```
）。访问方式：
```
arr[1]
```
（从1开始索引）、
```
arr[1:3]
```
（切片）。适用于标签、分类；避免用于关系型数据——改用关联表。字面量语法：
```
'{val1,val2}'
```
或
```
ARRAY[val1,val2]
```
。
范围类型：
```
daterange
```
、
```
numrange
```
、
```
tstzrange
```
用于表示区间。支持重叠（
```
&&
```
）、包含（
```
@>
```
）等操作符。使用GiST索引。适用于调度、版本控制、数值范围。选择一种边界规则并保持一致；默认优先使用
```
[)
```
（左闭右开）。
网络类型：
```
INET
```
用于IP地址，
```
CIDR
```
用于网络范围，
```
MACADDR
```
用于MAC地址。支持网络操作符（
```
<<
```
、
```
>>
```
、
```
&&
```
）。
几何类型：避免使用
```
POINT
```
、
```
LINE
```
、
```
POLYGON
```
、
```
CIRCLE
```
。使用GiST索引。如需空间功能考虑使用PostGIS扩展。
全文搜索：
```
TSVECTOR
```
用于全文搜索文档，
```
TSQUERY
```
用于搜索查询。使用GIN索引
```
tsvector
```
列。必须指定语言：
```
to_tsvector('english', col)
```
和
```
to_tsquery('english', 'query')
```
。绝不要使用单参数版本。此规则适用于索引表达式和查询语句。
域类型：
```
CREATE DOMAIN email AS TEXT CHECK (VALUE ~ '^[^@]+@[^@]+$')
```
用于创建带验证的可复用自定义类型，可跨表强制执行约束。
复合类型：
```
CREATE TYPE address AS (street TEXT, city TEXT, zip TEXT)
```
用于在列中存储结构化数据。使用
```
(col).field
```
语法访问字段。
JSONB：优先于JSON使用；使用GIN索引。仅用于可选/半结构化属性。仅当必须保留内容原始顺序时才使用JSON。
向量类型：pgvector扩展提供的
```
vector
```
类型，用于嵌入向量的相似性搜索。

Do not use the following data types

请勿使用以下数据类型

DO NOT use
```
timestamp
```
(without time zone); DO use
```
timestamptz
```
instead.
DO NOT use
```
char(n)
```
or
```
varchar(n)
```
; DO use
```
text
```
instead.
DO NOT use
```
money
```
type; DO use
```
numeric
```
instead.
DO NOT use
```
timetz
```
type; DO use
```
timestamptz
```
instead.
DO NOT use
```
timestamptz(0)
```
or any other precision specification; DO use
```
timestamptz
```
instead
DO NOT use
```
serial
```
type; DO use
```
generated always as identity
```
instead.
DO NOT use
```
POINT
```
,
```
LINE
```
,
```
POLYGON
```
,
```
CIRCLE
```
built-in types, DO use
```
geometry
```
from postgis extension instead.

请勿使用
```
timestamp
```
（无时区）；请改用
```
timestamptz
```
。
请勿使用
```
char(n)
```
或
```
varchar(n)
```
；请改用
```
text
```
。
请勿使用
```
money
```
类型；请改用
```
numeric
```
。
请勿使用
```
timetz
```
类型；请改用
```
timestamptz
```
。
请勿使用
```
timestamptz(0)
```
或任何精度指定；请改用
```
timestamptz
```
。
请勿使用
```
serial
```
类型；请改用
```
generated always as identity
```
。
请勿使用
```
POINT
```
、
```
LINE
```
、
```
POLYGON
```
、
```
CIRCLE
```
内置类型；请改用postgis扩展的
```
geometry
```
类型。

Table Types

表类型

Regular: default; fully durable, logged.
TEMPORARY: session-scoped, auto-dropped, not logged. Faster for scratch work.
UNLOGGED: persistent but not crash-safe. Faster writes; good for caches/staging.

常规表：默认类型；完全持久化、已记录日志。
临时表：会话级作用域，自动删除，不记录日志。适用于临时操作，速度更快。
无日志表：持久化但不保证崩溃安全。写入速度更快；适用于缓存/临时 staging 数据。

Row-Level Security

行级安全

Enable with

ALTER TABLE tbl ENABLE ROW LEVEL SECURITY

. Create policies:

CREATE POLICY user_access ON orders FOR SELECT TO app_users USING (user_id = current_user_id())

. Built-in user-based access control at the row level.

使用

ALTER TABLE tbl ENABLE ROW LEVEL SECURITY

启用。创建策略：

CREATE POLICY user_access ON orders FOR SELECT TO app_users USING (user_id = current_user_id())

。实现基于用户的行级内置访问控制。

Constraints

约束

PK: implicit UNIQUE + NOT NULL; creates a B-tree index.
FK: specify
```
ON DELETE/UPDATE
```
action (
```
CASCADE
```
,
```
RESTRICT
```
,
```
SET NULL
```
,
```
SET DEFAULT
```
). Add explicit index on referencing column—speeds up joins and prevents locking issues on parent deletes/updates. Use
```
DEFERRABLE INITIALLY DEFERRED
```
for circular FK dependencies checked at transaction end.
UNIQUE: creates a B-tree index; allows multiple NULLs unless
```
NULLS NOT DISTINCT
```
(PG15+). Standard behavior:
```
(1, NULL)
```
and
```
(1, NULL)
```
are allowed. With
```
NULLS NOT DISTINCT
```
: only one
```
(1, NULL)
```
allowed. Prefer
```
NULLS NOT DISTINCT
```
unless you specifically need duplicate NULLs.
CHECK: row-local constraints; NULL values pass the check (three-valued logic). Example:
```
CHECK (price > 0)
```
allows NULL prices. Combine with
```
NOT NULL
```
to enforce:
```
price NUMERIC NOT NULL CHECK (price > 0)
```
.
EXCLUDE: prevents overlapping values using operators.
```
EXCLUDE USING gist (room_id WITH =, booking_period WITH &&)
```
prevents double-booking rooms. Requires appropriate index type (often GiST).

主键（PK）：隐式包含UNIQUE + NOT NULL约束；会创建B-tree索引。
外键（FK）：指定
```
ON DELETE/UPDATE
```
动作（
```
CASCADE
```
、
```
RESTRICT
```
、
```
SET NULL
```
、
```
SET DEFAULT
```
）。为引用列添加显式索引——加速连接并避免父表删除/更新时的锁定问题。对于循环外键依赖，使用
```
DEFERRABLE INITIALLY DEFERRED
```
在事务结束时检查约束。
唯一约束（UNIQUE）：创建B-tree索引；允许多个NULL值，除非使用
```
NULLS NOT DISTINCT
```
（PG15及以上版本）。标准行为：
```
(1, NULL)
```
和
```
(1, NULL)
```
是允许的。使用
```
NULLS NOT DISTINCT
```
时：仅允许一个
```
(1, NULL)
```
。除非明确需要重复NULL值，否则优先使用
```
NULLS NOT DISTINCT
```
。
检查约束（CHECK）：行级本地约束；NULL值会通过检查（三值逻辑）。例如：
```
CHECK (price > 0)
```
允许price为NULL。结合
```
NOT NULL
```
约束强制执行：
```
price NUMERIC NOT NULL CHECK (price > 0)
```
。
排除约束（EXCLUDE）：使用操作符防止值重叠。
```
EXCLUDE USING gist (room_id WITH =, booking_period WITH &&)
```
防止房间重复预订。需要对应类型的索引（通常为GiST）。

Indexing

索引

B-tree: default for equality/range queries (
```
=
```
,
```
<
```
,
```
>
```
,
```
BETWEEN
```
,
```
ORDER BY
```
)
Composite: order matters—index used if equality on leftmost prefix (
```
WHERE a = ? AND b > ?
```
uses index on
```
(a,b)
```
, but
```
WHERE b = ?
```
does not). Put most selective/frequently filtered columns first.
Covering:
```
CREATE INDEX ON tbl (id) INCLUDE (name, email)
```
- includes non-key columns for index-only scans without visiting table.

Partial: for hot subsets (

WHERE status = 'active'

→

CREATE INDEX ON tbl (user_id) WHERE status = 'active'

). Any query with

status = 'active'

can use this index.

Expression: for computed search keys (
```
CREATE INDEX ON tbl (LOWER(email))
```
). Expression must match exactly in WHERE clause:
```
WHERE LOWER(email) = 'user@example.com'
```
.
GIN: JSONB containment/existence, arrays (
```
@>
```
,
```
?
```
), full-text search (
```
@@
```
)
GiST: ranges, geometry, exclusion constraints
BRIN: very large, naturally ordered data (time-series)—minimal storage overhead. Effective when row order on disk correlates with indexed column (insertion order or after
```
CLUSTER
```
).

B-tree：默认用于等值/范围查询（
```
=
```
、
```
<
```
、
```
>
```
、
```
BETWEEN
```
、
```
ORDER BY
```
）
复合索引：列顺序至关重要——如果查询对最左侧前缀列使用等值条件（如
```
WHERE a = ? AND b > ?
```
），则会使用
```
(a,b)
```
索引，但
```
WHERE b = ?
```
不会使用该索引。将选择性最高/过滤最频繁的列放在前面。
覆盖索引：
```
CREATE INDEX ON tbl (id) INCLUDE (name, email)
```
- 包含非键列，支持仅索引扫描，无需访问表数据。

部分索引：针对热点子集（如

WHERE status = 'active'

→

CREATE INDEX ON tbl (user_id) WHERE status = 'active'

）。任何包含

status = 'active'

的查询都可使用该索引。

表达式索引：针对计算后的搜索键（如
```
CREATE INDEX ON tbl (LOWER(email))
```
）。WHERE子句中的表达式必须与索引表达式完全匹配：
```
WHERE LOWER(email) = 'user@example.com'
```
。
GIN：JSONB包含/存在查询、数组（
```
@>
```
、
```
?
```
）、全文搜索（
```
@@
```
）
GiST：范围、几何、排除约束
BRIN：适用于超大型、自然有序的数据（时间序列）——存储开销极小。当磁盘上的行顺序与索引列相关时（插入顺序或执行
```
CLUSTER
```
后）效果最佳。

Partitioning

分区

Use for very large tables (>100M rows) where queries consistently filter on partition key (often time/date).
Alternate use: use for tables where data maintenance tasks dictates e.g. data pruned or bulk replaced periodically
RANGE: common for time-series (
```
PARTITION BY RANGE (created_at)
```
). Create partitions:
```
CREATE TABLE logs_2024_01 PARTITION OF logs FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')
```
. TimescaleDB automates time-based or ID-based partitioning with retention policies and compression.

LIST: for discrete values (

PARTITION BY LIST (region)

). Example:

FOR VALUES IN ('us-east', 'us-west')

HASH: for even distribution when no natural key (
```
PARTITION BY HASH (user_id)
```
). Creates N partitions with modulus.
Constraint exclusion: requires
```
CHECK
```
constraints on partitions for query planner to prune. Auto-created for declarative partitioning (PG10+).
Prefer declarative partitioning or hypertables. Do NOT use table inheritance.
Limitations: no global UNIQUE constraints—include partition key in PK/UNIQUE. FKs from partitioned tables not supported; use triggers.

适用于超大型表（超过1亿行），且查询始终按分区键过滤（通常为时间/日期）。
另一种场景：数据维护需求，例如定期清理或批量替换数据
范围分区：时间序列数据常用（
```
PARTITION BY RANGE (created_at)
```
）。创建分区：
```
CREATE TABLE logs_2024_01 PARTITION OF logs FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')
```
。TimescaleDB扩展可自动实现基于时间或ID的分区，并提供保留策略和压缩功能。

列表分区：针对离散值（

PARTITION BY LIST (region)

）。示例：

FOR VALUES IN ('us-east', 'us-west')

。

哈希分区：无自然键时实现均匀分布（
```
PARTITION BY HASH (user_id)
```
）。创建N个分区，基于取模运算分配数据。
约束排除：需要为分区添加
```
CHECK
```
约束，以便查询规划器进行分区裁剪。声明式分区（PG10及以上版本）会自动创建。
优先使用声明式分区或 hypertables。请勿使用表继承。
限制：无全局UNIQUE约束——需将分区键包含在PK/UNIQUE约束中。分区表的外键不被支持；使用触发器替代。

Special Considerations

特殊场景考量

Update-Heavy Tables

频繁更新的表

Separate hot/cold columns—put frequently updated columns in separate table to minimize bloat.
Use
fillfactor=90
to leave space for HOT updates that avoid index maintenance.
Avoid updating indexed columns—prevents beneficial HOT updates.
Partition by update patterns—separate frequently updated rows in a different partition from stable data.

分离冷热列——将频繁更新的列放在单独的表中，以最小化数据膨胀。
使用
fillfactor=90
——预留空间用于HOT更新，避免维护索引。
避免更新带索引的列——会阻止有效的HOT更新。
按更新模式分区——将频繁更新的行与稳定数据放在不同分区。

Insert-Heavy Workloads

频繁插入的工作负载

Minimize indexes—only create what you query; every index slows inserts.
Use
COPY
or multi-row
INSERT
instead of single-row inserts.
UNLOGGED tables for rebuildable staging data—much faster writes.
Defer index creation for bulk loads—>drop index, load data, recreate indexes.
Partition by time/hash to distribute load. TimescaleDB automates partitioning and compression of insert-heavy data.
Use a natural key for primary key such as a (timestamp, device_id) if enforcing global uniqueness is important many insert-heavy tables don't need a primary key at all.
If you do need a surrogate key, Prefer
BIGINT GENERATED ALWAYS AS IDENTITY
over
UUID
.

最小化索引数量——仅创建实际需要查询的索引；每个索引都会减慢插入速度。
**使用
```
COPY
```
或多行
```
INSERT
```
**替代单行插入。
无日志表用于可重建的staging数据——写入速度快得多。
批量加载时延迟创建索引——先删除索引，加载数据，再重新创建索引。
按时间/哈希分区分散负载。TimescaleDB扩展可自动实现频繁插入数据的分区和压缩。
使用自然键作为主键，如（timestamp, device_id），如果需要强制全局唯一性；许多频繁插入的表根本不需要主键。
如果确实需要代理键，优先选择
BIGINT GENERATED ALWAYS AS IDENTITY
而非
UUID
。

Upsert-Friendly Design

友好的Upsert设计

Requires UNIQUE index on conflict target columns—
```
ON CONFLICT (col1, col2)
```
needs exact matching unique index (partial indexes don't work).
Use
EXCLUDED.column
to reference would-be-inserted values; only update columns that actually changed to reduce write overhead.
DO NOTHING
faster than
```
DO UPDATE
```
when no actual update needed.

需要UNIQUE索引在冲突目标列上——
```
ON CONFLICT (col1, col2)
```
需要完全匹配的唯一索引（部分索引无效）。
**使用
```
EXCLUDED.column
```
**引用待插入的值；仅更新实际变化的列以减少写入开销。
DO NOTHING
比
DO UPDATE
更快当无需实际更新时。

Safe Schema Evolution

安全的Schema演进

Transactional DDL: most DDL operations can run in transactions and be rolled back—
```
BEGIN; ALTER TABLE...; ROLLBACK;
```
for safe testing.
Concurrent index creation:
```
CREATE INDEX CONCURRENTLY
```
avoids blocking writes but can't run in transactions.
Volatile defaults cause rewrites: adding
```
NOT NULL
```
columns with volatile defaults (e.g.,
```
now()
```
,
```
gen_random_uuid()
```
) rewrites entire table. Non-volatile defaults are fast.
Drop constraints before columns:
```
ALTER TABLE DROP CONSTRAINT
```
then
```
DROP COLUMN
```
to avoid dependency issues.
Function signature changes:
```
CREATE OR REPLACE
```
with different arguments creates overloads, not replacements. DROP old version if no overload desired.

事务性DDL：大多数DDL操作可在事务中执行并回滚——使用
```
BEGIN; ALTER TABLE...; ROLLBACK;
```
进行安全测试。
并发创建索引：
```
CREATE INDEX CONCURRENTLY
```
避免阻塞写入，但不能在事务中执行。
** volatile默认值会导致表重写**——添加带volatile默认值（如
```
now()
```
、
```
gen_random_uuid()
```
）的NOT NULL列会重写整个表。非volatile默认值速度更快。
先删除约束再删除列——
```
ALTER TABLE DROP CONSTRAINT
```
然后
```
DROP COLUMN
```
以避免依赖问题。
函数签名变更：
```
CREATE OR REPLACE
```
使用不同参数会创建重载而非替换。如果不需要重载，请删除旧版本函数。

Generated Columns

生成列

```
... GENERATED ALWAYS AS (<expr>) STORED
```
for computed, indexable fields. PG18+ adds
```
VIRTUAL
```
columns (computed on read, not stored).

```
... GENERATED ALWAYS AS (<expr>) STORED
```
用于可计算、可索引的字段。PG18及以上版本新增
```
VIRTUAL
```
列（读取时计算，不存储）。

Extensions

扩展

pgcrypto
:
```
crypt()
```
for password hashing.
uuid-ossp
: alternative UUID functions; prefer
```
pgcrypto
```
for new projects.
pg_trgm
: fuzzy text search with
```
%
```
operator,
```
similarity()
```
function. Index with GIN for
```
LIKE '%pattern%'
```
acceleration.
citext
: case-insensitive text type. Prefer expression indexes on
```
LOWER(col)
```
unless you need case-insensitive constraints.
btree_gin
/
btree_gist
: enable mixed-type indexes (e.g., GIN index on both JSONB and text columns).
hstore
: key-value pairs; mostly superseded by JSONB but useful for simple string mappings.
timescaledb
: essential for time-series—automated partitioning, retention, compression, continuous aggregates.
postgis
: comprehensive geospatial support beyond basic geometric types—essential for location-based applications.
pgvector
: vector similarity search for embeddings.
pgaudit
: audit logging for all database activity.

pgcrypto
：
```
crypt()
```
用于密码哈希。
uuid-ossp
：替代UUID函数；新项目优先使用
```
pgcrypto
```
。
pg_trgm
：模糊文本搜索，支持
```
%
```
操作符、
```
similarity()
```
函数。使用GIN索引加速
```
LIKE '%pattern%'
```
查询。
citext
：大小写不敏感文本类型。除非需要大小写不敏感约束，否则优先使用
```
LOWER(col)
```
的表达式索引。
btree_gin
/
btree_gist
：支持混合类型索引（如同时包含JSONB和文本列的GIN索引）。
hstore
：键值对；大多已被JSONB取代，但适用于简单字符串映射。
timescaledb
：时间序列数据必备——自动分区、数据保留、压缩、连续聚合。
postgis
：提供超越基础几何类型的全面空间支持——位置应用必备。
pgvector
：嵌入向量的相似性搜索。
pgaudit
：所有数据库活动的审计日志。

JSONB Guidance

JSONB使用指南

Prefer
```
JSONB
```
with GIN index.

Default:

CREATE INDEX ON tbl USING GIN (jsonb_col);

→ accelerates:

Containment
```
jsonb_col @> '{"k":"v"}'
```
Key existence
```
jsonb_col ? 'k'
```
, any/all keys
```
?\|
```
,
```
?&
```
Path containment on nested docs

Disjunction

jsonb_col @> ANY(ARRAY['{"status":"active"}', '{"status":"pending"}'])

Heavy
```
@>
```
workloads: consider opclass
```
jsonb_path_ops
```
for smaller/faster containment-only indexes:
- ```
CREATE INDEX ON tbl USING GIN (jsonb_col jsonb_path_ops);
```
- Trade-off: loses support for key existence (
```
?
```
  ,
```
?|
```
  ,
```
?&
```
  ) queries—only supports containment (
```
@>
```
  )

Equality/range on a specific scalar field: extract and index with B-tree (generated column or expression):

ALTER TABLE tbl ADD COLUMN price INT GENERATED ALWAYS AS ((jsonb_col->>'price')::INT) STORED;

```
CREATE INDEX ON tbl (price);
```

Prefer queries like

WHERE price BETWEEN 100 AND 500

(uses B-tree) over

WHERE (jsonb_col->>'price')::INT BETWEEN 100 AND 500

without index.

Arrays inside JSONB: use GIN +
```
@>
```
for containment (e.g., tags). Consider
```
jsonb_path_ops
```
if only doing containment.
Keep core relations in tables; use JSONB for optional/variable attributes.
Use constraints to limit allowed JSONB values in a column e.g.
```
config JSONB NOT NULL CHECK(jsonb_typeof(config) = 'object')
```

优先使用
```
JSONB
```
并搭配GIN索引。

默认创建方式：

CREATE INDEX ON tbl USING GIN (jsonb_col);

→ 加速以下操作：

包含查询
```
jsonb_col @> '{"k":"v"}'
```
键存在性查询
```
jsonb_col ? 'k'
```
、任意/所有键存在
```
?|
```
、
```
?&
```
嵌套文档的路径包含查询

析取查询

jsonb_col @> ANY(ARRAY['{"status":"active"}', '{"status":"pending"}'])

频繁进行
```
@>
```
查询的场景：考虑使用
```
jsonb_path_ops
```
操作类创建更小、更快的仅包含查询索引：
- ```
CREATE INDEX ON tbl USING GIN (jsonb_col jsonb_path_ops);
```
- 权衡：失去键存在性查询（
```
?
```
  ,
```
?|
```
  ,
```
?&
```
  ）支持——仅支持包含查询（
```
@>
```
  ）

针对特定标量字段的等值/范围查询：提取字段并创建B-tree索引（使用生成列或表达式）：

ALTER TABLE tbl ADD COLUMN price INT GENERATED ALWAYS AS ((jsonb_col->>'price')::INT) STORED;

```
CREATE INDEX ON tbl (price);
```

优先使用

WHERE price BETWEEN 100 AND 500

（使用B-tree索引）而非无索引的

WHERE (jsonb_col->>'price')::INT BETWEEN 100 AND 500

。

JSONB内的数组：使用GIN +
```
@>
```
进行包含查询（如标签）。如果仅进行包含查询，考虑使用
```
jsonb_path_ops
```
。
核心关系数据存储在表中；JSONB用于存储可选/可变属性。

使用约束限制列中允许的JSONB值，例如

config JSONB NOT NULL CHECK(jsonb_typeof(config) = 'object')

Examples

示例

Users

用户表

sql

CREATE TABLE users (
  user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  email TEXT NOT NULL UNIQUE,
  name TEXT NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE UNIQUE INDEX ON users (LOWER(email));
CREATE INDEX ON users (created_at);

sql

CREATE TABLE users (
  user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  email TEXT NOT NULL UNIQUE,
  name TEXT NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE UNIQUE INDEX ON users (LOWER(email));
CREATE INDEX ON users (created_at);

Orders

订单表

sql

CREATE TABLE orders (
  order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  user_id BIGINT NOT NULL REFERENCES users(user_id),
  status TEXT NOT NULL DEFAULT 'PENDING' CHECK (status IN ('PENDING','PAID','CANCELED')),
  total NUMERIC(10,2) NOT NULL CHECK (total > 0),
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX ON orders (user_id);
CREATE INDEX ON orders (created_at);

sql

CREATE TABLE orders (
  order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  user_id BIGINT NOT NULL REFERENCES users(user_id),
  status TEXT NOT NULL DEFAULT 'PENDING' CHECK (status IN ('PENDING','PAID','CANCELED')),
  total NUMERIC(10,2) NOT NULL CHECK (total > 0),
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX ON orders (user_id);
CREATE INDEX ON orders (created_at);

JSONB

JSONB示例

sql

CREATE TABLE profiles (
  user_id BIGINT PRIMARY KEY REFERENCES users(user_id),
  attrs JSONB NOT NULL DEFAULT '{}',
  theme TEXT GENERATED ALWAYS AS (attrs->>'theme') STORED
);
CREATE INDEX profiles_attrs_gin ON profiles USING GIN (attrs);

sql

CREATE TABLE profiles (
  user_id BIGINT PRIMARY KEY REFERENCES users(user_id),
  attrs JSONB NOT NULL DEFAULT '{}',
  theme TEXT GENERATED ALWAYS AS (attrs->>'theme') STORED
);
CREATE INDEX profiles_attrs_gin ON profiles USING GIN (attrs);