design-postgres-tables
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChinesePostgreSQL Table Design
PostgreSQL表设计
Core Rules
核心规则
- Define a PRIMARY KEY for reference tables (users, orders, etc.). Not always needed for time-series/event/log data. When used, prefer ; use
BIGINT GENERATED ALWAYS AS IDENTITYonly when global uniqueness/opacity is needed.UUID - Normalize first (to 3NF) to eliminate data redundancy and update anomalies; denormalize only for measured, high-ROI reads where join performance is proven problematic. Premature denormalization creates maintenance burden.
- Add NOT NULL everywhere it’s semantically required; use DEFAULTs for common values.
- Create indexes for access paths you actually query: PK/unique (auto), FK columns (manual!), frequent filters/sorts, and join keys.
- Prefer TIMESTAMPTZ for event time; NUMERIC for money; TEXT for strings; BIGINT for integer values, DOUBLE PRECISION for floats (or for exact decimal arithmetic).
NUMERIC
- 为参考表(用户表、订单表等)定义PRIMARY KEY。时间序列/事件/日志数据并非总是需要主键。使用主键时,优先选择;仅当需要全局唯一性/不透明性时才使用
BIGINT GENERATED ALWAYS AS IDENTITY。UUID - 先规范化至3NF,以消除数据冗余和更新异常;仅在已证实连接性能存在问题、且读取操作能获得高投资回报率的情况下,才进行反规范化。过早反规范化会增加维护负担。
- 在所有语义上需要的地方添加NOT NULL约束;为常见值设置DEFAULT默认值。
- 为实际查询的访问路径创建索引:主键/唯一索引(自动创建)、外键列(需手动创建!)、频繁过滤/排序的列,以及连接键。
- 事件时间优先使用TIMESTAMPTZ;金额使用NUMERIC;字符串使用TEXT;整数值使用BIGINT;浮点数使用DOUBLE PRECISION(如需精确十进制运算则使用)。
NUMERIC
PostgreSQL “Gotchas”
PostgreSQL 常见陷阱
- Identifiers: unquoted → lowercased. Avoid quoted/mixed-case names. Convention: use for table/column names.
snake_case - Unique + NULLs: UNIQUE allows multiple NULLs. Use (PG15+) to restrict to one NULL.
UNIQUE (...) NULLS NOT DISTINCT - FK indexes: PostgreSQL does not auto-index FK columns. Add them.
- No silent coercions: length/precision overflows error out (no truncation). Example: inserting 999 into fails with error, unlike some databases that silently truncate or round.
NUMERIC(2,0) - Sequences/identity have gaps (normal; don't "fix"). Rollbacks, crashes, and concurrent transactions create gaps in ID sequences (1, 2, 5, 6...). This is expected behavior—don't try to make IDs consecutive.
- Heap storage: no clustered PK by default (unlike SQL Server/MySQL InnoDB); is one-off reorganization, not maintained on subsequent inserts. Row order on disk is insertion order unless explicitly clustered.
CLUSTER - MVCC: updates/deletes leave dead tuples; vacuum handles them—design to avoid hot wide-row churn.
- 标识符:未加引号的标识符会被转为小写。避免使用带引号/大小写混合的名称。惯例:表/列名使用格式。
snake_case - 唯一约束与NULL值:UNIQUE约束允许多个NULL值。使用(PG15及以上版本)来限制仅允许一个NULL值。
UNIQUE (...) NULLS NOT DISTINCT - 外键索引:PostgreSQL不会自动为外键列创建索引,需要手动添加。
- 无隐式类型转换:长度/精度溢出会直接报错(不会截断)。例如:将999插入列会失败报错,不像部分数据库会自动截断或四舍五入。
NUMERIC(2,0) - 序列/自增列存在间隙(属于正常现象;无需“修复”)。回滚、崩溃和并发事务会导致ID序列出现间隙(如1,2,5,6...)。这是预期行为——不要试图让ID连续。
- 堆存储:默认没有聚簇主键(与SQL Server/MySQL InnoDB不同);是一次性重组操作,后续插入不会维持聚簇顺序。磁盘上的行顺序为插入顺序,除非显式执行聚簇操作。
CLUSTER - MVCC多版本并发控制:更新/删除操作会留下死元组;由vacuum进程处理——设计时应避免频繁更新宽行数据导致的膨胀。
Data Types
数据类型
- IDs: preferred (
BIGINT GENERATED ALWAYS AS IDENTITYalso fine);GENERATED BY DEFAULTwhen merging/federating/used in a distributed system or for opaque IDs. Generate withUUID(preferred if using PG18+) oruuidv7()(if using an older PG version).gen_random_uuid() - Integers: prefer unless storage space is critical;
BIGINTfor smaller ranges; avoidINTEGERunless constrained.SMALLINT - Floats: prefer over
DOUBLE PRECISIONunless storage space is critical. UseREALfor exact decimal arithmetic.NUMERIC - Strings: prefer ; if length limits needed, use
TEXTinstead ofCHECK (LENGTH(col) <= n); avoidVARCHAR(n). UseCHAR(n)for binary data. Large strings/binary (>2KB default threshold) automatically stored in TOAST with compression. TOAST storage:BYTEA(no TOAST),PLAIN(compress + out-of-line),EXTENDED(out-of-line, no compress),EXTERNAL(compress, keep in-line if possible). DefaultMAINusually optimal. Control withEXTENDEDandALTER TABLE tbl ALTER COLUMN col SET STORAGE strategyfor threshold. Case-insensitive: for locale/accent handling use non-deterministic collations; for plain ASCII use expression indexes onALTER TABLE tbl SET (toast_tuple_target = 4096)(preferred unless column needs case-insensitive PK/FK/UNIQUE) orLOWER(col).CITEXT - Money: (never float).
NUMERIC(p,s) - Time: for timestamps;
TIMESTAMPTZfor date-only;DATEfor durations. AvoidINTERVAL(without timezone). UseTIMESTAMPfor transaction start time,now()for current wall-clock time.clock_timestamp() - Booleans: with
BOOLEANconstraint unless tri-state values are required.NOT NULL - Enums: for small, stable sets (e.g. US states, days of week). For business-logic-driven and evolving values (e.g. order statuses) → use TEXT (or INT) + CHECK or lookup table.
CREATE TYPE ... AS ENUM - Arrays: ,
TEXT[], etc. Use for ordered lists where you query elements. Index with GIN for containment (INTEGER[],@>) and overlap (<@) queries. Access:&&(1-indexed),arr[1](slicing). Good for tags, categories; avoid for relations—use junction tables instead. Literal syntax:arr[1:3]or'{val1,val2}'.ARRAY[val1,val2] - Range types: ,
daterange,numrangefor intervals. Support overlap (tstzrange), containment (&&), operators. Index with GiST. Good for scheduling, versioning, numeric ranges. Pick a bounds scheme and use it consistently; prefer@>(inclusive/exclusive) by default.[) - Network types: for IP addresses,
INETfor network ranges,CIDRfor MAC addresses. Support network operators (MACADDR,<<,>>).&& - Geometric types: avoid ,
POINT,LINE,POLYGON. Index with GiST. Consider PostGIS for spatial features.CIRCLE - Text search: for full-text search documents,
TSVECTORfor search queries. IndexTSQUERYwith GIN. Always specify language:tsvectorandto_tsvector('english', col). Never use single-argument versions. This applies to both index expressions and queries.to_tsquery('english', 'query') - Domain types: for reusable custom types with validation. Enforces constraints across tables.
CREATE DOMAIN email AS TEXT CHECK (VALUE ~ '^[^@]+@[^@]+$') - Composite types: for structured data within columns. Access with
CREATE TYPE address AS (street TEXT, city TEXT, zip TEXT)syntax.(col).field - JSONB: preferred over JSON; index with GIN. Use only for optional/semi-structured attrs. ONLY use JSON if the original ordering of the contents MUST be preserved.
- Vector types: type by
vectorfor vector similarity search for embeddings.pgvector
- ID列:优先选择(
BIGINT GENERATED ALWAYS AS IDENTITY也可);在合并/联邦/分布式系统中使用或需要不透明ID时使用GENERATED BY DEFAULT。使用UUID(PG18及以上版本优先)或uuidv7()(旧版本PG)生成。gen_random_uuid() - 整数:优先使用,除非存储空间至关重要;小范围数值使用
BIGINT;避免使用INTEGER,除非有明确限制。SMALLINT - 浮点数:优先使用而非
DOUBLE PRECISION,除非存储空间至关重要。如需精确十进制运算使用REAL。NUMERIC - 字符串:优先使用;如需长度限制,使用
TEXT而非CHECK (LENGTH(col) <= n);避免使用VARCHAR(n)。二进制数据使用CHAR(n)。大字符串/二进制数据(超过默认2KB阈值)会自动存储在TOAST中并压缩。TOAST存储策略:BYTEA(不使用TOAST)、PLAIN(压缩+离线存储)、EXTENDED(离线存储、不压缩)、EXTERNAL(压缩、尽可能在线存储)。默认MAIN通常为最优选择。使用EXTENDED和ALTER TABLE tbl ALTER COLUMN col SET STORAGE strategy来控制阈值。大小写不敏感:针对区域设置/重音处理使用非确定性排序规则;纯ASCII场景优先使用ALTER TABLE tbl SET (toast_tuple_target = 4096)的表达式索引(除非列需要大小写不敏感的PK/FK/UNIQUE约束)或LOWER(col)类型。CITEXT - 金额:使用(绝不要用浮点数)。
NUMERIC(p,s) - 时间:时间戳使用;仅日期使用
TIMESTAMPTZ;时长使用DATE。避免使用INTERVAL(无时区)。事务开始时间使用TIMESTAMP,当前系统时间使用now()。clock_timestamp() - 布尔值:使用并添加
BOOLEAN约束,除非需要三态值。NOT NULL - 枚举类型:对于小型、稳定的集合(如美国各州、星期几)使用。对于业务逻辑驱动且不断演变的值(如订单状态)→ 使用TEXT(或INT)+ CHECK约束或查找表。
CREATE TYPE ... AS ENUM - 数组:、
TEXT[]等。用于需要查询元素的有序列表。使用GIN索引支持包含查询(INTEGER[]、@>)和重叠查询(<@)。访问方式:&&(从1开始索引)、arr[1](切片)。适用于标签、分类;避免用于关系型数据——改用关联表。字面量语法:arr[1:3]或'{val1,val2}'。ARRAY[val1,val2] - 范围类型:、
daterange、numrange用于表示区间。支持重叠(tstzrange)、包含(&&)等操作符。使用GiST索引。适用于调度、版本控制、数值范围。选择一种边界规则并保持一致;默认优先使用@>(左闭右开)。[) - 网络类型:用于IP地址,
INET用于网络范围,CIDR用于MAC地址。支持网络操作符(MACADDR、<<、>>)。&& - 几何类型:避免使用、
POINT、LINE、POLYGON。使用GiST索引。如需空间功能考虑使用PostGIS扩展。CIRCLE - 全文搜索:用于全文搜索文档,
TSVECTOR用于搜索查询。使用GIN索引TSQUERY列。必须指定语言:tsvector和to_tsvector('english', col)。绝不要使用单参数版本。此规则适用于索引表达式和查询语句。to_tsquery('english', 'query') - 域类型:用于创建带验证的可复用自定义类型,可跨表强制执行约束。
CREATE DOMAIN email AS TEXT CHECK (VALUE ~ '^[^@]+@[^@]+$') - 复合类型:用于在列中存储结构化数据。使用
CREATE TYPE address AS (street TEXT, city TEXT, zip TEXT)语法访问字段。(col).field - JSONB:优先于JSON使用;使用GIN索引。仅用于可选/半结构化属性。仅当必须保留内容原始顺序时才使用JSON。
- 向量类型:pgvector扩展提供的类型,用于嵌入向量的相似性搜索。
vector
Do not use the following data types
请勿使用以下数据类型
- DO NOT use (without time zone); DO use
timestampinstead.timestamptz - DO NOT use or
char(n); DO usevarchar(n)instead.text - DO NOT use type; DO use
moneyinstead.numeric - DO NOT use type; DO use
timetzinstead.timestamptz - DO NOT use or any other precision specification; DO use
timestamptz(0)insteadtimestamptz - DO NOT use type; DO use
serialinstead.generated always as identity - DO NOT use ,
POINT,LINE,POLYGONbuilt-in types, DO useCIRCLEfrom postgis extension instead.geometry
- 请勿使用(无时区);请改用
timestamp。timestamptz - 请勿使用或
char(n);请改用varchar(n)。text - 请勿使用类型;请改用
money。numeric - 请勿使用类型;请改用
timetz。timestamptz - 请勿使用或任何精度指定;请改用
timestamptz(0)。timestamptz - 请勿使用类型;请改用
serial。generated always as identity - 请勿使用、
POINT、LINE、POLYGON内置类型;请改用postgis扩展的CIRCLE类型。geometry
Table Types
表类型
- Regular: default; fully durable, logged.
- TEMPORARY: session-scoped, auto-dropped, not logged. Faster for scratch work.
- UNLOGGED: persistent but not crash-safe. Faster writes; good for caches/staging.
- 常规表:默认类型;完全持久化、已记录日志。
- 临时表:会话级作用域,自动删除,不记录日志。适用于临时操作,速度更快。
- 无日志表:持久化但不保证崩溃安全。写入速度更快;适用于缓存/临时 staging 数据。
Row-Level Security
行级安全
Enable with . Create policies: . Built-in user-based access control at the row level.
ALTER TABLE tbl ENABLE ROW LEVEL SECURITYCREATE POLICY user_access ON orders FOR SELECT TO app_users USING (user_id = current_user_id())使用启用。创建策略:。实现基于用户的行级内置访问控制。
ALTER TABLE tbl ENABLE ROW LEVEL SECURITYCREATE POLICY user_access ON orders FOR SELECT TO app_users USING (user_id = current_user_id())Constraints
约束
- PK: implicit UNIQUE + NOT NULL; creates a B-tree index.
- FK: specify action (
ON DELETE/UPDATE,CASCADE,RESTRICT,SET NULL). Add explicit index on referencing column—speeds up joins and prevents locking issues on parent deletes/updates. UseSET DEFAULTfor circular FK dependencies checked at transaction end.DEFERRABLE INITIALLY DEFERRED - UNIQUE: creates a B-tree index; allows multiple NULLs unless (PG15+). Standard behavior:
NULLS NOT DISTINCTand(1, NULL)are allowed. With(1, NULL): only oneNULLS NOT DISTINCTallowed. Prefer(1, NULL)unless you specifically need duplicate NULLs.NULLS NOT DISTINCT - CHECK: row-local constraints; NULL values pass the check (three-valued logic). Example: allows NULL prices. Combine with
CHECK (price > 0)to enforce:NOT NULL.price NUMERIC NOT NULL CHECK (price > 0) - EXCLUDE: prevents overlapping values using operators. prevents double-booking rooms. Requires appropriate index type (often GiST).
EXCLUDE USING gist (room_id WITH =, booking_period WITH &&)
- 主键(PK):隐式包含UNIQUE + NOT NULL约束;会创建B-tree索引。
- 外键(FK):指定动作(
ON DELETE/UPDATE、CASCADE、RESTRICT、SET NULL)。为引用列添加显式索引——加速连接并避免父表删除/更新时的锁定问题。对于循环外键依赖,使用SET DEFAULT在事务结束时检查约束。DEFERRABLE INITIALLY DEFERRED - 唯一约束(UNIQUE):创建B-tree索引;允许多个NULL值,除非使用(PG15及以上版本)。标准行为:
NULLS NOT DISTINCT和(1, NULL)是允许的。使用(1, NULL)时:仅允许一个NULLS NOT DISTINCT。除非明确需要重复NULL值,否则优先使用(1, NULL)。NULLS NOT DISTINCT - 检查约束(CHECK):行级本地约束;NULL值会通过检查(三值逻辑)。例如:允许price为NULL。结合
CHECK (price > 0)约束强制执行:NOT NULL。price NUMERIC NOT NULL CHECK (price > 0) - 排除约束(EXCLUDE):使用操作符防止值重叠。防止房间重复预订。需要对应类型的索引(通常为GiST)。
EXCLUDE USING gist (room_id WITH =, booking_period WITH &&)
Indexing
索引
- B-tree: default for equality/range queries (,
=,<,>,BETWEEN)ORDER BY - Composite: order matters—index used if equality on leftmost prefix (uses index on
WHERE a = ? AND b > ?, but(a,b)does not). Put most selective/frequently filtered columns first.WHERE b = ? - Covering: - includes non-key columns for index-only scans without visiting table.
CREATE INDEX ON tbl (id) INCLUDE (name, email) - Partial: for hot subsets (→
WHERE status = 'active'). Any query withCREATE INDEX ON tbl (user_id) WHERE status = 'active'can use this index.status = 'active' - Expression: for computed search keys (). Expression must match exactly in WHERE clause:
CREATE INDEX ON tbl (LOWER(email)).WHERE LOWER(email) = 'user@example.com' - GIN: JSONB containment/existence, arrays (,
@>), full-text search (?)@@ - GiST: ranges, geometry, exclusion constraints
- BRIN: very large, naturally ordered data (time-series)—minimal storage overhead. Effective when row order on disk correlates with indexed column (insertion order or after ).
CLUSTER
- B-tree:默认用于等值/范围查询(、
=、<、>、BETWEEN)ORDER BY - 复合索引:列顺序至关重要——如果查询对最左侧前缀列使用等值条件(如),则会使用
WHERE a = ? AND b > ?索引,但(a,b)不会使用该索引。将选择性最高/过滤最频繁的列放在前面。WHERE b = ? - 覆盖索引:- 包含非键列,支持仅索引扫描,无需访问表数据。
CREATE INDEX ON tbl (id) INCLUDE (name, email) - 部分索引:针对热点子集(如→
WHERE status = 'active')。任何包含CREATE INDEX ON tbl (user_id) WHERE status = 'active'的查询都可使用该索引。status = 'active' - 表达式索引:针对计算后的搜索键(如)。WHERE子句中的表达式必须与索引表达式完全匹配:
CREATE INDEX ON tbl (LOWER(email))。WHERE LOWER(email) = 'user@example.com' - GIN:JSONB包含/存在查询、数组(、
@>)、全文搜索(?)@@ - GiST:范围、几何、排除约束
- BRIN:适用于超大型、自然有序的数据(时间序列)——存储开销极小。当磁盘上的行顺序与索引列相关时(插入顺序或执行后)效果最佳。
CLUSTER
Partitioning
分区
- Use for very large tables (>100M rows) where queries consistently filter on partition key (often time/date).
- Alternate use: use for tables where data maintenance tasks dictates e.g. data pruned or bulk replaced periodically
- RANGE: common for time-series (). Create partitions:
PARTITION BY RANGE (created_at). TimescaleDB automates time-based or ID-based partitioning with retention policies and compression.CREATE TABLE logs_2024_01 PARTITION OF logs FOR VALUES FROM ('2024-01-01') TO ('2024-02-01') - LIST: for discrete values (). Example:
PARTITION BY LIST (region).FOR VALUES IN ('us-east', 'us-west') - HASH: for even distribution when no natural key (). Creates N partitions with modulus.
PARTITION BY HASH (user_id) - Constraint exclusion: requires constraints on partitions for query planner to prune. Auto-created for declarative partitioning (PG10+).
CHECK - Prefer declarative partitioning or hypertables. Do NOT use table inheritance.
- Limitations: no global UNIQUE constraints—include partition key in PK/UNIQUE. FKs from partitioned tables not supported; use triggers.
- 适用于超大型表(超过1亿行),且查询始终按分区键过滤(通常为时间/日期)。
- 另一种场景:数据维护需求,例如定期清理或批量替换数据
- 范围分区:时间序列数据常用()。创建分区:
PARTITION BY RANGE (created_at)。TimescaleDB扩展可自动实现基于时间或ID的分区,并提供保留策略和压缩功能。CREATE TABLE logs_2024_01 PARTITION OF logs FOR VALUES FROM ('2024-01-01') TO ('2024-02-01') - 列表分区:针对离散值()。示例:
PARTITION BY LIST (region)。FOR VALUES IN ('us-east', 'us-west') - 哈希分区:无自然键时实现均匀分布()。创建N个分区,基于取模运算分配数据。
PARTITION BY HASH (user_id) - 约束排除:需要为分区添加约束,以便查询规划器进行分区裁剪。声明式分区(PG10及以上版本)会自动创建。
CHECK - 优先使用声明式分区或 hypertables。请勿使用表继承。
- 限制:无全局UNIQUE约束——需将分区键包含在PK/UNIQUE约束中。分区表的外键不被支持;使用触发器替代。
Special Considerations
特殊场景考量
Update-Heavy Tables
频繁更新的表
- Separate hot/cold columns—put frequently updated columns in separate table to minimize bloat.
- Use to leave space for HOT updates that avoid index maintenance.
fillfactor=90 - Avoid updating indexed columns—prevents beneficial HOT updates.
- Partition by update patterns—separate frequently updated rows in a different partition from stable data.
- 分离冷热列——将频繁更新的列放在单独的表中,以最小化数据膨胀。
- 使用——预留空间用于HOT更新,避免维护索引。
fillfactor=90 - 避免更新带索引的列——会阻止有效的HOT更新。
- 按更新模式分区——将频繁更新的行与稳定数据放在不同分区。
Insert-Heavy Workloads
频繁插入的工作负载
- Minimize indexes—only create what you query; every index slows inserts.
- Use or multi-row
COPYinstead of single-row inserts.INSERT - UNLOGGED tables for rebuildable staging data—much faster writes.
- Defer index creation for bulk loads—>drop index, load data, recreate indexes.
- Partition by time/hash to distribute load. TimescaleDB automates partitioning and compression of insert-heavy data.
- Use a natural key for primary key such as a (timestamp, device_id) if enforcing global uniqueness is important many insert-heavy tables don't need a primary key at all.
- If you do need a surrogate key, Prefer over
BIGINT GENERATED ALWAYS AS IDENTITY.UUID
- 最小化索引数量——仅创建实际需要查询的索引;每个索引都会减慢插入速度。
- **使用或多行
COPY**替代单行插入。INSERT - 无日志表用于可重建的staging数据——写入速度快得多。
- 批量加载时延迟创建索引——先删除索引,加载数据,再重新创建索引。
- 按时间/哈希分区分散负载。TimescaleDB扩展可自动实现频繁插入数据的分区和压缩。
- 使用自然键作为主键,如(timestamp, device_id),如果需要强制全局唯一性;许多频繁插入的表根本不需要主键。
- 如果确实需要代理键,优先选择而非
BIGINT GENERATED ALWAYS AS IDENTITY。UUID
Upsert-Friendly Design
友好的Upsert设计
- Requires UNIQUE index on conflict target columns—needs exact matching unique index (partial indexes don't work).
ON CONFLICT (col1, col2) - Use to reference would-be-inserted values; only update columns that actually changed to reduce write overhead.
EXCLUDED.column - faster than
DO NOTHINGwhen no actual update needed.DO UPDATE
- 需要UNIQUE索引在冲突目标列上——需要完全匹配的唯一索引(部分索引无效)。
ON CONFLICT (col1, col2) - **使用**引用待插入的值;仅更新实际变化的列以减少写入开销。
EXCLUDED.column - 比
DO NOTHING更快当无需实际更新时。DO UPDATE
Safe Schema Evolution
安全的Schema演进
- Transactional DDL: most DDL operations can run in transactions and be rolled back—for safe testing.
BEGIN; ALTER TABLE...; ROLLBACK; - Concurrent index creation: avoids blocking writes but can't run in transactions.
CREATE INDEX CONCURRENTLY - Volatile defaults cause rewrites: adding columns with volatile defaults (e.g.,
NOT NULL,now()) rewrites entire table. Non-volatile defaults are fast.gen_random_uuid() - Drop constraints before columns: then
ALTER TABLE DROP CONSTRAINTto avoid dependency issues.DROP COLUMN - Function signature changes: with different arguments creates overloads, not replacements. DROP old version if no overload desired.
CREATE OR REPLACE
- 事务性DDL:大多数DDL操作可在事务中执行并回滚——使用进行安全测试。
BEGIN; ALTER TABLE...; ROLLBACK; - 并发创建索引:避免阻塞写入,但不能在事务中执行。
CREATE INDEX CONCURRENTLY - ** volatile默认值会导致表重写**——添加带volatile默认值(如、
now())的NOT NULL列会重写整个表。非volatile默认值速度更快。gen_random_uuid() - 先删除约束再删除列——然后
ALTER TABLE DROP CONSTRAINT以避免依赖问题。DROP COLUMN - 函数签名变更:使用不同参数会创建重载而非替换。如果不需要重载,请删除旧版本函数。
CREATE OR REPLACE
Generated Columns
生成列
- for computed, indexable fields. PG18+ adds
... GENERATED ALWAYS AS (<expr>) STOREDcolumns (computed on read, not stored).VIRTUAL
- 用于可计算、可索引的字段。PG18及以上版本新增
... GENERATED ALWAYS AS (<expr>) STORED列(读取时计算,不存储)。VIRTUAL
Extensions
扩展
- :
pgcryptofor password hashing.crypt() - : alternative UUID functions; prefer
uuid-osspfor new projects.pgcrypto - : fuzzy text search with
pg_trgmoperator,%function. Index with GIN forsimilarity()acceleration.LIKE '%pattern%' - : case-insensitive text type. Prefer expression indexes on
citextunless you need case-insensitive constraints.LOWER(col) - /
btree_gin: enable mixed-type indexes (e.g., GIN index on both JSONB and text columns).btree_gist - : key-value pairs; mostly superseded by JSONB but useful for simple string mappings.
hstore - : essential for time-series—automated partitioning, retention, compression, continuous aggregates.
timescaledb - : comprehensive geospatial support beyond basic geometric types—essential for location-based applications.
postgis - : vector similarity search for embeddings.
pgvector - : audit logging for all database activity.
pgaudit
- :
pgcrypto用于密码哈希。crypt() - :替代UUID函数;新项目优先使用
uuid-ossp。pgcrypto - :模糊文本搜索,支持
pg_trgm操作符、%函数。使用GIN索引加速similarity()查询。LIKE '%pattern%' - :大小写不敏感文本类型。除非需要大小写不敏感约束,否则优先使用
citext的表达式索引。LOWER(col) - /
btree_gin:支持混合类型索引(如同时包含JSONB和文本列的GIN索引)。btree_gist - :键值对;大多已被JSONB取代,但适用于简单字符串映射。
hstore - :时间序列数据必备——自动分区、数据保留、压缩、连续聚合。
timescaledb - :提供超越基础几何类型的全面空间支持——位置应用必备。
postgis - :嵌入向量的相似性搜索。
pgvector - :所有数据库活动的审计日志。
pgaudit
JSONB Guidance
JSONB使用指南
- Prefer with GIN index.
JSONB - Default: → accelerates:
CREATE INDEX ON tbl USING GIN (jsonb_col);- Containment
jsonb_col @> '{"k":"v"}' - Key existence , any/all keys
jsonb_col ? 'k',?\|?& - Path containment on nested docs
- Disjunction
jsonb_col @> ANY(ARRAY['{"status":"active"}', '{"status":"pending"}'])
- Containment
- Heavy workloads: consider opclass
@>for smaller/faster containment-only indexes:jsonb_path_opsCREATE INDEX ON tbl USING GIN (jsonb_col jsonb_path_ops);- Trade-off: loses support for key existence (,
?,?|) queries—only supports containment (?&)@>
- Equality/range on a specific scalar field: extract and index with B-tree (generated column or expression):
ALTER TABLE tbl ADD COLUMN price INT GENERATED ALWAYS AS ((jsonb_col->>'price')::INT) STORED;CREATE INDEX ON tbl (price);- Prefer queries like (uses B-tree) over
WHERE price BETWEEN 100 AND 500without index.WHERE (jsonb_col->>'price')::INT BETWEEN 100 AND 500
- Arrays inside JSONB: use GIN + for containment (e.g., tags). Consider
@>if only doing containment.jsonb_path_ops - Keep core relations in tables; use JSONB for optional/variable attributes.
- Use constraints to limit allowed JSONB values in a column e.g.
config JSONB NOT NULL CHECK(jsonb_typeof(config) = 'object')
- 优先使用并搭配GIN索引。
JSONB - 默认创建方式:→ 加速以下操作:
CREATE INDEX ON tbl USING GIN (jsonb_col);- 包含查询
jsonb_col @> '{"k":"v"}' - 键存在性查询 、任意/所有键存在
jsonb_col ? 'k'、?|?& - 嵌套文档的路径包含查询
- 析取查询
jsonb_col @> ANY(ARRAY['{"status":"active"}', '{"status":"pending"}'])
- 包含查询
- 频繁进行查询的场景:考虑使用
@>操作类创建更小、更快的仅包含查询索引:jsonb_path_opsCREATE INDEX ON tbl USING GIN (jsonb_col jsonb_path_ops);- 权衡:失去键存在性查询(,
?,?|)支持——仅支持包含查询(?&)@>
- 针对特定标量字段的等值/范围查询:提取字段并创建B-tree索引(使用生成列或表达式):
ALTER TABLE tbl ADD COLUMN price INT GENERATED ALWAYS AS ((jsonb_col->>'price')::INT) STORED;CREATE INDEX ON tbl (price);- 优先使用(使用B-tree索引)而非无索引的
WHERE price BETWEEN 100 AND 500。WHERE (jsonb_col->>'price')::INT BETWEEN 100 AND 500
- JSONB内的数组:使用GIN + 进行包含查询(如标签)。如果仅进行包含查询,考虑使用
@>。jsonb_path_ops - 核心关系数据存储在表中;JSONB用于存储可选/可变属性。
- 使用约束限制列中允许的JSONB值,例如
config JSONB NOT NULL CHECK(jsonb_typeof(config) = 'object')
Examples
示例
Users
用户表
sql
CREATE TABLE users (
user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
email TEXT NOT NULL UNIQUE,
name TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE UNIQUE INDEX ON users (LOWER(email));
CREATE INDEX ON users (created_at);sql
CREATE TABLE users (
user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
email TEXT NOT NULL UNIQUE,
name TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE UNIQUE INDEX ON users (LOWER(email));
CREATE INDEX ON users (created_at);Orders
订单表
sql
CREATE TABLE orders (
order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES users(user_id),
status TEXT NOT NULL DEFAULT 'PENDING' CHECK (status IN ('PENDING','PAID','CANCELED')),
total NUMERIC(10,2) NOT NULL CHECK (total > 0),
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX ON orders (user_id);
CREATE INDEX ON orders (created_at);sql
CREATE TABLE orders (
order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES users(user_id),
status TEXT NOT NULL DEFAULT 'PENDING' CHECK (status IN ('PENDING','PAID','CANCELED')),
total NUMERIC(10,2) NOT NULL CHECK (total > 0),
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX ON orders (user_id);
CREATE INDEX ON orders (created_at);JSONB
JSONB示例
sql
CREATE TABLE profiles (
user_id BIGINT PRIMARY KEY REFERENCES users(user_id),
attrs JSONB NOT NULL DEFAULT '{}',
theme TEXT GENERATED ALWAYS AS (attrs->>'theme') STORED
);
CREATE INDEX profiles_attrs_gin ON profiles USING GIN (attrs);sql
CREATE TABLE profiles (
user_id BIGINT PRIMARY KEY REFERENCES users(user_id),
attrs JSONB NOT NULL DEFAULT '{}',
theme TEXT GENERATED ALWAYS AS (attrs->>'theme') STORED
);
CREATE INDEX profiles_attrs_gin ON profiles USING GIN (attrs);