setup-timescaledb-hypertables
TimescaleDB Complete Setup
Instructions for insert-heavy data patterns where data is inserted but rarely changed:
- Time-series data (sensors, metrics, system monitoring)
- Event logs (user events, audit trails, application logs)
- Transaction records (orders, payments, financial transactions)
- Sequential data (records with auto-incrementing IDs and timestamps)
- Append-only datasets (immutable records, historical data)
Step 1: Create Hypertable
```sql
CREATE TABLE your_table_name (
    timestamp   TIMESTAMPTZ NOT NULL,
    entity_id   TEXT NOT NULL,      -- device_id, user_id, symbol, etc.
    category    TEXT,               -- sensor_type, event_type, asset_class, etc.
    value_1     DOUBLE PRECISION,   -- price, temperature, latency, etc.
    value_2     DOUBLE PRECISION,   -- volume, humidity, throughput, etc.
    value_3     INTEGER,            -- count, status, level, etc.
    metadata    JSONB               -- flexible additional data
) WITH (
    tsdb.hypertable,
    tsdb.partition_column='timestamp',
    tsdb.enable_columnstore=true,   -- Disable if table has vector columns
    tsdb.segmentby='entity_id',     -- See selection guide below
    tsdb.orderby='timestamp DESC',  -- See selection guide below
    tsdb.sparse_index='minmax(value_1),minmax(value_2),minmax(value_3)' -- See selection guide below
);
```
Compression Decision
- Enable by default for insert-heavy patterns
- Disable if table has vector type columns (pgvector) - indexes on vector columns incompatible with columnstore
Partition Column Selection
Must be time-based (TIMESTAMP/TIMESTAMPTZ/DATE) or integer (INT/BIGINT) with good temporal/sequential distribution.
Common patterns:
- TIME-SERIES: `timestamp`, `event_time`, `measured_at`
- EVENT LOGS: `event_time`, `created_at`, `logged_at`
- TRANSACTIONS: `created_at`, `transaction_time`, `processed_at`
- SEQUENTIAL: `id`, `sequence_number` (auto-increment when no timestamp)
- APPEND-ONLY: `created_at`, `inserted_at`, `id`
Less ideal: `ingested_at` (when data entered the system - use only if it's your primary query dimension)
Avoid: `updated_at` (breaks time ordering unless it's your primary query dimension)
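For the SEQUENTIAL pattern, an integer partition column works in place of a timestamp; a minimal sketch (the table name and interval value here are hypothetical, not from this guide):

```sql
-- Hypothetical append-only table partitioned by an auto-incrementing BIGINT
CREATE TABLE events_seq (
    id      BIGINT GENERATED ALWAYS AS IDENTITY,
    payload JSONB
) WITH (
    tsdb.hypertable,
    tsdb.partition_column='id'
);
-- Integer partition columns take the chunk interval in units of the column,
-- e.g. one chunk per 1,000,000 ids (tune to your insert rate)
SELECT set_chunk_time_interval('events_seq', 1000000);
```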
Segment_By Column Selection
PREFER SINGLE COLUMN - multi-column rarely optimal. Multi-column can only work for highly correlated columns (e.g., metric_name + metric_type) with sufficient row density.
Requirements:
- Frequently used in WHERE clauses (most common filter)
- Good row density (>100 rows per value per chunk)
- Primary logical partition/grouping
Examples:
- IoT: `device_id`
- Finance: `symbol`
- Metrics: `service_name`; `service_name, metric_type` (if sufficient row density); `metric_name, metric_type` (if sufficient row density)
- Analytics: `user_id` if sufficient row density, otherwise `session_id`
- E-commerce: `product_id` if sufficient row density, otherwise `category_id`
Row density guidelines:
- Target: >100 rows per segment_by value within each chunk.
- Poor: <10 rows per segment_by value per chunk → choose a less granular column
- What to do with low-density columns: prepend to the order_by column list.
Query pattern drives choice:
```sql
SELECT * FROM table WHERE entity_id = 'X' AND timestamp > ...
-- ↳ segment_by: entity_id (if >100 rows per chunk)
```
Avoid: timestamps, unique IDs, low-density columns (<100 rows/value/chunk), columns rarely used in filtering
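To sanity-check row density before committing to a segment_by column, counting rows per candidate value per chunk-sized window can help; a rough sketch, assuming the Step 1 table and a 7-day chunk interval:

```sql
-- Approximate rows per segment_by value per chunk by bucketing at the
-- chunk interval (7 days here); low counts suggest a less granular column
SELECT time_bucket(INTERVAL '7 days', timestamp) AS chunk_window,
       entity_id,
       COUNT(*) AS rows_per_value
FROM your_table_name
GROUP BY chunk_window, entity_id
ORDER BY rows_per_value ASC
LIMIT 20;
```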
Order_By Column Selection
Creates natural time-series progression when combined with segment_by for optimal compression.
Most common: `timestamp DESC`
Examples:
- IoT/Finance/E-commerce: `timestamp DESC`
- Metrics: `metric_name, timestamp DESC` (if metric_name has too low density for segment_by)
- Analytics: `user_id, timestamp DESC` (if user_id has too low density for segment_by)
Alternative patterns:
- `sequence_id DESC` for event streams with sequence numbers
- `timestamp DESC, event_order DESC` for sub-ordering within the same timestamp
Low-density column handling:
If a column has <100 rows per chunk (too low for segment_by), prepend it to order_by:
- Example: `metric_name` has 20 rows/chunk → use `segmentby='service_name'`, `orderby='metric_name, timestamp DESC'`
- Groups similar values together (all temperature readings, then pressure readings) for better compression
Good test: the ordering created by `(segment_by_column, order_by_column)` should form a natural time-series progression. Values close to each other in the progression should be similar.
Avoid in order_by: random columns, columns with high variance between adjacent rows, columns unrelated to segment_by
Compression Sparse Index Selection
Sparse indexes enable query filtering on compressed data without decompression. Store metadata per batch (~1000 rows) to eliminate batches that don't match query predicates.
Types:
- minmax: min/max values per batch - for range queries (>, <, BETWEEN) on numeric/temporal columns
Use minmax for:
- Range filtering on columns like price, temperature, measurement, timestamp
- Outlier detection (e.g., temperature > 90)
- Fields highly correlated with the segmentby and orderby columns (e.g., if orderby includes `created_at`, minmax on `updated_at` is useful)
Avoid: rarely filtered columns.
IMPORTANT: NEVER index columns already in segmentby or orderby. Orderby columns always get minmax indexes without any configuration.
Configuration:
The format is a comma-separated list of `type_of_index(column_name)`.
```sql
ALTER TABLE table_name SET (
    timescaledb.sparse_index = 'minmax(value_1),minmax(value_2)'
);
```
Explicit configuration is available since v2.22.0 (sparse indexes were auto-created since v2.16.0).
Chunk Time Interval (Optional)
Default: 7 days (use if volume unknown, or ask user). Adjust based on volume:
- High frequency: 1 hour - 1 day
- Medium: 1 day - 1 week
- Low: 1 week - 1 month
```sql
SELECT set_chunk_time_interval('your_table_name', INTERVAL '1 day');
```
Good test: recent chunk indexes should fit in less than 25% of RAM.
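One way to apply the 25%-of-RAM test is TimescaleDB's `chunks_detailed_size()` function; a sketch against the Step 1 table:

```sql
-- Per-chunk index sizes for the hypertable; compare recent chunks'
-- index_bytes against ~25% of available RAM when sizing the chunk interval
SELECT chunk_name,
       pg_size_pretty(index_bytes) AS index_size,
       pg_size_pretty(total_bytes) AS total_size
FROM chunks_detailed_size('your_table_name')
ORDER BY chunk_name DESC;
```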
Indexes & Primary Keys
Common index patterns - composite indexes on an id and timestamp:
```sql
CREATE INDEX idx_entity_timestamp ON your_table_name (entity_id, timestamp DESC);
```
Important: Only create indexes you'll actually use - each has maintenance overhead.
Primary key and unique constraint rules: must include the partition column.
Option 1: Composite PK with partition column
```sql
ALTER TABLE your_table_name ADD PRIMARY KEY (entity_id, timestamp);
```
Option 2: Single-column PK (only if it's the partition column)
```sql
CREATE TABLE ... (id BIGINT PRIMARY KEY, ...) WITH (tsdb.partition_column='id');
```
Option 3: No PK - strict uniqueness is often not required for insert-heavy patterns.
Step 2: Compression Policy (Optional)
IMPORTANT: If you used `tsdb.enable_columnstore=true` in Step 1, starting with TimescaleDB 2.23 a columnstore policy is automatically created with `after => INTERVAL '7 days'`. You only need to call `add_columnstore_policy()` if you want to customize the `after` interval to something other than 7 days.
Set the `after` interval for when: data becomes mostly immutable (some updates/backfill OK) AND B-tree indexes aren't needed for queries (less common criterion).
```sql
-- In TimescaleDB 2.23 and later, only needed to override the default 7-day
-- policy created by tsdb.enable_columnstore=true
-- Remove the existing auto-created policy first:
-- CALL remove_columnstore_policy('your_table_name');
-- Then add a custom policy:
-- CALL add_columnstore_policy('your_table_name', after => INTERVAL '1 day');
```
Step 3: Retention Policy
IMPORTANT: Don't guess - ask user or comment out if unknown.
```sql
-- Example - replace with requirements or comment out
SELECT add_retention_policy('your_table_name', INTERVAL '365 days');
```
Step 4: Create Continuous Aggregates
Use different aggregation intervals for different uses.
Short-term (Minutes/Hours)
For up-to-the-minute dashboards on high-frequency data.
```sql
CREATE MATERIALIZED VIEW your_table_hourly
WITH (timescaledb.continuous) AS
SELECT
    time_bucket(INTERVAL '1 hour', timestamp) AS bucket,
    entity_id,
    category,
    COUNT(*) AS record_count,
    AVG(value_1) AS avg_value_1,
    MIN(value_1) AS min_value_1,
    MAX(value_1) AS max_value_1,
    SUM(value_2) AS sum_value_2
FROM your_table_name
GROUP BY bucket, entity_id, category;
```
Long-term (Days/Weeks/Months)
For long-term reporting and analytics.
```sql
CREATE MATERIALIZED VIEW your_table_daily
WITH (timescaledb.continuous) AS
SELECT
    time_bucket(INTERVAL '1 day', timestamp) AS bucket,
    entity_id,
    category,
    COUNT(*) AS record_count,
    AVG(value_1) AS avg_value_1,
    MIN(value_1) AS min_value_1,
    MAX(value_1) AS max_value_1,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY value_1) AS median_value_1,
    PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY value_1) AS p95_value_1,
    SUM(value_2) AS sum_value_2
FROM your_table_name
GROUP BY bucket, entity_id, category;
```
Step 5: Aggregate Refresh Policies
Set up refresh policies based on your data freshness requirements.
start_offset: Usually omit (refreshes all). Exception: If you don't care about refreshing data older than X (see below). With retention policy on raw data: match the retention policy.
end_offset: Set beyond active update window (e.g., 15 min if data usually arrives within 10 min). Data newer than end_offset won't appear in queries without real-time aggregation. If you don't know your update window, use the size of the time_bucket in the query, but not less than 5 minutes.
schedule_interval: Set to the same value as the end_offset but not more than 1 hour.
Hourly - frequent refresh for dashboards:
```sql
SELECT add_continuous_aggregate_policy('your_table_hourly',
    start_offset => NULL,
    end_offset => INTERVAL '15 minutes',
    schedule_interval => INTERVAL '15 minutes');
```
Daily - less frequent for reports:
```sql
SELECT add_continuous_aggregate_policy('your_table_daily',
    start_offset => NULL,
    end_offset => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour');
```
Use start_offset only if you don't care about refreshing old data
Use for high-volume systems where query accuracy on older data doesn't matter:
```sql
-- the following aggregate can be stale for data older than 7 days
-- SELECT add_continuous_aggregate_policy('aggregate_for_last_7_days',
--     start_offset => INTERVAL '7 days', -- only refresh last 7 days (NULL = refresh all)
--     end_offset => INTERVAL '15 minutes',
--     schedule_interval => INTERVAL '15 minutes');
```
IMPORTANT: you MUST set a start_offset less than the retention policy on the raw data. By default, set the start_offset equal to the retention policy.
If the retention policy is commented out, comment out the start_offset as well, like this:
```sql
SELECT add_continuous_aggregate_policy('your_table_daily',
    start_offset => NULL, -- Use NULL to refresh all data, or set to retention period if enabled on raw data
    -- start_offset => INTERVAL '<retention period here>', -- uncomment if retention policy is enabled on the raw data table
    end_offset => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour');
```
start_offset: 通常省略(刷新所有数据)。例外情况:如果不关心刷新早于X的数据(见下文)。如果原始数据有保留策略:与保留策略匹配。
end_offset: 设置为超出活跃更新窗口的值(例如,如果数据通常在10分钟内到达,则设置为15分钟)。早于end_offset的数据不会出现在查询结果中,除非启用了实时聚合。如果不知道更新窗口,使用查询中time_bucket的大小,但不小于5分钟。
schedule_interval: 设置为与end_offset相同的值,但不超过1小时。
小时级 - 仪表盘的频繁刷新:
sql
SELECT add_continuous_aggregate_policy('your_table_hourly',
start_offset => NULL,
end_offset => INTERVAL '15 minutes',
schedule_interval => INTERVAL '15 minutes');天级 - 报表的低频率刷新:
sql
SELECT add_continuous_aggregate_policy('your_table_daily',
start_offset => NULL,
end_offset => INTERVAL '1 hour',
schedule_interval => INTERVAL '1 hour');仅当不关心刷新旧数据时使用start_offset
适用于高容量系统,这些系统中旧数据的查询准确性不重要:
sql
-- 以下聚合对于早于7天的数据可能过时
-- SELECT add_continuous_aggregate_policy('aggregate_for_last_7_days',
-- start_offset => INTERVAL '7 days', -- only refresh last 7 days (NULL = refresh all)
-- end_offset => INTERVAL '15 minutes',
-- schedule_interval => INTERVAL '15 minutes');重要提示:必须将start_offset设置为小于原始数据的保留策略。默认情况下,将start_offset设置为与保留策略相同的值。如果保留策略被注释掉,也请注释掉start_offset。例如:
sql
SELECT add_continuous_aggregate_policy('your_table_daily',
start_offset => NULL, -- 使用NULL刷新所有数据,如果原始数据启用了保留策略则设置为保留周期
-- start_offset => INTERVAL '<retention period here>', -- 如果原始数据表启用了保留策略则取消注释
end_offset => INTERVAL '1 hour',
schedule_interval => INTERVAL '1 hour');Step 6: Real-Time Aggregation (Optional)
Real-time combines materialized + recent raw data at query time. Provides up-to-date results at the cost of higher query latency.
More useful for fine-grained aggregates (e.g., minutely) than coarse ones (e.g., daily/monthly) since large buckets will be mostly incomplete with recent data anyway.
Disabled by default in v2.13+, before that it was enabled by default.
Use when: Need data newer than end_offset, up-to-minute dashboards, can tolerate higher query latency
Disable when: performance is critical, refresh policies are sufficient, query volume is high, or missing/stale recent data is acceptable
Enable for current results (higher query cost):
```sql
ALTER MATERIALIZED VIEW your_table_hourly SET (timescaledb.materialized_only = false);
```
Disable for performance (but with stale results):
```sql
ALTER MATERIALIZED VIEW your_table_hourly SET (timescaledb.materialized_only = true);
```
Step 7: Compress Aggregates
Rule: segment_by = ALL GROUP BY columns except time_bucket, order_by = time_bucket DESC
```sql
-- Hourly
ALTER MATERIALIZED VIEW your_table_hourly SET (
    timescaledb.enable_columnstore,
    timescaledb.segmentby = 'entity_id, category',
    timescaledb.orderby = 'bucket DESC'
);
CALL add_columnstore_policy('your_table_hourly', after => INTERVAL '3 days');

-- Daily
ALTER MATERIALIZED VIEW your_table_daily SET (
    timescaledb.enable_columnstore,
    timescaledb.segmentby = 'entity_id, category',
    timescaledb.orderby = 'bucket DESC'
);
CALL add_columnstore_policy('your_table_daily', after => INTERVAL '7 days');
```
Step 8: Aggregate Retention
Aggregates are typically kept longer than raw data.
IMPORTANT: Don't guess - ask user or you MUST comment out if unknown.
```sql
-- Example - replace or comment out
SELECT add_retention_policy('your_table_hourly', INTERVAL '2 years');
SELECT add_retention_policy('your_table_daily', INTERVAL '5 years');
```
Step 9: Performance Indexes on Continuous Aggregates
Index strategy: Analyze WHERE clauses in common queries → Create indexes matching filter columns + time ordering
Pattern: `(filter_column, bucket DESC)` supports `WHERE filter_column = X AND bucket >= Y ORDER BY bucket DESC`
Examples:
```sql
CREATE INDEX idx_hourly_entity_bucket ON your_table_hourly (entity_id, bucket DESC);
CREATE INDEX idx_hourly_category_bucket ON your_table_hourly (category, bucket DESC);
```
Multi-column filters: Create composite indexes for `WHERE entity_id = X AND category = Y`:
```sql
CREATE INDEX idx_hourly_entity_category_bucket ON your_table_hourly (entity_id, category, bucket DESC);
```
Important: Only create indexes you'll actually use - each has maintenance overhead.
Step 10: Optional Enhancements
Space Partitioning (NOT RECOMMENDED)
Only for query patterns where you ALWAYS filter by the space-partition column with expert knowledge and extensive benchmarking. STRONGLY prefer time-only partitioning.
Step 11: Verify Configuration
```sql
-- Check hypertable
SELECT * FROM timescaledb_information.hypertables
WHERE hypertable_name = 'your_table_name';

-- Check compression settings
SELECT * FROM hypertable_compression_stats('your_table_name');

-- Check aggregates
SELECT * FROM timescaledb_information.continuous_aggregates;

-- Check policies
SELECT * FROM timescaledb_information.jobs ORDER BY job_id;

-- Monitor chunk information
SELECT
    chunk_name,
    range_start,
    range_end,
    is_compressed
FROM timescaledb_information.chunks
WHERE hypertable_name = 'your_table_name'
ORDER BY range_start DESC;
```
Performance Guidelines
- Chunk size: Recent chunk indexes should fit in less than 25% of RAM
- Compression: Expect 90%+ reduction (10x) with proper columnstore config
- Query optimization: Use continuous aggregates for historical queries and dashboards
- Memory: Run `timescaledb-tune` for self-hosting (auto-configured on cloud)
Schema Best Practices
Do's and Don'ts
- ✅ Use `TIMESTAMPTZ` NOT `timestamp`
- ✅ Use `>=` and `<` for timestamps NOT `BETWEEN`
- ✅ Use `TEXT` with constraints NOT `varchar(n)`/`char(n)`
- ✅ Use `snake_case` NOT `CamelCase`
- ✅ Use `BIGINT GENERATED ALWAYS AS IDENTITY` NOT `SERIAL`
- ✅ Use `BIGINT` or `INTEGER` for IDs by default over `SMALLINT`
- ✅ Use `DOUBLE PRECISION`/`REAL` by default over `FLOAT`
- ✅ Use `NUMERIC` NOT `MONEY`
- ✅ Use `NOT EXISTS` NOT `NOT IN`
- ✅ Use `time_bucket()` or `date_trunc()` for truncation NOT `timestamp(0)`
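As an illustration of the `time_bucket()` recommendation above: unlike `date_trunc()`, it accepts arbitrary interval widths. A sketch against the Step 1 table:

```sql
-- 15-minute buckets; date_trunc() can only truncate to fixed units
-- like hour or day
SELECT time_bucket(INTERVAL '15 minutes', timestamp) AS bucket,
       COUNT(*) AS rows_in_bucket
FROM your_table_name
GROUP BY bucket
ORDER BY bucket DESC;
```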
API Reference (Current vs Deprecated)
Deprecated Parameters → New Parameters:
- `timescaledb.compress` → `timescaledb.enable_columnstore`
- `timescaledb.compress_segmentby` → `timescaledb.segmentby`
- `timescaledb.compress_orderby` → `timescaledb.orderby`
Deprecated Functions → New Functions:
- `add_compression_policy()` → `add_columnstore_policy()`
- `remove_compression_policy()` → `remove_columnstore_policy()`
- `compress_chunk()` → `convert_to_columnstore()` (use with `CALL`, not `SELECT`)
- `decompress_chunk()` → `convert_to_rowstore()` (use with `CALL`, not `SELECT`)
Compression Stats (use functions, not views):
- Use function: `hypertable_compression_stats('table_name')`
- Use function: `chunk_compression_stats('_timescaledb_internal._hyper_X_Y_chunk')`
- Note: Views like `columnstore_settings` may not be available in all versions; use the functions instead
Manual Compression Example:
```sql
-- Compress a specific chunk
CALL convert_to_columnstore('_timescaledb_internal._hyper_7_1_chunk');

-- Check compression statistics
SELECT
    number_compressed_chunks,
    pg_size_pretty(before_compression_total_bytes) AS before_compression,
    pg_size_pretty(after_compression_total_bytes) AS after_compression,
    ROUND(100.0 * (1 - after_compression_total_bytes::numeric / NULLIF(before_compression_total_bytes, 0)), 1) AS compression_pct
FROM hypertable_compression_stats('your_table_name');
```
Questions to Ask User
- What kind of data will you be storing?
- How do you expect to use the data?
- What queries will you run?
- How long to keep the data?
- Column types if unclear