timescaledb

TimescaleDB 2.24.0 Time-Series Database

Overview

TimescaleDB 2.24.0 introduces transformational features: lightning-fast recompression (100x faster updates), Direct Compress integration with continuous aggregates, UUIDv7 support in aggregates, and bloom filter sparse index changes. This skill ensures you leverage these capabilities correctly.
Core principle: Time-series data has unique access patterns. Design for append-heavy, time-range queries from the start.
Announce at start: "I'm applying timescaledb to ensure TimescaleDB 2.24.0 best practices."

When This Skill Applies

This skill is MANDATORY when ANY of these patterns are touched:

| Pattern | Examples |
| --- | --- |
| **/*hypertable* | migrations/create_hypertable.sql |
| **/*timeseries* | models/timeseries.ts |
| **/*metrics* | services/metricsService.ts |
| **/*events* | db/events.sql |
| **/*logs* | tables/logs.sql |
| **/*sensor* | iot/sensor_data.sql |
| **/*continuous_agg* | views/hourly_stats.sql |
| **/*compression* | policies/compression.sql |
Or when files contain:
sql
-- These patterns trigger this skill
create_hypertable
continuous aggregate
compress_chunk
add_compression_policy

TimescaleDB 2.24.0 Features

1. Lightning-Fast Recompression

TimescaleDB 2.24.0 introduces recompress := true for dramatically faster updates to compressed data:
sql
-- OLD (2.23 and earlier): Decompress entire chunk, update, recompress
-- Could take minutes for large chunks

-- NEW (2.24.0): Update compressed data directly
UPDATE sensor_data
SET value = corrected_value
WHERE time BETWEEN '2026-01-01' AND '2026-01-02';
-- 100x faster for compressed chunks

-- Enable recompression mode (automatic in 2.24.0)
-- Updates to compressed chunks now:
-- 1. Identify affected segments
-- 2. Decompress only those segments
-- 3. Apply updates
-- 4. Recompress immediately

-- Verify recompression is happening
SELECT * FROM timescaledb_information.job_stats
WHERE job_id IN (
  SELECT job_id FROM timescaledb_information.jobs
  WHERE proc_name = 'policy_recompression'
);
When this matters:
  • Late-arriving data corrections
  • Backfill operations
  • Data quality fixes
  • Retroactive updates
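
After a backfill or correction touches compressed chunks, it is worth confirming that the affected chunks returned to a compressed state. A minimal check, assuming the sensor_data hypertable from above and using the standard chunk_compression_stats() helper:

```sql
-- Per-chunk compression status and sizes (compare before/after a backfill)
SELECT chunk_name,
       compression_status,
       before_compression_total_bytes,
       after_compression_total_bytes
FROM chunk_compression_stats('sensor_data')
ORDER BY chunk_name;

-- Any chunk left partially compressed can be recompressed by hand;
-- if_not_compressed skips chunks that are already fully compressed
SELECT compress_chunk(c, if_not_compressed => true)
FROM show_chunks('sensor_data', older_than => INTERVAL '7 days') c;
```

The compression policy will also pick up partially compressed chunks on its next scheduled run, so the manual call is only needed when you want the space back immediately.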

2. Direct Compress with Continuous Aggregates

Continuous aggregates can now compress directly without materialized hypertable overhead:
sql
-- Create continuous aggregate with direct compression
CREATE MATERIALIZED VIEW hourly_metrics
WITH (timescaledb.continuous, timescaledb.compress = true) AS
SELECT
  time_bucket('1 hour', time) AS bucket,
  device_id,
  avg(temperature) AS avg_temp,
  min(temperature) AS min_temp,
  max(temperature) AS max_temp,
  count(*) AS sample_count
FROM sensor_readings
GROUP BY bucket, device_id
WITH NO DATA;

-- Add compression policy directly to continuous aggregate
SELECT add_compression_policy('hourly_metrics', INTERVAL '7 days');

-- Refresh policy
SELECT add_continuous_aggregate_policy('hourly_metrics',
  start_offset => INTERVAL '1 month',
  end_offset => INTERVAL '1 hour',
  schedule_interval => INTERVAL '1 hour'
);
Benefits:
  • No intermediate materialized hypertable
  • Automatic compression of aggregate data
  • Reduced storage for historical aggregates
  • Simpler management

3. UUIDv7 in Continuous Aggregates

TimescaleDB 2.24.0 supports PostgreSQL 18's native UUIDv7 in continuous aggregates:
sql
-- Hypertable with UUIDv7 primary key (PostgreSQL 18)
CREATE TABLE events (
  id uuid DEFAULT uuidv7(),
  time timestamptz NOT NULL,
  event_type text NOT NULL,
  payload jsonb,
  PRIMARY KEY (id, time)
);

SELECT create_hypertable('events', 'time');

-- Continuous aggregate can now reference UUIDv7 columns
CREATE MATERIALIZED VIEW event_counts
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 hour', time) AS bucket,
  event_type,
  count(*) AS event_count,
  count(DISTINCT id) AS unique_events  -- UUIDv7 works here now
FROM events
GROUP BY bucket, event_type
WITH NO DATA;
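
Because UUIDv7 values embed their creation time, they can be used to cross-check the id against the time column. A sketch, assuming PostgreSQL 18's uuid_extract_timestamp() function:

```sql
-- UUIDv7 ids carry a millisecond timestamp; compare it with the time column
SELECT id,
       time,
       uuid_extract_timestamp(id) AS id_time,
       time - uuid_extract_timestamp(id) AS drift
FROM events
WHERE time > now() - INTERVAL '1 hour'
ORDER BY drift DESC
LIMIT 10;
```

Large drift values usually indicate late-arriving events whose time column was set at the source rather than at insert.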

4. Bloom Filter Sparse Index Changes

TimescaleDB 2.24.0 modifies bloom filter behavior for sparse indexes:
sql
-- Bloom filters for sparse data patterns
-- Useful for columns with many NULLs or low cardinality

CREATE TABLE logs (
  time timestamptz NOT NULL,
  level text,
  message text,
  error_code text,  -- Often NULL, sparse
  trace_id uuid     -- Often NULL, sparse
);

SELECT create_hypertable('logs', 'time');

-- Configure compression with bloom filter for sparse columns
ALTER TABLE logs SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'level',
  timescaledb.compress_orderby = 'time DESC',
  -- Bloom filter helps find rare non-NULL values
  timescaledb.compress_bloomfilter = 'error_code, trace_id'
);

-- Query efficiency: Bloom filter skips segments without matches
SELECT * FROM logs
WHERE error_code = 'E500'
  AND time > now() - INTERVAL '1 day';
-- Scans only segments where bloom filter indicates possible match
When to use bloom filters:
  • Sparse columns (many NULLs)
  • Rare value queries (finding errors in logs)
  • High-cardinality exact match queries
  • NOT useful for range queries
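
To confirm the filter is actually pruning work, inspect the plan; exact node names vary by version, but the actual-rows and rows-removed counts show whether compressed batches are being skipped:

```sql
-- Check how much of the compressed data the query really touches
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM logs
WHERE error_code = 'E500'
  AND time > now() - INTERVAL '1 day';
```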

Hypertable Design

Creating Hypertables

sql
-- Standard time-series table
CREATE TABLE metrics (
  time timestamptz NOT NULL,
  device_id uuid NOT NULL,
  metric_name text NOT NULL,
  value double precision,
  metadata jsonb DEFAULT '{}'
);

-- Convert to hypertable
SELECT create_hypertable('metrics', 'time',
  chunk_time_interval => INTERVAL '1 day',  -- Chunk size
  create_default_indexes => true
);

-- With space partitioning (for high-cardinality dimensions)
SELECT create_hypertable('metrics', 'time',
  partitioning_column => 'device_id',
  number_partitions => 4,
  chunk_time_interval => INTERVAL '1 day'
);

Chunk Interval Selection

| Data Volume | Suggested Interval | Rationale |
| --- | --- | --- |
| < 1 GB/day | 1 week | Fewer chunks, simpler management |
| 1-10 GB/day | 1 day | Balance between size and granularity |
| 10-100 GB/day | 6 hours | Faster compression, better parallelism |
| > 100 GB/day | 1 hour | Maximum parallelism, fast drops |
sql
-- Adjust chunk interval
SELECT set_chunk_time_interval('metrics', INTERVAL '6 hours');

-- View current chunks
SELECT show_chunks('metrics', older_than => INTERVAL '1 day');
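
The table above is a starting point; measuring actual chunk sizes tells you whether the chosen interval matches your real ingest rate (the usual guidance is that recent chunks plus their indexes should fit comfortably in memory). A check using the standard chunks_detailed_size() helper:

```sql
-- Actual on-disk size per chunk; revisit the interval if chunks come out
-- far larger or smaller than intended
SELECT chunk_name,
       pg_size_pretty(total_bytes) AS chunk_size
FROM chunks_detailed_size('metrics')
ORDER BY chunk_name DESC;
```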

Primary Key Design

sql
-- CORRECT: Time column in primary key for efficient chunk pruning
CREATE TABLE events (
  id uuid DEFAULT uuidv7(),
  time timestamptz NOT NULL,
  event_type text NOT NULL,
  PRIMARY KEY (id, time)  -- time included
);

-- WRONG: Time not in primary key (inefficient queries)
CREATE TABLE events_bad (
  id uuid PRIMARY KEY DEFAULT uuidv7(),
  time timestamptz NOT NULL  -- Not in PK
);
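
The practical consequence: a lookup by id alone must visit every chunk, because the planner has no time predicate to prune on. Bounding the time range, even loosely, restores chunk exclusion:

```sql
-- Scans all chunks: no time predicate to prune on
SELECT * FROM events WHERE id = $1;

-- Prunes to a handful of chunks: add a time bound, even a coarse one
SELECT * FROM events
WHERE id = $1
  AND time > now() - INTERVAL '7 days';
```

With UUIDv7 ids, the embedded timestamp makes it easy for the application to derive a reasonable time bound for the second form.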

Compression Strategy

Enabling Compression

sql
-- Configure compression
ALTER TABLE metrics SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'device_id',      -- Group by this
  timescaledb.compress_orderby = 'time DESC',         -- Sort order
  timescaledb.compress_chunk_time_interval = '1 day'  -- Merge compressed chunks up to this interval
);

-- Manual compression
SELECT compress_chunk(c)
FROM show_chunks('metrics', older_than => INTERVAL '7 days') c;

-- Automatic compression policy
SELECT add_compression_policy('metrics', INTERVAL '7 days');

Segment By Selection

sql
-- GOOD: Segment by commonly filtered dimension
-- Queries that filter on device_id get excellent performance
ALTER TABLE metrics SET (
  timescaledb.compress_segmentby = 'device_id'
);

-- GOOD: Multiple segment columns for flexible queries
ALTER TABLE metrics SET (
  timescaledb.compress_segmentby = 'device_id, metric_name'
);

-- BAD: High cardinality segment (too many segments)
-- Don't segment by user_id if you have millions of users
ALTER TABLE events SET (
  timescaledb.compress_segmentby = 'user_id'  -- Too many segments!
);

-- BETTER for high cardinality: Include in orderby instead
ALTER TABLE events SET (
  timescaledb.compress_segmentby = 'event_type',
  timescaledb.compress_orderby = 'user_id, time DESC'
);
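
A quick way to sanity-check a candidate segmentby column is to measure its cardinality against the row count for a typical chunk's worth of data. The threshold below is a rough rule of thumb, not an official limit: very low rows-per-value means many tiny segments and poor compression.

```sql
-- Rows per distinct value for a candidate segmentby column
SELECT count(*) AS total_rows,
       count(DISTINCT device_id) AS distinct_values,
       count(*) / greatest(count(DISTINCT device_id), 1) AS rows_per_value
FROM metrics
WHERE time > now() - INTERVAL '1 day';
```

If rows_per_value comes back in the single digits, treat the column the way user_id is treated above: move it into orderby instead.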

Order By Selection

sql
-- Time descending for "most recent" queries
ALTER TABLE metrics SET (
  timescaledb.compress_orderby = 'time DESC'
);

-- Composite order for specific query patterns
ALTER TABLE logs SET (
  timescaledb.compress_orderby = 'level, time DESC'
);
-- Benefits queries like: WHERE level = 'error' ORDER BY time DESC

-- Include frequently filtered columns
ALTER TABLE events SET (
  timescaledb.compress_orderby = 'device_id, time DESC'
);

Continuous Aggregates

Creating Aggregates

sql
-- Basic continuous aggregate
CREATE MATERIALIZED VIEW hourly_stats
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 hour', time) AS bucket,
  device_id,
  avg(value) AS avg_value,
  min(value) AS min_value,
  max(value) AS max_value,
  count(*) AS sample_count
FROM metrics
GROUP BY bucket, device_id
WITH NO DATA;

-- Hierarchical aggregates (aggregate of aggregate)
CREATE MATERIALIZED VIEW daily_stats
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 day', bucket) AS bucket,
  device_id,
  avg(avg_value) AS avg_value,
  min(min_value) AS min_value,
  max(max_value) AS max_value,
  sum(sample_count) AS sample_count
FROM hourly_stats
GROUP BY 1, device_id
WITH NO DATA;

Refresh Policies

sql
-- Add refresh policy
SELECT add_continuous_aggregate_policy('hourly_stats',
  start_offset => INTERVAL '3 days',   -- Refresh this far back
  end_offset => INTERVAL '1 hour',      -- Don't refresh latest (incomplete)
  schedule_interval => INTERVAL '1 hour'
);

-- Real-time aggregates (include unrefreshed data)
ALTER MATERIALIZED VIEW hourly_stats SET (
  timescaledb.materialized_only = false  -- Include real-time data
);

-- Force refresh
CALL refresh_continuous_aggregate('hourly_stats',
  '2026-01-01'::timestamptz,
  '2026-01-02'::timestamptz
);

With Compression (2.24.0)

sql
-- Continuous aggregate with built-in compression
CREATE MATERIALIZED VIEW hourly_metrics
WITH (
  timescaledb.continuous,
  timescaledb.compress = true  -- New in 2.24.0
) AS
SELECT
  time_bucket('1 hour', time) AS bucket,
  device_id,
  avg(temperature) AS avg_temp,
  percentile_agg(temperature) AS temp_pct  -- For percentiles later
FROM sensor_readings
GROUP BY bucket, device_id
WITH NO DATA;

-- Add compression policy for the aggregate
SELECT add_compression_policy('hourly_metrics', INTERVAL '30 days');

-- Combined with refresh policy
SELECT add_continuous_aggregate_policy('hourly_metrics',
  start_offset => INTERVAL '7 days',
  end_offset => INTERVAL '1 hour',
  schedule_interval => INTERVAL '1 hour'
);

Retention Policies

Data Lifecycle

sql
-- Drop old raw data (keep aggregates)
SELECT add_retention_policy('metrics', INTERVAL '90 days');

-- View retention policies
SELECT * FROM timescaledb_information.jobs
WHERE proc_name = 'policy_retention';

-- Remove retention policy
SELECT remove_retention_policy('metrics');

Tiered Storage Pattern

sql
-- Pattern: Raw → Hourly → Daily → Archive

-- 1. Raw data: Keep 7 days uncompressed
-- 2. Raw data: Keep 30 days compressed
-- 3. Raw data: Drop after 90 days

-- 4. Hourly aggregates: Keep 1 year
-- 5. Daily aggregates: Keep forever

-- Implementation:
-- Raw data policies
SELECT add_compression_policy('metrics', INTERVAL '7 days');
SELECT add_retention_policy('metrics', INTERVAL '90 days');

-- Hourly aggregate policies
SELECT add_compression_policy('hourly_stats', INTERVAL '30 days');
SELECT add_retention_policy('hourly_stats', INTERVAL '1 year');

-- Daily stats: No retention (keep forever)
SELECT add_compression_policy('daily_stats', INTERVAL '90 days');
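
Every policy added above runs as a background job, so one query against the standard jobs view reviews the whole lifecycle configuration in one place:

```sql
-- Review every policy job: what it does, on which table, how often
SELECT job_id, proc_name, hypertable_name, schedule_interval, config
FROM timescaledb_information.jobs
WHERE proc_name LIKE 'policy_%'
ORDER BY hypertable_name, proc_name;
```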

Query Patterns

Time Range Queries

sql
-- Recent data (uses index)
SELECT * FROM metrics
WHERE time > now() - INTERVAL '1 hour'
  AND device_id = $1
ORDER BY time DESC
LIMIT 100;

-- Time range with aggregation
SELECT
  time_bucket('5 minutes', time) AS bucket,
  avg(value) AS avg_value
FROM metrics
WHERE time BETWEEN $1 AND $2
  AND device_id = $3
GROUP BY bucket
ORDER BY bucket;

-- Last value per device
SELECT DISTINCT ON (device_id)
  device_id,
  time,
  value
FROM metrics
WHERE time > now() - INTERVAL '1 day'
ORDER BY device_id, time DESC;

Using Continuous Aggregates

sql
-- Query aggregate instead of raw data
SELECT * FROM hourly_stats
WHERE bucket > now() - INTERVAL '7 days'
  AND device_id = $1
ORDER BY bucket DESC;

-- Real-time aggregate (includes unrefreshed data)
SELECT * FROM hourly_stats
WHERE bucket > now() - INTERVAL '1 hour';
-- Automatically combines materialized + real-time data

Percentiles and Statistics

sql
-- Use percentile_agg for continuous aggregates
CREATE MATERIALIZED VIEW metrics_percentiles
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 hour', time) AS bucket,
  device_id,
  percentile_agg(value) AS value_pct,  -- Aggregate percentile state
  stats_agg(value) AS value_stats       -- Statistical aggregates
FROM metrics
GROUP BY bucket, device_id;

-- Query percentiles from aggregate
SELECT
  bucket,
  device_id,
  approx_percentile(0.50, value_pct) AS median,
  approx_percentile(0.95, value_pct) AS p95,
  approx_percentile(0.99, value_pct) AS p99,
  average(value_stats) AS avg,
  stddev(value_stats) AS stddev
FROM metrics_percentiles
WHERE bucket > now() - INTERVAL '24 hours';

Index Strategy

Default Indexes

sql
-- create_hypertable creates this by default:
-- CREATE INDEX ON metrics (time DESC);

-- Add composite indexes for common queries
CREATE INDEX idx_metrics_device_time ON metrics (device_id, time DESC);

-- Partial indexes for specific patterns
CREATE INDEX idx_metrics_errors ON metrics (time DESC)
WHERE value > 100;  -- example threshold; substitute your own cutoff

Compressed Chunk Considerations

sql
-- Indexes are not used on compressed chunks
-- Query planner uses:
-- 1. Chunk exclusion (time range)
-- 2. Segment filtering (compress_segmentby columns)
-- 3. Orderby optimization (compress_orderby columns)

-- Design compression settings for query patterns, not indexes
ALTER TABLE metrics SET (
  timescaledb.compress_segmentby = 'device_id',  -- Filter column
  timescaledb.compress_orderby = 'time DESC'      -- Sort column
);
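
The payoff shows up in query plans: a query that filters on the segmentby column and the time range reads only matching segments from a few chunks, while a filter on a column outside the compression settings must decompress every batch. A check against the settings above (the device id literal is a placeholder):

```sql
-- Aligned with compression settings: chunk exclusion + segment filtering
EXPLAIN (ANALYZE)
SELECT avg(value) FROM metrics
WHERE device_id = '00000000-0000-0000-0000-000000000000'  -- substitute a real device id
  AND time > now() - INTERVAL '1 day';
```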

Migration Patterns

Converting Regular Table to Hypertable

sql
-- 1. Ensure time column exists and is NOT NULL
ALTER TABLE legacy_metrics ALTER COLUMN time SET NOT NULL;

-- 2. Convert to hypertable
SELECT create_hypertable('legacy_metrics', 'time',
  migrate_data => true,
  chunk_time_interval => INTERVAL '1 day'
);

-- 3. Add compression
ALTER TABLE legacy_metrics SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'device_id',
  timescaledb.compress_orderby = 'time DESC'
);

-- 4. Add policies
SELECT add_compression_policy('legacy_metrics', INTERVAL '7 days');
SELECT add_retention_policy('legacy_metrics', INTERVAL '90 days');
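
After the conversion, verify that the data actually landed in chunks and that sizes look sane; both helpers below are standard TimescaleDB functions:

```sql
-- Confirm the table is now a hypertable and see how data was chunked
SELECT hypertable_name, num_chunks
FROM timescaledb_information.hypertables
WHERE hypertable_name = 'legacy_metrics';

-- Total size including chunks and indexes
SELECT pg_size_pretty(hypertable_size('legacy_metrics'));
```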

Adding TimescaleDB to Existing Database

sql
-- 1. Install extension
CREATE EXTENSION IF NOT EXISTS timescaledb;

-- 2. Verify version
SELECT extversion FROM pg_extension WHERE extname = 'timescaledb';
-- Should show 2.24.0

-- 3. Check the PostgreSQL version for compatibility
SELECT version();

TimescaleDB Artifact

When implementing time-series features, post this artifact:
markdown
<!-- TIMESCALEDB_IMPLEMENTATION:START -->

TimescaleDB Implementation Summary

Hypertables

| Table | Chunk Interval | Space Partitions | Compression |
| --- | --- | --- | --- |
| metrics | 1 day | device_id (4) | Yes |
| events | 6 hours | None | Yes |
| logs | 1 hour | level (2) | Yes |

Compression Settings

| Table | Segment By | Order By | Bloom Filter |
| --- | --- | --- | --- |
| metrics | device_id | time DESC | None |
| logs | level | time DESC | error_code, trace_id |

Continuous Aggregates

| Aggregate | Source | Interval | Compression |
| --- | --- | --- | --- |
| hourly_metrics | metrics | 1 hour | Yes (30d) |
| daily_metrics | hourly_metrics | 1 day | Yes (90d) |

Policies

| Table/Aggregate | Compression | Retention | Refresh |
| --- | --- | --- | --- |
| metrics | 7 days | 90 days | N/A |
| hourly_metrics | 30 days | 1 year | 1 hour |
| daily_metrics | 90 days | Never | 1 day |

TimescaleDB 2.24.0 Features Used

  • Lightning-fast recompression
  • Direct Compress with continuous aggregates
  • UUIDv7 in continuous aggregates
  • Bloom filter sparse indexes
TimescaleDB Version: 2.24.0 Verified At: [timestamp]
<!-- TIMESCALEDB_IMPLEMENTATION:END -->

Checklist

Before completing TimescaleDB implementation:
  • Hypertable created with appropriate chunk interval
  • Compression configured with correct segmentby/orderby
  • Compression policy added
  • Retention policy added (if applicable)
  • Continuous aggregates created for common queries
  • Refresh policies configured
  • Indexes appropriate for uncompressed chunks
  • Query patterns tested with EXPLAIN ANALYZE
  • 2.24.0 features leveraged where beneficial
  • Artifact posted to issue

Integration

This skill integrates with:
  • database-architecture
    - Hypertables follow general schema patterns
  • postgres-rls
    - RLS works with hypertables (use caution with compression)
  • postgis
    - Spatial time-series data
