postgis-skill
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseDocumentation
文档规范
- Make sure every create statement or CTE has descriptive comment in front of it.
-- - Write enough comments so you can deduce what was a requirement in the future and not walk in circles.
- Every feature needs to have comprehensive up-to-date documentation near it.
- 确保每个CREATE语句或CTE前都有描述性的注释。
-- - 撰写足够的注释,以便未来能推断出当初的需求,避免走弯路。
- 每个功能都需要在附近配有全面且最新的文档。
Style
编码风格
- PostGIS functions follow their spelling from the manual (->
st_segmentize).ST_Segmentize - SQL is lowercase unless instructed otherwise.
- Values in databases and layers should be absolute as much as possible: store "birthday" or "construction date" instead of "age".
- Do not mix tabs and spaces in code.
- Add empty lines between logical blocks.
- Format the code nicely and consistently.
- Call geometry column ; geography column
geom.geog
- PostGIS函数需遵循官方手册中的拼写(如应改为
st_segmentize)。ST_Segmentize - SQL语句除特殊要求外均使用小写。
- 数据库和图层中的值应尽可能采用绝对形式:存储"生日"或"建造日期"而非"年龄"。
- 代码中不要混合使用制表符和空格。
- 在逻辑块之间添加空行。
- 代码格式要美观且统一。
- 几何列命名为;地理列命名为
geom。geog
Indexing
索引优化
- Create brin for all columns when creating large table that will be used for ad-hoc queries.
- If you have cache table that has a primary key, it makes sense to add values into on same index for faster lookup.
including
- 创建用于即席查询的大型表时,为所有列创建BRIN索引。
- 若缓存表有主键,在同一索引的子句中添加相关值可加快查找速度。
including
Debugging
调试技巧
- Make sure that error messages towards developer are better than just "500 Internal server error".
- Don't stub stuff out with insane fallbacks (like lat/lon=0) - instead make the rest of the code work around data absence and inform user.
- SQL files should to be idempotent: drop table if exists + create table as; add some comments to make people grasp queries faster.
- Create both "up' and "down/rollback" migration when creating new migrations for ease of iteration.
- Check to see if all upgrades happened successfully.
select postgis_full_version(); - Don't run one SQL file from other SQL file - this quickly becomes a mess with relative file paths.
- 确保返回给开发者的错误信息比单纯的"500 Internal Server Error"更具信息量。
- 不要用极端的 fallback 方式(如设置经纬度为0)来掩盖问题——而是让代码适配数据缺失的情况并告知用户。
- SQL文件应具备幂等性:先执行再执行
drop table if exists;添加注释以帮助他人快速理解查询语句。create table as - 创建新的迁移脚本时,同时编写"升级"和"回滚"脚本,以便迭代。
- 执行检查所有升级是否成功完成。
select postgis_full_version(); - 不要从一个SQL文件中调用另一个SQL文件——相对路径会很快让代码变得混乱。
Raster
栅格数据处理
- Do not work with GDAL on the filesystem. Import things into database and deal with data there.
- 不要在文件系统层面使用GDAL处理数据。应将数据导入数据库后再进行操作。
SQL gotchas
SQL常见陷阱
- is just
sum(case when A then 1 else 0 end)count() filter (where A) - can likely be redone as
row_number() ... = 1(possibly withorder by + limit 1ordistinct on)lateral - is just
exists(select 1 from ...)exists(select from ...) - is just
tags ->> 'key' = 'value'- works faster for indexestags @> '{"key": "value"} - you can't just create ordered table and then rely on it to be ordered on scan without
order by
- 等价于
sum(case when A then 1 else 0 end)count() filter (where A) - 通常可以改写为
row_number() ... = 1(可能结合order by + limit 1或distinct on)lateral - 等价于
exists(select 1 from ...)exists(select from ...) - 等价于
tags ->> 'key' = 'value'——后者在使用索引时速度更快tags @> '{"key": "value"}' - 不能仅依赖创建时的有序表来保证扫描时的顺序,必须显式添加子句
order by
PostGIS gotchas
PostGIS常见陷阱
- Do not use geometry typmod unless requested (things like ) - use plain
geometry(multilinestring, 4326)orgeometryinstead. This removes clutter ofgeographyand errors viaST_Multi.ST_SetSRID - is just
ST_UnaryUnion(ST_Collect(geom))ST_Union(geom) - should be
ST_Buffer(geom, 0)ST_MakeValid(geom) - should be
select min(ST_Distance(..))to enable knn gistselect ST_Distance() ... order by a <-> b limit 1 - should be
order by ST_Distance(c.geog, t.geog)order by c.geog <-> t.geog - is a sign you're doing something wrong
ST_UnaryUnion - is a sign you're doing something wrong on the previous step
ST_MakeValid - be extra attintive when calling : check the actual projection of input data, check if it can be set correctly during input (
ST_SetSRID,ST_GeomFromGeoJSON-styleEWKT,SRID=4326;POINT(...allow that). Check ifEWKBis needed instead.ST_Transform - when looking for relation between point and polygon, prefer to other topology predicates
ST_Intersects - when generating complex geometry by walking raster or grid, may make sense to
ST_Simplify(geom, 0) - to generate neighbourhoods of predictable size, use with k=2 and
ST_ClusterKMeansset to your distance.max_radius - use for binary representation instead of
ST_AsEWKBto keep SRID.ST_AsWKB - Choosing projection: SRID=4326 (2D longlat) when input or output is longitude and latitude and coordinate value is to be shown to user. SRID=3857 (2D Spherical Mercator) when output will be shown on web map, ST_AsMVT, or 2D KNN requests of short distance are to be executed. SRID=4978 (3D XYZ) when performing internal computations, line-of-sight, clustering and averaging across antimeridian. Beware: only use 3D-aware operations, ST_Force3DZ on 2D CRS data before calling ST_Transform to it.
- Instead of using /
ST_HexagonuseST_HexagonGridextension.h3 - When you know the data is going to be dumped in binary form, gzipped and moved around, consider using if precision is known.
ST_QuantizeCoordinates
- 除非有明确要求,否则不要使用几何类型修饰符(如)——应使用普通的
geometry(multilinestring, 4326)或geometry类型。这可以避免geography的冗余以及ST_Multi引发的错误。ST_SetSRID - 等价于
ST_UnaryUnion(ST_Collect(geom))ST_Union(geom) - 应替换为
ST_Buffer(geom, 0)ST_MakeValid(geom) - 应改写为
select min(ST_Distance(..)),以启用KNN GiST索引select ST_Distance() ... order by a <-> b limit 1 - 应改为
order by ST_Distance(c.geog, t.geog)order by c.geog <-> t.geog - 使用通常意味着你的实现方式存在问题
ST_UnaryUnion - 使用通常意味着上一步的处理存在问题
ST_MakeValid - 调用时要格外注意:检查输入数据的实际投影,确认是否可以在输入阶段就正确设置(
ST_SetSRID、EWKT格式ST_GeomFromGeoJSON、EWKB都支持)。考虑是否需要改用SRID=4326;POINT(...。ST_Transform - 查找点与多边形的关系时,优先使用而非其他拓扑谓词
ST_Intersects - 通过栅格或网格生成复杂几何图形时,使用可能会有帮助
ST_Simplify(geom, 0) - 要生成大小可预测的邻域,可结合使用(设置k=2)和
ST_ClusterKMeans参数指定距离。max_radius - 如需保留SRID,使用而非
ST_AsEWKB来生成二进制表示。ST_AsWKB - 投影选择指南:
- 当输入或输出为经纬度且坐标值需展示给用户时,使用SRID=4326(二维经纬度)。
- 当输出将用于Web地图、ST_AsMVT或执行短距离二维KNN查询时,使用SRID=3857(二维球面墨卡托)。
- 进行内部计算、视线分析、跨日界线聚类与平均时,使用SRID=4978(三维XYZ)。注意:仅使用支持三维的操作,在调用ST_Transform转换到该坐标系前,需对二维CRS数据执行ST_Force3DZ。
- 不要使用/
ST_Hexagon,应使用ST_HexagonGrid扩展。h3 - 当已知数据将以二进制形式导出、压缩并传输时,若精度已知,可考虑使用。
ST_QuantizeCoordinates