carto-geocoding
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseGeocoding Addresses in CARTO Workflows
在CARTO Workflows中进行地址Geocoding
Converts street addresses or place names into geographic coordinates (point geometries). This is an essential first step when working with tabular data that has an address column but no spatial column.
Prerequisites: Load for the development process, JSON structure, and validation commands.
carto-create-workflow将街道地址或地名转换为地理坐标(点几何图形)。这是处理包含地址列但无空间列的表格数据时必不可少的第一步。
前提条件:加载以获取开发流程、JSON结构和验证命令。
carto-create-workflowInstructions
操作步骤
A geocoding workflow follows this pipeline:
Source Table (with address column) -> Geocode -> (Filter successful) -> SaveGeocoding工作流遵循以下流程:
Source Table (with address column) -> Geocode -> (Filter successful) -> SaveStep 1: Load Source Data
步骤1:加载源数据
Use . The input table must contain a column with address strings (e.g. , , ).
native.gettablebynameaddressfull_addresslocationSuccess: Node outputs a table with at least one text column containing address data.
使用。输入表必须包含一个存储地址字符串的列(例如、、)。
native.gettablebynameaddressfull_addresslocation成功标志:节点输出至少包含一个含地址数据的文本列的表格。
Step 2: Geocode
步骤2:执行Geocoding
Use with:
native.geocode| Input | Description | Required |
|---|---|---|
| Table with address data | Yes |
| Column containing the address string | Yes |
| Country filter to improve accuracy (e.g. | No, but strongly recommended |
The address column can contain full addresses () or composite values built from multiple columns (concatenate street + city + postal code in a prior step).
"123 Main St, Springfield, IL 60001"native.selectexpressionTwo output handles: The geocode component produces two separate outputs:
- : Rows where geocoding succeeded -- a
matchcolumn with point geometry is added, plus ageomJSON column with quality info (confidence score, match type).CARTO_GEOCODE_METADATA - : Rows where geocoding failed --
unmatchis NULL.geom
Connect downstream nodes to the correct handle based on your needs.
Success: The geocode node is configured with the address column and (ideally) a country filter. Edges connect to the and/or output handles.
matchunmatch使用,参数如下:
native.geocode| 输入项 | 描述 | 是否必填 |
|---|---|---|
| 包含地址数据的表格 | 是 |
| 存储地址字符串的列 | 是 |
| 用于提高精度的国家筛选器(例如 | 否,但强烈建议填写 |
地址列可以包含完整地址(),也可以包含通过多列拼接而成的复合值(在之前的步骤中拼接街道+城市+邮政编码)。
"123 Main St, Springfield, IL 60001"native.selectexpression两个输出分支:Geocode组件会生成两个独立输出:
- :Geocoding成功的行——新增一个带点几何图形的
match列,以及一个包含质量信息(置信度分数、匹配类型)的geomJSON列。CARTO_GEOCODE_METADATA - :Geocoding失败的行——
unmatch为NULL。geom
根据需求将下游节点连接到正确的分支。
成功标志:Geocode节点已配置地址列(理想情况下还配置了国家筛选器),且已将连线连接到和/或输出分支。
matchunmatchStep 3: Filter or Review Results (optional)
步骤3:筛选或审核结果(可选)
For the match output:
- Optionally filter by confidence using the metadata column (e.g. extract confidence from via
CARTO_GEOCODE_METADATA).native.selectexpression
For the unmatch output:
- Save to a separate table for review and manual correction.
- Common failure causes: typos, incomplete addresses, PO boxes, ambiguous place names.
Success: High-confidence geocoded rows are isolated; failed rows are captured for review.
对于****输出:
match- 可选:使用元数据列按置信度筛选(例如通过从
native.selectexpression中提取置信度)。CARTO_GEOCODE_METADATA
对于****输出:
unmatch- 保存到单独的表格中以便审核和手动修正。
- 常见失败原因:拼写错误、地址不完整、邮政信箱、模糊地名。
成功标志:已分离出高置信度的Geocoding结果行;已捕获失败行以便审核。
Step 4: Save Results
步骤4:保存结果
Use to persist the geocoded output. The column contains WGS84 (EPSG:4326) point geometries, ready for visualization in CARTO Builder or further spatial analysis.
native.saveastablegeomSuccess: Validated workflow that can be uploaded via .
carto workflows create使用持久化Geocoding输出。列包含WGS84(EPSG:4326)点几何图形,可直接在CARTO Builder中可视化或用于进一步的空间分析。
native.saveastablegeom成功标志:可通过上传的已验证工作流。
carto workflows createGeocoding Under the Hood
Geocoding底层原理
The workflow component wraps the CARTO Analytics Toolbox function , which:
GEOCODE_TABLE- Adds a column with point geometry to each row
geom - Adds a JSON column with quality information (confidence, match type)
CARTO_GEOCODE_METADATA - Uses CARTO Location Data Services (LDS) -- each geocoded row consumes LDS quota
Check available quota by querying the Analytics Toolbox function. The fully-qualified name is provider-specific (BigQuery: ; Snowflake: ; Databricks: stored procedure in the dedicated AT schema). Load and consult for the AT path on your warehouse.
LDS_QUOTA_INFO()`carto-un.carto`.LDS_QUOTA_INFO()CARTO.CARTO.LDS_QUOTA_INFO()carto-create-workflowreferences/providers/<provider>.md该工作流组件封装了CARTO Analytics Toolbox函数,该函数:
GEOCODE_TABLE- 为每行新增一个带点几何图形的列
geom - 新增一个包含质量信息(置信度、匹配类型)的JSON列
CARTO_GEOCODE_METADATA - 使用CARTO Location Data Services (LDS)——每一行Geocoding都会消耗LDS配额
可通过查询Analytics Toolbox的函数查看可用配额。该函数的全限定名称因提供商而异(BigQuery:;Snowflake:;Databricks:专用AT schema中的存储过程)。加载并查阅以获取您的数据仓库对应的AT路径。
LDS_QUOTA_INFO()`carto-un.carto`.LDS_QUOTA_INFO()CARTO.CARTO.LDS_QUOTA_INFO()carto-create-workflowreferences/providers/<provider>.mdGotchas
注意事项
- Geocoding consumes LDS quota. Each row geocoded counts against the account's Location Data Services quota. Check quota availability before bulk operations, especially on large tables.
- Two output handles: and
match. Don't connect to the wrong one --unmatchhas geometries,matchhas NULLs. If you connect theunmatchhandle to a spatial operation, it will fail.unmatch - Country filter is strongly recommended. Without it, ambiguous addresses may resolve to the wrong country (e.g. "Springfield" exists in 30+ US states and in other countries). The country parameter improves both accuracy and speed.
- Address formatting matters. Well-formatted addresses produce better results: works better than
"123 Main St, Springfield, IL 60001". Include city, state/region, and postal code when available."123 main street springfield" - Provider casing & SQL dialect. Examples in this skill use lowercase column names (BigQuery / Databricks / Postgres / Redshift convention); on Snowflake unquoted identifiers surface UPPERCASE (e.g. ,
CARTO_GEOCODE_METADATA). When writing dialect-specific SQL or referencing the AT path, seeGEOM.carto-create-workflow/references/providers/<provider>.md - For large tables, consider batching. Geocoding hundreds of thousands of rows in a single run can exhaust quota or time out. Split into batches if needed.
- Output geometry is always WGS84 points. The column contains EPSG:4326 point geometries regardless of the input address format or country.
geom - Failed geocodes deserve review. The output is not just noise -- it often reveals data quality issues (missing postal codes, abbreviated city names, non-standard formatting) that can be fixed and re-geocoded.
unmatch
- Geocoding会消耗LDS配额。每一行Geocoding都会计入账户的Location Data Services配额。在批量操作前,尤其是处理大型表格时,请检查配额可用性。
- 两个输出分支:和
match。请勿连接到错误的分支——unmatch包含几何图形,match为NULL。如果将unmatch分支连接到空间操作,会导致失败。unmatch - 强烈建议使用国家筛选器。如果不设置,模糊地址可能会解析到错误的国家(例如“Springfield”存在于美国30多个州及其他国家)。国家参数可同时提高精度和速度。
- 地址格式很重要。格式规范的地址会产生更好的结果:比
"123 Main St, Springfield, IL 60001"效果更好。请尽可能包含城市、州/地区和邮政编码。"123 main street springfield" - 提供商大小写与SQL方言。本技能中的示例使用小写列名(BigQuery / Databricks / Postgres / Redshift约定);在Snowflake中,未加引号的标识符会显示为大写(例如、
CARTO_GEOCODE_METADATA)。编写特定方言的SQL或引用AT路径时,请查看GEOM。carto-create-workflow/references/providers/<provider>.md - 处理大型表格时请考虑分批处理。单次运行Geocoding数十万行可能会耗尽配额或超时。如有需要,请拆分成分批处理。
- 输出几何图形始终为WGS84点。无论输入地址格式或国家如何,列都包含EPSG:4326点几何图形。
geom - 失败的Geocoding值得审核。输出并非无用数据——它通常会揭示数据质量问题(缺少邮政编码、城市名称缩写、非标准格式),这些问题可以修复后重新进行Geocoding。
unmatch
Reference Templates
参考模板
Academy tutorials and workflow templates covering geocoding:
| Resource | Description | URL |
|---|---|---|
| Geocoding (BigQuery AT) | Step-by-step geocoding with Analytics Toolbox for BigQuery | Academy link |
| Geocoding (Snowflake AT) | Step-by-step geocoding with Analytics Toolbox for Snowflake | Academy link |
| Workflow template: Geocode street addresses | Generating new spatial data from addresses | Academy link |
Common Variations
常见变体
| Variation | How |
|---|---|
| Composite address from multiple columns | Add a |
| Geocode + spatial join | Chain: Geocode -> match -> Spatial Join (e.g. point-in-polygon to assign regions) |
| Geocode + enrichment | Chain: Geocode -> match -> Buffer/Isochrone -> Enrich (add demographics around each geocoded point) |
| Capture failures for re-processing | Connect both |
| Filter by confidence | After geocode, use |
| 变体 | 实现方式 |
|---|---|
| 从多列拼接复合地址 | 在Geocoding前添加 |
| Geocoding + 空间连接 | 链式流程:Geocode -> match -> 空间连接(例如点面匹配以分配区域) |
| Geocoding + 数据增强 | 链式流程:Geocode -> match -> 缓冲区/等时线 -> 数据增强(为每个Geocoding点添加周边人口统计数据) |
| 捕获失败结果以便重新处理 | 将 |
| 按置信度筛选 | Geocoding后,使用 |