carto-site-selection
Site Selection and Cannibalization Analysis
Builds CARTO Workflows that identify optimal locations for new facilities (stores, stations, offices) by combining spatial criteria, and that quantify cannibalization risk from overlapping catchment areas. Also covers twin-area and similar-location discovery.
Prerequisites: Load `carto-create-workflow` for the development process, JSON structure, and validation commands. Load `carto-trade-area-analysis` if the workflow involves isochrones, buffers, or catchment enrichment; that skill covers the catchment pipeline in detail.
Decision Tree
| User intent | Pattern |
|---|---|
| "Where should I open a new store?" | Site Selection (scoring + ranking) |
| "Will a new store hurt existing ones?" | Cannibalization Analysis |
| "Find locations similar to my best performers" | Twin Areas / Similar Locations |
Instructions
Pattern A: Site Selection (Scoring + Ranking)
Existing locations + Target area -> Spatial indexing -> Enrich with demographics/POIs -> Score/Rank -> Filter top candidates -> Save
Step 1: Load Data
Load two datasets with `native.gettablebyname`:
- Existing locations (current stores/facilities)
- Target area (e.g. city boundary, district polygons, or a grid covering the study area)
Success: Both tables loaded with geometry columns and unique identifiers.
Step 2: Build Candidate Grid
Polyfill the target area into H3 or Quadbin cells using `native.h3polyfill` or `native.quadbinpolyfill`. Each cell is a candidate micro-location.
Success: A contiguous grid of cells covering the study area.
Step 3: Enrich Candidates
Attach demand signals to each cell (population, income, foot traffic, POI density) using `native.h3enrich`, `native.joinv2`, or the Data Observatory.
Success: Each grid cell has numeric columns representing demand/suitability factors.
Step 4: Filter by Proximity to Existing Locations
Use `native.h3distance` to compute hop distance from each candidate cell to the nearest existing location. Filter out cells that are too close (cannibalization risk) or too far (logistics cost).
- `native.h3distance` returns hop count, not physical distance. Convert using the approximate edge length for the resolution (e.g. H3 res 8 ~ 460m edge, so 3 hops ~ 1.4 km).
Success: Candidate cells are within a sensible distance band from existing locations.
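The hop-to-distance conversion in Step 4 can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of any CARTO component: the function names and the `H3_EDGE_M` table are hypothetical, and the edge lengths are the approximate values quoted in this skill. The second function is an added assumption: neighbouring regular hexagons have centres sqrt(3) edge lengths apart, so the edge-length rule is best read as a lower-end estimate.

```python
import math

# Approximate H3 edge lengths in metres, as quoted in this skill (res 8 and 9 only).
H3_EDGE_M = {8: 460.0, 9: 174.0}


def hops_to_metres(hops: int, resolution: int) -> float:
    """Rule of thumb from the text: hop count times edge length."""
    return hops * H3_EDGE_M[resolution]


def hops_to_metres_centre_spacing(hops: int, resolution: int) -> float:
    """Alternative estimate using centre-to-centre spacing (sqrt(3) * edge)."""
    return hops * math.sqrt(3) * H3_EDGE_M[resolution]


def in_distance_band(hops: int, min_hops: int, max_hops: int) -> bool:
    """Keep cells that are neither too close to nor too far from existing sites."""
    return min_hops <= hops <= max_hops


print(hops_to_metres(3, 8))  # 1380.0, i.e. roughly 1.4 km
```

Choosing `min_hops` and `max_hops` is a business decision; the band only makes sense once the grid resolution is fixed.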
Step 5: Score and Rank
Use the scoring pattern from `trade-area-analysis`:
- Normalize each variable to [0,1] with `native.normalize`
- Composite score via `native.selectexpression` with user-defined weights
- Rank with `native.orderby` (descending) + `native.limit` (top N)
Success: A ranked shortlist of candidate cells with composite scores and contributing variables.
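To make the scoring logic concrete, here is a minimal Python sketch of normalize, weight, and rank. It assumes min-max scaling for the [0,1] normalization, which may differ from what `native.normalize` does internally. The column names, weights, and helper functions are hypothetical.

```python
def min_max(values):
    """Scale a list of numbers to [0, 1]; constant columns map to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_candidates(cells, weights, top_n):
    """Return the top_n cells by weighted sum of min-max normalized variables."""
    normalized = {var: min_max([c[var] for c in cells]) for var in weights}
    scored = []
    for i, cell in enumerate(cells):
        score = sum(w * normalized[var][i] for var, w in weights.items())
        scored.append({**cell, "score": round(score, 4)})
    scored.sort(key=lambda c: c["score"], reverse=True)
    return scored[:top_n]


cells = [
    {"h3": "cell_a", "population": 1200, "income": 30000},
    {"h3": "cell_b", "population": 800, "income": 52000},
    {"h3": "cell_c", "population": 400, "income": 41000},
]
weights = {"population": 0.6, "income": 0.4}  # user-defined, illustrative only
shortlist = rank_candidates(cells, weights, top_n=2)
print([c["h3"] for c in shortlist])  # ['cell_b', 'cell_a']
```

Changing the weights can reorder the shortlist, which is why the weights should stay user-adjustable.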
Step 6: Save
Use `native.saveastable`. The H3/Quadbin column is directly visualizable in CARTO Builder.
Success: Validated workflow ready to upload.
Pattern B: Cannibalization Analysis
Existing + Proposed locations -> Trade areas (isoline/buffer) -> Polyfill to grid -> Intersect/Join -> Measure overlap -> Save
Step 1: Load Data
Load existing locations and proposed locations (or a single table with a flag column distinguishing them).
Success: Both sets loaded with geometry and unique identifiers.
Step 2: Generate Trade Areas
Create catchment areas around both existing and proposed locations using `native.isolines` (realistic) or `native.buffer` (simple). Use the same parameters for both sets to ensure comparability.
Success: Every location has a catchment polygon with consistent parameters.
Step 3: Polyfill to Spatial Index
Convert all catchment polygons to H3 or Quadbin cells with `native.h3polyfill`. Preserve the location identifier and an `is_proposed` flag.
Success: One row per cell per location, with location ID and type flag.
Step 4: Find Overlap
Use `native.joinv2` (inner join on the spatial index column) between existing-location cells and proposed-location cells. The result contains cells shared by at least one existing and one proposed location.
Success: Output contains only cells that fall in both an existing and a proposed catchment.
Step 5: Measure Impact
Use `native.groupby` to aggregate overlap:
- Per existing location: count of overlapping cells / total cells in that location's catchment = overlap percentage
- Enrich overlap cells with population or revenue to quantify shared demand
Use `native.selectexpression` to compute the overlap ratio.
Success: Each existing location has an overlap metric showing how much of its catchment is shared with proposed locations.
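The overlap metric from Steps 4 and 5 reduces to set arithmetic on cell IDs. The Python sketch below illustrates that calculation on toy data. It is not the workflow itself, and the row layout `(location_id, cell_id, is_proposed)` and all identifiers are assumptions made for the example.

```python
from collections import defaultdict


def overlap_ratios(rows):
    """rows: (location_id, cell_id, is_proposed) tuples, one per cell per location.

    Returns {existing_location_id: share of its cells also in a proposed catchment}.
    """
    proposed_cells = {cell for _, cell, is_proposed in rows if is_proposed}
    existing = defaultdict(set)
    for loc, cell, is_proposed in rows:
        if not is_proposed:
            existing[loc].add(cell)
    return {
        loc: len(cells & proposed_cells) / len(cells)
        for loc, cells in existing.items()
    }


rows = [
    ("store_1", "c1", False), ("store_1", "c2", False),
    ("store_1", "c3", False), ("store_1", "c4", False),
    ("store_2", "c7", False), ("store_2", "c8", False),
    ("new_1", "c3", True), ("new_1", "c4", True), ("new_1", "c5", True),
]
print(overlap_ratios(rows))  # {'store_1': 0.5, 'store_2': 0.0}
```

Weighting each cell by population or revenue instead of counting cells would give the shared-demand variant mentioned in the second bullet.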
Step 6: Save
Use `native.saveastable`.
Success: Validated workflow with per-location cannibalization metrics.
Pattern C: Twin Areas / Similar Locations
Top-performing locations -> Trade areas -> Enrich -> Build similarity model -> Score all candidate areas -> Rank -> Save
Step 1: Identify Reference Locations
Load the full location dataset. Filter to top performers (e.g. top quartile by revenue) using `native.wheresimplified` or `native.orderby` + `native.limit`.
Success: A subset of high-performing locations isolated as the reference set.
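As a rough illustration of the order-then-limit route to a top quartile, the Python sketch below sorts by a metric and keeps the first 25% of rows. The `revenue` field, the store IDs, and the `top_quartile` helper are hypothetical.

```python
import math


def top_quartile(locations, metric="revenue"):
    """Return the top 25% of locations by the given metric (at least one)."""
    ranked = sorted(locations, key=lambda loc: loc[metric], reverse=True)
    keep = max(1, math.ceil(len(ranked) * 0.25))
    return ranked[:keep]


stores = [
    {"id": "s1", "revenue": 120}, {"id": "s2", "revenue": 310},
    {"id": "s3", "revenue": 95}, {"id": "s4", "revenue": 240},
    {"id": "s5", "revenue": 180}, {"id": "s6", "revenue": 60},
    {"id": "s7", "revenue": 275}, {"id": "s8", "revenue": 150},
]
print([s["id"] for s in top_quartile(stores)])  # ['s2', 's7']
```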
Step 2: Generate and Enrich Trade Areas
Create isochrone or buffer trade areas around reference locations. Polyfill to H3/Quadbin. Enrich with demographics, POIs, and any relevant variables.
Success: Each reference location has a rich demographic profile.
Step 3: Build Twin Areas Model
Use `native.buildtwinareasmodel` (BUILD_TWIN_AREAS_MODEL) to create a PCA-based similarity model from the enriched reference locations.
- Input: enriched reference locations with numeric feature columns
- The model captures the multivariate "signature" of successful locations
Success: A model artifact that encodes the demographic profile of top performers.
Step 4: Find Similar Locations
Use `native.findsimilarlocations` (FIND_SIMILAR_LOCATIONS) to score all candidate areas against the twin-areas model.
- Input: candidate areas enriched with the same variables used to build the model
- Output: similarity score per candidate
Success: Every candidate area has a similarity score relative to the reference set.
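The idea of scoring candidates against a reference profile can be sketched with a deliberately simplified stand-in. The Python example below does not use PCA and is not the algorithm behind the twin-areas components: it standardizes each variable to z-scores and measures Euclidean distance to the centroid of the reference locations. All names and numbers are hypothetical. It does show why the same variables must be present on both sides and why scaling matters.

```python
import math
from statistics import mean, pstdev


def similarity_scores(reference, candidates, features):
    """Score candidates by closeness to the reference centroid in z-score space."""
    pool = reference + list(candidates.values())
    stats = {}
    for f in features:
        values = [row[f] for row in pool]  # KeyError if a variable is missing
        stats[f] = (mean(values), pstdev(values) or 1.0)

    def z(row):
        return [(row[f] - stats[f][0]) / stats[f][1] for f in features]

    ref_z = [z(row) for row in reference]
    centroid = [mean(col) for col in zip(*ref_z)]
    return {
        name: 1.0 / (1.0 + math.dist(z(row), centroid))
        for name, row in candidates.items()
    }


features = ["pop_density", "median_income"]
reference = [
    {"pop_density": 5200, "median_income": 61000},
    {"pop_density": 4800, "median_income": 58000},
]
candidates = {
    "area_x": {"pop_density": 5000, "median_income": 60000},
    "area_y": {"pop_density": 900, "median_income": 31000},
}
scores = similarity_scores(reference, candidates, features)
print(max(scores, key=scores.get))  # area_x
```

Without the z-score step, income (tens of thousands) would swamp population density in the distance, which is the scaling sensitivity the skill warns about for the real model.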
Step 5: Rank and Save
Rank by similarity score descending. Save top candidates.
Success: A ranked list of areas most similar to top-performing locations.
Commercial Hotspots Variant
For demand-driven site selection (e.g. "where is unmet demand highest?"), use `native.commercialhotspots`:
- Build an H3 grid over the study area
- Enrich with the target demand variable (e.g. population aged 15-34)
- Run `native.commercialhotspots` with `variablecolumns` and `weights`
- Filter results by significance (`p_value < 0.05`)
- Optionally filter by `native.h3distance` from existing locations to focus on underserved areas
Note: `variablecolumns` uses Python-style list syntax (`['col1', 'col2']`), and `weights` is comma-separated; see the `trade-area-analysis` gotchas for details.
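The two string formats in the note can be produced from ordinary Python lists, as in the sketch below. The helper name, the column names, and the one-weight-per-column check are assumptions for illustration. The authoritative format rules are in the `trade-area-analysis` gotchas.

```python
def format_hotspot_params(columns, weights):
    """Build the two parameter strings: a Python-style list and a comma-separated list."""
    if len(columns) != len(weights):
        raise ValueError("one weight per variable column is expected")
    variablecolumns = "[" + ", ".join(f"'{c}'" for c in columns) + "]"
    weights_str = ",".join(str(w) for w in weights)
    return {"variablecolumns": variablecolumns, "weights": weights_str}


params = format_hotspot_params(["population_15_34", "poi_count"], [0.7, 0.3])
print(params["variablecolumns"])  # ['population_15_34', 'poi_count']
print(params["weights"])          # 0.7,0.3
```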
Gotchas
- Provider casing & SQL dialect. This skill uses lowercase column names (`h3`, `is_proposed`, `population`, etc.), following the BigQuery / Databricks / Postgres / Redshift convention. On Snowflake, unquoted identifiers surface UPPERCASE; reference them as `H3`, `IS_PROPOSED`, `POPULATION`. See `carto-create-workflow/references/providers/<provider>.md` for casing rules and SQL dialect equivalents.
- `native.commercialhotspots` requires the Retail module of the Analytics Toolbox. Validate with `--connection` to confirm availability.
- Twin Areas and Similar Locations use PCA internally, so results are sensitive to variable selection and scaling. Include only relevant, non-redundant variables. Normalize inputs if scales differ widely.
- Cannibalization overlap depends heavily on trade area definition (buffer radius, isoline time). Small changes in parameters can flip results. Document the chosen parameters and rationale.
- `native.h3distance` returns hop count, not physical distance. Multiply by the approximate cell edge length for the resolution to get a rough metric distance (e.g. res 8 ~ 460m, res 9 ~ 174m per hop).
- When comparing across regions of different sizes, normalize demographics to per-capita or per-area values to avoid size bias (e.g. population density instead of total population).
- The "best" location depends entirely on the criteria and weights chosen; there is no objectively correct answer. Always document assumptions and let the user adjust weights.
- For the twin-areas model, use the same set of enrichment variables for both the reference locations and the candidates. Mismatched variables will cause the model to fail or produce meaningless scores.
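The casing gotcha can be expressed as a small lookup. The Python sketch below is a hypothetical helper for turning this skill's lowercase column names into the form each provider surfaces. The provider labels are illustrative strings, and the per-provider reference files remain the authority on casing rules.

```python
# Providers whose unquoted identifiers surface in UPPERCASE, per the casing gotcha.
UPPERCASE_PROVIDERS = {"snowflake"}


def column_ref(name: str, provider: str) -> str:
    """Return the column name as it should be referenced for the provider."""
    if provider.lower() in UPPERCASE_PROVIDERS:
        return name.upper()
    return name.lower()


for col in ("h3", "is_proposed", "population"):
    print(column_ref(col, "snowflake"), column_ref(col, "bigquery"))
```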
Reference Templates
Academy Tutorials
| Tutorial | Provider | URL |
|---|---|---|
| Pizza Hut Honolulu — site selection with commercial hotspots | BigQuery | Link |
| Pizza Hut Honolulu — site selection with commercial hotspots | Snowflake | Link |
| Store cannibalization — quantifying new store impact | BigQuery | Link |
| Starbucks cannibalization — H3 grid overlap analysis | BigQuery | Link |
| Store cannibalization — Quadkey grid overlap | Snowflake | Link |
| Find twin areas of top-performing stores | BigQuery | Link |
| Find similar locations based on trade areas | BigQuery | Link |
| EV charging station site selection | Workflows | Link |
Common Variations
| Variant | How |
|---|---|
| Retail expansion | Isochrones -> enrich with demographics + competitor density -> composite score -> top N |
| Franchise territory planning | Cannibalization pattern to ensure non-overlapping catchments before awarding territories |
| EV charging / public services | Grid-based demand (population, traffic) + distance-from-existing filter -> rank underserved cells |
| Billboard / OOH placement | Buffers -> audience enrichment -> normalize + weight -> top N (see |
| Bank branch optimization | Twin areas from top branches -> find similar underserved areas -> propose new branches |
| Competitor proximity analysis | H3 distance to competitor locations -> filter cells far from competitors but near demand |