Spatial Autocorrelation with Moran's I
Builds CARTO Workflows that measure spatial autocorrelation using Moran's I, determining whether a variable exhibits clustering, dispersion, or randomness, and classifying each location into HH/HL/LH/LL quadrants.
Prerequisites: Load
for the development process, JSON structure, and validation commands.
When to use Moran's I vs Getis-Ord Gi*:
- Moran's I: "Is there clustering?" + classify into cluster types (HH, HL, LH, LL) + identify spatial outliers (HL, LH)
- Getis-Ord Gi*: "Where are the hotspots/coldspots?" + magnitude of clustering (z-scores)
Instructions
A Moran's I workflow follows this pipeline:
Source Data -> (Filter) -> Spatial Indexing (H3) -> Aggregation -> Moran's I -> (Filter Significant) -> Save
Step 1: Load Source Data
Use
. The input table typically contains point geometries or pre-indexed grid data.
Success: Node outputs a table with a geometry column (e.g.
) or an existing spatial index column.
Step 2: Filter (if needed)
Use
or
to narrow the dataset (e.g. filter by category, date range, non-null values).
Success: Output contains only the subset relevant to the analysis.
Step 3: Spatial Indexing
Convert point geometries to H3 cells using
.
Resolution guidance -- higher resolution = smaller cells = more local patterns:
| Resolution | Cell size | Use case |
|---|---|---|
| H3 res 7 | ~1.4 km edge (~5.2 km² area) | District/city-level patterns |
| H3 res 8 | ~0.5 km edge (~0.7 km² area) | Neighborhood-level |
| H3 res 9 | ~0.2 km edge (~0.1 km² area) | Street-level (used in Berlin POI tutorial) |
Success: Every row has a spatial index column (e.g.
).
Step 4: Aggregate per Cell
Use
to produce one row per cell with a numeric value:
- Group by: the spatial index column ()
- Aggregation: (or / )
Success: Output has exactly one row per unique cell with a numeric column (e.g.
).
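The aggregation step can be pictured with a small Python sketch. It is not the workflow component itself, only an illustration of "one row per cell with a numeric value". The cell IDs and the column names `h3` and `n_points` are made up for the example.

```python
from collections import Counter

# Hypothetical input: one H3 cell ID per point (IDs are illustrative only).
point_cells = [
    "8a1f1d4890affff",
    "8a1f1d4890affff",
    "8a1f1d4890b7fff",
    "8a1f1d4890affff",
    "8a1f1d4890c7fff",
]


def aggregate_per_cell(cells):
    """Return one row per unique cell with a numeric count column."""
    counts = Counter(cells)
    # Sorting only makes the output order deterministic for the example.
    return [{"h3": cell, "n_points": n} for cell, n in sorted(counts.items())]


rows = aggregate_per_cell(point_cells)
for row in rows:
    print(row)
```

The count column, not the cell ID, is what would be passed on as the value to test for autocorrelation.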
Step 5: Run Moran's I
| Input | Description | Default |
|---|---|---|
| Column with H3/Quadbin indexes | |
| Numeric column to test for autocorrelation | |
| K-ring neighborhood radius (in hops) | |
| Distance decay function for spatial weights | |
- : Equal weight to all neighbors within the k-ring
- : Weight decreases exponentially with distance (used in Berlin POI tutorial)
K-ring size: Larger = broader neighborhood = smoother global patterns. Smaller = more localized assessment. The choice of neighborhood size significantly affects results.
Success: Output contains
,
,
, and
columns for every cell. (See the Provider casing note in Gotchas — Snowflake surfaces these UPPERCASE.)
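To see why k-ring size and decay choice matter, the sketch below computes a weighted neighbor average on a toy one-dimensional chain of cells. It makes several assumptions: the option names `"uniform"` and `"exponential"` are placeholders, the exponential form `exp(-d)` is only one plausible decay, and real H3 k-rings are two-dimensional. Check the component schema for the actual behavior.

```python
import math


def kring_weights(max_hops, decay="uniform"):
    """Map hop distance (1..max_hops) to an unnormalized weight."""
    if decay == "uniform":
        return {d: 1.0 for d in range(1, max_hops + 1)}
    if decay == "exponential":
        # Assumed form exp(-d); the real component may scale this differently.
        return {d: math.exp(-d) for d in range(1, max_hops + 1)}
    raise ValueError(f"unknown decay: {decay}")


def spatial_lag(values, i, max_hops, decay="uniform"):
    """Weighted mean of the neighbors of cell i on a 1-D chain of cells."""
    num = den = 0.0
    for d, w in kring_weights(max_hops, decay).items():
        for j in (i - d, i + d):
            if 0 <= j < len(values):
                num += w * values[j]
                den += w
    return num / den if den else 0.0


values = [1, 1, 1, 10, 1, 1, 1]  # one high cell in a low-valued area

lag_k1 = spatial_lag(values, 2, 1)                       # immediate neighbors only
lag_k2_uniform = spatial_lag(values, 2, 2, "uniform")    # wider ring, equal weights
lag_k2_exp = spatial_lag(values, 2, 2, "exponential")    # wider ring, nearer cells count more
print(lag_k1, lag_k2_uniform, round(lag_k2_exp, 3))
```

Widening the ring dilutes the influence of the single high neighbor, while exponential decay keeps more of it. The same cell can therefore look more or less "surrounded by high values" depending on these two settings.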
Step 6: Filter Significant Results (recommended)
Use
to keep only statistically significant cells. Quadrant classification is only meaningful for significant cells.
Common filters:
- p_value < 0.05 -- all significant cells (95% confidence)
- p_value < 0.05 AND quadrant = 'HH' -- high-value clusters only
- p_value < 0.05 AND (quadrant = 'HL' OR quadrant = 'LH') -- spatial outliers only
Success: Only cells with statistically meaningful spatial patterns remain.
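The filters above can be mirrored in a few lines of Python for a quick sanity check on exported results. The rows and the `cell` key are invented for illustration; `p_value` and `quadrant` follow the column names used in the filter expressions.

```python
# Hypothetical output rows (values are made up for the example).
rows = [
    {"cell": "a", "p_value": 0.01, "quadrant": "HH"},
    {"cell": "b", "p_value": 0.20, "quadrant": "HH"},
    {"cell": "c", "p_value": 0.03, "quadrant": "HL"},
    {"cell": "d", "p_value": 0.04, "quadrant": "LL"},
    {"cell": "e", "p_value": 0.50, "quadrant": "LH"},
]


def keep_significant(rows, alpha=0.05, quadrants=None):
    """Keep rows with p_value < alpha, optionally restricted to some quadrants."""
    return [
        r for r in rows
        if r["p_value"] < alpha and (quadrants is None or r["quadrant"] in quadrants)
    ]


significant = keep_significant(rows)                       # all significant cells
hh_only = keep_significant(rows, quadrants={"HH"})         # high-value clusters
outliers = keep_significant(rows, quadrants={"HL", "LH"})  # spatial outliers
print([r["cell"] for r in significant])
```

Cell `b` is labeled HH but is dropped because its p-value is above the threshold, which is the point of filtering before reading quadrant labels.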
Step 7: Save
Use
to persist results. The H3/Quadbin column is directly visualizable in CARTO Builder without geometry conversion.
Success: Validated workflow that can be uploaded via
.
Output Columns
| Column | Meaning |
|---|---|
| | Spatial index cell ID (H3 or Quadbin) |
| | Local Moran's I value -- positive = similar neighbors, negative = dissimilar neighbors |
| p_value | Statistical significance -- lower = more confident |
| quadrant | Cluster classification: HH, HL, LH, or LL |
The engine declares these lowercase. See the Provider casing note in Gotchas for Snowflake.
Interpreting Results
Global Moran's I (overall pattern):
- > 0 = spatial clustering (similar values near each other)
- < 0 = spatial dispersion (dissimilar values near each other)
- Near 0 = spatial randomness
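As an illustration of these sign conventions, here is a minimal sketch of the textbook global Moran's I with binary (0/1) neighbor weights on a toy chain of cells. It is not the Analytics Toolbox implementation, which may weight and normalize differently. The helper names are made up for the example.

```python
def chain_neighbors(n):
    """Neighbors on a 1-D chain: each cell touches the cells directly beside it."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def global_morans_i(values, neighbors):
    """Global Moran's I with binary weights (w_ij = 1 if j neighbors i, else 0)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(len(nb) for nb in neighbors.values())  # sum of all weights
    cross = sum(dev[i] * dev[j] for i, nb in neighbors.items() for j in nb)
    denom = sum(d * d for d in dev)
    return (n / s0) * cross / denom


clustered = [1, 1, 1, 9, 9, 9]    # similar values sit next to each other
alternating = [1, 9, 1, 9, 1, 9]  # every neighbor is dissimilar

i_clustered = global_morans_i(clustered, chain_neighbors(6))
i_alternating = global_morans_i(alternating, chain_neighbors(6))
print(round(i_clustered, 3), round(i_alternating, 3))
```

The clustered series gives a positive value and the alternating series a negative one, matching the interpretation above.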
Local quadrants (per-cell classification):
| Quadrant | Meaning | Interpretation |
|---|---|---|
| HH | High value surrounded by high values | Cluster core |
| LL | Low value surrounded by low values | Low-value cluster |
| HL | High value surrounded by low values | Spatial outlier (high anomaly) |
| LH | Low value surrounded by high values | Spatial outlier (low anomaly) |
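The quadrant labels can be reproduced conceptually by comparing a cell's value and its neighbors' average against the global mean. This is a simplified sketch: actual implementations typically work on standardized values and a weighted spatial lag, and treating ties as "high" is an assumption made here for brevity.

```python
def quadrant(value, neighbor_mean, global_mean):
    """Label a cell by its own level (first letter) and its neighbors' level (second)."""
    own = "H" if value >= global_mean else "L"
    nbr = "H" if neighbor_mean >= global_mean else "L"
    return own + nbr


def local_sign(value, neighbor_mean, global_mean):
    """Unnormalized local statistic: positive for HH/LL, negative for HL/LH."""
    return (value - global_mean) * (neighbor_mean - global_mean)


global_mean = 5.0
examples = {
    "cluster core": (10, 9),
    "low-value cluster": (1, 2),
    "high anomaly": (10, 1),
    "low anomaly": (1, 9),
}
for name, (value, neighbor_mean) in examples.items():
    label = quadrant(value, neighbor_mean, global_mean)
    sign = local_sign(value, neighbor_mean, global_mean)
    print(name, label, "positive" if sign > 0 else "negative")
```

HH and LL cells produce a positive local statistic, while HL and LH cells produce a negative one, which is why the two outlier quadrants are the ones with dissimilar neighbors.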
Gotchas
- Provider casing & SQL dialect. This skill documents columns in lowercase (BigQuery / Databricks / Postgres / Redshift convention). On Snowflake, unquoted identifiers surface UPPERCASE, so reference the uppercase column names (e.g. P_VALUE, QUADRANT) in expressions. See
carto-create-workflow/references/providers/<provider>.md
for casing rules and SQL dialect equivalents.
- The Moran's I component requires the Analytics Toolbox. Always run
carto workflows verify-remote --connection <conn>
to ensure the AT path is resolved. is offline and cannot resolve AT location.
- The output column is named , not or . If you need to join back to original data, rename it (e.g. with ). This is the same behavior as Getis-Ord.
- The value column must be numeric. If you are counting features, the group-by step must produce a count column -- do not pass the raw index column as the value.
- Resolution too high + large area = very many cells, which can be slow or hit memory limits. Start with a moderate resolution and refine.
- Moran's I is sensitive to the definition of neighborhood. Both k-ring size and decay function choice materially affect results. Document your choices and consider testing alternatives.
- Quadrant classification is only meaningful for statistically significant cells. Always filter by p_value before interpreting quadrants -- non-significant cells may show any quadrant label by chance.
- The decay input parameter is named (not ). Check the component schema if unsure.
Reference Templates
| Resource | Description |
|---|---|
| BQ Tutorial | Computing spatial autocorrelation of POI locations in Berlin (BigQuery) |
| SF Tutorial | Same tutorial for Snowflake |
| Workflow template | "Computing the spatial auto-correlation of point of interest locations" (available in CARTO Workspace) |
Common Variations
| Variant | How |
|---|---|
| Pre-indexed data | Skip Step 3 if data already has H3/Quadbin column |
| Polygon input instead of points | Use instead of |
| Complete grid (no gaps) | Polyfill study area boundary first, then enrich with data (same approach as hotspot analysis) |
| Combine with Getis-Ord | Run both analyses on the same aggregated grid, then join results for a richer picture |
| Filter to outliers only | Keep HL and LH quadrants to find anomalous locations |