geopandas

GeoPandas


GeoPandas extends pandas to enable spatial operations on geometric types. It combines the capabilities of pandas and shapely for geospatial data analysis.

Installation


```bash
uv pip install geopandas
```

Optional Dependencies


```bash
# For interactive maps
uv pip install folium

# For classification schemes in mapping
uv pip install mapclassify

# For faster I/O operations (2-4x speedup)
uv pip install pyarrow

# For PostGIS database support
uv pip install psycopg2
uv pip install geoalchemy2

# For basemaps
uv pip install contextily

# For cartographic projections
uv pip install cartopy
```

Quick Start


```python
import geopandas as gpd

# Read spatial data
gdf = gpd.read_file("data.geojson")

# Basic exploration
print(gdf.head())
print(gdf.crs)
print(gdf.geometry.geom_type)

# Simple plot
gdf.plot()

# Reproject to different CRS
gdf_projected = gdf.to_crs("EPSG:3857")

# Calculate area (use projected CRS for accuracy; EPSG:3857 inflates areas
# away from the equator, so prefer an equal-area or local CRS for real work)
gdf_projected['area'] = gdf_projected.geometry.area

# Save to file
gdf.to_file("output.gpkg")
```
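The choice of projected CRS matters for area. The sketch below compares the area of the same cell in Web Mercator (EPSG:3857) and in a global equal-area projection. The test geometry and the use of EPSG:6933 are assumptions chosen for illustration, not part of the original quick start.

```python
import geopandas as gpd
from shapely.geometry import box

# A 1 degree x 1 degree cell centred on 60 degrees north (illustrative geometry).
cell = gpd.GeoDataFrame(geometry=[box(10.0, 59.5, 11.0, 60.5)], crs="EPSG:4326")

# Web Mercator stretches area by roughly 1 / cos(latitude)^2.
area_web_mercator = cell.to_crs("EPSG:3857").geometry.area.iloc[0]

# EPSG:6933 is a global equal-area projection, so areas stay close to true size.
area_equal_area = cell.to_crs("EPSG:6933").geometry.area.iloc[0]

ratio = area_web_mercator / area_equal_area
print(f"Web Mercator: {area_web_mercator / 1e6:,.0f} km^2")
print(f"Equal-area:   {area_equal_area / 1e6:,.0f} km^2")
print(f"Ratio:        {ratio:.2f}")
```

At this latitude the Web Mercator figure comes out about four times larger, which is why an equal-area or local projected CRS is usually the safer choice for area statistics.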

Core Concepts


Data Structures


  • GeoSeries: Vector of geometries with spatial operations
  • GeoDataFrame: Tabular data structure with geometry column

See data-structures.md for details.
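As a minimal sketch of how the two structures relate, the example below builds a GeoSeries and wraps it in a GeoDataFrame. The point coordinates and the `name` column are made up for illustration.

```python
import geopandas as gpd
from shapely.geometry import Point

# A GeoSeries is a pandas Series whose values are shapely geometries.
points = gpd.GeoSeries([Point(0, 0), Point(3, 4)], crs="EPSG:4326")

# A GeoDataFrame is a DataFrame with one active geometry column.
cities = gpd.GeoDataFrame({"name": ["a", "b"]}, geometry=points)

print(type(cities.geometry).__name__)
print(list(cities.columns))
print(cities.crs)
```

The GeoDataFrame picks up the CRS from the GeoSeries it is built on, so the CRS only needs to be declared once.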

Reading and Writing Data


GeoPandas reads/writes multiple formats: Shapefile, GeoJSON, GeoPackage, PostGIS, Parquet.

```python
# Read with filtering
gdf = gpd.read_file("data.gpkg", bbox=(xmin, ymin, xmax, ymax))

# Write with Arrow acceleration
gdf.to_file("output.gpkg", use_arrow=True)
```

See [data-io.md](references/data-io.md) for comprehensive I/O operations.

Coordinate Reference Systems


Always check and manage CRS for accurate spatial operations:

```python
# Check CRS
print(gdf.crs)

# Reproject (transforms coordinates)
gdf_projected = gdf.to_crs("EPSG:3857")

# Set CRS (only when metadata missing)
gdf = gdf.set_crs("EPSG:4326")
```

See [crs-management.md](references/crs-management.md) for CRS operations.
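The difference between `set_crs` and `to_crs` is easy to mix up, so here is a minimal sketch with a single made-up point: `set_crs` only labels the data, while `to_crs` changes the coordinates.

```python
import geopandas as gpd
from shapely.geometry import Point

# Coordinates known to be longitude/latitude but carrying no CRS metadata.
raw = gpd.GeoSeries([Point(10.0, 50.0)])
assert raw.crs is None

# set_crs only attaches metadata; the coordinates stay the same.
labelled = raw.set_crs("EPSG:4326")

# to_crs transforms the coordinates into the target CRS.
projected = labelled.to_crs("EPSG:3857")

print(labelled.iloc[0])
print(projected.iloc[0])
```

Calling `set_crs` on data that already has the wrong CRS would mislabel it without moving anything, which is why it is reserved for the missing-metadata case.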

Common Operations


Geometric Operations


Buffer, simplify, centroid, convex hull, affine transformations:

```python
# Buffer by 10 units
buffered = gdf.geometry.buffer(10)

# Simplify with tolerance
simplified = gdf.geometry.simplify(tolerance=5, preserve_topology=True)

# Get centroids
centroids = gdf.geometry.centroid
```

See [geometric-operations.md](references/geometric-operations.md) for all operations.
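A small worked example may help show what these operations return. The geometries below are invented for illustration and sit in a projected CRS so that the buffer distance is in metres.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Illustrative geometries in a projected CRS with metre units.
shapes = gpd.GeoSeries(
    [Point(0, 0), Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])],
    crs="EPSG:3857",
)

# Buffering a point by 10 gives a polygon approximating a circle of radius 10.
buffered = shapes.buffer(10)

# The centroid of the 2 x 2 square is its centre.
centroids = shapes.centroid

# The convex hull of the square is the square itself.
hulls = shapes.convex_hull

print(round(buffered.iloc[0].area, 1))
print(centroids.iloc[1])
print(hulls.iloc[1].area)
```

The buffered area is slightly below pi * 10^2 because the circle is approximated by straight segments.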

Spatial Analysis


Spatial joins, overlay operations, dissolve:

```python
# Spatial join (intersects)
joined = gpd.sjoin(gdf1, gdf2, predicate='intersects')

# Nearest neighbor join
nearest = gpd.sjoin_nearest(gdf1, gdf2, max_distance=1000)

# Overlay intersection
intersection = gpd.overlay(gdf1, gdf2, how='intersection')

# Dissolve by attribute
dissolved = gdf.dissolve(by='region', aggfunc='sum')
```

See [spatial-analysis.md](references/spatial-analysis.md) for analysis operations.
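The following sketch runs an overlay and a dissolve on tiny hand-made layers so the results can be checked by eye. The `region`, `population`, and `zone` columns and all coordinates are hypothetical.

```python
import geopandas as gpd
from shapely.geometry import box

# Two adjacent 2 x 2 squares that share a hypothetical "region" value.
left = gpd.GeoDataFrame(
    {"region": ["north", "north"], "population": [100, 50]},
    geometry=[box(0, 0, 2, 2), box(2, 0, 4, 2)],
    crs="EPSG:3857",
)
# One square that overlaps a 1 x 1 corner of each square above.
right = gpd.GeoDataFrame(
    {"zone": ["flood"]}, geometry=[box(1, 1, 3, 3)], crs="EPSG:3857"
)

# Intersection keeps only the overlapping parts, with attributes from both inputs.
intersection = gpd.overlay(left, right, how="intersection")

# Dissolve merges geometries that share a value and aggregates the other columns.
dissolved = left.dissolve(by="region", aggfunc="sum")

print(intersection[["region", "zone"]])
print(dissolved["population"])
```

The overlay yields two 1 x 1 pieces, and the dissolve yields one 4 x 2 rectangle with the populations summed.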

Visualization


Create static and interactive maps:

```python
# Choropleth map
gdf.plot(column='population', cmap='YlOrRd', legend=True)

# Interactive map
gdf.explore(column='population', legend=True).save('map.html')

# Multi-layer map
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
gdf1.plot(ax=ax, color='blue')
gdf2.plot(ax=ax, color='red')
```

See [visualization.md](references/visualization.md) for mapping techniques.

Detailed Documentation


  • Data Structures - GeoSeries and GeoDataFrame fundamentals
  • Data I/O - Reading/writing files, PostGIS, Parquet
  • Geometric Operations - Buffer, simplify, affine transforms
  • Spatial Analysis - Joins, overlay, dissolve, clipping
  • Visualization - Plotting, choropleth maps, interactive maps
  • CRS Management - Coordinate reference systems and projections

Common Workflows


Load, Transform, Analyze, Export


```python
# 1. Load data
gdf = gpd.read_file("data.shp")

# 2. Check and transform CRS
print(gdf.crs)
gdf = gdf.to_crs("EPSG:3857")

# 3. Perform analysis
gdf['area'] = gdf.geometry.area
buffered = gdf.copy()
buffered['geometry'] = gdf.geometry.buffer(100)

# 4. Export results
gdf.to_file("results.gpkg", layer='original')
buffered.to_file("results.gpkg", layer='buffered')
```

Spatial Join and Aggregate


```python
# Join points to polygons
points_in_polygons = gpd.sjoin(points_gdf, polygons_gdf, predicate='within')

# Aggregate by polygon (named aggregation: output column = (input column, function);
# a dict key such as 'count' would have to be an existing column)
aggregated = points_in_polygons.groupby('index_right').agg(
    value=('value', 'sum'),
    count=('value', 'size'),
)

# Merge back to polygons
result = polygons_gdf.merge(aggregated, left_index=True, right_index=True)
```
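Here is the same workflow as a self-contained sketch on made-up data, so each step can be inspected. The polygon names, point values, and the `how="left"` merge are illustrative choices rather than part of the original recipe.

```python
import geopandas as gpd
from shapely.geometry import Point, box

polygons_gdf = gpd.GeoDataFrame(
    {"name": ["A", "B"]},
    geometry=[box(0, 0, 10, 10), box(10, 0, 20, 10)],
    crs="EPSG:3857",
)
points_gdf = gpd.GeoDataFrame(
    {"value": [1.0, 2.0, 4.0]},
    geometry=[Point(1, 1), Point(2, 2), Point(15, 5)],
    crs="EPSG:3857",
)

# Each point row gains an "index_right" column naming the polygon it falls in.
joined = gpd.sjoin(points_gdf, polygons_gdf, predicate="within")

# Sum the values and count the points per polygon.
aggregated = joined.groupby("index_right").agg(
    value=("value", "sum"),
    count=("value", "size"),
)

# how="left" keeps polygons that contain no points (their totals are NaN).
result = polygons_gdf.merge(
    aggregated, left_index=True, right_index=True, how="left"
)
print(result[["name", "value", "count"]])
```

Polygon A ends up with two points totalling 3.0 and polygon B with one point of 4.0. With the default inner merge, any polygon without points would be dropped from the result.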

Multi-Source Data Integration


```python
# Read from different sources
roads = gpd.read_file("roads.shp")
buildings = gpd.read_file("buildings.geojson")
# `engine` is assumed to be an existing SQLAlchemy engine connected to PostGIS
parcels = gpd.read_postgis("SELECT * FROM parcels", con=engine, geom_col='geom')

# Ensure matching CRS
buildings = buildings.to_crs(roads.crs)
parcels = parcels.to_crs(roads.crs)

# Perform spatial operations (50 is in the units of the shared CRS)
buildings_near_roads = buildings[buildings.geometry.distance(roads.union_all()) < 50]
```

Performance Tips


  1. Use spatial indexing: GeoPandas creates spatial indexes automatically for most operations
  2. Filter during read: Use `bbox`, `mask`, or `where` parameters to load only needed data
  3. Use Arrow for I/O: Add `use_arrow=True` for 2-4x faster reading/writing
  4. Simplify geometries: Use `.simplify()` to reduce complexity when precision isn't critical
  5. Batch operations: Vectorized operations are much faster than iterating rows
  6. Use appropriate CRS: Projected CRS for area/distance, geographic for visualization
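The spatial index is normally used behind the scenes, but it can also be queried directly. The sketch below does that on three made-up points; the search window and column name are illustrative.

```python
import geopandas as gpd
from shapely.geometry import Point, box

gdf = gpd.GeoDataFrame(
    {"name": ["a", "b", "c"]},
    geometry=[Point(0, 0), Point(5, 5), Point(10, 10)],
    crs="EPSG:3857",
)

# sindex is built on first access and reused afterwards.
window = box(4, 4, 6, 6)
hits = gdf.sindex.query(window, predicate="intersects")

# query returns integer positions, so use iloc to fetch the rows.
nearby = gdf.iloc[hits]
print(nearby["name"].tolist())
```

A direct query like this avoids testing every geometry against the window, which is the same saving that `sjoin` and similar operations get from the index.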

Best Practices


  1. Always check CRS before spatial operations
  2. Use projected CRS for area and distance calculations
  3. Match CRS before spatial joins or overlays
  4. Validate geometries with `.is_valid` before operations
  5. Use `.copy()` when modifying geometry columns to avoid side effects
  6. Preserve topology when simplifying for analysis
  7. Use GeoPackage format for modern workflows (better than Shapefile)
  8. Set `max_distance` in `sjoin_nearest` for better performance
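Points 4 and 5 can be sketched together: check validity, then repair on a copy. The self-intersecting "bowtie" polygon is an invented test case, and `make_valid` is one possible repair step, not something the list above prescribes.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# A "bowtie" polygon whose edges cross, which makes it invalid.
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])
gdf = gpd.GeoDataFrame({"id": [1]}, geometry=[bowtie], crs="EPSG:3857")

print(gdf.is_valid.tolist())

# Work on a copy so the original geometries stay untouched.
fixed = gdf.copy()
fixed["geometry"] = fixed.geometry.make_valid()

print(fixed.is_valid.tolist())
```

Repairing can change the geometry type (a bowtie typically becomes a multi-part geometry), so it is worth inspecting the result before running overlays or joins on it.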