GeoPandas
GeoPandas extends pandas to enable spatial operations on geometric types. It combines the capabilities of pandas and shapely for geospatial data analysis.
Installation
```bash
uv pip install geopandas
```
Optional Dependencies
```bash
# For interactive maps
uv pip install folium

# For classification schemes in mapping
uv pip install mapclassify

# For faster I/O operations (2-4x speedup)
uv pip install pyarrow

# For PostGIS database support
uv pip install psycopg2
uv pip install geoalchemy2

# For basemaps
uv pip install contextily

# For cartographic projections
uv pip install cartopy
```
Quick Start
```python
import geopandas as gpd

# Read spatial data
gdf = gpd.read_file("data.geojson")

# Basic exploration
print(gdf.head())
print(gdf.crs)
print(gdf.geometry.geom_type)

# Simple plot
gdf.plot()

# Reproject to a different CRS
gdf_projected = gdf.to_crs("EPSG:3857")

# Calculate area (use a projected CRS; note that EPSG:3857 inflates areas
# away from the equator, so prefer an equal-area CRS when accuracy matters)
gdf_projected['area'] = gdf_projected.geometry.area

# Save to file
gdf.to_file("output.gpkg")
```
Core Concepts
Data Structures
- GeoSeries: Vector of geometries with spatial operations
- GeoDataFrame: Tabular data structure with geometry column
See data-structures.md for details.
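As a minimal sketch of the two structures, the snippet below builds a GeoSeries and a GeoDataFrame from a few shapely points. The names, values, and coordinates are made-up illustration data, not taken from this guide.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical sample points (x = longitude, y = latitude), for illustration only
points = [Point(0.0, 0.0), Point(1.0, 1.0), Point(2.0, 0.5)]

# GeoSeries: a vector of geometries with an optional CRS
gs = gpd.GeoSeries(points, crs="EPSG:4326")

# GeoDataFrame: a table whose geometry column is a GeoSeries
gdf = gpd.GeoDataFrame(
    {"name": ["a", "b", "c"], "value": [10, 20, 30]},
    geometry=points,
    crs="EPSG:4326",
)

print(type(gdf.geometry).__name__)
print(gdf.total_bounds)  # array of xmin, ymin, xmax, ymax
```

Ordinary pandas operations (filtering, `groupby`, `merge`) still work on the GeoDataFrame; the geometry column simply travels with the rows.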
Reading and Writing Data
GeoPandas reads and writes multiple formats: Shapefile, GeoJSON, GeoPackage, PostGIS, Parquet.
```python
# Read with filtering (xmin, ymin, xmax, ymax are bounds in the layer's CRS)
gdf = gpd.read_file("data.gpkg", bbox=(xmin, ymin, xmax, ymax))

# Write with Arrow acceleration (requires pyarrow and the pyogrio engine)
gdf.to_file("output.gpkg", use_arrow=True)
```
See [data-io.md](references/data-io.md) for comprehensive I/O operations.
Coordinate Reference Systems
Always check and manage the CRS for accurate spatial operations:
```python
# Check CRS
print(gdf.crs)

# Reproject (transforms coordinates)
gdf_projected = gdf.to_crs("EPSG:3857")

# Set CRS (only when metadata is missing; does not transform coordinates)
gdf = gdf.set_crs("EPSG:4326")
```
See [crs-management.md](references/crs-management.md) for CRS operations.
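The difference between `set_crs` and `to_crs` is easy to confuse, so here is a small sketch of it. It assumes a single made-up point known to be in longitude/latitude but stored without CRS metadata.

```python
import geopandas as gpd
from shapely.geometry import Point

# A point with no CRS metadata; we assume we know it is lon/lat (WGS 84)
gdf = gpd.GeoDataFrame({"id": [1]}, geometry=[Point(10.0, 50.0)])
print(gdf.crs)  # None

# set_crs only attaches metadata; the coordinates are unchanged
gdf = gdf.set_crs("EPSG:4326")
x_before = gdf.geometry.x.iloc[0]

# to_crs transforms the coordinates into the target CRS
gdf_merc = gdf.to_crs("EPSG:3857")
x_after = gdf_merc.geometry.x.iloc[0]

print(x_before, x_after)
```

Calling `set_crs` on data that already carries a different CRS raises an error unless `allow_override=True` is passed, which is a useful guard against relabeling coordinates by accident.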
Common Operations
Geometric Operations
Buffer, simplify, centroid, convex hull, affine transformations:
```python
# Buffer by 10 units (in the units of the CRS)
buffered = gdf.geometry.buffer(10)

# Simplify with tolerance
simplified = gdf.geometry.simplify(tolerance=5, preserve_topology=True)

# Get centroids
centroids = gdf.geometry.centroid
```
See [geometric-operations.md](references/geometric-operations.md) for all operations.
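To make these operations concrete, the following sketch applies them to a single 10 x 10 square in an arbitrary planar coordinate system. The square is a made-up example with no CRS attached, and `translate` stands in for the affine transformations mentioned above.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# A 10 x 10 square in an arbitrary planar coordinate system (no CRS set)
square = Polygon([(0, 0), (10, 0), (10, 10), (0, 10)])
gs = gpd.GeoSeries([square])

buffered = gs.buffer(1)                 # grows the square by 1 unit on every side
centroid = gs.centroid.iloc[0]          # geometric center of the square
hull = gs.convex_hull                   # convex hull (the square itself here)
shifted = gs.translate(xoff=5, yoff=0)  # affine translation along x

print(gs.area.iloc[0], buffered.area.iloc[0])
print(centroid.x, centroid.y)
print(shifted.total_bounds)
```

The buffered area is slightly below the exact value of 100 + 40 + π because the rounded corners are approximated by line segments.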
Spatial Analysis
Spatial joins, overlay operations, dissolve:
```python
# Spatial join (intersects)
joined = gpd.sjoin(gdf1, gdf2, predicate='intersects')

# Nearest neighbor join
nearest = gpd.sjoin_nearest(gdf1, gdf2, max_distance=1000)

# Overlay intersection
intersection = gpd.overlay(gdf1, gdf2, how='intersection')

# Dissolve by attribute
dissolved = gdf.dissolve(by='region', aggfunc='sum')
```
See [spatial-analysis.md](references/spatial-analysis.md) for analysis operations.
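Overlay and dissolve are easiest to follow on toy geometry. This sketch uses three hypothetical rectangular zones and one study-area box; the column names `region`, `pop`, and `name` are invented for the example.

```python
import geopandas as gpd
from shapely.geometry import box

# Two hypothetical layers in the same planar coordinate system
zones = gpd.GeoDataFrame(
    {"region": ["a", "a", "b"], "pop": [10, 20, 5]},
    geometry=[box(0, 0, 2, 2), box(2, 0, 4, 2), box(0, 2, 4, 4)],
)
study_area = gpd.GeoDataFrame({"name": ["study"]}, geometry=[box(1, 1, 3, 3)])

# Overlay: keep only the parts of each zone that fall inside the study area
clipped = gpd.overlay(zones, study_area, how="intersection")

# Dissolve: merge zones that share a region and sum their numeric attribute
dissolved = zones.dissolve(by="region", aggfunc="sum")

print(len(clipped), clipped.area.sum())
print(dissolved["pop"].to_dict())
```

Note that `overlay` produces new geometries carrying attributes from both inputs, whereas `sjoin` keeps the left geometries unchanged and only attaches attributes.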
Visualization
Create static and interactive maps:
```python
# Choropleth map
gdf.plot(column='population', cmap='YlOrRd', legend=True)

# Interactive map
gdf.explore(column='population', legend=True).save('map.html')

# Multi-layer map
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
gdf1.plot(ax=ax, color='blue')
gdf2.plot(ax=ax, color='red')
```
See [visualization.md](references/visualization.md) for mapping techniques.
Detailed Documentation
- Data Structures - GeoSeries and GeoDataFrame fundamentals
- Data I/O - Reading/writing files, PostGIS, Parquet
- Geometric Operations - Buffer, simplify, affine transforms
- Spatial Analysis - Joins, overlay, dissolve, clipping
- Visualization - Plotting, choropleth maps, interactive maps
- CRS Management - Coordinate reference systems and projections
Common Workflows
Load, Transform, Analyze, Export
```python
# 1. Load data
gdf = gpd.read_file("data.shp")

# 2. Check and transform CRS
print(gdf.crs)
gdf = gdf.to_crs("EPSG:3857")

# 3. Perform analysis
gdf['area'] = gdf.geometry.area
buffered = gdf.copy()
buffered['geometry'] = gdf.geometry.buffer(100)

# 4. Export results
gdf.to_file("results.gpkg", layer='original')
buffered.to_file("results.gpkg", layer='buffered')
```
Spatial Join and Aggregate
```python
# Join points to polygons
points_in_polygons = gpd.sjoin(points_gdf, polygons_gdf, predicate='within')

# Aggregate by polygon (named aggregation: output column = (input column, function))
aggregated = points_in_polygons.groupby('index_right').agg(
    value=('value', 'sum'),
    count=('value', 'size'),
)

# Merge back to polygons
result = polygons_gdf.merge(aggregated, left_index=True, right_index=True)
```
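The join-then-aggregate workflow can be sketched end to end on toy data. The zones, points, and the output column names `value_sum` and `n_points` below are hypothetical. The sketch also uses a left join so that polygons containing no points are kept, which is a choice made for this example rather than something the workflow requires.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical polygons and points in the same planar coordinate system
polygons_gdf = gpd.GeoDataFrame(
    {"zone": ["A", "B", "C"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
)
points_gdf = gpd.GeoDataFrame(
    {"value": [1.0, 2.0, 4.0]},
    geometry=[Point(0.5, 0.5), Point(0.2, 0.8), Point(1.5, 0.5)],
)

# Each matched point gets the index of its containing polygon in "index_right"
joined = gpd.sjoin(points_gdf, polygons_gdf, predicate="within")

# Named aggregation: one row per polygon index
stats = joined.groupby("index_right").agg(
    value_sum=("value", "sum"),
    n_points=("value", "size"),
)

# Left join on the polygon index keeps polygons without any points
result = polygons_gdf.join(stats)
result["value_sum"] = result["value_sum"].fillna(0.0)
result["n_points"] = result["n_points"].fillna(0).astype(int)

print(result[["zone", "value_sum", "n_points"]])
```

Zone C has no points, so it would be dropped by an inner merge; here it survives with zero counts.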
Multi-Source Data Integration
```python
# Read from different sources
# (`engine` is assumed to be an existing SQLAlchemy engine connected to PostGIS)
roads = gpd.read_file("roads.shp")
buildings = gpd.read_file("buildings.geojson")
parcels = gpd.read_postgis("SELECT * FROM parcels", con=engine, geom_col='geom')

# Ensure matching CRS
buildings = buildings.to_crs(roads.crs)
parcels = parcels.to_crs(roads.crs)

# Perform spatial operations (50 is in CRS units, so use a projected CRS)
buildings_near_roads = buildings[buildings.geometry.distance(roads.union_all()) < 50]
```
Performance Tips
- Use spatial indexing: GeoPandas creates spatial indexes automatically for most operations
- Filter during read: Use `bbox`, `mask`, or `where` parameters to load only needed data
- Use Arrow for I/O: Add `use_arrow=True` for 2-4x faster reading/writing
- Simplify geometries: Use `.simplify()` to reduce complexity when precision isn't critical
- Batch operations: Vectorized operations are much faster than iterating rows
- Use appropriate CRS: Projected CRS for area/distance, geographic for visualization
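Two of the tips above, spatial indexing and vectorized operations, can be illustrated together. This is a sketch on a made-up 10 x 10 grid of points: it queries the spatial index directly through `.sindex` and runs the same test as a vectorized predicate, with no Python-level loop over rows.

```python
import geopandas as gpd
import numpy as np
from shapely.geometry import box

# Hypothetical grid of 100 points in a planar coordinate system
xs, ys = np.meshgrid(np.arange(10), np.arange(10))
points = gpd.GeoSeries(gpd.points_from_xy(xs.ravel(), ys.ravel()))

# Query window covering the grid points with x and y in {3, 4, 5}
window = box(2.5, 2.5, 5.5, 5.5)

# Spatial index query: returns integer positions of matching geometries
hits = points.sindex.query(window, predicate="intersects")

# Vectorized predicate over the whole series (no row-by-row iteration)
mask = points.intersects(window)

print(len(hits), int(mask.sum()))
```

Both approaches find the same points. The index pays off when many queries are run against the same large layer, which is what joins and overlays do internally.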
Best Practices
- Always check CRS before spatial operations
- Use projected CRS for area and distance calculations
- Match CRS before spatial joins or overlays
- Validate geometries with `.is_valid` before operations
- Use `.copy()` when modifying geometry columns to avoid side effects
- Preserve topology when simplifying for analysis
- Use GeoPackage format for modern workflows (better than Shapefile)
- Set max_distance in sjoin_nearest for better performance
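The validation and copy practices above can be sketched on a deliberately broken geometry. The self-intersecting "bowtie" polygon is a made-up example, and `make_valid()` is not mentioned in the list; it is used here as one possible repair step, on the assumption that a recent GeoPandas version is installed.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# A self-intersecting "bowtie" polygon (invalid) next to a valid square
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])
square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
gdf = gpd.GeoDataFrame({"id": [1, 2]}, geometry=[bowtie, square])

print(gdf.is_valid.tolist())  # [False, True]

# Work on a copy so the original GeoDataFrame is left untouched
fixed = gdf.copy()
fixed["geometry"] = fixed.geometry.make_valid()

print(fixed.is_valid.tolist())  # [True, True]
```

Because the repair is applied to a copy, `gdf` still holds the original invalid geometry for inspection.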