geomaster
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseGeoMaster
GeoMaster
Comprehensive geospatial science skill covering GIS, remote sensing, spatial analysis, and ML for Earth observation across 70+ topics with 500+ code examples in 8 programming languages.
这是一项全面的地理空间科学技能,涵盖GIS、遥感、空间分析以及地球观测机器学习,包含70余个主题和8种编程语言的500余个代码示例。
Installation
安装
bash
undefinedbash
undefinedCore Python stack (conda recommended)
Core Python stack (conda recommended)
conda install -c conda-forge gdal rasterio fiona shapely pyproj geopandas
conda install -c conda-forge gdal rasterio fiona shapely pyproj geopandas
Remote sensing & ML
Remote sensing & ML
uv pip install rsgislib torchgeo earthengine-api
uv pip install scikit-learn xgboost torch-geometric
uv pip install rsgislib torchgeo earthengine-api
uv pip install scikit-learn xgboost torch-geometric
Network & visualization
Network & visualization
uv pip install osmnx networkx folium keplergl
uv pip install cartopy contextily mapclassify
uv pip install osmnx networkx folium keplergl
uv pip install cartopy contextily mapclassify
Big data & cloud
Big data & cloud
uv pip install xarray rioxarray dask-geopandas
uv pip install pystac-client planetary-computer
uv pip install xarray rioxarray dask-geopandas
uv pip install pystac-client planetary-computer
Point clouds
Point clouds
uv pip install laspy pylas open3d pdal
uv pip install laspy pylas open3d pdal
Databases
Databases
conda install -c conda-forge postgis spatialite
undefinedconda install -c conda-forge postgis spatialite
undefinedQuick Start
快速开始
NDVI from Sentinel-2
基于Sentinel-2计算NDVI
python
import rasterio
import numpy as np
with rasterio.open('sentinel2.tif') as src:
red = src.read(4).astype(float) # B04
nir = src.read(8).astype(float) # B08
ndvi = (nir - red) / (nir + red + 1e-8)
ndvi = np.nan_to_num(ndvi, nan=0)
profile = src.profile
profile.update(count=1, dtype=rasterio.float32)
with rasterio.open('ndvi.tif', 'w', **profile) as dst:
dst.write(ndvi.astype(rasterio.float32), 1)python
import rasterio
import numpy as np
with rasterio.open('sentinel2.tif') as src:
red = src.read(4).astype(float) # B04
nir = src.read(8).astype(float) # B08
ndvi = (nir - red) / (nir + red + 1e-8)
ndvi = np.nan_to_num(ndvi, nan=0)
profile = src.profile
profile.update(count=1, dtype=rasterio.float32)
with rasterio.open('ndvi.tif', 'w', **profile) as dst:
dst.write(ndvi.astype(rasterio.float32), 1)Spatial Analysis with GeoPandas
使用GeoPandas进行空间分析
python
import geopandas as gpdpython
import geopandas as gpdLoad and ensure same CRS
Load and ensure same CRS
zones = gpd.read_file('zones.geojson')
points = gpd.read_file('points.geojson')
if zones.crs != points.crs:
points = points.to_crs(zones.crs)
zones = gpd.read_file('zones.geojson')
points = gpd.read_file('points.geojson')
if zones.crs != points.crs:
points = points.to_crs(zones.crs)
Spatial join and statistics
Spatial join and statistics
joined = gpd.sjoin(points, zones, how='inner', predicate='within')
stats = joined.groupby('zone_id').agg({
'value': ['count', 'mean', 'std', 'min', 'max']
}).round(2)
undefinedjoined = gpd.sjoin(points, zones, how='inner', predicate='within')
stats = joined.groupby('zone_id').agg({
'value': ['count', 'mean', 'std', 'min', 'max']
}).round(2)
undefinedGoogle Earth Engine Time Series
Google Earth Engine时间序列分析
python
import ee
import pandas as pd
ee.Initialize(project='your-project')
roi = ee.Geometry.Point([-122.4, 37.7]).buffer(10000)
s2 = (ee.ImageCollection('COPERNICUS/S2_SR_HARMONIZED')
.filterBounds(roi)
.filterDate('2020-01-01', '2023-12-31')
.filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 20)))
def add_ndvi(img):
return img.addBands(img.normalizedDifference(['B8', 'B4']).rename('NDVI'))
s2_ndvi = s2.map(add_ndvi)
def extract_series(image):
stats = image.reduceRegion(ee.Reducer.mean(), roi.centroid(), scale=10, maxPixels=1e9)
return ee.Feature(None, {'date': image.date().format('YYYY-MM-dd'), 'ndvi': stats.get('NDVI')})
series = s2_ndvi.map(extract_series).getInfo()
df = pd.DataFrame([f['properties'] for f in series['features']])
df['date'] = pd.to_datetime(df['date'])python
import ee
import pandas as pd
ee.Initialize(project='your-project')
roi = ee.Geometry.Point([-122.4, 37.7]).buffer(10000)
s2 = (ee.ImageCollection('COPERNICUS/S2_SR_HARMONIZED')
.filterBounds(roi)
.filterDate('2020-01-01', '2023-12-31')
.filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 20)))
def add_ndvi(img):
return img.addBands(img.normalizedDifference(['B8', 'B4']).rename('NDVI'))
s2_ndvi = s2.map(add_ndvi)
def extract_series(image):
stats = image.reduceRegion(ee.Reducer.mean(), roi.centroid(), scale=10, maxPixels=1e9)
return ee.Feature(None, {'date': image.date().format('YYYY-MM-dd'), 'ndvi': stats.get('NDVI')})
series = s2_ndvi.map(extract_series).getInfo()
df = pd.DataFrame([f['properties'] for f in series['features']])
df['date'] = pd.to_datetime(df['date'])Core Concepts
核心概念
Data Types
数据类型
| Type | Examples | Libraries |
|---|---|---|
| Vector | Shapefile, GeoJSON, GeoPackage | GeoPandas, Fiona, GDAL |
| Raster | GeoTIFF, NetCDF, COG | Rasterio, Xarray, GDAL |
| Point Cloud | LAS, LAZ | Laspy, PDAL, Open3D |
| 类型 | 示例 | 库 |
|---|---|---|
| 矢量 | Shapefile、GeoJSON、GeoPackage | GeoPandas、Fiona、GDAL |
| 栅格 | GeoTIFF、NetCDF、COG | Rasterio、Xarray、GDAL |
| 点云 | LAS、LAZ | Laspy、PDAL、Open3D |
Coordinate Systems
坐标系
- EPSG:4326 (WGS 84) - Geographic, lat/lon, use for storage
- EPSG:3857 (Web Mercator) - Web maps only (don't use for area/distance!)
- EPSG:326xx/327xx (UTM) - Metric calculations, <1% distortion per zone
- Use for automatic UTM detection
gdf.estimate_utm_crs()
python
undefined- EPSG:4326(WGS 84)- 地理坐标系,经纬度,用于数据存储
- EPSG:3857(Web墨卡托)- 仅用于Web地图(不要用于面积/距离计算!)
- EPSG:326xx/327xx(UTM)- 米制计算,每个区域的变形<1%
- 使用自动检测UTM坐标系
gdf.estimate_utm_crs()
python
undefinedAlways check CRS before operations
Always check CRS before operations
assert gdf1.crs == gdf2.crs, "CRS mismatch!"
assert gdf1.crs == gdf2.crs, "CRS mismatch!"
For area/distance calculations, use projected CRS
For area/distance calculations, use projected CRS
gdf_metric = gdf.to_crs(gdf.estimate_utm_crs())
area_sqm = gdf_metric.geometry.area
undefinedgdf_metric = gdf.to_crs(gdf.estimate_utm_crs())
area_sqm = gdf_metric.geometry.area
undefinedOGC Standards
OGC标准
- WMS: Web Map Service - raster maps
- WFS: Web Feature Service - vector data
- WCS: Web Coverage Service - raster coverage
- STAC: Spatiotemporal Asset Catalog - modern metadata
- WMS: Web地图服务 - 栅格地图
- WFS: Web要素服务 - 矢量数据
- WCS: Web覆盖服务 - 栅格覆盖数据
- STAC: 时空资产目录 - 现代元数据标准
Common Operations
常见操作
Spectral Indices
光谱指数
python
def calculate_indices(image_path):
"""NDVI, EVI, SAVI, NDWI from Sentinel-2."""
with rasterio.open(image_path) as src:
B02, B03, B04, B08, B11 = [src.read(i).astype(float) for i in [1,2,3,4,5]]
ndvi = (B08 - B04) / (B08 + B04 + 1e-8)
evi = 2.5 * (B08 - B04) / (B08 + 6*B04 - 7.5*B02 + 1)
savi = ((B08 - B04) / (B08 + B04 + 0.5)) * 1.5
ndwi = (B03 - B08) / (B03 + B08 + 1e-8)
return {'NDVI': ndvi, 'EVI': evi, 'SAVI': savi, 'NDWI': ndwi}python
def calculate_indices(image_path):
"""NDVI, EVI, SAVI, NDWI from Sentinel-2."""
with rasterio.open(image_path) as src:
B02, B03, B04, B08, B11 = [src.read(i).astype(float) for i in [1,2,3,4,5]]
ndvi = (B08 - B04) / (B08 + B04 + 1e-8)
evi = 2.5 * (B08 - B04) / (B08 + 6*B04 - 7.5*B02 + 1)
savi = ((B08 - B04) / (B08 + B04 + 0.5)) * 1.5
ndwi = (B03 - B08) / (B03 + B08 + 1e-8)
return {'NDVI': ndvi, 'EVI': evi, 'SAVI': savi, 'NDWI': ndwi}Vector Operations
矢量操作
python
undefinedpython
undefinedBuffer (use projected CRS!)
Buffer (use projected CRS!)
gdf_proj = gdf.to_crs(gdf.estimate_utm_crs())
gdf['buffer_1km'] = gdf_proj.geometry.buffer(1000)
gdf_proj = gdf.to_crs(gdf.estimate_utm_crs())
gdf['buffer_1km'] = gdf_proj.geometry.buffer(1000)
Spatial relationships
Spatial relationships
intersects = gdf[gdf.geometry.intersects(other_geometry)]
contains = gdf[gdf.geometry.contains(point_geometry)]
intersects = gdf[gdf.geometry.intersects(other_geometry)]
contains = gdf[gdf.geometry.contains(point_geometry)]
Geometric operations
Geometric operations
gdf['centroid'] = gdf.geometry.centroid
gdf['simplified'] = gdf.geometry.simplify(tolerance=0.001)
gdf['centroid'] = gdf.geometry.centroid
gdf['simplified'] = gdf.geometry.simplify(tolerance=0.001)
Overlay operations
Overlay operations
intersection = gpd.overlay(gdf1, gdf2, how='intersection')
union = gpd.overlay(gdf1, gdf2, how='union')
undefinedintersection = gpd.overlay(gdf1, gdf2, how='intersection')
union = gpd.overlay(gdf1, gdf2, how='union')
undefinedTerrain Analysis
地形分析
python
def terrain_metrics(dem_path):
"""Calculate slope, aspect, hillshade from DEM."""
with rasterio.open(dem_path) as src:
dem = src.read(1)
dy, dx = np.gradient(dem)
slope = np.arctan(np.sqrt(dx**2 + dy**2)) * 180 / np.pi
aspect = (90 - np.arctan2(-dy, dx) * 180 / np.pi) % 360
# Hillshade
az_rad, alt_rad = np.radians(315), np.radians(45)
hillshade = (np.sin(alt_rad) * np.sin(np.radians(slope)) +
np.cos(alt_rad) * np.cos(np.radians(slope)) *
np.cos(np.radians(aspect) - az_rad))
return slope, aspect, hillshadepython
def terrain_metrics(dem_path):
"""Calculate slope, aspect, hillshade from DEM."""
with rasterio.open(dem_path) as src:
dem = src.read(1)
dy, dx = np.gradient(dem)
slope = np.arctan(np.sqrt(dx**2 + dy**2)) * 180 / np.pi
aspect = (90 - np.arctan2(-dy, dx) * 180 / np.pi) % 360
# Hillshade
az_rad, alt_rad = np.radians(315), np.radians(45)
hillshade = (np.sin(alt_rad) * np.sin(np.radians(slope)) +
np.cos(alt_rad) * np.cos(np.radians(slope)) *
np.cos(np.radians(aspect) - az_rad))
return slope, aspect, hillshadeNetwork Analysis
网络分析
python
import osmnx as ox
import networkx as nxpython
import osmnx as ox
import networkx as nxDownload and analyze street network
Download and analyze street network
G = ox.graph_from_place('San Francisco, CA', network_type='drive')
G = ox.add_edge_speeds(G).add_edge_travel_times(G)
G = ox.graph_from_place('San Francisco, CA', network_type='drive')
G = ox.add_edge_speeds(G).add_edge_travel_times(G)
Shortest path
Shortest path
orig = ox.distance.nearest_nodes(G, -122.4, 37.7)
dest = ox.distance.nearest_nodes(G, -122.3, 37.8)
route = nx.shortest_path(G, orig, dest, weight='travel_time')
undefinedorig = ox.distance.nearest_nodes(G, -122.4, 37.7)
dest = ox.distance.nearest_nodes(G, -122.3, 37.8)
route = nx.shortest_path(G, orig, dest, weight='travel_time')
undefinedImage Classification
影像分类
python
from sklearn.ensemble import RandomForestClassifier
import rasterio
from rasterio.features import rasterize
def classify_imagery(raster_path, training_gdf, output_path):
"""Train RF and classify imagery."""
with rasterio.open(raster_path) as src:
image = src.read()
profile = src.profile
transform = src.transform
# Extract training data
X_train, y_train = [], []
for _, row in training_gdf.iterrows():
mask = rasterize([(row.geometry, 1)],
out_shape=(profile['height'], profile['width']),
transform=transform, fill=0, dtype=np.uint8)
pixels = image[:, mask > 0].T
X_train.extend(pixels)
y_train.extend([row['class_id']] * len(pixels))
# Train and predict
rf = RandomForestClassifier(n_estimators=100, max_depth=20, n_jobs=-1)
rf.fit(X_train, y_train)
prediction = rf.predict(image.reshape(image.shape[0], -1).T)
prediction = prediction.reshape(profile['height'], profile['width'])
profile.update(dtype=rasterio.uint8, count=1)
with rasterio.open(output_path, 'w', **profile) as dst:
dst.write(prediction.astype(rasterio.uint8), 1)
return rfpython
from sklearn.ensemble import RandomForestClassifier
import rasterio
from rasterio.features import rasterize
def classify_imagery(raster_path, training_gdf, output_path):
"""Train RF and classify imagery."""
with rasterio.open(raster_path) as src:
image = src.read()
profile = src.profile
transform = src.transform
# Extract training data
X_train, y_train = [], []
for _, row in training_gdf.iterrows():
mask = rasterize([(row.geometry, 1)],
out_shape=(profile['height'], profile['width']),
transform=transform, fill=0, dtype=np.uint8)
pixels = image[:, mask > 0].T
X_train.extend(pixels)
y_train.extend([row['class_id']] * len(pixels))
# Train and predict
rf = RandomForestClassifier(n_estimators=100, max_depth=20, n_jobs=-1)
rf.fit(X_train, y_train)
prediction = rf.predict(image.reshape(image.shape[0], -1).T)
prediction = prediction.reshape(profile['height'], profile['width'])
profile.update(dtype=rasterio.uint8, count=1)
with rasterio.open(output_path, 'w', **profile) as dst:
dst.write(prediction.astype(rasterio.uint8), 1)
return rfModern Cloud-Native Workflows
现代云原生工作流
STAC + Planetary Computer
STAC + Planetary Computer
python
import pystac_client
import planetary_computer
import odc.stacpython
import pystac_client
import planetary_computer
import odc.stacSearch Sentinel-2 via STAC
Search Sentinel-2 via STAC
catalog = pystac_client.Client.open(
"https://planetarycomputer.microsoft.com/api/stac/v1",
modifier=planetary_computer.sign_inplace,
)
search = catalog.search(
collections=["sentinel-2-l2a"],
bbox=[-122.5, 37.7, -122.3, 37.9],
datetime="2023-01-01/2023-12-31",
query={"eo:cloud_cover": {"lt": 20}},
)
catalog = pystac_client.Client.open(
"https://planetarycomputer.microsoft.com/api/stac/v1",
modifier=planetary_computer.sign_inplace,
)
search = catalog.search(
collections=["sentinel-2-l2a"],
bbox=[-122.5, 37.7, -122.3, 37.9],
datetime="2023-01-01/2023-12-31",
query={"eo:cloud_cover": {"lt": 20}},
)
Load as xarray (cloud-native!)
Load as xarray (cloud-native!)
data = odc.stac.load(
list(search.get_items())[:5],
bands=["B02", "B03", "B04", "B08"],
crs="EPSG:32610",
resolution=10,
)
data = odc.stac.load(
list(search.get_items())[:5],
bands=["B02", "B03", "B04", "B08"],
crs="EPSG:32610",
resolution=10,
)
Calculate NDVI on xarray
Calculate NDVI on xarray
ndvi = (data.B08 - data.B04) / (data.B08 + data.B04)
undefinedndvi = (data.B08 - data.B04) / (data.B08 + data.B04)
undefinedCloud-Optimized GeoTIFF (COG)
云优化GeoTIFF(COG)
python
import rasterio
from rasterio.session import AWSSessionpython
import rasterio
from rasterio.session import AWSSessionRead COG directly from cloud (partial reads)
Read COG directly from cloud (partial reads)
session = AWSSession(aws_access_key_id=..., aws_secret_access_key=...)
with rasterio.open('s3://bucket/path.tif', session=session) as src:
# Read only window of interest
window = ((1000, 2000), (1000, 2000))
subset = src.read(1, window=window)
session = AWSSession(aws_access_key_id=..., aws_secret_access_key=...)
with rasterio.open('s3://bucket/path.tif', session=session) as src:
# Read only window of interest
window = ((1000, 2000), (1000, 2000))
subset = src.read(1, window=window)
Write COG
Write COG
with rasterio.open('output.tif', 'w', **profile,
tiled=True, blockxsize=256, blockysize=256,
compress='DEFLATE', predictor=2) as dst:
dst.write(data)
with rasterio.open('output.tif', 'w', **profile,
tiled=True, blockxsize=256, blockysize=256,
compress='DEFLATE', predictor=2) as dst:
dst.write(data)
Validate COG
Validate COG
from rio_cogeo.cogeo import cog_validate
cog_validate('output.tif')
undefinedfrom rio_cogeo.cogeo import cog_validate
cog_validate('output.tif')
undefinedPerformance Tips
性能优化技巧
python
undefinedpython
undefined1. Spatial indexing (10-100x faster queries)
1. Spatial indexing (10-100x faster queries)
gdf.sindex # Auto-created by GeoPandas
gdf.sindex # Auto-created by GeoPandas
2. Chunk large rasters
2. Chunk large rasters
with rasterio.open('large.tif') as src:
for i, window in src.block_windows(1):
block = src.read(1, window=window)
with rasterio.open('large.tif') as src:
for i, window in src.block_windows(1):
block = src.read(1, window=window)
3. Dask for big data
3. Dask for big data
import dask.array as da
dask_array = da.from_rasterio('large.tif', chunks=(1, 1024, 1024))
import dask.array as da
dask_array = da.from_rasterio('large.tif', chunks=(1, 1024, 1024))
4. Use Arrow for I/O
4. Use Arrow for I/O
gdf.to_file('output.gpkg', use_arrow=True)
gdf.to_file('output.gpkg', use_arrow=True)
5. GDAL caching
5. GDAL caching
from osgeo import gdal
gdal.SetCacheMax(2**30) # 1GB cache
from osgeo import gdal
gdal.SetCacheMax(2**30) # 1GB cache
6. Parallel processing
6. Parallel processing
rf = RandomForestClassifier(n_jobs=-1) # All cores
undefinedrf = RandomForestClassifier(n_jobs=-1) # All cores
undefinedBest Practices
最佳实践
- Always check CRS before spatial operations
- Use projected CRS for area/distance calculations
- Validate geometries:
gdf = gdf[gdf.is_valid] - Handle missing data:
gdf['geometry'] = gdf['geometry'].fillna(None) - Use efficient formats: GeoPackage > Shapefile, Parquet for large data
- Apply cloud masking to optical imagery
- Preserve lineage for reproducible research
- Use appropriate resolution for your analysis scale
- 空间操作前务必检查CRS
- 使用投影坐标系进行面积/距离计算
- 验证几何有效性:
gdf = gdf[gdf.is_valid] - 处理缺失数据:
gdf['geometry'] = gdf['geometry'].fillna(None) - 使用高效格式:GeoPackage优于Shapefile,大数据使用Parquet
- 对光学影像应用云掩膜
- 保留数据溯源信息以实现可复现研究
- 根据分析尺度选择合适的分辨率
Detailed Documentation
详细文档
- Coordinate Systems - CRS fundamentals, UTM, transformations
- Core Libraries - GDAL, Rasterio, GeoPandas, Shapely
- Remote Sensing - Satellite missions, spectral indices, SAR
- Machine Learning - Deep learning, CNNs, GNNs for RS
- GIS Software - QGIS, ArcGIS, GRASS integration
- Scientific Domains - Marine, hydrology, agriculture, forestry
- Advanced GIS - 3D GIS, spatiotemporal, topology
- Big Data - Distributed processing, GPU acceleration
- Industry Applications - Urban planning, disaster management
- Programming Languages - Python, R, Julia, JS, C++, Java, Go, Rust
- Data Sources - Satellite catalogs, APIs
- Troubleshooting - Common issues, debugging, error reference
- Code Examples - 500+ examples
GeoMaster covers everything from basic GIS operations to advanced remote sensing and machine learning.
- 坐标系 - CRS基础、UTM、坐标转换
- 核心库 - GDAL、Rasterio、GeoPandas、Shapely
- 遥感 - 卫星任务、光谱指数、SAR
- 机器学习 - 深度学习、CNN、GNN在遥感中的应用
- GIS软件 - QGIS、ArcGIS、GRASS集成
- 科学领域 - 海洋、水文、农业、林业
- 高级GIS - 3D GIS、时空分析、拓扑
- 大数据 - 分布式处理、GPU加速
- 行业应用 - 城市规划、灾害管理
- 编程语言 - Python、R、Julia、JS、C++、Java、Go、Rust
- 数据源 - 卫星目录、API
- 故障排除 - 常见问题、调试、错误参考
- 代码示例 - 500+示例
GeoMaster涵盖了从基础GIS操作到高级遥感和机器学习的所有内容。