Catalog
The catalog module builds a local mapping of parent morton cells to granule S3 URLs by querying NASA's CMR (Common Metadata Repository). This avoids per-worker CMR queries during parallel processing.
Building a Catalog
The catalog CLI accepts date ranges, product names, and spatial polygons:
# ICESat-2 convenience (cycle → date range):
python -m zagg.catalog --cycle 22 --parent-order 6
# Explicit date range:
python -m zagg.catalog --start-date 2024-01-06 --end-date 2024-04-07 --parent-order 6
# Custom region via GeoJSON polygon:
python -m zagg.catalog --start-date 2024-01-01 --end-date 2024-06-01 \
    --polygon my_region.geojson --parent-order 6
# Different product:
python -m zagg.catalog --start-date 2024-01-01 --end-date 2024-06-01 \
    --short-name ATL08 --polygon my_region.geojson --parent-order 6
When --polygon is provided, it is used for two things:
- Cell discovery: morton_coverage runs on the polygon to find parent cells
- CMR bounding box: automatically computed from the polygon's extent
When no polygon is given, Antarctic drainage basins are used as the default (suitable for ATL06 ice sheet work).
Temporal Helpers
zagg.catalog.cycle_to_dates
Convert an ICESat-2 repeat cycle number to a date range.
Parameters:
- cycle (int) – ICESat-2 cycle number (1-based)
Returns:
- tuple of (start_date, end_date) – Start and end datetimes for the cycle
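The arithmetic behind a cycle-to-date conversion can be sketched as follows. This is a minimal illustration, assuming each ICESat-2 repeat cycle lasts 91 days and that the caller supplies the start of cycle 1. The function name `cycle_date_range` and the `cycle1_start` parameter are hypothetical and not part of zagg; `cycle_to_dates` may use a different epoch or boundary convention.

```python
from datetime import datetime, timedelta

CYCLE_DAYS = 91  # nominal ICESat-2 repeat-cycle length in days


def cycle_date_range(cycle, cycle1_start, cycle_days=CYCLE_DAYS):
    """Return (start, end) datetimes for a 1-based cycle number.

    Hypothetical helper: cycle1_start is an assumption supplied by the
    caller, not a value taken from zagg.
    """
    if cycle < 1:
        raise ValueError("cycle numbers are 1-based")
    # Cycle 1 starts at the epoch; each later cycle is shifted by whole cycles.
    start = cycle1_start + timedelta(days=(cycle - 1) * cycle_days)
    end = start + timedelta(days=cycle_days)
    return start, end


# Placeholder epoch, chosen only to make the arithmetic easy to follow.
start, end = cycle_date_range(2, datetime(2000, 1, 1))
print(start.date(), end.date())
```

Passing the epoch in explicitly keeps the sketch free of any assumed mission start date.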
Spatial Helpers
zagg.catalog.load_polygon
Load polygon(s) from a GeoJSON file.
Supports Feature and FeatureCollection objects as well as bare Polygon and MultiPolygon geometries.
Parameters:
- geojson_path (str) – Path to a GeoJSON file
Returns:
- list of (lats, lons) – One (lats, lons) array pair per polygon ring, suitable for morton_coverage multipart input.
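To make the return shape concrete, here is a rough sketch of how a GeoJSON object could be flattened into (lats, lons) pairs. The function `geojson_to_parts` is a hypothetical stand-in, not zagg's implementation: it works on an already-parsed dict, returns plain lists rather than arrays, and emits every ring it finds (including interior rings), which may differ from what `load_polygon` does.

```python
import json


def geojson_to_parts(obj):
    """Flatten a GeoJSON object into a list of (lats, lons) pairs, one per ring."""
    kind = obj.get("type")
    if kind == "FeatureCollection":
        parts = []
        for feature in obj["features"]:
            parts.extend(geojson_to_parts(feature))
        return parts
    if kind == "Feature":
        return geojson_to_parts(obj["geometry"])
    if kind == "Polygon":
        polygons = [obj["coordinates"]]
    elif kind == "MultiPolygon":
        polygons = obj["coordinates"]
    else:
        raise ValueError(f"unsupported GeoJSON type: {kind!r}")

    parts = []
    for rings in polygons:
        for ring in rings:
            # GeoJSON positions are [lon, lat, ...]; split into (lats, lons).
            lons = [pos[0] for pos in ring]
            lats = [pos[1] for pos in ring]
            parts.append((lats, lons))
    return parts


text = (
    '{"type": "Feature", "properties": {}, "geometry": '
    '{"type": "Polygon", "coordinates": [[[0, 10], [5, 10], [5, 20], [0, 10]]]}}'
)
parts = geojson_to_parts(json.loads(text))
print(parts)
```

The point to notice is the coordinate order: GeoJSON stores longitude first, while the parts described above put latitudes first.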
zagg.catalog.polygon_to_bbox
Compute a bounding box from polygon parts.
Parameters:
- parts (list of (lats, lons)) – Polygon parts as returned by load_polygon
Returns:
- tuple of (lon_min, lat_min, lon_max, lat_max) – Bounding box in CMR format
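The bounding-box computation amounts to taking the extremes over all parts. The sketch below shows that idea with a hypothetical `parts_to_bbox` function on plain Python lists; it is not zagg's code and makes no attempt to handle polygons that cross the antimeridian.

```python
def parts_to_bbox(parts):
    """Return (lon_min, lat_min, lon_max, lat_max) covering all polygon parts."""
    if not parts:
        raise ValueError("at least one polygon part is required")
    # Pool the coordinates of every part before taking the extremes.
    all_lats = [lat for lats, _ in parts for lat in lats]
    all_lons = [lon for _, lons in parts for lon in lons]
    return (min(all_lons), min(all_lats), max(all_lons), max(all_lats))


# Two illustrative parts, each given as (lats, lons).
parts = [
    ([-75.0, -70.0, -72.0], [100.0, 110.0, 105.0]),
    ([-80.0, -78.0, -79.0], [95.0, 98.0, 96.0]),
]
bbox = parts_to_bbox(parts)
print(",".join(str(v) for v in bbox))
```

The output order (west, south, east, north) matches the order CMR expects for a bounding-box filter, which is why longitudes come first even though the parts store latitudes first.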
Cell Discovery
zagg.catalog.load_antarctic_basins
load_antarctic_basins(filepath=None)
Load Antarctic drainage basin polygons.
Parameters:
- filepath (str, default: None) – Path to basin polygon file. Defaults to the file shipped with mortie.
Returns:
- list of (lats, lons) – One (lats, lons) pair per basin, suitable for morton_coverage multipart input.
zagg.catalog.discover_cells
discover_cells(parent_order, polygon_parts=None)
Discover morton cells at parent_order covering a polygon.
Parameters:
- parent_order (int) – Morton order for parent cells (e.g., 6)
- polygon_parts (list of (lats, lons), default: None) – Polygon parts for coverage. Defaults to Antarctic drainage basins.
Returns:
- ndarray – Sorted array of unique morton indices at parent_order
CMR Query
zagg.catalog.query_cmr
query_cmr(
    start_date: str,
    end_date: str,
    short_name: str = "ATL06",
    version: str = "007",
    provider: str = "NSIDC_CPRD",
    bbox: tuple = None,
    page_size: int = 2000,
) -> List[dict]
Query CMR for granules matching temporal and spatial filters.
Parameters:
- start_date (str) – Start date (YYYY-MM-DD)
- end_date (str) – End date (YYYY-MM-DD)
- short_name (str, default: 'ATL06') – CMR short name (e.g., ATL06, ATL08)
- version (str, default: '007') – Product version
- provider (str, default: 'NSIDC_CPRD') – CMR provider
- bbox (tuple of (lon_min, lat_min, lon_max, lat_max), default: None) – Bounding box filter
- page_size (int, default: 2000) – Results per page
Returns:
- list – List of granule metadata dicts
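As an illustration of how these arguments might map onto a CMR granule search, the sketch below builds a query-parameter dict without making any network request. The function `cmr_granule_params` is hypothetical, and treating the end date as inclusive through 23:59:59Z is an assumption; `query_cmr` may format the temporal range differently. Paging through results (for example with CMR's search-after mechanism) is left out.

```python
# Public CMR granule search endpoint, shown for context only; no request is sent.
CMR_GRANULES_URL = "https://cmr.earthdata.nasa.gov/search/granules.json"


def cmr_granule_params(start_date, end_date, short_name="ATL06", version="007",
                       provider="NSIDC_CPRD", bbox=None, page_size=2000):
    """Build CMR granule-search query parameters (hypothetical helper)."""
    params = {
        "short_name": short_name,
        "version": version,
        "provider": provider,
        # CMR takes a "start,end" temporal range; the inclusive end-of-day
        # timestamp is an assumption of this sketch.
        "temporal": f"{start_date}T00:00:00Z,{end_date}T23:59:59Z",
        "page_size": page_size,
    }
    if bbox is not None:
        # CMR expects west,south,east,north as a comma-separated string.
        params["bounding_box"] = ",".join(str(v) for v in bbox)
    return params


params = cmr_granule_params("2024-01-01", "2024-06-01",
                            bbox=(95.0, -80.0, 110.0, -70.0))
print(params)
```

A dict like this could be passed as the query string of a GET request to the endpoint above; the spatial filter is simply omitted when no bounding box is given.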
Catalog Builder
zagg.catalog.build_catalog
Build a granule catalog using morton_coverage for cell discovery and shapely STRtree for granule-to-cell intersection.
Parameters:
- granules (list) – List of granule metadata from CMR
- parent_order (int) – Morton order for parent cells (e.g., 6)
- polygon_parts (list of (lats, lons), default: None) – Polygon parts for cell discovery. Defaults to Antarctic drainage basins.
Returns: