Config

The config module loads YAML pipeline configurations that define what data to read, how to aggregate it, and where to write output. See configs/atl06.yaml for the default configuration.

Loading

zagg.config.load_config

load_config(path: str) -> PipelineConfig

Load a YAML config file and return a validated PipelineConfig.

Parameters:

  • path (str) –

    Path to YAML file.

Returns:

  • PipelineConfig

    The validated configuration.

zagg.config.load_config_from_dict

load_config_from_dict(d: dict) -> PipelineConfig

Build a PipelineConfig from a plain dict (e.g. Lambda JSON payload).

Parameters:

  • d (dict) –

    Dictionary with keys data_source, aggregation, output.

Returns:

  • PipelineConfig

    The configuration built from the dict.
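As a rough illustration of building a config object from a plain dict, the sketch below uses a hypothetical SketchConfig dataclass and a hypothetical sketch_from_dict helper. Neither is part of zagg. The field names mirror the PipelineConfig parameters listed on this page, but rejecting a dict that lacks one of the three keys is an assumption, not documented behavior.

```python
from dataclasses import dataclass, field
from typing import Optional

# Keys named in the load_config_from_dict description.
REQUIRED_KEYS = ("data_source", "aggregation", "output")


@dataclass
class SketchConfig:
    """Hypothetical stand-in for PipelineConfig (not the zagg class)."""

    data_source: dict = field(default_factory=dict)
    aggregation: dict = field(default_factory=dict)
    output: dict = field(default_factory=dict)
    catalog: Optional[str] = None
    bounds: Optional[dict] = None


def sketch_from_dict(d: dict) -> SketchConfig:
    """Build a SketchConfig from a plain dict such as a JSON payload."""
    # Assumption: a missing top-level key is treated as an error.
    missing = [key for key in REQUIRED_KEYS if key not in d]
    if missing:
        raise ValueError(f"missing config keys: {missing}")
    return SketchConfig(
        data_source=dict(d["data_source"]),  # shallow copy of each section
        aggregation=dict(d["aggregation"]),
        output=dict(d["output"]),
        catalog=d.get("catalog"),
        bounds=d.get("bounds"),
    )
```

Copying each section keeps later changes to the payload's top-level sections from leaking into the config object. Nested values are still shared.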

zagg.config.default_config

default_config(name: str = 'atl06') -> PipelineConfig

Load a built-in YAML config shipped with the package.

Parameters:

  • name (str, default: 'atl06' ) –

    Config name (without .yaml extension). Default "atl06".

Returns:

  • PipelineConfig

    The built-in configuration.

Raises:

Validation

zagg.config.validate_config

validate_config(config: PipelineConfig) -> None

Cross-validate a PipelineConfig.

Parameters:

  • config (PipelineConfig) –

    Configuration to validate.

Raises:

Function Resolution

zagg.config.resolve_function

resolve_function(name: str) -> Callable

Resolve a function name to a callable.

Resolution rules:

  • "len" or "count" -> builtin len

  • No dot (e.g. "min") -> np.<name>

  • Dotted path (e.g. "np.quantile") -> importlib resolution

Parameters:

  • name (str) –

    Function name or dotted path.

Returns:

  • Callable

    The resolved function.

Raises:

  • ValueError

    If the name cannot be resolved to a callable.
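The three resolution rules can be sketched as follows. This is a minimal illustration, not the zagg implementation. The name resolve_function_sketch and the MODULE_ALIASES table are hypothetical, and mapping the "np" prefix to the numpy module is an assumption made because "np" alone is not an importable module name.

```python
import importlib
from typing import Callable

import numpy as np

# Assumption: a leading "np" is shorthand for the numpy package.
MODULE_ALIASES = {"np": "numpy"}


def resolve_function_sketch(name: str) -> Callable:
    """Resolve a function name to a callable, following the rules above."""
    # Rule 1: "len" and "count" both map to the builtin len.
    if name in ("len", "count"):
        return len

    if "." not in name:
        # Rule 2: a bare name is looked up on numpy.
        func = getattr(np, name, None)
    else:
        # Rule 3: a dotted path is split into module and attribute.
        module_name, _, attr = name.rpartition(".")
        parts = module_name.split(".")
        parts[0] = MODULE_ALIASES.get(parts[0], parts[0])
        try:
            module = importlib.import_module(".".join(parts))
        except ImportError as exc:
            raise ValueError(f"cannot import module for {name!r}") from exc
        func = getattr(module, attr, None)

    if not callable(func):
        raise ValueError(f"cannot resolve {name!r} to a callable")
    return func
```

Checking callable() at the end means a name that resolves to a constant such as np.pi is rejected with ValueError, just like an unknown name.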

zagg.config.evaluate_expression

evaluate_expression(expression: str, columns: dict[str, ndarray]) -> float

Evaluate an expression string in a restricted namespace.

Parameters:

  • expression (str) –

    Python expression using numpy and column variables.

  • columns (dict[str, ndarray]) –

    Mapping of column names to arrays.

Returns:

  • float

    Result of evaluating the expression.
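One common way to evaluate an expression in a restricted namespace is to call eval with an emptied set of builtins and a namespace containing only numpy and the column arrays. The sketch below assumes that approach. How zagg restricts the namespace is not stated here, and evaluate_expression_sketch is a hypothetical name.

```python
import numpy as np


def evaluate_expression_sketch(expression: str, columns: dict) -> float:
    """Evaluate an expression with access to np and the column arrays only."""
    # Column names become variables; "np" is added last so a column
    # cannot shadow it.
    namespace = {**columns, "np": np}
    # Emptying __builtins__ hides names such as open and __import__.
    # This limits accidents but is not a security sandbox.
    result = eval(expression, {"__builtins__": {}}, namespace)
    return float(result)


# Example: a weighted mean over two hypothetical columns.
example_columns = {
    "h": np.array([1.0, 2.0, 3.0]),
    "w": np.array([1.0, 1.0, 2.0]),
}
weighted_mean = evaluate_expression_sketch(
    "np.sum(h * w) / np.sum(w)", example_columns
)
```

Because builtins are removed, an expression that refers to a name outside the namespace fails with NameError instead of silently reaching into the interpreter.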

Accessors

zagg.config.get_agg_fields

get_agg_fields(config: PipelineConfig) -> dict

Return aggregation variable metadata keyed by variable name.

Parameters:

  • config (PipelineConfig) –

    Pipeline configuration to read from.

Returns:

  • dict

    {name: {function/expression, source, params, dtype, fill_value, ...}}

zagg.config.get_coords

get_coords(config: PipelineConfig) -> list[str]

Return coordinate column names from the aggregation config.

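Parameters:

  • config (PipelineConfig) –

    Pipeline configuration to read from.

Returns:

  • list[str]

    Coordinate column names.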

zagg.config.get_data_vars

get_data_vars(config: PipelineConfig) -> list[str]

Return data variable column names from the aggregation config.

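Parameters:

  • config (PipelineConfig) –

    Pipeline configuration to read from.

Returns:

  • list[str]

    Data variable column names.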

zagg.config.get_child_order

get_child_order(config: PipelineConfig) -> int

Return child_order from the output grid config.

Parameters:

  • config (PipelineConfig) –

    Pipeline configuration to read from.

Returns:

  • int

    The child_order value.

Raises:

  • ValueError

    If child_order is not set in the config.

zagg.config.get_store_path

get_store_path(config: PipelineConfig) -> str | None

Return the store path from the output config, or None.

Parameters:

  • config (PipelineConfig) –

    Pipeline configuration to read from.

Returns:

  • str or None
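get_child_order and get_store_path treat a missing value differently: the first raises ValueError, the second returns None. The sketch below illustrates that contrast on a plain output dict. The key names "grid", "child_order" and "store_path", and both function names, are hypothetical; the real layout of the output section is not documented on this page.

```python
from typing import Optional


def child_order_sketch(output: dict) -> int:
    """Return child_order, raising if it is absent (a required setting)."""
    # Assumption: child_order sits under a "grid" sub-dict of output.
    grid = output.get("grid", {})
    if "child_order" not in grid:
        raise ValueError("child_order is not set in the output grid config")
    return int(grid["child_order"])


def store_path_sketch(output: dict) -> Optional[str]:
    """Return the store path, or None if it is absent (an optional setting)."""
    # Assumption: the path is stored under a "store_path" key.
    return output.get("store_path")
```

Raising for a required setting surfaces a misconfigured pipeline early, while returning None for an optional one leaves the fallback decision to the caller.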

Types

zagg.config.PipelineConfig dataclass

Full pipeline configuration.

Parameters:

  • data_source (DataSourceDict, default: dict() ) –

    Reader, groups, coordinates, variables, quality filter.

  • aggregation (dict, default: dict() ) –

    Coordinate and variable aggregation definitions.

  • output (dict, default: dict() ) –

    Grid spec, store path, and indexing details.

  • catalog (str or None, default: None ) –

    Optional path to granule catalog JSON.

  • bounds (dict or None, default: None ) –

    Optional temporal/spatial bounds for filtering.