Sharded output (zarr ShardingCodec)¶
zagg can bundle a dispatch shard's inner read-chunks into one zarr
ShardingCodec shard object instead of writing them as many independent
regular chunk objects. This decouples the write/dispatch granularity (the
unit a Lambda worker processes) from the read granularity (the small chunk a
reader partial-decodes), without drowning the reader in millions of tiny objects
at global/dense scale.
It is opt-in (default off) and builds directly on chunk_inner — it changes
only the storage form of the inner chunks, not the grid geometry.
How it relates to chunk_inner¶
chunk_inner sets the geometry: the shard is the dispatch unit (HEALPix
parent_order; rectilinear chunk_shape), the inner chunk is the smaller
read chunk (HEALPix chunk_inner order, e.g. 13 = a 64×64 = 4096-cell chunk;
rectilinear chunk_inner = [inner_h, inner_w]), and one shard owns
K = (shard cells) / (inner-chunk cells) inner chunks.
sharded only picks how those K inner chunks are stored:
sharded: false (default) |
sharded: true |
|
|---|---|---|
| storage | K regular chunk objects per shard |
1 shard object per shard |
| object count | high (empties absent) | low (~K× fewer) |
| sub-shard sparsity | absent objects | shard-index entries inside the object |
| reader 64×64 access | one object per chunk | byte-range partial decode within the shard |
With zarr-shard == dispatch-shard, each worker writes exactly one whole shard
object, alone — the canonical single-writer pattern: no read-modify-write, no
cross-worker contention. Empty inner chunks (e.g. a track corridor crossing only
part of a shard) are omitted from the shard index, so sub-shard sparsity is
preserved inside the object.
Config¶
The knob lives on the grid/chunk block, next to chunk_inner:
# HEALPix
output:
grid:
type: healpix
parent_order: 8 # dispatch shard
child_order: 19 # leaf cell order
chunk_inner: 13 # inner read chunk (64×64 = 4096 cells) -> K = 4^(13-8) = 1024
sharded: true
# Rectilinear
output:
grid:
type: rectilinear
crs: EPSG:3031
resolution: 5000
bounds: [-3200000, -3200000, 3200000, 3200000]
chunk_shape: [256, 256] # dispatch shard tile
chunk_inner: [64, 64] # inner read chunk -> K = (256/64)^2 = 16
sharded: true
sharded is only valid when chunk_inner makes K > 1 (a shard with more than
one inner chunk). Sharding a K == 1 shard is a no-op shard of one chunk, so it
is validated and rejected at grid construction (before any Lambda
deployment), with a clear message — matching the grid-mismatch errors zagg
already raises.
Storage layout¶
- The dense per-cell arrays (
<group>/<varname>) carry asharding_indexedcodec: the outer chunk is the whole shard, the inner chunk is the read chunk. Inner codecs stay bytes-only/uncompressed (zagg's policy), so the on-disk bytes for a populated inner chunk are identical to the regular path. resolution: chunkcompanion arrays and (HEALPix) the 1-D coordinate / (rect) the 1-Dx/yarrays are not sharded — they keep their regular layout.- The worker writes the whole shard in one
set_block_selectionper dense array (block selection is shard-granular on a sharded array), so a single shard object is produced per dispatch shard.
Reader note¶
This is currently a writer-side feature plus offline round-trip read-back.
Consuming sharded stores from the higher-level read helpers (shard-index /
byte-range reads instead of list_prefix enumeration) is tracked as a
follow-up; the underlying zarr partial-decode of a 64×64 chunk within a shard
already works through the standard zarr array API.