Benchmarks

Explore shared benchmark families, aligned external ecosystems, supported tasks, and model compatibility across PyHazards.

At a Glance

Benchmark Families: 4

Shared evaluator families available through the benchmark runner.

Ecosystem Mappings: 12

External benchmark or data ecosystems linked from the public docs.

Supported Task Families: 7

Hazard tasks covered across the family-level benchmark contracts.

Smoke Configurations: 27

Unique smoke configs referenced by the benchmark family cards.

Benchmark Families

The four cards below summarize the benchmark families exposed through the shared runner. Each card lists the family's core tasks, key metrics, support level, and coverage counts.

Wildfire Benchmark

Shared PyHazards evaluator family for wildfire danger and wildfire spread experiments.

Wildfire | Danger | Spread | Synthetic-backed

Tasks: Danger, Spread

Key Metrics: Accuracy, Macro F1, AUC, PR-AUC, +5 more

Coverage: 8 smoke configs | 8 models | 1 ecosystem

Earthquake Benchmark

Shared PyHazards evaluator family for earthquake phase-picking and wavefield-forecasting runs.

Earthquake | Phase Picking | Wavefield Forecasting | Synthetic-backed

Tasks: Phase Picking, Wavefield Forecasting

Key Metrics: P-pick MAE, S-pick MAE, Precision, Recall, +3 more

Coverage: 5 smoke configs | 5 models | 4 ecosystems
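
The phase-picking metrics listed on this card (P-pick MAE, S-pick MAE, Precision, Recall) are commonly computed by matching predicted picks to reference picks inside a time tolerance. The sketch below assumes that convention and a 0.5 s tolerance. The function names and the greedy matching rule are illustrative and are not taken from the PyHazards evaluator.

```python
def match_picks(true_picks, pred_picks, tolerance_s=0.5):
    """Greedily pair each true pick with the closest unused predicted pick.

    Returns the absolute timing residuals (seconds) of the matched pairs.
    """
    unused = list(pred_picks)
    residuals = []
    for t in true_picks:
        if not unused:
            break
        nearest = min(unused, key=lambda p: abs(p - t))
        if abs(nearest - t) <= tolerance_s:
            residuals.append(abs(nearest - t))
            unused.remove(nearest)
    return residuals


def pick_scores(true_picks, pred_picks, tolerance_s=0.5):
    """Return pick MAE (seconds), precision, and recall for one phase type."""
    residuals = match_picks(true_picks, pred_picks, tolerance_s)
    tp = len(residuals)
    precision = tp / len(pred_picks) if pred_picks else 0.0
    recall = tp / len(true_picks) if true_picks else 0.0
    mae = sum(residuals) / tp if tp else float("nan")
    return {"mae_s": mae, "precision": precision, "recall": recall}


# Pick times in seconds; the third prediction is far from any true pick.
scores = pick_scores([10.0, 20.0, 30.0], [10.25, 20.5, 45.0])
print(scores)
```

Running the same function once on P picks and once on S picks yields the two MAE values separately.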

Flood Benchmark

Shared PyHazards evaluator family for streamflow forecasting and inundation prediction.

Flood | Streamflow | Inundation | Synthetic-backed

Tasks: Streamflow, Inundation

Key Metrics: MAE, RMSE, NSE, KGE, +3 more

Coverage: 6 smoke configs | 6 models | 4 ecosystems
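
NSE and KGE in the metric list above are standard hydrology skill scores. As a minimal sketch, assuming the usual textbook definitions (Nash-Sutcliffe efficiency and the 2009 form of the Kling-Gupta efficiency), they can be computed as shown below. The function names are illustrative, and this page does not show how the PyHazards flood evaluator implements them.

```python
import math
from statistics import mean


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches predicting the observed mean."""
    obs_mean = mean(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - obs_mean) ** 2 for o in obs)
    return 1.0 - residual / variance


def kge(obs, sim):
    """Kling-Gupta efficiency from correlation, variability ratio, and bias ratio."""
    n = len(obs)
    mo, ms = mean(obs), mean(sim)
    so = math.sqrt(sum((o - mo) ** 2 for o in obs) / n)
    ss = math.sqrt(sum((s - ms) ** 2 for s in sim) / n)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / n
    r = cov / (so * ss)   # linear correlation
    alpha = ss / so       # variability ratio
    beta = ms / mo        # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.5, 2.0, 2.5, 4.5]
print(nse(observed, simulated), kge(observed, simulated))
```

Both scores are undefined when the observed series is constant, and KGE is also undefined for a constant simulation or a zero observed mean, so a production evaluator would need to guard those cases.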

Tropical Cyclone Benchmark

Shared PyHazards evaluator family for tropical cyclone and hurricane track-intensity forecasting.

Tropical Cyclone | Track + Intensity | Synthetic-backed

Tasks: Track + Intensity

Key Metrics: Track Error, Intensity MAE

Coverage: 8 smoke configs | 8 models | 3 ecosystems
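
Track Error and Intensity MAE are listed here without definitions. A common convention, assumed in the sketch below, is to measure track error as the great-circle distance between forecast and observed storm centers and to report intensity error as the mean absolute difference in maximum wind. The function names, the spherical Earth radius, and the averaging over lead times are illustrative choices, not the PyHazards implementation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_track_error_km(forecast, observed):
    """Average center-to-center distance over paired (lat, lon) positions."""
    errors = [great_circle_km(f[0], f[1], o[0], o[1]) for f, o in zip(forecast, observed)]
    return sum(errors) / len(errors)


def intensity_mae(forecast, observed):
    """Mean absolute error between forecast and observed intensities."""
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(forecast)


forecast_track = [(20.0, -60.0), (21.0, -61.5)]
observed_track = [(20.0, -60.0), (21.0, -62.0)]
print(mean_track_error_km(forecast_track, observed_track))
print(intensity_mae([50.0, 60.0], [55.0, 50.0]))
```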

Coverage Matrix

Use the matrix below for side-by-side comparison of hazard coverage, family-level tasks, primary metrics, linked-model counts, and support status without opening the detail pages first.

Hazard | Benchmark Family | Tasks | Primary Metrics | Linked Models | Support Status
Wildfire | Wildfire Benchmark | Danger, Spread | Accuracy, Macro F1, AUC, PR-AUC, +5 more | 8 models | Synthetic-backed
Earthquake | Earthquake Benchmark | Phase Picking, Wavefield Forecasting | P-pick MAE, S-pick MAE, Precision, Recall, +3 more | 5 models | Synthetic-backed
Flood | Flood Benchmark | Streamflow, Inundation | MAE, RMSE, NSE, KGE, +3 more | 6 models | Synthetic-backed
Tropical Cyclone | Tropical Cyclone Benchmark | Track + Intensity | Track Error, Intensity MAE | 8 models | Synthetic-backed
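
The At a Glance totals can be cross-checked against the per-family counts on this page. The sketch below copies those counts into a small record type and sums them. The FamilyCoverage dataclass is purely illustrative and is not a PyHazards type, and summing smoke configs assumes no config is shared between families.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FamilyCoverage:
    hazard: str
    tasks: tuple
    smoke_configs: int
    models: int
    ecosystems: int


# Counts copied from the family cards on this page.
FAMILIES = [
    FamilyCoverage("Wildfire", ("Danger", "Spread"), 8, 8, 1),
    FamilyCoverage("Earthquake", ("Phase Picking", "Wavefield Forecasting"), 5, 5, 4),
    FamilyCoverage("Flood", ("Streamflow", "Inundation"), 6, 6, 4),
    FamilyCoverage("Tropical Cyclone", ("Track + Intensity",), 8, 8, 3),
]

totals = {
    "families": len(FAMILIES),
    "ecosystems": sum(f.ecosystems for f in FAMILIES),
    "tasks": sum(len(f.tasks) for f in FAMILIES),
    "smoke_configs": sum(f.smoke_configs for f in FAMILIES),
}
print(totals)
# {'families': 4, 'ecosystems': 12, 'tasks': 7, 'smoke_configs': 27}
```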

Benchmark Ecosystems

Browse the aligned benchmark ecosystems by hazard family. Each card links to a detail page with the routed benchmark family, source links, and the models currently mapped to that ecosystem.

Ecosystem cards describe the external benchmark or data protocol surfaced on this page and show how it maps back to the shared PyHazards benchmark family.

WildfireSpreadTS

Temporal wildfire spread benchmark coverage for the shared wildfire spread evaluator.

Wildfire | Spread | Synthetic-backed

Benchmark Family: Wildfire Benchmark

Key Metrics: IoU, F1, Burned-area MAE

Coverage: 5 smoke configs | 5 models
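
IoU and F1 for spread prediction are typically computed per pixel on binary burned/unburned masks. The sketch below assumes that convention and treats Burned-area MAE as the mean absolute difference in burned-pixel counts per sample, which is an assumption since this page does not define it. All function names are illustrative.

```python
def confusion_counts(pred_mask, true_mask):
    """Count true positives, false positives, and false negatives over 2-D binary masks."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    return tp, fp, fn


def iou(pred_mask, true_mask):
    """Intersection over union; defined as 1.0 when both masks are empty."""
    tp, fp, fn = confusion_counts(pred_mask, true_mask)
    union = tp + fp + fn
    return tp / union if union else 1.0


def f1(pred_mask, true_mask):
    """Pixel-wise F1 score; defined as 1.0 when both masks are empty."""
    tp, fp, fn = confusion_counts(pred_mask, true_mask)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def burned_area_mae(pred_masks, true_masks):
    """Mean absolute difference in burned-pixel counts across samples (assumed definition)."""
    diffs = [
        abs(sum(map(sum, p)) - sum(map(sum, t)))
        for p, t in zip(pred_masks, true_masks)
    ]
    return sum(diffs) / len(diffs)


pred = [[1, 1], [0, 0]]
true = [[1, 0], [1, 0]]
print(iou(pred, true), f1(pred, true), burned_area_mae([pred], [true]))
```

The same IoU and F1 functions apply to the inundation masks scored under FloodCastBench, since both tasks compare binary rasters.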

AEFA

AEFA-style forecasting dataset support for the shared earthquake forecasting path.

Earthquake | Wavefield Forecasting | Synthetic-backed

Benchmark Family: Earthquake Benchmark

Key Metrics: MAE, MSE

Coverage: 1 smoke config | 1 model

pick-benchmark

pick-benchmark-compatible waveform picking support routed through the shared earthquake evaluator.

Earthquake | Phase Picking | Synthetic-backed

Benchmark Family: Earthquake Benchmark

Key Metrics: P-pick MAE, S-pick MAE, Precision, Recall, +1 more

Coverage: 2 smoke configs | 2 models

pyCSEP

pyCSEP-style forecasting report export for the earthquake forecasting smoke path.

Earthquake | Wavefield Forecasting | Synthetic-backed

Benchmark Family: Earthquake Benchmark

Key Metrics: MAE, MSE

Coverage: 1 smoke config | 1 model

SeisBench

SeisBench-shaped waveform picking support for the shared earthquake benchmark family.

Earthquake | Phase Picking | Synthetic-backed

Benchmark Family: Earthquake Benchmark

Key Metrics: P-pick MAE, S-pick MAE, Precision, Recall, +1 more

Coverage: 2 smoke configs | 2 models

Caravan

Caravan-style streamflow benchmark coverage for the shared flood streamflow evaluator.

Flood | Streamflow | Synthetic-backed

Benchmark Family: Flood Benchmark

Key Metrics: MAE, RMSE, NSE, KGE

Coverage: 2 smoke configs | 2 models

FloodCastBench

FloodCastBench-style inundation benchmark coverage for the shared flood inundation evaluator.

Flood | Inundation | Synthetic-backed

Benchmark Family: Flood Benchmark

Key Metrics: Pixel MAE, IoU, F1

Coverage: 2 smoke configs | 2 models

HydroBench

HydroBench-style streamflow diagnostics coverage for the shared flood streamflow evaluator.

Flood | Streamflow | Synthetic-backed

Benchmark Family: Flood Benchmark

Key Metrics: MAE, RMSE, NSE, KGE

Coverage: 1 smoke config | 1 model

WaterBench

WaterBench-style streamflow benchmark coverage for the shared flood evaluator.

Flood | Streamflow | Synthetic-backed

Benchmark Family: Flood Benchmark

Key Metrics: MAE, RMSE, NSE, KGE

Coverage: 1 smoke config | 1 model

IBTrACS

IBTrACS-backed storm benchmark coverage for the shared tropical cyclone evaluator.

Tropical Cyclone | Track + Intensity | Synthetic-backed

Benchmark Family: Tropical Cyclone Benchmark

Key Metrics: Track Error, Intensity MAE

Coverage: 4 smoke configs | 4 models

TCBench Alpha

TCBench Alpha-style storm benchmark coverage for the shared tropical cyclone evaluator.

Tropical Cyclone | Track + Intensity | Synthetic-backed

Benchmark Family: Tropical Cyclone Benchmark

Key Metrics: Track Error, Intensity MAE

Coverage: 3 smoke configs | 3 models

TropiCycloneNet-Dataset

TropiCycloneNet-Dataset-backed storm benchmark coverage for the shared tropical cyclone evaluator.

Tropical Cyclone | Track + Intensity | Synthetic-backed

Benchmark Family: Tropical Cyclone Benchmark

Key Metrics: Track Error, Intensity MAE

Coverage: 1 smoke config | 1 model

Programmatic Use

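from pyhazards.configs import load_experiment_config
from pyhazards.engine import BenchmarkRunner

# Load the earthquake smoke configuration from its YAML file.
config = load_experiment_config("pyhazards/configs/earthquake/phasenet_smoke.yaml")

# Run the benchmark through the shared runner and print the resulting metrics.
summary = BenchmarkRunner().run(config)
print(summary.metrics)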

Run python scripts/run_benchmark.py --help to see the options for the CLI entry point. Pair this page with Configs for experiment YAMLs and with Reports for comparable benchmark exports.
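
To run several smoke configs in one pass, the two calls shown above can be wrapped in a loop. The helper below is a hypothetical sketch, not part of PyHazards. It takes the loader and runner as arguments so that it relies only on the interface used above: a loader that accepts a path, a runner with a run method, and a summary with a metrics attribute.

```python
def run_smoke_configs(config_paths, load_config, runner):
    """Run each config through the runner and collect metrics keyed by config path.

    load_config and runner are passed in by the caller; with PyHazards these
    would be load_experiment_config and a BenchmarkRunner instance.
    """
    results = {}
    for path in config_paths:
        config = load_config(path)
        summary = runner.run(config)
        results[path] = summary.metrics
    return results


# With PyHazards installed, a call would look like this:
#
#   from pyhazards.configs import load_experiment_config
#   from pyhazards.engine import BenchmarkRunner
#
#   results = run_smoke_configs(
#       ["pyhazards/configs/earthquake/phasenet_smoke.yaml"],
#       load_experiment_config,
#       BenchmarkRunner(),
#   )
```

Passing the loader and runner in explicitly also makes the loop easy to exercise with stand-in objects before running real benchmarks.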