.. This file is generated by scripts/render_benchmark_docs.py. Do not edit by hand.

Benchmarks
==========

Explore shared benchmark families, aligned external ecosystems, supported
tasks, and model compatibility across PyHazards.

At a Glance
-----------

.. grid:: 1 2 4 4
   :gutter: 2
   :class-container: catalog-grid

   .. grid-item-card:: Benchmark Families
      :class-card: catalog-stat-card

      .. container:: catalog-stat-value

         4

      .. container:: catalog-stat-note

         Shared evaluator families available through the benchmark runner.

   .. grid-item-card:: Ecosystem Mappings
      :class-card: catalog-stat-card

      .. container:: catalog-stat-value

         12

      .. container:: catalog-stat-note

         External benchmark or data ecosystems linked from the public docs.

   .. grid-item-card:: Supported Task Families
      :class-card: catalog-stat-card

      .. container:: catalog-stat-value

         7

      .. container:: catalog-stat-note

         Hazard tasks covered across the family-level benchmark contracts.

   .. grid-item-card:: Smoke Configurations
      :class-card: catalog-stat-card

      .. container:: catalog-stat-value

         27

      .. container:: catalog-stat-note

         Unique smoke configs referenced by the benchmark family cards.


Benchmark Families
------------------

Each of the four cards below summarizes one benchmark family exposed
through the shared runner: its tasks, key metrics, support level, and
coverage counts.

.. grid:: 1 1 2 2
   :gutter: 2
   :class-container: catalog-grid

   .. grid-item-card:: Wildfire Benchmark
      :class-card: catalog-entry-card

      .. container:: catalog-entry-summary

         Shared PyHazards evaluator family for wildfire danger and wildfire spread experiments.

      .. container:: catalog-chip-row

         :bdg-primary:`Wildfire` :bdg-secondary:`Danger` :bdg-secondary:`Spread` :bdg-info:`Synthetic-backed`

      .. container:: catalog-meta-row

         **Tasks:** Danger, Spread

      .. container:: catalog-meta-row

         **Key Metrics:** Accuracy, Macro F1, AUC, PR-AUC, +5 more

      .. container:: catalog-meta-row

         **Coverage:** 8 smoke configs | 8 models | 1 ecosystem

      .. container:: catalog-link-row

         **View Details:** :doc:`Wildfire Benchmark <benchmarks/wildfire_benchmark>`

   .. grid-item-card:: Earthquake Benchmark
      :class-card: catalog-entry-card

      .. container:: catalog-entry-summary

         Shared PyHazards evaluator family for earthquake phase-picking and wavefield-forecasting runs.

      .. container:: catalog-chip-row

         :bdg-primary:`Earthquake` :bdg-secondary:`Phase Picking` :bdg-secondary:`Wavefield Forecasting` :bdg-info:`Synthetic-backed`

      .. container:: catalog-meta-row

         **Tasks:** Phase Picking, Wavefield Forecasting

      .. container:: catalog-meta-row

         **Key Metrics:** P-pick MAE, S-pick MAE, Precision, Recall, +3 more

      .. container:: catalog-meta-row

         **Coverage:** 5 smoke configs | 5 models | 4 ecosystems

      .. container:: catalog-link-row

         **View Details:** :doc:`Earthquake Benchmark <benchmarks/earthquake_benchmark>`

   .. grid-item-card:: Flood Benchmark
      :class-card: catalog-entry-card

      .. container:: catalog-entry-summary

         Shared PyHazards evaluator family for streamflow forecasting and inundation prediction.

      .. container:: catalog-chip-row

         :bdg-primary:`Flood` :bdg-secondary:`Streamflow` :bdg-secondary:`Inundation` :bdg-info:`Synthetic-backed`

      .. container:: catalog-meta-row

         **Tasks:** Streamflow, Inundation

      .. container:: catalog-meta-row

         **Key Metrics:** MAE, RMSE, NSE, KGE, +3 more

      .. container:: catalog-meta-row

         **Coverage:** 6 smoke configs | 6 models | 4 ecosystems

      .. container:: catalog-link-row

         **View Details:** :doc:`Flood Benchmark <benchmarks/flood_benchmark>`

   .. grid-item-card:: Tropical Cyclone Benchmark
      :class-card: catalog-entry-card

      .. container:: catalog-entry-summary

         Shared PyHazards evaluator family for tropical cyclone and hurricane track-intensity forecasting.

      .. container:: catalog-chip-row

         :bdg-primary:`Tropical Cyclone` :bdg-secondary:`Track + Intensity` :bdg-info:`Synthetic-backed`

      .. container:: catalog-meta-row

         **Tasks:** Track + Intensity

      .. container:: catalog-meta-row

         **Key Metrics:** Track Error, Intensity MAE

      .. container:: catalog-meta-row

         **Coverage:** 8 smoke configs | 8 models | 3 ecosystems

      .. container:: catalog-link-row

         **View Details:** :doc:`Tropical Cyclone Benchmark <benchmarks/tropical_cyclone_benchmark>`
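
The coverage counts on these cards can be cross-checked against the
At a Glance totals. The sketch below is illustrative only: the
``FAMILY_COVERAGE`` dictionary is a hypothetical structure whose values
were copied by hand from the cards above, not read from a PyHazards API.

.. code-block:: python

    # Hypothetical hand-copied snapshot of the four family cards on this page.
    FAMILY_COVERAGE = {
        "Wildfire": {
            "tasks": ["Danger", "Spread"],
            "smoke_configs": 8,
            "ecosystems": 1,
        },
        "Earthquake": {
            "tasks": ["Phase Picking", "Wavefield Forecasting"],
            "smoke_configs": 5,
            "ecosystems": 4,
        },
        "Flood": {
            "tasks": ["Streamflow", "Inundation"],
            "smoke_configs": 6,
            "ecosystems": 4,
        },
        "Tropical Cyclone": {
            "tasks": ["Track + Intensity"],
            "smoke_configs": 8,
            "ecosystems": 3,
        },
    }

    entries = FAMILY_COVERAGE.values()

    # Totals corresponding to the four At a Glance stat cards.
    total_families = len(FAMILY_COVERAGE)
    total_ecosystems = sum(entry["ecosystems"] for entry in entries)
    total_tasks = sum(len(entry["tasks"]) for entry in entries)
    total_smoke_configs = sum(entry["smoke_configs"] for entry in entries)

    print(total_families, total_ecosystems, total_tasks, total_smoke_configs)
    # 4 12 7 27

The per-family smoke-config counts add up to exactly 27, which is
consistent with the page reporting 27 unique smoke configs, that is, with
no config appearing on more than one family card.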


Coverage Matrix
---------------

Use the matrix below for side-by-side comparison of hazard coverage,
family-level tasks, primary metrics, linked-model counts, and support
status without opening the detail pages first.

.. list-table::
   :widths: 14 22 18 20 14 12
   :header-rows: 1
   :class: catalog-matrix

   * - Hazard
     - Benchmark Family
     - Tasks
     - Primary Metrics
     - Linked Models
     - Support Status
   * - Wildfire
     - :doc:`Wildfire Benchmark <benchmarks/wildfire_benchmark>`
     - Danger, Spread
     - Accuracy, Macro F1, AUC, PR-AUC, +5 more
     - 8 models
     - Synthetic-backed
   * - Earthquake
     - :doc:`Earthquake Benchmark <benchmarks/earthquake_benchmark>`
     - Phase Picking, Wavefield Forecasting
     - P-pick MAE, S-pick MAE, Precision, Recall, +3 more
     - 5 models
     - Synthetic-backed
   * - Flood
     - :doc:`Flood Benchmark <benchmarks/flood_benchmark>`
     - Streamflow, Inundation
     - MAE, RMSE, NSE, KGE, +3 more
     - 6 models
     - Synthetic-backed
   * - Tropical Cyclone
     - :doc:`Tropical Cyclone Benchmark <benchmarks/tropical_cyclone_benchmark>`
     - Track + Intensity
     - Track Error, Intensity MAE
     - 8 models
     - Synthetic-backed

Benchmark Ecosystems
--------------------

Browse the aligned benchmark ecosystems by hazard family. Each card
links to a detail page with the routed benchmark family, source links,
and the models currently mapped to that ecosystem.

.. tab-set::
   :class: catalog-tabs

   .. tab-item:: Wildfire

      .. container:: catalog-section-note

         Ecosystem cards describe the external benchmark or data protocol
         surfaced on this page and show how it maps back to the shared
         PyHazards benchmark family.

      .. grid:: 1 1 2 2
         :gutter: 2
         :class-container: catalog-grid

         .. grid-item-card:: WildfireSpreadTS
            :class-card: catalog-entry-card

            .. container:: catalog-entry-summary

               Temporal wildfire spread benchmark coverage for the shared wildfire spread evaluator.

            .. container:: catalog-chip-row

               :bdg-primary:`Wildfire` :bdg-secondary:`Spread` :bdg-info:`Synthetic-backed`

            .. container:: catalog-meta-row

               **Benchmark Family:** :doc:`Wildfire Benchmark <benchmarks/wildfire_benchmark>`

            .. container:: catalog-meta-row

               **Key Metrics:** IoU, F1, Burned-area MAE

            .. container:: catalog-meta-row

               **Coverage:** 5 smoke configs | 5 models

            .. container:: catalog-link-row

               **View Details:** :doc:`WildfireSpreadTS <benchmarks/wildfirespreadts_ecosystem>`

            .. container:: catalog-link-row

               **Paper:** `WildfireSpreadTS: A Dataset of Multi-Modal Time Series for Wildfire Spread Prediction <https://openreview.net/forum?id=RgdGkPRQ03>`_ | **Repo:** `Repository <https://github.com/SebastianGer/WildfireSpreadTS>`__


   .. tab-item:: Earthquake

      .. container:: catalog-section-note

         Ecosystem cards describe the external benchmark or data protocol
         surfaced on this page and show how it maps back to the shared
         PyHazards benchmark family.

      .. grid:: 1 1 2 2
         :gutter: 2
         :class-container: catalog-grid

         .. grid-item-card:: AEFA
            :class-card: catalog-entry-card

            .. container:: catalog-entry-summary

               AEFA-style forecasting dataset support for the shared earthquake forecasting path.

            .. container:: catalog-chip-row

               :bdg-primary:`Earthquake` :bdg-secondary:`Wavefield Forecasting` :bdg-info:`Synthetic-backed`

            .. container:: catalog-meta-row

               **Benchmark Family:** :doc:`Earthquake Benchmark <benchmarks/earthquake_benchmark>`

            .. container:: catalog-meta-row

               **Key Metrics:** MAE, MSE

            .. container:: catalog-meta-row

               **Coverage:** 1 smoke config | 1 model

            .. container:: catalog-link-row

               **View Details:** :doc:`AEFA <benchmarks/aefa>`

            .. container:: catalog-link-row

               **Repo:** `AEFA <https://github.com/chenyk1990/aefa>`_

         .. grid-item-card:: pick-benchmark
            :class-card: catalog-entry-card

            .. container:: catalog-entry-summary

               pick-benchmark-compatible waveform picking support routed through the shared earthquake evaluator.

            .. container:: catalog-chip-row

               :bdg-primary:`Earthquake` :bdg-secondary:`Phase Picking` :bdg-info:`Synthetic-backed`

            .. container:: catalog-meta-row

               **Benchmark Family:** :doc:`Earthquake Benchmark <benchmarks/earthquake_benchmark>`

            .. container:: catalog-meta-row

               **Key Metrics:** P-pick MAE, S-pick MAE, Precision, Recall, +1 more

            .. container:: catalog-meta-row

               **Coverage:** 2 smoke configs | 2 models

            .. container:: catalog-link-row

               **View Details:** :doc:`pick-benchmark <benchmarks/pick_benchmark>`

            .. container:: catalog-link-row

               **Repo:** `pick-benchmark <https://github.com/seisbench/pick-benchmark>`_

         .. grid-item-card:: pyCSEP
            :class-card: catalog-entry-card

            .. container:: catalog-entry-summary

               pyCSEP-style forecasting report export for the earthquake forecasting smoke path.

            .. container:: catalog-chip-row

               :bdg-primary:`Earthquake` :bdg-secondary:`Wavefield Forecasting` :bdg-info:`Synthetic-backed`

            .. container:: catalog-meta-row

               **Benchmark Family:** :doc:`Earthquake Benchmark <benchmarks/earthquake_benchmark>`

            .. container:: catalog-meta-row

               **Key Metrics:** MAE, MSE

            .. container:: catalog-meta-row

               **Coverage:** 1 smoke config | 1 model

            .. container:: catalog-link-row

               **View Details:** :doc:`pyCSEP <benchmarks/pycsep>`

            .. container:: catalog-link-row

               **Repo:** `pyCSEP <https://github.com/SCECCode/pycsep>`_

         .. grid-item-card:: SeisBench
            :class-card: catalog-entry-card

            .. container:: catalog-entry-summary

               SeisBench-shaped waveform picking support for the shared earthquake benchmark family.

            .. container:: catalog-chip-row

               :bdg-primary:`Earthquake` :bdg-secondary:`Phase Picking` :bdg-info:`Synthetic-backed`

            .. container:: catalog-meta-row

               **Benchmark Family:** :doc:`Earthquake Benchmark <benchmarks/earthquake_benchmark>`

            .. container:: catalog-meta-row

               **Key Metrics:** P-pick MAE, S-pick MAE, Precision, Recall, +1 more

            .. container:: catalog-meta-row

               **Coverage:** 2 smoke configs | 2 models

            .. container:: catalog-link-row

               **View Details:** :doc:`SeisBench <benchmarks/seisbench>`

            .. container:: catalog-link-row

               **Paper:** `SeisBench - A Toolbox for Machine Learning in Seismology <https://joss.theoj.org/papers/10.21105/joss.04418>`_ | **Repo:** `Repository <https://github.com/seisbench/seisbench>`__


   .. tab-item:: Flood

      .. container:: catalog-section-note

         Ecosystem cards describe the external benchmark or data protocol
         surfaced on this page and show how it maps back to the shared
         PyHazards benchmark family.

      .. grid:: 1 1 2 2
         :gutter: 2
         :class-container: catalog-grid

         .. grid-item-card:: Caravan
            :class-card: catalog-entry-card

            .. container:: catalog-entry-summary

               Caravan-style streamflow benchmark coverage for the shared flood streamflow evaluator.

            .. container:: catalog-chip-row

               :bdg-primary:`Flood` :bdg-secondary:`Streamflow` :bdg-info:`Synthetic-backed`

            .. container:: catalog-meta-row

               **Benchmark Family:** :doc:`Flood Benchmark <benchmarks/flood_benchmark>`

            .. container:: catalog-meta-row

               **Key Metrics:** MAE, RMSE, NSE, KGE

            .. container:: catalog-meta-row

               **Coverage:** 2 smoke configs | 2 models

            .. container:: catalog-link-row

               **View Details:** :doc:`Caravan <benchmarks/caravan>`

            .. container:: catalog-link-row

               **Paper:** `Caravan - A global community dataset for large-sample hydrology <https://www.nature.com/articles/s41597-023-01975-w>`_ | **Repo:** `Repository <https://github.com/kratzert/Caravan>`__

         .. grid-item-card:: FloodCastBench
            :class-card: catalog-entry-card

            .. container:: catalog-entry-summary

               FloodCastBench-style inundation benchmark coverage for the shared flood inundation evaluator.

            .. container:: catalog-chip-row

               :bdg-primary:`Flood` :bdg-secondary:`Inundation` :bdg-info:`Synthetic-backed`

            .. container:: catalog-meta-row

               **Benchmark Family:** :doc:`Flood Benchmark <benchmarks/flood_benchmark>`

            .. container:: catalog-meta-row

               **Key Metrics:** Pixel MAE, IoU, F1

            .. container:: catalog-meta-row

               **Coverage:** 2 smoke configs | 2 models

            .. container:: catalog-link-row

               **View Details:** :doc:`FloodCastBench <benchmarks/floodcastbench>`

            .. container:: catalog-link-row

               **Repo:** `FloodCastBench <https://github.com/HydroPML/FloodCastBench>`_

         .. grid-item-card:: HydroBench
            :class-card: catalog-entry-card

            .. container:: catalog-entry-summary

               HydroBench-style streamflow diagnostics coverage for the shared flood streamflow evaluator.

            .. container:: catalog-chip-row

               :bdg-primary:`Flood` :bdg-secondary:`Streamflow` :bdg-info:`Synthetic-backed`

            .. container:: catalog-meta-row

               **Benchmark Family:** :doc:`Flood Benchmark <benchmarks/flood_benchmark>`

            .. container:: catalog-meta-row

               **Key Metrics:** MAE, RMSE, NSE, KGE

            .. container:: catalog-meta-row

               **Coverage:** 1 smoke config | 1 model

            .. container:: catalog-link-row

               **View Details:** :doc:`HydroBench <benchmarks/hydrobench>`

            .. container:: catalog-link-row

               **Repo:** `HydroBench <https://github.com/EMscience/HydroBench>`_

         .. grid-item-card:: WaterBench
            :class-card: catalog-entry-card

            .. container:: catalog-entry-summary

               WaterBench-style streamflow benchmark coverage for the shared flood evaluator.

            .. container:: catalog-chip-row

               :bdg-primary:`Flood` :bdg-secondary:`Streamflow` :bdg-info:`Synthetic-backed`

            .. container:: catalog-meta-row

               **Benchmark Family:** :doc:`Flood Benchmark <benchmarks/flood_benchmark>`

            .. container:: catalog-meta-row

               **Key Metrics:** MAE, RMSE, NSE, KGE

            .. container:: catalog-meta-row

               **Coverage:** 1 smoke config | 1 model

            .. container:: catalog-link-row

               **View Details:** :doc:`WaterBench <benchmarks/waterbench>`

            .. container:: catalog-link-row

               **Paper:** `WaterBench: A Large-scale Benchmark Dataset for Data-driven Streamflow Forecasting <https://neurips.cc/virtual/2023/80632>`_ | **Repo:** `Repository <https://github.com/uihilab/WaterBench>`__


   .. tab-item:: Tropical Cyclone

      .. container:: catalog-section-note

         Ecosystem cards describe the external benchmark or data protocol
         surfaced on this page and show how it maps back to the shared
         PyHazards benchmark family.

      .. grid:: 1 1 2 2
         :gutter: 2
         :class-container: catalog-grid

         .. grid-item-card:: IBTrACS
            :class-card: catalog-entry-card

            .. container:: catalog-entry-summary

               IBTrACS-backed storm benchmark coverage for the shared tropical cyclone evaluator.

            .. container:: catalog-chip-row

               :bdg-primary:`Tropical Cyclone` :bdg-secondary:`Track + Intensity` :bdg-info:`Synthetic-backed`

            .. container:: catalog-meta-row

               **Benchmark Family:** :doc:`Tropical Cyclone Benchmark <benchmarks/tropical_cyclone_benchmark>`

            .. container:: catalog-meta-row

               **Key Metrics:** Track Error, Intensity MAE

            .. container:: catalog-meta-row

               **Coverage:** 4 smoke configs | 4 models

            .. container:: catalog-link-row

               **View Details:** :doc:`IBTrACS <benchmarks/ibtracs>`

            .. container:: catalog-link-row

               **Source:** `IBTrACS <https://www.ncei.noaa.gov/products/international-best-track-archive>`_

         .. grid-item-card:: TCBench Alpha
            :class-card: catalog-entry-card

            .. container:: catalog-entry-summary

               TCBench Alpha-style storm benchmark coverage for the shared tropical cyclone evaluator.

            .. container:: catalog-chip-row

               :bdg-primary:`Tropical Cyclone` :bdg-secondary:`Track + Intensity` :bdg-info:`Synthetic-backed`

            .. container:: catalog-meta-row

               **Benchmark Family:** :doc:`Tropical Cyclone Benchmark <benchmarks/tropical_cyclone_benchmark>`

            .. container:: catalog-meta-row

               **Key Metrics:** Track Error, Intensity MAE

            .. container:: catalog-meta-row

               **Coverage:** 3 smoke configs | 3 models

            .. container:: catalog-link-row

               **View Details:** :doc:`TCBench Alpha <benchmarks/tcbench_alpha>`

            .. container:: catalog-link-row

               **Repo:** `TCBench Alpha <https://github.com/msgomez06/TCBench_Alpha>`_

         .. grid-item-card:: TropiCycloneNet-Dataset
            :class-card: catalog-entry-card

            .. container:: catalog-entry-summary

               TropiCycloneNet-Dataset-backed storm benchmark coverage for the shared tropical cyclone evaluator.

            .. container:: catalog-chip-row

               :bdg-primary:`Tropical Cyclone` :bdg-secondary:`Track + Intensity` :bdg-info:`Synthetic-backed`

            .. container:: catalog-meta-row

               **Benchmark Family:** :doc:`Tropical Cyclone Benchmark <benchmarks/tropical_cyclone_benchmark>`

            .. container:: catalog-meta-row

               **Key Metrics:** Track Error, Intensity MAE

            .. container:: catalog-meta-row

               **Coverage:** 1 smoke config | 1 model

            .. container:: catalog-link-row

               **View Details:** :doc:`TropiCycloneNet-Dataset <benchmarks/tropicyclonenet_dataset>`

            .. container:: catalog-link-row

               **Repo:** `TropiCycloneNet-Dataset <https://github.com/xiaochengfuhuo/TropiCycloneNet-Dataset>`_


Programmatic Use
----------------

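.. code-block:: python

    from pyhazards.configs import load_experiment_config
    from pyhazards.engine import BenchmarkRunner

    # Load an experiment config from its YAML file (here, an earthquake smoke config).
    config = load_experiment_config("pyhazards/configs/earthquake/phasenet_smoke.yaml")

    # Run the config through the shared benchmark runner and print its metrics.
    summary = BenchmarkRunner().run(config)
    print(summary.metrics)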

Run ``python scripts/run_benchmark.py --help`` to see the options of the
CLI entry point. See :doc:`pyhazards_configs` for the experiment YAML
files and :doc:`pyhazards_reports` for comparable benchmark exports.
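
To run several configs in one pass, the load-and-run pattern above can be
wrapped in a small helper. This is a minimal sketch, not part of
PyHazards: ``run_smoke_suite`` is a hypothetical name, and the loader and
runner are passed in as callables so the helper assumes nothing about the
library beyond what the snippet above shows (a loader that takes a path,
a ``run`` method that takes a config, and a ``metrics`` attribute on the
returned summary).

.. code-block:: python

    from typing import Any, Callable, Dict, Iterable


    def run_smoke_suite(
        config_paths: Iterable[str],
        load_config: Callable[[str], Any],
        run: Callable[[Any], Any],
    ) -> Dict[str, Any]:
        """Run each config in turn and collect its metrics, keyed by config path.

        Hypothetical helper: ``load_config`` and ``run`` are injected so the
        function can be exercised with stand-ins when PyHazards is not installed.
        """
        results: Dict[str, Any] = {}
        for path in config_paths:
            config = load_config(path)       # e.g. load_experiment_config
            summary = run(config)            # e.g. BenchmarkRunner().run
            results[path] = summary.metrics  # assumes a ``metrics`` attribute
        return results

With PyHazards installed, the call would presumably be
``run_smoke_suite(paths, load_experiment_config, BenchmarkRunner().run)``,
where ``paths`` is a list of experiment YAML paths such as the one used
above.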

.. toctree::
   :maxdepth: 1
   :hidden:

   benchmarks/aefa
   benchmarks/caravan
   benchmarks/earthquake_benchmark
   benchmarks/flood_benchmark
   benchmarks/floodcastbench
   benchmarks/hydrobench
   benchmarks/ibtracs
   benchmarks/pick_benchmark
   benchmarks/pycsep
   benchmarks/seisbench
   benchmarks/tcbench_alpha
   benchmarks/tropical_cyclone_benchmark
   benchmarks/tropicyclonenet_dataset
   benchmarks/waterbench
   benchmarks/wildfire_benchmark
   benchmarks/wildfirespreadts_ecosystem