Engine Overview

The Cliff Horizon engine is a calibrated probability pipeline that transforms raw weather model output into risk-grade probability distributions. It is not a weather model — it is a post-processing and calibration layer that sits on top of existing NWP models and adds proprietary data layers.

Architecture

Development Status

Phase 4: Multi-Variable Expansion (Complete). The engine now supports five weather variables via a polymorphic WeatherVariable architecture: temperature daily high, temperature nighttime low, rainfall, wind speed, and irradiance (wind gust is registered as a separate variable alongside wind speed). Each variable has its own statistical distribution, bias correction method, and probability computation. A FastAPI risk scoring API provides commercial risk assessments for construction delay (rainfall), solar shortfall (irradiance), and wind operations. A read-only System Status API and dashboard tab provide visibility into the variable registry, pipeline health, data freshness, and engine configuration.

Phase	Status	Description
Phase 0	Complete	GFS baseline accuracy — MAE/bias/RMSE across 10 cities, 900 city-days
Phase 1	Complete	Bias correction + isotonic calibration + edge calculator + backtest P&L
Phase 2	Complete	Live signal generation, NLL pipeline, daily automation, dashboard integration
Phase 3	Complete	Multi-model ECMWF + GFS ensemble, intraday hardening (lead-time sigma, D+0 METAR blending, divergence monitor)
Phase 4	Complete	Multi-variable expansion (rainfall, irradiance, wind), `WeatherVariable` ABC, registry pattern, risk API, generic backtest pipeline

Core Variable Architecture (Phase 4)

The engine uses a registry pattern to support multiple weather variables through a single unified pipeline. Each variable is a subclass of WeatherVariable with its own configuration, data ingestion, bias correction, and probability computation.

WeatherVariable ABC (`src/core/variable.py`)

Every weather variable implements this interface:

from abc import ABC

from pandas import DataFrame  # assumption: DataFrame refers to pandas.DataFrame


class WeatherVariable(ABC):
    config: "VariableConfig"  # dataclass described in the next section

    def ingest_historical(self, cities, start, end, model) -> DataFrame: ...     # backtest forecasts
    def ingest_ensemble(self, cities, forecast_days, model) -> DataFrame: ...    # live ensemble
    def observe(self, cities, start, end) -> DataFrame: ...                      # ground truth
    def bias_correct(self, raw, bias_params, station, month) -> float: ...       # single-value correction
    def bias_correct_ensemble(self, df, bias_params) -> DataFrame: ...           # bulk correction
    def exceedance_probability(self, value, sigma, threshold) -> float: ...      # CDF-based
    def ensemble_exceedance_probability(self, members, threshold) -> float: ...  # member counting
    def generate_thresholds(self, mean) -> list[float]: ...                      # contract strikes

VariableConfig (`src/core/variable.py`)

Each variable's behaviour is controlled by a configuration dataclass:

@dataclass
class VariableConfig:
    name: str                       # "rainfall"
    display_name: str               # "Daily Rainfall"
    unit: str                       # "inches"
    distribution_type: DistributionType  # GAUSSIAN, GAMMA, BETA, WEIBULL
    bias_method: BiasMethod         # ADDITIVE or MULTIPLICATIVE
    open_meteo_variable: str        # "precipitation_sum"
    forecast_col: str               # "forecast_precip_in"
    observed_col: str               # "precip_inches"
    corrected_col: str              # "corrected_precip_in"
    data_dir_name: str              # "rainfall"
    thresholds: list[float]         # [0.01, 0.1, 0.25, 0.5, 1.0, 2.0]
    threshold_integer: bool         # False for rainfall, True for temperature
    clamp_min: float | None         # 0.0 for non-negative variables
    exceedance_semantic: str        # "above" or "below"

Variable Registry (`src/core/registry.py`)

Variables are registered at import time via a decorator:

@register_variable("rainfall")
class Rainfall(WeatherVariable):
    ...

Look up a variable by registry name, for example get_variable("rainfall"), or by alias: get_variable("high") resolves to "temperature_high".
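A minimal sketch of how a decorator-based registry with aliases could work. Only `register_variable` and `get_variable` appear in this document; the `_REGISTRY` and `_ALIASES` names, the stand-in classes, and the choice to return an instance rather than a class are assumptions made for illustration.

```python
from abc import ABC

_REGISTRY: dict[str, type] = {}   # hypothetical: registry name -> class
_ALIASES: dict[str, str] = {      # hypothetical alias table
    "high": "temperature_high",
    "low": "temperature_low",
}


class WeatherVariable(ABC):
    """Stand-in for the real ABC in src/core/variable.py."""


def register_variable(name: str):
    """Class decorator that records the class under `name` at import time."""
    def decorator(cls: type) -> type:
        if name in _REGISTRY:
            raise ValueError(f"variable {name!r} is already registered")
        _REGISTRY[name] = cls
        return cls
    return decorator


def get_variable(name: str) -> WeatherVariable:
    """Resolve an alias (if any), then instantiate the registered class."""
    canonical = _ALIASES.get(name, name)
    try:
        return _REGISTRY[canonical]()
    except KeyError:
        raise KeyError(f"unknown variable {name!r}") from None


@register_variable("temperature_high")
class DailyHigh(WeatherVariable):
    pass


@register_variable("rainfall")
class Rainfall(WeatherVariable):
    pass
```

Because registration happens when the decorated class is defined, the module containing each variable has to be imported before `get_variable` can find it.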

Registered variables:

Registry Name	Aliases	Distribution	Bias Method	Exceedance
`temperature_high`	`high`	Gaussian	Additive	above (P(T > threshold))
`temperature_low`	`low`	Gaussian	Additive	below (P(T < threshold))
`rainfall`	—	Gamma (zero-inflated)	Multiplicative	above (P(precip > threshold))
`wind_speed`	—	Weibull	Multiplicative	above (P(wind > threshold))
`wind_gust`	—	Weibull	Multiplicative	above (P(gust > threshold))
`irradiance`	—	Beta	Multiplicative	below (P(CSI < threshold))

Distribution Functions (`src/core/distribution.py`)

Each distribution type has its own exceedance probability function:

Distribution	Variable	Formula	Implementation
Gaussian	Temperature	`P(X > T) = 1 - Phi((T - mu) / sigma)`	`gaussian_exceedance()`
Gamma	Rainfall	`P(X > T) = (1 - p_zero) * P(Gamma(a,b) > T)`	`gamma_exceedance()` — zero-inflated for dry days
Weibull	Wind	`P(X > T) = 1 - CDF_Weibull(T, k, lambda)`	`weibull_exceedance()`
Beta	Irradiance	`P(CSI > T) = 1 - CDF_Beta(T, alpha, beta)`	`beta_exceedance()` — on clear-sky index [0,1]

Distribution parameters are fitted from training data:

fit_gamma_zero_inflated(values) — MLE with method-of-moments fallback
fit_weibull(values) — scipy weibull_min.fit with moment-based fallback
fit_beta(values) — Beta fit with moment-based fallback
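The Gaussian and Weibull rows of the formula table have closed forms that need only the standard library. The sketch below reuses the function names from the table, but the argument order and the bodies are assumptions, not the contents of `src/core/distribution.py`.

```python
import math


def gaussian_exceedance(mu: float, sigma: float, threshold: float) -> float:
    """P(X > T) = 1 - Phi((T - mu) / sigma) for X ~ Normal(mu, sigma)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (threshold - mu) / sigma
    # 1 - Phi(z) equals 0.5 * erfc(z / sqrt(2)); erfc stays accurate in the far tail.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def weibull_exceedance(k: float, lam: float, threshold: float) -> float:
    """P(X > T) = exp(-(T / lambda)^k) for X ~ Weibull(shape k, scale lambda)."""
    if k <= 0 or lam <= 0:
        raise ValueError("k and lambda must be positive")
    if threshold <= 0:
        return 1.0  # a non-negative variable exceeds any non-positive threshold
    return math.exp(-((threshold / lam) ** k))
```

The Gamma and Beta rows need the incomplete gamma and beta functions, which the standard library does not provide; survival functions such as `scipy.stats.gamma.sf` and `scipy.stats.beta.sf` are one way to obtain them.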

Engine Pipeline

Live Signal Generation (`src/engine/live_signal_generator.py`)

The LiveSignalGenerator class orchestrates the full pipeline from ensemble fetch to trade signal:

1. load_models()
   Load bias_params.json[model] for each NWP model
   Load calibration_model.pkl (isotonic regression)
   Extract prob_sigma = sqrt(brier_calibrated)

2. fetch_ensemble(target_date)
   Open-Meteo Ensemble API
   → [station, date, member, forecast_value]

3. bias_correct_ensemble()
   For each member: corrected = raw + get_bias(params, station, month)  [additive]
                  or corrected = raw * get_ratio(params, station, month) [multiplicative]
   → [station, date, member, corrected_value]

4. compute_probabilities()
   Generate thresholds via var.generate_thresholds() or integer +/- STRIKES_PER_SIDE
   For each threshold: P(exceed) = count(members > threshold) / n_members
   → [station, date, threshold, raw_probability]

5. calibrate()
   calibrated_prob = isotonic_model.predict(raw_probability)
   → [station, date, threshold, calibrated_probability]

6. merge_market_prices() [optional]
   Left-join with IBKR quotes
   → [station, date, threshold, calibrated_probability, market_mid]

7. generate_trade_signals()
   edge = |calibrated_prob - market_price|
   direction = "YES" if calibrated > market, "NO" else
   zscore = edge / sqrt(prob_sigma^2 + market_noise^2)
   kelly = (p*b - q) / b * KELLY_FRACTION
   passes = (confidence >= 70) AND (zscore >= 1.5)
   → TradeSignal rows
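Steps 3 and 4 above can be sketched for a single station and date as follows. The function names, the string-valued `method` and `semantic` arguments, and applying the (0.02, 0.98) clamp directly to the member-count fraction are assumptions for illustration.

```python
def bias_correct(raw: float, param: float, method: str) -> float:
    """Apply an additive offset or a multiplicative ratio to one member."""
    if method == "additive":
        return raw + param
    if method == "multiplicative":
        return raw * param
    raise ValueError(f"unknown bias method {method!r}")


def member_exceedance(
    members: list[float],
    threshold: float,
    semantic: str = "above",
    clamp: tuple[float, float] = (0.02, 0.98),
) -> float:
    """Fraction of members beyond the threshold, clamped away from 0 and 1."""
    if not members:
        raise ValueError("ensemble is empty")
    if semantic == "above":
        hits = sum(m > threshold for m in members)
    elif semantic == "below":
        hits = sum(m < threshold for m in members)
    else:
        raise ValueError(f"unknown exceedance semantic {semantic!r}")
    lo, hi = clamp
    return min(max(hits / len(members), lo), hi)


# Five raw members with a hypothetical additive bias of -1.0 °F.
raw_members = [71.2, 72.8, 73.5, 74.1, 75.0]
corrected = [bias_correct(m, -1.0, "additive") for m in raw_members]
prob_above_72 = member_exceedance(corrected, 72.0)  # 3 of 5 members exceed 72
```

The clamp keeps a unanimous ensemble from reporting a probability of exactly 0 or 1, which a small member count cannot justify.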

Constructor signature:

LiveSignalGenerator(variable="high", models=["gfs_seamless"])

The variable parameter accepts strings ("high", "low", "rainfall", etc.) which resolve via the registry, or a WeatherVariable instance directly. When a WeatherVariable is provided, the generator delegates to its methods for ensemble fetching, bias correction, and probability computation.

Multi-model support: When multiple models are specified (e.g., ["gfs_seamless", "ecmwf_ifs025"]), each model has separate bias parameters trained from its own historical forecast-vs-observed data. The ensemble members from all models are pooled for member counting.

Edge Calculator (`src/engine/edge_calculator.py`)

Converts calibrated probabilities into sized trade signals:

Edge: edge = |calibrated_prob - market_price|

Z-Score: z = edge / sqrt(prob_sigma^2 + MARKET_NOISE_SIGMA^2)

Where prob_sigma = sqrt(Brier_calibrated) from Phase 1 training, and MARKET_NOISE_SIGMA = 0.05 (assumed pricing noise).

Kelly Criterion:

f* = (p * b - q) / b
where p = win_prob, q = 1-p, b = (1 - entry_price) / entry_price
position = min(f* * KELLY_FRACTION, MAX_POSITION_PCT)
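The edge, z-score, and Kelly formulas above combine into a short sizing routine. This is a sketch under stated assumptions: the `size_trade` name and return shape are hypothetical, the NO side is modelled as buying the complementary contract at `1 - market_price`, and negative Kelly fractions are floored at zero.

```python
import math

KELLY_FRACTION = 0.25
MAX_POSITION_PCT = 0.05
MARKET_NOISE_SIGMA = 0.05


def size_trade(calibrated_prob: float, market_price: float, prob_sigma: float) -> dict:
    """Return direction, edge, z-score, full Kelly fraction, and capped position."""
    if not 0.0 < market_price < 1.0:
        raise ValueError("market_price must be strictly between 0 and 1")
    edge = abs(calibrated_prob - market_price)
    zscore = edge / math.sqrt(prob_sigma**2 + MARKET_NOISE_SIGMA**2)
    if calibrated_prob > market_price:
        direction, win_prob, entry_price = "YES", calibrated_prob, market_price
    else:
        # Assumption: a NO trade wins with probability 1 - p at price 1 - market.
        direction, win_prob, entry_price = "NO", 1.0 - calibrated_prob, 1.0 - market_price
    b = (1.0 - entry_price) / entry_price          # net odds per unit staked
    kelly = (win_prob * b - (1.0 - win_prob)) / b  # f* = (p*b - q) / b
    position = min(max(kelly, 0.0) * KELLY_FRACTION, MAX_POSITION_PCT)
    return {"direction": direction, "edge": edge, "zscore": zscore,
            "kelly": kelly, "position": position}


signal = size_trade(calibrated_prob=0.65, market_price=0.50, prob_sigma=0.10)
```

For a binary contract the Kelly expression simplifies to f* = (p - price) / (1 - price), so the example gives f* = 0.15 / 0.50 = 0.30; quarter-Kelly would be 0.075, which the 5% cap reduces to 0.05.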

Lead-time sigma scaling (Phase 3):

Lead Time	Sigma Multiplier	Rationale
D+0	0.7x	Observations available; highest confidence
D+1	1.0x	Baseline calibration window
D+2	1.3x	Moderate uncertainty increase
D+3	1.6x	Approaching ensemble skill horizon
D+4+	2.0x	Large edge required to pass filters
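The table does not state where the multiplier enters the calculation. The sketch below assumes it scales `prob_sigma` before the z-score is computed, which is consistent with the note that longer leads need a larger edge to pass the filters; the function and constant names other than `MARKET_NOISE_SIGMA` are hypothetical.

```python
import math

MARKET_NOISE_SIGMA = 0.05
LEAD_TIME_SIGMA = {0: 0.7, 1: 1.0, 2: 1.3, 3: 1.6}  # D+0 .. D+3, from the table
LEAD_TIME_SIGMA_MAX = 2.0                            # D+4 and beyond


def sigma_multiplier(lead_days: int) -> float:
    """Look up the sigma multiplier for a forecast lead time in days."""
    if lead_days < 0:
        raise ValueError("lead_days must be non-negative")
    return LEAD_TIME_SIGMA.get(lead_days, LEAD_TIME_SIGMA_MAX)


def scaled_zscore(edge: float, prob_sigma: float, lead_days: int) -> float:
    """Z-score with prob_sigma inflated (or deflated) by the lead-time multiplier."""
    sigma = prob_sigma * sigma_multiplier(lead_days)
    return edge / math.sqrt(sigma**2 + MARKET_NOISE_SIGMA**2)


# The same 0.20 edge with prob_sigma = 0.10 at two different leads.
z_d1 = scaled_zscore(0.20, 0.10, lead_days=1)
z_d3 = scaled_zscore(0.20, 0.10, lead_days=3)
```

Under this assumption the D+1 z-score is roughly 1.79 and clears a 1.5 minimum, while the same edge at D+3 scores roughly 1.19 and does not.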

Risk Scorer (`src/engine/risk_scorer.py`)

Commercial risk assessment engine for non-temperature variables. Converts probability curves into actionable risk levels (LOW, MODERATE, HIGH, EXTREME) with industry-specific labels.

Rainfall — Construction Delay Risk:

Condition	Risk Level	Label
P(>0.5") >= 0.70	EXTREME	Halt outdoor work
P(>0.25") >= 0.60	HIGH	Delay concrete/paving
P(>0.10") >= 0.50	MODERATE	Light rain, monitor
else	LOW	Dry conditions

Irradiance — Solar Shortfall Risk:

Condition	Risk Level	Label
P(CSI < 0.2) >= 0.60	EXTREME	Very low generation
P(CSI < 0.4) >= 0.50	HIGH	Significant shortfall
P(CSI < 0.6) >= 0.50	MODERATE	Below-average generation
else	LOW	Normal/above-average

Wind — Operational Risk:

Condition	Risk Level	Label
P(>50 mph) >= 0.30	EXTREME	Halt crane operations
P(>35 mph) >= 0.40	HIGH	Curtail operations
P(>25 mph) >= 0.50	MODERATE	Secure loose items
else	LOW	Light winds
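Each rule table above can be read as an ordered cascade in which the first matching row wins. The sketch below encodes the rainfall table that way; the `score_rainfall` name, the dict-based probability curve, and the top-down evaluation order are assumptions, and the output keys mirror the Risk API response fields.

```python
RAINFALL_RULES = [
    # (threshold in inches, minimum probability, risk level, label)
    (0.50, 0.70, "EXTREME", "Halt outdoor work"),
    (0.25, 0.60, "HIGH", "Delay concrete/paving"),
    (0.10, 0.50, "MODERATE", "Light rain, monitor"),
]


def score_rainfall(exceedance: dict[float, float]) -> dict:
    """Map {threshold: P(precip > threshold)} to a risk record."""
    for threshold, min_prob, level, label in RAINFALL_RULES:
        prob = exceedance.get(threshold)
        if prob is not None and prob >= min_prob:
            return {
                "variable": "rainfall",
                "risk_level": level,
                "risk_label": label,
                "trigger_threshold": threshold,
                "trigger_probability": prob,
                "detail": f"P(>{threshold} inches) = {prob:.1%}",
            }
    # No rule fired: fall through to the table's "else" row.
    return {
        "variable": "rainfall",
        "risk_level": "LOW",
        "risk_label": "Dry conditions",
        "trigger_threshold": None,
        "trigger_probability": None,
        "detail": "No rainfall rule triggered",
    }


risk = score_rainfall({0.1: 0.80, 0.25: 0.65, 0.5: 0.40})
```

Evaluating the most severe rule first matters: in the example both the MODERATE and HIGH conditions hold, and the cascade reports HIGH.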

Divergence Monitor (`src/engine/divergence_monitor.py`)

Detects forecast drift between the morning baseline and afternoon ensemble refresh:

DivergenceMonitor(threshold_pp=10)  # 10 percentage-point default

check_divergence(baseline_signals, current_probs) -> DataFrame
# For each (station, date, threshold):
#   shift_pp = (current_prob - morning_prob) * 100
#   Action: REVERSE (direction flipped), FLAG (|shift| >= threshold), HOLD (normal)

Output is saved to data/signals/divergence_YYYY-MM-DD.csv and logged to the dashboard.
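The per-row classification could look like the sketch below. It assumes that trade direction is derived by comparing the probability with a market price, as in the signal generator, and that REVERSE takes priority over FLAG; `classify_shift` and `direction` are hypothetical helper names.

```python
def direction(prob: float, market_price: float) -> str:
    """Trade direction implied by a probability relative to the market price."""
    return "YES" if prob > market_price else "NO"


def classify_shift(
    morning_prob: float,
    current_prob: float,
    market_price: float,
    threshold_pp: float = 10.0,
) -> tuple[float, str]:
    """Return (shift in percentage points, action) for one station/date/threshold."""
    shift_pp = (current_prob - morning_prob) * 100
    if direction(morning_prob, market_price) != direction(current_prob, market_price):
        action = "REVERSE"   # the refreshed forecast now points the other way
    elif abs(shift_pp) >= threshold_pp:
        action = "FLAG"      # same direction, but a large move
    else:
        action = "HOLD"
    return shift_pp, action


reversed_case = classify_shift(0.60, 0.45, market_price=0.50)
flagged_case = classify_shift(0.70, 0.58, market_price=0.50)
held_case = classify_shift(0.70, 0.66, market_price=0.50)
```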

D+0 Intraday Blending (`src/signals/d0_signal_generator.py`)

On settlement day, the engine blends live hourly METAR observations with the morning ensemble forecast:

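Observation weight ramp: weight = min(hour / 18, 0.9), where hour is the hour of day. The weight rises linearly from 0 at midnight and is capped at 0.9 from late afternoon onward (the formula as written reaches the cap at hour 16.2, so the weight is 90% by 6 PM). The ensemble forecast therefore always retains at least 10% of the weight.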

Blended probability: blended = obs_weight * metar_prob + (1 - obs_weight) * ensemble_prob

Hard overrides:

DH (daily high): If the running max already exceeds the threshold, the probability is set to 0.95 or higher (outcome near-certain)
NLL (nighttime low): If the running min is already below the threshold, the probability is set to 0.95 or higher

METAR-implied probability (DH):

running_max > threshold → 0.98 (already exceeded)
hour >= 18 AND not exceeded → 0.02 (too late)
gap = threshold - running_max
gap > 10°F → ~0.05 (unlikely to reach)
gap <= 0°F → 0.95 (nearly certain)
gap in [0, 10°F] → interpolated based on gap and remaining hours

METAR data is fetched via src/signals/metar_live.py, which parses IEM ASOS hourly data (state-specific networks, 3-letter station codes, tmpf field).
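The weight ramp and the blend formula translate directly into code. The sketch below takes the METAR-implied probability as an input, because the interpolation rule for gaps between 0 and 10°F is not specified here; the function names are hypothetical.

```python
def observation_weight(hour: float) -> float:
    """Weight on the METAR-implied probability at a given hour of day (0-24)."""
    if not 0 <= hour <= 24:
        raise ValueError("hour must be within [0, 24]")
    return min(hour / 18, 0.9)


def blend_probability(metar_prob: float, ensemble_prob: float, hour: float) -> float:
    """blended = obs_weight * metar_prob + (1 - obs_weight) * ensemble_prob."""
    w = observation_weight(hour)
    return w * metar_prob + (1 - w) * ensemble_prob


# At 09:00 the two sources are weighted equally.
morning = blend_probability(metar_prob=0.20, ensemble_prob=0.60, hour=9)
# At 18:00 the observation weight is capped at 0.9.
evening = blend_probability(metar_prob=0.02, ensemble_prob=0.60, hour=18)
```

In the 09:00 case the blend sits halfway between the two inputs at 0.40; by 18:00 the ensemble contributes only a tenth of the result.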

Design Principles

Calibration over accuracy. The engine optimises for calibrated probability (Brier score, reliability diagram) rather than point forecast accuracy (RMSE, MAE). A perfectly calibrated engine that says "70% chance" and is right 70% of the time is more valuable for risk products than a model with lower RMSE but unknown calibration.
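To make the calibration claim concrete, the sketch below computes a Brier score and a simple reliability table for a forecaster who says "70%" ten times and is right seven times. The helper names and the equal-width binning are assumptions, not the engine's scoring code.

```python
def brier_score(probs: list[float], outcomes: list[int]) -> float:
    """Mean squared difference between forecast probability and 0/1 outcome."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def reliability_table(probs: list[float], outcomes: list[int], n_bins: int = 10):
    """Per non-empty bin: (mean forecast, observed frequency, count)."""
    bins: list[list[tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, o))
    rows = []
    for members in bins:
        if members:
            mean_p = sum(p for p, _ in members) / len(members)
            freq = sum(o for _, o in members) / len(members)
            rows.append((mean_p, freq, len(members)))
    return rows


outcomes = [1] * 7 + [0] * 3
calibrated = brier_score([0.7] * 10, outcomes)       # forecasts match the hit rate
overconfident = brier_score([0.95] * 10, outcomes)   # same events, inflated forecasts
table = reliability_table([0.7] * 10, outcomes)
```

The calibrated forecaster scores 0.21 and the overconfident one 0.2725 on identical outcomes, and the reliability table shows the calibrated forecasts sitting on the diagonal (mean forecast 0.7, observed frequency 0.7).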

Synthesis over data. The engine doesn't own proprietary weather stations or run its own NWP model. Its value is in combining public NWP output with proprietary satellite data (SatSure) and behavioural signals (grid load) — then calibrating the result to produce reliable probabilities.

Adapter, not rewrite. Temperature implementations delegate to existing functions (Phase 0-3). New variables (rainfall, wind, irradiance) plug in via the WeatherVariable interface without touching the existing codebase. Zero code duplication.

Audit trail by default. Every prediction stores its full provenance: timestamp, NWP sources ingested, bias correction parameters, ensemble weights, raw ensemble probability, and calibrated output. This is the chain of custody required for Tier 2 warranty verification.

Processing Cycle

The engine runs four times daily, aligned to NWP model runs:

Run	NWP Cycle	Ingest Time	Output Available
1	00Z	~03:30 UTC	~04:00 UTC
2	06Z	~09:30 UTC	~10:00 UTC
3	12Z	~15:30 UTC	~16:00 UTC
4	18Z	~21:30 UTC	~22:00 UTC

Each cycle processes the latest NWP ensemble data, applies bias corrections, recalculates ensemble weights based on recent verification performance, and produces updated calibrated probabilities for all monitored locations and variables.

Daily automation pipeline (scripts/daily_pipeline.sh, 8 steps):

Step	Script	Description
1a	`daily_scoring.py --variable high`	Score yesterday's DH predictions against IEM observed temps
1b	`daily_scoring.py --variable low`	Score yesterday's NLL predictions against IEM observed lows
2a	`run_phase2.py --signal-only --variable high --models gfs_seamless ecmwf_ifs025`	Generate tomorrow's DH signals (multi-model)
2b	`run_phase2.py --signal-only --variable low --models gfs_seamless ecmwf_ifs025`	Generate tomorrow's NLL signals
3	`run_d0_update.py`	D+0 METAR blending for settlement-day contracts
4	`run_variable_signals.py --variable rainfall --risk-scores`	Rainfall signals + construction delay risk scores
5	`run_variable_signals.py --variable wind_speed --risk-scores`	Wind signals + operational risk scores
6	`run_variable_signals.py --variable irradiance --risk-scores`	Irradiance signals + solar shortfall risk scores
7	`run_divergence.py`	Forecast drift detection against morning baseline
8	`export_dashboard_data.py`	Export all signals, scores, and calibration data to dashboard JSON

Midday pipeline (scripts/midday_pipeline.sh, triggered via LaunchAgent at noon local time):

Step	Script	Description
1	`run_d0_update.py --variable high`	Blend METAR observations with the morning DH forecast (observation-weighted)
2	`run_d0_update.py --variable low`	Same for NLL contracts
3	`run_divergence.py --variable high`	Flag >10pp shifts or direction reversals
4	`run_divergence.py --variable low`	Same for NLL contracts

Configuration Constants (`config/settings.py`)

Trading Parameters

Constant	Value	Purpose
`CONFIDENCE_THRESHOLD`	70	Minimum confidence (edge * 100) to pass filters
`KELLY_FRACTION`	0.25	Fractional Kelly multiplier (quarter-Kelly)
`MAX_POSITION_PCT`	0.05	Maximum 5% of bankroll per trade
`ZSCORE_MINIMUM`	1.5	Minimum z-score to pass filters
`MARKET_NOISE_SIGMA`	0.05	Assumed market pricing noise

Probability Bounds

Constant	Value	Context
`PROB_CLAMP_GAUSSIAN`	(0.001, 0.999)	Backtest mode (Gaussian CDF)
`PROB_CLAMP_ENSEMBLE`	(0.02, 0.98)	Live mode (ensemble member counting)

Variable-Specific Thresholds

Variable	Thresholds	Unit
Temperature	Integer +/- `STRIKES_PER_SIDE` (4) around mean	°F
Rainfall	[0.01, 0.1, 0.25, 0.5, 1.0, 2.0]	inches
Wind Speed	[15, 20, 25, 30, 35, 40, 50]	mph
Wind Gust	[25, 30, 40, 50, 60]	mph
Irradiance	[0.2, 0.4, 0.6, 0.8, 1.0]	Clear-sky index

Risk API (`src/api/risk_api.py`)

FastAPI service providing commercial risk endpoints:

Endpoint	Method	Description
`/api/v1/health`	GET	Health check
`/api/v1/risk/{city}/{date}`	GET	All variable risks for a city/date
`/api/v1/risk/{city}/{date}/{variable}`	GET	Specific variable risk
`/api/v1/probability/{variable}/{city}/{date}`	GET	Full probability curve

Response schema:

{
  "city": "KJFK",
  "date": "2026-04-05",
  "risks": [
    {
      "variable": "rainfall",
      "risk_level": "HIGH",
      "risk_label": "Delay concrete/paving",
      "trigger_threshold": 0.25,
      "trigger_probability": 0.65,
      "detail": "P(>0.25 inches) = 65.0%"
    }
  ]
}

System Status API (`src/api/system_api.py`)

Read-only endpoints for dashboard visibility into engine state. Mounted on the same FastAPI app.

Endpoint	Method	Description
`/api/v1/system/variables`	GET	Variable registry with training status, Brier scores, last trained dates
`/api/v1/system/pipeline`	GET	Parse latest `daily_*.log` into step statuses (success/failed/not_run)
`/api/v1/system/config`	GET	Read-only dump of trading, settlement, backtest, and lead-time sigma settings

The /system/variables endpoint checks Phase 1 artifacts (bias_params.json, calibration_model.pkl, phase1_summary.json) for each registered variable to determine training status. The /system/pipeline endpoint parses the most recent pipeline log file using regex pattern matching on the [timestamp] Step N: message format.
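A sketch of how lines in the [timestamp] Step N: message format could be parsed into step statuses. The regex, the expected step list (taken from the daily pipeline table), the sample log lines, and the rule that a message containing "fail" marks the step as failed are all assumptions; the real status logic is not described here.

```python
import re

# Matches e.g. "[2026-04-05 06:00:01] Step 2a: message"
STEP_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+Step\s+(?P<step>\d+[a-z]?):\s*(?P<msg>.*)$"
)
EXPECTED_STEPS = ["1a", "1b", "2a", "2b", "3", "4", "5", "6", "7", "8"]


def parse_pipeline_log(text: str) -> dict[str, str]:
    """Return {step: "success" | "failed" | "not_run"} for every expected step."""
    statuses = {step: "not_run" for step in EXPECTED_STEPS}
    for line in text.splitlines():
        match = STEP_LINE.match(line.strip())
        if match is None:
            continue  # ignore lines that are not step markers
        step = match["step"]
        if "fail" in match["msg"].lower():
            statuses[step] = "failed"  # assumption: a failure is sticky
        elif statuses.get(step) != "failed":
            statuses[step] = "success"
    return statuses


sample_log = """\
[2026-04-05 06:00:01] Step 1a: scoring complete
[2026-04-05 06:02:10] Step 2a: FAILED to fetch ensemble
unrelated output line
"""
statuses = parse_pipeline_log(sample_log)
```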

Technology Stack

Component	Technology	Purpose
Ingest service	Python	Pulls NWP data via Open-Meteo API, SatSure API, IBKR API
Time-series store	TimescaleDB	Native time-series PostgreSQL extension
Engine pipeline	Python	Bias correction, ensemble weighting, calibration, risk scoring
API layer	FastAPI	REST endpoints for dashboard, risk API, probability curves
Dashboard	React + Tailwind	Client-facing interface (10 tabs incl. System status)
Deployment	Docker	Containerised on single VPS (Kubernetes deferred)

Source Code Map

Directory	Contents
`src/core/`	`variable.py` (ABC + config), `distribution.py` (Gaussian/Gamma/Weibull/Beta), `registry.py` (decorator + factory)
`src/variables/temperature/`	`daily_high.py`, `nighttime_low.py` — Gaussian, additive bias
`src/variables/rainfall/`	`rainfall.py` — zero-inflated Gamma, multiplicative bias, mm-to-inches conversion
`src/variables/wind/`	`wind.py`, `wind_gust.py` — Weibull, multiplicative bias, knots-to-mph conversion
`src/variables/irradiance/`	`irradiance.py`, `clear_sky.py` — Beta on CSI, ERA5 ground truth, solar geometry model
`src/engine/`	`live_signal_generator.py`, `edge_calculator.py`, `risk_scorer.py`, `divergence_monitor.py`
`src/signals/`	`d0_signal_generator.py` (METAR blending), `metar_live.py` (IEM hourly fetch)
`src/models/`	`bias_correction.py` (additive + multiplicative), `calibration.py` (isotonic), `ensemble_probability.py` (generic backtest)
`src/data_ingestion/`	`open_meteo.py` (NWP forecasts), `iem_observed.py` (IEM highs/lows/precip/wind), `open_meteo_reanalysis.py` (ERA5 irradiance)
`src/api/`	`risk_api.py` (FastAPI risk endpoints), `system_api.py` (system status: variables, pipeline, config)
`scripts/`	`daily_pipeline.sh`, `run_phase2.py`, `run_d0_update.py`, `run_divergence.py`, `run_variable_signals.py`, `run_phase_generic.py`
`config/`	`settings.py` (all constants, paths, thresholds), `cities.json` (10-city roster)
