Engine Overview

Architecture of the Cliff Horizon Weather Intelligence Engine — from data ingest to calibrated probability output and risk scoring.

The Cliff Horizon engine is a calibrated probability pipeline that transforms raw weather model output into risk-grade probability distributions. It is not a weather model — it is a post-processing and calibration layer that sits on top of existing NWP models and adds proprietary data layers.

Architecture

Development Status

Phase 4 — Multi-Variable Expansion (Complete). The engine now supports five weather variables via a polymorphic WeatherVariable architecture: temperature daily high, temperature nighttime low, rainfall, wind speed, and irradiance. Each variable has its own statistical distribution, bias correction method, and probability computation. A FastAPI risk scoring API provides commercial risk assessments for construction delay (rainfall), solar shortfall (irradiance), and wind operations. A read-only System Status API and dashboard tab provide visibility into the variable registry, pipeline health, data freshness, and engine configuration.

| Phase | Status | Description |
|---|---|---|
| Phase 0 | Complete | GFS baseline accuracy — MAE/bias/RMSE across 10 cities, 900 city-days |
| Phase 1 | Complete | Bias correction + isotonic calibration + edge calculator + backtest P&L |
| Phase 2 | Complete | Live signal generation, NLL pipeline, daily automation, dashboard integration |
| Phase 3 | Complete | Multi-model ECMWF + GFS ensemble, intraday hardening (lead-time sigma, D+0 METAR blending, divergence monitor) |
| Phase 4 | Complete | Multi-variable expansion (rainfall, irradiance, wind), WeatherVariable ABC, registry pattern, risk API, generic backtest pipeline |

Core Variable Architecture (Phase 4)

The engine uses a registry pattern to support multiple weather variables through a single unified pipeline. Each variable is a subclass of WeatherVariable with its own configuration, data ingestion, bias correction, and probability computation.

WeatherVariable ABC (src/core/variable.py)

Every weather variable implements this interface:

class WeatherVariable(ABC):
    config: VariableConfig

    def ingest_historical(self, cities, start, end, model) -> DataFrame: ...    # backtest forecasts
    def ingest_ensemble(self, cities, forecast_days, model) -> DataFrame: ...   # live ensemble
    def observe(self, cities, start, end) -> DataFrame: ...                     # ground truth
    def bias_correct(self, raw, bias_params, station, month) -> float: ...      # single-value correction
    def bias_correct_ensemble(self, df, bias_params) -> DataFrame: ...          # bulk correction
    def exceedance_probability(self, value, sigma, threshold) -> float: ...     # CDF-based
    def ensemble_exceedance_probability(self, members, threshold) -> float: ... # member counting
    def generate_thresholds(self, mean) -> list[float]: ...                     # contract strikes
VariableConfig (src/core/variable.py)

Each variable's behaviour is controlled by a configuration dataclass:

@dataclass
class VariableConfig:
    name: str                       # "rainfall"
    display_name: str               # "Daily Rainfall"
    unit: str                       # "inches"
    distribution_type: DistributionType  # GAUSSIAN, GAMMA, BETA, WEIBULL
    bias_method: BiasMethod         # ADDITIVE or MULTIPLICATIVE
    open_meteo_variable: str        # "precipitation_sum"
    forecast_col: str               # "forecast_precip_in"
    observed_col: str               # "precip_inches"
    corrected_col: str              # "corrected_precip_in"
    data_dir_name: str              # "rainfall"
    thresholds: list[float]         # [0.01, 0.1, 0.25, 0.5, 1.0, 2.0]
    threshold_integer: bool         # False for rainfall, True for temperature
    clamp_min: float | None         # 0.0 for non-negative variables
    exceedance_semantic: str        # "above" or "below"

Variable Registry (src/core/registry.py)

Variables are registered at import time via a decorator:

@register_variable("rainfall")
class Rainfall(WeatherVariable):
    ...

Lookup uses get_variable("rainfall"); aliases are also supported, so get_variable("high") resolves to "temperature_high".
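
A minimal sketch of this decorator-plus-factory pattern. The internal names (_REGISTRY, _ALIASES) are assumptions, not the engine's actual internals:

```python
# Illustrative registry internals; _REGISTRY and _ALIASES are assumed names.
_REGISTRY = {}
_ALIASES = {"high": "temperature_high", "low": "temperature_low"}

def register_variable(name):
    """Class decorator: add a WeatherVariable subclass to the registry."""
    def decorator(cls):
        _REGISTRY[name] = cls
        return cls
    return decorator

def get_variable(name):
    """Factory: resolve aliases, then instantiate the registered class."""
    canonical = _ALIASES.get(name, name)
    if canonical not in _REGISTRY:
        raise KeyError(f"unknown variable: {name!r}")
    return _REGISTRY[canonical]()

@register_variable("temperature_high")
class TemperatureHigh:  # would subclass WeatherVariable in the engine
    pass

var = get_variable("high")  # alias resolves to "temperature_high"
```

Registering at import time keeps the pipeline generic: adding a sixth variable is a new module plus one decorator, with no changes to the signal generator.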

Registered variables:

| Registry Name | Aliases | Distribution | Bias Method | Exceedance |
|---|---|---|---|---|
| temperature_high | high | Gaussian | Additive | above (P(T > threshold)) |
| temperature_low | low | Gaussian | Additive | below (P(T < threshold)) |
| rainfall | — | Gamma (zero-inflated) | Multiplicative | above (P(precip > threshold)) |
| wind_speed | — | Weibull | Multiplicative | above (P(wind > threshold)) |
| wind_gust | — | Weibull | Multiplicative | above (P(gust > threshold)) |
| irradiance | — | Beta | Multiplicative | below (P(CSI < threshold)) |

Distribution Functions (src/core/distribution.py)

Each distribution type has its own exceedance probability function:

| Distribution | Variable | Formula | Implementation |
|---|---|---|---|
| Gaussian | Temperature | P(X > T) = 1 - Phi((T - mu) / sigma) | gaussian_exceedance() |
| Gamma | Rainfall | P(X > T) = (1 - p_zero) * P(Gamma(a, b) > T) | gamma_exceedance() — zero-inflated for dry days |
| Weibull | Wind | P(X > T) = 1 - CDF_Weibull(T, k, lambda) | weibull_exceedance() |
| Beta | Irradiance | P(CSI > T) = 1 - CDF_Beta(T, alpha, beta) | beta_exceedance() — on clear-sky index [0, 1] |

Distribution parameters are fitted from training data:

  • fit_gamma_zero_inflated(values) — MLE with method-of-moments fallback
  • fit_weibull(values) — scipy weibull_min.fit with moment-based fallback
  • fit_beta(values) — Beta fit with moment-based fallback
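
The zero-inflated Gamma path can be sketched with scipy. This shows only the method-of-moments fallback described above, not the MLE path, and the sample data is made up:

```python
import numpy as np
from scipy import stats

def fit_gamma_zero_inflated(values):
    """Method-of-moments fit of a zero-inflated Gamma (fallback path only)."""
    values = np.asarray(values, dtype=float)
    p_zero = np.mean(values <= 0)     # probability of a dry day
    wet = values[values > 0]
    mean, var = wet.mean(), wet.var()
    shape = mean**2 / var             # a = mu^2 / sigma^2
    scale = var / mean                # b = sigma^2 / mu
    return p_zero, shape, scale

def gamma_exceedance(threshold, p_zero, shape, scale):
    """P(X > T) = (1 - p_zero) * P(Gamma(a, b) > T)."""
    return (1 - p_zero) * stats.gamma.sf(threshold, a=shape, scale=scale)

# e.g. two weeks of daily rainfall totals (inches), mostly dry days
obs = [0, 0, 0, 0.05, 0, 0.3, 0, 0, 1.2, 0, 0, 0.1, 0, 0.6, 0]
p_zero, a, b = fit_gamma_zero_inflated(obs)
p_quarter_inch = gamma_exceedance(0.25, p_zero, a, b)
```

The zero inflation matters: fitting a plain Gamma to a series that is two-thirds zeros would badly distort the shape parameter, whereas splitting out p_zero lets the Gamma model only the wet-day amounts.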

Engine Pipeline

Live Signal Generation (src/engine/live_signal_generator.py)

The LiveSignalGenerator class orchestrates the full pipeline from ensemble fetch to trade signal:

1. load_models()
   Load bias_params.json[model] for each NWP model
   Load calibration_model.pkl (isotonic regression)
   Extract prob_sigma = sqrt(brier_calibrated)

2. fetch_ensemble(target_date)
   Open-Meteo Ensemble API
   → [station, date, member, forecast_value]

3. bias_correct_ensemble()
   For each member: corrected = raw + get_bias(params, station, month)  [additive]
                  or corrected = raw * get_ratio(params, station, month) [multiplicative]
   → [station, date, member, corrected_value]

4. compute_probabilities()
   Generate thresholds via var.generate_thresholds() or integer +/- STRIKES_PER_SIDE
   For each threshold: P(exceed) = count(members > threshold) / n_members
   → [station, date, threshold, raw_probability]

5. calibrate()
   calibrated_prob = isotonic_model.predict(raw_probability)
   → [station, date, threshold, calibrated_probability]

6. merge_market_prices() [optional]
   Left-join with IBKR quotes
   → [station, date, threshold, calibrated_probability, market_mid]

7. generate_trade_signals()
   edge = |calibrated_prob - market_price|
   direction = "YES" if calibrated > market else "NO"
   zscore = edge / sqrt(prob_sigma^2 + market_noise^2)
   kelly = (p*b - q) / b * KELLY_FRACTION
   passes = (confidence >= 70) AND (zscore >= 1.5)
   → TradeSignal rows
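
Steps 4-5 in miniature: ensemble member counting followed by isotonic calibration. The training pairs below are fabricated for illustration; the engine loads its fitted model from calibration_model.pkl:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Step 4: raw exceedance = fraction of ensemble members above the threshold
members = np.array([71.2, 72.8, 69.5, 73.1, 70.4, 74.0, 72.2, 71.8])  # deg F
threshold = 72.0
raw_prob = np.mean(members > threshold)  # count(members > T) / n_members

# Step 5: isotonic regression maps raw ensemble frequency to a calibrated
# probability. Here we fit a toy model on made-up (raw, observed-frequency)
# pairs instead of loading the trained one.
iso = IsotonicRegression(out_of_bounds="clip")
raw_train    = [0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0]
observed_frq = [0.02, 0.08, 0.25, 0.48, 0.72, 0.93, 0.98]
iso.fit(raw_train, observed_frq)
calibrated_prob = float(iso.predict([raw_prob])[0])
```

Isotonic regression is monotone by construction, so a higher raw ensemble frequency can never produce a lower calibrated probability, which keeps threshold curves internally consistent.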

Constructor signature:

LiveSignalGenerator(variable="high", models=["gfs_seamless"])

The variable parameter accepts strings ("high", "low", "rainfall", etc.) which resolve via the registry, or a WeatherVariable instance directly. When a WeatherVariable is provided, the generator delegates to its methods for ensemble fetching, bias correction, and probability computation.

Multi-model support: When multiple models are specified (e.g., ["gfs_seamless", "ecmwf_ifs025"]), each model has separate bias parameters trained from its own historical forecast-vs-observed data. The ensemble members from all models are pooled for member counting.

Edge Calculator (src/engine/edge_calculator.py)

Converts calibrated probabilities into sized trade signals:

Edge: edge = |calibrated_prob - market_price|

Z-Score: z = edge / sqrt(prob_sigma^2 + MARKET_NOISE_SIGMA^2)

Where prob_sigma = sqrt(Brier_calibrated) from Phase 1 training, and MARKET_NOISE_SIGMA = 0.05 (assumed pricing noise).

Kelly Criterion:

f* = (p * b - q) / b
where p = win_prob, q = 1-p, b = (1 - entry_price) / entry_price
position = min(f* * KELLY_FRACTION, MAX_POSITION_PCT)
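
Putting the edge, z-score, and Kelly formulas together as one function. The constants mirror the documented config/settings.py values; the example inputs are made up:

```python
import math

# Documented constants from config/settings.py
KELLY_FRACTION = 0.25
MAX_POSITION_PCT = 0.05
MARKET_NOISE_SIGMA = 0.05

def size_position(calibrated_prob, market_price, prob_sigma):
    """Edge, z-score, and fractional-Kelly position for one binary contract."""
    edge = abs(calibrated_prob - market_price)
    zscore = edge / math.sqrt(prob_sigma**2 + MARKET_NOISE_SIGMA**2)
    if calibrated_prob > market_price:   # buy YES at market_price
        p, entry = calibrated_prob, market_price
    else:                                # buy NO at (1 - market_price)
        p, entry = 1 - calibrated_prob, 1 - market_price
    b = (1 - entry) / entry              # net odds per unit staked
    kelly = max((p * b - (1 - p)) / b, 0.0)
    position = min(kelly * KELLY_FRACTION, MAX_POSITION_PCT)
    return edge, zscore, position

# e.g. model says 70%, market prices 55%: full Kelly is ~33%,
# quarter-Kelly ~8.3%, then capped at the 5% position limit
edge, z, pos = size_position(0.70, 0.55, prob_sigma=0.12)
```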

Lead-time sigma scaling (Phase 3):

| Lead Time | Sigma Multiplier | Rationale |
|---|---|---|
| D+0 | 0.7x | Observations available; highest confidence |
| D+1 | 1.0x | Baseline calibration window |
| D+2 | 1.3x | Moderate uncertainty increase |
| D+3 | 1.6x | Approaching ensemble skill horizon |
| D+4+ | 2.0x | Large edge required to pass filters |

Risk Scorer (src/engine/risk_scorer.py)

Commercial risk assessment engine for non-temperature variables. Converts probability curves into actionable risk levels (LOW, MODERATE, HIGH, EXTREME) with industry-specific labels.

Rainfall — Construction Delay Risk:

| Condition | Risk Level | Label |
|---|---|---|
| P(>0.5") >= 0.70 | EXTREME | Halt outdoor work |
| P(>0.25") >= 0.60 | HIGH | Delay concrete/paving |
| P(>0.10") >= 0.50 | MODERATE | Light rain, monitor |
| else | LOW | Dry conditions |

Irradiance — Solar Shortfall Risk:

| Condition | Risk Level | Label |
|---|---|---|
| P(CSI < 0.2) >= 0.60 | EXTREME | Very low generation |
| P(CSI < 0.4) >= 0.50 | HIGH | Significant shortfall |
| P(CSI < 0.6) >= 0.50 | MODERATE | Below-average generation |
| else | LOW | Normal/above-average |

Wind — Operational Risk:

| Condition | Risk Level | Label |
|---|---|---|
| P(>50 mph) >= 0.30 | EXTREME | Halt crane operations |
| P(>35 mph) >= 0.40 | HIGH | Curtail operations |
| P(>25 mph) >= 0.50 | MODERATE | Secure loose items |
| else | LOW | Light winds |
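
These tiered tables suggest a first-match walk from most to least severe rule. A sketch for the wind table (the function and rule-table names here are assumptions):

```python
# Rule rows mirror the wind table: (threshold_mph, min_probability, level, label)
WIND_RULES = [
    (50, 0.30, "EXTREME", "Halt crane operations"),
    (35, 0.40, "HIGH", "Curtail operations"),
    (25, 0.50, "MODERATE", "Secure loose items"),
]

def score_wind_risk(prob_curve):
    """prob_curve maps threshold (mph) -> P(wind > threshold).
    Return the first (most severe) rule whose probability fires."""
    for threshold, min_prob, level, label in WIND_RULES:
        if prob_curve.get(threshold, 0.0) >= min_prob:
            return level, label
    return "LOW", "Light winds"

# Probable moderate winds, unlikely severe winds -> MODERATE
curve = {25: 0.62, 35: 0.18, 50: 0.03}
```

Checking severe rules first means a 30% chance of 50 mph winds outranks a near-certain 25 mph day, which is the conservative ordering an operations customer wants.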

Divergence Monitor (src/engine/divergence_monitor.py)

Detects forecast drift between the morning baseline and afternoon ensemble refresh:

DivergenceMonitor(threshold_pp=10)  # 10 percentage-point default

check_divergence(baseline_signals, current_probs) -> DataFrame
# For each (station, date, threshold):
#   shift_pp = (current_prob - morning_prob) * 100
#   Action: REVERSE (direction flipped), FLAG (|shift| >= threshold), HOLD (normal)

Output is saved to data/signals/divergence_YYYY-MM-DD.csv and logged to the dashboard.
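
The per-row check can be sketched as follows, with direction handling simplified to explicit YES/NO labels:

```python
THRESHOLD_PP = 10  # percentage points (the documented default)

def classify(morning_prob, current_prob, morning_dir, current_dir):
    """Classify one (station, date, threshold) row against the baseline."""
    shift_pp = (current_prob - morning_prob) * 100
    if morning_dir != current_dir:
        return shift_pp, "REVERSE"   # direction flipped since morning
    if abs(shift_pp) >= THRESHOLD_PP:
        return shift_pp, "FLAG"      # large probability shift
    return shift_pp, "HOLD"         # normal drift
```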

D+0 Intraday Blending (src/signals/d0_signal_generator.py)

On settlement day, the engine blends live hourly METAR observations with the morning ensemble forecast:

Observation weight ramp: weight = min(hour / 18, 0.9) — linear from 0% at midnight to 90% at 6 PM.

Blended probability: blended = obs_weight * metar_prob + (1 - obs_weight) * ensemble_prob

Hard overrides:

  • DH (daily high): if the running max already exceeds the threshold, the probability is set to ~0.95+ (outcome near-certain)
  • NLL (nighttime low): if the running min is already below the threshold, the probability is set to ~0.95+

METAR-implied probability (DH):

running_max > threshold → 0.98 (already exceeded)
hour >= 18 AND not exceeded → 0.02 (too late)
gap = threshold - running_max
gap > 10°F → ~0.05 (unlikely to reach)
gap <= 0°F → 0.95 (nearly certain)
gap in [0, 10°F] → interpolated based on gap and remaining hours
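
The weight ramp, METAR-implied probability, and blend above can be combined as follows. The linear interpolation inside the 0-10°F gap is an illustrative assumption, not the engine's exact formula:

```python
def obs_weight(hour):
    """Linear ramp: 0.0 at midnight up to a 0.9 cap at hour 18 (6 PM)."""
    return min(hour / 18, 0.9)

def metar_implied_prob_dh(running_max, threshold, hour):
    """METAR-implied P(daily high > threshold), per the rules above."""
    if running_max > threshold:
        return 0.98                 # already exceeded
    if hour >= 18:
        return 0.02                 # too late in the day
    gap = threshold - running_max
    if gap > 10:
        return 0.05                 # unlikely to reach
    if gap <= 0:
        return 0.95                 # nearly certain
    return 0.95 - 0.09 * gap        # assumed linear ramp across the 0-10°F gap

def blended_prob(hour, running_max, threshold, ensemble_prob):
    """blended = obs_weight * metar_prob + (1 - obs_weight) * ensemble_prob."""
    w = obs_weight(hour)
    return w * metar_implied_prob_dh(running_max, threshold, hour) \
        + (1 - w) * ensemble_prob
```

By evening the blend is 90% observation-driven, so a morning forecast that has clearly busted cannot dominate a settlement-day signal.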

METAR data is fetched via src/signals/metar_live.py which parses IEM ASOS hourly data (state-specific networks, 3-letter station codes, tmpf field).

Design Principles

Calibration over accuracy. The engine optimises for calibrated probability (Brier score, reliability diagram) rather than point forecast accuracy (RMSE, MAE). A perfectly calibrated engine that says "70% chance" and is right 70% of the time is more valuable for risk products than a model with lower RMSE but unknown calibration.

Synthesis over data. The engine doesn't own proprietary weather stations or run its own NWP model. Its value is in combining public NWP output with proprietary satellite data (SatSure) and behavioural signals (grid load) — then calibrating the result to produce reliable probabilities.

Adapter, not rewrite. Temperature implementations delegate to existing functions (Phase 0-3). New variables (rainfall, wind, irradiance) plug in via the WeatherVariable interface without touching the existing codebase. Zero code duplication.

Audit trail by default. Every prediction stores its full provenance: timestamp, NWP sources ingested, bias correction parameters, ensemble weights, raw ensemble probability, and calibrated output. This is the chain of custody required for Tier 2 warranty verification.

Processing Cycle

The engine runs four times daily, aligned to NWP model runs:

| Run | NWP Cycle | Ingest Time | Output Available |
|---|---|---|---|
| 1 | 00Z | ~03:30 UTC | ~04:00 UTC |
| 2 | 06Z | ~09:30 UTC | ~10:00 UTC |
| 3 | 12Z | ~15:30 UTC | ~16:00 UTC |
| 4 | 18Z | ~21:30 UTC | ~22:00 UTC |

Each cycle processes the latest NWP ensemble data, applies bias corrections, recalculates ensemble weights based on recent verification performance, and produces updated calibrated probabilities for all monitored locations and variables.

Daily automation pipeline (scripts/daily_pipeline.sh, 8 steps):

| Step | Script | Description |
|---|---|---|
| 1a | daily_scoring.py --variable high | Score yesterday's DH predictions against IEM observed temps |
| 1b | daily_scoring.py --variable low | Score yesterday's NLL predictions against IEM observed lows |
| 2a | run_phase2.py --signal-only --variable high --models gfs_seamless ecmwf_ifs025 | Generate tomorrow's DH signals (multi-model) |
| 2b | run_phase2.py --signal-only --variable low --models gfs_seamless ecmwf_ifs025 | Generate tomorrow's NLL signals |
| 3 | run_d0_update.py | D+0 METAR blending for settlement-day contracts |
| 4 | run_variable_signals.py --variable rainfall --risk-scores | Rainfall signals + construction delay risk scores |
| 5 | run_variable_signals.py --variable wind_speed --risk-scores | Wind signals + operational risk scores |
| 6 | run_variable_signals.py --variable irradiance --risk-scores | Irradiance signals + solar shortfall risk scores |
| 7 | run_divergence.py | Forecast drift detection against morning baseline |
| 8 | export_dashboard_data.py | Export all signals, scores, and calibration data to dashboard JSON |

Midday pipeline (scripts/midday_pipeline.sh, triggered via LaunchAgent at noon local time):

| Step | Script | Description |
|---|---|---|
| 1 | run_d0_update.py --variable high | Bayesian-blend METAR observations with morning DH forecast |
| 2 | run_d0_update.py --variable low | Same for NLL contracts |
| 3 | run_divergence.py --variable high | Flag >10pp shifts or direction reversals |
| 4 | run_divergence.py --variable low | Same for NLL contracts |

Configuration Constants (config/settings.py)

Trading Parameters

| Constant | Value | Purpose |
|---|---|---|
| CONFIDENCE_THRESHOLD | 70 | Minimum confidence (edge * 100) to pass filters |
| KELLY_FRACTION | 0.25 | Fractional Kelly multiplier (quarter-Kelly) |
| MAX_POSITION_PCT | 0.05 | Maximum 5% of bankroll per trade |
| ZSCORE_MINIMUM | 1.5 | Minimum z-score to pass filters |
| MARKET_NOISE_SIGMA | 0.05 | Assumed market pricing noise |

Probability Bounds

| Constant | Value | Context |
|---|---|---|
| PROB_CLAMP_GAUSSIAN | (0.001, 0.999) | Backtest mode (Gaussian CDF) |
| PROB_CLAMP_ENSEMBLE | (0.02, 0.98) | Live mode (ensemble member counting) |

Variable-Specific Thresholds

| Variable | Thresholds | Unit |
|---|---|---|
| Temperature | Integer +/- STRIKES_PER_SIDE (4) around mean | °F |
| Rainfall | [0.01, 0.1, 0.25, 0.5, 1.0, 2.0] | inches |
| Wind Speed | [15, 20, 25, 30, 35, 40, 50] | mph |
| Wind Gust | [25, 30, 40, 50, 60] | mph |
| Irradiance | [0.2, 0.4, 0.6, 0.8, 1.0] | Clear-sky index |

Risk API (src/api/risk_api.py)

FastAPI service providing commercial risk endpoints:

| Endpoint | Method | Description |
|---|---|---|
| /api/v1/health | GET | Health check |
| /api/v1/risk/{city}/{date} | GET | All variable risks for a city/date |
| /api/v1/risk/{city}/{date}/{variable} | GET | Specific variable risk |
| /api/v1/probability/{variable}/{city}/{date} | GET | Full probability curve |

Response schema:

{
  "city": "KJFK",
  "date": "2026-04-05",
  "risks": [
    {
      "variable": "rainfall",
      "risk_level": "HIGH",
      "risk_label": "Delay concrete/paving",
      "trigger_threshold": 0.25,
      "trigger_probability": 0.65,
      "detail": "P(>0.25 inches) = 65.0%"
    }
  ]
}
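
A minimal client for the city/date endpoint using only the standard library. The localhost base URL is an assumption about deployment:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed local deployment

def risk_url(city: str, date: str) -> str:
    """Build the path documented in the endpoint table above."""
    return f"{BASE_URL}/api/v1/risk/{city}/{date}"

def get_city_risks(city: str, date: str) -> list[dict]:
    """GET /api/v1/risk/{city}/{date} and return the risks array."""
    with urllib.request.urlopen(risk_url(city, date), timeout=10) as resp:
        return json.load(resp)["risks"]

# Example (requires a running API):
# for risk in get_city_risks("KJFK", "2026-04-05"):
#     print(risk["variable"], risk["risk_level"], "->", risk["risk_label"])
```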

System Status API (src/api/system_api.py)

Read-only endpoints for dashboard visibility into engine state. Mounted on the same FastAPI app.

| Endpoint | Method | Description |
|---|---|---|
| /api/v1/system/variables | GET | Variable registry with training status, Brier scores, last trained dates |
| /api/v1/system/pipeline | GET | Parse latest daily_*.log into step statuses (success/failed/not_run) |
| /api/v1/system/config | GET | Read-only dump of trading, settlement, backtest, and lead-time sigma settings |

The /system/variables endpoint checks Phase 1 artifacts (bias_params.json, calibration_model.pkl, phase1_summary.json) for each registered variable to determine training status. The /system/pipeline endpoint parses the most recent pipeline log file using regex pattern matching on the [timestamp] Step N: message format.
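
The log-parsing step might look like this; everything beyond the [timestamp] Step N: prefix format stated above (sample log lines, the success/failed heuristic) is an assumption:

```python
import re

# Matches lines like "[2026-04-05 06:00:01] Step 2a: run_phase2.py ..."
STEP_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] Step (?P<num>\w+): (?P<msg>.*)$")

def parse_pipeline_log(lines):
    """Reduce a pipeline log to {step: status}; absent steps read as not_run."""
    steps = {}
    for line in lines:
        m = STEP_RE.match(line)
        if m:
            failed = "fail" in m.group("msg").lower()
            steps[m.group("num")] = "failed" if failed else "success"
    return steps

log = [
    "[2026-04-05 06:00:01] Step 1a: daily_scoring.py completed",
    "[2026-04-05 06:03:12] Step 2a: run_phase2.py FAILED (timeout)",
]
statuses = parse_pipeline_log(log)
```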

Technology Stack

| Component | Technology | Purpose |
|---|---|---|
| Ingest service | Python | Pulls NWP data via Open-Meteo API, SatSure API, IBKR API |
| Time-series store | TimescaleDB | Native time-series PostgreSQL extension |
| Engine pipeline | Python | Bias correction, ensemble weighting, calibration, risk scoring |
| API layer | FastAPI | REST endpoints for dashboard, risk API, probability curves |
| Dashboard | React + Tailwind | Client-facing interface (10 tabs incl. System status) |
| Deployment | Docker | Containerised on single VPS (Kubernetes deferred) |

Source Code Map

| Directory | Contents |
|---|---|
| src/core/ | variable.py (ABC + config), distribution.py (Gaussian/Gamma/Weibull/Beta), registry.py (decorator + factory) |
| src/variables/temperature/ | daily_high.py, nighttime_low.py — Gaussian, additive bias |
| src/variables/rainfall/ | rainfall.py — zero-inflated Gamma, multiplicative bias, mm-to-inches conversion |
| src/variables/wind/ | wind.py, wind_gust.py — Weibull, multiplicative bias, knots-to-mph conversion |
| src/variables/irradiance/ | irradiance.py, clear_sky.py — Beta on CSI, ERA5 ground truth, solar geometry model |
| src/engine/ | live_signal_generator.py, edge_calculator.py, risk_scorer.py, divergence_monitor.py |
| src/signals/ | d0_signal_generator.py (METAR blending), metar_live.py (IEM hourly fetch) |
| src/models/ | bias_correction.py (additive + multiplicative), calibration.py (isotonic), ensemble_probability.py (generic backtest) |
| src/data_ingestion/ | open_meteo.py (NWP forecasts), iem_observed.py (IEM highs/lows/precip/wind), open_meteo_reanalysis.py (ERA5 irradiance) |
| src/api/ | risk_api.py (FastAPI risk endpoints), system_api.py (system status: variables, pipeline, config) |
| scripts/ | daily_pipeline.sh, run_phase2.py, run_d0_update.py, run_divergence.py, run_variable_signals.py, run_phase_generic.py |
| config/ | settings.py (all constants, paths, thresholds), cities.json (10-city roster) |