Dashboard Guide
A plain-English walkthrough of every tab, metric, and chart on the Cliff Horizon dashboard. No statistics background is required, just a desire to understand what you are looking at and what to do about it.
How to read this guide. For each tab, we explain what you are looking at, why it matters, and what decisions or actions you might take based on the data. If a metric is green, things are good. If it is amber or red, we will tell you what that means.
What is this dashboard?
The Cliff Horizon dashboard is the window into our weather prediction engine. The engine takes raw weather model data from multiple sources (GFS, ECMWF, FuXi-S2S, satellite imagery) and turns it into calibrated probabilities — statements like "there is a 72% chance the high temperature in New York will exceed 85°F tomorrow."
These probabilities power three products: a forecast validation tool (proving the engine works), a limited performance warranty (guaranteeing accuracy to clients), and parametric weather derivatives (financial products that pay out when weather events occur).
The dashboard has ten tabs. This guide walks through each one.
1. Overview
The Overview tab is your executive summary. It answers the question: "How is the engine performing right now?" at a single glance.
Metric cards
| Metric | What it means | What to do |
|---|---|---|
| Brier Score | The single most important number. It measures how accurate our probability forecasts are. 0 is perfect, 0.25 is useless (coin flip). Below 0.10 is excellent. The sparkline shows the recent trend. | If rising above 0.10, investigate the Calibration tab. A rising Brier means our predictions are drifting from reality. |
| Calibration | How closely our stated probabilities match real outcomes. ±3.2% means when we say "60% chance," it actually happens between 57% and 63% of the time. | If this exceeds ±8%, the warranty threshold is breached and service credits may be triggered. Check the Warranty tab. |
| Active Signals | The number of weather contracts where our engine spots a pricing difference versus the market. | More signals means more potential opportunity. Zero signals may indicate stale data — check the System tab. |
| Avg Edge | The average size of our pricing advantage. Higher means the market is more mispriced relative to our forecast. The sparkline shows the edge trend. | Edge below 2% means the market largely agrees with us. Edge above 5% is a strong signal worth acting on. |
| Warranty Status | Whether our forecast accuracy is within the guaranteed range. CLEAR means no penalty. TRIGGERED means we owe service credits. | If TRIGGERED, go to the Warranty tab immediately to see which cities and periods are affected. |
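If you want to sanity-check the headline number yourself, the Brier score is just the mean squared difference between forecast probabilities and what actually happened. A minimal sketch in Python (the function name and inputs are illustrative, not the engine's actual API):

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and
    binary outcomes (1 = event happened, 0 = it did not).
    0.0 is perfect; 0.25 matches always forecasting a coin flip."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# Three good forecasts and one confident miss (the 0.7 that did not happen):
brier_score([0.9, 0.8, 0.1, 0.7], [1, 1, 0, 0])  # ≈ 0.1375
```

Note how a single confident miss dominates the score: that is why a rising Brier sparkline deserves attention.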
Charts and alerts
Signal Edge vs Threshold — the bar chart shows how much our engine disagrees with the market for each contract. Green bars mean we think the market is pricing too low (we would buy). Red bars mean the market is pricing too high (we would sell). Taller bars represent stronger disagreements.
Engine vs Actual vs Market — three lines showing what our engine predicted (cyan), what actually happened (green), and what the market priced (grey dashed). The closer our cyan line is to the green line, the better we are doing.
Confidence Distribution — shows how confident the engine is across all signals. Higher z-scores mean stronger conviction. Z-scores above 1.5 are considered worth trading.
Brier Score by City — compares prediction accuracy across cities. Shorter bars are better. If one city is consistently worse, it may need recalibration.
Top Signal Alerts — the biggest pricing opportunities the engine has found, ranked by severity. High severity means a large mispricing — worth looking at closely.
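The Overview's notion of a tradeable signal (a z-score of at least 1.5, ranked by size of mispricing) can be sketched as a simple filter. The field names and sample values below are made up for illustration, not the dashboard's actual data model:

```python
Z_MIN = 1.5  # the conviction cut-off described above

# Hypothetical signals: edge is engine probability minus market price.
signals = [
    {"contract": "NYC high > 85F", "edge": 0.07, "z": 2.1},
    {"contract": "CHI high > 90F", "edge": 0.03, "z": 1.2},
    {"contract": "MIA low < 70F", "edge": -0.05, "z": 1.8},
]

# Keep only signals with enough statistical conviction, then rank by
# absolute edge so the biggest mispricings surface first.
tradeable = sorted(
    (s for s in signals if s["z"] >= Z_MIN),
    key=lambda s: abs(s["edge"]),
    reverse=True,
)
# → NYC first (positive edge, a buy), then MIA (negative edge, a sell)
```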
2. Forecast
The Forecast tab has two views, toggled by buttons at the top.
D+1 Contracts view
This view shows tomorrow's weather contracts. It is the engine's core output: for each city and temperature threshold, what does our engine think versus what the market thinks?
| Metric | What it means | What to do |
|---|---|---|
| Observed High/Low | Yesterday's actual temperature, from official NWS weather station data. This is the ground truth we are measured against. | Compare to what the engine predicted yesterday. Large misses might indicate model drift. |
| Engine Midpoint | The engine's probability at the middle threshold. For example, "65% chance the high exceeds 78°F." | This is our central estimate. If it is far from 50%, the engine has a strong view. |
| Best Edge | The largest pricing gap between our engine and the market, across all thresholds for this city. | This is where the biggest opportunity sits. The direction (BUY/SELL) tells you which way. |
| Tradeable Signals | How many contracts pass both our confidence and statistical significance filters. | Only these are worth acting on. The rest are noise. |
Exceedance Probability Curve — the most important chart on this tab. The cyan line is what our engine thinks. The grey dashed line is what the market thinks. Where they diverge, there is an opportunity. The x-axis shows temperature thresholds; the y-axis shows the probability of exceeding each one.
Edge by Threshold — same information as the curve, but shown as a bar chart. Green bars mean we think it is more likely than the market does; red means less likely. The purple line shows the statistical significance (z-score) of each edge.
Signal Detail table — every contract for the selected city, with engine probability, market price, edge, z-score, direction (BUY YES or BUY NO), and status (TRADE or monitor).
Subseasonal view (D+1 to D+42)
This view shows the engine's longer-range outlook, powered by the FuXi-S2S model — an AI weather model that forecasts 42 days ahead using 51 different scenarios (ensemble members).
Fan Chart — the centrepiece. The light band shows the full range of the 51 ensemble members (10th to 90th percentile). The darker inner band is the interquartile range (25th to 75th percentile — the most likely range). The cyan line is the median. As you move further into the future, the bands widen — meaning uncertainty increases. This is expected and healthy.
Exceedance Probability by Lead Day — multiple coloured lines, one per temperature threshold. Watch how the probability of exceeding each threshold changes over the 42-day horizon. Lines converging toward 50% at longer lead times means the model is losing confidence — again, normal.
The subseasonal data currently comes from historical re-forecasts (hindcasts), not live predictions. The amber banner at the bottom of the tab reminds you of this. Once live inference is set up, this view will show real-time 42-day forecasts.
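Under the hood, the fan chart and the exceedance lines are both simple summaries of the 51 ensemble members. A sketch with NumPy using synthetic members (real ones come from FuXi-S2S; the spread model here is invented purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
lead_days = np.arange(1, 43)

# Synthetic stand-in for FuXi-S2S output: 51 members x 42 lead days,
# with spread that grows as the forecast gets further out.
members = 75 + rng.normal(scale=1 + 0.2 * lead_days, size=(51, 42))

# Fan chart bands: percentiles across members at each lead day.
bands = {
    "p10": np.percentile(members, 10, axis=0),     # light band, lower edge
    "p25": np.percentile(members, 25, axis=0),     # interquartile band
    "median": np.percentile(members, 50, axis=0),  # the cyan line
    "p75": np.percentile(members, 75, axis=0),
    "p90": np.percentile(members, 90, axis=0),     # light band, upper edge
}

# Exceedance by lead day: the fraction of members above a threshold.
exceedance_85 = (members > 85).mean(axis=0)
```

The widening gap between `p10` and `p90` at longer lead days is exactly the widening of the fan's bands.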
3. Explorer
A guided wizard for assessing weather risk at any location. Walk through four steps: pick a location, select a weather peril (heat, rain, wind, etc.), choose an industry, and see the risk profile.
The output shows four risk tiers — Normal, Elevated, Severe, and Extreme — with the probability, estimated cost impact, and recommended protection for each. The Protection Waterfall chart shows how gross risk exposure is reduced by warranties and derivatives, leaving a manageable net exposure.
Use this tab when onboarding a new client or evaluating a new location. The "Export PDF" button generates a client-ready risk summary.
4. Scenario
An interactive "what if" modelling tool. Select a project (e.g. Jakarta Highway, Rajasthan Solar Farm), then use the slider to simulate increasingly severe weather events. The dashboard updates in real time to show the probability of that event, the expected delay or revenue impact, the gross cost, and the net exposure after warranty and derivative protection.
The Sales Messaging panel generates context-specific copy based on the coverage gap — useful for showing clients exactly what they are exposed to.
Explorer vs Scenario. Explorer is for new locations where you are starting from scratch. Scenario is for existing projects where you want to stress-test different severity levels.
5. Markets
The Markets tab tracks live contract pricing and historical engine performance against the market.
| Metric | What it means | What to do |
|---|---|---|
| Live Signals | Number of contracts the engine is currently tracking with a view. | More signals means more market coverage. |
| Avg Brier | Engine's average prediction accuracy across all scored contracts. | Below 0.10 is the target. Compare to the market's Brier in the chart below. |
| Direction Accuracy | What percentage of the time the engine correctly predicted whether the temperature would exceed the threshold or not. | Above 85% is strong. Below 75% warrants investigation. |
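Direction accuracy is simply the share of contracts where the engine sat on the correct side of 50%. A minimal sketch (names and sample values are illustrative):

```python
def direction_accuracy(probs, outcomes):
    """Share of forecasts where the engine's side of 50% matched the
    outcome (1 = threshold exceeded, 0 = not exceeded)."""
    hits = sum((p > 0.5) == bool(o) for p, o in zip(probs, outcomes))
    return hits / len(probs)

# The 0.65 forecast was on the wrong side; the other three were right:
direction_accuracy([0.72, 0.30, 0.65, 0.55], [1, 0, 0, 1])  # → 0.75
```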
Live Market Scanner — a table of all active contracts showing engine vs market pricing, edge, and signal strength. This is the trading desk view.
Engine vs Market Brier Score — a weekly comparison. When our line (cyan) is below the market's line (grey), we are more accurate. The gap between the lines is our competitive advantage.
6. Derivatives
This tab shows parametric weather derivative contracts — financial products that automatically pay out when a weather event occurs (e.g. "if rainfall exceeds 50mm in 24 hours, pay $100,000"). There is no claims process; the payout is triggered by the data.
Select a contract to see its trigger condition, the engine's probability of that trigger firing, the fair premium (what it should cost based on pure probability), and the market premium (what the counterparty is charging). The Pricing Waterfall breaks down how the premium is built: raw probability, plus loadings for basis risk, model uncertainty, and counterparty margin.
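The waterfall can be thought of as a running sum on top of the fair premium. The loading percentages below are hypothetical placeholders, not actual contract terms:

```python
payout = 100_000     # e.g. "pay $100,000 if rainfall exceeds 50mm in 24 hours"
trigger_prob = 0.08  # engine's calibrated probability of the trigger firing

fair_premium = trigger_prob * payout  # pure expected loss: $8,000

# Hypothetical loadings, each expressed as a fraction of the fair premium:
loadings = {
    "basis_risk": 0.10,
    "model_uncertainty": 0.15,
    "counterparty_margin": 0.20,
}
market_premium = fair_premium * (1 + sum(loadings.values()))  # ≈ $11,600
```

The gap between `fair_premium` and `market_premium` is what the Pricing Waterfall visualises, step by step.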
The interactive purchase flow (Request Cover, Configure, Review, Submit) lets you structure a derivative cover request.
Why this matters for Ensuro. The Ensuro integration connects directly to this tab. The engine's calibrated probabilities become the lossProb input to Ensuro's Risk Module, which determines the premium. Better calibration means more competitive pricing.
7. Calibration
The proof that the engine works. This is the tab you show to anyone who asks "why should I trust your numbers?"
D+1 calibration (top section)
Reliability Diagram — the most important chart in the entire dashboard. It plots what the engine predicted (x-axis) against what actually happened (y-axis). A perfect engine would produce a straight diagonal line — every point would sit exactly on the dashed line. Our engine's cyan line hugs that diagonal closely. Any deviation means the engine is either overconfident (below the line) or underconfident (above the line).
Calibration Summary — key numbers: Brier Score (lower is better; below 0.10 is excellent), MAE (mean absolute error of the raw forecast), and the number of signals evaluated.
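For the curious, a reliability diagram is built by bucketing forecasts by predicted probability and comparing each bucket's average prediction with its observed event frequency. A sketch of that binning (illustrative, not the dashboard's actual implementation):

```python
from collections import defaultdict

def reliability_points(probs, outcomes, n_bins=10):
    """Group forecasts into equal-width probability bins and return one
    (mean predicted, observed frequency) point per non-empty bin.
    Points on the diagonal y = x mean perfect calibration."""
    bins = defaultdict(list)
    for p, o in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # clamp p == 1.0 into the last bin
        bins[idx].append((p, o))
    points = []
    for idx in sorted(bins):
        pairs = bins[idx]
        mean_pred = sum(p for p, _ in pairs) / len(pairs)
        obs_freq = sum(o for _, o in pairs) / len(pairs)
        points.append((mean_pred, obs_freq))
    return points
```

Plot those points against the diagonal and you have the chart described above.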
Subseasonal backtest (bottom section)
This section appears once the backtesting scripts have been run. It shows how the engine's accuracy degrades as the forecast horizon extends from 1 day to 42 days.
Skill Decay Curve — the headline chart. The cyan line shows Brier score at each lead day (D+1 through D+42). It starts low (accurate) and rises (less accurate) as the forecast gets further out. The green dashed line at 0.10 is the target; the amber line at 0.15 is the acceptable threshold. The shaded area shows ensemble spread — wider spread means more uncertainty.
Reliability Diagrams (2x2 grid) — four small reliability diagrams, one per time period: D+1-7, D+8-14, D+15-28, D+29-42. Watch how the cyan line drifts away from the diagonal at longer horizons. This is expected — the question is how much.
Per-City Brier Table — shows accuracy broken down by city and time period. Green cells are excellent, cyan is good, amber needs attention, red is poor. If one city is consistently red, it may need additional training data or a different bias correction approach.
8. Performance
Historical engine performance over time. While Calibration asks "are the probabilities reliable?", Performance asks "how have we done in practice?"
Accuracy Trend — engine accuracy (cyan) vs market accuracy (grey dashed) over time. The purple shaded area shows the edge — how much better we are than the market. A shrinking edge might mean the market is learning from us or that our models need refreshing.
Edge Attribution — breaks down where our advantage comes from: ensemble bias correction, lead-time calibration, multi-source weighting, satellite data, and behavioural signals. Useful for understanding which components of the engine are pulling their weight.
Accuracy by Lead Time — nested bars comparing engine vs market accuracy at D+1 through D+7. The engine's advantage should be largest at D+1 (where we have the most data) and smallest at D+7 (where uncertainty is highest).
9. Warranty
Cliff Horizon offers a Limited Performance Warranty: we guarantee that our forecast accuracy stays within a specified variance threshold (currently ±8%). If we breach that threshold over a quarterly period, the client receives service credits.
Variance Gauge — the green bar shows current variance (e.g. 3.2%) against the 8% threshold. As long as the bar stays in the green zone, the warranty is clear.
Service Credit Formula — SC = (MD - VT) x WA, where MD is measured deviation, VT is the variance threshold, and WA is the warranty amount. This only kicks in if MD exceeds VT.
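A worked example of that formula, with hypothetical numbers:

```python
def service_credit(md, vt, wa):
    """SC = (MD - VT) x WA, owed only when the measured deviation MD
    exceeds the variance threshold VT; otherwise the credit is zero."""
    return max(md - vt, 0.0) * wa

service_credit(0.095, 0.08, 250_000)  # 1.5-point breach on a $250k warranty ≈ $3,750
service_credit(0.032, 0.08, 250_000)  # within threshold → 0.0
```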
Warranty History — quarter-by-quarter record of variance and whether the warranty was triggered.
10. System
The engine's health monitor. Use this tab to check that everything is running correctly.
| Metric | What it means | What to do |
|---|---|---|
| Variables Registered | How many weather variables the engine is tracking (temperature high, temperature low, rainfall, wind, irradiance). | Should be 5 or more. Fewer means some variables failed to register. |
| Pipeline Status | Whether the daily processing pipeline completed successfully. | GREEN is good. If AMBER or RED, check the Pipeline Steps section below for which step failed. |
| Data Freshness | How recent the data is. Colour-coded: green means updated in the last 24 hours, amber means 24-48 hours, red means stale (over 48 hours). | Stale data means the engine may be producing forecasts based on outdated weather models. Investigate immediately. |
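The freshness colour coding follows fixed age cut-offs, which can be expressed as a tiny lookup (the function name is ours, not the dashboard's):

```python
def freshness_status(age_hours):
    """Colour-code data age the way the Data Freshness card does:
    green if updated within 24h, amber for 24-48h, red (stale) beyond 48h."""
    if age_hours < 24:
        return "green"
    if age_hours <= 48:
        return "amber"
    return "red"

freshness_status(3)   # → "green"
freshness_status(30)  # → "amber"
freshness_status(72)  # → "red"
```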
Variable Registry — lists every registered weather variable with its statistical distribution, bias correction method, Brier scores (raw and calibrated), and last training date.
Pipeline Steps — shows each step of the daily processing pipeline with its status. Green tick means success, amber triangle means warning, red means failure. Click any step to see details.
Data Flow Diagram — a visual overview: Data Sources, then Processing, then Signals, then Output. Helps you understand the engine's architecture at a glance.
Quick Navigation Tips
Keyboard shortcuts — press the number keys 1 through 9, or 0 for the tenth tab, to jump directly to any tab (1 = Overview, 2 = Forecast, and so on). Hover over a tab to see its shortcut number.
City selector — most tabs have city pills at the top. Click a city to filter all charts and tables to that city only. Click "All" to see the aggregate view.
Variable toggle — on the Forecast and Overview tabs, you can switch between Daily High and Nighttime Low temperature variables.
Header status — the green pulsing dot in the top-right corner means the engine is running normally. If it turns amber, the pipeline has a warning. The source count next to the clock icon shows how many data feeds are active.
Loading state — when the dashboard first loads, you will see animated placeholder cards. This is normal — it is fetching the latest data. It should resolve in 1-2 seconds.
Glossary
| Term | Definition |
|---|---|
| Brier Score | A number from 0 to 1 measuring prediction accuracy. 0 is perfect. Below 0.10 is excellent. Above 0.25 is no better than guessing. |
| Calibration | Whether stated probabilities match reality. If we say 70%, does the event happen 70% of the time? |
| Edge | The difference between what our engine thinks and what the market thinks. Larger edge means larger opportunity. |
| Z-Score | A statistical measure of how significant the edge is. Above 1.5 means the edge is likely real, not noise. |
| Ensemble | Multiple model runs with slightly different starting conditions. Each run is a "member." More members means more reliable probability estimates. |
| Lead Day (D+N) | How far ahead the forecast is. D+1 = tomorrow. D+14 = two weeks. D+42 = six weeks. |
| Exceedance | The probability that a value will exceed a threshold. P(high > 85°F) = 72% means a 72% chance the high temperature beats 85°F. |
| Hindcast | A re-forecast of the past using historical data. Used to test model accuracy before deploying it live. |
| Parametric | A type of insurance or derivative where the payout is triggered automatically by measured data (e.g. rainfall exceeding a threshold), with no claims process. |
| Fan Chart | A chart showing forecast uncertainty as widening bands. The wider the bands, the more uncertain the forecast. |
| Reliability Diagram | A chart plotting predicted probability vs observed frequency. Points on the diagonal mean perfect calibration. |