Research Complete — Awaiting Price Validation

Kalshi Weather Bot Research

Comprehensive analysis of the “reading the thermometer” strategy for automated weather contract trading on Kalshi. 171 days of data, 7 NWS stations, reviewed by Grok-4 and Gemini-2.5-Flash. February 2026.

92.9%
Highs Accuracy (within ±2°F)
69.1% → 80%+
Lows Accuracy (shifted to dawn window)
171
Backtest Days (Sep 2025 – Feb 2026)
7
Stations — KMDW, KNYC, KMIA, KLAX, KAUS, KDEN, KPHL
~53K
METAR Observations Analyzed per Station
3
AI Models — Claude Opus 4.6, Grok-4, Gemini-2.5-Flash

The Strategy — “Reading the Thermometer”

Kalshi offers daily high/low temperature contracts for 7 US cities. These contracts settle off the NWS Daily Climate Report (CLI), which uses integer °F values from ASOS weather stations. Instead of predicting the weather, we wait until the temperature is observable via real-time METAR data, then buy contracts where the current reading confirms the outcome. We call this “reading the thermometer.”
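As a minimal sketch of what "reading the thermometer" could look like in code, the snippet below tracks the lowest and highest integer °F reading seen so far in a day. The `Observation` structure, the function names, and the round-half-up conversion from Celsius are assumptions made for illustration; the source does not specify how the ASOS or CLI pipeline rounds values.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Observation:
    """One METAR-style reading (hypothetical structure for illustration)."""
    local_hour: float  # hours since local midnight, e.g. 14.5 = 2:30 PM
    temp_c: float      # air temperature in degrees Celsius


def c_to_int_f(temp_c: float) -> int:
    """Convert Celsius to integer Fahrenheit, rounding half up (assumed rule)."""
    return int(math.floor(temp_c * 9 / 5 + 32 + 0.5))


def observed_extremes(
    observations: List[Observation], cutoff_hour: float
) -> Optional[Tuple[int, int]]:
    """Return (low, high) in integer °F over readings at or before cutoff_hour.

    Returns None when no reading has been observed yet.
    """
    temps = [c_to_int_f(o.temp_c) for o in observations if o.local_hour <= cutoff_hour]
    if not temps:
        return None
    return min(temps), max(temps)


obs = [Observation(5.0, 10.0), Observation(14.0, 25.0), Observation(16.0, 27.0)]
print(observed_extremes(obs, 14.0))  # low and high observed through 2 PM
```

The extremes observed so far are only a candidate for the settlement value: a later reading can still push the high up or the low down, which is why the choice of time window matters.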

Core Insight

By the time the daily low occurs (~4–6 AM) or the daily high peaks (~2 PM), we can read the actual temperature and know with 80–99% confidence what the CLI will report the next morning.

Key Findings

Highs Are the Moneymaker
Every station crosses 80% accuracy by 2 PM local time. Miami and LA hit 97–99% by mid-afternoon. The afternoon window is reliable across all 7 cities.
Lows Need Dawn, Not Midnight
The original 12–2 AM window achieves only 30–60% accuracy. Shifting to 4–6 AM raises 5 of 7 stations above 80%. Denver and Austin never reach 80% for lows.
Accuracy Does Not Equal Profitability
Both Grok-4 and Gemini-2.5-Flash flagged that by the time we're 90%+ confident, the market may have already priced in the outcome. Historical Kalshi price data is needed to validate the edge.
Deterministic Rules Beat AI for This Task
Both AI reviewers recommended replacing the Claude decision engine with simple if/then logic. The strategy is fundamentally “is the temperature inside the bracket?”, which is not a task that requires an LLM.
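The if/then logic the reviewers recommended might be sketched as follows. This is an illustration, not the project's rule engine: the function names, the action labels, and the assumption that bracket bounds are inclusive integer °F values are all hypothetical.

```python
def in_bracket(reading_f: int, low_f: int, high_f: int) -> bool:
    """True when the reading falls inside the bracket (bounds assumed inclusive)."""
    return low_f <= reading_f <= high_f


def decide(
    reading_f: int,
    low_f: int,
    high_f: int,
    local_hour: float,
    window_start_hour: float,
) -> str:
    """Return a hypothetical action label for one contract bracket.

    WAIT    - the observation window has not opened yet
    BUY_YES - the window is open and the reading is inside the bracket
    SKIP    - the window is open but the reading is outside the bracket
    """
    if local_hour < window_start_hour:
        return "WAIT"
    if in_bracket(reading_f, low_f, high_f):
        return "BUY_YES"
    return "SKIP"


# Example: a high-temperature bracket of 76-77 °F, checked at 2:30 PM
# against a window that opens at 2 PM.
print(decide(77, 76, 77, local_hour=14.5, window_start_hour=14.0))
```

A rule of this shape is fully deterministic, so the same inputs always yield the same action and every decision can be replayed against historical data.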

Research Methodology

METAR Source: Iowa Environmental Mesonet (IEM) ASOS API
Settlement: IEM CF6 JSON API (official NWS CLI values)
Period: September 1, 2025 to February 18, 2026 (171 days)
Stations: 7 Kalshi cities (KMDW, KNYC, KMIA, KLAX, KAUS, KDEN, KPHL)
Analysis: Python (analyze.py, find_windows.py), Claude Opus 4.6
AI Review: Full architecture sent to Grok-4 & Gemini-2.5-Flash
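The headline accuracy figures are stated as "within ±2°F" of the settled value. One plausible way to compute such a metric is sketched below; the function name and input shape are assumptions, and this is not taken from analyze.py or find_windows.py.

```python
from typing import Iterable, Tuple


def accuracy_within(pairs: Iterable[Tuple[int, int]], tolerance_f: int = 2) -> float:
    """Fraction of days where the window reading is within tolerance of settlement.

    Each pair is (window_reading_f, settled_f) for one station-day, both in
    integer degrees Fahrenheit.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one (reading, settlement) pair")
    hits = sum(1 for reading, settled in pairs if abs(reading - settled) <= tolerance_f)
    return hits / len(pairs)


# Made-up values for illustration only, not backtest data.
sample = [(70, 71), (65, 68), (80, 80), (55, 53)]
print(accuracy_within(sample))
```

Running the same function over readings taken at different hours would give an accuracy-by-window curve for each station.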

Status & Next Steps

Research Phase Complete

Architecture reviewed by two independent AI models. Awaiting Kalshi historical price data to validate whether the accuracy edge survives market pricing.

  1. Pull historical Kalshi contract prices — critical; determines if edge survives pricing
  2. Run profitability backtest — accuracy × pricing for each station/window
  3. If profitable: build deterministic Node.js daemon — simple if/then rule engine, no LLM
  4. Paper trade for 2 weeks — validate live data pipeline & timing
  5. Go live with $5–10 per contract — conservative sizing, scale with data
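
Step 2 above (accuracy × pricing) can be illustrated with a simple expected-value calculation. The sketch assumes a binary contract that pays $1 when it resolves in our favor and treats fees as a flat per-contract parameter, which is a simplification; the function name and the numbers in the example are hypothetical, not backtest results.

```python
def expected_value_per_contract(win_prob: float, price: float, fee: float = 0.0) -> float:
    """Expected profit in dollars for buying one $1-payout contract at `price`.

    win_prob: estimated probability the contract resolves in our favor (0 to 1)
    price:    purchase price in dollars (0 to 1)
    fee:      assumed flat fee per contract in dollars (simplified model)
    """
    if not 0.0 <= win_prob <= 1.0:
        raise ValueError("win_prob must be between 0 and 1")
    if not 0.0 <= price <= 1.0:
        raise ValueError("price must be between 0 and 1")
    win_profit = 1.0 - price   # profit when the contract pays out
    loss = price               # amount lost when it does not
    return win_prob * win_profit - (1.0 - win_prob) * loss - fee


# Illustrative only: the same 93% accuracy at two different market prices.
print(round(expected_value_per_contract(0.93, 0.85), 4))  # positive edge
print(round(expected_value_per_contract(0.93, 0.95), 4))  # negative edge
```

Before fees the expression reduces to `win_prob - price`, so high accuracy yields a positive edge only when the market price sits below the true probability. That is the question the historical price data has to answer.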

Credits

Research & Analysis: Claude Opus 4.6 (Anthropic)
Critical Review: Grok-4 (xAI), Gemini-2.5-Flash (Google)
Strategy: Christian Froehlich & Wes Hines
Data: Iowa Environmental Mesonet, National Weather Service

Last updated: February 19, 2026