Kalshi Weather Settlement Research
How temperature contracts actually settle, station-by-station analysis, and what it means for the bot. Research compiled Feb 19, 2026.
! The #1 Finding
Kalshi settles exclusively off the NWS Daily Climate Report (CLI). The CLI reports max/min temps as
whole-degree Fahrenheit integers. Kalshi does zero rounding of its own — it takes the CLI value as-is.
The F→C→F rounding problem only affects the real-time 5-minute time series data, not the settlement value.
→ Settlement Pipeline
How a temperature reading becomes a settlement value
Sensor (RTD): HO-1088 platinum wire, 2-5 sec sampling
→ ASOS Internal (2-min avg): averaged to integer °F, tracks daily max/min
→ DSM (Daily Summary Message): 64°F, published ~1:00 AM LST
→ CLI Report: 64°F, quality-controlled, whole-degree °F integer
→ Kalshi: 64°F, takes CLI as-is, no additional rounding
Key: The entire pipeline stays in native integer Fahrenheit. No unit conversions. The settlement value = the ASOS internal max/min from 2-minute averages, reported as a whole number.
⚠ The F→C→F Problem
Affects 5-minute time series data ONLY — not settlement
5-Minute Data Pipeline (Airport ASOS)
1. Sensor reads: 71.3°F
2. Round to whole °F: 71°F (integer)
3. Convert to Celsius: (71-32)×5/9 = 21.67°C
4. Round to whole °C: 22°C (ERROR introduced)
5. Convert back to °F: 22×9/5+32 = 71.6°F
6. Round & display: 72°F (+1°F off!)
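Result: The 5-min time series can show 72°F when the actual reading is 71°F. A single round trip shifts an integer reading by at most 1°F (rounding to whole °C loses up to 0.5°C, about 0.9°F), so the ±2°F figure used throughout this document is a conservative allowance. This is what you see when watching NWS real-time data for airport stations.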
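The six steps above can be reproduced in a few lines. This is a minimal sketch that assumes half-up rounding at every rounding step; `round_half_up` and `five_minute_display` are illustrative names, not functions from the bot.

```python
import math


def round_half_up(x: float) -> int:
    """Round to the nearest integer, with .5 going up (Python's round() uses banker's rounding)."""
    return math.floor(x + 0.5)


def five_minute_display(sensor_f: float) -> int:
    """Apply steps 2-6 of the pipeline: whole °F -> whole °C -> whole °F."""
    whole_f = round_half_up(sensor_f)                # step 2
    whole_c = round_half_up((whole_f - 32) * 5 / 9)  # steps 3-4
    return round_half_up(whole_c * 9 / 5 + 32)       # steps 5-6


print(five_minute_display(71.3))  # 72
```

Running the helper over a range of integer readings is a quick way to see which values the round trip distorts and which pass through unchanged.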
But Settlement Bypasses This Entirely
ASOS internal tracking: 71°F (stays in native °F)
→ DSM max reported: 71°F (no conversion)
→ CLI settlement: 71°F (correct value)
◉ What We See vs. What Settles
Real-Time Data Sources (what the bot watches)
METAR (hourly at :51): 0.1°C precision, HIGH accuracy
SPECI (on event): exact readings, no 5-min rounding
Settlement Source (final truth)
CLI (next morning): the ONLY source. QC'd integer °F. Definitive.
Strategy: Use METAR hourly observations as primary signal (accurate). Use 5-min data for trend direction only (frequent but noisy). Never trust 5-min data at contract boundaries. Build in ±2°F buffer.
The Buffer Rule
// Only buy when current reading is WELL AWAY from boundary
if (currentTemp <= contractBoundary - 2) {
// Safe: even with ±2°F rounding, we win
buyContract();
} else {
// Too close to boundary, rounding could burn us
pass();
}
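The check above handles only a reading that sits below an upper boundary. A minimal Python sketch of the same rule applied to both sides of a range follows; `clears_buffer`, `floor_f`, and `cap_f` are hypothetical names, and the idea that a contract can be described by a lowest and highest winning settlement value is an assumption.

```python
from typing import Optional

BUFFER_F = 2  # the ±2°F buffer from the strategy above


def clears_buffer(
    reading_f: float,
    floor_f: Optional[float],
    cap_f: Optional[float],
    buffer_f: float = BUFFER_F,
) -> bool:
    """Return True when the reading is at least buffer_f inside every boundary.

    floor_f / cap_f are the lowest / highest values that still win;
    None means the contract has no boundary on that side.
    """
    if floor_f is not None and reading_f < floor_f + buffer_f:
        return False  # too close to (or below) the lower boundary
    if cap_f is not None and reading_f > cap_f - buffer_f:
        return False  # too close to (or above) the upper boundary
    return True


print(clears_buffer(68, None, 70))  # True
print(clears_buffer(69, None, 70))  # False
```

One consequence of this form: with a 2°F buffer on each side, a range must span at least 4°F before any reading can pass.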
■ Station-by-Station Comparison
7 cities, 7 stations, different monitoring characteristics
| City | Station | Type | 5-Min Data? | F→C→F Noise | Update Freq | Real-Time Accuracy | Timezone | Low Window (CT) | High Window (CT) |
| NYC | KNYC | ASOS Park | No | Minimal | 60 min | High | ET | ~11 PM | ~10 AM |
| Chicago | KMDW | ASOS Airport | Yes | ±1-2°F | 5 min | Lower | CT | ~12 AM | ~12 PM |
| Miami | KMIA | ASOS Airport | Yes | ±1-2°F | 5 min | Lower | ET | ~11 PM | ~10 AM |
| Los Angeles | KLAX | ASOS Airport | Yes | ±1-2°F | 5 min | Lower | PT | ~2 AM | ~2 PM |
| Austin | KAUS | ASOS Airport | Yes | ±1-2°F | 5 min | Lower | CT | ~12 AM | ~12 PM |
| Denver | KDEN | ASOS Airport | Yes | ±1-2°F | 5 min | Lower | MT | ~1 AM | ~1 PM |
| Philadelphia | KPHL | ASOS Airport | Yes | ±1-2°F | 5 min | Lower | ET | ~11 PM | ~10 AM |
KNYC is the cleanest station. No 5-minute data = no F→C→F noise. Hourly METAR at :51 has 0.1°C precision. What you see is what the CLI will report. Tradeoff: updates only once per hour.
Airport stations (6 of 7) broadcast 5-min data with F→C→F artifacts. Use this for trend direction only. For actual temp reads, wait for the METAR at :51 past the hour.
⏱ DST Timing Gotcha
CLI reports in Local Standard Time, not Daylight Time
Standard Time (Nov – Mar): 12:00 AM to 11:59 PM LST (= clock time)
Normal: CLI day matches the calendar day
Daylight Saving Time (Mar – Nov): 1:00 AM to 12:59 AM next day (local clock)
Shifted: The CLI "day" starts at 1 AM local and ends at 12:59 AM the next night
Impact on lows: During DST, a low that occurs between midnight and 1 AM local time counts toward the previous CLI day. The bot must track which CLI day each reading belongs to.
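Because Local Standard Time is a fixed offset from UTC, the CLI day can be derived without any DST logic. The sketch below assumes observations arrive with timezone-aware timestamps; `cli_day` and `STANDARD_UTC_OFFSET` are illustrative names, and the offsets are the usual standard-time values for the timezones in the station table.

```python
from datetime import date, datetime, timedelta, timezone

# Standard-time UTC offsets in hours. These do not change when DST starts or ends.
STANDARD_UTC_OFFSET = {
    "KNYC": -5, "KMIA": -5, "KPHL": -5,  # ET
    "KMDW": -6, "KAUS": -6,              # CT
    "KDEN": -7,                          # MT
    "KLAX": -8,                          # PT
}


def cli_day(observed: datetime, station: str) -> date:
    """Return the CLI day (the Local Standard Time date) an observation belongs to."""
    if observed.tzinfo is None:
        raise ValueError("observed must be timezone-aware")
    as_utc = observed.astimezone(timezone.utc)
    lst = as_utc + timedelta(hours=STANDARD_UTC_OFFSET[station])
    return lst.date()


# 12:30 AM CDT on July 15 is 05:30 UTC, which is 11:30 PM LST on July 14.
print(cli_day(datetime(2026, 7, 15, 5, 30, tzinfo=timezone.utc), "KMDW"))  # 2026-07-14
```

Keying every stored reading by this date, rather than by the local calendar date, keeps the midnight-to-1 AM readings attached to the correct settlement day during DST.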
2026 DST Dates
Begins: March 8 — clocks spring forward
Ends: November 1 — clocks fall back
Most of our active trading will be during DST.
☰ CLI Report Format
Actual format from today's reports (Feb 19, 2026)
CLIMATE REPORT FOR CHICAGO MIDWAY
TEMPERATURE (F) YESTERDAY
MAXIMUM 64 1:36 PM
MINIMUM 45 11:59 PM
AVERAGE 55
CLIMATE REPORT FOR NEW YORK CENTRAL PARK
TEMPERATURE (F) YESTERDAY
MAXIMUM 42 1232 PM
MINIMUM 36 743 AM
AVERAGE 39
Settlement extraction: Parse the "MAXIMUM" or "MINIMUM" row, grab the integer. That's the settlement value. No conversion, no rounding, no interpretation needed. The CLI does the work for us.
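A minimal sketch of that extraction, assuming the report text looks like the excerpts above. `parse_cli_temps` is a hypothetical helper; it keeps only the first MAXIMUM and MINIMUM rows it finds, on the assumption that those are the observed values for the report day.

```python
import re

# Matches rows such as "MAXIMUM 64 1:36 PM" or "MINIMUM 36 743 AM"; values may be negative.
_ROW = re.compile(r"^\s*(MAXIMUM|MINIMUM)\s+(-?\d+)", re.MULTILINE)


def parse_cli_temps(report: str) -> dict:
    """Extract the first MAXIMUM and MINIMUM integers (°F) from CLI report text."""
    temps = {}
    for label, value in _ROW.findall(report):
        temps.setdefault(label, int(value))  # keep the first match per label
    return temps


sample = """CLIMATE REPORT FOR CHICAGO MIDWAY
TEMPERATURE (F) YESTERDAY
MAXIMUM 64 1:36 PM
MINIMUM 45 11:59 PM
AVERAGE 55"""

print(parse_cli_temps(sample))  # {'MAXIMUM': 64, 'MINIMUM': 45}
```

If either key is missing from the result, the report is incomplete or in an unexpected layout, and the safer choice is to skip settlement verification for that day rather than guess.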
CLI Publishing Schedule
~6:00 AM LST: Preliminary CLI issued (may be revised)
~8:00 AM LST: Final CLI issued (settlement source)
Morning: Kalshi resolves contracts against final CLI
✗ Old Bot vs. ✓ New Bot
What was wrong and how we fix it
✗ Old: settle.py (BROKEN)
- Fetched NWS observations API (real-time, preliminary)
- round(observed) at line 151: wrong data source
- Compared API reading vs contract boundary
- Subject to F→C→F rounding from 5-min data
- Could be ±2°F off from actual settlement
✓ New Approach
- Fetch CLI report directly for settlement verification
- Parse integer from MAXIMUM/MINIMUM row
- Use METAR (:51) for real-time monitoring
- 5-min data for trend only, never for decisions
- ±2°F buffer rule protects against noise
✗ Old: Prediction Model
- Made-up standard deviations (auto_trade.py:59-68)
- Normal distribution probability model
- MIN_EV_CENTS = 5 (below model error)
- Chased "value bets" that weren't actually value
- Win rate: unknown (no tracking)
✓ New: "Read the Thermometer"
- Wait until low/high is actively happening
- Claude Sonnet 4.6 reasons through live data
- Weather context: fronts, rain, wind, trends
- Target: 80%+ win rate on every trade
- Small edge × high confidence = profit
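The last bullet is plain expected-value arithmetic. The sketch below assumes a winning contract pays 100 cents and ignores fees; `expected_value_cents` is an illustrative name, not a function from the bot.

```python
def expected_value_cents(win_prob: float, price_cents: float, payout_cents: float = 100.0) -> float:
    """Expected profit per contract in cents, ignoring fees."""
    return win_prob * payout_cents - price_cents


# At an 80% win rate, a 75-cent contract has a positive expectation...
print(expected_value_cents(0.80, 75))  # roughly +5 cents per contract
# ...while an 85-cent contract loses money on average.
print(expected_value_cents(0.80, 85))  # roughly -5 cents per contract
```

Under these assumptions the win rate is also the break-even price: at 80% confidence, any fill above 80 cents has negative expectation.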
⏰ Daemon Monitoring Windows
Two active windows per day, timezone-staggered across cities
Lows Window (Evening/Overnight)
| CT Time | NYC / MIA / PHIL | CHI / AUS | DEN | LAX | Action |
| 7:00 PM | Active 8 PM ET | Warming up | - | - | Start monitoring ET cities |
| 8:00 PM | Active | Active 8 PM CT | Warming up | - | Add CT cities |
| 9:30 PM | Active | Active | Active | Active 7:30 PM PT | All cities live |
| 11 PM – 3 AM | Prime buying window (all cities) | Peak decision time for lows |
| 4:00 AM | Wind down, most lows locked in (all cities) | Stop monitoring |
Highs Window (Morning/Afternoon)
| CT Time | NYC / MIA / PHIL | CHI / AUS | DEN | LAX | Action |
| 9:00 AM | Active 10 AM ET | Warming up | - | - | Start monitoring ET cities |
| 10:00 AM | Active | Active | Active | Warming up | CT/MT cities join |
| 11:00 AM | Active | Active | Active | Active | All cities live |
| 12 PM – 4 PM | Prime buying window (all cities) | Peak decision time for highs |
| 5:00 PM | Wind down, most highs locked in (all cities) | Stop monitoring |
⚙ Bot Data Flow Architecture
How the daemon collects, reasons, and acts
Collect (every 5 min during active window)
NWS METAR API (hourly @ :51)
▼
NWS 5-min Time Series (airport only)
▼
Kalshi Orderbook (contract prices)
▼
Temp Buffer (per city, rolling 6hr)
Decide (when conditions met)
Trigger check: temp stable or falling?
▼
Build prompt: weather data + contracts + rules
▼
Shell out: claude --print (Sonnet 4.6)
▼
Parse response: BUY / PASS + reasoning
Execute (on BUY signal)
Validate: ±2°F buffer check
▼
Kalshi API: place limit order (RSA-PSS auth)
▼
Log trade to D1
▼
Discord notification
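The per-city rolling 6-hour temp buffer in the Collect stage could be as small as the sketch below. `TempBuffer` is a hypothetical class, and it assumes readings are added in chronological order.

```python
from collections import deque
from datetime import datetime, timedelta


class TempBuffer:
    """Rolling window of (timestamp, temp_f) readings for one city."""

    def __init__(self, window: timedelta = timedelta(hours=6)):
        self.window = window
        self.readings = deque()

    def add(self, when: datetime, temp_f: float) -> None:
        """Append a reading and drop anything older than the window."""
        self.readings.append((when, temp_f))
        cutoff = when - self.window
        while self.readings and self.readings[0][0] < cutoff:
            self.readings.popleft()

    def low(self) -> float:
        """Lowest temperature in the window (raises ValueError if empty)."""
        return min(temp for _, temp in self.readings)

    def high(self) -> float:
        """Highest temperature in the window (raises ValueError if empty)."""
        return max(temp for _, temp in self.readings)
```

A dictionary of these keyed by station would cover all seven cities, and the contents are small enough to serialize into the prompt for the Decide stage.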
? Remaining Unknowns
What we still need to validate before the build is bulletproof
1. Claude CLI Round-Trip Speed
How fast does claude --print respond with Sonnet 4.6 via Max subscription? If it takes 30+ seconds, do we need to pre-compute prompts? What about rate limits at 2 AM?
2. Decision Trigger Logic
When exactly does the daemon call Claude? Every 5 minutes? Only when temp trend reverses? Only after X consecutive drops? Need to define the precise trigger conditions.
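One of the candidate triggers, "only after X consecutive drops", is easy to prototype. In this sketch `consecutive_drops` and `should_call_model` are hypothetical names, and `min_drops=3` is a placeholder since the document has not yet chosen X.

```python
from typing import Sequence


def consecutive_drops(temps: Sequence[float]) -> int:
    """Count how many of the most recent steps were strictly falling."""
    drops = 0
    # Walk backwards over adjacent pairs (previous, current), newest pair first.
    for prev, cur in zip(reversed(temps[:-1]), reversed(temps[1:])):
        if cur < prev:
            drops += 1
        else:
            break
    return drops


def should_call_model(temps: Sequence[float], min_drops: int = 3) -> bool:
    """Trigger only after at least min_drops consecutive falling readings."""
    return consecutive_drops(temps) >= min_drops


print(consecutive_drops([50, 49, 48, 48, 47, 46]))  # 2
```

The same shape works for the other candidates: swap the comparison to detect a reversal, or return True unconditionally to model the every-5-minutes option.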
3. Daemon Crash Recovery
What happens if the PC sleeps, the daemon crashes, or Node.js OOMs mid-window? Need a watchdog or Task Scheduler restart strategy. State persistence between crashes.
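For the state-persistence part, a common pattern is to write to a temporary file and rename it over the old one, so a crash mid-write never leaves a half-written file. The sketch below is one way to do that; `save_state` and `load_state` are hypothetical helpers, and the JSON layout of the state is left open.

```python
import json
import os
import tempfile


def save_state(path: str, state: dict) -> None:
    """Write state atomically: a crash mid-write leaves the previous file intact."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())    # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomic when both paths are on the same filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_state(path: str) -> dict:
    """Return the saved state, or an empty dict on first run."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return {}
```

On restart the daemon would call `load_state` first and resume from whatever was last saved. The write-then-rename idea is not Python-specific and carries over to other runtimes.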
4. Backtesting
Can we backtest the "read the thermometer" strategy against historical CLI data and METAR archives before risking real money? The Iowa Environmental Mesonet (run by Iowa State University) archives historical ASOS 1-minute data.