Polymarket Research Toolkit
Research-first scraping + walk-forward backtester for Polymarket. Six strategies, deflated Sharpe ratios, and conservative cost models. Designed to fail loudly when no real edge exists — and it does.
In plain English
Polymarket is a website where people bet real money on real-world questions: “Will Trump win the 2024 election?”, “Will Bitcoin be above $100k by year-end?”, “Will the Fed cut rates next meeting?” Each question has two sides — YES and NO — and the prices fluctuate between $0 and $1 based on what the market thinks the probability is.
If a market is mispriced — for example, NO is trading at $0.10 but the event has been almost certain for weeks — there’s potential profit in buying the cheap side. The question is: are these mispricings real, persistent, and tradeable after fees? Or do they look real in a backtest because the backtester is lying to you?
This project is a toolkit for answering that honestly. It does three things in order:
- Scrapes every public number Polymarket exposes — every market, every historical price tick, every order book snapshot. Plus Kalshi (a US-regulated competitor) for cross-venue comparison.
- Tests trading ideas against that historical record with a backtester deliberately designed to fail when no real edge exists.
- Scans live for the few signals that survive the test, so they can actually be traded.
The interesting findings turned out to be negative — the most promising-looking strategy collapsed when tested honestly, for a specific data-quality reason explained below. That’s the project working as intended.
Anti-overfit methodology
Every result is structured to fail loudly when no real edge exists:
- Walk-forward only. Strategies see prefixes of price series, never the future.
- Discovery / test split at the universe level — the calibration strategy is fit on the first half of resolved markets and scored on the second.
- Deflated Sharpe. When you test N strategies, the best of the N looks better than it is purely through selection. Deflate the reported Sharpe for the N trials before claiming anything (Bailey & López de Prado).
- Conservative cost model. 1% taker fee + 0.5% half-spread per leg.
- Trade-count floor. Anything with fewer than 100 holdout trades is reported as “no signal yet,” not as a result.
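As an illustration of the deflation and trade-count rules above, here is a minimal sketch of the published Deflated Sharpe Ratio formula using only the standard library. It is not taken from the toolkit's code: the function names, the `sharpe_variance` input (the variance of Sharpe estimates across the strategies tried), and the choice to return `None` below the floor are all assumptions made for the example.

```python
import math
from statistics import NormalDist, mean, pstdev

EULER_GAMMA = 0.5772156649015329
MIN_HOLDOUT_TRADES = 100  # trade-count floor from the list above
_NORMAL = NormalDist()


def expected_max_sharpe(n_trials: int, sharpe_variance: float) -> float:
    """Expected best per-trade Sharpe among n_trials strategies with zero true edge."""
    if n_trials < 2:
        return 0.0
    a = _NORMAL.inv_cdf(1.0 - 1.0 / n_trials)
    b = _NORMAL.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    return math.sqrt(sharpe_variance) * ((1.0 - EULER_GAMMA) * a + EULER_GAMMA * b)


def deflated_sharpe(returns, n_trials, sharpe_variance):
    """Probabilistic Sharpe evaluated against the best-of-N null benchmark.

    Returns None ("no signal yet") below the trade-count floor or when
    the returns have zero variance. Hypothetical helper, not the toolkit's API.
    """
    t = len(returns)
    if t < MIN_HOLDOUT_TRADES:
        return None
    mu = mean(returns)
    sd = pstdev(returns)
    if sd == 0.0:
        return None
    z = [(r - mu) / sd for r in returns]
    skew = mean([x ** 3 for x in z])
    kurt = mean([x ** 4 for x in z])  # raw kurtosis, equal to 3 for a normal
    sharpe = mu / sd                  # per-trade, not annualized
    benchmark = expected_max_sharpe(n_trials, sharpe_variance)
    denom = math.sqrt(1.0 - skew * sharpe + (kurt - 1.0) / 4.0 * sharpe ** 2)
    return _NORMAL.cdf((sharpe - benchmark) * math.sqrt(t - 1) / denom)


# Same holdout returns, judged as a single trial and as the best of six.
holdout = [0.05, -0.04, 0.03, -0.03, 0.02, -0.02, 0.01, 0.0] * 15
as_single_trial = deflated_sharpe(holdout, n_trials=1, sharpe_variance=0.01)
as_best_of_six = deflated_sharpe(holdout, n_trials=6, sharpe_variance=0.01)
```

The same return series scores lower when it is treated as the best of six trials than when it is treated as a single pre-registered test, which is the point of deflating.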
Strategy suite
| Strategy | Hypothesis |
|---|---|
| `extreme_price_decay` | Buy NO when YES collapses near close (fade late confidence) |
| `favorite_hold` | Buy YES when YES is persistently ≥ 0.95 near close |
| `longshot_bias` | Short the longshot: buy NO at 0.85–0.95 |
| `complementary_arb` | YES + NO < $1; needs the live book |
| `mean_reversion` | Fade single-bar 10c spikes mid-life |
| `calibration_edge` | Data-driven, fit on first half of universe only |
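To make the walk-forward constraint concrete, here is a minimal sketch of how a strategy such as `mean_reversion` could be expressed as a function of a price prefix and driven by a loop that never shows it a future bar. The names `fade_spike` and `walk_forward`, the one-bar holding period, and the reading of the cost model as 1.5% of the traded price per leg are assumptions for illustration, not the toolkit's actual interface.

```python
from typing import Callable, List, Sequence

# 1% taker fee + 0.5% half-spread per leg, read here as a fraction of the
# traded price (an assumption about how the cost model is applied).
COST_PER_LEG = 0.01 + 0.005


def fade_spike(prefix: Sequence[float], threshold: float = 0.10) -> int:
    """Return -1 (buy NO) after an up-spike, +1 (buy YES) after a down-spike, else 0."""
    if len(prefix) < 2:
        return 0
    move = round(prefix[-1] - prefix[-2], 6)  # rounding avoids float noise at the threshold
    if move >= threshold:
        return -1
    if move <= -threshold:
        return 1
    return 0


def walk_forward(
    yes_prices: Sequence[float],
    signal: Callable[[Sequence[float]], int],
    hold_bars: int = 1,
) -> List[float]:
    """Run signal on each prefix; enter at the next bar and exit hold_bars later."""
    pnls = []
    for t in range(len(yes_prices) - 1 - hold_bars):
        side = signal(yes_prices[: t + 1])  # the strategy sees bars 0..t only
        if side == 0:
            continue
        entry = yes_prices[t + 1]
        exit_ = yes_prices[t + 1 + hold_bars]
        # side = -1 stands for holding NO, assuming NO is priced at 1 - YES.
        gross = side * (exit_ - entry)
        cost = COST_PER_LEG * (entry + exit_)
        pnls.append(gross - cost)
    return pnls


trade_pnls = walk_forward([0.50, 0.50, 0.65, 0.62, 0.56, 0.55], fade_spike)
```

Entering at the bar after the signal, rather than at the signal bar itself, is a deliberate handicap in this sketch: a spike that has already reverted by the next bar produces no profit, only costs.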
Honest empirical findings
- `complementary_arb` looked great in train, collapsed in test. Investigation: the training “edge” was a forward-fill artifact. Bar-resolution price history shows YES + NO summing to anything between 0.5 and 1.7 because each leg’s prints don’t share timestamps. After bucketing to the hour and inner-joining, real imbalances ≤ 2c essentially never appear in bar data. The arb strategy can only work against the live book. The artifact was caught only because the test split was frozen.
- Calibration analysis at the 24h horizon shows the 0–10% YES band actually resolves YES ~11% of the time (vs. 2.4% priced). The sample is large enough to be suggestive, not enough to bet on. Watch this band as more data accumulates.
- Bar-data limitations. Hourly bars are too coarse for any real microstructure work; live websocket feeds are needed for liquidity / spread strategies.
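The forward-fill artifact described above can be reproduced on toy data. The sketch below compares a naive forward-filled YES + NO sum with an hour-bucketed inner join. The tick format `(unix_seconds, price)` and the function names are assumptions for the example, and the prices are invented to show the mechanism, not taken from the scraped data.

```python
from typing import Dict, List, Tuple

Tick = Tuple[int, float]  # (unix seconds, price); hypothetical tick format


def forward_filled_sums(yes_ticks: List[Tick], no_ticks: List[Tick]) -> Dict[int, float]:
    """Naive approach: at every print, pair it with the other leg's last (possibly stale) print."""
    events = sorted(
        [(ts, "yes", p) for ts, p in yes_ticks] + [(ts, "no", p) for ts, p in no_ticks]
    )
    last: Dict[str, float] = {}
    sums: Dict[int, float] = {}
    for ts, leg, price in events:
        last[leg] = price
        if len(last) == 2:  # both legs have printed at least once
            sums[ts] = last["yes"] + last["no"]
    return sums


def last_price_per_hour(ticks: List[Tick]) -> Dict[int, float]:
    """Keep the last print inside each hour bucket."""
    buckets: Dict[int, float] = {}
    for ts, price in sorted(ticks):
        buckets[ts // 3600] = price
    return buckets


def hourly_joined_sums(yes_ticks: List[Tick], no_ticks: List[Tick]) -> Dict[int, float]:
    """Inner join: keep only the hours in which both legs actually printed."""
    yes = last_price_per_hour(yes_ticks)
    no = last_price_per_hour(no_ticks)
    return {hour: yes[hour] + no[hour] for hour in sorted(yes.keys() & no.keys())}


# YES reprices from 0.30 to 0.70 at hour 5; NO follows 30 minutes later.
yes_ticks = [(0, 0.30), (5 * 3600, 0.70)]
no_ticks = [(60, 0.70), (5 * 3600 + 1800, 0.30)]

naive = forward_filled_sums(yes_ticks, no_ticks)   # contains a stale-leg sum near 1.40
joined = hourly_joined_sums(yes_ticks, no_ticks)   # both hours sum to about 1.00
```

In the naive version the apparent imbalance exists only because the NO leg had not printed yet, so nothing was tradeable at that sum. Hour bucketing still leaves within-hour staleness, which is consistent with the conclusion that this strategy needs the live book.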
Stack
- `requests` + retry/rate-limit-aware HTTP client; SQLite for markets / prices / books
- Walk-forward engine with deflated Sharpe; reliability tables and Brier / log loss for calibration
- Live-scan loop for complementary-pair edges
- Six sprint reports + a research memo documenting the dead ends as carefully as the live ones
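For readers unfamiliar with the calibration metrics named in the stack, here is a minimal standard-library sketch of a reliability table, the Brier score, and log loss. The function names, the ten equal-width bins, and the tuple layout are assumptions for illustration and may differ from what the toolkit actually computes.

```python
import math
from typing import List, Sequence, Tuple


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between priced probability and the 0/1 resolution."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def log_loss(probs: Sequence[float], outcomes: Sequence[int], eps: float = 1e-12) -> float:
    """Mean negative log-likelihood, with probabilities clipped away from 0 and 1."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


def reliability_table(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> List[Tuple[float, float, int, float, float]]:
    """Rows of (bin_low, bin_high, count, mean_priced_prob, realized_yes_rate)."""
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the top bin
        bins[index].append((p, y))
    rows = []
    for i, members in enumerate(bins):
        if not members:
            continue  # skip empty bins instead of reporting a made-up rate
        count = len(members)
        mean_prob = sum(p for p, _ in members) / count
        yes_rate = sum(y for _, y in members) / count
        rows.append((i / n_bins, (i + 1) / n_bins, count, mean_prob, yes_rate))
    return rows


# Toy inputs: YES price at the chosen horizon and the eventual resolution.
priced = [0.05, 0.05, 0.95, 0.95]
resolved = [0, 1, 1, 1]
table = reliability_table(priced, resolved)
```

A band is miscalibrated when its realized YES rate sits far from its mean priced probability. The count column matters as much as the gap, since a wide gap on a handful of markets is not evidence.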
What it demonstrates
- Treating a backtest as a hypothesis test, not a marketing screenshot
- The discipline of letting your own strategies fail
- Microstructure thinking: knowing the difference between bar data and the book