
The model need not be released.
Update 2025-09-19 (PST) (AI summary of creator comment): Resolution will be based on Epoch's reported FrontierMath scores. Other sources (e.g., AI Digest or lab-only reports) will not determine resolution.
Epoch reported a while ago that Agent 1 scored 49% on the original FrontierMath (now Tiers 1-3) with pass@16.
https://x.com/EpochAIResearch/status/1945905802998423867
Does this count?
@qumeric Pass@16 should definitely not count... If it did, why not pass@32 or pass@64? It's clear that this market is about pass@1.
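For anyone unsure why pass@16 and pass@1 aren't comparable, here is a rough sketch using the standard unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n is the number of attempts sampled per problem and c is how many of them were correct. The function name and the numbers are made up for illustration; they are not Epoch's data or Epoch's scoring code.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n sampled attempts, c of them correct.

    Standard unbiased estimator: 1 - C(n - c, k) / C(n, k).
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k wrong attempts exist, so any k attempts include a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical problem: 16 attempts, only 1 correct.
single_try = pass_at_k(n=16, c=1, k=1)    # chance that one attempt is right
sixteen_tries = pass_at_k(n=16, c=1, k=16)  # chance that at least one of 16 is right
print(single_try, sixteen_tries)
```

A problem the model solves once in 16 tries counts as fully solved under pass@16 but contributes only about 6% under pass@1, so the score can only go up as k grows.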
Why is this so different from this market? Are both based on FrontierMath Tiers 1-3? https://manifold.markets/SG/top-frontiermath-score-in-2025
Resolution will be based on Epoch's reported FrontierMath scores.
Historically, OpenAI reported 32% for o3-mini with Python (which counts for the purposes of that other market, afaict), but Epoch, testing it with the general / minimal scaffold, got 11.03%. That likely isn't because OpenAI is making up numbers or anything like that, but they demonstrably have a different setup.
@JaundicedBaboon does this resolve according to AI Digest (which includes e.g. lab-reported scores) or according to Epoch’s evaluation?
@traders 116 days until 2026! Is a breakthrough expected over the next 4 months? Given the size of the jump from GPT-4 to GPT-5, I'm not sure why this is at 55%. I'm going to keep buying a little more NO every day.