Will GPT-5 score Bronze or better on IMO 2025?
92
Ṁ28k
Sep 1
21% chance

GPT-5 with scaffolding or access to tools counts as long as GPT-5 is making real decisions.

Resolves No if GPT-5 doesn’t score Bronze or higher, if it does not exist by then, or if nobody makes an IMO attempt by the end of August 2025.

  • Update 2025-03-01 (PST): o3 or future o series models do not count as GPT-5 for this question. Models called GPT-4x would not count. If OpenAI abandons the GPT-X naming scheme and comes out with a new flagship model to replace GPT-4 that's not part of the o series, that counts.

  • Update 2025-08-07 (PST) (AI summary of creator comment): Multiple instances with consensus (like GPT-5 Pro) are allowed. Web search is not allowed.

  • Update 2025-08-07 (PST) (AI summary of creator comment): GPT-5 must be able to get bronze the majority of the time, or get an average score of bronze or higher

  • Update 2025-08-14 (PST) (AI summary of creator comment):

    • No hints in prompts: Attempts that include any hints in the problem prompt will not count toward resolution.

    • Remain open until market close (end of August 2025): The market will stay open until close to allow further attempts with improved prompting/scaffolding; early failed attempts alone will not trigger a No resolution.
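
The "majority of the time, or average score" rule from the 2025-08-07 update can be written as a small check. This is only a sketch of one reading of that rule, not an official resolution procedure: the function name `meets_bronze_criterion` is hypothetical, the 19-point Bronze cutoff is the figure cited in the comments below, and each attempt is assumed to produce a single total score out of 42.

```python
from statistics import mean

# Bronze cutoff as cited in the comment thread (assumption: applies to every attempt).
BRONZE_CUTOFF = 19


def meets_bronze_criterion(scores, cutoff=BRONZE_CUTOFF):
    """Return True if the attempts satisfy either stated condition.

    scores: list of total points (0-42), one per independent attempt.
    Condition A: strictly more than half of the attempts reach the cutoff.
    Condition B: the mean score across attempts reaches the cutoff.
    """
    if not scores:
        # No attempts means nothing to resolve on.
        return False
    bronze_runs = sum(1 for s in scores if s >= cutoff)
    majority = bronze_runs > len(scores) / 2
    average = mean(scores) >= cutoff
    return majority or average


# Hypothetical score lists, for illustration only.
print(meets_bronze_criterion([16, 16, 16]))   # neither condition holds
print(meets_bronze_criterion([19, 20, 10]))   # majority condition holds
print(meets_bronze_criterion([10, 10, 42]))   # average condition holds
```

Under this reading, a set of attempts averaging 16 points falls short on both conditions unless more than half of the individual runs reach 19.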


Has anyone tested this with GPT-5 pro yet?

Thanks, seems like a solid attempt. They averaged 16 points and Bronze is 19 points, so not far off. Sorry the original criteria were unclear; I was expecting that OpenAI would likely have an official attempt. I think I'll keep the market open until market close at the end of August to see if anyone achieves Bronze with better prompting or scaffolding. Though if they include hints in the prompt, that will not count.

bought Ṁ250 NO

@ahalekelly It costs $200 for one attempt. I don't think we'll get many more.

sold Ṁ12 NO

@JDVance1 This bot needs to chill

A single instance of GPT-5? E.g., no heavy or pro modes that spawn tens of instances and rely on consensus?

@patrik multiple instances with consensus, like GPT-5 Pro, are allowed. Web search is not.

Also must be able to get bronze the majority of the time, or get an average score of bronze or higher

bought Ṁ50 YES

I have seen 2 people claiming they reached gold (35/42) with scaffolding of public models

@qumeric the people i've seen claiming this have been "kinda cheating", eg giving the model hints as to what direction to attempt to solve the problem from. if you know cases not like that i'd be curious to see them. others just take much more than 4.5 hours per set of 3 problems

o3 or future o series models do not count as GPT-5 for this question. GPT-4x would not count. If OpenAI abandons the GPT-X naming scheme and comes out with a new flagship model to replace GPT-4 that's not part of the o series, that counts.

bought Ṁ100 NO

@ahalekelly If GPT-5 can use reasoning models the way it uses search or other features, so that you can get it to solve the IMO through o3, does it count?

@Bayesian hmm no, since it wouldn't be GPT-5 doing the solving

Is it fine if it’s a math-specialized version of GPT-5 (involving math-specific post training consisting of, say, <5% of pretraining compute)?

Yeah that sounds reasonable