Which benchmarks will OpenAI show GPT-5 results on when it is announced?

Ṁ9104

Jan 1

ALL

99%

SimpleQA

99%

HumanEval

99%

MMLU

99%

GPQA

99%

SWE-Bench

99%

ARC-AGI-2

16%

MATH

14%

Big-Bench-Hard

12%

DROP

12%

MGSM

GSM8K

There is some flexibility on variations of specific benchmarks, e.g. SWE-Bench-Hard would resolve SWE-Bench YES.

Update 2025-05-11 (PST) (AI summary of creator comment): The benchmarks must be those that GPT-5 is benchmarked against by OpenAI.

Must be on roughly the same day / during / around the time of the announcement. If there are several announcements over multiple days, all those times are acceptable for the purpose of this market.

5 Comments

bought Ṁ10 SimpleQA NO

you mean benchmarked by OpenAI?

@JoshYou yeah

Surely ARC !?

https://arcprize.org/

@bbb I can't add options, I might create a duplicate where i can in a bit.

bought Ṁ30 MATH NO

@bbb Idk if I was actually able to change the settings back then, but since then I've learned how to do it, so I added ARC-AGI-2.
