Grok 4 Heavy gets on Humanity's Last Exam leaderboard?
Ṁ71 Sep 2
32% chance
The market resolves YES if Grok 4 Heavy has a score in the leaderboard section of https://agi.safe.ai/, regardless of settings (text-only, tools-allowed, etc.).
While the market stays open, it resolves NO when either of the following happens:
(i) The next major iteration of Grok models is released by xAI without Grok 4 Heavy being generally accessible in the official xAI API (access limited to paid users, for example, still counts as generally accessible). Examples of a next major iteration include Grok 4.2, Grok 4.5, and Grok 5.
(ii) Grok 4 Heavy has been generally accessible in the official xAI API for a month without appearing on the leaderboard.
In short, YES if Grok 4 Heavy ever appears on the HLE leaderboard; NO if either (i) a newer Grok generation ships first, or (ii) Grok 4 Heavy is on the xAI API for 30 days without reaching the leaderboard.
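The resolution rules above can be read as a small decision procedure. The following is only an illustrative sketch of that reading, not an official resolver: the names `MarketState`, `resolve`, and `api_window_days` are hypothetical, "a month" is assumed to mean 30 days as in the summary, and the inputs (leaderboard presence, API availability date, release date of the next major Grok iteration) are assumed to be determined by hand.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class MarketState:
    # All fields are hypothetical inputs, filled in manually by whoever resolves.
    on_leaderboard: bool                  # Grok 4 Heavy has a score on the HLE leaderboard
    api_available_since: Optional[date]   # first day of general xAI API access, if any
    next_major_release: Optional[date]    # release date of the next major Grok iteration, if any


def resolve(state: MarketState, today: date, api_window_days: int = 30) -> Optional[str]:
    """Return "YES", "NO", or None (still open) under the assumed reading of the rules."""
    # YES: any leaderboard score counts, regardless of settings.
    if state.on_leaderboard:
        return "YES"

    # NO (i): a next major iteration was released while Grok 4 Heavy
    # was not yet generally accessible in the official API.
    released = state.next_major_release
    if released is not None and released <= today:
        available = state.api_available_since
        if available is None or available > released:
            return "NO"

    # NO (ii): generally accessible in the API for the full window
    # without ever reaching the leaderboard.
    if state.api_available_since is not None:
        if today - state.api_available_since >= timedelta(days=api_window_days):
            return "NO"

    # Neither condition met yet: the market stays open.
    return None
```

The YES check comes first because a leaderboard score resolves the market regardless of the other conditions. Ambiguous cases, such as what counts as a "major iteration", remain a judgment call that the code does not capture.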
Related questions
Will Grok 3.5 Top the Chatbot Leaderboard? (1% chance)
Grok 4 in top left of Artificial Analysis' cost to run vs intelligence chart? (1% chance)
Open-source OpenAI model beats Grok 4 on LMArena? (19% chance)
Top score on Humanity's Last Exam > 50% by 2027? (86% chance)
What is Grok 4 Heavy's performance on METR's task length evaluation?
Top score on Humanity's Last Exam > 80% by what year?
What is Grok 4's performance on METR's task length evaluation?
Top score on Humanity's Last Exam > 90% by what year?
Top score on Humanity's Last Exam > 60% by what year?
Top score on Humanity's Last Exam > 70% by what year?