When will an LLM enter and maintain 1360 ELO on LMSys for 10,000 votes? OpenAI? Gemini? Anthropic?
Jan 16
By end of October: 3%
By end of November (after October): 38%
By end of December 2024 (after November): 38%
Other: 22%

LMSys pulled a fast one on us last time.

OpenAI's O1-preview entered the rankings at 1355 ELO, and has since been voted all the way down to 1335.

https://lmarena.ai/?leaderboard


This seems fishy perhaps, but those are the breaks.

Therefore, this time we will look for a model that enters and maintains a rating of 1360 with 10,000 votes. Instead of looking at the first public checkpoint as we did before, we will resolve this once a model is at 1360+ ELO and at 10,000+ votes.

One caveat: the date that counts is when the model enters the arena, i.e. its first public posting on the leaderboard. But the market resolves only if that model goes on to meet both the ELO and the vote requirements.

So, if a model enters the arena in October (say it first shows up on the leaderboard on October 20th) but doesn't get 10,000 votes until November, that will still count as October.

Sorry it's confusing, but this is more intuitive. We don't want to bet on how long 10,000 votes take, but on whether a good model has entered the arena and will eventually meet the requirements.
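
For anyone who finds the rule easier to read as code, here is a minimal sketch of the resolution logic as described above. It is only an illustration: the `ModelSnapshot` type, the `resolution_bucket` function, and the exact date cutoffs are assumptions made for this example, not anything LMSys or the market itself provides.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Thresholds taken from the market description.
ELO_THRESHOLD = 1360
VOTE_THRESHOLD = 10_000


@dataclass
class ModelSnapshot:
    """Hypothetical record of one model's current leaderboard state."""
    name: str
    first_listed: date  # day the model first appeared on the public leaderboard
    elo: float          # current arena rating
    votes: int          # current number of votes


def resolution_bucket(model: ModelSnapshot) -> Optional[str]:
    """Return the answer this model would resolve to, or None if it
    does not (yet) meet both the rating and the vote requirement."""
    if model.elo < ELO_THRESHOLD or model.votes < VOTE_THRESHOLD:
        return None
    # The bucket depends on when the model ENTERED the arena,
    # not on when it reached 10,000 votes.
    entered = model.first_listed
    if entered <= date(2024, 10, 31):
        return "By end of October"
    if entered <= date(2024, 11, 30):
        return "By end of November (after October)"
    if entered <= date(2024, 12, 31):
        return "By end of December 2024 (after November)"
    return "Other"
```

Under this reading, a model first listed on October 20th that only crosses 10,000 votes in November still lands in the October bucket, as long as it is at 1360+ when checked.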

In other words, we are betting on when we will get a release that's noticeably better than today's ~1340 ELO models, according to the LMSys voters.
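
To put the gap in perspective, here is a rough sketch of what 1360 vs. 1340 means head to head. It assumes the standard logistic Elo formula with a 400-point scale; the arena's actual rating method may differ in detail, and `expected_win_rate` is a name made up for this example.

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo formula
    (assumed here; ties count as half a win)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# A 1360 model against one of today's ~1340 models.
edge = expected_win_rate(1360, 1340)
print(f"{edge:.1%}")
```

Under that assumption, a 20-point lead works out to the higher-rated model being preferred in roughly 53% of head-to-head votes.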

Sorry, we need more votes now, as the confidence intervals at 3,000 votes appear not to be reliable. That, or people tried to downvote O1-preview. Who knows.


Fingers crossed for new 3.5 sonnet on the board before Nov!

@ChrisPrichard Yep that’s the only way