Will Qwen2-Math be at the top of the Math category of the LMSYS Chatbot Arena Leaderboard?
Ṁ88 · Sep 30
55% chance
Alibaba's newly introduced Qwen2-Math model is claimed to "outperform proprietary models, including GPT-4o and Claude 3.5, in math related downstream tasks".
This question resolves YES if the model Qwen2-Math-72B-Instruct holds rank 1 in the Math category of the LMSYS Chatbot Arena Leaderboard when its ranking is first published there. A shared first rank also counts.
If the model is not added to the leaderboard by the 30th of September 2024, the question resolves as N/A.
Related questions
What organization(s) will be ranked #1 in the LMSYS Org Chatbot Arena Leaderboard at the end of December 2024?
Will China have a model in the top 10 on LMSYS Chatbot Arena on March 1, 2025?
42% chance
Is the LMSYS chatbot arena leaderboard trustworthy?
64% chance
Who will ever rank Top 10 in LMSYS Chatbot Arena Leaderboard in 2025?
What organization will have the highest ELO score in the LMSYS Org Chatbot Arena Leaderboard at the end of Dec, 2024?
Will GPT-4-Turbo be ranked in the top 20 on the Chatbot Arena Leaderboard at the end of 2025?
24% chance
Will any open-source model rank in the top 3 on Chatbot Arena at any point in 2024? (resolves based on ELO rating)
15% chance
Who will ever rank #1 in LMSYS Chatbot Arena Leaderboard in 2025?
Will Claude Opus be ranked in the top 20 on the Chatbot Arena Leaderboard two years from today (3/10/24)?
31% chance
Will any LLM outrank GPT-4 by 150 Elo in LMSYS chatbot arena before 2025?
18% chance