MANIFOLD
Will an LLM get > 50% on hard problems on LiveCodeBench Pro?
2
Ṁ200
2026
45% chance

#Technology
#AI
#Math
#LLMs
#AI Benchmarks
2 Comments

When does this resolve?

@alphazom Jan 1, 2026

Related questions

Will an LLM consistently create 5x5 word squares by 2026?
80% chance
Will the best public LLM at the end of 2025 solve more than 5 of the first 10 Project Euler problems published in 2026?
75% chance
Will an LLM report >50% score on ARC in 2025?
85% chance
Will the GPT4+code-interpreter+search score > 1350 on Lmsys Arena Leaderboard?
49% chance
Will an LLM be able to solve Raven's Progressive Matrices from an image in 2025?
65% chance
Will an LLM agent complete >50% of the lab tasks on the Factorio Learning Environment benchmark in 2025?
30% chance
EOY 2025: Will open LLMs perform at least as well as 50 Elo below closed-source LLMs on coding?
30% chance
LLM Hallucination: Will an LLM score >90% on SimpleQA before 2026?
60% chance
Will there be an LLM which scores above what a human can do in 2 hours on METR's eval suite before 2026?
67% chance
Will LLMs be able to formally verify non-trivial programs by the end of 2025?
30% chance
