MANIFOLD
Will an LLM get > 50% on hard problems on LiveCodeBench Pro?
3
Ṁ360
Jan 1
50% chance

#AI
#Technology
#Math
#LLMs
#AI Benchmarks
2 Comments

when does this resolve?

@alphazom Jan 1, 2026

Related questions

Will an LLM be able to solve Raven's Progressive Matrices from an image in 2025?
65% chance
EOY 2025: Will open LLMs perform at least as well as 50 Elo below closed-source LLMs on coding?
62% chance
Will an LLM report >50% score on ARC in 2025?
99% chance
Will the best public LLM at the end of 2025 solve more than 5 of the first 10 Project Euler problems published in 2026?
75% chance
Will LLMs be able to formally verify non-trivial programs by the end of 2025?
27% chance
Will an LLM agent complete >50% of the lab tasks on the Factorio Learning Environment benchmark in 2025?
48% chance
LLM reaches >90% Brier score on Prophet Arena by 2026?
5% chance
LLM Hallucination: Will an LLM score >90% on SimpleQA before 2026?
60% chance
Will there be an LLM which scores above what a human can do in 2 hours on METR's eval suite before 2026?
75% chance
Will an LLM consistently create 5x5 word squares by 2026?
84% chance