Will an AI get a perfect score in IMO 2025
31
Ṁ17k
Aug 16
5% chance

Must occur within one month of the IMO (which takes place on July 15th and 16th). The AI can't use the internet.

  • Update 2025-07-25 (PST) (AI summary of creator comment): The creator has confirmed that a condition for a YES resolution is that the AI cannot be cheating. This addresses concerns about an AI being designed with foreknowledge of the specific problems and their solutions.


It's trivial to design an AI that can solve a problem with a known solution.

added resolution criterion: they cannot be cheating

@Bayesian Define cheating.

@AndrewHebb in this case, something like trivially designing an AI that can solve a problem, because of the fact that the solution is known

@Bayesian What does that mean? The fact that the solution is known makes it arbitrarily easy to make an AI that can solve it. You need well-defined constraints on the design process so that knowledge of the solution is not inserted into it. You don't need to literally hardcode the solution. There are infinitely many ways to provide the answer more subtly.

🤷‍♂️ I agree you can easily cheat when you know the solution and you incorporate design decisions that make use of the fact that you know the solution, beyond just hardcoding the solution. i mostly think we will know something is cheating when we see it, and setting precise constraints that don't disqualify some false positives is hard

but i'd be happy to bet on something about whether some research lab officially reports getting a perfect score on the IMO 2025 within a month of it, whether or not they are blatantly cheating, vs only if it's officially verified, or bet on the difference between those two, if we disagree on the odds of this ambiguity coming up

@Bayesian thank you for adding the criterion to my market

@ai please update market desc

😆

Would something like DeepMind's silver medal attempt last year using AlphaProof + AlphaGeometry count? Or would it have to be a 'singular' model?