Will an LLM be able to solve the Self-Referential Aptitude Test before 2027?
2026 · 66% chance

This refers to Jim Propp's Self-Referential Aptitude Test. The model should output the correct solution at least 75% of the time when it is suitably prompted.
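As a rough illustration of the 75% criterion, the check could be sketched as below. The function name `meets_threshold` and the idea of recording each run as a boolean are assumptions made for this example; the market description does not specify how many runs are used or how they are scored.

```python
from typing import Sequence


def meets_threshold(outcomes: Sequence[bool], threshold: float = 0.75) -> bool:
    """Return True if the fraction of correct runs is at least `threshold`.

    `outcomes` holds one boolean per independent run of the same prompt
    setup (True = the model produced the correct solution). This is a
    hypothetical scoring helper, not the market creator's actual procedure.
    """
    if not outcomes:
        raise ValueError("need at least one run to compute a success rate")
    return sum(outcomes) / len(outcomes) >= threshold
```

For example, 3 correct runs out of 4 would meet the bar exactly, while 2 out of 4 would not.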

The ability must be demonstrated before 1st Jan 2027 for this market to resolve YES. If there is an exceptionally strong suspicion that a specific non-public LM should be able to solve it, I'll wait until I can test it on that model or another very similar one. However, if someone finds a prompt setup which makes, say, GPT-4 solve it correctly, but they find it after 2026, this market still resolves NO.

The model is allowed to be arbitrarily prompted, as long as

  • There is no human interaction following the initial query (but the model is allowed to e.g. critique its outputs and refine them).

  • There is no information leak about the solution in the prompt, with the possible exception of the answer to question 20 (which is somewhat subjective).

  • The model is not allowed to use any outside tools (e.g. an interpreter) except possibly a scratchpad where it can write down its thoughts outside of its context, or similar.

The model should ideally be able to explain its reasoning in detail, which I would then check. If the model gets the right answer despite erroneous reasoning (or without thinking out loud at all), I'll default to assuming that it's just a leak, since the solution can be found online. But if there is a strong reason to suspect that it is not a leak (e.g. the model is known to display very strong logical reasoning in other contexts), I'll create variations on the test and see whether the model can solve them correctly.

Creator policy: I won't bet.

See the 2024 version:
