Will GPT-5 have a rating of at least 2000 in chess?

Ṁ15k

2026

An estimated rating is fine. Filtering out illegal moves is allowed.

Update 2025-08-12 (PST) (AI summary of creator comment): The rating standard for resolution will be FIDE rating; it may be estimated.

#GPT-5 Speculation

#Chess

24 Comments

Is this chess.com, lichess.org, or classical otb rating, or something else?

@KeithManning FIDE is the standard, but of course I would have to estimate.

I just played a bit against it. It played 20 standard moves against a Ruy Lopez, then got confused. Asking it to relist all the moves reset it and let us play further, but it gave up a pawn for nothing 2 moves later.

I simplified a bit and then it gave up a rook on move 29 (although, otherwise, I would have gotten another pawn and my position was overall better in terms of space and pawn structure). It kept getting more confused and making more and more illegal moves/blunders (for example, it left a knight hanging on c5, so I took it and it replied dxc5 even though there was no pawn on d6, so I just got a free knight). After 34 moves, I was up a rook, a knight, and multiple pawns, so I declared myself the winner.

bought Ṁ1,000 NO

@LuisPedroCoelho A 2000 Elo player would crush me even if I was trying my best, BTW

@LuisPedroCoelho Which version though?

@LuisPedroCoelho What if you pass in the full board position on each move?

Alright, how should we resolve this. Maybe set up a bot on chess.com that uses GPT-5 moves?

@IsaacKing Seems like we're going to get a chess rating soon from https://playbench.ai/models

@ChaosIsALadder Oh neat. I'll probably use this then, unless it takes too long to come out or seems otherwise unreliable.

I just tried playing GPT-5; it's really good. Maybe not 2000 good, but… it launched a discovered attack on my king with its bishop by moving its pawn and threatening my rook. See also: it knows tactics.

@geuber Being able to plan one move ahead like that is maybe ~1100 level. 2000 is much better.

(Also that could have just happened by accident.)

@IsaacKing well… yes, but still an impressive jump. GPT-4 would’ve made an illegal move by then.

bought Ṁ50 NO

I think this is a serious jump in Elo, and I don't think it's possible for now. Maybe GPT-6.

bought Ṁ50 NO

In a traditional game an attempt to make an illegal move results in a forfeit of the match. Does this rule apply to GPT-5?

@Soren Yeah I think it should. Someone who can't even move legally is clearly not "a good player" by the common-sense meaning of the term.

"An estimated rating is fine."

Estimated what? Bona fide FIDE Elo? TCEC-compatible computer rating? Stockfish tournament result? lichess.org??

https://twitter.com/andrew_n_carr/status/1735418595759526231

predicts YES

@NoaNabeshima https://twitter.com/GrantSlatton/status/1703913578036904431

predicts YES

@NoaNabeshima I'm not checking these tweets, but this is why I'm trading as I am

Market is so dramatically unspecified as to make it impossible for me to bet.

@mqp *underspecified

@mqp What's the concern? If GPT-5 is able to consistently beat people rated 2000 or above, then it resolves YES.

@IsaacKing With a rating of 2000, you will only beat people rated 2000 half of the time, so either this comment or the title is unclear (or do you consider 50% to be "consistently"?)
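For reference, the "half of the time" point follows from the standard Elo expected-score formula. Below is a minimal sketch; the function name `expected_score` is just illustrative, and the score it returns counts a draw as half a point, so 0.5 means "scores 50%" rather than strictly "wins half the games".

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (win = 1, draw = 0.5, loss = 0) of player A against player B
    under the standard Elo logistic model with a 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    # Equal ratings: each side is expected to score 50%.
    print(expected_score(2000, 2000))
    # A 200-point edge is needed before the expected score reaches roughly 76%.
    print(round(expected_score(2200, 2000), 3))
```

So "consistently beating" 2000-rated players would imply a rating well above 2000, not a rating of exactly 2000.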

@IsaacKing Results against humans are not a good benchmark (there may not be enough games for a statistically sound evaluation, for one thing).

But also, the principal question is: what Elo does this refer to? Many if not most online discussions these days refer to online rating systems, like that of chess.com or Lichess, rather than the classical FIDE Elo. And this is a very important distinction for the context of this market! The baseline Elo performance for GPT-3 (and presumably GPT-4, which actually appears to "play" worse but that is likely just a statistical fluke) is about 1700 on the CCRL scale. This would translate to easy blitz wins over most 2000 rated players on either chess.com or lichess, whose rating scales are inflated by several hundred points.
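To illustrate the sample-size concern, here is a rough sketch of how an estimated rating and its uncertainty could be computed from results against opponents of a known rating. Everything here is an assumption made for illustration: the helper names are hypothetical, all opponents are treated as having the same rating, and each game is modeled as an independent binomial trial (which overstates the variance when there are draws). It inverts the standard Elo expected-score formula and uses a simple normal-approximation interval.

```python
import math


def performance_rating(opp_rating: float, score: float, games: int) -> float:
    """Rating at which the Elo expected score equals the observed score fraction."""
    p = score / games
    if not 0.0 < p < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return opp_rating + 400.0 * math.log10(p / (1.0 - p))


def rating_interval(opp_rating: float, score: float, games: int, z: float = 1.96):
    """Approximate interval for the performance rating (normal approximation,
    binomial assumption); z = 1.96 corresponds to roughly 95% coverage."""
    p = score / games
    se = math.sqrt(p * (1.0 - p) / games)
    eps = 1e-6  # clamp so the logit stays finite
    lo_p = min(max(p - z * se, eps), 1.0 - eps)
    hi_p = min(max(p + z * se, eps), 1.0 - eps)
    return (
        performance_rating(opp_rating, lo_p * games, games),
        performance_rating(opp_rating, hi_p * games, games),
    )


if __name__ == "__main__":
    # Same 50% score against 2000-rated opposition, different sample sizes.
    for n in (10, 100, 1000):
        lo, hi = rating_interval(2000, n / 2, n)
        print(n, round(lo), round(hi))
```

Under these assumptions, a 50% score over 10 games is compatible with ratings a few hundred points either side of 2000, while the interval narrows substantially with 100 or more games.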
