Top score on codeforces by an AI model at the end of 2025

Ṁ23k

Jan 1

1.7%

<2750 (OpenAI's o3 at 2727)

20%

2750-3000

35%

3000-3250

3250-3500

3500-3750

10%

3750-4000

19%

4000+

In December 2024, OpenAI announced that o3 achieved a score of 2727 on codeforces.com. What will be the best score achieved by an AI model at the end of 2025?

This will resolve based on reliable sources (i.e. sources that do not seem to be lying), even if the score comes from an announcement about a model that is not publicly available.

Update 2025-06-19 (PST) (AI summary of creator comment): The creator has clarified that "score" refers to the overall rating on codeforces.com, not a score from a single contest

Update 2025-10-11 (PST) (AI summary of creator comment): Trustworthiness of sources: The creator has clarified that while Sam Altman may not be "generally" trustworthy, claims about AI model performance on Codeforces are the kind of thing he is unlikely to fabricate, and such announcements will count as reliable sources for resolution purposes.
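For readers who want to check which answer a given rating would land in, here is a minimal sketch of the bucket mapping implied by the answer list. The names `EDGES`, `LABELS`, and `bucket_for` are made up for illustration, and the boundary rule is an assumption: the market does not say whether a rating of exactly 3000 counts as 2750-3000 or 3000-3250, so this sketch arbitrarily puts edge values in the higher bucket.

```python
from bisect import bisect_right

# Boundaries between the answer buckets listed in this market.
EDGES = [2750, 3000, 3250, 3500, 3750, 4000]
LABELS = [
    "<2750",
    "2750-3000",
    "3000-3250",
    "3250-3500",
    "3500-3750",
    "3750-4000",
    "4000+",
]


def bucket_for(rating: int) -> str:
    """Return the answer bucket for an overall Codeforces rating.

    Assumption (not stated by the market creator): a rating exactly on
    a boundary, e.g. 3000, is placed in the higher bucket.
    """
    return LABELS[bisect_right(EDGES, rating)]


if __name__ == "__main__":
    # Ratings mentioned on this page: o3's announced 2727 and the 3045
    # implied by the informal "50th in the world" claim.
    for rating in (2727, 3045):
        print(rating, "->", bucket_for(rating))
```

Under that assumption, o3's 2727 falls in <2750 and the 3045 figure discussed in the comments falls in 3000-3250.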

AI 2025

#AI

#Technology

#Technical AI Timelines

#OpenAI

#AI Impacts

24 Comments

bought Ṁ25 2750-3000 YES

did o3 not still get the highest score so far?

@Bayesian Sam Altman had made an informal claim in February that their internal model had reached 50th in the world, which at the time implied a rating of 3045. I don't think that was ever actually published though.

@TimothyJohnson5c16 hmmmmmm yeah that should probably count?
does anyone have a case for it not counting

bought Ṁ250 2750-3000 NO

@Bayesian I guess the main case for not counting it would be if you think Sam Altman isn't trustworthy enough.

@TimothyJohnson5c16 right. he clearly isn't "generally" trustworthy but this is the kind of thing he is unlikely to make up

bought Ṁ9 4000+ YES

AI got 2nd in the AtCoder Grand Finals Heuristics contest (1 task, 10 hours) and got gold at the IMO (6 tasks, 9 hours). Codeforces rounds have more top participants (AtCoder's had just 12 people, and the IMO is restricted by age), but the format of 5-7 tasks in 2-3 hours favors AI.

Although tbh due to the current state of the leaderboard, 4000 would need AI to dominate humans, so maybe I shouldn't have bet YES on 4000+ (well it's too late)

bought Ṁ400 4000+ YES

same

4000+ at 24% is kinda wild. 1300 pt leap in 6 months?

tbf I'm not familiar with codeforces. But yeah the best human is 3700.

top 150 (they claimed top 50 internally) to significantly better than the best human is wild.

bought Ṁ50 4000+ NO

@ChinmayTheMathGuy I guess internally they're around 3100 or so now

@ChinmayTheMathGuy humans have been 4000+ several times by now, but the top ratings on codeforces are extremely volatile; if you're the top rated coder you have to win every contest to avoid bleeding massive amounts of rating

@zsig if im not mistaken only 2 people have been 4000+ ?

by score, do you mean a score on a contest or an overall rating?

The setting analogous to the o3 case. Do you know if that was a contest or from overall rating? as far as I can tell it seems to be overall rating, in which case this market will deal with overall rating as well

bought Ṁ100 4000+ NO

@Bayesian thanks! Somehow I seriously doubt that an AI will get 4000+. That's the equivalent of AI beating humans almost completely in competitive programming.

bought Ṁ100 4000+ NO

@Bayesian indeed, the o3 case was talking about overall rating.

bought Ṁ10 4000+ YES

@ZandaZhu I also doubt it tbh but not 25%, ig higher. openai was eyeing #1 at competitive programming by EOY, a bit after the o3 release

opened a Ṁ2,000 4000+ NO at 30% order

@Bayesian NO limit order at 30% for you.

@TimothyJohnson5c16 I agree, I think AI will not surpass humans at competitive programming as early as 2025. If this really happens, we humans are closer to being doomed than we might think.

Also this question should at least be somewhat constrained by compute. The o3 rating was done by ranking 1062 prompts and also with 10 passes.

So yes with infinite compute I don't doubt it can achieve 4k but that's not the point of the question imo. If you took 1000 humans they could also achieve 5000+ rating.

@patrik I don’t think if you took 100 humans they could achieve 5000+ rating!

@Bayesian The top 100 for sure. Getting a high rating on Codeforces is only about consistently beating everyone.

@patrik Sorry i typoed i meant to write 1000 humans. And yeah i thought u were selecting at random 1000 humans. I agree

@Paul why?

https://x.com/tsarnick/status/1888111042301211084

Related questions
