Will AI be able to generate correct images of a chess game in 2024?

Jan 1

It turns out that DALL-E is very bad at this.

Any general-purpose image-generation AI is allowed (DALL-E 3, Midjourney, etc.). Prompt engineering is allowed. To qualify, the AI and prompt must succeed on at least 5 of 20 images when tested.
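
The 5-in-20 threshold can be put in perspective with a little arithmetic. The sketch below assumes each generated image succeeds independently with the same probability `p` (an assumption, not something the market states) and computes the chance of reaching at least 5 successes in 20 tries. The name `pass_probability` is made up for illustration.

```python
from math import comb


def pass_probability(p: float, trials: int = 20, needed: int = 5) -> float:
    """Chance of at least `needed` successes in `trials` independent tries.

    Assumes every image succeeds with the same probability p (binomial model).
    """
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(needed, trials + 1)
    )


if __name__ == "__main__":
    # Compare a few hypothetical per-image success rates.
    for rate in (0.05, 0.10, 0.25, 0.50):
        print(f"p = {rate:.2f}: pass chance = {pass_probability(rate):.3f}")
```

Under this model, a prompt whose true success rate sits exactly at 25% would still fail a single 20-image test a substantial fraction of the time, so one test run is a noisy measurement.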

To be considered a success, an image must contain:

An 8x8 checkered board, with all squares colored correctly.
All chess pieces in their correct starting positions. The chess pieces must be clearly identifiable as their correct type (e.g. a rook must clearly look like a rook).
No extra chess pieces.

Images must be generated from a prompt only.
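
The success criteria above can be written as a mechanical check. This is only a sketch: it assumes someone (a human or a vision model) has already transcribed the image into an 8x8 grid of piece letters and an 8x8 grid of square colors, normalized so White's back rank is the last row. The names `START_ROWS`, `expected_square_color`, and `is_success` are hypothetical, and the check cannot judge whether a piece "clearly looks like" its type or whether stray pieces sit off the board.

```python
# Rows run from Black's back rank (rank 8) down to White's back rank (rank 1),
# files a..h left to right. Uppercase = White, lowercase = Black, "." = empty.
START_ROWS = [
    "rnbqkbnr",
    "pppppppp",
    "........",
    "........",
    "........",
    "........",
    "PPPPPPPP",
    "RNBQKBNR",
]


def expected_square_color(row: int, col: int) -> str:
    """Return 'light' or 'dark'; a1 (row 7, col 0) is dark, h1 is light."""
    return "light" if (row + col) % 2 == 0 else "dark"


def is_success(pieces: list[str], colors: list[list[str]]) -> bool:
    """Check a transcribed image against the starting-position criteria."""
    # The board must be exactly 8x8 in both grids.
    if len(pieces) != 8 or any(len(row) != 8 for row in pieces):
        return False
    if len(colors) != 8 or any(len(row) != 8 for row in colors):
        return False
    # Every square must have the correct color.
    colors_ok = all(
        colors[r][c] == expected_square_color(r, c)
        for r in range(8)
        for c in range(8)
    )
    # Exact match also rules out extra or missing pieces on the board.
    pieces_ok = list(pieces) == START_ROWS
    return colors_ok and pieces_ok
```

Comparing against a fixed grid keeps the check strict: a swapped king and queen or a board rotated by 90 degrees both fail.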

#AI

#AI Image Generation

#DALLE3

#AI Image Generation Testing


25 Comments


bought Ṁ350 YES

bought Ṁ10 YES

@Hazel what model is that and how did you prompt it?

bought Ṁ350 NO

@ProjectVictory lumalabs, used an iterative version of their new model (re-prompted dozens of times until the output was perfect)

@ProjectVictory it would be trivial to create an API that did this automatically, in essence, making a much improved model.

Still, this was cherry picked. The king/queen is still the hardest part.

@Hazel Did you use a fixed series of prompts? If not, how would you make an API that does this automatically?

@MaxMorehead yes, same prompt over and over. If I wanted to make some mana, I could easily build this before the end of the year. It’s easy to repro.

@Hazel oh, would have taken a couple more iterations, to get the queen in the right place. I’m a callable human in the loop lol.

bought Ṁ50 NO

To be clear, you have to verify it's correct?

@Shump How would this resolve if it's possible to build a scaffolded system that generates a chessboard (e.g. calling DALL-E multiple times and using GPT-4o to verify whether the image is correct)? Would it change if there are more purpose-built parts to the scaffolding (e.g. taking subsets of the image and using specific prompts to verify those)?
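
The kind of scaffold being asked about can be sketched as a generate-and-verify loop. The code below is illustrative only: `generate` and `verify` are hypothetical callables supplied by the caller (for example, wrappers around an image model and a vision model), and no real API is called or assumed here.

```python
from typing import Callable, Optional, Tuple, TypeVar

Image = TypeVar("Image")


def generate_until_verified(
    generate: Callable[[str], Image],
    verify: Callable[[Image], bool],
    prompt: str,
    max_attempts: int = 50,
) -> Tuple[Optional[Image], int]:
    """Re-run the same prompt until the verifier accepts an image.

    Returns (image, attempts_used), or (None, max_attempts) if nothing passed.
    """
    for attempt in range(1, max_attempts + 1):
        image = generate(prompt)  # hypothetical call to an image model
        if verify(image):  # hypothetical automated correctness check
            return image, attempt
    return None, max_attempts
```

The loop only selects among outputs, so its results are only as good as the verifier: a weak `verify` will accept flawed boards, which is the crux of the disagreement in this thread.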

@Hazel I think if you can build a tool which can select correct chess boards, that's the same thing as a YES resolution. I also think you can't do that.

@Juniper0rg1m Surely the tool has to meet some conditions. If we're allowed to use an arbitrary type of program, you can build an AI that is explicitly coded to return an image of a chessboard, or is trained specifically to generate only a chessboard.

Even classical image recognition techniques could probably determine if a setup is a legal chessboard with 25% accuracy, given a chessboard of a fixed size.

@MaxMorehead It will depend on how "general purpose" it is, I agree with most of the rest.

I think this whole discussion got very off track. I don't think anyone has a particular need for a model that mass-produces images of good chessboards. The point is to gauge the capabilities of current general-purpose models to do things like that out of the box, using chessboards as an example.

recraft v3 does not seem to be better than flux

bought Ṁ50 YES

@TobiasWegener
I think we are getting pretty close with Flux

Problems:
there seems to be a rug, and both sides are white.

The figures seem quite good now.

@TobiasWegener Also the closer queen is a bishop and a bishop is a smaller bishop

@ProjectVictory yeah, you are right, and there's the strange line in front of the queen. A lot of small mistakes. Interesting how hard it is to see many of them.

I think there's a difference here between generating the starting position as an image and generating a random mid-play board but ensuring that it could have come from a real start position. The latter is a much more difficult task

I agree, but the title here is slightly misleading; the description clearly calls out that they want a starting position.

I've been trying to cue the model into producing a diagram, since that's presumably easier, but it's not quite getting there. I think the problem is very similar to producing text, if you think of chess pieces as symbols and chess boards as phrases.

bought Ṁ30 YES

https://x.com/LukeASalamone/status/1820299359185211901?t=moIpHPN-SCu7Zw_WSom3CQ&s=19

bought Ṁ109 YES

The Flux model is much closer than DALL-E, but not quite there yet (maybe with the right prompt though, I just used "chess board starting position, view from above")

"chess board starting position"

this one's quite good, though of course the knights are very broken

It's quite wrong. The board is rotated incorrectly. The bottom-right corner should be white.

@AndreiVlasenko close but white queen not on her color
