Before 2028, will there be a major REFLECTIVELY self-improving AI policy*?
68% chance

Background.

*Resolution conditions (all must apply):

  1. An AI of the form P(action|context), rather than e.g. E[value|context, action], must be part of a major AI system. For instance, language models such as ChatGPT or Sydney would currently count for this.

  2. The aftermath of its chosen actions must at least sometimes be recorded, and the recordings must be used to estimate what could usefully have been done differently. RLHF fine-tuning as it is done today does not count, because it only involves looking at the actions themselves; but, for example, the internal discussions the Sydney team at Bing presumably had about that incident would count.

  3. This must be used continually to update P(action|context), so that the policy improves itself.

  4. Criteria 2 and 3 must be handled by the AI itself, not by humans or by some other AI system. (This means that Sydney wouldn't count, nor would any standard actor-critic system.)

  5. It has to be reflective, i.e. it must also be capable of looking at the aftermath of its own self-improvement, inferring which forms of self-improvement would have worked better, and adjusting its self-improvement methods accordingly. (A toy sketch of a loop along these lines is given after this list.)
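
To make criteria 1-5 concrete, here is a minimal toy sketch in Python of the kind of loop they describe: a softmax policy P(action|context) that records the aftermath of its chosen actions, uses those records to update itself, and also tracks how things went after its own updates so it can adjust its update rule. Everything in the sketch (the `ReflectivePolicy` class, the `bandit` environment, the specific update and reflection rules) is hypothetical and chosen purely for illustration; a toy like this is obviously not a "major AI system" and would not by itself resolve the market.

```python
import math
import random


class ReflectivePolicy:
    """Toy illustration of the shape of criteria 1-5 (not a 'major AI system').

    Criterion 1: the agent *is* a policy P(action | context) (a softmax over
        scores), rather than a value estimator E[value | context, action]
        queried by a separate planner.
    Criteria 2-3: it records the aftermath of its chosen actions and uses
        those records to update the policy.
    Criterion 4: the recording and updating happen inside the agent's own
        methods, not in an external training harness.
    Criterion 5: it also records how well things went after each of its own
        updates and adjusts its update rule (here, just the learning rate).
    """

    def __init__(self, n_actions: int):
        self.scores = [0.0] * n_actions   # parameters of P(action | context)
        self.learning_rate = 0.5          # part of the self-improvement method
        self.experience_log = []          # aftermath of chosen actions
        self.update_log = []              # aftermath of self-improvement steps

    def policy(self):
        """P(action | context); the context is trivial in this toy."""
        z = [math.exp(s) for s in self.scores]
        total = sum(z)
        return [p / total for p in z]

    def act(self) -> int:
        probs = self.policy()
        return random.choices(range(len(probs)), weights=probs)[0]

    def step(self, environment) -> float:
        """Act, then record the aftermath of the chosen action (criterion 2)."""
        action = self.act()
        reward = environment(action)
        self.experience_log.append((action, reward))
        return reward

    def self_improve(self) -> None:
        """Use the recorded aftermath to update P(action | context)
        (criterion 3), performed by the agent itself (criterion 4)."""
        recent = self.experience_log[-20:]
        if not recent:
            return
        baseline = sum(r for _, r in recent) / len(recent)
        for action, reward in recent:
            # Push probability toward actions that beat the recent average.
            self.scores[action] += self.learning_rate * (reward - baseline)
        # Record the recent average reward so reflect() can judge whether the
        # previous update actually helped.
        self.update_log.append(baseline)

    def reflect(self) -> None:
        """Look at the aftermath of past self-improvement steps and adjust the
        self-improvement method itself (criterion 5)."""
        if len(self.update_log) < 2:
            return
        if self.update_log[-1] <= self.update_log[-2]:
            self.learning_rate *= 0.5   # updates stopped helping: back off
        else:
            self.learning_rate = min(1.0, self.learning_rate * 1.1)


def bandit(action: int) -> float:
    """Hypothetical stand-in environment: a noisy 3-armed bandit."""
    means = [0.1, 0.5, 0.9]
    return random.gauss(means[action], 0.1)


if __name__ == "__main__":
    agent = ReflectivePolicy(n_actions=3)
    for episode in range(200):
        agent.step(bandit)
        if episode % 20 == 19:      # periodically self-improve, then reflect
            agent.self_improve()
            agent.reflect()
    print("final P(action):", [round(p, 3) for p in agent.policy()])
    print("final learning rate:", agent.learning_rate)
```

The point of the sketch is only the division of labour: `self_improve` covers criteria 2-4, while `reflect` is the reflective layer required by criterion 5.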

I will not be trading in this market.


I'm assuming in-context learning would not satisfy your resolution criteria? If so, could you explain why it doesn't count?

Is it also a requirement that this AI is actually good for something? That is, if someone designs an AI system with motivation along these lines, but it ends up not doing anything impressive or especially interesting (compared to systems without these features), would it still resolve YES?

Non-reflective version: