Will RL work for LLMs "spill over" to the rest of RL by 2026? | Manifold

Ṁ629

2026

34% chance

RL is important for training LLMs and it seems likely that there will be significantly more investment in RL by the major LLM groups this year. Will any of the advances they make be:

Published (any form of publication that allows the research to be used elsewhere counts; it does not have to be a paper)
A significant advance for the rest of RL

For example, a new version of PPO that is close to SOTA for agents in Atari environments would resolve this YES.

What counts as a "significant advance" is mostly subject to my inscrutable whims, but the bar is aimed more at cool research than at important results. Think "very exciting to see at a conference" rather than "revolutionizes the field".

#Technical AI Timelines

Related questions

By the end of June 2025, will closed-source LLMs increase access to pandemic agents?

Will LLM hallucinations be a fixed problem by the end of 2025?

Will there be major breakthrough in LLM Continual Learning before 2026?

Will one of the major LLMs be capable of continual lifelong learning (learning from inference runs) by EOY 2025?

What will Manifolders mostly use LLMs for, by EOY 2025?

In 2025, will I be able to play Civ against an LLM?

Will LLMs become a ubiquitous part of everyday life by June 2026?

Will LLMs mostly overcome the Reversal Curse by the end of 2025?

Will LLMs be better than typical white-collar workers on all computer tasks before 2026?

Will a frontier-level diffusion LLM exist by 2028?
