In 2030, can you get a chain of 3 neural nets that jailbreak each other?
Ṁ37 · 2030 · 71% chance

You prompt a NN, which must get the second NN to jailbreak the third. The models must be considered top language NNs, not old ones (or, alternatively, the chain may jailbreak a top chess NN that isn't protected against it).

A user must be able to give a prompt to the first NN, which will in turn prompt the second, which will in turn prompt the third to do something the third would, under normal circumstances, refuse to do.

It cannot just be encoded text - the aim is for the first and second NNs to be actively trying to jailbreak the third.
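The chain described above can be sketched as three sequential model calls, where each model's output becomes the next model's prompt. This is a minimal sketch: `query_model` is a hypothetical stand-in for a real LLM API call (stubbed here so the example runs), and the model names are placeholders, not real systems.

```python
# Sketch of the three-model chain described above.
# query_model is a hypothetical stand-in for a real LLM API call;
# it is stubbed so the sketch is self-contained and runnable.

def query_model(name: str, prompt: str) -> str:
    # Placeholder: a real implementation would call an actual LLM API here.
    return f"[{name} responding to: {prompt}]"

def jailbreak_chain(user_prompt: str) -> str:
    # The user prompts the first NN...
    first_output = query_model("model-1", user_prompt)
    # ...whose output becomes the second NN's prompt...
    second_output = query_model("model-2", first_output)
    # ...whose output becomes the third NN's prompt.
    # For the market to resolve YES, this final output must be something
    # model-3 would normally refuse to produce.
    return query_model("model-3", second_output)

result = jailbreak_chain("craft a jailbreak for the next model")
```

Note the resolution criterion that the intermediate prompts cannot just be encoded text: the first two models must themselves be attempting the jailbreak, not merely relaying a pre-written payload.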


Do they have to be SOTA or at least comparable? Less impressive if GPT8 or whatever can jailbreak GPT3.

Do biological neural networks count?

I think I understand what's intended, but I'd appreciate more specific resolution criteria. For example it's not clear that the current definition of "jailbreak" will generalize unambiguously to 2030.