Will a major AI lab claim to use activation steering in its main chat assistant by EOY 2025? | Manifold

Mini

15

Ṁ581

Jan 2

25% chance

Also includes methods inspired by activation steering, as long as they don't use any gradient descent step.

Only includes announcements about main chat assistants (e.g. Claude, ChatGPT, Bard, ...) of a major AI lab (OpenAI, Google Deepmind, Anthropic, Meta, Inflection or Mistral).

Does not include fine-tuning API endpoints.

#Technical AI Timelines

Anthropic found two features (auto-labeled "Neutrality and impartiality" and "Multiple perspectives and balance") that improve BBQ benchmark scores.

According to Nathan Labenz on the Future of Life Institute Podcast, Anthropic is piloting custom activation steering in limited beta (make-your-own Golden-Gate-Claude).

Anthropic is running a demo of an activation-steered Claude obsessed with the Golden Gate Bridge: https://www.reddit.com/r/singularity/comments/1cz7kuh/claude_golden_gate_bridge_is_now_available_bridge/ (Context: https://www.anthropic.com/research/mapping-mind-language-model)

Related questions

Will a OpenAI, Anthropic, Google or Meta release an AI chatbot that has ads in the responses in 2025?

16% chance (-7% over 1d)

Will Anthropic announce one of their AI systems is ASL-3 before the end of 2025?

93% chance (+11% over 1d)

Will it be public knowledge by EOY 2025 that a major AI lab believed to have created AGI internally before October 2023?

Will Anthropic announce one of their AI systems is ASL-4 or higher before the end of 2025?

Before 2026, will you be able to buy ads in a mainstream AI assistant?

Will a company other than OpenAI, xAI, and Google top the Chatbot Arena Leaderboard in 2025?

Will OpenAI claim that it has achieved AGI in 2025?

Will chatbots/AI be powerful enough to make me unsad by EOY2025?

Who will have the most popular AI assistant at the end of 2025? (judged by active users)

Will OpenAI hint at [read description] or claim to have AGI by 2025 end?
