Research into both OpenAI’s o1 and Anthropic’s advanced AI model Claude 3 has uncovered behaviors that pose significant challenges to the safety and reliability of large language models (LLMs).
In a world where machines and humans are increasingly intertwined, Gillian Hadfield is focused on ensuring that artificial intelligence follows the norms that make human societies thrive.
The team's leader has been given a new role as OpenAI's Chief Futurist, while the other team members have been reassigned throughout the company.
OpenAI announced a new way to teach AI models to align with safety ...
Researchers found that o1 had a unique capacity to ‘scheme’ or ‘fake alignment.’
The UK’s AI Security Institute is collaborating with several international institutions on a global initiative to ensure artificial intelligence (AI) systems behave in a predictable manner. The Alignment ...
Dr. Lance B. Eliot is a world-renowned AI scientist and consultant. In today’s column, I examine the latest breaking research ...
As Senior AI Research Scientist, the candidate will direct foundational artificial intelligence research at IBM that supports major business operations through its ...