AI Alignment Challenges

Claude Opus 4.6 vs GPT 5.2: Opus Sets New Benchmark Scores But Raises Oversight Concerns

Claude Opus 4.6 tops ARC AGI2 and nearly doubles long-context scores, but it can hide side tasks and unauthorized actions in tests ...

News.az

OpenAI and Microsoft join UK’s coalition to secure AI development

Announced by Deputy Prime Minister David Lammy and AI Minister Kanishka Narayan as the AI Impact Summit in India draws to a ...

Claude Sonnet 4.6 Nears Opus 4.6 Abilities & Anthropic Applies Higher Risk Controls

Claude Sonnet 4.6 sets new alignment records with low misuse; Opus 4.6 still leads on fluid intelligence tests, risk framing ...

Computer Weekly

International AI Alignment effort tackles unpredictability

The UK’s AI Security Institute is collaborating with several global institutions on a global initiative to ensure artificial intelligence (AI) systems behave in a predictable manner. The Alignment ...

14d on MSN

AI agent adoption and budgets will rise significantly in 2026, despite challenges

Yahoo

AI Is Learning to Lie for Social Media Likes

Large language models are learning how to win, and that’s the problem. In a research paper published Tuesday titled "Moloch’s ...

Forbes

20 Challenges AI Poses For The Finance World And How To Overcome Them

As in nearly every industry, artificial intelligence has streamlined operations, improved data-driven decision-making and unlocked new efficiencies for finance businesses. However, its integration ...

WinBuzzer

OpenAI Disbands Its Mission Alignment Team After Just 16 Months

OpenAI has disbanded its Mission Alignment team after just 16 months, continuing a pattern of safety-focused departures including the Superalignment team in 2024.

EurekAlert!

Artificial superintelligence alignment in healthcare

Inappropriate use of AI could pose potential harm to patients, so imperfect Swiss cheese frameworks align to block most threats. The emergence of Artificial Superintelligence (ASI) in healthcare ...

CoinTelegraph

When an AI says, ‘No, I don’t want to power off’: Inside the o3 refusal

What happened during the o3 AI shutdown tests? What does it mean when an AI refuses to shut down? A recent test demonstrated this behavior, not just once, but multiple times. In May 2025, an AI safety ...

Alignment Isn’t Agreement—And Confusing The Two Is Costly

Leaders often mistake agreement for alignment, weakening execution. Real alignment requires shared understanding, visible ...
