AI Alignment Research

A former OpenAI employee explains the 'open secret' of AI: Companies are building systems they still can't reliably control

Daniel Kokotajlo warns AI systems are advancing faster than companies can control, raising concerns about alignment and ...

Computer Weekly

UK AI alignment project gets OpenAI and Microsoft boost

OpenAI and Microsoft are the latest companies to back the UK’s AI Security Institute (AISI). The two firms have pledged support for the Alignment Project, an international effort to work towards ...

Anthropic AI Safety Fellowship 2026: How to apply, $15,000 funding, duration & hiring chances

Anthropic, founded by former OpenAI researchers, has positioned itself as one of the leading firms focused on AI alignment ...

VentureBeat

AI models rank their own safety in OpenAI’s new alignment research

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More OpenAI announced a new way to teach AI models to align with safety ...

eWeek

Anthropic Unleashes ‘Alien Science’ as AI Surpasses Humans in Alignment

Anthropic’s Claude agents outperformed human researchers and produced “alien science,” raising new questions about AI alignment and self-improvement.

4don MSN

Anthropic says decades of evil robot stories may be echoing inside AI models

Anthropic blames sci-fi for AI behavior ...

TechNewsWorld

The AI Alignment Problem Is No Longer Theoretical

I recently got a question from Quora that felt more like a tech support ticket from the future than a movie discussion: Is Skynet’s decision to wipe out humanity in “The Terminator” movies just a bug, ...

The Verge

OpenAI’s new model is better at reasoning and, occasionally, deceiving

Posts from this topic will be added to your daily email digest and your homepage feed. Researchers found that o1 had a unique capacity to ‘scheme’ or ‘fake alignment.’ Researchers found that o1 had a ...

18d

HTEC Research: Only One in Three Healthcare Organizations is Ready to Scale AI

AI is already embedded across healthcare and life sciences. Most organizations are deploying it, and confidence in its potential is high. Yet for many, the real challenge is only just beginning.

Hosted on MSN

Microsoft research links AI ROI to leadership alignment

Organizational factors dominate: Microsoft finds 67% of AI impact comes from culture, leadership, and talent practices, versus 32% from individual mindset. Adoption vs. value gap: TechRadar Pro notes ...

Geeky Gadgets

Alignment Faking : The Hidden Danger of Advanced AI Systems

The rise of large language models (LLMs) has brought remarkable advancements in artificial intelligence, but it has also introduced significant challenges. Among these is the issue of AI deceptive ...

TechCrunch

OpenAI’s research on AI models deliberately lying is wild

Every now and then, researchers at the biggest tech companies drop a bombshell. There was the time Google said its latest quantum chip indicated multiple universes exist. Or when Anthropic gave its AI ...

Results that may be inaccessible to you are currently showing.

Hide inaccessible results