OpenAI Updates Framework to Tackle AI Risks

OpenAI has updated its Preparedness Framework, a system that helps it track and prepare for advanced AI capabilities that could cause serious harm. As AI becomes more powerful, it's critical to have real-world safety systems in place.

What’s New?

This update brings:

  • A sharper focus on high-risk AI threats
  • Stronger standards for reducing those risks
  • Clearer guidance on how OpenAI monitors and shares safety practices

It also includes new research areas to help stay ahead of future risks, making the framework more actionable, reliable, and transparent.

Smarter Risk Detection

OpenAI now uses five clear criteria to flag high-risk capabilities: a risk must be plausible, measurable, severe, new, and hard to reverse. These criteria help prioritize which risks to address first.
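
As a rough illustration, the five criteria work like a checklist that a capability must fully satisfy before it is flagged. The sketch below is only one way to picture that idea; the class and method names are made up, and the real framework is a policy document, not software.

```python
from dataclasses import dataclass

# Hypothetical checklist for the five criteria named above. Field and
# method names are assumptions made for this illustration.
@dataclass
class RiskCriteria:
    plausible: bool        # there is a realistic path to severe harm
    measurable: bool       # the capability can be evaluated and tracked
    severe: bool           # the potential harm is serious
    new: bool              # the risk is not already well covered elsewhere
    hard_to_reverse: bool  # the harm would be difficult to undo

    def is_high_risk(self) -> bool:
        """A capability is flagged only when it satisfies all five criteria."""
        return all((self.plausible, self.measurable, self.severe,
                    self.new, self.hard_to_reverse))

# Example: a capability meeting every criterion is prioritized for tracking.
print(RiskCriteria(True, True, True, True, True).is_high_risk())  # True
```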

Two Types of Risk Categories

Tracked Categories

Well-defined risk areas with safeguards already in place:

  • Biological & Chemical
  • Cybersecurity
  • AI Self-improvement

These areas also offer major scientific and research benefits, so OpenAI prepares early to unlock them safely.

Research Categories

Still being studied, but potentially dangerous:

  • Long-range Autonomy
  • Sandbagging (AI underperforming on purpose)
  • Autonomous Replication
  • Undermining Safeguards
  • Nuclear/Radiological Risks

Political and Persuasion Risks Handled Separately

OpenAI limits the use of its tools for political campaigns and monitors misuse like influence operations through separate policies and investigations.

Clear Capability Levels

OpenAI now uses two capability levels:

High Capability: increases existing risks.

Critical Capability: introduces brand-new risks.

Systems at these levels must have strong safeguards before deployment (or even during development for Critical ones). A Safety Advisory Group reviews them and advises leadership.
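
To make the distinction concrete, here is a small, purely illustrative sketch of that gating rule. The enum values, function name, and phase labels are assumptions for this sketch, not OpenAI's actual process.

```python
from enum import Enum

class CapabilityLevel(Enum):
    BELOW_HIGH = 0
    HIGH = 1       # increases existing risks
    CRITICAL = 2   # introduces brand-new risks

# Illustrative gating rule: High-capability systems need safeguards before
# deployment; Critical-capability systems need them even during development.
def safeguards_required(level: CapabilityLevel, phase: str) -> bool:
    if level is CapabilityLevel.CRITICAL:
        return True                      # required in development and deployment
    if level is CapabilityLevel.HIGH:
        return phase == "deployment"     # required before release
    return False

print(safeguards_required(CapabilityLevel.HIGH, "deployment"))       # True
print(safeguards_required(CapabilityLevel.CRITICAL, "development"))  # True
```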

Faster, Scalable Testing

AI models are improving faster than ever, so OpenAI has built automated testing tools to keep pace, while expert evaluations ensure that testing stays accurate.
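
As a rough sketch of that division of labor, automated evaluations can run on every model update, with anything that scores near a risk threshold routed to human experts. The functions, eval names, and threshold below are placeholders, not OpenAI's actual tooling.

```python
# Placeholder functions illustrating automated evals plus expert review.
# Names and thresholds are assumptions for this sketch.
def run_automated_evals(model, eval_suite):
    """Run a scripted battery of capability tests and collect scores."""
    return {name: eval_fn(model) for name, eval_fn in eval_suite.items()}

def flag_for_expert_review(results, threshold=0.8):
    """Route results at or above the risk threshold to human evaluators."""
    return [name for name, score in results.items() if score >= threshold]

# Usage: automated runs keep pace with frequent model updates, while
# flagged results receive deeper human assessment.
eval_suite = {
    "cyber_offense_eval": lambda model: 0.4,
    "bio_uplift_eval": lambda model: 0.9,
}
results = run_automated_evals(model=None, eval_suite=eval_suite)
print(flag_for_expert_review(results))  # ['bio_uplift_eval']
```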

Adapting to New Developments

If another company releases a high-risk model without strong safeguards, OpenAI may adjust its own requirements, but only after confirming that the risk landscape has actually changed and ensuring that safety remains strong.

Introducing Safeguards Reports

Alongside its risk assessments, OpenAI will now publish Safeguards Reports explaining how it is reducing those risks. These reports help guide deployment decisions and are reviewed by the safety team before models are released.
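
To picture how such a report might feed a deployment decision, here is a purely hypothetical record. Real Safeguards Reports are written documents; every field name and value below is an assumption made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical record of the information a Safeguards Report pairs with a
# deployment decision. All names and values are assumptions for this sketch.
@dataclass
class SafeguardsReport:
    model: str
    risk_category: str                               # e.g. "Cybersecurity"
    safeguards: list = field(default_factory=list)   # mitigations applied
    residual_risk: str = "unassessed"                # judgment after mitigations
    review_approved: bool = False                    # safety review sign-off

def ready_to_deploy(report: SafeguardsReport) -> bool:
    """Deploy only after review approval and an acceptable residual-risk judgment."""
    return report.review_approved and report.residual_risk in ("low", "acceptable")

report = SafeguardsReport(
    model="example-model",
    risk_category="Cybersecurity",
    safeguards=["usage monitoring", "refusal training"],
    residual_risk="low",
    review_approved=True,
)
print(ready_to_deploy(report))  # True
```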