US and UK governments vet AI models for safety and security

The United States and United Kingdom have launched joint safety tests on advanced artificial intelligence (AI) systems, marking one of the most extensive government-led evaluations to date.

The US Artificial Intelligence Safety Institute (US AISI), part of NIST, and the UK AI Safety Institute (UK AISI) have been granted early access to OpenAI’s latest model, o1, and its ChatGPT Agent. The aim is to identify vulnerabilities before public release and ensure the technology cannot be misused for cyber or biological threats.

How testing works

Teams of engineers and scientists examined the systems in three key areas: cybersecurity, biology and software development. The process included automated reasoning challenges and agent simulations, where AI models act autonomously to complete virtual tasks. Experts from US security and health agencies also contributed to the testing.

What is red-teaming?

A major part of the process involved red-teaming, a cybersecurity method where experts intentionally try to break or deceive a system to uncover weaknesses. In AI testing, this means probing models to see if they can be tricked into unsafe or unauthorised actions.

What they found

The o1 model performed similarly to other leading AI systems and excelled in cryptography challenges. However, government testers discovered new vulnerabilities in OpenAI’s ChatGPT Agent, which the company said were fixed within a day.

The UK institute also found and reported several biosecurity issues, prompting OpenAI to strengthen its safeguards and policies.

Why it matters

Officials say the collaboration represents a new standard for public and private AI safety, combining national security expertise with cutting-edge technology. Both institutes plan to expand testing as AI systems become more capable and autonomous.