06 May 2025

The UK AI Security Institute's Research Agenda

In February 2025, the UK government announced that the AI Safety Institute would be renamed the AI Security Institute, with a strengthened focus on security risks, including those with national security implications. 

On 6 May, the Department for Science, Innovation and Technology released the AI Security Institute’s research agenda for "tackling the hardest technical challenges in AI security." In addition, the Institute's mission has been refined to focus on advancing understanding of the most serious risks posed by AI technology, building a scientific evidence base to help policymakers keep the country safe as AI capabilities develop. 

The Institute's research priorities center on three key areas: 

  1. Priority risk areas: Identifying and addressing the most pressing AI security threats 

  2. Advancing AI evaluations: Developing robust methods to assess AI systems 

  3. AI alignment and control: Creating technical solutions to ensure AI systems remain safe and under human oversight 

This insight provides more information on the AI Security Institute's recently published research agenda and explores how the Institute plans to address critical security challenges through its three key priority areas. 

 

1. Priority Risk Areas: Identifying and Addressing the Most Pressing AI Security Threats 

The AI Security Institute has identified several critical areas where AI poses significant threats to national security and public safety. These priority risk areas represent domains where AI misuse could cause severe and widespread harm, demanding urgent research and mitigation strategies. 

At the forefront of these concerns is the potential for AI to enable more sophisticated cyberattacks. Recent assessments indicate that AI-enhanced cyber threats are among the most likely to manifest with high impact in the immediate future. These attacks could become dramatically more effective and easier to execute, magnifying their consequences for businesses, consumers, and critical infrastructure. 

The Institute's Autonomous Systems work stream recently published research measuring the extent to which AI systems have the capability to autonomously replicate. This benchmark helps detect emerging replication abilities in AI systems and provides a quantifiable understanding of potential risks, enabling more targeted mitigation strategies. 

Beyond cybersecurity, the Institute is examining how AI might be weaponised for criminal activities or dual-use scientific tasks with harmful applications. The Challenge Fund launched in March 2025 specifically targets research into preventing AI misuse across these domains. 

Another crucial risk area involves ensuring AI systems remain under meaningful human control. The 2025 International AI Safety Report, led by Yoshua Bengio, highlights that without proper checks and balances, risks may not only come from malicious actors misusing AI models but potentially from the autonomous behaviours of the models themselves. 

The Institute is also investigating AI's potential to cause widespread societal disruption and to influence or manipulate human opinions. A recent government assessment noted that risks to political systems and societies will increase as AI technology develops and adoption widens, with synthetic media potentially eroding democratic engagement and public trust in governmental institutions. 

As AI forms the central pillar of the UK Government's Plan for Change, addressing these security concerns is essential for building public confidence and enabling widespread adoption. The Institute's work ensures that as the UK positions itself to harness AI's opportunities for economic growth and public service innovation, it does so with robust safeguards against potential harms. 

 

2. Advancing AI Evaluations: Developing Robust Methods to Assess AI Systems 

A cornerstone of the Institute's research agenda is the advancement of sophisticated methodologies for AI system evaluation. The Institute is taking a leading role in testing AI models regardless of their origin—whether open-source or proprietary—to ensure comprehensive security assessment. 

This evaluation work builds the scientific foundation necessary for understanding and mitigating AI risks. The Institute focuses on research that addresses significant insight gaps between industry, governments, academia, and the public regarding AI capabilities, safeguards, and societal impact. 

Strategic partnerships with frontier AI companies provide the Institute with access to cutting-edge models for evaluation. These collaborations enable researchers to thoroughly assess current capabilities and anticipate future developments, creating a more accurate picture of potential security vulnerabilities. 

The Challenge Fund launched in March 2025 also supports research into testing methodologies that can evaluate AI systems' resilience against misuse and their ability to maintain human oversight. By developing increasingly rigorous evaluation frameworks, the Institute ensures that risk assessments remain ahead of rapidly advancing AI capabilities. 

These assessment methodologies aren't merely academic exercises—they provide essential intelligence for policymakers and industry partners. By appropriately sharing findings with policymakers, regulators, private companies, international partners, and the public, the Institute helps ensure that relevant parties receive the information they need to effectively respond to rapid progress in AI. 

 

3. AI Alignment and Control: Creating Technical Solutions to Ensure AI Systems Remain Safe and Under Human Oversight 

Beyond identifying and evaluating AI risks, the Institute's research agenda emphasises the development of concrete technical solutions. The Institute is pursuing cutting-edge research to ensure AI systems remain resilient against misuse, maintain human oversight even while operating autonomously, and strengthen society against emerging threats. 

The Institute's solutions-focused control team recently published important work exploring how "control measures" for AI systems can help mitigate risks from misalignment. This research provides practical approaches for ensuring AI systems operate within safe parameters even as their capabilities advance, offering tangible solutions to one of the field's most pressing challenges. 

AI alignment—ensuring that systems act in accordance with human values and intentions—forms a critical focus area. The Institute's work aims to address significant technical complexities and safety concerns in AI development, helping to forge scientific consensus around risks and their mitigation. 

Control mechanisms that maintain human authority over AI systems represent another priority research domain. The Challenge Fund also supports research into robust controls that allow humans to reliably monitor AI systems and intervene to prevent emerging risks, even as those systems operate with increasing autonomy. 

By funding high-impact research across these technical areas, the Institute is building the evidence base needed for real-world solutions to the most urgent security challenges AI presents. This technical work complements the Institute's risk identification and evaluation efforts, creating a comprehensive approach to AI security. 

The emphasis on practical technical solutions reflects the Institute's commitment to proactive risk management. Rather than waiting to react to AI's impacts, the Institute is actively working to shape the trajectory of AI development in a way that protects national security. 

 

Conclusions 

The AI Security Institute's research agenda addresses critical safety and security challenges associated with advanced AI systems. The agenda identifies priority risk areas, evaluation methodologies, and technical solutions that governments and policymakers worldwide will need to address as AI capabilities continue to advance. The Institute's evaluations and research provide essential data to improve understanding of AI capabilities, safeguards, and societal impacts. As AI systems are deployed across economic sectors, social contexts, and critical infrastructure, these security measures become increasingly consequential. 

The research agenda outlines specific focus areas including AI misuse prevention, human oversight mechanisms, and methods to protect against widespread disruption. International collaboration remains central to the Institute's approach, with the sharing of talent and evaluation methodologies being pursued through the International Network of AI Safety Institutes. 

The Institute's work focuses specifically on research that cannot be, or is not being, addressed by other actors in academia or industry. This research agenda serves both as documentation of the Institute's priorities and as an invitation to the wider technology sector to collaborate on building rigorous methodologies, evaluation tools, and technical solutions to address AI security risks. 

 

This research agenda sets out for the first time not only how the Institute’s work is safeguarding our national security, but how we are laying a secure foundation to use AI to deliver positive change... If we are going to instill confidence in the British public about the incredible potential of this technology, then we need to mitigate any potential risks.

Peter Kyle

Secretary of State for Science, Innovation and Technology

The capabilities of AI are advancing rapidly, so it’s absolutely vital that we as an organisation can match that pace and operate at the frontier of AI security. This means, where possible, sharing our thinking with the wider community and developing a shared rigour across the field. Today’s research agenda sets out why we’re focusing our research on the areas we do, so we can help tackle the biggest risks which could emerge as this technology evolves. More than that, we want to galvanise other research bodies around this important and fast-moving work.

Jade Leung

AI Security Institute Chief Technology Officer

 

For more information on our digital ethics and AI assurance programming, please contact [email protected].