18 Jan 2024
by Tess Buckley

The AI Safety Institute’s Ambitions and Progress Reports

Ambitions of the AISI 

The AI Safety Institute (AISI) was announced at the Bletchley AI Summit, as an evolution of the AI taskforce. The AISI will continue the taskforce's work on research and evaluations, while DSIT will maintain key policy functions such as identifying uses in the public sector and strengthening capabilities. The ambition of the Institute is to ensure public safety and proactively define the trajectory of AI.  

 

The AISI is positioning itself to become a global hub which deepens UK’s stake in this strategically important technology. The Institute was announced alongside endorsements from the US, Singaporean, German, Canadian and Japanese Governments as well as from major frontier AI labs. The AISI will make its work available to the world so that it is enabling an effective global response to mitigate the risks and reap the benefits of AI. 

 

 

The HOW – Goals for the ambition of AISI and specified paths to completion 

The AISI has set three priority areas to spend its time working towards achieving its ambitions, this includes evaluations of advanced AI models, conducting foundational AI Safety research and facilitating information exchange 

 

1) Develop and conduct evaluations on advanced AI systems 

  • The AISI first plan to understand the capabilities that would make a system safer, such as height of barriers for attacks, any exacerbations of existing harms, limitations of existing safeguards, strength of control. They then plan to create technical tools to test the capabilities, which may include soliciting collective input, methods to fine-tune systems with sensitive data, risk assessment, automated or human-crafted real-world attacks on full AI systems (red teaming), participation in model training and/or analysis of training data for bias.  
  • By providing external evaluations the AISI can support standards of what ‘good’ is, alongside the promotion of what constitutes best practice in evaluations. The AISI specifically notes they will not be conducting the evaluations with the intent to designate a particular system safe or unsafe. These evaluations will act as early warning signs of risk. The result of the evaluations will not be the red or green light on release decisions, but instead provide data about the risks in systems to inform the decision making by companies.  

 

2) Foundational AI Safety research 

  • The research focuses on a range of open questions connected to the evaluations of AI systems and supporting short and long-term AI governance. Three noted research projects include: 

  • Building products for AI governance: New real-world tools to fine-tune systems with sensitive data, analyse training data for bias,  and assurance methods to verify compliance with regulatory frameworks  

  • Improving the science of evaluation: The AISI will establish clear information sharing channels to support voluntary communication between the institute and other national and international actors.  

  • Fundamental AI safety research for novel approaches to safer systems: Technical scoping of emerging capabilities, methods to reduce filter bubble effects and development methods to enable responsible innovation 

 

3) Facilitating information exchange  

  • The AISI will act as a trusted actor and intermediary that is deeply connected across the AI ecosystem and able to distribute evaluation and research findings about advanced AI models with policymakers, regulators, international partners, private companies and the public.  

  • The AISI is positioned well to enable responsible dissemination of information. The supportive mechanisms of this pursuit may include guidelines to report harm and vulnerabilities of systems, a platform for AI companies to disclose information about their systems to bodies responsible for public safety. 

 

The government's third progress report highlights a focus on evaluating AI systems' capabilities and expanding the technical research team. The AISI is pre-deploying tests for AI system’s potential risks and collaborating on an evaluation suite. Talent retention is crucial, with notable hires like Geoffrey Irving, but Rumman Chowdhury leaving. 

The second report emphasises the Institute's goal for foundational AI safety research, with the establishment of Isambard-AI. Partnerships with Apollo Research and OpenMined aid information exchange. The first report showcases a specialised research team and partnerships with leading organisations. 

You can read more about the first, the second, and the third progress report. 

If you would like to learn more, please email [email protected].

 

Tess Buckley

Tess Buckley

Programme Manager - Digital Ethics and AI Safety, techUK

 

Related topics

Authors

Tess Buckley

Tess Buckley

Programme Manager, Digital Ethics and AI Safety, techUK

Tess is the Programme Manager for Digital Ethics and AI Safety at techUK.  

Prior to techUK Tess worked as an AI Ethics Analyst, which revolved around the first dataset on Corporate Digital Responsibility (CDR), and then later the development of a large language model focused on answering ESG questions for Chief Sustainability Officers. Alongside other responsibilities, she distributed the dataset on CDR to investors who wanted to further understand the digital risks of their portfolio, she drew narratives and patterns from the data, and collaborate with leading institutes to support academics in AI ethics. She has authored articles for outlets such as ESG Investor, Montreal AI Ethics Institute, The FinTech Times, and Finance Digest. Covered topics like CDR, AI ethics, and tech governance, leveraging company insights to contribute valuable industry perspectives. Tess is Vice Chair of the YNG Technology Group at YPO, an AI Literacy Advisor at Humans for AI, a Trustworthy AI Researcher at Z-Inspection Trustworthy AI Labs and an Ambassador for AboutFace. 

Tess holds a MA in Philosophy and AI from Northeastern University London, where she specialised in biotechnologies and ableism, following a BA from McGill University where she joint-majored in International Development and Philosophy, minoring in communications. Tess’s primary research interests include AI literacy, AI music systems, the impact of AI on disability rights and the portrayal of AI in media (narratives). In particular, Tess seeks to operationalise AI ethics and use philosophical principles to make emerging technologies explainable, and ethical. 

Outside of work Tess enjoys kickboxing, ballet, crochet and jazz music.

Email:
[email protected]

Read lessmore