Third Progress Report Towards Ambitions of the AI Safety Institute
The government's emphasis on evaluating AI systems' technical capabilities is evident from the publication of previously classified information and the commitment to expanding the technical research team. The AISI is conducting pre-deployment testing for potential risks in advanced AI systems, collaborating with various organisations to develop an evaluation suite focusing on misuse, societal impacts, and safeguards. Talent acquisition and retention are highlighted as crucial for AISI's reputation and progress, with notable hires like Geoffrey Irving and Chris Summerfield and the notable loss of Rumman Chowdhury. The AISI is facilitating information exchange through publications like the International Scientific Report on Advanced AI Safety and engagement with international partners like South Korea.
The AISI has set three priority areas to achieve its ambitions: evaluating advanced AI models, conducting foundational AI safety research, and facilitating information exchange. These are the key commitments from the third progress report:
1) Develop and conduct evaluations of advanced AI models
- Started pre-deployment testing
  - The AISI has begun pre-deployment testing of advanced AI systems from companies such as Anthropic and Google DeepMind, which agreed to government testing at the AI Safety Summit.
- Partnered with over 20 organisations
  - The AISI has now partnered with over 20 organisations, including Fuzzy Labs, Pattern Labs and FutureHouse, to build the evaluations suite. Testing currently focuses on four areas: misuse, societal impacts, autonomous systems and safeguards.
  - Misuse work focuses on chemical and biological capabilities and offensive cyber capabilities.
- Onboarded 23 technical researchers
  - A key government priority has been demonstrating the technical capability to evaluate AI systems, which began with the publication of previously classified information in the discussion paper on risks from AI systems. This has since been followed by a commitment to keep pace with the newest AI models by onboarding 23 technical researchers, with plans to grow the team to 50-60 by the end of 2024.
2) Foundational AI Safety research
- Talent acquisition and retention
  - This progress report highlights the importance of talent density, echoing the sentiment of "build it and they shall come." Attracting and retaining the right researchers is crucial, as this will bolster AISI's reputation and solidify its position as an organisation equipped to make meaningful contributions to AI safety.
  - Hogarth's leading announcements included securing the wealth of knowledge of Geoffrey Irving and Chris Summerfield. Irving joins AISI as Research Director from the Scalable Alignment Team at Google DeepMind. Summerfield, Professor of Cognitive Neuroscience at Oxford University, also joins as Research Director to lead the societal impacts work.
3) Facilitating information exchange
- The International Scientific Report on Advanced AI Safety
  - Published the principles behind the Report, a landmark paper sharing the latest research and opinions from the world's leading AI experts, chaired by Turing Award winner Yoshua Bengio.
  - The External Advisory Panel (30 countries plus the EU and UN) held its first meeting last week.
- Preparing for South Korea
  - AISI team members have already visited Seoul twice to discuss the upcoming Summit and begin preparations.
You can read more about the first and second progress reports and the ambitions of the institute.
If you would like to learn more, please email [email protected].