About the AI Security Institute

The AI Security Institute is the largest government team in the world dedicated to understanding AI capabilities and risks.

Our mission is to equip governments with an empirical understanding of the safety of advanced AI systems. We conduct research to understand the capabilities and impacts of advanced AI, and we develop and test risk mitigations. We focus on risks with security implications, including the potential of AI to assist with the development of chemical and biological weapons, to carry out cyber-attacks, and to enable crimes such as fraud, as well as the possibility of loss of control.

The risks from AI are not sci-fi; they are urgent. By combining the agility of a tech start-up with the expertise and mission-driven focus of government, we’re building a unique and innovative organisation to prevent AI’s harms from impeding its potential.

Mechanistic Interpretability

AISI is launching a brand-new Mechanistic Interpretability team to research a fundamental question: how can we tell if a model is scheming? This is an ambitious bet to bring interpretability as a field into prime time. We believe that this is a vital challenge that mechanistic interpretability can help solve, ensuring that dangerous capability evaluations can reliably determine whether models are safe to release even when the models themselves are capable of gaming the evals. We also think it can lead to an entirely new field of alignment evaluations and make substantial contributions to the problem of technical AI safety.

To launch this project we're looking for a team lead, research scientists and research engineers. Apply now to join the largest technical AI safety lab on the planet - help us make this happen!

Role Summary

This team will have a large amount of scientific autonomy, with the ability to chase ambitious research bets. As an example of the kind of work you could be doing, however, your responsibilities may involve any of the following:

Supervised fine-tuning (SFT) of large models for scheming.
Training sparse autoencoders (SAEs), or fine-tuning open-source SAEs.
Circuit discovery/analysis.
Automated scheming detection.

You’ll receive coaching from your manager and mentorship from the research directors at AISI (including Geoffrey Irving and Yarin Gal). You will also regularly interact with world-famous researchers and other incredible staff (including alumni from Anthropic, DeepMind, OpenAI and ML professors from Oxford and Cambridge). We have a very strong learning & development culture to support this, including Friday afternoons devoted to deep reading and multiple weekly paper reading groups. From a compute perspective, you'll have unparalleled access to resources including 5,448 Nvidia Grace-Hopper GPUs (e.g., H100s).

Person Specification

You may be a good fit if you have some of the following skills, experience and attitudes:

Hands-on mechanistic interpretability research experience.
Experience working within a research team that has delivered multiple exceptional scientific breakthroughs in deep learning (or a related field). We’re looking for evidence of an exceptional ability to drive progress.
Comprehensive understanding of large language models (e.g. GPT-4). This includes both a broad understanding of the literature and hands-on experience with things like pre-training or fine-tuning LLMs.
Strong track record of academic excellence (e.g. multiple spotlight papers at top-tier conferences).
Improving scientific standards and rigour, through things like mentorship & feedback.
Strong written and verbal communication skills.
Experience working with world-class multi-disciplinary teams, including both scientists and engineers (e.g. in a top-3 lab).
Acting as a bar raiser for interviews.

Salary & Benefits

We are hiring individuals at all levels of seniority and experience within this research unit, and this advert allows you to apply for any of the roles within this range. Your dedicated talent partner will work with you as you move through our assessment process to explain our internal benchmarking process. The full range of salaries is available below. Salaries comprise a base salary and a technical allowance, plus additional benefits as detailed on this page.

Level 3 - Total Package £65,000 - £75,000 inclusive of a base salary £35,720 plus additional technical talent allowance of between £29,280 - £39,280
Level 4 - Total Package £85,000 - £95,000 inclusive of a base salary £42,495 plus additional technical talent allowance of between £42,505 - £52,505
Level 5 - Total Package £105,000 - £115,000 inclusive of a base salary £55,805 plus additional technical talent allowance of between £49,195 - £59,195
Level 6 - Total Package £125,000 - £135,000 inclusive of a base salary £68,770 plus additional technical talent allowance of between £56,230 - £66,230
Level 7 - Total Package £145,000 inclusive of a base salary £68,770 plus additional technical talent allowance of £76,230

This role sits outside of the DDaT pay framework because the scope of this role requires in-depth technical expertise in frontier AI safety, robustness and advanced AI architectures.

There is a range of pension options available, details of which can be found through the Civil Service website.

Selection Process

In accordance with the Civil Service Commission rules, the following list contains all selection criteria for the interview process.

Required Experience

We select based on skills and experience regarding the following areas:

Mechanistic interpretability experience
Research problem selection
Research science
Writing code efficiently
Python
Frontier model architecture knowledge
Frontier model training knowledge
Model evaluations knowledge
AI safety research knowledge
Written communication
Verbal communication
Teamwork
Interpersonal skills
Tackling challenging problems
Learning through coaching

Desired Experience

We additionally may factor in experience with any of the areas that our work-streams specialise in:

Autonomous systems
Cyber security
Chemistry or Biology
Safeguards
Safety Cases
Societal Impacts

Additional Information

Internal Fraud Database

The Internal Fraud function of the Fraud, Error, Debt and Grants Function at the Cabinet Office processes details of civil servants who have been dismissed for committing internal fraud, or who would have been dismissed had they not resigned. The Cabinet Office receives these details from participating government organisations. Civil servants in this position are banned from further employment in the civil service for 5 years. The Cabinet Office then processes this data and discloses a limited dataset back to DLUHC as a participating government organisation. DLUHC then carries out pre-employment checks to detect instances where known fraudsters are attempting to reapply for roles in the civil service. In this way, the policy is enforced and the repetition of internal fraud is prevented. For more information, please see the Internal Fraud Register.

Security

Successful candidates must undergo a criminal record check and get baseline personnel security standard (BPSS) clearance before they can be appointed. Additionally, there is a strong preference for eligibility for counter-terrorist check (CTC) clearance. Some roles may require higher levels of clearance, and we will state this by exception in the job advertisement. See our vetting charter here.

Nationality requirements

We may be able to offer roles to applicants of any nationality or background. As such, we encourage you to apply even if you do not meet the standard nationality requirements.

Working for the Civil Service

The Civil Service Code sets out the standards of behaviour expected of civil servants. We recruit by merit on the basis of fair and open competition, as outlined in the Civil Service Commission's recruitment principles. The Civil Service embraces diversity and promotes equal opportunities. As such, we run a Disability Confident Scheme (DCS) for candidates with disabilities who meet the minimum selection criteria. The Civil Service also offers a Redeployment Interview Scheme to civil servants who are at risk of redundancy, and who meet the minimum requirements for the advertised vacancy.

Diversity and Inclusion

The Civil Service is committed to attracting, retaining and investing in talent wherever it is found. To learn more, please see the Civil Service People Plan and the Civil Service Diversity and Inclusion Strategy.


Interpretability Researcher - Autonomous Systems