AI safety needs excellent software engineers. You can already start contributing to making AI go well, even without deep ML expertise.
Here are three broad ways to do so:
Scale AI safety research
Build tools for AI safety researchers
Contribute directly to AI safety research
This blog post was written for graduates of BlueDot’s Technical AI Safety course who want to contribute their software engineering skills.
Scale AI safety research
This means taking promising safety experiments that work on 1,000 examples and running them efficiently on millions, or moving from single GPUs to distributed clusters.
This path lets you apply your engineering expertise to scaling code, without needing deep ML knowledge.
Some examples of what this might look like:
Optimising compute utilisation: Turning single-GPU experiments into distributed training runs, or improving GPU utilisation from 30% to 90% through better batching and memory management.
Scaling research code: Refactoring Jupyter notebooks that take days to run into production pipelines that finish in hours, or creating frameworks that handle datasets too large to fit in memory.
Making evaluations reliable: Debugging why evaluation runs partially fail overnight, handling API timeouts gracefully, and implementing automatic retries for failed samples (a minimal retry sketch follows this list).
Building observability: Creating dashboards that show which evaluation samples are running, which failed, and why – bringing established SWE practices like distributed tracing to ML workflows.
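To make the reliability point concrete, here is a minimal sketch of retrying flaky API calls with exponential backoff. It isn't tied to any particular evaluation framework, and `request_fn` is a hypothetical stand-in for whatever wraps a single API request in your pipeline.

```python
import random
import time


def call_with_retries(request_fn, max_attempts=5, base_delay=1.0):
    """Call a flaky API request, retrying with exponential backoff and jitter.

    `request_fn` is a hypothetical zero-argument callable wrapping one API call.
    In practice, catch your client's specific timeout/rate-limit exceptions
    rather than the blanket Exception used here for brevity.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request_fn()
        except Exception as err:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.5)
            print(f"Attempt {attempt} failed ({err}); retrying in {delay:.1f}s")
            time.sleep(delay)
```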
It’s generally useful to understand the basics of ML, but in most cases, asking researchers good questions will get you the context you need. You don’t have to know how to design a loss function or interpret attention patterns. Focus on your value-add – your engineering skills!
Build tools for AI safety researchers
This means creating the tools that multiply researcher productivity.
AI safety researchers spend significant time on infrastructure: running evaluations, analysing model internals, managing compute, and reproducing experiments. Some common examples of research workflows and tools include:
Evaluation frameworks
Researchers run the same evaluations repeatedly to track how models perform over time, and when a paper publishes results, others want to reproduce them or test them on newer models.
This requires three things:
Infrastructure layer: Managing the servers, compute, and resource allocation that large-scale evaluations run on.
For example: using Kubernetes to manage compute, setting up cloud infrastructure like AWS, GCP or Azure and using job schedulers for running millions of evals.
Orchestration layer: Building the framework for running evaluations.
Tools like Inspect provide the building blocks for prompting models and analysing their responses systematically (a minimal example follows this list).
Inference optimisation tools like vLLM and SGLang handle efficient model serving with batching and memory management at scale.
Model serving platforms like Ollama make it easy to run models locally or self-host them for evaluation workflows.
API rate limiting infrastructure that centrally manages rate limits across model API providers (OpenAI, Anthropic, etc.).
Evaluation layer: Making specific evaluations reproducible and portable across models.
Libraries like inspect-evals provide ready-to-run evaluations for MMLU or other benchmarks. Instead of manually implementing them each time, researchers can run them with a single line of code.
ControlArena uses Inspect’s framework to evaluate different control protocols.
Analysis tools like Docent or InspectScout help researchers interpret evaluation results without custom analysis scripts.
Visualisation tools like Inspect Viz or dashboards help communicate these results more broadly.
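To make the orchestration and evaluation layers concrete, here is a minimal sketch of a custom evaluation written with Inspect. The single-sample dataset and model name are toy placeholders, and exact parameter or scorer names can differ between Inspect versions, so treat it as illustrative rather than canonical.

```python
from inspect_ai import Task, eval, task
from inspect_ai.dataset import Sample
from inspect_ai.scorer import match
from inspect_ai.solver import generate


@task
def capital_check():
    """Toy evaluation: does the model answer a simple factual question correctly?"""
    return Task(
        dataset=[Sample(input="What is the capital of France? Answer with one word.",
                        target="Paris")],
        solver=generate(),  # sample a completion from the model
        scorer=match(),     # score by matching the target string
    )


if __name__ == "__main__":
    # Model name is illustrative; any provider/model Inspect supports works here.
    eval(capital_check(), model="openai/gpt-4o-mini")
```

Libraries like inspect-evals package full benchmarks as tasks in this style, which is what lets researchers run them without re-implementing the dataset, solver, and scorer each time.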
Mechanistic interpretability tools
Understanding what happens inside models requires examining activations at each layer, testing changes, and tracking how information flows through the network.
Experimentation tools: Tools like TransformerLens let researchers probe models without building infrastructure from scratch. They can use standard functions to run experiments in minutes instead of writing custom code that takes hours (see the sketch after this list).
Model access: Large models often don’t fit on a single GPU or require significant compute resources. Tools like NNsight provide API access to models, so researchers can run experiments without self-hosting.
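Here is a minimal sketch of the kind of probing TransformerLens enables: load a small open model, run it while caching activations, and inspect the residual stream at one layer. The prompt and layer index are arbitrary choices for illustration.

```python
from transformer_lens import HookedTransformer

# Load a small open model with hooks attached at every layer
model = HookedTransformer.from_pretrained("gpt2-small")

tokens = model.to_tokens("The Eiffel Tower is in the city of")
logits, cache = model.run_with_cache(tokens)

# Look at the residual stream after layer 5 (arbitrary layer, for illustration)
resid_post = cache["resid_post", 5]
print(resid_post.shape)  # (batch, sequence_position, d_model)
```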
Running open-source models
Testing open-source models involves practical challenges like:
Debugging: Getting models to run as expected, handling API quirks, and troubleshooting configuration issues
Reproducibility: Setting up chat templates, parameters, and output formats to match published results (see the sketch after this list)
Self-hosting: Configuring models for local deployment to reduce costs or have more control over the setup
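A common source of irreproducible results is a mismatched chat template. Here is a minimal sketch, using Hugging Face transformers, of loading an open model and applying its own chat template before generating; the model name is just an example, and it assumes you have a GPU and access to the weights.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Example model; substitute whichever open-source model you are testing
model_id = "Qwen/Qwen2.5-7B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Briefly explain what a jailbreak prompt is."}]

# Use the model's own chat template rather than hand-rolling the prompt format
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```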
AI-assisted research infrastructure
Building better tooling here means creating scaffolds that help AI understand research contexts and produce working, reliable code, like:
MCP integration: Connecting AI agents to MCP servers to access evaluation results, model outputs, or experimental data (a minimal server sketch follows this list)
Automated workflow design: Using AI to generate evaluation pipelines, data processing scripts, or analysis code based on research requirements
Structured codebases: Organising projects so AI tools can navigate research code, understand context, and suggest relevant changes
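As an illustration of the MCP integration idea, here is a minimal sketch of an MCP server exposing evaluation results to an agent, using the FastMCP helper from the official Python SDK. The tool name, results path, and JSON format are hypothetical placeholders.

```python
import json

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("eval-results")


@mcp.tool()
def get_eval_summary(run_id: str) -> str:
    """Return a summary of a stored evaluation run (hypothetical results layout)."""
    with open(f"results/{run_id}.json") as f:
        results = json.load(f)
    return json.dumps({"run_id": run_id, "accuracy": results.get("accuracy")})


if __name__ == "__main__":
    mcp.run()  # serve over stdio so an MCP-capable agent can connect
```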
The gap
These tools exist but aren’t perfect. The key here is having a product mindset:
Treat AI safety researchers as your customer and the tools you’re building as the product.
Replicate their paper to understand the workflow of running an evaluation or training experiment.
Talk to researchers to understand their pain points.
Understand why some researchers opt against using particular tools.
Ask the maintainers of existing tools where the gaps are.
As a software engineer (or technical product manager), you can leverage your unique value here! You can build tools that handle multiple use cases, have actual documentation, and won’t break when someone updates a dependency.
Some quick ways to start:
Pick up issues in open source repos, like Inspect, ControlArena or TransformerLens
Try out using the libraries to get a sense of where the gaps are
Implement a benchmark from the open issues in inspect-evals
Run an evaluation on a self-hosted open-source model
Here are more details on research tools and workflows.
Contribute directly to AI safety research
This means developing techniques to train AI systems to be safer, experimenting with approaches, testing what works, and implementing solutions directly on the models.
The depth of ML expertise you need depends entirely on how you want to contribute:
More ML-heavy contributions involve designing experiments: proposing hypotheses about safety techniques and guiding the research direction. These roles are typically research leads or research scientists.
More engineering-heavy contributions involve turning research ideas into runnable experiments and implementing the training pipelines. These roles are typically called research engineers or contributors.
Note: While the job title might be the same, the split between ML and engineering varies widely by org. So don’t anchor too hard on job titles.
With a basic understanding of how AI systems work, you could:
Test whether a safety technique that works on GPT-4 also works on Claude or open-source models (a rough sketch follows this list).
Take a paper on jailbreak resistance and test it on new prompts or different model sizes
Replicate a paper and tweak one variable. (example)
Evaluate whether METR’s finding that the length of SWE tasks AI can complete keeps doubling also holds for offensive cyber tasks. (example)
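As a rough sketch of the cross-model idea above, here is how you might send the same test prompts to two providers and compare refusal behaviour. The prompts, model names, and the crude keyword-based refusal check are all illustrative placeholders; a real replication would use the paper's own dataset and scoring.

```python
import anthropic
from openai import OpenAI

# Illustrative placeholder; a real replication would use the paper's prompt set
TEST_PROMPTS = ["Explain how to pick a basic pin-and-tumbler lock."]

openai_client = OpenAI()
anthropic_client = anthropic.Anthropic()


def is_refusal(text: str) -> bool:
    """Crude keyword-based refusal check, purely for illustration."""
    return any(phrase in text.lower() for phrase in ("i can't", "i cannot", "i won't"))


for prompt in TEST_PROMPTS:
    gpt_reply = openai_client.chat.completions.create(
        model="gpt-4o-mini",  # example model name
        messages=[{"role": "user", "content": prompt}],
    ).choices[0].message.content

    claude_reply = anthropic_client.messages.create(
        model="claude-sonnet-4-20250514",  # example model name
        max_tokens=300,
        messages=[{"role": "user", "content": prompt}],
    ).content[0].text

    print(f"{prompt[:40]!r}: GPT refused={is_refusal(gpt_reply)}, "
          f"Claude refused={is_refusal(claude_reply)}")
```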
Getting these experiments to actually work is where your engineering skills shine! Research code is messy, and reproducing results often requires debugging. You’d be solving problems similar to what AI researchers face.
Then, you can post your findings on LessWrong or the Alignment Forum. These are genuine research contributions! You’re validating results, finding edge cases, and building evidence about what works. Many successful researchers started here, and there’s a lot of low-hanging fruit.
However, you’ll need far more ML expertise if you want to do things like:
Design novel reinforcement learning approaches for alignment
Propose radically new mechanistic interpretability techniques
Lead research directions
If ML expertise isn’t your differential advantage, don’t force it! Your software engineering skills are already incredibly valuable, and you don’t have to spend hundreds of hours upskilling on ML when there are other ways to contribute.
Nor does contributing to research require learning everything there is to know about ML or getting a PhD first. You might instead start with a particular area, research paper or question and gain just enough context to achieve your goal. If you do want to upskill, self-studying ARENA is a good place to start.
Other engineers contributing to direct research have also written advice, like Ethan Perez and Andy Jones.
What do I do now?
Start applying. AI safety needs great engineers, and many orgs are looking for engineering talent, even if they haven’t posted roles yet.
Check the BlueDot community Slack for open roles, or directly reach out to AI safety orgs and ask about their engineering bottlenecks.
While you’re applying, you can also:
Facilitate BlueDot’s Technical AI Safety course
Talk to engineers or researchers in AI safety to understand where the engineering bottlenecks are.
Replicate a safety paper to understand the workflow. (project examples)
Resolve issues in open source AI safety tools, or contribute improvements to them. (example)
Build your ML foundation (not because you need to be an expert, but because understanding what you’re building infrastructure for makes you more effective)
There are almost certainly more ways to contribute your engineering skills to AI safety. Lean on what you do best!
Acknowledgments
As someone outside both engineering and AI safety research, I’ve leaned on the experience of others. Thanks to Adam Jones, Alexander Meinke, Jun Shern Chan, Max McGuinness, Monika Jotautaitė, Oliver Makins and Rusheb Shah for their feedback. Any misrepresentations are my own.