You’re an experienced software engineer who’s ready to start contributing to making AI go well. Perhaps you’re unsure which of these three areas is the best place to apply your engineering skills:
Scale AI safety research
Build tools for AI safety researchers
Contribute directly to AI safety research
This guide walks through a project you can complete in <1 week to make your first contribution.
This blog post was written for graduates of BlueDot’s Technical AI Safety course who want to contribute their software engineering skills.
Why do a project?
Projects help you figure out where to apply your skills. Does your experience programming GPUs apply to training infrastructure? Does your agent scaffolding experience translate to evals? You’ll learn more from trying than from months of deliberation.
If you find a gap you can fill, it could lead you to orgs doing that work — or inspire you to start something yourself.
When applying to AI safety roles, this project can also serve as a strong signal of your skills and motivation. A well-executed project demonstrates clear reasoning, good communication, and high agency. I’ve spoken to several hiring managers who made offers or fast-tracked candidates because of excellent projects.
While you could complete a project through programs and fellowships that provide more structure, mentorship, and stipends, you can also do it yourself!
You already have what it takes to start. Here’s how.
Getting started
Block out 20-40 hours in your calendar.
If you want to keep going after that, schedule more time later. Right now, focus on finishing something by the end of the week.
Schedule focused blocks.
Aim for 2-4 hour sessions where you can get “stuck in”. The more fragmented your time, the more time you’ll burn context switching.
Rope someone in to work on this with you, or to hold you accountable.
Protect this time.
It’s easy to let social events or work meetings eat up your project hours. If this matters to you, treat these blocks like any other important commitment.
Make a calendar event for yourself!
Build a routine.
Work at the same time each day if possible.
Set a clear intention and put it somewhere you can see it. E.g. “Every day for the next 5 days, I’ll spend 4 hours working on my project at my desk.”
Choose your path
Option 1: Fix open issues in AI safety tools
Contribute to tools that AI safety researchers use every day.
Pick 2-5 good first issues to solve from an open source AI safety repo, like:
(email me if you know of more! anglilian@bluedot.org)
You can also message the maintainers, join their Discord / Slack communities or just try out the tools to figure out what needs improving.
For example, Anthony Duong looked at the issues on TransformerLens and spent a few weekends making PRs.
Option 2: Replicate and extend a research finding
Get closer to research by reproducing and extending a published result.
The goal is to reproduce and add a small tweak to ONE interesting finding from a paper.
Some ideas for picking a starting point:
Review the resources from the Technical AI Safety course
Replicate and extend Anthropic’s alignment faking demo
Pick an open problem in evals
Pick an open problem in mech interp
If you want to spend closer to 20 hours on the project, pick papers that have code (and datasets if applicable) available for you to run. Otherwise, expect to spend a lot more time working out how to implement the code.
Replicating a finding from scratch is more feasible in areas like evals or elicitation techniques that don’t require deep ML expertise.
Don’t get too bogged down with trying to make a novel research contribution. It takes months to develop good research taste. Instead, follow where your curiosity leads you.
Then, find the fastest way to get signal on whether this is an idea worth pursuing before running a high volume of tests. Can you prompt the model and see what happens? Can you use a fine-tuning API like TogetherAI or OpenAI?
Remember, this is meant to be a short project. You can always build on this in your next iteration.
You can get compute for your project from providers like RunPod or Hyperbolic, and access open source models via OpenRouter. If funding becomes a constraint, you can apply for a small grant as a BlueDot course graduate.
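To make the “get signal fast” step concrete: OpenRouter exposes an OpenAI-compatible chat completions endpoint, so a quick prompting experiment can be a few lines of Python. This is a minimal sketch, not a definitive implementation — the model name is a placeholder, and you’d set your own `OPENROUTER_API_KEY` environment variable:

```python
import json
import os
import urllib.request

def build_chat_request(model: str, prompt: str, temperature: float = 0.0) -> dict:
    # Helper (my own naming): assembles an OpenAI-compatible request body.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def query_openrouter(payload: dict) -> str:
    # POST to OpenRouter's OpenAI-compatible chat completions endpoint.
    req = urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Example usage (requires an API key and network access):
# payload = build_chat_request("meta-llama/llama-3.1-8b-instruct",
#                              "Say hi in one word.")
# print(query_openrouter(payload))
```

Iterating on a payload like this in a notebook is usually enough to tell you whether an idea deserves a larger test run.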
Ethan Perez has written useful tips for empirical research here.
Option 3: Make research reproducible
Unblock other researchers by fixing what’s broken in the replication process.
Many published papers are hard to replicate because the code is buggy, dependencies are missing, or the workflow is unnecessarily painful.
Your goal is to pick a paper with available code, try to run it, and fix whatever breaks or makes it painful to work with.
This could mean:
Fixing broken code or missing dependencies
Writing clearer setup instructions or documentation
Building micro-tooling for repetitive, manual steps (e.g., a script for batch querying LLMs, a config manager for hyperparameters, or a notebook that visualises results)
Packaging the reproduction in a way that “just works” for the next person
Focus on making it easy for everyone who comes after you to build on it. By the end, you should have opened a PR (which ideally gets merged!) or filed an issue.
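Micro-tooling doesn’t have to be fancy. As a sketch of the batch-querying idea above (function names are mine, not from any particular paper’s repo), a helper might just chunk prompts and retry failures:

```python
from typing import Callable, Iterator, List, Optional

def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield successive fixed-size batches of prompts."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def run_batches(
    prompts: List[str],
    query: Callable[[str], str],
    batch_size: int = 8,
    retries: int = 2,
) -> dict:
    """Run `query` over prompts in batches, retrying each failure a few times."""
    results: dict = {}
    for batch in chunked(prompts, batch_size):
        for prompt in batch:
            for attempt in range(retries + 1):
                try:
                    results[prompt] = query(prompt)
                    break
                except Exception:
                    if attempt == retries:
                        results[prompt] = None  # give up, record the failure
    return results
```

Wrapping a repetitive manual step in twenty lines like this is often the single biggest quality-of-life improvement for the next person who tries to reproduce the paper.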
Write it up!
The biggest mistake people make is treating the write-up as an afterthought. Your work won’t advance AI safety if no one engages with it.
Plan to spend at least one full day writing this up. If you’re spending less, you’re not spending enough.
This includes every form the write-up takes: the longer-form blog post, plus the Twitter/X or LinkedIn posts that help with distribution.
As a rule of thumb, allocate your writing time in proportion to the total reading time each format will receive. A viral thread deserves as much effort as a detailed post.
Why write it up
Your write-up is how you get feedback and find a home for your work. It’s also how others gauge how well you know your stuff and how much you’ve thought it through.
Many of our past graduates have found their co-founders, collaborators, roles and funding opportunities from posting their projects.
You might be thinking: “I can write it up much faster than that” or “I’d rather spend time working on the project”. But if you want your work to reach people, it’s worth communicating well.
Writing clearly demonstrates your understanding. You can’t write clearly about something you don’t fully get.
Think about explaining your own codebase to a new hire. If you’re fumbling through the explanation, you probably don’t understand the architecture as well as you thought. Clear communication is an indicator of understanding, and it takes work to achieve.
How to write it well
Lead with what you did. Don’t bury your insights in walls of text or unnecessary jargon. State it clearly upfront.
Explain why it matters. Motivate why you did this and why anyone should care. People are more likely to read your work (and remember it) when they understand the why.
Keep it simple. The goal is for people to actually read and understand what you’ve done. This is not easy. Writing something short and clear is much harder than rambling on. But that’s exactly why you need to devote real time to it.
Get feedback constantly. Good writing requires iteration. Explain your project to others as you work on it. See if they understand. See if they’re convinced it’s compelling. If not, workshop your idea or delivery.
Don’t wait until the eleventh hour. Many AI safety researchers like Neel Nanda have highlighted how important this is. Start early and keep refining as you go.
Share your work
I know it feels scary to stake your name publicly on what you’ve done. But here’s the thing: your work is far more likely to get drowned out in the noise than criticised. You have to put in a LOT of work to be seen (there’s a whole industry around marketing!). And being seen is exactly what you want.
If you’re working on research, much of the community is on Twitter/X, so focus on making a thread and posting on LessWrong and the Alignment Forum.
Star this project on your GitHub, feature it on your blog (make one if you don’t have one!), post it on LinkedIn and keep talking to people about it.
So what are you waiting for? Let’s get started!
PS: Use/apply to our Technical AI Safety Project sprint.