How I Got a Job at DeepMind as a Research Engineer (without a Machine Learning Degree!)
I recently landed a job at DeepMind as a Research Engineer! It’s a dream come true for me; I still can’t believe it! 😍 (if you feel like an imposter sometimes, trust me, you’re not alone…)
I don’t have a Ph.D. in ML.
I don’t have a master's in ML.
In fact, I don’t have any type of degree in ML — I did my bachelor’s in EE.
That’s it. That’s my formal education. I dropped out of my EE master’s as I figured out I could learn stuff much more effectively and efficiently on my own.
My first exposure to ML was back in 2018. And my first exposure to the world of programming was when I was 19 (I’m 27 now).
So how in the world did I pull it off? (hint: I’m not that smart, you can do it!)
In this blog, I’ll try to tell you the whole story. I’ll be very transparent in order to help you guys out as much as possible. This blog even contains a snapshot of my resume (very much suboptimal, I feel uncomfortable sharing it 😅), information on how I got referred to DeepMind, and many more details that will hopefully bring this blog closer to your heart.
I’ll structure this blog post into roughly 3 parts:
- I’m going to give you some context on who I am and what my learning journey has looked like so far. I don’t want to set false expectations that you can pull this off in a short period of time, without an ML degree, if you don’t already have solid foundations and experience with self-education.
- I’m going to explain to you, in nitty-gritty detail, what my ML curriculum has looked like since my first exposure to ML back in 2018.
- Finally, I’m going to tell you how exactly my final preparations for DeepMind looked (+ more info on how I got a referral).
Note: for all practical purposes you can read DeepMind in this post as “any top-notch AI lab/company like DeepMind, OpenAI, FAIR, AI2, etc.” because I’m fairly sure that the same (or at least similar) principles would apply there as well.
Well, let’s go!
Update: if you prefer video format I made a video based on this blog:
Part 1: My background (no ML degree)
It’s the summer of 2017. I’m doing my first ever internship at a small German startup called Telocate, as an Android dev. I’m about to graduate (in September, once I return to Belgrade, Serbia) from an EE program that had a strong focus on digital/analog electronics and less so on CS.
Sometime around the end of my studies (end of 2016), I realized I actually wanted to completely pivot to the software industry. It’s exciting, it’s much more open compared to the hardware industry, there are all these hackathons and datathons, and it offers amazing salaries.
So earlier that year (2017) I started learning Android on my own and I landed this internship in Freiburg, Germany. I was so excited — the world was mine.
Snap back to the summer of 2017, Freiburg. I’m chatting with a friend who I knew was a very successful software engineer. He’d done multiple internships at companies such as MSFT, Facebook, Jane Street, etc., and he was my peer.
I realized I was falling behind by a lot (along the SE dimension, at least)! Some of my friends had been doing competitive programming since their high school days. Algorithms and data structures were second nature to them at this point.
I wanted to apply to big tech companies, and I suddenly realized that all of the skills I had accumulated throughout the years aren’t that important when you’re interviewing at FAANG.
Nobody cared that I speak 5 languages, that I know a bunch about how microcontrollers work in the tiniest of details, how an analog high-frequency circuit is built from bare metal, and how computers actually work. All of that is abstracted away. You only need…algorithms & data structures pretty much.
I felt confused and angry at the same time. Some 16-year-old kid (I was 23 back then) could go and grind Leetcode problems for 9 months and already be ahead of me in this respect, even while lacking so much of the fundamental engineering knowledge.
It’s a nasty feeling, I remember it very well. I had been ambitious and hardworking throughout my whole life, and yet I was falling behind. I remember hating my life for not figuring this out earlier on in my childhood.
But, what can you do, this world is obviously not fair. Some kids get to grow up in Silicon Valley and have rich and educated parents whereas some kids are born in a “third world” country, sometimes starving to death.
This strong feeling of injustice was giving me energy. I thought to myself — I’m going to hustle really hard over the next years. Warp speed on.
I came back from Germany. I successfully graduated 3 days after that. I remember some of my friends were celebrating their graduation for 3+ weeks, drinking 🍺 and everything that goes along with it. Me? I celebrated my success over the weekend and immediately started planning ahead. I felt I was on a mission.
I enrolled in a master’s program “symbolically” (you get some perks as a student) but I knew that I really wanted to land a job at a big tech company.
And so I made my own SE curriculum, and I started attending an algorithms course at my university without even being enrolled in it. I was sitting there with students who were 2 years younger than me.
Over the next couple of months I was working like crazy, and I immediately started applying. Soon after, I got myself an interview with Facebook in December 2017, and I failed miserably.
I kept learning and I got an invitation from Microsoft — the tests are in March 2018 they said. Got it, I’m gonna work hard until then. I was learning, I was attending hackathons and datathons, and I started doing competitive programming (mostly Topcoder).
Then, in February 2018, I heard about an ML summer camp being organized by Microsoft folks (Microsoft has an office in Belgrade; it’s the only big tech company we have). I applied, and after nailing a 3-day-long entrance test I got accepted!!! (my decent digital image processing background helped)
I had a strong gut feeling that this may be my opportunity to show that I’m good enough to work at Microsoft! 💡
Around that same time, I landed an internship in Ouro Preto, Brazil via a student organization called IAESTE. It ended up being more of an amazing life experience (I lived in a fraternity with 11 Brazilians and I had to learn to speak Portuguese) rather than a tech internship. It was scheduled right after my ML summer camp. It’s as if the stars aligned!
March 2018 came, and I did my first tests with Microsoft. I failed the written entrance tests. Luckily, 2 months later they gave me another opportunity. I tried again, passed the written entrance tests, did a round of 4 back-to-back interviews on-site, and 3 weeks later…they told me that there currently were no positions open for me. I interpreted that as a rejection. Again.
So if there is somebody who can empathize with newcomers to this field, that’s me. I know how mentally draining this whole job-hunt process can be.
Fast forward to the summer of 2018. I attended the ML summer camp and I nailed it. I was very engaged, I learned a lot and our final project was internally voted as the best one (solving Rubik’s cube with deep RL).
Immediately after the camp, I flew to Brazil in early August 2018.
3 days in and I get a call from Microsoft. You got an offer. We need you asap.
I was super happy and super sad at the same moment. I had to leave the beautiful country of Brazil after a bit over a month instead of after 3 months. It was either that or I wouldn’t get the job. They told me that the team that wanted me was the “HoloLens team”!!! The same team that ran the summer camp. The most famous, exciting team we had in Belgrade. I couldn’t miss this opportunity. I took it.
September 2018, I’m in Belgrade working at Microsoft as a software engineer! I pulled it off. I was incredibly happy.
And that’s where my SE and ML story really started.
But ever since I attended that summer camp, the idea of DeepMind, and the feeling that it was almost unreachable for me, sat deep in my subconscious. The only folks I knew there (some of them were lecturing at the camp) had studied at Oxford or Cambridge.
But there is a part of me that’s very attracted to impossible challenges. I thought to myself: I nailed Microsoft even though I once thought that was very hard. Why should this be any different?
And so the idea stayed buried in my subconscious. In the meantime, I really enjoyed the fact that I was at Microsoft.
Throughout 2018 and 2019 I focused really hard on learning as much as I could about software engineering while working at Microsoft. I was reading coding books such as Scott Meyers’ C++ book, etc.
Other than that I started doing ML in my free time.
I finished all of Andrew Ng’s Coursera courses before the end of 2018. I attended internal ML hackathons, I was heavily involved in the organization of the aforementioned ML camp, and I started reading research papers!
I still remember how they made me feel (and they still do!). 😂 I felt very dumb. I was like, really? I studied EE for 4 years and I can still feel this way? (Little did I know; a semidefinite chuckle)
Somehow, around that same time, I also discovered Lex Fridman’s podcast. That was 3 years ago, before he even had 50k subs on his YouTube channel! I remember religiously watching all of the 20-ish podcasts he had back then.
I remember starting to feel increasingly confident in my ML knowledge. And so, at the beginning of 2019, I shared my first blog ever, on how to get started with ML:
(My writing was terrible back then, and it probably still is, I apologize for that. I’ll try to update it over the next couple of days after publishing this blog. I won’t change the content, as I want to keep it as a historical document; also, the strategy I’d take to learn ML hasn’t changed, so it’s still very much relevant.)
At the end of 2019, I internally switched to an ML role (thanks to the religious engagement my manager noticed). I was sent to the ICCV 2019 conference.
I returned and I got the task of implementing a paper from scratch in PyTorch. I didn’t know any PyTorch back then. Needless to say — I learned a lot.
And this is where another warp-speed moment happened. But this time I was smarter: this time I was going to share my journey online as I progressed.
So what happened? What made me change my mindset? What pushed me from the old Aleksa, who was learning and working hard but kept everything to himself, to this new Aleksa?
Well, throughout 2019, as a “side track”, I was reading a lot about entrepreneurship, marketing, and personal finance. And along the way, I found this guy called Gary Vaynerchuk who motivated me to start sharing stuff I do online. Thanks Gary! ❤️
I know most of you probably don’t like him (if you know of him, that is), but he had a positive influence on my thinking. And for that I’m grateful. He definitely wasn’t the only person who influenced my mindset (I was sharing my code even back in 2018), but he finally made it click.
You may say (if you don’t know me): “Well, this story is inspiring and all, but how did you create your own SE/ML curriculum in the first place?”
Well, before all of this happened, I already had a strong track record of learning on my own. If it still feels like magic I strongly recommend you go ahead and read this blog:
5 Tips to Boost your LEARNING: learn how to learn, effectively and efficiently (gordicaleksa.medium.com)
I’ll leave a TL;DR here: through my workout routines, learning human languages throughout my life, and maths (all of that on my own), I got skillful at learning. Coursera’s “Learning How to Learn” course helped as well.
Now that you have all of the necessary context let me tell you about the ML curriculum I followed to land a job at DeepMind!
Part 2: ML curriculum (January 2020 — June 2021)
New Year 2020 came along. I had just finished reading “Crushing It!” by GaryVee and decided I should start my own YouTube channel, sharing my ML learning journey along the way.
The pandemic struck and I was in warp speed again. The stars aligned. ⭐
I decided that over the next year I wanted to cover various subfields of AI, roughly dedicating 3 months to each one of them.
But things weren’t that smooth in the beginning, I was still figuring stuff out. My first subfield, neural style transfer (NST), took longer than 3 months because I found it enjoyable and I thought why rush it?
Along the way, I found a perfect strategy for myself.
I structured my learning the following way: I had “macro” cycles (3-month periods where I tackle a new subfield) with multiple micro-cycles interspersed within each macro.
The micro-cycles are of 2 types:
- Input mode: I’m ingesting information. The goal is to either get a high-level understanding of the structure of the subfield (blogs, videos) or an in-depth understanding of the topic at hand (research papers, books, etc.)
- Output mode: I’m sharing the information I accumulated during my input mode. Teach others! Create public artifacts like YouTube videos, GitHub projects (1 project in the middle of a macro), and blogs (1 blog at the end of a macro), as well as higher-frequency updates on LinkedIn. Much later I started sharing on Twitter and Discord as well.
Now, the tricky part was combining all of this with my full-time job at Microsoft! It took a significant amount of willpower. I’d hit the program for 2 hours as soon as I woke up, go for a quick stroll, do my job at Microsoft, take a 30-minute power nap after wrapping it up, and finally work for an additional 2–3 hours before going to sleep. The pandemic helped me maintain this crazy rhythm.
Time management and stress handling probably deserve a separate blog post, but it mostly boils down to willpower, the right mindset, and taking smart breaks (power naps are golden; let me know in the comments if you’d like a separate blog on the topic of time management).
Now let me go into the specifics of the macros I did on my way to DeepMind! Luckily I started writing blogs sometime around my transformers macro so if you want to know all the nitty-gritty details and the strategy I took, I’ll link the blogs in every single macro section.
For the first 3 macros (NST, DeepDream, and GANs) I, unfortunately, didn’t write a dedicated blog, but the intro section of the transformers blog below as well as the YouTube videos and projects I created can fill in the gaps:
Having said that, here are the TL;DRs of every macro I did:
[1] NST (Neural Style Transfer)
During this period I was learning about neural style transfer. I ended up reading a pile of research papers (though a lot less than later when I became proficient), implementing 3 different projects, which I then open-sourced on my GitHub, and I made an NST playlist on YouTube.
I think that this is a great topic to start with as it’s highly visual and visceral. I recently wrote some of my thoughts on a similar topic in this LinkedIn post. Also, digital image processing and computer vision were the fields I felt the most comfortable with and so that’s why this was a great starting point for me.
I perfected my PyTorch knowledge, learned a lot about CNNs and optimization methods, improved my presentation skills (just check out those README files 😂), became more proficient at reading research papers, and improved my SE skills in general — to name a few.
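To give you a feel for why NST is such a great first project: the whole idea fits in one optimization loop. Here is a minimal sketch, with illustrative layer choices and loss weights rather than the exact values from my repo (use `pretrained=True` on older torchvision versions):

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Minimal NST sketch: the image's PIXELS are the parameters we optimize,
# matching deep VGG features (content) and Gram matrices (style).
device = "cuda" if torch.cuda.is_available() else "cpu"
vgg = models.vgg19(weights="IMAGENET1K_V1").features.to(device).eval()
for p in vgg.parameters():
    p.requires_grad_(False)

STYLE_IDS = [0, 5, 10, 19, 28]  # conv1_1 .. conv5_1 (a common choice)
CONTENT_ID = 21                 # conv4_2

def extract(x):
    feats = {}
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in STYLE_IDS or i == CONTENT_ID:
            feats[i] = x
    return feats

def gram(f):  # channel-by-channel feature correlations = "style"
    _, c, h, w = f.shape
    f = f.view(c, h * w)
    return (f @ f.t()) / (c * h * w)

# Random tensors stand in for real, ImageNet-normalized images here.
content_img = torch.rand(1, 3, 256, 256, device=device)
style_img = torch.rand(1, 3, 256, 256, device=device)
with torch.no_grad():
    content_feats = extract(content_img)
    style_grams = {i: gram(f) for i, f in extract(style_img).items() if i in STYLE_IDS}

img = content_img.clone().requires_grad_(True)  # start from the content image
optimizer = torch.optim.LBFGS([img])

def closure():
    optimizer.zero_grad()
    feats = extract(img)
    loss = F.mse_loss(feats[CONTENT_ID], content_feats[CONTENT_ID]) \
        + 1e4 * sum(F.mse_loss(gram(feats[i]), style_grams[i]) for i in STYLE_IDS)
    loss.backward()
    return loss

for _ in range(20):
    optimizer.step(closure)
```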
Here is an NST image synthesized using the code I wrote:
[2] DeepDream
Ever since I first saw the pictures created with DeepDream I was fascinated. I knew I had to learn it. Every single detail. I couldn’t just use online generators and not understand what was going on.
I was reading blogs, analyzing DeepDream subreddits, and exploring various codebases. Most of the original code was written in Torch & Lua combo. I remember losing a lot of time trying to set it up on Windows. Then I switched to Linux, got it to work, only to realize that I won’t be using that repo either way. 😂
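The irony is that the core algorithm is tiny: DeepDream is basically gradient ascent on the input image. Here is a minimal PyTorch sketch (the original used an Inception net plus tricks like octaves/image pyramids and jitter; my layer choice and step size here are arbitrary):

```python
import torch
from torchvision import models

# Minimal DeepDream sketch: nudge the image so as to MAXIMIZE the activations
# of a chosen layer; the image drifts toward whatever excites that layer.
device = "cuda" if torch.cuda.is_available() else "cpu"
net = models.vgg19(weights="IMAGENET1K_V1").features[:22].to(device).eval()
for p in net.parameters():
    p.requires_grad_(False)

img = torch.rand(1, 3, 256, 256, device=device, requires_grad=True)  # stand-in for a real photo

for _ in range(20):
    loss = net(img).norm()  # "how excited is this layer by the current image?"
    loss.backward()
    with torch.no_grad():
        img += 0.05 * img.grad / (img.grad.abs().mean() + 1e-8)  # normalized gradient ASCENT
        img.grad.zero_()
```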
My output was a single video on the theory behind DeepDream and an open-source project. I learned a lot by doing this and I enjoyed the process. I used my code to generate these beautiful images, among them the current visual identity of The AI Epiphany. Here is an example:
[3] GANs (Generative Adversarial Networks)
GANs were still very popular in early 2020. I felt like I was missing the necessary background, and so I decided to tackle them next.
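In case you’ve never written one: the adversarial game at the core is surprisingly compact. Here is a minimal sketch of a single vanilla-GAN training step (toy MLPs and a random batch as stand-ins for real networks and data; DCGAN just swaps in convolutional nets):

```python
import torch
import torch.nn as nn

latent_dim, img_dim, batch = 64, 784, 32  # e.g. flattened 28x28 images
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, img_dim), nn.Tanh())
D = nn.Sequential(nn.Linear(img_dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = nn.BCEWithLogitsLoss()

real = torch.rand(batch, img_dim) * 2 - 1  # stand-in for a real data batch in [-1, 1]
fake = G(torch.randn(batch, latent_dim))

# Discriminator step: push D(real) -> 1 and D(fake) -> 0 (detach so G isn't updated)
d_loss = bce(D(real), torch.ones(batch, 1)) + bce(D(fake.detach()), torch.zeros(batch, 1))
opt_d.zero_grad()
d_loss.backward()
opt_d.step()

# Generator step: fool the (just updated) D into saying 1 on fakes
g_loss = bce(D(fake), torch.ones(batch, 1))
opt_g.zero_grad()
g_loss.backward()
opt_g.step()
```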
I read all of the seminal papers (as well as other, less prominent ones) and decided to implement the vanilla GAN, cGAN (conditional GAN), and DCGAN models. Here is an output from the DCGAN model I trained:
At this point in time, I refined my strategy. I realized that:
- I need to write a blog at the end of each macro, summarizing what I’ve learned. That will help future Aleksas (and Aleksa from the future) tackle topic X.
- I need to cover papers on my YouTube channel as I’m reading them. A similar thought process — by doing this I’ll learn better and I’ll help others.
- I need to open-source a project in the middle of the macro. After you implement something you understand it way better — that was/is my thinking. After that, all the papers I read made more sense.
Unfortunately, I don’t have GAN paper overviews from this period. Back then I was experimenting with videos such as “PyTorch or Tensorflow” which turned out to be the most popular ones (proving a hypothesis I had in my mind).
But I didn’t seek popularity (not that I would mind it in general), I was seeking a highly relevant audience. I’d much rather have 1 guy/gal from DeepMind following my work than 100 beginners, because I knew what my goals were.
[4] NLP & transformers
I knew I wanted to learn more about BERT and the GPT family of models. Transformers were everywhere, and I didn’t quite understand them.
This time I did everything correctly and executed the plan I sketched above. I ended up implementing the original transformer paper from scratch and learned a ton! I decided to create an English-German machine translation system. Since I speak those languages, I thought that would make debugging easier, and it did.
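The heart of that from-scratch implementation is the QKV (scaled dot-product) attention, and it’s worth being able to write it down cold. A minimal single-head sketch (the real transformer adds learned Q/K/V projection matrices and multiple heads):

```python
import torch
import torch.nn.functional as F

def attention(Q, K, V, mask=None):
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5  # query-key similarities
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))  # e.g. causal/padding mask
    weights = F.softmax(scores, dim=-1)  # each query's distribution over the keys
    return weights @ V                   # weighted sum of the values

# Toy self-attention: 1 sentence, 5 tokens, model dimension 16 (Q = K = V = the tokens)
x = torch.randn(1, 5, 16)
print(attention(x, x, x).shape)  # torch.Size([1, 5, 16])
```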
I’m so grateful I did it back then — every now and then you need to understand the QKV attention, and it’s second nature to me now. You can find out much more about this journey here:
[5] Graph/Geometric ML
Even before I started doing transformers I knew I wanted to dig deeper into Graph ML (before it became as popular as it is nowadays).
And, in retrospect, I’m very glad I covered transformers/NLP first since the field of NLP had a huge influence on this field. GAT was directly inspired by the original transformer paper, DeepWalk by Word2Vec, and so on.
At this point in time, I was already chatting with Petar Veličković on a fairly regular basis (I’ll tell you a bit more later about how this came to be), and he knew I’d be covering GNNs in the next period. He was very receptive and told me I could ping him whenever I felt stuck!
I realized that this was a great opportunity! I’d be learning Graph ML, a topic that fascinated me, while communicating with Petar, a researcher at DeepMind and one of the best researchers in this particular field.
Needless to say, I learned a lot! I made a “popular” YouTube GNN series that was shared by influential Graph ML researchers like Michael Bronstein, Petar, and others.
I also made a popular PyTorch implementation of GAT (Petar is the first author of the GAT paper). The reason it became so popular is that it was, according to others, the most beginner-friendly resource out there. With 2 Jupyter notebooks, support for the PPI and Cora datasets, a nice README, and accompanying videos, it filled a gap and made it easy for beginners to enter the field.
(It became a recommended repo for a Cambridge lecture on GNNs and I even got citations. 😅)
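And since the transformer connection keeps coming up, here is a deliberately naive single-head GAT layer, dense adjacency and all, written for readability rather than speed (my repo does the proper sparse, multi-head version):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGATLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)  # shared node projection
        self.a = nn.Linear(2 * out_dim, 1, bias=False)   # attention scoring function

    def forward(self, h, adj):
        # h: (N, in_dim) node features, adj: (N, N) 0/1 adjacency (with self-loops)
        z = self.W(h)
        N = z.size(0)
        # score every (i, j) pair: a^T [z_i || z_j], as in the GAT paper
        pairs = torch.cat([z.unsqueeze(1).expand(N, N, -1),
                           z.unsqueeze(0).expand(N, N, -1)], dim=-1)
        e = F.leaky_relu(self.a(pairs).squeeze(-1), negative_slope=0.2)  # (N, N)
        e = e.masked_fill(adj == 0, float("-inf"))  # only attend to actual neighbors
        alpha = torch.softmax(e, dim=-1)            # attention over each node's neighborhood
        return alpha @ z                            # aggregate the neighbors' features

# Toy usage: 4 nodes with 8 features each, a small path graph plus self-loops
h = torch.randn(4, 8)
adj = torch.eye(4) + torch.tensor([[0, 1, 0, 0], [1, 0, 1, 0],
                                   [0, 1, 0, 1], [0, 0, 1, 0]], dtype=torch.float)
print(SimpleGATLayer(8, 16)(h, (adj > 0).float()).shape)  # torch.Size([4, 16])
```

Squint and you can see the transformer in there: score pairs, softmax, weighted sum. The graph just restricts which pairs are allowed to attend to each other.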
I again made heavy use of the fact that I could ping Petar if I had any problems, and that this could be a great collaboration between the two of us!
For the whole story check out this blog:
[6] RL (Reinforcement Learning)
Drum roll. Surprise, surprise! I really wanted to learn RL either way but it so happens that DeepMind is famous for its RL breakthroughs. 😅
There were so many papers and projects I wanted to understand! Like AlphaGo, DQN, OpenAI’s robotic hand that could solve Rubik’s cube, etc.
This field proved to be very different compared to other ML subfields. The i.i.d. (data point independence) assumption doesn’t hold anymore, a random seed alone can make your agent fail to converge, etc.
I’m very glad I already had a decent amount of experience before tackling RL, otherwise, I might have gotten demotivated. I started with stuff that’s very close to my heart — computer vision — and I used it to ramp up my ML knowledge. Then, I slowly progressed toward topics I knew little about.
I again had some help here! Petar connected me with Charles Blundell (who’s also a DeepMinder), and so from time to time I used that opportunity, although much more sparingly.
For the whole story check out this blog:
And those are pretty much the main macros I had. Throughout the whole journey I was making notes in my OneNote:
(If there is enough interest I could maybe convert those to pdfs and share them on a public drive. Let me know in the comments!)
After RL, I started focusing on the latest and greatest papers. You can clearly notice that trend looking at my newest YouTube videos over the past 4 months.
And all of you who were following along my journey knew that these were my plans — I was very transparent throughout the whole process! (except for the fact that I wanted to apply for DeepMind the whole time)
Aside from this “main” track: at the very beginning of 2020, when I started reading papers, I realized I was having difficulties with the mathematics, so I took some time to read the “Mathematics for Machine Learning” book (and I went over 3B1B’s awesome playlists).
That turned out to be a great move, much like when I took a step back for the “Learning How to Learn” course in September 2019. All of these actions helped propel my learning speed.
Here are some other books I also read in this period:
- Python Data Science Handbook, chapter 5 — helped me get a visceral feeling for SVMs, PCA, linear regression, etc.
- Automate the Boring Stuff with Python, skipped the first 11 chapters — I felt like going through at least one book on Python since I learned the language on the fly. It also gave me a nice mindset that improved my productivity in general (in a nutshell: automate the repetitive stuff).
- Deep Learning — I only read the first part, as I realized I don’t really need all of that theory. It should be treated as a reference manual.
- The Book of Why — I read it because I was genuinely interested in causality and Judea Pearl’s work! Here is my LinkedIn summary.
Also, needless to say, all this time I was working at Microsoft on various SE-ML engineering projects. Unfortunately, I can’t tell you a lot about all of that but here are some of the generic highlights I can share:
- Developed a glasses detection algorithm as a part of the eye-tracking subsystem on the HoloLens 2 device. The whole point of the eye-tracking subsystem is to predict the user’s eye gaze vector and enable the instinctual interaction with the holograms. It also helps display graphics correctly for a particular user (everybody has different eyes, thus different IPD, etc.).
- Played with video encoders to add foveated rendering functionality to various VR/MR devices. That way you save power, as you don’t need to render things at full resolution in the peripheral vision. I ended up reading the whole reference manual on the H.264 encoder (that was, again, probably overkill 😅).
- Implemented an idea from a research paper (from scratch, in PyTorch) that is used to track the landmarks on the user’s body and showed that a particular sensor we were considering was a feasible option going forward. This is how I learned PyTorch (this was in late 2019/early 2020).
- Wrote various scripts that made sure labelers were doing their job correctly, developed a part of our internal metrics pipeline, dealt with rendering and improving our synthetic data (which is great, check out the paper here!), ran various quantization and perf-vs-compute experiments (because I was working with edge devices), and so on.
- Interviewed interns for SE/ML positions at Microsoft. This made me feel more comfortable with the interview process and gave me a perspective on what it looks like to be on “the other side of the table”.
- Mentored students at the ML summer camp. I also lectured and held workshops (on CNNs).
To summarize: that’s 1.5 years of warp-speed ML study, 1.5 years of non-warp ML study, and 1 year of warp-speed SE study; the rest is captured in my “5 Tips to Boost Your Learning” blog.
And that’s my story.
Tip: an additional way to see more details from this whole period is to just scroll through all of my LinkedIn posts; that’s where I was (and still am) regularly sharing my updates.
Part 3: How did the “DeepMind channel” open up?
In the middle of 2020, I reached out to Petar on LinkedIn. It turned out he was already following the content I was posting there and found it interesting, so that made things a bit easier! ❤️
During the ICCV 2019 conference I had met other cool DeepMinders who were also from Serbia, like Relja Arandjelović (who did his Ph.D. with Professor Andrew Zisserman at Oxford) and Jovana Mitrović (also from Oxford!).
Even though this was my first conference I immediately realized that the value of these conferences lies in:
- 1–1 conversations with the paper authors during the poster sessions
- Meeting other like-minded people
IMHO, many people get this wrong. By the time you attend the conference, most of the papers have already been on arXiv for months (and chances are somebody has already covered them and shared a summary online 😂).
That’s why virtual conferences will never be a good replacement.
Then, sometime around September 2020, I made it clear to Petar and Relja that sometime next year, probably in April, I’d start applying to DeepMind, and I asked them for any tips on how I could prepare! In the meantime, both of them offered to refer me! ❤️
Fast forward to a Friday in April 2021. I was casually chatting with Petar and mentioned that I was finally ready to apply to DeepMind, and mid-chat he just said (before I even realized what was happening): “ok, I just finished submitting the referral!” 😂
He then pinged a recruiter (Cameron Anderson, I didn’t know him back then but I can tell you now that he’s amazing at his job) on LinkedIn, mentioned who I was, and asked whether it was fine for me to submit my LinkedIn/YouTube/GitHub instead of my resume. It turned out that Cameron was already following my work! And he said it was fine! My God, I was so happy.
Everything happened so fast. I just sent my LinkedIn profile and I already had my interview scheduled for Monday.
So, summarizing: networking matters! A lot. BUT. Please don’t spam people. Instead, create genuine connections. Ask what you can do for them, not what they can do for you. None of us, including me, will recommend a person unless we’re already familiar with that person’s work and can make some guarantees.
(Pro tip: contribute to an open-source project of the company you care about — that’s a great way to get noticed! That way some of them will be familiar with your work and you can later ask for a referral, that is, if they don’t already offer it themselves! Why is nobody using this strategy? It beats me.)
At one point an automated email came in asking for my resume (although Cameron confirmed that it wasn’t necessary). After some quick edits, I decided to send them my resume without investing too much effort into it (thinking that they probably wouldn’t even read it).
Which, in retrospect, could have been a big mistake. A well-written resume matters! And my interviewers did in fact use it. It’s easier to ask questions looking at a 1–2 page resume than at a LinkedIn profile.
I’m leaving it here for legacy:
There are many things wrong with it. For a start, I should have written a lot more about my latest work at Microsoft. The thing is that my team at Microsoft was very secretive (for quite some time even my other colleagues from Microsoft didn’t know what I was working on), and so I took the safest strategy: not communicating anything to the outside world. 😂
Also, my resume was always a 1-pager; I should probably cut all of these older projects out. I have many other ideas on how to improve it.
Finally, let’s see what my last preparations for DeepMind looked like!
Final interview preparations
Ok, at this point I had managed to build up solid foundations and I was referred for an RE position. You do need a solid background to land a job at DeepMind. I’m fairly sure I wouldn’t even have gotten a chance to interview if I didn’t have all of my GitHub projects, prior experience, and other public artifacts.
(I don’t have any paper publications but I compensated in different, original ways. REs don’t necessarily need publications, as you can see. But the game may be different for RSs (research scientists). I’m fairly sure they need a strong record in publishing to top-tier conferences and a Ph.D. Take all of this with a grain of salt, I’ll update this part when I start working at DeepMind and get myself better informed.)
But the battle wasn’t won. As all of you know, the interviewing processes are…well, kind of dark magic.
Even when you’re really good, big tech companies may reject you. It’s safer for them to reject a great candidate (by mistake) than to accept a really bad one (they want to minimize false positives). Repeat after me: hiring pipelines are never perfect. This will save you from a lot of unnecessary stress!
Secondly, you’re often asked stuff you won’t actually be using in your daily job and that’s why you have to explicitly prepare for the interviews.
Let me now tell you how I pulled it off.
First, here are some general tips:
- Research every single one of your interviewers. Understand where they are coming from and what they do, so that you can ask relevant questions. Don’t waste their time. Check out their LinkedIn/Twitter profiles (they always have at least one of those) and their Google Scholar page. Read all of their papers if possible, or at least the ones with the most citations/the ones where they are the first author. For some of my DeepMind interviewers, I read more than 10 papers. 😅 That’s probably not needed, but I was passionate enough (read: nuts) to do it.
- Research every single one of your resume projects. You should be able to talk in depth about all of them, whether they’re your GitHub projects or ones you did at some previous company.
- Do some mock interviews. Find friends who do AI at whatever company and are ideally interviewers themselves.
For DeepMind I also did serious research around AGI. I read some of the seminal AI papers like:
- A Collection of Definitions of Intelligence (my LI summary)
- Universal Intelligence: A Definition of Machine Intelligence
- On the Measure of Intelligence (my LI summary)
I had previously also studied Alan Turing’s seminal “Computing Machinery and Intelligence” paper, as well as follow-up papers from John Searle, etc. You probably don’t need these. (In case you want a summary, here are my blogs: parts one and two.)
And I also watched YouTube videos featuring some of the following people: Demis Hassabis, Shane Legg, Marcus Hutter, F. Chollet, Murray Shanahan.
(Flex: Demis liked a couple of my tweets 😍)
The reason I did this research is that this is DeepMind’s mission: cracking AGI and using it to solve difficult scientific problems. You can also expect “behavioral” questions around these topics, so I wanted to have a sound background.
Research as much as you can about DeepMind. For me, it came naturally since I was covering papers on YouTube, implemented some of them, knew people, culture, etc.
Those were some general tips. Replace DeepMind with your dream company and map a concept like “AGI” to “whatever my dream company cares about” and do your research on it.
Now let’s see what DeepMind’s hiring pipeline looks like for an RE position on the “Core” part of the team (more on DeepMind’s org a bit later).
The pipeline consists of the following interviews (at the time I’m writing this; it may change in the future, but it should still give you a general idea):
- The initial chat with your Recruiter. This is supposed to give your recruiter an idea of whether you’re more interested in applying for the Applied or the Core part of the team. As with any other interview, treat it with respect, there may be a behavioral component to it. (1 interview)
- The “quiz”. The idea is to test your understanding of fundamental topics in maths, stats, CS, and ML. (2 interviews)
- The coding interview. Your classical SE interview if you ever interviewed for Google, Microsoft, or any other big tech company. (1 interview)
- Team Lead interview. You’ll be discussing various topics. Anything that’s on your resume may come up, as well as open-ended ML problems, and behavioral questions. (1 interview)
- Senior Team Lead interview. Very similar in structure to the above interview. Additionally, this interview more explicitly assesses whether you’re a fit for the team. (TBH, you should treat every interview with that in mind) (1 interview)
- People & Culture interview. Your classical HR interview. (1 interview)
Let me now give you some tips on how to prepare for these!
My goal here is not to give you an unfair advantage or a way to game the hiring system, but rather guidelines on how you can get better at these topics, and thus hopefully a better engineer (and a better human being)! 😜
I’m writing this so that Aleksa of the past would be grateful to have had it!
Interview #1: A chat with the recruiter
Do your due diligence and understand, as much as you can, what exactly it is that you want to do at DeepMind. DeepMind has 2 “teams” (it’s a soft division; both “teams” are working towards a common mission):
- The “Core” research team. They do purer research types of projects. Examples of more “production-like” projects would be AlphaFold (I guess the Applied team was involved there as well), AlphaGo, etc. Other projects that REs on this team do may involve helping implement research ideas in collaboration with research scientists.
- The Applied team. This one is self-explanatory. Some of the projects they did are the Data Centre energy reduction project, improving the RecSys for the Google Play Store, WaveNet, saving power on Android, and much more (they’re not only Google-focused).
Go over to DeepMind’s website, read their blogs, and research all of the available positions you care about.
Also, I strongly recommend you read the first (behavioral) part of the “Cracking The Coding Interview” (CTCI) book. As I said, every interviewer will be assessing your personality and whether you’re a fit for the team, so some prep in that regard definitely won’t hurt!
Interviews #2 and #3: The “quiz”
This is probably the hardest part to prepare for (although TBH, it wasn’t for me, it really depends on where you’re coming from). You need solid technical foundations to pass it.
I’ll tell you how I prepared myself, although again, it might be overkill:
- [CS] I read an algorithms & data structures textbook written by a Serbian professor, so it won’t be that relevant to you. The main point is: I picked a book I was already familiar with (I went through it back in 2017 preparing for big tech companies, remember?). This knowledge also comes in handy later during the SE interview, so your time won’t be wasted. It’s a 400-page book. I skipped some parts that I knew were “too advanced”, like B-trees (external-memory data structures) and certain hash function details (I probably read ~340 pages). I covered topics such as linear/non-linear data structures (graphs/trees) and various sorting and searching algorithms.
- [CS] Skimmed through OS course materials at my faculty. Stuff like deadlocks, threads, virtual memory, etc. Also, I read this blog on threading in Python to brush up on my knowledge of the terminology.
- [CS] Skimmed through the big-O notation formal theory from the “Introduction to Algorithms” (CLRS) book.
- [Maths, classical ML] Read the “Mathematics for ML” book, again. (As you recall, the first time I read it was in 2020, so I was way faster this time.)
- [ML] Brushed up on my RL theory. I just googled around and found some ad hoc materials: what the RL framework is (the env-agent-reward loop, sketched right after this list), what an MDP is, etc. My “Getting Started with RL” blog is a good starting point.
- [Statistics] YouTube videos (channels such as jbstatistics and StatQuest with Josh Starmer helped me), googling unknown concepts, and skimming through this material (sheets 1 and 2) from Cambridge. I made some reasonable assumptions about what they could ask me.
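As for the env-agent-reward loop from the RL bullet, this is all it is. A minimal sketch assuming the classic OpenAI Gym API (newer gymnasium versions return (obs, info) from reset and a 5-tuple from step); if you can write this from memory and map each line to MDP terms (state, action, reward, transition), you’ve got the framework down:

```python
import gym  # classic gym API assumed, see the note above

env = gym.make("CartPole-v1")
obs = env.reset()  # initial state s_0
total_reward, done = 0.0, False
while not done:
    action = env.action_space.sample()          # a real agent samples from a learned policy pi(a|s)
    obs, reward, done, info = env.step(action)  # the MDP transition: (s, a) -> (s', r)
    total_reward += reward
env.close()
print(total_reward)  # the return of this (random) episode
```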
That should be enough. If you already feel confident with some of these, just skip them. Treat this as a biased guideline rather than as ground truth.
My interview process already started at this point so this was more of a sprint rather than a month-long preparation. I strongly recommend you start preparing before you kick off the interviewing process.
Interview #4: The coding challenge
- I solved some of the CTCI problems. I didn’t go through every single section of the book, as that takes too much time. I solved most of the problems from the first 8 chapters (minus chapters 5 and 6, where I skimmed the solutions or quickly solved some of them in my head). I skimmed through the solutions of chapter 9 on system design, I skipped chapter 10 as I had already implemented some of the most popular sorting/searching algorithms while preparing for the quiz, and I completely ignored chapters 11–15.
- I read a chapter on dynamic programming from the CLRS book.
A couple of tips on how to read through CTCI:
- First, try to solve a problem yourself. Only if you’re stuck, take a look at the first hint. If that doesn’t help, see the next one, and so on. In the end, even if you managed to solve the problem, go through the solutions. You’ll definitely gain some new perspectives on how to approach that very same problem.
- Make sure to always write down the memory and time complexity of your solution before looking at the solution section (see the example right after this list).
- Pick a language that you feel the most comfortable with. The first time I went through this book (back in 2017) I used Java (as I was doing Android back then and the book itself contains solutions in Java). This time I used Python. You can find solutions in any language online or just use the GeeksForGeeks website.
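To make the complexity tip concrete, here is the classic “two sum” problem done with that discipline, complexity stated up front, the way I suggest you do it for every problem (in Python, since that’s what I used this time around):

```python
# Given an array and a target, return indices of two numbers that sum to the target.
# Time: O(n), one pass over the array. Space: O(n) for the hash map of seen values.
def two_sum(nums, target):
    seen = {}                          # value -> index, for values we've already passed
    for i, x in enumerate(nums):
        if target - x in seen:         # have we already seen the complement?
            return seen[target - x], i
        seen[x] = i
    return None

print(two_sum([2, 7, 11, 15], 9))  # (0, 1) because 2 + 7 == 9
```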
You really don’t need anything else other than CTCI! Just that. 😅
Interviews #5, #6 and #7: team lead interviews and people & culture
Just read the behavioral/first part of the CTCI book and follow my general tips (like be ready to talk about your projects and be familiar with your interviewer’s research).
They’ll ask you questions such as: “What do you like about DeepMind?”, “Why DeepMind and not some other company?”, “Tell me something about your favorite project”. Stuff like that.
And that’s it! You should be good to go now!
Finally, let me briefly tell you what happened to me. Some of you may recall from this video that I mentioned failing to land a job at DeepMind:
I failed the 6th (Senior Team Lead) interview. IMHO, it was literally a misunderstanding more than anything else (I got the feedback via email; they were very kind, and that’s how I know it was a misunderstanding), but life goes on!
As a “technically strong” candidate, I was rerouted to the Incubation/Applied part of the team. I again prepared myself following the general tips I gave you guys and got a YES 4 times. I was accepted. Hurray!!!
The good thing is, after having interviewed with them, I realized that the Incubation team is probably an even better fit for me!
The interviews were very conversational in nature. They told me what the team does, there were some behavioral questions, some ML system design, and open-ended problems. The best way to prepare for those is to have genuine work experience.
And that’s it. I guess the rest is history.
And these 2 viral posts on Twitter and LinkedIn. 😂
My next steps
I’ll be joining DeepMind later this year (December 2021) as a Research Engineer. I’m very excited about this next period of my life!
Many new things are waiting for me. I’ll be moving to London, UK. I’ll have to build up new social circles. I’ll be exposed to a new culture, new languages (as London is quite multicultural), a new company, and even a new OS! 😅
As for what you can expect from me over the next period, I’ll be:
- Sharing everything I do on my Twitter, LinkedIn, and Discord!
- Covering more geometric DL papers on my YouTube channel!
- Explaining various open-source ML tools and research code as well.
- Learning JAX and this time sharing the journey with you guys!
On a personal note, I’ll try to travel a bit more before December, and I’ll be doing some investing (I’ll be reading “The Intelligent Investor”).
I got so many kind words and requests from my community at The AI Epiphany:
And so I felt obliged to share more about my journey to DeepMind in order to help you guys become better. And hopefully, by sharing this deeply personal journey I managed to help at least some of you.
If there is something you would like me to write about — write it down in the comment section or DM me. I’d be glad to write more about mathematics & statistics, ML/deep learning, CS/software engineering, landing a job at a big tech company, getting an invitation to prestigious ML camps, etc., anything that could help you!
Also feel free to drop me a message or:
- Connect and reach me on 💡 LinkedIn and Twitter
- Subscribe to my 🔔 YouTube channel for AI-related content️
- Follow me on 📚 Medium and 💻GitHub
- Subscribe to my 📢 monthly AI newsletter and join the 👨‍👩‍👧‍👦 Discord community!
- 📄 Website
And if you find the content I create useful, consider supporting me on Patreon!
Much love ❤️