How They Became Leading AI Researchers in Just 1 Year – Sholto Douglas & Trenton Bricken

Dwarkesh Patel | 2024-03-29

42K views|5 months ago

💫 Short Summary

Sholto Douglas and Trenton Bricken explain how they came to make important contributions to AI research within about a year and a half. They credit luck, agency, and persistence: executing quickly and carefully on ideas, refusing to stay blocked, and picking high-leverage problems. They also discuss mentorship, self-directed reading and experimentation, producing world-class work that gets noticed, and caring about the entire stack rather than only one's own task.

✨ Highlights

📊 Transcript

✦

Importance of problem-solving and perseverance in achieving success in the interpretability team.

02:09

Scaling up through careful experimentation and execution was key to the team's growth.

Emphasizing the significance of taking initiative and agency in making an impact.

Choosing high-leverage problems that have not been well solved yet to showcase success in the field.

Stressing the need to pursue tasks to completion and do whatever it takes to make things happen.

✦

Transition from Robotics undergrad to scaling multimodal models for robotics solutions.

03:49

Online questions led to being hired by James Bradbury at Google, emphasizing mentorship and working with top engineers.

Self-experimentation and obsessive paper reading gave him a broad perspective; he reads less widely now that work fills the day.

Reading across NLP, computer vision, and robotics reveals patterns that emerge across subfields, which a narrowly focused PhD can miss.

✦

Trenton's journey from working on sparsity in networks to joining Anthropic.

06:15

Emphasis on luck and agency in career success, importance of networking.

Mention of Andy Jones' paper on scaling laws in board games, showcasing engineering skill.

Need for individuals to produce world-class work to attract opportunities, regardless of academic background.

Importance of standing out to receive job offers from top companies.

✦

The importance of proactiveness and taking charge of one's life is emphasized.

09:49

Caring deeply and checking every detail matters more than expected; a LeBron quote about players relaxing once they reach financial stability illustrates how uncommon that level of care is.

It is important to care about the entire stack and fix issues beyond one's responsibility.

Becoming world-class quickly is possible by putting in more effort than most people, showcasing the potential for rapid progress with dedication and hard work.

00:00 I'm curious how you explain what's

00:02 happened. Like, why in a year or a year

00:03 and a half have you guys been, uh, you

00:06 know, made important contributions to

00:08 your field? It goes without saying, luck,

00:10 obviously, and I, I feel like I've been

00:12 very lucky in like the, the timing of

00:14 different progressions has, has been just

00:17 like really good in terms of advancing

00:19 to the next level of growth. Um, I feel

00:23 like for the interpretability team

00:24 specifically, I joined when we were five

00:26 people. We've now grown quite a lot. Um,

00:30 but there were so many ideas floating

00:31 around and we just needed to like really

00:33 execute on them and have like quick

00:35 feedback loops and like do careful

00:38 experimentation, um, that led to like

00:40 signs of life and have now allowed us to

00:42 like really scale. Um, and I feel like

00:44 that's kind of been my biggest value add

00:46 to the team, um, which, it's not all

00:48 engineering, but, but quite a lot of it

00:50 has been. Interesting. So you're saying

00:53 like you came at a point where like they

00:54 were, there was, had been a lot of science

00:56 done and there was a lot of like good

00:57 research lying around, but they needed

00:58 someone to like just take that, like

01:00 maniacally execute on it. Yeah, yeah, and

01:03 and, and there's, this is why it's not all

01:04 engineering, because it's like running

01:06 different experiments and like having a

01:07 hunch for why it might not be working

01:09 and then like opening up the model or

01:10 opening up the weights and like, what is

01:11 it learning? Okay, well, let me try and do

01:13 this instead, and that sort of thing. But,

01:15 um, a lot of it has just been being able

01:17 to do like very careful, thorough, but

01:20 quick, um, investigation of different

01:22 ideas. I just don't get blocked very

01:24 often. Like, if I'm trying to write some

01:26 code and like something isn't working,

01:28 even if it's like in another part of the

01:29 code base, I'll often just go in and fix

01:31 that thing, or at least hack it together

01:33 to be able to get results. And I've seen

01:34 other people where they're just like,

01:36 help, I can't, and it's like, no, that's not

01:39 a good enough excuse, like go all the way

01:40 down. I've definitely heard like people

01:41 in management-type positions talk about

01:44 the lack of such people, where they'll

01:46 check in on somebody a month after they

01:48 give them a task, a week after they give

01:49 them a task, I'm like, how's it going? And

01:51 they say, well, you know, we need to do

01:53 this thing, which requires lawyers

01:56 because it requires talking about this

01:57 regulation. It's like, how's that going? I

01:59 was like, well, we need lawyers. And like,

02:01 why didn't you get

02:03 lawyers? I think that's arguably the most

02:05 important quality in like almost

02:07 anything. It's just pursuing it to like

02:09 the end of the earth, and like whatever

02:10 you need to do to make it happen, you'll

02:12 make it happen. If you do everything, you

02:13 win. If you do everything, you win. Exactly.

02:15 I think from my side, uh, definitely that

02:18 quality has been important, like agency

02:20 in the work. There are thousands, I would

02:22 even, like, probably tens of thousands of

02:23 engineers at Google who are like, you

02:25 know, basically like we're all like

02:27 equivalent, like, software engineering

02:29 ability, let's say. Like, you know, if you

02:31 gave us like a very well-defined task,

02:33 um, then we'd probably do it like equivalently

02:35 well. Maybe a bunch of them would do it a lot

02:36 better than me, you know, in all

02:38 likelihood. Um, but what I've been, like,

02:41 one of the reasons that I've been

02:43 impactful so far is I've been very good

02:46 at picking extremely high-leverage

02:50 problems. So problems that haven't been

02:51 like particularly well solved so far, um,

02:55 perhaps as a result of like frustrating

02:57 structural factors like the ones that

02:59 you pointed out in like that scenario

03:00 before, where they're like, oh, we can't do

03:02 X cuz this team won't do Y, or like, and

03:05 then going, okay, well, I'm just going to

03:06 like vertically solve the entire

03:09 thing. We, we should talk about, uh, how you

03:13 guys got hired, because I think that's a

03:15 really interesting story. So like the TL;DR

03:16 of this is I studied robotics in

03:17 undergrad, and in the meantime, on nights

03:19 and weekends, basically every night from

03:21 10 p.m. till 2 a.m., I would do, uh, my

03:24 own like research, and every weekend, for

03:26 like at least six to eight hours each

03:28 day, I would do my own research and

03:30 coding projects and this kind of stuff.

03:32 That sort of switched in part from like

03:34 quite robotics-specific work to, after

03:37 reading, uh, Gwern's scaling hypothesis post,

03:39 I got completely scaling-pilled and was

03:42 like, okay, like clearly the way that you

03:43 solve robotics is by like scaling large

03:44 multimodal models. I was trying to work

03:46 out how to scale that effectively, and, um,

03:49 James Bradbury, uh, who at the time was at

03:51 Google and is now at Anthropic, um, saw

03:56 some of my questions online where I was

03:57 trying to work out how to do this

03:58 properly. He was like, I thought I knew

04:00 all the people in the world who were

04:01 like asking these questions. Who on earth

04:03 are you? Um, and, uh, he, you know, he looked

04:08 at that and he looked at some of like

04:09 the robotics stuff that I'd been putting

04:10 up on my blog and that kind of thing, and

04:11 he reached out and said, hey, do you want

04:12 to have a chat, and you want to, um, like

04:13 explore working with us here? Um, and, uh, I

04:17 was hired, I, as I understand it later, as

04:19 an experiment in trying to take someone

04:22 with extremely high enthusiasm and

04:23 agency and pairing them with some of the

04:26 best engineers that he knew. Um, and so

04:29 one, another one of the reasons I could

04:30 say like I've been impactful is I, I had

04:32 this like dedicated mentorship from

04:34 utterly wonderful people. What you

04:36 mentioned about being, um, being

04:38 bootstrapped immediately by these people

04:39 might have meant that since you're

04:41 getting up to speed on everything at the

04:42 same time, rather than spending grad

04:44 school going deep on like one specific

04:46 way of doing RL, you actually can take the

04:48 global view and aren't like totally

04:50 bought in on one thing. So not only can,

04:52 is it something that's possible, but like

04:53 has greater returns than just hiring

04:55 somebody out of grad school,

04:56 potentially. You come at everything with

04:58 fresh eyes, um, and don't come in locked to any

05:00 particular field. Um, now, what, like one

05:03 caveat to that is that before, like

05:05 during my self-experimentation and

05:07 stuff, I was reading everything I could. I

05:08 was like obsessively reading papers

05:10 every night. Um, and like, actually, funnily

05:13 enough, I, I

05:14 like read much less widely now that I,

05:18 like, my day is occupied by working on

05:20 things. Um, and in some respect I had like

05:22 this very broad perspective before, where

05:24 not that many people, even, even like in a

05:27 PhD program, you like focus on a

05:28 particular area. Um, if you just like read

05:30 all the NLP work and all the computer

05:31 vision work and like all the robotics

05:33 work, you like see all these patterns

05:34 just start to emerge across subfields, um,

05:37 in a way that I guess like foreshadowed

05:40 some of the, the work that I would later

05:41 do. And Trenton, does this map onto any of

05:43 your experience? I think Sholto's story is

05:45 more, more

05:46 exciting. Um, mine was just very

05:49 serendipitous in that I, I got into

05:51 computational neuroscience, didn't have

05:52 much business being there. Um, my first

05:55 paper was mapping the cerebellum to the

05:57 attention operation in Transformers. My

05:59 next ones were looking at, like... You wrote

06:02 that? Uh, it was my first year of grad

06:03 school. Okay. Um, so

06:06 22. Oh, yeah. But, uh, yeah, my, my next work

06:10 was on, uh, sparsity in networks, like

06:12 inspired by sparsity in the brain, uh,

06:15 which was when I met Tristan Hume, uh, and

06:17 Anthropic was doing the SoLU, the softmax

06:19 linear output unit work, which was, was

06:21 very related in quite a few ways, of like,

06:23 let's make the, uh, activation of neurons

06:25 across a layer really sparse, and if we

06:27 do that then we can get some

06:28 interpretability of what the neuron's doing.

06:30 That started the conversation. I shared

06:31 drafts of that paper with Tristan, he was

06:33 excited about it, and, and then, and, and

06:35 that was basically what led me to be,

06:37 become Tristan's resident and then

06:38 convert to full-time. Um, but during that

06:42 period I also moved as a visiting

06:43 researcher to Berkeley, uh, and started

06:46 working with Bruno Olshausen, and Bruno Olshausen

06:49 basically invented sparse coding back in

06:51 1997, and so it was like the, the, the, my

06:54 research agenda and the interpretability

06:56 team seemed to just be running in

06:58 parallel, um,

07:00 in, in, with just research taste, and, and

07:02 so it, yeah, it made a lot of sense for,

07:05 for me to work with the team. Um, well, and

07:07 it's been a dream since. One thing I've

07:09 noticed when people tell stories about

07:11 their careers or their successes, they

07:14 ascribe it way more to contingency, but

07:16 when they hear about other people's

07:17 stories, they're like, of course it wasn't

07:18 contingent, you know what I mean? It's

07:20 like, if that didn't happen, something

07:21 else would have happened. Yeah, but I mean

07:23 like, I literally met Tristan at a

07:24 conference and like wasn't, didn't have a

07:27 scheduled meeting with him or anything, just

07:29 like joined a little group of people

07:30 chatting, and he happened to be standing

07:32 there and I happened to mention what I

07:33 was working on, and that led to more

07:35 conversations. And I think I probably

07:36 would have applied to Anthropic at some

07:37 point anyways, but I would have waited at

07:40 least another year. I, I, I, yeah, I, it's

07:43 still crazy to me that I can like

07:45 actually contribute to interpretability

07:47 in a meaningful way. I, I think there's an

07:49 important aspect of like shots on goal

07:51 there, so to speak, right? Where like you,

07:53 even just going to, choosing to go to

07:54 conferences itself is like putting

07:56 yourself in a position where you're,

07:58 where luck is more likely to happen.

08:01 My own

08:06 was my own way of like trying to

08:08 manufacture luck, so to speak, um, and, and

08:11 like try and do something meaningful

08:12 enough that it got noticed. For the

08:14 people who are like just assuming that

08:16 the other end of the job board is like

08:18 just like super legible and mechanical,

08:20 this is not how it works, and in fact

08:22 like people are looking for the sort of

08:24 different way, different kind of person

08:25 who's agentic and putting stuff out

08:27 there. And I think specifically what

08:28 people are looking for there is two

08:30 things. One is agency and like putting

08:32 yourself out there, uh, and the second is

08:34 the ability to do world-class something.

08:37 Yeah. Andy Jones from Anthropic did an

08:40 amazing paper, um, on scaling laws as

08:43 applied to board games. It didn't require

08:44 much resources, it demonstrated

08:46 incredible engineering skill, it

08:47 demonstrated incredible understanding of

08:48 like the most topical problem of the

08:50 time, um, and he didn't come from a like

08:52 typical academic background or whatever.

08:54 As I understand it, basically like as

08:55 soon as he came out with that paper, both

08:56 Anthropic and OpenAI were like, we

08:58 would desperately like to hire you.

09:00 There's this line, the system is not your

09:01 friend, right? Uh, and it's not necessarily

09:04 to say it's, it's actively against you,

09:06 it's your, your sworn enemy. Um, it's just

09:09 not looking out for you, right? And so I

09:12 think that's where a lot of the

09:13 proactiveness comes in, of like, there are

09:15 no adults in the room, or like, and, and

09:18 like you have to come to some decision

09:21 for what you want your life to look like

09:22 and execute on it, and, and yeah, hopefully

09:24 you can then update later, um, if you're

09:27 too headstrong in the wrong way, but, but

09:29 I think you almost have to just kind of

09:30 charge at, at certain things to, to get

09:33 much of anything done, not be swept up in

09:35 the tide of whatever the expectations

09:36 are. There's like one final thing I want

09:39 to add, which is like, we talked a lot

09:40 about agency and this kind of stuff, but

09:41 I think actually, like, surprisingly

09:43 enough, one of the most important things

09:44 is just caring an unbelievable amount. Um,

09:49 and when you care an unbelievable amount,

09:50 you like, you check all the details and

09:52 you have like this understanding of like

09:53 what could have gone wrong, and you like,

09:55 you,

09:56 uh, it just, it matters more than you

09:58 think, because people end up not

10:02 caring, not caring enough. Uh, there's this like

10:04 LeBron quote where he talks about how

10:07 when he, sort of, before he started in the

10:09 league, he was like worried that everyone

10:10 would be like incredibly good, and, and

10:12 then he gets there and he like realizes

10:13 that actually once people hit financial

10:14 stability, then they, um, like, they relax a

10:17 bit, and he's like, oh, this is going to be

10:18 easy. Um, and I don't think that's quite

10:20 true, because I think in like AI research,

10:22 because most people actually care quite

10:23 deeply. Um, but there's caring about your

10:27 problem and there's also just caring

10:28 about the entire stack and everything that

10:29 goes up and down, like going explicitly,

10:31 going and fixing things that aren't your

10:32 responsibility to fix, because overall it

10:35 makes like the stack better. I, something

10:37 that a friend said to me a while back

10:38 that I think has stuck is like, it's

10:40 amazing how quickly you can, can become

10:42 world-class at something just because

10:44 most people aren't trying that hard and

10:45 like are only working like, I don't know,

10:47 the actual like 20 hours that they're

10:49 actually spending on this thing or

10:51 something. And so, yeah, if you just go ham,

10:54 then like you can, you can get really far

10:56 pretty fast.
