00:00hi everyone welcome to the a16z
00:02podcast I am Sonal today's episode is
00:04one of our hallway conversations where
00:06we just riff on a topic for a bit and
00:07the topic we're talking about today is
00:10the theme of complicated and to give you
00:12more context for this we have a16z
00:14board partner Steven Sinofsky who has
00:17written in the past about systems where
00:18the backend is really complicated on the
00:20front end it's deceptively simple and
00:22this tension is also a common theme in
00:24design we have a16z research and deal
00:26team head Frank Chen who has talked a
00:29lot about AI and deep learning and
00:30that's relevant here because those are
00:32complex systems that learn and finally
00:36we have Sam Arbesman who is a
00:37complexity scientist and who also got
00:39his PhD in computational biology and he
00:42has a new book out called
00:43Overcomplicated so it all fits together
00:45alright guys let's just get started
00:47I'm excited to talk about this topic so
00:49Sam I was reading the book and one of
00:51the first things that occurred to me is
00:52I wanted to ask you my favorite product
00:54manager interview question of all time
00:56my question is so how do phones work
00:59how do phones work like an iPhone
01:02smartphone any phone you pick any phone
01:04even the simplest landline and tell me
01:06how it works oh boy I am gonna show my
01:10ignorance probably really really quickly
01:11and yeah you know I know you dial and
01:15then actually I'm well I was gonna say
01:18there's some sort of packet switching
01:19thing I guess it really depends if
01:20you're using kind of an IP Phone or not
01:23um frankly I don't know well frankly I
01:28think most people do not know um we've
01:29been shielded from that complexity the
01:31reason I ask is because that's what
01:33really jumped out at me when I was
01:34reading the book which is like we create
01:37systems that nobody understands and so
01:41exactly turns out like you can ask a
01:42million product managers how a phone
01:44works and they actually do exactly what you
01:45did well you dial it and then the next
01:47question is well tell me about the
01:48electromagnetic stuff behind dialing and
01:51what is that or touch tones how do you
01:53generate those frequencies and then you
01:55then you leaped immediately to packet
01:57switching which of course skips the whole
02:00no I it's like yeah I was jumping to a
02:02couple things that I was vaguely
02:03familiar with now right and then like
02:05how does your voice turn into one of
02:07those things to begin with yeah my
02:10question to product managers was okay
02:12you go home after this interview you're gonna
02:14send me a nice email to thank me how
02:17does that email get to me right and the
02:18same exact thing right this cascade of
02:21technology that's layer and layer and so
02:23what you're looking for if you're trying
02:25to find a technical one is how deep down
02:27the stack can you go in answering that
02:29question oh my god that's so funny you
02:31guys literally just that's the exact
02:32same question but in different forms
02:33Steven and I are actually you know
02:35twins I think we share a mother but
02:37you know he's the Jewish version I'm the
02:39Asian why does it even matter to know
02:41these things I mean okay beyond being a
02:42product manager that you're interviewing
02:44you know you're trying to find out their
02:46skills in the enterprise does it really
02:48matter for us as users as consumers to
02:51really know how our systems work I only
02:54care that things are working so I think
02:56for the most part a user they can they
02:59can just use things and often be
03:00blissfully unaware and it seems like
03:01it's fine I think the the major problem
03:04though is that it's one thing to say
03:05that oh there's some experts somewhere
03:08who can understand the system in its
03:09entirety and really knows what's going
03:11on we can kind of outsource our
03:12understanding to them but more and more
03:14there's really no one who can understand
03:16it which is exactly the point that's being made
03:17here and so really when no one fully
03:19understands it it's incumbent upon each
03:21of us to at least have some way of
03:23thinking about these systems at least
03:24some sort of like glimpse into what's
03:26happening sometimes underneath the
03:28kind of fairly simple interfaces into
03:30the underlying complexity because
03:31oftentimes we think we understand the
03:33system and then we're confronted with a
03:35bug or some other kind of unexpected
03:36behavior and then we realized there's a
03:37gap between how we thought it would work
03:39and how it actually does work one of the
03:41things that I think is so interesting is
03:43not being able to understand it has
03:44become like almost a cool thing like the
03:47one person who understands this one part
03:49of the system you know and even the
03:51words that we use like hack and kludge and
03:54stuff they're now like cool like
03:56hack has gone from like a problem to
03:58like we now celebrate it with hackathons
04:01and and so I'm trying to understand or
04:04think about you know why is it good to
04:06embrace the complicated nature of things
04:08or the complex nature of things and when
04:10is it detrimental to society to do that
04:13like when is a hack like wow that's not
04:15like I don't want my CAT scan machine to
04:18be hacked but I'm okay if like a word
04:21yeah I think it's more about recognizing
04:24that too often this is just
04:26the way of the world that they're just
04:28all around us certainly the way when you
04:30have when you're confronted with a large
04:31system some large technological system
04:33like piece of software or whatever
04:34oftentimes the only way to change it is
04:37through those kind of kludges or hacks
04:40because it's kind of iterative tinkering
04:42at the edges of approach which ends up
04:43meaning that you you add something to it
04:46it gets the job done the downside of
04:48that of course is you don't fully
04:49understand what's going on and as more
04:51and more of these accrete then somehow
04:53you're left with this like impenetrable
04:54mess I do think though ideally we
04:57should be deliberate in how we grow
05:00these systems and change them over time
05:02certainly if we're building something
05:03from scratch we should try to be as
05:05logical and drift away from the kludge and
05:08kind of the kludgy approach at the same
05:10time though these systems they're not
05:12always fully engineered they're almost
05:14grown and when they're evolved
05:16like then you often get the kind of all
05:17the terminology from evolution of kind
05:20of like evolved feature or repurposing
05:22some other kind of you know typical
05:24there's like a whole bunch of like
05:25obsolete code in there and and I think
05:28then you kind of realize oh actually
05:29these systems when they get big enough
05:30they end up looking almost biological
05:32well thinking about websites or mobile
05:35apps getting very very big they are
05:37almost all biological now because it's
05:40impossible for any single person to
05:42understand I mean you have a CTO and you
05:43have an architect but if you look at
05:45what happens inside companies as these
05:47complicated sites are actually being
05:49built what happens when you have a very
05:51complicated change is you have this
05:53entity called the Change review board
05:54convene and it's 12 people in a room one
05:58representing the network one
05:59representing storage one representing
06:01servers one representing application
06:02development and you have to sort of
06:04review every change and basically say it
06:06out loud and say oh have I not thought
06:08through what this change is gonna mean
06:10in your world and so you have all of
06:12these people who need to convene to that
06:15changes before they actually get pushed
06:17into production and then the reverse
06:19happens when problems occur so when you
06:22have an outage right and I can tell you
06:24this is probably happening inside
06:25Niantic on a daily basis right now as
06:27Pokémon Go explodes in popularity you
06:30have those exact same people convened to
06:32try to figure out what what is causing
06:33it right when people can't log in what's
06:36causing that and is it a network
06:37problem is it a storage problem
06:39no single person can understand it and
06:41so you need to have groups of people to
06:43try to figure things out and it's not
06:45just back-end things I mean I'm thinking
06:46of examples where it can touch our lives
06:48in very personal concrete ways and the
06:51classic example that comes to mind for
06:52me whenever we talk about this topic is
06:54self-driving cars and the decisions the
06:57algorithm makes I mean that's a case
06:58where you can certainly code into it
07:00certain principles like it should behave
07:01in this way under certain conditions but
07:03as it learns as a system learns and
07:05we're not aware of exactly what it's
07:06learning and how it's learning and it
07:08gets increasingly more complicated
07:10that's something that can affect us in
07:12very tangible ways yeah I think this is
07:14one of the fascinating changes to the
07:16way computers are being programmed
07:18increasingly right so for basically up
07:21until this point in time programming has
07:23been functional and procedural which is
07:25I have if loops and else loops and I
07:27tell it and what I'm trying to do is
07:29predict enough state so that the
07:31computer can make the right decision if
07:33this do that else do this right with the
07:36introduction of deep learning what you
07:38have baked into these computer systems
07:40is a probabilistic reasoning system
07:42which is if I see this input I think I
07:45should do X and how are we gonna marry
07:47these two worlds of procedural computer
07:50programmer tells you explicitly what to
07:52do in every case and this probabilistic
07:54reasoning which is well I've seen this
07:56road before and I think the right thing
07:58to do is turn right so to pick up on
08:00that I'm curious how in a sociological
08:03sense like because there have been like say
08:0575 years of computers being these exact
08:08precise things and you know and you use
08:11the analogy of physics and biology in
08:13the book and like how is it what what
08:15needs to happen for when for people to
08:18think that computers are biological that
08:21like hey it's okay if it has this goofy
08:24horn growing out the side of it
08:25evolution will eventually get rid of it
08:27cuz my experience has been that people
08:29have like a pretty low tolerance for
08:33error with anything that comes out of a
08:34computer like you used an example in the
08:37book that hit really home to me when
08:38you're working with an advanced piece of
08:40software such as a gargantuan which
08:43I'll assume you meant as a positive a
08:45gargantuan word processing tool
08:47and the end notes and the end notes in
08:50your document go and I'll quote haywire
08:53don't panic instead look at what went
08:55wrong and I have to tell you I've been
08:58on a lot of support calls with people
09:00with problems with Word and trying to
09:02say don't panic hasn't really worked for
09:07your writer I'm guessing like when and
09:11actually I've been on calls with super
09:13famous writers and don't panic just it's
09:19biological the next evolution you know
09:24Darwin will take care of it perhaps that
09:26advice is a little bit more theoretical
09:28than practical at this point one
09:30thing is that even when we're in the
09:32realm of like more traditional iterative
09:34like procedural kind or even
09:36functional programming I mean once you
09:38deal with like huge numbers of edge
09:39cases you can actually still quite
09:41easily build systems you don't really
09:42understand but especially as we move
09:43more into this world of like new types
09:46of machine learning and deep learning I
09:47think we need to kind of think more
09:49consciously about approaching them
09:51biologically and I think we can see some
09:53of these kinds of hints happening it's
09:54like for example Netflix they have this
09:56Chaos Monkey suite of tools where the
09:59tool will periodically take subsystems
10:01out of commission and see how the
10:03overall system responds live it'll just
10:05knock out portions of Netflix and see
10:08how it responds the idea is to lower the
10:10gap between how they assume the system
10:12works and how it actually does work and
10:14in order to make it as robust as
10:15possible and it turns out in biology
10:17this is actually one of the ways you
10:18learn about a living thing so for
10:21example let's say you have some you have
10:22one type of bacteria and you want to
10:24really understand how the genes interact
10:27what genes are important for which
10:28different kinds of things you can
10:29actively try to mutate it irradiate it or
10:31subject it to some sort of chemical and
10:34thereby see how as you knock out
10:36certain parts of the genome it actually
10:38affects it and I think people are
10:40beginning to use these more biological
10:42techniques to really understand their
10:43systems now of course it's one thing to
10:45do that when you're building the system
10:47it's another thing to say don't panic
10:48just start tinkering with your with your
10:50word processor and you'll be fine when
10:52you've lost all your endnotes it's a
10:53lot easier to just freak out and kind of
10:55go crazy it'll take some time we'll get
10:56there slowly but surely hopefully yeah
10:58Chaos Monkey is a great example of this
11:01that's happened inside data centers
11:03precisely because we had to introduce
11:05biological thinking rather than physics
11:07thinking into even designing and
11:08troubleshooting these systems the way
11:10I'm using the terms physics and
11:11biological thinking kind of has two
11:13different modes and of course it's an
11:14oversimplification is the physics
11:16mindset might be to write a single
11:18equation that explains a good a good
11:21fraction of what's going on so it might
11:22maybe explain like sixty percent of
11:24what's happening within a system the
11:25biological thinking approach says well
11:28these things are they're very they're
11:29very complex they've evolved over time
11:31there's sort of this organic messiness
11:32we actually need to focus much more on
11:35the details of the system maybe
11:36understanding subsystems or kind of
11:38different components of what of what's
11:40happening within a living organism in
11:42the hopes that eventually you create
11:43this broader picture because in this
11:45biological mindset is the idea that that
11:46the details really matter it's
11:48wonderfully if you if you can write an
11:49equation that explains 60% of what's
11:51going on but it turns out the remaining
11:5240% is really really important when
11:55you're trying to make sure something
11:56works something really works properly
11:58especially when it comes to technology
12:00now of course there are many physicists
12:02who dwell in details and there are many
12:03biologists who have grand theories and
12:05computational models so it's not a
12:07perfect way of describing the two
12:09different groups of scientists but
12:10they're kind of two different mindsets
12:11and how we approach the natural world
12:13but increasingly it's also a really good
12:16framework for thinking about how we
12:17approach the built world and I think we
12:19need to kind of import some of that
12:20biological thinking that recognizes the
12:22details and kind of this iterative
12:23tinkering approach to understanding a
12:25technology to actually understand it
12:27fully or at least part way as we
12:29continue to build on bigger and bigger
12:31so when I read that analogy what
12:33leapt to mind for me is in the data
12:36center over the last 20 years
12:38we've done a big transition from whose
12:41data center do you want to look like and
12:42that transition went from a Wall Street
12:44bank to Facebook or Netflix and I would
12:50argue that the Wall Street banks built
12:52physics thinking into their data centers
12:54which is you had these massive Sun
12:56servers and EMC arrays and Oracle
12:59databases and you paid attention to
13:01every single one of them because if one
13:03of them went down you were screwed but
13:06the benefit of knowing one of these
13:09things going down is you knew where to
13:10look right and then if you look at the
13:12Netflix or Facebook data center they
13:14sort of took the exact
13:15opposite view which is any server any
13:17disk drive any process that could die at
13:19any single time but we still want the
13:22Netflix feed to work and we want the
13:24news feed to work and the system needs
13:26to survive any given failure and that
13:28sort of the big change and so I would
13:30argue that most modern data centers
13:33which are built on microservices
13:34architectures scale-out architectures are
13:36designed with sort of this biological
13:38thinking in mind which is any single
13:39instance or disk drive or server can
13:42vanish but we need to make sure that the
13:45entire service doesn't grind to a halt
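[Editor's sketch] The design rule described here, that any single instance can vanish while the service keeps answering, can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not how Netflix or Facebook actually implement it: the `Instance` and `Service` classes and the `knock_out_random_instance` helper are hypothetical names invented for this example, and the knock-out step only mimics the spirit of the Chaos Monkey experiment mentioned earlier in the conversation.

```python
import random


class Instance:
    """One replica of a service (hypothetical); it can be knocked out at any time."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


class Service:
    """Routes each request to the first healthy replica it finds."""

    def __init__(self, replica_count):
        self.instances = [Instance(f"replica-{i}") for i in range(replica_count)]

    def handle(self, request):
        for instance in self.instances:
            try:
                return instance.handle(request)
            except ConnectionError:
                continue  # this replica vanished; try the next one
        raise RuntimeError("every replica is down")


def knock_out_random_instance(service, rng):
    """Chaos-style experiment: take one healthy replica out of commission."""
    healthy = [inst for inst in service.instances if inst.alive]
    victim = rng.choice(healthy)
    victim.alive = False
    return victim.name


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs pick the same victim
    service = Service(replica_count=3)
    print("knocked out:", knock_out_random_instance(service, rng))
    print(service.handle("play-video"))
```

The point of the sketch is the shape of the test rather than the routing logic: the failure is injected on purpose, and the check is on the behaviour of the whole service, not on any individual replica.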
13:46when you look at like the types of terms
13:48used to describe those types of data
13:50centers like resilient or robustness
13:53like these are the types of terms that
13:54are often used when thinking about an
13:56ecosystem or living organism and I think
13:58that is very symptomatic of the idea
14:00that they are much more in
14:02line with kind of biological modes of
14:03thought than physics modes that in
14:05itself was like a major evolutionary
14:07point in the delivery of computing to
14:09people I mean I remember at my old job
14:11we were working on like a Netflix
14:13basically it was the way to distribute
14:14video and we talked about like having
14:17data center employees like literally on
14:19rollerskates who are gonna run around
14:21swapping out disk drives and the whole
14:24system actually couldn't work because
14:26they they started doing the math on how
14:28quickly they would need to replace disk
14:31drives and then along comes Google and
14:33they basically pioneered this whole
14:35notion that like all the disk drives
14:37it's not like they're likely to fail
14:39it's that they will fail and so it was
14:41designing a whole system on the on the
14:44presumption of continuous failure which
14:47was like a complete inversion from all
14:49the other systems that had been designed
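[Editor's sketch] The "math on how quickly they would need to replace disk drives" is not spelled out in the conversation, but the arithmetic behind it is easy to sketch. The version below assumes failures are independent and spread evenly over the year; the fleet sizes and the 3% annual failure rate are hypothetical numbers chosen only for illustration, not figures from the speakers.

```python
def expected_failures_per_day(drive_count, annual_failure_rate):
    """Average drives failing per day, assuming independent failures
    spread evenly across a 365-day year."""
    return drive_count * annual_failure_rate / 365.0


def minutes_between_failures(drive_count, annual_failure_rate):
    """Average gap, in minutes, between one drive failure and the next."""
    per_day = expected_failures_per_day(drive_count, annual_failure_rate)
    return 24 * 60 / per_day


if __name__ == "__main__":
    assumed_rate = 0.03  # hypothetical 3% annual failure rate
    for drives in (1_000, 100_000, 1_000_000):  # hypothetical fleet sizes
        per_day = expected_failures_per_day(drives, assumed_rate)
        gap = minutes_between_failures(drives, assumed_rate)
        print(f"{drives:>9} drives: {per_day:8.2f} failures/day, "
              f"one every {gap:10.1f} minutes")
```

Because the expected number of failures grows linearly with the size of the fleet, a rate that is negligible for one machine becomes a continuous stream at data center scale, which is why the design has to presume failure rather than try to prevent it.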
14:51in a sense I think that the whole
14:53software as a service notion has made
14:55the backend of the services sort of
14:58designed in a biological way but I'm
15:00still fascinated by the fact that the
15:01people at the end of the services still
15:04think of them as physics yeah I don't I
15:06just don't see a tolerance for failure
15:07because what happens is immediately
15:09people start thinking well fine it's
15:10cool if it's Gmail and is down for 18
15:12minutes I guess I could survive but like
15:15that same thought in an airplane kind of
15:17freaks me out you know one of the big
15:19innovations that I'm looking for as we
15:21switch from this deterministic to more
15:23probabilistic population based is the
15:26way that we design test
15:29verify monitor and recover from failures
15:32has got to change and we're in the midst
15:34of that transition right now which is if
15:36you look at monitoring tools they're
15:38going from you know sort of HP OpenView
15:40to things like SignalFx which is you're
15:43looking at populations of servers rather
15:45than individual servers so one of the
15:48things I've been wondering is what's the
15:49big breakthrough that we need to verify
15:52the output of deep learning systems
15:54right which is if these things are
15:56inherently probabilistic how do we test
15:58them how do we give people the assurance
16:00that it feels like physics at the end
16:01right but inside it's biology and
16:04frankly when it comes to biology we as
16:06humans are actually conditioned to
16:09accept this inherent complexity I mean
16:10you go to the doctor they can't figure
16:12out what's wrong with you you go to
16:13another doctor and you keep doing that
16:14and you hear this narrative you know and
16:16even though it's very frustrating it's
16:18almost accepted and I wonder if we'd
16:20ever get to the same point with our
16:21computing systems in terms of
16:23expectations I mean it'll definitely
16:24take a new mindset the perspective I
16:26think people are going to need to
16:27eventually embrace to a certain degree I
16:30would say almost like a humility in the
16:32face of technology and I think I like
16:34often times we kind of tend towards two
16:36extremes of either like when we don't
16:38fully understand a system when we maybe
16:39are confronted with kind of the
16:40biological messiness we either freak out
16:42or we say this is like so incredibly
16:45complicated that there's like this like
16:46reverential awe almost religious sense
16:49of the system like it's so beautiful so
16:50wonderful we're never gonna fully
16:51understand it and I think both extremes
16:53they end up cutting off questioning
16:55and kind of trying to actually understand
16:57the system even if we can never fully
16:58understand it whether or not you're the
17:00designer or even just the user I think
17:01we need to kind of recognize that
17:02there's going to be this almost like
17:04humble approach to our technological
17:05systems where it's gonna be okay if we
17:08don't fully understand these things and
17:09if they do occasionally fail because
17:11ultimately those failures lead us
17:13towards better understanding so that's a
17:14good thing but there's just going to be
17:17this constant iterative process of
17:18trying to understand these systems we
17:20might never get there but there's
17:22something exciting about actually trying
17:24to trying to fully understand and
17:26recognizing that these things are messy
17:27and complex but still
17:30also something that we actually created
17:31well also when it comes to something we
17:32created we also have to think about the
17:34very combinatorial nature of that
17:36creation and one of my favorite books
17:38here is The Nature of Technology and how
17:40it evolves by Brian Arthur
17:41what struck me most I mean there's a lot
17:43of things I love about that book but
17:44what struck me most when I was reading
17:46it and and it even applies to how you
17:48guys open this conversation with your
17:50question is the narrative around
17:52creation and who invented what and we
17:54tend to talk about it
17:55in a very linear way but it's a very
17:57non-linear iterative thing where people
18:00build on each other's ideas and and it's
18:02very messy and complex and I've always
18:04thought that when we tell these stories
18:06we need to do a better job of
18:07acknowledging all of that complexity and
18:09messiness and I mean now the systems are
18:11even more complex we now build systems
18:14that no one understands and that classic
18:17if I could get a time machine and go
18:18back to like 1952 I'd invent whatever
18:21your favorite product is now and then
18:22you realize you couldn't come up with an
18:23iPhone in 1950 oh yeah it's totally
18:25impossible to do that yeah you just
18:26don't have the knowledge that yeah you
18:28don't have the expertise of other people
18:29to build upon it's simply impossible
18:31there's all these things interacting and
18:33you have to be mindful of every single
18:35one and no one can actually be mindful
18:36of every single one okay well thank you
18:38guys that's all we have time for
18:39and that's another episode of the a16z
18:41podcast thank you thank you