00:00welcome to the a16z podcast I'm Michael
00:02Copeland in this world of massive
00:04cloud-based applications and services
00:06rolling out software has moved from an
00:09episodic event to an almost continuous
00:12release cycle in that environment
00:14software products aren't as
00:16quote-unquote done as they used to be
00:18they can't be so the focus has shifted to
00:21reversibility building a development
00:24organization with the design tools and
00:26processes that can aggressively iterate
00:29while also creating safety nets so if
00:32things do get screwy they can be fixed
00:34before customers even notice call it
00:37DevOps or application operations Steven
00:40Sinofsky leads a discussion with Karthik
00:42Rao from SignalFx and Alex Solomon from
00:45PagerDuty about the evolution of IT
00:48operations and the requirements and
00:51challenges that modern distributed
00:53applications pose for development
00:55organizations Steven Sinofsky starts the
00:59conversation what we thought we'd do is
01:01have a little bit of a discussion about
01:03the role of DevOps and how that really
01:07changes how things were going in IT
01:08represented by two great founders with
01:12some wonderful tools in the space but
01:14just diving right in I think one of the
01:17most interesting things is that
01:18historically IT has really thought about
01:21you know waterfall development and
01:23requirements gathering and and really
01:25trying to solve these customer problems
01:27where the customer is an internal facing
01:29organization how how does the cloud and
01:32modern techniques and the consumer
01:34internet really alter the the way that
01:37people think about the different roles
01:38and the types of work to get done in IT
01:40maybe start with Karthik on that yeah I
01:42think one of the best examples or
01:45responses to that question that I read was
01:48a Facebook engineer who wrote a blog
01:50post about how traditionally the goal
01:53was always to reduce complexity or
01:55mitigate complexity and that's what
01:57waterfall and kind of all the phase
01:58checks are essentially all about
01:59managing that complexity and the point
02:02that he made was if you're really trying
02:04to be innovative and move quickly you
02:07can't really manage the complexity
02:08because at Facebook they've got so many
02:10small teams and they're all releasing
02:12very you know aggressively and in
02:15that kind of a world you really have to
02:17focus more on organization design tools
02:19and process that focus on what he called
02:23reversibility and so this is you still
02:25move very aggressively but you have to
02:27create the safety nets so that as you're
02:30making changes if you make any change
02:32that is potentially destructive that you
02:33recognize it very quickly and you have
02:35the means both in kind of how your
02:37software is designed your processes are
02:39designed your teams are designed so that
02:41you can roll it back very quickly before
02:42your customers even notice when you do
02:44that you then have confidence a lot more
02:46confidence that you can be much much
02:47more aggressive in rolling out software
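A minimal sketch of the reversibility idea described here, assuming a deploy step and a health check are available as plain callables. The names `release_with_rollback`, `fake_deploy`, and `fake_health_check` are hypothetical illustrations and do not come from the Facebook post or from any particular tool.

```python
from typing import Callable


def release_with_rollback(
    current_version: str,
    new_version: str,
    deploy: Callable[[str], None],
    is_healthy: Callable[[], bool],
) -> str:
    """Deploy new_version; revert to current_version if the health check fails.

    Returns the version that is live when the function finishes.
    """
    deploy(new_version)
    if is_healthy():
        return new_version
    # The change looks destructive: roll back quickly.
    deploy(current_version)
    return current_version


# Usage with in-memory stand-ins for a real deploy system and monitor.
live = {"version": "v1"}


def fake_deploy(version: str) -> None:
    live["version"] = version


def fake_health_check() -> bool:
    # Assumption for the demo: v2 is the release that breaks things.
    return live["version"] != "v2"


result = release_with_rollback("v1", "v2", fake_deploy, fake_health_check)
print(result, live["version"])
```

The safety net is the pairing of a fast signal (the health check) with a cheap way back (redeploying the previous version); in practice the same shape could apply to feature flags or staged rollouts.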
02:49right so I think that to me summed it up
02:51in a really crisp way of you know the
02:53world has changed a little bit and if
02:55you're gonna really support fast release
02:56cycles and you want to be competitive
02:58and being very responsive to the
03:00marketplace you can't control complexity
03:02the same way you could before you just
03:04have to focus on other aspects primarily
03:07reversibility and and so like with the
03:09this role though of DevOps like how how
03:12do you see customers like you know in
03:14the in the world of like break and
03:16response and escalated incidents how
03:18do you see the internal customers sort
03:20of managing when their products don't
03:23appear as done as they used to but
03:25they're done a little bit sooner and how
03:29does that influence the the ways to
03:31think about the engineering cycle and
03:33also to just the communication with
03:34those internal customers that's a good
03:37question so what we've seen is that the
03:41requirements stage essentially boils
03:44down to doing customer development and
03:46being able to talk to customers and and
03:49what's really important as part of that
03:51is showing them something so as part of
03:54the development cycle you would show
03:55them wireframes and something to react
03:57to and then you'd make it a much more
03:59iterative process where
04:02it's a shift away from the waterfall get
04:04it done in one big bang which actually is
04:08very risky because if you've made any
04:10mistakes along the way and those
04:12mistakes actually add up at the end
04:15of the day you don't deliver what the
04:16customer needs so being able to
04:19develop the software much more
04:21iteratively and show them here's what we
04:23have so far what do you guys think
04:26and get the reaction back from the
04:27customer and then adjust and learn and
04:29iterate that's a big part of DevOps well
04:32in the in the consumer space one of the
04:34things that's so interesting is you know
04:35there's this perception that you throw
04:37it out there and you see how people
04:39react and and and then you you adjust
04:42and you iterate and things like that but
04:44often in the business world people said
04:45well we can't we're just not able to do
04:47that because our requirements are fixed
04:49like yeah go build a messaging app go
04:51build a shopping app but those
04:53requirements aren't aren't you know
04:56they're flexible whereas we have to this
04:58is our expense report process or
04:59performance review process or cash to
05:01quote process how do you how do you see
05:04the role of in the customers you work
05:06with how do you see the role of an MVP
05:08or or just these early releases do you
05:11see that evolving in any way well I
05:14think one of the things is you don't
05:15have to build the entire stack right I
05:16mean I think in the web services economy
05:17you can leverage a lot of other
05:19components and focus on the things where
05:21you really you know where you want to
05:23invest and it makes it a lot easier to
05:25get something up and running very
05:26quickly right and I think ultimately
05:30even in the enterprise world markets are
05:32changing very quickly and so if you're
05:33taking two years to get something out
05:35the markets have probably changed in
05:36those two years so it's very
05:39advantageous to get something out
05:40quickly I think the the key is just to
05:42have focus and you know leveraging the
05:46sort of web services ecosystem there are all
05:48of these different technologies that you
05:49can leverage without having to kind of
05:51wrap it up and build it up into this one
05:52giant software package that takes two
05:54years to release I think it certainly
05:55makes things easier put you on the spot a
05:57little bit have either of you you
05:59know really gone through that with a
06:00customer with a particular kind of app
06:02where it's it's really jumped out at
06:04them and in terms of you know wow this
06:06was an app where we were generating way
06:08more tickets than we used to expect
06:09because we have way more telemetry how
06:12are you you seeing the actual
06:13deployments of like these modern
06:15cloud-based applications really evolving
06:18in terms of the level of support and the
06:19level of understanding that's that's
06:21really going on yeah well what we've
06:24seen is that customers are becoming a
06:26lot more demanding as software as the
06:29world becomes more powered by just that
06:31these are the customers of the app or
06:32the customers of yours the customers of
06:35the app yeah yeah they become more
06:37demanding they expect everything to be up you can't
06:40take like hours to fix an outage you
06:43have to automatically detect outages you
06:45can't have your customers detect the
06:46outage for you and you have to respond
06:50and if you don't do that
06:54your bottom line hurts your
06:57reputation yeah many of these
06:59applications have SLAs so if the app is
07:02not up you actually have to
07:03refund money back and so there's a lot
07:06more pressure on the IT department to
07:09deliver 100% uptime of course 100% is
07:12not realistic but you have to get as
07:13close as possible to that yeah that's
07:15one of the things like that it's a good
07:16comment like a hundred percent because
07:18one of the the things that's so
07:19interesting is that IT used to think
07:21of we can deliver a hundred percent or
07:23we can get really really close if we own
07:25all of the parts from the the network
07:27routers on up but in the SaaS world you
07:30know wow you might auth with your
07:33Active Directory ID you might be
07:36using this storage system from somewhere
07:38else and this other service and you
07:40might be involved with an
07:41integration how do you parse the notion
07:44of 100% uptime in that or how does IT
07:47think of accountability even in that
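One way to make the uptime question concrete is a little arithmetic. The sketch below assumes the app is only up when every dependency is up and that failures are independent, which real systems rarely satisfy exactly; the 99.9% figures and the function names are hypothetical, not numbers from the conversation.

```python
from math import prod

MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes


def composite_availability(availabilities: list[float]) -> float:
    """Availability of an app that needs every dependency up at once.

    Assumes independent failures, so the fractions simply multiply.
    """
    return prod(availabilities)


def downtime_minutes(
    availability: float, period_minutes: int = MINUTES_PER_30_DAY_MONTH
) -> float:
    """Expected downtime over the period for a given availability fraction."""
    return (1.0 - availability) * period_minutes


# Hypothetical figures: identity provider, storage service, and the app itself.
deps = [0.999, 0.999, 0.999]
overall = composite_availability(deps)
print(f"overall availability: {overall:.4%}")
print(f"expected downtime per 30 days: {downtime_minutes(overall):.1f} minutes")
```

Under these assumptions, three services at 99.9% each give roughly 99.7% for the whole app, so the combined figure is always lower than that of any single dependency.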
07:49yeah I think that's an interesting
07:51question because your customers don't
07:53don't know the difference right
07:55they're just like I can't log on I
07:57can't file an expense report we were talking to
07:58a media company and they had this
08:02situation they had a really big event
08:03and they had a streaming app and
08:07one of their ad networks was taking an
08:09abnormally long time to load the ad
08:10before the video streamed and all their
08:13users were on Twitter complaining you
08:14suck and just you know and it was it was
08:16terrible for their brand but it wasn't
08:18their fault it was a third party that
08:20was taking a really long time to
08:21load well it wasn't their fault I mean
08:24it was it was it wasn't their technology
08:26it was someone else's technology right
08:28but from a customer point of
08:29view that didn't matter they just felt
08:31like the experience was poor and so you
08:33know for example they're working with us
08:35on instrumenting the calls they make out
08:37into their third party services and
08:39being able to measure it and having the
08:41real-time visibility as they see events
08:44happening if they see some particular
08:46networks that are taking longer to load
08:47ads at least they have that data they
08:49can make real-time decisions if they
08:51need to and they can call their vendors
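The instrumentation described above could be sketched roughly as follows: wrap each outbound third-party call in a timer, keep the samples per vendor, and flag vendors whose average latency crosses a threshold. This is an illustrative assumption, not SignalFx's API; `timed_call`, `slow_vendors`, and the vendor labels are hypothetical, and `time.sleep` stands in for real network requests.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from statistics import mean

# Latency samples in seconds, keyed by a vendor label of our choosing.
latencies: dict[str, list[float]] = defaultdict(list)


@contextmanager
def timed_call(vendor: str):
    """Record how long the wrapped block takes, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        latencies[vendor].append(time.perf_counter() - start)


def slow_vendors(threshold_seconds: float) -> list[str]:
    """Vendors whose mean recorded latency exceeds the threshold."""
    return sorted(
        vendor
        for vendor, samples in latencies.items()
        if mean(samples) > threshold_seconds
    )


# Usage: time.sleep stands in for real requests to two ad networks.
with timed_call("ad_network_a"):
    time.sleep(0.01)
with timed_call("ad_network_b"):
    time.sleep(0.2)

print(slow_vendors(threshold_seconds=0.1))
```

Recording in a `finally` block means failed calls are timed too, which matters because a slow vendor often shows up as timeouts rather than slow successes.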
08:53so let's switch gears a
08:57little bit just in terms of one of the
08:58interesting things about being a person
09:00in DevOps now particularly inside the
09:02enterprise is really balancing the needs
09:05of like control which
09:08used to come from being all on-prem
09:09which now might move to sort of a hybrid
09:12cloud model and we we're all very
09:15forward-looking here but many customers
09:16are sort of dealing in these kind of
09:19hybrid environments what what are we how
09:22do we help people to understand that
09:24both the advantages you know of moving
09:26as fast as possible which I think most
09:27people want to do to a cloud world but
09:29then the realities that they're dealing
09:30in in terms of just these these mixed
09:33environments and just responding to
09:35what's going on from a DevOps
09:36perspective yeah I mean I think the um
09:40you know the control aspect of it is
09:43a false illusion I mean the complexity
09:45of these systems I mean but what do you
09:47do it goes I have to interrupt because
09:49like people have made big bets on
09:52delivering on that expectation like and
09:55we're just kind of bursting their bubble
09:56but it's not true where we're gonna
09:58finally tell their boss it's not
09:59possible well I mean the truth is that a
10:03cloud service or a SaaS service these
10:06guys have teams of people who are
10:08dedicated to keeping that service up and
10:11you know trying to to take that on
10:14yourself I mean I'd rather pay someone
10:16else to do it for me especially if
10:18they're really good at it
10:19so this is why people use increasingly
10:21AWS and infrastructure as a service
10:24platform as a service and SaaS because
10:26someone else has to worry about that
10:28uptime and I pay them and then if
10:31they don't deliver I get my money back
10:33SLAs and such so I think you know
10:35that's why a lot of companies including
10:37like the CIA and government are paying
10:39AWS to do it for them because they have
10:41that expertise in-house and it's hard to
10:43gain that same expertise for every
10:45single you know company out there what
10:48that's it's a good way to think about
10:50you know the the different levels of
10:52DevOps what how do you real quick for
10:55folks just even define what DevOps is
10:57how do you help them to understand when
10:59they go to hire them and things like
11:00that yeah well we at SignalFx we
11:02like to think of it more as application
11:04operations or just operations an
11:07evolution of IT operations that's just
11:10focused more on modern distributed
11:11applications and focusing your toolset
11:14and your process on a different set of
11:15challenges from maybe what you did before
11:17and there's still
11:20a significant amount of work involved in
11:23building the processes
11:25and the tooling to make a cloud
11:27infrastructure usable and and highly
11:30effective for an end-user development
11:31organization and that that's you know
11:34what we think of as the modern role of
11:35operations application operations DevOps
11:37whatever you want to call it cool well
11:39thanks everybody this was just a quick
11:41chance to see some excellent work and
11:43really think a little bit about DevOps
11:46thank you very much guys thank you thank