00:00 SRE was one of the highest paying Tech
00:01 roles in 2023 and actually ranked third
00:04 on Forbes' list but it's also one of the
00:06 most dynamic interesting and
00:07 transferable jobs in the tech space it
00:10 also happens to be one of the most
00:11 challenging but if that all sounds good
00:12 to you and you're interested in
00:13 transitioning into SRE and you want to know
00:16 what it takes to become an SRE then stay
00:18 tuned hey what's up I'm Adam if you're
00:20 new here and if you're returning then
00:21 you know what's good I'm an SRE based in
00:24 London and in this video I want to get
00:26 into a bit of the nitty-gritty about how
00:28 to transition from a role in tech or
00:30 from a different type of space into the
00:32 SRE role I feel like the SRE role is a
00:34 little bit mystical like what do SREs
00:36 actually do what kind of skills should I
00:38 be checking off if I want to be an SRE
00:40 and we're going to get into that today
00:41 but I'm also going to get into what kind
00:43 of Pathways you can follow if say you're
00:45 a software engineer looking to be an SRE
00:47 or a DevOps engineer and actually how you
00:49 can search for jobs that you stand a
00:51 good chance of getting like SRE roles
00:53 that lean into your skills and then
00:55 finally why the AI Revolution actually
00:57 opens up a big opportunity for SREs
01:00 let's just get into it so what is site
01:01 reliability engineering and why does it
01:03 matter well site reliability engineers
01:05 or SREs are responsible for making sure
01:07 that platforms websites applications
01:10 once they're in production that they
01:11 remain reliable that the end user gets
01:14 the experience they expect why does that
01:16 matter well consider this a medical
01:17 application that doctors or medical
01:20 staff use to access patient data maybe
01:22 even their images from scans and things
01:24 like that now imagine a doctor goes to
01:26 use the application to treat a patient
01:28 perhaps one that's in dire need and they
01:29 can't access the site they can't access
01:31 the application it's not responding as
01:32 expected or even worse maybe it's
01:34 serving up the wrong data maybe it's
01:36 delivering data about a different
01:37 patient you can see how the consequences
01:39 of that unreliable system could actually
01:41 be catastrophic and that's just one
01:43 example the impact of an unreliable
01:44 system can be far spread from customer
01:47 dissatisfaction and loss of Revenue to
01:49 actually compliance issues and your own
01:51 employees basically getting sick and
01:52 tired like if all we're doing is putting
01:54 out fires because our system is always
01:56 failing right it's unreliable then
01:58 where's the time for innovation and
02:00 design and moving forward so now we see
02:02 why reliability is important and why
02:03 having an engineer dedicated to it makes
02:06 sense but then what kind of skills does
02:07 the SRE have like how does that
02:09 translate into preparing for a job well
02:11 the key skills of an SRE actually center
02:13 around the principles of site
02:14 reliability engineering and they're as
02:16 follows reliability first this is the
02:18 idea that reliability is the most
02:20 important feature of your platform like
02:21 it doesn't matter how many nice things
02:23 that you can layer on top of it how many
02:25 updates if the system is unreliable
02:27 you're building on really weak
02:28 foundations the second is automation the
02:30 idea that the SRE should be automating
02:32 away toil that is manual tasks that take
02:34 away from precious engineering time
02:36 right so putting things in place so that
02:38 we can spend more time on the things
02:39 that are going to be Innovative and move
02:40 us forward next is monitoring and
02:42 alerting this is all about seeing into
02:44 your systems right if you want to ensure
02:46 that the system is reliable and that you
02:47 don't have failures and outages we need
02:49 to be able to see what's going on and be
02:51 able to alert effectively ideally
02:53 automated alerting so when something's
02:55 going wrong we get an alert sent to us
02:56 as SREs and we can check it out and
02:58 potentially before the end user even
03:00 knows anything has happened so the next
03:01 one may be a bit surprising but it's
03:02 actually about embracing risk you're
03:04 like hm but I thought we were trying to
03:05 be reliable the aim is not 100%
03:07 reliability we're not trying to foster a
03:09 culture where everyone's scared to touch
03:10 everything right and we never move
03:12 forward because if we press this one
03:13 thing we might break everything or we
03:15 might break one thing and nobody wants
03:17 to do that we want to give enough room
03:19 so that people devs all types of
03:20 engineers want to push boundaries and
03:22 move the application or platform forward
03:24 but so we can do it in a way that is
03:26 safe if something goes wrong we know how
03:27 to do rollbacks and things like that to
03:29 bring the system back and ultimately
03:31 it's more of a controlled system there's
03:33 also the principle of the service level
03:34 model so this is the way that we manage
03:36 and monitor our systems in the service
03:38 level so SLOs and SLIs and finally it's
03:40 about collaboration SREs don't work
03:42 independently we work as part of teams
03:44 and that's an important thing a key
03:46 principle in SRE okay so how does that
03:49 translate into the skills that are
03:50 needed to be an SRE cuz I've said a lot
03:52 of things and you're thinking that's a
03:53 lot of things to check off the thing is
03:55 you don't have to be an expert in
03:56 everything there are some areas that
03:58 SREs are expected to be the subject
04:00 matter expert and there are other things
04:02 that we need kind of peripheral
04:04 knowledge of or experience of we need to
04:05 know how to recognize them but I don't
04:07 need to know how to design something
04:08 from scratch and here's what I mean if
04:10 we take a look at this table which
04:11 splits things into subject matter expert
04:13 and things that you need experience and
04:15 knowledge of we can see that you're
04:16 expected to really know the
04:17 nitty-gritty of things like SLOs and
04:19 SLIs of monitoring and alerting of data
04:22 driven decisions things like
04:23 architecting in the cloud reliable
04:25 systems and things like automation you
04:27 should have a firm grip on these sorts
04:29 of things but there's other things that
04:31 you need knowledge of and experience of
04:33 but you might not need to be an expert
04:34 like I don't need to be a networking
04:36 engineer to be an SRE right so it's in
04:37 the experience side but I do need to
04:39 have knowledge of networking because if
04:42 there's a networking issue if that's why
04:43 my system is unreliable I need to know
04:45 how to at least identify that maybe do
04:47 some sort of work but then obviously
04:49 seek help where necessary from the
04:51 appropriate Engineers who are also in
04:52 the team also in the wider company to
04:54 support a similar thing is like
04:55 application testing I don't need to be
04:57 able to write the most elaborate test
04:58 for Java applications Python
05:00 applications all of these things but I
05:01 need to understand the job of tests and
05:04 how they fit into the pipeline and
05:05 how they fit into the software
05:06 development life cycle right and when
05:08 necessary I need to be able to test my
05:10 own code so if I'm writing automation
05:12 scripts in Python I do need to know how
05:14 to write tests at least to the standard
05:16 that we expect to check that that works
05:18 otherwise we're just flying blind
05:19 basically and hoping that it works all
05:21 the time but how do you acquire then
05:22 some of this knowledge especially those
05:24 key subject matter expert topics well
05:27 it's kind of up to you there are not many
05:29 all-encompassing SRE courses out there
05:31 just because of how diverse and broad
05:33 the role is right but you can take a
05:36 list like this and go through and find
05:38 resources that align with it for
05:39 example the Google SRE workbook is a
05:42 great resource for learning SRE
05:43 principles or you may look at something
05:45 like the Linux Foundation for some of
05:46 their training and certifications but in
05:48 case you do want a one-stop shop for
05:50 everything I have my Becoming an SRE
05:51 course that is out on the 8th of January
05:53 2024 so here we go through all the
05:55 fundamental topics there's varying
05:57 levels of intensity for the things that
05:59 are subject matter expert you dive very deep
06:01 into things that are more peripheral you
06:03 know we get to grips with understanding
06:05 but we don't want to waste time we want
06:07 to get to the real core of SRE and
06:09 what it takes for us to get a job and
06:11 execute that job within the first few
06:12 weeks months and years of our career so
06:14 the course has things from theory to demos
06:16 to quizzes to make sure that you
06:17 understand what you've been learning
06:18 right in all of these modules but it
06:20 also has projects projects to support
06:22 the learning and actually build out a
06:24 portfolio of these skills right so that
06:25 you can store them in places like GitHub
06:27 and link to them when you're applying
06:28 for jobs and after all of that there's
06:30 the career development pack where we
06:32 dive deep into how do you actually get
06:33 the SRE job right there's things like
06:35 the skills tracker and the application
06:37 tracker but also how do you construct a
06:39 CV for an SRE role right if you're coming
06:41 from a completely different space or how
06:44 do you actually find jobs that align
06:45 with what you do and what your
06:47 skills are and your background how do
06:48 you then interview what kind of
06:50 interview Styles and questions can you
06:51 expect all of that is in the course and
06:53 if you want more information on that
06:54 check the description of this video but
06:56 anyway back to it so how do you
06:57 transition into an SRE role depending on
06:59 where you are now because we know that
07:01 SREs come from a range of backgrounds and
07:03 bring their skills from their previous
07:04 jobs with them I just want to make it
07:06 clear that some jobs lend themselves
07:07 very well to SRE that the transition can
07:10 be very smooth things like DevOps
07:11 engineers software engineers cloud
07:14 architects second line support even
07:16 network and security engineers because
07:18 of the broad aspect of SRE you can bring
07:21 your specialties you can bring your area
07:22 of expertise into the SRE role so you
07:25 have your foundations and you start to
07:27 layer on top so let's quickly go over
07:28 the software engineer path just to
07:30 illustrate how this would work as a
07:31 software engineer you are going to be
07:33 well versed in things like programming
07:35 languages whatever you have been
07:36 programming in whatever you've been
07:37 building applications in and supporting
07:39 you're going to have a really good grasp
07:41 on that which means things like the
07:42 automation element in terms of writing
07:44 scripts to get things done you should be
07:46 able to do that sort of thing a lot
07:47 easier than somebody else because you
07:49 understand how to turn problems into
07:51 code already codification of problems
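To make that "turn a problem into code" idea concrete, here is a minimal sketch of an automation script for one piece of toil. The task itself (clearing out files older than a set number of days) and the names find_stale_files and delete_stale_files are assumptions picked for illustration; they are not from the video.

```python
import time
from pathlib import Path
from typing import List, Optional

SECONDS_PER_DAY = 86400


def find_stale_files(directory: Path, max_age_days: float,
                     now: Optional[float] = None) -> List[Path]:
    """Return regular files in `directory` last modified more than `max_age_days` ago.

    `now` is injectable so a test can pin the clock instead of relying on real time.
    """
    current = time.time() if now is None else now
    cutoff = current - max_age_days * SECONDS_PER_DAY
    return sorted(
        path for path in directory.iterdir()
        if path.is_file() and path.stat().st_mtime < cutoff
    )


def delete_stale_files(directory: Path, max_age_days: float,
                       dry_run: bool = True) -> List[Path]:
    """Delete stale files and return the list of files that matched.

    Defaults to a dry run so the script reports what it would remove
    before anything is actually deleted.
    """
    stale = find_stale_files(directory, max_age_days)
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale
```

Keeping the "which files?" decision in its own function, separate from the deletion, is what makes a script like this easy to test. It could be called as delete_stale_files(Path("some/dir"), 7) to preview and then again with dry_run=False to actually delete.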
07:53 you're also going to have an
07:54 understanding of application design and
07:56 to an extent logging right like error
07:57 handling and how to log effectively
07:59 and appropriately for those who are
08:01 going to be supporting it these are all
08:02 very strong things to bring in when you
08:04 are applying and thinking about the SRE
08:06 position but where do you go next like
08:07 what is the next layer to add well now
08:08 you want to start thinking about some of
08:09 these SRE fundamentals and principles
08:12 right you want to start understanding
08:13 SLOs SLIs and that model you want to
08:15 start understanding observability and
08:17 alerting right monitoring and alerting
08:19 how can we build out systems like this
08:21 that are functional and useful after
08:23 that you may want to add the next layer
08:24 which will be things around cloud and
08:26 automation right do you understand how
08:28 to build a reliable system in the cloud or at
08:30 least support one right so if you're
08:32 going to be working in an AWS
08:33 environment you need to be in touch with
08:34 these AWS Services understand how they
08:37 work and understand how to build
08:38 reliability into the way that we use
08:40 them and the way that we execute with
08:41 them if you're not familiar with things
08:43 like infrastructure as code then you're also
08:45 going to start to layer these things on
08:46 and CI/CD so that's how the transition may
08:48 work and finally kind of throughout this
08:51 you want to start thinking about the
08:53 principles the ideas and the attitudes
08:55 towards SRE like the end user focus of
08:57 everything that we do right these data
08:59 driven decisions and things like that
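For a feel of what the SLO and SLI model and those data-driven decisions look like in practice, here is a minimal sketch of an availability SLO with an error budget. The class name AvailabilitySLO, the 99.9% target and the request counts are all assumptions made up for illustration; real teams pick their own indicators and targets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AvailabilitySLO:
    """A request-based availability objective, e.g. target=0.999 for 99.9%."""
    target: float

    def sli(self, good_requests: int, total_requests: int) -> float:
        """The indicator: fraction of requests that were served successfully."""
        if total_requests == 0:
            return 1.0  # no traffic means nothing failed
        return good_requests / total_requests

    def error_budget(self, total_requests: int) -> float:
        """How many requests are allowed to fail before the SLO is missed."""
        return (1.0 - self.target) * total_requests

    def budget_remaining(self, good_requests: int, total_requests: int) -> float:
        """Fraction of the error budget still unspent (negative once overspent)."""
        allowed = self.error_budget(total_requests)
        failed = total_requests - good_requests
        if allowed == 0:
            return 1.0 if failed == 0 else 0.0
        return 1.0 - failed / allowed


# Hypothetical month of traffic: 1,000,000 requests, 600 of them failed.
slo = AvailabilitySLO(target=0.999)
remaining = slo.budget_remaining(good_requests=999_400, total_requests=1_000_000)
print(f"SLI: {slo.sli(999_400, 1_000_000):.4%}, budget remaining: {remaining:.0%}")
```

The point of the budget is the "embracing risk" principle: while some budget is left there is room to ship changes, and when it runs low that is the data telling the team to slow down and focus on reliability.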
09:01 but you can see how you're not starting
09:02 from scratch as a software engineer or
09:04 software developer or a DevOps engineer
09:06 in another example or second line
09:07 support like you're bringing those
09:09 skills with you so I did promise that I
09:11 would touch on how you identify these
09:12 jobs and the jobs that are aligned with
09:14 you well you want to be looking out for
09:16 SRE jobs where the job description lists
09:19 things that you know that you are
09:20 skilled in let's go back to that
09:21 software engineering example if you are
09:24 looking at a job description for an SRE
09:26 role and there's an emphasis on
09:27 programming right and
09:29 understanding code and understanding
09:32 applications in terms of the larger
09:33 scale and their design then you might
09:35 start to think I probably have a
09:37 competitive advantage over somebody who
09:39 may be from a DevOps background and you
09:40 know may not have spent that much time in
09:42 application code right so you will want
09:44 to put yourself forward for things like
09:45 that whereas if you are from the DevOps
09:47 background and you start seeing job
09:48 descriptions and there's a heavy
09:49 emphasis on CI/CD or there's a heavy
09:52 emphasis on infrastructure as code and
09:54 Terraform or even Linux then you're
09:56 going to start to think that is where I
09:58 am well placed right that is where my
10:00 odds are higher well the odds are in my
10:02 favor in an SRE role like that and
10:04 because the role can change so much from
10:06 place to place and companies will define
10:08 what they mean by SRE this is why it's
10:10 so important to look at the job
10:11 description when you're applying instead
10:12 of just blindly applying for SRE roles
10:14 one caveat that I did want to include here
10:16 before you start thinking about how you
10:17 get your first SRE job is if you are in
10:20 a role right if you're in a company and
10:22 you're aligned with a tech department or
10:23 you even have access to it so maybe
10:25 you're not an engineer but you work
10:26 within this large organization or in a
10:29 start making connections with the SRE
10:31 and platform teams and even the devops
10:33 team if there isn't an SRE team from
10:35 early on right start thinking about the
10:37 ways that you can bring SRE principles
10:38 into the work that you do and how you
10:40 can maybe take on tasks and work from the
10:43 SRE or the platform team right could you
10:45 ask to be involved in some of the
10:46 tickets could you bring some of the
10:47 knowledge that you have from your
10:48 current role into what they're doing
10:50 right offering support that way you
10:52 start to build up experience that you
10:54 can put on things like your CV for when
10:56 you're applying for full-on SRE roles or
10:58 you can even make the case to your
11:00 company that you would like to
11:01 transition into the SRE position so
11:03 finally let's wrap this up by talking
11:04 about SRE and AI because I know there's
11:06 a lot of fear in like the tech market
11:08 like are we all going to be replaced we
11:09 don't know what's going to happen
11:10 realistically in 10 years right but what
11:12 we do know is that the SRE role is still
11:14 in demand and actually there is synergy
11:17 happening between AI and the development
11:19 and the increase in adoption and the SRE
11:21 role AI platforms also need to be
11:23 reliable which means they also need SREs
11:26 take a look at this SRE role here you
11:27 know where that's from OpenAI the
11:29 creators of ChatGPT one of the most
11:31 popular chatbots that are in existence
11:34 they need SREs they need reliability
11:36 engineers or this one here that is from
11:38 Anthropic another AI-first company who
11:41 also need SREs so you can actually meld
11:44 or like start to join your interest in
11:45 AI and machine learning if that's
11:47 something that is of interest to you to
11:49 your career like it doesn't have to be
11:50 an either or you don't have to panic
11:52 well not yet anyway about AI coming to
11:54 take your job despite the turbulence of
11:56 2023 it's an exciting time in Tech and
11:58 it's still an exciting time to be an SRE
12:00 and transitioning into the role I will
12:02 go more in depth about the application
12:04 process even things like interviews and
12:05 how to prep in another video but for now
12:08 thank you for watching and I will see
12:09 you in the next one