2022 was a breakout year for AI; in fact, many have claimed that ChatGPT is the fastest-growing app of all time. With so much opportunity on the table, AI is the topic of conversation in every boardroom as CEOs figure out how best to integrate this new superpower. But they're also asking really important questions around data privacy, competition, cost, and accuracy, and trying to do all of this really quickly. Because just like your customers really don't care whether your product is built with Angular or React, or runs on AWS or Heroku, there will be a whole host of ways that companies differentiate as they look to cleverly embed AI. Here is how Sourcegraph is thinking about that.

Sourcegraph today is a kind of general-purpose source code understanding engine.

As a reminder, the content here is for informational purposes only, should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any a16z fund. For more details, please see a16z.com/disclosures.

Our first major push is an editor extension called Cody. Essentially, it's a chat-based interface, but it also lets you search for things and context in the code. The idea is that we wanted something in our editors that took full advantage of the power of language models, but that also addressed a lot of the challenges people have encountered with large language models, namely the tendency to hallucinate facts when they don't really know the answer. That's a place where we thought we could be uniquely positioned: Sourcegraph, with all the pieces of context we have around searching for code, finding references, and verifying that things actually exist, is kind of the perfect fact-checker, if you will, for the language model, and the perfect relevant-context provider to the language model.
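A minimal sketch of that fact-checking idea, with a toy code index standing in for a real code search backend. The index contents and the regex-based extraction are illustrative assumptions, not Sourcegraph's implementation:

```python
# Sketch: after the model answers, verify that code identifiers it
# mentions actually exist in the indexed codebase. The index here is a
# toy stand-in for a real code search / references backend.
import re

CODE_INDEX = {"parse_config", "load_yaml", "render_chart"}

def check_identifiers(answer: str) -> dict:
    """Map each function-like identifier in the answer to whether it exists."""
    mentioned = set(re.findall(r"\b(\w+)\s*\(", answer))
    return {name: name in CODE_INDEX for name in sorted(mentioned)}

answer = "Call parse_config() first, then pass the result to render_chart()."
print(check_identifiers(answer))
# {'parse_config': True, 'render_chart': True}
```

Anything flagged `False` is a candidate hallucination that the UI can surface before the user trusts the answer.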
There are a lot of tools out there that help you build code using these language models. To throw out a couple: a lot of people are familiar with GitHub Copilot, and a lot of people are familiar with what Replit is doing with Ghostwriter. But maybe you could speak to this idea of fetching the right information. How would something like Copilot do that?

Currently they don't do Q&A; it's purely autocomplete-driven, and the context they fetch to do that autocompletion is basically the recent files you've opened in your editor. So it's this very local context, which works amazingly well. Huge credit to that team; they've built an awesome user experience. We think the next evolution of that is providing more relevant context, essentially emulating what a human does when writing code. You might go back through some recent history in your editor to see how a piece of code works, and use that as a pattern-matching reference point for the thing you're currently writing. But more often than not, you're doing things like go-to-definition and find-references: "let me see a couple of examples of how to use this particular API I just imported." I think that's going to lead to much better results, and also to much more introspectable results. So we get beyond the "oh, LLMs are magic, how do they work, is it AGI?" framing: Cody will actually tell you, "hey, I read these files, and these are the files I'm using to generate an answer." And if it returns something completely wrong, you can usually tell by looking at the context it read: "why are you reading that file, Cody? That's dumb." You can thumbs-down that, and we'll take it as a reference point to improve the product later.
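The retrieval idea above, ranking candidate context before handing it to the model, is commonly done with embedding similarity. A minimal sketch with toy hand-written vectors; a real system would compute these with an embeddings model over the codebase:

```python
# Sketch: rank candidate code snippets by embedding similarity so the
# most relevant ones can be passed to a language model as context.
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_context(query_vec, snippets, k=2):
    """Return the k snippet texts whose embeddings are closest to the query."""
    ranked = sorted(snippets, key=lambda s: cosine(query_vec, s["vec"]), reverse=True)
    return [s["text"] for s in ranked[:k]]

# Toy index: in a real system the vectors come from an embeddings model.
snippets = [
    {"text": "def parse_config(path): ...", "vec": [0.9, 0.1, 0.0]},
    {"text": "def render_chart(data): ...", "vec": [0.0, 0.2, 0.9]},
    {"text": "def load_yaml(path): ...", "vec": [0.8, 0.3, 0.1]},
]

# Query vector for something like "how do I read the config file?"
print(top_context([1.0, 0.2, 0.0], snippets))
# ['def parse_config(path): ...', 'def load_yaml(path): ...']
```

The selected snippets are exactly what a tool can show the user afterward ("these are the files I read"), which is what makes the result introspectable.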
A lot of these companies that are integrating AI are building off of just a few models, right? A lot of people are familiar with OpenAI's API that came out recently. But there's also this very interesting dynamic where companies that may even consider themselves competitors are using similar models. How do you think about that? And there's a layered question here as it relates to security and privacy, because depending on the company you are, your code is potentially anywhere from somewhat to extremely proprietary, right? Say, if you're a self-driving car company.

It's especially pertinent to us, because we have a lot of enterprise customers that are very security- and privacy-sensitive, to the point where one of the reasons we made Sourcegraph self-hostable is that we wanted companies that didn't want to put their code bases in the cloud to still have awesome code understanding. The space is evolving fast, and our mentality is: look, we have a wide range of customers, from very conservative large enterprises to fast-moving startups, with different risk and security profiles. The language model in our overall architecture is just one component, so we want to make it possible to bring your own language model to the table.

So you're basically saying that you give them the selection, or the option? Am I understanding that correctly?

We'll give you the option. Right now you can use Claude, which is Anthropic's flagship model, or ChatGPT, which is the OpenAI model, and we're looking to integrate additional models too. There are also different models that we plug in at different pieces of the product: the chat-based models, which are often the largest ones, but also things like the embeddings model. Our mentality is that the language model aspect of this is something we want to make as pluggable as possible.
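One way to read "pluggable": put every model behind one small interface and register providers by name, so swapping models is a configuration change. A hypothetical sketch of that seam; the class and provider names are illustrative, not Sourcegraph's actual API:

```python
# Sketch of a "bring your own language model" seam: the rest of the
# system codes against one interface, and each provider (Claude,
# ChatGPT, a self-hosted model, ...) is a plug-in behind it.
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Interface the rest of the system depends on."""
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class EchoProvider(LLMProvider):
    """Stand-in provider for testing; a real one would call a model API."""
    def complete(self, prompt: str) -> str:
        return f"[echo] {prompt}"

# Registry: choosing a model becomes a config value, not a code change.
PROVIDERS = {"echo": EchoProvider}

def get_provider(name: str) -> LLMProvider:
    return PROVIDERS[name]()

print(get_provider("echo").complete("explain this function"))
# [echo] explain this function
```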
That's amazing, because something that also ties into this is cost, right? Each of these models has a different cost. I think a couple of weeks ago Bing 5x'd their pricing overnight, right? So you have a dependency there, both for Sourcegraph itself and one that ends up filtering down to your customers. And every one of these models, and I think we're still in the early innings and there are going to be so many more developed, will, to your point, have a different security posture, a different pricing scheme, and probably a range in terms of its efficacy or specialty. Yeah, it never dawned on me that you could give access across the board to all these models, but also relay the transparent pros and cons to the customer base.
That's exactly how we're thinking about it. For us, there's so much innovation happening in that space that we don't want to be tied to any one provider. A lot of the value we can provide is really about combining the language model with the pieces of context and the structured understanding of code that we have. And it's funny that you mention the Bing price hike; I thought that was a big proof point. When ChatGPT came out, a lot of people said, "hey, this kind of replaces search engines, right? I could just chat with this thing and it would tell me the answer, instead of me having to click through a bunch of different results and figure out the answer myself." But then, as people started to use language models a bit more, they ran into more hallucinations. I think it was the release of Bing where people finally realized it: Bing integrated ChatGPT, or GPT-4, one of those awesome OpenAI models, but they didn't just ship a white-label ChatGPT; they combined it with Bing search on the back end. The language model is sort of the reasoning engine, but you still need an information retrieval engine to make it truly powerful, and it's the two working in unison that is really valuable. Maybe, and I'm speculating here, that had something to do with the Bing price hike. It is not true that language models make search engines unnecessary. If anything, they make search engines more valuable, because all that data you can search becomes ten times more powerful: you can use it to get to your answer with one-tenth the effort, or in one-tenth of the time.
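The "reasoning engine plus retrieval engine" pattern described above can be sketched in a few lines. The keyword search and prompt format here are toy stand-ins for a real search backend and model API:

```python
# Sketch: fetch documents first, then ground the model's prompt in them,
# rather than asking the model cold and hoping it doesn't hallucinate.

def search(query, corpus):
    """Toy keyword search: return documents sharing any word with the query."""
    words = set(query.lower().split())
    return [doc for doc in corpus if words & set(doc.lower().split())]

def build_grounded_prompt(query, corpus):
    """Build a prompt that instructs the model to answer from retrieved sources."""
    hits = search(query, corpus)
    context = "\n".join(f"- {doc}" for doc in hits)
    return (
        "Answer using only the sources below; say if they are insufficient.\n"
        f"Sources:\n{context}\n"
        f"Question: {query}"
    )

corpus = [
    "Bing combines an OpenAI model with live web search results.",
    "Embeddings map text to vectors for similarity lookup.",
]
print(build_grounded_prompt("how does Bing use search results", corpus))
```

The retrieval step is also why the search index stays valuable: the better the hits, the better grounded the model's answer.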
If you liked this segment, you're going to love our next video on thinking critically about UI and how to personalize new AI features. You're not going to want to miss it. And if you like this topic, we go a lot deeper on the a16z Podcast, which you can find on Apple, Spotify, or wherever you get your podcasts.