00:02 thank you all so much for being here
00:04 this is the second event in our new
00:08 Thursday Nights in AI series; this event is
00:11 co-hosted by Outset Capital and
00:13 Generally Intelligent. Outset Capital is
00:16 an early-stage fund run by me, Josh, and
00:19 our third partner Kanjun.
00:21 we invest in pre-seed and seed companies
00:23 we're backed by the founders of Dropbox
00:25 Quora, Replit, and more, and are actively
00:28 investing so if you're building or you
00:30 know someone who's building please come
00:31 say hi. I am full-time on the fund; Kanjun
00:34 and Josh are not. They are also
00:37 building generally intelligent generally
00:40 intelligent is a research company and we
00:41 focus on making more generally capable
00:44 robust, safer agent systems that act in
00:47 kind of digital environments like on
00:48 your browser and your code editor that
00:50 sort of stuff so we do research to
00:52 basically make, like, imagine AutoGPT but
00:54 working. Tonight's guest is Aravind Srinivas,
00:59 who is the co-founder and CEO of
01:03 Perplexity AI. Perplexity recently raised a Series A led by
01:06 NEA. He's also backed by Nat Friedman,
01:09 Elad Gil, and AI luminaries like Jeff
01:12 Dean, Yann LeCun, and Andrej Karpathy.
01:16 perplexity AI is the world's first
01:18 generally available conversational
01:20 answer engine that directly answers
01:23 questions about any topic and we'll dive
01:25 into what that means in this
01:28 conversation. Aravind previously earned a PhD in
01:30 computer science from UC Berkeley and
01:33 worked in AI at OpenAI, Google, and DeepMind.
01:37 thank you so much for being here let's
01:39 just start off with the basics most of
01:41 these people in this room probably know
01:42 already but let's just start off what is
01:45 Perplexity AI and how does it compare
01:48 to what's out there already?
01:49 yeah, so Perplexity is a conversational,
01:54 I would say, answer engine rather
01:56 than a search engine. So what does that
01:58 mean? Since the beginning of time, like,
02:01 the fundamental human need at the top
02:03 of the triangle of human needs is
02:07 the need for information right the
02:10 beginning of our race like we used to
02:12 rely on asking other people and then
02:14 people stored knowledge in the form of
02:15 books and then we had the printing press
02:18 and then we have like libraries and then
02:20 internet, and then organized sources of
02:23 information like Yahoo
02:25 and then actually like algorithmic
02:27 search like Google but still we were
02:30 just consuming links, but at the end of the day
02:33 what we really want is, like, answers and
02:36 getting things done
02:38 so we really need answer Bots and Action
02:40 Bots to just do what we want them to do
02:42 and answer all our deepest questions
02:44 right people wanted to do this forever
02:47 but there's a reason it didn't happen we
02:49 didn't have this amazing technology
02:51 called large language models but then
02:54 the world changed December last year
02:56 once ChatGPT came out, and one week
02:59 before that the updated GPT-3.5 came
03:02 out, and we figured that, like, combining
03:07 these large language models with tool
03:10 use which is search indexes or databases
03:13 that have all the facts so combining the
03:16 the facts engine like a search engine
03:18 with the reasoning engine like a
03:20 language model helps you build an answer
03:22 engine that can answer all your
03:24 questions can converse with you and let
03:26 you ask dig deeper ask follow-up
03:29 questions and share all this knowledge
03:31 easily with other people so that they
03:32 don't have to ask these questions again
03:34 so that's sort of what we are building
03:36 we started doing this in December last
03:39 year; we launched it a week after
03:41 ChatGPT, and many people gave us no shot at
03:44 succeeding but we are still surviving
03:46 for eight months so
03:48 it's going pretty well; traffic
03:50 is growing, so you should check it out.
03:51 For most searches now it's pretty
03:54 feature-complete with, like, whatever you
03:56 get on Google, in the sense that even if you're
03:57 not interested in an LLM-generated answer
03:59 and just want to get to the link quickly,
04:01 the relevance from LLM ranking is a
04:05 lot better than what you get from Google,
04:07 which is full of SEO and ads so there
04:10 are a lot of people who just use it even
04:12 as a traditional search engine and a
04:15 large number of people use it for
04:16 getting answers so that's where we are
04:18 today and we want to continue going on
04:21 this journey to make all of us use
04:23 answer engines and stop using search
04:24 engines we'll talk more about this
04:26 answer engine I like that concept but
04:28 just to take a step back like I
04:30 mentioned you were a PhD student at
04:32 Berkeley, and then were working in AI at
04:34 Google, DeepMind, and OpenAI, and then
04:37 decided to launch Perplexity at perhaps,
04:39 like, the craziest moment, one of the
04:42 craziest moments; I guess every
04:43 subsequent moment has been crazier in AI.
04:45 And, like, why did you do that? What was
04:47 the journey there?
04:50 so I came from India here six years ago
04:53 and I didn't have any interest in
04:55 startups I just came to Berkeley for a
04:57 PhD in AI; deep RL was my topic
05:00 at the time. It was actually the
05:02 equivalent of LLMs back then, when
05:04 everyone was pretty crazy about it but
05:07 didn't have real product impact and then
05:09 you know there was this TV show Silicon
05:11 Valley I'm sure all of you have seen it
05:13 so I also saw that. Compression was, like, an
05:18 aspect of it, right: lossless
05:21 compression, how can you improve it? So you
05:24 work on generative models; that's the
05:26 ultimate thing: if you model the log
05:28 probabilities of every next thing that
05:30 you're going to predict and you feed
05:31 that into Huffman encoding, you have, like,
05:33 lossless compression, but much better than
05:36 JPEG.
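To make that concrete, here is a minimal sketch of the idea (an illustration of model-based entropy coding in general, not DeepMind's or Perplexity's code): a hypothetical toy next-character model stands in for an LLM, and its predicted probabilities drive a per-step Huffman code, so the better the model, the shorter the emitted bitstream.

```python
# Sketch only: model-predicted probabilities feeding an entropy coder.
# `toy_model` is a made-up stand-in for a real generative model.
import heapq
from itertools import count

def huffman_code(probs):
    """Build a Huffman code {symbol: bitstring} from a {symbol: probability} dict."""
    tiebreak = count()
    heap = [(p, next(tiebreak), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:                      # degenerate case: only one possible symbol
        return {s: "0" for s in probs}
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

def toy_model(context):
    """Hypothetical next-character distribution; a real system would query an LLM."""
    if context.endswith("q"):
        return {"u": 0.9, "a": 0.05, "e": 0.05}
    return {"a": 0.4, "b": 0.3, "q": 0.2, "u": 0.1}

def compress(text):
    bits = ""
    for i, ch in enumerate(text):
        code = huffman_code(toy_model(text[:i]))
        bits += code[ch]                    # sharper predictions give shorter codes
    return bits

print(compress("aqubab"))                   # lossless: decode by replaying the model
```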
05:38 We started working on generative models, myself and one of my colleagues
05:41 at Berkeley; he was the one who
05:42 invented the working version of that,
05:44 Jonathan Ho. We even taught a class on
05:46 it at Berkeley; we didn't call it
05:48 generative AI though, we just called it
05:50 deep unsupervised generative modeling. So
05:53 I learned a lot about it, learned
05:54 about Transformers, worked on, like, many
05:57 internships at DeepMind and Brain. There
05:59 was no way to convert all this into a
06:01 startup because the hardware for
06:03 compression wasn't there; like, you have
06:04 to make it work on-device, right. So I
06:06 kind of gave up on that idea and when I
06:09 was at deepmind I was
06:11 mostly in the office as interns are
06:14 supposed to be I would go to the library
06:16 and they had a lot of books and some of
06:18 their books were about the early days of
06:20 Google, like How Google Works or In
06:22 the Plex and things like that, and
06:24 obviously I took it and read it while my
06:26 jobs were running on the cluster that
06:29 that story resonated a lot with me
06:30 because I think in Silicon Valley you're
06:32 sort of typically romanticizing the
06:34 idea of college dropouts and undergrads
06:36 starting companies and becoming the next
06:38 Zuckerberg, right, or Gates or Jobs. But for
06:42 me it was like, oh, there are, like, PhD people
06:44 who start companies; they don't have to
06:45 be dropouts, they can be, like, academics
06:47 and entrepreneurs, and Larry and
06:50 Sergey were the people who really inspired me.
06:53 So when I was at DeepMind I would go and
06:55 ask the manager of my manager, Oriol
06:58 Vinyals, like, he's now the head of
06:59 their Gemini team: what is the PageRank
07:02 of, you know, 2019, like what is the
07:05 equivalent to that? And he would just say,
07:07 I don't know, but it's very likely to be the Transformer.
07:09 And, like, it was kind of correct; it's the
07:13 test-of-time paper in AI right now. So I
07:17 started working a lot on Transformers in
07:19 Google Brain with the guy who invented it,
07:21 wrote a lot of papers, got a sense of,
07:23 like, things just really working,
07:25 and then went to OpenAI to do more
07:27 research but clearly the times had
07:30 changed I would always keep hearing
07:31 things like oh you know what there's
07:33 this company called Jasper or Copy.ai; they
07:35 make a ton of Revenue and then the real
07:37 changing moment was when GitHub co-pilot
07:40 turned on the monetization switch
07:41 hundreds of thousands of people paid on
07:44 day Zero double digit million ARR and
07:46 like it's the first day
07:47 that just shows it's like a real thing
07:49 and clearly added a lot of value to people
07:52 around me. I reached out to a few people
07:54 like Elad Gil and Nat Friedman, told
07:56 them I wanted to start a company. I
07:58 didn't know anything; in fact the first
07:59 idea I proposed to Elad Gil was I wanted
08:02 to disrupt Google, but from pixels, because
08:04 they cannot be disrupted from text, so I
08:07 wanted to do it on glasses, and there's
08:09 this model called Flamingo from DeepMind
08:11 that works so we just need to ship it
08:13 and he was like, this is really, like, a
08:15 cool demo thing, but you're not going
08:18 to make it work, the hardware is not
08:19 there, it's very hard to do distribution;
08:21 he told me all the rational things any
08:24 investor tells an enthusiastic founder,
08:26 right but the idea of search just kept
08:29 coming back and back. Like, we tried
08:31 text-to-SQL, we tried a lot of other database
08:33 search, but all of our core founding team
08:36 was just so motivated by search that the
08:39 inspiration from Larry and Sergey, or, like,
08:41 me wanting to do search all the time, it
08:43 just somehow flowed into the product.
08:47 I think a lot of people say this like
08:48 listen to your inner voice whatever you
08:50 ultimately obsess about that's what
08:52 you'll be able to put all your hard work
08:54 into. Other things, like "do what the
08:56 customer wants" or, like, "go talk to people
08:58 and build something customers want";
09:00 you need to first be motivated by
09:02 what is the problem you deeply care
09:04 about, like, you have to work on it. So
09:05 that somehow ended up being the case for
09:07 us and that became perplexity how does
09:10 perplexity actually work under the hood
09:13 yeah so perplexity is basically a
09:16 combination of a traditional search
09:19 index and the reasoning power and text
09:22 transformation capabilities of large
09:24 language models put together so every
09:27 time you enter a query in perplexity we
09:29 understand your query we reformulate it
09:31 and we send it to a search engine that
09:35 is very traditional; multiple search
09:36 indexes, not just ours but external indexes too,
09:39 pull up the relevant links, and, like, lots
09:42 of links sometimes even hundreds of
09:44 links and then we basically task the llm
09:47 with saying hey you know read all these
09:49 links and pull up the relevant
09:50 paragraphs from each of these and use
09:53 those paragraphs to answer the user's
09:56 query in a very concise way and
10:00 in your answer write it like an academic
10:02 or a journalist would write it
10:04 that is make sure you always have
10:07 supporting citations supporting links
10:09 every part of your answer should have a
10:11 citation to it and this all flows from
10:13 our background like we were academics
10:15 when we write papers we always have
10:17 citations at the end of every sentence
10:19 to make sure that we only say what is
10:21 truthful, right, like a "FactGPT" or a
10:24 "TruthGPT," basically. And then that ends up becoming
10:28 the answer; the LLM does the magic at
10:30 the end and we make it conversational
10:32 remember the context of previous
10:33 questions so that you can referentially
10:37 ask more questions on top of what you
10:39 already asked we also make the process
10:41 of asking more questions easier by
10:43 suggesting follow-ups to you, which is
10:45 also another LLM that generates these, so
10:48 that way the whole process of
10:49 discovering more information becomes fun
10:53 and you can get into these rabbit
10:56 holes, which is, again, you know, an
10:57 inspiration we took from Wikipedia,
10:59 of, like, asking more and more things by
11:01 clicking on more links.
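As a rough sketch of the pipeline just described (assuming hypothetical placeholder functions `web_search` and `call_llm`, not Perplexity's internal APIs), the flow is: reformulate the query, retrieve links, then have the LLM answer concisely with a citation on every claim.

```python
# Sketch of an answer-engine loop; every component here is a stand-in.
from typing import Callable

def answer_engine(query: str,
                  chat_history: list[str],
                  web_search: Callable[[str], list[dict]],
                  call_llm: Callable[[str], str]) -> str:
    # 1. Reformulate the query using the conversational context.
    reformulated = call_llm(
        "Rewrite the latest question as a standalone search query.\n"
        f"History: {chat_history}\nQuestion: {query}"
    )
    # 2. Pull relevant links (possibly hundreds) from one or more search indexes.
    results = web_search(reformulated)      # each result: {"url": ..., "text": ...}
    sources = "\n".join(f"[{i + 1}] {r['url']}\n{r['text']}"
                        for i, r in enumerate(results))
    # 3. Answer like an academic or journalist: concise, and every sentence
    #    must carry a citation to one of the numbered sources.
    return call_llm(
        "Answer the question using ONLY the sources below. Be concise and add a "
        "citation like [1] after every sentence. If the sources do not contain "
        f"the answer, say so.\n\nSources:\n{sources}\n\nQuestion: {query}"
    )
```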
11:03 so when this started was this just like
11:05 a call out to, you know, OpenAI's API, and
11:09 that's it is that still the way it works
11:10 are there a bunch of specialized models
11:12 like how is it changed over time we
11:14 initially started off with GPT-3.5, I
11:17 think it was still called text-davinci-003,
11:19 which came a week before ChatGPT was
11:22 released, and we already had everything
11:24 in place, so we noticed a massive
11:27 improvement in quality from that. And we
11:29 also used the Bing API. So what we built
11:32 for the first time, the first rollout, was
11:34 just two API calls, but now it's a lot
11:37 more sophisticated and that does like
11:38 many different algorithms including ours
11:40 many different indexes including ours
11:42 it's kind of like playing the orchestra
11:44 like there are so many tools so many
11:47 moving parts, and your job is to
11:49 play the orchestra where you deliver a
11:52 lot of value on top reliably at scale
11:55 lots of queries per second and making
11:58 sure the latency is really good
12:00 how do you think about building on top
12:02 of other apis whether it's open AI or
12:04 I'm sure Google have their own kind of
12:06 commercial API Etc how do you think
12:08 about kind of having your own business
12:09 on top of that like is that defensible
12:11 like, what do you say to the people that
12:12 are, you know, sort of haters, on the "oh
12:14 it's, like, just a wrapper around OpenAI"?
12:17 I think if it's just a wrapper, many
12:20 people will be able to build it, really.
12:24 If it's just a wrapper,
12:26 it'll be hard to scale it to this level
12:29 of traffic and usage,
12:31 and reliability latency all that
12:34 requires hardcore engineering on the backend.
12:37 But long-term defensibility is only
12:40 possible if, like, either you
12:42 have, like, so many users, like the product,
12:45 users just love it and they don't care
12:47 what you use under the hood, and so you've
12:49 got the user love, the network effects, and
12:51 the stickiness, retention is all good;
12:54 that's hard to buy. Like, once you have that,
12:57 no matter whether you have your own models or not,
13:00 it's very hard to, like, lose from there.
13:02 but in terms of asset class that you
13:04 want to own in your company obviously it
13:06 makes sense to invest in your own models
13:08 invest in your own search index I mean I
13:11 would even go to the extent, like, these
13:12 days people make fun of LangChain wrapper
13:14 companies, right, not even OpenAI API
13:16 companies. So fortunately we're not a
13:18 LangChain wrapper, because when we started
13:20 there was no LangChain,
13:23 so we kind of built our own LangChain, I
13:26 guess, basically, right. So I would say
13:28 yeah it makes sense to build your own
13:30 models your own indexes over time but
13:33 there's like two ways of building a
13:35 company one is you roll out a product
13:37 get a lot of users, de-risk the product-
13:40 market-fit phase, get to a sufficient scale,
13:44 and then you start investing in
13:46 infrastructure so you raise the money
13:48 needed for that and you build a company
13:50 out of it the other way is saying like
13:52 I'm gonna first build the infrastructure
13:54 and then I'll build a product later
13:56 So only two companies have, like, I would
14:00 say only one company has done that
14:01 successfully, right, and it's OpenAI.
14:04 Anthropic has built models, but not a product;
14:07 like, nobody uses Claude as a product,
14:09 they use it as an API. But that's
14:12 kind of worth doing if you are
14:13 interested in building infrastructure
14:14 business and maybe a product later out
14:16 of it but more centered around infra
14:19 right that requires you to raise a lot
14:20 of money at a really high valuation
14:22 which is mostly impossible for most
14:25 people and even if it's possible it's
14:27 super risky so we decided to do the
14:31 traditional way of like raising small
14:32 amount of cash, building a product
14:34 without any infrastructure of our own,
14:36 and then later start slowly building it
14:39 That makes a lot of sense. To be clear,
14:42 is the plan to eventually move off of OpenAI,
14:44 or to build your own models to work
14:48 alongside OpenAI's? So right now the
14:51 plan is to build our models to work alongside theirs.
14:55 I expect openai to have the best models
14:58 for at least two to three more years
15:00 Nobody Knows the future after that like
15:02 it could be some other company or
15:04 it could be us nobody knows right so
15:08 I'm just willing to be pragmatic
15:11 here obviously look if you ask anybody
15:13 in this room would they want to be the
15:15 owner of GPT-5, they'll say yes, right? So
15:18 I'm also going to say yes I would love
15:19 to have our own model that's as capable
15:21 as the next LLM being built by
15:23 OpenAI. But what is practically feasible
15:26 today is, like, we probably can get to 3.5,
15:28 but we probably can never get to 4
15:30 with the funding we have, and definitely
15:33 not 5, right. So we are happy to work
15:35 alongside their APIs. I love your
15:38 approach, it's just very pragmatic, not as
15:40 sexy or exciting sometimes as people who
15:42 are raising, like, 100 million to just
15:44 build models and start there, but it just
15:46 strikes me as very pragmatic, the whole approach.
15:49 I'm not making fun of that approach; in
15:52 fact I'll just be more direct and say
15:54 I'm not bold enough to do that,
15:56 because if you want to raise 100 million your
15:57 valuation should be at least, like, 500,
16:00 or, like, a billion maybe for some people.
16:03 After that, what if you never build
16:06 a model as good as open AI or like what
16:07 if next day they announce their apis 10x
16:10 cheaper actually they did do that
16:12 then what happens to you right and then
16:14 what if Nvidia comes with a completely
16:16 different GPU in a few months and you
16:19 invested all your cash into building a
16:21 cluster out of the old generation
16:23 there's so many problems to think about
16:24 when you deal with capital of that size
16:26 and as a first time founder I don't have
16:29 the guts to do all that. You have,
16:35 since, as you kind of alluded to, the odds
16:36 were against you, everyone was like, well,
16:37 you obviously can't do this, and you're
16:39 like, well, let's see. You
16:40 posted on LinkedIn I think yesterday
16:42 with some interesting stats comparing
16:44 Perplexity to Bard and ChatGPT.
16:47 so you cited perplexity has 0.7 million
16:50 so 700k visits per day Bard has 4.6
16:55 million, ChatGPT has 54 million, so ChatGPT
17:00 is by far the dominant product, right?
17:03 But you asserted that
17:05 ChatGPT might be the dominant product,
17:06 but Perplexity is the best product: when
17:09 you look at visit duration
17:11 pages per visit and bounce rate
17:14 Perplexity is the clear winner above both.
17:19 so those are just very impressive stats
17:20 and I'm curious how do you think you've
17:22 achieved this obviously there are many
17:24 companies following in your footsteps
17:26 not to mention Bard and ChatGPT. What
17:29 is it about your product that has led to
17:31 these great stats about user love
17:35 a lot of credit goes to our team for
17:37 that we have really good engineers and
17:39 really good product designer also one of
17:42 the most appreciated aspects of the
17:43 product is it's very clean and simple
17:46 So why did we get these statistics?
17:48 I'd say the number one reason is we only
17:50 focused on this one thing we are doing
17:52 which is an answer engine with
17:54 supporting citations, nothing else. There
17:57 are a lot of decisions we made, like,
17:59 oh, we could have gotten more traffic
18:01 if we supported free-form chat
18:04 instead of just being a productivity
18:07 assistant or research assistant, but we
18:09 didn't do that, because
18:10 that would mean bifurcating the
18:12 product, making it confusing to people:
18:13 getting users for one thing and, like,
18:16 some other users getting frustrated by
18:18 lack of reliability on another thing.
18:21 So what really helped us was being clear,
18:23 simple, and only doing one thing at a
18:26 time. ChatGPT has so many other
18:28 things going on that they could lose on
18:31 one plug-in or one particular thing
18:34 to a company that's super focused on
18:36 nailing that, right, and that's what
18:38 happened with their browsing plugin,
18:40 which used Bing.
18:42 as for Bard I think it's still improving
18:45 since they rolled it out, and I think
18:48 they are trying to go after ChatGPT rather
18:49 than, like, trying to kind of create a new
18:51 search experience there. So they
18:54 hallucinate a lot and, like, they don't
18:55 say the right things and some of the
18:57 links are not real links, the same
19:00 problems you have with ChatGPT. So if you
19:01 kind of go after ChatGPT you end up
19:03 with the same problems it has, right. And
19:05 another reason they are disadvantaged is
19:07 because if Bard is basically replacing
19:10 Google, it's not good for Google,
19:12 so they might not invest as many
19:14 resources into Bard as they would for search.
19:18 just want to ask about that because
19:20 that's kind of every founder's, every
19:21 builder's question: how do you prevent or
19:23 cut down on hallucinations how have you
19:25 approached that issue
19:27 Yeah, like I said, the core tenet of the
19:29 product is only say what you can cite
19:32 that's also the principle in Academia or
19:35 journalism like you need to have sources
19:37 so if you're only going to pull up
19:39 content from a link or a web page and
19:42 only use that content for writing the answer,
19:44 you can reduce hallucinations a lot.
19:46 Despite that, there's still some
19:49 misunderstanding by the LLM, where, let's
19:54 say you're searching for Ali Rohde and there's
19:56 some other person of that name, it might
19:58 combine the two together into one person,
20:01 and, like, some people get offended by it,
20:03 some people are entertained by it. So we
20:06 have worked hard on, like, this
20:07 disambiguation. You know, there's a long tail
20:09 of cases where it goes wrong; that can be
20:11 addressed with a better LLM; for example
20:13 we've noticed GPT-4 hardly makes any mistakes.
20:16 as long as you can decouple the facts
20:19 from reasoning with this retrieval
20:21 augmented generation paradigm, some
20:23 people call it RAG, it should be
20:25 possible to address this over time
20:27 and as long as you can parse web pages
20:31 the snippeting logic is better the
20:33 embeddings are better all these things
20:35 can over time just be reduced to like
20:38 practically zero right
20:40 No one's going to be angry if one in a
20:42 thousand queries
20:44 factually, like, has some problem.
20:47 And at least we've been tracking
20:48 metrics, and we're realizing that we're
20:51 continually improving here.
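One way to operationalize "only say what you can cite" (a hedged sketch of a post-processing check, not necessarily what Perplexity ships) is to flag any answer sentence that carries no citation marker before it reaches the user:

```python
# Sketch: flag sentences in a generated answer that lack a [n] citation.
import re

def uncited_sentences(answer: str) -> list[str]:
    """Return the sentences of an LLM answer that have no [n] citation marker."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    return [s for s in sentences if not re.search(r"\[\d+\]", s)]

answer = "Perplexity launched in December 2022 [1]. It is an answer engine."
print(uncited_sentences(answer))   # -> ['It is an answer engine.']
```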
20:53 last question and then we'll open it up
20:54 so get your questions ready. But just
20:56 kind of talking about the psychology of
20:58 building this being a Founder is just
21:01 always incredibly difficult and that's
21:03 just in normal times
21:05 right now the ground is moving so fast
21:08 beneath you open AI is moving so fast
21:10 Google is moving so fast other companies
21:13 are moving so fast and then every other
21:14 day you have a 100 million or billion
21:17 fundraising announcement new people
21:19 coming out of stealth with just huge
21:21 piles of money and I'm curious how you
21:23 kind of deal with that of like just all
21:26 the challenges of building something new
21:27 plus being in this world where like you
21:29 kind of have to be on Twitter you have
21:31 to be monitoring what's happening
21:32 because things are moving so fast
21:36 I think it's always fun to prove
21:39 the world wrong, right? Like, there's
21:41 nothing better than that. Peter
21:44 Thiel's, like, Zero to One book is based
21:47 on this: what does the rest of the world
21:48 think, what do you think, and is it at the
21:51 intersection of what is right?
21:53 And if that is the case, then you'll end
21:55 up being incredibly successful. Regarding funding rounds,
22:00 if having more capital
22:02 lets them build what you're trying to build,
22:06 then you are obviously at a
22:08 disadvantage. So for example, if your
22:10 company is about building GPT-4 and you
22:13 have 10 million dollars in funding and
22:14 somebody else has 500 million dollars in
22:15 funding, they're, like, gonna win.
22:18 If the company is about, you know,
22:20 taking Llama and building, like, a really
22:24 great assistant out of that,
22:29 then having 500 million dollars in funding
22:31 doesn't necessarily make them win; it
22:32 might in fact make them lose,
22:35 they have way more capital and
22:37 they'll get distracted
22:39 and hire a lot of people and, like, throw
22:42 a lot of cash at things that don't need it,
22:45 and you have much more advantage because
22:48 you're lean and you're basically hungry
22:49 and you need to win right
22:51 Scarcity cannot be faked. Like, the one
22:54 who has more at stake, basically the one
22:57 who has, like, so much more to gain from
23:00 winning, eventually wins.
23:05 yeah I wouldn't get so distracted by
23:07 funding rounds for example the
23:09 inflection funding round doesn't change
23:11 any of our destinies at all;
23:14 that's more for open AI to worry about
23:19 For products that are competing with you
23:21 in the same space, there you clearly need
23:23 to be competitive, no question about
23:25 that. It's good to just focus on your own
23:28 Journey have high sense of urgency
23:31 And Nat Friedman has this thing on his
23:33 website, like a bunch of bullet points,
23:35 which I really like. Some of them I
23:37 remember and can share: one is, get
23:40 your dopamine from making things happen.
23:42 I really subscribe to that. A lot of it
23:45 aligns with what Mark Zuckerberg says:
23:47 done is better than perfect,
23:48 always iterate, don't wait for perfection,
23:51 get user feedback every week. In fact,
23:54 like, at the starting of the company, Nat
23:55 told us, like, every Friday you should
23:57 basically be discussing what your users
23:59 are saying about your product
24:01 and if there's nothing new there it
24:03 means that week was a failure
24:05 So we took all this advice pretty
24:07 seriously and we still work at that
24:09 pace actually it's gone a little bit
24:11 slower because we have a product already
24:12 and we can't keep shipping more and more
24:14 because that confuses the user but we
24:17 still try our best to:
24:19 every Friday we discuss in an all-hands,
24:22 like, what people are saying about the
24:24 product what we can improve
24:26 I love it well thank you for sharing
24:28 your journey with us and with that I
24:30 will open it up to questions Natalia
24:32 will run around with the mic, so please
24:34 raise your hands we'll start over here
24:36 second row we'll just pass down the mic
24:39 hi so you said earlier that you were
24:42 building a search index are you using a
24:44 retrieval augmented generation approach
24:46 and if yes, how often do you update your
24:49 index or vector store index, and how do
24:52 you manage to scale it so
24:54 that your, I guess, your data store is up to date?
24:58 Yeah, so I don't remember off the top of my
25:00 head what the exact periodicity is, but
25:02 it's pretty frequent; like, it at least
25:04 happens every few hours. It is using
25:08 retrieval-augmented generation, so the
25:11 necessary elements for this are, like,
25:13 good embeddings and, like, good logic
25:16 around like re-scraping and things like
25:17 that and that is separate from actually
25:20 the llm so for the llm and this
25:23 retrieval augmented generation are you
25:25 still using the OpenAI API? We still use the
25:28 OpenAI API and we use some of our models,
25:32 and we expect that to be the case over
25:34 time like in the sense
25:36 if it has, like, a sliding bar between our
25:39 models and OpenAI models, and, like,
25:40 there's, like, a convex combination
25:42 between the two, we expect the sliding bar to
25:46 shift more towards our models over time,
25:49 but continue to have, like, a non-zero usage of OpenAI.
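As a toy illustration of that sliding bar (an assumption about how such routing could look, not Perplexity's serving code), you can treat it as a convex combination over which model handles each request:

```python
# Sketch: route a fraction of traffic to in-house models, the rest to OpenAI.
import random

def route_query(query: str, in_house_weight: float, in_house_model, openai_model) -> str:
    """Serve the query from the in-house model with probability in_house_weight."""
    assert 0.0 <= in_house_weight <= 1.0
    model = in_house_model if random.random() < in_house_weight else openai_model
    return model(query)

# Usage idea: start with a small in-house share and slide the bar up over time,
# e.g. route_query(q, 0.2, my_model, openai_call) with hypothetical callables.
```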
25:51 What metric are you
25:54 using to measure the relative
25:57 performance, to compare the retriever
25:59 performance between the two? Yeah, so we're
26:01 actually just setting all these things
26:03 up right now; we have the AI
26:05 quality dashboards, and we're working on
26:08 that with contractors.
26:10 And the thing about this end-to-end
26:12 system is you need to track basically
26:14 whether the answer was correct or not;
26:16 that's the most important thing. It's
26:17 hard to have a metric for just the retrieval part.
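A hedged sketch of what that end-to-end tracking could look like (a common LLM-as-judge pattern, with `call_llm` as a hypothetical placeholder): grade logged (query, sources, answer) triples for correctness rather than scoring the retriever in isolation.

```python
# Sketch: end-to-end answer grading over logged production examples.
def grade_answer(query: str, sources: str, answer: str, call_llm) -> bool:
    verdict = call_llm(
        "You are grading a search answer. Given the question, the retrieved "
        "sources, and the answer, reply with exactly CORRECT or INCORRECT, "
        "judging only whether the answer is supported by the sources.\n\n"
        f"Question: {query}\nSources: {sources}\nAnswer: {answer}"
    )
    return verdict.strip().upper().startswith("CORRECT")

def answer_accuracy(logged_examples: list[dict], call_llm) -> float:
    graded = [grade_answer(e["query"], e["sources"], e["answer"], call_llm)
              for e in logged_examples]
    return sum(graded) / max(len(graded), 1)
```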
26:21 Part of my team also does LLM research,
26:23 from ETH Zurich, and we
26:26 also ship a product, and whenever we're,
26:28 like, iterating on the product we have a
26:30 meeting, and then at the end we're like,
26:31 oh no, we need to build, like, a test set
26:33 to test out how specifically well our
26:35 retriever is doing, and so far we've never
26:37 gotten to that point; we're, like,
26:38 always intuitive, like, playing around,
26:39 feeling like we're making some changes
26:42 to the pipeline, seeing if it works better.
26:43 How much of that was... did you guys
26:45 actually do specific testing with, like, a
26:47 benchmark, or did you just...
26:50 Intuitively. We tested on prod.
26:55 but I'm like oh we should build
26:57 This is something I learned at OpenAI,
26:59 that benchmarks are, like, really kind
27:02 of overrated. It's, like, because we are
27:05 all, like, indoctrinated to think like
27:07 that from our machine learning research,
27:09 but the moment you can train your
27:12 mind to only test on real users testing
27:15 your product, that's when you actually
27:17 have something reliable and robust, right?
27:19 Otherwise you just start tuning for the benchmark.
27:23 Josh, thoughts on benchmarks? What's your take?
27:27 there's definitely something to that I
27:29 don't know if test on prod is always the
27:30 right answer it certainly seems like a
27:32 good way to go if you have users and you
27:34 can try things out in the real world and
27:36 like in a limited fashion right
27:37 obviously you're not going to roll out
27:38 the whole thing to every production user.
27:40 getting real feedback is the thing that
27:42 matters and you know I think you still
27:44 can use benchmarks probably right and it
27:46 makes sense and probably you'll develop
27:47 like with the AI dashboards, develop some
27:49 of your own internal benchmarks, like,
27:50 oh okay, this is what it means for
27:52 the search quality to be good, but
27:53 like, you do need to be really careful
27:55 that you don't over-optimize for those,
27:56 and I think there's a real danger of that.
28:01 uh my name is Carson I'm an architect I
28:03 wanted to ask you a question about unit
28:05 economics so in my function as an
28:09 architect I often have to talk about
28:12 this perceived impact on cost if you run
28:15 an llm search versus a traditional
28:18 search infrastructure so as you look
28:21 toward the future what you see are the
28:24 major ingredients to bring that cost
28:26 down and to make it more profitable and
28:28 more competitive to where companies like
28:31 Google for example are at the moment
28:32 yeah so one advantage we have as a
28:35 startup is we don't have the user volume
28:37 that Google has, like one or two
28:40 billion people using it every day
28:42 we don't have that right we are like
28:44 orders of magnitude lower than that
28:46 so this cost is not a big deal
28:50 And in terms of cost reduction,
28:52 moving to smaller models,
28:55 models that are trained
28:57 explicitly for retrieval-augmented generation, is
28:59 going to be helpful, and I expect
29:02 Hardware will also become cheaper over
29:03 time there'll be more tricks to make
29:06 inference more efficient, like FlashAttention
29:07 and, like, other techniques,
29:10 FP8, all those things will help.
29:13 and so considering that as well as the
29:16 fact that we don't need to worry about
29:18 Google's user volume for quite a long
29:20 while; I mean, if that were the case today
29:23 we would be a much bigger company.
29:25 All that makes it much more convenient
29:28 for us to do this whole thing and and of
29:30 course building our own index will also
29:31 be a big cost reduction
29:33 because once you index it, it's not a lot
29:36 of cost for retrieval, right?
29:39 hi I'm curious if you've done any
29:41 looking at groups of use cases or user
29:44 personas and how you compare contrast
29:46 those against more traditional like
29:48 we found like a lot of users like using
29:51 it for research usually you say you know
29:53 what did you Google it for stuff that
29:56 you would just get the answer really
29:59 what's the time in London right now like
30:01 those are Google Nails those things but
30:04 for Stuff where you have to actually go
30:06 ask does YC support Founders if you're
30:08 still having their jobs with their
30:10 current companies or are they okay
30:12 investing in their company as long until
30:14 they get the funding
30:15 if you have to find out it's going to be
30:17 like opening a bunch of links and
30:18 reading them these are questions that
30:20 perplexity just answers like so lots of
30:22 these questions that you have in your
30:24 day-to-day life that require you to, like,
30:25 do some amount of research whether it's
30:27 few minutes a few hours
30:28 our product Just Nails it and that's
30:31 where we found a lot of usage and hence
30:33 why I think it's not really like a
30:35 Google competition, even though that's
30:37 what the narrative is; it's very easy to say
30:39 that. It's more like opening a new
30:42 segment for these answer Bots that
30:45 support people to come do their research
30:47 directly so I guess I want to ask about
30:49 your kind of 10-year plan and kind of
30:51 what trends you're seeing on the
30:53 technical and business side so one
30:55 example might be agents and we know they
30:58 really don't work very well right now as
31:00 Josh mentioned do you think users are
31:02 going to expect the Search tool to be
31:03 able to do a lot of that research and
31:05 ask those follow-up questions on their
31:07 own and how specialized do you think a
31:09 tool like that would be with domain
31:11 knowledge versus some general tool like
31:13 Perplexity? For the agent stuff, we're
31:16 already sort of prototyping things there
31:19 so in perplexity if you turn on Copilot
31:24 it's more like an interactive search
31:25 companion rather than just an answer bot
31:29 by that I mean it'll come back and ask
31:31 you clarifying questions and the
31:33 clarifying questions are very specific
31:35 to the original question you asked and
31:37 it generates the UI for the clarifying
31:40 question so that was an idea inspired by
31:42 Auto GPT but not completely autonomous
31:45 it's more like a copilot that works
31:48 and you can imagine this heading towards
31:50 like helping you buy stuff and over time
31:53 like do shopping decisions
31:55 So I certainly see
31:58 possibilities to make this a reliable
32:01 assistant over time and not just an answer bot.
32:06 But we need to really make it work, or
32:08 else we will not ship this, right; if it
32:10 doesn't work for most people it's gonna
32:12 make it a bad experience. As for a
32:15 10-year plan, I think this could happen
32:20 even in, like, three to four years.
32:25 Yeah, beyond five it's really hard to
32:28 predict. GPT-1 came out in 2018, so now
32:30 it's five years from then; would you have
32:33 predicted this moment? That's pretty hard, so
32:34 even five years is hard so 10 years is
32:37 like really really hard
32:39 yeah I had a question about training
32:41 you're talking about building your own
32:43 models and right now it sounds like you
32:45 have a composite of a number of them;
32:50 how will your model be
32:52 better than the best possible
32:54 combination of those search
32:56 sources? You shouldn't train your models just
32:59 for this purpose right that's sort of
33:01 why I'm very skeptical of generally like
33:03 domain specific llms because once you do
33:06 that these models lose the magic of the
33:09 generality of the llm it'll be hard to
33:12 keep it conversational, it'd be hard to;
33:15 all the stuff that makes the product feel
33:17 magical when it's sort of built on top
33:19 of OpenAI's API will all go away once
33:22 you start fine-tuning on your use case
33:24 alone. So how do you preserve the
33:26 generality and yet optimize for your use
33:28 cases is kind of the open question, and
33:32 I think the answer, like, lies in a
33:34 few papers published by Google. There's a
33:36 paper published by Google called Minerva;
33:39 it's a model that's trained to do math,
33:41 answer calculus questions, things like that,
33:45 but they did not just train it on math.
33:47 They took PaLM, and when fine-tuning
33:51 they don't just train on math, they
33:54 also train on regular English; in fact 95%
33:56 of the training is just on English, I
33:59 think maybe, like, five percent, five to
34:00 ten percent, is on, like, math,
34:02 and that ensures that they preserve the
34:05 pre-training data distribution even while fine-tuning.
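A small sketch of that mixing idea (my reading of the discussion, not Google's or Perplexity's training code): when fine-tuning for a narrow use case, keep most of each batch as general text so the model stays close to its pre-training distribution.

```python
# Sketch: sample fine-tuning batches that are mostly general text, a little domain text.
import random

def mixed_batch(general_corpus: list[str], domain_corpus: list[str],
                batch_size: int = 32, domain_fraction: float = 0.05) -> list[str]:
    """Return a batch that is roughly 95% general text and 5% domain-specific text."""
    n_domain = max(1, int(batch_size * domain_fraction))
    batch = random.sample(domain_corpus, n_domain) + \
            random.sample(general_corpus, batch_size - n_domain)
    random.shuffle(batch)
    return batch
```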
34:09 And this is pretty important; not a lot
34:11 of people know this. So when everyone
34:12 complains that they took Llama and
34:14 fine-tuned it on this new thing and then
34:16 it overfit and it's not doing most of
34:18 the general things it was able to do,
34:21 there's a reason for that: like, you're
34:23 drifting too much from the initial
34:25 parameter space. All right, thank you all
34:28 so much, and thank you, Aravind; a quick
34:30 round of applause for Aravind.