00:02 thank you all so much for being here
00:04 this is the second event in our new
00:08 Thursday Nights in AI series; this event is
00:11 co-hosted by Outset Capital and
00:13 Generally Intelligent. Outset Capital is
00:16 an early-stage fund run by me, Josh, and
00:19 our third partner Kanjun.
00:21 we invest in pre-seed and seed companies
00:23 we're backed by the founders of Dropbox
00:25 Quora, Replit, and more, and are actively
00:28 investing so if you're building or you
00:30 know someone who's building please come
00:31 say hi. I am full-time on the fund; Kanjun
00:34 and Josh are not. They are also
00:37 building generally intelligent generally
00:40 intelligent is a research company and we
00:41 focus on making more generally capable
00:44 robust, safer agent systems that act in
00:47 kind of digital environments like on
00:48 your browser and your code editor that
00:50 sort of stuff so we do research to
00:52 basically make, like, imagine AutoGPT but
00:54 working. Tonight's guest is Aravind Srinivas,
00:59 who is the co-founder and CEO of
01:03 Perplexity AI. Perplexity recently raised a Series A led by
01:06 NEA. He's also backed by Nat Friedman,
01:09 Elad Gil, and AI luminaries like Jeff
01:12 Dean, Yann LeCun, and Andrej Karpathy.
01:16 perplexity AI is the world's first
01:18 generally available conversational
01:20 answer engine that directly answers
01:23 questions about any topic and we'll dive
01:25 into what that means in this
01:28 conversation. Aravind previously earned a PhD in
01:30 computer science from UC Berkeley and
01:33 worked in AI at OpenAI, Google, and DeepMind.
01:37 thank you so much for being here let's
01:39 just start off with the basics most of
01:41 these people in this room probably know
01:42 already but let's just start off what is
01:45 Perplexity AI and how does it compare
01:48 to what's out there already?
01:49 yeah, so Perplexity is a conversational,
01:54 I would say, answer engine rather
01:56 than a search engine. So what does that
01:58 mean? Since the beginning of time, like,
02:01 the fundamental human need at the top
02:03 of the triangle of human needs is
02:07 the need for information right the
02:10 beginning of our race like we used to
02:12 rely on asking other people and then
02:14 people stored knowledge in the form of
02:15 books and then we had the printing press
02:18 and then we have like libraries and then
02:20 internet, and then organized sources of
02:23 information like Yahoo
02:25 and then actually like algorithmic
02:27 search like Google but still we were
02:30 just consuming links, but at the end of the day
02:33 what we really want is, like, answers and
02:36 getting things done
02:38 so we really need answer Bots and Action
02:40 Bots to just do what we want them to do
02:42 and answer all our deepest questions
02:44 right people wanted to do this forever
02:47 but there's a reason it didn't happen we
02:49 didn't have this amazing technology
02:51 called large language models but then
02:54 the world changed December last year
02:56 once ChatGPT came out, and one week
02:59 before that the updated GPT-3.5 came
03:02 out, and we figured that, like, combining
03:07 these large language models with tool
03:10 use which is search indexes or databases
03:13 that have all the facts so combining the
03:16 the facts engine like a search engine
03:18 with the reasoning engine like a
03:20 language model helps you build an answer
03:22 engine that can answer all your
03:24 questions can converse with you and let
03:26 you ask dig deeper ask follow-up
03:29 questions and share all this knowledge
03:31 easily with other people so that they
03:32 don't have to ask these questions again
03:34 so that's sort of what we are building
03:36 we started doing this in December last
03:39 year; we launched it a week after
03:41 ChatGPT, and many people gave us no shot at
03:44 succeeding but we are still surviving
03:46 for eight months so
03:48 it's going pretty well; traffic
03:50 is growing, so you should check it out.
03:51 For most searches now it's pretty
03:54 feature-complete with, like, whatever you
03:56 get on Google, in the sense that even if you're
03:57 not interested in an LLM-generated answer
03:59 and just want to get to the link quickly,
04:01 the relevance from LLM ranking is a
04:05 lot better than what you get from Google,
04:07 which is full of SEO and ads so there
04:10 are a lot of people who just use it even
04:12 as a traditional search engine and a
04:15 large number of people use it for
04:16 getting answers so that's where we are
04:18 today and we want to continue going on
04:21 this journey to make all of us use
04:23 answer engines and stop using search
04:24 engines we'll talk more about this
04:26 answer engine I like that concept but
04:28 just to take a step back like I
04:30 mentioned you were a PhD student at
04:32 Berkeley, and then were working in AI at
04:34 Google, DeepMind, and OpenAI, and then
04:37 decided to launch Perplexity at perhaps,
04:39 like, the craziest moment, one of the
04:42 craziest moments; I guess every
04:43 subsequent moment has been crazier in AI.
04:45 And, like, why did you do that? What was
04:47 the journey there?
04:50 so I came from India here six years ago
04:53 and I didn't have any interest in
04:55 startups I just came to Berkeley for a
04:57 PhD in AI; deep RL was my topic
05:00 at the time. It was actually the
05:02 equivalent of LLMs back then, when
05:04 everyone was pretty crazy about it but
05:07 didn't have real product impact and then
05:09 you know there was this TV show Silicon
05:11 Valley I'm sure all of you have seen it
05:13 so I also saw that. Compression was, like, an
05:18 aspect of it, right: lossless
05:21 compression, how can you improve it? So you
05:24 work on generative models; that's the
05:26 ultimate thing: if you model the log
05:28 probabilities of every next thing that
05:30 you're going to predict and you feed
05:31 that into Huffman encoding, you have, like,
05:33 lossless compression, but much better than
05:36 JPEG.
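To make that concrete, here is a minimal sketch of the idea (an illustration of model-based entropy coding in general, not DeepMind's or Perplexity's code): a hypothetical toy next-character model stands in for an LLM, and its predicted probabilities drive a per-step Huffman code, so the better the model, the shorter the emitted bitstream.

```python
# Sketch only: model-predicted probabilities feeding an entropy coder.
# `toy_model` is a made-up stand-in for a real generative model.
import heapq
from itertools import count

def huffman_code(probs):
    """Build a Huffman code {symbol: bitstring} from a {symbol: probability} dict."""
    tiebreak = count()
    heap = [(p, next(tiebreak), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:                      # degenerate case: only one possible symbol
        return {s: "0" for s in probs}
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

def toy_model(context):
    """Hypothetical next-character distribution; a real system would query an LLM."""
    if context.endswith("q"):
        return {"u": 0.9, "a": 0.05, "e": 0.05}
    return {"a": 0.4, "b": 0.3, "q": 0.2, "u": 0.1}

def compress(text):
    bits = ""
    for i, ch in enumerate(text):
        code = huffman_code(toy_model(text[:i]))
        bits += code[ch]                    # sharper predictions give shorter codes
    return bits

print(compress("aqubab"))                   # lossless: decode by replaying the model
```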
05:38 We started working on generative models, myself and one of my colleagues
05:41 at Berkeley; he was the one who
05:42 invented the working version of that,
05:44 Jonathan Ho. We even taught a class on
05:46 it at Berkeley; we didn't call it
05:48 generative AI though, we just called it
05:50 deep unsupervised generative modeling. So
05:53 I learned a lot about it, learned
05:54 about Transformers, worked on, like, many
05:57 internships at DeepMind and Brain. There
05:59 was no way to convert all this into a
06:01 startup because the hardware for
06:03 compression wasn't there; like, you have
06:04 to make it work on-device, right. So I
06:06 kind of gave up on that idea and when I
06:09 was at deepmind I was
06:11 mostly in the office as interns are
06:14 supposed to be I would go to the library
06:16 and they had a lot of books and some of
06:18 their books were about the early days of
06:20 Google, like How Google Works or In
06:22 the Plex and things like that, and
06:24 obviously I took it and read it while my
06:26 jobs were running on the cluster that
06:29 that story resonated a lot with me
06:30 because I think in Silicon Valley you're
06:32 sort of typically romanticizing the
06:34 idea of college dropouts and undergrads
06:36 starting companies and becoming the next
06:38 Zuckerberg, right, or Gates or Jobs. But for
06:42 me it was like, oh, there are, like, PhD people
06:44 who start companies; they don't have to
06:45 be dropouts, they can be, like, academics
06:47 and entrepreneurs, and Larry and
06:50 Sergey were the people who really inspired me.
06:53 So when I was at DeepMind I would go and
06:55 ask the manager of my manager, Oriol
06:58 Vinyals, like, he's now the head of
06:59 their Gemini team: what is the PageRank
07:02 of, you know, 2019, like what is the
07:05 equivalent to that? And he would just say,
07:07 I don't know, but it's very likely to be the Transformer.
07:09 And, like, it was kind of correct; it's the
07:13 test-of-time paper in AI right now. So I
07:17 started working a lot on Transformers in
07:19 Google Brain with the guy who invented it,
07:21 wrote a lot of papers, got a sense of,
07:23 like, things just really working,
07:25 and then went to OpenAI to do more
07:27 research but clearly the times had
07:30 changed I would always keep hearing
07:31 things like oh you know what there's
07:33 this company called Jasper or Copy.ai; they
07:35 make a ton of Revenue and then the real
07:37 changing moment was when GitHub co-pilot
07:40 turned on the monetization switch
07:41 hundreds of thousands of people paid on
07:44 day Zero double digit million ARR and
07:46 like it's the first day
07:47 that just shows it's like a real thing
07:49 and clearly added a lot of value to people
07:52 around me. I reached out to a few people
07:54 like Elad Gil and Nat Friedman, told
07:56 them I wanted to start a company. I
07:58 didn't know anything; in fact the first
07:59 idea I proposed to Elad Gil was I wanted
08:02 to disrupt Google, but from pixels, because
08:04 they cannot be disrupted from text, so I
08:07 wanted to do it on glasses, and there's
08:09 this model called Flamingo from DeepMind
08:11 that works so we just need to ship it
08:13 and he was like, this is really, like, a
08:15 cool demo thing, but you're not going
08:18 to make it work, the hardware is not
08:19 there, it's very hard to do distribution;
08:21 he told me all the rational things any
08:24 investor tells an enthusiastic founder,
08:26 right but the idea of search just kept
08:29 coming back and back. Like, we tried
08:31 text-to-SQL, we tried a lot of other database
08:33 search, but all of our core founding team
08:36 was just so motivated by search that the
08:39 inspiration from Larry and Sergey, or, like,
08:41 me wanting to do search all the time, it
08:43 just somehow flowed into the product.
08:47 I think a lot of people say this like
08:48 listen to your inner voice whatever you
08:50 ultimately obsess about that's what
08:52 you'll be able to put all your hard work
08:54 into. Other things, like "do what the
08:56 customer wants" or, like, "go talk to people
08:58 and build something customers want";
09:00 you need to first be motivated by
09:02 what is the problem you deeply care
09:04 about, like, you have to work on it. So
09:05 that somehow ended up being the case for
09:07 us and that became perplexity how does
09:10 perplexity actually work under the hood
09:13 yeah so perplexity is basically a
09:16 combination of a traditional search
09:19 index and the reasoning power and text
09:22 transformation capabilities of large
09:24 language models put together so every
09:27 time you enter a query in perplexity we
09:29 understand your query we reformulate it
09:31 and we send it to a search engine that
09:35 is very traditional; multiple search
09:36 indexes, not just ours but external indexes too,
09:39 pull up the relevant links, and, like, lots
09:42 of links sometimes even hundreds of
09:44 links and then we basically task the llm
09:47 with saying hey you know read all these
09:49 links and pull up the relevant
09:50 paragraphs from each of these and use
09:53 those paragraphs to answer the user's
09:56 query in a very concise way and
10:00 in your answer write it like an academic
10:02 or a journalist would write it
10:04 that is make sure you always have
10:07 supporting citations supporting links
10:09 every part of your answer should have a
10:11 citation to it and this all flows from
10:13 our background like we were academics
10:15 when we write papers we always have
10:17 citations at the end of every sentence
10:19 to make sure that we only say what is
10:21 truthful, right, like a "FactGPT" or a
10:24 "TruthGPT," basically. And then that ends up becoming
10:28 the answer; the LLM does the magic at
10:30 the end and we make it conversational
10:32 remember the context of previous
10:33 questions so that you can referentially
10:37 ask more questions on top of what you
10:39 already asked we also make the process
10:41 of asking more questions easier by
10:43 suggesting follow-ups to you, which is
10:45 also another LLM that generates these, so
10:48 that way the whole process of
10:49 discovering more information becomes fun
10:53 and you can get into these rabbit
10:56 holes, which is, again, you know, an
10:57 inspiration we took from Wikipedia,
10:59 of, like, asking more and more things by
11:01 clicking on more links.
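As a rough sketch of the pipeline just described (assuming hypothetical placeholder functions `web_search` and `call_llm`, not Perplexity's internal APIs), the flow is: reformulate the query, retrieve links, then have the LLM answer concisely with a citation on every claim.

```python
# Sketch of an answer-engine loop; every component here is a stand-in.
from typing import Callable

def answer_engine(query: str,
                  chat_history: list[str],
                  web_search: Callable[[str], list[dict]],
                  call_llm: Callable[[str], str]) -> str:
    # 1. Reformulate the query using the conversational context.
    reformulated = call_llm(
        "Rewrite the latest question as a standalone search query.\n"
        f"History: {chat_history}\nQuestion: {query}"
    )
    # 2. Pull relevant links (possibly hundreds) from one or more search indexes.
    results = web_search(reformulated)      # each result: {"url": ..., "text": ...}
    sources = "\n".join(f"[{i + 1}] {r['url']}\n{r['text']}"
                        for i, r in enumerate(results))
    # 3. Answer like an academic or journalist: concise, and every sentence
    #    must carry a citation to one of the numbered sources.
    return call_llm(
        "Answer the question using ONLY the sources below. Be concise and add a "
        "citation like [1] after every sentence. If the sources do not contain "
        f"the answer, say so.\n\nSources:\n{sources}\n\nQuestion: {query}"
    )
```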
11:03 so when this started was this just like
11:05 a call out to, you know, OpenAI's API, and
11:09 that's it is that still the way it works
11:10 are there a bunch of specialized models
11:12 like how is it changed over time we
11:14 initially started off with GPT-3.5, I
11:17 think it was still called text-davinci-003,
11:19 which came a week before ChatGPT was
11:22 released, and we already had everything
11:24 in place, so we noticed a massive
11:27 improvement in quality from that. And we
11:29 also used the Bing API. So what we built
11:32 for the first time, the first rollout, was
11:34 just two API calls, but now it's a lot
11:37 more sophisticated and that does like
11:38 many different algorithms including ours
11:40 many different indexes including ours
11:42 it's kind of like playing the orchestra
11:44 like there are so many tools so many
11:47 moving parts, and your job is to
11:49 play the orchestra where you deliver a
11:52 lot of value on top reliably at scale
11:55 lots of queries per second and making
11:58 sure the latency is really good
12:00 how do you think about building on top
12:02 of other apis whether it's open AI or
12:04 I'm sure Google have their own kind of
12:06 commercial API Etc how do you think
12:08 about kind of having your own business
12:09 on top of that like is that defensible
12:11 like, what do you say to the people that
12:12 are, you know, sort of haters, on the "oh
12:14 it's, like, just a wrapper around OpenAI"?
12:17 I think if it's just a wrapper, many
12:20 people will be able to build it, really.
12:24 If it's just a wrapper,
12:26 it'll be hard to scale it to this level
12:29 of traffic and usage,
12:31 and reliability latency all that
12:34 requires hardcore engineering on the backend.
12:37 But long-term defensibility is only
12:40 possible if, like, either you
12:42 have, like, so many users, like the product,
12:45 users just love it and they don't care
12:47 what you use under the hood, and so you've
12:49 got the user love, the network effects, and
12:51 the stickiness, retention is all good;
12:54 that's hard to buy. Like, once you have that,
12:57 no matter whether you have your own models or not,
13:00 it's very hard to, like, lose from there.
13:02 but in terms of asset class that you
13:04 want to own in your company obviously it
13:06 makes sense to invest in your own models
13:08 invest in your own search index I mean I
13:11 would even go to the extent, like, these
13:12 days people make fun of LangChain wrapper
13:14 companies, right, not even OpenAI API
13:16 companies. So fortunately we're not a
13:18 LangChain wrapper, because when we started
13:20 there was no LangChain,
13:23 so we kind of built our own LangChain, I
13:26 guess, basically, right. So I would say
13:28 yeah it makes sense to build your own
13:30 models your own indexes over time but
13:33 there's like two ways of building a
13:35 company one is you roll out a product
13:37 get a lot of users, de-risk the product-
13:40 market-fit phase, get to a sufficient scale,
13:44 and then you start investing in
13:46 infrastructure so you raise the money
13:48 needed for that and you build a company
13:50 out of it the other way is saying like
13:52 I'm gonna first build the infrastructure
13:54 and then I'll build a product later
13:56 So only two companies have, like, I would
14:00 say only one company has done that
14:01 successfully, right, and it's OpenAI.
14:04 Anthropic has built models, but not a product;
14:07 like, nobody uses Claude as a product,
14:09 they use it as an API. But that's
14:12 kind of worth doing if you are
14:13 interested in building infrastructure
14:14 business and maybe a product later out
14:16 of it but more centered around infra
14:19 right that requires you to raise a lot
14:20 of money at a really high valuation
14:22 which is mostly impossible for most
14:25 people and even if it's possible it's
14:27 super risky so we decided to do the
14:31 traditional way of like raising small
14:32 amount of cash, building a product
14:34 without any infrastructure of our own,
14:36 and then later start slowly building it
14:39 That makes a lot of sense. To be clear,
14:42 is the plan to eventually move off of OpenAI,
14:44 or to build your own models to work
14:48 alongside OpenAI's? So right now the
14:51 plan is to build our models to work alongside theirs.
14:55 I expect openai to have the best models
14:58 for at least two to three more years
15:00 Nobody Knows the future after that like
15:02 it could be some other company or
15:04 it could be us nobody knows right so
15:08 I'm just willing to be pragmatic
15:11 here obviously look if you ask anybody
15:13 in this room would they want to be the
15:15 owner of GPT-5, they'll say yes, right? So
15:18 I'm also going to say yes I would love
15:19 to have our own model that's as capable
15:21 as the next LLM being built by
15:23 OpenAI. But what is practically feasible
15:26 today is, like, we probably can get to 3.5,
15:28 but we probably can never get to 4
15:30 with the funding we have, and definitely
15:33 not 5, right. So we are happy to work
15:35 alongside their APIs. I love your
15:38 approach, it's just very pragmatic, not as
15:40 sexy or exciting sometimes as people who
15:42 are raising, like, 100 million to just
15:44 build models and start there, but it just
15:46 strikes me as very pragmatic, the whole approach.
15:49 I'm not making fun of that approach; in
15:52 fact I'll just be more direct and say
15:54 I'm not bold enough to do that,
15:56 because if you want to raise 100 million your
15:57 valuation should be at least, like, 500,
16:00 or, like, a billion maybe for some people.
16:03 After that, what if you never build
16:06 a model as good as open AI or like what
16:07 if next day they announce their apis 10x
16:10 cheaper actually they did do that
16:12 then what happens to you right and then
16:14 what if Nvidia comes with a completely
16:16 different GPU in a few months and you
16:19 invested all your cash into building a
16:21 cluster out of the old generation
16:23 there's so many problems to think about
16:24 when you deal with capital of that size
16:26 and as a first time founder I don't have
16:29 the guts to do all that. You have,
16:35 since, as you kind of alluded to, the odds
16:36 were against you, everyone was like, well,
16:37 you obviously can't do this, and you're
16:39 like, well, let's see. You
16:40 posted on LinkedIn I think yesterday
16:42 with some interesting stats comparing
16:44 Perplexity to Bard and ChatGPT.
16:47 so you cited perplexity has 0.7 million
16:50 so 700k visits per day Bard has 4.6
16:55 million, ChatGPT has 54 million, so ChatGPT
17:00 is by far the dominant product, right?
17:03 But you asserted that
17:05 ChatGPT might be the dominant product,
17:06 but Perplexity is the best product: when
17:09 you look at visit duration
17:11 pages per visit and bounce rate
17:14 Perplexity is the clear winner above both.
17:19 so those are just very impressive stats
17:20 and I'm curious how do you think you've
17:22 achieved this obviously there are many
17:24 companies following in your footsteps
17:26 not to mention Bard and ChatGPT. What
17:29 is it about your product that has led to
17:31 these great stats about user love
17:35 a lot of credit goes to our team for
17:37 that we have really good engineers and
17:39 really good product designer also one of
17:42 the most appreciated aspects of the
17:43 product is it's very clean and simple
17:46 So why did we get these statistics?
17:48 I'd say the number one reason is we only
17:50 focused on this one thing we are doing
17:52 which is an answer engine with
17:54 supporting citations, nothing else. There
17:57 are a lot of decisions we made, like,
17:59 oh, we could have gotten more traffic
18:01 if we supported free-form chat
18:04 instead of just being a productivity
18:07 assistant or research assistant, but we
18:09 didn't do that, because
18:10 that would mean bifurcating the
18:12 product, making it confusing to people:
18:13 getting users for one thing and, like,
18:16 some other users getting frustrated by
18:18 lack of reliability on another thing.
18:21 So what really helped us was being clear,
18:23 simple, and only doing one thing at a
18:26 time. ChatGPT has so many other
18:28 things going on that they could lose on
18:31 one plug-in or one particular thing
18:34 to a company that's super focused on
18:36 nailing that, right, and that's what
18:38 happened with their browsing plugin,
18:40 which used Bing.
18:42 as for Bard I think it's still improving
18:45 since they rolled it out, and I think
18:48 they are trying to go after ChatGPT rather
18:49 than, like, trying to kind of create a new
18:51 search experience there. So they
18:54 hallucinate a lot and, like, they don't
18:55 say the right things and some of the
18:57 links are not real links, the same
19:00 problems you have with ChatGPT. So if you
19:01 kind of go after ChatGPT you end up
19:03 with the same problems it has, right. And
19:05 another reason they are disadvantaged is
19:07 because if Bard is basically replacing
19:10 Google, it's not good for Google,
19:12 so they might not invest as many
19:14 resources into Bard as they would for search.
19:18 just want to ask about that because
19:20 that's kind of every founder's, every
19:21 builder's question: how do you prevent or
19:23 cut down on hallucinations how have you
19:25 approached that issue
19:27 Yeah, like I said, the core tenet of the
19:29 product is only say what you can cite
19:32 that's also the principle in Academia or
19:35 journalism like you need to have sources
19:37 so if you're only going to pull up
19:39 content from a link or a web page and
19:42 only use that content for writing the answer,
19:44 you can reduce hallucinations a lot.
19:46 Despite that, there's still some
19:49 misunderstanding by the LLM, where, let's
19:54 say you're searching for Ali Rohde and there's
19:56 some other person of that name, it might
19:58 combine the two together into one person,
20:01 and, like, some people get offended by it,
20:03 some people are entertained by it. So we
20:06 have worked hard on, like, this
20:07 disambiguation. You know, there's a long tail
20:09 of cases where it goes wrong; that can be
20:11 addressed with a better LLM; for example
20:13 we've noticed GPT-4 hardly makes any mistakes.
20:16 as long as you can decouple the facts
20:19 from reasoning with this retrieval
20:21 augmented generation paradigm, some
20:23 people call it RAG, it should be
20:25 possible to address this over time
20:27 and as long as you can parse web pages
20:31 the snippeting logic is better the
20:33 embeddings are better all these things
20:35 can over time just be reduced to like
20:38 practically zero right
20:40 No one's going to be angry if one in a
20:42 thousand queries
20:44 factually, like, has some problem.
20:47 And at least we've been tracking
20:48 metrics, and we're realizing that we're
20:51 continually improving here.
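One way to operationalize "only say what you can cite" (a hedged sketch of a post-processing check, not necessarily what Perplexity ships) is to flag any answer sentence that carries no citation marker before it reaches the user:

```python
# Sketch: flag sentences in a generated answer that lack a [n] citation.
import re

def uncited_sentences(answer: str) -> list[str]:
    """Return the sentences of an LLM answer that have no [n] citation marker."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    return [s for s in sentences if not re.search(r"\[\d+\]", s)]

answer = "Perplexity launched in December 2022 [1]. It is an answer engine."
print(uncited_sentences(answer))   # -> ['It is an answer engine.']
```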
20:53 last question and then we'll open it up
20:54 so get your questions ready. But just
20:56 kind of talking about the psychology of
20:58 building this being a Founder is just
21:01 always incredibly difficult and that's
21:03 just in normal times
21:05 right now the ground is moving so fast
21:08 beneath you open AI is moving so fast
21:10 Google is moving so fast other companies
21:13 are moving so fast and then every other
21:14 day you have a 100 million or billion
21:17 fundraising announcement new people
21:19 coming out of stealth with just huge
21:21 piles of money and I'm curious how you
21:23 kind of deal with that of like just all
21:26 the challenges of building something new
21:27 plus being in this world where like you
21:29 kind of have to be on Twitter you have
21:31 to be monitoring what's happening
21:32 because things are moving so fast
21:36 I think it's always fun to prove
21:39 the world wrong, right? Like, there's
21:41 nothing better than that. Peter
21:44 Thiel's, like, Zero to One book is based
21:47 on this: what does the rest of the world
21:48 think, what do you think, and is it at the
21:51 intersection of what is right?
21:53 And if that is the case, then you'll end
21:55 up being incredibly successful. Regarding funding rounds,
22:00 if having more capital
22:02 lets them build what you're trying to build,
22:06 then you are obviously at a
22:08 disadvantage. So for example, if your
22:10 company is about building GPT-4 and you
22:13 have 10 million dollars in funding and
22:14 somebody else has 500 million dollars in
22:15 funding, they're, like, gonna win.
22:18 If the company is about, you know,
22:20 taking Llama and building, like, a really
22:24 great assistant out of that,
22:29 then having 500 million dollars in funding
22:31 doesn't necessarily make them win; it
22:32 might in fact make them lose,
22:35 they have way more capital and
22:37 they'll get distracted
22:39 and hire a lot of people and, like, throw
22:42 a lot of cash at things that don't need it,
22:45 and you have much more advantage because
22:48 you're lean and you're basically hungry
22:49 and you need to win right
22:51 Scarcity cannot be faked. Like, the one
22:54 who has more at stake, basically the one
22:57 who has, like, so much more to gain from
23:00 winning, eventually wins.
23:05 yeah I wouldn't get so distracted by
23:07 funding rounds for example the
23:09 inflection funding round doesn't change
23:11 any of our destinies at all;
23:14 that's more for open AI to worry about
23:19 For products that are competing with you
23:21 in the same space, there you clearly need
23:23 to be competitive, no question about
23:25 that. It's good to just focus on your own
23:28 Journey have high sense of urgency
23:31 And Nat Friedman has this thing on his
23:33 website, like a bunch of bullet points,
23:35 which I really like. Some of them I
23:37 remember and can share: one is, get
23:40 your dopamine from making things happen.
23:42 I really subscribe to that. A lot of it
23:45 aligns with what Mark Zuckerberg says:
23:47 done is better than perfect,
23:48 always iterate, don't wait for perfection,
23:51 get user feedback every week. In fact,
23:54 like, at the starting of the company, Nat
23:55 told us, like, every Friday you should
23:57 basically be discussing what your users
23:59 are saying about your product
24:01 and if there's nothing new there it
24:03 means that week was a failure
24:05 So we took all this advice pretty
24:07 seriously and we still work at that
24:09 pace actually it's gone a little bit
24:11 slower because we have a product already
24:12 and we can't keep shipping more and more
24:14 because that confuses the user but we
24:17 still try our best to:
24:19 every Friday we discuss in an all-hands,
24:22 like, what people are saying about the
24:24 product what we can improve
24:26 I love it well thank you for sharing
24:28 your journey with us and with that I
24:30 will open it up to questions Natalia
24:32 will run around with the mic, so please
24:34 raise your hands we'll start over here
24:36 second row we'll just pass down the mic
24:39 hi so you said earlier that you were
24:42 building a search index are you using a
24:44 retrieval augmented generation approach
24:46 and if yes, how often do you update your
24:49 index or vector store index, and how do
24:52 you manage to scale it so
24:54 that your, I guess, your data store is up to date?
24:58 Yeah, so I don't remember off the top of my
25:00 head what the exact periodicity is, but
25:02 it's pretty frequent; like, it at least
25:04 happens every few hours. It is using
25:08 retrieval-augmented generation, so the
25:11 necessary elements for this are, like,
25:13 good embeddings and, like, good logic
25:16 around like re-scraping and things like
25:17 that and that is separate from actually
25:20 the llm so for the llm and this
25:23 retrieval augmented generation are you
25:25 still using the OpenAI API? We still use the
25:28 OpenAI API and we use some of our models,
25:32 and we expect that to be the case over
25:34 time like in the sense
25:36 if it has, like, a sliding bar between our
25:39 models and OpenAI models, and, like,
25:40 there's, like, a convex combination
25:42 between the two, we expect the sliding bar to
25:46 shift more towards our models over time,
25:49 but continue to have, like, a non-zero usage of OpenAI.
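As a toy illustration of that sliding bar (an assumption about how such routing could look, not Perplexity's serving code), you can treat it as a convex combination over which model handles each request:

```python
# Sketch: route a fraction of traffic to in-house models, the rest to OpenAI.
import random

def route_query(query: str, in_house_weight: float, in_house_model, openai_model) -> str:
    """Serve the query from the in-house model with probability in_house_weight."""
    assert 0.0 <= in_house_weight <= 1.0
    model = in_house_model if random.random() < in_house_weight else openai_model
    return model(query)

# Usage idea: start with a small in-house share and slide the bar up over time,
# e.g. route_query(q, 0.2, my_model, openai_call) with hypothetical callables.
```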
25:51 What metric are you
25:54 using to measure the relative
25:57 performance, to compare the retriever
25:59 performance between the two? Yeah, so we're
26:01 actually just setting all these things
26:03 up right now; we have the AI
26:05 quality dashboards, and we're working on
26:08 that with contractors.
26:10 And the thing about this end-to-end
26:12 system is you need to track basically
26:14 whether the answer was correct or not;
26:16 that's the most important thing. It's
26:17 hard to have a metric for just the retrieval part.
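A hedged sketch of what that end-to-end tracking could look like (a common LLM-as-judge pattern, with `call_llm` as a hypothetical placeholder): grade logged (query, sources, answer) triples for correctness rather than scoring the retriever in isolation.

```python
# Sketch: end-to-end answer grading over logged production examples.
def grade_answer(query: str, sources: str, answer: str, call_llm) -> bool:
    verdict = call_llm(
        "You are grading a search answer. Given the question, the retrieved "
        "sources, and the answer, reply with exactly CORRECT or INCORRECT, "
        "judging only whether the answer is supported by the sources.\n\n"
        f"Question: {query}\nSources: {sources}\nAnswer: {answer}"
    )
    return verdict.strip().upper().startswith("CORRECT")

def answer_accuracy(logged_examples: list[dict], call_llm) -> float:
    graded = [grade_answer(e["query"], e["sources"], e["answer"], call_llm)
              for e in logged_examples]
    return sum(graded) / max(len(graded), 1)
```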
26:21 Part of my team also does LLM research,
26:23 from ETH Zurich, and we
26:26 also ship a product, and whenever we're,
26:28 like, iterating on the product we have a
26:30 meeting, and then at the end we're like,
26:31 oh no, we need to build, like, a test set
26:33 to test out how specifically well our
26:35 retriever is doing, and so far we've never
26:37 gotten to that point; we're, like,
26:38 always intuitive, like, playing around,
26:39 feeling like we're making some changes
26:42 to the pipeline, seeing if it works better.
26:43 How much of that was... did you guys
26:45 actually do specific testing with, like, a
26:47 benchmark, or did you just...
26:50 Intuitively. We tested on prod.
26:55 but I'm like oh we should build
26:57 This is something I learned at OpenAI,
26:59 that benchmarks are, like, really kind
27:02 of overrated. It's, like, because we are
27:05 all, like, indoctrinated to think like
27:07 that from our machine learning research,
27:09 but the moment you can train your
27:12 mind to only test on real users testing
27:15 your product, that's when you actually
27:17 have something reliable and robust, right?
27:19 Otherwise you just start tuning for the benchmark.
27:23 Josh, thoughts on benchmarks? What's your take?
27:27 there's definitely something to that I
27:29 don't know if test on prod is always the
27:30 right answer it certainly seems like a
27:32 good way to go if you have users and you
27:34 can try things out in the real world and
27:36 like in a limited fashion right
27:37 obviously you're not going to roll out
27:38 the whole thing to every production user.
27:40 getting real feedback is the thing that
27:42 matters and you know I think you still
27:44 can use benchmarks probably right and it
27:46 makes sense and probably you'll develop
27:47 like with the AI dashboards, develop some
27:49 of your own internal benchmarks, like,
27:50 oh okay, this is what it means for
27:52 the search quality to be good, but
27:53 like, you do need to be really careful
27:55 that you don't over-optimize for those,
27:56 and I think there's a real danger of that.
28:01 uh my name is Carson I'm an architect I
28:03 wanted to ask you a question about unit
28:05 economics so in my function as an
28:09 architect I often have to talk about
28:12 this perceived impact on cost if you run
28:15 an llm search versus a traditional
28:18 search infrastructure so as you look
28:21 toward the future what you see are the
28:24 major ingredients to bring that cost
28:26 down and to make it more profitable and
28:28 more competitive to where companies like
28:31 Google for example are at the moment
28:32 yeah so one advantage we have as a
28:35 startup is we don't have the user volume
28:37 that Google has, like one or two
28:40 billion people using it every day
28:42 we don't have that right we are like
28:44 orders of magnitude lower than that
28:46 so this cost is not a big deal
28:50 And in terms of cost reduction,
28:52 moving to smaller models,
28:55 models that are trained
28:57 explicitly for retrieval-augmented generation, is
28:59 going to be helpful, and I expect
29:02 Hardware will also become cheaper over
29:03 time there'll be more tricks to make
29:06 inference more efficient, like FlashAttention
29:07 and, like, other techniques,
29:10 FP8, all those things will help.
29:13 and so considering that as well as the
29:16 fact that we don't need to worry about
29:18 Google's user volume for quite a long
29:20 while; I mean, if that were the case today
29:23 we would be a much bigger company.
29:25 All that makes it much more convenient
29:28 for us to do this whole thing and and of
29:30 course building our own index will also
29:31 be a big cost reduction
29:33 because once you index it, it's not a lot
29:36 of cost for retrieval, right?
29:39 hi I'm curious if you've done any
29:41 looking at groups of use cases or user
29:44 personas and how you compare contrast
29:46 those against more traditional like
29:48 we found like a lot of users like using
29:51 it for research usually you say you know
29:53 what did you Google it for stuff that
29:56 you would just get the answer really
29:59 what's the time in London right now like
30:01 those are Google Nails those things but
30:04 for Stuff where you have to actually go
30:06 ask does YC support Founders if you're
30:08 still having their jobs with their
30:10 current companies or are they okay
30:12 investing in their company as long until
30:14 they get the funding
30:15 if you have to find out it's going to be
30:17 like opening a bunch of links and
30:18 reading them these are questions that
30:20 perplexity just answers like so lots of
30:22 these questions that you have in your
30:24 day-to-day life that require you to, like,
30:25 do some amount of research whether it's
30:27 few minutes a few hours
30:28 our product Just Nails it and that's
30:31 where we found a lot of usage and hence
30:33 why I think it's not really like a
30:35 Google competition, even though that's
30:37 what the narrative is; it's very easy to say
30:39 that. It's more like opening a new
30:42 segment for these answer Bots that
30:45 support people to come do their research
30:47 directly so I guess I want to ask about
30:49 your kind of 10-year plan and kind of
30:51 what trends you're seeing on the
30:53 technical and business side so one
30:55 example might be agents and we know they
30:58 really don't work very well right now as
31:00 Josh mentioned do you think users are
31:02 going to expect the Search tool to be
31:03 able to do a lot of that research and
31:05 ask those follow-up questions on their
31:07 own and how specialized do you think a
31:09 tool like that would be with domain
31:11 knowledge versus some general tool like
31:13 Perplexity? For the agent stuff, we're
31:16 already sort of prototyping things there
31:19 so in perplexity if you turn on Copilot
31:24 it's more like an interactive search
31:25 companion rather than just an answer bot
31:29 by that I mean it'll come back and ask
31:31 you clarifying questions and the
31:33 clarifying questions are very specific
31:35 to the original question you asked and
31:37 it generates the UI for the clarifying
31:40 question so that was an idea inspired by
31:42 Auto GPT but not completely autonomous
31:45 it's more like a copilot that works
31:48 and you can imagine this heading towards
31:50 like helping you buy stuff and over time
31:53 like do shopping decisions
31:55 So I certainly see
31:58 possibilities to make this a reliable
32:01 assistant over time and not just an answer bot.
32:06 But we need to really make it work, or
32:08 else we will not ship this, right; if it
32:10 doesn't work for most people it's gonna
32:12 make it a bad experience. As for a
32:15 10-year plan, I think this could happen
32:20 even in, like, three to four years.
32:25 Yeah, beyond five it's really hard to
32:28 predict. GPT-1 came out in 2018, so now
32:30 it's five years from then; would you have
32:33 predicted this moment? That's pretty hard, so
32:34 even five years is hard so 10 years is
32:37 like really really hard
32:39 yeah I had a question about training
32:41 you're talking about building your own
32:43 models and right now it sounds like you
32:45 have a composite of a number of them;
32:50 how will your model be
32:52 better than the best possible
32:54 combination of those search
32:56 sources? You shouldn't train your models just
32:59 for this purpose right that's sort of
33:01 why I'm very skeptical of generally like
33:03 domain specific llms because once you do
33:06 that these models lose the magic of the
33:09 generality of the llm it'll be hard to
33:12 keep it conversational, it'd be hard to;
33:15 all the stuff that makes the product feel
33:17 magical when it's sort of built on top
33:19 of OpenAI's API will all go away once
33:22 you start fine-tuning on your use case
33:24 alone. So how do you preserve the
33:26 generality and yet optimize for your use
33:28 cases is kind of the open question, and
33:32 I think the answer, like, lies in a
33:34 few papers published by Google. There's a
33:36 paper published by Google called Minerva;
33:39 it's a model that's trained to do math,
33:41 answer calculus questions, things like that,
33:45 but they did not just train it on math.
33:47 They took PaLM, and when fine-tuning
33:51 they don't just train on math, they
33:54 also train on regular English; in fact 95%
33:56 of the training is just on English, I
33:59 think maybe, like, five percent, five to
34:00 ten percent, is on, like, math,
34:02 and that ensures that they preserve the
34:05 pre-training data distribution even while fine-tuning.
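A small sketch of that mixing idea (my reading of the discussion, not Google's or Perplexity's training code): when fine-tuning for a narrow use case, keep most of each batch as general text so the model stays close to its pre-training distribution.

```python
# Sketch: sample fine-tuning batches that are mostly general text, a little domain text.
import random

def mixed_batch(general_corpus: list[str], domain_corpus: list[str],
                batch_size: int = 32, domain_fraction: float = 0.05) -> list[str]:
    """Return a batch that is roughly 95% general text and 5% domain-specific text."""
    n_domain = max(1, int(batch_size * domain_fraction))
    batch = random.sample(domain_corpus, n_domain) + \
            random.sample(general_corpus, batch_size - n_domain)
    random.shuffle(batch)
    return batch
```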
34:09 And this is pretty important; not a lot
34:11 of people know this. So when everyone
34:12 complains that they took Llama and
34:14 fine-tuned it on this new thing and then
34:16 it overfit and it's not doing most of
34:18 the general things it was able to do,
34:21 there's a reason for that: like, you're
34:23 drifting too much from the initial
34:25 parameter space. All right, thank you all
34:28 so much, and thank you, Aravind; a quick
34:30 round of applause for Aravind.