Finding the compute capacity to run their applications is actually a real challenge. What really is stopping companies from going and 10x-ing their production? The crazy exponential growth of AI at the moment. How do I get access to the compute that I need? I think my number one advice would be to shop around. It might be that for a certain process you don't want to use this capacity, but for another one that you do want to use, they don't have the capacity. How much should founders know about hardware? There's probably a certain scale where it makes sense for you. It's a whole new ecosystem forming, and I think there's a ton of opportunities to build great companies.
With software becoming more important than ever, hardware is following suit. And with the world constantly generating more data, unlocking the full potential of AI means a constant need for faster and more resilient hardware. That is exactly why we've created this mini-series on AI hardware. In part one, we took you through the emerging architecture powering LLMs, from GPUs to TPUs, including how they work, who's creating them, and also whether we can expect Moore's law to continue. But part two is for the founders trying to build AI companies, and here we dive into the delta between supply and demand, why we can't just print our way out of a shortage, how founders can get access to inventory, whether they should think about renting or owning, where moats can be found, and even where open source comes into play. You should also look out for part three, coming very soon, where we break down exactly how much all of this costs, from training to inference.

Today we're joined again by a16z special advisor Guido Appenzeller, someone who is uniquely suited for this deep dive as a storied infrastructure expert, with experience like CTO of Intel's Data Center and AI Group.
Dealing a lot with hardware, the low-level components, has given me, I think, a good insight into how large data centers work and what the basic components are that make all of this AI boom possible today.

Despite working with infrastructure for quite some time, here's Guido commenting on how the momentum of the recent AI wave is shifting supply and demand dynamics.

The biggest thing driving that is just the crazy exponential growth of AI. AI has been booming since the middle of last year. I think nobody expected how quickly it would move, and that has just created demand which, at the moment...
As a reminder, the content here is for informational purposes only, should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any a16z fund. Please note that a16z and its affiliates may also maintain investments in the companies discussed in this podcast. For more details, including a link to our investments, please see a16z.com/disclosures.
In a recent article, Guido even stated that some reputable sources indicate that demand for AI hardware outstrips supply by a factor of 10. Here's him commenting on how that dynamic is impacting competition.

We currently don't have as many AI chips or servers as we'd like to have, so for some of our portfolio companies, finding the compute capacity that they need to run their applications is actually a real challenge. There's a whole value chain behind that. It's a combination of many things: we have some bottlenecks on the chip manufacturing side, we have some bottlenecks on building the actual cards, and these development cycles take some time.

Maybe this is a silly question, but what really is stopping companies like Intel, like Nvidia, from going in and 10x-ing their production? Is that on the roadmap, where we're just going to see a lot more chips and we won't see this discrepancy between supply and demand, or is there something more complex at play?
It's a bit more complex, because if you want to make a chip, the way you do it is you make it in a foundry, which are extremely large, extremely complex facilities. Intel makes chips in their own foundries, but most companies manufacture with Taiwan Semiconductor, TSMC, and they are capacity constrained. You often have to reserve capacity long in advance, and there are different processes, so it might be that for a certain process you don't want to use this capacity, but for another one that you do want to use, they don't have the capacity. You could just say, well, in that case let's just build more fabs, but building a fab takes you a couple of years and probably a couple of billion, or ten billion, of investment. So you're looking at some very large investment projects that take some time to adjust, and that's what prevents us from reacting more quickly at the moment.

And while some countries are indeed making major multi-billion-dollar investments in new semiconductor production plants, aka fabs, these will take time to scale, and there are also no promises, given that expertise is concentrated in a few companies. So with demand not subsiding, what does this mean for who gets access to the supply available?

I mean, it doesn't sound like the demand is going to subside, especially because we see this really what seems like intrinsic relationship between the power of these models and the compute that's thrown at them. So how does a company, let's say if I'm a founder today, how do I get access to the compute that I need, and who decides? Is it just who's willing to pay the most, or how is that supply allocated?
Yeah, there's some of that. At the moment, capacity is expensive wherever you go. I tried to run some personal experiments and tried to reserve an instance with one of the cloud service providers a few days ago, and they just didn't have any; it's like, no, not available. What we're seeing is that often, in order to get access to the newer cards or newer chips at scale, you have to pre-reserve capacity. So often these are negotiations between a company and a large cloud, where you say, okay, I need this many chips for this amount of time. What they'll often ask for is a certain time commitment, so it'll be like, okay, we can give you this many chips, but we want you to sign, basically, that you take them exclusively for two years, for that amount. But we've also seen, I think OpenAI was in the news with that, investment deals where, for example, a cloud provider comes in and invests in a company, and as a result the company gets capacity. So we're seeing all kinds of deals being struck. As with any scarce resource, there's a lot of deal-making going on.

And it's not just a matter of getting access to compute; it's about ensuring you get access to the compute tailored to your needs, and cost is not the only factor.

What would you say in terms of the considerations that they should be keeping in mind? Really, how much should founders know about hardware, and again, about selecting which hardware to use?
Fantastic question. I mean, I think the first question honestly I would ask is: do you really need to consume the hardware directly, or do you really just want to consume something that runs on top of the hardware? Let's take an example: if I want to generate images with Stable Diffusion, say for my mobile phone app or something like that, it might be easier to go to a SaaS company like Replicate, for example, that essentially will host the model for you, where you just pay for access to the model and they send you the generated images. They will manage all the provisioning of the compute infrastructure and find the GPUs for you.
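To make that concrete, here's a minimal sketch of the hosted-model approach, assuming the replicate Python client with an API token already configured; the model identifier and prompt are purely illustrative.

```python
import replicate  # assumes REPLICATE_API_TOKEN is set in the environment

# Generate an image through a hosted model: Replicate finds the GPUs and
# runs inference; you just pay per call, with no infrastructure to manage.
output = replicate.run(
    "stability-ai/stable-diffusion",  # illustrative model identifier
    input={"prompt": "an astronaut riding a horse, watercolor"},
)
print(output)  # typically URL(s) to the generated image(s)
```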
If you do want to run your own model, I think my number one advice would be to shop around. There's a fair number of providers, and the large clouds, in my experience, are not always the best option if you price it out. We've seen that startups typically are more likely to go with specialized clouds, like CoreWeave or Lambda, that obviously specialize in providing AI infrastructure to startups. So shop around and look at the different offers.
Yeah, and when you're shopping around, in addition to price, which I feel like is a major motivating factor, what other factors are there? In terms of these other companies who maybe aren't the big clouds, how are they differentiating relative to one another? How are they standing out in that market?
Yeah, there's a whole sort of decision tree there. The first thing, one thing that often drives the decision, is how much memory do I need in my cards? If I have a small image model, I might be able to work with a more consumer-grade card, which is much cheaper per hour if I reserve it in a cloud. Whereas if I'm, for example, training a large language model, I not only need a card with the most memory I can find, but I probably want as many cards as possible in one server, because communication between them matters, and I may even care about the networking fabric behind it; some of the very large models are actually network constrained in terms of how quickly you can train them. So it really becomes a question of: what's your objective? Is it inference, is it training? If it's training, how big is your model? And based on that, you figure out what the card is, what kind of server you need, what kind of fabric you need between those servers, and then you can decide what the right fit is for your application.
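As a rough illustration of that first branch of the decision tree, here's a back-of-the-envelope memory estimate. The 2-bytes-per-parameter figure assumes fp16/bf16 weights, and the 8x training multiplier is a common rule of thumb for gradients plus optimizer state, not a precise number.

```python
def estimate_gpu_memory_gb(params_billion: float,
                           bytes_per_param: int = 2,
                           training: bool = False) -> float:
    """Memory to hold a model: fp16/bf16 weights are 2 bytes/parameter;
    training adds gradients and optimizer state, crudely ~8x the weights."""
    weights_gb = params_billion * bytes_per_param
    return weights_gb * 8 if training else weights_gb

# A small ~1B image model fits easily on a consumer card...
print(estimate_gpu_memory_gb(1))                  # ~2 GB
# ...while a 70B LLM needs ~140 GB just for weights at inference, and far
# more for training -- hence multiple cards per server and a fast fabric.
print(estimate_gpu_memory_gb(70))                 # ~140 GB
print(estimate_gpu_memory_gb(70, training=True))  # ~1,120 GB
```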
Even prior to this AI wave, compute was a major line item for many software companies, and the calculus of leaning on the easily accessible cloud versus bringing infrastructure in-house was becoming an increasingly important consideration. Here is Guido touching further on that very calculus in today's era, and where scale comes into play.

Compute is expensive; it's a major line item for many companies, and this was even before today. So how do you think about how that impacts different companies' bottom lines, and whether they really factor that in to having their own allocated GPUs versus using something more like Replicate?
You really have to ask what is the right fit for you, and it probably depends a lot on the scale at which you need them. If you need a lot, frankly, you have to pre-reserve them; you have to have your own, there's just no way around that. If you need a smaller quantity, you may be able to reserve them on a more short-term basis, or there are various models where you consume only while your application runs, but at a higher price. And so this really comes down to: what kind of load do you have? What we're seeing is, if somebody is training, they're more likely to do a long-term reservation for a GPU, because you want to make sure you have access to it. If somebody has a more continuous workload where availability is important, like, I just do inference, but I want to make 100% sure that if a request comes in I can service it, I can never be down, they'd probably choose reserved capacity as well. On the other hand, if I have more batch jobs, where if this job runs an hour later that's not the end of the world, then you probably can go with variable capacity and just reserve it ad hoc. But it's really a conversation of: what is the usage pattern, what is your demand pattern, and from that comes the best pick for you and the partners that you work with.
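Here's a toy version of that reservation math; the hourly rates are hypothetical placeholders, not quotes from any provider, and the crossover point depends entirely on your real prices and utilization.

```python
# Hypothetical hourly rates for illustration -- real prices vary widely.
ON_DEMAND_PER_GPU_HR = 4.00   # variable capacity: pay only while jobs run
RESERVED_PER_GPU_HR = 2.20    # long-term commitment: billed around the clock

def monthly_cost(gpus: int, busy_hours: float) -> dict:
    hours_in_month = 730
    return {
        "on_demand": gpus * busy_hours * ON_DEMAND_PER_GPU_HR,
        "reserved": gpus * hours_in_month * RESERVED_PER_GPU_HR,
    }

# Bursty batch jobs (8 GPUs, ~100 busy hours/month) favor on-demand;
# an always-on inference service (~730 hours) favors a reservation.
print(monthly_cost(8, 100))  # {'on_demand': 3200.0, 'reserved': 12848.0}
print(monthly_cost(8, 730))  # {'on_demand': 23360.0, 'reserved': 12848.0}
```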
We've seen that companies, even prior to AI, have benefited from building their own infrastructure, basically bringing that in-house, because before that they were renting, and they were paying a lot to rent that compute. Do you think that will be a differentiator for companies moving forward, or how should founders be thinking about that relationship between owning the infrastructure and renting it?
Owning the infrastructure comes with cost as well, because you now need to hire people that run it, you need to get money for the capex, and so on. So my guess is that most early-stage founders, and probably even most mid-stage and late-stage founders, are better off renting capacity, renting in a cloud, or consuming a SaaS service. There's a couple of exceptions. If you have really, really specialized needs, you may just not find anybody who has exactly the kind of hardware that you need. There might be some cases where you have geopolitical concerns, or your data is just too sensitive and you need to run it yourself. And there's probably a certain scale at which it makes sense for you to run your own data center, but it's a pretty large scale. If you're spending 10 million dollars a year, you're probably still below critical mass; if you're spending 100 million dollars a year on infrastructure, that may be a reason to look into options for your own data center.

But if everyone is competing for the same compute, are there other ways to stand out? Where's the moat here? Or, you could say a moat is getting access to different training data, but that actually doesn't necessarily have to do with compute or money being thrown at the problem; it's getting access to differentiated data.
If you have access to differentiated data, that could be a moat. I mean, it's a bit more subtle, because look, if you had an area where there's just not much public training data, that's probably right, and there might be areas like finance where that's the case. But for a large language model, for example, it turns out that just making a larger model and training on more data has more benefits than just absorbing more knowledge: it also means that it's better at reasoning, at understanding abstract context, and at answering really complex multi-stage questions, and so on. So, if I have to guess, I think the future will be that we'll still train on all the data we can find, and then maybe you fine-tune, meaning you yourself do some additional training on a particular problem domain with your private data, if that makes sense. And so you first go to elementary school to learn reading and writing, and then you go to your vocational training for the specialized job that you have to do in the future.
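As a sketch of that "vocational training" step, here's what fine-tuning a base model on private text could look like with the Hugging Face transformers library. A minimal sketch only: the model name and data file are placeholders, and a real run would add things like parameter-efficient methods (LoRA) and tuned hyperparameters.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

# "Elementary school" already happened: start from a pretrained base model.
base = "meta-llama/Llama-2-7b-hf"  # placeholder; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.pad_token or tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# "Vocational training": your private, domain-specific corpus.
data = load_dataset("json", data_files="private_domain.jsonl")["train"]
tokenized = data.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=data.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned", num_train_epochs=1),
    train_dataset=tokenized,
    # mlm=False => causal LM: labels are the inputs, shifted for next-token loss.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```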
Another important question worth addressing is who can realistically compete. If compute is expensive, will the largest, most heavily capitalized companies win, since they can build the largest models with the most data? And what role does open source play? As one of many emerging examples, Vicuna was created by fine-tuning Meta's Llama 1. The fine-tuning added only an additional $300 of cost, but the result is competitive with much larger models like ChatGPT or Bard. So what might this example, and a growing number of open source projects, tell us about the future of open LLMs?
So first of all, in general, larger models, everything else being equal, perform better. The really small open source models that we're seeing out there today are not yet at the level of a GPT-3.5 or GPT-4, and there's actually a website that runs sort of regular bake-offs, where they basically ask users to compare answers, and it seems to be pretty clear that the large ones are still a little bit ahead. That said, we're making big advances there; we're figuring out a couple of things. One thing we've learned is there's something called the Chinchilla scaling laws, which basically give us an idea of how data corresponds to model size. And if we "overtrain", so train less compute-efficiently than we could, we can actually get a potentially smaller and better model: you can match the performance of a large model with a smaller model if you train it on more data. So that's interesting; that reduces model sizes, and the trend at the moment is to make slightly smaller models and train them more to get equal performance.
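The Chinchilla rule of thumb is easy to state in code: roughly 20 training tokens per parameter is compute-optimal, with training compute commonly approximated as 6 FLOPs per parameter per token. A quick sketch; the 7B-on-2T-tokens "overtraining" example is ours, for illustration only.

```python
def chinchilla_optimal_tokens(params: float) -> float:
    """Chinchilla rule of thumb: ~20 training tokens per parameter."""
    return 20 * params

def training_flops(params: float, tokens: float) -> float:
    """Common approximation: ~6 FLOPs per parameter per training token."""
    return 6 * params * tokens

n = 70e9                          # a 70B-parameter model
d = chinchilla_optimal_tokens(n)  # ~1.4e12 tokens (1.4 trillion)
print(f"{d:.1e} tokens, {training_flops(n, d):.1e} FLOPs")

# "Overtraining" a smaller model: feed a 7B model far more than its
# Chinchilla-optimal ~140B tokens and it can rival larger, less-trained models.
print(f"{training_flops(7e9, 2e12):.1e} FLOPs for 7B params on 2T tokens")
```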
The second thing is that when we talk about models, there are models for slightly different purposes. You have the base large language models, and all they're trained to do, practically speaking, is complete text. Literally how you train them is you give them text and say, guess the next letter, and then you tell them, nope, that was wrong, or yes, that was right, and it backpropagates to update how they predict. And they're really good at that, completing text. But that's not quite the same as what you want from a chatbot, or from a model that you can tell to do something. So there's usually another step afterwards, which is called fine-tuning for instruction following, or for chat specifically, where basically I tell the model, look, if somebody asks you to come up with a list of to-dos, or a list of steps for how to make pizza, this is roughly what I expect you to answer. These models are very good at learning these things. So basically you first train them to just complete text, and then you train them how to react to human requests and instructions; that's the instruction fine-tuning.
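That base training objective is compact enough to show directly. A minimal PyTorch sketch, with a toy stand-in model rather than a real transformer:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab = 32000
# Toy stand-in for a transformer: an embedding plus a linear head is
# enough to demonstrate the objective (a real LLM replaces this).
model = nn.Sequential(nn.Embedding(vocab, 64), nn.Linear(64, vocab))

token_ids = torch.randint(0, vocab, (4, 128))  # a batch of tokenized text
logits = model(token_ids)                      # (batch, seq_len, vocab)

# "Guess the next token": the prediction at position t is scored against
# the actual token at position t+1, and the error is backpropagated.
loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab),
    token_ids[:, 1:].reshape(-1),
)
loss.backward()
print(f"next-token loss: {loss.item():.3f}")
```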
Llama, for example, was a Facebook model where they published the weights for researchers, and then some people took that and fine-tuned it, meaning they took a bunch of instruction-following examples to turn it into a much, much nicer model in terms of interacting with it, much more useful for humans. And so the biggest challenge we have at the moment on the open source side is that there's currently no large open source LLM out there. GPT-3 was 175 billion parameters, and there's currently nothing in that weight class that's open source and that people could use to fine-tune, or to play with, or modify.
It's worth noting that since this recording, several more open source models have been released, including Llama 2, with 70 billion parameters and an open license, unlike its predecessor Llama 1. Another 40-billion-parameter open source model, Falcon, was released as well. But both of these are still dwarfed in parameters compared to closed models like OpenAI's GPT-3, at 175 billion parameters, or GPT-4, at an estimated 1.8 trillion parameters, although the latter is speculated to be a collection of multiple smaller models. However, parameter count isn't the only driver of performance. For example, while Llama 2 has fewer parameters than GPT-3, its performance is actually much better due to being trained on more data. In fact, Llama 2 is currently comparable to GPT-3's successor, GPT-3.5, the current default for ChatGPT.
And as these models continue to get larger, we may actually see some models compress, becoming more efficient and enabling inference on your device.

Stable Diffusion can run on your computer's GPU. Do we expect to see more of that? Because right now they're all hosted by these companies: they're trained by these companies on their dedicated servers, and then even when you interface with ChatGPT, it's running that inference for you. Do we expect to see that change at all as compute becomes cheaper, maybe more decentralized, or how would you think about that?

That's a really good question, and we're speculating a little bit here, but my guess is we will. We're seeing some of these smaller models getting pretty good; they run on your laptop or even your phone. There are essentially Stable Diffusion implementations that run well on phones, which I would have never thought. They take a couple of tens of seconds to create an image, which is comparatively slow, but there are certain applications where that's acceptable. So my guess is, as both the devices get faster and the models get more optimized, this will be a trend that we see more and more, and in the future it might just be part of the operating system to have a basic large language model or a basic image generation model.
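For a sense of what "runs on your computer's GPU" looks like in practice, here's a minimal local-inference sketch using the Hugging Face diffusers library; the checkpoint id is one common public choice, and the device string depends on your machine.

```python
import torch
from diffusers import StableDiffusionPipeline

# Download the weights once, then inference runs entirely on local hardware.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # a common public checkpoint
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # or "mps" on Apple Silicon

image = pipe("a watercolor of a data center at sunset").images[0]
image.save("local_inference.png")  # no cloud bill for this image
```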
We've talked about how expensive compute can be and how, ultimately, that can be a major line item for companies. And I guess model training will probably remain with those companies and not necessarily on folks' devices, but in terms of the inference, I assume that's still a pretty significant cost. And in a way, if someone is able to run that locally, doesn't that free the company from having to pay for that compute, because it's running on, let's say, someone's MacBook GPU?

Oh yeah, totally. I mean, look, if I can generate an image on my phone directly, all it takes is some battery power and it gets a little warm, and that's it, so that's a huge advantage. At the same time, there's probably going to be a little bit of bifurcation there around quality and parameters. You can run things locally, but you can probably run them a lot better in the cloud, because you have a much bigger server there. So it probably depends on what you want to do. If I just want a better spell checker that checks my email, or maybe just some simple completion, that's perfectly fine; I can run that on my phone. On the other hand, if I want something that can write a good speech or summarize a complex text, then it might be that I'm going to run that in the cloud, because it takes so many more operations. Hopefully this is getting...
Here is Guido speaking to how this presents a fundamentally new stack, and what that means in terms of opportunity.

It feels like this really is this massive wave, this renaissance of innovation.

It's full of opportunities. I mean, we're rebuilding a stack. You can look at AI just as a new application, but honestly, I think it's probably better to look at it as a different type of compute. We traditionally built software by composing algorithms that we understand, where the end result was a program, so it was constructed bottom-up. Now we have a second type of compute, where we're just training a large neural network, and the big advantage is we don't actually need to know how to solve a problem: as long as the neural network can figure it out, we're fine. And that opens up a bunch of possibilities, but it also means you need a completely different stack in terms of all the different pieces. You probably want vector DBs to retrieve context, you want different types of hosting providers that are good at hosting these models and providing them to you as a service. It's a whole Cambrian explosion of creativity, a whole new ecosystem forming, and I think there's a ton of opportunities to build great companies.
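On the vector DB piece of that new stack: the core operation is similarity search over embeddings. A toy sketch of the idea, with random vectors standing in for a real embedding model (a production system would use a dedicated vector database and real embeddings):

```python
import numpy as np

# Documents a model might need as context, embedded as vectors. Random
# vectors stand in for a real embedding model here.
docs = ["GPUs excel at parallel math",
        "Fabs take years and billions to build",
        "Fine-tuning adapts a base model to a domain"]
emb = np.random.rand(len(docs), 384)
emb /= np.linalg.norm(emb, axis=1, keepdims=True)  # normalize for cosine

def retrieve(query_vec: np.ndarray, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query embedding."""
    scores = emb @ (query_vec / np.linalg.norm(query_vec))
    return [docs[i] for i in np.argsort(-scores)[:k]]

print(retrieve(np.random.rand(384)))  # a real app embeds the user's query
```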
I think that paints a pretty incredible picture of opportunity across the stack. And as many of these trends continue to progress, like supply and demand, the calculus of renting versus owning compute, and closed versus open source models, we look to part three of the series to answer a very important question: how much does all of this cost? We'll explore all of this in depth, including how much startups are really spending on AI compute and whether that's sustainable, how much it really costs to train a model like GPT-3, the difference in cost between training and inference, and how all of this will change with time. We will see you there.
Thank you so much for listening to the a16z podcast. What we're trying to do here is provide an informed, clear-eyed, but also optimistic take on technology and its future, and we're trying to do that by featuring some of the most inspiring people and the things that they're building. So if that is interesting to you and you'd like to join us on this journey, go ahead and click subscribe, and make sure to let us know in the comments below what you'd like to see us cover next. Thank you so much for listening, and we'll see you next time.