00:36 I am a visionary, illuminating galaxies to witness the birth of stars, and sharpening our understanding of extreme weather events. I am a helper, guiding the blind through a crowded world. "I was thinking about running to the store." And giving voice to those who cannot speak.
01:22 I am a transformer, harnessing gravity to store renewable power, and paving the way towards unlimited clean energy. I am a trainer, teaching robots to assist, to watch out for danger, and to help save lives.
02:08 I am a healer, providing a new generation of cures and new levels of patient care. "Doctor, I am allergic to penicillin. Is it still okay to take the medications?" "Definitely. These antibiotics don't contain penicillin, so it's perfectly safe for you to take them." I am a navigator, generating virtual scenarios to let us safely explore the world and understand every decision. I even helped write the script, breathe life into the words. I am AI, brought to life by NVIDIA, deep learning, and brilliant minds everywhere. Please welcome to the stage NVIDIA founder and CEO Jensen Huang.
04:00 I hope you realize this is not a concert. You have arrived at a developers conference. There will be a lot of science described: algorithms, computer architecture, mathematics. I sensed a very heavy weight in the room all of a sudden, almost like you were in the wrong place. No conference in the world has a greater assembly of researchers from such diverse fields of science, from climate tech to radio sciences, trying to figure out how to use AI to robotically control MIMOs for next-generation 6G radios, robotic self-driving car intelligence, even artificial general intelligence.
05:10 Everybody's first, I noticed a sense of relief there, all of a sudden. Also, this conference is represented by some amazing companies. This list: this is not the attendees; these are the presenters. And what's amazing is this: if you take away all of my friends, close friends (Michael Dell is sitting right there) in the IT industry, all of the friends I grew up with in the industry, if you take away that list, this is what's amazing. These are the presenters of the non-IT industries, using accelerated computing to solve problems that normal computers can't. It's represented in life sciences, healthcare, genomics, transportation, of course retail, logistics, manufacturing, industrial: the gamut of industries represented is truly amazing. And you're not here to attend only; you're here to present, to talk about your research. $100 trillion of the world's industries is represented in this room today. This is absolutely
06:44 amazing. There is absolutely something happening. There is something going on. The industry is being transformed, not just ours, because the computer industry, the computer, is the single most important instrument of society today. Fundamental transformations in computing affect every industry. But how did we start? How did we get here? I made a little cartoon for you; literally, I drew this in one page. This is NVIDIA's journey, started in 1993. This might be the rest of the talk. 1993, this is our journey. We were founded in 1993. There are several important events that happened along the way; I'll just highlight a few. In 2006, CUDA, which has turned out to have been a revolutionary computing model. We thought it was revolutionary then. It was going to be an overnight success, and almost 20 years later, it happened.
08:08 In 2016, recognizing the importance of this computing model, we invented a brand new type of computer we called the DGX-1: 170 teraflops in this supercomputer, eight GPUs connected together for the very first time. I hand-delivered the very first DGX-1 to a startup located in San Francisco called OpenAI. DGX-1 was the world's first AI supercomputer. Remember: 170 teraflops. In 2017, the Transformer arrived. In 2022, ChatGPT captured the world's imagination, and people realized the importance and the capabilities of artificial intelligence. And in 2023, generative AI emerged, and a new industry begins.
09:12 Why? Why is it a new industry? Because the software never existed before. We are now producing software, using computers to write software, producing software that never existed before. It is a brand new category. It took share from nothing. It's a brand new category, and the way you produce the software is unlike anything we've ever seen: generating tokens, producing floating-point numbers at very large scale. It's as if, in the beginning of this last industrial revolution, people realized that you could build factories, apply energy to them, and this invisible, valuable thing called electricity came out: AC generators. And 100 years later, 200 years later, we are now creating new types of electrons, tokens, using infrastructure we call factories, AI factories, to generate this new, incredibly valuable thing called artificial intelligence. A new industry has emerged. Well, we're going to talk
10:30 about many things about this new industry. We're going to talk about how we're going to do computing next. We're going to talk about the type of software that you build because of this new industry, this new software; how you would think about this new software; applications in industry; and then maybe what's next, and how we can start preparing today for what is about to come next. Well, but before I start, I want to show you the soul of NVIDIA, the soul of our company, at the intersection of computer graphics, physics, and artificial intelligence, all intersecting inside a computer, in Omniverse, in a virtual world simulation. Everything we're going to show you today, literally everything, is a simulation, not animation. It's only beautiful because it's physics. The world is beautiful. It's only amazing because it's being animated with robotics, being animated with artificial intelligence. What you're about to see all day is completely generated, completely simulated in Omniverse, and all of it, what you're about to enjoy, is the world's first concert where everything is homemade. You're about to watch some home videos, so sit back and enjoy yourself.
15:03 NVIDIA accelerated computing has reached the tipping point. General-purpose computing has run out of steam. We need another way of doing computing, so that we can continue to scale, so that we can continue to drive down the cost of computing, so that we can continue to consume more and more computing while being sustainable. Accelerated computing is a dramatic speedup over general-purpose computing, and in every single industry we engage, and I'll show you many, the impact is dramatic. But in no industry is it more important than our own, the industry of using simulation tools to create products. In this industry, it is not about driving down the cost of computing; it's about driving up the scale of computing. We would like to simulate the entire product that we make, completely in full fidelity, completely digitally, in essentially what we call digital twins. We would like to design it, build it, simulate it, and operate it, completely digitally. In order to do that, we need to accelerate an entire industry, and today I would like to announce that we have some partners who are joining us in this journey to accelerate their entire ecosystem, so that we can bring the world into accelerated computing. But there's a bonus: when you become accelerated, your infrastructure is CUDA GPUs, and when that happens, it's exactly the same infrastructure as for generative
16:50 AI. And so I'm just delighted to announce several very important partnerships. These are some of the most important companies in the world. Ansys does engineering simulation for what the world makes. We're partnering with them to CUDA-accelerate the Ansys ecosystem, to connect Ansys to the Omniverse digital twin. Incredible. The thing that's really great is that the installed base of NVIDIA GPU-accelerated systems is all over the world, in every cloud, in every system, all over enterprises, and so the applications they accelerate will have a giant installed base to go serve. End users will have amazing applications, and of course system makers and CSPs will
17:35 Synopsys. Synopsys is literally NVIDIA's first software partner. They were there on the very first day of our company. Synopsys revolutionized the chip industry with high-level design. We are going to CUDA-accelerate Synopsys. We're accelerating computational lithography, one of the most important applications that nobody's ever known about. In order to make chips, we have to push lithography to its limit. NVIDIA has created a library, a domain-specific library, that accelerates computational lithography incredibly. Once we can accelerate and software-define all of TSMC, who is announcing today that they're going to go into production with NVIDIA cuLitho, once it's software-defined and accelerated, the next step is to apply generative AI to the future of semiconductor manufacturing, pushing geometry even
18:31 further. Cadence builds the world's essential EDA and SDA tools. We also use Cadence. Between these three companies, Ansys, Synopsys, and Cadence, we basically build NVIDIA together. We are CUDA-accelerating Cadence. They're also building a supercomputer out of NVIDIA GPUs so that their customers can do fluid dynamics simulation at a hundred, a thousand times the scale: basically, a wind tunnel in real time. Cadence Millennium, a supercomputer with NVIDIA GPUs inside. A software company building supercomputers; I love seeing that. We're building Cadence copilots together. Imagine a day when Cadence, Synopsys, and Ansys tool providers would offer you AI copilots, so that we have thousands and thousands of copilot assistants helping us design chips, design systems. And we're also going to connect the Cadence digital twin platform to Omniverse. As you can see the trend here: we're accelerating the world's CAE, EDA, and SDA, so that we can create our future in digital twins, and we're going to connect them all to Omniverse, the fundamental operating system for future digital
19:50 twins. One of the industries that benefited tremendously from scale, and you all know this one very well, is large language models. Basically, after the Transformer was invented, we were able to scale large language models at incredible rates, effectively doubling every six months. Now, how is it possible that by doubling every six months we have grown the industry, we have grown the computational requirements, so far? And the reason is quite simply this: if you double the size of the model, you double the size of your brain, and you need twice as much information to go fill it. And so every time you double your parameter count, you also have to appropriately increase your training token count. The combination of those two numbers becomes the computation scale you have to support. The latest, state-of-the-art OpenAI model is approximately 1.8 trillion parameters. 1.8 trillion parameters required several trillion tokens to go train. So a few trillion parameters, on the order of a few trillion tokens: when you multiply the two of them together, approximately 30, 40, 50 billion quadrillion floating-point operations. Now we just have
21:16 to do some CEO math right now, so just hang with me. You have 30 billion quadrillion. A quadrillion is like a peta, and so if you had a one-petaflop GPU, you would need 30 billion seconds to go compute, to go train that model. 30 billion seconds is approximately 1,000 years. Well, 1,000 years. It's worth it. I'd like to do it sooner, but it's worth it. Which is usually my answer when most people ask me, hey, how long is it going to take to do something? 20 years? It's worth it. But can we do it next week? And so: 1,000 years. 1,000 years.
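That arithmetic can be checked in a few lines. A back-of-the-envelope sketch: the 6-FLOPs-per-parameter-per-token rule of thumb and the exact token count are assumptions for illustration, not figures from the talk.

```python
# Rough check of the "CEO math": 1.8T parameters, a few trillion tokens.
params = 1.8e12              # model parameters (as stated)
tokens = 3e12                # training tokens (assumed; "a few trillion")
flops = 6 * params * tokens  # ~6 FLOPs per parameter per token (rule of thumb)

PETA = 1e15
print(f"total compute: {flops / PETA / 1e9:.0f} billion petaFLOPs")
seconds = flops / PETA       # wall time on a single 1-petaFLOP/s GPU
print(f"{seconds / 1e9:.0f} billion seconds, roughly {seconds / 3.15e7:,.0f} years")
```

With these round numbers, the total lands in the 30-to-50-billion-quadrillion range quoted on stage, and the single-GPU wall time comes out near the 1,000-year mark.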
22:09 What we need are bigger GPUs. We need much, much bigger GPUs. We recognized this early on, and we realized that the answer is to put a whole bunch of GPUs together, and of course to innovate a whole bunch of things along the way, like inventing Tensor Cores and advancing NVLink, so that we could create essentially virtually giant GPUs, and connecting them all together with amazing networks from a company called Mellanox, with InfiniBand, so that we could create these giant systems. And so DGX-1 was our first version, but it wasn't the last. We built supercomputers all along the way. In 2021, we had Selene, 4,500 GPUs or so, and then in 2023 we built one of the largest AI supercomputers in the world. It just came online: EOS. And as we're building these things, we're trying to help the world build these things, and in order to help the world build these things, we've got to build them first. We build the chips, the systems, the networking, all of the software necessary to do this. You should see these systems. Imagine writing a piece of software that runs across the entire system, distributing the computation across thousands of GPUs, but inside are thousands of smaller GPUs, millions of GPUs, to distribute work across, and balancing the workload so that you can get the most energy efficiency, the best computation time, and keep your cost down. Those fundamental innovations are what got us here. And here we
23:55 are. As we see the miracle of ChatGPT emerge in front of us, we also realize we have a long way to go. We need even larger models. We're going to train them with multimodality data, not just text on the internet: we're going to train them on text and images, graphs and charts, just as we learned from watching TV. And so there's going to be a whole bunch of watching video, so that these models can be grounded in physics and understand that an arm doesn't go through a wall. And so these models will have common sense, by watching a lot of the world's video combined with a lot of the world's languages. They'll use things like synthetic data generation, just as you and I do when we try to learn: we might use our imagination to simulate how it's going to end up, just as I did when I was preparing for this keynote. I was simulating it all along the way. I hope it's going to turn out as well as I had it in my
25:05 head. As I was simulating how this keynote was going to turn out, somebody did say that another performer did her performance completely on a treadmill so that she could be in shape to deliver it with full energy. I didn't do that. If I get a little winded about 10 minutes into this, you know what happened. And so, where were we? We're sitting here using synthetic data generation. We're going to use reinforcement learning. We're going to practice it in our mind. We're going to have AI working with AI, training each other, just like student and teacher debaters. All of that is going to increase the size of our model, it's going to increase the amount of data that we have, and we're going to have to build even bigger GPUs. Hopper is fantastic, but we need bigger GPUs. And so, ladies and gentlemen, I would like to introduce you to a very, very big GPU, named after David Blackwell: mathematician, game theorist, probability. We thought it was a perfect name. Blackwell, ladies and gentlemen.
29:17 Blackwell is not a chip. Blackwell is the name of a platform. People think we make GPUs, and we do, but GPUs don't look the way they used to. Here is, if you will, the heart of the Blackwell system. This, inside the company, is not called Blackwell; it's just a number. And this is Blackwell, sitting next to, oh, this is the most advanced GPU in the world in production today. This is Hopper. This is Hopper. Hopper changed the world. Hopper, you're very good. 208 billion transistors. And so you could see, I can see, that there's a small line between two dies. This is the first time two dies have abutted like this together, in such a way that the two dies think it's one chip. There's 10 terabytes of data between them, 10 terabytes per second, so that these two sides of the Blackwell chip have no clue which side they're on. There's no memory locality issues, no cache issues; it's just one giant chip. And so, when we were told that Blackwell's ambitions were beyond the limits of physics, the engineers said, so what? And so this is what happened. And so
31:14 this is the Blackwell chip, and it goes into two types of systems. The first one is form-fit-function compatible to Hopper: you slide out Hopper, and you push in Blackwell. That's the reason why the ramp is going to be so efficient. There are installations of Hoppers all over the world, and they can keep the same infrastructure, same design; the power, the electricity, the thermals, the software: identical. Push it right back in. And so this is a Hopper version for the current HGX configuration. And this is what the second one looks like. This is now a prototype board. Janine, may I borrow it? Ladies and gentlemen, Janine Paul. And so this is a fully functioning board, and I'll just be careful here. This, right here, is, I don't know, $10 billion. The second one's five. It gets cheaper after that, so any customers in the audience, it's okay. All right, but this one's quite expensive. This is a bring-up board, and the way it's going to go to production is like this one
32:50 here. Okay, and so you're going to take this. It has two Blackwell chips, four Blackwell dies, connected to a Grace CPU. The Grace CPU has a super-fast chip-to-chip link. What's amazing is that this computer is the first of its kind where this much computation, first of all, fits into this small of a place. Second, it's memory coherent. They feel like they're just one big happy family, working on one application together, and so everything is coherent within it. Just the amount of, you know, you saw the numbers, there's a lot of terabytes this and terabytes that, but this is a miracle. Let's see, what are some of the things on here? There's NVLink on top, PCI Express on the bottom, and on your, which one is mine and which is yours? Your left, one of them. It doesn't matter. One of them is the CPU chip-to-chip link. Is it my left or your left, depending on which side? I was just trying to sort that out, and it just kind of doesn't matter. Hopefully it comes plugged in. So, okay. So this is the Grace Blackwell system. But there's
34:34 more. So it turns out all of the specs are fantastic, but we need a whole lot of new features in order to push the limits beyond, if you will, the limits of physics. We would like to always get a lot more X-factors. And so one of the things that we did was invent another Transformer engine, the second-generation Transformer engine. It has the ability to dynamically and automatically rescale and recast numerical formats to a lower precision whenever it can. Remember, artificial intelligence is about probability, and so you kind of have, you know, approximately 1.7 times approximately 1.4, to be approximately something else. Does that make sense? And so the ability for the mathematics to retain the precision and the range necessary in that particular stage of the pipeline is super important. And so it's not just about the fact that we designed a smaller ALU; the world's not quite that simple. You've got to figure out when you can use that, across a computation that spans thousands of GPUs and runs for weeks and weeks on end, and you want to make sure that the training job is going to converge.
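The idea can be sketched in a few lines. This is a toy illustration of per-tensor dynamic-range scaling, the general technique behind mixed-precision engines; the E4M3 maximum is the standard FP8 constant, but the function names and the clamp-only rounding are simplifications of ours, not NVIDIA's implementation.

```python
# Toy sketch: give each tensor a scale so its values fit the narrow range
# of an 8-bit float before casting down. Clamp only; real hardware also
# rounds the mantissa.
FP8_E4M3_MAX = 448.0  # largest finite magnitude in the E4M3 8-bit format

def scale_for(tensor):
    """Scale that maps the tensor's largest magnitude onto the format max."""
    amax = max(abs(x) for x in tensor)
    return FP8_E4M3_MAX / amax if amax > 0 else 1.0

def round_trip(tensor):
    """Scale into the representable range, clamp, and scale back up."""
    s = scale_for(tensor)
    clamped = [max(-FP8_E4M3_MAX, min(x * s, FP8_E4M3_MAX)) for x in tensor]
    return [x / s for x in clamped]

activations = [0.003, -1.7, 0.25, 4.2]
print(round_trip(activations))  # survives the trip thanks to the scale
```

The hard part, as the talk says, is deciding per pipeline stage when this narrower representation still preserves enough range and precision for a weeks-long run to converge.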
35:59 And so this new Transformer engine. We have a fifth-generation NVLink. It's now twice as fast as Hopper, but very importantly, it has computation in the network. And the reason for that is this: when you have so many different GPUs working together, we have to share our information with each other, we have to synchronize and update each other, and every so often we have to reduce the partial products and then rebroadcast the sum of the partial products back to everybody else. And so there's a lot of what is called all-reduce and all-to-all and all-gather; it's all part of this area of synchronization and collectives, so that we can have GPUs working with each other. Having extraordinarily fast links, and being able to do mathematics right in the network, allows us to essentially amplify even further. So even though it's 1.8 terabytes per second, it's effectively higher than that.
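In software terms, the collective being described looks like the minimal sketch below; this is not how the NVLink switch implements it (the point of in-network computation is that the switch performs the sum in the fabric instead of the GPUs doing it).

```python
# Minimal all-reduce: every rank contributes a partial gradient, and every
# rank ends up holding the identical elementwise sum.
def all_reduce(partials):
    n = len(partials[0])
    total = [sum(p[i] for p in partials) for i in range(n)]  # reduce
    return [list(total) for _ in partials]                   # rebroadcast

grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # three GPUs, two parameters
print(all_reduce(grads))  # every GPU now holds [9.0, 12.0]
```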
36:53 It's many times that of Hopper. The likelihood of a supercomputer running for weeks on end is approximately zero, and the reason is that there are so many components working at the same time; statistically, the probability of all of them working continuously is very low.
37:14 We need to make sure that whenever there is a failure, well, we checkpoint and restart as often as we can. But if we have the ability to detect a weak chip or a weak node early, we can retire it, and maybe swap in another processor. That ability to keep the utilization of the supercomputer high, especially when you just spent $2 billion building it, is super important. And so we put in a RAS engine, a reliability engine, that does 100% self-test, in-system test, of every single gate, every single bit of memory on the Blackwell chip, and all the memory that's connected to it. It's almost as if we shipped, with every single chip, its own advanced tester, the kind we test our chips with. This is the first time we're doing this. Super excited about it. Secure AI. Only at this conference do they clap for secure AI. Obviously, you've
38:32 just spent hundreds of millions of dollars creating a very important AI, and the code, the intelligence of that AI, is encoded in the parameters. You want to make sure that, on the one hand, you don't lose it, and on the other hand, it doesn't get contaminated. And so we now have the ability to encrypt data, of course at rest, but also in transit, and while it's being computed: it's all encrypted. So we now have the ability to encrypt in transmission, and when we're computing it, it is in a trusted, trusted environment, a trusted-engine environment. And the last thing is decompression. Moving data in and out of these nodes when the compute is so fast is essential, and so we've put in a high-line-speed compression engine that effectively moves data 20 times faster in and out of these computers. These computers are so powerful, and there's such a large investment, that the last thing we want to do is have them be idle. And so all of these capabilities are intended to keep Blackwell fed, and as busy as
39:46 possible. Overall, compared to Hopper, it is two and a half times, two and a half times the FP8 performance for training, per chip. It also has this new format called FP6, so that even though the computation speed is the same, the effective bandwidth is amplified: because of the memory savings, the amount of parameters you can store in the memory is now amplified. FP4 effectively doubles the throughput. This is vitally important for inference. One of the things that is becoming very clear is that whenever you use a computer with AI on the other side, when you're chatting with the chatbot, when you're asking it to make an image, remember: in the back is a GPU generating tokens. Some people call it inference, but it's more appropriately called generation. The way that computing was done in the past was retrieval. You would grab your phone, you would touch something, some signals go off, basically an email goes off to some storage somewhere. There's pre-recorded content: somebody wrote a story, or somebody made an image, or somebody recorded a video. That pre-recorded content is then streamed back to the phone and recomposed, based on a recommender system, in a way that presents the information to
41:16 you. You know that in the future, the vast majority of that content will not be retrieved, and the reason is that it was pre-recorded by somebody who doesn't understand the context, which is the reason why we have to retrieve so much content. If you can be working with an AI that understands the context, who you are, for what reason you're fetching this information, and it produces the information for you just the way you like it, the amount of energy we save, the amount of networking bandwidth we save, the amount of wasted time we save, will be tremendous. The future is generative, which is the reason why we call it generative AI, which is the reason why this is a brand new industry. The way we compute is fundamentally different. We created a processor for the generative AI era, and one of the most important parts of it is content token generation. We call this format FP4. Well, that's a lot of computation.
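A rough capacity illustration of why the narrower formats matter; the 192 GiB memory figure is an assumption for the example, not a quoted spec. Halving the bits per weight doubles both the parameters that fit in a fixed memory and the weights moved per unit of bandwidth.

```python
def params_that_fit(memory_gib, bits_per_param):
    """How many parameters fit in a fixed memory budget at a given width."""
    return memory_gib * 2**30 * 8 // bits_per_param

HBM_GIB = 192  # assumed accelerator memory, for illustration only
for bits, name in [(16, "FP16"), (8, "FP8"), (6, "FP6"), (4, "FP4")]:
    print(f"{name}: {params_that_fit(HBM_GIB, bits) / 1e9:.0f}B parameters")
```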
42:24 5x the token generation, 5x the inference capability of Hopper. Seems like enough. But why stop there? The answer is, it's not enough, and I'm going to show you why. I'm going to show you why. And so we would like to have a bigger GPU, even bigger than this one. And so we decided to scale it. But first, let me just tell you how we've scaled. Over the course of the last eight years, we've increased computation by 1,000 times. Eight years, 1,000 times. Remember back in the good old days of Moore's Law: 2x every, well, 5x every, what, 10x every 5 years. That's the easiest math: 10x every 5 years, 100 times every 10 years. 100 times every 10 years, in the heyday of the PC revolution. In the last 8 years, we've gone 1,000 times. We have two more years to go. And so that puts it in perspective.
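Those growth rates are easiest to compare as doubling times; a quick sketch of the conversion, using the talk's round numbers (the framing is ours).

```python
import math

def doubling_time(total_factor, years):
    """Years per 2x, given total growth `total_factor` over `years`."""
    return years / math.log2(total_factor)

print(f"Moore's-law era, 10x per 5 years: 2x every {doubling_time(10, 5):.1f} years")
print(f"Last 8 years, 1000x: 2x every {doubling_time(1000, 8):.1f} years")
```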
43:43 The rate at which we're advancing computing is insane, and it's still not fast enough. So we built another
43:49 chip. This chip is just an incredible chip. We call it the NVLink Switch. It's 50 billion transistors. It's almost the size of Hopper all by itself. This switch chip has four NVLinks in it, each 1.8 terabytes per second, and, as I mentioned, it has computation in it. What is this chip for? If we were to build such a chip, we can have every single GPU talk to every other GPU at full speed, at the same time. That's insane. It doesn't even make sense. But if you could do that, if you can find a way to do that, and build a system to do that, that's cost-effective, how incredible would it be that we could have all these GPUs connect over a coherent link, so that they effectively are one giant GPU? Well, one of the great inventions, in order to make it cost-effective, is that this chip has to drive copper directly. The SerDes of this chip is just a phenomenal invention, so that we could do direct drive to copper, and as a result, you can build a system that looks
45:30 Now, this system, this system is kind of insane. This is one DGX. This is what a DGX looks like now. Remember, just six years ago it was pretty heavy, but I was able to lift it. I delivered the first DGX-1 to OpenAI and the researchers there; the pictures are on the internet, and we all autographed it. If you come to my office, it's autographed there. It's really beautiful. But you could lift it. This DGX, that DGX, by the way, was 170 teraflops. If you're not familiar with the numbering system, that's 0.17 petaflops. So this is 720. The first one I delivered to OpenAI was 0.17; you could round it up to 0.2, it won't make any difference. And back then it was like, wow, you know, 30 more teraflops. And so this is now 720 petaflops, almost an exaflop for training, and the world's first one-exaflop machine in one rack. Just so you know, there are only a couple, two, three, exaflop machines on the planet as we speak. And so this is an exaflop AI system in one single rack.
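For the units in that comparison (tera = 10^12, peta = 10^15, exa = 10^18), the numbers line up as follows.

```python
TERA, PETA, EXA = 1e12, 1e15, 1e18

dgx1_2016 = 170 * TERA  # the first DGX-1: 170 teraflops
rack_now = 720 * PETA   # this rack: 720 petaflops

print(dgx1_2016 / PETA)             # 0.17 petaflops, rounds up to 0.2
print(rack_now / EXA)               # 0.72, "almost an exaflop"
print(round(rack_now / dgx1_2016))  # ~4235x between the two machines
```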
47:09 Well, let's take a look at the back of it. So this is what makes it possible. That's the back. That's the back: the DGX NVLink spine. 130 terabytes per second goes through the back of that chassis. That is more than the aggregate bandwidth of the internet. So we could basically send everything to everybody within a second. And so we have 5,000 cables, 5,000 NVLink cables, in total two miles. Now, this is the amazing thing: if we had to use optics, we would have had to use transceivers and retimers, and those transceivers and retimers alone would have cost 20,000 watts, 20 kilowatts, of just transceivers, just to drive the NVLink spine. As a result, we did it completely for free over the NVLink switch, and we were able to save the 20 kilowatts for computation. This entire rack is 120 kilowatts, so that 20 kilowatts makes a huge difference. It's liquid cooled. What goes in is 25 degrees C, about room temperature. What comes out is 45 degrees C, about your jacuzzi. So room temperature goes in, jacuzzi comes out, two liters per second. We could sell a
48:58 600,000 parts. Somebody used to say, you know, you guys make GPUs, and we do, but this is what a GPU looks like to me. When somebody says GPU, I see this. Two years ago, when I saw a GPU, it was the HGX: it was 70 pounds, 35,000 parts. Our GPUs now are 600,000 parts and 3,000 pounds. 3,000 pounds. 3,000 pounds. That's kind of like the weight of a, you know, Ferrari. I don't know if that's a useful metric, but everybody's going, I feel it, I feel it, I get it, I get that. Now that you mention that, I feel it. I don't know what's 3,000 pounds. Okay, so 3,000 pounds, a ton and a half. So not quite an elephant. So this is what a DGX looks
49:56like now let's see what it looks like in
49:58operation okay let's imagine what is
50:00what how do we put this to work and what
50:01does that mean well if you were to train
50:03a GPT model 1.8 trillion parameter
50:08model it took it took about apparently
50:11about you know 3 to 5 months or so uh
50:13with 25,000 amp uh if we were to do it
50:16with hopper it would probably take
50:17something like 8,000 gpus and it would
50:20consume 15 megawatts 8,000 gpus on 15
50:23megawatts it would take 90 days about 3
50:25months and that would allow you to
50:27train something that is you know this
50:30groundbreaking AI model and this is
50:34obviously not as expensive as as um as
50:37anybody would think but it's 8,000 8,000
50:39gpus it's still a lot of money and so
50:418,000 gpus 15 megawatts if you were to
50:44use Blackwell to do this it would only
50:49take 2,000 gpus 2,000 gpus same 90 days but this is
50:54the amazing part only 4 megawatts of power
50:58so from 15 megawatts down to 4 yeah that's
51:04right and that's our goal our
51:07goal is to continuously drive down the
51:10cost and the energy they're directly
51:11proportional to each other cost and
51:13energy associated with the Computing so
51:15that we can continue to expand and scale
51:17up the computation that we have to do to
51:20train the Next Generation models well
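The comparison above (Hopper: 8,000 GPUs at 15 MW; Blackwell: 2,000 GPUs at 4 MW; roughly 90 days either way) can be checked with quick arithmetic:

```python
# Figures quoted in the talk: training a 1.8T-parameter GPT-class model
# takes ~90 days either way; Hopper needs 8,000 GPUs at 15 MW,
# Blackwell needs 2,000 GPUs at 4 MW.
days = 90
hopper = {"gpus": 8000, "megawatts": 15}
blackwell = {"gpus": 2000, "megawatts": 4}

def energy_mwh(cfg):
    # Sustained power times wall-clock hours gives energy used.
    return cfg["megawatts"] * days * 24

print(energy_mwh(hopper))                          # → 32400 MWh
print(energy_mwh(blackwell))                       # → 8640 MWh
print(hopper["gpus"] / blackwell["gpus"])          # → 4.0 (4x fewer GPUs)
print(energy_mwh(hopper) / energy_mwh(blackwell))  # → 3.75 (energy ratio)
```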
51:23training inference or generation
51:27is vitally important going forward you
51:29know probably some half of the time that
51:31Nvidia gpus are in the cloud these days
51:33it's being used for token generation you
51:36know they're either doing co-pilot this
51:37or chat you know chat GPT that or um all
51:40these different models that are being
51:41used when you're interacting with it or
51:44generating IM generating images or
51:46generating videos generating proteins
51:48generating chemicals there's a bunch of
51:50generation going on all of that is
51:53basically in the category of computing we call
51:57inference but inference is extremely hard for
51:59large language models because these
52:01large language models have several
52:03properties one they're very large and so
52:05it doesn't fit on one GPU this is
52:08Imagine Excel doesn't fit on one
52:11GPU you know and imagine some
52:13application you're running on a daily
52:15basis doesn't run doesn't fit on one
52:16computer like a video game doesn't fit
52:18on one computer and most in fact do and
52:23many times in the past in hyperscale
52:25Computing many applications for
52:27many people fit on the same computer and
52:29now all of a sudden this one inference
52:31application where you're interacting
52:33with this chatbot that chatbot requires
52:36a supercomputer in the back to run it
52:38and that's the future the future is
52:41generative with these chatbots and these
52:43chatbots are trillions of tokens
52:46trillions of parameters and they have to
52:49generate tokens at interactive rates now what
52:52does that mean well uh a token is roughly a
52:58word you know the
53:01uh you know space the final frontier
53:05these are the adventures that's like
53:09tokens okay I don't know if that's
53:16so you know the art of communication is
53:19selecting a good
53:22analogy yeah this is not going
53:28well everybody's like I don't know what he's talking
53:30about never seen Star Trek and so
53:34so here we are we're trying to generate
53:35these tokens when you're interacting
53:37with it you're hoping that the tokens
53:38come back to you as quickly as possible
53:40and as quickly as you can read it and so
53:42the ability to generate tokens is
53:44really important you have to parallelize
53:46the work of this model across many many
53:48gpus so that you could achieve several
53:51things one on the one hand you would
53:52like throughput because that throughput determines
53:57the overall cost per token of uh
54:00generating so your throughput dictates
54:03the cost of of uh delivering the service
54:06on the other hand you have the
54:08interactive rate which is another tokens
54:10per second metric this one per user and
54:13that has everything to do with quality
54:14of service and so these two things um uh
54:18compete against each other and we have
54:20to find a way to distribute work across
54:23all of these different gpus and parallelize
54:25it in a way that allows us to achieve
54:27both and it turns out the search space is
54:31enormous you know I told you what's
54:34involved and everybody's going oh
54:37dear I heard some gasp just now when I
54:40put up that slide you know so so this
54:43this right here the the y axis is tokens
54:45per second data center throughput the
54:48x-axis is tokens per second interactivity
54:51of the person and notice the upper right
54:53is the best you want interactivity to be
54:56high the number of tokens per second per
54:59user and you want the tokens per second
55:01per data center to be very high the
55:02upper right is terrific however
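These two competing rates can be seen in a toy serving model. The cost constants below are invented for illustration; only the shape of the tradeoff matters:

```python
# Toy serving model: each generation step emits one token per user in
# the batch, and costs a fixed overhead plus a per-token amount.
# Bigger batches raise total tokens/sec (cheaper service) but lower
# tokens/sec per user (worse interactivity).
def rates(batch_size, overhead_ms=10.0, per_token_ms=1.0):
    step_ms = overhead_ms + per_token_ms * batch_size
    per_user = 1000.0 / step_ms    # tokens/sec one user sees
    total = per_user * batch_size  # tokens/sec the data center delivers
    return per_user, total

for batch in (1, 8, 64):
    per_user, total = rates(batch)
    print(f"batch={batch:3d}  per-user={per_user:6.1f}  total={total:7.1f}")
```

Sweeping the batch size (and, in a real system, the parallelism layout) traces out exactly the kind of throughput-versus-interactivity frontier the slide shows.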
55:05it's very hard to do that and in order
55:08for us to search for the best
55:10answer across every single one of those
55:12intersections XY coordinates okay so you
55:15just look at every single XY coordinate
55:17all those blue dots came from some
55:20repartitioning of the software some
55:23optimizing solution has to go and figure
55:25out whether to use tensor
55:29parallel expert parallel pipeline
55:32parallel or data parallel and
55:34distribute this enormous model across
55:37all these different gpus and sustain the
55:40performance that you need this
55:42exploration space would be impossible if
55:45not for the programmability of nvidia's
55:47gpus and so we could because of Cuda
55:49because we have such Rich ecosystem we
55:51could explore this universe and find
55:54that green roof line it turns out that
55:57green roof line notice you got TP2 EP8
56:01DP4 it means tensor
56:05parallel across two gpus
56:08expert parallel across eight data
56:10parallel across four notice on the other
56:12end you got tensor parallel across 4 and
56:14expert parallel across 16 the
56:17configuration the distribution of that
56:19software it's a different different um
56:22runtime that would produce these
56:25different results and you have to go
56:27discover that roof line well that's just
56:29one model and this is just one
56:32configuration of a computer imagine all
56:34of the models being created around the
56:35world and all the different different um
56:38uh configurations of of uh systems that
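The partitioning search being described, across tensor parallel (TP), expert parallel (EP), pipeline parallel (PP), and data parallel (DP) splits, can be sketched as a toy enumeration. This is only an illustration of how fast the space grows, not NVIDIA's actual optimizer:

```python
from itertools import product

def valid_configs(num_gpus, max_degree=32):
    # Candidate degrees for each parallelism axis: divisors of the GPU count.
    degrees = [d for d in range(1, max_degree + 1) if num_gpus % d == 0]
    # Keep every (TP, EP, PP, DP) split that uses exactly all the GPUs.
    return [(tp, ep, pp, dp)
            for tp, ep, pp, dp in product(degrees, repeat=4)
            if tp * ep * pp * dp == num_gpus]

configs = valid_configs(64)
print(len(configs))             # → 80 candidate layouts for just 64 GPUs
print((2, 8, 1, 4) in configs)  # → True: the TP2/EP8/DP4 point named on the slide
```

Each such layout is a different runtime, and each blue dot on the plot is the measured result of one of them; a real autotuner also has to model memory, batch size, and interconnect to find the roofline.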
56:43available so now that you understand the
56:46basics let's take a look at inference of
56:52Blackwell compared to Hopper and this is the
56:55extraordinary thing in one generation
56:58because we created a system that's
57:01designed for trillion parameter gener
57:03generative AI the inference capability
57:06of Blackwell is off the
57:08charts and in fact it is some 30 times
57:18Hopper for large language models for large
57:21language models like ChatGPT and others
57:24like it the blue line is Hopper I gave
57:28you imagine we didn't change the
57:30architecture of Hopper we just made it a
57:33bigger chip we just used the latest you know
57:36greatest uh 10 terabytes
57:40per second we connected the two chips
57:42together we got this giant 208 billion
57:44parameter chip how would we have
57:46performed if nothing else changed and it
57:50performed wonderfully quite wonderfully and that's
57:52the purple line but not as great as it
57:55could be and and that's where the fp4
57:58tensor core the new Transformer engine
58:01and very importantly the NVLink switch come in
58:04and the reason for that is because all
58:06these gpus have to share the results
58:08partial products whenever they do
58:10all-to-all or all-gather whenever they
58:12communicate with each
58:14other that NVLink switch is
58:17communicating almost 10 times faster
58:20than what we could do in the past using
58:23networks Okay so Blackwell is going to
58:27be just an amazing system for
58:30generative AI and in the
58:33future in the future data centers are
58:36going to be thought of as I mentioned
58:38earlier as an AI Factory an AI Factory's
58:42goal in life is to generate revenues to generate
58:50intelligence in this facility not
58:53generating electricity as in AC
58:57generators of the last Industrial Revolution
58:59but in this Industrial Revolution the
59:00generation of intelligence and so this
59:03ability is super super important the
59:06excitement of Blackwell is really off
59:08the charts you know when we first when
59:10we first um uh you know this this is a
59:14year and a half ago two years ago I
59:16guess two years ago when we first
59:17started to to go to market with hopper
59:20you know we had the benefit of uh two
59:22CSPs uh who joined us in the launch
59:26and we were you know delighted um
59:31customers uh we have more
59:46now unbelievable excitement for
59:48Blackwell unbelievable excitement and
59:51there's a whole bunch of different
59:52configurations of course I showed you
59:54the configurations that slide into the
59:56hopper form factor so that's easy to
59:58upgrade I showed you examples that are
01:00:01liquid cooled that are the extreme
01:00:03versions of it one entire rack that's
connected by NVLink 72 uh
01:00:08Blackwell is going to be
01:00:12ramping to the world's AI companies of
01:00:16which there are so many now doing
01:00:18amazing work in different modalities the
csps every CSP is geared up all the OEMs and
01:00:27ODMs regional clouds sovereign AIs and
01:00:32telcos all over the world are signing up
01:00:34to launch with Blackwell
01:00:43this Blackwell would be the
01:00:46most successful product launch
01:00:48in our history and so I can't wait
01:00:51to see that um I want to thank I want to
01:00:53thank some partners that that are
01:00:54joining us in this uh AWS is gearing up
01:00:57for Blackwell they're uh they're going
01:00:59to build the first uh GPU with secure AI
01:01:02they're uh building out a 222 exaflop
01:01:06system you know just now when we
01:01:08animated uh just now the digital twin if
01:01:10you saw the the all of those clusters
01:01:12are coming down by the way that is not
01:01:16just art that is a digital twin of what
01:01:18we're building that's how big it's going
01:01:20to be besides infrastructure we're doing
01:01:22a lot of things together with AWS we're
01:01:24Cuda accelerating SageMaker AI we're
01:01:27Cuda accelerating Bedrock AI uh Amazon
01:01:30robotics is working with us uh using
01:01:32Nvidia Omniverse and Isaac Sim AWS
01:01:35Health has Nvidia Health Integrated into
01:01:38it so AWS has has really leaned into
01:01:42accelerated Computing uh Google is
01:01:44gearing up for Blackwell gcp already has
01:01:47A100s H100s T4s L4s a whole fleet of
01:01:51Nvidia Cuda gpus and they recently
01:01:53announced the Gemma model that runs
01:01:55across all of it uh we're work working
01:01:58to optimize uh and accelerate every
01:02:01aspect of gcp we're accelerating
01:02:03Dataproc their
01:02:05data processing engine JAX XLA Vertex
01:02:08AI and MuJoCo for robotics so we're
01:02:11working with uh Google and gcp across a
01:02:14whole bunch of initiatives uh Oracle is
01:02:16gearing up for Blackwell Oracle is a
01:02:18great partner of ours for Nvidia dgx
01:02:20cloud and we're also working together to
01:02:22accelerate something that's really
01:02:24important to a lot of companies Oracle
01:02:27database Microsoft is accelerating and
01:02:30Microsoft is gearing up for Blackwell
01:02:32Microsoft and Nvidia have a wide-ranging
01:02:34partnership we're accelerating Cuda
01:02:36accelerating all kinds of services when
01:02:38you when you chat obviously and uh AI
01:02:41services that are in Microsoft Azure uh
01:02:43it's very very likely Nvidia is in the
01:02:45back uh doing the inference and the
01:02:46token generation uh they built
01:02:49the largest Nvidia infiniband
01:02:51supercomputer basically a digital twin
01:02:53of ours or a physical twin of ours uh
01:02:56we're bringing the Nvidia ecosystem to
01:02:58Azure Nvidia DGX cloud to Azure uh
01:03:01Nvidia Omniverse is now hosted in Azure
01:03:03Nvidia Healthcare is in Azure and all of
01:03:06it is deeply integrated and deeply
01:03:08connected with Microsoft fabric the
01:03:11whole industry is gearing up for
01:03:13Blackwell this is what I'm about to show
01:03:16you most of the
01:03:19scenes that you've seen so far of
01:03:21Blackwell are the full fidelity
01:03:25design of Blackwell everything in our
01:03:28company has a digital twin and in fact
01:03:31this digital twin idea is really
01:03:34spreading and it helps
01:03:36companies build very complicated things
01:03:39perfectly the first time and what could
01:03:43be better than creating a digital twin to build a
01:03:47computer that was built in a digital
01:03:49twin and so let me show you what Wistron is
01:03:54doing to meet the demand for NVIDIA
01:03:57accelerated Computing Wistron one of our
01:03:59leading manufacturing Partners is
01:04:01building digital twins of Nvidia dgx and
01:04:04hgx factories using custom software
01:04:07developed with Omniverse sdks and
apis for their newest factory Wistron
01:04:13started with a digital twin to virtually
01:04:15integrate their multi-CAD and process
01:04:17simulation data into a unified view
01:04:20testing and optimizing layouts in this
01:04:22physically accurate digital environment
01:04:24increased worker efficiency by
01:04:2751% during construction the Omniverse
01:04:30digital twin was used to verify that the
01:04:32physical build matched the digital plans
01:04:35identifying any discrepancies early has
01:04:37helped avoid costly change orders and
01:04:40the results have been impressive using a
01:04:42digital twin helped bring Wistron's factory
01:04:44online in half the time just 2 and 1/2
01:04:47months instead of five in operation the
01:04:50Omniverse digital twin helps Wistron
01:04:52rapidly Test new layouts to accommodate
01:04:54new processes or improve operations in
01:04:57the existing space and monitor real-time
01:05:00operations using live iot data from
01:05:02every machine on the production
01:05:04line which ultimately enabled Wistron to
01:05:07reduce end-to-end cycle times by 50% and
01:05:1240% with Nvidia Ai and Omniverse
01:05:15nvidia's Global ecosystem of partners
01:05:17are building a new era of accelerated AI
01:05:31that's the way it's going
01:05:34to be in the future we're going to
01:05:35manufacture everything digitally first
01:05:37and then we'll manufacture it physically
01:05:39people ask me how did it
01:05:41start what got you guys so
01:05:44excited what was it that you
01:05:47saw that caused you to put it all
01:05:52in on this incredible idea and it's
01:06:07second guys that was going to be such a
01:06:12moment that's what happens when you
01:06:19rehearse this as you know was first
01:06:26AlexNet you put a cat into this computer
01:06:31and it comes out and it says
01:06:35cat and we said oh my God this is going
01:06:42to change everything you take 1 million numbers
01:06:45you take one Million numbers across
01:06:49RGB these numbers make no sense to
01:06:52anybody you put it into this software
01:06:56and it compresses it dimensionally reduces
01:06:59it it reduces it from a million
01:07:01dimensions it turns
01:07:04it into three letters one vector one
01:07:11generalized you could have the cat be
01:07:17different cats and you could have it be the
01:07:19front of the cat and the back of the cat
01:07:22and you look at this thing you say
01:07:24unbelievable you mean any
01:07:30cat and it was able to recognize all
01:07:33these cats and we realized how it did it
01:07:37systematically structurally it's
01:07:41scalable how big can you make it well
01:07:44how big do you want to make it and so we
01:07:47imagine that this is a completely new
01:07:51software and now today as you know you
01:07:54could have you type in the word cat and
01:07:58what comes out is a
01:08:00cat it went the other way
01:08:07unbelievable how is it possible that's
01:08:10right how is it possible you took three
01:08:13letters and you generated a million
01:08:16pixels from it and it made
01:08:18sense well that's the miracle and here
01:08:21we are just literally 10 years later
01:08:26where we recognize text we recognize
01:08:28images we recognize videos and sounds
01:08:31and images not only do we recognize them
01:08:34we understand their meaning we
01:08:37understand the meaning of the text
01:08:38that's the reason why it can chat with
01:08:39you it can summarize for you it
01:08:42understands the text it didn't
01:08:44just recognize the English it
01:08:46understood the English it doesn't just
01:08:48recognize the pixels it understood the
01:08:51pixels and you can even
01:08:53condition it between two modalities you
01:08:55can have language condition image and
01:08:57generate all kinds of interesting things
01:09:00well if you can understand these things
01:09:02what else can you understand that you've
01:09:05digitized the reason why we started with
01:09:07text and you know images is because we
01:09:09digitized those but what else have we
01:09:11digitized well it turns out we digitized
01:09:13a lot of things proteins and genes and
01:09:18waves anything you can digitize so long
01:09:21as there's structure we can probably
01:09:23learn some patterns from it and if we
01:09:24can learn the patterns from it we can
01:09:26understand its meaning if we can
01:09:28understand its meaning we might be able
01:09:30to generate it as well and so therefore
01:09:32the generative AI Revolution is here
01:09:36well what else can we generate what else
01:09:37can we learn well one of the things that
01:09:39we would love to learn we would love to
01:09:42learn is we would love to learn climate
01:09:47we would love to learn extreme weather
01:09:49we would love to learn uh how we
01:09:54predict future weather at Regional
01:09:57scales at sufficiently high resolution
01:10:01such that we can keep people out of
01:10:02Harm's Way before harm comes extreme
01:10:05weather costs the world $150 billion
01:10:08surely more than that and it's not
01:10:10evenly distributed $150 billion is
01:10:13concentrated in some parts of the world
01:10:15and of course to some people of the
01:10:16world we need to adapt and we need to
01:10:19know what's coming and so we are
01:10:20creating Earth-2 a digital twin of the
01:10:23Earth for predicting weather and
01:10:26we've made an extraordinary invention
01:10:29called CorrDiff the ability to use generative
01:10:32AI to predict weather at extremely high
01:10:35resolution let's take a
01:10:38look as the earth's climate changes AI
01:10:41powered weather forecasting is allowing
01:10:43us to more accurately predict and track
01:10:45severe storms like super typhoon chanthu
01:10:48which caused widespread damage in Taiwan
01:10:50and the surrounding region in 2021
01:10:53current AI forecast models can
01:10:55accurately predict the track of storms
01:10:57but they are limited to 25 km resolution
01:11:00which can miss important details Nvidia
01:11:03CorrDiff is a revolutionary new generative
01:11:06AI model trained on high resolution
01:11:08radar assimilated WRF weather forecasts
01:11:10and ERA5 reanalysis data using CorrDiff
01:11:14extreme events like chanthu can be super
01:11:17resolved from 25 km to 2 km resolution
01:11:20with 1,000 times the speed and 3,000
01:11:22times the Energy Efficiency of
01:11:24conventional weather models by combining
01:11:27the speed and accuracy of nvidia's
01:11:29weather forecasting model forecast net
01:11:31and generative AI models like CorrDiff we
01:11:34can explore hundreds or even thousands
01:11:36of kilometer scale Regional weather
01:11:38forecasts to provide a clear picture of
01:11:40the best worst and most likely impacts
01:11:42of a storm this wealth of information
01:11:45can help minimize loss of life and
01:11:47property damage today CorrDiff is optimized
01:11:50for Taiwan but soon generative super
01:11:53sampling will be available as part of
01:11:54the Nvidia Earth-2 inference service
01:11:57for many regions across the
globe the weather company the trusted
01:12:12source of global weather predictions
01:12:14we are working together to accelerate
01:12:16their weather simulation first
principles-based simulation however
they're also going to integrate Earth-2
01:12:23CorrDiff so that they could help businesses
01:12:25and countries do Regional high
01:12:28resolution weather prediction and so if
01:12:31you have some weather prediction you'd
01:12:32like to do uh reach out to
01:12:34the weather company really exciting
01:12:36really exciting work Nvidia Healthcare
01:12:39something we started 15 years ago we're
01:12:41super super excited about this this is
01:12:43an area where we're very very proud
whether it's Medical Imaging or gene
01:12:47sequencing or computational
01:12:50chemistry it is very likely that Nvidia
01:12:53is the computation behind it
01:12:55we've done so much work in this
01:12:57area today we're announcing that we're
01:13:00going to do something really really cool
01:13:03imagine all of these AI models that are
01:13:10able to generate images and audio but instead of
01:13:12images and audio because it understood
01:13:15images and audio all the digitization
01:13:17that we've done for genes and proteins
01:13:20and amino acids that digitization
01:13:23capability is now passed through
01:13:26machine learning so that we understand
01:13:30Life the ability to understand the
01:13:32language of Life of course we saw the
01:13:34first evidence of
01:13:35it with alphafold this is really quite
01:13:38an extraordinary thing after Decades of
01:13:40painstaking work the world had only
01:13:44digitized and reconstructed using
01:13:47cryo-electron microscopy or x-ray
01:13:51crystallography um these different
01:13:53techniques painstakingly reconstructed the
01:13:56proteins 200,000 of them in just what is
01:13:59it less than a year or so AlphaFold has
01:14:04reconstructed 200 million proteins
01:14:06basically every protein every of every
01:14:09living thing that's ever been sequenced
01:14:11this is completely revolutionary well
01:14:14those models are incredibly hard to use
01:14:16um for incredibly hard for people to
01:14:18build and so what we're going to do is
01:14:20we're going to build them we're going to
01:14:21build them for uh the the researchers
01:14:24around the world and it won't be the
01:14:26only one there'll be many other models
01:14:27that we create and so let me show you
01:14:29what we're going to do with
01:14:34it virtual screening for new medicines
01:14:37is a computationally intractable problem
01:14:40existing techniques can only scan
01:14:42billions of compounds and require days
01:14:44on thousands of standard compute nodes
01:14:47to identify new drug
candidates Nvidia BioNeMo NIMs enable
01:14:52a new generative screening Paradigm
01:14:54using Nims for protein structure
01:14:56prediction with AlphaFold molecule
01:14:58generation with MolMIM and docking with
01:15:01DiffDock we can now generate and screen
01:15:04candidate molecules in a matter of
01:15:05minutes MolMIM can connect to custom
01:15:08applications to steer the generative
01:15:10process iteratively optimizing for
01:15:12desired properties these applications
01:15:15can be defined with BioNeMo
01:15:17microservices or built from scratch here
01:15:20a physics based simulation optimizes for
01:15:23a molecule's ability to bind to a Target
01:15:25protein while optimizing for other
01:15:27favorable molecular properties in
01:15:29parallel MolMIM generates high quality
01:15:32drug-like molecules that bind to the
01:15:34Target and are synthesizable translating
01:15:37to a higher probability of developing
01:15:39successful medicines
01:15:41faster BioNeMo is enabling a new
01:15:44paradigm in drug Discovery with Nims
01:15:46providing OnDemand microservices that
01:15:48can be combined to build powerful drug
01:15:51Discovery workflows like de novo protein
01:15:53design or guided molecule generation for
01:15:56virtual screening BioNeMo NIMs are helping
01:16:00researchers and developers reinvent
01:16:02computational drug
01:16:09design Nvidia MolMIM CorrDiff
01:16:13there's a whole bunch of other models
01:16:15computer
01:16:17vision models robotics models and even of
01:16:22course some really terrific open
01:16:25language models these models are
01:16:29groundbreaking however it's hard for
01:16:31companies to use how would you use it
01:16:33how would you bring it into your company
01:16:34and integrate it into your workflow how
01:16:36would you package it up and run it
01:16:38remember earlier I just
01:16:40said that inference is an extraordinary
01:16:43computation problem how would you do the
01:16:46optimization for each and every one of
01:16:48these models and put together the
01:16:50Computing stack necessary to run that
01:16:52supercomputer so that you can run the
01:16:55models in your company and so we have a
01:16:58great idea we're going to invent a new
01:17:00way invent a new way for you to receive
01:17:07software this software comes basically
01:17:11in a digital box we call it a container
and we call it the Nvidia inference
01:17:17microservice a NIM and let me explain to you
01:17:21what it is a Nim it's a pre-trained
01:17:24model so it's pretty
01:17:25clever and it is packaged and optimized
01:17:29to run across nvidia's install base
01:17:32which is very very large what's inside
01:17:34it is incredible you have all these
pre-trained state-of-the-art open source
01:17:39models they could be open source they
01:17:41could be from one of our partners it
01:17:43could be created by us like Nvidia mull
01:17:46it is packaged up with all of its
01:17:48dependencies so Cuda the right version
cuDNN the right version TensorRT-LLM
distributing across the multiple gpus
Triton inference server all completely
01:17:59packaged together it's optimized
01:18:02depending on whether you have a single
01:18:04GPU multi- GPU or multi node of gpus
01:18:06it's optimized for that and it's
01:18:08connected up with apis that are simple
01:18:10to use now this think about what an AI
01:18:13API is an AI API is an interface that
01:18:18you just talk to and so this is a piece
01:18:21of software in the future that has a
01:18:23really simple API and that API called
01:18:25human and these packages incredible
01:18:29bodies of software will be optimized and
01:18:32packaged and we'll put it on a
01:18:34website and you can download it you
01:18:37could take it with you you could run it
01:18:39in any Cloud you can run it in your own
01:18:41data center you can run in workstations
01:18:43if it fit and all you have to do is come
01:18:45to ai. nvidia.com we call it Nvidia
01:18:49inference microservice but inside the
01:18:51company we all call it
01:19:02just imagine you know one of some
01:19:04someday there there's going to be one of
01:19:06these chat Bots and these chat Bots is
01:19:08going to just be in a Nim and you you'll
01:19:12uh you'll assemble a whole bunch of chat
01:19:13Bots and that's the way software is
01:19:15going to be be built someday how do we
01:19:18build software in the future it is
01:19:20unlikely that you'll write it from
01:19:22scratch or write a whole bunch of python
01:19:23code or anything like that it is very
01:19:26likely that you assemble a team of AIS
01:19:29there's probably going to be a super AI
01:19:32that you use that takes the mission that
01:19:34you give it and breaks it down into an
01:19:37execution plan some of that execution
01:19:39plan could be handed off to another Nim
01:19:42that Nim would maybe uh understand
SAP the language of SAP is ABAP it might
understand ServiceNow and go
retrieve some information from their platform
01:19:55it might then hand that result to
01:19:56another Nim who that goes off and does
01:19:59some calculation on it maybe it's an
01:20:01optimization software a
01:20:03combinatorial optimization algorithm
01:20:06maybe it's uh you know some just some
01:20:09calculator maybe it's pandas to do some
01:20:13numerical analysis on it and then it
01:20:15comes back with its
01:20:17answer and it gets combined with
01:20:19everybody else's and it because it's
01:20:21been presented with this is what the
01:20:23right answer should look like it knows
what right answers
01:20:27to produce and it presents it to you we
01:20:30can get a report every single day at you
01:20:32know top of the hour uh that has
01:20:34something to do with a bill plan or some
01:20:36forecast or uh some customer alert or
01:20:38some bugs database or whatever it
01:20:40happens to be and we could assemble it
01:20:42using all these Nims and because these
01:20:44Nims have been packaged up and ready to
01:20:48work on your systems so long as you have
Nvidia gpus in your data center or in the
cloud these NIMs will work together
01:20:55as a team and do amazing things and so
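The hand-off pattern just described can be sketched in a few lines. Everything in this snippet, the service names and the canned plan, is invented for illustration; in the talk's vision the planner would itself be a large model:

```python
# Hypothetical "team of AIs": a planner breaks a mission into steps,
# and each step is dispatched to a named service (a NIM, in the talk's
# terms). All names and the plan below are made up for illustration.
def planner(mission):
    return [("retrieve", f"pull records relevant to: {mission}"),
            ("calculate", "run the forecast over the retrieved records"),
            ("report", "summarize the result for the morning report")]

nims = {
    "retrieve":  lambda task: f"[data: {task}]",
    "calculate": lambda task: f"[forecast: {task}]",
    "report":    lambda task: f"[summary: {task}]",
}

def run(mission):
    # Each NIM handles its own step; the results are combined at the end.
    return " | ".join(nims[step](task) for step, task in planner(mission))

print(run("prepare the daily ops report"))
```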
01:20:58we decided this is such a great idea
01:21:00we're going to go do that and so Nvidia
01:21:03has Nims running all over the company we
01:21:05have chatbots being created all over the
01:21:08place and one of the mo most important
01:21:09chatbots of course is a chip designer
01:21:12chatbot you might not be surprised we
01:21:14care a lot about building chips and so
01:21:17we want to build chatbots AI
01:21:21co-pilots that are co-designers with our
01:21:23engineers and so this is the way we did
it so we got ourselves a Llama 2
this is a 70B and it's you know packaged
up in a NIM and we asked it you know uh
01:21:37CTL Well turns out CTL is an internal uh
program and it has an internal
01:21:44proprietary language but it thought the
01:21:46CTL was a combinatorial timing logic and
01:21:48so it describes you know conventional
01:21:50knowledge of CTL but that's not very
01:21:52useful to us and so we gave it a whole
01:21:56bunch of new examples you know this is
01:21:58no different than employee onboarding an
01:22:01employee uh we say you know thanks for
01:22:03that answer it's completely wrong um and
01:22:06and uh and then we present to them uh
01:22:09this is what a CTL is okay and so this
01:22:11is what a CTL is at Nvidia and the CTL
01:22:15as you can see you know CTL stands for
01:22:17compute Trace Library which makes sense
01:22:20you know we were tracing compute Cycles
01:22:22all the time and it wrote the program
01:22:32amazing and so the productivity of our
01:22:34chip designers can go up this is what
01:22:35you can do with a Nim first thing you
01:22:37can do with is customize it we have a
01:22:39service called Nemo microservice that
01:22:41helps you curate the data preparing the
01:22:44data so that you could teach this on
01:22:46board this AI you fine-tune them and
01:22:49then you guardrail it you can even
01:22:51evaluate the answer evaluate its
01:22:53performance against um other other
01:22:55examples and so that's called the Nemo
microservice now the thing that's
01:23:00emerging here is this there are three
01:23:02elements three pillars of what we're
01:23:03doing the first pillar is of course
01:23:06inventing the technology for um uh AI
01:23:09models and running AI models and
01:23:11packaging it up for you the second is to
01:23:13create tools to help you modify it first
01:23:16is having the AI technology second is to
01:23:19help you modify it and third is
01:23:20infrastructure for you to fine-tune it
01:23:23and if you like deploy it you could
01:23:24deploy it on our infrastructure called
01:23:26dgx cloud or you can employ deploy it on
01:23:29Prem you can deploy it anywhere you like
01:23:31once you develop it it's yours to take
01:23:33anywhere and so we are
01:23:36effectively an AI Foundry we will do for
01:23:40you and the industry on AI what tsmc
01:23:43does for us building chips and so we go
01:23:45to TSMC with our big
01:23:48ideas they manufacture and we take it
01:23:50with us and so exactly the same thing
01:23:52here AI Foundry and the three pillars
01:23:54are the NIMs NeMo microservice and dgx
01:23:58Cloud the other thing that you could
01:24:00teach the Nim to do is to understand
01:24:02your proprietary information remember
01:24:05inside our company the vast majority of
01:24:07our data is not in the cloud it's inside
01:24:09our company it's been sitting there you
01:24:11know being used all the time and and
01:24:14gosh it's basically Nvidia's
01:24:17intelligence we would like to take that
01:24:20data learn its meaning like we learned
01:24:23the meaning of almost anything else that
01:24:24we just talked about learn its meaning
01:24:27and then reindex that knowledge into a
01:24:30new type of database called a vector
01:24:32database and so you essentially take
01:24:35structured data or unstructured data you
01:24:37learn its meaning you encode its meaning
01:24:39so now this becomes an AI database and
01:24:43that AI database in the future once you
01:24:45create it you can talk to it and so let
01:24:47me give you an example of what you could
01:24:49do so suppose you've got
01:24:51a whole bunch of multi modality data and
01:24:53one good example of that is PDF so you
01:24:56take the PDF you take all of your PDFs
01:24:59all your favorite you know the
01:25:01stuff that is proprietary to you
01:25:03critical to your company you can encode
01:25:05it just as we encoded pixels of a cat
01:25:09and it becomes the word cat we can
01:25:11encode all of your PDF and it turns
01:25:14into vectors that are now stored inside
01:25:16your vector database it becomes the
01:25:18proprietary information of your company
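The encode-and-index flow described here (documents become vectors, vectors become a searchable database) can be sketched with a toy bag-of-words embedding and cosine similarity. All names and documents below are invented; production vector databases use learned neural embeddings that capture meaning rather than word counts:

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words vector. Real vector databases
    use learned neural embeddings, not word counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ToyVectorDB:
    def __init__(self):
        self.index = []                      # list of (vector, original text)

    def add(self, doc):
        self.index.append((embed(doc), doc))

    def query(self, question, k=1):
        # Retrieve: rank stored documents by similarity to the question.
        ranked = sorted(self.index,
                        key=lambda e: cosine(embed(question), e[0]),
                        reverse=True)
        return [doc for _, doc in ranked[:k]]

db = ToyVectorDB()
db.add("Bug 1432: kernel crash when tracing compute cycles on Blackwell")
db.add("Q3 marketing plan for the robotics developer conference")
print(db.query("how many kernel crash bugs do we have?"))
```

The query step here is the same retrieve-by-meaning operation the talk later attributes to NeMo Retriever: encode the question, find the nearest stored vectors, return the original text.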
01:25:20and once you have that proprietary
01:25:21information you can chat to it it's an
01:25:24it's a smart database and so you just
01:25:27chat with data and how much more
01:25:29enjoyable is that you know we for for
01:25:33our software team you know they just
01:25:35chat with the bugs database you know how
01:25:38many bugs were there last night are we
01:25:40making any progress and then after
01:25:42you're done talking to this bugs
01:25:45database you need therapy and so so we
01:25:49have another chatbot for
01:26:05it okay so we call this Nemo Retriever
01:26:08and the reason for that is because
01:26:09ultimately its job is to go retrieve
01:26:11information as quickly as possible and
01:26:13you just talk to it hey retrieve me this
01:26:15information it goes and brings it back to
01:26:18you and do you mean this you go yeah
01:26:20perfect okay and so we call it the Nemo
01:26:22retriever well the Nemo service helps
01:26:24you create all these things and we have
01:26:26all these different NIMs we even
01:26:27have Nims of digital humans I'm Rachel
01:26:33manager okay so it's a really short
01:26:36clip but there were so many videos to
01:26:39show you so many other demos to
01:26:41show you and so I had to cut this one
01:26:43short but this is Diana she is a digital
01:26:46human NIM and you just talk to
01:26:50her and she's connected in this case to
01:26:52Hippocratic ai's large language model
01:26:54for healthcare and it's truly
01:26:58amazing she is just super smart about
01:27:01Healthcare things you know and so after
01:27:04you're done after Dwight my VP of
01:27:07software engineering talks to the
01:27:08chatbot for bugs database then you come
01:27:11over here and talk to Diana and so
01:27:13Diana is completely animated
01:27:17with AI and she's a digital
01:27:19human uh there's so many companies that
01:27:21would like to build they're sitting on
01:27:25the Enterprise IT industry is
01:27:27sitting on a gold mine it's a gold mine
01:27:29because they have so much understanding
01:27:31of the way work is done they have
01:27:34all these amazing tools that have been
01:27:36created over the years and they're
01:27:37sitting on a lot of data if they could
01:27:40take that gold mine and turn it into
01:27:43co-pilots these co-pilots could help us
01:27:45do things and so just about every IT
01:27:49franchise IT platform in the world that
01:27:51has valuable tools that people use is
01:27:53sitting on a gold mine for co-pilots and
01:27:56they would like to build their own
01:27:57co-pilots and their own chatbots and so
01:28:00we're announcing that Nvidia AI Foundry
01:28:02is working with some of the world's
01:28:03great companies sap generates 87% of the
01:28:06world's Global Commerce basically the
01:28:09world runs on sap we run on sap Nvidia
01:28:11and sap are building SAP Joule co-pilots
01:28:15uh using Nvidia Nemo and dgx cloud
01:28:18ServiceNow 85% of the
01:28:20world's Fortune 500 companies run their
01:28:23people and customer service operations
01:28:25on ServiceNow and they're using Nvidia
01:28:28AI Foundry to build ServiceNow
01:28:33assistants cohesity backs up the world's
01:28:36data they're sitting on a gold mine of
01:28:38data hundreds of exabytes of data over
01:28:4110,000 companies Nvidia AI Foundry is
01:28:44working with them helping them build
01:28:46their Gaia generative AI agent snowflake
01:28:50is a company that stores the world's
01:28:53digital Warehouse in the cloud and
01:28:55serves over 3 billion queries a day for
01:29:0110,000 Enterprise customers snowflake is
01:29:03working with Nvidia AI Foundry to build
01:29:06co-pilots with Nvidia Nemo and NIMs
01:29:09NetApp nearly half of the files in the
01:29:12world are stored on prem on NetApp
01:29:16Nvidia AI Foundry is helping them uh
01:29:18build chat Bots and co-pilots like those
01:29:21Vector databases and retrievers with
01:29:25Nims and we have a great partnership
01:29:27with Dell everybody who is
01:29:30building these chat Bots and generative
01:29:33AI when you're ready to run it you're
01:29:35going to need an AI
01:29:37Factory and nobody is better at Building
01:29:41end-to-end Systems of very large scale
01:29:43for the Enterprise than Dell and so
01:29:46anybody any company every company will
01:29:48need to build AI factories and it turns
01:29:51out that Michael is here he's happy to
01:29:58ladies and gentlemen Michael
01:30:04Dell okay let's talk about the next wave
01:30:07of Robotics the next wave of AI robotics
01:30:11AI so far all of the AI that we've
01:30:14talked about is one
01:30:16computer data comes into one computer
01:30:18lots of the world's experience if you
01:30:21will in digital text form the AI
01:30:25imitates Us by reading a lot of the
01:30:28language to predict the next words it's
01:30:30imitating You by studying all of the
01:30:32patterns and all the other previous
01:30:34examples of course it has to understand
01:30:36context and so on so forth but once it
01:30:38understands the context it's essentially
01:30:39imitating you we take all of the data we
01:30:42put it into a system like dgx we
01:30:45compress it into a large language model
01:30:47trillions and trillions of
01:30:49tokens become
01:30:51billions of parameters and
01:30:53these billions of parameters become
01:30:54your AI well in order for us to go to
01:30:58the next wave of AI where the AI
01:31:00understands the physical world we're
01:31:02going to need three
01:31:03computers the first computer is still
01:31:06the same computer it's that AI computer
01:31:08that now is going to be watching video
01:31:10and maybe it's doing synthetic data
01:31:12generation and maybe there's a lot of
01:31:14human examples just as we have human
01:31:17examples in text form we're going to
01:31:18have human examples in articulation form
01:31:22and the AIs will watch us
01:31:25understand what is
01:31:26happening and try to adapt it for
01:31:29themselves into the
01:31:31context and because it can generalize
01:31:33with these Foundation models maybe these
01:31:36robots can also perform in the physical
01:31:38world fairly generally so I just
01:31:41described in very simple terms
01:31:44essentially what just happened in large
01:31:45language models except the chat GPT
01:31:47moment for robotics may be right around
01:31:49the corner and so we've been building
01:31:52the end to-end systems for robotics for
01:31:54some time I'm super super proud of the
01:31:56work we have the AI system
01:31:59dgx we have the lower system which is
01:32:01called agx for autonomous systems the
01:32:04world's first robotics processor when we
01:32:06first built this thing people asked what
01:32:07are you guys building it's an SoC it's
01:32:10one chip it's designed to be very low
01:32:12power but it's designed for high-speed
01:32:13sensor processing and Ai and so if you
01:32:17want to run Transformers in a car or you
01:32:20want to run Transformers in anything
01:32:24that moves we have the perfect
01:32:26computer for you it's called the Jetson
01:32:29and so the dgx on top for training the
01:32:31AI the Jetson is the autonomous
01:32:33processor and in the middle we need
01:32:35another computer whereas large language
01:32:40models have the benefit of you providing your examples
01:32:43and then doing reinforcement learning
01:32:47feedback what is the reinforcement
01:32:49learning human feedback of a robot well
01:32:52it's reinforcement learning
01:32:54physical feedback that's how you align
01:32:56the robot that's how the
01:32:59robot knows that as it's learning these
01:33:01articulation capabilities and
01:33:02manipulation capabilities it's going to
01:33:04adapt properly into the laws of physics
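The "reinforcement learning physical feedback" idea, where the simulator scores each attempted motion and the policy keeps what works, can be sketched as a toy hill-climb over a single control parameter. Everything here (the reward shape, the parameter, the target value 4.2) is invented for illustration; a real gym trains neural policies in massively parallel physics simulation:

```python
import random

def simulate_hop(push_strength):
    """Toy physics 'gym': reward peaks when the robot pushes hard enough
    to hop but not so hard that it tips over. Stands in for a full
    physics simulator scoring one articulation attempt."""
    return -(push_strength - 4.2) ** 2      # best possible reward at 4.2

def train(episodes=200, seed=0):
    """Keep whichever perturbed motion the simulator rewards most:
    the 'physical feedback' loop in miniature."""
    rng = random.Random(seed)
    best_param, best_reward = 0.0, simulate_hop(0.0)
    for _ in range(episodes):
        candidate = best_param + rng.uniform(-1, 1)   # try a perturbed motion
        reward = simulate_hop(candidate)              # physical feedback
        if reward > best_reward:                      # keep what obeys physics
            best_param, best_reward = candidate, reward
    return best_param

learned = train()
print(f"learned push strength ~= {learned:.2f}")
```

The essential shape carries over: propose a motion, let simulated physics grade it, keep the improvement, repeat.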
01:33:08and so we need a simulation
01:33:11engine that represents the world
01:33:13digitally for the robot so that the
01:33:15robot has a gym to go learn how to be a
01:33:19robot we call that virtual world Omniverse and the
01:33:23computer that runs Omniverse is called
01:33:25ovx and ovx the computer itself is
01:33:29hosted in the Azure Cloud okay and so
01:33:32basically we built these three things
01:33:34these three systems on top of it we have
01:33:36algorithms for every single one now I'm
01:33:39going to show you one super example of
01:33:42how Ai and Omniverse are going to work
01:33:45together the example I'm going to show
01:33:46you is kind of insane but it's going to
01:33:49be very very close to tomorrow it's a
01:33:51robotics building this robotics building
01:33:54is called a warehouse inside the
01:33:56robotics building are going to be some
01:33:58autonomous systems some of the
01:34:00autonomous systems are going to be
01:34:01called humans and some of the autonomous
01:34:04systems are going to be called forklifts
01:34:06and these autonomous systems are going
01:34:08to interact with each other of course
01:34:10autonomously and it's going to be
01:34:12watched over by this warehouse to
01:34:14keep everybody out of Harm's Way the
01:34:16warehouse is essentially an air traffic
01:34:18controller and whenever it sees
01:34:21something happening it will redirect
01:34:23traffic and give new
01:34:26waypoints to the robots and
01:34:28the people and they'll know exactly what
01:34:29to do this warehouse this building you
01:34:33can also talk to of course you could
01:34:35talk to it hey you know sap Center how
01:34:38are you feeling today for example and so
01:34:41you could ask the same the warehouse the
01:34:43same questions basically the system I
01:34:46just described will have Omniverse Cloud
01:34:49that's hosting the virtual simulation
01:34:52and AI running on dgx cloud and all of
01:34:56this is running in real time let's take a
01:34:59look the future of heavy industry starts
01:35:02as a digital twin the AI agents helping
01:35:05robots workers and infrastructure
01:35:07navigate unpredictable events in complex
01:35:10industrial spaces will be built and
01:35:12evaluated first in sophisticated digital
01:35:15twins this Omniverse digital twin of a
01:35:18100,000 square foot Warehouse is operating as a
01:35:22simulation environment that integrates
01:35:24digital workers AMRs running the Nvidia
01:35:27Isaac Perceptor stack centralized
01:35:29activity maps of the entire Warehouse
01:35:31from 100 simulated ceiling-mounted cameras
01:35:34using Nvidia Metropolis and AMR route
01:35:37planning with Nvidia cuOpt
01:35:40software-in-loop testing of AI agents in this
01:35:42physically accurate simulated
01:35:44environment enables us to evaluate and
01:35:47refine how the system adapts to real
01:35:51unpredictability here an incident occurs
01:35:53along this AMR's planned route blocking
01:35:56its path as it moves to pick up a pallet
01:35:59Nvidia Metropolis updates and sends a
01:36:01realtime occupancy map to cuOpt where a
01:36:03new optimal route is calculated the AMR
01:36:06is enabled to see around corners and
01:36:08improve its Mission efficiency with
01:36:11generative AI powered Metropolis Vision
01:36:13Foundation models operators can even ask
01:36:16questions using natural language the
01:36:18visual model understands nuanced
01:36:21activity and can offer immediate
01:36:22insights to improve operations all of
01:36:25the sensor data is created in simulation
01:36:27and passed to the real-time AI running
01:36:30as Nvidia inference microservices or
01:36:32Nims and when the AI is ready to be
01:36:35deployed in the physical twin the real
01:36:37Warehouse we connect metropolis and
01:36:39Isaac Nims to real sensors with the
01:36:42ability for continuous Improvement of
01:36:44both the digital twin and the AI
01:36:49models isn't that amazing
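The loop narrated in the video, where an incident updates the occupancy map and a fresh route is computed, can be sketched with a grid and breadth-first search. This is only the idea: cuOpt solves far larger optimization problems, and the 3x3 warehouse below is invented:

```python
from collections import deque

def plan_route(grid, start, goal):
    """Breadth-first search over an occupancy grid: 0 = free, 1 = blocked.
    Returns a shortest list of cells from start to goal, or None."""
    rows, cols = len(grid), len(grid[0])
    queue, came_from = deque([start]), {start: None}
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:           # walk parents back to start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in came_from):
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

warehouse = [[0, 0, 0],
             [0, 0, 0],
             [0, 0, 0]]
route = plan_route(warehouse, (0, 0), (2, 2))    # initial plan

warehouse[1][1] = 1                              # incident: a pallet blocks the aisle
warehouse[0][1] = 1
reroute = plan_route(warehouse, (0, 0), (2, 2))  # occupancy update -> new route
print(route, reroute)
```

The reroute avoids the newly blocked cells, which is exactly the occupancy-map-to-new-waypoints handoff the video describes, just at toy scale.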
01:36:55so remember a future facility
01:37:00Warehouse Factory building will be
01:37:03software defined and so the software is
01:37:05running how else would you test the
01:37:07software so you test the
01:37:10software that's building the warehouse the
01:37:12optimization system in the digital twin
01:37:14what about all the robots all of those
01:37:15robots you are seeing just now they're
01:37:17all running their own autonomous robotic
01:37:19stack and so the way you integrate
01:37:21software in the future CI/CD in the
01:37:23future for robotic systems is with
01:37:26digital twins we've made Omniverse a lot
01:37:29easier to access we're going to create
01:37:31basically Omniverse Cloud APIs four
01:37:34simple APIs and a channel and you can
01:37:37connect your application to it so this
01:37:38is going to be as wonderfully
01:37:41beautifully simple in the future that
01:37:44Omniverse is going to be and with these
01:37:46apis you're going to have these magical
01:37:48digital twin capabilities we also have
01:37:52turned Omniverse into an AI and integrated
01:37:56it with the ability to chat USD
01:37:59our language is you know
01:38:01human and Omniverse's language as it
01:38:04turns out is Universal Scene Description
01:38:06and so that language is rather complex
01:38:09and so we've taught our Omniverse
01:38:12that language and so you can speak to it
01:38:14in English and it would directly
01:38:15generate USD and it would talk back in
01:38:18USD but converse back to you in English
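The English-in, USD-out round trip can be illustrated by generating a minimal USD ASCII (`.usda`) scene from a plain-English request. The emitted syntax is genuine minimal USD, but the keyword matching below is a crude stand-in for what the language model actually does:

```python
def english_to_usda(request):
    """Tiny stand-in for an English -> USD generator: it recognizes
    'add a <shape> named <name>' and emits a minimal .usda scene.
    A real system would use a language model, not keyword matching."""
    words = request.lower().split()
    shape = "Sphere" if "sphere" in words else "Cube"
    name = words[words.index("named") + 1] if "named" in words else "prim1"
    return (
        '#usda 1.0\n'
        'def Xform "World"\n'
        '{\n'
        f'    def {shape} "{name}"\n'
        '    {\n'
        '    }\n'
        '}\n'
    )

print(english_to_usda("add a sphere named ball to the scene"))
```

The interesting part is the asymmetry the talk points out: the human side of the conversation stays in English, while the scene side stays in USD, with translation in between.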
01:38:20you could also look for information in
01:38:22this world semantically instead of the
01:38:25world being encoded semantically in
01:38:27language now it's encoded semantically
01:38:29in scenes and so you could ask it of
01:38:32certain objects or certain conditions
01:38:34and certain scenarios and it can go and
01:38:36find that scenario for you it also can
01:38:39collaborate with you in generation you
01:38:41could design some things in 3D it could
01:38:43simulate some things in 3D or you could
01:38:45use AI to generate something in 3D let's
01:38:47take a look at how this is all going to
01:38:49work we have a great partnership with
01:38:51Siemens Siemens is the world's largest
01:38:54industrial engineering and operations
01:38:56platform you've seen now so many
01:38:59different companies in the industrial
01:39:01space heavy Industries is one of the
01:39:03greatest final frontiers of IT and we
01:39:06finally now have the necessary
01:39:08technology to go and make a real impact
01:39:11Siemens is building the industrial
01:39:13metaverse and today we're announcing
01:39:14that Siemens is connecting their crown
01:39:17Jewel accelerator to Nvidia Omniverse
01:39:22look Siemens technology is transforming
01:39:25every day for everyone Teamcenter X
01:39:28our leading product life cycle
01:39:29management software from the Siemens
01:39:31Xcelerator platform is used every day
01:39:34by our customers to develop and deliver
01:39:36products at scale now we are bringing
01:39:39the real and the digital worlds even
01:39:41Closer by integrating Nvidia Ai and
01:39:44Omniverse technologies into Teamcenter
01:39:47X Omniverse APIs enable data
01:39:50interoperability and physics-based
01:39:52rendering to Industrial scale design and
01:39:55Manufacturing projects our customer HD
01:39:59Hyundai market leader in sustainable ship
01:40:00manufacturing builds ammonia and
01:40:03hydrogen powered ships often comprising
01:40:05over 7 million discrete Parts with
01:40:08Omniverse APIs Teamcenter X lets
01:40:11companies like HD yundai unify and
01:40:14visualize these massive engineering data
01:40:17sets interactively and integrate
01:40:19generative AI to generate 3D objects or
01:40:22HDRI backgrounds to see their projects
01:40:26in context the result an ultra-intuitive
01:40:29photoreal physics-based digital twin that
01:40:32eliminates waste and errors delivering
01:40:35huge savings in cost and
01:40:37time and we are building this for
01:40:39collaboration whether across more Siemens
01:40:41Xcelerator tools like Siemens NX or
01:40:45STAR-CCM+ or across teams working on
01:40:49their favorite devices in the same scene
01:40:51together this is just the beginning
01:40:54working with Nvidia we will bring
01:40:57accelerated Computing generative Ai and
01:40:59Omniverse integration across the Siemens
01:41:11portfolio the professional
01:41:15voice actor happens to
01:41:17be a good friend of mine Roland Busch who
01:41:20happens to be the CEO of Siemens
01:41:29once you get Omniverse connected into
01:41:34your workflow your
01:41:36ecosystem from the beginning of your
01:41:40engineering to manufacturing planning
01:41:43all the way to digital twin
01:41:45operations once you connect everything
01:41:48together it's insane how much
01:41:50productivity you can get and it's just
01:41:52really really wonderful all of a sudden
01:41:54everybody is operating on the same
01:41:56truth you don't have to exchange data
01:41:59and convert data and make mistakes everybody
01:42:01is working on the same ground truth from
01:42:04the design Department to the art
01:42:06Department the architecture Department
01:42:07all the way to the engineering and even
01:42:09the marketing department let's take a
01:42:11look at how Nissan has integrated
01:42:14Omniverse into their workflow and it's
01:42:17all because it's connected by all these
01:42:19wonderful tools and these developers
01:42:21that we're working with take a look
01:44:01that was not an animation that was
01:44:05Omniverse today we're announcing that Omniverse
01:44:09Cloud streams to the Vision Pro
01:44:19and it is very very strange
01:44:24that you walk around virtual doors when
01:44:27I was getting out of that
01:44:29car and everybody does it it is really
01:44:33really quite amazing Vision Pro
01:44:35connected to Omniverse portals you into
01:44:38Omniverse and because all of these CAD
01:44:41tools and all these different design
01:44:42tools are now integrated and connected
01:44:44to Omniverse you can have this type of
01:44:46workflow really incredible let's talk
01:44:48about robotics everything that moves
01:44:51will be robotic there's no question
01:44:52about that it's safer it's more
01:44:56convenient and one of the largest
01:44:57Industries is going to be Automotive we
01:45:00build the robotic stack from top to
01:45:02bottom as I was mentioned from the
01:45:04computer system but in the case of
01:45:05self-driving cars including the
01:45:07self-driving application at the end of
01:45:10this year or I guess beginning of next
01:45:12year we will be shipping in Mercedes and
01:45:14then shortly after that jlr and so these
01:45:17autonomous robotic systems are software
01:45:20defined they take a lot of work they
01:45:22have computer vision obviously
01:45:24artificial intelligence control and
01:45:26planning all kinds of very complicated
01:45:29technology and takes years to refine
01:45:31we're building the entire stack however
01:45:34we open up our entire stack for all of
01:45:36the automotive industry this is just the
01:45:37way we work the way we work in every
01:45:39single industry we try to build as much
01:45:41of it as we can so that we understand it
01:45:43but then we open it up so everybody can
01:45:45access it whether you would like to buy
01:45:47just our computer which is the world's
01:45:49only fully functional safe ASIL-D system that can run
01:45:56AI this functionally safe ASIL-D quality
01:46:00computer or the operating system on top
01:46:03or of course our data centers which is
01:46:07in basically every AV company in the
01:46:09world however you would like to enjoy it
01:46:11we're delighted by it today we're
01:46:13announcing that byd the world's largest
01:46:16ev company is adopting our next
01:46:19Generation it's called Thor Thor is
01:46:21designed for Transformer engines Thor
01:46:24our next Generation AV computer will be in
01:46:36BYD you probably don't know this fact
01:46:38that we have over a million robotics
01:46:40developers we created Jetson this
01:46:43robotics computer we're so proud of it
01:46:45the amount of software that goes on top
01:46:47of it is insane but the reason why we
01:46:49can do it at all is because it's 100%
01:46:50Cuda compatible everything that we do
01:46:53everything that we do in our company is
01:46:55in service of our developers and by us
01:46:58being able to maintain this Rich
01:47:00ecosystem and make it compatible with
01:47:02everything that you access from us we
01:47:05can bring all of that incredible
01:47:06capability to this little tiny computer
01:47:09we call Jetson a robotics computer we're
01:47:13announcing this incredibly advanced new
01:47:16SDK we call it Isaac
01:47:19Perceptor Isaac Perceptor most of
01:47:22the Bots today are pre-programmed
01:47:26they're either following rails on the
01:47:27ground digital rails or they'd be
01:47:29following AprilTags but in the future
01:47:31they're going to have perception and the
01:47:33reason why you want that is so that you
01:47:34could easily program it you say would
01:47:37you like to go from point A to point B
01:47:39and it will figure out a way to navigate
01:47:41its way there so by only programming
01:47:44waypoints the entire route could be
01:47:47adaptive the entire environment could be
01:47:49reprogrammed just as I showed you at the
01:47:51very beginning with the warehouse you
01:47:53can't do that with pre-programmed AGVs
01:47:57if those boxes fall down they just all
01:47:59gum up and they just wait there for
01:48:01somebody to come clear it and so now with
01:48:05Perceptor we have incredible
01:48:07state-of-the-art visual odometry 3D
01:48:11reconstruction and in addition to 3D
01:48:13reconstruction depth perception the
01:48:15reason for that is so that you can have
01:48:16two modalities to keep an eye on what's
01:48:19happening in the world Isaac Perceptor
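The point-A-to-point-B programming model described here, where you give waypoints and the robot works out each leg itself, can be sketched as follows. The poses and waypoints are invented, and a real AMR continuously refines these legs from live perception rather than computing them once:

```python
import math

def follow_waypoints(start, waypoints):
    """Given a start position and goal waypoints, compute the heading
    (degrees) and distance for each leg -- the commands a perception-driven
    AMR would refine on the fly, instead of following fixed rails."""
    legs, (x, y) = [], start
    for wx, wy in waypoints:
        heading = math.degrees(math.atan2(wy - y, wx - x))
        distance = math.hypot(wx - x, wy - y)
        legs.append((round(heading, 1), round(distance, 2)))
        x, y = wx, wy                      # next leg starts where this one ends
    return legs

print(follow_waypoints((0, 0), [(3, 0), (3, 4)]))
```

Because only the waypoints are programmed, changing the environment just means handing the same robot a different waypoint list, which is the adaptivity the talk contrasts with rail-following AGVs.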
01:48:22the most used robot today is the
01:48:26manipulator manufacturing arms and they
01:48:29are also pre-programmed the computer
01:48:31vision algorithms the AI algorithms the
01:48:34control and path planning algorithms
01:48:36that are geometry aware incredibly
01:48:38computationally intensive we have made
01:48:41these Cuda accelerated so we have the
01:48:44world's first Cuda accelerated motion
01:48:46planner that is geometry aware you put
01:48:50something in front of it it comes up
01:48:51with a new plan and it articulates
01:48:53around it it has excellent perception
01:48:56for pose estimation of a 3D object not
01:49:00just its pose in 2D but its pose
01:49:02in 3D so it has to imagine what's around
01:49:05and how best to grab it so the
01:49:08foundation pose the grasp foundation and
01:49:12the articulation algorithms are now
01:49:15available we call it Isaac Manipulator
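"Geometry aware" planning boils down to collision tests like the segment-versus-circle check below: is the straight-line motion blocked, and does a candidate via point clear the obstacle? A real motion planner runs such tests many times while searching a high-dimensional joint space; the 2-D scene here is invented for illustration:

```python
import math

def segment_hits_circle(p, q, center, radius):
    """True if the straight-line motion from p to q passes through a
    circular obstacle -- the basic geometric test a geometry-aware
    planner evaluates repeatedly while searching for a free path."""
    (px, py), (qx, qy), (cx, cy) = p, q, center
    dx, dy = qx - px, qy - py
    # Closest point on the segment to the circle center, clamped to [0, 1].
    t = ((cx - px) * dx + (cy - py) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    nearest = (px + t * dx, py + t * dy)
    return math.hypot(nearest[0] - cx, nearest[1] - cy) <= radius

start, goal, obstacle = (0, 0), (10, 0), ((5, 0), 1.0)
direct_blocked = segment_hits_circle(start, goal, *obstacle)

# Something was placed in front of the arm: re-plan through a via point.
via = (5, 3)
replanned_ok = (not segment_hits_circle(start, via, *obstacle)
                and not segment_hits_circle(via, goal, *obstacle))
print(direct_blocked, replanned_ok)
```

Scaling this check from a 2-D circle to full 3-D geometry of the object and arm is what makes the planner "geometry aware" in the sense used above.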
01:49:17and they also just run on Nvidia's
01:49:21computers we're starting to do some
01:49:25really great work in the next generation
01:49:27of Robotics the next generation of
01:49:29Robotics will likely be a humanoid
01:49:32robotics we now have the necessary
01:49:35technology and as I was describing
01:49:38earlier the necessary technology to
01:49:40imagine generalized humanoid robotics in a
01:49:44way humanoid robotics is likely easier
01:49:46the reason for that is because we have a
01:49:48lot more imitation training data that we
01:49:51can provide the robots because we are
01:49:54constructed in a very similar way it is
01:49:56very likely that humanoid robots will
01:49:58be much more useful in our world because
01:50:00we created the world to be something
01:50:02that we can interoperate in and work
01:50:04well in and the way that we set up our
01:50:07workstations and Manufacturing and
01:50:08Logistics they were designed for for
01:50:10humans they were designed for people and
01:50:12so these humanoid robots will likely be
01:50:15much more productive to
01:50:17deploy while we're creating just like
01:50:20we're doing with the others the entire
01:50:22stack starting from the top a foundation
01:50:25model that learns from watching video
01:50:28human examples it could be in
01:50:32video form it could be in virtual
01:50:34reality form we then created a gym for
01:50:37it called Isaac reinforcement learning
01:50:40gym which allows the humanoid robot to
01:50:43learn how to adapt to the physical world
01:50:46and then an incredible computer the same
01:50:49computer that's going to go into a
01:50:50robotic car this computer will run
01:50:53inside a humanoid robot called Thor it's
01:50:55designed for Transformer engines we've
01:50:58combined several of these into one video
01:51:01this is something that you're going to
01:51:03really love take a
01:51:07look it's not enough for humans to
01:51:15imagine we have to
01:51:19invent and explore and push beyond
01:51:24what's been done ...
01:51:37faster we push it to
01:51:44learn we teach it then help it teach
01:51:48itself we broaden its understanding
01:51:58challenges with absolute ...
01:52:06succeed we make it
01:52:17reason so it can share our world with us
01:52:41this is where inspiration leads us the
01:52:46Frontier this is Nvidia Project GR00T
01:52:54a general purpose Foundation model for humanoid robot
01:52:58learning the GR00T model takes
01:53:00multimodal instructions and past
01:53:03interactions as input and produces the
01:53:05next action for the robot to
01:53:09execute we developed Isaac Lab a robot
01:53:12learning application to train GR00T on Isaac
01:53:16Sim and we scale out with OSMO a new
01:53:19compute orchestration service that
01:53:21coordinates work flows across dgx
01:53:23systems for training and ovx systems for
01:53:28simulation with these tools we can train
01:53:30GR00T in physically based simulation and
01:53:33transfer zero shot to the real
01:53:36world the GR00T model will enable a
01:53:39robot to learn from a handful of human
01:53:41demonstrations so it can help with everyday
01:53:46tasks and emulate human movement just by
01:53:51observing us this is made possible with nvidia's
01:53:54technologies that can understand humans
01:53:56from videos train models and simulation
01:53:59and ultimately deploy them directly to
01:54:01physical robots connecting GR00T to a
01:54:04large language model even allows it to
01:54:06generate motions by following natural
01:54:09language instructions hi go1 can you
01:54:12give me a high five sure thing let's
01:54:16high five can you give us some cool moves
01:54:25all this incredible intelligence is
01:54:26powered by the new Jetson Thor robotics
01:54:29chip designed for GR00T built for the
01:54:32future with Isaac Lab OSMO and GR00T
01:54:35we're providing the building blocks for
01:54:37the next generation of AI powered
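The interface narrated for GR00T, multimodal instructions plus past interactions in and the next action out, can be sketched as a policy stub. The hand-written rule below stands in for the learned model, and all action names are invented:

```python
from dataclasses import dataclass, field

@dataclass
class Observation:
    instruction: str            # natural-language command
    nearby_objects: list        # what the robot currently perceives

@dataclass
class Policy:
    """Stub of a GR00T-style policy: instructions and interaction history
    in, the next discrete action out. A hand-written rule stands in for
    the learned transformer."""
    history: list = field(default_factory=list)

    def next_action(self, obs):
        if "high five" in obs.instruction.lower():
            action = "raise_hand"
        elif obs.nearby_objects:
            action = f"reach_for:{obs.nearby_objects[0]}"
        else:
            action = "idle"
        # Past interactions are kept so they can condition the next step.
        self.history.append((obs.instruction, action))
        return action

policy = Policy()
print(policy.next_action(Observation("Can you give me a high five?", ["cup"])))
print(policy.next_action(Observation("Tidy the table", ["cup", "plate"])))
```

Only the input-output contract is meant seriously here: observation plus history goes in, one action comes out, and the loop repeats after the robot executes it.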
01:55:06Nvidia the intersection of computer
01:55:08Graphics physics artificial intelligence
01:55:12it all came to bear at this moment the
01:55:15name of that project general robotics
01:55:27good well I think we have some special
01:55:45guests so I understand you guys are
01:55:49powered by Jetson they're powered by Jetson
01:55:53little Jetson robotics computers inside
01:55:56they learn to walk in Isaac
01:56:02Sim ladies and gentlemen this this is
01:56:05orange and this is the famous green they
01:56:09are the bdx robots of
01:56:13Disney amazing Disney
01:56:18research come on you guys let's wrap up
01:56:23five things where are you
01:56:27going I sit right
01:56:33here don't be afraid come here green
01:56:42what are you saying no it's not time to
01:56:46eat it's not time
01:56:50to eat I'll give you a snack in a
01:56:53moment let me finish up real
01:56:55quick come on green hurry up stop
01:57:01time five things five things first a new
01:57:06Industrial Revolution every data center
01:57:08should be accelerated a trillion dollars
01:57:11worth of installed data centers will
01:57:13become modernized over the next several
01:57:15years second because of the
01:57:16computational capability we brought to
01:57:18bear a new way of doing software has
01:57:20emerged generative AI which is going to
01:57:23create new infrastructure
01:57:25dedicated to doing one thing and one
01:57:27thing only not for multi-user data
01:57:30centers but AI generators these AI
01:57:33generators will create incredibly valuable
01:57:37software a new Industrial Revolution
01:57:40second the computer of this revolution
01:57:43the computer of this generation
01:57:45generative AI trillion
01:57:47parameters Blackwell insane amounts of
01:57:51computers and computing
01:57:53third I'm trying to
01:57:57concentrate good job third new computer
01:58:02new computer creates new types of
01:58:04software new type of software should be
01:58:06distributed in a new way so that it can
01:58:09on the one hand be an endpoint in the
01:58:10cloud and easy to use but still allow
01:58:13you to take it with you because it is
01:58:15your intelligence your intelligence
01:58:17should be packaged up in a way that
01:58:19allows you to take it with you we call
01:58:21them NIMs and fourth these NIMs are going
01:58:24to help you create a new type of
01:58:26application for the future not one that
01:58:28you wrote completely from scratch but
01:58:30you're going to integrate them like
01:58:33teams to create these applications we have
01:58:36a fantastic capability between Nims the
01:58:39AI technology the tools Nemo and the
01:58:42infrastructure dgx cloud in our AI
01:58:45Foundry to help you create proprietary
01:58:47applications proprietary chat Bots and
01:58:49then lastly everything that moves in the
01:58:51future will be robotic you're not going
01:58:53to be the only one and these robotic
01:58:56systems whether they are humanoid amrs
01:59:00self-driving cars forklifts manipulating
01:59:03arms they will all need one thing giant
01:59:06stadiums warehouses factories there are going
01:59:09to be factories that are robotic
01:59:11orchestrating factories manufacturing
01:59:13lines that are robotic building cars
01:59:16robotically these systems all need one
01:59:19thing they need a platform a digital
01:59:22platform a digital twin platform and we
01:59:25call that Omniverse the operating system
01:59:29of the robotics world these are the five things that we
01:59:31talked about today what does Nvidia look
01:59:33like what does Nvidia look like when we
01:59:35talk about gpus there's a very different
01:59:38image that I have when I when people ask
01:59:40me about gpus first I see a bunch of
01:59:42software stacks and things like that and
01:59:44second I see this this is what we
01:59:47announce to you today this is Blackwell
01:59:57amazing amazing processors NVLink
02:00:00switches networking systems and the
02:00:03system design is a miracle this is
02:00:07Blackwell and this to me is what a GPU
02:00:18looks like in my mind listen orange green I think we have
02:00:22one more treat for everybody what do you
02:00:25say okay we have one more thing to show
02:02:47you thank you have a great
02:02:50GTC thank you all for coming thank you