WATCH: Jensen Huang's Nvidia GTC Keynote - LIVE
CNET Highlights | 2024-03-18
💫 Short Summary
Jensen Huang's GTC 2024 keynote surveys Nvidia's advances in accelerated computing and generative AI, centered on the announcement of the Blackwell GPU platform. It covers high-performance chips and AI supercomputers, Omniverse virtual-world simulation and digital twins, and collaborations across the industry. Applications in weather forecasting, healthcare, and robotics illustrate the technology's reach, and the keynote closes with a look at AI-driven data centers ("AI factories"), NIM-based software development, and humanoid robotics.
✨ Highlights
📊 Transcript
✦
Diverse Applications of AI Across Fields.
02:18 AI can guide the blind, help store renewable energy, teach robots, and generate virtual scenarios for safely exploring the real world.
Researchers from climate tech to radio sciences and self-driving cars are using AI and accelerated computing to solve problems ordinary computers cannot.
Jensen Huang points to the industry leaders presenting at GTC, Michael Dell among them, and notes that roughly $100 trillion of the world's industries is represented in the room.
✦
Transformation of industries due to computing advancements, particularly in artificial intelligence.
06:45 Nvidia's journey since its founding in 1993, with milestones such as CUDA (2006), AlexNet meeting CUDA (2012), and the DGX-1 supercomputer delivered to OpenAI (2016).
Emergence of generative AI and new industry categories, highlighting unique software production methods and applications.
Exploration of future preparations needed for the evolving industry landscape.
✦
Nvidia showcases the importance of accelerated computing in their Omniverse virtual world simulation.
10:54The goal is to drive up computing scale for creating digital twins of products, allowing for digital design, simulation, and operation.
Nvidia announces partnerships with Ansys for engineering simulation and connecting Ansys to the Omniverse digital twin.
The intersection of computer graphics, physics, and AI plays a crucial role in advancing virtual world simulations.
Once an ecosystem is CUDA-accelerated, the same GPU infrastructure also serves generative AI, a key bonus of accelerating computing.
✦
NVIDIA CUDA-accelerates the chip-design ecosystem: Synopsys, TSMC, and Cadence.
17:37 cuLitho, a domain-specific library, accelerates computational lithography; TSMC is putting it into production.
Generative AI will next be applied to semiconductor manufacturing, and Cadence is building a supercomputer of NVIDIA GPUs so its customers can run fluid-dynamics simulation at far larger scale.
The future includes AI co-pilots for chip design and connecting Cadence's digital-twin platform to Omniverse, accelerating the world's CAE, EDA, and SDA tools so the industry can create its future in digital twins.
✦
Importance of larger GPUs for efficient model training.
21:57 NVIDIA connects many GPUs, with technologies such as Tensor Cores, NVLink, and Mellanox InfiniBand, into what are effectively giant virtual GPUs.
Its supercomputers progressed from DGX-1 (2016) to Selene with roughly 4,500 GPUs (2021) to EOS, one of the world's largest AI supercomputers, which came online in 2023.
Emphasis on building chips, systems, networking, and software to support advancements.
Utilization of multimodality data for training models, combining text, images, graphs, and charts to ground models in physics and common sense.
✦
Introduction of the Blackwell GPU, the most advanced GPU in production, built from two reticle-sized dies totaling 208 billion transistors.
29:47 A 10 TB/s die-to-die link lets the two dies behave as a single chip, with no memory-locality or cache problems.
One Blackwell variant is form-, fit-, and function-compatible with Hopper, so existing HGX installations worldwide can be upgraded with the same infrastructure, power, thermals, and software.
The ambitious engineering of the Blackwell chip and its Hopper compatibility mark a substantial advance in GPU technology.
✦
Development of the Grace Blackwell system and a second-generation Transformer engine.
32:50 The Grace Blackwell board connects two Blackwell chips (four dies) to a Grace CPU over a fast chip-to-chip link, with memory coherence across the whole system.
The second-generation Transformer engine dynamically rescales and recasts numerical formats to lower precision wherever a given stage of the pipeline can tolerate it.
Fifth-generation NVLink, twice as fast as Hopper's, adds computation in the network so GPUs can synchronize and reduce partial products efficiently.
Together these features raise processing capability and efficiency for large training runs.
✦
Fifth-generation NVLink gives Blackwell 1.8 TB/s of GPU-to-GPU bandwidth, roughly twice Hopper's.
36:52 A RAS (reliability) engine self-tests every gate and every bit of memory on the chip and its attached memory, so weak nodes can be retired before they cause downtime.
Data is encrypted at rest, in transit, and while being computed, protecting expensive trained models.
A high-speed decompression engine moves data in and out of the nodes about 20 times faster, keeping the GPUs fed.
Blackwell delivers about 2.5x Hopper's FP8 training performance per chip, and the new FP6 and FP4 formats expand effective memory capacity and double inference throughput (a toy sketch of the 4-bit idea follows this list).
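To make the low-precision point concrete, here is a minimal sketch assuming a simple symmetric 4-bit integer scheme rather than NVIDIA's actual FP4 format (which the keynote does not detail): dropping from 8- or 16-bit values to 4-bit values doubles or quadruples how many parameters fit in the same memory, at the cost of some rounding error.

```python
import numpy as np

def quantize_4bit(weights: np.ndarray):
    """Toy symmetric 4-bit quantization: map floats to integers in [-8, 7]
    with one per-tensor scale. Real FP4 is more sophisticated, but the
    memory arithmetic is the same: 4 bits per value instead of 8 or 16."""
    scale = np.max(np.abs(weights)) / 7.0                     # one scale for the whole tensor
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_4bit(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(1024).astype(np.float32)
q, s = quantize_4bit(w)
w_hat = dequantize_4bit(q, s)

# We store int8 here for simplicity; real kernels pack two 4-bit values
# per byte, so the same HBM capacity holds twice as many parameters as
# an 8-bit format (and four times as many as FP16).
print("max abs rounding error:", np.max(np.abs(w - w_hat)))
print("bytes at fp16:", w.size * 2, "| bytes at 4-bit (packed):", w.size // 2)
```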
✦
In the future most content will be generated on the fly by AI that understands the user's context, rather than retrieved and recomposed, saving energy, bandwidth, and time.
41:34 Generative AI is a new way of computing, with content-token generation at its core.
Computing has scaled roughly 1,000x in the last eight years, and the new NVLink Switch chip (50 billion transistors, with in-network computation) lets every GPU talk to every other GPU at full speed.
The goal is a cost-effective system in which all GPUs connect over a coherent link and effectively behave as one giant GPU (a rough communication estimate follows this list).
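As a rough illustration of why that coherent link matters, here is a back-of-the-envelope sketch of all-reduce time using the standard ring all-reduce traffic formula and the 1.8 TB/s per-GPU NVLink figure from the keynote; the tensor size is a made-up example, not a number from the talk.

```python
# Toy estimate of ring all-reduce time across N GPUs. Each GPU sends and
# receives about 2*(N-1)/N times the tensor size; latency and overlap with
# compute are ignored, so this is only a back-of-the-envelope sketch.
def allreduce_seconds(tensor_bytes: float, n_gpus: int, link_bw_bytes_per_s: float) -> float:
    traffic_per_gpu = 2 * (n_gpus - 1) / n_gpus * tensor_bytes
    return traffic_per_gpu / link_bw_bytes_per_s

NVLINK_BW = 1.8e12        # 1.8 TB/s per GPU, the figure quoted in the keynote
tensor_bytes = 8e9        # hypothetical 8 GB shard of partial products

for n in (8, 72):         # e.g. one HGX node vs. an NVLink-72 rack
    t = allreduce_seconds(tensor_bytes, n, NVLINK_BW)
    print(f"{n} GPUs: ~{t*1e3:.1f} ms per all-reduce of {tensor_bytes/1e9:.0f} GB")
```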
✦
Evolution of the DGX System
46:17 The new DGX delivers 720 petaflops, approaching an exaflop of training compute in a single rack.
The NVLink spine carries 130 TB/s over two miles of copper cabling, avoiding the cost and roughly 20 kW of power that optical transceivers and retimers would have required.
The rack is liquid-cooled and draws about 120 kW, with coolant entering near room temperature and leaving at about 45 °C.
The design has grown to roughly 600,000 parts and 3,000 lb, a measure of how far the "GPU" has come (a quick cooling sanity check follows this list).
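The cooling figures quoted in the keynote are easy to sanity-check; this is just the textbook heat-capacity calculation, assuming the coolant behaves like water.

```python
# Sanity check of the liquid-cooling numbers from the keynote: coolant in at
# ~25 C, out at ~45 C, flowing at ~2 L/s, for a rack drawing about 120 kW.
flow_kg_per_s = 2.0          # ~2 litres of water per second is ~2 kg/s
c_water = 4186.0             # J / (kg*K), specific heat of water
delta_t = 45.0 - 25.0        # temperature rise in K

heat_removed_w = flow_kg_per_s * c_water * delta_t
print(f"Coolant can carry away ~{heat_removed_w/1e3:.0f} kW")   # ~167 kW
# That comfortably covers a ~120 kW rack, consistent with the figures on stage.
```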
✦
Efforts to reduce cost and energy consumption in training AI models.
50:46Goal is to decrease costs and energy usage to scale up computation for next-gen models.
Large language models require parallel work across multiple GPUs due to their size.
Future involves generative chatbots with trillions of parameters needing supercomputers.
Token generation efficiency crucial for interactions with chatbots, requiring parallelization across numerous GPUs.
✦
Balancing data-center throughput against per-user interactivity determines the cost of each generated token.
54:51 Distributing one large model across many GPUs while preserving both throughput and quality of service is hard; the space of possible partitionings is enormous.
Software must be optimized separately for each parallel configuration of GPUs.
CUDA's programmability lets NVIDIA explore tensor, expert, pipeline, and data parallelism to find the best operating points (a toy version of that search follows this list).
Blackwell's inference capability for trillion-parameter generative AI far exceeds Hopper's.
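A minimal sketch of the kind of search described above: enumerate ways to split a fixed GPU pool into tensor-, expert-, and data-parallel groups and score each split. The scoring function here is invented purely for illustration; a real optimizer would use measured kernel and interconnect timings rather than this made-up scaling law.

```python
from itertools import product

N_GPUS = 64

def candidate_configs(n_gpus):
    """Enumerate (tensor, expert, data) parallel splits that use all GPUs."""
    for tp, ep, dp in product([1, 2, 4, 8], repeat=3):
        if tp * ep * dp == n_gpus:
            yield tp, ep, dp

def toy_score(tp, ep, dp):
    """Made-up cost model: more TP/EP -> faster per-user token rate,
    more DP -> more concurrent users and higher total throughput."""
    tokens_per_s_per_user = 20.0 * tp * ep ** 0.5     # illustrative only
    users_per_replica = 8                             # illustrative only
    total_tokens_per_s = tokens_per_s_per_user * users_per_replica * dp
    return tokens_per_s_per_user, total_tokens_per_s

for tp, ep, dp in candidate_configs(N_GPUS):
    per_user, total = toy_score(tp, ep, dp)
    print(f"TP{tp} EP{ep} DP{dp}: ~{per_user:.0f} tok/s/user, ~{total:.0f} tok/s total")
```

Plotting per-user rate against total throughput for each configuration gives exactly the kind of scatter-and-roofline picture described in the keynote.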
✦
Blackwell: Revolutionary AI system with rapid communication capabilities.
01:00:00Blackwell's launch generating excitement and anticipation in the industry.
Companies and organizations signing up to utilize Blackwell's advanced technology.
Partners like AWS and Google investing heavily in developing secure AI systems with Nvidia's technology.
A shift towards AI-focused data centers, "AI factories" whose product is intelligence, much as the last industrial revolution's plants produced electricity.
✦
Collaboration between companies to optimize and accelerate data processing and services.
01:03:04 Oracle, Microsoft Azure, and Google Cloud are each partnering with Nvidia to accelerate and enhance their offerings.
Wistron using digital twins powered by Omniverse to increase worker efficiency and reduce construction time for factories.
Digital twin technology helping avoid costly change orders and improve operations significantly.
Overall, collaboration and use of digital twins driving innovation and efficiency in various industries.
✦
Breakthrough in software technology compresses million-dimensional data into three letters.
01:06:31Evolution of AI in recognizing and understanding text, images, videos, and sounds.
Potential to apply advancements in understanding and generating digitized data like proteins, genes, and brain waves.
Possibilities of using AI to predict extreme weather events at a regional scale with high resolution to prevent harm.
✦
Nvidia's generative weather model CorrDiff improves storm-prediction accuracy with high-resolution forecasting.
01:10:15 CorrDiff can super-resolve extreme events such as typhoons from 25 km down to 2 km resolution quickly and efficiently.
Kilometer-scale regional forecasts help reduce loss of life and property damage from extreme weather.
Nvidia is collaborating with The Weather Company to bring these Earth-2 simulation capabilities into its forecasting products.
Nvidia is also applying AI in healthcare for medical imaging, gene sequencing, and computational chemistry.
✦
AlphaFold revolutionized protein-structure prediction; Nvidia builds on that for drug discovery.
01:14:40 Nvidia's BioNeMo NIMs enable generative screening, molecule generation, and docking in minutes.
The pipeline optimizes for desired molecular properties, accelerating medicine development.
NVIDIA Inference Microservices (NIMs) package pre-trained models behind standard APIs for easy integration into workflows (a hedged example of calling one follows this list).
Companies can plug these models directly into their drug-discovery workflows.
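NIMs are shipped as containers that expose an OpenAI-compatible HTTP endpoint, so a standard client can call them; the host, port, and model name below are placeholders rather than values from the keynote, so treat this as a sketch of the pattern, not a definitive recipe.

```python
# Hedged sketch of calling a locally hosted NIM through its
# OpenAI-compatible endpoint. Endpoint URL and model name are assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",   # wherever your NIM container serves
    api_key="not-needed-for-local-nim",    # placeholder credential
)

resp = client.chat.completions.create(
    model="example-llm",                    # hypothetical model name
    messages=[{"role": "user",
               "content": "Summarize NVIDIA's GTC 2024 keynote in one sentence."}],
)
print(resp.choices[0].message.content)
```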
✦
Discussion on optimizing AI models across multiple GPUs and utilizing AI APIs for future software interactions.
01:17:50Introduction of AI APIs as user-friendly interfaces for accessing AI technologies, simplifying usage and accessibility.
Development of chatbots and AI agents, envisioning a future where software is created by assembling teams of AI systems.
Emphasis on collaboration between AI systems to accomplish complex tasks and the integration of AI technologies in different industries.
✦
Nvidia's internal program CTL aids in tracing compute cycles and enhances chip designers' productivity.
01:22:00CTL stands for Compute Trace Library, not combinatorial timing logic as previously misunderstood.
The Nemo microservice helps in curating and fine-tuning AI data.
Nvidia serves as an AI Foundry, offering technology, tools, and infrastructure deployment options for developing AI models.
The focus is on customization, understanding proprietary information, and utilizing data within the company.
✦
Use of vector databases and chatbots such as NeMo Retriever and Diane enhances information retrieval and organization in companies.
01:27:31Enterprise IT industry has valuable tools and data that can act as co-pilots to aid in tasks.
Nvidia AI Foundry collaborates with major companies to develop AI assistants and improve data management capabilities.
Collaboration with companies like SAP and ServiceNow showcases the potential for innovation and optimization within the industry.
✦
Snowflake provides digital warehouse services in the cloud for 10,000 Enterprise customers.
01:28:42Snowflake is collaborating with Nvidia AI Foundry to develop chatbots and co-pilots for its customers.
Dell is a crucial partner in constructing AI factories for large-scale enterprise systems.
The future of AI includes computers interpreting the physical world by watching videos and analyzing human actions.
Nvidia has introduced the DGX system for AI training and the Jetson processor for autonomous systems, enhancing robotics and AI technology.
✦
Language models for reinforcement learning in robotics.
01:32:56Importance of physical feedback in teaching robots articulation and manipulation capabilities.
Creation of a virtual world called Omniverse and the computer system OVX in the Azure cloud.
Integration of AI and Omniverse in a warehouse setting for autonomous systems to interact with humans and forklifts.
Future of heavy industries through digital twins and AI agents assisting in industrial operations.
✦
Using natural language to ask questions and improve operations through a visual model.
01:36:05Sensor data is simulated and passed to real-time AI for deployment in physical warehouses.
Integration of AI with digital twins in future facilities and development of Omniverse Cloud APIs for easy access.
Announcement that Siemens is connecting its crown-jewel Xcelerator platform to Nvidia Omniverse, enhancing product-lifecycle management with AI and Omniverse technologies.
✦
Nvidia's Omniverse collaboration with Hyundai and Nissan optimizes manufacturing processes through interactive visualization and generative AI.
01:40:54The integration of Omniverse streamlines workflows from design to manufacturing, leading to cost and time savings.
Nissan utilizes Omniverse to connect design tools for enhanced productivity in their workflow.
Vision Pro is connected to Omniverse portals for virtual collaboration and integration of CAD and design tools.
The segment also looks ahead to robotics in the automotive industry, including upcoming self-driving vehicle releases from Mercedes-Benz and JLR.
✦
NVIDIA unveils 'Thor' AV computer and 'Isaac Perceptor' SDK for robotics.
01:46:05'Thor' is designed for the automotive industry and adopted by BYD.
'Isaac Perceptor' enables perception and adaptive programming for robotics.
Focus on enhancing robotics technology with state-of-the-art vision and odometry capabilities.
New 'Isaac Manipulator' offers enhanced pose estimation and grip algorithms for 3D object manipulation.
✦
Advancements in robotics are leading to the development of highly productive humanoid robots.
01:52:47 Nvidia's Project GR00T is a general-purpose foundation model for humanoid robot learning, driven by multimodal instructions and past interactions.
The team is building simulation and orchestration tools, Isaac Lab and OSMO, for training across systems.
The GR00T model aims to let robots learn from human demonstrations and perform everyday tasks by observing human movement.
Nvidia's technologies are being leveraged for understanding, training, and deployment of humanoid robots.
✦
Advancements in generative AI and its impact on data centers.
01:57:39New infrastructure for AI generators is emerging, leading to valuable software creation and a new Industrial Revolution.
New types of software are being developed in a user-friendly manner, allowing users to carry their intelligence with them.
The integration of NIMs into application development is highlighted as a collaborative and efficient way to build innovative applications for the future.
✦
The future: everything that moves will be robotic, built and operated with the Omniverse digital-twin platform.
01:58:45 Jensen recaps Blackwell, a revolutionary new GPU platform, emphasizing system design and networking as much as the chip itself.
He stresses the importance of the accompanying software stacks alongside the advances in GPUs.
The video ends with gratitude towards the audience and a closing musical performance.
00:02[Music]
00:21 [Opening film] I am a visionary, illuminating galaxies to witness the birth of stars, and sharpening our understanding of extreme weather events. I am a helper, guiding the blind through a crowded world ("I was thinking about running to the store"), and giving voice to those who cannot speak. I am a transformer, harnessing gravity to store renewable power, and paving the way towards unlimited clean energy for us all. I am a trainer, teaching robots to assist, to watch out for danger, and help save lives. I am a healer, providing a new generation of cures and new levels of patient care. "Doctor, I am allergic to penicillin. Is it still okay to take the medications?" "Definitely. These antibiotics don't contain penicillin, so it's perfectly safe for you to take them." I am a navigator, generating virtual scenarios to let us safely explore the real world and understand every decision. I even helped write the script, breathe life into the words. I am AI, brought to life by Nvidia, deep learning, and brilliant minds everywhere.
03:24 Please welcome to the stage Nvidia founder and CEO, Jensen Huang. [Applause]
03:44 Welcome to
03:51GTC I hope you realize this is not a
03:57concert you have arrived
04:00at a developers
04:03conference there will be a lot of
04:05science
04:06described algorithms computer
04:09architecture
mathematics. I sensed a very heavy weight in the room all of a sudden, almost like you were in the wrong place. In no conference in the world is there a greater assembly of researchers from such diverse fields of science, from climate tech to radio sciences, trying to figure out how to use AI: to robotically control MIMOs for next-generation 6G radios, robotic self-driving cars, even artificial intelligence. Even artificial intelligence, everybody's first. I noticed a sense of relief there, all of a sudden. This conference is also represented by some amazing companies. This list, these are not the attendees, these are the presenters. And what's amazing is this: if you take away all of my close friends (Michael Dell is sitting right there, in the IT industry), all of the friends I grew up with in the industry, if you take away that list, this is what's amazing. These are the presenters from the non-IT industries, using accelerated computing to solve problems that normal computers can't. It's represented in life sciences, healthcare, genomics, transportation, of course retail, logistics, manufacturing, industrial. The gamut of industries represented is truly amazing. And you're not here only to attend, you're here to present, to talk about your research. $100 trillion of the world's industries is represented in this room today. This is absolutely
06:36amazing there is absolutely something
06:39happening there is something going
06:42on the industry is being transformed not
06:45just ours because the computer industry
06:49the computer is the single most
06:51important instrument of society today
06:55fundamental transformations in Computing
06:57affects every industry but how did we
07:00start how did we get here I made a
07:03little cartoon for you literally I drew
07:06this in one page this is nvidia's
07:09Journey started in
07:121993 this might be the rest of the
07:15talk 1993 this is our journey we were
07:18founded in 1993 there are several
07:20important events that happen along the
07:22way I'll just highlight a few in 2006
CUDA, which has turned out to have been a revolutionary computing model. We thought it was revolutionary then; it was going to be an overnight success, and almost 20 years later it happened. We saw it coming, two decades later. In 2012, AlexNet: AI and CUDA made first contact. In 2016, recognizing the importance of this computing model, we invented a brand new type of computer we called the DGX-1: 170 teraflops in this supercomputer, eight GPUs connected together for the very first time. I hand-delivered the very first DGX-1 to a startup located in San Francisco, called OpenAI. The DGX-1 was the world's first AI supercomputer. Remember, 170 teraflops. 2017, the Transformer arrived. 2022, ChatGPT captured the world's imagination and had people realize the importance and the capabilities of artificial intelligence. In 2023, generative AI emerged,
09:00and a new industry begins
09:03why why is a new industry because the
09:06software never existed before we are now
09:10producing software using computers to
09:12write software producing software that
09:15never existed before it is a brand new
09:17category it took share from
09:20nothing it's a brand new category and
09:23the way you produce the
09:25software is unlike anything we've ever
09:28done before in data
09:31centers generating
09:33tokens
09:35producing floating Point
09:38numbers at very large scale as if in the
09:43beginning of this last Industrial
09:45Revolution when people realized that you
09:47would set up
09:50factories apply energy to it and this
09:54invisible valuable thing called
09:56electricity came out AC generators
10:00and 100 years later 200 years later we
10:02are now creating new types of electrons
10:07tokens using infrastructure we call
10:10factories AI factories to generate this
10:13new incredibly valuable thing called
10:16artificial intelligence a new industry
10:18has
10:19emerged well we're going to talk about
10:23many things about this new
10:24industry we're going to talk about how
10:26we're going to do Computing next we're
10:28going to talk talk about the type of
10:30software that you build because of this
10:32new industry the new
10:35software how you would think about this
10:37new software what about applications in
10:39this new
10:40industry and then maybe what's next and
10:44how can we start preparing today for
10:46what is about to come next well but
10:49before I
10:50start I want to show you the soul of
10:54Nvidia the soul of our company at the
10:58intersection
11:00of computer
11:02Graphics
11:04physics and artificial
11:06intelligence all intersecting inside a
11:11computer in
11:13Omniverse in a virtual world
11:16simulation everything we're going to
11:18show you today literally everything
11:20we're going to show you
11:21today is a simulation not animation it's
11:25only beautiful because it's physics the
11:27world is beautiful
11:29it's only amazing because it's being
11:32animated with robotics it's being
11:34animated with artificial intelligence
11:36what you're about to see all
11:38day it's completely generated completely
11:41simulated and Omniverse and all of it
11:45what you're about to enjoy is the
11:46world's first concert where everything
11:48is
11:56homemade everything is homemade you're
11:59about to watch some home videos so sit
12:03back and enjoy
12:05[Music]
12:14[Music]
12:28yourself
12:49[Music]
12:58St
13:16[Music]
13:25[Music]
13:28down
13:35[Music]
13:58get
14:06[Music]
14:21[Music]
14:27[Music]
14:33[Music]
14:50God I love
14:55Nvidia accelerated Computing has re
14:59reached the Tipping
15:00Point general purpose Computing has run
15:03out of steam we need another way of
15:06doing Computing so that we can continue
15:08to scale so that we can continue to
15:09drive down the cost of computing so that
15:12we can continue to consume more and more
15:15Computing while being sustainable
15:18accelerated Computing is a dramatic
15:20speed up over general purpose Computing
15:24and in every single industry we engage
15:27and I'll show you many
15:29the impact is dramatic but in no
15:32industry is it more important than our
15:35own the industry of using simulation
15:38tools to create
15:41products in this industry it is not
15:44about driving down the cost of computing
15:46it's about driving up the scale of
15:48computing we would like to be able to
15:50simulate the entire product that we do
15:53completely in full Fidelity completely
15:57digitally and essentially what we call
16:00digital twins we would like to design it
16:03build it simulate it operate it
16:07completely
16:08digitally in order to do that we need to
16:12accelerate an entire industry and today
16:15I would like to announce that we have
16:17some Partners who are joining us in this
16:19journey to accelerate their entire
16:22ecosystem so that we can bring the world
16:25into accelerated Computing but there's a
bonus: when you become accelerated, your infrastructure is CUDA GPUs, and when that happens, it's exactly the same infrastructure for generative AI. And so I'm just delighted to announce several very important partnerships. These are some of the most important companies in the world. Ansys does engineering simulation for what the world makes. We're partnering with them to CUDA-accelerate the Ansys ecosystem, to connect Ansys to the Omniverse digital twin. Incredible. The thing that's really
17:04great is that the install base of Nvidia
17:06GPU accelerated systems are all over the
17:08world in every cloud in every system all
17:11over Enterprises and so the app the
17:14applications they accelerate will have a
17:16giant installed base to go serve end
17:18users will have amazing applications and
17:21of course system makers and csps will
17:23have great customer
17:25demand
Synopsys. Synopsys is Nvidia's literally first software partner; they were there on the very first day of our company. Synopsys revolutionized the chip industry with high-level design. We are going to CUDA-accelerate Synopsys. We're accelerating computational lithography, one of the most important applications that nobody's ever known about. In order to make chips, we have to push lithography to its limit. Nvidia has created a library, a domain-specific library, that accelerates computational lithography incredibly. Once we can accelerate and software-define all of TSMC, who is announcing today that they're going to go into production with Nvidia cuLitho, once it's software-defined and accelerated, the next step is to apply generative AI to the future of semiconductor manufacturing, pushing geometry even
further. Cadence builds the world's essential EDA and SDA tools. We also use Cadence. Between these three companies, Ansys, Synopsys, and Cadence, we basically build Nvidia together. We are CUDA-accelerating Cadence. They're also building a supercomputer out of Nvidia GPUs so that their customers could do fluid-dynamics simulation at a hundred, a thousand times scale: basically a wind tunnel in real time. Cadence Millennium, a supercomputer with Nvidia GPUs inside, a software company building supercomputers, I love seeing that. We're building Cadence co-pilots together. Imagine a day when Cadence, Synopsys, Ansys tool providers would offer you AI co-pilots, so that we have thousands and thousands of co-pilot assistants helping us design chips and design systems. And we're also going to connect the Cadence digital twin platform to Omniverse. As you can see the trend here: we're accelerating the world's CAE, EDA, and SDA so that we can create our future in digital twins, and we're going to connect them all to
19:37Omniverse the fundamental operating
19:39system for future digital
19:42twins one of the industries that
19:44benefited tremendously from scale and
19:47you know you all know this one very well
19:49large language
19:50models basically after the Transformer
19:53was
19:54invented we were able to scale large
19:57language models and incredible rates
20:00effectively doubling every 6 months now
20:02how is it possible that by doubling
20:04every 6 months that we have grown the
20:07industry we have grown the computational
20:10requirements so far and the reason for
20:12that is quite simply this if you double
20:14the size of the model you double the
20:16size of your brain you need twice as
20:17much information to go fill it and so
20:20every time you double your parameter
20:24count you also have to appropriately
20:27increase your training token count the
20:30combination of those two
20:32numbers becomes the computation scale
20:35you have to
support. The latest, the state-of-the-art OpenAI model is approximately 1.8 trillion parameters. 1.8 trillion parameters required several trillion tokens to go train. So a few trillion parameters, on the order of, a few trillion tokens, on the order of; when you multiply the two of them together, approximately 30, 40, 50 billion quadrillion floating-point operations. Now we just have to do some CEO math right now, just hang with me. So you have 30 billion quadrillion; a quadrillion is like a peta. And so if you had a petaflop GPU, you would need 30 billion seconds to go compute, to go train that model. 30 billion seconds is approximately 1,000 years. Well, 1,000 years. It's worth it. I'd like to do it sooner, but it's worth it. Which is usually my answer when most people tell me, hey, how long is it going to take to do something? Twenty years? It's worth it. But can we do it next week? And so, 1,000 years.
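A quick back-of-the-envelope check of the figures above, using only the keynote's own numbers (about "30 billion quadrillion" floating-point operations and a 1 petaflop/s GPU); this is just the arithmetic behind the "about 1,000 years" line:

```python
# Check of the "1,000 years" estimate from the keynote's quoted figures.
total_flops = 30e9 * 1e15          # "30 billion quadrillion" operations, about 3e25
gpu_flops_per_second = 1e15        # a 1-petaflop/s GPU

seconds = total_flops / gpu_flops_per_second        # 3e10 seconds
years = seconds / (365 * 24 * 3600)
print(f"{seconds:.1e} seconds ~ {years:,.0f} years") # about 950 years, i.e. "about 1,000 years"
```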
22:01what we need what we
22:04need are bigger
22:06gpus we need much much bigger gpus we
22:09recognized this early on and we realized
22:13that the answer is to put a whole bunch
22:14of gpus together and of course innovate
22:17a whole bunch of things along the way
22:19like inventing tensor cores advancing MV
22:22link so that we could create essentially
22:24virtually Giant
22:26gpus and connecting them all together
22:28with amazing networks from a company
22:30called melanox infiniband so that we
22:32could create these giant systems and so
22:35djx1 was our first version but it wasn't
22:37the last we built we built
22:39supercomputers all the way all along the
22:42way in
22:442021 we had Seline 4500 gpus or so and
22:49then in 2023 we built one of the largest
22:52AI supercomputers in the world it's just
22:54come
22:55online
22:57EOS and as we're building these things
23:00we're trying to help the world build
23:02these things and in order to help the
23:04world build these things we got to build
23:05them first we build the chips the
23:07systems the networking all of the
23:10software necessary to do this you should
23:12see these
23:13systems imagine writing a piece of
23:15software that runs across the entire
23:17system Distributing the computation
23:19across thousands of gpus but inside are
23:23thousands of smaller
23:26gpus millions of gpus to distribute work
23:29across all of that and to balance the
23:31workload so that you can get the most
23:33Energy Efficiency the best computation
23:35time keep your cost down and so those
23:39those fundamental
23:42Innovations is what got us here and here
23:45we
23:46are as we see the miracle of chat GPT
23:51emerge in front of us we also realize we
23:54have a long ways to go we need even
23:57larger models model we're going to train
23:59it with multimodality data not just text
24:02on the internet but we're going to we're
24:04going to train it on texts and images
24:05and graphs and
24:07charts and just as we learn watching TV
24:11and so there's going to be a whole bunch
24:12of watching video so that these Mo
24:15models can be grounded in physics
24:18understands that an arm doesn't go
24:19through a wall and so these models would
24:22have common sense by watching a lot of
24:25the world's video combined with a lot of
24:28the world languages they'll use things
24:30like synthetic data generation just as
24:32you and I do when we try to learn we
24:35might use our imagination to simulate
24:37how it's going to end up just as I did
24:40when I Was preparing for this keynote I
24:42was simulating it all along the
24:45way hope it's going to turn out as well
24:49as I had it in my
24:56head as I was simulating how this
24:59keynote was going to turn out somebody
25:00did say that another
25:04performer did her performance completely
25:07on a
25:08treadmill so that she could be in shape
25:10to deliver it with full
25:13energy I I didn't do
25:16that if I get a little wind at about 10
25:19minutes into this you know what
25:22happened and so so where were we we're
25:26sitting here using synthetic data
25:27generation we're going to use
25:29reinforcement learning we're going to
25:30practice it in our mind we're going to
25:32have ai working with AI training each
25:34other just like student teacher
25:37Debaters all of that is going to
25:38increase the size of our model it's
25:40going to increase the amount of con the
25:41amount of data that we have and we're
25:43going to have to build even bigger
25:46gpus Hopper is
25:49fantastic but we need bigger
25:52gpus and so ladies and
25:56gentlemen I would like to introduce you
26:00to a very very big
26:05[Applause]
GPU, named after David Blackwell: a mathematician, a game theorist, probability. We thought it was a perfect, perfect name. Blackwell. Ladies and gentlemen, enjoy
26:57this
27:27what
27:57for
28:57s
28:58[Applause]
29:09Blackwell is not a chip Blackwell is the
29:11name of a
29:12platform uh people think we make
29:15gpus and and we do but gpus don't look
29:20the way they used
29:21to uh here here's the here's the here's
29:24the the if you will the heart of the
29:27blackw system
29:28and this inside the company is not
29:30called Blackwell it's just the number
29:33and um uh
29:35this this is Blackwell sitting next to
29:39oh this is the most advanced GPU in the
29:41world in production
29:45today this is
29:47Hopper this is hopper Hopper changed the
29:52world this is
29:57Blackwell
30:04it's okay
30:10Hopper you're you're very
30:13good good good
30:15boy well good
girl. 208 billion transistors. And so you could see, I can see, there's a small line between two dies. This is the first time two dies have abutted like this together, in such a way that the two dies think it's one chip. There's 10 terabytes of data between it, 10 terabytes per second, so that these two sides of the Blackwell chip have no clue which side they're on. There's no memory locality issues, no cache issues, it's just one giant chip. And so when we were told
30:57that Blackwell's Ambitions were beyond
30:59the limits of physics uh the engineer
31:02said so what and so this is what what
31:05happened and so this is the Blackwell
31:08chip and it goes into two types of
31:11systems the first
31:13one is form fit function compatible to
31:17Hopper and so you slide on Hopper and
31:19you push in Blackwell that's the reason
31:21why one of the challenges of ramping is
31:23going to be so efficient there are
31:25installations of Hoppers all over the
31:27world world and they could be they could
31:29be you know the same infrastructure same
31:31design the power the electricity The
31:34Thermals the software identical push it
31:38right back and so this is a hopper
31:41version for the current hgx
31:45configuration and this is what the other
31:47the second Hopper looks like this now
31:50this is a prototype board and um Janine
31:54could I just
31:55borrow ladies and gentlemen Janine Paul
32:03and so this this is the this is a fully
32:06functioning board and I just be careful
32:10here this right here is I don't know1
32:19billion the second one's
32:25five it gets cheaper after that so any
32:28customers in the audience it's
32:33okay all right but this is this one's
32:35quite expensive this is to bring up
32:37board and um and the the way it's going
32:40to go to production is like this one
32:41here okay and so you're going to take
32:44take this it has two Blackwell D two two
32:47Blackwell chips and four Blackwell dyes
32:50connected to a Grace CPU the grace CPU
32:54has a super fast chipto chip link what's
32:57amazing is this computer is the first of
33:00its kind where this much computation
33:03first of all fits into this small of a
33:06place second it's memory coherent they
33:10feel like they're just one big happy
33:11family working on one application
33:15together and so everything is coherent
33:17within it um the just the amount of you
33:21know you saw the numbers there's a lot
33:22of terabytes this and terabytes that um
33:25but this is this is a miracle this is a
33:28uh this let's see what are some of the
33:30things on here uh there's um uh mvy link
33:34on top PCI Express on the
33:38bottom on on uh
33:42your which one is mine and your left one
33:45of them it doesn't matter uh one of them
33:48one of them is a c CPU chipto chip link
33:51is my left or your depending on which
33:53side I was just I was trying to sort
33:55that out and I just kind of doesn't
34:02matter hopefully it comes plugged in
34:09so okay so this is the grace Blackwell
34:22system but there's
34:26more so turn turns out it turns out all
34:30of the specs is fantastic but we need a
34:32whole lot of new features u in order to
34:35push the limits Beyond if you will the
34:38limits of
34:39physics we would like to always get a
34:41lot more X factors and so one of the
34:44things that we did was We Invented
34:45another Transformer engine another
34:47Transformer engine the second generation
34:50it has the ability to
34:51dynamically and automatically
34:55rescale and recas
34:58numerical formats to a lower Precision
35:02whenever it can remember artificial
35:04intelligence is about probability and so
35:07you kind of have you know 1.7
35:09approximately 1.7 time approximately 1.4
35:12to be approximately something else does
35:14that make sense and so so the the
35:16ability for the mathematics to retain
35:20the Precision and the range necessary in
35:23that particular stage of the pipeline
35:25super important and so this is it's not
35:28just about the fact that we designed a
35:29smaller ALU it's not quite the world's
35:32not quite that simple you've got to
35:34figure out when you can use that across
35:37a computation that
35:40is thousands of gpus it's running for
35:44weeks and weeks on weeks and you want to
35:46make sure that the the uh uh the
35:48training job is going to converge and so
35:51this new Transformer engine we have a
35:53fifth generation MV
35:55link it's now twice as fast this Hopper
35:58but very importantly it has computation
36:01in the network and the reason for that
36:03is because when you have so many
36:04different gpus working together we have
36:06to share our information with each other
36:09we have to synchronize and update each
36:11other and every so often we have to
36:13reduce the partial products and then
36:15rebroadcast out the partial products the
36:18sum of the partial products back to
36:19everybody else and so there's a lot of
36:21what is called all reduce and all to all
36:23and all gather it's all part of this
36:26area of synchronization and collectives
36:28so that we can have gpus working with
36:30each other having extraordinarily fast
36:32links and being able to do mathematics
36:35right in the network allows us to
36:37essentially amplify even further so even
36:41though it's 1.8 terabytes per second
36:43it's effectively higher than that and so
36:45it's many times that of Hopper the
36:49likelihood of a supercomputer running
36:52for weeks on in is approximately zero
36:56and the reason for that is because
36:57there's so many components working at
36:59the same time the statistic the
37:02probability of them working continuously
37:05is very low and so we need to make sure
37:07that whenever there is a well we
37:09checkpoint and restart as often as we
37:12can but if we have the ability to detect
37:16a weak chip or a weak note early we
37:19could retire it and maybe swap in
37:22another processor that ability to keep
37:25the utilization of the supercomputer
37:27High especially when you just spent $2
37:30billion building it is super important
37:33and so we put in a Ras engine a
37:37reliability engine that does 100% self
37:41test in system test of every single gate
37:47every single bit of memory on the
37:51Blackwell chip and all the memory that's
37:53connected to it it's almost as if we
37:56shipped with every every single chip its
37:59own Advanced tester that we CH test our
38:02chips with this is the first time we're
38:04doing this super excited about it secure
38:14AI only this conference do they clap for
38:17R the
38:20the uh secure AI uh obviously you've
38:24just spent hundreds of millions of
38:26dollars creating a very important Ai and
38:28the the code the intelligence of that AI
38:31is encoded in the parameters you want to
38:34make sure that on the one hand you don't
38:35lose it on the other hand it doesn't get
38:37contaminated and so we now have the
38:40ability to encrypt data of course at
38:45rest but also in transit and while it's
38:48being computed it's all encrypted and so
38:52we now have the ability to encrypt and
38:54transmission and when we're Computing it
38:57it is is in a trusted trusted
38:59environment trusted uh engine
39:01environment and the last thing is
39:04decompression moving data in and out of
39:06these nodes when the compute is so fast
39:09becomes really
39:11essential and so we've put in a high
39:14linee speed compression engine and
39:17effectively moves data 20 times faster
39:19in and out of these computers these
39:21computers are are so powerful and
39:24there's such a large investment the last
39:26thing we want to do is have them be idle
39:28and so all of these capabilities are
39:30intended to keep Blackwell fed and as
39:36busy as
39:38possible overall compared to
39:41Hopper it is 2 and a half times two and
39:45a half times the FPA performance for
39:48training per chip it is Al it also has
39:52this new format called fp6 so that even
39:55though the computation speed is the same
39:58the bandwidth that's Amplified because
40:01of the memory the amount of parameters
40:03you can store in the memory is now
40:05Amplified fp4 effectively doubles the
40:07throughput this is vitally important for
40:11inference one of the things that that um
40:14is becoming very clear is that whenever
40:16you use a computer with AI on the other
40:19side when you're chatting with the
40:21chatbot when you're asking it to uh
40:25review or make an image
40:29remember in the back is a GPU generating
40:33tokens some people call it inference but
40:36it's more appropriately
40:40generation the way that Computing is
40:42done in the past was retrieval you would
40:45grab your phone you would touch
40:46something um some signals go off
40:49basically an email goes off to some
40:51storage somewhere there's pre-recorded
40:53content somebody wrote a story or
40:55somebody made an image or somebody
40:56recorded a video
40:58that record pre-recorded content is then
41:00streamed back to the phone and
41:02recomposed in a way based on a
41:04recommender system to present the
41:06information to
41:08you you know that in the future the vast
41:11majority of that content will not be
41:14retrieved and the reason for that is
41:16because that was pre-recorded by
41:17somebody who doesn't understand the
41:19context which is the reason why we have
41:21to retrieve so much content if you can
41:24be working with an AI that understands
41:27the context who you are for what reason
41:29you're fetching this information and
41:31produces the information for you just
41:34the way you like it the amount of energy
41:37we save the amount of networking
41:39bandwidth we save the amount of waste of
41:41time we save will be tremendous the
41:45future is generative which is the reason
41:47why we call it generative AI which is
41:50the reason why this is a brand new
41:52industry the way we compute is
41:55fundamentally different we created a
41:57processor for the generative AI era and
42:02one of the most important parts of it is
42:04content token generation we call it this
42:07format is
42:09fp4 well that's a lot of computation
42:155x the Gen token generation 5x the
42:19inference capability of Hopper seems
42:23like
42:26enough
42:28but why stop
42:30there the answer is it's not enough and
42:33I'm going to show you why I'm going to
42:35show you why and so we would like to
42:37have a bigger GPU even bigger than this
42:40one and so
42:43we decided to scale it and notice but
42:46first let me just tell you how we've
42:47scaled over the course of the last eight
42:50years we've increased computation by
42:531,000 times 8 years 1,000 times remember
42:56back in the good old days of Moore's Law
42:59it was 2x well 5x every what 10 10x
43:03every 5 years that's easy easiest math
43:0510x every five years a 100 times every
43:0810 years 100 times every 10 years at the
43:12in the middle and the Heyday of the PC
43:18Revolution 100 times every 10 years in
43:21the last eight years we've gone 1,000
43:24times we have two more years to go
43:28and so that puts it in
43:33perspective the rate at which we're
43:35advancing Computing is insane and it's
43:37still not fast enough so we built
43:39another
chip. This chip is just an incredible chip. We call it the NVLink Switch. It's 50 billion transistors. It's almost the size of Hopper all by itself. This switch chip has four NVLinks in it, each 1.8 terabytes per second,
44:01and and it has computation in it as I
44:05mentioned what is this chip
44:07for if we were to build such a chip we
44:11can have every single GPU talk to every
44:15other GPU at full speed at the same time
44:20that's
44:26insane
44:28it doesn't even make
44:31sense but if you could do that if you
44:33can find a way to do that and build a
44:36system to do that that's cost effective
44:40that's cost effective how incredible
44:42would it be that we could have all these
44:45gpus connect over a coherent link so
44:50that they effectively are one giant GPU
44:54well one of one of the Great Inventions
44:55in order to make it cost effective is
that this chip has to drive copper directly. The SerDes of this chip is just a phenomenal invention, so that we could do direct drive to copper, and as a
45:08result you can build a system that looks
45:11like
45:20this now this system this
45:23system is kind of
45:25insane
45:27this is one dgx this is what a dgx looks
45:31like now remember just six years
45:35ago it was pretty heavy but I was able
45:37to lift
45:40it I delivered the uh the uh first djx1
45:44to open Ai and and the researchers there
45:47it's on you know the pictures are on the
45:49internet and uh uh and we all
45:52autographed it uh and um uh if you come
45:55to my office it's it's autograph there
45:57is really beautiful and but but you
45:59could lift it uh this dgx this dgx that
46:03dgx by the way was
46:06170
46:08teraflops if you're not familiar with
the numbering system, that's 0.17 petaflops. So this is 720. The first one I delivered to OpenAI was 0.17. You could round it up to 0.2, won't make any difference. But back then it was like, wow, you know, 30 more teraflops. And so this is now 720 petaflops, almost an exaflop for training, and the world's first one-exaflop machine in one rack. Just so you know, there are only a couple, two, three exaflop machines on the planet as we speak. And so this is an exaflop AI system in one single rack.
47:01well let's take a look at the back of
47:05it so this is what makes it possible
47:09that's the back that's the that's the
47:11back the dgx MV link spine 130 terabytes
47:16per
47:17second goes through the back of that
47:20chassis that is more than the aggregate
47:22bandwidth of the
47:25internet
47:33so we we could basically send everything
47:35to everybody within a
47:37second and so so we we have 5,000 cables
47:415,000 mvlink cables in total two
47:45miles now this is the amazing thing if
47:47we had to use Optics we would had to use
47:50transceivers and retim and those
47:52transceivers and retim alone would have
47:55cost2
47:5720,000
47:59watts 2 kilow of just transceivers alone
48:02just to drive the MV link spine as a
48:06result we did it completely for free
48:08over mvlink switch and we were able to
48:10save the 20 kilow for computation this
48:13entire rack is 120 kilow so that 20
48:16kilowatts makes a huge difference it's
48:19liquid cooled what goes in is 25° C
48:22about room temperature what comes out is
48:2545° C
48:27about your jacuzzi so room temperature
48:30goes in jacuzzi comes out 2 L per
48:41second we could we could sell a
48:48peripheral 600,000 Parts somebody used
48:52to say you know you guys make gpus and
48:55we do but this is what a GPU looks like
48:58to me when somebody says GPU I see this
49:02two years ago when I saw a GPU was the
49:04hgx it was 70 lb 35,000 Parts our gpus
49:08now are
49:10600,000 parts
49:13and 3,000 lb 3,000 lb 3,000 lb that's
49:18kind of like the weight of a you know
49:21Carbon
49:22Fiber
49:24Ferrari I don't know if that's useful
49:27metric
49:29but everybody's going I feel it I feel
49:32it I get it I get that now that you
49:35mentioned that I feel it I don't know
49:38what's 3,000
49:39lb okay so 3,000 lb ton and a half so
49:43it's not quite an
49:45elephant so this is what a dgx looks
49:47like now let's see what it looks like in
49:49operation okay let's imagine what is
49:52what how do we put this to work and what
what, how do we put this to work, and what does that mean? Well, if you were to train a GPT model, a 1.8-trillion-parameter model, it took about, apparently, 3 to 5 months or so with 25,000 Amperes. If we were to do it with Hopper, it would probably take something like 8,000 GPUs, and it would consume 15 megawatts. 8,000 GPUs, 15 megawatts; it would take 90 days, about three months. And that would allow you to train something that is, you know, this groundbreaking AI model. And this is obviously not as expensive as anybody would think, but it's 8,000 GPUs, it's still a lot of money. So, 8,000 GPUs, 15 megawatts. If you were to use Blackwell to do this, it would only take 2,000 GPUs. 2,000 GPUs, same 90 days. But this is the amazing part: only four megawatts of power. So from 15, yeah. That's right.
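A quick energy comparison for the training scenario just described, using only the figures quoted on stage (8,000 Hopper GPUs at 15 MW versus 2,000 Blackwell GPUs at 4 MW, both running for about 90 days):

```python
# Back-of-the-envelope energy comparison from the keynote's quoted figures.
hours = 90 * 24                        # ~90 days of training

hopper_energy_mwh    = 15 * hours      # 15 MW for 90 days
blackwell_energy_mwh = 4 * hours       #  4 MW for 90 days

print(f"Hopper:    {hopper_energy_mwh:,.0f} MWh")      # ~32,400 MWh
print(f"Blackwell: {blackwell_energy_mwh:,.0f} MWh")   # ~ 8,640 MWh
print(f"Roughly {hopper_energy_mwh / blackwell_energy_mwh:.1f}x less energy on Blackwell")
```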
50:57and that's and that's our goal our goal
50:59is to continuously drive down the cost
51:02and the energy they're directly
51:03proportional to each other cost and
51:04energy associated with the Computing so
51:07that we can continue to expand and scale
51:09up the computation that we have to do to
51:11train the Next Generation models well
51:13this is
51:15training inference or generation is
51:19vitally important going forward you know
51:21probably some half of the time that
51:23Nvidia gpus are in the cloud these days
51:25it's being used for token generation you
51:27know they're either doing co-pilot this
51:29or chat you know chat GPT that or um all
51:32these different models that are being
51:33used when you're interacting with it or
51:35generating IM generating images or
51:37generating videos generating proteins
51:40generating chemicals there's a bunch of
51:42gener generation going on all of that is
51:45b in a category of computing we call
51:47inference but inference is extremely
51:50hard for large language models because
51:53these large language models have several
51:54properties one they're very large and so
51:57it doesn't fit on one GPU this is
52:00Imagine imagine Excel doesn't fit on one
52:02GPU you know and imagine some
52:05application you're running on a daily
52:06basis doesn't run doesn't fit on one
52:08computer like a video game doesn't fit
52:10on one computer and most in fact do and
52:14many times in the past in hyperscale
52:17Computing many applications for many
52:19people fit on the same computer and now
52:21all of a sudden this one inference
52:23application where you're interacting
52:25with this chatbot that chatbot requires
52:27a supercomputer in the back to run it
52:30and that's the future the future is
52:33generative with these chat Bots and
52:35these chatbots are trillions of tokens
52:37trillions of parameters and they have to
52:40generate
52:41tokens at interactive rates now what
52:44does that mean oh well uh three tokens
52:48is about a
52:49word uh you know the the uh uh you know
52:54space the final Frontier uh these are
52:57the adventures that's like that's like
52:5980
53:01tokens okay I don't know if that's
53:04useful to you and
53:08so you know the art of communications is
53:11is selecting good and good
53:14analogies yeah this is this is not going
53:19well every I don't know what he's
53:21talking about never seen Star Trek and
53:25so and so so here are we're trying to
53:27generate these tokens when you're
53:28interacting with it you're hoping that
53:29the tokens come back to you as quickly
53:31as possible and as quickly as you can
53:33read it and so the ability for
53:35Generation tokens really important you
53:36have to paralyze the work of this model
53:39across many many gpus so that you could
53:42achieve several things one on the one
53:43hand you would like throughput because
53:46that throughput reduces the cost the
53:48overall cost per token of uh generating
53:52so your throughput dictates the cost of
53:56of uh delivering the service on the
53:58other hand you have another interactive
54:00rate which is another tokens per second
54:03where it's about per user and that has
54:05everything to do with quality of service
54:06and so these two things um uh compete
54:10against each other and we have to find a
54:12way to distribute work across all of
54:15these different gpus and paralyze it in
54:17a way that allows us to achieve both and
54:19it turns out the search search space is
54:23enormous you know I told you there's
54:25going to be math involved
54:27and everybody's going oh
54:29dear I heard some gasp just now when I
54:32put up that slide you know so so this
54:35this right here the the y axis is tokens
54:37per second data center throughput the
54:40x-axis is tokens per second
54:42interactivity of the person and notice
54:44the upper right is the best you want
54:47interactivity to be very high number of
54:49tokens per second per user you want the
54:51tokens per second of per data center to
54:53be very high the upper upper right is is
54:56ter
54:57however it's very hard to do that and in
54:59order for us to search for the best
55:01answer across every single one of those
55:04intersections XY coordinates okay so you
55:07just look at every single XY coordinate
55:09all those blue dots came from some
55:11repartitioning of the software some
55:15optimizing solution has to go and figure
55:17out whether to use use tensor
55:20parallel expert parallel pipeline
55:23parallel or data parallel and distribute
55:27this enormous model across all these
55:30different gpus and sustain performance
55:32that you need this exploration space
55:35would be impossible if not for the
55:37programmability of nvidia's gpus and so
55:39we could because of Cuda because we have
55:41such Rich ecosystem we could explore
55:43this universe and find that green roof
55:47line it turns out that green roof line
55:50notice you got tp2 EPA dp4 it means two
55:55parallel two uh tensor parallel tensor
55:58parallel across two gpus expert
56:00parallels across eight data parallel
56:02across four notice on the other end you
56:04got tensor parallel cross 4 and expert
56:06parallel cross 16 the configuration the
56:10distribution of that software it's a
56:12different different um runtime that
56:15would produce these different results
56:18and you have to go discover that roof
56:19line well that's just one model and this
56:22is just one configuration of a computer
56:24imagine all of the model mod being
56:26created around the world and all the
56:28different different um configurations of
56:31of systems that are going to be
56:35available so now that you understand the
56:37basics let's take a look at inference of
56:42Blackwell compared
56:44to Hopper and this is this is the
56:47extraordinary thing in one generation
56:50because we created a system that's
56:52designed for trillion parameter gener
56:55generative AI
56:57the inference capability of Blackwell is
56:59off the
charts. And in fact, it is some 30 times Hopper. Yeah. For large language models, for large language models like ChatGPT and others like it, the blue line is Hopper. Imagine we didn't change the architecture of Hopper, we just made it a bigger chip. We just used the latest, greatest, you know, 10 terabytes per second; we connected the two chips together; we got this giant 208-billion-transistor chip. How would we have performed if nothing else changed? And it turns out quite wonderfully, quite wonderfully, and that's the purple line. But not as great as it could be, and that's where the FP4 tensor core, the new Transformer engine, and very importantly the NVLink Switch come in. And the reason for that is because all these GPUs have to share the results, partial products, whenever they do all-to-all, all-gather, whenever they communicate with each other. That NVLink Switch is communicating almost 10 times faster than what we could do in the past using the fastest
58:15networks okay so Blackwell is going to
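For readers who want to see what the "all-gather" collective mentioned here looks like in code, below is a minimal, generic sketch using PyTorch's distributed API as a stand-in. It is not NVIDIA's internal software, and the NVLink Switch itself is invisible at this level: NCCL simply uses the fastest interconnect it finds.

```python
# Minimal sketch of an all-gather collective. Launch with, for example:
#   torchrun --nproc_per_node=8 all_gather_demo.py
# Assumes one process per GPU on a single node; NCCL carries the traffic
# over NVLink when it is available.
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")
    rank = dist.get_rank()
    world = dist.get_world_size()
    torch.cuda.set_device(rank)

    # Each GPU holds its own partial result (here just a labeled tensor).
    partial = torch.full((4,), float(rank), device="cuda")

    # After the collective, every GPU holds every other GPU's partial result.
    gathered = [torch.empty_like(partial) for _ in range(world)]
    dist.all_gather(gathered, partial)

    if rank == 0:
        print([t[0].item() for t in gathered])  # -> [0.0, 1.0, ..., world-1]

if __name__ == "__main__":
    main()
```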
58:19 Okay, so Blackwell is going to be just an amazing system for generative AI. In the future, data centers are going to be thought of, as I mentioned earlier, as AI factories. An AI factory's goal in life is to generate revenues, to generate, in this case, intelligence in this facility, not generating electricity as in the AC generators of the last industrial revolution; in this industrial revolution it's the generation of intelligence. And so this ability is super, super important. The excitement around Blackwell is really off the charts. When we first started to go to market with Hopper, a year and a half ago, two years ago I guess, we had the benefit of two CSPs joining us in the launch, and we were delighted. So we had two customers. We have more now.
59:38 Unbelievable excitement for Blackwell, unbelievable excitement, and there's a whole bunch of different configurations. Of course, I showed you the configurations that slide into the Hopper form factor, so that's easy to upgrade. I showed you examples that are liquid-cooled, the extreme versions of it: one entire rack connected by NVLink 72. Blackwell is going to be ramping to the world's AI companies, of which there are so many now doing amazing work in different modalities; to the CSPs (every CSP is geared up); to all the OEMs and ODMs; and regional clouds, sovereign AIs, and telcos all over the world are signing up to launch with Blackwell. Blackwell would be the most successful product launch in our history, and I can't wait to see that.
01:00:43 I want to thank some partners that are joining us in this. AWS is gearing up for Blackwell: they're going to build the first GPU with secure AI, and they're building out a 222-exaflops system. Just now, when we animated the digital twin, you saw all of those clusters coming down; by the way, that is not just art, that is a digital twin of what we're building. That's how big it's going to be. Besides infrastructure, we're doing a lot of things together with AWS: we're CUDA-accelerating SageMaker AI, we're CUDA-accelerating Bedrock AI, Amazon Robotics is working with us using NVIDIA Omniverse and Isaac Sim, and AWS Health has NVIDIA Health integrated into it. So AWS has really leaned into accelerated computing.
01:01:36 Google is gearing up for Blackwell. GCP already has A100s, H100s, T4s, L4s, a whole fleet of NVIDIA CUDA GPUs, and they recently announced the Gemma model that runs across all of it. We're working to optimize and accelerate every aspect of GCP: we're accelerating Dataproc, their data processing engine, JAX, XLA, Vertex AI, and MuJoCo for robotics. So we're working with Google and GCP across a whole bunch of initiatives.
01:02:08 Oracle is gearing up for Blackwell. Oracle is a great partner of ours for NVIDIA DGX Cloud, and we're also working together to accelerate something that's really important to a lot of companies: Oracle Database. Microsoft is accelerating, and Microsoft is gearing up for Blackwell. Microsoft and NVIDIA have a wide-ranging partnership; we're accelerating, CUDA-accelerating, all kinds of services. When you chat, obviously, and with the AI services that are in Microsoft Azure, it's very, very likely NVIDIA is in the back doing the inference and the token generation. They built the largest NVIDIA InfiniBand supercomputer, basically a digital twin of ours, or a physical twin of ours. We're bringing the NVIDIA ecosystem to Azure: NVIDIA DGX Cloud to Azure, NVIDIA Omniverse is now hosted in Azure, NVIDIA Healthcare is in Azure, and all of it is deeply integrated and deeply connected with Microsoft Fabric.
01:03:04 The whole industry is gearing up for Blackwell, and this is what I'm about to show you: most of the scenes that you've seen so far of Blackwell are the full-fidelity design of Blackwell. Everything in our company has a digital twin, and in fact this digital twin idea is really spreading. It helps companies build very complicated things perfectly the first time, and what could be more exciting than creating a digital twin to build a computer that was built in a digital twin? So let me show you what Wistron is doing.
01:03:46 To meet the demand for NVIDIA accelerated computing, Wistron, one of our leading manufacturing partners, is building digital twins of NVIDIA DGX and HGX factories using custom software developed with Omniverse SDKs and APIs. For their newest factory, Wistron started with a digital twin to virtually integrate their multi-CAD and process simulation data into a unified view. Testing and optimizing layouts in this physically accurate digital environment increased worker efficiency by 51%. During construction, the Omniverse digital twin was used to verify that the physical build matched the digital plans; identifying any discrepancies early helped avoid costly change orders, and the results have been impressive. Using a digital twin helped bring Wistron's factory online in half the time, just two and a half months instead of five. In operation, the Omniverse digital twin helps Wistron rapidly test new layouts to accommodate new processes or improve operations in the existing space, and monitor real-time operations using live IoT data from every machine on the production line, which ultimately enabled Wistron to reduce end-to-end cycle times by 50% and defect rates by 40%. With NVIDIA AI and Omniverse, NVIDIA's global ecosystem of partners is building a new era of accelerated, AI-enabled digitalization.
01:05:15[Music]
01:05:19[Applause]
01:05:22 That's how it's going to be in the future: we're going to manufacture everything digitally first, and then we'll manufacture it physically. People ask me, how did it start? What got you guys so excited? What was it that you saw that caused you to put it all in on this incredible idea? And it's this. Hang on a second, guys, that was going to be such a moment. That's what happens when you don't rehearse. This, as you know, was first contact: 2012, AlexNet. You put a cat into this computer and it comes out and it says "cat," and we said, oh my God, this is going to change everything.
01:06:34 You take one million numbers, one million numbers across three channels, RGB. These numbers make no sense to anybody. You put them into this software, and it compresses them, dimensionally reduces them. It reduces a million dimensions into three letters, one vector, one number, and it's generalized. You could have the cat be different cats, and you could have it be the front of the cat or the back of the cat, and you look at this thing and say, unbelievable: you mean any cat? Yeah, any cat. And it was able to recognize all these cats. And we realized how it did it: systematically, structurally, it's scalable. How big can you make it? Well, how big do you want to make it? And so we imagined that this is a completely new way of writing software. And now, today, as you know, you can type in the word "cat," and what comes out is a cat. It went the other way. Am I right? Unbelievable. How is it possible? That's right: how is it possible that you took three letters and generated a million pixels from it, and it made sense? Well, that's the miracle.
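Purely as an illustration of the dimensionality gap being described, here is a tiny NumPy sketch: roughly a million RGB numbers go in, and a single label comes out. The randomly initialized weights are a stand-in for a trained network like AlexNet, so no real recognition happens; only the shapes are the point.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a photo of a cat: about a million numbers across 3 RGB channels.
image = rng.random((3, 578, 578), dtype=np.float32)
print(image.size)  # 1,002,252 numbers that "make no sense to anybody"

# Stand-in for the trained network: a random linear map plus softmax, so no real
# recognition happens; this only illustrates the reduction in dimensions.
classes = ["cat", "dog", "car"]
weights = rng.normal(size=(len(classes), image.size)).astype(np.float32)

logits = weights @ image.reshape(-1)        # a million dimensions -> 3 numbers
probs = np.exp(logits - logits.max())
probs /= probs.sum()

print(dict(zip(classes, probs.round(3))))   # one small vector...
print(classes[int(probs.argmax())])         # ...collapsed to a single label
```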
01:08:13 And here we are, just literally ten years later. We recognize text, we recognize images, we recognize videos and sounds, and not only do we recognize them, we understand their meaning. We understand the meaning of the text; that's the reason it can chat with you, it can summarize for you. It understands the text: it didn't just recognize the English, it understood the English; it doesn't just recognize the pixels, it understood the pixels. You can even condition between two modalities: you can have language condition an image and generate all kinds of interesting things. Well, if you can understand these things, what else can you understand that you've digitized? The reason we started with text and images is because we digitized those, but what else have we digitized? It turns out we've digitized a lot of things: proteins and genes and brain waves. Anything you can digitize, so long as there's structure, we can probably learn some patterns from it, and if we can learn the patterns from it we can understand its meaning; if we can understand its meaning, we might be able to generate it as well. And so therefore the generative AI revolution is here.
01:09:27 Well, what else can we generate, what else can we learn? One of the things that we would love to learn is climate. We would love to learn extreme weather; we would love to learn how we can predict future weather at regional scales, at sufficiently high resolution, such that we can keep people out of harm's way before harm comes. Extreme weather costs the world $150 billion, surely more than that, and it's not evenly distributed: that $150 billion is concentrated in some parts of the world, and of course on some people of the world. We need to adapt, and we need to know what's coming. And so we are creating Earth-2, a digital twin of the Earth, for predicting weather, and we've made an extraordinary invention called CorrDiff, the ability to use generative AI to predict weather at extremely high resolution. Let's take a look.
01:10:30 As the Earth's climate changes, AI-powered weather forecasting is allowing us to more accurately predict and track severe storms, like Super Typhoon Chanthu, which caused widespread damage in Taiwan and the surrounding region in 2021. Current AI forecast models can accurately predict the track of storms, but they are limited to 25 km resolution, which can miss important details. NVIDIA's CorrDiff is a revolutionary new generative AI model trained on high-resolution, radar-assimilated WRF weather forecasts and ERA5 reanalysis data. Using CorrDiff, extreme events like Chanthu can be super-resolved from 25 km to 2 km resolution, with 1,000 times the speed and 3,000 times the energy efficiency of conventional weather models. By combining the speed and accuracy of NVIDIA's weather forecasting model FourCastNet and generative AI models like CorrDiff, we can explore hundreds or even thousands of kilometer-scale regional weather forecasts to provide a clear picture of the best, worst, and most likely impacts of a storm. This wealth of information can help minimize loss of life and property damage. Today, CorrDiff is optimized for Taiwan, but soon generative super-sampling will be available as part of the NVIDIA Earth-2 inference service for many regions across the globe.
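A minimal sketch of the resolution arithmetic in that narration, under stated assumptions: going from 25 km to 2 km cells is a 12.5x super-resolution per axis, so a hypothetical 900 km x 900 km tile grows from 36x36 to 450x450 cells. Plain interpolation from SciPy stands in for the generative model, which is where CorrDiff would add physically plausible fine-scale detail.

```python
import numpy as np
from scipy.ndimage import zoom  # placeholder for the generative super-resolution step

COARSE_KM, FINE_KM = 25, 2                # resolutions mentioned in the keynote
factor = COARSE_KM / FINE_KM              # 12.5x super-resolution per axis

# A hypothetical 900 km x 900 km tile: 36x36 cells at 25 km, 450x450 at 2 km.
rng = np.random.default_rng(0)
coarse_field = rng.random((36, 36))       # fake coarse field (e.g. wind speed)

fine_field = zoom(coarse_field, factor, order=1)    # plain interpolation only;
                                                    # a generative model would add detail
print(coarse_field.shape, "->", fine_field.shape)   # (36, 36) -> (450, 450)
```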
01:12:02 The Weather Company, the trusted source of global weather predictions: we are working together to accelerate their weather simulation, first-principles-based simulation. However, they're also going to integrate Earth-2 CorrDiff so that they can help businesses and countries do regional, high-resolution weather prediction. And so if you have some weather prediction you'd like to do, reach out to The Weather Company. Really exciting, really exciting work.
01:12:29 NVIDIA Healthcare is something we started 15 years ago. We're super, super excited about this; this is an area we're very, very proud of. Whether it's medical imaging or gene sequencing or computational chemistry, it is very likely that NVIDIA is the computation behind it. We've done so much work in this area. Today we're announcing that we're going to do something really, really cool. Imagine all of these AI models that are being used to generate images and audio, but instead of images and audio, because it understood images and audio, take all the digitization that we've done for genes and proteins and amino acids: that digitization capability is now passed through machine learning so that we understand the language of life. The ability to understand the language of life: of course we saw the first evidence of it with AlphaFold. This is really quite an extraordinary thing. After decades of painstaking work, the world had only digitized and reconstructed, using cryo-electron microscopy or X-ray crystallography, these different techniques, painstakingly reconstructed, 200,000 proteins. In just, what is it, less than a year or so, AlphaFold has reconstructed 200 million proteins, basically every protein of every living thing that's ever been sequenced. This is completely revolutionary. Well, those models are incredibly hard to use, incredibly hard for people to build, and so what we're going to do is we're going to build them. We're going to build them for the researchers around the world, and it won't be the only one; there'll be many other models that we create. So let me show you what we're going to do with it.
01:13:27 Virtual screening for new medicines is a computationally intractable problem. Existing techniques can only scan billions of compounds and require days on thousands of standard compute nodes to identify new drug candidates. NVIDIA BioNeMo NIMs enable a new generative screening paradigm. Using NIMs for protein structure prediction with AlphaFold, molecule generation with MolMIM, and docking with DiffDock, we can now generate and screen candidate molecules in a matter of minutes. MolMIM can connect to custom applications to steer the generative process, iteratively optimizing for desired properties. These applications can be defined with BioNeMo microservices or built from scratch. Here, a physics-based simulation optimizes for a molecule's ability to bind to a target protein while optimizing for other favorable molecular properties in parallel. MolMIM generates high-quality, drug-like molecules that bind to the target and are synthesizable, translating to a higher probability of developing successful medicines faster. BioNeMo is enabling a new paradigm in drug discovery, with NIMs providing on-demand microservices that can be combined to build powerful drug discovery workflows, like de novo protein design or guided molecule generation for virtual screening. BioNeMo NIMs are helping researchers and developers reinvent computational drug design.
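As a rough illustration of the generate-and-screen loop that narration describes, here is a minimal Python sketch under stated assumptions: `predict_structure`, `generate_molecules`, `dock`, and `score_properties` are hypothetical placeholders standing in for calls to the AlphaFold, MolMIM, and DiffDock NIMs and a physics-based property model. Only the shape of the iterative optimization loop is meant to be faithful.

```python
import random

# Hypothetical stand-ins for the BioNeMo NIM calls described above.
def predict_structure(sequence):          # AlphaFold-style structure prediction
    return {"target": sequence}

def generate_molecules(seed, n=8):        # MolMIM-style guided generation
    return [f"{seed}-cand{i}-{random.randint(0, 999)}" for i in range(n)]

def dock(structure, molecule):            # DiffDock-style pose/affinity estimate
    return random.random()                # fake binding score in [0, 1)

def score_properties(molecule):           # e.g. synthesizability / drug-likeness
    return random.random()

def screen(sequence, rounds=3):
    structure = predict_structure(sequence)
    seed, best = "scaffold", None
    for _ in range(rounds):
        candidates = generate_molecules(seed)
        # Combine binding and property scores; keep the best to steer the next round.
        ranked = sorted(
            candidates,
            key=lambda m: dock(structure, m) + score_properties(m),
            reverse=True,
        )
        seed, best = ranked[0], ranked[0]
    return best

if __name__ == "__main__":
    random.seed(0)
    print("top candidate:", screen("MKTAYIAKQR"))
```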
01:16:00 NVIDIA MolMIM, CorrDiff: there's a whole bunch of other models, a whole bunch of other models, computer vision models, robotics models, and even, of course, some really, really terrific open-source language models. These models are groundbreaking; however, it's hard for companies to use them. How would you use it? How would you bring it into your company and integrate it into your workflow? How would you package it up and run it? Remember, earlier I just said that inference is an extraordinary computation problem. How would you do the optimization for each and every one of these models, and put together the computing stack necessary to run that supercomputer, so that you can run these models in your company? And so we have a great idea: we're going to invent a new way, invent a new way for you to receive and operate software.
01:16:58 This software comes basically in a digital box. We call it a container, and we call it the NVIDIA Inference Microservice, a NIM. Let me explain to you what it is. A NIM is a pre-trained model, so it's pretty clever, and it is packaged and optimized to run across NVIDIA's installed base, which is very, very large. What's inside it is incredible: you have all these pre-trained, state-of-the-art models. They could be open source, they could be from one of our partners, or they could be created by us, like NVIDIA's own models. It is packaged up with all of its dependencies: CUDA, the right version; cuDNN, the right version; TensorRT-LLM, distributing across the multiple GPUs; Triton Inference Server; all completely packaged together. It's optimized depending on whether you have a single GPU, multi-GPU, or multi-node GPUs, and it's connected up with APIs that are simple to use. Now, think about what an AI API is: an AI API is an interface that you just talk to. So this is a piece of software in the future that has a really simple API, and that API is called "human." These packages, incredible bodies of software, will be optimized and packaged, and we'll put them on a website, and you can download it, you can take it with you, you can run it in any cloud, you can run it in your own data center, you can run it on workstations if it fits. All you have to do is come to ai.nvidia.com. We call it NVIDIA Inference Microservice, but inside the company we all call it NIMs.
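As a rough illustration of "an API you just talk to," here is a minimal sketch of calling such a microservice over HTTP. The endpoint URL, model name, and request body are assumptions for illustration (the body follows the common OpenAI-style chat-completions convention that many inference servers expose), not a specification given in this talk.

```python
# Minimal sketch of calling a NIM-style service over HTTP. The URL, model name,
# and request shape are placeholders assumed for illustration.
import json
import urllib.request

ENDPOINT = "http://localhost:8000/v1/chat/completions"   # hypothetical local service
payload = {
    "model": "example-llm",                               # hypothetical model name
    "messages": [{"role": "user", "content": "Summarize last night's bug reports."}],
    "max_tokens": 256,
}

req = urllib.request.Request(
    ENDPOINT,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    answer = json.load(resp)
print(answer["choices"][0]["message"]["content"])
```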
01:18:52 Okay, just imagine: someday there's going to be one of these chatbots, and these chatbots are going to just be in a NIM, and you'll assemble a whole bunch of chatbots, and that's the way software is going to be built someday. How do we build software in the future? It is unlikely that you'll write it from scratch or write a whole bunch of Python code or anything like that. It is very likely that you'll assemble a team of AIs. There's probably going to be a super AI that you use that takes the mission you give it and breaks it down into an execution plan. Some of that execution plan could be handed off to another NIM. That NIM would maybe understand SAP; the language of SAP is ABAP. It might understand ServiceNow, and it goes and retrieves some information from their platforms. It might then hand that result to another NIM that goes off and does some calculation on it. Maybe it's an optimization software, a combinatorial optimization algorithm; maybe it's just some basic calculator; maybe it's pandas, to do some numerical analysis on it. And then it comes back with its answer, and it gets combined with everybody else's, and because it's been presented with what the right answer should look like, it knows what right answer to produce, and it presents it to you. We can get a report every single day, at the top of the hour, that has something to do with a build plan or some forecast or some customer alert or some bugs database, or whatever it happens to be, and we could assemble it using all these NIMs. And because these NIMs have been packaged up and are ready to work on your systems, so long as you have NVIDIA GPUs in your data center or in the cloud, these NIMs will work together as a team and do amazing things.
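Here is a toy sketch of that "team of NIMs" pattern: a planner splits a mission into steps, and each step is routed to a specialized worker that enriches a shared context. Everything here, the worker names, the hard-coded plan, the fake lookups, is hypothetical; a real version would make HTTP calls like the earlier sketch and use an LLM as the planner.

```python
# Toy sketch of the "team of NIMs" pattern. call_nim() is a hypothetical
# stand-in for an HTTP call to a specialized NIM; here each worker is just a
# local function so the example runs.
def call_nim(name, request):
    workers = {
        "sap":        lambda r: {"open_orders": 42},       # pretend ERP lookup
        "servicenow": lambda r: {"open_tickets": 7},        # pretend ITSM lookup
        "analysis":   lambda r: {"summary": f"{r['open_orders']} orders, "
                                            f"{r['open_tickets']} tickets open"},
    }
    return workers[name](request)

def planner(mission):
    # A real planner would be an LLM breaking the mission into steps;
    # here the plan is hard-coded for illustration.
    return ["sap", "servicenow", "analysis"]

def run(mission):
    context = {}
    for step in planner(mission):
        context.update(call_nim(step, context))   # each worker enriches shared context
    return context["summary"]

print(run("Prepare the morning operations report"))
```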
01:20:49 And so we decided this is such a great idea, we're going to go do that. NVIDIA has NIMs running all over the company. We have chatbots being created all over the place, and one of the most important chatbots, of course, is a chip designer chatbot. You might not be surprised that we care a lot about building chips, so we want to build chatbots, AI copilots, that are co-designers with our engineers. And so this is the way we did it: we got ourselves a Llama 2, this is the 70B, and it's packaged up in a NIM, and we asked it, what is a CTL? Well, it turns out CTL is an internal program, and it has an internal proprietary language, but it thought CTL was a combinatorial timing logic, and so it described conventional knowledge of CTL, which is not very useful to us. And so we gave it a whole bunch of new examples. This is no different from onboarding an employee: we say, thanks for that answer, it's completely wrong, and then we present to it what a CTL is. Okay, so this is what a CTL is at NVIDIA, and as you can see, CTL stands for Compute Trace Library, which makes sense; we were tracing compute cycles all the time. And it wrote the program. Isn't that amazing?
01:22:23 And so the productivity of our chip designers can go up. This is what you can do with a NIM: the first thing you can do with it is customize it. We have a service called NeMo microservice that helps you curate the data, preparing the data so that you can teach, onboard, this AI. You fine-tune it, and then you guardrail it; you can even evaluate its answers, evaluate its performance against other examples. And so that's called the NeMo microservice.
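As a small illustration of what "curating the data to onboard the AI" can look like, here is a sketch that writes a couple of corrected prompt/response examples, in the spirit of the CTL story, to a JSONL file for fine-tuning and evaluation. The JSONL prompt/completion layout is a common convention, not a specific NeMo microservice file format.

```python
# Sketch of curating "onboarding" examples for fine-tuning. The JSONL layout
# is a common convention for instruction tuning, assumed here for illustration.
import json

examples = [
    {
        "prompt": "What is CTL?",
        "completion": "At our company, CTL stands for Compute Trace Library, "
                      "an internal tool for tracing compute cycles.",
    },
    {
        "prompt": "Write a short CTL snippet that traces one compute cycle.",
        "completion": "# (internal CTL syntax would go here)",
    },
]

with open("ctl_onboarding.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

print(f"wrote {len(examples)} curated examples for fine-tuning and evaluation")
```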
01:22:52 Now, the thing that's emerging here is this: there are three elements, three pillars, of what we're doing. The first pillar is, of course, inventing the technology for AI models, running AI models, and packaging them up for you. The second is creating tools to help you modify them: first is having the AI technology, second is helping you modify it, and third is infrastructure for you to fine-tune it and, if you like, deploy it. You could deploy it on our infrastructure called DGX Cloud, or you can deploy it on-prem; you can deploy it anywhere you like. Once you develop it, it's yours to take anywhere. And so we are effectively an AI foundry. We will do for you and the industry on AI what TSMC does for us building chips: we go to TSMC with our big ideas, they manufacture, and we take it with us. Exactly the same thing here: the AI foundry, and the three pillars are the NIMs, the NeMo microservice, and DGX Cloud.
01:23:52 The other thing that you could teach the NIM to do is to understand your proprietary information. Remember, inside our company the vast majority of our data is not in the cloud; it's inside our company. It's been sitting there, being used all the time, and, gosh, it's basically NVIDIA's intelligence. We would like to take that data, learn its meaning, just like we learned the meaning of almost everything else we just talked about, learn its meaning and then re-index that knowledge into a new type of database called a vector database. You essentially take structured data or unstructured data, you learn its meaning, you encode its meaning, and now this becomes an AI database, and that AI database, in the future, once you create it, you can talk to it. So let me give you an example of what you could do. Suppose you've got a whole bunch of multi-modality data, and one good example of that is PDFs. You take all of your PDFs, all of the stuff that is proprietary to you and critical to your company, and you encode it, just as we encoded the pixels of a cat and it became the word "cat." We can encode all of your PDFs, and they turn into vectors that are stored inside your vector database. It becomes the proprietary information of your company, and once you have that proprietary information you can chat with it. It's a smart database, so you just chat with your data, and how much more enjoyable is that? For our software team, they just chat with the bugs database: how many bugs were there last night? Are we making any progress? And then, after you're done talking to this bugs database, you need therapy, and so we have another chatbot for you. You can do it. Okay, so we call this NeMo Retriever, and the reason for that is because ultimately its job is to go retrieve information as quickly as possible. You just talk to it: hey, retrieve me this information. It goes and brings it back to you: do you mean this? You go, yeah, perfect. Okay, so we call it the NeMo Retriever. The NeMo service helps you create all these things.
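A minimal sketch of the encode-then-retrieve idea just described. The "embedding" here is a toy bag-of-words hash so the example runs with no model at all; a real pipeline would embed chunks with an actual model (for example through a NIM) and store the vectors in a real vector database.

```python
# Toy encode-and-retrieve sketch: hash words into a small vector, index a few
# snippets, and return the most similar one to a query. Purely illustrative.
import math
from collections import Counter

DIM = 64

def embed(text):
    vec = [0.0] * DIM
    for word, count in Counter(text.lower().split()).items():
        vec[hash(word) % DIM] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# "Index" some proprietary snippets (stand-ins for chunks of your PDFs).
documents = [
    "Q3 supply plan for HGX boards and build schedule",
    "Bug 1234: NVLink switch driver regression seen in nightly tests",
    "Marketing brief for the GTC keynote demo reel",
]
index = [(doc, embed(doc)) for doc in documents]

def retrieve(query, k=1):
    q = embed(query)
    scored = sorted(index, key=lambda d: -sum(a * b for a, b in zip(q, d[1])))
    return [doc for doc, _ in scored[:k]]

print(retrieve("NVLink driver regression from the nightly tests"))
# -> most similar snippet (should be the bug entry)
```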
01:26:18 We have all these different NIMs; we even have NIMs of digital humans. "I'm Rachel, your AI care manager." Okay, so it's a really short clip, but there were so many videos to show you, so many other demos to show you, that I had to cut this one short. But this is Diana: she is a digital human NIM, and you just talk to her, and she's connected, in this case, to Hippocratic AI's large language model for healthcare, and it's truly amazing. She is just super smart about healthcare things. So after you're done, after Dwight, my VP of software engineering, talks to the chatbot for the bugs database, then you come over here and talk to Diana. Diana is completely animated with AI, and she's a digital human.
01:27:11 There are so many companies that would like to build this; they're sitting on gold mines. The enterprise IT industry is sitting on a gold mine. It's a gold mine because they have so much understanding of the way work is done, they have all these amazing tools that have been created over the years, and they're sitting on a lot of data. If they could take that gold mine and turn it into copilots, these copilots could help us do things. And so just about every IT franchise, every IT platform in the world that has valuable tools that people use is sitting on a gold mine for copilots, and they would like to build their own copilots and their own chatbots. So we're announcing that NVIDIA AI Foundry is working with some of the world's great companies. SAP generates 87% of the world's global commerce; basically the world runs on SAP, we run on SAP. NVIDIA and SAP are building SAP Joule copilots using NVIDIA NeMo and DGX Cloud. ServiceNow: 85% of the world's Fortune 500 companies run their people and customer service operations on ServiceNow, and they're using NVIDIA AI Foundry to build ServiceNow Assist virtual assistants. Cohesity backs up the world's data; they're sitting on a gold mine of data, hundreds of exabytes of data from over 10,000 companies. NVIDIA AI Foundry is working with them, helping them build their Gaia generative AI agent. Snowflake is a company that stores the world's digital warehouse in the cloud and serves over three billion queries a day for 10,000 enterprise customers. Snowflake is working with NVIDIA AI Foundry to build copilots with NVIDIA NeMo and NIMs. NetApp: nearly half of the files in the world are stored on-prem on NetApp. NVIDIA AI Foundry is helping them build chatbots and copilots, like those vector databases and retrievers, with NVIDIA NeMo and NIMs. And we have a great partnership with Dell. Everybody who is building these chatbots and generative AI, when you're ready to run it, you're going to need an AI factory, and nobody is better at building end-to-end systems of very large scale for the enterprise than Dell. So anybody, any company, every company will need to build AI factories, and it turns out that Michael is here; he's happy to take your order. Ladies and gentlemen, Michael Dell.
01:29:57 Okay, let's talk about the next wave of robotics, the next wave of AI: robotics, physical AI. So far, all of the AI that we've talked about is one computer. Data comes into one computer, lots of the world's experience, if you will, in digital text form, and the AI imitates us by reading a lot of language to predict the next word. It's imitating you by studying all of the patterns and all the other previous examples. Of course it has to understand context and so on, but once it understands the context, it's essentially imitating you. We take all of the data, we put it into a system like DGX, we compress it into a large language model; trillions of tokens become billions of parameters, and these billions of parameters become your AI.
01:30:49 Well, in order for us to go to the next wave of AI, where the AI understands the physical world, we're going to need three computers. The first computer is still the same computer: it's that AI computer that now is going to be watching video, and maybe it's doing synthetic data generation, and maybe there are a lot of human examples. Just as we have human examples in text form, we're going to have human examples in articulation form, and the AIs will watch us, understand what is happening, and try to adapt it for themselves into their context. And because they can generalize with these foundation models, maybe these robots can also perform in the physical world fairly generally. So I just described, in very simple terms, essentially what just happened in large language models, except the ChatGPT moment for robotics may be right around the corner. We've been building the end-to-end systems for robotics for some time, and I'm super, super proud of the work. We have the AI system, DGX. We have the lower system, which is called AGX, for autonomous systems: the world's first robotics processor. When we first built this thing, people asked, what are you guys building? It's an SoC; it's one chip, it's designed to be very low power, but it's designed for high-speed sensor processing and AI. And so if you want to run transformers in a car, or you want to run transformers in anything that moves, we have the perfect computer for you; it's called Jetson. So the DGX on top is for training the AI, the Jetson is the autonomous processor, and in the middle we need another computer. Whereas large language models have the benefit of you providing your examples and then doing reinforcement learning from human feedback, what is the reinforcement learning human feedback of a robot? Well, it's reinforcement learning from physical feedback. That's how you align the robot; that's how the robot knows that, as it's learning these articulation capabilities and manipulation capabilities, it's going to adapt properly to the laws of physics. And so we need a simulation engine that represents the world digitally for the robot, so that the robot has a gym to go learn how to be a robot. We call that virtual world Omniverse, and the computer that runs Omniverse is called OVX, and OVX, the computer itself, is hosted in the Azure cloud.
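A generic sketch of the "gym" idea, using the Gymnasium API and a classic toy environment as a stand-in for a physics-accurate robot simulator; the random policy is a placeholder for the policy being trained, and none of this is Isaac-specific code.

```python
# Generic sketch of the "gym" idea: the simulator provides the physical feedback
# (observations and rewards) that the robot policy learns from. Requires the
# gymnasium package; Pendulum-v1 stands in for a robot-learning environment.
import gymnasium as gym

env = gym.make("Pendulum-v1")
obs, info = env.reset(seed=0)

total_reward = 0.0
for step in range(200):
    action = env.action_space.sample()           # placeholder policy: act at random
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward                       # "physical feedback" from the simulator
    if terminated or truncated:
        obs, info = env.reset()

env.close()
print(f"episode return with a random policy: {total_reward:.1f}")
```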
01:33:24 Okay, so basically we built these three things, these three systems, and on top of them we have algorithms for every single one. Now I'm going to show you one super example of how AI and Omniverse are going to work together. The example I'm going to show you is kind of insane, but it's going to be very, very close to tomorrow. It's a robotics building; this robotics building is called a warehouse. Inside the robotics building there are going to be some autonomous systems. Some of the autonomous systems are going to be called humans, and some of the autonomous systems are going to be called forklifts, and these autonomous systems are going to interact with each other, of course, autonomously, and they're going to be watched over by this warehouse to keep everybody out of harm's way. The warehouse is essentially an air traffic controller, and whenever it sees something happening it will redirect traffic and give new waypoints, just new waypoints, to the robots and the people, and they'll know exactly what to do. This warehouse, this building, you can also talk to, of course. You could talk to it: hey, you know, SAP Center, how are you feeling today? For example. So you could ask the warehouse the same kinds of questions. Basically, the system I just described will have Omniverse Cloud hosting the virtual simulation and AI running on DGX Cloud, and all of this is running in real time. Let's take a look.
01:34:51 The future of heavy industries starts as a digital twin. The AI agents helping robots, workers, and infrastructure navigate unpredictable events in complex industrial spaces will be built and evaluated first in sophisticated digital twins. This Omniverse digital twin of a 100,000-square-foot warehouse is operating as a simulation environment that integrates digital workers, AMRs running the NVIDIA Isaac Perceptor stack, centralized activity maps of the entire warehouse from 100 simulated ceiling-mounted cameras using NVIDIA Metropolis, and AMR route planning with NVIDIA cuOpt. Software-in-loop testing of AI agents in this physically accurate simulated environment enables us to evaluate and refine how the system adapts to real-world unpredictability. Here, an incident occurs along this AMR's planned route, blocking its path as it moves to pick up a pallet. NVIDIA Metropolis updates and sends a real-time occupancy map to cuOpt, where a new optimal route is calculated. The AMR is enabled to see around corners and improve its mission efficiency. With generative AI-powered Metropolis vision foundation models, operators can even ask questions using natural language. The visual model understands nuanced activity and can offer immediate insights to improve operations. All of the sensor data is created in simulation and passed to the real-time AI running as NVIDIA Inference Microservices, or NIMs, and when the AI is ready to be deployed in the physical twin, the real warehouse, we connect Metropolis and Isaac NIMs to real sensors, with the ability for continuous improvement of both the digital twin and the AI models.
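A toy sketch of the replanning step in that demo: when an updated occupancy map marks cells as blocked, a new route is computed around them. Plain breadth-first search on a small grid stands in for cuOpt, which solves far richer routing problems than this.

```python
# Toy replanning sketch: a BFS route on an occupancy grid, recomputed after an
# "incident" blocks part of the original path. Not cuOpt; illustration only.
from collections import deque

def plan(grid, start, goal):
    rows, cols = len(grid), len(grid[0])
    queue, came_from = deque([start]), {start: None}
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 \
                    and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    return None  # no route exists

occupancy = [[0] * 5 for _ in range(5)]                    # 0 = free, 1 = blocked
print(plan(occupancy, (0, 0), (4, 4)))                     # original route

occupancy[2][1] = occupancy[2][2] = occupancy[2][3] = 1    # incident blocks the aisle
print(plan(occupancy, (0, 0), (4, 4)))                     # new route around it
```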
01:36:41 Isn't that incredible? And so remember: a future facility, warehouse, factory, building will be software-defined, and so that software is running. How else would you test the software? You test the software that runs the warehouse, the optimization system, in the digital twin. What about all the robots? All of those robots you were seeing just now are running their own autonomous robotic stack, and so the way you integrate software in the future, CI/CD in the future, for robotic systems is with digital twins. We've made Omniverse a lot easier to access: we're going to create basically Omniverse Cloud APIs, four simple APIs and a channel, and you can connect your application to it. This is going to be as wonderfully, beautifully simple as Omniverse is going to be in the future, and with these APIs you're going to have this magical digital twin capability. We've also turned Omniverse into an AI and integrated it with the ability to chat USD. Our language is human, and Omniverse's language, as it turns out, is Universal Scene Description, and that language is rather complex. So we've taught our Omniverse that language, and so you can speak to it in English and it will directly generate USD, and it will talk back in USD but converse back to you in English. You can also look for information in this world semantically: instead of the world being encoded semantically in language, now it's encoded semantically in scenes, and so you can ask it about certain objects or certain conditions and certain scenarios, and it can go and find that scenario for you. It can also collaborate with you in generation: you could design some things in 3D, it could simulate some things in 3D, or you could use AI to generate something in 3D.
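As a tiny, concrete taste of what "generate USD" means, here is a minimal sketch that authors a small Universal Scene Description stage with the open-source pxr Python bindings from the USD distribution. The file name and prim paths are arbitrary, and this is plain USD authoring, not the ChatUSD agent itself.

```python
# Minimal USD authoring sketch using the open-source pxr bindings.
# File name and prim paths are arbitrary placeholders.
from pxr import Gf, Usd, UsdGeom

stage = Usd.Stage.CreateNew("warehouse_stub.usda")
world = UsdGeom.Xform.Define(stage, "/World")

# One pallet-sized box, placed a couple of meters from the origin.
pallet = UsdGeom.Cube.Define(stage, "/World/Pallet")
pallet.GetSizeAttr().Set(1.2)
UsdGeom.XformCommonAPI(pallet.GetPrim()).SetTranslate(Gf.Vec3d(2.0, 0.0, 0.6))

stage.SetDefaultPrim(world.GetPrim())
stage.GetRootLayer().Save()
print(stage.GetRootLayer().ExportToString())   # the generated USD text
```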
01:38:39 Let's take a look at how this is all going to work. We have a great partnership with Siemens. Siemens is the world's largest industrial engineering and operations platform. You've seen now so many different companies in the industrial space. Heavy industries is one of the greatest final frontiers of IT, and we finally now have the necessary technology to go and make a real impact. Siemens is building the industrial metaverse, and today we're announcing that Siemens is connecting their crown jewel, Xcelerator, to NVIDIA Omniverse. Let's take a look. "Siemens technology is transforming every day, for everyone. Teamcenter X, our leading product lifecycle management software from the Siemens Xcelerator platform, is used every day by our customers to develop and deliver products at scale. Now we are bringing the real and digital worlds even closer by integrating NVIDIA AI and Omniverse technologies into Teamcenter X. Omniverse APIs enable data interoperability and physics-based rendering for industrial-scale design and manufacturing projects. Our customer HD Hyundai, a market leader in sustainable ship manufacturing, builds ammonia- and hydrogen-powered ships, often comprising over seven million discrete parts. With Omniverse APIs, Teamcenter X lets companies like HD Hyundai unify and visualize these massive engineering data sets interactively, and integrate generative AI to generate 3D objects or HDRI backgrounds to see their projects in context. The result: an ultra-intuitive, photoreal, physics-based digital twin that eliminates waste and errors, delivering huge savings in cost and time. And we are building this for collaboration, whether across more Siemens Xcelerator tools like Siemens NX or STAR-CCM+, or across teams working on their favorite devices in the same scene together. And this is just the beginning. Working with NVIDIA, we will bring accelerated computing, generative AI, and Omniverse integration across the Siemens Xcelerator portfolio."
01:41:03 The professional voice actor happens to be a good friend of mine, Roland Busch, who happens to be the CEO of Siemens. Once you get Omniverse connected into your workflow, your ecosystem, from the beginning of your design to engineering to manufacturing planning, all the way to digital twin operations, once you connect everything together, it's insane how much productivity you can get, and it's just really, really wonderful. All of a sudden everybody's operating on the same ground truth: you don't have to exchange data and convert data and make mistakes. Everybody is working on the same ground truth, from the design department to the art department, the architecture department, all the way to the engineering and even the marketing department. Let's take a look at how Nissan has integrated Omniverse into their workflow, and it's all because it's connected by all these wonderful tools and these developers that we're working with. Take a look.
[Music]
01:43:53 That was not an animation; that was Omniverse. Today we're announcing that Omniverse Cloud streams to the Vision Pro, and it is very, very strange that you walk around virtual doors, like when I was getting out of that car, and everybody does it. It is really, really quite amazing. Vision Pro, connected to Omniverse, portals you into Omniverse, and because all of these CAD tools and all these different design tools are now integrated and connected to Omniverse, you can have this type of workflow. Really incredible.
01:44:43 Let's talk about robotics. Everything that moves will be robotic; there's no question about that. It's safer, it's more convenient, and one of the largest industries is going to be automotive. We build the robotics stack from top to bottom, as I mentioned, from the computer system, and in the case of self-driving cars, including the self-driving application. At the end of this year, or I guess the beginning of next year, we will be shipping in Mercedes, and then shortly after that, JLR. These autonomous robotic systems are software-defined; they take a lot of work to do. They have computer vision, obviously artificial intelligence, control and planning, all kinds of very complicated technology, and they take years to refine. We're building the entire stack; however, we open up our entire stack for all of the automotive industry. This is just the way we work, the way we work in every single industry: we try to build as much of it as we can so that we understand it, but then we open it up so everybody can access it, whether you would like to buy just our computer, which is the world's only fully functional, safe, ASIL-D system that can run AI, this functionally safe, ASIL-D quality computer, or the operating system on top, or of course our data centers, which are in basically every AV company in the world. However you would like to enjoy it, we're delighted by it. Today we're announcing that BYD, the world's largest EV company, is adopting our next generation; it's called Thor. Thor is designed for Transformer engines. Thor, our next-generation AV computer, will be used by BYD.
01:46:28 You probably don't know this fact: we have over a million robotics developers. We created Jetson, this robotics computer. We're so proud of it; the amount of software that goes on top of it is insane. But the reason we can do it at all is because it's 100% CUDA compatible. Everything that we do, everything that we do in our company, is in service of our developers, and by being able to maintain this rich ecosystem and make it compatible with everything that you access from us, we can bring all of that incredible capability to this little tiny computer we call Jetson, a robotics computer. We also today are announcing this incredibly advanced new SDK; we call it Isaac Perceptor. Most of the robots today are pre-programmed: they're either following rails on the ground, digital rails, or they'd be following AprilTags. But in the future they're going to have perception, and the reason you want that is so that you can easily program them. You say, I would like you to go from point A to point B, and it will figure out a way to navigate its way there. So by only programming waypoints, the entire route could be adaptive, the entire environment could be reprogrammed, just as I showed you at the very beginning with the warehouse. You can't do that with pre-programmed AGVs: if those boxes fall down, they all just gum up and wait there for somebody to come clear them. And so now, with Isaac Perceptor, we have incredible state-of-the-art visual odometry, 3D reconstruction, and, in addition to 3D reconstruction, depth perception. The reason for that is so that you can have two modalities to keep an eye on what's happening in the world. Isaac Perceptor.
01:48:14 The most used robot today is the manipulator, manufacturing arms, and they are also pre-programmed. The computer vision algorithms, the AI algorithms, the control and path-planning algorithms that are geometry-aware are incredibly computationally intensive. We have made these CUDA-accelerated, so we have the world's first CUDA-accelerated, geometry-aware motion planner. You put something in front of it, it comes up with a new plan and articulates around it. It has excellent perception for pose estimation of a 3D object: not just its pose in 2D but its pose in 3D, so it has to imagine what's around it and how best to grasp it. So the foundation pose, the grasp foundation, and the articulation algorithms are now available; we call it Isaac Manipulator, and they also just run on NVIDIA's computers.
01:49:16 We are starting to do some really great work in the next generation of robotics. The next generation of robotics will likely be humanoid robotics. We now have the necessary technology and, as I was describing earlier, the necessary technology to imagine generalized humanoid robotics. In a way, humanoid robotics is likely easier, and the reason for that is because we have a lot more imitation training data that we can provide the robots, because we are constructed in a very similar way. It is very likely that humanoid robots will be much more useful in our world, because we created the world to be something that we can interoperate in and work well in, and the way we set up our workstations and manufacturing and logistics, they were designed for humans, they were designed for people. And so these humanoid robots will likely be much more productive to deploy. We're creating, just like we're doing with the others, the entire stack, starting from the top: a foundation model that learns from watching video, human examples, which could be in video form or in virtual reality form. We then created a gym for it, called Isaac Reinforcement Learning Gym, which allows the humanoid robot to learn how to adapt to the physical world. And then an incredible computer, the same computer that's going to go into our robotic car: this computer will run inside a humanoid robot, called Thor. It's designed for Transformer engines. We've combined several of these into one video; this is something that you're going to really love. Take a look.
01:50:59 It's not enough for humans to imagine; we have to invent and explore and push beyond what's been done, a fair amount of detail. [Music] We create, smarter and faster. We push it to fail so it can learn. We teach it, then help it teach itself. We broaden its understanding to take on new challenges with absolute precision, and succeed. We make it perceive and move and even reason, so it can share our world with us. [Music]