00:36 I am a visionary, illuminating galaxies to witness the birth of stars, and sharpening our understanding of extreme weather events. I am a helper, guiding the blind through a crowded world. "I was thinking about running to the store." And giving voice to those who cannot speak.
01:22 I am a transformer, harnessing gravity to store renewable power, and paving the way towards unlimited clean energy. I am a trainer, teaching robots to assist, to watch out for danger, and to help save lives.
02:08 I am a healer, providing a new generation of cures and new levels of patient care. "Doctor, I am allergic to penicillin. Is it still okay to take the medications?" "Definitely. These antibiotics don't contain penicillin, so it's perfectly safe for you to take them." I am a navigator, generating virtual scenarios to let us safely explore the world and understand every decision. I even helped write the script, breathe life into the words. I am AI, brought to life by NVIDIA, deep learning, and brilliant minds everywhere. Please welcome to the stage NVIDIA founder and CEO Jensen Huang.
04:00 I hope you realize this is not a concert. You have arrived at a developers conference. There will be a lot of science described: algorithms, computer architecture, mathematics. I sensed a very heavy weight in the room all of a sudden, almost like you were in the wrong place. No conference in the world has a greater assembly of researchers from such diverse fields of science, from climate tech to radio sciences, trying to figure out how to use AI to robotically control MIMOs for next-generation 6G radios, robotic self-driving car intelligence, even artificial general intelligence.
05:10 Everybody's first, I noticed a sense of relief there, all of a sudden. Also, this conference is represented by some amazing companies. This list: this is not the attendees; these are the presenters. And what's amazing is this: if you take away all of my friends, close friends (Michael Dell is sitting right there) in the IT industry, all of the friends I grew up with in the industry, if you take away that list, this is what's amazing. These are the presenters of the non-IT industries, using accelerated computing to solve problems that normal computers can't. It's represented in life sciences, healthcare, genomics, transportation, of course retail, logistics, manufacturing, industrial: the gamut of industries represented is truly amazing. And you're not here to attend only; you're here to present, to talk about your research. $100 trillion of the world's industries is represented in this room today. This is absolutely
06:44 amazing. There is absolutely something happening. There is something going on. The industry is being transformed, not just ours, because the computer industry, the computer, is the single most important instrument of society today. Fundamental transformations in computing affect every industry. But how did we start? How did we get here? I made a little cartoon for you; literally, I drew this in one page. This is NVIDIA's journey, started in 1993. This might be the rest of the talk. 1993, this is our journey. We were founded in 1993. There are several important events that happened along the way; I'll just highlight a few. In 2006, CUDA, which has turned out to have been a revolutionary computing model. We thought it was revolutionary then. It was going to be an overnight success, and almost 20 years later, it happened.
08:08 In 2016, recognizing the importance of this computing model, we invented a brand new type of computer we called the DGX-1: 170 teraflops in this supercomputer, eight GPUs connected together for the very first time. I hand-delivered the very first DGX-1 to a startup located in San Francisco called OpenAI. DGX-1 was the world's first AI supercomputer. Remember: 170 teraflops. In 2017, the Transformer arrived. In 2022, ChatGPT captured the world's imagination, and people realized the importance and the capabilities of artificial intelligence. And in 2023, generative AI emerged, and a new industry begins.
09:12 Why? Why is it a new industry? Because the software never existed before. We are now producing software, using computers to write software, producing software that never existed before. It is a brand new category. It took share from nothing. It's a brand new category, and the way you produce the software is unlike anything we've ever seen: generating tokens, producing floating-point numbers at very large scale. It's as if, in the beginning of this last industrial revolution, people realized that you could build factories, apply energy to them, and this invisible, valuable thing called electricity came out: AC generators. And 100 years later, 200 years later, we are now creating new types of electrons, tokens, using infrastructure we call factories, AI factories, to generate this new, incredibly valuable thing called artificial intelligence. A new industry has emerged. Well, we're going to talk
10:30 about many things about this new industry. We're going to talk about how we're going to do computing next. We're going to talk about the type of software that you build because of this new industry, this new software; how you would think about this new software; applications in industry; and then maybe what's next, and how we can start preparing today for what is about to come next. Well, but before I start, I want to show you the soul of NVIDIA, the soul of our company, at the intersection of computer graphics, physics, and artificial intelligence, all intersecting inside a computer, in Omniverse, in a virtual world simulation. Everything we're going to show you today, literally everything, is a simulation, not animation. It's only beautiful because it's physics. The world is beautiful. It's only amazing because it's being animated with robotics, being animated with artificial intelligence. What you're about to see all day is completely generated, completely simulated in Omniverse, and all of it, what you're about to enjoy, is the world's first concert where everything is homemade. You're about to watch some home videos, so sit back and enjoy yourself.
15:03 NVIDIA accelerated computing has reached the tipping point. General-purpose computing has run out of steam. We need another way of doing computing, so that we can continue to scale, so that we can continue to drive down the cost of computing, so that we can continue to consume more and more computing while being sustainable. Accelerated computing is a dramatic speedup over general-purpose computing, and in every single industry we engage, and I'll show you many, the impact is dramatic. But in no industry is it more important than our own, the industry of using simulation tools to create products. In this industry, it is not about driving down the cost of computing; it's about driving up the scale of computing. We would like to simulate the entire product that we make, completely in full fidelity, completely digitally, in essentially what we call digital twins. We would like to design it, build it, simulate it, and operate it, completely digitally. In order to do that, we need to accelerate an entire industry, and today I would like to announce that we have some partners who are joining us in this journey to accelerate their entire ecosystem, so that we can bring the world into accelerated computing. But there's a bonus: when you become accelerated, your infrastructure is CUDA GPUs, and when that happens, it's exactly the same infrastructure as for generative
16:50 AI. And so I'm just delighted to announce several very important partnerships. These are some of the most important companies in the world. Ansys does engineering simulation for what the world makes. We're partnering with them to CUDA-accelerate the Ansys ecosystem, to connect Ansys to the Omniverse digital twin. Incredible. The thing that's really great is that the installed base of NVIDIA GPU-accelerated systems is all over the world, in every cloud, in every system, all over enterprises, and so the applications they accelerate will have a giant installed base to go serve. End users will have amazing applications, and of course system makers and CSPs will
17:35 Synopsys. Synopsys is literally NVIDIA's first software partner. They were there on the very first day of our company. Synopsys revolutionized the chip industry with high-level design. We are going to CUDA-accelerate Synopsys. We're accelerating computational lithography, one of the most important applications that nobody's ever known about. In order to make chips, we have to push lithography to its limit. NVIDIA has created a library, a domain-specific library, that accelerates computational lithography incredibly. Once we can accelerate and software-define all of TSMC, who is announcing today that they're going to go into production with NVIDIA cuLitho, once it's software-defined and accelerated, the next step is to apply generative AI to the future of semiconductor manufacturing, pushing geometry even
18:31 further. Cadence builds the world's essential EDA and SDA tools. We also use Cadence. Between these three companies, Ansys, Synopsys, and Cadence, we basically build NVIDIA together. We are CUDA-accelerating Cadence. They're also building a supercomputer out of NVIDIA GPUs so that their customers can do fluid dynamics simulation at a hundred, a thousand times the scale: basically, a wind tunnel in real time. Cadence Millennium, a supercomputer with NVIDIA GPUs inside. A software company building supercomputers; I love seeing that. We're building Cadence copilots together. Imagine a day when Cadence, Synopsys, and Ansys tool providers would offer you AI copilots, so that we have thousands and thousands of copilot assistants helping us design chips, design systems. And we're also going to connect the Cadence digital twin platform to Omniverse. As you can see the trend here: we're accelerating the world's CAE, EDA, and SDA, so that we can create our future in digital twins, and we're going to connect them all to Omniverse, the fundamental operating system for future digital
19:50 twins. One of the industries that benefited tremendously from scale, and you all know this one very well, is large language models. Basically, after the Transformer was invented, we were able to scale large language models at incredible rates, effectively doubling every six months. Now, how is it possible that by doubling every six months we have grown the industry, we have grown the computational requirements, so far? And the reason is quite simply this: if you double the size of the model, you double the size of your brain, and you need twice as much information to go fill it. And so every time you double your parameter count, you also have to appropriately increase your training token count. The combination of those two numbers becomes the computation scale you have to support. The latest, state-of-the-art OpenAI model is approximately 1.8 trillion parameters. 1.8 trillion parameters required several trillion tokens to go train. So a few trillion parameters, on the order of a few trillion tokens: when you multiply the two of them together, approximately 30, 40, 50 billion quadrillion floating-point operations. Now we just have
21:16 to do some CEO math right now, so just hang with me. You have 30 billion quadrillion. A quadrillion is like a peta, and so if you had a one-petaflop GPU, you would need 30 billion seconds to go compute, to go train that model. 30 billion seconds is approximately 1,000 years. Well, 1,000 years. It's worth it. I'd like to do it sooner, but it's worth it. Which is usually my answer when most people ask me, hey, how long is it going to take to do something? 20 years? It's worth it. But can we do it next week? And so: 1,000 years. 1,000 years.
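That arithmetic can be checked in a few lines. A back-of-the-envelope sketch: the 6-FLOPs-per-parameter-per-token rule of thumb and the exact token count are assumptions for illustration, not figures from the talk.

```python
# Rough check of the "CEO math": 1.8T parameters, a few trillion tokens.
params = 1.8e12              # model parameters (as stated)
tokens = 3e12                # training tokens (assumed; "a few trillion")
flops = 6 * params * tokens  # ~6 FLOPs per parameter per token (rule of thumb)

PETA = 1e15
print(f"total compute: {flops / PETA / 1e9:.0f} billion petaFLOPs")
seconds = flops / PETA       # wall time on a single 1-petaFLOP/s GPU
print(f"{seconds / 1e9:.0f} billion seconds, roughly {seconds / 3.15e7:,.0f} years")
```

With these round numbers, the total lands in the 30-to-50-billion-quadrillion range quoted on stage, and the single-GPU wall time comes out near the 1,000-year mark.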
22:09 What we need are bigger GPUs. We need much, much bigger GPUs. We recognized this early on, and we realized that the answer is to put a whole bunch of GPUs together, and of course to innovate a whole bunch of things along the way, like inventing Tensor Cores and advancing NVLink, so that we could create essentially virtually giant GPUs, and connecting them all together with amazing networks from a company called Mellanox, with InfiniBand, so that we could create these giant systems. And so DGX-1 was our first version, but it wasn't the last. We built supercomputers all along the way. In 2021, we had Selene, 4,500 GPUs or so, and then in 2023 we built one of the largest AI supercomputers in the world. It just came online: EOS. And as we're building these things, we're trying to help the world build these things, and in order to help the world build these things, we've got to build them first. We build the chips, the systems, the networking, all of the software necessary to do this. You should see these systems. Imagine writing a piece of software that runs across the entire system, distributing the computation across thousands of GPUs, but inside are thousands of smaller GPUs, millions of GPUs, to distribute work across, and balancing the workload so that you can get the most energy efficiency, the best computation time, and keep your cost down. Those fundamental innovations are what got us here. And here we
23:55 are. As we see the miracle of ChatGPT emerge in front of us, we also realize we have a long way to go. We need even larger models. We're going to train them with multimodality data, not just text on the internet: we're going to train them on text and images, graphs and charts, just as we learned from watching TV. And so there's going to be a whole bunch of watching video, so that these models can be grounded in physics and understand that an arm doesn't go through a wall. And so these models will have common sense, by watching a lot of the world's video combined with a lot of the world's languages. They'll use things like synthetic data generation, just as you and I do when we try to learn: we might use our imagination to simulate how it's going to end up, just as I did when I was preparing for this keynote. I was simulating it all along the way. I hope it's going to turn out as well as I had it in my
25:05 head. As I was simulating how this keynote was going to turn out, somebody did say that another performer did her performance completely on a treadmill so that she could be in shape to deliver it with full energy. I didn't do that. If I get a little winded about 10 minutes into this, you know what happened. And so, where were we? We're sitting here using synthetic data generation. We're going to use reinforcement learning. We're going to practice it in our mind. We're going to have AI working with AI, training each other, just like student and teacher debaters. All of that is going to increase the size of our model, it's going to increase the amount of data that we have, and we're going to have to build even bigger GPUs. Hopper is fantastic, but we need bigger GPUs. And so, ladies and gentlemen, I would like to introduce you to a very, very big GPU, named after David Blackwell: mathematician, game theorist, probability. We thought it was a perfect name. Blackwell, ladies and gentlemen.
29:17 Blackwell is not a chip. Blackwell is the name of a platform. People think we make GPUs, and we do, but GPUs don't look the way they used to. Here is, if you will, the heart of the Blackwell system. This, inside the company, is not called Blackwell; it's just a number. And this is Blackwell, sitting next to, oh, this is the most advanced GPU in the world in production today. This is Hopper. This is Hopper. Hopper changed the world. Hopper, you're very good. 208 billion transistors. And so you could see, I can see, that there's a small line between two dies. This is the first time two dies have abutted like this together, in such a way that the two dies think it's one chip. There's 10 terabytes of data between them, 10 terabytes per second, so that these two sides of the Blackwell chip have no clue which side they're on. There's no memory locality issues, no cache issues; it's just one giant chip. And so, when we were told that Blackwell's ambitions were beyond the limits of physics, the engineers said, so what? And so this is what happened. And so
31:14 this is the Blackwell chip, and it goes into two types of systems. The first one is form-fit-function compatible to Hopper: you slide out Hopper, and you push in Blackwell. That's the reason why the ramp is going to be so efficient. There are installations of Hoppers all over the world, and they can keep the same infrastructure, same design; the power, the electricity, the thermals, the software: identical. Push it right back in. And so this is a Hopper version for the current HGX configuration. And this is what the second one looks like. This is now a prototype board. Janine, may I borrow it? Ladies and gentlemen, Janine Paul. And so this is a fully functioning board, and I'll just be careful here. This, right here, is, I don't know, $10 billion. The second one's five. It gets cheaper after that, so any customers in the audience, it's okay. All right, but this one's quite expensive. This is a bring-up board, and the way it's going to go to production is like this one
32:50 here. Okay, and so you're going to take this. It has two Blackwell chips, four Blackwell dies, connected to a Grace CPU. The Grace CPU has a super-fast chip-to-chip link. What's amazing is that this computer is the first of its kind where this much computation, first of all, fits into this small of a place. Second, it's memory coherent. They feel like they're just one big happy family, working on one application together, and so everything is coherent within it. Just the amount of, you know, you saw the numbers, there's a lot of terabytes this and terabytes that, but this is a miracle. Let's see, what are some of the things on here? There's NVLink on top, PCI Express on the bottom, and on your, which one is mine and which is yours? Your left, one of them. It doesn't matter. One of them is the CPU chip-to-chip link. Is it my left or your left, depending on which side? I was just trying to sort that out, and it just kind of doesn't matter. Hopefully it comes plugged in. So, okay. So this is the Grace Blackwell system. But there's
34:34 more. So it turns out all of the specs are fantastic, but we need a whole lot of new features in order to push the limits beyond, if you will, the limits of physics. We would like to always get a lot more X-factors. And so one of the things that we did was invent another Transformer engine, the second-generation Transformer engine. It has the ability to dynamically and automatically rescale and recast numerical formats to a lower precision whenever it can. Remember, artificial intelligence is about probability, and so you kind of have, you know, approximately 1.7 times approximately 1.4, to be approximately something else. Does that make sense? And so the ability for the mathematics to retain the precision and the range necessary in that particular stage of the pipeline is super important. And so it's not just about the fact that we designed a smaller ALU; the world's not quite that simple. You've got to figure out when you can use that, across a computation that spans thousands of GPUs and runs for weeks and weeks on end, and you want to make sure that the training job is going to converge.
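The idea can be sketched in a few lines. This is a toy illustration of per-tensor dynamic-range scaling, the general technique behind mixed-precision engines; the E4M3 maximum is the standard FP8 constant, but the function names and the clamp-only rounding are simplifications of ours, not NVIDIA's implementation.

```python
# Toy sketch: give each tensor a scale so its values fit the narrow range
# of an 8-bit float before casting down. Clamp only; real hardware also
# rounds the mantissa.
FP8_E4M3_MAX = 448.0  # largest finite magnitude in the E4M3 8-bit format

def scale_for(tensor):
    """Scale that maps the tensor's largest magnitude onto the format max."""
    amax = max(abs(x) for x in tensor)
    return FP8_E4M3_MAX / amax if amax > 0 else 1.0

def round_trip(tensor):
    """Scale into the representable range, clamp, and scale back up."""
    s = scale_for(tensor)
    clamped = [max(-FP8_E4M3_MAX, min(x * s, FP8_E4M3_MAX)) for x in tensor]
    return [x / s for x in clamped]

activations = [0.003, -1.7, 0.25, 4.2]
print(round_trip(activations))  # survives the trip thanks to the scale
```

The hard part, as the talk says, is deciding per pipeline stage when this narrower representation still preserves enough range and precision for a weeks-long run to converge.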
35:59 And so this new Transformer engine. We have a fifth-generation NVLink. It's now twice as fast as Hopper, but very importantly, it has computation in the network. And the reason for that is this: when you have so many different GPUs working together, we have to share our information with each other, we have to synchronize and update each other, and every so often we have to reduce the partial products and then rebroadcast the sum of the partial products back to everybody else. And so there's a lot of what is called all-reduce and all-to-all and all-gather; it's all part of this area of synchronization and collectives, so that we can have GPUs working with each other. Having extraordinarily fast links, and being able to do mathematics right in the network, allows us to essentially amplify even further. So even though it's 1.8 terabytes per second, it's effectively higher than that.
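In software terms, the collective being described looks like the minimal sketch below; this is not how the NVLink switch implements it (the point of in-network computation is that the switch performs the sum in the fabric instead of the GPUs doing it).

```python
# Minimal all-reduce: every rank contributes a partial gradient, and every
# rank ends up holding the identical elementwise sum.
def all_reduce(partials):
    n = len(partials[0])
    total = [sum(p[i] for p in partials) for i in range(n)]  # reduce
    return [list(total) for _ in partials]                   # rebroadcast

grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # three GPUs, two parameters
print(all_reduce(grads))  # every GPU now holds [9.0, 12.0]
```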
36:53 It's many times that of Hopper. The likelihood of a supercomputer running for weeks on end is approximately zero, and the reason is that there are so many components working at the same time; statistically, the probability of all of them working continuously is very low.
37:14 We need to make sure that whenever there is a failure, well, we checkpoint and restart as often as we can. But if we have the ability to detect a weak chip or a weak node early, we can retire it, and maybe swap in another processor. That ability to keep the utilization of the supercomputer high, especially when you just spent $2 billion building it, is super important. And so we put in a RAS engine, a reliability engine, that does 100% self-test, in-system test, of every single gate, every single bit of memory on the Blackwell chip, and all the memory that's connected to it. It's almost as if we shipped, with every single chip, its own advanced tester, the kind we test our chips with. This is the first time we're doing this. Super excited about it. Secure AI. Only at this conference do they clap for secure AI. Obviously, you've
38:32 just spent hundreds of millions of dollars creating a very important AI, and the code, the intelligence of that AI, is encoded in the parameters. You want to make sure that, on the one hand, you don't lose it, and on the other hand, it doesn't get contaminated. And so we now have the ability to encrypt data, of course at rest, but also in transit, and while it's being computed: it's all encrypted. So we now have the ability to encrypt in transmission, and when we're computing it, it is in a trusted, trusted environment, a trusted-engine environment. And the last thing is decompression. Moving data in and out of these nodes when the compute is so fast is essential, and so we've put in a high-line-speed compression engine that effectively moves data 20 times faster in and out of these computers. These computers are so powerful, and there's such a large investment, that the last thing we want to do is have them be idle. And so all of these capabilities are intended to keep Blackwell fed, and as busy as
39:46 possible. Overall, compared to Hopper, it is two and a half times, two and a half times the FP8 performance for training, per chip. It also has this new format called FP6, so that even though the computation speed is the same, the effective bandwidth is amplified: because of the memory savings, the amount of parameters you can store in the memory is now amplified. FP4 effectively doubles the throughput. This is vitally important for inference. One of the things that is becoming very clear is that whenever you use a computer with AI on the other side, when you're chatting with the chatbot, when you're asking it to make an image, remember: in the back is a GPU generating tokens. Some people call it inference, but it's more appropriately called generation. The way that computing was done in the past was retrieval. You would grab your phone, you would touch something, some signals go off, basically an email goes off to some storage somewhere. There's pre-recorded content: somebody wrote a story, or somebody made an image, or somebody recorded a video. That pre-recorded content is then streamed back to the phone and recomposed, based on a recommender system, in a way that presents the information to
41:16 you. You know that in the future, the vast majority of that content will not be retrieved, and the reason is that it was pre-recorded by somebody who doesn't understand the context, which is the reason why we have to retrieve so much content. If you can be working with an AI that understands the context, who you are, for what reason you're fetching this information, and it produces the information for you just the way you like it, the amount of energy we save, the amount of networking bandwidth we save, the amount of wasted time we save, will be tremendous. The future is generative, which is the reason why we call it generative AI, which is the reason why this is a brand new industry. The way we compute is fundamentally different. We created a processor for the generative AI era, and one of the most important parts of it is content token generation. We call this format FP4. Well, that's a lot of computation.
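A rough capacity illustration of why the narrower formats matter; the 192 GiB memory figure is an assumption for the example, not a quoted spec. Halving the bits per weight doubles both the parameters that fit in a fixed memory and the weights moved per unit of bandwidth.

```python
def params_that_fit(memory_gib, bits_per_param):
    """How many parameters fit in a fixed memory budget at a given width."""
    return memory_gib * 2**30 * 8 // bits_per_param

HBM_GIB = 192  # assumed accelerator memory, for illustration only
for bits, name in [(16, "FP16"), (8, "FP8"), (6, "FP6"), (4, "FP4")]:
    print(f"{name}: {params_that_fit(HBM_GIB, bits) / 1e9:.0f}B parameters")
```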
42:24 5x the token generation, 5x the inference capability of Hopper. Seems like enough. But why stop there? The answer is, it's not enough, and I'm going to show you why. I'm going to show you why. And so we would like to have a bigger GPU, even bigger than this one. And so we decided to scale it. But first, let me just tell you how we've scaled. Over the course of the last eight years, we've increased computation by 1,000 times. Eight years, 1,000 times. Remember back in the good old days of Moore's Law: 2x every, well, 5x every, what, 10x every 5 years. That's the easiest math: 10x every 5 years, 100 times every 10 years. 100 times every 10 years, in the heyday of the PC revolution. In the last 8 years, we've gone 1,000 times. We have two more years to go. And so that puts it in perspective.
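Those growth rates are easiest to compare as doubling times; a quick sketch of the conversion, using the talk's round numbers (the framing is ours).

```python
import math

def doubling_time(total_factor, years):
    """Years per 2x, given total growth `total_factor` over `years`."""
    return years / math.log2(total_factor)

print(f"Moore's-law era, 10x per 5 years: 2x every {doubling_time(10, 5):.1f} years")
print(f"Last 8 years, 1000x: 2x every {doubling_time(1000, 8):.1f} years")
```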
43:43 The rate at which we're advancing computing is insane, and it's still not fast enough. So we built another
43:49 chip. This chip is just an incredible chip. We call it the NVLink Switch. It's 50 billion transistors. It's almost the size of Hopper all by itself. This switch chip has four NVLinks in it, each 1.8 terabytes per second, and, as I mentioned, it has computation in it. What is this chip for? If we were to build such a chip, we can have every single GPU talk to every other GPU at full speed, at the same time. That's insane. It doesn't even make sense. But if you could do that, if you can find a way to do that, and build a system to do that, that's cost-effective, how incredible would it be that we could have all these GPUs connect over a coherent link, so that they effectively are one giant GPU? Well, one of the great inventions, in order to make it cost-effective, is that this chip has to drive copper directly. The SerDes of this chip is just a phenomenal invention, so that we could do direct drive to copper, and as a result, you can build a system that looks
45:30 Now, this system, this system is kind of insane. This is one DGX. This is what a DGX looks like now. Remember, just six years ago it was pretty heavy, but I was able to lift it. I delivered the first DGX-1 to OpenAI and the researchers there; the pictures are on the internet, and we all autographed it. If you come to my office, it's autographed there. It's really beautiful. But you could lift it. This DGX, that DGX, by the way, was 170 teraflops. If you're not familiar with the numbering system, that's 0.17 petaflops. So this is 720. The first one I delivered to OpenAI was 0.17; you could round it up to 0.2, it won't make any difference. And back then it was like, wow, you know, 30 more teraflops. And so this is now 720 petaflops, almost an exaflop for training, and the world's first one-exaflop machine in one rack. Just so you know, there are only a couple, two, three, exaflop machines on the planet as we speak. And so this is an exaflop AI system in one single rack.
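For the units in that comparison (tera = 10^12, peta = 10^15, exa = 10^18), the numbers line up as follows.

```python
TERA, PETA, EXA = 1e12, 1e15, 1e18

dgx1_2016 = 170 * TERA  # the first DGX-1: 170 teraflops
rack_now = 720 * PETA   # this rack: 720 petaflops

print(dgx1_2016 / PETA)             # 0.17 petaflops, rounds up to 0.2
print(rack_now / EXA)               # 0.72, "almost an exaflop"
print(round(rack_now / dgx1_2016))  # ~4235x between the two machines
```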
47:09 Well, let's take a look at the back of it. So this is what makes it possible. That's the back. That's the back: the DGX NVLink spine. 130 terabytes per second goes through the back of that chassis. That is more than the aggregate bandwidth of the internet. So we could basically send everything to everybody within a second. And so we have 5,000 cables, 5,000 NVLink cables, in total two miles. Now, this is the amazing thing: if we had to use optics, we would have had to use transceivers and retimers, and those transceivers and retimers alone would have cost 20,000 watts, 20 kilowatts, of just transceivers, just to drive the NVLink spine. As a result, we did it completely for free over the NVLink switch, and we were able to save the 20 kilowatts for computation. This entire rack is 120 kilowatts, so that 20 kilowatts makes a huge difference. It's liquid cooled. What goes in is 25 degrees C, about room temperature. What comes out is 45 degrees C, about your jacuzzi. So room temperature goes in, jacuzzi comes out, two liters per second. We could sell a
48:58 600,000 parts. Somebody used to say, you know, you guys make GPUs, and we do, but this is what a GPU looks like to me. When somebody says GPU, I see this. Two years ago, when I saw a GPU, it was the HGX: it was 70 pounds, 35,000 parts. Our GPUs now are 600,000 parts and 3,000 pounds. 3,000 pounds. 3,000 pounds. That's kind of like the weight of a, you know, Ferrari. I don't know if that's a useful metric, but everybody's going, I feel it, I feel it, I get it, I get that. Now that you mention that, I feel it. I don't know what's 3,000 pounds. Okay, so 3,000 pounds, a ton and a half. So not quite an elephant. So this is what a DGX looks
49:56like now let's see what it looks like in
49:58operation okay let's imagine what is
50:00what how do we put this to work and what
50:01does that mean well if you were to train
50:03a GPT model 1.8 trillion parameter
50:08model it took it took about apparently
50:11about you know 3 to 5 months or so uh
50:13with 25,000 amp uh if we were to do it
50:16with hopper it would probably take
50:17something like 8,000 gpus and it would
50:20consume 15 megawatts 8,000 gpus on 15
50:23megawatts it would take 90 days about 3
50:25months and that would allow you to
50:27train something that is you know this
50:30groundbreaking AI model and this is
50:34obviously not as expensive as as um as
50:37anybody would think but it's 8,000 8,000
50:39gpus it's still a lot of money and so
50:418,000 gpus 15 megawatts if you were to
50:44use Blackwell to do this it would only
50:49take 2,000 gpus 2,000 gpus same 90 days but this is
50:54the amazing part only 4 megawatts of power
50:58so from 15 megawatts down to 4 yeah that's
51:04right and that's our goal our
51:07goal is to continuously drive down the
51:10cost and the energy they're directly
51:11proportional to each other cost and
51:13energy associated with the Computing so
51:15that we can continue to expand and scale
51:17up the computation that we have to do to
51:20train the Next Generation models well
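The comparison above (Hopper: 8,000 GPUs at 15 MW; Blackwell: 2,000 GPUs at 4 MW; roughly 90 days either way) can be checked with quick arithmetic:

```python
# Figures quoted in the talk: training a 1.8T-parameter GPT-class model
# takes ~90 days either way; Hopper needs 8,000 GPUs at 15 MW,
# Blackwell needs 2,000 GPUs at 4 MW.
days = 90
hopper = {"gpus": 8000, "megawatts": 15}
blackwell = {"gpus": 2000, "megawatts": 4}

def energy_mwh(cfg):
    # Sustained power times wall-clock hours gives energy used.
    return cfg["megawatts"] * days * 24

print(energy_mwh(hopper))                          # → 32400 MWh
print(energy_mwh(blackwell))                       # → 8640 MWh
print(hopper["gpus"] / blackwell["gpus"])          # → 4.0 (4x fewer GPUs)
print(energy_mwh(hopper) / energy_mwh(blackwell))  # → 3.75 (energy ratio)
```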
51:23training inference or generation
51:27is vitally important going forward you
51:29know probably some half of the time that
51:31Nvidia gpus are in the cloud these days
51:33it's being used for token generation you
51:36know they're either doing co-pilot this
51:37or chat you know chat GPT that or um all
51:40these different models that are being
51:41used when you're interacting with it or
51:44generating IM generating images or
51:46generating videos generating proteins
51:48generating chemicals there's a bunch of
51:50generation going on all of that is
51:53basically in the category of computing we call
51:57inference but inference is extremely hard for
51:59large language models because these
52:01large language models have several
52:03properties one they're very large and so
52:05it doesn't fit on one GPU this is
52:08Imagine Excel doesn't fit on one
52:11GPU you know and imagine some
52:13application you're running on a daily
52:15basis doesn't run doesn't fit on one
52:16computer like a video game doesn't fit
52:18on one computer and most in fact do and
52:23many times in the past in hyperscale
52:25Computing many applications for
52:27many people fit on the same computer and
52:29now all of a sudden this one inference
52:31application where you're interacting
52:33with this chatbot that chatbot requires
52:36a supercomputer in the back to run it
52:38and that's the future the future is
52:41generative with these chatbots and these
52:43chatbots are trillions of tokens
52:46trillions of parameters and they have to
52:49generate tokens at interactive rates now what
52:52does that mean well uh a token is roughly a
52:58word you know the
53:01uh you know space the final frontier
53:05these are the adventures that's like
53:09tokens okay I don't know if that's
53:16so you know the art of communication is
53:19selecting a good
53:22analogy yeah this is not going
53:28well everybody's like I don't know what he's talking
53:30about never seen Star Trek and so
53:34so here we are we're trying to generate
53:35these tokens when you're interacting
53:37with it you're hoping that the tokens
53:38come back to you as quickly as possible
53:40and as quickly as you can read it and so
53:42the ability to generate tokens is
53:44really important you have to parallelize
53:46the work of this model across many many
53:48gpus so that you could achieve several
53:51things one on the one hand you would
53:52like throughput because that throughput determines
53:57the overall cost per token of uh
54:00generating so your throughput dictates
54:03the cost of of uh delivering the service
54:06on the other hand you have the
54:08interactive rate which is another tokens
54:10per second metric this one per user and
54:13that has everything to do with quality
54:14of service and so these two things um uh
54:18compete against each other and we have
54:20to find a way to distribute work across
54:23all of these different gpus and parallelize
54:25it in a way that allows us to achieve
54:27both and it turns out the search space is
54:31enormous you know I told you what's
54:34involved and everybody's going oh
54:37dear I heard some gasp just now when I
54:40put up that slide you know so so this
54:43this right here the the y axis is tokens
54:45per second data center throughput the
54:48x-axis is tokens per second interactivity
54:51of the person and notice the upper right
54:53is the best you want interactivity to be
54:56high the number of tokens per second per
54:59user and you want the tokens per second
55:01per data center to be very high the
55:02upper right is terrific however
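These two competing rates can be seen in a toy serving model. The cost constants below are invented for illustration; only the shape of the tradeoff matters:

```python
# Toy serving model: each generation step emits one token per user in
# the batch, and costs a fixed overhead plus a per-token amount.
# Bigger batches raise total tokens/sec (cheaper service) but lower
# tokens/sec per user (worse interactivity).
def rates(batch_size, overhead_ms=10.0, per_token_ms=1.0):
    step_ms = overhead_ms + per_token_ms * batch_size
    per_user = 1000.0 / step_ms    # tokens/sec one user sees
    total = per_user * batch_size  # tokens/sec the data center delivers
    return per_user, total

for batch in (1, 8, 64):
    per_user, total = rates(batch)
    print(f"batch={batch:3d}  per-user={per_user:6.1f}  total={total:7.1f}")
```

Sweeping the batch size (and, in a real system, the parallelism layout) traces out exactly the kind of throughput-versus-interactivity frontier the slide shows.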
55:05it's very hard to do that and in order
55:08for us to search for the best
55:10answer across every single one of those
55:12intersections XY coordinates okay so you
55:15just look at every single XY coordinate
55:17all those blue dots came from some
55:20repartitioning of the software some
55:23optimizing solution has to go and figure
55:25out whether to use tensor
55:29parallel expert parallel pipeline
55:32parallel or data parallel and
55:34distribute this enormous model across
55:37all these different gpus and sustain the
55:40performance that you need this
55:42exploration space would be impossible if
55:45not for the programmability of nvidia's
55:47gpus and so we could because of Cuda
55:49because we have such Rich ecosystem we
55:51could explore this universe and find
55:54that green roof line it turns out that
55:57green roof line notice you got TP2 EP8
56:01DP4 it means tensor
56:05parallel across two gpus
56:08expert parallel across eight data
56:10parallel across four notice on the other
56:12end you got tensor parallel across 4 and
56:14expert parallel across 16 the
56:17configuration the distribution of that
56:19software it's a different different um
56:22runtime that would produce these
56:25different results and you have to go
56:27discover that roof line well that's just
56:29one model and this is just one
56:32configuration of a computer imagine all
56:34of the models being created around the
56:35world and all the different different um
56:38uh configurations of of uh systems that
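The partitioning search being described, across tensor parallel (TP), expert parallel (EP), pipeline parallel (PP), and data parallel (DP) splits, can be sketched as a toy enumeration. This is only an illustration of how fast the space grows, not NVIDIA's actual optimizer:

```python
from itertools import product

def valid_configs(num_gpus, max_degree=32):
    # Candidate degrees for each parallelism axis: divisors of the GPU count.
    degrees = [d for d in range(1, max_degree + 1) if num_gpus % d == 0]
    # Keep every (TP, EP, PP, DP) split that uses exactly all the GPUs.
    return [(tp, ep, pp, dp)
            for tp, ep, pp, dp in product(degrees, repeat=4)
            if tp * ep * pp * dp == num_gpus]

configs = valid_configs(64)
print(len(configs))             # → 80 candidate layouts for just 64 GPUs
print((2, 8, 1, 4) in configs)  # → True: the TP2/EP8/DP4 point named on the slide
```

Each such layout is a different runtime, and each blue dot on the plot is the measured result of one of them; a real autotuner also has to model memory, batch size, and interconnect to find the roofline.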
56:43available so now that you understand the
56:46basics let's take a look at inference of
56:52Blackwell compared to Hopper and this is the
56:55extraordinary thing in one generation
56:58because we created a system that's
57:01designed for trillion parameter gener
57:03generative AI the inference capability
57:06of Blackwell is off the
57:08charts and in fact it is some 30 times
57:18Hopper for large language models for large
57:21language models like ChatGPT and others
57:24like it the blue line is Hopper I gave
57:28you imagine we didn't change the
57:30architecture of Hopper we just made it a
57:33bigger chip we just used the latest you know
57:36greatest uh 10 terabytes
57:40per second we connected the two chips
57:42together we got this giant 208 billion
57:44parameter chip how would we have
57:46performed if nothing else changed and it
57:50performed wonderfully quite wonderfully and that's
57:52the purple line but not as great as it
57:55could be and and that's where the fp4
57:58tensor core the new Transformer engine
58:01and very importantly the NVLink switch come in
58:04and the reason for that is because all
58:06these gpus have to share the results
58:08partial products whenever they do
58:10all-to-all or all-gather whenever they
58:12communicate with each
58:14other that NVLink switch is
58:17communicating almost 10 times faster
58:20than what we could do in the past using
58:23networks Okay so Blackwell is going to
58:27be just an amazing system for
58:30generative AI and in the
58:33future in the future data centers are
58:36going to be thought of as I mentioned
58:38earlier as an AI Factory an AI Factory's
58:42goal in life is to generate revenues to generate
58:50intelligence in this facility not
58:53generating electricity as in AC
58:57generators of the last Industrial Revolution
58:59but in this Industrial Revolution the
59:00generation of intelligence and so this
59:03ability is super super important the
59:06excitement of Blackwell is really off
59:08the charts you know when we first when
59:10we first um uh you know this this is a
59:14year and a half ago two years ago I
59:16guess two years ago when we first
59:17started to to go to market with hopper
59:20you know we had the benefit of uh two
59:22CSPs uh who joined us in the launch
59:26and we were you know delighted um
59:31customers uh we have more
59:46now unbelievable excitement for
59:48Blackwell unbelievable excitement and
59:51there's a whole bunch of different
59:52configurations of course I showed you
59:54the configurations that slide into the
59:56hopper form factor so that's easy to
59:58upgrade I showed you examples that are
01:00:01liquid cooled that are the extreme
01:00:03versions of it one entire rack that's
connected by NVLink 72 uh
01:00:08Blackwell is going to be
01:00:12ramping to the world's AI companies of
01:00:16which there are so many now doing
01:00:18amazing work in different modalities the
csps every CSP is geared up all the OEMs and
01:00:27ODMs regional clouds sovereign AIs and
01:00:32telcos all over the world are signing up
01:00:34to launch with Blackwell
01:00:43this Blackwell would be the
01:00:46most successful product launch
01:00:48in our history and so I can't wait
01:00:51to see that um I want to thank I want to
01:00:53thank some partners that that are
01:00:54joining us in this uh AWS is gearing up
01:00:57for Blackwell they're uh they're going
01:00:59to build the first uh GPU with secure AI
01:01:02they're uh building out a 222 exaflop
01:01:06system you know just now when we
01:01:08animated uh just now the digital twin if
01:01:10you saw the the all of those clusters
01:01:12are coming down by the way that is not
01:01:16just art that is a digital twin of what
01:01:18we're building that's how big it's going
01:01:20to be besides infrastructure we're doing
01:01:22a lot of things together with AWS we're
01:01:24Cuda accelerating SageMaker AI we're
01:01:27Cuda accelerating Bedrock AI uh Amazon
01:01:30robotics is working with us uh using
01:01:32Nvidia Omniverse and Isaac Sim AWS
01:01:35Health has Nvidia Health Integrated into
01:01:38it so AWS has has really leaned into
01:01:42accelerated Computing uh Google is
01:01:44gearing up for Blackwell gcp already has
01:01:47A100s H100s T4s L4s a whole fleet of
01:01:51Nvidia Cuda gpus and they recently
01:01:53announced the Gemma model that runs
01:01:55across all of it uh we're work working
01:01:58to optimize uh and accelerate every
01:02:01aspect of gcp we're accelerating
01:02:03Dataproc their
01:02:05data processing engine JAX XLA Vertex
01:02:08AI and MuJoCo for robotics so we're
01:02:11working with uh Google and gcp across a
01:02:14whole bunch of initiatives uh Oracle is
01:02:16gearing up for Blackwell Oracle is a
01:02:18great partner of ours for Nvidia dgx
01:02:20cloud and we're also working together to
01:02:22accelerate something that's really
01:02:24important to a lot of companies Oracle
01:02:27database Microsoft is accelerating and
01:02:30Microsoft is gearing up for Blackwell
01:02:32Microsoft and Nvidia have a wide-ranging
01:02:34partnership we're accelerating Cuda
01:02:36accelerating all kinds of services when
01:02:38you when you chat obviously and uh AI
01:02:41services that are in Microsoft Azure uh
01:02:43it's very very likely Nvidia is in the
01:02:45back uh doing the inference and the
01:02:46token generation uh they built
01:02:49the largest Nvidia infiniband
01:02:51supercomputer basically a digital twin
01:02:53of ours or a physical twin of ours uh
01:02:56we're bringing the Nvidia ecosystem to
01:02:58Azure Nvidia DGX cloud to Azure uh
01:03:01Nvidia Omniverse is now hosted in Azure
01:03:03Nvidia Healthcare is in Azure and all of
01:03:06it is deeply integrated and deeply
01:03:08connected with Microsoft fabric the
01:03:11whole industry is gearing up for
01:03:13Blackwell this is what I'm about to show
01:03:16you most of the
01:03:19scenes that you've seen so far of
01:03:21Blackwell are the full fidelity
01:03:25design of Blackwell everything in our
01:03:28company has a digital twin and in fact
01:03:31this digital twin idea is really
01:03:34spreading and it helps
01:03:36companies build very complicated things
01:03:39perfectly the first time and what could
01:03:43be better than creating a digital twin to build a
01:03:47computer that was built in a digital
01:03:49twin and so let me show you what Wistron is
01:03:54doing to meet the demand for NVIDIA
01:03:57accelerated Computing Wistron one of our
01:03:59leading manufacturing Partners is
01:04:01building digital twins of Nvidia dgx and
01:04:04hgx factories using custom software
01:04:07developed with Omniverse sdks and
apis for their newest factory Wistron
01:04:13started with a digital twin to virtually
01:04:15integrate their multi-CAD and process
01:04:17simulation data into a unified view
01:04:20testing and optimizing layouts in this
01:04:22physically accurate digital environment
01:04:24increased worker efficiency by
01:04:2751% during construction the Omniverse
01:04:30digital twin was used to verify that the
01:04:32physical build matched the digital plans
01:04:35identifying any discrepancies early has
01:04:37helped avoid costly change orders and
01:04:40the results have been impressive using a
01:04:42digital twin helped bring Wistron's factory
01:04:44online in half the time just 2 and 1/2
01:04:47months instead of five in operation the
01:04:50Omniverse digital twin helps Wistron
01:04:52rapidly Test new layouts to accommodate
01:04:54new processes or improve operations in
01:04:57the existing space and monitor real-time
01:05:00operations using live iot data from
01:05:02every machine on the production
01:05:04line which ultimately enabled Wistron to
01:05:07reduce end-to-end cycle times by 50% and
01:05:1240% with Nvidia Ai and Omniverse
01:05:15nvidia's Global ecosystem of partners
01:05:17are building a new era of accelerated AI
01:05:31that's the way it's going
01:05:34to be in the future we're going to
01:05:35manufacture everything digitally first
01:05:37and then we'll manufacture it physically
01:05:39people ask me how did it
01:05:41start what got you guys so
01:05:44excited what was it that you
01:05:47saw that caused you to put it all
01:05:52in on this incredible idea and it's
01:06:07second guys that was going to be such a
01:06:12moment that's what happens when you
01:06:19rehearse this as you know was first
01:06:26AlexNet you put a cat into this computer
01:06:31and it comes out and it says
01:06:35cat and we said oh my God this is going
01:06:42to change everything you take 1 million numbers
01:06:45you take one Million numbers across
01:06:49RGB these numbers make no sense to
01:06:52anybody you put it into this software
01:06:56and it compresses it dimensionally reduces
01:06:59it it reduces it from a million
01:07:01dimensions it turns
01:07:04it into three letters one vector one
01:07:11generalized you could have the cat be
01:07:17different cats and you could have it be the
01:07:19front of the cat and the back of the cat
01:07:22and you look at this thing you say
01:07:24unbelievable you mean any
01:07:30cat and it was able to recognize all
01:07:33these cats and we realized how it did it
01:07:37systematically structurally it's
01:07:41scalable how big can you make it well
01:07:44how big do you want to make it and so we
01:07:47imagine that this is a completely new
01:07:51software and now today as you know you
01:07:54could have you type in the word cat and
01:07:58what comes out is a
01:08:00cat it went the other way
01:08:07unbelievable how is it possible that's
01:08:10right how is it possible you took three
01:08:13letters and you generated a million
01:08:16pixels from it and it made
01:08:18sense well that's the miracle and here
01:08:21we are just literally 10 years later
01:08:26where we recognize text we recognize
01:08:28images we recognize videos and sounds
01:08:31and images not only do we recognize them
01:08:34we understand their meaning we
01:08:37understand the meaning of the text
01:08:38that's the reason why it can chat with
01:08:39you it can summarize for you it
01:08:42understands the text it didn't
01:08:44just recognize the English it
01:08:46understood the English it doesn't just
01:08:48recognize the pixels it understood the
01:08:51pixels and you can even
01:08:53condition it between two modalities you
01:08:55can have language condition image and
01:08:57generate all kinds of interesting things
01:09:00well if you can understand these things
01:09:02what else can you understand that you've
01:09:05digitized the reason why we started with
01:09:07text and you know images is because we
01:09:09digitized those but what else have we
01:09:11digitized well it turns out we digitized
01:09:13a lot of things proteins and genes and
01:09:18waves anything you can digitize so long
01:09:21as there's structure we can probably
01:09:23learn some patterns from it and if we
01:09:24can learn the patterns from it we can
01:09:26understand its meaning if we can
01:09:28understand its meaning we might be able
01:09:30to generate it as well and so therefore
01:09:32the generative AI Revolution is here
01:09:36well what else can we generate what else
01:09:37can we learn well one of the things that
01:09:39we would love to learn we would love to
01:09:42learn is we would love to learn climate
01:09:47we would love to learn extreme weather
01:09:49we would love to learn uh how we
01:09:54predict future weather at Regional
01:09:57scales at sufficiently high resolution
01:10:01such that we can keep people out of
01:10:02Harm's Way before harm comes extreme
01:10:05weather costs the world $150 billion
01:10:08surely more than that and it's not
01:10:10evenly distributed $150 billion is
01:10:13concentrated in some parts of the world
01:10:15and of course to some people of the
01:10:16world we need to adapt and we need to
01:10:19know what's coming and so we are
01:10:20creating Earth-2 a digital twin of the
01:10:23Earth for predicting weather and
01:10:26we've made an extraordinary invention
01:10:29called CorrDiff the ability to use generative
01:10:32AI to predict weather at extremely high
01:10:35resolution let's take a
01:10:38look as the earth's climate changes AI
01:10:41powered weather forecasting is allowing
01:10:43us to more accurately predict and track
01:10:45severe storms like super typhoon chanthu
01:10:48which caused widespread damage in Taiwan
01:10:50and the surrounding region in 2021
01:10:53current AI forecast models can
01:10:55accurately predict the track of storms
01:10:57but they are limited to 25 km resolution
01:11:00which can miss important details Nvidia
01:11:03CorrDiff is a revolutionary new generative
01:11:06AI model trained on high resolution
01:11:08radar assimilated WRF weather forecasts
01:11:10and ERA5 reanalysis data using CorrDiff
01:11:14extreme events like chanthu can be super
01:11:17resolved from 25 km to 2 km resolution
01:11:20with 1,000 times the speed and 3,000
01:11:22times the Energy Efficiency of
01:11:24conventional weather models by combining
01:11:27the speed and accuracy of nvidia's
01:11:29weather forecasting model forecast net
01:11:31and generative AI models like CorrDiff we
01:11:34can explore hundreds or even thousands
01:11:36of kilometer scale Regional weather
01:11:38forecasts to provide a clear picture of
01:11:40the best worst and most likely impacts
01:11:42of a storm this wealth of information
01:11:45can help minimize loss of life and
01:11:47property damage today CorrDiff is optimized
01:11:50for Taiwan but soon generative super
01:11:53sampling will be available as part of
01:11:54the Nvidia Earth-2 inference service
01:11:57for many regions across the
globe the weather company the trusted
01:12:12source of global weather predictions
01:12:14we are working together to accelerate
01:12:16their weather simulation first
principles-based simulation however
they're also going to integrate Earth-2
01:12:23CorrDiff so that they could help businesses
01:12:25and countries do Regional high
01:12:28resolution weather prediction and so if
01:12:31you have some weather prediction you'd
01:12:32like to do uh reach out to
01:12:34the weather company really exciting
01:12:36really exciting work Nvidia Healthcare
01:12:39something we started 15 years ago we're
01:12:41super super excited about this this is
01:12:43an area where we're very very proud
whether it's Medical Imaging or gene
01:12:47sequencing or computational
01:12:50chemistry it is very likely that Nvidia
01:12:53is the computation behind it
01:12:55we've done so much work in this
01:12:57area today we're announcing that we're
01:13:00going to do something really really cool
01:13:03imagine all of these AI models that are
01:13:10able to generate images and audio but instead of
01:13:12images and audio because it understood
01:13:15images and audio all the digitization
01:13:17that we've done for genes and proteins
01:13:20and amino acids that digitization
01:13:23capability is now passed through
01:13:26machine learning so that we understand
01:13:30Life the ability to understand the
01:13:32language of Life of course we saw the
01:13:34first evidence of
01:13:35it with alphafold this is really quite
01:13:38an extraordinary thing after Decades of
01:13:40painstaking work the world had only
01:13:44digitized and reconstructed using
01:13:47cryo-electron microscopy or x-ray
01:13:51crystallography um these different
01:13:53techniques painstakingly reconstructed the
01:13:56proteins 200,000 of them in just what is
01:13:59it less than a year or so AlphaFold has
01:14:04reconstructed 200 million proteins
01:14:06basically every protein every of every
01:14:09living thing that's ever been sequenced
01:14:11this is completely revolutionary well
01:14:14those models are incredibly hard to use
01:14:16um for incredibly hard for people to
01:14:18build and so what we're going to do is
01:14:20we're going to build them we're going to
01:14:21build them for uh the the researchers
01:14:24around the world and it won't be the
01:14:26only one there'll be many other models
01:14:27that we create and so let me show you
01:14:29what we're going to do with
01:14:34it virtual screening for new medicines
01:14:37is a computationally intractable problem
01:14:40existing techniques can only scan
01:14:42billions of compounds and require days
01:14:44on thousands of standard compute nodes
01:14:47to identify new drug
candidates Nvidia BioNeMo NIMs enable
01:14:52a new generative screening Paradigm
01:14:54using Nims for protein structure
01:14:56prediction with AlphaFold molecule
01:14:58generation with MolMIM and docking with
01:15:01DiffDock we can now generate and screen
01:15:04candidate molecules in a matter of
01:15:05minutes MolMIM can connect to custom
01:15:08applications to steer the generative
01:15:10process iteratively optimizing for
01:15:12desired properties these applications
01:15:15can be defined with BioNeMo
01:15:17microservices or built from scratch here
01:15:20a physics based simulation optimizes for
01:15:23a molecule's ability to bind to a Target
01:15:25protein while optimizing for other
01:15:27favorable molecular properties in
01:15:29parallel MolMIM generates high quality
01:15:32drug-like molecules that bind to the
01:15:34Target and are synthesizable translating
01:15:37to a higher probability of developing
01:15:39successful medicines
01:15:41faster BioNeMo is enabling a new
01:15:44paradigm in drug Discovery with Nims
01:15:46providing OnDemand microservices that
01:15:48can be combined to build powerful drug
01:15:51Discovery workflows like de novo protein
01:15:53design or guided molecule generation for
01:15:56virtual screening BioNeMo NIMs are helping
01:16:00researchers and developers reinvent
01:16:02computational drug
01:16:09design Nvidia MolMIM CorrDiff
01:16:13there's a whole bunch of other models
01:16:15computer
01:16:17vision models robotics models and even of
01:16:22course some really terrific open
01:16:25language models these models are
01:16:29groundbreaking however it's hard for
01:16:31companies to use how would you use it
01:16:33how would you bring it into your company
01:16:34and integrate it into your workflow how
01:16:36would you package it up and run it
01:16:38remember earlier I just
01:16:40said that inference is an extraordinary
01:16:43computation problem how would you do the
01:16:46optimization for each and every one of
01:16:48these models and put together the
01:16:50Computing stack necessary to run that
01:16:52supercomputer so that you can run the
01:16:55models in your company and so we have a
01:16:58great idea we're going to invent a new
01:17:00way invent a new way for you to receive
01:17:07software this software comes basically
01:17:11in a digital box we call it a container
and we call it the Nvidia inference
01:17:17microservice a NIM and let me explain to you
01:17:21what it is a Nim it's a pre-trained
01:17:24model so it's pretty
01:17:25clever and it is packaged and optimized
01:17:29to run across nvidia's install base
01:17:32which is very very large what's inside
01:17:34it is incredible you have all these
pre-trained state-of-the-art open source
01:17:39models they could be open source they
01:17:41could be from one of our partners it
01:17:43could be created by us like Nvidia mull
01:17:46it is packaged up with all of its
01:17:48dependencies so Cuda the right version
cuDNN the right version TensorRT-LLM
distributing across the multiple gpus
Triton inference server all completely
01:17:59packaged together it's optimized
01:18:02depending on whether you have a single
01:18:04GPU multi- GPU or multi node of gpus
01:18:06it's optimized for that and it's
01:18:08connected up with apis that are simple
01:18:10to use now this think about what an AI
01:18:13API is an AI API is an interface that
01:18:18you just talk to and so this is a piece
01:18:21of software in the future that has a
01:18:23really simple API and that API called
01:18:25human and these packages incredible
01:18:29bodies of software will be optimized and
01:18:32packaged and we'll put it on a
01:18:34website and you can download it you
01:18:37could take it with you you could run it
01:18:39in any Cloud you can run it in your own
01:18:41data center you can run in workstations
01:18:43if it fit and all you have to do is come
01:18:45to ai. nvidia.com we call it Nvidia
01:18:49inference microservice but inside the
01:18:51company we all call it
01:19:02just imagine you know one of some
01:19:04someday there there's going to be one of
01:19:06these chat Bots and these chat Bots is
01:19:08going to just be in a Nim and you you'll
01:19:12uh you'll assemble a whole bunch of chat
01:19:13Bots and that's the way software is
01:19:15going to be be built someday how do we
01:19:18build software in the future it is
01:19:20unlikely that you'll write it from
01:19:22scratch or write a whole bunch of python
01:19:23code or anything like that it is very
01:19:26likely that you assemble a team of AIS
01:19:29there's probably going to be a super AI
01:19:32that you use that takes the mission that
01:19:34you give it and breaks it down into an
01:19:37execution plan some of that execution
01:19:39plan could be handed off to another Nim
01:19:42that Nim would maybe uh understand
SAP the language of SAP is ABAP it might
understand ServiceNow and go
retrieve some information from their platform
01:19:55it might then hand that result to
01:19:56another Nim who that goes off and does
01:19:59some calculation on it maybe it's an
01:20:01optimization software a
01:20:03combinatorial optimization algorithm
01:20:06maybe it's uh you know some just some
01:20:09calculator maybe it's pandas to do some
01:20:13numerical analysis on it and then it
01:20:15comes back with its
01:20:17answer and it gets combined with
01:20:19everybody else's and it because it's
01:20:21been presented with this is what the
01:20:23right answer should look like it knows
what right answers
01:20:27to produce and it presents it to you we
01:20:30can get a report every single day at you
01:20:32know top of the hour uh that has
01:20:34something to do with a bill plan or some
01:20:36forecast or uh some customer alert or
01:20:38some bugs database or whatever it
01:20:40happens to be and we could assemble it
01:20:42using all these Nims and because these
01:20:44Nims have been packaged up and ready to
01:20:48work on your systems so long as you have
Nvidia gpus in your data center or in the
cloud these NIMs will work together
01:20:55as a team and do amazing things and so
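The hand-off pattern just described can be sketched in a few lines. Everything in this snippet, the service names and the canned plan, is invented for illustration; in the talk's vision the planner would itself be a large model:

```python
# Hypothetical "team of AIs": a planner breaks a mission into steps,
# and each step is dispatched to a named service (a NIM, in the talk's
# terms). All names and the plan below are made up for illustration.
def planner(mission):
    return [("retrieve", f"pull records relevant to: {mission}"),
            ("calculate", "run the forecast over the retrieved records"),
            ("report", "summarize the result for the morning report")]

nims = {
    "retrieve":  lambda task: f"[data: {task}]",
    "calculate": lambda task: f"[forecast: {task}]",
    "report":    lambda task: f"[summary: {task}]",
}

def run(mission):
    # Each NIM handles its own step; the results are combined at the end.
    return " | ".join(nims[step](task) for step, task in planner(mission))

print(run("prepare the daily ops report"))
```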
01:20:58we decided this is such a great idea
01:21:00we're going to go do that and so Nvidia
01:21:03has Nims running all over the company we
01:21:05have chatbots being created all over the
01:21:08place and one of the mo most important
01:21:09chatbots of course is a chip designer
01:21:12chatbot you might not be surprised we
01:21:14care a lot about building chips and so
01:21:17we want to build chatbots AI
01:21:21co-pilots that are co-designers with our
01:21:23engineers and so this is the way we did
it so we got ourselves a Llama 2
this is a 70B and it's you know packaged
up in a NIM and we asked it you know uh
01:21:37CTL Well turns out CTL is an internal uh
program and it has an internal
01:21:44proprietary language but it thought the
01:21:46CTL was a combinatorial timing logic and
01:21:48so it describes you know conventional
01:21:50knowledge of CTL but that's not very
01:21:52useful to us and so we gave it a whole
01:21:56bunch of new examples you know this is
01:21:58no different than employee onboarding an
01:22:01employee uh we say you know thanks for
01:22:03that answer it's completely wrong um and
01:22:06and uh and then we present to them uh
01:22:09this is what a CTL is okay and so this
01:22:11is what a CTL is at Nvidia and the CTL
01:22:15as you can see you know CTL stands for
01:22:17compute Trace Library which makes sense
01:22:20you know we were tracing compute Cycles
01:22:22all the time and it wrote the program
01:22:32amazing and so the productivity of our
01:22:34chip designers can go up this is what
01:22:35you can do with a Nim first thing you
01:22:37can do with is customize it we have a
01:22:39service called Nemo microservice that
01:22:41helps you curate the data preparing the
01:22:44data so that you could teach this on
01:22:46board this AI you fine-tune them and
01:22:49then you guardrail it you can even
01:22:51evaluate the answer evaluate its
01:22:53performance against um other other
01:22:55examples and so that's called the Nemo
microservice now the thing that's
01:23:00emerging here is this there are three
01:23:02elements three pillars of what we're
01:23:03doing the first pillar is of course
01:23:06inventing the technology for um uh AI
01:23:09models and running AI models and
01:23:11packaging it up for you the second is to
01:23:13create tools to help you modify it first
01:23:16is having the AI technology second is to
01:23:19help you modify it and third is
01:23:20infrastructure for you to fine-tune it
01:23:23and if you like deploy it you could
01:23:24deploy it on our infrastructure called
01:23:26dgx cloud or you can employ deploy it on
01:23:29Prem you can deploy it anywhere you like
01:23:31once you develop it it's yours to take
01:23:33anywhere and so we are
01:23:36effectively an AI Foundry we will do for
01:23:40you and the industry on AI what tsmc
01:23:43does for us building chips and so we go
01:23:45to TSMC with our big
01:23:48ideas they manufacture and we take it
01:23:50with us and so exactly the same thing
01:23:52here AI Foundry and the three pillars
01:23:54are the NIMs NeMo microservice and dgx
01:23:58Cloud the other thing that you could
01:24:00teach the Nim to do is to understand
01:24:02your proprietary information remember
01:24:05inside our company the vast majority of
01:24:07our data is not in the cloud it's inside
01:24:09our company it's been sitting there you
01:24:11know being used all the time and and
01:24:14gosh it's basically Nvidia's
01:24:17intelligence we would like to take that
01:24:20data learn its meaning like we learned
01:24:23the meaning of almost anything else that
01:24:24we just talked about learn its meaning
01:24:27and then reindex that knowledge into a
01:24:30new type of database called a vector
01:24:32database and so you essentially take
01:24:35structured data or unstructured data you
01:24:37learn its meaning you encode its meaning
01:24:39so now this becomes an AI database and
01:24:43that AI database in the future once you
01:24:45create it you can talk to it and so let
01:24:47me give you an example of what you could
01:24:49do so suppose you've got
01:24:51a whole bunch of multi modality data and
01:24:53one good example of that is PDF so you
01:24:56take the PDF you take all of your PDFs
01:24:59all your favorite you know the
01:25:01stuff that is proprietary to you
01:25:03critical to your company you can encode
01:25:05it just as we encoded pixels of a cat
01:25:09and it becomes the word cat we can
01:25:11encode all of your PDF and it turns
01:25:14into vectors that are now stored inside
01:25:16your vector database it becomes the
01:25:18proprietary information of your company
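The encode-and-index flow described here (documents become vectors, vectors become a searchable database) can be sketched with a toy bag-of-words embedding and cosine similarity. All names and documents below are invented; production vector databases use learned neural embeddings that capture meaning rather than word counts:

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words vector. Real vector databases
    use learned neural embeddings, not word counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ToyVectorDB:
    def __init__(self):
        self.index = []                      # list of (vector, original text)

    def add(self, doc):
        self.index.append((embed(doc), doc))

    def query(self, question, k=1):
        # Retrieve: rank stored documents by similarity to the question.
        ranked = sorted(self.index,
                        key=lambda e: cosine(embed(question), e[0]),
                        reverse=True)
        return [doc for _, doc in ranked[:k]]

db = ToyVectorDB()
db.add("Bug 1432: kernel crash when tracing compute cycles on Blackwell")
db.add("Q3 marketing plan for the robotics developer conference")
print(db.query("how many kernel crash bugs do we have?"))
```

The query step here is the same retrieve-by-meaning operation the talk later attributes to NeMo Retriever: encode the question, find the nearest stored vectors, return the original text.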
01:25:20and once you have that proprietary
01:25:21information you can chat to it it's an
01:25:24it's a smart database and so you just
01:25:27chat with data and how much more
01:25:29enjoyable is that you know we for for
01:25:33our software team you know they just
01:25:35chat with the bugs database you know how
01:25:38many bugs were there last night are we
01:25:40making any progress and then after
01:25:42you're done talking to this bugs
01:25:45database you need therapy and so so we
01:25:49have another chatbot for
01:26:05it okay so we call this Nemo Retriever
01:26:08and the reason for that is because
01:26:09ultimately its job is to go retrieve
01:26:11information as quickly as possible and
01:26:13you just talk to it hey retrieve me this
01:26:15information it goes and brings it back to
01:26:18you and do you mean this you go yeah
01:26:20perfect okay and so we call it the Nemo
01:26:22retriever well the Nemo service helps
01:26:24you create all these things and we have
01:26:26all these different NIMs we even
01:26:27have Nims of digital humans I'm Rachel
01:26:33manager okay so it's a really short
01:26:36clip but there were so many videos to
01:26:39show you so many other demos to
01:26:41show you and so I had to cut this one
01:26:43short but this is Diana she is a digital
01:26:46human NIM and you just talk to
01:26:50her and she's connected in this case to
01:26:52Hippocratic ai's large language model
01:26:54for healthcare and it's truly
01:26:58amazing she is just super smart about
01:27:01Healthcare things you know and so after
01:27:04you're done after Dwight my VP of
01:27:07software engineering talks to the
01:27:08chatbot for bugs database then you come
01:27:11over here and talk to Diana and so
01:27:13Diana is completely animated
01:27:17with AI and she's a digital
01:27:19human uh there's so many companies that
01:27:21would like to build they're sitting on
01:27:25the Enterprise IT industry is
01:27:27sitting on a gold mine it's a gold mine
01:27:29because they have so much understanding
01:27:31of the way work is done they have
01:27:34all these amazing tools that have been
01:27:36created over the years and they're
01:27:37sitting on a lot of data if they could
01:27:40take that gold mine and turn it into
01:27:43co-pilots these co-pilots could help us
01:27:45do things and so just about every IT
01:27:49franchise IT platform in the world that
01:27:51has valuable tools that people use is
01:27:53sitting on a gold mine for co-pilots and
01:27:56they would like to build their own
01:27:57co-pilots and their own chatbots and so
01:28:00we're announcing that Nvidia AI Foundry
01:28:02is working with some of the world's
01:28:03great companies sap generates 87% of the
01:28:06world's Global Commerce basically the
01:28:09world runs on sap we run on sap Nvidia
01:28:11and sap are building SAP Joule co-pilots
01:28:15uh using Nvidia Nemo and dgx cloud
01:28:18ServiceNow 85% of the
01:28:20world's Fortune 500 companies run their
01:28:23people and customer service operations
01:28:25on ServiceNow and they're using Nvidia
01:28:28AI Foundry to build ServiceNow
01:28:33assistants cohesity backs up the world's
01:28:36data they're sitting on a gold mine of
01:28:38data hundreds of exabytes of data over
01:28:4110,000 companies Nvidia AI Foundry is
01:28:44working with them helping them build
01:28:46their Gaia generative AI agent snowflake
01:28:50is a company that stores the world's
01:28:53digital Warehouse in the cloud and
01:28:55serves over 3 billion queries a day for
01:29:0110,000 Enterprise customers snowflake is
01:29:03working with Nvidia AI Foundry to build
01:29:06co-pilots with Nvidia Nemo and NIMs
01:29:09NetApp nearly half of the files in the
01:29:12world are stored on prem on NetApp
01:29:16Nvidia AI Foundry is helping them uh
01:29:18build chat Bots and co-pilots like those
01:29:21Vector databases and retrievers with
01:29:25Nims and we have a great partnership
01:29:27with Dell everybody who is
01:29:30building these chat Bots and generative
01:29:33AI when you're ready to run it you're
01:29:35going to need an AI
01:29:37Factory and nobody is better at Building
01:29:41end-to-end Systems of very large scale
01:29:43for the Enterprise than Dell and so
01:29:46anybody any company every company will
01:29:48need to build AI factories and it turns
01:29:51out that Michael is here he's happy to
01:29:58ladies and gentlemen Michael
01:30:04Dell okay let's talk about the next wave
01:30:07of Robotics the next wave of AI robotics
01:30:11AI so far all of the AI that we've
01:30:14talked about is one
01:30:16computer data comes into one computer
01:30:18lots of the world's experience if you
01:30:21will in digital text form the AI
01:30:25imitates Us by reading a lot of the
01:30:28language to predict the next words it's
01:30:30imitating You by studying all of the
01:30:32patterns and all the other previous
01:30:34examples of course it has to understand
01:30:36context and so on so forth but once it
01:30:38understands the context it's essentially
01:30:39imitating you we take all of the data we
01:30:42put it into a system like dgx we
01:30:45compress it into a large language model
01:30:47trillions and trillions of
01:30:49tokens become
01:30:51billions of parameters and
01:30:53these billions of parameters become
01:30:54your AI well in order for us to go to
01:30:58the next wave of AI where the AI
01:31:00understands the physical world we're
01:31:02going to need three
01:31:03computers the first computer is still
01:31:06the same computer it's that AI computer
01:31:08that now is going to be watching video
01:31:10and maybe it's doing synthetic data
01:31:12generation and maybe there's a lot of
01:31:14human examples just as we have human
01:31:17examples in text form we're going to
01:31:18have human examples in articulation form
01:31:22and the AIs will watch us
01:31:25understand what is
01:31:26happening and try to adapt it for
01:31:29themselves into the
01:31:31context and because it can generalize
01:31:33with these Foundation models maybe these
01:31:36robots can also perform in the physical
01:31:38world fairly generally so I just
01:31:41described in very simple terms
01:31:44essentially what just happened in large
01:31:45language models except the chat GPT
01:31:47moment for robotics may be right around
01:31:49the corner and so we've been building
01:31:52the end to-end systems for robotics for
01:31:54some time I'm super super proud of the
01:31:56work we have the AI system
01:31:59dgx we have the lower system which is
01:32:01called agx for autonomous systems the
01:32:04world's first robotics processor when we
01:32:06first built this thing people asked what
01:32:07are you guys building it's an SoC it's
01:32:10one chip it's designed to be very low
01:32:12power but it's designed for high-speed
01:32:13sensor processing and Ai and so if you
01:32:17want to run Transformers in a car or you
01:32:20want to run Transformers in anything
01:32:24that moves we have the perfect
01:32:26computer for you it's called the Jetson
01:32:29and so the dgx on top for training the
01:32:31AI the Jetson is the autonomous
01:32:33processor and in the middle we need
01:32:35another computer whereas large language
01:32:40models have the benefit of you providing your examples
01:32:43and then doing reinforcement learning
01:32:47feedback what is the reinforcement
01:32:49learning human feedback of a robot well
01:32:52it's reinforcement learning
01:32:54physical feedback that's how you align
01:32:56the robot that's how the
01:32:59robot knows that as it's learning these
01:33:01articulation capabilities and
01:33:02manipulation capabilities it's going to
01:33:04adapt properly into the laws of physics
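The "reinforcement learning physical feedback" idea, where the simulator scores each attempted motion and the policy keeps what works, can be sketched as a toy hill-climb over a single control parameter. Everything here (the reward shape, the parameter, the target value 4.2) is invented for illustration; a real gym trains neural policies in massively parallel physics simulation:

```python
import random

def simulate_hop(push_strength):
    """Toy physics 'gym': reward peaks when the robot pushes hard enough
    to hop but not so hard that it tips over. Stands in for a full
    physics simulator scoring one articulation attempt."""
    return -(push_strength - 4.2) ** 2      # best possible reward at 4.2

def train(episodes=200, seed=0):
    """Keep whichever perturbed motion the simulator rewards most:
    the 'physical feedback' loop in miniature."""
    rng = random.Random(seed)
    best_param, best_reward = 0.0, simulate_hop(0.0)
    for _ in range(episodes):
        candidate = best_param + rng.uniform(-1, 1)   # try a perturbed motion
        reward = simulate_hop(candidate)              # physical feedback
        if reward > best_reward:                      # keep what obeys physics
            best_param, best_reward = candidate, reward
    return best_param

learned = train()
print(f"learned push strength ~= {learned:.2f}")
```

The essential shape carries over: propose a motion, let simulated physics grade it, keep the improvement, repeat.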
01:33:08and so we need a simulation
01:33:11engine that represents the world
01:33:13digitally for the robot so that the
01:33:15robot has a gym to go learn how to be a
01:33:19robot we call that virtual world Omniverse and the
01:33:23computer that runs Omniverse is called
01:33:25ovx and ovx the computer itself is
01:33:29hosted in the Azure Cloud okay and so
01:33:32basically we built these three things
01:33:34these three systems on top of it we have
01:33:36algorithms for every single one now I'm
01:33:39going to show you one super example of
01:33:42how Ai and Omniverse are going to work
01:33:45together the example I'm going to show
01:33:46you is kind of insane but it's going to
01:33:49be very very close to tomorrow it's a
01:33:51robotics building this robotics building
01:33:54is called a warehouse inside the
01:33:56robotics building are going to be some
01:33:58autonomous systems some of the
01:34:00autonomous systems are going to be
01:34:01called humans and some of the autonomous
01:34:04systems are going to be called forklifts
01:34:06and these autonomous systems are going
01:34:08to interact with each other of course
01:34:10autonomously and it's going to be
01:34:12watched over by this warehouse to
01:34:14keep everybody out of Harm's Way the
01:34:16warehouse is essentially an air traffic
01:34:18controller and whenever it sees
01:34:21something happening it will redirect
01:34:23traffic and give new
01:34:26waypoints to the robots and
01:34:28the people and they'll know exactly what
01:34:29to do this warehouse this building you
01:34:33can also talk to of course you could
01:34:35talk to it hey you know sap Center how
01:34:38are you feeling today for example and so
01:34:41you could ask the same the warehouse the
01:34:43same questions basically the system I
01:34:46just described will have Omniverse Cloud
01:34:49that's hosting the virtual simulation
01:34:52and AI running on dgx cloud and all of
01:34:56this is running in real time let's take a
01:34:59look the future of heavy industry starts
01:35:02as a digital twin the AI agents helping
01:35:05robots workers and infrastructure
01:35:07navigate unpredictable events in complex
01:35:10industrial spaces will be built and
01:35:12evaluated first in sophisticated digital
01:35:15twins this Omniverse digital twin of a
01:35:18100,000 square foot Warehouse is operating as a
01:35:22simulation environment that integrates
01:35:24digital workers AMRs running the Nvidia
01:35:27Isaac Perceptor stack centralized
01:35:29activity maps of the entire Warehouse
01:35:31from 100 simulated ceiling-mounted cameras
01:35:34using Nvidia Metropolis and AMR route
01:35:37planning with Nvidia cuOpt
01:35:40software-in-loop testing of AI agents in this
01:35:42physically accurate simulated
01:35:44environment enables us to evaluate and
01:35:47refine how the system adapts to real
01:35:51unpredictability here an incident occurs
01:35:53along this AMR's planned route blocking
01:35:56its path as it moves to pick up a pallet
01:35:59Nvidia Metropolis updates and sends a
01:36:01realtime occupancy map to cuOpt where a
01:36:03new optimal route is calculated the AMR
01:36:06is enabled to see around corners and
01:36:08improve its Mission efficiency with
01:36:11generative AI powered Metropolis Vision
01:36:13Foundation models operators can even ask
01:36:16questions using natural language the
01:36:18visual model understands nuanced
01:36:21activity and can offer immediate
01:36:22insights to improve operations all of
01:36:25the sensor data is created in simulation
01:36:27and passed to the real-time AI running
01:36:30as Nvidia inference microservices or
01:36:32Nims and when the AI is ready to be
01:36:35deployed in the physical twin the real
01:36:37Warehouse we connect metropolis and
01:36:39Isaac Nims to real sensors with the
01:36:42ability for continuous Improvement of
01:36:44both the digital twin and the AI
01:36:49models isn't that amazing
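The loop narrated in the video, where an incident updates the occupancy map and a fresh route is computed, can be sketched with a grid and breadth-first search. This is only the idea: cuOpt solves far larger optimization problems, and the 3x3 warehouse below is invented:

```python
from collections import deque

def plan_route(grid, start, goal):
    """Breadth-first search over an occupancy grid: 0 = free, 1 = blocked.
    Returns a shortest list of cells from start to goal, or None."""
    rows, cols = len(grid), len(grid[0])
    queue, came_from = deque([start]), {start: None}
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:           # walk parents back to start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in came_from):
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

warehouse = [[0, 0, 0],
             [0, 0, 0],
             [0, 0, 0]]
route = plan_route(warehouse, (0, 0), (2, 2))    # initial plan

warehouse[1][1] = 1                              # incident: a pallet blocks the aisle
warehouse[0][1] = 1
reroute = plan_route(warehouse, (0, 0), (2, 2))  # occupancy update -> new route
print(route, reroute)
```

The reroute avoids the newly blocked cells, which is exactly the occupancy-map-to-new-waypoints handoff the video describes, just at toy scale.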
01:36:55so remember a future facility
01:37:00Warehouse Factory building will be
01:37:03software defined and so the software is
01:37:05running how else would you test the
01:37:07software so you test the
01:37:10software that's building the warehouse the
01:37:12optimization system in the digital twin
01:37:14what about all the robots all of those
01:37:15robots you are seeing just now they're
01:37:17all running their own autonomous robotic
01:37:19stack and so the way you integrate
01:37:21software in the future CI/CD in the
01:37:23future for robotic systems is with
01:37:26digital twins we've made Omniverse a lot
01:37:29easier to access we're going to create
01:37:31basically Omniverse Cloud APIs four
01:37:34simple APIs and a channel and you can
01:37:37connect your application to it so this
01:37:38is going to be as wonderfully
01:37:41beautifully simple in the future that
01:37:44Omniverse is going to be and with these
01:37:46apis you're going to have these magical
01:37:48digital twin capabilities we also have
01:37:52turned Omniverse into an AI and integrated
01:37:56it with the ability to chat USD
01:37:59our language is you know
01:38:01human and Omniverse's language as it
01:38:04turns out is Universal Scene Description
01:38:06and so that language is rather complex
01:38:09and so we've taught our Omniverse
01:38:12that language and so you can speak to it
01:38:14in English and it would directly
01:38:15generate USD and it would talk back in
01:38:18USD but converse back to you in English
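The English-in, USD-out round trip can be illustrated by generating a minimal USD ASCII (`.usda`) scene from a plain-English request. The emitted syntax is genuine minimal USD, but the keyword matching below is a crude stand-in for what the language model actually does:

```python
def english_to_usda(request):
    """Tiny stand-in for an English -> USD generator: it recognizes
    'add a <shape> named <name>' and emits a minimal .usda scene.
    A real system would use a language model, not keyword matching."""
    words = request.lower().split()
    shape = "Sphere" if "sphere" in words else "Cube"
    name = words[words.index("named") + 1] if "named" in words else "prim1"
    return (
        '#usda 1.0\n'
        'def Xform "World"\n'
        '{\n'
        f'    def {shape} "{name}"\n'
        '    {\n'
        '    }\n'
        '}\n'
    )

print(english_to_usda("add a sphere named ball to the scene"))
```

The interesting part is the asymmetry the talk points out: the human side of the conversation stays in English, while the scene side stays in USD, with translation in between.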
01:38:20you could also look for information in
01:38:22this world semantically instead of the
01:38:25world being encoded semantically in
01:38:27language now it's encoded semantically
01:38:29in scenes and so you could ask it of
01:38:32certain objects or certain conditions
01:38:34and certain scenarios and it can go and
01:38:36find that scenario for you it also can
01:38:39collaborate with you in generation you
01:38:41could design some things in 3D it could
01:38:43simulate some things in 3D or you could
01:38:45use AI to generate something in 3D let's
01:38:47take a look at how this is all going to
01:38:49work we have a great partnership with
01:38:51Siemens Siemens is the world's largest
01:38:54industrial engineering and operations
01:38:56platform you've seen now so many
01:38:59different companies in the industrial
01:39:01space heavy Industries is one of the
01:39:03greatest final frontiers of IT and we
01:39:06finally now have the necessary
01:39:08technology to go and make a real impact
01:39:11Siemens is building the industrial
01:39:13metaverse and today we're announcing
01:39:14that Siemens is connecting their crown
01:39:17Jewel accelerator to Nvidia Omniverse
01:39:22look Siemens technology is transforming
01:39:25every day for everyone Teamcenter X
01:39:28our leading product life cycle
01:39:29management software from the Siemens
01:39:31Xcelerator platform is used every day
01:39:34by our customers to develop and deliver
01:39:36products at scale now we are bringing
01:39:39the real and the digital worlds even
01:39:41Closer by integrating Nvidia Ai and
01:39:44Omniverse technologies into Teamcenter
01:39:47X Omniverse APIs enable data
01:39:50interoperability and physics-based
01:39:52rendering to Industrial scale design and
01:39:55Manufacturing projects our customer HD
01:39:59Hyundai market leader in sustainable ship
01:40:00manufacturing builds ammonia and
01:40:03hydrogen powered ships often comprising
01:40:05over 7 million discrete Parts with
01:40:08Omniverse APIs Teamcenter X lets
01:40:11companies like HD yundai unify and
01:40:14visualize these massive engineering data
01:40:17sets interactively and integrate
01:40:19generative AI to generate 3D objects or
01:40:22HDRI backgrounds to see their projects
01:40:26in context the result an ultra-intuitive
01:40:29photoreal physics-based digital twin that
01:40:32eliminates waste and errors delivering
01:40:35huge savings in cost and
01:40:37time and we are building this for
01:40:39collaboration whether across more Siemens
01:40:41Xcelerator tools like Siemens NX or
01:40:45STAR-CCM+ or across teams working on
01:40:49their favorite devices in the same scene
01:40:51together this is just the beginning
01:40:54working with Nvidia we will bring
01:40:57accelerated Computing generative Ai and
01:40:59Omniverse integration across the Siemens
01:41:11portfolio the professional
01:41:15voice actor happens to
01:41:17be a good friend of mine Roland Busch who
01:41:20happens to be the CEO of Siemens
01:41:29once you get Omniverse connected into
01:41:34your workflow your
01:41:36ecosystem from the beginning of your
01:41:40engineering to manufacturing planning
01:41:43all the way to digital twin
01:41:45operations once you connect everything
01:41:48together it's insane how much
01:41:50productivity you can get and it's just
01:41:52really really wonderful all of a sudden
01:41:54everybody is operating on the same
01:41:56truth you don't have to exchange data
01:41:59and convert data and make mistakes everybody
01:42:01is working on the same ground truth from
01:42:04the design Department to the art
01:42:06Department the architecture Department
01:42:07all the way to the engineering and even
01:42:09the marketing department let's take a
01:42:11look at how Nissan has integrated
01:42:14Omniverse into their workflow and it's
01:42:17all because it's connected by all these
01:42:19wonderful tools and these developers
01:42:21that we're working with take a look
01:44:01that was not an animation that was
01:44:05Omniverse today we're announcing that Omniverse
01:44:09Cloud streams to the Vision Pro
01:44:19and it is very very strange
01:44:24that you walk around virtual doors when
01:44:27I was getting out of that
01:44:29car and everybody does it it is really
01:44:33really quite amazing Vision Pro
01:44:35connected to Omniverse portals you into
01:44:38Omniverse and because all of these CAD
01:44:41tools and all these different design
01:44:42tools are now integrated and connected
01:44:44to Omniverse you can have this type of
01:44:46workflow really incredible let's talk
01:44:48about robotics everything that moves
01:44:51will be robotic there's no question
01:44:52about that it's safer it's more
01:44:56convenient and one of the largest
01:44:57Industries is going to be Automotive we
01:45:00build the robotic stack from top to
01:45:02bottom as I was mentioned from the
01:45:04computer system but in the case of
01:45:05self-driving cars including the
01:45:07self-driving application at the end of
01:45:10this year or I guess beginning of next
01:45:12year we will be shipping in Mercedes and
01:45:14then shortly after that jlr and so these
01:45:17autonomous robotic systems are software
01:45:20defined they take a lot of work they
01:45:22have computer vision obviously
01:45:24artificial intelligence control and
01:45:26planning all kinds of very complicated
01:45:29technology and takes years to refine
01:45:31we're building the entire stack however
01:45:34we open up our entire stack for all of
01:45:36the automotive industry this is just the
01:45:37way we work the way we work in every
01:45:39single industry we try to build as much
01:45:41of it as we can so that we understand it
01:45:43but then we open it up so everybody can
01:45:45access it whether you would like to buy
01:45:47just our computer which is the world's
01:45:49only fully functional safe ASIL-D system that can run
01:45:56AI this functionally safe ASIL-D quality
01:46:00computer or the operating system on top
01:46:03or of course our data centers which is
01:46:07in basically every AV company in the
01:46:09world however you would like to enjoy it
01:46:11we're delighted by it today we're
01:46:13announcing that byd the world's largest
01:46:16ev company is adopting our next
01:46:19Generation it's called Thor Thor is
01:46:21designed for Transformer engines Thor
01:46:24our next Generation AV computer will be in
01:46:36BYD you probably don't know this fact
01:46:38that we have over a million robotics
01:46:40developers we created Jetson this
01:46:43robotics computer we're so proud of it
01:46:45the amount of software that goes on top
01:46:47of it is insane but the reason why we
01:46:49can do it at all is because it's 100%
01:46:50Cuda compatible everything that we do
01:46:53everything that we do in our company is
01:46:55in service of our developers and by us
01:46:58being able to maintain this Rich
01:47:00ecosystem and make it compatible with
01:47:02everything that you access from us we
01:47:05can bring all of that incredible
01:47:06capability to this little tiny computer
01:47:09we call Jetson a robotics computer we're
01:47:13announcing this incredibly advanced new
01:47:16SDK we call it Isaac
01:47:19Perceptor Isaac Perceptor most of
01:47:22the Bots today are pre-programmed
01:47:26they're either following rails on the
01:47:27ground digital rails or they'd be
01:47:29following AprilTags but in the future
01:47:31they're going to have perception and the
01:47:33reason why you want that is so that you
01:47:34could easily program it you say would
01:47:37you like to go from point A to point B
01:47:39and it will figure out a way to navigate
01:47:41its way there so by only programming
01:47:44waypoints the entire route could be
01:47:47adaptive the entire environment could be
01:47:49reprogrammed just as I showed you at the
01:47:51very beginning with the warehouse you
01:47:53can't do that with pre-programmed AGVs
01:47:57if those boxes fall down they just all
01:47:59gum up and they just wait there for
01:48:01somebody to come clear it and so now with
01:48:05Perceptor we have incredible
01:48:07state-of-the-art visual odometry 3D
01:48:11reconstruction and in addition to 3D
01:48:13reconstruction depth perception the
01:48:15reason for that is so that you can have
01:48:16two modalities to keep an eye on what's
01:48:19happening in the world Isaac Perceptor
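The point-A-to-point-B programming model described here, where you give waypoints and the robot works out each leg itself, can be sketched as follows. The poses and waypoints are invented, and a real AMR continuously refines these legs from live perception rather than computing them once:

```python
import math

def follow_waypoints(start, waypoints):
    """Given a start position and goal waypoints, compute the heading
    (degrees) and distance for each leg -- the commands a perception-driven
    AMR would refine on the fly, instead of following fixed rails."""
    legs, (x, y) = [], start
    for wx, wy in waypoints:
        heading = math.degrees(math.atan2(wy - y, wx - x))
        distance = math.hypot(wx - x, wy - y)
        legs.append((round(heading, 1), round(distance, 2)))
        x, y = wx, wy                      # next leg starts where this one ends
    return legs

print(follow_waypoints((0, 0), [(3, 0), (3, 4)]))
```

Because only the waypoints are programmed, changing the environment just means handing the same robot a different waypoint list, which is the adaptivity the talk contrasts with rail-following AGVs.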
01:48:22the most used robot today is the
01:48:26manipulator manufacturing arms and they
01:48:29are also pre-programmed the computer
01:48:31vision algorithms the AI algorithms the
01:48:34control and path planning algorithms
01:48:36that are geometry aware incredibly
01:48:38computationally intensive we have made
01:48:41these Cuda accelerated so we have the
01:48:44world's first Cuda accelerated motion
01:48:46planner that is geometry aware you put
01:48:50something in front of it it comes up
01:48:51with a new plan and it articulates
01:48:53around it it has excellent perception
01:48:56for pose estimation of a 3D object not
01:49:00just its pose in 2D but its pose
01:49:02in 3D so it has to imagine what's around
01:49:05and how best to grab it so the
01:49:08foundation pose the grasp foundation and
01:49:12the articulation algorithms are now
01:49:15available we call it Isaac Manipulator
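"Geometry aware" planning boils down to collision tests like the segment-versus-circle check below: is the straight-line motion blocked, and does a candidate via point clear the obstacle? A real motion planner runs such tests many times while searching a high-dimensional joint space; the 2-D scene here is invented for illustration:

```python
import math

def segment_hits_circle(p, q, center, radius):
    """True if the straight-line motion from p to q passes through a
    circular obstacle -- the basic geometric test a geometry-aware
    planner evaluates repeatedly while searching for a free path."""
    (px, py), (qx, qy), (cx, cy) = p, q, center
    dx, dy = qx - px, qy - py
    # Closest point on the segment to the circle center, clamped to [0, 1].
    t = ((cx - px) * dx + (cy - py) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    nearest = (px + t * dx, py + t * dy)
    return math.hypot(nearest[0] - cx, nearest[1] - cy) <= radius

start, goal, obstacle = (0, 0), (10, 0), ((5, 0), 1.0)
direct_blocked = segment_hits_circle(start, goal, *obstacle)

# Something was placed in front of the arm: re-plan through a via point.
via = (5, 3)
replanned_ok = (not segment_hits_circle(start, via, *obstacle)
                and not segment_hits_circle(via, goal, *obstacle))
print(direct_blocked, replanned_ok)
```

Scaling this check from a 2-D circle to full 3-D geometry of the object and arm is what makes the planner "geometry aware" in the sense used above.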
01:49:17and they also just run on Nvidia's
01:49:21computers we're starting to do some
01:49:25really great work in the next generation
01:49:27of Robotics the next generation of
01:49:29Robotics will likely be a humanoid
01:49:32robotics we now have the necessary
01:49:35technology and as I was describing
01:49:38earlier the necessary technology to
01:49:40imagine generalized humanoid robotics in a
01:49:44way humanoid robotics is likely easier
01:49:46the reason for that is because we have a
01:49:48lot more imitation training data that we
01:49:51can provide the robots because we are
01:49:54constructed in a very similar way it is
01:49:56very likely that humanoid robots will
01:49:58be much more useful in our world because
01:50:00we created the world to be something
01:50:02that we can interoperate in and work
01:50:04well in and the way that we set up our
01:50:07workstations and Manufacturing and
01:50:08Logistics they were designed for for
01:50:10humans they were designed for people and
01:50:12so these humanoid robots will likely be
01:50:15much more productive to
01:50:17deploy while we're creating just like
01:50:20we're doing with the others the entire
01:50:22stack starting from the top a foundation
01:50:25model that learns from watching video
01:50:28human examples it could be in
01:50:32video form it could be in virtual
01:50:34reality form we then created a gym for
01:50:37it called Isaac reinforcement learning
01:50:40gym which allows the humanoid robot to
01:50:43learn how to adapt to the physical world
01:50:46and then an incredible computer the same
01:50:49computer that's going to go into a
01:50:50robotic car this computer will run
01:50:53inside a humanoid robot called Thor it's
01:50:55designed for Transformer engines we've
01:50:58combined several of these into one video
01:51:01this is something that you're going to
01:51:03really love take a
01:51:07look it's not enough for humans to
01:51:15imagine we have to
01:51:19invent and explore and push beyond
01:51:24what's been done ...
01:51:37faster we push it to
01:51:44learn we teach it then help it teach
01:51:48itself we broaden its understanding
01:51:58challenges with absolute ...
01:52:06succeed we make it
01:52:17reason so it can share our world with us
01:52:41this is where inspiration leads us the
01:52:46Frontier this is Nvidia Project GR00T
01:52:54a general purpose Foundation model for humanoid robot
01:52:58learning the GR00T model takes
01:53:00multimodal instructions and past
01:53:03interactions as input and produces the
01:53:05next action for the robot to
01:53:09execute we developed Isaac Lab a robot
01:53:12learning application to train GR00T on Isaac
01:53:16Sim and we scale out with OSMO a new
01:53:19compute orchestration service that
01:53:21coordinates work flows across dgx
01:53:23systems for training and ovx systems for
01:53:28simulation with these tools we can train
01:53:30GR00T in physically based simulation and
01:53:33transfer zero shot to the real
01:53:36world the GR00T model will enable a
01:53:39robot to learn from a handful of human
01:53:41demonstrations so it can help with everyday
01:53:46tasks and emulate human movement just by
01:53:51observing us this is made possible with nvidia's
01:53:54technologies that can understand humans
01:53:56from videos train models and simulation
01:53:59and ultimately deploy them directly to
01:54:01physical robots connecting GR00T to a
01:54:04large language model even allows it to
01:54:06generate motions by following natural
01:54:09language instructions hi go1 can you
01:54:12give me a high five sure thing let's
01:54:16high five can you give us some cool moves
01:54:25all this incredible intelligence is
01:54:26powered by the new Jetson Thor robotics
01:54:29chip designed for GR00T built for the
01:54:32future with Isaac Lab OSMO and GR00T
01:54:35we're providing the building blocks for
01:54:37the next generation of AI powered
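The interface narrated for GR00T, multimodal instructions plus past interactions in and the next action out, can be sketched as a policy stub. The hand-written rule below stands in for the learned model, and all action names are invented:

```python
from dataclasses import dataclass, field

@dataclass
class Observation:
    instruction: str            # natural-language command
    nearby_objects: list        # what the robot currently perceives

@dataclass
class Policy:
    """Stub of a GR00T-style policy: instructions and interaction history
    in, the next discrete action out. A hand-written rule stands in for
    the learned transformer."""
    history: list = field(default_factory=list)

    def next_action(self, obs):
        if "high five" in obs.instruction.lower():
            action = "raise_hand"
        elif obs.nearby_objects:
            action = f"reach_for:{obs.nearby_objects[0]}"
        else:
            action = "idle"
        # Past interactions are kept so they can condition the next step.
        self.history.append((obs.instruction, action))
        return action

policy = Policy()
print(policy.next_action(Observation("Can you give me a high five?", ["cup"])))
print(policy.next_action(Observation("Tidy the table", ["cup", "plate"])))
```

Only the input-output contract is meant seriously here: observation plus history goes in, one action comes out, and the loop repeats after the robot executes it.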
01:55:06Nvidia the intersection of computer
01:55:08Graphics physics artificial intelligence
01:55:12it all came to bear at this moment the
01:55:15name of that project general robotics
01:55:27good well I think we have some special
01:55:45guests so I understand you guys are
01:55:49powered by Jetson they're powered by Jetson
01:55:53little Jetson robotics computers inside
01:55:56they learn to walk in Isaac
01:56:02Sim ladies and gentlemen this this is
01:56:05orange and this is the famous green they
01:56:09are the bdx robots of
01:56:13Disney amazing Disney
01:56:18research come on you guys let's wrap up
01:56:23five things where are you
01:56:27going I sit right
01:56:33here don't be afraid come here green
01:56:42what are you saying no it's not time to
01:56:46eat it's not time
01:56:50to eat I'll give you a snack in a
01:56:53moment let me finish up real
01:56:55quick come on green hurry up stop
01:57:01time five things five things first a new
01:57:06Industrial Revolution every data center
01:57:08should be accelerated a trillion dollars
01:57:11worth of installed data centers will
01:57:13become modernized over the next several
01:57:15years second because of the
01:57:16computational capability we brought to
01:57:18bear a new way of doing software has
01:57:20emerged generative AI which is going to
01:57:23create new infrastructure
01:57:25dedicated to doing one thing and one
01:57:27thing only not for multi-user data
01:57:30centers but AI generators these AI
01:57:33generators will create incredibly valuable
01:57:37software a new Industrial Revolution
01:57:40second the computer of this revolution
01:57:43the computer of this generation
01:57:45generative AI trillion
01:57:47parameters Blackwell insane amounts of
01:57:51computers and computing
01:57:53third I'm trying to
01:57:57concentrate good job third new computer
01:58:02new computer creates new types of
01:58:04software new type of software should be
01:58:06distributed in a new way so that it can
01:58:09on the one hand be an endpoint in the
01:58:10cloud and easy to use but still allow
01:58:13you to take it with you because it is
01:58:15your intelligence your intelligence
01:58:17should be packaged up in a way that
01:58:19allows you to take it with you we call
01:58:21them NIMs and fourth these NIMs are going
01:58:24to help you create a new type of
01:58:26application for the future not one that
01:58:28you wrote completely from scratch but
01:58:30you're going to integrate them like
01:58:33teams to create these applications we have
01:58:36a fantastic capability between Nims the
01:58:39AI technology the tools Nemo and the
01:58:42infrastructure dgx cloud in our AI
01:58:45Foundry to help you create proprietary
01:58:47applications proprietary chat Bots and
01:58:49then lastly everything that moves in the
01:58:51future will be robotic you're not going
01:58:53to be the only one and these robotic
01:58:56systems whether they are humanoid amrs
01:59:00self-driving cars forklifts manipulating
01:59:03arms they will all need one thing giant
01:59:06stadiums warehouses factories there are going
01:59:09to be factories that are robotic
01:59:11orchestrating factories manufacturing
01:59:13lines that are robotic building cars
01:59:16robotically these systems all need one
01:59:19thing they need a platform a digital
01:59:22platform a digital twin platform and we
01:59:25call that Omniverse the operating system
01:59:29of the robotics world these are the five things that we
01:59:31talked about today what does Nvidia look
01:59:33like what does Nvidia look like when we
01:59:35talk about gpus there's a very different
01:59:38image that I have when I when people ask
01:59:40me about gpus first I see a bunch of
01:59:42software stacks and things like that and
01:59:44second I see this this is what we
01:59:47announce to you today this is Blackwell
01:59:57amazing amazing processors NVLink
02:00:00switches networking systems and the
02:00:03system design is a miracle this is
02:00:07Blackwell and this to me is what a GPU
02:00:18looks like in my mind listen orange green I think we have
02:00:22one more treat for everybody what do you
02:00:25say okay we have one more thing to show
02:02:47you thank you have a great
02:02:50GTC thank you all for coming thank you