Full Transcript & Notes: NVIDIA CEO Jensen Huang Keynote at CES 2025

The Power of Tokens [00:00:06]

This is how intelligence is made. A new kind of factory. Generator of tokens, the building blocks of AI. Tokens have opened a new frontier, the first step into an extraordinary world where endless possibilities are born.

Tokens transform words into knowledge and breathe life into images. They turn ideas into videos and help us safely navigate any environment. Tokens teach robots to move like the masters, inspire new ways to celebrate our victories, and give us peace of mind when we need it most. They bring meaning to numbers to help us better understand the world around us, predict the dangers that surround us, and find cures for the threats within us. Tokens can bring our visions to life and restore what we've lost. They help us move forward, one small step at a time and one giant leap together.

And here is where it all begins.

Introduction and Welcome [00:03:07]

Welcome to the stage, NVIDIA founder and CEO, Jensen Huang.

Welcome to CES! Are you excited to be in Las Vegas? Do you like my jacket? I thought I’d go the other way from Gary Shapiro. I’m in Las Vegas, after all. If this doesn’t work out, if all of you object, just get used to it. I really think you have to let this sink in. In another hour or so, you’re going to feel good about it.

Welcome to NVIDIA. In fact, you’re inside NVIDIA’s digital twin. And we’re going to take you to NVIDIA. Ladies and gentlemen, welcome to NVIDIA. You’re inside our digital twin. Everything you see here is generated by AI.

NVIDIA's History and Evolution [00:04:26]

It has been an extraordinary journey, an extraordinary year. It started in 1993 with NV1. We wanted to build computers that could do things that normal computers couldn’t. And NV1 made it possible to have a game console in your PC. Our programming architecture was called UDA, missing the letter C until a little while later, but UDA, Unified Device Architecture. And the first developer for UDA and the first application that ever worked on UDA was Sega's Virtua Fighter.

Six years later, we invented in 1999 the programmable GPU, and it started 20 plus years of incredible advance in this incredible processor called the GPU. It made modern computer graphics possible. And now, 30 years later, Sega's Virtua Fighter is completely cinematic. This is the new Virtua Fighter project that's coming. I just can't wait. Absolutely incredible.

Six years after that, six years after 1999, we invented CUDA so that we could express the programmability of our GPUs to a rich set of algorithms that could benefit from it. In 2012, Alex Krizhevsky, Ilya Sutskever, and Geoff Hinton discovered CUDA, used it to process AlexNet, and the rest of it is history. AI has been advancing at an incredible pace since. It started with Perception AI: we now can understand images and words and sounds. Then Generative AI: we can generate images and text and sounds. And now, Agentic AI, AIs that can perceive, reason, plan, and act. And then the next phase, which we'll talk about tonight, Physical AI.

The Transformation of Computing [00:07:35]

In 2018, Google's Transformer was released as BERT, and the world of AI really took off. Transformers, as you know, completely changed the landscape for artificial intelligence. In fact, it completely changed the landscape for computing altogether. We recognized properly that AI was not just a new application with a new business opportunity, but AI, more importantly, machine learning enabled by Transformers, was going to fundamentally change how computing works.

And today, computing is revolutionized in every single layer. From hand-coding instructions that run on CPUs to create software tools that humans use, we now have machine learning that creates and optimizes neural networks that processes on GPUs and creates artificial intelligence. Every single layer of the technology stack has been completely changed. An incredible transformation in just 12 years.

Announcing RTX Blackwell [00:14:13]

Today we're announcing our next generation: the RTX Blackwell family. Here it is, our brand new GeForce RTX 50 series, Blackwell architecture. The GPU is just a beast: 92 billion transistors, 4,000 TOPS, four petaflops of AI, three times higher than the last generation Ada. And we need all of it to generate those pixels that I showed you.

It has 380 ray-tracing teraflops, 125 shader teraflops, G7 memory from Micron, 1.8 terabytes per second, twice the performance of our last generation. And we now have the ability to intermix AI workloads with computer graphics workloads. And one of the amazing things about this generation is the programmable shader is also able to now process neural networks. As a result, we invented neural texture compression and neural material shading.

RTX 50 Series Pricing and Availability [00:17:50]

  • RTX 5070: 1,000 AI TOPS, $549

  • RTX 5070 Ti: 1,400 AI TOPS, $749

  • RTX 5080: 1,800 AI TOPS, $999

  • RTX 5090: 3,400 AI TOPS, $1,999

Availability Starting January.
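
Note (not part of the keynote): one rough way to compare the four desktop cards is AI TOPS per dollar, computed only from the figures listed above. The names DESKTOP_CARDS and tops_per_dollar are made up for this sketch, and the ratio ignores everything else that differs between the cards.

```python
# Desktop RTX 50 series figures as listed above: (AI TOPS, price in USD).
DESKTOP_CARDS = {
    "RTX 5070": (1000, 549),
    "RTX 5070 Ti": (1400, 749),
    "RTX 5080": (1800, 999),
    "RTX 5090": (3400, 1999),
}


def tops_per_dollar(cards):
    """Return {card name: AI TOPS per dollar}, rounded to two decimals."""
    return {name: round(tops / price, 2) for name, (tops, price) in cards.items()}


if __name__ == "__main__":
    for name, ratio in tops_per_dollar(DESKTOP_CARDS).items():
        print(f"{name}: {ratio} AI TOPS per dollar")
```

On these listed numbers the four cards land within a narrow band, roughly 1.7 to 1.9 AI TOPS per dollar.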

RTX Laptops [00:19:46]

We managed to put these gigantic performance GPUs into a laptop. This is a 5070 laptop. For $1,299, this 5070 laptop has 4090 performance at half the power.

  • RTX 5070: 800 AI TOPS, $1,299

  • RTX 5070 Ti: 1,000 AI TOPS, $1,599

  • RTX 5080: 1,350 AI TOPS, $2,199

  • RTX 5090: 1,850 AI TOPS, $2,899

Availability Starting March.

The Future of AI: Scaling Laws and Blackwell Supercomputers [00:22:06]

The industry is racing to scale artificial intelligence. The scaling law says that the more data you have, the larger model that you have, and the more compute that you apply to it, the more effective your model will become. And the scaling law continues.

There are, in fact, two other scaling laws that have now emerged. The second scaling law is post-training scaling law. Post-training scaling uses techniques like reinforcement learning, human feedback. Basically, the AI generates answers based on a human query, the human gives it feedback, and that reinforcement learning system, with a fair number of very high-quality prompts, causes the AI to refine its skills.

We now have a third scaling law, and this third scaling law has to do with what's called test-time scaling. It's basically when you're using the AI, the AI has the ability to now apply a different resource allocation. Instead of improving its parameters, now it's focused on deciding how much computation to use to produce the answers it wants to produce.
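
Note (not part of the keynote): a minimal sketch of one simple form of test-time scaling, best-of-N sampling, where the parameters stay fixed and extra compute goes into sampling more candidate answers. The keynote does not name a specific technique, so best-of-N is an assumption here, and generate_candidate is a hypothetical stand-in that returns a random quality score rather than calling a real model.

```python
import random


def generate_candidate(rng):
    """Stand-in for one sampled model answer; returns a quality score in [0, 1)."""
    return rng.random()


def best_of_n(n, seed=0):
    """Spend more inference compute by sampling n candidates and keeping the best."""
    rng = random.Random(seed)
    return max(generate_candidate(rng) for _ in range(n))


if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        print(f"n={n:>2}  best score={best_of_n(n):.3f}")
```

With a fixed seed, the best score can only stay the same or improve as n grows, which is the toy version of trading computation for answer quality. A real system would score candidates with a verifier or reward model instead of a random number.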

The amount of computation that we need is incredible. We would like to scale the amount of computation to produce more and more novel and better intelligence. And so scaling law is driving enormous demand for NVIDIA computing. It's driving enormous demand for this incredible chip we call Blackwell. Blackwell is in full production.

Agentic AI and NVIDIA's Platform [00:36:02]

One of the most important things that's happening in the world of enterprise is Agentic AI. Agentic AI basically is a perfect example of test-time scaling. It's a system of models. Some of it is understanding, interacting with the user. Some of it is maybe retrieving information from storage, a semantic AI system like a RAG. Maybe it's going on to the internet, studying a PDF file. It might be using tools, it might be using a calculator, and it might be using a generative AI to generate charts and such. It's taking the problem you gave it, breaking it down step by step, and it's iterating through all these different models.
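
Note (not part of the keynote): a toy sketch of the "system of models" idea, where a problem is broken into steps and each step is routed to a tool. Everything here is hypothetical: the tool names, the hard-coded plan, and the keyword retrieval are illustration only. In a real agent a model would produce the plan and the retrieval step would be a proper semantic search.

```python
def calculator(expression):
    """Tool: add the integers in an expression like '2+3+4'."""
    return str(sum(int(part) for part in expression.split("+")))


def retrieve(query, documents):
    """Tool: toy retrieval, return the first document containing the query word."""
    for doc in documents:
        if query.lower() in doc.lower():
            return doc
    return "no match"


def run_agent(steps, documents):
    """Work through (tool, argument) steps in order and collect each result."""
    results = []
    for tool, argument in steps:
        if tool == "calc":
            results.append(calculator(argument))
        elif tool == "retrieve":
            results.append(retrieve(argument, documents))
        else:
            raise ValueError(f"unknown tool: {tool}")
    return results


if __name__ == "__main__":
    docs = ["Blackwell is in full production.", "Cosmos is a world foundation model."]
    plan = [("retrieve", "cosmos"), ("calc", "2+3+4")]
    print(run_agent(plan, docs))
```

Each extra step is another model or tool call, which is why this style of system consumes more computation at inference time.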

To help the industry build Agentic AI, our go-to-market is we work with software developers in the IT ecosystem to integrate our technology, to make possible new capabilities. Just like we did with CUDA libraries, we now want to do that with AI libraries. We've created three things for helping the ecosystem build Agentic AI:

  1. NVIDIA NIMs (NVIDIA Inference Microservices): Essentially AI microservices, all packaged up. We optimize it, we put it into a container, and you could take it wherever you like.

  2. NVIDIA NeMo: Essentially a digital employee onboarding and training evaluation system.

  3. NVIDIA AI Blueprints: We provide a whole bunch of blueprints that our ecosystem could take advantage of. All of this is completely open source.

NVIDIA Llama Nemotron Models [00:41:02]

Today we're also announcing that we're doing something that's really clever. We're announcing a whole family of models that are based off of Llama: the NVIDIA Llama Nemotron Language Foundation Models. Llama 3.1 is a complete phenomenon. It has singularly been the reason why just about every enterprise and every industry has been activated to start working on AI. The thing that we did was realize that the Llama models really could be better fine-tuned for enterprise use. And so we fine-tuned them using our expertise and our capabilities and we turned them into the Llama Nemotron suite of open models.

  • Llama Nemotron Nano: Most cost-efficient, low-latency model to deploy on PC and edge devices.

  • Llama Nemotron Super: Superior accuracy with balanced compute efficiency.

  • Llama Nemotron Ultra: Highest-accuracy model for data-center-scale applications.

Physical AI and the World Foundation Model (NVIDIA Cosmos) [00:52:10]

What if instead of PDFs, it's your surroundings? And what if instead of the prompt, a question, it's a request: "Go over there and pick up that box and bring it back." And instead of what is produced in tokens as text, it produces action tokens. Well, what I just described is a very sensible thing for the future of robotics. And the technology is right around the corner.

But what we need to do is we need to create, effectively, the world model, as opposed to GPT, which is a language model. And this world model has to understand the language of the world. It has to understand physical dynamics, things like gravity and friction and inertia. It has to understand geometric and spatial relationships. It has to understand cause and effect. It has to understand object permanence.

Today we're announcing a very big thing. We're announcing NVIDIA Cosmos, a world foundation model that is designed to understand the physical world. The only way for you to really understand this is to see it. NVIDIA Cosmos is the world's first world foundation model. It is trained on 20 million hours of video. NVIDIA Cosmos, the world's first Physical AI foundation model. It's open, available to activate the world's industries of robotics and such.

The Robotics Three-Computer Solution [01:02:03]

Omniverse is a physics-grounded, not physically grounded, but physics-grounded. It's algorithmic physics, principled physics simulation grounded system. It's a simulator. When you connect that to Cosmos, it provides the grounding, the ground truth that can control and condition the Cosmos generation. As a result, what comes out of Cosmos is grounded on truth. This is exactly the same idea as connecting a large language model to a RAG. You want to ground the AI generation on ground truth. The combination of the two gives you a physically simulated, a physically grounded multiverse generator.

Every robotics company will ultimately have to build three computers. One computer, of course, to train the AI. We call it the DGX computer to train the AI. Another, when you're done, to deploy the AI. We call that AGX, that's inside the car, in the robot. And to connect the two, you need a digital twin. This is the digital twin of the AI. These three computers are going to be working interactively. NVIDIA's strategy for the industrial world is this three-computer system.

Industrial Digitalization and Autonomous Vehicles [01:03:31]

Millions of factories, hundreds of thousands of warehouses, that's basically the backbone of a 50 trillion dollar manufacturing industry. All of that has to become software-defined. All of it has to have automation in the future. And all of it will be infused with robotics.

The AV revolution has arrived. After so many years, with Waymo's success and Tesla's success, it is very, very clear autonomous vehicles have finally arrived. Our offering to this industry is the three computers: the training systems, the simulation systems, and the synthetic data generator (Omniverse and now Cosmos), and also the computer that's inside the car. Today, Toyota and NVIDIA are going to partner together to create their next generation AVs.

Today we're announcing our next generation computer for the car is called Thor. This is a robotics computer. It takes madness amount of sensor information: cameras, high resolution, radars, lidars, they're all coming into this chip. And this chip has to process all that sensor, turn them into tokens, put them into a transformer, and predict the next path. And this AV computer is now in full production. DRIVE OS is now the first software-defined, programmable AI computer that has been certified up to ASIL D, which is the highest standard of functional safety for automobiles. The only and the highest.

The AV Data Factory [01:13:06]

The Autonomous Vehicle Data Factory, powered by NVIDIA Omniverse, AI models, and Cosmos, generates synthetic driving scenarios that enhance training data by orders of magnitude.

  1. OmniMap: Fuses map and geospatial data to construct driveable 3D environments.

  2. Neural Reconstruction Engine (NRE): Uses autonomous vehicle sensor logs to create high-fidelity 4D simulation environments.

  3. Edify 3DS: Automatically searches through existing asset libraries or generates new assets to create sim-ready scenes.

The Omniverse scenarios are used to condition Cosmos to generate massive amounts of photorealistic data, reducing the sim-to-real gap. And with text prompts, generate near-infinite variations of the driving scenario. With Cosmos Nemotron Video Search, the massively scaled synthetic data set combined with recorded drives can be curated to train models. NVIDIA's AI data factory scales hundreds of drives into billions of effective miles, setting the standard for safe and advanced autonomous driving.

Project GR00T for Humanoid Robots [01:19:03]

The ChatGPT moment for general robotics is just around the corner. We think that the robotics era is just around the corner. The critical capability is how to train these robots. The imitation information, the human demonstration, is rather laborious to do. We need to come up with a clever way to take hundreds of demonstrations, thousands of human demonstrations, and somehow use artificial intelligence and Omniverse to synthetically generate millions of synthetically generated motions. And from those motions, the AI can learn how to perform a task.

The NVIDIA Isaac GR00T Blueprint for Synthetic Motion Generation is a simulation workflow for imitation learning, enabling developers to generate exponentially larger data sets from a small number of human demonstrations.

  1. GR00T-Teleop: Enables skilled human workers to portal into a digital twin of their robot using the Apple Vision Pro.

  2. GR00T-Mimic: Multiplies these trajectories into a much larger data set.

  3. GR00T-Gen: Built on Omniverse and Cosmos for domain randomization and 3D-to-real upscaling, generating an exponentially larger data set.

Once the policy is trained, developers can perform software-in-the-loop testing and validation in Isaac Sim before deploying to the real robot.
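
Note (not part of the keynote): a toy sketch of the idea behind the Mimic step above, turning a handful of demonstrations into a much larger data set. This is not NVIDIA's algorithm. The function name multiply_trajectories, the (x, y) waypoint format, and the uniform jitter are all assumptions made for illustration.

```python
import random


def multiply_trajectories(demos, copies_per_demo, noise=0.01, seed=0):
    """Return the original demos plus jittered copies of each one.

    Each demo is a list of (x, y) waypoints. The jitter is plain uniform
    noise, an assumption made for illustration only.
    """
    rng = random.Random(seed)
    dataset = [list(demo) for demo in demos]
    for demo in demos:
        for _ in range(copies_per_demo):
            dataset.append(
                [(x + rng.uniform(-noise, noise), y + rng.uniform(-noise, noise))
                 for x, y in demo]
            )
    return dataset


if __name__ == "__main__":
    demos = [[(0.0, 0.0), (1.0, 1.0)], [(0.0, 1.0), (1.0, 0.0)]]
    dataset = multiply_trajectories(demos, copies_per_demo=50)
    print(f"{len(demos)} demonstrations -> {len(dataset)} trajectories")
```

Two demonstrations with 50 copies each give 102 trajectories. A real pipeline would generate variations in simulation and check that each one still completes the task, rather than adding noise blindly.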

Project DIGITS: The Personal AI Supercomputer [01:21:38]

This is now the new way of doing computing. This is the new way of doing software. Every software engineer, every engineer, every creative artist, everybody who uses computers today as a tool will need an AI supercomputer. I just wished that DGX-1 was smaller. This is NVIDIA's latest AI supercomputer. And it's fondly called Project DIGITS right now. This is an AI supercomputer. It runs the entire NVIDIA AI stack. All of NVIDIA's software runs on this. DGX Cloud runs on this. It's based on a super-secret chip that we've been working on, called GB10, the smallest Grace Blackwell that we make. This computer will be available around May time frame.

Conclusion and Recap [01:27:30]

I told you that we are in production with three new Blackwells. Not only is the Grace Blackwell supercomputers, NVLink72s in production all over the world, we now have three new Blackwell systems in production. One amazing AI, foundational world model, the world's first Physical AI foundation model. It's open, available to activate the world's industries of robotics and such. And three robots. We're working on Agentic AI, humanoid robots, and self-driving cars.

It's been an incredible year. I want to thank all of you for your partnership. Thank all of you for coming. I made you a short video to reflect on last year and look forward to the next year. Play please.

Concluding Video Montage [01:28:24]

(The video shows a fast-paced montage recapping the technologies and applications discussed, including NVIDIA ACE for digital humans, RTX Blackwell GPUs, BMW Group's manufacturing, Black Forest Labs' AI-generated cities, NVIDIA Blackwell supercomputers, Project DIGITS, Siemens simulations, KION Group and Accenture's warehouse automation, Rockwell Automation, Delta Electronics, Wistron, NVIDIA Tokkio, Omniverse, Cosmos, NVIDIA DRIVE, NVIDIA Earth-2, human-robot interaction with Foxconn, Innoactive, and Volkswagen, Nissan's design process, and simulations with Cadence and Luminary Cloud.)