4 Sources
[1]
NVIDIA GTC keynote: AI gaming, agents, robots, and more
Do you want to build a snowman? Go to space? Construct a workforce out of 110 kinds of robot? Accelerate the entire timeline of Artificial Intelligence? If so, there was plenty of sizzle on offer from Jensen Huang, CEO of NVIDIA, at the company's two-hour-long GTC keynote event in San Jose Monday.

NVIDIA, the world's leading purveyor of AI-friendly GPU chips, has 4.4 trillion reasons to dazzle us with potential futures allegedly brought to you by ever-accelerating AI technology. In other words, Huang needs to protect NVIDIA's current $4.4 trillion market cap, double what it was two years ago, by proving it isn't a bubble.

Result: Huang peppered a presentation of impenetrable charts with dazzling visions of data centers in orbit, and awkwardly extra interactions with a robot version of Olaf from Frozen. But beneath the future sizzle, the actual steak of present-day practical announcements -- for consumers, at least -- was limited to an AI gaming software update that got a thumbs down from gamers, plus NVIDIA muscling in on the OpenClaw AI agent action. Here's a summary of everything Huang had to say:

Huang's first reveal: NVIDIA DLSS 5, the next iteration of the company's DLSS AI upscaling software, coming this fall. NVIDIA describes it as a "breakthrough in visual fidelity" that "infuses pixels with photorealistic lighting and materials, bridging the gap between rendering and reality." Huang showed off before-and-after comparisons using Resident Evil: Requiem, Hogwarts Legacy, and Starfield. Trouble is, Resident Evil: Requiem, released two weeks ago (here's our review), has already won the hearts of gamers for its graphics -- so many took to social media, furious that NVIDIA was trying to fix what wasn't broken.

From there, Huang moved to his bigger claim, showcasing a chart highlighting AI developments of recent years. It began in 2023 with OpenAI's ChatGPT. It continued in 2024 with OpenAI's first reasoning model, o1, and in 2025 with Anthropic's coding assistant, Claude Code (which, Huang was proud to boast, is now used by 100% of his company). Now, in 2026, Huang sees NVIDIA and the entire AI industry reaching an "inflection point for inference." What does that mean? Huang says the value of his chips isn't so much in training large language models any more. NVIDIA's customers have tipped over into deploying those AI models in more novel ways, growing the ecosystem for AI agents.

Speaking of which ... "Claude Code and OpenClaw have sparked the agent inflection point," Huang pronounced. (OpenClaw, formerly Clawdbot, is a popular AI assistant with some security issues.) NVIDIA adds a protective layer of security and stability that it's calling Nemo Claw; you can try it now, in a preview version. Beyond that, it makes sense for NVIDIA to grow the entire agentic ecosystem as soon as possible, whether or not those agents are actually working reliably. So the company is offering an expansive new NVIDIA AI Agent Toolkit for companies that want to build their own models. The company is also offering a complex reasoning AI model of its own, one with what is possibly the most grandiose sci-fi name in tech today: the Nemotron 3 omni-understanding model.

Huang accelerated his cosmic vision towards the end of the keynote. We're not talking NVIDIA Cosmos 3, another grandly-named AI model, but a vision of Vera Rubin Space-1 -- which Huang says will be the first data center in space.
There's no timeline for development, let alone launch, but NVIDIA apparently has a "lot of great engineers" working on it.

A lot of great Imagineers worked on Olaf the snowman from Frozen, too. And they could be forgiven for cringing a little when Huang closed the keynote having a conversation with the Olaf snowman robot -- one of 110 AI-powered robots on display in San Jose, all from NVIDIA-partnered companies (in this case, Disney). Olaf is pretty cool, and I wouldn't mind running into one in a Disney park in the future. But Huang repeatedly tripped over Olaf's lines, and the fact that our favorite feisty snowman didn't adjust to the conversational situation made him seem ... well, a little less than artificially intelligent.

At the two-hour mark, Huang exited to the bizarre accompaniment of AI-generated country music, apparently coming from his own avatar and a bunch of robots sitting around a campfire on screen. And the audience was left with two possible futures. In one, the AI agents NVIDIA is championing help launch the company into the stratosphere. In the other, AI agents act like a bunch of bumbling Olafs, and continue to deliver little ROI for companies -- in which case NVIDIA's market position may melt like a snowman in spring.
[2]
Nvidia CEO Jensen Huang says the next AI boom belongs to inference
Jensen Huang walked onto the SAP Center stage Monday for his GTC keynote address and did what he does best: turn a product keynote into a zoning hearing for the future. The Nvidia $NVDA founder and CEO opened the company's closely watched developers' conference by promising a tour through "every single layer" of AI, then spent the next stretch arguing that the company isn't just selling chips into a hot market. Nope. The company wants to define the whole physical plant of the AI economy: the compute, the networking, the storage, the software, the models, the factories, and -- because subtlety is clearly out of season -- maybe even the orbital data centers.

The keynote sprayed announcements in every direction, but the real message was tighter than the confetti cannon made it look. Huang wanted investors, customers, and rivals to hear four things clearly: AI demand is still climbing fast enough to justify indecent amounts of spending; inference is now the center of the battlefield; agents are supposed to spill out of chatbots and into the daily machinery of office work; and the next gold rush after digital AI could be physical AI, where robots, autonomous systems, and industrial software burn through even more data and infrastructure.

That framing also did some quiet cleanup work. Nvidia has spent months fielding the usual questions that arrive whenever a company becomes the main cashier at a capital-spending frenzy: How long can this last, what happens when hyperscalers get religion on costs, and how much of the next phase leaks to custom chips and cheaper alternatives? Huang's answer was to widen the lens. The token, GTC's opening video declared, is the building block of the new AI era. Huang's point was that the business tied to those tokens won't stop at training giant models and admiring them in benchmarks. It moves into production, where the meter never stops running.

So Huang's answer was classic Nvidia. Don't defend the GPU in isolation; swallow the whole stack. He described Vera Rubin as "a generational leap" built around seven chips and five rack-scale systems, with Nvidia claiming the platform can train large mixture-of-experts models with one-fourth the number of GPUs versus Blackwell and deliver up to 10 times higher inference throughput per watt at one-tenth the cost per token. He also used the keynote to look beyond Rubin to the future platform Feynman, because in Nvidia-land the next generation is already standing in the wings before the current one finishes taking its bow.

That was the deeper tell from San Jose. Huang wasn't pitching a faster part so much as a bigger dependency. Nvidia announced a Vera Rubin DSX AI factory reference design, DSX simulation tools for planning AI factories before they're built, and a broader menu of storage, networking, and system components meant to operate as one vertically integrated machine. The message was hard to miss: Stop thinking about servers, start thinking about campuses. Or, if you're Nvidia, start sending invoices like a utility.

Nvidia paired that rhetoric with its Agent Toolkit, OpenShell runtime, and AI-Q blueprint -- software it says can help enterprises build autonomous agents with policy guardrails and, in AI-Q's case, cut query costs by more than 50% through a hybrid mix of frontier and Nvidia's own open models.
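That AI-Q cost claim is easiest to picture as a routing problem: send cheap queries to a cheap open model and reserve the frontier model for the hard ones. Here's a minimal sketch of that pattern in Python; the per-token prices and the difficulty heuristic are invented for illustration, not Nvidia's actual blueprint:

```python
# Hypothetical sketch of the cost-saving pattern behind a hybrid
# frontier/open-model router. Prices and the heuristic are assumptions.

FRONTIER_COST_PER_1K = 0.015   # assumed $/1K tokens, hosted frontier model
OPEN_COST_PER_1K = 0.002       # assumed $/1K tokens, self-hosted open model

def estimate_difficulty(query: str) -> float:
    """Toy difficulty score: long, multi-step requests skew 'hard'."""
    signals = ["step by step", "prove", "refactor", "plan", "architecture"]
    score = min(len(query) / 500, 1.0)
    score += 0.3 * sum(s in query.lower() for s in signals)
    return min(score, 1.0)

def route(query: str, threshold: float = 0.5) -> str:
    return "frontier" if estimate_difficulty(query) > threshold else "open"

def query_cost(query: str, expected_tokens: int = 800) -> float:
    rate = FRONTIER_COST_PER_1K if route(query) == "frontier" else OPEN_COST_PER_1K
    return expected_tokens / 1000 * rate

queries = [
    "What is our refund policy?",  # easy -> open model
    "Plan a step by step migration of our billing service to event sourcing.",
]
for q in queries:
    print(f"{route(q):>8}  ${query_cost(q):.4f}  {q}")
```

Even with a crude heuristic, steering most traffic to the cheaper model is how a hybrid stack can plausibly cut query costs by half or more.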
There was a strategic hedge tucked inside all that openness. Nvidia unveiled the Nemotron Coalition with Black Forest Labs, Cursor, LangChain, Mistral, Perplexity, Reflection AI, Sarvam, and Thinking Machines Lab, with the first project set to underpin the coming Nemotron 4 model family. Read the subtext, and it's pretty clear that Nvidia doesn't want the future of AI software split neatly between a few giant closed-model vendors and a pile of commodity hardware underneath. It wants a hand in the open-model layer, too -- the piece that shapes who gets to build, tune, and own AI outside the walls of the biggest labs.

Huang also previewed GR00T N2, a next-generation robot foundation model based on DreamZero research that the company says more than doubles success versus leading VLA models on new tasks in new environments. That section of the keynote may wind up aging best. Chatbots got Wall Street excited. Physical AI is the part that could keep the infrastructure binge going for years, because robots, industrial systems, and autonomous machines don't just need models -- they need endless training data, simulation, networking, sensors, and edge compute.

Huang even took the story a step further and said Nvidia is going to space, with future Vera Rubin-based systems aimed at orbital data centers and autonomous space operations. Sure, that sounds a little like a man who has discovered there are still a few untouched sectors left on the bingo card. But it also sounds like a company determined to make "AI infrastructure" mean nearly every expensive machine in sight.

By the time Huang was done, the keynote felt bigger than a launch calendar. It read like an empire map. Yes, there was DLSS 5 for graphics, new industrial software tie-ins, telecom edge partnerships, and an avalanche of developer plumbing. But the durable takeaway was simpler and much bigger: Nvidia wants AI to stop being understood as a category of software and start being treated as a utility-scale infrastructure project, with Nvidia's hardware and software embedded at every layer.

That's a very Jensen Huang message -- neatly merchandised and only slightly modest. The unnerving part for rivals is that, for now at least, he still has plenty of customers willing to build around it.
[3]
Nvidia GTC 2026: Jensen Huang's Groq 'Mellanox moment' and the inference land grab - SiliconANGLE
Ahead of Nvidia Corp.'s GTC 2026, we reiterate our thesis that the center of gravity in AI is shifting from "how fast can you train?" to "how well can you serve?" Training ushered in the modern AI era. Inference is where the monetization rubber meets the proverbial payback road. Token economics, latency requirements, power constraints, memory bottlenecks, NAND prices and ultimately end-customer willingness to pay will determine how fast and how much AI adopters can benefit.

In his remarks on the last Nvidia earnings call, Jensen Huang hinted that Nvidia intends to push harder into low-latency inference with Groq's decoder technology - and he's telegraphing that we'll see the specifics today at GTC. Low-latency inference is where the edge lights up, where agentic systems deliver value, and where infrastructure investments start to pay off.

Jensen essentially told investors two things on his last earnings call. First, he referenced Nvidia's non-exclusive $20 billion licensing agreement with Groq for low-latency inference technology and said it will "extend Nvidia's architecture with Groq's innovations" the way it extended the architecture with Mellanox - and he explicitly said "we look forward to sharing more at GTC next month." Second, he reinforced the logic of his recent move: CUDA plus architectural compatibility lets Nvidia package software optimization into one stack and have it benefit Hopper, Blackwell and Ampere - extending useful life, improving performance per dollar and per watt, and giving customers an onramp to a new flywheel. Jensen hinted on his earnings call that Groq becomes an "accelerator" inside that broader architecture - alluding to the Mellanox playbook, but aimed at the inference/decoder opportunity.

With that as the setup, we hosted a theCUBE + NYSE Wired panel - The Inference Engine: Building AI That Performs at Scale - to test what "inference at scale" really means when you look at the reality of the many constraints technologists face. The panel, hosted at theCUBE's Palo Alto studio, underscored that the inference market is rapidly expanding, but it is not a single market. It's a fragmented portfolio of workloads with different success metrics, different bottlenecks and varying economics. Identifying clear horizontal monetization opportunities at scale remains elusive. We are still in the "build it and they will come" phase of inference, in our view.

In our panel, Sid Sheth (d-Matrix) summarized the sentiment, saying inference "isn't that much of a secret anymore," especially "after the Nvidia Groq deal" - the industry now acknowledges "the next big wave of AI computing is going to be around inference." We agree with his second point even more than the first: inference is not one-size-fits-all. It runs in big data centers, small data centers and edge environments - with big models and small models - and "different metrics of success." That's the real market dynamic, which makes granular sizing difficult.

The "training winner-take-most" era was created by a default stack owned by Nvidia. The key question is whether that same dynamic carries through to inference. In other words, does the Nvidia/Groq deal validate alternatives, or will it blow them out of the market? The key determinants will be latency, context length, cost, throughput and power, and how these metrics present themselves differently by workload.
The assumption is the market is so large and fragmented that while a leader like Nvidia will do well and perhaps take most, there will be enough white space left for competitors. Mitesh Agrawal (Positron) answered "yes and no" on whether every inference deployment is a "snowflake," meaning the workload definition changes with buyer priorities: time to first token, latency, time to last token, context length, memory and throughput. He also made a point that sometimes gets lost in the market's narrative: Nvidia GPUs have been the default for inference workloads because they've been the best "on a dollar basis," but significant opportunities exist for alternatives that can deliver fast speeds and optimize expensive memory resources, especially as KV caches expand with code generation and video generation.

This ties directly to why Jensen's Groq hints we mentioned up front are so important. Specifically, inference at the edge was the one glaring gap in Nvidia's massive portfolio. The Groq deal closes that gap. If Nvidia is about to put a low-latency decoder path inside the Nvidia stack, that's an attempt to collapse one of the highest-value inference opportunities back into the CUDA ecosystem - the same way Mellanox collapsed networking advantage into the Nvidia platform. Jensen is essentially saying, "We're not passing on the low-latency opportunity, and the best path is inside our control plane."

While observers argue about model benchmarks, the infrastructure builders are staring at a glaring energy deficiency. Felix Ejeckam (Akash Systems) explained that there isn't enough power in the grid to support the compute trajectory, and the stress increases as inference deployments ramp up. Akash's pitch is that lab-grown diamond applied directly to GPUs reduces the cooling load, dropping temperatures by ~10-15°C and pushing PUE closer to 1.0, without having to rebuild the facility. We haven't validated the exact economics but believe the claims are directionally correct. The point is that inference economics rely on solving the power and cooling story as much as the silicon story.

We also note the investor commentary from Sam Awrabi (Banyan Ventures), who said the idea that "hardware costs all the money" misses that power can be a meaningful component of total cost. That's a major reason inference is becoming the new battleground - as inference scales with usage, usage scales power, and power scales the bill. So reducing power enables more tokens to be generated at lower cost.

The panel conversation turned to memory as pricing pressure becomes a gating factor. Sid Sheth emphasized that d-Matrix intentionally avoided CoWoS and HBM, using stacked custom DRAM and LPDDR tiers to reduce exposure to the most constrained parts of the Nvidia-centric supply chain. Mitesh added a broader point that memory pricing increases flow through the whole stack (HBM to DRAM to LP5X), and even beyond price, allocation is the real bottleneck - "good luck getting allocation for CoWoS and HBM ahead of Nvidia... then Broadcom ecosystem... then AMD... then Amazon... then Microsoft... then Meta."
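The KV-cache point deserves a number. Below is a back-of-envelope sizing calculation in Python; the model shape, batch size and fp16 precision are illustrative assumptions, not figures from the panel:

```python
# Back-of-envelope KV-cache sizing, illustrating why long-context workloads
# like code and video generation blow up memory requirements. The model
# shape below is an assumption loosely resembling a large dense LLM.

def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, bytes_per_val=2):
    # 2x for keys and values; fp16/bf16 stores 2 bytes per value
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_val

GiB = 1024 ** 3
shape = dict(layers=80, kv_heads=8, head_dim=128, batch=8)

for ctx in (8_192, 131_072):
    size = kv_cache_bytes(seq_len=ctx, **shape)
    print(f"context {ctx:>7,} tokens -> KV cache ~{size / GiB:5.1f} GiB")
```

Under these assumptions, stretching context from 8K to 128K tokens inflates the cache from roughly 20 GiB to 320 GiB - exactly the kind of growth that makes scarce, expensive HBM the gating resource.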
Our perspective is that fabrication capacity is a key constraint that is often overlooked. Data center accelerators are sucking up fab capacity from TSMC these days, as suppliers like Nvidia (GPUs), Broadcom (TPUs, etc.) and others make much more aggressive volume growth commitments to TSMC relative to consumer chip designers.

At a high level, there are two main constraints we're monitoring - the front end and back end of the semiconductor manufacturing process. Front-end capacity refers to upstream wafer fabrication - i.e., advanced logic process nodes - where the silicon and logic circuits are placed on the wafers. The back end (sometimes called the mid step) is where CoWoS (Chip-on-Wafer-on-Substrate), which our two guests mentioned, comes into play. CoWoS is a form of advanced packaging where the fabricated chips are integrated with high-bandwidth memory (HBM), substrates and other components to create the final accelerator packages. Fabs like TSMC have to balance front-end and back-end capacity. Last year the back end was a major constraint, and while that's still acute, the bottleneck is shifting to the front end of the process. The point is AI demand is exploding, but silicon wafer production isn't keeping up.

The relevance for GTC is that Jensen's architectural-compatibility argument is also a supply chain argument. When the same CUDA-optimized work benefits a large installed base for years, older installed bases keep producing revenue - and customers can more easily tolerate upgrades on Nvidia's cadence because the stack remains current. That reduces churn, raises switching costs and creates lock-in, underscoring a subtle but powerful inference moat.

The panel gave us a look at what the inference era looks like. If Jensen's Mellanox analogy comes to fruition, we expect Nvidia to present Groq as a platform extension - most definitely not a bolt-on to its product line. It likely presents itself as a capability that preserves CUDA's "write once, run everywhere" advantage while improving latency-sensitive inference workloads. That is how Nvidia keeps its edge and inference story inside its architecture - even when the Groq deal is technically non-exclusive.

We believe GTC 2026 will be remembered as the moment Nvidia brings a much stronger inference story into its platform. Jensen's "we'll share more at GTC" hints suggest the unveiling of a Groq roadmap that is likely to reset the narrative around inference. Putting a low-latency decoder path inside Nvidia's stack will extend the useful life of the installed base, in our view. Organizations that align with Nvidia's strategy will likely see the best performance-per-watt-per-dollar improvements at a fast pace. That said, the market for inference is so large that alternatives will find success where ultra-low-latency needs, niche workloads and supply constraints create opportunities.

Inference is where revenue growth meets physical constraints - and the winners will be the companies that translate the nuances of the inference market into predictable performance, lower operating cost, and deployable systems across data centers and edge environments.

What are your thoughts on the opportunity for AI inference at the edge? Where are the opportunities? What are the risks you see and how can they be mitigated?
[4]
GTC 2026: Jensen Huang's AI future goes beyond just chat
Structured data will define trustworthy, enterprise-ready AI systems

Every year, like clockwork, GTC keynotes are judged by NVIDIA's hardware flex. More CUDA, more cores, and more things named after scientists who definitely didn't imagine being turned into GPU roadmaps. At GTC 2026, however, Jensen Huang's real message wasn't about silicon alone, but about where AI is going next, and why NVIDIA believes it's building the industrial plumbing for that future.

At the heart of Huang's argument is that AI is no longer just learning to speak - that ship sailed long ago. Starting in 2026, AI is learning to reason and act, and is starting to become useful inside the messy structure of real business - and the even messier chaos of the real world. "Computing used to be retrieval-based, now it's generative," as Jensen Huang, CEO of NVIDIA, put it during his GTC 2026 keynote. Huang and NVIDIA are signalling that AI is a core computing shift, not just a new software feature. Search and retrieval of information defined the old paradigm of digital transformation, whereas AI-based generation and decision-making will define the new one currently underway in computing.

Huang laid out AI's recent evolution almost like a staircase. "An AI that was able to perceive became an AI that could generate. An AI that could generate became an AI that could reason. An AI that could reason now became an AI that can actually do work, very productive work." Now, he argues, we are crossing into something more consequential - not just chatbots that answer, but systems that execute.

Huang's description of agentic AI made that abundantly clear. "For the first time, you don't ask AI what, where, when, how. You ask it to create, do, build. It's able to solve problems and actually perform tasks." If generative AI was the era of amazement, then Huang argues that agentic AI is supposed to be the era of labour - digital labour, where humans supervise systems and machines.

The keynote's most revealing lines had nothing to do with chips and everything to do with data. "This concept of fusing structured information and generative AI will repeat itself in one industry after another industry after another industry. Structured data is the foundation of trustworthy AI." It is an important point. After all, enterprises don't run on vibes, do they? They are grounded in data, records, processes, workflows and governance rules. AI that wants to operate inside a business will have to plug into those structures without hallucinating its way into a compliance disaster.

Huang pushed the point further by arguing that AI is finally becoming useful against the vast swamp of data businesses barely use. "About 90% of what's generated every single year is unstructured data. Until now, this data has been completely useless to the world. You can use that same technology... to go read a PDF to understand its meaning. And from that meaning, embed it into a larger structure that we can search into, we can query into."
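That pipeline - parse, embed, index, query - is straightforward to sketch. The version below uses a deterministic toy embedding in place of a real model, so everything in it is illustrative rather than any specific NVIDIA product:

```python
# A minimal sketch of the pattern Huang describes: take unstructured text,
# embed it, and make it queryable. The hashing-based "embedding" is a toy
# stand-in for a real embedding model.
import hashlib
import math

def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Deterministic stand-in for a real embedding model."""
    vec = [0.0] * dim
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# "Index" a few chunks as if they came from parsed PDFs.
chunks = [
    "Q3 revenue grew 12% driven by data center demand.",
    "Employees must complete security training annually.",
    "The vendor contract renews every January.",
]
index = [(chunk, toy_embed(chunk)) for chunk in chunks]

query = "when does the vendor contract renew"
query_vec = toy_embed(query)
best_chunk, _ = max(index, key=lambda item: cosine(query_vec, item[1]))
print("Best match:", best_chunk)
```

Swap the toy embedding for a real model and the list for a vector database, and you have the skeleton of the "search into, query into" structure Huang is describing.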
That, in a nutshell, is Huang's vision for enterprise AI - not merely generating answers, but making passive data computable. "Finally, AI is able to do productive work, and therefore, the inflection point of inference has arrived," Huang said, without mincing any words. If training built the AI boom, Huang believes inference will define its economics going forward.

It also revealed NVIDIA's strategic ambition. If AI must think and reason to act continuously, then inference becomes the main event - not just a side act. That is why Huang's language kept drifting away from data centres as storage and toward factories as production systems. "Your data centre, it used to be a data centre for files. It's now a factory to generate tokens." Huang effectively gave us the phrase that may define NVIDIA's next decade - token factory. Not a datacentre or the cloud, but a token factory.

This is where GTC 2026 started to feel like a declaration of industrial intent. Huang isn't just saying AI will get smarter. He is saying AI will become infrastructure, and that infrastructure will be measured in cost per token and usefulness per watt as much as in throughput.

So yes, NVIDIA still sells the CPUs, GPUs, AI accelerators and server racks, as Huang showcased on the keynote stage of GTC 2026. But Huang's bigger claim is that AI is heading toward systems that reason, act, and work across structured and unstructured knowledge. And to enable that future, NVIDIA wants to be the factory behind the factories.
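If a data centre really is a token factory, its unit economics fit in a few lines. The figures below are invented for illustration - rack power, throughput and prices are assumptions, not NVIDIA's numbers:

```python
# Illustrative "token factory" math: judging infrastructure by cost per
# token and tokens per watt. All figures are assumptions for illustration.

rack_power_kw = 120          # assumed rack power draw
throughput_tok_s = 500_000   # assumed aggregate tokens/second for the rack
power_price_kwh = 0.08       # assumed electricity price, $/kWh
rack_amortization_hr = 95.0  # assumed hardware cost per hour over its life

tokens_per_hour = throughput_tok_s * 3600
energy_cost_hr = rack_power_kw * power_price_kwh
total_cost_hr = energy_cost_hr + rack_amortization_hr

print(f"tokens per watt-second : {throughput_tok_s / (rack_power_kw * 1000):.2f}")
print(f"cost per million tokens: ${total_cost_hr / tokens_per_hour * 1e6:.4f}")
```

Run the numbers and the factory framing clicks: every efficiency claim - 10x throughput per watt, one-tenth the cost per token - drops straight into the last line of that calculation.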
At GTC 2026, NVIDIA CEO Jensen Huang announced a strategic shift from training to AI inference, calling it the new inflection point. The company unveiled Vera Rubin, a platform promising 10x higher inference throughput per watt, and revealed a $20 billion Groq licensing deal for low-latency inference technology. Huang positioned NVIDIA as the builder of the entire AI infrastructure stack, from data centers to orbital facilities.
At the NVIDIA GTC keynote in San Jose, CEO Jensen Huang made a clear declaration: the center of gravity in artificial intelligence has shifted from training models to deploying them. "The inflection point for inference" has arrived, Huang told attendees during the two-hour presentation, marking what he believes will define the next phase of AI economics [1]. The message wasn't subtle. AI inference—the process of running trained models to generate outputs—is now where NVIDIA sees the biggest opportunity for growth and monetization [2].
Huang's argument rested on a fundamental shift in how AI systems operate. "Computing used to be retrieval-based, now it's generative," he explained, positioning inference as the production layer where AI agents actually perform work rather than simply answer questions [4]. For NVIDIA, a company with a $4.4 trillion market cap—double what it was two years ago—this pivot addresses critical questions about whether the AI infrastructure boom can sustain itself beyond the training phase [1].

The centerpiece of NVIDIA's AI strategy came in the form of Vera Rubin, a new platform that Huang described as "a generational leap" built around seven chips and five rack-scale systems [2]. NVIDIA claims the platform can train large mixture-of-experts models with one-fourth the number of GPUs versus Blackwell and deliver up to 10 times higher inference throughput per watt at one-tenth the cost per token. The company also previewed Feynman, the generation beyond Rubin, signaling that NVIDIA's roadmap extends well into the future [2].
But Huang wasn't just selling faster chips. He unveiled a Vera Rubin DSX AI factory reference design, DSX simulation tools for planning AI factories before construction, and a broader menu of storage, networking, and system components designed to operate as vertically integrated AI infrastructure [2]. The message was unmistakable: NVIDIA wants customers to think about campuses, not servers, and to treat AI infrastructure like a utility [2].

Huang even took the vision orbital, announcing plans for Vera Rubin-based systems aimed at data centers in space for autonomous space operations, though no timeline was provided [1][2].
Perhaps the most strategically significant announcement came in the form of NVIDIA's non-exclusive $20 billion licensing agreement with Groq for low-latency inference technology [3]. On NVIDIA's last earnings call, Jensen Huang explicitly compared the Groq deal to the company's acquisition of Mellanox, suggesting it will "extend NVIDIA's architecture with Groq's innovations" the same way Mellanox extended networking capabilities [3].

The Groq technology addresses a glaring gap in NVIDIA's portfolio: low-latency inference in edge computing environments, where agentic AI systems need to deliver value with minimal delay. By integrating Groq's decoder technology into the CUDA ecosystem, NVIDIA aims to collapse one of the highest-value inference opportunities back into its control plane [3]. Industry observers see this as NVIDIA's attempt to dominate the inference land grab before competitors can establish footholds in the fragmented market [3].
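Part of what makes the low-latency framing concrete is that "latency" splits into at least two numbers with different bottlenecks: time to first token (dominated by prefill) and time per output token (dominated by decode). Here's a small sketch that measures both for any streaming generator; the stream is simulated, standing in for a real model server:

```python
# Measure time to first token (TTFT) and time per output token (TPOT)
# for a streaming generator. The generator is a simulated stand-in.
import random
import time

def simulated_stream(n_tokens: int = 50):
    time.sleep(0.35)  # simulated prefill phase: dominates TTFT
    for _ in range(n_tokens):
        time.sleep(random.uniform(0.008, 0.015))  # simulated decode step
        yield "tok"

start = time.perf_counter()
ttft = None
count = 0
for _ in simulated_stream():
    count += 1
    if ttft is None:
        ttft = time.perf_counter() - start
total = time.perf_counter() - start

tpot = (total - ttft) / max(count - 1, 1)
print(f"TTFT: {ttft * 1000:.0f} ms   TPOT: {tpot * 1000:.1f} ms/token")
```

A low-latency decoder path of the kind the NVIDIA/Groq deal targets is aimed squarely at those decode-side numbers, which is why it complements rather than duplicates batch-heavy GPU serving.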
."Claude Code and OpenClaw have sparked the agent inflection point," Huang declared, positioning AI agents as systems that execute tasks rather than simply respond to queries
1
. NVIDIA introduced Nemo Claw, a protective layer of security and stability for OpenClaw (formerly Clawdbot), now available in preview1
. The company also launched an expansive NVIDIA AI Agent Toolkit designed to help enterprises build their own models, along with the AI-Q blueprint that claims to cut query costs by more than 50% through a hybrid mix of frontier and NVIDIA's own open models2
.Huang described agentic AI as a fundamental shift: "For the first time, you don't ask AI what, where, when, how. You ask it to create, do, build. It's able to solve problems and actually perform tasks"
4
. NVIDIA also unveiled the Nemotron 3 omni-understanding model for complex reasoning, and announced the Nemotron Coalition with partners including Black Forest Labs, Cursor, LangChain, Mistral, and Perplexity1
2
.Huang previewed GR00T N2, a next-generation robot foundation model based on DreamZero research that NVIDIA claims more than doubles success versus leading VLA models on new tasks in new environments
2
. The physical AI segment featured 110 AI-powered robots from NVIDIA-partnered companies, including a somewhat awkward interaction with an Olaf robot from Disney's Frozen1
.
Industry analysts suggest this physical AI push may be the most strategically important announcement from the NVIDIA GTC keynote. While chatbots sparked initial investor excitement, robots, industrial systems, and autonomous machines require endless training data, simulation, networking, sensors, and edge compute—potentially sustaining the infrastructure spending boom for years [2].
Huang made a compelling case for why enterprise AI requires more than generative capabilities. "Structured data is the foundation of trustworthy AI," he stated, emphasizing that businesses need AI systems that plug into existing data, records, processes, and governance rules without hallucinating [4]. He noted that approximately 90% of data generated annually is unstructured and largely unused, representing a massive opportunity for AI systems that can extract meaning and embed it into searchable structures [4].

Huang introduced a new framing for data centers: "Your data centre, it used to be a data centre for files. It's now a factory to generate tokens" [4]. This concept of token factories positions NVIDIA's AI strategy around measuring infrastructure by cost per token and usefulness per watt, treating AI deployment as an industrial production system [4].

Not all announcements landed well. NVIDIA revealed DLSS 5, the next iteration of its AI upscaling software coming this fall, which the company describes as a "breakthrough in visual fidelity" that "infuses pixels with photorealistic lighting and materials" [1]. Huang showcased comparisons using Resident Evil: Requiem, Hogwarts Legacy, and Starfield. However, gamers took to social media frustrated that NVIDIA was attempting to "fix" games like Resident Evil: Requiem, which had already won praise for its graphics [1].

The keynote concluded with AI-generated country music featuring Huang's avatar and robots around a campfire, leaving the audience with two possible futures: one where AI agents help launch NVIDIA into the stratosphere, and another where agents deliver little ROI and the company's market position weakens [1]. For now, NVIDIA's bet on the next AI boom belonging to inference represents a calculated attempt to justify its massive valuation and define the infrastructure layer for the next decade of AI development.

Summarized by Navi