[1]
RTX 5070 gaming laptops will be my new gold standard -- here's why
Ever since Nvidia CEO Jensen Huang boldly stated at CES 2025 that an RTX 5070 would come with the same performance as an RTX 4090, I've been holding off on browsing the powerful gaming laptops I dream of owning. Why? Well, this middle-of-the-range graphics card may make paying a premium for PC gaming redundant.

With the announcement of RTX 50-series GPUs, I've been wondering just how impressed I would be with their general gaming performance. We got a hands-on look at RTX 50-series GPUs and were mightily impressed, but there's still the question of how that performance would translate over to a gaming laptop. Sure, it's likely to knock the socks off the RTX 40-series lineup, if Nvidia's performance claims are anything to go by, but would an RTX 50-series GPU be worth a full-blown (and pricey) laptop upgrade?

After testing gaming laptops for a good few years, I've already been stunned by the performance of premium -- and even budget -- gaming laptops that sport anything from an RTX 3060 to an RTX 4090. If an RTX 50-series GPU offers more of the same, and many gamers are already more than happy with their current graphics and frame rates, then there's hardly much incentive to invest in an expensive gaming laptop.

But the RTX 5070 is set to change that. Thanks to AI-enhancement features like Deep Learning Super Sampling (DLSS) 4 with Multi Frame Generation, high-end gaming on a laptop is about to get easier to access. Nvidia RTX 50-series gaming laptops are set to arrive in March, but I'm already willing to bet that these laptops will be my new gold standard.

While there's always power and performance to think about, price is arguably one of the biggest factors when choosing a gaming laptop. For me, it's the reason why I've always held off getting an RTX 4090-equipped laptop like the Alienware m18 R2 -- anything close to $4,000 is a huge chunk of change. I was lucky enough to grab an Asus ROG Strix Scar G17 with an RTX 3080 for a massive discount a few years back, and it's been my reliable daily driver ever since. Still, that retailed at over $3,000 at full price at the time, which would have made me shy away from it without the price cut. Now, cut that price by more than half, and I would be far more keen on grabbing it. And I'm sure many other PC gamers would be, too.

Fortunately, Nvidia's starting price for RTX 5070 gaming laptops fits that bracket: $1,299. The power of an RTX 4090 in a $1,299 RTX 5070 gaming laptop? Now that's a slogan. Heck, I'd even go up to an RTX 5070 Ti if I'm willing to push the budget. Not everyone needs the highest-priced hardware to play their favorite PC games, whether it be Cyberpunk 2077 or Black Myth: Wukong, and I'm firmly in that camp. But getting a fairly priced gaming laptop that pushes the limits on demanding PC titles is a winning scenario.

As a reality check, though, I don't expect those prices to play out exactly as advertised. Listings of RTX 50-series gaming laptops are already popping up, and according to Best Buy, the upcoming Asus ROG Zephyrus G16 with an Intel Core Ultra 9 and RTX 5070 will set you back $1,999. (Interestingly, the Asus ROG Strix G16 with an RTX 5070 Ti is at $1,899.) It isn't uncommon for laptop manufacturers like Asus, HP, Razer and more to price above what Nvidia pitches. After all, there are processors, QHD OLED displays with 240Hz refresh rates, RAM and storage that add to the overall cost. So, while they more than likely won't be as affordable as Nvidia's ballpark pricing suggests, an RTX 5070 laptop will still be a lot cheaper than the RTX 4090-equipped laptops that continue to set you back around $4,000.
With 8GB of GDDR7 compared to an RTX 4090's whopping 24GB of GDDR6X VRAM, it's hard to believe an RTX 5070 can match the gaming potential of Nvidia's last-gen powerhouse. But AI is coming in to give it a major boost. Thanks to DLSS 4 with Multi Frame Generation, Nvidia claims we can expect frame rates accelerated by up to 8x. Playing games at 30 frames per second (fps)? Expect them to be pushed to over 200 fps on an RTX 50 graphics card -- and that includes an RTX 5070.

"With the invention of DLSS 4 with Multi Frame Generation, working in unison with the complete suite of DLSS technologies, we can multiply frame rates by up to 8X over traditional brute-force rendering and provide image quality that is better than native rendering," Nvidia states.

There's the brewing argument over "fake frames," which holds that RTX 50-series GPUs aren't showing off their true performance. However, for casual PC gamers who want to enjoy demanding games at their finest, these frames will be highly beneficial. That's not all, of course, as there's also Nvidia Reflex 2 to reduce latency by up to 75% and next-level graphical fidelity and detail with Nvidia RTX Neural Shaders. We have yet to see this in action on an RTX 50-series laptop, but if our first look at RTX 50-powered gameplay is to be believed, then I'll have a lot to like.

Considering that an RTX 4090 can pull off insane detail with smooth frame rates clocking over 200 fps with DLSS 3, and that an RTX 5070 will match that through enhanced AI features that require less power, there's good reason why an RTX 5070 is on my radar. The RTX 5070 for gaming laptops may not deliver the giant leap in graphics of its more powerful siblings (the RTX 5080 and RTX 5090), but it will finally bring high-quality gaming to even more PC gamers at a much less intimidating price. For those who missed out on an RTX 4090 (which includes myself, aside from some testing), there's now a chance to see the highest grade of gaming on a mobile machine -- one that can be as sleek and portable as an Asus ROG Zephyrus G14. Sure, the RTX 5080 laptop GPU is sure to impress even more, and the expected RTX 5060 lineup of gaming laptops is sure to offer great value, but an RTX 5070 gaming laptop will hit that sweet spot for me when it comes to price and performance, and I'm sure many others will want to scratch that RTX 4090 itch for much less, too.
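For reference, the "up to 8x" figure quoted above is easy to sketch as back-of-the-envelope arithmetic. The split below between an upscaling gain and the extra generated frames is my own illustrative assumption, not an official Nvidia breakdown:

```python
# Illustrative arithmetic for the "up to 8x" DLSS 4 claim above. The 2x
# upscaling factor is an assumption for this sketch; the 3 generated frames
# per rendered frame come from Multi Frame Generation as described.
def effective_fps(native_fps, upscaling_speedup=2.0, frames_per_rendered=4):
    """Estimate on-screen frame rate after upscaling and frame generation."""
    return native_fps * upscaling_speedup * frames_per_rendered

print(effective_fps(30))  # 240.0 -- roughly the "30 fps pushed to over 200 fps" example
print(240 / 30)           # 8.0   -- the headline multiplier
```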
[2]
I got an exclusive look at Nvidia's RTX 50-Series GPUs -- 5 big reveals
Nvidia's RTX 50-series GPUs were easily the biggest news of CES 2025 -- ushering in the next generation of PC gaming with hardware upgrades and impressive AI-driven improvements. But in terms of going hands-on, we've only been able to go skin deep. Sure, I got to try out the RTX 5090 with Black Myth: Wukong and Black State, but one of those demos only showed the framerate and the other we just had to guess with no on-screen data. So to get more time and experience with these new cards, I got to go behind the scenes at a private Nvidia event to find out a whole lot more about the new RTX Blackwell Architecture, and even see some particularly strenuous tests.

With 4K 240Hz monitors very much being a thing (I got to try Asus' impressive option out too), Nvidia's new GPUs are targeting this standard. In the slides shown to us, internal benchmarking suggests the company hit it, and in my own time with it, I can confirm that is absolutely the case. So what is the first demo I rushed to for verification of this? Cyberpunk 2077 has become the 2020s equivalent of "can it run Crysis?" Of course I made a beeline over there to see what the new GPUs can do, and to say I was blown away would be an understatement.

This is at maxed-out settings, and sprinting around Night City, I saw a maximum of around 265 FPS, and a low of 245 FPS. Every detail is crisp, the fidelity is incredible and the smoothness is fantastic without much in the way of noticeable latency -- all thanks to DLSS 4 (more on that later). On top of that, I also saw Black Myth: Wukong and Black State either come damn close to or even exceed this 240Hz target at 4K max settings. One thing to note is that changing between the various quality and performance implementations of DLSS (quality = better picture with fewer frames, and performance = prioritizing frame rate) only resulted in a 10-15 FPS difference in my hands-on experience.

So with the RTX 50-series GPUs comes Deep Learning Super Sampling (DLSS) 4 -- AI game enhancement tech that uses a machine learning model trained on the game to boost resolution and improve frame rates, while easing pressure on the GPU. Yes, these features are coming to older GPUs too (I've popped the graph below to show you support all the way back to the RTX 20 series), but you're getting everything with the 50s.

The secret sauce to this is changing from a convolutional neural network (CNN) to a transformer model. Let me explain -- in older versions of DLSS, the CNN works to find patterns in the in-game image to do things like sharpen up the graphics and increase the frame rate. A CNN relies on the layers it sees before it, and while it's a solid system, there are some issues that come from it, like ghosting (seeing the outline of a fast-moving item on screen follow behind it). Meanwhile, a transformer model is similar to what you see in the likes of ChatGPT and Google Gemini, and Nvidia is using this for DLSS 4. According to Nvidia, this "enables self-attention operations to evaluate the relative importance of each pixel across the entire frame and over multiple frames." Put simply, this new version of DLSS is able to think a few more steps ahead. No, it's not predicting the future like CEO Jensen Huang claimed in an interview recently, but the end result of moving to a more intelligent transformer is drastically better frame rates and image quality, while also tackling some of the key gripes I had with older versions.
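To make that quoted line about self-attention a little more concrete, here is a toy NumPy sketch of the generic operation a transformer performs. It is not Nvidia's DLSS 4 model; the shapes and values are made up purely for illustration:

```python
# Toy self-attention over a handful of "pixel" feature vectors. Real models
# use learned query/key/value projections and many attention heads; this
# just shows every pixel weighing every other pixel across the frame.
import numpy as np

def self_attention(x):
    """x: (num_pixels, features) -> globally mixed features of the same shape."""
    d = x.shape[-1]
    q, k, v = x, x, x                              # learned projections omitted
    scores = q @ k.T / np.sqrt(d)                  # pairwise relative importance
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over the whole frame
    return weights @ v

pixels = np.random.default_rng(0).normal(size=(6, 8))  # 6 pixels, 8 features each
print(self_attention(pixels).shape)                     # (6, 8)
```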
Heading back to that Cyberpunk demo, for example, the neon holograms are the enemy of older DLSS versions and AMD's FSR tech. It's been challenging for any CNN to render them fully without flickering and fuzzy textures, as they are a tricky mix of frame generation, upscaling and ray tracing to predict. With DLSS 4, however, 99% of that is gone. You'll have to be looking hard to find anything wrong. The leaves on this holographic tree are completely clear and there is zero sign of any flickering around it. The only spots I saw were some slight jitters on in-game HUD elements and the tiniest bits of ghosting around bright screens in especially dark areas.

But make no mistake about it -- there is a marked upgrade here. The Super Resolution and Ray Reconstruction transformer really help give even the tiniest of details a real sense of existence in these in-game worlds, while massively upping the FPS. And on top of that, the new transformer model is 40% faster and uses 30% less video memory. That means Nvidia can redeploy that memory to other things.

Trust me when I say my eyes lit up when I saw Jensen exclaim you're going to get the performance of the $1,599 RTX 4090 in the $549 RTX 5070. And after talking to our Global Editor-in-Chief Mark Spoonauer, who was in the crowd at the conference, the reaction was loud. This would be huge for the price to performance ratio, but of course I needed to see this for myself. And in short, Huang was telling the truth...sort of.

It comes down to that Multi Frame Generation aspect of DLSS that is available exclusively on RTX 50-series cards. With DLSS 4 on the 50-series, you're not just limited to generating one additional frame; that transformer model can now generate three. For the comparison, Nvidia booted up Marvel Rivals and the frame rate differences are stark. With the RTX 4090 generating one additional frame, you're seeing around 180 FPS, whereas on the 5070, that goes up to nearly 250 FPS -- all on the same graphics settings across both cards (the only difference being Multi Frame Gen). In other games, Nvidia was quick to emphasize that the two GPUs are more equal in terms of frames, and that Marvel Rivals was picked for the demo due to it being exceedingly strong in frame generation.

And there's the twist. Jensen's claim is entirely dependent on developers supporting Multi Frame Generation. If the game doesn't support it, then of course the RTX 4090 with its vast amounts of additional power and VRAM is going to be better. But I feel conflicted in calling it a twist, because what Nvidia has pulled off here is seriously impressive. Provided that enough developers end up putting Multi Frame Gen in their games, then this could easily be the best GPU you can buy.

If you thought Nvidia was going hard on AI game enhancements before, the company's taking things up a notch. You're seeing a lot of new RTX features coming from Nvidia with the word "neural" in the name: from Neural Rendering for full ray and path tracing and Neural Shaders using AI to enhance in-game textures, to Neural Faces using generative AI trained on real faces to cross that uncanny valley of facial expressions.

RTX Neural Materials was the first standout feature for me when looking at the demos in person. Typically, a game developer will have to bake in the texture of an object or surface and the rules of how it interacts with the rest of the world's lighting and ambiance. Now, with RTX Neural Materials, AI compresses that code and makes processing of it up to 5x faster.
The end result from what I saw is that multilayered materials like the silk pictured above look dramatically closer to how they should look -- rather than just a shiny foil-like piece of cloth, there's a thread count and the color changes with light diffraction.

Coming along to the party too is RTX Mega Geometry, which massively increases the amount of triangles that build everything you see in a game for insane levels of granular detail. Typically, if a game is adding more detail to the scene, it has to go through a rebuild of all the on-screen elements, which can be very costly to the GPU. It's something you see happen in cutscenes, as developers up the detail on characters in close-up shots talking, and then reduce that load when you don't need it. With Mega Geometry, Nvidia's offering a way to stream in these details in real-time without needing to rebuild everything else. This means two big things for you, the gamer: more on-screen detail, without the usual GPU cost of rebuilding the scene to get it.

The typical stat people look for on a spec list of a GPU is GDDR video memory (VRAM) -- the memory that stores essential graphical instructions in the background that the game will need to constantly refer back to. These could be anything from textures to lighting, facial expressions to animations. To run most AAA games at their highest settings, you need a lot of it, and Nvidia does deliver that on the RTX 5090 with 32GB of the fastest GDDR7. But what the onboard AI is actually capable of doing here is really quite clever -- textures can be compressed to take up as much as 7x less video memory than before, while the Neural Materials feature works in tandem to process materials up to 5x faster. As for the size of some key textures in the demo Nvidia showed me, they have gone down from a demanding 47MB to 16MB. That's a huge reduction that frees up the memory needed for Mega Geometry.

It became abundantly clear that Nvidia's taking the "work smarter, not harder" approach, and while I know there are going to be some PC gaming purists who would prefer to turn off all these AI features and just have pure rendering on the GPU itself (I saw you in the comments), this is the direction the company has decided to take. And based on what I've seen, I think they're on the right track.

Latency is something you hear a lot about when it comes to Nvidia's DLSS tech. While the frame generation technique in the background does make the game look smoother, it does not improve latency, since it's a piece of AI trickery rather than brute forcing the issue with raw horsepower on the GPU. So watching DLSS 4 generate an extra two frames beyond the one of DLSS 3 did ring some alarm bells amongst us at the event -- sticking more frames in there could cause additional latency. However, those fears were quickly put to bed by running around Night City. From what I saw, those additional two frames in Multi Frame Gen don't add any additional latency on top of what you saw in DLSS 3.

And this potential issue is a relative one. If you're playing at a lower framerate with DLSS, you're probably going to notice it more as the latency time between frames increases (the quick frame-time sketch below puts rough numbers on this). But if you're going for the 100+ FPS target (and let's be honest, that's probably what you're going to get pretty much all the time), then most people will not notice it. But there is an additional weapon in Nvidia's arsenal in the form of Reflex 2 -- the second-generation latency-reducing tech that aims to make games more responsive.
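Here is that frame-time arithmetic, a simplified sketch of my own rather than an Nvidia figure: generated frames smooth what you see, but input latency still tracks the rendered frame rate, so a low base frame rate leaves a much bigger gap to feel.

```python
# Simplified frame-time math: input latency roughly tracks the rendered rate,
# while Multi Frame Generation multiplies what is shown on screen.
def frame_time_ms(fps):
    return 1000.0 / fps

for base_fps in (30, 60, 120):
    displayed = base_fps * 4            # 1 rendered + 3 generated frames
    print(f"rendered {base_fps:>3} fps -> ~{frame_time_ms(base_fps):4.1f} ms between "
          f"rendered frames, displayed at {displayed} fps")
# rendered  30 fps -> ~33.3 ms between rendered frames, displayed at 120 fps
# rendered  60 fps -> ~16.7 ms between rendered frames, displayed at 240 fps
# rendered 120 fps -> ~ 8.3 ms between rendered frames, displayed at 480 fps
```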
This one is definitely more for the esports crowd looking to reduce those milliseconds, but in my time with The Finals, that latency time was so small! And the way it's being pulled off is seriously awesome. The GPU can analyze what the next frame will be based on what your mouse and keyboard inputs are, while ensuring a precise synchronization of rendering graphics across your whole PC. If competitive gaming is your thing, this is a generational step forward for reducing latency. And that, in a generously sized nutshell, is what RTX 50-series is all about -- big impactful changes to the way gaming graphics works to usher in the next generation. Games of today will be able to render faster, look sharper and run smoother. And games of the future could be packed with so much detail it would take you pressing your face on the screen to actually find the pixels. Plus, while conversation around DLSS may lean towards it being a workaround in the more power-hungry areas of the PC gaming community, over 80% of RTX players activate it while gaming -- a huge adoption statistic that may suggest this has all been a bit blown out of proportion. Trust me, for that 80+% of you out there, you'll love what Nvidia's cooking here.
[3]
Nvidia's RTX 5090 still can't game in 4K without DLSS 4
Nvidia's Blackwell chips offer a 10-30% hardware boost for gaming, but the rest is all software.

It's finally time for us to find out if the Nvidia "Blackwell" GPU hype is real. Nvidia launched the newest generation of gaming graphics cards, codenamed "Blackwell" after mathematician David Harold Blackwell, at CES 2025 in Las Vegas last week. Nvidia has made plenty of claims about the "Blackwell" 50-series GPUs when it comes to power and performance, not the least of which is the argument that the RTX 5070 can get similar performance to the RTX 4090. Between the impressive gaming claims and the fact that the Nvidia GeForce RTX 50-series GPUs are built for AI and gaming workloads, we can't help but ask: is "Blackwell" worth the investment?

During the Nvidia Editor's Day at CES, the company gave attendees an early look at the architecture changes, AI optimizations, and performance expectations of the RTX 50-series GPUs. When it comes to pure gaming power, the RTX 5090 flagship GPU still can't quite game at 4K on max graphics presets. On Star Wars: Outlaws, Cyberpunk 2077, Black Myth: Wukong, and Alan Wake 2, the 5090 still can't break that 60 fps (frames per second) gaming threshold for playability. Granted, you can get away with playing these games at 30 fps, but in any game with gunfights, you'll want to opt for higher framerates whenever possible.

That's where DLSS 4 comes in. With the latest iteration of Nvidia's frame generation technology, you'll be able to game in 4K with frame rates over 200 fps. Of course, you're relying on software performance to generate all of those additional frames rather than pure silicon, and not all games offer DLSS 4 support. You can also lose visual fidelity with upscaling and frame-gen software like DLSS, as it upscales a lower-resolution image using AI. So you're playing the game at a lower resolution like 1440p, but the GPU's AI pipeline makes it appear like you're playing in 4K. For that reason, plenty of gamers won't use frame-generation software like Nvidia's DLSS, AMD's Fluid Motion Frames, or Intel's XeSS. In fact, the 5090 is only about 10% better than the RTX 4090 when it comes to pure silicon performance.

While this information is technically about the RTX 5090 desktop GPU, it is still relevant for gamers on the laptop side, because it helps explain why you don't see gaming laptops with 4K displays anymore. We're still too far off the mark for 4K gaming to be viable. Nvidia did also share information about the performance expectations of the RTX 5080 Laptop GPU, which will have 16GB of VRAM onboard. As Nvidia's chart shows, with RT and DLSS 4 enabled, the RTX 5080 can offer almost twice the gaming performance of the RTX 4080. But on gaming titles and workloads that don't offer DLSS 4 and RT, the performance of the 5080 is just 10-30% better than the RTX 4080.

Nvidia has also leveraged the AI features of its new GPUs for better power efficiency in gaming, offering up to 40% longer gaming sessions on battery power, and up to 30% longer web browsing and video playback on battery with the optimized BatteryBoost system. The new battery AI can save power during low scene motion, marginal pixel changes, and minimal player interaction scenarios. It can also optimize your display refresh rate and offer faster PCIe, SSD, memory, and IO power states to squeeze as much battery life out of your system as possible. Laptops with the new RTX 50-series cards will be available starting in March.

Nvidia's "Blackwell" GPUs are built for AI.
The cards have an AI controller that can help with frame generation, which makes DLSS 4 smoother than previous iterations. The GPUs are also built with both gaming and AI workload pipelines. While most gamers probably don't want to use their GPU for generative AI tasks most of the time, Nvidia is dedicated to bringing AI to the gaming sphere with additional advancements in gaming AI.

Nvidia showcased AI for gaming with new integrations of its Ace platform, working with partners like Inworld. We've seen Nvidia Ace used for controlling NPC dialogue in various tech demos before, but at CES 2025 the company showcased multiple additional iterations of the AI software. Nvidia was demoing Ace as an AI companion in your PlayerUnknown's Battlegrounds matches as the PUBG Ally. While the PUBG Ally can join your matches, it doesn't function as an aimbot and play for you; it is a virtual companion that can offer advice and give you gameplay tips on the fly. However, because the PUBG Ally exists in the game with you as a digital avatar, it can still feel like a cheat: you can use the PUBG Ally to pick up items and hand them to you, which can give you advantages over other players. The PUBG Ally iteration of Nvidia Ace is not expected to ship as an official part of PUBG, as it runs entirely on the client side.

Nvidia was also demoing Ace as a streaming companion as part of a collaboration with Streamlabs, giving you an assistant for your livestream that can control your camera, put together highlights, and adjust your stream feeds. The Ace Streaming Companion does not have an expected launch window, but the Streamlabs Intelligent Streaming Assistant will be coming soon. Nvidia is bringing Ace to the game inZOI, allowing you to have better control of your Smart Zois. This will run on the client side but is expected to be a featured part of inZOI when it launches on March 28, 2025. Lastly, Nvidia is using Ace to create AI raid bosses for Mir5, which will alternate strategies to provide additional challenges to players. Because raids aren't difficult enough without AI.

The RTX 50-series GPUs are built for AI, and also for gaming. But if you're a pure hardware performance gamer who wants to play games at max settings natively at 4K, the RTX 50-series is something of a disappointment. While you can game in 4K at lower graphics presets, or with better optimized titles, the RTX 5090 is still not capable of making 4K gaming viable for the competitive player. Additionally, because these cards are as well designed for AI as they are for gaming, the prices of the RTX 50-series are incredibly high. The cards also require a large amount of power and thermal management, which doesn't say great things for your electric bill. The RTX 50-series does have at least 10% pure silicon performance upgrades generation to generation, which is nothing to sneeze at. However, if you were hoping for this latest generation of Nvidia GPUs to suddenly make 4K gaming competitive, make 8K single-player gaming a possibility, or lower the cost of gaming with a discrete GPU, the results fall flat.
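As a footnote on the resolution trade-off described above, the pixel counts alone show why rendering internally at 1440p and upscaling to 4K is such a win for the GPU. The numbers below are just standard resolution math, not Nvidia benchmarks:

```python
# Standard resolution arithmetic: native 4K shades far more pixels per frame
# than an internal 1440p render that DLSS-style upscaling outputs at 4K.
PIXELS_4K    = 3840 * 2160   # 8,294,400 pixels per frame
PIXELS_1440P = 2560 * 1440   # 3,686,400 pixels per frame

print(PIXELS_4K / PIXELS_1440P)  # 2.25 -> native 4K is 2.25x more shading work
```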
[4]
GeForce RTX 50 Series Demystified: How AI Is Poised to Elevate (and Accelerate) Gaming Graphics
For as long as I can remember, I've had a love of all things tech, spurred on, in part, by a love of gaming. I began working on computers owned by immediate family members and relatives when I was around 10 years old. I've always sought to learn as much as possible about anything PC, leading to a well-rounded grasp on all things tech today. In my role at PCMag, I greatly enjoy the opportunity to share what I know.

Anyone with an eye on the markets in the 2020s knows Nvidia has been all about artificial intelligence in recent years. (Look at its world-beating data-center efforts, and its stock price!) But the impact and benefits of AI as it pertains to computer graphics are still evolving and, by and large, not fully understood by the general public. It's a lot easier to grok what a ChatGPT or a Stable Diffusion does. With the upcoming, much-anticipated GeForce RTX 50 series of graphics cards, however, Nvidia will push several new AI-based technologies that could have the most impactful role on computer graphics and the gaming industry of any AI tech released to date. We'll have to wait to see how widespread these technologies will be in popular games, how many developers adopt them, and how quickly. But here's a teaser of what AI power potential the RTX 50 series will possess.

Nvidia's RTX 50-Series AI Architecture

I have covered the new RTX 50 series--which will arrive first in desktop cards and soon after in mobile GPUs--in other articles, including information on RTX 50-series GPUs' core counts and other basic specifications. At an all-day briefing with Nvidia during CES 2025, PCMag received additional details worth touching on before diving into the RTX 50 series' many AI features. This is partly because these new details around the RTX 50 series architecture, "Blackwell," directly relate to the AI hardware and show what Nvidia has focused on in improving its silicon.

AI has been a key focus for Nvidia in recent years, but the RTX 50 series is the first GPU family built with such an extensive focus on AI features and workloads. The AI hardware is now fully functional within standard graphics workloads, enabling it to help boost performance even more than in previous-generation GPUs. Because the AI hardware also serves several other functions, Nvidia equipped RTX 50-series graphics cards with an AI management processor (AMP). The AMP helps manage the AI hardware's time between tasks, such as running large language models (LLMs) and processing game code.

The RTX 50 series will also adopt GDDR7 memory, which delivers greater bandwidth than GDDR6 and GDDR6X could support. It's also more energy efficient, which could be a significant plus for the next generation of gaming laptops. Indeed, with the Blackwell architecture, Nvidia noted that it had several key goals: to optimize the silicon for neural workloads, to reduce the memory footprint, and to address that key question of energy efficiency. The last could be a boon for laptop gaming, which has typically been bound to plugged-in use. We'll have to see when the first RTX 50-based laptops launch, but Blackwell could show significant improvement in the off-plug gaming experience.

The neural rendering architecture is what's at the core of Blackwell's advances, however. Fundamentally, a technology dubbed Cooperative Vectors in DirectX is poised to enable shaders to tap into the power of the Tensor cores. This opens up a new way of handling shaders, which govern challenging graphics facets such as textures and materials.
That's one way the move of AI into PC graphics portends some new paradigms: having AI hardware do predictive work rather than rendering every last pixel the old-fashioned way. As Nvidia puts it, image quality, image smoothness, and responsiveness are the three pillars of graphics performance. Everything done in the field is a trade-off among these three factors. For example, one way to improve perceived responsiveness is to make graphics look worse (say, shift to 1080p resolution from 4K). Another way is to add more horsepower via more GPUs or more GPU power--but there's a limit to that, of course.

The company's DLSS technology, in its various forms, is a way to do this without pushing the hardware envelope. DLSS and AI take advantage of the fact that there is a lot of structure and signal in what we see, and therefore there is redundancy and patterns. AI unlocks that and lets a GPU improve performance via shortcutting how you display some of that redundancy. Nvidia has had a supercomputer working on DLSS improvements for getting on six years now, and it claims 80%-plus of RTX card owners use DLSS at some time or other. It's not a new technique, but the advances this time are bigger than ever.

As ever, though, DLSS and its ilk come down to adoption by developers. Today there are 540 DLSS games, with, according to Nvidia, 15 of the top 20 games from 2024 supporting DLSS. The new version, DLSS 4, should show up in 75 "Day 0" games with the launch of RTX 50 series. More about DLSS in a bit.

RTX Neural Shaders, Neural Materials, and Mega Geometry

Integrating AI into existing tasks and applications is one of the biggest challenges with using AI technology. On the consumer front, much effort has gone into making dedicated AI software, such as ChatGPT and image generation apps, but now Nvidia is working with Microsoft to use AI for in-game workloads that work in the background to make your graphical and gaming experience better. The idea is that you'd never know that AI processes are behind it all.

Take the RTX Neural Shaders tool. It uses AI hardware to perform functions similar to what conventional graphics hardware does. In particular, Neural Shaders can handle vector processing, a key task typically performed by a GPU's shaders. Support for this doesn't appear to be here quite yet, but Nvidia indicated a collaboration with Microsoft to integrate this functionality into DirectX, making it easier for game developers to implement.

Another related technology, RTX Neural Materials, is also designed for in-game use to improve image quality. This tool trains the AI hardware on texture data and then has the AI create in-game textures based on this data. In a way, this isn't so different from how games currently use textures. But when textures are used in games, they are typically directly taking an image with the texture and applying it to an in-game object, possibly with some post-processing effects to skew the image or adjust its lighting. From the sounds of it, RTX Neural Materials will do more or less the same thing, but instead of using the texture image directly, it will use a unique and potentially higher-quality AI-generated image based on the texture image. In so doing, it can also take a load off other parts of the GPU.

Another AI tool, RTX Mega Geometry, helps with texture and image quality. This technique examines in-game geometry to improve the level of detail on rendered objects at various distances and viewing angles. In short: Less GPU brute force, more AI shortcutting.
RTX Skin, Hair, and Neural Faces

Those technologies mentioned above focus on improving image quality and performance for in-game objects, but Nvidia has also worked on similar tools for character models. The RTX Skin and RTX Neural Faces efforts both attempt to improve character models by using AI to create more original and realistic facial features. RTX Skin, for one, is a realism advance, given that skin is hard to render believably. RTX Skin uses subsurface scattering algorithms to evoke the translucent quality of the skin and the effect of light on it. Nvidia noted that it was inspired by techniques used by Disney/Pixar.

Hair is hard, too. Another similar tool, RTX Hair, will work to try and make more realistic hair for character models while, at the same time, reducing the number of triangles involved to render that believable hair. Rather than drawing lots of triangles to compose each strand, RTX Hair will help reduce the workload by using what Nvidia calls "linear swept spheres," basically cylinders with spheres as endcaps. This technique means the graphics engine needs to draw fewer polygons to render a strand. Nvidia suggests it could take just a third of the resources that pure triangles would demand.

Nvidia DLSS 4 and Reflex 2

Unquestionably, Nvidia's most anticipated new technology supporting its consumer graphics in 2025 is the aforementioned DLSS 4. With DLSS 4, Nvidia switches to a Transformer model from a convolutional neural network (CNN) for its AI-based upscaling and frame generation work. According to Nvidia, this uses twice as many parameters and four times the compute power to create images with higher overall image fidelity. The Transformer model also supports higher-quality ray reconstruction and improved upscaling technology. Graphical challenges like motion trails can be resolved more easily.

With DLSS 4, Nvidia is also moving to "multiframe generation." Instead of running two models per frame, DLSS 4 will be running five per frame. As a result, using DLSS 4's frame generation, an RTX 50-series graphics card's AI hardware can generate up to 15 pixels for every pixel created by the GPU's traditional rendering hardware (the short sketch below shows where that figure can come from). It's not clear yet how effectively this will scale across the line of cards, but it suggests a significant increase in the frame rate RTX 50-series cards will be capable of outputting over previous generations of Nvidia graphics cards, when the tech is engaged. It's certainly behind CEO Jensen Huang's CES-keynote claim that an RTX 5070 can outrun an RTX 4090. DLSS 4, not magic, comes into play there.

The original frame that the GPU hardware creates is upscaled from a lower resolution using more traditional DLSS technology to achieve this high output level. Then, frame generation comes in, but instead of making just a single frame, the frame-generation technology is used to push multiple artificially created frames. The exact number will likely depend on your GPU's capabilities. This feature is exclusive to RTX 50-series graphics cards, and, as noted, at least 75 games will support it at launch.

Why was this not done sooner? The resulting image quality simply wasn't good enough, Nvidia's engineers note. After all, if you'll be looking at more generated frames than classically rendered ones, the quality has to be good! Also, it presented issues with frame pacing, or keeping frames fed in sync with the display hardware. (Generating more frames doesn't help the user experience if they show up in uneven or, as Nvidia put it, "lumpy," intervals.)
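For what it's worth, the 15-pixel figure falls out of simple arithmetic, assuming 2x-per-axis upscaling and three generated frames per rendered frame. This is my own back-of-the-envelope reading of the claim, not an Nvidia-supplied breakdown:

```python
# Where "up to 15 AI-generated pixels per rendered pixel" can come from,
# assuming 2x-per-axis upscaling and three generated frames (an assumption
# for illustration, not Nvidia's official breakdown).
upscale_per_frame = 2 * 2                 # 1 rendered pixel -> 4 output pixels
frames_out = 1 + 3                        # 1 rendered frame + 3 generated frames
total_output_pixels = upscale_per_frame * frames_out  # 16 output pixels
print(total_output_pixels - 1)            # 15 of the 16 are AI-generated
```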
With Blackwell, the frame generation system should show a whopping 5x improvement for frame times. What does all of this translate to in real-world testing? We'll have to see, but Nvidia showed off a demo of that ever-faithful benchmark game, Cyberpunk 2077. An RTX 5090 ran at 27 frames per second (fps) in Cyberpunk with no DLSS, 70fps in DLSS 2, 141fps in DLSS 3, and a stunning 250fps in DLSS 4. In theory, with an RTX 5090, 240Hz gaming at 4K could be a thing in demanding AAA games that have the proper DLSS support. Sure, that is the very leading edge of PC gaming, but high-refresh 4K is now in view.

Alongside DLSS 4, Nvidia has a newer version of its Reflex technology that improves responsiveness. According to the company, Reflex 2 provides 75% faster response times than the original version of Nvidia Reflex. This is mostly one for the esports crowd, with the feature coming to top shooters like The Finals: Next Stage and Valorant. Developers should get the tools to implement these in the next month or so, so look for them in actual games later this year.

Another exciting feature of DLSS 4 for graphics tweakers is an Override option built into the Nvidia App. It allows you to force a different version of DLSS onto a game, on a per-game basis. This means you can experiment and try to impose DLSS 4 in games that only support DLSS 3, for example, or push a lower version of DLSS on a game that supports a higher one, should you have a desire to try for performance or quality reasons.

RTX 5090 and RTX 5080 Coming Soon!

We've covered the most important AI-related RTX 50-series features here, but it's not an exhaustive list. Nvidia showed off several more, including some that it hyped in previous years, like AI-driven non-player characters or NPCs in games, an AI lighting feature for webcams, and a tool that creates a podcast out of a PDF. Though neat and potentially useful, these items don't carry quite the same wide-ranging potential impact as the features we have covered. It's also unclear which of these features will be exclusive to the RTX 50 series, as some, like that PDF-to-podcast tool, might work on older generations of GPU. There's not a clear idea when many of these features will launch, either; much of it is developer-uptake-dependent. Still, with the release of the first RTX 50 series cards around the corner, you may not have long to wait to try out these features for yourself. These days, RTX leads the way.
[5]
Nvidia Announces Their New GeForce 50-Series Graphics Cards
Nvidia may be primarily an AI company, but it returned to its computer gaming roots by launching some powerful new GPUs.

Only a few years ago, Nvidia was a gaming company. They made server-grade graphics cards -- GPUs -- for commercial software modeling, but this was a relatively small business. Most of their income, attention, and value came from PC gamers via their GeForce division; therefore, most people watching their presentation at the Consumer Electronics Show were computer gamers looking to buy a new graphics card.

In the intervening years since their last graphics card release, the 40 series, Nvidia's place in the world has dramatically changed. As companies rushed into AI investment, wanting to throw Large Language Models into every software function possible, they needed thousands of pallets of GPUs, and Nvidia was the biggest store in town. Once a gaming company, Nvidia became the chief shovel manufacturer for the AI gold rush. At the 40-series cards' launch in 2022, Nvidia sold for around $11 a share (split-adjusted). Today, the share price hovers around $140.

But they still make gaming products, and though their CES presentation was focused on AI implementations, it opened with the most exciting "consumer product," their new gaming-focused 50-series GPUs. Yes, millions of these will wind up in servers, but many gamers will buy them too. Though more 50-series cards will release down the line, the initial line-up is comprehensive and -- with the use of their newly improved DLSS 4 AI frame-generation technology -- meant to be twice as fast as their previous generation counterparts, which were already pretty damn impressive.

The current entry-level model is the RTX 5070, starting at $549, and CEO Jensen Huang said on stage that -- thanks to AI -- it will be as fast as the $1,599 RTX 4090. In real-world usage, this may be true in at least some games, but it's not down to the raw performance of the chip, which can't compare with the specs of the 4090. Instead, this is because DLSS is so much more powerful than it used to be, and the chips are designed with that as a focus, rather than raw graphics rendering technology like ray tracing.

For those unfamiliar with it, DLSS is a form of AI upscaling that is enabled through collaboration with game developers. To make it work, they run their game natively on a supercomputer and then train an AI model on that footage so that it knows what the game should look like. Then, when you play that game on your less powerful computer, DLSS can generate frames with AI based on your gameplay, knowing what the game should look like based on its training. The result is so indistinguishable from native footage that I never game without it on. To some, this is "faking" the performance; but all I care about is that my games look good and play smoothly, and theoretically, DLSS should allow this $549 card to do that as well as a $1,599 card.

Raw specifications will still affect the quality of footage it can generate and how many frames it can pump out, and the 5070 is a good improvement here compared to the last-generation 4070. It has 6,144 CUDA cores, 12GB of GDDR7 -- with a memory bandwidth of 672 GB/sec -- and a total power of 250 watts, meaning you only need a 650-watt power supply. This will sell well, and if the performance is as good as advertised, custom, fast gaming computers should get a lot cheaper. One notch up is the RTX 5070 Ti, at $749, with 8,960 CUDA cores and 16GB of GDDR7 memory at 896GB/s.
It needs 300 watts, so Nvidia suggests a 750-watt power supply, and both go on sale at the end of February. If you can't wait, though, Nvidia will sell you their new high-end GPUs this month; and they're damn impressive.

The high-end gaming staple is the RTX 5080. For $999 -- less than the last-generation 4080, which started at $1,200 -- this is meant to have double the performance, and that's not down to AI upscaling, as the specifications line up. It has 16GB of GDDR7 memory, at 960GB/sec, and 10,752 CUDA cores, with a total power use of 360 watts, and Nvidia recommends an 850-watt power supply.

The flagship ultimate card is the 5090, which nothing else on the market can compete with, and because of that, Nvidia can charge $2,000 for one. That's a hefty price hike over the 4090 -- and hard to justify for a home gamer compared to the 50% cheaper 5080 -- but the raw specifications are astonishing. It comes with 21,760 CUDA cores, 32GB of GDDR7 -- with a bandwidth of up to 1,792GB/sec -- and the Founders Edition card is far smaller than the 4090, now being a two-slot GPU. It has a total power of 575 watts, and Nvidia recommends a 1000-watt power supply.

On paper, these are damn impressive cards, with power far exceeding previous generations; but which will be worth the purchase is something that reviews will have to conclude. On paper, the 4090 was a significant upgrade over the 4080, but there wasn't anywhere near as substantial a performance difference in real-world gaming use. Will the 5090 be the same? Given that it's twice the price, it's an important question.

For the average consumer, the 5070 Ti has the biggest question mark. Unless it's a big step up in performance, it can't be worth the $200 price jump over the base 5070. Then again, if it's close to the 5080 in real-world use, you're getting similar performance for a $250 discount. Regardless, this is an impressive upgrade over the 40-series graphics cards, which were still far ahead of their competitors. I'm looking forward to testing them all.
[6]
Nvidia Blackwell RTX 50-series Founders Edition graphics cards -- Details on the new design, cooling, and features of the reference models
The Nvidia RTX 5090 Founders Edition is coming on January 30, 2025. There will also be an RTX 5080 Founders Edition launching on the same date, with an RTX 5070 Founders Edition coming in February. What about the RTX 5070 Ti? It's also coming in February, but without an Nvidia reference design -- it's basically the same story as the RTX 40-series Founders Edition cards, with some slight tweaks. Nvidia provided a full overview of the upcoming Founders Edition cards, along with a discussion of some of the underlying changes, during its Editors' Day on January 8. There was a lot of other information as well, and we have multiple other articles covering those topics: Nvidia Blackwell architecture, Neural Rendering DLSS 4, etc. But here, our focus will be on the reference cards and specs. The full slide deck from the session is at the bottom of this article. We'll use some of those slides throughout our discussion. Starting with the RTX 5090 Founders Edition, one of the most striking aspects of the reveal was that it's a dual-slot graphics card with only two fans. How in the flaming underworld does Nvidia plan to cool a 575W TGP (Total Graphics Power) card with such a design? Answer: With some clever changes and large fans. The high specs are well-known by now. RTX 5090 comes with 170 SMs (Streaming Multiprocessors), 3352 AI TOPS of FP4 compute, and 105 TFLOPS of FP32 for graphics. It also has 32GB of GDDR7 memory on a 512-bit interface, with a massive 1792 GB/s of bandwidth. And a 575W power limit, as noted above, coupled with a $1,999 price tag. If you're looking at the price and thinking there won't be many gamers willing to fork over that much money for a graphics card, you're probably right. However, given the boosted AI capabilities and large quantities of VRAM and bandwidth, we suspect the RTX 5090 is going to sell like hotcakes. But let's look at the card a bit more closely. The biggest news is the double flow through thermal design. The RTX 4090 (and other Ada GPUs) featured a smaller PCB so that the back fan -- the one furthest from the video ports and IO bracket -- could blow straight through the radiator fins. With the 5090, Nvidia takes that same approach to the next level. The RTX 5090 PCB, as we wrote earlier, is very compact. Both the front and rear fans can now blow straight through the radiator fins. How is this accomplished? Nvidia says the RTX 5090 features three different PCBs. There's one on the bottom for the PCIe 5.0 x16 slot connector and another small PCB on the IO bracket for the video ports. The main PCB connects to those via (presumably) ribbon cables that won't interfere with airflow. How much of a difference does this make? Nvidia shared this slide of power versus noise for its older RTX 20-series (dual axial fans), the RTX 40-series (single flow through), and the RTX 5090 with double flow through: We'll be very curious to see how heat and noise fare in the real world. The 40-series Founders Edition cards have been very good, but the RTX 3090 and 3080 Founders Edition cards -- and the RTX 3080 Ti Founders Edition in particular -- could get very hot! The 4090 and 4080 cards got around that by having massive triple-slot coolers with large fans. The RTX 5090 does have some help, though. The fans are still very large (120mm), and there are rumors that the 5090 at least uses liquid metal. Don't disassemble your graphics card, in other words. That's unofficial information, for now, but it could help with cooling such a power hungry card. 
Also, even though it's a dual-slot card, the 5090 Founders Edition is big and heavy. It's not quite as heavy as the 4090 FE, but it probably weighs in the vicinity of 3.5~4.0 pounds, I'd guess, and it's relatively tall for a dual-slot card. Not quite as readily visible is that the RTX 5090/5080/5070 Founders Edition cards also have a bit of a divot in the radiator fins. This is presumably to enable better airflow and cooling, somehow, though Nvidia didn't provide much in the way of details on why this was done. The aesthetics of the RTX 5080 Founders Edition are virtually identical to the 5090 FE, just smaller. Of course, under the hood there's a very different set of hardware. Full specs include 16GB of GDDR7 memory clocked at 30 Gbps, for 960 GB/s of bandwidth -- almost matching the RTX 4090! It has 84 SMs and 10,752 CUDA cores, with 58.1 TFLOPS of FP32 compute and 1800 AI TOPS of FP4 performance. The card isn't quite as heavy or large as the RTX 5090, understandably, but it uses the same double flow-through design as far as we know. TGP for the RTX 5080 is 'only' 360W, just a bit more than the old RTX 3090 and 40W more than the RTX 4080 Super. Considering it uses the same basic design as the 5090, we assume this will end up being a very cool running card, similar to the RTX 4080 and 4080 Super -- but more compact. Despite the image shown above, there's no RTX 5070 Ti Founders Edition -- it's purely a render for informational purposes. Think of it as a way of Nvidia not favoring any of its partner cards, as it could only put one on the slide. (Yeah, it could have been more images of small cards, but that's beside the point.) The specs on the RTX 5070 Ti actually look really attractive. It has the same 16GB of GDDR7 memory, clocked just 7% slower at 28 Gbps. That's still 896 GB/s of bandwidth. It also has 1406 AI TOPS of FP4 compute, with 70 SMs, so it's 83% of the larger 5080 for 75% of the price. Power also drops to 300W. We did see some partner RTX 5070 Ti cards at Nvidia's showcase. If memory serves, they're pretty much all 2.5 to 3.0 slot cards with triple fans, probably without a double flow-through design -- but the single flow-through fans of the prior generation have proven effective, particularly with large radiators and three fans. The marketing is strong with the 50-series, as usual. The provided benchmarks show a sizeable improvement over the RTX 4070 Ti... but Nvidia didn't use the newer and more potent RTX 4070 Ti Super. It also uses MFG where applicable to show significant gains in "performance." Finally, we have the RTX 5070 Founders Edition. Our understanding is that this will have a single flow-through fan, like the previous generation, but it also only needs to deal with a 250W TGP. It's a smaller dual-slot card as well. There's only a single divot in the radiator for the flow-through fan as well. Specs include 48 SMs, 6144 CUDA cores, 12GB of GDDR7 memory clocked at 28 Gbps for 672 GB/s of bandwidth, and 988 AI TOPS of FP4 compute with 30.9 TFLOPS of FP32 for the shaders. That's only 6% more compute than the RTX 4070 (ignoring FP4 vs FP8 AI performance). And indeed, Nvidia's own performance figures, discounting MFG, suggest relatively modest gen-on-gen gains. The 5070 appears to be around 20% faster than the 4070 in Resident Evil 4 and Horizon Forbidden West. It's only with the extra generated frames that the performance improvements hit the claimed 2X faster, and we're still waiting to see what that actually feels like. 
(Hint: It won't be the same as a real, non-generated 2X improvement!) Nvidia also revealed the RTX 50-series laptops, which in all cases correspond to the desktop cards that are one step down. So the RTX 5090 laptop GPU basically has the same hardware as the RTX 5080 desktop card -- except, curiously, it has 24GB of VRAM, apparently via 3GB GDDR7 chips. (So that's where those chips are hiding...) The RTX 5080 laptop GPU has 16GB of memory on a 256-bit interface, with 7680 CUDA cores, so it's slightly less than the desktop RTX 5070 Ti. The mobile 5070 Ti comes with 12GB on a 192-bit interface, matching up with the desktop 5070, and then there's the mobile RTX 5070. The RTX 5070 laptop GPU will have 36 SMs and 4608 CUDA cores, with only 8GB of GDDR7 memory on a 128-bit memory interface. Please insert a very deep sigh. This is basically a very strong hint of what we are likely to see from the future desktop RTX 5060 Ti and RTX 5060. There's still some hope that the 5060 Ti might come with 3GB chips to get to 12GB total VRAM, but otherwise, it looks like a strong confirmation that Nvidia isn't giving up on 8GB yet. Will neural rendering and neural texture compression come to the rescue of the 8GB cards? From talking with Nvidia representatives, that will require game publishers opting in at the very least and, more likely, some developer effort to allow that to happen. But we'll wait and see. The laptop 50-series GPUs will begin shipping in retail products in March. Here's the full GeForce RTX 50-series slide deck.
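As an aside on the memory figures in this piece: the quoted bandwidth numbers all follow from the per-pin data rate and bus width. Here's a quick sketch of that arithmetic, deriving the bus widths from the numbers above (my own calculation; the 5090's 28 Gbps data rate is inferred rather than quoted):

```python
# Bandwidth (GB/s) = data rate per pin (Gbps) x bus width (bits) / 8.
# Data rates and bandwidths are the ones quoted above; bus widths are derived.
cards = {                    # name: (Gbps per pin, quoted GB/s)
    "RTX 5090":    (28, 1792),
    "RTX 5080":    (30, 960),
    "RTX 5070 Ti": (28, 896),
    "RTX 5070":    (28, 672),
}
for name, (gbps, gbs) in cards.items():
    bus_bits = gbs * 8 / gbps
    print(f"{name}: {gbps} Gbps x {bus_bits:.0f}-bit bus / 8 = {gbs} GB/s")
# -> 512-bit, 256-bit, 256-bit and 192-bit interfaces respectively
```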
[7]
Hands on With Nvidia Multi Frame Gen on the RTX 5090
I've seen the RTX 5090 double the framerate of the 4090, though a big part of that is due to DLSS 4 and multi-frame gen.

Players put a lot of work and even more money into trying to move the needle on their FPS counter higher and higher. Nvidia isn't the only GPU maker who asks users to pay more for more frames, but it may be the first company to ask: will you accept it if not some, but most, of your frames aren't rendered but generated thanks to AI? We've seen multiple demos of Nvidia's multi-frame gen running in person. Judging solely by the FPS counts, it could be the most significant turning point for PC gaming of the past four years, as long as you have the money to afford another graphics card.

During a closed-door media session with journalists, Nvidia touted the power of its new, bulky, and expensive RTX 50-series cards. The pack is led by the $2,000 RTX 5090, with the desktop GPU running with 32GB of VRAM. Of course, it's a more powerful card. It has 21,760 CUDA cores and the new Blackwell shader cores. However, the VRAM was explicitly upped for the sake of AI processing. Beyond its graphical capability, Nvidia focused much of its GPU announcement on its 5th-gen Tensor cores and 3,352 TOPS of generative AI processing capability. The RTX 5090 is a GPU targeted at AI developers and gamers alike.

But so what? Does AI processing mean anything for gamers? It does, at least for the sake of those extra frames. Nvidia pointed out its multi-frame gen capabilities, but the story is bigger than that. The update to its upscaler tech, DLSS 4, is the lynchpin of Nvidia's hopes for its new cards. Nvidia claims a game like Star Wars: Outlaws will only run at 50 FPS at 4K and max settings on a 5090 with no upscaling. With DLSS 4, plus all the bells and whistles, Nvidia claims you can get more than 250 FPS with less latency than standard. It sounds too good to be true, and if you're a PC gaming purist, it may well be.

In essence, Nvidia's frame gen uses a generative model to predict the next frame in a sequence, then inserts a new frame based on the normal, rendered in-game image. The current frame gen on DLSS 3.5 is supported by some titles to generate frames on a 1-to-1 basis. Previous versions of single frame gen could offer a solid bump in framerates, though a newly updated version should improve framerates to a larger degree. Nvidia says its new frame generation model is 40% faster than before and uses less VRAM; plus, it can model multiple frames based on a single rendered scene and insert them before the next rendered frame without impacting latency. Nvidia said that if a game runs all of DLSS 4, the GPU handles five AI models at once, so you need the 50-series cards with their new Tensor Cores to run multi-frame gen.

In person, the framerates are immediately impressive. A machine with a 5090 running Cyberpunk 2077 at 90 FPS may be able to do close to 170 FPS in some scenes with multi-frame gen. An RTX 4090 PC hitting 130 FPS in Alan Wake 2 with ray reconstruction may be closer to 300 FPS in the same scene with a 5090 and multi-frame gen. The tech is already impressive, but the real consideration gamers need to make is whether they can stomach the concept that they're seeing "fake frames," as detractors call them, instead of real rendered frames. In one demo, one onlooker pointed out a sudden flickering from a light source, though it may have had to do with an issue on the monitor. As for detail, I couldn't spot much difference between a game with generated frames and one without.
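To picture how generated frames slot between rendered ones, here is a conceptual pacing sketch; it is not Nvidia's actual scheduler, just a way to see why even spacing matters when three of every four frames are generated:

```python
# Conceptual frame-pacing sketch (not Nvidia's scheduler): each rendered
# frame's interval is subdivided evenly among the generated frames so the
# output cadence stays smooth rather than "lumpy".
def presentation_schedule(rendered_fps, generated_per_rendered, total_rendered=3):
    """Return (timestamp_ms, kind) pairs for a short run of frames."""
    interval = 1000.0 / rendered_fps
    step = interval / (generated_per_rendered + 1)
    schedule = []
    for n in range(total_rendered):
        t = n * interval
        schedule.append((round(t, 2), "rendered"))
        for i in range(1, generated_per_rendered + 1):
            schedule.append((round(t + i * step, 2), "generated"))
    return schedule

for ts, kind in presentation_schedule(rendered_fps=60, generated_per_rendered=3):
    print(f"{ts:6.2f} ms  {kind}")
# 0.00 rendered, 4.17 / 8.33 / 12.50 generated, 16.67 rendered, ...
```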
What helps is that Nvidia's updated ray reconstruction should add more clarity to fine lines, such as overhead wires or chain-link fences. For the sake of gameplay, Nvidia made sure to enable Reflex 2 to reduce latency. In a game like Black Myth: Wukong, there wasn't any noticeable change in the game running at 90 FPS versus a game running with higher framerates, other than the most marginal improvement in smoothness. The difference in 170 FPS versus 230 FPS in a game like Marvel Rivals may be more significant. Still, that's only if you're a wannabe pro gamer playing a shooter at such a high standard and need absolute precision.

These aren't "fake frames," but they aren't rendered by the PC's processors either. Multi-frame gen is a magic trick that requires misdirection. Users will be too busy playing the game and basking in the framerate ticker to notice any potential visual discrepancies when they -- inevitably -- appear. This takes time to parse out, something that can't be done even with a few hours of demos. We'll need to get our hands on these new cards to discover how this impacts our gaming experience.

Nvidia promises 75 games and apps that will support multi-frame gen, including future titles like Doom: The Dark Ages and Dune: Awakening. Among those initially supported are a few oddballs, like the climbing game Jusant and Deep Rock Galactic. Neither game taxes the GPU to a significant degree when running. The nearly 7-year-old Deep Rock Galactic is normally more CPU-intensive. A better card will increase framerates, but not to the extent of a more intensive game. Add on multi-frame gen, and the number of frames you'd get would likely exceed the refresh rate of most high-end monitors you could buy at a reasonable price.

The high-end RTX 40-series cards, especially the 4090, were the pinnacle of what most users could get for desktop graphics. AMD and now Intel can compete in the low-to-mid-range GPU market, but Nvidia is the only one that promises "the best." CEO Jensen Huang said as much during a Q&A the day after his CES keynote address. But the 4090 hasn't been maxed out yet. You can still buy that card and get great performance when paired with a high-end CPU. To run the 5090, you'll need to feed it up to 575W of power, so in all likelihood, you'll require a large, certified PSU, at the very least. You'll want to pair it with the latest and greatest AMD or Intel desktop CPUs, and we still don't know if any of the current or upcoming top-end processors will lead to any bottlenecks. The 4090 is already excessive but is safe and a known variable. The 5090 is big, and you have to have a reason to run it. Still, if you bought an expensive, high-end 240Hz monitor and wanted to play your games at their peak in both looks and performance, the 5090 promises to do all that for a pretty penny.

That's why multi-frame generation is so important. Only the new 50-series cards will have this capability. The 40-series will have the remodeled single-frame generation, though those cards, along with the 30-series and 20-series GPUs, will get access to the transformer-model super-resolution, DLAA, and updated ray reconstruction. Nvidia has not revealed any details about the anticipated RTX 5060 and RTX 5050. The more budget-end cards are rumored to have 8GB of VRAM. It's what the laptop version of the 5070 also packs, and although it seems light, Nvidia still said that the card will support multi-frame gen. Budget gamers would benefit far more from frame generation than those with the cash to drop $2,000 on a new GPU.
They're less likely to care about the "fake frames" complaint so long as they can play demanding games at playable framerates. We know that the $549 RTX 5070 can handle multi-frame generation, which is how Nvidia can promote it as matching the $1,600 RTX 4090's performance. That's a good sign for budget-end desktop PCs and laptops, though what really matters is whether users can stomach the idea that their frames are "fake," or generated. If the point is to experience what developers intended, then it really doesn't matter where the frames come from, as long as they enhance the experience.
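If it helps to picture how that works, here is a minimal, hypothetical sketch of an interpolation-style frame generation loop in Python. It is not Nvidia's implementation: generate_intermediate is a stand-in for the AI model, and the pacing loop simply spreads the generated frames evenly between two rendered ones, the job that Nvidia's flip metering handles for real.

```python
import time

def generate_intermediate(prev_frame, next_frame, k):
    """Stand-in for the AI model: produce k frames 'between' two rendered frames."""
    return [f"gen({prev_frame},{next_frame},{i + 1}/{k + 1})" for i in range(k)]

def present_with_frame_gen(rendered_frames, k, render_interval_s):
    """Display each rendered frame plus k generated frames, evenly paced."""
    display_interval = render_interval_s / (k + 1)
    prev = rendered_frames[0]
    for nxt in rendered_frames[1:]:
        for frame in [prev, *generate_intermediate(prev, nxt, k)]:
            print(f"present {frame}")
            time.sleep(display_interval)  # frame pacing, done in hardware on real cards
        prev = nxt
    print(f"present {prev}")

# 30 rendered FPS with 3 generated frames per rendered frame -> ~120 displayed FPS
present_with_frame_gen(["f0", "f1", "f2"], k=3, render_interval_s=1 / 30)
```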
[8]
Here's how the new RTX 50-series cards perform against the previous generation of GeForce GPUs
And here's also why we're not getting the same sort of big frame rate bump we got out of the RTX 40-series: hardware is hard. When the RTX Blackwell GPUs were first announced during CEO Jen-Hsun Huang's CES 2025 keynote, the big news was around the fact that the RTX 5070 could deliver the same level of gaming performance as the RTX 4090. That was a new $549 GPU delivering frame rates akin to a $1,600 card of the previous generation. Then it became clear that such performance claims came with the heavy caveat that you needed to enable DLSS 4 and Multi Frame Generation to deliver the same frame rates as an RTX 4090, and suddenly the question of generation-on-generation performance became a hot topic.

Now that the Nvidia RTX Blackwell Editor's Day embargo is up, we're allowed to talk about the gen-on-gen performance numbers the company provided during a talk on the new RTX 50-series cards by Justin Walker, Nvidia's senior director of product management. And if you held any illusions the silicon-based performance jump might be in some way similar to what we saw shifting from Ampere to Ada, I'm afraid I'm probably about to disappoint you. You can see in the slides below just where the four different RTX Blackwell GPUs stack up, both in terms of straight gen-on-gen performance, as well as how they look when you include the multi-gen AI might of DLSS 4 and Multi Frame Generation.

As expected, the RTX 5090, with its healthy dollop of extra TSMC 4N CUDA cores, is the winner out of all the new cards. You're looking at a straight ~30% performance uplift over the RTX 4090 when you take Frame Gen out of the equation, and a doubling of performance when you do. Either way, it's a genuine performance boost over the previous generation. Though, it must be said, it's considerably more expensive, and doesn't come close to the performance uplift we saw from the RTX 3090 to the RTX 4090. I've just retested both in our updated graphics card test suite, with the mighty AMD Ryzen 7 9800X3D at its heart, and the Ada card was delivering around an 80% gen-on-gen uplift without touching on DLSS or Frame Gen.

The weakest of the four is the RTX 5080. Sure, its $999 price tag is lower than the $1,200 of the RTX 4080, and matches the RTX 4080 Super, but you're looking at just a ~15% gen-on-gen frame rate boost over the RTX 4080. Going from the RTX 3080 to the RTX 4080 represented a near 60% frame rate bump at 4K, for reference. Even with Multi Frame Generation enabled, the RTX 5080 isn't always hitting the previously posited performance doubling.

Then we come to the RTX 5070 cards and their own ~20% performance bumps. And, you know, I'm kinda okay with that, especially for the $549 RTX 5070. I'd argue that in the mid-range the benefits of MFG are going to be more apparent, and feel more significant. Sure, that 12 GB VRAM number is going to bug people, but with the new Frame Gen model demanding some 30% less memory to do its job, that's going to help. And then the new DLSS 4 transformer model boosts image quality, so you could potentially drop down a DLSS tier, too, and still get the same level of fidelity.

So, why aren't we getting the same level of performance boost we saw from the shift from the RTX 30-series to the RTX 40-series? For one thing, the RTX 40-series was almost universally more expensive compared to the cards it replaced, and that's not a sustainable model for anyone, not even a trillion-dollar company.
Basically, silicon's expensive, especially when you want to deliver serious performance gains using the same production node and essentially the same transistors. The phrase I kept hearing from Nvidia folk over the past week has been, 'you know Moore's Law is dead, right?' And they're not talking about some chipmunk-faced YouTuber, either. The original economic 'law' first proposed by Intel's Gordon Moore stated that the reducing cost of transistors would lead to the doubling of transistor density in integrated circuits every year, which was then revised down to every two years. But it no longer really works that way, as the cost of transistors in advanced nodes has increased to counter such positive progression. That means, even if it were physically viable, it's financially prohibitive to jam a ton more transistors into your next-gen GPU and still maintain a price point that doesn't make your customers want to sick up in their mouths. And neither Nvidia nor AMD want to cut their margins or average selling prices in the face of investors who care not a whit for hitting 240 fps in your favourite games. So, what do you do? You find other ways to boost performance. You lean on your decades of work in the AI space and leverage that to find ways to squeeze more frames per second out of essentially the same silicon. That's what Nvidia has done here, and it's hard to argue against it. We'd have all been grabbing the pitchforks had it not pulled the AI levers and simply built bigger, far more costly GPUs, and charged us the earth for the privilege of maybe another 20% higher frame rates on top of what we already have. Instead, whatever your feelings on 'fake frames' are, we can now use AI to generate 15 out of every 16 pixels when we're rocking DLSS 4 and Multi Frame Generation and find ourselves with a $549 graphics card that can give us triple figure frame rates at 1440p in all the latest games. When you're getting a seamless, smooth, artifact-free gaming experience, how much are you worried about which pixel is generated and which is rendered? Maybe I sound like an Nvidia apologist right now, and maybe the bottomless coffee at the Editor's Day was actually the koolaid, but I see the problem. Hardware is hard, and we have a new AI way to tackle the problem, so why not use it? A significant process node shrink, down to 2N or the like, could help in the next generation, but until we get functional, consumer-level chiplet GPUs another huge leap in raw silicon performance is too costly to be likely any time soon. So, once more, I find myself bowing down to our new AI overlords.
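To make the "15 out of every 16 pixels" claim concrete, here's the back-of-the-envelope arithmetic, assuming DLSS Performance-style upscaling (one rendered pixel per four output pixels) stacked with 4x Multi Frame Generation (one rendered frame per four displayed frames):

```python
upscale_ratio = 4      # DLSS Performance-style: e.g. 1920x1080 rendered for a 3840x2160 output
frames_ratio = 4       # 4x MFG: one traditionally rendered frame per four displayed frames

rendered_fraction = 1 / (upscale_ratio * frames_ratio)
print(f"rendered: 1 in {upscale_ratio * frames_ratio} pixels "
      f"({rendered_fraction:.1%}); generated: {1 - rendered_fraction:.1%}")
# -> rendered: 1 in 16 pixels (6.2%); generated: 93.8%
```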
[9]
I'll say it: The best thing I saw from Nvidia at CES wasn't its sweet new GPUs, but some tasty AI every RTX gamer can enjoy
The new transformer model for DLSS could be... er... kinda transformative. AI gets a bad rap. To be clear, that's generally for good reason -- it's either being put to nefarious use, is coming fer our jobs, or is being used as some marketing gimmick with no actual relation to genuine artificial intelligence. But the most impressive thing I saw in my time at CES 2025, and throughout the Nvidia RTX Blackwell Editor's Day, was AI-based. And it's a tangible advancement on the AI us gamers likely use every day: DLSS.

DLSS -- or Deep Learning Super Sampling, to give it its full title -- is Nvidia's upscaling technology used to deliver higher frame rates for our PC games (mostly because ray tracing sucked them down), and it single-handedly kicked off the upscaling revolution that is now demanded of every single GPU and games console maker across the planet. And its fourth iteration offers "by far the most ambitious and most powerful DLSS yet." So says Nvidia's deep learning guru, Bryan Catanzaro. And the reason it's such a massive deal is that it's moving from using a US news network for its AI architecture to the technology behind Optimus Prime. Yes, DLSS 4 is now powered by energon cubes. Wait, no, I think I've fundamentally misunderstood something vital. Hang on.

From the introduction of DLSS 2 in 2020, Nvidia's upscaling has been using a technology called CNN, which actually stands for convolutional neural network (not Cable News Network, my apologies). CNNs have gone hand-in-hand with GPUs since the early 2000s, when it was shown that the parallel processing power of a graphics chip could hugely accelerate the performance of a convolutional neural network compared with a CPU. But the big breakthrough event was in 2012, when the AlexNet CNN architecture became the instant standard for AI image recognition. Back then it was trained on a pair of GTX 580s -- how times have changed...

In its most basic form, a CNN is designed to locally bunch pixels together in an image and analyse them in a branching structure, going from a low level to a higher level. This allows a CNN to summarise an image in a very computationally efficient way and, in the case of DLSS, display what it 'thinks' an image should look like based on all its aggregated pixels.

Convolutional neural networks, however, are no longer the cutting edge of artificial intelligence and deep learning. That is now something called the transformer. And this is the big switcheroo that has come out of CES 2025 -- DLSS will be powered by the transformer model going forward, and will deliver far better image quality, being far more accurate and stable... though that does come with a hit on gaming performance.

The transformer architecture was developed by the smart bods at Google, and is essentially the power behind the latest AI boom, as it forms the heart of large language models such as ChatGPT. In fact, the GPT part there stands for generative pre-trained transformer. While CNNs are more closely tied to images, transformers are far more generalised. Their power is about where computational attention is directed. "The idea behind transformer models," Catanzaro tells us, "is that attention -- how you spend your compute and how you analyse data -- should be driven by the data itself. And so the neural network should learn how to direct its attention in order to look at the parts of the data that are most interesting or most useful to make decisions.
"And, when you think about DLSS, you can imagine that there are a lot of opportunities to use attention to make a neural graphics model smarter, because some parts of the image are inherently more challenging." Transformer models are also computationally efficient, and that has allowed Nvidia to increase the size of the models it uses for DLSS 4 because they can be trained on much larger data sets now, and "remember many more examples of things that they saw during training." Nvidia has also "dramatically increased the amount of compute that we put into our DLSS model," says Catanzaro." Our DLSS 4 models use four times as much compute as our previous DLSS models did." And it can make a huge difference in games that support DLSS 4. I think maybe the most obvious impact is in games which use Ray Reconstruction to improve the ray tracing and the denoising of a gameworld. I was hugely impressed with Ray Reconstruction when I first used it, and it still anchors characters in a world far better than previous solutions to ray traced lighting. But you do get a noticeable smearing effect sometimes in games such as Cyberpunk 2077 and Alan Wake 2. With the smarts of the transformer model, however, that's all gone. The improved neural network perceives scene elements more accurately than the CNN model, most especially those trickier scene elements, and most especially in darker environments. Alan Wake 2 is the epitome of that, with its always brooding level design, and the example we were given on stage highlights a ceiling fan spinning around, leaving the familiar Ray Reconstruction smearing effect in the wake of its fan blades using the CNN model. With DLSS 4 and the transformer model, it just looks like a regular fan, with the dark ceiling behind the spinning fan now given clear detail. You can also see how much more stable fine details are from this example of a chain-link fence, and you get the same effect with trailing overhead cables, etc. I've also seen it used to great effect in Cyberpunk 2077, and in the RTX Remix version of Half-Life 2 -- which is still yet to be released but I got to have a play with it last week. I got to check out the demo of HL2 where we could toggle between the two DLSS models at will and the level of detail that suddenly pops out of the darkness with the transformer architecture in play is pretty astounding. Not in a strange, 'why is it highlighting that?' kinda way, but more naturally. The way that even in the flickering gloom outside the ring of light around a flaming touch, set just outside of Ravenholm, you can make out a trail of leaves where before there were just muddy pixels. And that extra level of visual fidelity is coming to all games that support DLSS 4, which, thankfully, is every game which currently supports DLSS. There are going to be 75 games and apps at the launch of the RTX 50-series cards that support DLSS 4 out of the box, and they will either give you the option to switch between CNN and transformer models -- this is what the build of Cyberpunk 2077 I've played with does -- or it just switches over wholesale for DLSS 4. But, via the Nvidia App, you're also going to be able to override the DLSS version any game currently uses to essentially drop in the new transformer model to take advantage of the improved image quality. You will also be able to add Multi Frame Generation (MFG) into any game that currently supports Nvidia's existing Frame Generation feature via this method, too. So long as you have an RTX Blackwell card, anyways... 
It is worth noting, however, that even though MFG is hardware-locked to Blackwell, the standard two-times Frame Generation benefits from an enhanced model, too. Nvidia says its new AI model is 40% faster and uses 30% less VRAM. It also no longer uses the optical flow hardware baked into the Ada architecture, as it's been replaced by a more efficient AI model to do the same job. RTX 40-series cards can also now take advantage of the Flip Metering frame pacing tech -- the same thing which locks MFG to RTX Blackwell -- just without the enhanced hardware in the display engine.

DLSS 4's transformer model for Ray Reconstruction and Super Resolution isn't restricted to RTX 50-series or RTX 40-series cards, however, and is available to all RTX GPUs from Turing upwards. Which is a hell of a sell. "One of the reasons this is possible," says Catanzaro, "is because the way that we built DLSS 4 was to be as compatible as possible with DLSS 3. So, games that have integrated DLSS 3 and DLSS 3.5 can just fit right in.

"DLSS 4 has something for all RTX gamers."

But there always has to be some sort of caveat. While it is supported by all RTX GPUs, the transformer model is more computationally intensive than the previous CNN incarnation. From speaking to Nvidia folk at the show, the estimate was a cost of potentially around 10% in terms of frame rates. So, if you just look at the raw frame rate data, you will see DLSS 4 performing worse at the same DLSS levels compared with previous iterations. What you won't see there is the extra visual fidelity. You could potentially offset that performance hit, however, by leaning on that extra fidelity to drop down a DLSS tier, from Quality to Balanced, for example, and that might well give you better performance and maybe even slightly improved visuals. We will have to check that out for ourselves when DLSS 4 launches out of beta in full.

I for one am almost looking forward more to the DLSS 4 launch than I am to getting my hands on the svelte new RTX 5090. Though I will say, its Multi Frame Generation does have to be seen to be believed.
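As a rough guide to what "dropping down a DLSS tier" means in practice, here's a quick sketch using the commonly documented per-axis scale factors (roughly 67% for Quality, 58% for Balanced, 50% for Performance, 33% for Ultra Performance). Treat those factors as an assumption on my part rather than figures quoted at the Editor's Day.

```python
# Rough internal render resolutions per DLSS tier at a 4K output.
# The per-axis scale factors are the commonly documented values, not official
# numbers from the Editor's Day, so treat them as an assumption.
TIERS = {"Quality": 2 / 3, "Balanced": 0.58, "Performance": 0.5, "Ultra Performance": 1 / 3}

def render_resolution(out_w, out_h, scale):
    return round(out_w * scale), round(out_h * scale)

for tier, scale in TIERS.items():
    w, h = render_resolution(3840, 2160, scale)
    print(f"{tier:>17}: {w}x{h} ({scale * scale:.0%} of the output pixels rendered)")
```

Dropping from Quality to Balanced, in other words, shaves the rendered pixel count by roughly a quarter, which is the headroom you would be trading against the transformer model's extra cost.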
[10]
Nvidia neural rendering deep dive -- Full details on DLSS 4, Reflex 2, mega geometry, and more
AI technologies will play a big role in the upcoming RTX 50-series. During CES 2025, Nvidia held an Editors' Day where it provided technical briefings on a host of technologies related to its upcoming RTX 50-series GPUs. It was an all-day event, kicked off with the Neural Rendering session that covered topics including DLSS 4, Multi Frame Generation, Neural Materials, Mega Geometry, and more. AI and related technologies will be front and center on the RTX 50-series, though some of the upgrades will also work on existing RTX GPUs, going all the way back to the original 20-series Turing architecture. We also have separate articles (with more to come) on the Blackwell architecture and RTX 50-series Founders Edition cards. There's a ton of information to cover, and we have the full slide decks for each session. Due to time constraints -- Intel's Arc B570 launches tomorrow, and there's still all the testing to get done for the RTX 5090 and 5080 launches later this month -- we won't be able to cover every aspect of each discussion, but let us know in the comments any questions you might have, and we'll do our best to respond and update these articles with additional insight in the coming days. The Neural Rendering session was full of interesting information and demos, most of which should be made publicly available. We'll add links to those if and when they go live. At a high level, many of the enhancements stem from the upgraded tensor cores in the Blackwell architecture. The biggest change is native support for the FP4 (4-bit floating-point) data type, with throughput that's twice as high as the prior generation's FP8 compute. Nvidia has been training and updating its AI models to leverage the enhanced cores, and it lumps these all under the Blackwell Neural Shaders header. Nvidia lists five new features: Neural Textures, Neural Materials, Neural Volumes, Neural Radiance Fields, and Neural Radiance Cache. Read through the list fast, and you might start to feel a bit neurotic... We already discussed some of the implications of the Neural Materials during Jensen's CES keynote, where future games could potentially reduce memory use for textures and materials by around one-third. That will require game developers to implement the feature, however, so it won't magically make any 8GB GPUs -- including the RTX 5070 laptop GPU and most likely an unannounced RTX 5060 desktop GPU -- work better. But for games that leverage the feature? They could look better while avoiding memory limitations. Nvidia is working with Microsoft on a new shader model called Cooperative Vectors that will allow the intermingling of tensor and shader code, aka neural shaders. While this should be an open standard, it's not clear if there will be any support for other GPUs in the near term, as this very much sounds like an RTX 50-series-specific enhancement. It's possible to use these new features on other RTX GPUs as well, though our understanding is that the resulting code will be slower as the older architectures don't natively support certain hardware features. Nvidia also showed off some updated Half-Life 2 RTX Remix sequences, with one enhancement called RTX Skin. With existing rendering techniques, most polygons end up opaque. RTX Skin uses AI training to help simulate how light might illuminate certain translucent objects -- like the above crab. There was a real-time demo shown, which looked more impressive than the still image, but we don't have a public link to that (yet). 
Another new hardware feature for the RTX 50-series is called linear-swept spheres. While the name is a bit of a mouthful, its purpose is to better enable the rendering of ray-traced hair. Modeling hair with polygons can be extremely demanding -- each segment of a hair would require six triangles, for example, and with potentially thousands of strands, each having multiple segments, it becomes unwieldy. Linear-swept spheres can trim that to two spheres per segment, requiring 1/3 as much data storage (see the quick back-of-the-envelope sketch after this section).

The number of polygons in games has increased substantially over the past 30 years. High-end games in the 1990s might have used 1,000 to 10,000 polygons per scene. Recent games like Cyberpunk 2077 can have up to 50 million polygons in a scene, and with Blackwell and Mega Geometry, Nvidia sees that increasing another order of magnitude. Interestingly, Nvidia mentions Unreal Engine 5's Nanite technology, which does cluster-based geometry, as a specific use case for Mega Geometry. It also specifically calls out BVH (Bounding Volume Hierarchy), though, so this may be specifically for ray tracing use cases.

Nvidia showed a live tech demo called Zorah, both during the keynote and during this session. It was visually impressive, making use of many of these new neural rendering techniques. However, it's also a tech demo rather than an upcoming game. We suspect future games that want to utilize these features may need to incorporate them from an early stage, so it could be a while before games ship with most of this tech.

The second half of the Neural Rendering session covered more imminent technology, with VP of Applied Deep Learning Research Bryan Catanzaro leading the discussion. Succinctly, it was all oriented around DLSS 4 and related technologies. More than anything else, DLSS 4 -- like the RTX 40-series' DLSS 3 -- will be key to unlocking higher levels of "performance." And we put that in quotes because, despite the marketing hype, AI-generated frames aren't going to be entirely the same as normally rendered frames. It's frame smoothing, times two.

Nvidia notes that over 80% of gamers with an RTX graphics card are using DLSS, and it's now supported in over 540 games and applications. As an AI technology, continued training helps to improve both quality and performance over time. The earlier DLSS versions used Convolutional Neural Networks (CNNs) for their models, but now Nvidia is bringing a transformer-based model to DLSS. Transformers have been at the heart of many of the AI advancements of the past two years, allowing for far more detailed use cases like AI image generation, text generation, etc. With the DLSS transformer model, Nvidia has twice the number of parameters and it uses four times the compute, resulting in greatly improved image quality. Moreover, gamers will be able to use the Nvidia App to override the DLSS modes in at least 75 games when the RTX 50-series launches, allowing the use of the latest models.

Above are two comparison shots for Alan Wake 2 and Horizon Forbidden West showcasing the DLSS transformer upgrades. Alan Wake 2 uses full ray tracing with ray reconstruction, which apparently runs faster in addition to looking better than the previous CNN model. Forbidden West was used to show off the improved upscaling quality, with a greatly increased level of fine detail. We haven't had a chance to really dig into DLSS transformers yet, but based on the early videos and screenshots, after years of hype about DLSS delivering "better than native" quality, it could finally end up being true.
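Going back to the linear-swept spheres point above, the quick back-of-the-envelope sketch looks like this; the strand and segment counts are made-up illustrative numbers, not figures Nvidia gave.

```python
strands, segments_per_strand = 10_000, 20   # hypothetical hairstyle

triangles = strands * segments_per_strand * 6  # ~6 triangles per hair segment
spheres = strands * segments_per_strand * 2    # 2 linear-swept spheres per segment

print(f"triangle mesh: {triangles:,} primitives")           # 1,200,000
print(f"linear-swept spheres: {spheres:,} primitives, "
      f"roughly 1/3 the data to store and traverse")         # 400,000
```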
Results will likely vary by game, but this is one of the standout upgrades -- and it will be available to all RTX GPU owners. The DLSS transformer models do require more computing, and so may run slower, particularly on older RTX GPUs, but it's possible we could see DLSS Performance mode upscaling (4X) with the new models that look better and perform better than the old DLSS CNN Quality mode (2X upscaling). We're less enthusiastic about Multi Frame Generation (MFG), if only because of the amount of marketing hype behind it. With the new hardware flip metering to help smooth things out, coupled with faster frame generation algorithms, on the one hand MFG makes sense. Instead of interpolating one frame, you generate three. Poof! Three times the performance uplift! But on the other hand, that means one-fourth the sampling rate. So if you're running a game with MFG and getting 240 FPS, that's really rendering at 60 FPS and quadrupling that with MFG. And that specific example would probably look and run great. What will be more problematic is games and hardware that can't get 240 FPS... for example, an RTX 5070 might only run at 120 FPS. What would that feel like? Based on what we've experienced with DLSS 3 Frame Generation, there's a minimum performance level that you need to hit for things to feel "okay," with a higher threshold required for games to really feel "smooth." With single frame generation, we generally felt like we needed to hit 80 FPS or more -- so a base framerate of 40. With MFG, if that same pattern holds, we'll need to see 160 FPS or more to get the same 40 FPS user sampling rate. More critically, framegen running at 50 FPS, for example, tended to feel sluggish because user input only happened at 25 FPS. With MFG, that same experience will now happen if you fall below 100 FPS. Of course, you won't see perfect scaling with framegen and MFG. Regular DLSS 3 framegen would typically give you about a 50% increase in FPS -- but Nvidia often obfuscates this by comparing DLSS plus framegen with pure native rendering. If you were running at 40 FPS native, you'd often end up with 60 FPS after framegen -- and a reduced input rate of 30 FPS. What sort of boost will we see when going from single framegen to 4X MFG? That might give us closer to a direct doubling, which would be good, but we'll have to wait and see how it plays and feels in practice. Nvidia's own examples in the above slide suggest the performance improvement could range from as little as 33% (Alan Wake 2) to as much as 144% (Hogwarts Legacy) -- but that's using an RTX 5090. Again, we'll need to wait and see how this works and feels on lesser 50-series GPUs. Wrapping things up, the other big new item will be Reflex 2. It will eventually be available for all RTX GPUs, but it will first be available with the RTX 50-series. Where Reflex delayed input sampling until later in the rendering pipeline, Reflex 2 makes some interesting changes. First, our understanding is that, based on past and current inputs, it will project where the camera will be before rendering. So if, as an example, your view has shifted up two degrees over the past two input samples, based on timing, it could predict the camera will actually end up being shifted up three degrees. Now here's where things get wild. After all the rendering has finished, Reflex 2 will then sample user input once more to get the latest data, and then it will warp the frame to be even more accurate. It's a bit like the Asynchronous Space Warp (ASW) used with VR headsets. 
But the warping here could cause disocclusion -- things that weren't visible are now visible. Reflex 2 addresses this with a fast AI in-painting algorithm. The above gallery shows a sample of white pixels that are "missing." At the Editors' Day presentation, we were shown a live demo where the in-painting could be toggled on and off in real-time. While the end result will depend on the actual framerate you're seeing in a game, Reflex and Reflex 2 are at their best with higher FPS. We suspect the warping and in-painting might be more prone to artifacts if you're only running at 30 FPS, as an example, while it will be less of a problem in games running at 200+ FPS. Not surprisingly, Reflex 2 will be coming first to fast-paced competitive shooters like The Finals and Valorant. And that's it for the Neural Rendering session. The full slide deck is available above, and we glossed over some aspects. A lot is going on with AI, needless to say, and we're particularly curious to see what happens with things like Neural Textures. Could we see that applied to a wider set of games? We asked Nvidia about that, and it sounds like some game studios might be able to "opt-in" to have their assets recompressed at run-time, but it will be on a game-by-game basis. Frankly, even if there's a slight loss in overall fidelity, if it could take the VRAM requirements of some games from 12-16 GB and reduce that to 4~6 GB, that could grant a new lease on life to 8GB GPUs. But let's not count those chickens until they actually hatch.
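Returning to the Multi Frame Generation arithmetic a few paragraphs up, the relationship is easy to generalise: divide the displayed frame rate by the generation factor to get the rate at which the game actually renders and samples your input. A small sketch using the same numbers quoted above:

```python
def base_framerate(displayed_fps, gen_factor):
    """Rendered/input-sampling rate behind a frame-generated output.
    gen_factor is 2 for single frame gen, 4 for 4x Multi Frame Generation."""
    return displayed_fps / gen_factor

for displayed, factor in [(240, 4), (160, 4), (100, 4), (80, 2), (50, 2)]:
    print(f"{displayed} FPS displayed with {factor}x gen -> "
          f"{base_framerate(displayed, factor):.0f} FPS rendered")
```

Run it and you get the 60, 40, 25, 40, and 25 FPS base rates discussed above, which is why the comfortable-feeling threshold roughly doubles when moving from single frame generation to 4x MFG.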
[11]
A Deeper Analysis of Nvidia RTX 50 Blackwell GPU Architecture
A Deeper Dive

Nvidia is introducing the Blackwell GPU architecture to support the next generation of RTX 50-series graphics cards. This new architecture builds upon the Ada Lovelace design, offering enhancements aimed at improving AI and neural rendering capabilities. It introduces support for new standards such as DisplayPort 2.1 UHBR20 and PCIe 5.0, and transitions from GDDR6/GDDR6X to GDDR7 memory across the series. These changes improve data throughput and energy efficiency while providing higher performance, especially in high-end models like the RTX 5090. The upgrades include a significant increase in GPU die size, improved ray triangle intersection rates with 4th-Gen RT cores, and features like Neural Shaders and Shader Execution Reordering enhancements for better handling of complex workloads. The architecture focuses on optimizing for new neural workloads, reducing memory footprint, and enhancing power management. Nvidia has unified FP32 and INT32 compatibility across all shader cores in Blackwell, enhancing efficiency for AI-related tasks.

The introduction of an AI Management Processor is a key change. This component manages scheduling and resource allocation for complex AI operations, prioritizing tasks based on real-time demands. It helps balance rendering processes such as neural upscaling, frame generation, and AI-driven interactions, ensuring that resources are allocated optimally. The GPU also delivers improvements in video encoding and decoding support, including compatibility with 4:2:2 video streams, and enhances energy management with quicker transitions into low-power states.

The Blackwell SM architecture doubles INT32 bandwidth and throughput by enabling all shader cores to execute INT32 or FP32 instructions. This improvement stems from hardware changes that allow Tensor Cores to be directly accessed through shader cores using the DirectX Cooperative Vectors API. Additionally, Shader Execution Reordering enhances the efficiency of shaders generating work for other shaders by doubling throughput.

Integration of GDDR7 memory marks a major enhancement in this architecture. GDDR7 offers twice the speed of GDDR6 and uses half the power per bit, achieved in part by changing the signalling technology from PAM4 to PAM3. This results in a larger data eye, enabling higher frequencies and improved performance. PAM3 signaling uses fewer logic levels, which allows more stable and faster data transfers. Even though PAM3 transfers less data per clock cycle, its ability to operate at higher speeds compensates, leading to faster overall data transmission.

The architecture also introduces improvements in ray tracing and AI task management. Fourth-generation RT cores feature a triangle cluster intersection engine for efficient mega geometry processing. They include a compression format and lossless decompression engine for more efficient handling of complex scenes. Tensor Cores now support INT4 and FP4, offering faster execution and reduced memory usage at the cost of some precision. An AI Management Processor sits at the GPU's front end to schedule tasks, ensuring AI operations like dialogue generation do not interfere with rendering tasks. This leads to a balanced distribution of resources between AI models and game rendering, maintaining smooth performance and responsive user experiences. The architecture supports full GDDR7 memory capabilities, with models like the RTX 5090 featuring up to 32 GB of GDDR7 memory and a 512-bit interface.
Neural rendering

Neural rendering marks a technical advancement in computer graphics by incorporating neural network methodologies into the rendering workflow. These techniques enable enhanced performance, superior image fidelity, and a higher degree of interactivity compared to traditional rendering pipelines. At its core, neural rendering utilizes trained neural networks to enhance visual data, thereby compressing computational tasks and generating high-quality frames from lower resolution inputs. Early developments such as deep learning super sampling (DLSS) illustrate this process, where low-resolution images processed by neural algorithms result in high-resolution output. The progression of DLSS, particularly its evolution into versions that can predict full frames from low-resolution inputs, is indicative of the potential neural rendering holds. The methodology involves a thorough understanding of scene components like shadowing, reflections, and occlusion, which collectively contribute to images that may surpass quality benchmarks set by native rendering techniques.

The DLSS technology has evolved to use Multi Frame Generation, a feature that significantly increases frame rate efficiency. Specifically, this feature can boost frame rates by eight times over conventional methods while maintaining, or even enhancing, image quality. The underlying technique employs extensive datasets to train neural networks, which can then reconstruct highly detailed frames from basic inputs. This process effectively reduces the computational load while preserving visual fidelity. The technology's integration into programmable shaders, known as neural shaders, further expands its applications. Neural shaders compress textures substantially, minimizing memory usage and enabling more complex and detailed scenes in real-time environments. Additionally, they support the creation of high-fidelity textures and improved lighting effects, contributing to advancements in the field of interactive digital media.

In parallel, specialized technologies such as RTX Neural Shaders and RTX Neural Faces are driving the next phase of graphical innovation. RTX Neural Shaders embed neural network functionality directly into shaders to compress texture data and generate enhanced visual effects. This compression can reach up to sevenfold, allowing more efficient use of resources while delivering high-quality visual content. Moreover, these shaders facilitate the development of cinematic-like graphics through improved texture creation and advanced lighting simulation. On the other hand, RTX Neural Faces applies a real-time generative AI approach to facial rendering. Instead of relying solely on traditional rasterization techniques, this method uses rasterized facial data combined with 3D pose information to generate more natural facial features and expressions. By doing so, it produces more realistic representations of faces, offering a notable improvement over standard rendering practices and hinting at future possibilities in real-time facial rendering technologies.

DLSS 4

The latest iteration, DLSS 4, introduces Multi Frame Generation specifically designed for GeForce RTX 50 Series graphics cards and compatible laptops. Multi Frame Generation is expected to support 75 games and applications upon release. The feature produces up to three additional frames for each fully rendered frame, multiplying overall frame rates significantly.
This process involves generating interpolated frames based on AI prediction models, thereby increasing overall performance. When applied to high-end cards like the GeForce RTX 5090, this new technology can boost performance by up to eight times compared to traditional brute-force rendering methods. The substantial increase in frames per second allows rendering of scenes at 4K resolution and up to 240 frames per second with full ray tracing enabled. Additionally, this feature contributes to halving the latency in PC gameplay, resulting in more responsive interactions with software and gameplay environments.

The DLSS 4 update also brings the most significant upgrade to its artificial intelligence models since the DLSS 2.0 release in 2020. It introduces DLSS Ray Reconstruction, DLSS Super Resolution, and DLAA, which now incorporate real-time application of transformer architectures. DLSS transforms the gaming experience with innovative neural rendering and powerful AI. This technology boosts FPS, lowers latency, and sharpens image quality, making gameplay smoother and more immersive.

The previous DLSS Frame Generation feature enhanced gaming performance by using a deep learning model to insert an extra frame between two traditionally rendered frames, nearly doubling the frames per second (FPS) seen by the player. Now, with the official launch of the GeForce RTX 50 series, this technology has evolved further into the DLSS Multi Frame Generation feature. It harnesses the power of GeForce RTX 50-series GPUs equipped with fifth-generation Tensor Cores. This upgraded method can insert up to three additional frames, pushing performance boundaries even further. Alongside these improvements, the efficiency of shader execution sequencing has been refined, and the new RT core's ability to detect ray and triangle intersections has been doubled, significantly boosting rendering speed and quality.

In addition to frame generation, the Blackwell architecture introduces an innovative collaboration between the stream processor and the tensor processor. This synergy enhances neural shader performance and transitions the deep learning model used by many DLSS functions from traditional convolution to a more powerful Transformer network. Transformers excel at identifying features over longer time periods and across wider spatial areas, making visual predictions more accurate and consistent. Blackwell has even incorporated dedicated hardware acceleration for these advanced models, showcasing how future graphics technology is becoming smarter and more responsive.

Another highlight for gamers is the significant enhancement to Reflex, a technology that previously cut mouse-input-to-screen reaction times by 50%. With the new Frame Warp technology integrated into Reflex, reaction times can now be reduced by up to 75%. This dramatic improvement is especially promising for competitive gamers who constantly race against the clock. Faster response times mean a more immersive and competitive experience, providing a tangible edge in high-stakes scenarios.

The latest advancement, DLSS 4, introduces multi-frame generation along with refined ray reconstruction and super resolution. DLSS 4 will support 75 games and apps on Day 1, including Diablo 4, Stalker 2, Indiana Jones and the Great Circle, and more. In addition to already available games, titles like DOOM: The Dark Ages and Dune: Awakening will also support DLSS 4 at launch.
DLSS 4 Transformer Model Improves Image Quality for GeForce RTX Users

DLSS 4 is advancing with a new transformer model designed to improve image quality for GeForce RTX users. The traditional method using convolutional neural networks analyzed small areas in a frame over time, but the new vision transformer can look at whole frames and even several frames at once. This method doubles the parameters available for analysis, which leads to better pixel stability, fewer ghosting artifacts, more detail in motion, and smoother edges.

The transformer model also benefits games that use ray tracing. In complex lighting and detailed scenes, such as those seen in certain modern games, the model provides more stable images. It improves the look of detailed objects like chain-link fences, reduces ghosting on moving parts like fans, and cuts down on shimmering effects on surfaces like power lines. This results in visuals that are more consistent and clearer during gameplay. Additionally, the new model brings improvements to super resolution techniques. A beta version of transformer-based Super Resolution is planned, and users will be able to experience its benefits and give feedback. Early tests show better stability over time, less ghosting, and higher detail during movement. This approach is not just a one-time fix but a step toward ongoing improvements in graphics quality.

Summing up, and 5090 review coming up

Built on the Blackwell GB202 GPU, the RTX 5090 sports a larger die size compared to its predecessor. It integrates the GB202-300-A1 core with 170 (active) Streaming Multiprocessors, giving it 21,760 shader cores. While this is somewhat lower than the full 24,576 cores available in the complete GB202 die, the reduction may lead to better thermal performance and greater overclocking potential. It could also mean that NVIDIA has an even faster card planned for a later stage. The RTX 5000 Series thus introduces a flagship GPU with a staggering 92 billion transistors, and it comes with GDDR7 memory delivering up to 1.8 TB/s of memory bandwidth.
[12]
Nvidia Blackwell architecture deep dive: A closer look at the upgrades coming with RTX 50-series GPUs
The new Nvidia Blackwell GPU architecture will power the upcoming generation of RTX 50-series GPUs. We've known various details for a while now, and many have been rumored, but at its Editors' Day during CES 2025, Nvidia provided a lot more information and details on the core functionality. There's so much to cover, with seven sessions that took all day, plus some live hands-on experiences. Let's get to it.

Here's the full slide deck for the Blackwell architecture session. It's... not nearly as long as you might have expected. Nvidia didn't provide a ton of detail on some aspects of the new architecture, but from a high level, there are a lot of things that don't seem to have changed too much from the RTX 40-series Ada Lovelace architecture. Most of the upgrades and enhancements tend to be around AI and various neural rendering technologies -- we have a far more in-depth look at those in a separate article. The fourth slide gives the goals for Blackwell: optimize for new neural workloads, reduce memory footprint, new quality of service capabilities, and energy efficiency. Those all sound like good things, but outside of the RTX 5090 with its significantly larger GPU die -- 744 mm2 compared to 608 mm2 on the 4090 -- a lot of the upgrades feel more incremental.

That's not to say things haven't changed at all. The 4th-Gen RT cores have twice the ray triangle intersection rate of Ada. They're also built for Mega Geometry, which could help future Unreal Engine 5 games run better. The GPU shaders are also enhanced for Neural Shaders, and there are some other new additions. Blackwell will be the first series of Nvidia GPUs to move beyond DisplayPort 1.4a, with full support for DisplayPort 2.1 UHBR20 (80 Gbps). They'll also support PCIe 5.0, the first consumer GPUs to make that transition, though we'll have to see if that applies to all Blackwell GPUs or only the RTX 5090. Video encoding and decoding have been enhanced as well, now with support for 4:2:2 video streams.

Getting back to the numbers, if you take the "up to 4,000 AI TOPS" (trillions of operations per second), that scales down to roughly 3,400 TOPS on the 5090 (3,352, to be precise). Then you will discover that a big part of the boost comes from native FP4 support. So, if you compare like-for-like, the RTX 5090 has 1,676 TFLOPS of FP8, whereas the RTX 4090 offers 1,321 TFLOPS of FP8. That's only a 27% increase -- still sizeable but not massive. Similar scaling applies elsewhere, like in the FP32 shader compute. The 5090 delivers up to 104.8 TFLOPS of FP32, compared to 82.6 TFLOPS on the RTX 4090. Again, that's a 27% improvement.

Let's put that in perspective. The RTX 4090 delivered a whopping 132% increase in GPU TFLOPS compared to the RTX 3090. Now that was an upgrade to get excited about! The 5090 will no doubt be faster and better than the 4090, but it's not going to completely destroy the prior generation -- at least not unless you want to factor in Multi Frame Generation, which we're far less enamored with than Nvidia's marketing department. Incidentally, the 5090 die is also 22% larger, with 21% more transistors, on the same TSMC 4N process node.

Architecturally, there are some other noteworthy changes. With the rise of AI and the heavy use of integer math in such workloads, Nvidia has made all of the shader cores in Blackwell fully FP32/INT32 compatible.
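For anyone who wants to sanity-check those FP32 figures, peak shader throughput follows the usual formula of two FMA operations per CUDA core per clock. The boost clocks below are the commonly listed specs rather than numbers from the session, so treat them as assumptions:

```python
def fp32_tflops(cuda_cores, boost_clock_ghz):
    """Peak FP32 = 2 ops (one fused multiply-add) per core per clock."""
    return 2 * cuda_cores * boost_clock_ghz / 1000

print(f"RTX 4090: {fp32_tflops(16_384, 2.52):.1f} TFLOPS")   # ~82.6, matching the figure above
print(f"RTX 5090: {fp32_tflops(21_760, 2.41):.1f} TFLOPS")   # ~104.9, close to the quoted 104.8
print(f"uplift:   {fp32_tflops(21_760, 2.41) / fp32_tflops(16_384, 2.52) - 1:.0%}")  # ~27%
```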
So in Ampere (RTX 30-series), Nvidia doubled the number of FP32 CUDA cores, but half were only for FP32 while the other half could do FP32 and INT32 -- INT32 often gets used for memory pointer calculations. Ada kept that the same, and now Blackwell makes all the CUDA cores uniform again, just with twice as many as in Turing. Nvidia also changed some things in the shader rendering pipelines to allow better intermixing of shader and tensor core operations. It classifies this as Neural Shaders, and while it sounds as though other RTX generations can still run these workloads, they'll be proportionally slower than Blackwell GPUs. This appears to be partly thanks to improvements to SER (Shader Execution Reordering), which is twice as fast on Blackwell as on Ada.

Blackwell also gets a memory upgrade, moving from GDDR6 and GDDR6X on the Ada generation to full GDDR7. We don't know if that will apply to all RTX 50-series GPUs, but considering even the RTX 5070 laptop GPU has 8GB of GDDR7, we assume it's universal. This is the first full shift we've seen in graphics memory since the RTX 20-series back in 2018 first introduced GDDR6 -- clocked at just 14 Gbps. Most of the Blackwell RTX 50-series GPUs will run GDDR7 at 28 Gbps, twice as fast as the original GDDR6 chips, but only 33% faster than the 21 Gbps GDDR6X chips used in many of the higher-spec RTX 40-series GPUs. The RTX 5080 gets a speed bump to 30 Gbps GDDR7, almost twice as fast as the 2080 Super's 15.5 Gbps memory.

Memory interface widths aren't changing, except on the RTX 5090. That will get a huge 512-bit interface with 32GB of GDDR7 memory at launch. Future 3GB GDDR7 chips leave the door open for a potential 48GB update later in the product cycle, or for professional / data center GPUs with up to 96GB in clamshell mode, but Nvidia won't officially comment on or announce such things for a while. The RTX 5080 still has a 256-bit interface and 16GB, so while it gets 30% more bandwidth than the RTX 4080 Super, capacity remains unchanged. The same goes for the 5070 Ti (vs. the 4070 Ti Super) and the 5070 (vs. the 4070), except they get 33% more bandwidth -- 28 Gbps vs. 21 Gbps.

Another new feature of the Blackwell architecture is the AI Management Processor. (And a quick side note here that Nvidia made no mention at all of an Optical Flow Accelerator, aka OFA, which was new for the Ada generation but may now be discontinued and replaced by more potent tensor operations.) With the increasing complexity of AI workloads, and the potential for more AI models running concurrently -- imagine a game doing upscaling, neural textures, frame generation, and AI NPCs -- Nvidia wanted better scheduling of resources. The AI Management Processor aims to do that and can apparently be given hints of what sort of workload is running and which needs to be finished first. So, an LLM doing text generation might be okay to delay slightly in order to get MFG (Multi Frame Generation) done first.

Blackwell also comes with improvements to power gating and energy management, with the ability to enter and exit deeper sleep modes more quickly than prior generations. And that does it for the Blackwell architecture deep dive. Admittedly, a lot of this stuff was covered in more detail in some of the other sessions, like the Neural Rendering and AI portions. Check the full slide deck at the top for additional details.
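The bandwidth comparisons above fall straight out of data rate times bus width. A quick check (the 4080 Super's 23 Gbps GDDR6X figure is the commonly quoted spec, not a number given in the session):

```python
def bandwidth_gbs(gbps_per_pin, bus_width_bits):
    """Memory bandwidth in GB/s = per-pin data rate (Gbps) x bus width (bits) / 8."""
    return gbps_per_pin * bus_width_bits / 8

rtx_5090 = bandwidth_gbs(28, 512)    # 1792 GB/s, i.e. the ~1.8 TB/s headline figure
rtx_5080 = bandwidth_gbs(30, 256)    # 960 GB/s
rtx_4080s = bandwidth_gbs(23, 256)   # 736 GB/s (assumed 23 Gbps GDDR6X)

print(rtx_5090, rtx_5080, f"{rtx_5080 / rtx_4080s - 1:.0%} more than the 4080 Super")
```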
[13]
NVIDIA Blackwell "RTX 50" GPU Architecture Detailed: Advanced Cores, DLSS 4, Next-Gen Gaming Technologies & More
At CES 2025, NVIDIA offered us a deep dive into its next-gen Blackwell GPU architecture for RTX 50 gaming GPUs and how it improves upon Ada.

NVIDIA GeForce RTX 50 "Blackwell" GPU Architecture Dissected: More AI-Focused Cores, More Throughput, DLSS & Reflex Upgrades, Coprocessor and A Ton More

The NVIDIA Blackwell or RTX Blackwell architecture is designed specifically for gamers and content creators. It will be offered first on the RTX 50 graphics cards, which launch later this month. What we have known so far about the NVIDIA RTX Blackwell gaming GPUs is that they are based on TSMC's 4nm process node, feature up to 92 billion transistors with 4,000 AI TOPS, 380 RT TFLOPS and 125 TFLOPS of FP32 compute, use the fastest GDDR7 memory interface with up to 1.8 TB/s of bandwidth, and come with a brand new Founders Edition design.

With Blackwell, NVIDIA had a few design goals in mind to accelerate the graphical capabilities of the next generation of gaming. The architecture was designed and optimized around new neural capabilities and workloads; it aims to reduce the overall memory footprint, focuses heavily on energy efficiency, and adds new quality-of-service capabilities. So Blackwell had to introduce a lot of changes, and the main ones include the addition of 5th Gen Tensor Cores offering high-speed FP4 compute and up to 4,000 AI TOPS of performance, 4th Gen RT (Ray Tracing) cores with up to 360 RT TFLOPS designed for Mega Geometry, a next-gen AI Management Processor which enables simultaneous AI models and graphics workloads to be executed, a brand new Blackwell SM with 125 TFLOPS of peak FP32 compute, and the addition of GDDR7 memory, which offers the world's fastest memory speeds of up to 30 Gbps (on the RTX 5080). Other notable additions to the RTX Blackwell GPU architecture include DisplayPort 2.1 (UHBR20), PCIe Gen5 support, and 4K NVDEC/NVENC with 4:2:2 color.

Diving into the Blackwell SM, we first compare it with the Ada SM, which was mostly optimized for traditional shaders, and most of its Tensor cores were used either for DLSS or content creation apps. Ada also partitioned the FP32 cores into two blocks, one that could purely execute FP32 and one that could execute both FP32 and INT32 formats. With Blackwell, NVIDIA has doubled its INT32 GPU throughput, which can help accelerate workloads such as Work Graphs and Shader Execution, and the 5th Gen Tensor Cores also offer the aforementioned doubled throughput. Other microarchitectural changes allow multiple workloads to be executed efficiently. Blackwell also improves SER (Shader Execution Reordering) by 2x by reordering the neural models and the standard shading models and grouping the same kind of work together in an organized fashion. These models are then passed through tensor cores (if ML models) or shader cores (if shading models) for final execution.

GDDR7 also brings a much-needed upgrade over GDDR6/X memory, offering twice the bandwidth and data rate of G6 memory with higher frequency and lower wattage. GDDR7 uses PAM3 signaling, and the PCB materials used on RTX 50 GPUs are top-of-the-line from an engineering point of view. This is the first full-fledged desktop PC architecture to utilize both GDDR7 and PCIe 5.0 in full conjunction. The new memory interface also offers twice the efficiency of GDDR6 in terms of pJ/bit. This will be very useful in mobility "Max-Q" designs where efficiency matters the most.
Moving over to the Ray Tracing enhancements, the 4th Gen RT Cores introduce various new capabilities, such as a Triangle Cluster Intersection Engine, which replaces the previous Triangle Intersection Engine and is optimized for Mega Geometry, handling clusters of Mega Geometry and standard geometry much more efficiently. The Mega Geometry engine also has a new Triangle Cluster Compression format which can be decompressed using Blackwell's on-chip engine. Lastly, there's the new Linear Swept Spheres block which accelerates RTX Hair and Fur rendering. To sum things up, the new RT cores bring an 8x ray triangle intersection rate while reducing the memory footprint to 0.75x. The FP4 format introduced on Blackwell's 5th Gen Tensor Cores will offer up to 32x throughput versus the Pascal generation and 2x versus the Ada generation of GPUs. These new cores will be taking full advantage of the Neural Shading and Rendering techniques featured in next-gen AAA titles.

This also leads us to the next topic, which is about Blackwell's scheduling and how it processes various workloads. In Blackwell, NVIDIA is introducing a new programmable coprocessor known as AMP (the AI Management Processor), which sits at the front of the GPU and interacts with the different cores on the GPU, understanding what's running on them and what's being done on them, and scheduling precisely the right workload for the right core.

NVIDIA also talked about Blackwell's new power gating modes. In Blackwell, the entire clock tree can be disabled even while the GPU is active. So, if the memory system or portions of the memory system are idle, power savings can be achieved this way. Another way to save power is to disable logic and SRAM when entire engines are idle. Blackwell also introduces a secondary rail that separates the core and the memory system, runs them at different voltages and, for different workloads, captures more performance within a power budget. It also allows a 15x reduction in the time it takes to rail-gate the core. The new rail gate system is particularly useful in laptops as it reduces leakage by a major margin. A new aspect of Blackwell is also its accelerated frequency switching capability, which improves clock responsiveness by 1000x. For example, a workload such as physics which doesn't utilize the full width of the GPU can switch to a higher frequency, while a tensor core workload which can use the full width of the GPU can move to a lower frequency. And when the CPU hasn't fed the GPU any work, Blackwell can drop its frequency quickly, because it can also ramp back up just as quickly. In terms of clock frequency uplift, Blackwell achieves a 300 MHz higher frequency in the active state versus Ada GPUs.

Lastly, we have Blackwell's display and video capabilities. New support on Blackwell includes DisplayPort 2.1b (UHBR20) along with high-speed hardware flip metering, which improves frame pacing when using DLSS 4. There's also the 9th Gen Encoder and 6th Gen Decoder, offering AV1 UHQ and 2x H.264 decode capabilities, while MV-HEVC and 4:2:2 encode/decode are also included in the RTX Blackwell video engine block.

Since the advent of DLSS back in 2018, the technology has been improving continuously. The DLSS model is trained on a supercomputer housed within NVIDIA HQ that has been running 24/7 on the latest and greatest of their GPUs for the past six years. The last major iteration of DLSS was DLSS 3.5, which introduced ray reconstruction.
Ray reconstruction, like the rest of DLSS, is honed through a failure-analysis process in which the model's output is checked for issues such as blurriness, ghosting, and flickering. NVIDIA's in-house team of engineers then tries to figure out what went wrong in the model and why the image wasn't created as intended. New approaches are defined to augment the training set, which is retrained and tested across hundreds of games to achieve the desired image quality, and these iterations result in upgraded versions of DLSS -- the latest being DLSS 4, which improves upon all aspects of the Super Sampling technology.

With DLSS 4, NVIDIA is transitioning to a completely new neural architecture from 2020's DLSS 2 model. The main change is the new transformer engine, which can be trained across multiple data sets while being computationally efficient, offering 2x the parameter size and 4x the compute horsepower. DLSS 4 also adds the new MFG mode, or Multi Frame Generation, which, instead of running two models per frame, runs five models per frame alongside super resolution and ray reconstruction. This leads to 15 out of every 16 displayed pixels being generated by AI, all the while improving the image quality.

NVIDIA also dove a bit into why multi-frame generation is only arriving with Blackwell, and there were two reasons: first, DLSS's image quality wasn't good enough to begin with and needed more training time; second, the time it takes to generate these new frames could result in frame pacing and artifacting issues. As the DLSS model was trained further, the image quality became much better, which is fairly noticeable in recent DLSS 3 and DLSS 3.5 titles. As for frame pacing, NVIDIA's flip metering system, which controls when frames are displayed, has been upgraded and can now reduce frame-time variability by 5-10x, leading to similar or better latency versus last-gen DLSS solutions even when MFG is enabled.

Best of all, while MFG will be limited to the RTX 50 series and Frame Generation will be limited to the RTX 40 and RTX 50 series, the image quality enhancements and Reflex 2 highlights will be applicable to all RTX GPUs as we reported here, so all RTX GPU owners are in for a small treat even if they don't own the latest and greatest hardware.
[14]
8 things I learned about NVIDIA's new tech at CES 2025
DLSS 4, Blackwell, neural rendering, digital humans - what does it all mean? I was one of a couple of hundred media attendees at NVIDIA's Editor's Day at CES 2025, a full day of showcases, speeches and demonstrations of the various aspects of the new suite of graphics cards coming from NVIDIA in 2025, the new RTX 50 series. This includes the GeForce RTX 5070, 5070 Ti, 5080 and the new flagship model, the 5090, all of which are likely to bother our list of the best graphics cards over the coming year. Now, as with seemingly every single exhibitor at CES, NVIDIA has focused a lot on AI capabilities in its launch announcements, and after spending a day listening to, looking at and even getting to test some of that hardware and software myself, I came away a lot more informed, in many ways impressed and in some ways a little wary... These are the 8 most important things I learned about the new GeForce RTX hardware and software at CES 2025 and NVIDIA's Editor's Day.

The day started with an overview of NVIDIA's current developments in graphics rendering, and how the company has arrived at this point. Exploring the advancements in graphics rendering technologies, the first part of the day focused on programmable shaders, neural shading, and RTX innovations. I learned a lot about the evolution of shaders, the introduction of neural shading with the new Blackwell architecture used in the latest generation of GeForce RTX cards, and the impact of the Cooperative Vectors API on accessing Tensor Cores (the central 'AI' element of NVIDIA's hardware). The session also covered Neural Radiance Cache, RTX Skin for real-time subsurface scattering (more realistic skin effects), and RTX Mega Geometry for handling complex scenes. Additionally, RTX Remix's influence on the modding community was discussed, highlighting its integration with industry-standard tools. All of these elements, and more, make up NVIDIA's neural rendering approach with the new Blackwell architecture.

Programmable shaders allow developers to customise the appearance of pixels on the screen, moving beyond fixed-function shaders. Since they first appeared in the GeForce 3 a long, long time ago, they've come a long way, and today we're seeing neural shading, which uses neural networks to enhance graphics rendering, allowing for more realistic textures and materials. This is obviously most pertinent to game developers, for the purposes of creating more realistic, more immersive, more convincing game worlds, but the demos we saw gave us a glimpse that we could be on the verge of a big inflexion point not just for gamers, but for most 3D modelling. Let's look at the main elements...

The 50 series of GeForce RTX cards will support Neural Radiance Cache, a technology that trains in real-time using the gamer's GPU to create a model that caches light transport throughout a scene. In English, as I understand it at least, it means it can learn how ray-tracing and path-tracing behave in the 3D world you're inhabiting and will improve those paths to the point where you can have effectively infinite bounces of light, making every piece of lighting more realistic, with eye-catching improvements in shading, texture lighting and scene ambiance in the demos we saw. RTX Neural Materials uses NVIDIA's AI cores to compress complex shader code typically reserved for offline materials and built with multiple layers. The examples we saw on screen included tricky materials such as porcelain and silk.
The material processing is claimed to be up to 5x faster, making it possible to render film-quality assets at game-ready frame rates. The above are aimed at gamers and game developers, but as we're seeing more and more crossover between creative and game use for software (Duncan Jones' Rogue Trooper using Unreal Engine for its animation is just one example), this will have an effect outside the game-dev space very soon, I imagine.

Another game-changer in ray- and path-tracing, I suspect, will be RTX Mega Geometry. It allows complex scenes with high polygon counts to be handled in ray tracing and path tracing by enabling the use of full-resolution meshes without proxy meshes (which have been used to save memory due to the high number of triangles/polygons in any complex 3D scene). It also efficiently compresses and caches clusters over time, which NVIDIA claims will speed up both gameplay and time on the development side. This tech is coming soon to the NVIDIA RTX Branch of Unreal Engine (NvRTX), so developers can use Nanite and fully ray trace every triangle in their projects.

I noticed something interesting both at the event and afterwards during chats with some people who are a lot smarter than I am. The new NVIDIA graphics cards don't seem to show a huge jump in VRAM compared to the last generation. For example, the 5090 tops out at 32GB, while a lot of the NVIDIA laptop GPUs in the 50-series come with either 12GB or 16GB of VRAM. So, why are the numbers so low, I hear you groan. From what I gathered, DLSS 4 could be a big reason.

DLSS, or Deep Learning Super Sampling, has just entered its fourth phase at NVIDIA. This new version really aims to boost performance and image quality in real-time graphics using AI. With DLSS 4, the trade-offs between image quality, smoothness, and responsiveness in rendering graphics might become a lot less important. By using AI to predict and render graphics more efficiently -- based on the game's data -- DLSS cuts down on the need for high computational power. It essentially takes advantage of redundancy in rendering workloads, which means less VRAM might be needed while still delivering great performance.

The demos we were shown took a before-and-after approach to some AAA games, with DLSS 4 switched off and then on. With DLSS 4 switched on, the game's visual detail was noticeably improved, especially in motion, due to the multi-frame generation offered by DLSS 4 basically having several frames ready and waiting for you depending on where you moved the camera (using fancy AI trickery). But not only did it improve the visuals and make motion smoother, it also showed an increase in framerate, so the game was running more efficiently while showing more detail. That's where the Blackwell architecture development seems to have been focused: on efficiency rather than brute force or just pumping in more horsepower.

While NVIDIA did touch upon the widespread adoption of generative AI (whether I like it personally or not), and how the latest iteration of their graphics cards will aid generative extension of sequences and reframing of shots, perhaps eliminating the need for costly re-takes in filmmaking in some cases, what intrigued me most were the developments in video rendering. With multi-camera set-ups for shows, interviews, video podcasts and even on-site reports (we saw an example from a vlogger's racetrack visit with his nine cameras) on the rise, the need for more streamlined video rendering and editing is rising fast.
And with improvements brought by the new Blackwell architecture, normal 4K video rendering is going to be up to 40% faster, and with the addition of support for 4:2:2 cameras, we will now be able to render and edit multiple videos concurrently and in almost real-time, which amounts to an up-to-11-times faster rendering and editing workflow for those videos. In addition, we saw improvements made to voice and image enhancements, with Studio Voice helping cut out noise and improve voice quality for podcasts and videography, and Virtual Keylight helping balance out unevenly or unflatteringly lit scenes, especially for streamers or video creators who may not have the luxury of a full lighting kit for their recordings.

One thing computers, CGI and AI-generated videos have struggled with, and continue to struggle with, is creating convincing-looking and naturally moving humans. At the NVIDIA showcase, and indeed at several places throughout the expo, I saw 'digital humans', 'autonomous game characters', 'neural faces', and even an 'intelligent streaming assistant' from Logitech, to use as your 'companion' during game streams. All of these are different approaches to creating a 'UI for your AI', and while remarkable improvements have been made in some respects, we're still deep in the Uncanny Valley. To their credit, NVIDIA acknowledge this, admitting that rendering human faces convincingly is just about the hardest thing you can do in any digital space.

One thing that's moving the needle for NVIDIA this year is RTX Skin, a part of the Neural Rendering suite of developments. Most rendering methods we've seen don't accurately simulate how light interacts with human skin, which is partly translucent, and that can, and frequently does, result in a plastic-like look. Subsurface Scattering (SSS, get ready to note down all those acronyms in your notebook, there will be a test) simulates how light penetrates beneath the surface of translucent materials and scatters internally, creating a softer, more natural appearance. The example we saw showed fantastic improvements in how ears and other translucent elements of human skin are rendered, along with more convincing skin and more natural, realistic lighting effects as light bounces off the 'human' skin, but it's abundantly clear we are still not there when it comes to natural facial movements, so I'm not gonna get replaced by a digital avatar in the workplace just yet...

As we were shown a developer demo from the makers of Doom: The Dark Ages, we saw how DLSS 4, neural rendering, improved path-tracing and ray-tracing and other introductions from NVIDIA have helped them create a more immersive, photorealistic and convincingly lived-in (and died-in, ey?) world. An argument I'm seeing in several pieces around the interwebz is that AI rendering of frames, multi-frame generation and other tools being introduced will lead to lazy devs pushing out poorly developed, AI-reliant slop. And yes, that's definitely gonna happen. But we also get lazy devs making poorly programmed slop now. And we also got lazy devs making poorly programmed slop in 1996, when I was just getting into PC gaming. We've always had those. Thankfully, most of that work gets forgotten and buried. But in the end, they're not the ones who matter. The ones who matter are the talented, hard-working, visionary devs who see these developments for what they are: potential tools to create new, richer, bigger and more immersive worlds, whether in gaming, filmmaking or the 3D modelling/graphic-design space.
And I'm excited to explore all of those.
[15]
Benchmarking Blackwell and RTX 50-series GPUs with Multi Frame Generation will require some changes, according to Nvidia
Nvidia's Blackwell RTX 50-series GPUs will require new tools for benchmarking, particularly if you're using DLSS 4 Multi Frame Generation (MFG). This was the key takeaway from the final session of Nvidia's Editors' Day on January 8, 2025, where it gave briefings to a few hundred press and influencers on neural rendering, the RTX Blackwell architecture, the RTX 50-series Founders Edition cards, RTX AI PCs and generative AI for games, and RTX Blackwell for professionals and creators. Much of the core information isn't new, but let's cover the details.

First, performance isn't just about raw frames per second. We also need to consider latency and image quality. This is basically benchmarking 101, a subject near and dear to my heart (as I sit here benchmarking as many GPUs as possible while prepping for the Arc B570 launch and the impending RTX 5090 and 5080, not to mention AMD RDNA4 and the RX 9070 XT and RX 9070). Proper benchmarking requires not just a consistent approach, but a good selection of games and applications and an understanding of what the numbers mean.

Average FPS is the easiest for most people to grasp, but we also report 1% low FPS. For us, that's the average performance of the worst 1% of frames by frametime. (I wrote a script to parse the CSV files in order to calculate this.) It's important because pure minimum FPS -- the single highest frametime out of a benchmark run -- can vary wildly over multiple runs. A single bad frame could drop the minimum FPS from 100 down into the teens, and doing multiple runs only partially helps. So instead, we find the 99th percentile, the frametime above which only the worst 1% of frames reside, and then divide the count of those frames by the sum of their frametimes. It's a good measurement of how consistent a game is.

Can you dig deeper? Yes, absolutely. The difficulty is that it starts to require more time, and the additional information gleaned from doing so suffers from a classic case of diminishing returns. It already takes about a full workday (ten hours, if I'm being real) to benchmark just one GPU on my current test suite. That's 20-something games, multiple applications, and multiple runs on every test. And when you're in a situation like right now, where everything needs to be retested on as many GPUs as possible? You can't just decide to increase the testing time by 50%.

Nvidia's FrameView utility, which I've been using for the past two years, is a great tool for capturing frametimes and other information -- including CPU use, GPU use, GPU clocks, GPU temperatures, and even real-time GPU power if you have a PCAT adapter (which we do). But there are multiple measurements provided, including the standard MsBetweenPresents (the default for PresentMon) and Nvidia's newer MsBetweenDisplayChange. With Blackwell, Nvidia recommends everyone doing benchmarks switch to using MsBetweenDisplayChange, as it's apparently more accurate. Looking at the numbers, most of the time it's not all that different, but Nvidia says it will better capture dropped frames, frametime variation, and the new flip metering that's used by MFG. So, if you want to get DLSS 4 "framerates" -- in quotes because AI-generated frames are not the same as fully rendered frames -- you'll need to make the switch. That's easy enough, and it's what we plan to do, whether or not we're testing with MFG.

Nvidia then goes on to pose the (hopefully rhetorical) question: Is image quality important? The answer is yes, obviously, but here's where we run into problems.
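To make that concrete, here's a minimal sketch of the 1% low calculation against a FrameView-style CSV. The column names MsBetweenPresents and MsBetweenDisplayChange come from the text above, but the file name, the exact CSV layout, and the helper function itself are illustrative assumptions rather than the actual script used for reviews.

import csv

def fps_metrics(csv_path, column="MsBetweenDisplayChange"):
    """Average FPS and 1% low FPS from a FrameView-style frametime log.

    1% low follows the approach described above: take the worst 1% of
    frametimes and divide the number of those frames by the total time
    they took. Assumes the CSV has a per-frame column of milliseconds.
    """
    frametimes_ms = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            value = (row.get(column) or "").strip()
            try:
                frametimes_ms.append(float(value))
            except ValueError:
                continue  # skip blank or non-numeric entries

    if not frametimes_ms:
        raise ValueError(f"no usable '{column}' values in {csv_path}")

    avg_fps = 1000.0 * len(frametimes_ms) / sum(frametimes_ms)

    worst = sorted(frametimes_ms, reverse=True)
    count = max(1, len(worst) // 100)             # worst 1% of frames
    low_fps = 1000.0 * count / sum(worst[:count])
    return avg_fps, low_fps

# Hypothetical usage with an assumed log file name:
# avg, low = fps_metrics("frameview_log.csv")
# print(f"Average FPS: {avg:.1f}, 1% low: {low:.1f}")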
When everything renders the same way, we should get the same output -- with only minor differences at most. But in the modern era of upscaling and frame generation, not to mention differences in how ray tracing and denoising are accomplished? It becomes very messy. So, all you need to do is capture videos of every benchmark and then dissect them. Easy, right? [cough] Speaking from experience, best-case, just capturing such videos adds 50% to the amount of time it takes to conduct benchmarking. Analyzing the videos and composing useful content from them is something better suited to individual game coverage than to full graphics card reviews.

I wish that weren't the case. I wish it were possible to get all the benchmarks from all the potential configurations with clear image quality comparisons using every possible setting. It's not. And it's foolhardy to think otherwise. It's also why, for the time being, our GPU reviews will primarily focus on non-upscaling, non-framegen performance as the baseline, where image quality shouldn't differ too much. We can do some testing of upscaling and framegen as well, but that will be a secondary consideration.

And we feel that's pretty fair. Because no matter how much marketing gets thrown at the problem, frame generation differs from rendering and upscaling. It adds latency, and while it makes the visuals on your display smoother, 100 FPS with framegen doesn't feel the same as 100 FPS without framegen -- and certainly not the same as 100 FPS with multi-frame generation! Without framegen, user input would get sampled every ~10ms. With framegen, that drops to every ~20ms. With MFG, it could fall as far as sampling every ~40ms.

Upscaling is a different matter. What we've seen in the past clearly shows that DLSS upscaling delivers superior image quality to FSR2/3 and XeSS upscaling. And now, Nvidia is about to overhaul DLSS to further improve image fidelity thanks to a transformer-based AI model. It will run slower than the older CNN model, but it will look better. How much better? Yeah, things just became that much more complex.

There's more to the benchmarking discussion, including AI and professional workloads. We test all these areas in our graphics card reviews, and what Nvidia shows largely agrees with what we've already been doing. If you have any thoughts or feedback on the matter, let us know in the comments. The full deck from the session is included below for reference.
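As a quick aside on those input-sampling figures: they follow directly from holding the on-screen framerate fixed while only a fraction of the displayed frames are actually rendered, and therefore sample fresh input. A minimal sketch, assuming a fixed 100 FPS on screen and that generated frames simply reuse stale input:

def input_sample_interval_ms(display_fps, frames_per_rendered_frame):
    # Only fully rendered frames sample fresh user input; generated frames do not.
    rendered_fps = display_fps / frames_per_rendered_frame
    return 1000.0 / rendered_fps

for label, factor in [("No framegen", 1), ("2x framegen", 2), ("4x MFG", 4)]:
    interval = input_sample_interval_ms(100, factor)
    print(f"{label}: input sampled roughly every {interval:.0f} ms")
# Prints ~10 ms, ~20 ms, and ~40 ms -- matching the figures quoted above.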
Nvidia unveils its new RTX 50 Series GPUs, promising significant performance improvements through AI-driven technologies like DLSS 4, potentially revolutionizing gaming graphics and performance.
Nvidia has announced its latest generation of graphics cards, the RTX 50 Series, at CES 2025. These new GPUs, based on the "Blackwell" architecture, promise significant performance improvements over their predecessors, largely driven by advancements in AI technology [1][2].
The RTX 50 Series introduces DLSS 4 (Deep Learning Super Sampling), an AI-enhanced technology that uses a transformer model to dramatically improve frame rates and image quality. Nvidia claims that DLSS 4 can accelerate frame rates by up to 8x compared to traditional rendering methods [2]. This technology allows the new GPUs to achieve impressive performance metrics, with Nvidia CEO Jensen Huang stating that the RTX 5070 could match the performance of the previous generation's top-tier RTX 4090 [1].
The RTX 50 Series lineup includes the GeForce RTX 5090, RTX 5080, RTX 5070 Ti, and RTX 5070.
These new GPUs feature GDDR7 memory, offering greater bandwidth and energy efficiency compared to previous generations [3].
While the raw hardware improvements show a 10-30% boost in performance, the integration of AI technologies like DLSS 4 and Multi Frame Generation allows for much more significant gains in gaming performance [3][4]. However, it's worth noting that achieving 4K gaming at high frame rates still relies heavily on these AI-driven technologies rather than native rendering capabilities [3].
Nvidia is also leveraging AI for other gaming-related applications. The company showcased Nvidia Ace, an AI platform that can create intelligent NPCs, AI companions for games like PUBG, and even AI-powered streaming assistants [3].
The Blackwell architecture also focuses on energy efficiency, which could lead to significant improvements in laptop gaming. Nvidia claims up to 40% longer gaming sessions on battery power and 30% longer web and video surfing [3].
The introduction of the RTX 50 Series is expected to have a significant impact on the gaming market, potentially making high-end gaming more accessible at lower price points. However, the full potential of these GPUs will depend on widespread adoption of technologies like DLSS 4 by game developers [2][4].
As Nvidia continues to balance its focus between AI and gaming, the RTX 50 Series represents a significant step forward in integrating AI technologies into consumer graphics products, potentially reshaping the landscape of PC gaming and graphics rendering [5].
Nvidia's new RTX 5090 GPU offers significant improvements over the RTX 4090, with a focus on AI-driven features and enhanced performance, albeit at a higher price point.
44 Sources
Nvidia's DLSS 4 technology promises massive performance gains and visual improvements, but raises questions about its impact on game design and player experience.
6 Sources
NVIDIA launches its new GeForce RTX 50 Series GPUs, featuring the Blackwell architecture and DLSS 4 technology, promising significant performance improvements and AI-enhanced gaming experiences.
10 Sources
NVIDIA launches the GeForce RTX 5090, a high-end graphics card with significant performance improvements and new AI features, marking a leap in GPU technology for gaming and creative professionals.
24 Sources
NVIDIA's new GeForce RTX 5070 Ti offers impressive 4K gaming performance with DLSS 4 technology at a more affordable price point than higher-end models.
7 Sources