© 2024 TheOutpost.AI All rights reserved
Curated by THEOUTPOST
On July 24, 2024
4 Sources
[1]
The 6 Most Exciting New Bits From AMD's Big 'Zen 5' Processor Push
AMD's next-generation processors based on its "Zen 5" architecture are set to launch imminently (the desktop chips are due to go on sale July 31), and AMD has kept up a steady drip of information telling us more about Zen 5 with each passing week. We got a decent look at AMD's initial launch lineup during Computex, and a fine overview of the architecture last week. This week, AMD gave us even more architectural details, giving even better insight into what makes Zen 5 tick. I've distilled AMD's latest briefing into the six most exciting new things about these upcoming Zen 5 chips. 1. More Efficient Core Growth Is the Way Forward Boosting processor core performance is a tricky task that requires careful planning and in-depth knowledge of the processor design. You'll find a few well-known and comparatively easy ways to raise performance, such as increasing the clock speed or adding more CPU cores to a processor, but these two options aren't always possible or practical. Power consumption tends to increase non-linearly as clock speed is raised. While pushing higher clocks effectively boosts performance, it is often detrimental to overall energy efficiency. Clock speeds set too high can also cause stability issues and heat problems. Simply put, while it can be effective, upping clock speeds isn't always the best option, and it has its limits. Adding cores poses similar issues, because power consumption and heat generation increase with each added core.
In some ways, this is an easier path to boosting performance, as these negative impacts can be offset by reducing clock speed while still achieving a net performance boost thanks to the additional cores. But you'll find other more prominent issues with adding cores: Notably, it drives up production costs. For Zen 5, AMD didn't take either of these paths. Instead, the chipmaker opted to boost performance through more complex methods. These changes are at the heart of the processor design and need far more careful consideration. If not, they may produce little to no improvement, but that doesn't appear to be the case this time around, if AMD's performance claims pan out. 2. Cache-ing in on a Wider Zen 5 In simplest terms, Zen 5 is like a Zen 4 processor with more cache and a broader interior that allows more data to pass through it every second. This is achieved through three critical efforts. First, AMD added more computational resources to the Zen 5 architecture, allowing more work to be completed each clock cycle. Inside of a processor core are hardware components referred to as Arithmetic Logic Units (ALUs), which, as their name suggests, are the core components doing the math that makes a processor what it is. AMD used four ALUs in each Zen 4 CPU core; that's increased to six ALUs on all Zen 5 cores. This change is not all that dissimilar from adding more CPU cores and can result in a significant performance boost. At the same time, AMD increased the number of execution pipes on its floating point unit (FPU) from three on Zen 4 to four on Zen 5. This hardware also does math like the ALUs, but it is more specialized and performs some types of math operations astonishingly fast, far faster than standard ALUs. This is particularly important for gaming and computation-intensive workloads like content creation and research tasks. These changes alone may not improve performance. 
Though they are the backbone of the processor and do the bulk of the work, they cannot perform work if there isn't any work to do. This is where the other significant changes come into play, most of which are in the processor front-end. To keep data flowing, processors rely on algorithms to guess what work will come next based on what work is currently underway. The algorithms attempt to predict in which direction the work will go next, a process called "branch prediction." This capacity has doubled compared with Zen 4, and Zen 5 now makes two predictions for each branch instead of one. Though this will lead to more incorrect guesses overall, it increases the likelihood that at least one of the two will be correct, and the processor won't have to pause and wait for the correct data to be fetched. Data can be retrieved far more quickly, too, with Zen 5 able to fetch and decode twice as many bytes of data each clock cycle as Zen 4. To facilitate these changes, bandwidth throughout the processor has increased significantly at what appears to be every single stage. The capacity of L1 and L2 caches has similarly increased to enable the processor to hold more data. The extra cache and functional hardware would undoubtedly increase the size of the processor overall, similar to how adding additional cores would. However, the increase isn't quite that drastic, as making each CPU core a bit bigger doesn't typically add as much physical space to a processor as adding multiple extra cores. What negative impact this would have had on production costs and power consumption is offset by the transition to newer manufacturing processes. AMD uses TSMC's 3nm manufacturing process for its CPU cores, and a 4nm process for the separate I/O die that holds most of the connectivity hardware and an integrated graphics processor (IGP). This change would also likely help to boost energy efficiency further. 3.
Cache-ing Out on Zen 5c As mentioned in the last section, adding hardware resources can have negative and positive impacts. Cache has long been one of the more divisive hardware resources, with compelling positive and negative effects. Cache requires many transistors and much die space, and it tends to be relatively power-hungry while generating considerable heat. Yet, without it, processors would run at only a fraction of their current speed. Cache is essential to keeping processors fed with data; without enough data, processors are severely limited in what they can do. Striking a balance here is critically important, but it appears that this time around, AMD has opted for a mixed approach. Its full-size Zen 5 CPUs ship with 32MB of L3 cache for every eight CPU cores. This is shared among all eight cores, effectively meaning the chip has 4MB of L3 cache for each core. The more compact Zen 5c CPU cores will have far less available: 8MB of L3 cache for each set of eight CPU cores, just a meager 1MB of L3 per core. That's an enormous reduction and definitely will help make a processor with Zen 5c cores smaller and cheaper to produce. But it will also have some trade-offs. AMD stated that Zen 5 and Zen 5c have the same features and can perform the same number of instructions per clock (IPC). The chip maker also suggests that performance will be the same between Zen 5 and Zen 5c here, but I'm skeptical. It is likely true in some situations, but as I have just said, cache has its own host of adverse effects. I see no logical reason why AMD would make a Zen 5 processor with 4MB of L3 cache per core if it provides no benefit over having just 1MB of L3 cache per core. That would raise costs, power consumption, and heat production for no gain. As the cache's primary purpose is to keep a processor fed and avoid stalls, I expect some discrepancies when the Zen 5c cores are heavily loaded. At that point, they would likely fall behind standard Zen 5 cores. 
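To make the cache argument above concrete, here is a minimal toy sketch of why a smaller cache can hurt once the working set outgrows it. It assumes a fully associative LRU cache with 64-byte lines and a simple cyclic access pattern; none of this reflects AMD's actual L3 design or replacement policy, and the function name `lru_hit_rate` and the 2MB working set are hypothetical choices made purely for illustration.

```python
from collections import OrderedDict

LINE_BYTES = 64  # assumed cache-line size for this toy model
MB = 1024 * 1024


def lru_hit_rate(capacity_bytes, working_set_bytes, passes=3):
    """Hit rate of a fully associative LRU cache under a repeated linear sweep."""
    capacity_lines = capacity_bytes // LINE_BYTES
    working_lines = working_set_bytes // LINE_BYTES
    cache = OrderedDict()  # keys are line indices, ordered oldest -> newest
    hits = 0
    accesses = 0
    for _ in range(passes):
        for line in range(working_lines):
            accesses += 1
            if line in cache:
                hits += 1
                cache.move_to_end(line)  # mark as most recently used
            else:
                cache[line] = True
                if len(cache) > capacity_lines:
                    cache.popitem(last=False)  # evict least recently used
    return hits / accesses


# Hypothetical 2MB working set against the two per-core L3 figures in the text.
small_cache_rate = lru_hit_rate(1 * MB, 2 * MB)  # Zen 5c-like 1MB per core
large_cache_rate = lru_hit_rate(4 * MB, 2 * MB)  # Zen 5-like 4MB per core
print(f"1MB cache hit rate: {small_cache_rate:.2f}")
print(f"4MB cache hit rate: {large_cache_rate:.2f}")
```

In this deliberately worst-case pattern, the 1MB cache never hits because each line is evicted before it is reused, while the 4MB cache hits on every pass after the first. Real workloads and real replacement policies sit somewhere in between, which is why benchmarks are still needed.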
Besides the reduction in cache and the corresponding drop in physical footprint, I don't see any significant difference between Zen 5 and Zen 5c cores. AMD indicated it targets lower max clock speeds for Zen 5c, but this is likely more due to how AMD intends to use these cores rather than their inability to hit those speeds. 4. 'Granite Ridge' Desktop Processors: Smaller, Cheaper, More Energy Efficient? AMD's plans for its Zen 5 and Zen 5c cores may shift over time, but at launch, AMD intends to address different markets with different cores. AMD's "Granite Ridge" processors target the desktop market and form the Ryzen 9000 series of processors. These processors are all based on the Zen 5 core at launch, but we may see some Zen 5c cores make their way to these desktop chips in the future. Take, for example, AMD's Ryzen 7 8700G. It's a desktop processor with a powerful IGP, and it is based around an AMD Zen 4 CPU core that was on the mobile market for years before the Ryzen 7 8700G's launch. AMD may well do something similar with its mobile Ryzen AI 300 chips in the future, but it likely won't happen anytime soon. Interestingly, while the Zen 5-based CPU cores in the Ryzen 9000-series processors look to be a notable leap over their Zen 4-based Ryzen 7000-series counterparts, we see less change on the I/O die side of the processors. The I/O die is a portion of AMD desktop processors containing almost all of the processor's circuitry outside the CPU cores. This includes items like the IGP, the display controller, multimedia engines for decoding and encoding content, PCIe 5.0 lanes, USB controllers, and other resources that are essential to operating the processor. The I/O die on Ryzen 9000-series processors has only minor improvements over the Ryzen 7000-series I/O die. 
This is because the two are much the same, except that the I/O controller on Ryzen 7000-series processors is built on a 6nm manufacturing process, and the one on Ryzen 9000-series processors uses a 4nm process. You'll find no AI hardware, IGP improvements, or updated connectivity here. The only real improvements should be that the chip is smaller, cheaper to produce, and more energy efficient thanks to the newer manufacturing process. Desktop Ryzen 9000-series processors will still get USB 4.0 support, but this will be delivered via select motherboard chipsets like the upcoming AMD X870 and X870E, which launch likely in September, after the Ryzen 9000 family hits the street. 5. Epyc Zen 5c Workstation Processors: Truly Massive Core Counts AMD will employ its Zen 5c architecture for servers and workstations instead of Zen 5. This is one area in which the use of Zen 5c cores makes a lot of sense. Servers and workstations need a lot of computing power, making having as many cores as possible necessary -- far more critical than a few extra megabytes of L3 cache. The space saved by having less L3 cache likely only equates to a couple of cores worth of space, but over several eight-core chiplets, the space saved gets to be quite significant. That, along with the reduced power consumption and heat production that comes with having less cache, likely makes the trade-off worthwhile if that can be put to additional cores. And indeed, that is exactly what AMD has done. AMD has plans for Epyc Zen 5c processors with up to 192 cores, a considerable increase over the 128 cores per chip that Epyc Zen 4-based processors topped out at. These processors aren't expected until later this year, however, so we don't know everything yet about what clock speeds they will work at or what they will cost. 6. 'Strix Point' Mobile Processors: Integrated Graphics of Interest AMD's Zen 5 mobile lineup is codenamed "Strix Point." 
Unlike Granite Ridge and AMD's Epyc processors, it will use a mixture of Zen 5 and Zen 5c processor cores on a single chip. These processors will be known to the larger world in retail as the Ryzen AI 300 series. Here again, the use of some Zen 5c cores makes decent sense. The Zen 5c cores should use less power due to their lower clock speed targets and reduced L3 cache, which makes them a logical option for a mobile platform. However, when you need the extra performance, a few standard Zen 5 CPU cores will pick up the slack. This way, the chips can get greater efficiency benefits while still performing well. However, it isn't quite what we consider a true big.LITTLE architectural design. Regardless, I can't help but wonder how this will affect performance overall, particularly in gaming workloads. Changes in cache size strongly impact some applications like games. This is the underlying basis for AMD's Ryzen processors with 3D V-Cache. We typically don't recommend processors with 3D V-Cache for most users, as the amount of cache on regular processors is sufficient for all but the most extreme gaming scenarios, particularly at higher resolutions and graphics settings. But that doesn't mean we would be comfortable gaming on a processor with such a limited amount of cache available. This could realistically hamper performance, but we'll need to test this with benchmarks when the processors come out. Some of the more exciting details that AMD revealed to us in its latest disclosure are related to the IGP on Strix Point processors. Unlike Granite Ridge processors, which use an RDNA 2 IGP, these have been upgraded to RDNA 3.5, which suggests this IGP has improvements not yet available on AMD's commercial graphics cards. These IGPs are relatively robust and upgraded with a larger graphics engine with more cache and better performance, including a 2x sampler rate for better texture performance. 
They also have 16 total compute units, up from 12 on the best Zen 4 IGP, which gives it a total of 1,024 shaders and a great deal more computational power to work with. AMD also updated the AI hardware in these processors to its XDNA 2 architecture, which has significantly more performance and up to double the performance per watt over the previous generation. Find AMD Zen 5 in Laptops and Desktops on July 31 With all the architectural details we could hope to have, we know a great deal about Zen 5 and AMD's upcoming processors. We also know they will be released on July 31. The last bits of information we need to learn are how well these processors perform in independent tests and how much they will cost. We expect to find out soon.
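The compute-unit arithmetic quoted above can be checked with a short sketch. It assumes 64 shaders per compute unit, which is the ratio implied by the article's own figures (16 CUs giving 1,024 shaders); the helper name `shader_count` is hypothetical.

```python
SHADERS_PER_CU = 64  # assumption: implied by 16 CUs -> 1,024 shaders in the text


def shader_count(compute_units, shaders_per_cu=SHADERS_PER_CU):
    """Total shader count for an IGP with the given number of compute units."""
    return compute_units * shaders_per_cu


strix_point_shaders = shader_count(16)  # Strix Point IGP
zen4_best_shaders = shader_count(12)    # best Zen 4 IGP per the article
shader_uplift = strix_point_shaders / zen4_best_shaders - 1

print(strix_point_shaders, zen4_best_shaders, f"{shader_uplift:.0%}")
```

Under that assumption, the step from 12 to 16 compute units works out to roughly a third more shaders, before any per-shader gains from the newer graphics architecture.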
[2]
AMD dishes more Zen 5 details -- Compact core is 25% smaller than the normal core, new SoC and chip architecture with dual CCXs
AMD recently held its Zen 5 Tech Day, unveiling the details of its soon-to-be-released Ryzen 9000 'Granite Ridge' and Ryzen AI 300 'Strix Point' processors to the world. However, the company followed up this week with even more deep-dive details on its Zen 5 microarchitecture and SoC layout. AMD revealed that its Zen 5c 'compact' cores are roughly 25% smaller than the standard full-fat Zen 5 cores and that the two core types have varying amounts of cache on the same die -- a first for an AMD design. The company also announced many other interesting technical details, which we'll cover below. AMD developed the Zen 5 architecture and then customized it for a more compact implementation for its Zen 5c cores. This single architecture, deployed in two customizable core types, will be used for its desktop, mobile, and server processors and span both the 4nm and 3nm process nodes. AMD's approach to its 'compact' Zen 5c cores is inherently different than Intel's approach with its e-cores. As with Intel's E-cores, AMD's Zen 5c cores are designed to consume less space on a processor die than the 'standard' performance cores while delivering enough performance for less demanding tasks, thus saving power and delivering more compute horsepower per square millimeter than was previously possible (deep dive here). But the similarities end there. Unlike Intel, AMD employs the same microarchitecture and supports the same features with its smaller cores. AMD's full-fat Zen 5 and compact Zen 5c cores can be used in multiple segments in either heterogeneous designs with both core types on the same die (like Strix Point) or homogeneous designs that only use one core type (like the Granite Ridge desktop chips with only full-sized cores, or the previous-gen EPYC Bergamo server chips with only smaller compact cores). The Ryzen 9000 Granite Ridge SoC is exactly as expected -- a single CCD contains eight full Zen 5 cores paired with 32MB of L3 cache. 
However, the Strix Point SoC is completely unique. The compact cores are designed for scale-out performance while providing a more optimal power-to-performance ratio. That means AMD takes a different approach with cache capacities for this core type. The die has two CCXs (Core Complexes -- core clusters on the same die), much like we saw in older AMD Zen 2 chips. Both core types have their own private L1 and L2 caches, but the 24MB of L3 cache is split into a 16MB slice for the standard cores and an 8MB slice for the Zen 5c compact cores. AMD's Zen 5c cores mark the first time it has had two core types with different cache capacities on the same die -- the four full-sized performance cores have 4MB of L3 apiece to satisfy low-latency and bursty workloads. In contrast, the eight compact cores have a mere 1MB of L3 apiece for the low-utilization high-residency workloads. The reduced L3 cache capacity saves not only area for the compact cores but also drastically reduces power consumption -- the chip uses far less power-hungry cache per compact core. Given that AMD would like to run the entire machine on compact cores as much as possible while power-gating the performance cores and their large L3 caches, this has tremendous potential for boosting battery life -- provided the scheduling mechanisms work as intended. The move to an asymmetrical cache design presents new management issues for AMD. These two L3 caches have to communicate with each other over the data fabric, much like the CCX-to-CCX cache coherency mechanism found with AMD's older Zen 2 architecture. This does introduce higher latency for cache-to-cache transfers, which AMD says is "not any more than you would have to go to memory for." As such, AMD uses Windows scheduler mechanisms to attempt to constrain workloads to either the Zen 5 or 5c cores to reduce the occurrence of high latency transfers, with background workloads typically being assigned to the 5c cores.
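Working from the per-core figures quoted above (four full-sized cores at 4MB of L3 apiece, eight compact cores at 1MB apiece), the Strix Point cache layout can be tallied with a small sketch. The `CoreComplex` class and its field names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreComplex:
    """One CCX on the die: a cluster of cores sharing an L3 slice."""
    name: str
    cores: int
    l3_mb_per_core: int

    @property
    def l3_mb(self):
        # Size of the shared L3 slice for this cluster.
        return self.cores * self.l3_mb_per_core


strix_point = [
    CoreComplex("Zen 5", cores=4, l3_mb_per_core=4),
    CoreComplex("Zen 5c", cores=8, l3_mb_per_core=1),
]

l3_slices = {ccx.name: ccx.l3_mb for ccx in strix_point}
total_l3_mb = sum(l3_slices.values())
total_cores = sum(ccx.cores for ccx in strix_point)

print(l3_slices, total_l3_mb, total_cores)
```

The two slices add up to the 24MB total cited for the die, with the larger slice belonging to the four standard cores.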
Unlike Intel, which prioritizes scheduling work into its E-cores first before it sends it to other cores if the smaller cores aren't fast enough, AMD has no preference for where the workload lands first. Instead, AMD allows the operating system to choose the core type to target based on priority and QoS mechanisms, thus ensuring the best possible user experience based on the given workload. AMD has its own thread scheduling mechanisms. It provides the OS with tables that enumerate performance and power characteristics for each core, along with providing weights for various operations, thus allowing the OS to make scheduling decisions. We can also see a breakdown of the EPYC SoC in the slide, with AMD being coy about its next-gen Zen 5 EPYCs by simply listing 'N-Classic/Compact' cores per CCD to keep the lid on core counts for the CCDs (if tradition holds, this would be the same number of cores per CCD as the desktop parts). We see the same with the "X-MB L3" listing. The "futures" bullet point lists both homogeneous and heterogeneous types of chips next to the EPYC CCDs, which some could take as implying AMD could have some Zen 5 EPYCs with mixed core types, which would be a first. However, do note that the bullet point list is an empirical list of features rather than being associated solely with the EPYC CCDs listed next to it. AMD also expanded on its rationale and goals for the Zen 5c compact cores. Unlike Intel's approach, both Zen 5 core types support SMT and the same instruction set (ISA), avoiding the scheduling concerns that Intel faces with its dissimilar core types (Intel's core types don't support the same ISA). AMD's approach also differs from Intel's because it prioritizes keeping the performance of the Zen 5c cores as close to the standard cores as possible during multi-core workloads.
This prevents situations where the larger cores are waiting on smaller cores to complete workloads (important for situations like multi-core workloads with thread dependencies). This sidesteps what Mike Clark, Zen's lead architect, calls a 'scheduling cliff,' wherein a large difference in performance will occur if a workload is scheduled into a Zen 5c core, thus negatively impacting the user experience. Ultimately, the goal is to provide the smallest delta possible between the two core types. So, rather than set the Zen 5c design target predicted by a certain die area requirement, AMD instead targeted a certain voltage/frequency (V/F) curve for the smaller cores. As with all processors, Zen 5's clock rate will drop as you load more cores due to power and thermal limitations. Therefore, when four performance cores are active, the processor will have a lower clock speed than it does with one active core. AMD used loaded frequency as a guide to decide where to define its V/F curve target for the compact cores, thus keeping the speed delta between the two core types tenable. Lowering Zen 5c's frequency target allowed the company to break the design down into fewer, bigger blocks that are placed closer together, which confers power reduction benefits. AMD removed the high-speed repeater and buffer circuitry that was no longer needed in 5c cores to hit the maximum frequencies supported by the standard cores. Combined with lower L3 cache capacity per core, Zen 5c's die area was reduced tremendously compared to the standard cores. (You can read more about this in our interview with Clark here.) In the end, AMD reduced the area for the Zen 5c cores by around 25% compared to the standard Zen 5 cores (Clark notes this is a ballpark figure). This is less than the 35% reduction we saw with the Zen 4c cores used in the EPYC Bergamo processors (slide above for reference). 
Clark said the Zen 5 core could be compacted even further for compact-core-only (homogenous) designs with different performance targets (for reference, Bergamo only has compact cores), but this design meets the targets for this specific heterogenous design. Therefore, we may see even denser Zen 5c core designs emerge with other products. Make no mistake, a 25% reduction in the core area for Zen 5c is impressive, especially if AMD has managed to keep the performance deltas between cores low. However, only testing will tell. We also can't seem to find the clocks for the Zen 5c cores listed on AMD's site, but we're following up for more detail. Here, we can see a more detailed breakdown of the Strix Point SoC. The most interesting tidbits are the various datapath widths between the different compute units. These datapaths communicate with memory via the Infinity Fabric. Both Zen 5 and Zen 5c core clusters have their own 32B/cycle port, which means L3 cache-to-cache transfers between the CCXs will have limitations. Meanwhile, the bandwidth-hungry GPU has quad 32B/cycle ports. The XDNA neural processing unit (NPU) also has its own single 32B/cycle interface to the data fabric. We also see the standard complement of fixed-function accelerator blocks, such as video encode/decode and the like. Strix supports LPDDR5-7500 and DDR5-5600 memory. Notably, AMD cut back on the PCIe lane allocation. As is customary with its mobile parts, AMD steps back to a previous-gen PCIe interface -- in this case, PCIe 4.0 -- to save power. However, AMD has also dropped from 20 lanes of connectivity to 16, saying this decision was made because the company determined the extra four lanes were almost always used for secondary storage. However, AMD says that use-case isn't common in this segment (low attach rate). 
As such, AMD determined that reducing the number of lanes was an acceptable trade-off that yielded a pin count reduction that helped save die and substrate area (reduced connections to the die and system board) while reducing power further. The Granite Ridge SoC in the Ryzen 9000 desktop chips has fewer surprises -- the layout is similar to the previous-gen chips. In fact, the SoC uses the same I/O Die (IOD) as with Zen 4 Ryzen 7000 chips. That means the same support for DDR5-5600 memory, 28 lanes of PCIe 5.0, five USB ports, and four display streams from the RDNA 2 graphics engine. Using the same IOD follows AMD's standard policy of smart reuse where possible. The RDNA 2 engine is sufficient for AMD's purposes -- it really is just meant to light up a display and nothing more. It also allows AMD to keep the same package size as before, thus easing its effort to continue supporting the AM5 platform. The iGPU has dual 32B/cycle ports to the Infinity Fabric. The IOD is paired with either one or two eight-core Core Complex Dies (CCDs). Processors with a single CCD have a 32B/cycle read/write port for communication to the IOD via the die-to-die (D2D) Infinity Fabric connection. However, as before, dual-CCD chips have a 16B/cycle write and 32B/cycle read connection between the IODs to save power on the high-power SERDES and also ease package layout (the size of the interface is important here, as the design is more space constrained with two die). AMD says it has characterized real-world workloads and found a typical 3-to-1 ratio of reads to writes, so performance is largely unimpacted by the reduced 16B/cycle write bandwidth. The Granite Ridge 'Eldora' CCD packs 8.315 billion TSMC N4P transistors across 70.6mm^2 of silicon, equating to a transistor density of 117.78 MTr/mm^2 -- a 28% increase in density over Zen 4's Durango CCD. Strix Point has a 232.5mm^2 die, much larger than the 178mm^2 die found on the previous-gen Hawk Point. 
That's largely because both die use the same process node, but Strix has more cores and cache. AMD hasn't yet shared the transistor count for Strix, but we're following up for more details. For now, you can read more Zen 5 die analysis here. AMD's second briefing contained more information about the Zen 5 microarchitecture than the original slides shared at the Zen 5 event, but we've already covered the lion's share of the information (you can read that analysis here). AMD has plumbed the Zen 5 architecture as a new foundation for computing, so it has several notable changes that will have far-reaching impacts as the company iterates with newer versions. Many of those features are outlined on the first slide, which breaks down the most important changes over Zen 4. AMD also provided more detailed slides for the various components of the core and outlined the new ISA extensions supported with Zen 5. Due to time constraints, we'll provide the full write-up of the new microarchitectural details in our pending review. However, pay particular attention to the second slide (Zen 5 core complex speeds and feeds); this slide has new information about the connections between the different cache levels. We also learned that Zen 5's average misprediction latency has increased by one cycle (for reference, Zen 4 misprediction latency ranged between 12 to 18 cycles, with 13 cycles being a common latency). The Zen 5-powered Ryzen 9000 'Granite Ridge' and Ryzen AI 300 'Strix Point' chips arrive July 31. If tradition holds, reviews will be posted then. Stay tuned.
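The die figures quoted earlier in this article can be reproduced with a few lines of arithmetic. This is a minimal sketch: the helper name `density_mtr_per_mm2` is hypothetical, and the Zen 4 density is back-calculated from the stated 28% increase rather than taken from an AMD figure.

```python
def density_mtr_per_mm2(transistors_billion, area_mm2):
    """Transistor density in millions of transistors per square millimeter."""
    return transistors_billion * 1000 / area_mm2


# Granite Ridge 'Eldora' CCD: 8.315 billion transistors on 70.6 mm^2.
eldora_density = density_mtr_per_mm2(8.315, 70.6)

# Derived, not quoted: the Zen 4 CCD density implied by "a 28% increase".
implied_zen4_density = eldora_density / 1.28

# Strix Point (232.5 mm^2) versus Hawk Point (178 mm^2) die area growth.
strix_die_growth = 232.5 / 178 - 1

print(f"Eldora density: {eldora_density:.2f} MTr/mm^2")
print(f"Implied Zen 4 CCD density: {implied_zen4_density:.1f} MTr/mm^2")
print(f"Strix Point die growth: {strix_die_growth:.1%}")
```

The first result matches the 117.78 MTr/mm^2 cited in the text; the other two are simple derivations from the quoted numbers and should be read as rough context only.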
[3]
AMD ZEN5 - A Deeper Dive Into Architecture
AMD Ryzen 9000 desktop Zen5 - Architecture Deep Dive This next-generation series, based on the Zen5 architecture, includes the Ryzen 9 9950X, Ryzen 9 9900X, Ryzen 7 9700X, and Ryzen 5 9600X, all part of the Granite Ridge (Desktop processor) lineup. The Zen5 architecture represents the fifth generation of AMD's Zen series CPU microarchitectures, demonstrating the company's ongoing innovation in the processor market. The new Zen5 microarchitecture will power the AMD Ryzen 9000 series processors for desktops, the Ryzen AI 300 series Strix Point processors for notebooks, and the 5th Gen EPYC Turin server processors. We dive into the ZEN5 architecture and talk a little about the Desktop and laptop parts. Ryzen 9000 AMD announced the launch of its Ryzen 9000 desktop CPU series, scheduled for release in late July 2024. This next-generation series utilizes the Zen5 architecture, generating significant interest following various leaks and reports. The series includes the Ryzen 9 9950X, Ryzen 9 9900X, 9700X, and 9600X, all part of the "Granite Ridge" lineup. Zen 5 represents the fifth generation of AMD's Zen CPU microarchitectures, marking a significant milestone in AMD's resurgence in the processor market. The new Zen 5 microarchitecture underpins the AMD Ryzen 9000 series "Granite Ridge" processors for desktops, the Ryzen AI 300 series "Strix Point" processors for notebooks, and the 5th Gen EPYC "Turin" processors for servers. The Zen5 architecture introduces optimizations that result in higher instructions per cycle (IPC), lower latency, and better overall system performance. For the mobile/laptop Strix Point products the inclusion of AI acceleration features is particularly noteworthy, aiming to enhance machine learning tasks and other AI-driven applications. You're at Guru3D, so we'll first look into the desktop parts.
The Ryzen 9 9950X, positioned as the flagship model, offers a significant performance boost with its higher core and thread count, making it suitable for demanding applications such as gaming, content creation, and professional workloads. The Ryzen 9 9900X and 9700X provide balanced options for users seeking high performance without reaching the flagship model's price point. The 9600X offers an entry point into the high-performance segment, delivering excellent value for its specifications. Further down the lineup, the Ryzen 7 9700X and Ryzen 5 9600X are also confirmed, featuring 8 and 6 Zen5 cores respectively. The 9700X will operate at up to 5.5 GHz with a 40MB cache and a TDP of 65 watts, while the 9600X will have a max clock of 5.4 GHz, a 38MB cache, and the same 65 watts TDP. All processors in this series will support AMD's AM5 socket platform. In addition to the processors, AMD is expected to launch new X870 and B850 motherboards that will accommodate the AM5 socket. These new boards are anticipated to debut at the Computex event, where AMD will likely showcase them alongside the new CPU series. This event will also see the unveiling of the Intel Z890 series, indicating a significant period for advancements in desktop computing hardware. The adjustments in TDP across the new Ryzen models, with reductions ranging from 40 to 50 watts compared to their predecessors, point towards improved power efficiency -- a key focus in the latest generation of AMD processors. These developments suggest that AMD is continuing to prioritize enhancements in performance and efficiency to maintain competitiveness with other leading chip manufacturers.
AMD Ryzen For Desktop
CPU             Cores      Max Clock   L2+L3 Cache   iGPU        TDP
Ryzen 9 9950X   16x Zen5   5.7 GHz     80MB          2CU RDNA2   170W
Ryzen 9 9900X   12x Zen5   5.6 GHz     76MB          2CU RDNA2   120W
Ryzen 7 9700X   8x Zen5    5.5 GHz     40MB          2CU RDNA2   65W
Ryzen 5 9600X   6x Zen5    5.4 GHz     38MB          2CU RDNA2   65W

The Ryzen 9 9950X will become the new flagship processor, which will feature 16 cores and 32 threads, achieving boost speeds up to 5.7 GHz. Uniquely, this processor will operate with a Thermal Design Power (TDP) of 170 watts and will include 80MB of total cache. The Ryzen 9 9900X, another high-end SKU, will offer 12 cores and 24 threads with a boost capability of up to 5.6 GHz. It will have a reduced TDP of 120 watts compared to its predecessor and include 76MB of cache, marking a significant adjustment in power efficiency. Both of these Ryzen 9 models maintain the core count and clock speeds seen in the preceding Zen4 series but show enhancements in energy consumption and thermal management. This suggests AMD's ongoing dedication to improving power efficiency without sacrificing performance. Moving to the mid-range offerings, the Ryzen 7 9700X will feature 8 cores and 16 threads with a maximum boost speed of 5.5 GHz and a notably lower TDP of 65 watts. It will come equipped with 40MB of cache. This marks a 100 MHz increase in boost speed over previous models while significantly decreasing the power requirement from 105 watts in earlier versions. The entry-level Ryzen 5 9600X will include 6 cores and boost up to 5.4 GHz. Like the 9700X, it will also maintain a TDP of 65 watts. This model is designed to offer robust performance for mainstream users and gamers looking for a balance between power and thermal efficiency. Architecture The processors are engineered to provide an average Instruction Per Cycle (IPC) performance improvement of 16 percent over the previous Zen 4 architecture.
This enhancement implies that Zen 5 chips should complete roughly 16 percent more work at the same clock speeds and core counts, depending on the workload. Unlike the incremental updates often seen between previous Zen iterations, AMD characterizes Zen 5 as a substantive leap forward from Zen 4.

Several architectural improvements have been implemented to achieve this advancement. Among these, AMD has improved branch prediction accuracy and reduced its latency, so the processor is better at predicting the direction of a branch instruction before it is confirmed. AMD also increased throughput by expanding the pipelines and vector sizes, which allows more data to be handled simultaneously and improves the core's parallel processing ability. Furthermore, Zen 5 processors are reported to have enlarged window sizes that allow for more instructions in the pipeline, enhancing the overall computational throughput.

In terms of data handling, AMD let us know that the Zen 5 architecture doubles the bandwidth for front-end instructions. This enhancement is also reflected in the increased data transfer rates between the L1 and L2 caches and from the L1 cache to the floating-point (FP) unit, which should notably boost the processor's efficiency in handling complex computations and data-intensive tasks.

AMD's latest branch predictor in the Zen 5 architecture aims to reduce latency and enhance accuracy, thereby improving overall throughput. Lower latency enables the CPU to access and process branch prediction data more swiftly. Enhanced accuracy reduces mispredictions, conserving CPU resources. Given Zen 5's wider core design, the increased branch predictor throughput is essential for upholding optimal performance. The additional decode pipeline further supports this objective by ensuring efficient data flow.

Zen 5 introduces an 8-wide dispatch, a noteworthy improvement over the 6-wide dispatch of previous Zen architectures.
This expansion allows Zen 5 CPU cores to handle more operations concurrently, provided they receive adequate data.
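To put the IPC and dispatch figures above in perspective, here is a rough back-of-the-envelope sketch. It assumes the common simplification that single-thread performance scales with IPC multiplied by clock speed, which ignores memory, thermal, and workload effects. The function name is hypothetical, and the 5.4 GHz baseline is inferred from the article's statement that the 9700X's 5.5 GHz boost is a 100 MHz increase.

```python
def relative_performance(ipc_gain, new_clock_ghz, old_clock_ghz):
    """Estimate speedup as (1 + IPC gain) * (clock ratio).

    ASSUMPTION: performance is proportional to IPC x clock speed. Real
    workloads will deviate from this simple model.
    """
    return (1.0 + ipc_gain) * (new_clock_ghz / old_clock_ghz)


# AMD's claimed average IPC uplift of 16 percent at identical clocks.
ipc_only = relative_performance(0.16, 5.5, 5.5)

# The same uplift combined with a 5.4 GHz -> 5.5 GHz boost clock increase.
with_clock = relative_performance(0.16, 5.5, 5.4)

# Dispatch width: 8-wide on Zen 5 versus 6-wide on earlier Zen cores.
dispatch_ratio = 8 / 6

print(f"IPC only:       {ipc_only:.3f}x")
print(f"IPC plus clock: {with_clock:.3f}x")
print(f"Dispatch width: {dispatch_ratio:.2f}x wider")
```

The dispatch width grows by about a third, more than the claimed 16 percent average IPC gain, which is consistent with the caveat that a wider core only helps when it is kept fed with instructions and data.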
[4]
AMD Reveals More Zen 5 CPU Core Details
As a follow-up to last week's AMD Zen 5 overview with the Ryzen 9000 series and Ryzen AI 300 series, today the embargo has lifted on some additional Zen 5 CPU core details. AMD hosted a Zen 5 architecture deep-dive this week ahead of the Ryzen AI 300 series laptops and Ryzen 9000 series desktop processors launching next week. This brief presentation was mostly to go over more Zen 5 core specifics and offer additional insight into some of the alterations compared to the prior generation Zen 4 and Zen 4C cores.

The Zen 5 common features were recapped before diving into more of the Zen 5 core details. There was a comment during the briefing that Zen 5C is roughly 25% smaller than the standard Zen 5 core. Unlike Zen 4/4C, Zen 5 and Zen 5C also have different amounts of cache. This was one of the most interesting slides for the Zen 4 vs. Zen 5 side-by-side comparison.

With Strix Point, all of the SoCs use the 256-bit data path for AVX-512 in the double-pumped approach, as with Zen 4. It's on the desktop and server SKUs where we'll see the new Zen 5 full 512-bit data path. I look forward to doing a comprehensive Zen 5 AVX-512 benchmark analysis soon. The Zen 5 ISA details have been known since the Znver5 patch for GCC earlier this year. Meanwhile, we are still waiting on Zen 5 / Znver5 support for LLVM/Clang. The PMIC virtualization will be interesting for EPYC Turin. With the heterogeneous topology functionality, AMD has been working on AMD P-State improvements to better handle the heterogeneous topology.

Due to the short turnaround time for the embargo lift and being preoccupied with other testing, this is a very brief article. Plus, we are more interested in seeing the real-world performance impact in benchmarks and what material gains there are for end-users in both raw performance and performance-per-Watt.
That's it for now. We are very eager to begin AMD Zen 5 Linux testing to look at support and compatibility, and super excited to see how Zen 5 performs across a range of hundreds of Linux workloads and benchmarks. Stay tuned to Phoronix to learn more about the Linux specifics of the Ryzen AI 300 series and Ryzen 9000 series in the open-source world.
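The difference between the double-pumped 256-bit path and the full 512-bit path described in this article can be illustrated with a deliberately simplified model. The sketch below only counts how many passes through the data path a single 512-bit vector operation would need; the function name is hypothetical, and the model says nothing about real instruction latencies, clock behavior, or power.

```python
def passes_per_vector_op(vector_bits, datapath_bits):
    """Number of data-path passes needed for one vector operation.

    ASSUMPTION: a vector wider than the data path is split into equal
    chunks, each processed in a separate pass (ceiling division).
    """
    return -(-vector_bits // datapath_bits)


# Strix Point (and Zen 4): 512-bit AVX-512 ops over a 256-bit data path.
double_pumped = passes_per_vector_op(512, 256)

# Desktop and server Zen 5: full 512-bit data path.
full_width = passes_per_vector_op(512, 512)

print(f"256-bit data path: {double_pumped} passes per 512-bit op")
print(f"512-bit data path: {full_width} pass per 512-bit op")
```

How much of that difference shows up in practice depends on the workload, which is why the benchmark analysis mentioned above matters more than the pass count.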
AMD has revealed details about its upcoming Zen 5 processor architecture, promising significant improvements in performance, efficiency, and versatility. The new design introduces a compact core variant and a revamped SoC architecture, setting the stage for the next generation of computing.
AMD has lifted the curtain on its highly anticipated Zen 5 processor architecture, showcasing a range of innovations that promise to reshape the computing landscape. The new design, set to debut in 2024, introduces several key enhancements aimed at boosting performance, efficiency, and versatility across various computing segments [1].
One of the most significant developments in the Zen 5 architecture is the introduction of a compact core variant. This new core design is approximately 25% smaller than the standard Zen 5 core, allowing for greater flexibility in chip designs [2]. The compact core is expected to deliver impressive performance while consuming less power, making it ideal for mobile devices and other applications where energy efficiency is crucial.
AMD has also unveiled a new System-on-Chip (SoC) architecture for Zen 5, featuring a modular design that allows for easier customization and scalability. This approach enables AMD to tailor its processors for specific market segments, from high-performance computing to low-power mobile devices [3].
Zen 5 incorporates significant improvements in AI and machine learning capabilities. The architecture includes new instructions and optimizations designed to accelerate AI workloads, reflecting the growing importance of AI in modern computing [1].
The Zen 5 architecture will leverage advanced manufacturing processes, including 4nm and 3nm nodes from TSMC. This shift to smaller process nodes is expected to contribute to improved performance and energy efficiency across the Zen 5 product lineup [4].
AMD has expanded the instruction set for Zen 5, introducing new capabilities that cater to emerging workloads and applications. These additions are expected to enhance performance in areas such as cryptography, data compression, and scientific computing [3].
The Zen 5 architecture features enhancements to its cache and memory subsystem, including larger L2 and L3 caches for some models. These improvements are designed to reduce latency and increase bandwidth, contributing to overall performance gains [4].
As AMD prepares to launch its Zen 5-based processors in 2024, the industry eagerly anticipates the real-world performance and capabilities of this next-generation architecture. With its focus on efficiency, versatility, and advanced features, Zen 5 appears poised to maintain AMD's competitive edge in the processor market.
Reference
[2]
[3]
[4]