Curated by THEOUTPOST
On Tue, 19 Nov, 12:03 AM UTC
11 Sources
[1]
AI Factory at SC24: Insights from Dell experts on AI and HPC - SiliconANGLE
Hardware is having a '90s moment, and it's turned high-performance computing into the rockstar. That's the biggest takeaway from this year's Supercomputing conference, as SiliconANGLE Media's livestream news desk, theCUBE, delivers analyst-led insights from SC24. Among the top ecosystem players, Dell Technologies Inc. maintained a large presence at the show, demonstrating its AI Factory, liquid cooling technology and an assortment of AI-ready chips. Joining theCUBE for real-time commentary were several members of Dell's ecosystem, contextualizing the integrated nature of scaling powerful computing systems on a path to redefine business, healthcare, government and society at large. Here's a special recap of key themes discussed with Dell executives and partners during SC24, and be sure to check out SiliconANGLE and theCUBE's full coverage. (* Disclosure below.)

Manya Rastogi, technical marketing engineer at Dell, delved into the evolution of AI workloads, including inferencing, fine-tuning and emerging applications, such as autonomous digital agents. Live demonstrations of multimodal AI capabilities and chain-of-thought reasoning were showcased. "Inferencing, distributed fine-tuning -- those are the main AI/ML/DL workloads everyone is expecting," Rastogi said.

Tim Shedd, engineering technologist at Dell, spoke about the balancing act required to deliver innovative cooling solutions while maintaining flexibility for diverse customer needs. He highlighted advancements in liquid and air cooling technologies designed to optimize both performance and serviceability. "We're growing the amount of the computer that is liquid-cooled, but we have to leave some parts open to air so that we can quickly service it, so we can allow the customer to change up ... That's easy to do in our systems. But that also means that we're not just innovating in ..." Shedd explained.
Read the full story: https://siliconangle.com/2024/11/20/direct-liquid-cooling-seguente-dell-nvidia-sc24/

Geeta Vaghela, senior director of product management at Dell, discussed the transition from legacy data systems to AI-native approaches. She highlighted the importance of breaking down data silos and architecting systems designed for AI's unique demands. "Data management takes a lens beyond infrastructure into data," Vaghela remarked.

Read the full story: https://siliconangle.com/2024/11/20/ai-in-cancer-research-transforming-breakthroughs-msk-sc24/

Saurabh Kapoor, director of product management and strategy at Dell, detailed efforts to diversify AI compute options with AMD GPUs and modular AI Factory solutions. He stressed the importance of sustainability and decentralization to foster innovation and reduce reliance on single vendors. "Having alternative solutions out there for people to use is really important for the safety and growth of AI as a whole," Kapoor said.

Adam Glick, senior director of AI portfolio marketing at Dell, unveiled the AI Factory, a scalable infrastructure model designed to reinvent AI development from the data center to the edge. "The whole idea is that you've got something that's massively scalable," Glick noted.

David Schmidt, senior director of PowerEdge product management at Dell, showcased PowerEdge R7725 servers and their ability to support various use cases, such as video analysis for smart cities and enterprise IT. He highlighted Dell's focus on adaptability for industries with evolving needs. "We wanted to give enterprise IT customers that kind of large-scale, highly adaptable system," Schmidt said.

Armando Acosta, director of HPC product management at Dell, explored the rise of exascale computing and the cooling technologies required for high-performance CPUs and GPUs. He emphasized Dell's commitment to standardization, modularity and customer-centric innovation.
"If you want the highest performance, you want the best CPU, the highest performing GPU ... guess what? You have to do direct liquid cooling," Acosta stated.

Ihab Tarazi (pictured), senior vice president and chief technology officer of core, AI and networking at Dell, discussed Dell's strategy to align with open standards and accelerate AI system development. "Networking is the most important element in the performance of an AI system ... We've tuned our network now at 97% throughput," Tarazi said.

Arun Narayanan, senior vice president of compute and networking product management at Dell, highlighted how modular components have transformed Dell's product development process, reducing cycles from 18 months to six. "We've changed almost every process in our company to hit the inflection points of the market," Narayanan shared.

Read the full story: https://siliconangle.com/2024/11/20/ai-factories-dell-broadcom-denvr-dataworks-sc24/

James Wynia, director of product management, networking, Enterprise Solutions Group, at Dell, emphasized a focus on scalable, resilient AI networking solutions, the importance of standards such as Ethernet, and innovations that improve fabric management and bandwidth for future AI workloads. "There's no debate, Ethernet is the de facto standard for all things networking," Wynia explained. "There are niche opportunities where things show up like InfiniBand and we can list a bunch of others, but if you stack up all the networks, IP is by far the dominant player."

Scott Bils, vice president of product management and professional services at Dell, discussed Dell's comprehensive approach to data management, focusing on automation, compliance and the scalability needed for AI workloads. He also highlighted Dell's sustainable data center services as critical for meeting regulatory and energy efficiency goals. "Data is the critical driver of getting value from AI and gen AI," Bils remarked.
Read the full story: https://siliconangle.com/2024/11/22/building-robust-ai-networking-pipeline-automation-sc24/
[2]
AI networking solutions drive scalable, efficient systems - SiliconANGLE
Broadcom SC24 highlights: Tackling AI challenges with energy-efficient chips and scalable networks

With artificial intelligence driving unprecedented demand for AI networking, the push for increased performance, scalability and efficiency in modern infrastructure is top of mind for many companies. Broadcom Inc.'s latest advancements, from energy-efficient chips to innovative AI networking technologies, address these challenges head-on, enabling dense AI workloads while cutting power consumption and costs. With a collaborative approach and a focus on reliability, Broadcom is setting the stage for future-ready systems designed to meet the escalating demands of AI and high-performance computing.

At SC24, during three interviews with theCUBE, Broadcom leaders shared insights into the critical innovations shaping AI and high-performance computing. Collaborative efforts with key partners and a focus on reliability and adaptability underscore the company's vision for meeting the growing demands of AI, ensuring systems are future-ready and cost-efficient, according to Hasan Siraj (pictured), head of software products, ecosystem, at Broadcom. "There [are] people who know how to manage Ethernet-based networks," Siraj said. "There are troubleshooting tools, monitoring tools that are available. Whenever you're building an AI network, you have a front end, a backend, a storage and an out-of-band management network that's all Ethernet. It's a standard way of managing all of it."

Siraj and Hemal Shah, distinguished engineer and architect at Broadcom, spoke with theCUBE Research's John Furrier, Dave Vellante and Savannah Peterson at SC24, during an exclusive broadcast on theCUBE, SiliconANGLE Media's livestreaming studio. They discussed Broadcom's advancements in AI networking, focusing on energy-efficient technologies, scalable systems and collaborative innovations. (* Disclosure below.)
Siraj highlighted the company's groundbreaking advancements in power efficiency and networking, emphasizing their critical role in AI and HPC. Broadcom's Tomahawk 5 chip and Thor 2 NIC were showcased as transformative technologies, enabling up to 75% reductions in power consumption and cooling needs while maintaining exceptional performance. "We'll see the scale become bigger and bigger over the next four years, but we'll also see this go down to other verticals," Siraj said. "We will see enterprise adopters ... and from a networking perspective, we believe Ethernet will win. It's already on its way, and it can scale from the largest clusters on the planet to whatever optimizations that are required for inference and other use cases."

Collaboration is key to Broadcom's strategy, as its partnerships with Dell Technologies Inc. and Denvr Dataworks Corp. help create open, scalable systems that seamlessly integrate networking, storage and computing components, Siraj stressed.

Read more: https://siliconangle.com/2024/11/20/ai-factories-dell-broadcom-denvr-dataworks-sc24/

Networking plays a transformative role in enabling large-scale AI workloads, Siraj outlined during the day 2 keynote analysis at SC24. There has been a shift from traditional server-based systems to clustered architectures, where networking serves as the essential "glue" for scalability. AI also brings unique demands, such as massive bandwidth and low latency, with networking inefficiencies critically hindering AI job completion and infrastructure utilization, Siraj added.
It is likewise important to address challenges such as congestion management and failure recovery as GPU clusters scale to unprecedented sizes, potentially reaching millions of nodes, according to Siraj. Broadcom's forward-looking approach aims to meet the increasing complexity of AI infrastructure with strong, future-ready solutions. "If you are training a large model and these models are growing at an exponential [rate], they don't fit in a CPU, and a core of a CPU, virtualization is no play," he explained. "This is why you cannot fit a model within a server or two servers or four servers. That is why you need a cluster. When you have a cluster and everything is spread out, you need glue to put this all together. That is networking."

High-performance computing is advancing rapidly to address the increasing demands of AI and machine learning, and scalable AI networks and open standards play a critical role in driving that innovation. "Dell and Broadcom with our other partners, we are working to build really high bandwidth, high network utilized fabrics," Shah said. "In partnership, what we'll bring together is a lot of software integration, the whole diagnostic monitoring of the fabric, which makes life easy for deployments."

Looking ahead to SC25, Shah expects continued focus on scalability, alongside emerging developments such as advancements in UEC specifications and enhancements in fabric solutions. These ongoing innovations aim to meet the ever-growing demands of the AI and machine learning landscape while maintaining a commitment to quality and ease of use. "We should be able to talk about some of the enhancements we are doing at the solution, which are already in the works, but it'll be more mature next year," he said.
[3]
Scalable AI networks detailed by Dell and Broadcom - SiliconANGLE
Dell and Broadcom outline plans to drive scalable AI networks at SC24

The high-performance computing industry is rapidly advancing to meet the demands of artificial intelligence and machine learning workloads. At SC24, there's been a major focus on scalable AI networks and what comes next. There have been plenty of developments on that front over the past year, centered on networking and how to build a big fabric, given that AI and machine learning are all about scale, according to Hemal Shah (pictured, left), distinguished engineer and architect at Broadcom Inc. "Dell and Broadcom with our other partners, we are working to build really high bandwidth, high network utilized fabrics," Shah said. "In partnership, what we'll bring together is a lot of software integration, the whole diagnostic monitoring of the fabric, which makes life easy for deployments."

Shah and James Wynia (pictured, right), director of product management, networking, Enterprise Solutions Group, at Dell Technologies Inc., spoke with theCUBE Research's John Furrier and Savannah Peterson at SC24, during an exclusive broadcast on theCUBE, SiliconANGLE Media's livestreaming studio. They discussed scalable AI networks and the role of open standards in driving high-performance computing innovation. (* Disclosure below.)

Dell and Broadcom have been working together for some time and now have a validated design that is already public. The rapid pace of innovation has shifted timelines from years to mere months. "It's interesting. It felt like we all settled into, OK, this is what it takes to make an ASIC and to do a feature and to integrate and converge solutions," Wynia said. "Then, when the opportunity came with the whole AI explosion, we realized we can do it a whole lot faster. Every single company that we're working with is like, 'You're doing it how fast?' That's the way it's working. These solutions, they're quality solutions."
Fabric management becomes a huge challenge for anybody who's deploying this at scale, according to Shah. For that reason, it's important to make things easy to deploy, monitor and manage. "You need to now have fabrics smart enough that they have built-in resiliency," Shah said. "The transport, or reliability, is also built-in."

In many cases, it's not visible to end customers, because all they want to do is run their workload, according to Shah. Companies don't want to worry about a link going down. "These are the kind of intelligence, through software, we bring it in and then take that into account and make fabric deployment easy, and make fabric running all the time," he said.

Looking ahead to next year's SC25, these trends are expected to continue, and there are some interesting developments on the horizon, according to Wynia. "Next year this time, we will be talking a lot more about 1.6, where everything doubles, and we'll certainly be talking about the buzz that gets generated as the UEC spec gets ratified and released," he said. "And we start seeing progress when we start seeing announcements from our friends making silicon, as well as software on how to implement that."

Next year, it will all be about scale, scale and scale, according to Shah. But there will be more to dive into as well. "We should be able to talk about some of the enhancements we are doing at the solution, which are already in the works, but it'll be more mature next year," he said.

Here's the complete video interview, part of SiliconANGLE's and theCUBE Research's coverage of SC24:
[4]
Dell AI Factory solutions outlined at SC24 - SiliconANGLE
Dell and Nvidia aim to reshape enterprise AI amid industry transition

AI is driving a seismic shift in enterprise infrastructure, with companies racing to integrate cutting-edge technologies into their operations. Things are moving quickly, and Dell Technologies Inc. and Nvidia Corp. are aiming to change the game in the data center with Dell AI Factory solutions. The goal of the Dell AI Factory is to deliver something that is massively scalable; from there, people can start out with whatever their early workloads are, according to Adam Glick (pictured, right), senior director of AI portfolio marketing at Dell. "They can start small and literally just stack that up, not only just within a rack, but create rack scale deployments," Glick said. "We make it super simple to be able to take the hardware. We've worked a lot with our friends at Nvidia. There's a lot of work that's gone together to integrate these two pieces."

Glick and Jason Schroedl (left), director of product marketing for enterprise platforms at Nvidia, spoke with theCUBE Research's John Furrier and Dave Vellante at SC24, during an exclusive broadcast on theCUBE, SiliconANGLE Media's livestreaming studio. They discussed the transformative impact of AI infrastructure on enterprise productivity and what's available through Dell AI Factory solutions. (* Disclosure below.)

It's not just the hardware pieces and machines that go together; it's also the networking and the software stack, according to Glick. "Tested, integrated, optimized to be able to deploy that and deploy it fast and at scale, to deliver solutions that our customers are looking for," Glick said. For Nvidia, there's a completely new paradigm at play: Companies are reinventing their AI infrastructure with concepts like the AI Factory, according to Schroedl.
"We packaged this up in a simplified solution that can help companies get up and running very quickly, leveraging all the best practice and learnings that we've done at large scale, bringing that to the enterprise, helping them wherever they are on their journey, whether they're just getting started with generative AI or whether they're looking to go from proof of concept to proof of value," he said.

Meanwhile, the democratization of AI is empowering companies and individuals to innovate at unprecedented scales. Today, companies that are investing in generative AI are beginning to see tremendous advantages, according to Schroedl. "I think the difference that you'll see is the existing companies that are leveraging generative AI, deploying those use cases, are going to be the ones that win in their industry," he said. "There will be new entrants that'll also disrupt those industries. They may take share, but the companies that don't get started today with generative AI, they're going to be left behind."

The AI transformation taking place is really a Cambrian explosion, according to Glick. There's likely to be greater and greater adoption in the months and years ahead. "The organizations that get in sooner, they're going to get the greatest advantage out of that, because everything has a learning curve," Glick said. "As much as we try and integrate it, make it fast for people, everyone goes through that curve. We try and flatten that curve for folks, but ultimately you've got to get on the curve in order to start to get that benefit."

Here's the complete video interview, part of SiliconANGLE's and theCUBE Research's coverage of SC24:
[5]
Dell advances high-performance computing solutions for AI - SiliconANGLE
Dell Technologies innovates AI infrastructure with scalable solutions

In the world of high-performance computing and artificial intelligence, enterprises face mounting challenges to scale efficiently while managing the heat, power demands and complexity of cutting-edge infrastructure. Dell Technologies Inc. is addressing these challenges with innovations that prioritize flexibility, sustainability and rapid deployment, according to Tim Shedd (pictured), engineering technologist, office of the chief technology innovation officer, at Dell. "Our strategy is to make it easy for our customers to deploy the compute they need to run their businesses," he said. "All of this translates to a platform that is flexible, that will enable our customers to grow with that platform. It also creates a package that we can pre-integrate for our customers, deliver onsite ... have power and cooling, and you're up and running."

Shedd spoke with theCUBE Research's John Furrier and Savannah Peterson at SC24, during an exclusive broadcast on theCUBE, SiliconANGLE Media's livestreaming studio. They discussed Dell's innovations in high-performance computing, including scalable platforms, advanced cooling technologies and customer-focused solutions for AI infrastructure. (* Disclosure below.)

Dell is tackling the rising demands of high-performance computing by focusing on scalable, efficient and customer-friendly infrastructure. As AI workloads intensify, Dell has prioritized advanced cooling strategies to manage increased heat densities and simplify deployment, according to Shedd. "Up until recently, we were looking at scaling out with 50 to 60 kilowatt racks, but about two to three years ago, we realized, 'Hey, this is literally going to get hot,'" he said. "That's putting a whole bunch of heat in one place, and it's really that density that's a challenge."
To address these challenges, Dell's innovations aim to streamline data center integration through multi-generational hardware platforms that evolve to meet customers' changing needs in high-performance computing. These platforms support higher power densities and accommodate future networking, graphics processing units and software upgrades, according to Shedd. "We've already got customers for 220-kilowatt racks, but we knew we needed to get to a half-megawatt," he noted. "How do we design the power and cooling to allow one platform to be flexible enough for our customers to update to the next and next and next GPUs and software and networking, and everything that needs to go in there?"

Building on this foundation, Dell is redefining hardware platforms to adapt to the rapid pace of technological change. By prioritizing modularity and multi-generational support, Dell ensures its systems can evolve alongside advancements in silicon, cooling and power requirements, according to Shedd. "We've engineered what we call impedance match servers," he said. "No matter what you put in there, whether you put it in today or you put it in six years from now, it's going to get the cooling it needs automatically. It just works."

The company has also focused on balancing high-power requirements with the flexibility that customers demand. This dual approach drives advancements in both liquid- and air-cooling technologies for simplified integration across diverse workloads, Shedd explained. "We're growing the amount of the computer that is liquid-cooled, but we have to leave some parts open to air so that we can quickly service it," he said. "How can we use the air that's flowing through there now to cool better? You're going to see some innovations coming out from that early next year that will use the same amount of air or even less to pull higher power devices to get us there, to allow us that."
Dell's commitment to customer-centric innovation extends to developing comprehensive solutions that combine hardware, software and services. By delivering platforms that integrate into data centers, Dell enables rapid deployment and long-term reliability for diverse enterprise needs, Shedd added. "It's not just the hardware and the service, but also we're continually building and aggressively expanding our software package to make it easy to deploy," he said. "It's going to become more evident ... just how complete a package Dell is delivering. It's not just a limited number of platforms; it's the whole spectrum ... for customers and ... the services and software that enables it that's continually growing." Here's the complete video interview, part of SiliconANGLE's and theCUBE Research's coverage of SC24:
[6]
Analyzing Dell's blueprint for AI networking and data infrastructures - SiliconANGLE
Building out robust AI infrastructures: Networking, data pipeline automation and sustainability at the fore

A lot is riding on the success of today's artificial intelligence efforts, placing the underlying infrastructure under immense scrutiny. From AI networking to storage and compute, the enterprise resource draw is higher now than it's ever been. Given these pressures, how can organizations streamline their infrastructure to maintain sustainable, robust long-term AI operations? "The key to driving outcomes and business value from gen AI is data," said Scott Bils (pictured), vice president of product management and professional services at Dell Technologies Inc. "That's where the role of AI networking becomes so critical. When you think about AI networking and the role it plays in data, when you think about clusters and AI architectures, they're fundamentally different than traditional data center networking. When you think about clusters of GPUs, you essentially want the clusters at a rack level, or even a data center level, to function as a single computer ... a single brain."

In three separate interviews, Bils spoke with theCUBE Research's Rob Strechay at SC24, during an exclusive broadcast on theCUBE, SiliconANGLE Media's livestreaming studio. They discussed Dell providing vital support as organizations streamline their AI data management processes and build scalable, performant infrastructures. (* Disclosure below.)

AI networking differs significantly from traditional data center networking. AI workloads demand low latency, high throughput and seamless GPU-to-GPU communication. Unlike conventional setups where data is stored and retrieved in silos, AI systems require integrated architectures that enable data centers to function as cohesive units, according to Bils. To achieve this, technologies such as InfiniBand and RDMA are becoming essential for connecting GPUs at scale.
However, their complexity poses challenges for many organizations, particularly in terms of in-house expertise and architectural readiness. "As enterprise deployments begin to scale out, they're going to face and are facing similar issues," Bils said. "Helping them think through the overall design architecture, not just for today, but going forward as they scale out the environment, is a big part of the capability we bring -- then, the expertise from Nvidia and our other partners in the space as well."

Organizations face hurdles such as network bottlenecks and underutilized GPUs, which directly affect AI performance. Dell addresses these challenges by combining its expertise, partnerships with industry leaders such as Nvidia Corp. and tailored solutions. Its approach involves deploying hardware and integrating networking and computing resources to ensure optimal performance, according to Bils. "It's helping them then integrate the data into the appropriate use cases and then automate and orchestrate that to ensure that you have the right velocity, including the right access to the right data sets to support the use cases," he added. "It's also that life cycle view from identifying the data sources, classifying, curating, cleansing and then automation, ingestion and scaling. It's what organizations are going to have to do comprehensively to enable the AI opportunity."

AI-driven applications require vast amounts of data processed with speed and efficiency. Dell's approach to addressing these needs involves automating and orchestrating data pipelines tailored to specific use cases. This means understanding the performance metrics for each AI application -- whether for large language models or other AI systems -- and designing pipelines that meet those specific demands. With the right automation tools, businesses can scale and ensure the responsiveness of their AI models, according to Bils.
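The lifecycle Bils describes (identify sources, classify, curate, cleanse, then ingest) can be sketched as a minimal pipeline. The stage functions and record fields below are purely illustrative assumptions, not Dell's actual tooling:

```python
# Minimal sketch of the data-pipeline lifecycle described above:
# identify -> classify -> cleanse -> ingest.
# All stage logic and field names here are illustrative, not any vendor's tooling.

def identify(sources):
    # Keep only data sources that are declared and reachable.
    return [s for s in sources if s.get("reachable")]

def classify(records):
    # Tag each record with a coarse sensitivity class for governance.
    for r in records:
        r["class"] = "pii" if "email" in r else "general"
    return records

def cleanse(records):
    # Drop records missing the fields downstream use cases need.
    return [r for r in records if r.get("text")]

def ingest(records, store):
    # Append curated records into the serving store; return count loaded.
    store.extend(records)
    return len(records)

store = []
sources = [{"name": "crm", "reachable": True}, {"name": "old_dw", "reachable": False}]
records = [{"text": "doc a", "email": "a@x.com"}, {"text": ""}, {"text": "doc b"}]

active = identify(sources)
loaded = ingest(cleanse(classify(records)), store)
print(len(active), loaded)  # 1 active source, 2 records ingested
```

In a real deployment each stage would be an orchestrated, monitored job rather than an in-process function, but the ordering and the filtering responsibilities are the same.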
"You have to ensure that the data throughput, the way you've automated and orchestrated that model, is going to drive the scale, performance and responsiveness you need to match the outcome and deliver the value," he said.

Another crucial aspect of managing AI data is the implementation of AI-specific data catalogs. These catalogs enhance data discoverability, classification and compliance, making it easier for organizations to access the most relevant data sets for their AI applications. Additionally, catalogs track data lineage, ensuring traceability of the data and its transformations, which is vital for maintaining data integrity and meeting governance requirements, Bils explained. "It gets back to the data quality issues, being able to track that lineage and who's touched that data," he said. "Then the metadata as well. We think about data catalogs, an incredibly important part is the metadata about the content or the file itself, but also the metadata about the content that sits in the file or the object."

The integration of AI into data centers has escalated energy demands, with GPUs driving significantly higher power consumption than traditional CPUs. In addition to energy costs, organizations also face geopolitical instability and infrastructure limitations, all while managing increasing regulatory pressure for sustainability. To combat these challenges, companies must prioritize improving power usage effectiveness, according to Bils. "When you take a look at your typical data center, 40% to 60% of the operating costs are driven by energy costs," he said. "A lot of the factors that drive prices there are beyond our customers' control: geopolitical factors, factors around infrastructure, brittleness and stability. They have to control what they can control from an energy and sustainability standpoint."

Here are the complete video interviews, part of SiliconANGLE's and theCUBE Research's coverage of SC24:
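The efficiency metric Bils points to, power usage effectiveness, is conventionally defined as total facility power divided by IT equipment power, so a value of 1.0 would mean every watt reaches the IT gear. A quick worked example (the kilowatt figures are invented for illustration):

```python
# Power usage effectiveness (PUE) = total facility power / IT equipment power.
# The kilowatt figures below are illustrative, not measurements from any vendor.

def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    return total_facility_kw / it_equipment_kw

before = pue(total_facility_kw=1800, it_equipment_kw=1000)  # 1.8
after = pue(total_facility_kw=1300, it_equipment_kw=1000)   # 1.3 after cooling upgrades

# Non-IT overhead falls from 800 kW to 300 kW: a 62.5% cut in the part of the
# energy bill the operator can actually control.
overhead_cut = (800 - 300) / 800
print(before, after, overhead_cut)  # 1.8 1.3 0.625
```

This is why cooling innovation shows up so prominently in these interviews: it attacks the denominator-independent overhead term directly.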
[7]
AI data center solutions a key focus for Dell - SiliconANGLE
Dell looks to advance AI data center solutions for scalable enterprise innovation

The enterprise artificial intelligence landscape is changing rapidly, and companies are seeking solutions that meet diverse data center needs. There's a focus on creating scalable, efficient and user-friendly AI data center solutions to bridge the demands of companies in this new era. In recent years, customers have been seeking a variety of AI data center solutions spanning different shapes, sizes and capabilities, according to David Schmidt (pictured), senior director of PowerEdge product management at Dell Technologies Inc. "They really need things that fit their existing data centers," Schmidt said. "There are certainly use cases where customers are looking to build new data centers ... but we have to meet customers where they are today and provide them with powerful AI solutions."

Schmidt spoke with theCUBE Research's Dave Vellante and Savannah Peterson at SC24, during an exclusive broadcast on theCUBE, SiliconANGLE Media's livestreaming studio. They discussed the spectrum of AI computing solutions and the innovative strategies enabling high-performance infrastructure. (* Disclosure below.)

When it comes to AI solutions, some of that can be done with the compute-based solutions being built with PowerEdge, which involves a breadth of offerings, according to Schmidt. "We can deliver AI on compute as well as purpose-built, large-scale AI solutions, and we're doing that with systems built on fifth-generation AMD Turin," Schmidt said. "We're really excited to have new servers, new rack-based servers that are supporting latest-generation Turin processors."

Around SC24, there have been several liquid cooling or alternative cooling stations. But some customers aren't ready for that in their data centers, according to Schmidt.
"We are really focused on providing compelling air-cooled designs that they can fit inside their existing power and thermal footprint inside their racks, inside their data centers," he said. "Then, it's a matter of crafting the right AI solution and use cases and the right-sizing that's going to work for that type of environment."

These days, companies are wondering how they can deploy and demystify AI solutions for their environments, according to Schmidt. That doesn't mean they want to buy a new data center; many want to continue utilizing a previous investment. "The AI Factory is no surprise. I love it. I think we do really great things. We have reference architecture, we have validated designs that run the entire spectrum," he said.

Llama 3.2, for instance, is considered a small language model, with three billion parameters. Dell can support 100 concurrent users on a standard two-socket rack server running a 128-core Turin, according to Schmidt. "That's the kind of footprint, because that's the exact same type of system a customer would deploy to run just their standard IT infrastructure as well. It gives them a common install base, and they love that," he said.

Over the coming year, companies are likely to be enabled to succeed in a variety of industries, some of which will be surprising, in Dell's view. "We have some really compelling conversations going on right now. We have the capability to deliver at scale in a way that is just going to accelerate customers' time to value," he said. "It's just going to be awesome to sit here a year from now and have those types of conversations."

Here's the complete video interview, part of SiliconANGLE's and theCUBE Research's coverage of SC24:
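Schmidt's point that a three-billion-parameter model fits on a standard two-socket server is easy to sanity-check with back-of-envelope arithmetic. The precision choices below are common conventions, not Dell's published sizing guidance:

```python
# Back-of-envelope weight-memory footprint for a 3-billion-parameter model.
# Bytes per parameter depends on numeric precision; the choices below are
# common conventions, not any vendor's published sizing guidance.

PARAMS = 3e9  # 3 billion parameters

def weight_gb(bytes_per_param: float) -> float:
    # Gigabytes needed just to hold the model weights.
    return PARAMS * bytes_per_param / 1e9

fp16_gb = weight_gb(2)    # 16-bit weights
int8_gb = weight_gb(1)    # 8-bit quantized
int4_gb = weight_gb(0.5)  # 4-bit quantized

print(fp16_gb, int8_gb, int4_gb)  # 6.0 3.0 1.5
```

Even at 16-bit precision the weights need only about 6 GB, a small fraction of the memory in a typical two-socket server, which is why serving many concurrent users of a small model on standard IT hardware is plausible (actual concurrency also depends on per-user context memory and compute throughput).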
[8]
Open standards and modular AI shaping Dell's tech vision - SiliconANGLE
Dell's AI evolution: Driving tangible innovation through open standards and scalable systems

For Dell Technologies Inc., the journey into artificial intelligence and high-performance computing has been a meticulously planned marathon rather than a sprint. What once was a hardware-focused company is now at the forefront of open innovation, blazing a trail in modular AI systems and networking solutions.

"Three years ago, we started to work on scalable systems, which is what you have on the floor now, and we wanted to be a much bigger player in HPC as well as AI," said Ihab Tarazi (pictured, left), senior vice president and chief technology officer of core, AI and networking at Dell. "We took a different path than most people. We decided to build a fully open system based on OCP specs, ORV3 21 inch, and also DC-MHS for the compute and fully open on software. We also wanted this to be the most dense, best in terms of performance, and a fully open system."

Tarazi and Arun Narayanan (right), senior vice president of compute and networking product management at Dell, spoke with theCUBE Research's Dave Vellante and John Furrier at SC24, during an exclusive broadcast on theCUBE, SiliconANGLE Media's livestreaming studio. They discussed Dell setting the pace for innovation with modular AI, networking excellence and a customer-first approach. (* Disclosure below.)

One of Dell's standout strategies has been modular design. Instead of crafting each server as a standalone product, Dell created building blocks that can be assembled to meet diverse needs. This approach slashed product development timelines from up to two years to as little as four months.

"We're fully compliant with OCP specs, which means anybody can build on it and we're able to accommodate the ecosystem," Tarazi said. "However, we did modifications to make it much more operationally easy.
All the cabling is easy to do, it supports all the power -- it supports the manifolds for liquid cooling in a very simple way with quick disconnects."

Dell's PCIe-based platforms, for example, accommodate smaller AI models, while the 21-inch rack design supports massive language model workloads. This flexibility ensures that Dell's solutions cater to a broad spectrum of use cases without sacrificing speed or customization.

Another bedrock of Dell's strategy is alignment with open standards. Adhering to OCP specifications has catalyzed industry-wide innovation, allowing hardware and software ecosystems to evolve collaboratively. Open systems enable faster development across components such as GPUs, CPUs and liquid cooling units, according to Narayanan.

"The way we've thought about our portfolio is we want to build rack-scalable systems," he said. "I think about rack-scalable systems [as] designed by Dell, engineered by Dell, manufactured by Dell and supported by Dell. That's what you need to think about, what a rack-scalable system is. And the way we've designed this is we want to support all customer use cases."

This open philosophy aligns with the rapid innovation cycles seen in AI today. By fostering a blueprint for collaborative development, Dell empowers silicon vendors, cooling providers and others to innovate efficiently, driving the industry forward, Narayanan added.

Here's the complete video interview, part of SiliconANGLE's and theCUBE Research's coverage of SC24:
[9]
Dell AI Factory utilizes strategic partnerships and more - SiliconANGLE
How Dell's AI Factory and strategic partnerships are shaping enterprise AI

The rapid evolution of artificial intelligence is reshaping the enterprise landscape. As organizations face growing pressure to adopt scalable AI solutions, Dell AI Factory offers an integrated approach intended to meet these demands. This new era poses challenges for businesses, which must balance budget constraints alongside demands to innovate. As theCUBE has written about extensively, enterprises seek deeper AI value.

"Enterprises are fighting a dual mandate of operating inside a tight information technology budget envelope while at the same time transforming their organization into an AI-first company," wrote theCUBE Research co-founder and Chief Analyst Dave Vellante in a recent edition of his Breaking Analysis series. "Navigating macroeconomic headwinds while driving innovation is an exciting challenge for IT decision-makers."

Those challenges have led many organizations to actively search for solutions that simplify AI integration while supporting broader business objectives. Included in the mix is Dell Technologies Inc. and Dell AI Factory, which is intended to meet the growing demand for integrated AI solutions by offering what the company says is access to the industry's broadest AI portfolio.

In today's era, data centers have to be designed from the ground up to handle AI's speed and scale, while new AI PCs are transforming productivity and collaboration, according to Jeff Clarke, vice chairman and chief operating officer of Dell.

"What's needed are new IT infrastructure and devices purpose-built to meet the specific demands of AI. The Dell AI Factory helps customers accelerate AI adoption with the world's broadest AI portfolio and leading AI ecosystem partners, offering right-sized approaches and greater control over AI deployments on-premises, at the edge and across cloud environments," Clarke recently said.
Dell says its AI Factory initiative aims to equip enterprises with the capabilities to integrate AI across their operations and drive digital transformation. Given the rapidly evolving nature of enterprise AI, it's worth examining what's on offer from the company and what might come next.

This feature is part of SiliconANGLE Media's exploration of Dell's efforts in enterprise AI. Be sure to watch theCUBE's analyst-led coverage of SC24 from November 19-21. (* Disclosure below.)

Earlier this year, Dell announced it was expanding its infrastructure portfolio with new Nvidia Corp.-powered AI platforms. At the time, Dell said its AI Factory was intended to be an "end-to-end AI enterprise solution" for training, tuning and running AI models, combining Nvidia chips with products from Dell's compute, storage, client device and software portfolios, as well as professional services. Those services, the company said, were intended to ease tasks such as preparing AI datasets.

In April, Scott Bils, vice president of gen AI professional services at Dell, provided more detail. Dell AI Factory is intended to be an end-to-end solution for enterprises, providing them with an "easy button" for AI. It brings the breadth of Dell infrastructure and hardware: "Compute, storage, networking, workstations ... plus Nvidia AI infrastructure and their software stacks, including the new NIM microservices software that they'll be bringing to market," Bils said during an interview with theCUBE.

It also involves a combination of Dell and Nvidia professional services. That's intended to provide a turnkey solution, he added: "To help them address their biggest challenges, their most complex issues around adopting AI use cases, and driving those into operations and production."
The solution combines Dell's compute, storage and security capabilities with Nvidia's AI infrastructure, intended to offer businesses a comprehensive, end-to-end system to accelerate business transformation and boost productivity. The fully integrated solution is intended to take advantage of rack-level design, while utilizing rigorous testing and validation to transform data into valuable insights and outcomes, according to the company.

"This solution also leverages existing offerings in enterprise data security with accompanying Dell services offerings in security and privacy," the company said in a recent release. "The Dell AI Factory with Nvidia supports a wide array of AI use cases and applications to support the entire gen AI lifecycle, from model creation and tuning to augmentation and inferencing."

Customers can also take advantage of enterprise-grade professional services, the company said. It also stated that Dell AI Factory is available via traditional channels and Dell APEX.

The Dell AI Factory is also intended to draw on the "raw material" of customer data. To that end, the company notes that the success of any AI initiative depends on the quality of the data used.

"The Dell AI Factory brings AI as close as possible to where data resides to minimize latency, lower costs and maintain data security by keeping sensitive information within a controlled environment," the company wrote in a recent blog. "It also provides a way to prepare this data for use by the AI factory, ensuring that customers are working with quality and accurate data, with easy access and built-in data governance."

The Dell AI Factory promises to simplify AI adoption with an all-in-one infrastructure package, aiming to deliver on the complex needs of modern enterprises. With that in mind, the company has sought to highlight its partnership with Nvidia, which it claims can simplify development, automate workflows and deliver up to 86% faster business outcomes.
"Dell Technologies and Nvidia have a long-standing partnership with over 25 years of joint innovation, focused on accelerating innovation and delivering cutting-edge platforms, solutions and software that enable transformative results for our joint customers," the company wrote of the partnership in May.

It also outlined another partnership deal aimed at AI development in recent months. A deal with Hugging Face Inc. is intended to set up a development hub on Hugging Face's platform, with an eye toward enabling organizations to train and deploy open, customized large language models on-premises on Dell infrastructure.

"Dell will be Hugging Face's preferred on-prem infrastructure provider to support enterprise adoption of tailored open-source gen AI data sets, libraries and models," said Varun Chhabra, senior vice president of product marketing for Dell's infrastructure solutions group, in May.

With an eye toward redefining enterprise AI, Dell has positioned its AI Factory as an all-in-one solution to help enterprises overcome the complexities of AI adoption. The company says its partnerships with Nvidia and Hugging Face can help it get there. As CEO Michael Dell explained at Dell Technologies World 2024, the goal is to bring AI to the data and leverage factories to convert data into intelligence for a business.

Its strategy is clear, according to Bob Laliberte, principal analyst at theCUBE Research. "To accelerate the building of these AI factories, Dell and its partner ecosystem will provide all the infrastructure and services required to build these on-premises environments. This includes high-performance servers, GPUs, networking, storage, and even AI-enabled PCs," Laliberte wrote in May.

In this new era, organizations must embark on AI initiatives to remain competitive, but the barrier to entry is high, as most companies lack the skills and the knowledge of where to start, according to Laliberte.
Dell is looking to change that with its end-to-end solutions and by helping organizations focus on the outcomes they desire. "Dell customers even have access to a free AI accelerator workshop to help them get started. The most crucial step is the first: Organizations must dedicate the time and resources to understand how AI can help their business," Laliberte said. "Dell wants to help accelerate the time to create your own AI Factory, leveraging its own and ecosystem partner solutions."

It's clear that Dell's AI Factory reflects a company goal to make AI adoption achievable for enterprises. With such a priority reflected across the enterprise, time will tell whether Dell's solution can drive the impactful results the company is shooting for.
[10]
Dell expands AI Factory with new high-powered servers and dense racks - SiliconANGLE
Dell expands AI Factory with new high-powered servers and dense racks

Dell Technologies Inc. is expanding its AI Factory in announcements being made today at the SC24 high-performance computing conference in Atlanta. The AI Factory is intended to help organizations adopt and scale up artificial intelligence across their operations by deploying an integrated set of infrastructure, software and services, including Nvidia Corp. AI infrastructure and software.

Dell is expanding its Integrated Rack Scalable Systems to include the PowerEdge XE9685L and PowerEdge XE7740 servers in a standard 19-inch enclosure called the Dell Integrated Rack 5000. Both will be available in the first quarter of next year. The IRSS program is a turnkey factory integration program that delivers fully loaded, rack-scale systems with a choice of water or air cooling and centralized service and support options for the entire rack. The IR5000 is designed to support high-density applications in a space-efficient form factor, delivering high performance and energy efficiency.

The PowerEdge XE9685L is a dense, 4U liquid-cooled server designed for AI, machine learning, high-performance computing and other data-intensive workloads. It features dual fifth-generation Advanced Micro Devices Inc. EPYC central processing units paired with Nvidia HGX H200 or B200 platforms. There are up to 12 Peripheral Component Interconnect Express slots, optimized storage connectivity and maximum input/output speed. Each rack can accommodate up to 96 Nvidia GPUs.

With support for 196 CPU cores and 96 GPUs, the new server will provide the industry's highest GPU density, said Arunkumar Narayanan, senior vice president of server and networking products at Dell. "We will integrate all of this in the factory so all the customer needs to do is have the rack delivered and plug it in," he said.
"The liquid cooling will be set up, the air-cooled servers will be set up, and the entire thing will work as one solution. If there's any issue, there's a single support call to Dell and Dell will manage the entire estate of issues."

The 4U air-cooled PowerEdge XE7740 uses dual Intel Corp. Xeon 6 processors with Intel Performance Cores and up to eight double-wide accelerators -- including Intel Gaudi 3 AI accelerators or Nvidia H200 NVL Tensor Core GPUs -- or up to 16 single-wide accelerators, such as the Nvidia L4 Tensor Core GPU. P-Cores are a type of processor core optimized for high-performance and intensive workloads. The new platform doubles the GPU capacity of its predecessor, the R760X, Narayanan said. "We expect this to be our most successful platform in this AI enterprise inferencing category," he said.

Dell also said it plans to support the upcoming Nvidia GB200 Grace Blackwell NVL4 Superchip with a new PowerEdge XE server designed for its Integrated Rack 7000, a liquid-cooled, high-density computing infrastructure with scalable power management announced last month. The IR7000 supports up to 144 GPUs per rack in a 50 OU standard rack.

Updates to the Dell Data Lakehouse, announced earlier this year, add Apache Spark support for distributed data processing at scale, providing a unified approach for data analytics, management, processing and analysis.

Updates to the Dell AI Factory with Nvidia add support for Nvidia's HGX H200 and H100 NVL to deliver up to 1.9 times higher performance than the Nvidia HGX H100. This enables users to perform more complex queries and accelerate retrieval-augmented generation operations leveraging PowerEdge, Dell PowerScale and Nvidia AI Enterprise software and generative AI tools.

New Validated Designs for AI PCs, available now, are open-source guides designed to boost the development of AI applications on Dell AI PCs with neural processing unit technology.
Developers can customize the modular designs to integrate features like large language models, vision, text and speech into applications that can be deployed across multiple platforms, regardless of processor.

Dell is also expanding its line of professional services to include sustainable data centers, data management services, AI networking services and assistance implementing ServiceNow Inc.'s Now Assist, a generative AI-powered product designed to improve productivity and workflow efficiency.
[11]
Data management at the forefront of Dell's AI services - SiliconANGLE
Saying that artificial intelligence has complicated the world of data management would be an understatement. Companies are being forced to reckon with their previous data management policies and update them to match the demands of large language models. Dell Technologies Inc. aims to support customers in that journey.

"People have been storing data for a long time and they have been doing what they felt was necessary," said Geeta Vaghela (pictured), senior director of product management, Unstructured Data Solutions, at Dell. "But data management takes a lens beyond infrastructure into data. What is it and where are you using it and why does it matter? I don't know that everyone's been doing that. There's a section of the market who are ... starting with a clean sheet of paper."

Vaghela spoke with theCUBE Research's Dave Vellante and Savannah Peterson at SC24, during an exclusive broadcast on theCUBE, SiliconANGLE Media's livestreaming studio. They discussed Dell's history with data storage and current trends in the industry. (* Disclosure below.)

Adding to the complexities of setting up a modern data storage infrastructure are security requirements and government regulations. The Dell Data Lakehouse is one of several solutions the company offers for data-intensive AI workloads.

"There's compliance and governance and all these things that many organizations, the government, have to comply to," Vaghela explained. "I think we're just now starting to scratch the surface on what does that mean? And how do you quickly get yourself ready so that you've got the data, these GPUs that are either on their way to you or have been delivered. It's a really, really interesting time where people are asking hard questions about storage and data management."

Dell also announced the Dell AI Factory with Nvidia Corp., with the goal of driving AI innovation and productivity.
The portfolio of solutions allows customers to extract metadata about their organization and produce accurate AI models.

"Your metadata comes from one engine," Vaghela said. "The biggest thing that I'm seeing is how do we use a selection of toolsets and bring them together such that we're providing a full workflow ... but also do it in a way that the sum total gives a repeatable model that many of these companies can stand behind because otherwise we get into hallucinations."

Although there are a variety of vertical trends for AI applications, including life sciences and autonomous driving, many of the challenges are the same, according to Vaghela. Dell offers additional AI implementation services to its customers, especially those just getting into data management.

"What Dell does is we validate those solutions, we've got a point of view on where the strengths are, what to consider," she said. "And for those that are early in the journey, there's professional service opportunities."

Here's the complete video interview, part of SiliconANGLE's and theCUBE Research's coverage of SC24:
Dell Technologies and its partners presented advancements in AI infrastructure, including the AI Factory, cooling technologies, and networking solutions at the Supercompute conference (SC24).
At the Supercompute conference (SC24), Dell Technologies showcased its latest innovations in AI infrastructure, including the AI Factory and advanced cooling technologies. Adam Glick, senior director of AI portfolio marketing at Dell, introduced the AI Factory as a scalable infrastructure model designed to reinvent AI development from data centers to the edge. "The whole idea is that you've got something that's massively scalable," Glick noted [1].

Tim Shedd, engineering technologist at Dell, highlighted the company's advancements in cooling technologies. As AI workloads intensify, Dell has prioritized advanced cooling strategies to manage increased heat densities and simplify deployment. Shedd explained, "We're growing the amount of the computer that is liquid-cooled, but we have to leave some parts open to air so that we can quickly service it" [2].

James Wynia, director of product management networking at Dell, emphasized the company's focus on scalable, resilient AI networking solutions. He stressed the importance of Ethernet as the standard for networking, stating, "There's no debate, Ethernet is the de facto standard for all things networking" [1].

Dell's partnership with Broadcom was highlighted as crucial for developing high-bandwidth, high-utilization network fabrics. Hemal Shah, distinguished engineer and architect at Broadcom, stated, "Dell and Broadcom with our other partners, we are working to build really high bandwidth, high network utilized fabrics" [3].

The conference underscored AI's transformative effect on enterprise infrastructure. Nvidia's Jason Schroedl emphasized the importance of early adoption, saying, "I think the difference that you'll see is the existing companies that are leveraging generative AI, deploying those use cases, are going to be the ones that win in their industry" [4].

Dell's approach focuses on creating future-ready systems designed to meet the escalating demands of AI and high-performance computing. Saurabh Kapoor, director of product management and strategy at Dell, stressed the importance of diversifying AI compute options and fostering innovation [1].

Geeta Vaghela, senior director of product management at Dell, discussed the transition from legacy data systems to AI-native approaches, highlighting the importance of breaking down data silos and architecting systems designed for AI's unique demands [1].

Broadcom showcased energy-efficient chips, including the Tomahawk 5 chip and Thor 2 NIC, which enable up to 75% reductions in power consumption and cooling needs while maintaining exceptional performance [5].
As the AI landscape continues to evolve, Dell and its partners are positioning themselves at the forefront of innovation, addressing the complex challenges of scalability, efficiency, and performance in AI infrastructure.
© 2025 TheOutpost.AI All rights reserved