2 Sources
[1]
Production AI reshapes private cloud
Infrastructure reality check: Broadcom makes the private cloud case for AI
Production AI is pushing private cloud back into the center of enterprise infrastructure. The shift is not just about where workloads run. It is about cost control, security, governance and the need to bring AI closer to enterprise data. Broadcom Inc.'s VMware Cloud Foundation strategy reflects that reality, as organizations weigh cloud economics against the demands of AI inference at scale.
"AI is driving a couple of factors," said Paul Turner, chief product officer of the VMware Cloud Foundation Division at Broadcom. "It's driving one, great opportunity. There's compelling reasons of why people are adopting AI. One of the things that they need is that the platform that runs AI must be better, and that's what's driving VCF adoption today. The second thing that we're really seeing happen is AI is actually a cost multiplier, because it's increasing the cost of infrastructure. You've got to deal with the risks that AI can expose as well."
Turner; Prashanth Shenoy, chief marketing officer and vice president of marketing of the VMware Cloud Foundation Division at Broadcom; and others spoke with John Furrier and Gemma Allen at the Broadcom "Modern Private Cloud: A Secure Foundation for Production AI" event, during an exclusive broadcast on theCUBE, SiliconANGLE Media's livestreaming studio. They discussed how Broadcom and its partners and customers see private cloud becoming a practical foundation for secure production AI. (* Disclosure below.)
Production AI changes the private cloud equation
AI has made the infrastructure conversation more practical. Enterprises are not only asking how to adopt models; they are asking how to run inference, secure data and manage rising infrastructure costs without losing control of their operating model. That is where VCF is being positioned as a private cloud foundation, according to Turner.
"AI is also a risk and cost multiplier," he said. "Just think about a few stats: 73% of enterprises see AI-related attacks. That is almost every industry out there ... actually seeing these new attacks that are driven by AI-enabled software."
That risk profile is changing how enterprises think about public cloud. Training and experimentation may still lean on cloud services, but day-to-day production AI has different economics. Once inference becomes part of operations, token costs, data gravity and compliance become boardroom issues instead of technical footnotes, Shenoy explained.
"Last year, when we did the private cloud outlook study, there was a definitive cloud reset happening in the market, where private cloud and the operating model of private cloud to run your mission-critical workload on-premises or in a hybrid environment was on par with public cloud," he said. "Fast-forward to this year, when we did the same survey with 1,800 IT leaders and decision-makers around the globe. A lot of organizations are now moving their AI applications from a pilot phase of trying, to production, doing it at scale."
Cost pressure is also making virtualization feel newly relevant. Memory tiering, GPU utilization and shared platforms for virtual machines and containers all matter more when AI workloads move from isolated pilots into everyday business systems. Broadcom's argument is that infrastructure efficiency becomes a direct lever for AI adoption, Turner noted.
"It makes a huge difference to our customers when we save the money," he said. "At the end of the day, we do a lot to make a platform. You're going to hear more about how we make a platform powerful for AI. But it's also very important that we make it cost-effective, that we virtualize, that we provide the best platform possible."
AI sovereignty turns into an infrastructure priority
Private AI has also matured beyond a single architecture. The enterprise need is becoming more distributed, with local models, frontier models and AI gateways working together depending on sensitivity, cost and performance. That is especially important for organizations trying to keep data, control planes and audit trails under their own governance model, according to Chris Wolf, global head of AI and advanced services for the VMware Cloud Foundation Division at Broadcom.
"For a lot of our customers today, their definition means that it's not just about the data plane being sovereign; it's about the control plane being sovereign," he said. "It's that, 'I can disconnect from the internet and I can continue to run. I can continue to operate.' That's a difference, and that's been really driven over the last couple of years, far more so than we've seen previously."
Cloud services provider ThinkOn's work in Canada shows how this plays out in the field. For regulated environments, AI sovereignty is not an abstract policy debate. It becomes a deployment requirement that covers data classification, access control, model choice and the ability to operate in a trusted private cloud environment, according to Craig McLellan, founder and chief executive officer of ThinkOn.
"I'd even go a step further and say that it's also about model sovereignty," he added. "Many countries want to have their own sovereign models. For instance, in Canada, Cohere is a vibrant participant in the market, and we actually took the opportunity to work closely with Broadcom to add the Cohere model to the environment as a private cloud. We are able to provide the public sector with a combination of model sovereignty, certainly economic and data sovereignty, as well as control plane and data plane sovereignty."
Execution is now the harder test. Many organizations have AI strategies, but production use depends on whether infrastructure teams can make the experience safe, repeatable and easy enough for users. That means the operational stack around AI is becoming just as important as the GPU capacity underneath it, McLellan pointed out.
"The clients are always looking for the easy buttons despite the complexity the client wants to bring to the table," he said. "What I love about this first project that we worked on over the last few months was it was a three-way collaboration. There's a collaboration between Broadcom and ThinkOn because this is a complex environment that we need to present the easy button with. On top of that, we had to work with the client that was bringing workload to us that wasn't necessarily entirely thought through either. To be able to turn it into a tool that internal members of the public sector community can actually use safely and securely and, most importantly, easily was no small task."
Manufacturing use case shows the infrastructure stakes
Charlotte Pipe and Foundry Co. brings the private cloud discussion into a more grounded setting. The 125-year-old manufacturer is not chasing AI for novelty. Its VCF journey started with practical needs around workload mobility, security and the ability to modernize without refactoring core applications, noted Rodney Barnhardt, server administration at Charlotte Pipe and Foundry.
"Originally when we moved to VCF, it was prior to being able to do brownfield imports," he said. "While we've been VMware customers for a long time, prior to moving to VCF, we were on three tiers: Cisco, BladeCenter, Unity all-flash storage array. In looking at VMware by Broadcom and the VCF platform using HCX to be able to do those migrations, as well as vDefend to put microsegmentation around products, made VMware Cloud Foundation an ideal product to look at deploying within our environment."
Security became a central driver. For Charlotte Pipe, vDefend and microsegmentation offered a way to limit lateral movement and reduce exposure if an attacker breached the environment, Barnhardt explained. That kind of control becomes more important as AI connects more systems and expands the number of workflows touching sensitive operational data.
"If you use vDefend and create microsegmentation, for those that are not familiar, that limits applications and services to only the ports that they need access to," he said. "You can say that these servers cannot do a remote desktop protocol, or RDP, to these servers, and that can help lower the threat landscape. If an attacker gets in, they cannot just be jumping from server to server to server. It also adds a layer of protection to the overall environment."
The operational benefits are also tied to patching, upgrades and day-to-day management. A more automated platform can reduce manual effort and help IT teams respond faster when vulnerabilities emerge. Barnhardt's advice is practical: Get the fundamentals right before starting the migration.
"I think the key is to plan, is look at your current environment, make sure all of your hardware meets the HCL list, make sure you have all of the appropriate things in place," he said. "Look at the requirements and the HCL list, and make sure that what you have is already in place or that you may have to go out and make some changes or acquire some different hardware to make that upgrade. You really want to look into that before you start down the upgrade process or the move process, or you may have a delay in getting this transformation done."
(* Disclosure: TheCUBE is a paid media partner for the Broadcom "Modern Private Cloud: A Secure Foundation for Production AI" event. Neither Broadcom, the sponsor of theCUBE's event coverage, nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)
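The microsegmentation behavior Barnhardt describes in the article above (each service reaches only the ports it needs, and everything else, such as RDP between server groups, is refused) can be illustrated with a small default-deny allow-list. This is only a conceptual sketch in Python, not vDefend's API or rule format; the group names, ports and the `is_allowed` helper are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllowRule:
    """One permitted flow: source group -> destination group on a single port."""
    src_group: str
    dst_group: str
    port: int


# Hypothetical policy: only the flows each tier actually needs are listed.
ALLOW_RULES = {
    AllowRule("web", "app", 8443),  # web tier calls the app tier over HTTPS
    AllowRule("app", "db", 5432),   # app tier talks to its database
}

RDP_PORT = 3389  # standard Remote Desktop Protocol port


def is_allowed(src_group: str, dst_group: str, port: int,
               rules: set = ALLOW_RULES) -> bool:
    """Default-deny check: a flow passes only if an explicit rule matches it."""
    return AllowRule(src_group, dst_group, port) in rules


if __name__ == "__main__":
    print(is_allowed("web", "app", 8443))     # True: explicitly allowed
    print(is_allowed("web", "db", RDP_PORT))  # False: no rule, so RDP is blocked
```

Because nothing is permitted unless a rule names it, a compromised server in the hypothetical "web" group cannot open RDP sessions to other groups, which is the lateral-movement limit the quote refers to.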
[2]
Broadcom Debuts VMware Cloud Foundation 9.1 to Power Secure, Cost-Effective Production AI
VCF 9.1 Empowers Enterprises with Accelerated AI Workload Deployment at Lower Costs, Enhanced Security, and an Open Hardware Ecosystem
Broadcom Inc. today announced VMware Cloud Foundation (VCF) 9.1, a secure and cost-effective infrastructure platform for production AI workloads. VCF 9.1 delivers an AI and Kubernetes native private cloud platform with integrated security and mixed compute infrastructure support across AMD, Intel, and NVIDIA. This enables enterprises to deploy inference and agentic AI applications with significantly lower costs, enhanced security, and freedom to choose best-of-breed GPU and CPU hardware.
A preview of Broadcom's Private Cloud Outlook 2026 report reveals private cloud continues to be the preferred platform for production AI. More than half of organizations surveyed (56%) are running or planning to run production inferencing in a private cloud. More importantly, public cloud use for production inference was 41%, down 15% year over year. Additionally, 62% of IT leaders reported being very or extremely concerned about generative AI infrastructure costs while 36% report AI is driving new requirements for data protection, privacy, security controls and risk management. VMware Cloud Foundation provides a better alternative to public cloud for production workloads through intelligent software that maximizes infrastructure efficiency on existing servers while providing architectural control and regulatory compliance capabilities essential for production AI deployments.
VMware Cloud Foundation 9.1 will enable enterprises to deploy production workloads including inference and agentic AI with:
* Up to 40% reduction in server costs through intelligent memory tiering for clusters running a mix of AI and non-AI workloads;
* Up to 39% lower storage TCO through enhanced compression and deduplication for AI data pipelines;
* Up to 46% reduction in Kubernetes operational costs for running AI workloads at scale;
* 4x faster cluster upgrades and 2x increased fleet capacity to rapidly scale AI infrastructure.
"As more enterprises turn to AI for driving competitive advantage, they face three critical challenges: data and IP privacy concerns, surging infrastructure costs, and their readiness for the world of agentic AI," said Krish Prasad, senior vice president and general manager, VMware Cloud Foundation Division, Broadcom. "VCF 9.1 is a single unified platform that addresses all three and delivers one of the most advanced infrastructures for Private AI. We enable zero-trust security for AI, reduce costs through intelligent infrastructure optimization and hardware choice, and provide the flexibility to run both agentic workflows and accelerated inferencing on the same platform."
Efficient Infrastructure at Scale for AI Workloads
VCF 9.1 maximizes density for both VM and containerized AI workloads on existing infrastructure while dramatically reducing operational complexity. Through intelligent resource management and automated operations, enterprises can deploy more production workloads on current servers, scale efficiently across distributed environments, and eliminate the need for costly infrastructure expansion during a period of hardware shortage and rising costs. Key capabilities include:
* Intelligent resource optimization that maximizes infrastructure utilization through advanced memory tiering and next-generation storage compression for AI data pipelines, enabling higher AI workload density without performance compromises or expensive hardware refresh.
* Automated fleet operations at scale that deliver doubled management capacity to 5,000 hosts and 4x faster cluster upgrades across distributed and air-gapped environments, eliminating manual patching overhead while supporting rapid AI infrastructure expansion.
* Multi-tenant infrastructure for AI isolation that enables enterprises and service providers to run multiple AI projects and customers on shared infrastructure with strict security boundaries, maximizing utilization of expensive GPU and CPU resources while supporting data sovereignty for sensitive models.
* Open ecosystem integration that delivers multi-accelerator GPU choice across AMD and NVIDIA, support for leading AMD and Intel CPU platforms, and standards-based EVPN and VXLAN interoperability with Arista Universal Cloud Network, demonstrating VCF's commitment to providing the high-performance connectivity and compute flexibility production AI demands.
* High speed networking for AI workloads through VCF support for NVIDIA ConnectX-7 NICs and NVIDIA BlueField-3 with Enhanced DirectPath I/O. This enhancement enables high-speed, multi-host AI model training and data transfer, which is crucial for demanding Gen AI workloads.
* Virtualized load balancing and security with VMware Avi Load Balancer and VMware vDefend that eliminate hardware appliance requirements for AI inference endpoints and agentic applications, reducing capital expense while providing enterprise-grade resilience and automated lifecycle management.
High Velocity App Delivery: Modern Workload Platform for AI, Containers, and VMs
VCF 9.1 delivers a unified platform that accelerates AI application deployment by running inference workloads, agentic applications, containerized services, and traditional VMs on a single infrastructure layer. This eliminates operational fragmentation and the cost of managing separate stacks while providing the developer velocity and platform governance that production AI requires. Key capabilities include:
* Kubernetes scale and performance for AI that delivers 2.6x increased cluster scale, 70% faster deployments, 75% shorter upgrade windows compared to previous versions, and seamless scaling that enables zero downtime for production AI services.
* Mixed compute management that efficiently handles both CPU-intensive agentic AI workflows and GPU-accelerated inference on a unified platform, addressing the reality that agentic workloads demand significantly more CPU than GPU capacity for workflow execution and decision orchestration.
* AI observability and governance that provides detailed metrics for time to first token, token throughput, and GPU utilization across multiple accelerator types, enabling enterprises to maximize infrastructure ROI through precise hardware utilization monitoring while centralized policy injection and data sovereignty controls enable AI compliance enforcement and secure model access.
* Live application stack blueprints that capture multi-VM applications as reusable templates for rapid environment deployment, eliminating manual configuration errors and preventing configuration drift across development, test, and production environments while accelerating infrastructure delivery velocity.
Zero-Trust Architecture for AI Data Sovereignty and Governance
VCF 9.1 integrates security at the infrastructure layer to protect AI workloads, proprietary models, and training data from hypervisor to application. By delivering zero-trust segmentation, sovereign recovery, and continuous patching without bolt-on tools, VCF strengthens the security posture essential for production AI deployments that public cloud environments cannot match. Key capabilities include:
* On-premises ransomware recovery that provides isolated recovery environments and integrated validation tools, including new CrowdStrike Falcon® Endpoint Security support, protecting AI models and training data - significant intellectual property - from cross-border movement while avoiding massive bandwidth fees during crisis restoration.
* Continuous compliance enforcement that maintains regulatory adherence through centralized monitoring and automated desired state remediation for workloads and VCF stack components, enabling enterprises to demonstrate audit readiness for production AI deployments without manual overhead or separate compliance tools.
* Zero-downtime live patching that supports up to 80% of use cases without host evacuation or maintenance windows, eliminating disruption to production AI inference services and agentic applications that require continuous availability for service level agreements.
* Zero-trust lateral security that extends distributed IDS/IPS protection to Kubernetes AI workloads for the first time, delivering 9 Tbps threat inspection performance for distributed inference and 5x increased application identification for private cloud and internet applications.
* Self-service security with automation that provides centralized tagging, pre-defined security profiles, delegated firewall configurations and ingress web application security, enabling enterprises and service providers to secure AI deployments without operational complexity or fragmented security toolchains.
"Analyzing years of news archives in the public cloud is cost-prohibitive, with unpredictable pricing that makes AI projects difficult to plan," said V V Jacob, Senior General Manager, Systems for Malayala Manorama Co Ltd. "By deploying VCF Private AI Services on our existing VMware Cloud Foundation infrastructure, we will run AI-powered content summarization, heading generation, and editorial assistance directly on our private cloud. We believe this will give us the privacy and security essential for protecting editorial sources while delivering the cost predictability that on-premises private cloud infrastructure provides."
"By unifying our VMs and containers on VMware Cloud Foundation, we've achieved greater operational efficiency and raised the overall availability," said Alexander Hopfgartner, Head of Technology at Notruf Niederösterreich. "VMware vSphere Kubernetes Service, as the built-in Kubernetes runtime of VCF, empowers our operations team to easily deploy, scale, and manage our most critical applications."
"As enterprises move AI from experimentation to production, they need infrastructure that delivers performance, efficiency, and flexibility across a broad ecosystem at scale," said Kumaran Siva, corporate vice president, Compute and Enterprise AI, AMD. "AMD enterprise AI solutions, along with VMware Cloud Foundation 9.1, enable scalable, cost-efficient AI workloads, helping customers deploy inference and agentic AI with the performance, security, and data sovereignty required for production environments."
"Arista Networks and Broadcom share a fundamental commitment to open, standards-based networking that gives enterprises true architectural freedom and choice for production AI infrastructure," said Jeff Raymond, Vice President and General Manager of EOS Software and Services, Arista Networks. "EVPN and VXLAN interoperability between Arista Universal Cloud Network and VMware Cloud Foundation 9.1 delivers the openness and performance that production AI requires. Through standards-based direct ESX-to-fabric connectivity, enterprises can build scalable network architectures for AI infrastructure while reducing both capital and operational costs."
"AI workloads are now prime targets, and recovery without validation is a risk enterprises can't afford," said Chris Stewart, Vice President, Global Cloud and Technology Alliance Partners, CrowdStrike. "With CrowdStrike integrated with VMware Cloud Foundation, organizations can stop breaches faster, validate that environments are truly clean before restoring, and prevent reinfection - critical to protecting high-value models and data while maintaining full control over sovereignty and compliance."
"VMware Cloud Foundation 9.1 is further optimized for Intel® Xeon® 6 processors, unlocking the full potential of a high-density, AI-ready platform. Native integration of Intel® QuickAssist Technology accelerates Encrypted vMotion while freeing valuable compute resources," said Caitlin Anderson, Corporate Vice President, Americas Sales at Intel Corporation. "Together, we remain committed to delivering continuous innovation with superior total cost of ownership, helping customers accelerate their AI and container modernization journeys."
"Enterprises need infrastructure that delivers breakthrough AI performance while maintaining data sovereignty and control," said John Fanelli, vice president of enterprise software at NVIDIA. "Our collaboration with Broadcom brings NVIDIA Blackwell architecture, including RTX Pro Servers equipped with BlueField-3 and the NVIDIA Blackwell HGX platform, along with high-speed DirectPath I/O, to VMware Cloud Foundation. This enables organizations to deploy private AI with the same performance they expect from public cloud, but with complete control over their models and data. This collaboration addresses the reality that production AI requires both extraordinary compute power and enterprise-grade governance."
Broadcom launches VMware Cloud Foundation 9.1, targeting production AI workloads with up to 40% server cost reduction and enhanced security. The move reflects a broader shift as enterprises bring AI closer to enterprise data, with 56% now running or planning production inference on private cloud—while public cloud use drops 15% year-over-year.
Broadcom has unveiled VMware Cloud Foundation 9.1, positioning private cloud as the practical foundation for secure and cost-effective production AI [2]. The release comes as enterprises move AI workloads from experimental pilots to operational systems, fundamentally changing how organizations evaluate infrastructure for AI workloads [1]. This shift is not just about where workloads run; it centers on cost control, security, governance, and the strategic need to bring AI closer to enterprise data [1].
According to Broadcom's Private Cloud Outlook 2026 report, 56% of organizations are running or planning to run production inference on private cloud, while public cloud use for production inference dropped to 41%, down 15% year-over-year [2]. The data reveals a practical recalibration: while training and experimentation may still lean on cloud services, inference at scale has different economics. Once inference becomes part of operations, token costs, data gravity, and compliance become boardroom issues rather than technical footnotes.
Cost pressure is making infrastructure efficiency a direct lever for AI adoption. Broadcom reports that 62% of IT leaders are very or extremely concerned about generative AI infrastructure costs, while 36% report AI is driving new requirements for data protection, privacy, security controls, and risk management [2]. "AI is also a risk and cost multiplier," said Paul Turner, chief product officer of the VMware Cloud Foundation Division at Broadcom. "Just think about a few stats: 73% of enterprises see AI-related attacks" [1].
VMware Cloud Foundation 9.1 addresses these concerns with measurable infrastructure improvements. The AI and Kubernetes-native private cloud platform delivers up to a 40% reduction in server costs through intelligent resource optimization and advanced memory tiering for clusters running mixed AI and non-AI workloads [2]. Storage total cost of ownership drops up to 39% through enhanced compression and deduplication for AI data pipelines, while Kubernetes operational costs for running AI workloads at scale decrease up to 46% [2].
The platform integrates multi-tenant infrastructure for AI isolation, enabling enterprises and service providers to run multiple AI projects on shared infrastructure with strict security boundaries [2]. This maximizes utilization of expensive GPU and CPU resources while supporting data sovereignty for sensitive models. VCF 9.1 provides mixed compute infrastructure support across AMD, Intel, and NVIDIA, giving enterprises freedom to choose best-of-breed GPU and CPU hardware [2].
The platform also delivers 4x faster cluster upgrades and 2x increased fleet capacity, now managing up to 5,000 hosts, to rapidly scale AI infrastructure across distributed and air-gapped environments [2]. High-speed networking for AI workloads is enabled through support for NVIDIA ConnectX-7 NICs and NVIDIA BlueField-3 with Enhanced DirectPath I/O, crucial for demanding generative AI workloads [2].
For regulated environments, AI sovereignty has evolved from abstract policy debate to concrete deployment requirement. Chris Wolf, global head of AI and advanced services for the VMware Cloud Foundation Division at Broadcom, explained that enterprise definitions now extend beyond data plane sovereignty: "It's about the control plane being sovereign. It's that, 'I can disconnect from the internet and I can continue to run. I can continue to operate'" [1]. This governance model matters as organizations deploy local models, frontier models, and AI gateways together depending on sensitivity, cost, and performance requirements.
Prashanth Shenoy, chief marketing officer of the VMware Cloud Foundation Division at Broadcom, noted the shift in enterprise priorities: "Last year, when we did the private cloud outlook study, there was a definitive cloud reset happening in the market, where private cloud and the operating model of private cloud to run your mission-critical workload on-premises or in a hybrid environment was on par with public cloud. Fast-forward to this year, when we did the same survey with 1,800 IT leaders and decision-makers around the globe. A lot of organizations are now moving their AI applications from a pilot phase of trying, to production, doing it at scale" [1]. This transition signals that enterprises are watching infrastructure efficiency, security boundaries, and operational control as key indicators for sustainable AI deployment at enterprise scale.
Summarized by Navi