
2 Sources
[1]

Edge computing's rise will drive cloud consumption, not replace it
The signs are everywhere that edge computing is about to transform AI as we know it. As AI moves beyond centralized data centers, we're seeing smartphones run sophisticated language models locally, smart devices process computer vision at the edge and autonomous vehicles make split-second decisions without cloud connectivity.
"A lot of attention in the AI space right now is on training, which makes sense in traditional hyperscale public clouds," said Rita Kozlov, VP of product at Cloudflare. "You need a bunch of powerful machines close together to do really big workloads, and those clusters of machines are what are going to predict the weather, or model a new pharmaceutical discovery. But we're right on the cusp of AI workloads shifting from training to inference, and that's where we see edge becoming the dominant paradigm."
Kozlov predicts that inference will move progressively closer to users -- either running directly on devices, as with autonomous vehicles, or at the network edge. "For AI to become a part of a regular person's daily life, they're going to expect it to be instantaneous and seamless, just like our expectations for web performance changed once we carried smartphones in our pockets and started to depend on it for every transaction," she explained. "And because not every device is going to have the power or battery life to do inference, the edge is the next best place."
Yet this shift toward edge computing won't necessarily reduce cloud usage as many predicted. Instead, the proliferation of edge AI is driving increased cloud consumption, revealing an interdependency that could reshape enterprise AI strategies. In fact, edge inference represents only the final step in a complex AI pipeline that depends heavily on cloud computing for data storage, processing and model training.
New research from Hong Kong University of Science and Technology and Microsoft Research Asia demonstrates just how deep this dependency runs -- and why the cloud's role may actually grow more vital as edge AI expands. The researchers' extensive testing reveals the intricate interplay required between cloud, edge and client devices to make AI tasks work more effectively.
How edge and cloud complement each other in AI deployments
To understand exactly how this cloud-edge relationship works in practice, the research team constructed a test environment mirroring real-world enterprise deployments. Their experimental setup included Microsoft Azure cloud servers for orchestration and heavy processing, a GeForce RTX 4090 edge server for intermediate computation and Jetson Nano boards representing client devices. This three-layer architecture revealed the precise computational demands at each level.
The key test involved processing user requests expressed in natural language. When a user asked the system to analyze a photo, GPT running on the Azure cloud server first interpreted the request, then determined which specialized AI models to invoke. For image classification tasks, it deployed a vision transformer model, while image captioning and visual questions used bootstrapping language-image pre-training (BLIP). This demonstrated how cloud servers must handle the complex orchestration of multiple AI models, even for seemingly simple requests.
The team's most significant finding came when they compared three different processing approaches. Edge-only inference, which relied solely on the RTX 4090 server, performed well when network bandwidth exceeded 300 KB/s, but faltered dramatically as speeds dropped. Client-only inference running on the Jetson Nano boards avoided network bottlenecks but couldn't handle complex tasks like visual question answering.
The hybrid approach -- splitting computation between edge and client -- proved most resilient, maintaining performance even when bandwidth fell below optimal levels.
These limitations drove the team to develop new compression techniques specifically for AI workloads. Their task-oriented method achieved remarkable efficiency: Maintaining 84.02% accuracy on image classification while reducing data transmission from 224KB to just 32.83KB per instance. For image captioning, they preserved high-quality results (bilingual evaluation understudy -- BLEU -- scores of 39.58 vs 39.66) while slashing bandwidth requirements by 92%. These improvements demonstrate how edge-cloud systems must evolve specialized optimizations to work effectively.
But the team's federated learning experiments revealed perhaps the most compelling evidence of edge-cloud symbiosis. Running tests across 10 Jetson Nano boards acting as client devices, they explored how AI models could learn from distributed data while maintaining privacy. The system operated with real-world network constraints: 250 KB/s uplink and 500 KB/s downlink speeds, typical of edge deployments. Through careful orchestration between cloud and edge, the system achieved approximately 68% accuracy on the CIFAR10 dataset while keeping all training data local to the devices.
CIFAR10 is a widely used dataset in machine learning (ML) and computer vision for image classification tasks. It consists of 60,000 color images, each 32x32 pixels in size, divided into 10 different classes. The dataset includes 6,000 images per class, with 5,000 for training and 1,000 for testing.
This success required an intricate dance: Edge devices running local training iterations, the cloud server aggregating model improvements without accessing raw data and a sophisticated compression system to minimize network traffic during model updates. This federated approach proved particularly significant for real-world applications.
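The aggregation step described above, in which the cloud server combines model improvements without seeing raw data, is commonly implemented as a weighted average of client parameters (often called federated averaging). The article does not say which aggregation rule the researchers used, so the following is only a minimal sketch under that assumption; the function name and the flat-list representation of model parameters are illustrative, not taken from the study.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Combine client parameter vectors into one global vector.

    Each client's parameters are weighted by the number of local
    training examples it holds. Only parameters reach the server;
    the raw training data never leaves the client.
    (Assumed aggregation rule: FedAvg-style weighted mean.)
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one non-empty weight vector per client size")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of examples must be positive")

    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        for i, value in enumerate(weights):
            averaged[i] += value * size / total
    return averaged


# Two hypothetical clients: the second holds three times as much data,
# so its parameters count three times as much in the global model.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_weights)
```

In a setup like the study's, `client_weights` would hold one parameter vector per Jetson Nano board (10 in total) and `client_sizes` the number of local training images on each board.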
For visual question-answering tasks under bandwidth constraints, the system maintained 78.22% accuracy while requiring only 20.39KB per transmission -- nearly matching the 78.32% accuracy of implementations that required 372.58KB. The dramatic reduction in data transfer requirements, combined with strong accuracy preservation, demonstrated how cloud-edge systems could maintain high performance even in challenging network conditions.
Architecting for edge-cloud
The research findings present a roadmap for organizations planning AI deployments, with implications that cut across network architecture, hardware requirements and privacy frameworks. Most critically, the results suggest that attempting to deploy AI solely at the edge or solely in the cloud leads to significant compromises in performance and reliability.
Network architecture emerges as a critical consideration. While the study showed that high-bandwidth tasks like visual question answering need up to 500 KB/s for optimal performance, the hybrid architecture demonstrated remarkable adaptability. When network speeds dropped below 300 KB/s, the system automatically redistributed workloads between edge and cloud to maintain performance. For example, when processing visual questions under bandwidth constraints, the system achieved 78.22% accuracy using just 20.39KB per transmission -- nearly matching the 78.32% accuracy of full-bandwidth implementations that required 372.58KB.
The hardware configuration findings challenge common assumptions about edge AI requirements. While the edge server utilized a high-end GeForce RTX 4090, client devices ran effectively on modest Jetson Nano boards. Different tasks showed distinct hardware demands:
For enterprises concerned with data privacy, the federated learning implementation offers a particularly compelling model.
By achieving roughly 70% accuracy on the CIFAR10 dataset while keeping all training data local to devices, the system demonstrated how organizations can leverage AI capabilities without compromising sensitive information. This required coordinating three key elements:
Build versus buy
Organizations that view edge AI merely as a way to reduce cloud dependency are missing the larger transformation. The research suggests that successful edge AI deployments require deep integration between edge and cloud resources, sophisticated orchestration layers and new approaches to data management.
The complexity of these systems means that even organizations with substantial technical resources may find building custom solutions counterproductive. While the research presents a compelling case for hybrid cloud-edge architectures, most organizations simply won't need to build such systems from scratch. Instead, enterprises can leverage existing edge computing providers to achieve similar benefits.
Cloudflare, for example, has built out one of the largest global footprints for AI inference, with GPUs now deployed in more than 180 cities worldwide. The company also recently enhanced its network to support larger models like Llama 3.1 70B while reducing median query latency to just 31 milliseconds, compared to 549 milliseconds previously.
These improvements extend beyond raw performance metrics. Cloudflare's introduction of persistent logs and enhanced monitoring capabilities addresses another key finding from the research: The need for sophisticated orchestration between edge and cloud resources. Its vector database improvements, which now support up to 5 million vectors with dramatically reduced query times, show how commercial platforms can deliver task-oriented optimization.
For enterprises looking to deploy edge AI applications, the choice increasingly isn't whether to build or buy, but rather which provider can best support their specific use cases.
The rapid advancement of commercial platforms means organizations can focus on developing their AI applications rather than building infrastructure. As edge AI continues to evolve, this trend toward specialized platforms that abstract away the complexity of edge-cloud coordination is likely to accelerate, making sophisticated edge AI capabilities accessible to a broader range of organizations.
The new AI infrastructure economics
The convergence of edge computing and AI is revealing something far more significant than a technical evolution -- it's unveiling a fundamental restructuring of the AI infrastructure economy. There are three transformative shifts that will reshape enterprise AI strategy.
First, we're witnessing the emergence of what might be called "infrastructure arbitrage" in AI deployment. The true value driver isn't raw computing power -- it's the ability to dynamically optimize workload distribution across a global network. This suggests that enterprises building their own edge AI infrastructure aren't just competing against commercial platforms; they're also competing against the fundamental economics of global scale and optimization.
Second, the research reveals an emerging "capability paradox" in edge AI deployment. As these systems become more sophisticated, they actually increase rather than decrease dependency on cloud resources. This contradicts the conventional wisdom that edge computing represents a move away from centralized infrastructure. Instead, we're seeing the emergence of a new economic model where edge and cloud capabilities are multiplicative rather than substitutive -- creating value through their interaction rather than their independence.
Third, and perhaps most profoundly, we're seeing the rise of what could be termed "orchestration capital," where competitive advantage derives not from owning infrastructure or developing models, but from the sophisticated optimization of how these resources interact.
It's about building a new form of intellectual property around the orchestration of AI workloads.
For enterprise leaders, these insights demand a fundamental rethinking of AI strategy. The traditional build-versus-buy decision framework is becoming obsolete in a world where the key value driver is orchestration. Organizations that understand this shift will stop viewing edge AI as a technical infrastructure decision and begin seeing it as a strategic capability that requires new forms of expertise and organizational learning.
Looking ahead, this suggests that the next wave of AI innovation won't come from better models or faster hardware, but from increasingly sophisticated approaches to orchestrating the interaction between edge and cloud resources. The entire economic structure of AI deployment is likely to evolve accordingly.
The enterprises that thrive in this new landscape will be those that develop deep competencies in what might be called "orchestration intelligence," or the ability to dynamically optimize complex hybrid systems for maximum value creation. This represents a fundamental shift in how we think about competitive advantage in the AI era, moving from a focus on ownership and control to a focus on optimization and orchestration.
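The payload sizes and link speeds quoted in the article above can be turned into rough transfer times and bandwidth savings with simple arithmetic. The sketch below is only an illustration: it assumes the quoted KB/s figures are usable payload throughput and ignores latency and protocol overhead, and the function names are made up for this example. Under those assumptions, the image classification payload shrinks by roughly 85% and the visual question-answering payload by about 94.5%.

```python
def transfer_seconds(payload_kb: float, bandwidth_kb_per_s: float) -> float:
    """Idealized time to send a payload over a link (no latency or overhead)."""
    if bandwidth_kb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return payload_kb / bandwidth_kb_per_s


def reduction_percent(original_kb: float, compressed_kb: float) -> float:
    """Percentage of data saved by sending the compressed payload instead."""
    if original_kb <= 0:
        raise ValueError("original size must be positive")
    return (1.0 - compressed_kb / original_kb) * 100.0


# Figures reported in the article: (task, original KB, compressed KB).
reported = [
    ("image classification", 224.0, 32.83),
    ("visual question answering", 372.58, 20.39),
]
uplink_kb_per_s = 250.0  # uplink speed used in the federated learning tests

for task, original_kb, compressed_kb in reported:
    saved = reduction_percent(original_kb, compressed_kb)
    before = transfer_seconds(original_kb, uplink_kb_per_s)
    after = transfer_seconds(compressed_kb, uplink_kb_per_s)
    print(f"{task}: {saved:.1f}% less data, {before:.2f}s -> {after:.2f}s per request")
```

The same two functions can be reused with any other link speed, such as the 300 KB/s threshold below which edge-only inference reportedly faltered.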
[2]

Purpose-built AI hardware: Smart strategies for scaling infrastructure
Enterprises can look forward to new capabilities -- and strategic decisions -- around the crucial task of creating a solid foundation for AI expansion in 2025. New chips, accelerators, co-processors, servers and other networking and storage hardware specially designed for AI promise to ease current shortages and deliver higher performance, expand service variety and availability, and speed time to value.
The evolving landscape of new purpose-built hardware is expected to fuel continued double-digit growth in AI infrastructure that IDC says has lasted 18 straight months. The IT firm reports that organizational buying of compute hardware (primarily servers with accelerators) and storage hardware infrastructure for AI grew 37% year over year in the first half of 2024. Sales are forecast to triple to $100 billion a year by 2028.
"Combined spending on dedicated and public cloud infrastructure for AI is expected to represent 42% of new AI spending worldwide through 2025," writes Mary Johnston Turner, research VP for digital infrastructure strategies at IDC.
The main highway for AI expansion
Many analysts and experts say these staggering numbers illustrate that infrastructure is the main highway for AI growth and enterprise digital transformation. Accordingly, they advise, technology and business leaders in mainstream companies should make AI infrastructure a crucial strategic, tactical and budget priority in 2025.
"Success with generative AI hinges on smart investment and robust infrastructure," said Anay Nawathe, director of cloud and infrastructure delivery at ISG, a global research and advisory firm. "Organizations that benefit from generative AI redistribute their budgets to focus on these initiatives."
As evidence, Nawathe cited a recent ISG global survey that found that proportionally, organizations had 10 projects in the pilot phase and 16 in limited deployment, but only six deployed at scale. A major culprit, says Nawathe, was the current infrastructure's inability to affordably, securely, and performantly scale. His advice? "Develop comprehensive purchasing practices and maximize GPU availability and utilization, including investigating specialized GPU and AI cloud services."
Others agree that when expanding AI pilots, proofs of concept or initial projects, it's essential to choose deployment strategies that offer the right mix of scalability, performance, price, security and manageability.
Experienced advice on AI infrastructure strategy
To help enterprises build their infrastructure strategy for AI expansion, VentureBeat consulted more than a dozen CTOs, integrators, consultants and other experienced industry experts, as well as an equal number of recent surveys and reports. The insights and advice, along with hand-picked resources for deeper exploration, can help guide organizations along the smartest path for leveraging new AI hardware and help drive operational and competitive advantages.
Smart strategy 1: Start with cloud services and hybrid
For most enterprises, including those scaling large language models (LLMs), experts say the best way to benefit from new AI-specific chips and hardware is indirectly -- that is, through cloud providers and services. That's because much of the new AI-ready hardware is costly and aimed at giant data centers. Most new products will be snapped up by hyperscalers Microsoft, AWS, Meta and Google; cloud providers like Oracle and IBM; AI giants such as xAI and OpenAI and other dedicated AI firms; and major colocation companies like Equinix. All are racing to expand their data centers and services to gain competitive advantage and keep up with surging demand.
As with cloud in general, consuming AI infrastructure as a service brings several advantages, notably faster jump-starts and scalability, freedom from staffing worries and the convenience of pay-as-you-go and operational expenses (OpEx) budgeting. But plans are still emerging, and analysts say 2025 will bring a parade of new cloud services based on powerful AI-optimized hardware, including new end-to-end and industry-specific options.
Smart strategy 2: DIY for the deep-pocketed and mature
New optimized hardware won't change the current reality: Do-it-yourself (DIY) infrastructure for AI is best suited for deep-pocketed enterprises in financial services, pharmaceuticals, healthcare, automotive and other highly competitive and regulated industries. As with general-purpose IT infrastructure, success requires the ability to handle high capital expenses (CapEx), sophisticated AI operations, and staffing and partners with specialty skills, as well as to absorb hits to productivity and still take advantage of market opportunities during the build. Most firms tackling their own infrastructure do so for proprietary applications with high return on investment (ROI).
Duncan Grazier, CTO of BuildOps, a cloud-based platform for building contractors, offered a simple guideline. "If your enterprise operates within a stable problem space with well-known mechanics driving results, the decision remains straightforward: Does the capital outlay outweigh the cost and timeline for a hyperscaler to build a solution tailored to your problem? If deploying new hardware can reduce your overall operational expenses by 20-30%, the math often supports the upfront investment over a three-year period."
Despite its demanding requirements, DIY is expected to grow in popularity. Hardware vendors will release new, customizable AI-specific products, prompting more and more mature organizations to deploy purpose-built, finely tuned, proprietary AI in private clouds or on premises.
Many will be motivated by faster performance of specific workloads, derisking model drift, greater data protection and control and better cost management.
Ultimately, the smartest near-term strategy for most enterprises navigating the new infrastructure paradigm will mirror current cloud approaches: An open, "fit-for-purpose" hybrid that combines private and public clouds with on-premises and edge.
Smart strategy 3: Investigate new enterprise-friendly AI devices
Not every organization can get its hands on $70,000 high-end GPUs or afford $2 million AI servers. Take heart: New AI hardware with more realistic pricing for everyday organizations is starting to emerge.
The Dell AI Factory, for example, includes AI accelerators, high-performance servers, storage, networking and open-source software in a single integrated package. The company also has announced new PowerEdge servers and an Integrated Rack 5000 series offering air and liquid-cooled, energy-efficient AI infrastructure. Major PC makers continue to introduce powerful new AI-ready models for decentralized, mobile and edge processing.
Veteran industry analyst and consultant Jack E. Gold -- president and principal analyst of J. Gold Associates -- said he sees a growing role for less expensive options in accelerating adoption and growth of enterprise AI. Gartner projects that by the end of 2026, all new enterprise PCs will be AI-ready.
Smart strategy 4: Double down on basics
The technology might be new. But good news: Many rules remain the same.
"Purpose-built hardware tailored for AI, like Nvidia's industry-leading GPUs, Google's TPUs, Cerebras wafer-scale chips and others are making build versus buy decisions much more nuanced," said ISG's Nawathe. But he and others point out that the core principles for making these decisions remain largely consistent and familiar. "Enterprises are still evaluating business need, skills availability, cost, usability, supportability and best of breed versus best in class."
Experienced hands stress that the smartest decisions about whether and how to adopt AI-ready hardware for maximum benefit require fresh-eyed, disciplined analysis of procurement fundamentals. Specifically, that means assessing the impact on the larger AI stack of software, data and platforms, along with a thorough review of specific AI goals, budgets, total cost of ownership (TCO) and ROI, security and compliance requirements, available expertise and compatibility with existing technology.
Energy for operating and cooling is a big X-factor. While much public attention focuses on new, mini nuclear plants to handle AI's voracious hunger for electricity, analysts say non-provider enterprises must begin factoring in their own energy expenses and the impact of AI infrastructure and usage on their corporate sustainability goals.
Start with use cases, not hardware and technology
In many organizations, the era of AI "science experiments" and "shiny objects" is ending or over. From now on, most projects will require clear, attainable key performance indicators (KPIs) and ROI. This means enterprises must clearly identify the "why" of business value before considering the "how" of technology infrastructure.
"You'd be surprised at how often this basic gets ignored," said Gold.
No doubt, choosing the best qualitative and quantitative metrics for AI infrastructure and initiatives is a complex, emerging, personalized process.
Get your data house in order first
Likewise, industry experts -- not just sellers of data products -- stress the importance of a related best practice: Beginning with data. Deploying high-performance (or any) AI infrastructure without ensuring data quality, quantity, availability and other basics will quickly and expensively lead to bad results.
Juan Orlandini, CTO of North America for global solutions and systems integrator Insight Enterprises, pointed out: "Buying one of these super highly accelerated AI devices without actually having done the necessary hard work to understand your data, how to use it or leverage it and whether it's good is like buying a firewall but not understanding how to protect yourself."
Unless you're eager to see what garbage in/garbage out (GIGO) on steroids looks like, don't make this mistake.
And make sure to keep an eye on the big picture, advises Kjell Carlsson, head of AI strategy at Domino Data Lab and a former Forrester analyst. He warned: "Enterprises will see little benefit from these new AI hardware offerings without dramatically upgrading their software capabilities to orchestrate, provision and govern this infrastructure across all of the activities of the AI lifecycle."
Be realistic about AI infrastructure needs
If your company is mostly using or expanding Copilot, OpenAI and other LLMs for productivity, you probably don't need any new infrastructure for now, said Matthew Chang, principal and founder of Chang Robotics.
Many large brands, including Fortune 500 manufacturer clients of his Jacksonville, Fla., engineering company, are getting great results using AI-as-a-service. "They don't have the computational demands," he explained, "so, it doesn't make sense to spend millions of dollars on a compute cluster when you can get the highest-end product in the market, ChatGPT Pro, for $200 a month."
IDC advises thinking about AI impact on infrastructure and hardware requirements as a spectrum. From highest to lowest impact: Building highly tailored custom models, adjusting pre-trained models with first-party data, contextualizing off-the-shelf applications, consuming AI-infused applications "as-is".
Stay flexible and open for a fast-changing future
Sales of specialized AI hardware are expected to keep rising in 2025 and beyond. Gartner forecasts a 33% increase, to $92 billion, for AI-specific chip sales in 2025.
On the service side, the growing ranks of GPU cloud providers continue to attract new money, new players (including Foundry) and enterprise customers. An S&P/Weka survey found that more than 30% of enterprises have already used alternate providers for inference and training, often because they couldn't source GPUs. An oversubscribed $700-million private funding round for Nebius Group, a provider of cloud-based, full-stack AI infrastructure, suggests even wider growth in that sphere.
AI is already moving from training in giant data centers to inference at the edge on AI-enabled smartphones, PCs and other devices. This shift will yield new specialized processors, noted Yvette Kanouff, partner at JC2 Ventures and former head of Cisco's service provider business. "I'm particularly interested to see where inference chips go in terms of enabling more edge AI, including individual CPE inference, saving resources and latency in run time," she said.
Because the technology and usage are evolving quickly, many experts caution against getting locked into any service provider or technology. There's wide agreement that a multicloud approach -- one that spreads AI infrastructure, data and services across two or more cloud providers -- is a sensible strategy for enterprises.
Srujan Akula, CEO and co-founder of The Modern Data Company, goes a step further. Hyperscalers offer convenient end-to-end solutions, he said, but their integrated approaches make customers dependent on a single company's pace of innovation and capabilities. A better strategy, he suggested, is to follow open standards and decouple storage from compute. Doing so lets an organization rapidly adopt new models and technologies as they emerge, rather than waiting for the vendor to catch up.
"Organizations need the freedom to experiment without architectural constraints," agreed BuildOps CTO Grazier. "Being locked into an iPhone 4 while the iPhone 16 Pro is available would doom a consumer application, so why should it be any different in this context? The ability to transition seamlessly from one solution to another without the need to rebuild your infrastructure is crucial for maintaining agility and staying ahead in a rapidly evolving landscape."
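Grazier's rule of thumb in the second article (a 20-30% cut in operational expenses can justify upfront hardware spending over three years) reduces to a simple break-even comparison. The sketch below is an illustration only: the dollar amounts are hypothetical, the function names are invented for this example, and it assumes a flat annual saving with no discounting, financing costs or hardware depreciation.

```python
def cumulative_savings(annual_opex: float, reduction: float, years: int = 3) -> float:
    """Total operating cost avoided over the period (flat saving, no discounting)."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be a fraction between 0 and 1")
    if years <= 0:
        raise ValueError("years must be positive")
    return annual_opex * reduction * years


def net_benefit(capex: float, annual_opex: float, reduction: float, years: int = 3) -> float:
    """Savings minus the upfront outlay; positive means the purchase pays off."""
    return cumulative_savings(annual_opex, reduction, years) - capex


# Hypothetical enterprise: $10M annual OpEx and a $6.5M hardware purchase.
annual_opex = 10_000_000.0
capex = 6_500_000.0

for reduction in (0.20, 0.25, 0.30):
    ceiling = cumulative_savings(annual_opex, reduction)
    result = net_benefit(capex, annual_opex, reduction)
    verdict = "pays off" if result > 0 else "does not pay off"
    print(f"{reduction:.0%} reduction: break-even CapEx ${ceiling:,.0f}, {verdict}")
```

Under these simplified assumptions, the largest outlay that breaks even is three years of savings, so the answer can flip within the 20-30% range Grazier cites.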
As edge computing rises in prominence for AI applications, it's driving increased cloud consumption rather than replacing it. This symbiosis is reshaping enterprise AI strategies and infrastructure decisions.

The AI landscape is witnessing a significant shift towards edge computing, with smartphones running sophisticated language models locally and smart devices processing computer vision at the edge. Rita Kozlov, VP of product at Cloudflare, predicts that AI workloads will increasingly move from training to inference, with inference moving progressively closer to users [1].
Contrary to earlier predictions, the shift towards edge computing is not reducing cloud usage. Instead, it's driving increased cloud consumption, revealing a complex interdependency that could reshape enterprise AI strategies. Edge inference represents only the final step in a complex AI pipeline that heavily relies on cloud computing for data storage, processing, and model training [1].
Recent research from Hong Kong University of Science and Technology and Microsoft Research Asia demonstrates the intricate interplay required between cloud, edge, and client devices for effective AI tasks. Their experimental setup, which included Microsoft Azure cloud servers, a GeForce RTX 4090 edge server, and Jetson Nano boards, revealed that a hybrid approach -- splitting computation between edge and client -- proved most resilient in maintaining performance [1].
The researchers developed new compression techniques specifically for AI workloads, achieving remarkable efficiency. They maintained 84.02% accuracy on image classification while reducing data transmission from 224KB to just 32.83KB per instance. For image captioning, they preserved high-quality results while slashing bandwidth requirements by 92% [1].
Federated learning experiments revealed compelling evidence of edge-cloud symbiosis. The system achieved approximately 68% accuracy on the CIFAR10 dataset while keeping all training data local to the devices, operating under real-world network constraints [1].
As edge computing gains prominence, purpose-built AI hardware is emerging as a key factor in scaling AI infrastructure. New chips, accelerators, co-processors, servers, and other networking and storage hardware specially designed for AI promise to ease current shortages and deliver higher performance [2].
Enterprises face crucial decisions in creating a solid foundation for AI expansion. IDC reports that organizational buying of compute and storage hardware infrastructure for AI grew 37% year over year in the first half of 2024, with sales forecast to triple to $100 billion a year by 2028 [2].
For most enterprises, including those scaling large language models (LLMs), experts recommend leveraging new AI-specific chips and hardware indirectly through cloud providers and services. This approach offers advantages such as faster jump-starts, scalability, and the convenience of pay-as-you-go and operational expenses budgeting [2].
.Summarized by

Navi