Your data is critical to AI success. So, are you really taking care of it?
Sponsored Feature There can't be many tech leaders who don't realize that AI relies on vast amounts of data.
Yet it's not always clear that everyone has grasped just how fundamental their storage choices are to the success of their AI initiatives.
It's not for want of spending. The first half of 2024 saw organizations increase spending on compute and storage infrastructure for AI deployments by 37 percent to $31.8bn, according to research firm IDC. Almost 90 percent of this was accounted for by servers, with 65 percent being deployed in cloud or shared environments. The storage element increased 36 percent as organizations strive to manage the massive datasets needed for training as well as data repositories for inference.
Gartner's John-David Lovelock also sounded a warning in October, as he predicted increased spending on AI infrastructure in 2025: "The reality of what can be accomplished with current GenAI models, and the state of CIOs' data, will not meet today's lofty expectations."
In fact, CIOs could be justified in asking whether they are pouring all that time and money into fighting what might prove to be the last storage war. AI, and transformations in general, have implications not just for how much data and storage is used, but also for the way it is used, provisioned, and managed.
As HPE's vice president of product management Sanjay Jagad explains: "Storage has gone from a repository of data value to where value gets created through data."
The challenge for IT is making sure that's what happens. For all the obsession with GPUs, Jagad says, those powerful processors cannot deliver AI on their own. "You need to still understand what data you have, what value you want to derive out of it, and how are you going to monetize it?"
Nor does it help that the majority of customers' data is "dark," according to some reports - ie tied up in repositories that don't allow live access, or in formats that are hard to analyze. Or even languishing on tape in off-site facilities. That's because traditional, legacy storage systems were designed for legacy, monolithic apps such as databases and ERP systems.
A question of control
Jagad explains: "They have a box, there are two controllers in them. They have a mid-plane, and a bunch of drives attached to it. That's a legacy design." And because of configurations like active-passive, one controller is typically doing all the work. These architectures are not easy to expand and tend to be siloed. Scaling them tends to increase, rather than solve, inefficiencies.
As a result, accommodating increasing application demands becomes an ongoing headache, because keeping pace with transformational changes in technology typically means a whole new box.
"That means you are going to now go do a forklift upgrade, or you have to do a rip and replace and do costly migrations," says Jagad. That all adds up to constant firefighting just to keep things stable, with an inevitable firestorm of upgrades every three to five years. This model was clumsy enough when it came to those legacy applications. It becomes even more problematic when it comes to modern applications, including AI systems.
It's not just that AI and other modern workloads require large amounts of data, increasing the burden on storage infrastructure. These workloads are also effectively cloud native, meaning they are distributed, with a focus on orchestration and reusability. Alongside performance, access and scalability are key. And with developers launching new projects, or rapidly iterating on and expanding existing ones, resources must be easy to provision.
All of which means IT departments need a storage platform that is not just performant but also disaggregated and easily scalable.
That's the promise of HPE's Alletra Storage MP B10000, which relies on "standardized, composable" building blocks: compute nodes, JBOF expansion shelves for capacity with up to 24 drives per enclosure, and switches. The system, configured for block storage, can scale from 15TB to 5.6PB, with controllers upgraded one at a time and storage expanded in two-drive increments.
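To make that sizing concrete, here is a minimal back-of-envelope sketch in Python. Only the 24-drive shelf and the two-drive expansion increment come from the figures above; the drive size and the helper functions (raw_capacity_tb, expansion_steps) are hypothetical illustrations, not an HPE sizing tool.

```python
# Illustrative capacity-planning sketch only. The drive size below is a
# hypothetical round number, not HPE-published sizing data.

DRIVES_PER_ENCLOSURE = 24   # JBOF expansion shelf, per the article
EXPANSION_INCREMENT = 2     # capacity is added two drives at a time


def raw_capacity_tb(enclosures: int, drives_per_enclosure: int, drive_tb: float) -> float:
    """Raw (pre-RAID, pre-data-reduction) capacity across all shelves."""
    return enclosures * drives_per_enclosure * drive_tb


def expansion_steps(current_drives: int, target_drives: int) -> int:
    """How many two-drive expansion steps are needed to reach a target drive count."""
    shortfall = max(0, target_drives - current_drives)
    return -(-shortfall // EXPANSION_INCREMENT)  # ceiling division


if __name__ == "__main__":
    # e.g. one full shelf of 24 x 15.36TB NVMe drives (hypothetical drive size)
    print(f"One full shelf: {raw_capacity_tb(1, DRIVES_PER_ENCLOSURE, 15.36):.0f}TB raw")
    # growing from 8 drives to a full shelf happens in two-drive steps
    print(f"Expansion steps from 8 to 24 drives: {expansion_steps(8, 24)}")
```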
The B10000 is optimized for NVMe, offering 100Gbit/s data speeds today, with 200Gbit/s and 400Gbit/s on the roadmap. So the fabric will not be a bottleneck, Jagad says. The controllers themselves are powered by AMD EPYC™ embedded CPUs which, as well as high performance, offer a large number of PCIe lanes, high memory bandwidth, and low energy consumption.
The architecture has been designed to liberate "ownership" of application data from a specific controller. Because they are stateless, each controller can see every disk beneath it, which means no silos. As a disaggregated system, if admins want to add more performance or more capacity, new controllers or new media or both can be added directly. That echoes the sort of iterative, rapid deployment we see in cloud native or DevOps paradigms.
Working over time?
More tangibly, under HPE's Timeless program, hardware upgrades are included at no additional cost, meaning admins can take advantage of processor or media improvements. If that sounds more like a software subscription or XaaS offering, you wouldn't be wrong.
As Jagad says, "The hardware is the smallest value story in terms of the whole economics." Rather, the value comes in software and the disaggregated architecture, he argues.
"I'm pretty confident that you are going to get the value out of it through higher performance, better SLAs and being able to avoid the expensive migrations and forklift upgrades," he adds. Modern media often has a much longer lifecycle, meaning that over time the storage blocks can be recast as a lower tier of storage rather than simply dumped. Afterall, "Do you really need all that media on your primary storage."
The B10000 was built to meet a range of deployment needs across the hybrid cloud - including on-prem, a software-defined option for the public cloud, and a disconnected, air-gapped option for highly regulated environments. That ties in with the third pillar of HPE's strategy, its HPE GreenLake cloud, which automates management of compute, networking and storage across those on-prem, colo and cloud environments, with a cloud-like experience from a single console.
HPE GreenLake itself has become increasingly AI-driven, Jagad explains, with a GenAI hybrid intelligence mode in the back end that handles key tasks, such as managing and moving workloads and optimizing storage accordingly, as well as predictive space reclamation, upgrade recommendations and running simulations.
It also draws on HPE's Zerto continuous data protection technology to protect data from cyberattacks, including ransomware, and streamline disaster recovery, whether from malicious actions or the myriad other problems to which storage systems are prone. These innovations all add up. HPE's internal analysis suggests that customers can expect a 30 percent TCO saving by avoiding the need for one forklift upgrade, and a 99 percent operational time saving due to AI-driven self-service provisioning. Earlier research on HPE GreenLake suggested a 40 percent saving on IT resources - and that was before the current level of AIOps.
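As a rough illustration of how those headline percentages compose, the short Python sketch below runs the arithmetic with entirely made-up inputs (hardware cost, migration cost, admin hours, hourly rate). The placeholders are chosen only to mirror the 30 percent and 99 percent claims; they are not HPE pricing or benchmark data.

```python
# Back-of-envelope TCO sketch. Every figure is a hypothetical placeholder used
# only to show how the savings arithmetic composes; none of it is HPE data.

def five_year_tco(hardware: float, migrations: float,
                  ops_hours: float, hourly_rate: float) -> float:
    """Total cost of ownership over a notional five-year window."""
    return hardware + migrations + ops_hours * hourly_rate


# Legacy pattern: one forklift upgrade and migration mid-cycle, mostly manual provisioning
legacy = five_year_tco(hardware=500_000, migrations=100_000,
                       ops_hours=1_600, hourly_rate=75)

# Disaggregated pattern: in-place upgrades, largely self-service provisioning
disaggregated = five_year_tco(hardware=500_000, migrations=0,
                              ops_hours=16, hourly_rate=75)

tco_saving = 1 - disaggregated / legacy   # ~30 percent with these inputs
ops_saving = 1 - 16 / 1_600               # ~99 percent with these inputs
print(f"TCO saving: {tco_saving:.0%}, ops time saving: {ops_saving:.0%}")
```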
HPE backs up its confidence in the platform with a 100 percent data availability guarantee, a 4:1 StoreMore total savings guarantee, and a 30-day satisfaction guarantee. But that's not the most important benefit. "Imagine the resources that I can free up from an IT operation so that they don't have to worry about life cycle support," says Jagad.
That's an immense amount of time and resources that can be redirected to other tasks. Not least figuring out the potential value of data, and what they can do to best realize it. "That's where their time needs to go, right?" says Jagad. "The role of IT is changing." But for IT to make the most of its potential, storage needs to change too.