2 Sources
[1]
This distributed data storage startup wants to take on Big Cloud | TechCrunch
The explosion of AI companies has pushed demand for computing power to new extremes, and companies like CoreWeave, Together AI and Lambda Labs have capitalized on that demand, attracting immense amounts of attention and capital for their ability to offer distributed compute capacity. But most companies still store data with the big three cloud providers, AWS, Google Cloud, and Microsoft Azure, whose storage systems were built to keep data close to their own compute resources, not spread across multiple clouds or regions.

"Modern AI workloads and AI infrastructure are choosing distributed computing instead of big cloud," Ovais Tariq, co-founder and CEO of Tigris Data, told TechCrunch. "We want to provide the same option for storage, because without storage, compute is nothing."

Tigris, founded by the team that developed Uber's storage platform, is building a network of localized data storage centers that it claims can meet the distributed compute needs of modern AI workloads. The startup's AI-native storage platform "moves with your compute, [allows] data [to] automatically replicate to where GPUs are, supports billions of small files, and provides low-latency access for training, inference, and agentic workloads," Tariq said.

To do all of that, Tigris recently raised a $25 million Series A round that was led by Spark Capital and saw participation from existing investors, which include Andreessen Horowitz, TechCrunch has exclusively learned.

The startup is going up against the incumbents, whom Tariq calls "Big Cloud." Tariq argues these incumbents offer a data storage service that is not only more expensive but also less efficient. AWS, Google Cloud and Microsoft Azure have historically charged egress fees (dubbed "cloud tax" in the industry) if a customer wants to migrate to another cloud provider, or download and move their data if they want to, say, use a cheaper GPU or train models in different parts of the world simultaneously.
Think of it like having to pay your gym extra if you want to stop going there. According to Batuhan Taskaya, head of engineering at Fal.ai, one of Tigris' customers, those costs once accounted for the majority of Fal's cloud spending.

Beyond egress fees, Tariq says there's still the problem of latency with larger cloud providers. "Egress fees were just one symptom of a deeper problem: centralized storage that can't keep up with a decentralized, high-speed AI ecosystem," he said.

Most of Tigris' 4,000+ customers are like Fal.ai: generative AI startups building image, video and voice models, which tend to have large, latency-sensitive datasets. "Imagine talking to an AI agent that's doing local audio," Tariq said. "You want the lowest latency. You want your compute to be local, close by, and you want your storage to be local, too."

Big clouds aren't optimized for AI workloads, he added. Streaming massive datasets for training or running real-time inference across multiple regions can create latency bottlenecks, slowing model performance. Access to localized storage means data is retrieved faster, so developers can run AI workloads reliably and more cost-effectively on decentralized clouds. "Tigris lets us scale our workloads in any cloud by providing access to the same data filesystem from all these places without charging egress," Fal's Taskaya said.

There are other reasons why companies want to have data closer to their distributed cloud options. For example, in highly regulated fields like finance and healthcare, one large roadblock to adopting AI tools is that enterprises need to ensure data security. Another motivation, says Tariq, is that companies increasingly want to own their data; he points to how Salesforce earlier this year blocked its AI rivals from using Slack data. "Companies are becoming more and more aware of how important the data is, how it's fueling the LLMs, how it's fueling the AI," Tariq said.
"They want to be more in control. They don't want someone else to be in control of it." With the fresh funds, Tigris intends to continue building its data storage centers to support increasing demand -- Tariq says the startup has grown 8x every year since its founding in November 2021. Tigris already has three data centers in Virginia, Chicago and San Jose, and wants to continue expanding in the U.S. as well as in Europe and Asia, specifically in London, Frankfurt and Singapore.
[2]
Tigris Data raises $25M for its AI-optimized cloud storage service - SiliconANGLE
Tigris Data Inc., the operator of an object storage service optimized for artificial intelligence workloads, has raised $25 million in funding. The company said in its announcement of the Series A round today that Spark Capital was the lead investor. Returning backer Andreessen Horowitz chipped in as well.

Sunnyvale, California-based Tigris positions its service as an alternative to the major public clouds' object storage offerings. The service is S3-compatible, which means it can be used by applications originally written for Amazon S3 without major code modifications. That lowers the entry barrier for new customers.

Tigris says that its platform is particularly suitable for storing small files such as embeddings, the numerical vectors in which AI models represent data. It claims the platform can make such files available to workloads with less latency than S3. Reducing data retrieval times enables AI applications to process prompts faster.

One way Tigris reduces latency is by caching frequently used files. The company's cache is based on a data structure known as a log-structured merge-tree, or LSM for short. An LSM boosts performance by optimizing how information is organized in storage: a storage device is divided into numerous segments that each hold a small amount of information, and applications can usually read data from adjacent segments significantly faster than from non-adjacent ones. The LSM data structure used by Tigris keeps related information in adjacent segments to speed up retrieval times.

The company's platform also reduces latency in other ways. If Tigris detects that a data repository is frequently accessed by users in a certain region, it can move the repository to that region or create a cached copy there. That reduces the distance that users' network traffic must travel and thereby speeds up access times.

Tigris' platform offers four storage infrastructure tiers.
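To make the LSM description above concrete, here is a minimal, illustrative sketch of a log-structured merge-tree: recent writes accumulate in an in-memory table, are flushed as sorted immutable "runs" (so related keys sit in adjacent storage segments), and are periodically compacted into a single run. This is a toy model of the general technique; Tigris' actual implementation is not public, and all names here are invented for illustration.

```python
# Toy log-structured merge-tree (illustrative only; not Tigris' code).
class TinyLSM:
    def __init__(self, memtable_limit=4):
        self.memtable = {}              # buffers recent writes in memory
        self.runs = []                  # sorted, immutable runs (oldest first)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Write keys out in sorted order so each run occupies adjacent segments.
        self.runs.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:        # freshest data wins
            return self.memtable[key]
        for run in reversed(self.runs):  # search newest run first
            for k, v in run:             # a real LSM would binary-search here
                if k == key:
                    return v
        return None

    def compact(self):
        # Merge all runs into one; newer values overwrite older ones.
        merged = {}
        for run in self.runs:
            merged.update(run)
        self.runs = [sorted(merged.items())]
```

Compaction is what keeps reads fast over time: without it, a lookup would have to scan an ever-growing list of runs.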
There's a standard tier, another geared toward infrequently accessed files, and two optimized for archived datasets. The latter tiers trade off some speed for lower pricing.

Developers can move their workloads' data to Tigris from other platforms using a feature called Tigris shadow buckets. The feature gradually copies an application's most frequently used records, removing the need to move an entire dataset at once, an often complicated endeavor with the potential to cause technical issues.
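The gradual-migration idea behind shadow buckets can be sketched as a read-through copy: reads fall back to the old provider's bucket, and each object is copied into the new store the first time it is accessed, so hot data migrates first. The class and method names below are assumptions for illustration, not Tigris' actual API.

```python
# Read-through migration sketch in the spirit of "shadow buckets".
# ShadowBucket and its methods are hypothetical names, not Tigris' API.
class ShadowBucket:
    def __init__(self, source):
        self.source = source    # existing bucket on the old provider
        self.local = {}         # objects already copied to the new store

    def get(self, key):
        if key in self.local:             # already migrated: serve locally
            return self.local[key]
        value = self.source.get(key)      # fall back to the old provider
        if value is not None:
            self.local[key] = value       # copy on first access
        return value

    def put(self, key, value):
        self.local[key] = value           # new writes land in the new store only
```

Over time, frequently accessed objects end up local while cold data can be backfilled later, which is why this pattern avoids the all-at-once bulk transfer the article describes as error-prone.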
Tigris Data secures $25 million in Series A funding to provide distributed, AI-optimized data storage. The startup aims to offer a more efficient and cost-effective alternative to traditional cloud storage providers, addressing key pain points in the AI industry.
In a bold move to revolutionize data storage for AI workloads, Tigris Data has secured $25 million in Series A funding, led by Spark Capital with participation from Andreessen Horowitz [1][2]. The Sunnyvale-based startup, founded in November 2021, aims to provide a more efficient and cost-effective alternative to traditional cloud storage providers.

As the demand for AI computing power skyrockets, companies like CoreWeave, Together AI, and Lambda Labs have capitalized on offering distributed compute capacity. However, data storage has largely remained centralized with the "Big Three" cloud providers: AWS, Google Cloud, and Microsoft Azure.

Ovais Tariq, co-founder and CEO of Tigris Data, explains the problem: "Modern AI workloads and AI infrastructure are choosing distributed computing instead of big cloud. We want to provide the same option for storage, because without storage, compute is nothing." [1]
Tigris Data's platform offers several key advantages:

- Distributed Storage: The company is building a network of localized data storage centers to meet the distributed compute needs of modern AI workloads [1].
- AI-Native Design: The platform "moves with your compute, [allows] data [to] automatically replicate to where GPUs are, supports billions of small files, and provides low-latency access for training, inference, and agentic workloads," according to Tariq [1].
- Optimized Performance: Tigris uses a log-structured merge-tree (LSM) data structure to optimize data organization and speed up retrieval times [2].
- Flexible Storage Tiers: The platform offers four storage infrastructure tiers, catering to different needs and budgets [2].
Tigris Data tackles several challenges faced by AI companies:

- Egress Fees: Traditional cloud providers often charge hefty fees for data migration or downloads. Batuhan Taskaya, head of engineering at Fal.ai, a Tigris customer, noted that these costs once accounted for the majority of their cloud spending [1].
- Latency: By offering localized storage, Tigris reduces latency for AI workloads, crucial for applications like real-time audio processing [1].
- Data Control: As companies become more aware of the value of their data in fueling AI models, Tigris offers greater control over data storage and access [1].

With its recent funding, Tigris Data plans to expand its network of data centers beyond its current locations in Virginia, Chicago, and San Jose. The company aims to establish a presence in Europe and Asia, specifically targeting London, Frankfurt, and Singapore [1].

As Tigris Data continues to grow, having expanded 8x every year since its founding, it poses a significant challenge to traditional cloud storage providers. By offering a more flexible, efficient, and cost-effective solution tailored to AI workloads, Tigris Data is positioning itself as a key player in the evolving landscape of AI infrastructure.
Summarized by Navi · 08 Oct 2024 · Technology