On Wed, 5 Feb, 12:05 AM UTC
2 Sources
[1]
Alluxio Enterprise AI 3.5 Enhances AI Workflows with Breakthrough Cache Mode, Distributed Cache Management, and Python SDK Integration
SAN MATEO, Calif., Feb. 04, 2025 (GLOBE NEWSWIRE) -- Alluxio, the AI and data acceleration platform, today announced the latest enhancements in Alluxio Enterprise AI. Version 3.5 showcases the platform's capability to accelerate AI model training and streamline operations with features such as a new Cache Only Write Mode, advanced cache management, and enhanced Python SDK integrations. These updates empower organizations to train models faster, handle massive datasets more efficiently, and streamline the complexity of AI infrastructure operations.

AI-driven workloads face significant challenges in managing the sheer volume and complexity of data, which can lead to inefficiencies and increased training times. Ensuring fast, prioritized access to critical data and seamless integration with common AI frameworks is essential for optimizing performance and accelerating model development.

"The latest release of Alluxio Enterprise AI is packed with new capabilities designed to further accelerate AI workload performance," said Haoyuan (HY) Li, Founder and CEO of Alluxio. "Our customers are training AI models with enormous datasets that often span billions of files. Alluxio Enterprise AI 3.5 was built to ensure workloads perform at peak performance while also simplifying management and operations of AI infrastructure."

Alluxio Enterprise AI version 3.5 includes the following key features:

New caching mode accelerates AI checkpoints - Alluxio's new CACHE_ONLY Write Mode significantly improves the performance of write operations, such as writing checkpoint files during AI model training. When enabled, this mode writes data exclusively to the Alluxio cache instead of the underlying file system (UFS). By bypassing the UFS, write performance is enhanced by eliminating bottlenecks typically associated with underlying storage systems. This feature is experimental.

Advanced cache eviction policies provide fine-grained control - Alluxio's TTL Cache Eviction Policies allow administrators to enforce time-to-live (TTL) settings on cached data, ensuring less frequently accessed data is automatically evicted based on defined policies. Alluxio's priority-based cache eviction policies enable administrators to define caching priorities for specific data that override Alluxio's default Least Recently Used (LRU) algorithm, ensuring critical data remains in cache even if it would otherwise be evicted. This is ideal for workloads requiring consistent low-latency access to key datasets. Both TTL and priority-based cache eviction policies are generally available.

Python SDK integrations enhance AI framework compatibility - Alluxio's Python SDK now supports leading AI frameworks, including PyTorch, PyArrow, and Ray. These integrations provide a unified Python filesystem interface, enabling applications to interact seamlessly with various storage backends. This simplifies the adoption of Alluxio Enterprise AI for Python applications, particularly those handling data-intensive workloads and AI model training, by facilitating quick and repeated access to both local and remote storage systems. This feature is experimental.

This release also introduces several enhancements to Alluxio's S3 API, which are immediately available:

Support for HTTP persistent connections (HTTP keep-alive) - Alluxio now supports HTTP persistent connections, which maintain a single TCP connection for multiple requests. This reduces the overhead of opening new connections for each request and decreases latency by approximately 40% for 4KB S3 ReadObject requests.

TLS encryption for enhanced security - Communication between the Alluxio S3 API and the Alluxio worker now supports TLS encryption, ensuring secure data transmission.

Multipart upload (MPU) support - The Alluxio S3 API now supports multipart upload, which splits files into multiple parts and uploads each part separately. This feature simplifies the upload process and improves throughput for large files.

Other enhancements included in version 3.5 are:

The Alluxio Index Service - A new caching service that improves the performance of directory listings for directories storing hundreds of millions of files and subdirectories. The Index Service ensures scalability and delivers 3-5x faster results by serving directory listing details from the cache, compared to listing directories on the UFS. This enhancement is experimental.

UFS read rate limiter - Administrators can now set a rate limit to control the maximum bandwidth an individual Alluxio Worker can read from the UFS. By configuring the UFS read rate limiter, administrators ensure optimized resource utilization while maintaining system stability. Alluxio supports rate limiting for various UFS types, including S3, HDFS, GCS, OSS, and COS. This enhancement is generally available.

Support for heterogeneous worker nodes - Alluxio now supports clusters with worker nodes that have heterogeneous resource configurations (CPU, memory, disk, and network). This enhancement provides administrators greater flexibility in configuring clusters and offers improved opportunities to optimize resource allocation. This enhancement is generally available.

Availability
Alluxio Enterprise AI version 3.5 is available for download here: https://www.alluxio.io/demo

Supporting Resources
Learn more about Alluxio Enterprise AI 3.5: www.alluxio.io/blog/new-features-in-alluxio-enterprise-ai-3-5
Download a trial version: https://www.alluxio.io/demo

About Alluxio
Alluxio, a leading provider of the high performance data platform for analytics and AI, accelerates time-to-value of data and AI initiatives and maximizes infrastructure ROI. Uniquely positioned at the intersection of compute and storage systems, Alluxio has a universal view of workloads on the data platform across stages of a data pipeline. This enables Alluxio to provide high performance data access regardless of where the data resides, simplify data engineering, optimize GPU utilization, and reduce cloud and storage costs. With Alluxio, organizations can achieve magnitudes faster model training and serving without the need for specialized storage, and build AI infrastructure on existing data lakes. Backed by leading investors, Alluxio powers technology, internet, financial services, and telecom companies, including 9 out of the top 10 internet companies globally. To learn more, visit www.alluxio.io.

Beth Winkowski
Winkowski Public Relations, LLC for Alluxio
978-649-7189
beth@alluxio.com
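To put the keep-alive claim in concrete terms, the sketch below contrasts opening a fresh connection per request with reusing a single persistent session for many small reads. It is an illustration only, not Alluxio's benchmark code; the endpoint URL, bucket, and object key are placeholders.

```python
# Illustrative only: compares new-connection-per-request vs. a persistent
# (keep-alive) HTTP session for many small GETs against an S3-compatible
# endpoint. The endpoint, bucket, and key below are placeholders, not an
# actual Alluxio deployment.
import time
import requests

ENDPOINT = "http://localhost:39999/api/v1/s3"  # hypothetical S3-compatible endpoint
URL = f"{ENDPOINT}/my-bucket/checkpoints/part-00000"  # placeholder object

def fetch_without_keepalive(n: int) -> float:
    start = time.perf_counter()
    for _ in range(n):
        # A fresh TCP connection is negotiated for every request.
        requests.get(URL, headers={"Connection": "close"}, timeout=10)
    return time.perf_counter() - start

def fetch_with_keepalive(n: int) -> float:
    start = time.perf_counter()
    with requests.Session() as session:  # one connection, reused for all requests
        for _ in range(n):
            session.get(URL, timeout=10)
    return time.perf_counter() - start

if __name__ == "__main__":
    n = 100
    print(f"no keep-alive: {fetch_without_keepalive(n):.2f}s")
    print(f"keep-alive:    {fetch_with_keepalive(n):.2f}s")
```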
[2]
Alluxio boosts performance for AI model training - SiliconANGLE
Alluxio Inc., which sells a commercial version of an open-source distributed filesystem and cache, today announced new features that accelerate artificial intelligence model training and enhance integration with Python software development kits. The company said the updates collectively enable organizations to train models faster, handle large datasets more efficiently and simplify complex AI infrastructure.

Alluxio said the enhancements are intended to support fast, prioritized access to important training data and integrate with common AI frameworks. The company has pivoted to address AI model training, a process that can take months, with promises of significant performance improvements. "We see DeepSeek as an opportunity," said founder and Chief Executive Haoyuan Li, referring to the Chinese startup that tanked tech stocks this week with news of its low-cost approach to model training. "It creates an easier sell for us." Last July, the company trumpeted enhancements that it said can improve utilization of costly graphics processing units to 97%.

"Everybody's running very fast to take advantage of AI, so we help them innovate faster, accelerating training workloads, getting models into market faster, learning how they're being used, and bringing that info back into the model training process," said Bill Hodak, vice president of marketing and product marketing. "The faster they can do that, the more advanced and accurate their models will be."

Alluxio Enterprise AI version 3.5 includes an experimental CACHE_ONLY write mode that the company said significantly improves the performance of write operations. When enabled, this mode writes data exclusively to the Alluxio cache instead of the underlying file system, eliminating bottlenecks associated with storage systems. Hodak said the feature is particularly useful with checkpoint files, which are saved snapshots of a model's state at a given point that can be used to resume from a saved point rather than restarting from scratch. Hodak said the files can be large and cause long delays in the training process while loading. "If it was taking an hour before, it probably takes 20 minutes now," he said.

Advanced cache eviction allows administrators to enforce time-to-live settings on cached data, which define how long cached data remains valid before it is automatically expired and removed. Administrators can now define caching priorities for specific data that override Alluxio's default "least recently used" algorithm to keep data in the cache that would otherwise be expunged. "The goal is to reduce as much overhead as possible," Hodak said. "This improves cache hit ratios, which depends on the workload."

Another experimental feature is enhanced integration between Alluxio's Python SDK and popular AI frameworks like PyTorch, PyArrow and Ray. The integrations provide a unified Python filesystem interface, enabling applications to interact seamlessly with local and remote storage systems.

The release also introduces several enhancements to Alluxio's application programming interface for accessing data in S3 object storage. Support for HTTP persistent connections maintains a single TCP connection for multiple requests. This reduces the overhead of opening new connections for each request and decreases latency by approximately 40% for 4KB S3 ReadObject requests, the company said. Communication between the Alluxio S3 API and the Alluxio worker now supports TLS encryption and multipart upload. The latter splits files into multiple parts for faster parallel uploads.

Hodak said a new caching service improves the performance of very large directory listings, serving results up to five times faster by delivering directory listing metadata from the cache. Administrators can now set a rate limit to control the maximum bandwidth an individual Alluxio Worker can read from the Under File System, the underlying storage system Alluxio uses to store data for cache access. Clusters can now have worker nodes with heterogeneous CPU, memory, disk and network configurations, enhancing flexibility.
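Hodak's description of checkpoints maps onto a familiar training-loop pattern. The sketch below is a generic PyTorch save/resume example, not Alluxio-specific code; in the scenario he describes, the checkpoint path would simply point at storage fronted by the Alluxio cache.

```python
# Generic PyTorch checkpoint save/resume sketch. Nothing here is Alluxio-specific;
# checkpoint_path could point at any filesystem, including one backed by a cache.
import os
import torch
import torch.nn as nn

checkpoint_path = "checkpoints/model.pt"  # placeholder path
os.makedirs(os.path.dirname(checkpoint_path), exist_ok=True)

model = nn.Linear(128, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
start_epoch = 0

# Resume from the saved snapshot instead of restarting from scratch.
if os.path.exists(checkpoint_path):
    state = torch.load(checkpoint_path)
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    start_epoch = state["epoch"] + 1

for epoch in range(start_epoch, 5):
    # ... one epoch of training would run here ...
    torch.save(
        {"model": model.state_dict(),
         "optimizer": optimizer.state_dict(),
         "epoch": epoch},
        checkpoint_path,
    )
```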
Alluxio releases version 3.5 of its Enterprise AI platform, introducing new features to accelerate AI model training, improve data management, and enhance integration with popular AI frameworks.
Alluxio, a leading provider of AI and data acceleration platforms, has announced the release of Alluxio Enterprise AI version 3.5. This latest update introduces several key enhancements designed to accelerate AI model training, streamline data management, and improve integration with popular AI frameworks 1.
One of the standout features of the new release is the experimental CACHE_ONLY Write Mode. This caching approach significantly improves the performance of write operations, which is particularly beneficial when writing AI checkpoint files during model training. By writing data exclusively to the Alluxio cache instead of the underlying file system, the new mode bypasses potential bottlenecks associated with storage systems 2.
Bill Hodak, VP of Marketing and Product Marketing at Alluxio, highlighted the impact of this feature: "If it was taking an hour before, it probably takes 20 minutes now" 2.
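As a rough sketch of how a training job might take advantage of this, assuming the Alluxio namespace is exposed through a FUSE mount and an administrator has enabled the cache-only write mode on the cluster (the mount point below is an assumption, not a documented default):

```python
# Hypothetical sketch: the directory below is assumed to be a FUSE mount of the
# Alluxio namespace on a cluster where the CACHE_ONLY write mode has been
# enabled by an administrator. The mount point is an assumption.
import torch
import torch.nn as nn

ALLUXIO_MOUNT = "/mnt/alluxio/checkpoints"  # assumed mount point

model = nn.Linear(512, 512)
# With cache-only writes enabled, this save is absorbed by the Alluxio cache
# rather than waiting on the underlying file system.
torch.save(model.state_dict(), f"{ALLUXIO_MOUNT}/step_01000.pt")
```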
Alluxio Enterprise AI 3.5 introduces two new cache eviction policies:
TTL Cache Eviction: Administrators can now enforce time-to-live settings on cached data, ensuring automatic eviction of less frequently accessed information.
Priority-based Cache Eviction: This feature allows administrators to define caching priorities for specific data, overriding Alluxio's default Least Recently Used (LRU) algorithm 1.
These enhancements aim to provide more fine-grained control over cache management, optimizing performance for critical datasets.
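To illustrate the idea behind these two policies, here is a toy in-memory cache that combines TTL expiry, LRU ordering, and a priority "pin" that overrides LRU. It is purely conceptual and says nothing about how Alluxio implements eviction internally.

```python
# Conceptual illustration only; this is not Alluxio's implementation. It shows
# how a TTL and a priority flag can override plain LRU eviction decisions.
import time
from collections import OrderedDict

class TinyCache:
    def __init__(self, capacity=3, ttl_seconds=300):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.entries = OrderedDict()  # key -> (value, inserted_at, pinned)

    def put(self, key, value, pinned=False):
        self._expire()
        if len(self.entries) >= self.capacity:
            # Evict the least recently used entry that is NOT pinned;
            # if everything is pinned, this toy cache simply grows.
            for candidate in list(self.entries):
                if not self.entries[candidate][2]:
                    del self.entries[candidate]
                    break
        self.entries[key] = (value, time.time(), pinned)

    def get(self, key):
        self._expire()
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as recently used
        return self.entries[key][0]

    def _expire(self):
        # TTL eviction: drop anything older than ttl_seconds.
        now = time.time()
        for key in list(self.entries):
            if now - self.entries[key][1] > self.ttl:
                del self.entries[key]

cache = TinyCache(capacity=2)
cache.put("hot_index", b"...", pinned=True)  # survives LRU pressure
cache.put("batch_1", b"...")
cache.put("batch_2", b"...")                 # evicts batch_1, not hot_index
```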
The new release features improved integration between Alluxio's Python SDK and popular AI frameworks such as PyTorch, PyArrow, and Ray. This integration provides a unified Python filesystem interface, enabling seamless interaction between applications and various storage backends 1.
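The announcement does not include sample code, but a unified Python filesystem interface typically follows the fsspec pattern. The sketch below shows what reading a Parquet dataset through such an interface with PyArrow might look like; the alluxiofs package name, protocol string, and constructor arguments are assumptions rather than verified API.

```python
# Hypothetical sketch of an fsspec-style workflow. The "alluxiofs" protocol name
# and constructor arguments are assumptions about the SDK's shape, not verified
# API; consult Alluxio's documentation for real usage.
import fsspec
import pyarrow.dataset as ds

# Assumed: the SDK registers an fsspec-compatible filesystem implementation.
fs = fsspec.filesystem(
    "alluxiofs",                      # assumed protocol name
    etcd_hosts="etcd-host:2379",      # assumed cluster-discovery argument
    target_protocol="s3",             # assumed underlying storage protocol
)

# Read a Parquet dataset through the cache with PyArrow.
dataset = ds.dataset("training-data/features/", filesystem=fs)
table = dataset.head(1000)
print(table.schema)
```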
Alluxio has also introduced several enhancements to its S3 API, all of which are immediately available:
HTTP Persistent Connections: Support for HTTP keep-alive maintains a single TCP connection across multiple requests, reducing connection overhead and cutting latency by approximately 40% for 4KB S3 ReadObject requests 1.
TLS Encryption: Communication between the Alluxio S3 API and Alluxio workers can now be encrypted for secure data transmission 1.
Multipart Upload: Large files can be split into multiple parts and uploaded separately, simplifying uploads and improving throughput 1.
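Because the interface is S3-compatible, standard S3 clients should be able to use these features. A minimal boto3 sketch of a multipart upload is shown below; the endpoint URL, credentials, bucket, and file name are placeholders for whatever an actual deployment exposes.

```python
# Sketch of a multipart upload with boto3 against an S3-compatible endpoint.
# The endpoint URL, bucket, credentials, and local file are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://alluxio-host:39999/api/v1/s3",  # assumed endpoint
    aws_access_key_id="placeholder",
    aws_secret_access_key="placeholder",
)

bucket, key = "training-data", "checkpoints/large-model.pt"
part_size = 64 * 1024 * 1024  # 64 MiB parts

mpu = s3.create_multipart_upload(Bucket=bucket, Key=key)
parts = []
with open("large-model.pt", "rb") as f:  # placeholder local file
    part_number = 1
    while True:
        chunk = f.read(part_size)
        if not chunk:
            break
        resp = s3.upload_part(
            Bucket=bucket, Key=key, PartNumber=part_number,
            UploadId=mpu["UploadId"], Body=chunk,
        )
        parts.append({"ETag": resp["ETag"], "PartNumber": part_number})
        part_number += 1

s3.complete_multipart_upload(
    Bucket=bucket, Key=key, UploadId=mpu["UploadId"],
    MultipartUpload={"Parts": parts},
)
```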
Other notable improvements in version 3.5 include:
Alluxio Index Service: An experimental caching service that speeds up directory listings for directories containing hundreds of millions of files and subdirectories, returning results 3-5x faster by serving listing details from the cache 1.
UFS Read Rate Limiter: Administrators can cap the bandwidth an individual Alluxio worker reads from the under file system (UFS), with support for S3, HDFS, GCS, OSS, and COS 1.
Heterogeneous Worker Nodes: Clusters can now mix worker nodes with different CPU, memory, disk, and network configurations, giving administrators more flexibility in resource allocation 1.
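The read rate limiter itself is an administrator-side configuration setting, but the underlying idea, capping how many bytes per second a worker may pull from the UFS, resembles a token bucket. The sketch below is a purely conceptual illustration, not Alluxio code.

```python
# Toy token-bucket limiter, purely to illustrate capping read bandwidth.
# The real limiter is an administrator-configured setting on each worker.
import time

class TokenBucket:
    def __init__(self, rate_bytes_per_sec: float, burst_bytes: float):
        self.rate = rate_bytes_per_sec
        self.capacity = burst_bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def acquire(self, nbytes: int) -> None:
        """Block until nbytes worth of tokens are available."""
        while True:
            now = time.monotonic()
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= nbytes:
                self.tokens -= nbytes
                return
            time.sleep((nbytes - self.tokens) / self.rate)

# Cap simulated UFS reads at ~100 MB/s with 8 MB bursts.
bucket = TokenBucket(rate_bytes_per_sec=100e6, burst_bytes=8e6)
for _ in range(10):
    bucket.acquire(4 * 1024 * 1024)  # each 4 MiB read consumes tokens first
```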
Haoyuan (HY) Li, Founder and CEO of Alluxio, emphasized the release's significance: "Our customers are training AI models with enormous datasets that often span billions of files. Alluxio Enterprise AI 3.5 was built to ensure workloads perform at peak performance while also simplifying management and operations of AI infrastructure" 1.
As the AI industry continues to evolve rapidly, Alluxio's latest release positions the company to address the growing demands of AI-driven workloads, potentially offering a competitive edge in the face of emerging low-cost training approaches 2.
Snowflake announces major updates to its AI and data collaboration capabilities at its annual BUILD conference, including enhancements to Cortex AI, the introduction of Snowflake Intelligence, and improvements in cross-cloud collaboration and security features.
8 Sources
A new ISG report reveals that AI adoption, particularly generative AI, is driving U.S. demand for public cloud services. Enterprises are leveraging cloud platforms for affordable access to AI infrastructure and tools.
2 Sources
Bitdeer AI, a subsidiary of Bitdeer Technologies Group, introduces an advanced AI training platform with serverless GPU infrastructure, aiming to revolutionize AI/ML development with scalable and efficient solutions.
4 Sources
Aerospike Inc. has released an updated version of its Vector Search technology, featuring new indexing and storage innovations designed to enhance real-time accuracy, scalability, and ease of use for developers working with generative AI and machine learning applications.
3 Sources
Teradata announces new AI capabilities, partnerships, and strategies at Possible 2024, focusing on scalable AI platforms, hybrid analytics, and sustainable AI practices to drive business value and innovation.
6 Sources