2 Sources
[1]
Context architecture is replacing RAG in AI
Redis built its name as the caching layer that kept web applications from collapsing under load. The problem it is targeting now has the same structure but is harder to solve: production AI agents failing not because the models are wrong, but because the data underneath them is scattered, stale and structured for humans rather than machines. Retrieval pipelines built for single queries cannot absorb the volume agents generate. The gap Redis is targeting is structural: agents make orders of magnitude more data requests than human users, but most retrieval layers were built for the human-scale problem.

Redis Iris, launched Monday, is the company's answer: a context and memory platform that sits between an agent and the data it needs to act. The platform combines real-time data ingestion, a semantic interface that auto-generates MCP tools from business data models, and an agent memory server built on Redis Flex, a rewritten storage engine that runs 99% of data on flash at a tenth of the cost of in-memory storage alone.

The announcement lands as enterprise RAG infrastructure is in active transition. VentureBeat's Q1 2026 VB Pulse RAG Infrastructure Market Tracker found buyer intent to adopt hybrid retrieval tripling from 10.3% to 33.3% between January and March. Retrieval optimization surpassed evaluation as the top enterprise investment priority for the first time. Custom in-house retrieval stacks rose from 24.1% to 35.6% as enterprises outgrew off-the-shelf options. Redis is not the only infrastructure vendor reading those signals: several data platform providers have repositioned around agent context layers in recent weeks.

The scale mismatch is the structural argument behind the launch. "Companies will have orders of magnitude more agents than human beings," Rowan Trollope, CEO of Redis, told VentureBeat. "Orders of magnitude more agents than human beings means orders of magnitude more load on back end systems."
Trollope traces the parallel back to the mobile era: When legacy backends built for branch tellers suddenly had to serve a million smartphone users, Redis became the caching layer that absorbed the load without a full rebuild.

What is different this time is that agents cannot write their own middleware. In the mobile era, a developer would sit with a database administrator, identify the queries an application needed and hard-code the caching logic into a middleware layer. Agents cannot do that. They need to find the right data at runtime, through interfaces built for them in advance, or they stall.

"This is like the analogy of the grocery store in the fridge," he said. "If every time you have to go make your sandwich, you have to run to the grocery store to get the food, that's not very efficient. You put a fridge in every house, you store a little bit of food there. And that's kind of where we still tend to exist in the infrastructure stack."

Iris ships five components that together cover data ingestion, semantic access, memory and caching.

Redis Data Integration. Now in general availability. RDI uses change data capture pipelines to sync data from relational databases, warehouses and document stores into Redis continuously, with connectors for Oracle, Snowflake, Databricks and Postgres.

Context Retriever. Now in preview. Developers define a semantic model of business data using pydantic models and Redis auto-generates MCP tools agents use to query it directly, with row-level access controls enforced server-side. Trollope describes the shift from classic RAG as a directional inversion. "It's just a flip to let the agent pull the data instead of presupposing and stuffing it into the pipeline," he said.

Agent Memory. Now in preview. Stores short and long-term state across sessions so agents carry context without re-deriving it on each turn.

Redis Flex. A rewritten storage engine that runs 99% of data on SSDs and 1% in RAM, delivering petabyte-scale retrieval at sub-millisecond latencies.

Redis Search and LangCache. The retrieval and semantic caching backbone underneath the platform. LangCache reduces redundant model calls by caching prompt responses.

The data industry is generally heading in the same direction now. Every major database vendor is making a context layer argument. Traditional database vendors including Oracle are integrating context and memory layers to bring relational databases into the agentic AI era. Purpose-built vector database vendors including Pinecone are doing the same, building out a new knowledge layer for agentic AI context. Standalone context layers like Hindsight are also part of the emerging landscape.

Trollope frames Redis's position as structurally different from that competition. "For us to win, no one else has to lose," he said. Many Redis deployments already run MongoDB or Oracle as the backend system of record. Iris reflects and caches from those systems rather than displacing them. Redis is launching Iris in the Snowflake marketplace with native connectors.

Stephanie Walter, Practice Leader for AI Stack at HyperFRAME Research, puts the market context plainly. "The market is converging on the same conclusion: agents don't just need more tokens or better models. They need governed, current, low-latency context," Walter said.

Her read on Redis's differentiation focuses on where Redis already sits in the stack: close to runtime, latency-sensitive operational state and real-time data. "The pitch is not 'better RAG' as much as 'agents need live context, memory, and fast retrieval while they are actually working,'" she said.

Whether it's Redis or another vendor, every context layer technology will face a governance challenge to be successful. "Agentic AI will not scale in the enterprise if every agent becomes a new cost center, a new data access risk, and a new governance exception," she said. "The winning context layers will be the ones that make agents faster, cheaper, and safer to run."

Mangoes.ai is one company that has already had to answer those questions in production, under conditions where the cost of getting context wrong is measured in patient outcomes.

Amit Lamba, founder and CEO of Mangoes.ai, runs a real-time voice AI platform deployed across large healthcare facilities where patients and clinicians ask live questions about treatment, scheduling and case history. Mangoes.ai built its stack natively on Redis from the start.

"Retrieval, memory, and session state all run through Redis, so we're not stitching together separate tools and hoping they talk to each other," Lamba said.

The problem Iris's dynamic memory capability addresses is what happens across a complex session. "Think about a one-hour group therapy session," Lamba said. "You need to know who said what, when, and be able to surface the right information to the therapist in the moment. That's not a simple retrieval problem."

The platform runs multiple specialized agents in parallel, one for entity identification, one for relationship reasoning and one for integrating case history. "The dynamic memory capability maps almost perfectly to the problem we're solving," Lamba said.

For enterprises that built their AI stack around RAG, the retrieval layer that got them to production is no longer enough to keep them there.

The RAG era is giving way to context architecture. The classic RAG model pushed data into the agent before the model was called. Production deployments are flipping that: agents pull what they need at runtime through tool calls, treating the data layer as a live resource rather than a pre-loaded payload. Teams still optimizing RAG pipelines are solving last year's problem.
The semantic layer is now production infrastructure. The model that defines business entities, their relationships and the access rules between them needs to be built, versioned and maintained with the same discipline as a data pipeline. Most organizations have not staffed or structured for that work. The enterprises that define their context architecture now are the ones that will not have to rebuild it when agent workloads scale.

Budget is already moving. VB Pulse Q1 2026 data shows retrieval optimization investment rising from 19% to 28.9% across the quarter, overtaking evaluation spending for the first time. Organizations that spent the previous year measuring their retrieval quality are now spending to fix it. The context layer is an active procurement decision, not a roadmap item.

"The first buyer question should not be 'Do I need a vector database, long context, memory, or a context engine?' It should be 'What does this agent need to know, how fresh must that knowledge be, who is allowed to access it, and what does every retrieval cost?'" Walter said.
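The push-versus-pull inversion described in the article can be illustrated with a small sketch. Everything here is hypothetical: the `ORDERS` table, the tool names and the hard-coded `planned_calls` stand in for a live data layer and for the tool calls a model would emit at runtime. It is not Redis's API, only one way to picture the difference.

```python
from typing import Callable, Dict, List, Tuple

# Toy data layer (assumption); in practice this would be a live, continuously synced store.
ORDERS: Dict[str, dict] = {
    "o-1": {"customer": "c-7", "status": "shipped"},
    "o-2": {"customer": "c-7", "status": "delayed"},
    "o-3": {"customer": "c-9", "status": "delivered"},
}


def push_style_prompt(question: str) -> str:
    """Classic RAG: retrieve up front and stuff the results into the prompt."""
    context = [f"{order_id}: {row}" for order_id, row in ORDERS.items()]
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}"


# Pull style: the agent is handed tools and fetches only what it needs.
def get_order(order_id: str) -> dict:
    return ORDERS[order_id]


def orders_for_customer(customer_id: str) -> List[str]:
    return [oid for oid, row in ORDERS.items() if row["customer"] == customer_id]


TOOLS: Dict[str, Callable] = {
    "get_order": get_order,
    "orders_for_customer": orders_for_customer,
}


def run_tool_calls(calls: List[Tuple[str, dict]]) -> list:
    """Execute (tool_name, arguments) pairs the way an agent runtime might."""
    return [TOOLS[name](**arguments) for name, arguments in calls]


# Stand-in for tool calls a model would choose at runtime (hard-coded here).
planned_calls = [
    ("orders_for_customer", {"customer_id": "c-7"}),
    ("get_order", {"order_id": "o-2"}),
]
results = run_tool_calls(planned_calls)
```

In the push version every order lands in the prompt whether or not it is relevant; in the pull version only the two requested lookups touch the data layer, which is the property the article attributes to runtime tool calls.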
[2]
Redis debuts the much-needed memory layer for enterprise AI agents - SiliconANGLE
Artificial intelligence agents have a memory problem and now Redis Inc., the database management startup, is trying to fix that with its new, real-time Context Engine. As the company explains, it's all about helping enterprise AI agents move beyond simply chatting to users and making them productive workers in their own right.

Redis explained that there are three core tools behind the Context Engine, including the Redis Context Retriever, Redis Agent Memory and Redis Data Integration, with the latter made generally available starting today. The three tools are designed to solve what Redis terms the "context problem" in enterprise AI, which causes autonomous systems to hallucinate and output incorrect information or results, or sometimes even stall from a lack of data.

The company argues that the context problem is the result of a lack of memory, which causes problems when AI agents are asked to perform complex tasks. For instance, if an agent is trying to resolve a customer's issue on the phone, it might need to pull data from the customer relationship management system, a shipping database and a PDF that outlines company policies. Without a dedicated context engine, the only way to do this is to use brittle, onetime integrations that are slow and difficult to maintain.

Redis is aiming to provide a dedicated layer for agents that sits between them and the data, powered by its powerful, in-memory data store. It gives each agent an "agent-readable" view of the environments they operate in, defining business entities and their relationships so that they immediately understand the bigger picture around whatever problem they're tasked with solving.

The new Context Retriever is perhaps the most critical innovation in the Context Engine. Currently available in preview, it allows developers to create a semantic model of their business data, so agents can map how each customer relates to an opportunity or support ticket. Rather than forcing agents to gamble with "text-to-SQL" queries that often break down, the retriever will automatically generate the tools required by the agent to grab the data they need, using the open-source Model Context Protocol.

Meanwhile, the Agent Memory component in preview from today provides a "dual-layered" approach to the agent's state. It helps manage the short-term interaction history while also creating a more durable, long-term memory cache that agents can use to remember preferences and previous interactions from past sessions.

The new Data Integration is the final piece of the puzzle, providing the plumbing for the Context Engine. Its job is to continuously synchronize business data from the company's main relational databases and data warehouses. This means that agents will always act on the most up-to-date data, rather than information that might be several weeks old.

What makes this offering compelling is Redis' strong enterprise presence. Its flagship open-source in-memory data store, which functions as a database, cache, streaming engine and message broker, already exists in 43% of all enterprise AI agent stacks. Its evolution from a high-speed cache into a sophisticated context layer suggests that Redis believes it can become a kind of operating system for AI agents. For customers, the big promise is that they'll be able to create more complex and reliable agents that don't break down when trying to automate more advanced business tasks.
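The "dual-layered" memory idea described above can be pictured with a minimal sketch. The `AgentMemory` class, its method names and the in-process dictionaries are all assumptions made for illustration; the article does not describe the actual interface of Redis Agent Memory.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class AgentMemory:
    """Hypothetical two-layer memory: per-session turns plus durable per-user facts."""

    def __init__(self, short_term_turns: int = 3) -> None:
        self.short_term_turns = short_term_turns
        # Short-term layer: a bounded window of recent turns per session.
        self.short_term: Dict[str, Deque[Tuple[str, str]]] = {}
        # Long-term layer: facts that should survive across sessions.
        self.long_term: Dict[str, Dict[str, str]] = {}

    def add_turn(self, session_id: str, role: str, text: str) -> None:
        turns = self.short_term.setdefault(
            session_id, deque(maxlen=self.short_term_turns)
        )
        turns.append((role, text))  # oldest turn is dropped once the window is full

    def remember(self, user_id: str, key: str, value: str) -> None:
        self.long_term.setdefault(user_id, {})[key] = value

    def end_session(self, session_id: str) -> None:
        self.short_term.pop(session_id, None)  # long-term facts are untouched

    def context_for(self, session_id: str, user_id: str) -> dict:
        return {
            "recent_turns": list(self.short_term.get(session_id, [])),
            "preferences": dict(self.long_term.get(user_id, {})),
        }


memory = AgentMemory(short_term_turns=2)
memory.add_turn("s1", "user", "Where is my order?")
memory.add_turn("s1", "agent", "It shipped yesterday.")
memory.add_turn("s1", "user", "Email me updates.")
memory.remember("u1", "contact_channel", "email")
window_during_session = memory.context_for("s1", "u1")["recent_turns"]
memory.end_session("s1")
next_session_context = memory.context_for("s2", "u1")
```

The point of the split is that the next session starts with an empty turn window but still sees the stored preference, so the agent does not have to re-derive it.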
Redis unveiled Iris, a context and memory platform designed to fix the structural data problem causing production AI agents to fail. The platform addresses the scale mismatch between agent-generated data requests and existing retrieval infrastructure built for human-scale queries. With components including Context Retriever, Agent Memory, and Redis Data Integration, the launch signals a broader industry shift away from traditional RAG approaches.
Redis has launched Iris, a context and memory platform that addresses a critical infrastructure gap causing production AI agents to stall or hallucinate [1]. The problem isn't model accuracy but data architecture: AI agents generate orders of magnitude more data requests than human users, yet most retrieval systems were built for human-scale workloads [1]. When production AI agents attempt complex tasks like resolving customer issues, they must pull data from multiple sources including CRM systems, shipping databases, and policy documents, a process that existing infrastructure struggles to support [2].
The launch of Redis Iris reflects a fundamental shift in enterprise AI infrastructure. VentureBeat's Q1 2026 VB Pulse RAG Infrastructure Market Tracker found buyer intent to adopt hybrid retrieval tripling from 10.3% to 33.3% between January and March, while custom in-house retrieval stacks rose from 24.1% to 35.6% as enterprises outgrew off-the-shelf options [1]. Retrieval-Augmented Generation approaches designed for single queries cannot absorb the volume agentic AI systems generate. "Companies will have orders of magnitude more agents than human beings," Rowan Trollope, CEO of Redis, told VentureBeat. "Orders of magnitude more agents than human beings means orders of magnitude more load on back end systems" [1].

Redis Iris ships with five integrated components that cover real-time data ingestion, semantic access, and agent memory capabilities. Redis Data Integration, now in general availability, uses change data capture pipelines to sync data continuously from relational databases, warehouses, and document stores including Oracle, Snowflake, Databricks, and Postgres [1]. The Context Retriever, currently in preview, allows developers to define a semantic model of business data using pydantic models, then auto-generates Model Context Protocol tools that agents use to query data directly with row-level access controls enforced server-side [1][2].
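To make the Context Retriever description more concrete, here is a rough sketch of deriving a tool description from a data model and enforcing row-level access inside the tool handler. It is not Redis's implementation: standard-library dataclasses stand in for pydantic models so the example has no dependencies, and `SupportTicket`, `tool_descriptor`, `make_handler` and the region-based access rule are all hypothetical. The descriptor only loosely mirrors the name, description and input-schema shape of an MCP tool definition.

```python
from dataclasses import dataclass, fields
from typing import Callable, Dict, List, Set


@dataclass
class SupportTicket:
    """Hypothetical business entity in the semantic model."""
    ticket_id: str
    customer_id: str
    region: str
    status: str


TICKETS = [
    SupportTicket("t-1", "c-7", "emea", "open"),
    SupportTicket("t-2", "c-7", "amer", "closed"),
    SupportTicket("t-3", "c-9", "emea", "open"),
]

JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_descriptor(model: type) -> dict:
    """Derive a simplified, MCP-like tool description from the model's fields."""
    properties = {f.name: {"type": JSON_TYPES[f.type]} for f in fields(model)}
    return {
        "name": f"find_{model.__name__.lower()}",
        "description": f"Look up {model.__name__} records by field value.",
        "inputSchema": {"type": "object", "properties": properties},
    }


def make_handler(
    rows: List[SupportTicket], allowed_regions: Dict[str, Set[str]]
) -> Callable[..., List[SupportTicket]]:
    """Build the server-side handler; the access filter runs before any query filter."""

    def handler(caller: str, **filters: str) -> List[SupportTicket]:
        visible = [r for r in rows if r.region in allowed_regions.get(caller, set())]
        return [
            r for r in visible
            if all(getattr(r, name) == value for name, value in filters.items())
        ]

    return handler


descriptor = tool_descriptor(SupportTicket)
find_tickets = make_handler(TICKETS, {"agent-emea": {"emea"}})
visible_tickets = find_tickets("agent-emea", customer_id="c-7")
```

Because the region check lives in the handler rather than in the prompt, an agent that asks for customer `c-7` sees only the ticket its caller identity is allowed to read, and an unknown caller sees nothing.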
Redis Agent Memory, also in preview, provides a dual-layered approach that stores both short-term interaction history and long-term memory cache, enabling agents to carry context across sessions without re-deriving information on each turn [1][2]. The platform runs on Redis Flex, a rewritten storage engine that runs 99% of data on SSDs and 1% in RAM, delivering petabyte-scale retrieval at sub-millisecond latencies at a tenth of the cost of in-memory storage alone [1]. Redis Search and LangCache provide the retrieval and semantic caching backbone, with LangCache reducing redundant model calls by caching prompt responses [1].

Redis isn't alone in recognizing this infrastructure shift. Traditional database vendors including Oracle are integrating context layers to bring relational databases into the agentic AI era, while purpose-built vector database vendors including Pinecone are building out knowledge layers for AI agent context [1]. Standalone context layers like Hindsight have also emerged in this landscape. Redis positions itself differently by functioning as a reflection and caching layer that sits between agents and existing systems of record rather than displacing them. With Redis already present in 43% of all enterprise AI agent stacks, the company is leveraging its existing footprint to become what it describes as an operating system for AI agents [2]. The semantic interface approach represents a directional inversion from classic RAG: "It's just a flip to let the agent pull the data instead of presupposing and stuffing it into the pipeline," Trollope said [1].

Summarized by Navi