Curated by THEOUTPOST
On Wed, 9 Oct, 8:02 AM UTC
2 Sources
[1]
RAG data preparation startup Vectorize launches with $3.6M in seed funding - SiliconANGLE
Data integration startup Vectorize AI Inc. says its software is ready to play a critical role in the world of artificial intelligence after closing on a $3.6 million seed funding round today. The round was led by True Ventures, and was announced alongside the debut of its novel platform that's meant to aid in retrieval-augmented generation, or RAG.
The startup is aiming to tackle a problem it has identified among AI practitioners: the challenge of taking unstructured data such as written documents, video and audio files, and transforming it so it can be fitted neatly into a vector database and optimized for RAG. RAG is a technique that's used to provide generative AI models with real-time access to the most relevant and up-to-date information, which they need to make better decisions.
One of the problems with AI chatbots like ChatGPT is that they're trained on much older information. For instance, the GPT-3.5 model that powered ChatGPT when it launched a couple of years ago was trained on a broad snapshot of the internet as it stood in 2022, so it has no access to news after that date. By using RAG techniques, it's possible to connect AI models to proprietary datasets and enhance their knowledge with the most recent information.
To do this, teams generally rely on a vector database such as Pinecone, DataStax, Couchbase or Elastic, which stores unstructured data as vector embeddings that can be accessed and understood by AI models. What Vectorize does is connect these vector databases to live, unstructured data sources such as an internal knowledge base, collaboration tool or customer relationship management platform. It's an important capability because managing and vectorizing unstructured information is a major headache for data scientists.
At the heart of Vectorize's platform is a "production-ready RAG pipeline" that makes it possible to transform unstructured data into optimized vector search indexes. Using this, companies can feed their most relevant new information into the large language models they are using to power their AI applications. To simplify this task, Vectorize has devised an intuitive three-step process for transforming data.
The first step is importing data into the platform, either by feeding it scanned paper-based documents or by connecting it to an existing data system. Once it's connected, Vectorize extracts any natural language content within. The next step is to evaluate that new data. The platform evaluates multiple chunking and embedding strategies in real time, quantifying the results to find the optimal configuration. Customers can go with Vectorize's recommendations or implement their own strategy for vectorizing their new data. The final step is deployment, which involves creating a real-time vector pipeline that automatically keeps the data available to the AI models up to date and ensures continuous accuracy. By doing this, AI models will always have access to the most current information as the organization's data evolves.
Vectorize reckons that these three simple steps can accelerate the data preparation process, reducing the time it takes from weeks or months to just a few hours. There are a few things that set Vectorize apart from its competitors, such as its self-service model and its pay-as-you-go pricing. Users have the flexibility to import data from almost any source they can think of, and they can test and optimize different approaches to doing this before settling on the most efficient pipeline architecture. Because the platform is pay-as-you-go, it's also ready to use almost immediately, with no long enterprise commitments or onboarding processes.
In addition, the flexibility of Vectorize means users can also define how frequently they want to update their vector search databases, so they can set it up to update constantly in real time, or just add new information on a weekly or monthly basis. Another novelty of Vectorize's platform is its "agentic AI" approach, which combines RAG with AI agents capable of autonomously solving problems for users. For instance, the AI cloud infrastructure company Groq Inc. uses Vectorize to power its AI support agents, which can automatically fix customers' problems using real-time data and context.
The company offers free access to its platform with enough bandwidth to support smaller projects, while larger enterprises with more data to prepare only need to pay as they go for the information they feed into their vector databases. As such, Vectorize says it's one of the most cost-effective data preparation tools for RAG on the market.
Nicholas Ward, president of the advertising technology company Koddi Inc. and an angel investor in Vectorize, believes the company's platform will become a foundational technology for many enterprise AI projects. "Having worked with Vectorize's founders in the past, I've seen firsthand their ability to tackle complex data challenges," Ward said. "The RAG platform is set to become a cornerstone technology for companies leveraging AI, from adtech to fintech and beyond."
[2]
Vectorize debuts agentic RAG platform for real time enterprise data
While vector databases are now increasingly commonplace as a core element of an enterprise AI deployment for Retrieval Augmented Generation (RAG), that's not all that's needed. Chris Latimer, the CEO and co-founder of startup Vectorize, spent several years working at DataStax, where he helped to lead the database vendor's cloud efforts. A recurring issue that he saw time and again was that the vector database wasn't really the hard part of enabling enterprise RAG. The hard part of the problem was taking all the unstructured data and getting it into the vector database in a way that was optimized and going to work well for generative AI.
That's why Latimer started Vectorize just ten months ago, in a bid to help solve that challenge. Today the company is announcing that it has raised $3.6 million in a seed round of funding, led by True Ventures. Alongside the funding, the company announced the general availability of its enterprise RAG platform. The Vectorize platform can enable an agentic RAG approach for near real-time data capability.
Vectorize focuses on the data engineering side of AI. The platform helps companies prepare and maintain their data for use in vector databases and large language models. The Vectorize platform also enables enterprises to quickly build a RAG data pipeline through an intuitive interface. Another core capability is a RAG evaluation feature that allows enterprises to test different approaches.
"We kept seeing people get to the end of the development cycle with their Gen AI projects and find out that they didn't work really well," Chris Latimer, co-founder and CEO of Vectorize, told VentureBeat in an exclusive interview. "The context they were getting for their vector database wasn't the most useful to the large language model, it was still hallucinating or it was misinterpreting the data."
How Vectorize fits into the enterprise RAG stack
Vectorize is not a vector database itself. Rather, it's a platform that connects unstructured data sources to existing vector databases like Pinecone, DataStax, Couchbase and Elastic. Latimer explained that Vectorize ingests and optimizes data from diverse sources for vector databases. The platform will provide a production-ready data pipeline that handles ingestion, synchronization, error handling and other data engineering best practices.
Vectorize itself is not a vector embedding technology either. Vector embedding is the process of converting data, be it text, images or audio, into vectors. Vectorize helps users evaluate different embedding models and data chunking methods to determine the best configuration for the enterprise's specific use case and data. Latimer explained that Vectorize allows users to choose from any number of different embedding models. These could include, for example, OpenAI's ada or Voyage AI embeddings, which are now being adopted by Snowflake.
"We do take into account innovative ways to vectorize the data so that you get the best results," Latimer said. "But ultimately, where we see the value is in giving enterprises and developers a production-ready solution that they just don't have to worry about the data engineering side."
Using agentic AI to power enterprise RAG
One of Vectorize's key innovations is its "agentic RAG" approach. It's an approach that combines traditional RAG techniques with AI agent capabilities, allowing for more autonomous problem-solving in applications. Agentic RAG isn't a hypothetical concept either. It's already being used by one of Vectorize's early users, AI inference silicon startup Groq, which recently raised $640 million. Groq is using Vectorize's agentic RAG capabilities to power an AI support agent. The agent can autonomously solve customer problems using the data and context provided by Vectorize's data pipelines.
"If a customer has a question that's been asked and answered before, you want that agent to be able to solve the customer's problem without a human getting involved," Latimer said. "But if there's something that the agent can't solve, you do want to have a human in the loop where you can escalate, so this idea of being able to have an agent reason its way through solving a problem, is the whole idea behind an AI agent architecture."
Why real time data pipelines are essential to enterprise RAG
A primary reason why an enterprise will use RAG is to connect to its own sources of data. What's equally important, though, is making sure that data is up to date. "Stale data is going to lead to stale decisions," Latimer said. Vectorize provides real-time and near-real-time data update capabilities, with the ability for customers to configure their tolerance for data staleness.
"We've actually let people configure the platform based on their tolerance for stale data and their need for real-time data," he said. "So if all you need is to schedule your pipeline to run once a week, we'll let you do that, and then if you need to run real-time, we'll let you do that as well, and you'll have real-time updates as soon as they're available."
Vectorize AI Inc. debuts its platform for optimizing retrieval-augmented generation (RAG) data preparation, backed by $3.6 million in seed funding led by True Ventures. The startup aims to streamline the process of transforming unstructured data for AI applications.
Vectorize AI Inc., a data integration startup, has launched its platform aimed at revolutionizing retrieval-augmented generation (RAG) data preparation. The company recently secured $3.6 million in seed funding led by True Ventures, marking its entry into the competitive AI infrastructure market [1].
At the core of Vectorize's offering is a solution to a critical problem faced by AI practitioners: efficiently transforming unstructured data into a format suitable for vector databases and optimized for RAG. This process is crucial for enhancing AI models with up-to-date information, a capability that standard models like ChatGPT often lack due to their training on historical data [1].
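Vectorize's platform introduces a streamlined three-step process for data transformation:
1. Import: load scanned documents or connect a data source, after which the platform extracts the natural language content.
2. Evaluate: compare multiple chunking and embedding strategies and quantify the results to find the best configuration.
3. Deploy: create a real-time vector pipeline that keeps the vector search index current as the organization's data changes.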
This approach significantly reduces the data preparation time from weeks or months to mere hours, addressing a major pain point in AI development [1].
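The evaluation step, in which the platform reportedly compares chunking and embedding strategies and quantifies the results, can be illustrated with a minimal sketch. This is not Vectorize's implementation: the bag-of-words "embedding", the hit-rate metric, and every name here (`chunk_words`, `hit_rate`, `best_chunk_size`, the sample text and queries) are assumptions chosen so the example runs with the Python standard library alone.

```python
from __future__ import annotations

import math
import re
from collections import Counter


def chunk_words(text: str, size: int) -> list[str]:
    """Split text into consecutive chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy embedding (assumption): a bag-of-words count vector, not a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hit_rate(text: str, size: int, labeled_queries: list[tuple[str, str]]) -> float:
    """Fraction of queries whose top-ranked chunk contains the expected answer."""
    chunks = chunk_words(text, size)
    vectors = [embed(c) for c in chunks]
    hits = 0
    for query, expected in labeled_queries:
        q = embed(query)
        best = max(range(len(chunks)), key=lambda i: cosine(q, vectors[i]))
        hits += expected.lower() in chunks[best].lower()
    return hits / len(labeled_queries)


def best_chunk_size(text: str, sizes: list[int], labeled_queries: list[tuple[str, str]]):
    """Score every candidate chunk size and return the best one with all scores."""
    scores = {s: hit_rate(text, s, labeled_queries) for s in sizes}
    return max(scores, key=scores.get), scores


# Hypothetical sample data, for illustration only.
SAMPLE_TEXT = "Refunds are processed within five days. Shipping by sea takes two weeks."
SAMPLE_QUERIES = [
    ("how long do refunds take", "five days"),
    ("how long does shipping take", "two weeks"),
]

if __name__ == "__main__":
    print(best_chunk_size(SAMPLE_TEXT, [3, 6], SAMPLE_QUERIES))
```

With the sample data, three-word chunks separate each answer from the keywords that retrieve it, while six-word chunks keep them together. A production evaluation would use a real embedding model and a metric that also penalizes overly large chunks, since this toy hit rate favors them.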
One of Vectorize's key innovations is its "agentic RAG" approach, which combines traditional RAG techniques with AI agent capabilities. This allows for more autonomous problem-solving in applications. An early adopter, AI inference silicon startup Groq, is already using this technology to power an AI support agent capable of autonomously solving customer issues [2].
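Latimer's description of a support agent that resolves previously answered questions on its own and escalates the rest to a human can be sketched as a simple routing rule. This is only an illustration under assumptions: the `Match` type, the `route` function, the similarity scores and the 0.8 threshold are all hypothetical, and the source does not describe how Groq's agent actually decides.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Match:
    """A previously answered question retrieved for the current query (hypothetical)."""
    answer: str
    score: float  # similarity to the customer's question, assumed to lie in [0, 1]


def route(matches: list[Match], threshold: float = 0.8) -> tuple[str, Optional[str]]:
    """Answer automatically when retrieval is confident, otherwise escalate."""
    if not matches:
        return ("escalate", None)
    best = max(matches, key=lambda m: m.score)
    if best.score >= threshold:
        return ("answer", best.answer)
    # Low-confidence retrieval: keep a human in the loop.
    return ("escalate", None)
```

A real agent would typically reason over several retrieved passages rather than a single score, but the split between autonomous answers and human escalation is the same.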
Vectorize offers a self-service model with pay-as-you-go pricing, providing users with the flexibility to import data from various sources and optimize their approach without long-term commitments. The platform allows users to define update frequencies for their vector search databases, ranging from real-time to weekly or monthly updates [1].
Nicholas Ward, president of Koddi Inc. and an angel investor in Vectorize, believes the platform will become a foundational technology for many enterprise AI projects. The company's focus on the data engineering side of AI, rather than being a vector database itself, positions it as a complementary solution to existing vector databases like Pinecone, DataStax, Couchbase, and Elastic [1][2].
Vectorize emphasizes the importance of up-to-date data in decision-making processes. The platform offers real-time and near-real-time data update capabilities, allowing customers to configure their tolerance for data staleness. This feature ensures that AI models always have access to the most current information, which is crucial for making informed decisions [2].
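A configurable tolerance for stale data amounts to comparing the time since the last sync against a chosen limit. The sketch below assumes a hypothetical `needs_refresh` helper and is not Vectorize's configuration API; a tolerance of zero stands in for real-time updates and one week for a weekly schedule.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def needs_refresh(
    last_sync: datetime,
    tolerance: timedelta,
    now: Optional[datetime] = None,
) -> bool:
    """Return True when the index is older than the configured staleness tolerance."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_sync >= tolerance


# Hypothetical tolerances for two pipelines.
REAL_TIME = timedelta(0)      # refresh whenever checked
WEEKLY = timedelta(weeks=1)   # refresh at most once a week
```

A scheduler would call `needs_refresh` for each pipeline and re-ingest only the sources whose tolerance has been exceeded.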
As enterprises increasingly adopt AI technologies, Vectorize's platform stands to play a significant role in streamlining the data preparation process, potentially accelerating the development and deployment of AI applications across various industries.
© 2025 TheOutpost.AI All rights reserved