Curated by THEOUTPOST
On Wed, 11 Dec, 12:06 AM UTC
2 Sources
[1]
Generative AI app testing platform Gentrace raises $8M to make LLM development more accessible - SiliconANGLE
Gentrace, a developer platform for testing and monitoring artificial intelligence applications, said today it has raised $8 million in an early-stage funding round led by Matrix Partners to expand its large language model testing capabilities beyond engineering teams. Today's Series A round attracted participation from Headline and K9 Ventures and brings the company's total raised to date to more than $14 million.
Founded in 2023, Gentrace offers a testing and monitoring product that allows non-technical users to participate in the evaluation, testing and monitoring of AI applications. According to the company, as industries rush to add generative AI to their offerings, development teams face the challenge of ensuring it remains reliable and safe. At the same time, the ability to evaluate and test large language models remains largely the domain of development and engineering teams, making it difficult to collaborate with product managers, subject matter experts, designers and quality assurance teams.
"Generative AI represents a paradigm shift in software development, but the reality is there's way too much noise and not enough signal on how to test and build them easily or correctly," said Doug Safreno, co-founder and chief executive of Gentrace. "We're not just creating another dev tool -- we're reimagining how entire organizations can collaborate and build better LLM products."
To help tackle this challenge, Gentrace announced Experiments, a tool that allows cross-functional teams to collaborate in purpose-built testing environments to assess AI model performance. Teams can test AI outputs directly, preview test outcomes, anticipate errors and explore scenarios while exchanging data and information freely between technical and non-technical members. The company's platform and Experiments interface with many existing tools and model providers, including OpenAI, vector database provider Pinecone Systems Inc. and visual LLM programming environment Rivet.
Early adopters of the platform, including Webflow and Quizlet, have used it to predict AI-related issues before they affected users. According to Quizlet, implementing Gentrace's platform increased its testing frequency from two times per month to more than 20 times per week, significantly speeding up iteration.
"Gentrace was the right product for us because it allowed us to implement our custom evaluations, which was crucial for our unique use cases," said Madeline Gilbert, a staff machine learning engineer at Quizlet. An education technology company providing study tools for students and teachers, Quizlet uses generative AI to turn unstructured notes and materials into study tools. According to Gilbert, even minor changes, such as a comma in a prompt, could significantly change how the models behave. Gentrace's platform allowed quality assurance teams and subject matter experts to evaluate and test quickly after any modification. "It's dramatically improved our ability to predict the impact of even small changes in our LLM implementations," said Gilbert.
[2]
Gentrace makes it easier for businesses to test AI-powered software
As businesses continue to integrate generative AI into their products, many find it challenging to actually test whether the AI is behaving correctly and giving useful answers. To help address this problem, a startup called Gentrace offers an integrated platform for testing software built around large language models. Whereas traditional software can be subjected to automated tests to verify that, say, data submitted to a web form ends up properly formatted in a database, AI-powered software often can't be expected to behave in an exactly specified way in response to input, says Gentrace cofounder and CEO Doug Safreno.
Customers typically end up defining a set of test data to run against the AI after any changes to the model, the databases it interacts with, or other parameters. But without a testing platform, running those tests can mean maintaining spreadsheets of AI test prompts and manually logging whether they give satisfactory results. And while some automation is possible, such as verifying that an AI response contains certain keywords or asking another AI system to confirm that a response looks satisfactory, complex testing often requires engineers to be heavily involved, even if other team members, such as product managers, might know better what good output looks like, Safreno says. "The problem becomes, nobody can look at it and collaborate on these tests and on these evaluation methods," he says. "As new product requirements come in, they're not being captured in the testing."
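For illustration only, the contrast Safreno describes might look something like the sketch below: a conventional function can be pinned down with an exact assertion, while LLM output usually has to be checked more loosely, for example by looking for expected keywords. The function names and example data here are hypothetical and are not Gentrace's API.

```python
# Minimal sketch (hypothetical, not Gentrace's API) contrasting a deterministic
# assertion with the looser checks that LLM output usually requires.

def format_for_database(form_value: str) -> str:
    """Conventional code: the output is fully determined by the input."""
    return form_value.strip().lower()

def test_conventional_code() -> None:
    # An exact-match assertion works because the behavior is precisely specified.
    assert format_for_database("  Alice@Example.com ") == "alice@example.com"

def keyword_check(llm_response: str, required_keywords: list[str]) -> bool:
    """LLM output varies from run to run, so a common fallback is checking
    that a response at least mentions the expected concepts."""
    text = llm_response.lower()
    return all(keyword.lower() in text for keyword in required_keywords)

def test_llm_output() -> None:
    # In a real harness this response would come from a model call; hard-coded here.
    llm_response = "Photosynthesis converts light energy into chemical energy in plants."
    assert keyword_check(llm_response, ["photosynthesis", "light", "energy"])

if __name__ == "__main__":
    test_conventional_code()
    test_llm_output()
    print("both checks passed")
```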
Gentrace, a developer platform for testing and monitoring AI applications, has secured $8 million in Series A funding to expand its LLM testing capabilities and make AI development more accessible to non-technical teams.
Gentrace, a developer platform specializing in testing and monitoring artificial intelligence applications, has raised $8 million in a Series A funding round led by Matrix Partners. The investment, which also saw participation from Headline and K9 Ventures, brings Gentrace's total funding to over $14 million [1].
Founded in 2023, Gentrace aims to tackle a critical issue in the rapidly evolving field of generative AI. As industries rush to incorporate AI into their offerings, development teams face the challenge of ensuring the reliability and safety of these applications. Traditionally, the evaluation and testing of large language models (LLMs) have been primarily handled by development and engineering teams, creating a bottleneck in collaboration with other stakeholders [1].
Doug Safreno, co-founder and CEO of Gentrace, emphasized the paradigm shift that generative AI represents in software development. He stated, "We're not just creating another dev tool -- we're reimagining how entire organizations can collaborate and build better LLM products" [1].
To address these challenges, Gentrace has launched Experiments, a tool designed to facilitate collaboration among cross-functional teams in testing AI model performance. The platform allows both technical and non-technical team members to test AI outputs directly, preview test outcomes, anticipate errors, and explore scenarios while exchanging data and information freely [1].
Experiments integrates with existing tools and model providers, including OpenAI, Pinecone Systems Inc., and Rivet, enhancing its versatility and applicability across different AI development environments [1].
Early adopters of Gentrace's platform, including companies like Webflow and Quizlet, have reported significant improvements in their AI development processes. Quizlet, an education technology company, increased its testing frequency from twice a month to over 20 times per week, greatly enhancing their iteration speed [1].
Madeline Gilbert, a staff machine learning engineer at Quizlet, highlighted the importance of Gentrace's customizable evaluations for their unique use cases. She noted, "It's dramatically improved our ability to predict the impact of even small changes in our LLM implementations" [1].
Traditional software testing methods often fall short when applied to AI-powered applications. Unlike conventional software, where automated tests can verify specific behaviors, AI responses can be less predictable and require more nuanced evaluation [2].
Gentrace's platform aims to bridge this gap by providing a structured environment for defining and executing AI tests. This approach helps businesses move beyond maintaining spreadsheets of test prompts and manually logging results, a common practice in the absence of specialized testing tools [2].
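As a rough, hypothetical illustration of what a structured test definition can replace, the sketch below encodes a small suite of prompts and expectations in code rather than in a spreadsheet, scoring each response with a cheap keyword check and a placeholder LLM-as-judge step. The names here (LLMTestCase, call_model, judge_response) are invented for the example and do not represent Gentrace's product API.

```python
# Hypothetical sketch of a test suite that replaces a spreadsheet of prompts.
# call_model() and judge_response() are placeholders, not Gentrace's API.
from dataclasses import dataclass

@dataclass
class LLMTestCase:
    name: str
    prompt: str
    required_keywords: list[str]   # cheap automated check
    rubric: str                    # criterion for an LLM-as-judge pass

def call_model(prompt: str) -> str:
    # Placeholder: in a real harness this would call the application under test.
    return "Mitochondria are the powerhouse of the cell, producing ATP."

def judge_response(response: str, rubric: str) -> bool:
    # Placeholder for an LLM-as-judge step: ask a second model whether the
    # response satisfies the rubric and parse its yes/no answer.
    return True

def run_suite(cases: list[LLMTestCase]) -> None:
    for case in cases:
        response = call_model(case.prompt)
        keyword_pass = all(k.lower() in response.lower() for k in case.required_keywords)
        judge_pass = judge_response(response, case.rubric)
        print(f"{case.name}: keywords={'PASS' if keyword_pass else 'FAIL'}, "
              f"judge={'PASS' if judge_pass else 'FAIL'}")

if __name__ == "__main__":
    run_suite([
        LLMTestCase(
            name="study-guide-summary",
            prompt="Summarize the role of mitochondria for a biology student.",
            required_keywords=["mitochondria", "ATP"],
            rubric="Accurate, concise, and appropriate for a student audience.",
        ),
    ])
```

In a real harness the placeholders would call the application under test and a second model; the point is only that encoding cases this way lets the prompts, rubrics, and results be reviewed together rather than tracked by hand.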
One of the key advantages of Gentrace's solution is its ability to involve non-technical team members in the AI testing process. This inclusivity allows product managers, subject matter experts, and other stakeholders to contribute their insights without requiring extensive engineering knowledge [2].
As the adoption of generative AI continues to grow across industries, platforms like Gentrace are poised to play a crucial role in ensuring the quality, reliability, and safety of AI-powered applications. By making LLM development more accessible and collaborative, Gentrace is contributing to the broader ecosystem of AI innovation and responsible development.