Exclusive: Encord puts multimodal AI data - including audio - all in one platform.
Data development platform Encord is going beyond business analysis to become "the world's only multimodal AI data development platform."
On Thursday, the company announced new multimodal data annotation capabilities for classifying audio and documents -- all in one interface. The update expands on Encord's existing support for medical, computer vision, and video data.
By now, AI chatbots and image generators are relatively commonplace. But it's much harder to generate convincing video or audio than it is to generate text. The AI industry is increasingly focused on multimodal capabilities, especially with the release of features like ChatGPT's Voice Mode.
To fine-tune an AI model, you need quality -- and sometimes hyper-specific -- data. Text-based data doesn't provide the nuance these complex models need, and accuracy is even more important in high-stakes contexts like medicine. Builders need platforms that can annotate and evaluate all kinds of data -- video, audio, images, graphs, reports, retail listings, PDFs, and more, ideally in one place. Several of Encord's clients use the platform for medical images like MRI scans to develop better models for assisting doctors.
High-quality, well-annotated audio data helps build speech and emotion recognition models, and even models that can identify sounds. Video and audio AI products need increasingly sophisticated data support to achieve human-like realism, whether in transcription or lip-syncing accuracy. For example, the AI text-to-video platform Synthesia uses Encord to develop training models for its lifelike AI avatars.
Encord's update includes new annotation and curation features for documents, audio files, vision, and medical data. With multimodal annotation, AI teams can customize an interface to review and edit different file types side by side. Currently, different data types are often siloed across multiple services and platforms, adding time and costs to data annotation. Encord already supports key data annotation categories such as entity recognition, translation, summarization, text classification, and sentiment analysis.
"It is time-consuming and often impossible for teams to gain visibility into large-scale datasets throughout model development due to a lack of integration and consistent interface to unify these siloed tools," the company said in the release.
With Encord, AI teams can filter through their data to identify and curate exactly what they need to build a model. Its evaluation dashboard can also flag data that's hampering a model's performance so that teams can remove or replace it.
"On average, Encord customers use 35% smaller data sets, which leads to models performing 20% more accurately," an Encord rep told ZDNET via email.
In a demo, Encord co-founder and president Ulrik Stig Hansen told ZDNET that he sees the company's focus on quality and centralization as eventually enabling artificial general intelligence (AGI).