Unscript provides its customers with unlimited video generation and charges only for the final videos.
Since traditional video shoots are expensive and time-consuming, many enterprises have been turning to generative AI to craft their promotional videos. However, even state-of-the-art models like Sora and Kling frequently produce inaccurate results, and concerns about copyright infringement and overall quality persist.
Today, a handful of startups provide tools that let enterprises create AI videos they can actually use. Bengaluru-based AI startup Unscript is one of them.
The startup recently transformed a single photo into a full-fledged video, generating head and eye movements, facial expressions, voice modulations, and body language, and achieving studio-quality results in under two minutes. This significantly reduces the effort of manual shoots.
Interestingly, the startup claims that its model has surpassed OpenAI's Sora, Google's VLOGGER, Microsoft's VASA-1, and Alibaba's EMO, which it says makes the model an ideal choice for brands, marketing agencies, and virtual influencers.
Over 50 leading companies, including Ceat and Mahindra, are already leveraging its cost-effective and scalable video-production capabilities.
"We have built a 'Canva for videos', but in a version where you can get an end-to-end solution. Starting from shooting to the final video that you want to deliver to social media, everything can be automated here," said co-founder Ritwika Chowdhury in an exclusive interview with AIM.
Unscript provides advanced video-generation solutions, including text-to-video, image-to-video, and the creation of virtual influencers as brand ambassadors for enterprises.
The company is experiencing strong demand from sectors such as BFSI, pharma, and media & entertainment.
The startup has built its own diffusion model for converting text to video. According to Chowdhury, their model follows diffusion principles but is distinct in its architecture.
"It's an encoder-decoder plus diffusion-based model that we have developed. A key aspect is that we have collected substantial data -- about 1,000 hours -- to train it," said Chowdhury.
OpenAI's Sora tends to generate highly creative videos that are not commonly seen in the real world, such as dolphins cycling in the ocean. Unscript's model, however, is trained specifically to generate content based on humans.
"Sora's videos tend to be very abstract. You might generate something like a dog playing with a ball, but we specialise in creating content featuring human beings.
"If you look at Sora, you'll see that physical interactions with the world are not mapped properly. For example, a person walking might appear to be floating or not interacting realistically, as it is not specifically trained for human-like interactions."
Talking about Unscript, she said, "Our video tool is perfect at not only generating lip sync, but generating it based on the individual. This is important because we are working with a lot of enterprise customers, like Ceat, Mahindra, Bajaj, Maxlife, Flowworks, and Healthifyme."
For script generation, Unscript relies on LLMs from third-party vendors such as OpenAI. "For the LLM component, especially in documents and videos where end-to-end script generation is needed, we've trained and fine-tuned with 1 million ad copies," said Chowdhury.
She also mentioned that they train their models using proprietary data that they have collected, as open-source datasets often do not cover all types of ethnicities. "We literally hired 30 people last year, and for six months, we focused solely on collecting data," she said.
Unscript also supports content generation in over 40 languages. Its team has published more than 30 research papers and includes researchers from Samsung Research, Microsoft, Intel, and various IITs. The company is also advised by a former OpenAI employee.
Chowdhury also cautioned about certain issues with Luma and Sora. "You cannot maintain the brand image consistently in all the clips. When creating enterprise content, you need to have logos, colours, and other elements presented in a specific way. Luma and Sora are not built for enterprise videos," she said.
Unscript provides its customers with unlimited video generation. "We only charge for the final videos that you use, not for the R&D you might need to do," Chowdhury said.
Businesses use the platform's services to create diverse content, from YouTube videos to marketing assets and customer communications. BFSI companies use Unscript to produce short explainer videos as a more engaging alternative to traditional policy documents.
Chowdhury revealed that her journey with generative AI started in 2014 while she was at IIT Kharagpur. Founded in 2021 by Chowdhury and Apurv Jain, Unscript has since raised over $1.25 million.
The company does not plan to raise funds in the near future. Chowdhury noted that they are continually experimenting with new products and are preparing to release a new feature, which will be announced soon.
While Unscript operates in an interesting space, other companies are excelling in it as well.