3 Sources
[1]
TikTok, YouTube creators sell unpublished content for AI training
OpenAI, Google, and AI media company Moonvalley are paying YouTubers for unused video footage to train their AI algorithms, Bloomberg reported. Here, "unused" footage refers to unpublished content, for which digital creators are being paid roughly $1-$4 per minute of footage depending on video quality and format. Besides YouTubers, creators on TikTok and Instagram have also turned to this opportunity as an avenue to diversify their income beyond advertising deals with brands, according to the news portal. This development is significant given that Big Tech platforms were previously criticised for training their AI applications on content without seeking consent or licenses. Recently, a lawsuit filed by a group of U.S.-based authors alleged that Meta CEO Mark Zuckerberg approved the use of "pirated" versions of copyright-protected books to train its AI models. To explain, Meta used the dataset belonging to file-sharing project LibGen (currently involved in legal suits for online piracy) for training purposes. In July 2024, other AI giants including Apple, Nvidia, Salesforce, and Anthropic came under fire for using transcripts from over 170,000 YouTube videos without the consent of their creators and in breach of YouTube's terms of service. Such usage has been termed "free-rider behaviour", where organisations like OpenAI and Microsoft, among others, use content without offering compensation and in violation of copyright laws. However, OpenAI and Meta have signed deals with news organisations, and YouTube with major music labels, to access their content. Further, even as companies like Google deployed features enabling website publishers to opt out of its crawlers scraping data to train AI models, concerns over crawlers scraping and indexing websites persist. Moving forward, as companies capitalise on the AI boom, experts now warn of data exhaustion for AI training purposes.
OpenAI CEO Sam Altman and, more recently, xAI owner Elon Musk have also expressed concerns that the human knowledge available for such purposes is near depletion. A Data Provenance study found a rise in restrictions on crawling web sources and on using their data to train AI models. The research, which audited 14,000 web domains, also concluded that such restrictions would bias training data toward older and "less fresh content", lacking in diversity. However, companies can explore alternate tactics for AI training amid rising constraints. For instance, Musk believes that synthetic data (created through AI) can 'fine-tune' new systems and train AI. However, this process is subject to risks such as cybersecurity threats, bias propagation, and a surge in errors through hallucinations. Additionally, agreements with digital creators to use their unpublished video footage to train AI models could aid this process. While most agreements prohibit AI companies from creating digital replicas or mimicking exact scenes, other questions persist. MediaNama reached out to OpenAI, Moonvalley, and Google with questions and is awaiting their response. We will update the story as soon as we hear from them.
[2]
AI Companies Are Paying Content Creators For Their Unpublished Videos
Content creators are enjoying an unexpected boon: artificial intelligence companies are buying their unpublished videos for training purposes. It has been a little over two years since generative AI exploded into the world, but tech companies have already burned through all the "available data" on the internet. In a bid to plug the gap and continue building AI video generators, Bloomberg reports that AI companies, including Google and OpenAI, are paying content creators for access to their unpublished videos. This content holds value precisely because it is material that never saw the light of day and is therefore unique. Sources tell Bloomberg that these AI firms are paying between $1 and $4 per minute of footage, with the price depending on the quality or format. 4K videos, for example, will go for a higher price, as will specialist imagery such as that taken from a drone. Regular content creators posting to platforms like YouTube, Instagram, and TikTok can expect to fetch between $1 and $2 per minute. It could be a bonus revenue stream for creators who have filmed a video but for whatever reason decided not to post it. One social media agent tells Bloomberg that he thinks the window for selling content to AI companies won't be open for long. "It's an arms race and they all need more footage," says Dan Levitt, senior vice president of creators at Wasserman. "I see a window in the next couple of years where licensing footage is lucrative for creators who are open to doing so. But I don't think that window is going to last that long." Third-party agencies such as Troveo AI and Calliope Networks mediate between IP owners and AI companies, negotiating terms on behalf of content creators. It could be an opportunity for photographers and videographers willing to enter into a deal.
"All the companies building video models we're either working with or are in our pipeline right now," says Marty Pesis, co-founder and chief executive officer of Troveo, who adds his company has paid out over $5 million to creators. These deals between AI companies and creators generally have terms forbidding AI companies from creating digital replicas of the creators themselves or making exact replicas of the content. AI companies face a challenge to keep making bigger and better generative AI models, which require huge amounts of data. The controversial practice of taking pictures, videos, and text from the internet without the copyright holders' permission is apparently over now that everything has already been taken. During an X livestream last week, Elon Musk said the way to plug the gap is with synthetic data. "The only way to supplement [real-world data] is with synthetic data, where the AI creates [training data]," Musk says. "With synthetic data ... [AI] will sort of grade itself and go through this process of self-learning." However, this method is not totally proven. One study suggests that AI models trained on AI images start churning out garbage images, with the lead author comparing it to species inbreeding. "If a species inbreeds with their own offspring and doesn't diversify their gene pool, it can lead to a collapse of the species," says Hany Farid, a computer scientist at the University of California, Berkeley. Nevertheless, Microsoft, Meta, OpenAI, and Anthropic are all reportedly using synthetic data to train AI models.
[3]
YouTubers are selling their unused video footage to AI companies
YouTubers and other digital content creators are selling their unused video footage to artificial intelligence companies seeking exclusive videos to better train their AI algorithms, often netting thousands of dollars per deal. OpenAI, Alphabet's Google, AI media company Moonvalley and several other AI companies are collectively paying hundreds of content creators for access to their unpublished videos, according to people familiar with the negotiations. That content, which hasn't been posted elsewhere online, is considered valuable for training artificial intelligence systems since it's unique. AI companies are currently paying between $1 and $4 per minute of footage, the people said, with prices increasing depending on video quality or format. Videos that are shot in 4K, for example, go for a higher price, as does non-traditional footage like videos captured from drones or using 3D animations. Most footage, such as unused video created for networks like YouTube, Instagram and TikTok, is selling for somewhere between $1 and $2 per minute.
Major AI companies are purchasing unused video footage from digital content creators to train their AI algorithms, offering a new revenue stream for creators and addressing the growing need for unique training data.
In a significant development for both the AI industry and content creators, major tech companies like OpenAI, Google, and AI media company Moonvalley are now purchasing unpublished video footage from digital content creators to train their AI algorithms [1][2]. This move comes as AI companies face a growing need for unique data to improve their models.
Content creators, including YouTubers, TikTokers, and Instagrammers, are being offered between $1 and $4 per minute of footage, depending on the quality and format [1][3]. High-quality 4K videos and specialized content like drone footage command higher prices. Most standard footage from popular platforms is selling for $1 to $2 per minute [3].
This unexpected development provides content creators with a fresh avenue to monetize their work beyond traditional advertising deals [1]. Dan Levitt, senior vice president of creators at Wasserman, views this as a potentially lucrative but time-limited opportunity for creators willing to license their footage [2].
The surge in demand for unpublished content stems from AI companies having already utilized most of the publicly available data on the internet for training purposes [2]. This data exhaustion has led to concerns about the future of AI training, with industry leaders like OpenAI CEO Sam Altman and xAI owner Elon Musk expressing worries about the depletion of human knowledge for AI training [1].
This new approach appears to be a response to previous criticisms and legal challenges faced by AI companies. In the past, tech giants were accused of using content without proper consent or licensing for AI training. Notable cases include a lawsuit against Meta for using copyrighted books and allegations against companies like Apple and Nvidia for utilizing YouTube video transcripts without permission [1].
Third-party agencies like Troveo AI and Calliope Networks are emerging as mediators between content creators and AI companies. Marty Pesis, co-founder and CEO of Troveo, reports that his company has already paid out over $5 million to creators [2].
The agreements between AI companies and creators typically include clauses prohibiting the creation of digital replicas of the creators or exact reproductions of their content [1][2]. This measure aims to protect creators' intellectual property and prevent potential misuse of their likeness.
As restrictions on web crawling for AI training data increase, companies are exploring alternative strategies. Elon Musk has suggested using synthetic data to fine-tune AI systems, though this approach comes with its own set of challenges, including potential biases and increased errors [1][2].
This development marks a significant shift in the AI training landscape, potentially reshaping the relationship between content creators and AI companies while addressing ethical concerns surrounding data usage for AI development.
Summarized by
Navi