Curated by THEOUTPOST
On Wed, 20 Nov, 12:11 AM UTC
6 Sources
[1]
Microsoft and HarperCollins partner up to train AI models on books
Microsoft has signed a licensing deal with HarperCollins to train its AI models. According to Bloomberg sources, HarperCollins will allow Microsoft's LLMs to train on nonfiction titles. Microsoft reportedly doesn't plan on creating AI-generated books, but instead will access high-quality data to make its models more intelligent and accurate. "HarperCollins authors will have the option to participate or not," said the outlet. 404 Media first broke the news of a licensing deal with an anonymous AI company. Author Daniel Kibblesmith shared screenshots of an email on Bluesky, likely from his agent, informing him about the deal. "You are likely aware, as we all are, that there are controversies surrounding the use of copyrighted materials in the training of AI models," said the memo. "Much of the controversy comes from the fact that many companies seem to be doing so without acknowledging or compensating the original creators. And of course there is concern that these AI models might one day make us all obsolete." According to the screenshots of the email, HarperCollins is offering a non-negotiable payment of $2,500 per title for a three-year licensing deal. HarperCollins confirmed there is a deal with an unnamed AI company, telling Bloomberg that the agreement's "limited scope and clear guardrails around model output" respect authors' rights while presenting them with new opportunities. Meanwhile, multiple outlets have reported that AI companies like Google, OpenAI, and Anthropic are getting diminishing returns from the development of new models because they're running out of high-quality data to train on. Microsoft was not included in these reports, but its Copilot model relies on underlying genAI technology from OpenAI. So training AI models on nonfiction works might be a strategy to counter those diminishing improvements.
[2]
Microsoft and HarperCollins partner to train AI using nonfiction titles
Microsoft has signed a deal with HarperCollins to use their nonfiction titles to train an unreleased Artificial Intelligence (AI) model, Bloomberg reported. The company isn't planning to use the content to generate new books but only to improve model quality and performance. HarperCollins will also give authors the choice of whether they want to participate or not. Not all authors were enthused, however, with writer Daniel Kibblesmith calling the initiative "abominable." The author posted an email he received from his literary agent, who revealed that the deal involved a license period of three years with "certain protections concerning credit and limits of verbatim usage per AI response." Authors would receive a non-negotiable amount of $2,500 per title. Background: This deal represents a shift in approach taken by AI developers, who are now actively seeking to collaborate with publishers and creators, rather than simply scraping their data directly from the internet. In June, OpenAI, which has received major investments from Microsoft, announced a partnership with TIME Magazine to integrate the publication's journalistic work into company products like ChatGPT. The company had previously struck deals with Atlantic and News Corp, which owns publications like The Wall Street Journal and New York Post. ChatGPT responses utilising content from partner publications would provide attribution, alongside a link to the source article. These agreements follow a host of lawsuits filed against AI developers by news publishers and creative writers. OpenAI alone is facing lawsuits from The New York Times and prominent authors like David Baldacci and George R.R. Martin, who called it an "existential threat to the author profession."
Most recently, Indian news agency Asian News International (ANI) sued OpenAI over copyright infringement for allegedly training its AI models on ANI's published content.
[3]
Microsoft Signs AI-Learning Deal With News Corp.'s HarperCollins
Microsoft is hunting for additional sources of high-quality text. Microsoft reached a deal with News Corp.'s HarperCollins that will allow the software company to use nonfiction titles from the book publisher to train its artificial intelligence models, according to a person familiar with the matter. Microsoft wants the HarperCollins books for a model that it hasn't yet announced, according to the person, who asked not to be identified discussing plans that aren't public. The company isn't planning to use the content to generate new books without human authors, the person said. Microsoft declined to comment. In a statement to Bloomberg News, HarperCollins confirmed it reached an agreement with an unidentified AI technology company that would "allow limited use of select nonfiction backlist titles for training AI models to improve model quality and performance." HarperCollins authors will have the option to participate or not, the company said. "Part of our role is to present authors with opportunities for their consideration while simultaneously protecting the underlying value of their works and our shared revenue and royalty streams," HarperCollins said. "This agreement, with its limited scope and clear guardrails around model output that respects authors' rights, does that." Technology companies use an array of data, from social-media sites to news articles, to train AI models, and companies like Microsoft are hunting for additional sources of high-quality text that they can license to make their programs more accurate and better able to answer questions or provide expertise on specific subjects. News Corp. signed an agreement in May with OpenAI to let the company use content from more than a dozen of its publications, including the Wall Street Journal, Barron's and MarketWatch. OpenAI has also signed licensing deals with publishers including Axel Springer SE, the Atlantic, Vox Media, Dotdash Meredith Inc., Hearst Communications Inc. and Time magazine.
Microsoft has worked on AI initiatives with Reuters, Hearst and Axel Springer, which publishes Business Insider and Politico. Some publishers have taken issue with AI companies pulling in content without permission. The New York Times is suing OpenAI and Microsoft, alleging copyright infringement. Perplexity AI, another AI startup, has faced similar lawsuits. © 2024 Bloomberg LP
[4]
Microsoft Signs AI-Learning Deal With News Corp.'s HarperCollins
Microsoft Corp. reached a deal with News Corp.'s HarperCollins that will allow the software company to use nonfiction titles from the book publisher to train its artificial intelligence models, according to a person familiar with the matter. Microsoft wants the HarperCollins books for a model that it hasn't yet announced, according to the person, who asked not to be identified discussing plans that aren't public. The company isn't planning to use the content to generate new books without human authors, the person said. Microsoft declined to comment.
[5]
Microsoft is the mystery AI company licensing HarperCollins books, says Bloomberg
Just one day after 404 Media revealed that HarperCollins struck a three-year AI training deal with an unnamed company, Bloomberg now says Microsoft will use the publisher's content to train a not-yet-announced AI model. HarperCollins said in a statement to 404 Media that its deal will include "select nonfiction backlist titles" and that authors will have to opt into the training program. One author, Daniel Kibblesmith, posted screenshots a few days ago showing that he was offered $2,500 per book for a three-year AI licensing contract. Asked what would be an offer he'd consider taking, Kibblesmith said in a post on Tuesday, "I'd probably do it for a billion dollars. I'd do it for an amount of money that wouldn't require me to work anymore, since that's the end goal of this technology." Not much else is known about the model HarperCollins' content will train, but a source tells Bloomberg that Microsoft doesn't intend to generate AI-written books with the material. Microsoft declined to comment.
[6]
HarperCollins reportedly pens deal with Microsoft to train AI on its books
The publishing giant said there will be clear guardrails to respect authors' rights. Publishing giant HarperCollins has agreed to allow a technology company to use "select nonfiction" books to train its artificial intelligence (AI) models. The company told 404 Media (18 November) that it made a deal with an unnamed "technology company" and that it will allow authors to opt in for the new venture. Bloomberg reported yesterday (19 November) that Microsoft is the tech company that will team up with HarperCollins and use its nonfiction books to train a new AI model. Exact details about this AI model are currently unknown. "HarperCollins has a long history of innovation and experimentation with new business models," the company said in a statement. "Part of our role is to present authors with opportunities for their consideration while simultaneously protecting the underlying value of their works and our shared revenue and royalty streams. This agreement, with its limited scope and clear guardrails around model output that respects authors' rights, does that." Last week, writer Daniel Kibblesmith shared an email he received asking if he'd consent to include his novel Santa's Husband in the training bundle. According to screenshots posted by Kibblesmith, the deal was worth $2,500 for each title for a three-year licensing agreement, and would include "certain protections concerning credit and limits of verbatim usage per AI response". Kibblesmith refused the deal, calling it "abominable". In a response to his original post, he said: "Direct any outrage toward the incredibly doable action of purchasing physical books by living authors from local bookstores." In May of this year, News Corp, the parent company of HarperCollins, struck a deal with OpenAI to allow the ChatGPT creator to train its AI models on the company's news content.
The deal also allows OpenAI to display news content from several publications owned by News Corp, including The Wall Street Journal and The Sunday Times, in response to questions asked by users of its AI models. While other news organisations have also struck deals with OpenAI, including The Atlantic and Vox Media, some news organisations and publishers have not been so welcoming of AI disruption. The New York Times is suing the AI giant for allegedly copying and using millions of copyrighted news articles, in-depth investigations and other journalistic work "without permission or payment". In October, The Guardian reported that UK ministers are facing a backlash over plans to allow AI companies to train their models on content from publishers and artists by default unless they opt out. Earlier that month, thousands of creatives around the world signed a statement warning AI companies that the unlicensed use of their work to train generative AI models is a "major, unjust threat" to their livelihoods. SiliconRepublic.com has reached out to HarperCollins for comment.
Microsoft has entered into a licensing agreement with HarperCollins to use nonfiction books for training an unreleased AI model, aiming to improve model quality and performance without generating AI-written books.
In a significant move within the AI industry, Microsoft has entered into a licensing agreement with HarperCollins, a subsidiary of News Corp., to use nonfiction titles for training its artificial intelligence models [1]. This partnership marks a shift in approach for AI developers, who are now actively seeking collaborations with publishers and creators rather than relying solely on web scraping [2].
The agreement allows Microsoft to use select titles from HarperCollins' nonfiction backlist for a three-year period. Authors are offered a non-negotiable payment of $2,500 per title and may choose whether or not to participate [1]. HarperCollins emphasized that the deal includes "clear guardrails around model output" to respect authors' rights while presenting new opportunities [3].
Microsoft intends to use the licensed content to train an unreleased AI model, focusing on improving model quality and performance rather than generating AI-written books [4]. The move comes as AI companies reportedly face diminishing returns in model development because they are running out of high-quality training data [1].
The deal reflects a broader trend of AI companies seeking partnerships with content creators. OpenAI, which has received significant investment from Microsoft, has previously struck deals with publications like TIME Magazine, The Atlantic, and various News Corp. outlets [2].
However, not all authors are enthusiastic about these developments. Writer Daniel Kibblesmith expressed skepticism, stating he would only consider such a deal for an amount that would render further work unnecessary, given the potential impact of AI on the writing profession [5].
The partnership comes amid ongoing legal challenges in the AI industry. Several lawsuits have been filed against AI developers by news publishers and authors, including a suit by The New York Times against OpenAI and Microsoft alleging copyright infringement [3]. These legal actions highlight the complex intellectual property issues surrounding AI training data.
HarperCollins has reached an agreement with an unnamed AI company to use select nonfiction books for AI model training, offering authors $2,500 per book. The deal highlights growing tensions between publishers, authors, and AI firms over copyright and compensation.
7 Sources
New research reveals that major AI companies like OpenAI, Google, and Meta prioritize high-quality content from premium publishers to train their large language models, sparking debates over copyright and compensation.
2 Sources
Penguin Random House, the world's largest trade publisher, has updated its copyright pages to prohibit the use of its books for training AI systems, marking a significant move in the ongoing debate over AI and copyright.
6 Sources
Microsoft and OpenAI are collaborating with the Lenfest Institute for Journalism to provide $10 million in funding and resources to help local news outlets integrate AI tools into their operations.
4 Sources
OpenAI has formed a significant content partnership with Hearst, allowing integration of Hearst's newspaper and magazine content into OpenAI's AI products, including ChatGPT. This move marks a growing trend of collaboration between AI companies and traditional media publishers.
12 Sources
The Outpost is a comprehensive collection of curated artificial intelligence software tools that cater to the needs of small business owners, bloggers, artists, musicians, entrepreneurs, marketers, writers, and researchers.
© 2025 TheOutpost.AI All rights reserved