Mastodon Bans AI Training on User Data, Highlighting Challenges in Decentralized Platforms


Mastodon updates its terms to prohibit AI training on user content, but the decentralized nature of the platform poses challenges in enforcing these rules across the entire Fediverse.

Mastodon Takes a Stand Against AI Training

In a significant move to protect user privacy, Mastodon, the open-source social media platform, has updated its terms and conditions to prohibit the use of user content for training large language models (LLMs). The new policy, set to take effect on July 1, 2025, specifically bans the scraping of data from its main server, Mastodon.social [1].


"We want to make it clear that training LLMs on the data of Mastodon users on our instances is not permitted," the platform stated in an email to users

1

. This decision aligns Mastodon with other platforms like Bluesky, which have expressed similar intentions to protect user data from AI training

1

.

Challenges in Enforcement

While the move is welcomed by privacy advocates, the decentralized nature of Mastodon poses significant challenges in enforcing these rules across the entire Fediverse. The new terms apply only to Mastodon's own instances, not the wider network of independently managed servers [2].

Eugen Rochko, founder of Mastodon, acknowledged the difficulty of enforcing such restrictions on a platform that prides itself on decentralization and openness. While it is possible to deploy a file such as robots.txt to block AI crawlers, the effectiveness of this measure depends on whether the operators behind the bots choose to comply [1].
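As a rough illustration, here is a minimal sketch of what such a crawler-blocking file could look like: a robots.txt served from the root of an instance such as mastodon.social, listing a few well-known AI crawler user agents. The specific entries are illustrative assumptions rather than Mastodon's actual configuration, and the directives only take effect if the crawlers choose to honor them.

```
# Hypothetical robots.txt served at the instance root, e.g. https://mastodon.social/robots.txt
# Each entry asks a known AI-training crawler to stay away; compliance is voluntary.

# OpenAI's training crawler
User-agent: GPTBot
Disallow: /

# Common Crawl's crawler, whose corpus is widely used for LLM training
User-agent: CCBot
Disallow: /

# Google's opt-out token for AI training (separate from normal search indexing)
User-agent: Google-Extended
Disallow: /

# All other crawlers may continue to index normally
User-agent: *
Allow: /
```

Because robots.txt is purely advisory, a scraper that ignores it faces no technical barrier, which is exactly the enforcement gap Rochko describes.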

Broader Context of AI Data Scraping

Mastodon's policy change comes amid growing concerns about AI companies using public online content without permission to train their models. This issue has sparked legal actions, such as Reddit's lawsuit against Anthropic for allegedly scraping user-generated content in violation of contractual terms [1].

The debate around AI training data has intensified, with platforms like Reddit signing data-sharing deals with AI companies like OpenAI and Google, while simultaneously taking legal action against others for unauthorized use of their data [1].

Additional Policy Changes

Along with the AI training ban, Mastodon has introduced other significant policy updates:

  1. The minimum age for users has been raised from 13 to 16 years globally [2].
  2. Stronger rules against hacking attempts and automated tools that could disrupt the platform have been implemented [2].
  3. An arbitration clause has been added, requiring users to go through a legal process in Germany for any lawsuits [2].

Implications for the Future

While Mastodon's efforts to protect user data are commendable, the limited scope of the policy highlights the challenges faced by decentralized platforms in presenting a unified front against AI data harvesting. Unless more servers in the Fediverse adopt similar rules, AI companies may still have access to vast amounts of user-generated content [2].

As the debate over AI training data continues, Mastodon's policy change marks a significant milestone in the ongoing tension between the openness of decentralized platforms and the need to protect user data in the age of AI.
