Curated by THEOUTPOST
On Fri, 18 Oct, 4:06 PM UTC
2 Sources
[1]
The AI opt-out models Meta, Musk's X, and the UK gov are proposing are simply not a good enough way for us to protect ourselves from data scraping
Nobody likes the idea of what major social media companies do with their users' information, but as trends like "Goodbye Meta AI" suggest, people are even more worried about what AI scrapers, specifically, are doing with their data. Proposed changes by the UK government and Elon Musk's X, like Meta before them, could end up being pretty tedious to opt out of, if not downright obfuscated.

Starting with X: as spotted by TechCrunch, a recent change to the social media site's privacy policy says it may share your data with third parties. If you don't opt out of this data sharing, your data can be used to train AI models, "whether generative or otherwise". You can opt out by going into 'Settings', then 'Data Sharing and Personalization', and turning off data sharing. This is turned on by default, and you are not warned of it when creating an account. Making an account does, however, entitle the site to harvest the data you generate on it, so X doesn't appear to be under any obligation to warn you. Not only does X have its own AI model and chatbot, named Grok, but your data could be, and likely has been, used to train AI models from other companies.

In a very similar story, as reported by the Financial Times, the UK government is currently consulting on a proposal that would allow companies to train AI models on data scraped from websites unless users choose to opt out.

This is frankly not a good enough way to keep consumers fully informed about how and why their data is used. Giving users the ability to opt out of AI scraping in some fairly obscure part of an app's settings won't give the majority of users enough information to know how their data is used, or even that they can opt out in the first place. An opt-in model would work much better here: users could choose to allow their data to be scraped by AI if they so wished. However, it's hard to believe enough users would do so to satiate the data appetites of current AI models and their owners. If site owners are unhappy with an opt-in model, that implies they know many users would not choose to have their data scraped, and this touches on part of the problem with opt-out policies: they feel like an appeasement of those in the know rather than a tool that actually alerts the average consumer to their own data rights.

AI scraping, specifically, is a very new thing for many users and sites, and many of the current models were built with the sort of Wild West, frontier approach that prevailed at the very start of generative artificial intelligence. AI companies have been bypassing copyright and acting in ethically ambiguous ways to get data for some time, so consumers need to be more proactive about their privacy than ever.
[2]
Here's the deal: AI giants get to grab all your data unless you say they can't. Fancy that? No, neither do I | Chris Stokel-Walker
Data is vital to AI systems, so firms want the right to take it, and ministers may let them. We must wake up to the danger.

Imagine someone drives up to a pub in a top-of-the-range sports car - a £1.5m Koenigsegg Regera, to pick one at random - parks up and saunters out of the vehicle. They come into the pub you're drinking in and begin walking around its patrons, slipping their hand into your pocket in full view, smiling at you as they take out your wallet and empty it of its cash and cards. The not-so-subtle pickpocket stops if you shout and ask what the hell they're doing. "Sorry for the inconvenience," the pickpocket says. "It's an opt-out regime, mate."

Sounds absurd. Yet it seems to be the approach the government is pursuing in order to placate AI companies. A consultation is soon to open, the Financial Times reports, that will allow AI companies to scrape content from individuals and organisations unless they explicitly opt out of their data being used.

The AI revolution has been as all-encompassing as it has been rapid. Even if you're not one of the 200 million people who log on to ChatGPT every week, or dabble with its generative AI competitors such as Claude and Gemini, you will undoubtedly have interacted with an AI system - knowingly or unknowingly. But the fire of AI needs two constantly replenishing sources in order to survive and not burn out. One is energy - which is why AI companies are getting into the business of buying nuclear power plants. The other is data.

Data is vital to AI systems because it helps them develop facsimiles of how we interact. If AI has any "knowledge" - and that's highly disputed, given it's really a fancy pattern-matching machine - then it stems from the data on which it is trained. One study forecast that large language models such as ChatGPT will run out of training data by 2026, so voracious is their appetite. Yet without that data, the AI revolution may stall. Tech companies know that, which is why they are penning licensing deals for content left, right and centre. But that introduces friction, and a sector whose unofficial motto for the past decade or more has been "move fast and break things" doesn't do friction.

Which is why they are already trying to nudge us towards an opt-out approach to copyright, where everything we type, post and share is destined to become AI training data by default unless we say no, rather than an opt-in regime, where companies have to ask us to use our data. We can already see how companies are priming us for this reality: this week, X began notifying users of a change to its terms and conditions of use that would enable all posts to be used to train Grok, Elon Musk's AI model designed to compete with ChatGPT. And Meta, the parent company of Facebook and Instagram, has made a similar change - prompting the viral "Goodbye Meta AI" urban legend, a post that supposedly overrides legal agreements but does nothing of the sort.

The reason AI companies want an opt-out regime is obvious: if you ask most people whether they want anything from the books they write or the music they produce, or the posts and photos they share on social networks, to be used to train AI, they'll say no. And then the wheels come off the AI revolution. The reason governments want to enable such a change to a concept of copyright ownership that has existed for more than 300 years, and has been enshrined in law for more than 100, is less obvious. But like many things, it seems to come down to money.
The government has been confronted with lobbying from big tech companies suggesting that this is a requirement for them to consider the country as a place to invest in and share the spoils of AI innovation. One lobbying document penned by Google suggested backing its approach for an opt-out copyright regime would "ensure the UK can be a competitive place to develop and train AI models in the future". The government's mooted framing of the issue, which already puts the opt-out approach on the table as the method to be argued against, is therefore a big win for big tech lobbyists.

With the amount of money washing around the tech sector and the levels of investment being thrown at AI projects, it's unsurprising Keir Starmer doesn't want to miss out on the potential bounty available. The government would be remiss not to consider how to appease tech companies as they develop a world-changing technology, and to try to make the UK an AI powerhouse. But this isn't the answer.

Let's be clear: the UK's mooted copyright scheme would effectively enable companies to nick our data - every post we make, every book we write, every song we create - with impunity. It would require us to sign up to every individual service and tell them that no, we don't want them to chew up our data and spit out a poor composite image of us. Potentially hundreds of them, from big tech companies to small research labs.

Lest we forget, OpenAI - a company now valued at more than $150bn - is planning to forswear its founding non-profit principles to become a for-profit company. It has more than enough money in its coffers to pay for training data, rather than rely on the beneficence of the general public. Companies like that can certainly afford to put their hands in their own pockets, rather than ours. So hands off.
The UK government and major tech companies are pushing opt-out models for AI data scraping, sparking intense debate about user privacy and data rights. Privacy advocates and consumers alike are concerned that the approach could allow AI companies to access vast amounts of personal data without explicit consent, and critics argue that an opt-in model would better protect consumer interests.
The UK government is currently consulting on a proposal that would permit companies to train AI models on data scraped from websites unless users choose to opt out 1. This approach is similar to recent changes made by social media platforms like X (formerly Twitter) and Meta.
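For context, one opt-out mechanism already exists at the website level, separate from any per-account setting: publishers can ask known AI crawlers to skip their pages via robots.txt. The sketch below uses real, published crawler tokens (OpenAI's GPTBot and Google's Google-Extended, which governs AI-training use rather than Search indexing), but honouring the file is voluntary, so this is a request rather than an enforcement mechanism.

    # robots.txt - ask AI-training crawlers to stay away
    User-agent: GPTBot
    Disallow: /

    # Google-Extended controls use of a site for AI training, not Search ranking
    User-agent: Google-Extended
    Disallow: /

    # Ordinary crawlers are unaffected by the rules above
    User-agent: *
    Allow: /

Note that this only covers crawlers that identify themselves and choose to comply, and it does nothing about data already collected, which is part of why critics say opt-out regimes place the burden on the wrong party.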
X has updated its privacy policy to allow sharing user data with third parties for AI model training, with an opt-out option buried in the settings 1. Similarly, Meta has implemented changes that have led to viral concerns about data usage for AI training 2.
Critics argue that the opt-out model is inadequate for protecting user privacy and data rights. Key concerns include:
Lack of transparency: The opt-out options are often hidden in obscure settings, making it difficult for users to be aware of and exercise their rights 1.
Default data sharing: Most users may remain unaware that their data is being shared by default, potentially leading to widespread, uninformed data collection 1.
Ethical implications: The approach raises questions about the ethics of using personal data for AI training without explicit consent 2.
Privacy advocates and some experts argue for an opt-in model, where users would actively choose to allow their data to be used for AI training. This approach would ensure explicit, informed consent, though both articles note that most users would likely decline, leaving AI companies with far less training data than they want 1 2.
The push for an opt-out model is driven by the AI industry's need for vast amounts of training data. Companies argue that this approach is necessary to maintain the pace of AI development and innovation 2.
The UK government's consideration of this model appears to be influenced by lobbying from tech giants. Documents from companies like Google suggest that adopting such an approach would "ensure the UK can be a competitive place to develop and train AI models in the future" 2.
This debate highlights the tension between rapid AI advancement and individual data rights. As AI companies seek to secure their data sources, there are growing concerns about the potential exploitation of user-generated content without fair compensation or consent 2.
The outcome of this policy discussion could set a precedent for how personal data is treated in the age of AI, potentially reshaping the landscape of digital rights and AI development globally.
Reference
The UK government's new AI action plan, aimed at making Britain an AI superpower, faces backlash from artists and writers over proposed copyright reforms that could allow AI companies to use creative works without permission.
2 Sources
AI firms are encountering a significant challenge as data owners increasingly restrict access to their intellectual property for AI training. This trend is causing a shrinkage in available training data, potentially impacting the development of future AI models.
3 Sources
A viral social media post claiming to protect user data from Meta's AI has been debunked as a hoax. Experts warn that such posts spread misinformation and do not affect data privacy on platforms like Facebook and Instagram.
11 Sources
LinkedIn has stopped collecting UK users' data for AI training following regulatory scrutiny. This move highlights growing concerns over data privacy and the need for transparent AI practices in tech companies.
8 Sources
LinkedIn, with its 930 million users, is using member data to train AI models, sparking a debate on data privacy and the need for transparent opt-out options. This practice has raised concerns among privacy advocates and users alike.
4 Sources