Curated by THEOUTPOST
On Thu, 19 Dec, 4:01 PM UTC
3 Sources
[1]
EU data watchdog sets terms for AI models' legitimate use of personal data
The EDPB has responded to the Irish authority, clarifying when AI models count as anonymous and when personal data may legitimately be used for their development, while leaving leeway to national data protection authorities.

The EU's data protection body has clarified the circumstances in which the development of AI models may draw on personal data, in an opinion that sets out a three-step test for the legitimate interest basis for such use.

The opinion, published this week by the European Data Protection Board (EDPB) - the coordination body for national privacy regulators across the EU - followed a request from the Irish Data Protection Authority in November seeking clarification on whether personal data could be used in AI training without breaching EU law. Ireland's DPA acts as a watchdog for many of the largest US tech companies, which are headquartered in Dublin.

Reaffirming model anonymity and 'legitimate interest'

The opinion outlines that for an AI model to be considered truly anonymous, the likelihood of identifying individuals through the data must be "insignificant".

The EDPB also established a framework for determining when a company has a "legitimate interest" giving it a valid legal basis for processing personal data to develop and deploy AI models without obtaining explicit consent from individuals. The three-step test for assessing legitimate interest requires identifying the interest, evaluating whether the processing is necessary to achieve it, and ensuring that the interest does not override the fundamental rights of individuals. The EDPB also stressed the importance of transparency, so that individuals are informed about how their data is being collected and used.

The EDPB stressed in the opinion that it is ultimately the responsibility of national data protection authorities to assess, on a case-by-case basis, whether the GDPR has been violated in the processing of personal data for AI development. Models developed with data that was extracted and processed unlawfully may not be deployed, the opinion states.

Reactions from civil society and industry

The opinion was welcomed by the Computer & Communications Industry Association (CCIA), which represents major tech companies, including those developing AI models. "It means that AI models can be properly trained using personal data. Indeed, access to quality data is necessary to ensure that AI output is accurate, to mitigate biases, and to reflect the diversity of European society," said Claudia Canelles Quaroni, CCIA Europe's Senior Policy Manager. However, the CCIA also called for more legal clarity to avoid future uncertainty.

Digital rights advocates, however, raised concerns, particularly about the anonymity of AI models. "Although this may seem plausible in theory, it is unrealistic to fine-tune such a distinction to the threshold set earlier, creating significant challenges in ensuring effective data protection," said Itxaso Dominguez de Olazabal, Policy Advisor at EDRi.

Dominguez de Olazabal also highlighted the broad discretion given to national authorities, warning that it could lead to inconsistent enforcement. "This lack of alignment has already proven problematic under the GDPR, threatening the effective protection of fundamental rights. Harmonisation is key to ensuring digital rights are upheld universally."
Looking ahead: web scraping guidelines

Further guidelines are expected from the EDPB to address emerging issues such as web scraping - the automated extraction of data from websites, including text, images, and links, to train AI models and enhance their capabilities. These additional clarifications will be crucial as AI development continues to rely heavily on vast amounts of data.
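To make concrete what that extraction involves in practice, here is a minimal, hypothetical sketch in Python using the widely used requests and BeautifulSoup libraries; the URL and user-agent string are placeholders, and the robots.txt check is an illustrative courtesy rather than anything the EDPB prescribes.

```python
import urllib.robotparser

import requests
from bs4 import BeautifulSoup

USER_AGENT = "example-research-bot/0.1"  # placeholder identifier
URL = "https://example.com/article"      # placeholder target page

# Check robots.txt first - a basic courtesy that also bears on the
# "reasonable expectations" of website owners and users.
robots = urllib.robotparser.RobotFileParser()
robots.set_url("https://example.com/robots.txt")
robots.read()

if robots.can_fetch(USER_AGENT, URL):
    response = requests.get(URL, headers={"User-Agent": USER_AGENT}, timeout=10)
    response.raise_for_status()

    soup = BeautifulSoup(response.text, "html.parser")
    # Extract visible paragraph text and outbound links - the kind of
    # raw material that ends up in AI training corpora.
    paragraphs = [p.get_text(strip=True) for p in soup.find_all("p")]
    links = [a["href"] for a in soup.find_all("a", href=True)]
    print(f"Collected {len(paragraphs)} paragraphs and {len(links)} links")
else:
    print("robots.txt disallows fetching this URL")
```

Repeated across millions of pages, output like this becomes the raw material of AI training corpora - which is why the forthcoming guidelines are so keenly awaited.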
[2]
EU issues new data processing guidelines for AI firms
The Irish DPC welcomed the new guidelines, which were issued following its request in September.

The European Data Protection Board (EDPB) has introduced guidelines on the use of personal data in the development and deployment of artificial intelligence (AI) models. The guidelines - requested by the Irish Data Protection Commission (DPC) in September - address how and in what situations AI models can be considered anonymous, when a "legitimate interest" argument can be relied on in developing or using AI models, and the consequences for an AI model developed using personal data that was processed unlawfully.

The measures, which also consider the use of first- and third-party data, mandate that the anonymity of AI models be assessed on a case-by-case basis by national data protection watchdogs. An AI model can be considered anonymous only if it is "very unlikely" to directly or indirectly identify the individuals whose data was used to build it, or to allow their personal data to be extracted through search queries to the model.

Moreover, the EDPB said the guidelines provide "general considerations" that national watchdogs should take into account when making decisions relating to the legitimate interest of AI models processing personal data. The guidelines set out a number of criteria, including whether the personal data in question was publicly available, the nature of the service the model provides, and the source from which the personal data was collected.

"As the lead supervisory authority of many of the world's largest tech companies, we have a deep awareness and understanding of the complexities associated with regulating the processing of personal data in an AI context," said DPC chairperson Des Hogan. "In having made this request for an opinion, the DPC triggered a discussion, in which we participated, that led to this agreement at EDPB level, on some of the core issues that arise in the context of processing personal data for the development and deployment of AI models, thereby bringing some much needed clarity to this complex area."

The watchdog's commissioner Dale Sunderland said that the guidelines will enable "proactive, effective and consistent regulation across the EU/EEA, giving greater clarity and guidance to industry, while promoting responsible innovation."

Commenting on the new guidelines, EDPB chair Anu Talus said: "AI technologies may bring many opportunities and benefits to different industries and areas of life. We need to ensure these innovations are done ethically, safely, and in a way that benefits everyone. The EDPB wants to support responsible AI innovation by ensuring personal data are protected and in full respect of the GDPR."

The Irish DPC has been at the forefront of the battle for data privacy. Earlier this year, the Commission opened an investigation into Google to assess whether it complied with EU data laws when developing its PaLM 2 AI model, and it also opened an inquiry into Ryanair, looking into how the airline processes personal data - including potentially biometric data. Also this year, the DPC concluded a number of cases against Big Tech companies, including an investigation into X that was "struck out" after the company agreed to permanently suspend its processing of the personal data of its EU and EEA users, as well as a €310m fine imposed on LinkedIn after the DPC found that the company's data processing practices infringed multiple articles of the GDPR.
[3]
EU privacy body weighs in on some tricky GenAI lawfulness questions | TechCrunch
The European Data Protection Board (EDPB) published an opinion on Wednesday that explores how AI developers might use personal data to develop and deploy AI models, such as large language models (LLMs), without falling foul of the bloc's privacy laws. The Board plays a key steering role in the application of these laws, issuing guidance that supports regulatory enforcement, so its views are important.

Areas the EDPB opinion covers include whether AI models can be considered anonymous (which would mean privacy laws wouldn't apply); whether a "legitimate interests" legal basis can be used for lawfully processing personal data for the development and the deployment of AI models (which would mean individuals' consent would not need to be sought); and whether AI models that were developed with unlawfully processed data could subsequently be deployed lawfully.

The question of what legal basis might be appropriate for AI models to ensure they are compliant with the General Data Protection Regulation (GDPR) remains a hot and open one. We've already seen OpenAI's ChatGPT get into hot water here. And failing to abide by the privacy rules could lead to penalties of up to 4% of global annual turnover and/or orders to change how AI tools work.

Almost a year ago, Italy's data protection authority issued a preliminary finding that OpenAI's chatbot breaches the GDPR. Since then, other complaints have been lodged against the tech, including in Poland and Austria, targeting aspects such as its lawful basis for processing people's data, its tendency to make up information, and its inability to correct erroneous pronouncements about individuals.

The GDPR contains both rules for how personal data can be processed lawfully and a suite of data access rights for individuals - including the ability to ask for a copy of data held about them, to have data about them deleted, and to correct incorrect information about them. But for confabulating AI chatbots (or "hallucinating", as the industry frames it), these are not trivial asks.

Yet while generative AI tools have quickly faced multiple GDPR complaints, there has - so far - been a lot less enforcement. EU data protection authorities are clearly wrestling with how to apply long-established data protection rules to a technology that demands so much data for training. The EDPB opinion is intended to help oversight bodies with their decision-making.

Responding in a statement, Ireland's Data Protection Commission (DPC) - the regulator that instigated the request for the Board's views on the areas the opinion tackles, and the watchdog set to lead on GDPR oversight of OpenAI following a legal switch late last year - suggested the EDPB's opinion will "enable proactive, effective and consistent regulation" of AI models across the region. "It will also support the DPC's engagement with companies developing new AI models before they launch on the EU market, as well as the handling of the many AI related complaints that have been submitted to the DPC," commissioner Dale Sunderland added.

As well as giving pointers to regulators on how to approach generative AI, the opinion offers some steer to developers on how privacy regulators might break on crux issues such as lawfulness. But the main message they should take away is that there won't be a one-size-fits-all solution to the legal uncertainty they face.
For instance, on the question of model anonymity - which the Board defines as meaning an AI model should be "very unlikely" to "directly or indirectly identify individuals whose data was used to create the model", and very unlikely to allow users to extract such data from the model through prompt queries - the opinion stresses this must be assessed "on a case-by-case basis".

The document also provides what the Board dubs "a non-prescriptive and non-exhaustive list" of methods whereby model developers might demonstrate anonymity, such as:

- source selection for training data that includes steps to avoid or limit the collection of personal data, including by excluding "inappropriate" sources;
- data minimization and filtering steps during the data preparation phase, before training;
- robust "methodological choices" that "may significantly reduce or eliminate" the identifiability risk, such as "regularization methods" aimed at improving model generalization and reducing overfitting, and privacy-preserving techniques like differential privacy (sketched below);
- any measures added to the model that could lower the risk of a user obtaining personal data from the training data via queries.

This indicates that a whole host of design and development choices made by AI developers could influence regulatory assessments of the extent to which the GDPR applies to a particular model. Only truly anonymous data, where there is no risk of re-identification, falls outside the scope of the regulation - but in the context of AI models, the bar for the risk of identifying individuals or their data is being set at "very unlikely".

Prior to the EDPB opinion there had been some debate among data protection authorities over AI model anonymity - including suggestions that models can never themselves be personal data - but the Board is clear that AI model anonymity is not a given: case-by-case assessments are necessary.

The opinion also looks at whether a legitimate interest legal basis can be used for AI development and deployment. This is important because there are only a handful of available legal bases in the GDPR, and most are inappropriate for AI - as OpenAI has already discovered via the Italian DPA's enforcement. Legitimate interest is likely to be the basis of choice for AI developers building models, since it does not require obtaining consent from every individual whose data is processed to build the tech. (And given the quantities of data used to train LLMs, it's clear that a consent-based legal basis would not be commercially attractive or scalable.)

Again, the Board's view is that DPAs will have to undertake assessments to determine whether legitimate interest is an appropriate legal basis for processing personal data for the development and the deployment of AI models - referring to the standard three-step test, which requires watchdogs to consider the purpose and necessity of the processing (i.e., whether it is lawful and specific, and whether there were any alternative, less intrusive ways to achieve the intended outcome) and to perform a balancing test weighing the impact of the processing on individual rights.

The EDPB's opinion leaves the door open to AI models meeting all the criteria for relying on a legitimate interest legal basis, suggesting, for example, that the development of an AI model to power a conversational agent service to assist users, or the deployment of improved threat detection in an information system, would meet the first test (lawful purpose).
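The opinion names differential privacy among the privacy-preserving techniques above without prescribing any implementation. As a rough illustration of the core idea only, here is a minimal sketch of the classic Laplace mechanism applied to a simple count query; the toy dataset, predicate, and epsilon value are invented for the example, and the training-time variants relevant to large models (such as adding calibrated noise to gradients) are considerably more involved.

```python
import numpy as np

def laplace_count(records, predicate, epsilon):
    """Release a differentially private count of matching records.

    A counting query has sensitivity 1 (adding or removing one person's
    record changes the count by at most 1), so Laplace noise with scale
    1/epsilon yields epsilon-differential privacy for this query.
    """
    true_count = sum(1 for r in records if predicate(r))
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Invented toy data: ages of individuals in a training corpus.
ages = [23, 35, 41, 29, 52, 38, 47]

# Smaller epsilon -> more noise -> stronger privacy guarantee.
noisy = laplace_count(ages, lambda age: age > 30, epsilon=0.5)
print(f"Noisy count of people over 30: {noisy:.1f}")
```

The smaller the epsilon, the larger the injected noise and the stronger the privacy guarantee - exactly the kind of tunable trade-off a case-by-case anonymity assessment might weigh.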
For assessing the second test (necessity), assessments must look at whether the processing actually achieves the lawful purpose and whether there is any less intrusive way to achieve the aim - paying particular attention to whether the amount of personal data processed is proportionate to the goal, with the GDPR's data minimization principle in mind.

The third test (balancing individual rights) must "take into account the specific circumstances of each case", per the opinion. Special attention is required for any risks to individuals' fundamental rights that may emerge during development and deployment.

Part of the balancing test also requires regulators to consider the "reasonable expectations" of data subjects - meaning, whether the individuals whose data was processed for AI could have expected their information to be used in such a way. Relevant considerations here include whether the data was publicly available, the source of the data and the context of its collection, any relationship between the individual and the processor, and potential further uses of the model.

In cases where the balancing test fails, because the individuals' interests outweigh the processor's, the Board says mitigation measures to limit the impact of the processing on individuals could be considered - tailored to the "circumstances of the case" and the "characteristics of the AI model", such as its intended use. Examples of mitigation measures the opinion cites include technical measures (such as those listed above in the section on model anonymity); pseudonymization measures (such as checks that would prevent any combination of personal data based on individual identifiers); measures to mask personal data or substitute it with fake personal data in the training set (see the sketch at the end of this article); measures that aim to enable individuals to exercise their rights (such as opt-outs); and transparency measures. The opinion also discusses measures for mitigating the risks associated with web scraping, which the Board says raises "specific risks".

The opinion also weighs in on the sticky issue of how regulators should approach AI models that were trained on data that was not processed lawfully, as the GDPR demands. Again, the Board recommends that regulators take into account "the circumstances of each individual case" - so the answer to how EU privacy watchdogs will respond to AI developers who fall into this law-breaking category is... it depends.

However, the opinion appears to offer a sort of get-out clause for AI models that may have been built on shaky (legal) foundations - say, because the developers scraped data from anywhere they could get it with no consideration of the consequences - if steps are taken to ensure that any personal data is anonymized before the model goes into the deployment phase. In such cases - so long as the developer can demonstrate that subsequent operation of the model does not entail the processing of personal data - the Board says the GDPR would not apply, writing: "Hence, the unlawfulness of the initial processing should not impact the subsequent operation of the model."

Discussing the significance of this element of the opinion, Lukasz Olejnik, an independent consultant and affiliate of the KCL Institute for Artificial Intelligence - whose GDPR complaint against ChatGPT remains under consideration by Poland's DPA more than a year on - warned that "care must be taken not to allow systematic misuse schemes".

"That's an interesting potential divergence from the interpretation of data protection laws until now," he told TechCrunch.
"By focusing only on the end state (anonymization), the EDPB may unintentionally or potentially legitimize the scraping of web data without proper legal bases. This potentially undermines GDPR's core principle that personal data must be lawfully processed at every stage, from collection to disposal." Asked what impact he sees the EDPB opinion as a whole having on his own complaint against ChatGPT, Olejnik added: "The opinion does not tie hands of national DPAs. That said I am sure that PUODO [Poland's DPA] will consider it in its decision," though he also stressed that his case against OpenAI's AI chatbot "goes beyond training, and includes accountability and Privacy by Design."
The European Data Protection Board (EDPB) has released an opinion clarifying the circumstances under which AI models can use personal data, addressing issues of anonymity, legitimate interest, and legal compliance in AI development and deployment.
The European Data Protection Board (EDPB) has issued a significant opinion addressing the use of personal data in AI model development and deployment. This guidance, requested by the Irish Data Protection Commission (DPC), aims to clarify how AI companies can comply with EU privacy laws, particularly the General Data Protection Regulation (GDPR) [1][2].
The EDPB states that for an AI model to be considered anonymous, it must be "very unlikely" to identify individuals directly or indirectly through the data used in its creation. The assessment of anonymity should be conducted on a case-by-case basis by national data protection authorities [1][2].
The opinion outlines a three-step test for assessing whether companies have a "legitimate interest" in processing personal data for AI development without explicit consent: identifying the interest being pursued, evaluating whether the processing is necessary to achieve it, and ensuring that the interest does not override the fundamental rights of the individuals concerned [1].
The EDPB clearly states that AI models developed using unlawfully processed personal data may not be deployed [1].
The opinion emphasizes that national data protection authorities are responsible for assessing GDPR compliance in AI development on a case-by-case basis. This approach allows for flexibility but may lead to inconsistent enforcement across the EU [1][3].
The Computer & Communications Industry Association (CCIA) welcomed the ruling, stating it allows for proper AI model training using personal data. However, they called for more legal clarity to avoid future uncertainties [1].
Digital rights advocates expressed concerns about the broad discretion given to national authorities and the challenges in ensuring effective data protection. They emphasized the need for harmonization to uphold digital rights universally [1].
The EDPB is expected to issue further guidelines on emerging issues such as web scraping, which is crucial for AI development. These additional clarifications will be essential as AI continues to rely heavily on vast amounts of data [1].
This opinion provides a framework for AI developers to navigate the complex landscape of data protection regulations. It supports responsible AI innovation while ensuring personal data protection in compliance with GDPR [2][3].
The guidelines will enable proactive, effective, and consistent regulation across the EU/EEA, offering greater clarity to the industry while promoting responsible innovation [2].
As the AI landscape continues to evolve, these guidelines will play a crucial role in shaping the development and deployment of AI models in the European Union, balancing innovation with privacy protection.