Curated by THEOUTPOST
On Mon, 21 Oct, 4:03 PM UTC
3 Sources
[1]
PrivacyNama 2024: How Privacy by Design helps AI Compliance
At PrivacyNama 2024, speakers discussed how best to develop AI while ensuring privacy for users. The law often mandates that AI model developers build models in a way that protects the rights of users, but approaches to this differ across the spectrum. Some organisations may choose a harms-based approach, in which they remedy deficiencies based on feedback. Speakers at PrivacyNama instead suggested 'Privacy by Design' as a more efficient and fairer alternative: to ensure compliance with privacy regulations, developers must consider data protection principles right at the inception of the model.
Udbhav Tiwari, Head of Global Product Policy, Mozilla Foundation, said that 'Privacy by Design' is integral to data protection principles: "there's almost no point trying to make an AI system comply with privacy regulation unless you're accounting for those characteristics and principles, right from the initial design of the product." He said there are two ways to ensure that an AI model protects the privacy of individuals. First, developers can train models on datasets that follow certain privacy considerations and do not include information that can violate privacy. Second, developers can explicitly code a model not to generate certain outputs (a minimal illustrative sketch of such a filter appears below). Without this, he said, "you will almost certainly end up causing or creating risks for your product that will end up making it much harder for it to be able to both comply with regulation, as well as not be subject to the adversarial environments."
"The scale of AI is way wider than any of the enforcement capacities available, and therefore, we have to think of safeguards at the stage of training data, or even before training data," said Beni Chugh, Head - Future of Finance, Dvara Research.
Srinidhi Srinivas, Partner, Ikigai Law, noted that the applicable and relevant data protection principles may differ at each stage. For instance, during the development stage, the exemption for publicly available data may apply, while at the deployment stage, the principles of consent-seeking and consent withdrawal may apply.
As speakers discussed the need for regulating AI, another idea proposed was to create self-policing and self-monitoring AI systems that could flag any inconsistency in following the law, employing AI to fill gaps in regulation. Tiwari said this would be quite challenging because "all code is controlled by human beings". He said, "Technology can certainly help you comply with legal regulations, and sometimes it's necessary to comply with legal regulations. But I think extending that to say that technology can help you monitor the harms that might occur prior to the product being deployed or even after being deployed is a much harder question." He also warned that AI systems have a tendency to hallucinate, and that models advanced and powerful enough to do this may themselves pose a safety risk. Instead, he said, "I think [technology] can play a role of creating the records that are necessary to investigate whether legal compliance was done or not ... But I think to catch violations that are happening outside in the world because of bad privacy practices is a really, really hard thing to do."
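Tiwari's second approach, restricting what a model is allowed to emit, is usually implemented as a filter that sits between the model and the user. The snippet below is a minimal, hypothetical sketch of such an output filter built on simple regular expressions; the redact_pii() helper and its patterns are invented for illustration and do not describe any system discussed at the event, and production filters use far more robust PII detection.

```python
import re

# Illustrative only: a rule-based output filter that redacts obvious personal
# identifiers from model output before the text is shown to a user.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "phone": re.compile(r"\+?\d[\d\s-]{8,}\d"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact_pii(model_output: str) -> str:
    """Replace anything matching a known PII pattern with a redaction marker."""
    for label, pattern in PII_PATTERNS.items():
        model_output = pattern.sub(f"[REDACTED {label.upper()}]", model_output)
    return model_output

if __name__ == "__main__":
    raw = "Contact me at jane.doe@example.com or +91 98765 43210."
    print(redact_pii(raw))
    # Prints: Contact me at [REDACTED EMAIL] or [REDACTED PHONE].
```

A real deployment would combine detection like this with broader policy rules about what the model may discuss at all, but the basic shape is the same: inspect the output, redact or block it, then return it.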
He also noted that many startups and organisations have products that can record internal practices and set up internal frameworks for documenting data. He said that tools to flag non-compliance would be a lot more effective if they were widely available and attested by regulators. These data governance tools can ensure that developers are meeting their obligations, and they can also help with investigations, since companies would no longer be able to shirk responsibility for demonstrating their compliance. Pundarikaksh Sharma, the session chair, summed up that technological progress in this area is aimed at increasing accountability through transparency: logging the steps leading to a model's output, the manner in which inputs were provided, and the output itself.
The Digital Personal Data Protection (DPDP) Act says that data protection principles do not apply to personal data that is made, or caused to be made, publicly available by the user to whom such personal data relates. Srinidhi Srinivas said that an exemption for any publicly available data is "too wide an exemption" and that "just because some information is out there doesn't mean it's out there for us to pick up". She said that one must consider the provenance of that data, its source, and whether the individual made the data publicly available themselves.
Beni Chugh said she was against exemptions for certain forms of data and could not think of a use case where such an exemption could apply. "Privacy is really not confidentiality. If someone is accessing my data and using it in a bounded manner that is agreeable and lawful, my privacy is intact, and I don't see why any particular actor or agent in the society would need exemptions over and beyond," she said. Tiwari agreed, saying that a "blanket exemption" may not be fair because the same product can be used for multiple different use cases and themes.
Tiwari said that strict liability could apply to certain use cases where there is "a real tangible threat to life" or ones that "would lead to outcomes that are so discriminatory that they're considered universal norms that should not be violated." Srinivas noted that India's DPDP Act has restrictions on tracking, behavioural monitoring, and targeted advertising based on children's data, explaining that regulators decide the terms of strict liability based on the vulnerability of the community. "Strict liability largely means that you just really need to be extra careful about what you do and that the consequences of what would happen if things go wrong will be more severe than if things would just normally go wrong. ... It's about like, 'What if a person doesn't get healthcare and dies because of clearly documented harm that occurred because of the system?' So the threshold to require it and prove it is also quite high and therefore the consequences are also quite high," Tiwari said.
Chugh said that under strict liability, developers must fine-tune other ex-ante obligations, such as heightened transparency, accountability, and risk assessment, to the risk level. However, she questioned whether regulators at the government level have the capacity to implement such solutions, and whether the industry has the confidence to really open up to the government and regulators. "Can the regulator actually pick apart and still guarantee that their trade secrets are fine or their IPs are fine?" she asked.
Speakers also discussed whether there should be obligations requiring AI developers to inform all participants when they process their data.
Srinivas said this provision may cause complications, as it is unclear whether the onus should be on the developer or the deployer. "There is a base model on top of which other deployers may have built their service or their end product, and in these scenarios, who's really calling the shots with respect to their data?" she asked. "And also there may be base training data in the foundational model and there may be certain training data that the deployer has added on top of that layer. Would that mean the deployer is responsible for that entire training data set?" she added. Udbhav Tiwari said, "I don't think the technology is the problem for how many people you can send data requests to and whether they have to comply with it. But I don't see a world in which one company merely telling the other company that they got an erasure request would automatically make the second company erase the data. That is an obligation that needs to come from either the law or the regulator, not from one company to another."
[2]
AI and Privacy Laws: How Companies Can Stay Compliant
The rapid growth of AI has introduced a host of new challenges for organizations, particularly around data protection and regulatory compliance. Data Protection Officers now find themselves navigating an increasingly complex web of regulations across multiple jurisdictions, often in the face of uncertainty. In a highly competitive landscape, companies have to balance innovation, such as training AI models on vast datasets, with the need to protect privacy. At the Data Protection Officer Roundtable during PrivacyNama 2024, experts explored the challenges of developing and deploying AI, offering best practices for overcoming these obstacles.
Derek Ho, Assistant General Counsel, Mastercard, said that developers in industries like finance are often already heavily regulated when it comes to frameworks around risk management and IT outsourcing. On top of that, AI integration brings "additional complexities around data governance." Data protection officers have to balance these regulations, in addition to ensuring that external service providers apply the same principles and requirements, as AI often necessitates hiring an external service provider.
Another challenge companies face is the lack of consistency in regulation. "We've seen this in some countries where a privacy law defines personal data very differently from how a financial regulator would define it, or a healthcare regulator would define it. And then you have different definitions of a basic construct of personal data or non-personal data," Derek Ho said. Internationally, however, he said there is a lot more consistency. He gave the example of multilateral organizations like the Organization for Economic Co-operation and Development (OECD), which have created common principles for policymakers to follow. He said that while regulations differ across countries, most regulatory proposals, frameworks, and guidelines try to land on a common set of principles like fairness, accountability, transparency, explainability, and human centricity. "There are going to be differences, but you're always going to have some level of differences. This is not a unique problem. You have competition laws which are different. Healthcare laws are different as well."
Sriram SL, Data Protection Officer, Zomato, said that companies based in countries that do not have fixed regulations, like India, also face challenges. He said that the general principle they follow for data processing is asking whether an integration is necessary and whether the AI is causing harm, which he said is the fundamental basis for new regulations. "While I may not have a regulatory risk framework, I can internally always put in place a system so that high-risk [applications of AI] are subject to more scrutiny and therefore are subject to more checks and balances within the system before it [the platform] becomes a user-facing model," he said.
Sriram SL noted that companies like Zomato that deploy AI may have less control over the AI system than its developers do. He noted that companies developing AI should consider its unpredictable emergent capabilities and avoid being overly aggressive in implementing new features. "I can't just let an AI system loose within the organization, within the product, within my app, and then deal with the fallout. There needs to be extensive testing and there needs to be a proportionality.
Does the benefit of the AI system really necessitate the risks we take in deploying?" he asked.
As more jurisdictions introduce regulations, more guardrails are imposed on AI models that train on personal data, and those who trained their models before these regulations are now in a difficult position. For example, the FTC has suggested algorithmic disgorgement in certain cases. To address this issue, Srinidhi Srinivas, Partner, Ikigai Law, suggested at the session on "Applying Data Protection Principles to AI" that companies retrain the model on revised input or training data, implement data subject and deletion rights, and create output filters. Idriss Kechhida, Global Data Protection Officer, OpenAI, noted that companies in these cases can face litigation and regulatory action. He advised companies "to always keep the past as a good example of a cautionary tale when these things happen. You can also make use of these things in a way that can also help you improve in the future because you can have actual data points that you can talk to internally when you are doing these things. ... When it comes to the future, try always getting better. It's never too late to apply a good culture within a company that can help you address this in advance."
Derek Ho suggested that companies follow the US National Institute of Standards and Technology (NIST) AI Risk Management Framework (govern, map, measure, and manage) to stay ahead of regulatory expectations. Mugjan Caba, Senior Compliance Officer, Truecaller, said, "[Organizations must] ideally have internal governance structure, we need to definitely team up with the broader organization. We were already doing that for the purposes of data protection compliance, but now the need is even bigger for the technical and product aspect of the AI that we see."
Derek Ho said that internal sponsorship could have a large influence on a company's culture of privacy. "I would say to any company who wants to get better at operationalizing privacy, is, do you think that you have the right sponsorship internally? And an easy way to answer this [is to ask], do we think the CEO actually has a sense of the challenges and the opportunities that relate to privacy? Is there a forum in which these senior executives hear about privacy and can weigh in on certain critical decision-making? Is there a steering committee for privacy?" he said. He also pointed out that compliance culture usually comes from the top of the company, and that it is important to inculcate that culture throughout the organization. He said that while companies may want to incorporate AI to be competitive, they must not be data-hungry and must keep consent at the center of their policy. "It is also the firm or the company's responsibility to be upfront with the customer about what the data is being used for and not start doing processing on the backend without the customer. So it really has to be consent-driven. But at the same time, we have to be mindful of data minimization. Don't just start using processes or start deploying features which are overly reliant on personal data, which they were not before," he said.
Sriram SL said companies must have a proactive approach to data subject rights. "You [companies] will have to foster a system where we take into account from day one that the data subject whose data is being used for a particular product or a platform will continue to have his rights over the course of that data's storage and retention period in the system," he said.
He also said that companies must keep track of their data, ensure that personal data does not leave the system, and take a systematic approach to giving users more control over their data. He said companies must not rush to capitalise on AI. "You do an assessment... and make sure that you're only using it for a specific purpose, and you're not just putting out personal data. And that, in turn, makes it easier for you to handle DSRs [Data Subject Requests] because while you have integrated a new technology on your platform, you've not really done something new or unconventional with the data that was ever consented to. So, you're still staying true to the purpose for which the data was consented. And therefore, it becomes easier to comply, because ultimately, you're not doing something new, even if the nature of the request is new," he said.
[3]
Can AI Training and Global Data Laws Coexist? #PrivacyNama
"The dirty little secret of generative AI is that all the data scraping that happens to train the models is in total disrespect of all the 160+ laws that exist and that protect data protection in the world. Because data scraping is at its core incompatible with lawfulness. There is no consent in scraping all personal data, all contents of any publicly available site, because even if it is public, it doesn't mean that you have given a consent to use it, to collect and use it as you want. That is at odds with purpose limitation and it is at odds with lawfulness," said Professor Luca Belli of the Fundação Getulio Vargas (FGV) Law School during MediaNama's flagship event PrivacyNama on October 3 and 4, 2024.
Throughout the event, speakers talked about the tussle between AI training and privacy protection. Belli specifically talked about the inclusion of personal data within publicly available data sets. Regarding India and its Digital Personal Data Protection Act, 2023, Belli said that there should be a consent mechanism for publicly available data even if India's data protection law allows for the scraping of such data. He pushed for the Data Protection Board to clarify the data collection and processing structure in the case of publicly available data.
"Let's say that I'm participating in this conference in this moment. And so my bio is on the website of MediaNama. I've not necessarily made it explicitly public and given consent for it to be scraped and utilized to train models, right? So, I think that there is a very important role to be played by the regulator to specify, to clarify what elements of the law [apply]. To my knowledge, India is the only one that exempts the application of the law when data, personal data, are public. But again, it [the provision in the Indian law] doesn't mean that if you scrape all the internet and use it as you please, that is legal or you have a legitimate interest for it. So, again, very important role for regulators to be played here," said Belli.
Udbhav Tiwari, Head of Global Product Policy at the Mozilla Foundation, argued that AI models should be trained to identify personal information and not divulge it. One can easily find credit card numbers, phone numbers, email IDs, and sometimes even information from digital ID leaks within an AI's training dataset, which is often scraped from the public web, among other sources. "Unless you tell them [AI models] to not divulge or share this information, it's quite possible they might start bringing it out. So, I definitely think it's something that needs to happen right from the beginning, and that it's not a binary between a harms approach and a principle-based approach," said Tiwari.
Tiwari called for conversations around the control of datasets used for AI training. He questioned whether companies working on AI models should be obligated to submit their datasets to regulators prior to training. Even if a company resolves to remove particular data, Tiwari argued that doing so is not feasible for openly available model weights that have already been trained and exist on consumer computers. For this reason, he also suggested that a content moderation layer, which is usually used for harmful and socially unacceptable content, can be used to make sure that certain kinds of data do not turn up in outputs.
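Screening scraped data for identifiers before training, as Tiwari describes, can be pictured as a pre-processing pass over the corpus. The sketch below is a hypothetical illustration of that idea; clean_corpus() and its patterns are assumptions made for this example, and real pipelines rely on dedicated PII-detection tooling and human review rather than a handful of regular expressions.

```python
import re
from typing import Iterable, Iterator

# Illustrative pre-training pass: flag records containing obvious personal
# identifiers so they can be reviewed or dropped before a model sees them.
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
CARD = re.compile(r"\b(?:\d[ -]?){13,16}\b")
ID_NUMBER = re.compile(r"\b\d{4}\s?\d{4}\s?\d{4}\b")  # 12-digit, ID-style number

def clean_corpus(records: Iterable[str]) -> Iterator[str]:
    """Yield only records with no obvious identifiers; flag the rest for review."""
    for record in records:
        if EMAIL.search(record) or CARD.search(record) or ID_NUMBER.search(record):
            print(f"flagged for review: {record[:60]!r}")
            continue
        yield record

if __name__ == "__main__":
    scraped = [
        "Speaker bios are listed on the conference website.",
        "Reach the organiser at contact@example.org for passes.",
    ]
    kept = list(clean_corpus(scraped))
    print(kept)  # Only the first record survives; the second contains an email.
```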
"I do think that this is a place where the law will probably have to evolve a little bit, not to say that you cannot exercise these rights, but by being a lot more specific about how these rights apply with regard to AI systems. As far as I understand, it's really, really if not borderline impossible to remove individual pieces of information from a data set before it starts showing up in the world. To the point of being prohibitively expensive both technically as well as financially to market providers, and definitely something that people outside of the biggest model, like trading companies that have the money to do that, cannot do," said Tiwari. Tiwari talked about a need for further enforcement against data scraping when asked about mixed data sets and protection of personal data within such sets. He described the current situation as a "free for all" where entities have "gone around and done what they have." He gave the example of the Data Subject Access Request (DSAR) that allows a data subject to access their personal data held by an organisation. "Under the GDPR, people have been trying to use them a lot for the last year and a half to try different ways to figure out whether that information even exists in these systems or not. And almost uniformly the standard response that they have gotten is either the trade secret and intellectual property defense saying that we can't, or that in order to answer your request, we will have to scan all of the content that we already have in order to determine whether your content exists there. And therefore, we like it's technically infeasible and we cannot do that. Even in cases of very targeted pieces of information where people have evidence that the model is spewing output what is their information, they have not gotten responses and have now filed complaints with privacy regulators asking companies to figure out," said Tiwari, adding that there have been instances where companies have used internal data sets to train models and still refused to comply with DSAR requests. Derek Ho, Mastercard Senior Vice President, Assistant General Counsel, Privacy & Data Protection highlighted disparities in how different organizations handle data protection and AI governance. Ho emphasized that organizational size and resources heavily influence privacy implementation capabilities. According to Ho, small and medium enterprises focused on basic operational survival often lack dedicated resources for data protection, with officers frequently juggling multiple roles including technical, legal, and administrative duties. In contrast, Ho claimed that larger, better-resourced organizations have generally implemented privacy by design frameworks, providing a foundation for product development involving data and increasingly, AI and machine learning. Ho concluded that a one-size-fits-all approach to data protection and AI governance is impractical given the diverse organizational landscape. When asked whether the creation of a sovereign data protection authority that ensures anonymized data would resolve the provenance issue, Tiwari said content provenance will remain a problem. This is an initiative that makes sure content is from a source which is trustworthy and reliable. Digital rights management (DRM) plays a significant role in such actions to enforce those properties upon a particular piece of data. However, to do so, the DRM will have to control, measure and mitigate data either from the point at which it is generated or when the data is first stored. 
He also warned against the idea of allowing companies like Google, Apple, or Microsoft to deploy forms of content provenance on user devices that are outside user control. While this would help tag content, personal or otherwise, as it makes its way through the system, it ends up creating an Aadhaar-like identifier for each piece of data. "It's sort of like having a universal Aadhaar for every piece of data that you generate, which is like a single identifier that is supposed to be able to track throughout [its virtual journey] knowing everywhere it's gone and everything that it's done. That's a pretty bad idea. So, I think that provenance in knowing where information comes from and like for data management generally, that's fine... but content provenance in the way that from the beginning of the place where the content is created to when it becomes the output in an AI system... I think the consequences of that approach are too severe for us to seriously consider them," said Tiwari.
However, Merl Chandana, LIRNEasia Team Lead for Data, Algorithms and Policy, said that some countries are already working on responsible curation of data sets for language models. Some of these practices involve working with non-personal data sets to see how the AI models perform, and considering the amount of information required when personal data sets are used.
Speakers also talked about navigating compliance and enforcement when dealing with data from a mixed data set. According to Mugjan Caba, Truecaller Senior Compliance Officer, compliance for mixed data sets starts with an organisation's understanding of its data footprint, which leads to the creation and management of data registers. While some entities bound by the GDPR already carry out such processes to comply with data protection regulation, the method is yet to expand to the "complex nature of AI."
"Already today, it's sometimes not crystal clear whether a data piece is personal data or not... So even today we might find ourselves making hours of analysis just to understand whether a piece of data is personal data. Now add outputs from AI to that... we should expand the definition of data there and [decide] in which processing activities we utilize personal data. But also [consider] data in a gray zone maybe, where you need to discuss or properly analyze and document whether it can be classified as personal data or not," said Caba.
Experts discuss the complexities of developing AI while adhering to privacy laws, highlighting the need for 'Privacy by Design' and addressing challenges in data governance and regulatory compliance.
As artificial intelligence (AI) continues to evolve rapidly, developers and companies face increasing challenges in balancing innovation with privacy protection. At PrivacyNama 2024, experts emphasized the importance of 'Privacy by Design' as a fundamental approach to ensure compliance with data protection regulations [1].
Udbhav Tiwari, Head of Global Product Policy at Mozilla Foundation, stressed that privacy considerations must be integrated from the inception of AI model development. He outlined two primary methods to protect individual privacy: training models on carefully curated datasets that exclude privacy-violating information, and explicitly coding models to avoid generating certain outputs [1].
The rapid growth of AI has introduced complex regulatory challenges for organizations. Data Protection Officers now navigate a web of regulations across multiple jurisdictions, often facing uncertainty [2].
Derek Ho, Assistant General Counsel at Mastercard, highlighted the lack of consistency in regulations across different sectors and countries. However, he noted that international organizations like the OECD are working to establish common principles for policymakers [2].
To address these challenges, experts recommended several strategies: retraining models on revised input or training data, implementing data subject and deletion rights, adding output filters, following risk frameworks such as the NIST AI Risk Management Framework, and building internal governance structures backed by senior-level sponsorship [2].
A significant point of contention in AI development is the use of publicly available data. Professor Luca Belli of the Fundação Getulio Vargas (FGV) Law School argued that data scraping for AI training often disregards existing data protection laws worldwide [3].
Belli emphasized that public availability does not equate to consent for data use in AI training. He called for regulators to clarify the structure of data collection and processing, especially concerning publicly available data [3].
Udbhav Tiwari proposed that AI models should be trained to identify and avoid divulging personal information. He suggested implementing a content moderation layer to prevent certain types of data from appearing in AI outputs [3].
The challenge of removing individual pieces of information from existing datasets was also discussed. Tiwari noted the technical and financial difficulties in exercising data subject rights in the context of AI systems, suggesting that laws may need to evolve to address these issues specifically [3].
As the AI landscape continues to evolve, experts stressed the need for clearer regulatory guidance on publicly available data, consent mechanisms suited to AI training, workable data subject rights for AI systems, and stronger internal governance within companies.
In conclusion, as AI technology advances, the industry must prioritize privacy protection and regulatory compliance while fostering innovation. The adoption of 'Privacy by Design' principles and the development of clear, consistent regulatory frameworks will be crucial in navigating this complex landscape.