Curated by THEOUTPOST
On Fri, 11 Oct, 4:03 PM UTC
4 Sources
[1]
PrivacyNama Shows A Global South Dilemma: AI Innovation vs. Regulation
Only 49% of nations in the Global South had legally enforceable frameworks for data protection with respect to AI, said Nicolas Grossman, project director of the Global Index on Responsible AI, in his presentation at PrivacyNama 2024. Comparatively, 83% of nations in the Global North had legal frameworks and laws dealing with AI, data protection and privacy. He added that while jurisdictions like the EU and the US have introduced major AI regulations, countries in the Global South have been more measured in introducing laws regulating AI.
The context of AI in the Global South differs vastly from that in the Global North. For some countries, it is an opportunity for economic development, and stringent regulation may pose an obstacle. In other countries, AI has been the catalyst for conversations on privacy principles. Speakers at PrivacyNama 2024 discussed how the Global South could contend with this new technology and approach privacy regulation.
Professor Luca Belli of Fundação Getulio Vargas (FGV) Law School noted that most countries in the Global South already have general data protection laws. These laws may not cover automated processing, but general concepts such as lawfulness, fairness, transparency, purpose limitation and consent still apply to AI. The gap, he noted, was in enforcement. He said 19 countries in Latin America already have data protection frameworks (in some cases these frameworks are 60 years old), yet enforcement remains weak. "In the books, we have very nice principles, but in action, we have regulators that drag their feet. They are very severely under-resourced and may not be willing to go after large tech companies. Could you imagine the data protection authority of Peru or Uruguay going after Microsoft and OpenAI?" he said.
Belli stated that the role of regulators in Global South countries is "not merely to sanction." "When you sanction, it means that you are already in the pathology. The disaster has already happened, and you have to try to revert it. The main role of the regulators is to do pedagogy, not act when the pathology already happened," he explained. He added that regulators could explain to companies how to implement pre-existing data protection principles.
Linh Tong, Deputy Director of the Center for Education Promotion and Empowerment of Women (CEPEW), pointed out that this approach may pose a challenge in countries like Vietnam, which are still developing their AI systems, because of a lack of understanding among regulators. She explained that while regulators have instituted data protection regulations, these are usually created to mirror the Global North's efforts without grasping the underlying concepts. "They might be able to incorporate one of the best languages in laws and policy into the local context, but it does not necessarily mean that they understand all the implications behind it," Tong said. She argued that regulators from Vietnam and countries at a similar stage of development could not be expected to engage with tech companies without learning from the United Nations or other advanced economies. Tong cited Singapore's data protection regulations as an example, which are divided into two stages: Singapore first imposes strict sanctions on tech companies so that they gain an understanding of the laws, and then focuses on prevention.
She noted that enforcing laws like this is easier in Singapore, which is already developed and has a much smaller population than nations like Vietnam or India.
Merl Chandana, Team Lead of the Data, Algorithms and Policy Team at LIRNEasia, concurred and added that countries within the Global South are at different stages of development. He prompted the panel to assess the capabilities of countries aiming to develop AI to support their economies while grappling with regulatory and governance challenges related to data protection and AI. "On one hand, you have like industrialized countries who are heavy into manufacturing like Vietnam. And India has perhaps comparatively a large digital public infrastructure and some level of capacity in the government, at least in terms of like deployment of AI. There are so many other countries which are really interested in unlocking AI for growth and benefit, but in terms of both deploying AI and understanding risk and managing those risks aren't quite there." However, Chandana said that AI has revitalized the conversation on data sharing policies within governments, intellectual property and cybersecurity in Sri Lanka.
Tong noted that Vietnam is currently focused on becoming an ASEAN digital leader and on becoming part of the global AI value chain. It has therefore encouraged the use and sharing of data to create and generate benefits. This makes enforcement of the personal data protection law a low priority, as government resources go toward encouraging innovation.
"I don't think most countries, especially in the Global South, are particularly interested in directly regulating AI," Chandana said. "This common perception about trade-offs whether it exists or not, between AI innovation and AI regulation, that's very much the case. But broadly interpreting AI regulation as mechanisms that need to be put in place to ensure that the overreaches and the downsides of AI are mitigated," he added. He argued that direct AI regulation may not be the way forward; instead, countries looking to innovate can adopt tools like Singapore's AI Verify.
Further, data protection principles like data minimization could be a roadblock for many important use cases. For instance, AI models can leverage large amounts of data to create healthcare applications. They can facilitate public health interventions by identifying population groups that are at risk. However, they may require access to health data that some may deem sensitive.
Sriram SL, Data Protection Officer at Zomato, said at the "Data Protection Officers Roundtable" that he would ideally prefer a more enabling framework in India when it comes to AI. "I want a law that doesn't get overly prescriptive and rather says, 'Okay, here are some things you should steer clear of, but otherwise feel free to do it.' I think the lack of innovation could hold back AI," he said.
Belli noted that in order to develop and innovate in AI, regulators need a strategy for investing in AI. "Regulation is not only about sanctioning, it's also about facilitating what you want to have," he said. He noted that China studied Japan, Taiwan, and South Korea, which copied what the US has been doing over the past 70 years: a very well-focused industrial policy. "Although they [the US] don't admit it, and they say that it's all the private sector, everything we have in our technology basically originates from multi-billion-dollar contracts from Department of Defense, DARPA, NASA, and then is commercialized," he said.
Belli added that countries that wish to scale AI programmes will need infrastructure, compute and cloud computing. "Most countries in the world are all dependent from three US corporations - AWS, Microsoft Azure, and Google Cloud for computing. And the three of them are dependent from NVIDIA for chips, and chips is dependent from TSMC for producing the chips. So, it's a very concentrated market, a computing market on which everyone depends," he explained. "We are all dependent in terms of infrastructures, on very, very limited number of corporations. It's also a way in which these corporations can regulate through architecture, as Lawrence Lessig wrote 20 years ago, the way in which the AI develops."
He noted that China is the only AI-sovereign country in the world, apart from the US, that has developed its own infrastructure. "There is a very different dynamic because the only Global North countries are the US and China in terms of infrastructure, all the rest is Global South. We are all dependent on them. This is very important to understand, to realize the type of regulation we need and the type of investment we need," he said.
Linh Tong said that multilateral agreements between countries could facilitate development in AI as well as data protection principles. She gave the example of the Digital Economy Partnership Agreement (DEPA), a form of trade agreement specifically designed for the digital economy, initiated by Singapore. DEPAs, she explained, break the digital economy partnership down into different models, giving countries the flexibility to join the models they are ready to cooperate on. The Singapore government and the Vietnamese government have already decided to collaborate on the Data Protection Model through a Memorandum of Understanding (MoU). Vietnam's Ministry of Public Security and Singapore's Ministry of Information and Communications have signed bilateral agreements for the same. The MoU will lay out the timeline for enhancing and enforcing personal data protection in Vietnam, with Singapore providing capacity-building training. Singapore can share its experience while allowing Vietnam to work with its own nuances.
Tong also suggested that the United Nations, the International Telecommunication Union (ITU), or ISO promote international standards to prevent the balkanization of data. Grossman concurred that global agreements and transnational actors are very important. He added that scale is very important, both for development and for regulation. "Perhaps a small country cannot do a lot, but if there are regional, bilateral, trilateral agreements, and there is a clear strategy of what's the role of the regulators and what's the objective of a country while promoting the use of AI and regulating it, I think that's the best approach for Global South countries," he said.
There has been a significant push within countries to create LLMs that can process input in local languages. Large campaigns have been undertaken in countries like India to create datasets of these languages. However, this too poses a risk of privacy violations. "When it comes to thinking about large language models, there is no need to follow, let's say, the business model of the leading industry that would lead to huge data scraping and use that may put at risk data protection and privacy," he said. "I think it's a good point to think about NLP or LLMs focused to a purpose.
There is no need for Global South countries to have language models that apply to all uses and to show them that, 'Okay, this is almost human.' That is now the marketing strategy of Big Tech, that if you scale them and if you focus to a purpose, you can address different challenges at the same time," he argued. This approach, he said, could help reduce the dependency on Big Tech while ensuring data protection and privacy.
Tong said that Vietnam has undertaken projects to develop language models based on minority languages. However, Vietnam can strike a balance between privacy and the development of these LLMs, she added: governments can integrate their privacy regulations and requirements right from the screening of a project or a partnership and thereby prevent violations of privacy.
Belli also called on regulators to consider open data initiatives. He highlighted the Brazilian government's investment in public data goods for the Portuguese language that can be used by private companies, and noted that Brazil has some of the best open data rankings. Big Tech companies like OpenAI and Google have trained their models on this data with very successful results. "I think a coherent vision would be also to accompany open data with licenses so that the value that is generated is also reverted to the country that originates it," he said.
[2]
Privacy Risks in AI Training with Non-Personal Data
"We lack regulations when it comes to, say, data scraping for publicly available data, which is quite a big risk when it comes to AI development and using such publicly," said Anwesha Sen, Session Chair for the Data Trusts, Non-Personal Data and AI panel discussion at PrivacyNama 2024. She pointed out that while the risks associated with using personal data to train AI models were discussed frequently, there was less focus on the issues with using non-personal data. "Using such data without the required regulations, etc., not only poses possible copyright and IP violations but also risks re-identification of individuals along with a host of other concerns," she explained.
Non-personal data is data that doesn't contain any Personally Identifiable Information (PII). It can be data that was never personal, or data that was once personal but has been anonymised. This includes data scraped from the internet.
Amol Kulkarni, Research Director at CUTS International, emphasised the confusion and ambiguity surrounding the definitions of personal and non-personal data. He pointed out that the distinction between the two is often unclear, especially when large datasets -- both personal and non-personal -- are merged for AI training. When such mixed datasets, which include anonymised data, are used for AI training, there is a high risk of deanonymisation, or identifying people from the data. "I think that is one of the clear risks when, particularly when there are these data sets which are sort of which originally belong to individuals and they have been optimized and they sort of used as non-personal data," he said.
But even with data that was never personal, there are still risks. Datasets created through government measures, like electricity consumption or data related to roads, soil or climate, were still related to groups of individuals or communities. "If usage or misuse of this data could adversely impact or target a particular community in a particular geography or particular type of, sort of a group, then there are risks of targeting, discrimination, ethical concerns about using the data," he explained. He also raised the question of how consent would work when the data related to a group of communities. He gave the example of Airtel, which had recently announced an AI-based spam prevention service that allegedly used non-personal data. However, innocent people could still be marked as spammers and have their names and numbers shared with other platforms that collect spam data from public datasets. Kulkarni also raised concerns about multiparty-owned data, like addresses or bank accounts, which could belong to multiple people. "How do we think about consent in that scenario is something that we need to still think about," he said.
Shweta Mohandas, a researcher at the Centre for Internet and Society, pointed out that non-personal data provided by the government was often of poor quality; these datasets often contain inaccuracies regarding birth or death dates. She stated that there were risks in using AI models trained on such datasets for public policy decision-making. She also explained that there was no clear definition of non-personal data: while previous drafts of the Digital Personal Data Protection Act contained examples of personal data, the current version leaves it up to interpretation.
"So I think in terms of hybrid data, I think that is the issue that comes as you never know when which, what data becomes personal, what becomes non-personal data" she stated. AI itself could be capable of deanonymising non-personal data, especially when it was part of a mixed dataset that also contained related data. Like anonymised patient records and hospital records together as part of a mixed dataset. "Maybe the power of AI is powerful enough to go through all these data sets and figure out patterns and actually identify individuals," she said. "Another example could be like your food delivery apps. If they decide, like, okay, you're ordering non-veg on a festival day, maybe they can, without your consent or, put you into one category or profile you based on that," she said. She also pointed out that data subjects are not necessarily aware of the third parties who have access to their data. It could be insurance companies, employers or the government. "You never know how these can be used to make decisions and you never know when these decisions have been made against you based on these data," she said. In the absence of legal frameworks, Mohandas proposed that some form of consent or choice should be offered when collecting data, allowing individuals to opt-out if they're uncomfortable. If full consent was impractical, data subjects should at least receive a notice when non-personal data was being collected. While consent might not be necessary for all types of non-personal data (like weather data), she argued that it was important for more sensitive information such as traffic or transportation data, which could lead to inferences about personal habits or behaviours. She gave the example of an insurance company knowing that a person was frequently travelling and considering that to be unsafe behaviour. Astha Kapoor, co-founder of the Aapti Institute, suggested that as consent at an individual level may not always be feasible, it was possible to pool data rights under a trusted intermediary who can make consent-based decisions on behalf of individuals, called a data steward. She explained data stewards as "an ability for us to pool assets in the way that people pool wealth and land and have them manage that land and have a fiduciary responsibility to the people who pool this asset." She gave the example of a driver's data cooperative based out of San Francisco and Sacramento called Driver's Seat, which had an app where workers could put in the data, which the app sold to municipalities through a rigorous discussion with the workers. "My view is, I think it needs to be much simpler, is that I think it needs to rely on existing humanity collectives," she said. She suggested the cooperative structure as a model for data stewards, which could not only provide greater insight to members through collected data but also hold government mechanisms accountable. However, Amol Kulkarni also pointed out the risks of replicating existing hierarchies or power structures present in traditional communities in cooperatives. He suggested that this could lead to discrimination or a monopolisation of decision-making by an already influential subset. Kapoor also wanted to reorient data governance from being based on the type of data (personal vs non-personal) to the uses and impact of the data. "What kinds of harms that are being created? Is it discriminatory? Is it exclusionary?" she said. 
She gave the example of Bangalore, where real estate developers were using publicly available data on flooding to manipulate housing prices. She also stated that such non-personal data markers could be used to ghettoise communities. "The binary between personal and non-personal is being created largely to create opportunities to monetize data," she claimed. This would mark a shift from data governance to data commodification.
Amol Kulkarni pointed out that, due to the level of investment required in developing AI models, Big Tech had monopolised a lot of the AI training manpower. He referenced proposals for mandatory data sharing and creating frameworks for metadata availability at low cost, with raw data available at fair, reasonable and non-discriminatory pricing. However, there were potential risks, such as diminished incentives for startups to collect data and create niche products. A pipeline that allowed data to travel from bigger players to smaller players could also flow the other way, he pointed out, which would cause the big players to become bigger. He suggested that sector-specific data exchange and sharing frameworks, supported by government incentives, might be a promising approach.
Kulkarni also discussed competition law. While many countries, including India, were mulling ex-ante competition laws, he cautioned that the effectiveness of these frameworks was still uncertain. He also recommended strengthening ex-post competition laws, such as recent developments in India like the new merger review regime for the digital economy, settlement norms, and third-party checks on remedy implementation, and suggested continuing experiments with data stewardship frameworks.
Shweta Mohandas pointed out that AI developers often preferred personal data over non-personal data due to its human-like quality. She gave the example of startups disregarding voice data in Indian languages created by IITs in favour of user-generated data, and questioned whether it was reasonable to expect people to want non-personal data. Further, even seemingly non-personal data, like X-ray images, may contain personal information that requires careful cleaning and annotation to ensure it is truly non-personal. "Sometimes what happens is doctors or the radiologists, they write the patient's name or they write something on the sheet, and when that has been used, it becomes it is personal data," she explained. She also argued that the use of non-personal data should depend on the context and intended use of the AI system. For example, using non-personal data for facial recognition systems would be inappropriate.
Shashidhar KJ, Managing Editor of MediaNama, raised the issue of AI bias stemming from limited or biased datasets, which leads to discrimination. Correcting these biases often requires more data, especially in social sector applications like subsidies. However, this creates tension with data privacy concerns, since more data collection can pose risks to privacy.
Astha Kapoor highlighted the importance of demonstrating the value of data to communities, especially women in agricultural cooperatives. While people may contribute more data if they see its value, she highlighted the challenge of communicating this value clearly. She also emphasised that data collection should not only be extractive but should serve a broader purpose, such as ensuring accountability and benefiting the community. Shashidhar also asked if there are models that offer fair compensation for communities that create specialized datasets.
Kapoor referred to data labelling work, which employed women. However, this data was often unrelated to their material context. She suggested that the value of data went beyond monetary compensation and emphasised that communities should have control over what they reveal and conceal, which encourages greater participation in dataset creation. "Many many instances particularly in Southeast Asia, women are starting to do which is that, you know, they're taking photos of deforestation, they're tracking, they're tracking like I said flooding, soil health and things like that. And are now using it to hold local governments accountable on climate action," she said. Amol Kulkarni drew parallels with the challenges faced by communities in land data governance, where intermediaries often step in to manage data collection but may not represent the communities' best interests. He warned against the risks of intermediaries capturing the benefits meant for communities.
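The re-identification risk that Kulkarni and Mohandas describe above is, at its core, a record-linkage problem: a dataset stripped of names can often be joined with a related dataset on shared quasi-identifiers such as location, birth year and gender. The sketch below is a toy illustration of that mechanism, not anything presented at the panel; the datasets, column names and the use of Python's pandas library are all assumptions made purely for the example.

```python
# Toy illustration (fabricated data): how an "anonymised" dataset can be
# re-identified by joining it with a related dataset on quasi-identifiers.
import pandas as pd

# An "anonymised" health dataset: names removed, quasi-identifiers retained.
health = pd.DataFrame({
    "postcode": ["560001", "560034", "110017"],
    "birth_year": [1984, 1991, 1978],
    "gender": ["F", "M", "F"],
    "diagnosis": ["diabetes", "asthma", "hypertension"],
})

# A separate, seemingly innocuous public dataset that still carries names
# alongside the same quasi-identifiers.
public = pd.DataFrame({
    "name": ["A. Rao", "B. Singh", "C. Mehta"],
    "postcode": ["560001", "560034", "110017"],
    "birth_year": [1984, 1991, 1978],
    "gender": ["F", "M", "F"],
})

# Joining on the shared quasi-identifiers links names back to diagnoses.
reidentified = public.merge(health, on=["postcode", "birth_year", "gender"])
print(reidentified[["name", "diagnosis"]])
```

Neither table is "personal" enough on its own, yet the join recovers names alongside diagnoses, which is precisely the mixed-dataset risk the speakers flagged.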
[3]
Is AI Sovereignty the New Battleground for Showcasing Technological Supremacy? #PrivacyNama
"We're increasingly seeing this focus on AI sovereignty manifesting and reflecting in increasing calls from countries to exert a greater amount of control over the development as well as the use of AI within national borders," said Nehaa Chaudhari, Session Chair for 'AI Sovereignty and Cross Border Data Flows,' at PrivacyNama 2024. She stated that the discourse around AI sovereignty had two manifestations. In one, nations pushed hard to develop infrastructure for AI (advanced computing capabilities, AI chip manufacturing) and in the other, they built localised and specialised training datasets. Policy-wise, it manifested itself as mandatory requirements on data sharing, opening up data sets, data localisation, and restrictions on cross-border data flows. AI sovereignty was also a site of geopolitical contestation, as countries aimed for technological supremacy while attempting to prevent their rivals from acquiring key technologies. According to Amlan Mohanty, an independent lawyer, one has to think of sovereign AI in terms of the whole AI stack, which involves the chips, the cloud, the models, algorithms, software, and even the applications. Bringing attention to anonymised data, which formed the bulk of AI training datasets, he asked, "Where are the safeguards around processing anonymised data? Where is the industry guidance on anonymisation standards? If you're thinking about it from a sovereign point of view, are governments recognising the inherent strategic value of anonymised non-personal data to serve their sovereign interests?" Agreeing that there was a difference between data sovereignty and AI sovereignty, Vivek Abraham, Senior Director at Salesforce, stated that the concept of data sovereignty was born out of the fear that bad actors or foreign citizens could harm a nation's citizens if it relaxed control over its data. "AI sovereignty is all about innovation, leading in innovation," he said. According to Abraham, nations were afraid that they could lag behind in the AI race. "I think at the policy level, at the central government, it's more about -- can we encourage more Indian AI players coming up?" Abraham said, adding that although India was yet to develop foundational LLMs like OpenAI in the US, it still had access to a lot of data that it could leverage to secure a leading position in the AI race. The government wasn't afraid of foreign AI players, he argued, but was instead pushing Indian companies to come up with their own tech. "What is foreign AI? What is local AI? And that really brings me down to this idea, which is, for me, sovereignty is not a binary. It's actually a relative concept. You're never going to be able to reach full sovereign AI or full sovereignty," said Mohanty. He pointed out that it was difficult to build an AI model based purely on indigenous manufacturing and programming. However, he stated that Sovereign AI could also be understood in a relative sense, where a nation does not have a major dependency on foreign actors. Vivek Abraham agreed. "Very few people are going to be making foundational models from scratch. It's very expensive. By the time you have a foundational model worth using, you're already behind in the rat race as of today," he said. According to Abraham, countries should focus on "getting the right contextual data, building the content-specific AI that you can use with the competitive edge that you have or with the data that you have." 
Additionally, Mohanty pointed out how AI sovereignty might not just be about a country's market and innovation, but also about control over sensitive applications of AI. "AI sovereignty is also about access and control and supervision. Let's make no bones about it, because if you're, for example, talking about sensitive applications of AI, like military applications of AI, there is a sense that you want to build some of these technologies locally, but it's really about saying, do we have a kill switch, right? Do we have sufficient supervision about how our data is being used for these very sensitive applications? So I do want to make clear that when countries are thinking about sovereign AI, they are thinking about building locally from the ground up the entire AI value chain, but they are also talking about control and supervision."
Rajkiran Panuganti, Senior Director at Ola Krutrim, argued for establishing control over the AI stack. "I'm not saying we shut off from the foreign influences, but we can't be completely dependent on any part of the stack, whether it's the chips, whether it's the cloud infrastructure, whether it's various kinds of models that go into a typical AI stack or the ultimate foundational models itself," he clarified. He argued that AI was a critical technology that could change the world and having dependencies on other nations or companies was risky.
Rostam Neuwirth, Head of the Department of Legal Studies at Macau University, argued that national AI initiatives and global collaboration are complementary, not contradictory. He stated that while local efforts enable meaningful contributions to global debates, addressing global challenges requires both collaborative solutions and domestic AI capabilities. "What is important is that we look at it not from the perspective of a mutual contradiction but from mutual complementarity. And I think even that when we look at the competition between China and the United States, for example, two very powerful and well-established players in that sense, they also face a paradox that they actually need each other's data or knowledge and know-how," he explained.
But then wouldn't this be contradictory to what sovereignty stands for? It won't be, Neuwirth argued. "I'm afraid we're making a mistake if we think that the issue of AI is a solely local issue...we [do] see now even EU pushing for this digital or AI sovereignty. At the same time, we have a push for a global digital compact by the United Nations in the context of the summit of the future. And we have the UNESCO recommendation on the ethics of AI, which clearly, in my view, reflects that there's a global consensus among all the UNESCO members that this is a global issue," he said.
"What will be the new battlegrounds around which questions of data governance are going to be contested in the context of AI sovereignty?" asked Mohanty. He argued that these future conflicts won't be over personal data but over the anonymised data that AI is trained on. "This is not the business model of targeted personalised targeted advertising. It's a completely different model. They care and they want, and they're only interested in anonymised data sets," he said. The way AI models would become more powerful, he said, was when they would have access to smaller but more personalised data sets. This was an area that ought to be addressed from the perspective of AI sovereignty, he suggested.
Panuganti presented another argument in support of AI sovereignty.
"So sovereignty should not be seen just as protecting yourself for your economic benefit, but also you want to enable foreign players also to represent you in a way that you want," he said. He argued that a lack of India-specific data and control over the AI stack could lead to negative stereotypes. "Who are we making the gatekeepers of the representation? And do we have control over that representation? And unless you provide your perspective, unless your perspective is weighed properly, then you're allowing external guys to be the gatekeepers of your cultural aspect," he said. Pursuing AI sovereignty also had a strategic advantage, he explained. India is currently one of the biggest generators of data, but it is owned by multinational or foreign companies. On top of that, it is also a consumer of AI models created in other countries. He likened this situation to the British Raj when India exported raw cotton but imported processed clothes. However, ensuring appropriate cultural representation is not as simple as tailoring the output of the model. While one can definitely improve the output through content moderation, "there needs to be something from the foundation and sovereign AI to really ensure that," said Panuganti. Mohanty also pointed out the role of localized datasets in making AI models powerful and culturally richer, an area where sovereignty can play an important role. "These models are going to become more powerful, more capable, is when they have really small, localized, specialized data sets that are going to be licensed. And I think that's going to be increasingly an area in which we're going to have to start thinking from the lens of sovereignty. Which countries have these data sets? Why do some countries have these data sets and not others? Again, this is going to be a question about cultural preservation. You want to train a model to be able to understand a language for which data doesn't exist. How are you going to be able to train that model, right? So I think there are all of these questions about the specific types of data on which new and more capable models are going to be trained," he said. Companies are cautious about sharing data, said Abraham, as it could train AI models outside their control. For example, e-commerce companies often limit the information in email notifications to protect their data. This is crucial for businesses to maintain competitive advantages and avoid potential data leaks. However, he stated that there were ways one could ensure that sensitive information was not used to train AI models. While these methods may increase costs, they allow businesses to benefit from AI capabilities without compromising data privacy. Rajkiran Panuganti stated that while data privacy was important in the context of AI, many startups were not in a position to ensure privacy. He wanted the government to aid them in this regard. He also suggested that companies building AI products need to be conscious of privacy every step of the way. "You need to be conscious of it every step of the way, that you build every product that you are enabling it. You need to have a consistent thinking from this point of view, and it's not an easy job, but it is possible to enable that," he said. Mohanty pointed out that it was also important to consider how AI could change the way international trade agreements function. 
Taking the example of the TRIPS (Trade-Related Aspects of Intellectual Property Rights) Agreement, an international legal framework under the World Trade Organization (WTO) which sets minimum standards for the protection and enforcement of intellectual property (IP) rights, he questioned whether such agreements needed to be revisited in light of AI. In a similar vein, he also wondered what collaboration would look like in an emergency like the COVID-19 pandemic and what role AI would play in such situations. "Let's say there's a new health emergency five years from now, I can imagine an important role [will be played by] AI models. And if we're thinking about joint collaboration between countries, what role will data and copyright have to play in that? And are our larger domestic governance frameworks and larger international multilateral treaties equipped to deal with those situations?"
Neuwirth asserted that with lines blurring between goods and services, the TRIPS agreement had stopped working and was becoming "obsolete" well before the rise of AI. Instead, he suggested that there was a need for consensus-building on basic issues at the global level. "We need a new thinking, and that needs to be translated into the relevant laws and frameworks, and institutional frameworks in particular. I think at the institutional level, we have to work because it is there where we can allow people to voice different opinions. And then what is extremely important to reach is, I think, a consensus on the terminology in particular, because I think when we talk about data, there can be so much like, there can be so many misunderstandings...the technology community [needs] to come in and have a very intense debate globally and interdisciplinarily, in my view," he said.
[4]
On Legitimate interest and data processing for AI #PrivacyNama
The Irish Data Protection Commission delayed the launch of Meta's AI model in the EU in June this year, after the company said that it would begin training its AI models using public content shared by adults on Facebook and Instagram. Meta had provided users with an opt-out form that they could fill out to prevent Meta from using their data for training AI. Meta argued that it had a 'legitimate interest' in processing first- and third-party data to build its AI services under the General Data Protection Regulation (GDPR). Yet the investigation is still ongoing, and Meta has since halted the release of some of its AI models in the EU.
Speakers at PrivacyNama 2024 discussed these challenges of processing data. In particular, they spoke about how legitimate interest could be an alternative to the often limited consent framework. They explored how asking users to consent to the use of their data could be challenging and put a burden on users, and how legitimate interest, when clearly defined by a regulator, could be an effective legal basis for processing data.
Legitimate interest is a legal basis for processing personal data. It allows data collectors to collect and process personal data without the explicit consent of the user under certain conditions. The ICO, the UK's data protection authority, explains that legitimate interest puts the onus on the processor of data, who must weigh "the necessity of processing the personal data against the interests, rights, and freedoms of the individual taking into account the particular circumstances." It is an alternative to other legal bases of data processing like 'consent', wherein a user is responsible for agreeing to collectors processing their data.
The EU and the UK are among the jurisdictions that recognise legitimate interest as a basis for personal data processing. The UK provides data processors with a three-part test to analyse whether they have the right to process data: purpose, necessity and balancing. The GDPR in the EU notes that legitimate interest is a legal basis for personal data processing when the data subject can "reasonably expect" that "processing for that purpose may take place." Legitimate interest is also valid in the EU when it is "strictly necessary" for preventing fraud and, in some cases, for marketing.
India's data protection law has similar provisions but is far narrower. As opposed to the wide-ranging legitimate interest, the Digital Personal Data Protection Act (DPDPA) has 'legitimate uses' - eight conditions under which an entity can process a user's personal data without explicit consent.
Beni Chugh, Head of Future of Finance at Dvara Research, noted that the "privacy self-management" or "consent" framework did not account for the lack of awareness among users. Speaking at 'Applying Data Protection Principles to AI', she said that the framework presumes a great level of awareness, agency and autonomy on the part of the data subject, and also presumes "that there is a very accessible, inclusive regulator at the other end who understands all of these issues and will act quickly, etc." She argued that the framework required users to be aware of all their data protection rights and to know when to revoke consent while using AI. This, she noted, was challenging, especially for customers who were low-income, new to digital, or first-generation users.
Linh Tong, Deputy Director, Center for Education Promotion and Empowerment of Women (CEPEW), also noted during 'AI and Privacy in the Global South' that data subjects lacked an understanding of personal data protection in the AI context. In her research interviewing locals in Vietnam, she found that most people had little to no knowledge of data rights or data subjects, and that most understood personal data protection in the context of cybersecurity rather than as protecting the rights of citizens.
Chugh argued that the consent framework puts the burden on a customer to understand the nuances and technical details of data protection. She provided an analogy: "A customer is not expected to undertake the safety quality check of an aircraft they're flying in or a car that they're driving. So why is it that in such a precarious, fragile environment where implications are of life and beyond, that so much is being put on the customer?" Further, the pervasiveness of AI makes obtaining consent far more complicated: users are often unaware of when they are even using AI, which makes it difficult for them to provide consent. She said, "As an industry that has a lot of prowess and resources at its disposal and is evolving very dynamically, it is time to start thinking about obligations seriously, as opposed to leaving it to a matter of people exercising their rights." She added, "It's not up to the rights to safeguard people. It is up to the obligations."
Idriss Kechida, Global Data Protection Officer, OpenAI, said, "Consent is a blunt instrument." He opined that legitimate interest was a much more "refined and mature legal basis" because developers have to be "accountable for it in front of regulators." He said at the "Data Protection Officers Roundtable" that "Legitimate interest was created to enable use cases that are ancillary to the main activity, but it also then comes with a lot of strings in the form of balancing tests, which, in my view, are very appropriate and good." He argued that while consent may work if you have a good relationship with your customer, legitimate interest requires higher accountability from developers. He said, "[Legitimate interest] requires implementing a lot of measures that eventually get you to a place where there is a balance between you processing the data and the purpose for which you're processing it."
Luca Belli, Professor, Fundação Getulio Vargas (FGV) Law School, said, "Legitimate interest is not carte blanche which means that you can do whatever you want." During the discussion on 'AI and Privacy in the Global South', he added, "You have to demonstrate that that legitimate interest is compatible with the original purpose for which the data was collected and does not hinder the fundamental rights of the data subject." He also said, "You have to make a very careful data protection impact assessment before starting your activity to demonstrate that you have assessed this and you have minimized all potential risks."
Mugjan Caba, Senior Compliance Officer at Truecaller, added at the "Data Protection Officers Roundtable" that data processing in AI takes place in various stages: training, deployment, and output generation. One must consider the legal basis for processing at each of these stages, she said. Referring to training AI models, she said, "If you can justify the business benefit and all the measures you take for individual's rights, then legitimate interest can also be .... a bit more relevant."
Derek Ho, Assistant General Counsel, Mastercard, said at the "Data Protection Officers Roundtable" that developers could apply legitimate interest to certain use cases like fraud prevention. "If you're thinking about a fraud prevention use case or information security use case, you really want to make sure that the criminals aren't able to also opt out of having their data being found," he said. He said that training models on data could be key to preventing malicious action, and that legitimate interest ensures criminals cannot conceal themselves from authorities. He noted that the GDPR also cites fraud prevention as a legitimate interest.
The respective data protection authorities in the UK and Brazil have paused Meta's plans to train its AI models on publicly available data on its platforms because the company failed to prove legitimate interest at the time of deployment.
Belli said regulators should work with developers, especially smaller startups, to provide them with clarity on the lawful way to process data. This is because, he said, "If you are not a data protection expert, which most startups are not, and if you don't have specific guidance on how to translate the principle basis, the normative basis into concrete actions, it is simply impossible to build privacy within the technical and administrative structures of their service."
The Global South faces unique challenges in balancing AI innovation with data protection, as discussed at PrivacyNama 2024. Issues include regulatory gaps, enforcement difficulties, and the complexities of using non-personal data in AI development.
The Global South is facing significant challenges in regulating artificial intelligence (AI) and protecting data privacy, as revealed at PrivacyNama 2024. Nicolas Grossman, project director of the Global Index on Responsible AI, highlighted that only 49% of nations in the Global South had legally enforceable frameworks for data protection concerning AI, compared to 83% in the Global North [1].
Professor Luca Belli of Fundação Getulio Vargas (FGV) Law School noted that while many Global South countries have general data protection laws, enforcement remains a critical issue. He emphasized that regulators in these countries are often under-resourced and may be reluctant to confront large tech companies [1].
Linh Tong, Deputy Director of CEPEW, pointed out that some countries, like Vietnam, are prioritizing economic development through AI over stringent regulation. This focus on becoming part of the global AI value chain has led to encouraging data use and sharing, potentially at the expense of personal data protection [1].
The discussion at PrivacyNama 2024 also highlighted the overlooked risks associated with using non-personal data in AI training. Anwesha Sen, Session Chair for the Data Trusts, Non-Personal Data and AI panel, emphasized that the lack of regulations for data scraping poses significant risks, including potential copyright violations and re-identification of individuals [2].
Amol Kulkarni, Research Director at CUTS International, pointed out the ambiguity in defining personal and non-personal data, especially when large datasets are merged for AI training. This confusion raises concerns about deanonymization and potential misuse of data [2].
The concept of AI sovereignty emerged as a key topic, with nations seeking to exert greater control over AI development within their borders. Nehaa Chaudhari, Session Chair for 'AI Sovereignty and Cross Border Data Flows,' noted that this manifests in efforts to develop infrastructure for AI and build localized training datasets [3].
Rostam Neuwirth, Head of the Department of Legal Studies at Macau University, argued that national AI initiatives and global collaboration are complementary rather than contradictory. He emphasized the need for both local efforts and international cooperation to address global challenges effectively [3].
The challenges of obtaining user consent for data processing in AI applications were discussed, with some experts advocating for the use of 'legitimate interest' as an alternative legal basis. Beni Chugh, Head of Future of Finance at Dvara Research, argued that the consent framework puts an undue burden on users, especially those who are new to digital technologies or from low-income backgrounds [4].
Idriss Kechida, Global Data Protection Officer at OpenAI, described consent as a "blunt instrument" and suggested that legitimate interest could be a more refined and mature legal basis for data processing in AI development [4].
As the Global South continues to navigate the complex landscape of AI regulation and data protection, balancing innovation with privacy concerns remains a significant challenge. The discussions at PrivacyNama 2024 highlight the need for tailored approaches that consider the unique contexts and priorities of different regions while striving for global standards in AI governance.