Daniel Murcia

Senior Legal Advisor at Govertis, part of Telefónica Tech

Cyber Security
AI & Data
Privacy and AI risk management to protect personal data
It is undeniable that there is a close relationship between AI models and the protection of personal data, and it is to be expected that this relationship will be explored in ever greater depth as the use of AI becomes more widespread. In this regard, the Hamburg Commissioner for Data Protection and Freedom of Information recently published a discussion paper analyzing and taking a position on the applicability of the General Data Protection Regulation (GDPR) to large language models (LLMs). The Authority argues that these models do not store data in its original form, since the text used in their training is transformed into tokens, or text fragments, from which individuals are not directly identifiable.

Discussion on data protection in language models

Tokenization is the process by which text is converted into sequences of smaller fragments that do not contain the original information in its entirety, making it virtually impossible to identify a person directly from those fragments (a minimal code sketch at the end of this article illustrates the idea). Once training is finished, the model retains only mathematical patterns, represented by the weights of its neural connections, without storing the original text. On this basis, the paper concludes that the mere storage of information by an LLM does not constitute processing of personal data within the meaning of the GDPR.

This approach, however, has been criticized by experts such as David Rosenthal, who point out that, although LLMs do not explicitly store texts, they are capable of generating output that matches personal data when that data has appeared repeatedly in the training set, which raises risks associated with the use of AI even in the absence of direct storage.

Risk identification

In terms of regulatory compliance, one point the European privacy regulation (GDPR) and the recent AI regulation have in common is their risk-management-oriented approach, which will require organizations to adapt that management to the new realities that come hand in hand with the adoption of emerging technologies such as Artificial Intelligence.

A useful reference framework when identifying risks in this context is the AI Risk Repository developed by MIT (Massachusetts Institute of Technology): a comprehensive and living database of more than 700 AI risks, categorized by cause and risk domain and built from 43 AI-related risk frameworks. These risks include:

Compromise of privacy by leaking sensitive information: as mentioned above, although LLMs do not store full texts, they can generate output that matches personal data.
False or misleading information: LLMs can generate incorrect or misinterpreted content about individuals, affecting their reputation and the integrity of the personal data processed.
Disinformation and manipulation: LLMs can be used in malicious campaigns to manipulate personal information or influence people's behavior, which, even if unintentional, can result in improper data processing.
Lack of transparency and interpretability: LLMs function as "black boxes", making it difficult to analyze how personal data is processed and complicating auditing for GDPR compliance.
Security vulnerabilities: LLMs can be vulnerable to attacks that exploit weaknesses in their infrastructure, which could result in unauthorized access to and exposure of personal data.

Compliance and risk reduction

Based on these general risks taken from a reference framework, doubts may arise about compliance with the principles relating to processing set out in Article 5 of the GDPR. Take the accuracy principle: if LLMs can generate incorrect information based on the data they were trained on, inaccurate personal data may be spread, and there is no easy mechanism to correct errors in the model's output, which runs counter to that principle. Another question is how the minimization principle fits in when these models are usually trained on huge amounts of data (often from unstructured datasets that are difficult to control), and whether it is possible to determine that the personal information used in training was limited to what was necessary to achieve the model's objectives.

This reinforces the idea that AI risk management systems will need to be aligned with those already being developed for GDPR compliance. It will therefore be crucial to keep in mind the principle of privacy by design and by default (PbD) and to implement measures ensuring that only the personal data necessary for each specific purpose is processed. Integrating the principles of data minimization, access limitation and adequate retention from the very beginning of the design of systems and processes makes it possible to identify and mitigate risks preventively.

✅ At Govertis, part of Telefónica Tech, we recommend considering the practical application of PbD (privacy by design and by default) when adopting technology that includes AI.

Image: DC Studio / Freepik.
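To make the tokenization argument above a little more concrete, here is a minimal, self-contained Python sketch. It uses a toy, hand-built vocabulary and greedy longest-match splitting rather than the subword (BPE-style) tokenizers real LLMs rely on, and the sentence, names and IDs are purely illustrative. The point is only that the text an LLM is trained on is reduced to sequences of fragment IDs, and that what the model ultimately retains are numerical parameters derived from such sequences, not the sentences themselves.

# Toy illustration of tokenization: text is mapped to integer IDs drawn
# from a fixed vocabulary of fragments. Real LLM tokenizers use subword
# algorithms such as BPE with vocabularies of tens of thousands of pieces;
# this hand-built example only shows the principle.

vocab = {"mar": 2, "ia": 3, "liv": 4, "es": 5, "in": 6,
         "ham": 7, "burg": 8, " ": 9, ".": 10}

def tokenize(text: str) -> list[int]:
    """Greedily split text into the longest known fragments and return their IDs."""
    ids, i = [], 0
    text = text.lower()
    while i < len(text):
        for length in range(len(text) - i, 0, -1):   # longest match first
            piece = text[i:i + length]
            if piece in vocab:
                ids.append(vocab[piece])
                i += length
                break
        else:
            i += 1  # skip characters not covered by the toy vocabulary
    return ids

print(tokenize("Maria lives in Hamburg."))
# -> [2, 3, 9, 4, 5, 9, 6, 9, 7, 8, 10]  (fragment IDs, not the sentence itself)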
October 30, 2024
Cyber Security
AI & Data
High-Risk AI systems in the Artificial Intelligence Regulation (AIR)
In this new article of the series we are publishing, we focus on high-risk AI systems in the context of the European Artificial Intelligence Regulation (AIR). As we noted in previous posts, this Regulation establishes a specific regulatory framework for those AI systems that represent a high risk to areas such as health, safety and the fundamental rights of individuals, with the aim of safeguarding the values and principles of the Union. The aim of this post is to describe which systems the new regulation considers high risk and the main implications of that classification.

The Artificial Intelligence Regulation (AIR) safeguards EU citizens' rights while guiding entities towards the responsible and safe use of Artificial Intelligence.

Conditions for AI High Risk Classification

According to Article 6 of the AIR, the classification of an artificial intelligence system as high risk rests on conditions that boil down to two possibilities (a schematic sketch at the end of this article summarizes this decision logic):

It is a safety component or product: the system is intended to be used as a safety component of products covered by the EU legislation specified in Annex I, or the AI system is itself such a product. This covers a wide range of products, from toys to medical devices, all of which are subject to strict safety regulations. In addition, the AI system must undergo a conformity assessment carried out by an independent body before being placed on the market or put into service, in accordance with the Union harmonization legislative acts listed in Annex I (Art. 6.1).
It is one of the high-risk AI systems referred to in Annex III (Art. 6.2).

On the other hand, even where an AI system falls within one of the areas of Annex III, it may not be considered high risk if its influence on decision-making is not substantial and one or more of the following conditions are met: the system performs a limited procedural task, improves the result of a previous human activity, detects decision-making patterns or deviations without replacing human assessment, or performs preparatory tasks (Art. 6.3).

Delimiting High-Risk domains

AI systems classified as high risk are distinguished not only by their advanced technical capabilities, but also by the profound implications they have for society and individual lives:

Autonomy and complexity: these systems can make decisions and execute actions independently, which is especially critical in sectors such as medicine and justice, where mistakes can have severe consequences.
Social and personal impact: they can significantly affect fundamental rights such as privacy, equality and non-discrimination. They also have the potential to perpetuate or introduce bias, disproportionately impacting vulnerable groups.
Critical infrastructure security: their application in essential sectors such as energy and transportation means that failures or manipulations can cause serious and extensive damage.

These systems must therefore operate within a strict set of ethical, legal and security standards designed to protect society and foster an environment of trust in technological development. The Regulation provides that the systems listed in Annex III, covering the following areas, are considered high risk:

Biometrics
Remote biometric identification: systems that identify individuals remotely, which can be used in multiple contexts and pose privacy risks.
Biometric categorization: systems that classify individuals according to sensitive attributes, which can lead to discrimination or unfair treatment.
Emotion recognition: involves analyzing facial expressions or voice modulations to infer emotional states, which can be misinterpreted or misused.

Critical infrastructures
Use in security systems for the management of critical infrastructures such as energy, transportation and basic supplies, where a failure or manipulation can have serious consequences.

Education and occupational training
Admission and distribution in educational institutions: decisions that affect the academic and professional future of individuals.
Assessment of learning outcomes: can significantly influence educational and career paths.
Behavioral monitoring during exams: surveillance that may invade student privacy.

Employment, employee management and access to self-employment
Systems used in the hiring and evaluation of employees, with a direct impact on job opportunities and working conditions.

Access to essential private and public services
Assessment of eligibility for essential public services, which is crucial as it affects access to basic needs such as health and welfare.
Solvency and risk assessment for insurance, which can determine access to financial services and insurance.

Law enforcement
Tools used by authorities to assess risks of criminality or behavior, which directly affect civil rights and can lead to surveillance or discrimination.

Migration, asylum and border control management
Systems used to assess risks in border and asylum management, significantly affecting individuals in vulnerable situations.

Administration of justice and democratic processes
Support in the interpretation of facts and laws, potentially influencing judicial decisions.
Influence on elections or referendums, which could alter the integrity of democratic processes.

Anticipation and flexibility: keys to coping with the rapid evolution of AI

Article 7 of the AIR illustrates a proactive and adaptive evolution of the law in the face of rapid technological developments in the field of AI, an area where legislative development has traditionally struggled to keep pace with technological advances. This provision establishes a dynamic mechanism for reviewing and updating the aforementioned Annex III, granting the European Commission the power to modify its content by adding systems that meet the pre-established risk and scope criteria, as well as removing those that no longer represent a considerable risk. The addition of new systems to this list will consider whether they are intended for use in areas already covered by Annex III and whether they present a risk of harm to health and safety, or a negative impact on fundamental rights, that is equivalent to or greater than that of the systems already recognized as high risk.

Specific Requirements for High-Risk AI Systems

In general, high-risk systems must comply with the specific requirements of the Regulation, taking into account their intended purposes and the current state of the art in AI technology. Articles 8 to 49 develop the main obligations these systems must meet:

Risk management: they must establish and maintain a risk management system that addresses known and foreseeable risks to health, safety or fundamental rights;
this system should be an ongoing process, reviewed periodically.
Data and data governance: systems that use model training techniques with data must be developed from data sets that meet high quality standards and are managed appropriately to avoid bias and error.
Technical documentation: technical documentation demonstrating compliance with the regulatory requirements must be drawn up, kept up to date and made accessible to the competent authorities.
Record keeping: systems must allow the technical recording of significant events to ensure traceability and risk management.
Transparency and communication: they must be designed to operate with an adequate level of transparency and be accompanied by clear and complete instructions for safe and effective use.
Human oversight: they must include measures allowing them to be effectively overseen by humans in order to minimize the risks associated with their use.
Accuracy, robustness and cyber security: they must be designed to achieve adequate levels of accuracy and robustness and to be resistant to external tampering.
Provider obligations: providers of these systems must ensure that they comply with all requirements and regulations, including establishing a quality management system and following the conformity assessment procedures, before placing them on the market.
Cooperation with authorities: providers must cooperate with the competent authorities to ensure compliance and manage any incidents related to high-risk AI systems.

Providers have a very important role to play regarding high-risk systems, but so do deployers, as they will have to take appropriate technical and organizational measures to ensure that they use such systems in accordance with the instructions for use; they must entrust human oversight to natural persons with the necessary competence, training and authority; and, where they have control over the input data, they must ensure that it is relevant and sufficiently representative for the intended purpose of the high-risk AI system. Deployers will also have a key role in monitoring the operation of the high-risk AI system. In fact, one of their main obligations, introduced in the Parliament's version of the AIR, is to carry out fundamental rights impact assessments, an issue we will discuss in a forthcoming article in this series.

Conclusion

The European Artificial Intelligence Regulation is a key piece in the framework of accountability and security in the digital age, providing a systematic approach to mitigating the potential impacts of high-risk AI systems and establishing obligations for everyone in the value chain. It not only protects the interests of citizens, but also guides organizations towards a safer and more ethical future with its proactive and adaptive vision. ◾

CONTINUING THIS SERIES
An introduction to the Artificial Intelligence Regulation (AIR), April 8, 2024
The application scope of the Artificial Intelligence Regulation (AIR), April 16, 2024
AI practices forbidden in the Artificial Intelligence Regulation (AIR), May 2, 2024

Image by Freepik.
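As a recap of the classification conditions discussed above, the following Python sketch summarizes the Article 6 decision logic in a schematic way. It is an illustration, not legal advice: each boolean flag stands in for a legal assessment (Annex I safety component, Annex III area, materiality of the influence on decisions, Art. 6.3 exemption conditions) that in practice requires case-by-case analysis, and all names used here are our own.

# Schematic sketch of the Article 6 classification logic described above.
# Each flag stands in for a legal assessment that requires case-by-case
# analysis in practice; this is an illustration only, not legal advice.

from dataclasses import dataclass

@dataclass
class AISystem:
    is_annex_i_safety_component: bool   # safety component of (or itself) a product under Annex I legislation
    requires_third_party_conformity_assessment: bool
    falls_under_annex_iii: bool         # listed area: biometrics, education, employment, etc.
    materially_influences_decisions: bool
    meets_art_6_3_exemption: bool       # e.g. limited procedural task, preparatory task

def is_high_risk(s: AISystem) -> bool:
    # Art. 6.1: safety component/product under Annex I plus third-party conformity assessment
    if s.is_annex_i_safety_component and s.requires_third_party_conformity_assessment:
        return True
    # Art. 6.2 and 6.3: Annex III areas, unless the influence on decision-making
    # is not substantial and one of the exemption conditions applies
    if s.falls_under_annex_iii:
        exempt = (not s.materially_influences_decisions) and s.meets_art_6_3_exemption
        return not exempt
    return False

# Example: an evaluation tool used in hiring (an Annex III area) that materially
# influences decisions and meets no exemption condition is classified as high risk.
candidate_screening = AISystem(False, False, True, True, False)
print(is_high_risk(candidate_screening))  # True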
May 9, 2024