European policies and legal instruments are increasingly drawn to a technocentric approach to Artificial Intelligence (AI) fairness – that is, they focus primarily on the technical aspects of AI, such as ensuring that datasets are balanced and free from errors. With this blog post, I argue that it is not sufficient to focus solely on the technical aspects of fairness without also considering the social, political, and economic systems that shape AI’s development and deployment. Instead, a broader socio-technical approach should be adopted. This post is based on my recent publication “Predictive justice in light of the new AI Act proposal”.
In recent years, there has been an increasing trend for police forces and judicial authorities to employ predictive profiling technologies in criminal justice, posing major risks to the fundamental rights of citizens. Although there are currently no courts in which decisions are made solely by an algorithm, in many countries decisional forecasting is used in judicial proceedings and in other law enforcement contexts, such as assessing the risk of domestic violence (e.g., the Spanish Viogen system). That is, AI systems are employed to help judges, lawyers, and law enforcement officers make decisions about citizens.
The new legal framework on Predictive Justice
The regulatory framework is fragmented across the EU, as both the GDPR and the Law Enforcement Directive (LED) leave substantial space to each Member State to regulate the subject under national law. Recently, the AI Act has also been adopted to regulate AI systems. It introduces a risk-based approach, dividing AI systems into different categories: low risk, medium risk, high risk, and unacceptable risk. These categories are based on the potential negative effects of AI systems on citizens’ rights and freedoms. While exploitative systems are banned, and low/medium risk systems only have to comply with minimal transparency measures, high-risk systems are heavily regulated: they must comply with several requirements and obligations aimed at decreasing the risks involved in their use. Examples of such measures are the obligation for providers of high-risk AI tools to establish a robust quality management system and data governance procedures, covering training, validation, and testing data sets, in order to detect and correct biases.
As noted by Schumann Barragan, one of the characteristics of the use of AI systems in the judicial field is the intense power exercised by the State over its citizens. In the context of criminal justice, it is particularly important to take into consideration that citizens’ fundamental rights are at stake, such as freedom, the right to due process, and the presumption of innocence. For this reason, the new AI Act classifies AI systems employed in the judicial field as “high-risk” and prescribes a number of safeguards to prevent their misuse, such as banning real-time facial recognition in public spaces for law enforcement purposes. According to the new rules introduced by the AI Act, predictive justice algorithms are classified as high-risk systems not only when the assessment concerns a criminal offender, but also when it is aimed at evaluating the risk for potential victims.
Classifying predictive justice tools as high risk means that AI providers (including law enforcement officers when they build their own internal AI tools) must comply with all data governance provisions, including bias mitigation measures.
This classification, however, creates some friction with data protection law: Recital 63 states that no new GDPR legal grounds for the processing of personal data will be introduced, yet at the same time the AI Act provides an additional legal basis authorizing the processing of special categories of personal data for bias monitoring purposes. In other words, although the AI Act claims that GDPR rules are untouched by the new provisions, in reality this is far from being the case, as its provisions create a legal possibility of reusing sensitive data in AI training. This means that the processing of sensitive data will be facilitated for companies providing law enforcement with predictive policing AI systems.
The reuse of sensitive data in Predictive Justice
In the context of predictive justice, the reuse of sensitive data allowed by the AI Act may entail new risks for data subjects, particularly vulnerable groups such as minorities and disabled people, whose data will be processed without the guarantees of the GDPR, since the LED framework provides for a substantial restriction of data subjects’ rights. In particular, recent cases have shown how difficult it is to obtain the deletion of personal data archived for law enforcement purposes even when the data subject has been fully rehabilitated (see Italian Court of Cassation, judgment no. 35548 of 11 December 2020).
In the data governance section of the AI Act, it is explicitly provided that the processing of special categories of data is permitted for the purpose of bias monitoring. Bias monitoring of data sets means carrying out an assessment of potential discrimination in the data, such as a gender imbalance, overrepresentation of a certain population, or disregarding a specific characteristic peculiar to certain minorities. To determine whether a predictive model can effectively function across diverse populations, including non-cisgender individuals, neurodivergent people, and racial minorities, it is essential to recognize and process specific characteristics of these groups, such as being transgender, autistic, or Jewish. This involves handling sensitive personal data as outlined in Article 9 GDPR. Without this information, it becomes impossible to create a representative dataset as required by the AI Act. Consequently, law enforcement agencies may feel compelled to gather extensive sensitive data to adhere to the new rules, with the aim of ensuring fairness and accuracy in predictive justice systems. However, this practice raises significant ethical concerns about privacy and the potential misuse of such data.
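To make concrete what such a bias-monitoring assessment could look like in practice, here is a minimal, purely illustrative sketch in Python (using pandas). The column name, group labels, reference shares, and tolerance threshold are hypothetical assumptions of mine, not anything prescribed by the AI Act or the GDPR. The sketch simply compares the share of each group in a training dataset against a reference share for the relevant population and flags under-represented groups – exactly the kind of check that requires the sensitive attribute to be recorded for every individual in the first place.

```python
import pandas as pd


def representation_report(df: pd.DataFrame,
                          attribute: str,
                          reference_shares: dict,
                          tolerance: float = 0.10) -> pd.DataFrame:
    """Compare group shares in the dataset with reference population shares.

    A group is flagged as under-represented when its observed share falls
    below the reference share by more than `tolerance` (relative).
    Purely illustrative: thresholds and reference figures are assumptions.
    """
    observed = df[attribute].value_counts(normalize=True)
    rows = []
    for group, ref_share in reference_shares.items():
        obs_share = float(observed.get(group, 0.0))
        rows.append({
            "group": group,
            "observed_share": round(obs_share, 3),
            "reference_share": ref_share,
            "under_represented": obs_share < ref_share * (1 - tolerance),
        })
    return pd.DataFrame(rows)


# Hypothetical example: a risk-assessment training set with a 'gender' column.
data = pd.DataFrame({"gender": ["male"] * 70 + ["female"] * 25 + ["non-binary"] * 5})
print(representation_report(
    data, "gender",
    reference_shares={"male": 0.49, "female": 0.49, "non-binary": 0.02}))
```

Even this trivial check only works because the sensitive attribute is present for every record – which is precisely the trade-off between representativeness and data minimization discussed in the next section.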
Increased risks for citizens’ rights
The literature in critical AI studies has criticized this technocentric approach, as explained by Balayn and Gurses (2021) in the EDRi report “Beyond debiasing: Regulating AI and its inequalities”. According to the authors, policies focusing on data debiasing often take a limited view of AI fairness, concentrating only on the data without addressing the larger societal context in which AI systems are deployed. The report suggests that even efforts to conform to the AI Act by ensuring that datasets are “representative, error-free, and complete” may still result in AI systems that reflect an inequitable world. In line with this critique, Hanna et al. (2020) highlight the importance of reframing discussions about fairness in AI away from the algorithmic level and toward the social and institutional contexts in which these systems are implemented. Both the EDRi report and the wider literature recognize the importance of data quality and of statistical considerations pertaining to data, since these significantly influence the quality of ML model training.
Although including bias monitoring in the legislation is a noble aim, the AI Act provisions regarding sensitive data are not sufficiently balanced against citizens’ rights and freedoms, since such data (e.g., health information) can lead to stigmatization, especially considering that appropriate safeguards are not provided at the EU level. Collecting special categories of data is considered by several scholars to be harmful to minorities, as explained in the EDRi report: “the machine learning models trained on this data could make inferences that disadvantage them. Paradoxically, auditing, while aiming at monitoring the fairness of a model’s outcomes for unprivileged, often minority populations, raises further harms for them, since collecting more data leads to over-policing minorities and mass-surveillance”.
In addition, with regard to biometric data, the recent Clearview AI case has shown how law enforcement authorities have misused AI systems notwithstanding the limitations provided by law. In this context, AI systems employed for predictive justice purposes could be misused even more severely if large amounts of sensitive data are collected, stored, and used by public authorities.
The need for a socio-technical approach towards AI fairness
Even from an ethical point of view, it cannot be considered acceptable to employ such data to train predictive algorithms without data subjects’ consent, while their rights are heavily limited and they receive nothing more than a notice (as prescribed by the LED). The current EU framework discriminates between people who were unlucky enough to have their data collected by the police, whether during random checks or because they were involved as victims or witnesses in criminal cases, even if completely innocent and without fault, and people whose data were never collected. Both the building of predictive algorithms and their actual implementation have been a source of discrimination and of violations of citizens’ rights and freedoms; the provisions of the AI Act therefore only increase the risk of such problems, without really providing safeguards and mitigation measures.
This tension between the goals of dataset fairness and the realities of an unjust world points to a need for broader interdisciplinary approaches to fairness in AI. It is not enough to focus solely on the technical aspects of fairness – such as ensuring that datasets are balanced and free from errors – without also considering the social, political, and economic systems that shape AI’s development and deployment. Addressing fairness in AI requires engaging with these larger systems of power and inequality, rather than relying solely on technical solutions to solve problems rooted in social inequalities.