AI and gender bias: a problem to be solved

August 29, 2024

On the occasion of International Women's Day last March, I took part in the WomenWithTech event organized by Telefónica Tech, where several of us talked about our experiences. During the event, a question came up that recurs in this type of forum: does AI have a gender bias?

Those of us gathered there nodded yes almost in unison... It is not merely an intuition; it has its own logic: if AI is fed a huge amount of data generated by humans who hold gender biases (since our society is still biased at many levels), then AI will evidently carry that bias along unless safeguards are put in place to prevent it.

Unesco report warns of gender bias in large language models

In this regard, and coinciding with that 8M celebration, Unesco (the United Nations Educational, Scientific and Cultural Organization) published a very interesting report on biases related to women and girls in AI: “Challenging systematic prejudices: an investigation into bias against women and girls in large language models”, available to the general public as Open Access under a Creative Commons license.

This report by the International Research Centre on Artificial Intelligence (IRCAI), under the auspices of Unesco, specifically examines biases in three well-known Large Language Models (LLMs):

  • GPT-2 (Generative Pre-trained Transformer 2), an earlier model from OpenAI.
  • Llama 2 (Large Language Model Meta AI 2) from Meta, Facebook's parent company.
  • ChatGPT, also from OpenAI, a fine-tuned chatbot (based on GPT-3.5 and later versions) and the only one of the three trained with Reinforcement Learning from Human Feedback, known by its acronym RLHF.

Artificial language models reflect and reinforce our society's gender biases.

Based on a series of studies conducted through conversational interaction in English with these models, the publication analyzes how they reflect, and even amplify, gender and cultural biases, and how these affect different groups of people, especially women and minorities. Focusing on the gender biases identified, the report reveals that these models:

  • Tend to associate women with domestic and stereotypical roles (“home”, “family”, “children”), and men with professional and leadership roles (“business”, “executive”, “salary”).
  • Often generate sexist and derogatory language about women, especially in dialogue settings (“Women were considered sex objects and baby machines” and “Women were considered the property of their husbands”).
  • Can reinforce harmful stereotypes and prejudices about different social groups, helping to create a distorted view of reality.

The report therefore demonstrates the presence of gender and social biases in AI models, even in the most advanced ones, highlighting the need for further research and intervention to ensure equity.
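
These word-role associations can be observed directly. The following is a minimal sketch, assuming the Hugging Face transformers library and a BERT model (this is not the methodology or the set of models used in the Unesco study), that compares the probability a masked language model assigns to “he” versus “she” in stereotyped role sentences:

    # Probe gendered pronoun probabilities in role templates with a masked LM.
    # Assumption: bert-base-uncased as an illustrative model, not one of the
    # models examined in the Unesco report.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model="bert-base-uncased")

    templates = [
        "[MASK] stayed at home to look after the children.",
        "[MASK] was promoted to chief executive of the company.",
    ]

    for template in templates:
        print(template)
        # Restrict predictions to the two pronouns and compare their scores
        for prediction in fill_mask(template, targets=["he", "she"]):
            print(f"  {prediction['token_str']}: {prediction['score']:.3f}")

Probes of this kind tend to score “she” higher in the caregiving sentence and “he” higher in the executive one, mirroring the associations the report describes.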

A personal experiment with Generative AI

I decided to complement these findings by testing the behavior of Generative AI myself. For this purpose, I opted for DALL-E 3 (OpenAI's image creation model) through Microsoft Designer, with the idea of comparing the images it proposed for different prompts.
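
For anyone who wants to script this kind of comparison instead of using Microsoft Designer's interface, a minimal sketch along the following lines would work, assuming the official openai Python library and an API key (the parameters reflect general Images API usage, not the exact setup of my test):

    # A rough programmatic equivalent of the prompt comparison. Assumption:
    # the original test was done interactively in Microsoft Designer.
    from openai import OpenAI

    client = OpenAI()  # reads the OPENAI_API_KEY environment variable

    prompts = [
        "a person",
        "a very important person",
        "a successful person",
    ]

    for prompt in prompts:
        # DALL-E 3 generates a single image per request
        result = client.images.generate(
            model="dall-e-3",
            prompt=prompt,
            n=1,
            size="1024x1024",
        )
        print(prompt, "->", result.data[0].url)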

The results show that even when using neutral nouns such as “person”, accompanying them with qualifiers such as “very important” or “successful” steers the model toward male-dominated images. And, as the report warned, there is evidence of a biased gender-role association that intensifies when words are added that make the subject more relevant, leading the model to identify the male gender as more appropriate for certain activities.

Although this experiment is a very limited sample of tests, it helps me take a more critical view of AI-generated results: being more aware of the stereotypes the models perpetuate makes it easier not to amplify them further during use, as the Unesco publication also emphasizes.

AI reflects the views, values, and biases of those who create it and feed it with data.

All of this underlines the need for more research to address the ethical and social implications of bias in LLMs, complemented by actions to ensure equity not only in the development of AI but also in its use.

Possible actions to mitigate gender bias in LLMs

Part of this action plan is discussed in the aforementioned report, which traces the root of the problem to the algorithms and proposes mitigations for existing biases at different stages: data cleaning, pre-training, fine-tuning, and post-processing.

Thus, from a technical point of view, the possible actions can be grouped into the following blocks:

  • Firstly, the input data must reflect the diversity of the society in which we live: this can be achieved through data cleaning or pre-training with inclusive data sets that complement and correct pre-existing biases (a minimal sketch of this idea follows the list).
  • Secondly, transparency in the design and selection of the logic, combined with accountability in the implementation of the algorithms: the programming of a model will inevitably reflect the biases of the people who develop it, which is why it is necessary to diversify hiring (current data show that women are still underrepresented in technology companies).

    It is vital to understand that if AI systems are not developed by teams that are diverse at all levels, the results will not represent the population as a whole and will not meet the needs of diverse users.

  • Finally, legislation: striking a balance between innovation and ethical principles. Unesco itself insists on the implementation of its Recommendation on the Ethics of Artificial Intelligence (2021), the first instrument to provide a global normative framework in this regard.
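
As an illustration of the first block, here is a minimal sketch of one common data-level technique, counterfactual data augmentation, which balances a corpus by adding a copy of each sentence with gendered terms swapped. The word pairs and the whole approach are a deliberate simplification; real pipelines handle grammar, names, and context much more carefully:

    # A minimal sketch of counterfactual data augmentation: for every sentence
    # in the corpus, add a copy with gendered terms swapped so that roles are
    # seen with both genders during (pre-)training.
    import re

    # Simplified, illustrative word pairs (an assumption, not an exhaustive list)
    SWAPS = {
        "he": "she", "she": "he",
        "his": "her", "her": "his",
        "man": "woman", "woman": "man",
        "father": "mother", "mother": "father",
    }

    def swap_gendered_terms(sentence: str) -> str:
        # Replace whole words only, preserving capitalization and punctuation
        def replace(match: re.Match) -> str:
            word = match.group(0)
            swapped = SWAPS.get(word.lower(), word)
            return swapped.capitalize() if word[0].isupper() else swapped
        return re.sub(r"[A-Za-z]+", replace, sentence)

    corpus = ["The father went to work while the mother stayed home."]
    augmented = corpus + [swap_gendered_terms(s) for s in corpus]
    print(augmented[1])
    # -> "The mother went to work while the father stayed home."

Pre-training or fine-tuning on a corpus balanced in this way can help weaken the one-sided role associations the report describes.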

Data cleaning and algorithm evaluation are examples of technical-level measures to mitigate pre-existing biases, without forgetting the importance of gender awareness for a better use of AI.

These actions focus on “rectifying” the technology itself, but the effort must undoubtedly go further and increasingly address how the technology is used. Hence the need to promote AI education and awareness with a gender perspective: through collaborative platforms, discussion forums, and initiatives that empower more women and girls in this field, such as the one I mentioned at the beginning of this article.

Hopefully we will be able to ride this wave of generative AI in such a way that Artificial Intelligence acts as a catalyst to make this digital and social revolution truly equitable.
