Machine unlearning: AI models that learn to forget

August 31, 2023

Unlearning in artificial intelligence (AI), or 'machine unlearning,' is the process of updating or modifying what an AI model has learned in response to new information or to changes in its scope of application or circumstances. It can also be motivated by technical considerations or regulatory requirements.

Unlearning refers to adjusting or eliminating existing data, rules, or connections in an AI model. It is an "emergent subfield of machine learning that aims to remove the influence of a specific subset of training examples from a trained model, ideally while preserving intact valid data and learning," as explained by Google.
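
To make the idea of removing the influence of specific training examples concrete, here is a minimal Python sketch. It assumes a toy nearest-centroid classifier whose parameters are just per-class sums and counts, so one example's contribution can be subtracted exactly. The class name CentroidClassifier, its methods, and the data are hypothetical and are not taken from Google's work. Most real models, such as neural networks, do not decompose this way, which is part of why unlearning is hard in practice.

```python
class CentroidClassifier:
    """Toy nearest-centroid model: parameters are per-class sums and counts."""

    def __init__(self):
        self.sums = {}    # label -> per-feature running sums
        self.counts = {}  # label -> number of examples seen

    def learn(self, x, label):
        sums = self.sums.get(label, [0.0] * len(x))
        self.sums[label] = [s + v for s, v in zip(sums, x)]
        self.counts[label] = self.counts.get(label, 0) + 1

    def unlearn(self, x, label):
        # Subtract exactly what learn() added for this example.
        self.sums[label] = [s - v for s, v in zip(self.sums[label], x)]
        self.counts[label] -= 1
        if self.counts[label] == 0:  # no examples left for this class
            del self.sums[label], self.counts[label]

    def centroids(self):
        return {
            label: [s / self.counts[label] for s in sums]
            for label, sums in self.sums.items()
        }

    def predict(self, x):
        # Return the label whose centroid is closest to x.
        cents = self.centroids()
        return min(
            cents,
            key=lambda lb: sum((a - b) ** 2 for a, b in zip(x, cents[lb])),
        )


def train(examples):
    model = CentroidClassifier()
    for x, label in examples:
        model.learn(x, label)
    return model


# Hypothetical data: the single example in forget_set must be forgotten.
retain_set = [((1.0, 1.0), "a"), ((2.0, 1.0), "a"),
              ((8.0, 9.0), "b"), ((9.0, 8.0), "b")]
forget_set = [((9.0, 1.0), "a")]

unlearned = train(retain_set + forget_set)
for x, label in forget_set:
    unlearned.unlearn(x, label)

# Reference point: a model retrained from scratch without the forget set.
retrained = train(retain_set)
print(unlearned.centroids() == retrained.centroids())  # prints True
```

The comparison with a model retrained only on the retained data is the usual yardstick: unlearning is considered successful when the updated model behaves like one that never saw the forgotten examples.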

Machine unlearning tackles challenges such as data bias and adaptation to changing scenarios or regulations.

Why machine unlearning matters in AI

Machine unlearning matters because AI models often encounter changing, biased, contradictory, or excessive data. Retaining associations learned from such data can lead to incorrect or biased decisions.

As explained in SemiWiki, "adaptability is a cornerstone of intelligence, both human and artificial. Just as humans learn to navigate new situations and respond to changing environments, AI systems strive to exhibit a similar capacity."

The same source adds that one advantage of adaptability through machine unlearning is the mitigation of a phenomenon known as 'catastrophic forgetting.' When an AI model is trained on updated data that is inconsistent with its original training data, it risks "forgetting" (overwriting or losing) valuable earlier training.

Proper and controlled unlearning helps address catastrophic forgetting by carefully eliminating obsolete or incorrect information while preserving previously acquired knowledge.

Machine unlearning therefore makes AI models more flexible and adaptable. It also helps them make more precise and fair decisions over time, because unlearning enables:

Correcting biases and errors

Biases and errors in training data can lead to biased or ineffective AI models. Machine unlearning allows the identification and correction of these biases by eliminating or modifying connections that lead to unfair or incorrect decisions.

Example: A personnel selection model with acquired gender or age biases can correct them through unlearning.

Adapting to changes

AI models, like humans, must adapt to new data, regulations, or circumstances. Machine unlearning enables models to update their knowledge as they encounter updated or changing information.

Example: A news recommendation system can update what it has learned about users' preferences as their interests change.

Improving accuracy and generalization

Machine unlearning can enhance models' generalization ability by removing obsolete or noisy information that could negatively affect their performance.

Example: A machine translation model that has learned incorrect grammar rules can unlearn those rules to produce accurate translations.

Benefits of machine unlearning

Machine unlearning has been used to enhance AI models' accuracy and fairness in various applications, such as personnel selection, fraud detection, or medical diagnosis. Benefits of machine unlearning in AI include:

Improving performance by reducing outdated data and models that slow down the system.
Protecting sensitive information or individual privacy.
Making AI systems more efficient in processing new information.
Enhancing AI's ability to process and analyze complex data.
Correcting biases acquired during training.
Enabling more precise predictions and recommendations tailored to changes in user habits and preferences.

Challenges in AI unlearning

Machine unlearning in AI faces significant challenges. Some experts even doubt its effectiveness: "To use a human analogy, once an A.I. has ‘seen’ something, there is no easy way to tell the model to ‘forget’ what it saw. And deleting the model entirely is also surprisingly difficult," says Fortune.

Google also points out that "fully erasing the influence of the data requested to be deleted is challenging since, aside from simply deleting it from databases where it’s stored, it also requires erasing the influence of that data on other artifacts such as trained machine learning models." In response to this difficulty, Google launched the first Machine Unlearning Challenge in the summer of 2023.

Like related approaches such as retraining or regularization, machine unlearning faces several practical challenges:

Obtaining suitable, high-quality, and relevant datasets can be challenging, especially if data is scarce or difficult to collect.
Modifying AI models' complex internal representations can be challenging, as is identifying which parts of the model need to be modified and how to do it properly.
Improper or unbalanced application can lead to unwanted results, such as removing useful information (over-forgetting) or failing to remove enough of the biased information (under-forgetting).
Understanding and documenting what changed and what the consequences are is needed to avoid opacity and ensure accountability, including for decisions the AI made before unlearning.
Measuring accuracy, fairness, and other relevant aspects of the updated model is necessary to confirm that unlearning actually improved it.
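
The measurement step in the last item can be sketched as follows. This is a minimal illustration, assuming a hypothetical hiring scenario: the applicant records, the two rule-based stand-ins for the model before and after unlearning, and all function names are invented for the example. It compares accuracy and the gap in selection rates between two groups (a simple demographic-parity check) for both versions.

```python
def accuracy(predict, examples):
    """Fraction of (applicant, label) pairs the model gets right."""
    return sum(predict(x) == y for x, y in examples) / len(examples)


def selection_rate(predict, examples, group):
    """Share of applicants in `group` that the model selects."""
    members = [x for x, _ in examples if x["group"] == group]
    return sum(predict(x) for x in members) / len(members)


def parity_gap(predict, examples, group_a, group_b):
    """Absolute difference in selection rate between two groups."""
    return abs(selection_rate(predict, examples, group_a)
               - selection_rate(predict, examples, group_b))


# Hypothetical stand-ins for a model before and after unlearning.
def model_before(applicant):
    # Biased rule: group "f" needs a higher score to be selected.
    threshold = 8 if applicant["group"] == "f" else 6
    return applicant["score"] >= threshold


def model_after(applicant):
    # Same threshold for every applicant.
    return applicant["score"] >= 6


# Hypothetical evaluation set; the label says whether the applicant
# is actually qualified (score of 6 or more).
evaluation_set = [
    ({"score": s, "group": g}, s >= 6)
    for g in ("f", "m")
    for s in (5, 6, 7, 9)
]

for name, model in (("before", model_before), ("after", model_after)):
    print(name,
          "accuracy:", accuracy(model, evaluation_set),
          "parity gap:", parity_gap(model, evaluation_set, "f", "m"))
```

On this toy data the biased rule scores 0.75 accuracy with a parity gap of 0.5, while the corrected rule scores 1.0 with a gap of 0.0. In practice the same checks would be run on held-out data, and separately on the retained and the forgotten examples.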

Even resetting an AI model to completely erase previous knowledge and enable new learning from scratch is a complex process. Training or retraining an AI model can be very expensive depending on its size and complexity. "GPT-4 was probably trained using trillions of words of text and thousands of computer processors in a process that cost more than 100 million euros," according to Wired.

Conclusion

Machine unlearning plays a fundamental role in improving AI models' accuracy, fairness, privacy, and adaptability. It enables models to update and modify their previous knowledge based on new information or changes in the environment.

Therefore, as AI influences more aspects of our lives, machine unlearning becomes an essential tool to ensure AI models are fair, accurate, and adaptable. However, finding a balance between effective learning and unlearning in AI is still a major challenge.