Introduction to Machine Unlearning
AI companies generally keep a tight grip on their models to discourage misuse. For example, if you ask ChatGPT to give you someone’s phone number or instructions for doing something illegal, it will likely just tell you it cannot help. However, as many examples over time have shown, clever prompt engineering or model fine-tuning can sometimes get these models to say things they otherwise wouldn’t. The unwanted information may still be hiding somewhere inside the model, waiting to be surfaced with the right techniques.
The Current Approach
At present, companies tend to deal with this issue by applying guardrails; the idea is to check whether the prompts or the AI’s responses contain disallowed material. Machine unlearning instead asks whether an AI can be made to forget a piece of information that the company doesn’t want it to know. The technique takes a leaky model and the specific training data to be redacted and uses them to create a new model—essentially, a version of the original that never learned that piece of data. While machine unlearning has ties to older techniques in AI research, it’s only in the past couple of years that it’s been applied to large language models.
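The details vary from method to method, but a common recipe in the unlearning literature combines gradient ascent on the data to be forgotten with ordinary training on the data to be kept. The sketch below illustrates that general idea in PyTorch; it is not the specific method used in the paper discussed here, and `model.loss` is a hypothetical stand-in for whatever training loss the model uses.

```python
import torch

def unlearning_step(model: torch.nn.Module,
                    optimizer: torch.optim.Optimizer,
                    forget_batch,
                    retain_batch,
                    alpha: float = 0.5) -> float:
    """One update pushing the model away from the forget set while
    preserving its behavior on the retain set (illustrative sketch)."""
    optimizer.zero_grad()
    # Negated loss on the forget set: gradient *ascent*, so the model
    # "un-fits" the data to be redacted.
    forget_loss = -model.loss(forget_batch)
    # Ordinary loss on the retain set keeps general performance intact.
    retain_loss = model.loss(retain_batch)
    total = alpha * forget_loss + (1.0 - alpha) * retain_loss
    total.backward()
    optimizer.step()
    return total.item()
```

In practice the balance between forgetting and retention is controlled by the weighting (here, `alpha`) and by how long the procedure runs.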
Understanding Machine Unlearning
Jinju Kim, a master’s student at Sungkyunkwan University who worked on the paper with Ko and others, sees guardrails as fences put up around the bad data to keep people away from it. “You can’t get through the fence, but some people will still try to go under the fence or over the fence,” says Kim. Unlearning, she says, instead attempts to remove the bad data altogether, so that there is nothing behind the fence at all.
Complications in Text-to-Speech Systems
The way current text-to-speech systems are designed complicates this a little more, though. These so-called “zero-shot” models use examples of people’s speech to learn to re-create any voice, including voices not in the training set; given enough training data, a model can become a good mimic when supplied with even a small sample of someone’s voice. So “unlearning” means a model not only needs to “forget” the voices it was trained on but also has to learn not to mimic specific voices it wasn’t trained on. All the while, it still needs to perform well for other voices.
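In outline, a zero-shot system works by distilling a short reference clip into a speaker embedding and conditioning generation on it, which is why it can imitate voices it never saw during training. The sketch below is only a rough illustration; `tts_model`, `speaker_encoder`, and their methods are hypothetical placeholders, not VoiceBox’s actual API.

```python
def synthesize(tts_model, speaker_encoder, reference_clip, text):
    """Zero-shot voice cloning, in rough outline (placeholder names)."""
    # A few seconds of reference audio are distilled into a fixed-size
    # embedding that captures the characteristics of the voice.
    voice = speaker_encoder(reference_clip)
    # Generation is conditioned on that embedding, so the output reads
    # the text in the reference voice, even one absent from training.
    return tts_model.generate(text, condition=voice)
```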
Demonstration of Voice Unlearning
To demonstrate how to get those results, Kim taught a re-creation of VoiceBox, a speech generation model from Meta, that when it was prompted to read a text sample in one of the voices to be redacted, it should instead respond in a random voice. To make these random voices realistic, the model “teaches” itself using random voices of its own creation.
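One way to picture that training setup: for each example, check whether the prompt voice belongs to a redacted speaker, and if so, swap the fine-tuning target for speech in a random voice sampled from the model itself. The snippet below is an illustrative sketch under that reading; the method names (`sample_random_speaker`, `generate`) are hypothetical, not the authors’ actual code.

```python
def make_target(model, text, ground_truth_audio, redacted_ids, speaker_id):
    """Build the fine-tuning target for one example (illustrative sketch)."""
    if speaker_id in redacted_ids:
        # Redacted speaker: the target is the same text spoken in a random
        # voice the model generates for itself, so the fine-tuned model
        # learns to answer these prompts with a different voice.
        random_voice = model.sample_random_speaker()
        return model.generate(text, condition=random_voice)
    # Permitted speaker: keep the original target so zero-shot mimicry
    # still works for everyone else.
    return ground_truth_audio
```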
Results and Implications
According to the team’s results, which are to be presented this week at the International Conference on Machine Learning, prompting the model to imitate a voice it has “unlearned” yields a result that, according to state-of-the-art tools for measuring voice similarity, mimics the forgotten voice more than 75% less effectively than the model did before. In practice, this makes the new voice unmistakably different. But the forgetfulness comes at a cost: the model is about 2.8% worse at mimicking permitted voices. While these percentages are hard to interpret on their own, the demo the researchers released online offers very convincing results, both for how well redacted speakers are forgotten and for how well the rest are remembered.
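To make the similarity numbers concrete: one standard way to score mimicry is to embed both the real voice and the generated audio with a speaker-verification model and compare the embeddings. The sketch below shows how a “percent less effective” figure could be computed from cosine similarities; the `embed` function is a stand-in for such a tool, and the paper’s exact metric may differ.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two speaker embeddings."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def mimicry_drop(embed, real_clip, before_clip, after_clip) -> float:
    """Relative drop in voice similarity after unlearning.

    `embed` maps an audio clip to a speaker embedding; `before_clip` and
    `after_clip` are the model's outputs before and after unlearning.
    """
    sim_before = cosine_similarity(embed(real_clip), embed(before_clip))
    sim_after = cosine_similarity(embed(real_clip), embed(after_clip))
    return 1.0 - sim_after / sim_before  # 0.75 would mean 75% less similar
```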
Audio Samples
In the researchers’ demo, a voice sample of a speaker to be forgotten plays first, followed by the text-to-speech audio the original model generates when given that sample as a prompt. Then comes the text-to-speech audio for the same prompt from the model in which the speaker has been forgotten, and the difference is significant.
Process and Requirements
Ko says the unlearning process can take “several days,” depending on how many speakers the researchers want the model to forget. Their method also requires an audio clip about five minutes long for each speaker whose voice is to be forgotten.
Conclusion
In machine unlearning, pieces of data are often replaced with randomness so that they can’t be reverse-engineered back to the original. In this paper, the randomness for the forgotten speakers is very high—a sign, the authors claim, that they are truly forgotten by the model. This technique has the potential to significantly improve the safety and privacy of AI systems, especially in applications where sensitive information is involved.
FAQs
- Q: What is machine unlearning?
  A: Machine unlearning is a technique that allows AI models to forget specific pieces of information or data, especially those that are sensitive or could be misused.
- Q: How does machine unlearning differ from current approaches?
  A: Current approaches often use guardrails to prevent AI models from accessing or sharing forbidden information. Machine unlearning, however, aims to remove the unwanted information from the model entirely.
- Q: What are the implications of machine unlearning for text-to-speech systems?
  A: For text-to-speech systems, machine unlearning means not only forgetting voices the model was trained on but also preventing it from mimicking voices it wasn’t trained on, all while maintaining performance for other voices.
- Q: How effective is the voice unlearning method demonstrated by the researchers?
  A: The method shows promising results, with the model mimicking forgotten voices more than 75% less effectively after unlearning, though at a slight cost to its performance on permitted voices.