Introduction to AI Control
When Altman celebrates finally getting GPT to avoid em dashes, he’s really celebrating that OpenAI has tuned its latest model, GPT-5.1 (probably through reinforcement learning or fine-tuning), to weight custom instructions more heavily in its probability calculations.
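To make that "weighting" concrete, here is a toy Python sketch of the idea at the level of token probabilities. The tiny vocabulary and the numbers are invented for illustration; this is not OpenAI's actual mechanism, just the shape of the math: nudging a token's logit downward makes it rarer in sampling without ever driving its probability to zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy next-token logits over a tiny vocabulary. The values are invented
# purely for illustration; real models score ~100k tokens at every step.
vocab = [",", ";", "—", "-", "."]
logits = np.array([2.0, 0.5, 1.8, 1.0, 1.5])

def em_dash_rate(logits, bias=0.0, em_dash_index=2, n=10_000):
    """Sample next tokens after adding a bias to the em dash logit."""
    adjusted = logits.copy()
    adjusted[em_dash_index] += bias
    probs = np.exp(adjusted) / np.exp(adjusted).sum()  # softmax
    draws = rng.choice(len(vocab), size=n, p=probs)
    return (draws == em_dash_index).mean()

print(f"em dash rate, no tuning:   {em_dash_rate(logits):.1%}")
print(f"em dash rate, biased down: {em_dash_rate(logits, bias=-4.0):.1%}")
# A strong negative bias makes em dashes rare, not impossible: the
# distribution still assigns them nonzero probability.
```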
The Challenge of Control
There’s an irony to this kind of control: given the probabilistic nature of the issue, there’s no guarantee the fix will stick. OpenAI continuously updates its models behind the scenes, even within the same version number, adjusting outputs based on user feedback and new training runs. Each update arrives with different output characteristics that can undo previous behavioral tuning, producing the kind of regression familiar from software testing. (This is related to, but distinct from, what researchers call the “alignment tax”: the capability cost a model pays for alignment training.)
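That churn is why teams that care about a specific behavior tend to test for it continuously rather than trust a one-time fix. Here is a minimal sketch of such a behavioral regression test, assuming the official openai Python client; the model name, prompt, and sample size are placeholders.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = "Summarize the water cycle in three sentences."
INSTRUCTION = "Never use em dashes in your responses."

def em_dash_rate(n: int = 20) -> float:
    """Fraction of sampled responses containing at least one em dash."""
    hits = 0
    for _ in range(n):
        resp = client.chat.completions.create(
            model="gpt-5.1",  # placeholder; pin whatever version you test
            messages=[
                {"role": "system", "content": INSTRUCTION},
                {"role": "user", "content": PROMPT},
            ],
        )
        hits += "—" in resp.choices[0].message.content
    return hits / n

# Run this against each silent model update; a rate that climbs back up
# is exactly the kind of regression described above.
print(f"em dash rate: {em_dash_rate():.0%}")
```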
Understanding Neural Networks
Precisely tuning a neural network’s behavior is not yet an exact science. Since all concepts encoded in the network are interconnected by values called weights, adjusting one behavior can alter others in unintended ways. Fix em dash overuse today, and tomorrow’s update (aimed at improving, say, coding capabilities) might inadvertently bring them back, not because OpenAI wants them there, but because that’s the nature of trying to steer a statistical system with billions of interdependent weights.
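That interference is easy to reproduce at toy scale. The NumPy sketch below trains a tiny two-layer network on one "behavior" and watches an unrelated output drift, because both flow through the same shared weights. It is a deliberately miniature analogy, not a claim about GPT-5.1's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two toy "behaviors" (stand-ins for punctuation style and coding
# ability) that share the hidden-layer weights W1. That sharing is
# where the interference comes from.
W1 = rng.normal(size=(3, 4)) * 0.5   # shared hidden layer
W2 = rng.normal(size=(2, 3)) * 0.5   # one output row per behavior

x_style = rng.normal(size=4)  # input probing the style behavior
x_code = rng.normal(size=4)   # input probing the coding behavior

def forward(x):
    h = np.tanh(W1 @ x)
    return W2 @ h, h

def behaviors():
    return forward(x_style)[0][0], forward(x_code)[0][1]

before_style, before_code = behaviors()

# "Fine-tune" ONLY the coding output toward a target, with plain
# gradient descent on squared error. The shared W1 gets updated too.
target, lr = 2.0, 0.1
for _ in range(200):
    y, h = forward(x_code)
    err = y[1] - target
    dh = err * W2[1] * (1 - h**2)   # backprop into the hidden layer
    W2[1] -= lr * err * h
    W1 -= lr * np.outer(dh, x_code)

after_style, after_code = behaviors()
print(f"coding output: {before_code:+.3f} -> {after_code:+.3f} (what we tuned)")
print(f"style  output: {before_style:+.3f} -> {after_style:+.3f} (moved anyway)")
```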
The Road to AGI
This gets at an implied question we raised earlier: if controlling punctuation use is still a struggle that might pop back up at any time, how far are we from AGI? We can’t know for sure, but it seems increasingly likely that it won’t emerge from a large language model alone. That’s because AGI, a technology that would replicate human general learning ability, would likely require true understanding and self-reflective intentional action, not statistical pattern matching that sometimes aligns with instructions if you happen to get lucky.
Real-World Examples
And speaking of getting lucky, some users still aren’t having luck controlling em dash use outside of the “custom instructions” feature. When told within a chat not to use em dashes, ChatGPT updated a saved memory and replied to one X user, “Got it—I’ll stick strictly to short hyphens from now on.”
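For users who want something stronger than a polite request, the Chat Completions API exposes a logit_bias parameter that clamps specific token probabilities directly. A sketch follows; the model name is a placeholder, support for logit_bias varies by model, and because em dashes can also hide inside multi-character tokens, this ban is only partial.

```python
import tiktoken
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
enc = tiktoken.get_encoding("o200k_base")  # tokenizer for recent GPT models

# Strongly suppress every token that encodes a bare em dash. Note that
# "—" can also appear inside longer tokens, so this is not airtight.
banned = {token_id: -100 for token_id in enc.encode("—")}

resp = client.chat.completions.create(
    model="gpt-4o",  # placeholder; confirm your model honors logit_bias
    messages=[{"role": "user", "content": "Describe autumn in one paragraph."}],
    logit_bias=banned,
)
print(resp.choices[0].message.content)
```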
Conclusion
Controlling AI behavior, even in tasks as simple as punctuation, remains a complex challenge. The probabilistic nature of neural networks and the continuous, silent updates to deployed models make consistent results hard to guarantee. And while fine-tuning has come a long way, we are still far from the true understanding and self-reflective intentional action that AGI would require.
FAQs
- Q: What is the challenge in controlling AI behavior?
  A: The challenge lies in the probabilistic nature of neural networks and in the continuous updates to AI models, which can undo previous behavioral tuning.
- Q: What is the alignment tax?
  A: The alignment tax is the capability cost a model pays for alignment training. The related problem described here is behavioral regression, where updates aimed at one capability unintentionally alter others.
- Q: How far are we from achieving AGI?
  A: We can’t know for sure, but it seems increasingly likely that AGI won’t emerge from a large language model alone and will require true understanding and self-reflective intentional action.
- Q: Can AI models be fine-tuned to achieve specific results?
  A: Yes, but the results may not be consistent, owing to the probabilistic nature of neural networks and the continuous updates those models receive.