DeepSeek goes beyond “open weights” AI with plans for source code release

The Future of Open-Source AI: A Glimpse into What Comes Next for Machine Learning

The Rise of Open-Source AI Models

In recent years, major models like Google’s Gemma, Meta’s Llama, and OpenAI’s GPT-2 have been released under an open-weights structure. These releases often also include open-source code covering the inference-time instructions run when responding to a query. Together, published weights and inference code let outside researchers run, inspect, and fine-tune a model without relying on its developer.

The Importance of Open-Source AI

According to the Open Source Initiative’s formal definition, a truly open AI must include "sufficiently detailed information about the data used to train the system so that a skilled person can build a substantially equivalent system." A fully open-source release, including training code, can also give researchers more visibility into how a model works at a core level, potentially revealing biases or limitations that are inherent to the model’s architecture rather than its parameter weights.

Open-Source Releases from Other Companies

Elon Musk’s xAI released an open-source version of Grok 1’s inference-time code last March and has promised to release an open-source version of Grok 2 in the coming weeks. However, the recently released Grok 3 will remain proprietary and available only to X Premium subscribers for the time being.

HuggingFace’s Open-Source Clone

Earlier this month, HuggingFace released an open-source clone of OpenAI’s proprietary "Deep Research" feature mere hours after OpenAI launched it. At release, the clone relies on a closed-weights model "just because it worked well," but the source code’s "open pipeline" can easily be switched to any open-weights model as needed.

Conclusion

The rise of open-source AI models has the potential to democratize access to powerful machine learning tools and techniques. When the training code and data behind these models are made available, researchers and developers can gain a deeper understanding of how the models work and how they can be improved. As the open-source AI movement gains momentum, more such releases, and the applications built on them, are likely to follow.

FAQs

What is open-source AI?
Open-source AI refers to AI models that are made available with open-source code and data, allowing researchers and developers to access and modify the code to suit their needs.

What is the Open Source Initiative’s definition of "Open-Source AI"?
According to the Open Source Initiative, a truly open AI must include "sufficiently detailed information about the data used to train the system so that a skilled person can build a substantially equivalent system."

Will xAI release the training code for Grok 3?
No. The recently released Grok 3 will remain proprietary and available only to X Premium subscribers for the time being.