DeepSeek goes beyond “open weights” AI with plans for source code release

Chinese AI firm says daily releases will reveal “code that moved our tiny moonshot forward.”
Last month, DeepSeek turned the AI world on its head with the release of a new, competitive simulated reasoning model that was free to download and use under an MIT license. Now the company is preparing to make the underlying code behind that model more accessible, promising to release five open source repos starting next week.

In a social media post late Thursday, DeepSeek said the daily releases it is planning for its “Open Source Week” would provide visibility into “these humble building blocks in our online service [that] have been documented, deployed and battle-tested in production. As part of the open-source community, we believe that every line shared becomes collective momentum that accelerates the journey.”

While DeepSeek has said little about exactly what kind of code it will be sharing, an accompanying GitHub page for “DeepSeek Open Infra” promises the coming releases will cover “code that moved our tiny moonshot forward” and share “our small-but-sincere progress with full transparency.” The page also refers back to a 2024 paper detailing DeepSeek’s training architecture and software stack.

The move threatens to widen the contrast between DeepSeek and OpenAI, whose market-leading ChatGPT models remain completely proprietary, making their inner workings opaque to outside users and researchers. The open source release could also provide wider and easier access to DeepSeek even as its mobile app faces international restrictions over privacy concerns.

DeepSeek’s initial model release already included so-called “open weights” access to the underlying data representing the strength of the connections between the model’s billions of simulated neurons. That kind of release allows end users to easily fine-tune those model parameters with additional training data for more targeted purposes.

Major models, including Google’s Gemma, Meta’s Llama, and even older OpenAI releases like GPT-2, have been released under this open weights structure. Those releases also often include open source code covering the inference-time instructions run when responding to a query.

It’s currently unclear whether DeepSeek’s planned open source release will also include the code the team used when training the model. That kind of training code is necessary to meet the Open Source Initiative’s formal definition of “Open Source AI,” which was finalized last year after years of study. A truly open AI must also include “sufficiently detailed information about the data used to train the system so that a skilled person can build a substantially equivalent system,” according to the OSI.

A fully open source release, including training code, can give researchers more visibility into how a model works at a core level, potentially revealing biases or limitations that are inherent to the model’s architecture instead of its parameter weights. A full source release would also make it easier to reproduce a model from scratch, potentially with completely new training data, if necessary.

Elon Musk’s xAI released an open source version of Grok 1’s inference-time code last March and recently promised to release an open source version of Grok 2 in the coming weeks. However, the recent release of Grok 3 will remain proprietary and available only to X Premium subscribers for the time being, the company said.
Earlier this month, Hugging Face released an open source clone of OpenAI’s proprietary “Deep Research” feature mere hours after it was released. That clone relies on a closed-weights model at release “just because it worked well,” Hugging Face’s Aymeric Roucher told Ars Technica, but the source code’s “open pipeline” can easily be switched to any open-weights model as needed.
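To make the “open weights” distinction concrete, here is a minimal sketch, assuming the Hugging Face transformers library and one of DeepSeek’s MIT-licensed distilled R1 checkpoints (the model ID below is illustrative, not something DeepSeek prescribes): anyone can download the published weights and run the model locally, or fine-tune its parameters, without ever seeing the training code.

# A minimal sketch of "open weights" access, assuming the Hugging Face
# transformers library; the model ID is an illustrative assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed open-weights checkpoint

# Download the published weights and tokenizer; no training code is required
# to run the model or to fine-tune its parameters locally.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Inference: the "inference-time instructions run when responding to a query."
prompt = "Explain the difference between open weights and open source AI."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Fine-tuning starts from these same downloaded weights; what open weights alone do not provide is the training pipeline and data documentation needed to rebuild the model from scratch, which is what the OSI definition additionally requires.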