To everyone’s surprise, Apple is turning weakness into strength with its evolving approach to artificial intelligence (AI) research, becoming one of the biggest open-source research contributors in the field.
Apple last week released its DCLM (DataComp for Language Models) models on Hugging Face. The project aims to improve training data curation, enabling new models to be trained and tested with relatively few “training tokens.”
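For teams that want to experiment, the checkpoints can be pulled down like any other Hugging Face release. The snippet below is a minimal sketch using the Hugging Face transformers library; the repository name shown (apple/DCLM-7B) and the loading details are assumptions to verify against Apple’s model card.

```python
# Minimal sketch: loading one of Apple's DCLM checkpoints from Hugging Face.
# The repo id "apple/DCLM-7B" and loading options are assumptions; check the
# model card before relying on them.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "apple/DCLM-7B"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "Open-source language models matter because"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```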
‘Best performing truly open-source models’
“Our baseline model is also comparable to Mistral-7B-v0.3 and Llama 3 8B on MMLU (63% & 66%) and performs similarly on an average of 53 natural language understanding tasks while being trained with 6.6x less compute than Llama 3 8B,” the team wrote. “Our results highlight the importance of dataset design for training language models and offer a starting point for further research on data curation.”
Apple’s researchers worked with peers from the University of Washington, the Toyota Research Institute, Stanford, and others on the project.
“To our knowledge these are by far the best performing truly open-source models (open data, open weight models, open training code),” Apple machine learning researcher Vaishaal Shankar wrote when announcing the news.
Reaction seems positive. “The data curation process is a must-study for anyone looking to train a model from scratch or fine-tune an existing one,” said applied AI scientist Akash Shetty.
Opening up, strategically
The model competes strongly with others of its type, including Mistral-7B, and approaches the performance of models from Meta and Google, even though it was trained on smaller quantities of data.
The idea is that research teams can use the tech to create their own small AI models, which can themselves be embedded in (say) apps and deployed at low cost.
While it is unwise to read too much into things, Apple’s AI teams do seem to have embraced a more open approach to research in the field. That makes sense for a company allegedly racing to catch up to competitors, of course; it also makes sense in another way: a company that contributes to and maintains open-source code puts itself in a strong position for future research, both through contact with peers and by fostering goodwill.
Collaboration counts
That alone is a small but remarkable step for Apple, which has a reputation for prizing secrecy above all else. That secrecy has, we’ve been intermittently told in recent years, been a big problem for Apple’s research teams, who wanted to work more collaboratively with others at the cutting edge of the industry.
Apple seems to have listened, which is why I think it is now turning what was once a disadvantage into an advantage. In the short term, the company wants to promote effective innovation in AI while it develops its own solutions under the Apple Intelligence brand.
It might also hope to position itself as a source of technology powering many open-source projects. Ensuring good technologies are widely available to the open-source community could help prevent other entities from owning too much of the core technology.
Coders on the edge of time
The release of the small-size model also reflects Apple’s core approach toward edge-based AI, supplemented by its own secure server-based AI services and by third-party offerings, such as those from OpenAI and, in the future, perhaps Google Gemini.
The Apple release is just the latest in a string of such releases since the company intensified its focus on AI research. It has now published dozens of models, most recently including OpenELM and a set of Core ML models; the latter are optimized to run generative AI (genAI) and machine learning applications on device.
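On-device deployment is the obvious destination for models of this size. As a rough illustration of that workflow, not Apple’s own pipeline, the sketch below converts a small PyTorch model to Core ML with Apple’s coremltools package; the stand-in model and input shape are assumptions chosen purely for demonstration.

```python
# Sketch: converting a small PyTorch model to Core ML for on-device use.
# The torchvision model here is only a stand-in; Apple's published Core ML
# checkpoints ship ready-made, so this just illustrates the general workflow.
import torch
import torchvision
import coremltools as ct

# Any small model works for illustration; MobileNetV3-Small keeps it light.
model = torchvision.models.mobilenet_v3_small(weights=None).eval()

example_input = torch.rand(1, 3, 224, 224)  # assumed input shape
traced = torch.jit.trace(model, example_input)

# Convert the traced graph to an ML Program and save it as an .mlpackage
# that Xcode can bundle into an app.
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="image", shape=example_input.shape)],
    convert_to="mlprogram",
)
mlmodel.save("MobileNetV3Small.mlpackage")
```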
While nothing has been stated to this effect, the cards Apple is showing indicate it is working more closely with researchers outside the company. And it’s investing in the development of edge AI, which, ironically, is the direction the industry will inevitably head as the real-life problems of power and water consumption, copyright, and privacy present existential challenges to the future evolution of the server-led AI space.