Meta's Open Source Dilemma: The Changing Landscape of AI Foundation Models
Open source software has long symbolized transparency, community collaboration, and the unrestricted sharing of knowledge. However, in the realm of AI foundation models, the concept of openness and transparency is taking on new dimensions. In a rapidly evolving world of AI technology, major players like Meta and OpenAI have released powerful models, but the extent to which these models adhere to open-source principles is a matter of ongoing debate.
In recent months, Meta made a significant move by releasing its large language model, Llama 2, to the public, labeling it as open source. This decision, while considered a step toward increased accessibility and transparency, has raised concerns among open source advocates. While Llama 2 is available for free, it comes with certain limitations that fall short of the stringent requirements set by the Open Source Initiative (OSI).
Not-So-Open Open Source
The Open Source Initiative defines open source as more than simply sharing code or research. True open source entails free redistribution, access to source code, allowance for modifications, and independence from specific products. Despite its public accessibility, Llama 2, by Meta's own admission, falls short of these criteria: its license requires companies with more than 700 million monthly active users to obtain a separate license from Meta, and it restricts the use of Llama 2 to train other models.
Researchers from Radboud University in the Netherlands argue that Meta’s claim of Llama 2 being open source is “misleading.” This raises the question of whether major AI companies are genuinely committed to open source principles or if their openness comes with asterisks.
Striking the Balance
Joelle Pineau, Meta's VP for AI research and lead of its FAIR lab, acknowledges the limitations in Meta's approach to openness. She argues that striking a balance between information-sharing and safeguarding Meta's business interests is essential, and says this measured approach to openness has pushed Meta's researchers to prioritize safety and responsibility in their AI projects.
One of Meta's most notable open-source initiatives is PyTorch, a machine learning framework. Released to the open-source community in 2016, PyTorch has seen significant contributions and improvements from external developers. Pineau emphasizes that the extent of openness depends on the maturity and safety of the code: when safety concerns are high, Meta is cautious about releasing research to a wider audience.
Fostering Collaborative Innovation
Meta is committed to fostering collaboration in the AI community. It actively participates in industry groups like the Partnership on AI and MLCommons to develop benchmarks and guidelines for safe model deployment. The company believes that a collective effort is necessary to promote safe and responsible AI in the open source community, as no single company can drive this conversation.
A Unique Approach in the AI Landscape
Meta’s approach to openness stands out in the world of major AI companies. Companies like OpenAI and Google have adopted different strategies, with OpenAI initially emphasizing open research but later shifting towards a more closed approach. Smaller developers, including Stability AI and EleutherAI, have made significant contributions to the open source space by regularly releasing new language models.
Meta’s Open Source License Challenges
One of the challenges in applying traditional open source licenses to AI models is the handling of vast amounts of external data. Traditional licenses were written for software code and offer only limited liability protections, making them ill-suited to the risks that arise when a model such as Llama 2 is trained on extensive external data. Pineau suggests that these licenses need to evolve to better fit the AI model landscape.
A Shifting Landscape
The open source landscape is evolving, with developers and companies leaning toward a more permissive approach. In this post-open source shift, strict adherence to open source principles is giving way to a focus on accessibility and ease of use: developers value the ability to deploy their preferred open-source software in the cloud with minimal friction.
The Role of Open Source
Open source, while not the primary focus, remains essential in rallying around standards and facilitating access to common skills and infrastructure. However, the debate over whether models like Llama 2 truly adhere to open-source principles is ongoing.
The Transparency Challenge
A recent report from Stanford HAI highlights a broader concern in the AI industry: major developers of foundation models, including Meta and OpenAI, are criticized for not providing sufficient information about the potential societal impact of their models. Stanford HAI introduced the Foundation Model Transparency Index to assess the transparency of 10 prominent foundation models, including Llama 2, BloomZ, and GPT-4. While these models scored moderately on transparency overall, none disclosed information about societal impact, privacy, copyright, or bias complaints.
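To make the idea of such an index concrete, here is a minimal sketch of how per-indicator disclosure scores might be aggregated into an overall transparency rating. The indicator names, model names, and scores below are hypothetical illustrations, not actual Foundation Model Transparency Index data or methodology.

```python
# Toy transparency-index sketch: each model gets a 0/1 score per disclosure
# indicator, and the index is the percentage of indicators disclosed.
# All names and values here are illustrative assumptions.

INDICATORS = ["training_data", "compute", "societal_impact", "privacy", "copyright"]

# Hypothetical disclosure scores for two made-up models.
scores = {
    "model_a": {"training_data": 1, "compute": 1, "societal_impact": 0,
                "privacy": 0, "copyright": 0},
    "model_b": {"training_data": 0, "compute": 1, "societal_impact": 0,
                "privacy": 0, "copyright": 1},
}

def transparency_index(model_scores: dict) -> float:
    """Return the fraction of indicators disclosed, as a percentage."""
    return 100 * sum(model_scores[i] for i in INDICATORS) / len(INDICATORS)

for model, s in sorted(scores.items()):
    print(f"{model}: {transparency_index(s):.0f}%")
```

Even in this toy form, the sketch shows why a uniform zero on indicators like societal impact drags every model's score down regardless of how open it is elsewhere.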
In the world of AI foundation models, the concept of open source is evolving. Major companies like Meta and OpenAI are striving to balance accessibility, transparency, and business interests. The open source landscape is shifting towards a more permissive approach, with ease of access and use taking precedence.
As AI technology continues to advance, the definition of openness in the open source community is undergoing a transformation. The challenge lies in finding the right balance between collaboration and protecting intellectual property. While the debate over the openness of AI foundation models continues, the focus remains on enabling developers to build with minimal friction and greater opportunity in our increasingly cloudified software world. In this evolving landscape, the definition of “open source” may be less critical than the practicality of access and use for developers and researchers alike.