Curated by THEOUTPOST
On Fri, 23 Aug, 12:04 AM UTC
4 Sources
[1]
There is now an open source AI definition. This is what it means
Google, Elon Musk, and Mark Zuckerberg say their AI is open source, but a new definition may challenge that.

Generative artificial intelligence (AI) companies claim that models such as Meta's Llama and Elon Musk's Grok are open source, but there is little agreement on what open source AI actually means. A new working definition of the term, which could change that, has just been released by the Open Source Initiative (OSI), the self-appointed stewards of the term.

Open source generally means the software's source code is publicly available for everyone to use, modify, and distribute. The OSI's open source definition adds that software must also comply with 10 criteria, including having a well-publicised means of obtaining the source code at a reasonable cost or for free, not being discriminatory, and having a license that does not restrict other software.

AI systems are harder to assess against those 10 points, so the OSI has created a specific definition for AI. The open source AI definition states that the system can be used for any purpose without permission from the company, and that researchers should be able to freely see how it works. It also states that the system can be modified for any purpose, including to change its output, and shared for others to use, with or without modifications, for any reason. The definition further requires AI companies to be transparent about the data used to train the system, the source code used to train and run it, and the weights, the numerical parameters that influence how an AI model performs.

Herein lies the problem. OpenAI, despite its name, is closed source: its algorithms, models, and data sets are kept secret. But the Llama, Grok, and Google models, which claim to be open source, do not really qualify either under the OSI definition. This is because the companies are not transparent about the data used to train the weights, which can raise copyright issues and ethical questions about whether the data is biased.
The OSI acknowledges that sharing full training data sets can be challenging, so the question is not black and white, and it does not disqualify otherwise open source AI development from being considered "open source" on that basis alone. The definition has been a couple of years in the making and will likely need to be updated as AI progresses.

The OSI produced the working definition by consulting a 70-person group of researchers, lawyers, policymakers, activists, and representatives of big tech companies such as Microsoft, Meta, and Google.

"This definition will become a valuable resource to combat the widespread practice of 'openwashing' that is becoming quite rampant," Mozilla representatives Ayah Bdeir, Imo Udom, and Nik Marda said in a statement sent to Euronews Next. They explained that "openwashing" is where non-open models (or even open-ish models like Meta's Llama 3) are promoted as leading "open source" options without contributing to the commons. "Researchers have shown that 'the consequences of open-washing are considerable' and affect innovation, research and the public understanding of AI," they added.

"We are the stewards, maintainers of the definition, but we don't really have any strong powers to enforce it," Stefano Maffulli, the OSI's executive director, told Euronews Next in an interview in March. He added that judges and courts around the world are starting to recognise that the open source definition is important, especially when it comes to mergers, but also to regulation. Countries around the world are finalising how they will regulate AI, and open source software has been an issue of contention.

"The open source definition serves as a barrier to identify false advertising," said Maffulli. "If a company says it's open source, it must carry the values that the open source definition carries. Otherwise, it's just confusing".
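The requirements described above (the freedoms to use, study, modify, and share, plus transparency about training data, code, and weights) can be read as a checklist. The following is a hypothetical sketch for illustration only, not an official OSI tool; the `ModelRelease` fields and the `osaid_gaps` helper are assumptions, and whether a real release satisfies each criterion is a judgment call the code cannot make:

```python
from dataclasses import dataclass

# The four freedoms plus the transparency requirements described in
# the OSI's open source AI definition, modeled as simple booleans.
@dataclass
class ModelRelease:
    name: str
    free_to_use: bool              # usable for any purpose, no permission needed
    free_to_study: bool            # researchers can freely see how it works
    free_to_modify: bool           # modifiable for any purpose, incl. its output
    free_to_share: bool            # redistributable with or without changes
    training_data_disclosed: bool  # transparency about the training data
    training_code_disclosed: bool  # code used to train and run the system
    weights_disclosed: bool        # the numerical parameters are released

def osaid_gaps(release: ModelRelease) -> list[str]:
    """Return the criteria this release fails to meet."""
    checks = {
        "use": release.free_to_use,
        "study": release.free_to_study,
        "modify": release.free_to_modify,
        "share": release.free_to_share,
        "training data": release.training_data_disclosed,
        "training code": release.training_code_disclosed,
        "weights": release.weights_disclosed,
    }
    return [name for name, ok in checks.items() if not ok]

# A weights-only release, like those the articles describe: adaptable
# and shareable, but opaque about data and training code.
quasi_open = ModelRelease(
    name="example-llm",
    free_to_use=True, free_to_study=False, free_to_modify=True,
    free_to_share=True, training_data_disclosed=False,
    training_code_disclosed=False, weights_disclosed=True,
)
print(osaid_gaps(quasi_open))  # → ['study', 'training data', 'training code']
```

An empty gap list would mean every criterion in this simplified checklist is met; any non-empty result marks the release as, at best, quasi-open.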
[2]
Like it or not, this open source AI definition takes a giant step forward
HONG KONG -- To paraphrase the late John F. Kennedy, we choose to define open-source AI not because it is easy, but because it is hard; because that goal will serve to organize and measure the best of our energies and skills.

Stefano Maffulli, executive director of the Open Source Initiative (OSI), told me that mixing artificial intelligence (AI) software and data with existing open-source licenses is a bad fit. "Therefore," said Maffulli, "we need to make a new definition for open-source AI." The big tech giants, a Mozilla representative explained, "have not necessarily adhered to the full principles of open source regarding their AI models." A new definition "will help lawmakers working to develop rules and regulations to protect consumers from AI risks."

The OSI has been working diligently on a comprehensive definition for open-source AI, similar to the Open Source Definition for software. This critical effort addresses the growing need for clarity in determining what makes up an open-source AI system at a time when many companies claim their AI models are open source without really being open at all, such as Meta's Llama 3.1.

The latest OSI Open-Source AI Definition draft, 0.0.9, makes several significant changes, among them language drawing on the Model Openness Framework (MOF). As Linux Foundation executive director Jim Zemlin detailed at the Open Source Summit China, the MOF "is a way to help evaluate if a model is open or not open. It allows people to grade models." Within the MOF, Zemlin added, there are three tiers of openness. "The highest level, level one, is an open science definition where the data, every component used, and all of the instructions need to actually go and create your own model the exact same way. Level two is a subset of that where not everything is actually open, but most of them are. Then, on level three, you have areas where the data may not be available, and the data that describe the data sets would be available.
And you can kind of understand that -- even though the model is open -- not all the data is available."

These three levels, a concept that also applies to training data, will be troublesome for some open-source purists to accept. Arguments over both the models and the training data will emerge as the debate continues about which AI and machine learning (ML) systems are truly open and which are not.

The Open Source AI definition has been built collaboratively with diverse stakeholders worldwide. These include, among many others, Code for America, the Wikimedia Foundation, Creative Commons, the Linux Foundation, Microsoft, Google, Amazon, Meta, Hugging Face, the Apache Software Foundation, and the UN's International Telecommunication Union. The OSI has held numerous town halls and workshops to gather input, ensuring that the definition is inclusive and representative of various perspectives. The process is still ongoing: the definition will continue to be refined via worldwide roadshows and the collection of feedback and endorsements from diverse communities.

OSI's Maffulli knows not everyone will be happy with this draft of the definition. Indeed, before this version's appearance, AWS Principal Open Source Technical Strategist Tom Callaway posted on LinkedIn: "It is my strong belief (and the belief of many, many others in open source) that the current Open Source AI Definition does not accurately ensure that AI systems preserve the unrestricted rights of users to run, copy, distribute, study, change, and improve them." Now that the draft has seen the light of day, I'm sure others will get their say. The OSI hopes to present a stable version of the definition at the All Things Open conference in October 2024.
If all goes well, the result will be a definition that most -- if not everyone -- can agree promotes transparency, collaboration, and innovation in open-source AI systems.
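Zemlin's three tiers can be pictured as a simple grading function. This is an illustrative sketch only, based on his description quoted above; the `mof_tier` helper and its boolean inputs are assumptions for this example and are not part of any actual Model Openness Framework tooling:

```python
def mof_tier(data_open: bool, code_open: bool, weights_open: bool,
             data_described: bool) -> int:
    """Grade a model release against the three tiers Zemlin describes.

    Tier 1: everything needed to recreate the model the same way is open.
    Tier 2: most components are open, but not all (e.g. data withheld).
    Tier 3: the data itself may be unavailable, but descriptions of the
            data sets are, alongside the open model.
    Returns 0 when the release meets none of the tiers.
    """
    if data_open and code_open and weights_open:
        return 1
    if weights_open and code_open:
        return 2
    if weights_open and data_described:
        return 3
    return 0

# A weights-plus-datasheet release, with data and code withheld,
# lands in the lowest tier of openness:
print(mof_tier(data_open=False, code_open=False,
               weights_open=True, data_described=True))  # → 3
```

The point of the exercise is that openness becomes a graded scale rather than a yes/no label, which is exactly what some open-source purists object to.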
[3]
We're a big step closer to defining open source AI - but not everyone is happy
[4]
We're a long way from truly open-source AI
Open-source artificial intelligence has been one of the most surprising tech stories of the past year. As companies such as OpenAI and Google have poured billions of dollars into building ever more powerful AI, "open" models that are freely available for developers to use and adapt have closed the performance gap.

There's just one drawback: most of these open-source systems are not very open. Critics accuse their backers of "open washing" -- trying to benefit from the halo effect of open source, with its freedom from the constraints of normal commercial software products, but not living up to the name. The effort to create a truly open-source version of AI is finally gaining momentum. But there is no guarantee that its progress will match that of open-source software, which has come to play a critical role in the tech world over the past 20 years.

With traditional open-source software, such as the Linux operating system, the code is freely available for developers to inspect, use and adapt. So-called open-source AI has been very different, not least because most modern AI systems learn from data rather than having their logic programmed in code. Take Meta's Llama: only the "weights", which determine how the model responds to queries, are disclosed. Users can take and adapt the model, but they can't see the underlying data on which it was trained and don't have enough information to reproduce it from scratch.

For many developers, this still has some clear benefits. They can adapt and train quasi-open models on their own information without needing to hand sensitive internal data over to another company. But not being fully open has its costs. According to Ayah Bdeir, a senior adviser to the Mozilla Foundation, only a truly open-source technology would give people a full understanding of the systems that are starting to affect all facets of our lives, while also guaranteeing that innovation and competition can't be squashed by a handful of dominant AI companies.
One response has come from the Open Source Initiative, which laid out the definition of open-source software more than 20 years ago. This week, it produced a near-final definition that could help to shape how the field develops. It would require not only the weights for a model to be released, but also enough information about the data on which it was trained to allow someone else to reproduce it, as well as all the code behind the system. Other groups, such as Mozilla and the Linux Foundation, are pushing similar initiatives.

Moves such as these are already leading to greater segmentation in the AI world. Many companies are being more careful with their terminology -- perhaps mindful that the OSI owns the trademark to the term "open source" and could sue to prevent it being used on AI models that fall outside its own definition. Mistral, for instance, calls its Nemo an "open weights" model. Alongside the partly open systems, fully open-source models are starting to appear, such as the Olmo large language model developed by the Allen Institute for AI.

Yet it is far from clear that open-source AI will have as big an impact as open source has had in traditional software. For this to happen, two things would be required. One is that the technology will need to meet a big enough need to attract a critical mass of users and developers. With traditional software, the Linux server operating system represented a clear alternative to Microsoft's Windows, winning it a large base of users and strong backing from Microsoft's rivals, including IBM and Oracle. Linux has no equivalent in the AI world; the market is already more fragmented, and many users will find quasi-open LLMs such as Llama adequate. The other is that backers of open-source AI need to make a better case for its safety. The prospect of such a powerful, general-purpose technology being released for anyone to use rightly stirs widespread concern.
Oren Etzioni, former head of the Allen Institute, says that many fears are overblown. When it comes to going online to research how to make a bomb or a bioweapon: "You're not really able to get more out of these [AI models] than you're able to get out of Google. There's lots of it out there -- it's just being packaged differently." He concedes that there are some areas where making AI more freely available could cause harm, such as automating the creation of more online misinformation. "Closed" AI also comes with risks. But until the extra marginal risk of open sourcing the technology has been more thoroughly studied, along with the potential benefits, the fears will remain.
The open-source AI community has reached a consensus on a definition for open-source AI, marking a significant milestone in the field. However, the new definition has sparked debates and raised concerns among various stakeholders.
In a groundbreaking development, the open-source artificial intelligence (AI) community has finally agreed upon a definition for open-source AI. This milestone, achieved through collaborative efforts led by the Open Source Initiative with stakeholders including the Linux Foundation, marks a significant step forward in the rapidly evolving field of AI [1].
The newly established definition of open-source AI encompasses several key aspects, from the freedoms to use, study, modify, and share a system to transparency about its training data, source code, and weights. This comprehensive definition aims to promote transparency, accessibility, and collaboration in AI development [2].
While the new definition has been widely welcomed, it has also sparked debates within the AI community. Some experts argue that the stringent requirements may inadvertently exclude certain AI projects from being classified as open-source. This concern highlights the delicate balance between maintaining high standards and fostering innovation in the field [3].
Major tech companies and AI developers are now grappling with the implications of this new definition. Some firms may need to reassess their AI projects and potentially make significant changes to align with the open-source criteria. This shift could lead to increased transparency in AI development processes and foster greater collaboration within the industry [4].
As the AI community adapts to this new definition, key questions remain, not least how it will be enforced and which existing models will qualify. The coming months and years will likely see further discussions and refinements to the open-source AI definition as the industry continues to navigate this complex and rapidly changing landscape.
The Outpost is a comprehensive collection of curated artificial intelligence software tools that cater to the needs of small business owners, bloggers, artists, musicians, entrepreneurs, marketers, writers, and researchers.
© 2024 TheOutpost.AI All rights reserved