Curated by THEOUTPOST
On Mon, 28 Oct, 4:02 PM UTC
9 Sources
[1]
We have an official open-source AI definition now, but the fight is far from over
RALEIGH, N.C. -- The Open Source Initiative (OSI) released the Open Source AI Definition (OSAID) 1.0 on Oct. 28, 2024, at the All Things Open conference. Creating it wasn't easy: it took the OSI almost two years to draft and ratify the OSAID. But with no change from the definition's last draft, it's finally done. Unfortunately, not everyone is happy with it, and even its creators admit it's a work in progress.

Why? Carlo Piana, the OSI's chairman and an attorney, explained in an interview: "Our collective understanding of what AI does, what's required to modify language models, is limited now. The more we use it, the more we'll understand. Right now our understanding is limited, and we don't know yet what the technology will look like in one year, two years, or three years." Or, as Taylor Dolezal, head of ecosystem for the Cloud Native Computing Foundation (CNCF), put it, "Balancing open source principles with AI complexities can sometimes feel like trying to solve a Rubik's Cube blindfolded."

As for why people object to the new definition, broadly speaking, three groups are concerned with the OSAID: pragmatists, idealists, and faux-source business leaders. To start, you need to understand what the conflicts are about. Mark Collier, the OpenStack Foundation's COO, who helped draft the OSAID, recently put it well in an essay:

One of the biggest challenges in creating the Open Source AI Definition is deciding how to treat datasets used during the training phase. At first, requiring all raw datasets to be made public might seem logical. However, this analogy between datasets and source code is imperfect and starts to fall apart the closer you look. Training data influences models through patterns, while source code provides explicit instructions. AI models produce learned parameters (weights), whereas software is directly compiled from source code. ... Many AI models are trained on proprietary or legally ambiguous data, such as web-scraped content or sensitive datasets like medical records. [Therefore] any publicly available data used for training should be accessible, alongside full transparency about all datasets used and the procedures followed for cleaning and labeling them. Striking the right balance on this issue is one of the toughest parts of creating the definition, especially with the rapid changes in the market and legal landscape.

So it is that the pragmatists wanted, and got, an open-source AI definition under which not all the data needs to be open and shared. For their purposes, there need only be "sufficiently detailed information about the data used to train the system" rather than the full dataset itself. This approach aims to balance transparency with practical and legal considerations, such as copyright and private medical data.

Besides the OSI, organizations like the Mozilla Foundation, the OpenInfra Foundation, Bloomberg Engineering, and SUSE have endorsed the OSAID. For example, Alan Clark of SUSE's CTO office said, "SUSE applauds the progress made by the OSI and its OSAID. The efforts are culminating in a very thorough definition, which is important for the quickly evolving AI landscape and the role of open source within it. We commend the process OSI is utilizing to arrive at the definition and the adherence to the open source methodologies." Academics have also approved of this first OSAID release.
Percy Liang, director of the Center for Research on Foundation Models at Stanford University, said in a statement, "Coming up with the proper open-source definition is challenging, given restrictions on data, but I'm glad to see that the OSI v1.0 definition requires at least that the complete code for data processing (the primary driver of model quality) be open-source. The devil is in the details, so I'm sure we'll have more to say once we have concrete examples of people trying to apply this Definition to their models."

Speaking of that devil, the idealists strongly object to non-open data being allowed inside an open-source AI model. While Piana stated, "The board is confident that the process has resulted in a definition that meets the standards of Open Source as defined in the Open Source Definition and the Four Essential Freedoms," the idealists don't see it that way at all.

Tom Callaway, Principal Open-Source Technical Strategist at Amazon Web Services (AWS), summarized their objections well: "The simple fact remains... it allows you to build an AI system binary from proprietary data sources and call the result 'open source,' and that's simply wrong. It damages every established understanding of what 'open source' is, all in the name of hoping to attach that brand to a 'bigger tent' of things."

The OSI is well aware of these arguments. At an OSI panel discussion at All Things Open, an OSI representative said, "Members of our communities are upset. They felt like their voices were not heard as a part of this process." The OSI felt it had to come up with a definition because laws were being passed in both the US and the EU about open-source AI without defining it. The OSI and many other groups felt the issue had to be addressed before companies went ahead with their own bogus open-source AI definitions.

Looking ahead, the OSI will adjust the definition to address upcoming changes in AI. In the meantime, at least one group, the Digital Public Goods Alliance (DPGA), is updating its DPG Standard for AI to mandate open training data for AI systems. Its proposal will appear on GitHub in early November and will be open for a four-week public comment period. There will be more such efforts.

The faux-source companies have a vested interest in their programs being considered open source. The laws and regulations for open-source AI are more lenient than those for proprietary AI systems, which means these companies can save a lot of money if their products are regulated under open-source rules. For example, Meta's Llama 3 license doesn't make the open-source grade on several grounds. Nonetheless, Meta claimed, "There is no single open-source AI definition, and defining it is a challenge because previous open-source definitions do not encompass the complexities of today's rapidly advancing AI models." Meta and other major AI powers, such as OpenAI, will try to get governments to recognize their self-defined definitions. I expect them to come up with a faux-source AI definition to cover their proprietary products and services.

What all this means, from where I sit, is that while the OSAID is a standard many groups will observe, the conflicts over what really is open-source AI have only just begun. I don't see any resolution to the conflict for years to come. Now, most AI users won't care.
They just want help with their homework or their Star Wars fanfic, and they want their jobs made easier. It's an entirely different story for companies and government agencies. For them, open-source AI is vital for both business and development purposes.
[2]
We finally have an 'official' definition for open source AI
There's finally an "official" definition of open source AI. The Open Source Initiative (OSI), a long-running institution aiming to define and "steward" all things open source, today released version 1.0 of its Open Source AI Definition (OSAID). The product of several years of collaboration with academia and industry, the OSAID is intended to offer a standard by which anyone can determine whether AI is open source -- or not.

You might be wondering -- as this reporter was -- why consensus matters for a definition of open source AI. Well, a big motivation is getting policymakers and AI developers on the same page, said OSI EVP Stefano Maffulli. "Regulators are already watching the space," Maffulli told TechCrunch, noting that bodies like the European Commission have sought to give special recognition to open source. "We did explicit outreach to a diverse set of stakeholders and communities -- not only the usual suspects in tech. We even tried to reach out to the organizations that most often talk to regulators in order to get their early feedback."

Open AI

To be considered open source under the OSAID, an AI model has to provide enough information about its design so that a person could "substantially" recreate it. The model must also disclose any pertinent details about its training data, including the provenance, how the data was processed, and how it can be obtained or licensed. "An open source AI is an AI model that allows you to fully understand how it's been built," Maffulli said. "That means that you have access to all the components, such as the complete code used for training and data filtering." The OSAID also lays out usage rights developers should expect with open source AI, like the freedom to use the model for any purpose and modify it without having to ask anyone's permission. "Most importantly, you should be able to build on top," added Maffulli.

The OSI has no enforcement mechanisms to speak of. It can't pressure developers to abide by the OSAID. But it does intend to flag models that are described as "open source" but fall short of the definition. "Our hope is that when someone tries to abuse the term, the AI community will say, 'We don't recognize this as open source,' and it gets corrected," Maffulli said. Historically, this has had mixed results, but it isn't entirely without effect.

Many startups and big tech companies, most prominently Meta, have employed the term "open source" to describe their AI model release strategies -- but few meet the OSAID's criteria. For example, Meta mandates that platforms with over 700 million monthly active users request a special license to use its Llama models. Maffulli has been openly critical of Meta's decision to call its models "open source." After discussions with the OSI, Google and Microsoft agreed to drop their use of the term for models that aren't fully open, but Meta hasn't, he said. Stability AI, which has long advertised its models as "open," requires that businesses making more than $1 million in revenue obtain an enterprise license. And French AI upstart Mistral's license bars the use of certain models and outputs for commercial ventures.

A study last August by researchers at the Signal Foundation, the nonprofit AI Now Institute, and Carnegie Mellon found that many "open source" models are basically open source in name only. The data required to train the models is kept secret, the compute power needed to run them is beyond the reach of many developers, and the techniques to fine-tune them are intimidatingly complex.
Instead of democratizing AI, these "open source" projects tend to entrench and expand centralized power, the study's authors concluded. Indeed, Meta's Llama models have racked up hundreds of millions of downloads, and Stability claims that its models power up to 80% of all AI-generated imagery.

Dissenting opinions

Meta disagrees with this assessment, unsurprisingly -- and takes issue with the OSAID as written (despite having participated in the drafting process). A spokesperson defended the company's license for Llama, arguing that the terms -- and accompanying acceptable use policy -- act as guardrails against harmful deployments. Meta also said it's taking a "cautious approach" to sharing model details, including details about training data, as regulations like California's training transparency law evolve.

"We agree with our partner the OSI on many things, but we, like others across the industry, disagree with their new definition," the spokesperson said. "There is no single open source AI definition, and defining it is a challenge because previous open source definitions do not encompass the complexities of today's rapidly advancing AI models. We make Llama free and openly available, and our license and acceptable use policy help keep people safe by having some restrictions in place. We will continue working with the OSI and other industry groups to make AI more accessible and free responsibly, regardless of technical definitions."

The spokesperson pointed to other efforts to codify "open source" AI, like the Linux Foundation's suggested definitions, the Free Software Foundation's criteria for "free machine learning applications," and proposals from other AI researchers. Meta, incongruously enough, is one of the companies funding the OSI's work -- along with tech giants like Amazon, Google, Microsoft, Cisco, Intel, and Salesforce. (The OSI recently secured a grant from the nonprofit Sloan Foundation to lessen its reliance on tech industry backers.)

Meta's reluctance to reveal training data likely has to do with the way its -- and most -- AI models are developed. AI companies scrape vast amounts of images, audio, videos, and more from social media and websites, and train their models on this "publicly available data," as it is usually called. In today's cut-throat market, a company's methods of assembling and refining datasets are considered a competitive advantage, and companies cite this as one of the main reasons for their nondisclosure.

But training data details can also paint a legal target on developers' backs. Authors and publishers claim that Meta used copyrighted books for training. Artists have filed suits against Stability for scraping their work and reproducing it without credit, an act they compare to theft. It's not tough to see how the OSAID could be problematic for companies trying to resolve lawsuits favorably, especially if plaintiffs and judges find the definition compelling enough to use in court.

Open questions

Some suggest the definition doesn't go far enough, for instance in how it deals with proprietary training data licensure. Luca Antiga, the CEO of Lightning AI, points out that a model may meet all of the OSAID's requirements even though the data used to train it isn't freely available. Is it "open" if you have to pay thousands to inspect the private stores of images that a model's creators paid to license?
"To be of practical value, especially for businesses, any definition of open source AI needs to give reasonable confidence that what is being licensed can be licensed for the way that an organization is using it," Antiga told TechCrunch. "By neglecting to deal with licensing of training data, the OSI is leaving a gaping hole that will make terms less effective in determining whether OSI-licensed AI models can be adopted in real-world situations." In version 1.0 of the OSAID, the OSI also doesn't address copyright as it pertains to AI models, and whether granting a copyright license would be enough to ensure a model satisfies the open source definition. It's not clear yet whether models -- or components of models -- can be copyrighted under current IP law. But if the courts decide they can be, the OSI suggests new "legal instruments" may be needed to properly open source IP-protected models. Maffulli agreed that the definition will need updates -- perhaps sooner than later. To this end, the OSI has established a committee that'll be responsible for monitoring how the OSAID is applied, and proposing amendments for future versions. "This isn't the work of lone geniuses in a basement," he said. "It's work that's being done in the open with wide stakeholders and different interest groups."
[3]
The OSI clarifies what makes AI systems open-source, but most 'open' models fall short - SiliconANGLE
The highly respected Open Source Initiative, which has a reputation as one of the most prominent stewards of open-source software, has finally come up with an official definition of what makes artificial intelligence models open or not. The definition was immediately rejected by Meta Platforms Inc., whose popular Llama large language models fail to make the grade.

The OSI unveiled the Open Source AI Definition v1.0 at the All Things Open 2024 conference taking place in Raleigh, North Carolina, this week, saying it followed a years-long process that saw it collaborate with various organizations and academia. It intends for the OSAID to be a standard by which anyone can determine if an AI system is truly open-source or not. Standards for what makes traditional software "open" have long been agreed on, but AI software is a different beast, as it incorporates elements that aren't covered by traditional licenses, such as the vitally important data used to train it. That's why the OSI spent years coming up with a new definition explicitly for such systems, and it has decreed that for AI to be considered truly open-source, it must provide three things: sufficiently detailed information about the data used to train the system, the complete code used to train and run it, and the model's parameters, such as weights.

Much to the chagrin of self-professed "champions" of open-source AI, like Meta, Stability AI Ltd. and Mistral, the vast majority of their AI models fall short of the OSI's definition. For instance, Meta's Llama models come with restrictions on commercial use, which prevents them from being used freely by applications with over 700 million users. In addition, Meta does not provide access to its training datasets, nor does it provide comprehensive details about that data, so the Llama models are impossible to recreate. Stability AI, which specializes in image and video generation, has long insisted that its popular Stable Diffusion models are "open." But it also falls short of the OSI's definition due to its demand that businesses with over $1 million in annual revenue purchase an enterprise license to use its models. Meanwhile, Mistral also puts restrictions on the use of its newest Ministral 3B and 8B models for certain commercial ventures.

It's likely that far more AI companies professing to be open-source will be upset by the OSI's definition. A recent study by Carnegie Mellon, the AI Now Institute and the Signal Foundation found that the vast majority of "open-source" models are in fact much more secretive than such a claim merits. For instance, very few release the datasets used to train the models, and most require vast amounts of computing power to train and run, which puts them beyond the reach of most developers.

In the case of Llama, Meta says safety concerns prevent it from making the underlying training data available to the community, but few people believe that's the only reason. It's almost certainly the case that Meta uses vast amounts of content posted by users of platforms like Facebook and Instagram, including content that is restricted to the user's contacts only. In addition, Llama is likely trained on a hefty amount of copyrighted material that has been posted on the web, and Meta doesn't want to publicize the details. In April, the New York Times said Meta had acknowledged internally that Llama's training dataset includes copyrighted content, because there's no feasible way to avoid collecting such material.
Still, the company needs to keep schtum, for it's currently embroiled in a litany of lawsuits brought by publishers, authors and other content creators. Rather than challenge the OSI outright, Meta appears to "agree to disagree" with its definition of what constitutes open-source AI. A spokesperson for the company said that while Meta agrees with the OSI on many things, it doesn't concur with today's pronouncement. "There is no single open source AI definition, and defining it is a challenge because previous open source definitions do not encompass the complexities of today's rapidly advancing AI models," the spokesperson said.

Meta's problem is that most people are likely to accept the OSI's definition, because it's based on fairly straightforward logic, Rob Enderle, an analyst with the Enderle Group, told SiliconANGLE. "The OSI is correct in its assessment because without transparency on training data you really don't have an open platform," Enderle said. "Training data isn't a trivial thing, as it defines how the AI functions. Without access to it, the AI system cannot be open, because the very nature of how it works is closed."

Most experts who don't have a stake in the big technology companies pursuing AI are likely to agree with the OSI's definition. The organization's definition of open-source software is widely regarded as the bottom line on which software is free to use without fear of lawsuits and licensing traps. Moreover, it spent more than two years working closely with various academics, AI developers and researchers to refine its definition of open-source AI. In addition, the OSI's definition closely resembles an earlier attempt to clarify what makes AI open: earlier this year, the Linux Foundation published its own definition, listing many of the same requirements.
[4]
What qualifies as open-source AI? Open Source Initiative clarifies
After sustained ambiguity around open-source artificial intelligence (AI), the Open Source Initiative (OSI) has introduced a definition for it. For companies to call their AI 'open source', they must share detailed information about the data used to train the AI system. This would allow a developer to create AI models that are substantially similar to an existing open-source AI model. Specifically, companies must describe the provenance of the training data, how it was processed, and how it can be obtained or licensed.

The concept of open source comes from software development. It refers to publicly accessible software (and now also AI) that people can modify and share. Within the realm of software development, software whose source code is publicly available for people to inspect and modify is called open-source. The Open Source Initiative's definition of open-source software is recognized internationally, including by several governments, which makes it a particularly relevant voice in the open-source ecosystem.

In the AI space, there has been no agreed-upon definition of open-source AI. Lea Gimpel of the Digital Public Goods Alliance pointed this out during Carnegie India's Global Technology Summit last year. "There are currently several work streams and ways that are trying to define open source AI with the community to better understand what would we actually need to open source in order to maintain the benefits that we see in open source software," she noted.

OSI explains that open-source models must grant people the freedom to use the AI system for any purpose without seeking permission. Companies creating such models also need to allow people to study the AI and inspect how it works, modify it for any purpose, and share it with others to use (with or without modifications). Its definition could lend clarity on the full criteria an AI model release must meet to qualify as open source. Besides details of the training data, companies must also release the code used to train and run the system, as well as the model parameters, such as weights, for their models to classify as open-source AI. Companies can ask any person using their open-source model to release products/services built on top of the model under the same terms and conditions as the original model.

OSI acknowledges that currently there is no clear way to "legally" make AI model parameters (weights) freely available; companies could make these parameters freely available without a license or could require some kind of legal document to do so. "We expect this will become clearer over time, once the legal system has had more opportunity to address Open Source AI systems," OSI adds.

Many companies classify their AI systems as open source; for example, Meta labels its large language models Llama 2 and Llama 3 this way. The company makes the model weights publicly available but does not specify the training datasets it uses. However, both these models have restrictions baked into their licenses. One such restriction concerns the scale of commercial organisations that can access Llama under an open-source license: if a company has more than 700 million monthly active users, it has to request a license from Meta before using Llama 2 or Llama 3. Gimpel explained during the aforementioned summit that Llama wouldn't strictly fall within the scope of open source going by the software-based definition, because of these baked-in restrictions.
Even with OSI's newly created definition, these restrictions make Meta's models fall short of the scope of "open source." Other companies, like Apple, have also come out with their own open-source AI systems, namely the OpenELM family of models. Unlike Meta's models, these allow people to use, reproduce, modify, and redistribute them with or without making changes. They also provide the complete framework for "training and evaluation of the language model on publicly available datasets, including training logs, multiple checkpoints, and pre-training configurations," the company's research paper on OpenELM says. OpenELM comes with a complete list of the datasets Apple used for model training; these include Wikipedia, Wikibooks, Reddit, GitHub, the open-access scholarly archive arXiv.org, and Project Gutenberg.

So far, organisations like the Mozilla Foundation, Common Crawl, Bloomberg Engineering, and the OpenInfra Foundation have endorsed OSI's definition of open source. Names like Meta, whose open-source models would not fit OSI's definition, are notably missing from the list of endorsements.
[5]
What Does Open-Source AI Actually Mean? There's Finally a Definition
Companies like Meta have been calling their products open source. They're not.

In the buzzy world of AI, boring things like definitions often get overlooked. The term artificial intelligence itself is so broadly applied that it can refer to everything from linear regression models to killer robots. But when it comes to regulating emerging technologies, clear and precise definitions are important. Without them, you end up with the kind of goofy-if-it-wasn't-so-serious debates that state lawmakers around the country are having, like whether the language they wrote to ban deceptive deepfakes will also apply to spell check.

So while it might not be the sexiest project, it's notable that, following years of research and global debate, the Open Source Initiative has finally agreed on a definition for "open source AI" that the nonprofit organization hopes can guide international regulation. In order to be labeled open source under the new definition, an AI system -- including its component code, weights, and training data -- must be made freely available in such a way that anyone can, without permission, use it for any purpose, study how it works, modify it, and share it with others.

That's a pretty big departure from the way some tech companies have used the label amid the generative AI arms race. Most notably, Meta advertises its Llama family of models as open-source because they're free to use (as long as developers adhere to the company's license terms) and some of the code is publicly available. Last year, Meta also helped create a lobbying coalition called the AI Alliance to advocate for policies that benefit its particular brand of open-source technologies. Llama models don't qualify as open source because their licenses still place limits on how they can be used for some commercial purposes, like improving other large language models. They also outright prohibit uses that might violate various laws or cause harm. Meta has also not fully disclosed the training data for its Llama models.

The Open Source Initiative's new definition was the product of years of discussions between stakeholders from more than 35 countries, the organization said. "Arriving at today's [open source AI definition] version 1.0 was a difficult journey, filled with new challenges for the OSI community," Open Source Initiative Executive Director Stefano Maffulli said in a statement. "Despite this delicate process, filled with differing opinions and uncharted technical frontiers -- and the occasional heated exchange -- the results are aligned with the expectations set out at the start of this two-year process. This is a starting point for a continued effort to engage with the communities to improve the definition over time as we develop with the broader Open Source community the knowledge to read and apply OSAID v.1.0."
[6]
Meta, OSI tussle over definition of open source AI
Why it matters: Meta makes its Llama models freely available for others to use, but doesn't provide full disclosure of all of the elements that go into them.

Driving the news: The Open Source Initiative on Monday published its definition of what constitutes a truly open source AI model, outlining four characteristics that it says should apply to AI just as they do to software. The organization says people should be able to use the system for any purpose without asking permission, study how it works and inspect its components, modify it for any purpose, and share it for others to use, with or without modifications.

The other side: Meta, for its part, said it disagrees with the OSI's definition.

Between the lines: OSI can't stop anyone else from calling their product "open source," but its new definition gives ammo to advocates of fuller disclosure of the weights and data that differentiate one model from another.
[7]
Open-source AI must reveal its training data, per new OSI definition
"We will continue working with OSI and other industry groups to make AI more accessible and free responsibly, regardless of technical definitions," Eischen added. For 25 years, OSI's definition of open-source software has been widely accepted by developers who want to build on each other's work without fear of lawsuits or licensing traps. Now, as AI reshapes the landscape, tech giants face a pivotal choice: embrace these established principles or reject them. The Linux Foundation has also made a recent attempt to define "open-source AI," signaling a growing debate over how traditional open-source values will adapt to the AI era.
[8]
Why Meta's 'open source' AI isn't all it seems
The Open Source Initiative has just set a new international definition for open source AI that could throw a spanner in the works for tech companies.

Meta and some other technology firms rolling out so-called open source generative artificial intelligence (AI) models are "depriving the public from having innovation cycles" and making a profit from it, according to the group that has pioneered the open source term in software for the past 25 years.

Open source is yet another buzzword in AI circles, with Big Tech companies such as Meta and Elon Musk's Grok branding their AI models open source; open source is "good for the world," according to Facebook founder Mark Zuckerberg. But no one can agree on what open source AI means. That could change as the Open Source Initiative (OSI), the organisation that is the self-appointed steward of the term, sets a final definition for open source AI on Monday, and it is not the same as Meta's version of the term.

"They fail, especially Meta, because their terms of use and terms of distribution are incompatible with the open source definition and the open source principles," Stefano Maffulli, who heads the OSI, told Euronews Next. "They're basically the Microsoft, the Oracle, the Adobe of this space, where they say 'build on top of my platform, don't worry about it, and I'll keep on getting grants from you using our platforms.' But they also say 'it's open, so everyone can use it,'" he added.

The OSI definition took a couple of years to cook up, and the organisation consulted a 70-person group of researchers, lawyers, policymakers and activists, as well as representatives of big tech companies such as Microsoft, Meta, and Google. It states that an open source AI can be used for any reason without getting permission from the company, and researchers should be able to freely see how the system works. It also says that the AI system can be modified for any purpose, including to change its output, and shared for others to use, with or without modifications, for any reason.

Meta's Llama 3.1 model is partially open source, according to the OSI definition, in that developers and researchers can download it for free and customise it. But Meta does not specify where it got the data to train Llama 3.1, which can be problematic for users as it could lead to copyright issues or biased data. Maffulli said that when tech companies do say where the data comes from, they are often vague and will simply say "the Internet." But he said that the "real innovation," and the way AI models perform better, is in how the datasets are passed through the training machinery. "If you talk to companies, they don't want to release that code," Maffulli said, adding that "that's where the innovation happens."

By confusing which AI models are truly open source, Meta and other firms may hamper the long-term development of AI models that are controlled by the user rather than by several tech companies, Maffulli said. "I fear that society as a whole would be in a worse place if we let a handful of companies just go on and be the only ones who have the edge and the access to innovation this way," he added.

Euronews Next contacted Meta for a reaction but did not receive a reply at the time of publication. However, Zuckerberg said in a blog post that "we're taking the next steps towards open source AI becoming the industry standard" and that Llama "has been a bedrock of AI innovation globally."
Maffulli said that other companies, such as Microsoft and Google, had retracted their use of the open source term for models that were not fully open, as per the definition. But he said that talks with Meta did not produce any result.

The open source label can have positive connotations for a tech company's image, as the software is free to use. But confusion around the term can lead to "openwashing," experts have previously told Euronews Next: companies promote open models without contributing to the commons, which can affect innovation and the public's understanding of AI. Using the open source term can also benefit a company's bottom line, as other companies adopt the open source technology and their innovations then become easier to integrate into its products. In a February earnings call, Zuckerberg said: "Open source software often becomes an industry standard, and when companies standardise on building with our stack, that then becomes easier to integrate new innovations into our products."

Unlike in the 2000s, when social media and the Big Tech companies took off and were largely unregulated, Maffulli believes it will be a different story with AI, as now "regulators are watching and are already regulating." While the OSI is the steward of the open source AI definition, it does not have any strong power to enforce it. However, judges and courts around the world are starting to recognise that the open source definition is important, especially when it comes to mergers but also regulation. "We do expect the definition to have an impact on regulators," Maffulli said. "They're watching us. We have become credible interlocutors."
[9]
OSI unveils Open Source AI Definition 1.0
The Open Source Initiative (OSI) on Monday released the result of a year-long global community initiative to create a standard defining what, exactly, constitutes an open source artificial intelligence (AI) system. The Open Source AI Definition (OSAID) v1.0, unveiled at the All Things Open 2024 conference in Raleigh, North Carolina, is described as "the first stable version" of a project to establish a set of principles "that can recreate permissionless, pragmatic, and simplified collaboration for AI practitioners, similar to that which the Open Source Definition has done for the software ecosystem," the OSI said in its FAQ. Created via a co-design process involving more than 25 organizations, including leaders from commercial entities such as Microsoft, Google, Amazon, Meta, Intel, and Samsung, and groups including the Mozilla Foundation, the Linux Foundation, the Apache Software Foundation, and the United Nations' International Telecommunication Union, the document has already been endorsed by organizations worldwide.
The Open Source Initiative (OSI) has released the Open Source AI Definition (OSAID) 1.0, establishing criteria for what qualifies as open-source AI. This definition has sparked debate and disagreement among tech companies and AI developers.
The Open Source Initiative (OSI) has officially released version 1.0 of the Open Source AI Definition (OSAID) at the All Things Open conference on October 28, 2024 [1]. This definition aims to establish clear criteria for what qualifies as open-source AI, addressing the growing need for standardization in the rapidly evolving field of artificial intelligence.
According to the OSI, for an AI system to be considered open-source, it must grant the freedom to use it for any purpose without asking permission, to study how it works, to modify it, and to share it with others; it must also provide sufficiently detailed information about the training data, the complete code used to train and run the system, and the model parameters, such as weights [2][5].
The release of OSAID has sparked debate within the tech industry. While organizations like Mozilla Foundation, OpenInfra Foundation, and SUSE have endorsed the definition [1], major players like Meta have expressed disagreement.
Meta, which advertises its Llama models as open-source, argues that the new definition doesn't encompass the complexities of today's AI models [3]. The company's spokesperson stated, "There is no single open source AI definition, and defining it is a challenge because previous open source definitions do not encompass the complexities of today's rapidly advancing AI models" [4].
Many AI models currently labeled as "open-source" fall short of the OSI's new definition. Meta's Llama models restrict commercial use by platforms with over 700 million monthly active users and do not disclose their training data; Stability AI requires businesses with over $1 million in annual revenue to obtain an enterprise license; and Mistral bars the use of certain models for some commercial ventures [2][3].
A key point of contention is the requirement for transparency in training data. The OSI argues that without access to training data, an AI system cannot be truly open because the nature of how it works remains closed [4]. This requirement poses challenges for companies like Meta, which may use proprietary or sensitive data in their training processes [1].
The introduction of OSAID is likely to have significant implications for AI development and regulation. Policymakers in the US and EU, who are already drafting rules that treat open-source AI differently from proprietary systems, now have a reference definition to point to [1]; the OSI intends to publicly flag models that claim the "open source" label but fall short of it [2]; and the definition could be invoked in ongoing copyright lawsuits over training data [2].
As the AI landscape continues to evolve, the OSI acknowledges that the definition may need to be adjusted over time to address emerging challenges and technologies in the field [5].