Curated by THEOUTPOST
On Wed, 2 Oct, 4:03 PM UTC
3 Sources
[1]
Nvidia drops a 72 billion-parameter, open-source LLM | Digital Trends
Nvidia, which builds some of the most sought-after GPUs in the AI industry, has announced an open-source large language model that reportedly performs on par with leading proprietary models from OpenAI, Anthropic, Meta, and Google. The company introduced its new NVLM 1.0 family in a recently released white paper, spearheaded by the 72 billion-parameter NVLM-D-72B model. "We introduce NVLM 1.0, a family of frontier-class multimodal large language models that achieve state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models," the researchers wrote.

The new model family is reportedly already capable of "production-grade multimodality," with exceptional performance across a variety of vision and language tasks, as well as improved text-based responses compared to the base LLM on which the NVLM family is built. "To achieve this, we craft and integrate a high-quality text-only dataset into multimodal training, alongside a substantial amount of multimodal math and reasoning data, leading to enhanced math and coding capabilities across modalities," the researchers explained.

The result is an LLM that can just as easily explain why a meme is funny as solve complex mathematics equations step by step. Nvidia also managed to increase the model's text-only accuracy by an average of 4.3 points across common industry benchmarks, thanks to its multimodal training approach. Nvidia appears serious about ensuring that this model meets the Open Source Initiative's newest definition of "open source": it has not only made the model weights available for public review but also promised to release the model's source code in the near future.
This is a marked departure from the actions of rivals like OpenAI and Google, who jealously guard the details of their LLMs' weights and source code. In doing so, Nvidia has positioned the NVLM family not necessarily as a direct competitor to GPT-4o and Gemini 1.5 Pro, but rather as a foundation for third-party developers to build their own chatbots and AI applications.
[2]
Nvidia introduces open-source NVLM 1.0 models
Nvidia has officially entered the ring with a powerful open-source AI model, NVLM 1.0, challenging industry giants like OpenAI and Google. The company's new NVLM 1.0 family of large multimodal language models promises to deliver cutting-edge capabilities across both visual and text-based tasks. Leading the pack is the 72 billion-parameter NVLM-D-72B, a model designed to perform at the highest level, making a major impact on vision-language tasks while improving traditional text-based outputs.

The release of NVLM 1.0 marks a notable shift in an AI ecosystem that proprietary models have largely dominated. Nvidia's decision to make the model weights publicly available -- and eventually release the training code -- offers researchers and developers access to tools that rival the likes of GPT-4. This is a rare move in an industry where most advanced models remain under lock and key, tightly controlled by tech giants. As Nvidia stated in its research paper, "NVLM 1.0 achieves state-of-the-art results on vision-language tasks, rivaling both proprietary and open-access models." For developers, this means a new frontier in AI accessibility, much like what Meta offered with Llama 3.2: smaller labs and independent researchers get a chance to work with top-tier AI tools without navigating prohibitive costs or corporate restrictions.

The open-source release of NVLM 1.0 has generated excitement across the AI research community, with prominent researchers highlighting the model's significance on social media. At the center of this open-source push is the NVLM-D-72B model, which stands out for its ability to handle both visual and textual inputs seamlessly. This multimodal capacity means the model can interpret images, analyze complex visuals, and even solve mathematical problems step by step -- all within a single framework.
Where many multimodal models struggle to retain performance on text-only tasks after integrating visual learning, NVLM-D-72B bucks the trend. According to Nvidia, the model improved its text accuracy by an average of 4.3 points across several key benchmarks after multimodal training. This adaptability positions NVLM-D-72B as a unique tool in a market that typically forces users to choose between models optimized for either visual or textual tasks, but not both.

The NVLM project is not just about open access. It also introduces innovative architectural designs that blend different multimodal processing techniques, pushing the boundaries of what's possible in AI. Nvidia's hybrid approach could well inspire a new direction in AI research and development as teams across the world get their hands on these tools.

However, as with any leap in technology, there are risks involved. Making such powerful AI models widely available raises concerns about potential misuse and the ethical challenges that come with it. The AI community will need to balance the drive for innovation with the need to develop responsible frameworks for using these models.

Nvidia's decision to open-source NVLM 1.0 could set off a wave of change throughout the tech world. Other industry leaders might feel pressure to follow suit, potentially shifting the entire landscape of AI development. If state-of-the-art models become freely accessible, it could force companies to rethink how they generate value and maintain a competitive edge in the market.

The long-term impact of Nvidia's move is still unknown. In the coming months and years, we could see an era of unprecedented collaboration in AI, where researchers from all corners of the globe work together on shared platforms. Or this development could prompt a deeper examination of the consequences of releasing advanced technology without strict controls in place.
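Nvidia's reported 4.3-point improvement is a simple macro-average of per-benchmark gains. A minimal sketch of how such a figure is computed; the benchmark names and scores below are illustrative placeholders, not Nvidia's actual per-benchmark results:

```python
# Hypothetical before/after accuracy scores on text-only benchmarks.
# These numbers are made up for illustration; only the averaging method
# reflects how a headline "average gain" figure is typically derived.
before = {"MMLU": 77.0, "GSM8K": 79.5, "MATH": 48.0, "HumanEval": 73.0}
after = {"MMLU": 80.2, "GSM8K": 85.1, "MATH": 54.3, "HumanEval": 75.1}

# Per-benchmark deltas, then the unweighted (macro) average across tasks.
deltas = {name: after[name] - before[name] for name in before}
avg_gain = sum(deltas.values()) / len(deltas)
print(f"average gain: {avg_gain:+.1f} points")  # average gain: +4.3 points
```

Note that a macro-average weights every benchmark equally regardless of its size or difficulty, which is why per-benchmark numbers matter more than the headline figure.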
One thing is clear: Nvidia's release of NVLM 1.0 is a game-changing move that signals a shift in the balance of power within the AI industry. By making such a high-caliber model open-source, Nvidia is challenging the status quo and setting off what could be a new chapter in AI development. The question now isn't whether the AI market will change -- it's how dramatically, and who will be able to keep up.
[3]
Nvidia just dropped a bombshell: Its new AI model is open, massive, and ready to rival GPT-4
Nvidia has released a powerful open-source artificial intelligence model that competes with proprietary systems from industry leaders like OpenAI and Google. The company's new NVLM 1.0 family of large multimodal language models, led by the 72 billion-parameter NVLM-D-72B, demonstrates exceptional performance across vision and language tasks while also enhancing text-only capabilities.

"We introduce NVLM 1.0, a family of frontier-class multimodal large language models that achieve state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models," the researchers explain in their paper. By making the model weights publicly available and promising to release the training code, Nvidia breaks from the trend of keeping advanced AI systems closed. This decision grants researchers and developers unprecedented access to cutting-edge technology.

NVLM-D-72B: A versatile performer in visual and textual tasks

The NVLM-D-72B model shows impressive adaptability in processing complex visual and textual inputs. Researchers provided examples that highlight the model's ability to interpret memes, analyze images, and solve mathematical problems step by step. Notably, NVLM-D-72B improves its performance on text-only tasks after multimodal training. While many similar models see a decline in text performance, NVLM-D-72B increased its accuracy by an average of 4.3 points across key text benchmarks. "Our NVLM-D-1.0-72B demonstrates significant improvements over its text backbone on text-only math and coding benchmarks," the researchers note, emphasizing a key advantage of their approach.

AI researchers respond to Nvidia's open-source initiative

The AI community has reacted positively to the release. One AI researcher, commenting on social media, observed: "Wow! Nvidia just published a 72B model which is ~on par with llama 3.1 405B in math and coding evals and also has vision?"

Nvidia's decision to make such a powerful model openly available could accelerate AI research and development across the field. By providing access to a model that rivals proprietary systems from well-funded tech companies, Nvidia may enable smaller organizations and independent researchers to contribute more significantly to AI advancements. The NVLM project also introduces innovative architectural designs, including a hybrid approach that combines different multimodal processing techniques. This development could shape the direction of future research in the field.

NVLM 1.0: A new chapter in open-source AI development

Nvidia's release of NVLM 1.0 marks a pivotal moment in AI development. By open-sourcing a model that rivals proprietary giants, Nvidia isn't just sharing code -- it's challenging the very structure of the AI industry. This move could spark a chain reaction. Other tech leaders may feel pressure to open their research, potentially accelerating AI progress across the board. It also levels the playing field, allowing smaller teams and researchers to innovate with tools once reserved for tech giants.

However, NVLM 1.0's release isn't without risks. As powerful AI becomes more accessible, concerns about misuse and ethical implications will likely grow. The AI community now faces the complex task of promoting innovation while establishing guardrails for responsible use. Nvidia's decision also raises questions about the future of AI business models. If state-of-the-art models become freely available, companies may need to rethink how they create value and maintain competitive edges in AI.

The true impact of NVLM 1.0 will unfold in the coming months and years. It could usher in an era of unprecedented collaboration and innovation in AI. Or it might force a reckoning with the unintended consequences of widely available, advanced AI.
One thing is certain: Nvidia has fired a shot across the bow of the AI industry. The question now is not if the landscape will change, but how dramatically -- and who will adapt fast enough to thrive in this new world of open AI.
NVIDIA has released an open-source large language model with 72 billion parameters, positioning it as a potential competitor to OpenAI's GPT-4. This move marks a significant shift in NVIDIA's AI strategy and could reshape the AI landscape.
In a surprising move that has sent ripples through the AI community, NVIDIA has unveiled a massive open-source large language model (LLM) boasting 72 billion parameters [1]. The new model, the flagship of NVIDIA's NVLM 1.0 family, is being positioned as a potential rival to OpenAI's GPT-4, signaling a significant shift in NVIDIA's approach to AI development and distribution.
The sheer size of NVIDIA's new model is noteworthy. With 72 billion parameters, it stands as one of the largest openly available AI models to date. This scale puts it in direct competition with proprietary models like GPT-4, which is estimated to have over 1 trillion parameters [2]. The model's release is part of NVIDIA's broader NVLM 1.0 family of multimodal models, spearheaded by the flagship NVLM-D-72B.
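For readers gauging whether such a model could run on their own hardware, parameter count maps roughly onto memory footprint. A back-of-envelope sketch, assuming 2 bytes per parameter for 16-bit weights and ignoring the additional memory needed for activations and the KV cache:

```python
# Rough memory estimate for storing a model's weights alone.
# Assumes 16-bit (bf16/fp16) weights at 2 bytes per parameter; real
# serving also needs memory for activations and the KV cache.
def weight_memory_gb(n_params: float, bytes_per_param: float = 2) -> float:
    """Return weight storage in decimal gigabytes."""
    return n_params * bytes_per_param / 1e9

print(f"NVLM-D-72B at bf16: ~{weight_memory_gb(72e9):.0f} GB")        # ~144 GB
print(f"NVLM-D-72B at 4-bit: ~{weight_memory_gb(72e9, 0.5):.0f} GB")  # ~36 GB
```

Even aggressively quantized, a 72B model sits well beyond a single consumer GPU, which is why multi-GPU or datacenter hardware is the realistic deployment target.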
NVIDIA's decision to make this model open-source is a departure from the closed, proprietary approach taken by many leading AI companies. This move could democratize access to advanced AI capabilities, allowing researchers, developers, and businesses to build upon and customize the model for various applications [3]. The open-source nature of the model also promotes transparency and collaborative improvement within the AI community.
While the exact capabilities of NVIDIA's new model are still being explored, early reports suggest impressive performance across a range of tasks. The model has shown proficiency in natural language processing, code generation, and complex reasoning [1]. However, it's important to note that parameter count alone doesn't determine a model's effectiveness, and real-world applications will be crucial in assessing its true potential.
The release of this open-source model by NVIDIA, a company primarily known for its hardware contributions to AI, could significantly alter the competitive dynamics in the AI field. It challenges the dominance of closed-source models and may accelerate the pace of AI innovation by providing a robust foundation for further development [3]. This move also aligns with growing calls for more openness and accessibility in AI research and development.
As with any powerful AI model, the release of NVIDIA's 72 billion parameter LLM raises important ethical considerations. The open-source nature of the model means that it could potentially be used for both beneficial and harmful purposes. NVIDIA has emphasized the importance of responsible AI development and use, but the broader implications of such a powerful, openly available model remain a topic of discussion in the tech and ethics communities [2].