Curated by THEOUTPOST
On Thu, 6 Mar, 4:02 PM UTC
3 Sources
[1]
Alibaba's new open source model QwQ-32B matches DeepSeek R1 with way smaller compute requirements
Chinese e-commerce giant Alibaba's team behind its growing family of open-source Qwen large language models (LLMs) today unveiled QwQ-32B, a new 32-billion-parameter reasoning model designed to improve performance on complex problem-solving tasks through reinforcement learning. The model is available as open weights on Hugging Face and ModelScope under an Apache 2.0 license, meaning it can be used for both commercial and research purposes: enterprises can take it and use it immediately to power their products and applications, even ones they charge customers to use. Individual users can also access it via Qwen Chat.

Qwen-with-Questions was Alibaba's answer to OpenAI's original reasoning model o1

QwQ, short for Qwen-with-Questions, was first introduced by Alibaba in November 2024 as an open-source reasoning model aimed at competing with OpenAI's o1-preview. At launch, the model was designed to enhance logical reasoning and planning by reviewing and refining its own responses during inference, a technique that made it particularly effective in math and coding tasks. The initial version of QwQ featured 32 billion parameters and a 32,000-token context length, with Alibaba highlighting its ability to outperform o1-preview on mathematical benchmarks like AIME and MATH, as well as scientific reasoning tasks such as GPQA.

Despite its strengths, QwQ's early iterations struggled with programming benchmarks like LiveCodeBench, where OpenAI's models maintained an edge. Additionally, as with many emerging reasoning models, QwQ faced challenges such as language mixing and occasional circular reasoning loops. However, Alibaba's decision to release the model under an Apache 2.0 license ensured that developers and enterprises could freely adapt and commercialize it, distinguishing it from proprietary alternatives like OpenAI's o1.
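The inference-time self-review described above can be sketched as a simple loop: draft an answer, critique it, then revise. The following is a hypothetical illustration only; `generate` is a stand-in that simulates a model call, not a real QwQ API:

```python
# Toy sketch of inference-time self-review: draft, critique, revise.
# `generate` is a hypothetical stand-in for an LLM call; it simulates a
# model that improves its draft once a critique is appended to the prompt.

def generate(prompt: str) -> str:
    """Stand-in for an LLM call; not a real QwQ API."""
    if "Critique:" in prompt:
        return "Answer: 42 (verified by re-deriving each step)"
    return "Answer: 41 (first draft)"

def self_refine(question: str, rounds: int = 2) -> str:
    """Draft an answer, then repeatedly critique and revise it."""
    draft = generate(question)
    for _ in range(rounds):
        critique_prompt = (
            f"{question}\nDraft: {draft}\n"
            "Critique: check the reasoning and fix any errors."
        )
        draft = generate(critique_prompt)
    return draft

print(self_refine("What is 6 * 7?"))
```

With a real model in place of the stub, each pass gives the model a chance to catch the kinds of errors (including circular reasoning) that plagued early QwQ iterations.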
Since QwQ's initial release, the AI landscape has evolved rapidly. The limitations of traditional large language models (LLMs) have become more apparent, with scaling laws yielding diminishing returns in performance improvements. This shift has fueled interest in large reasoning models (LRMs) -- a new category of AI systems that leverage inference-time reasoning and self-reflection to enhance accuracy -- including OpenAI's o3 series and the massively successful DeepSeek R1 from rival Chinese lab DeepSeek, an offshoot of Hong Kong quantitative analysis (quant) firm High-Flyer Capital Management. A separate report from web traffic analytics and research firm SimilarWeb found that since R1's launch in late January 2025, DeepSeek has rocketed up the charts to become the most visited AI-model-provider website after OpenAI. QwQ-32B, Alibaba's latest iteration, builds on these advancements by integrating reinforcement learning (RL) and structured self-questioning, positioning it as a serious competitor in the growing field of reasoning-focused AI.

Scaling up performance with multi-stage reinforcement learning

Traditional instruction-tuned models often struggle with difficult reasoning tasks, but the Qwen Team's research suggests that reinforcement learning (RL) can significantly improve a model's ability to solve complex problems. QwQ-32B builds on this idea by implementing a multi-stage RL training approach to enhance mathematical reasoning, coding proficiency, and general problem-solving. The model has been benchmarked against leading alternatives such as DeepSeek-R1, o1-mini, and DeepSeek-R1-Distilled-Qwen-32B, demonstrating competitive results despite having fewer parameters than some of these models.
For example, while DeepSeek-R1 operates with 671 billion parameters (with 37 billion activated), QwQ-32B achieves comparable performance with a much smaller footprint -- typically requiring about 24 GB of VRAM on a GPU (Nvidia's H100s have 80 GB), compared with more than 1,500 GB of VRAM to run the full DeepSeek R1 (16 Nvidia A100 GPUs) -- highlighting the efficiency of Qwen's RL approach. QwQ-32B follows a causal language model architecture with several optimizations, and its RL training was executed in two phases: an initial phase focused on math and coding tasks with verified answers, followed by a phase extending RL to general capabilities using rule-based verifiers.

What it means for enterprise decision-makers

For enterprise leaders -- including CEOs, CTOs, IT leaders, team managers, and AI application developers -- QwQ-32B represents a potential shift in how AI can support business decision-making and technical innovation. With its reinforcement learning-driven reasoning capabilities, the model can provide more accurate, structured, and context-aware insights, making it valuable for use cases such as automated data analysis, strategic planning, software development, and intelligent automation. Companies looking to deploy AI solutions for complex problem-solving, coding assistance, financial modeling, or customer service automation may find QwQ-32B's efficiency an attractive option. Additionally, its open-weight availability allows organizations to fine-tune and customize the model for domain-specific applications without proprietary restrictions, making it a flexible choice for enterprise AI strategies. The fact that it comes from a Chinese e-commerce giant may raise security and bias concerns for some non-Chinese users, especially when using the Qwen Chat interface. But as with DeepSeek R1, the model's availability on Hugging Face for download, offline use, and fine-tuning or retraining suggests these concerns can be overcome fairly easily, making it a viable alternative to DeepSeek R1 for those interested in one.
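The memory comparison above follows from simple arithmetic: a dense model needs roughly two bytes per parameter in 16-bit precision, and proportionally less when quantized. A rough sketch, where the 1.2 overhead multiplier for KV cache and activations is an illustrative assumption rather than a measured value:

```python
def vram_estimate_gb(params_billion: float, bytes_per_param: float = 2.0,
                     overhead: float = 1.2) -> float:
    """Rough VRAM (GB) needed to serve a dense model.

    bytes_per_param: 2.0 for FP16/BF16, 1.0 for 8-bit, 0.5 for 4-bit weights.
    overhead: illustrative multiplier covering KV cache and activations.
    """
    return params_billion * bytes_per_param * overhead

# QwQ-32B in BF16 fits on a single 80 GB H100:
print(vram_estimate_gb(32))        # ~76.8 GB
# 4-bit quantization brings it under a 24 GB consumer GPU:
print(vram_estimate_gb(32, 0.5))   # ~19.2 GB
# The full 671B-parameter DeepSeek-R1 in BF16 needs well over 1,500 GB:
print(vram_estimate_gb(671))       # ~1610 GB
```

These rough numbers line up with the figures cited above: roughly 24 GB for a quantized 32-billion-parameter model versus more than 1,500 GB for the full R1.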
Early reactions from AI power users and influencers

The release of QwQ-32B has already gained attention from the AI research and development community, with several developers and industry professionals sharing their initial impressions on X (formerly Twitter).

Agentic capabilities

QwQ-32B incorporates agentic capabilities, allowing it to dynamically adjust its reasoning process based on environmental feedback. Qwen Team recommends specific inference settings for optimal performance. The model supports deployment using vLLM, a high-throughput inference framework; however, current implementations of vLLM support only static YaRN scaling, which maintains a fixed scaling factor regardless of input length.

Future developments

Qwen Team sees QwQ-32B as the first step in scaling reinforcement learning to enhance reasoning capabilities, and plans to push this approach further. With QwQ-32B, Qwen Team is positioning reinforcement learning as a key driver of the next generation of AI models, demonstrating that scaling reinforcement learning can produce highly performant and effective reasoning systems.
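The vLLM deployment path mentioned above can be sketched as follows. This is a minimal example assuming a recent vLLM release, a machine with enough GPU memory, and the public Qwen/QwQ-32B weights on Hugging Face; the context-length value shown is illustrative, not an official recommendation:

```shell
# Install vLLM and serve the open-weight model behind an
# OpenAI-compatible API on localhost:8000.
pip install vllm
vllm serve Qwen/QwQ-32B --max-model-len 32768

# Query the local endpoint from another terminal:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Qwen/QwQ-32B",
       "messages": [{"role": "user", "content": "Solve 12 * 13 step by step."}]}'
```

Because the served model reasons with a visible chain of thought, responses are typically much longer than those of a standard instruction-tuned model.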
[2]
Alibaba's Latest Open-Source Model Said to Match DeepSeek-R1's Performance
It outperforms DeepSeek-R1 in the LiveBench, IFEval, and BFCL benchmarks

Alibaba's Qwen Team, a division tasked with developing artificial intelligence (AI) models, released the QwQ-32B AI model on Wednesday. It is a reasoning model based on extended test-time compute with a visible chain-of-thought (CoT). The developers claim that, despite being much smaller than DeepSeek-R1, the model can match its performance on benchmark scores. Like other AI models released by the Qwen Team, QwQ-32B is an open-weight model; however, it is not fully open source, as the training details and datasets are not released.

In a blog post, Alibaba's Qwen Team detailed the QwQ-32B reasoning model. The QwQ (short for Qwen with Questions) series of AI models was first introduced by the company in November 2024. These reasoning models were designed to offer an open-source alternative to the likes of OpenAI's o1 series.

The QwQ-32B is a 32-billion-parameter model developed by scaling reinforcement learning (RL) techniques. Explaining the training process, the developers said the RL scaling approach was applied from a cold-start checkpoint. Initially, RL was used only for coding and mathematics tasks, with responses verified to ensure accuracy. Later, the technique was extended to general capabilities, using rule-based verifiers. The Qwen Team found that this method increased the model's general capabilities without reducing its math and coding performance. The developers claim these training structures enabled QwQ-32B to perform at a level similar to DeepSeek-R1, despite the latter being a 671-billion-parameter model (with 37 billion activated). Based on internal testing, the team claimed that QwQ-32B outperforms DeepSeek-R1 on the LiveBench (coding), IFEval (instruction following), and Berkeley Function Calling Leaderboard V3, or BFCL (function calling), benchmarks.
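The verifier-based RL stages described above can be illustrated with toy reward functions: a math answer checked against a ground truth, and generated code scored by running test cases. This is a hypothetical sketch, not Qwen Team's actual training code, and `solution` is an assumed name for the model-generated function:

```python
# Toy illustration of outcome-based reward signals like those described for
# QwQ-32B's RL training: a math answer is checked against ground truth, and
# a code sample is scored by running its test cases. Hypothetical sketch only.

def math_reward(model_answer: str, ground_truth: float) -> float:
    """Reward 1.0 if the model's final numeric answer matches the truth."""
    try:
        return 1.0 if abs(float(model_answer.strip()) - ground_truth) < 1e-9 else 0.0
    except ValueError:
        return 0.0  # unparseable answers earn no reward

def code_reward(source: str, test_cases: list[tuple[int, int]]) -> float:
    """Reward the fraction of test cases a generated `solution` function passes."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # run the model-generated code
        fn = namespace["solution"]
        passed = sum(1 for x, expected in test_cases if fn(x) == expected)
        return passed / len(test_cases)
    except Exception:
        return 0.0  # crashing or malformed code earns no reward

print(math_reward("42", 42.0))
print(code_reward("def solution(x):\n    return x * 2", [(1, 2), (3, 6)]))
```

In real RL training these scalar rewards, rather than human preference labels, drive the policy updates, which is why the approach scales well on tasks with checkable answers such as math and coding.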
Developers and AI enthusiasts can find the open weights of the model on Hugging Face and ModelScope. The model is available under the Apache 2.0 licence, which permits commercial as well as academic and research use. However, since the full training details and datasets are not available, the model cannot be fully replicated or deconstructed. For comparison, DeepSeek-R1's weights were released under the similarly permissive MIT licence. Those who lack the hardware to run the model locally can access it via Qwen Chat, where the model picker menu at the top-left of the page lets users select the QwQ-32B-preview model.
[3]
Alibaba's New QwQ 32B Model is as Good as DeepSeek-R1; Outperforms OpenAI's o1-mini
'This remarkable outcome underscores the effectiveness of RL when applied to robust foundation models pre-trained on extensive world knowledge.' Alibaba, the Chinese tech giant, announced on Thursday a new AI model under the Qwen umbrella, called QwQ 32B. The model contains 32 billion parameters but is said to 'achieve performance comparable' to DeepSeek-R1, which consists of 671 billion parameters (with 37 billion activated). The company attributes the model's success to the 'effectiveness of reinforcement learning (RL)' when applied to foundation models trained on a large corpus of knowledge. The QwQ 32B reasoning model also carries agentic capabilities, which help it think critically based on external feedback. "We find that RL training continuously improves the performance, especially in math and coding, and we observe that the continuous scaling of RL can help a medium-size model achieve competitive performance against gigantic MoE (mixture of experts) model," said the company in a post on X. QwQ-32B is released as an open-weights model and is available on Hugging Face and ModelScope. Users can also access it online via Qwen Chat. The model offers performance parity with DeepSeek's flagship R1 model, outperforming OpenAI's o1-mini in several benchmarks covering code, mathematical reasoning, and general problem-solving. Alibaba recently launched QwQ-Max-Preview, built on Qwen 2.5 Max and specialising in mathematics and coding tasks. On the LiveCodeBench leaderboard, a platform that evaluates LLMs for code, QwQ-Max-Preview scored 65.6 points, higher than OpenAI's o1-medium (63.4) and o3-mini-low (60.9). The company said at the time that smaller variants of the QwQ reasoning models would be open-sourced for local device deployment, and the QwQ 32B model is likely the first such variant.
Alibaba, too, recently released the Wan 2.1, its open-source video foundation model, which can generate videos with complex motions that accurately simulate real-world physics. The suite includes three main models: Wan2.1-I2V-14B, Wan2.1-T2V-14B, and Wan2.1-T2V-1.3B. The I2V-14B model generates videos at 480P and 720P resolutions, producing complex visual scenes and motion patterns. The model outperformed OpenAI's Sora on the VBench Leaderboard. Last week, the company announced that it plans to invest over $52 billion in the cloud computing and artificial intelligence sector over the next three years. This investment exceeds the company's total AI and cloud spending in the past decade. During the earnings call, Eddie Wu, CEO of Alibaba Group, said, "We see AI as a once-in-a-generation industry transformation opportunity, and the primary goal of our AI strategy is to pursue the realisation of AGI (Artificial General Intelligence) and continuously push the boundaries of model intelligence capabilities."
Alibaba's Qwen Team unveils QwQ-32B, an open-source AI model matching DeepSeek R1's performance with significantly lower computational requirements, showcasing advancements in reinforcement learning for AI reasoning.
Alibaba's Qwen Team has unveiled QwQ-32B, a new open-source AI model aimed at advancing artificial intelligence reasoning. This 32-billion-parameter model, released under the Apache 2.0 license, is designed to match the performance of much larger models like DeepSeek R1 while requiring significantly less computational power [1].
QwQ-32B stands out for its efficiency: with 32 billion parameters, it reportedly matches the performance of the 671-billion-parameter DeepSeek R1 while running in a fraction of the memory. This efficiency is attributed to Alibaba's use of multi-stage reinforcement learning (RL) in the model's training process.
The QwQ-32B training pipeline combines several techniques: RL scaling from a cold-start checkpoint, an initial RL stage on math and coding tasks with verified answers, and a later stage extending RL to general capabilities with rule-based verifiers.
QwQ-32B is designed for broad accessibility: the open weights are available on Hugging Face and ModelScope under the Apache 2.0 license, and individual users can try the model through Qwen Chat.
For enterprise decision-makers, QwQ-32B's combination of reasoning performance, modest hardware requirements, and a permissive license makes it attractive for use cases such as automated data analysis, coding assistance, and intelligent automation, with the option to fine-tune the model for domain-specific applications.
The release of QwQ-32B is part of Alibaba's broader AI strategy, which also includes the open-source Wan 2.1 video foundation models and a planned investment of more than $52 billion in cloud computing and AI over the next three years.
As the AI landscape continues to evolve rapidly, Alibaba's QwQ-32B represents a significant step toward more efficient and powerful AI models. Its ability to match the performance of much larger models while requiring fewer computational resources could have far-reaching implications for the future of AI development and deployment.
© 2025 TheOutpost.AI All rights reserved