10 Sources
[1]
China's AI startup Zhipu releases open-source model GLM-4.5
BEIJING, July 28 (Reuters) - Chinese AI startup Zhipu on Monday released open-source model GLM-4.5 designed for intelligent agent applications, a statement said, as it joins a wave of similar releases from local rivals in an increasingly competitive sector. Zhipu, one of China's "AI tigers" backed by Chinese local governments, gained significant attention after OpenAI said in June that the U.S. company had observed notable progress by the Chinese startup in securing government contracts across several regions. Zhipu's new model release adds to the growing pipeline of models from Chinese companies. As of July, China has already released 1,509 large-language models, ranking first among the 3,755 models released globally, according to a report from state-owned Xinhua news agency. Reporting by Liam Mo and Brenda Goh; editing by Mark Heinrich.
[2]
China's latest AI model claims to be even cheaper to use than DeepSeek
BEIJING -- Chinese companies are making smarter artificial intelligence models that are increasingly cheaper to use, echoing key aspects of DeepSeek's market-shaking breakthrough. Startup Z.ai, formerly known as Zhipu, announced Monday that its new GLM-4.5 AI model would cost less than DeepSeek to use. In contrast to the logic underlying existing AI models, Z.ai said its new GLM-4.5 is built on what's known as "agentic" AI, meaning that the model automatically breaks down a task into sub-tasks in order to complete it more accurately. The new model is also open sourced, meaning it is free for developers to download and use. At about half the size of DeepSeek's model, GLM-4.5 only needs eight Nvidia H20 chips to operate, Z.ai CEO Zhang Peng told CNBC on Monday. That's the chip Nvidia customized for China in order to comply with U.S. export controls. The chipmaker said this month that the U.S. will allow it to resume those China sales after a three-month pause, but it's unclear when those shipments will begin. Zhang said the company doesn't need to buy more of the chips as it has enough computing power for now, but declined to share how much Z.ai spent on training the AI model. Details will be released later, he said.
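The "agentic" pattern described here, where the model breaks a task into sub-tasks before completing it, can be sketched as a minimal planner loop. This is a generic illustration, not Z.ai's implementation; the call_model helper is a hypothetical stand-in for any chat-completion client, and the prompts are invented.

```python
import json

def call_model(prompt: str) -> str:
    """Hypothetical placeholder for a chat-completion call to an LLM endpoint.
    Swap in any OpenAI-compatible client here."""
    raise NotImplementedError

def run_agentically(task: str) -> list[str]:
    # Step 1: ask the model to decompose the task into ordered sub-tasks.
    plan_raw = call_model(
        "Break the following task into a short JSON list of sub-tasks:\n" + task
    )
    sub_tasks = json.loads(plan_raw)

    # Step 2: solve each sub-task in turn, feeding earlier results back as context.
    results = []
    for sub_task in sub_tasks:
        context = "\n".join(results)
        results.append(call_model(f"Context so far:\n{context}\n\nNow do: {sub_task}"))
    return results
```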
[3]
Chinese startup Z.ai launches powerful open source GLM-4.5 model family with PowerPoint creation
Another week in the summer of 2025 has begun, and in a continuation of the trend from last week, with it arrives more powerful Chinese open source AI models. Little-known (at least to us here in the West) Chinese startup Z.ai has introduced two new open source LLMs -- GLM-4.5 and GLM-4.5-Air -- casting them as go-to solutions for AI reasoning, agentic behavior, and coding. And according to Z.ai's blog post, the models perform near the top of the pack of other proprietary LLM leaders in the U.S. For example, the flagship GLM-4.5 matches or outperforms leading proprietary models like Claude 4 Sonnet, Claude 4 Opus, and Gemini 2.5 Pro on evaluations such as BrowseComp, AIME24, and SWE-bench Verified, while ranking third overall across a dozen competitive tests. Its lighter-weight sibling, GLM-4.5-Air, also performs within the top six, offering strong results relative to its smaller scale. Both models feature dual operation modes: a thinking mode for complex reasoning and tool use, and a non-thinking mode for instant response scenarios. They can automatically generate complete PowerPoint presentations from a single title or prompt, making them useful for meeting preparation, education, and internal reporting. They further offer creative writing, emotionally aware copywriting, and script generation to create branded content for social media and the web. Moreover, z.ai says they support virtual character development and turn-based dialogue systems for customer support, roleplaying, fan engagement, or digital persona storytelling. While both models support reasoning, coding, and agentic capabilities, GLM-4.5-Air is designed for teams seeking a lighter-weight, more cost-efficient alternative with faster inference and lower resource requirements. Z.ai also lists several specialized models in the GLM-4.5 family on its API, including GLM-4.5-X and GLM-4.5-AirX for ultra-fast inference, and GLM-4.5-Flash, a free variant optimized for coding and reasoning tasks. They're available now to use directly on Z.ai and through the Z.ai application programming interface (API) for developers to connect to third-party apps, and their code is available on Hugging Face and ModelScope. The company also provides multiple integration routes, including support for inference via vLLM and SGLang.
Licensing and API pricing
GLM-4.5 and GLM-4.5-Air are released under the Apache 2.0 license, a permissive and commercially friendly open-source license. This allows developers and organizations to freely use, modify, self-host, fine-tune, and redistribute the models for both research and commercial purposes. For those who don't want to download the model code or weights and self-host or deploy on their own, z.ai's cloud-based API offers both GLM-4.5 and GLM-4.5-Air at published per-token rates. A CNBC article on the models reported that z.ai would charge only $0.11 / $0.28 per million input/output tokens, which is also supported by a Chinese graphic the company posted on its API documentation for the "Air model." However, this appears to be the case only for inputting up to 32,000 tokens and outputting 200 tokens at a single time. (Recall that tokens are the numerical designations the LLM uses to represent different semantic concepts and word components, the LLM's native language, with each token translating to a word or portion of a word.)
In fact, the Chinese graphic reveals far more detailed pricing for both models per batches of tokens inputted/outputted. I've tried to translate it below. Another note: since z.ai is based in China, those in the West who are focused on data sovereignty will want to do due diligence through internal policies before using the API, as it may be subject to Chinese content restrictions.
Competitive performance on third-party benchmarks, approaching that of leading closed/proprietary LLMs
GLM-4.5 ranks third across 12 industry benchmarks measuring agentic, reasoning, and coding performance -- trailing only OpenAI's o3 and xAI's Grok 4. GLM-4.5-Air, its more compact sibling, lands in sixth position. In agentic evaluations, GLM-4.5 matches Claude 4 Sonnet in performance and exceeds Claude 4 Opus in web-based tasks. It achieves a 26.4% accuracy on the BrowseComp benchmark, compared to Claude 4 Opus's 18.8%. In the reasoning category, it scores competitively on tasks such as MATH 500 (98.2%), AIME24 (91.0%), and GPQA (79.1%). For coding, GLM-4.5 posts a 64.2% success rate on SWE-bench Verified and 37.5% on Terminal-Bench. In pairwise comparisons, it outperforms Qwen3-Coder with an 80.8% win rate and beats Kimi K2 in 53.9% of tasks. Its agentic coding ability is enhanced by integration with tools like Claude Code, Roo Code, and CodeGeex. The model also leads in tool-calling reliability, with a success rate of 90.6%, edging out Claude 4 Sonnet and the new-ish Kimi K2.
Part of the wave of open source Chinese LLMs
The release of GLM-4.5 arrives amid a surge of competitive open-source model launches in China, most notably from Alibaba's Qwen Team. In the span of a single week, Qwen released four new open-source LLMs, including the reasoning-focused Qwen3-235B-A22B-Thinking-2507, which now tops or matches leading models such as OpenAI's o4-mini and Google's Gemini 2.5 Pro on reasoning benchmarks like AIME25, LiveCodeBench, and GPQA. This week, Alibaba continued the trend with the release of Wan 2.2, a powerful new open source video model. Alibaba's new models are, like z.ai's, licensed under Apache 2.0, allowing commercial usage, self-hosting, and integration into proprietary systems. The broad availability and permissive licensing of Alibaba's offerings, and of Chinese startup Moonshot's Kimi K2 before them, reflect an ongoing strategic effort by Chinese AI companies to position open-source infrastructure as a viable alternative to closed U.S.-based models. It also puts pressure on U.S.-based model providers' efforts to compete in open source. Meta has been on a hiring spree after its Llama 4 model family debuted earlier this year to a mixed response from the AI community, including a hefty dose of criticism for what some AI power users saw as benchmark gaming and inconsistent performance. Meanwhile, OpenAI co-founder and CEO Sam Altman recently announced that OpenAI's long-awaited and much-hyped frontier open source LLM -- its first since before ChatGPT launched in late 2022 -- would be delayed from its originally planned July release to an as-yet unspecified later date.
Architecture and training lessons revealed
GLM-4.5 is built with 355 billion total and 32 billion active parameters. Its counterpart, GLM-4.5-Air, offers a lighter-weight design at 106 billion total and 12 billion active parameters. Both use a Mixture-of-Experts (MoE) architecture, optimized with loss-free balance routing, sigmoid gating, and increased depth for enhanced reasoning.
The self-attention block includes Grouped-Query Attention and a higher number of attention heads. A Multi-Token Prediction (MTP) layer enables speculative decoding during inference. Pre-training spans 22 trillion tokens split between general-purpose and code/reasoning corpora. Mid-training adds 1.1 trillion tokens from repo-level code data, synthetic reasoning inputs, and long-context/agentic sources. Z.ai's post-training process for GLM-4.5 relied upon a reinforcement learning phase powered by its in-house RL infrastructure, slime, which separates data generation and model training processes to optimize throughput on agentic tasks. Among the techniques used were mixed-precision rollouts and adaptive curriculum learning. The former helps the model train faster and more efficiently by using lower-precision math when generating data, without sacrificing much accuracy. Meanwhile, adaptive curriculum learning means the model starts with easier tasks and gradually moves to harder ones, helping it master more complex tasks over time. GLM-4.5's architecture prioritizes computational efficiency. According to CNBC, Z.ai CEO Zhang Peng stated that the model runs on just eight Nvidia H20 GPUs -- custom silicon designed for the Chinese market to comply with U.S. export controls. That's roughly half the hardware requirement of DeepSeek's comparable models.
Interactive demos
Z.ai highlights full-stack development, slide creation, and interactive artifact generation as demonstration areas on its blog post. Examples include a Flappy Bird clone, a Pokémon Pokédex web app, and slide decks built from structured documents or web queries. Users can interact with these features on the Z.ai chat platform or through API integration.
Company background and market position
Z.ai was founded in 2019 under the name Zhipu, and has since grown into one of China's most prominent AI startups, according to CNBC. The company has raised over $1.5 billion from investors including Alibaba, Tencent, Qiming Venture Partners, and municipal funds from Hangzhou and Chengdu, with additional backing from Aramco-linked Prosperity7 Ventures. Its GLM-4.5 launch coincides with the World Artificial Intelligence Conference in Shanghai, where multiple Chinese firms showcased advancements. Z.ai was named in a June OpenAI report highlighting Chinese progress in AI, and was separately added in January to a U.S. entity list limiting business with American firms.
What it means for enterprise technical decision-makers
For senior AI engineers, data engineers, and AI orchestration leads tasked with building, deploying, or scaling language models in production, the GLM-4.5 family's release under the Apache 2.0 license presents a meaningful shift in options. The model offers performance that rivals top proprietary systems across reasoning, coding, and agentic benchmarks -- yet comes with full weight access, commercial usage rights, and flexible deployment paths, including cloud, private, or on-prem environments. For those managing LLM lifecycles -- whether leading model fine-tuning, orchestrating multi-stage pipelines, or integrating models with internal tools -- GLM-4.5 and GLM-4.5-Air reduce barriers to testing and scaling. The models support standard OpenAI-style interfaces and tool-calling formats, making them easier to evaluate in sandboxed environments or drop into existing agent frameworks.
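Because the models expose standard OpenAI-style interfaces and tool-calling formats, a first evaluation pass might look like the hedged sketch below. The base URL, model identifier, and the get_weather tool are assumptions for illustration only; consult Z.ai's API documentation for the real values.

```python
from openai import OpenAI

# Assumed endpoint and model name; substitute the values from Z.ai's API docs.
client = OpenAI(base_url="https://api.z.ai/api/paas/v4", api_key="YOUR_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative tool, not part of any GLM API
        "description": "Look up current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="glm-4.5",  # assumed identifier
    messages=[{"role": "user", "content": "Do I need an umbrella in Beijing today?"}],
    tools=tools,
)

# If the model decides to call the tool, the call arrives as structured JSON arguments.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```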
GLM-4.5 also supports streaming output, context caching, and structured JSON responses, enabling smoother integration with enterprise systems and real-time interfaces. For teams building autonomous tools, its deep thinking mode provides more precise control over multi-step reasoning behavior. For teams under budget constraints or those seeking to avoid vendor lock-in, the pricing structure undercuts major alternatives like DeepSeek and Kimi K2. This matters for organizations where usage volume, long-context tasks, or data sensitivity make open deployment a strategic necessity. For professionals in AI infrastructure and orchestration, such as those implementing CI/CD pipelines, monitoring models in production, or managing GPU clusters, GLM-4.5's support for vLLM, SGLang, and mixed-precision inference aligns with current best practices in efficient, scalable model serving. Combined with open-source RL infrastructure (slime) and a modular training stack, the model's design offers flexibility for tuning or extending in domain-specific environments. In short, GLM-4.5's launch gives enterprise teams a viable, high-performing foundation model they can control, adapt, and scale, without being tied to proprietary APIs or pricing structures. It's a compelling option for teams balancing innovation, performance, and operational constraints.
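For the self-hosting route mentioned above (vLLM or SGLang), a minimal vLLM sketch could look like the following. The Hugging Face repo id and the eight-way tensor parallelism (echoing the eight-H20 figure reported by CNBC) are assumptions, not confirmed deployment guidance; check the model card for exact requirements.

```python
from vllm import LLM, SamplingParams

# Assumed Hugging Face repo id; verify against Z.ai's published listings.
llm = LLM(
    model="zai-org/GLM-4.5",
    tensor_parallel_size=8,  # one shard per GPU, mirroring the reported 8-GPU setup
)

params = SamplingParams(temperature=0.7, max_tokens=512)
outputs = llm.generate(
    ["Summarize the trade-offs of Mixture-of-Experts models in three bullet points."],
    params,
)
print(outputs[0].outputs[0].text)
```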
[4]
China's Z.ai Scoffs at US Sanctions, Launches Supercharged AI With $1.5B War Chest - Decrypt
New models are open-source, super-efficient, and cost a fifth of Western rivals to use. Beijing-based Z.ai, recently added to the U.S. Entity List and backed by $1.5 billion in fresh funding, has released two open-source AI models that outperformed most Western counterparts in global benchmarks, challenging U.S. sanctions as it prepares for a Hong Kong listing. The startup, formerly known as Zhipu AI, is listed on Washington's Entity List, a trade restriction tool that identifies foreign individuals, companies, and other organizations the government believes are involved in activities contrary to U.S. national security or foreign policy interests. Rubbing salt in the wound, Z.ai dunked on San Francisco-based OpenAI in a blog post earlier on Monday, arguing that the massively popular AI platform is something of a mess, with a confusing array of special-use products. By comparison, Z.ai's new open-source model GLM-4.5 merges reasoning and non-reasoning skills into one fully capable and versatile architecture. "In the past five years, OpenAI's GPT-3 learns common-sense knowledge, and o1 uses reinforcement learning to think before responding, significantly improving reasoning skills in coding, data analysis, and complex math," the post said. "However, the resultant models are still not really general: some of them are good at coding, some good at math, and some good at reasoning, but none of them could achieve the best performance across all the different tasks," it said in its post. "GLM-4.5 makes efforts toward the goal of unifying all the different capabilities." OpenAI has stated that its next model, GPT-5, expected by the end of summer, will be its first fully unified tool. In the meantime, GLM-4.5 and its smaller sibling GLM-4.5-Air scored an average of 63.2 and 59.8 across 12 industry tests, ranking third globally, behind OpenAI o3 and Grok-4. The models employ a Mixture of Experts architecture with a total of 355 billion parameters for the flagship version. However, only 32 billion are active at any given time, making it more efficient than brute-force approaches. GLM-4.5 achieved a 90.6% tool-calling success rate, outperforming Claude 4 Sonnet's 89.5% and even Kimi K2's 86.2%. On web browsing tasks, it correctly answered 26.4% of complex questions compared to Claude 4 Opus's 18.8%. The model solved 98.2% of problems in the MATH 500 benchmark, matching Claude 4 Opus, all while being open source. The U.S. Department of Commerce added Zhipu to its Entity List in January, accusing the company of supporting Beijing's military advances. Companies on the list cannot purchase American technology without special government approval. Zhipu said the designation "will not have a substantial impact" on its operations. Which is undoubtedly true: thus far, state-backed investors have poured money into the company since the Entity List announcement. Huafa Group, backed by the Zhuhai government, invested 500 million yuan ($69 million) in March. Chengdu's state fund followed with 300 million yuan, while Shanghai's Pudong Venture Capital Group and Zhangjiang Group jointly added 1 billion yuan in July. The startup plans to raise another $300 million through a Hong Kong initial public offering, according to people familiar with the matter. The company initially filed for a mainland China listing but shifted strategies as geopolitical tensions escalated. Its valuation doubled to 40 billion yuan ($5.6 billion) between September 2024 and July 2025.
Z.ai prices API calls at $0.11 per million input tokens and $0.28 per million output tokens -- roughly one-fifth the cost of comparable Western models. The company released the weights under an MIT license on Hugging Face and GitHub, allowing anyone to download and modify the code. The models feature what Z.ai calls "hybrid reasoning" with two modes. A thinking mode handles complex reasoning and planning tasks, while a non-thinking mode provides instant responses. The company shared samples of how people can build entire websites with natural language commands or generate interactive visualizations ranging from Pokémon databases to complex presentations. Chinese companies are accelerating their investment in AI, releasing 1,509 large language models as of July 2025, which represents 40% of the 3,755 models released globally and includes some of the best open-source models in the industry. Just recently, Kimi K2 impressed. Additionally, Alibaba today released Wan 2.2, a state-of-the-art (SOTA) open-source generative video model. HiDream-I1 ranks among the best open-source image generators, while Hailuo, Seedream, and Kling are beating some of the biggest names in the generative video scene. Early investors of Z.ai include Alibaba Group, Tencent Holdings, and HongShan Capital Group. The company spun out from Tsinghua University in 2019 and represents one of China's "AI Tigers," startups that the government considers crucial for achieving technological independence amid heavy political constraints imposed by the U.S. To boost its own ecosystem of AI technology and apps, Z.ai launched a 1.5 billion yuan ($211 million) fund called the "Z Fund" to invest in startups that build on its technology. The strategy mirrors OpenAI's approach of creating a developer ecosystem around proprietary models while maintaining more open licensing terms. Users can try an online version of the chatbot by accessing this link: Chat with Z.ai. The site is hosted in Singapore, so you can breathe a little easier.
[5]
Chinese startup Z.ai releases cost-efficient GLM-4.5 reasoning model - SiliconANGLE
Chinese startup Z.ai today open-sourced GLM-4.5, a reasoning model that it claims is more cost-efficient than DeepSeek's R1. CNBC reported that the algorithm can run on eight H20 graphics cards. The H20 is a scaled-down version of Nvidia Corp.'s H100 chip, which was its flagship artificial intelligence accelerator until last year. The U.S. government recently greenlit the sale of the H20 to companies in China. The launch of GLM-4.5 comes about six months after DeepSeek released its open-source R1 reasoning model. At the time, the company stated that the algorithm can perform some tasks using 50 times less hardware than OpenAI's o1. Furthermore, DeepSeek claimed to have trained its model for a fraction of the cost of earlier AI projects. R1's release led to investor concerns that increasingly hardware-efficient language models may lower demand for AI infrastructure. Nvidia's market capitalization dropped more than $580 billion in the subsequent selloff, setting a new Wall Street record. The release of GLM-4.5 today didn't lead to a similar drop in AI stocks, but it sends investors another signal that reasoning models are continuing to become more hardware-efficient. Z.ai reportedly expects to charge $0.11 for every 1 million input tokens entered into GLM-4.5. That's 3 cents lower than R1. One million output tokens cost $0.28, just over one tenth of what DeepSeek charges for R1. One of the main factors behind GLM-4.5's cost-efficiency is that it's relatively small. The model features 355 billion parameters, or about 316 billion fewer than R1. GLM-4.5 only activates 32 billion of those parameters at any given time to reduce hardware usage. An AI model comprises numerous code snippets called artificial neurons that each perform a tiny portion of the work involved in processing a prompt. Those neurons, in turn, are organized into so-called layers. Z.ai removed some of GLM-4.5's components to add more layers, an approach that it says helped boost the model's reasoning skills. The company trained GLM-4.5 through a multistep workflow. First, it developed an initial version of the model using a dataset that included 15 trillion tokens' worth of information. Z.ai then honed GLM-4.5's reasoning skills with several smaller training datasets that together comprised more than 7 trillion tokens. The company evaluated the model's capabilities using a dozen popular AI benchmarks. According to Z.ai, GLM-4.5 outperformed multiple popular alternatives including Claude 4 Opus. It ranked third behind xAI Holdings Corp.'s Grok 4 and OpenAI's o3. For use cases that place particular emphasis on cost-efficiency, Z.ai has developed a scaled-down version of its model called GLM-4.5-Air. The algorithm features 106 billion parameters, about a third as many as the original. GLM-4.5-Air activates 12 billion parameters to process prompts. In January, the U.S. Commerce Department added Z.ai to its Entity List of organizations subject to export controls. The company is backed by $1.5 billion in funding from Alibaba Group, Tencent Inc. and other investors. It reportedly plans to file for a public offering later this year.
[6]
A New Open Source Model From China is Crushing the Benchmarks | AIM
Zhipu AI -- also known as Z.ai -- has unveiled the new GLM 4.5 family of AI models, which the company claims outperform Anthropic's highly regarded Opus 4.0 on several benchmarks. The release follows Moonshot AI's Kimi K2, which exhibited strong benchmark performance. Z.ai, based out of Beijing and backed by the e-commerce giant Alibaba, released the GLM-4.5 and the GLM-4.5-Air AI models. Looking at the recent releases of open source AI models from the East, it wouldn't be an overstatement to say that the future of open source AI may not be led by the West. "It seems like Chinese labs are playing musical chairs at this point," said Satvik Paramkusham, an engineer, on X. Both the GLM-4.5 and the GLM-4.5-Air are based on the Mixture of Experts (MoE) architecture and are packed with reasoning, coding and agentic capabilities. GLM-4.5 is built with 355 billion total parameters and 32 billion active parameters, and GLM-4.5-Air with 106 billion total parameters and 12 billion active parameters. The startup claims that this is an effort to develop a genuinely general model. While acknowledging the capabilities of the models today, Z.ai said, "Models are still not really general: some of them are good at coding, some good at math, and some good at reasoning, but none of them could achieve the best performance across all the different tasks." "GLM-4.5 makes efforts toward the goal of unifying all the different capabilities," added the company. And if benchmark scores are anything to go by, it would be fair to assert that Z.ai has fulfilled its claims. Across coding, reasoning, and other benchmarks, these models are comparable to some of the best-performing models today, and in some cases, even surpass models like Claude 4 Opus and OpenAI's o3. Compared to DeepSeek-R1, a model that once disrupted both the ecosystem and NVIDIA's market cap, Z.ai's models consistently perform better on multiple evaluations. Early users have also generally reported a positive experience. One user on Reddit said, "GLM-4.5 is absolutely crushing it for coding, way better than Claude's recent performance." Another user on a Hacker News thread said, "I could get it to consistently use the tools and follow instructions in a way that never really worked well with Deepseek R1 or Qwen. Even compared to Kimi, I feel like this is probably the best open source coding model out right now." Furthermore, these models also excel across benchmarks that evaluate their agentic and tool-use capabilities. The company tested GLM-4.5 on the BrowseComp benchmark for web browsing, which includes complex questions requiring short answers. With a web browsing tool enabled, it provides correct responses for 26.4% of all questions, outperforming Claude-4-Opus (18.8%) and nearing o4-mini-high (28.3%). On other benchmarks, such as TAU-Bench (airline and retail), which assesses a model's ability to reliably perform agentic tasks involving realistic customer-related activities within the airline and retail domains, both the GLM-4.5 and the GLM-4.5-Air perform on par with Claude 4 Sonnet but beat OpenAI's o3. These models were also put to the test on the 'Pelican benchmark' from Simon Willison, the co-creator of the Django web framework, a rather amusing test that asks AI models to generate an SVG of a pelican riding a bicycle. This helps evaluate a model's practical coding and creative capabilities.
While several models have historically struggled with this particular test, GLM-4.5 produced an impressive result. "I like how the pelican has its wings on the handlebars," said Willison. (Left: SVG created by o3 Pro; right: SVG created by GLM-4.5.) The GLM-4.5 model costs $0.6 per million input tokens and $2.2 per million output tokens, while the more affordable GLM-4.5-Air variant is priced at $0.2 per million input tokens and $1.1 per million output tokens. Being open source also means that more information has been released about the model's training process, much to the appreciation of developers. Z.ai mentioned that during pre-training, the model was first trained on a corpus of 15 trillion tokens of general information, followed by 7 trillion tokens of code and reasoning corpora, and then introduced additional stages to train it on more specific domains. In the blog post, the company also stated, "We employ loss-free balance routing and sigmoid gates for MoE layers. Unlike DeepSeek-V3 and Kimi K2, we reduce the width (hidden dimension and number of routed experts) of the model while increasing the height (number of layers), as we found that deeper models exhibit better reasoning capacity." Alongside the models, the company has also open-sourced a reinforcement learning infrastructure called slime, which is said to be engineered for "exceptional flexibility, efficiency, and scalability." Slime's primary innovations are designed to overcome common RL bottlenecks, particularly in complex agentic tasks, said Z.ai. Some of these techniques involve using a flexible training architecture that enables maximum utilisation of GPUs, and an agent-oriented design that separates rollout engines from the training engines, which helps eliminate some bottlenecks associated with RL. Slime also employs the memory-efficient FP8 format for data generation, while retaining the stability of the more precise BF16 format for training. Casper Hansen, a natural language processing (NLP) scientist, shared on X that the GLM-4.5-Air model can "easily give up to 200 tokens/second" in FP8. The release of Kimi K2, and now of the GLM-4.5 family from Z.ai, comes at a time when users are eagerly waiting for OpenAI's GPT-5 and its open source model. OpenAI now faces an unprecedented challenge. The company that initially released GPT-2 cautiously as open source now re-enters a market filled with competitors, and this time, it won't have the first-mover advantage. Meta, on the other hand, faces a larger challenge, having been the creator of the leading open-source model but now halting development on its largest and most powerful model, the Llama 4 Behemoth. This has left many speculating about Llama 5, especially as the company is now fully dedicated to building a 'superintelligence' team.
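The sigmoid-gated MoE routing and depth-over-width choice quoted above can be illustrated with a toy PyTorch layer. This is a generic sketch of top-k expert routing with sigmoid gates, not Z.ai's actual implementation; the dimensions are invented and the loss-free balance routing is omitted for brevity.

```python
import torch
import torch.nn as nn

class SigmoidGatedMoE(nn.Module):
    """Toy Mixture-of-Experts layer using sigmoid gate scores and top-k routing."""

    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])
        self.top_k = top_k

    def forward(self, x):                      # x: (num_tokens, d_model)
        gates = torch.sigmoid(self.router(x))  # independent per-expert scores, not a softmax
        scores, idx = gates.topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):            # route each token to its k-th chosen expert
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += scores[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

# A deeper-but-narrower stack, echoing the depth-over-width choice described above.
blocks = nn.Sequential(*[SigmoidGatedMoE() for _ in range(6)])
print(blocks(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```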
[7]
Cheaper than DeepSeek? Z.ai releases a new open-source model
Chinese startup Z.ai, formerly known as Zhipu, released a new open-source AI model, which the company asserts operates at a lower cost than DeepSeek. Z.ai introduced three distinct models as part of this release. The flagship offering is designated as GLM-4.5. Alongside this, a more lightweight version, named GLM-4.5-Air, has been made available. The third model, GLM-4.5-Flash, is provided without cost and is specifically engineered for coding tasks, reasoning capabilities, and various agent-based applications. While these models share functional similarities with offerings from American technology companies such as OpenAI, Google's Gemini, and Anthropic's Claude AI, a key distinction lies in the complete open-source nature of GLM-4.5. This open-source attribute differentiates it from several proprietary models currently available in the market. Z.ai is identified as one of several "AI tigers," a term referring to Chinese artificial intelligence companies that receive backing from the Chinese government. The introduction of GLM-4.5 contributes to a substantial volume of large-language models (LLMs) originating from China. Records indicate that GLM-4.5 is one of 1,509 large-language models released by various Chinese AI companies within recent months. While a number of these models, including GLM-4.5, offer open-source systems for development and integration, certain considerations may influence their adoption by users in Western countries. One primary concern involves data transmission protocols, as models like DeepSeek have been observed to transmit user data back to China. This practice raises privacy implications, particularly given the geopolitical dynamics between the United States and China. OpenAI's Global Affairs team previously issued a warning regarding Z.ai and other similar Chinese startups in a public report. The sustained advancements by Chinese AI companies have also prompted policy responses in the United States. The Trump administration, for instance, released "America's AI Action Plan." This document outlines strategies aimed at maintaining the United States' leadership position in artificial intelligence development. The plan's objectives include reducing regulations deemed unnecessary for AI companies and integrating AI technologies more extensively across governmental operations. These measures are designed to foster innovation and reinforce America's competitive edge in the global AI landscape. Observations concerning other Chinese AI models, such as DeepSeek, indicate capacities for censoring content related to China. This aspect suggests that AI models like GLM-4.5 may encounter limitations in achieving widespread adoption across Western regions, predominantly due to concerns regarding data privacy and content censorship. Advocacy has emerged, recommending that users exercise caution or avoid the use of DeepSeek and similar models. Despite GLM-4.5 being an open-source model, the degree of control afforded to users concerning the collection and utilization of their data remains limited. This issue is not exclusive to Chinese AI models; many AI models, regardless of their origin, retain user data and frequently employ it for model training purposes. While certain privacy-focused AI models, such as Proton's Lumo chatbot, have been introduced, a comprehensive solution to the broader challenge of data retention by major technology companies has not yet materialized. 
The significant economic value attributed to consumer data by large technology corporations suggests that this practice is unlikely to undergo fundamental changes in the near future. The release of Z.ai's new model illustrates the rapid pace of evolution within the field of artificial intelligence and highlights the accelerated progress made by nations such as China in narrowing the technological lead previously held by the United States. This development has the potential to incentivize American companies, including OpenAI, to enhance their existing AI models and to encourage emerging startups to pursue similar advancements in their respective offerings.
[8]
This Chinese AI Lab Just Released Two Open-Source Agentic AI Models
GLM-4.5 AI models offer reasoning, coding, and agentic capabilities
Chinese artificial intelligence (AI) lab Z.ai announced the release of two new open-source general language models (GLM) on Monday. Dubbed GLM-4.5 and GLM-4.5-Air, they are described by the AI firm as its latest flagship models. Both are hybrid reasoning models that offer a thinking mode for complex reasoning and tool use, and a non-thinking mode for instant responses. Additionally, the company says these models support agentic capabilities. Notably, the AI firm claims that its latest models outperform all other open-source models worldwide. In a blog post, the Chinese AI firm announced the release of these models. The idea behind GLM models is to create a large language model (LLM) that is truly generalistic and can perform different types of tasks equally well. The company argues that despite several advancements in generative AI, the models from the likes of Google, OpenAI, and Anthropic are not general enough, as they display strong performance in some areas while lagging in others. "GLM-4.5 makes efforts toward the goal of unifying all the different capabilities," the company said. The GLM-4.5 AI model features a total of 355 billion parameters with 32 billion active parameters. On the other hand, the Air variant gets 106 billion total parameters, with 12 billion parameters being active. Both models unify reasoning, coding, and agentic capabilities in a single architecture. They have a context window of 128,000 tokens and come with native function calling capacity. Coming to the model architecture, Z.ai opted for a mixture-of-experts (MoE) architecture to improve the compute efficiency of both training and inference. Instead of increasing the width (hidden dimensions and number of experts) of the MoE layers like DeepSeek-V3, the GLM-4.5 series reduces the width while increasing the height (number of layers). This was done as the company believed that deeper models displayed improved reasoning capability. The Chinese AI firm also listed the novel techniques used for the pre-training and post-training process in the blog post to help the developer community understand how the models were built from scratch. Z.ai claimed to have tested the GLM-4.5 model's performance on 12 benchmarks across agentic, reasoning, and coding tasks. It then compared the model's overall scores against various LLMs from OpenAI, Anthropic, Google, xAI, Alibaba, and more. Based on this internal evaluation, the Chinese AI firm claims that GLM-4.5 ranked third, behind OpenAI's o3 and xAI's Grok 4. Interested individuals can access the open weights of these models from Z.ai's GitHub and Hugging Face listings. Alternatively, these LLMs can also be accessed via the company's website and application programming interface (API).
[9]
Z.ai Says New AI Model Costs Less to Use Than DeepSeek | PYMNTS.com
Z.ai's new GLM-4.5 AI model costs 11 cents per million input tokens and 28 cents per million output tokens, CNBC reported Monday (July 28), citing its interview with Z.ai CEO Zhang Peng. For DeepSeek R1, those costs are 14 cents and $2.19, respectively, according to the report. Z.ai's GLM-4.5 is open source and can operate on eight Nvidia H20 chips, per the report. The company said in a Monday blog post that the GLM-4.5 model series features two models -- GLM-4.5 and GLM-4.5-Air -- that satisfy the requirements of agentic applications by unifying reasoning, coding and agentic capabilities. They are hybrid reasoning models that offer a "thinking mode" for complex reasoning and tool use, as well as a "non-thinking mode" for instant responses, according to the post. "As a good [large language model (LLM)], it is necessary to deal with general problem solving, generalization, [commonsense] reasoning and self-improvement," Z.ai said in the post. "In the past five years, OpenAI's GPT-3 learns [commonsense] knowledge, and o1 uses reinforcement learning to think before [responding], significantly improving reasoning skills in coding, data analysis and complex math. However, the resultant models are still not really general. Some of them are good at coding, some good at math, and some good at reasoning, but none of them could achieve the best performance across all the different tasks. GLM-4.5 makes efforts toward the goal of unifying all the different capabilities." According to the CNBC report, Z.ai is on the United States' entity list, meaning American companies are restricted from doing business with it. The AI startup has raised $1.5 billion from investors and is planning an initial public offering (IPO) in China, per the report. It was reported in March that Z.ai, then known as Zhipu AI, raised more than 1 billion yuan (about $140 million) in a funding round led by Hangzhou Municipal Construction Investment Group Co. and Shangcheng Capital, both of which have ties to the Hangzhou local government.
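Using the per-million-token prices reported here ($0.11 input / $0.28 output for GLM-4.5 versus $0.14 / $2.19 for DeepSeek R1), a back-of-the-envelope cost comparison for a hypothetical monthly workload looks like the sketch below; the workload volumes are invented purely for illustration.

```python
# Reported list prices in USD per million tokens (input, output).
PRICES = {
    "GLM-4.5":     (0.11, 0.28),
    "DeepSeek R1": (0.14, 2.19),
}

def monthly_cost(model: str, input_tokens: float, output_tokens: float) -> float:
    p_in, p_out = PRICES[model]
    return (input_tokens / 1e6) * p_in + (output_tokens / 1e6) * p_out

# Hypothetical workload: 500M input tokens and 100M output tokens per month.
for model in PRICES:
    print(f"{model}: ${monthly_cost(model, 500e6, 100e6):,.2f}/month")
# GLM-4.5: $83.00/month
# DeepSeek R1: $289.00/month
```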
[10]
Meet GLM-4.5: The most capable open-source AI model yet
Zhipu AI's GLM-4.5 rivals commercial giants with 128K context, dual-mode thinking, and MIT licensing
While the West buzzes with the rise of OpenAI's GPT-4.5 and xAI's Grok-4, a powerful open-source challenger has emerged in the East. Developed by Zhipu AI (also known as Z.ai), GLM-4.5 is the latest entry in the company's General Language Model series, and it's already turning heads in the AI research and developer communities. On paper, GLM-4.5 is a technical marvel: 355 billion total parameters (with 32 billion active), support for 128K context windows, a novel thinking/non-thinking hybrid architecture, and the highest tool-use accuracy ever recorded by an open-source model. In the benchmarks, it doesn't just compete, it outperforms nearly every other open alternative in coding, reasoning, and real-world applications. And perhaps most importantly, it's free to use under the MIT license. GLM-4.5 is built using a Mixture-of-Experts (MoE) framework. This architecture activates only a subset of its parameters for each query (32B active out of 355B total), giving it GPT-4-level performance while staying computationally efficient. But the standout innovation is its dual-mode processing. GLM-4.5 can toggle between "thinking" and "non-thinking" states, essentially choosing between deep, multi-step reasoning or fast, lightweight response generation based on the task. This results in more intelligent and contextual responses without always requiring full-throttle compute. For users who want a smaller version, Zhipu has also launched GLM-4.5-Air, a leaner 106B model with just 12B active parameters, which is ideal for inference on edge or budget GPUs. Zhipu AI didn't just make big claims, it backed them with data. In internal and third-party evaluations, GLM-4.5 pulled ahead in several categories. In code generation, it achieved a 53.9% win rate against top competitors, and an astonishing 80.8% win rate in dedicated coding benchmarks. On real-world agent benchmarks requiring API calling, document reading, or search, GLM-4.5 boasted a 90.6% success rate, higher than many closed and open models on record. With 128K token support, it matches Claude 3 and GPT-4 in retaining long conversations and analyzing multi-document input. These results push GLM-4.5 to the top tier of open models. In some internal comparisons, Zhipu AI even ranks it behind only xAI's Grok-4 and OpenAI's o3 model. GLM-4.5's significance goes beyond technical specs. At a time when most cutting-edge models are locked behind APIs or usage caps, Zhipu has released GLM-4.5 completely open source, hosted on Hugging Face and available for enterprise deployment. That means any developer, startup, or research lab can fine-tune, embed, or deploy the model without worrying about licensing hurdles. And given its performance, it could become the default foundation model for many new AI applications, especially those focused on autonomous agents or tool-using LLMs. The MIT license also encourages commercial adoption. Early reports suggest that Chinese tech giants and even several European research teams are exploring integrations. Zhipu AI has been flying under the radar internationally, but GLM-4.5 may change that.
The Beijing-based firm, a spin-off from Tsinghua University's innovation labs, has now released multiple world-class models within two years, including GLM-3, ChatGLM, and now GLM-4.5. With GLM-5 already hinted to be in development and a growing suite of agent-centric tools, Zhipu seems poised to become a global leader in open, high-performance AI. And with geopolitical concerns growing around closed-source Western models, its timing couldn't be better.
Z.ai, a Chinese AI startup, releases its open-source GLM-4.5 model, claiming superior performance and cost-efficiency compared to Western counterparts, despite facing U.S. sanctions.
Chinese AI startup Z.ai, formerly known as Zhipu, has released its latest open-source AI model, GLM-4.5, positioning itself as a formidable competitor in the global AI landscape. This release comes amid increasing tensions between China and the United States, with Z.ai recently added to the U.S. Entity List [4].
GLM-4.5 is designed for intelligent agent applications, featuring dual operation modes: a thinking mode for complex reasoning and tool use, and a non-thinking mode for instant responses [3]. The model boasts impressive capabilities, including unified reasoning, coding, and agentic abilities in a single architecture, a 128,000-token context window with native function calling, and automatic generation of complete PowerPoint presentations from a single prompt [3][8].
In terms of performance, Z.ai claims that GLM-4.5 ranks third globally across 12 industry benchmarks, trailing only OpenAI's o3 and xAI's Grok 4 [3][4]. The model reportedly matches or outperforms several leading proprietary models, including Claude 4 Sonnet and Claude 4 Opus, in various evaluations [3].
GLM-4.5 employs a Mixture of Experts architecture with a total of 355 billion parameters, but only 32 billion are active at any given time, making it more efficient than traditional approaches [4]. This efficiency allows the model to run on just eight Nvidia H20 chips, a scaled-down version of the H100 chip customized for the Chinese market [2][5].
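A rough sanity check on the eight-H20 claim, under assumed numbers (weights held in 8-bit precision and roughly 96 GB of memory per H20), suggests why the configuration is plausible; real requirements depend on quantization, KV-cache size, and serving setup.

```python
# Back-of-the-envelope memory estimate; all figures below are assumptions for illustration.
total_params = 355e9      # GLM-4.5 total parameters (all experts must be resident in memory)
bytes_per_param = 1       # assuming 8-bit (FP8/INT8) weights
gpu_memory_gb = 96        # approximate HBM per Nvidia H20
num_gpus = 8

weights_gb = total_params * bytes_per_param / 1e9
print(f"Weights: ~{weights_gb:.0f} GB vs {num_gpus * gpu_memory_gb} GB across {num_gpus} GPUs")
# -> Weights: ~355 GB vs 768 GB across 8 GPUs, leaving headroom for activations and KV cache
```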
Z.ai has released GLM-4.5 under a permissive open-source license, reported as Apache 2.0 by some outlets and MIT by others, allowing free use, modification, and redistribution for both research and commercial purposes [3][4]. The company offers cloud-based API access at competitive rates, reportedly $0.11 per million input tokens and $0.28 per million output tokens [4][9].
This pricing structure is reportedly about one-fifth the cost of comparable Western models [4].
Despite facing U.S. sanctions, Z.ai has secured significant funding and support, including roughly $1.5 billion raised from investors such as Alibaba, Tencent, and Qiming Venture Partners, plus recent state-backed injections from Zhuhai's Huafa Group, a Chengdu municipal fund, and Shanghai's Pudong Venture Capital Group and Zhangjiang Group [3][4].
The company's valuation doubled to 40 billion yuan ($5.6 billion) between September 2024 and July 2025 [4].
Z.ai's release of GLM-4.5 is part of a broader trend of Chinese companies accelerating their investment in AI. As of July 2025, China has released 1,509 large language models, representing 40% of the 3,755 models released globally [1][4]. This surge in AI development reflects China's push for technological independence amid U.S. restrictions [4].
The release of GLM-4.5 and similar models from Chinese companies challenges the dominance of Western AI firms and raises questions about the effectiveness of U.S. sanctions in slowing China's AI progress. As the global AI race intensifies, the open-source nature of these models may accelerate innovation and potentially reshape the competitive landscape in the AI industry.