9 Sources
[1]
China's AI startup Zhipu releases open-source model GLM-4.5
BEIJING, July 28 (Reuters) - Chinese AI startup Zhipu on Monday released open-source model GLM-4.5 designed for intelligent agent applications, a statement said, as it joins a wave of similar releases from local rivals in an increasingly competitive sector. Zhipu, one of China's "AI tigers" backed by Chinese local governments, gained significant attention after OpenAI said in June that it had observed notable progress by the Chinese startup in securing government contracts across several regions. Zhipu's new model release adds to the growing pipeline of models from Chinese companies. As of July, China had released 1,509 of the 3,755 large-language models released globally, more than any other country, according to a report from state-owned Xinhua news agency. Reporting by Liam Mo and Brenda Goh; editing by Mark Heinrich.
[2]
China's latest AI model claims to be even cheaper to use than DeepSeek
BEIJING -- Chinese companies are making smarter artificial intelligence models that are increasingly cheaper to use, echoing key aspects of DeepSeek's market-shaking breakthrough. Startup Z.ai, formerly known as Zhipu, announced Monday that its new GLM-4.5 AI model would cost less than DeepSeek to use. In contrast to the logic underlying existing AI models, Z.ai said its new GLM-4.5 is built on what's known as "agentic" AI, meaning that the model automatically breaks down a task into sub-tasks in order to complete it more accurately. The new model is also open sourced, meaning it is free for developers to download and use. At about half the size of DeepSeek's model, GLM-4.5 only needs eight Nvidia H20 chips to operate, Z.ai CEO Zhang Peng told CNBC on Monday. That's the chip Nvidia customized for China in order to comply with U.S. export controls. The chipmaker said this month that the U.S. will allow it to resume those China sales after a three-month pause, but it's unclear when those shipments will begin. Zhang said the company doesn't need to buy more of the chips as it has enough computing power for now, but declined to share how much Z.ai spent on training the AI model. Details will be released later, he said.
[3]
Chinese startup Z.ai launches powerful open source GLM-4.5 model family with PowerPoint creation
Another week in the summer of 2025 has begun, and in a continuation of last week's trend, it brings more powerful Chinese open source AI models. Little-known (at least to us here in the West) Chinese startup Z.ai has introduced two new open source LLMs -- GLM-4.5 and GLM-4.5-Air -- casting them as go-to solutions for AI reasoning, agentic behavior, and coding. And according to Z.ai's blog post, the models perform near the top of the pack of proprietary LLM leaders in the U.S. For example, the flagship GLM-4.5 matches or outperforms leading proprietary models like Claude 4 Sonnet, Claude 4 Opus, and Gemini 2.5 Pro on evaluations such as BrowseComp, AIME24, and SWE-bench Verified, while ranking third overall across a dozen competitive tests. Its lighter-weight sibling, GLM-4.5-Air, also performs within the top six, offering strong results relative to its smaller scale. Both models feature dual operation modes: a thinking mode for complex reasoning and tool use, and a non-thinking mode for instant response scenarios. They can automatically generate complete PowerPoint presentations from a single title or prompt, making them useful for meeting preparation, education, and internal reporting. They further offer creative writing, emotionally aware copywriting, and script generation to create branded content for social media and the web. Moreover, Z.ai says they support virtual character development and turn-based dialogue systems for customer support, roleplaying, fan engagement, or digital persona storytelling. While both models support reasoning, coding, and agentic capabilities, GLM-4.5-Air is designed for teams seeking a lighter-weight, more cost-efficient alternative with faster inference and lower resource requirements.
Z.ai also lists several specialized models in the GLM-4.5 family on its API, including GLM-4.5-X and GLM-4.5-AirX for ultra-fast inference, and GLM-4.5-Flash, a free variant optimized for coding and reasoning tasks. They're available now to use directly on Z.ai and through the Z.ai application programming interface (API) for developers to connect to third-party apps, and their code is available on Hugging Face and ModelScope. The company also provides multiple integration routes, including support for inference via vLLM and SGLang. Licensing and API pricing GLM-4.5 and GLM-4.5-Air are released under the Apache 2.0 license, a permissive and commercially friendly open-source license. This allows developers and organizations to freely use, modify, self-host, fine-tune, and redistribute the models for both research and commercial purposes. For those who don't want to download the model code or weights and self-host or deploy on their own, Z.ai's cloud-based API offers the model for the following prices. * GLM-4.5: * GLM-4.5-Air: A CNBC article on the models reported that Z.ai would charge only $0.11 / $0.28 per million input/output tokens, which is also supported by a Chinese graphic the company posted on its API documentation for the "Air" model. However, this appears to apply only when inputting up to 32,000 tokens and outputting up to 200 tokens at a time. (Recall that tokens are the numerical designations an LLM uses to represent different semantic concepts and word components -- the LLM's native language -- with each token translating to a word or portion of a word.) In fact, the Chinese graphic reveals far more detailed pricing for both models per batch of tokens inputted/outputted. I've tried to translate it below: Another note: since Z.ai is based in China, those in the West who are focused on data sovereignty will want to perform due diligence through internal policies before using the API, as it may be subject to Chinese content restrictions.
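As a quick sanity check on those reported rates, the per-request cost is simple arithmetic. The sketch below uses the $0.11 / $0.28 per-million rates from the CNBC report; the function name and structure are illustrative, not anything from Z.ai's documentation, and real bills would follow the tiered pricing in the graphic rather than these flat rates.

```python
# Hypothetical cost estimator using the CNBC-reported GLM-4.5-Air rates
# ($0.11 / $0.28 per million input/output tokens). Flat rates assumed
# for illustration; Z.ai's actual pricing is tiered by batch size.

RATE_PER_M = {"input": 0.11, "output": 0.28}  # USD per 1M tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    cost = (input_tokens / 1_000_000) * RATE_PER_M["input"]
    cost += (output_tokens / 1_000_000) * RATE_PER_M["output"]
    return round(cost, 6)

# Example: a 32,000-token prompt with a 200-token reply.
print(estimate_cost(32_000, 200))  # -> 0.003576
```

At these rates, even a maximally long prompt in the cheap tier costs a fraction of a cent, which is the substance of the "cheaper than DeepSeek" claim repeated across these articles.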
Competitive performance on third-party benchmarks, approaching that of leading closed/proprietary LLMs GLM-4.5 ranks third across 12 industry benchmarks measuring agentic, reasoning, and coding performance -- trailing only OpenAI's o3 and xAI's Grok 4. GLM-4.5-Air, its more compact sibling, lands in sixth position. In agentic evaluations, GLM-4.5 matches Claude 4 Sonnet in performance and exceeds Claude 4 Opus in web-based tasks. It achieves a 26.4% accuracy on the BrowseComp benchmark, compared to Claude 4 Opus's 18.8%. In the reasoning category, it scores competitively on tasks such as MATH 500 (98.2%), AIME24 (91.0%), and GPQA (79.1%). For coding, GLM-4.5 posts a 64.2% success rate on SWE-bench Verified and 37.5% on Terminal-Bench. In pairwise comparisons, it outperforms Qwen3-Coder with an 80.8% win rate and beats Kimi K2 in 53.9% of tasks. Its agentic coding ability is enhanced by integration with tools like Claude Code, Roo Code, and CodeGeex. The model also leads in tool-calling reliability, with a success rate of 90.6%, edging out Claude 4 Sonnet and the new-ish Kimi K2. Part of the wave of open source Chinese LLMs The release of GLM-4.5 arrives amid a surge of competitive open-source model launches in China, most notably from Alibaba's Qwen Team. In the span of a single week, Qwen released four new open-source LLMs, including the reasoning-focused Qwen3-235B-A22B-Thinking-2507, which now tops or matches leading models such as OpenAI's o4-mini and Google's Gemini 2.5 Pro on reasoning benchmarks like AIME25, LiveCodeBench, and GPQA. This week, Alibaba continued the trend with the release of Wan 2.2, a powerful new open source video model. Alibaba's new models are, like Z.ai's, licensed under Apache 2.0, allowing commercial usage, self-hosting, and integration into proprietary systems.
The broad availability and permissive licensing of Alibaba's offerings (and of Chinese startup Moonshot's Kimi K2 before them) reflect an ongoing strategic effort by Chinese AI companies to position open-source infrastructure as a viable alternative to closed U.S.-based models. It also puts pressure on U.S.-based model providers' efforts to compete in open source. Meta has been on a hiring spree after its Llama 4 model family debuted earlier this year to a mixed response from the AI community, including a hefty dose of criticism for what some AI power users saw as benchmark gaming and inconsistent performance. Meanwhile, OpenAI co-founder and CEO Sam Altman recently announced that OpenAI's long-awaited and much-hyped frontier open source LLM -- its first since before ChatGPT launched in late 2022 -- would be delayed from its originally planned July release to an as-yet unspecified later date. Architecture and training lessons revealed GLM-4.5 is built with 355 billion total and 32 billion active parameters. Its counterpart, GLM-4.5-Air, offers a lighter-weight design at 106 billion total and 12 billion active parameters. Both use a Mixture-of-Experts (MoE) architecture, optimized with loss-free balance routing, sigmoid gating, and increased depth for enhanced reasoning. The self-attention block includes Grouped-Query Attention and a higher number of attention heads. A Multi-Token Prediction (MTP) layer enables speculative decoding during inference. Pre-training spans 22 trillion tokens split between general-purpose and code/reasoning corpora. Mid-training adds 1.1 trillion tokens from repo-level code data, synthetic reasoning inputs, and long-context/agentic sources. Z.ai's post-training process for GLM-4.5 relied upon a reinforcement learning phase powered by its in-house RL infrastructure, slime, which separates data generation and model training processes to optimize throughput on agentic tasks.
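To make the sigmoid-gated MoE routing described above concrete, here is a toy sketch of top-k expert selection. The expert count, dimensions, and random weights are invented for illustration and bear no relation to GLM-4.5's real router; the point is only the mechanism: each expert gets an independent sigmoid gate score, and only the highest-scoring few run for a given token.

```python
# Toy sketch of sigmoid-gated top-k expert routing, the routing style
# the blog post attributes to GLM-4.5. All sizes and weights here are
# illustrative, not the real model's configuration.
import math
import random

random.seed(0)
NUM_EXPERTS, TOP_K, DIM = 8, 2, 4

# Random router weights: one score vector per expert.
router = [[random.uniform(-1, 1) for _ in range(DIM)]
          for _ in range(NUM_EXPERTS)]

def route(token):
    """Score every expert with a sigmoid gate, keep only the top-k."""
    gates = []
    for idx, w in enumerate(router):
        logit = sum(wi * xi for wi, xi in zip(w, token))
        gates.append((idx, 1.0 / (1.0 + math.exp(-logit))))  # sigmoid gate
    gates.sort(key=lambda g: g[1], reverse=True)
    return gates[:TOP_K]  # only these experts run for this token

token = [0.5, -0.2, 0.1, 0.9]
print(route(token))  # two (expert_index, gate_value) pairs
```

Because only TOP_K of NUM_EXPERTS experts execute per token, compute per token scales with the active subset rather than the full parameter count, which is how a 355B-parameter model can run with 32B active parameters.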
Among the techniques used were mixed-precision rollouts and adaptive curriculum learning. The former helps the model train faster and more efficiently by using lower-precision math when generating data, without sacrificing much accuracy. Adaptive curriculum learning, meanwhile, means the model starts with easier tasks and gradually moves to harder ones, helping it master more complex behaviors over time. GLM-4.5's architecture prioritizes computational efficiency. According to CNBC, Z.ai CEO Zhang Peng stated that the model runs on just eight Nvidia H20 GPUs -- custom silicon designed for the Chinese market to comply with U.S. export controls. That's roughly half the hardware requirement of DeepSeek's comparable models. Interactive demos Z.ai highlights full-stack development, slide creation, and interactive artifact generation as demonstration areas on its blog post. Examples include a Flappy Bird clone, a Pokémon Pokédex web app, and slide decks built from structured documents or web queries. Users can interact with these features on the Z.ai chat platform or through API integration. Company background and market position Z.ai was founded in 2019 under the name Zhipu, and has since grown into one of China's most prominent AI startups, according to CNBC. The company has raised over $1.5 billion from investors including Alibaba, Tencent, Qiming Venture Partners, and municipal funds from Hangzhou and Chengdu, with additional backing from Aramco-linked Prosperity7 Ventures. Its GLM-4.5 launch coincides with the World Artificial Intelligence Conference in Shanghai, where multiple Chinese firms showcased advancements. Z.ai was also named in a June OpenAI report highlighting Chinese progress in AI, and has been added to a U.S. entity list limiting business with American firms.
What it means for enterprise technical decision-makers For senior AI engineers, data engineers, and AI orchestration leads tasked with building, deploying, or scaling language models in production, the GLM-4.5 family's release under the Apache 2.0 license presents a meaningful shift in options. The model offers performance that rivals top proprietary systems across reasoning, coding, and agentic benchmarks -- yet comes with full weight access, commercial usage rights, and flexible deployment paths, including cloud, private, or on-prem environments. For those managing LLM lifecycles -- whether leading model fine-tuning, orchestrating multi-stage pipelines, or integrating models with internal tools -- GLM-4.5 and GLM-4.5-Air reduce barriers to testing and scaling. The models support standard OpenAI-style interfaces and tool-calling formats, making them easier to evaluate in sandboxed environments or drop into existing agent frameworks. GLM-4.5 also supports streaming output, context caching, and structured JSON responses, enabling smoother integration with enterprise systems and real-time interfaces. For teams building autonomous tools, its deep thinking mode provides more precise control over multi-step reasoning behavior. For teams under budget constraints or those seeking to avoid vendor lock-in, the pricing structure undercuts major alternatives like DeepSeek and Kimi K2. This matters for organizations where usage volume, long-context tasks, or data sensitivity make open deployment a strategic necessity. For professionals in AI infrastructure and orchestration, such as those implementing CI/CD pipelines, monitoring models in production, or managing GPU clusters, GLM-4.5's support for vLLM, SGLang, and mixed-precision inference aligns with current best practices in efficient, scalable model serving.
Combined with open-source RL infrastructure (slime) and a modular training stack, the model's design offers flexibility for tuning or extending in domain-specific environments. In short, GLM-4.5's launch gives enterprise teams a viable, high-performing foundation model they can control, adapt, and scale, without being tied to proprietary APIs or pricing structures. It's a compelling option for teams balancing innovation, performance, and operational constraints.
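The OpenAI-style interface and tool-calling format mentioned above can be pictured with a minimal request body. This sketch only constructs the payload (no network call is made); the model identifier, endpoint expectations, and the `get_weather` tool are assumptions for illustration, not Z.ai's actual API contract, so check the official API documentation before relying on any field.

```python
# Sketch of an OpenAI-style chat request with a tool definition, the
# format the article says GLM-4.5 accepts. Model name and tool schema
# are hypothetical; no request is actually sent.
import json

payload = {
    "model": "glm-4.5",             # assumed model identifier
    "messages": [
        {"role": "user", "content": "What's the weather in Shanghai?"}
    ],
    "stream": True,                  # article: streaming output supported
    "tools": [{                      # OpenAI-style tool-calling format
        "type": "function",
        "function": {
            "name": "get_weather",   # hypothetical tool for illustration
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
}

body = json.dumps(payload)           # what an HTTP client would POST
print(len(body) > 0)
```

Because this shape mirrors the de facto OpenAI chat-completions convention, existing agent frameworks that speak that convention can often be pointed at a compatible endpoint with little more than a base-URL change.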
[4]
China's Z.ai Scoffs at US Sanctions, Launches Supercharged AI With $1.5B War Chest - Decrypt
New models are open-source, super-efficient, and cost a fifth of Western rivals to use. Beijing-based Z.ai, recently added to the U.S. Entity List and backed by $1.5 billion in fresh funding, has released two open-source AI models that outperformed most Western counterparts in global benchmarks, challenging U.S. sanctions as it prepares for a Hong Kong listing. The startup, formerly known as Zhipu AI, is listed on Washington's Entity List, a trade restriction tool that identifies foreign individuals, companies, and other organizations the government believes are involved in activities contrary to U.S. national security or foreign policy interests. Rubbing salt in the wound, Z.ai dunked on San Francisco-based OpenAI in a blog post earlier on Monday, arguing that the massively popular AI platform is something of a mess, with a confusing array of special-use products. By comparison, Z.ai's new open-source model GLM-4.5 merges reasoning and non-reasoning skills into one fully capable and versatile architecture. "In the past five years, OpenAI's GPT-3 learns common-sense knowledge, and o1 uses reinforcement learning to think before responding, significantly improving reasoning skills in coding, data analysis, and complex math," the post said. "However, the resultant models are still not really general: some of them are good at coding, some good at math, and some good at reasoning, but none of them could achieve the best performance across all the different tasks," it said in its post. "GLM-4.5 makes efforts toward the goal of unifying all the different capabilities." OpenAI has stated that its next model, GPT-5, expected by the end of summer, will be its first fully unified tool. In the meantime, GLM-4.5 and its smaller sibling GLM-4.5-Air scored averages of 63.2 and 59.8 across 12 industry tests, ranking third globally, behind OpenAI o3 and Grok-4. The models employ a Mixture of Experts architecture with a total of 355 billion parameters for the flagship version.
However, only 32 billion are active at any given time, making it more efficient than brute-force approaches. GLM-4.5 achieved a 90.6% tool-calling success rate, outperforming Claude 4 Sonnet's 89.5% and even Kimi K2's 86.2%. On web browsing tasks, it correctly answered 26.4% of complex questions compared to Claude 4 Opus's 18.8%. The model solved 98.2% of problems in the MATH 500 benchmark, matching Claude 4 Opus, all while being open source. The U.S. Department of Commerce added Zhipu to its Entity List in January, accusing the company of supporting Beijing's military advances. Companies on the list cannot purchase American technology without special government approval. Zhipu said the designation "will not have a substantial impact" on its operations. Which is undoubtedly true: thus far, state-backed investors have poured money into the company since the Entity List announcement. Huafa Group, backed by the Zhuhai government, invested 500 million yuan ($69 million) in March. Chengdu's state fund followed with 300 million yuan, while Shanghai's Pudong Venture Capital Group and Zhangjiang Group jointly added 1 billion yuan in July. The startup plans to raise another $300 million through a Hong Kong initial public offering, according to people familiar with the matter. The company initially filed for a mainland China listing but shifted strategies as geopolitical tensions escalated. Its valuation doubled to 40 billion yuan ($5.6 billion) between September 2024 and July 2025. Z.ai prices API calls at $0.11 per million input tokens and $0.28 per million output tokens -- roughly one-fifth the cost of comparable Western models. The company released the weights under an MIT license on Hugging Face and GitHub, allowing anyone to download and modify the code. The models feature what Z.ai calls "hybrid reasoning" with two modes. A thinking mode handles complex reasoning and planning tasks, while a non-thinking mode provides instant responses.
The company shared samples of how people can build entire websites with natural language commands or generate interactive visualizations ranging from Pokémon databases to complex presentations. Chinese companies are accelerating their investment in AI, releasing 1,509 large language models as of July 2025 -- 40% of the 3,755 models released globally -- and offering some of the best open-source models in the industry. Just recently, Kimi K2 impressed. Additionally, Alibaba today released Wan 2.2, a state-of-the-art (SOTA) open-source generative video model, while HiDream-I1 ranks among the best open-source image generators and Hailuo, Seedream, and Kling are beating some of the biggest names in the generative video scene. Early investors of Z.ai include Alibaba Group, Tencent Holdings, and HongShan Capital Group. The company spun out from Tsinghua University in 2019 and represents one of China's "AI Tigers," startups that the government considers crucial for achieving technological independence amid heavy political constraints imposed by the U.S. To boost its own ecosystem of AI technology and apps, Z.ai launched a 1.5 billion yuan ($211 million) fund called the "Z Fund" to invest in startups that build on its technology. The strategy mirrors OpenAI's approach of creating a developer ecosystem around proprietary models, while maintaining more open licensing terms. Users can try an online version of the chatbot by accessing this link: Chat with Z.ai. The site is hosted in Singapore, so you can breathe a little easier.
[5]
Chinese startup Z.ai releases cost-efficient GLM-4.5 reasoning model - SiliconANGLE
Chinese startup Z.ai releases cost-efficient GLM-4.5 reasoning model Chinese startup Z.ai today open-sourced GLM-4.5, a reasoning model that it claims is more cost-efficient than DeepSeek's R1. CNBC reported that the algorithm can run on eight H20 graphics cards. The H20 is a scaled-down version of Nvidia Corp.'s H100 chip, which was its flagship artificial intelligence accelerator until last year. The U.S. government recently greenlit the sale of the former processor to companies in China. The launch of GLM-4.5 comes about six months after DeepSeek released its open-source R1 reasoning model. At the time, the company stated that the algorithm can perform some tasks using 50 times less hardware than OpenAI's o1. Furthermore, DeepSeek claimed to have trained its model for a fraction of the cost of earlier AI projects. R1's release led to investor concerns that increasingly hardware-efficient language models may lower demand for AI infrastructure. Nvidia's market capitalization dropped more than $580 billion in the subsequent selloff, setting a new Wall Street record. The release of GLM-4.5 today didn't lead to a similar drop in AI stocks, but it sends investors another signal that reasoning models are continuing to become more hardware-efficient. Z.ai reportedly expects to charge $0.11 for every 1 million input tokens entered into GLM-4.5. That's 3 cents lower than R1. One million output tokens cost $0.28, just over one-tenth of what DeepSeek charges for R1. One of the main factors behind GLM-4.5's cost-efficiency is that it's relatively small. The model features 355 billion parameters, about 316 billion fewer than R1. GLM-4.5 only activates 32 billion of those parameters at any given time to reduce hardware usage. An AI model comprises numerous code snippets called artificial neurons that each perform a tiny portion of the work involved in processing a prompt. Those neurons, in turn, are organized into so-called layers.
Z.ai removed some of GLM-4.5's components to add more layers, an approach that it says helped boost the model's reasoning skills. The company trained GLM-4.5 through a multistep workflow. First, it developed an initial version of the model using a dataset that included 15 trillion tokens' worth of information. Z.ai then honed GLM-4.5's reasoning skills with several smaller training datasets that together comprised more than 7 trillion tokens. The company evaluated the model's capabilities using a dozen popular AI benchmarks. According to Z.ai, GLM-4.5 outperformed multiple popular alternatives including Claude 4 Opus. It ranked third behind xAI Holdings Corp.'s Grok 4 and OpenAI's o3. For use cases that place particular emphasis on cost-efficiency, Z.ai has developed a scaled-down version of its model called GLM-4.5-Air. The algorithm features 106 billion parameters, about a third as many as the original. GLM-4.5-Air activates 12 billion parameters to process prompts. In January, the U.S. Commerce Department added Z.ai to its Entity List of organizations subject to export controls. The company is backed by $1.5 billion in funding from Alibaba Group, Tencent Inc. and other investors. It reportedly plans to file for a public offering later this year.
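The parameter arithmetic in this piece is easy to check. The sketch below uses the articles' GLM figures plus DeepSeek-R1's publicly reported 671B total / 37B active parameters (the R1 figures are not from this article); it simply computes what fraction of each model is active per token.

```python
# Back-of-the-envelope comparison of total vs. active parameters.
# GLM figures are from the articles above; DeepSeek-R1's 671B/37B are
# public DeepSeek figures, included here only for comparison.
MODELS = {
    "GLM-4.5":     (355e9, 32e9),
    "GLM-4.5-Air": (106e9, 12e9),
    "DeepSeek-R1": (671e9, 37e9),
}

for name, (total, active) in MODELS.items():
    print(f"{name}: {active / total:.1%} of parameters active per token")
```

This also shows why "355 billion parameters, about 316 billion fewer than R1" translates into roughly half the serving hardware: both total footprint (weights held in GPU memory) and active compute per token are substantially smaller.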
[6]
Cheaper than DeepSeek? Z.ai releases a new open-source model
Chinese startup Z.ai, formerly known as Zhipu, released a new open-source AI model, which the company asserts operates at a lower cost than DeepSeek. Z.ai introduced three distinct models as part of this release. The flagship offering is designated as GLM-4.5. Alongside this, a more lightweight version, named GLM-4.5-Air, has been made available. The third model, GLM-4.5-Flash, is provided without cost and is specifically engineered for coding tasks, reasoning capabilities, and various agent-based applications. While these models share functional similarities with offerings from American technology companies such as OpenAI, Google's Gemini, and Anthropic's Claude AI, a key distinction lies in the complete open-source nature of GLM-4.5. This open-source attribute differentiates it from several proprietary models currently available in the market. Z.ai is identified as one of several "AI tigers," a term referring to Chinese artificial intelligence companies that receive backing from the Chinese government. The introduction of GLM-4.5 contributes to a substantial volume of large-language models (LLMs) originating from China. Records indicate that GLM-4.5 is one of 1,509 large-language models released by various Chinese AI companies within recent months. While a number of these models, including GLM-4.5, offer open-source systems for development and integration, certain considerations may influence their adoption by users in Western countries. One primary concern involves data transmission protocols, as models like DeepSeek have been observed to transmit user data back to China. This practice raises privacy implications, particularly given the geopolitical dynamics between the United States and China. OpenAI's Global Affairs team previously issued a warning regarding Z.ai and other similar Chinese startups in a public report. The sustained advancements by Chinese AI companies have also prompted policy responses in the United States. 
The Trump administration, for instance, released "America's AI Action Plan." This document outlines strategies aimed at maintaining the United States' leadership position in artificial intelligence development. The plan's objectives include reducing regulations deemed unnecessary for AI companies and integrating AI technologies more extensively across governmental operations. These measures are designed to foster innovation and reinforce America's competitive edge in the global AI landscape. Observations concerning other Chinese AI models, such as DeepSeek, indicate capacities for censoring content related to China. This aspect suggests that AI models like GLM-4.5 may encounter limitations in achieving widespread adoption across Western regions, predominantly due to concerns regarding data privacy and content censorship. Advocacy has emerged, recommending that users exercise caution or avoid the use of DeepSeek and similar models. Despite GLM-4.5 being an open-source model, the degree of control afforded to users concerning the collection and utilization of their data remains limited. This issue is not exclusive to Chinese AI models; many AI models, regardless of their origin, retain user data and frequently employ it for model training purposes. While certain privacy-focused AI models, such as Proton's Lumo chatbot, have been introduced, a comprehensive solution to the broader challenge of data retention by major technology companies has not yet materialized. The significant economic value attributed to consumer data by large technology corporations suggests that this practice is unlikely to undergo fundamental changes in the near future. The release of Z.ai's new model illustrates the rapid pace of evolution within the field of artificial intelligence and highlights the accelerated progress made by nations such as China in narrowing the technological lead previously held by the United States.
This development has the potential to incentivize American companies, including OpenAI, to enhance their existing AI models and to encourage emerging startups to pursue similar advancements in their respective offerings.
[7]
This Chinese AI Lab Just Released Two Open-Source Agentic AI Models
GLM-4.5 AI models offer reasoning, coding, and agentic capabilities Chinese artificial intelligence (AI) lab Z.ai announced the release of two new open-source general language models (GLM) on Monday. The AI firm calls GLM-4.5 and GLM-4.5-Air its latest flagship models. Both are hybrid reasoning models that offer a thinking mode for complex reasoning and tool use, and a non-thinking mode for instant responses. Additionally, the company says these models support agentic capabilities. Notably, the AI firm claims that its latest models outperform all other open-source models worldwide. In a blog post, the Chinese AI firm announced the release of these models. The idea behind GLM models is to create a large language model (LLM) that is truly generalist and can perform different types of tasks equally well. The company argues that despite several advancements in generative AI, the models from the likes of Google, OpenAI, and Anthropic are not general enough, as they display strong performance in some areas while lagging in others. "GLM-4.5 makes efforts toward the goal of unifying all the different capabilities," the company said. The GLM-4.5 AI model features a total of 355 billion parameters with 32 billion active parameters. On the other hand, the Air variant gets 106 billion total parameters, with 12 billion parameters being active. Both models unify reasoning, coding, and agentic capabilities in a single architecture, with a context window of 128,000 tokens and native function-calling capacity. Coming to the model architecture, Z.ai opted for a mixture-of-experts (MoE) architecture to improve the compute efficiency of both training and inference. Instead of increasing the width (hidden dimensions and number of experts) of the MoE layers like DeepSeek-V3, the GLM-4.5 series reduces the width while increasing the depth (number of layers). This was done as the company believed that deeper models displayed improved reasoning capability.
The Chinese AI firm also listed the novel techniques used for the pre-training and post-training process in the blog post to help the developer community understand how the models were built from scratch. Z.ai claimed to have tested the GLM-4.5 model's performance on 12 benchmarks across agentic, reasoning, and coding tasks. It then compared the model's overall scores against various LLMs from OpenAI, Anthropic, Google, xAI, Alibaba, and more. Based on this internal evaluation, the Chinese AI firm claims that GLM-4.5 ranked third, behind OpenAI's o3 and xAI's Grok 4. Interested individuals can access the open weights of these models from Z.ai's GitHub and Hugging Face listings. Alternatively, these LLMs can also be accessed via the company's website and application programming interface (API).
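The depth-versus-width trade-off described in the previous article can be illustrated with back-of-the-envelope arithmetic: halving the number of experts per layer while doubling the number of layers leaves the expert parameter budget unchanged, which is the kind of reallocation Z.ai says favors reasoning. All numbers below are made up for illustration and are not GLM-4.5's real configuration; the per-expert cost formula is a deliberate simplification (two weight matrices per expert, ignoring attention, embeddings, and shared experts).

```python
# Illustrative depth-vs-width arithmetic for MoE layers. Configs and
# the simplified cost formula are hypothetical, not GLM-4.5's actual
# layout: each expert is counted as two hidden x ffn weight matrices.
def moe_expert_params(layers, experts, hidden, ffn):
    return layers * experts * 2 * hidden * ffn

wide_shallow = moe_expert_params(layers=30, experts=16, hidden=4096, ffn=11008)
deep_narrow  = moe_expert_params(layers=60, experts=8,  hidden=4096, ffn=11008)

# Same expert-parameter budget, redistributed from width into depth.
print(wide_shallow == deep_narrow)
```

Under a fixed budget, the choice is purely about shape: the deeper-narrower configuration trades per-layer capacity for more sequential transformation steps, which is what Z.ai credits for the improved reasoning capability.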
[8]
Z.ai Says New AI Model Costs Less to Use Than DeepSeek | PYMNTS.com
Z.ai's new GLM-4.5 AI model costs 11 cents per million input tokens and 28 cents per million output tokens, CNBC reported Monday (July 28), citing its interview with Z.ai CEO Zhang Peng. For DeepSeek R1, those costs are 14 cents and $2.19, respectively, according to the report. Z.ai's GLM-4.5 is open source and can operate on eight Nvidia H20 chips, per the report. The company said in a Monday blog post that the GLM-4.5 model series features two models -- GLM-4.5 and GLM-4.5-Air -- that satisfy the requirements of agentic applications by unifying reasoning, coding and agentic capabilities. They are hybrid reasoning models that offer a "thinking mode" for complex reasoning and tool use, as well as a "non-thinking mode" for instant responses, according to the post. "As a good [large language model (LLM)], it is necessary to deal with general problem solving, generalization, [commonsense] reasoning and self-improvement," Z.ai said in the post. "In the past five years, OpenAI's GPT-3 learns [commonsense] knowledge, and o1 uses reinforcement learning to think before [responding], significantly improving reasoning skills in coding, data analysis and complex math. However, the resultant models are still not really general. Some of them are good at coding, some good at math, and some good at reasoning, but none of them could achieve the best performance across all the different tasks. GLM-4.5 makes efforts toward the goal of unifying all the different capabilities." According to the CNBC report, Z.ai is on the United States' entity list, meaning American companies are restricted from doing business with it. The AI startup has raised $1.5 billion from investors and is planning an initial public offering (IPO) in China, per the report.
It was reported in March that Z.ai, then known as Zhipu AI, raised more than 1 billion yuan (about $140 million) in a funding round led by Hangzhou Municipal Construction Investment Group Co. and Shangcheng Capital, both of which have ties to the Hangzhou local government.
Meet GLM-4.5: The most capable open-source AI model yet
Zhipu AI's GLM-4.5 rivals commercial giants with 128K context, dual-mode thinking, and MIT licensing.

While the West buzzes with the rise of OpenAI's GPT-4.5 and xAI's Grok-4, a powerful open-source challenger has emerged in the East. Developed by Zhipu AI (also known as Z.ai), GLM-4.5 is the latest entry in the company's General Language Model series, and it's already turning heads in the AI research and developer communities.

On paper, GLM-4.5 is a technical marvel: 355 billion total parameters (with 32 billion active), support for 128K context windows, a novel thinking/non-thinking hybrid architecture, and the highest tool-use accuracy yet recorded by an open-source model. In benchmarks, it doesn't just compete; it outperforms nearly every other open alternative in coding, reasoning, and real-world applications. And perhaps most importantly, it's free to use under the MIT license.

GLM-4.5 is built on a Mixture-of-Experts (MoE) framework. This architecture activates only a subset of its parameters for each query (32B active out of 355B total), giving it GPT-4-level performance while staying computationally efficient.

But the standout innovation is its dual-mode processing. GLM-4.5 can toggle between "thinking" and "non-thinking" states, choosing between deep, multi-step reasoning or fast, lightweight response generation based on the task. This yields more intelligent, contextual responses without always requiring full-throttle compute. For users who want a smaller version, Zhipu has also launched GLM-4.5-Air, a leaner 106B-parameter model with just 12B active parameters, well suited to inference on edge or budget GPUs.

Zhipu AI didn't just make big claims; it backed them with data. In internal and third-party evaluations, GLM-4.5 pulled ahead in several categories.
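The Mixture-of-Experts idea described above can be illustrated with a toy sketch (this is not Z.ai's code; the expert count, hidden size, and routing details are all hypothetical). The point it demonstrates is why a 355B-parameter MoE model can activate only ~32B parameters per token: a router scores every expert, but each token is forwarded to just the top-k of them.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 16   # hypothetical expert count
TOP_K = 2          # experts activated per token
D_MODEL = 8        # toy hidden size

# Router: one scoring vector per expert; experts: toy feed-forward matrices.
router_w = rng.normal(size=(NUM_EXPERTS, D_MODEL))
experts = rng.normal(size=(NUM_EXPERTS, D_MODEL, D_MODEL))

def moe_layer(x):
    """Route a single token vector through its top-k experts only."""
    scores = router_w @ x                 # one score per expert
    top = np.argsort(scores)[-TOP_K:]     # indices of the k best experts
    gates = np.exp(scores[top])
    gates /= gates.sum()                  # softmax over the chosen experts
    # Only TOP_K of NUM_EXPERTS expert networks actually run for this token.
    return sum(g * (experts[i] @ x) for g, i in zip(gates, top))

token = rng.normal(size=D_MODEL)
out = moe_layer(token)
print(out.shape)                                        # (8,)
print(f"active experts: {TOP_K}/{NUM_EXPERTS} = {TOP_K / NUM_EXPERTS:.0%}")
```

With 2 of 16 experts active, only 12% of the expert parameters are touched per token, which is the same mechanism, at a much larger scale, behind GLM-4.5's 32B-active-of-355B design.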
In code generation, it achieved a 53.9% win rate against top competitors, and an 80.8% win rate in dedicated coding benchmarks. On real-world agent benchmarks requiring API calling, document reading, or search, GLM-4.5 posted a 90.6% success rate, higher than many closed and open models on record. With 128K-token support, it matches Claude 3 and GPT-4 in retaining long conversations and analyzing multi-document input. These results push GLM-4.5 into the top tier of open models; in some internal comparisons, Zhipu AI ranks it behind only xAI's Grok-4 and OpenAI's o3 model.

GLM-4.5's significance goes beyond technical specs. At a time when most cutting-edge models are locked behind APIs or usage caps, Zhipu has released GLM-4.5 completely open source, hosted on Hugging Face and available for enterprise deployment. That means any developer, startup, or research lab can fine-tune, embed, or deploy the model without worrying about licensing hurdles. Given its performance, it could become the default foundation model for many new AI applications, especially those focused on autonomous agents or tool-using LLMs. The MIT license also encourages commercial adoption; early reports suggest that Chinese tech giants and even several European research teams are exploring integrations.

Zhipu AI has been flying under the radar internationally, but GLM-4.5 may change that. The Beijing-based firm, a spin-off from Tsinghua University's innovation labs, has released multiple world-class models within two years, including GLM-3, ChatGLM, and now GLM-4.5. With GLM-5 already hinted to be in development and a growing suite of agent-centric tools, Zhipu seems poised to become a global leader in open, high-performance AI. With geopolitical concerns growing around closed-source Western models, its timing couldn't be better.
Z.ai, a Chinese AI startup, releases its open-source GLM-4.5 model, claiming superior performance and cost-efficiency compared to Western counterparts, despite U.S. sanctions.
Chinese AI startup Z.ai, formerly known as Zhipu AI, has released its latest open-source large language model (LLM) family, GLM-4.5, positioning itself as a formidable competitor in the global AI landscape. This release comes amid increasing tensions between China and the United States, with Z.ai recently added to the U.S. Entity List.
GLM-4.5 boasts impressive capabilities, ranking third globally across 12 industry benchmarks, trailing only xAI's Grok-4 and OpenAI's o3. The model family includes GLM-4.5 and its lighter-weight sibling, GLM-4.5-Air, both designed for intelligent agent applications.
Key features of GLM-4.5 include:
- 355 billion total parameters, with only 32 billion active per query via a Mixture-of-Experts architecture
- A 128K-token context window for long conversations and multi-document analysis
- Hybrid reasoning, with a "thinking mode" for complex reasoning and tool use and a "non-thinking mode" for instant responses
- Strong agentic performance, including a reported 90.6% success rate on benchmarks involving API calling, document reading, and search
Z.ai has prioritized efficiency in the design of GLM-4.5:
- At roughly half the size of DeepSeek's model, GLM-4.5 needs only eight Nvidia H20 chips to operate, the chip Nvidia customized for China under U.S. export controls
- The lighter GLM-4.5-Air variant has 106 billion total parameters with just 12 billion active, suited to inference on edge or budget GPUs
One of the most notable aspects of GLM-4.5 is its cost-effectiveness:
- Usage is priced at 11 cents per million input tokens and 28 cents per million output tokens
- By comparison, DeepSeek R1 costs 14 cents per million input tokens and $2.19 per million output tokens
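The pricing gap, especially on output tokens, is easiest to see with a quick back-of-the-envelope calculation. The per-million-token prices below are the reported figures; the monthly workload is entirely hypothetical.

```python
# Reported per-million-token prices in USD (input, output).
PRICES = {
    "GLM-4.5":     {"input": 0.11, "output": 0.28},
    "DeepSeek R1": {"input": 0.14, "output": 2.19},
}

def cost_usd(model, input_tokens, output_tokens):
    """Total API cost for a given token volume at the model's listed rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Hypothetical workload: 50M input tokens and 10M output tokens per month.
for model in PRICES:
    print(f"{model}: ${cost_usd(model, 50_000_000, 10_000_000):.2f}")
# GLM-4.5:     $8.30
# DeepSeek R1: $28.90
```

At these rates the input prices are nearly identical; almost the entire difference comes from DeepSeek R1's output tokens costing roughly eight times more.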
Z.ai has taken a bold step by making GLM-4.5 open-source:
- The model is released under the permissive MIT license, allowing free commercial use, fine-tuning, and redistribution
- Weights are hosted on Hugging Face and available for enterprise deployment without licensing hurdles
Despite facing U.S. sanctions, Z.ai has secured significant backing:
- The startup has raised $1.5 billion from investors and is planning an initial public offering in China
- A March funding round of more than 1 billion yuan (about $140 million) was led by Hangzhou Municipal Construction Investment Group Co. and Shangcheng Capital, both tied to the Hangzhou local government
The release of GLM-4.5 is part of a broader trend in China's AI sector:
- As of July, China had released 1,509 large language models, the most of any country among the 3,755 released globally
- Rivals such as DeepSeek have likewise paired aggressive pricing with open-source releases
The launch of GLM-4.5 signifies China's growing prowess in AI technology:
- The model ranks among the top open-source models worldwide on coding, reasoning, and agentic benchmarks
- Its ability to run on export-compliant Nvidia H20 chips shows that hardware restrictions have not stopped Chinese labs from shipping competitive models
As geopolitical tensions continue to shape the AI landscape, the success of models like GLM-4.5 could have far-reaching implications for the future of AI research, development, and commercialization on a global scale.