The Outpost is a comprehensive collection of curated artificial intelligence software tools that cater to the needs of small business owners, bloggers, artists, musicians, entrepreneurs, marketers, writers, and researchers.
© 2025 TheOutpost.AI All rights reserved
Curated by THEOUTPOST
On Thu, 25 Jul, 12:04 AM UTC
6 Sources
[1]
Mistral AI's new AI model Large 2 is on par with Anthropic's Claude 3, Meta's Llama 3 and OpenAI's GPT-4o
Paris-based AI startup Mistral has released its latest AI model, Large 2. With 123 billion parameters, the model is claimed to match or outperform rivals such as OpenAI's GPT-4o, Anthropic's Claude 3 Opus, and Meta's Llama 3.1 405B while using comparatively fewer parameters. Mistral says the model is "significantly more capable in code generation, mathematics, and reasoning," and improves upon its previous flagship model, Mistral Large. In the blog post released with the news, Mistral shared that it was especially focused on reducing the model's hallucination issues: Large 2 has been trained to be more cautious and accurate in situations where AI models tend to fabricate answers. The model has a considerably large context window of 128,000 tokens, meaning it can digest and process a long prompt, roughly equal to a 300-page book. A few days ago, Meta released its largest-ever open-source AI model, Llama 3.1. According to Mistral, Large 2 beats Llama 3.1 405B on math and code generation. Additionally, it has expanded multilingual support, covering English, French, German, Spanish, Italian, Portuguese, Arabic, Hindi, Russian, Chinese, Japanese and Korean, along with 80 coding languages. Notably, however, the model is still not multimodal, unlike Google's Gemini models and OpenAI's recent GPT-4o. Early in June, Mistral raised $640 million in a Series B funding round led by General Catalyst, pushing its valuation to $6 billion.
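The 300-page figure is simple arithmetic. A minimal sketch, assuming the common rule of thumb of roughly 0.75 English words per token and about 320 words per printed page (both are assumptions, not figures from Mistral):

```python
# Rough conversion from a 128,000-token context window to book pages.
# WORDS_PER_TOKEN and WORDS_PER_PAGE are rules of thumb, not Mistral's numbers.
CONTEXT_TOKENS = 128_000
WORDS_PER_TOKEN = 0.75   # typical English tokenization ratio (assumption)
WORDS_PER_PAGE = 320     # average paperback page (assumption)

words = CONTEXT_TOKENS * WORDS_PER_TOKEN   # 96,000 words
pages = words / WORDS_PER_PAGE             # 300.0 pages
print(f"~{pages:.0f} pages")
```

Different tokenizers and page densities shift the result, but the order of magnitude holds: the window comfortably fits a full-length book in one prompt.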
[2]
Mistral's Large 2 is its answer to Meta and OpenAI's latest models | TechCrunch
For frontier AI models, when it rains, it pours. Mistral released a fresh new flagship model on Wednesday, Large 2, which it claims to be on par with the latest cutting-edge models from OpenAI and Meta in terms of code generation, mathematics, and reasoning. The release of Mistral Large 2 falls just one day after Meta dropped its latest and greatest open source model, Llama 3.1 405B. Mistral says Large 2 raises the bar for performance and cost for open models, backing that up with a handful of benchmarks. Large 2 appears to outpace Llama 3.1 405B on code generation and math performance, and does so with under a third of the parameters: 123 billion, to be precise. In a press release, Mistral says one of its key focus areas during training was to minimize the model's hallucination issues. The company says Large 2 was trained to be more discerning in its responses, acknowledging when it does not know something instead of making something up that seems plausible. The Paris-based AI startup recently raised $640 million in a Series B funding round, led by General Catalyst, at a $6 billion valuation. Though Mistral is one of the newer entrants in the artificial intelligence space, it's quickly shipping AI models on or near the cutting edge. However, it's important to note that Mistral's models are, like most others, not open source in the traditional sense - any commercial application of the model needs a paid license. And while it's more open than, say, GPT-4o, few in the world have the expertise and infrastructure to implement such a large model. (That goes double for Llama's 405 billion parameters, of course.) Something missing from Mistral Large 2, which was also absent from Meta's Llama 3.1 release yesterday, is multimodal capabilities. OpenAI is far ahead of the competition with regard to multimodal AI systems, capable of processing image and text simultaneously, a capability some startups are increasingly looking to build on.
The model has a 128,000 token window, which means Large 2 can intake a lot of data in a single prompt (128,000 tokens is equal to roughly a 300-page book). Mistral's new model also includes improved multilingual support. Large 2 understands English, French, German, Spanish, Italian, Portuguese, Arabic, Hindi, Russian, Chinese, Japanese, and Korean, along with 80 coding languages. Notably, Mistral claims Large 2 also produces more concise responses than leading AI models, which have a tendency to blabber on. Mistral Large 2 is available to use on Google Vertex AI, Amazon Bedrock, Azure AI Studio, and IBM watsonx.ai. You can also use the new model on Mistral's la Plateforme under the name "mistral-large-2407", and test it out for free on the startup's ChatGPT competitor, le Chat.
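Only the model name "mistral-large-2407" above comes from the announcement; the request shape below is a sketch following the common chat-completions convention, not Mistral's documented API (check the official docs before use). It just builds the JSON body, without sending anything:

```python
import json

# Hypothetical chat-completions request body. Only the model name
# "mistral-large-2407" is taken from the announcement; the field names
# follow the widely used chat-completions convention and the endpoint
# and auth details are deliberately omitted.
payload = {
    "model": "mistral-large-2407",
    "messages": [
        {"role": "user", "content": "Summarize Mistral Large 2 in one sentence."}
    ],
    "max_tokens": 64,
}
body = json.dumps(payload)
print(body)
```

In practice this body would be POSTed with an API key to Mistral's chat endpoint, or sent through one of the cloud platforms listed above.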
[3]
Mistral shocks with new open model Mistral Large 2, taking on Llama 3.1
The AI race is picking up pace like never before. Following Meta's move just yesterday to launch its new open source Llama 3.1 as a highly competitive alternative to leading closed-source "frontier" models, French AI startup Mistral has also thrown its hat in the ring. The startup announced the next generation of its flagship open source model with 123 billion parameters: Mistral Large 2. However, in an important caveat, the model is only licensed as "open" for non-commercial research uses, including open weights, allowing third parties to fine-tune it to their liking. Those seeking to use it for commercial/enterprise-grade applications will need to obtain a separate license and usage agreement from Mistral, as the company states in its blog post and in an X post from research scientist Devendra Singh Chaplot. While having a lower number of parameters -- or internal model settings that guide its performance -- than Llama 3.1's 405 billion, it still nears Llama 3.1's performance. Available on the company's main platform and via cloud partners, Mistral Large 2 builds on the original Large model and brings advanced multilingual capabilities with improved performance across reasoning, code generation and mathematics. It is being hailed as a GPT-4 class model with performance closely matching GPT-4o, Llama 3.1 405B and Anthropic's Claude 3.5 Sonnet across several benchmarks. Mistral notes the offering continues to "push the boundaries of cost efficiency, speed and performance" while giving users new features, including advanced function calling and retrieval, to build high-performing AI applications. However, it's important to note that this isn't a one-off move designed to capitalize on the AI hype stirred by Meta or OpenAI.
Mistral has been moving aggressively in the domain, raising large rounds, launching new task-specific models (including those for coding and mathematics) and partnering with industry giants to expand its reach. Mistral Large 2: What to expect? Back in February, when Mistral launched the original Large model with a context window of 32,000 tokens, it claimed that the offering had "a nuanced understanding of grammar and cultural context" and could reason with and generate text with native fluency across different languages, including English, French, Spanish, German and Italian. The new version of the model builds on this with a larger 128,000-token context window -- matching OpenAI's GPT-4o and GPT-4o mini and Meta's Llama 3.1. It further boasts support for dozens of new languages, including the original ones as well as Portuguese, Arabic, Hindi, Russian, Chinese, Japanese and Korean. Mistral says that the generalist model is ideal for tasks that require large reasoning capabilities or are highly specialized, such as synthetic text generation, code generation or RAG. High performance on third-party benchmarks and improved coding capability On the Multilingual MMLU benchmark covering different languages, Mistral Large 2 performed on par with Meta's all-new Llama 3.1 405B while delivering more significant cost benefits due to its smaller size. "Mistral Large 2 is designed for single-node inference with long-context applications in mind - its size of 123 billion parameters allows it to run at large throughput on a single node," the company noted in a blog post. But that's not the only benefit. The original Large model did not do well on coding tasks, which Mistral seems to have remedied by training the latest version on large chunks of code. The new model can generate code in 80+ programming languages, including Python, Java, C, C++, JavaScript and Bash, with a very high level of accuracy (according to the average from the MultiPL-E benchmark).
On HumanEval and HumanEval Plus benchmarks for code generation, it outperformed Claude 3.5 Sonnet and Claude 3 Opus, while sitting just behind GPT-4o. Similarly, across mathematics-focused benchmarks - GSM8K and Math Instruct - it grabbed the second spot. Focus on instruction-following with minimized hallucinations Given the rise of AI adoption by enterprises, Mistral has also focused on minimizing the hallucinations of Mistral Large by fine-tuning the model to be more cautious and selective when responding. If it doesn't have sufficient information to back an answer, it will simply say so to the user, ensuring full transparency. Further, the company has improved the model's instruction-following capabilities, making it better at following user guidelines and handling long multi-turn conversations. It has even been tuned to provide succinct and to-the-point answers wherever possible -- which can come in handy in enterprise settings. Currently, the company is providing access to Mistral Large 2 through its API endpoint platform as well as via cloud platforms such as Google Vertex AI, Amazon Bedrock, Azure AI Studio and IBM watsonx.ai. Users can even test it via the company's chatbot to see how it works in the real world.
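Benchmarks like HumanEval score a model by executing its generated code against unit tests and counting the share of problems that pass. A toy harness illustrating the idea (this is not the official HumanEval harness; the function names and sample completions are invented):

```python
def passes_tests(candidate_src: str, entry_point: str, cases) -> bool:
    """Execute model-generated code and check it against (args, expected) cases."""
    namespace: dict = {}
    try:
        exec(candidate_src, namespace)   # run the model's code
        fn = namespace[entry_point]
        return all(fn(*args) == expected for args, expected in cases)
    except Exception:
        return False                     # crashes and bad code count as failures

# Hypothetical model completions for an "add two numbers" problem:
good = "def add(a, b):\n    return a + b"
bad = "def add(a, b):\n    return a - b"
cases = [((1, 2), 3), ((0, 0), 0)]

print(passes_tests(good, "add", cases))  # True
print(passes_tests(bad, "add", cases))   # False
```

The real benchmark does this across hundreds of problems in a sandbox and reports the pass rate, which is the number behind headline claims like "outperformed Claude 3.5 Sonnet on HumanEval."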
[4]
Mistral promises GPT-4-class perf from a not-so-large model
Mistral AI on Wednesday revealed a 123-billion-parameter large language model (LLM) called Mistral Large 2 (ML2) which, it claims, comes within spitting distance of the top models from OpenAI, Anthropic, and Meta. The news comes a day after Meta launched the hotly anticipated 405-billion-parameter variant of Llama 3 with a 128,000 token context window - think of this as the model's short-term memory - and support for eight languages. ML2 boasts many of these same qualities - including the 128,000 token context window, support for "dozens" of languages, and more than 80 coding languages. Language support has been one of Mistral's biggest differentiators compared to other open models - which are often English-only - and ML2 continues this trend. If Mistral's benchmarks are to be believed, ML2 trades blows with OpenAI's GPT-4o, Anthropic's Claude 3.5 Sonnet, Meta's Llama 3.1 405B, and others across a number of language, coding, and mathematics tests. For instance, in the popular Massive Multitask Language Understanding (MMLU) benchmark, the French model builder's latest LLM achieves a score of 84 percent. By comparison, just yesterday Meta revealed Llama 3.1 405B achieved a score of 88.6 percent while GPT-4o and Claude 3.5 Sonnet manage scores of 88.7 and 88.3 percent, respectively. Scientists estimate that domain experts - the human kind - would score in the neighborhood of 89.8 percent on the bench. While impressive in its own right, the more important factor is that ML2 manages to achieve this level of performance using a fraction of the resources of competing models. ML2 is less than a third the size of Meta's biggest model and roughly one fourteenth the magnitude of GPT-4. This has major implications for deployment, and will no doubt make ML2 a very attractive model for commercial applications. At the full 16-bit precision at which it was trained, the 123-billion-parameter model requires about 246GB of memory. 
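The 246GB figure falls straight out of the parameter count: at 16-bit precision each parameter occupies two bytes. A quick check (weights only; the KV cache and activations add more in practice):

```python
def weight_memory_gb(params_billion: float, bits: int) -> float:
    """Memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * (bits / 8) / 1e9

print(weight_memory_gb(123, 16))  # 246.0 GB (fp16, as trained)
print(weight_memory_gb(123, 8))   # 123.0 GB (8-bit quantized)
print(weight_memory_gb(123, 4))   # 61.5 GB  (4-bit quantized)
```

The same arithmetic explains why quantization matters for deployment: halving the bit width halves the memory footprint, which is how a 123-billion-parameter model can be squeezed onto fewer accelerators.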
For now, that's still too large to fit on a single GPU or accelerator from Nvidia, AMD, or Intel - but it could easily be deployed on a single server with four or eight GPUs without resorting to quantization. The same can't necessarily be said of GPT-4, presumably Claude 3.5 Sonnet, or Meta's Llama 3.1 405B. In fact, as we discussed earlier this week, Meta opted to provide an 8-bit quantized version of the 3.1 model so it could run on existing HGX A100 and H100 systems. You can learn more about quantization in our hands-on guide here - in a nutshell, it's a compression method that trades model precision for memory and bandwidth savings. But, as Mistral is keen to point out, ML2's smaller footprint also means it can achieve much higher throughput. This is because LLM performance, often measured in tokens per second, is dictated in large part by memory bandwidth. In general, for any given system, smaller models will produce responses to queries faster than larger ones, because they put less pressure on the memory subsystem. If you happen to have a beefy enough system, you can try Mistral Large 2 for yourself by following our guide to running LLMs at home. In its launch announcement, Mistral highlighted the model builder's efforts to combat hallucinations - where the model generates convincing but factually inaccurate information. This included fine-tuning the model to be more "cautious and discerning" about how it responds to requests. Mistral also explained the model was trained to recognize when it doesn't know something, or if it has insufficient information to answer - there's perhaps a lesson in that for all of us. Mistral also contends that ML2 should be much better than past models at following complex instructions, especially in longer conversations. This is good news, as one of the main ways in which people interact with LLMs is through prompts that dictate how the model should respond or behave in plain language. 
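The bandwidth point can be made concrete with a back-of-the-envelope model: at batch size one, generating each token requires streaming every weight through the memory subsystem once, so token throughput is roughly bounded by bandwidth divided by model size. A sketch with illustrative numbers (the 3,350 GB/s figure is an assumption in the ballpark of a current high-end accelerator, not a measured result):

```python
def bandwidth_bound_tokens_per_sec(mem_bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on single-stream decode speed: each token reads all weights once."""
    return mem_bandwidth_gb_s / weights_gb

# ML2's weights at fp16 (~246 GB) vs. an 8-bit quantized copy (~123 GB),
# on hardware with an assumed 3,350 GB/s of aggregate memory bandwidth:
print(bandwidth_bound_tokens_per_sec(3350, 246))  # ~13.6 tokens/s
print(bandwidth_bound_tokens_per_sec(3350, 123))  # ~27.2 tokens/s
```

This ignores batching, compute limits and interconnect overheads, but it captures why a smaller model on the same hardware answers faster: less data has to move per token.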
You can find an example of that in our recent AI containerization guide, in which we coax Microsoft's Phi 3 Mini into acting like a TV weather personality. Additionally, Mistral claims ML2 has been optimized to generate succinct responses wherever possible. While it notes that long-form responses can result in higher scores in some benchmarks, they aren't always desirable in business contexts - they tend to tie up the compute for longer, resulting in higher operational costs. While ML2 is open - in the sense it's freely available on popular repositories like Hugging Face - the model's license is more restrictive than many of Mistral's past models. For instance, the recently released Mistral-NeMo-12B model, which was developed in collaboration with Nvidia, bore an open source Apache 2 license. ML2 on the other hand bears the far less permissive Mistral Research License [Markdown], which allows for use in non-commercial and research capacities, but requires a separate commercial license if you want to put it to work in a business setting. Considering the amount of computational horse power required to train, fine tune, and validate larger models, this isn't all that surprising. It also isn't the first time we've seen model builders give away smaller models under common open source licenses only to restrict their larger ones. Alibaba's Qwen2 model, for instance, is licensed under Apache 2 with the exception of the 72B variant, which used its own Qianwen license. ®
[5]
Mistral AI Unveils Mistral Large 2, Beats Llama 3.1 on Code and Math
A day after Meta released Llama 3.1, Mistral AI has announced Mistral Large 2, the latest generation of its flagship model, offering substantial improvements in code generation, mathematics, and multilingual support. The model introduces advanced function-calling capabilities and is available on la Plateforme. With a 128k context window and support for dozens of languages, including French, German, Spanish, and Chinese, Mistral Large 2 aims to cater to diverse linguistic needs. It also supports 80+ coding languages, such as Python, Java, and C++. The model is designed for single-node inference and long-context applications, boasting 123 billion parameters. Mistral Large 2 is released under the Mistral Research License for research and non-commercial use. It achieves 84.0% accuracy on the MMLU benchmark, setting a new standard for performance and cost efficiency in open models. In code generation and reasoning, it competes with leading models like GPT-4o and Llama 3.1. The model's training focused on reducing hallucinations and ensuring accurate outputs, significantly enhancing its reasoning and problem-solving skills. Mistral Large 2 is trained to acknowledge its limitations in providing solutions, reflecting its commitment to accuracy. Improvements in instruction-following and conversational capabilities are evident, with the model excelling in benchmarks such as MT-Bench, Wild Bench, and Arena Hard. Mistral AI emphasizes concise responses, vital for business applications. Mistral Large 2's multilingual proficiency includes languages like Russian, Japanese, and Arabic, performing strongly on the multilingual MMLU benchmark. It also features enhanced function calling skills, making it suitable for complex business applications. Users can access Mistral Large 2 via la Plateforme under the name mistral-large-2407. Mistral AI is consolidating its offerings, including general-purpose models Mistral Nemo and Mistral Large, and specialist models Codestral and Embed.
Fine-tuning capabilities are now extended to these models. The model is available through partnerships with Google Cloud Platform, Azure AI Studio, Amazon Bedrock, and IBM watsonx.ai. This expansion aims to bring Mistral AI's advanced models to a global audience, enhancing accessibility and application development. Mistral Large 2 is the fourth model from the company in the past week, following the release of MathΣtral, a specialized 7B model designed for advanced mathematical reasoning and scientific exploration. The company also released Codestral Mamba 7B, based on the advanced Mamba 2 architecture, which is trained with a context length of 256k tokens and built for code generation tasks for developers worldwide. Additionally, Mistral AI introduced Mistral NeMo, a 12-billion parameter model with a 128k token context length, developed in partnership with NVIDIA.
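Function calling means the model can emit a structured request to invoke a developer-supplied tool rather than plain text. The tool is typically described to the model as a JSON schema; below is a hedged sketch in the widely used chat-completions style (the tool name and fields are invented for illustration, and Mistral's exact format may differ from this convention):

```python
# Hypothetical tool definition for a function-calling request.
# "get_invoice_status" and its fields are illustrative, not a real API;
# the schema layout follows the common chat-completions convention.
invoice_tool = {
    "type": "function",
    "function": {
        "name": "get_invoice_status",
        "description": "Look up the payment status of an invoice by its ID.",
        "parameters": {
            "type": "object",
            "properties": {
                "invoice_id": {
                    "type": "string",
                    "description": "Internal invoice identifier",
                }
            },
            "required": ["invoice_id"],
        },
    },
}

# The model would respond with a structured tool call such as
#   {"name": "get_invoice_status", "arguments": {"invoice_id": "INV-1042"}}
# which the application executes before feeding the result back to the model.
print(invoice_tool["function"]["name"])
```

This request/execute/return loop is what makes function calling useful for the "complex business applications" Mistral highlights: the model orchestrates tools, while the application keeps control of what actually runs.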
[6]
Microsoft-backed Mistral AI unveils largest version of Mistral Large 2 model
Mistral AI, the generative artificial intelligence startup backed by Microsoft (NASDAQ:MSFT) and others, announced its newest large language model on Wednesday, Mistral Large 2. The new model is open source, though that is only for non-commercial uses. "For commercial usage of Mistral Large 2 requiring self-deployment, a Mistral Commercial License must be acquired," the company said in a blog post. Mistral Large 2 has 123B parameters and can be accessed on all the major cloud service providers, including Amazon Web Services (AMZN), Google (GOOGL) (GOOG) Cloud and Microsoft Azure. The announcement comes just after Meta Platforms (META) unveiled the newest version of its open model, Llama 3.1 405B. Mistral AI recently raised $643M in new funding, elevating its valuation to more than $6B. Microsoft took a minority stake in Mistral AI in February, a move that had been scrutinized amongst global regulators. However, the U.K.'s antitrust agency said in May that the partnership did not qualify for an investigation. The European Commission, the EU's anti-competition agency, is still analyzing the deal, however. In May, Mistral announced the release of Codestral, an AI model designed for code generation tasks.
Mistral AI, a French startup, has released Large 2, an open-weight AI model that rivals offerings from tech giants like OpenAI, Meta, and Anthropic. The model demonstrates strong performance on coding and mathematics tasks, potentially reshaping the AI landscape.
Mistral AI, a French artificial intelligence startup, has made waves in the tech world with the release of its latest AI model, Large 2. The open-weight model is being hailed as a significant advancement in the field, rivaling the capabilities of models from industry giants such as OpenAI, Meta, and Anthropic 1.
Large 2 has demonstrated remarkable performance across various benchmarks. Notably, it beats Meta's Llama 3.1 405B on code generation and math while using less than a third of the parameters, and sits just behind OpenAI's GPT-4o on the HumanEval coding benchmark 5. This level of capability puts Mistral AI's offering in direct competition with some of the most advanced AI models currently available.
In a move that sets it apart from fully closed competitors, Mistral AI has released Large 2 with open weights under its Mistral Research License. This allows researchers and developers to freely download and fine-tune the model for non-commercial use, potentially accelerating innovation in the field, though commercial deployment requires a separate paid license 2.
The model contains 123 billion parameters and is designed for single-node inference with long-context applications in mind. The training dataset and methodology used to create it have not been made public, maintaining an air of mystery around its development process 3.
The release of Large 2 has sent shockwaves through the AI community, challenging the dominance of established players. Its impressive capabilities in coding and mathematics tasks suggest that smaller, more agile companies can compete with tech giants in developing cutting-edge AI technologies 4.
As with any advanced AI model, questions about ethical use and potential misuse arise. Mistral AI says Large 2 was trained to be more cautious and discerning, acknowledging when it lacks sufficient information rather than fabricating plausible-sounding answers. However, because the model's weights are openly available, downstream users remain free to fine-tune its behavior 2.
In addition to the research release, Mistral AI is offering commercial access to Large 2 through its API on la Plateforme under the name mistral-large-2407, with free testing available on its chatbot, le Chat. The model is also available via Google Vertex AI, Amazon Bedrock, Azure AI Studio, and IBM watsonx.ai 5.