2 Sources
[1]
Mistral OCR 4: cheap, self-hosted document AI
Mistral OCR 4 reads a document like a structured map, not a wall of text. It is cheap, speaks 170 languages, and can run entirely on your own servers. Europe's AI champion is going after the enterprise back office.

Mistral has a new model, and it is not a chatbot. The French company on 23 June released Mistral OCR 4, a system that turns documents into structured data, it said in a blog post. The model stays small and focused, chasing one huge target: the world's paperwork.

Optical character recognition has been around for decades. The pitch here is what the model returns. Older systems convert a page into clean text. OCR 4 hands back a map of the page, with each block labelled and located. Independent annotators preferred it to every rival system tried, Mistral said, with an average win rate of 72%.

From page to structured map

OCR 4 does three new things at once. It draws bounding boxes around every element, so software knows exactly where each line sits. It classifies each block by type, marking titles, tables, equations and even signatures. And it adds a confidence score, per page and per word, so a human knows which parts to double-check.

Customers asked for bounding boxes more than any other feature, Mistral said. They let an app point to the exact source of an answer. Paired with block types and confidence scores, they enable citations, redactions and human review. The output also arrives as clean markdown.

The shift matters because of what comes next. A chatbot can summarise a contract. An agent has to file it. For that, software needs to tell a signature from a sub-total, and know where each one sits. OCR 4 supplies that scaffolding, where older tools handed back a flat block of words. It marks a clear break from the last version.
OCR 3 focused on turning a page into clean text and tidy tables. OCR 4 returns the whole structure instead. Each block carries a location, a type and a score. Downstream systems then learn not just what a document says, but how it is built.

Built for the back office

OCR 4 targets enterprise drudgery. It feeds retrieval systems, the "RAG" pipelines that let chatbots answer from a company's own files. It also gives AI agents the structure they need to act, not just read. That means filling forms, processing invoices and running compliance checks.

Its reach runs wide. The model handles PDFs, Word, PowerPoint and OpenDocument files, and reads 170 languages across 10 groups. Mistral says it holds up on low-resource languages where rivals fall away. Early users are digitising archives, turning invoices into fields and pulling clean text from scientific reports.

OCR 4 also plugs into Mistral's new Search Toolkit, an open-source framework the firm unveiled at its AI Now Summit. The model's structured output feeds straight into that pipeline. The aim is to hand developers citation-ready inputs, so an answer can point back to the page it came from.

The speed claims form part of the sell. Anaqua, which manages intellectual-property filings, said the model runs about four times faster per page than its previous tool. For high-volume docketing, where deadlines are unforgiving, that pace decides whether a workflow scales.

It slots into Mistral's push beyond chatbots. The company already sells industrial AI to Airbus, BMW and EDF, and document work is the same enterprise bet by another name.

The sovereignty pitch

The headline feature for European buyers is where the model runs. OCR 4 is small enough to fit in a single container. So a company can host it on its own infrastructure and keep sensitive documents in-house. That lands on Mistral's core message.
The firm sells itself as Europe's sovereign alternative to American AI, and self-hosting answers the data-residency worries that come with Europe's tightening sovereignty rules. For banks, hospitals and governments, keeping the paperwork on home soil is the point.

Cheap, and nearly everywhere

The price looks aggressive. The API costs $4 per 1,000 pages, halving to $2 in batch mode. A higher-level Document AI product, which reshapes output into custom fields, runs $5 per 1,000 pages. One customer, financial-research firm Rogo, claimed similar accuracy to its old provider at roughly eight times lower cost.

Distribution runs broad too. OCR 4 is live through Mistral's own studio, Amazon SageMaker and Microsoft's Foundry, with Snowflake support coming. Mistral, now valued near €20bn in fresh funding talks, is making sure its tools sit inside the clouds its customers already use.

Microsoft called the launch a milestone in its partnership with Mistral. That endorsement carries weight. It routes the model toward the enterprise buyers who already sit inside Microsoft's cloud, and gives Mistral a distribution channel it could never build alone.

The strategy stays consistent. Over the past year, Mistral has wired itself into enterprise software rather than chasing consumer hype. A cheap, self-hostable document reader fits that plan neatly, because it pulls customers into the rest of its stack.

The case for caution

The benchmarks deserve a careful read. Mistral tops the public OlmOCRBench (85.20) and its own multilingual test. But the company calls those scores "directional". It admits the benchmarks misjudge maths and multi-column text, and that it reproduced every competitor figure itself. The 72% win rate looks firmer, because humans judged real documents.

There are limits on use, too. Mistral is blunt that OCR 4 reads documents; it does not decide on them. It says the model is not for medical diagnosis, legal judgment or high-stakes finance.
It extracts the words; a human still makes the call.

The market looks crowded as well. Google, AWS and a wave of startups all sell document AI. Mistral's edge comes from the combination: structured output, low cost and a version you can run yourself. Whether that wins the back office, against far bigger clouds, remains the open question.

For now, Europe's AI champion has decided the boring documents are worth fighting for.
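
To put the prices quoted in this article in concrete terms, the short Python sketch below estimates the bill for a given page count. It assumes strictly linear pricing, with no minimum charge, free tier or volume discount, none of which the article confirms. The tier names (`ocr_api`, `ocr_batch`, `document_ai`) are labels invented for the example, not Mistral identifiers.

```python
# Prices in USD per 1,000 pages, as quoted in the article.
# The tier names are labels invented for this sketch.
PRICE_PER_1000_PAGES = {
    "ocr_api": 4.0,      # standard API
    "ocr_batch": 2.0,    # batch mode
    "document_ai": 5.0,  # higher-level Document AI product
}


def estimate_cost(pages: int, tier: str) -> float:
    """Return the estimated cost in USD for processing `pages` pages.

    Assumes strictly linear pricing: no minimum charge, free tier
    or volume discount.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages / 1000 * PRICE_PER_1000_PAGES[tier]


if __name__ == "__main__":
    for tier in PRICE_PER_1000_PAGES:
        print(f"{tier}: ${estimate_cost(250_000, tier):,.2f}")
```

On these assumptions, a 250,000-page archive would cost $1,000 through the standard API, $500 in batch mode and $1,250 through Document AI.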
[2]
Mistral OCR 4 with structured document extraction, 170 languages and self-hosting launched
Mistral AI has introduced Mistral OCR 4, a new optical character recognition (OCR) model designed for enterprise document understanding. Unlike previous versions that mainly converted documents into text and tables, OCR 4 produces structured document outputs with bounding boxes, block classification, and confidence scores for every page and word. The small, focused model supports 170 languages across 10 language groups, accepts multiple enterprise document formats, runs inside a single container for fully self-hosted deployments, and is designed as an ingestion component for enterprise search, Retrieval-Augmented Generation (RAG), and domain-specific retrieval pipelines.

What's new in Mistral OCR 4

OCR 4 extracts both document content and its structure. Along with recognized text, it returns:

* Bounding boxes for every detected element
* Block classification
* Page- and word-level confidence scores
* Markdown-formatted structured output

This structured output supports semantic chunking for RAG, enterprise search, document ingestion and indexing pipelines, domain-specific retrieval pipelines, AI agents for form filling, invoice processing and compliance checks, and custom document connectors.

Enterprise capabilities

OCR 4 accepts common enterprise document formats, including PDF, DOC, PPT, and OpenDocument, and supports 170 languages across 10 language groups, including specialized and low-resource languages. The compact model runs inside a single container, making it suitable for cost-sensitive and high-volume deployments. It can also be fully self-hosted, allowing organizations with strict data-sovereignty, privacy, or compliance requirements to keep document data entirely within their own infrastructure.

On Mistral's internal Crawl Multilingual evaluation, OCR 4 led across all eight language groups.
According to Mistral, OCR 4 showed its widest performance advantage in specialized and low-resource languages, where many competing OCR systems typically lose accuracy.

Benchmark performance

Mistral AI evaluated OCR 4 against AI-native OCR models, frontier general-purpose models, enterprise document services, and its previous OCR 3 model. In a blind human preference evaluation involving more than 600 real-world documents across 12+ languages, sourced from third-party vendors and reviewed by independent annotators, OCR 4 was preferred over competing systems in the majority of documents tested.

According to the company, OCR 4 also achieved:

* 85.20 on OlmOCRBench
* 93.07 on OmniDocBench
* 0.98 on Mistral's internal Crawl Multilingual evaluation

Mistral says benchmark scores should be treated as directional because issues such as incorrect ground-truth annotations, equivalent mathematical notation, equation segmentation, multi-column reading order, and header/footer attribution can penalize otherwise correct outputs. The company recommends evaluating OCR 4 using an organization's own documents.

OCR 4 API vs Document AI

Developers can integrate OCR 4 directly through the API, while business users can access the same OCR engine through Document AI in Mistral Studio as a no-code workflow.
Use OCR 4 when you need:

* Raw extracted text and markdown output
* Bounding boxes, block classification, and confidence scores
* High-volume document ingestion with Batch API support
* Self-hosted deployments for privacy, sovereignty, or compliance
* Direct integration into applications, AI agents, and enterprise data pipelines

Use Document AI when you need:

* Structured JSON output using custom schemas
* Image annotation with structured JSON
* Prompt-based document interpretation and summarization
* Domain-specific structured outputs
* Structured results without building downstream parsing logic

Every request returns OCR 4's extracted content, including text, markdown, bounding boxes, block types, and confidence scores. Document AI adds structured processing, custom prompts, schemas, or image annotations on top of the OCR output using the same API endpoint.

Recommended use cases

OCR 4 is designed for:

* Complex multilingual document parsing and extraction
* Enterprise search and Retrieval-Augmented Generation (RAG)
* AI agent workflows for form filling, invoice processing, and compliance checks
* Structured extraction, redaction, and verification pipelines using confidence scores

According to Mistral, early users are using OCR 4 to convert invoices into structured fields, digitize company archives, extract clean text from technical and scientific reports, and power enterprise search. OCR 4 output can also be integrated directly into retrieval pipelines using the company's Search Toolkit.

The company notes that OCR 4 is a document-understanding model and is not intended for medical diagnosis, legal advice or judgment, high-stakes financial decisions, safety-critical systems, real-time or latency-sensitive processing, or non-document inputs such as audio and video.

Pricing and availability

Mistral OCR 4 is available through the Mistral API, while Document AI, powered by OCR 4, is available in Mistral Studio for no-code document processing.
Both are also available via API through Amazon SageMaker and Microsoft Foundry, with support for Snowflake Parse Document coming soon. Organizations with strict privacy, sovereignty, or compliance requirements can deploy OCR 4 entirely within their own infrastructure using the self-hosted option.
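
The semantic chunking for RAG described in this article can be illustrated with a minimal Python sketch. It groups labelled blocks under the most recent title and keeps each block's page number and bounding box, so a retrieved chunk can point back to its source. The `Block` and `Chunk` classes, the type labels such as `"title"`, and the `(x0, y0, x1, y1)` box layout are all assumptions made for the example; they are not Mistral's actual response schema.

```python
from dataclasses import dataclass

# Assumed box layout for this sketch: (x0, y0, x1, y1) in page units.
BBox = tuple[float, float, float, float]


@dataclass
class Block:
    """One labelled region of a page (hypothetical shape, not Mistral's schema)."""
    page: int
    block_type: str  # assumed labels, e.g. "title", "text", "table", "signature"
    bbox: BBox
    text: str


@dataclass
class Chunk:
    """A retrieval chunk plus the page locations it was built from."""
    heading: str
    text: str
    sources: list[tuple[int, BBox]]


def chunk_by_title(blocks: list[Block]) -> list[Chunk]:
    """Start a new chunk at every title block; attach later blocks to it."""
    chunks: list[Chunk] = []
    current = None
    for block in blocks:
        is_title = block.block_type == "title"
        if is_title or current is None:
            current = Chunk(heading=block.text if is_title else "", text="", sources=[])
            chunks.append(current)
            if is_title:
                continue  # the title becomes the heading, not body text
        current.text = (current.text + "\n" + block.text).strip()
        current.sources.append((block.page, block.bbox))
    # Drop headings that never received any body text.
    return [chunk for chunk in chunks if chunk.text]
```

Because every chunk carries `(page, bbox)` pairs, a retrieval layer could highlight the cited region instead of returning an unanchored passage. A production pipeline would probably also cap chunk length and treat tables separately, which this sketch leaves out.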
Mistral AI launched Mistral OCR 4, an optical character recognition model that converts documents into structured data rather than plain text. Supporting 170 languages and deployable on private servers, the model targets enterprise document processing with bounding boxes, block classification, and confidence scores. At $4 per 1,000 pages, it positions itself as a sovereign alternative to U.S. AI tools for European organizations.
Mistral AI released Mistral OCR 4 on 23 June, marking a shift from chatbots to enterprise document understanding [1]. The French company's latest optical character recognition model does not just convert documents into text; it returns a structural map of each page, with element locations, block classifications, and confidence scores. According to Mistral, independent annotators preferred it to every rival system tested, with an average win rate of 72% [1].
Source: The Next Web
Unlike traditional OCR systems that output flat text, Mistral OCR 4 delivers bounding boxes around every element, block classification for titles, tables, equations, and signatures, plus page- and word-level confidence scores [2]. This structured approach lets AI agent workflows distinguish a signature from a subtotal and know exactly where each sits on the page, which is critical for invoice processing, compliance checks, and form filling [1].

The model runs inside a single container, allowing organizations to deploy self-hosted document AI entirely on their own infrastructure [2]. For European banks, hospitals, and governments navigating tightening sovereignty rules, keeping sensitive documents on home soil matters. Mistral AI positions itself as a sovereign alternative to U.S. AI tools, directly addressing data-residency worries that accompany cross-border data flows [1].

This compact architecture suits cost-sensitive and high-volume deployments. Anaqua, which manages intellectual-property filings, reported that the model runs approximately four times faster per page than its previous tool, a pace that determines whether workflows scale when deadlines are unforgiving [1].

Mistral OCR 4 handles PDF, Word, PowerPoint, and OpenDocument files across 170 languages spanning 10 language groups [1]. On Mistral's internal Crawl Multilingual evaluation, the model achieved a 0.98 score and led across all eight language groups tested [2]. The widest performance advantage appeared in specialized and low-resource languages, where competing systems typically lose accuracy [2].

The model also scored 85.20 on OlmOCRBench and 93.07 on OmniDocBench, though Mistral AI cautions that benchmark scores should be treated as directional due to issues like incorrect ground-truth annotations and multi-column reading order [2].
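
The page- and word-level confidence scores described above lend themselves to a simple review queue. The Python sketch below flags whole pages and individual words that fall under chosen thresholds, so that a human checks only the doubtful parts. The `Page` and `Word` classes, the 0-to-1 confidence scale and the threshold values are assumptions made for the example; the sources do not describe Mistral's actual response format.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    confidence: float  # assumed to lie between 0.0 and 1.0


@dataclass
class Page:
    number: int
    confidence: float  # assumed page-level score between 0.0 and 1.0
    words: list[Word]


def review_queue(
    pages: list[Page],
    page_threshold: float = 0.90,  # illustrative value, tune on your own documents
    word_threshold: float = 0.80,  # illustrative value, tune on your own documents
) -> tuple[list[int], list[tuple[int, str]]]:
    """Return (low-confidence page numbers, low-confidence (page, word) pairs)."""
    low_pages = [page.number for page in pages if page.confidence < page_threshold]
    low_words = [
        (page.number, word.text)
        for page in pages
        for word in page.words
        if word.confidence < word_threshold
    ]
    return low_pages, low_words
```

Keeping the two thresholds separate lets a team send an entire blurry scan to a reviewer while surfacing only the single doubtful amount on an otherwise clean invoice.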
The API costs $4 per 1,000 pages, dropping to $2 in batch mode [1]. Financial-research firm Rogo claimed similar accuracy to its previous provider at roughly eight times lower cost [1]. A higher-level Document AI product in Mistral Studio, which reshapes output into custom fields using schemas and prompts, runs $5 per 1,000 pages [1].

Developers needing raw markdown output, bounding boxes, and confidence scores can integrate the OCR 4 API directly. Business users seeking structured JSON output, image annotation, or domain-specific results without building parsing logic can access the same engine through Document AI as a no-code workflow [2].

Mistral OCR 4 plugs directly into the company's open-source Search Toolkit, unveiled at its AI Now Summit [1]. The structured output feeds Retrieval-Augmented Generation pipelines, enabling chatbots to cite exact page sources when answering from a company's own files. Early users are digitizing archives, converting invoices into structured fields, and extracting clean text from scientific reports [1].

The model is live through Mistral Studio, Amazon SageMaker, and Microsoft's Foundry, with Snowflake support coming [1]. Microsoft called the launch a milestone in its partnership with Mistral AI, routing the model toward enterprise buyers already inside its cloud [1]. Mistral AI, now valued near €20 billion in fresh funding talks, is ensuring its tools sit inside the clouds its customers already use [1].

Summarized by Navi