Curated by THEOUTPOST
On Wed, 9 Apr, 8:02 AM UTC
2 Sources
[1]
Deep Cogito emerges from stealth with hybrid AI 'reasoning' models | TechCrunch
A new company, Deep Cogito, has emerged from stealth with a family of openly available AI models that can be switched between "reasoning" and non-reasoning modes.

Reasoning models like OpenAI's o1 have shown great promise in domains like math and physics, thanks to their ability to effectively fact-check themselves by working through complex problems step by step. This reasoning comes at a cost, however: higher compute and latency. That's why labs like Anthropic are pursuing "hybrid" model architectures that combine reasoning components with standard, non-reasoning elements. Hybrid models can quickly answer simple questions while spending additional time considering more challenging queries.

All of Deep Cogito's models, called Cogito 1, are hybrid models. Cogito claims that they outperform the best open models of the same size, including models from Meta and Chinese AI startup DeepSeek. "Each model can answer directly [...] or self-reflect before answering (like reasoning models)," the company explained in a blog post. "[All] were developed by a small team in approximately 75 days."

The Cogito 1 models range from 3 billion to 70 billion parameters, and Cogito says that models of up to 671 billion parameters will join them in the coming weeks and months. Parameters roughly correspond to a model's problem-solving skills, with more parameters generally being better.

Cogito 1 wasn't developed from scratch, to be clear. Deep Cogito built on top of Meta's open Llama and Alibaba's Qwen models to create its own. The company says that it applied novel training approaches to boost the base models' performance and enable toggleable reasoning.

According to the results of Cogito's internal benchmarking, the largest Cogito 1 model, Cogito 70B, outperforms DeepSeek's R1 reasoning model on a few mathematics and language evaluations when reasoning is enabled. With reasoning disabled, Cogito 70B also eclipses Meta's recently released Llama 4 Scout model on LiveBench, a general-purpose AI test. Every Cogito 1 model is available for download or for use via APIs on cloud providers Fireworks AI and Together AI.

"Currently, we're still in the early stages of [our] scaling curve, having used only a fraction of compute typically reserved for traditional large language model post/continued training," wrote Cogito in its blog post. "Moving forward, we're investigating complementary post-training approaches for self-improvement."

According to filings with the state of California, San Francisco-based Deep Cogito was founded in June 2024. The company's LinkedIn page lists two co-founders, Drishan Arora and Dhruv Malhotra. Malhotra was previously a product manager at Google AI lab DeepMind, where he worked on generative search technology. Arora was a senior software engineer at Google.

Deep Cogito, whose backers include South Park Commons, according to PitchBook, ambitiously aims to build "general superintelligence." The company's founders understand the phrase to mean AI that can perform tasks better than most humans and "uncover entirely new capabilities we have yet to imagine."
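Since the models are served through hosted APIs, the mode toggle can be exercised from ordinary client code. The sketch below is a minimal illustration assuming an OpenAI-compatible chat completions endpoint (which Together AI exposes); the model identifier and the system-prompt mechanism for enabling reasoning are hypothetical placeholders, not confirmed details from the article.

```python
# Minimal sketch: toggling a hybrid model's reasoning mode over an OpenAI-compatible API.
# Assumptions: the model id string and the system-prompt toggle are hypothetical placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.together.xyz/v1",  # Together AI's OpenAI-compatible endpoint
    api_key="YOUR_API_KEY",
)

def ask(question: str, reasoning: bool = False) -> str:
    messages = []
    if reasoning:
        # Hypothetical toggle: a system prompt that switches the model into its
        # self-reflective "reasoning" mode before it answers.
        messages.append({"role": "system", "content": "Enable deep thinking."})
    messages.append({"role": "user", "content": question})
    response = client.chat.completions.create(
        model="deepcogito/cogito-70b",  # hypothetical model identifier; check the provider's catalog
        messages=messages,
    )
    return response.choices[0].message.content

print(ask("What is 17 * 24?"))                                    # fast, direct answer
print(ask("Prove that sqrt(2) is irrational.", reasoning=True))   # slower, step-by-step answer
```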
[2]
New open source AI company Deep Cogito releases first models and they're already topping the charts
Deep Cogito, a new AI research startup based in San Francisco, officially emerged from stealth today with Cogito v1, a new line of open source large language models (LLMs) fine-tuned from Meta's Llama 3.2 and equipped with hybrid reasoning capabilities -- the ability to answer quickly and immediately, or to "self-reflect" like OpenAI's "o" series and DeepSeek R1.

The company aims to push the boundaries of AI beyond current human-overseer limitations by enabling models to iteratively refine and internalize their own improved reasoning strategies. It is ultimately on a quest toward developing superintelligence -- AI smarter than all humans in all domains -- yet the company says that "All models we create will be open sourced."

Deep Cogito's CEO and co-founder Drishan Arora -- a former senior software engineer at Google who says he led large language model (LLM) modeling for Google's generative search product -- also said in a post on X that they are "the strongest open models at their scale - including those from LLaMA, DeepSeek, and Qwen."

The initial model lineup includes five base sizes: 3 billion, 8 billion, 14 billion, 32 billion, and 70 billion parameters, available now on the AI code sharing community Hugging Face, on Ollama, and through application programming interfaces (APIs) on Fireworks AI and Together AI. They are released under the Llama licensing terms, which allow commercial usage -- so third-party enterprises could put them to work in paid products -- up to 700 million monthly users, at which point a paid license from Meta is required. The company plans to release even larger models -- up to 671 billion parameters -- in the coming months.

Arora describes the company's training approach, iterated distillation and amplification (IDA), as a novel alternative to traditional reinforcement learning from human feedback (RLHF) or teacher-model distillation. The core idea behind IDA is to allocate more compute for a model to generate improved solutions, then distill the improved reasoning process into the model's own parameters -- effectively creating a feedback loop for capability growth. Arora likens this approach to Google AlphaGo's self-play strategy, applied to natural language.

Each model supports both a standard mode for direct answers and a reasoning mode, in which the model reflects internally before responding.

Benchmarks and evaluations

The company shared a broad set of evaluation results comparing Cogito models to open-source peers across general knowledge, mathematical reasoning, and multilingual tasks. Cogito models generally show their highest performance in reasoning mode, though some trade-offs emerge, particularly in mathematics. For instance, while Cogito 70B (Standard) matches or slightly exceeds peers on MATH and GSM8K, Cogito 70B (Reasoning) trails DeepSeek R1 on MATH by over five percentage points (83.3% vs. 89.0%).

Tool calling built-in

In addition to general benchmarks, Deep Cogito evaluated its models on native tool-calling performance, a growing priority for agents and API-integrated systems. The improvements it reports are attributed not only to model architecture and training data, but also to task-specific post-training, which many baseline models currently lack.
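As a rough illustration of what native tool calling looks like in practice, the sketch below assumes the OpenAI-compatible chat completions interface that Fireworks AI exposes; the model identifier and the example tool schema are hypothetical and only show the shape of a tool-calling request.

```python
# Minimal tool-calling sketch against an OpenAI-compatible endpoint.
# The model id below is a hypothetical placeholder; swap in the provider's actual Cogito model name.
import json
from openai import OpenAI

client = OpenAI(base_url="https://api.fireworks.ai/inference/v1", api_key="YOUR_API_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # example tool; define whatever functions your agent needs
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="accounts/fireworks/models/cogito-70b",  # hypothetical identifier
    messages=[{"role": "user", "content": "What's the weather in Tokyo right now?"}],
    tools=tools,
)

# A tool-capable model should return a structured tool call rather than prose.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```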
Looking ahead

Deep Cogito plans to release larger-scale models in the coming months, including mixture-of-experts variants at the 109B, 400B, and 671B parameter scales. The company will also continue updating its current model checkpoints with extended training. It positions its IDA methodology as a long-term path toward scalable self-improvement, removing dependence on human oversight or static teacher models. Arora emphasizes that while performance benchmarks are important, real-world utility and adaptability are the true tests for these models, and that the company is just at the beginning of what it believes is a steep scaling curve. Deep Cogito's research and infrastructure partnerships include teams from Hugging Face, RunPod, Fireworks AI, Together AI, and Ollama. All released models are open source and available now.
Deep Cogito, a new AI research startup, has unveiled a series of open-source large language models with hybrid reasoning capabilities, aiming to push the boundaries of AI development towards superintelligence.
Deep Cogito, a San Francisco-based AI research startup, has emerged from stealth mode with the release of Cogito v1, a new family of open-source large language models (LLMs) that feature innovative hybrid reasoning capabilities [1]. These models, which can switch between "reasoning" and non-reasoning modes, represent a significant advancement in AI technology and have already shown impressive performance on various benchmarks.
The Cogito v1 models are built on a hybrid architecture that combines reasoning components with standard, non-reasoning elements. This approach allows the models to quickly answer simple questions while dedicating additional time to more complex queries that require deeper consideration [1]. The ability to toggle between these modes offers a unique flexibility in AI applications, potentially improving efficiency and accuracy across various tasks.
Deep Cogito has released five base sizes of the Cogito v1 models, ranging from 3 billion to 70 billion parameters [2]. These models are available for download via Hugging Face and Ollama, and can also be accessed through APIs provided by Fireworks AI and Together AI. The company plans to release even larger models, up to 671 billion parameters, in the coming months [2].
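For local experimentation, a minimal sketch using Hugging Face transformers might look like the following. The repository name and the system-prompt mechanism for enabling reasoning mode are assumptions made for illustration; the model cards on Hugging Face document the exact identifiers and prompt format.

```python
# Minimal local-inference sketch with Hugging Face transformers.
# The repo id and the reasoning-mode system prompt are hypothetical placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "deepcogito/cogito-v1-preview-llama-3B"  # assumed repo name; verify on Hugging Face
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

def generate(question: str, reasoning: bool = False) -> str:
    messages = []
    if reasoning:
        # Assumed toggle: a system prompt that puts the model in its self-reflective mode.
        messages.append({"role": "system", "content": "Enable deep thinking."})
    messages.append({"role": "user", "content": question})
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=512)
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

print(generate("Name the capital of Australia."))                      # direct answer
print(generate("Show that the sum of two odd numbers is even.", reasoning=True))  # reflective answer
```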
According to Deep Cogito's internal benchmarking, the largest model, Cogito 70B, outperforms other open models of similar size, including those from Meta and DeepSeek, on several mathematics and language evaluations [1]. The company claims that its models are "the strongest open models at their scale," surpassing offerings from LLaMA, DeepSeek, and Qwen [2].
Deep Cogito employs a unique training methodology called iterated distillation and amplification (IDA). This approach, described as an alternative to traditional reinforcement learning from human feedback (RLHF), aims to create a feedback loop for capability growth by allowing the model to generate improved solutions and then distill the enhanced reasoning process into its own parameters [2].
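The sources describe IDA only at a high level, but the amplify-then-distill loop they outline can be sketched in Python. Everything below, including the helper functions and their behavior, is a hypothetical stand-in used to illustrate the idea; it is not Deep Cogito's actual training code.

```python
# Conceptual sketch of iterated distillation and amplification (IDA).
# Every name here is a hypothetical stand-in for illustration only.
import random

def generate(model, prompt, reasoning=False):
    # Stand-in for model inference; a real system would call the LLM here.
    return f"answer to {prompt!r} (reasoning={reasoning}, checkpoint v{model})"

def score(prompt, answer):
    # Stand-in for a verifier or reward model that rates candidate answers.
    return random.random()

def finetune(model, pairs):
    # Stand-in for supervised fine-tuning on (prompt, improved answer) pairs.
    return model + 1  # pretend each round yields an improved checkpoint

def amplify(model, prompt, num_samples=16):
    """Amplification: spend extra compute to find a better answer than a single pass gives."""
    candidates = [generate(model, prompt, reasoning=True) for _ in range(num_samples)]
    return max(candidates, key=lambda ans: score(prompt, ans))

def ida(model, prompts, iterations=3):
    """Iterate: amplify answers, then distill them back into the model's parameters."""
    for _ in range(iterations):
        pairs = [(p, amplify(model, p)) for p in prompts]
        model = finetune(model, pairs)  # distillation step
    return model

print(ida(model=0, prompts=["What is 2**10?", "Summarize the Llama license terms."]))
```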
Founded in June 2024, Deep Cogito is led by co-founders Drishan Arora and Dhruv Malhotra, both of whom have backgrounds in prominent AI research institutions [1]. The company's ambitious goal is to build "general superintelligence," which they define as AI capable of performing tasks better than most humans and uncovering entirely new capabilities [1].
Deep Cogito has committed to open-sourcing all of its models, making them available under the Llama licensing terms for commercial usage [2]. This approach aligns with the company's vision of pushing AI development forward collaboratively. As they continue to refine and expand their model lineup, Deep Cogito aims to remove dependence on human or static teacher models, paving the way for scalable self-improvement in AI systems [2].
OpenAI launches o3 and o4-mini, new AI reasoning models with enhanced capabilities in math, coding, science, and visual understanding. These models can integrate images into their reasoning process and use ChatGPT tools independently.
14 Sources
DeepSeek's open-source R1 model challenges OpenAI's o1 with comparable performance at a fraction of the cost, potentially revolutionizing AI accessibility and development.
6 Sources
DeepSeek, a Chinese AI company, has launched R1-Lite-Preview, an open-source reasoning model that reportedly outperforms OpenAI's o1 preview in key benchmarks. The model showcases advanced reasoning capabilities and transparency in problem-solving.
11 Sources
Chinese AI startup DeepSeek releases a major upgrade to its V3 language model, showcasing improved performance and efficiency. The open-source model challenges industry leaders with its ability to run on consumer hardware.
16 Sources
Alibaba releases QwQ-32B-Preview, an open-source AI model that rivals OpenAI's o1 in reasoning capabilities. The model outperforms o1 on specific benchmarks and is available for commercial use.
5 Sources