Curated by THEOUTPOST
On Mon, 5 Aug, 4:02 PM UTC
3 Sources
[1]
G42 launches JAIS 70B, 20 other AI models to champion Arabic Natural Language Processing
The latest JAIS large language model (LLM), JAIS 70B, was released today by Inception, a G42 company specialising in the development of advanced AI models and applications, all provided as a service. A 70 billion parameter model, JAIS 70B is built for developers of Arabic-based natural-language processing (NLP) solutions and promises to accelerate the integration of Generative AI services across various industries, enhancing capabilities in customer service, content creation, and data analysis. The company has also unveiled a comprehensive suite of JAIS foundation and fine-tuned models; 20 models, across 8 sizes, ranging from 590M to 70B parameters, and specifically fine-tuned for chat applications, trained on up to 1.6T tokens of Arabic, English, and code data. This extensive release now delivers a breadth of tools, including the first Arabic-centric model small enough to run on a laptop, delivering both small, compute-efficient models for targeted applications, and advanced model sizes for enterprise precision. Dr. Andrew Jackson, CEO of Inception, said, "AI is now a proven value-adding force, and large language models have been at the forefront of the AI adoption spike. JAIS was created to preserve Arabic heritage, culture, and language and to democratise access to AI. Releasing JAIS 70B and this new family of models reinforces our commitment to delivering the highest-quality AI foundation model for Arabic-speaking nations." Neha Sengupta, Principal Applied Scientist at Inception, stated, "For models up to 30 billion parameters, we successfully trained JAIS from scratch, consistently outperforming adapted models in the community. However, for models with 70 billion parameters and above, the computational complexity and environmental impact of training from scratch were significant."
[2]
G42 launches JAIS 70B and 20 other AI Models to Champion Arabic Natural Language Processing
Developed by Inception, a G42 company, this extensive suite of AI models sets a new benchmark in Arabic text processing and reasoning to empower the next generation of AI solutions.
20 models of different sizes, from 590M to 70B parameters, become available in the largest release of the JAIS family of AI tools.
ABU DHABI - The latest JAIS large language model (LLM), JAIS 70B, was released today by Inception, a G42 company specializing in the development of advanced AI models and applications, all provided as a service. A 70 billion parameter model, JAIS 70B is built for developers of Arabic-based natural-language processing (NLP) solutions and promises to accelerate the integration of Generative AI services across various industries, enhancing capabilities in areas such as customer service, content creation, and data analysis.
JAIS 70B delivers Arabic-English bilingual capabilities at an unprecedented size and scale for the open-source community. As a 70 billion parameter model, it has increased ability to handle complicated and nuanced tasks, as well as better capability to process complex datasets. JAIS 70B was developed using continuous training, a process of fine-tuning a pre-trained model, on 370 billion tokens of which 330 billion were Arabic tokens, the largest Arabic dataset ever used to train an open-source foundational model.
In this release, the company has also unveiled a comprehensive suite of JAIS foundation and fine-tuned models; 20 models, across 8 sizes, ranging from 590M to 70B parameters, and specifically fine-tuned for chat applications, trained on up to 1.6T tokens of Arabic, English, and code data. In response to feedback from the Arabic NLP community, this extensive release now delivers a breadth of tools, including the first Arabic-centric model small enough to run on a laptop, delivering both small, compute-efficient models for targeted applications, and advanced model sizes for enterprise precision.
This suite of JAIS models accommodates a wide range of use cases, and aims to accelerate innovation, development, and research opportunities for multiple downstream applications for the Arabic speaking and bilingual community. Dr. Andrew Jackson, CEO, Inception said: "AI is now a proven value-adding force, and large language models have been at the forefront of the AI adoption spike. JAIS was created to preserve Arabic heritage, culture, and language, and to democratize access to AI. Releasing JAIS 70B and this new family of models reinforces our commitment to delivering the highest quality AI foundation model for Arabic speaking nations. The training and adaptation techniques we are delivering successfully for Arabic models are extensible to other under-served languages and we are excited to be bringing this expertise to other countries." Inception released JAIS-13B and JAIS-13B-chat in August 2023 and subsequently launched the state-of-the-art Arabic-centric models, JAIS-30B and JAIS-30B-chat. JAIS 70B and JAIS 70B-chat have proven to be even more performant in benchmarking data in both English and Arabic compared to previous models. Neha Sengupta, Principal Applied Scientist, Inception said: "For models up to 30 billion parameters, we successfully trained JAIS from scratch consistently outperforming adapted models in the community. However, for models with 70 billion parameters and above, the computational complexity and environmental impact of training from scratch were significant. We made a choice to build JAIS 70B on the Llama2 model, allowing us to leverage the extensive knowledge base of an existing English model and develop a more efficient and sustainable solution." JAIS 70B retains, and in specific cases, exceeds, the high-quality English-language processing capabilities of Llama2, while vastly excelling on Arabic outputs versus the base model. 
The JAIS development team trained an expanded tokenizer based on the Llama2 tokenizer to enhance Arabic text processing efficiency, doubling the model's base vocabulary. According to Sengupta, the model "splits Arabic words less aggressively and makes training and inferencing cheaper" than the standard Llama2 model. Users can download the JAIS models and access the technical paper and benchmarking results by visiting the dedicated page on Hugging Face: https://huggingface.co/inceptionai/jais-adapted-70b.
About Inception
Inception, a G42 company, is an AI company committed to exploring new technological frontiers that make companies smarter, more efficient, and future-ready. Our mission is to build best-in-class AI models and solutions, leveraging the latest tools and techniques, all delivered as a service. At Inception, we are pushing the boundaries of AI-powered human evolution. To learn more, please visit www.inceptionai.ai, and follow us on LinkedIn, Instagram, YouTube and X.
Media inquiries: Ulviyya Hasanzade, Director of Corporate Communications at G42: ulviyya.hasanzade@g42.ai
[3]
G42's Inception launches JAIS 70B, 20 other models
Inception, a subsidiary of the G42 group specialising in advanced AI models and applications, has launched its latest large language model (LLM), JAIS 70B. This 70-billion parameter model is designed to enhance Arabic-based natural language processing (NLP) solutions, aiming to accelerate the integration of Generative AI services across industries such as customer service, content creation, and data analysis. JAIS 70B offers unparalleled Arabic-English bilingual capabilities, making it a significant milestone for the open-source community. The model boasts an enhanced capacity for handling complex tasks and processing intricate datasets. Developed through continuous training on 370 billion tokens - 330 billion of which are Arabic - JAIS 70B represents the largest Arabic dataset ever used for an open-source foundational model. In addition to JAIS 70B, Inception has introduced a comprehensive suite of JAIS foundation and fine-tuned models, encompassing 20 models across eight sizes, ranging from 590 million to 70 billion parameters. These models, fine-tuned for chat applications and trained on up to 1.6 trillion tokens of Arabic, English and code data, address feedback from the Arabic NLP community. The suite includes the first Arabic-centric model small enough to run on a laptop, offering both compute-efficient models for specific applications and advanced models for enterprise-level precision. Dr Andrew Jackson, CEO of Inception, emphasised the significance of the release: "AI is now a proven value-adding force and large language models have been at the forefront of the AI adoption spike. JAIS was created to preserve Arabic heritage, culture, and language, and to democratise access to AI. Releasing JAIS 70B and this new family of models reinforces our commitment to delivering the highest quality AI foundation model for Arabic-speaking nations."
Series of JAIS versions launched
Inception's previous releases include JAIS-13B and JAIS-13B-chat in August 2023, followed by JAIS-30B and JAIS-30B-chat models. The newly launched JAIS 70B and JAIS 70B-chat models have demonstrated superior performance in benchmarking data in both English and Arabic compared to their predecessors. Neha Sengupta, principal applied scientist at Inception, highlighted the efficiency gains achieved with JAIS 70B: "For models up to 30 billion parameters, we successfully trained JAIS from scratch, consistently outperforming adapted models in the community. However, for models with 70 billion parameters and above, the computational complexity and environmental impact of training from scratch were significant. We chose to build JAIS 70B on the Llama2 model, allowing us to leverage the extensive knowledge base of an existing English model and develop a more efficient and sustainable solution." The LLM retains and, in specific cases, exceeds the high-quality English-language processing capabilities of Llama2 while vastly excelling in Arabic outputs. Inception's development team expanded the Llama2 tokeniser to enhance Arabic text processing efficiency, doubling the model's base vocabulary. Sengupta noted that this approach "splits Arabic words less aggressively and makes training and inferencing cheaper" compared to the standard Llama2 model.
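The effect Sengupta describes, fewer pieces per Arabic word once the vocabulary is expanded, can be illustrated with a toy example. The sketch below is not the JAIS or Llama2 tokenizer: it assumes a simple greedy longest-match splitter, and the function names, word list, and both vocabularies are hypothetical, chosen only to show why a larger Arabic vocabulary lowers the average number of tokens per word.

```python
def greedy_tokenize(word, vocab):
    """Split `word` into the longest pieces found in `vocab`.

    Any character not covered by a vocabulary entry becomes its own
    single-character piece, so the split always succeeds.
    """
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, shrinking to one character.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab or j == i + 1:
                pieces.append(word[i:j])
                i = j
                break
    return pieces


def tokens_per_word(words, vocab):
    """Average number of pieces per word (lower means cheaper processing)."""
    return sum(len(greedy_tokenize(w, vocab)) for w in words) / len(words)


# Hypothetical data: three Arabic words and two toy vocabularies.
WORDS = ["الكتاب", "المدرسة", "يكتبون"]
BASE_VOCAB = {"ال"}  # stands in for a vocabulary with little Arabic coverage
EXPANDED_VOCAB = BASE_VOCAB | {"كتاب", "مدرسة", "يكتب", "ون"}  # adds Arabic pieces

if __name__ == "__main__":
    for name, vocab in (("base", BASE_VOCAB), ("expanded", EXPANDED_VOCAB)):
        print(name, round(tokens_per_word(WORDS, vocab), 2))
        for word in WORDS:
            print("  ", greedy_tokenize(word, vocab))
```

Because the cost of training and inference grows with the number of tokens processed, a vocabulary that covers common Arabic word pieces shortens every Arabic sequence, which is consistent with the cost reduction quoted above.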
Inception, a G42 company, has released JAIS 70B, a 70-billion-parameter Arabic-English large language model, as part of a suite of 20 JAIS models ranging from 590M to 70B parameters. The release aims to advance Arabic natural language processing and widen access to AI for the Arabic-speaking world.
Inception, a G42 company specialising in advanced AI models and applications, has released JAIS 70B, its latest Arabic-centric large language model (LLM). Alongside it, the company unveiled a suite of 20 JAIS foundation and fine-tuned models in eight sizes, from 590M to 70B parameters, trained on up to 1.6 trillion tokens of Arabic, English, and code data [1].
JAIS 70B has 70 billion parameters and is designed to process text in both Arabic and English. Rather than training it from scratch, Inception built the model on Llama2 and continued training on 370 billion tokens, 330 billion of them Arabic, which the company describes as the largest Arabic dataset ever used to train an open-source foundation model [2].
The release addresses a gap in the AI landscape. Although Arabic has more than 400 million speakers worldwide, advanced AI tools built for the language have been scarce. Shaped by feedback from the Arabic NLP community, Inception's release aims to close that gap [3].
The models target applications such as customer service, content creation, and data analysis across industries. The suite includes the first Arabic-centric model small enough to run on a laptop, as well as larger models for enterprise use. The models, the technical paper, and benchmarking results are available on Hugging Face [2].
To adapt Llama2 for Arabic, the development team trained an expanded tokenizer that doubles the model's base vocabulary, so Arabic words are split less aggressively and training and inference become cheaper. Inception says the training and adaptation techniques it has applied to Arabic extend to other under-served languages, and it plans to bring this expertise to other countries [2].
The introduction of JAIS 70B and its companion models is expected to spur innovation in the Arabic-speaking world. With these tools, developers, researchers, and businesses can build more sophisticated AI applications tailored to Arabic users. This could lead to a surge in Arabic-language AI services, from chatbots and virtual assistants to translation tools and content generation systems [3].