Inception Announces Open-Source Release of “Jais” Arabic Large Language Model

Inception, a G42 company dedicated to pushing the boundaries of AI, has announced the open-source release of “Jais”, which it describes as the world’s highest-quality Arabic large language model. Developed in collaboration with Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) and Cerebras Systems, Jais is a 13-billion-parameter model trained on a newly developed 395-billion-token dataset of Arabic and English text.

Jais: Advancing Generative AI in the Arabic-Speaking World

Named after the UAE’s highest peak, Jais aims to bring the advantages of generative AI to the Arabic-speaking world. Inception, MBZUAI, and Cerebras Systems trained the model on Condor Galaxy, a multi-exaFLOP AI supercomputer built by G42 and Cerebras.

A Milestone for AI in the Arabic World

The release of Jais marks a significant milestone for AI in the Arabic-speaking world. Developed in Abu Dhabi, the model offers more than 400 million Arabic speakers the opportunity to harness the potential of generative AI. The release also highlights Abu Dhabi’s position as a hub for AI, innovation, cultural preservation, and international collaboration.

Open-Sourcing Jais: Accelerating the Arabic-Language AI Ecosystem

By open-sourcing Jais, Inception aims to engage the scientific, academic, and developer communities and accelerate the growth of a vibrant Arabic-language AI ecosystem. The initiative can serve as a model for other languages that are underrepresented in mainstream AI.

Commitment to Excellence and Innovation

Andrew Jackson, CEO of Inception, emphasized the importance of collaboration and setting new standards for AI advancement in the Middle East. The release of Jais demonstrates Inception’s commitment to excellence, democratizing AI, and promoting innovation.

Jais: Outperforming Existing Arabic Models

Jais outperforms existing Arabic models by a sizable margin and remains competitive with English models of similar size. This achievement showcases the model’s ability to learn from both Arabic and English data, opening new possibilities for large language model development and training.

Pioneering High-Caliber LLMs

MBZUAI President and University Professor Eric Xing highlighted the research efforts and partnerships involved in developing high-caliber LLMs. MBZUAI will continue pioneering efficient, effective, and accurate language models through collaborations with Inception and other organizations.

Cutting-Edge Features and Techniques

Jais is a transformer-based large language model. It uses ALiBi (Attention with Linear Biases) position embeddings, which are intended to improve context handling and accuracy, along with the SwiGLU activation function and maximal update parameterization, which are intended to improve training efficiency and accuracy.
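For readers unfamiliar with two of these techniques, the following is a minimal sketch in plain Python of how ALiBi and SwiGLU are commonly defined in the research literature. It is not taken from the Jais codebase: the function names are hypothetical, the slope formula assumes a power-of-two number of attention heads, and the feed-forward layer omits bias terms.

```python
import math

def alibi_slopes(num_heads):
    """Per-head slopes forming the geometric sequence 2^(-8/n), 2^(-16/n), ..., 2^(-8)."""
    # Assumption: num_heads is a power of two, the simple case in the ALiBi formulation.
    return [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]

def alibi_bias(slope, seq_len):
    """Bias added to causal attention scores before the softmax.

    Entry [i][j] is -slope * (i - j) for keys at or before the query position,
    and -inf for future keys, which the causal mask hides.
    """
    return [
        [-slope * (i - j) if j <= i else -math.inf for j in range(seq_len)]
        for i in range(seq_len)
    ]

def swish(x):
    """Swish (SiLU) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

def matvec(x, w):
    """Multiply a row vector x (length d_in) by a matrix w (d_in rows, d_out columns)."""
    return [sum(xi * row[k] for xi, row in zip(x, w)) for k in range(len(w[0]))]

def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward layer: (swish(x W_gate) * (x W_up)) W_down, without biases."""
    gate = [swish(g) for g in matvec(x, w_gate)]
    up = matvec(x, w_up)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(hidden, w_down)

if __name__ == "__main__":
    # Toy sizes chosen only for illustration.
    print(alibi_slopes(8))
    print(alibi_bias(0.5, 3))
    print(swiglu_ffn([1.0, 2.0], [[1.0], [0.0]], [[1.0], [1.0]], [[1.0]]))
```

Because the ALiBi penalty depends only on the distance between query and key positions, no learned position vectors are needed. The SwiGLU layer replaces the single activation of a standard feed-forward block with a gated product of two linear projections.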

Training on Condor Galaxy 1

Jais’ training, fine-tuning, and evaluation were carried out by a joint team from Inception and MBZUAI on the Condor Galaxy 1 (CG-1) supercomputer. Co-developed by G42 and Cerebras Systems, CG-1 was used to train the 13-billion-parameter open-source model on a purpose-built dataset designed to capture the complexity and richness of Arabic.

Expanding and Refining Jais

Inception and MBZUAI have plans to expand and refine Jais as its user community grows, ensuring continuous improvement and innovation.

Contributions to the Open-Source Community

Cerebras Systems CEO Andrew Feldman expressed pride in the strategic partnership with G42 and highlighted how Jais contributes to the international open-source community. The release of Jais showcases the ease of use and rapid AI model development enabled by CG-1.

Accessing Jais

Jais is available for download on Hugging Face. Users can also try Jais online by registering interest on Jais’ website and receiving an invite to access the playground environment.

