The UAE is leading the region in harnessing artificial intelligence and generative AI, one of the main components of the Fourth Industrial Revolution, and in consolidating its position as a destination for global and regional companies operating in this vital field.
The UAE has entered the field of developing open-source large language models as part of its ambitious plans to maximize the benefits of generative artificial intelligence, and as part of its drive to build a knowledge economy and develop new, future-ready economic systems, according to a new research paper prepared by Interregional Strategic Analytics, based in Abu Dhabi.
According to the “Official Portal of the UAE Government”, open-source large language models are a type of artificial intelligence model trained on huge amounts of text to learn the patterns, rules, contexts and semantics of language. Accelerators are also used to process large volumes of text data in order to understand and simulate human language.
Great efforts
Interregional pointed to the major efforts undertaken by the UAE through many of its government agencies. His Excellency Omar bin Sultan Al Olama, Minister of State for Artificial Intelligence, Digital Economy and Remote Work Applications, told the Financial Times in an earlier statement that the deal signed with Microsoft to acquire a $1.5 billion stake in G42, the Abu Dhabi artificial intelligence company, is just the beginning of greater technical cooperation between the UAE and the United States of America.
The Financial Times recently reported that Abu Dhabi is investing heavily in AI projects abroad, attracting industry leaders such as Sam Altman of OpenAI and Jensen Huang of Nvidia. In the same vein, the Mohamed bin Zayed University of Artificial Intelligence has launched BiMediX, PALO, GLaMM, GeoChat and MobiLlama, a set of small and large language models, several of them multimodal. Multimodal models process and analyze data from multiple media or sources, ranging from text to audio and images. The releases place a particular focus on the models' capabilities in Arabic.
Technology Innovation Institute
The Technology Innovation Institute (TII), part of the Advanced Technology Research Council (ATRC), has made its open-source AI model Falcon 40B available for research and commercial use. The model has 40 billion parameters and was trained on one trillion tokens, giving researchers, innovators, and small and medium enterprises access to capabilities that were previously out of reach.
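To put the reported Falcon 40B figures in proportion, the short Python sketch below divides the training tokens by the parameter count. The helper name `tokens_per_parameter` is hypothetical and introduced only for illustration; the two inputs are the numbers quoted above, and the ratio is a rough indicator of training scale rather than a measure of model quality.

```python
def tokens_per_parameter(training_tokens: float, parameters: float) -> float:
    """Return how many training tokens were used per model parameter."""
    if parameters <= 0:
        raise ValueError("parameters must be positive")
    return training_tokens / parameters


# Figures as reported in the article for Falcon 40B.
FALCON_40B_PARAMETERS = 40e9  # 40 billion parameters
FALCON_40B_TOKENS = 1e12      # one trillion training tokens

ratio = tokens_per_parameter(FALCON_40B_TOKENS, FALCON_40B_PARAMETERS)
print(f"Falcon 40B: about {ratio:.0f} training tokens per parameter")
```

On these figures the model saw roughly 25 training tokens for every parameter.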
The Technology Innovation Institute also announced, in cooperation with the “Hugging Face” platform, the launch of the first leaderboard for Arabic large language models. Named the Open Arabic LLM Leaderboard (OALL), it is dedicated to evaluating and comparing the performance of large language models in Arabic.
G42 Group
The Interregional Center said that G42 Group, which was established in Abu Dhabi, is one of the most prominent Emirati technology companies and a global leader in creating and developing artificial intelligence technologies. It recently announced its intention to launch “NANDA”, its latest large language model, built for the Hindi language. The model has 13 billion parameters and was trained on a dataset of approximately 2.13 trillion tokens that includes Hindi text.
G42 said that the launch of NANDA is the result of a collaboration between Inception, a subsidiary of the group, the Mohamed bin Zayed University of Artificial Intelligence, and Cerebras Systems. In August 2023, G42 launched Jais, the first open-source large language model to provide Arabic-based natural language processing solutions, opening access to the capabilities of generative AI in the native language of more than 400 million Arabic speakers worldwide.
The paper also reviewed the concept of “large language models”. The global company “Shaip” describes them as advanced artificial intelligence (AI) systems designed to process, understand and generate human-like text, built on deep learning techniques and trained on large datasets. According to the “Wikipedia” encyclopedia, a large language model is a type of language model characterized by its ability to understand and generate language for general purposes using a huge amount of data.
According to Amazon Web Services, large language models are very large deep learning models that are pre-trained on massive amounts of data and are capable of self-learning. Large language models have many practical applications, such as content writing. Apart from ChatGPT and GPT-3, the Claude, Llama 2, Cohere Command, and Jurassic models can write original content, while AI21 Wordspice suggests changes to original sentences to improve style and wording.