12 April 2022

NOOR – The world’s largest Arabic NLP model

Technology Innovation Institute (TII) unveils NOOR, the world’s largest Arabic natural language processing (NLP) model, with applications in automated summarization, chatbots, and personalised marketing.

Technology Innovation Institute (TII), a global research center and applied research pillar of Abu Dhabi’s Advanced Technology Research Council, today announced the launch of NOOR, the world’s largest Arabic natural language processing (NLP) model to date.

TII’s team of advanced researchers and Artificial Intelligence (AI) specialists at its AI Cross-Center Unit joined forces on this initiative with LightOn, a technology company that unlocks extreme-scale machine intelligence for businesses, to revolutionize Arabic NLP models. The NOOR model carries out varied, cross-domain tasks from natural language instructions alone. To build NOOR, researchers at TII designed an end-to-end pipeline for collecting high-quality data, including crawling, filtering, and curation at scale. TII’s specialists also built optimized services for extreme-scale distributed training and serving, to deliver applications with efficient inference and model specialization.

Dr. Ray O. Johnson, CEO, TII and ASPIRE, said: “With this development, we are on track to boost our research capabilities and credentials in AI, as well as elevating the status of Abu Dhabi and the UAE as a serious research ecosystem. Our expert teams have demonstrated yet again that this region can achieve breakthrough R&D outcomes that impact the world.”

Dr. Ebtesam Almazrouei, Director, AI Cross-Center Unit, TII, said: “Large language models have taken the world of natural language processing by storm, and we are proud to introduce this cutting-edge model with 10 billion parameters – the world’s largest Arabic NLP model. The uniquely large Arabic dataset collected to train the model is the result of months of work that included curating, scraping, and filtering of varied sources. A special thank you to the entire team that worked on this project to make NOOR the go-to exploration model in Arabic for academicians and businesses everywhere.”

Speaking on the upcoming launch, Prof. Mérouane Debbah, Chief Researcher, Digital Science Research Center and AI Cross-Center Unit, TII, said: “With NOOR, TII has expanded the scope of the modern standard Arabic model by leveraging know-how in large language models to build cross-disciplinary, cutting-edge expertise in this new generation of AI research.”

NOOR’s training dataset is the world’s largest high-quality cross-domain Arabic dataset, combining web data with books, poetry, news articles, and technical information to significantly widen the applicability of the model.

Abdulaziz Alshamsi, AI Researcher and PhD student, TII, said: “As an Emirati researcher, I’m proud to be a part of TII. I have enjoyed meeting and working with the passionate researchers and trainers at the AI Cross-Center Unit, who bring profound value to the Arabic language and NLP domain. I learned technical and advanced skills that will pave the way for me as a researcher to discover the world beyond NLP horizons. Training workshops helped upskill and enlighten us about new concepts and equipped us with the right tools to bring the NOOR project to life.”

Dr. Ebtesam Almazrouei said the NOOR model is based on the popular Transformer architecture. As a decoder-only model, similar in structure to GPT-3, it is designed to tackle generative tasks, with an architecture updated to reflect recent developments in machine learning, such as improved positional embeddings. To help ensure quality at scale in the NOOR dataset, the TII team designed an automated filtering pipeline based on machine learning techniques. These tools identify text that resembles high-quality references and shield the model from exposure to spam content.
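
The idea of keeping only text that resembles quality references can be illustrated with a minimal sketch. The article does not describe TII's actual classifier, so this example swaps in a much simpler stand-in: Jaccard similarity over character trigrams. The function names, the threshold of 0.2, and the sample texts are all hypothetical.

```python
def char_ngrams(text: str, n: int = 3) -> set[str]:
    """Return the set of character n-grams in text, after whitespace normalisation."""
    normalised = " ".join(text.split())
    return {normalised[i:i + n] for i in range(len(normalised) - n + 1)}


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of character trigram sets, between 0.0 and 1.0."""
    grams_a, grams_b = char_ngrams(a), char_ngrams(b)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def keep_document(doc: str, references: list[str], threshold: float = 0.2) -> bool:
    """Keep a document if it is close enough to at least one quality reference.

    The threshold is an illustrative assumption, not a value from the article.
    """
    return any(similarity(doc, ref) >= threshold for ref in references)


if __name__ == "__main__":
    # Hypothetical reference and candidate texts.
    references = ["the model was trained on books and news"]
    candidates = [
        "the model was trained on books and news",
        "$$$ !!! 777 ###",
    ]
    for doc in candidates:
        print(repr(doc), keep_document(doc, references))
```

A production pipeline would more plausibly train a classifier on labelled examples; the similarity score here only stands in for that model's output.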


Leveraging state-of-the-art 3D parallelism, NOOR was trained on a High-Performance Computing resource with 128 A100 GPUs, allowing for the distribution of computations and ensuring efficient use of the available hardware resources.
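
3D parallelism generally combines data, pipeline, and tensor parallelism, so the GPUs can be pictured as a three-dimensional grid. The sketch below shows how 128 flat GPU ranks could map onto such a grid. The article gives only the total of 128 GPUs; the 8 x 4 x 4 split and the rank ordering are assumptions chosen for illustration, not NOOR's actual configuration.

```python
from typing import NamedTuple

WORLD_SIZE = 128  # GPU count stated in the article

# Assumed split, not from the article: any three factors whose product is 128 would do.
TENSOR_PARALLEL = 4
PIPELINE_PARALLEL = 4
DATA_PARALLEL = WORLD_SIZE // (TENSOR_PARALLEL * PIPELINE_PARALLEL)  # 8 model replicas


class Coord(NamedTuple):
    data: int      # which model replica this GPU belongs to
    pipeline: int  # which pipeline stage (group of layers) it holds
    tensor: int    # which shard of each layer's weights it holds


def rank_to_coord(rank: int) -> Coord:
    """Map a flat GPU rank to its position on the (data, pipeline, tensor) grid."""
    if not 0 <= rank < WORLD_SIZE:
        raise ValueError(f"rank must be in [0, {WORLD_SIZE}), got {rank}")
    tensor = rank % TENSOR_PARALLEL
    pipeline = (rank // TENSOR_PARALLEL) % PIPELINE_PARALLEL
    data = rank // (TENSOR_PARALLEL * PIPELINE_PARALLEL)
    return Coord(data, pipeline, tensor)


if __name__ == "__main__":
    for rank in (0, 5, 127):
        print(rank, rank_to_coord(rank))
```

Placing the tensor dimension on the fastest-varying ranks is a common choice because tensor parallelism communicates most often, so those GPUs benefit from sitting on the same node.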

The Director of the AI Cross-Center Unit noted that this was only the first step in TII’s efforts to contribute to the wider UAE Strategy for Artificial Intelligence, through supporting AI integration across key sectors of the economy.

Named for the Arabic word for “light”, the model takes its name from the association between the Arabic language model and enlightening the mind. It represents the United Arab Emirates’ global contribution to advanced technology and artificial intelligence.
