Meta’s NLLB-200 AI model improves translation quality by 44%

Meta’s NLLB-200 AI model improves translation quality by 44% Ryan is a senior editor at TechForge Media with over a decade of experience covering the latest technology and interviewing leading industry figures. He can often be sighted at tech conferences with a strong coffee in one hand and a laptop in the other. If it's geeky, he’s probably into it. Find him on Twitter (@Gadget_Ry) or Mastodon (@gadgetry@techhub.social)


Meta has unveiled a new AI model called NLLB-200 that can translate 200 languages and improves quality by an average of 44 percent. 

Translation apps have been fairly adept at the most popular languages for some time. Even when they don’t offer a perfect translation, it’s normally close enough for the native speaker to understand.

However, there are hundreds of millions of people in regions with many languages – like Africa and Asia – that still suffer from poor translation services.

In a press release, Meta wrote:

“To help people connect better today and be part of the metaverse of tomorrow, our AI researchers created No Language Left Behind (NLLB), an effort to develop high-quality machine translation capabilities for most of the world’s languages.

Today, we’re announcing an important breakthrough in NLLB: We’ve built a single AI model called NLLB-200, which translates 200 different languages with results far more accurate than what previous technology could accomplish.”

The metaverse aims to be borderless. To enable that, translation services will have to quickly offer accurate translations.

“As the metaverse begins to take shape, the ability to build technologies that work well in a wider range of languages will help to democratise access to immersive experiences in virtual worlds,” the company explained.

According to Meta, NLLB-200 scored 44 percent higher in the “quality” of translations compared to previous AI research. For some African and Indian-based languages, NLLB-200’s translations were more than 70 percent more accurate.

Meta created a dataset called FLORES-200 to evaluate and improve NLLB-200. The dataset enables researchers to assess FLORES-200’s performance “in 40,000 different language directions.”

Both NLLB-200 and FLORES-200 are being opened to developers to help build on Meta’s work and improve their own translation tools.

Meta has a pool of up to $200,000 in grants for researchers and nonprofit organisations that wish to use NLLB-200 for impactful uses focused on sustainability, food security, gender-based violence, education, or other areas that support UN Sustainable Development Goals. 

However, not everyone is fully convinced by Meta’s latest breakthrough.

“It’s worth bearing in mind, despite the hype, that these models are not the cure-all that they may first appear. The models that Meta uses are massive, unwieldy beasts. So, when you get into the minutiae of individualised use-cases, they can easily find themselves out of their depth – overgeneralised and incapable of performing the specific tasks required of them,” commented Victor Botev, CTO at Iris.ai.

“Another point to note is that the validity of these measurements has yet to be scientifically proven and verified by their peers. The datasets for different languages are too small, as shown by the challenge in creating them in the first place, and the metric they’re using, BLEU, is not particularly applicable.”

A demo of NLLB-200 is available here.

(Photo by Jason Leung on Unsplash)

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

Tags: , , , , , , , ,

View Comments
Leave a comment

Leave a Reply