Meta unveils SeamlessM4T multimodal translation model
AI News, Tue, 22 Aug 2023
https://www.artificialintelligence-news.com/2023/08/22/meta-unveils-seamlessm4t-multimodal-translation-model/

Meta researchers have unveiled SeamlessM4T, a pioneering multilingual and multitask model that facilitates seamless translation and transcription across both speech and text. 

The internet, mobile devices, social media, and communication platforms have ushered in an era where access to multilingual content has reached unprecedented levels. SeamlessM4T aims to realise the vision of seamless communication and comprehension across languages.

Boasting an impressive array of capabilities, SeamlessM4T encompasses:

  • Automatic speech recognition for nearly 100 languages
  • Speech-to-text translation supporting nearly 100 input and output languages
  • Speech-to-speech translation for nearly 100 input languages and 35 output languages (including English)
  • Text-to-text translation for almost 100 languages
  • Text-to-speech translation for nearly 100 input languages and 35 output languages (including English)
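
The five capabilities listed above amount to a choice of input modality, output modality, and whether the output language differs from the input. The sketch below only illustrates that mapping. The `Modality` enum and `task_name` function are hypothetical names invented for this example and are not part of SeamlessM4T's code or API.

```python
from enum import Enum


class Modality(Enum):
    SPEECH = "speech"
    TEXT = "text"


def task_name(source: Modality, target: Modality, translate: bool = True) -> str:
    """Name the task for a given input/output modality pair.

    translate=False means the output stays in the source language, which
    in the article's list only applies to speech recognition.
    (Hypothetical helper, not SeamlessM4T's API.)
    """
    if not translate:
        if source is Modality.SPEECH and target is Modality.TEXT:
            return "automatic speech recognition"
        raise ValueError("same-language output is only listed for speech -> text")
    return f"{source.value}-to-{target.value} translation"


# Enumerate the four translation tasks, then the recognition task.
for src in Modality:
    for tgt in Modality:
        print(f"{src.value} -> {tgt.value}: {task_name(src, tgt)}")
print(task_name(Modality.SPEECH, Modality.TEXT, translate=False))
```

Four translation tasks plus speech recognition give the five entries in the list. The sketch does not model the language limits, such as the 35 output languages for speech.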

SeamlessM4T is being made available to researchers and developers under the CC BY-NC 4.0 license, embodying an ethos of open science.

Additionally, the metadata of SeamlessAlign – the largest multimodal translation dataset ever compiled, consisting of 270,000 hours of mined speech and text alignments – has been released. This facilitates independent data mining and further research within the community.

The development of SeamlessM4T addresses a long-standing challenge in the field of multilingual communication. Unlike earlier systems, which were confined by limited language coverage and reliance on separate subsystems, SeamlessM4T presents a unified model capable of comprehensively handling speech-to-speech and speech-to-text translation tasks. 

Meta has built upon previous innovations – such as No Language Left Behind (NLLB) and Universal Speech Translator – to create this unified multilingual model. With its impressive performance on low-resource languages and consistently strong performance on high-resource languages, SeamlessM4T holds the potential to revolutionise cross-language communication.

Underpinning the model’s architecture is the multitask UnitY model, which excels in generating translated text and speech.

UnitY supports various translation tasks, including automatic speech recognition, text-to-text translation, and speech-to-speech translation, all from a single model. To train this versatile model, Meta employed advanced techniques such as text and speech encoders, self-supervised encoders, and sophisticated decoding processes.

The result, according to Meta, is a model that outperforms previous leaders.

To ensure the accuracy and safety of the system, Meta adheres to a responsible AI framework.

Meta says that extensive research on toxicity and bias mitigation has been conducted, resulting in a model that is more aware of and responsive to potential issues. The public release of the SeamlessM4T model encourages collaborative research and development in the AI community.

As the world becomes more connected, SeamlessM4T’s ability to transcend language barriers is a testament to the power of AI-driven innovation. This milestone brings us closer to a future where communication knows no linguistic limitations, enabling a world where people can truly understand each other regardless of language.

A demo of SeamlessM4T is available, and the code, model, and data can be downloaded from GitHub.


Meta’s NLLB-200 AI model improves translation quality by 44%
AI News, Thu, 07 Jul 2022
https://www.artificialintelligence-news.com/2022/07/07/metas-nllb-200-ai-model-improves-translation-quality-by-44/

Meta has unveiled a new AI model called NLLB-200 that can translate 200 languages and improves quality by an average of 44 percent. 

Translation apps have been fairly adept at the most popular languages for some time. Even when they don’t offer a perfect translation, it’s normally close enough for the native speaker to understand.

However, hundreds of millions of people in regions with many languages – like Africa and Asia – still suffer from poor translation services.

In a press release, Meta wrote:

“To help people connect better today and be part of the metaverse of tomorrow, our AI researchers created No Language Left Behind (NLLB), an effort to develop high-quality machine translation capabilities for most of the world’s languages.

Today, we’re announcing an important breakthrough in NLLB: We’ve built a single AI model called NLLB-200, which translates 200 different languages with results far more accurate than what previous technology could accomplish.”

The metaverse aims to be borderless. To enable that, translation services will have to quickly offer accurate translations.

“As the metaverse begins to take shape, the ability to build technologies that work well in a wider range of languages will help to democratise access to immersive experiences in virtual worlds,” the company explained.

According to Meta, NLLB-200 scored an average of 44 percent higher in the “quality” of translations compared to previous AI research. For some African and Indian languages, NLLB-200’s translations were more than 70 percent more accurate.
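
A percentage like this is easy to misread, so a short worked example may help. The sketch assumes the 44 percent figure is a relative gain in an automatic metric score, which the article does not spell out. The scores used are hypothetical numbers chosen for illustration, not results reported by Meta.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Return the gain of `new` over `baseline` as a percentage of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return (new - baseline) / baseline * 100.0


# Hypothetical scores, chosen only to illustrate the arithmetic.
baseline_score = 20.0
new_score = 28.8
gain = relative_improvement(baseline_score, new_score)
print(f"relative gain: {gain:.1f}%")
print(f"absolute gain: {new_score - baseline_score:.1f} points")
```

Under this reading, a 44 percent improvement on a baseline of 20 points is a gain of 8.8 points, not 44 points.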

Meta created a dataset called FLORES-200 to evaluate and improve NLLB-200. The dataset enables researchers to assess NLLB-200’s performance “in 40,000 different language directions.”
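
The “40,000 language directions” figure can be checked with simple counting. The sketch assumes a direction is an ordered pair of distinct languages, so translating A to B and B to A count separately. The function name is hypothetical.

```python
def language_directions(n_languages: int) -> int:
    """Count ordered (source, target) pairs of distinct languages."""
    return n_languages * (n_languages - 1)


print(language_directions(200))
```

For 200 languages this gives 200 × 199 = 39,800 directions, which is consistent with the rounded figure of 40,000.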

Both NLLB-200 and FLORES-200 are being opened to developers to help build on Meta’s work and improve their own translation tools.

Meta has a pool of up to $200,000 in grants for researchers and nonprofit organisations that wish to use NLLB-200 for impactful uses focused on sustainability, food security, gender-based violence, education, or other areas that support UN Sustainable Development Goals. 

However, not everyone is fully convinced by Meta’s latest breakthrough.

“It’s worth bearing in mind, despite the hype, that these models are not the cure-all that they may first appear. The models that Meta uses are massive, unwieldy beasts. So, when you get into the minutiae of individualised use-cases, they can easily find themselves out of their depth – overgeneralised and incapable of performing the specific tasks required of them,” commented Victor Botev, CTO at Iris.ai.

“Another point to note is that the validity of these measurements has yet to be scientifically proven and verified by their peers. The datasets for different languages are too small, as shown by the challenge in creating them in the first place, and the metric they’re using, BLEU, is not particularly applicable.”
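
For readers unfamiliar with BLEU, the metric scores a translation by how many word n-grams it shares with a reference translation, with a penalty for output that is too short. Below is a minimal single-reference, sentence-level sketch without smoothing. It is a simplification for illustration, not the evaluation setup Meta used, and `simple_bleu` is a hypothetical name.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Simplified BLEU: clipped n-gram precisions times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # no smoothing: one empty precision zeroes the score
        log_precisions.append(math.log(matched / total))
    # Brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat"
print(simple_bleu("the cat sat on the mat", reference))
print(simple_bleu("the cat sat on a mat", reference))
print(simple_bleu("a feline rested upon the rug", reference))
```

Because the score depends on exact word overlap, the third candidate scores zero in this unsmoothed sketch even though a reader might accept it as a paraphrase.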

A demo of NLLB-200 is also available.

