October 20, 2022

SAN FRANCISCO - Meta announced this week that it has built a speech-to-speech translation system powered by artificial intelligence (AI) that can translate primarily oral languages, which lack a widely used writing system, into English. It starts with Hokkien, a Taiwanese language that lacks a standard written form.

The system is the result of extensive research by Meta AI teams across the world, including in Israel, where Meta has built a significant R&D operation, the largest outside the US.

The Silicon Valley tech titan, which owns Facebook, Instagram, and WhatsApp, billed the work, part of its Universal Speech Translator project, as an effort to enable users around the world to socialize regardless of the languages they speak. The project was one of two first announced in February.

The second, related project is called No Language Left Behind. Under it, Meta says it is building a new advanced AI model that “can learn from languages with fewer examples to train from, and we will use it to enable expert-quality translations in hundreds of languages, ranging from Asturian to Luganda to Urdu.”

These two projects are part of Meta’s stated long-term effort to build language tools and machine translation systems that will apply to “most of the world’s languages.”

When Facebook renamed itself Meta a year ago, co-founder and chief executive Mark Zuckerberg said the company was focusing on a shift to online life playing out in virtual realms, a concept referred to as the metaverse.

“Spoken communications can help break down barriers and bring people together wherever they are located — even in the metaverse,” Meta said in a blog post.

The fledgling system for translating Hokkien was billed by Meta as the first artificial intelligence-powered “speech-to-speech translation system developed for an unwritten language.”

The translation technology, which the tech firm said will be shared for others to use, allows someone speaking Hokkien to converse with someone who speaks English, but only one full sentence at a time, according to Meta.

The tech giant said that to build the system, it used a variety of methods to overcome the fact that most speech translation systems rely on transcriptions.

Since Hokkien does not have a writing system, Meta said it used speech-to-speech translations that rely on some text from a related language, in this case Mandarin, and speech-to-unit translations that produce acoustic sounds.

“Our team first translated English or Hokkien speech to Mandarin text, and then translated it to Hokkien or English,” Juan Pino, one of the AI researchers at Meta, told VentureBeat. “They then added the paired sentences to the data used to train the AI model.”
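The pivot step Pino describes can be pictured with a small Python sketch. It is an illustration only, not Meta's code: the `Pair` record, the `pivot_pairs` function, and the toy lookup tables standing in for the speech-to-Mandarin and Mandarin-to-English translators are all hypothetical names chosen for this example. The sketch shows only the data flow: each utterance is first rendered as Mandarin text, that text is translated into the target language, and the resulting pair is collected as training data.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Pair:
    """One training example: a source utterance and its pivot-derived translation."""
    source_audio: str  # identifier of the source-language recording (hypothetical)
    target_text: str   # translation obtained by going through the pivot language


def pivot_pairs(
    utterances: Iterable[str],
    to_pivot: Callable[[str], Optional[str]],
    from_pivot: Callable[[str], Optional[str]],
) -> List[Pair]:
    """Build paired data by translating source -> pivot text -> target text."""
    pairs: List[Pair] = []
    for audio_id in utterances:
        pivot_text = to_pivot(audio_id)    # e.g. Hokkien speech -> Mandarin text
        if pivot_text is None:
            continue                       # no pivot transcription available
        target = from_pivot(pivot_text)    # e.g. Mandarin text -> English text
        if target is None:
            continue                       # pivot text could not be translated
        pairs.append(Pair(audio_id, target))
    return pairs


# Toy stand-ins for real translation models (assumptions for illustration).
speech_to_mandarin = {"hok_001.wav": "你好", "hok_002.wav": "謝謝"}.get
mandarin_to_english = {"你好": "hello"}.get

pairs = pivot_pairs(
    ["hok_001.wav", "hok_002.wav", "hok_003.wav"],
    speech_to_mandarin,
    mandarin_to_english,
)
print(pairs)
```

In this toy run only the first recording survives both steps, so a single pair is added to the training set. A real system would replace the lookup tables with trained models.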

Meta said the work was a “step toward a future where simultaneous translation between languages is possible.”

“The techniques we pioneered with Hokkien can be extended to many other unwritten languages and eventually will work in real-time.”

Hokkien is widely spoken within the Chinese diaspora. It is used by 16 million people across Asia and is spoken by three-quarters of the population of Taiwan, according to the French National Institute of Oriental Languages and Civilizations.

But the language lacks a standard written form, making it a challenge to train AI models to interpret what is said, according to Meta.

More than 40 percent of the world’s 7,000 existing languages are primarily spoken, without a standard or widely known written form, the tech firm said.

“In the future, all languages, whether written or unwritten, may no longer be an obstacle to mutual understanding,” Meta said.