These AI products are breaking barriers for 430 million people

We previously wrote an article, "Don't Ask Me Again Why Deaf People Go to Music Festivals," which introduced a special role at overseas music festivals: the sign language interpreter.

Although hearing-impaired people cannot hear the music, or cannot hear it clearly, they can feel its rhythm and the warmth of the atmosphere through the interpreters' highly expressive hand movements, facial expressions, and body language.

This may be an unexpected setting for sign language to hearing people. In fact, sign language interpreters are needed in many other places, both online and offline, and there are not nearly enough of them.

Sally Chalk, a British entrepreneur, founded a British Sign Language interpreting company in 2002. After 20 years of operation the company had grown to a considerable scale, and the lead time for booking an interpreter had been cut to 30 minutes, but she was still not satisfied.

Can hearing-impaired people get immediate access to sign language interpretation, just like turning on subtitles on a video website?

Her answer is to get AI involved.

From online to offline, hearing-impaired people should be allowed to use their "native language" more often

In 2022, Sally Chalk founded a new startup, Signapse, which focuses on generative AI sign language translation software that translates written text into American Sign Language and British Sign Language in real time.

In May this year, Signapse received £2 million in seed funding, £500,000 of which came from the British government.

One of the offline scenarios they are targeting is transportation hubs such as train stations and airports.

Cincinnati/Northern Kentucky International Airport in the United States has partnered with Signapse to display American Sign Language on its screens, providing welcome, security, departure, arrival, and other information.

How does the AI work? Signapse draws on a large sign language dataset and uses generative adversarial networks (GANs) and deep learning to create lifelike virtual sign language interpreters that translate as accurately as possible.
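Signapse has not published its architecture, but the general shape of a text-conditioned GAN for sign generation is well understood. Below is a minimal PyTorch sketch under assumed dimensions and invented module names (PoseGenerator, PoseDiscriminator): a generator maps a text embedding plus noise to a sequence of pose keyframes, while a discriminator learns to tell generated sequences from motion-captured ones.

```python
# Hypothetical sketch of a conditional GAN for sign pose generation.
# All names, sizes, and architecture choices are illustrative assumptions.
import torch
import torch.nn as nn

TEXT_DIM, NOISE_DIM = 128, 64   # assumed embedding / noise sizes
FRAMES, KEYPOINTS = 32, 67 * 2  # assumed: 32 pose frames, 67 (x, y) keypoints

class PoseGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(TEXT_DIM + NOISE_DIM, 512), nn.ReLU(),
            nn.Linear(512, 1024), nn.ReLU(),
            nn.Linear(1024, FRAMES * KEYPOINTS), nn.Tanh(),  # normalized coords
        )

    def forward(self, text_emb, noise):
        x = torch.cat([text_emb, noise], dim=-1)
        return self.net(x).view(-1, FRAMES, KEYPOINTS)

class PoseDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FRAMES * KEYPOINTS + TEXT_DIM, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 1),  # real/fake logit, conditioned on the text
        )

    def forward(self, poses, text_emb):
        x = torch.cat([poses.flatten(1), text_emb], dim=-1)
        return self.net(x)

# One adversarial step on a dummy batch, to show the training loop's shape.
G, D = PoseGenerator(), PoseDiscriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

text = torch.randn(8, TEXT_DIM)          # stand-in for encoded announcement text
real = torch.rand(8, FRAMES, KEYPOINTS)  # stand-in for motion-captured poses

fake = G(text, torch.randn(8, NOISE_DIM))
loss_d = bce(D(real, text), torch.ones(8, 1)) + \
         bce(D(fake.detach(), text), torch.zeros(8, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

loss_g = bce(D(fake, text), torch.ones(8, 1))  # generator tries to fool D
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

In a production system the generated pose sequence would then drive a photorealistic avatar renderer; this sketch stops at the poses themselves.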

These avatars are based on real sign language interpreters, and every time they are used commercially, the real people get a cut.

Since destinations, departure times, and platform numbers change frequently, Signapse's sign language translations can be updated in real time by integrating with live transport data feeds.
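How that integration works is not documented publicly. A plausible minimal sketch, with an invented feed format and clip library, is a slot-filling template whose phrases map to pre-rendered (or freshly generated) sign clips:

```python
# Hypothetical sketch: a live transport update fills a template, and each
# phrase maps to a signed video segment. Feed format, template, and clip
# library are all assumptions for illustration.
from dataclasses import dataclass

@dataclass
class Departure:
    destination: str
    time: str
    platform: str

# Pretend library of pre-rendered sign clips, keyed by phrase.
SIGN_CLIPS = {
    "the train to": "clip_train_to.mp4",
    "London": "clip_london.mp4",
    "departs at": "clip_departs_at.mp4",
    "10:42": "clip_1042.mp4",
    "from platform": "clip_from_platform.mp4",
    "3": "clip_platform_3.mp4",
}

def announcement_phrases(dep: Departure) -> list[str]:
    """Break a templated announcement into phrases the clip library covers."""
    return ["the train to", dep.destination, "departs at", dep.time,
            "from platform", dep.platform]

def build_playlist(dep: Departure) -> list[str]:
    """Map each phrase to a clip; in a real system, unseen values (new
    destinations, times) would be synthesized by the generative model."""
    return [SIGN_CLIPS.get(p, f"<generate:{p}>") for p in announcement_phrases(dep)]

# When the live feed reports a platform change, rebuilding is instant:
print(build_playlist(Departure("London", "10:42", "3")))
print(build_playlist(Departure("London", "10:42", "5")))  # falls back to generation
```

The appeal of this design is that a platform change only swaps one clip in the playlist, which is what makes near-instant updates feasible.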

At the same time, Signapse has not ignored online needs and also provides sign language translation for websites and video streaming.

Although websites such as YouTube offer closed captions, hearing-impaired people often prefer sign language to subtitles: sign languages have their own grammatical structures and expressions, independent of written languages, so signing makes for a better online experience.

You may have noticed that when referring to sign language we specify American Sign Language and British Sign Language. Just as spoken and written languages around the world are mutually unintelligible, sign languages vary widely from place to place.

According to the United Nations, approximately 70 million people worldwide use sign language as their main form of communication, and more than 300 different sign languages are in use around the world. In the United States alone, 500,000 people use American Sign Language.

What Signapse currently does is therefore quite limited, covering only the subset of people who use American and British Sign Language, and only a few vertical scenarios. Still, over the past two years Signapse has produced around 5,000 British Sign Language transport announcements every day.

Signapse hopes its services will become more universal, expanding into education and other scenarios, and more personalized, letting users customize the appearance of the virtual sign language interpreters.

The environment and conditions support this kind of AI in China as well, where major tech companies have similar sign language products.

AI sign language presenters have appeared in Tencent's Honor of Kings livestreams and at Huawei's developer conference.

At the 2022 Winter Olympics, an AI sign language presenter jointly created by CCTV News and Baidu Intelligent Cloud's Xiling platform went live, with the School of Artificial Intelligence for the Deaf at Tianjin University of Technology contributing sign language annotation.

Behind this presenter, Baidu Intelligent Cloud's Xiling AI sign language platform can also provide rapid sign language translation in scenarios such as hospitals, stations, and banks; Signapse and Baidu have clearly arrived at the same idea.

Smoother travel, more immersive viewing, more accessible services…

Even if the room for improvement in sign language interpretation runs deeper than the sea, the way hearing-impaired people access public information is already being changed by AI, and the waves it raises keep rising.

The "Duolingos" of the sign language world

Do hearing-impaired people really need to "listen" to music? Isn't reading text enough for them? These are questions framed in the logic of hearing people.

In fact, we should ask the other way around: how can hearing-impaired people feel a sense of participation at a music festival? How can the Internet be made more enjoyable for them to surf?

So it is not that a busy station has gained an extra screen; it is that the screen should have been there all along.

More companies and individuals are leveraging technology to give sign language a larger and larger presence.

Getting hearing people to learn sign language is one of the more obvious ideas.

PopSign is an app for learning sign language through play. Built on an AI sign language model and jointly developed by Google, Rochester Institute of Technology, and Georgia Institute of Technology, it runs on Android and iOS. Its main audience is hearing parents of hearing-impaired children.

Mindful of the old joke that memorizing vocabulary starts at "abandon" and ends in abandonment, PopSign does not play dull sign language videos; it uses mini-games to build your interest and confidence in learning sign language, much like Duolingo with its relentless streak reminders.

An American company called SLAIT also wants to be the "Duolingo" of the sign language world, likewise offering immersive interactive courses and tests. The AI tutor gives real-time feedback and, when you get something right, just the right amount of encouragement.

Teaching sign language, however, is only SLAIT's second-best option. What it originally set out to build was an AI tool for real-time sign language video chat and translation.

But even a clever cook cannot make a meal without rice: SLAIT is a small team with neither enough data nor enough funding. Compared with translating full sign language sentences, teaching individual signs is simpler, but just as valuable.

The hard work of interpreting sign language is left to the wealthy giants.

In August 2023, Lenovo Brazil developed an AI-based app for real-time chat translation of Brazilian Sign Language into Portuguese, with plans to cover more of the world's sign languages in the future.

When a hearing-impaired person signs in front of the device's camera, the algorithm instantly translates the signs into Portuguese text and sends it to the recipient on the other end.
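Lenovo has not published its implementation, but a minimal sketch of such a camera-to-text pipeline can be assembled from off-the-shelf parts: MediaPipe hand tracking to extract landmarks, feeding a classifier. Here the classifier (classify_sign) is a placeholder invented for illustration, not Lenovo's code.

```python
# Hypothetical camera-to-text pipeline: track hand landmarks per frame,
# then hand them to a (placeholder) sign classifier.
import cv2
import mediapipe as mp
import numpy as np

hands = mp.solutions.hands.Hands(max_num_hands=2, min_detection_confidence=0.5)

def classify_sign(landmarks: np.ndarray) -> str:
    """Placeholder for a trained classifier: a real system would feed a
    sequence of landmark frames into a temporal model (RNN/transformer)."""
    return "<sign>"  # stand-in prediction

cap = cv2.VideoCapture(0)  # the device's camera
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB; OpenCV captures BGR.
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        for hand in results.multi_hand_landmarks:
            # 21 (x, y, z) landmarks per hand, normalized to the frame.
            pts = np.array([[lm.x, lm.y, lm.z] for lm in hand.landmark])
            text = classify_sign(pts)
            # In a chat app, this text would be sent to the recipient.
            print(text)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
```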

The more such tools, the better. They complement sign language teaching services and let hearing-impaired people take a more active position, becoming initiators of conversations rather than only recipients.

Google's approach is more product-oriented: in 2023 it launched an AI sign language recognition competition on Kaggle.

The competition brief is interesting: using a dataset of more than 3 million fingerspelled characters collected from hearing-impaired signers via smartphone selfie cameras, contestants build fingerspelling models that rapidly track fingers, palms, and faces.

Fingerspelling is a part of sign language that represents letters through different shapes and positions of the fingers. For many hearing-impaired people, fingerspelling is much faster than typing on a smartphone's virtual keyboard.
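Winning Kaggle entries varied, but the task itself has a recognizable shape: sequences of tracked landmarks in, character sequences out, with no frame-level alignment between the two. A minimal sketch of that shape, using an assumed bidirectional-LSTM encoder and CTC loss (sizes and names invented for illustration, not the winning solution):

```python
# Hypothetical fingerspelling recognizer: landmark sequences -> characters,
# trained with CTC so frames need not be aligned to letters.
import torch
import torch.nn as nn

NUM_LANDMARKS = 21 * 3  # assumed: 21 hand landmarks, (x, y, z) each
NUM_CHARS = 26 + 1      # a-z plus the CTC blank token (index 0)

class FingerspellingModel(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.encoder = nn.LSTM(NUM_LANDMARKS, hidden, num_layers=2,
                               batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, NUM_CHARS)

    def forward(self, x):        # x: (batch, frames, NUM_LANDMARKS)
        feats, _ = self.encoder(x)
        return self.head(feats)  # (batch, frames, NUM_CHARS) logits

model = FingerspellingModel()
ctc = nn.CTCLoss(blank=0)

# Dummy batch: 4 clips of 120 frames, each labeled with an 8-letter word.
frames = torch.randn(4, 120, NUM_LANDMARKS)
labels = torch.randint(1, NUM_CHARS, (4, 8))  # 1..26 = a..z
logits = model(frames).log_softmax(-1).transpose(0, 1)  # CTC wants (T, N, C)
loss = ctc(logits, labels,
           input_lengths=torch.full((4,), 120, dtype=torch.long),
           target_lengths=torch.full((4,), 8, dtype=torch.long))
loss.backward()
print(float(loss))
```

CTC is a natural fit here because a signer may hold one letter for many frames; the blank token lets the model emit each character exactly once regardless of how long it is held.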

Improving sign language recognition and building fingerspelling models in this way lets hearing-impaired people use the sign language they are most fluent in, rather than typing or speaking, to access functions such as search, maps, and text messages on their phones.

Going further, it will also help in developing sign-to-speech applications, breaking the deadlock in which hearing-impaired people cannot summon digital assistants by voice.

In other words, many voice-first products never considered users who cannot easily speak. It is time to close that gap.

Sam Sepah, lead accessibility research product manager at Google, said in an interview with Forbes that the goal is to make sign language a universal language option when using Google products.

In fact, this should also be the goal of the entire Internet – to make sign language a universal language in the digital world.

As language-learning software, Duolingo gives everyone equal educational opportunities. What AI sign language products convey is that restrictions which should never have existed are being lifted, and that people can communicate with each other anywhere.

The more powerful AI becomes, the more we must value humanity

When GPT-4o was released in May, one demo video was very touching: GPT-4o acted as a pair of eyes, allowing a visually impaired person to "see" his surroundings.

From the AI's narration, he learns that the flag is flying over Buckingham Palace, that ducks are paddling leisurely on the river, and that his taxi is about to arrive; the corners of his mouth lift in response to the AI's cheerful tone.

As the saying goes, technology opens the door to a new world. Can that be read the other way around: that people with disabilities have been living in a world not designed for them?

WHO data shows that 430 million people worldwide require rehabilitation for disabling hearing loss. The number of sign language interpreters falls far short. In the United States, the ratio of hearing-impaired users to American Sign Language interpreters is roughly 50 to 1.

So for now, AI sign language plays a supplementary, icing-on-the-cake role; it is nowhere near "stealing jobs."

The AI sign language products mentioned above are mostly small-scale, vertical, and rooted in specific regions, compensating for the places human interpreters cannot reach.

Last month, I also saw a cool AI sign language product.

Researchers from several universities, including Rutgers University and Carnegie Mellon University, processed public sign language videos into a dataset spanning 8 sign languages and trained SignLLM, the first multilingual sign language generation model.

It covers multiple sign languages and can generate signing from text prompts. Sounds convenient, right? The researchers themselves, however, have cautioned against exaggerating the results: the demonstration videos are not direct model output, and producing them is still quite laborious.

Meanwhile, some hearing-impaired experts have pointed out that the quality of the sign language in these videos varies: some of it is half-intelligible, some completely incomprehensible, and facial expressions are missing. The project has potential, but it needs improvement.

The most important thing is to let hearing-impaired users participate, voice their opinions, and improve these products together, because "nothing about us without us."

There is a subtle sense that it is hard to make accessibility products "sexy."

They are rarely as exciting as a large-model launch or new AI hardware. They plainly tell you what they can do and whom they serve, hope to do better in the future, and never bite off more than they can chew.

In the eyes of venture capital, they are also niche, of uncertain potential, and may never return the investment.

But "AI Godmother" Li Feifei once said that AI is to help people. The more powerful AI is, the more we must cherish humanity.

No one should have to fear missing a flight; everyone should be able to interact with products; everyone should get to enjoy a music festival.

What was once unseen and unheard should also be lit by technology. Let us snap our fingers in resonance, so that more people's needs are met and more people's abilities are amplified, so that we gain more and lose less.
