Stanford alumni in the US are creating an app to change accents

Three Stanford University alumni in the U.S. heard the sadness in a friend’s voice when he told them the news: “Guys, I had to quit my job.”

To them, it didn’t make sense. He was fluent in English and Spanish, extremely friendly and an expert in systems engineering. Why wasn’t he able to keep a job at a call center?

His accent, the friend said, made it difficult for many customers to understand him. Some even insulted him for the way he spoke. The three students realized that the problem was bigger than their friend’s experience, so they founded a startup to solve it.

Now the company, Sanas, is testing software powered by artificial intelligence that aims to eliminate miscommunication by changing people’s accents in real time. A call center employee in the Philippines, for example, could speak normally into the microphone and end up sounding more like someone from Kansas to a customer on the other end of the line.

Call centers, say the startup’s founders, are just the beginning. The company’s website advertises its plans as “Speech, reimagined”.

Eventually, they expect the application they are developing to be used by a variety of industries and individuals. It could help doctors understand patients better, they say, or help grandchildren understand their grandparents better.

“We have a very big vision for Sanas,” said CEO Maxim Serebryakov. And for Serebryakov and his co-founders, the project is personal.

“People’s voices aren’t being heard as much as their accents”

The trio who founded Sanas met at Stanford University, but originally they were all from different countries – Serebryakov, now the CEO, is from Russia, Andrés Pérez Soderi, now CFO, is from Venezuela, and Shawn Zhang, now the chief technology officer, is from China.

They are no longer Stanford students. Serebryakov and Pérez graduated, and Zhang dropped out of college to focus on bringing Sanas to life.

They launched the company last year and gave it a name that can be easily pronounced in multiple languages “to highlight our global mission and desire to bring people together,” says Pérez.

Over the years, the three say they’ve experienced how accents can disrupt communication.

“We all come from international backgrounds. We saw firsthand how people treat you differently just because of the way you talk,” Serebryakov said. “It breaks your heart sometimes.”

Zhang says his mother, who came to the United States from China more than 20 years ago, still makes him talk to the cashier when they go shopping together because she is embarrassed.

“That’s one of the reasons I joined Max and Andrés in building this company, trying to help those people who think their voices aren’t being heard as much as their accents,” he says.

Serebryakov says he’s seen how his parents are treated in hotels when they visit him in the United States, and how people make assumptions when they hear their accents. “They speak a little louder. They change their behavior,” he says.

Pérez says that after attending a British school, he at first had difficulty understanding American accents when he arrived in the United States. He also cites his father’s difficulty using Alexa, the Amazon voice assistant his family gave him for Christmas.

“We quickly found out, when Alexa was turning on lights in random places in the house and turning them pink, that she didn’t understand my father’s accent,” says Pérez.

Call centers are testing the technology

English is the most widely spoken language in the world. About 1.5 billion people speak it, and most of them are not native speakers. In the United States alone, millions of people speak English as a second language.

This has created a growing market for apps that help users practice English pronunciation. But Sanas is using artificial intelligence to take a different approach.

The premise is that instead of learning to pronounce words differently, technology can do it for you. There would no longer be a need for costly or time-consuming accent reduction training. And understanding would be almost instantaneous.

Serebryakov says he knows that accents and people’s identities can be closely linked, and he emphasizes that the company isn’t trying to erase accents or imply that one way of speaking is better than another.

“We allow people to not have to change the way they speak to get a position, to get a job. Identity and accents are essential. They are interconnected,” he says. “You never want someone to change their accent just to satisfy someone else.”

Currently, Sanas’ algorithm can convert English to and from American, Australian, British, Filipino, Indian, and Spanish accents, and the team is planning to add more. They can add a new accent to the system by training a neural network with audio recordings of professional actors and other data – a process that takes several weeks.

The Sanas team gave two demonstrations for CNN. In one, a man with an Indian accent is heard reading a series of literary phrases. These same phrases are then converted into an American accent.

Another example features phrases that might be more common in a call center setting, such as “if you give me your full name and order number, we can go ahead and start making the correction for you.”

The American-accented results sound somewhat artificial and stilted, like the voices of virtual assistants such as Siri and Alexa, but Pérez says the team is working to improve the technology.

“The accent changes, but the intonation is maintained,” he says. “We continue to work to make the result as natural and exciting as possible.”

Initial feedback from call centers testing the technology has been positive, says Pérez. The founders say their plans for the company netted $5.5 million in seed funding from investors earlier this year.

How the startup’s founders see the future

The investment allowed Sanas to increase its staff. Most employees at the company, which is based in Palo Alto, California, come from international backgrounds. And this is no coincidence, says Serebryakov.

“What we’re building has resonated with so many people, even the people we’ve hired … It’s really exciting to see,” he says. While the company is growing, it may still be a while before Sanas appears in an app store or on a cell phone near you.

The team says it is working with larger call center outsourcing companies for now, and opting for slower deployment for individual users so they can refine the technology and ensure security.

But eventually they hope Sanas will be used by anyone who needs it, in other areas as well. Pérez believes it can play an important role in helping people communicate with their doctors.

“Any second lost in misunderstandings because of timing or the wrong message is potentially very, very impactful,” he says. “We really want to make sure there’s nothing lost in translation.”

Someday, he says, it could also help people learn languages, improve voice acting in movies, and help smart voice assistants in homes and cars understand different accents.

And not just English: the Sanas team also hopes to add other languages to the algorithm. The three co-founders are still working on the details. But how this technology can make communication better in the future, they say, is easy to understand.

This content was originally created in English.

Reference: CNN Brasil