How can a voice interface navigate different accents and jargon?

Published by: Eran Soroka

Latin America, as a market, has more than 650 million people, and most of them speak Spanish. However, if you speak to a Mexican or a Chilean, you will find different pronunciations, accents, and even different words for the same concepts. So when you work for an international giant like Mercado Libre, a leading e-commerce technology company in Latin America, with branches in 18 Latin American countries, you have to take into account that a voice interface will face lots of challenges.

How do you navigate that? And where do you draw the line between a human-like feeling and an automated experience? For episode 37 of Taking Turns, we brought on Einath Apel, a semi-senior UX writer and former VUI designer.

First of all, how did you become a conversation designer before transferring to UX?

It’s either a strange or a very nice thing: we’re all conversational designers, because we design our conversations on a daily basis. So we are born naturals. From there, the only thing you have to do is pick up some tools. Personally, I always loved to write. Then I started as a conversational designer as soon as I got into UX and designing experiences for customers and users. By the way, I am not very fond of calling them customers or users. They’re persons. We have been, and still are, involved in conversations all the time.

What’s the bot or project you’re most proud of?

For me, it’d be the implementation of natural language processing inside the IVR of AT&T. First, it is a new technology. Second, it involved a lot of technical knowledge, aside from the UX writing itself. So that would be the one.

In Latin America, there are a lot of accents and jargon. So how hard is it to handle all of this and make it work?

That’s an interesting question. The AI tools we use, specifically Nuance and Watson, come with some pre-work, some pre-training data. However, you also have to feed them and train them to understand different jargon and accents. On one hand, it was very interesting; on the other hand, it was also very frustrating for people trying to get the machines to understand them.

In the beginning, it’s very difficult because there’s a learning curve. After the machine learns, it takes off, it flies, and the process becomes smoother. It still has challenges understanding some things, but that happens in all the major languages, at least.
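To make that training work concrete, here’s a minimal sketch of how regional variety might be folded into a single intent’s training data. The intent name, the utterances, and the train_intent() call are hypothetical illustrations, not the actual Nuance or Watson interface; the idea is simply that every intent gets seeded with phrasings from each country rather than one “neutral” Spanish:

```python
# Hypothetical sketch: pool region-specific phrasings into one intent's
# training examples so the classifier learns them all as the same request.
TRACK_ORDER_UTTERANCES = {
    "mx": ["¿dónde está mi paquete?", "quiero rastrear mi pedido"],
    "ar": ["¿dónde anda mi encomienda?", "quiero seguir mi compra"],
    "cl": ["¿dónde viene mi encargo?", "quiero trackear mi pedido"],
}

def build_training_examples(utterances_by_region: dict[str, list[str]]) -> list[str]:
    """Flatten per-country phrasings into one pool of training examples."""
    return [u for region in utterances_by_region.values() for u in region]

examples = build_training_examples(TRACK_ORDER_UTTERANCES)
# train_intent("track_order", examples)  # hypothetical call into the NLU tool
print(examples)
```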

But when a customer or a person starts a conversation with the voice assistant, it doesn’t know in advance in what accent the person will speak. So how does it work?

It has a large amount of information that it can process and interpret. However, if I, as a person, don’t understand you, I’ll say, “What are you saying to me? I’m sorry, can you repeat that?” The machine does the same thing. The first time, the person says something out loud in their own personal way and accent. If the machine doesn’t understand, it asks them to repeat; the second time, the person is more gentle and talks slower and clearer so it will understand. So on the second attempt, it’s more probable that the voice interface understands the person. We tend to do this in social situations, and the same thing happens with the IVR.
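As a rough illustration of that repeat-and-slow-down loop, here’s a minimal sketch of a confidence-threshold reprompt. The recognize() callback, the threshold, the attempt limit, and the prompts are all invented for the example; a real IVR would plug its own speech recognizer and prompt player into this shape:

```python
# Hypothetical reprompt loop: accept the transcription when the recognizer
# is confident, otherwise ask the caller to repeat, and hand off to a human
# agent after too many failed turns.
CONFIDENCE_THRESHOLD = 0.6
MAX_ATTEMPTS = 3

def listen_with_reprompts(recognize, play_prompt):
    for _ in range(MAX_ATTEMPTS):
        text, confidence = recognize()  # e.g. ("track my order", 0.82)
        if confidence >= CONFIDENCE_THRESHOLD:
            return text
        # People naturally slow down and articulate after a reprompt,
        # which is why the second attempt usually succeeds.
        play_prompt("Sorry, I didn't catch that. Could you repeat it?")
    play_prompt("Let me connect you with an agent.")
    return None

# Example with a fake recognizer that "succeeds" on the second, clearer try:
attempts = iter([("mumbled words", 0.3), ("track my order", 0.9)])
print(listen_with_reprompts(lambda: next(attempts), print))
```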


What’s the most important thing for a voice interface or a chatbot?

In my opinion, it’s finding the sweet spot between being a robot and being a person. It’s not quite natural for you, as a person, to interact with a machine and think you’re speaking to a person; actually, it’s really strange, even a bit cringeworthy, because the machine is not a person. On the other hand, you don’t want it to be too robotic. So you want a machine that tries to understand what the person is saying and is empathetic with them, but at the same time is not “too much of a person”. Sometimes people want to speak to other people, in cases when a machine can’t solve the problem. So finding that sweet spot may be the most important thing.

Seems like a thin line to walk, and like you had a lot of fun experimenting. Is there a specific anecdote you vividly remember in that context?

Not in my job, but I definitely experienced it as a person. For example, I have a bank account, and I always prefer chatbots because some of my questions are easily solved with a voice interface or a chatbot. But sometimes they’re like, “Hey, I’m Sophia. How are you? These are my choices”. Then I try to write or speak naturally, and Sophia doesn’t understand me. She wants me to find the number and tell her “1”. And most of the time, the presented options are not the options I’m looking for.

So if you are “Sophia”, let me speak to you more like you’re a person, not with numbers. First understand the technologies you have at your disposal, then develop a chatbot with a personality. It’s a thin line, but I celebrate all the progress in this area.
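Her complaint maps onto a common design choice, sketched below as a toy example: try to match the person’s free text to an intent first, and only fall back to the numbered menu when that fails. The keyword matcher stands in for a real NLU model, and the menu and intents are invented for illustration:

```python
# Toy router: understand free text first, keep "press 1" only as a fallback.
MENU = {"1": "check_balance", "2": "report_lost_card", "3": "talk_to_agent"}
KEYWORDS = {
    "check_balance": ["balance", "how much money"],
    "report_lost_card": ["lost", "stolen", "card"],
    "talk_to_agent": ["agent", "person", "human"],
}

def route(message: str) -> str | None:
    msg = message.lower().strip()
    if msg in MENU:                          # the rigid numeric-menu path
        return MENU[msg]
    for intent, words in KEYWORDS.items():   # the natural-language path
        if any(word in msg for word in words):
            return intent
    return None                              # unrecognized: reprompt or offer the menu

print(route("I think I lost my card"))  # -> report_lost_card
print(route("2"))                       # -> report_lost_card
```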


Do human agents even like voice assistants and chatbots? In a sense, these tools come to help them, but the humans may be afraid of being replaced by them.

Personally, I don’t think people are scared; after all, conversational design and the AI involved have made their work better. They receive information that has already been processed before it comes to them. Also, they work with the tool, to correct it, to feed it, and to consult it. Watson is not only for us as persons using the conversational platform; it’s also there for the contact center, so agents know what the user started out asking. So if anything, the voice interface made their work easier.
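As a rough sketch of that “already processed” information, here is what a handoff payload from bot to agent might look like. The fields and the send_to_agent() call are hypothetical, not a real contact-center API:

```python
# Hypothetical handoff: when the bot escalates to a human agent, it forwards
# what it has already worked out, so the agent does not start from zero.
from dataclasses import dataclass, field

@dataclass
class HandoffContext:
    transcript: list[str]           # everything the person already said
    detected_intent: str            # the bot's best guess at the problem
    entities: dict[str, str] = field(default_factory=dict)

def escalate(context: HandoffContext):
    # send_to_agent(context)  # hypothetical call into the contact-center tool
    print(f"Escalating '{context.detected_intent}' with "
          f"{len(context.transcript)} prior turns.")

escalate(HandoffContext(
    transcript=["quiero rastrear mi pedido", "el número es 12345"],
    detected_intent="track_order",
    entities={"order_id": "12345"},
))
```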

Can you speak about the differences between a UX writer and a conversation designer?

In conversational design, it all depends on where you work and what tools you have. Nowadays, I work for Mercado Libre, a huge company with very specific types of jobs. As a UX writer, I barely use other tools or touch other fields. The programmers are a separate team; I work with a designer, and we design the experience together. However, it’s a very specific role inside the design process.

Then, as a conversational designer, I got more in touch with the AI and ML tools. In some way, you have to be in touch with other tools that are not your specialty. 

Another difference I see between being a UX writer and a conversation designer: in UX, you also have to work with a lot of design and visual-design information. In conversation design, you have this constant dialogue with ML and other tools that, as a writer, I never imagined I’d have to interact with.

If somebody wants to become a conversation designer, what tips can you give them? Where to start?

Mainly, for every human-centered designer: always keep the person in mind. If you have that in mind, you can’t go wrong. Even if you make mistakes, you won’t stray far from the objective of designing experiences that get people where they want to go. Although the content is different, it’s all the same.

How big is the community of conversation designers in Argentina, in Latin America?

I think it’s very small. As a conversational designer, I didn’t even know the role existed until I was working in it. It’s a very new role. It’s growing, but it’s actually still pretty small.

Last thing: do you have any forecast for the future of conversational AI and voice interfaces?

In my opinion, there’s a great future for voice interfaces. After all, we are more and more often in situations where we cannot use our hands: driving, riding our bikes through the city, or just having the cell phone in a pocket and your EarPods in. So there’s a big opportunity for voice interfaces if we interpret the person’s context and the impossibility of continuing to look at screens and use our hands for everything. As voice assistants, ML, and NLP get there, it’s a big opportunity for these kinds of interfaces.

Overall, I don’t think humans can be replaced with machines. When you want to speak to a person, that’s it – there’s no interface that can solve it for you.