Voice assistants are becoming a central platform for gathering data and streamlining everyday life for consumers. Amazon’s Alexa, for example, can now control more than 85,000 smart home products, from TVs to doorbells to microwaves; together with Google Home and offerings from startups, these assistants have become omnipresent companions in millions of homes.
Of note, voice assistants are beginning to shift from passive to proactive interactions: rather than waiting for requests, the voice assistants of the future will anticipate what you want.
We note below some of the key trends that are driving the future evolution of AI-based voice assistants:
Trend 1: Voice Assistants Will Try to Become Your Best Friend
Voice-enabled devices appeal to a broad spectrum of people because they are easy to use, physically unobtrusive, and fun to interact with.
FrontPorch, a nonprofit that works with retirement communities, has deployed voice assistants in the homes of elderly people. FrontPorch found that voice assistants are particularly helpful to older people with poor vision, and the project has experimented with using them to help people with dementia know where they are when they are confused about their surroundings.
Users come to rely on their voice assistant for everyday tasks: What’s the weather? Remind me of a lunch appointment. What’s the meaning of this word? They also grow friendly with it. Part of the appeal is convenience: they like that they can talk to a machine 24/7 and ask questions. But they also say they have a new friend at home. In the morning they get up and say good morning, and when they go to bed, they say good night.
Just as often, the seniors spend a lot of time with Alexa simply chatting, which helps mitigate loneliness. Loneliness, of course, is an urgent concern: it has been linked to a higher likelihood of depression and anxiety, as well as an increased risk of heart attack, stroke, and death.
Tellingly, one of the most popular requests from users is the ability to rename their Google Home so they don’t have to start every command with “Okay, Google.” They think of it as their friend and want to give it a name.
Trend 2: Voice Assistant Software Will Become MUCH Better
Dr. Boris Katz of MIT is eminently qualified to critique the quality of software used by voice assistants. Over the past 40 years, Katz has made key contributions to the linguistic abilities of machines. In the 1980s, he developed START, a system capable of responding to naturally phrased queries. The ideas used in START helped IBM’s Watson win on Jeopardy! and laid the groundwork for today’s voice assistants.
According to Katz, the field currently relies on decades-old technology. Today’s Alexa, Google Home, and Siri, while far more widely available, are not a whole lot better than what Katz created many years ago at MIT.
As we at Omega Venture Partners have stated before, if you look at machine-learning advances, many of the ideas originated 20 to 30 years ago; today’s engineers are making those ideas a reality. This technology, as impressive as it is, will not solve the problem of real understanding, of real intelligence. Modern AI/ML techniques are statistical and very good at finding regularities. Because humans produce much the same sentences much of the time, it is easy to build systems that capture those regularities and act as if they are intelligent.
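To make the point concrete, here is a minimal sketch (our own illustration, not Katz’s work or any vendor’s implementation) of a purely statistical responder: it memorizes which reply most often followed each utterance in a log of past conversations and simply repeats it. The class and method names are hypothetical.

```python
from collections import Counter, defaultdict

# Toy illustration: a "conversational" system that only captures regularities.
# It memorizes which reply most frequently followed each utterance in past
# dialogues and repeats it. No understanding is involved.

class RegularityBot:
    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, dialogues):
        # dialogues: iterable of (utterance, reply) pairs from logged conversations
        for utterance, reply in dialogues:
            self.counts[utterance.lower().strip()][reply] += 1

    def respond(self, utterance):
        replies = self.counts.get(utterance.lower().strip())
        if not replies:
            return "Sorry, I didn't catch that."
        # Return the statistically most common reply seen after this utterance
        return replies.most_common(1)[0][0]

bot = RegularityBot()
bot.train([
    ("Good morning", "Good morning! The weather today is sunny."),
    ("Good morning", "Good morning! You have lunch at noon."),
    ("Good morning", "Good morning! The weather today is sunny."),
])
print(bot.respond("good morning"))  # looks helpful, but only mirrors frequency
```

Within the narrow band of utterances it has seen before, such a system looks responsive; outside that band, the absence of real understanding shows immediately.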
Katz is conducting pioneering AI research built on ideas from developmental psychology, cognitive science, and neuroscience. The goal is to allow AI models to reflect what is already known about how humans learn and understand the world, and to be “intelligent” in a much more genuine sense.
Trend 3: Expect Business AI to Dominate in the US; Consumer AI to Emerge Faster in China
We recently showcased Dr. Jin Rong, the Head of Alibaba’s Machine Intelligence and Technology Lab, for our portfolio companies. What surprised us was that Alibaba already has a voice assistant considerably better than Google’s or Amazon’s. It navigates interruptions and other tricky features of human conversation while fielding millions of requests a day.
In side-by-side comparisons that we conducted at MIT and shared with our network, we found that Alibaba’s agent smoothly handled the three most common, and trickiest, conversational ingredients: interruption, nonlinear conversation, and implicit intent.
These three ingredients are commonplace in human conversations, but machines often struggle to handle them. That Alibaba’s voice assistant can do so suggests it is more sophisticated than Google Duplex, judging from similar sample calls demonstrated by Google.
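To illustrate what these ingredients mean in practice, the sketch below (our own toy illustration, not Alibaba’s or Google’s implementation) shows one simple way a dialogue manager might represent them: cue phrases for yielding the turn on interruption, a topic stack for nonlinear conversation, and a mapping from indirect phrasings to goals for implicit intent. All names and cue lists are hypothetical.

```python
# Toy sketch of the three conversational ingredients in a customer-service
# setting: interruption, nonlinear conversation, and implicit intent.
# Our own illustration; all cue lists and names are hypothetical.

INTERRUPTION_CUES = ("wait", "hold on", "actually")
IMPLICIT_INTENT_CUES = {
    "nobody will be home": "reschedule_delivery",
    "i won't be home": "reschedule_delivery",
}

class ToyDialogueManager:
    def __init__(self):
        self.topic_stack = []  # remembers set-aside topics (nonlinear conversation)

    def handle(self, user_utterance):
        text = user_utterance.lower()

        # Interruption: the user cuts in, so stop talking and yield the turn.
        if text.startswith(INTERRUPTION_CUES):
            return "Sure, go ahead."

        # Implicit intent: infer the goal the user never states directly.
        for cue, intent in IMPLICIT_INTENT_CUES.items():
            if cue in text:
                self.topic_stack.append(intent)
                return "It sounds like you'd like to reschedule the delivery. What day works?"

        # Nonlinear conversation: jump back to a topic that was set aside earlier.
        if "back to" in text and self.topic_stack:
            return f"Okay, back to your {self.topic_stack.pop()} request."

        return "How can I help with your order?"

dm = ToyDialogueManager()
print(dm.handle("Wait, hold on a second"))
print(dm.handle("Nobody will be home tomorrow morning"))
print(dm.handle("Let's go back to that"))
```

Production systems learn these behaviors from data rather than hand-written rules, but the three problems they must solve are the ones sketched here.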
Alibaba’s agent is trained on the massive volume of customer recordings at the company’s disposal (to the tune of 50,000 customer service calls per day), in addition to other resources. Its biggest advantage in this field is the overwhelming wealth of consumer conversational data available to train its AI models. As a result, the assistant learns and improves more quickly because it gets more practice handling many different types of situations. We expect this abundance of consumer data to be an advantage for China’s companies. Conversely, the massively greater amount of digitized business data in the US creates a distinct advantage for American companies.