Technalysis Research
USAToday Column

July 29, 2016
What happens when the digital assistants get (really) good?

By Bob O'Donnell

The human voice is one of the most personal and distinctive things about each of us, and it is also one of the most sought-after means of interacting with our devices. Voice-based interfaces have, in fact, been promised as the next big breakthrough for several decades.

Until recently, though, they've been much more future promise than current reality. Even after the much-heralded introduction of Apple's Siri back in October of 2011, spoken interactions with computing devices have often been more comical than helpful.

Beginning with Amazon's Alexa, however, the voice assistant behind its Echo line of devices, the notion of natural, human-style interaction with computing devices is starting to take hold. Part of the reason is a set of important technological improvements that are driving higher degrees of accuracy, which, in turn, are helping generate more useful responses.

Early versions of Siri, Google Now, and even Microsoft's Cortana relied on more traditional natural language processing technologies built around hand-coded sets of rules. Amazon's Alexa and later versions of the other voice-based personal assistants are starting to incorporate deep learning and neural networks, which essentially learn new rules on their own by analyzing massive data sets.
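The distinction can be illustrated with a toy sketch. Everything below is hypothetical: the rules, training examples, and function names are invented for illustration, and a simple keyword-overlap scorer stands in for the neural networks these assistants actually use at vastly larger scale.

```python
import re

# Approach 1: hand-coded rules. Brittle by nature -- only the
# phrasings the developer anticipated will ever match.
RULES = {
    r"\bweather\b": "get_weather",
    r"\b(set|create)\b.*\b(alarm|timer)\b": "set_alarm",
}

def rule_based_intent(utterance: str) -> str:
    text = utterance.lower()
    for pattern, intent in RULES.items():
        if re.search(pattern, text):
            return intent
    return "unknown"

# Approach 2: learned from examples. A toy word-overlap score
# plays the role of a trained model: it generalizes from data
# instead of relying on fixed patterns.
TRAINING = [
    ("will it rain tomorrow", "get_weather"),
    ("is it cold outside", "get_weather"),
    ("wake me at seven", "set_alarm"),
    ("remind me in ten minutes", "set_alarm"),
]

def learned_intent(utterance: str) -> str:
    words = set(utterance.lower().split())
    best_intent, best_score = "unknown", 0
    for example, intent in TRAINING:
        score = len(words & set(example.split()))
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent
```

The rule-based matcher returns "unknown" for "will it rain tomorrow" because no rule mentions rain, while the example-driven version maps it to the weather intent; that gap in coverage is exactly what drove the shift the column describes.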

As the accuracy of these different systems improves and heads toward the ultimate goal of easily and correctly understanding exactly what a person said and, more importantly, meant, it raises an interesting theoretical question. If the end result is the ability to respond accurately to a human voice-based query or comment, how do these different digital assistants, and the platforms they are part of, distinguish themselves from one another?

To really be useful, all voice-based UIs would arguably need to act and react to our requests in essentially the same way, thereby undercutting the value and uniqueness each could otherwise provide: a rather interesting dilemma.

Unlike apps and other visual ways of getting things done, the whole point of a voice-based interaction is that it should be very simple and very consistent. In other words, if I ask a question, or request that an action be taken on my behalf, the end result should be the same regardless of which system I speak to. Otherwise, these systems wouldn't provide the consistency and usefulness we expect from natural voice-based interactions.

Of course, there’s bound to be some degree of difference in the accuracy of the recognition across different voice-based user interfaces, as there is today. In addition, the level of intelligence in the response that comes as a result of that recognition will also differ. Relatively quickly, however, it’s fair to assume that all the different voice-based personal assistants will be good enough to be reasonably useful.

Then, the question is, why use one versus another?

Now, underneath it all, they will undoubtedly function differently and will likely respond somewhat differently to different requests. However, they cannot respond too differently, or they risk being unnatural and unfriendly and, therefore, less likely to be used. Of course, just as we inevitably find ourselves more drawn to conversations and interactions with certain people versus others, so too might we find ourselves more drawn to certain “digital” personalities than others (sometimes without really even knowing why).

The trick will be figuring out what sorts of things drive those more favorable interactions. As a result, there will still be an opportunity for multiple providers in the voice-based UI world, but the type of competition and differentiation between potential options is going to be significantly different, and more subtle, than typical product or platform differentiators.

Another interesting potential issue is determining the manner in which these personal digital assistants might interact, directly or indirectly, with one another. If you use a voice-based assistant on one platform to schedule a meeting or book a flight, for example, but then try to make changes through another platform's assistant, how might the two systems interact with one another?

Truly natural voice-based interactions and personal digital assistants are still in their infancy, so these problems don’t have to be solved overnight. However, it’s clear to me that figuring out these kinds of challenges is going to be critical for each of the main platform and voice UI providers. Given the inevitable and ongoing platform wars between not only Apple, Google and Microsoft, but now Amazon, Facebook and likely others as well, the answers to these dilemmas will be driving tech platform agendas for some time to come.


USA TODAY columnist Bob O'Donnell is the president and chief analyst of TECHnalysis Research, a market research and consulting firm that provides strategic consulting and market research services to the technology industry and professional financial community. His clients are major technology firms including Microsoft, HP, Dell, and Qualcomm. You can follow him on Twitter @bobodtech.