Google Duplex: Exploring Natural Language AI
By Allen Foster
We are fast approaching the point of no return. That critical moment when we will find out if the intelligent machines we've created, the ones that learn and evolve, will serve us or rise up against us. Will mankind achieve paradise or extinction? That's the fear that resides at the heart of Google Duplex: is it an astounding technological leap forward or a terrifying shuffle closer to our own annihilation?
If you haven't heard, Google Duplex is the much-hyped feature that is gradually learning to master natural language, but only in extremely specific situations. At least for now. Recently, Google released and promoted two recorded examples of human interactions with Duplex. In one instance, the software made dinner reservations, while the other involved getting a hair appointment. Both conversations transpired over the phone. Admittedly, it's not the kind of activity that should have us concerned about the longevity of the human race. But for some reason, it still does.
The uneasiness comes from the fact that, presumably, the humans on the receiving end of the call were completely unaware that they were engaged with a computer program – go ahead, call it a robot if you want. They were fooled because the program listened, interpreted, and correctly responded (with proper pauses and inflections) to all the complexities involved in a natural conversation.
To us humans, this might not sound like such a remarkable feat. As a matter of fact, we do it all the time, mostly while only half paying attention to the conversation. But that is exactly what makes Duplex so remarkable. It can deduce meaning from ambiguity and understand complex, poorly communicated ideas. It also interjects during pauses that are a little too long, fills the thinking space with "hmm" or "uh," and can actually help put the conversation back on track by clarifying elements that were previously only vaguely stated. Duplex even knows how to hesitate so that its response has maximum impact!
Again, is that amazing or scary?
If it's to serve us so we stop getting that infuriating "I'm sorry, I didn't get that" when we query our devices, I say, "Hallelujah!" However, if it's to make robocalls indistinguishable from real people, that would ultimately be rather disheartening. If elements such as pitch, accent, and speaking mannerisms can eventually be mimicked, maybe that Nigerian prince will start calling instead of sending all of those emails?
Whether it excites you or frightens you, communication is the core of understanding. If, at some point in the future, you can have a heartfelt conversation about why the season finale of This Is Us left you so emotionally distraught, or you want to debate exactly when (or if) The Walking Dead lost its magic, and you can do it with your television?! Come on, admit it. That would be kind of awesome!