November 24, 2014

Exploring the boundaries of digital assistants

Future vision storytelling

“What does internet search look like in 2017?”

—Our design team, 2014

Every once in a while, studio managers will gather up a handful of designers with the hopes of peering into the future. The goals vary from year to year as a function of the competitive landscape, available technology, and studio leadership. In general, there are a few common goals which I think are fairly constant:

  1. Obviously, we want to stay relevant, blending the latest technologies with experience patterns that are easy to use and not outdated.
  2. We aim to find a balance between fantasy and reality. Too much science fiction is unattainable, while too much reality is uninspiring and boring.
  3. We want to inspire and keep designers, developers, product managers, and organizational budget controllers motivated. It’s always a challenge to effectively communicate these somewhat abstract ideas to such a variety of perspectives.


Squares, circles, and arrows

Goals aside, the actual process is always messy—but it’s a wonderful mess to be a part of! It’s refreshing for designers to step out of their typical roles, use a different typeface, throw out their grids, let their hair down, and tap into their raw, creative problem-solving talents.

Whiteboard session for the Gnome on the Range vignette

Within our group, we all brainstormed and storyboarded key concepts. Then we went back to our desks, wrote the scripts, and comped up some visuals. Once we had iterated, edited, and refined the work far enough, we handed everything off to a local production company to film and edit.

Photo composition of phone and tablet with future concepts of image-based search engine

Left: Gnome-selector app. Right: Unicorn-powered visual search engine, landscaping mode

At the time, the promise of A.I.-fueled digital assistants easing the tension in complex human-computer interactions was just beginning to emerge. We highlighted a few alternate input scenarios. People using their voices and cameras as input seemed to teeter on the sci-fi side of things back then, but both have since become ubiquitous in our lives.

Aside from the purely technical constraints of voice and image input modalities, I was most fascinated by the potential for contextual awareness these devices had. For example, in the Found in Translation script I wrote, the actor uses her phone for real-time language translation. That’s definitely cool, but she’s out of her element and about to make a significant cultural faux pas by gifting a clock to an older colleague. The device she’s using to translate is the same device connected to an array of semantic databases, so it has the ability to give her contextually relevant information despite her not explicitly asking for it.

A few years later, I worked on Cortana and got into the nitty-gritty of implicit and explicit information delivery. It’s a lot more nuanced and complicated than it seems, but it’s still a worthy pursuit that shows great promise in easing that tension between humans and computers.

Found in Translation product vignette

Gnome on the Range product vignette

My mother-in-law is Chinese. The first time she came to spend some time with us in New York, I wanted to make her feel at home in the guest room. I also wanted to buy her a gift. So, I somehow ended up frantically trying to find a gift for someone I didn’t know very well. Long story short, I came across a really cool-looking clock in a shop downtown. I was so proud of my purchase. Who doesn’t need another clock? When I got home and showed off my prize purchase to my wife, she covered her smiling mouth and explained that there was no way in hell I was giving this clock to her mother.

Live and learn.

