New Zealand company Soul Machines is on a mission to reverse engineer the brain and humanise AI interactions. And it’s making very good progress. Jihee Junn explores the rise of – and potential uses for – its ‘digital humans’.
In Spike Jonze’s 2013 film Her, Joaquin Phoenix plays Theodore Twombly, a lonely, impending divorcee whose job involves dictating other people’s handwritten letters. But when Samantha, an artificially intelligent voice operating system, enters his life, Theodore finds his emotional desolation offset by Samantha’s remarkably lifelike personality. Pithy, humorous, empathetic and even embarrassed at times, Samantha’s springboard of emotions enthralls Theodore, who admits to her that “I don’t feel like I can say [things] to anybody, but I feel like I can say [things] to you”.
While the film goes on to explore Samantha and Theodore’s blossoming (albeit doomed) romantic relationship, Samantha’s scope for emotional response conveys an oft-explored topic in the realm of science fiction: what if computers were capable of not just thinking, but feeling as well? In Ridley Scott’s 1980s film Blade Runner, android ‘replicants’ are so advanced they’re practically indistinguishable from humans, requiring a fine-tuned Voight-Kampff test (equivalent to today’s Turing test) in order to determine who’s who. Similar sentient beings can also be seen in recent hits like HBO’s TV series Westworld, Alex Garland’s film Ex Machina, and in the latest instalments of the Alien franchise, all of which depict technologically advanced androids capable of feeling joy, grief, anger, hope and even a desire for violent retaliation.
Although not quite at the hyperconscious level depicted in these science fiction classics, it’s entirely plausible to say that the technology hosted by Auckland-based company Soul Machines has come closer to making it a reality than anyone else. As the name suggests, Soul Machines creates emotionally intelligent, lifelike avatars (or, as it prefers to call them, ‘digital humans’) that act as a visual interface for customer service chatbots, virtual assistants and a host of other practical uses.
While artificial intelligence (AI) has become a term even the most technologically inept among us have become familiar with, emotional intelligence (EI) – the capacity to identify and manage one’s own emotions and the emotions of others – has been a term applied more commonly among psychologists than in computer programming circles. But as robotics and automation become increasingly ingrained into the workings of society, experts have realised that to extend the possibilities of AI, they must equip these technologies with the capability to form engaging interactions with humans. In fact, the inclusion of EI is what distinguishes Soul Machines from the rest of the pack: its avatars can recognise emotions by analysing an individual’s facial and vocal expressions in real time, while reciprocating these reactions with an unprecedented level of human-like response. Like AI, EI develops through experience – the more it interacts with you, the more emotionally sentient it gets.
These lifelike interactions can most notably be seen in several demonstrations of BabyX run by Soul Machines CEO and co-founder Dr. Mark Sagar. With a past career as Weta Digital’s special projects supervisor for blockbusters like Avatar, King Kong and Rise of the Planet of the Apes, Dr. Sagar joined the University of Auckland’s Laboratory for Animate Technologies in 2012 where he began to develop the BabyX technology that now underpins Soul Machines. BabyX, an interactive virtual infant prototype, appears on screen as a rosy cheeked, strawberry blonde, doe-eyed toddler. Just like a real child, BabyX whimpers and cries when it’s insulted or ignored, and smiles and coos when it’s encouraged or entertained.
While the technology behind Soul Machines has been a project several years in the making, it’s still a newcomer to the commercial realm, having only formally launched in 2016 after receiving a $7.5 million investment from Hong Kong-based Horizon Ventures. From the start, the company has attracted a huge amount of attention. Elon Musk’s biographer Ashlee Vance visited Sagar as part of his technology show Hello World; Bill Reichert, entrepreneur and managing director of Garage Technology Ventures, listed Soul Machines as one of the startups that impressed him the most during a recent visit to New Zealand; and in PwC’s 2017 Commercialising Innovation Report, Soul Machines was again cited as a prime example of “leading the way in the AI space”.
But the hype appears to be warranted. In February this year, Soul Machines unveiled ‘Nadia’ to the public, its virtual assistant developed for the NDIS (National Disability Insurance Scheme) in Australia. Designed to better help disabled people who traditionally struggle with technology interfaces, Nadia, whose voice was recorded by none other than actress Cate Blanchett, astounded many with her remarkably detailed physiology and astutely aware interactions.
July will mark one year since the company spun out of the University of Auckland, eventually trading its academic headquarters for an office in Auckland’s historic Ferry Building. With just nine full time employees at the time of its commercial launch, Soul Machines now boasts more than 40 people on its burgeoning staff roster. And while the company already has plenty to pride itself on, it certainly isn’t resting on its laurels just yet. Having recently returned from debuting Soul Machines at the Cannes Lions Festival in France, chief business officer Greg Cross says that he and Dr. Sagar conducted a total of 28 presentations in four days, demonstrating their offerings to marketing officers from companies like Mazda, Subaru, Airbnb and Booking.com.
Cross, who’s been part of the Soul Machines team since its launch last year, has had a big hand in helping to commercialise Dr. Sagar’s remarkable innovation. A serial tech entrepreneur who’s helped build companies all over the world, Cross is palpably excited about Soul Machines when I speak to him at the company’s Ferry Building office, admitting that in his 30 years working in the tech industry, he’s never had so much fun in his life.
“The cool thing about this technology is that it’s only really limited by your imagination,” he says. “[Nadia] was an amazing first project for us because you’re providing services to people that have historically not been very well serviced. You’re providing many of them the ability to be more independent and get information directly rather than have to work through third parties or have to wait for hours or even days to get someone to talk to.”
“You can imagine building digital teachers to provide education to kids who don’t have access to teachers. You can imagine providing digital service agents for refugees where governments can interface and interact with them in a simple and easy manner. This is what’s really exciting, every time you sit down and talk with somebody, you come up with a different use case.”
With some of these use cases, it’s not just speculation fuelling them either. Although Cross is tight-lipped about the specific companies involved, he says Soul Machines is currently in the process of building another female digital human for a big software company in Silicon Valley, as well as developing its first AR/VR project for a media company in the UK. And perhaps indicative of its impending launch into the financial sector, Soul Machines introduced its latest digital human, Rachel, on stage at the LendIt Conference in New York City. Powered by IBM Watson’s AI and Soul Machines’ EI (as was Nadia), Rachel demonstrated to an audience of FinTech executives how she could help customers pick out the ideal credit card not just efficiently, but conversationally as well.
A face in the crowd
Perhaps one of the most extraordinary things about Rachel (other than the complex neural network platforms that support her) is that her appearance is based on a real-life person. And not just any person, but a Soul Machines employee sitting just three metres away from where Cross and I converse.
“Real Rachel is actually an avatar engineer. She spends half the day talking to herself. She’s got the weirdest job on the planet,” he remarks.
The creep factor
In the 1970s, Japanese robotics professor Masahiro Mori noticed that when he showed people the robots he built, the more vaguely human his robots appeared, the more positively people reacted. But as he began to improve his robots by adding more lifelike features, such as synthetic skin, he found that most people were more repulsed than impressed. He eventually hypothesised that without human characteristics, robots were simply less interesting. But gift them with too many human characteristics, they can generate a sense of disquiet and dread.
The chasm between nearly human and fully human is what Mori identified as the uncanny valley. The discussion around the uncanny valley has been raging for years now, particularly with the advent of highly developed CGI techniques. When the film Shrek was first test screened during the early 2000s, its young audience of children were left mortified by the hyperrealism of the character Princess Fiona. As a result, DreamWorks Animation reworked the look of Fiona to make her seem more like the cartoon that she is and less like the human she seemed to be simulating.
Avoiding the uncanny valley is a difficult task. Characters either have to be photorealistic (practically indistinguishable from real humans, like Blade Runner’s replicants) or charmingly stylised (like in Pixar’s Wall-E). It’s clear that with every pore, freckle, lash and line carefully rendered in each of its avatars, Soul Machines is aiming for the former rather than the latter. And while many respond to Soul Machines’ avatars with astonishment and excitement, others have a more disconcerted reaction.
“You get the full range of reactions when people see our technology,” explains Cross. “When Mark does his BabyX demo, most people’s jaws will hit the floor at some point during the presentation. People are just blown away…[but] there are still people who find the concept of AI and robots creepy. They look at Westworld and are horrified.”
While Soul Machines treads the delicate line between real and creepy, its founder isn’t quite as concerned about falling into the uncanny valley as he is with the avatars establishing a deep connection. “The brain reacts differently to something it perceives to be alive versus something which it perceives to be inanimate,” Dr. Sagar recently told VentureBeat. “If you ever see a realistic eye looking at you, you’re much more likely to respond than if you see a cartoon eye looking at you.”
Part of what makes Soul Machines’ digital humans so visually lifelike is that, like Rachel, they’re all based on real-life people. Its most recent digital human, for example, is based on Filthy Rich star Shushila Takao, making her the first professional actress to have her ‘likeness’ licensed for use as an avatar.
Building a digital human is a three-stage process that takes approximately eight weeks. The first stage is visual, starting with a 3D scan of the individual candidate that is used to build out the graphics for the face. The second stage involves the character component, where a personality is built and a series of emotional states that it’s allowed to express are formed. Finally, in the third stage, the avatar is brought to life using the company’s core computing technology before it’s ready to be used.
With the rise of the internet of things (IoT) and the proliferation of technologies like Amazon’s Alexa, Apple’s Siri, Google’s ‘OK Google’ and Microsoft’s Cortana, interactive AI has already become somewhat ubiquitous. But our interactions with these programmes have so far been confined to a voice emanating out of an inanimate object. Dr. Sagar and his company believe that talking to something that looks a lot like a human is far more likely to encourage individuals to be more open about their thoughts and expressive with their face, allowing a company to pick up additional information about what drives its customers.
“The human face is incredibly engaging. We’re naturally programmed to look at and interact with them,” says Cross. “The way we look at it is over the next period of time, we’re going to be spending a lot more time interacting with machines and AI. Whether it’s a virtual assistant on a website to a concierge that sits inside your self-driving car, the more we can humanise computing, the more useful it’s going to be for us.”
While many may doubt that an artificially rendered face could elicit such genuine response from human beings, numerous cases have proven otherwise. In 2015, Japanese researchers found that when subjects were exposed to images of robot hands and human hands being sliced with a pair of scissors, EEG scans showed that images of both types of hands elicited the same neuropsychological response. Even non-human looking robots subjected to violence can generate a strong sense of empathy. In one MIT experiment, participants were asked to play with small, mechanised dinosaurs called Pleos. When they were eventually asked to torture their Pleos, many refused and even found the exercise unbearable to watch. And when a bomb-defusing robot was left crippled, burnt and brutalised during a routine military test in the USA, an army colonel brought the test to a halt, charging that the exercise was “inhumane”.
If this type of emotional response can be elicited from humans in reaction to non-humanlike robots, it would be natural to assume that humanoid machines with hyper-realistic features can make an even deeper, more meaningful connection with those who interact with them on a regular basis. After all, when participants in the pilot of Nadia were asked if they’d use her again, 74 percent responded positively, indicating they’d be happy to use a digital human as their primary means of interaction with the government.
“A lot of focus is on moving to that voice interface. But our view is that voice only takes you so far. The analogy we talk about is what happened to radio when television came along. Television was a much more engaging, entertaining and interactive experience. Just talking to a voice can get irritating at times.”
Building the DNA factory
For businesses and brands looking into Soul Machines’ offering, the excitement derives not just from the potential increase in efficiency and customer satisfaction, but from the fact that they could be employing one of the very first digital employees in the world. Because each employee can be customised in both character and physical appearance, each digital human that’s created exhibits its own unique set of personal traits.
Nadia, who was designed by people with disabilities for people with disabilities, is relatively “conservative, very empathetic and not overly emotionally expressive at this point in time”. Due to the nature of Nadia’s role, it was important she didn’t end up expressing an inappropriate emotion in reaction to something she saw from someone with cerebral palsy or autism, for example.
At the other end of the scale, a digital human developed in the form of Nathan Drake – a character in Sony PlayStation’s Uncharted series – is a much more outgoing character who’s humorous and full of bravado, while BabyX, being an infant, is much more spontaneous in her reactions than her adult counterparts. When it comes to Rachel, who’s a virtual customer service agent, she exhibits more breadth in her personality, with her emotional states ranging anywhere from sassy to conservative depending on whether she’s talking to a 50-something business person or a 20-something college student.
While it currently takes about eight weeks for Soul Machines to build a digital human according to its customers’ wants and needs, Cross says it’s hoping to streamline its avatars by creating a “DNA factory”, reducing the process down from weeks to days, and eventually, from days to hours.
“By capturing somewhere between 20 to 30 digital humans of different age groups, ethnicities and genders, we’ll be able to create digital humans from that digital DNA without having to start from scratch,” he says.
“When we’re working with big corporates, they often have quite strong views on the design phase. But my personal view is that in the long term, they’ll move away from [the idea of having] a digital brand representative and instead have 20 to 30 digital employees from which their consumers can choose to interact with. Do you want to talk to someone who speaks Chinese? Do you want to interact with a male or female? Or would you prefer a cartoon character because digital humans aren’t your thing? I think people’s approach to this will change quickly, but it’s still very early days.”
The shock of the new
While sentient beings have long featured in modern day science fiction, it goes without saying that cultural instances of emotional and artificial intelligence coming together have, for the most part, exhibited a cynically dystopian slant. In Ex Machina, the humanoid robot Ava manages to escape the locked-down facility by emotionally manipulating a young programmer, leaving him trapped in a room to presumably starve to death. In Stanley Kubrick’s 1960s epic 2001: A Space Odyssey, the ship’s computer, HAL, famously goes rogue, taking control of the pods and turning off the life support systems of all of its crew on board. Even as far back as the 1860s, essays like Samuel Butler’s ‘Darwin among the Machines’ argued that mechanical inventions were undergoing constant evolution, and that eventually, the human race would be supplanted from its status as the dominant species.
The examples are endless when it comes to showing how technology could turn from a state of benevolent subservience to malevolent self-interest. Having been conditioned with this recurring narrative over the years, it’s no surprise apprehension has been the prevailing reaction among those exposed to AI/EI beings. Combined with the natural technophobia that arises when new technologies are introduced and the relative lack of understanding around AI in New Zealand, highly advanced companies like Soul Machines have a lot to contend with.
“Very few people on the planet have actually had a chance to have a live interaction with one of our digital humans, so it’s an intellectual thing,” says Cross. “I think it’s one of those things that you just have to experience. There’s always that percentage of people who will be completely turned off. It’s like all new technology. There’s a percentage of people who don’t like Facebook, there’s people who don’t like voicemail, and a lot of people don’t use Siri and that’s their choice.”
Cross underscores that despite people’s fears that robots will reduce the number of paid employment opportunities in the near future (a suspicion that dates back to the first Industrial Revolution), technology like Soul Machines’ is on course to enable humans to do more with their lives rather than less.
“We’ve had this concept of a 40 hour work week for a very long time now. But what happens if we only had to work 20 hours? We can spend more time in our communities, more time with our families. Is that such a bad thing for mankind? I don’t think so.”
And while science fiction’s cynical narrative prevails in most instances in pop culture, it’s important to note that not all fictional computers have made it their covert mission to annihilate the human race. Just ask David Hasselhoff’s favourite pal on four wheels.
“Think KITT the car in the TV show Knight Rider,” says Cross. “It was a personality in a car. It had flashing lights but it didn’t have a face. Now, we actually have the opportunity to give KITT a face, or put a face inside a luxury car. We can see how we can make that science fiction a reality.”
Let’s just hope that science fiction is a little more Knight Rider and a little less Westworld.