Emotional awareness is intuitive to us. We are wired to know when we and others are feeling angry, sad, disgusted… because our survival depends on it.
Our ancestors needed to monitor reactions of disgust to know which foods to stay away from. Children observed reactions of anger from their elders to know which group norms should not be broken.
In other words, decoding emotional expressions and their contextual nuances has served us since time immemorial.
Enter: AI.
Presumably, artificial intelligence exists to serve us. So, to build truly ‘intelligent’ AI that adequately serves humanity, the ability to detect and understand human emotion ought to take center stage, right?
This was part of the reasoning behind Microsoft’s and Apple’s forays into AI-powered emotion recognition.
Turns out, it’s not that simple.
Inside ≠ Out
Microsoft and Apple’s mistake was two-pronged. First, they assumed that emotions come in defined categories: Happy, Sad, Angry, etc. Second, they assumed that these categories have equally defined external manifestations on your face.
To be fair to the tech behemoths, this style of thinking is not unheard of in psychology. Psychologist Paul Ekman championed these ‘universal basic emotions’. But we’ve come a long way since then.
In the words of psychologist Lisa Feldman Barrett, detecting a scowl is not the same as detecting anger. Her approach to emotion falls under psychological constructivism, which basically means that emotions are simply culturally specific ‘flavors’ that we give to physiological experiences.
Your expression of joy may be how I express grief, depending on the context. My neutral facial expression may be how you express sadness, depending on the context.
So, once we know that facial expressions are not universal, it’s easy to see why emotion-recognition AI was doomed to fail.
It’s Complicated…
Much of the debate around emotion-recognition AI revolves around basic emotions. Sad. Surprised. Disgusted. Fair enough.
But what about the more nuanced ones… the all-too-human, self-conscious emotions like guilt, shame, pride, embarrassment, jealousy?
A substantive assessment of facial expressions cannot exclude these crucial experiences. But these emotional experiences can be so subtle, and so private, that they do not produce a consistent facial manifestation.
What’s more, studies on emotion-recognition AI tend to use very exaggerated “faces” as training examples to feed into machine-learning algorithms. This is done to “fingerprint” the emotion as strongly as possible for future detection.
But while it’s possible to find an exaggeratedly disgusted face, what does an exaggeratedly jealous face look like?
An Architectural Problem
If tech companies want to figure out emotion recognition, the way AI is currently set up probably won’t cut it.
Put simply, AI works by finding patterns in large sets of data. This means that it’s only as good as the data we put into it. And our data is only as good as us. And we’re not always that great, that accurate, that smart… or that emotionally expressive.
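To make the "only as good as the data" point concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the two face features, their values, and the labels. It is a toy nearest-centroid classifier, not a description of how any real emotion-recognition product works.

```python
from math import dist

# Hypothetical training set: exaggerated, posed faces reduced to two
# made-up features (brow lowering, lip-corner raise), each scaled 0 to 1.
TRAINING = {
    "angry": [(0.9, 0.1), (1.0, 0.0), (0.8, 0.1)],
    "happy": [(0.1, 0.9), (0.0, 1.0), (0.1, 0.8)],
}


def centroid(points):
    """Average the feature vectors that share one label."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


# "Learning" here is just remembering the average face for each label.
CENTROIDS = {label: centroid(pts) for label, pts in TRAINING.items()}


def predict(face):
    """Return the label whose average training face is closest."""
    return min(CENTROIDS, key=lambda label: dist(face, CENTROIDS[label]))


# Someone concentrating hard on a puzzle: a scowl, but no anger.
concentrating = (0.7, 0.2)
print(predict(concentrating))  # prints "angry"
```

The model sees a scowl and answers "angry", because that is the only pattern its training data taught it. It has no way of knowing that the person is simply concentrating.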