10/19/2018

Outfitting Avatars To Cross The Uncanny Valley

12:16 minutes

On the left, an image of a face; in the middle and on the right, two CGI faces under different lighting conditions based on the first image, rotating left to right.
In this simulation, the 3D faces were constructed based on a 2D image. Credit: Hao Li

What if, instead of searching YouTube for “cute cat videos,” you could just tell an artificial intelligence system that you were interested in watching a video of a puppy parachuting out of a plane? Virtual reality algorithms would then assemble this custom-built entertainment—which is exactly what you want, when you want it—without relying on creators to come up with the idea first. That’s just one of the many future applications of virtual reality and graphics technology imagined by Hao Li, CEO of Pinscreen and director of the Vision and Graphics Lab at the USC Institute for Creative Technologies.


In this segment, he talks about the big challenges for creating photorealistic avatars, and how face-swapping technology threatens our perception of what’s real in the news.


Segment Guests

Hao Li

Hao Li is the director of the Vision and Graphics Lab at the USC Institute for Creative Technologies. He’s also the CEO of Pinscreen.

Segment Transcript

IRA FLATOW: This is Science Friday. I’m Ira Flatow coming to you from the Civic Arts Plaza in Thousand Oaks, California.

[APPLAUSE]

Back in the good old days of black-and-white television, which some of you may remember. Yeah? Remember how the family gathered around a big, boxy TV set? Everybody had one. They watched whatever they found on the maybe five channels you had to choose from then.

And then you had cable, and you had an unlimited scroll of hundreds of channels. We had cooking shows. We had reality TV, news, old movies. And if that wasn’t enough variety, along came YouTube and video apps, allowing the whole family to watch, but not together.

Right? They’re not sitting around that TV anymore. They’re on their smartphones watching a limitless number of items.

But I want you to now imagine the possible future of video entertainment. Maybe instead of searching for something someone else has made, you’re going to just type in a few terms of interest, have your phone scan the photos of your loved ones in your library, and maybe it will spit out a completely customized video, a virtual world populated by people that you know. And they are the stars of the video.

Sound strange? Sound intriguing? Sound scary?

My next guest says it could happen. And he’s designing some of the technology to get us there. Hao Li is the CEO of Pinscreen and professor and director of the USC Institute for Creative Technologies at the University of Southern California in Los Angeles. Welcome to Science Friday.

[APPLAUSE]

HAO LI: Thanks for having me.

IRA FLATOW: Nice to have you. I want to go back a bit in history first, way back when Science Friday used to broadcast every show in the virtual world called Second Life. Do you remember Second Life?

HAO LI: Yeah, I do.

IRA FLATOW: I had my own avatar named Ira Flatley. I don’t know why it came out like that. But things have come a ways since then. And we are now in a whole different world. Catch us up on where we are.

HAO LI: Yeah. I think in the past few years, there’s a couple of things that have changed. First of all, I think graphics performance has changed a lot. You can probably tell from video games that things are looking more and more realistic. And sometimes, you can’t really tell: is my TV on, or is it actually a video game that the kids are playing?

And the second thing is also this entire movement with virtual reality, augmented reality, where suddenly we’re no longer watching a two-dimensional scene. We get immersed into it. So the idea is really to simulate the physical environment as if we were actually in there.

IRA FLATOW: Well, what first got you really interested in virtual reality and CGI, as they call it?

HAO LI: Oh, that goes way back. Over 30 years ago, my dad brought back a computer. It was an old Commodore 64.

IRA FLATOW: I had one of those.

HAO LI: OK.

IRA FLATOW: Yeah. Commodore 64– you had to hook it up to your TV set, right?

HAO LI: Right. Exactly. So you know, I was playing video games and then creating little programs. But then, I think in the ’90s, I watched two movies. One of them is Terminator 2. The other one is Jurassic Park. And then you suddenly get to see something where you can’t really tell whether it’s real or not. And this whole virtual content is really changing everything.

IRA FLATOW: And that’s what you do now– you work to create photorealistic avatars, onscreen avatars, that look like real humans.

HAO LI: Right. One of the hardest things to do when you’re working in the field of computer graphics and computer vision is to create a digital version of yourself or of any human. We are especially sensitive to how we look. We can tell if someone, you know, looks sick or not. And if you want to recreate a digital human, that’s one of the hardest things to do.

IRA FLATOW: And you use something called face swapping.

HAO LI: Right.

IRA FLATOW: That’s kind of cool.

HAO LI: It’s one of the applications.

IRA FLATOW: Explain what that is.

HAO LI: OK. So basically face swapping consists of the following. Imagine Mission Impossible, where they had an actual physical mask. Here, everything is digital. So all you need to do is look into a webcam or your iPhone camera. And then, what you can do is you can reenact as someone else– as me, for example.

IRA FLATOW: There must be a dark side to this.

[LAUGHTER]

HAO LI: Unfortunately, there is. Although we didn’t develop these types of technologies to fool people. I mean, we were trying to fool them in a different way, like anyone in visual effects. But the initial goal was to develop these types of virtual humans to change the way we would communicate in the future. We wanted people to be able to talk remotely as if they were actually there. And the other application is just gaming, right?

IRA FLATOW: Gaming, yeah.

HAO LI: Imagine you can play games with yourself in it or your friends in it.

IRA FLATOW: And one of the hardest things about making someone’s face is creating the tiny little defects and pockmarks and whatever on our faces. And you’re working toward actually making a smooth face really look like a real face.

HAO LI: Right. Actually, one of the hardest things about creating virtual humans is that you have to create all the digital models that simulate how light interacts with the skin. You have to capture all the details of your face. And from a single picture, you don’t have all this information. You have to turn it into 3D. You have to simulate how it would interact with the light around you.

IRA FLATOW: The lighting is important, because you could make light from any direction.

HAO LI: Right.

IRA FLATOW: You could light the face.

HAO LI: Exactly. Because you need to look like you are in a new virtual environment. And the way we solve this is that we just have massive amounts of data about human faces and then train the model to simulate this.

IRA FLATOW: Now, I’ve seen some demonstrations of this and where it is so real looking. I mentioned the dark side a little bit. But you could actually impersonate politicians and have them really– people believing that that’s what they’re saying.

HAO LI: That’s right. And there’s also this problem right now– and a lot of people talk about it in the news as well– with all these deepfakes, all these technologies that are out there and accessible to people who want to create malicious, manipulated information.

IRA FLATOW: If I were smart enough, which I’m not, and I looked at one of your composites, your face recognition, could I tell if it was fake or real by going inside it?

HAO LI: Yeah. You can definitely tell right now, almost with the naked eye, that this has been digitally manipulated. But all these technologies are also very new. So in a year or two, we might be at a point where it’s impossible to tell the difference.

IRA FLATOW: Impossible?

HAO LI: Yeah.

IRA FLATOW: Wow. That’s scary. Another thing that’s interesting about this is a few years ago, we were out here and we were talking with some actors. And we talked with an actor from Avatar, the movie Avatar. And he once said to us that he has had enough scans of his face and body done that they could make movies with him long after he was dead. Is that true? Is that where we’re heading with how accurate this stuff is?

HAO LI: For sure. I think right now the technology is not there to fully replace a human. But certainly, as I said before, in a couple of years we’ll be able to do this.

And there are also examples. If you put in enough resources and money, you can have a sufficiently good pipeline, especially in the VFX industry, when you have deceased actors. One good example is the movie Furious 7, where Paul Walker died in a car accident. They were able to make hundreds of shots of him in the movie without him being there as the actor.

IRA FLATOW: Wow. OK, very interesting. Let’s go over here, on the right. Yes?

AUDIENCE: It’s a pleasure to meet you. My brother and I are big fans of virtual reality, anything computer-related. I just had a question: could there be any medical applications to this stuff, like facial reconstruction or virtual therapy to help people?

HAO LI: Yeah. There are actually a lot of applications that are related to the face. One of them is basically just analyzing the face. So when you go to the doctor, when they look at your face, they can already tell, well, you know, how you are behaving, et cetera. In the long term, for example, for cancer treatment and these types of areas, one of the things is that you can have a quantitative way to measure pain and all these things. And as a matter of fact, at USC ICT, one of the research areas is really about analyzing the behavior of warfighters who come back and suffer from PTSD.

So if you have a quantitative way of analyzing your face, this is almost like the first step. When we build the 3D avatars, we’re basically looking at the shape of your face, the movements. And by having the ability to analyze this, you have a more accurate way of assessing if certain treatments work, if the person is healthy or not.

IRA FLATOW: Hmm. I know there are already avatars on Instagram with over a million followers. The avatar has a million followers. We have one here on the screen, Lil Miquela. Tell us about her.

HAO LI: Well, she’s a mystery. But Lil Miquela is like one of the most successful CG influencers.

IRA FLATOW: She’s totally phony, a totally made-up figure.

HAO LI: Yeah. Absolutely–

IRA FLATOW: And she’s got over a million followers.

HAO LI: –virtual, right. And I think this is a really interesting new phenomenon, where you have robots, virtual avatars, that are emerging and contributing to social media in a way that people would follow her, would want to interact with her.

And I think this is really the beginning. At some point, we’ll have chatbots that would react to people. She is already starting to carry brands.

IRA FLATOW: So the brand– well, they’re using her for advertising?

HAO LI: Right.

IRA FLATOW: I understand that in Japan they have even taken it a step further, something called Vtubers.

HAO LI: That’s right.

IRA FLATOW: What’s a Vtuber?

HAO LI: A Vtuber is basically– imagine these anime-style cartoons, where you are basically interacting with a cartoon on either YouTube or something that has live-streaming capabilities. And then they basically have an audience. And one thing that is really interesting is that they also perform concerts. And you have people who wear VR headsets and pay for tickets and attend a virtual concert with someone who is performing as someone else.

IRA FLATOW: Let’s talk about the last question I have, which we always talk about when we’re talking about CGI or facial stuff, and it’s something called the uncanny valley. And this is sort of an uncomfortable feeling that people get looking at something they see as a robot, and it’s human-like but not close enough to be a human. And it makes people feel a little queasy about it. Do you face that problem?

HAO LI: Yeah. That’s basically a way to measure our success in some ways. So basically, the uncanny valley is when you try to replicate a photorealistic human, and you try to get as real as possible. And if you’re not quite there, there’s always something very disturbing about the person. It may look freaky. It may look like a zombie. And the Holy Grail is basically to cross the uncanny valley so that you have the ability to generate a human that can fool us into believing it’s an actual person.

IRA FLATOW: Thank you, Hao. Thank you very much for taking the time to be with us today. Hao Li is the CEO of Pinscreen, professor and director of the USC Institute for Creative Technologies at the University of Southern California in Los Angeles.

Copyright © 2018 Science Friday Initiative. All rights reserved. Science Friday transcripts are produced on a tight deadline by 3Play Media. Fidelity to the original aired/published audio or video file might vary, and text might be updated or amended in the future. For the authoritative record of Science Friday’s programming, please visit the original aired/published recording. For terms of use and more information, visit our policies pages at http://www.sciencefriday.com/about/policies/

Meet the Producer

About Christopher Intagliata

Christopher Intagliata is Science Friday’s senior producer. He once served as a prop in an optical illusion and speaks passable Ira Flatowese.
