Why Machines Discriminate—and How to Fix Them

Some believers in big data have claimed that, in big data sets, “the numbers speak for themselves.” Or in other words, the more data available to them, the closer machines can get to achieving objectivity in their decision-making. But data researcher Kate Crawford says that’s not always the case, because big data sets can perpetuate the same biases present in our culture, teaching machines to discriminate when scanning resumes or approving loans, for example.

And when algorithms do discriminate, computer scientist Suresh Venkatasubramanian says he tends to hear expressions of disbelief, such as, “Algorithms are just code—they only do what you tell them.” But the decisions that machine-learning algorithms spit out are a lot more complicated and opaque than people think, he says, which makes tracking down an offending line of code a near impossibility.

One solution, he says, is to screen algorithms more rigorously, testing them on subsets of data to see if they produce the same high-quality results for different populations of people. And Crawford says it might be worth training computer scientists differently, too, in order to raise their awareness of the pitfalls of machine learning with regard to race, gender, bias, and discrimination.
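As a rough illustration of that screening idea, the sketch below measures how often a model's decisions are correct for each population separately, then reports the gap between the best-served and worst-served group. The records, group labels, and function name are all hypothetical; this is a minimal sketch of a per-group check, not a description of any audit the guests actually ran.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return {group: accuracy} for (group, predicted, actual) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, actual in records:
        total[group] += 1
        if predicted == actual:
            correct[group] += 1
    return {g: correct[g] / total[g] for g in total}


# Hypothetical screening results: (group, model decision, true outcome).
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 0),
    ("B", 0, 1), ("B", 0, 0), ("B", 1, 1), ("B", 0, 1),
]

per_group = accuracy_by_group(records)
# A large gap means the model works well for one population and poorly
# for another, even if its overall accuracy looks acceptable.
gap = max(per_group.values()) - min(per_group.values())
print(per_group, gap)
```

A single overall accuracy figure would hide the difference between the two groups here, which is why the check is done on subsets.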

Segment Guests

Kate Crawford

Kate Crawford is a principal researcher for Microsoft Research and a visiting professor at the MIT Center for Civic Media in Cambridge, Massachusetts.

Segment Transcript

IRA FLATOW: This is Science Friday. I’m Ira Flatow.

Let’s say you apply for a new job. Would you rather have your resume judged by a person or an algorithm, software, computer? The fact is you may have no choice. If you’ve applied for a job at a big company recently, chances are you may have already been subjected to this trial by machine.

And in theory that should be a good thing, right? Take the human out of the equation– all the bias, the baggage, the discrimination. Be judged purely on your merits. But it turns out that machines resemble their human makers much more than we might have imagined.

My next guest says that if our algorithms are fed data from a prejudiced world, they can end up with many of the same biases that we do. Kate Crawford is a principal researcher at Microsoft Research, Visiting Professor at the MIT Center for Civic Media. Welcome to Science Friday.

KATE CRAWFORD: It’s a pleasure to be here, thanks, Ira.

IRA FLATOW: And we want to ask our listeners, we want to hear from them– would they rather be judged by a human or an algorithm? Has an algorithm ever discriminated against you, to your knowledge? So give us a call, 844-724-8255. Also, you can tweet us @scifri.

Kate, there is a lot of hype about the power of big data. Why, in your view, do we have to treat the data and what machines are doing with a bit more skeptical eye?

KATE CRAWFORD: Well, big data is very exciting for lots of reasons. It gives us capacities that we simply didn’t have before. But there’s also this little trap that we can tend to look at large collections of data as somehow being more objective and more representative when this is not necessarily the case.

And we can make the same sort of miscalculation if you will about algorithms– that because they’re systematic we assume that they’re somehow more objective than humans. But what we’re starting to see is that we can have elements of bias in both the data sets that we’re using and in the algorithms themselves.

IRA FLATOW: Give us an idea. What kind of real life examples.

KATE CRAWFORD: Sure. Well, I first started researching this several years ago now when I was looking at the way that we use social media data to understand natural disasters. And what you find if you’re using say, for example, Twitter data, is that in many cases Twitter tends to be used by people who have a little bit more money, often a smartphone, often they’re urban, often they skew younger. So if we’re starting to use that data to really try to understand where to direct resources in a natural disaster, you’re missing part of the picture.

Then I started to look at how apps might be reproducing this form of bias. So there’s an app that’s actually quite fascinating called Street Bump which is used in the Cambridge area. Now, this was designed for really good reasons. It’s actually very smart. What it does is it basically tracks your accelerometer and your GPS data as you’re traveling down the road so when you hit a bump it’ll say, oh, there’s a bump, and if a few other people hit the same bump, then the city knows to send out a road crew and–

IRA FLATOW: There’s a pothole there.

KATE CRAWFORD: And repair that pothole. Absolutely right.

But what I started to do is to look at who owned smartphones in the Boston area, and, unsurprisingly enough, it tends to map with people with more disposable income and also younger audiences. Particularly if you look at the over 65s, you’ll find that in some cases smartphone penetration is as low as 16%.

So what you end up doing is having an app which in many cases is accentuating the signals from areas that have younger, wealthier residents, and we’re not getting the signals from areas where we have people who have less money or generally older populations. So what that could mean is cities could start to be resourced and repaired based on signals that are really skewing towards those who are already pretty privileged.

So I started to think about both how the data collection can produce particular kinds of, shall we say, black holes and really boosting the signal of populations who are already, let’s face it, very well-connected.
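The arithmetic behind that skew can be sketched in a few lines of Python. The 0.16 figure is the low-end smartphone penetration quoted above; the 0.80 figure and the count of 100 drivers are invented for illustration, and the function name is hypothetical. It is a back-of-the-envelope model, not how Street Bump itself works.

```python
def expected_reports(drivers_hitting_bump, smartphone_rate):
    """Expected number of app reports if only smartphone owners can report."""
    return drivers_hitting_bump * smartphone_rate


# Same real-world problem in both areas: 100 drivers hit the pothole.
# 0.16 is the penetration figure quoted in the conversation;
# 0.80 is a made-up figure for a younger, wealthier area.
older_area = expected_reports(100, 0.16)
younger_area = expected_reports(100, 0.80)

# How much louder the second area looks in the data for the same pothole.
signal_ratio = younger_area / older_area
print(older_area, younger_area, signal_ratio)
```

Under these assumed numbers, an identical pothole generates several times as many reports in one neighborhood as in the other, so a city ranking repairs by report count would be ranking by phone ownership as much as by road condition.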

IRA FLATOW: So if you were to look at that example and you say how– knowing that the data could be biased– how do I unbias it, maybe then you put the detectors on garbage trucks that go everywhere.

KATE CRAWFORD: And that’s exactly right, and that’s actually exactly what they did. And the group– the New Urban Mechanics who built this app are actually very sensitive to making sure that this data is equitable and they’re getting a good picture of the city.

So what they did is they put this app along all kinds of council vehicles, from garbage trucks through to police cars so that they get a wider signal. That was a really good example of a quick response and thinking about how you try to unbias your data. But in many cases that’s a lot harder to do, particularly if you’re using historical data.

IRA FLATOW: Now, everybody– the first thing they do when they want to know something is that they Google it. Are we getting biased data coming back from a Google search?

KATE CRAWFORD: Well, it’s really interesting. So a group of CMU researchers recently published a paper looking at what happens if you Google for particular jobs. And they actually created a very clever system that was automated and they created a lot of fake profiles, half of which were female and half of which were male. And they used this to test what kind of job ads were being shown. And very problematically what they found was that all of the really high-paying jobs– the CEO level jobs, the things that were $200,000 plus– were being shown far more to men and they weren’t being shown to women.

So that produces a very serious bias that, of course, impacts upon itself because not only do we have a problem with not enough women in the C-suite, now they’re not even seeing the ads for those positions. So absolutely we’ve seen particular forms of search bias across all of the search engines, and this is something obviously that’s very serious for technology companies to think about.

IRA FLATOW: I want to bring in another guest. Suresh Venkatasubramanian is an associate professor in the School of Computing at the University of Utah in Salt Lake City. He joins us from KUER there today. Welcome to Science Friday, Suresh.

SURESH VENKATASUBRAMANIAN: Thanks for having me.

IRA FLATOW: Let’s talk about– before we get ahead of ourselves too much– let’s get into the nitty gritty about the software itself which at its heart contains these algorithms. What actually is an algorithm? You had a great article in Medium comparing algorithms to recipes. Let’s talk about that.

SURESH VENKATASUBRAMANIAN: Sure. So one of the things we do in our undergraduate classes in computer science is to teach people what it means for something to be an algorithm. And the usual story we tell them goes something like this– that you should think of an algorithm as a set of instructions that takes some inputs and produces some outputs, and the best way to think about this is like a recipe.

So you have a recipe for making something. It has a set of instructions that are hopefully well-defined and easy to follow– not always, but most of the time. And your inputs are the ingredients and the output, hopefully, is the food you want to make.

And so we’ve been telling this story for a long time, and I think it’s a very good story. It captures a vast majority of the kinds of algorithms we work with, but what it does not do anymore is capture the kinds of machine-learning algorithms that we’re actually using in the cases that Kate mentioned.

By the way, Kate, honor to meet you. I loved your article on the six provocations of big data, by the way.

KATE CRAWFORD: Oh, that’s very kind.

SURESH VENKATASUBRAMANIAN: So–

KATE CRAWFORD: Thanks, Suresh.

SURESH VENKATASUBRAMANIAN: So the better analogy and what I wrote about in the article was something a bit more involved. It looks something like this, if you permit me to tell a little story.

So Thanksgiving is coming up and we’re all going to be making, among other things, stuffing. And you could, of course, get a stuffing recipe from a book and make it. But let’s suppose you didn’t do it that way. You didn’t use the algorithm for making stuffing that way. You did something different.

What you did was you called 10 of your friends over to your house three days before Thanksgiving and said, bring me your examples of stuffing that you like and if you can, bring me an example of stuffing you don’t like. So they come and bring the examples and they leave their bowls there marked off with what they like and don’t like and they go away.

And you taste the different stuffings and you have some idea, OK, I need these ingredients. I need maybe, I don’t know, bread crumbs or I need cornmeal or I need celery, and you make something. And you make something and they come back the next day and tell you what they like and didn’t like and you do this over and over again.

Eventually you come up with a set of instructions that might involve mixing the ingredients in a bowl, twirling three times on one foot, spinning it over your head, and it works great. You write these instructions down very faithfully and you send it out to everyone saying these are the instructions for making stuffing. And they make the stuffing and it works, and if they come and ask you, well, why do I have to twirl three times, your answer is, I don’t know. It just seemed to work.

And, now, if you did this in Salt Lake, you might get one recipe. You did this in New York City you might get a different recipe. They will all be recipes for stuffing but they’re all very different. They all come from different sources, and it’s very hard to tell why they are the way they are. And that’s what these algorithms are nowadays. That’s exactly how they look.

IRA FLATOW: And does that mean– does that explain how the algorithms get written? Is the software actually writing its own algorithm then if it’s different in every city?

SURESH VENKATASUBRAMANIAN: Essentially yes.

IRA FLATOW: Wow.

SURESH VENKATASUBRAMANIAN: They are being trained so the algorithm takes in some data, it trains itself on some data. It has what I was describing– it plays a game of 50-dimensional roulette essentially trying to find a spot to land in a very, very high-dimensional space and it lands somewhere and where it lands is the final algorithm.
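To make the stuffing analogy concrete, here is a minimal Python sketch in which the same training procedure, fed two different sets of examples, settles on two different rules. The learner is a deliberately tiny one (a single-threshold "decision stump"), chosen only because it fits in a few lines; the ingredient quantities, city datasets, and function name are all invented for illustration and are far simpler than the high-dimensional systems described above.

```python
def train_stump(examples):
    """Learn a one-feature threshold rule from (features, label) examples.

    Tries every feature and every observed value as a threshold, and keeps
    the rule "predict 1 when features[i] >= t" that makes the fewest
    mistakes on the examples. Returns (feature_index, threshold).
    """
    best = None
    n_features = len(examples[0][0])
    for i in range(n_features):
        for t in sorted({x[i] for x, _ in examples}):
            errors = sum((x[i] >= t) != bool(y) for x, y in examples)
            if best is None or errors < best[0]:
                best = (errors, i, t)
    _, feature, threshold = best
    return feature, threshold


# Features: (cups of bread crumbs, cups of cornmeal). Label: 1 = liked.
# Both datasets are hypothetical taste-test results.
salt_lake = [((3, 0), 1), ((2, 1), 1), ((0, 2), 0), ((1, 3), 0)]
new_york = [((3, 0), 0), ((2, 1), 0), ((0, 2), 1), ((1, 3), 1)]

salt_lake_rule = train_stump(salt_lake)
new_york_rule = train_stump(new_york)
print(salt_lake_rule, new_york_rule)
```

The code that does the training is identical in both cases; what differs is the data, and therefore the rule that comes out. Nobody wrote either rule by hand, which is why there is no single offending line to point to when the result is unfair.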

IRA FLATOW: Wow. That’s many things. So that’s how then it might filter down to a resume because we’re feeding in all these biases of our own, all the different recipes and that pops something that we don’t recognize or we don’t really want.

SURESH VENKATASUBRAMANIAN: Yes. And one of the problems is that humans are very good at trying to find patterns in data. Machines are also very good at trying to find patterns in data, whether or not they’re meaningful. So if finding a pattern gives you some edge in terms of prediction accuracy or how well you can match the results that the human told you you should match– the stuffing that your friends told you was good for example– you will find these small patterns.

So an algorithm that scans resumes might even say, oh, I notice that this– when people use this kind of font it has a high correlation with being productive, so we should– this is an important feature. Is it or isn’t it? I don’t know. Maybe it is. Maybe it isn’t. But it could do things like that and then it’s hard to understand why it’s doing that.

IRA FLATOW: And, Kate, and big companies are doing resume scanning like this?

KATE CRAWFORD: Absolutely. This is really common at the moment, and I think the big problem is that, as Suresh points out, there can be frankly just errors in how they’re understanding what productivity looks like.

In many ways we could think about the process of being hired as always having an algorithm. You met somebody who would look at your CV. They would decide what a productive person looks like and acts like and talks like. It’s just that now there’s algorithms basically within systems– in many cases, a black box– so you really don’t see how you’re being judged.

And that’s actually part of the problem. I like this idea of the cooking analogy because I think one of my favorite descriptions of how we can think about these kinds of big scale data systems comes from Geoff Bowker who said that raw data is both an oxymoron and a bad idea. That in fact your data should be cooked with care. And in many of these cases, these are systems where the data is being cooked but we don’t know necessarily the level of care that’s being brought to it.

IRA FLATOW: I like that. I’m going to go to the phones. Back in the day when I was studying computers, it was garbage in, garbage out.

Let’s go to Tod in Oakland, California. Hi, welcome to Science Friday.

TOD: Hi, thanks for having me.

IRA FLATOW: Go ahead.

TOD: So, yeah. So I work at Google and awhile ago I built a hiring algorithm that did a pretty good job at predicting who’d be successful, so I thought I would just weigh in a bit.

KATE CRAWFORD: I’d love to hear what kind of criteria you’re using to determine what success looks like there.

SURESH VENKATASUBRAMANIAN: And what algorithm you’re using.

TOD: Yeah. So I can’t give away the secret algorithm, but I can say that we looked at both were you successful in getting hired but also once you got hired, we looked at 30 different performance metrics from performance scores to organizational citizenship behaviors to all sorts of different performance metrics to see once you’re there, are you a good performer.

KATE CRAWFORD: And what were some of the unexpected findings in terms of things that would make somebody likely to stick around?

TOD: So we thought going in that GPA would be an important variable but what we found was it’s only important for the first three years after somebody graduates and then after that it loses all its predictive power. So we kind of changed what we were looking for to once you’ve been out of school for three years we don’t really look at your GPA.

IRA FLATOW: So what was the most successful predictor?

TOD: A really interesting one was the age you got into computers compared to your peers. And we didn’t ask you what the absolute age was. We just asked compared to your peers. So if you’re 50 and you’re getting into computers and all your friends don’t get into it for another 10 years, you actually get in younger, and that actually tended to correlate really well with success on the job.

IRA FLATOW: Let me just remind everybody that I’m Ira Flatow. This is Science Friday from PRI– Public Radio International, talking about–

SURESH VENKATASUBRAMANIAN: So I have a question for Tod, actually–

IRA FLATOW: I know. You’re chomping at the bit, Suresh. Go ahead.

SURESH VENKATASUBRAMANIAN: Sorry. I was wondering what kind of training data do you use?

TOD: We used everybody that had come to our systems for I think about a year and then looked at their performance data a couple years later.

SURESH VENKATASUBRAMANIAN: OK.

IRA FLATOW: Did you have to change anything? Do you now have a fully cooked– because we’re using a recipe analogy here– do you have a fully cooked method down?

TOD: Well, I agree with the other two callers that it uses the input in the hiring process and we get a lot of pieces of information and we don’t ever just turn to the algorithm and say who do you want to hire, but it’s better at combining some of the pieces that we do have and then we still turn it back to the human to decide who should we hire.

KATE CRAWFORD: Can I ask a little question about where you might think bias could emerge in a system like this? And I’m thinking, for example, about gender bias which is obviously a very big issue in the technology sector and we’re trying to increase the number of female engineers and programmers but it’s still at a very low level.

I’m thinking about the fact that quite often boys are encouraged to get into computers earlier than girls for a whole range of reasons. Do you think that might also skew that particular marker that you have that says age that you get interested in computers relative to your peers, do you think that might then have some sort of gender flow on effects in terms of who’s being seen as a hireable candidate?

TOD: Well, it’s actually not something we ask about when we hire for people. It’s just what it shows is interest in technology, and that was one way in the validation study that we asked it. But when people come in and do interviews we ask everybody all sorts of different questions that just gets into are you curious about technology. So it didn’t just ask you when did you buy your first computer but did you take apart your phone at home and other ways that you got into technology.

IRA FLATOW: How do you address jobs that are not technological– teaching, welding, anything like that, or are you just talking about technological jobs?

TOD: Well, for us we’re more interested in that because we hire technologists. But I think what we’re looking for is a passion in the subject matter that you’re being hired for. So if you’re hiring welders you might say what age did you get interested in welding stuff together or paper hangers or different things. It’s just kind of passion for the thing that you’re being hired for.

SURESH VENKATASUBRAMANIAN: So one of the interesting things that comes up in this– and I’m not implying that this is directly affecting the algorithm that you guys are using at Google– is that even if you don’t look at attributes like gender or race, it’s well known that it’s easy to essentially pick up on correlated attributes that could give you this information effectively. And, in fact, one of the things that we’ve been looking at in our research is can you make some assessment of the potential for bias by discovering these correlations?

For example, if you’re not looking for gender but if you’re looking at things like when you first got into computing, as Kate mentioned. If there’s a disparity in when boys and girls first start getting into computing, it’s quite conceivable that with some collection of attributes you could essentially predict the gender attribute that the algorithm might effectively be doing even if you’re not explicitly asking it to do so. And I’m wondering if you’ve ever investigated any possibilities for things like that?
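One simple way to probe for the kind of correlated attribute described here is to ask how well a protected attribute can be guessed from a single feature, compared with guessing blind. The Python sketch below does that with made-up data; the feature name, the counts, and both function names are hypothetical, and it is a much cruder check than the research mentioned in the conversation.

```python
from collections import Counter, defaultdict


def proxy_accuracy(pairs):
    """How often the protected attribute can be guessed from a feature alone.

    For each feature value, guess the most common protected value seen with
    it, then report the fraction of records guessed correctly.
    """
    by_value = defaultdict(Counter)
    for feature_value, protected in pairs:
        by_value[feature_value][protected] += 1
    hits = sum(counts.most_common(1)[0][1] for counts in by_value.values())
    return hits / len(pairs)


def baseline_accuracy(pairs):
    """Accuracy of always guessing the most common protected value."""
    counts = Counter(protected for _, protected in pairs)
    return counts.most_common(1)[0][1] / len(pairs)


# Hypothetical applicants: (when they got into computing, gender).
pairs = (
    [("early", "M")] * 4 + [("early", "F")] * 1
    + [("late", "M")] * 1 + [("late", "F")] * 4
)

print(proxy_accuracy(pairs), baseline_accuracy(pairs))
```

If the feature lets you guess the protected attribute much better than the baseline does, a model that uses the feature can effectively use the protected attribute too, even though nobody ever gave it that column.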

TOD: Well, the comparison group is peer group, so that’s kind of one of them, so I guess if your peers were fellow girls and you still got in younger, that would help. But I think the larger point is that we ask when people apply to us for a whole ton of different pieces of information– work samples, they come in and interview, they code while they’re there, and all these are different signals. And then at the end of the day we still are asking a hiring committee, a group of peers, to look at all the pieces of information and still decide.

IRA FLATOW: All right, thanks for calling us, Tod, and don’t be a stranger.

TOD: All right, thanks.

IRA FLATOW: Have a good holiday. Interesting people listening to Science Friday.

All right, we’re going to come back– take a short break and come back and talk about some more about hiring and maybe see if you’d rather have a computer or a person do your job interview. I don’t know. Maybe it doesn’t matter anymore.

Well, we’ll talk about it. Stay with us. We’ll be right back after this break.

This is Science Friday. I’m Ira Flatow. In case you joined us, you’ve joined us into a good hour. We’re talking this hour about algorithms and prejudice hidden in computer code with Kate Crawford, principal researcher at Microsoft Research, Visiting Professor at MIT Center for Civic Media. Suresh Venkatasubramanian. He’s an associate professor in the School of Computing at the University of Utah in Salt Lake City. Our number, 844-724-8255. Lots of discussion.

Kate, let me ask you, let’s say a company found that its algorithms were indeed discriminating against job applicants but it saves a ton of money to use machines and wants to stay with the machines. Is there any incentive for the company to change this? Any way for it to do that?

KATE CRAWFORD: This is a big risk right now that there just simply isn’t, in some cases, enough incentive for them to stop. If it’s really saving them that much money and they get a few error cases here and there, it’s tempting to stay with the system. And this is part of the reason that, certainly in my work, one of the things I’ve been calling for is how do we have due process with big data systems?

And due process is an idea that’s been around since the Magna Carta. We’ve had it since 1215. But when it comes to big data systems, what does due process look like? And at the very least, I think it has to begin with letting people know that they are being judged by these large, algorithmically-driven systems, and it also should allow people to see the data that’s being used about them and, in some cases, correct the record if it’s bad data.

We all know that there are many systems that have just incorrect data about us. I like taking that browser test which predicts how old you are and what gender you are and I quite frequently come up as a 29-year-old male. I think that says something more about my music tastes than anything else and possibly my film taste. But we know that incorrect data is out there, so how do you let people know that they’re being judged by systems and that they can actually intervene. I think that’s really important.

IRA FLATOW: Let’s go to the phones. Let’s go to Michaela in Kalamazoo, Michigan. Hi there.

MICHAELA: Hi, thanks for taking my call.

So I work at the writing center at my college. I’m a college senior. And as a person I feel like there’s very much a formula for writing a resume. There’s words you’re supposed to use and not use and I hate the idea of a computer weeding through that even more because as a person we worked so hard on these resumes and like that’s all they see us as is this piece of paper.

So my method is to try to use the same words that are going to get you through but have a different voice, something that’s going to catch the person who’s reading it, catch their attention. But if it’s a computer, can they pick up on a voice? Can they really get more of who you are?

KATE CRAWFORD: It’s a great question and in some cases there are interesting ways that you can write algorithms to look for unusual turns of phrase. But perhaps what’s more interesting here is that you are being judged for things that you’re probably not even thinking about in your resume like, for example, your address.

So there was one HR firm that has been using an algorithmically-driven system that gives people extra credit if they live within a close radius of the workplace, if they’re sort of within a 15 minute commute, because their data showed that if you had a longer commute you were more likely to quit or to be fired within a year, which basically means that the investment for them is being lost.

So what that also means is that they just start to hire people who live nearby, which can have a whole range of other discriminatory functions. Particularly if you’re in an area where it’s expensive to live there, you’re then immediately tending to hire people who have more money and more access to urban areas. So there’s a lot of questions that we might ask about how that resume scanning works. But keep in mind that when you’re trying to find a beautiful turn of phrase, they might just be looking at your address and then that’s actually the real problem in terms of discrimination.
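The effect of a rule like that can be checked by comparing how often it favors applicants from different groups. The Python sketch below applies a 15-minute cutoff, taken from the example above, to two hypothetical groups of applicants and computes the ratio of their pass rates. The commute times, group names, and function names are invented. The 0.8 threshold in the final comment refers to the "four-fifths" rule of thumb used in US employment guidance, which the speakers do not mention; it is included only as one familiar yardstick.

```python
def selection_rates(applicants, rule):
    """Fraction of each group whose commute passes `rule`."""
    passed, total = {}, {}
    for group, commute_minutes in applicants:
        total[group] = total.get(group, 0) + 1
        passed[group] = passed.get(group, 0) + (1 if rule(commute_minutes) else 0)
    return {g: passed[g] / total[g] for g in total}


def impact_ratio(rates):
    """Lowest group pass rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody passes in any group, so no group is favored
    return min(rates.values()) / highest


# Hypothetical applicants: (where they live, commute in minutes).
applicants = (
    [("city_center", m) for m in (5, 10, 12, 14, 20)]
    + [("outlying", m) for m in (10, 25, 30, 40, 45)]
)

rates = selection_rates(applicants, lambda minutes: minutes <= 15)
ratio = impact_ratio(rates)
# A ratio well below 0.8 is often treated as a signal worth investigating.
print(rates, ratio)
```

The rule never looks at income or neighborhood demographics, yet under these assumed numbers it passes one group four times as often as the other.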

IRA FLATOW: Suresh, you got a comment on that?

SURESH VENKATASUBRAMANIAN: Yeah. Another example I’ve heard of is where the system will scan the layout of the various text blocks on the page and use that because of some pattern they found that people lay out blocks in certain ways is a better resume, is more likelihood of someone being a good employee than if they lay it out in some other fashion.

It sounds ludicrous, but again this is one of the things that there’s such a vast choice of patterns that a machine can pick up on. It’s not just the text and it’s not just the words you use. But having said that, there are algorithms that, yes.

For example, in music, there are algorithms that claim they can pick up on tone, they can pick up on characterization. So I think there’s a lot of work on trying to pick up on things like tone and, of course, can you tell someone’s gender from the text they use and the words they use? This is something that’s been around for a long– so there are lots of things you can do with that.

IRA FLATOW: Is there any research that shows that you get a better interview by a person or by computer? Your chances? And should you request to be interviewed by a person if you’re going for a job?

KATE CRAWFORD: I’ve yet to see this get down to the interview stage I have to say, Ira. I think at this point there’s still humans at that interface. But it’s the back end that I’m more worried about.

IRA FLATOW: Could you ask in your resume to be looked at by a person?

SURESH VENKATASUBRAMANIAN: No, but in fact there are companies that are offering the following service where rather than go in for an interview you sit down with Skype and talk to a machine. It sends you questions, you answer. They record the video, they analyze the video, they create a bunch of features and a bunch of estimators, and then give it to a human.

IRA FLATOW: See, I’m ahead of my time.

KATE CRAWFORD: You are.

SURESH VENKATASUBRAMANIAN: So this is happening now, yes.

IRA FLATOW: Wow. Wow. But beyond thinking of a job or a loan application, are algorithms being used by law enforcement in many ways that might actually–

SURESH VENKATASUBRAMANIAN: Oh, yes, everywhere. So I just came back from a workshop that Data & Society ran in DC on data and civil rights and it was all about use of data in various stages of the criminal justice pipeline, going from predictive policing to sentencing guidelines to parole guidelines, recidivism rate estimation. It’s been used all over the place.

And that’s, I think, in some ways an even more serious and consequential use of algorithms than in hiring. Not that hiring is less important, but– and there’s a lot of work being done. It’s being pushed all over the place, and there’s a lot of questions about the effectiveness and the value and the ethical use of these methods.

IRA FLATOW: Well, and now with terrorism attacks are we going to see more of our algorithms being used, do you suspect, Suresh? Or–

[INTERPOSING VOICES]

SURESH VENKATASUBRAMANIAN: Probably. They already are being used. I don’t think it’s– they’re already being used all over the place. I don’t think there’s a question of more. There’s a lot that’s happening that we don’t know about and there is definitely a lot of profiling that’s happening inside at that level. The NSA has a gigantic facility out here in Utah, among other things. So, yes, I think they are going to be doing a lot more of it and that’s exactly the problem, I think.

The thing is this– going back to the example with the caller from Google. Google in many ways in what he describes is doing the right thing. They have a number of different features. They have a number of different predictors. And the hope is that if you put enough of these things together that you hope and, again, you hope that biases may cancel out. Again, you don’t know but you hope so.

The problem is– and as Kate mentioned– when you have bad data, right? You have incorrect data.

There was this horrific story I heard a couple of years ago on This American Life about someone whose name was similar to another known terrorist and was put into an FBI watchlist and 15 years later, even though the FBI had cleared him as being uninvolved with any activities, could not get his name off the watchlist and that bit that was set up with his name was propagating into credit applications and loan– things like that, and he just couldn’t get it out of there.

And so what Kate is saying makes perfect sense. You need a way to verify in some ways whether the data being used for you is accurate in any form. And this is a big problem, I think.

KATE CRAWFORD: Yeah, and I think that’s absolutely right, Suresh. And I’m really interested in what we do about it because I’m concerned about the kind of discrimination we’re seeing against entire groups– be they African American, be they women, be they people who live in rural areas, you name it. We’re seeing a form of group discrimination often occur in these kinds of systems.

But there are things we can do about it. I think one of the things we’ve been discussing today is how do you have internal systems that are checking the discriminatory outcomes? A lot of technology companies are looking into that. Another thing you can do is external audits. Can we actually really test who’s getting a loan from this company, who’s getting housing from this company, who’s getting jobs over here, and that’s another way to look at it.

But perhaps the thing that I think is most exciting to me right now that I think we can do pretty much immediately is to really think about this concept of data ethics. How do we train a data scientist to be thinking about these questions of discrimination and fairness? When they’re in school, when they’re actually learning how to use these techniques.

It’s actually very powerful to think about these questions in an educational setting so that when you’re out in the world designing these systems these are the first questions you ask yourself. Because, as we know, these systems do encode forms of human blindness and sometimes human bias. And if we’re basically just thinking about who is affected by the system and who might I not be including, that can actually be very powerful.

IRA FLATOW: All right.

SURESH VENKATASUBRAMANIAN: And I’d like to go further than that even. I think along with the education that is incredibly important– and I have been educated myself just by looking at this– we have to look at the algorithms very carefully. Some of the research that I and others and many of us have been looking at is how do you instrument the algorithms to do exactly what Kate’s asking– detect potential for bias and even try to repair bias?

There actually are ways now where you can take a black box algorithm that someone’s using and modify the data that’s going into it to essentially remove signals that could be potentially discriminatory. Now, there are a lot of technical details in how this works but this is some of the research that I’ve been doing.

And there are ways– we’re still very much in the early days of doing this research– but there are ways we can do this. Essentially we’re trying to formulate a mathematical way of describing bias and describing how to be fair– how algorithms could be fair– and trying to implement that into algorithms. So there are lots of things we can do, and I think we need a lot more study of this, and there is a growing interest on the technical side of things in how to do this.
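One very simplified way to picture "modifying the data that’s going into" a black box is sketched below. It is not the method from Venkatasubramanian’s research, whose technical details the interview does not give; it is a stand-in that replaces each value of a feature with its percentile rank inside its own group, so the feature keeps the ordering of people within a group but loses the group-level shift. The function name `repair_feature`, the scores, and the group labels are assumptions made for illustration.

```python
from collections import defaultdict

def repair_feature(values, groups):
    """Replace each value with its rank-based percentile within its own group.

    Each group's repaired values become evenly spaced percentiles
    between 0 and 1, so the within-group ordering is preserved while
    differences between the groups' value ranges are removed.
    Tied values are ranked in their original input order.
    """
    by_group = defaultdict(list)
    for index, group in enumerate(groups):
        by_group[group].append(index)

    repaired = [0.0] * len(values)
    for indices in by_group.values():
        ordered = sorted(indices, key=lambda i: values[i])
        n = len(ordered)
        for rank, i in enumerate(ordered):
            # Midpoint percentile: (rank + 0.5) / n lies strictly between 0 and 1.
            repaired[i] = (rank + 0.5) / n
    return repaired

# Made-up example: group B's raw scores are all shifted lower than group A's.
scores = [700, 720, 740, 600, 620, 640]
groups = ["A", "A", "A", "B", "B", "B"]
repaired = repair_feature(scores, groups)
print(repaired)
```

After the repair, the two groups in this toy example have identical feature values, so a downstream model can no longer infer group membership from this feature alone. The cost is that any genuinely predictive difference between the groups is discarded as well, which is one reason a trade-off like this would need careful study.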

IRA FLATOW: And we’ll be following this.

I want to thank both of you for taking time to be with us today. Suresh Venkatasubramanian is Associate Professor in the School of Computing at the University of Utah in Salt Lake. Kate Crawford, Principal Researcher at Microsoft Research and Visiting Professor at the MIT Center for Civic Media. Thank you both.

KATE CRAWFORD: Thank you, Ira.

[INTERPOSING VOICES]

IRA FLATOW: –have a happy holiday season.

Copyright © 2015 Science Friday Initiative. All rights reserved. Science Friday transcripts are produced on a tight deadline by 3Play Media. Fidelity to the original aired/published audio or video file might vary, and text might be updated or amended in the future. For the authoritative record of Science Friday’s programming, please visit the original aired/published recording. For terms of use and more information, visit our policies pages at http://www.sciencefriday.com/about/policies.

Meet the Producer

About Christopher Intagliata

@cintagliata

Christopher Intagliata was Science Friday’s senior producer. He once served as a prop in an optical illusion and speaks passable Ira Flatowese.

