An Exit Interview With U.S. Chief Data Scientist DJ Patil
In 2013, taking a page out of the Silicon Valley playbook, President Obama signed an executive order that made open and machine-readable data the new default for government information. In 2015 he appointed DJ Patil to the newly created role of Deputy Chief Technology Officer for Data Policy at the White House Office of Science and Technology Policy. Patil, who had worked in the private sector for eBay, LinkedIn, and others, once honed his skills in data science by improving mathematical models for weather prediction using open data sets available through the National Oceanic and Atmospheric Administration. Now he was going to use data to tackle problems in areas that required the most spending (costing $1 trillion or more), and which served the greatest number of Americans.
One of those big issues involved the criminal justice system. In 2015 Patil helped launch the White House’s Police Data Initiative, through which police jurisdictions release data collected on their policing, including information about the use of force and traffic stops. By looking at the data, Patil noticed that a number of negative police encounters occurred just after an officer had responded to a suicide or domestic violence call, which suggested that quickly re-dispatching these officers to their normal beat without giving them time to decompress may have led to the incidents of violence.
In the realm of healthcare, Patil’s data efforts centered on President Obama’s Precision Medicine Initiative, which would build the largest and richest database of genetic information. Funding provided by the 21st Century Cures Act will go to the National Institutes of Health’s effort to sequence individual human genomes and collect biological samples to be made available for scientific study.
Patil joins Ira to talk about the legacy of his initiatives and the future role of big data in government.
DJ Patil is the U.S. Chief Data Scientist in the White House Office of Science and Technology Policy in Washington, D.C.
IRA FLATOW: This is Science Friday. I’m Ira Flatow. Researchers estimate that there are about four zettabytes of data in the world. How much is a zettabyte? Well, if a single character of text represents one byte, then there are 1,250 pages– you know that War and Peace tome? That would fit into a zettabyte 323 trillion times. Wow, that’s a lot of War and Peace. That’s some big data.
But, so what? Data is just something websites collect to try and sell you that perfect electric toothbrush, right? Sure. But it turns out that the world is full of data. You have metro card swipes, police dispatch calls, blood pressure numbers, facial recognition. All of which can be used to solve a number of important issues.
And to that end, in 2015, President Obama tapped data scientist DJ Patil to be the first Deputy Chief Technology Officer for data policy at the White House. And over the next two years, Patil worked on programs that used big data to solve big issues in areas like criminal justice and health care.
But now as the Obama administration starts to wind down, what legacy will these big data projects leave behind? Were they successful? What’s the future looking like for big data in government?
Well DJ Patil is here to talk with us about it. Welcome to Science Friday.
DJ PATIL: It’s great to be here.
IRA FLATOW: When you were first named back in 2015, what was the vision you had for big data at the White House?
DJ PATIL: Well, we sat down, actually, with the president and we came up with a mission statement. And the mission statement is to responsibly unleash the power of data to benefit all Americans. And the big two, kind of, components of that are “responsibly,” and that is: just because we can doesn’t always mean we should.
And how do you make sure it benefits all Americans? We have this talk about technology being radical and revolutionary. And our assertion is that a technology is neither radical nor revolutionary, unless it benefits every single person.
IRA FLATOW: Did people understand what big data was when you came to the White House?
DJ PATIL: Well, in fact, the president is the one person who has singularly pushed that we have to get ahead of the big data curve. And when he– even before I got there– he had commissioned a report to talk about what is big data. How do we think about it and what are the implications for society?
And out of that, there came two really big components of it. One was, how do we use it in health care? And what are the implications that we need to get ready for? We’ve written a number of White House reports following on that, everything from the intersection of big data and civil rights, all the way to most recently, how do we think about artificial intelligence?
IRA FLATOW: Really? Let’s talk about civil rights. How did you help craft or use big data in civil rights?
DJ PATIL: Well one of the big things that we have been looking at is where should you be thinking about the opportunities for data to help on civil rights? And also, where could it be used to impede civil rights?
So one of the cases that we don’t always think about is in housing. Is data being used to help people get into housing or is it being used to artificially isolate a certain population? In evictions and using that data as a marker, how do you have recourse to know if that was true? Or how does somebody actually fact check your background and credit history?
There’s also a lot of data that’s out there that could show you’re creditworthy, but it’s just not utilized or it’s not tracked.
IRA FLATOW: Really?
DJ PATIL: Absolutely. So in the case of banking, there’s a large portion of the population that just doesn’t have a data signature. And because they don’t have a data signature, they don’t actually get credit for this. One of the groups, Capital One, way back 20, 30 years ago, they’re the first ones to start showing that you could actually get credit to middle class Americans. And they just used different data signatures.
On the flip side, things like in policing, and the president after Ferguson– all the incidents in Ferguson and the shootings, he commissioned a task force– a task force on 21st century policing. A number of the recommendations were all data and technology related.
Some good. How do we use body cameras? How do we think about using data and all sorts of different elements? And I can talk more at length about that.
But the other is, the concerns, also. So if somebody is coming up with the idea of predictive policing, well how do we have the ability to have transparency into how those algorithms are chosen or what data goes into that? So it’s not just some artificial justification of stop and frisk.
IRA FLATOW: So the data could be biased, the algorithm could be biased, if you’re not careful.
DJ PATIL: Absolutely. In fact, what we’re finding is biased data all over the place. And so you have to be constantly asking the questions. And we’re seeing this from basic science all the way at the genomic layer of testing, all the way through other types of activities.
The one place that is really important to call out, also, what we’re trying to do is make sure that we’re getting this data out there so everyone can evaluate it. Just like you were talking about with Feynman, somebody is able to reproduce the results. In our case, we want everyone to reproduce the results.
So people often talk, well the data is just biased in climate. Well, in the climate– the data is so overwhelmingly clear, and anybody can download the data, and look at it, and evaluate, and ask questions, it’s just obvious at this point that climate change is happening.
And so we go through that rigor and we eliminate that saying yes, this is a case of– this is scientifically rigorous.
IRA FLATOW: And then what about precision medicine big data?
DJ PATIL: So the idea of precision medicine is how do we really step forward into the genomic era? Right now, precision medicine happens where if you are sick with cancer or some other type of chronic disease, somebody could use your genomic information, all that type of information from a DNA test, to get you tailored treatments. The problem is, that’s not true for everyone right now.
So how do we create that at large? How do we enable data to happen for everybody? So what has happened through the precision initiative, which the president launched two years ago, is the idea that any American could contribute their data, donate it, to the National Institutes of Health and create the world’s largest repository of data, not just at the genomic layer of your DNA, but all the other medical information that sits inside the doctor’s office, to truly understand all the implications of chronic diseases or other things that we just haven’t thought about it.
IRA FLATOW: But people are going to be worried, hey, if my information is in a big database that anybody can get to, they’re going to know all about my private stuff, no privacy.
DJ PATIL: Absolutely. So one of the first– there’s a couple of tenets that are essential to everything you do when you go into precision medicine.
The first, at the most part, is when you’re deciding. Classically when you do research, you don’t have the actual patient or the participant at the table. In this case, the patient participant is there front and center, always at the table. And that is a real dramatic shift. Because the questions of consent, who has access to the data? What are you doing with the data? It’s not just some advocacy group on your behalf. It’s the actual people.
The other part of that is security and privacy. And that’s why we’ve been working very aggressively to release, and we’ve released the White House Guidance and Policies around how to think about security and privacy, and those programs and grants have gone out and they have a very different model.
One of the ones that we’ve been working on very closely with the Secretary of Defense on is this idea of bug bounty program. He announced a program called Hack the Pentagon. Which is what it does, which it sounds kind of weird, but it says, hey let’s go out and ask America’s best hackers to come beat us up and then tell us about it. And within minutes, you find all the vulnerabilities and then you’re able to go fix them. So we’re taking those approaches.
IRA FLATOW: Could you have found the vulnerabilities in this last campaign if the Russians had hacked our system, would big data show that?
DJ PATIL: Well, I can’t get into those comments of what had happened in the elections, but what I can say is for any type of system that we’re working on– and this is why the president has another commission on cybersecurity and the creation of the NICE Cybersecurity National Action Plan– is how do we start thinking about the next generation of security?
And this is one of the big things and this also gets into AI. And what we publish in AI report is, there has to be two important things in any training program that anybody in data is going into. And when I say data, I mean, economics. I mean statistics, mathematics, computer science, any type of engineering. You have to have ethics as part of the core training. So you can start to have conversations about is this OK? Is this acceptable?
The second is security. Security no longer can be an elective for anybody that is training for these things. Because if you’re learning about a database or building one of these systems, you have to know how somebody might be getting data out or attacking that system in a way that you hadn’t anticipated. All of those things as they start to come together, give us a different approach to think about what is it going to require to protect systems.
IRA FLATOW: Talking with DJ Patil, Deputy Chief Technology Officer for data policy at the White House Office of Science and Technology Policy. Our number, 844-724-8255, if you have a question. Or you want to tweet us here @sciencefriday.
So your job is over January 20.
DJ PATIL: That’s right.
IRA FLATOW: Do you know where you’ll be going and what you’re going to be doing yet?
DJ PATIL: I’m going home and I’m going to take a long nap. But the problems– the most important thing that I’ve taken away from this, I tell people is, we always focus on the data. It’s more important to actually focus on the people. And the data provides a solution to get there.
Some of the problems that I am extremely excited about continuing working on is not only the type of precision medicine and cancer moonshot programs of how do we enable new data sets to ask really phenomenal questions, but is on the policing side. And how do we also address our criminal justice system?
Because we have– just to give a sense of numbers for the audience out there– the numbers are crazy. We have more than 11 million people around– roughly 11.4 million people going through 3,100 jails every year. And when I say jails, I mean your local jail. They stay there, on average, 23 days. 95% of them never go on to prison. So we have this cycle that people are just going around and around and around in.
Those are incredible dollars. They are going and preventing your local city from paying for a teacher, an officer, a park. So how do you deal with it?
Well, it turns out cities have figured out if you take your data from your criminal justice system and say who are these people you keep cycling? And you write them on a list– you can think of it just like a spreadsheet– and you hand that over to the health care system, the hospitals, and they say, have you seen these people? And they can say, yes. We see these people a lot.
In New Jersey, 70% of all the medical costs come from 10% of the population.
So when you look at that, now you have the privacy measures because of the health ecosystem and you can say to dispatch, hey, instead of taking that person to jail, let’s get them into the right medical treatment. Let’s get them into the right opioid treatment plan. And– I know you’re going to talk about 21st century cures later– these are why these dollars for these things are important.
Miami-Dade, Florida. They trained all their people up in this idea of crisis intervention. Cost about a million in the first year. But as a result, in the first year alone, they saved more than $10 million. And more importantly, they were able to close a full jail. That’s the power when data comes together with public policy.
IRA FLATOW: How do you get them to talk to one another?
DJ PATIL: Well, that’s where we come in. So oftentimes, what happens is, and what we did in a lot of these things, is literally we took a bunch of the technologists, and the police chiefs, and the medical people, and the civic activists, and we locked them in a room at the White House. And we left a bunch of presidential cupcakes on the table.
And we said well what could happen? And they said, hey, you know what? We could open up data. We could share data.
Even in police departments, one of the things that we’ve seen is, we have something called the Police Data Initiative which gets communities to start opening up their data and just putting it on the web. So people can ask basic questions.
So it says town A, your data, your stop rates, your search rates, they look like this. How does it compare to town B?
Those little things, they now cover actually over 95 million Americans. They cover 10 states. They cover 139 jurisdictions. They get a community to form to ask best practices. Because there’s a lot of legacy technology and problems that prevent people from doing any work in those local towns.
IRA FLATOW: I’m sure that very few people have heard about these things until this very second. Why are these the best hidden secrets about big data? And how cooperation between different parts of the government–
DJ PATIL: Well, I think one of the challenges is we’re really focused on talking about data around just a few dimensional areas, obviously one is fantasy football. And I don’t want to take that away from anybody– or baseball. But we’re also seeing this dimension where we’re moving beyond just talking about what can data do for a social networking site. And we’re moving on to saying what can data do to empower people and do things in a remarkable way?
And the biggest change that we’re seeing, and this is what I think President Obama has singularly realized in putting science and technology back into its rightful place at the White House, is that when technologists and science policymakers and everyone are at the table, at the beginning you say, hey what about if we did this? And it becomes a dramatic force multiplier.
What we have to do is we have to help people see the value proposition. And that’s why I think it’s so great that we’re talking about it.
IRA FLATOW: Well somebody should be talking about science on Science Friday from PRI, Public Radio International. I’m Ira Flatow talking with DJ Patil.
So is your position– now you’re the first person in this position, do we know if it’s going to survive the next presidency?
DJ PATIL: Well right now the way a transition process works is, it’s really best to think of it as an analogy as a baton race. So we’re sprinting. We need the next team of President-elect Trump’s to sprint with us. We’ve prepared all the material. We’ve helped them not only see the decisions we made, but how we got to these decisions. And we’re passing that baton to them so they can figure it out.
It is every president’s and president-elect’s choice of how they structure the White House. And so the only thing we can comment on is that we’ll give them the best option to think about it. And I think one of the benefits we see is that data is a force multiplier.
The part that I tell people, though, is there are in the agencies– the federal agencies, think National Institutes of Health, think Commerce Department, Department of Defense, there are over 40 chief data scientists or chief data officers in those departments now. And so there is a transition that has happened of all these agencies also recognizing it. And those are career civil servants and they will continue to really continue to push the data mission forward.
IRA FLATOW: But before your career as a data scientist, you started out in community college.
DJ PATIL: That’s right.
IRA FLATOW: You support the president’s community college education initiative, I would imagine?
DJ PATIL: I’m a big advocate, obviously, of community college. And the reason I am is, most people think, we have this vision of our self of the mathematician who just instantly gets it. And by fifth grade is already doing collegiate work.
IRA FLATOW: MIT mentor.
DJ PATIL: I was. My father was a professor at MIT and let me tell you, I’m not. [INAUDIBLE]
IRA FLATOW: Me neither.
DJ PATIL: So I was a kid who barely graduated through high school because of my math grades. But I went to the local community college. I was super fortunate that I had a girlfriend who was in this calculus class. And so I went along to the calculus class. And I fell in love with it. And the instructor who is teaching calculus was so good that she helped me understand the value proposition.
And more so, I was in class with people who were putting all their energy into earning $1 so they could go to these classes. And it gave me a different appreciation for why doing that. I was very able to quickly move on from the community college into the traditional academic sectors, but the thing that’s always stuck with me is imagine as we talk about all these programs, there’s somebody out there that wants to take one of these classes. How do we know? Because all these massive online courses are out there.
But there’s something else that you need when you’re working on hard problems around data or mathematics or science. You need a community. You need a group to work with. And there’s some of that on the online forum. But if we had that power and everyone had the opportunity to get a little bit more education, we would massively transform this country.
IRA FLATOW: And that’s what we need for new jobs. Thank you DJ.
DJ PATIL: My pleasure. Thank you.
IRA FLATOW: DJ Patil, US Chief Data Scientist at the White House Office of Science and Technology Policy. Good luck wherever you’re going.
DJ PATIL: Thank you.
IRA FLATOW: Check back in with us–
DJ PATIL: I’d love to.
IRA FLATOW: When you land on your feet again, which I’m sure you will. We’re going to take a break and when we come back, we are going to change direction and I’m going to see what we’re going to talk about, break it down. Did you break something down for us? We told you to take it apart, break it down.
We’re going to have some folks who shared with us some really interesting things. And you’re going to be surprised about the most common object that people found to break down. You may be carrying it with you as I speak. I’ve said enough. We’ll be right back after the break.