The Replication Game: How Well Do Psychology Studies Hold Up?

12:09 minutes

Illustration by Christine Fleming

Replication is a cornerstone of scientific research, a way of checking to make sure a particular effect or result is real and not a statistical anomaly. But a lot of research can’t be replicated—a fact that recently hit home in the field of psychological science. Last year, for instance, researchers at the Center for Open Science found that they were unable to replicate findings in 61 out of 100 psychology papers selected, the most of any field they tested.

But there’s another problem: Many journals shy away from accepting studies that revisit earlier, high-profile research that turns out to be irreproducible.

To improve the quality of psychology research—and earn back public trust—a team of researchers has come up with a system to test influential psychology experiments for reproducibility. Dan Simons and his colleague Alex Holcombe created Registered Replication Reports, which allow researchers to nominate and re-examine original papers that have had a big cultural impact. So far, the hit rate hasn’t been great: Out of five well-known experiments that were re-tested, four resulted in data that strayed from the original findings.

Simons joins Barbara Spellman, former editor-in-chief of the journal Perspectives on Psychological Science, to discuss how a new era of psychology research is focusing on replication.

Segment Guests

Dan Simons

Dan Simons is a professor of psychology and the editor for Perspectives on Psychological Science. He’s at the University of Illinois at Urbana-Champaign in Champaign, Illinois.

Barbara Spellman

Barbara Spellman is a professor of law at the University of Virginia in Charlottesville, Virginia.

Segment Transcript

IRA FLATOW: This is Science Friday. I’m Ira Flatow. Replication studies, studies that you do over again, are like the green leafy vegetables of the research world, not the most exciting thing on the plate and occasionally difficult to stomach, despite their healthy reputation. For example, a few years back, an influential study showed that the more committed people felt to their partner, the more likely they were to forgive them. Well, hey, that certainly feels true now. There’s a certain truthiness to that. But researchers know that in order to make a concrete case for that idea, they really need to rigorously test it. More studies are needed.

But here’s the rub. Research journals don’t want to publish follow up studies. But now a group of psychologists has set out to make replication more palatable. A new series of papers called Registered Replication Reports is reexamining high profile psychology experiments, like that study of forgiveness in relationships, to see if results hold up. And in many cases, they do not.

My guest Dan Simons is a professor of psychology at the University of Illinois and co-creator of Registered Replication Reports. Joining him is Barbara Spellman, professor of law at the University of Virginia and former editor-in-chief of the journal Perspectives on Psychological Science. Welcome to Science Friday.

DAN SIMONS: Thanks for having us on.

BARBARA SPELLMAN: Thanks. Yeah, great to be here.

IRA FLATOW: You’re welcome. Dan, give us an idea. What are Registered Replication Reports?

DAN SIMONS: Sure. So Registered Replication Reports are a new kind of paper. And the idea is to develop a careful protocol to exactly, or as close as possible, replicate an original finding. So what we do is develop the protocol, the materials, the methods, the analysis plan, in advance, in collaboration with the original authors of the study. And then we put out a call for other labs to follow that protocol and then publish all of those results together.

IRA FLATOW: And the registered part means that you have registered what?

DAN SIMONS: So yeah, the idea of pre-registration is you’ve probably heard, for clinical trials, how a lot of them will be planned in advance and specified in advance so that you know how they’re going to be designed and how they’re going to be analyzed. That’s the whole idea of registration. You’re specifying in advance what you’re going to do and how you’re going to do it so that after the fact, there’s no question about, say, cherry picking data from the study or reporting some of the results and not others. Everything is specified in advance and planned out in advance.

IRA FLATOW: Let’s talk about the outcome of the replication that was announced this week about relationship forgiveness.

DAN SIMONS: Yeah, so there are two interesting things here. So first, there’s a large literature showing that there is a relationship, an association, between how committed you are in your relationship and how you respond to betrayals. So that relationship is fairly solid. And as part of this replication report, we actually confirmed that relationship, that association. What was interesting about this study, the one we were replicating, was that it was the only study in that literature that tried to actually vary people’s commitment to their partner and then measure how that affected their response to betrayal.

So they actually tried a manipulation where you had people think about their relationship, either in terms of how committed they were or how independent they were, and then looked to see if that would change how they respond to betrayal. So it was an experimental manipulation as opposed to just a correlation.

IRA FLATOW: But yet, four out of the five studies you’ve looked at so far could not be replicated.

DAN SIMONS: Yeah, so I should say going in with all of these studies, we’re actually selecting studies for which we expect there to be some result, some effect. But there’s some uncertainty about it. So this one was really important, because it was the only one that did that causal experiment. But we expected going in that there would be an effect there. And what was interesting was that the manipulation itself didn’t work, the attempt to manipulate how committed people were, that part didn’t work, which makes it kind of hard to interpret what the failure to find any effects on betrayal meant.

IRA FLATOW: Dr. Spellman, these reports are published in the journal Perspectives on Psychological Science. But historically, journals, as I said, they don’t want to publish replications. Why? Is that what drove you guys to get on board?

BARBARA SPELLMAN: Yes, in part, because there was no real way to consistently examine previous findings. Journals don’t want to be publishing the same thing over and over. You want to get the first study. You want to believe it’s correct. And then you want to be able to move on from it. But unless people do it again in other situations in other contexts with other materials, how do we know how robust and generalizable it really is?

IRA FLATOW: So is that then a mythology about science? We hear that the definition of science is reproducibility. But if there’s no one reproducing it, how do we know about the veracity of what we’ve discovered?

BARBARA SPELLMAN: That’s a funny question. Actually, people are trying to reproduce it. It’s just that the reproductions, if you will, don’t get published. So for example, if one of my students reads a really interesting paper in a journal and says, “Wow, this is a great finding. We ought to study it more. Let’s find out whether this matters or that matters and under what conditions does it hold.” They run into my lab. And I say, what a great idea. The first thing we should do is replicate that finding.

So we do. We run the study again. And we say, OK, that works. Let’s try some other stuff. So the other newer stuff might get published. But the original thing that just says, yeah, it worked the first time, doesn’t.

IRA FLATOW: And Dan, how do original study authors feel about having their research re-examined?

DAN SIMONS: Well, there are a couple of ways you could think about this. One is at some level, it’s kind of an honor to say, your study is important enough that we really need to know how robust it is, is it consistent across a whole bunch of cultures and societies and languages? So on the one hand, it’s saying, yes, this is important work. This is something we really need to know about. But it’s also, of course, scary to any original author, because nobody wants to be wrong. And there’s always the possibility that the result will turn out not to have been as robust as we thought. So of course, there is a risk in doing this.

But what we found so far is that most of the original authors have been incredibly supportive of the process. That was certainly true in this most recent replication report of one of Eli Finkel’s studies. He was incredibly helpful throughout and wrote a commentary about it that talked about how the process was an effective way of studying this. So I think most people recognize that this approach of a collaborative development of a protocol that is as accurate as we possibly can make it and gives us the best chance of showing the results as possible is a good way to go about this.

IRA FLATOW: Is there– go ahead, Dr. Spellman.

BARBARA SPELLMAN: Yeah, but Ira, you put your finger really on something in the beginning that was troubling. And that is the early lead up to all this. Things were pretty contentious. And some people thought that the only reason somebody else would try to replicate their study is to show that they were wrong. They were really worried about that. And some people thought other people who would try to replicate studies were either only people who didn’t have any ideas of their own or were somehow incompetent.

So the lead up to this was really pretty ugly where a lot of people were afraid of this whole replication thing. But now with the replication reports in Perspectives and in some other journals following along, we wanted to get back to being the mainstream kind of thing that it probably was in earlier days.

IRA FLATOW: So are you finding a change in mood? I mean, why now? What’s happening in the field of psychology that made replicating experiments a priority?

DAN SIMONS: Bobby, do you want to take that? Go ahead.

BARBARA SPELLMAN: Yeah, sure. There was this big confluence of things a few years ago, not only in psychology, but also, really importantly, in medicine and some of the other biological and social sciences, where people couldn’t replicate each other’s research, where various types of fraud were detected. Although I want to say that I don’t think that that’s the major motivator for what’s going on now. There were some things published that other people really very much disagreed with whether they were well done or not. And so there was a bit of turmoil in the field. Plus there are other people pointing at the way we use bad statistics–

IRA FLATOW: Manipulating data, things like that.

BARBARA SPELLMAN: Yeah, yeah, so, so many things were going on at once. And the funny thing is none of them were new. These complaints have been around for a long time. But the fact that so many of them were happening together, I think, is one of the reasons that pushed psychology to go through this new introspective phase and then the fact that other fields were going through it too when we were– made us think, oh, it’s not just us. We’re not in it alone. So it’s been more collaborative now.

IRA FLATOW: Dan, what’s the future look like for replication in psychology? From what Barbara says, it doesn’t look like you’re going to run out of things, run out of topic.

DAN SIMONS: Yeah, the goal obviously isn’t to try and replicate everything under the sun, because we don’t need to go back to every single study and redo it. A lot of– science does progress. So the things that were relevant to a field 20 years ago may not be relevant anymore. So one of the things we look to is whether the studies we’re examining have what we call replication value. So are they still findings that influence the field? And would it change how we think about the mind and behavior if we had a better estimate of what this effect is?

So I don’t think there’s an infinite amount of stuff that we need to go and replicate. But I do think of this as one of the many ways that you can go about shoring up the field.

IRA FLATOW: Well, is there one particular field that’s more important or needs– is crying out for attention now more than anything else?

DAN SIMONS: I think what we’ve been seeing from a lot of different disciplines is that this is not a psychology specific problem or even a subfield of psychology specific problem. This is something that’s an issue across all of the sciences that there hasn’t been enough of a focus on getting all of the results published. So if you look at what’s published in our journals, almost everything that’s published is statistically significant, which means that things that didn’t work don’t end up in the literature. And that’s true across pretty much every discipline that publishes and uses statistics. So I wouldn’t say there’s any one particular subfield that needs fixing.

IRA FLATOW: A tweet came in from Ted Pavlik who thought about my next question, I was thinking the same thing. Where is the money coming from? These are expensive studies.

DAN SIMONS: That’s a great question. Yeah, fortunately, so far, the ones we’ve been looking at haven’t been incredibly expensive studies. And we have a very generous grant that comes through the Center for Open Science to the Association for Psychological Science from the Arnold Foundation. And they’ve been highly supportive of replication research for years now. So they’ve been supporting most of these studies, which for the most part, aren’t that expensive.

If we were, for example, trying to replicate a large scale study of neuroimaging, of fMRI, we’d run out of money immediately. It’s just too expensive. But the sorts of things we’re doing are fairly easy to do. And most labs can cover the basic costs themselves.

IRA FLATOW: That’s not to say that these other people shouldn’t be looking at their own work, even psychiatry, which would involve critical drug tests and things like that. Expensive.

DAN SIMONS: Absolutely. That’s the big issue is that often the NIH and other government funding agencies will fund a big clinical intervention, for example. But then there’s never money to go back and do the same thing again because people want to spend money on new things.

IRA FLATOW: Maybe you’ll be changing that today, you never know. Dan Simons, professor of Psychology, University of Illinois, co-creator of Registered Replication Reports, Barbara Spellman, professor of law, University of Virginia, former editor in chief of the journal Perspectives on Psychological Science. Thank you both for taking time to be with us today. Have a great weekend.

Copyright © 2016 Science Friday Initiative. All rights reserved. Science Friday transcripts are produced on a tight deadline by 3Play Media. Fidelity to the original aired/published audio or video file might vary, and text might be updated or amended in the future. For the authoritative record of Science Friday’s programming, please visit the original aired/published recording. For terms of use and more information, visit our policies pages at http://www.sciencefriday.com/about/policies.

Meet the Producer

About Katie Feather

Katie Feather is a former SciFri producer and the proud mother of two cats, Charleigh and Sadie.
