Corralling the Texas Sharp Shooter: $1,000,000 Reward
Thursday, January 28, 2016
Legend has it that in Texas, and perhaps other jurisdictions where the value of pi is determined by political vote, sharpshooters market their skills by first firing a shotgun at a barn door and then painting a bull’s eye around their preferred hole. There has been much concern recently that parts of science are not immune to similar strategies, and that in consequence there is a looming “replication crisis”—not just in psychology but other fields as well.
Any time an experiment is conducted that has 20 potential outcome measures, only one of which is ultimately reported in the analysis, a researcher has potentially turned into the proverbial Texan sharpshooter. If the null hypothesis were true for that hypothetical experiment, then on average one of those 20 outcome measures would be significant by chance alone at the conventional α = .05 level (assuming the measures are uncorrelated).
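This arithmetic is easy to verify with a quick simulation. The sketch below (my own illustration, not part of the original argument; all variable names are hypothetical) draws 20 independent "p-values" under a true null for many simulated experiments and counts how often at least one crosses the .05 threshold:

```python
import random

random.seed(1)

ALPHA = 0.05        # conventional significance threshold
N_MEASURES = 20     # outcome measures per experiment
N_EXPERIMENTS = 100_000

false_positive_runs = 0   # experiments with at least one spurious "hit"
total_hits = 0            # spurious hits summed across all experiments

for _ in range(N_EXPERIMENTS):
    # Under the null, each measure's p-value is uniform on [0, 1],
    # so each has a 5% chance of landing below ALPHA.
    hits = sum(random.random() < ALPHA for _ in range(N_MEASURES))
    total_hits += hits
    if hits > 0:
        false_positive_runs += 1

mean_hits = total_hits / N_EXPERIMENTS
prop_any = false_positive_runs / N_EXPERIMENTS
print(f"Mean significant measures per experiment: {mean_hits:.2f}")
print(f"Proportion with at least one false positive: {prop_any:.2f}")
```

On average one measure per experiment is "significant", and roughly 64% of experiments (1 − .95²⁰) yield at least one false positive: ample room to paint a bull's eye.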
Very few researchers conduct science in this caricatured manner. However, many long-standing research practices that were considered acceptable—or at least tolerable—for decades have now been identified as a slippery slope with a big Texan barn at the end.
Removing participants from analysis because they are outliers can be perfectly legitimate (I once removed a subject from a speeded recognition experiment because their chair collapsed under them: I still do not feel any guilt over that choice), but in other instances it can be risky—especially when the decision, or the criterion for outlier removal, is informed by inspection of the data.
Similarly, it may be legitimate to remove an aberrant item from a questionnaire if it turns out that subjects systematically misunderstood the question or misread an alternative, but in other instances this exclusion of measures can be less benign.
A large literature has been developed to address those problems, and this brief post cannot cover all nuances of the problem or the proposed solutions. There are, however, two measures on which most commentators agree: first, the need for data to be open, and second, the rewards of preregistration.
APA guidelines have long stipulated that research data be shared with other researchers, but in reality the availability of data has been rather more limited. Accordingly, many journals now require or reward with “badges” public posting of relevant data, thereby facilitating data sharing for re-analyses.
Various additional initiatives have arisen, including one I am part of that seeks to encourage data availability commencing with peer review. The advantages of open data are numerous and obvious, although it is also important to consider boundary conditions and potential problems in contested areas of science. Anyone interested in those nuances may wish to read the commentary by Dorothy Bishop and me that appeared in Nature this week.
Data availability is, however, only part of the solution.
A second pillar of the drive to improve the reliability of the field involves preregistration of research. In a nutshell, this means that the bull’s eye is drawn onto the barn door before the shotgun is fired. That makes it a little harder for Farmer Joe to collect his prize at the county fair, unless he really is a sharpshooter.
Likewise, if the motivation, design, sampling, and analysis plan of an experiment are put on record before the research commences, your theory and methodology have to aim pretty well in order to yield the predicted effect. Conversely, if you do obtain the effect in a preregistered study, then your confidence that the effect is real, rather than a barn-door-assisted chance effect, should increase.
The steps to preregistration are simple: several online public repositories exist that specialize in archiving any collection of documents (from a method section to ethics approvals to data) in a database. For members of the Psychonomic Society, the repository that is most likely to be relevant is the Open Science Framework, www.osf.io. (A great introduction is here.)
At the heart of a preregistration database is an intricate system of access control, which includes the crucial capability of preventing the user from making any changes to files once they have been entered into preregistration. You can write and edit the files to your heart's content until you register them. Thereafter they are no longer editable (even though you wrote them), thereby creating an unalterable record of your research intentions at that point in time. The bull's eye has been drawn before a shot is fired.
Preregistered files can remain hidden from public viewing for an embargo period, typically up to two or three years. This protects you from being scooped by someone who can test subjects faster than you and might otherwise publish your experiment first. You can lift that embargo at any time, typically upon submission or publication of your article, at which point it becomes a matter of indisputable public record that your hypotheses, method, and analysis plan were preregistered.
To provide a glimpse of the scope of the Open Science Framework: the user base now includes 16,500 researchers, with about 50 being added per day. More than 20,000 projects are archived in the database, of which around 5,000 are public. To date, there have been nearly 4,000 registrations.
Brian Nosek, the driving force behind OSF, summarized its advantages succinctly:
“(1) It solves the problem of losing data, materials, and study protocols from your own collaborations for your own use. No more ‘where did it go’ after a computer or graduate student explodes. The OSF is a web-based application with very strong preservation and security standards to make sure that you never lose your stuff again.
(2) A key feature is that it is entirely up to the user whether the project, or parts of the project, are public or private. Nothing has to be made public that should not be. This integrates private and public workflows. You can use the same environment for the private work of the lab and the public facing communication of what the lab is producing.”
To further stimulate experience with preregistration, the Center for Open Science has just launched the $1,000,000 Preregistration Challenge: one thousand researchers will each earn $1,000 for publishing the results of a preregistered study.
Interested in this challenge?
Some final thoughts: Many researchers are—rightly—concerned about the replicability of psychological data and associated questions about research practices. However, those concerns should not translate into wall-to-wall doom and gloom: it appears to me that in cognitive psychology the proverbial glass is more than half full as far as replicability is concerned. More importantly, as far as the field overall is concerned, the last few years have seen a stunning pace of reform that has turned "crisis" into opportunity and exciting new ways to do research, in what to me appears to be the blink of an eye. Anyone looking for an example of the self-correction of science need look no further, and can join the revolution at www.osf.io.