Ten years ago this week, I was startled to see tweets reporting that Dutch psychologist Diederik Stapel, a former colleague, had admitted to falsifying and fabricating data in dozens of articles. My inbox filled with e-mails from fellow methodologists, researchers who study and refine research methods and statistical tools. They expressed disbelief at the extent of the misconduct, but also a sense of inevitability. We all knew that sloppiness, low moral standards and competitiveness were common.
What happened next was inspiring: an open discussion that went far beyond misconduct and focused on improving research. Many researchers, often early in their careers, used social media to call for bias-countering practices, such as sharing data and analysis plans. It changed the conversation. Before 2011, my grant applications to study statistical errors and biases in psychology had been repeatedly rejected as low priority. By 2012, I had received funding and established my current research group.
This August, another case of data fraud came to light, this time in a 2012 paper from behavioural-science superstar Dan Ariely, who agrees that the data are fabricated, but says he did not fabricate them. This case, ironically in a study evaluating how to encourage honesty, is an invitation to examine how standards for research practice have changed, and how much further reform must go.
Publication bias — the tendency for findings that confirm hypotheses to be published more often than are null results — was clearly documented in the 1950s. The 1960s and 1970s brought warnings that choices about how data were analysed could introduce bias, such as the identification of spurious or inflated effects. The widespread failure to share psychology data for verification purposes was also decried in the 1960s and 1970s. (My team documented it in 2006.)
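A small simulation makes the arithmetic of publication bias concrete. This is a minimal sketch of my own, not an analysis from any of the papers mentioned above; the effect size, sample size and selection rule are assumptions chosen purely for illustration.

```python
# Illustrative sketch (hypothetical numbers): if only "significant" studies of a
# small true effect reach the literature, the published effect looks much larger
# than the truth and most studies are never seen.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_effect = 0.2      # assumed small true effect (in standard-deviation units)
n_per_group = 30       # assumed typical small-sample study
n_studies = 10_000

published = []
for _ in range(n_studies):
    treated = rng.normal(true_effect, 1.0, n_per_group)
    control = rng.normal(0.0, 1.0, n_per_group)
    t, p = stats.ttest_ind(treated, control)
    if p < 0.05 and t > 0:  # only confirmatory, significant results get "published"
        published.append(treated.mean() - control.mean())

print(f"True effect:                  {true_effect:.2f}")
print(f"Mean effect in 'literature':  {np.mean(published):.2f}")
print(f"Share of studies 'published': {len(published) / n_studies:.1%}")
```

Under these assumptions, the average published effect comes out roughly twice the true one, which is the kind of distortion the early warnings were about.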
By the 1990s, methodologists had raised the alarm that most studies had unacceptably low statistical power — the probability that true effects will be detected — and that researchers often misrepresented a study as having been designed to test a specific hypothesis, when in fact they had spotted a pattern in exploratory work. The high prevalence of statistical errors was not news, at least to methodologists. Nor was the practice of tweaking and repeating analyses until a statistical threshold (such as P < 0.05) was reached. In 2005, a modelling paper showed that, combined, these biases could mean that most published results were false (J. P. A. Ioannidis PLoS Med. 2, e124; 2005). This provocative message generated attention, but little practical change.
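To see why that analytic flexibility matters, here is another minimal sketch — my own illustration, not the Ioannidis model — of how measuring several outcomes and reporting whichever comparison clears P < 0.05 inflates the false-positive rate well above the nominal 5%, even when no true effect exists. The number of outcomes and sample size are assumptions for illustration only.

```python
# Illustrative sketch: in a "null world" with no real effect, reporting the best
# of five outcome measures per experiment yields far more than 5% false positives.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_per_group = 20
n_outcomes = 5          # assumed flexibility: five outcome measures per experiment
n_experiments = 10_000

false_positives = 0
for _ in range(n_experiments):
    # Treatment and control are drawn from the same distribution: any "effect" is noise.
    treated = rng.normal(0, 1, (n_outcomes, n_per_group))
    control = rng.normal(0, 1, (n_outcomes, n_per_group))
    p_values = [stats.ttest_ind(t, c).pvalue for t, c in zip(treated, control)]
    if min(p_values) < 0.05:  # report only the "best" outcome
        false_positives += 1

print("Nominal false-positive rate: 5.0%")
print(f"Observed with 5 outcomes:    {false_positives / n_experiments:.1%}")
```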
Despite this history, before Stapel, researchers were broadly unaware of these problems or dismissed them as inconsequential. Some months before the case became public, a concerned colleague and I proposed to create an archive that would preserve the data collected by researchers in our department, to ensure reproducibility and reuse. A council of prominent colleagues dismissed our proposal on the basis that competing departments had no similar plans. Reasonable suggestions that we made to promote data sharing were dismissed on the unfounded grounds that psychology data sets can never be safely anonymized and would be misused out of jealousy, to attack well-meaning researchers. And I learnt about at least one serious attempt by senior researchers to have me disinvited from holding a workshop for young researchers because it was too critical of suboptimal practices.
Around the time that the Stapel case broke, a trio of researchers coined the term P hacking and demonstrated how the practice could produce statistical evidence for absurd premises (J. P. Simmons et al. Psychol. Sci. 22, 1359–1366; 2011). Since then, others have tirelessly promoted study preregistration and organized large collaborative projects to assess the replicability of published findings.
Much of the advocacy and awareness has been driven by early-career researchers. Recent cases show how preregistering studies, replication, publishing negative results, and sharing code, materials and data can both empower the self-corrective mechanisms of science and deter questionable research practices and misconduct.
For these changes to stick and spread, they must become systemic. We need tenure committees to reward practices such as sharing data and publishing rigorous studies that have less-than-exciting outcomes. Grant committees and journals should require preregistration or explanations of why it is not warranted. Grant-programme officers should be charged with checking that data are made available in accordance with mandates, and PhD committees should demand that results are verifiable. And we need to strengthen a culture in which top research is rigorous and trustworthy, as well as creative and exciting.
The Netherlands is showing the way. In 2016, the Dutch Research Council allocated funds for replication research and meta-research aimed at improving methodological rigour. This year, all universities and major funders in the country are discussing how to include open research practices when they assess the track records of candidates for tenure, promotion and funding.
Grass-roots enthusiasm has created a fleet of researchers who want to improve practices. Now the system must assure them that they can build successful careers by following these methods. Never again can research integrity become a taboo topic: that would only create more untrustworthy research and, ultimately, misconduct.
Competing Interests
The author declares no competing interests.