I just found out that Keith Baggerly will be speaking at Rice University this afternoon. His talk is entitled “Cell Lines, Microarrays, Drugs and Disease: Trying to Predict Response to Chemotherapy.” Here is part of the seminar announcement most relevant to reproducibility.
In this talk, we will describe how we have analyzed the data, and the implications of the ambiguities for the clinical findings. We will also describe methods for making such analyses more reproducible, so that progress can be made more steadily.
The talk will be at 4 PM in Keck Hall room 102.
Michael Nielsen posted an excellent article this morning, Three myths of scientific peer review. He points out that peer review has only become common in the last 40 or 50 years. Maybe a few years from now someone will write an article looking back at how reproducible research came to be de rigueur. No one questions whether peer review is a good thing, though many people have complaints about the current system and argue about ways to make it better. Maybe the same will be said for reproducible research some day.
When I was in college, a friend of mine told me he liked to take his code out for a walk every now and then. By that he meant recompiling and running all of his programs. At the time I thought that was unnecessary. If a program compiled and ran the last time you touched it, why shouldn’t it compile and run now? He simply said that I might be surprised.
Even when your source code isn’t changing, the environment around it is changing. When I was in college, computers didn’t have automatic weekly updates, but they changed often enough that taking your code out for a walk now and then made sense. Now it makes even more sense. See Jon Claerbout’s story along these lines.
The following articles on reproducible research are included.
Guest Editors’ Introduction: Reproducible Research
Sergey Fomel, University of Texas at Austin
Jon F. Claerbout, Stanford University
Reproducible Research in Computational Harmonic Analysis
David L. Donoho, Stanford University
Arian Maleki, Stanford University
Inam Ur Rahman, Apple Computer
Morteza Shahram, Stanford University
Victoria Stodden, Harvard University
Python Tools for Reproducible Research on Hyperbolic Problems
Randall J. LeVeque, University of Washington
Distributed Reproducible Research Using Cached Computations
Roger D. Peng, Johns Hopkins Bloomberg School of Public Health
Sandrah P. Eckel, Johns Hopkins Bloomberg School of Public Health
The Legal Framework for Reproducible Scientific Research: Licensing and Copyright
Victoria Stodden, Harvard University
I just found out about BioMed Critical Commentary. Here’s an excerpt from the site’s philosophy statement.
The current system of scientific journals serves well certain constituencies: the advertisers, the journals themselves, and the authors. It is the underlying philosophy of BioMed Critical Commentary to serve the readers in preference to any other constituency.
In particular, this site could serve as a public forum for criticism that journals are not eager to publish. It could be a good place to discuss specific examples of irreproducible analyses.
Even when lab work and statistical analysis are carried out perfectly, microarray experiment conclusions have a high probability of being incorrect for purely probabilistic reasons. Of course lab work and statistical analysis are not carried out perfectly. I went to a talk earlier this week that demonstrated reproducibility problems coming both from the wet lab and from the statistical analysis.
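To see why the probabilistic point holds even under perfect execution, here is a back-of-the-envelope sketch. All the numbers below (genes tested, number of truly active genes, statistical power) are hypothetical, chosen only to illustrate the arithmetic, not taken from any particular study.

```python
# Toy arithmetic illustrating why most "discoveries" in a large
# screen can be false even when every step is done correctly.
n_genes = 10_000          # genes tested (illustrative)
alpha = 0.05              # per-gene significance threshold
n_truly_active = 100      # hypothetical number of genes with a real effect
power = 0.8               # hypothetical chance of detecting a real effect

expected_false = (n_genes - n_truly_active) * alpha   # null genes crossing alpha
expected_true = n_truly_active * power                # real effects detected

# Fraction of reported discoveries expected to be false under
# these assumptions.
false_discovery_rate = expected_false / (expected_false + expected_true)
print(f"expected false discoveries: {expected_false:.0f}")
print(f"expected true discoveries:  {expected_true:.0f}")
print(f"expected FDR: {false_discovery_rate:.0%}")
```

With these made-up but not unrealistic inputs, false positives outnumber true positives roughly six to one, purely because so many hypotheses are tested at once.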
The talk presented a study that supposedly discovered genes that can distinguish those who will respond to a certain therapy from those who will not. On closer analysis, the paper actually demonstrated that it is possible to distinguish microarray experiments conducted on one day from experiments conducted on another day. That is, batch effects from the lab were much larger than differences between patients who did and did not respond to therapy. I hear that this is typical unless gene expression levels vary dramatically between subgroups.
The talk also discussed problems with reproducing the statistical analysis. As is so often the case, data were mislabeled. In fact, 3/4 of the samples were mislabeled. Simply keeping track of indices is often the biggest barrier to reproducibility. It is shocking how often studies simply do not analyze the data they say they analyzed. This seems like a simple matter to get right; perhaps people give little attention to it precisely because it seems so simple.
So, three reasons to be skeptical of microarray experiment conclusions:
- High probability of false discovery
- Statistical reproducibility problems
- Physical reproducibility problems
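The batch-effect problem above can be sketched with a toy simulation. The effect sizes here are invented for illustration: suppose all responders happen to be run on day 1 and all non-responders on day 2, so the processing batch is perfectly confounded with the biology being studied.

```python
import random

random.seed(0)

# Toy simulation (made-up effect sizes): responders are measured on
# day 1, non-responders on day 2, so batch is confounded with biology.
biology = 0.5   # true expression shift due to response status
batch = 3.0     # hypothetical day-to-day processing shift

n = 2000  # measurements per day (e.g. 10 samples x 200 genes)
day1 = [random.gauss(biology + batch, 1.0) for _ in range(n)]
day2 = [random.gauss(0.0, 1.0) for _ in range(n)]

# The observed shift mixes the batch term (3.0) with the biological
# signal (0.5); with this design the two cannot be separated.
observed = sum(day1) / n - sum(day2) / n
print(f"observed shift: {observed:.2f}")
```

The observed difference between groups comes out near 3.5, dominated by the batch term rather than the biological one, which is exactly why a classifier trained on such data learns to recognize the day of the experiment.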
Sergey Fomel just told me about a special session on reproducible research at the “Berlin 6 Open Access Conference” in Düsseldorf, Germany. Presentations from the conference are available online.
Sergey Fomel and Sünje Dallmeier-Tiessen gave presentations in geophysics. Patrick Vandewalle and Jelena Kovacevic gave presentations in signal processing. Mark Liberman, Kai von Fintel, and Steven Krauwer gave presentations related to language and technology.
Video of the presentations is available here.
Thomas Guest has a new blog post Books, blogs, comments and code samples discussing the challenges of writing a book that contains code samples, may be rendered to multiple devices as well as paper, etc. He points to a project by author Scott Meyers called Fastware that explores ways of meeting these challenges. I haven’t had time to explore Fastware yet, but it sounds like it is concerned with some of the same problems that come up in reproducible research.
My previous post discussed Keith Baggerly and his efforts as a “forensic bioinformatician.”
In that article, the reporter asks Keith to name the biggest problem he sees in trying to reproduce results.
It’s not sexy, it’s not higher mathematics. It’s bookkeeping … keeping track of the labels and keeping track of what goes where. The thing that we have found repeatedly in our analyses is that it actually is one of the most difficult steps in performing some of these analyses.
I’ve seen presentations where Keith discusses specific bookkeeping errors. Quite often columns get transposed in spreadsheets, so researchers are not analyzing the data they say they are analyzing.
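One cheap defense against the bookkeeping errors Keith describes is to cross-check sample labels mechanically before any analysis runs. The sketch below is hypothetical: the function name and the sample IDs are illustrative, not drawn from any particular study.

```python
def check_labels(annotation_ids, expression_columns):
    """Compare sample IDs from an annotation table against the column
    labels of an expression matrix, position by position. Returns the
    positional mismatches and any difference in length."""
    mismatches = [
        (i, a, e)
        for i, (a, e) in enumerate(zip(annotation_ids, expression_columns))
        if a != e
    ]
    length_diff = abs(len(annotation_ids) - len(expression_columns))
    return mismatches, length_diff

# Example: samples at positions 1 and 2 have been swapped.
annotation_ids = ["S01", "S02", "S03", "S04"]
expression_columns = ["S01", "S03", "S02", "S04"]
mismatches, length_diff = check_labels(annotation_ids, expression_columns)
print(mismatches)   # each entry is (position, annotation ID, column label)
```

Running a check like this before every analysis, and failing loudly on any mismatch, turns a silent label swap into an immediate error rather than a published result based on the wrong data.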