I am glad to let you know that our paper has been published in the latest issue of IEEE Signal Processing Magazine:
P. Vandewalle, J. Kovacevic and M. Vetterli, Reproducible Research in Signal Processing – What, why, and how, IEEE Signal Processing Magazine, Vol. 26, Nr. 3, pp. 37-47, 2009, DOI: 10.1109/MSP.2009.932122.
Have you ever tried to reproduce the results presented in a research paper? For many of our current publications, this would unfortunately be a challenging task. For a computational algorithm, details such as the exact data set, initialization or termination procedures, and precise parameter values are often omitted from the publication for various reasons: a lack of space, a lack of self-discipline, or an apparent lack of interest from readers, to name a few. This makes it difficult, if not impossible, for someone else to obtain the same results. In our experience, it is often even worse: we are not always able to reproduce our own experiments, which makes it hard to answer colleagues' questions about the details. Here are some examples of e-mails we have received:

“I just read your paper X. It is very completely described, however I am confused by Y. Could you provide the implementation code to me for reference if possible?”

“Hi! I am also working on a project related to X. I have implemented your algorithm but cannot get the same results as described in your paper. Which values should I use for parameters Y and Z?”
Enjoy reading! And feel free to post your comments!
Last month, a few former colleagues at LCAV did some cross-testing of the reproducible research compendia available at rr.epfl.ch. And I must say, from the results I have seen so far, it is quite a sobering experience. Many of the compendia that I considered definitely reproducible did not pass the test (entirely). I guess that shows again how difficult it is to make work truly reproducible, even when you fully intend to do so. It also strengthens my conviction that for papers without code and data online, reproducing the exact results is almost impossible. There is work to be done on the road to reproducible research!
I’ll need to look further into the reasons why even some of my own work did not pass the test.
The current issue of Computing in Science and Engineering (CiSE) is a special issue on reproducible research, edited by two pioneers in the field: Jon Claerbout and Sergey Fomel. They have assembled a great set of articles from experts with a lot of first-hand reproducible research experience, so I would highly recommend it to my fellow researchers!
I got a pointer earlier this week to a New York Times article about R, a very interesting piece about the use of R in scientific communities and industrial research, mainly for statistical analysis. R is open source software, so it is free and has already benefited from contributions by many authors. And (although I haven't used it myself yet) it is a great tool for reproducible research: using the Sweave package, authors can write a single document containing both their article and the R code that reproduces the results and inserts them in place. This ensures that all the material lives in a single place.
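As a rough illustration (I have not used Sweave myself, so treat this as a sketch based on its documentation, with hypothetical content), a Sweave source file interleaves ordinary LaTeX with R code chunks; running Sweave on the file executes each chunk and replaces it with its output before the document is typeset:

```latex
\documentclass{article}
\begin{document}

\section{Results}

% An R code chunk: Sweave runs this code and inserts both the
% code and its printed output into the typeset document.
<<echo=TRUE>>=
x <- rnorm(100)   # hypothetical data, just for illustration
mean(x)
@

% \Sexpr{} inlines a computed value directly into a sentence,
% so the number in the text always matches the code.
The sample mean is \Sexpr{round(mean(x), 3)}.

\end{document}
```

The appeal for reproducibility is that the numbers and figures in the paper are regenerated from the code every time the document is built, so they cannot drift out of sync with the analysis.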
It also shows something about the amazing power of open source software developed by a community of authors (and typically users at the same time).
An article close to my current work on 3D now:
D. Scharstein and R. Szeliski, A taxonomy and evaluation of dense two-frame stereo correspondence algorithms, International Journal of Computer Vision, 47(1/2/3), pp. 7-42, April-June 2002.
In their article, Scharstein and Szeliski compare stereo estimation algorithms. But they do not just offer this overview of algorithms. On their webpage, they also provide the source code and a widely used dataset of stereo images, and they invite other researchers to run their own algorithms on this dataset and upload the results. Over the years, this has grown into a performance comparison of almost 50 stereo algorithms, nicely listed on their webpage.
A nice example of what reproducible research can do! I think we need a lot more of these comparisons on common (representative) datasets.
I just read the following article:
C. Laine, S. N. Goodman, M. E. Griswold, and H. C. Sox, Reproducible Research: Moving toward Research the Public Can Really Trust, Annals of Internal Medicine, Vol. 146, Nr. 6, pp. 450-453, 2007.
A very interesting article about how the journal "Annals of Internal Medicine" is promoting reproducible research. They do not require that all papers be reproducible, but they do ask the authors of each paper whether theirs is. If it is, the journal provides links to the protocol, data, or statistical code that was used.
While this still does not guarantee, certainly in medicine, that the entire research work is reproducible, it does add a lot of information (and credibility) to the presented work. I (as an ignorant researcher) also found it very interesting to read the description of the thorough editorial process that each paper undergoes. I have put an overview of journals' reproducible research initiatives on our RR links page. That is, the initiatives I know about, of course. Feel free to let me know if you know of other examples!
This policy was prompted, among other things, by an article on the topic by Peng et al. It would be great if other journals followed these examples, and reproducible research became the ‘default’ for a paper…