Reproducible research, literate programming, open science, and science 2.0. All different names, and (in my opinion) all covering largely the same topic: sharing the code and/or data that complement a publication as a presentation of your research work. While literate programming is more focused on adding documentation to code, and science 2.0 seems to include the assumption that you put work in progress online, there really seems to be a very large intersection between these topics.
This clearly shows that the same ideas pop up from various sides of the scientific community, in very different fields of science. That is a really exciting thing! At the same time, it also shows that there is a clear need for such open publication of a piece of research. And I think everyone will agree that there would be nothing nicer than being able to really start from the current state of the art when starting to do research in a certain field.
Should all these efforts be merged under a single “label”? It would definitely be exciting. And it would create a huge impact, as a joint effort for “open science”, “reproducible research”, or whatever the name may be, would receive a lot of attention, and could no longer be overlooked by anyone. At the same time, every research domain has its own specifics and needs its own fine-tuning, and it is not clear to me now what the “best” setup would be for the type of work I am doing now. So maybe we should let these variations co-exist for some more time, and see later which ones survive, which are the simplest to use, and which tools can be combined to create an optimal method for research.
But of course (if anyone is reading these posts), I would be very happy to hear your own opinion on this!
A few months ago, I read in a Belgian newspaper that 9% of the participants in a study among 2,000 American scientists said they had witnessed scientific fraud within the past three years. And it seems they were not talking about cases where people use Photoshop to crop an image or the like, but rather about inventing fake results or falsifying articles.
Although I wasn’t able to find this again on the web with Google, I am quite sure the original authors checked the number. Wikipedia reports on another study, where the actual number was 3%. Anyhow, whether it is 3 or 9 percent, this number is much too high. Let us hope it can be brought down by requiring higher reproducibility of our research work. I do realize that there will always be people cheating and falsifying results (Wikipedia even keeps a list of the most famous cases). But I also strongly believe that in the end, most researchers just want to do good work. And many of them perform non-reproducible work just because they don’t feel the need to make it reproducible (yet). Or they are too busy with their next piece of work to properly finish off the current one…
Welcome to my personal blog!
On these pages, I plan to post thoughts and ideas on reproducible research, image processing research, or other things I find interesting enough to share with “the world” (that means you). It is also meant for experimenting with this medium, so it is still a bit unclear to me what and how often I will post here. I guess that will also depend on your feedback…
To be honest, it’s not my first attempt at blogging. When I was still at EPFL, I already started a blog on reproducible research, but somehow I never managed to publish things regularly enough there. So this time I’ll try to keep it a bit broader, and write a bit more regularly.
An article about reproducible research appeared in the July 2007 issue of the IEEE Signal Processing Magazine eNewsletter. It invites readers to discuss reproducible research on our discussion forum.
Recently, a note encouraging authors to make their publications reproducible was also added to the IEEE Transactions on Signal Processing homepage.
Things are moving, and they are moving fast!
Things have been rather quiet here recently… Not because nothing was happening on reproducible research, but mainly because I was not sure about the purpose and use of this blog. Please feel free to let me know if you have any thoughts about the usefulness (or lack of it) of such a blog.
After some interesting discussions about reproducible research and open access, some colleagues have reported on our reproducible research initiatives on their blogs:
– Peter Murray-Rust wrote an article “Open Data is critical for Reproducible Research” on his blog at the University of Cambridge. He is quite active on Open Access to publications and data in chemistry. He and his colleagues have built a robot that extracts crystallographic information from publications and gathers it in an online database, CrystalEye. In their community, they also have the Blue Obelisk, which collects open source code and data in chemistry.
– Peter Suber referred to reproducible research on his Open Access News blog: OA for text, data, and code to make research reproducible. Peter is a policy strategist for open access to scientific and scholarly research literature. On his blog, he gives a lot of news about new initiatives, publishing policies, etc.
And thanks of course also to Stevan Harnad for his kind and helpful reactions, and for bringing me into contact with these people!
At the next ICASSP conference (April 15-20, 2007, in Hawaii), we are organizing a special session on the topic, together with Mauro Barni and Fernando Perez-Gonzalez. This should allow discussion and exchange of ideas with a broader public. We will have six papers covering various aspects of reproducible research: case studies, public datasets, publishing issues, tools for making research reproducible, etc. I can already say that the papers look very interesting. I am already looking forward to the conference and the inspiring discussions around it (and not only because of its location ;-)).
The idea behind reproducible research is quite simple: all the information relevant to the work should be made available. This means that the publication(s) and the data and code used to produce the results, figures, etc. should be available, typically online. In practice, this does require some effort, which is largely paid back in additional visibility, impact, and ease of reuse of the work.
Welcome to this blog about reproducible research!
The goal of this blog is to exchange ideas about reproducible research in general, and about how to make all our lab’s research reproducible. May the exchange of ideas be fruitful, and result in a good setup for reproducible research!