Author Archives: John

Forensic bioinformatics

The October 2008 issue of AMSTAT News has an article entitled “Forensic Bioinformatician Aims To Solve Mysteries of Biomarker Studies.” The article is about Keith Baggerly of M. D. Anderson Cancer Center and his efforts to reproduce analyses in bioinformatics papers.

The article quotes David Ransohoff, professor of medicine at UNC Chapel Hill, saying this about Keith Baggerly.

I think Keith is doing a wonderful and needed job … But the fact that we need people like him means that our journals are failing us. The kinds of things that Keith spends time finding out — what [the researchers] actually do — that’s what methods and results are supposed to be for in journals. … We need to figure out how to do science without needing people like Keith.

One of the reasons for lack of reproducibility is that journals press authors for space and so statistics sections get abbreviated. (Why not put the full details online?) Another reason is that bioinformatics articles are inherently cross-disciplinary and it may be that no single person is responsible for or even understands the entire article.

Embedding .NET code in Office documents

I recently heard about some interesting tools from Blue Reference. I haven’t had a chance to try them out yet, but they look promising.

Sweave has received a fair amount of attention with regard to reproducibility because it lets you embed R code in LaTeX. Code stays with the presentation document, reducing the chance of error and increasing transparency. However, the number of people who use R and LaTeX is small, and asking people to learn these two packages before they can do reproducible research is not going to fly. The number of people who use C# and Microsoft Word is orders of magnitude larger than the number of folks who use R and LaTeX.
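Sweave’s chunk syntax is simple enough that the mechanism can be sketched in a few lines. The toy weaver below borrows Sweave’s actual chunk delimiters (`<<>>=` … `@`) but executes Python rather than R, purely to illustrate how code and prose travel in one file; it is a sketch of the idea, not a substitute for Sweave itself.

```python
import contextlib
import io
import re

# A Sweave-style chunk: "<<name>>=" on its own line, code, then "@".
CHUNK = re.compile(r"<<.*?>>=\n(.*?)\n@", re.S)

def weave(doc: str) -> str:
    """Execute each code chunk and splice its printed output into the document."""
    env = {}  # chunks share one environment, as Sweave chunks share one R session

    def run(match):
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            exec(match.group(1), env)
        return buf.getvalue().rstrip()

    return CHUNK.sub(run, doc)

source = """The mean of our three measurements is:
<<compute>>=
xs = [2.0, 3.0, 4.0]
print(sum(xs) / len(xs))
@
which we report in the abstract."""

print(weave(source))
```

Because the number comes from running the embedded code, there is no hand-copied value in the document to go stale.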

It looks like Blue Reference’s product Inference for .NET lets .NET programmers do the kinds of things Sweave lets R programmers do, embedding .NET code in Microsoft Office documents. They also make a product, Inference for MATLAB, for embedding MATLAB code in Office documents.

Python developers who don’t think of themselves as .NET developers might want to use Inference for .NET to embed Python code in Word documents via Iron Python. Ruby developers might want to use Iron Ruby similarly.

Programming is understanding

Bjarne Stroustrup’s book The C++ Programming Language begins with the quote

Programming is understanding.

Many times I’ve thought I understood something until I tried to implement it in software. Then the process of writing and testing the software exposed my lack of understanding.

One thing that can make reproducible research difficult is that you have to deeply understand what you’re doing. Making work reproducible may require automating steps that you do not fully understand, and don’t realize that you don’t understand until you try automating them.

Stated more positively, attempts to make research reproducible can lead to new insights into the research itself.

Related post: Paper doesn’t abort

Musical chairs and reproducibility drills

In a recent interview on the Hanselminutes podcast, Jeff Webb said that if he were to teach a computer science class, he would have the class work on an assignment, then a week later make everyone move over one chair, i.e., have everyone take over the code their neighbor started. Aside from the difficulty of assigning individual grades in such a class, I think that’s a fantastic idea.

Suppose students did have to take over a new code base every week. People who write mysterious code would be chastised by their peers. Hopefully people who think they write transparent code would realize that they don’t. The students might even hold a meeting outside of class to set a strategy. I could imagine someone standing up to argue that they’re all going to do poorly in the class unless they agree on some standards. It would be fantastic if the students would discover a few principles of software engineering out of self-defense.

I had a small taste of this in college. My first assignment in one computer science class was to add functionality to a program the instructor had started. Then he asked us to add the same functionality to a program that a typical student had written the previous semester. As the instructor emphasized, he didn’t pick out the worst program turned in, only a typical one. As I recall, the student code wasn’t terrible, but it wasn’t exactly clear either. This was by far the most educational homework problem I had in a CS class. I realized that the principles we’d been taught about how to write good code were not just platitudes but survival skills. Later my experience as a professional programmer and as a project manager reinforced the same conclusion.

In some environments, it’s not practical to have people switch projects unless it is absolutely necessary. Maybe the code is high quality (and maybe it’s not!) but there is a large amount of domain knowledge necessary before someone could contribute to the code. But at least software developers ought to be able to build each other’s code, even if they couldn’t maintain it.

When I was managing a group of around 20 programmers, mostly working on one-person projects, I had what I called reproducibility drills. These were similar to Jeff Webb’s idea for teaching computer science. I had everyone try to build someone else’s project. These exercises turned out to be far more difficult than anyone anticipated, but they caused us to improve our development procedures.

We later added a policy that a build master would have to extract a project from version control and build it without help from the developer before the project could be deployed. The developer was allowed (required) to create written instructions for how to build the project, and these instructions were to be in a location dictated by convention. The build master position rotated so we wouldn’t become too dependent on one person’s implicit knowledge.

Having a rotating build master was a great improvement, but it lacked some of the benefits of the reproducibility drills. The build master procedure only requires a project to be reproducible before it’s deployed. That is essential, but it could foster an attitude that it’s OK for a project to be in bad shape until the very end. Also, some projects never actually deploy, such as research projects, and so they never go to the build master.

Medieval project management

I wrote a post on my personal blog recently called Medieval software project management. The post compares software project management to the medieval practice of “beating the bounds,” having young boys walk the perimeter of a parish to memorize the boundaries in order to preserve this information for their lifetimes. Many research projects use a similar strategy, assigning one person to a project for life, depending on that person’s memory rather than capturing project information in prose or in software.

Johanna Rothman wrote a response to the medieval project management post in which she gives good advice on how businesses can avoid such traps. Here’s an excerpt.

Here’s what I did when I was a manager inside organizations, and what I suggest to clients now: make sure a team works on each project. That means no single-person projects, ever. A team to me contains all the people necessary to release a product. Certainly a developer and a tester. Maybe a writer, maybe a release engineer, maybe an analyst. Maybe a DBA. Whatever it takes to release a product, everyone’s on the team. Everyone participates. If they can automate their work and explain it to other people, great. But it’s not a team unless the team can release the product. (Emphasis added.)

It would be terrific progress if more scientific programming were done this way. In theory, science strives for a higher standard. Not only should your team of colleagues be able to reproduce your work, so should anonymous scientists around the world. But in practice, science often has lower standards than business with regard to software development.

Provenance in art and science

Here’s an excerpt from Jon Udell’s interview with Roger Barga explaining the idea of provenance in art and science.

JU: Explain what you mean by provenance.

RB: Think about it in terms of art. For a given piece of art, we’re able to establish through authorities that it’s original, where it came from, and who’s had their hands on it through its lifetime. Provenance for a workflow result is the same thing. Minimally we want to be able to establish trust in a result. If you think about how that happens, it often starts by considering who wrote the workflow. So with Trident you can click on a result and interrogate the history of the workflow: who wrote it, who reviewed it, who revised it, when it first entered the system.
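The kind of interrogation Barga describes can be pictured as a small data structure attached to each result. This is only a sketch of what a provenance record might carry; the class names, fields, and example entries are invented for illustration, not Trident’s actual model.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ProvenanceEvent:
    action: str  # e.g. "authored", "reviewed", "revised", "entered system"
    who: str
    when: str    # ISO date

@dataclass
class WorkflowResult:
    name: str
    history: List[ProvenanceEvent] = field(default_factory=list)

    def record(self, action: str, who: str, when: str) -> None:
        """Append one event to the result's chain of custody."""
        self.history.append(ProvenanceEvent(action, who, when))

    def provenance(self) -> List[str]:
        """Answer the interrogation: who touched this result, and when."""
        return [f"{e.when}: {e.action} by {e.who}" for e in self.history]

# Hypothetical usage: build up the history as the workflow moves through hands.
result = WorkflowResult("classifier-workflow-v2")
result.record("authored", "alice", "2008-06-01")
result.record("reviewed", "bob", "2008-06-15")
print(result.provenance())
```

The point is only that provenance is an append-only history attached to the result itself, not metadata reconstructed after the fact.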

Test-driven development

Test-driven software development has much in common with reproducible research. Here’s an excerpt from a talk by Kent Beck, one of the most visible proponents of test-driven development. He says test-driven development isn’t about testing.

Testing really isn’t the point. The point here is about responsibility. When you say it’s done is it done? Can you go to sleep at night knowing that the software you finished today works and will help and isn’t going to take anything away from people?

You could say similar things about reproducible research. RR is about responsibility, really finishing a project rather than sorta finishing it. Can other people build on top of your work with confidence? Can you build with confidence tomorrow on the work you did today?

Software unit tests exist not only to verify that code is correct, but to ensure that the code stays correct over time. These tests act as tripwires. The hope is that if a future change introduces a bug, a unit test will fail. Again, similar remarks apply to RR. With RR, you’re not just interested in producing a result. You’re also giving some thought to producing a variation on that result with minimum effort and maximum confidence in the future when something changes.
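Here is a minimal example of the tripwire idea; the `normalize` function is invented for illustration. If a later edit breaks the invariant, the assertions fail immediately instead of letting a wrong result slip quietly into an analysis.

```python
def normalize(xs):
    """Scale a list of numbers so they sum to 1."""
    total = sum(xs)
    return [x / total for x in xs]

def test_normalize():
    # Tripwire: these invariants must survive any future change to normalize().
    result = normalize([2.0, 3.0, 5.0])
    assert result[0] == 0.2
    assert abs(sum(result) - 1.0) < 1e-12

test_normalize()
print("all tests pass")
```

The test is cheap to write once, and from then on it re-verifies the result every time the code changes, which is exactly the property you want from a reproducible analysis.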

How to teach RR in one hour

Last week Greg Wilson asked me what I would do if I had one hour to teach a group about reproducible research. He said to assume that the group is already convinced of the need for reproducibility.

First, here are some thoughts on what I’d say if the group had not given much thought to reproducibility. I would start impersonal and then become more personal. I’d start by relating some horror stories of how someone else’s work was impossible to reproduce and contained false conclusions. It’s easy to gang up on some third-party researcher, griping about how sloppy someone not in the room was in their research. This plants the idea that at least some people need to think more about reproducibility. Then I’d transition by talking about times when I’ve had difficulty reproducing my own work. Then I would try to convince them that their own work is probably not reproducible, or at least not easily reproducible. So my outline would be: they have problems, I have problems, you have problems.

I believe that convincing people of the need to be concerned about reproducibility is most of the problem. If people are highly motivated, they will come up with their own ways to make their work easier to reproduce and they will gladly take advantage of tools they are introduced to.

Back to Greg’s original question: now what? First I’d expound the merits of version control systems. You can’t possibly reproduce software if you can’t put your hands on the source code, and you can’t reproduce software as it existed at a particular point in time without revision history. Then I’d emphasize that version control is necessary but not sufficient. When people first understand version control, they tend to think it takes care of all their reproducibility problems when in fact it’s just the first step. I’d share some war stories of projects that have taken many hours to build even though we had all the source code. (If I had a semester rather than an hour, I’d bring in some outside projects for them to rebuild and let them experience this for themselves rather than just telling them about it.) I’d also emphasize that it’s not enough to put code in version control: data needs to be versioned as well.
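One lightweight way to version data is to check a manifest of cryptographic digests into version control alongside the code. The `snapshot_data` helper below is my own invention for illustration, not a standard tool: a later run can compare fresh digests against the committed manifest and detect whether the data has silently changed since the original analysis.

```python
import hashlib
import json
from pathlib import Path

def snapshot_data(data_dir, manifest_path):
    """Record a SHA-256 digest of every file under data_dir in a JSON manifest.

    Committing the manifest next to the code ties a code revision to the
    exact data it was run against."""
    root = Path(data_dir)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = str(path.relative_to(root))
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    Path(manifest_path).write_text(json.dumps(digests, indent=2, sort_keys=True))
    return digests
```

To audit a later run, recompute the digests and diff them against the committed manifest; any mismatch means the analysis is no longer running on the data the original results came from.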

Once they grok version control, I’d discuss automation. When a process is 99% automated and 1% manual, the reproducibility problems come from the 1% that is manual. The principle behind many reproducibility tools is automating steps that are otherwise manual, undocumented, and error-prone. (See Programming the last mile.)
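As a sketch of that principle: once each formerly manual step is a function, the whole analysis collapses to one call, and there is no manual 1% left to misremember. The steps here are invented for illustration, standing in for whatever a real pipeline does.

```python
def clean(raw_lines):
    """Step 1 (was manual): drop blank rows and parse the rest."""
    return [float(line) for line in raw_lines if line.strip()]

def analyze(values):
    """Step 2 (was manual): compute summary statistics."""
    return {"n": len(values), "mean": sum(values) / len(values)}

def report(stats):
    """Step 3 (was manual): format the result for the write-up."""
    return f"n={stats['n']}, mean={stats['mean']:.2f}"

def run_pipeline(raw_lines):
    """The whole analysis behind one call -- nothing left to do by hand."""
    return report(analyze(clean(raw_lines)))

print(run_pipeline(["2.0", "", "4.0"]))
```

Each function also documents a step that previously lived only in someone’s head, so the automation doubles as a record of the procedure.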

Finally, I’d emphasize the need for auditing. As I pointed out in an earlier post “You cannot say whether your own research is reproducible. It’s reproducible when someone else can reproduce it.” Again if I had a semester rather than an hour, I’d let them experience this by having them reproduce each other’s assignments. I could hear it now: “What do you mean you can’t reproduce my homework? It’s all right there!”

Which comes first, users or tools?

Greg Wilson and I have been discussing the importance of tools in reproducible research lately. Would more people use reproducible research practices if tools made doing so more convenient? Would better tools appear if more people cared about reproducibility?

I believe both statements are true, and I believe Greg does as well. However, he and I have different emphases. Greg says “In my experience, most people won’t adopt a programming practice unless there is at least some basic support for it.” I agree, but I think the biggest obstacle to more widespread reproducibility is that few people realize or care that their work is irreproducible. I think that when more people care about reproducibility, some percentage of them will develop and give away tools and we’ll have enough tool support.

We are not in a chicken-and-egg scenario. It’s not as if Greg is saying first we need tools and I’m saying first we need users. We have both tools and users. There are people who care about reproducibility, and some of them have produced tools that make it easier for others to follow. But not many of these people know each other or know about their tools. I hope that the ReproducibleResearch.org web site and this blog will change this.

It helps to look at the early history of object-oriented programming. Some people were writing object-oriented programs before there were (popular) object-oriented languages. For example, some people were writing object-oriented C before C++ baked support for OO into the language. This was painful, but some pioneers did it. To Greg’s point, the number of programmers writing OO programs took off once there were OO languages with good tool support. To my point, first there were programmers wanting to write OO code; these were the folks who developed the tools and the early adopters of the tools.