Replicable by design

At EACL last year, I had a lengthy chat with a guy next to his poster about the (ir)replicability of some high-profile papers in information retrieval [1]. In my own five or so years of research, I have often run into reproducibility problems too. Many PhD students out there probably have similar experiences.

Obviously, researchers should take full responsibility for producing replicable research. But we should also recognize the underlying systemic issue: researchers are not rewarded for making their work repeatable. By the time a paper is accepted, you are already in the middle of a new one, so there’s no time to make your old code re-runnable (if that’s possible at all). On top of that, the likelihood (or threat) of your work being reproduced is terribly small. There are not many reports of reproducibility problems in NLP, and retracted papers are non-existent. While big conferences are starting to address this problem (COLING 2018 has a track for reproduction and LREC 2018 also mentions “replicability and reproducibility issues”), I suspect it will take years for the effect to be felt.

In the meantime, what we can do is align the effort with the incentive. Ideally, it should take no extra work to make your research replicable. The solution, I think, is to make experiments replicable by design.