How I Fail in Open Science

Last week I had the pleasure of giving a talk at the OpenMR Benelux event, wonderfully organized by @fmrwhy. Although the slides and a video of the talk will be available online, for those of you who prefer reading, I thought I would write up a few of the things I mentioned during my talk.

As I mentioned in my talk, I was feeling a bit like an imposter speaking at this event, since I neither do a lot of MR, nor a lot of open science. Nevertheless, I’ve decided to be open about how open my science is and share my experiences with it so far – hence the title “How I Fail in Open Science”.

Open science during my PhD 

My story begins in 2011 when I started my PhD. After focusing on workshop papers for two years, I realized I needed journal papers to graduate. I submitted three papers that year and followed the suggestion to post them on arXiv because the review process could be lengthy. I used public datasets and a publicly available MATLAB toolbox, and since both the data and tools were online, I didn’t think it was necessary to share the rest of my code.

In 2015 the papers were finally accepted and I finished my PhD. Because the papers had already been online for two years, I was able to benefit from the preprint bump. I would also occasionally get emails about the experiments in my papers. I then decided to share my (non-version-controlled) experiment code to reproduce the results table in the paper. Miraculously, even after two years I was still able to run my code AND get the same results. So I shared the code under the CRAPL license, which I felt absolved me from doing any other “cleaning up of the code”.

Open science during my postdoc

After starting my postdoc in 2015 I felt like I should publish as fast as possible. Instead of investigating the best tools for my project, I decided to go with my tried and trusted method. This was not a good strategy and in retrospect, I would have been much better off investing some time into switching to Python, creating clean code and so forth. In the end I didn’t publish much at all that year.

The publishing situation became even worse in 2016 when I started searching for my next job. However, since I was updating my CV often, I did also decide to share a few more things online. I also started using social media more often, and learning more about open science in general. 

Open science now

In 2017 I found myself in a tenure track position. Inspired by everything I saw on Twitter, I wanted to do everything right – switch to Python, publish in new open access journals, share everything online. I quickly discovered that this is not feasible next to all the other responsibilities you have when starting on the tenure track.

The only thing I have been doing consistently is posting preprints on arXiv. Here and there I have a paper for which I’ve shared data or code (still not version-controlled), but it’s not something that happens by default.

Why is my science not as open as I want it to be? It’s easy to say there’s too little time, but in the end it is a question of priorities. I am still influenced by my grant reviewers who tell me “that’s nice, but you should have published more”, and the funding agency that agrees with them. And although overall my experience on Twitter has been positive, people with strong opinions about what counts as open science can be quite intimidating.

How can I do better? I cannot change the system, but I can at least try to create a habit out of being more open. To do so I decided to draw parallels between open science and another area of my life in which I’ve had both successes and failures – running! 

Strategy 1: Start slow and focus on process

The first strategy is to start slow and focus on process. Find a thing that’s easy to do, and do it often. For running, my thing was “go for a run three times a week”. Note that there’s no distance or time – I just had to go out of the house, and even running 10 minutes was a success. If I had set a more difficult goal than that, I would get discouraged and quit – something that has happened to me several times before.

Translating this to open science, it’s a bad idea to try to do everything at once. I started with preprints and am now slowly adding the habit of sharing other things online. I do this by using templates in Todoist. For example, every time I agree to give a talk, I import a fixed set of tasks, including “Create slides”, but also “Upload slides to website”.

Todoist project for the OpenMR talk, which includes preparing the talk but also sharing the slides

Strategy 2: Find accountability and support

To motivate yourself to continue with the habit you need to find accountability and support. With running, I find accountability by signing up for 10K races and then deciding that it’s probably going to be better for me to train on a regular basis. I also have a few friends who have either been running for a long time, or are just getting into it, so we can support each other. 

With sharing data and code, I feel accountable towards my students. I want them to do things better than I did myself, so I’m helping them set up their projects on GitHub from the start (inspired by Kirstie Whitaker). The code might still not be clean and run out of the box, but I feel like it’s an important first step.

As for support, I’m in a Slack group with other academics where we discuss this and other issues. And of course Twitter is a great place to learn new things and find people who are trying to improve their open science too. 

Strategy 3: Reward yourself

Finally, to create a habit, don’t forget to reward yourself! After a race I might get a beer and a badge in my Strava app. But of course there are also long-term rewards such as overall health, and being able to socialize with others.

For open science there are also various metrics such as the Altmetric score – here’s an example for a recent preprint. There are also gamified rewards, for example badges on ImpactStory. But more important is feeling the impact of your work on others, such as a thank you email, or an invitation to talk at an OpenMR event 🙂

***

Do you struggle with sharing your work online? Or do you have any other helpful strategies? Leave a comment or let me know on Twitter!

Why you should post preprints on arXiv

Recently on Twitter I saw a lot of discussions about preprints, such as under the #ASAPBio hashtag, which originated in the biology community. My guess is that how common preprints are varies by field. I thought posting them was normal in Computer Science, so I assumed I couldn’t contribute anything to the topic. But I’ve encountered some doubts when I encouraged other CS students to upload their work to arXiv, so I thought I’d share my N=1 experience with preprints.

Long story short, I spent a good part of 2013 writing journal papers. I submitted three of them that year, and directly uploaded the submitted versions to arXiv. You can see my page on arXiv here.

I spent 2014 revising these papers. One paper was accepted in 2014, and two others only in 2015, when I was already a postdoc. One of the accepted papers is still in press, even though it is already 2016. I imagine it will be three years (!) between my initial submission — that really isn’t that different from the revised version — and the published version. And this is in Computer Science, a fast-moving field!

As a PhD student / postdoc / aspiring researcher, you can’t really afford such a time lag. And that is where preprints have been immensely helpful to me in different ways:

  • Two of the papers were based on earlier conference papers. When I was discussing that work with other researchers (at conferences, via email), I could send them the preprint, which contained more detailed results.
  • The third paper (a type of survey) was completely new, and I was a bit scared that somebody would publish something similar before me. The preprint was actually a way to assure myself that it was now documented that I came up with the idea. Again, I discussed this work with other researchers while it was already on arXiv, and even got some valuable comments, which helped me a lot when revising the paper.
  • The preprints were cited (mostly by myself, but also by other researchers). After publication, I merged each preprint with the published version in Google Scholar. I don’t really have a lot of citations, but I would have had even fewer if the papers had only become available in 2015 instead of 2013.
  • I didn’t apply for jobs while I had any unpublished preprints, but if I had, I could have put the preprints on my CV, which is more informative than simply listing the paper title with the comment “manuscript in preparation”.
  • Most journals allow this! You can check on this website what your journal’s policy is.

If you are a student in Computer Science (or anything, really) and you are hesitating to upload a preprint of your recent work, I hope this might change your opinion a little bit.
