The European Journal of Personality promotes the development of all areas of current empirical and theoretical personality psychology. Welcome to the EJP Blog, the landing page for news related to the European Journal of Personality.

How to Use R Markdown Reports to Shorten Review Times, Receive Better Feedback, and Ensure Reproducibility

A post by Daniel A. Briley

I have vivid memories from graduate school of Daryl Bem’s extrasensory perception (ESP) paper being published. Our journal club at the time had a spirited conversation about the paper and the general state of the field. As has been written about ad nauseam, this paper claiming to have found paranormal powers played an important role in kicking off the replicability crisis in psychology. A finding is replicable if other researchers in other labs can run the same procedure and find similar results. In this post, however, my focus is on a related and even more fundamental topic: reproducibility. Reproducibility refers to the ability to run the same code on the same data reported in a paper and obtain the same results. Who cares about replicating findings in a new sample if we cannot even reproduce results in the same sample? Reproducibility can be depressingly low. In this post, we will see how R Markdown reports can be used to ensure that your results are reproducible, saving you many headaches down the road. And if that isn’t enough of a carrot, my experience is that R Markdown reports can cut out at least one round of reviews, giving you six months of your life back (although EJP has much quicker turnaround times!), and allow reviewers to offer more helpful feedback.

What is R Markdown?

An R Markdown report is a collection of executed code chunks embedded in descriptive text. Creating a report formalizes the loose statistical interpretation that accompanies most research projects in the early stages. The report can function essentially as a lab notebook, where you jot down new ideas, document problems, and interpret the estimates. These features set R Markdown reports apart from the current standard practice of including (example) analytic code. Simply including the code does not inform the reader as to the full results, the justification for specifications, or the meaning of the estimates. 

As an example, here is what such code often looks like:
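A bare code supplement might contain nothing more than the following (a minimal sketch using R’s built-in mtcars data; real example code would be longer):

```r
# Fit a linear regression of fuel efficiency on vehicle weight
fit <- lm(mpg ~ wt, data = mtcars)
summary(fit)
```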

Ok, we can see which commands were used. But we have little information about why they were chosen, what the results were, or how to interpret them.

Next, let’s look at output:
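For instance, pasted console output for a simple regression of mpg on wt in R’s built-in mtcars data looks like this (abridged):

```
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
wt           -5.3445     0.5591  -9.559 1.29e-10 ***

Residual standard error: 3.046 on 30 degrees of freedom
Multiple R-squared:  0.7528,	Adjusted R-squared:  0.7446
```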

Now we have some numbers, but we still don’t know why this approach was taken or how the authors interpret the numbers.

An R Markdown report, on the other hand, can provide all this information:
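A toy fragment of an R Markdown source file shows the idea (again using R’s built-in mtcars data; the text around the chunk carries the rationale and the interpretation):

````markdown
## Does vehicle weight predict fuel efficiency?

Weight is the strongest single predictor of fuel efficiency in this
data set, so we fit it alone before adding covariates.

```{r weight-model}
fit <- lm(mpg ~ wt, data = mtcars)
summary(fit)
```

Each additional 1,000 lbs of weight predicts a decrease of
`r round(-coef(fit)["wt"], 2)` miles per gallon, a substantial effect.
````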

The commands, approach, and output can all be interpreted together to show the flow of analyses. I have posted the R Markdown input file, the HTML output, and the data file to OSF. I use this set of files as an example of the different formatting choices with my students. Feel free to modify and build on the template.

It is rare in academia for there to be a simple, foolproof solution to a problem. R Markdown reports are one of the rare exceptions. With these reports, we can see exactly what data are being read into the statistical software. We can see the exact specification of the statistical model. And we can see the exact parameter estimates that are produced. As long as the estimates have been faithfully transferred from the report to the text document (if you are worried about this step, the entire manuscript can be written in R Markdown), there is no longer any question of whether the results are reproducible. The reproduction has literally been included as part of the submission. Congratulations, we just solved an important piece of the replicability crisis!

Personal Benefits

Of course, I probably need to appeal to your personal interests as an author to initiate behavioural change. Getting work accepted for publication in a timely fashion is incredibly important to building an academic career, particularly for graduate students and early-career researchers. Including an R Markdown report with your submission helps tremendously with the review process. I have handled several papers that have included R Markdown reports, and the review process is much more efficient. The typical flow of reviews goes something like this: A reviewer asks about a specific part of the analytic approach. The authors respond with more details. Actually, the reviewer thinks it should be done a different way. Then, the authors test out this possibility, and the reviewer needs to see the result to make sure it matches their concern (the reviewer is so invested now, they might as well keep reviewing for another year). 

When a detailed R Markdown report is included, the reviewer can offer a clear, detailed description of their preferred approach and the pros and cons of that approach compared to the authors’ approach. Since the reviewer can see the actual analysis, they can inform the editor of the full situation. Then, the authors respond, and because the editor has a fuller picture of the situation, they can make a decision right away. Hopefully, that means the editor does not need to burden the reviewer with an additional request, and you get to share your work in dramatically less time. Everyone wins. 

Practically speaking, how does one get these benefits when journals do not ask for such reports? Don’t let a missing dropdown option for a file type in the submission portal stop you. An R Markdown report can be submitted as supplemental material to a paper and can act as a much more detailed results section. Supplemental tables and figures can be embedded right in the report. 
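Embedding a supplemental table is a one-liner; for example, with knitr::kable (a sketch, assuming supp_results is a data frame holding your estimates):

```r
# Render a data frame of estimates as a formatted supplemental table
knitr::kable(supp_results, digits = 2,
             caption = "Table S1. Parameter estimates for all models.")
```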

R Markdown reports are also excellent for collaboration. When working with graduate students, communicating new statistical concepts can be difficult. Being able to see the exact code and results allows me to be a better advisor. Similarly, large collaborative teams are increasingly becoming the norm. The ability to produce analytic reports that researchers trained in different statistical traditions can interpret will be in demand. 

Limitations

Of course, it is not all roses. There are at least four obstacles to the wider adoption of analytic reports.

First, creating a report is another, somewhat tedious task to perform before submitting a paper. If generating a report is the last thing between you and that submit button for a (burnout-inducing) project, then I understand the urge to reject the idea. Researchers already do a lot of work to publish content for free. My advice would be to treat the analytic report as the first draft of the results section. From the report, you can select what the most relevant results are, while also remaining transparent about all the analyses that were run. 

Second, I’ve mostly discussed R Markdown reports. Learning a new programming language may be a barrier. For data analysis, I almost always use Mplus. Luckily, R can integrate with Mplus to run models and read in the results (e.g., with the MplusAutomation package). By adopting R Markdown reports, I freed myself from my organization “system” of thousands of input and output scripts, sometimes with letters or numbers signifying which model comes first (a truly terrible system; grad school should really include some sort of filing-system training). If you use a different statistical program, there are likely packages to integrate R with just about any software. An analytic report that contains all the code, output, and interpretive text could also be compiled in your preferred text-editing program. However, R Markdown has the extremely useful feature of allowing results to be piped into text or APA-formatted tables (e.g., with the apaTables package). Rather than redoing tables or figures when the model changes, R does it all for you automatically. 
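The piping of results into text deserves a concrete illustration. Inline R code embedded in the prose is re-computed every time the report is knit, so the reported numbers can never drift out of sync with the model (a sketch, assuming fit is a fitted lm object):

```markdown
Weight significantly predicted fuel efficiency,
b = `r round(coef(fit)["wt"], 2)`, with the model explaining
`r round(summary(fit)$r.squared * 100, 1)`% of the variance.
```

Similarly, a call such as apaTables::apa.reg.table(fit, filename = "Table1.doc") writes an APA-style regression table straight to a Word file.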

Third, we are all self-conscious about our code. I imagine there are a few people reading with sweaty palms at the thought of posting code. Won’t people mock my notes or my use of a loop instead of apply()?! Sure, probably. But who cares? Everyone’s code is messy. I often do tasks in a terribly inefficient way because it makes more sense to me. The ultimate goal of code is not to be as efficient as possible. It is to do a job effectively. Part of the job is to be intelligible, both to yourself and to a reader. With the computing power of even a middling laptop today, almost all analyses run for psychology papers can be performed in seconds. If we establish a norm around sharing analytic reports, we will all see each other’s messy code and realize it is alright. 

Fourth, this one is for the cynics out there. “Why put all this work in when nobody will ever look at it? Nobody ever checks my Open Science Framework page!” Although there is certainly some truth to this, my experience is that some reviewers do look at this material – I always do as an editor. All reviewers have their interests and expertise. In a group of three reviewers, there will likely be at least one person curious enough and interested enough in the topic to look over the code. I have handled several papers that have included analytic reports, and they are always commented on. Example code, on the other hand, rarely receives comments. Analytic code without results or interpretive text lacks the context necessary to make sense of the analysis. 

Resources

When I originally set out to write this post, I envisioned a tutorial on how to put together a report. I searched for other examples for inspiration. Turns out, there are some great resources out there! So, rather than reinventing the wheel, I thought I would get more argumentative to encourage adoption of the practice and then include some links. If you would like to learn more about how to generate R Markdown reports, I would suggest checking out the excellent R Markdown: The Definitive Guide by Xie, Allaire, and Grolemund. The book is comprehensive and written entirely in R Markdown (with the help of the bookdown package). For a shorter introduction with helpful videos, I would also recommend the materials put together by the RStudio team, including amazing cheatsheets. For more applied material, check out here, here, or here.

To conclude, R Markdown reports can solve the reproducibility problem, shorten your review times, enhance the quality of your reviews, help with collaboration, and generally make your quantitative data analytic life easier. These are the carrots to encourage researchers to take up this behaviour. You may want to learn now before the sticks come out. Personally, I would favour making an analytic report a requirement of submitting a paper because of the amount of effort it saves editors and reviewers. Currently, there is no clear plan to implement this policy at EJP, but a number of individuals have expressed support. EJP has been a leader in rigour in personality psychology. It may be only a matter of time before any reader can learn how to perform analyses reported in the journal by looking up the exact analytic code, a tremendous resource for aspiring data analysts.  
