Tools for Reproducible Research in Linguistics



Bradley McDonnell, Andrea Berez-Kroeker, Eve Koller

University of Hawai‘i at Mānoa



Bradley McDonnell, University of Hawai‘i at Mānoa

Na-Rae Han, University of Pittsburgh

Eve Koller, University of Hawai‘i at Mānoa


Thursday, January 3, 2019

Flatiron Room

Sheraton New York Times Square

9:00 AM - 3:30 PM


Reproducible research (the concept that data and any code for analysis should be published alongside the research results so that others can validate and/or build upon the claims of the research) has gained real traction in the social sciences in recent years, and with the development of several open-source digital tools, conducting research in a reproducible manner is more accessible than ever. While some sub-disciplines of linguistics have advocated for reproducible research and individual linguists have implemented reproducible methodologies in their research, many linguists lack the knowledge and/or practical training to make their research reproducible.


The aim of this workshop, then, is to provide a conceptual foundation for reproducible research in linguistics alongside practical hands-on training in implementing best practices in reproducible research through the use of several open-source digital tools. Topics include (i) versioning with git, as well as publishing and collaborating on versioned research outputs using web-based platforms such as GitHub, and (ii) ‘notebooks’ or ‘dynamic documents’ that directly link the data and code for analysis to the prose of a research report, using Jupyter Notebooks with Python and RStudio with R. These tools allow for various outputs (e.g., PDF, HTML) that contain both computer code and rich text elements (paragraphs, equations, figures, links, etc.).



Due to the technical nature of this workshop, it will be capped at 20 participants, but thanks to the generous support of a National Science Foundation EAGER grant (NSF SMA 1745249) “Data Science Literacy for All of Linguistics,” participation in the workshop is free. It is primarily aimed at early- and mid-career researchers, but will be accessible to both graduate students and senior researchers. Basic knowledge of either R or Python will be very beneficial but is not required. Participants will not be asked to write their own code; exercises will contain the necessary Python and R code.



Participants will be required to bring a laptop computer running macOS or Windows to the workshop; mobile systems such as iPads, Android tablets, and Chromebooks are not suitable. Prior to the workshop, the instructors will send out instructions for installing all of the necessary software.



To register, please fill out this form. If you have any questions about this workshop, please contact Bradley McDonnell (


This material is based upon work supported by the National Science Foundation under grant SMA-1745249 to the University of Hawai‘i at Mānoa. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.