MSR 2019
Sun 26 - Mon 27 May 2019 Montreal, QC, Canada
co-located with ICSE 2019
Mon 27 May 2019 11:55 - 12:10 at Centre-Ville - Session VIII: Software Quality (part 2) Chair(s): Yasutaka Kamei

Jupyter Notebooks have been widely adopted by many different communities, in both science and industry. They support the creation of literate programming documents that combine code, text, and execution results with visualizations and all sorts of rich media. The self-documenting aspects and the ability to reproduce results have been touted as significant benefits of notebooks. At the same time, there has been growing criticism that the way notebooks are being used leads to unexpected behavior, encourages poor coding practices, and makes their results hard to reproduce. To understand good and bad practices used in the development of real notebooks, we studied 1.4 million notebooks from GitHub. We present a detailed analysis of their characteristics that impact reproducibility. We also propose a set of best practices that can improve the rate of reproducibility and discuss open challenges that require further research and development.

Mon 27 May

Displayed time zone: Eastern Time (US & Canada)

11:55 - 12:30
Session VIII: Software Quality (part 2) MSR 2019 Technical Papers / MSR 2019 Data Showcase at Centre-Ville
Chair(s): Yasutaka Kamei Kyushu University
11:55
15m
Full-paper
A Large-scale Study about Quality and Reproducibility of Jupyter Notebooks
MSR 2019 Technical Papers
João Felipe Pimentel, Leonardo Murta Universidade Federal Fluminense (UFF), Vanessa Braganholo, Juliana Freire
Pre-print
12:10
15m
Full-paper
Cross-language clone detection by learning over abstract syntax trees
MSR 2019 Technical Papers
Daniel Perez Imperial College London, Shigeru Chiba University of Tokyo, Japan
Pre-print
12:25
6m
Talk
SeSaMe: A Data Set of Semantically Similar Java Methods
MSR 2019 Data Showcase
Marius Kamp, Patrick Kreutzer, Michael Philippsen Friedrich-Alexander University Erlangen-Nürnberg (FAU)