MSR 2019
Sun 26 - Mon 27 May 2019 Montreal, QC, Canada
co-located with ICSE 2019

Since 2013, the MSR conference has included a Data Showcase. The purpose of the Data Showcase is to provide a forum to share and discuss the important data sets that underpin the work of the Mining Software Repositories community.

The important dates for the Data Showcase are:

  • Abstracts Due: February 1, 2019

  • Papers Due: February 6, 2019

  • Author Notification: March 1, 2019

  • Camera Ready: March 15, 2019

Please see the Call for Data Showcase Papers for all details.


Sun 26 May

11:00 - 11:45: Technical Papers - Representations for Mining (Part 1) at Room 1
11:00 - 11:11 (Technical Papers)
11:11 - 11:22 (Technical Papers)
11:22 - 11:33 (Technical Papers)
Bart Theeten (Nokia Bell Labs, Belgium), Frederik Vandeputte, Tom Van Cutsem (Nokia Bell Labs)
11:33 - 11:45 (Data Showcase)
Vasiliki Efstathiou (Athens University of Economics and Business), Diomidis Spinellis (Athens University of Economics and Business)
11:55 - 12:30: Technical Papers - Defect prediction and Testing (Part 2) at Room 2
11:55 - 12:00 (Data Showcase)
Aida Radu, Sarah Nadi (University of Alberta)
12:00 - 12:06 (Technical Papers)
Adithya Raghuraman, Truong Ho-Quang, Michel Chaudron (Chalmers University of Technology), Alexander Serebrenik (Eindhoven University of Technology), Bogdan Vasilescu (Carnegie Mellon University)
12:06 - 12:12 (Technical Papers)
Stanislav Chren (Masaryk University), Radoslav Micko, Barbora Buhnova (Masaryk University), Bruno Rossi (Masaryk University)
12:12 - 12:18 (Data Showcase)
Dirk Beyer (LMU Munich)
12:18 - 12:24 (Technical Papers)
Hongyu Zhai, Casey Casalnuovo (University of California at Davis, USA), Premkumar Devanbu (University of California, Davis)
12:24 - 12:30 (Technical Papers)
Domenico Serra, Giovanni Grano (University of Zurich), Fabio Palomba, Filomena Ferrucci (University of Salerno), Harald Gall (University of Zurich), Alberto Bacchelli (University of Zurich)
11:55 - 12:30: Technical Papers - Representations for Mining (Part 2) at Room 1
11:55 - 12:06 (Technical Papers)
Eeshita Biswas, K. Vijay-Shanker, Lori Pollock (University of Delaware, USA)
12:06 - 12:18 (Data Showcase)
Musfiqur Rahman (Concordia University, Montreal, Canada), Peter Rigby (Concordia University, Montreal, Canada), Dharani Palani (Concordia University), Tien Nguyen (University of Texas at Dallas)
12:18 - 12:30 (Technical Papers)
Christoph Treude (The University of Adelaide), Markus Wagner
14:45 - 15:30: Technical Papers - Energy and Economics at Room 2
14:45 - 14:54 (Technical Papers)
14:54 - 15:03 (Data Showcase)
Rui Pereira (HASLab/INESC TEC & Universidade do Minho), Marco Couto (HASLab/INESC TEC & Universidade do Minho), João Paulo Fernandes (Release/LISP, CISUC), Bruno Cabral, Hugo Matalonga (University of Minho), Simão Melo de Sousa, Fernando Castor (Federal University of Pernambuco, UFPE)
15:03 - 15:12 (Data Showcase)
Rui Rua (HASLab/INESC TEC & Universidade do Minho), Marco Couto (HASLab/INESC TEC & Universidade do Minho), João Saraiva (University of Minho, Portugal)
15:12 - 15:21 (Technical Papers)
Asher Trockman (University of Evansville), Rijnard van Tonder (Carnegie Mellon University), Bogdan Vasilescu (Carnegie Mellon University)
15:21 - 15:30 (Data Showcase)
Rijnard van Tonder (Carnegie Mellon University), Asher Trockman (University of Evansville), Claire Le Goues (Carnegie Mellon University)
14:45 - 15:30: Technical Papers - Large-Scale Mining at Room 1
14:45 - 14:56 (Technical Papers)
Dimitris Mitropoulos, Panos Louridas, Vitalis Salis, Diomidis Spinellis (Athens University of Economics and Business)
14:56 - 15:07 (Data Showcase)
Antoine Pietri, Diomidis Spinellis (Athens University of Economics and Business), Stefano Zacchiroli
15:07 - 15:18 (Technical Papers)
Yuxing Ma, Christopher Bogart (Carnegie Mellon University), Sadika Amreen, Russell Zaretzki, Audris Mockus (University of Tennessee - Knoxville)
15:18 - 15:30 (Technical Papers)
Dimitris Kolovos (University of York), Patrick Neubauer (University of York, UK), Konstantinos Barmpis, Nicholas Matragkas, Richard Paige (McMaster University)

Not scheduled yet

Not scheduled yet (Data Showcase)
Serena Elisa Ponta, Henrik Plate, Antonino Sabetta, Michele Bezzi, Cédric Dangremont
Not scheduled yet (Data Showcase)
Saket Joshi, Sridhar Chimalakonda (Indian Institute of Technology Tirupati)
Not scheduled yet (Data Showcase)
Marius Kamp, Patrick Kreutzer, Michael Philippsen (Friedrich-Alexander University Erlangen-Nürnberg, FAU)
Not scheduled yet (Data Showcase)
Gian Luca Scoccia, Anthony Peruma (Rochester Institute of Technology), Virginia Pujols, Ben Christians, Daniel Krutz (Rochester Institute of Technology)
Not scheduled yet (Data Showcase)
Oliviero Riganelli, Marco Mobilio, Daniela Micucci (University of Milano-Bicocca, Italy), Leonardo Mariani (University of Milano Bicocca)
Not scheduled yet (Data Showcase)
Anna-Katharina Wickert (TU Darmstadt, Germany), Michael Reif (TU Darmstadt, Germany), Michael Eichberg (TU Darmstadt, Germany), Anam Dodhy, Mira Mezini (TU Darmstadt, Germany)
Not scheduled yet (Data Showcase)
Haoyu Wang, Junjun Si, Hao Li, Yao Guo (Peking University)
Not scheduled yet (Data Showcase)
Amine Benelallam, Nicolas Harrand, César Soto-Valero (KTH Royal Institute of Technology), Benoit Baudry (KTH Royal Institute of Technology, Sweden), Olivier Barais
Not scheduled yet (Data Showcase)
Sumon Biswas (Iowa State University), Md Johirul Islam, Yijia Huang, Hridesh Rajan (Iowa State University)

Call for Data Showcase Papers

Data Showcase papers should describe data sets that are curated by their authors and made available for use by others. Ideally, these data sets should be of value to others in the community, should be preprocessed or filtered in some way, and should provide an easy-to-understand schema. Data Showcase papers are expected to include:

  • a description of the data source,
  • a description of the methodology used to gather it, including its provenance and the tool used to create, generate, or gather the data, if such a tool has been used (see below),
  • a description of the storage mechanism, including a schema if applicable,
  • if the data has been used by the authors or others, a description of how this was done, including references to previously published papers,
  • a description of the originality of the data set (that is, even if the data set has been used in a published article, its complete description must be unpublished),
  • ideas for future research questions that could be answered using the data set,
  • ideas for further improvements that could be made to the data set, and
  • any limitations and/or challenges in creating or using the data set.

The data set should be made available at the time of submission of the paper for review, but will be considered confidential until publication of the paper. At the latest upon publication of the paper, the authors should archive the data in a preserved archive that can provide a digital object identifier (DOI), such as zenodo.org, figshare.com, or an institutional preserved archive. In this way the data will become citable; a DOI-based citation of the data set should be included in the camera-ready version. If the size of the data set exceeds the limits imposed by the preserved archive (e.g., 50 GB for Zenodo), the authors can store their data on Archive.org and refer to the URL in their camera-ready version.

Data Showcase papers are not:

  • empirical studies
  • tool demos
  • data sets that are
    • based on poorly explained or untrustworthy heuristics for data collection, or
    • the result of a trivial application of generic tools.

We expect all data sets to be accompanied by the source code of the tool that was used to create them, along with clear documentation on how to run the tool in order to recreate the data sets. The tool should be open source and accompanied by an appropriate license; the source code should be citable, i.e., refer to a specific release and have a DOI. GitHub provides an easy way to make source code citable. If you cannot provide the source code or the source code clause is not applicable (e.g., because the data set consists of qualitative data), please provide a short explanation of why this is not possible.


Submit your data paper (maximum 4 pages, plus 1 additional page of references) to EasyChair on or before February 6th, 2019 (abstract due February 1st).

Submitted papers will undergo single-blind peer review. We opt for single-blind peer review (as opposed to the double-blind peer review of the main track) because of the requirement above to describe how the data has been used in previous studies, including bibliographic references to those studies. Such references are likely to disclose the authors’ identity.

To make research data sets and research software accessible and citable, we further encourage authors to follow the FAIR principles, i.e., data should be Findable, Accessible, Interoperable, and Reusable.

The submission must conform to the IEEE Conference Proceedings Formatting Guidelines (title in 24pt font and full text in 10pt type; LaTeX users must use \documentclass[10pt,conference]{IEEEtran} without including the compsoc or compsocconf option).
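
For LaTeX users, a minimal sketch of a source file that uses the class options named above might look as follows. The title, author block, email address, and section text are placeholders introduced only for illustration and are not part of the call; authors are shown because review is single-blind.

```latex
% Minimal skeleton, assuming the standard IEEEtran class is installed.
% All titles, names, and text below are placeholders.
\documentclass[10pt,conference]{IEEEtran}  % no compsoc or compsocconf option

\begin{document}

\title{Placeholder Title of the Data Showcase Paper}

\author{\IEEEauthorblockN{Author Name}
\IEEEauthorblockA{Affiliation\\
Email: author@example.org}}

\maketitle

\begin{abstract}
Placeholder abstract describing the data set.
\end{abstract}

\section{Introduction}
Placeholder text.

\end{document}
```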

Papers submitted for consideration should not have been published elsewhere and should not be under review or submitted for review elsewhere while under consideration. The submission must also comply with the ACM plagiarism policy and procedures and with the IEEE Policy on Authorship. To submit, please use the EasyChair link.

Upon notification of acceptance, all authors of accepted papers will be asked to complete an IEEE Copyright form and will receive further instructions for preparing their camera-ready versions. At least one author of each paper is expected to present the results at the MSR conference. All accepted contributions will be published in the conference electronic proceedings.

The official publication date is the date the proceedings are made available in the ACM or IEEE Digital Libraries. This date may be up to two weeks prior to the first day of ICSE 2019. The official publication date affects the deadline for any patent filings related to the published work. Purchase of additional pages in the proceedings is not allowed.

A selection of the best papers will be invited to an EMSE Special Issue.



Nicole Novielli, University of Bari, Italy

Alexander Serebrenik, Eindhoven University of Technology, The Netherlands
