Since 2013, the MSR conference has included a Data Showcase. The purpose of the Data Showcase is to provide a forum to share and discuss the important data sets that underpin the work of the Mining Software Repositories community.
The important dates for the Data Showcase are:
- Abstracts Due: February 1, 2019
- Papers Due: February 6, 2019
- Author Notification: March 1, 2019
- Camera Ready: March 15, 2019
Please see the Call for Data Showcase Papers for all details.
Sun 26 May (times are displayed in time zone: Eastern Time (US & Canada))
09:05 - 09:50 | Talk | Keynote: We Won! Now What? | MSR 2019 Keynote
09:50 - 10:00 | Q&A for Keynote | MSR 2019 Keynote
10:00 - 10:30 | Discussion: Ethical MSR | MSR 2019 Keynote
11:00 - 11:45: Session II: Defect Prediction and Testing (Part 1). MSR 2019 Paper Presentations / MSR 2019 Technical Papers at Centre-Ville. Chair(s): Patanamon Thongtanunam (The University of Melbourne)
11:00 - 11:15 | Full-paper | DeepJIT: An End-To-End Deep Learning Framework for Just-In-Time Defect Prediction | MSR 2019 Technical Papers | Thong Hoang (Singapore Management University, Singapore), Hoa Khanh Dam (University of Wollongong), Yasutaka Kamei (Kyushu University), David Lo (Singapore Management University), Naoyasu Ubayashi (Kyushu University)
11:16 - 11:31 | Full-paper | Lessons learned from using a deep tree-based model for software defect prediction in practice | MSR 2019 Technical Papers | Hoa Khanh Dam (University of Wollongong), Trang Pham (Deakin University), Shien Wee Ng (University of Wollongong), Truyen Tran, John Grundy (Monash University), Aditya Ghose, Taeksu Kim, Chul-Joo Kim
11:32 - 11:38 | Short-paper | Empirical study in using version histories for change risk classification | MSR 2019 Technical Papers
11:39 - 11:45 | Short-paper | Snoring: a Noise in Defect Prediction Datasets | MSR 2019 Technical Papers | Aalok Ahluwalia, Davide Falessi (California Polytechnic State University), Massimiliano Di Penta (University of Sannio)
11:00 - 11:45: Session I: Representations for Mining (Part 1). MSR 2019 Paper Presentations / MSR 2019 Technical Papers / MSR 2019 Data Showcase at Place du Canada. Chair(s): Chanchal K. Roy (University of Saskatchewan)
11:00 - 11:15 | Full-paper | SCOR: Source Code Retrieval With Semantics and Order | MSR 2019 Technical Papers
11:16 - 11:22 | Short-paper | PathMiner: A Library for Mining of Path-Based Representations of Code | MSR 2019 Technical Papers | Vladimir Kovalenko (TU Delft), Egor Bogomolov (Higher School of Economics, JetBrains Research), Timofey Bryksin, Alberto Bacchelli (University of Zurich)
11:23 - 11:38 | Full-paper | Import2vec: learning embeddings for software libraries | MSR 2019 Technical Papers
11:39 - 11:45 | Talk | Semantic Source Code Models Using Identifier Embeddings | MSR 2019 Data Showcase | Vasiliki Efstathiou (Athens University of Economics and Business), Diomidis Spinellis (Athens University of Economics and Business)
11:55 - 12:30: Session III: Representations for Mining (Part 2). MSR 2019 Paper Presentations / MSR 2019 Technical Papers / MSR 2019 Data Showcase at Place du Canada. Chair(s): Nicole Novielli (University of Bari)
11:55 - 12:10 | Full-paper | Exploring Word Embedding Techniques to Improve Sentiment Analysis of Software Engineering Texts | MSR 2019 Technical Papers
12:10 - 12:16 | Talk | Cleaning StackOverflow for Machine Translation | MSR 2019 Data Showcase | Musfiqur Rahman (Concordia University, Montreal, Canada), Peter Rigby (Concordia University, Montreal, Canada), Dharani Palani (Concordia University), Tien N. Nguyen (University of Texas at Dallas)
12:16 - 12:31 | Full-paper | Predicting Good Configurations for GitHub and Stack Overflow Topic Models | MSR 2019 Technical Papers
13:50 - 14:35: Discussion: Data vs. Theory-driven Research. MSR 2019 Paper Presentations at Place du Canada. Chair(s): Andy Zaidman (TU Delft), Michael W. Godfrey (University of Waterloo, Canada)
14:45 - 15:30: Session VI: Energy and Economics. MSR 2019 Paper Presentations / MSR 2019 Data Showcase / MSR 2019 Technical Papers at Centre-Ville. Chair(s): Maleknaz Nayebi (Polytechnique Montréal)
14:45 - 15:00 | Full-paper | Recommending Energy-Efficient Java Collections | MSR 2019 Technical Papers | Wellington de Oliveira Júnior, Renato Santos, Fernando Castor (Federal University of Pernambuco (UFPE)), José Benito Fernandes De Araújo Neto, Gustavo Pinto (UFPA)
15:01 - 15:07 | Talk | GreenHub Farmer: Real-world data for Android Energy Mining | MSR 2019 Data Showcase | Rui Pereira (HASLab/INESC TEC & Universidade do Minho & Universidade da Beira Interior), Marco Couto (HASLab/INESC TEC & Universidade do Minho), João Paulo Fernandes (Release/LISP, CISUC), Bruno Cabral, Hugo Matalonga (University of Minho), Simão Melo de Sousa, Fernando Castor (Federal University of Pernambuco (UFPE))
15:08 - 15:14 | Talk | GreenSource: a large-scale collection of Android code, tests and energy metrics | MSR 2019 Data Showcase | Rui Rua (HASLab/INESC TEC & Universidade do Minho), Marco Couto (HASLab/INESC TEC & Universidade do Minho), João Saraiva (University of Minho, Portugal)
15:15 - 15:21 | Short-paper | Striking Gold in Software Repositories? An Econometric Study of Cryptocurrencies on GitHub | MSR 2019 Technical Papers | Asher Trockman (University of Evansville), Rijnard van Tonder (Carnegie Mellon University), Bogdan Vasilescu (Carnegie Mellon University)
15:22 - 15:28 | Talk | Panel Data of Cryptocurrency Development Activity on GitHub | MSR 2019 Data Showcase | Rijnard van Tonder (Carnegie Mellon University), Asher Trockman (University of Evansville), Claire Le Goues (Carnegie Mellon University)
Mon 27 May
08:45 - 09:30: Session II: Automatic Summarization. MSR 2019 Paper Presentations / MSR 2019 Technical Papers at Centre-Ville. Chair(s): Xin Xia (Monash University)
08:45 - 09:00 | Full-paper | Generating Commit Messages from Diffs using Pointer-generator Network | MSR 2019 Technical Papers | Qin Liu, Zihe Liu (School of Software Engineering, Tongji University, Shanghai, China), Hongming Zhu, Hongfei Fan, Bowen Du, Yu Qian
09:00 - 09:15 | Full-paper | Automatically Generating Documentation for Lambda Expressions in Java | MSR 2019 Technical Papers | Anwar Alqaimi, Patanamon Thongtanunam (The University of Melbourne), Christoph Treude (The University of Adelaide)
09:15 - 09:30 | Full-paper | Extracting API Tips from Developer Question and Answer Websites | MSR 2019 Technical Papers
08:45 - 09:30: Session I: APIs & Dependencies (Part 1). MSR 2019 Paper Presentations / MSR 2019 Technical Papers at Place du Canada. Chair(s): Philipp Leitner (Chalmers University of Technology & University of Gothenburg)
08:45 - 09:00 | Full-paper | Investigating Next-Steps in Static API-Misuse Detection | MSR 2019 Technical Papers | Sven Amann (CQSE GmbH), Hoan Nguyen (Iowa State University), Sarah Nadi (University of Alberta), Tien N. Nguyen (University of Texas at Dallas), Mira Mezini (TU Darmstadt, Germany)
09:00 - 09:15 | Full-paper | Identifying Experts in Software Libraries and Frameworks among GitHub Users | MSR 2019 Technical Papers | João Eduardo Montandon (Universidade Federal de Minas Gerais (UFMG)), Luciana L. Silva, Marco Tulio Valente (Federal University of Minas Gerais, Brazil)
09:15 - 09:30 | Full-paper | Data-Driven Solutions to Detect API Compatibility Issues in Android: An Empirical Study | MSR 2019 Technical Papers | Simone Scalabrino (University of Molise), Gabriele Bavota (Università della Svizzera italiana (USI)), Mario Linares-Vasquez (Universidad de los Andes), Michele Lanza (Università della Svizzera italiana (USI)), Rocco Oliveto (University of Molise)
09:40 - 10:30: Session IV: Security. MSR 2019 Paper Presentations / MSR 2019 Data Showcase / MSR 2019 Technical Papers at Centre-Ville. Chair(s): Sarah Nadi (University of Alberta)
09:40 - 09:55 | Full-paper | Automated Software Vulnerability Assessment with Concept Drift | MSR 2019 Technical Papers
09:55 - 10:01 | Talk | A Manually-Curated Dataset of Fixes to Vulnerabilities of Open-Source Software | MSR 2019 Data Showcase
10:01 - 10:16 | Full-paper | Negative Results on Mining Crypto-API Usage Rules in Android Apps | MSR 2019 Technical Papers | Jun Gao (University of Luxembourg, SnT), Pingfan Kong (Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg), Li Li (Monash University, Australia), Tegawendé F. Bissyandé (SnT, University of Luxembourg), Jacques Klein (University of Luxembourg, SnT)
10:16 - 10:22 | Talk | A Dataset of Parametric Cryptographic Misuses | MSR 2019 Data Showcase | Anna-Katharina Wickert (TU Darmstadt, Germany), Michael Reif (TU Darmstadt, Germany), Michael Eichberg (TU Darmstadt, Germany), Anam Dodhy, Mira Mezini (TU Darmstadt, Germany)
10:22 - 10:28 | Talk | RmvDroid: Towards A Reliable Android Malware Dataset with App Metadata | MSR 2019 Data Showcase | Haoyu Wang (Beijing University of Posts and Telecommunications, China), Junjun Si, Hao Li, Yao Guo (Peking University)
11:00 - 11:45: Session VI: Software Quality (Part 1). MSR 2019 Paper Presentations / MSR 2019 Technical Papers at Centre-Ville. Chair(s): Fabio Palomba (University of Zurich)
11:00 - 11:15 | Full-paper | The Rise of Android Code Smells: Who Is to Blame? | MSR 2019 Technical Papers | Sarra Habchi (University of Lille), Romain Rouvoy (University Lille 1 and INRIA), Naouel Moha (University of Montreal)
11:15 - 11:30 | Full-paper | Assessing Diffusion and Perception of Test Smells in Scala Projects | MSR 2019 Technical Papers | Jonas De Bleser (Software Languages Lab, Vrije Universiteit Brussel), Dario Di Nucci (Vrije Universiteit Brussel), Coen De Roover (Vrije Universiteit Brussel)
11:30 - 11:45 | Full-paper | style-analyzer: fixing code style inconsistencies with interpretable unsupervised algorithms | MSR 2019 Technical Papers | Vadim Markovtsev (source{d}), Hugo Mougard (source{d}), Waren Long (source{d}), Egor Bulychev, Konstantin Slavnov
11:00 - 11:45: Session V: Collaboration & Communication (Part 1). MSR 2019 Paper Presentations / MSR 2019 Technical Papers at Place du Canada. Chair(s): Peter Rigby (Concordia University, Montreal, Canada)
11:00 - 11:15 | Full-paper | An Empirical Study of Multiple Names and Email Addresses in OSS Version Control Repositories | MSR 2019 Technical Papers | Jiaxin Zhu (Institute of Software at Chinese Academy of Sciences, China), Jun Wei (Institute of Software, Chinese Academy of Sciences, China)
11:15 - 11:30 | Full-paper | Characterizing the Roles of Contributors in Open-source Scientific Software Projects | MSR 2019 Technical Papers | Reed Milewicz (Sandia National Laboratories), Gustavo Pinto (UFPA), Paige Rodeghero (University of Notre Dame)
11:30 - 11:45 | Full-paper | git2net - Mining Time-Stamped Co-Editing Networks from Large git Repositories | MSR 2019 Technical Papers
11:55 - 12:30: Session VIII: Software Quality (Part 2). MSR 2019 Paper Presentations / MSR 2019 Technical Papers / MSR 2019 Data Showcase at Centre-Ville. Chair(s): Yasutaka Kamei (Kyushu University)
11:55 - 12:10 | Full-paper | A Large-scale Study about Quality and Reproducibility of Jupyter Notebooks | MSR 2019 Technical Papers | João Felipe Pimentel, Leonardo Murta (Universidade Federal Fluminense (UFF)), Vanessa Braganholo, Juliana Freire
12:10 - 12:25 | Full-paper | Cross-language clone detection by learning over abstract syntax trees | MSR 2019 Technical Papers
12:25 - 12:31 | Talk | SeSaMe: A Data Set of Semantically Similar Java Methods | MSR 2019 Data Showcase | Marius Kamp, Patrick Kreutzer, Michael Philippsen (Friedrich-Alexander University Erlangen-Nürnberg (FAU))
11:55 - 12:30: Session VII: Collaboration & Communication (Part 2). MSR 2019 Paper Presentations / MSR 2019 Technical Papers at Place du Canada. Chair(s): Kelly Blincoe (University of Auckland)
11:55 - 12:10 | Full-paper | Can Issues Reported at Stack Overflow Questions be Reproduced? An Exploratory Study | MSR 2019 Technical Papers | Saikat Mondal (University of Saskatchewan), Masud Rahman (University of Saskatchewan), Chanchal K. Roy (University of Saskatchewan)
12:10 - 12:25 | Full-paper | Exploratory Study of Slack Q&A Chats as a Mining Source for Software Engineering Tools | MSR 2019 Technical Papers | Preetha Chatterjee (University of Delaware, USA), Kostadin Damevski (Virginia Commonwealth University), Lori Pollock (University of Delaware, USA), Vinay Augustine, Nicholas A. Kraft (ABB Corporate Research)
12:25 - 12:31 | Short-paper | Impacts of Daylight Saving Time on Software Development | MSR 2019 Technical Papers | Junichi Hayashi (Osaka University), Yoshiki Higo (Osaka University), Shinsuke Matsumoto (Osaka University), Shinji Kusumoto (Osaka University)
13:50 - 14:35: Discussion: SE for AI for SE. MSR 2019 Paper Presentations at Place du Canada. Chair(s): Neil Ernst (University of Victoria), Tim Menzies (North Carolina State University)
Call for Data Showcase Papers
Data Showcase papers should describe data sets that are curated by their authors and made available for use by others. Ideally, these data sets should be of value to others in the community, should be preprocessed or filtered in some way, and should provide an easy-to-understand schema. Data Showcase papers are expected to include:
- a description of the data source,
- a description of the methodology used to gather it (provenance; the tool used to create/generate/gather the data, if such a tool has been used, see below),
- a description of the storage mechanism, including a schema if applicable,
- if the data has been used by the authors or others, a description of how this was done, including references to previously published papers,
- a description of the originality of the data set (that is, even if the dataset has been used in a published article, its complete description must be unpublished),
- ideas for future research questions that could be answered using the data set,
- ideas for further improvements that could be made to the data set, and
- any limitations and/or challenges in creating or using the data set.
The data set should be made available at the time of submission of the paper for review, but will be considered confidential until publication of the paper. At the latest upon publication of the paper, the authors should archive the data in preserved archives that can provide a digital object identifier (DOI), such as zenodo.org, figshare.com, or institutional preserved archives. In this way the data will become citable; a DOI-based citation of the dataset should be included in the camera-ready version. If the size of the dataset exceeds the limits imposed by the preserved archives (e.g., 50GB for zenodo), the authors can store their data on Archive.org and refer to the URL in their camera-ready version.
Data showcase papers are not:
- empirical studies
- tool demos
- data sets that are
  - based on poorly explained or untrustworthy heuristics for data collection, or
  - the result of a trivial application of generic tools.
We expect all datasets to be accompanied by the source code of the tool that was used to create them, along with clear documentation on how to run the tool in order to recreate the datasets. The tool should be open source, accompanied by an appropriate license; the source code should be citable, i.e., refer to a specific release and have a DOI. GitHub provides an easy way to make source code citable. If you cannot provide the source code or the source code clause is not applicable (e.g., because the dataset consists of qualitative data), please provide a short explanation of why this is not possible.
Submission
Submit your data paper (maximum 4 pages, plus 1 additional page of references) to EasyChair on or before February 6th, 2019 (abstract due February 1st).
Submitted papers will undergo single-blind peer review. We opt for single-blind peer review (as opposed to the double-blind peer review of the main track) because of the requirement above to describe how the data has been used in previous studies, including bibliographic references to those studies. Such references are likely to disclose the authors' identity.
To make research data sets and research software accessible and citable, we further encourage authors to follow the FAIR principles, i.e., data should be Findable, Accessible, Interoperable, and Reusable.
The submission must conform to the IEEE Conference Proceedings Formatting Guidelines (title in 24pt font and full text in 10pt type; LaTeX users must use \documentclass[10pt,conference]{IEEEtran} without including the compsoc or compsocconf option).
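For LaTeX users, the formatting requirement above might translate into a skeleton like the following. This is only a minimal sketch: the document class line comes from the guidelines, while the title, author block, e-mail address, and section name are placeholders chosen for illustration. Author names are shown because the track uses single-blind review.

```latex
% Minimal skeleton for a Data Showcase submission.
% Only the \documentclass line is prescribed by the call;
% every name and heading below is a placeholder.
\documentclass[10pt,conference]{IEEEtran} % no compsoc or compsocconf option

\begin{document}

\title{Placeholder Title of the Data Showcase Paper}

\author{\IEEEauthorblockN{First Author}
\IEEEauthorblockA{Placeholder Affiliation\\
Email: author@example.org}}

\maketitle

\begin{abstract}
Placeholder abstract describing the data set.
\end{abstract}

\section{Data Description}
Placeholder text.

\end{document}
```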
Papers submitted for consideration should not have been published elsewhere and should not be under review or submitted for review elsewhere while under consideration. The submission must also comply with the ACM plagiarism policy and procedures and with the IEEE Policy on Authorship. To submit, please use the EasyChair link.
Upon notification of acceptance, all authors of accepted papers will be asked to complete an IEEE Copyright form and will receive further instructions for preparing their camera-ready versions. At least one author of each paper is expected to present the results at the MSR conference. All accepted contributions will be published in the conference electronic proceedings.
The official publication date is the date the proceedings are made available in the ACM or IEEE Digital Libraries. This date may be up to two weeks prior to the first day of ICSE 2019. The official publication date affects the deadline for any patent filings related to the published work. Purchase of additional pages in the proceedings is not allowed.
A selection of the best papers will be invited to an EMSE Special Issue.
Organization
Nicole Novielli, University of Bari, Italy
Alexander Serebrenik, Eindhoven University of Technology, The Netherlands