MSR 2019
Sun 26 - Mon 27 May 2019 Montreal, QC, Canada
co-located with ICSE 2019

Word embeddings produced by the word2vec algorithm provide a strong mechanism for discovering relationships between words based on the degree to which they are contextually related to one another. In and of themselves, algorithms like word2vec do not give us a mechanism to impose ordering constraints on the embedded word representations. Our main goal in this paper is to exploit the semantic word vectors obtained from word2vec in a way that allows ordering constraints to be invoked on them when comparing a sequence of words in a query with a sequence of words in a file for source code retrieval. These ordering constraints employ the logic of Markov Random Fields (MRF), a framework used previously to enhance the precision of source-code retrieval engines based on the Bag-of-Words (BoW) assumption. The work we present here demonstrates that by combining word2vec with the power of MRF, it is possible to achieve improvements of between 6% and 30% in retrieval accuracy over the best results that can be obtained with the more traditional applications of MRF to representations based on term and term-term frequencies. The performance improvement was 30% for the Java AspectJ repository using only the titles of the bug reports provided by iBUGS, and 6% for the Eclipse repository using titles as well as descriptions of the bug reports provided by BUGLinks.
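As a rough illustration of the idea in the abstract (not the paper's actual MRF formulation), the sketch below combines an order-free embedding match with a score that rewards adjacent query words matching adjacent file words in the same order. The toy vectors, the function names, and the `weight` parameter are all assumptions made for this example; real word2vec vectors would be learned from a corpus.

```python
import math

# Hypothetical 3-dimensional "embeddings" (assumption for illustration only).
TOY_VECTORS = {
    "open": (1.0, 0.0, 0.0),
    "load": (0.9, 0.1, 0.0),
    "file": (0.0, 1.0, 0.0),
    "document": (0.1, 0.9, 0.0),
    "close": (0.0, 0.0, 1.0),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sim(w1, w2, vectors):
    """Embedding similarity of two words; 0.0 if either word is unknown."""
    if w1 not in vectors or w2 not in vectors:
        return 0.0
    return cosine(vectors[w1], vectors[w2])


def unigram_score(query, doc, vectors):
    """Order-free score: each query word is matched to its most similar file word."""
    if not query or not doc:
        return 0.0
    return sum(max(sim(q, d, vectors) for d in doc) for q in query) / len(query)


def ordered_pair_score(query, doc, vectors):
    """Order-aware score: each adjacent query pair is matched to the adjacent
    file pair (same order) whose words are most similar to it."""
    q_pairs = list(zip(query, query[1:]))
    d_pairs = list(zip(doc, doc[1:]))
    if not q_pairs or not d_pairs:
        return 0.0
    total = 0.0
    for q1, q2 in q_pairs:
        total += max(sim(q1, d1, vectors) * sim(q2, d2, vectors) for d1, d2 in d_pairs)
    return total / len(q_pairs)


def combined_score(query, doc, vectors, weight=0.5):
    """Weighted mix of the order-free and order-aware scores (weight is an assumption)."""
    return ((1.0 - weight) * unigram_score(query, doc, vectors)
            + weight * ordered_pair_score(query, doc, vectors))


if __name__ == "__main__":
    query = ["open", "file"]
    in_order = ["load", "document", "close"]
    reversed_doc = ["document", "load", "close"]
    print(combined_score(query, in_order, TOY_VECTORS))
    print(combined_score(query, reversed_doc, TOY_VECTORS))
```

In this toy setup the two files contain the same words, so the order-free score cannot tell them apart; only the pair term favors the file whose word order matches the query.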

Sun 26 May

11:00 - 11:45: MSR 2019 Paper Presentations - Session I: Representations for Mining (Part 1) at Place du Canada
Chair(s): Chanchal K. Roy (University of Saskatchewan)
11:00 - 11:15
Full-paper
11:16 - 11:22
Short-paper
Vladimir Kovalenko (TU Delft), Egor Bogomolov (Higher School of Economics, JetBrains Research), Timofey Bryksin, Alberto Bacchelli (University of Zurich)
11:23 - 11:38
Full-paper
Bart Theeten (Nokia Bell Labs, Belgium), Frederik Vandeputte, Tom Van Cutsem (Nokia Bell Labs)
11:39 - 11:45 (MSR 2019 Data Showcase)
Talk
Vasiliki Efstathiou (Athens University of Economics and Business), Diomidis Spinellis (Athens University of Economics and Business)