

Blog - Readings - posts for October 2021

Oct 25 2021

CodeMatcher performs fuzzy search

https://www.decoder-project.eu/download/Main/Readings/CodeMatcher/CodeMatcher_fig2.jpg?rev=1.1

Title: CodeMatcher, Searching Code Based on Sequential Semantics of Important Query Words
Authors: Chao Liu - Xin Xia - David Lo - Zhongxin Liu - Ahmed E. Hassan - Shanping Li

Abstract: To accelerate software development, developers frequently search and reuse existing code snippets from a large-scale codebase, e.g., GitHub. Over the years, researchers proposed many information retrieval (IR)-based models for code search, but they fail to connect the semantic gap between query and code. An early successful deep learning (DL)-based model DeepCS solved this issue by learning the relationship between pairs of code methods and corresponding natural language descriptions. 

Two major advantages of DeepCS are the capability of understanding irrelevant/noisy keywords and capturing sequential relationships between words in query and code. In this article, we proposed an IR-based model CodeMatcher that inherits the advantages of DeepCS (i.e., the capability of understanding the sequential semantics in important query words), while it can leverage the indexing technique in the IR-based model to accelerate the search response time substantially. 
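To make the idea in the abstract concrete, here is a minimal sketch of an IR-style search that combines an inverted index with an order-aware match on the important query words. It is not the authors' implementation: the stopword list, the scoring rule, and the names `tokenize`, `build_index`, `ordered_match_count`, and `search` are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical stopword list standing in for "unimportant" query words.
STOPWORDS = {"a", "an", "the", "to", "of", "in", "how", "do", "i", "for", "with", "and"}


def tokenize(text):
    """Split prose or camelCase identifiers into lowercase word tokens."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return [t.lower() for t in re.findall(r"[A-Za-z]+", spaced)]


def build_index(methods):
    """Inverted index: token -> set of method ids whose text contains it."""
    index = defaultdict(set)
    for method_id, text in methods.items():
        for token in tokenize(text):
            index[token].add(method_id)
    return index


def ordered_match_count(query_words, method_tokens):
    """Count query words that can be matched left to right in the method tokens."""
    position = 0
    matched = 0
    for word in query_words:
        try:
            position = method_tokens.index(word, position) + 1
            matched += 1
        except ValueError:
            continue  # word absent after the current position; skip it
    return matched


def search(query, methods, index):
    """Return candidate method ids, best ordered match first."""
    words = [w for w in tokenize(query) if w not in STOPWORDS]
    candidates = set()
    for w in words:
        candidates |= index.get(w, set())  # index lookup avoids scanning every method
    scored = [(ordered_match_count(words, tokenize(methods[m])), m) for m in candidates]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [m for _, m in scored]
```

With a toy corpus such as `{"a": "readFileToString", "b": "stringToFile write"}`, the query "how to read file to string" ranks `a` ahead of `b`: both share the words "file" and "string", but only `a` contains the important words in the same order as the query.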
