Machine Learning on source code is a promising field to work in
How would you present DECODER?
The main goal of DECODER is to help IT professionals – software developers, maintainers and reviewers – with their daily work, and to help them get familiar with existing software projects in which they were not previously involved. To this end, a set of models and tools is being designed and developed, and will be available through the so-called PKM (Persistent Knowledge Monitor). Some of these tools rely on Artificial Intelligence (AI) techniques to provide advanced features, such as code summarization.
What is your role in DECODER?
The participation of Tree Technology in DECODER focuses on two roles: leader of WP2 “Activities for the developer” and use case leader in WP6 “Use-cases”. In WP2, we use Natural Language Processing (NLP) techniques to address two challenges: the extraction of information from source code and the automatic generation of summaries that describe the purpose of the different methods. For this, we use annotated datasets provided by both the end users and public sources/repositories. In WP6, our use case is based on a previous European project in the field of robotics. In this context, the OpenCV code developed in that project should serve as an example to test and validate the different functionalities of DECODER. Specifically, the scenario deals with a programmer who inherits code from a colleague and takes advantage of DECODER to better understand it.
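As a rough illustration of the first of those challenges, and not a description of DECODER's actual pipeline, the Python sketch below pairs each function in a snippet with the first line of its documentation. The function name `extract_method_summaries`, the sample code and the use of Python's standard `ast` module are assumptions chosen for illustration only. Pairs of this kind (code on one side, a short description on the other) are the sort of annotated data a summarization model could be trained on.

```python
import ast
from typing import Dict, List


def extract_method_summaries(source: str) -> List[Dict[str, object]]:
    """Return one record per function: its name, argument names and summary line."""
    tree = ast.parse(source)
    records = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # get_docstring returns None when the function is undocumented.
            doc = ast.get_docstring(node) or ""
            records.append({
                "name": node.name,
                "args": [arg.arg for arg in node.args.args],
                # Keep only the first docstring line as a short summary.
                "summary": doc.splitlines()[0] if doc else "",
            })
    return records


# Hypothetical sample input, loosely inspired by image-processing code.
SAMPLE = '''
def blur(image, kernel_size):
    """Smooth the image with a box filter.

    A longer description would go here.
    """
    return image

def threshold(image, level):
    return image
'''

records = extract_method_summaries(SAMPLE)
for record in records:
    print(record)
```

Functions with an empty summary, such as `threshold` here, are the cases where an automatically generated description would be most useful.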
What key innovation do you bring or help to develop?
Tree Technology is applying recent milestones and know-how from NLP to source code, namely by developing advanced sequential models based on deep learning. Machine learning on source code, also known as MLonCode, is a new topic on which not much research can be found in the literature. This makes it a challenging but also a promising field to work in, with the final purpose of either improving existing tools or creating new ones that ease the work of software developers.
A word about yourself and your organization
I am a Senior Data Scientist at Tree Technology, an R&D-intensive company established in Spain that provides innovative IT solutions in several sectors, such as healthcare, banking and mobility. We are experts in Big Data technologies and Artificial Intelligence. In particular, the AI team is made up of eleven people – mainly engineers (computing, telecommunications) and mathematicians – whose aim is to achieve more efficient and smarter software systems by leveraging AI techniques.
Marcos Fernández Díaz received his MSc degree in Telecommunication Engineering from the University of Oviedo, Spain, in 2011, where he also worked as a researcher. In 2013, he joined TREE, where he is part of the Artificial Intelligence chapter and works on the development of machine learning models for different domains, including predictive maintenance, NLP applied to source code, industrial process optimization and federated machine learning, among others. He has been involved in several EU projects.