Advancing Novel Textual Similarity-based Solutions in Software Development

Program for Development of Projects
in the field of Artificial Intelligence
(funded by Science Fund Republic of Serbia)

About project

News

The latest news from our project team will be published in this section.

Invitation to participate in the workshop

17.02.2021.

To briefly present the goals and planned activities of our research team on the AVANTES project, as well as other research and development projects in the fields of natural language processing, machine learning and data analysis, we will organize an online workshop on Thursday, February 25, 2021, at 12:00.

The workshop will consist of two sessions:
1) Presentation of the AVANTES project and research work in the above areas.
2) Round table discussion with representatives of ministries, NGOs, IT companies and academia.

We invite all interested researchers to apply via the following link.

Participation in the national conference "Serbian AI Meeting"

31.01.2021.

Members of the AVANTES project team participated in the national conference "Serbian AI Meeting" on December 18, 2020. More than 100 researchers took part in this year's conference, both from Serbia and from among our researchers working abroad at universities, scientific institutes and well-known research and development centers of IT companies. The thematic areas covered were general artificial intelligence, formal logic and reasoning, machine learning and natural language processing. The slides of all lecturers can be found at the following link.

You can watch the full video of the event at the following link.



Vuk Batanović defended his PhD dissertation

25.01.2021.

Vuk Batanović, a member of our AVANTES team, defended his doctoral dissertation, "A methodology for solving semantic tasks in the processing of short texts written in natural languages with limited resources", at the end of December 2020 under the mentorship of Prof. Boško Nikolić, PhD, and Prof. Miloš Cvetanović, PhD.

We congratulate Vuk Batanović, PhD, on his dedicated work during his doctoral studies and on the preparation of his dissertation, and we wish him much success in his further research work.



Scientific article published in the prestigious scientific journal "PLOS ONE"

16.11.2020.

The research paper entitled "A versatile framework for resource-limited sentiment articulation, annotation, and analysis of short texts" (authors: Vuk Batanović, Miloš Cvetanović, Boško Nikolić) was published in the prestigious scientific journal PLOS ONE. The abstract and a link to the paper can be found in the section Resources - Published papers.

Virtual PSSOH conference

27.10.2020.

Members of our project team, Zaharije Radivojević and Vuk Batanović, participated in the third PSSOH conference, entitled "Application of free software and open hardware" and organized by the University of Belgrade - School of Electrical Engineering, where they presented the results of their work as part of the AVANTES project activities.

Exhibition dedicated to scientific projects in the field of artificial intelligence

10.10.2020.

An exhibition dedicated to scientific projects in the field of artificial intelligence, organized by the Science Fund of the Republic of Serbia with the Center for the Promotion of Science, opened on the Sava Promenade at the Kalemegdan Fortress on Friday, October 9. Each scientific project is presented with a poster that describes the research. Members of the AVANTES project team attended the opening ceremony of the exhibition and talked to visitors. The exhibition is open until October 23.

Kick-off project team meeting

31.8.2020.

Members of the project team held the kick-off meeting, where they were assigned their tasks for the next three months.



Project AVANTES highly ranked

15.8.2020.

Within the Program for Development of Projects in the field of Artificial Intelligence, the Science Fund of the Republic of Serbia will finance 12 projects. Out of 70 project proposals submitted to the public call, which closed on 31.1.2020, 6 projects were selected from basic research and 6 from applied research. Our scientific team and project proposal achieved an excellent result of 91 points on the final ranking list and placed second among the 12 projects that will be funded over the next two years.



Project information

Acronym: AVANTES

The result of close cooperation between researchers from seemingly distant scientific fields will be a new system which will facilitate the work of software engineers, as well as linguists who study the Serbian language.

Period: Sept 2020 - Sept 2022

Budget: 198,261.12 €

An interdisciplinary research team will develop an intelligent tool for recognizing the semantic similarity between parts of a software system written in programming languages and comments written in natural languages. The system will be able to recognize code clones, while a special research focus will be directed towards solving the problem of cross-level semantic textual similarity, primarily in Serbian, with a comparison to the results obtained for English. Within the scope of this project, new methods for program code analysis will be used, based on machine learning and artificial intelligence techniques.

In addition to the tool for determining the similarity between the parts of the software and the comments, a group of software engineers and linguists will develop a new semantic search algorithm for exploring code using natural language input. One of the goals is also to establish a collection of data and models for automatic Serbian language processing.

The AVANTES project is of great importance for Serbia because researchers will create complex datasets and introduce innovations into existing technologies for processing the Serbian language, for which far fewer resources are currently available than for major international languages such as English. This will facilitate not only the work of software engineers in Serbia, but also that of linguists who study the Serbian language.

Team members

The team is multidisciplinary and consists of researchers from the School of Electrical Engineering, University of Belgrade, the Faculty of Philology, University of Belgrade and the Innovation Center of the School of Electrical Engineering.



Boško Nikolić, PhD

Principal Investigator

Zaharije Radivojević, PhD

Member of the project team

Dražen Drašković, PhD

Member of the project team

Vuk Batanović, PhD

Member of the project team

Vladimir Jocović, PhD candidate

Member of the project team

Tamara Šekularac, PhD candidate

Member of the project team

Marko Mićović, PhD candidate

Member of the project team

Uroš Radenković, PhD candidate

Member of the project team

Jelica Cincović, PhD candidate

Member of the project team

Adrian Milaković, PhD candidate

Member of the project team

Dušan Stojković, PhD candidate

Member of the project team

Aleksa Srbljanović, BSc EE

Member of the project team

Maja Miličević Petrović, PhD

Member of the project team

Radoslava Trnavac, PhD

Member of the project team

Tanja Samardžić, PhD

Member of the project team

Borko Kovačević, PhD

Member of the project team

Resources

This section will show the resources that will be published during the project.

Published papers

Papers from conferences and scientific journals will be published in this section.

  • V. Batanović, N. Ljubešić, T. Samardžić, M. Miličević Petrović, "Open Resources and Technologies for Serbian Language Processing" (in Serbian), PSSOH conference, Belgrade, Oct. 2020
    Link: https://zenodo.org/record/4113230#.X6GcaohKiUk
    Abstract: The openness of language resources and tools is of great importance for increasing the quality and speed of development of technologies for computer processing of natural languages. This paper presents open resources for the processing of the Serbian language. Hand-annotated corpora are described, as well as a wider range of tools and computer models, including a web service that makes them easy to use.

  • V. Batanović, M. Cvetanović, B. Nikolić, "A versatile framework for resource-limited sentiment articulation, annotation, and analysis of short texts", PLoS ONE 15(11): e0242050. https://doi.org/10.1371/journal.pone.0242050
    Link: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0242050
    Abstract:
    Choosing a comprehensive and cost-effective way of articulating and annotating the sentiment of a text is not a trivial task, particularly when dealing with short texts, in which sentiment can be expressed through a wide variety of linguistic and rhetorical phenomena. This problem is especially conspicuous in resource-limited settings and languages, where design options are restricted either in terms of manpower and financial means required to produce appropriate sentiment analysis resources, or in terms of available language tools, or both. In this paper, we present a versatile approach to addressing this issue, based on multiple interpretations of sentiment labels that encode information regarding the polarity, subjectivity, and ambiguity of a text, as well as the presence of sarcasm or a mixture of sentiments. We demonstrate its use on Serbian, a resource-limited language, via the creation of a main sentiment analysis dataset focused on movie comments, and two smaller datasets belonging to the movie and book domains. In addition to measuring the quality of the annotation process, we propose a novel metric to validate its cost-effectiveness. Finally, the practicality of our approach is further validated by training, evaluating, and determining the optimal configurations of several different kinds of machine-learning models on a range of sentiment classification tasks using the produced dataset.

Contact Us

Location:

Belgrade 11000, Bulevar kralja Aleksandra 73
