Davide Rigoni

Ritratto Davide Rigoni

Computer Science and Innovation for Societal Challenges, XXXV series
Grant sponsor

Fondazione Bruno Kessler
Luciano Serafini (FBK), Alessandro Sperduti

Anna Spagnolli

Project description
Understanding multimedia content with the help of background knowledge is one of the fundamental problems in computer vision and pattern recognition, since the use of different types of input data allows us to extract more relationships existing among them. In fact nowadays the most common approach is to manage different types of data in a separate way, losing any hidden relationship among them that could be found if they are managed together. For this reason it is becoming increasingly important to develop algorithms able to discover relationships among different information sources that current algorithms are unable to detect. Just imagine all texts, videos and images that nowadays are related to each other thanks to Facebook, Instagram, YouTube and so on. A better understanding of the connections among them or parts of them could lead to many real successful applications in image and video understanding, generating images and video annotations, indexing and querying multimedia datasets. However making these connections explicit is difficult using a traditional algorithmic approach. Learning by examples provides a viable approach, especially using deep learning approaches that are particularly suited to deal with perceptual tasks. An important component of learning in these domains is the generation of the graph describing the relations among entities of interest represented in data. Deep learning generative models have been widely used for both learning and generating new representations for continuous data such as images. However generative modeling of relational data as graphs still poses significant challenges. For this reason my PhD project aims to study, implement and improve the state-of-the-art in order to extract relational data from all kinds of sources, such as texts and images using generative neural network models.