Mattia Carletti


Computer Science and Innovation for Societal Challenges, XXXV series


GianAntonio Susto

Anna Spagnolli

Project: Trustworthy Machine Learning for Anomaly Detection and Computer Vision Applications
The full text of the dissertation can be downloaded from:

Abstract: Technologies based on Artificial Intelligence (AI) have gained tremendous popularity in the past few years, largely because Machine Learning models can now achieve impressive performance, even outperforming humans in several application domains. Unfortunately, there is still one fundamental aspect in which humans succeed and machines fail miserably: trustworthiness. Countless cases of unintended harm caused by AI-enabled technologies have been reported and have attracted wide media coverage. While unintentional, the consequences of these events can be devastating and affect our quality of life, all the more so as AI is increasingly adopted in sensitive domains to support high-stakes decisions. In light of these observations, the need for Trustworthy AI is particularly pressing. In this thesis, we thoroughly investigate the properties of popular models that have been used for years in academia and industry, and we provide tools to improve their trustworthiness. The focus is on two specific dimensions in the space of Trustworthy AI, i.e., interpretability and robustness, and on two application domains of great practical interest, i.e., Anomaly Detection and Computer Vision. In the context of Anomaly Detection, we introduce novel model-specific methods to interpret the Isolation Forest, a popular model in this field, at both the global and local scales. In Computer Vision, we address the problem of robust image classification with Convolutional Neural Networks. We first unveil previously unknown properties of adversarially trained models, elucidating the inner mechanisms through which Adversarial Training may enforce robustness against adversarial examples. We also showcase failure modes related to the simplicity biases induced by Adversarial Training, which may be harmful when robust models are deployed in the wild.
Finally, we design a novel filtering procedure aimed at removing textures while preserving the image's semantic content. This filtering procedure is then exploited to design a defense against adversarial attacks.
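For readers unfamiliar with the Isolation Forest model discussed in the abstract, the following is a minimal sketch of how it flags anomalies, using scikit-learn's standard implementation (not the interpretability methods introduced in the thesis). The intuition is that outliers are isolated by fewer random splits and therefore receive lower scores.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# 200 normal points around the origin, plus one obvious outlier
X = rng.normal(0.0, 1.0, size=(200, 2))
X = np.vstack([X, [[8.0, 8.0]]])

model = IsolationForest(n_estimators=100, random_state=0).fit(X)
labels = model.predict(X)        # +1 = inlier, -1 = outlier
scores = model.score_samples(X)  # lower score = more anomalous

print(labels[-1])                # label of the injected outlier
```

The thesis builds on top of this kind of model to explain, globally and locally, *why* a point is deemed anomalous; the snippet above only shows the base detector.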
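The adversarial examples mentioned in the abstract can be illustrated with a toy sketch of the Fast Gradient Sign Method (FGSM), here applied to a hand-picked logistic-regression model rather than the CNNs studied in the thesis; all weights and inputs below are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy linear model: weights chosen so x is confidently classified as class 1
w = np.array([2.0, -1.0])
b = 0.0
x = np.array([1.5, 0.5])          # clean input, logit = 2.5

# Gradient of the loss -log p(y=1|x) with respect to x is (p - 1) * w
p = sigmoid(w @ x + b)
grad = (p - 1.0) * w

# FGSM: step in the direction of the sign of the gradient
eps = 1.0
x_adv = x + eps * np.sign(grad)

p_adv = sigmoid(w @ x_adv + b)
print(p, p_adv)                   # confident before, flipped after
```

A small, structured perturbation is enough to flip the prediction; Adversarial Training counters this by training on such perturbed inputs, and the thesis analyzes the properties and failure modes of models trained that way.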