Iqbal Zain

Portrait of Iqbal Zain

Computer Science and Innovation for Societal Challenges, XXXVI series
Grant sponsor: Fondazione CARIPARO
Supervisor: Tullio Vardanega

Project description
Software is ubiquitous in modern society. Almost every aspect of life, including healthcare, energy, transportation, public safety, and even entertainment, depends on the reliable operation of software. Software engineering methodologies and tools have served a wide range of traditional information systems well, but they are difficult to apply to Machine Learning (ML) application projects, because traditional software systems and ML applications differ in fundamental ways. The lifecycle of an ML application also differs from that of traditional software: it revolves around a computational model (the ML model), which is trained on a training data set and then applied to further data to draw inferences. Applications based on ML models depend on the training data set and are often unpredictable in their behavior, which raises uncertainty about, and hence undermines trust in, the system's outcome.

The growing importance of ML-based systems in our daily life makes it imperative to find sound and effective ways to ensure their correctness in the classic Software Engineering sense (which, to date, is the sole plausible and applicable ground truth for correctness). Recently, researchers have started applying concepts from the software testing domain (e.g., code coverage, mutation testing, or property-based testing) to help ML engineers detect and correct faults in ML programs.

Conventionally, software systems are created deductively: the executable rules that govern system behavior at run time are written down as program code. With ML techniques, by contrast, these rules are inferred from training data, from which the requirements are generated inductively. This is a major change of perspective. It is therefore imperative for both the software engineering (SE) and ML communities to research and develop novel approaches capable of addressing these emerging challenges.
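To illustrate the kind of property-based (metamorphic) check mentioned above, the following is a minimal sketch, not part of the project itself: it uses a hypothetical toy nearest-centroid classifier and tests a metamorphic relation (permutation of coordinates applied consistently to input and centroids must leave the prediction unchanged), sidestepping the need for a labeled oracle. All names here (`classify`, `test_permutation_invariance`) are illustrative assumptions.

```python
import random

# Hypothetical toy "ML model": a nearest-centroid classifier
# with two fixed centroids in a 2-D feature space.
def classify(point, centroids=((0.0, 0.0), (10.0, 10.0))):
    """Return the index of the nearest centroid (squared Euclidean distance)."""
    def dist2(p, c):
        return sum((pi - ci) ** 2 for pi, ci in zip(p, c))
    return min(range(len(centroids)), key=lambda i: dist2(point, centroids[i]))

def test_permutation_invariance(trials=1000, seed=42):
    """Metamorphic relation: swapping the two coordinates of BOTH the input
    and the centroids must not change the predicted class, because Euclidean
    distance is invariant under a consistent permutation of coordinates.
    No ground-truth labels are required to run this check."""
    rng = random.Random(seed)
    centroids = ((0.0, 0.0), (10.0, 10.0))
    swapped_centroids = tuple(c[::-1] for c in centroids)
    for _ in range(trials):
        p = (rng.uniform(-20.0, 20.0), rng.uniform(-20.0, 20.0))
        assert classify(p, centroids) == classify(p[::-1], swapped_centroids)
    return True
```

A real ML-aware test suite would apply relations of this kind (scaling, permutation, noise robustness) to the trained model under test rather than to a hand-written function; the sketch only shows the testing pattern.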
The trained behavior of an ML-based system may be incorrect even if the learning algorithm is implemented correctly, a situation in which traditional testing techniques are ineffective. A critical problem is how to develop ML-aware tests effectively, and how to evolve ML-driven systems, given that they have no (complete) specification, and sometimes not even source code, corresponding to some of their critical behaviors. The main objectives of this research, with their broad scheduling, are as follows:
1. To consolidate the notion of ML-aware testing in terms of needs, components, and expectations (by year 0.5);
2. To explore and evaluate productive solutions to feed ML-aware testing (by year 1.5);
3. To apply these findings to the specific ambit of ML-driven software development, where the product of the use of ML is a software artifact that needs to be proven correct (by year 3).