<!-- PRIN HOME -->
Discount quality for responsible data science:
Human-in-the-Loop for quality data
Technological advances have enormously increased the capability to analyze data for scientific research and to reuse it in Open Science initiatives. The goal of building data spaces, or data ecosystems, that support the publication and reuse of data to feed data science pipelines has inspired several initiatives in Europe and worldwide. However, assessing the quality of data and results can be very expensive in terms of both computational and human costs.
In this scenario, the project is committed to responsible data science, with a human-in-the-loop (HITL) approach, focusing on making the whole process sustainable, both computationally and in terms of human effort.
The original contributions of the project focus on data preparation, which is known to take up to 80% of the time required for data analysis. The aim is to balance the need for a data quality level that makes the data “fit for use” in a given context against the effort that such high-quality data preparation requires for a given analysis goal (“discount” quality). A theoretical basis and instruments will be developed to provide a minimally viable approach that can be adapted to the context of use. Two main goals will be pursued towards sustainability: i) reducing the computational effort needed to analyze data; ii) introducing HITL in a sustainable way, so that human contributions are effective while remaining limited in time and size.
<ul>
<li> Objectives
<li> Team
</ul>