Large-scale data management cannot rely on machines alone. Human judgment is essential for handling ambiguity, ensuring quality, and interpreting domain-specific context. My research develops human-in-the-loop frameworks that integrate crowd contributions intelligently at each stage of the data pipeline.
Data collection and quality
I have developed models and tools for evaluating and improving data quality in crowdsourced settings, including crowd-based annotation tasks and quality metrics that account for worker expertise and bias (WWW, JWE).
Crowdsourcing platforms and patterns
Early work at Politecnico di Milano produced pattern-based specifications for crowdsourcing applications (ICWE), community-based crowdsourcing models (WWW), and adaptive platform designs (IEEE IC).
Applications
Human-in-the-loop methods have been applied across several domains:
- Music transcription: microtask crowdsourcing for music score error detection (ISMIR)
- Energy consumption: user-generated content analysis for smart city energy behavior (WWW, Energies)
- Social research: complementing studies on vulnerable youth with social media data (CHItaly)
- Privacy: human-in-the-loop approaches to image privacy preservation (IIR)