Project Overview
The rapid growth and fragmented character of social media and publicly available structured data challenge established approaches to knowledge extraction. Many algorithms fail when they encounter noisy, multilingual and contradictory input. Efforts to increase the reliability and scalability of these algorithms are hampered by a lack of suitable training data and gold standards. Given that humans excel at interpreting contradictory and context-dependent evidence, the uComp project addresses these shortcomings by merging collective human intelligence and automated knowledge extraction methods in a symbiotic fashion. The project builds upon the emerging field of Human Computation (HC) in the tradition of games with a purpose and crowdsourcing marketplaces. It advances the field of Web Science by developing a scalable and generic HC framework for knowledge extraction and evaluation, delegating the most challenging tasks to large communities of users and continuously learning from their feedback to optimise automated methods as part of an iterative process. A major contribution is the foundational research on Embedded Human Computation (EHC), which will advance and integrate the currently disjoint research fields of human and machine computation. EHC goes beyond mere data collection and embeds the HC paradigm into adaptive knowledge extraction workflows. The accuracy and scalability of EHC in acquiring factual and affective knowledge were assessed in an open evaluation campaign and in two crowdsourcing applications based on the uComp human computation engine:
- Climate Challenge | Collective Awareness of Sustainability Issues
- Language Quiz | Multilingual Language Resource Acquisition
While the generic uComp methods were evaluated across different domains, climate change was chosen as the main use case because it is a challenging domain, subject to fluctuating and often conflicting interpretations. Showcases benefiting from the extracted knowledge include the Media Watch on Climate Change and the Climate Resilience Toolkit, both of which use the webLyzard Web Intelligence platform to visualize that knowledge. The collaboration with international organisations such as the Climate Program Office of the National Oceanic and Atmospheric Administration (NOAA) and the United Nations Environment Programme has increased impact, provided a rich stream of input data, attracted a critical mass of users, and promoted EHC among a wide range of stakeholders. The achieved advances in affective knowledge extraction, which are to be pursued further in the upcoming Horizon Europe program, improve the computation of communication success metrics and help assess the perception of societal issues.