Abstract: With the rise of linked data and knowledge graphs, there is a compelling need for suitable solutions to increase the coverage and correctness of data sets, to add missing knowledge and to identify and remove errors. Several approaches, mostly relying on machine learning and natural language processing techniques, have been proposed to address this refinement goal; they usually need a partial gold standard, i.e. some “ground truth” to train automatic models. Gold standards are manually constructed, either by involving domain experts or by adopting crowdsourcing and human computation solutions. In this paper, we present an open source software framework to build Games with a Purpose for linked data refinement, i.e. web applications to crowdsource partial ground truth, by motivating user participation through fun incentives. We detail the impact of this new resource by explaining the specific data linking “purposes” supported by the framework (creation, ranking and validation of links) and by defining the respective crowdsourcing tasks to achieve those goals. We also introduce our approach for incremental truth inference over the contributions provided by players of Games with a Purpose (also abbreviated as GWAP): we motivate the need for such a method with the specificity of GWAP vs. traditional crowdsourcing; we explain and formalize the proposed process and its positive consequences; and we illustrate the results of an experimental comparison with state-of-the-art approaches. To show this resource’s versatility, we describe a set of diverse applications that we built on top of it; to demonstrate its reusability and extensibility potential, we provide references to detailed documentation, including a complete tutorial that, in a few hours, guides new adopters through customizing and adapting the framework to a new use case.
Keywords: Human computation; Games with a purpose; Linked data; Knowledge graph; Data refinement; Data linking; Truth inference
This work was partially supported by the STARS4ALL project (H2020-688135) co-funded by the European Commission.
I. Celino, G. Re Calegari & A. Fiano. Refining linked data with games with a purpose. Data Intelligence 2(2020), 417-442.
Irene Celino
The ideas and concepts presented in the paper are the results of at least three years of cooperation between the authors. I. Celino focused on data linking, crowdsourcing tasks and incremental truth inference. All authors contributed to the manuscript writing and they edited and reviewed the final version of the article.
Irene Celino is the Head of the Knowledge Technologies group at Cefriel, where she leads an R&D team and serves as Portfolio and Project Manager. With expertise in Semantic Web and Human Computation technologies, her research activities cover the application of such innovative technologies to the design and development of Web applications, search engines, recommendation systems and mobile games, especially in Smart City and transportation-related scenarios. She has over 15 years of experience in over 30 R&D cooperative projects, both at National/Regional level and at European level within FP6, FP7, H2020 and EIT Digital. She is the author of over 70 scientific publications in peer-reviewed journals, books and conferences.
Gloria Re Calegari
G. Re Calegari and A. Fiano focused on the framework, its applications, evaluation and tutorial. All authors contributed to the manuscript writing and they edited and reviewed the final version of the article.
Gloria Re Calegari is a researcher at Cefriel. She has a computer science background and her fields of expertise are Data Science and Human Computation technologies. Her research activities cover the design and development of gamified applications and Games with a Purpose, as well as the development of machine learning solutions that bring together humans and artificial intelligence. In more than 5 years of experience in R&D cooperative projects, both at National and Regional level, she has published more than 20 scientific publications in peer-reviewed journals and conferences.
Andrea Fiano
G. Re Calegari and A. Fiano focused on the framework, its applications, evaluation and tutorial. All authors contributed to the manuscript writing and they edited and reviewed the final version of the article.
Andrea Fiano is a senior developer at Cefriel. Starting with the development of Web applications in .NET, he moved on to the development of backend solutions and REST APIs in Java, and then to Single Page Applications and Progressive Web Apps in Angular and Node.js. He provides his expertise in the development of customer-tailored solutions as well as in supporting the research branch. In particular, he has contributed to the field of Human Computation by developing some Games with a Purpose in the Smart City and Crowdsourcing scenarios.
