Title: Multi-source Entity Resolution Evaluation
Research Area: Data Quality and Data Integration
Multi-source entity resolution generates clusters of records that represent the same entity. Because of data quality issues, automated solutions do not always produce correct clusters. Review tools are therefore needed so that record clusters can be reviewed efficiently and wrongly assigned records can be identified.
First, the user imports the cluster results and the original similarity graph, both in the Gradoop data format, into the review tool.
She can then see which clusters have already been evaluated and continue the review with only the unverified clusters. To help estimate the review effort, the tool shows basic statistics such as the number of records and edges.
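As an illustration of the filtering and statistics step, the following is a minimal Python sketch. The `Cluster` class, its fields, and the statistic names are hypothetical and chosen only for this example; they do not reflect the Gradoop data format or the tool's actual data model. The sketch also assumes that effort is estimated from the clusters that are still unverified.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    # Hypothetical in-memory representation, not the Gradoop format.
    cluster_id: str
    record_ids: list
    edges: list  # (source_id, target_id, similarity) tuples
    verified: bool = False


def unverified(clusters):
    """Return only the clusters that still need to be reviewed."""
    return [c for c in clusters if not c.verified]


def review_statistics(clusters):
    """Count clusters, records, and edges that remain to be reviewed."""
    pending = unverified(clusters)
    return {
        "clusters_total": len(clusters),
        "clusters_unverified": len(pending),
        "records_unverified": sum(len(c.record_ids) for c in pending),
        "edges_unverified": sum(len(c.edges) for c in pending),
    }


clusters = [
    Cluster("c1", ["r1", "r2"], [("r1", "r2", 0.93)], verified=True),
    Cluster("c2", ["r3", "r4", "r5"], [("r3", "r4", 0.81), ("r4", "r5", 0.62)]),
]
stats = review_statistics(clusters)
print(stats)
```

Here only cluster `c2` is pending, so the counts cover its three records and two edges.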
The cluster view supports detailed verification of a single cluster and its records. The user can inspect each record and decide whether it is correctly assigned to the cluster.
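The per-record decision could be captured roughly as in the Python sketch below. `ClusterReview` and its methods are hypothetical names introduced for illustration, under the assumption that each record receives a simple correct/incorrect decision; the actual tool may store decisions differently.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterReview:
    # Hypothetical structure holding the reviewer's decisions for one cluster.
    cluster_id: str
    record_ids: list
    decisions: dict = field(default_factory=dict)  # record_id -> bool

    def mark(self, record_id, is_correct):
        """Store the reviewer's decision for one record of this cluster."""
        if record_id not in self.record_ids:
            raise KeyError(f"{record_id} is not part of cluster {self.cluster_id}")
        self.decisions[record_id] = is_correct

    def is_complete(self):
        """A cluster counts as verified once every record has a decision."""
        return all(r in self.decisions for r in self.record_ids)

    def wrongly_assigned(self):
        """Records the reviewer marked as not belonging to the cluster."""
        return [r for r in self.record_ids if self.decisions.get(r) is False]


review = ClusterReview("c2", ["r3", "r4", "r5"])
review.mark("r3", True)
review.mark("r4", True)
review.mark("r5", False)
print(review.is_complete(), review.wrongly_assigned())
```

Keeping decisions per record, rather than a single flag per cluster, makes it possible to list exactly which records were wrongly assigned.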