
Entity Resolution on Heterogeneous Knowledge Graphs

Knowledge Graphs are a powerful data structure able to model relationships and store information in a machine-readable and semantically meaningful way.
To address complex information needs, it is often necessary to integrate knowledge from multiple sources.
This entails detecting which entities in different data sources refer to the same real-world entity, a task known as Entity Resolution.
For example, if we were to integrate Wikipedia and IMDB, we would need to (among other things) figure out which actors in these data sources refer to the same person.
While much research has focused on tackling this problem in the domain of tabular data, Knowledge Graphs pose specific challenges for this data integration problem, but also offer opportunities.

Aims

This research project aims to utilize the rich relational information present in Knowledge Graphs to create Entity Resolution tools that address this specific problem.
We investigate how (Knowledge Graph) embeddings, which translate the information of these data sources into lower-dimensional vectors, can aid in this process, and how they can be combined with more traditional (machine learning) approaches.
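To make the role of embeddings concrete, the following is a minimal sketch of embedding-based matching, not the project's actual method or the API of any of the tools listed on this page. It assumes each entity has already been embedded as a vector; the entity identifiers, the toy vectors, the function names, and the similarity threshold of 0.9 are all hypothetical values chosen for illustration. For every entity in one source, it picks the most similar entity in the other source by cosine similarity and keeps the pair only if the similarity is high enough.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_entities(left, right, threshold=0.9):
    """Map each left entity to its nearest right entity if similarity >= threshold.

    `left` and `right` are dicts from entity id to embedding vector.
    The threshold is an assumed value, not one taken from the project.
    """
    matches = {}
    for left_id, left_vec in left.items():
        best_id, best_sim = None, -1.0
        for right_id, right_vec in right.items():
            sim = cosine_similarity(left_vec, right_vec)
            if sim > best_sim:
                best_id, best_sim = right_id, sim
        if best_id is not None and best_sim >= threshold:
            matches[left_id] = (best_id, best_sim)
    return matches


# Hypothetical toy embeddings; real Knowledge Graph embeddings have many more dimensions.
wikipedia = {
    "wiki:Actor_A": [0.9, 0.1, 0.0],
    "wiki:Actor_B": [0.0, 1.0, 0.2],
}
imdb = {
    "imdb:person_1": [0.88, 0.12, 0.01],
    "imdb:person_2": [0.1, 0.2, 0.95],
}

matches = match_entities(wikipedia, imdb)
for left_id, (right_id, sim) in matches.items():
    print(f"{left_id} matches {right_id} (cosine similarity {sim:.3f})")
```

In this toy data only wiki:Actor_A finds a sufficiently similar counterpart; wiki:Actor_B stays unmatched because its best candidate falls below the threshold. The exhaustive pairwise comparison scales quadratically, which is one reason blocking and efficient nearest neighbor search matter in practice.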

Software/Data

  • kiez: Hubness-reduced nearest neighbor search for Entity Alignment with Knowledge Graph embeddings
  • ScaDSMovieGraphBenchmark: Benchmark datasets for Entity Resolution on Knowledge Graphs
  • klinker: Blocking methods for Entity Resolution (in Knowledge Graphs)

Publications

  • M. Hofer, D. Obraczka, A. Saeedi, H. Köpcke, and E. Rahm, “Construction of Knowledge Graphs: State and Challenges,” CoRR, vol. abs/2302.11509, 2023, doi: 10.48550/ARXIV.2302.11509.
  • D. Obraczka and E. Rahm, “Fast Hubness-Reduced Nearest Neighbor Search for Entity Alignment in Knowledge Graphs,” SN Comput. Sci., vol. 3, no. 6, p. 501, 2022, doi: 10.1007/S42979-022-01417-1.
  • D. Obraczka, A. Saeedi, V. Christen, and E. Rahm, “Big Data Integration for Industry 4.0,” in Digital Transformation – Core Technologies and Emerging Topics from a Computer Science Perspective, B. Vogel-Heuser and M. Wimmer, Eds., Springer, 2022, pp. 247–268. doi: 10.1007/978-3-662-65004-2_10.
  • D. Obraczka and E. Rahm, “An Evaluation of Hubness Reduction Methods for Entity Alignment with Knowledge Graph Embeddings,” in Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2021, Volume 2: KEOD, Online Streaming, October 25-27, 2021, D. Aveiro, J. L. G. Dietz, and J. Filipe, Eds., SCITEPRESS, 2021, pp. 28–39. doi: 10.5220/0010646400003064.
  • D. Obraczka, J. Schuchart, and E. Rahm, “Embedding-Assisted Entity Resolution for Knowledge Graphs,” in Proceedings of the 2nd International Workshop on Knowledge Graph Construction co-located with 18th Extended Semantic Web Conference (ESWC 2021), Online, June 6, 2021, D. Chaves-Fraga, A. Dimou, P. Heyvaert, F. Priyatna, and J. F. Sequeda, Eds., in CEUR Workshop Proceedings, vol. 2873. CEUR-WS.org, 2021. Accessed: Jan. 22, 2024. [Online]. Available: https://ceur-ws.org/Vol-2873/paper8.pdf
  • D. Obraczka, A. Saeedi, and E. Rahm, “Knowledge Graph Completion with FAMER (DI2KG Challenge Winner),” in Proceedings of the 1st International Workshop on Challenges and Experiences from Data Integration to Knowledge Graphs co-located with the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD 2019), Anchorage, Alaska, August 5, 2019, D. Firmani, V. Crescenzi, A. D. Angelis, X. L. Dong, M. Mazzei, P. Merialdo, and D. Srivastava, Eds., in CEUR Workshop Proceedings, vol. 2512. CEUR-WS.org, 2019. Accessed: Jan. 22, 2024. [Online]. Available: https://ceur-ws.org/Vol-2512/paper1.pdf
  • D. Obraczka and A.-C. N. Ngomo, “Dragon: Decision Tree Learning for Link Discovery,” in Web Engineering – 19th International Conference, ICWE 2019, Daejeon, South Korea, June 11-14, 2019, Proceedings, M. Bakaev, F. Frasincar, and I.-Y. Ko, Eds., in Lecture Notes in Computer Science, vol. 11496. Springer, 2019, pp. 441–456. doi: 10.1007/978-3-030-19274-7_31.

Team

  • Prof. Erhard Rahm
  • Daniel Obraczka

Funded by:

  • Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung)
  • Free State of Saxony (Freistaat Sachsen)