TF-IDF for Entity Resolution in Huge Knowledge Graphs
Type of thesis: Bachelor's thesis / Location: Leipzig / Status of thesis: Theses in progress
Entity resolution (also known as deduplication, record linkage, or link discovery) is the task of identifying entities that refer to the same real-world entity. Entities are usually matched by computing the similarity between them; this similarity is then used to decide whether they are the same. One such similarity measure is tf-idf (term frequency-inverse document frequency).
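To make the idea concrete, here is a minimal sketch of tf-idf used as a similarity measure between textual entity descriptions. It is plain single-machine Python and not FAMER's implementation or API. The example records, the whitespace tokenizer, the smoothed idf variant, and the function names `tokenize`, `tfidf_vectors`, and `cosine` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace (assumed tokenizer; real data
    # usually needs more careful normalization).
    return text.lower().split()


def tfidf_vectors(records):
    """Return one sparse tf-idf vector (dict: term -> weight) per record."""
    docs = [Counter(tokenize(r)) for r in records]
    n = len(docs)
    # Document frequency: number of records that contain each term.
    df = Counter(term for doc in docs for term in doc)
    vectors = []
    for doc in docs:
        total = sum(doc.values())
        vec = {}
        for term, count in doc.items():
            tf = count / total
            # Smoothed idf, one of several common variants (an assumption).
            idf = math.log((1 + n) / (1 + df[term])) + 1.0
            vec[term] = tf * idf
        vectors.append(vec)
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical entity descriptions from different sources.
records = [
    "apple iphone 12 64gb black",
    "iphone 12 apple 64 gb",
    "samsung galaxy s21 128gb",
]
vectors = tfidf_vectors(records)
print(cosine(vectors[0], vectors[1]))  # records sharing several terms
print(cosine(vectors[0], vectors[2]))  # records sharing no terms
```

A matcher would then compare such scores against a threshold to decide whether two descriptions refer to the same entity. Rare terms receive a higher idf weight, so agreement on them contributes more to the score than agreement on terms that appear in most records.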
This bachelor thesis consists of implementing tf-idf as a similarity measure for FAMER (FAst Multi-source Entity Resolution system), a scalable framework for distributed multi-source entity resolution implemented with Apache Flink™.
Contact: obraczka@informatik.uni-leipzig.de
Universität Leipzig
Knowledge Graphs, Entity Resolution, Knowledge Graph Embeddings, Machine Learning
ScaDS.AI Dresden/Leipzig (Center for Scalable Data Analytics and Artificial Intelligence) is a center for Data Science, Artificial Intelligence and Big Data with locations in Dresden and Leipzig.
Copyright 2023 © ScaDS.AI Dresden/Leipzig – All rights reserved.