Scalable and Accurate Decision-Tree Learning for Entity Resolution
Type of thesis: Bachelor's thesis / Location: Leipzig / Status of thesis: In progress
Entity Resolution (also known as Deduplication, Record Linkage, or Link Discovery) is the task of identifying records that refer to the same real-world entity. Entities are usually matched by computing the similarity between them and then using this similarity to decide whether they are the same. With a plethora of similarity measures and ways of combining them, creating good match conditions can be a cumbersome process of trial and error. This is why machine learning approaches are used to aid in this process.
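To make the idea of a match condition concrete, the following is a minimal sketch of a hand-written rule that combines two similarity measures with thresholds. It is an illustration only, not part of FAMER or DRAGON: the attribute names (`name`, `address`), the choice of similarity measures, and the threshold values are all assumptions picked for the example.

```python
from difflib import SequenceMatcher


def string_similarity(a: str, b: str) -> float:
    """Character-based similarity in [0, 1] using difflib's ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard similarity in [0, 1] over whitespace-separated tokens."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty values are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def is_match(r1: dict, r2: dict,
             name_threshold: float = 0.8,
             address_threshold: float = 0.5) -> bool:
    """Hypothetical match condition: both similarities must pass their threshold."""
    name_sim = string_similarity(r1["name"], r2["name"])
    address_sim = jaccard_similarity(r1["address"], r2["address"])
    return name_sim >= name_threshold and address_sim >= address_threshold


# Illustrative records (made up for this sketch).
record_a = {"name": "Jon Smith", "address": "12 Main Street Leipzig"}
record_b = {"name": "John Smith", "address": "12 Main Street Leipzig"}
record_c = {"name": "Maria Garcia", "address": "5 Elm Road Dresden"}
```

Here `is_match(record_a, record_b)` accepts the pair, while `is_match(record_a, record_c)` rejects it. Choosing which measures to combine and where to place the thresholds is exactly the trial-and-error step that a learned model such as a decision tree is meant to automate.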
This bachelor thesis consists of integrating the decision-tree-based DRAGON algorithm into FAMER (FAst Multi-source Entity Resolution system), a scalable framework for distributed multi-source entity resolution implemented with Apache Flink™.
Contact: obraczka@informatik.uni-leipzig.de
Universität Leipzig
Knowledge Graphs, Entity Resolution, Knowledge Graph Embeddings, Machine Learning