AMPL Project

Title: AMPL Project: Automatic Meta Data Profiling and Lineage for Integrating Heterogeneous Data Sources

Project duration: 01/2021 – 12/2023

Lead: Prof. Dr. Erhard Rahm, Matthias Täschner

Team Members: Matthias Täschner, Michal Miazga, Daniel Abitz

Partners: Data Virtuality GmbH, Leipzig University

Research Area: Data Quality

Efficiently managing and merging many heterogeneous, dynamic data sources has become a critical success factor for financial institutions. However, as heterogeneity and data dynamics increase, it is becoming ever harder to keep track of historically collected, exponentially growing data pools. This has already led to significant macroeconomic damage, including the global financial crisis of 2007 and 2008, the scale of which could have been contained with real-time transparency and thus a better overview of risk and metadata. Currently, no solution exists for financial institutions that allows flexible integration of heterogeneous data sources while also providing intuitive metadata preparation. AMPL aims to develop a new tool for structuring, analyzing, and exploring large volumes of heterogeneous, dynamic data sources. For this purpose, the tool computes comprehensive data profiles consisting of statistics, correlations, and complex provenance information (lineage).

Aims

By breaking down existing silos and combining innovative technologies with the requirements of market participants, AMPL makes it possible to completely rethink data and metadata management.

Technology

Machine-learning-assisted methods support schema mapping (schema matching, ontology matching) between data sources, complemented by new methods for scalable and incremental computation of data profiles. These methods will be developed based on preliminary work of the project partners and recent research results in graph analysis, SQL-based data integration, and incremental record linkage (entity resolution) on dynamic and heterogeneous data sources. The data profiles are then presented in a novel web-based visual front end that greatly simplifies data interaction and exploration.
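
To make the idea of incremental profile computation more concrete, the following minimal Python sketch updates the statistics of a single numeric column batch by batch, without rescanning earlier data. It is an illustration only and not the AMPL implementation: the class `ColumnProfile` and its field names are hypothetical, and Welford's online algorithm is used here as one common way to maintain mean and variance incrementally.

```python
import math
from dataclasses import dataclass


@dataclass
class ColumnProfile:
    """Hypothetical incremental profile of one numeric column."""
    count: int = 0                 # number of non-null values seen
    nulls: int = 0                 # number of null values seen
    mean: float = 0.0              # running mean of non-null values
    m2: float = 0.0                # running sum of squared deviations
    minimum: float = math.inf
    maximum: float = -math.inf

    def update(self, value):
        """Fold one new value into the profile (Welford's algorithm)."""
        if value is None:
            self.nulls += 1
            return
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def variance(self):
        """Sample variance; 0.0 when fewer than two values were seen."""
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0


# Usage: new batches from a dynamic source update the existing profile.
profile = ColumnProfile()
for batch in ([1.0, 2.0, None], [3.0, 4.0]):
    for value in batch:
        profile.update(value)
```

Because each update touches only a handful of running values, the cost of refreshing a profile depends on the size of the new batch rather than on the full history of the source. Correlations and lineage information would need additional state and are not covered by this sketch.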

Find out more about the projects of the Transfer and Service Center of ScaDS.AI Dresden/Leipzig.
