Big Data Analytics and Engineering

At ScaDS.AI Dresden/Leipzig, we conduct research in the area of Big Data Analytics and Engineering. Data is “the new oil” that fuels Machine Learning models and innovation. Building trustworthy Machine Learning and AI tools therefore requires access to large amounts of high-quality Open Data and Open Models, as well as Data Quality and Integration methods to generate this kind of data. Moreover, data needs to be accessible through usable Big Data Analytics tools, scalable methods, and back-ends.

Topic Areas

Open Data and Open Models

ScaDS.AI Dresden/Leipzig is very active in the creation and maintenance of open data and open models. Open data include knowledge graphs, labeled training data, web data, medical information, and Natural Language Processing (NLP) sources of German texts. We aim to become one of the first major research centers in Europe to enable researchers to exploit the web at industry scale for AI research and development (i.e., the Immersive Web Observatory). ScaDS.AI Dresden/Leipzig is also active in several consortia of the National Research Data Infrastructure (NFDI).


Data Quality and Data Integration

We develop generic approaches for largely automated data cleaning and data integration, particularly for the generation and maintenance of large knowledge graphs. Furthermore, we devise active learning techniques to generate large labeled training datasets and develop approaches to leverage large data repositories for Artificial Intelligence and Machine Learning.


Big Data Analytics

We will investigate the fundamentals of how scalable Machine Learning algorithms can be built with a LEGO-type construction and reuse metaphor. We aim to provide modular analytics in the form of building blocks for Artificial Intelligence, significantly extend our work on data cubes, and provide a web data analytics platform at petabyte scale.


Funded by:
Funded by the Federal Ministry of Education and Research.
Funded by the Free State of Saxony.