Scalable Graph Analytics
Many big data applications in business and science require the management and analysis of huge amounts of graph data. Systems for managing and analyzing such graph data should meet a number of challenging requirements, including support for an expressive graph data model, powerful query and graph mining capabilities, ease of use, and high performance and scalability. In this tutorial, we survey current system approaches for the management and analysis of "big graph data". We discuss graph database systems, distributed graph processing systems such as Google Pregel and its variations, and graph dataflow approaches based on Apache Spark and Flink. We further introduce a new research framework called Gradoop, developed at the German Big Data center ScaDS, that is built on the so-called Extended Property Graph Data Model and provides dedicated support for analyzing not only single graphs but also collections of graphs. We also discuss current and future research challenges.
Erhard Rahm is a full professor of databases at the computer science institute of the University of Leipzig, Germany. His current research focuses on Big Data, graph analytics and data integration. He has authored several books and more than 200 peer-reviewed journal and conference publications. His research has received several awards, in particular the renowned 10-year best-paper award of the conference series VLDB (Very Large Data Bases) and the Influential Paper Award of the conference series ICDE (International Conference on Data Engineering). Prof. Rahm is one of the two scientific coordinators of the new German center of excellence on Big Data, ScaDS (competence center for SCAlable Data services and Solutions) Dresden/Leipzig.