Benchmarking Graph Data Analysis with LDBC
Many so-called "Big Data" data management problems revolve around the analysis of heterogeneous, complexly structured data sets in which the data is interrelated, forming a graph that connects billions of nodes. Here, the value of the data and its analysis lies not only in the attribute values of these nodes, but also in the way the nodes are connected. Specific application areas that exhibit a growing need for managing such graph-shaped data include life science analytics, social network marketing, and digital forensics.
In this talk, I will make the case that, in order to evaluate the pros and cons of newly emerging technology such as graph database systems or distributed graph programming frameworks, as well as more established technology such as MapReduce, Spark, or even traditional (parallel) database systems, it is important to develop a new generation of benchmarks. In this context I will describe some of the work being conducted in the LDBC (Linked Data Benchmark Council, ldbcouncil.org), focusing on so-called "choke point" based benchmark development and on the Social Network Benchmark (SNB) with its three workloads: one that measures interactive graph database performance, one that tests Business Intelligence queries on graph data, and the Graphalytics workload, which tests graph analysis algorithms (such as PageRank, clustering, and community detection).
Peter Boncz holds appointments as tenured researcher at CWI and professor at VU University Amsterdam. His academic background is in core database architecture, with the architecture of MonetDB as the main topic of his PhD thesis. This work focused on architecture-conscious database research, which studies the interaction between computer architecture and data management techniques. His specific contributions are in cache-conscious join methods, query and transaction processing in columnar database systems, and vectorized query execution. In recent years he has also worked on (distributed) XML database systems, forensic data analysis, scalable RDF data management, and graph database benchmarking.
He has a strong track record in bridging the gap between academia and commercial application, receiving the Dutch ICT Regie Award 2006 for his role in the CWI spin-off company Data Distilleries. In 2008 he founded a new CWI spin-off company called Vectorwise, dedicated to state-of-the-art business intelligence technology. He is also the co-recipient of the 2009 VLDB 10 Years Best Paper Award, and in 2013 received the Humboldt Research Award. In connection with the latter award, in the 2013/2014 academic year he resides part-time at Technical University Munich (TUM), and was also appointed TUM-IAS honorary Hans Fischer fellow there. Peter Boncz is currently the scientific director of the EU project LDBC, which aims to establish industry-strength graph and RDF benchmarks and benchmark practices.