Graph processing on Apache Flink with the Gelly framework
In this talk, I will give an overview of Gelly, Apache Flink’s graph processing API. Flink’s iterative operators and other unique features make it a competitive alternative for large-scale graph processing. Graph analysis tasks can be expressed elegantly using common Flink operators, and different graph processing models, such as vertex-centric and gather-sum-apply, map easily onto Flink dataflows. Using Gelly, you can perform data loading, transformation, filtering, graph creation, and analysis in a single program.
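To make the vertex-centric model concrete, here is a minimal sketch in plain Python of single-source shortest paths computed in synchronized supersteps. It does not use Flink or the Gelly API; the function name `vertex_centric_sssp`, its parameters, and the in-memory data layout are assumptions made purely for illustration. In each superstep, vertices whose value changed send candidate distances along their outgoing edges, and each receiving vertex keeps the minimum.

```python
import math


def vertex_centric_sssp(edges, source, max_supersteps=100):
    """Illustrative vertex-centric shortest paths (hypothetical helper, not Gelly).

    edges: iterable of (src, dst, weight) directed edges with non-negative weights.
    Returns a dict mapping each vertex to its distance from `source`.
    """
    edges = list(edges)
    vertices = {v for s, d, _ in edges for v in (s, d)} | {source}

    # Vertex state: current best known distance from the source.
    dist = {v: math.inf for v in vertices}
    dist[source] = 0.0

    # Adjacency list of outgoing edges per vertex.
    out_edges = {v: [] for v in vertices}
    for s, d, w in edges:
        out_edges[s].append((d, w))

    # Only vertices whose value changed in the last superstep send messages.
    active = {source}
    for _ in range(max_supersteps):
        if not active:
            break

        # Messaging phase: each active vertex proposes a distance to its neighbors.
        inbox = {}
        for v in active:
            for d, w in out_edges[v]:
                inbox.setdefault(d, []).append(dist[v] + w)

        # Update phase: each vertex keeps the smallest proposal if it improves.
        active = set()
        for v, messages in inbox.items():
            best = min(messages)
            if best < dist[v]:
                dist[v] = best
                active.add(v)

    return dist


if __name__ == "__main__":
    sample_edges = [(1, 2, 1.0), (2, 3, 2.0), (1, 3, 5.0), (3, 4, 1.0)]
    print(vertex_centric_sssp(sample_edges, source=1))
```

The computation stops as soon as no vertex changes its value, which mirrors how an iterative dataflow can terminate once there are no more updates to propagate.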
I will also share our recent work at KTH on supporting single-pass graph streaming analytics on Apache Flink. I will introduce gelly-stream, a prototype for computing graph statistics, aggregates, and sketches, as well as more complex algorithms such as connected components, on streams of edges.
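As a rough illustration of what single-pass analysis on an edge stream can look like, the sketch below maintains connected components with a union-find (disjoint-set) structure, touching each edge exactly once. This is plain Python, not gelly-stream's API or necessarily the approach the prototype takes; the class name `StreamingComponents` and its methods are hypothetical, and vertex ids are assumed to be mutually comparable so that the smallest id can label each component.

```python
class StreamingComponents:
    """Illustrative single-pass connected components (hypothetical, not gelly-stream)."""

    def __init__(self):
        # Maps each seen vertex to its parent in the disjoint-set forest.
        self.parent = {}

    def find(self, v):
        """Return the component representative of v, registering v if unseen."""
        self.parent.setdefault(v, v)

        # Walk up to the root.
        root = v
        while self.parent[root] != root:
            root = self.parent[root]

        # Path compression: point every vertex on the path directly at the root.
        while self.parent[v] != root:
            next_vertex = self.parent[v]
            self.parent[v] = root
            v = next_vertex
        return root

    def add_edge(self, u, v):
        """Consume one edge from the stream and merge the endpoint components."""
        root_u, root_v = self.find(u), self.find(v)
        if root_u != root_v:
            # Use the smaller id as the representative of the merged component.
            low, high = sorted((root_u, root_v))
            self.parent[high] = low

    def components(self):
        """Return the current components as {representative: set of vertices}."""
        groups = {}
        for v in list(self.parent):
            groups.setdefault(self.find(v), set()).add(v)
        return groups


if __name__ == "__main__":
    cc = StreamingComponents()
    for edge in [(1, 2), (3, 4), (2, 3), (5, 6)]:
        cc.add_edge(*edge)  # each edge is processed once, then discarded
    print(cc.components())
```

The state kept here grows with the number of vertices rather than the number of edges, which is the property that makes this kind of summary attractive for streams where storing every edge is impractical.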
Vasia Kalavri is a PhD candidate at KTH, Stockholm, doing research on distributed data processing, systems optimization, and large-scale graph analysis. She is a PMC member of Apache Flink and a core developer of Flink’s graph processing API, Gelly. She has previously taught training courses for engineers, interned at Telefonica Research and data Artisans, and worked as a web developer.