Distributed Data Processing and Streaming in Flink
Nowadays, data is produced and collected everywhere in daily life, and a massive amount of information is generated every second. Apache Flink is an open-source system for expressive, declarative, fast, and efficient data analysis on both batch and streaming data. Flink combines the scalability and programming flexibility of distributed MapReduce-like platforms with the efficiency, out-of-core execution, and query optimization capabilities found in parallel databases. At its core, Flink builds on a distributed dataflow runtime that unifies batch and incremental computations over a true-streaming, pipelined execution engine. Its programming model allows for stateful, fault-tolerant computations, flexible user-defined windowing semantics for streaming, and unique support for iterations. Flink is converging into a use-case-complete system for parallel data processing with a wide range of top-level libraries, including machine learning and graph processing. Apache Flink originates from the Stratosphere project led by TU Berlin and incorporates the results of various scientific papers published in VLDBJ, SIGMOD, (P)VLDB, ICDE, HPDC, and other venues.
Tilmann Rabl is a research director at the Database Systems and Information Management (DIMA) group and technical coordinator of the Berlin Big Data Center (BBDC). He is also a senior researcher at the German Research Center for Artificial Intelligence (DFKI). Tilmann received his PhD from the University of Passau and spent four years at the University of Toronto as a postdoc in the Middleware Systems Research Group (MSRG).
In his PhD thesis, Tilmann invented the Parallel Data Generation Framework (PDGF), for which he received the Transaction Processing Performance Council’s (TPC) Technical Contribution Award. In Toronto, he received a MITACS Award and an IBM CAS postdoctoral fellowship, each in both 2013 and 2014. He is a professional affiliate of the TPC and co-founder and chair of the SPEC Research Working Group on Big Data. Tilmann is a member of the steering committee of the Workshop on Big Data Benchmarking (WBDB) series and a member of the board of directors of the BigData Top100 List. Tilmann is also co-founder of the startup bankmark, for which he received an EXIST award.