Data is now produced and collected throughout daily life, and a massive amount of information is generated every second. However, a significant portion of this information is useful or valid only for a limited time, or is simply too large to be stored for later processing. Batch processing platforms such as Hadoop MapReduce are a poor fit for the incremental processing of continuous data streams that this requires. Therefore, modern big data processing engines combine the scalability of distributed architectures with the one-pass semantics of traditional stream engines.
In this talk, we survey the current state of the art in scalable stream processing from a user perspective. We examine and describe the architectures, execution models, and programming interfaces of the most prominent platforms, relate them to the state of the art in traditional stream processing, and discuss challenges and limitations.
Kai-Uwe Sattler heads the Database and Information Systems group at the Faculty of Computer Science and Automation of the Technische Universität (TU) Ilmenau. He received his Diploma (M.Sc.) in Computer Science from the University of Magdeburg, Germany. In 1998, he received his Ph.D. in Computer Science (magna cum laude) from the same university. From 1998 to September 2003, he was a member of the Database research group (head: Prof. Gunter Saake) at the University of Magdeburg. From October 2001 until March 2002, he worked as a visiting assistant professor at UC Davis, USA. In June 2003, he received his Habilitation (venia legendi) in Computer Science from the University of Magdeburg. He joined the Department of Computer Science and Automation of the TU Ilmenau in October 2003 as Professor.
Currently, he serves as the Dean of the department. His research interests include query processing techniques, data management on modern hardware, and data stream processing and analytics.
