LLLS #7: Big Data Performance Analysis

11:00 – 12:00
3.02.

In recent years, the amount of data that needs to be processed has increased tremendously. Java-based frameworks such as Apache Hadoop, Apache Spark, and Apache Flink have been developed to simplify work with distributed data. They hide much of the complexity of distributed data processing, such as splitting data or moving it within the compute cluster, behind functional building blocks. However, this hidden complexity makes performance analysis of applications written with these frameworks particularly challenging. Performance can be limited by the application, by the framework itself, or by the framework's configuration, and different approaches can be used to investigate these potential causes of low performance.
Jan Frenzel introduces the audience to performance evaluation and performance investigation of these frameworks. He also presents the benefits of using an established performance analysis tool, Vampir, as an alternative to the dashboards of Apache Spark and Apache Flink.

Dresden