Master/Diploma thesis: Analysis of Java Methods for Performance Measurements

Type of thesis: Master's thesis / Location: Dresden / Status of thesis: Theses in progress

In the Big Data domain, Java-based frameworks such as Apache Hadoop, Flink, or Spark provide a way to distribute application workloads for processing large-scale datasets. For fast and resource-efficient execution of such applications, it is important to find and optimize the program sections that limit execution speed. For performance analysis, the application code is augmented with measurement code that captures timestamps at method entries and exits (instrumentation). A program trace or profile is then created automatically while the application executes and can later be presented to the performance analyst. Since instrumentation slows down the application, the methods to be instrumented need to be selected carefully. Manual selection of the important methods is, however, tedious if not infeasible for large applications with a complex structure. Thus, (semi-)automatic selection needs to be investigated and applied. A selection based on static analysis of a method's bytecode is one such candidate, but it is expected to yield only small improvements because runtime information is missing.
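To make the notion of instrumentation concrete, the following is a minimal hand-written sketch of measurement code at method entry and exit. It is not how Score-P or any particular tool instruments Java code; the class `ManualInstrumentation`, the `Event` record, and the `enter`/`exit` helpers are hypothetical names chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

public class ManualInstrumentation {
    // One trace record: method name, event kind, timestamp in nanoseconds.
    // (Hypothetical layout, not a real trace format.)
    public record Event(String method, String kind, long timestampNanos) {}

    public static final List<Event> TRACE = new ArrayList<>();

    public static void enter(String method) {
        TRACE.add(new Event(method, "enter", System.nanoTime()));
    }

    public static void exit(String method) {
        TRACE.add(new Event(method, "exit", System.nanoTime()));
    }

    // Example method with measurement code added by hand.
    public static long sumTo(int n) {
        enter("sumTo");
        try {
            long sum = 0;
            for (int i = 1; i <= n; i++) {
                sum += i;
            }
            return sum;
        } finally {
            exit("sumTo"); // recorded even if the body throws
        }
    }

    public static void main(String[] args) {
        sumTo(1_000_000);
        for (Event e : TRACE) {
            System.out.println(e.timestampNanos() + " " + e.kind() + " " + e.method());
        }
    }
}
```

Every instrumented call adds two timestamp reads and two list insertions, which illustrates why instrumenting very short, frequently called methods can dominate their runtime.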

In this master's or diploma thesis, the potential influence on runtime of method selections based on static method information is to be investigated. Such static method information, e.g. the size of a method or the presence of specific instructions, is to be extracted from a method's bytecode. The information is then used to select the methods to instrument. The relevance of this selection for the analysis and the resulting runtime overhead are to be evaluated. A measurement infrastructure based on Score-P (see: http://www.score-p.org) with instrumentation and filtering capabilities is available.
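As a rough illustration of what such a selection could look like, the sketch below filters methods using two static properties. It assumes the properties have already been extracted from the bytecode by some other means. The `MethodInfo` record, the `hasLoop` flag, the selection rule, and the size threshold are all hypothetical; finding suitable criteria is precisely the subject of the thesis.

```java
import java.util.List;
import java.util.stream.Collectors;

public class StaticMethodFilter {
    // Hypothetical summary of one method's bytecode:
    // its size in bytes and whether it contains a loop.
    public record MethodInfo(String name, int bytecodeSize, boolean hasLoop) {}

    // Assumed rule: instrument a method if it is large enough or contains a loop.
    // Small loop-free methods (e.g. getters) are skipped to limit overhead.
    public static List<String> select(List<MethodInfo> methods, int minSize) {
        return methods.stream()
                .filter(m -> m.bytecodeSize() >= minSize || m.hasLoop())
                .map(MethodInfo::name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<MethodInfo> methods = List.of(
                new MethodInfo("Point.getX", 5, false),
                new MethodInfo("Matrix.multiply", 120, true),
                new MethodInfo("Util.spin", 20, true));
        // Prints [Matrix.multiply, Util.spin]
        System.out.println(select(methods, 32));
    }
}
```

The resulting list of names could then be fed into whatever filtering mechanism the measurement infrastructure offers; the format of that filter is not assumed here.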

Envisioned Tasks

  1. Investigation of suitable method information
  2. Investigation of relevance and overhead
  3. Implementation of the proposed static analysis
  4. Validation and analysis of the proposed solution
  5. Documentation of implementation and results in written form

 

For this work, basic knowledge of Java and Java bytecode, as well as of tracing and profiling of applications, is desirable.

The thesis can be written in either German or English.

Contact

Jan Frenzel

Service and Transfer Center

TU Dresden

Service and Transfer Center, Performance Analysis, Estimation of Big Data Applications, Big Data Frameworks on HPC
