In this tutorial, we introduce R users to the advantages of working with R on a High Performance Computing (HPC) cluster. We give an overview of the most common Machine Learning methods and then look at how they can be parallelized to reduce run time. We also show how some of the benchmarking packages in R work. At the end, participants will have the opportunity to try it all themselves in the Hands-on Session.
Course Details
Title: R on HPC – Introduction
Last Session: 11.07.2022, 10 a.m. – 3 p.m. (Speakers: Neringa Jurenaite, Dr. Iryna Okhrin, Dr. Taras Lazariv)
Registration: https://event.zih.tu-dresden.de/nhr/r-hpc
Target Group: HPC Basics / HPC User
Language: English
Format: Tutorial
Agenda
- Accessing R and RStudio on our HPC system
- Overview of some of the main Machine Learning models (e.g. linear and logistic regression, random forest)
- Introduction to model benchmarking in R
- Introduction to parallelization in R: data-based and model-based
- Hands-on Session: Exercises
Handouts
The course material (slides, sample application) will be available.
Pre-Knowledge
Participants should have an understanding of Machine Learning methods and basic experience in using R. We recommend attending the ML-HPC-B NHR tutorial in advance or familiarizing yourself with Taurus and its compendium page.
Post-Knowledge
Participants will understand how to apply the main Machine Learning methods in R and be aware of the associated issues. They will also have worked through specific examples of parallelizing and benchmarking Machine Learning models in R on an HPC cluster.
Contact
Check out the other trainings by ScaDS.AI Dresden/Leipzig.