Because Machine Learning applications are so heterogeneous, there are many reasons to move to an HPC system, e.g. large memory requirements, GPU usage, or faster computation. The course shows how a typical Machine Learning workflow can be realized in an HPC environment. Depending on the requirements, the switch to the HPC system can happen at different points in the workflow. Machine Learning applications are often developed collaboratively within groups, and the workflow implementation presented in the course takes this into account.
Title: Machine Learning on HPC – Introduction
Next Session: 27.09.2022, 10 a.m. – 3 p.m. (Speakers: Dr. Iryna Okhrin, Dr. Peter Winkler, Wenyu Zhang)
Target Group: HPC Basics / HPC User
Content:
- Access to the HPC system (e.g. ssh, JupyterHub)
- Data transfer and storage of training data, models, source code etc. (e.g. scp, dtcp, user space, workspaces)
- Setup of the required software environment (e.g. using the module system, virtual environments, containers)
- Execution/testing/debugging of applications (e.g. batch jobs, interactive jobs)
- Evaluation and storage of results
- Simple monitoring to optimize applications (Pika)
The course material (slides, sample application) will be made available.
Participants should have basic knowledge of Python as well as experience using TensorFlow, PyTorch, or R.
Participants will learn how to implement Machine Learning workflows through specific examples that take individual requirements into account.
Check out the other trainings by ScaDS.AI Dresden/Leipzig.