by Peter Winkler & Norman Koch
What is Omniopt?
- A tool for hyperparameter optimization when you work with neural networks and Big Data.
- Omniopt is applicable to a broad class of problems (both classical simulations and neural networks).
- Omniopt is robust. It checks and installs all dependencies automatically and fixes many problems in the background
without the user even noticing that they have occurred.
- While Omniopt optimizes, no further intervention is required. You can follow the ongoing stdout (standard output) live in the console.
- Omniopt’s overhead is minimal and virtually imperceptible.
What can you use it for?
- Classical simulation methods as well as neural networks have a large number of hyperparameters that
significantly determine the accuracy, efficiency, and transferability of the method.
- In classical simulations, the hyperparameters are usually determined by adaptation to measured values.
- In neural networks, the hyperparameters determine the network architecture: number and type of layers,
number of neurons, activation functions, measures against overfitting etc.
- The most common methods to determine hyperparameters are intuitive testing, grid search or random search.
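To make the last point concrete, here is a minimal sketch of grid search and random search with the same evaluation budget. The `loss` function, the two hyperparameters (`learning_rate`, `n_layers`) and their ranges are hypothetical stand-ins for an expensive training or simulation run; they are not taken from Omniopt.

```python
import itertools
import math
import random


def loss(learning_rate, n_layers):
    """Hypothetical stand-in for an expensive run; minimal at (0.01, 3)."""
    return (math.log10(learning_rate) + 2.0) ** 2 + (n_layers - 3) ** 2


# Grid search: evaluate every combination of a few hand-picked values.
rates = [1e-4, 1e-3, 1e-2, 1e-1]
layers = [1, 2, 3, 4, 5]
grid_best = min((loss(lr, n), lr, n) for lr, n in itertools.product(rates, layers))

# Random search: same number of evaluations, but sampled at random
# (learning rate log-uniformly, layer count uniformly).
rng = random.Random(0)
budget = len(rates) * len(layers)
samples = [(10 ** rng.uniform(-4, -1), rng.randint(1, 5)) for _ in range(budget)]
random_best = min((loss(lr, n), lr, n) for lr, n in samples)

print("grid search   (loss, learning_rate, n_layers):", grid_best)
print("random search (loss, learning_rate, n_layers):", random_best)
```

Both approaches ignore the results of earlier evaluations when choosing the next point, which is the gap a model-based method such as TPE is meant to close.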
How does it work?
- The hyperparameters are determined with a parallelizable stochastic minimization algorithm (TPE, Tree-structured Parzen Estimator)
using the GPUs of the HPC system Taurus.
- The user has to provide their application, which can be either a neural network or a classical simulation,
as a black box, together with a target data set representing the optimal result.
- In an .ini file, the hyperparameters to be optimized (e.g. number of epochs, number of hidden layers, batch sizes, …)
are defined and their limits (minimum, maximum) are specified. Generate the config file!
- The number of hyperparameters is in principle arbitrary. In practice, up to ten parameters are currently recommended (further tests are required).
- In each optimization step, the Bayesian stochastic optimization algorithm TPE evaluates the objective function for
a set of parameter combinations drawn from distributions over the parameter space.
- The user must provide a version of their program that reads the values of the hyperparameters to be
optimized and outputs the value of the target function.
- For neural networks, either the loss or another (more descriptive) quantity (e.g. F1 measure) can be used as
the target function.
- The parallelization and distribution of the calculations on the Taurus GPUs is done automatically.
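The workflow above can be sketched in a few lines. This is a simplified, one-dimensional, serial illustration of the TPE idea and not Omniopt's actual implementation, which the text does not show. The black-box `objective`, the fixed kernel `bandwidth`, and the parameters `gamma`, `n_startup`, `n_steps` and `n_candidates` are all assumptions chosen for illustration. The sketch splits past evaluations into a "good" and a "bad" group, models each group with a Gaussian kernel density estimate, and evaluates next the candidate where the ratio of good to bad density is largest.

```python
import math
import random


def objective(x):
    """Hypothetical black box: takes one hyperparameter, returns the target value."""
    return (x - 2.0) ** 2


def kde(points, x, bandwidth):
    """Gaussian kernel density estimate built from `points`, evaluated at x."""
    norm = len(points) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points) / norm


def tpe_minimize(f, low, high, n_startup=10, n_steps=40, gamma=0.25,
                 n_candidates=24, seed=0):
    rng = random.Random(seed)
    bandwidth = (high - low) / 10.0  # fixed bandwidth: a simplifying assumption
    # Collect initial observations by random search within the parameter limits.
    history = [(x, f(x)) for x in (rng.uniform(low, high) for _ in range(n_startup))]
    for _ in range(n_steps):
        # Split observations: the best `gamma` fraction is "good", the rest "bad".
        history.sort(key=lambda pair: pair[1])
        n_good = max(1, int(gamma * len(history)))
        good = [x for x, _ in history[:n_good]]
        bad = [x for x, _ in history[n_good:]]
        # Draw candidates from the "good" density, clipped to the limits.
        candidates = [
            min(high, max(low, rng.gauss(rng.choice(good), bandwidth)))
            for _ in range(n_candidates)
        ]

        # Evaluate the candidate with the largest density ratio good/bad.
        def ratio(c):
            return kde(good, c, bandwidth) / (kde(bad, c, bandwidth) + 1e-12)

        x_next = max(candidates, key=ratio)
        history.append((x_next, f(x_next)))
    return min(history, key=lambda pair: pair[1])


best_x, best_loss = tpe_minimize(objective, low=-10.0, high=10.0)
print(f"best x = {best_x:.3f}, target value = {best_loss:.4f}")
```

In a parallel setting, several candidates per step would be evaluated at once on different GPUs; the sketch evaluates one at a time to keep the logic visible.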
How can you use it? Do you have more questions?
Bergstra, J., Yamins, D., Cox, D. D. (2013) Making a Science of Model
Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision
Architectures. Proc. of the 30th International Conference on Machine
Learning (ICML 2013), June 2013, pp. I-115 to I-123.