At the 10th International Summer School on AI and Big Data, Prof. Ingo Steinwart will give a presentation in the field of Mathematical Foundations of AI.
A central, initial task in data science is cluster analysis, where the goal is to find clusters in unlabeled data. One widely accepted definition of clusters has its roots in a paper by Carmichael et al., where clusters are described as densely populated areas of the input space that are separated by less populated areas. The mathematical translation of this idea usually assumes that the data is generated by some unknown probability measure that has a density with respect to the Lebesgue measure. Given a threshold level, the clusters are then defined to be the connected components of the density level set. However, choosing this threshold, along with any width parameters of the density estimator, is left to the user and is a notoriously difficult problem that is typically addressed only by heuristics.
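In symbols, this definition can be sketched as follows. The names P, h, ρ and M_ρ are notation chosen here for illustration and are not taken from the talk itself.

```latex
% Assumed notation: P is the unknown data-generating probability measure,
% h is its density with respect to the Lebesgue measure, \rho is the threshold level.
P(A) = \int_A h(x)\,\mathrm{d}x ,
\qquad
M_\rho = \{\, x : h(x) \ge \rho \,\} ,
\qquad \rho \ge 0 .
% The clusters at level \rho are the connected components of the level set M_\rho.
```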
In the first part of this talk, I show how a simple algorithm based on a density estimator can find the smallest level for which there is more than one connected component in the level set. Unlike other clustering algorithms, this approach is fully adaptive in the sense that it does not require the user to guess crucial hyper-parameters. In the second part of the talk, I will discuss practical aspects of the algorithm, including an efficient implementation. Finally, I present some numerical illustrations.
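To make the idea of scanning density levels concrete, here is a minimal Python sketch for one-dimensional data. It is not the algorithm presented in the talk: the function names, the Gaussian kernel, and the hand-picked `bandwidth` and `gap` values are all assumptions made for illustration, and fixing those two values by hand is exactly the kind of guesswork the adaptive approach is meant to avoid.

```python
import math


def kde(points, x, bandwidth):
    """Gaussian kernel density estimate at x for 1-D data."""
    norm = len(points) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points) / norm


def count_components(sorted_xs, gap):
    """Count groups in sorted 1-D points; a distance larger than `gap`
    between consecutive points starts a new group."""
    if not sorted_xs:
        return 0
    return 1 + sum(1 for a, b in zip(sorted_xs, sorted_xs[1:]) if b - a > gap)


def smallest_splitting_level(points, bandwidth, gap, n_levels=50):
    """Scan levels from low to high and return the first scanned level at which
    the sample points with estimated density >= level form more than one group.
    Returns None if no scanned level splits the data."""
    dens = [kde(points, p, bandwidth) for p in points]
    top = max(dens)
    for k in range(1, n_levels + 1):
        level = top * k / n_levels
        kept = sorted(p for p, d in zip(points, dens) if d >= level)
        if count_components(kept, gap) > 1:
            return level
    return None


# Hypothetical toy data: two dense groups joined by a sparse bridge of points.
left = [i * 0.1 for i in range(-10, 11)]
bridge = [1.5, 2.0, 2.5, 3.0, 3.5]
right = [5 + i * 0.1 for i in range(-10, 11)]
print(smallest_splitting_level(left + bridge + right, bandwidth=0.3, gap=0.6))
```

In this toy data the sparse bridge keeps everything connected at low levels. Once the level rises above the estimated density of the bridge points, they drop out and the two dense groups separate.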
Prof. Ingo Steinwart is a mathematician and university professor with a focus on the fields of support vector machines, applied stochastics and entropy numbers. In 2000, he completed his doctorate in mathematics on the subject of “Entropy of C(K)-valued operators and some applications” at the Friedrich Schiller University in Jena. He then continued his research at various institutions, most recently as a research assistant at the Computer, Computational & Statistical Sciences Division (CCS-3) of Los Alamos National Laboratory in the United States and at the Department of Computer Science of the Jack Baskin School of Engineering at the University of California, Santa Cruz. Since April 2010, Ingo Steinwart has held the Chair of Stochastics at the University of Stuttgart, based at the Institute of Stochastics and Applications in the Department of Mathematics.
Find more information about Prof. Ingo Steinwart and his work on his website.