Online machine learning on big data streams covers algorithms that (1) are distributed and (2) learn from data streams with only limited capacity to store past data. The first requirement mostly concerns software architectures and efficient algorithms. The second also imposes nontrivial theoretical restrictions on modeling methods: in the data stream model, older data is no longer available for revising earlier, suboptimal modeling decisions as fresh data arrives.
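To make the second requirement concrete, here is a minimal single-machine sketch in Python of a learner that sees every example exactly once and stores none of them. It is an illustration under assumptions, not an algorithm taken from the talk: the class `OnlineLogisticRegression`, the generator `synthetic_stream`, and the helper `run_prequential` are hypothetical names, the data is synthetic, and the test-then-train ("prequential") evaluation loop is one common choice for streams.

```python
import math
import random


class OnlineLogisticRegression:
    """Binary classifier updated by one SGD step per example (hypothetical sketch)."""

    def __init__(self, n_features, learning_rate=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = learning_rate

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() numerically safe
        return 1.0 / (1.0 + math.exp(-z))

    def learn_one(self, x, y):
        # Gradient of the log loss for a single example; x is not stored.
        err = self.predict_proba(x) - y
        for i, xi in enumerate(x):
            self.w[i] -= self.lr * err * xi
        self.b -= self.lr * err


def synthetic_stream(n, seed=0):
    """Yield n labeled points one at a time (assumed toy data: y = 1 if x0 + x1 > 0)."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        yield x, (1 if x[0] + x[1] > 0 else 0)


def run_prequential(n=5000):
    """Test-then-train: predict on each example first, then learn from it."""
    model = OnlineLogisticRegression(n_features=2)
    correct = 0
    for x, y in synthetic_stream(n):
        pred = 1 if model.predict_proba(x) >= 0.5 else 0
        correct += int(pred == y)
        model.learn_one(x, y)  # after this call the example is discarded
    return model, correct / n


if __name__ == "__main__":
    model, accuracy = run_prequential()
    print("prequential accuracy:", round(accuracy, 3))
```

Memory use is constant in the length of the stream, which is the point of the sketch; the price is that an update made on an early example cannot be redone once later data arrives.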
In my presentation, I will give an overview of distributed software architectures and libraries, as well as machine learning algorithms and models for online learning, focusing on classification, regression, and recommendation, and I will show how they are implemented in various distributed data stream processing systems. I will describe recommendation by online machine learning in detail and show why online learning is natural and practical for recommenders. I will also survey further potential applications of online learning.
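One way to picture online learning in a recommender is matrix factorization trained by a single gradient step per incoming interaction. The Python sketch below is an assumption-laden illustration, not the method presented in the talk: the class `OnlineMF`, its parameters, and the explicit 0/1 feedback values are all hypothetical, and it runs on one machine rather than on a distributed stream processor.

```python
import random


class OnlineMF:
    """Matrix factorization updated one interaction at a time (hypothetical sketch)."""

    def __init__(self, n_factors=8, lr=0.05, reg=0.01, seed=0):
        self.k = n_factors
        self.lr = lr
        self.reg = reg
        self.rng = random.Random(seed)
        self.P = {}  # user id -> latent factor vector
        self.Q = {}  # item id -> latent factor vector

    def _vec(self, table, key):
        # New users and items get a small random vector the first time they appear.
        if key not in table:
            table[key] = [self.rng.gauss(0.0, 0.1) for _ in range(self.k)]
        return table[key]

    def score(self, user, item):
        p = self._vec(self.P, user)
        q = self._vec(self.Q, item)
        return sum(a * b for a, b in zip(p, q))

    def learn_one(self, user, item, rating=1.0):
        # One regularized SGD step on the squared error of this single event.
        p = self._vec(self.P, user)
        q = self._vec(self.Q, item)
        err = rating - sum(a * b for a, b in zip(p, q))
        for f in range(self.k):
            pf, qf = p[f], q[f]
            p[f] += self.lr * (err * qf - self.reg * pf)
            q[f] += self.lr * (err * pf - self.reg * qf)
        return err

    def recommend(self, user, n=3):
        # Rank all items seen so far by predicted score for this user.
        items = list(self.Q)
        return sorted(items, key=lambda i: self.score(user, i), reverse=True)[:n]


if __name__ == "__main__":
    model = OnlineMF()
    # Assumed toy stream: the user keeps liking item "a" and disliking item "b".
    for _ in range(500):
        model.learn_one("u1", "a", 1.0)
        model.learn_one("u1", "b", 0.0)
    print(model.recommend("u1", n=2))
```

Because the model changes after every event, a user's latest interaction can influence the very next recommendation, and previously unseen users or items are absorbed without retraining from scratch.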
For further information see: Dr. András Benczúr.