SeisBench – Lessons learned from building a library for Machine Learning in seismology
As in many other disciplines, Machine Learning (ML) has driven numerous breakthroughs in seismology in recent years. The ability of these techniques to efficiently infer the statistical properties of large datasets often provides significant improvements over traditional techniques when the number of examples is large. ML approaches are now used across the entire spectrum of seismological tasks, e.g., seismic picking and detection, magnitude and source property estimation, ground motion prediction, and hypocentre determination, and numerous models are emerging as these techniques are further adopted within seismology.
However, a lack of standardisation in models and datasets leads to several issues: model performance is hard to compare, models are difficult for practitioners to apply, and model generalisation is hard to assess. To address these issues, we built SeisBench, a toolbox for Machine Learning in seismology. SeisBench is an extensible, open-source Python package that incorporates benchmark datasets, pre-implemented models, and training pipelines. It is aimed both at Deep Learning researchers developing new models and at seismological practitioners seeking to apply Deep Learning in their analysis. In this way, SeisBench aims to bridge the gap between model development and application.
While SeisBench is targeted specifically at seismological applications, the underlying considerations and design patterns are common to many fields that have recently benefited from applied Machine Learning. Therefore, I will also discuss our design choices as well as our experiences and takeaways from building the SeisBench framework.
Jannes Münchmeyer is a mathematician and computer scientist by training. He is currently pursuing his PhD in Seismology at GFZ Potsdam and Humboldt University Berlin. In his research, he seeks to understand the generation of large earthquakes and addresses the question of rupture predictability. Previously, and in side projects, he has worked on (biomedical) text mining, transcription factor activity, and Machine Learning for evaluating mass extinction events in Earth’s history.