Title: A Global Database of Natural Hazards Impacts Reported in the Scientific Literature
Duration: 01.06.2023 – present
Research Area: Earth and Environmental Sciences
In this research project, we aim to build a comprehensive global database of the impacts of natural hazards reported in the scientific literature since the 1950s. To achieve this, we have initiated a systematic mapping of worldwide research on hydroclimatic extremes, including droughts, heatwaves, and floods. The corpus includes full-text open-access papers extracted from databases such as ScienceDirect and PubMed. Using this corpus, we will build a classification model to identify hazards and their reported impacts in scientific text. We will pre-train a transformer-based language model on the corpus and fine-tune it to (i) classify sentences and (ii) identify entities in the sentences that describe the impacts of natural hazards, such as the cause, date, and location of the hazard, its impacts, and the number of people affected.
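The two annotation steps described above (sentence classification, then entity identification) can be illustrated with a deliberately simple stand-in. The sketch below is not the project's model: it replaces the fine-tuned transformer with a keyword match and a regular expression, purely to show the kind of record each step might produce. The names `HAZARD_TERMS`, `ImpactRecord`, `find_hazards`, and `extract_record` are hypothetical and chosen for illustration only.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumption: a tiny keyword list standing in for the sentence classifier.
# The project itself plans a fine-tuned transformer, not keyword matching.
HAZARD_TERMS = ("drought", "heatwave", "flood")

# Assumption: a simple pattern standing in for entity identification.
# It only captures counts written like "12,000 people".
_PEOPLE = re.compile(r"(\d[\d,]*)\s+(?:people|persons|residents)", re.IGNORECASE)


@dataclass
class ImpactRecord:
    """One hypothetical database row derived from a single sentence."""
    sentence: str
    hazards: List[str]
    affected_people: Optional[int] = None


def find_hazards(sentence: str) -> List[str]:
    """Step (i) stand-in: return the hazard terms mentioned in the sentence."""
    lowered = sentence.lower()
    return [term for term in HAZARD_TERMS if term in lowered]


def extract_record(sentence: str) -> Optional[ImpactRecord]:
    """Step (ii) stand-in: build a record if the sentence mentions a hazard."""
    hazards = find_hazards(sentence)
    if not hazards:
        return None  # sentence is not about a hazard; skip it
    match = _PEOPLE.search(sentence)
    affected = int(match.group(1).replace(",", "")) if match else None
    return ImpactRecord(sentence=sentence, hazards=hazards, affected_people=affected)


if __name__ == "__main__":
    example = "The 2010 flood displaced 12,000 people in the valley."
    print(extract_record(example))
```

A learned model would replace both stand-ins and would also need to recover the other entities named above (cause, date, location), which this sketch does not attempt.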
Existing studies and databases of natural hazard impacts have several limitations, such as (1) a low level of detail on how people were affected; (2) an underestimation of the impacts; (3) a limited geographical range; and (4) a lack of information on the source of the data. By contrast, scientific publications, reports, and handbooks constitute a large data repository that can provide valuable and trustworthy information on natural hazards.