Status: open / Type of Thesis: Master's thesis / Location: Dresden
Interpretability methods for machine learning (ML) models, such as Variable Importance (VI), Partial Dependence Plots (PDP), and SHAP values, play a crucial role in explaining model predictions. However, most of these methods are correlational in nature: they capture associations between input features and predictions, but they do not provide information about causal relationships. That is, they indicate how variables move together, not whether changing a variable would actually cause a change in the outcome.
A classic example illustrating this limitation is the positive correlation between the number of firefighters dispatched and the amount of fire damage. While more firefighters are often present at larger fires, they do not cause the increased damage; the true cause is the fire’s size. This highlights the importance of causal reasoning in interpreting model behavior.
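As a concrete illustration, the following is a minimal simulation sketch of this scenario. The variable names, distributions, and the use of scikit-learn's permutation importance as a stand-in for VI are illustrative assumptions, not part of the thesis topic: a model trained on observational data assigns high importance to the number of firefighters, even though dispatching more firefighters does not cause more damage.

```python
# Sketch (hypothetical simulation): fire_size causes both the number of
# firefighters dispatched and the damage, so firefighters and damage are
# strongly correlated without any causal link between them.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 5_000
fire_size = rng.gamma(shape=2.0, scale=1.0, size=n)          # hidden common cause
firefighters = 3 * fire_size + rng.normal(0, 0.5, size=n)    # dispatched in proportion to fire size
damage = 10 * fire_size + rng.normal(0, 1.0, size=n)         # damage depends only on fire size

# A model trained only on the observable proxy "firefighters".
X = firefighters.reshape(-1, 1)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, damage)

# Permutation importance flags "firefighters" as highly predictive of damage,
# even though intervening on it would not change the damage at all.
result = permutation_importance(model, X, damage, n_repeats=10, random_state=0)
print("importance of 'firefighters':", result.importances_mean[0])
```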
Causal analysis is especially critical in high-stakes domains such as healthcare, economics, and public policy, where understanding the effect of interventions (e.g., changing treatment or policy) is essential for trustworthy and actionable AI systems. Incorporating causality into interpretability methods can bridge the gap between black-box ML models and decision-making based on real-world cause-and-effect relationships.
This thesis aims to integrate traditional ML interpretability techniques with causal inference frameworks, specifically Structural Causal Models (SCMs), counterfactual analysis, and do-calculus.
The goal is to extend or adapt interpretability methods like VI, PDP, and SHAP to their causal counterparts and compare how well they reflect the true causal effects of features on predictions.
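To indicate the kind of comparison this involves, here is a minimal sketch under a toy linear SCM with a hidden confounder; the structural equations, coefficients, and the choice of scikit-learn's GradientBoostingRegressor are illustrative assumptions, not a prescribed part of the thesis. It contrasts an observational PDP for a feature X with its interventional counterpart E[Y | do(X = x)], obtained by cutting the confounder's edge into X and setting X by hand.

```python
# Sketch (toy SCM chosen for illustration): observational PDP vs. do-intervention.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
n = 10_000

# Ground-truth structural causal model:
#   Z ~ N(0, 1)                      hidden confounder
#   X := Z + noise                   X is driven by Z
#   W ~ N(0, 1)                      extra observed covariate, unrelated to Y
#   Y := 0.5 * X + 2 * Z + noise     true causal effect of X on Y is 0.5
Z = rng.normal(0, 1, n)
X = Z + rng.normal(0, 0.5, n)
W = rng.normal(0, 1, n)
Y = 0.5 * X + 2 * Z + rng.normal(0, 0.5, n)

# The fitted model observes X and W, but not the confounder Z.
model = GradientBoostingRegressor(random_state=1).fit(np.column_stack([X, W]), Y)

grid = np.linspace(-1.5, 1.5, 7)

# Observational PDP for X: fix X = x and average predictions over the empirical
# distribution of the remaining observed features. The slope comes out near 2,
# because the omitted confounder Z inflates the association.
pdp = [model.predict(np.column_stack([np.full(n, x), W])).mean() for x in grid]

# Interventional counterpart E[Y | do(X = x)]: set X = x in the SCM and average
# Y over its remaining randomness. The slope is 0.5 by construction.
do_curve = [np.mean(0.5 * x + 2 * Z + rng.normal(0, 0.5, n)) for x in grid]

print("grid:        ", np.round(grid, 2))
print("PDP(x):      ", np.round(pdp, 2))
print("E[Y|do(X=x)]:", np.round(do_curve, 2))
```

Because the confounder Z is not available to the model, the observational PDP exaggerates the effect of X, while the interventional curve recovers the structural coefficient; a causal variant of PDP based on a valid adjustment set would be expected to close this gap, and analogous comparisons can be made for VI and SHAP.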