March 30, 2026
From March 24–29, 2026, ScaDS.AI Dresden/Leipzig joined the 19th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2026) in Rabat, Morocco. We are proud to have contributed three papers to the conference.
Luca Giordano presented the paper “Foundations of LLM Knowledge Materialization: Termination, Reproducibility, Robustness”. The paper results from research conducted at the Chair of Knowledge-Aware AI and is co-authored by Prof. Simon Razniewski.
See more information about the Chair of Knowledge-Aware AI here.
Large language models (LLMs) encode substantial factual knowledge, yet measuring and systematizing this knowledge remains challenging. Converting it into a structured format, for example through recursive extraction approaches such as the GPTKB methodology (Hu et al., 2025b), is still underexplored. Key open questions include whether such extraction can terminate, whether its outputs are reproducible, and how robust they are to variations.
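The recursive extraction idea can be illustrated with a short sketch: starting from a seed entity, query a model for triples and recurse on newly encountered entities. The dictionary below is a hypothetical stand-in for the LLM call (the actual GPTKB pipeline prompts a real model); the visited set and entity budget show two simple ways the crawl can be made to terminate.

```python
from collections import deque

# Hypothetical stand-in for an LLM: maps an entity to (subject, predicate,
# object) triples. In a GPTKB-style setting this would be a prompted model
# call; a fixed dictionary keeps the sketch runnable.
MOCK_LLM = {
    "Dresden": [("Dresden", "locatedIn", "Saxony"), ("Dresden", "hasRiver", "Elbe")],
    "Saxony": [("Saxony", "locatedIn", "Germany")],
    "Elbe": [("Elbe", "flowsInto", "North Sea")],
}

def materialize(seed, budget=100):
    """Breadth-first recursive extraction: query each newly seen entity once.

    The visited set and the entity budget each guarantee termination,
    one of the open questions the paper examines.
    """
    triples, visited, queue = [], set(), deque([seed])
    while queue and len(visited) < budget:
        entity = queue.popleft()
        if entity in visited:
            continue
        visited.add(entity)
        for s, p, o in MOCK_LLM.get(entity, []):
            triples.append((s, p, o))
            if o not in visited:
                queue.append(o)  # recurse on the newly seen object entity
    return triples

kb = materialize("Dresden")
```

With a real model in place of the dictionary, termination is no longer trivially guaranteed, which is precisely why the budget and deduplication matter.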
Luca Giordano and Prof. Simon Razniewski systematically studied large language model knowledge materialization using miniGPTKBs (domain-specific, tractable subcrawls), analyzing termination, reproducibility, and robustness across three categories of metrics: yield, lexical similarity, and semantic similarity. They experimented with four variations (seed, language, randomness, and model) in three illustrative domains (from history, entertainment, and finance). Their findings show that LLM knowledge materialization can reliably surface core knowledge, while also revealing important limitations.
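To make the metric categories concrete, here is a minimal sketch of how yield and a lexical-similarity score could be compared between two crawl runs. Jaccard overlap of exact triples is an illustrative choice of lexical measure, not necessarily the paper's exact metric, and the two runs are hypothetical toy data.

```python
def yield_count(kb):
    """Yield: the number of distinct triples a crawl produced."""
    return len(set(kb))

def jaccard(kb_a, kb_b):
    """Lexical similarity as exact-triple overlap (one illustrative choice):
    |A ∩ B| / |A ∪ B| over the two runs' triple sets."""
    a, b = set(kb_a), set(kb_b)
    return len(a & b) / len(a | b) if a | b else 1.0

# Two hypothetical runs of the same crawl under different random seeds.
run1 = [("Dresden", "locatedIn", "Saxony"), ("Dresden", "hasRiver", "Elbe")]
run2 = [("Dresden", "locatedIn", "Saxony"), ("Dresden", "foundedIn", "1206")]

print(yield_count(run1), jaccard(run1, run2))  # yields 2, similarity 1/3
```

Semantic similarity would go further, e.g. embedding triples so that paraphrased facts still count as matches, which lexical overlap alone misses.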
The paper “The Hidden Bias: A Study on Explicit and Implicit Political Stereotypes in Large Language Models” was presented by Konrad Löhr (TU Dresden). It was co-authored by Shuzhou Yuan and Prof. Michael Färber from the Chair of Scalable Software Architectures for Data Analytics at ScaDS.AI Dresden/Leipzig.
Find more information about the Chair of Scalable Software Architectures for Data Analytics here.
Large Language Models (LLMs) are increasingly integral to information dissemination and decision-making processes. Given their growing societal influence, understanding potential biases, particularly within the political domain, is crucial to prevent undue influence on public opinion and democratic processes. This work investigates political bias and stereotype propagation across eight prominent LLMs using the two-dimensional Political Compass Test (PCT). Initially, the PCT is employed to assess the inherent political leanings of these models. Subsequently, persona prompting with the PCT is used to explore explicit stereotypes across various social dimensions. In a final step, implicit stereotypes are uncovered by evaluating models with multilingual versions of the PCT. Key findings reveal a consistent left-leaning political alignment across all investigated models. Furthermore, while the nature and extent of stereotypes vary considerably between models, implicit stereotypes elicited through language variation are more pronounced than those identified via explicit persona prompting. Interestingly, for most models, implicit and explicit stereotypes show a notable alignment, suggesting a degree of transparency or “awareness” regarding their inherent biases. This study underscores the complex interplay of political bias and stereotypes in LLMs.
Our guest member Christopher Schröder presented the third paper at EACL 2026, “Reassessing Active Learning Adoption in Contemporary NLP: A Community Survey”, co-authored with Julia Romberg, Julius Gonsior, Katrin Tomanek, and Fredrik Olsson.
Supervised learning relies on data annotation, which is usually time-consuming and therefore expensive. A longstanding strategy to reduce annotation costs is active learning, an iterative process in which a human annotates only the data instances deemed informative by a model. Research in active learning has made considerable progress, especially with the rise of large language models (LLMs). However, we still know little about how these advances have translated into real-world applications, or contributed to removing key barriers to active learning adoption. To fill this gap, the authors conducted an online survey in the NLP community to collect previously intangible insights on current implementation practices, common obstacles in application, and future prospects of active learning. They also reassessed the perceived relevance of data annotation and active learning as fundamental assumptions. Their findings show that data annotation is expected to remain important and active learning to stay highly relevant while benefiting from LLMs. Consistent with a community survey from over 15 years ago, three key challenges still persist — setup complexity, uncertain cost reduction, and tooling — for which the authors propose alleviation strategies. An anonymized version of the dataset is published.
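The iterative loop described above can be sketched in a few lines. This toy example uses uncertainty sampling with a trivial 1D threshold model and a simulated annotator; all names and data here are illustrative, not taken from the paper or any particular active learning framework.

```python
def train(labeled):
    """Fit a trivial 1D model: threshold at the midpoint of the class means."""
    xs0 = [x for x, y in labeled if y == 0]
    xs1 = [x for x, y in labeled if y == 1]
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2

def active_learn(pool, oracle, seed_idx, rounds):
    """Uncertainty sampling: in each round, send the unlabeled point closest
    to the current decision threshold to the (human) oracle for a label."""
    labeled = [(pool[i], oracle(pool[i])) for i in seed_idx]
    unlabeled = [x for i, x in enumerate(pool) if i not in seed_idx]
    for _ in range(rounds):
        t = train(labeled)
        x = min(unlabeled, key=lambda v: abs(v - t))  # most uncertain point
        unlabeled.remove(x)
        labeled.append((x, oracle(x)))  # the expensive annotation step
    return train(labeled), labeled

oracle = lambda x: int(x >= 0.0)  # simulated human annotator
pool = [-3.0, -2.0, -1.5, -0.2, 0.3, 1.0, 2.5, 4.0]
threshold, labeled = active_learn(pool, oracle, seed_idx={0, 7}, rounds=3)
```

After three rounds only five of the eight points have been annotated, yet the learned threshold already sits near the true boundary at 0 — the cost-saving argument for active learning in miniature.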
EACL 2026 took place from March 24–29, 2026 at the Palais des Congrès Rabat Bouregreg in Rabat, Morocco. As a flagship conference in the field of computational linguistics, EACL welcomed international researchers working on computational approaches to natural language. Learn more about the conference on the official website. Proceedings of the conference are available here.