July 3, 2026

Shuzhou Yuan Successfully Defends His Doctoral Thesis

Research

Large language models have taken the world by storm, but beneath their impressive fluency lies a surprising blind spot. They process language as flat sequences of tokens, with no native understanding of the relational structures that make meaning rich and reliable. Shuzhou Yuan spent four years asking a deceptively simple question: what if we gave language models the ability to reason with graphs? On June 19, 2026, he successfully defended his doctoral thesis, “Learning Language Models on Graphs.” Congratulations, Dr. Yuan!

Teaching Language Models to Think in Graphs

Modern language models, from BERT to GPT-4 to Llama, are built on the Transformer architecture, which processes text as ordered sequences of tokens. This works remarkably well for many tasks. But natural language is not just a sequence. It contains syntactic dependencies, discourse relations, causal links, and multi-hop reasoning chains that are fundamentally graph-like in character. When a model is forced to approximate these structures implicitly through attention weights alone, the result can be hallucinations, inefficiency, and explanations that do not faithfully reflect how the model arrived at its answer.

Graph Neural Networks (GNNs), by contrast, are purpose-built to model relational structure through nodes and edges. They are widely used in molecular biology, knowledge graph reasoning, and social network analysis, but their integration into natural language processing remained limited. Shuzhou Yuan’s dissertation addresses this gap directly. It proposes a unified framework in which graph-based structural information systematically improves language models across three fundamental dimensions:

architecture,
efficiency, and
interpretability.

Three Contributions, One Unifying Idea

Architecture: GraSAME

When a language model is asked to generate text from a knowledge graph, it must translate relational structure into coherent prose. Existing models often hallucinate facts or misrepresent relations when performing this task. GraSAME (published at NAACL 2024) addresses this by injecting a lightweight GNN layer directly into the self-attention mechanism of a pretrained language model. This enables joint reasoning over both graph and textual inputs within a single architecture. The original model parameters remain frozen; only the GNN layer is trained. This yields substantial improvements in text quality, confirmed by both automatic metrics and human evaluators.
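To make the idea of passing information along graph edges concrete, here is a minimal sketch in plain Python. It is not GraSAME's actual layer: the mean aggregation, the residual update, and the names `mean_aggregate`, `graph_residual`, and `gate` are assumptions chosen for illustration. In a real setting the hidden states would come from a frozen language model and only the graph component would be trained.

```python
from typing import Dict, List

Vector = List[float]


def mean_aggregate(hidden: List[Vector], edges: Dict[int, List[int]]) -> List[Vector]:
    """For each node, average the hidden vectors of its graph neighbors."""
    dim = len(hidden[0])
    messages = []
    for node in range(len(hidden)):
        neighbors = edges.get(node, [])
        if not neighbors:
            # Isolated node: it receives no message.
            messages.append([0.0] * dim)
            continue
        messages.append(
            [sum(hidden[j][d] for j in neighbors) / len(neighbors) for d in range(dim)]
        )
    return messages


def graph_residual(hidden: List[Vector], edges: Dict[int, List[int]], gate: float) -> List[Vector]:
    """Add gated neighbor messages to each hidden state (hypothetical update rule)."""
    messages = mean_aggregate(hidden, edges)
    return [
        [h + gate * m for h, m in zip(h_vec, m_vec)]
        for h_vec, m_vec in zip(hidden, messages)
    ]


# Toy hidden states for three token nodes (stand-ins for frozen model outputs).
hidden = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
# Adjacency list: node 0 listens to nodes 1 and 2, node 1 listens to node 0.
edges = {0: [1, 2], 1: [0], 2: []}

updated = graph_residual(hidden, edges, gate=0.5)
print(updated)
```

In this toy setup, `gate` plays the role of the only trainable quantity, while `hidden` stays fixed, mirroring the frozen-backbone setting described above in a very reduced form.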

Efficiency: GNNavi and slimLM

GNNavi (ACL 2024) reframes in-context learning through a graph lens. It represents the information flow between prompt demonstrations and the target token as a graph structure, and inserts a GNN layer to navigate that flow during few-shot training. This achieves competitive performance while training as few as 2.6 million parameters, compared to 1.6 billion for full fine-tuning. A complementary line of work, slimLM, treats the language model itself as a graph, with each Transformer layer as a node. It shows that pruning redundant layers causes surprisingly little degradation. In some cases, a single-layer model outperforms its full-depth counterpart, enabling up to 93% parameter reduction.
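The parameter figures above are easier to appreciate as ratios. The following short calculation uses only the two numbers quoted in the text (2.6 million and 1.6 billion); the helper names and the 1-billion-parameter model used for the 93% example are hypothetical and serve only to illustrate the arithmetic.

```python
def trainable_fraction(trainable_params: float, full_params: float) -> float:
    """Share of parameters that are updated, relative to full fine-tuning."""
    if full_params <= 0:
        raise ValueError("full_params must be positive")
    return trainable_params / full_params


def params_after_reduction(total_params: float, reduction: float) -> float:
    """Parameters left after removing the given fraction (0.93 means 93%)."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return total_params * (1.0 - reduction)


# Figures quoted in the text: 2.6 million trained vs. 1.6 billion for full fine-tuning.
gnnavi_fraction = trainable_fraction(2.6e6, 1.6e9)
print(f"trainable share: {gnnavi_fraction:.2%}")        # about 0.16%
print(f"reduction factor: {1 / gnnavi_fraction:.0f}x")  # about 615x

# Hypothetical model size, used only to illustrate a 93% parameter reduction.
HYPOTHETICAL_TOTAL = 1.0e9
remaining = params_after_reduction(HYPOTHETICAL_TOTAL, 0.93)
print(f"remaining parameters: {remaining:.2e}")
```

In other words, the quoted GNNavi setting updates well under one percent of the parameters that full fine-tuning would touch.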

Interpretability: G-TEx

Language models can generate natural language explanations of their predictions, but these explanations are frequently unfaithful to the model’s actual internal reasoning. G-TEx (EMNLP 2025) closes this gap by extracting the input tokens and token interactions most critical to a model’s prediction, encoding them as graph structures, and reinjecting them via a GNN layer during fine-tuning. The resulting explanations reflect the model’s decision-making more accurately, with consistent gains in faithfulness across both encoder-decoder and decoder-only architectures.
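As a rough illustration of the step of encoding critical tokens and token interactions as a graph, the sketch below selects the highest-scoring tokens and links them where an interaction score passes a threshold. This is not the G-TEx procedure: the scores, the top-k rule, the threshold, and the function name `build_highlight_graph` are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def build_highlight_graph(
    token_scores: List[float],
    pair_scores: Dict[Tuple[int, int], float],
    k: int,
    threshold: float,
) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Keep the k highest-scoring tokens as nodes and strong interactions as edges."""
    # Rank token positions by score, breaking ties in favor of earlier positions.
    ranked = sorted(range(len(token_scores)), key=lambda i: (-token_scores[i], i))
    nodes = sorted(ranked[:k])
    kept = set(nodes)
    # An edge survives only if both endpoints were kept and its score is high enough.
    edges = sorted(
        (i, j)
        for (i, j), score in pair_scores.items()
        if i in kept and j in kept and score >= threshold
    )
    return nodes, edges


tokens = ["the", "movie", "was", "not", "good"]
# Made-up importance scores, one per token.
token_scores = [0.05, 0.40, 0.05, 0.90, 0.80]
# Made-up interaction scores between token positions.
pair_scores = {(3, 4): 0.95, (1, 4): 0.30, (0, 1): 0.60}

nodes, edges = build_highlight_graph(token_scores, pair_scores, k=3, threshold=0.5)
print([tokens[i] for i in nodes])
print([(tokens[i], tokens[j]) for i, j in edges])
```

A graph of this kind could then serve as input to a GNN layer; how the importance and interaction scores are obtained in practice is outside the scope of this sketch.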

Impact and Outlook

Across these contributions, empirical results on graph-to-text generation, text classification, and reasoning tasks demonstrate consistent improvements over strong baselines. Graph-based integration strengthens model reasoning, reduces computational overhead, and produces more faithful explanations. Taken together, these results establish graph-guided language modeling as a general framework. The dissertation also provides the first systematic evidence that integrating structured relational information into language model design can unlock capabilities that are difficult to achieve with text alone.

The work opens several promising directions for future research. In scientific discovery, molecular graphs and protein interaction networks offer rich relational signals that graph-enhanced language models could exploit for drug discovery and hypothesis generation. Yuan has already begun exploring this direction in follow-up work. In multilingual and multimodal settings, graph structures can serve as a cross-modal bridge between language, vision, and other sensory signals. And in interpretability, the G-TEx framework points toward models that dynamically construct their own reasoning graphs, making AI systems more transparent and accountable.

Dr. Shuzhou Yuan

Shuzhou Yuan began his doctoral studies under Prof. Michael Färber, first at the Karlsruhe Institute of Technology (KIT) and then at TU Dresden, where Prof. Färber holds an AI Professorship at ScaDS.AI Dresden/Leipzig. Over four years of research spanning Karlsruhe, Dresden, Munich, and Copenhagen, Shuzhou Yuan built a cohesive body of work demonstrating how graph neural networks can make language models more capable, more efficient, and more interpretable. Prof. Michael Färber supervised the doctoral thesis. The doctoral committee included Prof. York Sure-Vetter, Prof. Gerard de Melo, Prof. Simon Razniewski, and Prof. Diana Göhringer. We wish Dr. Shuzhou Yuan all the best for the next chapter of his academic journey!

