Understanding Language

Language is often viewed as the pinnacle of (human) intelligence. How seamlessly machines can be integrated into society depends on their understanding and mastery of language. Our research thus covers domain-specific large-scale language modeling, text manipulation algorithms, argumentation, and causal language. We study these topics in the context of conversational AI, connecting knowledge extraction and knowledge graphs with goal-driven dialogs, as well as in the context of mining the scientific literature.

Modeling and manipulating language

Our research in natural language processing and information retrieval focuses on algorithms and models. The overarching challenge is advancing language understanding and manipulation.

Key tasks

Artificial intelligence technologies, together with increasingly large language resources from web archives, fuel the generalization capabilities of these models.


  1. Building domain-specific language models for writing assistance and problem solving, focusing on latent variables in language models.
  2. Paraphrasing at paragraph level; we expect to gain insights from summarizing long texts.
  3. Summarization research in new domains, such as social media.
  4. Constrained paraphrasing and summarization; constraints include language simplicity, writing style, and domain-specific requirements.
  5. Integrating computational argumentation and conversational technologies.
  6. Causal knowledge acquisition from text for advanced AI reasoning.
  7. Bias analytics in all of the above; focus on minority protection.


(1) corpus selection, (2) model selection, and (3) quality aspect assessment.

Conversational AI and knowledge extraction

Our research on conversational AI brings together knowledge graphs, natural language understanding, and deep learning. Goals include:

  1. Fast domain adaptation techniques in goal-driven dialogs for context-sensitive, coherent, and correct responses.
  2. Code synthesis for data analytics, with a focus on foundational paradigm shifts (e.g., transformer-based encoding of tree structures).
  3. Conversational search for exploring research, e.g., recent COVID-19-related research, building on our expertise in question answering.
  4. Explainability approaches based on graph representations.

Mining the scientific literature

Motivated by several successful examples, we will pursue biomedical text mining.

This includes tailored information extraction and language modeling. We will also present facts and results in an argumentative framework for support and explanation purposes.

Emati: A recommender system for biomedical literature based on supervised learning.

Jun.-Prof. Dr. Martin Potthast

Leading Principal Investigator

Universität Leipzig


Find out more about our research in the field of AI Algorithms and Methods.