
OpenGPT-X

Title: OpenGPT-X

Project duration: November 2021 – October 2024

Research Area: Large Language Models, NLP, Neural Networks

Funded by: Federal Ministry for Economic Affairs and Climate Action (Bundesministerium für Wirtschaft und Klimaschutz) on the basis of a decision by the German Bundestag.

The aim of the OpenGPT-X project is to create advanced, Gaia-X-compatible smart services based on innovative language technologies. These services will enable data-driven business solutions in the Gaia-X ecosystem using large AI language models of the GPT-3 type [1]. The foundation is the training of large AI language models, which are currently developed primarily in the USA, on high-performance, scalable Gaia-X infrastructures and data spaces. The resulting practical applications are being trialled in the mobility, media and finance/insurance domains. Given the rapidly growing importance and disruptive potential of large AI language models, there is an urgent need to ensure technology and data sovereignty in Germany and Europe [2].

Gaia-X forms the basis for creating large AI language models: its Federated Services provide scalable computing resources as well as networked, cross-application data spaces. The extensive network of project partners ensures that the project results are put to use successfully and with a European focus.

Aims

  • Training GPT-like language models of different sizes with a focus on European languages

Outlook

  • Various 7-billion-parameter (7B) models, as well as 13B and 30B models

Publications

  • Ali, M., Fromm, M., Thellmann, K., Rutmann, R., Lübbering, M., Leveling, J., Klug, K., Ebert, J., Doll, N., Buschhoff, J. S., Jain, C., Weber, A. A., Jurkschat, L., Abdelwahab, H., John, C., Suarez, P. O., Ostendorff, M., Weinbach, S., Sifa, R., … Flores-Herr, N. (2023). Tokenizer Choice For LLM Training: Negligible or Crucial?

Team

Lead

  • Dr. Nicolas Flores-Herr, Fraunhofer IAIS (project)
  • Dr. René Jäkel (at ScaDS.AI Dresden/Leipzig and ZIH)

Team Members

  • Lalith Manjunath
  • Klaudia-Doris Thellmann
  • Lena Jurkschat

Partners

Funded by:
Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung).
Free State of Saxony (Freistaat Sachsen).