LLM Inference Acceleration Techniques: A Literature Review
Type of thesis: Bachelor’s thesis / Location: Dresden / Status of thesis: Open theses
Large language models (LLMs) based on Transformers are a rapidly evolving field. As the number of pretrained models grows, so does the need for efficient inference methods. However, inference still suffers from underutilized hardware, and token generation on a small number of GPUs remains too slow for large amounts of data or real-time applications. To overcome these issues, several inference acceleration methods and frameworks have emerged in recent years. Yet how their performance compares is still an open question. Therefore, the goal of the research project is to review and evaluate the literature on inference acceleration and to create a survey that is still missing in the LLM community.
The topic can be worked on as a Bachelor’s thesis or research project.
GPT-X, Natural Language Processing
ScaDS.AI Dresden/Leipzig (Center for Scalable Data Analytics and Artificial Intelligence) is a center for Data Science, Artificial Intelligence and Big Data with locations in Dresden and Leipzig.
Copyright 2024 © ScaDS.AI Dresden/Leipzig – All rights reserved.