Track A: Large AI Models

The rise of large multimodal models like ChatGPT has significantly influenced both research and public perception of AI in recent months. Theme track A, “Large AI Models,” dives into the foundational technology and the latest breakthroughs of such large models. The track will feature labs covering essential and fundamental topics such as language modeling, multimodal models, and training with massive datasets. Additionally, it will explore more advanced themes, including LLM alignment and efficiency, along with application-specific topics like using LLMs for code generation. Attendees will gain comprehensive insights from leading experts through a combination of theoretical and practical sessions.


Monday, Sept 9, 2024

Opening Speech, 13:00-13:30

Pierre Alliez (President of the Inria Evaluation Commission)

Philipp Slusallek (Executive Director of DFKI Saarbrücken)

Keynote 1, 13:30-14:30

Prof. Dr. rer. nat. Dr. h.c. mult. Wolfgang Wahlster (DFKI)

Professor Wolfgang Wahlster is a pioneer of AI in Germany and Europe and a founding director of DFKI. He has served as an elected President of three international AI organizations: IJCAII, EurAI, and ACL. He is an elected Fellow of AAAI, EurAI, and GI. He laid some of the foundations for multimodal dialog systems, user modelling, speech-to-speech translation, and cyber-physical production systems for the fourth industrial revolution (Industrie 4.0), a concept that he coined in 2010. Wahlster is a member of the Royal Swedish Academy of Sciences in Stockholm, the German National Academy Leopoldina, and three other prestigious academies. For his research, he has been awarded the German Future Prize and the Grand Cross of Merit by the Federal President of Germany. (For more information, see https://www.wolfgang-wahlster.de/)

Industrial AI for Smart Manufacturing

In the next decade of Industry 4.0, a new generation of AI technologies will take smart factories to a new level. Large Language Models (LLMs) will be complemented by Large Process Models (LPMs) and Large Action Models (LAMs), so that generative AI models not only predict what to say or visualize next, but also what to do next, with explanations of why these actions make sense. Although deep learning is the most powerful machine learning method developed to date, it has already reached its inherent limits in many industrial application domains. It must be combined with various symbolic approaches in new system architectures. This leads to hybrid LxM (x = L, P, or A) technologies that use holonic multiagent architectures to combine neural approaches with symbolic reasoning technologies such as constraint solving, physics-based simulation, and terminological reasoning in knowledge graphs.


Course 1, 15:00-17:30

Christophe Cerisara (CNRS)

Christophe Cerisara is a French researcher at CNRS (National Centre for Scientific Research), specialized in machine learning models for natural language processing (NLP). He created the SYNALP research team, composed of about 20 NLP researchers, in 2012 and has led it since. He has also led the AI-NLP axis of the LORIA laboratory since 2019, and he was a referent for the French National Plan in AI in 2020. He has supervised more than 12 Ph.D. theses and has led several projects on AI and training Large Language Models in the past few years.

Introduction to Large Language Models

The first part of this course will give the basic principles of the transformer architecture and show how the decoder can be trained to build a Large Language Model (LLM), including a short overview of its scaling laws. The second part will present how to use such a trained LLM, either directly through zero-shot and in-context learning, or through fine-tuning to adapt the LLM to a given task, with a focus on direct usage of the LLM on either low-end or high-end hardware, and without going into the details of parameter-efficient fine-tuning and other advanced adaptation strategies. The third part (30′) will consist of a practical session on how to implement this with the Hugging Face transformers library. The prerequisites for this course are a good knowledge of Python and of the fundamentals of machine learning; some experience with PyTorch is useful.
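As a rough, self-contained illustration of the training objective described above (not the course materials): a language model estimates the probability of the next token given its context. The toy bigram model below replaces the transformer decoder with a simple count table; the corpus and the greedy decoder are invented for this sketch.

```python
from collections import Counter, defaultdict

# Toy illustration of the core LLM objective: estimate the probability
# of the next token given the previous context. Here the "model" is a
# bigram count table; real LLMs replace the table with a transformer
# decoder, but the objective is the same.
corpus = "the cat sat on the mat the dog sat on the rug".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_token_distribution(token):
    """Return P(next | token) as a dict, estimated from counts."""
    total = sum(counts[token].values())
    return {w: c / total for w, c in counts[token].items()}

def generate(start, length):
    """Greedy decoding: always pick the most likely next token."""
    out = [start]
    for _ in range(length):
        dist = next_token_distribution(out[-1])
        out.append(max(dist, key=dist.get))
    return " ".join(out)

print(next_token_distribution("sat"))  # {'on': 1.0}
print(generate("the", 4))              # "the cat sat on the"
```

Swapping the count table for a trained transformer, and the greedy step for sampling, gives the decoding loop used in practice.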


Tuesday, Sept 10, 2024

Keynote 2, 9:00-10:00

Karën Fort (Université de Lorraine)

Karën Fort is a Professor at Université de Lorraine and does her research at the LORIA laboratory in Nancy, in the Inria team Semagramme. Her primary research interest is ethics in natural language processing (NLP), of which she is a pioneer: she organized the first colloquium on the subject in France in 2014, followed by a national workshop (ETeRNAL 2015 and 2020) and a special issue of the TAL journal in 2016. She initiated the French blog on ethics and NLP (http://www.ethique-et-tal.org/) as well as the first survey on ethics in NLP (Fort & Couillault, 2016). She was co-chair of the first two ethics committees in the field (EMNLP 2020 and NAACL 2021) and is co-chair of the ethics committee of the Association for Computational Linguistics (ACL). Besides her work on stereotypical biases (Névéol et al., 2022), she is interested in deontological ethics using NLP4NLP techniques (Abdalla et al., 2023).

Ethics in Natural Language Processing: don't look up!

With the success of neural methods, NLP has undergone a major revolution in the past decade: we now have (lots of) real users. Although the field already had an impact on society 10 years ago, it was easier to ignore. Ethics has now become a central issue. In this talk, I will show what can go wrong when we develop a system if we are not careful enough. I'll also provide tools and methodologies to better evaluate the impact of our work. More importantly, I'll show that the issues we are facing are not limited to stereotypical biases and that we need to learn to question our work on a larger scale, as ethical thinking helps develop better systems.


Course 2, 10:30-13:00

Malte Ostendorff (Deutsche Telekom)

Dr. Malte Ostendorff is a senior research engineer at Deutsche Telekom where he works on large language models (LLMs) and related topics. Previously, Malte was a senior researcher at the German Research Center for Artificial Intelligence (DFKI) and a Ph.D. student in the Scientific Information Analytics group at the University of Göttingen. Furthermore, Malte is a co-founder of Occiglot, a research collective for open-source language models for and by Europe, and a co-founder of Open Legal Data.

Training Data for Large Language Models

Large language models (LLMs) have emerged as a powerful technology underpinning state-of-the-art chatbots and various other natural language processing applications. Model sizes and the computing resources used for building LLMs dominate the public discourse around these models, whereas one crucial aspect is often neglected: the LLM training data. LLMs are statistical models that learn from data; the training data is therefore crucial and one of the main differentiators between models. In this course, we will explore the datasets used by existing LLMs, automated tools for data curation and processing at Web scale, and the most prominent sources the data comes from. We will discuss why commercial LLM providers are secretive about their data and what issues arise from training models on large-scale datasets. A basic understanding of what LLMs are and how they are trained is a prerequisite for this course.
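Two of the curation steps discussed in this course, exact deduplication and heuristic quality filtering, can be sketched in miniature. Everything below (the documents, the threshold, the filter rule) is invented for illustration; real pipelines add fuzzy deduplication, language identification, and much more.

```python
import hashlib

# Miniature version of two standard web-data curation steps: exact
# deduplication via content hashing and a crude quality heuristic.
documents = [
    "The quick brown fox jumps over the lazy dog.",
    "The quick brown fox jumps over the lazy dog.",  # exact duplicate
    "click here click here click here click here",   # repetitive boilerplate
    "Large language models learn from web-scale text corpora.",
]

def is_low_quality(text, max_repeat_ratio=0.4):
    """Flag documents dominated by a single repeated word."""
    words = text.lower().split()
    if not words:
        return True
    most_common = max(words.count(w) for w in set(words))
    return most_common / len(words) > max_repeat_ratio

def curate(docs):
    """Keep each document once, dropping duplicates and low-quality text."""
    seen, kept = set(), []
    for doc in docs:
        digest = hashlib.sha256(doc.encode()).hexdigest()
        if digest in seen or is_low_quality(doc):
            continue
        seen.add(digest)
        kept.append(doc)
    return kept

print(curate(documents))  # keeps the first and last documents only
```

At Web scale the same ideas run distributed, with MinHash-style near-duplicate detection replacing the exact hash.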


Course 3, 15:30-18:00

Christophe Cerisara (CNRS)

Christophe Cerisara is a French researcher at CNRS (National Centre for Scientific Research), specialized in machine learning models for natural language processing (NLP). He created the SYNALP research team, composed of about 20 NLP researchers, in 2012 and has led it since. He has also led the AI-NLP axis of the LORIA laboratory since 2019, and he was a referent for the French National Plan in AI in 2020. He has supervised more than 12 Ph.D. theses and has led several projects on AI and training Large Language Models in the past few years.

Efficient LLM training

Beyond performance on tasks, another important characteristic of LLMs is their efficiency, i.e., their memory and computational costs during training, inference and deployment. In this course, I will first cover aspects related to efficient LLM training and then focus on low-data regimes.
The first part starts by discussing the memory and computational costs of LLMs, and covers concepts such as quantization, adapters, prefix tuning, qLoRA, and model compression. The second part will briefly introduce notions related to transfer learning and continual learning: overfitting, learning to hallucinate, catastrophic forgetting, and few-shot learning. The lab will focus on parameter-efficient methods.
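The quantization idea mentioned above can be shown with a minimal symmetric round-trip. This is an illustrative toy (the weight values are invented), not the block-wise 4-bit scheme used by qLoRA, but the principle of trading precision for memory is the same.

```python
# Minimal sketch of symmetric 8-bit weight quantization: store integers
# plus one scale factor instead of full-precision floats.
weights = [0.31, -1.20, 0.05, 0.88, -0.47]

def quantize(ws, bits=8):
    """Map floats to signed integers in [-(2**(bits-1)-1), 2**(bits-1)-1]."""
    qmax = 2 ** (bits - 1) - 1          # 127 for int8
    scale = max(abs(w) for w in ws) / qmax
    return [round(w / scale) for w in ws], scale

def dequantize(qs, scale):
    """Recover approximate floats from integers and the scale."""
    return [q * scale for q in qs]

q, scale = quantize(weights)
recovered = dequantize(q, scale)
max_err = max(abs(w - r) for w, r in zip(weights, recovered))
print(q)        # [33, -127, 5, 93, -50]
print(max_err)  # reconstruction error is bounded by scale / 2
```

Fewer bits shrink memory further but widen the error bound, which is why low-bit schemes quantize per block and keep outliers in higher precision.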
The prerequisites for this course are the first introductory course about LLM and notions of machine learning.
I’d like to thank Simon Ostermann for providing the initial materials for this course.


Wednesday, Sept 11, 2024

Keynote 3, 9:00-10:00

Kai Warsönke (VW)

Kai Warsönke is a graduate engineer (Diplom FH) in Production Engineering and a fourth-year PhD student at Volkswagen, focusing on data-driven product influence in vehicle projects. His research includes stochastic and statistical tolerance simulation models and preparing quality assurance data for use with AI methods. He creates and simulates measurement-data-coupled tolerance models to propose targeted action plans for improving vehicle quality. His innovative approach integrates advanced simulation techniques to optimize product development and ensure high-quality outcomes in the automotive industry.

Henrik Waschke (VW)

Henrik Waschke holds Bachelor's and Master's degrees in automotive engineering. Currently, he is a first-year PhD student at Volkswagen. His research focuses on enhancing quality in the automotive sector using 3D-AI technology. He works with AI systems to optimize customer-relevant quality features and to streamline and accelerate quality planning processes.

Increasing product quality in the automotive industry through Virtual Measurement Data Analysis (VMDA)

Virtual Measurement Data Analysis (VMDA) was developed to assess how component deviations in the production process affect the corresponding closure dimension across the entire tolerance chain. VMDA uses the latest measurement data to show and analyze, in real time, how production-related deviations affect the whole process. So far, VMDA has fed real-time measurement data into a tolerance analysis model that represents the whole vehicle. At present, VMDA runs on stationary computers. User feedback has indicated a high level of complexity and the need for extensive technical knowledge of how individual assemblies and quality-relevant areas interact. The next step is to simplify and refine VMDA functionality and transfer it to a portable device, which will provide the operator with specific instructions for correcting quality deviations. The talk will close with a discussion of the potential applications of artificial intelligence subfields in optimizing planning processes.


Course 4, 10:30-13:00

Mariya Toneva (MPI)

Mariya Toneva leads the Bridging AI and Neuroscience group (BrAIN) at the Max Planck Institute for Software Systems. Her research is at the intersection of Machine Learning, Natural Language Processing, and Neuroscience, with a focus on building computational models of language processing in the brain that can also improve natural language processing systems. She obtained her PhD from Carnegie Mellon University in a joint program between Machine Learning and Neural Computation.

Relating LLMs to human brains

Current large language models (LLMs, e.g. ChatGPT, GPT-4) have impressive capabilities, but how closely do they actually align with the capabilities of the only system that truly understands complex language: the human brain? In this session, we will learn the core computational techniques for relating language in machines to language in the brain. We will also have a hands-on session with brain recordings of people processing complex language (e.g. reading a book).

The prerequisites are good familiarity with programming in Python and basic machine learning concepts, such as regression and cross-validation.
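The regression-plus-cross-validation recipe named in the prerequisites is the backbone of the encoding-model technique used to relate language models to brain recordings. The sketch below is a hypothetical one-dimensional version with synthetic numbers; real studies fit high-dimensional ridge regressions from LLM features to fMRI or EEG signals.

```python
# Hypothetical minimal "encoding model": fit a regression from a
# model-derived feature to a (here synthetic) brain response, and score
# it with leave-one-out cross-validation.
features = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]    # e.g. an LLM activation
responses = [0.1, 1.9, 4.2, 5.8, 8.1, 10.2]  # synthetic voxel signal

def fit_linear(xs, ys):
    """Ordinary least squares for y = a*x + b in one dimension."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

def loo_cv_error(xs, ys):
    """Mean absolute error under leave-one-out cross-validation."""
    errs = []
    for i in range(len(xs)):
        train_x, train_y = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        a, b = fit_linear(train_x, train_y)
        errs.append(abs(a * xs[i] + b - ys[i]))
    return sum(errs) / len(errs)

print(loo_cv_error(features, responses))  # low: the feature predicts the response
```

A low held-out error is taken as evidence that the model feature carries information about the brain signal; shuffling the responses would destroy it.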


Course 5, 15:30-18:00

Jindong Gu (University of Oxford/Google DeepMind)

Dr. Jindong Gu is a senior research fellow at the University of Oxford. He also works part-time at Google DeepMind as a faculty researcher on the Gemini Safety team. Prior to that, he received his Ph.D. from the University of Munich. His research focus is building responsible AI systems. Specifically, he is interested in the interpretability, robustness, privacy, and safety of visual perception, foundation models, and robotic policy and planning, and in their fusion towards general intelligent systems.

Responsible Generative AI

In recent years, generative AI (GenAI), such as large language models and text-to-image models, has received significant attention across various domains. However, ensuring the responsible generation of content by these models is crucial for their real-world applicability. This raises an interesting question: what should responsible GenAI generate, and what should it not? This course will introduce the practical responsibility requirements of both textual and visual generative models, outlining five key considerations: generating truthful content, avoiding toxic content, refusing harmful instructions, leaking no training-data-related content, and ensuring that generated content is identifiable.


Thursday, Sept 12, 2024

Keynote 4, 9:00-10:00

Xavier Hinaut (Inria)

Xavier Hinaut has been a Research Scientist in Bio-inspired Machine Learning and Computational Neuroscience at Inria, Bordeaux, France, since 2016. He received an MSc and an Engineering degree from Compiègne University of Technology (UTC), France, in 2008, his PhD from Lyon University, France, in 2013, and an MSc in Cognitive Science & AI from EPHE, France, in 2019. He is a member (Vice Chair) of the IEEE CIS Task Force on Reservoir Computing. His work is at the frontier of neuroscience, machine learning, robotics, and linguistics: from the modeling of human sentence processing to the analysis of birdsongs and their neural correlates. He both uses reservoirs for machine learning (e.g. birdsong classification) and builds models with them (e.g. sensorimotor models of how birds learn to sing). He manages the "DeepPool" ANR project on human sentence modeling with deep reservoir architectures and the Inria Exploratory Action "BrainGPT" on Reservoir Transformers. He leads the development of ReservoirPy, the most up-to-date Python library for Reservoir Computing (https://github.com/reservoirpy/reservoirpy). He is also involved in public outreach, notably by organising hackathons from which fun reservoir projects have emerged (e.g. the ReMi project, a reservoir generating MIDI and sounds).

BrainGPT: Tailoring Transformers into Cognitive Language Models

Language involves several levels of abstraction, from small sound units like phonemes to contextual sentence-level understanding. Large Language Models (LLMs) have shown an impressive ability to predict human brain recordings. For instance, while a subject is listening to a book chapter from Harry Potter, LLMs can predict parts of brain imaging activity (recorded by functional Magnetic Resonance Imaging or Electroencephalography) at the phoneme or word level. These striking results are likely due to their hierarchical architectures and massive training data. Despite these feats, they differ significantly from how our brains work and provide little insight into the brain's language processing. We will see how simple Recurrent Neural Networks like Reservoir Computing can model language acquisition from limited and ambiguous contextual data better than LSTMs. Building on these results, in the BrainGPT project we explore various architectures inspired by both reservoirs and LLMs, combining random projections and attention mechanisms to build models that can be trained faster, on less data, and with greater biological insight.


Course 6, 10:30-13:00


Course 7, 15:30-18:00

Gerrit Großmann (DFKI)

Gerrit Großmann received his doctorate in Saarbrücken. His PhD topic was the behavior of stochastic processes on graphs and networks, including the spread of (online and offline) epidemics. He also worked in the interdisciplinary project NextAID, where he researched neuro-symbolic approaches for drug discovery, specifically using diffusion models and graph neural networks. Gerrit has been a researcher at DFKI in Saarbrücken and Kaiserslautern since 2023. His research interests there revolve around the question of how to integrate the distinct realm of discrete structures, such as graphs and networks, with the continuous nature of dynamic evolution, diffusion, and learning.

Language Models and Structured Knowledge in AI

Despite their groundbreaking impact, LLMs have their imperfections. This track examines the integration of LLMs with structured information like knowledge graphs. We investigate ways to improve the quality and reliability of LLMs and techniques for extracting structured data from them. By the end of the lab, you will have a first prototype of an implementation of an LLM combined with a knowledge graph. No specific experience working with LLMs is required, but some basic knowledge of deep learning is recommended.
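As a rough preview of the kind of prototype this lab targets, the toy sketch below grounds an LLM prompt in facts retrieved from a small knowledge graph. The graph, the entity-matching rule, and the prompt template are all invented for illustration; the actual lab materials may differ.

```python
# Toy knowledge-graph retrieval feeding an LLM prompt: facts are stored
# as (subject, predicate, object) triples, retrieved by entity match,
# and spliced into the prompt so the model's answer is grounded.
knowledge_graph = [
    ("DFKI", "located_in", "Saarbrücken"),
    ("DFKI", "founded_in", "1988"),
    ("Saarbrücken", "capital_of", "Saarland"),
]

def retrieve(entity):
    """Return all triples mentioning the entity as subject or object."""
    return [t for t in knowledge_graph if entity in (t[0], t[2])]

def build_prompt(question, entity):
    """Render retrieved triples as bullet-point facts above the question."""
    facts = "\n".join(f"- {s} {p.replace('_', ' ')} {o}"
                      for s, p, o in retrieve(entity))
    return (f"Answer using only these facts:\n{facts}\n\n"
            f"Question: {question}\nAnswer:")

prompt = build_prompt("Where is DFKI located?", "DFKI")
print(prompt)  # this prompt would then be sent to an LLM
```

The reverse direction covered in the lab, extracting structured triples from LLM output, typically constrains generation to a fixed schema and validates the result against the graph.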

Islam Mesabah (DFKI)

Islam Mesabah obtained his Master’s degree in Computer Science from the RPTU (Rhineland-Palatinate Technical University) in Kaiserslautern. His master’s thesis focused on the application of Large Language Models (LLMs) for effective code generation through the utilization of API documentation. Additionally, he researched text-style transfer evaluation using LLMs. Since 2023, Islam Mesabah has been serving as a researcher at the German Research Center for Artificial Intelligence (DFKI) in Kaiserslautern. His research at DFKI primarily explores the applications of Large Language Models and key information extraction and structuring from image documents. In addition to his research endeavors, Islam holds the position of teaching assistant for the “Engineering with Generative AI” course at RPTU Kaiserslautern.



Friday, Sept 13, 2024

Course 8, 9:00-11:30

Alexandre Défossez (Kyutai)

Alexandre is part of the founding research team at Kyutai, a leading non-profit research lab in Paris. Before that, he was a research scientist for three years at Meta AI Research, where he led, in particular, the development of the AudioCraft framework (EnCodec, AudioGen, MusicGen). Alexandre completed his PhD at Facebook AI Research and Inria Paris, working in particular on music source separation (Demucs).

Auto-regressive modeling of discrete audio tokens

In this course, we will learn the theory of discrete audio modeling, covering the different components and techniques used (neural audio codecs, multi-stream transformers, etc.) as well as the specificities of the audio domain. Then, we will apply these techniques to fine-tuning pre-trained audio models on new audio datasets. Attendees should have previous experience with PyTorch for the practical part of the class, along with a Google account set up to use Google Colab. Previous experience working with audio will help but is not required!
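As a rough preview of the "discrete audio tokens" in the title (not the course's actual codec): a neural codec such as EnCodec learns its token vocabulary, but the classic mu-law companding below shows in miniature how a continuous waveform becomes a sequence of discrete ids that an auto-regressive model can then predict. The waveform and level count are chosen for illustration.

```python
import math

# Turn a continuous waveform into discrete tokens via mu-law companding
# to 256 levels; neural codecs learn this mapping instead.
MU = 255

def mu_law_token(x):
    """Map a sample in [-1, 1] to a token id in [0, 255]."""
    compressed = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int((compressed + 1) / 2 * MU)  # quantize to 256 bins

# Tokenize one cycle of a sine wave sampled at 100 points.
wave = [math.sin(2 * math.pi * t / 100) for t in range(100)]
tokens = [mu_law_token(x) for x in wave]

print(len(tokens), "tokens from", len(wave), "samples")
print(tokens[:5])  # small integer ids, ready for next-token prediction
```

Once audio is a token sequence like this, the same auto-regressive machinery used for text applies: train a transformer to predict token t+1 from tokens 1..t.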
