The participants' poster session will take place on Tuesday from 14:00 to 15:00 in the foyer of the Innovation Centre.
In a relaxed atmosphere, the young researchers can present their work to one another and discuss it on a professional level.
Because the poster session is open and public, it also gives external visitors the chance to engage with the work.
So grab a coffee and exchange ideas!
Aymene Bouayed
Federated image generation on heterogeneous data
In our approach, we propose to combine autoencoders with Schrödinger bridges to improve image generation speed and quality in a heterogeneous data setting. In this pipeline, a vector is first sampled from the latent space of a VAE and then decoded into a lifelike image. The quality of this image is then improved by a Schrödinger bridge model.
To this end, we put forward the probability mass function sampling method, which refines sampling from the latent space of autoencoders and improves upon state-of-the-art Gaussian mixture model sampling. We then introduce TuckerCAM, a state-of-the-art explainability method for convolutional autoencoders that adds a layer of trust to our proposed pipeline. Finally, we are working on the heterogeneous data aspect, specifically dataset imbalance, to improve image generation on tail classes. The final pipeline will be able to produce high-quality images in a federated, heterogeneous data setting.
Quentin Lemesle
ParaPLUIE: an automatic measure for evaluating the semantic quality of paraphrase systems
Evaluating automatic paraphrase production systems is a difficult task because it involves, among other things, assessing the semantic proximity between two sentences. Common measures are based on lexical distances, or at least on semantic embedding alignments. In this article, we study some of these measures on datasets of paraphrases and non-paraphrases known for their quality or difficulty on this task. We propose a new measure, ParaPLUIE, based on a large language model. According to our experiments, ParaPLUIE is better at sorting pairs of sentences by semantic proximity.
Hassan Hussein
ORKGEx: Leveraging Language and Vision Models with Knowledge Graphs for Research Contribution Annotation
A major challenge in scholarly information retrieval is the semantic description of research contributions. While Generative AI can assist, minimally invasive user engagement is needed. We introduce an innovative browser-based approach to annotating research articles, integrating with the Open Research Knowledge Graph (ORKG). This method combines human intelligence with advanced neural and symbolic AI techniques to extract semantic research contribution descriptions, seamlessly integrating an AI-driven annotation tool within a web browser. This facilitates user interaction and improves the creation and curation of scholarly knowledge.
Tomáš Horeličan
Bridging natural language reasoning with navigation algorithms for intelligent robot agents
Thanks to rapid developments in generative AI architectures, deploying intelligent autonomous robot agents is becoming more easily achievable. Although privately owned solutions provide API access to their state-of-the-art models, notable potential still lies in open-source and locally hosted solutions. This contribution will explore the integration of openly available ML frameworks, such as GGML, and quantized generative models into an interactive commanding agent for an autonomous robot. A pipeline of speech-to-text, large-language-model, and text-to-speech architectures is formed to create agents that bridge the robot's autonomous navigation algorithms with natural language reasoning, all hosted on a local machine.
Justyna Zając
Analytical and numerical optimization methods for operations management: a software implementation perspective
Analytical or numerical mathematical optimization techniques can be used in operations management for the purpose of automated decision-making. Algorithms implemented in software applications for (near) real-time, continuous decision-making often need to ingest new, previously unseen data while delivering robust, consistent, and reliable results rather than the most accurate ones. Both analytical and numerical approaches have distinct advantages and limitations. This study aims to highlight these aspects from the perspective of industry software applications, using a classical example from inventory management to illustrate the concepts.
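To make the analytical versus numerical contrast in the abstract above concrete, here is a minimal sketch. It assumes the "classical example from inventory management" is the economic order quantity (EOQ) model, which the abstract does not confirm. The function names, the search bounds, and the demand and cost figures are all hypothetical and chosen only for illustration.

```python
import math


def total_cost(q, demand, order_cost, holding_cost):
    """Annual ordering cost plus holding cost for order quantity q (EOQ model, assumed)."""
    return demand / q * order_cost + holding_cost * q / 2.0


def eoq_analytical(demand, order_cost, holding_cost):
    """Closed-form minimizer of total_cost: sqrt(2 * D * K / h)."""
    return math.sqrt(2.0 * demand * order_cost / holding_cost)


def eoq_numerical(demand, order_cost, holding_cost, lo=1.0, hi=10_000.0, tol=1e-6):
    """Golden-section search on [lo, hi]; valid because the cost is unimodal for q > 0.

    The bounds lo and hi are illustrative assumptions and must bracket the optimum.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)  # left probe point
        d = a + inv_phi * (b - a)  # right probe point
        if total_cost(c, demand, order_cost, holding_cost) < total_cost(
            d, demand, order_cost, holding_cost
        ):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2.0


if __name__ == "__main__":
    # Hypothetical inputs: 1200 units/year, 50 per order, 6 per unit per year.
    print(eoq_analytical(1200.0, 50.0, 6.0))
    print(eoq_numerical(1200.0, 50.0, 6.0))
```

The closed form is exact and instant but only exists while the cost model stays this simple. The numerical search needs bounds and a tolerance, yet keeps working if `total_cost` is replaced by a model without a closed-form optimum.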
Mario Corsanici
JOiiNT LAB: a synergy between research and industry
JOiiNT LAB is a collaborative research initiative established in 2020 through a partnership between the Italian Institute of Technology and the Intellimech Consortium. Its mission is to strengthen technology transfer, bridge research and industrial needs, train professionals with advanced technical-scientific skills, and enhance the region’s technological excellence.
Benedikt Hosp
Gaze-Driven Intention Prediction for Assistive AI: Personalization and Collaboration
In the rapidly evolving field of assistive technology, creating AI systems that intuitively and effectively collaborate with users is critical to enhancing autonomy and quality of life for individuals with disabilities. Our research focuses on developing a personalized, gaze-driven intention prediction system that facilitates seamless collaboration between humans and assistive AI in both disability and learning contexts. By incorporating personalization, the system adapts to the unique needs and behaviors of each user, significantly improving the accuracy and efficiency of task execution and learning processes. Leveraging advanced machine learning techniques, such as large language models (LLMs) and vision-language models (VLMs), the system continuously learns and refines its predictions, leading to more responsive and user-centered interactions. This personalized approach not only reduces cognitive and physical effort but also enhances trust and safety in human-AI collaboration. The outcomes of this research will contribute to more intuitive, reliable, and inclusive assistive technologies, ultimately empowering users and advancing the field of human-AI interaction.
Silvia Romana Ottaviani
Robots at work: a usability evaluation of a robot avatar for inspection and maintenance
Nowadays, inspection and maintenance tasks in industrial sectors rely heavily on manual labor. The use of robot avatars and shared autonomy offers a potential solution to enhance operator safety, comfort, and task digitization. We evaluated the usability of a robot avatar for the inspection of welded joints in pressure vessels for gas plants and conducted experimental tests with human operators. We compared three control methods (fully teleoperated, semi-autonomous with an Augmented Reality (AR) interface, and semi-autonomous with a physical interface) and evaluated physical and cognitive ergonomics using wearable systems. The results of this research offer insights into how robotic avatar technology affects human operators' ergonomics, safety, and efficiency in industrial inspection and maintenance processes.
Uélison Santos
Next Generation AI-Based Multimodal Stream Processing
Real-time processing is crucial for critical applications, especially with diverse streaming data such as video, audio, and time series. However, current streaming systems are so far limited to processing structured data. Large language models (LLMs) show promise in multimodal data understanding but face performance issues that limit their use in real-time streaming. Our vision is a general-purpose AI-based streaming system that efficiently processes multimodal data in real time. By addressing the challenges of integrating LLMs into streaming architectures, we aim to push the boundaries of what is achievable in real time, ultimately paving the way for more advanced and versatile applications in various domains.
Satyam Uttamkumar Dudhagara
Autonomous Simulation, Testing, and Data Generation Framework for Mobile Robots within Randomly Generated Plausible Scenarios
The core of this work is to conceptualize and build an automated testing pipeline for mobile robot applications. The initial focus is on the automated generation of plausible factory layouts and the assignment of goal stations within the generated simulation environment. This is extended to imitate complex tasks in a factory environment, such as intra-logistics scenarios with multiple robots. The aim is to develop a synthetic dataset generation pipeline for mobile robots that can also validate their navigation stack by randomly generating tasks and scenarios in simulation, then checking for collisions and logging collision data. The pipeline uses a simulation environment set up in NVIDIA Isaac Sim.
Johnata Brayan Souza Soares
UAIbot: Beginner-friendly web-based simulator for interactive robotics learning and research
UAIbot is a versatile, web-based robotic simulator designed to provide an interactive and accessible platform for learning and research in robotics. Supporting multiple programming languages, UAIbot focuses primarily on simulating open-chain serial robotic manipulators. The simulator allows users to create and visualize robotic scenarios, experiment with diverse control algorithms, and rapidly prototype new control strategies. This makes UAIbot an excellent tool both for education, where it helps users gain practical, hands-on experience in robotics programming, and for research, where it facilitates the development and testing of innovative robotic solutions.
Seyedmasih Tabaei
synthetixt: Simple yet modular text data augmentation framework for text classification and sentiment analysis utilizing large language models
Deep learning approaches are ravenous for vast amounts of high-quality data, a need that poses challenges in domains where high-quality annotated datasets are scarce or costly to produce. To address these challenges, data augmentation techniques offer a viable solution by artificially extending datasets to balance quality and quantity. Traditional text augmentation techniques, such as synonym replacement or back translation, can enhance model performance for downstream tasks like text classification or sentiment analysis but are often limited in their ability to generate truly diverse data while preserving the original meaning. To overcome these limitations, we propose synthetixt, a simple yet modular text data augmentation framework that harnesses the power of large language models (LLMs). synthetixt uses LLMs to generate synthetic data that maintains semantic consistency while enhancing diversity, thereby improving the robustness of models. By integrating an enhancement step and a validation step, it also minimizes the risk of introducing erroneous data.
Tobias Schreieder
On Stance Detection in Image Retrieval for Argumentation
Given a text query on a controversial topic, the task of Image Retrieval for Argumentation is to rank images according to how well they can be used to support a discussion on the topic. An important subtask therein is to determine the stance of the retrieved images, i.e., whether an image supports the pro or con side of the topic. This work provides a detailed investigation of the challenges of this task by means of a novel and modular retrieval pipeline called NeurArgs. Building on NeurArgs, new stance detection models were developed to address previous weaknesses.
Gabin Maxime Nguegnang
Convergence of GD for Learning Linear Neural Networks
We study the convergence properties of gradient descent in deep linear neural networks by extending a previous analysis for the related gradient flow. We show that, under suitable conditions on the step sizes, gradient descent converges to a critical point of the loss function, i.e., the square loss in this work. Furthermore, we prove that for almost all initializations gradient descent converges to a global minimum in the case of two layers.
In the case of three or more layers, we show that gradient descent converges to a global minimum on the manifold of matrices of some fixed rank, where the rank cannot be determined a priori.
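The two-layer setting described above can be tried out with a minimal sketch of plain gradient descent on the square loss L(W1, W2) = 1/2 ||W2 W1 X - Y||_F^2. The data, the initialization, the constant step size, and the iteration count are illustrative assumptions and are not the conditions analysed in the poster.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def axpy(a, b, alpha):
    """Return a + alpha * b elementwise."""
    return [[x + alpha * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def square_loss(w1, w2, x, y):
    """0.5 * ||W2 W1 X - Y||_F^2."""
    r = axpy(matmul(w2, matmul(w1, x)), y, -1.0)
    return 0.5 * sum(v * v for row in r for v in row)


def gd_step(w1, w2, x, y, step):
    """One simultaneous gradient descent update of both layers."""
    h = matmul(w1, x)                                     # hidden output W1 X
    r = axpy(matmul(w2, h), y, -1.0)                      # residual W2 W1 X - Y
    g2 = matmul(r, transpose(h))                          # gradient w.r.t. W2
    g1 = matmul(transpose(w2), matmul(r, transpose(x)))   # gradient w.r.t. W1
    return axpy(w1, g1, -step), axpy(w2, g2, -step)


def train(w1, w2, x, y, step=0.05, iters=2000):
    """Run gradient descent with a constant step size (an assumed, untuned value)."""
    losses = [square_loss(w1, w2, x, y)]
    for _ in range(iters):
        w1, w2 = gd_step(w1, w2, x, y, step)
        losses.append(square_loss(w1, w2, x, y))
    return w1, w2, losses


# Hypothetical toy problem: identity inputs, diagonal full-rank target.
X = [[1.0, 0.0], [0.0, 1.0]]
Y = [[2.0, 0.0], [0.0, 1.0]]
W1_INIT = [[0.5, 0.1], [-0.1, 0.5]]
W2_INIT = [[0.5, -0.1], [0.1, 0.5]]

if __name__ == "__main__":
    _, _, history = train(W1_INIT, W2_INIT, X, Y)
    print(history[0], history[-1])
```

On this toy problem the loss is driven close to zero, which matches the two-layer statement, but a single run with hand-picked values only illustrates the result and proves nothing about it.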
Esmaeel Mohammadi
Optimization and Control of Industrial Processes Using Deep Learning-based Simulators as an Environment to Train Deep Reinforcement Learning Algorithms
This study focuses on optimizing industrial process control, specifically phosphorus removal in wastewater treatment plants (WWTPs), through the use of deep learning-based simulators integrated with deep reinforcement learning (DRL) algorithms. The research tackles a major challenge in industrial process control: the absence of accurate simulators. Six advanced deep learning models (LSTM, Transformer, Informer, Autoformer, DLinear, and NLinear) are used to simulate the phosphorus removal process, achieving up to 97% prediction accuracy in one-step forecasts. These models provide a robust environment for DRL algorithms to optimize control strategies, significantly improving phosphorus removal efficiency. The study highlights the benefits of iterative training and of incorporating exogenous state variables into the models to reduce multi-step prediction errors, leading to more reliable long-term forecasts. Compared to models without exogenous variables, these improvements led to a 55% reduction in Mean Squared Error (MSE) and a 34% reduction in Dynamic Time Warping (DTW) distance. The application of the Soft Actor-Critic (SAC) DRL algorithm is also explored, demonstrating its potential for managing time delays in industrial processes and further improving control strategies. This approach offers promising advancements in resource management and environmental sustainability for WWTPs.
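For readers unfamiliar with the two error measures quoted above, the following minimal sketch shows how MSE and a DTW distance can be computed for two one-dimensional series. The abstract does not say which DTW variant or local cost was used, so the textbook dynamic-programming form with absolute-difference cost is an assumption, and the function names are hypothetical.

```python
def mse(predicted, observed):
    """Mean squared error between two equally long series."""
    if len(predicted) != len(observed):
        raise ValueError("series must have the same length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Assumed variant: absolute difference as local cost, no warping window.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b[j-1] is reused
                cost[i][j - 1],      # b advances, a[i-1] is reused
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

MSE compares values at the same time index, so a forecast that is correct but shifted in time is penalized heavily. DTW allows the alignment to stretch, which is why the two measures can improve by different amounts on the same forecasts.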