Events 2023

Societal Computing: Computing Of and For Society

Ingmar Weber Fachrichtung Informatik - Saarbrücken
01 Mar 2023, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series

Learning for Decision Making: A Tale of Complex Human Preferences

Leqi Liu Carnegie Mellon University
14 Feb 2023, 2:00 pm - 3:00 pm
Virtual talk
SWS Colloquium
Machine learning systems are deployed in diverse decision-making settings in service of stakeholders characterized by complex preferences. For example, in healthcare and finance, we ought to account for various levels of risk tolerance; and in personalized recommender systems, we face users whose preferences evolve dynamically over time. Building systems better aligned with stakeholder needs requires that we take the rich nature of human preferences into account. In this talk, I will give an overview of my research on the statistical and algorithmic foundations for building such human-centered machine learning systems. First, I will present a line of work that draws inspiration from the economics literature to develop learning algorithms that account for the risk preferences of stakeholders. Subsequently, I will discuss a line of work that draws insights from the psychology literature to develop online learning algorithms for personalized recommender systems that account for users' evolving preferences.

Please contact the office team for link information.
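
To make the notion of risk preferences concrete, here is a minimal sketch (illustrative only, not the speaker's method) using conditional value-at-risk (CVaR), a standard risk measure from the economics literature: a risk-neutral learner ranks options by their mean reward, while a CVaR learner ranks them by the mean of their worst outcomes.

import numpy as np

def cvar(rewards, alpha=0.1):
    # Mean of the worst alpha-fraction of outcomes: penalizes
    # heavy lower tails, unlike the plain mean.
    rewards = np.sort(np.asarray(rewards))
    k = max(1, int(np.ceil(alpha * len(rewards))))
    return rewards[:k].mean()

rng = np.random.default_rng(0)
arm_a = rng.normal(1.0, 0.1, 10_000)   # modest mean, low variance
arm_b = rng.normal(1.2, 2.0, 10_000)   # higher mean, heavy downside

print(arm_a.mean(), arm_b.mean())      # risk-neutral choice: arm_b
print(cvar(arm_a), cvar(arm_b))        # risk-averse (CVaR) choice: arm_a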

Programmatic Interfaces for Design and Simulation

Aman Mathur Max Planck Institute for Software Systems
09 Feb 2023, 1:30 pm - 2:30 pm
Kaiserslautern building G26, room 111
SWS Student Defense Talks - Thesis Defense
Although Computer Aided Design (CAD) and simulation tools have been around for quite some time, modern design and prototyping pipelines are challenging the limits of these tools. Advances in 3D printing have brought manufacturing capability to the general public. Moreover, advancements in machine learning and sensor technology are enabling enthusiasts and small companies to develop their own autonomous vehicles and machines. This means that many more users are designing (or customizing) 3D objects in CAD, and many are testing autonomous machines in simulation. Though Graphical User Interfaces (GUIs) are the de facto standard for these tools, these interfaces lack robustness and flexibility. For example, designs made using GUIs often break when customized, and setting up large simulations in a GUI is quite tedious. While programmatic interfaces do not suffer from these limitations, they are generally quite difficult to use and often do not provide appropriate abstractions and language constructs.

In this thesis, we present our work on combining the ease of use of GUIs with the robustness and flexibility of programming. For CAD, we propose an interactive framework that automatically synthesizes robust programs from GUI-based design operations. Additionally, we apply program analysis to ensure customizations do not lead to invalid objects. Finally, for simulation, we propose a novel programmatic framework that simplifies the building of complex test environments, and a test generation mechanism that guarantees good coverage over test parameters.
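
As a rough illustration of the robustness problem the thesis addresses, consider the toy sketch below (hypothetical, not the thesis framework): when a design is a program, an explicit validity check can reject a customization that would produce degenerate geometry, rather than silently breaking the object as a GUI-made design might.

from dataclasses import dataclass

@dataclass
class Box:
    width: float
    height: float
    wall: float   # wall thickness of a hollowed-out box

    def validate(self):
        # Walls that meet or cross produce degenerate geometry;
        # reject such parameter combinations up front.
        if 2 * self.wall >= min(self.width, self.height):
            raise ValueError("walls too thick for the given outer size")

def customize(box: Box, **changes) -> Box:
    new = Box(**{**box.__dict__, **changes})
    new.validate()    # a customization cannot silently break the design
    return new

base = Box(width=40.0, height=20.0, wall=2.0)
taller = customize(base, height=30.0)    # fine
# customize(base, wall=15.0)             # raises ValueError: invalid object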

Toward Deep Semantic Understanding: Event-Centric Multimodal Knowledge Acquisition

Manling Li University of Illinois Urbana-Champaign
01 Feb 2023, 3:00 pm - 4:00 pm
Saarbrücken building E1 5, room 002
simultaneous videocast to Kaiserslautern building G26, room 111
SWS Colloquium
Please note that this is a virtual talk, which will be videocast to Saarbrücken and Kaiserslautern.

Traditionally, multimodal information consumption has been entity-centric, with a focus on concrete concepts (such as objects, object types, and physical relations, e.g., a person in a car), but it lacks the ability to understand abstract semantics (such as events and the semantic roles of objects, e.g., driver, passenger, mechanic). However, such event-centric semantics are the core knowledge communicated, regardless of whether it takes the form of text, images, videos, or other data modalities. At the core of my research in Multimodal Information Extraction (IE) is bringing such deep semantic understanding to the multimodal world. My work opens up a new research direction, Event-Centric Multimodal Knowledge Acquisition, to transform traditional entity-centric, single-modal knowledge into event-centric, multi-modal knowledge.

Such a transformation poses two significant challenges: (1) understanding multimodal semantic structures that are abstract (such as events and the semantic roles of objects): I will present my solution of zero-shot cross-modal transfer (CLIP-Event), which is the first to model event semantic structures for vision-language pretraining and supports zero-shot multimodal event extraction for the first time; (2) understanding long-horizon temporal dynamics: I will introduce the Event Graph Model, which empowers machines to capture complex timelines, intertwined relations, and multiple alternative outcomes. I will also show its positive results on long-standing open problems, such as timeline generation, meeting summarization, and question answering.

Such event-centric multimodal knowledge starts the next generation of information access, which allows us to effectively access historical scenarios and reason about the future. I will lay out how I plan to grow a deep semantic understanding of the language world and the vision world, moving from concrete to abstract, from static to dynamic, and ultimately from perception to cognition.

Please contact the office team for Zoom link information.
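
For readers unfamiliar with event-centric representations, the following minimal sketch (a hypothetical schema, not the models from the talk) shows the basic shape of the idea: each event carries a type and the semantic roles of its arguments, and temporal edges connect events into a graph that can encode a timeline.

from dataclasses import dataclass, field

@dataclass
class Event:
    etype: str     # event type, e.g. "Transport"
    args: dict     # semantic role -> entity, e.g. {"driver": "Alice"}

@dataclass
class EventGraph:
    events: list = field(default_factory=list)
    before: list = field(default_factory=list)   # (i, j): event i precedes event j

g = EventGraph()
g.events.append(Event("Transport", {"driver": "Alice", "vehicle": "car"}))
g.events.append(Event("Arrive", {"passenger": "Alice", "place": "Berlin"}))
g.before.append((0, 1))   # the transport happens before the arrival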

Terabyte-Scale Genome Analysis for Underfunded Labs

Sven Rahmann MMCI; CISPA Helmholtz Center for Information Security
01 Feb 2023, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
In 2023, you can get your personal genome sequenced for under 1000 Euros. If you do, you will obtain the data (50 GB to 100 GB) on a USB stick or hard drive. The data consists of many short strings (of length 100 or 150) over the famous DNA alphabet {A,C,G,T}, for a total of 50 to 100 billion letters.

With this personal data, you may want to learn about your ancestry, or judge your personal risk of getting various conditions, such as myocardial infarction, stroke, breast cancer, and others. So you must look for certain (known) DNA variations in your individual genome that are known to be associated with certain population groups or diseases. As the raw data ("reads") consists of essentially random substrings of the genome, it is necessary to find the place of origin of each read in the genome, an error-tolerant pattern search task.

In a medium-scale research study (say, for heart disease), we have similar tasks for a few hundred individual patients and healthy control persons, for a total of roughly 30-50 TB of data, delivered on a few USB hard drives. After primary analysis, the task is to find genetic variants (or, more coarsely, genes) related to the disease, i.e., we have a pattern mining problem with millions of features and a few hundred samples. The full workflow for such a study consists of more than 100,000 single steps, including simple per-sample steps (e.g., removing low-quality reads) and complex ones involving statistical models across all samples for variant calling. Particularly in a medical setting, each step needs to be fully reproducible, and we need to trace data provenance and maintain a chain of accountability.

In the past ten years, we have worked on and contributed to many aspects of variant-calling workflows and realized that the strategy of attacking the ever-growing data with ever-growing compute clusters and storage systems will not scale well in the near future. Thus, our current work focuses on so-called alignment-free methods, which have the potential to yield the same answers as current state-of-the-art methods with 10 to 100 times less CPU work. I will present our recent advances in laying better foundations for alignment-free methods: engineered and optimized parallel hash tables for short DNA pieces (k-mers), and the design of masks for gapped k-mers with optimal error tolerance. These new methods will enable even small labs to analyze large genomics datasets on a "good gaming PC", while investing less than 5000 Euros in computational hardware.

I will also advertise our workflow language and execution engine "Snakemake", a combination of Make and Python that is now one of the most frequently used bioinformatics workflow management tools, but is actually not restricted to bioinformatics research.
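
Two of the ideas above can be illustrated briefly. First, gapped k-mers: the toy sketch below (with a made-up mask, not the talk's engineered implementation) shows how a mask selects which positions of a window contribute to the key, so that a sequencing error at a skipped position leaves the key unchanged.

# '#' positions contribute to the key; '_' positions are skipped,
# so a substitution at a skipped position does not change the key.
MASK = "##_##"   # hypothetical 5-wide mask with one gap

def gapped_kmers(seq, mask=MASK):
    width = len(mask)
    keep = [i for i, c in enumerate(mask) if c == "#"]
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        yield "".join(window[j] for j in keep)

print(list(gapped_kmers("ACGTACG")))   # ['ACTA', 'CGAC', 'GTCG']
# Windows "ACGTA" and "ACTTA" both map to the key "ACTA": one
# mismatch at the gap position is tolerated for free.

Second, a minimal Snakemake sketch (the file names and the seqfilter command are made up): each rule declares its inputs and outputs, and Snakemake infers the dependency graph, runs independent jobs in parallel, and re-runs only out-of-date steps, which is what keeps a 100,000-step workflow reproducible.

SAMPLES = ["patient01", "control01"]

rule all:
    input:
        expand("filtered/{sample}.fastq", sample=SAMPLES)

rule filter_reads:
    input:
        "raw/{sample}.fastq"
    output:
        "filtered/{sample}.fastq"
    shell:
        "seqfilter --min-quality 20 {input} > {output}"   # seqfilter is a hypothetical CLI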