Mosaics of Big Data Database Systems and Information Management – Trends and a Vision

Join us at the CRSS Weekly Seminar, where we will have pizza and listen to a talk by Dr. Volker Markl on “Mosaics of Big Data Database Systems and Information Management – Trends and a Vision”!

Zoom Link:


The global database research community has greatly impacted the functionality and performance of data storage and processing systems along the dimensions that define “big data”, i.e., volume, velocity, variety, and veracity. Locally, over the past five years, we have also been working on various fronts. Among our contributions are: (1) establishing a vision for a database-inspired big data analytics system, which unifies the best of database and distributed systems technologies and augments them with concepts drawn from compilers (e.g., iterations) and data stream processing, and (2) forming a community of researchers and institutions to create the Stratosphere platform to realize our vision. One major result of these activities was Apache Flink, an open-source big data analytics platform, together with its thriving global community of developers and production users.

Although much progress has been made, when looking at the overall big data stack, a major challenge for the database research community still remains: how to maintain ease of use despite the increasing heterogeneity and complexity of data analytics. This complexity involves specialized engines for the various aspects of an end-to-end data analytics pipeline, including, among others, graph-based, linear algebra-based, and relational algorithms, as well as the underlying, increasingly heterogeneous hardware and computing infrastructure.

At TU Berlin, DFKI, and the Berlin Institute for the Foundations of Learning and Data (BIFOLD), we currently aim to advance research in this field via the NebulaStream and Agora projects. Our goal is to remedy some of the heterogeneity challenges that hamper developer productivity and limit the use of data science technologies to a privileged few, highly sought-after experts. In this talk, we will outline how state-of-the-art stream processing engines (SPEs) have to change to exploit the new capabilities of the Internet of Things (IoT) and showcase how we tackle IoT challenges in our own system, NebulaStream. We will also present our vision for Agora, an asset ecosystem that provides the technical infrastructure for offering and using data and algorithms, as well as physical infrastructure components.


About Dr. Volker Markl:

Dr. Markl is Chair of the Database Systems and Information Management (DIMA) Group at TU Berlin, Director of the Berlin Institute for the Foundations of Learning and Data (BIFOLD), and Chief Scientist and Head of the Intelligent Analytics for Massive Data Research Group at the German Research Center for Artificial Intelligence (DFKI).

Thursday, January 18, 2024 at 12:00 PM


CRSS Contact:
McCarley, Cynthia

Last modified 17 Jan 2024