Moira: A goal-oriented incremental machine learning approach to dynamic resource cost estimation in distributed stream processing systems

Daniele Foroni, Cristian Axenie, Stefano Bortoli, Mohamad Al Hajj Hassan, Ralph Acker, Radu Tudoran, Goetz Brasche, Yannis Velegrakis

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review

Abstract

The need for real-time analysis keeps growing, and the number of available streaming sources is increasing. The recent literature contains many works on Data Stream Processing (DSP). In a streaming environment, the incoming data rate varies over time, so the challenge is how to deploy DSP applications efficiently in a cluster. Several works have focused on improving the latency of the system or on minimizing the resources allocated per application over time. However, to the best of our knowledge, none of the existing works takes into consideration the user's needs for a specific application, which differ from one user to another. In this paper, we propose Moira, a goal-oriented framework, built on top of Apache Flink, for dynamically optimizing resource allocation. The system takes actions based on the user application and on the characteristics of the incoming data (i.e., input rate and window size). Starting from an initial estimate of the resources needed for the user query, at each iteration we refine our cost function with metrics about the incoming data collected from the monitored system, in order to fulfill the user's needs. We present a series of experiments that show in which cases our dynamic estimation outperforms both baseline Apache Flink and the rule-of-thumb estimation performed alone at deployment of the applications.

Original language: English
Title of host publication: Proceedings of the International Workshop on Real-Time Business Intelligence and Analytics, BIRTE 2018
Publisher: Association for Computing Machinery (ACM)
ISBN (Electronic): 9781450366076
Publication status: Published - 27 Aug 2018
Event: 12th International Workshop on Real-Time Business Intelligence and Analytics, BIRTE 2018 - Rio de Janeiro, Brazil
Duration: 27 Aug 2018 to 27 Aug 2018

Conference

Conference: 12th International Workshop on Real-Time Business Intelligence and Analytics, BIRTE 2018
Country/Territory: Brazil
City: Rio de Janeiro
Period: 27/08/18 to 27/08/18
