To Tune or Not to Tune?: In Search of Optimal Configurations for Data Analytics

Accepted version
Peer-reviewed

Type

Conference Object

Authors

Fekry, A 
Carata, L 
Pasquier, T 

Abstract

This experimental study presents several overlooked issues that challenge configuration tuning and deployment for data analytics: 1) the assumption of a static workload and environment, which ignores the dynamic characteristics of the analytics environment (e.g. the frequent need for workload retuning); 2) the speed at which tuning cost is amortized, and how this influences the decision to tune; and 3) the need for comprehensive incremental tuning across a diverse set of workloads. To support these points, we present Tuneful, an efficient configuration tuning framework for data analytics. We show how it is designed to overcome the above issues and illustrate its applicability through experiments on two cloud service providers.

Keywords

Data analytics, Configuration tuning, Bayesian Optimization, Cost amortization

Journal Title

Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Conference Name

KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

Publisher

ACM

Rights

All rights reserved