Journal article

Guidelines for Selecting Hadoop Schedulers Based on System Heterogeneity

Abstract

Hadoop has been developed as a solution for performing large-scale data-parallel applications in Cloud computing. A Hadoop system can be described based on three factors: cluster, workload, and user. Each factor is either heterogeneous or homogeneous, which reflects the heterogeneity level of the Hadoop system. This paper studies the effect of heterogeneity in each of these factors on the performance of Hadoop schedulers. Three schedulers which consider different levels of Hadoop heterogeneity are used for the analysis: FIFO, Fair sharing, and COSHH (Classification and Optimization based Scheduler for Heterogeneous Hadoop). Performance issues are introduced for Hadoop schedulers, and experiments are provided to evaluate these issues. The reported results suggest guidelines for selecting an appropriate scheduler for Hadoop systems. Finally, the proposed guidelines are evaluated in different Hadoop systems.

Authors

Rasooli A; Down DG

Journal

Journal of Grid Computing, Vol. 12, No. 3, pp. 499–519

Publisher

Springer Nature

Publication Date

September 1, 2014

DOI

10.1007/s10723-014-9299-2

ISSN

1570-7873
