Conference

A RST-Based Stateful Data Analytics within Spark

Abstract

Stateful data analytics frameworks have emerged to provide fresh, low-latency results for big data processing. It is therefore desirable to support a fine-grained data model in mainstream data processing frameworks such as Spark. However, Spark adopts a coarse-grained data model to facilitate parallelization, which makes fine-grained data access in stateful data analytics very challenging. In this paper, we introduce a stateful component, the Resilient State Table (RST), to the Spark framework. To bridge the gap between Spark's coarse-grained data model and the fine-grained state access required by stateful data analytics, we devise a programming model for RST that interacts seamlessly with Spark's coarse-grained memory representation and lets users query and update state entries at fine granularity through Spark-like programming interfaces. Performance evaluations in various application fields demonstrate that the proposed solution improves latency, fault tolerance, and scalability.
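
The contrast the abstract draws between coarse-grained batch processing and fine-grained state access can be illustrated with a small, self-contained sketch. This is not the RST API from the paper, and it does not use Spark. The class `ToyStateTable` and its methods `query`, `update`, and `apply_batch` are hypothetical names chosen for illustration, and the hash-partitioned dictionary layout is an assumption made only to keep the example concrete.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Tuple


class ToyStateTable:
    """Hypothetical keyed state table (illustration only, not the paper's RST)."""

    def __init__(self, num_partitions: int = 4) -> None:
        # Assumption: state is hash-partitioned by key, one dict per partition.
        self.num_partitions = num_partitions
        self.partitions: List[Dict[Hashable, int]] = [
            {} for _ in range(num_partitions)
        ]

    def _partition_for(self, key: Hashable) -> Dict[Hashable, int]:
        return self.partitions[hash(key) % self.num_partitions]

    def query(self, key: Hashable, default: int = 0) -> int:
        # Fine-grained read: touches a single entry in a single partition.
        return self._partition_for(key).get(key, default)

    def update(self, key: Hashable, fn: Callable[[int], int]) -> None:
        # Fine-grained write: rewrites one entry, leaving the rest untouched.
        part = self._partition_for(key)
        part[key] = fn(part.get(key, 0))

    def apply_batch(
        self,
        batch: Iterable[Tuple[Hashable, int]],
        merge: Callable[[int, int], int],
    ) -> None:
        # Coarse-grained step: pre-aggregate the incoming batch per key,
        # similar in spirit to a reduce-by-key over one micro-batch.
        combined: Dict[Hashable, int] = {}
        for key, value in batch:
            combined[key] = merge(combined[key], value) if key in combined else value
        # Fine-grained step: only the keys present in the batch are updated.
        for key, value in combined.items():
            self.update(key, lambda old, v=value: merge(old, v))


# Usage: a running word count kept across two batches.
table = ToyStateTable(num_partitions=2)
table.apply_batch([("spark", 1), ("rst", 1), ("spark", 1)], merge=lambda a, b: a + b)
table.apply_batch([("spark", 1)], merge=lambda a, b: a + b)
print(table.query("spark"))  # 3
```

The point of the sketch is that each batch rewrites only the entries whose keys it contains, rather than rebuilding the whole dataset. It leaves out fault tolerance and distribution entirely.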

Authors

Ge J; Chen Z; Liu C; Peng J; He W; Zhu N

Pagination

pp. 394-399

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Publication Date

July 1, 2017

DOI

10.1109/icci-cc.2017.8109779

Name of conference

2017 IEEE 16th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC)