Conference

Incremental FP-Growth Mining Strategy for Dynamic Threshold Value and Database Based on MapReduce

Abstract

With the arrival of the Big Data era, data mining faces new opportunities and challenges. Traditional association rule mining algorithms show limitations when applied to large-scale data. The Apriori algorithm scans external storage repeatedly, which leads to high I/O load and low performance. The effectiveness of the FP-Growth algorithm is limited by main memory size, because its mining process relies on a large tree-shaped data structure. Moreover, despite notable progress, problems remain in dynamic scenarios. This paper presents a parallelized incremental FP-Growth mining strategy based on MapReduce that aims to process large-scale data. The proposed incremental algorithm mines effectively when the threshold value and the original database change at the same time. The algorithm is implemented on Hadoop, and the experimental results show its advantages.
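
The abstract does not describe the algorithm itself, so the following is only a minimal sketch of the general idea behind incremental mining in a map/reduce style, not the authors' method. It covers just the item-counting pass that precedes FP-tree construction: per-partition counts are merged, cached, and then re-filtered when both the database and the support threshold change. All names (map_count, reduce_counts, frequent_items) and the toy data are hypothetical.

```python
from collections import Counter
from functools import reduce


def map_count(partition):
    """Map step (illustrative): count each item once per transaction."""
    counts = Counter()
    for transaction in partition:
        counts.update(set(transaction))
    return counts


def reduce_counts(partials):
    """Reduce step (illustrative): merge per-partition counts."""
    return reduce(lambda a, b: a + b, partials, Counter())


def frequent_items(counts, n_transactions, min_support):
    """Keep items whose relative support meets the threshold."""
    return {
        item
        for item, count in counts.items()
        if count / n_transactions >= min_support
    }


# Hypothetical toy data: an original database and a later increment.
original = [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b"]]
increment = [["c", "d"], ["c"]]

# Initial run: two partitions counted independently, then merged.
base_counts = reduce_counts([map_count(original[:2]), map_count(original[2:])])
before = frequent_items(base_counts, len(original), 0.5)

# Incremental run: only the new transactions are scanned; the cached
# counts are reused, and a different threshold is applied afterwards.
new_counts = base_counts + map_count(increment)
after = frequent_items(new_counts, len(original) + len(increment), 0.6)
```

Because the raw counts are cached rather than the filtered result, a change in threshold needs no rescan of the original data, and a change in the database needs a scan of the increment only. A full FP-Growth pipeline would still have to update or rebuild the tree, which this sketch omits.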

Authors

Wei X; Ma Y; Zhang F; Liu M; Shen W

Pagination

pp. 271-276

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Publication Date

May 1, 2014

DOI

10.1109/cscwd.2014.6846854

Name of conference

Proceedings of the 2014 IEEE 18th International Conference on Computer Supported Cooperative Work in Design (CSCWD)