Home
Scholarly Works
Lempel–Ziv Factorization Using Less Time &...
Journal article

Lempel–Ziv Factorization Using Less Time & Space

Abstract

Abstract.For 30 years the Lempel–Ziv factorization LZx of a string x = x[1..n] has been a fundamental data structure of string processing, especially valuable for string compression and for computing all the repetitions (runs) in x. Traditionally the standard method for computing LZx was based on Θ(n)-time (or, depending on the measure used, O(n log n)-time) processing of the suffix tree STx of x. Recently Abouelhoda et al. proposed an efficient Lempel–Ziv factorization algorithm based on an “enhanced” suffix array – that is, a suffix array SAx together with supporting data structures, principally an “interval tree”. In this paper we introduce a collection of fast space-efficient algorithms for LZ factorization, also based on suffix arrays, that in theory as well as in many practical circumstances are superior to those previously proposed; one family out of this collection achieves true Θ(n)-time alphabet-independent processing in the worst case by avoiding tree structures altogether.

Authors

Chen G; Puglisi SJ; Smyth WF

Journal

Mathematics in Computer Science, Vol. 1, No. 4, pp. 605–623

Publisher

Springer Nature

Publication Date

June 1, 2008

DOI

10.1007/s11786-007-0024-4

ISSN

1661-8270

Contact the Experts team