Journal article

The longest common extension problem revisited and applications to approximate string searching

Abstract

The Longest Common Extension (LCE) problem considers a string s and computes, for each pair (i,j), the longest substring of s that starts at both i and j. It appears as a subproblem in many fundamental string problems and can be solved by linear-time preprocessing of the string that allows (worst-case) constant-time computation for each pair. The two known approaches use powerful algorithms: either constant-time computation of the Lowest Common Ancestor in trees or constant-time computation of Range Minimum Queries in arrays. We show here that, from a practical point of view, such complicated approaches are not needed. We give two very simple algorithms for this problem that require no preprocessing. The first is 5 times faster than the best previous algorithms on average, whereas the second is faster on virtually all inputs. As an application, we modify the Landau–Vishkin algorithm for approximate matching to use our simplest LCE algorithm. The resulting algorithm is 13 to 20 times faster than the original. We compare it with the more widely used Ukkonen's cutoff algorithm and show that it behaves better for a significant range of error thresholds.
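
To make the problem definition concrete, the following is a minimal sketch of an LCE query answered with no preprocessing, by comparing characters directly from positions i and j. The abstract does not spell out the authors' two algorithms, so this is an illustration of the general idea under that assumption, not their implementation; the function name `lce` and the use of 0-based positions are choices made here.

```python
def lce(s: str, i: int, j: int) -> int:
    """Return the length of the longest substring of s starting at both i and j.

    Hypothetical helper for illustration: no preprocessing, plain
    character-by-character comparison. Positions are 0-based.
    """
    n = len(s)
    k = 0
    # Extend the match while both positions stay inside s and the characters agree.
    while i + k < n and j + k < n and s[i + k] == s[j + k]:
        k += 1
    return k


if __name__ == "__main__":
    # "abra" occurs at positions 0 and 7 of "abracadabra".
    print(lce("abracadabra", 0, 7))  # prints 4
```

A single query of this kind costs time proportional to the length of the match in the worst case, in contrast with the constant-time queries of the preprocessing-based approaches mentioned above.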

Authors

Ilie L; Navarro G; Tinta L

Journal

Journal of Discrete Algorithms, Vol. 8, No. 4, pp. 418–428

Publisher

Elsevier

Publication Date

December 1, 2010

DOI

10.1016/j.jda.2010.08.004

ISSN

1570-8667
