Aho–Corasick algorithm

E672058

The Aho–Corasick algorithm is a classic string-searching algorithm that efficiently finds all occurrences of multiple patterns in a text using a trie-based finite-state machine.

Jump to: Statements Referenced by

Statements (48)

Predicate Object
instanceOf finite-state machine based algorithm
multiple-pattern matching algorithm
string-searching algorithm
author Alfred V. Aho NERFINISHED
Margaret J. Corasick NERFINISHED
canBeImplementedAs deterministic finite automaton
field computer science
hasAdvantage efficient for large dictionaries of patterns
finds all pattern occurrences in a single pass over the text
hasPhase failure function construction
search phase
trie construction
hasProperty preprocessing of patterns is done once
search time independent of number of patterns
hasTimeComplexity O(n + m + z)
linear in length of text plus total length of patterns plus number of matches
input set of patterns
text string
isDeterministic true
isExactMatching true
isOnline true
namedAfter Alfred V. Aho NERFINISHED
Margaret J. Corasick NERFINISHED
operatesOn finite alphabet
originalTitleOfPaper Efficient string matching: an aid to bibliographic search
output all occurrences of patterns in the text
publishedIn Communications of the ACM NERFINISHED
relatedTo Knuth–Morris–Pratt algorithm NERFINISHED
finite automata theory
trie data structure
spaceComplexity O(m * Σ)
linear in total pattern length times alphabet size
subfield algorithms
pattern matching
string algorithms
supports overlapping matches
simultaneous search of many patterns
usedIn DNA sequence analysis
digital forensics
intrusion detection systems
search engines
spam filtering
text editors and IDEs
usesConcept failure links
output links
usesDataStructure finite automaton
trie
yearIntroduced 1975

Referenced by (1)

Full triples — surface form annotated when it differs from this entity's canonical label.

Alfred V. Aho notableConcept Aho–Corasick algorithm