Aho–Corasick algorithm
E672058
The Aho–Corasick algorithm is a classic string-searching algorithm that efficiently finds all occurrences of multiple patterns in a text using a trie-based finite-state machine.
Statements (48)
| Predicate | Object |
|---|---|
| instanceOf |
finite-state machine based algorithm
ⓘ
multiple-pattern matching algorithm ⓘ string-searching algorithm ⓘ |
| author |
Alfred V. Aho
NERFINISHED
ⓘ
Margaret J. Corasick NERFINISHED ⓘ |
| canBeImplementedAs | deterministic finite automaton ⓘ |
| field | computer science ⓘ |
| hasAdvantage |
efficient for large dictionaries of patterns
ⓘ
finds all pattern occurrences in a single pass over the text ⓘ |
| hasPhase |
failure function construction
ⓘ
search phase ⓘ trie construction ⓘ |
| hasProperty |
preprocessing of patterns is done once
ⓘ
search time independent of number of patterns ⓘ |
| hasTimeComplexity |
O(n + m + z)
ⓘ
linear in length of text plus total length of patterns plus number of matches ⓘ |
| input |
set of patterns
ⓘ
text string ⓘ |
| isDeterministic | true ⓘ |
| isExactMatching | true ⓘ |
| isOnline | true ⓘ |
| namedAfter |
Alfred V. Aho
NERFINISHED
ⓘ
Margaret J. Corasick NERFINISHED ⓘ |
| operatesOn | finite alphabet ⓘ |
| originalTitleOfPaper | Efficient string matching: an aid to bibliographic search ⓘ |
| output | all occurrences of patterns in the text ⓘ |
| publishedIn | Communications of the ACM NERFINISHED ⓘ |
| relatedTo |
Knuth–Morris–Pratt algorithm
NERFINISHED
ⓘ
finite automata theory ⓘ trie data structure ⓘ |
| spaceComplexity |
O(m * Σ)
ⓘ
linear in total pattern length times alphabet size ⓘ |
| subfield |
algorithms
ⓘ
pattern matching ⓘ string algorithms ⓘ |
| supports |
overlapping matches
ⓘ
simultaneous search of many patterns ⓘ |
| usedIn |
DNA sequence analysis
ⓘ
digital forensics ⓘ intrusion detection systems ⓘ search engines ⓘ spam filtering ⓘ text editors and IDEs ⓘ |
| usesConcept |
failure links
ⓘ
output links ⓘ |
| usesDataStructure |
finite automaton
ⓘ
trie ⓘ |
| yearIntroduced | 1975 ⓘ |
Referenced by (1)
Full triples — surface form annotated when it differs from this entity's canonical label.