public class AhoCorasickDoubleArrayTrie<V> extends Object implements Serializable, Dictionary
Modifier and Type | Class and Description |
---|---|
class | AhoCorasickDoubleArrayTrie.Hit<V>: A result output |
static interface | AhoCorasickDoubleArrayTrie.IHit<V>: Processor that handles the output when a keyword is hit |
static interface | AhoCorasickDoubleArrayTrie.IHitFull<V>: Processor that handles the output when a keyword is hit, with more detail |

Modifier and Type | Field and Description |
---|---|
protected int[] | base: base array of the Double Array Trie structure |
protected int[] | check: check array of the Double Array Trie structure |
protected int[] | fail: fail table of the Aho-Corasick automaton |
protected int[] | l: the length of every key |
protected int[][] | output: output table of the Aho-Corasick automaton |
protected int | size: the size of the base and check arrays |
protected V[] | v: outer value array |

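
The `base` and `check` fields follow the usual double-array trie idea: a transition from state `s` on character `c` probes slot `base[s] + code(c)` and is accepted only if `check` at that slot points back to `s`. The sketch below shows that textbook scheme in a self-contained form. It is not this class's actual code: the name `DoubleArraySketch`, the `terminal` array, the fixed capacity of 256 slots, and the restriction to lowercase `a` to `z` are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a textbook double-array trie, not the layout used by the real class.
final class DoubleArraySketch {
    final int[] base = new int[256];             // base[s]: offset added to a character code
    final int[] check = new int[256];            // check[t]: parent state of slot t, -1 if free
    final boolean[] terminal = new boolean[256]; // hypothetical helper: marks the end of a key

    private static final class Node {
        final TreeMap<Character, Node> children = new TreeMap<>();
        boolean isEnd;
    }

    DoubleArraySketch(String... keys) {
        // 1. Build an ordinary pointer-based trie.
        Node root = new Node();
        for (String key : keys) {
            Node n = root;
            for (char c : key.toCharArray()) {
                code(c); // rejects characters outside the demo alphabet
                n = n.children.computeIfAbsent(c, k -> new Node());
            }
            n.isEnd = true;
        }
        // 2. Pack it into base/check, breadth first. Slot 0 is the root.
        Arrays.fill(check, -1);
        check[0] = 0;
        ArrayDeque<Node> nodes = new ArrayDeque<>();
        ArrayDeque<Integer> slots = new ArrayDeque<>();
        nodes.add(root);
        slots.add(0);
        while (!nodes.isEmpty()) {
            Node n = nodes.poll();
            int s = slots.poll();
            terminal[s] = n.isEnd;
            if (n.children.isEmpty()) continue;
            int b = 1;
            while (!fits(b, n)) b++; // smallest offset whose target slots are all free
            base[s] = b;
            for (Map.Entry<Character, Node> e : n.children.entrySet()) {
                int t = b + code(e.getKey());
                check[t] = s;
                nodes.add(e.getValue());
                slots.add(t);
            }
        }
    }

    private boolean fits(int b, Node n) {
        for (char c : n.children.keySet()) {
            int t = b + code(c);
            if (t >= check.length) throw new IllegalStateException("demo capacity exceeded");
            if (check[t] != -1) return false;
        }
        return true;
    }

    private static int code(char c) {
        if (c < 'a' || c > 'z') throw new IllegalArgumentException("demo supports a-z only");
        return c - 'a' + 1;
    }

    /** Returns the next state, or -1 if there is no such transition. */
    int transition(int state, char c) {
        if (c < 'a' || c > 'z') return -1;
        int t = base[state] + code(c);
        return (t < check.length && check[t] == state) ? t : -1;
    }

    boolean contains(String key) {
        int s = 0;
        for (int i = 0; i < key.length() && s >= 0; i++) s = transition(s, key.charAt(i));
        return s >= 0 && terminal[s];
    }

    public static void main(String[] args) {
        DoubleArraySketch trie = new DoubleArraySketch("he", "she", "his", "hers");
        System.out.println(trie.contains("she"));
        System.out.println(trie.contains("sh"));
    }
}
```

The point of the layout is that a lookup costs one addition and one comparison per character, with no per-node map or pointer chasing.
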
Constructor and Description |
---|
AhoCorasickDoubleArrayTrie() |
Modifier and Type | Method and Description |
---|---|
void | add(String item): Adds a single word to the dictionary |
void | addAll(List<String> items): Adds a batch of words to the dictionary |
void | build(Map<String,V> map): Build an AhoCorasickDoubleArrayTrie from a map |
void | clear(): Removes all words from the dictionary |
boolean | contains(String item): Checks whether the text is a word |
boolean | contains(String item, int start, int length): Checks whether the specified span of the text is a word |
int | exactMatchSearch(String key): Match exactly by a key |
V | get(int index): Pick the value by index in the value array. Note that, for efficiency, this method does NOT check the parameter |
V | get(String key): Get a value by a String key, just like a map.get() method |
int | getMaxLength(): Maximum length of the words in the dictionary, in characters |
void | load(ObjectInputStream in): Load |
static void | main(String[] args) |
void | parseText(char[] text, AhoCorasickDoubleArrayTrie.IHit<V> processor): Parse text |
void | parseText(char[] text, AhoCorasickDoubleArrayTrie.IHitFull<V> processor): Parse text |
List<AhoCorasickDoubleArrayTrie.Hit<V>> | parseText(String text) |
void | parseText(String text, AhoCorasickDoubleArrayTrie.IHit<V> processor): Parse text |
List<AhoCorasickDoubleArrayTrie.Hit<V>> | parseText(String text, int start, int length): Parse text |
void | remove(String item): Removes a single word from the dictionary |
void | removeAll(List<String> items): Removes a batch of words from the dictionary |
void | save(ObjectOutputStream out): Save |
int | size(): Get the size of the keywords |
protected int | transition(int current, char c): Transition of a state |
protected int | transitionWithRoot(int nodePos, char c): Transition of a state; if the state is the root and the transition fails, returns the root |

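
The `fail` and `output` tables and the processor-style `parseText` overloads summarized above correspond to the classic Aho-Corasick scan: follow goto transitions, fall back along fail links on a mismatch, and report every key recorded in the output list of the state reached. The sketch below illustrates that technique with plain maps instead of a double array. It is an assumption-laden stand-in, not the class's implementation: `AhoCorasickSketch`, `HitHandler`, and the `hit(begin, end, value)` signature are hypothetical, since this page does not show the real `IHit` method.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: map-based Aho-Corasick, not the double-array version documented here.
final class AhoCorasickSketch<V> {
    /** Hypothetical callback; the real IHit signature is not shown on this page. */
    interface HitHandler<V> { void hit(int begin, int end, V value); }

    private final List<Map<Character, Integer>> next = new ArrayList<>(); // goto table
    private final List<Integer> fail = new ArrayList<>();                 // fail table
    private final List<List<Integer>> output = new ArrayList<>();         // key indexes per state
    private final List<String> keys = new ArrayList<>();                  // gives each key's length
    private final List<V> values = new ArrayList<>();                     // value array

    AhoCorasickSketch(Map<String, V> map) {
        newState(); // state 0 is the root
        for (Map.Entry<String, V> e : map.entrySet()) {
            if (e.getKey().isEmpty()) continue; // empty keys are skipped in this sketch
            int s = 0;
            for (char c : e.getKey().toCharArray()) {
                Integer t = next.get(s).get(c);
                if (t == null) { t = newState(); next.get(s).put(c, t); }
                s = t;
            }
            output.get(s).add(keys.size());
            keys.add(e.getKey());
            values.add(e.getValue());
        }
        // Breadth-first pass: depth-1 states fail to the root; deeper states
        // inherit from the fail link of their parent.
        ArrayDeque<Integer> queue = new ArrayDeque<>(next.get(0).values());
        while (!queue.isEmpty()) {
            int s = queue.poll();
            for (Map.Entry<Character, Integer> e : next.get(s).entrySet()) {
                int t = e.getValue();
                int f = fail.get(s);
                while (f != 0 && !next.get(f).containsKey(e.getKey())) f = fail.get(f);
                Integer viaFail = next.get(f).get(e.getKey());
                fail.set(t, viaFail == null ? 0 : viaFail);
                output.get(t).addAll(output.get(fail.get(t))); // suffix matches also end here
                queue.add(t);
            }
        }
    }

    private int newState() {
        next.add(new HashMap<>());
        fail.add(0);
        output.add(new ArrayList<>());
        return next.size() - 1;
    }

    /** Scans the text once and reports each match as [begin, end) plus its value. */
    void parseText(String text, HitHandler<V> handler) {
        int s = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            while (s != 0 && !next.get(s).containsKey(c)) s = fail.get(s);
            s = next.get(s).getOrDefault(c, 0); // at the root, a failed transition stays at the root
            for (int k : output.get(s)) {
                handler.hit(i + 1 - keys.get(k).length(), i + 1, values.get(k));
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> map = new LinkedHashMap<>();
        for (String k : new String[] {"he", "she", "his", "hers"}) map.put(k, k.toUpperCase());
        new AhoCorasickSketch<>(map).parseText("ushers",
                (begin, end, value) -> System.out.println(begin + ":" + end + "=" + value));
    }
}
```

In this sketch the root fallback inside `parseText` plays the role that the summary attributes to `transitionWithRoot`, and the stored key lengths play the role of the `l` field when a match's start offset is computed.
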
protected int[] check
protected int[] base
protected int[] fail
protected int[][] output
protected V[] v
protected int[] l
protected int size
public List<AhoCorasickDoubleArrayTrie.Hit<V>> parseText(String text)
public List<AhoCorasickDoubleArrayTrie.Hit<V>> parseText(String text, int start, int length)
text - The text
public void parseText(String text, AhoCorasickDoubleArrayTrie.IHit<V> processor)
text - The text
processor - A processor which handles the output
public void parseText(char[] text, AhoCorasickDoubleArrayTrie.IHit<V> processor)
text - The text
processor - A processor which handles the output
public void parseText(char[] text, AhoCorasickDoubleArrayTrie.IHitFull<V> processor)
text - The text
processor - A processor which handles the output
public void save(ObjectOutputStream out) throws IOException
out - An ObjectOutputStream object
IOException - Some IOException
public void load(ObjectInputStream in) throws IOException, ClassNotFoundException
in - An ObjectInputStream object
IOException
ClassNotFoundException
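
The `save` and `load` signatures above suggest a plain Java object-stream round trip. The sketch below shows one way such a pair could work for two `int[]` tables. The class `SaveLoadSketch`, the choice of which arrays are written, and their order are assumptions; the actual on-disk format of AhoCorasickDoubleArrayTrie is not described on this page.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Arrays;

// Illustrative only: the real class may write different fields in a different order.
final class SaveLoadSketch {
    int[] base;
    int[] check;

    void save(ObjectOutputStream out) throws IOException {
        out.writeObject(base);
        out.writeObject(check);
    }

    void load(ObjectInputStream in) throws IOException, ClassNotFoundException {
        // Must read in exactly the order used by save().
        base = (int[]) in.readObject();
        check = (int[]) in.readObject();
    }

    /** Saves to an in-memory buffer and loads into a fresh instance. */
    static SaveLoadSketch roundTrip(SaveLoadSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                original.save(out);
            }
            SaveLoadSketch copy = new SaveLoadSketch();
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                copy.load(in);
            }
            return copy;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SaveLoadSketch a = new SaveLoadSketch();
        a.base = new int[] {1, 3, 0};
        a.check = new int[] {0, -1, 0};
        SaveLoadSketch b = roundTrip(a);
        System.out.println(Arrays.equals(a.base, b.base) && Arrays.equals(a.check, b.check));
    }
}
```

In real use the streams would typically wrap a `FileOutputStream` and `FileInputStream` so that a built trie can be reused without rebuilding it from the map.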
public V get(String key)
key - The key
public V get(int index)
index - The index
public int getMaxLength()
Specified by: getMaxLength in interface Dictionary
public boolean contains(String item, int start, int length)
Specified by: contains in interface Dictionary
item - the text
start - the index at which the specified span of the text starts
length - the length of the specified span
For example, contains("我爱写程序", 3, 2) asks whether "程序" ("program") is a word defined in the dictionary.
public boolean contains(String item)
Specified by: contains in interface Dictionary
item - the text
public void addAll(List<String> items)
Specified by: addAll in interface Dictionary
items - a collection in which each element is a word
public void add(String item)
Specified by: add in interface Dictionary
item - the word
public void removeAll(List<String> items)
Specified by: removeAll in interface Dictionary
items - a collection in which each element is a word
public void remove(String item)
Specified by: remove in interface Dictionary
item - the word
public void clear()
Specified by: clear in interface Dictionary
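
The `Dictionary` methods above describe a word set that can be queried either with a whole string or with a `start`/`length` window into a longer text. The sketch below mirrors that contract with a `HashSet`, purely to show the windowed `contains` semantics and a possible `getMaxLength`. `DictionarySketch` is a hypothetical stand-in; the real class answers these queries from its trie arrays, and its behaviour for out-of-range windows is not documented here (this sketch returns `false`).

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: a set-backed stand-in for the Dictionary contract.
final class DictionarySketch {
    private final Set<String> words = new HashSet<>();

    void add(String item) { words.add(item); }
    void addAll(List<String> items) { words.addAll(items); }
    void remove(String item) { words.remove(item); }
    void removeAll(List<String> items) { words.removeAll(items); }
    void clear() { words.clear(); }

    /** Length, in characters, of the longest word currently stored. */
    int getMaxLength() {
        int max = 0;
        for (String w : words) max = Math.max(max, w.length());
        return max;
    }

    boolean contains(String item) {
        return contains(item, 0, item.length());
    }

    /** Checks whether item[start, start + length) is a word, without the caller slicing it. */
    boolean contains(String item, int start, int length) {
        if (start < 0 || length < 0 || start + length > item.length()) return false;
        return words.contains(item.substring(start, start + length));
    }

    public static void main(String[] args) {
        DictionarySketch dict = new DictionarySketch();
        dict.addAll(List.of("love", "coding"));
        // Same idea as the documented example: test a window of a longer text.
        System.out.println(dict.contains("ilovecoding", 5, 6)); // window is "coding"
        System.out.println(dict.getMaxLength());
    }
}
```

A segmenter can use `getMaxLength()` to bound the window sizes it tries at each position, so it never probes spans longer than the longest dictionary word.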
protected int transition(int current, char c)
current -
c -
protected int transitionWithRoot(int nodePos, char c)
nodePos -
c -
public void build(Map<String,V> map)
map - a map containing key-value pairs
public int exactMatchSearch(String key)
key - the key
public int size()
public static void main(String[] args)
Copyright © 2014–2015 APDPlat. All rights reserved.