Class WordTokenizer
java.lang.Object
weka.core.tokenizers.Tokenizer
weka.core.tokenizers.CharacterDelimitedTokenizer
weka.core.tokenizers.WordTokenizer
- All Implemented Interfaces:
Serializable, Enumeration, OptionHandler, RevisionHandler
A simple tokenizer that uses the java.util.StringTokenizer class to tokenize strings.
Valid options are:
-delimiters <value> The delimiters to use (default ' \r\n\t.,;:'"()?!').
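As a rough illustration of what delimiter-based tokenization with java.util.StringTokenizer looks like, the sketch below splits a string on the default delimiter set listed above. It uses only the JDK and is not Weka's implementation; the names DelimiterSketch, DEFAULT_DELIMITERS, and split are hypothetical and chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper class for illustration; not part of Weka.
class DelimiterSketch {
    // Default delimiter set as given in the option description:
    // space, CR, LF, tab, and the characters . , ; : ' " ( ) ? !
    static final String DEFAULT_DELIMITERS = " \r\n\t.,;:'\"()?!";

    // Splits text on any of the delimiter characters.
    // Runs of consecutive delimiters are skipped, so no empty tokens are produced.
    static List<String> split(String text, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, delimiters);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [Hello, world, It, s, a, test]
        System.out.println(split("Hello, world! (It's a test.)", DEFAULT_DELIMITERS));
    }
}
```

Note that the apostrophe is among the default delimiters, so a contraction such as "It's" is split into two tokens. Passing a narrower delimiter string (for example only whitespace) would keep it intact.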
- Version:
- $Revision: 1.4 $
- Author:
- FracPete (fracpete at waikato dot ac dot nz)
Constructor Summary
- WordTokenizer()
Method Summary
- getRevision: Returns the revision string.
- globalInfo: Returns a string describing the tokenizer.
- boolean hasMoreElements: Tests if this enumeration contains more elements.
- static void main: Runs the tokenizer with the given options and strings to tokenize.
- nextElement: Returns the next element of this enumeration if this enumeration object has at least one more element to provide.
- void tokenize: Sets the string to tokenize.
Methods inherited from class weka.core.tokenizers.CharacterDelimitedTokenizer
delimitersTipText, getDelimiters, getOptions, listOptions, setDelimiters, setOptions
Methods inherited from class weka.core.tokenizers.Tokenizer
runTokenizer, tokenize
Methods inherited from interface java.util.Enumeration
asIterator
-
Constructor Detail
-
WordTokenizer
public WordTokenizer()
-
-
Method Detail
-
globalInfo
Returns a string describing the tokenizer.
- Specified by:
- globalInfo in class Tokenizer
- Returns:
- a description suitable for displaying in the Explorer/Experimenter GUI
-
hasMoreElements
public boolean hasMoreElements()
Tests if this enumeration contains more elements.
- Specified by:
- hasMoreElements in interface Enumeration
- Specified by:
- hasMoreElements in class Tokenizer
- Returns:
- true if and only if this enumeration object contains at least one more element to provide; false otherwise.
-
nextElement
Returns the next element of this enumeration if this enumeration object has at least one more element to provide.
- Specified by:
- nextElement in interface Enumeration
- Specified by:
- nextElement in class Tokenizer
- Returns:
- the next element of this enumeration.
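The hasMoreElements/nextElement pair follows the usual java.util.Enumeration contract, so any code written against that interface can consume the tokens. The sketch below shows the standard loop using only the JDK. java.util.StringTokenizer serves as a stand-in source here, and the names EnumerationDrainSketch and drain are hypothetical.

```java
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper class for illustration; not part of Weka.
class EnumerationDrainSketch {
    // Collects every remaining element of an Enumeration into a list.
    // Each nextElement() call is guarded by hasMoreElements().
    static List<Object> drain(Enumeration<?> source) {
        List<Object> out = new ArrayList<>();
        while (source.hasMoreElements()) {
            out.add(source.nextElement());
        }
        return out;
    }

    public static void main(String[] args) {
        // StringTokenizer is only a stand-in; it implements Enumeration<Object>.
        Enumeration<Object> tokens = new StringTokenizer("one two  three", " ");
        System.out.println(drain(tokens)); // prints [one, two, three]
    }
}
```

Because drain accepts any Enumeration, the same helper should work with a tokenizer object that implements the interface, after its input string has been set.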
-
tokenize
Sets the string to tokenize. Tokenization happens immediately.
-
getRevision
Returns the revision string.
- Returns:
- the revision
-
main
Runs the tokenizer with the given options and strings to tokenize. The tokens are printed to stdout.
- Parameters:
- args - the command-line options and strings to tokenize
-