Package weka.core

Class Stopwords

java.lang.Object
weka.core.Stopwords
All Implemented Interfaces:
RevisionHandler

public class Stopwords extends Object implements RevisionHandler
Class that can test whether a given string is a stop word. Lowercases all words before the test.

The format for reading and writing is one word per line. Lines starting with '#' are interpreted as comments and are therefore skipped.
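
The file format described above can be illustrated with a minimal sketch. It is a hypothetical stand-in that uses only the JDK, not Weka's own implementation: the class name StopwordFormatSketch and the method parse are invented for illustration, and trimming each line and skipping blank lines are assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in, not part of Weka: illustrates the documented file format only.
final class StopwordFormatSketch {

    // Reads one word per line and skips lines that start with '#'.
    // The reader is closed when parsing finishes.
    static Set<String> parse(BufferedReader reader) throws IOException {
        Set<String> words = new TreeSet<>();
        try (BufferedReader in = reader) {
            String line;
            while ((line = in.readLine()) != null) {
                line = line.trim(); // assumption: surrounding whitespace is ignored
                if (line.startsWith("#")) {
                    continue; // comment line
                }
                if (!line.isEmpty()) { // assumption: blank lines are ignored
                    words.add(line.toLowerCase(Locale.ROOT));
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        String text = "# sample list\nThe\nand\n\n# another comment\nOf\n";
        Set<String> words = parse(new BufferedReader(new StringReader(text)));
        System.out.println(words); // prints [and, of, the]
    }
}
```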

The default stopwords are based on Rainbow.

Accepts the following parameters:

-i file
loads the stopwords from the given file

-o file
saves the stopwords to the given file

-p
outputs the current stopwords on stdout

Any additional parameters are interpreted as words to test as stopwords.
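
The command-line convention above (three options, with everything else treated as a word to test) can be sketched as follows. This is an illustrative, JDK-only stand-in and not the actual main method of this class: the class StopwordArgsSketch and its fields are hypothetical names, and the handling of an option that lacks its file argument is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not part of Weka: shows one way to split the documented arguments.
final class StopwordArgsSketch {
    String input;   // value of -i, or null if absent
    String output;  // value of -o, or null if absent
    boolean print;  // true if -p was given
    final List<String> wordsToTest = new ArrayList<>();

    static StopwordArgsSketch parse(String[] args) {
        StopwordArgsSketch result = new StopwordArgsSketch();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            if (arg.equals("-i") && i + 1 < args.length) {
                result.input = args[++i];
            } else if (arg.equals("-o") && i + 1 < args.length) {
                result.output = args[++i];
            } else if (arg.equals("-p")) {
                result.print = true;
            } else {
                // Assumption: anything else, including a trailing -i or -o
                // without a file, is treated as a word to test.
                result.wordsToTest.add(arg);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        StopwordArgsSketch r = parse(new String[] {"-i", "in.txt", "-p", "the", "cat"});
        System.out.println(r.input + " " + r.print + " " + r.wordsToTest);
        // prints in.txt true [the, cat]
    }
}
```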

Version:
$Revision: 1.6 $
Author:
Eibe Frank (eibe@cs.waikato.ac.nz), Ashraf M. Kibriya (amk14@cs.waikato.ac.nz), FracPete (fracpete at waikato dot ac dot nz)
  • Constructor Summary

    Constructors
    Constructor
    Description
    Stopwords()
    Initializes the stopwords (based on Rainbow).
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    add(String word)
    Adds the given word to the stopword list (it is automatically trimmed and converted to lower case).
    void
    clear()
    Removes all stopwords.
    Enumeration
    elements()
    Returns a sorted enumeration over all stored stopwords.
    String
    getRevision()
    Returns the revision string.
    boolean
    is(String word)
    Returns true if the given string is a stop word.
    static boolean
    isStopword(String str)
    Returns true if the given string is a stop word.
    static void
    main(String[] args)
    Accepts the parameters -i, -o, and -p, followed by words to test as stopwords.
    void
    read(BufferedReader reader)
    Generates a new Stopwords object from the reader.
    void
    read(File file)
    Generates a new Stopwords object from the given file.
    void
    read(String filename)
    Generates a new Stopwords object from the given file.
    boolean
    remove(String word)
    Removes the word from the stopword list.
    String
    toString()
    Returns the current stopwords in a string.
    void
    write(BufferedWriter writer)
    Writes the current stopwords to the given writer.
    void
    write(File file)
    Writes the current stopwords to the given file.
    void
    write(String filename)
    Writes the current stopwords to the given file.

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
  • Constructor Details

    • Stopwords

      public Stopwords()
      Initializes the stopwords (based on Rainbow).
  • Method Details

    • clear

      public void clear()
      Removes all stopwords.
    • add

      public void add(String word)
      Adds the given word to the stopword list. The word is automatically trimmed and converted to lower case.
      Parameters:
      word - the word to add
    • remove

      public boolean remove(String word)
      Removes the word from the stopword list.
      Parameters:
      word - the word to remove
      Returns:
      true if the word was found in the list and then removed
    • is

      public boolean is(String word)
      Returns true if the given string is a stop word.
      Parameters:
      word - the word to test
      Returns:
      true if the word is a stopword
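
      The interplay between add (trims and lowercases) and is (lowercases before the test) can be shown with a small sketch. It is a hypothetical, JDK-only stand-in rather than this class's implementation: the name StopwordLookupSketch is invented, and ignoring empty words is an assumption.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in, not part of Weka: mirrors the documented normalisation only.
final class StopwordLookupSketch {
    private final Set<String> words = new HashSet<>();

    // Stores the word trimmed and in lower case.
    void add(String word) {
        String normalised = word.trim().toLowerCase(Locale.ROOT);
        if (!normalised.isEmpty()) { // assumption: empty words are ignored
            words.add(normalised);
        }
    }

    // Lowercases the candidate before the membership test.
    boolean is(String word) {
        return words.contains(word.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        StopwordLookupSketch stopwords = new StopwordLookupSketch();
        stopwords.add("  The ");
        System.out.println(stopwords.is("THE")); // prints true
        System.out.println(stopwords.is("cat")); // prints false
    }
}
```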
    • elements

      public Enumeration elements()
      Returns a sorted enumeration over all stored stopwords.
      Returns:
      the enumeration over all stopwords
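
      Because elements() is declared with the raw Enumeration type, callers usually need to convert each element themselves. The sketch below shows one way to copy such an enumeration into a typed list. It uses only the JDK, and a sorted set stands in for the stopword list, so the helper name toList and the sample data are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper, not part of Weka: consumes an untyped Enumeration.
final class EnumerationSketch {

    // Copies the elements into a List<String>, preserving iteration order.
    static List<String> toList(Enumeration<?> enumeration) {
        List<String> out = new ArrayList<>();
        while (enumeration.hasMoreElements()) {
            out.add(String.valueOf(enumeration.nextElement()));
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for the result of elements(): a sorted enumeration of words.
        Enumeration<String> sorted =
            Collections.enumeration(new TreeSet<>(Arrays.asList("the", "a", "of")));
        System.out.println(toList(sorted)); // prints [a, of, the]
    }
}
```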
    • read

      public void read(String filename) throws Exception
      Generates a new Stopwords object from the given file
      Parameters:
      filename - the file to read the stopwords from
      Throws:
      Exception - if reading fails
    • read

      public void read(File file) throws Exception
      Generates a new Stopwords object from the given file
      Parameters:
      file - the file to read the stopwords from
      Throws:
      Exception - if reading fails
    • read

      public void read(BufferedReader reader) throws Exception
      Generates a new Stopwords object from the reader. The reader is closed automatically.
      Parameters:
      reader - the reader to get the stopwords from
      Throws:
      Exception - if reading fails
    • write

      public void write(String filename) throws Exception
      Writes the current stopwords to the given file
      Parameters:
      filename - the file to write the stopwords to
      Throws:
      Exception - if writing fails
    • write

      public void write(File file) throws Exception
      Writes the current stopwords to the given file
      Parameters:
      file - the file to write the stopwords to
      Throws:
      Exception - if writing fails
    • write

      public void write(BufferedWriter writer) throws Exception
      Writes the current stopwords to the given writer. The writer is closed automatically.
      Parameters:
      writer - the writer to write the stopwords to
      Throws:
      Exception - if writing fails
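
      Writing in the one-word-per-line format, with the writer closed automatically, might look like the sketch below. It is a hypothetical, JDK-only stand-in and not this class's write method: the name StopwordWriteSketch and the "# stopwords" header line are assumptions made for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical stand-in, not part of Weka: writes words in the documented format.
final class StopwordWriteSketch {

    // Writes a comment header and then one word per line.
    // try-with-resources closes the writer even if writing fails.
    static void write(Iterable<String> words, BufferedWriter writer) throws IOException {
        try (BufferedWriter out = writer) {
            out.write("# stopwords"); // assumption: illustrative comment header
            out.newLine();
            for (String word : words) {
                out.write(word);
                out.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        StringWriter sink = new StringWriter();
        write(new TreeSet<>(Arrays.asList("the", "a")), new BufferedWriter(sink));
        System.out.print(sink); // header line, then "a" and "the" on separate lines
    }
}
```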
    • toString

      public String toString()
      Returns the current stopwords in a string.
      Overrides:
      toString in class Object
      Returns:
      the current stopwords
    • isStopword

      public static boolean isStopword(String str)
      Returns true if the given string is a stop word.
      Parameters:
      str - the word to test
      Returns:
      true if the word is a stopword
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Returns:
      the revision
    • main

      public static void main(String[] args) throws Exception
      Accepts the following parameters:

      -i file
      loads the stopwords from the given file

      -o file
      saves the stopwords to the given file

      -p
      outputs the current stopwords on stdout

      Any additional parameters are interpreted as words to test as stopwords.

      Parameters:
      args - command-line parameters
      Throws:
      Exception - if something goes wrong