Class ComplementNaiveBayes

java.lang.Object
weka.classifiers.Classifier
weka.classifiers.bayes.ComplementNaiveBayes
All Implemented Interfaces:
Serializable, Cloneable, CapabilitiesHandler, OptionHandler, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler

public class ComplementNaiveBayes extends Classifier implements OptionHandler, WeightedInstancesHandler, TechnicalInformationHandler
Class for building and using a Complement class Naive Bayes classifier.

For more information, see:

Jason D. Rennie, Lawrence Shih, Jaime Teevan, David R. Karger: Tackling the Poor Assumptions of Naive Bayes Text Classifiers. In: ICML, 616-623, 2003.

P.S.: TF, IDF and length normalization transforms, as described in the paper, can be performed through weka.filters.unsupervised.StringToWordVector.

BibTeX:

 @inproceedings{Rennie2003,
    author = {Jason D. Rennie and Lawrence Shih and Jaime Teevan and David R. Karger},
    booktitle = {ICML},
    pages = {616-623},
    publisher = {AAAI Press},
    title = {Tackling the Poor Assumptions of Naive Bayes Text Classifiers},
    year = {2003}
 }
 

Valid options are:

 -N
  Normalize the word weights for each class
 
 -S
  Smoothing value to avoid zero WordGivenClass probabilities (default=1.0).
 
Version:
$Revision: 5516 $
Author:
Ashraf M. Kibriya (amk14@cs.waikato.ac.nz)
  • Constructor Detail

    • ComplementNaiveBayes

      public ComplementNaiveBayes()
  • Method Detail

    • listOptions

      public Enumeration listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class Classifier
      Returns:
      an enumeration of all the available options.
    • getOptions

      public String[] getOptions()
      Gets the current settings of the classifier.
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class Classifier
      Returns:
      an array of strings suitable for passing to setOptions
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a given list of options.

      Valid options are:

       -N
        Normalize the word weights for each class
       
       -S
        Smoothing value to avoid zero WordGivenClass probabilities (default=1.0).
       
      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class Classifier
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
    • getNormalizeWordWeights

      public boolean getNormalizeWordWeights()
      Returns true if the word weights for each class are to be normalized
      Returns:
      true if the word weights are normalized
    • setNormalizeWordWeights

      public void setNormalizeWordWeights(boolean doNormalize)
      Sets whether the word weights for each class should be normalized.
      Parameters:
      doNormalize - whether the word weights are to be normalized
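A minimal sketch of what per-class weight normalization can look like, following the weight-normalization idea in the paper cited in the class description: each class's weight vector is divided by the sum of the absolute values of its weights. The class and method names here (`WeightNormalizationSketch`, `normalize`) are hypothetical and for illustration only; this is not taken from the Weka source and may differ from it in detail.

```java
import java.util.Arrays;

// Hypothetical illustration; not part of Weka.
final class WeightNormalizationSketch {

    /**
     * Returns a copy of one class's word weights, scaled so that the
     * absolute values of the weights sum to 1. An all-zero vector is
     * returned unchanged to avoid dividing by zero.
     */
    public static double[] normalize(double[] weights) {
        double sumAbs = 0.0;
        for (double w : weights) {
            sumAbs += Math.abs(w);
        }
        if (sumAbs == 0.0) {
            return weights.clone();
        }
        double[] normalized = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            normalized[i] = weights[i] / sumAbs;
        }
        return normalized;
    }

    public static void main(String[] args) {
        double[] classWeights = {-1.0, -3.0, -4.0};
        // prints [-0.125, -0.375, -0.5]
        System.out.println(Arrays.toString(normalize(classWeights)));
    }
}
```

Normalizing this way keeps classes whose weights have a larger overall magnitude from dominating the comparison between classes.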
    • normalizeWordWeightsTipText

      public String normalizeWordWeightsTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getSmoothingParameter

      public double getSmoothingParameter()
      Gets the smoothing value to be used to avoid zero WordGivenClass probabilities.
      Returns:
      the smoothing value
    • setSmoothingParameter

      public void setSmoothingParameter(double val)
      Sets the smoothing value used to avoid zero WordGivenClass probabilities.
      Parameters:
      val - the new smoothing value
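To illustrate why a smoothing value is needed, the sketch below computes complement word weights in the form given in the paper cited in the class description: for class c and word i, the weight is the log of (count of word i in all classes other than c, plus the smoothing value) divided by (total word count in all classes other than c, plus the smoothing value times the vocabulary size). The names (`ComplementSmoothingSketch`, `complementWeights`, `counts`) are hypothetical, and the exact arithmetic in Weka's implementation may differ.

```java
// Hypothetical illustration; not part of Weka.
final class ComplementSmoothingSketch {

    /**
     * counts[c][i] is the total frequency of word i in the training
     * documents of class c. Returns weights[c][i], the log of the smoothed
     * estimate that word i occurs in documents of classes other than c.
     */
    public static double[][] complementWeights(double[][] counts, double smoothing) {
        int numClasses = counts.length;
        int numWords = counts[0].length;

        double[] wordTotals = new double[numWords];    // per-word totals over all classes
        double[] classTotals = new double[numClasses]; // per-class totals over all words
        double grandTotal = 0.0;
        for (int c = 0; c < numClasses; c++) {
            for (int i = 0; i < numWords; i++) {
                wordTotals[i] += counts[c][i];
                classTotals[c] += counts[c][i];
                grandTotal += counts[c][i];
            }
        }

        double[][] weights = new double[numClasses][numWords];
        for (int c = 0; c < numClasses; c++) {
            // Word occurrences in every class except c, plus smoothing mass.
            double denominator = (grandTotal - classTotals[c]) + smoothing * numWords;
            for (int i = 0; i < numWords; i++) {
                double numerator = (wordTotals[i] - counts[c][i]) + smoothing;
                weights[c][i] = Math.log(numerator / denominator);
            }
        }
        return weights;
    }

    public static void main(String[] args) {
        double[][] counts = {{2, 0}, {0, 3}};
        // Word 0 never occurs outside class 0, so without smoothing its
        // complement estimate for class 0 is zero and the log is -Infinity.
        System.out.println(complementWeights(counts, 0.0)[0][0]);
        // With smoothing the weight stays finite: log(1/5).
        System.out.println(complementWeights(counts, 1.0)[0][0]);
    }
}
```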
    • smoothingParameterTipText

      public String smoothingParameterTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • globalInfo

      public String globalInfo()
      Returns a string describing this classifier
      Returns:
      a description of the classifier suitable for displaying in the explorer/experimenter gui
    • getTechnicalInformation

      public TechnicalInformation getTechnicalInformation()
      Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
      Specified by:
      getTechnicalInformation in interface TechnicalInformationHandler
      Returns:
      the technical information about this class
    • getCapabilities

      public Capabilities getCapabilities()
      Returns default capabilities of the classifier.
      Specified by:
      getCapabilities in interface CapabilitiesHandler
      Overrides:
      getCapabilities in class Classifier
      Returns:
      the capabilities of this classifier
    • buildClassifier

      public void buildClassifier(Instances instances) throws Exception
      Generates the classifier.
      Specified by:
      buildClassifier in class Classifier
      Parameters:
      instances - set of instances serving as training data
      Throws:
      Exception - if the classifier has not been built successfully
    • classifyInstance

      public double classifyInstance(Instance instance) throws Exception
      Classifies a given instance.

      The classification rule selects the class c that minimizes the weighted sum over all words:
      MinC(forAllWords(ti*Wci))
      where
      ti is the frequency of word i in the given instance, and
      Wci is the weight of word i in class c.

      For more information, see section 4.4 of the paper cited in the class description above.

      Overrides:
      classifyInstance in class Classifier
      Parameters:
      instance - the instance to classify
      Returns:
      the index of the class to which the instance most likely belongs.
      Throws:
      Exception - if the classifier has not been built yet.
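The classification rule stated for classifyInstance, MinC(forAllWords(ti*Wci)), can be sketched in a few lines. The sketch assumes the word weights have already been computed and are passed in as a plain array; the names (`ComplementRuleSketch`, `classify`) are hypothetical, and unlike classifyInstance this version returns the class index as an `int` rather than a `double`.

```java
// Hypothetical illustration; not part of Weka.
final class ComplementRuleSketch {

    /**
     * Returns the index of the class c that minimizes the sum over all
     * words i of termFreqs[i] * weights[c][i]. Ties go to the lowest index;
     * returns -1 if there are no classes.
     */
    public static int classify(double[] termFreqs, double[][] weights) {
        int bestClass = -1;
        double bestScore = Double.POSITIVE_INFINITY;
        for (int c = 0; c < weights.length; c++) {
            double score = 0.0;
            for (int i = 0; i < termFreqs.length; i++) {
                score += termFreqs[i] * weights[c][i];
            }
            if (score < bestScore) {
                bestScore = score;
                bestClass = c;
            }
        }
        return bestClass;
    }

    public static void main(String[] args) {
        // Complement weights: a strongly negative value for (class, word)
        // means the word is rare outside that class.
        double[][] weights = {
            {Math.log(0.2), Math.log(0.8)},   // class 0
            {Math.log(0.75), Math.log(0.25)}  // class 1
        };
        System.out.println(classify(new double[] {3, 0}, weights)); // prints 0
        System.out.println(classify(new double[] {0, 2}, weights)); // prints 1
    }
}
```

The rule takes a minimum rather than a maximum because the weights describe how well a word fits the complement of each class: the class whose complement matches the instance worst is the predicted class.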
    • toString

      public String toString()
      Prints out the internal model built by the classifier. In this case it prints out the word weights calculated when building the classifier.
      Overrides:
      toString in class Object
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class Classifier
      Returns:
      the revision
    • main

      public static void main(String[] argv)
      Main method for testing this class.
      Parameters:
      argv - the options