Class BayesianLogisticRegression

java.lang.Object
weka.classifiers.Classifier
weka.classifiers.bayes.BayesianLogisticRegression
All Implemented Interfaces:
Serializable, Cloneable, CapabilitiesHandler, OptionHandler, RevisionHandler, TechnicalInformationHandler

public class BayesianLogisticRegression extends Classifier implements OptionHandler, TechnicalInformationHandler
Implements Bayesian Logistic Regression for both Gaussian and Laplace Priors.

For more information, see

Alexander Genkin, David D. Lewis, David Madigan (2004). Large-scale bayesian logistic regression for text categorization. URL http://www.stat.rutgers.edu/~madigan/PAPERS/shortFat-v3a.pdf.

BibTeX:

 @techreport{Genkin2004,
    author = {Alexander Genkin and David D. Lewis and David Madigan},
    institution = {DIMACS},
    title = {Large-scale bayesian logistic regression for text categorization},
    year = {2004},
    URL = {http://www.stat.rutgers.edu/\~madigan/PAPERS/shortFat-v3a.pdf}
 }
 

Version:
$Revision: 7984 $
Author:
Navendu Garg (gargnav at iit dot edu)
  • Field Detail

    • LogLikelihood

      public static double[] LogLikelihood
      Log-likelihood values to be used to choose the best hyperparameter.
    • InputHyperparameterValues

      public static double[] InputHyperparameterValues
      Set of values to be used as hyperparameter values during Cross-Validation.
    • NormalizeData

      public boolean NormalizeData
      Whether to normalize the data.
    • Tolerance

      public double Tolerance
      Tolerance for the stopping criterion.
    • Threshold

      public double Threshold
      Threshold for binary classification of the probabilistic estimate.
    • GAUSSIAN

      public static final int GAUSSIAN
      Distributions available
    • LAPLACIAN

      public static final int LAPLACIAN
    • TAGS_PRIOR

      public static final Tag[] TAGS_PRIOR
    • PriorClass

      public int PriorClass
      Distribution Prior class
    • NumFolds

      public int NumFolds
      Number of folds for CV-based hyperparameter selection
    • m_seed

      public int m_seed
      Seed for randomizing the instances before CV
    • NORM_BASED

      public static final int NORM_BASED
      Methods for selecting the hyperparameter value
    • CV_BASED

      public static final int CV_BASED
    • SPECIFIC_VALUE

      public static final int SPECIFIC_VALUE
    • TAGS_HYPER_METHOD

      public static final Tag[] TAGS_HYPER_METHOD
    • HyperparameterSelection

      public int HyperparameterSelection
      Hyperparameter selection method
    • ClassIndex

      public int ClassIndex
      The class index from the training data
    • HyperparameterValue

      public double HyperparameterValue
      Best hyperparameter for test phase
    • HyperparameterRange

      public String HyperparameterRange
      CV Hyperparameter Range
    • maxIterations

      public int maxIterations
      Maximum number of iterations
    • iterationCounter

      public int iterationCounter
      Iteration counter
    • BetaVector

      public double[] BetaVector
      Array for storing coefficients of Bayesian regression model.
    • DeltaBeta

      public double[] DeltaBeta
      Array to store Regression Coefficient updates.
    • DeltaUpdate

      public double[] DeltaUpdate
      Trust Region Radius Update
    • Delta

      public double[] Delta
      Trust Region Radius
    • Hyperparameters

      public double[] Hyperparameters
      Array to store Hyperparameter values for each feature.
    • R

      public double[] R
      R(i) = BetaVector X x(i) X y(i). This is an intermediate value computed from the vector BETA, the input values, and the corresponding class labels.
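      The definition of R(i) can be illustrated with a small sketch. It assumes dense feature vectors and labels already mapped to -1/+1; the class and method names (MarginSketch, computeR) are hypothetical and not part of this class.

```java
final class MarginSketch {

    /** Dot product of the coefficient vector and one feature vector. */
    static double dot(double[] beta, double[] x) {
        double sum = 0.0;
        for (int j = 0; j < beta.length; j++) {
            sum += beta[j] * x[j];
        }
        return sum;
    }

    /** r[i] = (beta . x[i]) * y[i], with labels y[i] assumed to be -1 or +1. */
    static double[] computeR(double[] beta, double[][] x, double[] y) {
        double[] r = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            r[i] = dot(beta, x[i]) * y[i];
        }
        return r;
    }
}
```

      A positive r[i] means the current coefficients place instance i on the side of its own label.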
    • DeltaR

      public double[] DeltaR
      This vector is used to store the increments on the R(i). It is also used to evaluate the stopping criterion.
    • Change

      public double Change
      This variable keeps track of the change in the summed increments (deltas) of r(i).
    • m_Filter

      public Filter m_Filter
      Filter reference that points to a weka.filters.unsupervised.attribute.Normalize object
  • Constructor Detail

    • BayesianLogisticRegression

      public BayesianLogisticRegression()
  • Method Detail

    • globalInfo

      public String globalInfo()
    • initialize

      public void initialize() throws Exception
       (1) Initialize m_Beta[j] to 0.
       (2) Initialize m_DeltaUpdate[j].
       
      Throws:
      Exception
    • getCapabilities

      public Capabilities getCapabilities()
      Returns the capabilities of this classifier, i.e. the kinds of data it can handle.
      Specified by:
      getCapabilities in interface CapabilitiesHandler
      Overrides:
      getCapabilities in class Classifier
      Returns:
      the capabilities of this object
    • buildClassifier

      public void buildClassifier(Instances data) throws Exception
      • (1) Set the data to the class attribute m_Instances.
      • (2) Call the method initialize() to initialize the values.
      Specified by:
      buildClassifier in class Classifier
      Parameters:
      data - training data
      Throws:
      Exception - if classifier can't be built successfully.
    • classSgn

      public static double classSgn(double value)
      This method is used to mask the internal class labels.
      Parameters:
      value - internal class label
      Returns:
       
      • -1 for internal class label 0
      • +1 for internal class label 1
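      The mapping above can be sketched in a few lines. This is an illustration of the documented behavior, not the class's source; the name ClassSgnSketch is hypothetical, and treating every non-zero label as +1 is an assumption.

```java
final class ClassSgnSketch {

    /** Maps internal label 0 to -1; any other label (assumed to be 1) maps to +1. */
    static double classSgn(double value) {
        return (value == 0.0) ? -1.0 : 1.0;
    }

    /** Converts an array of internal 0/1 labels into -1/+1 labels. */
    static double[] toSigned(double[] labels) {
        double[] signed = new double[labels.length];
        for (int i = 0; i < labels.length; i++) {
            signed[i] = classSgn(labels[i]);
        }
        return signed;
    }
}
```

      The signed form is what lets a single expression such as BetaVector X x(i) X y(i) cover both classes.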
    • getTechnicalInformation

      public TechnicalInformation getTechnicalInformation()
      Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
      Specified by:
      getTechnicalInformation in interface TechnicalInformationHandler
      Returns:
      the technical information about this class
    • bigF

      public static double bigF(double r, double sigma)
      This is a convenience function that defines an upper bound (Delta>0) for values of r(i) reachable by updates in the trust region.
      Parameters:
      r - BetaVector X x(i)y(i)
      sigma - a parameter where sigma > 0 (referred to as delta in the description)
      Returns:
      double function value
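      A possible form of this bound, following the formulation in the cited Genkin, Lewis and Madigan report as best it can be reconstructed here, is 0.25 inside the trust region and 1 / (2 + exp(|r| - sigma) + exp(sigma - |r|)) outside it. The sketch below is an assumption about that formula, not a copy of this class's code; the name BigFSketch is hypothetical.

```java
final class BigFSketch {

    /**
     * Assumed bound: 0.25 when |r| <= sigma, otherwise
     * 1 / (2 + exp(|r| - sigma) + exp(sigma - |r|)).
     */
    static double bigF(double r, double sigma) {
        double absR = Math.abs(r);
        if (absR <= sigma) {
            return 0.25;
        }
        return 1.0 / (2.0 + Math.exp(absR - sigma) + Math.exp(sigma - absR));
    }
}
```

      The two branches agree at |r| = sigma, where the second expression evaluates to 1 / (2 + 1 + 1) = 0.25, so the bound is continuous.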
    • stoppingCriterion

      public boolean stoppingCriterion()
      This method implements the stopping criterion function.
      Returns:
      boolean whether to stop or not.
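      One plausible reading of the stopping rule, based on the convergence test described in the cited report, is to stop when the summed absolute increments of r(i) are small relative to 1 plus the summed absolute r(i), or when the iteration limit is reached. The sketch below assumes that rule; it may differ in detail from this class's implementation, and the name StoppingSketch is hypothetical.

```java
final class StoppingSketch {

    /**
     * Returns true when sum|deltaR[i]| / (1 + sum|r[i]|) <= tolerance,
     * or when the iteration counter has reached maxIterations.
     */
    static boolean shouldStop(double[] deltaR, double[] r, double tolerance,
                              int iteration, int maxIterations) {
        double sumDeltaR = 0.0;
        double sumR = 1.0; // the leading 1 keeps the ratio finite when every r[i] is 0
        for (int i = 0; i < r.length; i++) {
            sumDeltaR += Math.abs(deltaR[i]);
            sumR += Math.abs(r[i]);
        }
        return (sumDeltaR / sumR <= tolerance) || (iteration >= maxIterations);
    }
}
```

      With the documented defaults (tolerance 0.0005, 100 iterations), the loop ends either on convergence or after the hundredth pass, whichever comes first.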
    • logisticLinkFunction

      public static double logisticLinkFunction(double r)
      This method computes the values for the logistic link function.
      f(r)=exp(r)/(1+exp(r))
      Returns:
      output value
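      Evaluating exp(r)/(1+exp(r)) literally overflows for large positive r (the quotient becomes NaN). A common workaround, shown here as a sketch rather than as this class's code, switches to the algebraically equal form 1/(1+exp(-r)) when r is non-negative. The name LogisticLinkSketch is hypothetical.

```java
final class LogisticLinkSketch {

    /** Numerically stable f(r) = exp(r) / (1 + exp(r)). */
    static double logistic(double r) {
        if (r >= 0.0) {
            // Equal to exp(r) / (1 + exp(r)); exp(-r) cannot overflow here.
            return 1.0 / (1.0 + Math.exp(-r));
        }
        double e = Math.exp(r); // r < 0, so e lies in [0, 1)
        return e / (1.0 + e);
    }

    public static void main(String[] args) {
        System.out.println(logistic(0.0)); // prints 0.5
        System.out.println(logistic(1000.0));
        System.out.println(logistic(-1000.0));
    }
}
```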
    • sgn

      public static double sgn(double r)
      Sign for a given value.
      Parameters:
      r - the value whose sign is returned
      Returns:
      double +1 if r>0, -1 if r<0
    • normBasedHyperParameter

      public double normBasedHyperParameter()
      This function computes the norm-based hyperparameters and stores them in m_Hyperparameters.
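      The cited report suggests a norm-based default of d / u, where d is the number of features and u is the mean squared Euclidean norm of the training vectors. The sketch below assumes that heuristic and a dense feature matrix without the class column; whether this class computes exactly the same quantity is not confirmed here, and the name NormBasedSketch is hypothetical.

```java
final class NormBasedSketch {

    /** Returns d / u, with u the mean of the squared Euclidean norms of the rows of x. */
    static double normBasedValue(double[][] x) {
        if (x.length == 0) {
            throw new IllegalArgumentException("no training instances");
        }
        int d = x[0].length;
        double sumSquaredNorms = 0.0;
        for (double[] row : x) {
            for (double v : row) {
                sumSquaredNorms += v * v;
            }
        }
        double u = sumSquaredNorms / x.length;
        return d / u; // infinite if every feature value is 0
    }
}
```

      For two instances (3, 4) and (0, 0), the squared norms are 25 and 0, so u = 12.5 and the result is 2 / 12.5 = 0.16.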
    • classifyInstance

      public double classifyInstance(Instance instance) throws Exception
      Classifies the given instance using the Bayesian Logistic Regression function.
      Overrides:
      classifyInstance in class Classifier
      Parameters:
      instance - the test instance
      Returns:
      the classification
      Throws:
      Exception - if classification can't be done successfully
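      Conceptually, classification applies the logistic link to the linear score and compares the result with the threshold. The sketch below assumes an intercept stored in beta[0] and a strict "greater than" comparison; both are assumptions, and the name ThresholdSketch is hypothetical.

```java
final class ThresholdSketch {

    static double logistic(double r) {
        return 1.0 / (1.0 + Math.exp(-r));
    }

    /**
     * beta[0] is treated as the intercept and beta[j + 1] as the weight of x[j].
     * Returns 1.0 when the estimated probability exceeds the threshold, else 0.0.
     */
    static double classify(double[] beta, double[] x, double threshold) {
        double score = beta[0];
        for (int j = 0; j < x.length; j++) {
            score += beta[j + 1] * x[j];
        }
        return logistic(score) > threshold ? 1.0 : 0.0;
    }
}
```

      Raising the threshold above the default 0.5 makes the positive class harder to predict, which trades recall for precision.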
    • toString

      public String toString()
      Outputs the Bayesian logistic regression model as a string.
      Overrides:
      toString in class Object
      Returns:
      the model as string
    • CVBasedHyperparameter

      public double CVBasedHyperparameter() throws Exception
      Computes the best hyperparameter value by running cross-validation on the training data and computing the likelihood. The method can parse a range of values or a list of values.
      Returns:
      Best hyperparameter value with the max likelihood value on the training data.
      Throws:
      Exception
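      The two accepted formats (R:start-end,multiplier and L:val(1), val(2), ..., val(n)) can be expanded into candidate values, after which the candidate with the largest log-likelihood wins. The sketch below is one way to do this, assuming the R form denotes a geometric grid; it is not this class's parser, and the names HyperparameterRangeSketch, parse and argMax are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class HyperparameterRangeSketch {

    /** Expands "R:start-end,multiplier" or "L:v1, v2, ..., vn" into candidate values. */
    static double[] parse(String spec) {
        List<Double> values = new ArrayList<>();
        if (spec.startsWith("R:")) {
            String[] rangeAndStep = spec.substring(2).split(",");
            String[] bounds = rangeAndStep[0].split("-");
            if (rangeAndStep.length != 2 || bounds.length != 2) {
                throw new IllegalArgumentException("Expected R:start-end,multiplier: " + spec);
            }
            double start = Double.parseDouble(bounds[0].trim());
            double end = Double.parseDouble(bounds[1].trim());
            double multiplier = Double.parseDouble(rangeAndStep[1].trim());
            if (!(start > 0.0) || !(multiplier > 1.0)
                    || Double.isNaN(end) || Double.isInfinite(end)) {
                throw new IllegalArgumentException("Need start > 0, multiplier > 1, finite end: " + spec);
            }
            // Geometric grid: start, start*m, start*m^2, ... while the value is <= end.
            for (double v = start; v <= end; v *= multiplier) {
                values.add(v);
            }
        } else if (spec.startsWith("L:")) {
            for (String token : spec.substring(2).split(",")) {
                values.add(Double.parseDouble(token.trim()));
            }
        } else {
            throw new IllegalArgumentException("Unknown range format: " + spec);
        }
        double[] result = new double[values.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = values.get(i);
        }
        return result;
    }

    /** Index of the largest log-likelihood, i.e. the best candidate. */
    static int argMax(double[] logLikelihoods) {
        int best = 0;
        for (int i = 1; i < logLikelihoods.length; i++) {
            if (logLikelihoods[i] > logLikelihoods[best]) {
                best = i;
            }
        }
        return best;
    }
}
```

      Because the R form is split on "-", this sketch does not accept negative bounds or exponent notation such as 1e-2.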
    • getLoglikeliHood

      public double getLoglikeliHood(double[] betas, Instances instances)
      Returns:
      likelihood for a given set of betas and instances
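      For logistic regression with labels in -1/+1, the log-likelihood is the sum over instances of -log(1 + exp(-r(i))), where r(i) = betas . x(i) * y(i). The sketch below computes that sum on plain arrays instead of an Instances object; it illustrates the standard formula and is not this class's code. The name LogLikelihoodSketch is hypothetical.

```java
final class LogLikelihoodSketch {

    /** Sum over i of -log(1 + exp(-r_i)), with r_i = (beta . x[i]) * y[i] and y[i] in {-1, +1}. */
    static double logLikelihood(double[] beta, double[][] x, double[] y) {
        double total = 0.0;
        for (int i = 0; i < x.length; i++) {
            double r = 0.0;
            for (int j = 0; j < beta.length; j++) {
                r += beta[j] * x[i][j];
            }
            r *= y[i];
            // log(1 + exp(-r)) rewritten as max(-r, 0) + log1p(exp(-|r|)) to avoid overflow.
            total -= Math.max(-r, 0.0) + Math.log1p(Math.exp(-Math.abs(r)));
        }
        return total;
    }
}
```

      The value is never positive, and it approaches 0 as every instance is classified correctly with a large margin.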
    • listOptions

      public Enumeration listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class Classifier
      Returns:
      an enumeration of all the available options.
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a given list of options.

      Valid options are:

       -D
        Show Debugging Output
       
       -P <integer>
        Distribution of the Prior (1=Gaussian, 2=Laplacian)
        (default: 1=Gaussian)
       -H <integer>
        Hyperparameter Selection Method (1=Norm-based, 2=CV-based, 3=specific value)
        (default: 1=Norm-based)
       -V <double>
        Specified Hyperparameter Value (use in conjunction with -H 3)
        (default: 0.27)
       -R <string>
        Hyperparameter Range (use in conjunction with -H 2)
        (format: R:start-end,multiplier OR L:val(1), val(2), ..., val(n))
        (default: R:0.01-316,3.16)
       -Tl <double>
        Tolerance Value
        (default: 0.0005)
       -S <double>
        Threshold Value
        (default: 0.5)
       -F <integer>
        Number Of Folds (use in conjunction with -H 2)
        (default: 2)
       -I <integer>
        Max Number of Iterations
        (default: 100)
       -N
        Normalize the data
       -seed <number>
        Seed for randomizing instances order
        in CV-based hyperparameter selection
        (default: 1)
      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class Classifier
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
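      An options array of the kind setOptions expects can be assembled from the flags listed above, with each flag and its value as separate elements. The helper below only builds the array and does not call Weka; the class and method names (BlrOptionsSketch, cvLaplaceOptions) are hypothetical, and the chosen combination of flags is just an example.

```java
import java.util.ArrayList;
import java.util.List;

final class BlrOptionsSketch {

    /** Builds options for a Laplacian prior with CV-based hyperparameter selection. */
    static String[] cvLaplaceOptions(int folds, String range, int seed) {
        List<String> opts = new ArrayList<>();
        opts.add("-P"); opts.add("2");                       // 2 = Laplacian prior
        opts.add("-H"); opts.add("2");                       // 2 = CV-based selection
        opts.add("-R"); opts.add(range);                     // used together with -H 2
        opts.add("-F"); opts.add(Integer.toString(folds));   // used together with -H 2
        opts.add("-seed"); opts.add(Integer.toString(seed)); // shuffling seed for CV
        opts.add("-N");                                      // normalize the data
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = cvLaplaceOptions(5, "R:0.01-316,3.16", 1);
        System.out.println(String.join(" ", options));
    }
}
```

      The resulting array could then be handed to setOptions(String[]) on a BayesianLogisticRegression instance.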
    • getOptions

      public String[] getOptions()
      Description copied from class: Classifier
      Gets the current settings of the Classifier.
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class Classifier
      Returns:
      an array of strings suitable for passing to setOptions
    • main

      public static void main(String[] argv)
      Main method for testing this class.
      Parameters:
      argv - the options
    • debugTipText

      public String debugTipText()
      Returns the tip text for this property
      Overrides:
      debugTipText in class Classifier
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setDebug

      public void setDebug(boolean debugMode)
      Description copied from class: Classifier
      Set debugging mode.
      Overrides:
      setDebug in class Classifier
      Parameters:
      debugMode - true if debug output should be printed
    • hyperparameterSelectionTipText

      public String hyperparameterSelectionTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getHyperparameterSelection

      public SelectedTag getHyperparameterSelection()
      Get the method used to select the hyperparameter
      Returns:
      the method used to select the hyperparameter
    • setHyperparameterSelection

      public void setHyperparameterSelection(SelectedTag newMethod)
      Set the method used to select the hyperparameter
      Parameters:
      newMethod - the method used to set the hyperparameter
    • priorClassTipText

      public String priorClassTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setPriorClass

      public void setPriorClass(SelectedTag newMethod)
      Set the type of prior to use.
      Parameters:
      newMethod - the type of prior to use.
    • getPriorClass

      public SelectedTag getPriorClass()
      Get the type of prior to use.
      Returns:
      the type of prior to use
    • thresholdTipText

      public String thresholdTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getThreshold

      public double getThreshold()
      Return the threshold being used.
      Returns:
      the threshold
    • setThreshold

      public void setThreshold(double threshold)
      Set the threshold to use.
      Parameters:
      threshold - the threshold to use
    • toleranceTipText

      public String toleranceTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getTolerance

      public double getTolerance()
      Get the tolerance value
      Returns:
      the tolerance value
    • setTolerance

      public void setTolerance(double tolerance)
      Set the tolerance value
      Parameters:
      tolerance - the tolerance value to use
    • hyperparameterValueTipText

      public String hyperparameterValueTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getHyperparameterValue

      public double getHyperparameterValue()
      Get the hyperparameter value. Used when the hyperparameter selection method is set to a specific value.
      Returns:
      the hyperparameter value
    • setHyperparameterValue

      public void setHyperparameterValue(double hyperparameterValue)
      Set the hyperparameter value. Used when the hyperparameter selection method is set to a specific value.
      Parameters:
      hyperparameterValue - the value of the hyperparameter
    • numFoldsTipText

      public String numFoldsTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getNumFolds

      public int getNumFolds()
      Return the number of folds for CV-based hyperparameter selection
      Returns:
      the number of CV folds
    • setNumFolds

      public void setNumFolds(int numFolds)
      Set the number of folds to use for CV-based hyperparameter selection
      Parameters:
      numFolds - number of folds to select
    • seedTipText

      public String seedTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setSeed

      public void setSeed(int seed)
      Set the seed for randomizing the instances for CV-based hyperparameter selection
      Parameters:
      seed - the seed to use
    • getSeed

      public int getSeed()
      Get the seed for randomizing the instances for CV-based hyperparameter selection
      Returns:
      the seed to use
    • maxIterationsTipText

      public String maxIterationsTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getMaxIterations

      public int getMaxIterations()
      Get the maximum number of iterations to perform
      Returns:
      the maximum number of iterations
    • setMaxIterations

      public void setMaxIterations(int maxIterations)
      Set the maximum number of iterations to perform
      Parameters:
      maxIterations - maximum number of iterations
    • normalizeDataTipText

      public String normalizeDataTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • isNormalizeData

      public boolean isNormalizeData()
      Returns true if the data is to be normalized first
      Returns:
      true if the data is to be normalized
    • setNormalizeData

      public void setNormalizeData(boolean normalizeData)
      Set whether to normalize the data or not
      Parameters:
      normalizeData - true if data is to be normalized
    • hyperparameterRangeTipText

      public String hyperparameterRangeTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getHyperparameterRange

      public String getHyperparameterRange()
      Get the range of hyperparameter values to consider during CV-based selection.
      Returns:
      the range of hyperparameters as a String
    • setHyperparameterRange

      public void setHyperparameterRange(String hyperparameterRange)
      Set the range of hyperparameter values to consider during CV-based selection
      Parameters:
      hyperparameterRange - the range of hyperparameter values
    • isDebug

      public boolean isDebug()
      Returns true if debug is turned on.
      Returns:
      true if debug is turned on
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class Classifier
      Returns:
      the revision