Class BayesianLogisticRegression
java.lang.Object
weka.classifiers.Classifier
weka.classifiers.bayes.BayesianLogisticRegression
- All Implemented Interfaces:
Serializable, Cloneable, CapabilitiesHandler, OptionHandler, RevisionHandler, TechnicalInformationHandler
public class BayesianLogisticRegression
extends Classifier
implements OptionHandler, TechnicalInformationHandler
Implements Bayesian Logistic Regression for both Gaussian and Laplace priors.
For more information, see
Alexander Genkin, David D. Lewis, David Madigan (2004). Large-scale bayesian logistic regression for text categorization. URL http://www.stat.rutgers.edu/~madigan/PAPERS/shortFat-v3a.pdf. BibTeX:
@techreport{Genkin2004,
  author = {Alexander Genkin and David D. Lewis and David Madigan},
  institution = {DIMACS},
  title = {Large-scale bayesian logistic regression for text categorization},
  year = {2004},
  URL = {http://www.stat.rutgers.edu/\~madigan/PAPERS/shortFat-v3a.pdf}
}
- Version:
- $Revision: 7984 $
- Author:
- Navendu Garg (gargnav at iit dot edu)
-
Field Summary
Modifier and Type, Field, Description
double[] BetaVector - Array for storing coefficients of Bayesian regression model.
double Change - This variable is used to keep track of change in the value of delta summation of r(i).
int ClassIndex - The class index from the training data.
static final int CV_BASED
double[] Delta - Trust Region Radius.
double[] DeltaBeta - Array to store Regression Coefficient updates.
double[] DeltaR - This vector is used to store the increments on the R(i).
double[] DeltaUpdate - Trust Region Radius Update.
static final int GAUSSIAN - Distributions available.
String HyperparameterRange - CV Hyperparameter Range.
double[] Hyperparameters - Array to store Hyperparameter values for each feature.
int HyperparameterSelection - Hyperparameter selection method.
double HyperparameterValue - Best hyperparameter for test phase.
static double[] InputHyperparameterValues - Set of values to be used as hyperparameter values during cross-validation.
int iterationCounter - Iteration counter.
static final int LAPLACIAN
static double[] LogLikelihood - Log-likelihood values to be used to choose the best hyperparameter.
Filter m_Filter - Filter interface used to point to weka.filters.unsupervised.attribute.Normalize object.
int m_seed - Seed for randomizing the instances before CV.
int maxIterations - Maximum number of iterations.
static final int NORM_BASED - Methods for selecting the hyperparameter value.
boolean NormalizeData - Choose whether to normalize data or not.
int NumFolds - NumFolds for CV-based hyperparameter selection.
int PriorClass - Distribution Prior class.
double[] R - R(i) = BetaVector X x(i) X y(i).
static final int SPECIFIC_VALUE
static final Tag[] TAGS_HYPER_METHOD
static final Tag[] TAGS_PRIOR
double Threshold - Threshold for binary classification of the probabilistic estimate.
double Tolerance - Tolerance criteria for the stopping criterion.
-
Constructor Summary
Constructor
BayesianLogisticRegression()
-
Method Summary
Modifier and Type, Method, Description
static double bigF(double r, double sigma) - This is a convenient function that defines an upper bound (Delta>0) for values of r(i) reachable by updates in the trust region.
void buildClassifier(Instances data) - (1) Set the data to the class attribute m_Instances. (2) Call the method initialize() to initialize the values.
double classifyInstance(Instance instance) - Classifies the given instance using the Bayesian Logistic Regression function.
static double classSgn(double value) - This method is used to mask the internal class labels.
double CVBasedHyperparameter() - Method computes the best hyperparameter value by doing cross-validation on the training data and computing the likelihood.
String debugTipText() - Returns the tip text for this property.
Capabilities getCapabilities() - This method tests what kind of data this classifier can handle.
String getHyperparameterRange() - Get the range of hyperparameter values to consider during CV-based selection.
SelectedTag getHyperparameterSelection() - Get the method used to select the hyperparameter.
double getHyperparameterValue() - Get the hyperparameter value.
double getLoglikeliHood(double[] betas, Instances instances)
int getMaxIterations() - Get the maximum number of iterations to perform.
int getNumFolds() - Return the number of folds for CV-based hyperparameter selection.
String[] getOptions() - Gets the current settings of the Classifier.
SelectedTag getPriorClass() - Get the type of prior to use.
String getRevision() - Returns the revision string.
int getSeed() - Get the seed for randomizing the instances for CV-based hyperparameter selection.
TechnicalInformation getTechnicalInformation() - Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
double getThreshold() - Return the threshold being used.
double getTolerance() - Get the tolerance value.
String globalInfo()
String hyperparameterRangeTipText() - Returns the tip text for this property.
String hyperparameterSelectionTipText() - Returns the tip text for this property.
String hyperparameterValueTipText() - Returns the tip text for this property.
void initialize() - (1) Initialize m_Beta[j] to 0.
boolean isDebug() - Returns true if debug is turned on.
boolean isNormalizeData() - Returns true if the data is to be normalized first.
Enumeration listOptions() - Returns an enumeration describing the available options.
static double logisticLinkFunction(double r) - This method computes the values for the logistic link function.
static void main(String[] argv) - Main method for testing this class.
String maxIterationsTipText() - Returns the tip text for this property.
String normalizeDataTipText() - Returns the tip text for this property.
double normBasedHyperParameter() - This function computes the norm-based hyperparameters and stores them in m_Hyperparameters.
String numFoldsTipText() - Returns the tip text for this property.
String priorClassTipText() - Returns the tip text for this property.
String seedTipText() - Returns the tip text for this property.
void setDebug(boolean debugMode) - Set debugging mode.
void setHyperparameterRange(String hyperparameterRange) - Set the range of hyperparameter values to consider during CV-based selection.
void setHyperparameterSelection(SelectedTag newMethod) - Set the method used to select the hyperparameter.
void setHyperparameterValue(double hyperparameterValue) - Set the hyperparameter value.
void setMaxIterations(int maxIterations) - Set the maximum number of iterations to perform.
void setNormalizeData(boolean normalizeData) - Set whether to normalize the data or not.
void setNumFolds(int numFolds) - Set the number of folds to use for CV-based hyperparameter selection.
void setOptions(String[] options) - Parses a given list of options.
void setPriorClass(SelectedTag newMethod) - Set the type of prior to use.
void setSeed(int seed) - Set the seed for randomizing the instances for CV-based hyperparameter selection.
void setThreshold(double threshold) - Set the threshold to use.
void setTolerance(double tolerance) - Set the tolerance value.
static double sgn(double r) - Sign for a given value.
boolean stoppingCriterion() - This method implements the stopping criterion function.
String thresholdTipText() - Returns the tip text for this property.
String toleranceTipText() - Returns the tip text for this property.
String toString() - Outputs the linear regression model as a string.
Methods inherited from class weka.classifiers.Classifier
distributionForInstance, forName, getDebug, makeCopies, makeCopy
-
Field Details
-
LogLikelihood
public static double[] LogLikelihood
Log-likelihood values to be used to choose the best hyperparameter.
-
InputHyperparameterValues
public static double[] InputHyperparameterValues
Set of values to be used as hyperparameter values during cross-validation.
-
NormalizeData
public boolean NormalizeData
Choose whether to normalize data or not.
-
Tolerance
public double Tolerance
Tolerance criteria for the stopping criterion.
-
Threshold
public double Threshold
Threshold for binary classification of the probabilistic estimate.
-
GAUSSIAN
public static final int GAUSSIAN
Distributions available.
-
LAPLACIAN
public static final int LAPLACIAN
-
TAGS_PRIOR
-
PriorClass
public int PriorClass
Distribution Prior class.
-
NumFolds
public int NumFolds
NumFolds for CV-based hyperparameter selection.
-
m_seed
public int m_seed
Seed for randomizing the instances before CV.
-
NORM_BASED
public static final int NORM_BASED
Methods for selecting the hyperparameter value.
-
CV_BASED
public static final int CV_BASED
-
SPECIFIC_VALUE
public static final int SPECIFIC_VALUE
-
TAGS_HYPER_METHOD
-
HyperparameterSelection
public int HyperparameterSelection
Hyperparameter selection method.
-
ClassIndex
public int ClassIndex
The class index from the training data.
-
HyperparameterValue
public double HyperparameterValue
Best hyperparameter for test phase.
-
HyperparameterRange
CV Hyperparameter Range.
-
maxIterations
public int maxIterations
Maximum number of iterations.
-
iterationCounter
public int iterationCounter
Iteration counter.
-
BetaVector
public double[] BetaVector
Array for storing coefficients of Bayesian regression model.
-
DeltaBeta
public double[] DeltaBeta
Array to store Regression Coefficient updates.
-
DeltaUpdate
public double[] DeltaUpdate
Trust Region Radius Update.
-
Delta
public double[] Delta
Trust Region Radius.
-
Hyperparameters
public double[] Hyperparameters
Array to store Hyperparameter values for each feature.
-
R
public double[] R
R(i) = BetaVector X x(i) X y(i). This is an intermediate value with respect to vector BETA, input values and corresponding class labels.
-
DeltaR
public double[] DeltaR
This vector is used to store the increments on the R(i). It is also used in determining the stopping criterion.
-
Change
public double Change
This variable is used to keep track of change in the value of delta summation of r(i).
-
m_Filter
Filter interface used to point to weka.filters.unsupervised.attribute.Normalize object.
-
Constructor Details
-
BayesianLogisticRegression
public BayesianLogisticRegression()
-
Method Details
-
globalInfo
-
initialize
(1) Initialize m_Beta[j] to 0. (2) Initialize m_DeltaUpdate[j].
- Throws:
Exception
-
getCapabilities
This method tests what kind of data this classifier can handle.
- Specified by:
getCapabilities in interface CapabilitiesHandler
- Overrides:
getCapabilities in class Classifier
- Returns:
the capabilities of this object
-
buildClassifier
(1) Set the data to the class attribute m_Instances.
(2) Call the method initialize() to initialize the values.
- Specified by:
buildClassifier in class Classifier
- Parameters:
data - training data
- Throws:
Exception - if classifier can't be built successfully.
-
classSgn
public static double classSgn(double value)
This method is used to mask the internal class labels.
- Parameters:
value - internal class label
- Returns:
-1 for internal class label 0, +1 for internal class label 1
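The label mapping described for classSgn can be illustrated with a small standalone sketch. The class name ClassSgnSketch is hypothetical and the code does not depend on Weka; how values other than 0 and 1 are treated is an assumption, since this page only documents those two inputs.

```java
class ClassSgnSketch {
    // Maps internal class label 0 to -1 and internal class label 1 to +1,
    // as stated in the Returns entry above.
    // Assumption: any non-zero value is treated like label 1.
    static double classSgn(double value) {
        return (value == 0.0) ? -1.0 : 1.0;
    }

    public static void main(String[] args) {
        double[] internalLabels = {0.0, 1.0, 1.0, 0.0};
        for (double label : internalLabels) {
            System.out.println(label + " -> " + classSgn(label));
        }
    }
}
```

A -1/+1 encoding lets a quantity such as R(i) = BetaVector X x(i) X y(i) change sign with the class label, which a 0/1 encoding would not do.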
-
getTechnicalInformation
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
the technical information about this class
-
bigF
public static double bigF(double r, double sigma)
This is a convenient function that defines an upper bound (Delta>0) for values of r(i) reachable by updates in the trust region.
- Parameters:
r - BetaVector X x(i) X y(i)
sigma - the trust-region parameter (delta), where sigma > 0
- Returns:
double function value
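This page does not give the formula behind bigF. The sketch below assumes the bound has the form used in the cited Genkin, Lewis and Madigan report: 0.25 when |r| is within the trust region, and 1/(2 + exp(|r| - delta) + exp(delta - |r|)) otherwise. Treat both the formula and the class name TrustRegionBoundSketch as assumptions rather than the class's confirmed behavior.

```java
class TrustRegionBoundSketch {
    // Assumed form of the bound: constant 0.25 inside the trust region,
    // decaying symmetrically once |r| exceeds delta.
    static double bigF(double r, double delta) {
        double absR = Math.abs(r);
        if (absR <= delta) {
            return 0.25;
        }
        return 1.0 / (2.0 + Math.exp(absR - delta) + Math.exp(delta - absR));
    }

    public static void main(String[] args) {
        double delta = 1.0;
        for (double r : new double[] {-3.0, -0.5, 0.0, 1.0, 2.0}) {
            System.out.println("F(" + r + ", " + delta + ") = " + bigF(r, delta));
        }
    }
}
```

Under this assumption the two branches agree at |r| = delta, where the second expression also evaluates to 1/(2 + 1 + 1) = 0.25.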
-
stoppingCriterion
public boolean stoppingCriterion()
This method implements the stopping criterion function.
- Returns:
boolean whether to stop or not.
-
logisticLinkFunction
public static double logisticLinkFunction(double r)
This method computes the values for the logistic link function: f(r) = exp(r)/(1+exp(r))
- Returns:
output value
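The formula f(r) = exp(r)/(1+exp(r)) overflows if evaluated literally for large positive r. The following standalone sketch shows one common way to evaluate it safely. The class name LogisticLinkSketch is hypothetical, and the branching is a design choice of this sketch, not necessarily what logisticLinkFunction does internally.

```java
class LogisticLinkSketch {
    // Numerically stable evaluation of f(r) = exp(r) / (1 + exp(r)).
    static double logistic(double r) {
        if (r >= 0) {
            // Equivalent form 1 / (1 + exp(-r)); exp(-r) is at most 1 here.
            return 1.0 / (1.0 + Math.exp(-r));
        }
        // For negative r, exp(r) is below 1, so the textbook form is safe.
        double e = Math.exp(r);
        return e / (1.0 + e);
    }

    public static void main(String[] args) {
        for (double r : new double[] {-1000.0, -2.0, 0.0, 2.0, 1000.0}) {
            System.out.println("f(" + r + ") = " + logistic(r));
        }
    }
}
```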
-
sgn
public static double sgn(double r)
Sign for a given value.
- Parameters:
r
- Returns:
double +1 if r>0, -1 if r<0
-
normBasedHyperParameter
public double normBasedHyperParameter()
This function computes the norm-based hyperparameters and stores them in m_Hyperparameters.
-
classifyInstance
Classifies the given instance using the Bayesian Logistic Regression function.
- Overrides:
classifyInstance in class Classifier
- Parameters:
instance - the test instance
- Returns:
the classification
- Throws:
Exception - if classification can't be done successfully
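As a rough illustration of how a coefficient vector and the Threshold field could combine into a class decision, the sketch below scores a feature vector, applies the logistic link, and compares the result with a threshold. It is standalone and does not use Weka's Instance type. The class name, the plain double[] inputs, and the strict ">" comparison are all assumptions made for this example.

```java
class ThresholdClassificationSketch {
    // Probability estimate: logistic link applied to the dot product beta . x.
    static double probability(double[] beta, double[] x) {
        if (beta.length != x.length) {
            throw new IllegalArgumentException("beta and x must have the same length");
        }
        double score = 0.0;
        for (int j = 0; j < beta.length; j++) {
            score += beta[j] * x[j];
        }
        return 1.0 / (1.0 + Math.exp(-score));
    }

    // Assumption: class 1 is predicted when the estimate exceeds the threshold.
    static double classify(double[] beta, double[] x, double threshold) {
        return probability(beta, x) > threshold ? 1.0 : 0.0;
    }

    public static void main(String[] args) {
        double[] beta = {1.0, -2.0};
        double[] x = {3.0, 1.0};
        System.out.println("p = " + probability(beta, x));
        System.out.println("class at 0.5 = " + classify(beta, x, 0.5));
        System.out.println("class at 0.8 = " + classify(beta, x, 0.8));
    }
}
```

Raising the threshold above the documented default of 0.5 makes the sketch more conservative about predicting class 1.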
-
toString
Outputs the linear regression model as a string.
-
CVBasedHyperparameter
Method computes the best hyperparameter value by doing cross-validation on the training data and computing the likelihood. The method can parse a range of values or a list of values.
- Returns:
Best hyperparameter value with the max likelihood value on the training data.
- Throws:
Exception
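The range and list formats accepted for CV-based selection are documented under setOptions as R:start-end,multiplier and L:val(1), val(2), ..., val(n). The standalone sketch below shows one plausible reading of those formats. It assumes the R: form expands geometrically from start, multiplying by multiplier while the value stays at or below end. The class name and that expansion rule are assumptions, not the class's confirmed parsing logic.

```java
import java.util.ArrayList;
import java.util.List;

class HyperparameterRangeSketch {
    // Expands "R:start-end,multiplier" or "L:v1, v2, ..., vn" into candidate values.
    static List<Double> parse(String spec) {
        List<Double> values = new ArrayList<>();
        if (spec.startsWith("R:")) {
            String body = spec.substring(2);
            int dash = body.indexOf('-');
            int comma = body.indexOf(',', dash);
            double start = Double.parseDouble(body.substring(0, dash).trim());
            double end = Double.parseDouble(body.substring(dash + 1, comma).trim());
            double multiplier = Double.parseDouble(body.substring(comma + 1).trim());
            if (start <= 0 || multiplier <= 1) {
                throw new IllegalArgumentException("need start > 0 and multiplier > 1");
            }
            // Assumed geometric progression: start, start*m, start*m^2, ...
            for (double v = start; v <= end; v *= multiplier) {
                values.add(v);
            }
        } else if (spec.startsWith("L:")) {
            for (String token : spec.substring(2).split(",")) {
                values.add(Double.parseDouble(token.trim()));
            }
        } else {
            throw new IllegalArgumentException("expected a spec starting with R: or L:");
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parse("R:0.01-316,3.16"));
        System.out.println(parse("L:0.1, 1, 10"));
    }
}
```

Each candidate produced this way would then be scored by cross-validated likelihood, with the highest-scoring value kept.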
-
getLoglikeliHood
- Returns:
likelihood for a given set of betas and instances
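This page does not spell out how the likelihood is computed. A minimal sketch, assuming the standard logistic log-likelihood with labels coded as -1/+1 (the coding classSgn produces), is the sum over instances of -log(1 + exp(-y(i) * beta . x(i))). The class name and the plain-array inputs are hypothetical stand-ins for Weka's Instances.

```java
class LogLikelihoodSketch {
    // Log-likelihood of a logistic model with labels y[i] in {-1, +1}.
    static double logLikelihood(double[] betas, double[][] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double dot = 0.0;
            for (int j = 0; j < betas.length; j++) {
                dot += betas[j] * x[i][j];
            }
            double r = y[i] * dot;
            // log(1 + exp(-r)), arranged so the exponent is never positive.
            double loss = (r >= 0)
                    ? Math.log1p(Math.exp(-r))
                    : -r + Math.log1p(Math.exp(r));
            sum -= loss;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[][] x = {{1.0, 2.0}, {1.0, -1.0}};
        double[] y = {1.0, -1.0};
        System.out.println(logLikelihood(new double[] {0.0, 0.0}, x, y));
        System.out.println(logLikelihood(new double[] {0.0, 1.0}, x, y));
    }
}
```

With all coefficients at zero every instance contributes -log(2), which gives a convenient sanity check.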
-
listOptions
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface OptionHandler
- Overrides:
listOptions in class Classifier
- Returns:
an enumeration of all the available options.
-
setOptions
Parses a given list of options. Valid options are:
-D Show Debugging Output
-P <integer> Distribution of the Prior (1=Gaussian, 2=Laplacian) (default: 1=Gaussian)
-H <integer> Hyperparameter Selection Method (1=Norm-based, 2=CV-based, 3=specific value) (default: 1=Norm-based)
-V <double> Specified Hyperparameter Value (use in conjunction with -H 3) (default: 0.27)
-R <string> Hyperparameter Range (use in conjunction with -H 2) (format: R:start-end,multiplier OR L:val(1), val(2), ..., val(n)) (default: R:0.01-316,3.16)
-Tl <double> Tolerance Value (default: 0.0005)
-S <double> Threshold Value (default: 0.5)
-F <integer> Number Of Folds (use in conjunction with -H 2) (default: 2)
-I <integer> Max Number of Iterations (default: 100)
-N Normalize the data
-seed <number> Seed for randomizing instances order in CV-based hyperparameter selection (default: 1)
- Specified by:
setOptions in interface OptionHandler
- Overrides:
setOptions in class Classifier
- Parameters:
options - the list of options as an array of strings
- Throws:
Exception - if an option is not supported
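To show how the flags listed for setOptions fit together, the sketch below assembles an option array for a Laplacian prior with CV-based hyperparameter selection. It is standalone so that it runs without Weka on the classpath. The helper cvLaplaceOptions and the class name are hypothetical; only the flag names and their meanings come from the list above.

```java
import java.util.ArrayList;
import java.util.List;

class BlrOptionsSketch {
    // Hypothetical helper: builds an option array using the documented flags.
    static String[] cvLaplaceOptions(String range, int folds, int seed) {
        List<String> opts = new ArrayList<>();
        opts.add("-P"); opts.add("2");          // 2 = Laplacian prior
        opts.add("-H"); opts.add("2");          // 2 = CV-based selection
        opts.add("-R"); opts.add(range);        // range used with -H 2
        opts.add("-F"); opts.add(Integer.toString(folds));
        opts.add("-seed"); opts.add(Integer.toString(seed));
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = cvLaplaceOptions("R:0.01-316,3.16", 5, 1);
        // With Weka available, an array like this would be passed to
        // setOptions(String[] options) on a BayesianLogisticRegression instance.
        System.out.println(String.join(" ", options));
    }
}
```

Keeping each flag and its value as separate array elements matches the "array of strings" form that setOptions is documented to accept.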
-
getOptions
Description copied from class: Classifier
Gets the current settings of the Classifier.
- Specified by:
getOptions in interface OptionHandler
- Overrides:
getOptions in class Classifier
- Returns:
an array of strings suitable for passing to setOptions
-
main
Main method for testing this class.
- Parameters:
argv - the options
-
debugTipText
Returns the tip text for this property.
- Overrides:
debugTipText in class Classifier
- Returns:
tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setDebug
public void setDebug(boolean debugMode)
Description copied from class: Classifier
Set debugging mode.
- Overrides:
setDebug in class Classifier
- Parameters:
debugMode - true if debug output should be printed
-
hyperparameterSelectionTipText
Returns the tip text for this property.
- Returns:
tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getHyperparameterSelection
Get the method used to select the hyperparameter.
- Returns:
the method used to select the hyperparameter
-
setHyperparameterSelection
Set the method used to select the hyperparameter.
- Parameters:
newMethod - the method used to set the hyperparameter
-
priorClassTipText
Returns the tip text for this property.
- Returns:
tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setPriorClass
Set the type of prior to use.
- Parameters:
newMethod - the type of prior to use.
-
getPriorClass
Get the type of prior to use.
- Returns:
the type of prior to use
-
thresholdTipText
Returns the tip text for this property.
- Returns:
tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getThreshold
public double getThreshold()
Return the threshold being used.
- Returns:
the threshold
-
setThreshold
public void setThreshold(double threshold)
Set the threshold to use.
- Parameters:
threshold - the threshold to use
-
toleranceTipText
Returns the tip text for this property.
- Returns:
tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getTolerance
public double getTolerance()
Get the tolerance value.
- Returns:
the tolerance value
-
setTolerance
public void setTolerance(double tolerance)
Set the tolerance value.
- Parameters:
tolerance - the tolerance value to use
-
hyperparameterValueTipText
Returns the tip text for this property.
- Returns:
tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getHyperparameterValue
public double getHyperparameterValue()
Get the hyperparameter value. Used when the hyperparameter selection method is set to specific value.
- Returns:
the hyperparameter value
-
setHyperparameterValue
public void setHyperparameterValue(double hyperparameterValue)
Set the hyperparameter value. Used when the hyperparameter selection method is set to specific value.
- Parameters:
hyperparameterValue - the value of the hyperparameter
-
numFoldsTipText
Returns the tip text for this property.
- Returns:
tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getNumFolds
public int getNumFolds()
Return the number of folds for CV-based hyperparameter selection.
- Returns:
the number of CV folds
-
setNumFolds
public void setNumFolds(int numFolds)
Set the number of folds to use for CV-based hyperparameter selection.
- Parameters:
numFolds - number of folds to select
-
seedTipText
Returns the tip text for this property.
- Returns:
tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setSeed
public void setSeed(int seed)
Set the seed for randomizing the instances for CV-based hyperparameter selection.
- Parameters:
seed - the seed to use
-
getSeed
public int getSeed()
Get the seed for randomizing the instances for CV-based hyperparameter selection.
- Returns:
the seed to use
-
maxIterationsTipText
Returns the tip text for this property.
- Returns:
tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getMaxIterations
public int getMaxIterations()
Get the maximum number of iterations to perform.
- Returns:
the maximum number of iterations
-
setMaxIterations
public void setMaxIterations(int maxIterations)
Set the maximum number of iterations to perform.
- Parameters:
maxIterations - maximum number of iterations
-
normalizeDataTipText
Returns the tip text for this property.
- Returns:
tip text for this property suitable for displaying in the explorer/experimenter GUI
-
isNormalizeData
public boolean isNormalizeData()
Returns true if the data is to be normalized first.
- Returns:
true if the data is to be normalized
-
setNormalizeData
public void setNormalizeData(boolean normalizeData)
Set whether to normalize the data or not.
- Parameters:
normalizeData - true if data is to be normalized
-
hyperparameterRangeTipText
Returns the tip text for this property.
- Returns:
tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getHyperparameterRange
Get the range of hyperparameter values to consider during CV-based selection.
- Returns:
the range of hyperparameters as a String
-
setHyperparameterRange
Set the range of hyperparameter values to consider during CV-based selection.
- Parameters:
hyperparameterRange - the range of hyperparameter values
-
isDebug
public boolean isDebug()
Returns true if debug is turned on.
- Returns:
true if debug is turned on
-
getRevision
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Overrides:
getRevision in class Classifier
- Returns:
the revision
-