Class Logistic
java.lang.Object
weka.classifiers.Classifier
weka.classifiers.functions.Logistic
- All Implemented Interfaces:
Serializable, Cloneable, CapabilitiesHandler, OptionHandler, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler
public class Logistic
extends Classifier
implements OptionHandler, WeightedInstancesHandler, TechnicalInformationHandler
Class for building and using a multinomial logistic regression model with a ridge estimator.
There are some modifications, however, compared to the paper of le Cessie and van Houwelingen (1992):
If there are k classes for n instances with m attributes, the parameter matrix B to be calculated will be an m*(k-1) matrix.
The probability for class j with the exception of the last class is
Pj(Xi) = exp(Xi*Bj)/((sum[j=1..(k-1)]exp(Xi*Bj))+1)
The last class has probability
1-(sum[j=1..(k-1)]Pj(Xi))
= 1/((sum[j=1..(k-1)]exp(Xi*Bj))+1)
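The two probability formulas above can be sketched in plain Java as follows. This is a minimal illustration, not the class's actual implementation: the class name `ClassProbabilitySketch`, the method name, and the array layout `b[attribute][class]` are assumptions made for the example, and any intercept term is assumed to be already included in `x`.

```java
import java.util.Arrays;

class ClassProbabilitySketch {
    /**
     * x: attribute values of one instance (length m).
     * b: parameter matrix, b[a][j] for attribute a and class j (m rows, k-1 columns).
     * Returns k probabilities; the last entry belongs to the last class.
     */
    static double[] probabilities(double[] x, double[][] b) {
        int kMinus1 = b[0].length;
        double[] p = new double[kMinus1 + 1];
        double denom = 1.0; // the "+1" term in the denominator
        for (int j = 0; j < kMinus1; j++) {
            double score = 0.0;
            for (int a = 0; a < x.length; a++) {
                score += x[a] * b[a][j]; // Xi*Bj
            }
            p[j] = Math.exp(score);
            denom += p[j];
        }
        for (int j = 0; j < kMinus1; j++) {
            p[j] /= denom;
        }
        p[kMinus1] = 1.0 / denom; // probability of the last class
        return p;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0};
        double[][] b = {{0.5, -0.5}, {0.25, 0.0}};
        System.out.println(Arrays.toString(probabilities(x, b)));
    }
}
```

The sketch does not guard against overflow in `Math.exp` for large scores; a production implementation would typically do so.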
The (negative) multinomial log-likelihood is thus:
L = -sum[i=1..n]{
sum[j=1..(k-1)](Yij * ln(Pj(Xi)))
+(1 - (sum[j=1..(k-1)]Yij))
* ln(1 - sum[j=1..(k-1)]Pj(Xi))
} + ridge * (B^2)
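As a rough illustration of the objective above, the following Java sketch evaluates L for a given B. It rests on several assumptions that are not stated on this page: Yij is a 0/1 indicator of the observed class, so only one ln(P) term per instance survives; `ridge * (B^2)` is read as the ridge value times the sum of squared entries of B; and the class and method names are hypothetical. It uses the identity ln(Pj(Xi)) = Xi*Bj - ln(1 + sum exp(Xi*Bj)), with a score of 0 for the last class.

```java
class RidgeLogLikelihoodSketch {
    /**
     * xs[i]: attribute values of instance i (length m).
     * ys[i]: observed class index of instance i in 0..k-1 (k-1 is the last class).
     * b[a][j]: parameter for attribute a and class j (m rows, k-1 columns).
     */
    static double negLogLikelihood(double[][] xs, int[] ys, double[][] b, double ridge) {
        int kMinus1 = b[0].length;
        double nll = 0.0;
        for (int i = 0; i < xs.length; i++) {
            double[] score = new double[kMinus1];
            double denom = 1.0; // the last class contributes exp(0) = 1
            for (int j = 0; j < kMinus1; j++) {
                for (int a = 0; a < xs[i].length; a++) {
                    score[j] += xs[i][a] * b[a][j];
                }
                denom += Math.exp(score[j]);
            }
            // Only the observed class has Yij = 1, so one log-probability term remains.
            double observedScore = ys[i] < kMinus1 ? score[ys[i]] : 0.0;
            nll -= observedScore - Math.log(denom);
        }
        // Assumed reading of "ridge * (B^2)": ridge times the sum of squared entries.
        double sumSquares = 0.0;
        for (double[] row : b) {
            for (double v : row) {
                sumSquares += v * v;
            }
        }
        return nll + ridge * sumSquares;
    }
}
```

With all parameters at zero every class has probability 1/k, so the value reduces to n * ln(k), which is a convenient sanity check.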
To find the matrix B that minimises L, a Quasi-Newton method searches for the optimal values of the m*(k-1) variables. Before the optimization procedure runs, the matrix B is 'squeezed' into an m*(k-1) vector. For details of the optimization procedure, see the weka.core.Optimization class.
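The 'squeezing' of B into a vector can be pictured with the sketch below. The element ordering (attribute by attribute) and the names `FlattenSketch`, `flatten`, and `unflatten` are assumptions for illustration only; this page does not say which ordering the class uses internally.

```java
class FlattenSketch {
    /** Copies an m x (k-1) matrix into a vector of length m*(k-1), row by row. */
    static double[] flatten(double[][] b) {
        int rows = b.length;
        int cols = b[0].length;
        double[] v = new double[rows * cols];
        for (int a = 0; a < rows; a++) {
            for (int j = 0; j < cols; j++) {
                v[a * cols + j] = b[a][j];
            }
        }
        return v;
    }

    /** Inverse of flatten: rebuilds the rows x cols matrix from the vector. */
    static double[][] unflatten(double[] v, int rows, int cols) {
        double[][] b = new double[rows][cols];
        for (int a = 0; a < rows; a++) {
            for (int j = 0; j < cols; j++) {
                b[a][j] = v[a * cols + j];
            }
        }
        return b;
    }
}
```

A general-purpose optimizer only needs to see the flat vector; the matrix shape is restored when probabilities are computed.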
Although the original logistic regression does not deal with instance weights, the algorithm is modified slightly to handle them.
For more information see:
le Cessie, S., van Houwelingen, J.C. (1992). Ridge Estimators in Logistic Regression. Applied Statistics. 41(1):191-201.
Note: Missing values are replaced using a ReplaceMissingValuesFilter, and nominal attributes are transformed into numeric attributes using a NominalToBinaryFilter.
BibTeX:
@article{leCessie1992,
  author = {le Cessie, S. and van Houwelingen, J.C.},
  journal = {Applied Statistics},
  number = {1},
  pages = {191-201},
  title = {Ridge Estimators in Logistic Regression},
  volume = {41},
  year = {1992}
}
Valid options are:
-D Turn on debugging output.
-R <ridge> Set the ridge in the log-likelihood.
-M <number> Set the maximum number of iterations (default -1, until convergence).
- Version:
- $Revision: 5523 $
- Author:
- Xin Xu (xx5@cs.waikato.ac.nz)
- See Also:
-
Constructor Summary
Constructor -
Method Summary
Modifier and Type | Method | Description
void | buildClassifier(Instances train) | Builds the classifier.
double[][] | coefficients() | Returns the coefficients for this logistic model.
String | debugTipText() | Returns the tip text for this property.
double[] | distributionForInstance(Instance instance) | Computes the distribution for a given instance.
Capabilities | getCapabilities() | Returns default capabilities of the classifier.
boolean | getDebug() | Gets whether debugging output will be printed.
int | getMaxIts() | Gets the value of MaxIts.
String[] | getOptions() | Gets the current settings of the classifier.
String | getRevision() | Returns the revision string.
double | getRidge() | Gets the ridge in the log-likelihood.
TechnicalInformation | getTechnicalInformation() | Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
String | globalInfo() | Returns a string describing this classifier.
Enumeration | listOptions() | Returns an enumeration describing the available options.
static void | main(String[] argv) | Main method for testing this class.
String | maxItsTipText() | Returns the tip text for this property.
String | ridgeTipText() | Returns the tip text for this property.
void | setDebug(boolean debug) | Sets whether debugging output will be printed.
void | setMaxIts(int newMaxIts) | Sets the value of MaxIts.
void | setOptions(String[] options) | Parses a given list of options.
void | setRidge(double ridge) | Sets the ridge in the log-likelihood.
String | toString() | Gets a string describing the classifier.
Methods inherited from class weka.classifiers.Classifier
classifyInstance, forName, makeCopies, makeCopy
-
Constructor Details
-
Logistic
public Logistic()
-
-
Method Details
-
globalInfo
Returns a string describing this classifier.
- Returns:
- a description of the classifier suitable for displaying in the explorer/experimenter GUI
-
getTechnicalInformation
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
listOptions
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface OptionHandler
- Overrides:
listOptions in class Classifier
- Returns:
- an enumeration of all the available options
-
setOptions
Parses a given list of options. Valid options are:
-D Turn on debugging output.
-R <ridge> Set the ridge in the log-likelihood.
-M <number> Set the maximum number of iterations (default -1, until convergence).
- Specified by:
setOptions in interface OptionHandler
- Overrides:
setOptions in class Classifier
- Parameters:
options - the list of options as an array of strings
- Throws:
Exception - if an option is not supported
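To make the option format concrete, here is a small stand-alone parser for the three flags listed above. It is a hypothetical sketch and does not reproduce the class's own `setOptions` logic: the class name `LogisticOptionsSketch`, its fields, the placeholder ridge default, and the use of `IllegalArgumentException` are all assumptions made for the example. Only the flag names and the `-1` default for `-M` come from this page.

```java
class LogisticOptionsSketch {
    boolean debug = false;
    double ridge = 1e-8; // placeholder default chosen for this sketch, not taken from this page
    int maxIts = -1;     // -1 means "until convergence", as stated in the option text

    /** Parses an array such as {"-R", "0.5", "-M", "100"} into a settings object. */
    static LogisticOptionsSketch parse(String[] options) {
        LogisticOptionsSketch parsed = new LogisticOptionsSketch();
        for (int i = 0; i < options.length; i++) {
            String flag = options[i];
            if (flag.equals("-D")) {
                parsed.debug = true;
            } else if (flag.equals("-R") || flag.equals("-M")) {
                if (i + 1 >= options.length) {
                    throw new IllegalArgumentException("Missing value for " + flag);
                }
                String value = options[++i];
                if (flag.equals("-R")) {
                    parsed.ridge = Double.parseDouble(value);
                } else {
                    parsed.maxIts = Integer.parseInt(value);
                }
            } else {
                throw new IllegalArgumentException("Unsupported option: " + flag);
            }
        }
        return parsed;
    }
}
```

An options array is a flat list of strings, so a flag that takes a value is followed by that value as the next element.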
-
getOptions
Gets the current settings of the classifier.
- Specified by:
getOptions in interface OptionHandler
- Overrides:
getOptions in class Classifier
- Returns:
- an array of strings suitable for passing to setOptions
-
debugTipText
Returns the tip text for this property.
- Overrides:
debugTipText in class Classifier
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setDebug
public void setDebug(boolean debug)
Sets whether debugging output will be printed.
- Overrides:
setDebug in class Classifier
- Parameters:
debug - true if debugging output should be printed
-
getDebug
public boolean getDebug()
Gets whether debugging output will be printed.
- Overrides:
getDebug in class Classifier
- Returns:
- true if debugging output will be printed
-
ridgeTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setRidge
public void setRidge(double ridge)
Sets the ridge in the log-likelihood.
- Parameters:
ridge - the ridge
-
getRidge
public double getRidge()
Gets the ridge in the log-likelihood.
- Returns:
- the ridge
-
maxItsTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getMaxIts
public int getMaxIts()
Gets the value of MaxIts.
- Returns:
- the value of MaxIts
-
setMaxIts
public void setMaxIts(int newMaxIts)
Sets the value of MaxIts.
- Parameters:
newMaxIts - the value to assign to MaxIts
-
getCapabilities
Returns default capabilities of the classifier.
- Specified by:
getCapabilities in interface CapabilitiesHandler
- Overrides:
getCapabilities in class Classifier
- Returns:
- the capabilities of this classifier
- See Also:
-
buildClassifier
Builds the classifier.
- Specified by:
buildClassifier in class Classifier
- Parameters:
train - the training data to be used for generating the classifier
- Throws:
Exception - if the classifier could not be built successfully
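The class description notes that missing values are replaced and nominal attributes are turned into numeric ones before the model is fitted. The sketch below shows the general idea of both steps in plain Java. It is a simplified stand-in, not the filters themselves: the names `PreprocessSketch`, `replaceMissingWithMean`, and `toIndicators` are hypothetical, `NaN` is assumed to mark a missing numeric value, and one indicator per nominal value is assumed.

```java
class PreprocessSketch {
    /** Replaces NaN entries of a numeric column with the mean of the observed entries. */
    static double[] replaceMissingWithMean(double[] column) {
        double sum = 0.0;
        int observed = 0;
        for (double v : column) {
            if (!Double.isNaN(v)) {
                sum += v;
                observed++;
            }
        }
        double mean = observed > 0 ? sum / observed : 0.0;
        double[] out = column.clone();
        for (int i = 0; i < out.length; i++) {
            if (Double.isNaN(out[i])) {
                out[i] = mean;
            }
        }
        return out;
    }

    /** Encodes a nominal value (given by its index) as 0/1 indicator attributes. */
    static double[] toIndicators(int valueIndex, int numValues) {
        double[] out = new double[numValues];
        out[valueIndex] = 1.0;
        return out;
    }
}
```

After both steps every attribute is numeric and complete, which is what the Xi*Bj products in the model require.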
-
distributionForInstance
Computes the distribution for a given instance.
- Overrides:
distributionForInstance in class Classifier
- Parameters:
instance - the instance for which the distribution is computed
- Returns:
- the distribution
- Throws:
Exception - if the distribution can't be computed successfully
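A class distribution is usually turned into a single prediction by picking the most probable class. The helper below is a minimal sketch of that step, assuming the distribution is a `double[]` with one probability per class; the names `ArgMaxSketch` and `mostLikelyClass` are hypothetical and not part of this class.

```java
class ArgMaxSketch {
    /** Returns the index of the largest entry; ties go to the lowest index. */
    static int mostLikelyClass(double[] distribution) {
        int best = 0;
        for (int j = 1; j < distribution.length; j++) {
            if (distribution[j] > distribution[best]) {
                best = j;
            }
        }
        return best;
    }
}
```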
-
coefficients
public double[][] coefficients()
Returns the coefficients for this logistic model. The first dimension indexes the attributes, and the second the classes.
- Returns:
- the coefficients for this logistic model
-
toString
Gets a string describing the classifier.
-
getRevision
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Overrides:
getRevision in class Classifier
- Returns:
- the revision
-
main
Main method for testing this class.
- Parameters:
argv - should contain the command line arguments to the scheme (see Evaluation)
-