Class ThresholdCurve

java.lang.Object
weka.classifiers.evaluation.ThresholdCurve
All Implemented Interfaces:
RevisionHandler

public class ThresholdCurve extends Object implements RevisionHandler
Generates points illustrating the prediction tradeoffs that can be obtained by varying the threshold value between classes. For example, the typical threshold value of 0.5 means that the predicted probability of "positive" must be higher than 0.5 for the instance to be predicted as "positive". The resulting dataset can be used to visualize the precision/recall tradeoff or for ROC curve analysis (true positive rate vs. false positive rate). In each case, Weka simply varies the threshold on the class probability estimates. The AUC is calculated with the Mann-Whitney statistic.
Version:
$Revision: 7833 $
Author:
Len Trigg (len@reeltwo.com)
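
The thresholding described above can be illustrated with a minimal, self-contained sketch. It is not Weka's implementation: the class ThresholdSketch, the method confusionAt, and the sample data are hypothetical names and values chosen for illustration. The sketch counts true/false positives and negatives for one threshold, predicting "positive" only when the probability is strictly higher than the threshold.

```java
/** Hypothetical helper, not part of Weka: confusion counts at one threshold. */
final class ThresholdSketch {

    /**
     * Returns {TP, FN, FP, TN} for the given threshold.
     * An instance is predicted "positive" when its probability is
     * strictly higher than the threshold.
     */
    static int[] confusionAt(double[] positiveProb, boolean[] isPositive, double threshold) {
        int tp = 0, fn = 0, fp = 0, tn = 0;
        for (int i = 0; i < positiveProb.length; i++) {
            boolean predictedPositive = positiveProb[i] > threshold;
            if (isPositive[i]) {
                if (predictedPositive) tp++; else fn++;
            } else {
                if (predictedPositive) fp++; else tn++;
            }
        }
        return new int[] {tp, fn, fp, tn};
    }

    public static void main(String[] args) {
        // Made-up probabilities of "positive" and the actual labels.
        double[] prob = {0.9, 0.8, 0.6, 0.4, 0.3, 0.1};
        boolean[] actual = {true, true, false, true, false, false};
        // Sweep a few thresholds and print one row of counts per threshold.
        for (double t : new double[] {0.2, 0.5, 0.7}) {
            int[] c = confusionAt(prob, actual, t);
            System.out.printf("threshold=%.1f TP=%d FN=%d FP=%d TN=%d%n", t, c[0], c[1], c[2], c[3]);
        }
    }
}
```

Sweeping the threshold in this way gives one row of counts per threshold value, which is the idea behind the points this class generates.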
  • Field Detail

    • RELATION_NAME

      public static final String RELATION_NAME
      The name of the relation used in threshold curve datasets
    • TRUE_POS_NAME

      public static final String TRUE_POS_NAME
      attribute name: True Positives
    • FALSE_NEG_NAME

      public static final String FALSE_NEG_NAME
      attribute name: False Negatives
    • FALSE_POS_NAME

      public static final String FALSE_POS_NAME
      attribute name: False Positives
    • TRUE_NEG_NAME

      public static final String TRUE_NEG_NAME
      attribute name: True Negatives
    • FP_RATE_NAME

      public static final String FP_RATE_NAME
      attribute name: False Positive Rate
    • TP_RATE_NAME

      public static final String TP_RATE_NAME
      attribute name: True Positive Rate
    • PRECISION_NAME

      public static final String PRECISION_NAME
      attribute name: Precision
    • RECALL_NAME

      public static final String RECALL_NAME
      attribute name: Recall
    • FALLOUT_NAME

      public static final String FALLOUT_NAME
      attribute name: Fallout
    • FMEASURE_NAME

      public static final String FMEASURE_NAME
      attribute name: FMeasure
    • SAMPLE_SIZE_NAME

      public static final String SAMPLE_SIZE_NAME
      attribute name: Sample Size
    • LIFT_NAME

      public static final String LIFT_NAME
      attribute name: Lift
    • THRESHOLD_NAME

      public static final String THRESHOLD_NAME
      attribute name: Threshold
  • Constructor Detail

    • ThresholdCurve

      public ThresholdCurve()
  • Method Detail

    • getCurve

      public Instances getCurve(FastVector predictions)
      Calculates the performance stats for the default class and returns the results as a set of Instances. The structure of these Instances is as follows:

      • True Positives
      • False Negatives
      • False Positives
      • True Negatives
      • False Positive Rate
      • True Positive Rate
      • Precision
      • Recall
      • Fallout
      • Threshold: the probability threshold that gives rise to the preceding performance values.

      For the definitions of these measures, see TwoClassStats.

      Parameters:
      predictions - the predictions to base the curve on
      Returns:
      datapoints as a set of instances, or null if no predictions have been made.
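
To make the rate-style attributes listed above concrete, here is a minimal sketch that derives them from the four confusion counts using the standard textbook definitions. It is an illustration only: the class RatesSketch and its methods are hypothetical, returning 0 for a zero denominator is an assumption of this sketch, and TwoClassStats remains the authoritative source for how Weka defines each measure.

```java
/** Hypothetical helper, not part of Weka: textbook measures from confusion counts. */
final class RatesSketch {

    /** Division that returns 0 when the denominator is 0 (an assumption of this sketch). */
    static double ratio(double numerator, double denominator) {
        return denominator == 0 ? 0.0 : numerator / denominator;
    }

    /** TP / (TP + FN). */
    static double truePositiveRate(int tp, int fn) {
        return ratio(tp, tp + fn);
    }

    /** FP / (FP + TN). */
    static double falsePositiveRate(int fp, int tn) {
        return ratio(fp, fp + tn);
    }

    /** TP / (TP + FP). */
    static double precision(int tp, int fp) {
        return ratio(tp, tp + fp);
    }

    /** Recall uses the same formula as the true positive rate. */
    static double recall(int tp, int fn) {
        return truePositiveRate(tp, fn);
    }

    /** Harmonic mean of precision and recall. */
    static double fMeasure(int tp, int fp, int fn) {
        double p = precision(tp, fp);
        double r = recall(tp, fn);
        return ratio(2 * p * r, p + r);
    }

    public static void main(String[] args) {
        // Made-up counts for one threshold.
        int tp = 2, fn = 1, fp = 1, tn = 2;
        System.out.println("TP rate   = " + truePositiveRate(tp, fn));
        System.out.println("FP rate   = " + falsePositiveRate(fp, tn));
        System.out.println("Precision = " + precision(tp, fp));
        System.out.println("Recall    = " + recall(tp, fn));
        System.out.println("F-measure = " + fMeasure(tp, fp, fn));
    }
}
```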
    • getCurve

      public Instances getCurve(FastVector predictions, int classIndex)
      Calculates the performance stats for the desired class and returns the results as a set of Instances.
      Parameters:
      predictions - the predictions to base the curve on
      classIndex - index of the class of interest.
      Returns:
      datapoints as a set of instances.
    • getNPointPrecision

      public static double getNPointPrecision(Instances tcurve, int n)
      Calculates the n-point precision, which is the precision averaged over n samples of the curve that are evenly spaced with respect to recall.
      Parameters:
      tcurve - a previously extracted threshold curve Instances.
      n - the number of points to average over.
      Returns:
      the n-point precision.
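
One way to read "precision averaged over n evenly spaced samples" is sketched below. This is an assumption-laden illustration, not Weka's code: the class NPointPrecisionSketch is hypothetical, it works on plain recall/precision arrays rather than on Instances, it assumes the n recall levels are 0, 1/(n-1), ..., 1, and it assumes linear interpolation between the two nearest curve points.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical helper, not part of Weka: n-point averaged precision. */
final class NPointPrecisionSketch {

    /** Averages interpolated precision at n evenly spaced recall levels (n >= 2). */
    static double nPointPrecision(double[] recall, double[] precision, int n) {
        if (recall.length == 0 || n < 2) {
            return Double.NaN;
        }
        // Sort point indices by recall so neighbours can be found by a scan.
        Integer[] order = new Integer[recall.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> recall[i]));

        double sum = 0.0;
        for (int k = 0; k < n; k++) {
            double target = (double) k / (n - 1);
            sum += interpolate(recall, precision, order, target);
        }
        return sum / n;
    }

    /** Linear interpolation of precision at the target recall level. */
    static double interpolate(double[] r, double[] p, Integer[] order, double target) {
        int first = order[0];
        int last = order[order.length - 1];
        if (target <= r[first]) return p[first];
        if (target >= r[last]) return p[last];
        for (int j = 1; j < order.length; j++) {
            int lo = order[j - 1];
            int hi = order[j];
            if (target <= r[hi]) {
                double span = r[hi] - r[lo];
                if (span == 0) return p[hi];
                double w = (target - r[lo]) / span;
                return p[lo] + w * (p[hi] - p[lo]);
            }
        }
        return p[last];
    }

    public static void main(String[] args) {
        // Made-up curve points, deliberately unsorted.
        double[] recall = {1.0, 0.0, 0.5};
        double[] precision = {0.5, 1.0, 0.8};
        System.out.println(nPointPrecision(recall, precision, 5));
    }
}
```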
    • getROCArea

      public static double getROCArea(Instances tcurve)
      Calculates the area under the ROC curve as the Wilcoxon-Mann-Whitney statistic.
      Parameters:
      tcurve - a previously extracted threshold curve Instances.
      Returns:
      the ROC area, or Double.NaN if tcurve is not a set of Instances generated by ThresholdCurve.
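
The Wilcoxon-Mann-Whitney view of the AUC can be sketched directly on raw scores: it is the fraction of (positive, negative) pairs in which the positive instance receives the higher score, with ties counted as one half. The code below is a hedged illustration and not Weka's implementation, which works from the threshold curve Instances; the class AucSketch and the sample data are hypothetical.

```java
/** Hypothetical helper, not part of Weka: AUC as the Mann-Whitney statistic. */
final class AucSketch {

    /**
     * Fraction of (positive, negative) pairs ranked correctly; ties count 0.5.
     * Returns NaN when there is no positive or no negative instance.
     */
    static double auc(double[] positiveProb, boolean[] isPositive) {
        double wins = 0.0;
        long pairs = 0;
        for (int i = 0; i < positiveProb.length; i++) {
            if (!isPositive[i]) continue;
            for (int j = 0; j < positiveProb.length; j++) {
                if (isPositive[j]) continue;
                pairs++;
                if (positiveProb[i] > positiveProb[j]) {
                    wins += 1.0;
                } else if (positiveProb[i] == positiveProb[j]) {
                    wins += 0.5;
                }
            }
        }
        return pairs == 0 ? Double.NaN : wins / pairs;
    }

    public static void main(String[] args) {
        // Made-up scores: 3 positives and 3 negatives, so 9 pairs in total.
        double[] prob = {0.9, 0.8, 0.6, 0.4, 0.3, 0.1};
        boolean[] actual = {true, true, false, true, false, false};
        System.out.println(auc(prob, actual));
    }
}
```

The pairwise loop is quadratic in the number of instances; it is chosen here for clarity, not speed.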
    • getThresholdInstance

      public static int getThresholdInstance(Instances tcurve, double threshold)
      Gets the index of the instance whose threshold value is closest to the desired target.
      Parameters:
      tcurve - a set of instances that have been generated by this class
      threshold - the target threshold
      Returns:
      the index of the instance whose threshold is closest to the target, or -1 if it could not be found (i.e. no data, or a bad threshold target)
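
The lookup described above amounts to a nearest-value search. The sketch below shows the idea on a plain array of threshold values; it is not Weka's code. The class ClosestThresholdSketch is hypothetical, and treating a target outside [0, 1] (or NaN) as a "bad threshold target" is an assumption of this sketch.

```java
/** Hypothetical helper, not part of Weka: index of the closest threshold. */
final class ClosestThresholdSketch {

    /**
     * Returns the index of the value closest to target, or -1 when there is
     * no data or the target is not a probability in [0, 1] (assumed meaning
     * of a "bad" target). On ties, the earliest index wins.
     */
    static int closestIndex(double[] thresholds, double target) {
        if (thresholds.length == 0 || Double.isNaN(target) || target < 0.0 || target > 1.0) {
            return -1;
        }
        int best = 0;
        for (int i = 1; i < thresholds.length; i++) {
            if (Math.abs(thresholds[i] - target) < Math.abs(thresholds[best] - target)) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up threshold column of a curve.
        double[] thresholds = {0.1, 0.4, 0.6, 0.9};
        System.out.println(closestIndex(thresholds, 0.55));
        System.out.println(closestIndex(thresholds, 1.5));
    }
}
```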
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Returns:
      the revision
    • main

      public static void main(String[] args)
      Tests the ThresholdCurve generation from the command line. The classifier is currently hardcoded. Pipe in an ARFF file.
      Parameters:
      args - currently ignored