Class Discretize

All Implemented Interfaces:
Serializable, CapabilitiesHandler, OptionHandler, RevisionHandler, WeightedInstancesHandler, UnsupervisedFilter
Direct Known Subclasses:
PKIDiscretize

public class Discretize extends PotentialClassIgnorer implements UnsupervisedFilter, WeightedInstancesHandler
An instance filter that discretizes a range of numeric attributes in the dataset into nominal attributes. Discretization is by simple binning. Skips the class attribute if set.

Valid options are:

 -unset-class-temporarily
  Unsets the class index temporarily before the filter is
  applied to the data.
  (default: no)
 -B <num>
  Specifies the (maximum) number of bins to divide numeric attributes into.
  (default = 10)
 -M <num>
  Specifies the desired weight of instances per bin for
  equal-frequency binning. If this is set to a positive
  number then the -B option will be ignored.
  (default = -1)
 -F
  Use equal-frequency instead of equal-width discretization.
 -O
  Optimize number of bins using leave-one-out estimate
  of estimated entropy (for equal-width discretization).
  If this is set then the -B option will be ignored.
 -R <col1,col2-col4,...>
  Specifies list of columns to Discretize. First and last are valid indexes.
  (default: first-last)
 -V
  Invert matching sense of column indexes.
 -D
  Output binary attributes for discretized attributes.
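
The equal-width binning described above can be illustrated with a minimal standalone sketch. It does not call Weka and is not taken from the Weka source; the class name `EqualWidthSketch` and the method `cutPoints` are hypothetical, and the sketch assumes the range between the observed minimum and maximum is split into bins of identical width.

```java
import java.util.Arrays;

// Hypothetical illustration of equal-width binning; not Weka's implementation.
final class EqualWidthSketch {

    /** Returns the numBins - 1 interior cut points that split [min, max] evenly. */
    static double[] cutPoints(double[] values, int numBins) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double width = (max - min) / numBins;
        // A single bin, a constant attribute, or no data yields no cut points.
        if (numBins < 2 || !(width > 0)) {
            return new double[0];
        }
        double[] cuts = new double[numBins - 1];
        for (int i = 1; i < numBins; i++) {
            cuts[i - 1] = min + width * i;
        }
        return cuts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(cutPoints(new double[] {0, 2, 4, 10}, 5)));
    }
}
```

The sketch returns an empty array when there is only one interval, whereas `getCutPoints` on this class is documented to return null in that case.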
Version:
$Revision: 8284 $
Author:
Len Trigg (trigg@cs.waikato.ac.nz), Eibe Frank (eibe@cs.waikato.ac.nz)
  • Constructor Detail

    • Discretize

      public Discretize()
      Constructor - initialises the filter
    • Discretize

      public Discretize(String cols)
      Another constructor, sets the attribute indices immediately
      Parameters:
      cols - the attribute indices
  • Method Detail

    • listOptions

      public Enumeration listOptions()
      Gets an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class PotentialClassIgnorer
      Returns:
      an enumeration of all the available options.
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a given list of options.

      Valid options are:

       -unset-class-temporarily
        Unsets the class index temporarily before the filter is
        applied to the data.
        (default: no)
       -B <num>
        Specifies the (maximum) number of bins to divide numeric attributes into.
        (default = 10)
       -M <num>
        Specifies the desired weight of instances per bin for
        equal-frequency binning. If this is set to a positive
        number then the -B option will be ignored.
        (default = -1)
       -F
        Use equal-frequency instead of equal-width discretization.
       -O
        Optimize number of bins using leave-one-out estimate
        of estimated entropy (for equal-width discretization).
        If this is set then the -B option will be ignored.
       -R <col1,col2-col4,...>
        Specifies list of columns to Discretize. First and last are valid indexes.
        (default: first-last)
       -V
        Invert matching sense of column indexes.
       -D
        Output binary attributes for discretized attributes.
      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class PotentialClassIgnorer
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
    • getOptions

      public String[] getOptions()
      Gets the current settings of the filter.
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class PotentialClassIgnorer
      Returns:
      an array of strings suitable for passing to setOptions
    • getCapabilities

      public Capabilities getCapabilities()
      Returns the Capabilities of this filter.
      Specified by:
      getCapabilities in interface CapabilitiesHandler
      Overrides:
      getCapabilities in class Filter
      Returns:
      the capabilities of this object
    • setInputFormat

      public boolean setInputFormat(Instances instanceInfo) throws Exception
      Sets the format of the input instances.
      Overrides:
      setInputFormat in class PotentialClassIgnorer
      Parameters:
      instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored; only the structure is required).
      Returns:
      true if the outputFormat may be collected immediately
      Throws:
      Exception - if the input format can't be set successfully
    • input

      public boolean input(Instance instance)
      Input an instance for filtering. Ordinarily the instance is processed and made available for output immediately. Some filters require all instances be read before producing output.
      Overrides:
      input in class Filter
      Parameters:
      instance - the input instance
      Returns:
      true if the filtered instance may now be collected with output().
      Throws:
      IllegalStateException - if no input format has been defined.
    • batchFinished

      public boolean batchFinished()
      Signifies that this batch of input to the filter is finished. If the filter requires all instances prior to filtering, output() may now be called to retrieve the filtered instances.
      Overrides:
      batchFinished in class Filter
      Returns:
      true if there are instances pending output
      Throws:
      IllegalStateException - if no input structure has been defined
    • globalInfo

      public String globalInfo()
      Returns a string describing this filter.
      Returns:
      a description of the filter suitable for displaying in the Explorer/Experimenter GUI
    • findNumBinsTipText

      public String findNumBinsTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the Explorer/Experimenter GUI
    • getFindNumBins

      public boolean getFindNumBins()
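      Gets whether the number of bins is optimized using a leave-one-out estimate (the -O option, for equal-width discretization).
      Returns:
      true if the number of bins is optimized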
    • setFindNumBins

      public void setFindNumBins(boolean newFindNumBins)
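      Sets whether the number of bins is optimized using a leave-one-out estimate (the -O option, for equal-width discretization).
      Parameters:
      newFindNumBins - true to optimize the number of bins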
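
The idea of choosing a bin count by a leave-one-out estimate can be sketched as follows. This is one plausible reading only, not the Weka implementation: the class `LeaveOneOutBinsSketch` and the method `bestNumBins` are hypothetical, the score used is the leave-one-out log-likelihood of an equal-width histogram density, and the configured bin count is treated as an upper bound (an assumption suggested by the "(maximum)" wording of the -B option).

```java
// Hypothetical illustration; the scoring rule is an assumption, not Weka's code.
final class LeaveOneOutBinsSketch {

    /** Picks the bin count in 1..maxBins with the highest leave-one-out log-likelihood. */
    static int bestNumBins(double[] values, int maxBins) {
        int n = values.length;
        if (n < 2) {
            return 1; // leave-one-out needs at least two values
        }
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        int best = 1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int bins = 1; bins <= maxBins; bins++) {
            double width = (max - min) / bins;
            if (!(width > 0)) {
                break; // constant attribute: keep a single bin
            }
            int[] counts = new int[bins];
            for (double v : values) {
                int k = Math.min(bins - 1, (int) ((v - min) / width));
                counts[k]++;
            }
            double score = 0;
            for (int c : counts) {
                if (c == 1) {
                    // A lone value has zero density once it is left out.
                    score = Double.NEGATIVE_INFINITY;
                    break;
                }
                if (c > 1) {
                    // Each of the c values sees density (c - 1) / ((n - 1) * width).
                    score += c * Math.log((c - 1) / ((n - 1) * width));
                }
            }
            if (score > bestScore) {
                bestScore = score;
                best = bins;
            }
        }
        return best;
    }
}
```

With two tight clusters at the ends of the range, the score favors bin counts that isolate the clusters, because narrower bins raise the estimated density without leaving any value alone in its bin.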
    • makeBinaryTipText

      public String makeBinaryTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the Explorer/Experimenter GUI
    • getMakeBinary

      public boolean getMakeBinary()
      Gets whether binary attributes should be made for discretized ones.
      Returns:
      true if attributes will be binarized
    • setMakeBinary

      public void setMakeBinary(boolean makeBinary)
      Sets whether binary attributes should be made for discretized ones.
      Parameters:
      makeBinary - true if binary attributes are to be made
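
This page does not spell out how the binary attributes are derived. A minimal sketch of one common encoding follows, under the assumption that each cut point yields one binary attribute indicating whether the value lies above that cut point. The class `BinaryEncodingSketch` and the method `encode` are hypothetical and do not call Weka.

```java
// Hypothetical illustration; the encoding is an assumption, not confirmed by this page.
final class BinaryEncodingSketch {

    /** Returns one 0/1 flag per cut point: 1 if value is above the cut point, else 0. */
    static int[] encode(double value, double[] cutPoints) {
        int[] flags = new int[cutPoints.length];
        for (int j = 0; j < cutPoints.length; j++) {
            flags[j] = value <= cutPoints[j] ? 0 : 1;
        }
        return flags;
    }
}
```

Under this assumption an attribute discretized into k bins becomes k - 1 binary attributes, and the ordering of the bins is preserved in the pattern of flags.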
    • desiredWeightOfInstancesPerIntervalTipText

      public String desiredWeightOfInstancesPerIntervalTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the Explorer/Experimenter GUI
    • getDesiredWeightOfInstancesPerInterval

      public double getDesiredWeightOfInstancesPerInterval()
      Gets the desired weight of instances per interval for equal-frequency binning.
      Returns:
      the desired weight of instances per interval (the -M option; -1 by default)
    • setDesiredWeightOfInstancesPerInterval

      public void setDesiredWeightOfInstancesPerInterval(double newDesiredNumber)
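      Sets the desired weight of instances per interval for equal-frequency binning. If this is set to a positive number, the number of bins (the -B option) is ignored.
      Parameters:
      newDesiredNumber - the desired weight of instances per interval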
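
How a desired weight per interval could replace the configured bin count can be sketched as follows. The rule used here (total instance weight divided by the desired weight, rounded down, with a minimum of one bin) is an assumption for illustration, and the names `DesiredWeightSketch` and `effectiveBins` are hypothetical.

```java
// Hypothetical illustration; the rounding rule is an assumption.
final class DesiredWeightSketch {

    /**
     * Returns the bin count to use: derived from the weights when
     * desiredWeightPerBin is positive, otherwise the configured value.
     */
    static int effectiveBins(double[] weights, double desiredWeightPerBin, int configuredBins) {
        if (desiredWeightPerBin <= 0) {
            return configuredBins; // -M not set: fall back to -B
        }
        double total = 0;
        for (double w : weights) {
            total += w;
        }
        return Math.max(1, (int) (total / desiredWeightPerBin));
    }
}
```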
    • useEqualFrequencyTipText

      public String useEqualFrequencyTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the Explorer/Experimenter GUI
    • getUseEqualFrequency

      public boolean getUseEqualFrequency()
      Gets whether equal-frequency binning is used instead of equal-width binning.
      Returns:
      true if equal-frequency binning is used
    • setUseEqualFrequency

      public void setUseEqualFrequency(boolean newUseEqualFrequency)
      Sets whether equal-frequency binning is used instead of equal-width binning.
      Parameters:
      newUseEqualFrequency - true to use equal-frequency binning
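
For contrast with equal-width binning, a simplified sketch of equal-frequency binning is shown below. It ignores instance weights and is not the Weka implementation; the class `EqualFrequencySketch` and its tie-handling rule (never placing a cut between identical values) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, unweighted illustration of equal-frequency binning.
final class EqualFrequencySketch {

    /** Returns cut points placed so that each bin holds roughly the same number of values. */
    static double[] cutPoints(double[] values, int numBins) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        List<Double> cuts = new ArrayList<>();
        double perBin = (double) sorted.length / numBins;
        for (int b = 1; b < numBins; b++) {
            int idx = (int) Math.round(b * perBin); // first index of bin b
            if (idx <= 0 || idx >= sorted.length) {
                continue;
            }
            if (sorted[idx - 1] == sorted[idx]) {
                continue; // do not split identical values across bins
            }
            double cut = (sorted[idx - 1] + sorted[idx]) / 2;
            if (cuts.isEmpty() || cut > cuts.get(cuts.size() - 1)) {
                cuts.add(cut);
            }
        }
        return cuts.stream().mapToDouble(Double::doubleValue).toArray();
    }
}
```

Each cut is placed midway between the two neighboring sorted values, so fewer cut points than requested may come back when the data contains many ties.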
    • binsTipText

      public String binsTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the Explorer/Experimenter GUI
    • getBins

      public int getBins()
      Gets the number of bins numeric attributes will be divided into.
      Returns:
      the number of bins.
    • setBins

      public void setBins(int numBins)
      Sets the number of bins to divide each selected numeric attribute into.
      Parameters:
      numBins - the number of bins
    • invertSelectionTipText

      public String invertSelectionTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the Explorer/Experimenter GUI
    • getInvertSelection

      public boolean getInvertSelection()
      Gets whether the attribute selection is inverted.
      Returns:
      true if the attributes outside the supplied range are the ones discretized
    • setInvertSelection

      public void setInvertSelection(boolean invert)
      Sets whether the attribute selection is inverted. If false, only the selected (numeric) attributes in the range are discretized. If true, only the non-selected attributes are discretized.
      Parameters:
      invert - the new invert setting
    • attributeIndicesTipText

      public String attributeIndicesTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the Explorer/Experimenter GUI
    • getAttributeIndices

      public String getAttributeIndices()
      Gets the current range selection.
      Returns:
      a string containing a comma-separated list of ranges
    • setAttributeIndices

      public void setAttributeIndices(String rangeList)
      Sets which attributes are to be discretized (only numeric attributes among the selection will be discretized).
      Parameters:
      rangeList - a string representing the list of attributes. Since the string will typically come from a user, attributes are indexed from 1.
      e.g. first-3,5,6-last
      Throws:
      IllegalArgumentException - if an invalid range list is supplied
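
To make the 1-based range syntax concrete, here is a small standalone parser sketch for strings such as `first-3,5,6-last`. It is a hypothetical helper, not Weka's range handling; the names `RangeListSketch` and `parse` are invented, and rejecting reversed ranges such as `5-3` is a choice made by the sketch only.

```java
// Hypothetical illustration of the 1-based range-list syntax; not Weka's parser.
final class RangeListSketch {

    /** Returns a 0-based selection mask for a 1-based range list. */
    static boolean[] parse(String rangeList, int numAttributes) {
        boolean[] selected = new boolean[numAttributes];
        for (String part : rangeList.split(",")) {
            String token = part.trim();
            int dash = token.indexOf('-');
            int lo;
            int hi;
            if (dash < 0) {
                lo = index(token, numAttributes);
                hi = lo;
            } else {
                lo = index(token.substring(0, dash), numAttributes);
                hi = index(token.substring(dash + 1), numAttributes);
            }
            if (lo > hi) {
                throw new IllegalArgumentException("Invalid range: " + token);
            }
            for (int i = lo; i <= hi; i++) {
                selected[i - 1] = true; // convert the 1-based index to 0-based
            }
        }
        return selected;
    }

    /** Resolves "first", "last", or a 1-based number to a validated 1-based index. */
    private static int index(String text, int numAttributes) {
        String t = text.trim();
        int value;
        if (t.equals("first")) {
            value = 1;
        } else if (t.equals("last")) {
            value = numAttributes;
        } else {
            try {
                value = Integer.parseInt(t);
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("Invalid index: " + text, e);
            }
        }
        if (value < 1 || value > numAttributes) {
            throw new IllegalArgumentException("Index out of range: " + text);
        }
        return value;
    }
}
```

For a dataset with seven attributes, `first-3,5,6-last` selects every attribute except the fourth.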
    • setAttributeIndicesArray

      public void setAttributeIndicesArray(int[] attributes)
      Sets which attributes are to be discretized (only numeric attributes among the selection will be discretized).
      Parameters:
      attributes - an array containing indexes of attributes to discretize. Since the array will typically come from a program, attributes are indexed from 0.
      Throws:
      IllegalArgumentException - if an invalid set of ranges is supplied
    • getCutPoints

      public double[] getCutPoints(int attributeIndex)
      Gets the cut points for an attribute.
      Parameters:
      attributeIndex - the index (from 0) of the attribute to get the cut points of
      Returns:
      an array containing the cut points, or null if the requested attribute has been discretized into only one interval
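
Given an array of cut points like the one returned here, a value can be mapped to its bin with a short lookup. The sketch below is standalone and hypothetical (`BinLookupSketch`, `binIndex`). It assumes the cut points are sorted in ascending order and that each interval is closed on the right, so a value equal to a cut point falls into the lower bin.

```java
// Hypothetical illustration; assumes sorted cut points and right-closed intervals.
final class BinLookupSketch {

    /** Returns the 0-based bin index of value; a null array means a single interval. */
    static int binIndex(double value, double[] cutPoints) {
        if (cutPoints == null) {
            return 0;
        }
        int bin = 0;
        while (bin < cutPoints.length && value > cutPoints[bin]) {
            bin++;
        }
        return bin;
    }
}
```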
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class Filter
      Returns:
      the revision
    • main

      public static void main(String[] argv)
      Main method for testing this class.
      Parameters:
      argv - should contain arguments to the filter: use -h for help