Class PrincipalComponents

  • All Implemented Interfaces:
    java.io.Serializable, CapabilitiesHandler, OptionHandler, RevisionHandler, UnsupervisedFilter

    public class PrincipalComponents
    extends Filter
    implements OptionHandler, UnsupervisedFilter
    Performs a principal components analysis and transformation of the data.
    Dimensionality reduction is accomplished by choosing enough eigenvectors to account for a given proportion of the variance in the original data (default: 0.95, i.e. 95%).
    Based on code of the attribute selection scheme 'PrincipalComponents' by Mark Hall and Gabi Schmidberger.

    Valid options are:

     -D
      Don't normalize input data.
     -R <num>
      Retain enough PC attributes to account
      for this proportion of variance in the original data.
      (default: 0.95)
     -A <num>
      Maximum number of attributes to include in 
      transformed attribute names.
      (-1 = include all, default: 5)
     -M <num>
      Maximum number of PC attributes to retain.
      (-1 = include all, default: -1)
    Version:
    $Revision: 11449 $
    Author:
    Mark Hall (mhall@cs.waikato.ac.nz) -- attribute selection code, Gabi Schmidberger (gabi@cs.waikato.ac.nz) -- attribute selection code, fracpete (fracpete at waikato dot ac dot nz) -- filter code
    See Also:
    Serialized Form
    • Constructor Detail

      • PrincipalComponents

        public PrincipalComponents()
    • Method Detail

      • globalInfo

        public java.lang.String globalInfo()
        Returns a string describing this filter.
        Returns:
        a description of the filter suitable for displaying in the explorer/experimenter gui
      • listOptions

        public java.util.Enumeration listOptions()
        Returns an enumeration describing the available options.
        Specified by:
        listOptions in interface OptionHandler
        Returns:
        an enumeration of all the available options.
      • setOptions

        public void setOptions(java.lang.String[] options)
                        throws java.lang.Exception
        Parses a list of options for this object.

        Valid options are:

         -D
          Don't normalize input data.
         -R <num>
          Retain enough PC attributes to account
          for this proportion of variance in the original data.
          (default: 0.95)
         -A <num>
          Maximum number of attributes to include in 
          transformed attribute names.
          (-1 = include all, default: 5)
         -M <num>
          Maximum number of PC attributes to retain.
          (-1 = include all, default: -1)
        Specified by:
        setOptions in interface OptionHandler
        Parameters:
        options - the list of options as an array of strings
        Throws:
        java.lang.Exception - if an option is not supported
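        The option list above can be read as a small command-line grammar: one bare flag and three flags that take a value. The following is a minimal sketch of such a parser, written against plain Java only. The class PcaOptionsSketch, its fields, and its parse method are hypothetical names introduced for illustration; this is not the filter's own parsing code, and it throws IllegalArgumentException where the documented method declares java.lang.Exception.

```java
/** Hypothetical holder for the four documented options; not part of Weka. */
final class PcaOptionsSketch {
    boolean noNormalize = false;    // -D
    double varianceCovered = 0.95;  // -R <num>, documented default 0.95
    int maxAttributeNames = 5;      // -A <num>, documented default 5
    int maxAttributes = -1;         // -M <num>, documented default -1 (all)

    /** Parses an options array; rejects any flag not listed in the documentation. */
    static PcaOptionsSketch parse(String[] options) {
        PcaOptionsSketch o = new PcaOptionsSketch();
        for (int i = 0; i < options.length; i++) {
            String flag = options[i];
            switch (flag) {
                case "-D":
                    o.noNormalize = true;
                    break;
                case "-R":
                    o.varianceCovered = Double.parseDouble(value(options, ++i, flag));
                    break;
                case "-A":
                    o.maxAttributeNames = Integer.parseInt(value(options, ++i, flag));
                    break;
                case "-M":
                    o.maxAttributes = Integer.parseInt(value(options, ++i, flag));
                    break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + flag);
            }
        }
        return o;
    }

    /** Returns the value that follows a flag, or fails if the array ends early. */
    private static String value(String[] options, int i, String flag) {
        if (i >= options.length) {
            throw new IllegalArgumentException("Missing value for " + flag);
        }
        return options[i];
    }
}
```

        For example, PcaOptionsSketch.parse(new String[] {"-R", "0.9", "-M", "3"}) leaves the two unspecified options at their documented defaults.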
      • getOptions

        public java.lang.String[] getOptions()
        Gets the current settings of the filter.
        Specified by:
        getOptions in interface OptionHandler
        Returns:
        an array of strings suitable for passing to setOptions
      • centerDataTipText

        public java.lang.String centerDataTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • setCenterData

        public void setCenterData(boolean center)
        Sets whether to center (rather than standardize) the data. If set to true, PCA is computed from the covariance matrix rather than the correlation matrix.
        Parameters:
        center - true if the data is to be centered rather than standardized
      • getCenterData

        public boolean getCenterData()
        Gets whether to center (rather than standardize) the data. If true, PCA is computed from the covariance matrix rather than the correlation matrix.
        Returns:
        true if the data is to be centered rather than standardized.
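        The difference between the two modes comes down to which matrix is decomposed: centering alone leads to the covariance matrix, while standardizing (centering and dividing by the standard deviation) leads to the correlation matrix. The sketch below computes both for a small data array using plain Java. ScatterMatrixSketch and its methods are hypothetical names used only for illustration, and the code is not taken from the filter's implementation.

```java
/** Illustrative helper, not part of Weka: builds the matrix a PCA would decompose. */
final class ScatterMatrixSketch {

    /** Sample covariance matrix of the columns of data (rows are instances). */
    static double[][] covariance(double[][] data) {
        int n = data.length;
        int d = data[0].length;
        double[] mean = new double[d];
        for (double[] row : data) {
            for (int j = 0; j < d; j++) {
                mean[j] += row[j];
            }
        }
        for (int j = 0; j < d; j++) {
            mean[j] /= n;
        }
        double[][] cov = new double[d][d];
        for (double[] row : data) {
            for (int j = 0; j < d; j++) {
                for (int k = 0; k < d; k++) {
                    cov[j][k] += (row[j] - mean[j]) * (row[k] - mean[k]) / (n - 1);
                }
            }
        }
        return cov;
    }

    /**
     * Correlation matrix: the covariance rescaled by the column standard
     * deviations. A constant column (zero variance) yields NaN entries here.
     */
    static double[][] correlation(double[][] data) {
        double[][] cov = covariance(data);
        int d = cov.length;
        double[][] corr = new double[d][d];
        for (int j = 0; j < d; j++) {
            for (int k = 0; k < d; k++) {
                corr[j][k] = cov[j][k] / Math.sqrt(cov[j][j] * cov[k][k]);
            }
        }
        return corr;
    }

    /** Mirrors the documented switch: true selects covariance, false correlation. */
    static double[][] matrixFor(double[][] data, boolean centerData) {
        return centerData ? covariance(data) : correlation(data);
    }
}
```

        With the covariance matrix, attributes measured on larger scales dominate the leading components; with the correlation matrix, every attribute contributes unit variance.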
      • varianceCoveredTipText

        public java.lang.String varianceCoveredTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • setVarianceCovered

        public void setVarianceCovered(double value)
        Sets the proportion of total variance to account for when retaining principal components.
        Parameters:
        value - the proportion of total variance to account for
      • getVarianceCovered

        public double getVarianceCovered()
        Gets the proportion of total variance to account for when retaining principal components.
        Returns:
        the proportion of variance to account for
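        One way to picture the variance-covered setting is as a cumulative sum over eigenvalues sorted from largest to smallest: components are kept until their share of the total reaches the requested proportion. The sketch below illustrates that idea in plain Java and also applies an optional cap in the spirit of the -M option. ComponentCountSketch and componentsToRetain are hypothetical names, and the exact stopping rule used by the filter may differ.

```java
/** Illustrative helper, not part of Weka. */
final class ComponentCountSketch {

    /**
     * Counts how many leading components are needed so that their eigenvalues
     * sum to at least varianceCovered of the total.
     *
     * @param eigenvalues     eigenvalues sorted in descending order
     * @param varianceCovered target proportion, e.g. 0.95
     * @param maxComponents   upper bound on the count, or -1 for no bound
     */
    static int componentsToRetain(double[] eigenvalues, double varianceCovered, int maxComponents) {
        double total = 0.0;
        for (double e : eigenvalues) {
            total += e;
        }
        int limit = (maxComponents < 0)
                ? eigenvalues.length
                : Math.min(maxComponents, eigenvalues.length);
        double cumulative = 0.0;
        int count = 0;
        // Keep adding components while the covered share is still below the target.
        while (count < limit && cumulative / total < varianceCovered) {
            cumulative += eigenvalues[count];
            count++;
        }
        return count;
    }
}
```

        With eigenvalues {6, 3, 1}, the first two components cover 90% of the variance, so a target of 0.95 needs all three, while a target of 0.85 needs only two.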
      • maximumAttributeNamesTipText

        public java.lang.String maximumAttributeNamesTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • setMaximumAttributeNames

        public void setMaximumAttributeNames(int value)
        Sets the maximum number of attributes to include in transformed attribute names.
        Parameters:
        value - the maximum number of attributes
      • getMaximumAttributeNames

        public int getMaximumAttributeNames()
        Gets the maximum number of attributes to include in transformed attribute names.
        Returns:
        the maximum number of attributes
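        This setting limits how many original attributes appear in the name of each transformed attribute. The sketch below shows one plausible way to build such a name: list the attributes with the largest absolute coefficients first and mark truncation with a trailing "...". The name format, the ordering, the three-decimal rounding, and the class ComponentNameSketch are all assumptions made for illustration; this page does not specify the format the filter actually produces.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Locale;

/** Illustrative helper, not part of Weka; the name format is an assumption. */
final class ComponentNameSketch {

    /**
     * Builds a name such as "-0.900b+0.400c..." from one component's coefficients.
     *
     * @param coefficients   one coefficient per original attribute
     * @param attributeNames names of the original attributes, same order
     * @param maxNames       maximum number of attributes to list, or -1 for all
     */
    static String name(double[] coefficients, String[] attributeNames, int maxNames) {
        Integer[] order = new Integer[coefficients.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Largest absolute coefficient first.
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -Math.abs(coefficients[i])));

        int limit = (maxNames < 0) ? order.length : Math.min(maxNames, order.length);
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < limit; k++) {
            double c = coefficients[order[k]];
            if (k > 0 && c >= 0) {
                sb.append('+'); // negative values already carry their sign
            }
            sb.append(String.format(Locale.ROOT, "%.3f", c)).append(attributeNames[order[k]]);
        }
        if (limit < order.length) {
            sb.append("..."); // signal that some attributes were left out
        }
        return sb.toString();
    }
}
```

        Capping the name length keeps transformed attribute names readable when the input has many attributes, at the cost of hiding the smaller loadings.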
      • maximumAttributesTipText

        public java.lang.String maximumAttributesTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • setMaximumAttributes

        public void setMaximumAttributes(int value)
        Sets the maximum number of PC attributes to retain.
        Parameters:
        value - the maximum number of attributes
      • getMaximumAttributes

        public int getMaximumAttributes()
        Gets the maximum number of PC attributes to retain.
        Returns:
        the maximum number of attributes
      • setInputFormat

        public boolean setInputFormat(Instances instanceInfo)
                               throws java.lang.Exception
        Sets the format of the input instances.
        Overrides:
        setInputFormat in class Filter
        Parameters:
        instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored - only the structure is required).
        Returns:
        true if the outputFormat may be collected immediately
        Throws:
        java.lang.Exception - if the input format can't be set successfully
      • input

        public boolean input(Instance instance)
                      throws java.lang.Exception
        Inputs an instance for filtering. The filter requires all training instances to be read before it produces output.
        Overrides:
        input in class Filter
        Parameters:
        instance - the input instance
        Returns:
        true if the filtered instance may now be collected with output().
        Throws:
        java.lang.IllegalStateException - if no input format has been set
        java.lang.Exception - if conversion fails
      • batchFinished

        public boolean batchFinished()
                              throws java.lang.Exception
        Signifies that this batch of input to the filter is finished.
        Overrides:
        batchFinished in class Filter
        Returns:
        true if there are instances pending output
        Throws:
        java.lang.NullPointerException - if no input structure has been defined
        java.lang.Exception - if there was a problem finishing the batch
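        Taken together, setInputFormat, input, and batchFinished describe a batch protocol: instances are buffered until the batch is declared finished, and only then does output become available. The toy class below imitates that call sequence on plain numbers, subtracting the batch mean instead of running a PCA. BatchFilterSketch is a hypothetical stand-in, not Weka's Filter class. Its handling of inputs after the first batch (transforming them immediately with the statistics already learned) is an assumption about typical batch-filter behavior rather than something stated on this page.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy batch filter, not part of Weka: subtracts the first batch's mean. */
final class BatchFilterSketch {
    private final List<Double> buffer = new ArrayList<>();
    private final Queue<Double> outputQueue = new ArrayDeque<>();
    private boolean formatSet = false;
    private Double trainedMean = null; // null until the first batch is finished

    /** Resets the filter; returns false because no output is available yet. */
    boolean setInputFormat() {
        formatSet = true;
        trainedMean = null;
        buffer.clear();
        outputQueue.clear();
        return false;
    }

    /** Buffers a value during the first batch; afterwards converts it at once. */
    boolean input(double value) {
        if (!formatSet) {
            throw new IllegalStateException("No input format defined");
        }
        if (trainedMean != null) {
            outputQueue.add(value - trainedMean);
            return true;
        }
        buffer.add(value);
        return false;
    }

    /** Learns the mean from the buffered values and queues the converted values. */
    boolean batchFinished() {
        if (!formatSet) {
            throw new IllegalStateException("No input format defined");
        }
        if (trainedMean == null) {
            trainedMean = buffer.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
            for (double v : buffer) {
                outputQueue.add(v - trainedMean);
            }
            buffer.clear();
        }
        return !outputQueue.isEmpty();
    }

    /** Returns the next converted value, or null if none is pending. */
    Double output() {
        return outputQueue.poll();
    }
}
```

        A caller would invoke setInputFormat once, call input for every training value, call batchFinished, and then drain output until it returns null.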
      • main

        public static void main(java.lang.String[] args)
        Main method for running this filter.
        Parameters:
        args - should contain arguments to the filter: use -h for help