Class LatentSemanticAnalysis

All Implemented Interfaces:
Serializable, AttributeEvaluator, AttributeTransformer, CapabilitiesHandler, OptionHandler, RevisionHandler

public class LatentSemanticAnalysis extends UnsupervisedAttributeEvaluator implements AttributeTransformer, OptionHandler
Performs latent semantic analysis and transformation of the data. Use in conjunction with a Ranker search. A low-rank approximation of the full data is found by specifying the number of singular values to use. The dataset may be transformed to give the relation of either the attributes or the instances (default) to the concept space created by the transformation.
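The low-rank approximation mentioned above can be illustrated with a small sketch. It is not this class's implementation: the class name `LowRankSketch` and the method `approximate` are hypothetical, and the SVD factors are assumed to be supplied by the caller (columns of `u` and `v` are the left and right singular vectors, `s` holds the singular values in descending order).

```java
/** Illustrative only: rebuilds a rank-k approximation from given SVD factors. */
final class LowRankSketch {

    /**
     * Computes A_k = sum over i < k of s[i] * u_i * v_i^T, where u_i and v_i
     * are the i-th columns of u and v. Values of k larger than s.length are
     * clamped, so every available singular value is used at most once.
     */
    static double[][] approximate(double[][] u, double[] s, double[][] v, int k) {
        int rows = u.length;
        int cols = v.length;
        int used = Math.min(k, s.length);
        double[][] result = new double[rows][cols];
        for (int i = 0; i < used; i++) {
            for (int r = 0; r < rows; r++) {
                for (int c = 0; c < cols; c++) {
                    // Add the contribution of the i-th singular triplet.
                    result[r][c] += s[i] * u[r][i] * v[c][i];
                }
            }
        }
        return result;
    }
}
```

For example, the matrix diag(3, 1) has the trivial factors U = V = I and s = (3, 1); keeping only the first singular value yields diag(3, 0).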

Valid options are:

 -N
  Normalize input data.
 -R
  Rank approximation used in LSA. May be actual number of 
  LSA attributes to include (if greater than 1) or a proportion 
  of total singular values to account for (if between 0 and 1). 
  A value less than or equal to zero means use all latent variables.
  (default = 0.95)
 -A
  Maximum number of attributes to include in 
  transformed attribute names. (-1 = include all)
Version:
$Revision: 11821 $
Author:
Amri Napolitano
  • Constructor Detail

    • LatentSemanticAnalysis

      public LatentSemanticAnalysis()
  • Method Detail

    • globalInfo

      public String globalInfo()
      Returns a string describing this attribute transformer.
      Returns:
      a description of the evaluator suitable for displaying in the Explorer/Experimenter GUI
    • listOptions

      public Enumeration listOptions()
      Returns an enumeration describing the available options.

      Specified by:
      listOptions in interface OptionHandler
      Returns:
      an enumeration of all the available options.
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a given list of options.

      Valid options are:

       -N
        Normalize input data.
       -R
        Rank approximation used in LSA. May be actual number of 
        LSA attributes to include (if greater than 1) or a proportion 
        of total singular values to account for (if between 0 and 1). 
        A value less than or equal to zero means use all latent variables.
        (default = 0.95)
       -A
        Maximum number of attributes to include in 
        transformed attribute names. (-1 = include all)
      Specified by:
      setOptions in interface OptionHandler
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
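As an illustration of the array shape that setOptions expects, the sketch below assembles the three documented options, with each flag and each value as a separate array element. The class `LsaOptionsSketch` and its method `buildOptions` are hypothetical helpers written for this example, not part of the library.

```java
/** Illustrative only: builds an options array using the documented flags. */
final class LsaOptionsSketch {

    /**
     * Returns an array such as {"-N", "-R", "0.95", "-A", "5"}.
     * The -N flag takes no value and is emitted only when normalize is true.
     */
    static String[] buildOptions(boolean normalize, double rank, int maxAttributeNames) {
        String[] valued = {
            "-R", Double.toString(rank),
            "-A", Integer.toString(maxAttributeNames)
        };
        if (!normalize) {
            return valued;
        }
        // Prepend the value-less -N flag.
        String[] all = new String[valued.length + 1];
        all[0] = "-N";
        System.arraycopy(valued, 0, all, 1, valued.length);
        return all;
    }
}
```

The resulting array could then be passed to setOptions(String[]) on an instance of this class.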
    • normalizeTipText

      public String normalizeTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the Explorer/Experimenter GUI
    • setNormalize

      public void setNormalize(boolean newNormalize)
      Set whether input data will be normalized.
      Parameters:
      newNormalize - true if input data is to be normalized
    • getNormalize

      public boolean getNormalize()
      Gets whether input data is to be normalized.
      Returns:
      true if input data is to be normalized
    • rankTipText

      public String rankTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the Explorer/Experimenter GUI
    • setRank

      public void setRank(double newRank)
      Sets the desired matrix rank (or coverage proportion) for feature-space reduction
      Parameters:
      newRank - the desired rank (or coverage) for feature-space reduction
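The documented -R semantics (a count, a proportion, or "use all") can be sketched as follows. This is an illustration rather than the class's actual code. It assumes that coverage is measured as the share of summed squared singular values, that a value of exactly 1 is treated as a count, and that a count larger than the number of available singular values falls back to using all of them. The class `RankSketch` and method `latentVariableCount` are hypothetical names.

```java
/** Illustrative only: maps a rank setting to a number of latent variables. */
final class RankSketch {

    /**
     * singularValues is assumed to be sorted in descending order.
     * Coverage is assumed to be the cumulative share of squared singular values.
     */
    static int latentVariableCount(double rank, double[] singularValues) {
        int available = singularValues.length;
        if (rank <= 0 || rank > available) {
            return available; // zero or negative (documented), or too large (assumed): use all
        }
        if (rank >= 1) {
            return (int) rank; // explicit number of latent variables
        }
        double total = 0.0;
        for (double s : singularValues) {
            total += s * s;
        }
        double running = 0.0;
        for (int i = 0; i < available; i++) {
            running += singularValues[i] * singularValues[i];
            if (running / total >= rank) {
                return i + 1; // smallest prefix reaching the requested coverage
            }
        }
        return available;
    }
}
```

For singular values 3, 2, 1 the squared values are 9, 4, 1 (sum 14). Under these assumptions a setting of 0.9 keeps two latent variables (13/14 ≈ 0.93), while the default of 0.95 keeps all three.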
    • getRank

      public double getRank()
      Gets the desired matrix rank (or coverage proportion) for feature-space reduction
      Returns:
      the rank (or coverage) for feature-space reduction
    • maximumAttributeNamesTipText

      public String maximumAttributeNamesTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the Explorer/Experimenter GUI
    • setMaximumAttributeNames

      public void setMaximumAttributeNames(int newMaxAttributes)
      Sets the maximum number of attributes to include in transformed attribute names.
      Parameters:
      newMaxAttributes - the maximum number of attributes
    • getMaximumAttributeNames

      public int getMaximumAttributeNames()
      Gets the maximum number of attributes to include in transformed attribute names.
      Returns:
      the maximum number of attributes
    • getOptions

      public String[] getOptions()
      Gets the current settings of LatentSemanticAnalysis.
      Specified by:
      getOptions in interface OptionHandler
      Returns:
      an array of strings suitable for passing to setOptions()
    • getCapabilities

      public Capabilities getCapabilities()
      Returns the capabilities of this evaluator.
      Specified by:
      getCapabilities in interface CapabilitiesHandler
      Overrides:
      getCapabilities in class ASEvaluation
      Returns:
      the capabilities of this evaluator
    • buildEvaluator

      public void buildEvaluator(Instances data) throws Exception
      Initializes the singular values/vectors and performs the analysis.
      Specified by:
      buildEvaluator in class ASEvaluation
      Parameters:
      data - the instances to analyse/transform
      Throws:
      Exception - if analysis fails
    • transformedHeader

      public Instances transformedHeader() throws Exception
      Returns just the header for the transformed data (i.e., an empty set of instances). This lets AttributeSelection determine the structure of the transformed data without having to obtain all of the transformed data through transformedData().
      Specified by:
      transformedHeader in interface AttributeTransformer
      Returns:
      the header of the transformed data.
      Throws:
      Exception - if the header of the transformed data can't be determined.
    • transformedData

      public Instances transformedData(Instances data) throws Exception
      Transforms the supplied data set (assumed to be in the same format as the training data).
      Specified by:
      transformedData in interface AttributeTransformer
      Parameters:
      data - the data set to transform
      Returns:
      the transformed training data
      Throws:
      Exception - if transformed data can't be returned
    • evaluateAttribute

      public double evaluateAttribute(int att) throws Exception
      Evaluates the merit of a transformed attribute. This is defined to be the square of the singular value for the latent variable corresponding to the transformed attribute.
      Specified by:
      evaluateAttribute in interface AttributeEvaluator
      Parameters:
      att - the attribute to be evaluated
      Returns:
      the merit of a transformed attribute
      Throws:
      Exception - if attribute can't be evaluated
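The merit definition above amounts to squaring each singular value. A minimal sketch, with the hypothetical class `MeritSketch` standing in for the evaluator and the singular values assumed to be already computed:

```java
/** Illustrative only: merit of each latent variable as its squared singular value. */
final class MeritSketch {

    /** Returns merits[i] = singularValues[i]^2 for every latent variable i. */
    static double[] merits(double[] singularValues) {
        double[] merits = new double[singularValues.length];
        for (int i = 0; i < singularValues.length; i++) {
            merits[i] = singularValues[i] * singularValues[i];
        }
        return merits;
    }
}
```

Because singular values are non-negative, ranking transformed attributes by this merit orders them the same way as ranking by singular value.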
    • convertInstance

      public Instance convertInstance(Instance instance) throws Exception
      Transforms an instance in the original (unnormalized) format.
      Specified by:
      convertInstance in interface AttributeTransformer
      Parameters:
      instance - an instance in the original (unnormalized) format
      Returns:
      a transformed instance
      Throws:
      Exception - if instance can't be transformed
    • toString

      public String toString()
      Returns a description of this attribute transformer.
      Overrides:
      toString in class Object
      Returns:
      a String describing this attribute transformer
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class ASEvaluation
      Returns:
      the revision
    • main

      public static void main(String[] argv)
      Main method for testing this class.
      Parameters:
      argv - should contain the command line arguments to the evaluator/transformer (see AttributeSelection)