Package weka.clusterers

Class FarthestFirst

All Implemented Interfaces:
Serializable, Cloneable, Clusterer, CapabilitiesHandler, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler

public class FarthestFirst extends RandomizableClusterer implements TechnicalInformationHandler
Cluster data using the FarthestFirst algorithm.

For more information see:

Hochbaum, Shmoys (1985). A best possible heuristic for the k-center problem. Mathematics of Operations Research. 10(2):180-184.

Sanjoy Dasgupta: Performance Guarantees for Hierarchical Clustering. In: 15th Annual Conference on Computational Learning Theory, 351-363, 2002.

Notes:
- works as a fast simple approximate clusterer
- modelled after SimpleKMeans, might be a useful initializer for it
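
The farthest-first traversal that gives this clusterer its name can be sketched in plain Java, without Weka. This is a minimal illustration, not the class's actual implementation: the names FarthestFirstSketch, selectCenters and squaredDistance, and the use of squared Euclidean distance on raw double[] points, are assumptions made for the example. The first center is drawn at random from the seed; each later center is the point whose distance to its nearest already-chosen center is largest.

```java
import java.util.Arrays;
import java.util.Random;

class FarthestFirstSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /**
     * Returns the indices of k centers (at most points.length of them).
     * Assumes points is non-empty. The first center is random; every
     * later one is the point farthest from all centers chosen so far.
     */
    static int[] selectCenters(double[][] points, int k, long seed) {
        int n = points.length;
        k = Math.min(k, n);
        int[] centers = new int[k];

        // minDist[i] = distance from point i to its nearest chosen center
        double[] minDist = new double[n];
        Arrays.fill(minDist, Double.POSITIVE_INFINITY);

        int current = new Random(seed).nextInt(n);
        for (int c = 0; c < k; c++) {
            centers[c] = current;
            int farthest = current;
            double best = -1.0;
            for (int i = 0; i < n; i++) {
                minDist[i] = Math.min(minDist[i],
                        squaredDistance(points[i], points[current]));
                if (minDist[i] > best) {
                    best = minDist[i];
                    farthest = i;
                }
            }
            current = farthest; // next center: the farthest remaining point
        }
        return centers;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {1, 0}, {10, 0}, {11, 0}};
        System.out.println(Arrays.toString(selectCenters(points, 2, 1L)));
    }
}
```

Each round costs one pass over the data, so choosing k centers takes O(n * k) distance computations, which is why the method works as a fast approximate clusterer or as an initializer for an iterative one.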

BibTeX:

 @article{Hochbaum1985,
    author = {Hochbaum and Shmoys},
    journal = {Mathematics of Operations Research},
    number = {2},
    pages = {180-184},
    title = {A best possible heuristic for the k-center problem},
    volume = {10},
    year = {1985}
 }
 
 @inproceedings{Dasgupta2002,
    author = {Sanjoy Dasgupta},
    booktitle = {15th Annual Conference on Computational Learning Theory},
    pages = {351-363},
    publisher = {Springer},
    title = {Performance Guarantees for Hierarchical Clustering},
    year = {2002}
 }
 

Valid options are:

 -N <num>
  number of clusters. (default = 2).
 -S <num>
  Random number seed.
  (default 1)
Version:
$Revision: 5538 $
Author:
Bernhard Pfahringer (bernhard@cs.waikato.ac.nz)
  • Constructor Detail

    • FarthestFirst

      public FarthestFirst()
  • Method Detail

    • globalInfo

      public String globalInfo()
      Returns a string describing this clusterer.
      Returns:
      a description of the clusterer suitable for displaying in the explorer/experimenter gui
    • getTechnicalInformation

      public TechnicalInformation getTechnicalInformation()
      Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
      Specified by:
      getTechnicalInformation in interface TechnicalInformationHandler
      Returns:
      the technical information about this class
    • getCapabilities

      public Capabilities getCapabilities()
      Returns default capabilities of the clusterer.
      Specified by:
      getCapabilities in interface CapabilitiesHandler
      Specified by:
      getCapabilities in interface Clusterer
      Overrides:
      getCapabilities in class AbstractClusterer
      Returns:
      the capabilities of this clusterer
    • buildClusterer

      public void buildClusterer(Instances data) throws Exception
      Generates a clusterer. Has to initialize all fields of the clusterer that are not being set via options.
      Specified by:
      buildClusterer in interface Clusterer
      Specified by:
      buildClusterer in class AbstractClusterer
      Parameters:
      data - set of instances serving as training data
      Throws:
      Exception - if the clusterer has not been generated successfully
    • clusterInstance

      public int clusterInstance(Instance instance) throws Exception
      Assigns a given instance to a cluster.
      Specified by:
      clusterInstance in interface Clusterer
      Overrides:
      clusterInstance in class AbstractClusterer
      Parameters:
      instance - the instance to be assigned to a cluster
      Returns:
      the index of the cluster the instance is assigned to
      Throws:
      Exception - if the instance could not be clustered successfully
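
      As a rough illustration of what assigning an instance to a cluster involves, the sketch below returns the index of the nearest center. It assumes plain numeric attributes and squared Euclidean distance; the names NearestCenterSketch and assign are hypothetical, and the real class may normalize attributes or treat nominal and missing values differently.

```java
class NearestCenterSketch {

    /**
     * Returns the index of the center closest to the instance.
     * Assumes at least one center and matching dimensions.
     */
    static int assign(double[][] centers, double[] instance) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centers.length; c++) {
            double dist = 0.0;
            for (int j = 0; j < instance.length; j++) {
                double d = instance[j] - centers[c][j];
                dist += d * d;
            }
            if (dist < bestDist) { // strict '<' keeps the lowest index on ties
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] centers = {{0, 0}, {10, 10}};
        System.out.println(assign(centers, new double[] {1, 2}));
        System.out.println(assign(centers, new double[] {9, 8}));
    }
}
```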
    • numberOfClusters

      public int numberOfClusters() throws Exception
      Returns the number of clusters.
      Specified by:
      numberOfClusters in interface Clusterer
      Specified by:
      numberOfClusters in class AbstractClusterer
      Returns:
      the number of clusters generated for a training dataset.
      Throws:
      Exception - if number of clusters could not be returned successfully
    • listOptions

      public Enumeration listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class RandomizableClusterer
      Returns:
      an enumeration of all the available options.
    • numClustersTipText

      public String numClustersTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setNumClusters

      public void setNumClusters(int n) throws Exception
      Sets the number of clusters to generate.
      Parameters:
      n - the number of clusters to generate
      Throws:
      Exception - if number of clusters is negative
    • getNumClusters

      public int getNumClusters()
      Gets the number of clusters to generate.
      Returns:
      the number of clusters to generate
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a given list of options.

      Valid options are:

       -N <num>
        number of clusters. (default = 2).
       -S <num>
        Random number seed.
        (default 1)
      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class RandomizableClusterer
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
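
      To make the option format concrete, here is a small stand-alone sketch of how a -N/-S option array could be read, using the defaults listed above (2 clusters, seed 1). It is not Weka's parser: OptionSketch and parse are hypothetical names, and unlike the real method this sketch silently ignores unknown flags instead of throwing.

```java
import java.util.Arrays;

class OptionSketch {

    /** Returns {numClusters, seed}, falling back to the documented defaults. */
    static int[] parse(String[] options) {
        int numClusters = 2; // default for -N
        int seed = 1;        // default for -S
        for (int i = 0; i + 1 < options.length; i++) {
            if (options[i].equals("-N")) {
                numClusters = Integer.parseInt(options[++i]);
            } else if (options[i].equals("-S")) {
                seed = Integer.parseInt(options[++i]);
            }
        }
        return new int[] {numClusters, seed};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse(new String[] {"-N", "5", "-S", "42"})));
    }
}
```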
    • getOptions

      public String[] getOptions()
      Gets the current settings of FarthestFirst
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class RandomizableClusterer
      Returns:
      an array of strings suitable for passing to setOptions()
    • toString

      public String toString()
      Returns a string describing this clusterer.
      Overrides:
      toString in class Object
      Returns:
      a description of the clusterer as a string
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class AbstractClusterer
      Returns:
      the revision
    • main

      public static void main(String[] argv)
      Main method for testing this class.
      Parameters:
      argv - should contain the following arguments:

      -t training file [-N number of clusters]