Package weka.clusterers
Class SimpleKMeans
java.lang.Object
weka.clusterers.AbstractClusterer
weka.clusterers.RandomizableClusterer
weka.clusterers.SimpleKMeans
- All Implemented Interfaces:
Serializable, Cloneable, Clusterer, NumberOfClustersRequestable, CapabilitiesHandler, OptionHandler, Randomizable, RevisionHandler, WeightedInstancesHandler
public class SimpleKMeans
extends RandomizableClusterer
implements NumberOfClustersRequestable, WeightedInstancesHandler
Clusters data using the k-means algorithm.
Valid options are:
-N <num> number of clusters. (default 2).
-V Display std. deviations for centroids.
-M Don't replace missing values with mean/mode.
-S <num> Random number seed. (default 10)
-A <classname and options> Distance function to be used for instance comparison (default weka.core.EuclideanDistance)
-I <num> Maximum number of iterations.
-O Preserve order of instances.
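The following is a minimal sketch of the k-means iteration that the options above control, written against plain arrays so that it runs without Weka. It is not Weka's implementation: the class name KMeansSketch, the dense double[][] data layout, and the fixed squared Euclidean distance are assumptions made for illustration. The parameters k, seed, and maxIterations play the roles of -N, -S, and -I.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative k-means on dense numeric rows; a sketch, not Weka's implementation. */
class KMeansSketch {

    /** Squared Euclidean distance between two points of equal length. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int j = 0; j < a.length; j++) {
            double diff = a[j] - b[j];
            sum += diff * diff;
        }
        return sum;
    }

    /** Index of the centroid closest to the given point (ties go to the lower index). */
    static int nearestCentroid(double[] point, double[][] centroids) {
        int best = 0;
        double bestDistance = squaredDistance(point, centroids[0]);
        for (int c = 1; c < centroids.length; c++) {
            double distance = squaredDistance(point, centroids[c]);
            if (distance < bestDistance) {
                best = c;
                bestDistance = distance;
            }
        }
        return best;
    }

    /** Runs k-means and returns the cluster index of each row. Assumes k <= data.length. */
    static int[] cluster(double[][] data, int k, long seed, int maxIterations) {
        Random random = new Random(seed);
        int numAttributes = data[0].length;

        // Initialization: pick k distinct rows at random as the starting centroids.
        double[][] centroids = new double[k][];
        boolean[] chosen = new boolean[data.length];
        for (int c = 0; c < k; c++) {
            int row;
            do {
                row = random.nextInt(data.length);
            } while (chosen[row]);
            chosen[row] = true;
            centroids[c] = data[row].clone();
        }

        int[] assignments = new int[data.length];
        Arrays.fill(assignments, -1);
        for (int iteration = 0; iteration < maxIterations; iteration++) {
            // Assignment step: move each row to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < data.length; i++) {
                int nearest = nearestCentroid(data[i], centroids);
                if (nearest != assignments[i]) {
                    assignments[i] = nearest;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no row changed cluster
            }
            // Update step: recompute each centroid as the mean of its rows.
            double[][] sums = new double[k][numAttributes];
            int[] counts = new int[k];
            for (int i = 0; i < data.length; i++) {
                counts[assignments[i]]++;
                for (int j = 0; j < numAttributes; j++) {
                    sums[assignments[i]][j] += data[i][j];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous centroid in this sketch
                }
                for (int j = 0; j < numAttributes; j++) {
                    centroids[c][j] = sums[c][j] / counts[c];
                }
            }
        }
        return assignments;
    }
}
```

Calling KMeansSketch.cluster(data, 2, 10L, 100) on two well-separated groups of points places each group in its own cluster. Which group receives index 0 depends on the seed.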
- Version:
- $Revision: 10537 $
- Author:
- Mark Hall (mhall@cs.waikato.ac.nz), Eibe Frank (eibe@cs.waikato.ac.nz)
-
Constructor Summary
SimpleKMeans() The default constructor.
Method Summary
Modifier and Type, Method, Description
void buildClusterer(Instances data) Generates a clusterer.
int clusterInstance(Instance instance) Classifies a given instance.
String displayStdDevsTipText() Returns the tip text for this property.
String distanceFunctionTipText() Returns the tip text for this property.
String dontReplaceMissingValuesTipText() Returns the tip text for this property.
int[] getAssignments() Gets the assignments for each instance.
Capabilities getCapabilities() Returns default capabilities of the clusterer.
Instances getClusterCentroids() Gets the cluster centroids.
int[][][] getClusterNominalCounts() Returns for each cluster the frequency counts for the values of each nominal attribute.
int[] getClusterSizes() Gets the number of instances in each cluster.
Instances getClusterStandardDevs() Gets the standard deviations of the numeric attributes in each cluster.
boolean getDisplayStdDevs() Gets whether standard deviations and nominal counts should be displayed in the clustering output.
DistanceFunction getDistanceFunction() Returns the distance function currently in use.
boolean getDontReplaceMissingValues() Gets whether the replacement of missing values is disabled.
int getMaxIterations() Gets the maximum number of iterations to be executed.
int getNumClusters() Gets the number of clusters to generate.
String[] getOptions() Gets the current settings of SimpleKMeans.
boolean getPreserveInstancesOrder() Gets whether order of instances must be preserved.
String getRevision() Returns the revision string.
double getSquaredError() Gets the squared error for all clusters.
String globalInfo() Returns a string describing this clusterer.
Enumeration listOptions() Returns an enumeration describing the available options.
static void main(String[] argv) Main method for testing this class.
String maxIterationsTipText() Returns the tip text for this property.
int numberOfClusters() Returns the number of clusters.
String numClustersTipText() Returns the tip text for this property.
String preserveInstancesOrderTipText() Returns the tip text for this property.
void setDisplayStdDevs(boolean stdD) Sets whether standard deviations and nominal counts should be displayed in the clustering output.
void setDistanceFunction(DistanceFunction df) Sets the distance function to use for instance comparison.
void setDontReplaceMissingValues(boolean r) Sets whether the replacement of missing values is disabled.
void setMaxIterations(int n) Sets the maximum number of iterations to be executed.
void setNumClusters(int n) Sets the number of clusters to generate.
void setOptions(String[] options) Parses a given list of options.
void setPreserveInstancesOrder(boolean r) Sets whether order of instances must be preserved.
String toString() Returns a string describing this clusterer.
Methods inherited from class weka.clusterers.RandomizableClusterer
getSeed, seedTipText, setSeed
Methods inherited from class weka.clusterers.AbstractClusterer
distributionForInstance, forName, makeCopies, makeCopy
-
Constructor Details
-
SimpleKMeans
public SimpleKMeans()
The default constructor.
-
-
Method Details
-
globalInfo
Returns a string describing this clusterer.
- Returns:
- a description of the clusterer suitable for displaying in the explorer/experimenter GUI
-
getCapabilities
Returns default capabilities of the clusterer.
- Specified by:
getCapabilities in interface CapabilitiesHandler
- Specified by:
getCapabilities in interface Clusterer
- Overrides:
getCapabilities in class AbstractClusterer
- Returns:
- the capabilities of this clusterer
-
buildClusterer
Generates a clusterer. Has to initialize all fields of the clusterer that are not being set via options.
- Specified by:
buildClusterer in interface Clusterer
- Specified by:
buildClusterer in class AbstractClusterer
- Parameters:
data - set of instances serving as training data
- Throws:
Exception - if the clusterer has not been generated successfully
-
clusterInstance
Classifies a given instance by assigning it to a cluster.
- Specified by:
clusterInstance in interface Clusterer
- Overrides:
clusterInstance in class AbstractClusterer
- Parameters:
instance - the instance to be assigned to a cluster
- Returns:
- the number of the assigned cluster as an integer if the class is enumerated, otherwise the predicted value
- Throws:
Exception - if instance could not be classified successfully
-
numberOfClusters
Returns the number of clusters.
- Specified by:
numberOfClusters in interface Clusterer
- Specified by:
numberOfClusters in class AbstractClusterer
- Returns:
- the number of clusters generated for a training dataset.
- Throws:
Exception - if number of clusters could not be returned successfully
-
listOptions
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface OptionHandler
- Overrides:
listOptions in class RandomizableClusterer
- Returns:
- an enumeration of all the available options.
-
numClustersTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setNumClusters
Sets the number of clusters to generate.
- Specified by:
setNumClusters in interface NumberOfClustersRequestable
- Parameters:
n - the number of clusters to generate
- Throws:
Exception - if number of clusters is negative
-
getNumClusters
public int getNumClusters()
Gets the number of clusters to generate.
- Returns:
- the number of clusters to generate
-
maxIterationsTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setMaxIterations
Sets the maximum number of iterations to be executed.
- Parameters:
n - the maximum number of iterations
- Throws:
Exception - if the maximum number of iterations is smaller than 1
-
getMaxIterations
public int getMaxIterations()
Gets the maximum number of iterations to be executed.
- Returns:
- the maximum number of iterations
-
displayStdDevsTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setDisplayStdDevs
public void setDisplayStdDevs(boolean stdD)
Sets whether standard deviations and nominal counts should be displayed in the clustering output.
- Parameters:
stdD - true if std. devs and counts should be displayed
-
getDisplayStdDevs
public boolean getDisplayStdDevs()
Gets whether standard deviations and nominal counts should be displayed in the clustering output.
- Returns:
- true if std. devs and counts should be displayed
-
dontReplaceMissingValuesTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setDontReplaceMissingValues
public void setDontReplaceMissingValues(boolean r)
Sets whether the replacement of missing values is disabled.
- Parameters:
r - true if missing values are not to be replaced
-
getDontReplaceMissingValues
public boolean getDontReplaceMissingValues()
Gets whether the replacement of missing values is disabled.
- Returns:
- true if missing values are not to be replaced
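As a rough illustration of what mean replacement involves for numeric attributes, the sketch below fills each missing entry of a column with the mean of that column's observed values. The class name MeanImputationSketch, the use of Double.NaN as the missing-value marker, and the dense double[][] layout are assumptions made for this example. Weka performs this step internally with its own code, and nominal attributes (which use the mode) are not covered here.

```java
/** Illustrative mean replacement for missing numeric values; a sketch, not Weka's code. */
class MeanImputationSketch {

    /** Returns a copy of data in which every NaN is replaced by the mean of its column. */
    static double[][] replaceMissingWithMean(double[][] data) {
        int numAttributes = data[0].length;

        // First pass: column means over the observed (non-NaN) values only.
        double[] means = new double[numAttributes];
        for (int j = 0; j < numAttributes; j++) {
            double sum = 0.0;
            int observed = 0;
            for (double[] row : data) {
                if (!Double.isNaN(row[j])) {
                    sum += row[j];
                    observed++;
                }
            }
            // A column with no observed values falls back to 0.0 in this sketch.
            means[j] = observed > 0 ? sum / observed : 0.0;
        }

        // Second pass: copy the data, substituting the mean for each missing entry.
        double[][] result = new double[data.length][numAttributes];
        for (int i = 0; i < data.length; i++) {
            for (int j = 0; j < numAttributes; j++) {
                result[i][j] = Double.isNaN(data[i][j]) ? means[j] : data[i][j];
            }
        }
        return result;
    }
}
```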
-
distanceFunctionTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getDistanceFunction
Returns the distance function currently in use.
- Returns:
- the distance function
-
setDistanceFunction
Sets the distance function to use for instance comparison.
- Parameters:
df - the new distance function to use
- Throws:
Exception - if instances cannot be processed
-
preserveInstancesOrderTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setPreserveInstancesOrder
public void setPreserveInstancesOrder(boolean r)
Sets whether order of instances must be preserved.
- Parameters:
r - true if the order of instances must be preserved
-
getPreserveInstancesOrder
public boolean getPreserveInstancesOrder()
Gets whether order of instances must be preserved.
- Returns:
- true if the order of instances must be preserved
-
setOptions
Parses a given list of options. Valid options are:
-N <num> number of clusters. (default 2).
-V Display std. deviations for centroids.
-M Don't replace missing values with mean/mode.
-S <num> Random number seed. (default 10)
-A <classname and options> Distance function to be used for instance comparison (default weka.core.EuclideanDistance)
-I <num> Maximum number of iterations.
-O Preserve order of instances.
- Specified by:
setOptions in interface OptionHandler
- Overrides:
setOptions in class RandomizableClusterer
- Parameters:
options - the list of options as an array of strings
- Throws:
Exception - if an option is not supported
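To make the shape of the option array concrete, the sketch below parses an array such as {"-N", "3", "-I", "100", "-O"} into plain fields. It is a hypothetical stand-in, not Weka's parser: the class KMeansOptionsSketch and its field names are invented for illustration, and only the -N, -S, -I, -V, and -O flags are handled. Any other flag is rejected, mirroring the documented behaviour of throwing when an option is not supported.

```java
/** Hypothetical holder for a subset of the options listed above; not part of Weka. */
class KMeansOptionsSketch {
    int numClusters = 2;      // -N, default taken from the option list
    int seed = 10;            // -S, default taken from the option list
    int maxIterations = -1;   // -I, no default is documented here, so -1 means "not set"
    boolean displayStdDevs;   // -V
    boolean preserveOrder;    // -O

    /** Reads the integer that must follow the flag at position flagIndex. */
    private static int intAfter(String[] options, int flagIndex) {
        if (flagIndex + 1 >= options.length) {
            throw new IllegalArgumentException("Missing value for " + options[flagIndex]);
        }
        return Integer.parseInt(options[flagIndex + 1]);
    }

    /** Parses flags and their values in the order they appear in the array. */
    static KMeansOptionsSketch parse(String[] options) {
        KMeansOptionsSketch parsed = new KMeansOptionsSketch();
        for (int i = 0; i < options.length; i++) {
            switch (options[i]) {
                case "-N":
                    parsed.numClusters = intAfter(options, i);
                    i++; // skip the value that was just consumed
                    break;
                case "-S":
                    parsed.seed = intAfter(options, i);
                    i++;
                    break;
                case "-I":
                    parsed.maxIterations = intAfter(options, i);
                    i++;
                    break;
                case "-V":
                    parsed.displayStdDevs = true;
                    break;
                case "-O":
                    parsed.preserveOrder = true;
                    break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
        return parsed;
    }
}
```

Flags that take a value occupy two consecutive array elements, while switches such as -V and -O occupy one.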
-
getOptions
Gets the current settings of SimpleKMeans.
- Specified by:
getOptions in interface OptionHandler
- Overrides:
getOptions in class RandomizableClusterer
- Returns:
- an array of strings suitable for passing to setOptions()
-
toString
Returns a string describing this clusterer.
-
getClusterCentroids
Gets the cluster centroids.
- Returns:
- the cluster centroids
-
getClusterStandardDevs
Gets the standard deviations of the numeric attributes in each cluster.
- Returns:
- the standard deviations of the numeric attributes in each cluster
-
getClusterNominalCounts
public int[][][] getClusterNominalCounts()
Returns for each cluster the frequency counts for the values of each nominal attribute.
- Returns:
- the counts
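The three array dimensions can be read as cluster, attribute, and value. The sketch below builds such a structure from value indices and cluster assignments. The index order counts[cluster][attribute][value], the class name NominalCountsSketch, and the integer encoding of nominal values are assumptions made for illustration rather than a statement about Weka's internals.

```java
/** Illustrative per-cluster frequency counts for nominal attributes; a sketch, not Weka's code. */
class NominalCountsSketch {

    /**
     * @param values      values[i][a] is the index of the nominal value of attribute a in row i
     * @param assignments assignments[i] is the cluster index of row i
     * @param numClusters total number of clusters
     * @param numValues   numValues[a] is the number of distinct values of attribute a
     * @return counts[cluster][attribute][value]
     */
    static int[][][] count(int[][] values, int[] assignments, int numClusters, int[] numValues) {
        int[][][] counts = new int[numClusters][numValues.length][];
        for (int c = 0; c < numClusters; c++) {
            for (int a = 0; a < numValues.length; a++) {
                counts[c][a] = new int[numValues[a]];
            }
        }
        // Each row adds one to the counter of its cluster, attribute, and value.
        for (int i = 0; i < values.length; i++) {
            for (int a = 0; a < numValues.length; a++) {
                counts[assignments[i]][a][values[i][a]]++;
            }
        }
        return counts;
    }
}
```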
-
getSquaredError
public double getSquaredError()
Gets the squared error for all clusters.
- Returns:
- the squared error
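A common reading of this quantity is the within-cluster sum of squared distances between each instance and its assigned centroid. The sketch below computes that sum for plain arrays. The class name SquaredErrorSketch and the unnormalized squared Euclidean distance are assumptions, so the value Weka reports may differ, for example when the configured distance function normalizes attributes.

```java
/** Illustrative within-cluster sum of squared errors; a sketch, not Weka's code. */
class SquaredErrorSketch {

    /** Sums the squared Euclidean distance of every row to the centroid it is assigned to. */
    static double squaredError(double[][] data, double[][] centroids, int[] assignments) {
        double total = 0.0;
        for (int i = 0; i < data.length; i++) {
            double[] centroid = centroids[assignments[i]];
            for (int j = 0; j < data[i].length; j++) {
                double diff = data[i][j] - centroid[j];
                total += diff * diff;
            }
        }
        return total;
    }
}
```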
-
getClusterSizes
public int[] getClusterSizes()
Gets the number of instances in each cluster.
- Returns:
- the number of instances in each cluster
-
getAssignments
Gets the assignments for each instance.
- Returns:
- array of indexes of the centroid assigned to each instance
- Throws:
Exception - if order of instances wasn't preserved or no assignments were made
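An assignment array of this kind relates directly to the per-cluster sizes: counting how often each centroid index occurs gives the number of instances in each cluster. The sketch below shows that relationship on plain arrays. The class name ClusterSizesSketch is hypothetical, and the code does not call Weka.

```java
/** Illustrative derivation of cluster sizes from an assignment array; a sketch, not Weka's code. */
class ClusterSizesSketch {

    /** Counts how many entries of assignments refer to each cluster index. */
    static int[] sizes(int[] assignments, int numClusters) {
        int[] sizes = new int[numClusters];
        for (int cluster : assignments) {
            sizes[cluster]++;
        }
        return sizes;
    }
}
```

For the assignments {0, 1, 1, 0, 1} and two clusters, the result is {2, 3}.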
-
getRevision
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Overrides:
getRevision in class AbstractClusterer
- Returns:
- the revision
-
main
Main method for testing this class.
- Parameters:
argv - should contain the following arguments: -t training file [-N number of clusters]
-