类 KDTree

所有已实现的接口:
Serializable, AdditionalMeasureProducer, OptionHandler, RevisionHandler, TechnicalInformationHandler

public class KDTree extends NearestNeighbourSearch implements TechnicalInformationHandler
Class implementing the KDTree search algorithm for nearest neighbour search.
The connection to dataset is only a reference. For the tree structure the indexes are stored in an array.
Building the tree:
If a node has <maximal-inst-number> (option -L) instances no further splitting is done. Also if the split would leave one side empty, the branch is not split any further even if the instances in the resulting node are more than <maximal-inst-number> instances.
**PLEASE NOTE:** The algorithm can not handle missing values, so it is advisable to run ReplaceMissingValues filter if there are any missing values in the dataset.

For more information see:

Jerome H. Friedman, Jon Luis Bentley, Raphael Ari Finkel (1977). An Algorithm for Finding Best Matches in Logarithmic Expected Time. ACM Transactions on Mathematics Software. 3(3):209-226.

Andrew Moore (1991). A tutorial on kd-trees.

BibTeX:

 @article{Friedman1977,
    author = {Jerome H. Friedman and Jon Luis Bentley and Raphael Ari Finkel},
    journal = {ACM Transactions on Mathematics Software},
    month = {September},
    number = {3},
    pages = {209-226},
    title = {An Algorithm for Finding Best Matches in Logarithmic Expected Time},
    volume = {3},
    year = {1977}
 }
 
 @techreport{Moore1991,
    author = {Andrew Moore},
    booktitle = {University of Cambridge Computer Laboratory Technical Report No. 209},
    howpublished = {Extract from PhD Thesis},
    title = {A tutorial on kd-trees},
    year = {1991},
    HTTP = {Available from http://www.autonlab.org/autonweb/14665.html}
 }
 

Valid options are:

 -S <classname and options>
  Node splitting method to use.
  (default: weka.core.neighboursearch.kdtrees.SlidingMidPointOfWidestSide)
 -W <value>
  Set minimal width of a box
  (default: 1.0E-2).
 -L
  Maximal number of instances in a leaf
  (default: 40).
 -N
  Normalizing will be done
  (Select dimension for split, with normalising to universe).
版本:
$Revision: 1.3 $
作者:
Gabi Schmidberger (gabi[at-the-rate]cs[dot]waikato[dot]ac[dot]nz), Malcolm Ware (mfw4[at-the-rate]cs[dot]waikato[dot]ac[dot]nz), Ashraf M. Kibriya (amk14[at-the-rate]cs[dot]waikato[dot]ac[dot]nz)
另请参阅:
  • 字段详细资料

    • MIN

      public static final int MIN
      The index of MIN value in attributes' range array.
      另请参阅:
    • MAX

      public static final int MAX
      The index of MAX value in attributes' range array.
      另请参阅:
    • WIDTH

      public static final int WIDTH
      The index of WIDTH (MAX-MIN) value in attributes' range array.
      另请参阅:
  • 构造器详细资料

    • KDTree

      public KDTree()
      Creates a new instance of KDTree.
    • KDTree

      public KDTree(Instances insts)
      Creates a new instance of KDTree. It also builds the tree on supplied set of Instances.
      参数:
      insts - The instances/points on which the BallTree should be built on.
  • 方法详细资料

    • getTechnicalInformation

      public TechnicalInformation getTechnicalInformation()
      Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
      指定者:
      getTechnicalInformation 在接口中 TechnicalInformationHandler
      返回:
      the technical information about this class
    • kNearestNeighbours

      public Instances kNearestNeighbours(Instance target, int k) throws Exception
      Returns the k nearest neighbours of the supplied instance. >k neighbours are returned if there are more than one neighbours at the kth boundary.
      指定者:
      kNearestNeighbours 在类中 NearestNeighbourSearch
      参数:
      target - The instance to find the nearest neighbours for.
      k - The number of neighbours to find.
      返回:
      The k nearest neighbours (or >k if more there are than one neighbours at the kth boundary).
      抛出:
      Exception - if the nearest neighbour could not be found.
    • nearestNeighbour

      public Instance nearestNeighbour(Instance target) throws Exception
      Returns the nearest neighbour of the supplied target instance.
      指定者:
      nearestNeighbour 在类中 NearestNeighbourSearch
      参数:
      target - The instance to find the nearest neighbour for.
      返回:
      The nearest neighbour from among the previously supplied training instances.
      抛出:
      Exception - if the neighbours could not be found.
    • getDistances

      public double[] getDistances() throws Exception
      Returns the distances to the kNearest or 1 nearest neighbour currently found with either the kNearestNeighbours or the nearestNeighbour method.
      指定者:
      getDistances 在类中 NearestNeighbourSearch
      返回:
      array containing the distances of the nearestNeighbours. The length and ordering of the array is the same as that of the instances returned by nearestNeighbour functions.
      抛出:
      Exception - if called before calling kNearestNeighbours or nearestNeighbours.
    • setInstances

      public void setInstances(Instances instances) throws Exception
      Builds the KDTree on the given set of instances.
      覆盖:
      setInstances 在类中 NearestNeighbourSearch
      参数:
      instances - The insts on which the KDTree is to be built.
      抛出:
      Exception - If some error occurs while building the KDTree
    • update

      public void update(Instance instance) throws Exception
      Adds one instance to the KDTree. This updates the KDTree structure to take into account the newly added training instance.
      指定者:
      update 在类中 NearestNeighbourSearch
      参数:
      instance - the instance to be added. Usually the newly added instance in the training set.
      抛出:
      Exception - If the instance cannot be added.
    • addInstanceInfo

      public void addInstanceInfo(Instance instance)
      Adds one instance to KDTree loosly. It only changes the ranges in EuclideanDistance, and does not affect the structure of the KDTree.
      覆盖:
      addInstanceInfo 在类中 NearestNeighbourSearch
      参数:
      instance - the new instance. Usually this is the test instance supplied to update the range of attributes in the distance function.
    • measureTreeSize

      public double measureTreeSize()
      Returns the size of the tree.
      返回:
      the size of the tree
    • measureNumLeaves

      public double measureNumLeaves()
      Returns the number of leaves.
      返回:
      the number of leaves
    • measureMaxDepth

      public double measureMaxDepth()
      Returns the depth of the tree.
      返回:
      The depth of the tree
    • enumerateMeasures

      public Enumeration enumerateMeasures()
      Returns an enumeration of the additional measure names.
      指定者:
      enumerateMeasures 在接口中 AdditionalMeasureProducer
      覆盖:
      enumerateMeasures 在类中 NearestNeighbourSearch
      返回:
      an enumeration of the measure names
    • getMeasure

      public double getMeasure(String additionalMeasureName)
      Returns the value of the named measure.
      指定者:
      getMeasure 在接口中 AdditionalMeasureProducer
      覆盖:
      getMeasure 在类中 NearestNeighbourSearch
      参数:
      additionalMeasureName - the name of the measure to query for its value.
      返回:
      The value of the named measure
      抛出:
      IllegalArgumentException - If the named measure is not supported.
    • setMeasurePerformance

      public void setMeasurePerformance(boolean measurePerformance)
      Sets whether to calculate the performance statistics or not.
      覆盖:
      setMeasurePerformance 在类中 NearestNeighbourSearch
      参数:
      measurePerformance - Should be true if performance statistics are to be measured.
    • centerInstances

      public void centerInstances(Instances centers, int[] assignments, double pc) throws Exception
      Assigns instances to centers using KDTree.
      参数:
      centers - the current centers
      assignments - the centerindex for each instance
      pc - the threshold value for pruning.
      抛出:
      Exception - If there is some problem assigning instances to centers.
    • assignSubToCenters

      public void assignSubToCenters(KDTreeNode node, Instances centers, int[] centList, int[] assignments) throws Exception
      Assigns instances of this node to center. Center to be assign to is decided by the distance function.
      参数:
      node - The KDTreeNode whose instances are to be assigned.
      centers - all the input centers
      centList - the list of centers to work with
      assignments - index list of last assignments
      抛出:
      Exception - If there is error assigning the instances.
    • minBoxRelWidthTipText

      public String minBoxRelWidthTipText()
      Tip text for this property.
      返回:
      the tip text for this property
    • setMinBoxRelWidth

      public void setMinBoxRelWidth(double i)
      Sets the minimum relative box width.
      参数:
      i - the minimum relative box width
    • getMinBoxRelWidth

      public double getMinBoxRelWidth()
      Gets the minimum relative box width.
      返回:
      the minimum relative box width
    • maxInstInLeafTipText

      public String maxInstInLeafTipText()
      Tip text for this property.
      返回:
      the tip text for this property
    • setMaxInstInLeaf

      public void setMaxInstInLeaf(int i)
      Sets the maximum number of instances in a leaf.
      参数:
      i - the maximum number of instances in a leaf
    • getMaxInstInLeaf

      public int getMaxInstInLeaf()
      Get the maximum number of instances in a leaf.
      返回:
      the maximum number of instances in a leaf
    • normalizeNodeWidthTipText

      public String normalizeNodeWidthTipText()
      Tip text for this property.
      返回:
      the tip text for this property
    • setNormalizeNodeWidth

      public void setNormalizeNodeWidth(boolean n)
      Sets the flag for normalizing the widths of a KDTree Node by the width of the dimension in the universe.
      参数:
      n - true to use normalizing.
    • getNormalizeNodeWidth

      public boolean getNormalizeNodeWidth()
      Gets the normalize flag.
      返回:
      True if normalizing
    • getDistanceFunction

      public DistanceFunction getDistanceFunction()
      returns the distance function currently in use.
      覆盖:
      getDistanceFunction 在类中 NearestNeighbourSearch
      返回:
      the distance function
    • setDistanceFunction

      public void setDistanceFunction(DistanceFunction df) throws Exception
      sets the distance function to use for nearest neighbour search.
      覆盖:
      setDistanceFunction 在类中 NearestNeighbourSearch
      参数:
      df - the distance function to use
      抛出:
      Exception - if not EuclideanDistance
    • nodeSplitterTipText

      public String nodeSplitterTipText()
      Returns the tip text for this property.
      返回:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getNodeSplitter

      public KDTreeNodeSplitter getNodeSplitter()
      Returns the splitting method currently in use to split the nodes of the KDTree.
      返回:
      The KDTreeNodeSplitter currently in use.
    • setNodeSplitter

      public void setNodeSplitter(KDTreeNodeSplitter splitter)
      Sets the splitting method to use to split the nodes of the KDTree.
      参数:
      splitter - The KDTreeNodeSplitter to use.
    • globalInfo

      public String globalInfo()
      Returns a string describing this nearest neighbour search algorithm.
      覆盖:
      globalInfo 在类中 NearestNeighbourSearch
      返回:
      a description of the algorithm for displaying in the explorer/experimenter gui
    • listOptions

      public Enumeration listOptions()
      Returns an enumeration describing the available options.
      指定者:
      listOptions 在接口中 OptionHandler
      覆盖:
      listOptions 在类中 NearestNeighbourSearch
      返回:
      an enumeration of all the available options.
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a given list of options.

      Valid options are:

       -S <classname and options>
        Node splitting method to use.
        (default: weka.core.neighboursearch.kdtrees.SlidingMidPointOfWidestSide)
       -W <value>
        Set minimal width of a box
        (default: 1.0E-2).
       -L
        Maximal number of instances in a leaf
        (default: 40).
       -N
        Normalizing will be done
        (Select dimension for split, with normalising to universe).
      指定者:
      setOptions 在接口中 OptionHandler
      覆盖:
      setOptions 在类中 NearestNeighbourSearch
      参数:
      options - the list of options as an array of strings
      抛出:
      Exception - if an option is not supported
    • getOptions

      public String[] getOptions()
      Gets the current settings of KDtree.
      指定者:
      getOptions 在接口中 OptionHandler
      覆盖:
      getOptions 在类中 NearestNeighbourSearch
      返回:
      an array of strings suitable for passing to setOptions
    • getRevision

      public String getRevision()
      Returns the revision string.
      指定者:
      getRevision 在接口中 RevisionHandler
      返回:
      the revision