Class AprioriItemSet

java.lang.Object
weka.associations.ItemSet
weka.associations.AprioriItemSet
All Implemented Interfaces:
Serializable, RevisionHandler

public class AprioriItemSet extends ItemSet implements Serializable, RevisionHandler
Class for storing a set of items. Item sets are stored in lexicographic order, which is determined by the header information of the set of instances used to generate the item sets. All methods in this class assume that item sets are stored in lexicographic order. The class provides methods used by the Apriori algorithm to construct association rules.
Version:
$Revision: 9096 $
Author:
Eibe Frank (eibe@cs.waikato.ac.nz), Stefan Mutter (mutter@cs.waikato.ac.nz)
  • Constructor Detail

    • AprioriItemSet

      public AprioriItemSet(int totalTrans)
      Constructor
      Parameters:
      totalTrans - the total number of transactions in the data
  • Method Detail

    • confidenceForRule

      public static double confidenceForRule(AprioriItemSet premise, AprioriItemSet consequence)
      Outputs the confidence for a rule.
      Parameters:
      premise - the premise of the rule
      consequence - the consequence of the rule
      Returns:
      the confidence on the training data
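As a rough illustration, confidence is commonly computed as the number of transactions that contain both the premise and the consequence, divided by the number that contain the premise. The sketch below works on raw counts only. The class `ConfidenceSketch` and its parameter names are hypothetical and are not part of Weka.

```java
final class ConfidenceSketch {
    /** confidence = count(premise and consequence) / count(premise). */
    static double confidence(int premiseCount, int jointCount) {
        if (premiseCount <= 0) {
            throw new IllegalArgumentException("premiseCount must be positive");
        }
        return (double) jointCount / premiseCount;
    }

    public static void main(String[] args) {
        // 40 transactions contain the premise; 30 of them also contain the consequence.
        System.out.println(confidence(40, 30)); // prints 0.75
    }
}
```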
    • liftForRule

      public double liftForRule(AprioriItemSet premise, AprioriItemSet consequence, int consequenceCount)
      Outputs the lift for a rule. Lift is defined as:
      confidence / prob(consequence)
      Parameters:
      premise - the premise of the rule
      consequence - the consequence of the rule
      consequenceCount - how many times the consequence occurs independent of the premise
      Returns:
      the lift on the training data
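The lift definition above can be sketched from raw counts, assuming prob(consequence) is estimated as consequenceCount divided by the total number of transactions. `LiftSketch` and all of its parameter names are hypothetical; this is not Weka's implementation.

```java
final class LiftSketch {
    /** lift = confidence / prob(consequence). */
    static double lift(int premiseCount, int jointCount,
                       int consequenceCount, int totalTransactions) {
        double confidence = (double) jointCount / premiseCount;
        double probConsequence = (double) consequenceCount / totalTransactions;
        return confidence / probConsequence;
    }

    public static void main(String[] args) {
        // confidence = 30/40 = 0.75, prob(consequence) = 50/100 = 0.5
        System.out.println(lift(40, 30, 50, 100)); // prints 1.5
    }
}
```

A lift above 1 suggests the premise and consequence occur together more often than independence would predict.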
    • leverageForRule

      public double leverageForRule(AprioriItemSet premise, AprioriItemSet consequence, int premiseCount, int consequenceCount)
      Outputs the leverage for a rule. Leverage is defined as:
      prob(premise & consequence) - (prob(premise) * prob(consequence))
      Parameters:
      premise - the premise of the rule
      consequence - the consequence of the rule
      premiseCount - how many times the premise occurs independent of the consequence
      consequenceCount - how many times the consequence occurs independent of the premise
      Returns:
      the leverage on the training data
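A minimal sketch of the leverage formula above, assuming each probability is estimated as a count divided by the total number of transactions. `LeverageSketch` and its parameters are hypothetical names introduced for illustration.

```java
final class LeverageSketch {
    /** leverage = prob(premise & consequence) - prob(premise) * prob(consequence). */
    static double leverage(int premiseCount, int consequenceCount,
                           int jointCount, int totalTransactions) {
        double n = totalTransactions;
        double probJoint = jointCount / n;
        double probPremise = premiseCount / n;
        double probConsequence = consequenceCount / n;
        return probJoint - probPremise * probConsequence;
    }

    public static void main(String[] args) {
        // 0.30 - 0.40 * 0.50, which is close to 0.1 (subject to floating-point rounding)
        System.out.println(leverage(40, 50, 30, 100));
    }
}
```

A leverage of zero corresponds to a premise and consequence that behave as if independent.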
    • convictionForRule

      public double convictionForRule(AprioriItemSet premise, AprioriItemSet consequence, int premiseCount, int consequenceCount)
      Outputs the conviction for a rule. Conviction is defined as:
      prob(premise) * prob(!consequence) / prob(premise & !consequence)
      Parameters:
      premise - the premise of the rule
      consequence - the consequence of the rule
      premiseCount - how many times the premise occurs independent of the consequence
      consequenceCount - how many times the consequence occurs independent of the premise
      Returns:
      the conviction on the training data
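The conviction formula above can be sketched as follows, again estimating probabilities from counts. Returning positive infinity when the premise never occurs without the consequence is a choice made for this sketch; Weka's own handling of that case may differ. `ConvictionSketch` and its parameters are hypothetical.

```java
final class ConvictionSketch {
    /** conviction = prob(premise) * prob(!consequence) / prob(premise & !consequence). */
    static double conviction(int premiseCount, int consequenceCount,
                             int jointCount, int totalTransactions) {
        int premiseWithoutConsequence = premiseCount - jointCount;
        if (premiseWithoutConsequence == 0) {
            // Sketch-only convention: no counterexamples were observed.
            return Double.POSITIVE_INFINITY;
        }
        double n = totalTransactions;
        double probPremise = premiseCount / n;
        double probNotConsequence = (totalTransactions - consequenceCount) / n;
        double probPremiseNotConsequence = premiseWithoutConsequence / n;
        return probPremise * probNotConsequence / probPremiseNotConsequence;
    }

    public static void main(String[] args) {
        // 0.4 * 0.5 / 0.1
        System.out.println(conviction(40, 50, 30, 100));
    }
}
```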
    • generateRules

      public FastVector[] generateRules(double minConfidence, FastVector hashtables, int numItemsInSet)
      Generates all rules for an item set.
      Parameters:
      minConfidence - the minimum confidence the rules have to have
      hashtables - the hash tables containing all previously generated item sets (none may be omitted)
      numItemsInSet - the size of the item set for which the rules are to be generated
      Returns:
      all the rules with minimum confidence for the given item set
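To make the idea of rule generation concrete, the sketch below exhaustively splits one item set into every premise/consequence pair and keeps the rules that reach the minimum confidence. It is a plain enumeration for illustration, not the strategy Weka uses. The `counts` map stands in for the role of `hashtables`, and `RuleGenerationSketch`, `rules`, and `exampleCounts` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class RuleGenerationSketch {
    /** Returns "premise => consequence" strings whose confidence is at least minConfidence. */
    static List<String> rules(List<String> itemSet, Map<Set<String>, Integer> counts,
                              double minConfidence) {
        List<String> result = new ArrayList<>();
        int n = itemSet.size();
        int jointCount = counts.get(new TreeSet<>(itemSet));
        // Each bit mask selects the premise; the empty and the full mask are skipped
        // so that premise and consequence are both non-empty.
        for (int mask = 1; mask < (1 << n) - 1; mask++) {
            Set<String> premise = new TreeSet<>();
            Set<String> consequence = new TreeSet<>();
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    premise.add(itemSet.get(i));
                } else {
                    consequence.add(itemSet.get(i));
                }
            }
            double confidence = (double) jointCount / counts.get(premise);
            if (confidence >= minConfidence) {
                result.add(premise + " => " + consequence);
            }
        }
        return result;
    }

    /** Made-up support counts used only by this example. */
    static Map<Set<String>, Integer> exampleCounts() {
        Map<Set<String>, Integer> counts = new HashMap<>();
        counts.put(new TreeSet<>(Arrays.asList("bread")), 40);
        counts.put(new TreeSet<>(Arrays.asList("butter")), 50);
        counts.put(new TreeSet<>(Arrays.asList("bread", "butter")), 30);
        return counts;
    }

    public static void main(String[] args) {
        // bread => butter has confidence 30/40; butter => bread has 30/50.
        System.out.println(rules(Arrays.asList("bread", "butter"), exampleCounts(), 0.7));
    }
}
```

The lookup `counts.get(premise)` only works if every subset of the item set has already been counted, which is why the parameter description insists on all previously generated item sets.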
    • generateRulesBruteForce

      public final FastVector[] generateRulesBruteForce(double minMetric, int metricType, FastVector hashtables, int numItemsInSet, int numTransactions, double significanceLevel) throws Exception
      Generates all significant rules for an item set.
      Parameters:
      minMetric - the minimum metric (confidence, lift, leverage, improvement) the rules have to have
      metricType - (confidence=0, lift, leverage, improvement)
      hashtables - the hash tables containing all previously generated item sets (none may be omitted)
      numItemsInSet - the size of the item set for which the rules are to be generated
      numTransactions - the number of transactions in the data
      significanceLevel - the significance level for testing the rules
      Returns:
      all the rules with minimum metric for the given item set
      Throws:
      Exception - if something goes wrong
    • subtract

      public final AprioriItemSet subtract(AprioriItemSet toSubtract)
      Subtracts an item set from another one.
      Parameters:
      toSubtract - the item set to be subtracted from this one
      Returns:
      an item set that contains only the items from this item set that are not contained in toSubtract
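Conceptually this is a set difference. A minimal sketch over sorted sets of item names follows. It does not reflect how Weka represents items internally, and `SubtractSketch` is a hypothetical name.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

final class SubtractSketch {
    /** Returns the items of 'items' that are not contained in 'toSubtract'. */
    static SortedSet<String> subtract(SortedSet<String> items, SortedSet<String> toSubtract) {
        SortedSet<String> result = new TreeSet<>(items); // copy, so neither input is modified
        result.removeAll(toSubtract);
        return result;
    }

    public static void main(String[] args) {
        SortedSet<String> items = new TreeSet<>(Arrays.asList("bread", "butter", "milk"));
        SortedSet<String> toSubtract = new TreeSet<>(Arrays.asList("butter"));
        System.out.println(subtract(items, toSubtract)); // prints [bread, milk]
    }
}
```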
    • toString

      public final String toString(Instances instances)
      Returns the contents of an item set as a string.
      Overrides:
      toString in class ItemSet
      Parameters:
      instances - contains the relevant header information
      Returns:
      string describing the item set
    • singletons

      public static FastVector singletons(Instances instances) throws Exception
      Converts the header info of the given set of instances into a set of item sets (singletons). The ordering of values in the header file determines the lexicographic order.
      Parameters:
      instances - the set of instances whose header info is to be used
      Returns:
      a set of item sets, each containing a single item
      Throws:
      Exception - if singletons can't be generated successfully
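The idea of turning header information into single-item sets can be sketched without Weka by modelling the header as an ordered map from attribute name to its list of values. That model, the `attribute=value` item labels, and the class `SingletonsSketch` are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SingletonsSketch {
    /** Emits one single-item set per attribute value, in header order. */
    static List<List<String>> singletons(Map<String, List<String>> header) {
        List<List<String>> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> attribute : header.entrySet()) {
            for (String value : attribute.getValue()) {
                result.add(Collections.singletonList(attribute.getKey() + "=" + value));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // LinkedHashMap preserves insertion order, standing in for the header order.
        Map<String, List<String>> header = new LinkedHashMap<>();
        header.put("outlook", Arrays.asList("sunny", "rainy"));
        header.put("windy", Arrays.asList("true", "false"));
        System.out.println(singletons(header));
    }
}
```

Because the output follows the header order, every larger item set built from these singletons can keep the same ordering.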
    • mergeAllItemSets

      public static FastVector mergeAllItemSets(FastVector itemSets, int size, int totalTrans)
      Merges all item sets in the set of (k-1)-item sets to create the (k)-item sets and updates the counters.
      Parameters:
      itemSets - the set of (k-1)-item sets
      size - the value of (k-1)
      totalTrans - the total number of transactions in the data
      Returns:
      the generated (k)-item sets
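The merge described above resembles the classic Apriori join step: two (k-1)-item sets that agree on their first k-2 items are combined into one k-item set. The sketch below assumes every input list is sorted, distinct, non-empty, and of the same length, and it leaves out the counter updates. `MergeSketch` is a hypothetical name and this is not Weka's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MergeSketch {
    /** Joins every pair of (k-1)-item sets that share their first k-2 items. */
    static List<List<String>> merge(List<List<String>> itemSets) {
        List<List<String>> result = new ArrayList<>();
        for (int i = 0; i < itemSets.size(); i++) {
            List<String> first = itemSets.get(i);
            int prefix = first.size() - 1; // number of leading items that must match
            for (int j = i + 1; j < itemSets.size(); j++) {
                List<String> second = itemSets.get(j);
                if (!first.subList(0, prefix).equals(second.subList(0, prefix))) {
                    continue;
                }
                String a = first.get(prefix);
                String b = second.get(prefix);
                if (a.equals(b)) {
                    continue; // identical sets produce no new candidate
                }
                List<String> merged = new ArrayList<>(first.subList(0, prefix));
                // Append the two differing last items in sorted order.
                merged.add(a.compareTo(b) < 0 ? a : b);
                merged.add(a.compareTo(b) < 0 ? b : a);
                result.add(merged);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> twoItemSets = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("a", "c"),
                Arrays.asList("b", "c"));
        System.out.println(merge(twoItemSets)); // prints [[a, b, c]]
    }
}
```

Requiring a shared prefix is what relies on the lexicographic ordering: each candidate k-item set is produced from exactly one pair.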
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class ItemSet
      Returns:
      the revision