Class InterquartileRange
java.lang.Object
weka.filters.Filter
weka.filters.SimpleFilter
weka.filters.SimpleBatchFilter
weka.filters.unsupervised.attribute.InterquartileRange
- All Implemented Interfaces:
Serializable, CapabilitiesHandler, OptionHandler, RevisionHandler
A filter for detecting outliers and extreme values based on interquartile ranges. The filter skips the class attribute.
Outliers:
Q3 + OF*IQR < x <= Q3 + EVF*IQR
or
Q1 - EVF*IQR <= x < Q1 - OF*IQR
Extreme values:
x > Q3 + EVF*IQR
or
x < Q1 - EVF*IQR
Key:
Q1 = 25% quartile
Q3 = 75% quartile
IQR = Interquartile Range, the difference Q3 - Q1
OF = Outlier Factor
EVF = Extreme Value Factor
Valid options are:
-D Turns on output of debugging information.
-R <col1,col2-col4,...> Specifies the list of columns to base outlier/extreme value detection on. If an instance is considered an outlier/extreme value in at least one of those attributes, it is tagged accordingly. 'first' and 'last' are valid indexes. (default: none)
-O <num> The factor for outlier detection. (default: 3)
-E <num> The factor for extreme values detection. (default: 2*Outlier Factor)
-E-as-O Tags extreme values also as outliers. (default: off)
-P Generates Outlier/ExtremeValue pair for each numeric attribute in the range, not just a single indicator pair for all the attributes. (default: off)
-M Generates an additional attribute 'Offset' per Outlier/ExtremeValue pair that contains the multiplier by which the value is off the median: value = median + 'multiplier' * IQR. Note: implicitly sets '-P'. (default: off)
Thanks to Dale for a few brainstorming sessions.
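The threshold rules above can be sketched in plain Java. This is a minimal illustration, not the filter's implementation: the class `IqrLabelSketch`, its `Label` enum and the `classify` method are hypothetical names, and Q1 and Q3 are taken as given inputs because this page does not state how the quartiles are estimated. The filter itself writes separate Outlier and ExtremeValue indicator attributes; the single label used here is a simplification.

```java
/** Hypothetical helper (not part of Weka): applies the documented threshold rules to one value. */
final class IqrLabelSketch {

    enum Label { NORMAL, OUTLIER, EXTREME_VALUE }

    /**
     * @param x   the value to test
     * @param q1  25% quartile (assumed to be computed elsewhere)
     * @param q3  75% quartile (assumed to be computed elsewhere)
     * @param of  outlier factor (OF)
     * @param evf extreme value factor (EVF), expected to be >= OF
     */
    static Label classify(double x, double q1, double q3, double of, double evf) {
        double iqr = q3 - q1;
        // Extreme values: x > Q3 + EVF*IQR  or  x < Q1 - EVF*IQR
        if (x > q3 + evf * iqr || x < q1 - evf * iqr) {
            return Label.EXTREME_VALUE;
        }
        // Outliers: Q3 + OF*IQR < x <= Q3 + EVF*IQR  or  Q1 - EVF*IQR <= x < Q1 - OF*IQR
        if (x > q3 + of * iqr || x < q1 - of * iqr) {
            return Label.OUTLIER;
        }
        return Label.NORMAL;
    }

    public static void main(String[] args) {
        double q1 = 10.0, q3 = 20.0;    // IQR = 10
        double of = 3.0, evf = 2 * of;  // documented defaults: OF = 3, EVF = 2*OF
        for (double x : new double[] {15.0, 55.0, 81.0, -21.0, -51.0}) {
            System.out.println(x + " -> " + classify(x, q1, q3, of, evf));
        }
    }
}
```

Note that the boundaries are asymmetric by design: a value exactly at Q3 + OF*IQR is not an outlier, while a value exactly at Q3 + EVF*IQR still counts as an outlier rather than an extreme value.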
- Version:
- $Revision: 9529 $
- Author:
- Dale Fletcher (dale at cs dot waikato dot ac dot nz), fracpete (fracpete at waikato dot ac dot nz)
- See Also:
-
Field Summary
Fields
Constructor Summary
Constructors
Method Summary
Modifier and Type | Method | Description
String | attributeIndicesTipText() | Returns the tip text for this property.
String | detectionPerAttributeTipText() | Returns the tip text for this property.
String | extremeValuesAsOutliersTipText() | Returns the tip text for this property.
String | extremeValuesFactorTipText() | Returns the tip text for this property.
String | getAttributeIndices() | Gets the current range selection.
Capabilities | getCapabilities() | Returns the Capabilities of this filter.
boolean | getDetectionPerAttribute() | Gets whether an Outlier/ExtremeValue attribute pair is generated for each numeric attribute ("true") or just one pair for all numeric attributes together ("false").
boolean | getExtremeValuesAsOutliers() | Gets whether extreme values are also tagged as outliers.
double | getExtremeValuesFactor() | Gets the factor for determining the thresholds for extreme values.
String[] | getOptions() | Gets the current settings of the filter.
double | getOutlierFactor() | Gets the factor for determining the thresholds for outliers.
boolean | getOutputOffsetMultiplier() | Gets whether an additional attribute "Offset" is generated per Outlier/ExtremeValue attribute pair that lists the multiplier by which the value is off the median: value = median + 'multiplier' * IQR.
String | getRevision() | Returns the revision string.
String | globalInfo() | Returns a string describing this filter.
Enumeration | listOptions() | Returns an enumeration describing the available options.
static void | main(String[] args) | Main method for testing this class.
String | outlierFactorTipText() | Returns the tip text for this property.
String | outputOffsetMultiplierTipText() | Returns the tip text for this property.
void | setAttributeIndices(String value) | Sets which attributes are to be used for interquartile calculations and outlier/extreme value detection (only numeric attributes among the selection will be used).
void | setAttributeIndicesArray(int[] value) | Sets which attributes are to be used for interquartile calculations and outlier/extreme value detection (only numeric attributes among the selection will be used).
void | setDetectionPerAttribute(boolean value) | Sets whether an Outlier/ExtremeValue attribute pair is generated for each numeric attribute ("true") or just one pair for all numeric attributes together ("false").
void | setExtremeValuesAsOutliers(boolean value) | Sets whether extreme values are also tagged as outliers.
void | setExtremeValuesFactor(double value) | Sets the factor for determining the thresholds for extreme values.
void | setOptions(String[] options) | Parses a list of options for this object.
void | setOutlierFactor(double value) | Sets the factor for determining the thresholds for outliers.
void | setOutputOffsetMultiplier(boolean value) | Sets whether an additional attribute "Offset" is generated per Outlier/ExtremeValue attribute pair that lists the multiplier by which the value is off the median: value = median + 'multiplier' * IQR.
Methods inherited from class weka.filters.SimpleBatchFilter
batchFinished, input
Methods inherited from class weka.filters.SimpleFilter
debugTipText, getDebug, setDebug, setInputFormat
Methods inherited from class weka.filters.Filter
batchFilterFile, filterFile, getCapabilities, getOutputFormat, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, numPendingOutput, output, outputPeek, toString, useFilter, wekaStaticWrapper
-
Field Details
-
NON_NUMERIC
public static final int NON_NUMERIC
Indicator for non-numeric attributes.
- See Also:
-
-
Constructor Details
-
InterquartileRange
public InterquartileRange()
-
-
Method Details
-
globalInfo
Returns a string describing this filter.
- Specified by:
globalInfo
in class SimpleFilter
- Returns:
- a description of the filter suitable for displaying in the explorer/experimenter gui
-
listOptions
Returns an enumeration describing the available options.
- Specified by:
listOptions
in interface OptionHandler
- Overrides:
listOptions
in class SimpleFilter
- Returns:
- an enumeration of all the available options.
-
setOptions
Parses a list of options for this object. Valid options are:
-D Turns on output of debugging information.
-R <col1,col2-col4,...> Specifies the list of columns to base outlier/extreme value detection on. If an instance is considered an outlier/extreme value in at least one of those attributes, it is tagged accordingly. 'first' and 'last' are valid indexes. (default: none)
-O <num> The factor for outlier detection. (default: 3)
-E <num> The factor for extreme values detection. (default: 2*Outlier Factor)
-E-as-O Tags extreme values also as outliers. (default: off)
-P Generates Outlier/ExtremeValue pair for each numeric attribute in the range, not just a single indicator pair for all the attributes. (default: off)
-M Generates an additional attribute 'Offset' per Outlier/ExtremeValue pair that contains the multiplier by which the value is off the median: value = median + 'multiplier' * IQR. Note: implicitly sets '-P'. (default: off)
- Specified by:
setOptions
in interface OptionHandler
- Overrides:
setOptions
in class SimpleFilter
- Parameters:
options
- the list of options as an array of strings
- Throws:
Exception
- if an option is not supported
- See Also:
-
SimpleFilter.reset()
-
getOptions
Gets the current settings of the filter.
- Specified by:
getOptions
in interface OptionHandler
- Overrides:
getOptions
in class SimpleFilter
- Returns:
- an array of strings suitable for passing to setOptions
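The shape of such an option array can be illustrated without Weka on the classpath. The sketch below is a hypothetical helper (`OptionsSketch.build` is not a Weka method) that assembles the documented flags into a `String[]`, with each flag and its value as separate array elements. The exact contents and ordering returned by the real getOptions are not stated on this page, so treat the output as an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Weka): assembles the documented flags into an option array. */
final class OptionsSketch {

    static String[] build(String range, double outlierFactor, double extremeValuesFactor,
                          boolean extremeValuesAsOutliers, boolean detectionPerAttribute,
                          boolean outputOffsetMultiplier) {
        List<String> opts = new ArrayList<>();
        if (range != null && !range.isEmpty()) {
            opts.add("-R");                 // attribute range, 1-based, e.g. "first-3,5,6-last"
            opts.add(range);
        }
        opts.add("-O");                     // outlier factor
        opts.add(Double.toString(outlierFactor));
        opts.add("-E");                     // extreme value factor
        opts.add(Double.toString(extremeValuesFactor));
        if (extremeValuesAsOutliers) {
            opts.add("-E-as-O");            // tag extreme values also as outliers
        }
        if (detectionPerAttribute) {
            opts.add("-P");                 // one indicator pair per numeric attribute
        }
        if (outputOffsetMultiplier) {
            opts.add("-M");                 // additional 'Offset' attribute per pair
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = build("first-last", 3.0, 6.0, false, true, false);
        System.out.println(String.join(" ", options));
    }
}
```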
-
attributeIndicesTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getAttributeIndices
Gets the current range selection.
- Returns:
- a string containing a comma separated list of ranges
-
setAttributeIndices
Sets which attributes are to be used for interquartile calculations and outlier/extreme value detection (only numeric attributes among the selection will be used).
- Parameters:
value
- a string representing the list of attributes. Since the string will typically come from a user, attributes are indexed from 1.
e.g.: first-3,5,6-last
- Throws:
IllegalArgumentException
- if an invalid range list is supplied
-
setAttributeIndicesArray
public void setAttributeIndicesArray(int[] value)
Sets which attributes are to be used for interquartile calculations and outlier/extreme value detection (only numeric attributes among the selection will be used).
- Parameters:
value
- an array containing indexes of attributes to work on. Since the array will typically come from a program, attributes are indexed from 0.
- Throws:
IllegalArgumentException
- if an invalid set of ranges is supplied
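The two setters differ in index base: the string form is 1-based and the array form is 0-based. The sketch below shows how a range string such as first-3,5,6-last could map to the equivalent 0-based array. `RangeSketch` is a hypothetical stand-alone parser written for illustration; it is not Weka's own range handling and may differ from it in edge cases such as duplicates or inverted ranges.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical parser (not part of Weka): maps a 1-based range string to 0-based indices. */
final class RangeSketch {

    static int[] toZeroBased(String ranges, int numAttributes) {
        List<Integer> out = new ArrayList<>();
        for (String part : ranges.split(",")) {
            String[] ends = part.trim().split("-");
            if (ends.length < 1 || ends.length > 2) {
                throw new IllegalArgumentException("Invalid range: " + part);
            }
            int lo = parseIndex(ends[0], numAttributes);
            int hi = parseIndex(ends[ends.length - 1], numAttributes);
            if (lo > hi) {
                throw new IllegalArgumentException("Inverted range: " + part);
            }
            for (int i = lo; i <= hi; i++) {
                out.add(i - 1);             // shift from 1-based to 0-based
            }
        }
        int[] result = new int[out.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = out.get(i);
        }
        return result;
    }

    /** Resolves "first", "last" or a decimal number to a 1-based index within [1, n]. */
    private static int parseIndex(String token, int n) {
        String t = token.trim();
        int idx;
        if (t.equals("first")) {
            idx = 1;
        } else if (t.equals("last")) {
            idx = n;
        } else {
            idx = Integer.parseInt(t);      // NumberFormatException is an IllegalArgumentException
        }
        if (idx < 1 || idx > n) {
            throw new IllegalArgumentException("Index out of bounds: " + token);
        }
        return idx;
    }
}
```

Under these assumptions, `RangeSketch.toZeroBased("first-3,5,6-last", 7)` yields the indices 0, 1, 2, 4, 5, 6, which is the kind of array setAttributeIndicesArray expects.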
-
outlierFactorTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setOutlierFactor
public void setOutlierFactor(double value)
Sets the factor for determining the thresholds for outliers.
- Parameters:
value
- the factor.
-
getOutlierFactor
public double getOutlierFactor()
Gets the factor for determining the thresholds for outliers.
- Returns:
- the factor.
-
extremeValuesFactorTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setExtremeValuesFactor
public void setExtremeValuesFactor(double value)
Sets the factor for determining the thresholds for extreme values.
- Parameters:
value
- the factor.
-
getExtremeValuesFactor
public double getExtremeValuesFactor()
Gets the factor for determining the thresholds for extreme values.
- Returns:
- the factor.
-
extremeValuesAsOutliersTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setExtremeValuesAsOutliers
public void setExtremeValuesAsOutliers(boolean value)
Sets whether extreme values are also tagged as outliers.
- Parameters:
value
- whether or not to tag extreme values also as outliers.
-
getExtremeValuesAsOutliers
public boolean getExtremeValuesAsOutliers()
Gets whether extreme values are also tagged as outliers.
- Returns:
- true if extreme values are also tagged as outliers.
-
detectionPerAttributeTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setDetectionPerAttribute
public void setDetectionPerAttribute(boolean value)
Sets whether an Outlier/ExtremeValue attribute pair is generated for each numeric attribute ("true") or just one pair for all numeric attributes together ("false").
- Parameters:
value
- whether or not to generate indicator attribute pairs for each numeric attribute.
-
getDetectionPerAttribute
public boolean getDetectionPerAttribute()
Gets whether an Outlier/ExtremeValue attribute pair is generated for each numeric attribute ("true") or just one pair for all numeric attributes together ("false").
- Returns:
- true if indicator attribute pairs are generated for each numeric attribute.
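The difference between the two modes can be sketched as follows. With a single pair, an instance is tagged if at least one selected attribute flags it; with per-attribute detection, each numeric attribute gets its own pair. `IndicatorPairSketch` and both of its methods are hypothetical names, and the attribute counts are an assumption derived from the option descriptions on this page, not from the filter's source.

```java
/** Hypothetical helper (not part of Weka): contrasts the single-pair and per-attribute modes. */
final class IndicatorPairSketch {

    /** Single-pair mode: the instance is tagged if any selected attribute flags it. */
    static boolean anyFlagged(boolean[] perAttributeFlags) {
        for (boolean flag : perAttributeFlags) {
            if (flag) {
                return true;
            }
        }
        return false;
    }

    /**
     * Number of Outlier/ExtremeValue indicator attributes added, assuming one pair
     * overall in single-pair mode and one pair per numeric attribute otherwise.
     */
    static int indicatorAttributes(int numericAttributesInRange, boolean detectionPerAttribute) {
        if (numericAttributesInRange < 0) {
            throw new IllegalArgumentException("Attribute count must not be negative");
        }
        return detectionPerAttribute ? 2 * numericAttributesInRange : 2;
    }
}
```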
-
outputOffsetMultiplierTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setOutputOffsetMultiplier
public void setOutputOffsetMultiplier(boolean value)
Sets whether an additional attribute "Offset" is generated per Outlier/ExtremeValue attribute pair that lists the multiplier by which the value is off the median: value = median + 'multiplier' * IQR.
- Parameters:
value
- whether or not to generate the additional attribute.
-
getOutputOffsetMultiplier
public boolean getOutputOffsetMultiplier()
Gets whether an additional attribute "Offset" is generated per Outlier/ExtremeValue attribute pair that lists the multiplier by which the value is off the median: value = median + 'multiplier' * IQR.
- Returns:
- true if the additional attribute is generated.
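Rearranging the relation value = median + 'multiplier' * IQR gives multiplier = (value - median) / IQR. The sketch below computes that quantity in plain Java. `OffsetMultiplierSketch` is a hypothetical name, and rejecting a zero IQR is a choice made for this example only: how the filter handles that case is not stated on this page.

```java
/** Hypothetical helper (not part of Weka): solves value = median + multiplier * IQR for the multiplier. */
final class OffsetMultiplierSketch {

    static double multiplier(double value, double median, double iqr) {
        if (iqr == 0.0) {
            // Assumption for this sketch only; the filter's behaviour for IQR = 0 is not documented here.
            throw new IllegalArgumentException("IQR must be non-zero");
        }
        return (value - median) / iqr;
    }

    public static void main(String[] args) {
        double median = 15.0, iqr = 10.0;
        double m = multiplier(35.0, median, iqr);
        // Reconstruct the value from the multiplier to check the relation.
        System.out.println("multiplier = " + m + ", value = " + (median + m * iqr));
    }
}
```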
-
getCapabilities
Returns the Capabilities of this filter.
- Specified by:
getCapabilities
in interface CapabilitiesHandler
- Overrides:
getCapabilities
in class Filter
- Returns:
- the capabilities of this object
- See Also:
-
getRevision
Returns the revision string.
- Specified by:
getRevision
in interface RevisionHandler
- Overrides:
getRevision
in class Filter
- Returns:
- the revision
-
main
Main method for testing this class.
- Parameters:
args
- should contain arguments to the filter: use -h for help
-