weka.classifiers.meta
Class RegressionByDiscretization

java.lang.Object
  extended by weka.classifiers.AbstractClassifier
      extended by weka.classifiers.SingleClassifierEnhancer
          extended by weka.classifiers.meta.RegressionByDiscretization
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, Classifier, ConditionalDensityEstimator, IntervalEstimator, CapabilitiesHandler, OptionHandler, RevisionHandler

public class RegressionByDiscretization
extends SingleClassifierEnhancer
implements IntervalEstimator, ConditionalDensityEstimator

A regression scheme that applies any classifier to a copy of the data in which the class attribute has been discretized (equal-width by default; equal-frequency with -F). The predicted value is the expected value of the per-interval mean class values, weighted by the probabilities the base classifier predicts for each interval.
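
The prediction rule described above can be sketched as a probability-weighted average of per-interval means. This is a minimal illustration only: the class name ExpectedValueSketch, the method predict, and the sample numbers are hypothetical and not part of Weka, and the real class obtains the probabilities from the base classifier.

```java
/** Hypothetical helper, not part of Weka: expected value over discretized class intervals. */
final class ExpectedValueSketch {

    /**
     * @param binProbabilities predicted probability of each interval
     * @param binMeans mean class value of the training instances in each interval
     * @return the probability-weighted average of the interval means
     */
    static double predict(double[] binProbabilities, double[] binMeans) {
        if (binProbabilities.length != binMeans.length) {
            throw new IllegalArgumentException("one mean per interval is required");
        }
        double weightedSum = 0.0;
        double probabilitySum = 0.0;
        for (int i = 0; i < binMeans.length; i++) {
            weightedSum += binProbabilities[i] * binMeans[i];
            probabilitySum += binProbabilities[i];
        }
        if (probabilitySum <= 0.0) {
            throw new IllegalArgumentException("probabilities must have a positive sum");
        }
        // Dividing by the sum keeps the result correct if the inputs are not normalized.
        return weightedSum / probabilitySum;
    }

    public static void main(String[] args) {
        double[] probabilities = {0.2, 0.5, 0.3}; // made-up classifier output
        double[] means = {1.0, 3.0, 5.0};         // made-up interval means
        System.out.println(predict(probabilities, means));
    }
}
```

The result always lies between the smallest and largest interval mean, which is why the number of bins limits how fine-grained the predictions can be.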

Valid options are:

 -B <int>
  Number of bins for equal-width discretization
  (default 10).
 
 -E
  Whether to delete empty bins after discretization
  (default false).
 
 -F
  Use equal-frequency instead of equal-width discretization.
 -D
  If set, classifier is run in debug mode and
  may output additional info to the console
 -W
  Full name of base classifier.
  (default: weka.classifiers.trees.J48)
 
 Options specific to classifier weka.classifiers.trees.J48:
 
 -U
  Use unpruned tree.
 -C <pruning confidence>
  Set confidence threshold for pruning.
  (default 0.25)
 -M <minimum number of instances>
  Set minimum number of instances per leaf.
  (default 2)
 -R
  Use reduced error pruning.
 -N <number of folds>
  Set number of folds for reduced error
  pruning. One fold is used as pruning set.
  (default 3)
 -B
  Use binary splits only.
 -S
  Don't perform subtree raising.
 -L
  Do not clean up after the tree has been built.
 -A
  Laplace smoothing for predicted probabilities.
 -Q <seed>
  Seed for random data shuffling (default 1).

Version:
$Revision: 5925 $
Author:
Len Trigg (trigg@cs.waikato.ac.nz), Eibe Frank (eibe@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
static int ESTIMATOR_HISTOGRAM
          Use histogram estimator
static int ESTIMATOR_KERNEL
          Use kernel estimator
static int ESTIMATOR_NORMAL
          Use normal estimator
static Tag[] TAGS_ESTIMATOR
          The estimator types that can be selected
 
Constructor Summary
RegressionByDiscretization()
          Default constructor.
 
Method Summary
 void buildClassifier(Instances instances)
          Generates the classifier.
 double classifyInstance(Instance instance)
          Returns a predicted class for the test instance.
 java.lang.String deleteEmptyBinsTipText()
          Returns the tip text for this property
 java.lang.String estimatorTypeTipText()
          Returns the tip text for this property
 Capabilities getCapabilities()
          Returns default capabilities of the classifier.
 boolean getDeleteEmptyBins()
          Gets whether empty bins are deleted.
 SelectedTag getEstimatorType()
          Get the estimator type
 boolean getMinimizeAbsoluteError()
          Gets whether to minimize absolute error.
 int getNumBins()
          Gets the number of bins the class attribute will be divided into
 java.lang.String[] getOptions()
          Gets the current settings of the Classifier.
 java.lang.String getRevision()
          Returns the revision string.
 TechnicalInformation getTechnicalInformation()
          Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
 boolean getUseEqualFrequency()
          Gets whether equal-frequency discretization is used instead of equal-width.
 java.lang.String globalInfo()
          Returns a string describing classifier
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
 double logDensity(Instance instance, double value)
          Returns natural logarithm of density estimate for given value based on given instance.
static void main(java.lang.String[] argv)
          Main method for testing this class.
 java.lang.String minimizeAbsoluteErrorTipText()
          Returns the tip text for this property
 java.lang.String numBinsTipText()
          Returns the tip text for this property
 double[][] predictIntervals(Instance instance, double confidenceLevel)
          Returns an N * 2 array, where N is the number of prediction intervals.
 void setDeleteEmptyBins(boolean b)
          Sets whether to delete empty bins.
 void setEstimatorType(SelectedTag newEstimator)
          Set the estimator
 void setMinimizeAbsoluteError(boolean b)
          Sets whether to minimize absolute error.
 void setNumBins(int numBins)
          Sets the number of bins to divide the class attribute into
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setUseEqualFrequency(boolean newUseEqualFrequency)
          Sets whether equal-frequency discretization is used instead of equal-width.
 java.lang.String toString()
          Returns a description of the classifier.
 java.lang.String useEqualFrequencyTipText()
          Returns the tip text for this property
 
Methods inherited from class weka.classifiers.SingleClassifierEnhancer
classifierTipText, getClassifier, setClassifier
 
Methods inherited from class weka.classifiers.AbstractClassifier
debugTipText, distributionForInstance, forName, getDebug, makeCopies, makeCopy, setDebug
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

ESTIMATOR_HISTOGRAM

public static final int ESTIMATOR_HISTOGRAM
Use histogram estimator

See Also:
Constant Field Values

ESTIMATOR_KERNEL

public static final int ESTIMATOR_KERNEL
Use kernel estimator

See Also:
Constant Field Values

ESTIMATOR_NORMAL

public static final int ESTIMATOR_NORMAL
Use normal estimator

See Also:
Constant Field Values

TAGS_ESTIMATOR

public static final Tag[] TAGS_ESTIMATOR
The estimator types that can be selected

Constructor Detail

RegressionByDiscretization

public RegressionByDiscretization()
Default constructor.

Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing classifier

Returns:
a description suitable for displaying in the explorer/experimenter gui

getTechnicalInformation

public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.

Returns:
the technical information about this class

getCapabilities

public Capabilities getCapabilities()
Returns default capabilities of the classifier.

Specified by:
getCapabilities in interface Classifier
Specified by:
getCapabilities in interface CapabilitiesHandler
Overrides:
getCapabilities in class SingleClassifierEnhancer
Returns:
the capabilities of this classifier
See Also:
Capabilities

buildClassifier

public void buildClassifier(Instances instances)
                     throws java.lang.Exception
Generates the classifier.

Specified by:
buildClassifier in interface Classifier
Parameters:
instances - set of instances serving as training data
Throws:
java.lang.Exception - if the classifier has not been generated successfully

predictIntervals

public double[][] predictIntervals(Instance instance,
                                   double confidenceLevel)
                            throws java.lang.Exception
Returns an N * 2 array, where N is the number of prediction intervals. In each row, the first element contains the lower boundary of the corresponding prediction interval and the second element the upper boundary.

Specified by:
predictIntervals in interface IntervalEstimator
Parameters:
instance - the instance to make the prediction for.
confidenceLevel - the percentage of cases that the interval should cover.
Returns:
an array of prediction intervals
Throws:
java.lang.Exception - if the intervals can't be computed
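
The N * 2 layout can be consumed as in the following sketch, which checks whether a value falls inside any returned interval. IntervalCoverageSketch and covers are hypothetical names, and the array literal merely stands in for a value returned by predictIntervals; no Weka call is made here.

```java
/** Hypothetical helper, not part of Weka: reads an N * 2 array of prediction intervals. */
final class IntervalCoverageSketch {

    /** Returns true if value lies inside at least one [lower, upper] row. */
    static boolean covers(double[][] intervals, double value) {
        for (double[] interval : intervals) {
            double lower = interval[0]; // first element: lower boundary
            double upper = interval[1]; // second element: upper boundary
            if (value >= lower && value <= upper) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up intervals standing in for the result of predictIntervals.
        double[][] intervals = {{1.0, 2.5}, {4.0, 6.0}};
        System.out.println(covers(intervals, 2.0));
        System.out.println(covers(intervals, 3.0));
    }
}
```

Counting how often the true class value is covered on held-out data gives an empirical check against the requested confidenceLevel.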

logDensity

public double logDensity(Instance instance,
                         double value)
                  throws java.lang.Exception
Returns natural logarithm of density estimate for given value based on given instance.

Specified by:
logDensity in interface ConditionalDensityEstimator
Parameters:
instance - the instance to make the prediction for.
value - the value to make the prediction for.
Returns:
the natural logarithm of the density estimate
Throws:
java.lang.Exception - if the density can't be computed
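
For intuition, a histogram-style density can be written as an interval's probability divided by its width, with the natural logarithm taken at the end. The sketch below rests on that assumption and is not a transcription of Weka's estimator classes; HistogramLogDensitySketch, cutPoints, and binProbabilities are hypothetical names.

```java
/** Hypothetical helper, not part of Weka: log density of a piecewise-constant histogram. */
final class HistogramLogDensitySketch {

    /**
     * @param cutPoints ascending interval boundaries, length = number of intervals + 1
     * @param binProbabilities probability mass assigned to each interval
     * @param value the point at which the density is evaluated
     * @return natural logarithm of the density, or negative infinity outside the covered range
     */
    static double logDensity(double[] cutPoints, double[] binProbabilities, double value) {
        if (cutPoints.length != binProbabilities.length + 1) {
            throw new IllegalArgumentException("need one more cut point than intervals");
        }
        for (int i = 0; i < binProbabilities.length; i++) {
            boolean last = (i == binProbabilities.length - 1);
            boolean inside = value >= cutPoints[i]
                    && (value < cutPoints[i + 1] || (last && value == cutPoints[i + 1]));
            if (inside) {
                double width = cutPoints[i + 1] - cutPoints[i];
                return Math.log(binProbabilities[i] / width);
            }
        }
        return Double.NEGATIVE_INFINITY; // zero density outside all intervals
    }

    public static void main(String[] args) {
        double[] cutPoints = {0.0, 2.0, 4.0}; // two intervals of width 2
        double[] probabilities = {0.5, 0.5};  // made-up interval probabilities
        System.out.println(logDensity(cutPoints, probabilities, 1.0));
    }
}
```

Summing such log densities over a test set gives a log-likelihood, which is a common way to compare conditional density estimators.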

classifyInstance

public double classifyInstance(Instance instance)
                        throws java.lang.Exception
Returns a predicted class for the test instance.

Specified by:
classifyInstance in interface Classifier
Overrides:
classifyInstance in class AbstractClassifier
Parameters:
instance - the instance to be classified
Returns:
predicted class value
Throws:
java.lang.Exception - if the prediction couldn't be made

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class SingleClassifierEnhancer
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options.

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class SingleClassifierEnhancer
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the Classifier.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class SingleClassifierEnhancer
Returns:
an array of strings suitable for passing to setOptions

numBinsTipText

public java.lang.String numBinsTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getNumBins

public int getNumBins()
Gets the number of bins the class attribute will be divided into

Returns:
the number of bins.

setNumBins

public void setNumBins(int numBins)
Sets the number of bins to divide the class attribute into

Parameters:
numBins - the number of bins
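
To illustrate what the number of bins controls under equal-width discretization, the sketch below maps a class value to an interval index. EqualWidthSketch and binIndex are hypothetical names, and the boundary handling (half-open intervals, clamping at the ends) is an assumption that may differ from Weka's discretization filter.

```java
/** Hypothetical helper, not part of Weka: equal-width interval lookup. */
final class EqualWidthSketch {

    /** Returns the zero-based index of the equal-width interval containing value. */
    static int binIndex(double value, double min, double max, int numBins) {
        if (numBins < 1 || max <= min) {
            throw new IllegalArgumentException("need numBins >= 1 and max > min");
        }
        double width = (max - min) / numBins;
        int index = (int) Math.floor((value - min) / width);
        // Clamp so that values at or beyond the range ends fall into the outer intervals.
        return Math.max(0, Math.min(numBins - 1, index));
    }

    public static void main(String[] args) {
        // With 10 bins over [0, 100], each interval is 10 units wide.
        System.out.println(binIndex(25.0, 0.0, 100.0, 10));
    }
}
```

More bins give the base classifier more classes to separate with fewer training instances per class, so the setting trades resolution against reliability of the probability estimates.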

deleteEmptyBinsTipText

public java.lang.String deleteEmptyBinsTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getDeleteEmptyBins

public boolean getDeleteEmptyBins()
Gets whether empty bins are deleted.

Returns:
true if empty bins get deleted.

setDeleteEmptyBins

public void setDeleteEmptyBins(boolean b)
Sets whether to delete empty bins.

Parameters:
b - if true, empty bins will be deleted

minimizeAbsoluteErrorTipText

public java.lang.String minimizeAbsoluteErrorTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getMinimizeAbsoluteError

public boolean getMinimizeAbsoluteError()
Gets whether to minimize absolute error.

Returns:
true if absolute error is to be minimized

setMinimizeAbsoluteError

public void setMinimizeAbsoluteError(boolean b)
Sets whether to minimize absolute error.

Parameters:
b - if true, absolute error is minimized
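
As a general statistical fact, expected absolute error is minimized by a median of the predictive distribution, whereas expected squared error is minimized by its mean. The sketch below shows that idea on a discrete distribution; it is an assumption about the intent of this property, not a copy of the class's internal code, and WeightedMedianSketch is a hypothetical name.

```java
/** Hypothetical helper, not part of Weka: median of a discrete distribution. */
final class WeightedMedianSketch {

    /**
     * @param sortedValues candidate values in ascending order
     * @param probabilities probability of each value, assumed to sum to about 1
     * @return the first value at which the cumulative probability reaches 0.5
     */
    static double median(double[] sortedValues, double[] probabilities) {
        if (sortedValues.length == 0 || sortedValues.length != probabilities.length) {
            throw new IllegalArgumentException("need matching, non-empty arrays");
        }
        double cumulative = 0.0;
        for (int i = 0; i < sortedValues.length; i++) {
            cumulative += probabilities[i];
            if (cumulative >= 0.5) {
                return sortedValues[i];
            }
        }
        // Reached only if the probabilities sum to less than 0.5.
        return sortedValues[sortedValues.length - 1];
    }

    public static void main(String[] args) {
        double[] values = {1.0, 3.0, 50.0};      // made-up interval means
        double[] probabilities = {0.4, 0.4, 0.2}; // made-up interval probabilities
        System.out.println(median(values, probabilities));
    }
}
```

In this made-up example the mean is pulled toward the outlying value 50.0, while the median stays at 3.0, which is the usual motivation for preferring absolute error with skewed targets.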

useEqualFrequencyTipText

public java.lang.String useEqualFrequencyTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getUseEqualFrequency

public boolean getUseEqualFrequency()
Gets whether equal-frequency discretization is used instead of equal-width.

Returns:
true if equal-frequency discretization is used.

setUseEqualFrequency

public void setUseEqualFrequency(boolean newUseEqualFrequency)
Sets whether equal-frequency discretization is used instead of equal-width.

Parameters:
newUseEqualFrequency - if true, equal-frequency discretization is used.
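
Equal-frequency discretization places boundaries so that each interval receives roughly the same number of training values, which helps when the class values are skewed. The following is a simple sketch of that idea, not Weka's exact algorithm; EqualFrequencySketch and cutPoints are hypothetical names, and ties between equal values are not treated specially.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Weka: equal-frequency cut points. */
final class EqualFrequencySketch {

    /** Returns numBins - 1 cut points, each halfway between two neighbouring sorted values. */
    static double[] cutPoints(double[] values, int numBins) {
        if (numBins < 1 || values.length < numBins) {
            throw new IllegalArgumentException("need at least one value per bin");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double[] cuts = new double[numBins - 1];
        for (int k = 1; k < numBins; k++) {
            int firstOfNextBin = k * sorted.length / numBins;
            cuts[k - 1] = (sorted[firstOfNextBin - 1] + sorted[firstOfNextBin]) / 2.0;
        }
        return cuts;
    }

    public static void main(String[] args) {
        // Skewed made-up class values: most are small, two are large.
        double[] values = {1.0, 2.0, 3.0, 4.0, 100.0, 200.0};
        System.out.println(Arrays.toString(cutPoints(values, 3)));
    }
}
```

With the same made-up data, three equal-width intervals over [1, 200] would put four of the six values into the first interval and leave the middle one nearly empty.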

estimatorTypeTipText

public java.lang.String estimatorTypeTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getEstimatorType

public SelectedTag getEstimatorType()
Get the estimator type

Returns:
the estimator type

setEstimatorType

public void setEstimatorType(SelectedTag newEstimator)
Set the estimator

Parameters:
newEstimator - the estimator to use

toString

public java.lang.String toString()
Returns a description of the classifier.

Overrides:
toString in class java.lang.Object
Returns:
a description of the classifier as a string.

getRevision

public java.lang.String getRevision()
Returns the revision string.

Specified by:
getRevision in interface RevisionHandler
Overrides:
getRevision in class AbstractClassifier
Returns:
the revision

main

public static void main(java.lang.String[] argv)
Main method for testing this class.

Parameters:
argv - the options