weka.filters.unsupervised.attribute
Class EMImputation

java.lang.Object
  extended by weka.filters.Filter
      extended by weka.filters.SimpleFilter
          extended by weka.filters.SimpleBatchFilter
              extended by weka.filters.unsupervised.attribute.EMImputation
All Implemented Interfaces:
java.io.Serializable, CapabilitiesHandler, OptionHandler, RevisionHandler, UnsupervisedFilter

public class EMImputation
extends SimpleBatchFilter
implements UnsupervisedFilter

Replaces missing numeric values using Expectation Maximization with a multivariate normal model. Described in: Schafer, J.L. Analysis of Incomplete Multivariate Data. New York: Chapman and Hall, 1997.

Valid options are:

 -N
  Maximum number of iterations for Expectation 
  Maximization. (-1 = no maximum)
 -E
  Convergence threshold for Expectation Maximization.
  If the change in the observed-data log-likelihood
  (or in the posterior density, if a ridge prior is
  used) between iterations is no more than this
  value, the algorithm is considered to have
  converged and iteration stops.
  (default = 0.0001)
 -P
  Use a ridge prior instead of the noninformative 
  prior. This helps when the data has a singular 
  covariance matrix.
 -Q
  The ridge parameter to use when a ridge prior is
  enabled.
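
To make the multivariate normal model concrete, the sketch below fills in one missing value with its conditional mean given an observed attribute, for the two-attribute case. It is a simplified illustration and not the filter's code: the class ConditionalMeanSketch and its parameter names are hypothetical, and the mean and covariance are taken as given, whereas EM would estimate them iteratively from the incomplete data.

```java
// Illustrative only: not part of Weka and not the EMImputation source.
final class ConditionalMeanSketch {

    /**
     * Conditional mean of x2 given x1 under a bivariate normal model:
     * E[x2 | x1] = mu2 + (s12 / s11) * (x1 - mu1),
     * where s11 is the variance of x1 and s12 the covariance of x1 and x2.
     */
    static double imputeX2(double x1, double mu1, double mu2, double s11, double s12) {
        if (s11 <= 0.0) {
            throw new IllegalArgumentException("variance of x1 must be positive");
        }
        return mu2 + (s12 / s11) * (x1 - mu1);
    }

    public static void main(String[] args) {
        // Assumed parameter estimates: mu = (2, 10), var(x1) = 4, cov(x1, x2) = 2.
        double imputed = imputeX2(3.0, 2.0, 10.0, 4.0, 2.0);
        System.out.println(imputed); // prints 10.5
    }
}
```

With more than two attributes the same idea applies, with the scalar ratio replaced by a regression on all observed attributes of the instance.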

Version:
$Revision: 5987 $
Author:
Amri Napolitano
See Also:
Serialized Form

Constructor Summary
EMImputation()
           
 
Method Summary
 Capabilities getCapabilities()
          Returns the Capabilities of this filter.
 double getLogLikelihoodThreshold()
          Gets the EM log-likelihood convergence threshold.
 int getNumIterations()
          Gets the maximum number of EM iterations.
 java.lang.String[] getOptions()
          Gets the current settings of EMImputation.
 java.lang.String getRevision()
          Returns the revision string.
 double getRidge()
          Gets the ridge parameter.
 boolean getUseRidgePrior()
          Gets whether a ridge prior is used.
 java.lang.String globalInfo()
          Returns a string describing this filter.
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
 java.lang.String logLikelihoodThresholdTipText()
          Returns the tip text for the logLikelihoodThreshold property.
 static void main(java.lang.String[] argv)
          Main method for testing this class.
 java.lang.String numIterationsTipText()
          Returns the tip text for the numIterations property.
 java.lang.String ridgeTipText()
          Returns the tip text for the ridge property.
 boolean setInputFormat(Instances instanceInfo)
          Sets the format of the input instances.
 void setLogLikelihoodThreshold(double newThreshold)
          Sets the EM log-likelihood convergence threshold.
 void setNumIterations(int newIterations)
          Sets the maximum number of EM iterations.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setRidge(double ridge)
          Sets the ridge parameter.
 void setUseRidgePrior(boolean prior)
          Sets whether to use a ridge prior.
 java.lang.String useRidgePriorTipText()
          Returns the tip text for the useRidgePrior property.
 
Methods inherited from class weka.filters.SimpleBatchFilter
batchFinished, input
 
Methods inherited from class weka.filters.SimpleFilter
debugTipText, getDebug, setDebug
 
Methods inherited from class weka.filters.Filter
batchFilterFile, filterFile, getCapabilities, getOutputFormat, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, numPendingOutput, output, outputPeek, toString, useFilter, wekaStaticWrapper
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

EMImputation

public EMImputation()
Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing this filter.

Specified by:
globalInfo in class SimpleFilter
Returns:
a description of the filter, suitable for display in the Explorer/Experimenter GUI

getCapabilities

public Capabilities getCapabilities()
Returns the Capabilities of this filter.

Specified by:
getCapabilities in interface CapabilitiesHandler
Overrides:
getCapabilities in class Filter
Returns:
the capabilities of this object
See Also:
Capabilities

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class SimpleFilter
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options.

Valid options are:

 -N
  Maximum number of iterations for Expectation 
  Maximization. (-1 = no maximum)
 -E
  Convergence threshold for Expectation Maximization.
  If the change in the observed-data log-likelihood
  (or in the posterior density, if a ridge prior is
  used) between iterations is no more than this
  value, the algorithm is considered to have
  converged and iteration stops.
  (default = 0.0001)
 -P
  Use a ridge prior instead of the noninformative 
  prior. This helps when the data has a singular 
  covariance matrix.
 -Q
  The ridge parameter to use when a ridge prior is
  enabled.

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class SimpleFilter
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
See Also:
SimpleFilter.reset()

getOptions

public java.lang.String[] getOptions()
Gets the current settings of EMImputation.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class SimpleFilter
Returns:
an array of strings suitable for passing to setOptions()
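
As a rough picture of what such an options array might look like, the sketch below assembles one from individual settings. The layout is an assumption: -N, -E and -Q are taken to be followed by a value and -P is treated as a bare flag, which the option list above suggests but does not state. The class OptionsArraySketch and its method are hypothetical and do not call Weka.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: builds a String[] in the flag/value layout assumed above.
final class OptionsArraySketch {

    static String[] buildOptions(int maxIterations, double threshold,
                                 boolean useRidgePrior, double ridge) {
        List<String> opts = new ArrayList<>();
        opts.add("-N");
        opts.add(Integer.toString(maxIterations)); // -1 = no maximum
        opts.add("-E");
        opts.add(Double.toString(threshold));
        if (useRidgePrior) {
            opts.add("-P");                        // assumed to be a flag without a value
            opts.add("-Q");
            opts.add(Double.toString(ridge));
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = buildOptions(-1, 0.0001, true, 1e-8);
        System.out.println(String.join(" ", options));
    }
}
```

An array of this shape is what setOptions(java.lang.String[]) is documented to accept; the ridge value 1e-8 here is an arbitrary example, not a documented default.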

numIterationsTipText

public java.lang.String numIterationsTipText()
Returns the tip text for the numIterations property.

Returns:
the tip text for this property, suitable for display in the Explorer/Experimenter GUI

setNumIterations

public void setNumIterations(int newIterations)
Sets the maximum number of EM iterations (-1 = no maximum).

Parameters:
newIterations - the maximum number of EM iterations

getNumIterations

public int getNumIterations()
Gets the maximum number of EM iterations (-1 = no maximum).

Returns:
the maximum number of EM iterations

logLikelihoodThresholdTipText

public java.lang.String logLikelihoodThresholdTipText()
Returns the tip text for the logLikelihoodThreshold property.

Returns:
the tip text for this property, suitable for display in the Explorer/Experimenter GUI

setLogLikelihoodThreshold

public void setLogLikelihoodThreshold(double newThreshold)
Sets the EM log-likelihood convergence threshold.

Parameters:
newThreshold - the EM log-likelihood convergence threshold

getLogLikelihoodThreshold

public double getLogLikelihoodThreshold()
Gets the EM log-likelihood convergence threshold.

Returns:
the EM log-likelihood convergence threshold
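
The way the iteration limit and the convergence threshold interact can be sketched as a single stopping test. This is an assumption-laden illustration rather than the filter's code: the class EmStoppingSketch is hypothetical, any negative limit is treated as "no maximum" (the documentation only names -1), and the change is measured as an absolute difference.

```java
// Illustrative only: a stopping rule consistent with the -N and -E descriptions.
final class EmStoppingSketch {

    static boolean shouldStop(int iterationsDone, int maxIterations,
                              double previousLogLik, double currentLogLik,
                              double threshold) {
        // A negative limit means the iteration count is not capped.
        boolean hitLimit = maxIterations >= 0 && iterationsDone >= maxIterations;
        // Converged when the change is no more than the threshold.
        boolean converged = Math.abs(currentLogLik - previousLogLik) <= threshold;
        return hitLimit || converged;
    }

    public static void main(String[] args) {
        // Small change in log-likelihood, no iteration cap.
        System.out.println(shouldStop(5, -1, -100.0, -99.99995, 1e-4));
        // Large change, but the iteration cap has been reached.
        System.out.println(shouldStop(10, 10, -100.0, -99.0, 1e-4));
    }
}
```

With no maximum set, the loop ends only through the threshold test, so a very small threshold can lead to long runs.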

useRidgePriorTipText

public java.lang.String useRidgePriorTipText()
Returns the tip text for the useRidgePrior property.

Returns:
the tip text for this property, suitable for display in the Explorer/Experimenter GUI

getUseRidgePrior

public boolean getUseRidgePrior()
Gets whether a ridge prior is used.

Returns:
true if a ridge prior is used

setUseRidgePrior

public void setUseRidgePrior(boolean prior)
Sets whether to use a ridge prior instead of the noninformative prior.

Parameters:
prior - true to use a ridge prior

ridgeTipText

public java.lang.String ridgeTipText()
Returns the tip text for the ridge property.

Returns:
the tip text for this property, suitable for display in the Explorer/Experimenter GUI

getRidge

public double getRidge()
Gets the ridge parameter.

Returns:
the ridge parameter

setRidge

public void setRidge(double ridge)
Sets the ridge parameter used when a ridge prior is enabled.

Parameters:
ridge - new ridge parameter
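
Why a ridge helps with a singular covariance matrix can be seen in a generic example: adding a small positive value to the diagonal makes a singular matrix invertible. The sketch below shows only this general idea of ridge regularization. How the filter's ridge prior actually enters the estimate is not described here and may differ; the class RidgeSketch and its methods are hypothetical.

```java
// Illustrative only: generic diagonal ridge, not the filter's ridge prior.
final class RidgeSketch {

    /** Returns a copy of the square matrix with ridge added to each diagonal entry. */
    static double[][] addRidge(double[][] cov, double ridge) {
        int n = cov.length;
        double[][] out = new double[n][];
        for (int i = 0; i < n; i++) {
            out[i] = cov[i].clone();
            out[i][i] += ridge;
        }
        return out;
    }

    /** Determinant of a 2x2 matrix. */
    static double det2(double[][] m) {
        return m[0][0] * m[1][1] - m[0][1] * m[1][0];
    }

    public static void main(String[] args) {
        // Two perfectly correlated attributes give a singular covariance matrix.
        double[][] singular = {{1.0, 1.0}, {1.0, 1.0}};
        System.out.println(det2(singular));                // 0.0, not invertible
        System.out.println(det2(addRidge(singular, 0.1))); // positive, invertible
    }
}
```

A larger ridge value stabilizes the estimate more strongly but also pulls it further from what the data alone would give.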

setInputFormat

public boolean setInputFormat(Instances instanceInfo)
                       throws java.lang.Exception
Sets the format of the input instances.

Overrides:
setInputFormat in class SimpleFilter
Parameters:
instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored - only the structure is required).
Returns:
true if the outputFormat may be collected immediately
Throws:
java.lang.Exception - if the input format can't be set successfully
See Also:
SimpleFilter.reset()

getRevision

public java.lang.String getRevision()
Returns the revision string.

Specified by:
getRevision in interface RevisionHandler
Overrides:
getRevision in class Filter
Returns:
the revision

main

public static void main(java.lang.String[] argv)
Main method for testing this class.

Parameters:
argv - should contain arguments to the filter: use -h for help