RandomForestClassificationModel

java.lang.Object
- org.apache.spark.ml.PipelineStage
  - org.apache.spark.ml.Transformer
    - org.apache.spark.ml.Model<M>
      - org.apache.spark.ml.PredictionModel<FeaturesType,M>
        - org.apache.spark.ml.classification.ClassificationModel<FeaturesType,M>
          - org.apache.spark.ml.classification.ProbabilisticClassificationModel<Vector,RandomForestClassificationModel>
            - org.apache.spark.ml.classification.RandomForestClassificationModel

All Implemented Interfaces:

java.io.Serializable, Logging, Params, Identifiable
```
public final class RandomForestClassificationModel
extends ProbabilisticClassificationModel<Vector,RandomForestClassificationModel>
implements scala.Serializable
```
:: Experimental ::

Random Forest model for classification. It supports both binary and multiclass labels, as well as both continuous and categorical features.

Parameters:
_trees - Decision trees in the ensemble. Warning: these have null parents.
numFeatures - Number of features used by this model.

See Also:
Serialized Form

Method Summary

Methods
Modifier and Type	Method and Description
`RandomForestClassificationModel`	`copy(ParamMap extra)` Creates a copy of this instance with the same UID and some extra params.
`Vector`	`featureImportances()` Estimate of the importance of each feature.
`static RandomForestClassificationModel`	`fromOld(RandomForestModel oldModel, RandomForestClassifier parent, scala.collection.immutable.Map<java.lang.Object,java.lang.Object> categoricalFeatures, int numClasses)` (private[ml]) Convert a model from the old API
`int`	`numClasses()` Number of classes (values which the label can take).
`int`	`numFeatures()`
`protected Vector`	`predictRaw(Vector features)` Raw prediction for each possible label.
`protected Vector`	`raw2probabilityInPlace(Vector rawPrediction)` Estimate the probability of each class given the raw prediction, doing the computation in-place.
`java.lang.String`	`toString()`
`protected DataFrame`	`transformImpl(DataFrame dataset)`
`org.apache.spark.ml.tree.DecisionTreeModel[]`	`trees()`
`double[]`	`treeWeights()`
`java.lang.String`	`uid()` An immutable unique ID for the object and its derivatives.
`StructType`	`validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType)` Validates and transforms the input schema with the provided param map.

Methods inherited from class org.apache.spark.ml.classification.ProbabilisticClassificationModel
normalizeToProbabilitiesInPlace, predictProbability, probability2prediction, raw2prediction, raw2probability, setProbabilityCol, setThresholds, transform

Methods inherited from class org.apache.spark.ml.classification.ClassificationModel
predict, setRawPredictionCol

Methods inherited from class org.apache.spark.ml.PredictionModel
featuresDataType, setFeaturesCol, setPredictionCol, transformSchema

Methods inherited from class org.apache.spark.ml.Model
hasParent, parent, setParent

Methods inherited from class org.apache.spark.ml.Transformer
transform, transform, transform

Methods inherited from class org.apache.spark.ml.PipelineStage
transformSchema

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn, validateParams

Methods inherited from interface org.apache.spark.Logging
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning

- Method Detail
  - fromOld
```
public static RandomForestClassificationModel fromOld(RandomForestModel oldModel,
                                      RandomForestClassifier parent,
                                      scala.collection.immutable.Map<java.lang.Object,java.lang.Object> categoricalFeatures,
                                      int numClasses)
```
    (private[ml]) Converts a model from the old API (a RandomForestModel) into a RandomForestClassificationModel.
  - uid
```
public java.lang.String uid()
```
    Description copied from interface: Identifiable
    
    An immutable unique ID for the object and its derivatives.
    
    Specified by:
    
    uid in interface Identifiable
    
    Returns:
    (undocumented)
  - numFeatures
```
public int numFeatures()
```
  - numClasses
```
public int numClasses()
```
    Description copied from class: ClassificationModel
    
    Number of classes (values which the label can take).
    
    Specified by:
    
    numClasses in class ClassificationModel<Vector,RandomForestClassificationModel>
  - trees
```
public org.apache.spark.ml.tree.DecisionTreeModel[] trees()
```
  - treeWeights
```
public double[] treeWeights()
```
  - transformImpl
```
protected DataFrame transformImpl(DataFrame dataset)
```
    Overrides:
    
    transformImpl in class PredictionModel<Vector,RandomForestClassificationModel>
  - predictRaw
```
protected Vector predictRaw(Vector features)
```
    Description copied from class: ClassificationModel
    
    Raw prediction for each possible label. The meaning of a "raw" prediction may vary between algorithms, but it intuitively gives a measure of confidence in each possible label (where larger = more confident). This internal method is used to implement transform() and output rawPredictionCol.
    
    Specified by:
    
    predictRaw in class ClassificationModel<Vector,RandomForestClassificationModel>
    
    Parameters:
    features - (undocumented)
    
    Returns:
    vector where element i is the raw prediction for label i. This raw prediction may be any real number, where a larger value indicates greater confidence for that label.
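
    To make "element i is the raw prediction for label i" concrete, the following standalone sketch builds a raw vector from weighted tree votes and then picks the label with the largest raw value. The class `RawPredictionSketch`, its method names, and the voting scheme itself are assumptions made for illustration; this is not Spark's implementation of predictRaw.

```java
import java.util.Arrays;

/** Hypothetical, standalone sketch; not part of the Spark API. */
final class RawPredictionSketch {

    /**
     * Builds a raw prediction vector by letting each tree add its weight
     * to the label it predicts (an assumed voting scheme).
     */
    static double[] weightedVotes(int[] treePredictions, double[] treeWeights, int numClasses) {
        if (treePredictions.length != treeWeights.length) {
            throw new IllegalArgumentException("one weight per tree is required");
        }
        double[] raw = new double[numClasses];
        for (int t = 0; t < treePredictions.length; t++) {
            raw[treePredictions[t]] += treeWeights[t];
        }
        return raw;
    }

    /** Returns the index of the largest raw value (the first one wins on ties). */
    static int argmax(double[] raw) {
        int best = 0;
        for (int i = 1; i < raw.length; i++) {
            if (raw[i] > raw[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] predictions = {0, 2, 2, 1};          // label predicted by each of four trees
        double[] weights = {1.0, 1.0, 1.0, 1.0};   // equal tree weights
        double[] raw = weightedVotes(predictions, weights, 3);
        System.out.println(Arrays.toString(raw) + " -> label " + argmax(raw));
    }
}
```

    Larger entries mean more confidence, so taking the argmax of the raw vector yields the predicted label when no thresholds are applied.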
  - raw2probabilityInPlace
```
protected Vector raw2probabilityInPlace(Vector rawPrediction)
```
    Description copied from class: ProbabilisticClassificationModel
    
    Estimate the probability of each class given the raw prediction, doing the computation in-place. These predictions are also called class conditional probabilities.
    This internal method is used to implement transform() and output probabilityCol.
    
    Specified by:
    
    raw2probabilityInPlace in class ProbabilisticClassificationModel<Vector,RandomForestClassificationModel>
    
    Parameters:
    rawPrediction - (undocumented)
    
    Returns:
    Estimated class conditional probabilities (modified input vector)
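
    The "in-place" part of this contract can be illustrated with a plain array: the sketch below rescales raw values so they sum to 1 and returns the same array instance it was given. The class `NormalizeInPlaceSketch` and the choice of simple sum normalization are assumptions for illustration, not a statement of how this model converts raw predictions.

```java
import java.util.Arrays;

/** Hypothetical, standalone sketch; not part of the Spark API. */
final class NormalizeInPlaceSketch {

    /**
     * Divides every element by the sum of the array, overwriting the input.
     * Assumes non-negative raw values; an all-zero array is left unchanged.
     * Returns the same array instance that was passed in.
     */
    static double[] normalizeInPlace(double[] raw) {
        double sum = 0.0;
        for (double v : raw) {
            sum += v;
        }
        if (sum != 0.0) {
            for (int i = 0; i < raw.length; i++) {
                raw[i] /= sum;
            }
        }
        return raw;
    }

    public static void main(String[] args) {
        double[] raw = {1.0, 1.0, 2.0};
        double[] probabilities = normalizeInPlace(raw);
        // The caller's array has been modified; no copy was made.
        System.out.println(Arrays.toString(raw) + " same instance: " + (raw == probabilities));
    }
}
```

    Because the input is overwritten, a caller that still needs the raw values should copy them before calling a method with this contract.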
  - copy
```
public RandomForestClassificationModel copy(ParamMap extra)
```
    Description copied from interface: Params
    
    Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly.
    
    Specified by:
    
    copy in interface Params
    
    Specified by:
    
    copy in class Model<RandomForestClassificationModel>
    
    Parameters:
    extra - (undocumented)
    
    Returns:
    (undocumented)
    See Also:
    defaultCopy()
  - toString
```
public java.lang.String toString()
```
    Specified by:
    
    toString in interface Identifiable
    
    Overrides:
    
    toString in class java.lang.Object
  - featureImportances
```
public Vector featureImportances()
```
    Estimate of the importance of each feature.
    This generalizes the idea of "Gini" importance to other losses, following the explanation of Gini importance from "Random Forests" documentation by Leo Breiman and Adele Cutler, and following the implementation from scikit-learn.
    This feature importance is calculated as follows:
    - Average over trees:
      - importance(feature j) = sum (over nodes which split on feature j) of the gain, where gain is scaled by the number of instances passing through the node.
      - Normalize importances for the tree based on the total number of training instances used to build the tree.
    - Normalize the feature importance vector to sum to 1.
    
    Returns:
    (undocumented)
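
    The calculation described for featureImportances can be sketched on plain arrays as follows. Everything here is hypothetical: `FeatureImportanceSketch`, `SplitNode`, and the flattened tree representation (a list of split nodes plus a per-tree training instance count) are assumptions chosen to keep the example self-contained, and the code does not reproduce Spark's internal data structures.

```java
import java.util.Arrays;

/** Hypothetical, standalone sketch; not part of the Spark API. */
final class FeatureImportanceSketch {

    /** One internal node: the feature it splits on, its gain, and the instances reaching it. */
    static final class SplitNode {
        final int feature;
        final double gain;
        final double instanceCount;

        SplitNode(int feature, double gain, double instanceCount) {
            this.feature = feature;
            this.gain = gain;
            this.instanceCount = instanceCount;
        }
    }

    /** Per-tree step: instance-weighted gain per feature, divided by the tree's training instance count. */
    static double[] treeImportances(SplitNode[] nodes, double treeInstanceCount, int numFeatures) {
        double[] importances = new double[numFeatures];
        for (SplitNode node : nodes) {
            importances[node.feature] += node.gain * node.instanceCount;
        }
        for (int j = 0; j < numFeatures; j++) {
            importances[j] /= treeInstanceCount;
        }
        return importances;
    }

    /** Forest step: average the per-tree vectors, then normalize the result to sum to 1. */
    static double[] forestImportances(SplitNode[][] forest, double[] treeInstanceCounts, int numFeatures) {
        double[] average = new double[numFeatures];
        for (int t = 0; t < forest.length; t++) {
            double[] perTree = treeImportances(forest[t], treeInstanceCounts[t], numFeatures);
            for (int j = 0; j < numFeatures; j++) {
                average[j] += perTree[j] / forest.length;
            }
        }
        double sum = 0.0;
        for (double v : average) {
            sum += v;
        }
        if (sum > 0.0) {
            for (int j = 0; j < numFeatures; j++) {
                average[j] /= sum;
            }
        }
        return average;
    }

    public static void main(String[] args) {
        SplitNode[][] forest = {
            {new SplitNode(0, 0.5, 8.0), new SplitNode(1, 0.25, 4.0)},
            {new SplitNode(0, 0.25, 8.0)}
        };
        double[] counts = {8.0, 8.0};  // training instances used to build each tree
        System.out.println(Arrays.toString(forestImportances(forest, counts, 2)));
    }
}
```

    In this toy forest, feature 0 is split on in both trees with larger instance-weighted gain, so it receives most of the normalized importance.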
  - validateAndTransformSchema
```
public StructType validateAndTransformSchema(StructType schema,
                                    boolean fitting,
                                    DataType featuresDataType)
```
    Validates and transforms the input schema with the provided param map.
    
    Parameters:
    schema - input schema
    fitting - whether the schema is being validated during fitting (training)
    featuresDataType - SQL DataType for FeaturesType. E.g., VectorUDT for vector features.
    
    Returns:
    output schema
