FeaturesType
- Type of input features. E.g., Vector
M
- Concrete Model type
public abstract class ProbabilisticClassificationModel<FeaturesType,M extends ProbabilisticClassificationModel<FeaturesType,M>> extends ClassificationModel<FeaturesType,M> implements ProbabilisticClassifierParams
Model produced by a ProbabilisticClassifier. Classes are indexed {0, 1, ..., numClasses - 1}.
Constructor and Description |
---|
ProbabilisticClassificationModel() |
Modifier and Type | Method and Description |
---|---|
static void | normalizeToProbabilitiesInPlace(DenseVector v): Normalize a vector of raw predictions to be a multinomial probability vector, in place. |
Vector | predictProbability(FeaturesType features): Predict the probability of each class given the features. |
Param<String> | probabilityCol(): Param for Column name for predicted class conditional probabilities. |
M | setProbabilityCol(String value) |
M | setThresholds(double[] value) |
DoubleArrayParam | thresholds(): Param for Thresholds in multi-class classification to adjust the probability of predicting each class. |
Dataset<Row> | transform(Dataset<?> dataset): Transforms dataset by reading from featuresCol, and appending new columns as specified by parameters: predicted labels as predictionCol of type Double; raw predictions (confidences) as rawPredictionCol of type Vector; probability of each class as probabilityCol of type Vector. |
StructType | transformSchema(StructType schema): Check transform validity and derive the output schema from the input schema. |
numClasses, predict, predictRaw, rawPredictionCol, setRawPredictionCol, transformImpl
featuresCol, labelCol, numFeatures, predictionCol, setFeaturesCol, setPredictionCol
transform, transform, transform
params
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
validateAndTransformSchema
extractInstances
extractInstances, extractInstances
getLabelCol, labelCol
featuresCol, getFeaturesCol
getPredictionCol, predictionCol
clear, copy, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
toString, uid
getRawPredictionCol, rawPredictionCol
getProbabilityCol
getThresholds
$init$, initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, initLock, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log__$eq, org$apache$spark$internal$Logging$$log_, uninitialize
public static void normalizeToProbabilitiesInPlace(DenseVector v)
Normalize a vector of raw predictions to be a multinomial probability vector, in place.
The input raw predictions should be nonnegative. The output vector sums to 1.
NOTE: This is NOT applicable to all models, only ones which effectively use class instance counts for raw predictions.
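The contract described here (nonnegative inputs, output summing to 1, an exception for all-zero or negative input) can be illustrated with a minimal stand-alone sketch. It works on a plain double[] rather than a DenseVector, and the class name NormalizeSketch is hypothetical. It is not Spark's implementation.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch of the documented contract; not Spark's code.
class NormalizeSketch {
    /** Scales v in place so that its entries sum to 1. */
    static void normalizeToProbabilitiesInPlace(double[] v) {
        double sum = 0.0;
        for (double x : v) {
            if (x < 0.0) {
                throw new IllegalArgumentException(
                    "Raw predictions must be nonnegative, got " + x);
            }
            sum += x;
        }
        if (sum == 0.0) {
            throw new IllegalArgumentException("Cannot normalize an all-zero vector");
        }
        for (int i = 0; i < v.length; i++) {
            v[i] /= sum;
        }
    }

    public static void main(String[] args) {
        // Raw predictions given as class instance counts, e.g. from a tree leaf.
        double[] counts = {3.0, 1.0, 0.0};
        normalizeToProbabilitiesInPlace(counts);
        System.out.println(Arrays.toString(counts)); // prints [0.75, 0.25, 0.0]
    }
}
```

Dividing by the sum only yields meaningful probabilities when the raw predictions behave like counts, which is why the note above restricts the method to such models.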
v
- (undocumented)
IllegalArgumentException
- if the input vector is all zeros or contains negative values
public DoubleArrayParam thresholds()
Param for Thresholds in multi-class classification to adjust the probability of predicting each class.
thresholds
in interface HasThresholds
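To illustrate how thresholds can adjust the predicted class, here is a minimal sketch of the rule given in Spark's parameter documentation: the class with the largest value p/t is predicted, where p is the class probability and t is that class's threshold. The class ThresholdSketch and its predict method are hypothetical names, and the sketch assumes every threshold is strictly positive.

```java
// Hypothetical sketch of threshold-adjusted prediction; not Spark's code.
// Assumes all thresholds are strictly positive.
class ThresholdSketch {
    /** Returns the index of the class with the largest probability / threshold ratio. */
    static int predict(double[] probabilities, double[] thresholds) {
        if (probabilities.length != thresholds.length) {
            throw new IllegalArgumentException(
                "Expected one threshold per class, got " + thresholds.length
                    + " thresholds for " + probabilities.length + " classes");
        }
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < probabilities.length; i++) {
            double scaled = probabilities[i] / thresholds[i];
            if (scaled > bestScore) {
                bestScore = scaled;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] probabilities = {0.6, 0.3, 0.1};
        // A high threshold on class 0 makes it harder to predict.
        System.out.println(predict(probabilities, new double[]{0.9, 0.3, 0.5})); // prints 1
        // Equal thresholds reduce to a plain argmax over the probabilities.
        System.out.println(predict(probabilities, new double[]{1.0, 1.0, 1.0})); // prints 0
    }
}
```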
public final Param<String> probabilityCol()
Param for Column name for predicted class conditional probabilities.
probabilityCol
in interface HasProbabilityCol
public M setProbabilityCol(String value)
public M setThresholds(double[] value)
public StructType transformSchema(StructType schema)
Check transform validity and derive the output schema from the input schema.
We check validity for interactions between parameters during transformSchema and raise an exception if any parameter value is invalid. Parameter value checks which do not depend on other parameters are handled by Param.validate().
A typical implementation should first conduct verification on schema change and parameter validity, including complex parameter interaction checks.
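The distinction between the two kinds of checks can be sketched in plain Java. All names below (ParamCheckSketch, validateThresholds, checkInteraction) are hypothetical, and the specific rules are assumptions chosen for illustration: a check that looks at one parameter alone, and a check that needs two values together.

```java
// Hypothetical sketch; names and rules are illustrative, not Spark's API.
class ParamCheckSketch {
    /** Single-parameter check: depends on the thresholds array alone. */
    static void validateThresholds(double[] thresholds) {
        for (double t : thresholds) {
            if (t < 0.0) {
                throw new IllegalArgumentException("Threshold must be nonnegative, got " + t);
            }
        }
    }

    /** Interaction check: needs both the thresholds and the number of classes. */
    static void checkInteraction(double[] thresholds, int numClasses) {
        if (thresholds.length != numClasses) {
            throw new IllegalArgumentException(
                "Expected " + numClasses + " thresholds, got " + thresholds.length);
        }
    }

    public static void main(String[] args) {
        double[] thresholds = {0.5, 0.5};
        validateThresholds(thresholds);      // passes: each value is valid on its own
        checkInteraction(thresholds, 2);     // passes: length matches the class count
        try {
            checkInteraction(thresholds, 3); // fails: valid values, wrong combination
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints Expected 3 thresholds, got 2
        }
    }
}
```

The first check corresponds to what a per-parameter validator can do, while the second is the kind of cross-parameter check that only makes sense once all values are known.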
transformSchema
in class ClassificationModel<FeaturesType,M extends ProbabilisticClassificationModel<FeaturesType,M>>
schema
- (undocumented)
public Dataset<Row> transform(Dataset<?> dataset)
Transforms dataset by reading from featuresCol, and appending new columns as specified by parameters:
- predicted labels as predictionCol of type Double
- raw predictions (confidences) as rawPredictionCol of type Vector
- probability of each class as probabilityCol of type Vector.
transform
in class ClassificationModel<FeaturesType,M extends ProbabilisticClassificationModel<FeaturesType,M>>
dataset
- input dataset
public Vector predictProbability(FeaturesType features)
Predict the probability of each class given the features. This internal method is used to implement transform() and output probabilityCol.
features
- (undocumented)