public class ClusteringEvaluator extends Evaluator implements HasPredictionCol, HasFeaturesCol, HasWeightCol, DefaultParamsWritable
The Silhouette is a measure for validating the consistency within clusters. It ranges from -1 to 1: a value close to 1 means that the points in a cluster are close to the other points in the same cluster and far from the points of the other clusters.
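To make the range and its interpretation concrete, here is a minimal plain-Java sketch of a mean silhouette score. It is a naive O(n²) illustration, not Spark's implementation. The class name `SilhouetteSketch`, the method names, the 0..k-1 label encoding, and the rule that a singleton cluster contributes 0 are all assumptions made for this example.

```java
import java.util.Arrays;

/** Naive O(n^2) silhouette sketch for illustration; not Spark's implementation. */
final class SilhouetteSketch {

    /** Squared Euclidean distance between two equal-length vectors. */
    static double squaredEuclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /**
     * Mean silhouette over all points. Assumes labels are 0..k-1 and that
     * at least two clusters are non-empty (hypothetical input contract).
     */
    static double meanSilhouette(double[][] points, int[] labels) {
        int n = points.length;
        int k = Arrays.stream(labels).max().getAsInt() + 1;
        int[] sizes = new int[k];
        for (int label : labels) {
            sizes[label]++;
        }
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            // Sum of distances from point i to every other point, grouped by cluster.
            double[] sumTo = new double[k];
            for (int j = 0; j < n; j++) {
                if (j != i) {
                    sumTo[labels[j]] += squaredEuclidean(points[i], points[j]);
                }
            }
            int own = labels[i];
            if (sizes[own] == 1) {
                continue; // assumed convention: a singleton cluster contributes 0
            }
            // a: mean distance to the other points of the same cluster.
            double a = sumTo[own] / (sizes[own] - 1);
            // b: smallest mean distance to the points of any other cluster.
            double b = Double.POSITIVE_INFINITY;
            for (int c = 0; c < k; c++) {
                if (c != own && sizes[c] > 0) {
                    b = Math.min(b, sumTo[c] / sizes[c]);
                }
            }
            double denom = Math.max(a, b);
            if (denom > 0.0) {
                total += (b - a) / denom;
            }
        }
        return total / n;
    }
}
```

With the points (0,0), (0,1), (10,0), (10,1) and labels {0, 0, 1, 1}, this sketch yields about 0.99, reflecting tight, well-separated clusters. Assigning labels {0, 1, 0, 1} to the same points yields a negative score.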
Constructor and Description |
---|
ClusteringEvaluator() |
ClusteringEvaluator(String uid) |
Modifier and Type | Method and Description |
---|---|
ClusteringEvaluator | copy(ParamMap pMap): Creates a copy of this instance with the same UID and some extra params. |
Param<String> | distanceMeasure(): param for distance measure to be used in evaluation (supports "squaredEuclidean" (default), "cosine") |
double | evaluate(Dataset<?> dataset): Evaluates model output and returns a scalar metric. |
Param<String> | featuresCol(): Param for features column name. |
String | getDistanceMeasure() |
String | getMetricName() |
ClusteringMetrics | getMetrics(Dataset<?> dataset): Get a ClusteringMetrics, which can be used to get clustering metrics such as silhouette score. |
boolean | isLargerBetter(): Indicates whether the metric returned by evaluate should be maximized (true, default) or minimized (false). |
static ClusteringEvaluator | load(String path) |
Param<String> | metricName(): param for metric name in evaluation (supports "silhouette" (default)) |
Param<String> | predictionCol(): Param for prediction column name. |
static MLReader<T> | read() |
ClusteringEvaluator | setDistanceMeasure(String value) |
ClusteringEvaluator | setFeaturesCol(String value) |
ClusteringEvaluator | setMetricName(String value) |
ClusteringEvaluator | setPredictionCol(String value) |
ClusteringEvaluator | setWeightCol(String value) |
String | toString() |
String | uid(): An immutable unique ID for the object and its derivatives. |
Param<String> | weightCol(): Param for weight column name. |
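Methods inherited from interface HasPredictionCol: getPredictionCol
Methods inherited from interface HasFeaturesCol: getFeaturesCol
Methods inherited from interface HasWeightCol: getWeightCol
Methods inherited from interface Params: clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
Methods inherited from interface DefaultParamsWritable: write
Methods inherited from interface MLWritable: save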
public ClusteringEvaluator(String uid)
public ClusteringEvaluator()
public static ClusteringEvaluator load(String path)
public static MLReader<T> read()
public final Param<String> weightCol()
Param for weight column name.
Specified by: weightCol in interface HasWeightCol
public final Param<String> featuresCol()
Param for features column name.
Specified by: featuresCol in interface HasFeaturesCol
public final Param<String> predictionCol()
Param for prediction column name.
Specified by: predictionCol in interface HasPredictionCol
public String uid()
An immutable unique ID for the object and its derivatives.
Specified by: uid in interface Identifiable
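public ClusteringEvaluator copy(ParamMap pMap)
Description copied from interface Params: creates a copy of this instance with the same UID and some extra params. See defaultCopy().
public boolean isLargerBetter()
Description copied from class Evaluator: indicates whether the metric returned by evaluate should be maximized (true, default) or minimized (false). A given evaluator may support multiple metrics which may be maximized or minimized.
Overrides: isLargerBetter in class Evaluator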
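The maximize-or-minimize flag matters when several candidate models are compared by their metric. The following plain-Java sketch shows one way a caller might use such a flag to choose among scores. `BestScoreSketch` and `pickBestK` are hypothetical names, not part of the Spark API, and the scores in the usage note are made-up illustrative values.

```java
import java.util.Map;

/** Hypothetical helper showing how a larger-is-better flag drives model selection. */
final class BestScoreSketch {

    /**
     * Returns the key (for example, a candidate number of clusters) whose score
     * is best: the largest when isLargerBetter is true, the smallest otherwise.
     */
    static int pickBestK(Map<Integer, Double> scoreByK, boolean isLargerBetter) {
        Integer bestK = null;
        double bestScore = 0.0;
        for (Map.Entry<Integer, Double> entry : scoreByK.entrySet()) {
            double score = entry.getValue();
            boolean better = bestK == null
                    || (isLargerBetter ? score > bestScore : score < bestScore);
            if (better) {
                bestK = entry.getKey();
                bestScore = score;
            }
        }
        if (bestK == null) {
            throw new IllegalArgumentException("scoreByK must not be empty");
        }
        return bestK;
    }
}
```

For example, given the illustrative scores {2: 0.55, 3: 0.71, 4: 0.63}, `pickBestK(scores, true)` selects 3, which is the appropriate choice for a silhouette-style metric where larger is better.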
public ClusteringEvaluator setPredictionCol(String value)
public ClusteringEvaluator setFeaturesCol(String value)
public ClusteringEvaluator setWeightCol(String value)
public Param<String> metricName()
param for metric name in evaluation (supports "silhouette" (default))
public String getMetricName()
public ClusteringEvaluator setMetricName(String value)
public Param<String> distanceMeasure()
param for distance measure to be used in evaluation (supports "squaredEuclidean" (default), "cosine")
public String getDistanceMeasure()
public ClusteringEvaluator setDistanceMeasure(String value)
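As a rough illustration of what the two supported distance measure names refer to, the sketch below computes both in plain Java. It assumes that "cosine" means one minus the cosine similarity, and it assumes non-zero vectors for that measure. `DistanceMeasureSketch` and its methods are hypothetical names, not part of the Spark API.

```java
/** Illustrative distance functions keyed by the measure names listed above. */
final class DistanceMeasureSketch {

    /** Sum of squared coordinate differences. */
    static double squaredEuclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** One minus cosine similarity (assumed definition); inputs must be non-zero. */
    static double cosineDistance(double[] a, double[] b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Dispatches on the measure name; unknown names are rejected. */
    static double distance(String measure, double[] a, double[] b) {
        switch (measure) {
            case "squaredEuclidean":
                return squaredEuclidean(a, b);
            case "cosine":
                return cosineDistance(a, b);
            default:
                throw new IllegalArgumentException("Unsupported distance measure: " + measure);
        }
    }
}
```

Note that the cosine measure ignores vector magnitude: under this definition, (1, 0) and (2, 0) are at distance 0, while their squared Euclidean distance is 1.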
public double evaluate(Dataset<?> dataset)
Description copied from class Evaluator: evaluates model output and returns a scalar metric. The value of isLargerBetter specifies whether larger values are better.
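public ClusteringMetrics getMetrics(Dataset<?> dataset)
Get a ClusteringMetrics, which can be used to get clustering metrics such as silhouette score.
Parameters: dataset - a dataset that contains labels/observations and predictions.
public String toString()
Specified by: toString in interface Identifiable
Overrides: toString in class Object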