public class NaiveBayes extends ProbabilisticClassifier<Vector,NaiveBayes,NaiveBayesModel>
Naive Bayes classifier. It supports Multinomial NB
(http://nlp.stanford.edu/IR-book/html/htmledition/naive-bayes-text-classification-1.html),
which can handle finitely supported discrete data. For example, by converting documents into
TF-IDF vectors, it can be used for document classification. By making every vector
binary (0/1) data, it can also be used as Bernoulli NB
(http://nlp.stanford.edu/IR-book/html/htmledition/the-bernoulli-model-1.html).
The input feature values must be nonnegative.

Constructor and Description |
---|
NaiveBayes() |
NaiveBayes(java.lang.String uid) |
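
The class description above mentions multinomial Naive Bayes over nonnegative features, and the class exposes a smoothing parameter. The following is a minimal plain-Java sketch of the textbook multinomial model with additive smoothing, meant only to illustrate what such a classifier estimates. It does not use Spark, and `MultinomialNbSketch` and all of its members are hypothetical names; whether Spark computes exactly these estimates is an assumption.

```java
import java.util.Arrays;

/** Textbook multinomial naive Bayes (illustrative sketch, not Spark's implementation). */
final class MultinomialNbSketch {
    final double[] logPrior;        // log P(class)
    final double[][] logLikelihood; // log P(feature | class)

    /** Assumes smoothing > 0, labels in [0, numClasses), and at least one training row. */
    MultinomialNbSketch(double[][] features, int[] labels, int numClasses, double smoothing) {
        int numFeatures = features[0].length;
        double[] classCounts = new double[numClasses];
        double[][] featureSums = new double[numClasses][numFeatures];
        for (int i = 0; i < features.length; i++) {
            classCounts[labels[i]] += 1.0;
            for (int j = 0; j < numFeatures; j++) {
                // Mirrors the documented requirement that feature values be nonnegative.
                if (features[i][j] < 0.0) {
                    throw new IllegalArgumentException("feature values must be nonnegative");
                }
                featureSums[labels[i]][j] += features[i][j];
            }
        }
        logPrior = new double[numClasses];
        logLikelihood = new double[numClasses][numFeatures];
        for (int c = 0; c < numClasses; c++) {
            // Additively smoothed class prior.
            logPrior[c] = Math.log((classCounts[c] + smoothing)
                    / (features.length + numClasses * smoothing));
            double total = Arrays.stream(featureSums[c]).sum();
            for (int j = 0; j < numFeatures; j++) {
                // Additively smoothed per-class feature probability.
                logLikelihood[c][j] = Math.log((featureSums[c][j] + smoothing)
                        / (total + numFeatures * smoothing));
            }
        }
    }

    /** Returns the class with the highest log posterior score for x. */
    int predict(double[] x) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < logPrior.length; c++) {
            double score = logPrior[c];
            for (int j = 0; j < x.length; j++) {
                score += x[j] * logLikelihood[c][j];
            }
            if (score > bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return best;
    }
}
```

In this sketch a larger smoothing value pulls the per-class feature probabilities toward uniform, which keeps features unseen in a class from receiving zero probability.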
Modifier and Type | Method and Description |
---|---|
NaiveBayes | copy(ParamMap extra) Creates a copy of this instance with the same UID and some extra params. |
java.lang.String | getModelType() |
double | getSmoothing() |
Param<java.lang.String> | modelType() The model type which is a string (case-sensitive). |
NaiveBayes | setModelType(java.lang.String value) Set the model type using a string (case-sensitive). |
NaiveBayes | setSmoothing(double value) Set the smoothing parameter. |
DoubleParam | smoothing() The smoothing parameter. |
protected NaiveBayesModel | train(DataFrame dataset) Train a model using the given dataset and parameters. |
java.lang.String | uid() An immutable unique ID for the object and its derivatives. |
StructType | validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType) Validates and transforms the input schema with the provided param map. |
Methods inherited from class ProbabilisticClassifier: setProbabilityCol, setThresholds
Methods inherited from class Classifier: setRawPredictionCol
Methods inherited from class Predictor: extractLabeledPoints, fit, setFeaturesCol, setLabelCol, setPredictionCol, transformSchema
Methods inherited from class PipelineStage: transformSchema
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Params: clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn, validateParams
Methods inherited from interface Identifiable: toString
Methods inherited from interface Logging: initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
public NaiveBayes(java.lang.String uid)
public NaiveBayes()
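public java.lang.String uid()
An immutable unique ID for the object and its derivatives (description copied from interface Identifiable).
Specified by: uid in interface Identifiable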
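public NaiveBayes setSmoothing(double value)
Set the smoothing parameter.
Parameters:
value - (undocumented)
public NaiveBayes setModelType(java.lang.String value)
Set the model type using a string (case-sensitive).
Parameters:
value - (undocumented)
protected NaiveBayesModel train(DataFrame dataset)
Train a model using the given dataset and parameters (description copied from class Predictor). Developers can implement this instead of fit() to avoid dealing with schema validation and copying parameters into the model.
Specified by: train in class Predictor<Vector,NaiveBayes,NaiveBayesModel>
Parameters:
dataset - Training dataset
public NaiveBayes copy(ParamMap extra)
Creates a copy of this instance with the same UID and some extra params (description copied from interface Params).
Specified by: copy in interface Params
Specified by: copy in class Predictor<Vector,NaiveBayes,NaiveBayesModel>
Parameters:
extra - (undocumented)
See also: defaultCopy()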
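
The train entry above describes a division of labor in which fit() handles validation and parameter copying while train does the actual learning. The following plain-Java sketch shows one way such a template-method arrangement could look. Every class and method name here (`PredictorSketch`, `ModelSketch`, `MajorityClassPredictor`, and so on) is hypothetical and none of this is Spark code; the array-based signatures are an assumption made to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model base type that carries a copy of the estimator's params. */
class ModelSketch {
    final Map<String, Object> params = new HashMap<>();
}

/** Hypothetical estimator base type: fit() is fixed, train() is the extension point. */
abstract class PredictorSketch<M extends ModelSketch> {
    protected final Map<String, Object> params = new HashMap<>();

    /** Validates the input, delegates to train, then copies params into the model. */
    public final M fit(double[][] features, int[] labels) {
        if (features.length == 0 || features.length != labels.length) {
            throw new IllegalArgumentException("need one label per non-empty feature row");
        }
        M model = train(features, labels);
        model.params.putAll(params);
        return model;
    }

    /** Subclasses implement only the learning step. */
    protected abstract M train(double[][] features, int[] labels);
}

/** Toy model that always predicts a single label. */
final class MajorityClassModel extends ModelSketch {
    final int label;

    MajorityClassModel(int label) {
        this.label = label;
    }
}

/** Toy estimator: learns the most frequent label and ignores the features. */
final class MajorityClassPredictor extends PredictorSketch<MajorityClassModel> {
    MajorityClassPredictor setSmoothing(double value) {
        params.put("smoothing", value);
        return this;
    }

    @Override
    protected MajorityClassModel train(double[][] features, int[] labels) {
        Map<Integer, Integer> counts = new HashMap<>();
        int best = labels[0];
        for (int label : labels) {
            int count = counts.merge(label, 1, Integer::sum);
            if (count > counts.get(best)) {
                best = label;
            }
        }
        return new MajorityClassModel(best);
    }
}
```

Because the subclass never touches validation or parameter copying, those steps stay identical across every estimator built on the same base type.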
public DoubleParam smoothing()
public double getSmoothing()
public Param<java.lang.String> modelType()
public java.lang.String getModelType()
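
The class description notes that making every vector binary (0/1) data lets the classifier act as Bernoulli NB. The following plain-Java sketch shows one possible preprocessing step for that case. `BinarizeSketch` is a hypothetical helper, not part of Spark, and treating any value greater than zero as 1 is an assumption chosen for illustration.

```java
/** Hypothetical helper that turns nonnegative feature values into 0/1 indicators. */
final class BinarizeSketch {
    private BinarizeSketch() {
    }

    /** Returns a new array with 1.0 where the input is greater than zero and 0.0 elsewhere. */
    static double[] binarize(double[] features) {
        double[] out = new double[features.length];
        for (int i = 0; i < features.length; i++) {
            if (features[i] < 0.0) {
                throw new IllegalArgumentException("feature values must be nonnegative");
            }
            out[i] = features[i] > 0.0 ? 1.0 : 0.0;
        }
        return out;
    }
}
```

For term-count vectors this keeps only whether a term occurs in a document, discarding how often it occurs.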
public StructType validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType)
Validates and transforms the input schema with the provided param map.
Parameters:
schema - input schema
fitting - whether this is called during fitting
featuresDataType - SQL DataType for FeaturesType. E.g., VectorUDT for vector features.