public abstract class GeneralizedLinearAlgorithm<M extends GeneralizedLinearModel> extends java.lang.Object implements Logging, scala.Serializable
Constructor and Description |
---|
GeneralizedLinearAlgorithm() |
Modifier and Type | Method and Description |
---|---|
protected boolean | addIntercept(): Whether to add intercept (default: false). |
protected abstract M | createModel(Vector weights, double intercept): Create a model given the weights and intercept. |
int | getNumFeatures(): The dimension of training features. |
boolean | isAddIntercept(): Get whether the algorithm uses addIntercept. |
protected int | numFeatures(): The dimension of training features. |
protected int | numOfLinearPredictor(): In GeneralizedLinearModel, only a single linear predictor is allowed for both weights and intercept. |
abstract Optimizer | optimizer(): The optimizer to solve the problem. |
M | run(RDD<LabeledPoint> input): Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries. |
M | run(RDD<LabeledPoint> input, Vector initialWeights): Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries, starting from the initial weights provided. |
GeneralizedLinearAlgorithm<M> | setIntercept(boolean addIntercept): Set whether the algorithm should add an intercept. |
GeneralizedLinearAlgorithm<M> | setValidateData(boolean validateData): Set whether the algorithm should validate data before training. |
protected boolean | validateData() |
protected scala.collection.Seq<scala.Function1<RDD<LabeledPoint>,java.lang.Object>> | validators() |
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Logging: initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
protected scala.collection.Seq<scala.Function1<RDD<LabeledPoint>,java.lang.Object>> validators()
public abstract Optimizer optimizer()
protected boolean addIntercept()
protected boolean validateData()
protected int numOfLinearPredictor()
In GeneralizedLinearModel, only a single linear predictor is allowed for both weights and intercept. However, multinomial logistic regression with K possible outcomes trains K-1 independent binary logistic regression models, which requires K-1 sets of linear predictors.
As a workaround, when more than one set of linear predictors is needed, a larger weights vector is constructed that holds both the weights and the intercepts. If intercepts are added, the dimension of weights is (numOfLinearPredictor) * (numFeatures + 1). If intercepts are not added, the dimension of weights is (numOfLinearPredictor) * numFeatures.
Thus, the intercepts are encapsulated in weights, and the value of intercept in GeneralizedLinearModel is left as zero.
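The dimension rule above can be illustrated with a small, Spark-free sketch. `PackedWeights` and its methods are hypothetical helpers, not part of the Spark API. The layout they assume (one contiguous block per linear predictor, with the intercept stored last in each block) is an assumption made for illustration; the description above states only the total dimension.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of Spark) illustrating the packed weights vector. */
final class PackedWeights {
    private PackedWeights() {}

    /** Length of the packed weights vector, following the dimension rule above. */
    static int packedDimension(int numOfLinearPredictor, int numFeatures, boolean addIntercept) {
        int perPredictor = addIntercept ? numFeatures + 1 : numFeatures;
        return numOfLinearPredictor * perPredictor;
    }

    /**
     * Feature weights of one linear predictor (0-based index).
     * ASSUMPTION: blocks are contiguous, one per predictor, intercept last in each block.
     */
    static double[] featureWeights(double[] packed, int predictor,
                                   int numFeatures, boolean addIntercept) {
        int stride = addIntercept ? numFeatures + 1 : numFeatures;
        int start = predictor * stride;
        return Arrays.copyOfRange(packed, start, start + numFeatures);
    }

    /** Intercept of one linear predictor under the same assumed layout; 0.0 if none is stored. */
    static double intercept(double[] packed, int predictor,
                            int numFeatures, boolean addIntercept) {
        if (!addIntercept) {
            return 0.0;
        }
        int stride = numFeatures + 1;
        return packed[predictor * stride + numFeatures];
    }
}
```

For example, with K = 3 outcomes there are 2 linear predictors; with 4 features and intercepts added, `PackedWeights.packedDimension(2, 4, true)` evaluates to 2 * (4 + 1) = 10, and without intercepts to 2 * 4 = 8.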
public int getNumFeatures()
protected int numFeatures()
protected abstract M createModel(Vector weights, double intercept)
Parameters:
weights - (undocumented)
intercept - (undocumented)
public boolean isAddIntercept()
public GeneralizedLinearAlgorithm<M> setIntercept(boolean addIntercept)
Parameters:
addIntercept - (undocumented)
public GeneralizedLinearAlgorithm<M> setValidateData(boolean validateData)
Parameters:
validateData - (undocumented)
public M run(RDD<LabeledPoint> input)
Parameters:
input - (undocumented)
public M run(RDD<LabeledPoint> input, Vector initialWeights)
Parameters:
input - (undocumented)
initialWeights - (undocumented)