org.apache.spark.mllib.clustering
Learning rate: exponential decay rate
Mini-batch fraction, which sets the fraction of documents sampled and used in each iteration.
A (positive) learning parameter that downweights early iterations. Larger values make early iterations count less.
Learning rate: exponential decay rate. It should be in (0.5, 1.0] to guarantee asymptotic convergence. Default: 0.51, based on the original Online LDA paper.
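The interaction between the decay rate (kappa) and the early-iteration downweighting parameter (tau0) can be sketched numerically. In the Online LDA paper, the weight given to iteration t is rho_t = (tau0 + t)^(-kappa). The snippet below is a minimal illustration of that formula only; the object and method names are hypothetical, and whether Spark's implementation counts iterations from 0 or 1 is not stated here.

```scala
object StepSizeSketch {
  // rho_t = (tau0 + t)^(-kappa), the step-size schedule from Hoffman et al. (2010).
  // Defaults mirror the documented defaults: tau0 = 1024, kappa = 0.51.
  def stepSize(t: Int, tau0: Double = 1024.0, kappa: Double = 0.51): Double =
    math.pow(tau0 + t, -kappa)

  def main(args: Array[String]): Unit = {
    // The weight shrinks as t grows; a larger tau0 flattens the early part of the curve.
    for (t <- Seq(1, 10, 100, 1000)) {
      println(f"t=$t%4d  rho=${stepSize(t)}%.5f")
    }
  }
}
```

A larger tau0 makes every early weight smaller, and a kappa closer to 1.0 makes the weights decay faster as iterations accumulate.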
Mini-batch fraction in (0, 1], which sets the fraction of documents sampled and used in each iteration.
Note that this should be adjusted in sync with LDA.setMaxIterations() so that the entire corpus is used. Specifically, set both so that maxIterations * miniBatchFraction >= 1.
Default: 0.05, i.e., 5% of total documents.
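The coverage condition above can be checked before training. The helper below is a hypothetical sketch, not part of the Spark API; it only evaluates the inequality maxIterations * miniBatchFraction >= 1 stated in the note.

```scala
object CoverageCheckSketch {
  // True when the documented condition maxIterations * miniBatchFraction >= 1 holds.
  def coversCorpus(maxIterations: Int, miniBatchFraction: Double): Boolean = {
    require(miniBatchFraction > 0.0 && miniBatchFraction <= 1.0,
      "miniBatchFraction must be in (0, 1]")
    maxIterations * miniBatchFraction >= 1.0
  }

  def main(args: Array[String]): Unit = {
    // With the default fraction of 0.05, 10 iterations fall short and 100 are ample.
    println(coversCorpus(10, 0.05))
    println(coversCorpus(100, 0.05))
  }
}
```

Because the comparison uses floating-point arithmetic, values that sit exactly on the boundary are best given a small safety margin, for example by running a few more iterations than the minimum.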
A (positive) learning parameter that downweights early iterations. Larger values make early iterations count less. Default: 1024, following the original Online LDA paper.
:: DeveloperApi ::
An online optimizer for LDA. The optimizer implements the online variational Bayes LDA algorithm, which processes a subset of the corpus on each iteration and updates the term-topic distribution adaptively.
Original Online LDA paper: Hoffman, Blei and Bach, "Online Learning for Latent Dirichlet Allocation." NIPS, 2010.
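A minimal configuration sketch, assuming Spark MLlib is on the classpath. It writes the documented defaults out explicitly and attaches the optimizer to an LDA instance. The topic count (10), the iteration count (20), and the object name are illustrative assumptions, not recommendations from this documentation.

```scala
import org.apache.spark.mllib.clustering.{LDA, OnlineLDAOptimizer}

object OnlineLdaConfigSketch {
  // Optimizer with the documented default values spelled out.
  val optimizer: OnlineLDAOptimizer = new OnlineLDAOptimizer()
    .setTau0(1024.0)             // downweights early iterations
    .setKappa(0.51)              // decay rate, in (0.5, 1.0]
    .setMiniBatchFraction(0.05)  // 5% of documents per iteration

  // 20 iterations * 0.05 satisfies maxIterations * miniBatchFraction >= 1.
  val lda: LDA = new LDA()
    .setK(10)                    // illustrative number of topics
    .setMaxIterations(20)
    .setOptimizer(optimizer)
}
```

Training would then call `lda.run(corpus)` on an RDD of (document id, term-count vector) pairs, which is omitted here because it needs a running SparkContext.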