public class FPGrowth extends java.lang.Object implements Logging, scala.Serializable
A parallel FP-growth algorithm to mine frequent itemsets. The algorithm is described in
Li et al., PFP: Parallel FP-Growth for Query Recommendation. PFP distributes computation
in such a way that each worker executes an independent group of mining tasks. The FP-Growth
algorithm is described in Han et al., Mining frequent patterns without candidate generation.
param: minSupport the minimal support level of a frequent pattern; any pattern that appears more than (minSupport * size-of-the-dataset) times will be output
param: numPartitions number of partitions used by parallel FP-growth
See also: Association rule learning (Wikipedia), http://en.wikipedia.org/wiki/Association_rule_learning
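Because minSupport is a fraction of the dataset size rather than an absolute count, it can help to see the conversion spelled out. The following is a minimal sketch in plain Java, not Spark code: the class `SupportThresholdSketch` and its methods are hypothetical names introduced for illustration. It assumes the count threshold is the ceiling of minSupport * size-of-the-dataset and that an item meeting the threshold exactly is kept; the description above says "more than", so check the boundary behavior against the Spark version in use.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Spark: illustrates how a fractional
// minSupport maps to an absolute occurrence count.
class SupportThresholdSketch {

    // Assumption: threshold = ceil(minSupport * number of transactions).
    static long minCount(double minSupport, long numTransactions) {
        return (long) Math.ceil(minSupport * numTransactions);
    }

    // Counts, for each item, the number of transactions that contain it,
    // then keeps only items whose count reaches the threshold (inclusive).
    static Map<String, Long> frequentItems(List<List<String>> transactions,
                                           double minSupport) {
        long threshold = minCount(minSupport, transactions.size());
        Map<String, Long> counts = new TreeMap<>();
        for (List<String> transaction : transactions) {
            // TreeSet de-duplicates so an item counts once per transaction.
            for (String item : new TreeSet<>(transaction)) {
                counts.merge(item, 1L, Long::sum);
            }
        }
        counts.values().removeIf(c -> c < threshold);
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> transactions = Arrays.asList(
            Arrays.asList("a", "b", "c"),
            Arrays.asList("a", "b"),
            Arrays.asList("a", "d"),
            Arrays.asList("b", "e"));
        System.out.println(minCount(0.5, transactions.size())); // prints 2
        System.out.println(frequentItems(transactions, 0.5));   // prints {a=3, b=3}
    }
}
```

With four transactions and minSupport 0.5, only items present in at least two transactions survive, which is why `c`, `d`, and `e` are dropped.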
See also: Serialized Form

Modifier and Type | Class and Description |
---|---|
static class | FPGrowth.FreqItemset<Item> Frequent itemset. |
Constructor and Description |
---|
FPGrowth() Constructs a default instance with default parameters {minSupport: 0.3, numPartitions: same as the input data}. |
Modifier and Type | Method and Description |
---|---|
<Item,Basket extends java.lang.Iterable<Item>> FPGrowthModel<Item> | run(JavaRDD<Basket> data) Java-friendly version of run. |
<Item> FPGrowthModel<Item> | run(RDD<java.lang.Object> data, scala.reflect.ClassTag<Item> evidence$2) Computes an FP-Growth model that contains frequent itemsets. |
FPGrowth | setMinSupport(double minSupport) Sets the minimal support level (default: 0.3). |
FPGrowth | setNumPartitions(int numPartitions) Sets the number of partitions used by parallel FP-growth (default: same as input data). |
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Logging: initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
public FPGrowth()
Constructs a default instance with default parameters {minSupport: 0.3, numPartitions: same as the input data}.
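public FPGrowth setMinSupport(double minSupport)
Sets the minimal support level (default: 0.3).
Parameters: minSupport - (undocumented)
public FPGrowth setNumPartitions(int numPartitions)
Sets the number of partitions used by parallel FP-growth (default: same as input data).
Parameters: numPartitions - (undocumented)
public <Item> FPGrowthModel<Item> run(RDD<java.lang.Object> data, scala.reflect.ClassTag<Item> evidence$2)
Computes an FP-Growth model that contains frequent itemsets.
Parameters: data - input data set, each element contains a transaction
evidence$2 - (undocumented)
Returns: FPGrowthModel
public <Item,Basket extends java.lang.Iterable<Item>> FPGrowthModel<Item> run(JavaRDD<Basket> data)
Java-friendly version of run.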
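To make concrete what a set of frequent itemsets looks like for a small input, here is a brute-force sketch in plain Java. It is not FP-growth and does not use Spark: it enumerates every subset of each transaction, so it is only suitable for tiny examples. The class `FrequentItemsetsSketch` and the method `mine` are hypothetical names, and the inclusive ceiling threshold is an assumption rather than a statement about Spark's behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical illustration, not part of Spark and not FP-growth:
// counts every itemset by enumerating all subsets of each transaction.
class FrequentItemsetsSketch {

    static Map<List<String>, Long> mine(List<List<String>> transactions,
                                        double minSupport) {
        // Assumption: an itemset is frequent if it occurs in at least
        // ceil(minSupport * number of transactions) transactions.
        long threshold = (long) Math.ceil(minSupport * transactions.size());
        Map<List<String>, Long> counts = new HashMap<>();
        for (List<String> transaction : transactions) {
            // Sort and de-duplicate so each itemset has one canonical form.
            List<String> items = new ArrayList<>(new TreeSet<>(transaction));
            int n = items.size(); // must stay small: 2^n subsets per transaction
            for (int mask = 1; mask < (1 << n); mask++) {
                List<String> subset = new ArrayList<>();
                for (int i = 0; i < n; i++) {
                    if ((mask & (1 << i)) != 0) {
                        subset.add(items.get(i));
                    }
                }
                counts.merge(subset, 1L, Long::sum);
            }
        }
        counts.values().removeIf(c -> c < threshold);
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> transactions = Arrays.asList(
            Arrays.asList("a", "b", "c"),
            Arrays.asList("a", "b"),
            Arrays.asList("a", "d"),
            Arrays.asList("b", "e"));
        // Each entry pairs an itemset with the number of transactions containing it.
        mine(transactions, 0.5).forEach((itemset, freq) ->
            System.out.println(itemset + " : " + freq));
    }
}
```

For this input and minSupport 0.5, three itemsets remain: `[a]` and `[b]` with frequency 3, and `[a, b]` with frequency 2. FP-growth reaches the same kind of result without enumerating candidate subsets.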