11.5 Example: Hello World! FP-growth

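The FP-growth algorithm has many open-source implementations, and one of the better-known ones is pyfpgrowth. The complete demo code is in 11-3.py in this book's GitHub repository. Installing pyfpgrowth is straightforward: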


pip install pyfpgrowth

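pyfpgrowth wraps its implementation in the following two functions, where support is the minimum support threshold (given as an occurrence count in the example below) and minConf is the minimum confidence threshold: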


patterns = pyfpgrowth.find_frequent_patterns(transactions, support)
rules = pyfpgrowth.generate_association_rules(patterns, minConf)

Suppose we want to mine frequent itemsets from the following data:


transactions = [[1, 2, 5],
                [2, 4],
                [2, 3],
                [1, 2, 4],
                [1, 3],
                [2, 3],
                [1, 3],
                [1, 2, 3, 5],
                [1, 2, 3]]
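
To make "frequent" concrete on this data, here is a minimal brute-force sketch that counts how many transactions contain each candidate itemset and keeps those that reach a minimum count of 2, the threshold this example uses. It is not how FP-growth works internally (FP-growth avoids enumerating candidates by building an FP-tree) and it is not part of pyfpgrowth. The names frequent_itemsets and min_support are illustrative assumptions.

```python
from itertools import combinations

transactions = [[1, 2, 5], [2, 4], [2, 3], [1, 2, 4], [1, 3],
                [2, 3], [1, 3], [1, 2, 3, 5], [1, 2, 3]]


def frequent_itemsets(transactions, min_support):
    """Brute force: map each itemset to its count if count >= min_support.

    Hypothetical helper for illustration only; exponential in the number
    of distinct items, so it is only suitable for tiny data sets.
    """
    baskets = [set(t) for t in transactions]
    items = sorted(set().union(*baskets))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            # Support count = number of transactions containing every item.
            count = sum(1 for b in baskets if b.issuperset(candidate))
            if count >= min_support:
                result[candidate] = count
    return result


frequent = frequent_itemsets(transactions, 2)
for itemset, count in sorted(frequent.items(),
                             key=lambda kv: (len(kv[0]), kv[0])):
    print(itemset, count)
```

On this data the sketch finds 13 frequent itemsets. For example, (1, 2, 5) appears in 2 transactions and is kept, while (1, 4) appears in only 1 and is dropped.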

We require a minimum support of 2 and a minimum confidence of 0.7:


import pyfpgrowth
patterns = pyfpgrowth.find_frequent_patterns(transactions, 2)
rules = pyfpgrowth.generate_association_rules(patterns, 0.7)
print(rules)

The output is:


{(1, 5): ((2,), 1.0),
 (5,): ((1, 2), 1.0), 
 (2, 5): ((1,), 1.0), 
 (4,): ((2,), 1.0)}
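
Each entry in this output maps an antecedent to a pair of (consequent, confidence). As a sanity check, the sketch below recomputes each confidence directly from the transactions using the standard definition confidence(A -> B) = support(A and B together) / support(A). The helpers support_count and confidence are illustrative assumptions, not pyfpgrowth functions, and the rules dictionary is copied by hand from the output above.

```python
transactions = [[1, 2, 5], [2, 4], [2, 3], [1, 2, 4], [1, 3],
                [2, 3], [1, 3], [1, 2, 3, 5], [1, 2, 3]]
baskets = [set(t) for t in transactions]


def support_count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for b in baskets if b.issuperset(itemset))


def confidence(antecedent, consequent):
    """confidence(A -> B) = support(A union B) / support(A)."""
    both = set(antecedent) | set(consequent)
    return support_count(both) / support_count(antecedent)


# Copied from the output above: antecedent -> (consequent, confidence).
rules = {(1, 5): ((2,), 1.0),
         (5,): ((1, 2), 1.0),
         (2, 5): ((1,), 1.0),
         (4,): ((2,), 1.0)}

for antecedent, (consequent, reported) in rules.items():
    computed = confidence(antecedent, consequent)
    print(antecedent, "->", consequent, "computed:", computed,
          "reported:", reported)
```

The recomputed values agree with the reported ones. For instance, item 4 occurs in 2 transactions and both of them also contain item 2, so the rule 4 -> 2 has confidence 2/2 = 1.0. By contrast, 1 -> 2 has confidence 4/6, roughly 0.67, which falls below the 0.7 threshold and is therefore absent from the output.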