Current Issue Cover
挖掘空间关联规则的前缀树算法设计与实现

刘君强1, 潘云鹤2(1.浙江大学人工智能研究所,杭州 310027;2.杭州商学院计算机信息工程学院,杭州 310035)

摘 要
空间关联规则挖掘是在空间数据库中进行知识发现的一类重要问题.为此提出了挖掘空间关联规则的二阶段策略,通过多轮次单层布尔型关联规则挖掘,自顶向下逐步细化空间谓词的粒度,从而空间谓词的计算量大大减少.同时,设计了一种基于前缀树的单层布尔型关联规则挖掘算法(FPT-Generate),不需要反复扫描数据库,不产生候选模式集,并在关键优化技术上取得了突破.实验表明,以FPT-Generate为挖掘引擎的空间关联规则发现系统的时间效率与空间可伸缩性远远优于以经典算法Apriori为引擎的系统。
关键词
Design and Implementation of FIPT-Based Spatial Association Rules Mining Algorithm

()

Abstract
Spatial association rule discovery in spatial databases is a very important data mining task. In this paper, a two stage strategy for the discovery of spatial association rules in geographical databases is proposed. The spatial computational overhead is greatly reduced by top down refinement of spatial predicate granularities and multiple recursions of single level boolean association rule discovery step, which is the key step of the algorithm. The single level boolean association rule mining algorithm, FPT Generate, is detailed. FPT Generate uses the frequent item prefix tree, FIPT, to compress and project frequent item sets, and discovers association rules by growing a frequent pattern tree, FPT, by depth first search. The algorithm FPT Generate generates association rules without candidate generation and without redundant scans of databases. Optimizing techniques for the implementation, such as pseudo projecting and pruning, dynamic threading and hashing, and disk based partitioning, are also discussed. Experiments show that spatial association discovery systems powered by FPT Generate are much more time efficient and space scalable than those powered by the classical algorithm, Apriori. Finally, a spatial association rule discovery system, SmartMiner, upon the support of MapInfo Professional, is developed.
Keywords

订阅号|日报