Current Issue Cover
基于Nyström密度值逼近的减法聚类

孙志海, 孔万增(杭州电子科技大学计算机学院, 杭州 310018)

摘 要
针对大规模数据集减法聚类时间复杂度高的问题,提出一种基于Nyström密度值逼近的减法聚类方法。特别适用于大规模数据集的减法聚类问题,可极大程度降低减法聚类的时间复杂度。基于Nyström逼近理论,结合经典减法聚类样本密度值计算的特点,巧妙地将Nyström理论用于减法聚类未采样样本之间密度权值矩阵的逼近,从而实现了对所有样本的密度值逼近,最后沿用经典减法聚类修正样本密度值的方法,实现整个减法聚类过程。将本文算法在人工数据、标准彩色图像及UCI数据集上进行了实验,详细说明了本文算法利用少数采样样本逼近多数未采样样本密度权值、密度值以及进行减法聚类的详细过程,并给出了聚类准确率、耗时及算法性能加速比。实验结果表明,与经典的减法聚类相比,本文算法在不影响聚类结果的情况下,对于较大规模数据集,可显著降低减法聚类的时间复杂度,极大程度地提高减法聚类的实时性能。
关键词
Density value approximation for subtractive clusteringbased on Nyström method

Sun Zhihai, Kong Wanzeng(College of Computer Science, Hangzhou Dianzi University, Hangzhou 310018, China)

Abstract
Subtractive clustering based methods have been well known for data clustering problems. However, due to the computational demands of these approaches, clustering for large scale datasets, such as spatio-temporal data and images, have been slow to appear. A novel subtractive clustering method based on Nyström approximation is proposed. The proposed method is based on the famous Nyström method. Combined with the density value computation characteristics for each sample of the classical subtractive clustering method, we apply Nyström theory to approximate the density value for each data point which has not been sampled. Finally, we complete the whole clustering procedure using classical subtractive clustering method in modifying the density values in each circulation. The proposed method substantially reduces the computational requirements of subtractive clustering based algorithms, making it feasible to use subtractive clustering to large scale subtractive clustering problems. Density value of samples could be approximated quickly using only a small number of samples. The experiment results on artificial datasets, color images, and UCI machine learning repository show efficiency in comparing with classical subtractive clustering method.
Keywords

订阅号|日报