两阶段混合粒子群优化聚类
doi: 10.3969/j.issn.0258-2724.2012.06.020
Two-Step Hybrid PSO-Based Clustering Algorithm
-
摘要: 为解决数据集样本维数较高时已有粒子群优化K均值算法计算速度较慢且聚类结果不稳定的问题,利用第1阶段聚类层次凝聚聚类获得准确率较高的子簇集合,作为粒子群优化K均值聚类算法初始聚类中心的搜索空间,进行第2阶段聚类.提出了一种简化的粒子编码方法,以减小样本维数对计算复杂度的影响;引入混沌的思想,以保持粒子种群的多样性,从而避免粒子群优化算法可能出现的早熟现象.通过两阶段聚类,有效地融合了粒子群优化、层次聚类与划分聚类算法的优点.在多个UCI数据集上的聚类结果表明,与几种对比算法聚类结果的最优值相比,其纯度分别提高了1%~8%,且耗时减少50%以上.Abstract: In order to solve the problems of the existing PSO (particle swarm optimization) K-means algorithms, i.e., their calculation speeds are slow and the clustering results are unstable when samples have a high dimension, some high-quality sub-clusters were generated by hierarchical agglomerative clustering. These sub-clusters were used as the search space of candidate centroids of the PSO K-means. In order to reduce the computational complexity when the dimension of a sample is high, a simplified particle encoding method was proposed. In addition, chaotic idea was introduced to keep the diversity of particle swarm to avoid premature. By two-step hybrid clustering the advantages of the hierarchical clustering, the partitioning clustering and the PSO were combined. The experimental results on several UCI data sets show that compared with the best results of several contrastive algorithms, the purity of its clustering result increases by 1% to 8% and the consuming time reduces by 50% at least.
-
Key words:
- clustering /
- dissimilarity /
- particle swarm optimization /
- particle encoding /
- initial centroid
-
王维博,冯全源. 基于改进粒子群算法的工程项目综合优化[J]. 西南交通大学学报,2011,46(1): 76-83. WANG Weibo, FENG Quanyuan. Synthesis optimization for construction project based on modified particle swarm optimization algorithm[J]. Journal of Southwest Jiaotong University, 2011, 46(1): 76-83. 秦钰,荆继武,向继,等. 基于优化初始类中心点的K-means改进算法[J]. 中国科学院研究生院学报,2007,24(6): 771-776. QIN Yu, JING Jiwu, XIANG Ji, et al. An improved K-means algorithm based on optimizing initial points[J]. Journal of the Graduate School of the Chinese Academy of Sciences, 2007, 24(6): 771-776. 汪中,刘贵全,陈恩红. 一种优化初始中心点的K-means算法[J]. 模式识别与人工智能,2009,22(2):299-304. WANG Zhong, LIU Guiquan, CHEN Enhong. A K-means algorithm based on optimized initial center points[J]. Pattern Recognition and Artificial Intelligence, 2009, 22(2): 299-304. KENNEDY J, EBERHART R C. Particle swarm optimization//Proceedings of IEEE Internal Conference on Neural Networks. Piscataway: IEEE Press, 1995: 1942-1948. RANA S, JASOLA S, KUMAR R. A review on particle swarm optimization algorithms and their applications to data clustering[J]. Artificial Intelligence Review, 2011, 35(3): 211-222. Vander MERWE D W, ENGELHRECHT A P. Data clustering using particle swarm optimization//Proceedings of Evolutionary Computation. Piscataway: IEEE Press, 2003: 215-220. ESMIN A A A, PEREIRA D L, de ARAUJO F P A. Study of different approach to clustering data by using the particle swarm optimization algorithm//IEEE World Congress on Computational Intelligence. Piscataway:IEEE Press, 2008: 1817-1822. 刘靖明,韩丽川,侯立文. 基于粒子群的K-均值聚类算法[J]. 系统工程理论与实践,2005,6(3): 55-58. LIU Jingming, HAN Lichuan, HOU Liwen. Clustering analysis based on particle swarm optimization algorithm[J]. System Engineering: Theory and Practice, 2005, 6(3): 55-58. CUI X H, POLOK T E. Document clustering using particle swarm optimization//Swarm Intelligence Symposium 2005. Piscataway: IEEE Press, 2005: 185-191. 陶新民,徐晶,杨立标,等. 一种改进的粒子群和K均值混合聚类算法[J]. 电子与信息学报,2010,32(1): 92-97. TAO Xinmin, XU Jing, YANG Libiao, et al. Improved cluster algorithm based on K-means and particle swarm optimization[J]. Journal of Electronics & Information Technology, 2010, 32(1): 92-97. 巩敦卫,蒋余庆,张勇,等. 基于微粒群优化聚类数目的K-均值算法[J]. 控制理论与应用,2009,26(10): 1175-1179. GONG Dunwei, JIANG Yuqing, ZHANG Yong, et al. K-mean algorithm for optimizing the number of clusters based on particle swarm optimization[J]. Control Theory & Applications, 2009, 26(10): 1175-1179. 朱红求,阳春华,桂卫华,等. 一种带混沌变异的粒子群优化算法[J]. 计算机科学,2010,37(3): 215-217. ZHU Hongqiu, YANG Chunhua, GUI Weihua, et al. Particle swarm optimization with chaotic mutation[J]. Computer Science, 2010, 37(3): 215-217. 吕振肃,侯志容. 自适应变异的粒子群优化算法[J]. 电子学报,2004,32(1): 416-420. LV Zhensu, HOU Zhirong. Particle swarm optimization with adaptive mutation[J]. Acta Electronica Sinica, 2004, 32(1): 416-420. FREY B J, DUECK D. Clustering by passing messages between data points[J]. Science, 2007, 315(5814): 972-976.
点击查看大图
计量
- 文章访问数: 1152
- HTML全文浏览量: 58
- PDF下载量: 423
- 被引次数: 0