Volume 6 Number 2 (Feb. 2011)
JCP 2011 Vol.6(2): 271-279 ISSN: 1796-203X
doi: 10.4304/jcp.6.2.271-279

An Efficient Global K-means Clustering Algorithm

Juanying Xie1, 2, Shuai Jiang2, Weixin Xie1, 3, Xinbo Gao1, 4
1School of Electronic Engineering, Xidian University, Xi’an 710071, P. R. China
2School of Computer Science, Shaanxi Normal University, Xi’an 710062, P. R. China
3National Laboratory of Automatic Target Recognition (ATR), Shenzhen University, Shenzhen 518001, P.R. China
4College of Information Engineering, Shenzhen University, Shenzhen 518001, P.R. China
5Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China, Xidian University, Xi'an 710071, P.R. China


Abstract: K-means is a popular partition-based clustering algorithm. However, it has several shortcomings: the user must specify the number of clusters in advance, the result is sensitive to initial conditions, and the algorithm is easily trapped in a local optimum. The global K-means algorithm proposed by Likas et al. is an incremental approach to clustering that dynamically adds one cluster center at a time through a deterministic global search procedure consisting of N runs of the K-means algorithm from suitable initial positions, where N is the size of the data set. It does not depend on any initial conditions or parameters and considerably outperforms the K-means algorithm, but it carries a heavy computational load. In this paper, we propose a new version, an efficient global K-means clustering algorithm, whose outstanding feature is its execution time: it runs faster than the available global K-means algorithms. We modify the way the optimal initial center of the next new cluster is found by defining a new function as the criterion for selecting the optimal candidate center. Our idea was inspired by Park and Jun’s K-medoids clustering algorithm. We choose the best candidate initial center for the next cluster by evaluating the new function, which uses information about the natural distribution of the data, so that the chosen initial center is the point that not only has the highest density but also lies far from the existing cluster centers. Experiments on fourteen well-known data sets from the UCI machine learning repository show that our new algorithm significantly reduces computational time without affecting the performance of the global K-means algorithm.
Further experiments demonstrate that our improved global K-means algorithm greatly outperforms the global K-means algorithm and is suitable for clustering large data sets. Experiments on a colon cancer tissue data set show that our new global K-means algorithm can efficiently handle high-dimensional gene expression data. Experimental results on synthetic data sets with different proportions of noisy data points show that our global K-means algorithm efficiently avoids the influence of noisy data on clustering results.

Index Terms: clustering, K-means clustering, global K-means clustering, machine learning, pattern recognition, data mining, non-smooth optimization
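
The incremental search described in the abstract can be illustrated with a minimal pure-Python sketch. `global_kmeans` follows the description of the Likas et al. procedure: each new center is found by trying every data point as its initial position, which costs N K-means runs per added cluster. `fast_global_kmeans` shows the general shape of the shortcut, picking a single candidate that is both dense and far from the existing centers. The abstract does not give the paper's selection function, so `pick_candidate`, its density-times-separation score, and the `radius` parameter are hypothetical stand-ins, not the authors' criterion; all function names are likewise assumptions.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]

def kmeans(points, centers, max_iter=100):
    """Plain K-means from the given initial centers; returns (centers, sse)."""
    centers = [list(c) for c in centers]
    for _ in range(max_iter):
        # Assign every point to its nearest center.
        labels = [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous center.
            new_centers.append(mean(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse

def global_kmeans(points, k):
    """Baseline: try every data point as the initial position of the new center."""
    centers = [mean(points)]
    for _ in range(1, k):
        best = None
        for candidate in points:  # N K-means runs per added cluster
            result = kmeans(points, centers + [list(candidate)])
            if best is None or result[1] < best[1]:
                best = result
        centers = best[0]
    return centers

def pick_candidate(points, centers, radius):
    """Hypothetical criterion (NOT the paper's function): favor points with
    many neighbors within `radius` that lie far from every existing center."""
    def score(p):
        density = sum(1 for q in points if sq_dist(p, q) <= radius ** 2)
        separation = min(math.sqrt(sq_dist(p, c)) for c in centers)
        return density * separation
    return max(points, key=score)

def fast_global_kmeans(points, k, radius=1.0):
    """Shortcut: one K-means run per added cluster, from one chosen candidate."""
    centers = [mean(points)]
    for _ in range(1, k):
        candidate = pick_candidate(points, centers, radius)
        centers, _ = kmeans(points, centers + [list(candidate)])
    return centers
```

Both functions take a list of equal-length numeric tuples and the target number of clusters. The baseline performs N K-means runs for each added center, while the shortcut performs one, which is where a run-time saving of the kind the abstract reports would come from.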


Cite: Juanying Xie, Shuai Jiang, Weixin Xie, Xinbo Gao, "An Efficient Global K-means Clustering Algorithm," Journal of Computers, vol. 6, no. 2, pp. 271-279, 2011.
