COMPARISON OF REAL DATASETS CHARACTERISTICS BY USING CLUSTERING APPROACHES

Authors:

S. Rahamat Basha, M. Surya Bhupal Rao, Dr. P. Kiran Kumar Reddy

DOI NO:

https://doi.org/10.26782/jmcms.2020.08.00061

Keywords:

Clustering Analysis, Cluster Accuracy, Visual Assessment, CCE, DBE, VAT

Abstract

A major issue in cluster analysis is determining the number of clusters present in a data set. Very few techniques solve the automated identification of the number of clusters satisfactorily. Recent developments have produced a popular visual mechanism for assessing clustering tendency in data sets, VAT (Visual Assessment of Cluster Tendency). Image processing techniques applied to the VAT image rely on the structure of that image, without using any cluster validity concept. Combining VAT approaches with genetic algorithms (GAs) can yield high-speed solutions. This approach, however, depends on the ability of the index concerned to identify overlapping clusters. We explain how VAT algorithms can be used to determine the number of clusters quickly and correctly. The proposed approaches are implemented and evaluated using cluster accuracy, cluster error, and computational time as metrics.
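
To make the VAT idea concrete, the following is a minimal sketch of the standard VAT reordering step, which arranges the points so that similar ones sit next to each other and clusters appear as dark blocks along the diagonal of the reordered dissimilarity image. The function names, the toy data, and in particular the gap-based cluster count (with its `factor` parameter) are illustrative assumptions and not the method evaluated in the paper, which extracts the count from the VAT image with image processing techniques.

```python
import numpy as np

def vat_order(D):
    """Return the VAT ordering of a square, symmetric dissimilarity matrix D,
    plus the linking distance at which each point joined the ordering."""
    n = D.shape[0]
    # Start from one endpoint of the largest dissimilarity in D.
    i, _ = np.unravel_index(np.argmax(D), D.shape)
    order = [int(i)]
    link = [0.0]
    remaining = set(range(n)) - {int(i)}
    # min_dist[j]: smallest dissimilarity from j to any already-ordered point.
    min_dist = D[i].astype(float).copy()
    while remaining:
        rem = sorted(remaining)
        # Append the unordered point closest to the ordered set (Prim-like step).
        j = rem[int(np.argmin(min_dist[rem]))]
        order.append(j)
        link.append(float(min_dist[j]))
        remaining.remove(j)
        min_dist = np.minimum(min_dist, D[j])
    return np.array(order), np.array(link)

def estimate_cluster_count(link, factor=3.0):
    """Hypothetical heuristic (an assumption, not from the paper): count one
    extra cluster for every linking distance well above the median."""
    gaps = link[1:]
    if gaps.size == 0:
        return 1
    threshold = factor * np.median(gaps)
    return 1 + int(np.sum(gaps > threshold))

# Toy data: two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

order, link = vat_order(D)
D_star = D[np.ix_(order, order)]  # reordered matrix; display it as a grayscale image
k = estimate_cluster_count(link)
print(order, k)
```

Displaying `D_star` as a grayscale image gives the VAT image. The gap heuristic is only a simple stand-in for reading the number of diagonal blocks, and it is likely to fail on overlapping clusters, which is the limitation the abstract points to.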

