DOI: http://dx.doi.org/10.22266/ijies2017.0430.14

Optimal Decision Tree Based Unsupervised Learning Method for Data Clustering

Author(s):

Nagarjuna Reddy Seelam1*, Sai Satyanaryana Reddy Seelam2, Babu Reddy Mukkala3


Affiliations:

1Jawaharlal Nehru Technological University, Kakinada, Andhra Pradesh, India
2Vardhaman College of Engineering, Hyderabad, Telangana, India
3Krishna University, MachiliPatnam, Andhra Pradesh, India







Abstract:

Clustering is an exploratory data analysis task that aims to find the intrinsic structure of data by organizing data objects into similarity groups, or clusters. Our investigation applies pattern-based clustering to numerical data, using the Parkinson and spam datasets. These techniques are closely related to the statistical field of cluster analysis, in which a large number of clustering methods have been proposed over the years. We propose a method in which an improved k-means clustering algorithm is used together with patterns extracted from a collection of unsupervised decision trees. The decision tree is based on binary cuckoo search, and this tree-based learning technique extracts patterns from the given dataset. The data are then clustered with the improved k-means clustering algorithm. Performance is evaluated in terms of sensitivity, specificity, and accuracy.
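The three evaluation measures named above have standard definitions in terms of true/false positives and negatives. The following is a minimal sketch of how they could be computed, not the authors' evaluation code. It assumes binary labels and that each cluster has already been mapped to a class label; the function name `binary_metrics` and the toy labels are hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, accuracy) for binary labels.

    Assumes cluster assignments in y_pred were already mapped to class labels.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Sensitivity: fraction of actual positives that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of actual negatives that were recovered.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Accuracy: fraction of all objects labelled correctly.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return sensitivity, specificity, accuracy


# Hypothetical toy labels, for illustration only (not data from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sensitivity, specificity, accuracy = binary_metrics(y_true, y_pred)
```

On these toy labels there are 3 true positives, 1 false negative, 4 true negatives, and 2 false positives, so sensitivity is 3/4, specificity is 4/6, and accuracy is 7/10.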


Keywords:

K-means clustering, Binary cuckoo search, Sensitivity, Specificity, Accuracy, Pattern.



Copyright © 2008 The Intelligent Networks and Systems Society