
Grey Fuzzy Neural Network-Based Hybrid Model for Missing Data Imputation in Mixed Database


Vijayakumar Kuppusamy1*, Ilango Paramasivam1


1 School of Computer Science and Engineering, Vellore Institute of Technology University, Vellore, India


Missing data imputation replaces the missing values of an attribute with estimated (imputed) values. Missing data arise from causes such as biased information and non-response of the system. In the medical domain, imputing both categorical and numerical data is a major challenge. In this paper, a Grey Fuzzy Neural Network (GFNN) is proposed for missing data imputation in a mixed database. First, the WLI fuzzy clustering mechanism is used to group the medical data into clusters. Then, the Grey Wolf Optimizer (GWO) is integrated with the Adaptive Neuro-Fuzzy Inference System (ANFIS) network model; this combination is termed the GFNN. The GWO is mainly used to determine the optimal parameters for designing the membership functions. Finally, a hybrid prediction model imputes both the categorical and the numerical data; within this model, the categorical data are imputed using a distance measure. The method is implemented in MATLAB, and its performance is analysed using the mean squared error (MSE) and root mean squared error (RMSE). The proposed GFNN attains a lower MSE of 0.13 and an RMSE of 0.35, indicating that it imputes the missing attribute values of the mixed database effectively.
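The evaluation metrics and the distance-based categorical step mentioned above can be illustrated with a minimal Python sketch. This is not the authors' MATLAB implementation: the abstract does not specify the distance measure, so Euclidean distance over the numerical attributes is assumed here, and the function names (`mse`, `rmse`, `impute_categorical`) and the toy records are hypothetical.

```python
import math


def mse(actual, imputed):
    """Mean squared error between true and imputed numerical values."""
    if len(actual) != len(imputed) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, imputed)) / len(actual)


def rmse(actual, imputed):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(actual, imputed))


def impute_categorical(numeric_row, complete_rows):
    """Copy the category of the nearest complete record.

    Assumption: "nearest" means smallest Euclidean distance over the
    numerical attributes; the paper may use a different measure.
    complete_rows is a list of (numeric_features, category) pairs.
    """
    nearest = min(complete_rows, key=lambda rec: math.dist(numeric_row, rec[0]))
    return nearest[1]


# Hypothetical toy records: (age, resting blood pressure) -> chest-pain type
complete = [
    ((63.0, 145.0), "typical"),
    ((41.0, 130.0), "atypical"),
    ((57.0, 120.0), "non-anginal"),
]

# A record whose categorical attribute is missing
label = impute_categorical((60.0, 140.0), complete)

# Error of three imputed numerical values against their known true values
err = mse([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
root_err = rmse([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
```

In the proposed pipeline the candidate records would presumably be restricted to the cluster the incomplete record belongs to; the sketch searches all complete records for simplicity.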


Categorical and Numerical missing data, WLI fuzzy clustering, Grey Wolf Optimizer, ANFIS, Hybrid prediction model.


INASS Home | Copyright@2008 The Intelligent Networks and Systems Society