Many intelligent healthcare systems have been developed to diagnose human diseases such as breast cancer, hepatitis, diabetes and heart disease. Diabetes is a lifelong chronic disease that occurs when the pancreas does not produce enough insulin (Type I diabetes mellitus) or when the body cannot properly use the insulin it produces (Type II diabetes mellitus). Research on diabetes using data mining techniques has sought to predict Type II diabetes mellitus from a variety of diabetes datasets, with the Pima Indians Diabetes Dataset (PIDD) used by the majority of researchers. The PIDD has only eight (8) attributes, which limits further exploration in the field of Machine Learning (ML) for diabetes prediction. Diabetes prediction is constrained by the small number of attributes available in the datasets used, yet these attributes play important roles in predicting diabetes mellitus types, classes and risk factors whenever a patient is diagnosed. This paper provides a systematic review of diabetes mellitus datasets, identifying the strengths and weaknesses of the 8 attributes described in the PIDD. Furthermore, this paper identifies the need for researchers to address this gap by enhancing the existing diabetes dataset attributes with additional attributes, identifying the attributes required for the prediction of glucose level, diabetes types, diabetes classes and diabetes risk factors, and developing a model that can be used for the prediction.