Machine Learning-based Web Application for Early Diagnosis of Diabetes

Hamayoun Yousaf Shahwani


Diabetes is a chronic disease that seriously threatens human health. It is a group of metabolic diseases characterized by hyperglycemia, and it can affect people regardless of age. Long-term diabetes causes chronic damage to and dysfunction of various tissues, especially the eyes, kidneys, heart, blood vessels, and nerves. Many people are unaware of the disease in its early stages, and by the time it is recognized the continuing effects of diabetes have often led to serious complications. This research builds a machine learning-based web application for early diagnosis of the disease that is freely accessible anywhere, at any time. We used the benchmark Pima Indians Diabetes Dataset (PIDD) and performed a comparative analysis of Naïve Bayes, Logistic Regression, K-Nearest Neighbors, Decision Tree, Random Forest, and Support Vector Machine (SVM) classifiers. Based on classification performance, SVM performed best among these algorithms and was therefore adopted for the development of the intelligent web application for diabetes diagnosis.
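The comparative analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it compares only two of the six named classifiers (Gaussian Naïve Bayes and K-Nearest Neighbors), written in plain Python, using 5-fold cross-validated accuracy. The data is synthetic: `make_synthetic` is a hypothetical stand-in that produces 8 numeric features and a binary label in place of PIDD, and the fold count, `k`, and all function and class names are assumptions made for illustration.

```python
import math
import random
from statistics import mean, pvariance


def make_synthetic(n=200, n_features=8, seed=0):
    """Hypothetical stand-in for PIDD: 8 numeric features, binary label."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Class 1 is shifted by +1.0 on every feature (assumption, for illustration).
        row = [rng.gauss(float(label), 1.0) for _ in range(n_features)]
        data.append((row, label))
    return data


class GaussianNaiveBayes:
    def fit(self, X, y):
        # Per class: prior, per-feature mean, per-feature variance.
        self.stats = {}
        for c in set(y):
            rows = [x for x, t in zip(X, y) if t == c]
            cols = list(zip(*rows))
            self.stats[c] = (
                len(rows) / len(X),
                [mean(col) for col in cols],
                [pvariance(col) + 1e-9 for col in cols],  # avoid zero variance
            )
        return self

    def _log_posterior(self, c, x):
        prior, means, variances = self.stats[c]
        log_p = math.log(prior)
        for v, m, var in zip(x, means, variances):
            log_p += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return log_p

    def predict(self, X):
        return [max(self.stats, key=lambda c: self._log_posterior(c, x)) for x in X]


class KNearestNeighbors:
    def __init__(self, k=5):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict(self, X):
        preds = []
        for x in X:
            # Majority vote among the k closest training points (Euclidean distance).
            nearest = sorted(zip(self.X, self.y), key=lambda p: math.dist(p[0], x))
            votes = [t for _, t in nearest[: self.k]]
            preds.append(max(set(votes), key=votes.count))
        return preds


def cross_val_accuracy(make_model, data, folds=5):
    """Mean accuracy over `folds` interleaved train/test splits."""
    scores = []
    for i in range(folds):
        test = data[i::folds]
        train = [d for j, d in enumerate(data) if j % folds != i]
        model = make_model().fit([x for x, _ in train], [t for _, t in train])
        preds = model.predict([x for x, _ in test])
        scores.append(sum(p == t for p, (_, t) in zip(preds, test)) / len(test))
    return mean(scores)


data = make_synthetic()
results = {
    "Naive Bayes": cross_val_accuracy(GaussianNaiveBayes, data),
    "K-Nearest Neighbors": cross_val_accuracy(KNearestNeighbors, data),
}
best = max(results, key=results.get)  # the model that would be deployed
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
print("Selected:", best)
```

The same selection loop extends to the remaining classifiers; a library such as scikit-learn provides implementations of all six, so in practice only the `results` dictionary would grow. The accuracies printed here describe the synthetic data only and say nothing about performance on PIDD.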


Keywords: Classification, Support Vector Machine, Diabetes diagnosis, Diabetes prediction


Journal of Applied and Emerging Sciences by BUITEMS is licensed under a Creative Commons Attribution 4.0 International License.