Applying data mining techniques to predict vitamin D deficiency in diabetic patients


Eşsiz U. E., Yüregir H. O., Saraç E.

HEALTH INFORMATICS JOURNAL, vol.29, no.4, pp.1-10, 2023 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 29 Issue: 4
  • Publication Date: 2023
  • DOI: 10.1177/14604582231214864
  • Journal Name: HEALTH INFORMATICS JOURNAL
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Applied Science & Technology Source, CINAHL, Computer & Applied Sciences, EBSCO Education Source, Educational research abstracts (ERA), EMBASE, INSPEC, Library and Information Science Abstracts, Library, Information Science & Technology Abstracts (LISTA), MEDLINE, Directory of Open Access Journals
  • Page Numbers: pp.1-10
  • Çukurova University Affiliated: Yes

Abstract

Vitamin D is among the vitamins necessary for the health of both adults and children. It plays a significant role in calcium absorption, the immune system, cell proliferation and differentiation, bone protection, skeletal health, rickets, muscle health, heart health, disease pathogenesis and severity, glucose metabolism, glucose intolerance, varying insulin secretion, and diabetes. Because the 25-hydroxyvitamin D (25OHD) test, which is used to measure vitamin D, is expensive and may not be covered by healthcare benefits in many countries, this study aims to predict vitamin D deficiency in diabetic patients. The prediction method is based on data mining techniques combined with feature selection, using historical electronic health records. The results were compared with those obtained using a filter-based feature selection algorithm, namely relief-F. The relief-F feature selection method effectively eliminated non-valuable features without any loss in classification performance. The performance of the methods was evaluated using classification accuracy (ACC), sensitivity, specificity, F1-score, precision, kappa, and receiver operating characteristic (ROC) curves. The analyses were conducted on a vitamin D dataset of diabetic patients, and the results show that the highest classification accuracy, 97.044%, was obtained by the support vector machine (SVM) model with a radial kernel using 18 features.
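As an illustration of the evaluation measures named in the abstract, the following is a minimal sketch of how accuracy, sensitivity, specificity, precision, F1-score, and Cohen's kappa can be derived from a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study; the formulas are the standard definitions, and the paper's own implementation may differ.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are hypothetical counts of true positives, false positives,
    true negatives, and false negatives (e.g. "deficient" = positive class).
    """
    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    total = tp + fp + tn + fn
    accuracy = ratio(tp + tn, total)
    sensitivity = ratio(tp, tp + fn)   # recall / true positive rate
    specificity = ratio(tn, tn + fp)   # true negative rate
    precision = ratio(tp, tp + fp)     # positive predictive value
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_yes = ratio(tp + fp, total) * ratio(tp + fn, total)
    p_no = ratio(tn + fn, total) * ratio(tn + fp, total)
    expected = p_yes + p_no
    kappa = ratio(accuracy - expected, 1 - expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Illustrative counts only; not results from the paper.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.3f}")
```

Kappa is reported alongside accuracy because it discounts the agreement a classifier would reach by chance, which matters when the deficient and non-deficient classes are imbalanced.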