Prediction of Thyroid Disease Using Data Mining Techniques

Irina Ioniţă, Liviu Ioniţă


Thyroid diseases are becoming increasingly widespread worldwide. In Romania, for example, one in eight women suffers from hypothyroidism, hyperthyroidism or thyroid cancer. Various research studies estimate that about 30% of Romanians are diagnosed with endemic goiter. Factors that affect thyroid function include stress, infection, trauma, toxins, low-calorie diets and certain medications. Preventing such diseases is preferable to curing them, because the majority of treatments consist of long-term medication or surgery. The current study addresses the classification of thyroid disease into two of the most common thyroid dysfunctions in the population: hyperthyroidism and hypothyroidism. The authors analyzed and compared four classification models: Naive Bayes, Decision Tree, Multilayer Perceptron and Radial Basis Function Network. The results indicate good accuracy for all four classification models, with the Decision Tree model achieving the best classification rate. The data used to build and validate the classifiers came from the UCI Machine Learning Repository and from a website with Romanian data. The classification models were built and tested with KNIME Analytics Platform and Weka, two data mining tools.


data mining, classification model, thyroid diseases, neural network, decision tree, Naïve Bayes


(C) 2010-2017 EduSoft