How to Discover Hidden Knowledge According to Different Type Data Set: A Guideline to Apply the Right Hybrid Information Mining Approach

Roberto Paiano - Department of Engineering for Innovation, University of Salento (IT); Stefania Pasanisi - Department of Engineering for Innovation, University of Salento (IT)

Abstract


Advanced data analysis techniques are now essential for extracting previously unknown and potentially useful information that is implicit in data. Interest in this area has grown appreciably as these techniques have had to meet the challenges posed by the enormous proliferation of data in the big data era. In the last few years, this has led to the development of new advanced analysis techniques and the continual improvement of existing ones. Selecting an appropriate algorithm for a specific problem is very difficult, and often the only option is to proceed by trial and error. This paper investigates which analysis technique should be used on a particular data set, based on the characteristics of that data set. We present three case studies, each concerning a specific domain (Education, Health, and Safety) that is represented by a particular type of data set. The results suggest a possible relationship between the analysis techniques applied, namely cluster analysis, association rules, and neural networks, and the type of data set analyzed.

 


Keywords


Information Mining; Data set characteristics; Big Data; Rich Data; Data Mining; Exploratory Data Analysis



(C) 2010-2024 EduSoft