2014 : Two level clustering approach for data quality improvement in web usage mining
Dr. Ir. Yoyon Kusnendar Suprapto MSc.
Abstract Web Usage Mining (WUM) is the extraction of knowledge from web log data. Web logs contain many records that are irrelevant to WUM, so several preprocessing steps are required to obtain good-quality data, because the final result of WUM depends on the quality of its input. In this paper we propose a new approach to this problem, called the two-level clustering approach. The first-level clustering is performed on the data in the form of access frequency and uses a non-hierarchical clustering method. It is followed by a second-level clustering on the web log data in the form of user access, which combines hierarchical and non-hierarchical clustering methods. In the experiments, a web log data quality of 90.78% is reached.
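The two levels described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the algorithms, so everything concrete here is an assumption: k-means stands in for the non-hierarchical method, centroid-linkage agglomerative clustering stands in for the hierarchical method, the two are combined by using the hierarchical result to seed k-means, and low-frequency pages are treated as the irrelevant data. The names `two_level`, `SESSIONS`, and the toy log values are hypothetical and not taken from the paper.

```python
from math import dist

def mean(vectors):
    """Component-wise mean of equal-length tuples."""
    return tuple(sum(col) / len(vectors) for col in zip(*vectors))

def nearest(p, centroids):
    """Index of the centroid closest to p (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))

def kmeans(points, seeds, iters=50):
    """Non-hierarchical step (assumed): Lloyd's k-means from given seeds."""
    centroids = [tuple(map(float, s)) for s in seeds]
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, l in zip(points, labels) if l == j]
            updated.append(mean(members) if members else old)
        if updated == centroids:  # converged
            break
        centroids = updated
    return [nearest(p, centroids) for p in points], centroids

def agglomerate(points, k):
    """Hierarchical step (assumed): merge closest centroids until k remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: dist(mean(clusters[ij[0]]),
                                              mean(clusters[ij[1]])))
        merged = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is still valid
        clusters[i] = merged
    return [mean(c) for c in clusters]

def two_level(sessions, k_users):
    # Level 1: cluster pages by total access frequency, keep the busy cluster.
    freq = {}
    for hits in sessions.values():
        for page, n in hits.items():
            freq[page] = freq.get(page, 0) + n
    pages = list(freq)
    points = [(freq[p],) for p in pages]
    labels, cents = kmeans(points, [min(points), max(points)])
    high = max(range(len(cents)), key=lambda j: cents[j][0])
    kept = [p for p, l in zip(pages, labels) if l == high]

    # Level 2: cluster users on their access vectors over the kept pages,
    # seeding k-means with the hierarchical centroids.
    users = list(sessions)
    vectors = [tuple(sessions[u].get(p, 0) for p in kept) for u in users]
    user_labels, _ = kmeans(vectors, agglomerate(vectors, k_users))
    return kept, dict(zip(users, user_labels))

# Hypothetical toy log: user -> {page: number of requests}.
SESSIONS = {
    "u1": {"/index": 5, "/products": 4, "/favicon.ico": 1},
    "u2": {"/index": 6, "/products": 5, "/about": 1},
    "u3": {"/products": 1, "/about": 7, "/robots.txt": 1},
    "u4": {"/index": 1, "/about": 6},
}

kept_pages, user_groups = two_level(SESSIONS, k_users=2)
print(kept_pages)
print(user_groups)
```

On this toy input the first level drops `/favicon.ico` and `/robots.txt`, and the second level places `u1` and `u2` in one group and `u3` and `u4` in the other. Seeding k-means from a hierarchical pass is only one plausible way to combine the two method families; the paper may combine them differently.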