Data Mining Techniques in Analyzing Process Data

This paper notes that earlier studies of how data mining methods can be used to analyze process data in the log files of technology-enhanced assessments are limited, because each explores the efficacy of only one data mining technique under one specific scenario. The paper instead demonstrates four frequently used supervised learning techniques and two unsupervised methods, all fitted to the same assessment data set, and discusses the pros and cons of each. For example, the authors note that classification and regression trees may deal with noise well but are easily influenced by small changes in the data. Can you differentiate between a confirmatory approach and an exploratory approach?

Data Mining Techniques in Analyzing Process Data: A Didactic

Due to increasing use of technology-enhanced educational assessment, data mining methods have been explored to analyse process data in log files from such assessment. However, most studies were limited to one data mining technique under one specific scenario. The current study demonstrates the usage of four frequently used supervised techniques, including Classification and Regression Trees (CART), gradient boosting, random forest, support vector machine (SVM), and two unsupervised methods, Self-organizing Map (SOM) and k-means, fitted to one assessment data. The USA sample (N = 426) from the 2012 Program for International Student Assessment (PISA) responding to problem-solving items is extracted to demonstrate the methods. After concrete feature generation and feature selection, classifier development procedures are implemented using the illustrated techniques. Results show satisfactory classification accuracy for all the techniques. Suggestions for the selection of classifiers are presented based on the research questions, the interpretability and the simplicity of the classifiers. Interpretations for the results from both supervised and unsupervised learning methods are provided.
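As a rough illustration of one of the unsupervised methods named in the abstract, the sketch below implements plain k-means in standard-library Python. It is not the authors' code and does not use the PISA data: the eight two-dimensional points stand in for hypothetical standardized process features (for instance, number of actions and time on task), and the function names, the feature values, and the choice to pass the starting centroids by hand are all assumptions made for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update.

    Starting centroids are supplied by the caller so the run is
    deterministic; this is a simplification for the example.
    """
    centroids = list(initial_centroids)
    k = len(centroids)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean_point(members)
    return labels, centroids


# Hypothetical standardized features for eight students (made-up values).
points = [
    (1.0, 1.2), (0.8, 1.0), (1.2, 0.9), (1.1, 1.1),
    (-1.0, -0.9), (-1.2, -1.1), (-0.9, -1.0), (-1.1, -1.2),
]

# One starting centroid taken from each visually obvious group.
labels, centroids = k_means(points, [points[0], points[4]])
print(labels)
print(centroids)
```

With well-separated groups like these, the assignments settle after one update. On real log-file features the result depends on the starting centroids and on how the features are scaled, so several random restarts are commonly used, and the resulting clusters still need substantive interpretation.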


Source: Xin Qiao and Hong Jiao, https://www.frontiersin.org/articles/10.3389/fpsyg.2018.02231/full#h6
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 License.