Read this article and pay attention to the data mining techniques, classifier development, and evaluation criteria. Then take notes and understand the difference between supervised and unsupervised learning models. Finally, read the summary and discussion section of this article. What distinctions can be made about the three major purposes of problem-solving items using data-mining techniques?
There are different types of data warehouses, and each has a specific purpose within an organization. Remember, it is important to use the correct type of warehouse to support the "decision support" model being employed. Decision support techniques such as classification, prediction, time-series analysis, association, clustering, and so on will each have their own unique data needs. Correctly designing the data warehouse will ensure the best possible evidence to support strategic and daily decisions.
Managing data is an important function in the administrative process. Because organizations use data to guide decisions, decision-makers rely on you to produce a data management plan for sustainability, growth, and strategy. As you start to interact with decision-makers and the decision-support systems they use, you will also find that additional study of the models employed through a course on quantitative methods or decision-support technology will prove useful.
Methods
Data Description
The PISA 2012 log file dataset for the
problem-solving item was downloaded from
http://www.oecd.org/pisa/pisaproducts/database-cbapisa2012.htm. The
dataset consists of 4722 actions from 426 students as rows and 11
variables as columns. The eleven variables (see Figure 2) are as follows: cnt
indicates the country, which is USA in the present study; schoolid and
StIDStd indicate the unique school and student IDs, respectively;
event_number (ranging from 1 to 47) indicates the cumulative number of
actions the student took; event_value (see the raw event values presented in
Table 1) gives the specific action the student took at a given time stamp;
and time gives the exact time stamp (in seconds) corresponding to
the event_value. The event variable indicates the nature of the action
(start item, end item, or action in process). Lastly, network, fare_type,
ticket_type, and number_trips all describe the current choice the
student had made. The variables used were schoolid, StIDStd, event_value
and time. ID variables helped to identify students, while event_value
and time variables were used to generate features. Student scores
were not provided in the log file, so they were hand-coded and
carefully double-checked against the scoring rule. Among the 426
students, 121 (28.4%) got full credit, 224 (52.6%) got partial credit
and 81 (19.0%) did not get any credit. Full, partial, and no credit were
coded as 2, 1, and 0, respectively.
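
To make the step from action-level log rows to student-level features more concrete, the following is a minimal sketch in Python. It assumes rows laid out as (schoolid, StIDStd, event_value, time). The sample rows, the action names, the function name build_features, and the three kinds of feature it computes (action count, elapsed time, and per-action counts) are all hypothetical illustrations. They are not the article's 36 generated features, and the values are not PISA data.

```python
from collections import Counter, defaultdict

# Hypothetical rows in (schoolid, StIDStd, event_value, time) order.
# The IDs, action names, and times are invented for illustration only.
log_rows = [
    ("0001", "00001", "start", 0.0),
    ("0001", "00001", "action_a", 12.5),
    ("0001", "00001", "action_b", 20.0),
    ("0001", "00001", "end", 45.0),
    ("0001", "00002", "start", 0.0),
    ("0001", "00002", "end", 30.0),
]


def build_features(rows):
    """Aggregate action-level rows into one feature dict per student."""
    # Group the actions by the (schoolid, StIDStd) pair that identifies a student.
    per_student = defaultdict(list)
    for schoolid, stidstd, event_value, time in rows:
        per_student[(schoolid, stidstd)].append((time, event_value))

    features = {}
    for student, events in per_student.items():
        events.sort()  # order each student's actions by time stamp
        times = [t for t, _ in events]
        counts = Counter(value for _, value in events)
        features[student] = {
            "n_actions": len(events),             # how many actions were logged
            "total_time": times[-1] - times[0],   # seconds from first to last action
            **{f"count_{value}": n for value, n in counts.items()},
        }
    return features


features = build_features(log_rows)
for student, feats in features.items():
    print(student, feats)
```

Grouping on both ID variables mirrors the text's use of schoolid and StIDStd to identify students, while every feature is derived only from event_value and time. The hand-coded score (2, 1, or 0) would then be attached to each student's feature row as the label for a supervised classifier.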
Figure 2. The screenshot of the log file for one student.

Table 1. 15 raw event values and 36 generated features.