Contributions from Data Mining to Study Academic Performance of Students of a Tertiary Institute

Reference ^[18] presents a survey of educational data mining between 1995 and 2005. The authors concluded that educational data mining is a promising area of research with specific requirements not present in other domains; they also described the cycle of applying data mining in educational systems (Figure 1).

In ^[19], different data mining methods and techniques were compared for predicting students' success, using data collected from surveys conducted among first-year students at the Faculty of Economics of the University of Tuzla, together with data taken during enrollment. Success was evaluated by the passing grade at the exam. The study investigated the impact of students' socio-demographic variables, results achieved in high school and in the entrance exam, and attitudes towards studying that can affect success.

In ^[20], a student’s academic success (classified into low, medium, and high risk classes) was predicted using different data mining methods (decision trees and neural networks).

In the national context (Argentina):

Reference ^[1] reports an experiment conducted by its authors on the use of an unconventional virtual classroom for Algebra at FaCENA – UNNE, and concludes that the approach, based on b-learning and multimedia, has been successful.

Reference ^[2] addresses the problems of integrating technological and pedagogical perspectives, providing an architecture for b-learning systems. Didactically, it adopts proven educational principles from the person-centered approach to promote educational processes, with safe use of ICTs. Technically, it proposes a layered framework that provides Web-based support for these educational principles.

In ^[3], the benefits of using recent technologies and software that support cross-platform systems are shown; academic performance is studied with DW and DM techniques, considering the importance that students attach to study and its influence on academic performance.

2.2. Theoretical Framework

According to ^{[21, 22]}, the new information society or cyber society raises a number of questions of technical, economic, sociological, cultural and political order.

One question is whether education systems are able to produce the quantity and quality of graduates needed to meet the demand for highly trained staff in this Information and Knowledge Society (IKS) in different areas, especially those related to ICTs.

It is here that the problem of academic performance arises.

Reference ^[23] examined to what extent different motivational concepts contribute to the prediction of school achievement among adolescent students independently of intelligence.

Reference ^[24] defines the academic performance as the productivity of the subject, nuanced by their activities, features and more or less correct perception of the assigned tasks.

Reference ^[25] discusses the predictive power of different aptitudes using multiple regression, concluding that the most important predictor of academic performance is verbal aptitude, followed by numerical aptitude and reasoning.

In ^[26], standard t-test and ANOVA were applied to investigate the effect of different factors on students’ achievement.

In a study of academic performance in the first college course ^[27], graduation rates differentiated by type of institution are used as indicators, and academic performance is analyzed from individual data.

According to ^[28], early research on learning focused exclusively on cognitive aspects; researchers later discovered the importance of affective components and their decisive influence on learning; finally, the cognitive and affective aspects came together, giving rise to the construct called self-regulated learning.

Also, ^[29] has studied university academic performance, applying the production function approach to estimate its determinants.

In ^[30], the authors analyzed the determinants of learning using a production function approach, suggesting that school performance depends on genetic and socio-economic factors, the quality of teaching, the conditions of the school, and the student group (peer effect).

The results published in ^[31] show that the factor most significantly related to the quality of education is the student as co-producer, measured by the socioeconomic status of the student's household of origin.

In ^[32], it was shown that student productivity is higher for women, for younger students, and for those from households with more educated parents.

In ^[33], the relationship between hours worked and academic performance was analyzed in detail.

In general, empirical studies confirm the correlation between higher levels of education and positive attributes after studies ^[34].

In California (USA), the Academic Performance Index reports include aspects related to academic performance ^[35].

Reference ^[36] has studied the ability of linear regression and logistic regression to predict academic performance and success/failure, based on variables such as attendance and participation in class.

In ^[37], it was shown that study planning, intelligence, teacher support, study time, environmental conditions of study, and involvement entered the multiple regression prediction equation, explaining 25.70% of the variance in school performance in high school courses.

The problem of finding good predictors of future performance, so that academic failure in graduate programs is reduced, has received special attention in the U.S. ^[38]; classification techniques such as discriminant analysis or logistic regression were found to be more appropriate than multiple linear regression for predicting academic success/failure.

In addition to the traditional tools mentioned above for the study of academic performance, there are others from Business Intelligence (BI), such as Data Warehouses (DW) and Data Mining (DM), used for discovering hidden knowledge in large volumes of data.

A DW is a subject-oriented, integrated, nonvolatile, time-variant collection of data that is used to support the managerial decision-making process ^{[39, 40, 41]}.

The process of building and assessing significant models within Knowledge Discovery in Databases (KDD) is referred to as DM ^[42].

KDD is an interdisciplinary area focusing on methodologies for extracting useful knowledge from data. Extracting knowledge from data draws on research in statistics, databases, pattern recognition, machine learning, data visualization, optimization, and high-performance computing to deliver advanced business intelligence and Web discovery solutions ^[12].

Data mining is the field concerned with discovering implicit and interesting patterns in large data collections ^[43].

DM is the discovery stage in the KDD process: the step that uses specific algorithms to generate a list of patterns from the pre-processed data ^[44], ^[45].

DM is closely linked to the DW, because the DW provides the historical information from which mining algorithms obtain the information needed for decision-making ^[46].

DM is a set of data analysis techniques that extract patterns, trends and regularities in order to describe and understand the data and to predict future behavior ^{[41, 47, 48]}.

The models generated by DM can be descriptive or predictive ^[49]. Its techniques are varied; one of the most used is clustering (grouping of data) ^{[50, 51]}. Demographic clustering is an algorithm developed by IBM that automatically solves the problems of defining distance/similarity metrics, providing criteria for defining an optimal segmentation.

Educational Data Mining (EDM) develops methods and applies techniques from statistics, machine learning, and data mining to analyze data collected during teaching and learning. EDM tests learning theories and informs educational practice. Learning analytics applies techniques from information science, sociology, psychology, statistics, machine learning, and data mining to analyze data collected during education administration and services, teaching, and learning. Learning analytics creates applications that directly influence educational practice ^[12].

In ^[52], a decision tree was used to predict the result of the final exam, helping professors identify students who needed support in order to improve their performance and pass the exam.

In ^[42], the relationship between students’ university entrance examination results and their success was studied using cluster analysis and the k-means algorithm.

Reference ^[53] develops a methodology for deriving performance prediction indicators in order to deploy a simple student performance assessment and monitoring system within a teaching and learning environment. It focuses mainly on monitoring students' performance in continuous assessment (tests) and examination scores in order to predict their final achievement status upon graduation. Based on various DM techniques and the application of machine learning processes, rules are derived that enable the classification of students into their predicted classes.

Reference ^[54] shows that Educational Data Mining (EDM) is concerned with developing methods and analyzing educational content to enable better understanding of students’ performance.

Reference ^[16] reports studies conducted to identify the parameters that may contribute to students' academic success, especially in computer science courses.

In ^[17] six parameters were selected for the Students’ Academic Performances (SAP) which include: Grade Point Average (GPA), race, gender, hometown, family income and university entry mode.

Reference ^[55] applies the kernel method as a data mining technique to analyze the relationships between students’ behavior and their success, and to develop a model of student performance predictors.

Reference ^[56] presents a case study that used data mining to identify the behavior of failing students in order to warn students at risk before the final exam.

Reference ^[57] shows how data mining techniques can help discover pedagogically relevant knowledge contained in databases obtained from Web-based educational systems or online learning systems.

Reference ^[58] describes different types of data mining techniques, both classical and emergent, used for educational tasks by different stakeholders.

Reference ^[59] provides a technical overview of the current state of knowledge in educational data mining. It helps education experts understand what types of questions data mining can address and helps data miners understand what types of questions are important in education decision making.

Reference ^[60] defines learning analytics and describes how it has been used in educational institutions, what learning analytics tools are available, and how faculty can make use of data in their courses to monitor and predict student performance.

Reference ^[61] presents the capabilities of data mining in the context of higher educational system by i) proposing an analytical guideline for higher education institutions to enhance their current decision processes, and ii) applying data mining techniques to discover new explicit knowledge which could be useful for the decision making processes.

In ^[62], different data sources such as student record system, virtual learning system are integrated and analyzed with the intention of linking behavior pattern to academic histories and other recorded information. These patterns built into data mining models can then be used to predict individual performance with high accuracy.

Reference ^[63] presents a data-based user modeling framework that uses both unsupervised and supervised classification to build student models for exploratory learning environments.

In ^[64], the Apriori algorithm is used to extract a set of rules specific to each class and to analyze the given data in order to classify students based on their academic performance. Students are classified according to their involvement in assignments, internal assessment tests, attendance, etc., which helps to analyze student performance based on the patterns extracted from the educational database.

Reference ^[65] proposes a methodology based on data mining and self-evaluation in order to predict whether an instructor will or will not accept the students’ proposed marks in a course.

Reference ^[66] presents an approach to classifying students in order to predict their final grade based on features extracted from logged data in an education web-based system.

In ^[67], the variables considered include the status of the student, the educational level of the parents, secondary education, socio-economic level, and others. DW and DM techniques were used to search for student profiles and to identify potential situations of academic success or failure. Classifications were made through clustering techniques according to different criteria, including: classification mining according to academic program, according to the final status of the student, and according to the importance given to study; and demographic clustering and Kohonen clustering according to the final status of the student. The experience was carried out at the Northeastern National University, Argentina.

3. Methodology Used

We used a quantitative logic approach, working with measurement of variables, hypothesis production and use of intelligent data mining, for the purpose of extracting hidden knowledge in the data.

In this research, several hypotheses were formulated and then verified with data mining methods; data mining methods that do not require prior hypotheses were also used, apart from the decision of which variables to include in each mining process (e.g., supervised and unsupervised classification).

We sought to fulfill the objectives stated above, working with the hypotheses already mentioned in Section 1.

The universe consisted of the students eligible to take the subject Operating Systems in the TSAP ISCC career.

The unit of analysis consisted of each student in a position to take the subject Operating Systems. The selected cases were students able to attend this course (about 200 students).

The data generated by this research will be added to the data of other similar research conducted at the Northeast National University (Argentina), the National Technological University (Argentina), the Catholic University of Santiago del Estero (Argentina) and the National University of the East (Paraguay); these investigations will continue to incorporate data for several more years. This justifies using a DW structure, which will continue to grow in the future and will enable further research.

Quantitative data obtained (integrated into a DW) were analyzed with DM tools, in order to investigate relationships between variables with non-traditional methods.

The IBM Data Warehouse Edition (DWE) V.9.5 was used, including DB2 Enterprise Server Edition (DB2 ESE), the Design Studio (DS) and the Intelligent Miner (IM) (Figure 2, Figure 3).

Figure 2. IBM DWE architecture

Figure 3. IBM DWE components

3.1. Methodology of Definition of Used DW

It is important to remember that a DW cannot be acquired; it must be built following a certain methodology.

The technique used in the creation of the DW depends on the main focus of its development, which can be the management of data, goals or users ^[68]. The proposed models are: “Data-Driven”, “Goal-Driven” and “User-Driven”. The following describes in general terms what constitutes each.

Data-Driven: This model considers that a DW is driven by data, in contrast to classical systems, which are driven by requirements; requirements are the last aspect to be considered in the decision-making process, and the needs of users come second ^[69]. The data model consists of few dimensions and groups of facts. The dimensions represent the basic structure of the design. The facts are based on time and have a low level of granularity.

Goal-Driven: This model considers that the development process revolves around the objectives and targets set out at the beginning. Unlike the previous model, it contains more dimensions and few facts, which are based on time and have a low level of granularity.

User-Driven: This model considers that the main factor to take into account is the needs of users, since they are the ones who ultimately use the system. The model consists of a few facts, which have a moderate level of granularity.

Regardless of the development model chosen, the methodology to be followed for the development of a DW depends largely on the size of the DW to be created and on how quickly it is required.

The following is an overview of the two main methodologies for the development of a DW, the “Big Bang” and “Rapid Warehousing”.

Big Bang: This methodology tries to solve all known problems by creating a large DW before releasing it for evaluation and testing ^[70].

Rapid Warehousing: Also known as the evolutionary or incremental methodology, it considers the construction and implementation of a DW to be an evolutionary process, in which a portion of the DW is created quickly through the integration of data marts ^[71].

In this work we followed the “User-Driven” model and the “Big Bang” methodology.

3.2. Structure Description of Used DW

Table 1 shows the most significant variables in the fact table.

Table 1. Variables and meanings of the fact table

Table 2 shows the variables that make up the dimension importance awarded to the study.

Table 2. Variables and meanings of dimension importance awarded to the study

Table 3 shows the variables that constitute the dimension of student hometown.

The variables that constitute the dimension use of ICTs in consideration of the student can be seen in Table 4.

The variables that constitute the dimension student's secondary studies can be seen in Table 5.

Table 3. Variables and meanings of dimension student hometown

Table 4. Variables and meanings of dimension use of ICTs in consideration of the student

Table 5. Variables and meanings of dimension student's secondary studies

Table 6 shows the variables that make up the dimension student's current residence.

Table 6. Variables and meanings of dimension student's current residence

Table 7 shows the variables that make up the dimension hours dedicated to the study on the assessment of the student.

Table 7. Variables and meanings of dimension hours dedicated to the study on the assessment of the student

Table 8 describes the variables that make up the dimension employment situation of the student's mother.

Table 8. Variables and meanings of the employment situation of the mother of student

Table 9 shows the variables that make up the student employment status dimension.

Table 9. Variables and meanings of the student employment status dimension

Table 10 describes the variables that make up the dimension employment situation of the student's parent.

Table 10. Variables and meanings of the employment situation of the parent of student

The study was carried out on data obtained through student surveys, also considering the results of the different evaluation instances scheduled during the Operating Systems course.
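As an illustration of the structure described in this section, a fact table linked to dimension tables can be sketched as a star schema. The following minimal sketch uses Python's built-in sqlite3 module rather than the DB2 server used in this work; all table names, column names and rows are hypothetical and are not taken from Tables 1 to 10.

```python
import sqlite3

def build_star_schema():
    """Create an in-memory star schema; every name and row here is hypothetical."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    # Dimension tables: one row per distinct value of the dimension.
    cur.execute("CREATE TABLE dim_hometown (hometown_id INTEGER PRIMARY KEY, city TEXT)")
    cur.execute("CREATE TABLE dim_ict_use (ict_id INTEGER PRIMARY KEY, opinion TEXT)")
    # Fact table: one row per student, with foreign keys to the dimensions.
    cur.execute("""
        CREATE TABLE fact_student (
            student_id INTEGER PRIMARY KEY,
            final_status INTEGER,
            hometown_id INTEGER REFERENCES dim_hometown(hometown_id),
            ict_id INTEGER REFERENCES dim_ict_use(ict_id)
        )""")
    cur.executemany("INSERT INTO dim_hometown VALUES (?, ?)",
                    [(1, "City A"), (2, "City B")])
    cur.executemany("INSERT INTO dim_ict_use VALUES (?, ?)",
                    [(1, "facilitates learning"),
                     (2, "essential for professional practice")])
    cur.executemany("INSERT INTO fact_student VALUES (?, ?, ?, ?)",
                    [(1, 6, 1, 1), (2, 8, 1, 2), (3, 7, 2, 1)])
    conn.commit()
    return conn

def average_status_by_opinion(conn):
    """Join the fact table with one dimension and aggregate the final status."""
    return conn.execute("""
        SELECT d.opinion, AVG(f.final_status)
        FROM fact_student f JOIN dim_ict_use d ON f.ict_id = d.ict_id
        GROUP BY d.opinion
        ORDER BY d.opinion""").fetchall()

conn = build_star_schema()
result = average_status_by_opinion(conn)
```

Keeping descriptive attributes in dimension tables and measures in the fact table is what allows the same facts to be grouped by any dimension with a single join.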

3.3. Used DM Methodology

Currently, there are several DM methodologies; the most widely used are SEMMA and CRISP-DM.

The SEMMA methodology was developed by SAS Institute to discover unknown business patterns. The name refers to the five basic stages of the process: Sample, Explore, Modify, Model, and Assess ^{[72, 73]}.

The CRISP-DM (Cross-Industry Standard Process for Data Mining) methodology is organized in six stages, each of which is in turn divided into several tasks ^[74]. The main stages are: domain understanding, data understanding, data preparation, modeling, evaluation, and deployment.

In this work the CRISP-DM methodology was used.

Data mining functions and algorithms

The IBM DWE used provides mining functions to solve various problems ^{[75, 76]}:

Associations: The Associations mining function finds items in your data that frequently occur together in the same transactions.

Classification: With the Classification algorithms, you can create, validate, or test classification models. For example, you can analyze why a certain classification was made, or you can predict a classification for new data.

Clustering: The Clustering mining function searches the input data for characteristics that frequently occur in common. It groups the input data into clusters. The members of each cluster have similar properties.

Regression: Regression is similar to classification except for the type of the predicted value. Classification predicts a class label, regression predicts a numeric value. Regression also can determine the input fields that are most relevant to predict the target field values. The predicted value might not be identical to any value contained in the data that is used to build the model. An example application is customer ranking by expected profit.

Sequence Rules: The Sequence Rules mining function finds typical sequences of events in your data.

Time Series: The Time Series mining function enables forecasting of time series values.
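To make the Associations function in the list above more concrete, the sketch below computes the two standard measures of an association rule, support and confidence, in plain Python. It is only an illustration of the general technique, not the algorithm implemented in IBM DWE; the item names and transactions are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base

# Hypothetical data: each transaction is the set of items observed for one student.
transactions = [
    {"works", "studies_lt_10h", "failed"},
    {"works", "studies_lt_10h", "failed"},
    {"works", "studies_ge_10h", "passed"},
    {"no_work", "studies_ge_10h", "passed"},
]

rule_support = support(transactions, {"works", "failed"})
rule_confidence = confidence(transactions, {"works"}, {"failed"})
```

In this toy data the rule "works implies failed" holds in half of all transactions and in two of the three transactions that contain "works".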

4. Main Results

The main results obtained with the different DM techniques used are shown below: clustering (segmentation), association generators (association rules), and decision trees (classification prediction).

4.1. Results with Clustering

The Clustering mining function searches the input data for characteristics that frequently occur in common. It groups the input data into clusters. The members of each cluster have similar properties ^[75].

IM (in DWE) provides the Clustering mining function. The Clustering mining function includes the following algorithms: a) Demographic clustering (distribution-based); b) Kohonen feature maps (center-based); c) Enhanced BIRCH (distribution-based).

These Clustering algorithms group data records on the basis of how similar the data records are.

A data record might, for example, consist of information about a customer. The Clustering algorithm groups similar customers together. At the same time it maximizes the differences between the different customer groups that are formed in this way.

The algorithms of the Clustering mining function provide common parameters and algorithm-specific parameters:

Generic clustering: The Clustering mining function searches the input data for characteristics that frequently occur in common. It groups the input data into clusters. The members of each cluster have similar properties. There are no preconceived notions of what patterns exist within the data. Clustering is a discovery process. You can specify common clustering parameters for all clustering algorithms: a) Active or supplementary fields: You can split the input fields between active fields and supplementary fields to determine the similarity or the distance between the records. Only active fields are used by the mining algorithm to determine similarity between records; b) Field weighting: Field weighting gives more or less weight to certain input fields during a Clustering training run; c) Maximum number of clusters: You can control the number of clusters to be created during a Clustering training run by specifying a value for the maximum number of clusters. Limiting the number of clusters prevents the production of many small clusters, and thus saves run time; d) Outlier treatment: Outliers are values that lie beyond the scope of a field's value range. You can define the field's value range by specifying the lower and upper bounds for this field.

Demographic clustering: Demographic clustering is distribution-based. It provides fast and natural clustering of very large databases. Clusters are characterized by the value distributions of their members. It automatically determines the number of clusters to be generated. Typically, demographic data contains many categorical variables. The mining function works well with data sets that consist of this type of variables. You can also use numerical variables. The Demographic Clustering algorithm treats numerical variables by assigning similarities according to the numeric difference of the values. Demographic Clustering is an iterative process over the input data. Each input record is read in succession. The similarity of each record with each of the currently existing clusters is calculated. If the biggest calculated similarity is above a given threshold, the record is added to the relevant cluster. This cluster's characteristics change accordingly. If the calculated similarity is not above the threshold, or if there is no cluster (which is initially the case) a new cluster is created that contains the record alone. You can specify the maximum number of clusters, as well as the similarity threshold. Demographic Clustering uses the statistical Condorcet criterion to manage the assignment of records to clusters and the creation of new clusters. The Condorcet criterion evaluates how homogeneous each discovered cluster is (in that the records it contains are similar) and how heterogeneous the discovered clusters are among each other. The iterative process of discovering clusters stops after two or more passes over the input data if the improvement of the clustering result according to the Condorcet criterion does not justify a new pass. 
Besides the common clustering parameters, you can define specific parameters for the Demographic Clustering algorithm: a) Similarity threshold: The similarity threshold is the desired lower limit for the similarity of two data records that belong to the same cluster. For example, if you set the similarity threshold to 0.25, data records with field values that are less than 25% similar are unlikely to be assigned to the same cluster. Assignment might still occur if the number of clusters is restricted; b) Similarity scale: The similarity scale determines how similarities for numerical fields are calculated; c) Similarity matrices: For each categorical field, you can define a similarity matrix that contains user-defined similarities between pairs of field values; d) Value weighting: Value weighting deals with the fact that particular values in a field might be more common than other values in that field. The coincidence of rare values in a field adds more to the overall similarity than the coincidence of frequent values.
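The record-by-record assignment described above can be sketched as follows. This is a deliberately simplified, single-pass illustration: similarity is taken as the fraction of matching categorical fields, averaged over the members of a cluster, which is an assumption made here; the Condorcet criterion, repeated passes, similarity scales and value weighting are not modeled. Function names and data are hypothetical.

```python
def record_similarity(a, b):
    """Fraction of fields on which two categorical records agree."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)

def cluster_similarity(record, cluster):
    """Average similarity between a record and the current members of a cluster."""
    return sum(record_similarity(record, m) for m in cluster) / len(cluster)

def assign_records(records, threshold=0.5, max_clusters=None):
    """Read records in succession; join the most similar cluster or open a new one."""
    clusters = []
    for record in records:
        best_idx, best_sim = None, -1.0
        for i, cluster in enumerate(clusters):
            sim = cluster_similarity(record, cluster)
            if sim > best_sim:
                best_idx, best_sim = i, sim
        # When the cluster limit is reached, assignment happens even below the threshold.
        limit_reached = max_clusters is not None and len(clusters) >= max_clusters
        if best_idx is not None and (best_sim > threshold or limit_reached):
            clusters[best_idx].append(record)
        else:
            clusters.append([record])
    return clusters

# Hypothetical records: (sex, marital status, province).
records = [
    ("male", "single", "X"),
    ("male", "single", "X"),
    ("female", "single", "Y"),
    ("female", "single", "Y"),
    ("male", "married", "X"),
]
clusters = assign_records(records, threshold=0.5)
cluster_sizes = [len(c) for c in clusters]
```

With a threshold of 0.5 the last record joins the first cluster because it matches two of three fields with each of its members, while the third record opens a new cluster.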

Clustering with Kohonen Feature Maps: The Kohonen Feature Map algorithm is center-based. It normalizes input variables to the value range [0;1]. Categorical input variables are encoded by using nominal encoding. Therefore, categorical input variables with lots of different values can slow down the mining run considerably. The Kohonen Feature Map tries to put the cluster centers in places that minimize the overall distance between records and their cluster centers. Euclidean distance is used to determine the distance between a record and the centers. The separation of clusters is not taken into account. The center vectors are arranged in a map with a certain number of columns and rows. These vectors are interconnected so that, when a record is assigned to a cluster, not only the winning center vector that is closest to a training record is adjusted, but also the vectors in its neighborhood. However, the further away the other centers are, the less they are adjusted. For the Kohonen Feature Map algorithm, you can specify a total number of passes. With each pass, the center vectors are adjusted to minimize the total distance between records and their cluster centers. Also, the amount by which the vectors are adjusted is decreased. In the first pass, the adjustments are rough. In the final pass, the amount by which the centers are adjusted is rather small. Only minor adjustments are done.
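The update rule described above can be illustrated with a very small feature map in plain Python. This is a sketch under stated assumptions, not the IBM implementation: the center vectors start on the diagonal of the unit cube (an arbitrary deterministic choice), the neighborhood influence decays with the Manhattan distance on the map, and the learning rate decreases linearly with each pass. Inputs are assumed to be already normalized to [0;1].

```python
import math

def init_centers(rows, cols, dim):
    """Place the center vectors evenly along the diagonal of the unit cube."""
    total = rows * cols
    centers = {}
    for r in range(rows):
        for c in range(cols):
            value = (r * cols + c) / (total - 1) if total > 1 else 0.5
            centers[(r, c)] = [value] * dim
    return centers

def winner_of(record, centers):
    """Map position whose center vector is closest (Euclidean) to the record."""
    return min(centers, key=lambda pos: math.dist(record, centers[pos]))

def train_map(records, rows=2, cols=2, passes=10, start_rate=0.5):
    centers = init_centers(rows, cols, len(records[0]))
    for p in range(passes):
        rate = start_rate * (1 - p / passes)  # adjustments shrink with each pass
        for record in records:
            win = winner_of(record, centers)
            for pos, vec in centers.items():
                # Centers further from the winner on the map are adjusted less.
                grid_dist = abs(pos[0] - win[0]) + abs(pos[1] - win[1])
                step = rate / (1 + grid_dist) ** 2
                for i, x in enumerate(record):
                    vec[i] += step * (x - vec[i])
    return centers

# Hypothetical normalized records forming two well-separated groups.
records = [(0.0, 0.1), (0.1, 0.0), (1.0, 0.9), (0.9, 1.0)]
centers = train_map(records)
low_cell = winner_of((0.0, 0.1), centers)
high_cell = winner_of((1.0, 0.9), centers)
```

After training, the two groups are won by opposite corners of the map, while the remaining centers settle between them.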

Enhanced BIRCH Clustering: The enhanced BIRCH algorithm is distribution-based. BIRCH means balanced iterative reducing and clustering using hierarchies. It minimizes the overall distance between records and their clusters. To determine the distance between a record and a cluster, the log-likelihood distance is used by default. If all active fields are numeric, you can select Euclidean distance. The enhanced BIRCH clustering algorithm performs the following independent steps to cluster data: 1) Creating a clustering feature (CF) tree by arranging the input records such that similar records become part of the same tree nodes; 2) Clustering the leaves of the CF tree hierarchically in memory to generate the final clustering result. In this step, the best number of clusters is automatically determined. If you specified the maximum number of clusters to be generated, the best number of clusters is determined within the specified limit; 3) Refining the clustering result by using a number of K-Means passes. Besides the common parameters for clustering, you can define specific parameters for the Enhanced BIRCH Clustering algorithm: a) Distance measure: The enhanced BIRCH algorithm provides the distance measures log-likelihood and Euclidean. By default, the log-likelihood distance is used. If all active fields are numeric, you can use Euclidean distance; b) Number of leaf nodes: The number of clustering leaf nodes influences the model quality. Both the model quality and the run time increase with the number of leaf nodes. The default number of leaf nodes is 1,000; c) Number of passes: You can specify the number of passes the enhanced BIRCH algorithm does for refinement of the final clustering result. The number of passes affects not only the model quality but also the processing time of training runs because each pass requires a full scan of the data.
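The clustering feature (CF) on which the CF tree is built can be sketched for numeric fields as a triple of record count, linear sum and sum of squares. The sketch below follows the classic BIRCH formulation with Euclidean geometry; the log-likelihood distance and the tree construction of the enhanced algorithm are not modeled, and the class and method names are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class ClusteringFeature:
    n: int               # number of records summarized
    linear_sum: list     # per-field sum of values
    square_sum: float    # sum of squared values over all fields

    @classmethod
    def from_record(cls, record):
        return cls(1, list(record), sum(x * x for x in record))

    def merge(self, other):
        """CFs are additive, so two summaries merge without revisiting the records."""
        return ClusteringFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.linear_sum, other.linear_sum)],
            self.square_sum + other.square_sum,
        )

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def radius(self):
        """Root mean squared distance of the summarized records from the centroid."""
        c = self.centroid()
        variance = self.square_sum / self.n - sum(x * x for x in c)
        return math.sqrt(max(variance, 0.0))

# Two hypothetical numeric records summarized and merged.
cf = ClusteringFeature.from_record((1.0, 2.0)).merge(
    ClusteringFeature.from_record((3.0, 4.0)))
```

Because the summary is additive, a tree node can absorb a new record or another node in constant time per field, which is what makes a single scan over large inputs feasible.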

The mining parameter setup for generating clusters is shown in Figure 4.

Figure 4. Mining parameters for generating clusters

Influence of sex (gender) in the use of ICTs by students and their academic performance

The largest group contains 16% of the total population. The smallest group contains 4.17% of the total population. The overall quality of the model, a measure of the homogeneity of the clusters, is 0.749, which indicates that, on average, tuples in the same cluster share the same value in 74.9% of the active attributes.

One of the clusters, which represents 16% of the total population, consists predominantly of students with the following characteristics: male; a final status of 6, the grade with which they passed the course; single; hometown Curuzú Cuatiá (84%); and province of origin Corrientes (96%). Among the male population (predominant in that cluster), 58% consider that ICTs facilitate the teaching of the subject, while 27% see their importance in their application to the professional field.

Another cluster, with 11.46% of the population, consists entirely of female students, 21% of whom achieved a final status of 7, 8 or 9. Although the grade of 6 that is common in the male population does not appear regularly in this group, the women obtained higher grades. Marital status is single in all cases; the city of origin is Curuzú Cuatiá for 86%; the home province is Corrientes for 100%; 27% believe that ICTs are a reality, while 64% think that their importance lies in their application to the professional field.
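One possible reading of this quality measure is the average, over all pairs of tuples that share a cluster, of the fraction of active attributes on which the two tuples agree. The sketch below computes that quantity; it is an assumption made for illustration and not necessarily the formula used by the Intelligent Miner, and the data are hypothetical.

```python
from itertools import combinations

def pair_agreement(a, b):
    """Fraction of active attributes with the same value in both tuples."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)

def cluster_homogeneity(clusters):
    """Mean pairwise agreement over all pairs of tuples in the same cluster."""
    scores = [pair_agreement(a, b)
              for cluster in clusters
              for a, b in combinations(cluster, 2)]
    return sum(scores) / len(scores) if scores else 1.0

# Hypothetical clusters of (sex, marital status, province) tuples.
clusters = [
    [("m", "single", "X"), ("m", "single", "Y")],
    [("f", "single", "X"), ("f", "single", "X"), ("f", "married", "X")],
]
homogeneity = cluster_homogeneity(clusters)
```

For this toy input the four within-cluster pairs agree on 2/3, 1, 2/3 and 2/3 of the attributes, giving a homogeneity of 0.75.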

Influence of educational level of parents in the use of ICTs by students

The largest group contains 31% of the total population. The smallest group contains 3.84% of the total population.

One cluster, corresponding to 31% of the total population, indicates that 23% of the students' parents have completed elementary school, while 14% have completed secondary school. Regarding the use of ICTs by students, 56% describe them as facilitators of the learning process and 28% consider that they will be essential in professional practice, which suggests a priori a high degree of acceptance of these technologies (84%).

Another cluster, corresponding to 13% of the total population, shows that 100% of the parents have completed secondary school. Regarding the use of ICTs by students, there is a strong response concerning the importance that students assign to these tools (98%), which they link fundamentally to their academic training.

Another group, corresponding to 11.39% of the total population, shows that 95% of the students' parents have completed elementary school, while 3% have completed university studies and 2% have completed non-university higher education; 59% of the students believe that ICTs facilitate the learning process, while 26% say that they will be essential for professional practice.

From the above, it can be observed that as the parents' level of education improves, it influences the opinion that the student has regarding the use of ICTs.

Influence of type of training received in high (secondary) school in the use of ICTs by students

The largest group contains 38% of the total population. The smallest group contains 3.36% of the total population.

In the cluster corresponding to 38% of the total population, the predominant qualification profile is related to the administrative management of organizations (35%). Regarding the students' opinion of ICTs, 100% define these tools as facilitators of teaching. A priori, the type of degree obtained at the end of high school does appear to have an influence, since students whose degree profile is oriented toward the administrative management of companies have a better opinion of these technologies.

Influence of the fact that the students work in addition to studying, in the use of the ICTs

The largest group contains 18.61% of the total population. The smallest group contains 5.12% of the total population.

In the cluster corresponding to 18.61% of the total population, 100% of the students do not work; with respect to the use of ICTs, 100% agree that they facilitate the teaching process.

In another cluster, corresponding to 8.54% of the total population, 100% of the students work in tasks that take, on average, more than 5 clock hours per day. Regarding the use of ICTs, the importance attached to these tools does not clearly indicate a difference between students who work and those who do not; however, students who work and study hold a more concrete opinion, since they also express interest in using these tools in the professional field.

Influence of the general attitude toward study in the use of ICTs by students

The largest group contains 19.72% of the total population. The smallest group contains 5.45% of the total population.

The cluster corresponding to 19.72% of the total population consists predominantly of students who devote more than 10 and up to 20 hours to study, state that ICTs facilitate the teaching and learning process, and rate study as more important than fun. Specifically, 100% of this population reports a commitment of between 10 and 20 hours of study, 100% gives study greater importance than fun, and 100% reports that ICTs facilitate the learning process.

In another cluster, corresponding to 10.14% of the total population, 100% of the students report devoting between 10 and 20 hours a week to study; 98% give study greater importance than fun, while 1% rate it above family; and 100% report that ICTs will be essential to professional practice.

In another cluster, corresponding to 5.45% of the total population, 88% of the students report devoting more than 20 hours a week to study, while 2% report up to 10 hours inclusive. Regarding the importance assigned to study, 77% believe that it is more important than fun, 1% more important than family, and 22% more important than work. Regarding the use of ICTs, 70% believe that they facilitate the teaching process, 15% believe that they will be essential for professional practice, and 11% believe that they are a reality today.

It can be seen that the degree of commitment and importance that students attach to their studies is directly related to their attitude toward the use of ICTs.

4.2. Results with Association Generators

Association mining aims to find items that are consistently associated with others in a meaningful way. Discovered relationships are expressed as association rules. Association mining also assigns probabilities to these rules. The first part of an association rule is called the body of the rule and the second part the head of the rule.

Association rules have the following attributes: a) confidence: the confidence value represents the validity of the rule (a rule has 70% confidence if, in 70% of the cases in which the body of the rule is present in a group, the head of the rule is also present in the group); b) support: the support value is expressed as a percentage of the total number of records or transactions; c) lift (elevation): the lift value indicates to what extent the confidence value is higher than expected. It is defined as the ratio of the confidence value to the support value of the rule head; the support of the rule head can be considered the expected value for the confidence and indicates the relative frequency of the rule head in the whole set of transactions.
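The three rule attributes defined above can be computed directly from a set of transactions. The sketch below is only illustrative: the function names and the four toy records are assumptions, and "support" is taken here as the fraction of transactions containing both body and head, which is one common convention.

```python
from typing import Dict, Iterable, List, Set


def support(transactions: List[Set[str]], items: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    wanted = set(items)
    return sum(1 for t in transactions if wanted <= t) / len(transactions)


def rule_metrics(
    transactions: List[Set[str]], body: Iterable[str], head: Iterable[str]
) -> Dict[str, float]:
    """Support, confidence and lift of the rule body => head."""
    body, head = set(body), set(head)
    body_support = support(transactions, body)
    head_support = support(transactions, head)
    rule_support = support(transactions, body | head)
    if body_support == 0 or head_support == 0:
        raise ValueError("body and head must each occur at least once")
    # Confidence: share of body-containing transactions that also contain the head.
    confidence = rule_support / body_support
    # Lift: confidence relative to the head's overall relative frequency.
    lift = confidence / head_support
    return {"support": rule_support, "confidence": confidence, "lift": lift}


# Illustrative records only; not taken from the study data.
students = [
    {"gender=male", "marital=single"},
    {"gender=male", "marital=single"},
    {"gender=male", "marital=married"},
    {"gender=female", "marital=single"},
]
print(rule_metrics(students, {"gender=male"}, {"marital=single"}))
```

A lift above 1 means the head is more frequent when the body is present than in the data as a whole; a lift below 1, as in this toy example, means the opposite.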

You can make the associations or the sequences that are found among items more meaningful if you group the items in categories. You can group these categories again into subcategories. The result is a hierarchy of categories with the items on the lowest level. This is called a taxonomy ^[75].

In total, 112 rules were obtained, some of which are listed below; the mining parameter setup for generating associations is shown in Figure 5.

Figure 5. Model parameters for generating associations


If the student is male, the marital status is single in 91% of cases.

If the student is female, the marital status is single in 85% of cases.

If the final status of the student is 6, which occurs in 31% of cases, the marital status is single in 86% of cases.

If the student is female, the student holds the opinion that ICTs facilitate the teaching process in 56% of cases.

If the student believes that the use of ICTs is essential for professional practice, which occurs in 25% of cases, the marital status is single in 88% of cases.

If the student is male, the final status is 6 in 37.5% of cases.

If the student is female, the final status is 6 in 35.44% of cases.

If the student believes that the use of ICTs is essential for professional practice, which occurs in 14% of cases, the student is female in 49% of cases.

If the student's opinion is that the use of ICTs facilitates teaching and the hours devoted to study are up to 10 inclusive, which occurs in 12.54% of cases, the student is male in 50.31% of cases.

If the marital status is single and the student believes that the use of ICTs is essential for professional performance, which occurs in 13% of cases, the student is male in 52% of cases.

If the student's opinion is that the use of ICTs facilitates teaching and the hours devoted to study are over 10 and up to 20 inclusive, which occurs in 13.43% of cases, the student is female in 49.68% of cases.

If the student's opinion is that the use of ICTs facilitates teaching and the hours devoted to study are over 10 and up to 20 inclusive, which occurs in 13.60% of cases, the student is male in 50.31% of cases.

If the student is female and the final status is 6, which occurs in 14.46% of cases, the marital status is single in 82% of cases.

If the final status of the student is 6 and the hours devoted to study are between 10 and 20 inclusive, which occurs in 15% of cases, the marital status is single in 86% of cases.

If the final status is 6 and the student is male, which occurs in 17% of cases, the marital status is single in 90% of cases.

If the student is female and devotes up to 10 hours inclusive to study, which occurs in 19% of cases, the marital status is single in 85% of cases.

If the student is single and the hours devoted to study are up to 10 inclusive, which occurs in 22% of cases, the opinion on the use of ICTs is that they facilitate the learning process in 56% of cases.

If the student's opinion is that the use of ICTs facilitates teaching and the hours devoted to study are more than 10 and up to 20 inclusive, which occurs in 24% of cases, the marital status is single in 88% of cases.

If the student believes that the use of ICTs facilitates the teaching process and the student is male, which occurs in 25.63% of cases, the marital status is single in 91.25% of cases.

4.3. Results with Decision Trees

IM supports a decision tree implementation of classification ^[75]. A Tree Classification algorithm is used to compute a decision tree. Decision trees are easy to understand and modify, and the model developed can be expressed as a set of decision rules.

This algorithm scales well, even where there are varying numbers of training examples and considerable numbers of attributes in large databases.

Decision Tree Classification generates the output as a binary tree-like structure, which gives fairly easy interpretation to the marketing people and easy identification of significant variables for the churn management.

A Decision Tree model contains rules to predict the target variable. The Tree Classification algorithm provides an easy-to-understand description of the underlying distribution of the data.

The configuration options are:

1. Maximum purity per internal node: You can customize the binary decision tree by specifying the maximum purity per internal node. This value is a limit to stop further splitting of a node that has reached the specified purity value when the initial decision tree is being built. The maximum purity is specified as a percentage value.

2. Maximum tree depth: You can customize the binary decision tree by specifying the tree depth. Maximum tree depth is a limit to stop further splitting of nodes when the specified tree depth has been reached during the building of the initial decision tree.

3. Minimum number of records per internal node: You can customize the binary decision tree by specifying the minimum number of records per internal node. This sets a limit value. It prevents further splitting of a node that has reached the specified minimal size when the initial decision tree is being built.

4. Threshold for incorrect predictions: You can specify the percentage of incorrect predictions that you will tolerate on validation data in a classification model. The Tree Classification algorithm does not stop if this percentage is reached.

5. Cost matrix: If, by default, all misclassifications had equal weights, target values (class labels) that appear less frequently would not be privileged. You might obtain a model that misclassifies these less frequent target values while achieving a very low overall error rate. To improve classification decision trees and to get better models with such 'skewed data', the Tree heuristic automatically generates an appropriate cost matrix to balance the distribution of class labels when a decision tree is trained. You can also manually adjust the cost matrix.
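The first three options act as stopping conditions while the tree is grown, and the cost matrix reweights errors on rare classes. The sketch below illustrates both ideas under stated assumptions: the function names and default values are hypothetical, and the inverse-frequency weighting is one simple balancing heuristic, not the heuristic that IM actually applies.

```python
from collections import Counter
from typing import Dict, Sequence


def node_purity(labels: Sequence[str]) -> float:
    """Percentage of records in a node that carry its most frequent class label."""
    counts = Counter(labels)
    return 100.0 * counts.most_common(1)[0][1] / len(labels)


def should_stop_splitting(
    labels: Sequence[str],
    depth: int,
    max_purity: float = 100.0,  # hypothetical default, in percent
    max_depth: int = 10,        # hypothetical default
    min_records: int = 5,       # hypothetical default
) -> bool:
    """True if any of the three limits forbids splitting this node further."""
    return (
        node_purity(labels) >= max_purity
        or depth >= max_depth
        or len(labels) <= min_records
    )


def balanced_cost_matrix(labels: Sequence[str]) -> Dict[str, Dict[str, float]]:
    """Cost of predicting `predicted` when the true class is `actual`.

    Assumed heuristic: errors on a class cost more the rarer that class is.
    """
    counts = Counter(labels)
    largest = max(counts.values())
    return {
        actual: {
            predicted: 0.0 if predicted == actual else largest / counts[actual]
            for predicted in sorted(counts)
        }
        for actual in sorted(counts)
    }


# Illustrative labels only: a skewed node with six "low" and two "high" records.
node_labels = ["low"] * 6 + ["high"] * 2
print(node_purity(node_labels))
print(should_stop_splitting(node_labels, depth=3, max_purity=70.0))
print(balanced_cost_matrix(node_labels))
```

With such a matrix, misclassifying a record of the rarer class is penalized more heavily than misclassifying one of the frequent class, which counteracts the tendency described in option 5.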

The parameter configuration for classification mining with decision trees is shown in Figure 6.

Figure 6. Model parameters for decision tree classification


The results have been summarized and grouped according to final grade (class): final grades between 7 and 10 were considered high academic performance, a final grade of 6 medium academic performance, and final grades from 0 to 5 low academic performance. By way of example, the results for high academic performance with a grade of 7 are shown in Table 11.

Table 11. Students of high academic performance: final grade 7


For reasons of space, tables for the other final grades are not included.
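The grouping of final grades into the three performance classes used in this section can be written as a small helper. This is a sketch under the assumption that final grades are integers from 0 to 10; the function names and the sample grades are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def performance_class(final_grade: int) -> str:
    """Map a final grade (assumed integer, 0 to 10) to a performance class."""
    if not 0 <= final_grade <= 10:
        raise ValueError("final grade must be between 0 and 10")
    if final_grade >= 7:
        return "high"
    if final_grade == 6:
        return "medium"
    return "low"


def class_shares(final_grades: Iterable[int]) -> Dict[str, float]:
    """Percentage of students falling in each performance class."""
    counts = Counter(performance_class(g) for g in final_grades)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


# Illustrative grades only; not taken from the study data.
print(class_shares([7, 6, 6, 2]))
```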

The summarized profile of the students considered to have high academic performance, corresponding to 25.78% of the population, is the following: a) most live with the family group; b) they generally do not work; c) a minority group works up to 20 hours a week; d) in the majority of cases the relationship between work and the chosen career is partial; e) the parents' level of primary and secondary schooling is relatively low, with some cases of tertiary or university education; f) the parents' employment rate is mostly relatively high; g) in most cases the students' goal in studying is to learn how to learn or to learn the subject fully; h) the majority associate the use of ICTs with the teaching and learning process and consider it essential for professional practice; i) the majority are single, with a good percentage of married students; j) the majority are male; k) a minority group gives study higher priority than work.

The summarized profile of the students considered to have medium academic performance, corresponding to 36.44% of the population, is the following: a) the majority live with the family group; b) they usually do not work; c) a minority group works up to 20 hours a week; d) in most cases the relationship between work and the chosen career is partial; e) the parents' level of primary and secondary schooling is relatively low, with no cases of tertiary or university education; f) the parents' employment rate is mostly relatively low; g) in most cases the students' goal is to study in order to pass the subject; h) the majority associate the use of ICTs with the teaching-learning process; i) most are single, with a good percentage of married students; j) the majority are male.

The summarized profile of the students considered to have low academic performance, corresponding to 37.73% of the population, is the following: a) most live with the family group, with a significant minority living independently, concentrated especially in the class corresponding to a grade of 2; b) they generally do not work, but a significant group does, and this category has the largest number of working students; c) a minority group works up to 20 hours a week and another, smaller group more than 36 hours per week; d) in the majority of cases the relationship between work and the chosen career is partial or non-existent; e) the parents' level of primary and secondary schooling is relatively low, with some cases of tertiary or university education; f) the parents' employment rate is mostly relatively high, with an important minority group showing a low rate; g) in most cases the students' goal is to study in order to pass the subject, while a minority group aims to learn how to learn or to learn the subject fully; h) most associate the use of ICTs with the teaching-learning process, and a minority group considers it essential for professional practice; i) most are single; j) the majority are female.

Table 12 shows some of the correlations that have been selected as relevant to the analysis of the objectives set out in this research.

There are interesting correlations, for example those showing the effect of the first quarter grade on the student's final status, the impact of the type of residence on the final status, the relationship of the parents' education level to the hours devoted to study and to the final status, and the effect of the use of ICTs on the final status.

Table 12. Correlation and significance of field
