ANALYSIS OF ACADEMIC PERFORMANCE BASED ON SOCIOGRAMS: A CASE STUDY WITH STUDENTS FROM AT-RISK GROUPS

The present work analyzes the academic performance of students from at-risk groups from the perspective of Social Network Analysis (SNA), using the academic and interaction information of 45 students belonging to at-risk groups who attended a pilot socio-academic course during one academic term. This information was used to create a sociogram, which served as the basis for determining the SNA centrality metrics. The relationships between these metrics and the academic variables were then studied by means of correlation analysis and linear regression with LASSO regularization. The results indicate that the academic performance of the students in the pilot course was influenced, on the one hand, by their academic knowledge prior to being admitted to the university, represented by the score on the Mathematics and Geometry section of the diagnostic test, and on the other hand, by the dynamics of the social network in which they interacted in the classroom, represented by the eigenvector centrality. These results show that SNA metrics have significant potential for explaining academic performance, and they provide evidence to support the implementation of practices that promote a healthy social environment in an academic context.


Introduction
Education is a crucial and determining factor in the development of a nation, as it has a direct influence on the progress of both people and societies. The educational process provides the competences that allow progress to be made in different areas of society, such as culture, productivity and economic competitiveness, scientific and technological innovation and processes intended to ensure higher levels of social well-being. As a result, all educational processes seek to optimize, constantly and permanently, the performance of students in the different stages of their academic training (Bhardwaj, 2016;Smith, Fraser, Chykina, Ikoma, Levitan et al., 2017).
In this sense, one of the primary indicators in any analysis concerning education is academic performance, in which two basic components are considered: the learning process and evaluation (García, Lamos-Duarte, Vargas-Rivera, Camargo-Villalba & Capacho, 2019). According to Pizarro (1985), academic performance is a measurable indicator of the responding or indicative skills that allows us to estimate what an individual has learned during a certain educational process. Along the same lines, Caballero D., Abello and Palacio (2007) emphasize that academic performance, besides being the result of institutionalized training, also includes aspects of non-institutionalized training, and thus the presence of other underlying components of this indicator is relevant. In this context, Navarro (2003) states that when academic performance is studied, other factors that might influence it are analyzed alongside it. These factors encompass the environments of the educational system itself, as well as the teaching methodologies that are applied, the previous knowledge of the students and the breadth of the curricula, among other factors. However, other components are also considered, such as socioeconomic factors, motivational aspects, emotional factors and social skills.
In light of the fact that academic performance is inherently multifactorial in nature, several different focuses can in turn be proposed to analyze it and to study the factors that converge in this complex component of the educational process.

Previous Works
Traditionally, academic performance at the university level has been associated with factors such as previous academic preparation, access to technology and scores on entrance exams (Callejas, Griol & Lázaro-Álvarez, 2020;Goodchild & Bjørkestøl, 2020;Ismail, Mahmood & Abdelmaboud, 2018). Likewise, socioeconomic variables have also been studied, such as monthly family income, residence in rural areas, housing type and gender, as determining factors in academic performance.
Parallel to this, the influence of student-student and student-teacher interaction has been studied with a qualitative focus (Amo & Santelices, 2017;Sandoval, Sánchez, Velasteguí & Naranjo, 2018;Yang & Tang, 2003;Sánchez, Gilar-Corbi, Castejón, Vidal & León, 2020). However, in spite of the fact that very interesting results have been obtained in qualitative studies in terms of the influence of social skills on academic performance, the importance of interactions in the classroom has become especially relevant from the perspective of social network analysis (SNA), the methodology of which is based on building a sociogram to establish the relationships between the actors and to analyze the structures that result from the recurrence of these relationships (Abbasi, Altmann & Hossain, 2011;Jain & Langer, 2014).
On the sociogram, each of the actors in the social network is represented by a node, and each of the interactions that exist between these actors is shown by a segment that links the corresponding nodes. Based on the sociogram, different metrics are determined to quantify the properties at the whole-network and node levels; those at the node level are known as centrality metrics (Gomes Jr., 2019). Among these metrics are degree centrality, which represents the number of nodes that are directly related to the node in question; closeness centrality, which represents the capacity of a node to reach the remaining nodes on the network by following the shortest distances that separate them (geodesic paths); betweenness centrality, which represents the number of times that a node is found on a geodesic path; and eigenvector centrality, which represents the relevance of a node on the network, based on the node degree values and, recursively, on the relevance of its neighboring nodes (Abbasi et al., 2011; Rizzuto, Ledoux & Hatala, 2009).
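The node-level metrics described above can be illustrated with a minimal sketch in plain Python. It is not the code used in this study: the five-node network, the function names and the choice of an unnormalized degree and a power-iteration scheme for the eigenvector centrality are assumptions made only for illustration, and betweenness centrality is omitted for brevity.

```python
from collections import deque
from math import sqrt


def degree_centrality(adj):
    """Unnormalized degree: number of direct neighbours of each node."""
    return {node: len(neigh) for node, neigh in adj.items()}


def closeness_centrality(adj):
    """Reachable nodes divided by the sum of geodesic distances to them."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search yields shortest path lengths
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


def eigenvector_centrality(adj, iterations=1000, tol=1e-10):
    """Power iteration: a node's score depends recursively on its neighbours."""
    score = {node: 1.0 for node in adj}
    for _ in range(iterations):
        # Keeping the node's own score (A + I) avoids oscillation on
        # bipartite graphs without changing the dominant eigenvector.
        new = {n: score[n] + sum(score[v] for v in adj[n]) for n in adj}
        norm = sqrt(sum(x * x for x in new.values()))
        new = {n: x / norm for n, x in new.items()}
        if max(abs(new[n] - score[n]) for n in adj) < tol:
            return new
        score = new
    return score


# Hypothetical toy sociogram (undirected, binary): A-B, A-C, A-D, D-E.
toy = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A"},
    "D": {"A", "E"},
    "E": {"D"},
}

degree = degree_centrality(toy)        # A has 3 neighbours, D has 2, the rest 1
closeness = closeness_centrality(toy)  # A reaches the other four nodes in 5 steps
eigen = eigenvector_centrality(toy)    # A ranks first; D outranks B and C
```

In this toy network, nodes B and D illustrate the recursive nature of the eigenvector centrality: both are linked to the well-connected node A, but D receives a higher score because it has an additional neighbour.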
In addition, among the metrics that quantify the properties at the network level are density, which represents the number of actual relationships between the nodes of a network relative to the total number of possible relationships, and the average degree, which represents the average number of relationships per node on a network (Abbasi et al., 2011; Rizzuto et al., 2009). Different studies have found that the structural properties of social networks are associated not only with aspects exclusively concerning socialization, such as the formation of friendship bonds or the creation of a sense of belonging to an institution. Metrics such as density and the centrality indicators can also be analyzed in conjunction with the variables traditionally considered in the study of academic performance and, consequently, be linked directly and substantially to phenomena inherent to the educational system, such as dropping out and academic performance (De-Marcos, García-López, García-Cabot, Medina-Merodio, Domínguez, Martínez-Herraíz et al., 2016; Gomes Jr., 2019; Helal, Li, Liu, Ebrahimie, Dawson, Murray et al., 2018; Jain & Langer, 2014; Mihaly, 2011).
Likewise, it has been found that the structure of a social network in a group of students can be considered an explanatory component that accounts, to varying degrees, for performance in a formal instruction process in an educational setting (Helal et al., 2018; Jain & Langer, 2014).
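For reference, the two network-level metrics defined above can be written compactly for an undirected, binary network. The symbols n (number of nodes), m (number of relationships), D (density) and \bar{k} (average degree) are notation introduced here for illustration and do not appear in the original text.

```latex
D = \frac{m}{\binom{n}{2}} = \frac{2m}{n(n-1)},
\qquad
\bar{k} = \frac{2m}{n} = D\,(n-1)
```

The second identity shows that, for a fixed average degree, density decreases as the number of nodes grows, which is why densities of networks of different sizes should be compared with caution.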
Parallel to this, the inclusion of SNA metrics in the analysis of academic performance has made it possible to indirectly associate the influence of the so-called soft skills, which are the qualities that people develop in order to improve their relationships with peers; these are learned through daily experience, and they have a great impact on a personal, professional, labor and social level (Patacsil & Tablatin, 2017).
To summarize, SNA can be used to generate numeric indicators that describe some of the characteristics of the behavior of the actors of a social network, as well as the structure of the network itself. These indicators can be linked to other variables that describe a phenomenon of interest, such as academic performance, in order to determine correlations and thus obtain measurable results based on an abstract concept, which in this case, comes from social interaction.

The Present Study
The present study proposes to analyze, from the perspective of SNA, the influence of social interactions on the academic performance of a group of students belonging to at-risk sectors. This research is developed within a context that is relatively different from that of other previous works, as it analyzes the information on students of similar socioeconomic and academic characteristics, known in Ecuador as beneficiaries of the Quota Policy. The students who make up this group are selected by the Secretaría Nacional de Educación Superior, Ciencia, Tecnología e Innovación in Ecuador (SENESCYT, according to its acronym in Spanish) during each admissions process.
Since 2014, the SENESCYT has conducted a socioeconomic characterization process on applicants to institutions of higher education (IHE) with the aim of identifying those who are most at-risk socioeconomically and promoting their access to higher education through an affirmative action policy known as the Quota Policy. Students benefiting from this program are given preferential access to 15% of the total positions offered by institutions of higher education in Ecuador (Sandoval et al., 2018; Sandoval, Sánchez, Naranjo & Jiménez, 2019).
Starting in 2017, the SENESCYT has included beneficiaries of the Quota Policy among the students given preferential access to the Leveling Course of the Escuela Politécnica Nacional (EPN). However, when comparing them to their peers from the general population, it was evident that the students from the at-risk groups had certain deficiencies in their previous academic preparation, which led to high failure and dropout rates. At the same time, it was observed that these students showed high levels of demotivation, disorganization and had difficulties interacting in a classroom setting with both classmates and their instructors (Ramos, Sánchez, Reina & Franco-Crespo, 2020;Sandoval, Sánchez, Naranjo et al., 2019). In response to these precedents, the EPN implemented a pilot socio-academic intervention course during the second term of 2019 with a sample of beneficiaries of the Quota Policy. This course was focused on giving students from at-risk groups the academic skills they need and developing their abilities for classroom interaction and participation.
Previous works on this group of students have investigated the relationship between academic achievement and traditionally studied variables, such as socioeconomic factors and academic background (Sandoval, Sánchez, Velasteguí, et al., 2018; Sandoval-Palis, Naranjo, Vidal & Gilar-Corbi, 2020). The present work, however, also studies the influence of social interactions on this phenomenon and applies a regression technique based on machine learning in order to determine the set of variables with the greatest explanatory potential. It should be stressed that in the context of this study, the component of interest that represents academic performance concerns evaluation, which is expressed in the form of scores that measure student performance (García et al., 2019).
The precedents presented suggest the possibility of studying the relationship between social interactions and academic performance, one component of the multifactorial educational process, and thus the following objectives have been established:
(1) To determine in a quantifiable manner the interaction between students in the pilot socio-academic intervention course through the calculation of SNA metrics.
(2) To analyze the correlation between the variables that represent social interaction and those representing academic performance.
(3) To establish the set of variables, both academic and SNA-related, which have the greatest influence on academic performance.

Materials and Methods
This research is framed within a descriptive and correlational case study approach.

Participants
A total of 257 students benefiting from the Quota Policy were admitted to the EPN during the second term of 2019, of which 70% were male and 30% were female. 69% of the students applied to the leveling course for the Engineering, Science and Administrative Sciences programs, while the remaining 31% applied to the leveling course for the Advanced Technology program. 63.6% came from the province of Pichincha, 22.9% came from other provinces in the Andes Region, 10.3% came from the provinces of the Amazon Region and the remaining 3.2% came from the provinces of the Coastal Region. The students were given a diagnostic test that evaluated their knowledge and skills in two sections: Mathematics and Geometry, and Language and Communication. The average score out of 10 possible points obtained on each section was 4.28 ± 1.9090 and 4.70 ± 1.4975, respectively.
For the selection of participants in the pilot socio-academic intervention course, an orientation workshop was held in which the students were informed about the option to participate in said program for one term prior to the EPN leveling course. Given that the pilot socio-academic intervention course was not mandatory and that it needed to have a similar number of students as a parallel ordinary leveling course, a sample of 45 students was selected from among those who agreed to participate voluntarily in the program. An attempt was made to maintain the population proportions with regard to gender, type of leveling course (Engineering, Sciences and Administrative Sciences or Advanced Technology) and province of origin. Likewise, through an inferential analysis, it was determined that the average scores of the 45 students on each of the sections of the diagnostic test showed no statistically significant differences from the corresponding population averages. Table 1 shows the characteristics of the study participants.

Measurements
The previous knowledge and skills in Mathematics, Geometry and Language and Communication were evaluated according to a maximum of 10 points on a diagnostic test consisting of 80 multiple choice questions. The test was designed and validated by professionals from the Basic Science Department of the EPN (Sandoval, Sánchez, Velasteguí, et al., 2018;Sandoval, Sánchez, Naranjo et al., 2019).
The final score out of 10 possible points is the simple average of the scores obtained by the students following the culmination of the pilot socio-academic intervention course in the Mathematics, Geometry and Reading/Writing subjects. In turn, the scores on each of the subjects were made up by grades on homework, quizzes, a mid-term exam and another final exam at the end of the term.
The SNA metrics were determined based on an interaction survey administered to the students at the end of the term.

Procedure
Information on gender, the score on each of the sections of the diagnostic test and the final score was obtained from the academic records of the pilot intervention course. In this study, only aggregated, anonymized information is presented. Students provided informed consent to voluntarily participate in this study and they were told that their information would be used for research purposes.
The 45 students who made up the sample of this study attended a pilot socio-academic intervention course for one term. During this period, they received classes in Mathematics (10 hours/week), Geometry (10 hours/week), and Reading and Writing (10 hours/week). They also received instruction on the use of computer tools (2 hours/week), study techniques and strategies (2 hours/week), and motivational workshops and coaching (2 hours/week). In the latter, the students explored different aspects concerning the development of their soft skills and emotional intelligence, as well as how they could apply these skills in the teaching-learning process. Parallel to the training process, they were continuously monitored by the social work department created exclusively to work with these students.
The interaction survey was administered by means of an electronic form sent to the students by email. This form consisted of a list with the names of all 45 students, with the instructions to select the two people with whom they had interacted the most in the academic setting (performing tasks, classroom work and study groups) during their experience in the course.

Data Analysis
With the results of the interaction survey, a sociogram was created to represent the social network of the pilot intervention course. Since the intent was to study the potential of the interactions between students rather than their direction, all relationships were considered to be symmetric (undirected) and binary (Newman, 2003).
The density, the average degree and the degree centrality (DC), closeness centrality (CC), betweenness centrality (BC) and eigenvector centrality (EC) metrics were determined based on the sociogram. Next, the node-level metrics were integrated into an SNA metrics matrix together with the information on the academic performance of the students. Students with missing interaction or performance information (as the result of having dropped out of the pilot course) were excluded from the data set.
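The construction of a symmetric, binary sociogram from survey nominations can be sketched as follows. This is an illustrative assumption of how such a step might be coded, not the procedure actually used in the study; the function name `build_sociogram`, the student labels and the nominations themselves are hypothetical.

```python
def build_sociogram(nominations):
    """Convert {student: [named classmates]} into an undirected, binary
    adjacency map. A tie exists if either student named the other, and
    repeated or mutual nominations still count as a single tie."""
    adj = {student: set() for student in nominations}
    for student, named in nominations.items():
        for other in named:
            if other == student:
                continue  # ignore self-nominations
            adj[student].add(other)
            adj.setdefault(other, set()).add(student)  # symmetrize the tie
    return adj


# Hypothetical answers: each student names two classmates.
answers = {
    "S1": ["S2", "S3"],
    "S2": ["S1", "S3"],
    "S3": ["S1", "S2"],
    "S4": ["S1", "S2"],
}

sociogram = build_sociogram(answers)
n_ties = sum(len(neigh) for neigh in sociogram.values()) // 2
```

Note that, although every respondent names exactly two classmates, the degree of a node in the symmetrized network can exceed two, because a student may also be named by others (S1 and S2 in this toy example).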
T tests and correlation analyses were performed on the variables studied, and finally, a linear regression was estimated to determine the variables with the greatest influence on the academic performance of the students after having finished the pilot course. The linear regression model was constructed according to Equation (1):
$$\mathrm{FS} = \beta_0 + \beta_1\,\mathrm{Gender} + \beta_2\,\mathrm{MDS} + \beta_3\,\mathrm{LDS} + \beta_4\,\mathrm{DC} + \beta_5\,\mathrm{BC} + \beta_6\,\mathrm{CC} + \beta_7\,\mathrm{EC} \qquad (1)$$
where FS is the final score, MDS and LDS are the diagnostic scores in Mathematics and Geometry and in Language and Communication, respectively, and DC, BC, CC and EC are the centrality metrics. The regression coefficients $\beta_i$ were determined by means of Least Absolute Shrinkage and Selection Operator (LASSO) regularization, a machine learning technique. Through a series of iterations, this technique shrinks the regression coefficients, which reduces the variability of the estimates, and at the same time selects variables to build a simplified model, since some coefficients are reduced to exactly zero (Gauraha, 2018).
All analyses were carried out in RStudio version 1.2.1335.
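The way in which LASSO shrinks coefficients and sets some of them to exactly zero can be illustrated with a minimal coordinate-descent sketch in plain Python. It is not the implementation used in this study (the analyses were run in R). The objective assumed here is (1/2n) Σ (y_i − β_0 − Σ_j x_ij β_j)² + λ Σ_j |β_j| on standardized predictors; the function names, the toy data and the penalty value λ = 0.1 are hypothetical.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, sweeps=200):
    """Return (intercept, coefficients on standardized predictors).
    Assumes no predictor column is constant (nonzero standard deviation)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    sds = [(sum((row[j] - means[j]) ** 2 for row in X) / n) ** 0.5
           for j in range(p)]
    Z = [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in X]
    intercept = sum(y) / n
    resid = [yi - intercept for yi in y]  # residuals with all betas at zero
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            # Correlation of predictor j with the partial residual.
            rho = sum(Z[i][j] * (resid[i] + Z[i][j] * beta[j])
                      for i in range(n)) / n
            new = soft_threshold(rho, lam)  # column variance is 1
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Z[i][j] * delta
                beta[j] = new
    return intercept, beta


# Hypothetical data: y depends strongly on x1 and only weakly on x2.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [1, -1, -1, 1, 1, -1, -1, 1]  # uncorrelated with x1 by construction
X = [[a, b] for a, b in zip(x1, x2)]
y = [3 * a + 0.05 * b for a, b in zip(x1, x2)]

intercept, beta = lasso_coordinate_descent(X, y, lam=0.1)
```

In this toy example the coefficient of the weak predictor x2 is set to exactly zero, while the coefficient of x1 is retained but shrunk by the penalty, which mirrors the variable-selection behavior described above.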

Results
The mean of the final score obtained by students was 5.33, with a standard deviation of 1.5052. Two students were excluded from the analysis because they dropped out of the course before it ended. Figure 1 shows the sociogram of the pilot course. Each student is represented by a node, the size of which is proportional to the final score. Men and women appear to be distributed in a relatively homogeneous manner throughout the network; however, some of the nodes representing men form subnetworks with an interconnection that is relatively greater than that shown for women. On the other hand, a large part of the smaller nodes (corresponding to lower final scores) are located in peripheral areas, while the larger nodes (higher final scores) are found in areas where the network has greater cohesion. This is a preliminary indicator of the potential relationship that exists between academic performance and the interaction among students.
Table 3 shows a summary of the SNA metrics on the node level obtained from the sociogram of the pilot course. In order to determine the potential relationships between these metrics and the other variables, a t test for equal means was first carried out between women and men; these results are presented in Table 4.
Table 3. SNA metrics on a node level
It is observed that, at a 0.05 level of significance, only the eigenvector centrality is statistically different between women and men. This means that the relevance of men in establishing relationships on the pilot course network was greater than that of women. This agrees with what was previously observed on the sociogram, which showed subnetworks with greater interconnection formed by men. On the other hand, Table 5 shows the correlation matrix considering both the SNA metrics and the academic variables.
It is observed that, at a 0.01 level of significance, both the diagnostic score in Mathematics and Geometry (MDS) and the eigenvector centrality (EC) show a moderate correlation with the final score (FS). Likewise, a moderate correlation is observed between EC and the remaining SNA metrics, and a strong correlation is revealed between closeness centrality (CC) and betweenness centrality (BC). Table 6 shows the regression coefficients obtained by means of LASSO regularization. It is observed that the simplified final score model (FS) consists of the score from the Mathematics and Geometry diagnostic test (MDS) and the eigenvector centrality (EC), as shown in Equation 2, with the coefficient values reported in Table 6; likewise, the coefficients associated with each of these variables are significant at a 0.001 and 0.01 level, respectively.
Note to Table 5: ** The correlation is significant at the 0.01 level (2-tailed). * The correlation is significant at the 0.05 level (2-tailed).
$$\mathrm{FS} = \beta_0 + \beta_2\,\mathrm{MDS} + \beta_7\,\mathrm{EC} \qquad (2)$$
Figure 2 shows the scatter plot of the final score (FS) against the Mathematics and Geometry diagnostic score (MDS); the size of the dots is proportional to the eigenvector centrality.
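The two statistics behind the t tests and the correlation matrix reported in this section can be sketched in plain Python. This is only an illustration: the source does not state which variant of the t test was applied, so the unequal-variance (Welch) form used here is an assumption, and the function names and numeric inputs are hypothetical rather than data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def welch_t(a, b):
    """t statistic for the difference in means, without assuming
    equal variances in the two groups."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((v - ma) ** 2 for v in a) / (na - 1)  # sample variances
    vb = sum((v - mb) ** 2 for v in b) / (nb - 1)
    return (ma - mb) / sqrt(va / na + vb / nb)


# Hypothetical values, for illustration only.
r_perfect = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_inverse = pearson_r([1, 2, 3], [3, 2, 1])
t_stat = welch_t([2, 4, 6], [1, 2, 3])
```

Converting the t statistic into a p-value additionally requires the t distribution with the appropriate degrees of freedom, which statistical packages provide.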

Discussion
The present study intended to analyze the academic performance of students belonging to at-risk groups based on the SNA metrics obtained from the sociogram.
The apparently low resulting value for density could be explained by the fact that each student selected only the two classmates with whom they interacted the most, which systematically reduces the total number of actual relationships on the network. However, it is likely that allowing the students to select, without any restriction, the number of classmates with whom they interacted the most would have provided a similar result, since, as Ramírez Ortiz, Caballero Hoyos and Ramírez López (2004) explain, the interactions among students are not expected to be completely homogeneous: students develop preferences for working with certain classmates over the course of the academic term. That study also analyzed information on students from at-risk groups, and its network densities fell within a range of 0.095-0.169. Even so, these values only serve as a guide, since in order for them to be comparable to the results obtained in the present study, the number of nodes on the network must be the same in both cases (Mihaly, 2011; Ramírez Ortiz et al., 2004; Rizzuto et al., 2009).
Furthermore, the fact that the interactions observed for men demonstrate greater relevance than those of women could be a consequence of the fact that, even today, collaborative work by women is often limited by an environment in which gender bias is present. This behavior has been observed by Jain & Langer (2014), who studied the relevance of social interactions on academic performance.
With regard to the correlation analysis, the results of the present study are similar to those obtained by Gomes Jr. (2019) and Mihaly (2011), who found strong and moderate correlations between the eigenvector centrality value and the mean final score obtained by the students. This result, in turn, suggests that the students with the best connections tend to earn the best grades.
Similarly, some SNA studies, such as those conducted by Abbasi et al. (2011) and Rizzuto et al. (2009), have found that the correlation between the SNA metrics is determined, on the one hand, by the very definition of a particular metric, and on the other hand, by the specific structure of the network. As a result, contrasting results between different studies can be obtained. Along these lines, Gomes Jr. (2019), for example, found a weak correlation between closeness centrality (CC) and betweenness centrality (BC), while in the present study, a strong correlation was observed between these two metrics, which indicates that those students who acted as intermediaries among their classmates (greater BC) have a greater potential to interact with classmates with whom they did not have a direct relationship (greater CC). On the other hand, as in this study, De-Marcos et al. (2016) observed a moderate correlation between eigenvector centrality and the other centrality metrics, which is explained by the inherent recursiveness that defines the eigenvector centrality.
The linear regression model showed a clear relationship, on the one hand, between the academic performance of the students in the pilot course and their prior knowledge, and on the other hand, between academic performance and the power of the students' social connections. This result is in agreement with what was mentioned by Sawyer (2013), in the sense that the skills and knowledge students have when they enter the university are reliable predictors of academic performance, especially during the first year; however, an increasing number of studies like this one include variables that describe the social capital, considered from the perspective of Sociology.
In the case of this study, the variable describing the social capital in the model is the eigenvector centrality. This metric has higher values at the nodes that are linked to highly connected nodes (recursively). In this sense, it can be established that the resulting relationship between social capital and academic performance is, in fact, a property of the complex social dynamics represented on a sociogram (Pulgar, Candia & Leonardi, 2020;Ramírez Ortiz et al., 2004).
This complexity is evidenced by studying the distribution of the final score according to the eigenvector centrality, for example. If we consider a low score to be one below 5 points and an acceptable score to be between 6 and 7 points, Figure 2 shows that there are groups of students with a low final score and groups with an acceptable final score, even though they have similar eigenvector centrality values. This behavior could be due to the fact that, in the case of students with a low final score, their performance was negatively influenced by their academic background, in spite of having established academic bonds with their classmates. Meanwhile, in the case of students with an acceptable final score, in spite of their inadequate academic preparation, having built relationships with their classmates allowed them to perform better than expected in the pilot course. Nevertheless, it is also reasonable to expect that additional factors influence their classroom performance, and since no information is available to determine the underlying reasons, the data points that show this antagonistic behavior cannot be discarded, even though doing so would strengthen the correlation.
On the other hand, while the optimization of an academic performance model goes beyond the scope of this study, the adjusted coefficient of determination for the model presented was 0.5016; this value is comparable to the results obtained by De-Marcos et al. (2016), Gomes Jr. (2019) and Mihaly (2011), who have modeled academic performance with SNA metrics. For this reason, the results from this study were considered to have significant explanatory potential, given that the variables considered merely represent a small subset of the factors that influence academic performance.
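For reference, the adjusted coefficient of determination mentioned above is commonly defined as follows, where R² is the ordinary coefficient of determination, n is the number of observations and p is the number of predictors retained in the model. These symbols are notation introduced here for illustration; the source reports only the adjusted value.

```latex
R^2_{\mathrm{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
```

Because the correction factor grows with p, the adjusted value penalizes models that add predictors without a corresponding gain in explained variance, which makes it a suitable summary for a simplified model with few variables.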

Conclusions
Previous works have established a certain degree of association between student friendship circles and academic performance. In this study, these associations and correlations were explored with a more precise and measurable focus, through the metrics resulting from the sociogram of a pilot socio-academic intervention course for students from at-risk groups. Using a linear regression model, it was determined that, in addition to academic knowledge prior to university admission, the academic performance of a student is influenced by the dynamics of the social network in which they interact within the classroom.
The prior academic knowledge of the students was described solely by the score on the Mathematics and Geometry section of the diagnostic test; in this sense, the score on the Language and Communication section of the diagnostic test did not have a great effect on the score earned by students at the end of the pilot course. Furthermore, the dynamics of the social network were described by the eigenvector centrality value.
The linear model, constructed based on the score on the Mathematics and Geometry section of the diagnostic test, and the eigenvector centrality, has significant explanatory potential, since in general, the academic performance of a student is influenced by many more factors, such as social, emotional and economic aspects, which in turn have an influence on classroom interaction.
This study could serve as the basis for guiding educators at institutions of higher education in evaluating the role of social interactions on academic performance, since the results obtained provide evidence to support the implementation of practices that promote a healthy social environment in an academic context.
Although SNA metrics that describe social interactions were determined, their dynamic nature over time could not be captured, which represents one limitation of this study. Similarly, the study sample is relatively small and is subject to the limitations of a case study, particularly with regard to the voluntary decision to participate in the pilot socio-academic intervention course. It will be necessary to conduct further studies with a larger sample that also consider how social interactions change over time, in order to obtain more general results and to apply models that evaluate the fixed effects inherent to complex social phenomena.

Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.