DOES LIKE SEEK LIKE?: THE FORMATION OF WORKING GROUPS IN A PROGRAMMING PROJECT

In a course of the computer science degree, the programming project has changed from individual to team work, tentatively in pairs (pair programming). Students have full freedom to team up, with minimal intervention from teachers. The analysis of the resulting working groups indicates that students do not tend to associate with students of similar academic performance, perhaps because general cognitive parameters do not drive the choice of academic partners. Pair programming seems to give great results, so future research in this field should focus precisely on how these pairs are formed, and on the mechanisms of human social interaction that underlie their formation.


Introduction
Pair programming (programming in pairs) has been the subject of several studies arguing that this way of coding improves communication and teamwork skills for a minimal time investment, in addition to reducing programming errors and improving the quality of the resulting code (Cockburn & Williams, 2001; Hughes, 2015). Pair programming has been researched and practiced as an optimal choice for learning and teaching programming at diverse educational stages (Hughes, 2015; see Williams & Kessler, 2003, for a review). The question remains: is there an optimal way to form pairs? How can we pair students in order to improve the learning of programming?
The formation of homogeneous or heterogeneous work groups is a classic problem in science education (Esposito, 1973) and a central topic in cooperative learning (Ashman & Gillies, 2003). It has even been theoretically modeled in recent data mining approaches applied to education (Bahargam, Erdös, Bestavros & Terzi, 2015). Unfortunately, its relevance is often forgotten in theoretical learning models (Novikoff, Kleinberg & Strogatz, 2012).
On the other hand, when given the freedom to choose, how do people decide whom to partner with? In Psychology and Cognitive Science, the problem of social decision making has been studied from different perspectives, ranging from selecting sexual partners to purchasing products.
Both emotional and rational factors are usually taken into account (see Hsee & Hastie, 2006, for a review), and, in general, it is well known that we tend to associate with people whose characteristics are similar to our own, even genetically (Rushton, 1989; Fowler, Settle & Christakis, 2011).
What happens in our classroom, then? When our students work in groups, do they select a friend as their partner, or a classmate who will help them improve their grade? Or, taking this even further, are we generally friends with people who are similar to us?
In this study, we analyze empirical team formation data from a programming course of the computer science degree at our university, in the context of current educational neuroscience research. In particular, we explore whether academic factors play a role in the selection of a partner in pair programming.

Team formation in pair programming
In a subject of our degree in computer science (Programming 2, see next section), the programming project recently changed from individual to team work (where teams are generally formed by two members), with the idea of exploiting the potential benefits of pair programming (Hughes, 2015; Williams & Kessler, 2003). In fact, the initial purpose of this change was to give students the opportunity to practice their teamwork skills and to develop the communication skills that this entails, following the framework and guidelines of the European Higher Education Area (De Miguel, 2006). This framework promotes teamwork as a core and transversal competency that all graduating students should have (González & Wagenaar, 2003).
Before changing the programming project from individual to team work, the teaching staff debated the possible advantages and disadvantages of this change and, in particular, whether teams should be selected by teachers (direct educational intervention) or students should be given freedom of choice, considering the educational literature on group formation (McClure, 1990). Teachers even considered the possibility of using group formation software (Gogoulou, Gouli, Boas, Liakou & Grigoriadou, 2007) and the multiple aspects of cooperative learning (Dillenbourg, 1999; Inaba, Supnithi, Ikeda, Mizoguchi & Toyoda, 2000).
It is possible that, when given complete freedom of choice, students would tend to partner with colleagues of opposite academic performance, which is known in sociology as "disassortative mixing" (Newman, 2010). On the one hand, this type of partnership could be a handicap for students of superior performance: they would probably have to do more work than when teaming with a student of similar capabilities. They would also run the risk of getting lower grades than they would achieve individually or with a better partner. On the other hand, in disassortative mixing the student with lower performance would be more likely to obtain a better grade than the one achieved individually, and at the same time would benefit from the peer learning opportunity offered by collaborating with a more capable partner.
However, it could also happen that students team up in a natural way with students of similar academic performance, which would constitute a clear example of "assortative mixing" (Newman, 2010).
In this study, we analyze whether students' cognitive skills determine team formation in pair programming, considering previous academic achievements in the same course as an indicator of the general cognitive skills of our students. As we will see in the following sections, the teams in our study are not generally formed by students of similar performance (at least not in a statistically significant manner), which stands in clear contrast to previous related work (Fowler et al., 2011).

Methodology
Our data correspond to the 2015 spring semester of the Programming 2 (PRO2) course, the first in which the programming project was done in teams. The remaining PRO2 assessments are individual: there is an exam prior to the formation of the teams, then two theory tests, and a final exam on the programming project.
Our students are divided into different theory groups and each theory group is then divided into four laboratory subgroups (with a different teacher assigned to each subgroup). In the semester reviewed we had 288 students divided into 20 laboratory subgroups. Students had total freedom to form teams with other students within the same laboratory subgroup; in principle, they could not form teams with members of different subgroups.
For simplicity, teams not formed by two students (that is, "teams" formed by one or three students) and their members have been excluded from our analysis, leaving a statistical sample of 129 pairs and therefore 258 students. We compute similarity measures based on the grades obtained in the exams and tests; students with missing grades (those who did not sit an assessment) have been excluded. Our final sample contains 105 pairs (210 students).
Laboratory subgroups are coded with two digits, xy, where x is the (theory) group number and y the (lab) subgroup number. In our school, the order of registration into groups is determined by the previous course's grades; students then fill the subgroups in order of increasing y.
The relationship between y and the final grades can be seen in Figure 1. In order to test our hypothesis, we define the similarity parameter S as the average, over all pairs, of the similarity between the grades of the two members of a pair. The latter is defined as the cosine similarity between the vectors of individual grades of each member of the pair. As the null hypothesis, we employ pairs formed at random from the same pool of students, with the restriction that the members of a random pair must be in the same original subgroup. The reason for this restriction is that the subgroups tend to include students with similar academic performance (due to the order of registration in PRO2) and, therefore, students in the same laboratory group could have similar grades (Figure 1). To find out whether the value of S is significantly high, we estimated the p-value (one-sided test) using a Monte Carlo test with 10000 replicas.
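The test described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the function names, the toy grades and the subgroup labels are hypothetical, an all-zero grade vector is arbitrarily given similarity 0, and the p-value uses the common (k + 1) / (n + 1) Monte Carlo estimator, which the text does not specify.

```python
import math
import random
from collections import defaultdict


def cosine_similarity(u, v):
    """Cosine of the angle between two grade vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumption: an all-zero vector is treated as dissimilar
    return dot / (norm_u * norm_v)


def mean_similarity(pairs, grades):
    """S: average cosine similarity over all pairs."""
    return sum(cosine_similarity(grades[a], grades[b]) for a, b in pairs) / len(pairs)


def random_pairs_within_subgroups(subgroup_of, rng):
    """Pair students at random, never mixing laboratory subgroups."""
    by_subgroup = defaultdict(list)
    for student, subgroup in subgroup_of.items():
        by_subgroup[subgroup].append(student)
    pairs = []
    for members in by_subgroup.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        # Consecutive students form a pair; a leftover student is dropped.
        pairs.extend(zip(shuffled[0::2], shuffled[1::2]))
    return pairs


def monte_carlo_p_value(real_pairs, grades, subgroup_of, replicas=10000, seed=0):
    """One-sided p-value: how often random pairings reach the observed S."""
    rng = random.Random(seed)
    observed = mean_similarity(real_pairs, grades)
    at_least = sum(
        mean_similarity(random_pairs_within_subgroups(subgroup_of, rng), grades) >= observed
        for _ in range(replicas)
    )
    # Assumed estimator: the observed pairing counts as one of the replicas.
    return observed, (at_least + 1) / (replicas + 1)


# Toy data (invented for illustration): four students in one subgroup.
grades = {
    "a": [8.0, 7.0, 9.0],
    "b": [8.0, 7.0, 9.0],
    "c": [2.0, 9.0, 4.0],
    "d": [2.0, 9.0, 4.0],
}
subgroup_of = {"a": "11", "b": "11", "c": "11", "d": "11"}
real_pairs = [("a", "b"), ("c", "d")]

observed_s, p_value = monte_carlo_p_value(real_pairs, grades, subgroup_of, replicas=1000)
print(f"S = {observed_s:.4f}, p-value = {p_value:.3f}")
```

In this toy example only one of the three possible pairings matches students with identical grades, so the estimated p-value should be close to 1/3. Restricting the shuffle to each subgroup keeps the registration-order effect present in the random pairs too, which is the purpose of the restriction.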
In addition to the control with random pairs, we consider a second control that consists of forming pairs following the ranking of students' performance (average grades) inside each subgroup: thus, the first pair of this second control consists of the two students with the highest average grades, the second pair of the students with the third and fourth highest average grades, and so on.
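The ranking control can be sketched as follows. This is an illustrative reading of the description above rather than the study's own code; the function name `ranking_pairs`, the toy grades and the subgroup labels are hypothetical, and ties in the average grade are broken by input order, which is an assumption.

```python
from collections import defaultdict


def ranking_pairs(subgroup_of, grades):
    """Pair students by descending average grade inside each subgroup."""
    by_subgroup = defaultdict(list)
    for student, subgroup in subgroup_of.items():
        by_subgroup[subgroup].append(student)
    pairs = []
    for members in by_subgroup.values():
        ranked = sorted(
            members,
            key=lambda s: sum(grades[s]) / len(grades[s]),
            reverse=True,
        )
        # 1st with 2nd, 3rd with 4th, ...; a leftover student is dropped.
        pairs.extend(zip(ranked[0::2], ranked[1::2]))
    return pairs


# Toy data (invented for illustration): two laboratory subgroups.
grades = {
    "a": [10.0, 8.0],  # average 9
    "b": [7.0, 9.0],   # average 8
    "c": [4.0, 6.0],   # average 5
    "d": [2.0, 4.0],   # average 3
    "e": [8.0, 6.0],   # average 7
    "f": [5.0, 7.0],   # average 6
}
subgroup_of = {"c": "11", "a": "11", "e": "12", "d": "11", "b": "11", "f": "12"}

control_pairs = ranking_pairs(subgroup_of, grades)
print(control_pairs)
```

This control represents the most assortative pairing that the subgroup restriction allows, so it gives a reference value of S against which the real pairs can be compared.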

Results
We find that if the vector of grades contains all the grades obtained in the course, then S is significantly high (S=0.8112, p-value=0.02). However, if the vector of grades contains only the laboratory exam and the two theory tests, or just the two theory tests (that is, if we eliminate the programming project exam), we cannot reject the null hypothesis: S=0.8971 (p-value=0.38) and S=0.8289 (p-value=0.31), respectively. Figure 2 shows the distribution of S in real pairs versus pairs generated at random. This result suggests that the programming project exam (which assesses the team work by means of an individual test) is responsible for the high values of S in our first analysis. We cannot claim that our students choose partners following academic factors. What we see is, in fact, that teamwork has a balancing effect on the grades of the members of each pair when the programming project exam is included. Figure 2 illustrates this phenomenon: when the programming project exam is included, the probability density is shifted towards high values of S in real pairs, while this shift goes in the opposite direction in random pairs.
If we consider all the grades except the collaborative programming project, we find that the similarity parameter S of the real pairs is closer to that of the control pairs generated by grade ranking (S=0.8112 versus S=0.8323) than to that of the random pairs (S=0.7828).
On the other hand, if we eliminate the programming project exam from the grades considered, we arrive at a different result: the similarity parameter S of the real pairs is closer to that of the random pairs (S=0.8289 versus S=0.8224) than to that of the grade-ranking pairs (S=0.8617).
Thus, after eliminating the grade responsible for obtaining higher similarity values among pairs, results seem to indicate that students do not team up taking into consideration their previous academic background.
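The comparison with the two controls can be made explicit as a short worked calculation, reading "closer" as the absolute difference between S values (our reading; the subscripts real, rank and rand are labels introduced here for the real pairs, the grade-ranking control and the random control).

```latex
% Programming project exam included:
\[
|S_{\mathrm{real}} - S_{\mathrm{rank}}| = |0.8112 - 0.8323| = 0.0211
\;<\;
|S_{\mathrm{real}} - S_{\mathrm{rand}}| = |0.8112 - 0.7828| = 0.0284
\]
% Programming project exam excluded:
\[
|S_{\mathrm{real}} - S_{\mathrm{rand}}| = |0.8289 - 0.8224| = 0.0065
\;<\;
|S_{\mathrm{real}} - S_{\mathrm{rank}}| = |0.8289 - 0.8617| = 0.0328
\]
```

With the project exam included, the real pairs sit slightly nearer the ranking control; once it is removed, they are about five times nearer the random control than the ranking control.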

Conclusions
Our results suggest that students show neither assortative nor disassortative mixing guided by their academic background (Newman, 2010), contrary to what could be expected a priori (Fowler et al., 2011; McClure, 1990). However, we cannot exclude the possibility that our data do not allow us to see academic preferences that really exist between students. We see two possible reasons for this: first, pairs across different subgroups are prohibited and, second, students are in fact ordered into subgroups by similar academic performance (because registration order is given by previous academic performance). This registration order policy could in fact be preventing possible cognitive preferences or academic biases from emerging in the natural formation of groups.
Further research on pair programming is necessary. Advances in the knowledge of brain structure will surely make a new science of learning possible (Tokuhama-Espinosa, 2010), which will allow us to understand in depth the implications of the new cognitive paradigm of so-called neuroeducation (Nouri & Mehrmohammadi, 2012). Neuroscientists are just finding out what brain mechanisms underlie the process of education and how it operates, mainly guided by the social basis of our psychology (Meltzoff, Kuhl, Movellan & Sejnowski, 2009), which would explain the potential of collaborative learning and pair programming (Williams & Kessler, 2003).
Thus, team formation emerges as a central topic of research, especially now that collaborative learning is ubiquitous in our schools.
We believe that, given that pair programming seems to give great results, the efforts of future educational research in this field should focus primarily on how these pairs are formed, something that is particularly crucial in the educational context of pair programming (Williams & Kessler, 2003). Therefore, in future work, we should consider alternative parameters guiding team formation without teacher intervention, in order to understand and achieve optimal effectiveness of pairs or teams (Oakley, Felder, Brent & Elhajj, 2004) in pair programming (Hughes, 2015) in our classrooms.

Journal of Technology and Science Education, https://doi.org/10.3926/jotse.255

Figure 1. Mean final grades (0-10 point scale) of the students of Programming 2 (PRO2) in the spring semester (2015), classified by laboratory subgroup. Laboratory subgroups are coded with two digits, xy, where x is the (theory) group number and y the (lab) subgroup number (thus subgroup 24 means that the students belong to theory group number 2 and to the 4th lab subgroup).

Figure 2. Histograms of S, the average similarity between the grades of the members of the pairs. The latter is defined as the cosine similarity between the vectors of individual grades of each student of the pair. Students come from the spring semester (2015) edition of the course Programming 2 (PRO2). The comparison consists of: random pairs excluding the scores of the practical exam (upper left panel), real pairs excluding the scores of the practical exam (upper right panel), random pairs including these scores (bottom left panel) and real pairs including these scores (bottom right panel).