
Implementation of a Learning Assistant Program Improves Student Performance on Higher-Order Assessments



    Learning assistant (LA) programs have been implemented at a range of institutions, usually as part of a comprehensive curricular transformation accompanied by a pedagogical switch to active learning. While this shift in pedagogy has led to increased student learning gains, the positive effect of LAs has not yet been distinguished from that of active learning. To determine the effect that LAs would have beyond a student-centered instructional modality that integrated active learning, we introduced an LA program into a large-enrollment introductory molecular biology course that had already undergone a pedagogical transformation to a highly structured, flipped (HSF) format. We used questions from a concept test (CT) and exams to compare student performance in LA-supported HSF courses with student performance in courses without LAs. Students in the LA-supported course did perform better on exam questions common to both HSF course modalities but not on the CT. In particular, LA-supported students’ scores were higher on common exam questions requiring higher-order cognitive skills, which LAs were trained to foster. Additionally, underrepresented minority (URM) students particularly benefited from LA implementation. These findings suggest that LAs may provide additional learning benefits to students beyond the use of active learning, especially for URM students.


    In response to national calls to increase STEM (science, technology, engineering, and mathematics) student retention (President’s Council of Advisors on Science and Technology [PCAST], 2012), the Life Sciences Core Education department at the University of California, Los Angeles (UCLA), is transforming all of its large-enrollment introductory life science courses into highly structured, flipped (HSF) classes that integrate active learning and inclusive teaching practices (Knight and Wood, 2005; Handelsman et al., 2007).

    Active learning has been shown to increase student performance (Freeman et al., 2014) but can be challenging to implement fully in large-enrollment courses due to high student-to-instructor ratios. One cost-effective strategy to lower this ratio is to implement an undergraduate learning assistant (LA) program (Twigg, 2003; Otero et al., 2010, 2011; Goertzen et al., 2011). LAs differ from other peer instructors (such as undergraduate teaching assistants, peer tutors, or peer learning facilitators) in that they receive accompanying training in pedagogies that foster student collaboration and stimulate discussion by asking open-ended questions and eliciting student reasoning rather than providing explanations (Otero et al., 2006; Learning Assistant Alliance, 2016). This practice has been shown to increase students’ higher-order cognitive skills (HOCS), such as application, analysis, and evaluation, in contrast to lower-order cognitive skills (LOCS), such as remembering and understanding (Gokhale, 1995; Richmond and Hagan, 2011).

    LA programs were pioneered by the University of Colorado, Boulder (CU Boulder), and have since been implemented in more than 100 other institutions (Learning Assistant Alliance, 2016). In addition to a weekly pedagogy seminar, the CU Boulder model involves a weekly mentoring meeting with the instructor that focuses on course content (i.e., concepts that students should know or skills and intellectual operations that students should be able to perform) and teaching practice during lectures and discussion sections.

    The implementation of LA programs is often accompanied by course transformations to active, highly structured classrooms; compared with traditional lecture-based courses, this combination has led to significant learning gains as measured by concept inventories (Otero et al., 2006, 2010; Pollock and Finkelstein, 2008; Goertzen et al., 2011; Talbot et al., 2015). When an LA program is introduced at the same time as the pedagogical shift to active learning, however, the effects of the two interventions cannot be disentangled.

    The objective of this study was to determine the effect that implementation of an LA program in a large-enrollment science course would have in addition to the effect of a shift in teaching practice to an HSF classroom format alone.

    Efforts were already underway to change the instructional modality in our large-enrollment molecular biology course to an HSF format that integrates active learning. Thus, the LA program that we implemented in this course was a secondary intervention intended to further improve student outcomes, which enabled us to disentangle the effects of the two interventions and report the effects attributable to LA program implementation alone. We hypothesized that the addition of LAs might have an additive, or even synergistic, effect on student learning, increasing gains beyond those achieved with active learning alone. We investigated the effects on several measures: a concept test (CT) administered as a pre- and posttest to determine learning gains, and common exam questions requiring either LOCS or HOCS. Interestingly, our results showed that the implementation of an LA program did not increase student learning gains on the CT. Deeper investigation revealed that LAs had a very specific effect on student performance on questions demanding HOCS, consistent with the cognitive skills LAs were trained to foster. This effect was more pronounced for underrepresented minority (URM) students and thus helped to close the achievement gap between URM and non-URM students.


    Description of the Course and Student Population

    This study was conducted in an introductory molecular biology course during the Fall 2015, Winter 2016 and Spring 2016 academic terms (10-week quarters). Enrollment was 97 students in Fall, 139 in Winter, and 282 in Spring. The course is a lower-division class that can be taken at any point in the year, and enrolled students are most typically in their sophomore year or are first-year transfer students from community colleges.

    The course fulfills a major requirement for Life Science and Biochemistry BS degree programs. It is part of a four-course life science core curriculum with prerequisite courses in cell biology and physiology, as well as mathematics and chemistry.

    Because students can take the course at any point in the academic year, there is no “normal” or preferred term for taking this class. The curriculum structure is deliberately flexible: the course is offered in all academic quarters plus two Summer sessions, and students typically take it in whichever term fits their schedule, enrollment capacity permitting. Transfer students often take this course in their first academic term (Fall quarter), because most local community colleges do not offer an equivalent course. The three observed terms had different enrollment sizes due to the capacities of the rooms in which they were offered. The demographic distribution of students is shown in Table 1. Given the frequent enrollment of transfer students, who are more often first-generation (i.e., neither parent holds a 4-year college degree) and/or URM students, the respective percentages are higher in the Fall quarter (FQ15). Students’ academic year was coded as 1 for first-year students, 2 for second-year students, and so on; entering transfer students were coded as academic year 3. Scholastic Aptitude Test (SAT) scores were comparable between groups; however, transfer students do not typically report SAT scores as part of their application for admission to UCLA and are not represented in those averages. High school grade point average (HS GPA) was slightly lower for the Fall quarter than for the Winter (WQ16) and Spring (SpQ16) quarters.

    TABLE 1. Demographics of the courses included in this studya

                                  No LA        LA
                                  FQ 2015      WQ 2016      SpQ 2016
    Total (N)                     97           139          272
    Female %b                     62           69           65
    URM %                         33           25           19
    First generation %            40           26           33
    Transfer %                    49           19           8
    Pell recipient %              22           28           31
    Avg. academic year (SD)       3.0 (0.5)    2.6 (0.6)    2.2 (0.5)
    Avg. SAT math (SD)            666 (76)     664 (79)     670 (83)
    Avg. SAT verbal (SD)          634 (74)     626 (85)     640 (78)
    Avg. SAT composite (SD)       1952 (213)   1939 (230)   1972 (217)
    Avg. HS GPA (SD)              4.0 (0.4)    4.3 (0.4)    4.3 (0.3)

    aLA, learning assistant program implementation; FQ, WQ, SpQ: Fall, Winter, and Spring quarters, each being a 10-week term, year is indicated; URM, underrepresented minority student (American Indian, Native American, Black non-Hispanic, and Hispanic students); Pell recipient, received Pell Grant for one or more terms while enrolled at UCLA (proxy for low socioeconomic status); HS GPA: high school GPA.

    bMissing data not included in percentages (valid percent).

    Courses were co-taught by two faculty members, with the first author (N.S.) a constant in each term and the second instructor changing every term. The percentage of class meetings taught by N.S. ranged from ∼50% in the Fall and Spring terms to ∼90% in the Winter term. N.S. was a discipline-based education research postdoctoral fellow who helped transform the course into an HSF classroom by coteaching with other faculty. Each faculty member decided what percentage of the course he or she would like to teach in the classroom; hence, the time N.S. dedicated to teaching varied from term to term. However, all instructors used the teaching materials developed by N.S. The number of meetings per week and the number of midterms were determined after taking the other instructors’ preferences into account. All exams were written and edited by N.S., and, if desired, the coteaching instructor added some questions. Each offering of the course used the same instructional materials (such as textbook, online videos, simulations and quizzes, in-classroom activities, clicker questions, discussion section worksheets, and lecture slides), with minor changes such as correcting spelling or grammar mistakes in slides and worksheets. Class meetings were either 50 minutes, 3 days per week (SpQ16), or 75 minutes, 2 days per week (FQ15 and WQ16). Each course required students to attend discussion sections of 24 students each, meeting for 75 minutes once per week. Each offering had either one (FQ15) or two (WQ16 and SpQ16) midterm exams and one final exam. Each exam consisted exclusively of multiple-choice and true/false questions that were scored on Scantron forms (Scantron Corporation).

    All offerings were taught in an HSF format that incorporated active learning. The organization of the course included preclass video and reading assignments accompanied by preclass quizzes.

    Every week, students were asked to watch a number of videos, either created by N.S. (video lectures) using Camtasia software (TechSmith) or sourced from publicly available resources such as HHMI BioInteractive and YouTube. Students who viewed the videos and answered a five- to six-question multiple-choice quiz (due the morning before the first class meeting of the week) were awarded a small amount of course credit. The videos averaged 20–40 minutes in duration. Additionally, animations and simulations provided by the textbook publisher (Macmillan LaunchPad, W. H. Freeman) were assigned to be viewed for course credit. These activities were worth up to 12.75% of the total possible course points in sum. Weekly textbook readings accompanied by reading quizzes were also assigned; the reading quizzes were due before the first class meeting every week and were worth up to an additional 12.75% of the total possible course points in sum. In total, these homework assignments were worth 25% of the total course points.

    The course was also accompanied by an online discussion board (Piazza Technologies) on which students could ask content-related and class logistics–related questions. This discussion board was monitored closely by the instructor (N.S.), graduate student or professional non-student teaching assistants (TAs), and LAs. The average response time to student questions was approximately 1 hour. These efforts were aimed at helping students fully understand the material before class by providing expert assistance and feedback while students were doing the assigned homework. All students participated in discussion board activities, with varying engagement and with no apparent bias toward stronger or weaker students. Discussion board activity typically peaked in the week before midterm and final exams. No course credit was associated with asking or answering questions on the discussion board.

    In-class activities involved the use of clickers accompanied by peer discussion (think–pair–share) and worksheets. Per class meeting, there were 5–10 clicker questions. Scoring was based on both participation (1.5 points for answering 75% of questions) and correctness (0.5 points per correct answer), capped at 3 points per class meeting. In-class worksheets were not graded and were only used occasionally (about once every other week). Points awarded for clicker questions comprised 10.7% of the total possible course points.
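    The clicker scoring rule described above can be expressed as a small function (an illustrative sketch of the stated rule, not the course’s actual grading script):

```python
def clicker_points(n_questions: int, n_answered: int, n_correct: int) -> float:
    """Per-meeting clicker score: 1.5 participation points for answering
    at least 75% of the questions, plus 0.5 points per correct answer,
    capped at 3 points per class meeting."""
    participation = 1.5 if n_answered >= 0.75 * n_questions else 0.0
    return min(participation + 0.5 * n_correct, 3.0)

# A student who answers 6 of 8 questions, 4 of them correctly:
# 1.5 + 0.5 * 4 = 3.5, capped at the 3-point maximum.
print(clicker_points(8, 6, 4))  # → 3.0
```

    Note how the cap makes full credit reachable without answering every question correctly, which rewards participation over pure accuracy.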

    Weekly discussion sections (section size: 24 students each) were taught by one TA per section accompanied by two LAs if an LA program had been implemented in the course. Discussion section facilitators were provided with worksheets, case studies, and any other materials for their sections by the instructor to ensure that all students were experiencing similar instruction across discussion sections. TAs and LAs, if applicable, met weekly with the instructor(s) to review the discussion section activities and anticipate conceptual challenges or other barriers to learning that students might encounter during the lesson. Completion of discussion section worksheets or activities was awarded course credit regardless of correctness, totaling 12.75% of the total course points.

    The remaining course points were divided between one to two midterm exams and a final exam.

    Graduate students in doctoral programs in the biological sciences typically serve as TAs for one to two quarters as part of their degree requirements, with course assignments determined by their respective home departments. Professional non-student TAs held BS degrees and had successfully completed the molecular biology course before being hired as TAs. Typically, all TAs participate in a 10-week training seminar focused on preparing and delivering effective lecture presentations, handling classroom conflict, grading policies and practices, and relevant university policies and resources. Introduction of pedagogical strategies such as active learning and training in facilitating collaborative instruction varies by department, is inconsistent from year to year, and depends on the preference of the instructor for each departmental seminar. Training of graduate student and professional non-student TAs for the molecular biology course therefore varied depending on the departmental seminar in which a TA elected to enroll.

    Implementation of the LA Program following the CU Boulder Model

    To facilitate active learning in instructor-led class meetings and TA-led discussion sections, we introduced LAs to the course in the Winter quarter and continued this intervention during the following Spring quarter.

    LAs were trained in a weekly pedagogy seminar on how to facilitate discussions and collaborative learning and how to promote student reasoning by asking questions rather than giving answers. The seminar met for 50 minutes once per week, and, following the CU Boulder model, LAs completed weekly reading assignments and submitted reading and teaching reflections for the seminar.

    The seminar was structured as a highly active, discussion- and activity-based course with minimal lecture time and a focus on peer discussion and practice of techniques with peer and instructor feedback. Instructors used the seminar activities to model desired LA behavior in the classroom. A detailed description of the pedagogy seminar, LA tasks, and sample syllabus are provided in the Supplemental Material.

    LAs were specifically trained and instructed to circulate around the classroom during think–pair–share or group activities to engage in discussions with student groups. They were tasked with eliciting student reasoning during those conversations, a practice that had been shown to be highly effective (Knight et al., 2015). LAs were further trained and tasked to facilitate collaborative learning in the discussion sections by moving through the classroom, engaging with student groups in discussion, and eliciting student reasoning. Additional LA duties included holding weekly office hours, monitoring and answering questions on the online discussion board, and meeting with the course instructor and TAs once per week. LAs were also trained to provide informal mentoring on effective study strategies to struggling students. An overview of LA activities and tasks is provided in Figure 1 and the Supplemental Material.

    FIGURE 1.

    FIGURE 1. LA duties and tasks.

    Each lecture typically had one active instructor, supported by one TA for every 72 students and one LA for every 24 students (if implemented). Discussion sections were led by one TA and supported by two LAs.

    LAs were selected through a competitive online application process. Selection criteria were overall GPA, the grade previously earned in the course, and stated teaching experience and motivation.

    Data Sources and Collection: Demographic Characteristics, CT, and Exam Questions

    This study focused on the improvement of student performance using LAs in an already active HSF course.

    Student demographic characteristics were obtained from university records. Characteristics included ethnicity, first-generation college student status, Pell grant recipient status, transfer student status, HS GPA, SAT scores (if applicable), sex, and admission term. Pell grant recipient status serves as a proxy for low socioeconomic status (low SES), as this grant is awarded to students with financial need. Admission term was recoded into a variable for year in college at the time of taking the molecular biology course, with transfer students coded as being in their third year when entering. Ethnicity was recoded into a URM variable, with white and Asian students coded as non-URM, and American Indian, Native American, Black non-Hispanic, and Hispanic students coded as URM.

    An additional academic term variable, “term number,” was created, with the Fall quarter being coded as 1, Winter as 2, and Spring as 3. This variable is equivalent to the number of terms of instructor experience with the HSF format. Additionally, the academic term variable serves as a grouping variable for the term in which students completed the observed course.

    Learning gains were measured using a CT consisting of 25 multiple-choice items that had previously been published as parts of other concept inventory and diagnostic test instruments (Howitt et al., 2008; Smith et al., 2008; Klymkowsky et al., 2010; Shi et al., 2010; Taylor et al., 2013). The CT test items, topics, and respective sources are provided in Table 2.

    TABLE 2. Composition of CT for Introduction to Molecular Biology

    CT question no.a | Sourceb | Topic | Source question no.
    1 | BCI | Molecular basis for DNA as appropriate molecule for genetic information storage | 10
    2 | BCI | Molecular basis of binding specificity | 17
    3 | GCA | Genetic makeup of somatic cells | 1
    4 | GCA | Definition and consequence of DNA mutation | 4
    5 | GCA | Cloning and gene expression; protein function; interpretation of experimental results | 21
    6 | GCA | Effects of DNA mutations on mRNA | 11
    7 | IMCA | Characteristics of viruses | 3
    8 | IMCA | Molecular basis of protein structure | 10
    9 | IMCA | Mechanism of enzymatic catalysis using reaction diagrams | 11
    10 | IMCA | Mechanism of enzymatic catalysis | 12
    11 | IMCA | Structure of DNA and chromosomes during the cell cycle | 19
    12 | IMCA | DNA replication mechanism | 21
    13 | IMCA | Concept and mechanism of transcription | 22
    14 | IMCA | Cloning and gene expression; mechanism of translation | 23
    15 | IMCA | Mechanism of translation | 24
    17 | MLS | Inheritance of mutations and mistakes occurring in DNA replication, transcription, and translation | M6-4
    20 | MLS | Mechanism of DNA replication, transcription, and translation | M6-4
    22 | CI TT | Structure and function of RNA | 2
    23 | CI TT | Mechanism of translation | 5
    24 | CI TT | Effects of DNA mutations on proteins | 14
    25 | CI TT | Mechanism of translation; genetic code | 15

    aThe full CT is available upon request.

    bSources for CT questions: BCI, Biological Concepts Instrument: Klymkowsky et al. (2010); CI TT, Concept Inventory for Transcription and Translation: Taylor et al. (2013); GCA, Genetics Concept Assessment: Smith et al. (2008); IMCA, Introductory Molecular and Cell Biology Assessment: Shi et al. (2010); MLS, Molecular Life Science Concept Inventory: Howitt et al. (2008).

    The CT had been assembled in an effort to create a suitable assessment tool for determining the effect of transitioning from the traditional lecture-based course format to HSF pedagogy. This pedagogical transition and the associated internal assessment efforts began in 2012 and are ongoing. Because none of the validated existing concept inventories included all topics traditionally covered in the molecular biology course, a team of experienced instructors for this course selected applicable test items from published instruments and combined them into a single instrument that they agreed would cover many topics typically taught in this course regardless of instructor. No particular attention was paid to the cognitive level of selected test items at the time the CT was created, given the focus on covering desired course topics. The order of test items loosely follows the typical order of class topics in the prior lecture-based course format. The switch from lecture-based to HSF course format resulted in increased student performance on the CT items (unpublished observations), in line with the results of similar pedagogical changes at other institutions (Freeman et al., 2014).

    The CT was administered as a pretest in the first discussion section of the quarter and was included in the final exam as a posttest. Each final exam consisted of 100 total items, of which 25 were the CT.

    Final and midterm exams were created after the transition to the HSF course format, and questions were designed to align with course learning objectives rather than course topics. Midterms each consisted of 50–80 items. Apart from the CT items, 88 exam questions were identical across all three quarters, appearing on either midterm or final exams. This analysis focuses exclusively on the CT and these identical exam questions; nonidentical exam questions and other assignments are not part of this study. The Supplemental Material contains sample questions, and all exam questions are available upon request.

    UCLA’s Institutional Review Board gave approval to work with human subjects on all aspects of the assessment (IRB#13-001490).

    Data Analysis

    Questions can be categorized at different cognitive levels, ranging from knowledge and understanding to application and analysis, followed by synthesis and evaluation (Bloom, 1956; Anderson et al., 2001). The cognitive level required to answer each of the exam and CT items was assessed using the Blooming Biology Tool (Crowe et al., 2008). Two instructors (N.S. and Deb Pires) separately and independently scored each of the items used in this study, including those of the pre-existing CT. After the initial scoring, the instructors met and compared their ratings. When they did not initially agree on a Bloom’s level, they discussed their reasoning until a consensus was reached; the question subsequently was classified at the consensus level. Bloom’s levels of knowledge and understanding were designated as LOCS, whereas application, analysis, synthesis, and evaluation were categorized as HOCS, as described in the Blooming Biology Tool (Crowe et al., 2008). As questions at the application level could be classified as either HOCS or LOCS, particular attention was paid to the topic of the question and the intellectual task. If either the task or the topic/data were new to the students, the question was classified as HOCS. If both were familiar to students from the exercises in the course, the question was designated as LOCS. All analysis-level questions in this study were deemed HOCS according to these criteria. Sample exam questions at different Bloom’s levels are provided in the Supplemental Material.
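    The decision rules above can be summarized in a short function (a sketch of the stated criteria, not the authors’ actual coding instrument; level names follow the Blooming Biology Tool):

```python
def classify_cognitive_level(bloom_level: str,
                             task_is_new: bool = False,
                             topic_is_new: bool = False) -> str:
    """Map a consensus Bloom's level to LOCS or HOCS.

    Knowledge and understanding are LOCS; analysis, synthesis, and
    evaluation are HOCS. Application-level items are HOCS only if the
    intellectual task or the topic/data are new to the students.
    """
    level = bloom_level.lower()
    if level in ("knowledge", "understanding"):
        return "LOCS"
    if level == "application":
        return "HOCS" if (task_is_new or topic_is_new) else "LOCS"
    return "HOCS"  # analysis, synthesis, evaluation

print(classify_cognitive_level("Application", task_is_new=True))  # → HOCS
```

    In practice, the two raters supplied the familiarity judgments (task and topic novelty) during their consensus discussion; the function only encodes the mapping they applied.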

    For each student, we calculated several subscores for groups of identical test items, based on the cognitive level and on whether the item was part of the CT. For these calculations, each correctly answered item was scored as 1 point and each incorrect answer as 0. There were thus 113 points of overlapping exam questions: 25 CT items plus 88 non-CT items, of which 38 were LOCS questions and 50 were HOCS questions. The analysis of non-CT exam questions includes only students who had available scores for all 88 items, leading to a sample size of 94 students in the Fall course without LAs and 404 in the Winter or Spring courses with LAs (one LA per 24 students).

    The normalized learning gain for the CT questions can be calculated as (score on posttest – score on pretest)/(25 – score on pretest) and represented as a percentage of the total possible learning gain (Dirks et al., 2014; Vickrey et al., 2015). We instead used a modification of this calculation, the “normalized change,” which applies a slightly adjusted formula when the pretest score is higher than the posttest score (Marx and Cummings, 2006). Normalized change was calculated only for students who had both pre- and posttest scores available. Because enrollment is not final until week 3, a number of students did not have a pretest score to be analyzed, leaving a sample size of 76 students in the Fall course without LAs and 368 in the Winter or Spring courses with LAs.
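    The normalized-change rule can be sketched as follows (a minimal sketch of the Marx and Cummings formula assuming the 25-point CT maximum; students at the floor or ceiling on both tests are excluded, per the published rule):

```python
from typing import Optional

def normalized_change(pre: float, post: float, max_score: float = 25.0) -> Optional[float]:
    """Normalized change c (Marx and Cummings, 2006).

    Gains are normalized by the possible gain (max_score - pre);
    losses are normalized by the possible loss (pre). Students who
    score 0 or the maximum on both tests are excluded (None).
    """
    if pre == post:
        return None if pre in (0.0, max_score) else 0.0
    if post > pre:
        return (post - pre) / (max_score - pre)  # fraction of possible gain
    return (post - pre) / pre                    # fraction of possible loss

# Example: pretest 13/25, posttest 19/25
print(normalized_change(13, 19))  # → 0.5, i.e., 50% of the possible gain
```

    Unlike the plain normalized gain, this statistic stays bounded and interpretable for students whose posttest score is below their pretest score.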

    Statistical analysis was performed using the software SPSS (IBM). The t tests were performed as paired or unpaired, two-tailed tests, comparing the means of subscores or normalized change for students in courses with or without LA program implementation (Cohen et al., 2013). To measure normalized changes, we compared the mean scores for the three terms using repeated measures analysis of variance (ANOVA; Tabachnick and Fidell, 2013). Regressions were performed with general linear models (GLM) using CT or exam test scores as dependent variables, with student demographics and course structure as independent variables. Estimated marginal means were generated from the GLM model to determine predicted HOCS exam score percentages. Data were plotted using the data visualization software Tableau (Tableau Software).
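    The authors ran these tests in SPSS; for illustration, an unpaired two-tailed t test can be sketched in pure Python (Welch’s form, with a normal approximation to the p value that is reasonable at the large sample sizes used here; the input lists are hypothetical):

```python
import math
from statistics import NormalDist, mean, stdev

def welch_t(a, b):
    """Unpaired t statistic (Welch) and Welch–Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = stdev(a) ** 2, stdev(b) ** 2
    se2 = va / na + vb / nb
    t = (mean(a) - mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

def two_tailed_p(t):
    """Two-tailed p value via the normal approximation (valid for large df)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))

# Hypothetical exam subscore percentages for two groups (illustration only):
t, df = welch_t([72.0, 68.5, 75.0, 70.0], [78.0, 74.5, 80.0, 76.0])
p = two_tailed_p(t)
```

    With real class-sized samples, the normal approximation is close to the exact t distribution; a library such as SciPy would provide the exact version.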


    This study is the first of its kind assessing the impact of LA implementation on student outcomes separately from that of active learning.

    LAs Are Not Associated with Improved Learning Gains on a CT in Comparison with Active Learning Alone

    A common measure of the effectiveness of instructional interventions is the use of concept inventories. It has been demonstrated that LAs increased student learning gains on concept inventories in physics courses (Otero, 2005; Otero et al., 2010; Goertzen et al., 2011) and in an introductory biology course (Talbot et al., 2015).

    In our study, we used an existing internal CT instrument composed of items from published concept inventories to compare the normalized change of 76 students in the Fall-term HSF class without LAs with that of a combined total of 368 students enrolled in the Winter- and Spring-term HSF classes with LAs. Across all terms, students scored significantly higher on their posttests than on their pretests (Figure 2A; mean pre: 51.4%, SD: 14.0%; mean post: 73.4%, SD: 12.2%; p < 0.001). While the students with LAs had a slightly higher normalized change on the CT questions (mean without LA: 42.9%, SD: 25.0%; mean with LA: 44.1%, SD: 21.9%), this difference was not statistically significant (p = 0.153). Figure 2B shows the distribution of the normalized-change data for the CT questions. Contrary to our hypothesis, we observed no significant difference between classes with and without LAs across the three terms included in the analysis (p = 0.502, F = 0.690).

    FIGURE 2.

    FIGURE 2. LAs do not lead to differences in normalized change on the CT questions. (A) The posttest CT scores are significantly higher than the pretest CT scores by t test (paired). (B) Distributions of normalized change are not significantly different by t test (unpaired). NO, no LA program implemented; YES, LA program implemented. Boxes represent the 25th and 75th percentiles of data points; whiskers extend to data within 1.5 times the interquartile range; horizontal lines within boxes represent the median, and accompanying numbers represent the mean. N.S., not significant; ***, significant at the p < 0.001 level.

    LAs Are Associated with Increased Performance on Non-CT Exam Questions

    In contrast to the comparison of CT normalized change alone, students with access to LAs had a significantly higher average total score on identical exam questions (including the CT items) than students in the HSF course without LAs (mean without LA: 72.4%, SD: 10.0%; mean with LA: 77.6%, SD: 8.7%; p = 0.037).

    Given the nonsignificant difference in normalized change on the CT, we decided to disaggregate the CT questions from the remainder of the exam questions for further analysis, hypothesizing that the CT questions could be a mediating factor masking the effect of LA implementation on student achievement.

    After removing the CT questions and comparing scores on the remaining common exam questions, we found significantly higher exam scores for students in HSF courses with LAs than for students without LAs, at a higher level of statistical significance than when the CT questions were included (Figure 3). On average, students in the HSF course without LAs scored 72.6% (SD: 10.4%), and those with LAs scored 77.6% (SD: 8.6%; p = 0.006).

    FIGURE 3.

    FIGURE 3. LAs lead to higher scores on identical exam questions, excluding CT question items. Distributions are significantly different by t test (p = 0.006). NO, no LA program implemented; YES, LA program implemented. Boxes represent the 25th and 75th percentiles of data points; whiskers extend to data within 1.5 times the interquartile range; horizontal lines within boxes represent the median, and accompanying numbers represent the mean. **, significant at the p < 0.01 level.

    Regression analysis with CT normalized change as the dependent variable confirmed the null CT result: none of the variables included in the GLM, including LA implementation, contributed significantly. Because transfer students do not submit SAT scores as part of their application materials and would thus be excluded from a model using these variables due to missing data, we performed the GLM analysis both with and without the SAT variables, which produced similar results (Supplemental Tables 1 and 2). The similarity of the findings is not surprising, as HS GPA and SAT math scores are correlated (Pearson correlation 0.282, p < 0.001) and all students, including transfer students, submit HS GPAs; omitting SAT scores from the regression analyses therefore does not drastically affect the observed results.

    We conclude that LAs enhance active learning in HSF classrooms, which results in higher scores on common exam questions, consistent with our hypothesis that the addition of LAs might have an additive, or even synergistic, effect on student learning beyond that achieved with active learning alone.

    The Increase in Performance on Exam Questions Comes from Better Performance on HOCS but Not LOCS Questions

    Because LAs were not associated with significant learning gains on the CT questions, we hypothesized that the CT questions were not as well aligned with the course learning objectives as the non-CT exam questions. Backward design had been employed in building the HSF course (Handelsman et al., 2007), with many of the learning objectives articulated at a cognitive level emphasizing application and analysis tasks. When the CT items were compared with the rest of the exam questions, the misalignment between the intellectual operations required by the CT questions and the course learning outcomes became readily apparent. This lack of alignment may help to explain why we did not find a difference between LA-supported students and their peers in the course without LAs. Additionally, we noted that the CT questions were primarily categorized as LOCS, while the remainder of the exam questions were largely coded as HOCS.

    Our LA training specifically targets the development of HOCS. Active and collaborative learning have been shown to increase student performance specifically on HOCS (Gokhale, 1995; Richmond and Hagan, 2011). We thus hypothesized that the effect of LA implementation might be specific to HOCS questions.

    To address this question, we disaggregated the identical non-CT exam questions into HOCS and LOCS questions. Notably, we found a statistically significant difference for HOCS questions but not for LOCS questions. On the 50 HOCS questions, students scored on average 71.6% (SD: 12.0%) without LAs and 76.8% (SD: 10.2%) with LAs (p = 0.016). On the 38 LOCS questions, mean performance was 73.9% (SD: 10.5%) without LAs and 78.6% (SD: 8.9%) with LAs, a difference that was not statistically significant (p = 0.114; Figure 4).

    FIGURE 4. Blooming the exam questions revealed that students with LAs perform better on HOCS questions than students without LAs, while there is no significant difference in performance on questions requiring LOCS. NO, no LA program implemented; YES, LA program implemented. Boxes represent the 25th and 75th percentiles of data points; whiskers extend to data within 1.5 times the interquartile range; horizontal lines within boxes represent the median, and accompanying numbers represent the mean. N.S., not significant; *, significant at the p < 0.05 level.

    GLM regression analysis takes student precollege preparation, year in college, and course term into account. When SAT scores are omitted from the regression, allowing more cases to enter the model, the variables contributing significantly to the dependent variable (HOCS score) are HS GPA, year in college, Pell grant recipient status, transfer student status, sex, URM status, and LA implementation. When SAT scores are included, which excludes transfer students from the analysis, the significant predictors are SAT verbal and math scores, student sex, and LA implementation (Supplemental Tables 1 and 2). Notably, course term was not a significant predictor in either model, indicating that the term in which students took the course was not associated with improved HOCS scores. In both models, LA implementation is a significant predictor of the dependent variable, confirming the effect of LA implementation on HOCS exam scores while taking student demographic characteristics and precollege preparation into account.

    Our findings therefore suggest that the addition of LAs to an HSF classroom where active learning is already being implemented specifically increased student performance on HOCS exam questions, which is in line with our hypothesis.

    LA Implementation Contributes to Closing the Gap between URM and Non-URM Students on HOCS Exam Questions

    To determine whether LA implementation benefited certain student populations more than others, we disaggregated the data on HOCS exam scores by the student characteristics of sex, transfer status, URM status, Pell grant recipient status, and first-generation college student status (Supplemental Figure 1).

    Disaggregation of the data by these characteristics revealed that URM students showed a larger increase in HOCS scores between non–LA-supported and LA-supported courses than non-URM students did. URM students scored on average 64.6% (SD: 9.9%) without LAs and 73.2% (SD: 9.9%) with LAs. Non-URM students scored 74.8% (SD: 11.8%) without LAs and 77.5% (SD: 9.9%) with LAs.
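The interaction pattern is visible directly in these raw, covariate-unadjusted cell means as a difference-in-differences, sketched here with the percentages reported above:

```python
# Raw mean HOCS scores reported above (percent).
urm_no_la, urm_la = 64.6, 73.2
non_no_la, non_la = 74.8, 77.5

gain_urm = urm_la - urm_no_la  # LA-associated gain for URM students (8.6 points)
gain_non = non_la - non_no_la  # LA-associated gain for non-URM students (2.7 points)
did = gain_urm - gain_non      # unadjusted difference-in-differences (5.9 points)
```

The GLM in Table 3 estimates the corresponding interaction term while adjusting for precollege preparation and demographics, so its coefficient need not match this raw contrast.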

    Disaggregating by sex, transfer status, Pell grant recipient status, and first-generation college student status produced similar results, with increases in HOCS scores upon LA implementation for all student populations. However, none of these differences was as large as that between URM and non-URM students.

    To determine whether there are interaction effects between LA implementation and the student characteristic variables, we performed GLM regressions including the respective interaction terms. The interaction term between URM status and LA implementation was a significant predictor of HOCS scores in the GLM model (Table 3), while none of the other interaction terms had significant predictive power in their respective models (unpublished data). All GLM analyses were also performed including SAT scores, with similar results (unpublished data).

    TABLE 3. Impact of student and course characteristics on HOCS exam scores (N = 466)

    Variable                      B^a      SE      P^b     Partial eta squared
    High school GPA                3.89    1.05    0.00    0.03
    Year in college               −1.59    0.49    0.00    0.02
    Term number                   −0.35    0.56    0.53    0.00
    Pell recipient^c              −1.29    0.62    0.04    0.01
    Transfer student^c             3.50    1.07    0.00    0.02
    First-generation student^c    −0.55    0.61    0.37    0.00
    Student sex = female^c        −1.83    0.49    0.00    0.03
    URM student^c                 −0.72    0.67    0.00    0.03
    LA implementation^c            4.85    1.38    0.01    0.02
    URM student × LA impl.^c,d    −3.59    1.27    0.00    0.02
    Corrected model                                0.00    0.20

    aUnstandardized regression coefficient.

    bBold type indicates significant p values.

    cVariables are coded 0 = no, 1 = yes; the reference value is 0 = no.

    dInteraction term between variables underrepresented minority (URM) student and learning assistant (LA) program implementation.

    LA, learning assistant program implementation; FQ, WQ, SpQ, Fall, Winter, and Spring quarters, each being a 10-week term, year is indicated; URM, underrepresented minority student (American Indian, Native American, Black non-Hispanic, and Hispanic students); Pell recipient, received Pell Grant for one or more terms while enrolled at UCLA (proxy for low socioeconomic status); HS GPA, high school GPA.

    This demonstrates that the positive effect of LA implementation is greater for URM students than for non-URM students and contributes to closing the gap in performance on HOCS exam questions. Using the model to determine estimated marginal means (i.e., predicted HOCS exam percentages) for URM and non-URM students with and without LAs, while holding other variables constant (HS GPA = 4.27, year in college = 2.45, academic term = 2.35), yields estimated HOCS percentages for URM students of 67.8% without LAs and 77.4% with LAs, while predicted scores for non-URM students are 76.4% and 78.8%, respectively. This further confirms that LA implementation helps to close the gap between URM and non-URM students. We conclude that URM students particularly benefited from LA implementation.
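The gap closing follows directly from the estimated marginal means reported above:

```python
# Estimated marginal means (percent) from the fitted model, as reported above.
emm = {("URM", "no_LA"): 67.8, ("URM", "LA"): 77.4,
       ("non-URM", "no_LA"): 76.4, ("non-URM", "LA"): 78.8}

gap_no_la = emm[("non-URM", "no_LA")] - emm[("URM", "no_LA")]  # 8.6 points
gap_la = emm[("non-URM", "LA")] - emm[("URM", "LA")]           # 1.4 points
```

Under LA implementation, the predicted URM/non-URM gap shrinks from 8.6 to 1.4 percentage points.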


    This is the first study to evaluate the effect of an LA program on an HSF course in which the shift to active learning had taken place before integration of LAs. Several other studies have reported improved student performance on concept inventories in physics and biology courses (Otero et al., 2006, 2010; Pollock and Finkelstein, 2008; Goertzen et al., 2011; Talbot et al., 2015). However, in all of these studies, the addition of LAs occurred at the same time as the pedagogical shift to active learning, making it impossible to distinguish whether the increased learning gains were due to the new pedagogy or to the LA intervention. Active learning alone has been shown to improve student performance dramatically (Freeman et al., 2011, 2014), and it has been argued that the positive effect of the flipped-classroom modality on student learning stems from its increased use of active learning (Jensen et al., 2015). Hence, it was possible that we might have encountered a “ceiling effect,” in which adding LAs to an HSF course would not improve learning gains beyond the effect of active learning alone. Alternatively, as we originally hypothesized, the addition of LAs might have an additive, or even synergistic, effect on student learning, increasing gains beyond those achieved with active learning alone.

    Consistent with our hypothesis, findings from our study of HSF courses suggest that LAs are associated with improved student performance beyond the use of active learning alone. Although the effect of LAs is small and additive, it is statistically significant. We had previously observed a dramatic improvement in student learning gains when the course was changed from a traditional lecture-based format to an HSF class integrating active learning (unpublished data). This improvement, measured as the difference between CT pre- and posttest scores, was not further increased by adding LAs.

    Factors that may have diminished a measurable effect include differences in instructor effectiveness or experience. Determining the extent to which this factor masks the positive impact of LAs on student learning, if at all, would require a controlled experiment comparing student groups taught by the same instructor in the same academic term. As a proxy for instructor experience and term effects, the GLM models included a variable for academic term, which had no statistically significant predictive power. Given this lack of evidence, we conclude that effects of instructor experience, or of the term in which students took the course, are negligible in this study.

    The close alignment between the effect on HOCS scores and both the pedagogical training LAs receive and the teaching practices they implement in the classroom further argues that instructor experience and academic term were likely not the main contributors to improved HOCS scores.

    The positive impact of LAs on HOCS is in line with previous studies reporting improvement of HOCS through active and collaborative learning (Gokhale, 1995; Richmond and Hagan, 2011). LAs are specifically trained to elicit student reasoning and tend to focus on doing so, often teaching through the Socratic method (Prince and Felder, 2006; Gray et al., 2008). Asking students for their reasoning increases the chance that they respond with reasoning, promoting the development of their HOCS (Knight et al., 2015).

    Importantly, the CT items used in this study were drawn from published concept inventories and validated diagnostic tests and mostly required LOCS. The selected CT questions were not well aligned with the learning objectives established for the course or with the activities designed to help students achieve those objectives, because the learning objectives were established after the CT was created, while the course was being transformed from lecture to HSF format. It is therefore not entirely surprising that we see no improvement in student performance on the included CT questions but do see higher scores on the identical exam questions, which were written after the learning objectives had been established and are aligned with them. Interestingly, this effect appears to be mostly the result of higher scores on HOCS rather than LOCS questions. This is an encouraging sign that the LA training, which emphasizes requesting reasoning and facilitating collaborative learning, is having the desired effect of fostering HOCS. Similar to our findings, other studies have found that active learning has a positive effect on performance on HOCS but not on LOCS (Richmond and Hagan, 2011) and that collaborative learning enhances critical thinking (defined as Bloom’s level of analysis and higher, i.e., HOCS by our definition) without students performing significantly better on LOCS (Gokhale, 1995).

    The TAs, who led the discussion sections, were not consistently trained to facilitate active and collaborative learning, which may have decreased the magnitude of the effect of LA program implementation. TAs did participate in the weekly mentoring meetings with the instructor but had limited pedagogical training compared with the LAs. Because multiple TAs led the weekly LA-supported discussion sections, their ability to implement active and collaborative learning likely varied across sections and thus would not have been optimal in every section, reducing the overall effect LAs had on student performance in the course. Given that students mostly interact with LAs during discussion sections and lecture (Talbot et al., 2015; White et al., 2016), this variation may have had a significant effect on the outcomes reported here. We suggest that incorporating the facilitation of active and collaborative learning into TA training could further improve student performance on HOCS assessments in active classrooms, as previously suggested by others (Pentecost et al., 2012). Without this type of training, students tend to rate the effectiveness of their LAs higher than that of their TAs (Twigg, 2003). Future work should address the effect of TA training on student performance in conjunction with the implementation of active learning and LA programs.

    In the example presented here, the LA program was used to support a broader curricular reform effort and to increase the instructor-to-student ratio so that active learning could be implemented more effectively in large-enrollment courses. LA programs have been widely used to facilitate curricular transformation, with assessment results demonstrating increased student satisfaction in LA-supported courses and decreased faculty concern about adopting new pedagogical strategies (Groccia and Miller, 1996; Pollock and Finkelstein, 2008; Goertzen et al., 2011; Thompson and Garik, 2015).

    LAs have also been used as peer tutors to improve performance, perception, and retention of struggling students in a biology course (Batz et al., 2015). These struggling students are often members of URM groups or students from low-SES backgrounds (Pell grant recipients and first-generation college students in our study). Although our data did not support the hypothesis that low-SES students benefit more from LA implementation, we demonstrated here that LAs help to close the achievement gap between minority and nonminority students, an issue of national importance (American Association for the Advancement of Science, 2010; PCAST, 2012). This could guide other institutions seeking to decrease achievement gaps between URM and non-URM students as they implement active learning and will hopefully encourage them to consider adding LA programs to their HSF classrooms.

    The effect of LAs on URM students in particular may have been influenced by affective factors promoted by LAs, such as an improved sense of belonging and scientist identity (Seymour, 2000; Beasley and Fischer, 2012; Eddy and Hogan, 2014). Further research should be conducted to better understand the origin of the positive effect of LAs on URM students. We had hypothesized that these same factors might also contribute in a similar way to low-SES student achievement; however, our data did not support this. Further efforts should be made to better understand the differences between these student populations and how these differences may contribute to differential effects of LA implementation.

    The results of this study are encouraging, as they demonstrate that the implementation of an LA program enhances the effectiveness of active learning, promotes the success of all students, and importantly, decreases the achievement gap between URM students and their non-URM peers.


    We thank Deb Pires for help with scoring Bloom’s levels of exam questions. This research study was supported, in part, by a grant to the University of California, Los Angeles, from the National Science Foundation’s Improving Undergraduate STEM Education program (DUE award no. 1432804). Institutional support for the courses and instructors was provided by the Division of Life Sciences in the College of Letters and Science.


    • American Association for the Advancement of Science (2010). Retrieved April 14, 2016.
    • Anderson L. W., Krathwohl D. R., Bloom B. S. (2001). A taxonomy for learning, teaching, and assessing: A revision of Bloom’s taxonomy of educational objectives. Harlow, Essex, UK: Longman.
    • Batz Z., Olsen B. J., Dumont J., Dastoor F., Smith M. K. (2015). Helping struggling students in introductory biology: A peer-tutoring approach that improves performance, perception, and retention. CBE—Life Sciences Education 14(2), ar16.
    • Beasley M. A., Fischer M. J. (2012). Why they leave: The impact of stereotype threat on the attrition of women and minorities from science, math and engineering majors. Social Psychology of Education 15(4), 427-448.
    • Bloom B. S. (1956). Taxonomy of educational objectives: The classification of educational goals (1st ed.). Harlow, Essex, UK: Longman.
    • Cohen L., Manion L., Morrison K. (2013). Research methods in education. Abingdon, UK: Routledge.
    • Crowe A., Dirks C., Wenderoth M. P. (2008). Biology in Bloom: Implementing Bloom’s taxonomy to enhance student learning in biology. CBE—Life Sciences Education 7(4), 368-381.
    • Dirks C., Wenderoth M. P., Withers M. (2014). Assessment in the college science classroom. New York: Freeman.
    • Eddy S. L., Hogan K. A. (2014). Getting under the hood: How and for whom does increasing course structure work. CBE—Life Sciences Education 13(3), 453-468.
    • Freeman S., Eddy S. L., McDonough M., Smith M. K., Okoroafor N., Jordt H., Wenderoth M. P. (2014). Active learning increases student performance in science, engineering, and mathematics. Proceedings of the National Academy of Sciences USA 111(23), 8410-8415.
    • Freeman S., Haak D., Wenderoth M. P. (2011). Increased course structure improves performance in introductory biology. CBE—Life Sciences Education 10(2), 175-186.
    • Goertzen R. M., Brewe E., Kramer L. H., Wells L., Jones D. (2011). Moving toward change: Institutionalizing reform through implementation of the learning assistant model and open source tutorials. Physical Review Special Topics—Physics Education Research 7(2), 020105.
    • Gokhale A. A. (1995). Collaborative learning enhances critical thinking. Journal of Technology Education 7(1), 22-30.
    • Gray K. E., Otero V. K., Henderson C., Sabella M., Hsu L. (2008). Analysis of learning assistants’ views of teaching and learning. AIP Conference Proceedings 1179, 123-126.
    • Groccia J. E., Miller J. E. (1996). Collegiality in the classroom: The use of peer learning assistants in cooperative learning in introductory biology. Innovative Higher Education 21(2), 87-100.
    • Handelsman J., Miller S., Pfund C. (2007). Scientific teaching. London: Macmillan.
    • Howitt S., Anderson T., Costa M., Hamilton S., Wright T. (2008). A concept inventory for molecular life sciences: How will it help your teaching practice? Australian Biochemist 39(3), 14-17.
    • Jensen J. L., Kummer T. A., Godoy P. D. D. M. (2015). Improvements from a flipped classroom may simply be the fruits of active learning. CBE—Life Sciences Education 14(1), ar5.
    • Klymkowsky M. W., Underwood S. M., Garvin-Doxas R. K. (2010). Biological Concepts Instrument (BCI): A diagnostic tool for revealing student thinking. arXiv:1012.4501 [q-bio].
    • Knight J. K., Wise S. B., Rentsch J., Furtak E. M. (2015). Cues matter: Learning assistants influence introductory biology student interactions during clicker-question discussions. CBE—Life Sciences Education 14(4), ar41.
    • Knight J. K., Wood W. B. (2005). Teaching more by lecturing less. Cell Biology Education 4(4), 298-310.
    • Learning Assistant Alliance (2016). Retrieved September 5, 2016.
    • Marx J. D., Cummings K. (2006). Normalized change. American Journal of Physics 75(1), 87-91.
    • Otero V. (2005, Summer). The learning assistant model for teacher education in science and technology. Forum on Education of the American Physical Society Newsletter. Retrieved September 20, 2016.
    • Otero V., Pollock S., Finkelstein N. (2010). A physics department’s role in preparing physics teachers: The Colorado Learning Assistant model. American Journal of Physics 78(11), 1218-1224.
    • Otero V., Pollock S., McCray R., Finkelstein N. (2006). Who is responsible for preparing science teachers? Science 313(5786), 445-446.
    • Otero V., Ross M., Samson S. (2011, Fall). A synergistic model of educational change. Forum on Education of the American Physical Society Newsletter. Retrieved September 7, 2016.
    • Pentecost T. C., Langdon L. S., Asirvatham M., Robus H., Parson R. (2012). Graduate teaching assistant training that fosters student-centered instruction and professional development. Journal of College Science Teaching 41(6), 68-75.
    • Pollock S. J., Finkelstein N. D. (2008). Sustaining educational reforms in introductory physics. Physical Review Special Topics—Physics Education Research 4(1), 010110.
    • President’s Council of Advisors on Science and Technology (2012). Engage to excel: Producing one million additional college graduates with degrees in science, technology, engineering, and mathematics. Washington, DC. Retrieved September 7, 2016.
    • Prince M. J., Felder R. M. (2006). Inductive teaching and learning methods: Definitions, comparisons, and research bases. Journal of Engineering Education 95(2), 123-138.
    • Richmond A. S., Hagan L. K. (2011). Promoting higher level thinking in psychology: Is active learning the answer. Teaching of Psychology 38(2), 102-105.
    • Seymour E. (2000). Talking about leaving: Why undergraduates leave the sciences. Boulder, CO: Westview.
    • Shi J., Wood W. B., Martin J. M., Guild N. A., Vicens Q., Knight J. K. (2010). A diagnostic assessment for introductory molecular and cell biology. CBE—Life Sciences Education 9(4), 453-461.
    • Smith M. K., Wood W. B., Knight J. K. (2008). The Genetics Concept Assessment: A new concept inventory for gauging student understanding of genetics. CBE—Life Sciences Education 7(4), 422-430.
    • Tabachnick B. G., Fidell L. S. (2013). Using multivariate statistics. London: Pearson.
    • Talbot R. M., Hartley L. M., Marzetta K., Wee B. S. (2015). Transforming undergraduate science education with learning assistants: Student satisfaction in large-enrollment courses. Journal of College Science Teaching 44(5), 24-30.
    • Taylor J., Oh-McGinnis R., Chowrira S., Smith K. (2013, November 24). Transcription and Translation Concept Inventory. Retrieved November 13, 2017.
    • Thompson M. M., Garik P. (2015). The effect of learning assistants on student learning outcomes and satisfaction in large science and engineering courses. Presented at the Annual International Conference of the National Association of Research in Science Teaching (Chicago, IL). Retrieved September 5, 2016.
    • Twigg C. A. (2003). Improving quality and reducing cost: Designs for effective learning. Change: The Magazine of Higher Learning 35(4), 22-29.
    • Vickrey T., Rosploch K., Rahmanian R., Pilarz M., Stains M. (2015). Research-based implementation of peer instruction: A literature review. CBE—Life Sciences Education 14(1), es3.
    • White J.-S. S., Van Dusen B., Roualdes E. A. (2016). The impacts of learning assistants on student learning of physics. arXiv:1607.07469 [physics].