Hard Facts & Half-Truths: Teacher Observation

1 Sept

One of the pillars of belief in the independent and international school world, shared by most governing boards that rely on school management for an accurate indication of teaching quality, is that the head of school is a skillful, highly qualified observer of teachers by virtue of years of experience in education. The belief extends beyond the head of school to division heads, directors of curriculum and instruction, and anyone else who observes teachers in an evaluative capacity. The operative assumption is: “They’ll know a good lesson when they see it.”

Where education has woefully underserved teachers, though, is in failing to consider what the reliable evidence (research) shows about teacher observation: “when untrained observers are asked to judge the quality of a lesson, there is likely to be only modest agreement among them. Worse still, even if they do agree that what they see is good practice, it often actually isn’t.” (Prof. Robert Coe, “Classroom observation: it’s harder than you think,” 2014). As Coe then asks, how can something that feels so right actually be so wrong?

We need to consider reliability (the extent to which the judgments made independently by two observers who see the same lesson would agree), and validity (if a teacher’s lesson is rated highly in terms of its effectiveness, does it really mean that the teacher in question is an effective teacher?).

On reliability, the Gates Foundation funded a deeply insightful study, the Measures of Effective Teaching (MET) Project. The study relied on five different observation protocols and involved 3,000 teacher volunteers, 20,000 videotaped lessons, student surveys, and student performance on state tests and supplemental tests of higher-order thinking skills over the course of two full school years. Teachers fell into two groups: (1) grades 4 to 8, teaching mathematics and English (English Language Arts); and (2) grades 9 and 10, teaching English, Algebra 1, and biology. Overall, the study found that reliable indicators exist only in a limited way, and that combining them into a composite ‘score’ requires trade-offs over which indicators to weight more heavily. For teacher observation specifically, though, the insights are powerful. Even among observers who had substantial training in clinical observation of teaching, and who had passed a required test in observation, ratings varied greatly. As Coe summarizes it, using the English quality assurance inspectorate Ofsted as an example, “One way to understand these values is to estimate the percentage of judgements that would agree if two raters watch the same lesson. Using Ofsted’s categories, if a lesson is judged ‘Outstanding’ by one observer, the probability that a second observer would give a different judgement is between 51% and 78%.”
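For schools that want to check their own inter-observer reliability, the idea of "the percentage of judgements that would agree" can be sketched in a few lines of Python. This is only an illustration, not the method used by MET or by Coe: the ratings below are invented, the function names are hypothetical, and Cohen's kappa is brought in here as one common way to correct raw agreement for the agreement two observers would reach by chance.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Share of lessons on which two observers gave the same rating."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both observers must rate the same non-empty set of lessons.")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Agreement corrected for chance: 1.0 is perfect, 0.0 is chance level."""
    n = len(ratings_a)
    observed = percent_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(ratings_a) | set(ratings_b)
    # Chance agreement: probability both observers pick the same category
    # if each simply followed their own overall rating frequencies.
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both observers used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Invented ratings of the same eight lessons by two observers (illustrative only).
observer_1 = ["Outstanding", "Good", "Good", "Requires improvement",
              "Good", "Outstanding", "Inadequate", "Good"]
observer_2 = ["Good", "Good", "Requires improvement", "Requires improvement",
              "Good", "Outstanding", "Requires improvement", "Outstanding"]

agreement = percent_agreement(observer_1, observer_2)
kappa = cohens_kappa(observer_1, observer_2)
print(f"Raw agreement: {agreement:.0%}, Cohen's kappa: {kappa:.2f}")
```

In this made-up example the two observers agree on half of the lessons, but once chance agreement is removed the kappa value is considerably lower, which is why raw agreement alone can flatter an observation process.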

Even more worrisome is validity. In “Do We Know a Successful Teacher When We See One? Experiments in the Identification of Effective Teachers” (Michael Strong et al., Journal of Teacher Education, 2011), the authors, as Coe describes, “used value-added scores to identify ‘effective’ and ‘ineffective’ teachers, showed videos of them teaching to observers and asked them to say which teachers were in the group. In both the experiments where the observers were not trained in observation, the proportion correctly identified by experienced teachers and head teachers [heads of school] was below the 50% that would be expected by pure chance. At this level of accuracy, fewer than 1% of those judged to be ‘Inadequate’ are genuinely inadequate; of those rated ‘Outstanding,’ only 4% actually produce outstanding learning gains; overall, 63% of judgements will be wrong.”
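The validity check described here, comparing observers' guesses with a classification based on value-added scores, can also be sketched briefly. Again, this is a minimal illustration under stated assumptions: the labels and guesses are invented, the function name is hypothetical, and the 50% benchmark assumes two equally sized groups, as in the passage quoted above.

```python
CHANCE_LEVEL = 0.5  # expected accuracy from guessing between two equal groups


def identification_accuracy(value_added_labels, observer_guesses):
    """Proportion of teachers whose group the observer identified correctly."""
    if len(value_added_labels) != len(observer_guesses) or not value_added_labels:
        raise ValueError("Each teacher needs exactly one label and one guess.")
    correct = sum(label == guess
                  for label, guess in zip(value_added_labels, observer_guesses))
    return correct / len(value_added_labels)


# Invented data: group membership from value-added scores vs. one observer's guesses.
value_added = ["effective", "ineffective", "effective", "ineffective",
               "effective", "ineffective", "effective", "ineffective"]
guesses = ["ineffective", "ineffective", "effective", "effective",
           "ineffective", "effective", "effective", "effective"]

accuracy = identification_accuracy(value_added, guesses)
verdict = "below" if accuracy < CHANCE_LEVEL else "at or above"
print(f"Accuracy: {accuracy:.1%} ({verdict} the {CHANCE_LEVEL:.0%} chance level)")
```

With only a handful of teachers, a result like this could easily arise by luck; a school running such a check would need far more observations before drawing conclusions.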

The typical response to seeing this kind of evidence is to pronounce that it simply cannot be true; therefore, we reject it. Based on what? Our belief that we know good teaching when we see it, Coe says, “is so strong that it is a real challenge to be told that research does not support it.” He goes on to cite five reasons why our belief may be wrong (the language below is that of Coe, with minor insertions from me).

1. Observation produces a strong emotional response. It’s challenging, when observing, not to project our own preferences for styles or behaviors and compare what we see with what we think we would have done. This is an issue of cognitive bias, and it does no one any good.
2. Learning is invisible. The following indicators, shared by Coe in a post in 2013, are good examples of things that observers often look at but that do not show whether any actual learning has taken place: (a) students are busy; lots of work is done, especially written work; (b) students are engaged, interested, and motivated; (c) students are getting attention: feedback, explanations; (d) classroom is ordered, calm, and under control; (e) curriculum has been ‘covered’; (f) at least some students have supplied correct answers (irrespective of whether they really understood them, could reproduce them independently, or knew them already).
3. Accepted ‘good practice’ may be more fashionable than effective. When we observe, we have a tendency (as humans who teach) to try to match what we see with what we believe to be good pedagogy. Unfortunately, we are limited by two things: (1) our ability to define and operationalize specific pedagogic practices, and (2) our knowledge about whether they really are effective. Coe notes that, if a group of observers is shown the same lesson, they might well disagree about whether a given practice has been seen.
4. We assume that, if you can do it, you can spot it. For experienced teachers, classroom behavior is mostly automated and subconscious. Even an effective teacher may not fully understand which bits of their practice really make a difference. And if the observing teacher is experienced but not particularly effective themselves, it may be even less likely that they will be able to identify effective practices.
5. We don’t believe observation can miss so much. Anyone who has seen the video from the “Gorillas in Our Midst” study (Harvard, 1999) will recall that around 50% of observers failed to spot a gorilla walking through the middle of a basketball game when their attention was focused on specific events (this is called inattentional blindness).

What should independent and international schools do?

Teacher observation, in and of itself, can be a useful practice. At a minimum, schools could move toward a more structured practice of observation, whether by administrators or even in professional learning communities focused on observation. Consider the following:

Stop assuming that untrained observers can either make valid judgments, or provide feedback that improves anything. (Coe)
Identify what your school values in terms of teaching and (ideally) engage with a local graduate school of education to construct a context-specific ‘foundations of observation’ framework that can be used in your school; importantly, provide training (through the graduate school partner) on clinical observation of teaching. This way, you follow Coe’s recommendation of applying a critical research standard and the best existing knowledge to the process of developing, implementing, and validating observation protocols. In too many schools, teacher observations are conducted without the use of any protocol at all.
Ensure that good evidence supports any uses or interpretations you make for observations, performance or otherwise. Caveats should be clearly stated, and not confounded with other contexts. (Coe)
Undertake robustly evaluated research to investigate how feedback from teacher observation might be used to improve teaching quality. (Coe) Such a school-level commitment speaks to raising the game on effective teaching, and provides a more robust, evidence-based answer when the school is asked about effective teaching.

Highly effective teaching, combined with focused attention to relationships, is the cornerstone of successful schools. We owe it to our teachers and our students to ensure that we, as schools, are supporting the growth and development of teachers, just as we aim to support the growth and development of students. They are inextricably bound.

Semper ad meliora.


Kevin Ruth
