As I continue my research in international education, one thing I notice is the current emphasis on the search for superior teachers in countries all over the world (Norway, Australia, England, the United States, . . . ). It has become well established by now that some students learn faster than others, and that this happens not just among individual students but also among different classrooms. The assumption is usually that the difference between teachers is the cause of the different outcomes. The elusive, desirable power of these sought-after teachers is called "teacher quality"; it is difficult to define, but researchers have been busy attempting to find it and then replicate it, to the benefit of society.
Hasn't anyone noticed the circular reasoning involved here?
Educational researchers today find quantitative social science fascinating. Using modern computing power, they pore over data to find statistical correlations, infer that some of those correlations reflect cause-and-effect relationships, and then look more closely to try to derive useful consequences. But in the case of sloppy educational argumentation, the reasoning has too often looked like this: "Some teachers can raise student scores more than one-and-a-half times faster than average, while some raise scores only half as fast as average." (Those arguing thus usually neglect random variation, or statistical noise, which, as Dana Goldstein has reported in "The Test Generation", can be so pervasive as to require ten years of data to reduce the value-added error rate for an individual teacher to a mere twelve percent.) "The former are the good teachers we want every child to have." But who are these "good teachers"? "Those who raise test scores at least one-and-a-half times faster than average." Doesn't anyone see that that magic elixir, "teacher quality", has been built into the definition of what so many seek?
Logically speaking, the argument runs like this:
"If the test scores go up, x is a good teacher", and
"If x is a good teacher, the test scores go up", therefore
"The test scores go up if and only if x is a good teacher."
Since a "good teacher" has already been defined as one whose students' test scores go up, substituting that definition reduces the biconditional to "the test scores go up if and only if the test scores go up." This is a tautology, of no utility for empirical research.
An analogy may be helpful here. Suppose we measure all of the primary school children in a neighborhood, and we decide we want all of our kids to grow faster. We guess that the houses might have something to do with it, so we define the high-quality houses as those that help kids grow one-and-a-half times faster than average. We study the statistics and find, in a given year, that some kids do grow one-and-a-half times faster than average, and others only half as fast as average. Now we separate these houses into groups and start searching for their elusive powers of growing children.
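The house version of this circularity is easy to demonstrate with a toy simulation (all numbers and names below are mine, invented for illustration): growth rates are generated as pure noise, a house is labeled "high quality" whenever its child grew at least one-and-a-half times the average, and yet the label predicts nothing about the following year's growth.

```python
import random

random.seed(0)

# Toy simulation of the house analogy. The house contributes nothing by
# construction: every child's growth is average plus random noise.
N_HOUSES = 1000
AVERAGE = 6.0   # hypothetical average growth, cm/year
NOISE = 2.0     # hypothetical spread due to chance alone

# Year 1: pure noise.
year1 = [random.gauss(AVERAGE, NOISE) for _ in range(N_HOUSES)]

# Define "high quality" circularly: a house whose child grew at least
# one-and-a-half times the average this year.
high_quality = [g >= 1.5 * AVERAGE for g in year1]
print(f"Houses labeled high quality in year 1: {sum(high_quality)}")

# Year 2: pure noise again. If the label captured a real property of
# the house, it should predict next year's growth. It doesn't.
year2 = [random.gauss(AVERAGE, NOISE) for _ in range(N_HOUSES)]
n_hq = sum(high_quality)
hq_mean = sum(g for g, hq in zip(year2, high_quality) if hq) / n_hq
rest_mean = sum(g for g, hq in zip(year2, high_quality) if not hq) / (N_HOUSES - n_hq)
print(f"Year-2 mean growth, 'high quality' houses: {hq_mean:.2f}")
print(f"Year-2 mean growth, all other houses:      {rest_mean:.2f}")
```

Run it and the two year-2 means come out essentially identical: the "high quality" label was nothing but a name for last year's luck.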
I am not claiming that teachers have no more to do with children's test scores than houses have to do with children's growth rates; I am picking on the circularity of much-too-fashionable reasoning regarding teachers and assessments.
Let's return to the children's growth-rate analogy. Suppose that we more plausibly connect children's diets with their growth rates. Fortunately, food comes in many different kinds, so if we can't get parents to put their kids on extreme, simple diets (say, all meat or all vegetables), we might be able to run more natural experiments: have them eat as they normally would, merely record everything they eat for a given period of time, and afterwards analyze the data, looking for correlations between diets and growth rates. Fair enough; at least we have some variety in the children's inputs to search through. But if all we had done was define "high-quality food" as that which helps kids grow, paying no attention to differences in what the children actually ate, we likely wouldn't have made any better progress with our dietary analysis than we would have with our housing analysis.
For our analyses of effective teaching to be any good, we need to do more than simply build the quality we are looking for into our definitions of both input and output; we have to record, in advance, some variety in the inputs (perhaps length of lessons, style of delivery, experience of the teacher, similarity in race between teacher and taught, or any of a huge number of possible variables) that we can correlate with the results we hope to see. I will argue elsewhere that those results should not necessarily be rising test scores; I'll save that for another day. But please, when insisting upon (arguably non-existent) "impeccable research", don't use reasoning that is so obviously peccable.