New York City Teacher Evaluations Made Public: Will it Help?
Following a court challenge last year by several media outlets, the New York City Department of Education, which operates the nation’s largest school district, agreed to release individual evaluation rankings for 18,000 of its teachers. The New York Times is publishing its analysis of the data, including the teachers’ names and their school assignments. GothamSchools, an independent online publication, has opted out, citing significant concerns about both the fairness and the accuracy of the data.
Indeed, there is potentially a wide margin of error. The final scores for teachers could be off on average by as much as 35 or 53 percentage points for English and math exams, respectively, the New York Times reported.
Additionally, some teachers’ evaluations reflect test data collected for as few as 10 students.
“The purpose of these reports is not to look at any individual score in isolation ever,” Shael Polakow-Suransky, the city Education Department’s chief academic officer, told the New York Times. “No principal would ever make a decision on this score alone and we would never invite anyone, parents, reporters, principals, teachers, to draw a conclusion based on this score alone.”
That might be easier said than done. In many cases, it’s easier to gather data than it is to comprehend what they mean or how they should be used, either in education policy or journalism.
Several years ago, the Los Angeles Times hired an outside researcher to help devise its own method for interpreting the teacher evaluation data provided by the district. The high-profile publication of the “Grading the Teachers” project led to much debate in the education and journalism communities.
In a recent issue brief, Center for American Progress senior education policy analyst Diana Epstein and co-author Raegen Miller (CAP’s associate director for education research) suggested that publicizing evaluation results for individual teachers would do more harm than good.
Epstein and Miller were blunt in their criticism of a groundbreaking – and controversial – Los Angeles Times project that ranked thousands of the city’s teachers by their multi-year test scores.
By building its own interpretative model, the newspaper crossed the line from reporting to research, the authors contended. The Los Angeles Times (and others that follow its lead) should be held to a more rigorous standard, they argued.
“If journalists attempt to do their own analyses of value-added data, they should follow the same standards that researchers do when protecting human subjects,” Epstein and Miller wrote. “This means that data are de-identified and individual names are never published.”
For the evaluations to be useful, the teachers have to be willing partners, and publicizing their names along with the results will only make teachers less willing to engage in the process, the authors contended.
Doug Smith, the data reporter for the Los Angeles Times project, said his biggest issue with the report was that it was “doctrinal.”
The issue brief suggested there might be negative consequences to this kind of reporting, but those were assumptions, rather than known facts, Smith said.
In reality, there have been no obvious negative consequences to publishing the names of the teachers and their ratings, Smith said.
“The district’s test scores increased by about the same amount they have in previous years – there was no significant difference,” Smith said. “There were no reports of parents storming the schools and demanding different teachers for their children, throwing the campuses into turmoil.”
The Los Angeles Times gave teachers the chance to review their own rankings prior to the list being published, as well as a chance to add written comments to their personal page on the paper’s Web site. (The New York Times is offering teachers a similar opportunity to comment and correct errors.)
“Most of the teachers who called were angry that we had published individual names, but we also got calls from teachers who thanked us for giving them some feedback that they had never received before,” Smith said. “We certainly know a lot of teachers were upset. There could have been a morale issue, I won’t disagree with that. But if we observed a clear and distinct negative impact, we would probably rethink what we’re doing. But so far, we haven’t seen that.”
In his State of the Union address, President Obama referenced a study by Harvard and Columbia University economists that found a correlation between teachers who were successful at raising test scores and some long-term indicators of the health and well-being of their students. Students of teachers with better track records on test scores tended to have lower rates of teen pregnancy, were more likely to attend college, and had higher earnings.
Dale Ballou, associate professor of public policy and education at Vanderbilt University, who reviewed the study for the National Center on Education Policy, said it’s important not to give too much weight to any one piece of research.
The next step is for the study to be replicated in other districts, Ballou said, with consistent results for multiple locations. He also wants to see more of the original study’s evidence.
“The issue is whether we can confirm that the same teachers who seem to be raising test scores in the short term also have these positive long term impacts,” Ballou said. “If you can measure the teachers who are having success raising test scores, and in the long term they are having these positive effects, that validates the use of test scores as a way of evaluating teachers.”
Facing significant pressure from the U.S. Department of Education as well as parents and lawmakers, states and districts are scrambling to build teacher evaluation models that satisfy the clamor for accountability. But in the long run, how much of the data is actually being used to help teachers do a better job? Can the so-called “value-added” formulas ever truly account for the numerous socioeconomic factors that play such a significant role in a student’s performance, and are often outside a school’s control? Will publishing teachers’ names and rankings somehow make the data of greater use?
Ballou said he is concerned that most people don’t have a good understanding of statistical instruments or realize that the measures are estimates – not final pronouncements of a teacher’s success or failure.
“People are going to overreact to small differences that aren’t that meaningful,” Ballou said. “When you start publishing these numbers without indicating to people how much error there can be in these estimates, they’re going to draw conclusions that aren’t true.”
On the flip side, however, Ballou said that districts have been collecting teacher evaluation data for years without doing much to make use of the information. Making the data public could give districts greater motivation to take action.
“When you put this information out in the public, you’re going to have people clamoring to get their kids in certain classrooms,” Ballou said. “The other side of the issue is if you don’t put this stuff out in the public, nothing will change or improve. It’s a real tough call.”