Q & A with Laura Hamilton: States should wait to evaluate teachers under Common Core

I was away on vacation and asked our newest Hechinger Report writer, Aisha Asif, to fill in. She interviewed RAND’s Laura Hamilton, who argues that states should wait a couple of years before judging teachers’ performance based on the new Common Core test scores. But Hamilton acknowledges that few states will be able to do that. I also found it interesting that Hamilton rejects the conventional wisdom that it can be more accurate to average several years of student test scores when evaluating teachers. – Jill Barshay

New Common Core-aligned tests are sure to have an effect on how teachers are rated now that teacher evaluations in many states are tied to their students’ test scores. The Common Core standards in math and English emphasize greater critical thinking skills and non-fiction reading, and some districts in places like New York and Kentucky have already seen test scores fall dramatically after their students were tested on the tougher criteria.

The Hechinger Report spoke with Laura S. Hamilton, senior behavioral scientist at the RAND Corporation, whose research is concentrated on assessment, accountability and evaluation of teachers and school leadership, about the repercussions of the Common Core State Standards on teacher evaluation systems.

Question: Do different tests give different value-added scores for teachers?

Answer: Yes, there’s been work that shows even different sets of items on the same test can give different value-added estimates for teachers. A lot of the differences have to do with the teacher’s own content coverage and how well the curriculum that the teacher is using matches the content of the test. If it’s testing something that isn’t included in that teacher’s curriculum, it’s likely to be less sensitive to the teacher’s effects. So it can make a really big difference.

Q: Why is there such a difference?

A: I think part of it is that the assumption behind these kinds of teacher evaluation systems is that you have a test that’s measuring what teachers taught and measuring whether students learned what teachers taught. That’s why we’ve seen with all the research on high-stakes testing that over time teachers adjust their instruction to try to make it match the content and the format of the test, because that increases the likelihood that their students will be exposed to the tested content.

Q: How many years of data should be used to get a reliable rating for a teacher?

A: It’s a really hard question. It’s particularly hard now given that so many states are changing their tests, so the value-added estimates based on the existing state tests aren’t necessarily going to be comparable to the ones that are based on new tests, say for the states that are adopting the Common Core-aligned assessments.

And the other problem is that if you assume that teachers’ effectiveness stays the same over time, then aggregating across multiple years will give you a more stable value-added estimate. But we know that particularly in teachers’ first few years they tend to improve. So by averaging you won’t see that improvement.

Q: What effect do you see the Common Core standards having on teacher evaluations?

A: My sense from talking with states is that they plan to continue using the same general type of evaluation system they already have in place. But when these new tests are adopted, one of the problems we’re going to see, and have already seen in many cases, is that students’ scores decline because of a lack of familiarity with the content as well as stricter cut scores for determining proficiency.

One effect will be that we won’t have value-added estimates that will be comparable across those years when states make those transitions. And I think it’ll be a challenge for states to figure out how to deal with that in their evaluation systems. In addition, it will take [teachers] a while to really learn what is on the tests – the materials they need to emphasize to make sure their kids are prepared for them.

The other thing is that a lot of the classroom observation systems currently used are sort of generic, so they can be used in any grade or subject. They’re not focused on the teaching of specific content, so I think states and districts will want to take a close look at what they’re measuring with their observation protocols, and make sure that it’s consistent with the goals they have for promoting teaching that’s aligned with the Common Core standards.

Q: If you need multiple years to get a reliable rating on a teacher, and the tests will be different under Common Core, should there be a waiting period to get enough data under the new tests?

A: I would advise states and districts to institute a waiting period if at all possible because I think it’s very difficult to combine information across these very different kinds of tests and make any sense of it. Ideally, schools and teachers should have a couple of years to get up to speed on the standards and get familiar with the testing program. That’s not often feasible given some of the policies states have enacted. But that would be ideal.

This interview has been edited for length and clarity.