Building Better Student Assessments
Some student assessments don’t look much like standardized tests at all, even when they’re being used for school accountability.
“We asked the kids to be a town planner, and as part of that planning board they are asked to design two [water] towers,” said Lee Sheedy, a math teacher at Spaulding High School in New Hampshire who’s been working on new assessments for two years. “One would be a simple solid and the other had to be a compound solid. They then write a proposal to the town recommending one of the towers.”
This geometry task comes from an initiative in New Hampshire, the Performance Assessment of Competency Education (PACE), which has drawn national attention.
In 2015, the state received a federal waiver to work with eight New Hampshire districts on the pilot. Those districts are using a set of performance assessments at most grade levels instead of the state’s standardized end-of-year exams for testing and accountability purposes.
The assessments — some developed by local districts, others common to all participating school systems — involve multi-step activities that seek to gauge and promote deeper learning. According to the New Hampshire Department of Education, this approach guards against over-testing by making assessments part and parcel of classroom learning rather than a disruptive addition to it.
Sheedy participated in a discussion last month on the future of assessment at the Education Writers Association’s national seminar in Boston.
“The center of gravity on assessment has really been pulled toward the end-of-year summative assessment,” said Tony Siddall, a program officer at the nonprofit Next Generation Learning Challenges who also spoke on the EWA panel. “One of our hypotheses is that it needs to be pulled back much more to teaching and learning.”
Siddall’s organization earlier this year issued a dozen grants to states, districts, and nonprofit organizations to support innovations in assessment design, including “culturally responsive” assessments in Hawaii, work by Summit Public Schools in California on assessing “habits of success,” as well as further work in New Hampshire, related to the PACE project, focused on promoting personalized learning in a multi-age setting.
‘You Could Hear a Pin Drop’
With PACE, New Hampshire has taken advantage of a two-year federal accountability waiver to overhaul its testing system. During the pilot, districts using PACE administer the common, statewide assessments developed by the Smarter Balanced testing consortium in third-grade English, fourth-grade math and eighth-grade English and math. All juniors take the SAT. But in other grades, students complete performance tasks.
Previously, many high school students did not take the state assessments seriously when they learned that the results would not count toward their grades, Sheedy said. Now, he said, he has never seen students as focused as they are when working on the performance tasks.
“When you give students a real-world problem, you allow them to be creative, you allow them to think critically,” Sheedy said. “They get incredibly motivated. If you walked into my room during PACE you could hear a pin drop.”
How the assessment pilot fits into the state’s accountability system is a little trickier. While New Hampshire is currently able to pursue the project under a waiver from the federal No Child Left Behind Act issued by the U.S. Department of Education, the state will have to get similar flexibility going forward under the new Every Student Succeeds Act, signed by President Obama in December 2015. The federal agency has already said it will choose up to seven states to receive such flexibility for new, innovative assessments. According to Education Week, New Hampshire has until the end of 2016 to use its waiver.
As other states consider applying for the ESSA pilot, New Hampshire’s work on PACE has been of keen interest. Throughout the pilot in New Hampshire, Sheedy said he’s been impressed by how much developing the tasks has helped him as a teacher.
“When you let teachers … get out of their classrooms and you look at student work and you talk about it, teachers become better teachers,” Sheedy said. “Their ability to instruct and assess, it increases exponentially. I have grown more as a teacher since I’ve been doing PACE than any other thing I’ve been doing in the classroom over the last 12 years.”
The teacher-led work of designing the tasks and learning to grade them was significant. Teams of teachers wrote the questions and the scoring guides used to grade them, and could model that work on a state-developed process based on typical student answers.
Scoring Student Work
For the water tower problem, students are asked to design a tower that will hold approximately 45,000 cubic feet of water, with special attention to using the least amount of construction materials. Student work is scored at four levels — from an answer and explanation that is incomplete or inaccurate to one that is well-structured, organized and correct. The scores also are focused on three main areas: models and scale drawings; calculations and mathematical strategy; and communication of the analysis.
For example, a level 4 score in models and scale drawings would include two drawings that meet the design specifications and are mathematically accurate and properly labeled. A level 1 in that category would show inaccurate drawings that aren’t labeled and don’t meet the requirements of the question.
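To give a sense of the calculations behind the "mathematical strategy" category, here is a minimal sketch of one way a student might approach the simple-solid tower. It is not taken from the PACE materials. It assumes the tower is a closed cylinder and that the amount of construction material is proportional to surface area; the function names are illustrative only.

```python
import math

TARGET_VOLUME_FT3 = 45_000  # approximate capacity stated in the task


def cylinder_height(radius_ft, volume_ft3=TARGET_VOLUME_FT3):
    """Height a cylinder of this radius needs in order to hold volume_ft3."""
    return volume_ft3 / (math.pi * radius_ft ** 2)


def closed_cylinder_area(radius_ft, volume_ft3=TARGET_VOLUME_FT3):
    """Surface area (top, bottom and side) of a closed cylinder holding volume_ft3.

    Assumption: material used is proportional to this area.
    """
    height_ft = cylinder_height(radius_ft, volume_ft3)
    return 2 * math.pi * radius_ft ** 2 + 2 * math.pi * radius_ft * height_ft


def optimal_radius(volume_ft3=TARGET_VOLUME_FT3):
    """Radius that minimizes the area A(r) = 2*pi*r^2 + 2*V/r.

    Setting dA/dr = 4*pi*r - 2*V/r^2 to zero gives r^3 = V / (2*pi).
    """
    return (volume_ft3 / (2 * math.pi)) ** (1 / 3)


best_radius = optimal_radius()
best_height = cylinder_height(best_radius)
best_area = closed_cylinder_area(best_radius)

print(f"radius: {best_radius:.1f} ft")
print(f"height: {best_height:.1f} ft")
print(f"surface area: {best_area:.0f} sq ft")
```

Under these assumptions the least-material cylinder is as tall as it is wide (height equal to twice the radius). A compound solid, such as a cylinder topped with a hemisphere, would need its own volume and area formulas.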
The scoring process is an in-depth effort that requires significant training. Teachers are trained by their peers to use the scoring guides to grade student answers, and then compare their scoring with that of teachers from other schools and districts to ensure accuracy and consistency. Scores are then reported to the state. The initial training for teacher leaders takes about six days spread across the year and three to five days in the summer.
Kathleen Cotton, a curriculum and instruction coach in Sheedy’s district in Rochester, New Hampshire, said although there is extra work involved on the front end, the performance tasks give teachers information they can use immediately.
“You look at some of this high-stakes testing that we have, and it really is not engaging at the time because the students don’t really have any buy-in except of that one score at the end,” Cotton said during the EWA panel.
‘Changing a Culture’
A key dimension of New Hampshire’s pilot system is how it promotes a different kind of relationship between districts and the state, said Scott Marion, the president of the Center for Assessment, a nonprofit consulting firm that is working on the pilot.
“What’s really new in PACE is the shift from simply focusing on assessment to shifting the entire accountability structure,” Marion said. “It’s a true partnership between the state and the districts as opposed to everything being top-down.”
Shifting toward a performance-based assessment and accountability system isn’t easy. At the very least, districts must have leaders ready to get to work creating the tasks and learning to grade them. Schools have to work together with each other and the state to make sure scores are valid and reliable. States have to be open to taking on new challenges in measuring accountability and commit to the funding to do it.
Paul Leather, the New Hampshire Department of Education’s deputy commissioner, said in an interview that his state uses a mix of state, federal and private grant funding for the PACE program, but the goal is to eventually make it fully a state and federal responsibility. The Nellie Mae Education Foundation provided $300,000 to train educators in how to develop and grade the new assessment, but that amount wouldn’t necessarily be the same for other states, Leather said. It depends on what work they might already be doing and what their specific needs are, he said.
Ultimately, a smooth transition requires teachers, districts, state officials and students to be on the same page, Cotton said.
“At this point,” she said, “we are changing a culture.”