There are few questions more crucial to the field of education than what students should learn and how that learning should be measured. This Topics section examines several currently hot topics – including common standards, international comparisons, and cheating – in the often-contentious realm of standards and testing.
While standards and tests have been part of American public education since before the 20th century, the modern push for standards-based reform is often traced to the 1983 publication of “A Nation at Risk: The Imperative for Educational Reform.” Warning of a “rising tide of mediocrity,” the report prompted a surge of political interest in school reform that resulted in several unsuccessful attempts to develop national standards and tests. But while those efforts fell short, states took the initiative to develop standards on their own; between 1990 and 2002, most states developed specific standards for their core academic subjects, as well as tests that purported to measure how well students were learning them.
Yet many experts say the marriage of standards and assessments was not complete until after the federal No Child Left Behind Act was enacted in 2002. NCLB tied federal funding to a requirement that states administer standards-based tests to students in grades 3-8 and once in high school – far more testing than most states had previously conducted. The federal law also imposed an array of new consequences for schools that failed to show “adequate yearly progress” on state exams. But the law also left it up to the states to determine where to set the bar on standards and tests. The result, many analysts argue, has been enormous disparities across states in what students are expected to achieve.
Despite those disparities, the nation has long had a common yardstick for measuring student achievement: the National Assessment of Educational Progress, also referred to as the “Nation’s Report Card.” NAEP exams are taken periodically by representative samples of students in the fourth, eighth and 12th grades in various subjects. Though prominent and influential within the education policymaking world, NAEP exams do not provide student-level achievement scores, are not tied to accountability requirements, and are not linked to a specific set of standards being implemented in the schools.
Read the most recent NAEP results in fourth and eighth grade math and reading, which were released in November of 2013. Something to note: the results can be interpreted in many ways. For some, nearly 40 percent of the tested population scoring at the proficient level in math is a positive sign, given that in 1990 that was true for less than a fifth of students. To others, it means that less than half of fourth graders are proficient in math.
Push for Common Standards
With the Common Core State Standards, a new attempt is underway to wed standards and assessments. As of June 2012, 46 states plus the District of Columbia had agreed to adopt the Common Core State Standards, which aim to spell out what students should know and be able to do throughout their K-12 education careers. Not all the participating states adopted both the Mathematics and English Language Arts portions of the Common Core; Minnesota elected to use its own mathematics standards.
Development of the Common Core State Standards was spearheaded by the National Governors Association, the Council of Chief State School Officers and the nonprofit group Achieve, using private grant funding. Under the initiative, groups have organized to develop standards in mathematics and English Language Arts, as well as standards for literacy in the sciences and social studies. Meanwhile, two interstate consortia are creating related assessments, which are slated to be fully implemented in 2014-15 with mathematics and English Language Arts components for grades three through high school. Those consortia are the Partnership for Assessment of Readiness for College and Careers (PARCC), managed by Achieve, and the SMARTER Balanced Assessment Consortium (SBAC), managed by WestEd. The process of building the assessments was funded through $360 million from the U.S. Department of Education with funds from the federal economic-stimulus act of 2009.
Among the factors seen as lending momentum to the common standards movement are the lackluster performance by U.S. students on international assessments, rising concerns about preparing students to compete in a global workforce, and the wide variations in expectations and performance among states. The Common Core initiative aims to erect instructional signposts that guide students toward an end goal of graduating from high school ready for college and careers. Among the concerns some critics of the standards have raised is the question of whether the standards-writers have strayed from content into pedagogy. Some critics have taken aim at the “publishers’ criteria,” which guide the development of curricular and instructional materials based on the standards, saying that the criteria include specific instructions on how teachers should lead lessons.
And in contrast to NCLB testing, Common Core involves states’ agreeing to set a single minimum “cut score” that students must attain on the tests to be designated proficient. The idea is to enable participating states to measure their student achievement against a shared yardstick. While each consortium will have its own cut score, supporters of the new assessments say that two cut scores among the participating states and the District of Columbia represent an improvement over the status quo of separate cut scores in each state.
The common assessments also are being designed to be administered electronically, a feature that advocates say will speed delivery of results. The two consortia also plan to offer several “formative assessments” over the course of a school year before the more high-stakes “summative” tests. Assessment planners maintain the additional periodic testing will allow teachers to better target student weaknesses ahead of end-of-year exams.
Education experts already have started to debate whether the Common Core Standards are more rigorous than standards currently in place. A 2011 survey of more than 300 school districts in the participating states by the Center on Education Policy found that roughly 60 percent of respondents believed the Common Core Standards are more rigorous than the ones they have been using. Yet some experts have questioned that conclusion. Likewise, some scholars have raised questions about whether the standards are likely to have much impact on raising student achievement. Even strong supporters of the initiative acknowledge that how states and districts implement the common standards and assessments will make all the difference in how they affect students and schools.
Part of the impetus for designing the Common Core State Standards was to catch up with other countries that place highly on international exams, including the Program for International Student Assessment (PISA), administered by the Organization for Economic Cooperation and Development (OECD). In 2009, students from dozens of countries, including 34 from the OECD, were tested in math, reading, and science. The U.S. ranked 25th, 12th, and 17th in those subjects respectively, among participating nations. The results, published in December 2010, were greeted with renewed calls for education reform and closing the international achievement gap. The 2012 results, which came out in December 2013, showed little change among U.S. students, while a bevy of poorer countries saw considerable gains, and some, like Poland and Vietnam, surpassed the U.S. Fifteen-year-olds in the U.S. were below average in mathematics, while their scores in reading and science were on par with the OECD average. More than a quarter of U.S. students finished in the lowest tier of math, while less than a tenth demonstrated skills that placed them in the top tier of the subject. Leading PISA countries had far more of their students place in the top tier than in the bottom rung.
Some analysts have noted differences in how U.S. students from various ethnic and socioeconomic backgrounds perform on international comparisons. For example, on the 2009 PISA tests, the gap in performance between rich and poor students in the United States was among the highest of all participating nations, while the countries with the top scores overall had smaller performance gaps between their students of different income levels. On average, various U.S. racial and ethnic subgroups perform differently on the exams, as well. For instance, according to the Education Trust, U.S. students classified as white or Asian-American scored similarly to students from such high-scoring countries as Japan and Finland. However, students from historically underserved African-American and Latino subgroups performed at levels comparable to students in lower-scoring nations, such as Turkey and Bulgaria. Despite the gap, the OECD notes that for the 2012 results, the U.S. share of poor students approaches the OECD average. The OECD also writes that America’s large percentage of immigrant students can explain only 4 percent of the country’s scores. Canada, a similarly high-immigrant country, performs better than the U.S. across all subjects.
The OECD chalks up some of America’s middling performance to the low share of poor students who can be characterized as ‘resilient’ — meaning their scores are comparable to wealthier students despite their modest socio-economic backgrounds. In leading PISA countries, far more low-income students display such resilience.
Another international exam worth looking at is the Trends in International Mathematics and Science Study (TIMSS), in which roughly 60 countries participate. Fourth and eighth graders are evaluated according to a rubric developed by the International Association for the Evaluation of Educational Achievement (IEA), of which the U.S. Department of Education is a member. The tests have been administered every four years since 1995. Results for 2011 were released in December of 2012. Here’s EWA’s summary of the findings. Previous results were posted in 2008 from the 2007 tests. In the most recent scores, U.S. fourth graders were bested by five other countries or sub-country groups, such as Finland, Singapore, Hong Kong and Russia, but scored higher than 40 other education systems (a term used by the group behind the tests to account for participants that technically are not countries). In general, U.S. fourth graders fared better against their international peers than U.S. eighth graders did against theirs. The test publishers also tally results for select U.S. states, showing that Massachusetts, Minnesota, North Carolina and Indiana fourth and eighth graders scored nearly as high as international leaders in math.
A cross-country study released in late 2013 shows American eighth graders in most states test above average in math and science when compared to students abroad. Massachusetts, Minnesota, and Vermont lead all U.S. states in science performance, besting 42 of the 47 countries that were evaluated in the study. Students in 36 states were above average in math, while those in 47 states reached that threshold in science. The report compared NAEP results to international TIMSS scores.
PISA and TIMSS, like NAEP, are not high-stakes exams, and teachers and schools are not held directly accountable for students’ results on those tests. But when test scores carry immediate consequences – such as whether schools meet targets for “adequate yearly progress” under NCLB, or whether students earn admission to selective universities – cheating has emerged as a serious issue.
One prominent NCLB-era cheating scandal was the revelation in 2011 that 44 schools and 178 principals and teachers in Atlanta routinely erased incorrect student answers on standardized test sheets and replaced them with the correct ones. A state report on the scandal described efforts to exert pressure on those educators to improve scores.
One technique for uncovering suspicious test-score patterns is “erasure analysis,” which considers how many test answers are changed from wrong to right and how many students make the same corrections. The method involves calculating the likelihood of those wrong-to-right changes happening by chance, with very small probabilities signaling possible cheating. In 2011, USA Today used erasure analysis for a national series that raised questions about apparent test-score anomalies in six states and the District of Columbia.
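The chance calculation at the heart of erasure analysis can be sketched as a simple binomial test. The sketch below is illustrative only: the function name, the assumption that a random replacement lands on the correct answer with probability 1/(choices − 1), and the example numbers are all hypothetical, not drawn from any actual investigation or the USA Today methodology.

```python
from math import comb

def wrong_to_right_pvalue(total_erasures: int, wrong_to_right: int,
                          num_choices: int = 4) -> float:
    """Probability of seeing at least `wrong_to_right` corrections by chance.

    Null hypothesis (an assumption for this sketch): each erased answer is
    replaced uniformly at random among the remaining answer choices, so a
    replacement lands on the correct answer with probability 1/(num_choices - 1).
    """
    p = 1.0 / (num_choices - 1)
    # Binomial upper-tail sum: P(X >= wrong_to_right) for X ~ Binomial(n, p)
    return sum(
        comb(total_erasures, k) * p**k * (1 - p) ** (total_erasures - k)
        for k in range(wrong_to_right, total_erasures + 1)
    )

# Hypothetical classroom: 30 erasures, 25 of them changed from wrong to right.
p_value = wrong_to_right_pvalue(30, 25)
```

A p-value this small would flag the classroom for further review; in practice, analysts would also compare the pattern against district-wide erasure rates rather than rely on a single threshold.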
Among college-bound students, cheating of a different kind has been chronicled – students taking the SATs or ACTs for peers who pay for the service. In 2011, The New York Times revealed one such cheating operation on Long Island. Those reports led the College Board to alter its SAT registration process, adding the requirement that students upload a photograph of themselves so that proctors can ensure that the student sitting for the test is the one who registered.
Even though more universities are telling students that SAT and ACT scores are not mandatory, many postsecondary institutions rely on those standardized tests as a hedge against high school transcripts with inflated grades. For example, one 2009 study found that high school grade point averages rose among Virginia applicants without any corresponding uptick in SAT scores. — Mikhail Zinshteyn, December 2013