Blog: The Educated Reporter

Reporting Recipes: Stories Using Data

We asked some of the education reporters who joined us at Stanford University in May to contribute blog posts from the sessions. Today’s guest blogger is Sharon Noguchi of the San Jose Mercury News. Stream sessions from National Seminar in your browser, or subscribe via RSS or iTunes. For more on data and accountability, visit EWA’s Story Starters online resource.

Tipped off about possible student absence falsification by the Columbus City Schools district, two Columbus Dispatch reporters started mining Excel spreadsheets and produced about 100 stories indicating widespread cheating had occurred. The stories by Bill Bush and Jennifer Smith Richards led to a state auditor’s inquiries, the early retirement of the superintendent and other administrators, and an FBI investigation.

Bush and Smith Richards, who won an EWA award for investigative reporting, told attendees at EWA’s National Seminar last spring how they delved into 165 Excel files and found that Columbus City Schools effectively deleted an astonishing 2.1 million student absences, falsely moving students in and out of the district so that their test scores would not count and thereby improving the schools’ standing on state report cards. The reporters found the district also changed grades.

Contacted by Smith Richards, a former district employee agreed to explain what had happened: he described where the data he had helped compile was stored electronically and showed the reporters how school district officials had changed it.

Among the tips Bush and Smith Richards offered to reporters seeking stories buried in data:

  • Buddy up to nerds, the folks who track data in school districts;
  • Know what’s in your state’s Student Information System;
  • Know your state’s accountability rules;
  • Ask for record layouts — the key or headers for any spreadsheet;
  • Send requests early: Some of theirs took months to fill;
  • Once you’ve analyzed your data, take it and your conclusions back to the district to make sure you are interpreting it correctly.

Moderator Cathy Grimes of the Hampton Roads Daily Press agreed: “You don’t want to gather data and draw the wrong conclusions.”
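For reporters comfortable with a little scripting, the record-layout tip can be turned into a quick check: compare the column headers in a file you received against the layout the district gave you. The sketch below is only an illustration. The column names, the sample export and the `compare_headers` helper are all hypothetical, and it assumes the spreadsheet has been saved as CSV.

```python
import csv
import io

def compare_headers(csv_text, layout_fields):
    """Return (missing, unexpected) column names relative to the record layout."""
    reader = csv.reader(io.StringIO(csv_text))
    # First row of the export is assumed to hold the column headers.
    headers = [h.strip() for h in next(reader, [])]
    missing = [f for f in layout_fields if f not in headers]
    unexpected = [h for h in headers if h not in layout_fields]
    return missing, unexpected

# Hypothetical record layout supplied by a district.
layout = ["student_id", "school_code", "enroll_date", "withdraw_date", "absences"]

# Hypothetical export; a real one would be read from a file on disk.
sample = "student_id,school_code,enroll_date,absences,notes\n1001,A12,2011-08-24,3,\n"

missing, unexpected = compare_headers(sample, layout)
print("In the layout but not in the file:", missing)
print("In the file but not in the layout:", unexpected)
```

A column that appears in the layout but not in the file, or the reverse, is a question to take back to the district’s data staff before drawing conclusions.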

California Watch reporters Agustin Armendariz and Erica Perez used California’s Public Records Act to get financial data illustrating how California’s 72 community college districts spend millions on duplicative administrators. The pair fed the colleges’ data into a mapping system to show where colleges were physically clustered. Together, the districts spent $1.7 billion on top-level salaries, the reporters found.

The journalists on the panel agreed that asking informally for information, sometimes starting with a district’s data people, is the best first step before filing public records requests.

If you’re not sure what kind of data might be available, knowing the name of the system being used to collect the data can help. Then, Bush said, the software’s maker can explain what kind of information it collects.

One rich resource, Smith Richards said, is a school’s annual directory. It includes student and parent names, addresses, phone numbers, staff information and sometimes birth dates, salaries and course lists. Under federal privacy laws, families have to opt out in order not to be included.

Armendariz warned about sifting through data: As much as 10 percent may be junk because of mistakes such as misspellings or entries made in the wrong location. Data is always dirty, he said. Bush agreed: “If you get back a really crazy answer, it’s probably wrong.”
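One way to act on that warning is to flag suspicious rows before doing any analysis. What follows is a minimal sketch, not the panelists’ method. The field names (`school`, `absences`), the list of known school names and the 180-day ceiling are assumptions made up for the example.

```python
def flag_suspect_rows(rows, known_schools, max_days=180):
    """Return (row_index, reasons) for every row that looks like junk."""
    suspects = []
    for i, row in enumerate(rows):
        reasons = []
        # Misspellings show up as names missing from the official list.
        if row.get("school", "").strip() not in known_schools:
            reasons.append("unrecognized school name")
        # A "really crazy answer": absences that are not a number,
        # or that exceed the assumed length of the school year.
        try:
            absences = float(row.get("absences", "").strip())
        except ValueError:
            reasons.append("absences not a number")
        else:
            if not 0 <= absences <= max_days:
                reasons.append("absences out of range")
        if reasons:
            suspects.append((i, reasons))
    return suspects

# Hypothetical records, invented for illustration.
known = {"Lincoln Elementary", "Grant Middle"}
rows = [
    {"school": "Lincoln Elementary", "absences": "4"},
    {"school": "Lincon Elementary", "absences": "2"},     # misspelled name
    {"school": "Lincoln Elementary", "absences": "940"},  # impossible count
    {"school": "Grant Middle", "absences": "n/a"},        # not a number
    {"school": "Grant Middle", "absences": "0"},
]

suspects = flag_suspect_rows(rows, known)
share = len(suspects) / len(rows)
for index, reasons in suspects:
    print(index, "; ".join(reasons))
print(f"{share:.0%} of rows flagged")
```

Flagged rows are not necessarily wrong. They are the ones to check by hand or to raise with the district before publishing.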