The Bluegrass Institute for Public Policy Solutions

Innes’ letter on problematic report published in Education Week

Let’s do a little graph work with the Long Term Trend version of the National Assessment of Educational Progress (LTT NAEP). We’ll use this graph of the LTT NAEP reading performance, which is cut and pasted from the “NAEP 2008 Trends in Academic Progress.”

NAEP LTT Reading Trend Graph

On this graph, when scores for earlier years are statistically significantly different from the latest scores for 2008, those earlier year scores are tagged with an asterisk. If an earlier score lacks an asterisk, the score is not statistically significantly different from the latest, 2008 score.

Consider the Age 17 reading results: the scores from 1984 to 1992 are higher than the 2008 score, and each carries an asterisk. That indicates these scores are higher than the 2008 score by a statistically significant amount (NAEP uses a 95 percent confidence test for statistical significance). Scores from 1971 through 1980 lack an asterisk, indicating they are statistically tied with the 2008 score. Scores from 1994 to 2004, based on the original assessment format, also lack an asterisk, so they too are statistically tied with the 2008 score.
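For readers who want the intuition behind those asterisks, here is a minimal sketch of what a 95 percent confidence comparison between two average scale scores looks like. The scores and standard errors below are made up for illustration only, and NAEP's actual procedure is more involved (it relies on jackknife standard errors, among other things), so treat this as a rough picture of the idea rather than the agency's method.

```python
import math

def significantly_different(mean_a, se_a, mean_b, se_b, alpha=0.05):
    """Rough two-sample z-test: are two average scale scores
    statistically different at the given significance level?
    (Illustration only; NAEP's actual procedure uses jackknife
    standard errors and related adjustments.)"""
    z = (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return p_value < alpha

# Made-up numbers, NOT actual NAEP results: a 290 earlier-year average
# vs. a 286 average in 2008, each with a standard error near 1 point,
# would earn an asterisk...
print(significantly_different(290, 1.0, 286, 1.0))  # True
# ...while 287 vs. 286 would be "statistically tied" (no asterisk).
print(significantly_different(287, 1.0, 286, 1.0))  # False
```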

Note: Although NAEP changed the assessment format somewhat in 2004, the people who administer the assessment maintain that the differences are minor and that scores from the revised format can be compared with scores from the original format.

So, in the early days of LTT NAEP reading, Age 17 reading did improve very slightly. However, it is clear that progress went flat around 1984 and stayed flat until recently, when a statistically significant DECLINE in Age 17 reading began. The most recent Age 17 reading scores are statistically significantly lower than the scores for 1984 to 1992. The bottom line: there is nothing to crow about in the Age 17 results, as a number of researchers have noted.

Now, check the Age 13 results. They have been essentially flat since 1992, and the entire score gain over the 37 years shown is a measly five points. There is nothing to crow about here, either.

So, the modest improvement in the Age 9 results has never translated to upper age levels, a fact that concerns plenty of education researchers.

If you look at the LTT NAEP math results for Age 17 students, you will find a similar situation. The 2008 Age 17 LTT NAEP math score is not statistically different from the score way back in 1973, and it is not statistically different from any of the scores posted since 1990. That's nothing to crow about, either.

Still, that didn’t stop some people from trying to fabricate a mountain out of the Mississippi Delta with this data. Several weeks ago, Education Week published a Commentary article titled, “Public Schools: Glass Half Full or Half Empty?”

The article, and the paper upon which it is based, "Restoring Faith in Public Education," make incredible claims, such as that educational progress on the LTT NAEP has been "Commendable." Even our education commissioner got fooled.

However, the truth is the paper engages in some very poor analysis and cherry-picking of data.

You can read part of my reaction to this nonsense in my recently published letter to Education Week.

I also posted two rather extensive comments to Ed Week’s Commentary article. If you don’t want to check that out, just click the “Read more” link for a brief synopsis.

The Commentary/paper’s authors conveniently ignore the very obvious and dismal long-term performance of Age 17 students in the US on both reading and math (and incorrectly refer to LTT NAEP results by grade level instead of age level, as well). Whatever improvement might have been made in the lower age groups has not, after decades of testing, translated into better performance for high school students.

Furthermore, the Commentary and the "Restoring Faith" source paper relied on a secondary source for their LTT NAEP scores – a mistake solid researchers strive to avoid. As a result, the paper does not list the critical Age 13 reading scores from 1992. That omission changes the entire interpretation of Age 13 reading performance, even if you do only an unsophisticated examination of the scores that ignores sampling errors. Without 1992 included, it isn't so obvious that Age 13 reading on the LTT NAEP has been stagnant for the past two decades. Once you add the 1992 scores, even a simplistic analysis reveals that the 2008 Age 13 reading score merely ties the 1992 score. That makes it crystal clear that the Age 13 reading history in the LTT NAEP isn't commendable.
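To see why one missing year matters so much, here is a tiny sketch with made-up numbers. The years are plausible LTT assessment years, but the score values are hypothetical, not the actual published figures: with 1992 dropped, a crude first-to-last comparison suggests a gain; with 1992 restored, the same comparison shows the 2008 score merely tying it.

```python
# Made-up Age 13 average scale scores (NOT the actual published NAEP
# values), keyed by year, to show how dropping one year changes a
# naive first-to-last reading of the trend.
scores_without_1992 = {1994: 258, 1999: 259, 2004: 259, 2008: 260}
scores_with_1992 = {1992: 260, **scores_without_1992}

def naive_gain(scores):
    """Crude last-minus-first difference, ignoring sampling error."""
    years = sorted(scores)
    return scores[years[-1]] - scores[years[0]]

print(naive_gain(scores_without_1992))  # +2: looks like steady progress
print(naive_gain(scores_with_1992))     # 0: 2008 merely ties 1992
```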

Oh, yeah – I’ve been telling our readers that it is important to break out NAEP data by race to really get a good idea about true educational performance over time (something else the paper didn’t do). So, I checked the Age 13 and Age 17 LTT NAEP reading and math results for different racial groups using the NAEP LTT Data Explorer. In general, whether we are talking about whites, blacks or Hispanics, Age 17 math and reading scores have trended flat back to at least the early 1990s. The latest Age 13 reading scores are also not statistically different from scores in the early 1990s for these groups.

Fortunately, some good news is coming from this mess.

Education Week is considering posting a clear statement in the future that Commentary pieces are neither created nor endorsed by Education Week’s own, very professional staff.

Also, the point of contact for the Digest of Education Statistics, the publication that omitted those critical 1992 NAEP scores in its 2010 and 2011 editions, advises me that this omission will also be corrected.

Of course, fixing data problems probably won’t stop some in the education world from trying to spin educational achievement mountains out of the actual molehills of progress shown by the LTT NAEP. That will continue to undermine the public’s confidence in its public education system. However, the availability of better data will make it a bit easier for the rest of us to tell when educators are missing the real message in the data.

You see, restoring faith in education can’t be done with fairy tales. It will only happen if educators give us quality data and a quality analysis of that data.