Fuzzy thinking in Quality Counts education report???

The new Quality Counts education report has been released by Education Week, and the inevitable spin is on from the Kentucky Department of Education in their News Release 14-003.

The first table in the department’s press release shows Kentucky’s ranking in the various areas Quality Counts examined between 2011 and 2014. The table below extracts the three most recent years of those state rankings (I don’t show 2011 because Quality Counts didn’t consider all of the listed areas that year).

Quality Counts KY Rankings Through 2014 from KDE

Quality Counts ranked Kentucky against the other states in six areas:

• Chance for Success

• K-12 Achievement

• School Finance

• Transitions & Alignment

• Standards, Assessments and Accountability

• Teaching Profession

Also note that Quality Counts somehow computed an overall state rank across those six areas in 2012 and 2013. This is the data I show in the row with the red typeface, which the Kentucky Department of Education lists under the title “Overall Score” (which isn’t a correct title; these are overall ranks; see last year’s KDE News Release 13-003).

In any event, over the past year we all heard ad nauseam about how wonderful it was that Quality Counts rated Kentucky’s education system as the 10th best in the nation.

Well, consider this: the bottom row in the table, titled “Average of Rankings,” which I computed separately, is the simple average of the six Quality Counts subarea rankings for each year, rounded to the nearest whole rank. Notice that those averages look quite different from, and notably worse than, the overall state rankings Quality Counts somehow developed for Kentucky.

I don’t know what scoring scheme led to the inflated “Overall Score” ranking figures for Kentucky listed above, but it certainly wasn’t based on an equal averaging of the six subareas.

In fact, it looks like Quality Counts used some sort of weighting scheme to come up with its final rankings. And that weighting apparently gave undue credit to the “Transitions and Alignment” and “Teaching Profession” subareas over other areas like “K-12 Achievement,” which to me are more important indicators of current school performance.
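
To make the arithmetic concrete, here is a minimal sketch in Python. The subarea ranks and weights below are made-up placeholders, not the actual Quality Counts figures for Kentucky, so the printed numbers only illustrate how a weighted composite can land well above an equal-weight average of the same six ranks.

```python
# Hypothetical illustration only: these subarea ranks and weights are placeholders,
# not the actual Quality Counts figures for Kentucky.
subarea_ranks = {
    "Chance for Success": 35,
    "K-12 Achievement": 34,
    "School Finance": 30,
    "Transitions & Alignment": 3,
    "Standards, Assessments and Accountability": 15,
    "Teaching Profession": 8,
}

# Equal-weight average of the six subarea ranks (the "Average of Rankings" row).
equal_avg = sum(subarea_ranks.values()) / len(subarea_ranks)

# A composite that puts heavier weight on the subareas where the state ranks best
# comes out looking much better than the equal-weight average.
weights = {
    "Chance for Success": 0.10,
    "K-12 Achievement": 0.10,
    "School Finance": 0.10,
    "Transitions & Alignment": 0.30,
    "Standards, Assessments and Accountability": 0.10,
    "Teaching Profession": 0.30,
}
weighted_avg = sum(subarea_ranks[area] * weights[area] for area in subarea_ranks)

print(f"Equal-weight average rank: {equal_avg:.1f}")    # about 20.8
print(f"Weighted composite rank:   {weighted_avg:.1f}")  # about 14.7
```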

So, who can say if Quality Counts’ scoring even makes any sense?

I have a lot more concerns about Quality Counts. Click the “Read more” link to learn about things such as how New Mexico got unfairly treated compared to Kentucky by the Quality Counts analysis.

Quality Counts – Ensnared in statistical traps

Quality Counts makes significant use of the overall average student scores, or “all students” scores, for each state from the National Assessment of Educational Progress (NAEP). NAEP scores show up in calculations under the “Chance for Success” area and in multiple portions of the “K-12 Achievement” calculations. Unfortunately, the Quality Counts rankings ignore important statistical limitations in this NAEP data.

First, the NAEP is a sampled assessment. Every reported result, including the proficiency rates Education Week uses, carries a plus-or-minus sampling error. In many cases, NAEP proficiency rates for states that appear different, and which Quality Counts treats as different, are in fact only statistical ties.

Consider the case of the 2013 NAEP Grade 8 Math Assessment results used by Quality Counts. Kentucky’s “all students” proficiency rate reported by Quality Counts is 30.0 percent, and a little work with the online NAEP Data Explorer tool shows that is generally correct. But sampling error information from the NAEP Data Explorer also indicates we can only be 95 percent confident that Kentucky’s real grade 8 math proficiency rate in 2013 lies somewhere between 27.6 and 32.4 percent. That spread of nearly five points could move Kentucky’s true ranking around considerably.

The presence of this much sampling error also makes it inappropriate for Quality Counts to conduct its rankings using NAEP proficiency rates carried out to the nearest tenth of a point. The actual NAEP data simply isn’t that precise.
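
Here is a minimal sketch of that sampling-error arithmetic in Python. The 1.2-point standard error is my back-of-the-envelope figure inferred from the 27.6 to 32.4 percent interval quoted above, not a number pulled directly from the NAEP Data Explorer, and the second state in the tie check is purely hypothetical.

```python
# Sketch of the 95 percent confidence interval arithmetic for a sampled proficiency rate.
# The standard error below is inferred from the interval quoted above; the authoritative
# figures come from the NAEP Data Explorer.
Z_95 = 1.96  # two-sided 95 percent confidence multiplier

def confidence_interval(rate, std_error, z=Z_95):
    """Return the (low, high) bounds of the confidence interval for a rate."""
    margin = z * std_error
    return rate - margin, rate + margin

ky_rate, ky_se = 30.0, 1.2
low, high = confidence_interval(ky_rate, ky_se)
print(f"KY grade 8 math proficiency: {ky_rate} percent, 95% CI {low:.1f} to {high:.1f}")
# -> roughly 27.6 to 32.4 percent

def statistical_tie(rate_a, se_a, rate_b, se_b, z=Z_95):
    """Two rates are a statistical tie when their gap is within the combined sampling error."""
    combined_se = (se_a ** 2 + se_b ** 2) ** 0.5
    return abs(rate_a - rate_b) < z * combined_se

# A hypothetical state 1.5 points ahead of Kentucky, with similar sampling error,
# should be treated as tied with Kentucky rather than ranked above it.
print(statistical_tie(ky_rate, ky_se, 31.5, 1.2))  # True
```

Any ranking that splits states whose rates fail that kind of test is splitting statistical ties.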

A second, and probably more severe, error in Quality Counts is that it mostly relies on “all students” scores for its state rankings. As every NAEP report card from 2005 onward cautions, that can lead to serious misconceptions. The NAEP report cards say you need to look beyond only the “all students” NAEP scores to get a fair idea of relative state-to-state education performance. Here is an example from the 2013 NAEP Grade 8 Math Assessment to drive that interesting fact home.

Consider this statistical surprise. The next table summarizes results for Kentucky and New Mexico from the 2013 NAEP Grade 8 Math Assessment. The table lists the reported proficiency rates (carried out to the nearest tenth of a point to match Quality Counts’ format, not because it is suitable to list these scores in such detail) along with simple state rankings for those rates, like the rankings Quality Counts reports. I show breakouts for “all students” and for the three predominant racial groups found in Kentucky (NAEP reports other racial groups, but either Kentucky or New Mexico has only trace numbers of students from those groups, making comparisons impossible).

KY Vs NM on NAEP G8 Math in 2013

Now, here is the big surprise. If you look at the “all students” scores for Kentucky and New Mexico, which agree with the “State Highlights 2014” reports from Quality Counts, you are told that Kentucky ranked well ahead of New Mexico, 39th place vs. 47th place, for grade 8 math performance. But, is that really correct?

Again, if we ignore sampling error the way Quality Counts does, when we compare performance for the three dominant racial groups found in Kentucky, we find that New Mexico outscores Kentucky for every one of those racial groups – Every One. Now, which state looks like it does better?

So, how does Kentucky wind up on top in the “all students” only ranking used by Quality Counts? The answer involves something public school systems cannot control – the overall racial makeup of their classrooms.

In Kentucky, whites made up 83 percent of the eighth grade enrollment in 2013, while in New Mexico whites made up only 25 percent. In New Mexico the dominant racial group is Hispanic (60 percent), but Hispanics comprised only four percent of Kentucky’s eighth grade class during the 2013 NAEP testing.

Because whites outscore minority students by significant amounts in both states, Kentucky, with its much larger share of white students, winds up with a very unfair advantage when Quality Counts looks predominantly at “all students” scores.
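
For anyone who wants to see how that happens, here is a short Python sketch. Only the white and Hispanic enrollment shares come from the figures above; the remaining shares and all of the subgroup proficiency rates are made-up placeholders, so the printed percentages are illustrative only, not actual NAEP results.

```python
# Simpson's Paradox sketch: New Mexico beats Kentucky in every subgroup below,
# yet Kentucky's "all students" rate comes out higher because whites, the
# highest-scoring group, dominate Kentucky's enrollment.
# Subgroup proficiency rates are hypothetical; only the 83/4 and 25/60 percent
# enrollment shares come from the discussion above.
kentucky = {   # subgroup: (share of enrollment, proficiency rate in percent)
    "White":    (0.83, 33.0),
    "Black":    (0.10, 13.0),
    "Hispanic": (0.04, 20.0),
}
new_mexico = {
    "White":    (0.25, 38.0),
    "Black":    (0.02, 18.0),
    "Hispanic": (0.60, 22.0),
}

def all_students_rate(state):
    """Enrollment-weighted average of the subgroup rates, normalized over the
    subgroups listed (smaller groups are omitted from this sketch)."""
    covered_share = sum(share for share, _ in state.values())
    weighted_sum = sum(share * rate for share, rate in state.values())
    return weighted_sum / covered_share

print(f"Kentucky   'all students': {all_students_rate(kentucky):.1f} percent")
print(f"New Mexico 'all students': {all_students_rate(new_mexico):.1f} percent")
# Kentucky comes out ahead overall (about 30.4 vs 26.5 percent here) even though
# New Mexico's rate is higher in every individual subgroup.
```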

New Mexico educators should be complaining loud and clear to the publishers of Quality Counts about such unfair treatment.

By the way, the statistical surprise I just discussed is so well known that it even has a name, “Simpson’s Paradox.” Furthermore, I’ve talked about this issue with the Quality Counts people before. It is very unfortunate that the statistical folks who create Quality Counts apparently refuse to make adjustments for these real world facts of statistical life.

Quality Counts – Too much stuff does not directly relate to current school performance

Consider this: Quality Counts’ specific report for Kentucky, “Kentucky State Highlights 2014,” lists a number of factors that go into its “Chance for Success” calculation. Far too many of these data points really have no bearing on how well the state’s school system is currently operating.

For example, one factor is “Family Income.” What schools are doing now has no impact on that, at least in the short term.

The same is true for “Parent Education,” “Parent Employment,” “Adult Educational Attainment” (which seems to be mostly a double-counting of “Parent Education”), “Annual Income” (again, largely double-counting “Family Income”) and “Steady Employment” (a double-count of “Parent Employment”).

Most definitely, today’s schools have virtually no impact on “Linguistic Integration,” which is the proportion of school students whose parents are fluent in English. By the way, with its predominantly white, US-born population, Kentucky gets an unfair advantage from this dubiously included item.

Throwing such schools-cannot-control-this data into what many think is a valid rating of school system performance is clearly out of line. In fact, some schools, including some high-performing charter schools, show that effective schools can overcome parents’ financial limitations and low education levels. An evaluation system such as Quality Counts that penalizes a school system for things the school cannot control is clearly invalid.

I also want to make some cautionary comments about Quality Counts’ very high scores for Kentucky in the areas of “Transitions and Alignment” and “Teaching Profession.”

“Transitions and Alignment” scores are based on a checklist of Yes/No answers for the presence of 14 different programs. However, and rather ironically for a report whose title talks about “Quality,” there is no evaluation of the quality of those programs. Does Kentucky even define Workforce Readiness accurately? Who knows?

Furthermore, I don’t find it very beneficial for the state to have a definition of “Work Readiness” when only around half our graduates meet it (Note: The Kentucky School Report Card database shows only 54.1 percent of our high school graduates were college and career ready [CCR Tab] in 2013).

Also, it does not matter if our state standards are aligned from early childhood to college if our kids still are not getting the education they need. The NAEP says only 36 percent of our fourth graders were proficient in NAEP reading in 2013 and, as previously mentioned, only 30 percent of our eighth graders made the NAEP cut in math. And minority students in Kentucky did much worse. Until we produce much better results than that, I am not going to be impressed by any claims made about our education standards. If this is the product of the 10th best state performance in the country, the United States is in very serious trouble.

Finally, Quality Counts’ “Teaching Profession” element is loaded with more checklists of nice-sounding stuff but makes absolutely no judgment about the quality of that stuff.

It does not mean very much for the state to claim it has “Formal Evaluations” of teachers when those evaluations are mostly a rubber stamp.

Furthermore, I am not sure Quality Counts even reports on some of Kentucky’s “Teaching Profession” elements correctly. The report indicates we link teachers to student growth data. That’s planned for the future, but it was not in effect in 2011 as the report indicates.

In closing, I want to make it clear that Quality Counts certainly collects some interesting information. However, the overall analysis of that data leaves a lot to be desired. It is misleading to tell Kentucky’s citizens that our education system ranks 10th in the whole country when correctly analyzed data from sources like the NAEP and the ACT make it painfully clear that isn’t the case.