Independent Education Review: Volume 1 Number 3                                                  December 5, 2005

 


“Lake Woebegone,” Twenty Years Later


John Jacob Cannell, MD

Former President

Friends for Education


Almost twenty years ago, I wrote the two “Lake Woebegone” reports, named after Garrison Keillor’s mythical Minnesota town where “all the women are strong, all the men are good-looking, and all the children are above average.” The first report documented that all fifty states were testing above the “national average.” The second showed that American educators were cheating on standardized achievement tests in a systematic and pervasive manner. Endnote, Endnote The 1988 summer issue of Educational Measurement: Issues and Practice was entirely devoted to the first report, with numerous invited responses, including one from the U.S. Department of Education that also raised the possibility of widespread testing irregularities. Endnote


Fifteen years ago, I abruptly left the testing reform movement that I helped start. It is time to explain why I left.


My education about the falsification of American public school achievement tests was a gradual process. It started in my medical office in a tiny town in the coal fields of Southern West Virginia, led to school rooms in the county and then the state, to the offices of testing directors and school administrators around the country, to the boardrooms of commercial test publishers, to schools of education at major American universities, to various governors’ offices, and finally, to two American presidents.

 

One day in 1985, West Virginia newspapers announced all fifty-five West Virginia counties had tested above the national average. I asked myself, how could all the counties in West Virginia, perhaps the poorest and most illiterate state in the union, be above the national average? In my Flat Top, West Virginia, clinic, illiterate adolescent patients with unbelievably high standardized achievement test scores told me their teachers drilled them on test questions in advance of the test. How did the teachers know what questions would be on a standardized test?


Then I learned that West Virginia, like most other states, used very unusual “standardized tests.” Unlike the ACT or SAT, these tests used the exact same questions, year after year, and then compared those scores to an old, and dubious, norm group, not to a true national average. Furthermore, school administrators had physical control of the tests, and teachers administered them without any meaningful test security. Numerous teachers - usually my patients - told me they simply memorized or copied the test questions and taught their students the answers the following year.


(Things have changed little in 15 years. I dare the reader to find even one of the fifty-five West Virginia counties in which the majority of elementary students tested below the national average on the Stanford-9 Achievement Test between 1999 and 2002. Endnote In 2003, the federal government forced the state to stop excluding lower-functioning students from testing, and several dirt-poor counties in the coalfields fell slightly below the national average. However, the average West Virginia third grader still tested at a whopping 64th national percentile in 2003.)


This could not occur in a vacuum. School administrators had to know what was going on. I quickly learned that administrators - often “good old boys” - were the principal beneficiaries of teacher cheating, especially of rapidly rising scores, and they wanted no part of honest tests. Administrators needed rapidly rising scores to show they were doing a good job. When West Virginia finally changed to an equivalent, but different, test - and teachers were presented with an unfamiliar set of questions - scores plummeted, newspapers complained, and administrators played musical chairs, exchanging jobs. However, the next year’s scores started their inexorable and flattering ascent.


In fact, administrators first made great displays of telling the teachers not to dare look at the tests, and then made sure the teachers had ample time to memorize the questions. When this didn’t work, some fifty California school administrators, who had physical control of the tests, simply erased students’ wrong answers and marked the correct ones to get their low-performing schools above the national norm. Endnote Who was supplying the schools with these bogus tests?


Posing as a school administrator, I called a major test publisher. The woman I spoke with was more than happy to supply tests with any “national norms” that I requested - all certified by respected testing consultants from major universities. The publisher would sell inner-city norms, low-socioeconomic norms, adjusted norms, etc., with its tests. She explained that choosing the right “national norm” was very important.


If I chose a low-performing norm group, I could look forward to high initial scores, but year-over-year gains would quickly become unbelievably high. If the initial norms were “tougher,” I would look bad the first year but could look forward to very flattering year-over-year gains. The woman finally caught on to me when I asked how she knew the scores would go up every year. Of course, she knew that such gains are guaranteed when most of the questions are the same every year.
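The arithmetic behind these guaranteed gains is easy to sketch. The short Python calculation below is only an illustration - every number in it is invented, and it assumes normally distributed raw scores - but it shows how comparing a district against a fixed, dated norm group, while teachers grow a little more familiar with a reused test form each year, yields steadily climbing “national percentile ranks” even when real achievement never changes.

# Toy illustration (invented numbers, not the author's data): reusing one test
# form against a fixed, dated norm group lets reported percentile ranks climb
# every year even though true achievement stays flat.
from statistics import NormalDist

norm_group = NormalDist(mu=50, sigma=10)  # raw-score distribution of the old norm sample

true_mean = 48           # hypothetical district whose real achievement never changes
familiarity_boost = 2.5  # raw-score points gained each year the same questions are reused

for year in range(1, 6):
    observed_mean = true_mean + familiarity_boost * (year - 1)
    percentile = round(norm_group.cdf(observed_mean) * 100)
    print(f"Year {year}: reported national percentile rank = {percentile}")

Run as written, the imaginary district drifts from roughly the 42nd to the 79th percentile in five years without learning anything new - and the moment a genuinely new test form appears, the familiarity boost vanishes and the scores fall back.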


By 1986, I had found out several important things. The test publishers would supply any “national norms” school districts wanted; they knew that using the same test questions year after year assured that enough teachers would cheat to cause flattering year-over-year gains; and the publishers were making good money selling these “standardized tests.” Who was consulting for these companies, giving them academic cover?


In 1987, I formed an education reform group, Friends for Education. We conducted a series of campaigns to improve schools in West Virginia. For example, we held “The Cleanest School in West Virginia Contest.” When that was ignored, we held a “Dirtiest School in West Virginia Contest,” promising a bucket, mop, and broom to the winner. Endnote Suddenly, Friends for Education was getting press. We went on to hold rallies focusing on improving the worst schools in the state. We also filed complaints with the Equal Employment Opportunity Commission, claiming school officials were illegally denying women administrative positions in West Virginia schools. Endnote


However, I kept wondering about the tests. The American educational system is built around high or improving test scores. Could the entire testing system be fraudulent? If West Virginia was testing above the national average, then perhaps all the states were reporting the same thing and no one knew it.


Between patients, I had my nurse, x-ray technician, and lab technician collect test scores from the relevant education departments in all fifty states. There it was: all fifty states were reporting they were above average in elementary achievement! I also realized that newspapers throughout the country were repeatedly running flattering stories on local school achievement on one page and dire warnings that the United States was “A Nation at Risk” on another page.


CTB-McGraw Hill wrote me and threatened to sue me should I publish the data. Endnote I promptly dipped into my savings and published my first report: Nationally Normed Elementary Achievement Testing in America's Public Schools: How All 50 States Are Above the National Average. My findings showed up in newspapers around the country, including the front page of the Washington Post. Endnote


In February 1988, my education about standardized testing in America continued at a special meeting of test publishers and academicians at the U.S. Department of Education. Secretary William Bennett called the meeting, asking me to explain my findings that all fifty states were testing above the national average. During that meeting, Assistant U.S. Secretary of Education Chester Finn labeled my findings the “Lake Woebegone Report.”


At the meeting, I learned that a handful of academicians at major American universities consulted with test publishers to develop both the tests and the various norms. These same academicians explained that “high stakes” testing was causing the “Lake Woebegone” phenomenon. In effect, they said the cheating was occurring because the American public wanted their children, and their schools, to do well on achievement tests! I immediately wondered why they would come up with such a simplistic, and meaningless, explanation.


The representatives from college entrance test publishers, the SAT and ACT, also expressed amazement at this explanation. After all, their tests were the two highest stakes tests in the country but suffered from no “Lake Woebegone” psychometrics. I didn’t understand why some of the academics insisted on a simplistic “high stakes” explanation when the obvious problem was cheating. I left the meeting having learned three things. One, neither William Bennett nor Chester Finn knew about the cheating. Two, officials of the SAT and ACT knew all about the cheating but were powerless to stop it. Three, the academicians were aware of the cheating and, for reasons I didn’t understand, wanted “Lake Woebegone” testing to continue.


By that time, I assumed most politicians knew what was going on. While speaking at an April 1990 meeting of the Education Writers Association in Chicago, I confronted another speaker, Governor Bill Clinton, about the flagrantly high - and very politically flattering - test scores in Arkansas. Early in his run for the White House, Clinton promptly returned to Arkansas after learning of a front-page story on Arkansas teacher cheating in the state’s largest newspaper. He responded, “Gosh, we’ll look into that . . .,” and announced plans for improved test security. Endnote The national media did not pick up the story. Whatever he knew at the time, Clinton was hardly alone in benefiting from such scores.


Governors around the country, including George Bush in Texas, found that rapidly rising test scores led to flattering press releases, re-election, and improved chances at higher office. As far as the electorate and the press were concerned, standardized tests were standardized tests. In 1996, Clinton went on to recommend a national achievement test with strict security (a proposal refused by the Republican Congress), so he did try to end “Lake Woebegone” testing.


Unlike Clinton, Bush was not spared national media attention about cheating in Texas. One month before the election of 2000, the Rand Corporation carefully analyzed the “Texas Miracle,” the dramatic gains in Texas school achievement that propelled George Bush to the White House. They compared gains on the Texas test with Texas gains on the National Assessment of Educational Progress (NAEP) and concluded that most of the gains on the Texas test were bogus. Endnote (NAEP is a national test that samples participating school districts using a truly standardized test.) Unlike Clinton, however, Bush has made no effort that I am aware of to stop the cheating in state-run testing programs.


Successful politicians like Bush and Clinton may have learned long ago of the political value of “Lake Woebegone” testing, and either cynically or ignorantly, both built their careers on the backs of illiterate children. Perhaps, like most Americans, both just assume a standardized test is a standardized test. However, public educators, test publishers, and academic testing experts cannot claim ignorance – they are too intimately involved with the cheating.


I was hopeful the first “Lake Woebegone” report would stop the cheating. After all, it made headlines around the country. When nothing changed, I again surveyed all 50 states and published a second report in 1989 with the help of a grant from the Kettering Family Foundation. The “Lake Wobegon” Report: How Public Educators Cheat on Standardized Achievement Tests detailed the extent of the cheating and how to detect it. The second report received even more publicity than the first, with front-page coverage by the Wall Street Journal. Endnote

Surely, policymakers would be outraged at the cheating and realize that American schools would never improve until we had honest tests. I then wrote and distributed Testing Ethics Model Legislation, forlornly hoping state legislators would enact simple laws to stop the cheating. Endnote


No response. However, in 1990 my hopes soared. Sixty Minutes called. They were doing a story about teachers and school administrators cheating on tests, highlighting my second report. I was sure that would do it. Scandals exposed by Sixty Minutes often led to reform. Morley Safer came to my house for filming and brought copies of some “test preparation” materials that CBS News had obtained from various states. He asked me to look them over before filming Teacher is a Cheater.


It hit me like a brick. The “test preparation” materials contained all the answers to the test questions. The same academicians who claimed “high stakes” testing was the problem, and who made money collaborating with test publishers to develop “Lake Woebegone” tests, were making even more money by selling schools test preparation materials. Academic psychometricians at major American universities had side businesses; they provided school districts with crib sheets.


I waited for the fallout from the Sixty Minutes report. In April 1990, Sixty Minutes ran Teacher is a Cheater and reported the tests were fraudulent and that cheating by educators was rampant in American schools. Endnote However, nothing happened: no outrage, no commissions, no hearings, nothing. I could do no more. I resigned from Friends for Education and the organization fell apart. I quickly became involved with two entirely different causes: first, false recovered memories of sexual abuse, and subsequently, widespread vitamin D deficiency. Endnote


My education reform days are painful to remember, mainly because I fear that little has changed and my work may be for naught. Every year, from my California home, I read hundreds of stories online, published in local California newspapers, all about improving test scores in their local schools. The same articles praise California politicians and the California State Department of Education for statewide improvements.


As part of its state-run STAR testing program, the California Department of Education administered a “Lake Woebegone” test, the Stanford-9, from 1998 to 2002, using the same booklets, and the same questions, for five years in a row. Endnote California fourth grade national-percentile-rank reading scores on the Stanford-9 section of the California STAR testing program are instructive:

1998    40

1999    42

2000    45

2001    47

2002    50


Pretty impressive. In five years, California went from perhaps the lowest reading scores in the country to the national average. However, in 2003, California changed to another “Lake Woebegone” test, the California Achievement Test-6 (CAT-6), and scores plummeted when the teachers were presented with unfamiliar test questions. In fact, fourth grade California reading scores were lower in 2003 than in 1998, with a national percentile rank of 39.
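To make the pattern concrete, here is the simple arithmetic on the percentile ranks quoted above (a sketch only, using the published figures):

# California fourth grade reading, national percentile ranks quoted above:
# Stanford-9 (same booklet reused, 1998-2002) versus CAT-6 (new questions, 2003).
stanford9 = {1998: 40, 1999: 42, 2000: 45, 2001: 47, 2002: 50}
cat6_2003 = 39

years = sorted(stanford9)
gains = [stanford9[b] - stanford9[a] for a, b in zip(years, years[1:])]
print("Annual gains while the same questions were reused:", gains)           # [2, 3, 2, 3]
print("Total apparent gain, 1998-2002:", stanford9[2002] - stanford9[1998])  # +10
print("Change when unfamiliar questions arrived in 2003:",
      cat6_2003 - stanford9[2002])                                           # -11

Five years of steady two- and three-point gains disappear, and then some, the first year the questions change.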


Fourth grade Stanford-9 language scores are even more instructive, as language (such as spelling and punctuation) is the easiest subject for cheating.

1998    44

1999    46

2000    50

2001    53

2002    55


Here California went from perhaps the lowest scores in the nation to well above the national average. However, in 2003, when California changed questions by adopting the CAT-6, fourth grade language scores fell below their 1998 levels: 42.


In 2002, California began to emphasize the other component of its STAR testing program: the state-developed, criterion-referenced California Standards Test (CST). Anywhere from 50% to 75% of the questions on the CST are the same from year to year. Fourth grade mean scaled English/Language Arts scores on the CST give a similar, but less dramatic, impression of improving learning in California.

2002    333

2003    339

2004    339

2005    346


Compare the Stanford-9 and CST to the fourth grade NAEP reading scores for California. Endnote Examine the percentage of California children who performed at or above the rudimentary “basic” reading level on NAEP:

1998    48

2002    50

2003    50

2005    50


NAEP reports no significant difference in fourth grade reading average scale scores during the same time. Even when one takes into account that federal No Child Left Behind regulations are now forcing California to exclude fewer low-achieving students from taking the test, NAEP reading “proficiency” in California is still a pitiful 22%. Even more telling is California’s NAEP ranking: NAEP reports that, of the 52 other states and jurisdictions that participated in the 2005 assessment, California performed better than only one.
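Putting the two sets of figures quoted in this article side by side for the overlapping years makes the divergence plain. The two measures are not on the same scale - one is a national percentile rank, the other a percentage of students at or above “basic” - so only the direction and rough size of the change are comparable:

# California fourth grade reading, 1998-2002, using the figures quoted in this
# article; the point is the trend, not the absolute numbers.
stanford9_percentile_rank = {1998: 40, 2002: 50}  # state-administered, reused booklet
naep_at_or_above_basic = {1998: 48, 2002: 50}     # percent at or above NAEP "basic"

print("Stanford-9 change, 1998-2002:",
      stanford9_percentile_rank[2002] - stanford9_percentile_rank[1998])  # +10 points
print("NAEP change, 1998-2002:",
      naep_at_or_above_basic[2002] - naep_at_or_above_basic[1998])        # +2 points

A ten-point climb on the state-administered test corresponds to essentially no movement on the independently administered NAEP.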


(The Thomas B. Fordham Foundation recently came to the same conclusion. They listed California as one of several states with rather dramatic gains on state-controlled “Lake Woebegone” tests, but no progress on NAEP testing. Endnote

The same study found Tennessee educators did the best job. In 2005, 88 percent of Tennessee eighth-graders tested proficient on their “Lake Woebegone” state reading test, while only 26 percent were proficient on NAEP. That is, virtually all Tennessee students are proficient in Tennessee reading, but very few in American reading!)


California, the largest state in the union, continues to conduct “Lake Woebegone” testing that has changed little since the late 1980s. Although I was not able to learn whether independent academics are still supplying “test preparation materials” to California schools, it matters little. California teachers tell me they really don’t have to memorize the test questions anymore; the curriculum materials supplied by the state are laced, repeatedly, with the test questions.


Nor do I see that national legislation has made any difference. As I understand it, No Child Left Behind (NCLB) places no meaningful restrictions on cheating on the myriad state tests used to meet NCLB criteria. Recent press articles have detailed cheating scandals in several states, blaming them on the “high stakes” testing environment created by NCLB, without noting that such cheating has been regularly reported by the media for the last twenty years. Endnote, Endnote, Endnote, Endnote, Endnote

  

When NAEP scores are much lower than state scores, state superintendents of schools never mention the cheating. Instead, they simply explain that they have not aligned their curriculum with the NAEP test. As California Superintendent of Schools Jack O’Connell recently explained to the Los Angeles Times, “Results on our statewide tests, which are aligned to our rigorous standards, indicate that a focus on high expectations is leading to steady gains in student achievement.” Endnote That is, fourth grade students in California are showing steady improvement in California reading but not in American reading! No doubt, O’Connell will be as successful at hoodwinking Californians in the future as his predecessors have been in the past.


In their lives, my two young daughters will compete with children from other countries who have been ruthlessly educated. My daughters need a broad-based education driven by an incorruptible test assessing the widest swath of achievement. Such testing is commonplace in the rest of the world. As someone once said, “If the educators want to know what will be on the test, tell them the English section will have lots of letters and the math section lots of numbers.”


Instead of preparing children with a broad and challenging curriculum, American schools focus on teaching a narrow curriculum aligned with corrupted tests. These tests are still dedicated to mollifying parents, glorifying educators, promoting political careers, and enriching publishers and their academic consultants.


“High stakes” tests are not the problem. Truly “high stakes” public school tests, similar to the ACT, will rapidly reform American schools. The current tests are neither standardized, nor honest, nor “high stakes.” It is time we join the rest of the world with truly “high stakes” standardized national tests, broad tests of achievement, ethically designed and honestly administered.




 


 

Citation: Cannell, J. (2005). “Lake Wobegon” Twenty Years Later. Independent Education Review, 1(3). Retrieved [date] from http://www.independenteducationreview.org/Review/Articles/v1n1/v1n3.pdf