Wednesday, June 20, 2012

Grading Papers, Jobs and a Passionate Heart


What exactly makes a great essay? And if the nuance and detail, the commitment to thought and the ability to express oneself well are particularly important criteria, the essence of excellence, who are the evaluators, the “graders,” with a sufficiently objective eye to do the job right? With applications to post-secondary education riding in significant part on writing skills, from the SAT essay section to the individualized efforts required by most institutions of higher learning, it is obviously important that applicants know that they are treated equally, free from bias and the foibles of subjectivity that logic would dictate are inherent in those engaged in such evaluations, however professional they might be.

On the other hand, there is an innate revulsion at the idea of an automaton grading such essays, a fear that the shades of excellence would be lost in a mechanical heart, a computer analysis of the results. To date, the primary use of computer analytics in evaluating student applications has been limited to finding technical mistakes and to identifying, through computer scans and comparisons with the vast pool of essays on file, those who submit work purchased online or from self-anointed essay mills with deep catalogs of past successes. But what about evaluating those writings that are not plagiarized? How do we manage that flood? In a world of too many students, too many applicants, too many competitors and not enough teachers, professors or graders (and with an impaired economy making increases in efficiency mandatory), there appears to be another way.

OK, as I present this alternative, I have to admit to my own squeamishness at the process. My writings today aren’t really fodder for such analyses, and I am not entering a writing competition with prize money at stake, so this discomfort probably stems from my time in undergraduate and graduate studies, as well as my applications to esteemed institutions of higher learning. I wanted to impress those charged with my evaluation enough to move forward in life, to achieve my goals and to answer the pressures of peers and, OK, I admit it, my parents. I am thinking back and asking, “What if…?” And having since risen to the level of making such evaluations myself as a professor, am I marginalized by this new process?

Here’s the premise: complex algorithms have been developed to analyze not just sentence structure, grammar and punctuation, but substantive text as well. Not only can these computer analytics address technical excellence, the creators claim, but they are equally capable of addressing the underlying message and research as well. The June 9th New York Times summarizes this seminal experiment: “This spring, the William and Flora Hewlett Foundation sponsored a competition to see how well algorithms submitted by professional data scientists and amateur statistics wizards could predict the scores assigned by human graders. The winners were announced [in May] — and the predictive algorithms were eerily accurate.

“The competition was hosted by Kaggle, a Web site that runs predictive-modeling contests for client organizations — thus giving them the benefit of a global crowd of data scientists working on their behalf. The site says it ‘has never failed to outperform a pre-existing accuracy benchmark, and to do so resoundingly.’

“Kaggle’s tagline is ‘We’re making data science a sport.’ Some of its clients offer sizable prizes in exchange for the intellectual property used in the winning models. For example, the Heritage Health Prize (‘Identify patients who will be admitted to a hospital within the next year, using historical claims data’) will bestow $3 million on the team that develops the best algorithm… The essay-scoring competition that just concluded offered a mere $60,000 as a first prize, but it drew 159 teams. At the same time, the Hewlett Foundation sponsored a study of automated essay-scoring engines now offered by commercial vendors. The researchers found that these produced scores effectively identical to those of human graders.

“Barbara Chow, education program director at the Hewlett Foundation, says: ‘We had heard the claim that the machine algorithms are as good as human graders, but we wanted to create a neutral and fair platform to assess the various claims of the vendors. It turns out the claims are not hype.’ … If the thought of an algorithm replacing a human causes queasiness, consider this: In states’ standardized tests, each essay is typically scored by two human graders; machine scoring replaces only one of the two. And humans are not necessarily ideal graders: they provide an average of only three minutes of attention per essay, Ms. Chow says.”
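For readers curious what “predicting the scores assigned by human graders” might look like in numbers, here is a minimal sketch in Python. The Times piece does not say which measure the competition or the study used, so the two measures below (simple exact agreement and quadratic weighted kappa, a common way to compare two graders on a numeric scale) are my assumptions, and the scores are invented purely for illustration.

```python
from collections import Counter


def exact_agreement(scores_a, scores_b):
    """Fraction of essays on which both graders gave the same score."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)


def quadratic_weighted_kappa(scores_a, scores_b):
    """Agreement corrected for chance, penalizing big disagreements more.

    1.0 means perfect agreement; 0.0 means no better than chance.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")
    n = len(scores_a)

    # Average squared gap between the two graders on the same essay.
    observed = sum((a - b) ** 2 for a, b in zip(scores_a, scores_b)) / n

    # Average squared gap we would expect if the two graders' scores
    # were paired at random (each keeps its own score distribution).
    count_a, count_b = Counter(scores_a), Counter(scores_b)
    expected = sum(
        count_a[i] * count_b[j] * (i - j) ** 2
        for i in count_a
        for j in count_b
    ) / (n * n)

    if expected == 0:
        # Every essay got the same single score from both graders.
        return 1.0
    return 1.0 - observed / expected


if __name__ == "__main__":
    # Hypothetical scores on a 1-6 scale, not real competition data.
    human = [4, 3, 5, 2, 6, 4, 3, 5]
    machine = [4, 3, 4, 2, 6, 5, 3, 5]
    print("exact agreement:", exact_agreement(human, machine))
    print("quadratic weighted kappa:", quadratic_weighted_kappa(human, machine))
```

Note what a measure like this does and does not capture: it rewards a machine for landing on the same number a human did, and says nothing about whether either grader actually read for meaning.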

And what about introducing these processes into daily classroom work within the schools themselves? So far, the computers are just looking at the simple stuff: “Teachers would still judge the content of the essays. That’s crucial, because it’s been shown that students can game software by feeding in essays filled with factual nonsense that a human would notice instantly but software could not.” (NY Times.) At least not yet.

With the cost-per-student-per-year for classroom essay evaluation through computer algorithms expected to drop from the current $10-$20, the possibility of replacing (firing?) human graders looms large. But perhaps what bothers me more from within is the temptation (the excuse?) to use such systems to help manage larger classrooms, further depersonalizing the educational experience. And if my own academic experiences are any measure, I wouldn’t be here today without some rather wonderful and oft-repeated one-on-one time with some of the finest minds in the land, frequently initiated by the essays I submitted. I’m wondering: what’s your feeling about this process, and what boundaries, if any, do you believe should be placed around it? Or is the practice simply going to expand anyway in a world of budget deficits and limited resources?

I’m Peter Dekom, and if you haven’t figured it out by now, my most fundamental belief is that a nation without a solid educational system for its young has absolutely no future in a modern world.
