Essay-Grading Software Regarded As Time-Saving Tool
Teachers are looking to essay-grading software to critique student writing, but critics point to serious flaws in the technology
Jeff Pence knows the best way for his 7th grade English students to improve their writing is to do more of it. But with 140 students, it can take him at least two weeks to grade a batch of their essays.
So the Canton, Ga., middle school teacher uses an online, automated essay-scoring program that allows students to get feedback on their writing before handing in their work.
“It doesn’t tell them what to do, but it points out where issues may exist,” said Mr. Pence, who says the Pearson WriteToLearn program engages the students almost like a game.
With the technology, he has been able to assign an essay a week and individualize instruction efficiently. “I feel it is pretty accurate,” Mr. Pence said. “Is it perfect? No. But by the time I reach that 67th essay, I’m not real accurate, either. As a team, we are pretty good.”
With the push for students to become better writers and meet the new Common Core State Standards, teachers are eager for new tools to help out. Pearson, which is based in London and New York City, is one of several companies upgrading its technology in this space, also referred to as artificial intelligence, AI, or machine-reading. New assessments designed to test deeper learning and move beyond multiple-choice answers are also fueling the demand for software to help automate the scoring of open-ended questions.
Critics contend the software does little more than count words and therefore cannot replace human readers, so researchers are working hard to improve the software algorithms and counter the naysayers.
While the technology has been developed primarily by companies in proprietary settings, there has been a new focus on improving it through open-source platforms. New players in the market, such as the startup venture LightSide and edX, the nonprofit enterprise founded by Harvard University and the Massachusetts Institute of Technology, are openly sharing their research. Last year, the William and Flora Hewlett Foundation sponsored an open-source competition to spur innovation in automated writing assessments that attracted commercial vendors and teams of scientists from around the world. (The Hewlett Foundation supports coverage of “deeper learning” issues in Education Week.)
“We are seeing a lot of collaboration among competitors and individuals,” said Michelle Barrett, the director of research systems and analysis for CTB/McGraw-Hill, which produces the Writing Roadmap for use in grades 3-12. “This unprecedented collaboration is encouraging a lot of discussion and transparency.”
Mark D. Shermis, an education professor at the University of Akron, in Ohio, who supervised the Hewlett contest, said the meeting of top public and commercial researchers, along with input from many different fields, may help boost the performance of the technology. The recommendation from the Hewlett trials is that the automated software be used as a “second reader” to monitor the performance of human readers or to provide additional information about writing, Mr. Shermis said.
“The technology can’t do everything, and nobody is claiming it can,” he said. “But it is a technology that has a promising future.”
The first automated essay-scoring systems date back to the early 1970s, but not much progress was made until the 1990s, with the advent of the Internet and the ability to store data on hard-disk drives, Mr. Shermis said. More recently, improvements have been made in the technology’s ability to evaluate language, grammar, mechanics, and style; detect plagiarism; and provide quantitative and qualitative feedback.
The computer programs assign grades to writing samples, sometimes on a scale of 1 to 6, in a number of areas, from word choice to organization. Some products give feedback to help students improve their writing. Others can grade short answers for content. The technology can be used in various ways, on formative exercises or summative tests, to save time and money.
The Educational Testing Service first used its e-rater automated-scoring engine for a high-stakes exam in 1999, on the Graduate Management Admission Test, or GMAT, according to David Williamson, a senior research director for assessment innovation at the Princeton, N.J.-based company. It also uses the technology in its Criterion Online Writing Evaluation Service for grades 4-12.
Over the years, the capabilities have changed substantially, evolving from simple rule-based coding to more sophisticated software systems. And statistical techniques from computational linguistics, natural language processing, and machine learning have helped develop better ways of identifying certain patterns in writing.
But challenges remain in settling on a universal definition of good writing, and in training a computer to understand nuances such as “voice.”
Over time, with larger sets of data, more experts can identify nuanced aspects of writing and improve the technology, said Mr. Williamson, who is encouraged by the new era of openness about the research.
“It is a hot topic,” he said. “There are a lot of researchers in academia and industry looking at this, and that is a good thing.”
High-Stakes Testing
Along with using the technology to improve writing in the classroom, West Virginia employs automated software on its statewide annual reading and language arts assessments for grades 3-11. The state has worked with CTB/McGraw-Hill to customize its product and train the engine, using thousands of papers it has collected, to score the students’ writing in response to a specific prompt.
“We are confident the scoring is very accurate,” said Sandra Foster, the lead coordinator of assessment and accountability in the West Virginia education office, who acknowledged facing skepticism from teachers. But many were won over, she said, after a comparability study showed that a trained teacher paired with the scoring engine performed better than two trained teachers. Training involved a few hours on how to apply the writing rubric. Plus, writing scores have gone up since the technology was implemented.
Automated essay scoring is also used on the ACT Compass exams for community college placement, the new Pearson General Educational Development tests for a high school equivalency diploma, and other summative tests. But it has not yet been embraced by the College Board for the SAT essay or by the rival ACT college-entrance exam.
The two consortia delivering the new assessments under the Common Core State Standards are reviewing machine-grading but have not committed to it.
Jeffrey Nellhaus, the director of policy, research, and design for the Partnership for Assessment of Readiness for College and Careers, or PARCC, wants to know whether the technology will be a good fit for its assessment, and the consortium will be conducting a study based on writing from the first field test to see how the scoring engine performs.
Likewise, Tony Alpert, the chief operating officer for the Smarter Balanced Assessment Consortium, said his consortium will evaluate the technology carefully.
At the new company LightSide, in Pittsburgh, founder Elijah Mayfield said his data-driven approach to automated writing assessment sets itself apart from other products on the market.
“What we are trying to do is build a system that, instead of correcting errors, finds the strongest and weakest parts of the writing and where best to improve,” he said. “It is acting more as a revisionist than a textbook.”
The new software, which is available on an open-source platform, has been piloted this spring in districts in Pennsylvania and New York.
In higher education, edX has just introduced automated software to grade open-response questions for use by teachers and professors through its free online courses. “One of the challenges in the past was that the code and algorithms were not public. They were seen as black magic,” said company President Anant Agarwal, noting the technology is in an experimental stage. “With edX, we put the code into open source where you can see how it is done to help us improve it.”
Still, critics of essay-grading software, such as Les Perelman, want academic researchers to have broader access to vendors’ products to evaluate their merit. Now retired, the former director of the MIT Writing Across the Curriculum program has studied some of the devices and managed to get a high score from one with an essay of gibberish.
“My main concern is that it doesn’t work,” he said. While the technology may have some limited use in grading short answers for content, it relies too much on counting words, and reading an essay requires a deeper level of analysis best done by a human, contended Mr. Perelman.