EXP: Development of Human Language Technologies to Improve Disciplinary Writing and Learning through Self-Regulated Revising


PIs: Rebecca Hwa, Diane Litman, Amanda Godley
University of Pittsburgh

Writing and revising are essential parts of learning, yet many college students graduate without demonstrating improvement or mastery of academic writing. This project explores the feasibility of improving students’ academic writing through a revision environment that integrates natural language processing methods, best practices in data visualization and user interfaces, and current pedagogical theories. The environment will support and encourage students to develop self-regulation skills that are necessary for writing and revising, including goal-setting, selection of writing strategies, and self-monitoring of progress. As a learning technology, the environment can be applied on a large scale, thereby improving the writing of diverse student populations, including English learners. Additionally, the project’s multidisciplinary training of graduate students is focused on increasing diversity in cyberlearning research and development.

Three stages of investigation are planned. First, to analyze data on students’ revision behaviors, a series of experiments is conducted to study interactions between students and variations of the revision writing environment. Second, the collected data forms the gold standard for developing an end-to-end system that automatically extracts revisions between student drafts and identifies the goal of each revision. Multiple extraction algorithms are considered, including phrasal alignment based on semantic similarity metrics and deep learning approaches. To identify the goal of a revision, a supervised classifier is trained on the gold standard. A diverse set of features and representations of the identified goals (e.g., granularity, scope) are explored. In addition to the “extract-then-classify” pipeline, an alternative joint sequence labeling model is developed. Sequence labeling is used to recognize revision goals, and the sequences are mutated to generate candidate corrections to the sentence alignments used for revision extraction. The writing environment is iteratively refined, with interface prototyping informed by frequent user studies. Third, a complete end-to-end system that integrates the most successful component models is deployed in college-level writing classes. Student progress is tracked across multiple assignments.
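
As a rough illustration of the extraction step described above, the following minimal sketch aligns sentences between two drafts and reports each difference as a revision. It is not the project's method: it substitutes simple word overlap (Python's standard `difflib`) for the semantic similarity metrics and deep learning approaches under consideration, and the function names, the greedy matching strategy, the 0.5 threshold, and the sample sentences are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Word-overlap ratio in [0, 1]; a stand-in for a semantic similarity metric."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def extract_revisions(old, new, threshold=0.5):
    """Greedily align each new sentence to its most similar unused old sentence.

    Returns a list of (operation, old_sentence, new_sentence) tuples, where
    operation is "modify", "add", or "delete". The threshold is hypothetical.
    """
    revisions = []
    used = set()  # indices of old sentences that are already aligned
    for new_sent in new:
        best_i, best_score = None, 0.0
        for i, old_sent in enumerate(old):
            if i in used:
                continue
            score = similarity(old_sent, new_sent)
            if score > best_score:
                best_i, best_score = i, score
        if best_i is not None and best_score >= threshold:
            used.add(best_i)
            if old[best_i] != new_sent:  # aligned but changed
                revisions.append(("modify", old[best_i], new_sent))
        else:  # nothing similar enough in the old draft
            revisions.append(("add", None, new_sent))
    # Old sentences that were never aligned count as deletions.
    for i, old_sent in enumerate(old):
        if i not in used:
            revisions.append(("delete", old_sent, None))
    return revisions


draft1 = [
    "Revision improves writing",
    "Students set goals before revising",
    "This sentence is filler",
]
draft2 = [
    "Revision improves writing",
    "Students set clear goals before revising",
    "Feedback supports self-monitoring",
]

for operation, before, after in extract_revisions(draft1, draft2):
    print(operation, "|", before, "->", after)
```

In an extract-then-classify pipeline, each tuple produced here would then be passed to a trained classifier that assigns a revision goal. That step is omitted because it depends on the annotated gold standard.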
