Open source prototype

Automatic Essay Grading

NLP support for evaluating English learner essays

An NLP and machine-learning project for scoring English Language Learner essays across cohesion, syntax, vocabulary, phraseology, grammar, and conventions.

Year: 2023
GitHub: kmnis/automatic-essay-grading

Highlights

Writing assessment workflow for 3,911 pre-scored ELL essays across six analytic measures.
Preprocessing, exploratory analysis, data mining, topic modeling, and ML notebooks.
Notebook-based evaluator app that produces score-oriented feedback and essay analysis.

Automatic Essay Grading is an NLP project for evaluating essays written by English Language Learners. The goal is not just to assign a single score, but to reason across multiple writing dimensions that teachers already use when assessing proficiency.

The project uses a dataset of argumentative essays from 8th-12th grade students. Each essay has scores for cohesion, syntax, vocabulary, phraseology, grammar, and conventions, making it a useful benchmark for multi-dimensional writing assessment.

What The Project Does

Processes and analyzes student essay text.
Models six writing-proficiency dimensions rather than a single overall grade.
Includes exploratory data analysis, preprocessing, data mining, and ML notebooks.
Uses topic modeling experiments to understand essay patterns.
Provides a notebook-based evaluator app for score-oriented feedback and analysis.

Dataset

The dataset contains 3,911 essays written by English Language Learners. The essays were pre-scored by teachers across six analytic measures, with scores ranging from 1 to 5 in half-point increments.
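The score scale described above can be made concrete with a small sketch. A regression model usually emits arbitrary floats, so predictions often need to be mapped back onto the 1 to 5 half-point grid. The helper name `snap_to_scale` and the tie-breaking rule (ties round up) are illustrative assumptions, not something taken from the repository.

```python
import math

# Rubric bounds and step size, as described in the dataset section.
MIN_SCORE, MAX_SCORE, STEP = 1.0, 5.0, 0.5

# All valid rubric values: 1.0, 1.5, ..., 5.0 (nine in total).
VALID_SCORES = [
    MIN_SCORE + i * STEP
    for i in range(int((MAX_SCORE - MIN_SCORE) / STEP) + 1)
]


def snap_to_scale(raw: float) -> float:
    """Clamp a raw prediction to [1, 5] and round it to the nearest half point.

    Hypothetical helper: ties round up, which is an assumption made
    for this sketch rather than a documented project rule.
    """
    clamped = max(MIN_SCORE, min(MAX_SCORE, raw))
    return math.floor(clamped / STEP + 0.5) * STEP


if __name__ == "__main__":
    for raw in (0.2, 2.74, 3.26, 5.7):
        print(raw, "->", snap_to_scale(raw))
```

Whether snapping helps depends on how predictions are used: it keeps displayed scores on the rubric's scale, but it can slightly increase error when measured against continuous metrics.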

This structure makes the task more nuanced than single-score essay grading. An essay may show good vocabulary but weaker grammar, or strong cohesion but limited phraseology. The project is organized around preserving that multi-axis view of writing proficiency.
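One way to keep the multi-axis view during evaluation is to report error per dimension and only then average. The sketch below computes a root-mean-square error for each of the six measures plus their mean. The choice of metric and the function names `per_dimension_rmse` and `mean_columnwise_rmse` are assumptions for illustration; the source does not state which metric the notebooks use.

```python
import math

# The six analytic measures named in the project description.
DIMENSIONS = [
    "cohesion", "syntax", "vocabulary",
    "phraseology", "grammar", "conventions",
]


def per_dimension_rmse(actual, predicted):
    """Return {dimension: RMSE} for two equal-length lists of score dicts."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    result = {}
    for dim in DIMENSIONS:
        squared = [(a[dim] - p[dim]) ** 2 for a, p in zip(actual, predicted)]
        result[dim] = math.sqrt(sum(squared) / len(squared))
    return result


def mean_columnwise_rmse(actual, predicted):
    """Average the per-dimension RMSE values into one summary number."""
    per_dim = per_dimension_rmse(actual, predicted)
    return sum(per_dim.values()) / len(per_dim)


if __name__ == "__main__":
    # Made-up scores for two essays; only grammar is mispredicted.
    actual = [{d: 3.0 for d in DIMENSIONS}, {d: 4.0 for d in DIMENSIONS}]
    predicted = [dict(a) for a in actual]
    for p in predicted:
        p["grammar"] -= 1.0
    print(per_dimension_rmse(actual, predicted))
    print(mean_columnwise_rmse(actual, predicted))
```

Reporting the per-dimension values alongside the mean makes it visible when a model handles vocabulary well but struggles with grammar, which a single averaged number would hide.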

Modeling Workflow

The repository separates preprocessing, visualization, modeling, and the evaluator app. The modeling notebooks explore machine-learning and NLP approaches, while the data mining notebook includes topic modeling experiments with BERTopic.

The app layer demonstrates how essay text can be turned into feedback and analysis, rather than leaving the work as notebook-only modeling output.
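As a rough illustration of how per-dimension scores could become feedback, the sketch below orders the six measures from weakest to strongest and flags those under a cutoff. The function `build_feedback`, the `focus_below` threshold of 3.0, and the label wording are all hypothetical; they do not describe the actual evaluator app.

```python
# The six analytic measures named in the project description.
DIMENSIONS = [
    "cohesion", "syntax", "vocabulary",
    "phraseology", "grammar", "conventions",
]


def build_feedback(scores, focus_below=3.0):
    """Return feedback lines ordered from weakest to strongest dimension.

    `focus_below` is an assumed cutoff for this sketch, not a value
    taken from the project.
    """
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    # sorted() is stable, so tied dimensions keep their rubric order.
    ordered = sorted(DIMENSIONS, key=lambda d: scores[d])
    lines = []
    for dim in ordered:
        label = "focus area" if scores[dim] < focus_below else "on track"
        lines.append(f"{dim}: {scores[dim]:.1f} ({label})")
    return lines


if __name__ == "__main__":
    # Made-up scores for a single essay.
    example = {
        "cohesion": 3.5, "syntax": 3.0, "vocabulary": 4.0,
        "phraseology": 2.5, "grammar": 2.0, "conventions": 3.0,
    }
    print("\n".join(build_feedback(example)))
```

Listing the weakest dimensions first keeps the output oriented toward what a learner or teacher might address next, rather than toward a single summary grade.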

Example Workflows

The included screenshots show an evaluator interface, essay scoring output, and essay analysis. These examples help make the project concrete: the system is meant to support writing assessment workflows, not simply train a model in isolation.

Limitations

Automated writing assessment should be used carefully. Scores can help triage or support feedback, but they should not replace teacher judgment, especially for English Language Learners. Bias, prompt variation, rubric alignment, and explainability all matter in a real educational setting.
