Custom Grading Script Examples

Complete, ready-to-use examples for building custom autograding scripts in Vocareum labs, covering test-based grading, AI-powered grading, and hybrid approaches.

Written by Mary Gordanier

For Teachers and Admins

This guide covers Vocareum Notebook, VS Code, and JupyterLab assignments. For setup instructions and an explanation of how the grading environment works, see Using Custom Grading Scripts in Vocareum Labs. For Vocareum Notebook-specific context — including when to use a custom script vs. the built-in nbgrader-based autograder — see Using Custom Grading Scripts in Vocareum Notebook.

The key facts you need to use the examples below:

  • Student files are at /voc/work/ ($VOC_HOME_DIR) — read submissions from here

  • Your scripts and support files go in /voc/scripts/ — reference them with $SCRIPT_DIR

  • Write scores to $vocareumGradeFile (CSV: Criterion Name, Score)

  • Write feedback to $vocareumReportFile (free-form text or HTML)

File Layout

Every autograding setup requires at least one file, grade.sh. The other files are optional:

| File | Location | Required | Purpose |
|---|---|---|---|
| grade.sh | /voc/scripts/grade.sh | Yes | Shell entry point Vocareum calls |
| grade.py | /voc/scripts/grade.py | No | Python grading logic (recommended for complex grading) |
| grade_prompt.txt | /voc/scripts/grade_prompt.txt | No | AI grading prompt (for AI-based grading) |

All files under /voc/scripts/ are placed via Configure Workspace in the assignment settings.


The Grade File and Report File

Grade File ($vocareumGradeFile)

A CSV file where each line maps a rubric criterion to a score:

<Rubric Criterion Name>,<Score>

Rules:

  • Criterion names must exactly match (case-sensitive) the names defined in your Part rubric settings.

  • One criterion per line.

  • Scores are numeric (integer or decimal), up to the max score defined for that criterion.

Example (rubric has "Correctness" with max 7 and "Style" with max 3):

Correctness, 5
Style, 3
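
The rules above can be enforced in code before anything is written. The following is a minimal sketch, not part of Vocareum: the helper name write_grade_file and the max_scores dictionary are hypothetical, and clamping scores to the range 0 to max is an assumption about what you want rather than documented platform behavior.

```python
def write_grade_file(path, scores, max_scores):
    """Write one 'Criterion, Score' line per rubric criterion.

    scores:     {criterion name: earned score}
    max_scores: {criterion name: max score}, copied by hand from the rubric
                (hypothetical structure, not provided by Vocareum)
    """
    # Criterion names are case-sensitive, so reject anything not in the rubric.
    unknown = sorted(set(scores) - set(max_scores))
    if unknown:
        raise ValueError(f"Criteria not in rubric: {unknown}")

    lines = []
    for name, max_score in max_scores.items():
        score = scores.get(name, 0)               # missing criterion scores 0
        score = max(0, min(score, max_score))     # assumption: clamp to [0, max]
        lines.append(f"{name}, {score}")

    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
```

With the rubric from the example, write_grade_file(path, {"Correctness": 5, "Style": 4}, {"Correctness": 7, "Style": 3}) writes "Correctness, 5" and "Style, 3", because the Style score is capped at its maximum.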

Report File ($vocareumReportFile)

Free-form text displayed to the student as feedback. Anything you write here appears in their submission report.

There are two patterns for producing report content — choose one per script:

Pattern 1: Print-based. Write report content to stdout using print() statements. Vocareum automatically appends all stdout from grade.sh to the report file. This is the simpler approach and works well when all script output is student-facing.

Pattern 2: File-based with VOC_NO_REPORT_OUTPUT. Write VOC_NO_REPORT_OUTPUT as the first line of the report file, then write all report content explicitly using file operations. Vocareum suppresses stdout when this line is present, giving you full control over what students see. Use this pattern when your script produces internal status output (such as progress messages or debug logging) that you do not want students to see.

Avoid mixing both patterns in the same script — if VOC_NO_REPORT_OUTPUT is present, print statements will not reach the student.
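
As an illustration of Pattern 2, the sketch below writes the marker line first and then the student-facing content. The helper name write_report is hypothetical. The only platform behavior it relies on is the one described above: the report file starts with VOC_NO_REPORT_OUTPUT.

```python
SUPPRESS_STDOUT_MARKER = "VOC_NO_REPORT_OUTPUT"


def write_report(report_path, lines):
    """Write a file-based report (Pattern 2).

    The marker must be the first line of the report file. Everything
    after it is the content written explicitly for the student.
    """
    with open(report_path, "w") as f:
        f.write(SUPPRESS_STDOUT_MARKER + "\n")
        f.write("\n".join(lines) + "\n")
```

In a grade.py that receives the report path as sys.argv[2], a call such as write_report(sys.argv[2], ["Passed 4/5 tests"]) would replace the final open(report_file, "w") block. Any print() calls in the script then serve as internal logging only.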


Test-Based Grading

Test-based grading runs the student's code against known inputs and checks the outputs. This is the most common and deterministic approach.
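
When the student's program reads input and prints a result instead of exposing a function, the same idea can be applied by running the program as a subprocess. The sketch below is one possible approach under stated assumptions: the helper name run_and_compare, the 5-second timeout, and the whitespace-stripping comparison are all illustrative choices, not Vocareum requirements.

```python
import subprocess
import sys


def run_and_compare(script_path, stdin_text, expected_stdout, timeout_s=5):
    """Run a Python script on known input and compare its stdout.

    Returns (passed, detail), where detail is text suitable for a report.
    """
    try:
        proc = subprocess.run(
            [sys.executable, script_path],
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=timeout_s,   # stop submissions that loop forever
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout_s}s"

    if proc.returncode != 0:
        return False, f"exited with code {proc.returncode}: {proc.stderr.strip()}"

    actual = proc.stdout.strip()
    expected = expected_stdout.strip()
    if actual == expected:
        return True, actual
    return False, f"got {actual!r}, expected {expected!r}"
```

Running the submission in a separate process keeps a crash or an infinite loop in student code from taking down the grader itself.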


Simple Output Comparison

Assignment: Students write a function add(a, b) in submit.py that returns the sum of two numbers.

Rubric: One criterion — Score with max 10.

/voc/scripts/grade.sh

#!/bin/bash

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

python3 "$SCRIPT_DIR/grade.py" "$vocareumGradeFile" "$vocareumReportFile"

/voc/scripts/grade.py

import sys
import importlib.util

grade_file = sys.argv[1]
report_file = sys.argv[2]

# Load the student's module
spec = importlib.util.spec_from_file_location("student", "submit.py")
student = importlib.util.module_from_spec(spec)

try:
    spec.loader.exec_module(student)
except Exception as e:
    # Student code failed to load: give 0 and report the error
    with open(grade_file, "w") as f:
        f.write("Score, 0\n")

    with open(report_file, "w") as f:
        f.write(f"ERROR: Could not load submit.py\n{e}\n")

    sys.exit(0)

# Define test cases: (a, b, expected_result)
test_cases = [
    (1, 2, 3),
    (0, 0, 0),
    (-1, 1, 0),
    (100, 200, 300),
    (-5, -10, -15),
]

passed = 0
total = len(test_cases)
report_lines = []

for i, (a, b, expected) in enumerate(test_cases, 1):
    try:
        result = student.add(a, b)

        if result == expected:
            passed += 1
            report_lines.append(f"Test {i}: add({a}, {b}) = {result} PASSED")
        else:
            report_lines.append(f"Test {i}: add({a}, {b}) = {result}, expected {expected} FAILED")
    except Exception as e:
        report_lines.append(f"Test {i}: add({a}, {b}) raised {type(e).__name__}: {e} ERROR")

# Scale the pass count to the rubric maximum of 10
score = round(passed / total * 10)

with open(grade_file, "w") as f:
    f.write(f"Score, {score}\n")

with open(report_file, "w") as f:
    f.write(f"Passed {passed}/{total} tests\n\n")
    f.write("\n".join(report_lines))

Unit Testing with Multiple Criteria

Assignment: Students implement a Calculator class in submit.py with add, subtract, multiply, and divide methods.

Rubric: Four criteria — Addition (max 3), Subtraction (max 3), Multiplication (max 2), Division (max 2).

/voc/scripts/grade.sh

#!/bin/bash

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

python3 "$SCRIPT_DIR/grade.py" "$vocareumGradeFile" "$vocareumReportFile"

/voc/scripts/grade.py

import sys
import importlib.util

grade_file = sys.argv[1]
report_file = sys.argv[2]

# Load student module
spec = importlib.util.spec_from_file_location("student", "submit.py")
student = importlib.util.module_from_spec(spec)

try:
    spec.loader.exec_module(student)
    calc = student.Calculator()
except Exception as e:
    # Module or class could not be loaded: give 0 on every criterion
    with open(grade_file, "w") as f:
        f.write("Addition, 0\nSubtraction, 0\nMultiplication, 0\nDivision, 0\n")

    with open(report_file, "w") as f:
        f.write(f"ERROR: Could not create Calculator instance\n{e}\n")

    sys.exit(0)


def run_tests(method_name, test_cases, max_score):
    """Run tests for a single method. Returns (score, report_lines)."""
    passed = 0
    lines = [f"--- {method_name} ---"]
    method = getattr(calc, method_name, None)

    if method is None:
        lines.append(f" Method '{method_name}' not found")
        return 0, lines

    for args, expected in test_cases:
        try:
            result = method(*args)

            # Tolerance-based comparison so float results such as 7/3 pass
            if abs(result - expected) < 1e-9:
                passed += 1
                lines.append(f" {method_name}{args} = {result} PASSED")
            else:
                lines.append(f" {method_name}{args} = {result}, expected {expected} FAILED")
        except Exception as e:
            lines.append(f" {method_name}{args} raised {type(e).__name__}: {e} ERROR")

    score = round(passed / len(test_cases) * max_score)
    lines.append(f" Score: {score}/{max_score}")

    return score, lines


# Define tests per method
results = {}
report = []

s, r = run_tests("add", [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)], max_score=3)
results["Addition"] = s
report.extend(r)

s, r = run_tests("subtract", [((5, 3), 2), ((0, 0), 0), ((1, 5), -4)], max_score=3)
results["Subtraction"] = s
report.extend(r)

s, r = run_tests("multiply", [((3, 4), 12), ((0, 5), 0)], max_score=2)
results["Multiplication"] = s
report.extend(r)

s, r = run_tests("divide", [((10, 2), 5), ((7, 3), 7/3)], max_score=2)
results["Division"] = s
report.extend(r)

# Write grades
with open(grade_file, "w") as f:
    for criterion, score in results.items():
        f.write(f"{criterion}, {score}\n")

# Write report
total = sum(results.values())
with open(report_file, "w") as f:
    f.write(f"Total: {total}/10\n\n")
    f.write("\n".join(report))
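
Both examples scale the pass count with round(passed / total * max_score). Python's built-in round() rounds exact halves to the nearest even number, so passing 1 of 2 tests on a 5-point criterion gives round(2.5), which is 2 rather than 3. The test counts in these examples never produce a half, but other combinations can. The sketch below shows one alternative, assuming you prefer half-up rounding. The helper name proportional_score and its places parameter are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def proportional_score(passed, total, max_score, places=0):
    """Scale passed/total to max_score, rounding halves up.

    places is the number of decimal places to keep (0 gives an integer).
    Returns a Decimal, which formats cleanly in an f-string.
    """
    if total <= 0:
        return Decimal(0)
    raw = Decimal(passed) * Decimal(max_score) / Decimal(total)
    quantum = Decimal(10) ** -places   # 1, 0.1, 0.01, ...
    return raw.quantize(quantum, rounding=ROUND_HALF_UP)
```

Because the grade file accepts decimal scores, calling the helper with places=1 is another option when you would rather keep partial credit than round to a whole number.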