AI and Marking: A Radical Change?

The New Zealand government has recently set off a lively debate by proposing to outsource much of the marking of NCEA assessments to artificial intelligence (AI). Education Minister Erica Stanford argues that this would ease teacher workload, speed up assessment cycles, and make major reforms to NCEA possible. In her words, New Zealand is already “world-leading” in using AI for marking and digital exams.

At first glance, the promise is enticing. I often work late into the night marking piles of assessments. If an algorithm could shoulder this load while maintaining fairness, perhaps it would free teachers to focus on planning lessons, mentoring students, and reclaiming their evenings. Yet as with any bold innovation, the risks are as significant as the benefits, and the proposal deserves careful scrutiny, particularly when viewed through the values of Te Ao Māori, which underpin education in New Zealand, and in light of the central role that relationships play in learning.

What’s on the Table?

The Ministry of Education and NZQA have already begun using AI for assessment on a small scale. In 2025, around 60,000 Year 10 literacy writing responses were marked initially by an AI system. Human markers then reviewed about 40 percent of cases that sat on the borderline between pass and fail. NZQA’s chief executive, Dr Grant Klinkum, reported that the AI’s agreement with human markers was comparable to the agreement between two human markers. When combined with human oversight, the reliability of marking was even higher.

Encouraged by this, the minister has called for broader use of AI across NCEA. She argues that with major changes planned, such as an overhauled NCEA, the scale of assessment would be unmanageable without automation. Hiring enough human markers would require, she says, a “massive injection” of funding. AI, in her view, is not only cost-effective but also just as accurate as teachers.

The Promise of AI

Advocates of AI marking stress efficiency above all else. Unlike humans, an algorithm does not tire after the 70th essay or let boredom creep into its judgments. It can churn through scripts quickly and consistently, potentially reducing teacher burnout. Some global trials show that AI grading can cut marking time by as much as 90 percent. Faster turnaround is another touted advantage. In the literacy pilot, results were returned to students more quickly, giving those who failed a chance to re-sit sooner.

Consistency is also part of the appeal. Even well-trained teachers inevitably vary in their judgments. AI systems apply criteria the same way every time. The NZQA trial suggested that combining AI with human checks produces higher overall reliability than relying solely on two human markers. Proponents see this as a pathway to fairer outcomes.

From a policy perspective, AI also enables scale. The government hopes it will make the ambitious restructuring of NCEA possible without overwhelming the system. Stanford presents this as evidence that New Zealand is ahead of the world in education technology, positioning the country as a potential leader in innovation.

A Growing Unease

Despite these promises, critics are deeply concerned. AI systems are not infallible; they can be wrong in unexpected and sometimes bizarre ways. Even the Ministry of Education has warned schools not to rely on tools like ChatGPT for grading because of risks of unfairness or outright error. A student using unusual phrasing or a cultural reference might be penalised simply because the algorithm does not recognise it.

Bias is another major issue. Algorithms reflect the data they are trained on. If that data skews towards certain cultural or linguistic norms, it risks undervaluing the work of students outside those norms. Research overseas has shown that AI can behave inconsistently when marking weaker work or work by students writing in a second language, who ironically are the students most in need of careful, fair evaluation.

Scaling AI beyond Year 10 literacy assessments is also far more complex than the government acknowledges. Structured writing tasks are one thing, but history essays, science reports, visual arts portfolios, and Te Reo Māori assessments are quite another. These require cultural understanding, judgment of creativity, and sensitivity to context, all of which are qualities that cannot easily be reduced to an algorithm.

Teachers worry, too, about the erosion of professional judgment. Marking is not just about allocating grades to students. It is also a key moment when teachers gain insight into student understanding. Outsourcing this process to AI risks deskilling teachers and cutting them off from valuable feedback about their students’ progress. Auckland principal Claire Amos has called the idea of AI doing most marking “hugely disempowering” for both teachers and students.

Students may also respond differently when they know a computer is their audience. Will they be motivated to bring their full creativity to an essay, or will they focus instead on “gaming the algorithm” with long words and formulaic structures? Research shows AI graders can be tricked by such tactics, and these risks could shift learning away from authentic expression towards writing for the machine.

Finally, transparency is a thorny problem. When a teacher gives a grade, they can explain their reasoning. When an algorithm gives a grade, the decision is often a black box. How can students appeal? Who is accountable for mistakes? If trust in the fairness of assessments is undermined, the integrity of NCEA itself could be at stake.

Te Ao Māori Perspectives

Any major change in Aotearoa’s education system should be examined through the lens of Te Ao Māori. Māori principles such as whanaungatanga, manaakitanga, and ako place relationships and respect at the centre of learning. Marking is often a moment when teachers connect with students, offering encouragement, guidance, and personalised feedback. An algorithm cannot foster these relationships or uphold a student’s mana in the same way.

There are cultural risks as well. Will an AI system recognise the value of a whakataukī in an essay? If it misinterprets such expressions, students could feel that their identity is not valued. That undermines manaakitanga and conflicts with the goal of Māori achieving success as Māori.

Ako, the principle of reciprocal teaching and learning, also suffers. Teachers often learn about their students’ thinking through the marking process. If AI takes this over, the loop is broken. Teaching risks becoming automated, with learning reduced to data points rather than the human stories and perspectives students bring to the classroom.

Proceeding with Care

There is no denying the potential of AI to reduce workload and bring efficiency to assessment. But there is also no denying the risks of unfairness, loss of professional judgment, and cultural misalignment. Many teachers take the view that AI should play a supporting role rather than replace teachers. NZQA’s current model of AI plus human oversight may be the safest path forward.

Ultimately, assessment is not just about assigning grades. It is part of the dialogue between teacher and student, a space where relationships are built and learning is deepened. If AI is used wisely, it can support this dialogue by taking on repetitive tasks and giving teachers more time for the human aspects of teaching. But if it is rushed in to cut costs or replace teachers, it risks damaging both learning and trust.

New Zealand has an opportunity to lead, but leadership here means moving cautiously, investing in teacher training, and ensuring consultation with diverse voices. Teaching is, at its heart, about people. Any tool we adopt must serve students and teachers first, upholding fairness, relationships, and cultural values. If AI can do that, it may prove a valuable assistant. If not, the price may be too high.

TheFlippedScientist