AI and Marking: A Radical Change?
The New Zealand government has recently set off a lively debate by proposing to outsource much of the marking of NCEA assessments to artificial intelligence (AI). Education Minister Erica Stanford argues that this would ease teacher workload, speed up assessment cycles, and make major reforms to NCEA possible. In her words, New Zealand is already “world-leading” in using AI for marking and digital exams.
At first glance, the promise is enticing. I often work late
into the night marking piles of assessments. If an algorithm could shoulder
this load while maintaining fairness, perhaps it would free teachers to focus
on planning lessons, mentoring students, and reclaiming their evenings. Yet as
with any bold innovation, the risks are as significant as the benefits, and the
proposal deserves careful scrutiny, particularly when viewed through the values
of Te Ao Māori, which underpin education in New Zealand, and the
central role of relationships in teaching and learning.
What’s on the table?
The Ministry of Education and NZQA have already begun using
AI for assessment on a small scale. In 2025, around 60,000 Year 10 literacy
writing responses were marked initially by an AI system. Human markers then
reviewed about 40 percent of cases that sat on the borderline between pass and
fail. NZQA’s chief executive, Dr Grant Klinkum, reported that the AI’s
agreement with human markers was comparable to the agreement between two human
markers. When combined with human oversight, the reliability of marking was
even higher.
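To make the idea of "agreement" concrete, here is a minimal sketch of how agreement between two markers might be measured. It is not NZQA's method, which the reports do not describe. The pass/fail lists are invented for illustration, and the function names are hypothetical. Cohen's kappa is a standard statistic that adjusts raw agreement for the agreement expected by chance.

```python
from collections import Counter


def percent_agreement(marks_a, marks_b):
    """Share of scripts on which two markers gave the same result."""
    matches = sum(a == b for a, b in zip(marks_a, marks_b))
    return matches / len(marks_a)


def cohens_kappa(marks_a, marks_b):
    """Agreement corrected for chance (undefined if chance agreement is 1)."""
    n = len(marks_a)
    observed = percent_agreement(marks_a, marks_b)
    counts_a = Counter(marks_a)
    counts_b = Counter(marks_b)
    labels = set(marks_a) | set(marks_b)
    # Chance agreement: probability both markers pick the same label at random,
    # given how often each marker uses each label.
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical pass (P) / fail (F) decisions for ten scripts, invented for illustration.
human_1 = ["P", "P", "F", "P", "F", "P", "P", "F", "P", "F"]
human_2 = ["P", "P", "F", "P", "P", "P", "P", "F", "P", "F"]
ai_mark = ["P", "P", "F", "F", "F", "P", "P", "F", "P", "F"]

print("human vs human:", percent_agreement(human_1, human_2), cohens_kappa(human_1, human_2))
print("human vs AI:   ", percent_agreement(human_1, ai_mark), cohens_kappa(human_1, ai_mark))
```

A claim that the AI is "comparable" to a second human marker would mean the two comparisons give similar figures. It would not mean that either comparison shows perfect agreement.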
Encouraged by this, the minister has called for broader use
of AI across NCEA. She argues that with major changes planned, such as an
overhauled NCEA, the scale of assessment would be unmanageable without
automation. Hiring enough human markers would require, she says, a “massive
injection” of funding. AI, in her view, is not only cost-effective but also
just as accurate as teachers.
The Promise of AI
Advocates of AI marking stress efficiency above all else.
Unlike humans, an algorithm does not tire after the 70th essay or let boredom
creep into its judgments. It can churn through scripts quickly and
consistently, potentially reducing teacher burnout. Some global trials show
that AI grading can cut marking time by as much as 90 percent. Faster
turnaround is another touted advantage. In the literacy pilot, results were
returned to students more quickly, giving those who failed a chance to re-sit
sooner.
Consistency is also part of the appeal. Teachers, even
well-trained, inevitably vary in their judgments. AI systems apply criteria the
same way every time. The NZQA trial suggested that combining AI with human
checks produces higher overall reliability than relying solely on two human
markers. Proponents see this as a pathway to fairer outcomes.
From a policy perspective, AI also enables scale. The
government hopes it will make the ambitious restructuring of NCEA possible
without overwhelming the system. Stanford presents this as evidence that New
Zealand is ahead of the world in education technology, positioning the country
as a potential leader in innovation.
A Growing Unease
Despite these promises, critics are deeply concerned. AI
systems are not infallible; they can be wrong in unexpected and sometimes
bizarre ways. Even the Ministry of Education has warned schools not to rely on
tools like ChatGPT for grading because of risks of unfairness or outright
error. A student using unusual phrasing or a cultural reference might be
penalised simply because the algorithm does not recognise it.
Bias is another major issue. Algorithms reflect the data
they are trained on. If that data skews towards certain cultural or linguistic
norms, it risks undervaluing the work of students outside those norms. Research
overseas has shown that AI can behave inconsistently when marking weaker work or
work by students writing in a second language. Ironically, these are the students
most in need of careful, fair evaluation.
Scaling AI beyond Year 10 literacy assessments is also far
more complex than the government acknowledges. Structured writing tasks are one
thing, but history essays, science reports, visual arts portfolios, and Te Reo
Māori assessments are quite another. These require cultural understanding,
judgment of creativity, and sensitivity to context, all of which are qualities
that cannot easily be reduced to an algorithm.
Teachers worry, too, about the erosion of professional
judgment. Marking is not just about allocating grades to students. It is also a
key moment when teachers gain insight into student understanding. Outsourcing
this process to AI risks deskilling teachers and cutting them off from valuable
feedback about their students’ progress. Auckland principal Claire Amos has
called the idea of AI doing most marking “hugely disempowering” for both
teachers and students.
Students may also respond differently when they know a
computer is their audience. Will they be motivated to bring their full
creativity to an essay, or will they focus instead on “gaming the algorithm”
with long words and formulaic structures? Research shows AI graders can be
tricked by such tactics, which could shift learning away from
authentic expression towards writing for the machine.
Finally, transparency is a thorny problem. When a teacher
gives a grade, they can explain their reasoning. When an algorithm gives a
grade, the decision is often a black box. How can students appeal? Who is
accountable for mistakes? If trust in the fairness of assessments is
undermined, the integrity of NCEA itself could be at stake.
Te Ao Māori Perspectives
Any major change in Aotearoa’s education system should be
examined through the lens of Te Ao Māori. Māori principles such as whanaungatanga, manaakitanga,
and ako place relationships and respect at the centre of
learning. Marking is often a moment when teachers connect with students,
offering encouragement, guidance, and personalised feedback. An algorithm
cannot foster these relationships or uphold a student’s mana in
the same way.
There are cultural risks as well. Will an AI system
recognise the value of a whakataukī in an essay? If it
misinterprets such expressions, students could feel that their identity is not
valued. That undermines manaakitanga and conflicts with the
goal of Māori achieving success as Māori.
Ako, the principle of reciprocal teaching and
learning, also suffers. Teachers often learn about their students’ thinking
through the marking process. If AI takes this over, the loop is broken.
Teaching risks becoming automated, with learning reduced to data points rather
than the human stories and perspectives students bring to their learning.
Proceeding with Care
There is no denying the potential of AI to reduce workload
and bring efficiency to assessment. But there is also no denying the risks of
unfairness, loss of professional judgment, and cultural misalignment. The
consensus among many teachers is that AI should play a supporting role rather
than replace teachers. NZQA’s current model of AI plus human oversight may be
the safest path forward.
Ultimately, assessment is not just about assigning grades.
It is part of the dialogue between teacher and student, a space where
relationships are built and learning is deepened. If AI is used wisely, it can
support this dialogue by taking on repetitive tasks and allowing teachers more
time for building the human aspects of teaching. But if it is rushed in to cut
costs or replace teachers, it risks damaging both learning and trust.
New Zealand has an opportunity to lead, but leadership here
means moving cautiously, investing in teacher training, and ensuring
consultation with diverse voices. Teaching is, at its heart, about people. Any
tool we adopt must serve students and teachers first, upholding fairness,
relationships, and cultural values. If AI can do that, it may prove a valuable
assistant. If not, the price may be too high.