Beyond Recall: Rethinking Assessment in a World with AI
Inspired by conversations with Dr. Simon McCallum (Victoria University, Wellington) and Benny Pan (Rototuna High School, Hamilton) at the recent PPTA Leadership Summit in Christchurch, I've been examining how assessment must evolve. Traditionally, assessment has prioritized factual recall. In an AI-driven era where ChatGPT and similar tools can supply information instantly, simply rewarding recall is no longer sufficient. We need to shift our focus toward critical engagement and metacognitive reflection.
A quick review of current assessment in New Zealand under the National Certificate of Educational Achievement (NCEA) shows that external examinations emphasize definitions, with questions often asking students to reproduce textbook wording under timed conditions. Internal assessments favor knowledge checks, which frequently take the form of short reports that reward memorisation. Talking to students, it is clear that the credit tally takes precedence over critical thinking. Earning credits under NCEA has become synonymous with accumulating discrete facts rather than demonstrating deeper understanding and the connections between concepts and ideas. Hopefully, the New Zealand Qualifications Authority (NZQA) is considering these issues in its redevelopment of the achievement standards for Level 2 and 3 NCEA.
NCEA’s grading structure correlates depth of knowledge with achievement bands, but much of this depth currently revolves around recall and application rather than genuine critique or reflection. Students typically meet the Achieved criteria by memorizing definitions and reproducing key concepts. While this demonstrates foundational understanding, it places little emphasis on questioning or analyzing information. At the Merit level, students apply knowledge in more complex contexts; however, tasks often still rely on structured prompts that guide application rather than encourage independent critique or evaluation of sources, especially AI-generated ones. Finally, Excellence tasks require higher-order thinking, but many exemplar materials focus on extended explanations or formulaic problem-solving rather than explicitly critiquing the reliability of information or reflecting on one’s own thinking process. This structure means even top-performing students may never explicitly develop the ability to interrogate and reflect on knowledge, skills that are critical in a world with AI.
While I acknowledge that recall is a foundational skill, overemphasis on it undervalues analysis, synthesis, and evaluation: key domains in the SOLO taxonomy and Bloom’s Revised Taxonomy. A focus on recall fails to build students’ capacity to judge the reliability of information when AI can generate plausible but flawed responses, and it misses opportunities to develop learners’ awareness of their own thinking processes (metacognition).
The large language models underpinning AI tools generate text by recognizing patterns across vast data sets. They do not “understand” content; they predict likely word sequences. As a result, AI may invent facts and confidently present them as truth. These fabrications are called hallucinations. Training data can also embed cultural or topical biases into AI outputs. Finally, subtle nuances or localized knowledge (for example, specifically New Zealand contexts such as beetroot’s rightful place in a burger) may be misrepresented. These limitations highlight why students need more than recall; they need the skills to critique, corroborate, and reflect.
To align assessment with the demands of this brave new world of AI, consider these ideas. At the Achieved level, reframe learning tasks: beyond recalling definitions, require students to spot one potential limitation or bias in a summary of a concept. This small shift builds initial critique skills.
At the Achieved with Merit level, instead of only applying concepts, ask students to compare a process or solution sequence with a standard exemplar, identifying differences and reflecting on which is more accurate and why.
Finally, for Excellence, embed a metacognitive reflection. Students must not only solve a complex problem but also document their thinking strategies and justify how they verified each piece of information, particularly AI-sourced data. This could be especially useful for internal assessment.
So, what would that look like? Speaking as a physics teacher, here are just a few ways both external and internal assessment could be changed to foster more metacognition and critique of knowledge.
AS91171: Demonstrate Understanding of Mechanics
Achieved: Recall key mechanics definitions (e.g., force, acceleration) and identify one factual error in an explanation of projectile motion.
Merit: Solve a kinematics problem and compare your step-by-step solution with a provided solution, evaluating discrepancies and explaining which method is more reliable and why.
Excellence: Tackle a multi-part mechanics question (e.g., involving energy conservation and momentum), critically assess the assumptions in its solution, and document your reasoning and verification processes.
AS91172: Demonstrate Understanding of Atomic and Nuclear Physics
Achieved: Describe one atomic or nuclear physics concept (e.g., half-life, nuclear fusion) and spot a limitation or bias in a summary of that concept.
Merit: Compare an account of radioactive decay with a textbook explanation, analyze any inconsistencies, and justify which account is more scientifically accurate.
Excellence: Analyze experimental or simulation data related to atomic interactions, integrate AI-generated interpretations critically, and produce a metacognitive reflection on how you evaluated and confirmed each piece of information.
By rebalancing assessments away from recall and towards critical engagement, synthesis, and metacognitive awareness, we empower students to navigate information with discernment. In doing so, we are building resilient learners ready to thrive in a world with AI.