The question of how universities should handle AI-generated term papers concerns education leaders throughout the DACH region. A recent experiment at FernUniversität Hagen now provides empirical insight into the scale of the challenge: four term papers generated entirely by AI reached a level sufficient to pass sociology examinations. For decision-makers at universities, academies, and continuing education institutions, this raises the urgent question of how assessment formats can be made future-proof.
The Experiment: What Generative AI Can Achieve Today
Benedikt Engelmeier, a research associate at FernUniversität Hagen, generated four term papers for sociology modules entirely with the AI model Claude Sonnet 4.5. Remarkably, he used no subject-specific input whatsoever: the topic was not predetermined, and no content corrections were made. Instead, he relied exclusively on detailed work instructions for the AI.
The results are striking: All four term papers feature a research question appropriate to the module and a suitable structure. They draw on relevant sociological theories and analyze the subject matter on this basis. The AI-generated papers show particular strengths in topic selection—that is, the successful combination of theoretical approach and subject of investigation—as well as in coherent structure.
For education leaders, the implication is clear: technical development has reached a point where AI-generated academic papers no longer fail on obvious content or structural deficiencies. Their quality is sufficient to pass examinations.
Why Detection Has Become So Difficult
In the early days of generative AI, hallucinated references were considered a reliable indicator. This criterion, however, is rapidly losing relevance. The review of the four term papers reveals a nuanced picture:
- Reference lists: Of 101 sources, 63 were nearly error-free and 30 contained minor errors such as incorrect author names. Only two entries were actually hallucinated.
- In-text citations: For 305 of 316 citations, a corresponding entry exists in the reference list.
- Direct quotations: Here the serious weakness emerges: only 15 of 56 direct quotations could be verified in the cited sources.
The problem for examiners: Verifying direct quotations requires cross-referencing with original sources. This effort is not realistic as a standard check for all submitted papers. The more time spent verifying formalities, the less remains for substantive engagement with student work.
Technical detectors for AI-generated text also have weaknesses, as various studies have shown. Reliable automatic detection is currently not possible.
The Dilemma: Neither Allowing Nor Prohibiting Solves the Problem
In the current debate, it is often argued that a good term paper created with AI still says something about students' subject expertise. The experiment refutes this assumption: If term papers can be generated without any subject-specific input, the format no longer tests subject competency but rather AI competency or the ability to commission such services.
For universities and educational institutions, this creates a genuine dilemma:
- Prohibiting AI use cannot be enforced, as detection is not reliably possible.
- Allowing AI use means that the actually desired competencies are no longer being assessed.
- Higher standards for all papers would overwhelm many students and increase pressure to use AI.
The provocative insight from the experiment: in some cases, student term papers may in future be identifiable precisely because their level is too low to pass as an AI product, a circumstance that presents absurd challenges for assessment.
Process Supervision as a Solution
A promising solution lies in focusing on the creation process rather than the finished product. When not only the submitted work is assessed, but also the path to getting there, actual competency development can be better tracked.
However, these process-based assessments require significantly closer supervision and more working time. With current examination loads, staffing levels, and resulting supervision ratios, this is not feasible at many institutions.
This is where digital learning companions can make a crucial contribution. An AI tutor integrated into existing learning management systems like Moodle can continuously support students throughout the learning process. Unlike text generation tools, such a learning companion documents the individual engagement with the material and promotes independent competency development.
The difference lies in the approach: While AI text generators deliver the end product and thereby bypass the learning process, an AI tutor accompanies the journey to the goal. It answers comprehension questions, provides feedback on solution approaches, and supports the structuring of thoughts. The cognitive effort remains with the learners.
What Education Leaders Should Consider Now
The experiment makes clear that the question is no longer whether the traditional term paper will become untenable as an assessment format, but when this point will be reached. With the further spread of AI knowledge and the increasing capability of available tools, generating academic papers will become ever easier and require fewer prerequisites.
For decision-makers at universities and continuing education institutions, this gives rise to concrete areas for action:
- Redesigning assessment formats with a stronger focus on the process
- Building supervision capacity for process-oriented assessments
- Integrating learning companions that support rather than replace competency development
- Developing criteria that value authentic student achievements
Written assessments without supervision or intensive guidance will no longer be meaningful in the long term. Institutions that adopt process-oriented formats and supportive learning companions early will manage this transition better than those that cling to outdated assessment methods.
The Hagen experiment provides no ready-made solutions, but it describes the problem with empirical precision. For education leaders, now is the time to critically examine their own assessment formats and develop alternatives that remain meaningful even in the age of generative AI.
Frequently Asked Questions
Can AI-generated term papers pass university examinations?
Yes. In the Hagen experiment, all four papers generated with Claude Sonnet 4.5 reached a level sufficient to pass, despite being created without any subject-specific input.

Can AI-generated term papers be reliably detected?
No. Hallucinated references have become rare, technical detectors are unreliable, and the clearest weakness, unverifiable direct quotations, can only be found by cross-referencing original sources, which is not realistic as a standard check.

What alternatives exist to traditional term papers as an assessment format?
Process-oriented formats that assess the path to the finished work rather than only the product. These require significantly closer supervision and thus greater staffing capacity.

How can an AI tutor help solve this problem?
Unlike text generators, an AI tutor accompanies the learning process: it answers comprehension questions, gives feedback on solution approaches, and documents individual engagement with the material, so the cognitive effort stays with the learners.

Should universities allow or prohibit AI use in term papers?
Neither resolves the dilemma: prohibition cannot be enforced because detection is unreliable, while permission means the intended competencies are no longer assessed. The way forward is redesigning assessment formats themselves.
Discover how the Alphabees AI Tutor intelligently extends your Moodle courses – with 24/7 learning support and no new infrastructure costs.