Page 690 - ISC PROCEEDINGS 21.4


informational accuracy (Alansari & Luqman, 2025). LLMs may also fail to detect
contradictions in input text and therefore fail to execute instructions in the correct
priority order, even when the prompt states that order explicitly (Geng et al., 2025).
The tendency of LLMs to unquestioningly affirm users’ suggestions, even when harmful
intentions are present, is also reported by Cheng et al. (2025). In order to
recognize these problems in output, students are expected to have adequate information
literacy to detect misleading information and false affirmation on the part of LLMs, and
to maintain consistent wording in their own input; however, university students have been
shown to perform poorly at judging the accuracy of AI output and at constructing prompts
in the most comprehensible way (Cui & Zhang, 2025). Kim et al.
(2025) also found major differences in how high-literacy and low-literacy students prompt
LLMs and, consequently, in the writing assignments they complete with the assistance of
models. Students may also fail to notice apparent problems with AI output, or they may
blindly accept LLMs’ responses due to confirmation bias. Cheng et al.
(2025) suggest that people are more likely to trust models with a sycophantic tendency,
even at the risk of losing their independent judgement, which in turn inadvertently
reinforces the tendency of LLMs to accept human input.
Another especially significant dimension of the problem is how LLM responses distort
users' self-assessed confidence in their knowledge and abilities. Since students can get
an instant, fluent, and authoritative answer by asking the
LLM a question, they systematically forfeit the cognitive processes of reasoning, retrieval,
and self-monitoring. Indeed, learners using generative AI tools show inflated confidence
in their performance, even though they gain little in deep learning
or transferable knowledge, a phenomenon described as a disconnect between perceived
and actual competence (Fernandes & Welsch, 2025). In this sense, students increasingly
rely on generative AI to carry out academic work and develop domain-specific "false
self-efficacy" that conceals actual competence; a user's task-specific self-confidence and
confidence in AI each predict whether critical thinking is activated at all in AI-assisted
work (Lee et al., 2025). Overall, these findings indicate that unreflective use of AI does not
supplement students' knowledge; instead, it actively erodes the metacognitive processes
by which students evaluate what they know.
Multiple measures have been taken to raise users' awareness of current LLM
limitations and to actively reduce overreliance. These include in-product disclaimers
embedded in ChatGPT, Gemini, Claude, and DeepSeek, as well as broader educational
initiatives and, in certain jurisdictions, outright bans. Among the most prominent early
cases, New York City Public Schools and Los Angeles Unified blocked access to ChatGPT
from school networks and devices, and numerous other districts followed suit. At the
university level, many institutions adopted a more nuanced approach, encouraging
responsible use of these tools to achieve high-quality outcomes while adhering to ethical
principles and regulations, whereas others opted for stricter prohibitions. Nevertheless,
such top-down
regulations share a fundamental limitation: they operate only through coercion or on the
assumption that users are information-literate. What remains conspicuously underdeveloped in
both policy and research is the application of behavioral nudges - subtle, non-coercive
modifications to the choice environment - that could reshape how students engage with
AI-generated responses in real time, without relying on prohibitions that are difficult to
enforce or awareness campaigns that may not translate into changed behavior (Thaler &
Sunstein, 2008; Buçinca et al., 2021).
