Page 693 - ISC PROCEEDINGS 21.4

connects directly to Festinger's (1954) Social Comparison Theory, which proposes that
individuals evaluate their own abilities by comparing themselves to their counterparts in
the absence of objective standards. In this study, users are asked to form their own
answer before being shown an AI-generated response; the AI, in this context, assumes the
‘counterpart’ role traditionally occupied by a human peer, and people tend to boost their
self-esteem when they compare themselves with others. Recent research demonstrates
that exposure to generative AI-produced content can increase people's self-confidence in
their creative abilities (Reich et al., 2025). Further, research on confidence alignment in
human-AI decision-making found that users' self-confidence is not independent of AI
confidence: human self-confidence shifts towards the AI's expressed confidence level, and
this alignment subsequently affects the calibration of users' self-confidence. This body of
evidence suggests that when users compare their own answers to those of an AI system,
the AI functions as a meaningful social referent that can raise or lower internal confidence,
depending on contextual framing.
Despite notable advances, important gaps remain in the literature that the present
study seeks to address. Most existing research on social comparison in AI contexts has
focused on creative tasks or advisory scenarios, leaving underexplored the specific
mechanism by which encountering an AI-generated response to a knowledge question,
particularly after forming one's own answer, influences self-reported confidence in that
answer. This study directly addresses this gap by isolating the comparative
moment between user response and AI response as the key event through which
confidence is updated, offering a theoretically grounded and empirically novel
contribution to the understanding of self-confidence in AI-mediated settings.
The hypotheses are as follows.
H2.1. The confidence level after receiving the LLM response is significantly higher than
that before receiving the LLM response among participants experiencing a delay with LLM
suggestions before receiving the LLM response.
H2.2. The confidence level after receiving the LLM response is significantly higher than
that before receiving the LLM response among participants experiencing a delay with AI
suggestions before receiving the LLM response.
H2.3. The proportion of accurate answers among participants experiencing a delay before
receiving the LLM response is significantly higher than that after receiving the LLM response.
H2.4. The proportion of accurate answers among participants experiencing a delay with LLM
suggestions before receiving the LLM response is significantly higher than that after receiving
the LLM response.
H2.5. The proportion choosing the deliberately incorrect answer among
participants experiencing a delay before receiving the LLM response is significantly lower than
that after receiving the LLM response.
H2.6. The proportion choosing the deliberately incorrect answer among
participants experiencing a delay with guidance before receiving the LLM response is
significantly lower than that after receiving the LLM response.
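The hypotheses above are directional, within-participant comparisons of a measure taken before and after the LLM response. The text at this point does not state which statistical tests are applied, so the following is only a minimal sketch under stated assumptions: a one-sided sign test for the paired confidence ratings (H2.1, H2.2) and a one-sided exact McNemar test for the paired proportions (H2.3 to H2.6). All function names and the example data are hypothetical and are not taken from the study.

```python
from math import comb


def binom_upper_tail(k: int, n: int) -> float:
    """Return P(X >= k) for X ~ Binomial(n, 0.5)."""
    if n == 0:
        return 1.0  # no informative pairs: no evidence either way
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


def sign_test_increase(before, after) -> float:
    """One-sided sign test p-value for 'after' being higher than 'before'.

    Tied pairs are dropped, as is usual for the sign test.
    """
    ups = sum(a > b for b, a in zip(before, after))
    downs = sum(a < b for b, a in zip(before, after))
    return binom_upper_tail(ups, ups + downs)


def mcnemar_decrease(before, after) -> float:
    """One-sided exact McNemar p-value for a drop in the proportion of True.

    Only discordant pairs (True -> False or False -> True) are informative.
    """
    lost = sum(bool(b) and not a for b, a in zip(before, after))
    gained = sum(bool(a) and not b for b, a in zip(before, after))
    return binom_upper_tail(lost, lost + gained)


# Hypothetical data for eight participants (illustration only).
conf_before = [3, 4, 2, 5, 3, 4, 3, 2]
conf_after = [4, 5, 3, 5, 4, 5, 2, 4]
correct_before = [True, True, True, False, True, True, False, True]
correct_after = [True, False, False, False, True, False, False, True]

p_confidence = sign_test_increase(conf_before, conf_after)
p_accuracy = mcnemar_decrease(correct_before, correct_after)
print(p_confidence, p_accuracy)
```

For H2.5 and H2.6, where the proportion is expected to rise after the LLM response, the same helper could be called with the two arguments swapped. A rank-based or parametric paired test may be preferable when the confidence scale supports it; the sign test is used here only because it needs no distributional assumptions.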
2.2. Methodology
2.2.1. Participants and design
Participants (n = 151) are Vietnamese university students from the Faculty of
Economics, Hanoi Open University, randomly assigned to one of four conditions:
Control condition (CC). Participants receive the LLM response immediately upon
choosing their condition, replicating the typical LLM user experience. The LLM response is
