Page 720 - ISC PROCEEDINGS 21.4


two tools are substantively equivalent (absence of evidence is not evidence of absence
[1]). The relatively small overall sample size reflects the size of a single class cohort and
may limit the generalizability of findings to other institutions or student populations.
Third, the study did not control for AI use by students outside formal sessions. If students
continued AI-assisted practice after sessions, the T1 outcomes would reflect a composite
of the structured intervention and uncontrolled self-directed learning, with no means of
separating the two contributions. Future studies should record AI interaction logs or apply
supervised protocols during sessions. Fourth, the study did not measure learning
retention over time (no T2 assessment). The large improvements at T1 may partly reflect
a novelty effect of AI tools and do not guarantee that the gains persist at 3–6 months. Finally,
AI hallucinations were not systematically recorded, and their frequency and educational
impact remain unclear.
Future work should address these gaps directly. A cluster RCT with a parallel control
group across multiple institutions would provide the strongest test of effectiveness.
Retention should be assessed at six months. Recording AI interaction logs would allow
separation of structured and self-directed contributions. Finally, an adequately powered
head-to-head comparison of ChatGPT and Gemini, which would require substantially larger
subgroups, remains necessary before platform-specific recommendations can be made.
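To illustrate why a head-to-head comparison needs larger subgroups, the sketch below estimates the per-group sample size for a two-sided, two-sample comparison of means using the standard normal approximation. The function name `n_per_group`, the significance level, the target power, and the standardized effect sizes are illustrative assumptions, not values taken from this study.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n to detect a standardized mean difference d.

    Uses the normal approximation n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2.
    All defaults are illustrative assumptions, not study parameters.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for a two-sided test
    z_beta = z.inv_cdf(power)           # quantile matching the target power
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)


# Hypothetical effect sizes (conventional small, medium, large benchmarks).
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} students per platform")
```

Under these assumptions, a small between-platform difference (d = 0.2) would require several hundred students per platform, which is the kind of scale a multi-institution design would be needed to reach.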
5. Conclusion
This study demonstrated that a structured, instructor-guided AI chatbot protocol
produced gains across all seven pharmacoepidemiological competency indicators among
fourth-year pharmacy students at Thanh Do University. Gains were pronounced for
the demanding indicators NNT/ARR and I², pointing to the protocol's particular utility in
developing higher-order analytical reasoning. The absence of detectable between-platform
differences suggests that instructional design, rather than AI tool selection, is the more
consequential variable in this context.

References
[1] Altman, D. G., & Bland, J. M. (1995). Absence of evidence is not evidence of
absence. BMJ, 311(7003), 485.
[2] Anderson, J. R. (1987). Skill acquisition: Compilation of weak-method problem
solutions. Psychological Review, 94(2), 192–210.
[3] Bhatt, D. L., Cryer, B. L., Contant, C. F., et al. (2010). Clopidogrel with or without
omeprazole in coronary artery disease. New England Journal of Medicine, 363(20), 1909–1917.
[4] Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and
analysis issues for field settings. Houghton Mifflin.
[5] Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. Academic
Press.
[6] Kasneci, E., Seßler, K., Küchemann, S., et al. (2023). ChatGPT for good? On
opportunities and challenges of large language models for education. Learning and
Individual Differences, 103, 102274.
[7] Kung, T. H., Cheatham, M., Medenilla, A., et al. (2023). Performance of ChatGPT
on USMLE: Potential for AI-assisted medical education using large language models. PLOS
Digital Health, 2(2), e0000198.
