Page 717 - ISC PROCEEDINGS 21.4

groups at T1: Fisher’s exact test for binary variables (preferred over the chi-square test
                  given the small subgroup sizes, n₁ = 19 and n₂ = 18); Mann-Whitney U test for Likert
                  variables. Effect sizes were calculated to estimate the magnitude of change: phi
                  coefficient (φ) for binary variables (φ > 0.50 = large); Cohen’s d for continuous
                  variables (d > 0.80 = large; 0.50-0.80 = moderate). The threshold for statistical
                  significance was p < 0.05, two-tailed. Analyses were run in SPSS version 26.0.
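The between-group test and the binary effect size described above can be sketched in plain Python. This is a minimal illustration and not the SPSS procedure the authors ran: the function names are hypothetical, and the 2×2 counts in the usage lines are invented for illustration and are not study data.

```python
from math import comb, sqrt


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


def phi_coefficient(a, b, c, d):
    """Phi effect size for the 2x2 table [[a, b], [c, d]]."""
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denominator if denominator else 0.0


def label_phi(phi):
    """Apply the threshold stated in the text: |phi| > 0.50 counts as large."""
    return "large" if abs(phi) > 0.50 else "not large"


# Hypothetical counts (NOT study data): rows are the two tool subgroups
# (n1 = 19, n2 = 18), columns are correct / incorrect answers at T1.
p_value = fisher_exact_two_sided(15, 4, 9, 9)
phi = phi_coefficient(15, 4, 9, 9)
print(f"p = {p_value:.3f}, phi = {phi:.2f} ({label_phi(phi)})")
```

The exact test enumerates every table with the same margins, which is why it stays valid at subgroup sizes where the chi-square approximation becomes unreliable.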
                        2.8. Research ethics
                        Student participation was voluntary and had no bearing on formal academic results.
                  Data were anonymized and stored securely.
                        3. Results
                        3.1. Sample characteristics
                        All 37 students completed all four phases of the study (completion rate 100%).
                  Demographic and academic characteristics are presented in Table 2.
                        Table 2. Demographic and academic characteristics of the study sample (n = 37)
                   Characteristic                   Category                                       n (%) or mean ± SD

                   Sex                              Female                                         29 (78.4%)
                                                    Male                                           8 (21.6%)
                   Age (mean ± SD, years)           -                                              22.1 ± 0.8
                   Prior AI experience              Previously used AI (general learning;          14 (37.8%)
                                                    no literature analysis)
                                                    Never used AI in learning                      23 (62.2%)
                   AI tool assigned                 ChatGPT (GPT-4o / 4o-mini)                     19 (51.4%)
                                                    Gemini (1.5 Pro / Flash)                       18 (48.6%)
                   Biostatistics score (mean ± SD)  -                                              6.8 ± 1.1

                        SD: standard deviation. ChatGPT/Gemini allocation performed by simple lottery
                  randomization.
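A lottery-style allocation of this kind can be sketched as follows. The source does not describe the mechanics of the draw, so this assumes a fixed number of ChatGPT tickets (19) drawn from the 37 students; the function name, the student IDs, and the seed are all hypothetical.

```python
import random


def lottery_allocation(student_ids, n_chatgpt, seed=None):
    """Draw n_chatgpt students at random for ChatGPT; the rest get Gemini."""
    rng = random.Random(seed)
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    return {
        "ChatGPT": sorted(shuffled[:n_chatgpt]),
        "Gemini": sorted(shuffled[n_chatgpt:]),
    }


# Hypothetical IDs 1..37; the seed is only there to make the draw repeatable.
groups = lottery_allocation(range(1, 38), n_chatgpt=19, seed=2024)
print({tool: len(ids) for tool, ids in groups.items()})
```

Fixing the seed lets the allocation be reproduced and audited later, at the cost of being predictable to anyone who knows the seed in advance.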
                        Notably, 62.2% of students had never used AI in an academic context; for these
                  students, this intervention was their first guided AI learning experience.
                        3.2. Quantitative outcomes (Table 3)
                        All seven outcome indicators improved significantly (p < 0.001), with uniformly large
                  effect sizes across all measures (Table 3).
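As a rough illustration of how a paired before/after comparison of this kind works, the sketch below computes the change in percentage points and an exact two-sided McNemar p-value. It makes several assumptions: the paper does not say whether the exact or the chi-square form of the test was used, the function names are hypothetical, and the discordant-pair counts are not reported, so the split used here is invented and only chosen to match the reported marginals.

```python
from math import comb


def mcnemar_exact(improved, worsened):
    """Two-sided exact McNemar p-value from the two discordant-pair counts."""
    n = improved + worsened
    if n == 0:
        return 1.0
    k = min(improved, worsened)
    # Binomial tail under H0: each discordant pair is equally likely to go either way
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def change_in_points(correct_t0, correct_t1, n):
    """Change in the proportion correct, in percentage points."""
    return 100 * (correct_t1 - correct_t0) / n


# Marginal counts reported for aOR/IRR interpretation: 10/37 at T0, 32/37 at T1.
delta = change_in_points(10, 32, 37)

# The discordant counts are not reported; 23 improved / 1 worsened is one
# HYPOTHETICAL split that matches the net gain of 22 students.
p_value = mcnemar_exact(improved=23, worsened=1)
print(f"delta = {delta:.1f} pp, exact McNemar p = {p_value:.2g}")
```

Only students whose answer changed between T0 and T1 enter the test, which is why the marginal percentages alone are not enough to reproduce the reported p-values.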
                        Table 3. Quantitative outcomes before and after the AI intervention (n = 37)

                   Outcome indicator                              T0            T1            Δ         Test / p             Effect size (95% CI)

                   Correct interpretation of aOR/IRR (%)          27% (10/37)   86% (32/37)   +59 pp    McNemar, p<0.001     φ = 0.77
                   Correct differentiation of aOR/RR/HR/IRR (%)   19% (7/37)    81% (30/37)   +62 pp    McNemar, p<0.001     φ = 0.79
                   Correct interpretation of NNT and ARR (%)      11% (4/37)    73% (27/37)   +62 pp    McNemar, p<0.001     φ = 0.79


