Confidence assessment in MCQs is an innovative method of accounting for the degree of belief in a respondent's answer and minimising guessing, which can skew scores upwards. I've been a fan of the approach and have used it in some of my module assessments, as it rewards those who are knowledgeable AND confident in that knowledge. Details of the method can be found in this paper on formative and summative confidence-based assessment by Gardner-Medwin and Gahan from UCL. Essentially, respondents pick their chosen answer and indicate their level of confidence (say 1, 2 or 3, where 1 is low and 3 is high). If they get the answer correct then they score according to their confidence level, so correct and very confident = 3, while correct but not at all confident (possibly guessing) = 1. However, if they get it wrong AND are confident in their answer to some degree then a penalty is applied, so very confident but wrong will be a minus score.
So why is this relevant to my research? Well, the main outcome measure is a 26-item MCQ assessing spatial cognition and I want to minimise the impact of guessing on the results and evaluate whether participants' confidence in their knowledge is enhanced following the intervention. But ... is it worth doing? For one thing it complicates the procedure for participants and, in itself, influences how they might respond. Furthermore, I recall a discussion with my supervisor some time ago where she argued that it wasn't really necessary. I remember arguing the case FOR it but coming away thinking that I really should investigate further and compare early results with and without confidence assessment to see if it WAS having any impact. It is essential that the main outcome measure in an RCT is as reliable and valid as possible.
So, as I didn't use confidence assessment in the pilot, my plan was to use data collected over the next fortnight to establish the impact of confidence assessment by doing 2 separate analyses and comparing the results. I then remembered that about 12 months ago I collected data to test the VR model, pilot revisions in the outcome measures/questionnaires and check their test-retest reliability. I already had data I could use. It took a little while to find it among the 3 USB drives and 2 PCs where all my stuff seems to be (dis)organised! Note to self - must spend a day getting this all organised properly and backed up too.
The marking scheme for the confidence based assessment I used was as follows:
| Confidence level | 1 | 2 | 3 |
| --- | --- | --- | --- |
| Mark if correct | 1 | 2 | 3 |
| Penalty if wrong | 0 | -1 | -2 |
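The marking scheme above can be sketched in Python as a pair of lookup tables. This is only an illustration of the scheme, not the spreadsheet or script I actually used. The function names and the example responses are made up, and the `use_confidence=False` option simply scores correct answers as 1 and incorrect answers as 0.

```python
# Marks from the marking scheme, keyed by confidence level (1 = low, 3 = high).
MARK_IF_CORRECT = {1: 1, 2: 2, 3: 3}
PENALTY_IF_WRONG = {1: 0, 2: -1, 3: -2}


def score_item(is_correct, confidence):
    """Return the confidence-based mark for a single MCQ item."""
    if confidence not in MARK_IF_CORRECT:
        raise ValueError("confidence must be 1, 2 or 3")
    if is_correct:
        return MARK_IF_CORRECT[confidence]
    return PENALTY_IF_WRONG[confidence]


def score_test(responses, use_confidence=True):
    """Total a test given (is_correct, confidence) pairs.

    With use_confidence=False every correct answer scores 1 and every
    incorrect answer scores 0, ignoring the confidence rating.
    """
    if use_confidence:
        return sum(score_item(ok, conf) for ok, conf in responses)
    return sum(1 for ok, _ in responses if ok)


# Hypothetical responses for illustration only: (answered correctly?, confidence)
example = [(True, 3), (True, 1), (False, 3), (False, 1)]
print(score_test(example))                        # 3 + 1 - 2 + 0 = 2
print(score_test(example, use_confidence=False))  # 2 correct answers = 2
```

Note how a confident wrong answer cancels out most of a confident right one, which is exactly what discourages guessing with high confidence.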
Results
The chart below shows the knowledge enhancement (difference between pre and post intervention MCQ scores) for 2 separate analyses for the 20 participants. Both are normalised to a percentage score. Blue bars represent analysis WITH confidence assessment. Red bars represent the analysis WITHOUT it applied. Basically I adjusted the scores for this latter analysis by scoring correct answers as 1 point and incorrect answers as 0 points.
As you can see, confidence-based assessment appears to have virtually no impact on the scoring for all but 2 of the participants. Sadly I have no additional data that allows me to explore WHY confidence in knowledge was enhanced significantly in those 2 participants. I did a quick and dirty paired t-test to compare the datasets and this showed no significant difference between the 2 analyses (p = 0.45).
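For anyone wanting to repeat this kind of comparison, here is a minimal sketch of the paired t statistic using only the Python standard library. The numbers are invented for illustration and are NOT my participants' data, and `paired_t` is a hypothetical helper name. It returns the t value and degrees of freedom only; getting the p-value needs the t distribution, for which something like `scipy.stats.ttest_rel` would be the usual route if SciPy is installed.

```python
import math
import statistics


def paired_t(a, b):
    """Return (t, degrees of freedom) for a paired t-test on two equal-length lists."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical percentage improvements for 5 participants, illustration only.
with_confidence = [12.0, 8.0, 15.0, 10.0, 20.0]
without_confidence = [11.0, 9.0, 13.0, 10.0, 17.0]
t, df = paired_t(with_confidence, without_confidence)
print(f"t = {t:.3f}, df = {df}")  # t = 1.414, df = 4
```

The test is paired because each participant contributes one improvement score under each marking scheme, so it is the per-participant differences that matter.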
The MCQ inventory also has 3 different categories of items so I thought it would be important to compare improvement scores in these too. Once again, no statistically significant difference in any of them. However, it was interesting to note that, irrespective of confidence based assessment, there was a significant correlation between score improvement and MCQ item difficulty. Basically, a bigger score improvement was seen in those items with the lowest mean score on the pre-test. This is something I'll explore further in the main bulk of data collection/analysis.
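A rough sketch of how the difficulty/improvement relationship could be checked is below. It assumes item difficulty is defined as 1 minus the proportion of participants answering the item correctly on the pre-test, which is my assumption for this example rather than a fixed definition. The item values are invented and `pearson_r` is a hypothetical helper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-item values, illustration only.
pre_test_proportion_correct = [0.2, 0.4, 0.6, 0.8]
improvement = [0.5, 0.4, 0.2, 0.1]  # post minus pre, as a proportion

# Assumed definition: harder items have a lower pre-test proportion correct.
difficulty = [1 - p for p in pre_test_proportion_correct]
r = pearson_r(difficulty, improvement)
print(f"r = {r:.3f}")
```

One caveat worth keeping in mind with this kind of check: items with low pre-test scores have more room to improve, so some positive correlation is to be expected from ceiling effects alone.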
One final point to note is that this analysis was based ONLY on the results from participants using the VR intervention and NOT the control group intervention. It is, I suppose, possible that there may have been differences in confidence pre and post with this group.
Conclusions (tentative)
- Incorporating confidence-based assessment in the MCQ measure does not appear to influence the difference between pre and post-intervention scores.
- This finding is consistent for overall knowledge enhancement AND knowledge enhancement in sub-categories of MCQ items.
- Incorporating confidence-based assessment in the main study is not justified.
- There appears to be a positive correlation between increasing MCQ item difficulty and degree of knowledge improvement.
I admit to being a little surprised by this analysis. I had really believed that I would see significant improvements in scores as a result of increased confidence in knowledge and that this would justify the use of confidence-based assessment. However, on this data that does not appear to be the case, and at least I won't be wasting my own and participants' time by incorporating it.