Introduction
In 2008, the UK Chief Medical Officer produced a report entitled Safer Medical Practice.1 It highlighted the importance of computer simulation in medical training. Flight safety has improved notably as a result of simulation training for pilots, and there is increasing evidence that this technology can improve training in other high-risk professions such as surgery.
This comes at a time when the introduction of the European Working Time Directive and the increase in consultant-led care have decreased the number of operative opportunities for trainee surgeons. Operative training can be variable, unstructured, and dependent on the patient and the trainer rather than the trainee. Furthermore, surgical training can result in extra patient morbidity due to the trainee's inexperience.
For these reasons, the traditional Halsted ‘apprenticeship’ model, involving observation, coaching and practice, is being replaced by a competency-based model. This paradigm shift requires novel learning opportunities and objective assessment methods. Computer simulators potentially allow trainees to develop operative skills, experienced surgeons to practise unfamiliar surgical approaches,2 and examiners to objectively assess surgical skills,3 all within a virtual reality environment.
Surgical simulators have been validated in numerous settings,4 including sinus surgery,5 laparoscopic biliary surgery6 and gastrointestinal endoscopy.7 Temporal bone simulators have also been validated.8–10
The Voxel-Man temporal bone surgery computer simulator (Spiggle and Theis, Hamburg, Germany) is commercially available in the UK. It has undergone face and content validation,11,12 and thus provides a convincing representation of temporal bone drilling. It has also undergone construct validation, and thus can differentiate between experienced and novice surgeons using an objective outcome framework. This outcome framework has also been justified by correlation of the simulator's objective scores against senior clinicians' subjective scoring of novice and experienced surgeons.13
The purpose of this study was to demonstrate the parameters of improvement in novice trainees using the Voxel-Man temporal bone simulator. This information could be used to guide the use of this tool in surgical training and assessment.
Materials and methods
A study was performed using four medical students, each with a rudimentary knowledge of temporal bone anatomy and no experience of temporal bone surgery. Each trainee performed three pre-defined tasks on six separate occasions. The tasks were all performed on the Spiggle and Theis Voxel-Man TempoSurg temporal bone simulator14 (see Figure 1).
This simulator derives models of the temporal bone from high-resolution volumetric computed tomography (CT) images (see Figure 2). Vital structures are colour-coded, and the simulator records key indicators that relate to these structures, such as excessive force or direct damage from a rotating burr. The simulator uses a computer with two Athlon™ MP processors, with SuSE Linux 8.0 and an Nvidia Quadro2 MXR graphics board. A hand-piece with PHANTOM® Premium haptic feedback is used to simulate a drill, enabling the user to experience changes in pressure and to visualise changes in tissues in real time3 as the drill cuts through the simulated temporal bone. The stereoscopic display is through a mirror with Elsa Revelator polarised shutter glasses, which provide a three-dimensional representation of the temporal bone with realistic depth perception.15 The simulator can, if desired, display transverse, coronal and sagittal CT slices, and a warning monitor indicates the distance between the burr and key structures. The user is able to alter patient orientation and drill size, type and rotation speed.
The tasks performed were cortical mastoidectomy, exposure of the sigmoid sinus, and exposure of the short process of the incus. These tasks were performed as a set, and the set was repeated five times after the initial attempt (six iterations in total), with successive iterations separated by an interval of one week.
Before the study began, each subject was given a one-to-one simulator orientation by an experienced operator, and gained hands-on experience of a number of introductory tasks before being evaluated on the specified tasks. Subjects were given a basic introduction to each of the set tasks. They were also given their simulator data as feedback after each iteration. This information included an illustration demonstrating the reference bone volume to be removed (see Figure 3).
Scores were based on the following factors: (1) volume of reference bone removed (this parameter provided a positive outcome); (2) mistakes made during the task, including damage to important structures (i.e. dura, sigmoid sinus, facial nerve, chorda tympani, ossicles, bony labyrinth and external auditory meatus), excessive force near the above structures, and drilling time during which the rotating burr was not visible (these parameters provided negative outcomes); and (3) time taken to perform the procedure (this parameter could provide a positive or negative outcome).
We also recorded, but did not use in the scoring system, the average drill path and the volume of bone removed with each drill strike.
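The way the three groups of scoring factors combine can be illustrated with a short sketch. The simulator's actual weightings are not reported here, so every constant below (points per unit of reference volume, penalty per mistake, target time) is a hypothetical placeholder, and mistakes are simplified to a single event count. The sketch shows only the sign of each contribution: volume adds to the score, mistakes subtract from it, and time can do either.

```python
from dataclasses import dataclass

# All constants are hypothetical placeholders, NOT the simulator's real weights.
VOLUME_WEIGHT = 100.0    # points for removing the whole reference volume
MISTAKE_PENALTY = 5.0    # points lost per recorded mistake event
TIME_TARGET_MIN = 20.0   # assumed target completion time, in minutes
TIME_WEIGHT = 1.0        # points gained/lost per minute under/over target


@dataclass
class Attempt:
    fraction_reference_removed: float  # 0.0 to 1.0
    mistake_events: int                # damage, excessive force, hidden burr (simplified)
    minutes_taken: float


def total_score(a: Attempt) -> float:
    """Combine the three factor groups into one illustrative score."""
    volume_points = VOLUME_WEIGHT * a.fraction_reference_removed      # positive only
    mistake_points = -MISTAKE_PENALTY * a.mistake_events              # negative only
    time_points = TIME_WEIGHT * (TIME_TARGET_MIN - a.minutes_taken)   # either sign
    return volume_points + mistake_points + time_points


# Made-up attempts: a slow, error-prone one and a fast, clean one.
slow = Attempt(fraction_reference_removed=0.5, mistake_events=3, minutes_taken=25.0)
fast = Attempt(fraction_reference_removed=0.5, mistake_events=0, minutes_taken=15.0)
print(total_score(slow), total_score(fast))
```

Under these assumed weights, two attempts that remove the same reference volume can receive very different scores purely through speed and mistakes.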
Statistical methods
The objective of the analysis was to examine how the outcome parameters varied over the course of the study. These outcome parameters were: overall score, volume of reference bone removed, mistakes made, time spent performing task, average drill path length, and average bone volume removed each time the drill struck bone. All six parameters were measured on a continuous scale. Therefore, the association between iteration and outcome was examined using linear or non-linear regression, depending on the nature of the parameter.16
Data from all three tasks were assessed in a single analysis. Additionally, the interaction between task and iteration was examined. Finally, terms were also included in the analysis to allow for differences in outcome between the four trainees.
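A minimal sketch of this kind of analysis is given below, under stated assumptions. It estimates a separate linear slope of outcome against iteration for each task (the task-by-iteration interaction) after subtracting each trainee's own mean within that task, which is a simplified stand-in for the trainee terms described above. The function name, the record layout and the demonstration data are all hypothetical; the data are synthetic and are not the study's results. The sketch gives point estimates only, with no confidence intervals or p values.

```python
from collections import defaultdict
from statistics import mean


def slope_per_task(records):
    """records: iterable of (trainee, task, iteration, outcome) tuples.

    Returns {task: slope of outcome on iteration}. Each trainee's mean is
    removed within each task, so trainees may differ in baseline level.
    Each trainee needs at least two distinct iterations per task.
    """
    groups = defaultdict(list)  # (task, trainee) -> [(iteration, outcome), ...]
    for trainee, task, iteration, outcome in records:
        groups[(task, trainee)].append((iteration, outcome))

    sxy = defaultdict(float)  # per-task sum of centred cross-products
    sxx = defaultdict(float)  # per-task sum of centred squared iterations
    for (task, _trainee), rows in groups.items():
        x_bar = mean(x for x, _ in rows)
        y_bar = mean(y for _, y in rows)
        for x, y in rows:
            sxy[task] += (x - x_bar) * (y - y_bar)
            sxx[task] += (x - x_bar) ** 2
    return {task: sxy[task] / sxx[task] for task in sxx}


# Synthetic, noise-free data: four trainees, three tasks, six iterations.
demo = [
    (trainee, task, it, base + offset + slope * it)
    for task, base, slope in [("CM", 50.0, 0.0), ("SS exp", 40.0, 2.0), ("SPI exp", 30.0, 3.0)]
    for trainee, offset in [("A", 0.0), ("B", 5.0), ("C", -5.0), ("D", 2.0)]
    for it in range(1, 7)
]
slopes = slope_per_task(demo)
print(slopes)
```

Because the synthetic outcomes are exact straight lines, the recovered slopes match the values used to generate them; real data would also require standard errors, which a statistics package would normally supply.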
Results and analysis
Total score
The first outcome of interest was the total score obtained. The progressive effect of iteration was quantified for each task separately, and is shown in Table I and Figure 4. The results suggested that there was no significant effect of iteration upon the total scores for the cortical task. However, there was a significant change with subsequent iterations for both the sigmoid and incus tasks.
Table I notes: *Demonstrating improvement with each iteration. CI = confidence interval; CM = cortical mastoidectomy; SS exp = sigmoid sinus exposure; SPI exp = short process of incus exposure
Time taken
The second set of analyses examined the change in time taken to complete each task over the course of the study. The results were quantified for each task separately, and are shown in Table II and Figure 5. There was no evidence that the time taken to complete the cortical task changed over the course of the study. However, the time required for the remaining two tasks decreased significantly with successive iterations.
Table II notes: *Demonstrates time reduction with each iteration (minutes). CI = confidence interval; CM = cortical mastoidectomy; SS exp = sigmoid sinus exposure; SPI exp = short process of incus exposure
Target bone volume removed
Linear regression suggested that there was no significant interaction between task and iteration (p = 0.35) for this variable. This indicates that the relationship between iteration and target bone volume removed did not significantly vary between the three tasks. As a result, it was assumed that there was a similar relationship between iteration and target volume removed for all three tasks, and so the relationship was examined for all tasks combined. A summary of the results is shown in Table III and Figure 6. No significant change was identified.
Table III notes: *Over iterations. CI = confidence interval
Mistakes
The fourth analysis examined how the mistakes score varied over the course of the study. Once again, there was no evidence of a significant interaction between iteration and task (p = 0.38), so a similar relationship for all tasks was assumed. There was a non-linear relationship between iteration and mistakes, and this was factored into the analysis. The results suggested that the mistakes score increased (i.e. became less negative) from iteration one up to iteration four. After this point, there was a slight tailing-off of scores, and even a slight decrease by the last iteration, although these results were only of borderline statistical significance (p = 0.05). The results are shown in Table IV and Figure 7.
Table IV notes: *For each task. CI = confidence interval
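One common way to model a rise followed by a tailing-off is to add a squared iteration term to the regression. The exact non-linear form used in this analysis is not stated, so the quadratic below is an assumption, and the data are synthetic values chosen only to mimic a peak at the fourth iteration. The sketch fits y = a + b·x + c·x² by least squares and reports the iteration at which the fitted curve turns.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2; returns (a, b, c)."""
    # Power sums that make up the normal equations.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x3 linear system.
    m = [
        [s[0], s[1], s[2], t[0]],
        [s[1], s[2], s[3], t[1]],
        [s[2], s[3], s[4], t[2]],
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    # Back substitution.
    coef = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        acc = m[r][3] - sum(m[r][c] * coef[c] for c in range(r + 1, 3))
        coef[r] = acc / m[r][r]
    return tuple(coef)


def turning_point(b, c):
    """Iteration at which a + b*x + c*x**2 peaks or troughs (needs c != 0)."""
    return -b / (2 * c)


# Synthetic mistakes scores (negative = penalties), NOT the study's data.
iterations = [1, 2, 3, 4, 5, 6]
scores = [-(x - 4) ** 2 - 10 for x in iterations]
a, b, c = fit_quadratic(iterations, scores)
peak = turning_point(b, c)
print(a, b, c, peak)
```

A negative coefficient on the squared term corresponds to scores that improve and then level off or dip, which is the qualitative pattern described for the mistakes score.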
Average bone volume removed with each strike
The fifth set of analyses examined how the average bone volume removed with each strike varied over the study. A summary of the analytical results is given in Table V. The regression analysis suggested that the average volume removed with each strike increased, but not to a level that reached significance.
Table V notes: CI = confidence interval
Drill path length
The change in drill path length over subsequent iterations was also examined using regression analysis. With progressive iterations, the average drill path length decreased to a non-significant extent. The results of this analysis are shown in Table VI.
Table VI notes: *Required to complete task. CI = confidence interval
Discussion
Our results indicate the expected parameters of improvement during simulated temporal bone drilling. There was a notable decrease in the time taken to perform each procedure, which was statistically significant for two of the three tasks. There was a decrease in the number of mistakes made across all three tasks, which was of borderline statistical significance (p = 0.05), and there was an increase in total score in two of the three tasks. The task for which improvement was not seen at a statistically significant level, for either timing or overall score, was cortical mastoidectomy. We speculate that this was due to an apparent plateau of improvement being reached at the fourth iteration. A similar apparent plateau in overall score was seen in the sigmoid sinus task, but not until the fifth iteration, by which stage there was already a significant trend towards improvement. There was no apparent plateau in the task requiring exposure of the short process of the incus. It seems logical that competence took longer to develop in the more complicated temporal bone drilling tasks, which required more dexterity, better knowledge of surrounding structures, and a familiarity with the capabilities of the drill.
The clearest parameter in which trainees showed improvement was the length of time taken to perform each procedure. It is interesting to note that the time taken accounted for a large proportion of the scoring variability between subjects. It may be that the weighting of simulator scoring created a greater incentive to perform the procedure more quickly. However, despite feedback using a picture demonstrating the target volume for removal (see Figure 3), the volume of reference bone removed remained relatively constant. This indicates that trainees became more proficient at removing a safe reference volume of temporal bone, but that a more extensive dissection was not attempted, presumably due to the risk of damage to important structures. It seems likely that the nature of the scoring system and feedback dictates the way in which trainees' operative performance improves. This may allow the same simulator, with different scoring frameworks, to be used for trainees of different standards, in order to develop skills relevant to different training levels.
There was no notable improvement in the volume of reference bone removed. This was despite feedback, including a pictorial representation of the proportion of the target volume which was incorrectly retained. It is possible that this persistent error was due to the nature of the initial simulator orientation, during which the aims of the operation were explained. More extensive dissection also increases the risk of damage to important structures, and takes more time.
As trainees' operative time decreased, the average length of the drill path also decreased, and the volume removed with each strike increased. Whilst these changes did not reach statistical significance, they suggest how efficiency improved as surgical time decreased, and may indicate greater confidence in dissection. This finding agrees with those of previous studies of temporal bone simulators, in which face validity was established by surgeons indicating a greater level of confidence in temporal bone surgery and anatomy as a result of using simulators.9 It is likely that this is the principal benefit of temporal bone simulator use by novices. In addition to increased confidence, trainees' knowledge of anatomy improves and they receive an introduction to the relevant surgical techniques. The simulator's haptic feedback also provides a sense of what might be expected during real surgery.17
There are some flaws with the virtual reality provided by the simulator. The drill shaft is not rendered as a solid object, and thus the drill may be held at any angle, regardless of nearby structures. Furthermore, whilst the subject can be moved, there is no facility to ‘zoom out’ away from the predetermined level of magnification. There are numerous other differences between a real and a virtual surgical environment. However, whilst there will inevitably be flaws in a virtual reality environment, face validation demonstrates that the fidelity of the experience is sufficient to provide meaningful training.11 The simulator allows development of the conceptual knowledge and practical skills which are prerequisites for otological surgery.
The most notable difference between the real and simulated surgical environments is in outcomes. The simulator's scoring paradigm does not recognise that the most difficult piece of bone to remove may also be the most crucial to the successful completion of the task. Equal marks will be given for removal of an arbitrary volume of reference bone at the periphery of the dissection as for technically difficult removal of bone crucial to surgical completion. Similarly, the simulator does not differentiate between exposure of the dura and passage of a rotating burr right through the dura.
• Computerised temporal bone dissection simulators have undergone face and content validation
• This study assessed novices' learning curves for various simulated tasks
• Data indicate expected improvements using this training method
However, it is important to note that improvement in simulated outcomes does not necessarily indicate an improvement in real surgical capabilities. The use of the simulator requires a diverse skill set. Knowledge of temporal bone anatomy and familiarity with the simulator, with temporal bone drilling and with the scoring framework all influence the final score. It may be that our subjects' score improvements were most influenced by an improved knowledge of the scoring framework and increased familiarity with the simulator. Although this is an important point to consider, it is unlikely to be the full explanation, as the three set tasks showed different improvement trends. Nor does this caveat invalidate the utility of the data, which still indicate the expected improvement, and which may be used to guide the incorporation of the tool into assessment and training programmes.
Conclusion
These results indicate the utility of computerised temporal bone simulators in the training of novice surgeons. Our data indicated that more complex tasks required a longer training period in order for trainees to reach the required level of competence. Data also indicated the importance of the feedback and scoring framework in guiding trainees' progress. Further investigation is required in order to develop scoring frameworks appropriate for different levels of surgical training.
Acknowledgement
We would like to acknowledge Paul Bassett for his assistance with statistics.