
Objective structured assessment of technical skill in temporal bone dissection: validation of a novel tool

Published online by Cambridge University Press:  12 May 2021

M Stavrakas*
Affiliation:
ENT Department, AHEPA University Hospital, Aristotle University of Thessaloniki, Thessaloniki, Greece
G Menexes
Affiliation:
Department of Field Crops and Ecology, Faculty of Agriculture, Forestry and Natural Environment, School of Agriculture, Aristotle University of Thessaloniki, Thessaloniki, Greece
S Triaridis
Affiliation:
ENT Department, AHEPA University Hospital, Aristotle University of Thessaloniki, Thessaloniki, Greece
P Bamidis
Affiliation:
Laboratory of Medical Physics, Medical School, Aristotle University of Thessaloniki, Thessaloniki, Greece
J Constantinidis
Affiliation:
ENT Department, AHEPA University Hospital, Aristotle University of Thessaloniki, Thessaloniki, Greece
P D Karkos
Affiliation:
ENT Department, AHEPA University Hospital, Aristotle University of Thessaloniki, Thessaloniki, Greece
*
Author for correspondence: Dr Marios Stavrakas, ENT Department, Aristotle University of Thessaloniki, AHEPA Hospital, Thessaloniki Kiriakidi 1, Thessaloniki 546 21, Greece E-mail: mstavrakas@doctors.org.uk

Abstract

Objective

This study developed an assessment tool, based on the principles of the objective structured assessment of technical skill, for the evaluation of surgical skills in cortical mastoidectomy. The objective structured assessment of technical skill is a well-established tool for evaluating surgical ability. This study also aimed to identify the best material and printing method for making a three-dimensional printed temporal bone model.

Methods

Twenty-four otolaryngologists in training were asked to perform a cortical mastoidectomy on a three-dimensional printed temporal bone (selective laser sintering resin). They were scored according to the objective structured assessment of technical skill in temporal bone dissection tool developed in this study and an already validated global rating scale.

Results

Two external assessors scored the candidates, and it was concluded that the objective structured assessment of technical skill in temporal bone dissection tool demonstrated some main aspects of validity and reliability that can be used in training and performance evaluation of technical skills in mastoid surgery.

Conclusion

Apart from validating the new tool for temporal bone dissection training, the study showed that evolving three-dimensional printing technologies are of high value in simulation training, with several advantages over traditional teaching methods.

Type
Main Articles
Copyright
Copyright © The Author(s), 2021. Published by Cambridge University Press

Introduction

Training doctors, and then assessing their competence and the quality of the training they have received, is a central part of surgery. Improvement of surgical skills should not follow Halsted's model, which claims that learning is achieved by performing the procedure.Reference Bismuth, Donovan, O'Malley, El Sayed, Naoum and Peden1 The principle ‘see one, do one, teach one’ is increasingly being abandoned as ineffective.Reference Satava2 Training methods that simulate real conditions and scenarios have been adopted in numerous other industries, such as aviation, architecture and the military. Simulation has entered medical education only during the past decade. In order to understand the role of simulation in medical training, it is useful to define the term. Bismuth et al. define simulation as ‘a person, device or set of conditions which attempts to present [education and] evaluation problems authentically’.Reference Bismuth, Donovan, O'Malley, El Sayed, Naoum and Peden1

Training in surgery is entirely different from medical training. As a result, some training programmes end up producing less experienced and less competent surgeons owing to the decreased number of training hours. This could be correlated to the fact that every trainee surgeon has a different learning curve. Moreover, some surgeons may finish their training at a lower point on their learning curve.Reference Pandey, Black, Lazaris, Allenberg, Eckstein and Hagmüller3 Simulation is an excellent adjunct in training and has been adopted by many surgical specialties, including otolaryngology.Reference Musbahi, Aydin, Al Omran, Skilbeck and Ahmed4

According to a literature review by Musbahi et al., there are 64 otolaryngology simulators available, including virtual reality and bench models, with various levels of validity.Reference Musbahi, Aydin, Al Omran, Skilbeck and Ahmed4 The integration of surgical simulation in training is essential as it supports clinical skill acquisition in an environment of reduced learning opportunities, especially after the introduction of the European Working Time Directive.Reference Fitzgerald and Caesar5 Moreover, it enhances communication, decision-making processes and situational awareness.Reference Yule, Parker, Wilkinson, McKinley, MacDonald and Neill6

Medical students and specialty trainees are familiar with objective structured clinical examination, which represents a method of assessment of skills in physical examination, communication and professionalism.Reference Satava2 Although it seems to be a widely accepted method of evaluation, it cannot be applied in surgery, as it does not assess technical skills. The objective structured assessment of technical skill (‘OSATS’) was developed in Toronto by Martin et al.Reference Martin, Regehr, Reznick, Macrae, Murnaghan and Hutchison7 with the purpose of assessing the development of surgical skills.

The objective structured assessment of technical skill in temporal bone dissection (‘TempOSATS’) is a novel proposed tool. Its principal aim is to assess surgical skills in temporal bone dissection and more specifically in cortical mastoidectomy, according to the already validated pillars of the objective structured assessment of technical skill tool.

Our study had two aims. The first was to assess the best material for making a three-dimensional (3D) temporal bone model, to present the advantages of 3D printing in temporal bone dissection as a means of surgical simulation and to implement these technologies in setting up a skills laboratory using exclusively 3D-printed models. The second was to explore some main aspects of the validity and reliability of the proposed objective structured assessment of technical skill in temporal bone dissection tool, which is based on the principles of the objective structured assessment of technical skill, as an assessment tool for basic temporal bone dissection, utilising 3D-printing techniques to establish identical anatomical models.

Materials and methods

Selection of materials and printing modality

After selecting a computed tomography (CT) scan of a well aerated, disease-free temporal bone, we converted the Dicom® data to a stereolithographic (‘stl’) file, which is appropriate for 3D printing. Only a few improvements were required to limit any artifacts in the final format, such as removal of supporting structures from the mastoid air cells and draining holes, depending on the printing method.

The main question was which 3D-printing technology would best approach the anatomical accuracy of a real temporal bone while offering quick reproducibility, satisfactory tactile feedback and an affordable cost. The four options we tested were polylactic acid, polylactic acid plus polyvinyl alcohol, resin by conventional printing and resin by selective laser sintering.

Afterwards, all four models were assessed by a focus group consisting of five specialist otolaryngologists with experience in temporal bone surgery. The focus group also agreed on the steps that should be included in the objective structured assessment of technical skill in temporal bone dissection tool. These are the main surgical steps described in the literature and also reflect the experience of the focus group.Reference Arnoldner, Lin and Chen8,Reference Francis, Masood, Laeeq and Bhatti9 The advantages and disadvantages of these models are summarised in Table 1. Considering the anatomical resemblance and feedback to drilling, we concluded that selective laser sintering resin technology was the best for our purpose. The cost of each model was approximately 25 euros.

Table 1. Comparison of different printing materials and methods

The experiments were conducted on a simple bench with a temporal bone holder and drill, which can be easily replaced by a Dremel-type drill (Illinois, USA). Different types of drill heads were available (cutting and diamond), as well as suction, irrigation and otological micro-instruments (for example, needles and crocodile forceps). The task was cortical mastoidectomy. MacEwen's triangle could be easily identified, as could the spine of Henle and the zygomatic root. Drilling of the selective laser sintering model was smooth, with close to realistic tactile feedback. The mastoid cells were empty of material, and the position of the other landmarks (sigmoid sinus, lateral semi-circular canal and incus buttress) could also be identified. All the surgical steps were agreed beforehand by the members of the group, executed in an uninterrupted sequence and videotaped so they could be reassessed later (Figure 1).

Fig. 1. Flowchart of methodology. 3D = three dimensional; OSATS = objective structured assessment for technical skills

Selection of sample and simulation process

To determine the minimum required sample, power analysis was conducted based on the minimum expected correlation coefficient (Spearman's rank correlation rho) for testing inter- and intra-rater reliability of the two assessors relative to their total scoring of overall achievement. For an anticipated correlation coefficient of rho = 0.60 and a sample size of at least n = 19 units, a two-tailed t-test of the statistical significance of the corresponding correlation coefficient, at significance level α = 0.05, showed enough power (γ = 0.80) to highlight the association as statistically significant. Generally, a correlation coefficient of 0.60 is considered to correspond to a ‘large’ effect size according to Cohen's conventions.Reference Cohen10 Power analysis was conducted with G*Power (version 3.1.2) statistical power analysis software (software detailed in Faul et al.Reference Faul, Erdfelder, Lang and Buchner11 and Faul et al.Reference Faul, Erdfelder, Buchner and Lang12).
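The sample-size reasoning above can be roughly cross-checked with the Fisher z approximation. The sketch below is an illustration under stated assumptions, not the G*Power routine used in the study: it treats the correlation test as a two-tailed normal test on the Fisher-transformed coefficient, and the function names are hypothetical. Because it is an approximation, its answer can differ by a unit or so from an exact calculation.

```python
import math
from statistics import NormalDist


def approx_power(rho: float, n: int, alpha: float = 0.05) -> float:
    """Approximate two-tailed power for testing rho = 0 (Fisher z method)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # The Fisher transform of rho has standard error of about 1 / sqrt(n - 3).
    shift = math.atanh(rho) * math.sqrt(n - 3)
    # The negligible rejection probability in the opposite tail is ignored.
    return NormalDist().cdf(shift - z_crit)


def approx_sample_size(rho: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Smallest n whose approximate power reaches the target (assumed helper)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(((z_crit + z_power) / math.atanh(rho)) ** 2 + 3)


if __name__ == "__main__":
    print(approx_sample_size(0.60))          # approximate minimum n
    print(round(approx_power(0.60, 19), 3))  # approximate power at n = 19
    print(round(approx_power(0.60, 24), 3))  # approximate power at n = 24
```

Under this approximation the required sample comes out at about 20 and the power at n = 19 just under 0.80, which is in the same range as the figure quoted above; the 24 trainees who took part sit above either threshold.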

A flowchart of the methodology and the experimental part is presented in Figure 1. Two of the authors acted as external assessors, who initially delivered a brief tutorial to the candidates (slides disseminated via e-mail), focusing on the objectives and surgical steps that were expected to be performed. Following this, specialty trainees of various levels from rotations in Northern Greece were asked to perform a cortical mastoidectomy in the pilot skills laboratory using the selective laser sintering resin printed temporal bone models. They were invited via personal e-mail invitation, and their participation was registered on a first-come, first-served basis. They all had the same equipment available to complete the task, and they were videotaped (Figures 2–4).

Fig. 2. Temporal bone three-dimensional model.

Fig. 3. Temporal bone skills station.

Fig. 4. Three-dimensional printed model after cortical mastoidectomy.

Assessment and scoring

The videos were numbered from 1 to 24. They were then scored according to the objective structured assessment of technical skill in temporal bone dissection tool by the two external assessors at two different times: after the completion of the experimental part and one month later. Before scoring, a meeting took place for calibration purposes, and the two assessors agreed on the scoring methodology. According to the literature, similar projects involved 2–3 assessors, directly evaluating the candidates, especially for the first time.Reference Martin, Regehr, Reznick, Macrae, Murnaghan and Hutchison7,Reference Hopmans, den Hoed, van der Laan, van der Harst, van der Elst and Mannaerts13,Reference Chang, King, Modest and Hur14 Additionally, the videos were reviewed again after one month. This is in line with the relevant literature, where intra-rater variability was assessed by reviewing video recordings after some days up to six weeks.Reference Chang, King, Modest and Hur14–Reference Schlager, Ahlqvist, Rasmussen-Barr, Bjelland, Pingel and Olsson17

The importance of video recording has been highlighted in several studies.Reference Jokinen, Mikkola and Härkki18,Reference Rezniczek, Severin, Hilal, Dogan, Krentel and Buerkle19 The assessors also scored the candidates according to an already validated global rating scale,Reference Martin, Regehr, Reznick, Macrae, Murnaghan and Hutchison7,Reference Reznick, Regehr, MacRae, Martin and McCulloch20 which was utilised as a control tool for testing the criterion validity of the proposed objective structured assessment of technical skill in temporal bone dissection. As shown in Figure 5, objective structured assessment of technical skill in temporal bone dissection consists of seven questions (scored as yes/no) and one question of overall achievement, scored from 0 to 5. The global rating scale has seven questions, scored from one to five (Figure 6).

Fig. 5. The objective structured assessment of technical skill in temporal bone dissection tool for assessment of cortical mastoidectomy.

Fig. 6. Global rating scale.

The study was approved by the Committee of Bioethics of the Aristotle University Medical School, Thessaloniki, Greece. All participants gave written consent before participating in the experimental part, and the consent forms were also approved by the Committee of Bioethics.

Statistical analysis

Data were summarised by calculating descriptive statistical indices such as absolute and relative frequencies (percentages), measures of central tendency (means and medians) and variability (standard deviations), correlation-association indices (Spearman's rho for correlating quantitative variables, and gamma or Cramer's V for assessing the degree of the association between categorical variables).

The process of testing some aspects of the reliability and the validity of the proposed objective structured assessment of technical skill in temporal bone dissection assessment tool was based on the following methodological scheme: (1) The internal consistency of the objective structured assessment of technical skill in temporal bone dissection tool was tested and evaluated by estimating and assessing the value of the Kuder–Richardson formula 20 reliability coefficient.Reference Nunnally21,Reference Spector22 The Kuder–Richardson formula 20 coefficient is analogous to Cronbach's α reliability coefficient, but it is appropriate for binary items. (2) For both tools, the average discrimination index was calculated. The discrimination index was used for testing the homogeneity of the two tools.Reference Nunnally21 This index is related mainly to the construct validity of a scale consisting of several items. These first two analyses were performed for each examiner within each evaluation time (time 1 and time 2). (3) The criterion validity of the objective structured assessment of technical skill in temporal bone dissection assessment tool was tested and evaluated by correlating, at each evaluation time (time 1 and time 2), the examiners’ scores on the overall assessment item of the objective structured assessment of technical skill in temporal bone dissection tool with the average score of the global rating scale of operative performance tool. (4) The ‘inter-rater’ and ‘intra-rater’ reliability were tested with Spearman's rho and Wilcoxon tests.
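As an illustration of step (1), the Kuder–Richardson formula 20 coefficient can be computed directly from a candidates-by-items matrix of yes/no scores. The sketch below is a minimal version under assumptions: the function name is hypothetical, the example scores are invented for demonstration and are not study data, and the population (divide-by-n) variance is used for the total scores so that it matches the p(1 - p) item variances; some software uses the sample variance instead, which changes the value slightly.

```python
from typing import Sequence


def kr20(scores: Sequence[Sequence[int]]) -> float:
    """Kuder-Richardson formula 20 for a candidates x items matrix of 0/1 scores."""
    n = len(scores)      # number of candidates
    k = len(scores[0])   # number of binary items
    if n < 2 or k < 2:
        raise ValueError("need at least two candidates and two items")
    totals = [sum(row) for row in scores]
    mean_total = sum(totals) / n
    # Population variance of the total scores (assumption, see lead-in).
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("total scores have zero variance; KR-20 is undefined")
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in scores) / n  # proportion scoring 'yes' on item j
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


if __name__ == "__main__":
    # Invented example: four candidates, three yes/no items.
    example = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
    print(kr20(example))
```

For the invented example the item variances sum to 0.625 and the total-score variance is 1.25, so the coefficient is 1.5 × (1 - 0.5) = 0.75.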

In all statistical tests, the observed significance level (p-value) was computed with the Monte-Carlo simulation method utilising 10 000 random samples.Reference Mehta23,Reference Mehta and Patel24 All the statistical analyses were performed with SPSS® (version 24.0) statistical software enhanced with the module ‘exact tests’ (for the implementation of the Monte-Carlo simulation). The significance level in all hypothesis testing procedures was predetermined at α = 0.05 (p ≤ 0.05).
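To make the Monte-Carlo idea concrete, the sketch below estimates a two-sided p-value for Spearman's rho by randomly permuting one set of ranks 10 000 times. It is an assumption-based illustration, not a reproduction of the SPSS ‘exact tests’ module: the function names, the fixed seed, the example scores and the (hits + 1) / (samples + 1) estimator are all choices made here for demonstration.

```python
import math
import random
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i..j, converted to 1-based
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho as the Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def monte_carlo_p(x: Sequence[float], y: Sequence[float],
                  n_samples: int = 10_000, seed: int = 0) -> float:
    """Two-sided Monte-Carlo permutation p-value for Spearman's rho."""
    rng = random.Random(seed)
    rx, ry = average_ranks(x), average_ranks(y)
    observed = abs(pearson(rx, ry))
    shuffled = list(ry)
    hits = 0
    for _ in range(n_samples):
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_samples + 1)


if __name__ == "__main__":
    # Invented scores for eight candidates from two hypothetical assessments.
    first = [1, 2, 3, 4, 5, 6, 7, 8]
    second = [2, 3, 5, 4, 6, 8, 7, 9]
    print(round(spearman_rho(first, second), 3), monte_carlo_p(first, second))
```

Permuting the ranks rather than relying on a large-sample formula is what makes this approach suitable for small samples with many ties, such as scores on a short ordinal scale.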

Results

According to data presented in Table 2, the vast majority of the scores of the two examiners using the two tools, for both time periods, showed satisfactory reliability indices (Kuder–Richardson formula 20 or Cronbach's α reliability coefficients more than or equal to 0.60) and homogeneity (average discrimination index more than 0.30).

Table 2. Reliability results of the two tools used by the two assessors at two time points

For the objective structured assessment of technical skill in temporal bone dissection (TempOSATS) tool items, Cronbach's α reliability coefficient is equivalent to Kuder–Richardson formula 20 (KR20) reliability coefficient, and discrimination index (DI) is the average discrimination index

Based on data presented in Tables 3 and 4, for each examiner, there was a very strong (almost absolute) positive and statistically significant correlation between examiner scores at time 1 and time 2 for the overall assessment of the objective structured assessment of technical skill in temporal bone dissection tool (for examiner one: rho = 0.942, p < 0.001; for examiner two: rho = 0.908, p < 0.001). However, for examiner one there was a statistically significant difference (p = 0.002) between the two assessments (time 1 vs time 2). The mean value of the overall evaluation was estimated to be 3.8 at time 1 and 3.3 at time 2; that is, significantly lower at time 2 (mean difference of 0.5 on a 6-point scale). For examiner two, the Wilcoxon test showed no statistically significant difference (p = 0.748) between the two assessments. It must be noted that in all comparisons, the median values were all equal to 4.0.

Table 3. TempOSATS overall assessment intra-rater reliability and comparison of means

Table shows intra-rater reliability (Spearman's rho rank correlation coefficient) and comparison of means, for each examiner, between time 1 and time 2 and between the two examiners at each time point, for the objective structured assessment of technical skill in temporal bone dissection (TempOSATS) score for overall assessment

Table 4. TempOSATS overall assessment of intra-rater reliability and comparison of means

Table shows intra-rater reliability (Spearman's rho rank correlation coefficient) and comparison of means, for each examiner, between time 1 and time 2 and between the two examiners at each time point, for the objective structured assessment of technical skill in temporal bone dissection (TempOSATS) score for overall assessment

According to the data presented in Tables 3 and 4, at each time point there was a very strong (almost absolute at time 2) positive and statistically significant correlation between the scores of the two examiners for the overall assessment of the objective structured assessment of technical skill in temporal bone dissection tool (at time 1: rho = 0.837, p < 0.001; at time 2: rho = 0.999, p < 0.001). However, at time 1, there was a statistically significant difference (p = 0.035) between the two examiners. The mean value of the overall assessment was estimated to be 3.8 for examiner one and 3.4 for examiner two; that is, significantly lower for examiner two (mean difference of 0.4 on a 6-point scale). At time 2, the Wilcoxon test showed no statistically significant difference (p = 1.000) between the two examiners.

Based on data presented in Tables 5 and 6, for both examiners at times 1 and 2, there was a very strong, positive and statistically significant correlation (p < 0.001) between their overall assessment scores derived from the two tools.

Table 5. Correlation between the overall assessment scores of the two tools at time 1

Table shows correlation (Spearman's rho rank correlation coefficient) between the overall assessment scores of the two tools, objective structured assessment of technical skill in temporal bone dissection (TempOSATS) and global rating scale of operative performance, reported by the two examiners at time 1. E1Τ1 = examiner 1 at time 1; E2Τ1 = examiner 2 at time 1

Table 6. Correlation between the overall assessment scores of the two tools at time 2

Table shows correlation (Spearman's rho rank correlation coefficient) between the overall assessment scores of the two tools, objective structured assessment of technical skill in temporal bone dissection (TempOSATS) and global rating scale of operative performance, reported by the two examiners at time 2. E1Τ2 = examiner 1 at time 2; E2Τ2 = examiner 2 at time 2

Landis and Koch (1977) remark that kappa values around 0.20 express a weak degree of agreement, values around 0.40 indicate a satisfactory degree of agreement, values around 0.60 express a moderate degree of agreement, values around 0.80 indicate a significant degree of agreement and, finally, kappa values over 0.80 express an almost perfect degree of agreement.Reference Landis and Koch25 Based on the data presented in Table 7, the vast majority of Cohen's kappa measures of agreement were greater than 0.80 and statistically significant (maximum p = 0.042, <0.05). The simple overall agreement percentages between any two assessments’ scores were greater than 95 per cent (range, 96 to 100 per cent). Regarding the degree of the association between any two assessments’ scores, the corresponding association indices were very high (both Cramer's V > 0.80 and gamma > 0.80; range, 0.836 to 1) and statistically significant (p < 0.001). Consequently, when testing the items of the objective structured assessment of technical skill in temporal bone dissection tool, the two examiners showed very strong agreement in both the intra- and inter-rater reliability assessments.
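For readers less familiar with the agreement measures above, the sketch below computes Cohen's kappa and the simple overall agreement percentage for two sets of item scores. It is an illustrative sketch only: the function names and example ratings are invented, and returning no value when chance agreement equals 1 is an assumption about one situation in which kappa cannot be computed (both sets of scores using a single identical category), not a description of how the statistical software behaves.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def percent_agreement(r1: Sequence[Hashable], r2: Sequence[Hashable]) -> float:
    """Simple overall agreement between two sets of scores, as a percentage."""
    if len(r1) != len(r2) or not r1:
        raise ValueError("score lists must be non-empty and of equal length")
    agree = sum(a == b for a, b in zip(r1, r2))
    return 100 * agree / len(r1)


def cohen_kappa(r1: Sequence[Hashable], r2: Sequence[Hashable]) -> Optional[float]:
    """Cohen's kappa; None when chance agreement is 1 and kappa is undefined."""
    if len(r1) != len(r2) or not r1:
        raise ValueError("score lists must be non-empty and of equal length")
    n = len(r1)
    observed = sum(a == b for a, b in zip(r1, r2)) / n
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement from the product of the two marginal distributions.
    expected = sum(c1[c] * c2[c] for c in c1) / (n * n)
    if expected == 1.0:
        return None  # assumed fallback case: report percentage agreement instead
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Invented yes/no (1/0) scores on one item for ten candidates.
    first = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
    second = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]
    print(cohen_kappa(first, second), percent_agreement(first, second))
```

In the invented example the two score lists agree on 8 of 10 candidates and chance agreement is 0.52, so kappa is (0.80 - 0.52) / 0.48, roughly 0.58, which shows how kappa can be noticeably lower than the raw agreement percentage.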

Table 7. Degree of agreement or correlation between scores and overall performance of TempOSATS tool

Table shows degree of agreement (Cohen's kappa measure) or correlation (Cramer's V and gamma association indices) between the two examiners' scores within each attempt and between the two attempts (time 1 and time 2) for the 7 items and the overall performance of the objective structured assessment of technical skill in temporal bone dissection (TempOSATS) assessment tool. *In those cases where it was not possible to compute the Cohen's kappa measure of agreement, the simple overall agreement percentage between any two assessments is reported instead; in those cases where it was not possible to compute the Cohen's kappa measure of agreement, the Cramer's V and gamma association indices between any two assessments are reported instead. E1 = examiner 1; E2 = examiner 2; Τ1 = time 1; Τ2 = time 2

Tables 8 and 9 present the results of the intra- and inter-rater reliability testing for the average summated score of the global rating scale of operative performance tool. In all testing procedures, there was a very strong (almost absolute) positive and statistically significant correlation between any two assessments, in all cases p < 0.001 (Tables 8 and 9).

Table 8. Intra-rater reliability and comparison of means for the average summated score of the GRSOP

Table shows intra-rater reliability (Spearman's rho rank correlation coefficient) and comparison of means, for each examiner, between time 1 and time 2, and between the two examiners at each time point, for the average summated score of the global rating scale of operative performance (GRSOP).

Table 9. Intra-rater reliability and comparison of means for the average summated score of the GRSOP

Table shows intra-rater reliability (Spearman's rho rank correlation coefficient) and comparison of means, for each examiner, between time 1 and time 2, and between the two examiners at each time point, for the average summated score of the global rating scale of operative performance (GRSOP).

Discussion

Surgical training was previously mainly confined to the practice and development of surgical skills in the operating theatre. According to Reznick et al., the operating theatre has many limitations when it comes to training and assessment. First of all, it is difficult to standardise any operation in similar training patterns. Secondly, it is almost impossible to standardise the degree to which a trainee is performing elements of an operation. In addition, surgical time is far more expensive compared with any other training method.Reference Reznick, Regehr, MacRae, Martin and McCulloch20

To overcome the above limitations, efforts have been made to develop effective teaching methods. Animal models are carefully selected to simulate human anatomy, and the animal must be anaesthetised before the operation. Obviously, ethical issues are involved, and animal models do not offer a wide range of alternatives to real patients. Bench models simulate human anatomy well and are used for ordinary surgical tasks. Compared with previous methods, bench model based training has a lower cost, is portable and readily available, and allows the reproducibility of various tasks.Reference Martin, Regehr, Reznick, Macrae, Murnaghan and Hutchison7

Objective structured assessment of technical skill gives the candidate a score that ranges from 8 to 40, with 24 representing a competent performance. Pandey et al.Reference Pandey, Black, Lazaris, Allenberg, Eckstein and Hagmüller3 described the value of objective structured assessment of technical skill. Despite the small number of participants (15 surgical trainees), this study showed that the participants had significant improvement in all aspects of their generic skill but mainly improved in the flow of the procedure, their overall performance and their procedure-specific skills. In the same study, although significant improvement was observed, some participants did not improve. They were mainly older surgeons who proved to be less able to learn in this type of setting because they had accumulated other methods of performing the examined procedures. Another reason may be that they have learned other types of the same procedure that are different from those demonstrated to them.Reference Pandey, Black, Lazaris, Allenberg, Eckstein and Hagmüller3

The Vascular Department of Imperial College London, which is based at St Mary's Hospital, adopted objective structured assessment of technical skill in their surgeons’ training. They took objective structured assessment of technical skill a step beyond its original idea: evaluating surgical competence in a specific procedure and not only basic surgical tasks. The new tool that Imperial introduced was called Imperial College Evaluation of Procedure Specific Skill. This involves a rating scale with five standard points to assess the content of a procedure.Reference Bismuth, Donovan, O'Malley, El Sayed, Naoum and Peden1

There is no doubt as to the value of surgical skills assessment. The most beneficial impact is the considerable improvement in patient safety because the trainee surgeon does not practise a specific procedure on a patient for the first time. In addition, the ‘learning curve’ of making mistakes takes place in the laboratory and not on a patient. In that way, the trainee can perform the same procedure many times until improvement is reached. As a result, operating time decreases, efficiency increases and medical errors decrease.Reference Satava2 This agrees with our philosophy of applying the objective structured assessment of technical skill principles to the whole surgical procedure and not only to limited skills. Moreover, our experiments demonstrated the need for adequate calibration between the assessors, some discrepancies in scoring that may be related to the different levels of experience of the assessors, and the value of video recordings, which allow more careful evaluation of the various surgical steps.

A possible problem in applying objective structured assessment of technical skill in every training hospital is the relatively high cost. When the method first became known, only a few major teaching centres had the resources to organise courses and evaluations, and this could only occur a few times a year. The costs of models, facilities and especially trainers are obstacles to its wider adoption.Reference Pandey, Black, Lazaris, Allenberg, Eckstein and Hagmüller3 In our study, we managed to reproduce a number of identical models of temporal bones at a low cost, and the printing time was a few hours for each.

Three-dimensional printing is a technology that has been known since the 1980s, but its involvement in the medical field has increased significantly over the last two decades, with numerous examples in training, patient education and bioengineering. Three-dimensional printing equipment has improved, is less expensive and the expertise is more widespread, and therefore it has become available in many parts of the world for medical use in several fields.Reference Crafts, Ellsperman, Wannemuehler, Bellicchi, Shipchandler and Mantravadi26,Reference Gross, Erkal, Lockwood, Chen and Spence27

There are numerous studies available in the literature, exploring the potential use of 3D-printing technologies in ENT head and neck surgery. They vary from pre-operative planning and patient education to more advanced training applications for residents and undergraduate medical students. Additionally, there have been descriptions of applications associated with tissue engineering and prosthetics, which are extremely promising for medical innovations in the near future.

According to Canzi et al., there are 23 studies in the literature focusing on otological applications in training, mainly to do with temporal bone surgery simulation.Reference Canzi, Magnetto, Marconi, Morbini, Mauramati and Aprile28 In 2015, a temporal bone model based on CT scan data of two selected patients with well pneumatised and disease-free mastoids was developed. The final evaluation of the models showed satisfactory reproducibility of most structures and anatomical landmarks but also raised two significant issues: the accuracy of the ossicular chain (mainly the stapes) and also the retained resin within the mastoid air cells. The latter issue impacts the drilling experience and can be overcome by adding a small drain hole in the region of the sigmoid sinus. The authors concluded that the model produced is useful for training, without depleting a limited supply of cadavers and by using conventional (non-surgical) tools, such as a Dremel drill.Reference Yushkevich, Piven, Hazlett, Smith, Ho and Gee29 On the other hand, it is still difficult to approach the ‘natural’ structure of the cadaveric specimen, mainly because of the ‘stair-stepping’ artifact and the lack of anatomical elements such as the dura, nerves, blood vessels, tympanic membrane, and oval and round windows.Reference Cohen and Reyes30 We have overcome the obstacles of stair-stepping and retained resin by comparing different materials and printing techniques and choosing selective laser sintering printing. This method allows more accurate printing without retained material and better external and internal contours.

  • Objective structured assessment of technical skill is a widely accepted tool for assessing surgical skills

  • Only a few of its applications in otolaryngology have been explored so far

  • There are numerous studies in the literature exploring the potential use of three-dimensional printing

  • Three-dimensional printing is a novel but reliable approach to surgical simulation

  • This study explored the validity and reliability of a newly proposed assessment model for surgical training

  • The objective structured assessment of technical skill in temporal bone dissection is a tool that can be useful in training assessment

Other groups have also confirmed the similarity to cadaveric specimens and reported positive feedback from trainees.[31–34] More specifically, Hochman et al. showed, using both subjective and objective methods, that tactile feedback is satisfactory. Improvements in materials have provided a better simulation of bone consistency, resulting in a more realistic experience.[35] A useful adjunct in training is coupling with electronic simulators, which offers the possibility of a real-time alert in case of injury to a vital structure. An example is the ElePhant model (Electronic Phantom), in which the facial nerve is replaced with a conductive alloy or fibre-optic material, allowing immediate feedback.[36] Anecdotal feedback from the participants confirmed the satisfactory tactile feedback, which is associated with the different thicknesses of the structures (mastoid air cells and bony labyrinth).

Our group has studied different materials and printing techniques, and their application on a relatively larger scale has shown that such methods can be used to run skills laboratories based on 3D-printed models.

Conclusion

Three-dimensional printing is a novel but equally reliable approach to surgical simulation, and reproduction of anatomical models can be of great value in training and personalised patient care. Additionally, the objective structured assessment of technical skill in temporal bone dissection is a tool that can be extremely useful in the assessment of training and in monitoring a surgeon's learning curve. Further studies are necessary to extend its application to more complex operations, in which cortical mastoidectomy represents the initial stage of surgery.

Competing interests

None declared

Footnotes

Dr M Stavrakas takes responsibility for the integrity of the content of the paper

References

1. Bismuth, J, Donovan, MA, O'Malley, MK, El Sayed, HF, Naoum, JJ, Peden, EK et al. Incorporating simulation in vascular surgery education. J Vasc Surg 2010;52:1072–80
2. Satava, RM. The revolution in medical education – the role of simulation. J Grad Med Educ 2009;1:172–5
3. Pandey, VA, Black, SA, Lazaris, AM, Allenberg, JR, Eckstein, HH, Hagmüller, GW et al. Do workshops improve the technical skill of vascular surgical trainees? Eur J Vasc Endovasc Surg 2005;30:441–7
4. Musbahi, O, Aydin, A, Al Omran, Y, Skilbeck, CJ, Ahmed, K. Current status of simulation in otolaryngology: a systematic review. J Surg Educ 2017;74:203–15
5. Fitzgerald, JEF, Caesar, BC. The European working time directive: a practical review for surgical trainees. Int J Surg 2012;10:399–403
6. Yule, S, Parker, SH, Wilkinson, J, McKinley, A, MacDonald, J, Neill, A et al. Coaching non-technical skills improves surgical residents' performance in a simulated operating room. J Surg Educ 2015;72:1124–30
7. Martin, JA, Regehr, G, Reznick, R, Macrae, H, Murnaghan, J, Hutchison, C et al. Objective structured assessment of technical skill (OSATS) for surgical residents. Br J Surg 1997;84:273–8
8. Arnoldner, C, Lin, VYW, Chen, JM. Cortical mastoidectomy. In: Manual of Otologic Surgery. Vienna: Springer, 2015;513
9. Francis, HW, Masood, H, Laeeq, K, Bhatti, NI. Defining milestones toward competency in mastoidectomy using a skills assessment paradigm. Laryngoscope 2010;120:1417–21
10. Cohen, J. Statistical Power Analysis for the Behavioral Sciences, 2nd edn. New Jersey: Lawrence Erlbaum Associates, 1988;13
11. Faul, F, Erdfelder, E, Lang, A-G, Buchner, A. G*Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav Res Methods 2007;39:175–91
12. Faul, F, Erdfelder, E, Buchner, A, Lang, A-G. Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses. Behav Res Methods 2009;41:1149–60
13. Hopmans, CJ, den Hoed, PT, van der Laan, L, van der Harst, E, van der Elst, M, Mannaerts, GHH et al. Assessment of surgery residents' operative skills in the operating theater using a modified objective structured assessment of technical skills (OSATS): a prospective multicenter study. Surgery 2014;156:1078–88
14. Chang, OH, King, LP, Modest, AM, Hur, H-C. Developing an objective structured assessment of technical skills for laparoscopic suturing and intracorporeal knot tying. J Surg Educ 2016;73:258–63
15. Siddiqui, NY, Stepp, KJ, Lasch, SJ, Mangel, JM, Wu, JM. Objective structured assessment of technical skills for repair of fourth-degree perineal lacerations. Am J Obstet Gynecol 2008;199:676
16. Siddiqui, NY, Galloway, ML, Geller, EJ, Green, IC, Hur, H-C, Langston, K et al. Validity and reliability of the robotic objective structured assessment of technical skills. Obstet Gynecol 2014;123:1193
17. Schlager, A, Ahlqvist, K, Rasmussen-Barr, E, Bjelland, EK, Pingel, R, Olsson, C et al. Inter- and intra-rater reliability for measurement of range of motion in joints included in three hypermobility assessment methods. BMC Musculoskelet Disord 2018;19:376
18. Jokinen, E, Mikkola, TS, Härkki, P. Simulator training and residents' first laparoscopic hysterectomy: a randomized controlled trial. Surg Endosc 2020;34:4874–82
19. Rezniczek, GA, Severin, S, Hilal, Z, Dogan, A, Krentel, H, Buerkle, B et al. Surgical performance of large loop excision of the transformation zone in a training model: a prospective cohort study. Medicine (Baltimore) 2017;96:7026
20. Reznick, R, Regehr, G, MacRae, H, Martin, J, McCulloch, W. Testing technical skill via an innovative "bench station" examination. Am J Surg 1997;173:226–30
21. Nunnally, JC. Psychometric Theory, 3rd edn. New York: Tata McGraw-Hill Education, 1994
22. Spector, PE. Summated Rating Scale Construction: An Introduction. Newbury Park, CA: Sage, 1992
23. Mehta, CR. SPSS Exact Tests 7.0 for Windows. Chicago: SPSS Inc, 1996
24. Mehta, CR, Patel, NR. Exact permutational inference for categorical and nonparametric data. Stat Strateg Small Sample Res. 1999;129
25. Landis, JR, Koch, GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159–74
26. Crafts, TD, Ellsperman, SE, Wannemuehler, TJ, Bellicchi, TD, Shipchandler, TZ, Mantravadi, AV. Three-dimensional printing and its applications in otorhinolaryngology – head and neck surgery. Otolaryngol Neck Surg 2017;156:999–1010
27. Gross, BC, Erkal, JL, Lockwood, SY, Chen, C, Spence, DM. Evaluation of 3D printing and its potential impact on biotechnology and the chemical sciences. Anal Chem 2014;86:3240–53
28. Canzi, P, Magnetto, M, Marconi, S, Morbini, P, Mauramati, S, Aprile, F et al. New frontiers and emerging applications of 3D printing in ENT surgery: a systematic review of the literature. Acta Otorhinolaryngol Ital 2018;38:286–303
29. Yushkevich, PA, Piven, J, Hazlett, HC, Smith, RG, Ho, S, Gee, JC et al. User-guided 3D active contour segmentation of anatomical structures: significantly improved efficiency and reliability. Neuroimage 2006;31:1116–28
30. Cohen, J, Reyes, SA. Creation of a 3D printed temporal bone model from clinical CT data. Am J Otolaryngol 2015;36:619–24
31. Da Cruz, MJ, Francis, HW. Face and content validation of a novel three-dimensional printed temporal bone for surgical skills development. J Laryngol Otol 2015;129:S23–9
32. Hochman, JB, Rhodes, C, Wong, D, Kraut, J, Pisa, J, Unger, B. Comparison of cadaveric and isomorphic three-dimensional printed models in temporal bone education. Laryngoscope 2015;125:2353–7
33. Mowry, SE, Jammal, H, Myer, C IV, Solares, CA, Weinberger, P. A novel temporal bone simulation model using 3D printing techniques. Otol Neurotol 2015;36:1562–5
34. Rose, AS, Webster, CE, Harrysson, OLA, Formeister, EJ, Rawal, RB, Iseli, CE. Preoperative simulation of pediatric mastoid surgery with 3D-printed temporal bone models. Int J Pediatr Otorhinolaryngol 2015;79:740–4
35. Hochman, JB, Kraut, J, Kazmerik, K, Unger, BJ. Generation of a 3D printed temporal bone model with internal fidelity and validation of the mechanical construct. Otolaryngol Neck Surg 2014;150:448–54
36. Grunert, R, Strauss, G, Moeckel, H, Hofer, M, Poessneck, A, Fickweiler, U et al. ElePhant – an anatomical electronic phantom as simulation-system for otologic surgery. Conf Proc IEEE Eng Med Biol Soc 2006;4408–11

Table 1. Comparison of different printing materials and methods

Fig. 1. Flowchart of methodology. 3D = three dimensional; OSATS = objective structured assessment for technical skills

Fig. 2. Temporal bone three-dimensional model.

Fig. 3. Temporal bone skills station.

Fig. 4. Three-dimensional printed model after cortical mastoidectomy.

Fig. 5. The objective structured assessment of technical skill in temporal bone dissection tool for assessment of cortical mastoidectomy.

Fig. 6. Global rating scale.

Table 2. Reliability results of the two tools used by the two assessors at two time points

Table 3. TempOSATS overall assessment intra-rater reliability and comparison of means

Table 4. TempOSATS overall assessment of intra-rater reliability and comparison of means

Table 5. Correlation between the overall assessment scores of the two tools at time 1

Table 6. Correlation between the overall assessments scores of the two tools at time 2

Table 7. Degree of agreement or correlation between scores and overall performance of TempOSATS tool

Table 8. Intra-rater reliability and comparison of means for the average summated score of the GRSOP

Table 9. Intra-rater reliability and comparison of means for the average summated score of the GRSOP