Introduction
A hugely important part of surgery is the training of doctors, followed by the assessment of their competence and of the quality of the training they have received. Improvement of surgical skills should not follow Halsted's model, which claims that learning is achieved by performing the procedure.Reference Bismuth, Donovan, O'Malley, El Sayed, Naoum and Peden1 The principle ‘see one, do one, teach one’ is increasingly being abandoned as ineffective.Reference Satava2 Training methods that simulate real conditions and scenarios have been adopted in numerous other industries, such as aviation, architecture and the military. Simulation has entered medical education only during the past decade. In order to understand the role of simulation in medical training, it is useful to define the term. Bismuth et al. define simulation as ‘a person, device or set of conditions which attempts to present [education and] evaluation problems authentically’.Reference Bismuth, Donovan, O'Malley, El Sayed, Naoum and Peden1
Training in surgery is entirely different from medical training. As a result, some training programmes end up producing less experienced and less competent surgeons owing to the decreased number of training hours. This could be correlated to the fact that every trainee surgeon has a different learning curve. Moreover, some surgeons may finish their training at a lower point on their learning curve.Reference Pandey, Black, Lazaris, Allenberg, Eckstein and Hagmüller3 Simulation is an excellent adjunct in training and has been adopted by many surgical specialties, including otolaryngology.Reference Musbahi, Aydin, Al Omran, Skilbeck and Ahmed4
According to a literature review by Musbahi et al., there are 64 otolaryngology simulators available, including virtual reality and bench models, with various levels of validity.Reference Musbahi, Aydin, Al Omran, Skilbeck and Ahmed4 The integration of surgical simulation in training is essential as it supports clinical skill acquisition in an environment of reduced learning opportunities, especially after the introduction of the European Working Time Directive.Reference Fitzgerald and Caesar5 Moreover, it enhances communication, decision-making processes and situational awareness.Reference Yule, Parker, Wilkinson, McKinley, MacDonald and Neill6
Medical students and specialty trainees are familiar with objective structured clinical examination, which represents a method of assessment of skills in physical examination, communication and professionalism.Reference Satava2 Although it seems to be a widely accepted method of evaluation, it cannot be applied in surgery, as it does not assess technical skills. The objective structured assessment of technical skill (‘OSATS’) was developed in Toronto by Martin et al.Reference Martin, Regehr, Reznick, Macrae, Murnaghan and Hutchison7 with the purpose of assessing the development of surgical skills.
The objective structured assessment of technical skill in temporal bone dissection (‘TempOSATS’) is a novel proposed tool. Its principal aim is to assess surgical skills in temporal bone dissection and more specifically in cortical mastoidectomy, according to the already validated pillars of the objective structured assessment of technical skill tool.
Our study comprised two aims. The first was to assess the best material to make a three-dimensional (3D) temporal bone model, to present the advantages of 3D printing in temporal bone dissection as a means of surgical simulation and to implement these technologies in setting up a skills laboratory using exclusively 3D-printed models. Moreover, we aim to explore some main aspects of the validity and reliability of the proposed objective structured assessment of technical skill in temporal bone dissection tool, which is based on the principles of the objective structured assessment of technical skill, as an assessment tool for basic temporal bone dissection, utilising 3D-printing techniques to establish identical anatomical models.
Materials and methods
Selection of materials and printing modality
After selecting a computed tomography (CT) scan of a well aerated, disease-free temporal bone, we converted the DICOM (Digital Imaging and Communications in Medicine) data to a stereolithography (‘stl’) file, which is appropriate for 3D printing. Only a few improvements were required to limit any artifacts in the final format, such as removal of supporting structures from the mastoid air cells and the addition of draining holes, depending on the printing method.
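To make the target file format concrete, the following minimal Python sketch writes a triangle mesh as an ASCII stl file. It is only an illustration under stated assumptions: the names `facet_normal`, `mesh_to_ascii_stl` and `TETRAHEDRON` are hypothetical, the tetrahedron stands in for a real bone surface, and the step that extracts a surface mesh from the CT volume is not shown because the software used for it is not specified here.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]
Triangle = Tuple[Vertex, Vertex, Vertex]


def facet_normal(tri: Triangle) -> Vertex:
    """Unit normal of a triangle from the cross product of two of its edges."""
    (ax, ay, az), (bx, by, bz), (cx, cy, cz) = tri
    ux, uy, uz = bx - ax, by - ay, bz - az
    vx, vy, vz = cx - ax, cy - ay, cz - az
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    length = (nx * nx + ny * ny + nz * nz) ** 0.5
    if length == 0:
        return (0.0, 0.0, 0.0)  # degenerate triangle
    return (nx / length, ny / length, nz / length)


def mesh_to_ascii_stl(triangles: List[Triangle], name: str = "model") -> str:
    """Serialise a list of triangles in the ASCII stl layout."""
    lines = [f"solid {name}"]
    for tri in triangles:
        nx, ny, nz = facet_normal(tri)
        lines.append(f"  facet normal {nx:.6e} {ny:.6e} {nz:.6e}")
        lines.append("    outer loop")
        for x, y, z in tri:
            lines.append(f"      vertex {x:.6e} {y:.6e} {z:.6e}")
        lines.append("    endloop")
        lines.append("  endfacet")
    lines.append(f"endsolid {name}")
    return "\n".join(lines) + "\n"


# Hypothetical stand-in mesh: a unit tetrahedron with outward-facing triangles.
O, A, B, C = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
TETRAHEDRON: List[Triangle] = [(O, B, A), (O, A, C), (O, C, B), (A, B, C)]

if __name__ == "__main__":
    print(mesh_to_ascii_stl(TETRAHEDRON, "demo"))
```

In practice the triangle list would come from a surface extracted from the segmented CT data, and clean-up such as adding draining holes would be applied to the mesh before export.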
The main question was which 3D-printing technology would best approach the anatomical accuracy of the real temporal bone while allowing quick reproducibility, satisfactory tactile feedback and affordable cost. The materials we tested were polylactic acid, polylactic acid plus polyvinyl alcohol, resin by conventional printing and selective laser sintering.
Afterwards, all four models were assessed by a focus group consisting of five specialist otolaryngologists with experience in temporal bone surgery. The focus group also agreed on the steps that should be included in the objective structured assessment of technical skill in temporal bone dissection tool. These are the main surgical steps described in the literature and also reflect the experience of the focus group.Reference Arnoldner, Lin and Chen8,Reference Francis, Masood, Laeeq and Bhatti9 The advantages and disadvantages of these models are summarised in Table 1. Considering the anatomical resemblance and the tactile feedback during drilling, we concluded that selective laser sintering resin technology was the best for our purpose. The cost of each model was approximately 25 euros.
Table 1. Comparison of different printing materials and methods
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220511162900662-0225:S0022215121001201:S0022215121001201_tab1.png?pub-status=live)
The experiments were conducted on a simple bench with a temporal bone holder and drill, which can be easily replaced by a Dremel-type drill (Illinois, USA). Different types of drill heads were available (cutting and diamond), as well as suction, irrigation and otological micro-instruments (for example, needles and crocodile forceps). The task was cortical mastoidectomy. MacEwen's triangle could be easily identified, as could the spine of Henle and the zygomatic root. Drilling of the selective laser sintering model was smooth, with close to realistic tactile feedback. The mastoid cells were empty of material, and the position of the other landmarks (sigmoid sinus, lateral semi-circular canal and incus buttress) could also be identified. All the surgical steps were previously agreed by the members of the group, executed in an uninterrupted sequence and videotaped so they could be reassessed later (Figure 1).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220511162900662-0225:S0022215121001201:S0022215121001201_fig1.png?pub-status=live)
Fig. 1. Flowchart of methodology. 3D = three dimensional; OSATS = objective structured assessment for technical skills
Selection of sample and simulation process
To determine the minimum required sample, power analysis was conducted following the minimum expected correlation coefficient (Spearman's rank correlation rho) for testing inter- and intra-rater reliability of the two assessors relative to their total scoring of overall achievement. For an anticipated correlation coefficient of rho = 0.60 utilising a sample size of at least n = 19 units, a two-tailed t-test for testing the statistical significance of the corresponding correlation coefficient, at significance level α = 0.05, showed enough power (γ = 0.80) to highlight the association as statistically significant. Generally, a value of a correlation coefficient of 0.60 is considered to correspond to a ‘large’ effect size according to Cohen's conventions.Reference Cohen10 Power analysis was conducted with G*Power (version 3.1.2) statistical power analysis software (software detailed in Faul et al.Reference Faul, Erdfelder, Lang and Buchner11 and Faul et al.Reference Faul, Erdfelder, Buchner and Lang12).
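The order of magnitude of this sample size can be checked with a short Python sketch. This is not the G*Power calculation reported above: it assumes the Fisher z approximation for the sampling distribution of a correlation coefficient, and the function names `correlation_power` and `min_sample_size` are hypothetical. Under that approximation, n = 19 gives a power of roughly 0.79 and the smallest n reaching 0.80 is 20, so small differences from the reported figure are to be expected.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_power(rho: float, n: int, alpha: float = 0.05) -> float:
    """Approximate two-tailed power to detect a correlation of size rho.

    Assumption: Fisher z approximation, i.e. atanh(r) is treated as normal
    with standard error 1 / sqrt(n - 3).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = atanh(rho) * sqrt(n - 3)
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail


def min_sample_size(rho: float, power: float = 0.80, alpha: float = 0.05) -> int:
    """Smallest n whose approximate power reaches the requested level."""
    n = 4  # the approximation needs n > 3
    while correlation_power(rho, n, alpha) < power:
        n += 1
    return n


if __name__ == "__main__":
    print(round(correlation_power(0.60, 19), 3))
    print(min_sample_size(0.60))
```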
A flowchart of the methodology and the experimental part is presented in Figure 1. Two of the authors acted as external assessors, who initially delivered a brief tutorial to the candidates (slides disseminated via e-mail), focusing on the objectives and surgical steps that were expected to be performed. Following this, specialty trainees of various levels from rotations in Northern Greece were asked to perform a cortical mastoidectomy in the pilot skills laboratory using the selective laser sintering resin printed temporal bone models. They were invited via personal e-mail invitation, and their participation was registered on a first-come, first-served basis. They all had the same equipment available to complete the task, and they were videotaped (Figures 2–4).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220511162900662-0225:S0022215121001201:S0022215121001201_fig2.png?pub-status=live)
Fig. 2. Temporal bone three-dimensional model.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220511162900662-0225:S0022215121001201:S0022215121001201_fig3.png?pub-status=live)
Fig. 3. Temporal bone skills station.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220511162900662-0225:S0022215121001201:S0022215121001201_fig4.png?pub-status=live)
Fig. 4. Three-dimensional printed model after cortical mastoidectomy.
Assessment and scoring
The videos were given a number from 1 to 24. Then they were scored according to objective structured assessment of technical skill in temporal bone dissection by the two external assessors at two different times: after the completion of the experimental part and one month later. Before scoring, a meeting took place for calibration purposes, and the two assessors agreed on the scoring methodology. According to the literature, similar projects involved 2–3 assessors, directly evaluating the candidates, especially for the first time.Reference Martin, Regehr, Reznick, Macrae, Murnaghan and Hutchison7,Reference Hopmans, den Hoed, van der Laan, van der Harst, van der Elst and Mannaerts13,Reference Chang, King, Modest and Hur14 Additionally, the videos were reviewed again after one month. This is in line with the relevant literature, where intra-rater variability was assessed by reviewing video recordings after some days up to six weeks.Reference Chang, King, Modest and Hur14–Reference Schlager, Ahlqvist, Rasmussen-Barr, Bjelland, Pingel and Olsson17
The importance of video recording has been highlighted in several studies.Reference Jokinen, Mikkola and Härkki18,Reference Rezniczek, Severin, Hilal, Dogan, Krentel and Buerkle19 The assessors also scored the candidates according to an already validated global rating scale,Reference Martin, Regehr, Reznick, Macrae, Murnaghan and Hutchison7,Reference Reznick, Regehr, MacRae, Martin and McCulloch20 which was utilised as a control tool for testing the criterion validity of the proposed objective structured assessment of technical skill in temporal bone dissection. As shown in Figure 5, objective structured assessment of technical skill in temporal bone dissection consists of seven questions (scored as yes/no) and one question of overall achievement, scored from 0 to 5. The global rating scale has seven questions, scored from one to five (Figure 6).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220511162900662-0225:S0022215121001201:S0022215121001201_fig5.png?pub-status=live)
Fig. 5. The objective structured assessment of technical skill in temporal bone dissection tool for assessment of cortical mastoidectomy.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220511162900662-0225:S0022215121001201:S0022215121001201_fig6.png?pub-status=live)
Fig. 6. Global rating scale.
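The structure of the two scoring sheets can be sketched as simple data classes. This is an illustrative sketch only: the class and field names are hypothetical, and the individual item wordings (shown in Figures 5 and 6) are not reproduced, so items are represented by position.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TempOsatsSheet:
    """Seven yes/no step items plus one overall achievement item (0 to 5)."""
    steps_completed: List[bool]
    overall: int

    def __post_init__(self) -> None:
        if len(self.steps_completed) != 7:
            raise ValueError("expected seven yes/no items")
        if not 0 <= self.overall <= 5:
            raise ValueError("overall achievement must be between 0 and 5")


@dataclass
class GlobalRatingSheet:
    """Seven items, each scored from 1 to 5."""
    items: List[int]

    def __post_init__(self) -> None:
        if len(self.items) != 7:
            raise ValueError("expected seven items")
        if any(not 1 <= score <= 5 for score in self.items):
            raise ValueError("each item must be between 1 and 5")

    def mean_score(self) -> float:
        """Average item score, as used when comparing the two tools."""
        return sum(self.items) / len(self.items)


if __name__ == "__main__":
    sheet = TempOsatsSheet(steps_completed=[True] * 6 + [False], overall=4)
    rating = GlobalRatingSheet(items=[4, 4, 3, 5, 4, 3, 4])
    print(sheet.overall, round(rating.mean_score(), 2))
```

Validating the ranges at construction time keeps data entry errors out of the later reliability calculations.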
The study was approved by the Committee of Bioethics of the Aristotle University Medical School, Thessaloniki, Greece. All participants gave written consent before participating in the experimental part, and the consent forms were also approved by the Committee of Bioethics.
Statistical analysis
Data were summarised by calculating descriptive statistical indices such as absolute and relative frequencies (percentages), measures of central tendency (means and medians) and variability (standard deviations), correlation-association indices (Spearman's rho for correlating quantitative variables, and gamma or Cramer's V for assessing the degree of the association between categorical variables).
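For readers unfamiliar with Spearman's rho, the sketch below computes it as the Pearson correlation of average ranks, which handles the tied scores that are common on short ordinal scales. The analyses in this study were run in SPSS; this standalone version, its function names and the example scores are hypothetical and for illustration only.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical overall scores given by one examiner at two time points.
    time_1 = [3, 4, 4, 5, 2]
    time_2 = [3, 4, 5, 5, 2]
    print(round(spearman_rho(time_1, time_2), 3))
```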
The process of testing some aspects of the reliability and the validity of the proposed objective structured assessment of technical skill in temporal bone dissection assessment tool was based on the following methodological scheme: (1) the internal consistency of the objective structured assessment of technical skill in temporal bone dissection tool was tested and evaluated by estimating and assessing the value of the Kuder–Richardson formula 20 reliability coefficient.Reference Nunnally21,Reference Spector22 The Kuder–Richardson formula 20 coefficient is analogous to Cronbach's α reliability coefficient, but it is appropriate for binary items. (2) For both tools, the average discrimination index was calculated. The discrimination index was used for testing the homogeneity of the two tools.Reference Nunnally21 This index is related mainly to the construct validity of a scale consisting of several items. These first two analyses were performed for each examiner within each evaluation time (time 1 and time 2). (3) The criterion validity of the objective structured assessment of technical skill in temporal bone dissection assessment tool was tested and evaluated by correlating, at each evaluation time (time 1 and time 2), the examiners’ scores on the overall assessment item of the objective structured assessment of technical skill in temporal bone dissection tool with the average score of the global rating scale of operative performance tool. (4) The ‘inter-rater’ and ‘intra-rater’ reliability were tested with Spearman's rho and Wilcoxon tests.
In all statistical tests, the observed significance level (p-value) was computed with the Monte-Carlo simulation method utilising 10 000 random samples.Reference Mehta23,Reference Mehta and Patel24 All the statistical analyses were performed with SPSS® (version 24.0) statistical software enhanced with the module ‘exact tests’ (for the implementation of the Monte-Carlo simulation). The significance level in all hypothesis testing procedures was predetermined at α = 0.05 (p ≤ 0.05).
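The Kuder–Richardson formula 20 coefficient used in step (1) can be written as KR20 = (k / (k − 1)) × (1 − Σ p_i q_i / σ²), where k is the number of binary items, p_i is the proportion of candidates scoring ‘yes’ on item i, q_i = 1 − p_i and σ² is the variance of the total scores. The sketch below is a minimal illustration, not the SPSS routine used in the study; the function name `kr20` and the response matrix are hypothetical, and it assumes the population variance convention (some texts use the sample variance, which gives slightly different values).

```python
from statistics import pvariance
from typing import List, Sequence


def kr20(responses: Sequence[Sequence[int]]) -> float:
    """Kuder-Richardson formula 20 for a candidates x items matrix of 0/1 scores.

    Assumption: the variance of the total scores is the population variance.
    """
    n_candidates = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    total_variance = pvariance(totals)
    if n_items < 2 or total_variance == 0:
        raise ValueError("KR-20 is undefined for this response matrix")
    pq_sum = 0.0
    for item in range(n_items):
        p = sum(row[item] for row in responses) / n_candidates
        pq_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq_sum / total_variance)


if __name__ == "__main__":
    # Hypothetical yes/no (1/0) scores: four candidates, three items.
    demo: List[List[int]] = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
    print(kr20(demo))
```

For the hypothetical matrix above, the item variances sum to 0.625 and the total-score variance is 1.25, so the coefficient works out to 1.5 × (1 − 0.5) = 0.75.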
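The idea behind a Monte-Carlo p-value can be illustrated for the Wilcoxon signed-rank test. The sketch below is a simplified stand-in for the SPSS ‘exact tests’ module, not a reproduction of it: it assumes that, under the null hypothesis, each non-zero paired difference is equally likely to be positive or negative, and it estimates the two-sided p-value as the fraction of random sign assignments that give a rank sum at least as extreme as the observed one. The function names, the fixed seed and the example data are hypothetical.

```python
import random
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def monte_carlo_wilcoxon(first: Sequence[float], second: Sequence[float],
                         n_samples: int = 10_000, seed: int = 0) -> float:
    """Monte-Carlo estimate of the two-sided Wilcoxon signed-rank p-value."""
    diffs = [b - a for a, b in zip(first, second) if b != a]  # drop zero differences
    if not diffs:
        return 1.0
    ranks = average_ranks([abs(d) for d in diffs])
    expected = sum(ranks) / 2  # mean of the positive-rank sum under the null
    observed = abs(sum(r for r, d in zip(ranks, diffs) if d > 0) - expected)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Assign a random sign to every rank and recompute the positive-rank sum.
        w_plus = sum(r for r in ranks if rng.random() < 0.5)
        if abs(w_plus - expected) >= observed - 1e-12:
            hits += 1
    return hits / n_samples


if __name__ == "__main__":
    # Hypothetical paired scores from one examiner at two time points.
    time_1 = [4, 4, 3, 5, 4, 3, 4, 5]
    time_2 = [3, 4, 3, 4, 3, 3, 4, 4]
    print(monte_carlo_wilcoxon(time_1, time_2))
```

Fixing the seed makes the estimate repeatable; with 10 000 samples, repeated runs with different seeds would be expected to vary only in the third decimal place for small p-values.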
Results
According to data presented in Table 2, the vast majority of the scores of the two examiners using the two tools, for both time periods, showed satisfactory reliability indices (Kuder–Richardson formula 20 or Cronbach's α reliability coefficients more than or equal to 0.60) and homogeneity (average discrimination index more than 0.30).
Table 2. Reliability results of the two tools used by the two assessors at two time points
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220511162900662-0225:S0022215121001201:S0022215121001201_tab2.png?pub-status=live)
For the objective structured assessment of technical skill in temporal bone dissection (TempOSATS) tool items, Cronbach's α reliability coefficient is equivalent to Kuder–Richardson formula 20 (KR20) reliability coefficient, and discrimination index (DI) is the average discrimination index
Based on data presented in Tables 3 and 4, for each examiner, there was a very strong (almost absolute) positive and statistically significant correlation between examiner scores at time 1 and time 2 for the overall assessment of objective structured assessment of technical skill in temporal bone dissection tool (for examiner one: rho = 0.942, p < 0.001; for examiner two: rho = 0.908, p < 0.001). However, for examiner one there was a statistically significant difference (p = 0.002) between the two assessments (time 1 vs time 2). The mean value of the overall evaluation at time 1 was estimated to be 3.8 and at time 2 was estimated to be 3.3; that is, significantly lower than at time 1 (mean difference was equal to 0.5 on a 6-point scale). For examiner two, no statistically significant difference (p = 0.748) between the two assessments was highlighted, according to the results of the Wilcoxon test. It must be noted that in all comparisons, the median values were all equal to 4.0.
Table 3. TempOSATS overall assessment intra-rater reliability and comparison of means
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220511162900662-0225:S0022215121001201:S0022215121001201_tab3.png?pub-status=live)
Table shows intra-rater reliability (Spearman's rho rank correlation coefficient) and comparison of means, for each examiner, between time 1 and time 2 and between the two examiners at each time point, for the objective structured assessment of technical skill in temporal bone dissection (TempOSATS) score for overall assessment
Table 4. TempOSATS overall assessment of intra-rater reliability and comparison of means
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220511162900662-0225:S0022215121001201:S0022215121001201_tab4.png?pub-status=live)
Table shows intra-rater reliability (Spearman's rho rank correlation coefficient) and comparison of means, for each examiner, between time 1 and time 2 and between the two examiners at each time point, for the objective structured assessment of technical skill in temporal bone dissection (TempOSATS) score for overall assessment
According to the data presented in Tables 3 and 4, at each time point there was a very strong (almost absolute at time 2) positive and statistically significant correlation between the scores of the two examiners for the overall assessment of the objective structured assessment of technical skill in temporal bone dissection tool (at time 1: rho = 0.837, p < 0.001; at time 2: rho = 0.999, p < 0.001). However, at time 1, there was a statistically significant difference (p = 0.035) between the two examiners. The mean value of the overall assessment for examiner one was estimated to be 3.8, and for examiner two it was equal to 3.4; that is, significantly lower than for examiner one (mean difference was equal to 0.4 on a 6-point scale). At time 2, no statistically significant difference (p = 1.000) between the two examiners was found, according to the results of the Wilcoxon test.
Based on data presented in Tables 5 and 6, for both examiners at times 1 and 2, there was a very strong, positive and statistically significant correlation (p < 0.001) between their overall assessment scores derived from the two tools.
Table 5. Correlation between the overall assessment scores of the two tools at time 1
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220511162900662-0225:S0022215121001201:S0022215121001201_tab5.png?pub-status=live)
Table shows correlation (Spearman's rho rank correlation coefficient) between the overall assessment scores of the two tools, objective structured assessment of technical skill in temporal bone dissection (TempOSATS) and global rating scale of operative performance, reported by the two examiners at time 1. E1Τ1 = examiner 1 at time 1; E2Τ1 = examiner 2 at time 1
Table 6. Correlation between the overall assessment scores of the two tools at time 2
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220511162900662-0225:S0022215121001201:S0022215121001201_tab6.png?pub-status=live)
Table shows correlation (Spearman's rho rank correlation coefficient) between the overall assessment scores of the two tools, objective structured assessment of technical skill in temporal bone dissection (TempOSATS) and global rating scale of operative performance, reported by the two examiners at time 2. E1Τ2 = examiner 1 at time 2; E2Τ2 = examiner 2 at time 2
Landis and Koch (1977) remark that kappa values around 0.20 express a weak degree of agreement, values around 0.40 indicate a satisfactory degree of agreement, values around 0.60 express a moderate degree of agreement, values around 0.80 indicate a significant degree of agreement and, finally, kappa values over 0.80 express an almost perfect degree of agreement.Reference Landis and Koch25 Based on the data presented in Table 7, the vast majority of Cohen's kappa measures of agreement were greater than 0.80 and statistically significant (maximum p = 0.042, <0.05). The simple overall agreement percentages between any two assessments’ scores were greater than 95 per cent (ranged from 96 to 100 per cent). Regarding the degree of the association between any two assessments’ scores, the corresponding association indices were very high (both Cramer's V > 0.80 and gamma > 0.80, range, 0.836 to 1) and statistically significant (p < 0.001). Consequently, when testing the items of the objective structured assessment of technical skill in temporal bone dissection tool, the two examiners showed very strong agreement in both the intra-rater and the inter-rater reliability assessments.
Table 7. Degree of agreement or correlation between scores and overall performance of TempOSATS tool
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220511162900662-0225:S0022215121001201:S0022215121001201_tab7.png?pub-status=live)
Table shows degree of agreement (Cohen's kappa measure) or correlation (Cramer's V and gamma association indices) between the two examiners' scores within each attempt and between the two attempts (time 1 and time 2) for the 7 items and the overall performance of the objective structured assessment of technical skill in temporal bone dissection (TempOSATS) assessment tool. *In those cases where it was not possible to compute the Cohen's kappa measure of agreement, the simple overall agreement percentage between any two assessments is reported instead; †in those cases where it was not possible to compute the Cohen's kappa measure of agreement, the Cramer's V and gamma association indices between any two assessments are reported instead. E1 = examiner 1; E2 = examiner 2; Τ1 = time 1; Τ2 = time 2
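The agreement and association measures reported in Table 7 can be illustrated with a short sketch. It is not the SPSS implementation used in the study: the function names and example ratings are hypothetical. Cohen's kappa is computed as (observed agreement − chance agreement) / (1 − chance agreement), and the Goodman–Kruskal gamma as (concordant − discordant) / (concordant + discordant) pairs. The kappa function returns `None` when chance agreement equals 1, which is one situation in which kappa is undefined; whether this is the reason for the uncomputable entries in Table 7 is not stated.

```python
from collections import Counter
from typing import Optional, Sequence


def cohen_kappa(rater_1: Sequence[int], rater_2: Sequence[int]) -> Optional[float]:
    """Cohen's kappa for two raters scoring the same candidates."""
    n = len(rater_1)
    observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    categories = set(rater_1) | set(rater_2)
    expected = sum(counts_1[c] * counts_2[c] for c in categories) / (n * n)
    if expected == 1.0:
        return None  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


def goodman_kruskal_gamma(x: Sequence[int], y: Sequence[int]) -> Optional[float]:
    """Goodman-Kruskal gamma from concordant and discordant pairs (ties ignored)."""
    concordant = discordant = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            product = (x[i] - x[j]) * (y[i] - y[j])
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
    if concordant + discordant == 0:
        return None  # every pair is tied on at least one variable
    return (concordant - discordant) / (concordant + discordant)


if __name__ == "__main__":
    # Hypothetical yes/no (1/0) scores on one item from two examiners.
    examiner_1 = [1, 1, 0, 0]
    examiner_2 = [1, 1, 0, 1]
    print(cohen_kappa(examiner_1, examiner_2))
    print(goodman_kruskal_gamma(examiner_1, examiner_2))
```

In the hypothetical example the examiners agree on three of four candidates, chance agreement is 0.5, and kappa is therefore 0.5, while gamma is 1 because no pair of candidates is ordered in opposite directions by the two examiners.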
Tables 8 and 9 present the results of the intra- and inter-rater reliability testing for the average summated score of the global rating scale of operative performance tool. In all testing procedures, there was a very strong (almost absolute) positive and statistically significant correlation between any two assessments, in all cases p < 0.001 (Tables 8 and 9).
Table 8. Intra-rater reliability and comparison of means for the average summated score of the GRSOP
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220511162900662-0225:S0022215121001201:S0022215121001201_tab8.png?pub-status=live)
Table shows intra-rater reliability (Spearman's rho rank correlation coefficient) and comparison of means, for each examiner, between time 1 and time 2, and between the two examiners at each time point, for the average summated score of the global rating scale of operative performance (GRSOP).
Table 9. Intra-rater reliability and comparison of means for the average summated score of the GRSOP
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220511162900662-0225:S0022215121001201:S0022215121001201_tab9.png?pub-status=live)
Table shows intra-rater reliability (Spearman's rho rank correlation coefficient) and comparison of means, for each examiner, between time 1 and time 2, and between the two examiners at each time point, for the average summated score of the global rating scale of operative performance (GRSOP).
Discussion
Surgical training was previously mainly confined to the practice and development of surgical skills in the operating theatre. According to Reznick et al., the operating theatre has many limitations when it comes to training and assessment. First of all, it is difficult to standardise any operation in similar training patterns. Secondly, it is almost impossible to standardise the degree to which a trainee is performing elements of an operation. In addition, surgical time is far more expensive compared with any other training method.Reference Reznick, Regehr, MacRae, Martin and McCulloch20
To overcome the above limitations, efforts have been made to develop effective teaching methods. Animal models are carefully selected to simulate human anatomy, and the animal must be anaesthetised before the operation. Obviously, ethical issues are involved, and animal models do not offer a wide range of alternatives to real patients. Bench models simulate human anatomy well and are used for ordinary surgical tasks. Compared with previous methods, bench model-based training has a lower cost, is portable and readily available, and allows the reproducibility of various tasks.Reference Martin, Regehr, Reznick, Macrae, Murnaghan and Hutchison7
Objective structured assessment of technical skill gives the candidate a score that ranges from 8 to 40, with 24 representing a competent performance. Pandey et al.Reference Pandey, Black, Lazaris, Allenberg, Eckstein and Hagmüller3 described the value of objective structured assessment of technical skill. Despite the small number of participants (15 surgical trainees), this study showed that the participants had significant improvement in all aspects of their generic skill but mainly improved in the flow of the procedure, their overall performance and their procedure-specific skills. In the same study, although significant improvement was observed, some participants did not improve. They were mainly older surgeons who proved to be less able to learn in this type of setting because they had accumulated other methods of performing the examined procedures. Another reason may be that they have learned other types of the same procedure that are different from those demonstrated to them.Reference Pandey, Black, Lazaris, Allenberg, Eckstein and Hagmüller3
The Vascular Department of Imperial College London, which is based at St Mary's Hospital, adopted objective structured assessment of technical skill in their surgeons’ training. They took objective structured assessment of technical skill a step beyond its original idea: evaluating surgical competence in a specific procedure and not only basic surgical tasks. The new tool that Imperial introduced was called Imperial College Evaluation of Procedure Specific Skill. This involves a rating scale with five standard points to assess the content of a procedure.Reference Bismuth, Donovan, O'Malley, El Sayed, Naoum and Peden1
There is no doubt as to the value of surgical skills assessment. The most beneficial impact is the considerable improvement in patient safety because the trainee surgeon does not practise a specific procedure on a patient for the first time. In addition, the ‘learning curve’ of making mistakes takes place in the laboratory and not on a patient. In that way, the trainee can perform the same procedure many times until improvement is reached. As a result, operating time decreases, efficiency increases and medical errors decrease.Reference Satava2 This agrees with our philosophy of applying the objective structured assessment of technical skill principles to the whole surgical procedure and not only to limited skills. Moreover, our experiments demonstrated the need for adequate calibration between the assessors, revealed some discrepancies in scoring that may relate to the different levels of experience of the assessors, and highlighted the value of video recordings, which allow more careful evaluation of the various surgical steps.
A possible problem in applying objective structured assessment of technical skills in every training hospital is the relatively high cost. When the method first became known, only a few major teaching centres had the resources to organise courses and evaluations, and this could only occur a few times a year. The costs of models, facilities and especially trainers are obstacles to its wider spread.Reference Pandey, Black, Lazaris, Allenberg, Eckstein and Hagmüller3 In our study, we managed to reproduce a number of identical models of temporal bones at a low cost, and the printing time was a few hours for each.
Three-dimensional printing is a technology that has been known since the 1980s, but its involvement in the medical field has increased significantly over the last two decades, with numerous examples in training, patient education and bioengineering. Three-dimensional printing equipment has improved and become less expensive, and the expertise is more widespread; it has therefore become available in many parts of the world for medical use in several fields.Reference Crafts, Ellsperman, Wannemuehler, Bellicchi, Shipchandler and Mantravadi26,Reference Gross, Erkal, Lockwood, Chen and Spence27
There are numerous studies available in the literature, exploring the potential use of 3D-printing technologies in ENT head and neck surgery. They vary from pre-operative planning and patient education to more advanced training applications for residents and undergraduate medical students. Additionally, there have been descriptions of applications associated with tissue engineering and prosthetics, which are extremely promising for medical innovations in the near future.
According to Canzi et al., there are 23 studies in the literature focusing on otological applications in training, mainly to do with temporal bone surgery simulation.Reference Canzi, Magnetto, Marconi, Morbini, Mauramati and Aprile28 In 2015, a temporal bone model based on CT scan data of two selected patients with well pneumatised and disease-free mastoids was developed. The final evaluation of the models showed satisfactory reproducibility of most structures and anatomical landmarks but also raised two significant issues: the accuracy of the ossicular chain (mainly the stapes) and also the retained resin within the mastoid air cells. The latter issue impacts the drilling experience and can be overcome by adding a small drain hole in the region of the sigmoid sinus. The authors concluded that the model produced is useful for training, without depleting a limited supply of cadavers and by using conventional (non-surgical) tools, such as a Dremel drill.Reference Yushkevich, Piven, Hazlett, Smith, Ho and Gee29 On the other hand, it is still difficult to approach the ‘natural’ structure of the cadaveric specimen, mainly because of the ‘stair-stepping’ artifact and the lack of anatomical elements such as the dura, nerves, blood vessels, tympanic membrane, and oval and round windows.Reference Cohen and Reyes30 We have overcome the obstacles of stair-stepping and retained resin by comparing different materials and printing techniques and choosing selective laser sintering printing. This method allows more accurate printing without retained material and better external and internal contours.
• Objective structured assessment of technical skill is a widely accepted tool for assessing surgical skills
• Only a few of its applications in otolaryngology have been explored so far
• There are numerous studies in the literature exploring the potential use of three-dimensional printing
• Three-dimensional printing is a novel but reliable approach to surgical simulation
• This study explored the validity and reliability of a newly proposed assessment model for surgical training
• The objective structured assessment of technical skill in temporal bone dissection is a tool that can be useful in training assessment
Other groups also confirmed the similarity to the cadaveric specimens and the positive feedback from the trainees.Reference Da Cruz and Francis31–Reference Rose, Webster, Harrysson, Formeister, Rawal and Iseli34 More specifically, Hochman et al. showed that tactile feedback is satisfactory by analysing subjective and objective methods. The improvement of materials has provided a better simulation of bone consistency, resulting in a more realistic experience.Reference Hochman, Kraut, Kazmerik and Unger35 A useful adjunct in training is the coupling with electronic simulators, which offers the possibility of real-time alert in case of vital structural injury. An example is the ElePhant model (Electronic Phantom), where the facial nerve is replaced with a conductive alloy or fibre-optic material, allowing immediate feedback.Reference Grunert, Strauss, Moeckel, Hofer, Poessneck and Fickweiler36 Anecdotal feedback from the participants confirmed the satisfactory tactile feedback, which is associated with the different thickness of the structures (mastoid air cells and bony labyrinth).
Our group has studied different materials and printing techniques, and their application on a relatively larger scale has shown that such methods can be used to run skills laboratories based on 3D-printed models.
Conclusion
Three-dimensional printing is a novel but equally reliable approach to surgical simulation, and reproduction of anatomical models can be of great value in training and personalised patient care. Additionally, objective structured assessment of technical skill in temporal bone dissection is a tool that can be extremely useful in the assessment of training and monitoring of a surgeon's learning curve. More studies are necessary to expand its applications in more complex operations, where cortical mastoidectomy represents the initial stage of surgery.
Competing interests
None declared