Introduction
In 1995, the Committee on Hearing and Equilibrium of the American Academy of Otolaryngology – Head and Neck Surgery (AAO-HNS) published a draft of guidelines for reporting conductive hearing loss following tympanoplasty and stapes surgery.1 The aim was to ensure that minimal datasets appeared in a standardised format when reporting the outcome of middle-ear surgery, thereby allowing for better inter-study comparability. In order to achieve this aim, two levels of reporting were established: level 1 is a uniform summary of data reporting technical outcomes, and level 2 is more discretionary and allows authors to provide raw data for analysis. Table I provides a summary of level 1 requirements.
* Following stapes and middle-ear surgery.1 AAO-HNS = American Academy of Otolaryngology – Head and Neck Surgery; AC = air conduction; BC = bone conduction; SD = standard deviation; ABG = air–bone gap
These guidelines have now been in place for more than 20 years, thus allowing sufficient time for audiological data and outcomes to be collected correctly in a prospective fashion.
This paper highlights the importance of correctly reporting air–bone gap (ABG) closure in stapes surgery according to the AAO-HNS guidelines, and emphasises the impact that the method of calculation can have on reported outcomes. It also assesses how compliant clinicians have been in following these guidelines in recent years.
Materials and methods
Patient selection
In this retrospective case series, all adult patients who underwent primary and revision stapedotomy, from the beginning of 1999 through to August 2014, were reviewed. Patients who had non-otosclerotic hearing loss at the time of surgery and those who failed to attend for audiology assessment following surgery were excluded.
Surgical technique
The procedures were performed by a single, fellowship-trained otological surgeon. The preferred surgical technique was a reverse narrow fenestra stapedotomy conducted under awake local anaesthetic conditions. In all cases, the stapedotomy was performed using a hand-held micro-perforator and micro drill (Skeeter Otologic Drill System; Medtronic, Sydney, Australia). Thereafter, a fluoroplastic and platinum wire piston prosthesis (Richard's Piston; Gyrus ACMI, Melbourne, Australia) was crimped into position. Patients were discharged the next day and reviewed 10 days later in the clinic to ensure there were no complications following the surgery. Hearing was assessed clinically during the surgery using free-field speech assessment, and objectively with audiometry between three and six months after the surgery.
Audiological data
Where possible, audiometric data were collected according to AAO-HNS guidelines. Pre- and post-operative air conduction and bone conduction thresholds were assessed at 0.25, 0.5, 1, 2, 4 and 8 kHz and at 0.5, 1, 2, 4 and 8 kHz respectively. Thereafter, a four-tone average was calculated using 0.5, 1, 2 and 4 kHz. As it is not common practice to measure 3 kHz in Australia, this was substituted with 4 kHz in order to calculate the average (Table II).
Data represent pure tone averages (mean ± standard deviation (range); in dB). Pre-op = pre-operative; AC = air conduction; BC = bone conduction; post-op = post-operative
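The four-tone average described above can be sketched as follows. This is a minimal illustration only: the function name and the example audiogram are hypothetical, and 4 kHz is used in place of the guideline frequency of 3 kHz, mirroring the substitution made in this study.

```python
from statistics import mean

# Frequencies (kHz) entering the four-tone average in this sketch.
# The guidelines prescribe 0.5, 1, 2 and 3 kHz; 4 kHz stands in for 3 kHz
# here, as in the study described above.
FOUR_TONE_KHZ = (0.5, 1, 2, 4)


def four_tone_average(thresholds_db):
    """Return the mean threshold (dB) over the four-tone frequencies.

    `thresholds_db` maps frequency in kHz to hearing threshold in dB.
    """
    return mean(thresholds_db[f] for f in FOUR_TONE_KHZ)


# Hypothetical air conduction audiogram, for illustration only (not study data).
example_ac = {0.25: 50, 0.5: 45, 1: 40, 2: 35, 4: 40, 8: 55}
print(four_tone_average(example_ac))  # (45 + 40 + 35 + 40) / 4 = 40
```

Thresholds at 0.25 and 8 kHz are recorded but do not enter the average.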
The ABG closure was calculated in two ways: according to the guidelines (Table I), by subtracting the post-operative bone conduction from the post-operative air conduction, and using the pre-guideline method, by subtracting the pre-operative bone conduction from the post-operative air conduction. The mean, standard deviation (SD) and range were supplied for both, and results were placed into 10 dB bins (Table III).
Total n = 204. Air–bone gap for average of 0.5, 1, 2 and 4 kHz. AAO-HNS = American Academy of Otolaryngology – Head and Neck Surgery; ABG = air–bone gap; SD = standard deviation; AC = air conduction; BC = bone conduction
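The two calculations and the 10 dB binning can be sketched as below. The function names, the bin boundaries (0–10, 11–20, 21–30 and more than 30 dB) and the example values are assumptions made for illustration; they are not taken from the study data.

```python
def abg_guideline(post_ac, post_bc):
    """Guideline method: post-operative AC minus post-operative BC (dB)."""
    return post_ac - post_bc


def abg_pre_guideline(post_ac, pre_bc):
    """Pre-guideline method: post-operative AC minus pre-operative BC (dB)."""
    return post_ac - pre_bc


def abg_bin(abg_db):
    """Place an air-bone gap into a 10 dB bin (boundaries assumed here).

    Negative gaps, which the pre-guideline method can produce, fall into
    the lowest bin.
    """
    if abg_db <= 10:
        return "0-10 dB"
    if abg_db <= 20:
        return "11-20 dB"
    if abg_db <= 30:
        return "21-30 dB"
    return ">30 dB"


# Hypothetical four-tone averages (dB) for one patient, illustration only.
# Bone conduction improves after surgery (Carhart effect), so the two
# methods give different gaps from the same post-operative AC.
pre_bc, post_ac, post_bc = 25, 32, 20

guideline_gap = abg_guideline(post_ac, post_bc)    # 32 - 20 = 12
old_gap = abg_pre_guideline(post_ac, pre_bc)       # 32 - 25 = 7

print(guideline_gap, abg_bin(guideline_gap))  # 12 11-20 dB
print(old_gap, abg_bin(old_gap))              # 7 0-10 dB
```

In this hypothetical case the same operation counts as closure to within 10 dB under the pre-guideline method but not under the guideline method.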
Statistical analysis
Statistical software IBM SPSS® version 22 was used to analyse the data. Paired t-tests or Wilcoxon signed rank tests were used as appropriate to test for systematic differences within patients. The mean and SD, or median and 25–75th percentile and range, were used to summarise continuous variables as appropriate. A boxplot graph was used to illustrate the distribution of continuous outcomes.
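For readers without access to SPSS, the paired comparison can be approximated as follows. This is a sketch only, not the analysis performed in the study: it uses the large-sample normal approximation for the Wilcoxon signed rank test, drops zero differences, averages tied ranks, and applies no tie or continuity correction. The function name and the example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, median


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed rank test using the normal approximation.

    Returns (W+, z, p). Zero differences are dropped and tied absolute
    differences share the average rank. No tie or continuity correction
    is applied, so p is approximate, especially for small samples.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks within ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    sd_w = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean_w) / sd_w
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w_plus, z, p


# Hypothetical post-operative ABGs (dB) for eight patients, illustration only.
guideline_abg = [10, 12, 8, 15, 5, 20, 9, 11]
pre_guideline_abg = [5, 10, 8, 9, 2, 16, 6, 10]

w_plus, z, p_value = wilcoxon_signed_rank(guideline_abg, pre_guideline_abg)
median_diff = median(g - o for g, o in zip(guideline_abg, pre_guideline_abg))
print(w_plus, round(z, 2), round(p_value, 3), median_diff)
```

With so few pairs an exact test would be preferable; the approximation is shown only because it needs nothing beyond the standard library.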
Literature review
We searched PubMed, Medline and Embase databases, using the terms ‘stapes surgery’, ‘stapedectomy’ and ‘stapedotomy and results’, to identify surgical outcomes for the study period. Selection criteria included all papers published in the English language, in journals with a mean impact factor of greater than 1 (over the study period) from 2005 to 2014. Paediatric cases, reviews and meta-analyses were excluded. Data parameters collected included: ABG closure correctly calculated using the prescribed four-tone average (0.5, 1, 2 and 3 kHz) at one year or more, with the mean, SD and range reported.
Results
Patient demographics
A total of 226 narrow fenestra stapedotomy procedures were performed over the study period. Twenty-two patients did not attend for follow-up audiometry and were thus excluded from the study. Of the remaining 204 procedures, 182 were primary operations and 22 were revision procedures. The male-to-female ratio was 1:1.7, and mean patient age was 48 years (range, 16–81 years).
Audiological findings
The post-operative ABG measured 8 dB (SD ± 7.6) when calculated according to the guidelines (post-operative air conduction minus post-operative bone conduction) and 5 dB (SD ± 12.6) when the pre-guideline method was used (post-operative air conduction minus pre-operative bone conduction) (Table III). Closure of the ABG to less than 10 dB occurred in 73.5 per cent of cases with the guideline method and 77 per cent with the pre-guideline method; closure to less than 20 dB occurred in 93.1 and 93.2 per cent, respectively.
Statistical analysis findings
Although there appears to be very little difference between the current and pre-guideline method, statistical analysis (Wilcoxon signed rank test) demonstrated a significant within-patient difference when calculating ABG closure, with 75 per cent of the results being over-reported as successful using the previous method (median = 3.75 dB, interquartile range = 0–7.5 dB; p < 0.001) (Figure 1).
Literature review findings
From the initial search, 322 articles were identified as potentially suitable for review. When the exclusion criteria were applied and duplicates removed, 51 articles were selected. A summary of the results is shown in Table IV.
* Air–bone gap was not reported in compliance with the guidelines. AAO-HNS = American Academy of Otolaryngology – Head and Neck Surgery; ABG = air–bone gap; SD = standard deviation
A review of the literature demonstrated that 44 papers (86.3 per cent) used a 4-tone average to calculate the ABG. Only 23 papers (45.1 per cent) used 3 kHz to calculate the 4-tone average, with the remaining papers either averaging 2 kHz and 4 kHz, or substituting 3 kHz with 4 kHz (Figure 2a).
Results were correctly reported according to guidelines in 42 papers (82.4 per cent) (Figure 2b). Three papers (5.9 per cent) used the pre-guideline method, and in 6 papers (11.8 per cent) it was not clear how the ABG was calculated. Only 17 papers (33.3 per cent) reported outcomes in the correct 10 dB bins; 6 (11.8 per cent) fully reported SD, mean and range, with 60.1 per cent of these results calculated after more than one year.
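The percentages given with paper counts above follow directly from those counts over the 51 included papers, as this short check illustrates. The helper name and the dictionary labels are chosen here for illustration.

```python
TOTAL_PAPERS = 51  # articles retained after exclusions and de-duplication


def per_cent(count, total=TOTAL_PAPERS):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Paper counts as stated in the literature review findings.
counts = {
    "four-tone average used": 44,
    "3 kHz included in average": 23,
    "ABG reported per guidelines": 42,
    "pre-guideline method used": 3,
    "ABG calculation unclear": 6,
    "correct 10 dB bins": 17,
}

for label, count in counts.items():
    print(f"{label}: {count}/{TOTAL_PAPERS} = {per_cent(count)} per cent")
```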
Discussion
Closure of the ABG remains the most common outcome described when reporting stapes surgery, with surgeons aiming to close the gap to less than 10 dB in more than 90 per cent of cases. An audit of post-operative results comparing to this ‘gold standard’ thus allows surgeons to have an open discussion with their patients about achievable and realistic outcomes. Prior to the publication of the AAO-HNS guidelines, ABG closure was easier to achieve by ignoring the Carhart effect,2,3 potentially leading to the over-reporting of successful results.
In studies by Badran et al.,4 Berliner et al.,5 Fiorino and Barbieri,6 and Gerlinger et al.,7 the ABG was calculated using both methods.4–7 The difference in ABG varied between 0 dB and 5 dB in favour of the method pre-dating the guidelines. Similarly, when the same results were placed into 10 dB bins, results were 3–11 per cent better for ABG closure of less than 10 dB, and 0–3 per cent better for ABG closure of less than 20 dB.
In our study, we found similar results, with a median 3.75 dB difference in the ABG and a 3.5 per cent difference in favour of the old method when calculating ABG closure of less than 10 dB. There was, however, no difference for closure of less than 20 dB.
Taken at face value, this indicates only a slight difference of no real significance, which would seem to undermine the case for the guidelines. However, statistical analysis demonstrated that the within-patient difference was significant (p < 0.001, when the Wilcoxon signed rank test was used), with 75 per cent of the outcomes being over-reported as successful.
In 1995, in an attempt to avoid such discrepancies when describing outcomes, the Committee on Hearing and Equilibrium of the AAO-HNS released guidelines for reporting tympanoplasty and stapes surgery.1 The main aim of these guidelines was to enable a standardised minimal dataset for reporting results, to facilitate comparative studies. These guidelines applied in particular to the reporting of improvements in bone conduction thresholds and ABG and the change in ABG pre- and post-operatively. Authors were encouraged to adhere to the minimum dataset, but were also encouraged to report data in novel ways should they wish.
It was noted by the senior author that, despite the existence of the guidelines, some papers pre-dating 2005 were still using the old method to calculate ABG. Some of these studies spanned the transition period between the traditional method of collecting and reporting data and the current guidelines, and therefore could be regarded with less criticism.8–13 However, over the last 10 years, there has been sufficient time for surgeons to become aware of the guidelines and change their reporting practice accordingly.
A review of the literature revealed that 82.4 per cent of the papers reported the ABG closure correctly,4,6,7,14–51 with 5.9 per cent of papers still reporting their outcomes incorrectly using the previously accepted method of subtracting the pre-operative bone conduction threshold from the post-operative air conduction threshold.52–54 Two of these studies were performed after the release of the guidelines,52,53 and one was carried out retrospectively during the transition period.54 For the remaining 11.8 per cent of papers, calculation of ABG closure was either unclear, not reported or used individual frequencies.55–61
The ABG closure was calculated using the four-tone average for pre- and post-operative air conduction and bone conduction thresholds in 86.3 per cent of the papers;4,6,7,14–21,23–53,55,59,62 however, 3 kHz was used in only 45.1 per cent of these calculations.6,7,15,16,20,21,24,28,32–34,38,40–44,47,51,53,55,57,59 The remaining papers either substituted 3 kHz with 4 kHz (in 79.2 per cent of papers) or averaged 2 kHz and 4 kHz (in 16.7 per cent). Although Berliner et al. noted that substituting 3 kHz with 4 kHz made little difference to four-tone averages, success diminished by 6 per cent overall.5 In a further study, by Gurgel et al., there appeared to be no major difference if 3 kHz was substituted with an average of 2 kHz and 4 kHz.63 Of those papers that reported the ABG closure correctly, only 11.8 per cent reported the mean, SD and range of their results.6,19,22,29,42,44,51 A one-year follow-up audiogram was conducted in only 60.1 per cent of studies.7,14–16,19,21,23,25,27,29–31,33,36–39,41–45,47,50,51,55,57,59–61
• Non-adherence to current guidelines can lead to over-inflated reports of successful air–bone gap closure
• This can affect the consent process and patient expectations
• Clear guidelines exist for reporting otological surgery outcomes
• Nevertheless, incorrect reporting continues and minimal datasets are still missing
• Adherence to guidelines provides accurate outcomes and inter-study comparisons
We acknowledge that we did not adhere strictly to the guidelines, as we used 4 kHz in place of 3 kHz. Furthermore, in light of geographical distance, follow-up was only carried out between three and six months post-operatively. Regarding data collection for the review, we restricted ourselves to English-language, peer-reviewed journals, with impact factors greater than 1 averaged over the study period, thus introducing reporting bias.
Conclusion
Correct calculation of the ABG using the method described by the AAO-HNS is important, to prevent the over-reporting of successful ABG closure, which can lead to unrealistic patient expectations, and to aid the comparison of surgical outcomes.
Although the AAO-HNS guidelines have been in place for 20 years, we are still not fully compliant when reporting results. Any surgeons embarking on prospective studies or audits should endeavour to collect a minimal dataset, according to the guidelines, and report their results accordingly.
In the current electronic era, with the ease of accessibility to data, it may be worth considering providing all raw audiometric data for future outcome studies. Finally, all journals should provide clear and specific instructions on reporting outcomes based on guidelines and local policy.
Acknowledgement
The authors acknowledge K Byth, medical biostatistician at Westmead Millennium Institute, for her input in analysing the data.
Competing interests
None declared