Introduction
The advent of readily available magnetic resonance imaging (MRI) has revealed that many relatively asymptomatic acoustic tumours show little growth. Many centres now adopt an expectant policy in these circumstances if the tumour measures less than 15 mm in the cerebellopontine angle. Serial MRI is then the only intervention undertaken.
The senior author (PF) has noted that a great many patients express concern when told that a tumour is showing growth. However, it is necessary to ask what significant growth is and how important inter-observer reliability is, particularly when serial studies are carried out in different institutions by various radiologists utilising different technology. Furthermore, other studies have shown significant inter-observer differences in tumour measurement (refs 1–5: van de Langenberg et al. to Harris et al.).
This study examined inter-observer reliability in a specialised institution.
Materials and methods
The MRI maximum diameter of 12 cerebellopontine angle tumours was independently measured by 4 experienced radiologists. The 12 cases were randomly chosen from a selection of patients who had received no treatment.
The MRIs were conducted using either a 3 T Siemens Magnetom Verio or a 1.5 T Siemens Magnetom Aera scanner (Munich, Germany). All sequences were performed with the three-dimensional sampling perfection with application-optimised contrast using different flip angle evolution (‘SPACE’) sequence protocol.
The images were examined using the virtual calipers on an Agfa Impax picture archiving and communication system (Mortsel, Belgium). To avoid bias, all radiologists cleared records of image analysis following each measurement on the imaging database.
The intervals measured included the largest dimension of the cerebellopontine angle moiety of the tumour in the axial plane (in millimetres). In addition, ipsilateral and contralateral middle cerebellar peduncles were measured at the same level for comparison. This measurement was taken from the point of tumour–peduncle contact to the most lateral tip of the fourth ventricle. Intra-observer variability was determined between two separate measurements of the same case taken by the same radiologist after at least a 1-day interval.
The average deviation was calculated across the 4 radiologists' first readings, and then across their second readings, for each of the 12 cases. This provided 24 average deviations from which the mean measurement difference and standard error (in millimetres) were calculated. Student's t-test was used to test the hypothesis that there would be a difference in the variation for smaller tumours, less than 15 mm in diameter, compared to larger tumours, 15 mm or more in diameter. Intraclass correlation was employed to determine the reliability of the measurements, with a correlation value of more than 0.7 taken to indicate reliable measurement. Intraclass correlation and 95 per cent confidence intervals (CIs) were calculated using an array of all the readings obtained for each of the measured structures.
Statistics were calculated using Microsoft Excel® for Mac 2011 software, version 14.5.4, with the Real Statistics Resource Pack for Excel 2011 add-in.
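The average-deviation step described above can be sketched in a few lines of Python. This is an illustration only: the study itself used Excel, the readings below are invented placeholder values rather than study data, and "average deviation" is assumed here to mean the mean absolute deviation of the four radiologists' readings from their own mean. The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def average_deviation(values):
    """Mean absolute deviation of the readings from their own mean (assumed definition)."""
    centre = mean(values)
    return mean(abs(v - centre) for v in values)


def mean_and_standard_error(values):
    """Mean of the values and the standard error of that mean."""
    return mean(values), stdev(values) / sqrt(len(values))


# Invented readings in mm (NOT study data): four radiologists per reading round.
readings = {
    "case A, first reading": [10.2, 10.5, 10.1, 10.4],
    "case A, second reading": [10.3, 10.4, 10.2, 10.5],
    "case B, first reading": [18.0, 18.6, 18.2, 18.4],
    "case B, second reading": [18.1, 18.5, 18.3, 18.3],
}

# One average deviation per case and reading round (the study had 12 x 2 = 24).
deviations = [average_deviation(r) for r in readings.values()]
mean_dev, std_err = mean_and_standard_error(deviations)
print(f"mean deviation = {mean_dev:.2f} mm, standard error = {std_err:.2f} mm")
```

With the full data set, the `readings` dictionary would hold 24 entries, one for each case and reading round.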
Results
Table I presents mean measurement deviation with standard error for all 12 cases. Inter-observer difference for tumour diameter averaged 0.33 ± 0.04 mm (range, 0.0–0.8 mm), and intra-observer difference averaged 0.17 ± 0.03 mm (range, 0.0–0.8 mm). Figure 1 demonstrates the variation in all the readings of maximal tumour diameter as a scatter plot. A Student's t-test comparing the mean inter- and intra-observer differences demonstrated greater consistency between readings by the same radiologist, whose differences were on average 0.17 mm lower (95 per cent CI = 0.06–0.27, p = 0.002).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408061903-54335-mediumThumb-S002221511600935X_fig1g.jpg?pub-status=live)
Fig. 1 Measurement variation in 12 cases of cerebellopontine angle tumours.
Table I Mean measurement deviation in cerebellopontine angle tumour MRI assessment
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408061903-38496-mediumThumb-S002221511600935X_tab1.jpg?pub-status=live)
MRI = magnetic resonance imaging; SE = standard error
The hypothesis that a significant difference in inter-observer variability existed for smaller tumours versus larger tumours was tested with a Student's t-test. Although there was a trend towards a larger mean difference for the smaller tumours (less than 15 mm), at 0.44 ± 0.07 mm (range, 0.11–0.86 mm), compared to the larger tumours (15 mm or more), at 0.30 ± 0.05 mm (range, 0.08–0.65 mm), the results were not significantly different (p = 0.09).
Test results for reliability using intraclass correlation are presented in Table II. Inter-observer reliability for maximal tumour diameter was 0.99 (95 per cent CI = 0.97–0.99). These results indicate that the measurements between radiologists were extremely reliable. Similarly, intra-observer reliability was extremely high, at 0.99 (95 per cent CI = 0.99–1.00).
Table II Intraclass correlation of cerebellopontine angle tumour MRI assessment
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408061903-36348-mediumThumb-S002221511600935X_tab2.jpg?pub-status=live)
MRI = magnetic resonance imaging; CI = confidence interval
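As an illustration of how an intraclass correlation such as those in Table II can be computed, the following Python sketch implements one common form: the two-way random-effects, absolute-agreement, single-measurement coefficient, often written ICC(2,1). The paper does not state which ICC form was used, so this choice is an assumption. The diameters are invented placeholder values rather than study data, and the function name is hypothetical.

```python
from statistics import mean


def icc_2_1(table):
    """ICC(2,1) for a table with one row per case and one column per observer.

    Assumes a complete table with at least two rows and two columns and some
    variation in the readings (otherwise the denominator is zero).
    """
    n = len(table)       # number of cases
    k = len(table[0])    # number of observers
    grand = mean(v for row in table for v in row)
    row_means = [mean(row) for row in table]
    col_means = [mean(row[j] for row in table) for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in table for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Invented maximal diameters in mm (NOT study data): 3 cases x 4 radiologists.
diameters = [
    [8.1, 8.4, 8.0, 8.3],
    [14.6, 14.9, 14.5, 14.8],
    [21.2, 21.0, 21.4, 21.1],
]

icc = icc_2_1(diameters)
# The 0.7 threshold is the reliability cut-off stated in the methods.
print(f"ICC(2,1) = {icc:.3f}; reliable by the 0.7 threshold: {icc > 0.7}")
```

The sketch returns only the point estimate; the 95 per cent confidence intervals reported in Table II would need an additional F-distribution step that is not shown here.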
Discussion
According to a recent study, intervention is generally not undertaken until the cerebellopontine angle moiety of the tumour reaches 15 mm (ref. 6: Jufas et al.). Consider a hypothetical tumour that grows from 4 mm to 6 mm in diameter: if it is treated as a sphere (volume = 4/3 πr³), this change represents a more than threefold increase in volume. Some studies have found an inter-observer error of 1–2 mm in tumour measurement (refs 1–3: van de Langenberg et al. to Marshall et al.). Of course, no major clinical decision would be made on such a volume increase in a tumour of this size, but there are significant issues of patient concern and anxiety.
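The arithmetic behind this hypothetical case can be checked with a short Python sketch. It assumes, as the text does, that the tumour is a perfect sphere; the function names are illustrative only.

```python
from math import pi


def sphere_volume(diameter_mm):
    """Volume in cubic millimetres of a sphere with the given diameter."""
    radius = diameter_mm / 2
    return 4 / 3 * pi * radius ** 3


def volume_ratio(old_diameter_mm, new_diameter_mm):
    """How many times larger the new volume is than the old one."""
    return sphere_volume(new_diameter_mm) / sphere_volume(old_diameter_mm)


# Hypothetical tumour from the text: 4 mm growing to 6 mm in diameter.
print(f"{volume_ratio(4, 6):.3f}")  # prints 3.375
```

Because volume scales with the cube of the diameter, the ratio is simply (6/4)³ = 3.375, so a 2 mm change in a tumour this small corresponds to more than a tripling of volume.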
This study indicates very little test–retest change and very little inter-observer difference. Such results, unexpected by the senior author, have not been shown in other studies (refs 2–4: Hougaard et al. to Cross et al.). In order to avoid surgery that proves unnecessary, and to similarly avoid alarming patients about apparent tumour growth, the authors believe that if an expectant policy is to be undertaken, serial imaging should be carried out in the same institution by experienced neuroradiologists.
While the results of this study are reassuring, they are what would be expected from highly trained neuroradiologists who use the same machine and who work very closely with the clinicians involved. The clinicians themselves also need to be confident about their ability to interpret the cases placed before them.
Clinicians should give precise instructions to their radiologists requesting that the cerebellopontine angle moiety be individually measured at the point of maximum horizontal extent (ref. 7: Tanaka et al.).
The slice thickness of the MRI scans used in this study was 0.5 mm; to our knowledge, this is the smallest interval that has been used to test for inter-observer variability. The next closest study used a 1.0 mm slice thickness, yielding similar mean readings but a comparatively larger error (ref. 2: Hougaard et al.).
• Errors of 1–2 mm on two-dimensional (2D) measurements have been demonstrated
• Volumetric measurements are superior to 2D measurements in determining growth
• However, volumetric measurements are time-consuming, dependent on magnetic resonance imaging sequences and are still susceptible to human error
• Despite the errors, 2D measurements are reliable and practical for clinical use
• There is innate bias associated with intra-observer readings and different measuring protocols
• The same centre and radiologist should ideally monitor tumour progression to enhance decision making
While there are claims that volumetric analysis is important (ref. 5: Harris et al.), this is an expensive technique that is not available to everybody, and is in itself susceptible to human error (refs 4 and 8: Cross et al.; Yamada et al.). The present authors believe that the assessment of middle cerebellar peduncle compression (as conducted in this study) is probably the most important factor leading to intervention. While unproven, we propose that slice thickness influences the accuracy of axial measurements.
Acknowledgements
The authors would like to acknowledge the following radiologists who participated in our study: John Ly, Pascal Bou-Haidar, Sebastian Fung and Yael Barnett.