Diagnosis of appendicitis in adults by ultrasonography or computed tomography: A systematic review and meta-analysis
Published online by Cambridge University Press: 04 August 2005
Abstract
Objectives: The use of ultrasonography and computed tomography (CT) in the diagnosis of appendicitis in adult patients was compared.
Methods: Systematic review and meta-analysis of current evidence in two clinical situations: unselected nonpregnant, adult patients with symptoms of appendicitis, and more selective use in only those patients who still have an equivocal diagnosis subsequent to routine clinical investigations.
Results: Meta-analysis of eligible studies shows CT to have better sensitivity and specificity than ultrasound in both clinical situations.
Conclusions: Application of these findings in clinical practice and/or policy would need to evaluate the better diagnostic performance of CT against its cost and availability. In addition, it is imperative that future studies be conducted in patient populations that are well-defined with respect to prior investigations. Sequelae of false-negative and false-positive diagnoses should also be evaluated.
Type: General Essays
International Journal of Technology Assessment in Health Care, Volume 21, Issue 3, July 2005, pp. 368–379
Copyright © 2005 Cambridge University Press
INTRODUCTION
Acute appendicitis is the most common indication for emergency abdominal surgery in Australia. Despite the prevalence of acute appendicitis, diagnosis remains problematic. Many patients have typical presentations consisting of rapid-onset abdominal pain followed by nausea and vomiting. Subsequent to the initial nonspecific abdominal pain, there is localization of the pain to the right lower quadrant, often with rebound tenderness and pain upon cough. Leukocytosis and low-grade fever typically complete the clinical picture. However, up to a third of patients present atypically, resulting in a delayed or missed diagnosis (5;36). Furthermore, a range of other conditions can mimic the clinical presentation of appendicitis, making a differential diagnosis problematic.
Delaying the treatment of patients with atypical presentations to confirm diagnosis may result in perforation of the appendix, which can occur within 24 hours of the onset of symptoms. The incidence of perforation is particularly high among elderly patients, children, and women of child-bearing age (67). Perforation leads to peritonitis, with associated morbidity and mortality, resulting in additional resource use.
Surgical treatment of acute appendicitis is a highly successful medical intervention. However, the inherent risk of surgical complications cannot be discounted. Furthermore, surgical procedures and aftercare services occur at a considerable cost. The treating clinician, therefore, is faced with the need to balance the considerable morbidity and even mortality associated with a missed diagnosis, with exposing the patient to unnecessary surgery, and associated morbidity and mortality as a result of an incorrect positive diagnosis.
The relationship between rates of negative appendicectomy and perforation is controversial. Some argue that there are direct trade-offs; that lowering the rate of negative appendicectomy increases the rate of perforation (4;60). Others argue that both can be lowered safely, using evidence that the two rates are independent of each other when compared across hospitals (6;39;41). Even if lowering the negative appendicectomy rate results in more perforations, this finding may not lead to increased mortality or morbidity in the context of modern surgical care (65).
Individual clinicians and hospitals vary in their approach to this problem. Some delay surgery and use additional diagnostic tests and procedures, in an effort to minimize the negative appendicectomy rate.
This systematic review summarizes the published medical literature reporting the performance of ultrasonography (US) and computed tomography (CT) as optional (“add-on”) diagnostic procedures used in the diagnosis of acute appendicitis in adults in Australia. The aim is to summarize the diagnostic performance of each procedure (e.g., sensitivity and specificity) that has the potential to impact upon the cost-effectiveness of appendicitis management. It is part of a larger project, commissioned by the Research and Development Grants Assessment Committee (RADGAC) of the National Health and Medical Research Council, to define diagnostic and treatment pathways that optimize the cost-effectiveness of care for patients presenting to the hospital with signs and symptoms of acute appendicitis.
The objective of this review is to systematically evaluate the evidence relating to the diagnostic performance of US and CT for the diagnosis of appendicitis in adults. These procedures are widely available in Australia but are considered to be “in addition” to conventional clinical assessment (history, physical examination, and plain X-ray).
The focus of the review was the diagnosis of appendicitis in nonpregnant, adult patients. Evidence derived from subgroups of this population was not included when this information could not be generalized to the entire group (e.g., elderly patients only). Evidence derived from broader populations (e.g., those inclusive of children) was only included when the target population represented more than 50 percent of the study population.
In most cases, histological examination of tissue removed at operation can provide the definitive reference diagnosis. With respect to patients discharged with a negative diagnosis, dedicated systematic clinical follow-up is an acceptable reference standard, as truly active disease is likely to re-present within a short time frame.
Ultrasound
Ultrasound is a safe, noninvasive technique involving the emission of high-frequency sound waves from a transducer into the underlying tissues, from where sound is reflected in accordance with tissue density, allowing an image to be viewed in real-time. Ultrasound is relatively inexpensive, with low consumable costs and modest capital expenditure; however, payment of experienced staff adds to the overall cost burden. The procedure is very operator-dependent, and poor technique may compromise its diagnostic accuracy.
Computed Tomography
Computed tomography (CT) is a fast, noninvasive procedure that uses X-rays to generate cross-sectional images, based on the differing rates at which tissues absorb radiation. CT scanning has advantages over other methods of imaging the appendix, as it is able to visualize the entire appendix (49). A diagnosis of acute appendicitis is generally made when the appendix measures greater than 6 mm in diameter, the lumen is not filled with air or enteric material, and there is evidence of inflammation (40). Studies with and without the use of a contrast medium have been included in this review.
METHODS
The medical literature was searched to identify original studies that investigated and reported the ability of US and helical CT to diagnose acute appendicitis. The literature search covered the period from January 1985 to February 2003. Any additional papers identified from the bibliographies of included publications were added to the review.
Searches were conducted by means of Medical Literature Analysis and Retrieval System On-Line (MEDLINE), Excerpta Medica Database (EMBASE), Cochrane Systematic Reviews, Database of Abstracts of Reviews of Effects (DARE), National Health Service Economic Evaluation Database (NHS EED), and health technology assessment databases. Search strategies included terms for “appendicitis” and “diagnosis”. After the removal of duplicate citations, and addition of further citations sourced from the reference lists of recent key publications, a total of 1,087 unique citations remained. The following inclusion and exclusion criteria were then applied.
Inclusion Criteria
Included in this study were original publications reporting the results of one or more clinical studies suitably designed to evaluate the diagnostic performance of the test (i.e., nonsystematic reviews, editorials, opinion pieces, and letters were excluded, as were methodological, descriptive, or prognostic studies); studies conducted on human patients; studies involving one or more of the diagnostic investigations within the scope of the current review; studies in which 50 patients or more underwent the investigation in question; and studies reporting (or providing sufficient information to calculate) relevant clinical outcomes (sensitivity, specificity, diagnostic accuracy, positive predictive value, negative predictive value).
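The outcome measures named in the final criterion can all be derived from a study's two-by-two table of test result against reference diagnosis. The following is a minimal sketch of those standard definitions; the class name `TwoByTwo` and the counts are hypothetical, invented for illustration, and are not drawn from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from one diagnostic study, judged against the reference standard."""
    tp: int  # test positive, appendicitis confirmed
    fp: int  # test positive, no appendicitis
    fn: int  # test negative, appendicitis confirmed
    tn: int  # test negative, no appendicitis

    def sensitivity(self) -> float:
        # Proportion of diseased patients the test detects.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of non-diseased patients the test correctly clears.
        return self.tn / (self.tn + self.fp)

    def accuracy(self) -> float:
        # Proportion of all patients classified correctly.
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    def ppv(self) -> float:
        # Positive predictive value: probability of disease given a positive test.
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Negative predictive value: probability of no disease given a negative test.
        return self.tn / (self.tn + self.fn)


# Hypothetical counts (assumption, for illustration only).
example = TwoByTwo(tp=80, fp=10, fn=20, tn=90)
print(f"sensitivity={example.sensitivity():.2f}, specificity={example.specificity():.2f}")
```

A study that reports only the four counts therefore still meets the criterion, because every listed measure can be recalculated from them.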
Exclusion Criteria
Excluded were those studies for which the study population comprised only those patients who had had an appendicectomy, where a broader population existed; studies conducted in pregnant women only; studies conducted in children only, and those in which >50 percent of the patient population were children; studies conducted in the elderly only; studies where the diagnostic test was conducted in a nonstandard or outdated manner or by a nonroutine operator (e.g., trainees, surgeons alone); studies duplicating patients presented in another publication or studies presenting only a subset of the patients presented in another publication; studies published in a language other than English. Studies reporting the results for men only or women only were included, with the gender selection noted. Publications that did not adequately define and report the patient recruitment criteria and the nature of previous clinical and diagnostic investigations or had additional patient selection criteria that made the patient population irrelevant to the current systematic review were excluded (3;10–17;22;24;25;27;30–32;34;37;38;42;45;48;56–58;69).
After application of these criteria, a total of thirty-two publications were included in this review. Twenty publications described studies of ultrasound alone, eleven publications described studies of CT alone, and one publication described a study of both ultrasound and CT.
The settings chosen for this review related to the two most common clinical placements of the tests in question: (i) in all patients with suspected appendicitis subsequent to routine clinical investigations (“all presentations”); and (ii) in only those patients with an equivocal diagnosis of appendicitis subsequent to routine clinical investigations (“equivocal only”).
A detailed assessment of study quality was undertaken using a modification of the diagnostic-specific checklist published by the Cochrane Screening and Diagnostic Tests Methods group. This approach enabled a quality score to be estimated for each study. Quality scoring was undertaken on the basis of the information clearly enunciated in the published paper. No attempt was made to contact authors to seek clarification.
This review aimed to identify studies, review their quality, and then summarize the evidence in each setting. Therefore, where patient population, test techniques, and interpretation and study quality were suitably consistent, meta-analyses were conducted to calculate pooled estimates of the key diagnostic performance measures. As all tests involve outcomes that are generally considered to be truly dichotomous rather than continuous in nature, it was considered more appropriate to conduct weighted pooling of proportions rather than summarizing in receiver operator curve space. Nevertheless, the association between sensitivity and specificity across the contributing studies was investigated before the pooling of results. As there was no significant association between sensitivity and specificity across the included studies in any of the settings, pooling was undertaken assuming that no underlying cut-point effect was present. Meta-analysis was then conducted with weighting applied according to the number of patients in each study.
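The two-step approach described above (check for an association between sensitivity and specificity, then pool with weights proportional to study size) might be sketched as follows. This is an illustration under stated assumptions, not the authors' actual procedure: the review does not name the association test used, so a Spearman rank correlation is assumed here, the weights are taken to be the total number of patients per study, and the study values are hypothetical.

```python
from math import sqrt

# Hypothetical per-study results: (patients, sensitivity, specificity).
# Invented for illustration; not taken from the review's tables.
studies = [
    (100, 0.80, 0.90),
    (200, 0.70, 0.92),
    (50, 0.90, 0.95),
    (150, 0.75, 0.85),
]


def pooled(values, weights):
    """Weighted mean of study-level proportions."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def ranks(values):
    """Rank values from 1 to n (assumes no ties, for simplicity)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


n = [s[0] for s in studies]
sens = [s[1] for s in studies]
spec = [s[2] for s in studies]

# Spearman rank correlation is the Pearson correlation of the ranks.
# A strong negative value would hint at a cut-point (threshold) effect.
rho = pearson(ranks(sens), ranks(spec))

pooled_sens = pooled(sens, n)
pooled_spec = pooled(spec, n)
print(f"rho={rho:.2f}, pooled sensitivity={pooled_sens:.3f}, pooled specificity={pooled_spec:.3f}")
```

Weighting by study size means that larger studies dominate the pooled estimate; other schemes (for example, inverse-variance weights) would give somewhat different results.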
RESULTS
Ultrasound
After application of the inclusion/exclusion criteria, a total of twenty-one studies were included in the review of ultrasonography for the diagnosis of acute appendicitis. One previously published meta-analysis was also retrieved (43). This meta-analysis included seventeen studies published between 1986 and 1994 that had been conducted in a variety of patient groups, although those conducted in pediatric populations alone were excluded. The authors included both prospective and retrospective studies, and they did not consider the patient selection criteria of these studies or other methodological limitations in any detail. No minimum criterion was applied with respect to the reference standard for the study to be included in the meta-analysis. All studies reported in Orr et al. (43) were considered for inclusion in the current systematic review.
The study design and quality of each of the included trials conducted in the “all presentations” setting are summarized in Table 1. The diagnostic performance results in this setting are presented in Table 2. The overall prevalence of acute appendicitis in these studies was 41 percent. However, despite the studies purporting to include all patients with suspected appendicitis after initial clinical examinations, a broad range of prevalences was apparent (23–78 percent). Therefore, the extent to which these studies have investigated a comparable group of patients is unclear. It is possible that some of this variation may reflect differences in the nature and extent of initial examination by the referring general practitioner and/or the emergency department or the experience of the referring or examining clinicians.

As the majority of studies were of a similar quality, all results were included in pooling. Where more than one value was available for an individual study, the value used was that where nonvisualizations and nondiagnostic ultrasounds were treated as disease-negative.
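The choice to treat nonvisualizations and nondiagnostic ultrasounds as disease-negative affects the two-by-two table in a predictable direction. The sketch below compares that convention with simply excluding such scans; all counts are hypothetical and chosen only to show the direction of the effect.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from two-by-two counts."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts for one ultrasound study (assumption, for illustration).
tp, fp, fn, tn = 60, 5, 10, 100
# Scans in which the appendix was not visualized or the result was nondiagnostic,
# split by the reference diagnosis.
nondiag_diseased, nondiag_healthy = 10, 15

# Option A: exclude nondiagnostic scans from the analysis.
sens_excl, spec_excl = sensitivity_specificity(tp, fp, fn, tn)

# Option B (the convention used for pooling here): count them as test-negative,
# so diseased patients become false negatives and the rest become true negatives.
sens_neg, spec_neg = sensitivity_specificity(
    tp, fp, fn + nondiag_diseased, tn + nondiag_healthy
)

print(f"excluded:    sensitivity={sens_excl:.3f}, specificity={spec_excl:.3f}")
print(f"as negative: sensitivity={sens_neg:.3f}, specificity={spec_neg:.3f}")
```

Under this convention sensitivity can only fall or stay the same and specificity can only rise or stay the same, relative to excluding the nondiagnostic scans.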
Five of the included studies were conducted in only those patients with an “equivocal” diagnosis subsequent to routine clinical investigations. The study design and quality of each of the included trials conducted in the “equivocal presentations only” setting are summarized in Table 1.
The diagnostic performance results in this setting are presented in Table 2. The overall prevalence of acute appendicitis in these studies was 28 percent. Prevalence ranged from 19 to 54 percent. As there was only a small range in study quality scores between all of the studies, all results were included in pooling.

Computed Tomography
After application of the inclusion/exclusion criteria, a total of twelve studies were included in the review of CT for the diagnosis of acute appendicitis. Of these, seven studies were undertaken in “all presentations” and five were conducted in only those patients with “equivocal” diagnosis. The study design and quality of each of the included trials in the “all presentations” setting are summarized in Table 3.

The diagnostic performance results in this setting are presented in Table 2. The overall prevalence of acute appendicitis in these studies was 46 percent. Again, a broad range of prevalence was apparent (24–72 percent); thus, the extent to which these studies have investigated a comparable group of patients is unclear. As with the ultrasound results, it is possible that some of this variation may reflect differences in the nature and extent of initial examination by the referring general practitioner and/or the emergency department or the experience of the referring or examining clinicians. As there was only a small range in study quality scores between all of the studies, all results were included in pooling.
The study design and quality of each of the five included trials conducted in the “equivocal presentations only” setting are summarized in Table 3. The diagnostic performance results in this setting are presented in Table 2. The overall prevalence of acute appendicitis in these studies was 46 percent. Prevalence ranged from 30 to 78 percent.
As the majority of studies were of similar quality, all results were included in pooling. Pooling was undertaken assuming that no underlying cut-point effect was present. This approach is supported by the lack of association between sensitivity and specificity across the included studies.
The intention of this review was to consider the diagnostic performance of CT and ultrasound in the two discrete settings indicated in Table 2. However, if these were truly discrete populations, the prevalence of positive appendicitis would be expected to be consistently higher in the former group, which would have included patients where there was a high level of suspicion of appendicitis, in addition to those patients whose diagnosis remained equivocal. Given the broad range and extent of overlap in appendicitis prevalence in the studies included in the two settings, it is possible that this grouping may be problematic. Poor definition and/or poor reporting of the studies' patient recruitment criteria, together with differences in the nature of the routine clinical investigations, may have contributed to the overlapping disease prevalence in the two chosen settings. Therefore, an overall pooled result is presented that captures all included studies.
DISCUSSION
This systematic review indicates that ultrasound has only a modest ability to detect patients with acute appendicitis. This finding was particularly the case when used in the group of patients with an equivocal diagnosis after routine clinical investigations (pooled sensitivity = 76.4 percent). When translated into clinical practice, these results indicate that approximately one quarter of patients with acute appendicitis may be misdiagnosed on the basis of their ultrasound result. This reduces the clinician's confidence in the ultrasound results, in turn increasing the need for additional diagnostic investigations to rule out appendicitis.
In contrast, in both settings, the relatively high specificity results indicate that ultrasound has considerable value in the detection of patients who do not have acute appendicitis. This distinction is based upon data from studies that, in most cases, classified all nonvisualizations of the appendix as a negative ultrasound result. In clinical practice, this will also have implications for resource use if these patients in fact go on to have repeat ultrasound or other diagnostic investigations. Higher diagnostic accuracy was apparent in the most recent trials, suggesting that technological advances may have improved the diagnostic performance of ultrasound.
The current systematic review found that CT has considerable ability to detect patients with acute appendicitis (pooled sensitivity = 96.5 percent). This finding was equally the case when used in all presentations and in only patients with equivocal diagnoses after other diagnostic testing. Similarly, in both settings, the relatively high specificity indicates that CT has considerable value in the detection of patients who do not have acute appendicitis (pooled specificity = 94.7 percent). When translated into clinical practice, these results indicate that clinicians can have confidence in CT results, in turn reducing the need for additional diagnostic investigations.
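How much confidence a clinician can place in an individual CT result also depends on the prevalence of appendicitis in the group being scanned. The following sketch applies Bayes' theorem to the pooled CT sensitivity and specificity quoted above, at the lowest, overall, and highest prevalence reported for the CT “all presentations” studies. It assumes that sensitivity and specificity stay constant across prevalence, which is a simplification and not a finding of this review.

```python
# Pooled CT estimates quoted in the text.
CT_SENSITIVITY = 0.965
CT_SPECIFICITY = 0.947


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at the given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# Prevalence values reported for the CT "all presentations" studies:
# lowest (24 percent), overall (46 percent), and highest (72 percent).
results = {
    p: predictive_values(CT_SENSITIVITY, CT_SPECIFICITY, p) for p in (0.24, 0.46, 0.72)
}

for prevalence, (ppv, npv) in results.items():
    print(f"prevalence={prevalence:.0%}: PPV={ppv:.3f}, NPV={npv:.3f}")
```

As prevalence rises, the positive predictive value increases and the negative predictive value decreases, which is one reason the poorly defined study populations make the pooled figures harder to apply to a given patient group.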
Direct comparison of diagnostic performance was undertaken in only one of the studies reviewed here (68), with the same patients undergoing both ultrasound and CT scanning. CT was found to have better diagnostic performance than ultrasound, particularly with respect to sensitivity.
The results of the current study are confirmed by another systematic review of CT and US, which was published while this article was under peer review (59). The authors of that review also found that CT had a higher sensitivity (94 percent) and specificity (95 percent) than US (sensitivity = 86 percent; specificity = 81 percent). Terasawa et al. (2004) also found that there was a wide variation in appendicitis prevalence in the publications they reviewed. They noted that it was problematic to explore the reasons for these differing prevalences, because the studies did not describe the patient populations in sufficient detail.
Use of the current systematic review's results in the planning of individual diagnostic testing, or in best-practice advice for populations of patients, requires that several factors be taken into account alongside the diagnostic performance of the tests themselves. Possible reductions in test ordering based on the more accurate CT findings would have to be borne out empirically. The higher capital costs of CT may restrict availability in diagnosing common conditions such as suspected appendicitis. This restriction may be exacerbated by reduced after-hours availability for emergency patients, where US is more likely to be available. The lower capital cost of ultrasound, however, is offset by its greater operator dependence, combined with (or perhaps contributing to) its lower sensitivity in diagnosing appendicitis. Timely access to diagnostic information, thus, must be weighed against the higher sensitivity of CT. Economic evaluation would need to compare the clinical and cost implications of false-positive and false-negative diagnoses arising from use of both technologies. Clinicians in the emergency department are acutely aware of the trade-offs between delayed diagnosis and a negative appendicectomy, both with potential risks to patients. This is particularly the case for women patients, as long-term fertility risks must be considered in addition to the risks of the acute phase of the disease.
Finally, it is important to consider the quality of the included studies in the interpretation of these findings. In general, the studies are of poor methodological quality, with considerable potential for bias, particularly with respect to patient selection. Future evaluation of diagnostic technologies in appendicitis should take care to clearly define the patient population in whom the diagnostic performance is being measured. In particular, authors should clearly define and report the patient recruitment criteria and the nature of previous clinical and diagnostic investigations. In addition, many of the studies suffer from poor follow-up of patients with negative ultrasound results, and few of the studies interpreted imaging findings independent of other clinical information. As a result, it is difficult to definitively determine the “added value” of either ultrasound or CT in the diagnosis of acute appendicitis.
Whereas the current systematic review suggests that CT is a sensitive and specific diagnostic modality, it does not support the routine use of CT in the diagnosis of acute appendicitis. Other factors, such as the cost and time delay associated with obtaining a CT scan, are essential considerations.
CONTACT INFORMATION
Adèle R. Weston, PhD, Director (aweston@htanalysts.com), Health Technology Analysts Pty Ltd., P.O. Box 133, Balmain, New South Wales 2041, Australia
Terri J. Jackson, PhD, Senior Research Fellow (terri.jackson@latrobe.edu.au), School of Public Health, LaTrobe University, Health Sciences Building 1, Bundoora, Victoria 3086, Australia
Stephen Blamey, MB, BS, Head of Surgery (sblamey@netspace.net.au), Gastrointestinal Surgery Unit, Monash Medical Centre, Clayton, Victoria 3168, Australia
The authors are grateful for clinical advice provided at an early stage of the project by Dr. Ian McDonald (MD, FRACP, FRACR) and Mr. Brian Collopy (FRACS, FRCS) and for the research assistance of Ms. Jenny Watts, Ms. Lisa Lane, Mr. Lachlan Standfield, and Dr. Suzanne Campbell. Funding support for Dr. Terri Jackson from the Research and Development Grants scheme of the Australian National Health and Medical Research Council is gratefully acknowledged.
References

Table 1. Study Design and Quality for Ultrasonography Studies of Patients with Suspected Appendicitis

Table 2. Summary of CT and US Diagnostic Performance

Table 3. Study Design and Quality for Studies of CT in Patients with Suspected Appendicitis