Accuracy of Imaging Modalities in Choledocholithiasis: A Real-Life Data
In this retrospective study of real-life data, we aimed to determine the diagnostic accuracy in patients with choledocholithiasis of some current imaging modalities, including ultrasonography (US), computerized tomography (CT), magnetic resonance cholangiopancreatography (MRCP), and endoscopic retrograde cholangiopancreatography (ERCP). This study utilized a database of imaging records from 86 consecutive patients with ERCP-proven choledocholithiasis in a single-center outpatient clinic. Features of the stones found, namely number, size, localization, choledochal dilation and cholestasis, were determined using various imaging modalities and liver function tests (LFTs). Our study focused on a total of 86 patients (43 female; 43 male) who underwent the ERCP procedure. Hepatobiliary ultrasound was performed in 71 (82.6%); MRCP in 59 (68.6%); and CT in 13 (15.1%) patients. All 86 patients had choledocholithiasis: 59 (68.6%) with multiple stones and 21 (24.4%) with stones over 10 mm in diameter. Sensitivity for the presence of choledocholithiasis was 40.8% for US, 76.9% for CT, and 86.4% for MRCP, where ERCP was taken as the reference method. Even though US, CT, and MRCP are widely used as noninvasive imaging modalities for choledocholithiasis, in our real-life data their sensitivity was lower than expected. MRCP is preferred when a purely diagnostic, nontherapeutic evaluation is intended; however, while highly competent in establishing the level of choledochal dilation, it had a low yield in differentiating the localization, size, and number of the stone(s).
Choledocholithiasis (CL) is found in up to 15% of patients with gallbladder stones.1 Some patients may present with biliary colic, jaundice, and cholangitis, while a minority of patients may remain asymptomatic. Imaging modalities for diagnosing choledocholithiasis include upper transabdominal ultrasonography (US), cholangiography, computed tomography (CT), magnetic resonance cholangiopancreatography (MRCP), endoscopic ultrasonography (EUS), and endoscopic retrograde cholangiopancreatography (ERCP). Patients with obvious bile duct obstruction may go directly to ERCP, but some will need further imaging tests to confirm the diagnosis. As a safe and widely available method, US is usually the first choice, although it is not always possible to detect CL on abdominal ultrasonography. If detected, the specificity is up to 95%.2 If the diagnosis is uncertain, MRCP may be the next modality used, in preference to diagnostic ERCP because of the latter's complication rates.3 Patients with a high clinical suspicion of CL will usually proceed directly to ERCP for both detection and removal of the obstructing choledochal stones.4
US, CT, MRCP, and ERCP, alongside liver function tests (LFTs), are diagnostic tools for determining the features of choledocholithiasis.5 These tools can inform details about the features of the obstruction and differentiate lithiasic versus nonlithiasic and benign versus malignant conditions.6 The radiation risks of CT lead to a shift in choice to MRCP, which can show the location, position, dimension, and nature of the obstruction and also the length of the obstructed segment and the degree of the upstream dilation.7,8 However, ERCP is the gold standard for evaluating hepatobiliary and pancreatic duct morphologies; moreover, it allows for many therapeutic interventions despite some morbidity and mortality issues.8,9
In day-to-day practice, real-life experience often highlights the inherent weaknesses and shortcomings of these imaging tools. However, several published studies report MRCP as a very accurate method with sensitivity and specificity over 90% in CL,10–20 with others reporting sensitivity and specificity between 80% and 90%.21–28 This prompted us to evaluate our own real-life raw data and review the relevant literature. The aim of this study was to evaluate the sensitivity and specificity of US, CT, LFTs, and MRCP in the diagnostic workup of suspected choledocholithiasis compared with ERCP, in real-life circumstances.
Materials and Methods
A retrospective database query analysis was performed on the imaging modalities used on 86 consecutive ERCP-confirmed CL patients admitted to the gastroenterology department of İstanbul Medeniyet University Göztepe Training and Research Hospital during the 12-month interval between June 2014 and June 2015. The study group consisted of 86 patients: 43 (50.0%) females and 43 (50.0%) males. All patients were hospitalized with elevated LFTs and an initial diagnosis of obstructive CL. Patients with malignancy, stricture, or abnormalities other than CL were excluded from the study. Medical records, as well as all procedural and radiologic reports including US, CT, MRCP, and ERCP, were reviewed for each patient after the approval of the ethics committee (no:112/2015). All ERCPs were carried out by gastroenterologists, gastrointestinal surgeons, and the authors of this study, over its 1-year time frame. The numbers of US, CT, and MRCP examinations performed were 71 (M = 37, F = 24); 13 (M = 4, F = 9); and 59 (M = 31, F = 28), respectively. ERCP was performed and CL assessed in all 86 patients. ERCPs were performed with a video duodenoscope (Fujinon, Fujifilm, Tokyo, Japan) and a cannulating sphincterotome (Boston Scientific, Marlborough, Massachusetts) with ioxidine contrast. Transabdominal ultrasonographic examination was carried out using a 3.75-MHz probe (Aplio Xu, Toshiba Corp, Tokyo, Japan) and reported on by the duty radiologist on the admission day of the patient. CTs utilized a commercial scanner (Philips Brilliance 6, Philips Medical Systems, Baltimore, MD) after oral lactulose and intravenous V08a x-ray contrast media. MRCPs were performed with a commercial MRI scanner (1.5 Tesla HDXe, General Electric, Fairfield, Connecticut) with an LCC bore magnet using software version 15x and a 12-channel body array coil (Signa HD 1.5T, General Electric) with Magnevist contrast. All data were entered into a database form and statistically analyzed.
Statistical Analysis
Descriptive statistics and sensitivity were calculated with statistical software (SAS version 9.3, SAS Institute, Inc, Cary, North Carolina; and SPSS version 17, SPSS, Inc, Chicago, Illinois). All results were expressed as mean ± standard deviation for all subjects, and independent sample t-tests, Mann-Whitney U, Levene test for variance analysis, and χ2 tests were used. A value of P < 0.05 was considered to be statistically significant and the confidence interval was 95%.
Results
A total of 86 patients (43 females, 43 males) who underwent ERCP for obstructive CL were studied. Of these, 71 had US, 13 had CT, and 59 had MRCP followed by ERCP. The average age of the patients was 65.5 ± 16.0 years (range: 21–95). The sensitivity of US, CT, and MRCP in identifying the presence of CL was 40.8%, 76.9%, and 86.4%, respectively.
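Because every patient in the cohort was ERCP-positive, the proportion of examined patients in whom a modality detected CL is that modality's sensitivity. The following minimal sketch reproduces the reported percentages. The true-positive counts (29, 10, and 51) are not stated in the text; they are back-calculated assumptions that match the reported figures. The Wilson score interval is added only for illustration and was not reported by the authors.

```python
from math import sqrt

# Number of ERCP-positive patients examined with each modality (from the text).
examined = {"US": 71, "CT": 13, "MRCP": 59}

# ASSUMPTION: true-positive counts are not given in the text; these values are
# back-calculated so that they reproduce the reported percentages.
detected = {"US": 29, "CT": 10, "MRCP": 51}


def sensitivity(tp, fn):
    """Sensitivity = TP / (TP + FN)."""
    return tp / (tp + fn)


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (illustrative)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


results = {}
for name, n in examined.items():
    tp = detected[name]
    fn = n - tp  # every examined patient is ERCP-positive, so misses are FN
    results[name] = round(100 * sensitivity(tp, fn), 1)
    low, high = wilson_interval(tp, n)
    print(f"{name}: {results[name]}% (95% CI {100 * low:.1f}-{100 * high:.1f}%)")
```

With only 13 CT examinations, the interval around the CT estimate is expected to be much wider than those for US and MRCP.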
A total of 31.4% of patients had a single stone, while multiple stones were found in 68.6% of patients. Accurate predictions of the number of stones made by US, CT, and MRCP were 29.6%, 23.0%, and 31.0%, respectively. The localization of the stones was distributed between 10.4% ampullary, 87.2% distal choledochal, and 1.4% proximal choledochal, with an accurate prediction of the site in 33.8%, 15.4%, and 57.6% of cases for US, CT, and MRCP, respectively. The diameter of the stones was less than 3 mm in 39.5%, 3 to 10 mm in 36.0%, and over 10 mm in 24.4%. This diameter was accurately predicted in 15.5%, 30.8%, and 94.9% of cases according to US, CT, and MRCP, respectively (Tables 1, 2, and 3).



LFTs included total and direct bilirubin, alkaline phosphatase (ALP), and gamma glutamyl transpeptidase (GGT). The mean values were 3.0 ± 4.0 (0.3–19.8) for total bilirubin, 2.0 ± 2.8 (0.1–12.0) for direct bilirubin, 227.5 ± 318.0 (45–2699) for ALP and 297.5 ± 348.0 (11–2064) for GGT. These levels did not differ between true positive and false negative cases as indicated by US, CT, and MRCP (P > 0.05). Interestingly, bilirubin, ALP, and GGT levels were higher in patients with a single stone than in those with multiple stones, although the difference was not statistically significant (P = 0.66).
Discussion
Among symptomatic gallstones, which usually originate from the gallbladder, choledochal stones have an incidence of 8% to 18%.4 The diagnostic workup of these cases, after an initial clinical and laboratory assessment, includes a number of imaging modalities such as US, MRCP, EUS, CT, ERCP, intravenous cholangiography with CT or intraoperative US, and intraoperative cholangiography.4,5,26 All these modalities have their own advantages, limitations, and disadvantages.
US, as an initial tool, has great advantages due to its noninvasiveness, availability, and low cost. However, the accuracy of US in CL is reported as ranging from 20% to 80% in many studies and it has a low success rate for predicting choledochal diameter.19,29–31 In our real-life data, the sensitivity of US was 40.8%. CT missed CL in 23.1% of the patients examined (sensitivity 76.9%), but it can be performed to exclude extrabiliary pathologies. As a noninvasive equivalent of ERCP, MRCP is the preferred modality. It is often highly successful in identifying CL and can determine the stones' number, size, location, and choledochal dilation.17,32 In contrast, in a study of patients with gallstone pancreatitis, MRCP showed a sensitivity of only 62% but a specificity of 98% for CL.28 EUS is also reported as an effective tool in CL,33,34 but was not within the scope of our study.
An accurate imaging modality for CL prior to ERCP could be expected to predict the number of stones, stone localization, diameter, and choledochal dilation. Unfortunately, the success rates for these modalities in predicting these parameters were lower than expected (Table 2). The most common consequence of CL is the dilation of the biliary tree due to upstream obstruction. The upper limit of the choledochal caliber in non-cholecystectomy patients is accepted as 6 to 8 mm, and in patients after cholecystectomy as 10 mm. Also, it must be kept in mind that after 60 years of age, choledochal caliber increases 1 mm for each 10 years. However, only MRCP could predict the presence of choledochal dilation with 95% accuracy.24
Some recent studies suggest that the false negatives produced by MRCP pose a clinical problem in daily practice. In our study, the sensitivity of MRCP in determining CL was 86.4%. New techniques in MRCP may reduce false-negative results in real life.35 In daily practice, heterogeneity is an issue: the time of analysis, interpretations made by different physicians with varying levels of experience, and diagnostic exams carried out by a changing staff may lead to these highly variable results. There are some limitations to our study. First, this is a retrospective analysis. Second, ERCP was accepted as the gold standard and its possible errors were not taken into consideration. Since all cases were positive for CL on ERCP, specificity and predictive values could not be calculated.
A review of the literature brings up many papers on this topic. Some emphasize the inaccuracy of MRCP, especially related to its low negative predictivity. However, in one study, the total accuracy rates in detecting biliary obstruction were over 99% for MRCP36 and Chang et al37 reported that MRCP had 100% sensitivity. Some studies have found that MRCP and ERCP were not cost-effective strategies for evaluating suspected CL; instead they suggested that laparoscopic cholecystectomy and intraoperative cholangiography should be the preferred options.38,39 In another study, the diagnostic accuracy of CL by MRCP was reported as a sensitivity of 57.1%, specificity of 100%, positive predictive value of 100%, and negative predictive value of 50%.40 A study by Aydelotti41 revealed the sensitivity and negative predictive value of MRCP for CL as 80% and 54%, respectively.
There are some caveats to this study. Firstly, imaging methods were not equal in number; the low number of US and CT may have made the sensitivity and specificity less predictive. Clearly, data with a higher number of consecutive patients may give more precise results. Also, the level and severity of LFTs may have affected the clinician's choice of imaging modality. Negative predictive value (NPV) could not be calculated since all patients were CL positive and the number of negative cases was zero [NPV = true negative/(false negative + true negative)]. Positive predictive value could not be calculated either, since the number of false positive cases was also zero. Finally, this study differs from many others, as the authors had no control over the data. The imaging modalities were not selected, performed, or reported by the same physician; instead, many agents were involved with differing levels of experience, at different times of the day and most of them under night-duty conditions.
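The effect of an all-positive cohort on the standard 2×2 metrics can be sketched as follows. This is an illustrative sketch and not the authors' analysis: the helper names are hypothetical, and the MRCP counts (51 detected, 8 missed) are back-calculated from the reported 86.4% sensitivity in 59 examinations. With no ERCP-negative patients, specificity has a zero denominator, while the predictive values are fixed by the study design and therefore carry no diagnostic information.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic accuracy metrics against a reference test."""
    return {
        "sensitivity": safe_ratio(tp, tp + fn),
        "specificity": safe_ratio(tn, tn + fp),
        "ppv": safe_ratio(tp, tp + fp),
        "npv": safe_ratio(tn, tn + fn),
    }


# ASSUMPTION: MRCP counts back-calculated from the text (59 exams, 86.4%
# sensitivity). All patients were ERCP-positive, so FP = TN = 0 by design.
mrcp = diagnostic_metrics(tp=51, fp=0, fn=8, tn=0)

for metric, value in mrcp.items():
    shown = "undefined" if value is None else f"{100 * value:.1f}%"
    print(f"{metric}: {shown}")
```

In this design only sensitivity is an informative estimate; a cohort that also includes ERCP-negative patients would be needed to estimate specificity and meaningful predictive values.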
In conclusion, in our study US, CT, and MRCP had a lower yield than expected for accuracy in determining CL when compared to ERCP, the gold standard for CL diagnosis and intervention in real-life practice. Thus, a negative MRCP in particular may not exclude CL; the false-negative rate of MRCP in our study was 13.6%. Regarding symptomatic gallstones, laparoscopic cholecystectomy is advised directly without further evaluation if US and LFTs show no obstruction; otherwise MRCP or ERCP should be preferred prior to a cholecystectomy, which may then be combined with intraoperative cholangiography and transcystic stone removal.42–50 Patients with a normal US or MRCP but with a suspicion of CL should be evaluated further if the clinical and laboratory findings are not within normal boundaries.