Application of the 3D slicer chest imaging platform segmentation algorithm for large lung nodule delineation

  • Stephen S. F. Yip,

    Stephen_Yip@dfci.harvard.edu

    Affiliation Department of Radiation Oncology, Dana-Farber Cancer Institute, Brigham and Women’s Hospital, and Harvard Medical School, Boston, MA, United States of America

  • Chintan Parmar,

    Affiliation Department of Radiation Oncology, Dana-Farber Cancer Institute, Brigham and Women’s Hospital, and Harvard Medical School, Boston, MA, United States of America

  • Daniel Blezek,

    Affiliation Biomedical Engineering Department, Mayo Graduate School of Medicine Rochester, MN, United States of America

  • Raul San Jose Estepar,

    Affiliation Department of Radiology, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, United States of America

  • Steve Pieper,

    Affiliation Isomics, Inc., Cambridge, MA, United States of America

  • John Kim,

    Affiliation Department of Radiology, University of Michigan Health System, Ann Arbor MI, United States of America

  • Hugo J. W. L. Aerts

    Affiliations Department of Radiation Oncology, Dana-Farber Cancer Institute, Brigham and Women’s Hospital, and Harvard Medical School, Boston, MA, United States of America, Department of Radiology, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, United States of America

Abstract

Purpose

Accurate segmentation of lung nodules is crucial in the development of imaging biomarkers for predicting malignancy of the nodules. Manual segmentation is time consuming and affected by inter-observer variability. We evaluated the robustness and accuracy of a publicly available semiautomatic segmentation algorithm that is implemented in the 3D Slicer Chest Imaging Platform (CIP) and compared it with the performance of manual segmentation.

Methods

CT images of 354 manually segmented nodules were downloaded from the LIDC database. Four radiologists performed the manual segmentation and assessed various nodule characteristics. The semiautomatic CIP segmentation was initialized using the centroid of each manual segmentation, thereby generating four contours for each nodule. The robustness of both segmentation methods was assessed using the region of uncertainty (δ) and the Dice similarity index (DSI), and compared using the Wilcoxon signed-rank test (pWilcoxon < 0.05). The Dice similarity index (DSIAgree) between the manual and CIP segmentations was computed to estimate the accuracy of the semiautomatic contours.

Results

The median computational time of the CIP segmentation was 10 s. The median CIP and manually segmented volumes were 477 ml and 309 ml, respectively. CIP segmentations were significantly more robust than manual segmentations (median δCIP = 14 ml, median dsiCIP = 99% vs. median δmanual = 222 ml, median dsimanual = 82%) with pWilcoxon ~ 10^−16. The agreement between CIP and manual segmentations had a median DSIAgree of 60%. While 13% (47/354) of the nodules did not require any manual adjustment, minor to substantial manual adjustments were needed for 87% (305/354) of the nodules. CIP segmentations were observed to perform poorly (median DSIAgree ≈ 50%) for non-/sub-solid nodules with subtle appearances and poorly defined boundaries.

Conclusion

Semi-automatic CIP segmentation can potentially reduce the physician workload for 13% of nodules owing to its computational efficiency and superior stability compared to manual segmentation. Although manual adjustment is needed for many cases, CIP segmentation provides a preliminary contour for physicians as a starting point.

Introduction

Quantitative imaging has become an important area of research for the development of non-invasive imaging biomarkers for numerous applications, such as the prediction of clinical outcomes, and assessment of treatment response and gene expression [1, 2]. In particular, quantitative imaging has the potential to have an immense impact on lung cancer patients. Lung cancer is a leading cause of cancer-related death among men and women, affecting over 1.8 million patients worldwide [3]. At the time of diagnosis, the majority of patients are in advanced stages of disease, resulting in poor prognoses with a 5-year overall survival rate of < 20% [4]. However, patients who are treated for early stage disease have a substantially greater overall survival rate of > 50% [4]. Therefore, identification of patients with early stage disease is crucial for improving prognosis of lung cancer patients [5].

Computed tomography (CT) is routinely used to diagnose and monitor disease progression in lung cancer patients, where early stage disease is often manifested as pulmonary nodules [6, 7]. One of the challenges of identifying patients with early stage lung cancer is that these pulmonary nodules may also be an indicator of benign conditions, such as inflammation and/or infection, rather than malignancy [8]. Studies have hypothesized that malignant nodules possess CT imaging features that distinguish them from benign nodules, such as greater lesion volume, longer diameter and faster growth rate [9–14]. Classifiers that are built using imaging features have shown promise in assisting physicians to effectively identify different nodule types [15–20]. The development and accuracy of these classifiers rely on accurate delineation of the region of interest that conforms only to the nodule boundaries. Quantitative imaging features are then extracted and evaluated from this region of interest to generate the classifier. Therefore, inaccurate segmentation of tumors can lead to the development of inaccurate classifiers or biomarkers. Manual segmentation by experienced radiologists is commonly used for defining the nodule volume (or region of interest) using a slice-by-slice approach. However, manual segmentation is not only labor intensive, but is also impacted by inter- and intra-observer variability [21–24]. A number of automatic and semiautomatic segmentation methods have been proposed, ranging from simple approaches, such as thresholding [25] and region growing [26], to more complex methods based on the probability map of nodule textures and convexity [17, 27, 28]. Despite having great potential to reduce human errors and expedite the nodule contouring workflow, these methods are currently not publicly accessible, which limits their widespread use in clinical and biomedical research.

Alternatively, 3D Slicer is an open-source software platform for biomedical research [29] that supports versatile visualization and provides advanced analysis tools, such as image segmentation and registration. An algorithm implemented in 3D Slicer, known as GrowCut, can delineate large lung tumor volumes more robustly than manual segmentation [30], and reliably extract imaging features for the development of imaging biomarkers [31]. However, segmentation of pulmonary nodules presents a unique challenge since the nodules are often smaller and in close proximity with surrounding tissues. Therefore, additional pruning steps are required in the nodule segmentation process to remove pleural and/or vessel attachments [32, 33]. To address these challenges with nodule segmentation, a level set-based algorithm has been implemented within the Chest Imaging Platform (CIP) in 3D Slicer [33, 34]. This algorithm is based on a front propagation approach from a “seed point” placed within the nodule. The propagation of the front (or segmentation) is constrained to prevent leakage into the chest wall, airway walls or regions with appearance of tubular or vessel-like structures.

This study investigated the ability of CIP segmentation to assist physicians with nodule segmentation. In particular, we evaluated the robustness of the CIP segmentation algorithm in delineating lung nodules and compared its performance with that of manual segmentation. The accuracy of the CIP segmentation algorithm and the nodule characteristics that could affect the segmentation quality were also investigated.

Materials and methods

Patient dataset: Since a publicly available dataset was used in this study, approval by an institutional review board was not needed. A publicly available thoracic CT dataset, known as the Lung Image Database Consortium (LIDC), was downloaded from The Cancer Imaging Archive (TCIA: https://wiki.cancerimagingarchive.net/display/Public/LIDC-IDRI/) [23]. The LIDC dataset consisted of 1007 patients with low dose helical thoracic CT images containing annotated lung nodules that were acquired from seven academic institutions with slice thicknesses ranging from 1 mm to 5 mm. In the LIDC dataset, each nodule had 1 to 4 manual segmentations that were performed on a slice-by-slice basis by experienced thoracic radiologists. The LIDC radiologists assigned scores (ranging from 1 to 5) to each nodule for nine categories that described the nodule characteristics: subtlety, internal structure, calcification, roundness, margin sharpness, lobulation, spiculation, texture, and likelihood of being malignant. Table 1 contains an annotation of the scoring system. Images were excluded from the current study if they were not segmented by 4 or more radiologists (n = 596), if their nodule numbers were mislabeled (n = 60), or if they had imaging artifacts (n = 77, Fig A in S1 File). The imaging artifacts were due to corruption in the original LIDC DICOM files. As a result, images from 274 patients with 354 nodules (1–4 nodules/patient) were used to analyze the robustness and accuracy of 3D Slicer CIP segmentation.

Nodule segmentation algorithm in 3D Slicer: The CIP in 3D Slicer 4.5 [29] employs a semiautomatic nodule segmentation algorithm based on the open-source Lesion Sizing Toolkit [33]. A seed point is placed within the nodule region to initialize the segmentation. In the current study, the seed point was chosen as the centroid of each manual contour. As there were four radiologist-defined nodule contours, four CIP segmentations were generated automatically for each nodule. To reduce the computation time, the CIP segmentation algorithm also automatically cropped the CT images around the seed points with a radius of 30 mm. If an image contained multiple nodules, then an automatically cropped region was created around the seed point for each nodule.

Within each cropped region, the nodule segmentation was based on a level set formulation (Sethian 1999) to propagate a front according to a Geodesic Active Contour functional [35]. The contour propagation is governed by a smoothing term that minimizes the curvature of the contour and an “attachment” term that pulls the front towards the features of interest. This second term employs a speed map, F, to guide the segmentation results according to the desired characteristics of nodules. The speed map is obtained as a sigmoid-transformed min pooling of four different feature maps that are designed to slow down the evolution in 1) the chest wall region, 2) vascular structures, 3) the interface between the nodule and the lung parenchyma, and 4) areas whose density is not compatible with nodular structures. The chest wall feature map was obtained according to a threshold-based approach followed by morphological operations similar to the ones employed in standard lung segmentation approaches [36]. Vessel-like structures were penalized based on the Sato vesselness filter [37]. The interface map between the nodule and the lung parenchyma was defined according to a Canny edge detector. Finally, the non-nodular regions were excluded based on a sigmoid function with parameters alpha = 100 and beta = -200 and -500 for solid and non-solid nodules, respectively. One or more seed points within the nodule initialize the segmentation.
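
As an illustration of how a density feature and min pooling might combine into a speed value for a single voxel, the following is a minimal sketch and not the CIP implementation. It assumes an ITK-style sigmoid of the form 1 / (1 + exp(-(x - beta) / alpha)), intensities in Hounsfield units, and feature values already scaled to [0, 1]; the function names and the example feature values are hypothetical.

```python
import math


def sigmoid(x, alpha, beta, out_min=0.0, out_max=1.0):
    """ITK-style sigmoid (assumed form): maps intensity x into (out_min, out_max)."""
    return (out_max - out_min) / (1.0 + math.exp(-(x - beta) / alpha)) + out_min


def min_pool(feature_values):
    """Combine per-voxel feature values by keeping the most restrictive (smallest) one."""
    return min(feature_values)


# Hypothetical feature values for one voxel; lower values slow the front down.
chest_wall = 0.9   # far from the chest wall
vesselness = 0.8   # weak vessel-like response
edge = 0.7         # moderate edge response
# Density feature for a soft-tissue-like voxel, using the solid-nodule parameters.
density = sigmoid(-50.0, alpha=100.0, beta=-200.0)

speed = min_pool([chest_wall, vesselness, edge, density])
```

With these assumed values the edge feature is the smallest of the four and therefore determines the speed at this voxel. A lung-parenchyma-like intensity (for example -800) would instead drive the density feature, and hence the speed, close to zero.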

Robustness of the segmentation methods: The region of uncertainty (δ) and the Dice similarity index (DSI) were used to assess the robustness of the manual and CIP segmentations. The region of uncertainty was defined as the negation of the intersect regions of all the segmentations (Fig 1), that is, the voxels included in at least one but not all of the four segmentations:

$$\delta_{method} = \left(\bigcup_{i=1}^{4} V^{i}_{method}\right) \setminus \left(\bigcap_{i=1}^{4} V^{i}_{method}\right) \qquad (1)$$

where method could be either manual or CIP segmentation. For manual segmentation, the superscript indicates the nodule volume delineated by the four different radiologists, whereas for CIP segmentation, it indicates the segmentations initialized by the centroid computed from the four radiologist-defined volumes. A δ equal to zero indicated that the segmentation method was perfectly robust across the four segmentations. The stability of the segmentation method decreases as δmethod increases (Fig 2).

Fig 1. Comparison of manual (left) and CIP-based (right) segmentation.

The yellow shaded region indicates the disagreement (or region of uncertainty) between contours performed by four radiologists (bottom left) or generated from different CIP seed locations (bottom right). In this example, the region of uncertainty for manual segmentation was 3222 ml, while it was only 46 ml for the CIP-based segmentation. dsiCIP was ≈ 100%, while dsimanual was 88%.

https://doi.org/10.1371/journal.pone.0178944.g001

Fig 2. Robustness (or stability) of the manual and CIP-based segmentation.

The robustness of the manual and CIP-based segmentation was assessed with the region of uncertainty (δ) and the Dice similarity index (dsi).

https://doi.org/10.1371/journal.pone.0178944.g002

The DSI for segmentation stability was defined as follows:

$$dsi_{method} = \frac{2\,n\left(V^{i}_{method} \cap V^{j}_{method}\right)}{n\left(V^{i}_{method}\right) + n\left(V^{j}_{method}\right)} \qquad (2)$$

where method could be either manual or CIP segmentation, and n(V) indicates the number of voxels in volume V. i and j ranged from 1 to 4, indicating nodule volumes segmented by radiologists i and j, or initialized by the centroid computed from radiologists i and j, for manual and CIP segmentations, respectively. There were four contours for each segmentation method and, thus, six possible combinations of i and j. The stability of the segmentation method increases with increasing dsimethod, where dsimethod = 100% indicates a perfectly robust method.
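
The two robustness measures described above can be sketched on toy data as follows. This is an illustrative sketch only: each segmentation is represented as a set of (x, y, z) voxel coordinates, the region of uncertainty is taken as the union minus the intersection of the four contours, and δ is reported as a voxel count (multiplying by the voxel volume would give a physical volume). How the six pairwise dsi values were summarized per nodule is not stated here, so the sketch simply returns all six. The function names are hypothetical.

```python
from itertools import combinations


def region_of_uncertainty(masks):
    """Voxels contained in at least one, but not all, of the segmentations."""
    union = set().union(*masks)
    intersection = set(masks[0]).intersection(*masks[1:])
    return union - intersection


def dice(a, b):
    """Dice similarity index between two voxel sets, in [0, 1]."""
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


def pairwise_dice(masks):
    """Dice index for every unordered pair of segmentations (six pairs for four masks)."""
    return [dice(a, b) for a, b in combinations(masks, 2)]


# Toy example: four contours of the same nodule that differ by a few voxels.
core = {(x, y, 0) for x in range(4) for y in range(4)}  # 16 voxels
masks = [core, core | {(4, 0, 0)}, core | {(4, 1, 0)}, core - {(0, 0, 0)}]

delta = region_of_uncertainty(masks)   # set of disputed voxels
dsis = pairwise_dice(masks)            # one value per pair of contours
```

In this toy case the four contours disagree on three voxels, so δ is small relative to the 16-voxel core and all pairwise dsi values are close to 1.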

The robustness of the CIP segmentation method (δCIP or dsiCIP) was compared with that of the manual segmentation method (δmanual or dsimanual) using the Wilcoxon signed-rank test, where pWilcoxon < 0.05 indicated statistical significance. Moreover, the average nodule volumes segmented by the manual and CIP contouring methods were also compared and tested for significant differences. The average nodule volume was defined as $\bar{V}_{method} = \frac{1}{4}\sum_{i=1}^{4} V^{i}_{method}$, where i indicates radiologist i. Since the ground truth of the nodule segmentation is unknown, the average nodule volume ($\bar{V}_{manual}$) computed from the manual contours was used to estimate the true nodule volume. Unless otherwise specified, $\bar{V}_{manual}$ is referred to as nodule volume.
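
For readers unfamiliar with the paired comparison used here, the following is a minimal sketch of a two-sided Wilcoxon signed-rank test using the large-sample normal approximation, without tie or continuity corrections. It is not the code used in the study, and the dsi values in the example are made up for illustration. In practice, an established routine such as `scipy.stats.wilcoxon` would normally be preferred.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    positions = {}
    for idx, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(idx)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def wilcoxon_signed_rank(x, y):
    """Return (W+, two-sided p) for paired samples x and y (normal approximation)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences are dropped
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w_plus - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return w_plus, p


# Made-up per-nodule dsi values (in %) for eight nodules, for illustration only.
dsi_cip = [99, 98, 100, 97, 99, 96, 99, 98]
dsi_manual = [82, 80, 85, 77, 84, 79, 81, 83]
w_plus, p_value = wilcoxon_signed_rank(dsi_cip, dsi_manual)
```

Because every CIP value exceeds its manual counterpart in this toy sample, W+ takes its maximum possible value and the approximate p-value falls below 0.05.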

Accuracy of the CIP segmentation method: The accuracy of CIP segmentations was evaluated to ensure that non-nodular tissues were excluded and the entire nodule volume was contoured. Even if the CIP segmentation was perfectly robust, it may include nearby non-nodule tissues or fail to capture the entire nodule region. For example, despite being almost perfectly robust, nodules contoured by the CIP segmentation method were observed to include substantial normal lung regions as shown in Figs 3B and 4A. The agreement between the manual and CIP segmentations was used to estimate how well the nodule volume could be delineated by the CIP segmentation. DSIAgree was used to assess the segmentation agreement and was defined as follows:

$$DSI_{Agree} = \frac{2\,n\left(V^{i}_{CIP} \cap V_{j}\right)}{n\left(V^{i}_{CIP}\right) + n\left(V_{j}\right)} \qquad (3)$$

where $V^{i}_{CIP}$ is the CIP-segmented nodule volume initialized by the centroid of the nodule volume segmented by radiologist i. Vj could be either the intersection (j = 1) or the union (j = 2) of the radiologist-defined segmentations.
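
The distinction between robustness and accuracy drawn above can be illustrated with a small sketch. It assumes that agreement is computed as a Dice index between each CIP contour and either the intersection or the union of the manual contours; how these values were aggregated per nodule is not stated here, so the sketch returns one pair of values per CIP contour. The function names and toy voxel sets are hypothetical.

```python
def dice(a, b):
    """Dice similarity index between two voxel sets, in [0, 1]."""
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


def dsi_agree(cip_masks, manual_masks):
    """Dice of each CIP mask against the intersection and the union of the manual masks."""
    manual_intersection = set(manual_masks[0]).intersection(*manual_masks[1:])
    manual_union = set().union(*manual_masks)
    return [(dice(c, manual_intersection), dice(c, manual_union)) for c in cip_masks]


# Toy nodule: 16 voxels in one slice; two radiologists include one extra voxel.
nodule = {(x, y, 0) for x in range(4) for y in range(4)}
manual_masks = [nodule, nodule, nodule | {(4, 0, 0)}, nodule | {(4, 0, 0)}]

# Four identical CIP contours (perfectly robust) that leak into an adjacent slice.
leak = {(x, y, 1) for x in range(4) for y in range(4)}
cip_masks = [nodule | leak] * 4

scores = dsi_agree(cip_masks, manual_masks)
```

Here the four CIP contours are identical, so their pairwise dsi would be 100%, yet their agreement with the manual contours is only about 65–67% because half of each CIP contour lies outside the manually defined nodule.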

Fig 3. Bland-Altman plots.

The Bland-Altman plot highlights the differences between the average manually segmented and CIP-segmented nodule volumes for all nodules. The 95% interval of the differences is depicted by the blue dotted lines. The solid red line is the average difference between the two volumes (= 318ml).

https://doi.org/10.1371/journal.pone.0178944.g003

Fig 4. Examples of nodules that were segmented by radiologists manually and CIP segmentations.

a) The robustness of the CIP segmentation was excellent, while substantial inter-observer variability was observed in manual segmentation. CIP segmentation was also in excellent agreement with the manual contours. However, CIP segmentation was observed to include part of the chest wall (indicated by an arrow). b) Despite being perfectly robust, the CIP segmentation included a region of normal lung in proximity to the small nodule. c) Cavitation in the center of the nodule. Poor CIP segmentation performance was found. d) Non-solid (ground glass opacity) nodule with a poorly defined boundary and subtle appearance, indicated by the red arrow. Poor CIP segmentation performance was found.

https://doi.org/10.1371/journal.pone.0178944.g004

To avoid confusion, lower case dsi was used to indicate the robustness of the segmentation method while upper case DSI was used to indicate the accuracy of the CIP-based segmentation in this paper.

Moreover, all the CIP-segmentations were visually inspected by an experienced radiologist (J.K.) and researcher (S.Y.). They then classified the nodule segmentations into four categories: 1) substantial, 2) moderate, 3) minor, and 4) no manual adjustment required.

Relationship between nodule characteristics and CIP segmentation accuracy: To identify nodule characteristics that may affect the accuracy of the CIP segmentation, Spearman’s correlation coefficient was computed between the radiologist-scored nodule characteristics and DSIAgree. For nodule characteristics that had a continuous scoring scale (e.g. margin ranges from 1 to 5, where 1 indicates a poorly defined margin and 5 indicates a sharp margin) (Table 1), a t-test was used to assess if the correlation coefficient was significantly different from 0 (pt-test<0.05). For characteristic categories where the scoring scale was categorical (nominal) rather than continuous (e.g. nodule calcification, where each score indicates a different appearance) (Table 1), the Kruskal-Wallis test (pKruskal-Wallis<0.05) was used.

The correlations between nodule volume, DSIAgree and all nodule characteristics were also calculated. Four radiologists scored each category, and thus, there was some variability in the characteristic scoring. When there was a heterogeneous rating, the score that was assigned by the majority of radiologists was chosen for the analysis. In the case of a tie rating, the score that was most frequently assigned to the patient population was chosen. The distributions of the scores for each nodule characteristic are shown in Fig B in S1 File.
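
The score-selection rule described above can be sketched as follows. This is an illustrative reading rather than the study code: the most frequent score among the four radiologists is used, and ties are broken by how often each tied score occurs in the whole cohort. The `population_counts` mapping (score to cohort-wide count) and the function name are hypothetical, and how the cohort-wide frequencies were tallied is an assumption.

```python
from collections import Counter


def consensus_score(ratings, population_counts):
    """Pick the most frequent rating; break ties using cohort-wide frequencies."""
    counts = Counter(ratings)
    top = max(counts.values())
    tied = [score for score, c in counts.items() if c == top]
    if len(tied) == 1:
        return tied[0]
    # Tie between radiologists: prefer the score seen most often in the population.
    return max(tied, key=lambda s: population_counts.get(s, 0))


# Hypothetical cohort-wide tally of margin scores.
population_counts = {1: 12, 2: 40, 3: 95, 4: 130, 5: 77}

clear_majority = consensus_score([4, 4, 4, 5], population_counts)
tie_broken = consensus_score([2, 2, 3, 3], population_counts)
```

In the second call the radiologists are split between scores 2 and 3, and score 3 is selected because it is the more common of the two in the hypothetical cohort tally.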

Furthermore, Spearman’s correlation coefficient was employed between image voxel thickness and DSIAgree to investigate if the voxel thickness affected the segmentation quality. The significance of the relationship was assessed by a t-test (pt-test<0.05).
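
The correlation analysis described in this section can be sketched with the standard library alone. The sketch computes Spearman’s coefficient as the Pearson correlation of average ranks (so that tied characteristic scores are handled) and the usual t statistic, t = ρ·sqrt((n − 2) / (1 − ρ²)), for testing ρ against zero. The margin scores and DSIAgree values are made up for illustration, and obtaining a p-value would additionally require the t distribution with n − 2 degrees of freedom.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    positions = {}
    for idx, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(idx)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def t_statistic(rho, n):
    """t statistic for H0: rho = 0 (undefined when |rho| = 1)."""
    return rho * math.sqrt((n - 2) / (1.0 - rho ** 2))


# Made-up margin scores (1 = poorly defined, 5 = sharp) and DSIAgree values.
margin = [1, 2, 2, 3, 4, 5, 5]
dsi_agree = [0.30, 0.45, 0.40, 0.55, 0.50, 0.70, 0.65]

rho = spearman(margin, dsi_agree)
t = t_statistic(rho, len(margin))
```

In this toy sample, sharper margins go with higher agreement, giving a strongly positive ρ.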

Results

In this study, a semiautomatic segmentation method implemented in the CIP of 3D Slicer was used to contour 354 nodules. The computation time of the CIP segmentations was 5–79 s (median: 10s) on a personal computer with 16GB RAM and 3.40GHz Core i7-4770 CPU.

Robustness of the segmentation methods: For the CIP segmentation method, the median dsiCIP was 99% (interquartile range (IQR): 97–100%) and the median δCIP was 14 ml (IQR: 7–37 ml), while for the manual segmentation method, the median dsimanual was 82% (IQR: 77–85%) and the median δmanual was 222 ml (IQR: 124–461 ml) (Fig 2). Although both segmentation methods were generally robust (median dsi > 80%), CIP segmentations were significantly more stable than the manual segmentations, with pWilcoxon ~ 10^−16 for both robustness measures. Fig 4A shows a visual example of a patient whose nodule contours were more stable with the CIP segmentation method than with the manual segmentation method.

Accuracy of the CIP segmentation: The Bland-Altman plot in Fig 3 highlights the differences between the average manually segmented and CIP-segmented nodule volumes for all nodules. The median manually segmented nodule volume was 309 ml (IQR: 162–796 ml) and the median CIP-segmented nodule volume was 477 ml (IQR: 153–1290 ml). Nodules segmented by the CIP method were significantly greater in volume than those segmented manually (pWilcoxon ~ 10^−12). Fig 4B shows an example where CIP segmentation overestimated the nodule region, including parts of the normal lung.
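
The quantities plotted in a Bland-Altman analysis can be sketched as below. The sketch assumes the conventional definition in which the 95% interval is the mean difference ± 1.96 standard deviations of the differences; whether Fig 3 used exactly this definition is an assumption, and the volumes in the example are made up.

```python
import statistics


def bland_altman(a, b):
    """Return per-pair means and differences, the bias, and the 95% limits of agreement."""
    diffs = [x - y for x, y in zip(a, b)]
    means = [(x + y) / 2.0 for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return means, diffs, bias, limits


# Made-up CIP and manual volumes for three nodules (arbitrary units).
cip_volumes = [10.0, 20.0, 30.0]
manual_volumes = [8.0, 21.0, 25.0]

means, diffs, bias, limits = bland_altman(cip_volumes, manual_volumes)
```

Each nodule contributes one point (mean of the two volumes on the horizontal axis, their difference on the vertical axis); the bias corresponds to the solid line and the limits to the dotted lines.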

The median DSIAgree between CIP and manual segmentations was 60% (IQR: 46–71%). The relationship between various nodule characteristics and the accuracy of the CIP segmentation (i.e. DSIAgree) is shown in Fig 5. Nodule subtlety, margin, texture, lobulation, malignancy, and nodule volume were positively and significantly correlated with DSIAgree (pt-test range: 1.5×10^−9 to 6×10^−3) (Fig 5). As the nodule volume increased from 162ml to 796ml, the median DSIAgree increased from 55% to 78%. The median agreement between CIP and manual segmentations increased from 56% to 70% as the likelihood of nodule malignancy increased.

Fig 5. The relationships between nodule characteristics, nodule volume, and DSIAgree.

This figure highlights the relationships between nodule characteristics, nodule volume, and DSIAgree. Calcification: Solid = solid calcification, Central = central calcification, None = no calcification. Lobulation: None = not lobulated. Spiculated: None = not spiculated. Texture: Mixed = Semi-solid nodules. Malignancy: Unlikely = unlikely for cancer, Suspicious = suspicious for cancer. Nodule Volume: Q1 = 162ml, Q1–Q3 = 162ml to 796ml, and Q3 = 796ml; Q = quantile.

https://doi.org/10.1371/journal.pone.0178944.g005

An example of a non-solid subtle nodule with poorly defined boundaries is shown in Fig 4D. The accuracy of the CIP segmentation was poor for non-solid or semi-solid nodules, and for nodules with poorly defined boundaries and subtle appearances, with a median DSIAgree ranging from 15% to 41% (Table 2). For solid nodules with sharp margins and obvious appearances, the median DSIAgree increased to 61% (Table 2). Nodules that were not marked as lobulated or spiculated by the radiologists had a median DSIAgree of 59%. Substantial agreement (median DSIAgree > 65%) between CIP and manual segmentations was found in nodules with marked lobulation and spiculation (Table 2). Nodule sphericity (pt-test = 0.94) and calcification (pKruskal-Wallis = 0.49) were not significantly correlated with DSIAgree. Median DSIAgree was ~60% for all nodules regardless of the nodule sphericity and calcification conditions (Fig 5, Table 2).

Table 2. Distribution of nodule characteristics.

Median nodule volume, median DSIAgree and their corresponding interquartile ranges (IQR) for each nodule characteristic.

https://doi.org/10.1371/journal.pone.0178944.t002

While the interior structure of all the other 343 nodules was scored as soft tissue, one nodule was rated to be air (Fig 4C). For this nodule, the CIP segmentation failed to identify the boundary of the nodule, resulting in a DSIAgree of 1%, and was unstable (dsiCIP = 42%) (Fig 4C). Nodule malignancy, subtlety, calcification, lobulation, and spiculation were positively and significantly correlated with nodule volume (pt-test range: 6.87×10^−27 to 1.12×10^−4).

As image voxel thickness increased from 1 mm to 5 mm, the median DSIAgree increased from 62% to 79%. However, of the 354 CT images, only two images had a thickness of 5 mm. After excluding these two images from the analysis, the influence of image thickness on the accuracy of CIP segmentation was insignificant and was nearly negligible (Spearman’s correlation coefficient of 0.01 and pt-test of 0.82).

According to visual inspection, 13% (47/354) of the nodules did not require any manual adjustment. Minor to moderate manual adjustments were needed for 37% (129/354) of nodules that included non-nodular tissues (e.g. pleura). Substantial manual adjustments were required for 50% (176/354) of the nodules.

Discussion

Pulmonary nodules can indicate early stage lung cancer or a number of benign conditions. CT-based imaging features have been used to generate imaging biomarkers that predict the malignancy of lung nodules and have demonstrated promising results [19, 20]. Careful delineation of the lung nodule volumes is required for accurate feature extraction to build these imaging biomarkers [15–18]. Most commonly, manual segmentation is the method of choice; however, manual segmentation is not only time consuming, but is also affected by inter-observer variability [21, 22, 24]. Although many automatic and semi-automatic segmentation algorithms for nodule segmentation have been proposed, the widespread use of these algorithms in the scientific and clinical communities is hampered by their limited accessibility. In this study, we compared the robustness of manual segmentation and a publicly accessible nodule segmentation algorithm, known as CIP segmentation.

CIP segmentation may potentially provide a reliable way to assist physicians in the nodule delineation process by reducing inter-observer variability and the physician workload. The CIP segmentations computed from the different seed points derived from the four radiologists were in excellent agreement, indicating that the CIP method is robust and stable to different segmentation seed points. In comparison, manual segmentation was significantly less stable than CIP segmentation. Similarly, Velazquez et al. (2013) assessed the robustness of manual delineations and a 3D Slicer semi-automatic algorithm, known as GrowCut, in defining the volume of twenty non-small cell lung cancer (NSCLC) tumors [30]. They found that the GrowCut algorithm resulted in significantly smaller regions of uncertainty than manual delineations and concluded that it could be used as a starting point for tumor target delineation in radiotherapy and high-throughput data mining research when manual delineations are not available. The results of our study are consistent with their findings that semiautomatic algorithms (in our case, CIP segmentations) are more stable than manual segmentations in defining lung nodule volumes. Furthermore, CIP segmentation is efficient, with a median computation time of only 10s on a personal computer. We anticipate that the computational time of the CIP segmentation algorithm would be significantly reduced on a more powerful computer.

Despite the potential applications of the CIP segmentation algorithm, manual adjustment of the segmentations may be needed, especially for small nodules and nodules with poorly defined boundaries, subtle appearance, and non-solid or part-solid textures. Nodule calcification and sphericity had no impact on the performance of CIP segmentations. The accuracy of CIP segmentations tended to be better when the nodule was solid, more obvious, and had a sharp boundary. Non- and part-solid nodules with a hazy appearance fail to completely obscure parenchymal structures, and have therefore been difficult to detect and segment by many segmentation algorithms [27, 28, 38, 39]. CIP segmentations also suffer from this limitation: nodules with subtle appearances may have an image density similar to that of their background, which makes the full extent of the nodules difficult to define. Therefore, in these cases, the knowledge of experienced radiologists is needed to estimate the extent (or boundary) of the nodules and manually edit the CIP segmentation. The robustness of the CIP segmentation was nearly perfect and significantly better than that of the manual segmentations (Fig 2). Hence, the robustness of manual adjustment based on the CIP segmentation is anticipated to be superior to manual segmentation, but not as robust as CIP segmentation alone.

Several segmentation algorithms have been proposed to improve the contours of structures with hazy appearances, such as non- and part-solid nodules. These include the Markov random field theory-based algorithm [40, 41], neural networks [42], and a hybrid algorithm that combines threshold-based region growing, connected component analyses and convex hull calculations [28, 39, 43]. Brief descriptions of five example algorithms that have good performance in segmenting GGO and partly solid nodules are shown in Table 3. However, these more sophisticated algorithms are not easily accessible and have not been implemented in open source platforms for widespread use. Incorporating algorithms for defining non- and sub-solid nodules into the 3D Slicer CIP could further improve the performance of the CIP segmentations. Furthermore, although juxtapleural nodules and nodules with vessel attachment are more challenging to segment than isolated nodules [44, 45], these nodule characteristics were not evaluated and scored by the LIDC radiologists. In the future, it would be interesting to compare the performance of the CIP segmentation in delineating juxtapleural, vessel-attached, and isolated nodules.

Table 3. Brief descriptions of five algorithms for ground glass opacity (GGO) or partly solid nodule segmentation.

https://doi.org/10.1371/journal.pone.0178944.t003

CIP segmentations may overestimate the nodule region of interest for small nodules. A previous study used eighteen nodules of different sizes, shapes and densities that were embedded into various locations of an anthropomorphic thorax phantom to validate the CIP segmentation algorithm [33]. On average, CIP segmentation overestimated the phantom nodule volume by 35%. In that study, CIP segmentation performed best for large nodules, with a difference between segmented and phantom nodule volumes of <15%. In patients, larger differences were found between the CIP and manually segmented nodule volumes (median CIP volume = 477ml vs. median manual volume = 309ml). This may be due to the fact that patient nodules (e.g. vessel-attached and juxtapleural nodules) were more variable than those embedded in the phantom. CIP segmentations performed better for nodules with larger volumes and a higher likelihood of being malignant. Nodules that are larger in size (e.g. >4mm nodule diameter in the National Lung Screening Trial in the United States [5]) are generally considered to be more likely to be malignant. Moreover, the appearance of a large nodule is less subtle and more obvious. As expected, in our study, nodule malignancy and subtlety were positively correlated with nodule volume. According to the LIDC publications and documentation, it is unclear whether all segmentations were performed by the same or different radiologists [23, 42, 46]. However, we found that the nodule volumes segmented by different LIDC radiologists were consistent and in excellent agreement (Fig 1). Thus, our comparison between CIP and manual segmentations would be only mildly influenced by the inter-radiologist variability even if the nodules were not defined by the same radiologist. Substantial agreement between CIP and manual segmentations was found for nodule volumes >796ml. Moreover, larger nodules may be more likely to be lobulated and spiculated, given the significant correlation between these characteristics and the nodule volume. This may explain why the CIP segmentation method performed better for nodules with marked lobulation and spiculation. We observed that nodule volumes computed from CIP segmentations were significantly greater than those computed from manual segmentation. For smaller nodules, CIP segmentations often include adjacent tissues, such as normal lung and blood vessels. Furthermore, small nodules were not only more likely to have subtle appearances, and thus be difficult to detect, but could also be easily overestimated by CIP segmentations. Therefore, manual adjustments may be needed to correct for the overestimation of small nodules in the CIP segmentations.

An emerging field that converts medical images into high dimensional mineable data is called radiomics [47]. In addition to differentiating between benign and malignant nodules, radiomic features of lung lesions could also be used to predict clinical outcomes and treatment response [1, 2, 48]. Several lung screening trials using CT images have been launched in Asia [49–51], Europe [52–54], and the United States [5, 55, 56] to identify patients with early lung cancer. Due to the easy accessibility of the CIP segmentation algorithm, this method may be useful for nodule delineation in these lung trial datasets that consist of a large number of patients. This could subsequently expedite the high-throughput extraction of imaging features for radiomic analysis for nodule classification and patient outcomes for precision medicine, especially for patients with large nodules. Future studies will need to investigate how inaccurate CIP segmented nodule volumes influence the predictive power of radiomic features.

The CIP segmentation algorithm relies on several parameters for the generation of the feature maps; these parameters were experimentally set to default values. Further improvements in the segmentation results may be expected from careful selection of those parameters. In our experience, the default parameters provided in CIP work well on average, but nodules with specific characteristics would benefit from tailored parameter selection.
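One way such tailored parameter selection could be approached is a small grid search that scores each parameter combination against a reference contour. The sketch below is illustrative only: `toy_segment` is a hypothetical stand-in operating on a one-dimensional intensity profile, not the CIP algorithm, and the `threshold` and `grow` parameters, the overlap score (Dice coefficient) and the sample values are all assumptions chosen to keep the example self-contained.

```python
from itertools import product


def dice(a, b):
    """Dice overlap between two sets of voxel indices (1.0 means identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


def toy_segment(profile, threshold, grow):
    """Hypothetical stand-in for a segmentation call.

    Keeps indices whose intensity is >= threshold, then grows the result
    by `grow` positions on each side.
    """
    seed = {i for i, value in enumerate(profile) if value >= threshold}
    grown = set()
    for i in seed:
        for j in range(i - grow, i + grow + 1):
            if 0 <= j < len(profile):
                grown.add(j)
    return grown


def best_parameters(profile, reference, thresholds, grows):
    """Return (score, threshold, grow) for the best-scoring combination."""
    best = None
    for threshold, grow in product(thresholds, grows):
        score = dice(toy_segment(profile, threshold, grow), reference)
        if best is None or score > best[0]:
            best = (score, threshold, grow)
    return best


if __name__ == "__main__":
    profile = [-800, -780, -300, 20, 40, 30, -250, -790, -800]  # HU-like values
    reference = {3, 4, 5}  # indices of the reference (e.g. manual) contour
    print(best_parameters(profile, reference, [-400, -100], [0, 1]))
```

With a real segmentation routine, the same loop could be run separately for subgroups of nodules (for example small or part-solid nodules) to see whether they favor settings other than the defaults.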

Conclusion

A semi-automatic segmentation algorithm implemented in the 3D Slicer Chest Imaging Platform (CIP) may be useful for assisting physicians in nodule volume delineation. CIP segmentations can potentially reduce physician workload for 13% of the nodules, and reduce inter-observer variability, owing to the algorithm's computational efficiency and superior stability compared to manual segmentation. Because the CIP segmentation algorithm is publicly accessible, it can be employed to initiate nodule segmentation for large datasets, such as lung screening trials, thereby facilitating efficient nodule classification and high-throughput data mining research. However, CIP segmentations should be used with care, and manual adjustment of the segmentations may be needed for the majority (87%) of the nodules, including small nodules and nodules with subtle appearances, poorly defined boundaries, or non-solid and part-solid texture. Although manual adjustment is needed in many cases, CIP segmentation provides physicians with a preliminary contour as a starting point.

Supporting information

S1 File. Examples of the segmentation artifact and the distribution of radiologist ratings for each nodule characteristic.

https://doi.org/10.1371/journal.pone.0178944.s001

(DOCX)

Acknowledgments

The authors would like to acknowledge support from the National Institutes of Health (Award Numbers U01CA190234 and U24CA194354) and a research seed funding grant from the American Association of Physicists in Medicine. The Chest Imaging Platform (CIP) is supported by National Institutes of Health award number 1R01HL116931. Furthermore, we would like to thank Dr. Elizabeth Huynh for editorial assistance.

Author Contributions

  1. Conceptualization: SSFY CP HJWLA.
  2. Data curation: DB SP SSFY CP HJWLA.
  3. Formal analysis: SSFY CP JK HJWLA.
  4. Funding acquisition: HJWLA RSJE.
  5. Investigation: SSFY CP DB JK.
  6. Methodology: SSFY CP RSJE SP JK HJWLA.
  7. Project administration: SSFY CP DB SP HJWLA.
  8. Resources: DB RSJE SP.
  9. Software: DB RSJE SP.
  10. Supervision: DB SP HJWLA.
  11. Validation: SSFY CP JK HJWLA.
  12. Visualization: SSFY CP JK.
  13. Writing – original draft: SSFY CP HJWLA.
  14. Writing – review & editing: SSFY CP DB RSJE SP JK HJWLA.

References

  1. Aerts HJWL, Velazquez ER, Leijenaar RTH, Parmar C, Grossmann P, Carvalho S, et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat Commun. 2014;5:4006. pmid:24892406
  2. Coroller TP, Agrawal V, Narayan V, Hou Y, Grossmann P, Lee SW, et al. Radiomic phenotype features predict pathological response in non-small cell lung cancer. Radiotherapy and Oncology. 2016;119(3):480–6. https://doi.org/10.1016/j.radonc.2016.04.004. pmid:27085484
  3. Torre LA, Bray F, Siegel RL, Ferlay J, Lortet-Tieulent J, Jemal A. Global cancer statistics, 2012. CA: A Cancer Journal for Clinicians. 2015;65(2):87–108. pmid:25651787
  4. American Lung Association. Lung Cancer Fact Sheet. http://www.lung.org/. 2016.
  5. The National Lung Screening Trial Research Team. Reduced Lung-Cancer Mortality with Low-Dose Computed Tomographic Screening. New England Journal of Medicine. 2011;365(5):395–409. pmid:21714641.
  6. World Health Organization. WHO methods and data sources for global burden of disease estimates 2000–2011. World Health Organization. 2013. http://www.who.int/healthinfo/statistics/GlobalDALYmethods_2000_2011.pdf?ua=1. Accessed 13 Aug 2015.
  7. Armato SG, Giger ML, Moran CJ, Blackburn JT, Doi K, MacMahon H. Computerized Detection of Pulmonary Nodules on CT Scans. RadioGraphics. 1999;19(5):1303–11. pmid:10489181.
  8. Erasmus JJ, Connolly JE, McAdams HP, Roggli VL. Solitary Pulmonary Nodules: Part I. Morphologic Evaluation for Differentiation of Benign and Malignant Lesions. RadioGraphics. 2000;20(1):43–58. pmid:10682770.
  9. McNitt-Gray MF, Wyckoff N, Sayre JW, Goldin JG, Aberle DR. The effects of co-occurrence matrix based texture parameters on the classification of solitary pulmonary nodules imaged on computed tomography. Computerized Medical Imaging and Graphics. 1999;23(6):339–48. https://doi.org/10.1016/S0895-6111(99)00033-6. pmid:10634146
  10. Shah SK, McNitt-Gray MF, Rogers SR, Goldin JG, Suh RD, Sayre JW, et al. Computer Aided Characterization of the Solitary Pulmonary Nodule Using Volumetric and Contrast Enhancement Features. Academic Radiology. 2005;12(10):1310–9. https://doi.org/10.1016/j.acra.2005.06.005. pmid:16179208
  11. Li F, Sone S, Abe H, MacMahon H, Doi K. Malignant versus Benign Nodules at CT Screening for Lung Cancer: Comparison of Thin-Section CT Findings. Radiology. 2004;233(3):793–8. pmid:15498895.
  12. Revel M-P, Merlin A, Peyrard S, Triki R, Couchon S, Chatellier G, et al. Software Volumetric Evaluation of Doubling Times for Differentiating Benign Versus Malignant Pulmonary Nodules. American Journal of Roentgenology. 2006;187(1):135–42. pmid:16794167
  13. Yankelevitz DF, Reeves AP, Kostis WJ, Zhao B, Henschke CI. Small Pulmonary Nodules: Volumetrically Determined Growth Rates Based on CT Evaluation. Radiology. 2000;217(1):251–6. pmid:11012453.
  14. Lee YH, Kim DW, In HS, Park JS, Kim SH, Eom JW, et al. Differentiation between Benign and Malignant Solid Thyroid Nodules Using an US Classification System. Korean Journal of Radiology. 2011;12(5):559–67. pmid:21927557
  15. Farag A, Ali A, Graham J, Elhabian S, Farag A, Falk R. Feature-Based Lung Nodule Classification. In: Bebis G, Boyle R, Parvin B, Koracin D, Chung R, Hammound R, et al., editors. Advances in Visual Computing: 6th International Symposium, ISVC 2010, Las Vegas, NV, USA, November 29–December 1, 2010, Proceedings, Part III. Berlin, Heidelberg: Springer Berlin Heidelberg; 2010. p. 79–88.
  16. Armato SG, Altman MB, Wilkie J, Sone S, Li F, Doi K, et al. Automated lung nodule classification following automated nodule detection on CT: A serial approach. Medical Physics. 2003;30(6):1188–97. https://doi.org/10.1118/1.1573210. pmid:12852543
  17. Dhara AK, Mukhopadhyay S, Dutta A, Garg M, Khandelwal N. A Combination of Shape and Texture Features for Classification of Pulmonary Nodules in Lung CT Images. Journal of Digital Imaging. 2016;29(4):466–75. pmid:26738871
  18. Madero Orozco H, Vergara Villegas OO, Cruz Sánchez VG, Ochoa Domínguez HdJ, Nandayapa Alfaro MdJ. Automated system for lung nodules classification based on wavelet feature descriptor and support vector machine. BioMedical Engineering OnLine. 2015;14(1):1–20. pmid:25888834
  19. Hawkins S, Wang H, Liu Y, Garcia A, Stringfield O, Krewer H, et al. Predicting malignant nodules from screening CTs. Journal of Thoracic Oncology. 2016. https://doi.org/10.1016/j.jtho.2016.07.002.
  20. Ma J, Wang Q, Ren Y, Hu H, Zhao J, editors. Automatic lung nodule classification with radiomics approach. 2016.
  21. Bd Hoop, Gietema H, Vorst Svd, Murphy K, Klaveren RJv, Prokop M. Pulmonary Ground-Glass Nodules: Increase in Mass as an Early Indicator of Growth. Radiology. 2010;255(1):199–206. pmid:20123896.
  22. Leader JK, Warfel TE, Fuhrman CR, Golla SK, Weissfeld JL, Avila RS, et al. Pulmonary Nodule Detection with Low-Dose CT of the Lung: Agreement Among Radiologists. American Journal of Roentgenology. 2005;185(4):973–8. pmid:16177418
  23. Armato SG, McLennan G, Bidaut L, McNitt-Gray MF, Meyer CR, Reeves AP, et al. The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): A Completed Reference Database of Lung Nodules on CT Scans. Medical Physics. 2011;38(2):915–31. https://doi.org/10.1118/1.3528204. pmid:21452728
  24. Steenbakkers RJHM, Duppen JC, Fitton I, Deurloo KEI, Zijp LJ, Comans EFI, et al. Reduction of observer variation using matched CT-PET for lung cancer delineation: A three-dimensional analysis. International Journal of Radiation Oncology*Biology*Physics. 2006;64(2):435–48. https://doi.org/10.1016/j.ijrobp.2005.06.034.
  25. Goo JM, Tongdee T, Tongdee R, Yeo K, Hildebolt CF, Bae KT. Volumetric Measurement of Synthetic Lung Nodules with Multi–Detector Row CT: Effect of Various Image Reconstruction Parameters and Segmentation Thresholds on Measurement Accuracy. Radiology. 2005;235(3):850–6. pmid:15914478.
  26. Dehmeshki J, Amin H, Valdivieso M, Ye X. Segmentation of Pulmonary Nodules in Thoracic CT Scans: A Region Growing Approach. IEEE Transactions on Medical Imaging. 2008;27(4):467–80. pmid:18390344
  27. Tao Y, Lu L, Dewan M, Chen AY, Corso J, Xuan J, et al. Multi-level Ground Glass Nodule Detection and Segmentation in CT Lung Images. In: Yang G-Z, Hawkes D, Rueckert D, Noble A, Taylor C, editors. Medical Image Computing and Computer-Assisted Intervention–MICCAI 2009: 12th International Conference, London, UK, September 20–24, 2009, Proceedings, Part II. Berlin, Heidelberg: Springer Berlin Heidelberg; 2009. p. 715–23.
  28. Lassen BC, Jacobs C, Kuhnigk JM, Ginneken Bv, Rikxoort EMv. Robust semi-automatic segmentation of pulmonary subsolid nodules in chest computed tomography scans. Physics in Medicine and Biology. 2015;60(3):1307. pmid:25591989
  29. Fedorov A, Beichel R, Kalpathy-Cramer J, Finet J, Fillion-Robin J-C, Pujol S, et al. 3D Slicer as an Image Computing Platform for the Quantitative Imaging Network. Magnetic resonance imaging. 2012;30(9):1323–41. pmid:22770690
  30. Velazquez ER, Parmar C, Jermoumi M, Mak RH, van Baardwijk A, Fennessy FM, et al. Volumetric CT-based segmentation of NSCLC using 3D-Slicer. Sci Rep. 2013;3:3529. http://www.nature.com/srep/2013/131218/srep03529/abs/srep03529.html#supplementary-information. pmid:24346241
  31. Parmar C, Rios Velazquez E, Leijenaar R, Jermoumi M, Carvalho S, Mak RH, et al. Robust Radiomics Feature Quantification Using Semiautomatic Volumetric Segmentation. PLoS ONE. 2014;9(7):e102107. pmid:25025374
  32. Kuhnigk JM, Dicken V, Bornemann L, Bakai A, Wormanns D, Krass S, et al. Morphological segmentation and partial volume analysis for volumetry of solid pulmonary lesions in thoracic CT scans. IEEE Transactions on Medical Imaging. 2006;25(4):417–34. pmid:16608058
  33. Krishnan K, Ibanez L, Turner WD, Jomier J, Avila RS. An open-source toolkit for the volumetric measurement of CT lung lesions. Optics Express. 2010;18(14):15256–66. pmid:20640012
  34. Raul San Jose E, James CR, Rola H, Jorge O, Alejandro AD, George RW. Chest Imaging Platform: An Open-Source Library and Workstation for Quantitative Chest Imaging. C66 LUNG IMAGING II: NEW PROBES AND EMERGING TECHNOLOGIES. American Thoracic Society International Conference Abstracts: American Thoracic Society; 2015. p. A4975-A.
  35. Caselles V, Kimmel R, Sapiro G. Geodesic Active Contours. International Journal of Computer Vision. 1997;22(1):61–79.
  36. Ross JC, Estépar RSJ, Díaz A, Westin C-F, Kikinis R, Silverman EK, et al. Lung Extraction, Lobe Segmentation and Hierarchical Region Assessment for Quantitative Analysis on High Resolution Computed Tomography Images. Medical image computing and computer-assisted intervention: MICCAI International Conference on Medical Image Computing and Computer-Assisted Intervention. 2009;12(Pt 2):690–8. PubMed Central PMCID: PMC3061233.
  37. Sato Y, Nakajima S, Shiraga N, Atsumi H, Yoshida S, Koller T, et al. Three-dimensional multi-scale line filter for segmentation and visualization of curvilinear structures in medical images. Medical Image Analysis. 1998;2(2):143–68. https://doi.org/10.1016/S1361-8415(98)80009-1. pmid:10646760
  38. Kubota T, Jerebko AK, Dewan M, Salganicoff M, Krishnan A. Segmentation of pulmonary nodules of various densities with morphological approaches and convexity models. Medical Image Analysis. 2011;15(1):133–54. https://doi.org/10.1016/j.media.2010.08.005. pmid:20863740
  39. Zhou J, Chang S, Metaxas DN, Zhao B, Ginsberg MS, Schwartz LH, editors. An Automatic Method for Ground Glass Opacity Nodule Detection and Segmentation from CT Studies. Engineering in Medicine and Biology Society, 2006 EMBS '06 28th Annual International Conference of the IEEE; 2006 Aug. 30 2006-Sept. 3 2006.
  40. Zhu Y, Tan Y, Hua Y, Zhang G, Zhang J. Automatic Segmentation of Ground-Glass Opacities in Lung CT Images by Using Markov Random Field-Based Algorithms. Journal of Digital Imaging. 2012;25(3):409–22. pmid:22089834
  41. Tan Y, Schwartz LH, Zhao B. Segmentation of lung lesions on CT scans using watershed, active contours, and Markov random field. Medical Physics. 2013;40(4):043502. pmid:23556926
  42. Messay T, Hardie RC, Tuinstra TR. Segmentation of pulmonary nodules in computed tomography using a regression neural network approach and its application to the Lung Image Database Consortium and Image Database Resource Initiative dataset. Medical Image Analysis. 2015;22(1):48–62. https://doi.org/10.1016/j.media.2015.02.002. pmid:25791434
  43. Tachibana R, Kido S, editors. Automatic segmentation of pulmonary nodules on CT images by use of NCI lung image database consortium. Medical Imaging 2006: Image Processing; 2006: Proc. SPIE.
  44. Jirapatnakul CA, Mulman DY, Yankelevitz FD, Henschke IC. Segmentation of Juxtapleural Pulmonary Nodules Using a Robust Surface Estimate. International Journal of Biomedical Imaging. 2011;2011. pmid:22114585
  45. Chen K, Li B, Tian L-f, Zhu W-b, Bao Y-h. Vessel attachment nodule segmentation using integrated active contour model based on fuzzy speed function and shape–intensity joint Bhattacharya distance. Signal Processing. 2014;103:273–84. https://doi.org/10.1016/j.sigpro.2013.09.009.
  46. The Cancer Imaging Archive (TCIA) Public Access. The Lung Image Database Consortium image collection (LIDC-IDRI). https://wiki.cancerimagingarchive.net/display/Public/LIDC-IDRI. Accessed 30 Dec 2016.
  47. Yip SSF, Aerts HJWL. Applications and limitations of radiomics. Physics in Medicine and Biology. 2016;61(13):R150. pmid:27269645
  48. Coroller TP, Grossmann P, Hou Y, Rios Velazquez E, Leijenaar RTH, Hermann G, et al. CT-based radiomic signature predicts distant metastasis in lung adenocarcinoma. Radiotherapy and Oncology. 2015;114(3):345–50. https://doi.org/10.1016/j.radonc.2015.02.015. pmid:25746350
  49. Sone S, Takashima S, Li F, Yang Z, Honda T, Maruyama Y, et al. Mass screening for lung cancer with mobile spiral computed tomography scanner. The Lancet. 1998;351(9111):1242–5. https://doi.org/10.1016/S0140-6736(97)08229-9.
  50. Nawa T, Nakagawa T, Kusano S, Kawasaki Y, Sugawara Y, Nakata H. Lung Cancer Screening Using Low-Dose Spiral CT: Results of Baseline and 1-Year Follow-up Studies. Chest. 2002;122(1):15–20. https://doi.org/10.1378/chest.122.1.15. pmid:12114333
  51. Chong S, Lee KS, Chung MJ, Kim TS, Kim H, Kwon OJ, et al. Lung Cancer Screening with Low-Dose Helical CT in Korea: Experiences at the Samsung Medical Center. J Korean Med Sci. 2005;20(3):402–8. pmid:15953860
  52. Ashraf H, Tønnesen P, Holst Pedersen J, Dirksen A, Thorsen H, Døssing M. Effect of CT screening on smoking habits at 1-year follow-up in the Danish Lung Cancer Screening Trial (DLCST). Thorax. 2009;64(5):388–92. pmid:19052048
  53. Lopes Pegna A, Picozzi G, Mascalchi M, Maria Carozzi F, Carrozzi L, Comin C, et al. Design, recruitment and baseline results of the ITALUNG trial for lung cancer screening with low-dose CT. Lung Cancer. 2009;64(1):34–40. https://doi.org/10.1016/j.lungcan.2008.07.003. pmid:18723240
  54. Pedersen JH, Ashraf H, Dirksen A, Bach K, Hansen H, Toennesen P, et al. The Danish Randomized Lung Cancer CT Screening Trial—Overall Design and Results of the Prevalence Round. Journal of Thoracic Oncology. 2009;4(5):608–14. https://doi.org/10.1097/JTO.0b013e3181a0d98f. pmid:19357536
  55. Gohagan J, Marcus P, Fagerstrom R, Pinsky P, Kramer B, Prorok P. Baseline Findings of a Randomized Feasibility Trial of Lung Cancer Screening With Spiral CT Scan vs Chest Radiograph: The Lung Screening Study of the National Cancer Institute. Chest. 2004;126(1):114–21. https://doi.org/10.1378/chest.126.1.114. pmid:15249451
  56. National Lung Screening Trial Research Team. The National Lung Screening Trial: Overview and Study Design. Radiology. 2011;258(1):243–53. pmid:21045183.