Convolutional neural network-based automated maxillary alveolar bone segmentation on cone-beam computed tomography images

Abstract

Objectives

To develop and assess the performance of a novel artificial intelligence (AI)-driven convolutional neural network (CNN)-based tool for automated three-dimensional (3D) maxillary alveolar bone segmentation on cone-beam computed tomography (CBCT) images.

Materials and Methods

A total of 141 CBCT scans were collected for training (n = 99), validation (n = 12), and testing (n = 30) of the CNN model for automated segmentation of the maxillary alveolar bone and its crestal contour. Following automated segmentation, 3D models with under- or overestimated segmentations were refined by an expert to generate a refined-AI (R-AI) segmentation. The overall performance of the CNN model was assessed. In addition, 30% of the testing sample was randomly selected and manually segmented to compare the accuracy of AI and manual segmentation. The time required to generate a 3D model was also recorded in seconds (s).

Results

The automated segmentation showed excellent values for all accuracy metrics. However, the manual method (95% HD: 0.20 ± 0.05 mm; IoU: 95% ± 3.0; DSC: 97% ± 2.0) performed slightly better than the AI segmentation (95% HD: 0.27 ± 0.03 mm; IoU: 92% ± 1.0; DSC: 96% ± 1.0). There was a statistically significant difference in time consumed among the segmentation methods (p < .001). The AI-driven segmentation (51.5 ± 10.9 s) was 116 times faster than the manual segmentation (5973.3 ± 623.6 s). The R-AI method required an intermediate amount of time (1666.7 ± 588.5 s).

Conclusion

Although the manual segmentation performed slightly better, the novel CNN-based tool also provided a highly accurate segmentation of the maxillary alveolar bone and its crestal contour while consuming 116 times less time than the manual approach.

1 INTRODUCTION

Over the last decade, digital workflows have revolutionized treatment strategies in various dentomaxillofacial clinical specialties (Shujaat, Bornstein, et al., 2021). One of the key elements of most workflows is the generation of three-dimensional (3D) augmented virtual models with sufficient anatomical detail from cone-beam computed tomography (CBCT)-derived datasets. These models are essential for guiding practitioners toward an accurate diagnosis and allow patient-specific virtual planning with the delivery of predictable clinical outcomes (Joda et al., 2021; Shujaat, Bornstein, et al., 2021; Vandenberghe, 2020).

Radiographic assessment of alveolar bone level is vital for making a precise periodontal diagnosis and controlled monitoring of periodontal treatment (Nguyen et al., 2020). Furthermore, it is also crucial for a successful implant treatment, especially in the aesthetic edentulous zone (Ferrus et al., 2010). This task is often accomplished using either two-dimensional (2D) or 3D imaging modalities (Jacobs, Salmon, et al., 2018; Jacobs, Vranckx, et al., 2018). Among these modalities, only CBCT allows for an accurate measurement of bone dimensions and 3D modeling-based virtual simulation of regenerative surgical procedures and dental implant placement (Abduo & Lau, 2021; Graf et al., 2021; Shujaat, Bornstein, et al., 2021). Furthermore, CBCT-derived 3D models allow designing and fabrication of patient-specific guides for the delivery of precise treatment with a low complication rate (Graf et al., 2021; Van Assche et al., 2007, 2010; Vercruyssen, Fortin, et al., 2014; Vercruyssen, Hultin, et al., 2014).

Segmentation of anatomical structures from CBCT images for the generation of 3D surface models is the second most crucial step within digital diagnostic and treatment planning workflows (Shujaat, Bornstein, et al., 2021; Wang et al., 2014, 2021). Segmentation is often accomplished by thresholding-based semiautomated or fully automated approaches, which are laborious and inaccurate owing to the inherent limitations of CBCT imaging, such as a low signal-to-noise ratio, low contrast resolution, absence of calibrated Hounsfield units (HUs), similar voxel intensity of tooth roots and surrounding alveolar bone, and image artifacts (Shaheen et al., 2021; Shujaat, Bornstein, et al., 2021; Wang et al., 2014, 2021). In addition, low-density bony anatomical structures, such as the alveolar bone and the anterior wall of the maxillary sinus, are often inadequately segmented.

Recently, artificial intelligence (AI)-driven convolutional neural networks (CNNs) have been developed for overcoming the limitations associated with the conventional segmentation approaches. Previous studies have reported the application of CNN-based models to automatically and accurately segment teeth (Fontenele et al., 2022; Shaheen et al., 2021), dentomaxillofacial complex (Preda et al., 2022; Wang et al., 2021), mandible (Verhelst et al., 2021), pharyngeal airway space (Shujaat, Jazil, et al., 2021), and mandibular canal (Lahoud et al., 2022) on CBCT images. However, CNN-based automated alveolar bone segmentation with a detailed outlining of the alveolar crest on CBCT images has not yet been investigated. A precise segmentation of the alveolar bone could be beneficial for several clinical tasks, such as predictable dental implant insertion in relation to adjacent structures, designing patient-specific surgical guides, and monitoring periodontal bone loss (Abduo & Lau, 2021; Graf et al., 2021; Lee et al., 2021; Shujaat, Bornstein, et al., 2021).

Therefore, this study aimed to develop and assess the performance of a CNN-based tool for automated maxillary alveolar bone segmentation on CBCT images. The hypothesis is that the CNN-based tool would provide an accurate and time-efficient maxillary alveolar bone segmentation with a precise crestal outline.

2 MATERIALS AND METHODS

This retrospective study was conducted in compliance with the World Medical Association Declaration of Helsinki on medical research and following the recommendations of the STROBE guidelines. Ethical approval was obtained from the Ethical Review Board (reference number S66447). The investigators did not retain patient-specific information at the time of CBCT scan collection from the hospital database; for these reasons, the ethical committee waived the need for informed consent.

CBCT data collection

A total of 141 CBCT scans were retrospectively collected from the hospital database (Dentomaxillofacial Imaging Center, UZ Leuven, Leuven, Belgium). The scans were acquired using two CBCT devices, NewTom VGi evo (QR Verona, Cefla, Verona, Italy) and 3D Accuitomo 170 (Morita, Kyoto, Japan), with different acquisition protocols to represent a heterogeneous dataset, as seen in Table 1. Inclusion criteria consisted of CBCT scans with a complete permanent dentition or a maximum of two consecutive missing maxillary teeth, satisfactory image quality (adequate sharpness, contrast, and noise levels, assessed subjectively) allowing proper delineation of the bone boundaries, and a field of view (FOV) covering the maxillary alveolar bone region. Scans with a high expression of artifacts generated by high-density materials (e.g., implants, metal crowns, and orthodontic brackets) or with motion artifacts were excluded.

TABLE 1. Acquisition settings of the cone-beam computed tomography devices.

Abbreviations: CBCT, cone-beam computed tomography; FOV, field of view; kVp, kilovoltage peak; mA, milliampere; mm, millimeters; s, seconds.

The dataset was randomly split into three subsets:

  1. Training set (n = 99): For training and fitting the CNN model based on manual segmentation of the maxillary alveolar bone;
  2. Validation set (n = 12): For validating the obtained CNN model during and following training; and
  3. Testing set (n = 30): For assessing the performance of the CNN model by comparing automated segmentation with the refined-AI (R-AI) segmentation performed by an operator.

The ground-truth datasets for training and validation of the CNN model were created by manual segmentation of the maxillary alveolar bone with crestal delineation on CBCT images using an online cloud-based AI-driven tool, known as "Virtual Patient Creator" (Relu, Leuven, Belgium). This tool is specialized in 3D automated segmentation of CBCT-derived dentomaxillofacial structures using a voxel-wise approach. Several functions, such as brush, contour, and interpolation, are incorporated in the tool, allowing the operators to manually delineate the boundaries of the region of interest slice-by-slice in the three orthogonal planes (i.e., axial, coronal, and sagittal) of the CBCT images. Following segmentation, a 3D model of the alveolar bone was exported in Standard Tessellation Language (STL) file format for further processing. This task was performed by two operators (RCF and MNG). Later, both observers examined all segmentations for flaws during a consensus meeting, and further adjustments were performed with a brush tool by properly delineating the region of interest in the axial, coronal, and sagittal orthogonal planes, if deemed necessary.

CNN model architecture

The network pipeline consisted of two successive 3D U-Nets, each composed of four contracting encoder blocks and three expansive decoder blocks, with two convolutions per block (kernel size of 3 × 3 × 3, stride of one, and dilation of one). After the convolutions, rectified linear unit (ReLU) activation and group normalization (Wu & He, 2020) with eight feature maps were applied. Furthermore, a max pooling operation was applied in each encoder block, reducing the spatial resolution by a factor of 2 in all dimensions (Çiçek et al., 2016).
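
For illustration, one contracting block of such a 3D U-Net could be sketched in PyTorch as follows. This is a minimal sketch only: the channel counts, the eight normalization groups, and the exact operation order are assumptions, as the architecture of the proprietary model is not public.

```python
import torch
import torch.nn as nn

class EncoderBlock3D(nn.Module):
    """One contracting block: two 3 x 3 x 3 convolutions (stride 1, dilation 1),
    each followed by ReLU activation and group normalization, then 2x max
    pooling that halves the spatial resolution in all three dimensions."""

    def __init__(self, in_ch: int, out_ch: int, groups: int = 8):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv3d(in_ch, out_ch, kernel_size=3, stride=1, dilation=1, padding=1),
            nn.ReLU(inplace=True),
            nn.GroupNorm(groups, out_ch),
            nn.Conv3d(out_ch, out_ch, kernel_size=3, stride=1, dilation=1, padding=1),
            nn.ReLU(inplace=True),
            nn.GroupNorm(groups, out_ch),
        )
        self.pool = nn.MaxPool3d(kernel_size=2)

    def forward(self, x: torch.Tensor):
        skip = self.convs(x)          # feature map passed to the matching decoder
        return self.pool(skip), skip  # downsampled output + skip connection

# Example: first encoder block applied to a single-channel CBCT volume
block = EncoderBlock3D(in_ch=1, out_ch=16)
```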

As CNN architectures normally cannot accept large input images, such as large-FOV CBCT scans, due to computation and memory limitations, a two-step method was employed, as previously described by Preda et al. (2022). In the first step, a U-Net model produced a rough segmentation of the region of interest. In the second step, another U-Net model performed a fine segmentation based on the previously generated coarse segmentation. Finally, a full-resolution segmentation of the maxillary alveolar crest bone was automatically generated.
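
A conceptual sketch of this coarse-to-fine pipeline is shown below. The interfaces are assumptions made for illustration: the low-resolution input size, the trilinear resampling, and concatenating the upsampled coarse map with the image are not specified in the paper.

```python
import torch
import torch.nn.functional as F

def segment_two_step(volume: torch.Tensor, coarse_unet, fine_unet,
                     low_res=(128, 128, 128)) -> torch.Tensor:
    """volume: (batch, 1, depth, height, width) CBCT intensities."""
    # Step 1: rough segmentation on a downsampled copy of the full FOV
    low = F.interpolate(volume, size=low_res, mode="trilinear",
                        align_corners=False)
    coarse = coarse_unet(low)
    # Bring the coarse map back to full resolution to localize the ROI
    coarse_full = F.interpolate(coarse, size=volume.shape[2:],
                                mode="trilinear", align_corners=False)
    # Step 2: fine segmentation at full resolution, guided by the coarse map
    return fine_unet(torch.cat([volume, coarse_full], dim=1))
```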

The CNN models were developed in PyTorch, and data augmentation techniques, such as scaling, mirroring, rotation, random cropping, and elastic deformation, were applied to enlarge the training dataset and increase the robustness of the CNN model. Furthermore, the model parameters were optimized with the Adam optimizer under a decaying learning rate, and early stopping was applied on the validation set to avoid overfitting (Fontenele et al., 2022). The initial learning rate was 1.24 × 10⁻⁴ and was halved seven times over 300 epochs.
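
The reported optimization schedule could be reproduced in PyTorch roughly as follows; the milestone epochs are an assumption, as the paper only states that the rate was halved seven times over 300 epochs, not when.

```python
import torch

def make_optimizer_and_scheduler(model: torch.nn.Module):
    optimizer = torch.optim.Adam(model.parameters(), lr=1.24e-4)
    # Seven halvings spread evenly across 300 epochs (assumed spacing)
    milestones = [int(300 * (i + 1) / 8) for i in range(7)]  # 37, 75, ..., 262
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=milestones, gamma=0.5)
    return optimizer, scheduler
```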

The described AI model is available on an online cloud-based platform, the Virtual Patient Creator platform (Relu, Leuven, Belgium). This user-interactive platform is specialized in multiclass automatic segmentation of craniomaxillofacial structures (e.g., maxilla, mandible, teeth, pharyngeal airway space, and maxillary sinus) from CBCT images, making it possible to segment these different anatomical structures simultaneously. Users may access the platform upon registration.

Testing of the CNNs—Automated segmentation

The automated segmentation of the alveolar bone and detailed crestal delineation was performed using the "Virtual Patient Creator" platform. Each CBCT scan was uploaded in Digital Imaging and Communications in Medicine (DICOM) format, which resulted in the automatic generation of the corresponding 3D model in STL format (average number of triangles: 1,030,302; minimum and maximum number of triangles: 592,656–1,374,068). Moreover, the time needed to produce a model was automatically provided in seconds (s).

Testing of the CNNs—Refinement of the automated segmentation

The automated segmentations of the testing set were checked for errors (under- or overestimation) on the aforementioned cloud-based platform by one operator (FFP). Segmentation maps identified with errors were fixed with a brush tool by delineating the region of interest in the axial, coronal, and sagittal orthogonal planes. These refined segmentations were referred to as R-AI and exported in STL format (average number of triangles: 1,013,596; minimum and maximum number of triangles: 563,454–1,391,540). The time required to perform the refinements was recorded using a digital stopwatch, and the total R-AI time was calculated by adding the automated segmentation and refinement times together.

Thirty days after the refinements, 30% of the testing sample (n = 9) was randomly selected. Two operators (FFP, RCF) performed the automated segmentation and refinement of these cases to assess the intra- and inter-operator consistency of the refined segmentation maps.

Accuracy and consistency metrics

The performance of automated segmentation was evaluated by applying a confusion matrix for the voxel-wise comparison between the AI and R-AI segmentation maps. Four variables were obtained from this comparison:

  1. True positive (TP): voxels that belong to the alveolar crest bone and were correctly segmented.
  2. False positive (FP): voxels that were segmented even though not belonging to the alveolar crest bone.
  3. True negative (TN): voxels that did not belong to the alveolar crest bone and were not included in the segmentation map.
  4. False negative (FN): voxels that were not included in the segmentation map even though belonging to the alveolar crest bone.

Based on these variables, the performance of the CNN model and R-AI consistency was evaluated according to the following accuracy metrics: 95% Hausdorff distance (HD), intersection over union (IoU), dice similarity coefficient (DSC), precision, recall, and accuracy. Table 2 summarizes the description of each metric and its respective formula.

TABLE 2. Accuracy metrics for the assessment of segmentation.
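
As an illustration of the formulas summarized in Table 2, the overlap-based metrics can be computed directly from two boolean voxel masks, as sketched below. The 95% HD is omitted here because it is a surface-distance metric, typically computed on meshes or with dedicated libraries rather than from the confusion matrix.

```python
import numpy as np

def overlap_metrics(pred: np.ndarray, truth: np.ndarray) -> dict:
    """Voxel-wise overlap metrics between a predicted and a reference mask."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()    # correctly segmented voxels
    fp = np.logical_and(pred, ~truth).sum()   # oversegmented voxels
    fn = np.logical_and(~pred, truth).sum()   # missed voxels
    tn = np.logical_and(~pred, ~truth).sum()  # correctly excluded voxels
    return {
        "IoU": tp / (tp + fp + fn),
        "DSC": 2 * tp / (2 * tp + fp + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }
```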

The intra-operator consistency was assessed by comparing the segmentation maps generated from the two time points of refinement performed by the same operator (FFP). For the inter-operator consistency, the same approach was employed to compare the segmentation maps generated after the refinements performed by the two distinct operators (FFP and RCF). The same metrics used for assessing the accuracy of the AI-driven segmentation were used to evaluate the intra- and inter-operator consistencies.

Comparison between manual and AI-driven segmentation

The performance of the AI-driven segmentation was compared with human performance, that is, manual segmentation. A total of 30% of the testing sample (n = 9) was randomly selected, and one experienced operator (RCF) performed the manual segmentation on the same online platform used to obtain the ground-truth datasets for training and validation of the CNN model. In summary, manual segmentation was performed using the contour tool available on the aforementioned online platform (Nogueira-Reis et al., 2022). First, the operator drew circles delineating the maxillary alveolar bone boundaries slice-by-slice. Circles were typically drawn on coronal reconstructions, except for the anterior region, where they were drawn on sagittal reconstructions. Before finishing the manual segmentation, the segmentation map was checked by reviewing all CBCT reconstructions (i.e., axial, sagittal, and coronal). This task was performed twice with an interval of 30 days to obtain the accuracy metrics mentioned previously, which were calculated by comparing the STL files generated after the first and second segmentations of each CBCT scan. The time to perform the manual segmentation was also recorded.

Timing analysis

The time consumed to segment the maxillary alveolar bone was compared among manual, AI, and R-AI methods. The sample used to perform the comparison between manual and AI-driven segmentation (n = 9) was used for this analysis:

  1. Manual segmentation: Time spent from importing the DICOM data into the segmentation software until generation of the segmentation map, recorded using a digital stopwatch.
  2. AI segmentation: The AI-driven software automatically provided the time consumed to produce the segmentation map.
  3. R-AI segmentation: The time spent performing the refinements was recorded using a digital stopwatch and added to the time of the AI method.

Qualitative analysis of the AI-driven segmentation

The qualitative analysis of the automated segmentation was performed by two operators (FFP and RCF) in consensus using a 5-grade rating scale (1—unacceptable, 2—poor, 3—average, 4—good, and 5—excellent). The scoring was based on the number of voxels required to be added or removed in more than two consecutive multiplanar CBCT reconstructions for obtaining a proper delineation of the maxillary region of interest (anterior, premolar, and molar regions) in the 3D segmentation map: unacceptable—need to add/remove more than nine rows of voxels; poor—need to add/remove from 7 to 9 rows of voxels; average—need to add/remove from 4 to 6 rows of voxels; good—need to add/remove three rows of voxels; and excellent—no corrections needed or addition/removal of up to two rows of voxels. The CBCT multiplanar reconstructions (i.e., axial, sagittal, and coronal sections) were used as a reference for judging the quality of automated segmentations.
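
For clarity, this rating rule maps the number of voxel rows requiring correction onto the 5-grade scale, as in the following sketch:

```python
def quality_score(rows_to_correct: int) -> int:
    """Map voxel rows needing addition/removal (in more than two consecutive
    multiplanar reconstructions) to the 5-grade qualitative scale."""
    if rows_to_correct <= 2:
        return 5  # excellent: no corrections or up to two rows
    if rows_to_correct == 3:
        return 4  # good
    if rows_to_correct <= 6:
        return 3  # average
    if rows_to_correct <= 9:
        return 2  # poor
    return 1      # unacceptable: more than nine rows
```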

Prior to assessment, the operators were calibrated and trained by qualitatively assessing the automated segmentations of nine randomly selected CBCT scans. Both operators performed the evaluation in consensus, and the assessment was repeated 30 days after the first session. Following confirmation of the intra-operator reliability, the testing set (n = 30) was qualitatively assessed based on the aforementioned criteria. The operators reevaluated 30% of the sample (n = 9) in consensus after 30 days to determine intra-operator reliability. The qualitative evaluation was performed on the cloud-based platform at a 60 cm distance from a high-resolution medical display (MDRC-2124, Barco N.V., Kortrijk, Belgium) in a quiet room with dimmed lighting.

Statistical analysis

Data were analyzed using the SPSS statistical software (version 24.0, IBM Corp., Armonk, NY). The weighted Kappa test was used to measure the intra-operator reliability of the qualitative analysis. Descriptive statistics including the mean and standard deviation (SD) values were calculated for each metric to evaluate the accuracy performance of the AI and manual segmentation methods and the consistency of the refinements. The t-test was used to compare the accuracy metrics between the automated and manual approaches. Furthermore, the time required to generate a 3D model by the different segmentation methods (manual, AI, and R-AI) was assessed using descriptive statistics, and the difference was compared using the one-way analysis of variance with the Tukey post hoc test. The association between the qualitative scoring of the automated segmentation and the region of interest (anterior, premolar, or molar) was assessed using the chi-square test. Additionally, the correlation between the scoring and the time required to refine the automated segmentation was analyzed by Spearman's correlation coefficient. The significance level was set at 5%.
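
For readers wishing to replicate the main comparisons outside SPSS, direct SciPy equivalents exist. The sketch below uses randomly generated placeholder data drawn from the reported means and SDs, not the study data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
manual_t = rng.normal(5973.3, 623.6, 9)  # manual segmentation times (s)
ai_t = rng.normal(51.5, 10.9, 9)         # AI segmentation times (s)
rai_t = rng.normal(1666.7, 588.5, 9)     # R-AI segmentation times (s)

# One-way ANOVA across the three methods, followed by Tukey's post hoc test
f_stat, p_anova = stats.f_oneway(manual_t, ai_t, rai_t)
tukey = stats.tukey_hsd(manual_t, ai_t, rai_t)

# Spearman correlation between qualitative scores and refinement times
scores = np.array([5, 5, 4, 5, 4, 5, 5, 4, 4])          # placeholder ratings
refine_t = rng.normal(2192.9, 1150.8, 9).clip(min=1.0)  # placeholder times (s)
rho, p_rho = stats.spearmanr(scores, refine_t)
```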

3 RESULTS

The intra-operator reliability obtained from the calibration session of the qualitative assessment was almost perfect (weighted κ = 1.00), indicating that the operators were well calibrated. The reliability of the qualitative assessment scale was further confirmed by an almost perfect intra-operator reliability (κ = 0.87) for the testing set evaluation (Landis & Koch, 1977).

After checking the automated segmentations for errors, it was seen that all cases needed minor adjustment in one of the maxillary regions (i.e., anterior, premolar, or molar). Table 3 describes the overall performance of the automated segmentation and the R-AI consistency based on the accuracy metrics. The automated segmentation revealed a high IoU (91.2 ± 1.2%), DSC (95.4 ± 0.6%), precision (93.0 ± 1.5%), recall (98.0 ± 1.2%), and accuracy (99.1 ± 0.1%). These high values, together with a low 95% HD score (0.279 ± 0.025 mm), suggested that the AI-driven tool provided an accurate segmentation of the maxillary alveolar bone with optimal overlap with the R-AI segmentation (Figure 1). In addition, the intra-operator (95% HD = 0 ± 0 mm; IoU = 99.7 ± 0.2%) and inter-operator (95% HD = 0 ± 0 mm; IoU = 100 ± 0%) consistencies showed almost perfect reliability in relation to the R-AI segmentation, confirming the observer reliability for refining the automated segmentations.

TABLE 3. Overall performance of artificial intelligence (AI)-driven tool for maxillary alveolar bone segmentation and consistency of refined-AI (R-AI) segmentation based on accuracy metrics.

Abbreviations: AI, Artificial intelligence; DSC, Dice similarity coefficient; HD, Hausdorff distance; IoU, Intersection over union; n, sample size; R-AI, Refined artificial intelligence; SD, Standard deviation.
FIGURE 1 Three-dimensional (3D) segmentation of the maxillary alveolar bone. (a–c) STL comparison maps showing the root mean square error between the AI and R-AI models in the right lateral, frontal, and left lateral views, respectively; (d–f) 3D models provided by the AI-driven tool in the same views, respectively.

Table 4 presents the comparison of accuracy between the automated and manual segmentation approaches. The manual method performed slightly better than the automated approach, as seen by the lower 95% HD value and higher values of IoU, DSC, precision, and accuracy (p < .05). Conversely, the automated method showed a slightly higher recall value than the manual method (p < .05).

TABLE 4. Comparison between the manual and automated AI-driven segmentation based on accuracy metrics.

Abbreviations: AI, Artificial intelligence; DSC, Dice similarity coefficient; HD, Hausdorff distance; IoU, Intersection over union; n, sample size; R-AI, Refined artificial intelligence; SD, Standard deviation.

Figure 2 presents the results of the timing analysis. A significant difference existed between the times required to generate the 3D models by the manual, automated, and R-AI approaches (p < .001): the automated segmentation needed the shortest time (51.5 ± 10.9 s) and was almost 116 times faster than the manual segmentation (5973.3 ± 623.6 s), while the R-AI approach required an intermediate amount of time (1666.7 ± 588.5 s).

FIGURE 2 Timing analysis according to segmentation method (manual, AI, and R-AI). Different uppercase letters indicate a statistically significant difference among the segmentation methods (p < .001).

Figure 3a illustrates the findings of the qualitative assessment of the automated segmentation according to the anatomical region of interest. A significant association (p < .001) existed between the quality of the automated segmentation and the maxillary region of interest. The quality of the segmented anterior region varied from good to excellent, whereas the premolar and molar regions scored excellent for all segmentations. As only the anterior region showed a range of qualitative scores, a correlation analysis was conducted to assess the relationship between the scoring of this region and the time required to refine the automated segmentation of the entire testing sample (2192.9 ± 1150.8 s). The findings showed a significant moderate negative correlation (Schober et al., 2018) (Spearman's ρ = −0.63, p = .0002), meaning that the 3D models rated as good quality required more time to be refined (Figure 3b).

FIGURE 3 Subjective analysis of the AI segmentation. (a) Qualitative assessment of the automated segmentation using the 5-point scale according to the region of the maxillary alveolar bone (anterior, premolar, molar). No automated segmentation was classified in the unacceptable, poor, or average categories. The asterisk (*) indicates a statistically significant association (p = .035) between the quality of segmentation and the region of the alveolar bone, and "ns" means no statistically significant difference between the maxillary regions regarding the qualitative assessment of the automated segmentation. (b) Graph illustrating the correlation between the qualitative scoring of the automated segmentation and the time required to refine it at the anterior region (Spearman correlation coefficient).

4 DISCUSSION

The incorporation of digital workflows into dental practice has allowed clinicians to obtain CBCT-derived 3D models of dentomaxillofacial structures for several clinical applications, such as virtual treatment planning, designing of surgical guides, and follow-up evaluation by superimposing 3D models of a patient at different time points (Bartnikowski et al., 2020; Sumida et al., 2015). Accurate modeling of the region of interest through segmentation is a vital step in digital workflows (Verhelst et al., 2021; Wang et al., 2021). Among the several clinical and research applications of segmentation, virtual 3D modeling of the maxillary alveolar bone is of particular importance for improving the predictability and reliability of presurgical planning in periodontology and dental implantology. Conventional semiautomated or template-based segmentation techniques are prone to inherent limitations that reduce the accuracy of the generated models (Chung et al., 2020; Wang et al., 2021). Hence, this study developed and validated a novel AI-driven CNN-based tool for the segmentation of the maxillary alveolar bone and its crestal contour on CBCT images. Based on its high performance and time-efficiency, it could act as a viable alternative for the segmentation and assessment of the periodontal bone level.

The CNN model was deployed on an online cloud-based platform, which not only provided the automated segmentation but also allowed the operator to manually adjust the final output if any error was identified in the segmentation map. Although human intervention could be expected to reduce the consistency of the R-AI segmentation, the high intra-operator consistency (IoU = 99.7% and 95% HD = 0 mm) indicated that the minor adjustments were performed in a standardized and replicable manner. In addition, the high inter-operator consistency (IoU = 100% and 95% HD = 0 mm) suggested that the observers identified the same errors within the segmentation maps. In a previous study, Verhelst et al. (2021) also reported a high consistency of the refinement process following automated segmentation of the mandibular bone using a similar CNN architecture. It should be noted that the required refinements were clinically insignificant: the largest difference between the AI and R-AI segmentation maps in the current investigation was 0.28 mm, which is less than 1 mm and would be clinically undetectable by a periodontal probe (Al Shayeb et al., 2014).

As this is the first study to apply an AI-driven tool for the 3D modeling of the maxillary alveolar bone on CBCT images, comparison with existing studies was difficult. Nevertheless, previous studies developed CNN models for multiclass automated segmentation of the maxillofacial complex on CT (Dot et al., 2022) and CBCT images (Ham et al., 2018; Preda et al., 2022; Wang et al., 2021). These studies reported an acceptable performance of the CNN models, with DSC values ranging from 82.6% to 96.8%. However, no information concerning the inclusion of the segmented alveolar bone or its crestal contour in the training and validation of the AI models was provided, and a flattening pattern of the maxillary alveolar bone is seen in the illustrative 3D models of the aforementioned studies. This anatomical detail is crucial for a precise assessment of the bone level, a task commonly performed in periodontology and implant dentistry. One study was found assessing CNN-based automated segmentation of the alveolar bone on intraoral ultrasound images (Nguyen et al., 2020). Although a high performance was reported (DSC = 83.7%, 95% HD = 0.32 mm), ultrasound is not usually recommended for evaluating bone tissue. Furthermore, those findings should be interpreted with caution due to a lack of generalizability, since the selected sample belonged to a young patient group and only the buccal alveolar bone of the anterior region was segmented (Nguyen et al., 2020).

Our findings suggested that the segmentation of the anterior crestal contour of the alveolar bone showed slightly lower quality compared with the posterior region. This could be because the anterior cortical bone is occasionally very thin or even absent, which complicates the segmentation. Yet, it is also essential to highlight that the anterior region had a higher number of excellent scores (n = 26) than good scores (n = 4). The statistical difference observed can be explained by the fact that all 3D models presented excellent quality in the posterior maxillary region (n = 30). As expected, anterior-region segmentations scored as good quality required more refinement time, as evidenced by the statistically significant moderate negative correlation.

The clinical applicability of a tool dedicated to the segmentation of dentomaxillofacial structures from medical images is usually evaluated based on accuracy metrics. However, the time needed to generate a 3D model has received little attention, even though it is an important factor for demonstrating the real applicability of these tools in a clinical setting (Fontenele et al., 2022; Shujaat, Jazil, et al., 2021). Hence, it is advised that future studies also report the time required for automated segmentation. The timing analysis in this study showed that the automated segmentation was significantly faster than the R-AI method. The R-AI segmentation was time-consuming because the segmentation maps were refined by viewing the CBCT images slice-by-slice in the axial, coronal, and sagittal planes. However, this laborious task had no negative impact on the consistency of the segmentation, as confirmed by the high operator consistency.

A comparison between AI and human performance (i.e., manual segmentation) was made using part of the testing sample due to the long time required to perform the manual segmentation. In general, the manual segmentation performed slightly better than the AI-driven segmentation. Nonetheless, the statistical difference was small, and its impact in a clinical scenario can be considered negligible (Fontenele et al., 2022). This statistical difference can be attributed to the very low standard deviation values, which make even small differences statistically detectable. Furthermore, the manual segmentation consumed 116 times more time than the AI approach. It is essential to highlight that the manual segmentation was performed by an experienced oral and maxillofacial radiologist using the aforementioned online platform, whose intuitive contour tool aids the operator in delineating the bone boundaries of the region of interest; manual segmentation would thus be even slower with other segmentation software. Considering that both methods showed accurate performance, the lower time consumed by the AI-driven segmentation is a crucial factor favoring the use of the automated AI-driven approach within the digital workflow in dentistry.

Concerning the methodology adopted, CBCT scans with a high expression of artifacts generated by high-density materials (e.g., implants, metal crowns, and orthodontic brackets) and motion artifacts were excluded, although some degree of artifacts was still present due to dental fillings. Given the pioneering aim of this study, this criterion was adopted because these factors could directly impact image quality and/or CBCT spatial resolution (Schulze et al., 2011) and could act as confounding factors in evaluating the performance of the tested tool. Corroborating this decision, a previous investigation (Fontenele et al., 2022) showed that the presence of dental fillings slightly impacted the accuracy of tooth segmentation, with undersegmentation being the main error observed. Based on these findings, future studies are encouraged to investigate the influence of artifacts produced by high-density materials on the accuracy of maxillary alveolar bone segmentation, and to optimize the CNN so that it performs accurate alveolar bone segmentation even in clinical scenarios with greater artifact generation, such as in the presence of dental implants and metallic crowns.

The study was associated with certain limitations. Firstly, the tested CNNs were trained and validated with images acquired from two CBCT devices, so the findings cannot be generalized to other devices. Secondly, the anatomical region of interest was limited to the maxilla to reduce the possibility of error associated with the manual labeling of the training and validation data. Future studies are recommended to develop CNN models dedicated to automated segmentation of the mandibular alveolar bone.

5 CONCLUSION

In conclusion, although the manual segmentation showed slightly better accuracy metrics, the AI-driven segmentation also provided a highly accurate segmentation of the maxillary alveolar bone and its crestal contour while consuming 116 times less working time. Moreover, the refinement process of the automated segmentation was consistent and required an intermediate amount of time between the automated and manual approaches. Thus, the proposed AI-based tool can facilitate the digital workflow for oral implant planning by allowing accurate, time-efficient, and consistent segmentation of the alveolar bone. Future research is needed to validate its application to scans from other CBCT machines to allow generalizability.

AUTHOR CONTRIBUTIONS

R.C.F., M.D.N.G., D.Q.F., and R.J. conceived the ideas; R.C.F., M.D.N.G., F.F.P., A.V.G., S.N., and H.W. collected and analyzed the data; R.C.F. and D.Q.F. performed the statistical analysis; R.C.F., M.D.N.G., F.F.P., A.V.G., S.N., and H.W. led the writing of the original draft; and R.C.F., M.D.N.G., F.F.P., A.V.G., S.N., H.W., D.Q.F., and R.J. reviewed and edited the manuscript.

ACKNOWLEDGMENTS

This publication was made possible with the support of a Development Project of VLAIO (Flanders Innovation & Entrepreneurship). This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—Brasil (CAPES)—Finance Code 001 and the National Council for Scientific and Technological Development—Brazil (CNPq, process #312046/2021-9).

CONFLICT OF INTEREST STATEMENT

The authors declare the following financial interests/personal relationships which may be considered potential competing interests: the authors A.V.G., S.N., and H.W. have a professional relationship with RELU BV (ownership, development, and commercial interests), and the author R.J. is a member of the advisory board of RELU BV.

ETHICS STATEMENT

The study protocol was approved by the ethical review board (reference number S66447).
