*Artificial Intelligence: Development and Applications in Neurosurgery DOI: http://dx.doi.org/10.5772/intechopen.113034*

AI applications in vertebral fracture detection have generated tremendous interest because algorithm-driven image discrimination is comparatively straightforward in this setting relative to other neurosurgical contexts. Many studies have evaluated both ML and DL models for fracture detection. Tomita et al. trained a deep neural network on 1432 CT scans to detect osteoporotic vertebral fractures, finding an ROC-AUC between 0.909 and 0.918 with an F-score of 90.8% and an accuracy of 89.2%, measures approximately equivalent to those of radiologists [129]. Small et al. tested Cspine, an FDA-approved CNN developed by Aidoc to detect cervical spine fractures, finding an accuracy, sensitivity, and specificity for the CNN vs. radiologists of 92 vs. 95%, 76 vs. 93%, and 97 vs. 96%, respectively [130]. Derkatch et al. trained a CNN binary classifier on dual-energy x-ray absorptiometry data to detect vertebral compression fractures, which yielded an ROC-AUC of 0.94 with a sensitivity of 87.4% and a specificity of 88.4% [131]. Thus, these data suggest that ML and DL models can serve as an adjunct to the radiologist and the neurosurgeon in vertebral fracture detection.
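
Because the studies above report different combinations of accuracy, sensitivity, specificity, and F-score, it may help to see how these measures derive from the same four confusion-matrix counts. The following is a minimal sketch; the function name `classification_metrics` and the example counts are hypothetical and are not taken from any of the cited studies.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive common diagnostic measures from confusion-matrix counts.

    tp/fn: fractures correctly detected / missed.
    tn/fp: non-fractures correctly cleared / falsely flagged.
    """
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)              # positive predictive value
    # F-score (F1): harmonic mean of precision and sensitivity
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f_score": f_score,
    }


# Hypothetical test set: 100 scans with a fracture, 100 without.
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, sensitivity is 0.800, specificity 0.900, accuracy 0.850, and the F-score approximately 0.842. Note that accuracy depends on the proportion of fractures in the test set, whereas sensitivity and specificity do not, which is one reason the measures are usually reported together.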

Currently, only a few semi-automatic methods for disc and vertebral labeling are in wide use. However, these methods are prone to subjectivity because they depend on user-directed input. Hence, many studies have sought to develop alternative methods to enhance accuracy and efficiency in radiological evaluation. Lehnen et al. demonstrated the feasibility of using a single CNN to identify various degenerative changes of the lumbar spine from MR images, finding high diagnostic accuracy for intervertebral disc detection/labeling (100%), spinal canal stenosis (98%), and nerve root compression (91%) as well as moderately high diagnostic accuracy for disc herniations (87%), extrusions (86%), bulges (76%), and spondylolisthesis (87.61%) [132]. However, the generalizability of their study is limited by a small sample size and the exclusion of patients over 70 years old. Furthermore, the use of CNNs for spine segmentation is not particularly novel; in 2018, Whitehead et al. trained a cascade of CNNs and achieved Dice scores of 0.832 and 0.865 for vertebrae and discs, respectively [133]. Huang et al. developed a DL tool, aptly named Spine Explorer, which quickly and automatically segments and measures lumbar MR images, achieving high mean Intersection-over-Union (IoU) values of 94.7% and 92.6% for the vertebra and disc, respectively [134]. A year later, Shen et al. expanded the scope of Spine Explorer to include the paraspinal muscles and the spinal canal, finding IoU values of 83.3 to 88.4% and 82.1%, respectively [135]. However, both studies using Spine Explorer were limited by small patient sample sizes. Recently, Cheng et al. developed a two-stage MultiResUNet DL model for the automatic segmentation of specific intervertebral discs, which yielded a segmentation accuracy of 94%, potentially indicating its superiority over other DL models, such as the U-Net, CNN-based, Attention U-Net, and standard MultiResUNet models [136].
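
The segmentation studies above report two different overlap measures, the Dice score and the IoU, which are not numerically interchangeable. The sketch below computes both for a pair of binary masks; the function name `dice_and_iou` and the toy 3 × 3 masks are hypothetical and serve only to illustrate the definitions.

```python
def dice_and_iou(pred_mask, true_mask):
    """Return (Dice, IoU) for two binary masks given as nested lists."""
    pred = [bool(v) for row in pred_mask for v in row]
    true = [bool(v) for row in true_mask for v in row]
    if len(pred) != len(true):
        raise ValueError("masks must contain the same number of pixels")

    intersection = sum(p and t for p, t in zip(pred, true))
    pred_area = sum(pred)
    true_area = sum(true)
    union = pred_area + true_area - intersection

    if union == 0:
        # Both masks are empty: treat as perfect agreement by convention.
        return 1.0, 1.0

    dice = 2 * intersection / (pred_area + true_area)
    iou = intersection / union
    return dice, iou


# Toy example: 1 marks a pixel labeled as vertebra.
predicted = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]
reference = [[0, 1, 1],
             [0, 1, 0],
             [0, 1, 0]]

dice, iou = dice_and_iou(predicted, reference)
print(f"Dice = {dice:.2f}, IoU = {iou:.2f}")  # Dice = 0.75, IoU = 0.60
```

For a single pair of masks the two measures are related by Dice = 2·IoU / (1 + IoU), so the Dice score is never lower than the IoU for the same segmentation. This should be kept in mind when comparing studies that report one measure against studies that report the other.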

Spine imaging findings are often insufficient to determine the underlying cause of lower back pain (LBP) and are frequently of no clinical significance, because such findings are common in asymptomatic patients. NLP algorithms, however, can help bridge this gap by abstracting data on the relationship between spine imaging findings and LBP. Tan et al. developed an NLP model to identify lumbar spine imaging findings related to LBP in x-ray and MR radiology reports; compared with a rules-based model, it demonstrated significantly greater sensitivity (0.94 vs. 0.83), a higher overall AUC (0.98 vs. 0.90), and comparable specificity (0.97 vs. 0.95) [36]. Miotto et al. developed a convolutional neural network which, after training on manual free-text clinical notes on LBP patients, was able to discriminate between acute and chronic LBP (AUC of 0.98 and F-score of 0.70), demonstrating the potential for systematizing patient symptomatology [137].
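
The AUC values quoted throughout this section can be read as the probability that a model assigns a higher score to a randomly chosen positive case than to a randomly chosen negative one. The sketch below computes the AUC directly from that pairwise definition; the function name `roc_auc` and the example labels and scores are hypothetical, and the quadratic loop is chosen for clarity rather than efficiency.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for a positive case (e.g., finding present), 0 otherwise.
    scores: model outputs, where higher means more likely positive.
    Tied scores count as half a correctly ranked pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical scores for six reports (three positive, three negative).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(f"AUC = {roc_auc(labels, scores):.3f}")  # AUC = 0.889
```

In this toy example, eight of the nine positive-negative pairs are ranked correctly, giving an AUC of about 0.889. Unlike sensitivity and specificity, the AUC does not depend on a particular decision threshold, which is why a model can show a high AUC alongside a more modest F-score at the chosen operating point.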
