*Artificial Intelligence: Development and Applications in Neurosurgery DOI: http://dx.doi.org/10.5772/intechopen.113034*

assistive or autonomous AI devices are approved by the Centers for Medicare and Medicaid Services (CMS) for reimbursement, with two of the technologies holding surgical utility (**Table 1**). The criteria for reimbursement are very specific and quite complex, with payments ultimately covering a maximum of 65% of the actual expense [158]. Compensation is based on Current Procedural Terminology (CPT) codes or New Technology Add-on Payments (NTAPs), which have a reimbursement limit of 3 years [157, 158]. In Europe, AI is not routinely covered and is not recognized as a separately reimbursable expense. Several alternative payment models, including gainsharing, outcome incentivization, and advance market commitments, have been proposed, because the potential for abuse or fraud, or for underutilization in underserved areas, under per-use payments has been recognized as a legitimate concern [157].

Ultimately, the future integration of AI into neurosurgery will depend heavily on whether the increase in efficiency and performance results in a tangible improvement in patient outcomes while providing a net cost savings to health networks. If AI proves to be a substantial solution, reassessment of reimbursement and insurance coverage is likely to follow.


#### **Table 1.**

*Modified from "Paying for artificial intelligence in medicine," Parikh and Helmchen [157].*

#### **7.2 Limitations**

The remarkable growth and promise of AI in neurosurgery are not without limitations and concerns that must be taken into account. Firstly, it is imperative to consider that potentially substantial ML-driven improvements in performance are distinct from clinically significant improvements. Although ML models may offer drastic improvements in big data prediction problems, many medical prediction scenarios tend to be intrinsically linear and binary; in such cases, it is unlikely ML models will offer substantial improvements in discrimination and be of clinical value to the neurosurgeon [12, 23]. In short, the efficacy of ML algorithms boils down to the ability to predict future outcomes based on past data.

A primary concern with LLMs is their current inability to fully comprehend context or exercise judgment, which can cause significant misinterpretations and the dissemination of incorrect and potentially harmful information [159]. LLMs lack a mechanism for filtering out biased or false information and cannot inform the end user that the information provided is incorrect. This concern is further compounded by the lack of transparency in the decision-making processes of LLMs like GPT-4. These models can offer explanations as to how and why they make certain decisions upon request, but these justifications are formed post hoc [160]. This makes it impossible to verify whether the explanations accurately represent the model's actual decision-making process. Even more problematic is that, when probed for an explanation, GPT-4 may contradict its previous statements [159, 160]. This lack of reliability and reproducibility necessitates constant human oversight to ensure accuracy. Specific to medicine, clinicians would be required to fact-check these tools, which could easily negate any time savings LLMs may offer. Intellectual property matters are another issue with LLMs. These tools not only pull data and property from creators without consent, but some have also created and cited false references [150].

Furthermore, there is a tendency for bias, violations of privacy, and inherent logistical difficulties with the global utilization of AI. Datasets used to train algorithms are predominantly composed of information representing the majority population and common conditions. This model bias can negatively impact racial and ethnic minority groups, genders, and socioeconomically disadvantaged peoples, in addition to diminishing the ability to recognize difficult anatomy [161, 162]. A study by Kamulageya et al. found that the AI dermatologic algorithm Skin Image Search was woefully inaccurate when presented with images of pathology in Ugandan patients with dark (Fitzpatrick 6) skin types [163]. The company website boasts an accuracy of 80%, but the algorithm was found to be only 17% accurate when presented with darker skin tones [163]. Facial recognition algorithms have also been found to perform unevenly across gender and race, performing worst for females with darker skin tones [164]. These very groups already suffer from diminished access to care and undertreatment of disease in comparison to non-disadvantaged people. Model variance, which stems from insufficient data from minority groups, also furthers the bias of AI algorithms. Differences in practices, equipment, and coding also decrease the generalizability of AI algorithms. Designing algorithms with the global population in mind, analyzing performance on a subgroup basis, and externally validating the algorithms are ways to combat this [162].
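Analyzing performance on a subgroup basis can be illustrated with a minimal Python sketch. The function name, the subgroup labels, and the toy records below are hypothetical and are not drawn from any cited study; the point is only that a single pooled accuracy figure can conceal a large gap between groups.

```python
from collections import defaultdict


def accuracy_by_subgroup(records):
    """Return (overall accuracy, {subgroup: accuracy}).

    `records` is an iterable of (subgroup, true_label, predicted_label)
    tuples. All names here are illustrative assumptions.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for subgroup, truth, predicted in records:
        totals[subgroup] += 1
        if truth == predicted:
            correct[subgroup] += 1
    if not totals:
        raise ValueError("records must not be empty")
    overall = sum(correct.values()) / sum(totals.values())
    per_group = {group: correct[group] / totals[group] for group in totals}
    return overall, per_group


# Hypothetical toy data: the majority group is well represented and
# mostly classified correctly; the minority group is not.
records = (
    [("I-IV", 1, 1)] * 8 + [("I-IV", 1, 0)] * 2
    + [("V-VI", 1, 1)] * 1 + [("V-VI", 1, 0)] * 4
)

overall, per_group = accuracy_by_subgroup(records)
print(f"overall: {overall:.2f}")
for group, acc in per_group.items():
    print(f"{group}: {acc:.2f}")
```

In this toy example the pooled accuracy sits between the two subgroup values and is dominated by the larger group, which is why reporting only the pooled figure can mask poor performance in an underrepresented population.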

Obtaining large quantities of patient data to train AI systems is difficult due to the necessary privacy protections added to patient data [161]. Inappropriate access to datasets and algorithms poses significant ethical, security, and privacy concerns. Algorithms can be manipulated by the addition of noise or altered data to produce harmful or deleterious effects on the system. Ensuring data privacy and security while allowing users and developers to learn and improve upon the technology is key to moving AI forward.
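How a small amount of added noise can manipulate an algorithm can be sketched with a toy linear classifier in Python. The weights, bias, input values, and function names below are hypothetical assumptions and do not represent any clinical system; the perturbation follows the spirit of sign-based adversarial attacks, in which each input feature is nudged by a small amount in the direction that most lowers the model's score.

```python
def predict(weights, bias, features):
    """Toy linear classifier: returns 1 if the weighted sum is positive."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def perturb(weights, features, epsilon):
    """Shift every feature by at most `epsilon` against the sign of its
    weight, which lowers the linear score as much as possible for that
    perturbation size."""
    def sign(value):
        return (value > 0) - (value < 0)

    return [x - epsilon * sign(w) for w, x in zip(weights, features)]


# Hypothetical model parameters and input (illustrative only).
weights = [0.5, -0.25, 1.0]
bias = -0.1
original = [0.2, 0.1, 0.1]

adversarial = perturb(weights, original, epsilon=0.05)

print("original prediction:   ", predict(weights, bias, original))
print("adversarial prediction:", predict(weights, bias, adversarial))
print("largest feature change:",
      max(abs(a - o) for a, o in zip(adversarial, original)))
```

In this sketch no feature changes by more than 0.05, yet the predicted class flips. Real models and real attacks are considerably more complex, but the same principle underlies the concern about altered inputs.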

On a global scale, challenges to telesurgery include lags in connection speeds and the potential for delays and disconnections. The introduction of 5G technology has been touted as a possible remedy; however, this remains to be seen [165]. Another consideration is the cost of these systems and their maintenance [165, 166]. Will low- and middle-income countries, which are in the greatest need of assistance, share in the cost, or will the burden fall on the higher-income nations? While telesurgery will reduce medical tourism to a point, medical tourism will persist unless the infrastructure for preoperative and postoperative care is created within the countries in need. A likely solution for remote regions would involve smartphone apps for pre- and post-operative care, with medical tourism over a shorter distance for the operation and immediate aftercare until the patient is sufficiently recovered. For any AI solution to be implemented in a low- or middle-income country, the obstacles of infrastructure (electricity, Wi-Fi, phone lines, etc.) and governance for AI will need to be overcome on a broad scale.

Frequently stated worries include overreliance on technology, the loss of jobs, and physician disapproval. Most technologies being created are intended to assist and to prevent fatigue, and skills must be maintained in order to properly utilize the technology. While there are solutions that involve autonomous actions handled solely by AI technology, patients themselves are not in favor of operations or procedures in which a surgeon is not involved. A cross-sectional study conducted by Palmisciano et al. found that while the majority of patient respondents thought AI use was appropriate for image interpretation/preoperative planning or for indicating potential complications (76.7% and 82.2%, respectively), only 17.7% of these patients approved of AI performing an entire operation [167]. Physicians themselves are also quite welcoming of AI integration into neurosurgery. A survey of neurosurgeons, anesthetists, nurses, and operating room practitioners conducted by Horsfall et al. revealed that the majority of respondents viewed the use of AI in various aspects of neurosurgery favorably [167].

Among respondents, 62% were in favor of AI use for imaging interpretation, 82% for operative planning, 70% for coordinating the surgical team, 85% for AI-generated real-time alerts to complications or hazards, and 66% for autonomous surgery by AI. Members of the Congress of Neurological Surgeons and the European Association of Neurosurgical Societies were polled by Staartjes and colleagues regarding the use of ML in neurosurgery. The results demonstrated that 28.8% of respondents used ML in clinical practice and 31.1% used ML for research [168].
