**4. digitalMLPA software**

The analysis of digitalMLPA data involves assessing signals representing the relative number of target sequences in DNA samples. Conventional MLPA utilizes fluorescent tags connected to PCR primers for signal measurement [1]. Capillary electrophoresis separates and measures relative fluorescent units of probe PCR products based on length. In digitalMLPA, NGS devices sequence the probe products [2]. Analysis of the resulting FASTQ files involves counting reads associated with specific probes in samples. Barcode sequences correspond to samples, while target sequences identify the probes. Normalization compares read counts to reference values, providing probe ratio values for copy number estimation. Probes can be clustered and sorted to identify genomic patterns for disease recognition, prognosis, and correlation with treatment effectiveness in precision medicine research.
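As an illustration of the normalization step described above, the following Python sketch shows one way read counts per probe could be turned into probe ratios against a reference sample. The probe names, the choice of reference probes, and the median-based scaling are assumptions made for demonstration only and do not reflect the actual digitalMLPA implementation.

```python
from collections import Counter

def probe_ratios(sample_counts, reference_counts, ref_probe_ids):
    """Turn raw read counts into probe ratios against a reference sample.

    Counts are first scaled by the median signal of designated reference
    probes within each sample, then compared probe by probe."""
    def normalize(counts):
        ref_signals = sorted(counts[p] for p in ref_probe_ids)
        median = ref_signals[len(ref_signals) // 2]
        if median == 0:
            raise ValueError("reference probes carry no signal")
        return {probe: count / median for probe, count in counts.items()}

    sample_norm = normalize(sample_counts)
    reference_norm = normalize(reference_counts)
    return {
        probe: sample_norm[probe] / reference_norm[probe]
        for probe in sample_norm
        if reference_norm.get(probe, 0) > 0
    }

# Toy data: the hypothetical probe "BRCA1_ex5" ends up near a ratio of 0.5,
# the pattern expected for a heterozygous deletion.
sample = Counter({"REF_A": 1000, "REF_B": 1100, "BRCA1_ex5": 520})
reference = Counter({"REF_A": 980, "REF_B": 1020, "BRCA1_ex5": 1005})
print(probe_ratios(sample, reference, ["REF_A", "REF_B"]))
```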

### **4.1 The analysis workflow**

The proposed software for digitalMLPA organizes the analysis of FASTQ files into different domains with specific tasks (**Figure 4**).

Each domain has its own executable subprogram, which can be accessed through a provided GUI that executes all steps sequentially. Alternatively, these subprograms can be integrated into existing pipelines and connected to a laboratory information management system (LIMS). Incorporating a LIMS can enhance automation, but it requires on-site experts familiar with the program's input and output requirements.

**Figure 4.** *digitalMLPA analysis process flow.*


The analysis workflow begins with FASTQ conversion, which requires information about the NGS run (the Coffalyser Definition File, or CDF file) and a configuration containing digitalMLPA probe mix and barcode collection information. FASTQ conversion generates proprietary file formats for each sample barcode, which serve as input for the subsequent analysis stages. This allows users to demultiplex large FASTQ files at one location and continue experiment-specific analysis at a different location. In institutes where multiple groups share an NGS device, this behavior is often expected, as each group may only be interested in a subset of the FASTQ file. FASTQ conversion can be automated through a scheduled operating-system task that triggers the conversion when the FASTQ file is generated by the NGS device; as the conversion can be time-consuming, such automation is beneficial.

After conversion, the resulting sample files undergo fragment analysis, which performs sample-specific quality checks to determine their suitability for subsequent analysis steps. Several steps implement controls for higher processes, including NGS machine performance, DNA sample quality, sample identity, MLPA reaction quality, and context-related factors such as reference sample and probe quality. Quality control also ensures the reliability of the overall results, aiding quality evaluation when discrepancies occur.

Following fragment analysis, the sample files are updated and used as input for data normalization, which applies specific instructions from the CDF file and configuration to normalize samples within the same experiment. After normalization, the sample files are updated again and serve as input for results analysis. This component sorts and clusters the results based on sample type for comparative statistical analysis, enabling the determination of significant changes in sample probe results compared to reference collections. The results analysis facilitates confident result calling and customized recognition of genomic disease patterns.

Results are stored in the sample files, from which various types of reports can be generated. The report format is highly configurable, allowing digitalMLPA probe mix designers to customize the products for customers. Each component produces debug files to facilitate validation and stability testing. This enables automated unit testing when a single component is updated, ensuring stability in the other components. This methodology supports quality assurance by making it easier to estimate the effects of changes within a single domain.
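To illustrate the domain-organized workflow, the sketch below chains hypothetical command-line subprograms for the four domains. The executable names and options are invented for illustration, since the actual digitalMLPA subprograms and their interfaces are proprietary; the point is only that each domain can be run, tested, and replaced in isolation.

```python
import subprocess
from pathlib import Path

# Hypothetical executable names and flags, one per analysis domain.
STEPS = [
    ["fastq_convert", "--cdf", "run.cdf", "--config", "probemix.cfg", "--fastq", "run.fastq.gz"],
    ["fragment_analysis", "--samples", "samples/"],
    ["normalize", "--samples", "samples/", "--cdf", "run.cdf"],
    ["results_analysis", "--samples", "samples/", "--report", "report.pdf"],
]

def run_pipeline(workdir: Path) -> None:
    for step in STEPS:
        # Each domain writes its own debug output, so a failing step can be
        # re-run and unit-tested without touching the other domains.
        result = subprocess.run(step, cwd=workdir, capture_output=True, text=True)
        if result.returncode != 0:
            raise RuntimeError(f"{step[0]} failed: {result.stderr}")

if __name__ == "__main__":
    run_pipeline(Path("experiment_001"))
```

A LIMS integration would typically replace the static step list with records pulled from the laboratory database, which is why keeping each domain behind a stable command-line interface simplifies automation.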

Domain-specific changes enable focused verification, allowing scrutiny of individual software components while the stability of the others is verified. This approach optimizes large software projects by integrating each part into the organization while other processes continue in parallel. Although domain-organized software may initially have a longer development path, it remains more agile throughout its lifetime. Documentation around quality assurance can benefit from this structure if organizations are prepared for this work style. Importantly, this structure lends itself well to a top-down risk approach that centers attention on the requirements of the highest processes, which may have an impact on patient safety. This approach allows for the implementation of dedicated controls to minimize risks effectively. The effectiveness of controls and regulations in minimizing risks to patient and user safety should be continuously verified through implementation and measurement in the field, encompassing the entire scope. These measurements are essential for establishing the effectiveness of controls and ensuring the usability of regulations.

In summary, the main requirements for digitalMLPA software involve the analysis of signals representing the relative number of target sequences in DNA samples, using NGS devices and FASTQ files. The analysis workflow encompasses several domains with specific tasks, allowing for automation, quality control, data normalization, and results analysis. The domain-specific approach facilitates focused verification and the integration of software components into the organization. Furthermore, this approach aligns with a top-down risk perspective, emphasizing patient safety and the implementation of effective controls. Continuous measurement and evaluation are crucial for assessing the effectiveness of controls and maintaining regulatory compliance.

#### **4.2 Quality assurance when developing software**

#### *4.2.1 Initial development of digitalMLPA software*

How do we approach quality assurance when developing software with unclear goals and an undefined intended purpose? Here are some considerations to guide us. The software's classification depends on the intended purpose of the supported probe mix, which may require us to anticipate future usage scenarios. During software development, we can incorporate configurable functionality to cater to various product needs. These configurations become a separate domain, requiring thorough testing and evaluation alongside other product aspects. When determining the risk class, we have two options: assuming the highest possible risk class throughout the software's lifetime, or aligning it with the current risk level and upgrading when higher-risk products emerge. The choice depends on factors such as project size, available resources, and documentation practices. Timely documentation, preferably done alongside coding, ensures synchronization and avoids outdated information. Identifying controls is often best achieved during method development.

When the exact goal of the software is not completely defined, we may look for quality assurance guidance from the highest applicable authority. Authority comes from excellence, which may be defined by dedication to the highest standard. This may mean the IVDR or the FDA, or the way their regulations are implemented in the organization; it may also involve some inner conviction and experience with the matter at hand. The main requirements of the company, such as MRC Holland's goal to provide affordable and user-friendly products for detecting chromosomal aberrations, can shape the software's objectives. Supporting input from measurement machinery and calculating MLPA probe ratios are key requirements. These ratios, in combination with specific MLPA products, assist in copy number estimation and disease diagnosis.

The risk class depends on the potential impact on patients and its associated probability. To address uncertainty, one can initially assign the highest risk class and gradually lower it through control implementation and verification of validity. Use cases that assess severity and probability help determine the initial risk class.

#### *4.2.2 Use cases*

#### *4.2.2.1 Use case one*

Conventional MLPA as a technique focuses on the detection of genetic diseases that are typically tested for when a phenotype is observed. The intended use may then be described as providing knowledge of an observed condition that can lead to an understanding of its genetic underpinnings. Genetic test kits could also be used to understand the probability of certain inherited genetic traits being passed on to offspring. For instance, we may have a scenario where a user has a genetic test result and the intended purpose of the probe mix is to provide understanding of a patient's underlying condition, but not treatment. False positive results may be expected to be confirmed by a second method (as stated in the IFU), but even without that confirmation, the product is not expected to harm the patient directly. In the case of false negatives, patients are typically expected to remain under monitoring due to their already observed phenotype and to be subjected to alternative types of testing. The severity to the patient is therefore low, as is the probability of harm; the expected risk class is therefore IIa.

#### *4.2.2.2 Use case two*

We may also consider a use case where the product is intended to test newborns for genetic diseases. In this case, we may deal with specific syndromes where an outcome is used in a chain of decision-making that ultimately leads to the application of medicine, for instance to delay disease progression. A false negative may then lead to harm to the patient, but not death. A false positive would still be expected to be confirmed, especially since the pharmaceutical industry is expected to adhere to higher risk classes under the FDA or the IVDR. The severity in this case may be expected to be higher than in the previous case, since an extended delay affects patient sovereignty. The initial risk class for such a product may be IIb.

We need to keep in mind, however, that before a product obtains such an intended purpose, it usually must pass extensive rounds of testing by external users. Post-market surveillance data from these tests are important for understanding how products are implemented and what role they play at different locations. These data may result in changes to the risk class and help provide information about performance in the field. Such data may be generated by allowing a new genetic test to be used for an extensive period alongside currently implemented technologies or so-called gold standards. Information about its application, context of usage, and performance in the field may then lead to better insights that change the severity and probability, leading to a lower or higher class.

Developers are usually limited to verification and unit testing, which have their own importance but do not grant validation of a device's implementation across multiple operational settings. To determine the risk class of medical software, we thus must adhere to the risk class of the known process that lies above its application; for digitalMLPA software, this is the product itself. Processes above that may be the design and production of the product, but without data about its usage in the field, no verifiable validity can be determined. Verifiable validity here means that the intended purpose is shown to be met through operation and evaluation of performance in the field. Please note that we do not intend to dictate the importance of a certain risk class here, merely to examine and invite discussion on the outcome of certain case scenarios.

#### *4.2.2.3 Use case three*

If we consider digitalMLPA in its current employment, its focus relates mostly to research in cancer genomics [2]. It is well suited to investigating copy number changes at specific genomic locations and combinations of changes that affect biochemical pathways. Such understanding can lead to better insight into prognosis and patient-specific medicine. Software intended for research may be classified as a general laboratory purpose device and does not need to undergo external validation by notified bodies. However, if we also consider the commercial interest in such products, there is a need to develop trust with customers in the field that a product, including its software version and configuration, forms a stable element in the process they use to investigate cancer diagnostics.

Considering the commercial endeavor, a positive outcome of a research trial may lead to the establishment of a diagnostic product for a manufacturer. This may be a screening test or a device that becomes part of standard operating procedures in clinical diagnosis. Data that provide insight into performance in the field also require a stable supply of a product and its specific software and related configurations. The risk class that may be considered in this category under the IVDR may be IIb, since the deployment of the product may influence diagnosis related to cancer genetics. The obvious problem here is that data that could provide insight into its implementation do not exist before it is implemented.

When developing software, we are thus better off preparing for the road ahead and investing in accuracy and stability to conform as closely as possible to the company's main requirements. To create software that remains stable over a period of ten years, broader considerations need to be evaluated. We may, for instance, consider which decisions we want to be accountable for once that period has passed, and conclude that we should be prepared for the software's ultimate usage in practice. If we accept a high-risk classification early, we have the option to choose a suitable documentation style that matches its purpose and is an accepted company standard. We could, for instance, comment code in such a way that it leads to automated descriptions of processes and/or understandable diagrams and workflows. Combining such data with Artificial Intelligence (AI) language models holds great potential in this area and may ease the flow of data supporting QA documentation and support material.

Note that the risk class often depends on the verifiable validity of the software implementation in the field, and that the same data are needed to establish and optimize controls. This does not mean, however, that we cannot attempt to mitigate risks through case scenarios that lead to controls allowing the probability to be adjusted. Controls also require testing and evaluation of their results to establish whether a mitigation has been properly implemented and the final risk is acceptable. Sometimes we may also choose to mitigate risks with lower probabilities because their detectability is high and the mitigation is easy to implement; such situations are typically based on verifiable data and consider the detectability of a failure mode. We may also imagine failures of the highest possible processes that influence the overall intended use and the highest risk class, for which applicable detection controls may be put in place. Risk priority is determined by the risk class combined with the detectability; this means that, through the severity, it may be relevant to also work on low-probability failures when mitigations are available. It is not in the scope of this piece to consider all scenarios, but we may attempt to evaluate a few that rank the highest.
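The interplay between severity, probability, and detectability can be made concrete with a small FMEA-style scoring sketch. The scales, the priority formula, and the example failure modes below are illustrative assumptions rather than values taken from an actual digitalMLPA risk analysis.

```python
from dataclasses import dataclass

@dataclass
class FailureMode:
    name: str
    severity: int       # 1..5, driven by the intended purpose of the product
    probability: int    # 1..5, estimated chance of occurrence
    detectability: int  # 1..5, how well an available control can catch it

    @property
    def control_priority(self) -> int:
        # Mirrors the reasoning above: high severity combined with a highly
        # detectable failure mode argues for implementing the control, even
        # when the estimated probability is low.
        return self.severity * self.detectability

failure_modes = [
    FailureMode("NGS run below accredited read quality", severity=4, probability=2, detectability=4),
    FailureMode("Incomplete DNA denaturation (high salt)", severity=4, probability=3, detectability=5),
    FailureMode("Evaporation in poorly closed MLPA tubes", severity=4, probability=2, detectability=5),
]

for fm in sorted(failure_modes, key=lambda f: f.control_priority, reverse=True):
    print(f"priority {fm.control_priority:2d}  {fm.name}")
```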

#### *4.2.3 Use scenarios*

Prioritization by combining detectability with the risk class allows us to focus on implementing controls that sustain the overall intended purpose while considering the safety of patient and user alike at every step of the process. While the severity may be directly linked to the intended purpose of the product, the probability of different failure modes may be a different component altogether. Probabilities may be adjusted by increasing the detectability of failure of the main processes. As such, in the discussion we adhere to the highest severity imaginable, within the confines of the context at the time the controls are implemented, for the highest processes that the software can investigate or support. In this section we discuss some examples of failure modes that may exist in practice, and the types of control that have been implemented to mitigate the risks.

#### *4.2.3.1 Case scenario one*

Failure of the NGS measurement device. *Failure mode*: The NGS measurement device fails to obtain an accredited accuracy rate. Very low read quality results in changes in probe ratios, as reads are not recognized or are assigned incorrect identities. *Severity*: Complete failure of the device may lead to false positive results. *Verification*: The validity of the software's methodology should be verified by provable specificity and sensitivity in real-world implementation. Logical reasoning and aiming for the highest degree of accuracy can also be employed. *Specificity and sensitivity*: Read recognition is a critical requirement of the software. With digitalMLPA, read identification is simplified as it only needs to match against a limited probe set. By implementing a method that distinguishes barcode sequences and links them to specific sample barcodes, probe mix identification is established, increasing specificity. Further segmentation and setting of thresholds enhance read recognition. *Controls*: Establishing control systems to detect data outside standard limits is important regardless of severity and probability. Controls for NGS data may include sequence quality, detection range, signal amplitude, sample signal bias, and more. New controls can be developed based on observed measurements, which may affect the probability of the risk. *Priority and detectability*: NGS run quality and library preparation are prioritized due to their effect on the outcome and their high detectability. Improving NGS run quality can enhance the detectability of controls in later stages of data analysis. Rerunning digitalMLPA products can improve data quality without repeating the experiments. *Adjusting probability and detectability*: The effectiveness of implemented controls determines whether adjustments can be made in risk classification. Actual measurement under configured conditions is necessary to evaluate the impact of controls.
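A minimal sketch of the read-recognition and sequence-quality controls discussed above follows. The barcode and probe sequences, the mismatch limit, and the quality threshold are all invented for illustration and are not the digitalMLPA values.

```python
# Illustrative read recognition: a read passes a mean-quality control, then is
# matched against known barcode and probe target sequences within a mismatch
# threshold. Sequences and limits are hypothetical.
BARCODES = {"ACGTACGT": "sample_01", "TGCATGCA": "sample_02"}
PROBES = {"GGATCCTTAGGCAT": "probe_TP53_ex4", "CCGGAATTCCGGTA": "probe_MLH1_ex1"}

MAX_MISMATCHES = 1
MIN_MEAN_PHRED = 20

def hamming(a: str, b: str) -> int:
    return sum(x != y for x, y in zip(a, b))

def recognize(read: str, quals: list[int]):
    """Return (sample, probe) if the read passes the quality control and
    matches both a barcode and a probe target, otherwise None."""
    if sum(quals) / len(quals) < MIN_MEAN_PHRED:
        return None  # sequence-quality control: discard low-quality reads
    barcode, target = read[:8], read[8:22]
    sample = next((s for bc, s in BARCODES.items() if hamming(bc, barcode) <= MAX_MISMATCHES), None)
    probe = next((p for seq, p in PROBES.items() if hamming(seq, target) <= MAX_MISMATCHES), None)
    if sample and probe:
        return sample, probe
    return None

read = "ACGTACGTGGATCCTTAGGCAT"
print(recognize(read, [30] * len(read)))  # -> ('sample_01', 'probe_TP53_ex4')
```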

#### *4.2.3.2 Case scenario two*

Copy number estimation failure. *Failure mode*: High salt concentration in certain samples leads to poorly denatured DNA and incomplete probe hybridization, reducing the specificity with which chromosomal aberrations can be discerned. This behavior may introduce patterns that appear to originate from specific chromosomal locations, increasing the probability of a false positive result. *Severity*: High salt concentration can mimic a deletion pattern, leading to a high severity level. *Probability*: Discerning copy number changes via probe ratios and genomic pattern recognition becomes more difficult under the influence of high salt concentration, affecting the probability of accurate estimation. *Detectability*: digitalMLPA products incorporate smart controls using specially designed probes to test for different quality aspects, such as proper denaturation and DNA fragmentation, enhancing detectability. *Priority*: Supporting the requirement of proper denaturation and minimizing false positives becomes a high priority. It is prioritized due to the increased probability of false positives, the high detectability provided by the product, and its potential impact on later processes. The software should exclude such results from copy number estimation processes.

#### *4.2.3.3 Case scenario three*

MLPA reaction failure for copy number estimation. *Failure mode*: Poorly closed tubes in the MLPA reaction may result in sample volume evaporation, leading to changes in salt concentration that negatively affect probe hybridization completeness. *Severity*: Similar to Case Scenario Two, this scenario affects larger groups of probes and leads to poorer distinction of chromosomal aberration patterns, resulting in false positives. The severity is therefore high. *Probability*: Since this scenario can mimic genomic aberration patterns, there is a higher probability of an end user incorrectly calling a result aberrant while observing a false positive. *Detectability*: The digitalMLPA product includes specially designed control fragments that measure the extent of the hybridization reaction, enhancing detectability for identifying such issues. *Priority*: Given the high severity and probability of false positives, along with the high detectability facilitated by the product, addressing this scenario becomes the highest priority. The software should aim to prevent end users from attempting copy number evaluation on identified cases related to this failure mode.
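The gating behavior described in case scenarios two and three, where failed denaturation or hybridization controls should block copy number estimation, could look roughly like the sketch below. The control probe names and the acceptance limit are assumptions for illustration, not the actual digitalMLPA control logic.

```python
# Hedged sketch of gating copy-number analysis on control-fragment signals.
DENATURATION_CONTROLS = ["D-control-1", "D-control-2"]
HYBRIDIZATION_CONTROLS = ["H-control-1", "H-control-2"]
MIN_CONTROL_RATIO = 0.70  # assumed acceptance limit

def sample_passes_controls(ratios: dict) -> tuple[bool, list]:
    """Return (ok, warnings). A failed denaturation or hybridization control
    blocks copy-number estimation for the sample, mirroring the point above
    that the software should withhold ratios rather than risk false positives."""
    warnings = []
    for probe in DENATURATION_CONTROLS + HYBRIDIZATION_CONTROLS:
        value = ratios.get(probe)
        if value is None or value < MIN_CONTROL_RATIO:
            warnings.append(f"{probe} below acceptance limit ({value})")
    return (len(warnings) == 0, warnings)

ok, warnings = sample_passes_controls({"D-control-1": 0.95, "D-control-2": 0.55,
                                       "H-control-1": 0.92, "H-control-2": 0.90})
if not ok:
    print("Copy-number estimation withheld:", warnings)
```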

However, the exact implementation of the control system's effects cannot be attributed solely to the software itself. It should align with the intended purpose and risk class of the overall product. For example, when a control flags such a failure, a class IIb product should not produce probe ratio results but should instead issue warnings and error reports to eliminate the chance of false positives. In contrast, if the software is distributed with digitalMLPA products intended for research purposes only, a less strict control system may be applied, and end results could include warnings. The behavior and effectiveness of these control systems can be observed in practice, providing insights into how to improve the accuracy of the product to achieve its intended purpose. Adjustments to the control systems can be made based on their effectiveness, ultimately enhancing the overall product's accuracy through the harmonious implementation of all processes and control systems.
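The idea that control strictness follows the product's risk class could be expressed in configuration logic along these lines. The class names and the policy of withholding ratios versus attaching warnings are a hypothetical reading of the behavior described above, not documented digitalMLPA behavior.

```python
from enum import Enum

class ProductClass(Enum):
    RESEARCH_USE_ONLY = "RUO"
    IVDR_CLASS_IIB = "IIb"

def report_sample(ratios: dict, control_failures: list, product_class: ProductClass) -> dict:
    """Assumed policy: a class IIb configuration withholds probe ratios when a
    control fails, while a research-only configuration still reports them with
    warnings attached."""
    if control_failures and product_class is ProductClass.IVDR_CLASS_IIB:
        return {"ratios": None, "errors": control_failures}
    return {"ratios": ratios, "warnings": control_failures}

# A failed denaturation control leads to an error report for the IIb
# configuration, but only a warning for the research configuration.
failures = ["D-control-2 below acceptance limit"]
print(report_sample({"probe_TP53_ex4": 0.52}, failures, ProductClass.IVDR_CLASS_IIB))
print(report_sample({"probe_TP53_ex4": 0.52}, failures, ProductClass.RESEARCH_USE_ONLY))
```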

#### *4.2.3.4 Case scenario four*

Controls on software processes for compliance and measurement. In this case scenario, we focus on implementing controls on the main software processes used to establish compliance with the company design, specifically related to signal measurement and probe ratio estimation. It is important to recognize and prioritize these processes based on their influence and significance in achieving the intended purpose of the product. *Failure mode*: One possible failure mode is when part of an experiment becomes unreliable due to structural variations, such as insufficient master mix during the MLPA procedure. This failure can introduce changes in the biochemistry, leading to aberrant patterns and false positives. *Severity*: Such changes may lead to aberrant patterns and thus false positives; the severity of this failure mode is determined by the severity of the product and its specific configuration. It is important to address this issue to prevent misleading results. *Probability*: The probability of this failure mode occurring may be considered low, since laboratory staff are trained to avoid such situations and the usage of multichannel pipettes is recommended. *Detectability*: To establish control systems for such scenarios, specific recommendations can be provided through the instructions for use (IFU) that consider different experimental contexts, for example, recommending the inclusion of reference samples at the start, end, and intermediate positions of an experiment's sample population. The software can then measure the variation over these reference samples, although the effectiveness of this control may be context-dependent and influenced by assumed limitations of higher processes. *Priority and effectiveness*: The priority of implementing control systems should be determined by the severity and detectability of the failure mode, with a focus on higher processes that have a greater influence. While measuring the variation over reference data can potentially help prevent false positives, its actual value can only be established through field experience and data analysis. Acceptance criteria and definable logic can be implemented in the software to prevent false positives, and adjustments can be made based on data from the field to improve the control's effectiveness.
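As a sketch of the reference-sample variation control from case scenario four, the snippet below computes a coefficient of variation per probe over reference samples placed across the experiment and flags probes that exceed an assumed acceptance limit; the limit and the example ratios are illustrative only.

```python
import statistics

# Assumed acceptance limit on the coefficient of variation across reference
# samples placed at the start, middle, and end of the experiment. High
# variation suggests a structural problem such as insufficient master mix.
MAX_CV = 0.15

def reference_variation_flags(reference_ratios_per_probe: dict) -> dict:
    """Return the probes whose reference-sample ratios vary more than MAX_CV."""
    flagged = {}
    for probe, ratios in reference_ratios_per_probe.items():
        mean = statistics.mean(ratios)
        cv = statistics.stdev(ratios) / mean if mean else float("inf")
        if cv > MAX_CV:
            flagged[probe] = round(cv, 3)
    return flagged  # an empty dict means the reference samples agree within limits

refs = {"probe_TP53_ex4": [1.02, 0.98, 1.01], "probe_MLH1_ex1": [1.00, 0.55, 1.03]}
print(reference_variation_flags(refs))  # flags probe_MLH1_ex1
```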

Harmonization and collaboration: The effectiveness of control systems relies on the harmonization of all underlying processes and on collaboration between departments, manufacturers, and users. Sharing data and experiences can lead to improvements in control systems. The increased focus on Post-Market Surveillance (PMS) in the IVDR is understandable considering the importance of ongoing improvement.

It is important to note that these examples do not cover all possible failure modes but provide an examination of a few scenarios and their hierarchical order. Controls can be established based on machine manufacturer guidelines or by adhering to controls designed into the digitalMLPA product itself. A top-down risk-based approach should prioritize implementing controls for higher processes with higher probability and detectability. The severity of the software processes should align with the severity level of the product, and comprehensive testing of the product, including software, configuration, and support material, is necessary to ensure conformity to the intended purpose.

By implementing appropriate control systems, such as evaluating variation in reference data for copy number estimation in MLPA, the accuracy of probe ratio measurements can be improved. Strict control criteria should initially be researched for effectiveness, and adjustments should be based on empirical logic and observable data. Stricter controls may be applied for specific probes, or arbitrary borders can be tightened to reduce false negatives. However, it is essential to always prioritize the effectiveness of controls on higher processes over lower processes.

#### **5. Discussion**

When working on scientific software, the new IVDR regulations can initially seem rigid and defining, while their implementation regarding patient safety and effectiveness may appear vague and transitory. Standards are often based on existing devices and knowledge, and while they set limitations intended to minimize adverse outcomes, they rely on comparing and examining observable data. However, even with definitions in place, the practical implementation of patient safety measures can be challenging.

There seems to be a correlation between the focus on patient safety and the level of connection between the provider and the recipient of care. Direct patient contact allows tangible correlations, but in other cases, the ethical implications remain a question. It's important to reflect on how our day-to-day actions and decisions influence patient safety, regardless of our level of direct interaction.

Standardization has its advantages, such as uniform results, lower product costs, and predictability. However, it can also invite commercial interests and potentially lead to monopolization. Conformity to standards can become a goal in itself, conflicting with personal integrity and potentially causing resignation. In the context of patient care, caregivers should prioritize providing care rather than simply conforming to definitions. In diagnostics, focusing on the delimitation of categorization factors may be a strength that leads to a more accurate result. In research, the same limitation may block the discoveries that lead to innovation.

While it may be easy to differentiate between these two categories in theory, the status of products and their implementations can be more fluid in practice. Implementing regulations in a way that ensures patient safety often lacks harmonization and may create discord and separation. Harmonization is crucial for effectively achieving the intended purpose of medical products while maintaining the highest standards, which should ultimately serve the well-being of the patient. To achieve this, there needs to be a personal understanding of how regulations translate into improved patient safety in day-to-day actions and decisions. In practice, such an approach may be admirable yet futile.

A top-down risk approach therefore prioritizes the highest processes and their intended purpose. For companies, improving patient care should be aligned with the goals of the IVDR. It is therefore surprising that the EU does not invest more in improving communication between medical institutions and manufacturers to exchange information on the implementation of standards and their effectiveness. Medical institutions are in a prime position to advise on implementation since they can measure the effects of standards, and the endorsement of a legal entity for products and their users could be considered. Such collaboration would enable the placement of patient safety controls at a high-level process and allow PMS to focus on measuring product accuracy and the effectiveness of implemented controls.

Empiric logic, however, requires a vantage point. Lawmaking, on the other hand, often seems necessary and is frequently met with deadlines. In the urge to control content there may be a drive to document everything, and the extensive nature of the regulations may tempt the creation of equally extensive procedures in which every delimiting box is checked to showcase conformity to the standard, potentially overshadowing the goal of promoting patient safety. Enforcing controls to promote patient safety is valuable, but we must take an active stance, measuring their accuracy and effectiveness in practice.

The extent to which the IVDR will be implemented and actively practiced remains unknown. In this piece, we discussed the implications of applying such standards to software development at the intersection of research and diagnostics. We explored how software architecture choices contribute to quality assurance and how domain-based programming can establish quality control points. While we can attempt to safeguard quality through predefined logic and controls, the proof of a product's intended purpose can only be measured through discernible observations. Methods and controls should be subjected to empirical logic and adjusted based on newly obtained information to improve accuracy. However, it is important to recognize that the harmonious implementation of products and controls often comes with its own set of limitations. Using a top-down risk-based approach, we can prioritize and establish hierarchical logic to ensure quality. These controls remain subject to change where doing so improves the accuracy of the intended purpose or increases the detectability of failure modes. However, for the sake of patient safety, we must acknowledge our limitations and recognize that processes and contexts beyond our own offer higher value in effectiveness but require harmonious implementation.
