consistent manner, it can be used as a good leading indicator for predicting defects, as shown in Figure 11.

Figure 11. eDPM case study #4: Story points vs. integration test defects.

It was later confirmed that there were two major process changes made during the reported period. As eDPM was applied to the data (Transformation #1), we identified a trend change after several months, beyond which the transformation was no longer valid. It turns out that this trend change occurred when a major process change was made. Another set of transformation functions, Transformation #2, was then used, and the predicted values closely matched the actual defect data. Several months later we encountered another trend change, which turned out to be caused by a second major process change, and a third set, Transformation #3, was applied. With the successive use of eDPM we demonstrated that defects can be predicted with reasonable accuracy for the entire reported period.

Another benefit of eDPM is that it helps quantify process improvement. One of the parameters, γ, as described in Eq. (6), represents defects per story point in this case study. By comparing the values of γ between two transformation periods, we can calculate the relative change in γ, which represents the percent improvement due to the process change. Using this approach, we were able to quantify improvements of 10 and 70% for the first and second process changes, respectively.

Case Study #5—Defect closure data: In this case study we consider a project with both defect arrival and closure data, in addition to sub-feature arrival data. The project is still in an early test phase, but project management wants to know whether the defect backlog can reach zero by the delivery date. First, we predict the defect arrival curve based on the sub-feature arrival data and the actual defect arrivals so far, using eDPM. We then predict the closure curve from the predicted arrival curve, again using eDPM. By combining the predicted arrival and closure curves, we can calculate the defect backlog by subtracting the closure curve from the arrival curve. Figure 12 shows the predicted arrival and closure curves, along with the predicted defect backlog curve. The project team can now identify actions to bring the backlog curve closer to zero at the delivery date.

54 Telecommunication Networks - Trends and Developments

4. Customer defect prediction

In Section 3, we demonstrated that the last curve prior to software delivery represents the final product from which the total number of defects and residual defects can be calculated. Previous release data or historical data from other projects will be helpful for determining the percent of delivered defects to be found during the operation period. See [16] for detailed discussions.
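As a minimal sketch of that calculation, the snippet below estimates residual defects from the final curve and applies a historical find-rate to estimate how many will surface in the field. All numbers (totals, find-rate) are invented for illustration, not taken from the case studies.

```python
# Sketch: estimating residual defects and expected field defects from the
# final predicted defect curve before delivery.
# All numeric values below are hypothetical, for illustration only.

total_predicted = 500    # asymptote of the last defect curve before delivery
found_in_test = 430      # cumulative defects actually found by delivery

residual = total_predicted - found_in_test   # defects delivered to the field

# Historical data (previous releases or similar projects) suggests what
# fraction of delivered defects customers typically find during operation.
historical_find_rate = 0.6                   # hypothetical value

expected_field_defects = residual * historical_find_rate
print(residual, expected_field_defects)
```

The same arithmetic applies to any release once the final curve's asymptote and the historical find-rate are known.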

The assumption that the defect curve can be extended from the development phase into the operational phase (e.g. [23–25]) does not hold in practice, as there are usually discontinuities due to changes in the intensity of testing, as well as operational conditions not always being exactly the same as test environments [7].
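That discontinuity can be made concrete by estimating the defect rate separately for the test and operation periods; their ratio is the defect conversion factor d = λ/λf of Eq. (7). The weekly counts below are invented, and the simple window averages stand in for whatever rate estimator a project actually uses.

```python
# Sketch: the defect rate typically drops discontinuously at delivery, so a
# single fitted curve cannot be extended across the boundary.
# Weekly defect counts below are invented for illustration.

test_weeks = [12, 10, 9, 8, 8, 7]   # defects/week near the end of system test
field_weeks = [2, 1, 2, 1, 1, 1]    # defects/week after delivery

# Window averages as crude estimates of the two rates.
lam_te = sum(test_weeks) / len(test_weeks)    # estimate of λ(te), end of test
lam_f = sum(field_weeks) / len(field_weeks)   # estimate of λf, field rate

d = lam_te / lam_f   # defect conversion factor, as in Eq. (7)
print(lam_te, lam_f, d)
```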

To highlight the procedure and results, we will use defect data taken from Project B. Figure 13 shows a cumulative view of customer defect prediction. Note that the curve should always lie above the actual data after delivery. The difference between the curve and the last actual data point indicates the defects not found in this release; they will become a part of the next release. That is, not all delivered defects will be found during the operation period. The figure also demonstrates that the actual data follow the prediction, indicating the importance of historical data in predicting post-delivery defects.

Figure 13. Cumulative view of project B customer defect prediction vs. actual.

Figure 14. Weekly view of project B customer defect prediction vs. actual.

Figure 14 illustrates the difference in defect rate, λ(te), at the end of the test phase, te, and during the operation period, λf. It can be observed that there is a difference in defect rate, likely due to differences in the intensity of testing during the two periods, as well as possible differences between a test environment and a field operational state.

The field defect rate, λf, is measured in terms of failures per year. The defect conversion factor may be expressed as shown in Eq. (7):

d = λ / λf (7)

Software Quality Assurance
http://dx.doi.org/10.5772/intechopen.79839

Reliability and availability are among the key factors that are used to define the quality of software in practice. In what follows, we formulate mathematical representations for both these factors and use (anonymised/scaled) data from project A to demonstrate the various aspects of software failure, reliability and availability, together with the predictions that are made.

5.2. Availability

The availability of software can be expressed using cycles of uninterrupted working intervals (Uptime), followed by a repair period after a failure has occurred (Downtime), as shown in Eq. (8).

A = Uptime / (Uptime + Downtime) = 1 − Downtime / (Uptime + Downtime) (8)

Considering that availability is typically evaluated over a 1-year period, Uptime + Downtime = 1 year = (60 × 24 × 365) minutes = 525,600 minutes. Therefore, as an example, to achieve system availability of 5 9's (i.e. A = 99.999%) the maximum allowed downtime would be 5.26 minutes/year.

5.3. Reliability

On the other hand, software reliability is the probability that the software has not failed after a time period t. Therefore, reliability is a function of t, and can be denoted as R(t). R(t) is typically modeled using an exponential distribution whose parameter is the failure rate λ, as shown in Eq. (9).

R(t) = exp(−λt) (9)

5.4. Discussion

It is important to note that while both reliability and availability are measures of software quality, they have different technical meanings. In particular, availability is determined by both uptime and downtime, while reliability is only influenced by uptime. This implies that two software releases or systems having the same failure rate would have the same reliability, but might have different availabilities. Achieving high availability generally requires automated ways of recovering from failures, for example, through redundancy or rebooting, so that downtime is minimized. Software failures from which the system is able to automatically recover are known as covered failures. On the other hand, if a system fails to automatically detect and/or recover from a failure, such a failure is known as an uncovered failure, and usually leads to customer-perceived defects. In systems where recovery time is significant, a coverage factor – the proportion of all failures that are covered failures – is defined. However, in most practical applications, determining covered failures requires specialized tools. Therefore, typical failure counts usually only consider the uncovered defects.
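The availability and reliability formulas above (Eqs. (8) and (9)) can be checked with a short script. The failure rate and downtime values below are invented examples, not project data.

```python
import math

# Eq. (8): availability over one year, computed from downtime in minutes.
MINUTES_PER_YEAR = 60 * 24 * 365   # 525,600 minutes

def availability(downtime_minutes: float) -> float:
    uptime = MINUTES_PER_YEAR - downtime_minutes
    return uptime / MINUTES_PER_YEAR

# Five 9's (A = 99.999%) allows roughly 5.26 minutes of downtime per year.
max_downtime = (1 - 0.99999) * MINUTES_PER_YEAR
print(round(max_downtime, 2))      # approximately 5.26

# Eq. (9): reliability R(t) = exp(-lam * t), the probability that the
# software has not failed by time t.
lam = 2.0   # hypothetical failure rate: 2 failures per year
t = 0.5     # half a year
print(math.exp(-lam * t))          # probability of surviving 6 months
```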
