3.1. Automated SRGM

As mentioned earlier, exponential models assume that defects can be found and resolved at a constant rate [10]. While this results in a simple and flexible model with well-understood assumptions, software development is not always stable, because changes sometimes have to be made to the processes. To ensure that defect prediction adapts to such trend changes, we apply the exponential prediction model to each wave of software defects. The idea of piecewise application of SRGMs is not entirely new; for example, a concept for evolving software content was originally discussed in [10]. This is, however, the first time that we have formulated it mathematically, developed an algorithm to automate the process, and implemented it in a cloud environment.

The mathematical model is a non-homogeneous Poisson process (NHPP) whose mean value function follows an exponential model. The tool applies NHPP exponential models piece-wise, as illustrated in Figure 4. The NHPP assumption allows the model parameters to be estimated from a set of defect data by the method of maximum likelihood, with confidence limits obtained from the normal approximation. As new test defect data become available, the tool continuously monitors and predicts residual (or remaining) defects at delivery. It then uses the last curve to predict the defects that will be found after delivery to the customer site.

To illustrate the model, consider a finite number, a, of defects, each of which is found and removed by time t with probability F(t), where F(t) is a cumulative distribution function. For a defect find process, N(t), the probability of finding n defects by time t is then given by the binomial distribution in (1).

$$P\{N(t) = n\} = \binom{a}{n} F(t)^n [1 - F(t)]^{a-n} \tag{1}$$


Software Quality Assurance


http://dx.doi.org/10.5772/intechopen.79839


In practice, the value of a is large, and therefore we can approximate (1) by a Poisson distribution with the mean value function, m(t), as given in (2).

$$P\{N(t) = n\} = m(t)^n \exp\left\{-m(t)\right\}/n! \tag{2}$$
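The quality of this approximation can be checked numerically. The following is a minimal sketch, not part of the tool itself: the values of a and F(t) are illustrative assumptions, and the function names are hypothetical. It compares the binomial probabilities in (1) with the Poisson probabilities in (2) for the same mean a F(t). The agreement is closest when a is large and F(t) is small.

```python
import math

def binomial_pmf(n, a, p):
    """Equation (1): probability of finding n of a defects, with p = F(t)."""
    return math.comb(a, n) * p**n * (1.0 - p) ** (a - n)

def poisson_pmf(n, mean):
    """Equation (2): Poisson probability with mean m(t)."""
    return math.exp(-mean) * mean**n / math.factorial(n)

# Illustrative assumptions, not values from the text.
a = 4500       # total number of defects
p = 0.002      # F(t) at some early time t
mean = a * p   # m(t) = a F(t)

# Largest pointwise difference between the two distributions for n = 0..30.
max_gap = max(abs(binomial_pmf(n, a, p) - poisson_pmf(n, mean)) for n in range(31))
print(f"largest pmf difference for n = 0..30: {max_gap:.2e}")
```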

Note that m(t) = aF(t) represents the average number of defects found by time t. An exponential model is described as an NHPP with the mean value function:

$$m(t) = a \{ 1 - \exp\left(-bt\right) \}\tag{3}$$

Figure 4. An example of a piece-wise application of NHPP model.

The parameters a and b represent the total number of defects in the software and the rate at which each defect is found, respectively. In other words, a total of a defects is assumed to be found, with the time to find each defect following an exponential distribution with rate b. Note that, in a piece-wise application, the parameter a represents the number of defects associated with each period. Since the mean value function is an exponential function of time, the model is called an exponential NHPP model. Taking the derivative of the mean value function, we obtain the corresponding defect intensity function, or defect rate, given in (4):

$$
\lambda(t) = ab \exp\left(-bt\right) \tag{4}
$$
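The maximum likelihood estimation mentioned earlier in this section can be sketched for a single period as follows. This is a minimal illustration under stated assumptions, not the tool's algorithm: it assumes that exact defect find times t_1, ..., t_n are observed up to time T (the tool may well use grouped weekly counts instead), it omits the confidence limits, and all function names and the synthetic data are hypothetical. For an NHPP the log-likelihood is the sum of ln λ(t_i) minus m(T). Setting its derivative with respect to a to zero gives a = n / {1 - exp(-bT)}, and b is then found by bisection on the remaining score equation.

```python
import math

def mean_value(t, a, b):
    """Equation (3): expected cumulative defects found by time t."""
    return a * (1.0 - math.exp(-b * t))

def intensity(t, a, b):
    """Equation (4): defect rate at time t."""
    return a * b * math.exp(-b * t)

def log_likelihood(times, T, a, b):
    """NHPP log-likelihood for exact find times observed on (0, T]."""
    return sum(math.log(intensity(t, a, b)) for t in times) - mean_value(T, a, b)

def score_b(b, times, T):
    """Derivative of the log-likelihood in b after substituting the optimal a."""
    n = len(times)
    return n / b - sum(times) - n * T / math.expm1(b * T)

def fit_exponential_nhpp(times, T, iterations=200):
    """Bisection for b > 0, then a in closed form (sketch, single period)."""
    lo, hi = 1e-6 / T, 50.0 / T  # assumed search bracket for b
    if not (score_b(lo, times, T) > 0.0 > score_b(hi, times, T)):
        raise ValueError("no solution with b > 0 inside the bracket")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score_b(mid, times, T) > 0.0:
            lo = mid
        else:
            hi = mid
    b = 0.5 * (lo + hi)
    a = len(times) / (1.0 - math.exp(-b * T))
    return a, b

# Synthetic find times (assumption): evenly spaced quantiles of an
# exponential distribution with rate 0.1, truncated at T = 30 weeks.
true_b, T, n = 0.1, 30.0, 200
scale = 1.0 - math.exp(-true_b * T)
times = [-math.log(1.0 - (i + 0.5) / n * scale) / true_b for i in range(n)]

a_hat, b_hat = fit_exponential_nhpp(times, T)
print(f"a = {a_hat:.1f}, b = {b_hat:.4f}")
```

A solution with b > 0 exists only when the average find time is less than T/2, that is, when defects arrive predominantly early in the period; otherwise the sketch raises an error rather than returning a zero or negative rate.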

It should be pointed out that if b is positive, m(t) converges exponentially to the positive value a, and λ(t) decreases exponentially. This is the typical trend for reliability growth. As b approaches zero and a tends to infinity (with the product ab remaining finite), m(t) becomes a straight line and λ(t) becomes constant, i.e., a stationary Poisson process. If both a and b are negative, both m(t) and λ(t) increase exponentially. Although b is positive most of the time, there are a few cases in which b tends to zero during site test and in-service periods, and in which b is negative in early test phases. Note that the basic assumption of a finite number of defects is violated if b is zero or negative. However, allowing such values is useful in explaining different trends for individual test periods within the same release. We will discuss this further with actual data later.
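The straight-line limit can be made explicit with a short worked derivation. As a sketch, take the product ab to be held at a constant value c while b approaches zero (the symbol c is introduced here only for illustration). Expanding the exponential in (3) and (4) then gives:

$$
m(t) = \frac{c}{b}\left\{1 - \exp(-bt)\right\} = \frac{c}{b}\left\{bt - \frac{(bt)^2}{2} + \cdots\right\} \longrightarrow ct, \qquad \lambda(t) = c\exp(-bt) \longrightarrow c \quad (b \to 0)
$$

The mean value function therefore grows linearly with slope c, and the defect rate is the constant c, which is the stationary Poisson process described above.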

Answers to use cases (a), (b) and (c) in Section 2 can be illustrated using Figure 5 as follows. The SRGM predicts 3700 defects by the delivery date. Assuming that the current date corresponds to week 19 (vertical blue line), we would therefore expect to find 1200 more defects in the 11 weeks remaining before delivery, provided the same test progress continues. Since the SRGM predicts a total of 4500 defects, of which 3700 are found by delivery, the residual defects number 800 (= 4500–3700), and the percentage of residual defects is 18% (= 800 / 4500). Based on our experience, we have determined thresholds on this percentage (residual defects relative to total defects) and provide color codes that indicate software quality and readiness to deliver. Specifically,

Figure 5. Example of software defect prediction.


48 Telecommunication Networks - Trends and Developments


readiness is given as green (implying 'good to go') if the percentage of residual defects is less than 15%, yellow if it is between 15% and 25%, and red if it is greater than 25%. In this example, the software falls in the yellow range, indicating that some caution is needed if the project decides to proceed with the planned delivery.
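The arithmetic of this example and the color coding can be written out as a short sketch. The function names are hypothetical and are not taken from the tool; the figures 4500 and 3700 and the 15% and 25% thresholds are those given in the text.

```python
def residual_percentage(total_predicted, predicted_at_delivery):
    """Residual defects as a percentage of total predicted defects."""
    residual = total_predicted - predicted_at_delivery
    return 100.0 * residual / total_predicted

def readiness_color(pct):
    """Map the residual percentage to the color codes described in the text."""
    if pct < 15.0:
        return "green"
    if pct <= 25.0:
        return "yellow"
    return "red"

pct = residual_percentage(4500, 3700)   # 800 residual defects out of 4500
color = readiness_color(pct)
print(f"{pct:.1f}% residual -> {color}")  # prints "17.8% residual -> yellow"
```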

As illustrated above, residual defects, which are derived from the defect arrival curve using the SRGM, play a key role in assessing software quality in terms of delivery readiness. Our recommendation is that readiness be rated green (implying 'good to go') if the percentage of residual defects is less than 15%, yellow if it is between 15% and 25%, and red if it is greater than 25%. In addition, it is important to track backlog defects at delivery, so as not to deliver known issues. Our recommendation is that all customer critical and major issues be resolved by delivery. We will address how to predict backlog defects in Section 3.2.2.
