3.2. Enhanced SRGM: early defect prediction model (eDPM)

Typical SRGM techniques require defect data from the software test period. This limits their use in the early phases of software development, when important (and time-intensive) decisions about the development process must usually be made, such as the level of staffing, the amount of testing required, or the number of features to focus on. Given the industry trend toward very short software development lifecycles (e.g., agile development), it is essential to be able to make such decisions accurately very early in the development phase. Specifically, to determine the staffing requirements for development and test activities during the early planning phase, many projects now need to understand what the defect find curve will look like during the internal test period. Early software defect prediction is therefore needed for the early identification of software quality, cost overruns, and the optimal development strategy.

We propose a novel method, eDPM, for predicting defect arrival curves based on the feature arrival curve during the planning phase. The feature arrival curve often gives the number of sub-features for each feature of the project, together with the times when each sub-feature is expected to be completed. Such information is usually available during the development planning phase of the software development life cycle. Specifically, eDPM uses data from a previous release of the same product, together with the feature arrival curve for the upcoming release. To produce a reliability modeling approach that covers the whole development process, eDPM has been integrated into BRACE as an enhanced SRGM.

The feature arrival data is readily available for most projects.

Figure 6. A sample Q-Q plot for project B release 5 data.

Let $(x, y)$ represent a feature curve and $(x_{\mathrm{new}}, y_{\mathrm{new}})$ represent a defect arrival curve. We can move the feature curve to the right and closer to the defect arrival curve with the horizontal shift function in (5)

$$x_{\mathrm{new}} = \alpha + \beta x \tag{5}$$

where α and β are parameters intrinsic to an individual release. The parameter α represents the average time to find α defects, and β represents the additional delay in the defect find process, likely due to a test resource constraint or critical bugs. Next, we use a simple form for the vertical shift as shown in (6).

$$y_{\mathrm{new}} = \gamma y \tag{6}$$

The parameter γ is determined as the ratio of the defect count to the feature count used for the best fitted line in the Q-Q plot. It represents the number of defects per feature. Combining (5) and (6), we can transform the feature curve to represent the defect arrival curve. If previous release data is not available, we can use defect data from the initial test period. Figure 7 demonstrates that the feature ready curve is a good leading indicator for defect arrival curves.

3.2.2. Case studies

We will now provide four case studies to demonstrate the robustness of eDPM for practical uses. It should be pointed out that the term "feature" is used here in a generic sense to represent a sub-feature, epic, story, or sprint, depending on the availability of metrics for individual projects. Similarly, the term "release" represents a set of features defined for each software delivery. The release content continues to evolve over the software lifecycle. It is important to continuously monitor the release content and adjust the transformation functions to improve the prediction accuracy. While out of scope for this chapter, we have recently developed an algorithm which automates the estimation of parameters as new feature and defect data becomes available.

Software Quality Assurance

http://dx.doi.org/10.5772/intechopen.79839
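To make the transformation in (5) and (6) concrete, the following is a minimal Python sketch that applies the horizontal shift x_new = α + βx and the vertical shift y_new = γy to each point of a feature curve. The function name `transform_feature_curve`, the sample feature data, and the parameter values are illustrative assumptions and are not taken from the chapter or from BRACE. In practice α, β, and γ would be estimated from a previous release or from defect data in the initial test period.

```python
from typing import List, Sequence, Tuple


def transform_feature_curve(
    x: Sequence[float],
    y: Sequence[float],
    alpha: float,
    beta: float,
    gamma: float,
) -> Tuple[List[float], List[float]]:
    """Map a feature curve (x, y) to a predicted defect arrival curve.

    Applies the horizontal shift x_new = alpha + beta * x from (5) and
    the vertical shift y_new = gamma * y from (6), point by point.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    x_new = [alpha + beta * xi for xi in x]
    y_new = [gamma * yi for yi in y]
    return x_new, y_new


# Hypothetical data: cumulative features ready at the end of weeks 1..5.
weeks = [1.0, 2.0, 3.0, 4.0, 5.0]
features_ready = [2.0, 5.0, 9.0, 12.0, 14.0]

# Hypothetical parameter values, chosen only for illustration.
pred_weeks, pred_defects = transform_feature_curve(
    weeks, features_ready, alpha=2.0, beta=1.5, gamma=3.0
)

for w, d in zip(pred_weeks, pred_defects):
    print(f"week {w:.1f}: about {d:.0f} cumulative defects")
```

With these assumed values, the feature curve is shifted and stretched along the time axis and scaled by three defects per feature, so the point (week 1, 2 features) maps to (week 3.5, 6 defects). Because the release content evolves over the lifecycle, the same function can simply be re-run whenever the feature curve or the parameter estimates are updated.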
