**4.3 Statistical methods for evaluation of biomarkers in myelodysplastic syndrome**

Biostatistics gives us important tools to evaluate biomarkers in myelodysplastic syndrome and other diseases. Quantitative indices, estimates, hypothesis tests and survival tables are all useful for identifying biomarkers. We have already discussed quantitative indices and hypothesis tests. Let us make a few comments about estimates and survival tables.

In many situations, populations are so large that it is impossible to describe their central tendency and dispersion by studying 100% of their members, or even by studying a portion of the population large enough to justify treating sample statistics as population parameters. In other situations, clinicians may study a new phenomenon with little basis for determining a population parameter. In these cases, we use estimates. Two types of estimates of a population parameter can be used: a point estimate and an interval estimate. A point estimate is a single numerical value of a sample statistic used to estimate the corresponding population parameter. Point estimates are rarely reported alone because the value of a statistic, such as the sample mean, varies from sample to sample. For this reason, an interval estimate is typically used. An interval estimate is a range of values within which the parameter is likely to lie. Interval estimates are also called confidence intervals.
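To make the distinction concrete, the following is a minimal sketch of a point estimate and an interval estimate for a population mean. It assumes a normal approximation for the sampling distribution of the mean, and the sample values are hypothetical numbers chosen only for illustration; the function name `mean_confidence_interval` is ours. For small samples, a t-distribution quantile would give a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return (point_estimate, lower, upper) for the population mean.

    Assumption: normal approximation to the sampling distribution of the mean.
    """
    n = len(sample)
    point = mean(sample)                      # point estimate
    standard_error = stdev(sample) / sqrt(n)  # sample-to-sample variability of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return point, point - z * standard_error, point + z * standard_error


# Hypothetical measurements (illustrative only, not data from any study)
sample = [12.0, 18.5, 22.1, 9.4, 15.3, 20.7, 17.8, 14.2, 19.9, 16.1]
point, lower, upper = mean_confidence_interval(sample)
print(f"point estimate: {point:.2f}")
print(f"95% interval estimate: ({lower:.2f}, {upper:.2f})")
```

Raising the confidence level (for example to 0.99) widens the interval, which reflects the trade-off between confidence and precision.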

Survival tables are used to describe prognosis. Prognosis is a prediction of the future course of a disease following its onset (Fletcher et al., 1988). We can describe the prognosis of a disease over a fixed period of time (rates) or over varying periods of time (survival tables). Table 7 shows the common measures used to describe prognosis when we consider a fixed period of time.


Table 7. Common measures that describe prognosis.

Survival tables can handle situations in which patients enter a trial at different times and are followed for varying periods. The length of time in a trial is usually measured in days, weeks or months, and the end point may be, in the case of MDS, death or the reappearance of the disease. The usual method for constructing a survival table is the Kaplan-Meier method. The curve obtained from the data presented in a survival table is called a survival curve.
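As an illustration of how such a table is built, here is a minimal sketch of the Kaplan-Meier product-limit calculation. The follow-up times and event indicators are hypothetical, and the function name `kaplan_meier` is ours. An event value of 1 means the end point was observed, and 0 means the patient was censored (still being followed, or lost to follow-up, at that time).

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with an observed end point.

    times  : follow-up time for each patient (e.g. months)
    events : 1 if the end point was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0   # end points observed at time t
        leaving = 0    # patients leaving the risk set at time t (events + censored)
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            # Multiply by the conditional probability of surviving past t
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up data (illustrative only)
times = [2, 3, 3, 5, 8, 8, 12, 14]
events = [1, 1, 0, 1, 1, 0, 0, 1]
for t, s in kaplan_meier(times, events):
    print(f"t = {t:>2}: S(t) = {s:.3f}")
```

Plotting the resulting pairs as a step function gives the survival curve. Censored patients contribute to the number at risk up to their last follow-up time but never cause a drop in the curve.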

**5. Conclusion** 

The field of cancer epigenetics is evolving rapidly. Advances in the understanding of chromatin structure, histone modifications, DNA methylation and transcriptional activity have resulted in an increasingly integrated view of epigenetics. These discoveries have led to the development of new cancer treatments based on epigenetic therapies. MDS comprises a complex spectrum of hematopoietic stem cell disorders, in which the study of epigenetics has brought new knowledge about the development of the disease and its evolution to AML. Another important contribution of epigenetic studies in MDS was the introduction of treatments using hypomethylating drugs and histone deacetylase inhibitors. MDS may be considered a good model for studying epigenetics in cancer pathogenesis research and its clinical applicability. The identification of diagnostic and prognostic biomarkers in MDS will make possible the development of new classification and prognostic scoring systems and will help us understand the different pathways involved in MDS pathogenesis. With the advance of technologies involving epigenome projects, future research in epigenetic therapies will focus on the development of inhibitors with specificity to particular biomarkers.

Survival tables also allow us to compare two or more groups of patients. In this case, the first thing we should do is draw the survival curves for the two (or more) groups on the same graph. Statistical methods are important here, because we cannot make judgments simply on the basis of the amount of separation between the curves: a small difference may be statistically significant if the sample size is large, and a large difference may not be if the sample size is small. There are two main methods for determining whether the differences are statistically significant: the Wilcoxon rank sum test and the log rank test. Figure 6 shows comparative survival curves for pediatric and adult patients diagnosed with MDS and treated with allogeneic hematopoietic stem cell transplantation (HSCT).

Fig. 6. Overall survival of primary MDS patients treated with allogeneic HSCT, pediatric patients versus adult patients.
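The comparison of two survival curves can be sketched with a two-group log rank test. The code below is a minimal illustration, not the analysis behind Figure 6: the follow-up data are hypothetical, the function name `log_rank_test` is ours, and the p-value uses the chi-square distribution with one degree of freedom, computed here through the complementary error function.

```python
from math import erfc, sqrt


def log_rank_test(times_a, events_a, times_b, events_b):
    """Two-group log rank test. Returns (chi_square, p_value) with 1 degree of freedom.

    events_* : 1 if the end point was observed, 0 if censored.
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t in each group
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        # End points observed exactly at time t
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n  # expected events in group A if the curves were equal
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    chi_square = (observed_a - expected_a) ** 2 / variance
    p_value = erfc(sqrt(chi_square / 2))  # upper tail of chi-square with 1 df
    return chi_square, p_value


# Hypothetical follow-up times in months (illustrative only)
group_a_times = [6, 9, 12, 15, 20, 24, 30]
group_a_events = [1, 0, 1, 0, 1, 0, 0]
group_b_times = [3, 5, 7, 8, 11, 13, 18]
group_b_events = [1, 1, 1, 0, 1, 1, 0]

chi_square, p_value = log_rank_test(group_a_times, group_a_events,
                                    group_b_times, group_b_events)
print(f"chi-square = {chi_square:.3f}, p-value = {p_value:.4f}")
```

The statistic compares, at every time an end point occurs, the number of events observed in one group with the number expected if both groups shared the same survival curve. This is why the test accounts for sample size rather than for the visual separation of the curves alone.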

In the study by Rodrigues et al. (2010), the authors examined the methylation status of the p15INK4B and p16INK4A genes in 47 pediatric patients with MDS, its correlation with subtype, and the role of p15INK4B and p16INK4A in the evolution of MDS toward AML. The results obtained suggest that methylation of these genes is an epigenetic biomarker of pediatric disease evolution. The authors used some of the statistical tools presented here. For example, the correlation between p15INK4B gene methylation status and subtypes of pediatric primary MDS, considering the initial stage RC and the later stages RAEB and RAEB-t, was assessed by the chi-square test, which is a nonparametric test. The statistical analysis suggested that the frequency of p15INK4B gene methylation was significantly higher in later stages of disease than in the initial stage, with p-value < 0.003. The correlation between p16INK4A gene methylation status and subtypes of pediatric primary MDS was also assessed by the chi-square test, with a slight modification: a correction factor, which is necessary when the number of observations is small. This is known as chi-square with continuity correction or chi-square with Yates' correction. The results obtained suggested that p16INK4A gene methylation was more frequent in the subtypes characterizing the advanced stages of MDS, with p-value < 0.05.
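The effect of the continuity correction can be seen in a short sketch for a 2x2 table (methylated versus unmethylated, initial versus later stage). The counts below are hypothetical and are not the data of Rodrigues et al. (2010); the function name `chi_square_2x2` is ours.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi_square, p_value) with 1 degree of freedom.
    With yates=True, Yates' continuity correction is applied.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction: shrink the discrepancy by n/2, never below zero
        diff = max(0.0, diff - n / 2)
    chi_square = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(chi_square / 2))  # upper tail of chi-square with 1 df
    return chi_square, p_value


# Hypothetical counts (illustrative only):
#                 methylated  unmethylated
# initial stage        3           17
# later stages        11            9
plain = chi_square_2x2(3, 17, 11, 9)
corrected = chi_square_2x2(3, 17, 11, 9, yates=True)
print(f"without correction: chi-square = {plain[0]:.3f}, p = {plain[1]:.4f}")
print(f"with Yates:         chi-square = {corrected[0]:.3f}, p = {corrected[1]:.4f}")
```

The corrected statistic is never larger than the uncorrected one, so the correction makes the test more conservative when counts are small. In practice a statistical package would normally be used; for example, SciPy's `scipy.stats.chi2_contingency` applies this correction to 2x2 tables by default.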

A quantitative analysis to evaluate whether the percentage of p15INK4B methylation correlated with MDS subtype was also performed by Rodrigues et al. (2010). The percentage of p15INK4B methylation was higher in RAEB and RAEB-t than in RC. The authors used one-factor ANOVA and obtained a p-value < 0.0001. The same result was obtained for the p16INK4A gene. The authors used multi-factor ANOVA to verify that the QMS-PCR method was more sensitive than the COBRA method, obtaining a p-value < 0.0001, although both methods were accurate in showing a correlation between the subtypes of disease and the level of methylation.
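The idea behind one-factor ANOVA is to compare the variability between group means with the variability within groups. The following minimal sketch computes the F statistic for three groups; the methylation percentages are hypothetical, and the function name `one_way_anova_f` is ours. It stops at the F statistic, because turning F into a p-value requires the F distribution, which is not in the Python standard library.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-factor ANOVA."""
    k = len(groups)
    all_values = [x for g in groups for x in g]
    n = len(all_values)
    grand_mean = mean(all_values)

    # Variability of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, n - k
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return f_statistic, df_between, df_within


# Hypothetical methylation percentages per subtype (illustrative only)
rc = [5.0, 8.0, 6.5, 7.2, 4.8]
raeb = [21.0, 25.5, 19.8, 23.1, 27.4]
raeb_t = [30.2, 35.1, 28.7, 33.3, 31.9]

f_statistic, df_between, df_within = one_way_anova_f(rc, raeb, raeb_t)
print(f"F({df_between}, {df_within}) = {f_statistic:.2f}")
```

A large F value indicates that the group means differ by more than would be expected from the within-group scatter. If SciPy is available, `scipy.stats.f_oneway` returns both the F statistic and its p-value.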

The Mann-Whitney test was also used. The authors calculated the mean time of disease evolution in patients who had p15INK4B methylation and in patients with no p15INK4B methylation. The results were 4.6 months and 14.6 months, respectively. The mean time of disease evolution for patients who had p15INK4B methylation was therefore roughly one third of that for patients who had no p15INK4B methylation, a difference that was statistically significant by the Mann-Whitney test (p-value < 0.02).
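The Mann-Whitney test works on the ranks of the observations rather than on their values, which makes it suitable for small samples that may not follow a normal distribution. The sketch below computes the U statistic and an approximate two-sided p-value. The evolution times are hypothetical and are not the data of the study; the function name `mann_whitney_u` is ours, and the p-value relies on a normal approximation without tie or continuity correction, which is rough for small samples.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Return (U, two_sided_p) for two independent samples.

    Assumption: normal approximation, no tie or continuity correction.
    """
    # Pool both samples, remembering which group each value came from (0 = x, 1 = y)
    pooled = sorted((value, group) for group, sample in enumerate((x, y)) for value in sample)

    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # tied values share the average of ranks i+1..j
        rank_sum_x += average_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    n1, n2 = len(x), len(y)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u_x, n1 * n2 - u_x)

    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return u, p_value


# Hypothetical times of disease evolution in months (illustrative only)
methylated = [2, 3, 4, 5, 6, 8]
unmethylated = [7, 10, 12, 15, 18, 24]

u, p_value = mann_whitney_u(methylated, unmethylated)
print(f"U = {u}, approximate two-sided p-value = {p_value:.4f}")
```

A small U means that the values of one group tend to rank consistently below those of the other. Statistical packages such as SciPy (`scipy.stats.mannwhitneyu`) can compute exact p-values for small samples.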

Statistics uses many concepts and theorems that are not familiar to medical professionals, such as the null hypothesis, regression, parametric tests, the central limit theorem, Bayes' theorem and so on. Of course, medical professionals can set aside the complexity of such concepts and the mathematics behind the theory, although only mathematics can explain *rigorously* why these techniques really work. Nevertheless, we must say that Statistics is an important tool that *can help* in making decisions and must be used when the statistical outcomes are clinically meaningful. Accumulated experience and specific knowledge must be combined with results from statistical tests to assess the usefulness of a particular outcome or medical decision.
