*Proteoforms - Concept and Applications in Medical Sciences*

Cancer Genome Atlas, suggesting that the number of recognized recurrent mutations in AML alone is not sufficient to explain its heterogeneity. They demonstrated that the landscape of somatic variants in pediatric AML was markedly different from that reported in adults, highlighting the need for, and facilitating the development of, age-tailored targeted therapies for the treatment of pediatric AML [19, 20].

*2.2.2 Prognosis*

Among adult patients under 60 years of age, AML can be cured in 35–40% of cases, whereas the survival rate of patients older than 60 is only 5–15%. For older patients who are unable to receive intensive chemotherapy without unacceptable side effects, the prognosis is even more dismal, with a median survival of only 5–10 months [2]. Survival rates in the pediatric population have improved greatly, although OS rates of 65–70% are still much lower than those for pediatric ALL [21].

**3. Proteome differs from transcriptome**

The human genome is the total amount of DNA that each cell in the body contains, including an estimated 30,000–40,000 protein-coding genes. While the central dogma of biology formerly held that DNA is transcribed into messenger RNA (mRNA), which is then translated into protein, and that mRNA levels could be used to predict protein abundance, it is becoming increasingly clear that this view is overly simplistic, given our expanding knowledge of the effects of epigenetics, environmental influences, mRNA editing, alternative splicing and noncoding RNAs on gene expression. For instance, coding single-nucleotide polymorphisms and mutations can affect the final protein sequence and function, and endogenous proteolysis and mRNA splicing can generate different isoforms from the same set of nucleotides. Additionally, after translation of the RNA transcript, proteins undergo multiple modifications that affect protein function, localization, lifespan and activity. Together, this results in up to a million proteoforms.

One of the first studies, back in 1999, which compared a limited number of mRNAs and proteins in *Saccharomyces cerevisiae*, already concluded that the correlation between the two was only 0.36 [4, 22]. Even with the significant improvements in high-throughput genomic and proteomic approaches, this fundamental observation continues to be widely, though not universally, supported, as most studies still report a correlation coefficient between 0.17 and 0.40. For example, Mun et al. recently performed a correlation analysis of mRNA and protein log2-fold changes between gastric cancer tumor samples and adjacent normal tissues, using 6803 genes with protein and mRNA abundances available in at least 30% (≥24) of the patients. Of the 6803 genes, only 34.3% showed a significant (FDR < 0.01) positive correlation, with an average correlation coefficient of 0.28 [23]. Zang et al. performed an integrated proteogenomic analysis of human colon and rectal cancer samples, and while 89% of the samples showed a positive mRNA-protein correlation (of which only 32% were significantly correlated), the average correlation between mRNA transcript abundance and protein abundance was only 0.23 [24].

**4. Age-associated proteoforms in acute leukemia**

As mentioned above, the functional variant of a protein, the proteoform, is defined by genetics, mRNA editing, and PTMs. In particular in ALL, which peaks between 2

**4.1 Genetic variants**


Emerging genome-wide sequencing techniques have identified disease- and age-specific gene variants in acute leukemia. For example, Perez-Andreu et al. discovered a single nucleotide polymorphism (SNP), a single-base variant in the DNA sequence, in *GATA3* on 10p14 that was associated with susceptibility to ALL in adolescents and young adults, and that progressively increased with age [25]. Furthermore, genomic variants that occur in both pediatric and adult leukemia sometimes display a different phenotype at the protein level. As shown by Zuurbier et al., loss of *PTEN* protein due to the production of an unstable, truncated proteoform caused by a frameshift mutation or genomic deletion is frequently seen in T-cell ALL (predominantly in pediatric T-ALL). *PTEN* is often recognized as a tumor suppressor, but its behavior and relation to outcome are highly context dependent. *PTEN* abnormalities may impact *NOTCH1*, and a cohort of *PTEN*-mutated pediatric T-ALL patients (with loss of *PTEN* protein) that lacked *NOTCH1*-activating mutations had significantly fewer relapses compared to patients with activated *PTEN* and *NOTCH1* [26]. In contrast, another study showed that *PTEN* mutations without *NOTCH1* abnormalities were associated with poor prognosis in adults [37]. Thus, genomic mutations within the same gene do not always produce the same proteoform with the same function. Mutations can create a proteoform with a completely different function and can convert a protein from a tumor suppressor into a tumor driver [27]. Although genome-wide studies are very meaningful for detecting age- and disease-specific conditions, the net effect on the cell largely depends on the final proteoform produced (tumor suppressor or tumor driver) and the pathways in which it acts.
