**Etiological Research of Childhood Acute Leukemia with Cluster and Clustering Analysis**

David Aldebarán Duarte-Rodríguez, Richard J.Q. McNally, Juan Carlos Núñez-Enríquez, Arturo Fajardo-Gutiérrez and Juan Manuel Mejía-Aranguré

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/54456

**1. Introduction**

#### **1.1. The investigation of clusters of diseases**


A disease cluster, or clustering of disease, is an aggregation of cases of a particular disease that occurs within a group of people, a geographic area or a period of time, and that is larger than researchers would expect given the natural history of the disease and chance fluctuations. So far, the study of clusters has led to the identification of health problems that have spatial and temporal dimensions.

According to Elliot and Best, "The study of the geographic patterns of a specific disease is part of the classic triad in descriptive epidemiology characterized for the time, the person, and the place" [1]. Therefore, these studies are considered part of *Spatial analysis* in Epidemiology because they can be interpreted in a temporal and spatial context; they should not be confused with ecological studies, as commonly happens.

Space-time clustering studies describe populations in historical and geographical contexts, not individuals or the particularities of a population, such as risk factors. The results of these studies must be interpreted in terms of a period of time and a geographic area. Space-time clustering studies are classified as follows:

**1.** Spatial cluster, also called geographic cluster. An excess of cases or events in a geographic area, which can range from a small settlement to a large region.

© 2013 Aldebarán Duarte-Rodríguez et al.; licensee InTech. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**2.** Temporal cluster. An excess of cases or events observed in a limited period of time. The results of this type of study must be interpreted chronologically. For example, a group of unrelated individuals whose dates of birth approximately coincide would form a temporal cluster; however, these individuals do not necessarily live in the same geographic area and could live far apart. The individuals in a cluster may coincide in time of birth, time of diagnosis, or the period in which they moved to a new city.


**3.** Space-time cluster. An excess of cases or events in both space and time. In other words, a space-time cluster is observed when cases are geographically close and occur in the same period of time.
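The idea of cases being close in both space and time can be illustrated with a simple pair count in the style of the Knox test, a technique this section does not name. The following is a minimal sketch under stated assumptions: the case layout `(x_km, y_km, day)`, the thresholds `max_km` and `max_days`, the function names and the toy data are all hypothetical, and the p-value comes from randomly reassigning the observed times to the observed places.

```python
from itertools import combinations
from math import hypot
import random


def knox_statistic(cases, max_km, max_days):
    """Count pairs of cases that are close in BOTH space and time.

    Each case is a tuple (x_km, y_km, day); this layout is an assumption.
    """
    close_pairs = 0
    for (x1, y1, t1), (x2, y2, t2) in combinations(cases, 2):
        near_in_space = hypot(x1 - x2, y1 - y2) <= max_km
        near_in_time = abs(t1 - t2) <= max_days
        if near_in_space and near_in_time:
            close_pairs += 1
    return close_pairs


def knox_permutation_p(cases, max_km, max_days, n_perm=999, seed=0):
    """Monte Carlo p-value: shuffle times over places and recount."""
    rng = random.Random(seed)
    observed = knox_statistic(cases, max_km, max_days)
    locations = [(x, y) for x, y, _ in cases]
    times = [t for _, _, t in cases]
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(times)  # breaks any link between place and time
        shuffled = [(x, y, t) for (x, y), t in zip(locations, times)]
        if knox_statistic(shuffled, max_km, max_days) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_perm + 1)


# Toy data (invented): three cases close in space and time, three scattered.
toy_cases = [
    (0.0, 0.0, 0), (0.5, 0.0, 3), (0.2, 0.3, 5),
    (10.0, 10.0, 100), (20.0, 5.0, 200), (30.0, 30.0, 300),
]
observed, p_value = knox_permutation_p(toy_cases, max_km=1.0, max_days=7)
print(observed, p_value)  # observed is 3: the three pairs among the first three cases
```

The thresholds play the role of the geographic and temporal scale: changing `max_km` or `max_days` changes which pairs count as close, so they should be fixed before looking at the data.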

The most common space-time clustering study designs are spatial clusters and space-time clustering. Importantly, these types of studies should not be confused with statistical cluster analysis methods. Both techniques are similar in that they search for characteristics of groups or other elements. However, their objectives differ: space-time clustering studies search for groups of people in both dimensions, whereas statistical cluster analysis explores the strength of the relationship between words, ideas or interrelated concepts. The distinction between the two methods is blurred further because it is common to find them under the same term [2].

The term spatial dimension is used to refer to the spatial cluster. Spatial cluster studies investigate associations between individuals distributed in a geographic space. A spatial cluster implies the presence of one or more environmental factors in the etiology of the disease. One example worth mentioning, from veterinary medicine, is that of Poljak et al., who published a study of influenza in pigs using the Cuzick and Edwards method. They searched for several clusters but found significant results for only two strains, influenza H3N2 Sw/Col/77 and H3N2 Sw/Tex/98, in an area near a documented region of isolation of avian influenza. From an epidemiological perspective, the source of the spread of these types of influenza in pig herds was an environmental factor. Evidence suggests that the proximity between both types of farms favored the formation of a cluster of swine influenza [3].
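To make the Cuzick and Edwards approach more concrete, the sketch below computes the nearest-neighbour statistic usually associated with it: for each case, count how many of its k nearest neighbours (among cases and controls together) are also cases, and sum over all cases. This is an illustration only, not the analysis of Poljak et al.; the coordinates, labels, function names and the label-permutation p-value are assumptions made for the example.

```python
from math import hypot
import random


def cuzick_edwards_tk(points, is_case, k):
    """Sum, over all cases, of the number of cases among their k nearest neighbours."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        if not is_case[i]:
            continue
        others = [j for j in range(len(points)) if j != i]
        # Order every other point by its distance to point i.
        others.sort(key=lambda j: hypot(points[j][0] - xi, points[j][1] - yi))
        total += sum(1 for j in others[:k] if is_case[j])
    return total


def label_permutation_p(points, is_case, k, n_perm=999, seed=0):
    """Monte Carlo p-value: shuffle case/control labels over fixed locations."""
    rng = random.Random(seed)
    observed = cuzick_edwards_tk(points, is_case, k)
    labels = list(is_case)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if cuzick_edwards_tk(points, labels, k) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_perm + 1)


# Toy data (invented): three cases grouped together, three controls far away.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_labels = [True, True, True, False, False, False]
t_k, p_value = label_permutation_p(toy_points, toy_labels, k=2)
print(t_k, p_value)  # t_k is 6: each case has two cases as nearest neighbours
```

Because controls stand in for the population at risk, a large value of the statistic indicates that cases are closer to one another than the distribution of the underlying population alone would explain.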

There are specific tests for each scenario. Whether a cluster actually exists depends on the underlying population. These studies describe the spatial or temporal behavior of a population and make inferences that describe an area or a period. The findings are then extrapolated to the population under study. One advantage of these results is that they can be explained visually when displayed on a map or a time curve.

Moreover, cluster and ecological studies differ from the classic major epidemiological designs such as cohort, case-control and cross-sectional studies. However, these designs can be combined to study a particular population. For example, information from a cross-sectional study does not require major changes to be used in a cluster study. The main objective of a cluster study is the description of the population, and the principal objectives of a cross-sectional study are sometimes similar. On the other hand, information collected longitudinally, for instance from a cohort study, can be used for a temporal or space-time cluster study, and data from cohort studies are frequently used in this way. For example, Gaudart et al. used a cohort study to identify clusters, which allowed them to define an epidemiological surveillance tool [4]. The objective of this research was to identify areas of high risk for malaria using a dynamic cohort followed from 1996 to 2001. The researchers identified clusters in the cohort with Kulldorff's technique. Their results identified six clusters of high risk of infection by *P. falciparum*, and they concluded that detecting clusters is advantageous for generating maps of high risk for malaria. A cohort is characterized by collecting data over a period of time. Nonetheless, the passing of time in collecting and analyzing data is not exclusive to the cohort study: there are cluster analysis techniques able to analyze the effect of time on the formation of clusters, so in cluster studies this condition can also occur with temporal data. Data from case-control research can also be used to detect clusters. Alternatively, case-control studies are generally used to verify the findings of a cluster study. The techniques that combine case-control studies with cluster studies are currently improving through, for example, new scanning techniques [5]; they can also be used to evaluate the mobility of individuals [6]. At the end of this chapter, we present a study of clusters with information from a case-control study.


Historically, studies of clusters were used to find out what caused a heterogeneous distribution of cases; in other words, to better understand why cases were more concentrated in a particular space or time. Cluster detection studies are, in fact, conducted with this question in mind. However, on several occasions researchers have taken the initiative to anticipate it and, without waiting for evidence that a disease tends to form an aggregate of cases, have investigated whether this phenomenon is part of the natural history of the disease.

This difference is related to the distinction between clusters and clustering. Clustering means the overall propensity of cases to form groups. Clusters, in contrast, are excess aggregations of people, usually in a small and well-defined area, and generally comprise few cases [7]. Thus, with two different approaches, there is another classification of such studies: *post hoc*, based on the observation of past events; and *a priori*, found as the result of a specific statistical exercise. *A priori* investigation of a propensity for clustering is relatively new and can be useful in the interpretation of some *post hoc* cluster observations.

Studies of *post hoc* clusters take place because of public concern about a possible cluster. The problem is that the cases have been identified through personal knowledge, which may introduce an inherent bias. Indeed, the cases may not even have the same disease. *A priori* surveillance schemes, in contrast, systematically monitor a region for geographical or temporal excesses. There are specialized methods for detecting overall clustering (space-time, spatial or temporal) and for finding specific clusters. Cluster studies can also be classified in other ways, as shown in Figure 1.


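**Figure 1.** Types of clusters of diseases. According to *the presence of the cluster through time*: transient; permanent or prolonged. According to *the population in which clusters are identified*: familial; household; school; neighborhood; occupational. Of 'known' or inferred causation.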

The particularity of cluster analysis is that it shows the heterogeneous spatial distribution of cases, or differences in the occurrence of cases. It is generally accepted that the explanation of this heterogeneity lies in the unequal distribution of the causal factors of disease in time and space. The underlying causal factors may vary by location, such as city, country, neighborhood, or rural area. Putative exposures may also have changed through time.

The value of a cluster analysis is that it shows the hypothetical consequences of any possible factor on the spatial or temporal distribution of cases in the population. A spatial or temporal difference in the incidence of a disease could suggest the presence of an environmental factor; that is, when a cluster is detected, an environmental factor may be involved in the development of the health problem. At the level of individuals, genetic factors are important in determining which people get sick. However, when we need to explain disease at the population level, environmental factors and lifestyle have a higher relative weight [8]. Given this conclusion, the question is: what are these factors?

## **2. Spatial epidemiology**

#### **2.1. Definition and concepts**

Among the various techniques and methods used in epidemiology, there is an area called *Spatial epidemiology* (or *Geographical epidemiology*). Spatial epidemiology complements the study designs considered traditional in epidemiology (cross-sectional, cohort and case-control studies). Spatial epidemiology has two objectives: 1) to identify possible risk factors that contribute to the spatial variation of the disease, and 2) to highlight unusual groups that may reveal something beyond what is already known through other research channels. Finally, in these studies, clusters are defined as groups of persons [1,9].

Furthermore, the study of geographical patterns of disease depends on the geographic and temporal scale [1]. It depends on the geographic scale because, for example, in a big city one kilometer can be sufficient to determine the presence of a cluster, whereas if the territory under study is only a section of a city, the assessment may be limited to a few tens of meters. It depends on the temporal scale because, if the health problem results from a very clear and definite exposure, clusters of cases may be observable after a few months. One example is the effect of radiation from Chernobyl on the increase in the prevalence of Down's syndrome in the children of Belarus [10]. This cluster could be related to radiation exposure and was confined to a single month. Another example comes from studies that monitor acute outbreaks lasting a few days, where a cluster can be detected in a much smaller window of time. For instance, one study conducted in Hong Kong illustrates the spatial and temporal dynamics of human influenza A (H1N1). The authors detected space-time heterogeneity in the incidence of the disease. The chronological description of the spread of the disease across the territory of Hong Kong is remarkable. In this study, space-time clusters of people with the disease were detected from the third week until week 22, and the transformation of the clusters was described weekly. Although the researchers evaluated space-time clustering rather than temporal clusters, their results demonstrate how clusters can be detected within periods of a few weeks, or even a few days [11].


A study of clusters (or clustering) can support, or fail to support, an etiological hypothesis. The premises of the hypothesis must agree with the design principles of the study. For example, in one study McNally et al. assumed that, according to the hypothesis put forward by Smith [12], a high incidence of acute leukemia in children is linked with an infectious exposure that occurred *in utero*. Under this premise, space-time clustering of children with leukemia should have appeared when it was sought according to place and date of birth, because these are consistent with the assumptions of Smith's hypothesis. Had it appeared, this result could have been interpreted as indirect support for the idea that acute leukemia in children is in fact linked with an infectious exposure *in utero*, prior to disease development. They did not find this result, so the study did not support Smith's hypothesis [13]. This illustrates how the design principles of a cluster (or clustering) study should coincide with the assumptions of a previously stated hypothesis.

The foundation of these studies is derived from other sciences. According to Lawson, Spatial epidemiology concerns the analysis of the spatial or geographical incidence of disease. Spatial epidemiology keeps a close link with Spatial analysis, which forms an entire branch and school of thought within geographical science. Lawson noted that Spatial epidemiology is a field, or discipline, concerned with the use and interpretation of maps of the location of cases of disease. He also pointed out that all matters related to the production of maps and the statistical analysis of mapped data belong to the study of Epidemiology, and that many epidemiological concepts play an important role in their analysis.


In Spatial epidemiology it is extremely important that epidemiological considerations, proper to the practice of Epidemiology, are taken into account in all the techniques and methods explained. Elliot and Best stated that differences in the distribution of disease incidence could provide important clues for research into the etiology of disease. Later studies could then be carried out with methods designed to analyze the population at the individual level (e.g. cohort or case-control studies) [1].

Spatial epidemiological analysis has three peculiarities inherent to the source of its data, which explain the logic of clusters. First, epidemiological analysis has statistics, and in particular spatial statistics, at the core of its method. This is because the data are geo-referenced (located on a map) and may be interlinked as a result of their location. The data can be at the personal level, always associated with their spatial location. Secondly, in Epidemiology spatial data are generally discrete, i.e. they are not quantifiable on a continuous scale. The occurrence of these phenomena is in part a consequence of previous events and also depends substantially on individual independence due to random processes; that is, they are stochastic processes. Furthermore, these data (for example, the location of a child with leukemia) behave according to processes associated with discrete probability distributions. Put another way, these are processes whose phenomena take clearly separated values. Finally, all the information used in Spatial epidemiology is linked to conventional epidemiological studies, which leads to the derivation of models and methods related to spatial analysis [9]. In these studies a null hypothesis describes the "normal" variation in the occurrence of cases of disease or of a health problem. It is compared against an alternative hypothesis, which explains the difference in question.

There should be no confusion: this type of study shares many features with other epidemiological studies. For example, as the size of the sample examined increases, studies of clusters also yield less uncertainty when making inferences. A stratified cluster study can be carried out, in which the cases of a disease are divided into groups suitable for the research purposes. For example, cluster detection can be done separately for boys and girls, or by age groups that reflect the different susceptibility of some individuals relative to others. The results of a study of clusters can also be enhanced by using more variables. These variables need not be included in the operations of the cluster detection analysis itself. They can indicate the environmental conditions of the population. For example, one can know the socioeconomic status of the cases studied, the population density of the locality where they live, or the industrial activity there, to name a few. Finally, the study outcomes can be explained in light of ...

The importance of maps in epidemiological work is clear. However, Spatial epidemiology does not restrict its activity to cartographic work. There is a detailed set of tasks for spatial or geographical analysis in Epidemiology: cluster (or clustering) studies, models of exposure to sources of risk, disease mapping, field surveys of information, ecological analysis, and models of infectious diseases, among other studies [9]. Analyses can be carried out within related areas, as comparisons, or as analyses of surfaces and areas, of lines, and of points. Each category is further subdivided into more types of studies [14]. Clustering studies look for general patterns in a region, whereas cluster studies have a focused representation. The study of spatial and space-time clusters (or clustering) corresponds to the study of points and areas, and to focused cluster and general clustering studies. Both types of studies have a geographical interpretation. Focused cluster studies seek to detect a very specific cluster, e.g., cases distinctly clustered around a defined point located in a territory. General clustering and focused cluster analyses have their own statistical tests. Two concepts whose interpretation may be mistaken are cluster and clustering. A cluster is a group of children that arises in a small and well-defined area and usually comprises few cases. Clustering means the general propensity of cases to form groups [13].

Tango, in his book Statistical Methods for Disease Clustering [15], classified cluster studies according to the geographical approach to the problem. If the intention is to recognize the occurrence of a conglomerate over a territory and/or a given time, the test is general. If, instead, there is a predetermined point, such as a given event at a defined location, the test is focused. The radiation released at Chernobyl illustrates the latter case: the event is the nuclear disaster, the place is the nuclear plant, and the conglomerate is sought around that point, after that event. The cluster detected in this example was a focused cluster.

A focused study of clusters is a specific study of clusters. In these studies, the location of any cluster, if one exists on the map, is of primary importance. There is a source of exposure, which is a fixed place in the territory, and the incidence of the disease is known. The defining feature of the search is that any cluster detected should lie around the sources of exposure that are under suspicion. This relationship is often interpreted as a causal interaction, e.g., as the relationship between exposure from a nuclear plant and the spatial distribution of cases around it. Indirectly, the results of these studies are interpreted as supporting evidence that an exposure located in one place can generate a disease in the population subject to that exposure. Tango drew a distinction: when the location of possible clusters is expected *a priori*, the study is specific and focused; when the possible location is unknown, the specific study is not focused [15].
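The logic of a focused test can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the test proposed by Tango or by any other author cited here: it assumes that coordinates are available for cases and for a comparison group of controls, uses the mean distance to the suspected source as the statistic, and obtains a Monte Carlo p-value by randomly relabelling cases and controls. The function names and the coordinates are hypothetical.

```python
import math
import random


def mean_distance(points, source):
    """Average Euclidean distance from each point to the suspected source."""
    return sum(math.dist(p, source) for p in points) / len(points)


def focused_test(cases, controls, source, n_perm=999, seed=0):
    """Monte Carlo test of 'cases lie closer to the source than expected'.

    Returns the observed mean distance of the cases and a one-sided p-value
    obtained by random relabelling of the pooled cases and controls.
    """
    rng = random.Random(seed)
    observed = mean_distance(cases, source)
    pooled = list(cases) + list(controls)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Treat the first len(cases) shuffled points as pseudo-cases.
        if mean_distance(pooled[:len(cases)], source) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical data: five cases near the source, twenty controls far away.
source = (0.0, 0.0)
cases = [(0.1 * i, 0.0) for i in range(1, 6)]
controls = [(5.0 + i, 5.0) for i in range(20)]
observed, p_value = focused_test(cases, controls, source)
print(f"mean distance of cases = {observed:.2f}, p = {p_value:.3f}")
```

A general test, by contrast, would not take `source` as an input at all; it would ask only whether the cases are grouped anywhere on the map.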

For his part, Lawson classifies these two approaches under other names: general studies of clusters and specific studies [9]. He defines both concepts in more detail and points out the usefulness of the studies from the viewpoint of the map. According to Lawson, a general study of clusters is the assessment of the map of a complete territory in order to find out whether there are clusters in that place. If there is no cluster, as proposed by the null hypothesis, the map should show no difference in the distribution of the disease. The alternative hypothesis should then provide specific mechanisms to explain the grouping that the map shows; it rests on a preconceived notion of how these clusters arise. Such studies may also be called non-specific, since they are not required to identify the place where the clusters are located, but only to identify whether there is a pattern of grouping into clusters.

120 Clinical Epidemiology of Acute Lymphoblastic Leukemia - From the Molecules to the Clinic

production of maps and statistical analysis of mapped data should be dedicated to study in Epidemiology. Furthermore, it pointed out that many epidemiological concepts play an important role in their analysis.

It is extremely important in Spatial epidemiology that epidemiological considerations, the proper exercise of Epidemiology, are taken into account in all the techniques and methods explained. Elliot and Best stated that differences in the distribution of disease incidence could produce important clues for research into the etiology of disease. Later studies could then be carried out by methods designed to analyze the population at an individual level (e.g., cohort or case-control studies) [1].

Spatial epidemiological analysis has three peculiarities inherent to the source of its data, which explain the logic of cluster studies. First, it has statistics, and in particular spatial statistics, at the core of its method. This is because the data are geo-referenced (located on a map) and may be interlinked as a result of their location. The data can be at the personal level, always associated with a spatial location. Second, spatial data in Epidemiology are generally discrete, i.e., they are not quantified on a continuous scale. The occurrence of these phenomena is in part a consequence of previous events, and it also depends on an important degree of individual independence due to random processes; that is, they are stochastic processes. Furthermore, these data (for example, the location of a child with leukemia) behave according to processes associated with discrete probability distributions; put another way, the phenomena take clearly separated values. Finally, all the information used in Spatial epidemiology is linked to conventional studies of Epidemiology, which leads to the derivation of models and methods related to spatial analysis [9]. In these studies a null hypothesis describes the "normal" variation in the number of cases of disease or of health problems. It is compared against an alternative hypothesis, which explains the difference in question.
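As a minimal sketch of how such a null hypothesis can be made operational, the example below assumes that the number of cases in an area follows a Poisson distribution, one common discrete model for rare events; the chapter itself does not prescribe this distribution. The function name and the counts are hypothetical.

```python
import math


def poisson_upper_tail(observed, expected):
    """P(X >= observed) when X follows a Poisson distribution with mean `expected`."""
    below = sum(
        math.exp(-expected) * expected**k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below


# Hypothetical area: 2.0 cases expected under "normal" variation, 6 observed.
p_value = poisson_upper_tail(6, 2.0)
print(f"P(X >= 6 | expected = 2.0) = {p_value:.4f}")  # about 0.017
```

A small probability indicates that the observed count would be unusual under the null hypothesis; it does not by itself identify the cause of the excess.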

There should be no confusion. This type of study shares many features with other epidemiological studies. For example, with a larger sample, cluster studies also yield less uncertainty when making inferences. A stratified study of conglomerates can be made, in which the cases of a disease are divided into groups suited to the research purpose. For example, cluster detection can be carried out separately for boys and girls, or by age group, to reflect the different susceptibility of some individuals compared with others. The results of a cluster study can also be enhanced by using more variables. These variables need not be included in the cluster detection analysis itself; they can describe the environmental conditions of the population. For example, one can record the socioeconomic status of the cases studied, the population density of the locality where they live, or the industrial activity there, to name a few. The study outcomes can then be explained in light of these variables, also under the logic of stratified analysis if required. On the other hand, cluster studies are sensitive to data quality. If the database on which the analysis is based is not of good quality, the final conclusions should take these shortcomings into account, or the analysis should not be started at all. Other deficiencies that affect the validity of the results include underestimation as a result of misdiagnosis or of diagnoses made by different methods. Also, excessive division and subdivision of the study population into groups or strata can fragment the sample into sets too small to be analyzed with sufficient statistical power. Similarly, these studies are not free from the danger of research bias, whether by chance or by systematic error.


#### **2.2. Limitations**

From the epidemiological discipline, two important mistakes can occur in these studies: the fallacy of aggregation and the ecological fallacy. These concepts are defined as follows:

**1.** Fallacy of aggregation: the misapplication of a causal explanation at the individual level when the relation was observed at the group level. It is considered a kind of ecological fallacy [16].

**2.** Ecological fallacy: it has two meanings. The first is very similar to the aforementioned fallacy of aggregation and is sometimes taken as synonymous with it. The second, more detailed, defines it as an error of inference due to the failure to distinguish between different levels of organization. "A correlation between variables that are based on characteristics of a group, not necessarily reproduced between variables based on characteristics of individuals; an association that is given at a level could be gone in the other, or even be reversed" [16].

In other words, the conclusions, if not verified by other research designs, should be limited to population-level inferences. For instance, when a cluster study is used in conjunction with a case-control study, the hypotheses it generates may have more immediate applications, such as the identification of possible risk factors [17]. Otherwise, no generalization beyond the population level should be inferred from the results of a cluster study.

A space-time cluster study has the advantage that the associations between cases that are close in both time and space can provide explanations for the coincidence of leukemia cases in a given place and at a given time. This feature is also characteristic of infectious events. On the other hand, when only spatial clusters are considered, their possible causes can potentially only be environmental, and an infectious cause is harder to consider. The explanations for their formation must be sought strictly through spatial reasoning, which is less specific.

One more limitation is the lack of consistent information between the outcomes of similar studies. Sometimes it is necessary to work with this drawback, in addition to the influence of bias. Such bias can be reported and noted, provided that its effect is somewhat predictable. For example, in one study the lack of data on workers who arrived in the study area before 1983, together with the lack of information from other data sources, could have biased the results [18].

Another limitation concerns the geographical and historical principles with which the researcher must work when making interpretations. Analyzing retrospective data carries uncertainty. For example, mortality data for children with acute leukemia appear to be biased and may not be appropriate for cluster studies, because the interval between the date of illness onset and the date of death varies among cases, and because people are likely to move between onset and death [19].
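The second meaning of the ecological fallacy defined at the start of this section, an association that holds at one level and is reversed at another, can be made concrete with a small numerical sketch. The data below are entirely hypothetical and were constructed only to produce the reversal; `pearson` is an illustrative helper, not a function from any cited method.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical exposure (x) and outcome (y) for individuals in three areas.
areas = {
    "A": ([1, 2, 3], [3, 2, 1]),
    "B": ([4, 5, 6], [6, 5, 4]),
    "C": ([7, 8, 9], [9, 8, 7]),
}

# Individual level: within every area the association is negative.
within = {name: pearson(x, y) for name, (x, y) in areas.items()}

# Group level: the area means are positively associated.
mean_x = [sum(x) / len(x) for x, _ in areas.values()]
mean_y = [sum(y) / len(y) for _, y in areas.values()]
between = pearson(mean_x, mean_y)

print(within)   # each value is -1.0
print(between)  # 1.0
```

An inference drawn from the area means alone would attribute to individuals an association that, in these constructed data, runs in the opposite direction within every area.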

If one is not careful, the unit of analysis can change from the population to the territory. In itself this is not bad, but the two kinds of evidence should not be combined. For example, Bellec et al. carried out a spatial analysis, setting aside the population analysis, and the results were properly interpreted from a spatial logic focused on the areas under study [20]. Only afterwards was their meaning for the population explained. Related to this, the challenges to be addressed soon include the study of areas that are too small, the study of areas that are too large, and the use of biomarkers to verify risk factors. An example of the limitations imposed by geographical and historical boundaries is the definition of what counts as a significant result in both the spatial and temporal dimensions. The danger is that the chosen limits may leave out of the results several cases that are in fact linked. One test, the Knox test, shows this disadvantage [21].
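The dependence of the Knox test on its limits can be seen in a minimal sketch. The statistic counts the pairs of cases that are close in both space and time, where "close" is defined by critical distances chosen by the analyst; the p-value here is obtained by permuting the times, which is one common way of assessing it. The function names, the critical values, and the data are assumptions made for illustration.

```python
import itertools
import math
import random


def knox_statistic(coords, times, s_crit, t_crit):
    """Number of case pairs within s_crit in space and t_crit in time."""
    close = 0
    for i, j in itertools.combinations(range(len(coords)), 2):
        near_in_space = math.dist(coords[i], coords[j]) <= s_crit
        near_in_time = abs(times[i] - times[j]) <= t_crit
        if near_in_space and near_in_time:
            close += 1
    return close


def knox_test(coords, times, s_crit, t_crit, n_perm=999, seed=0):
    """Observed Knox statistic and a permutation p-value (times are shuffled)."""
    rng = random.Random(seed)
    observed = knox_statistic(coords, times, s_crit, t_crit)
    shuffled = list(times)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if knox_statistic(coords, shuffled, s_crit, t_crit) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical cases: coordinates in km, dates of diagnosis in days.
coords = [(0, 0), (0, 1), (10, 10), (10, 11)]
times = [0, 1, 100, 101]

print(knox_statistic(coords, times, s_crit=2, t_crit=5))     # 2 close pairs
print(knox_statistic(coords, times, s_crit=20, t_crit=200))  # all 6 pairs
print(knox_test(coords, times, s_crit=2, t_crit=5))
```

The same four cases yield two close pairs or six depending only on the critical distances, which is why these limits need to be justified before the analysis.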

One test, known as the Moran test, illustrates the effect of a test's spatial limits and how they can bias correlation tests. Rogerson's method indicates something similar. This could be due to the difficulty of finding differences in a geographic area that is too small or too large. It is a disadvantage in settings such as a city, where analysis with very large spatial limits could reveal enormous variability; on the other hand, for diseases that are rare in the general population, such as childhood leukemia, finding a cluster in a very large area will be difficult. For this reason too, it is preferable to work with the existing cases and to choose a convenient method, or a combination of methods.
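For reference, Moran's I, the coefficient on which the Moran test is based, can be computed as in the sketch below. The spatial weights matrix is the analyst's choice and encodes the spatial limits discussed above; here it is assumed to be a simple adjacency matrix for four areas arranged in a line, and the area values are hypothetical.

```python
def morans_i(values, weights):
    """Moran's I for area values and a square matrix of spatial weights.

    weights[i][j] > 0 means areas i and j are treated as neighbours.
    Assumes the values are not all identical (non-zero variance).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / total_weight) * cross / sum(d * d for d in dev)


# Four areas in a line; each area is a neighbour of the adjacent ones only.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

print(morans_i([1, 1, 5, 5], adjacency))  # similar values adjacent: about 0.33
print(morans_i([1, 5, 1, 5], adjacency))  # alternating values: -1.0
```

Changing which areas count as neighbours changes the coefficient for the same data, which is one way the choice of spatial limits enters the result.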

Finally, not all studies succeed in detecting the presence of complete clusters. This is due in part to the use of an unsuitable method or to an inadequate sample size, as reported by Glass et al. [19] and Bellec et al. [20]. The work of Birch et al. gives a similar warning [21].

## **3. Studies of clustering and clusters of children with leukemia**

An old idea has always troubled those who investigate the causes of childhood leukemia. Research on the etiology of childhood acute leukemia has a history of about 100 years, yet no factor that fully explains the origin of the disease has been found. This research has identified two risk factors for the development of leukemia, intense ionizing radiation and certain congenital genetic syndromes, which account for only 10% of cases [22]. Amin pointed to one more factor: exposure to chemotherapeutic agents. However, these factors explain only a small percentage of all cases of leukemia. He presented several studies that examined the complexity of other risk factors: "early-life exposures to infectious agents, parental, fetal, or childhood exposures to environmental toxins, parental occupational exposures to radiation or chemicals; parental medical conditions during pregnancy or before conception; maternal diet during pregnancy, early postnatal feeding patterns and diet, and maternal reproductive" [23]. Finally, he added that environmental factors may play a role in cancer development in children, and that too many cases concentrated in one geographical area, a cluster, could be evidence of this. These studies have supported the etiological investigation of childhood leukemia, especially acute lymphoblastic leukemia. In the course of this work, attention has been paid to various causes, such as ionizing radiation, contaminated water, the petrochemical industry, and exposure to agrochemicals [24]. The least controversial idea is that clusters are evidence that environmental factors are generally involved in the development of cancer, including childhood leukemia.

Ward, for example, proposed as early as 1917 a theory of an infectious origin of childhood acute leukemia [25]. It is a very old hypothesis that, to this day, has been neither proven nor disproved. The study of clustering and clusters is often used to support the possibility that infections lie behind the onset of childhood leukemia. A cluster of children with leukemia has been interpreted as evidence that infections during the child's lifetime are implicated in the development of the disease [7]. Tango, in 2010, said "in the search for evidence of whether a disease such as leukemia, is indeed an infectious disease and, therefore, a viral etiology, the focus will be on whether the cases are grouped into clusters" [15]. Specifically, concerning the analysis of space-time clustering, McNally and Eden clarified that if infections were implicated in the etiology of childhood leukemia, the geographic distribution of these children might show evidence of clustering in space and, under certain conditions, might also show space-time clustering [26].

In fact, the presence of a cluster of children with leukemia has been interpreted in a broader and more diverse sense. A cluster of children with leukemia would suggest that one or more factors, not just infection, are involved in the origin of the disease. As already mentioned, genetic predisposition may also play a role. It is very important to reiterate that the results of such studies should be interpreted with special care. Cluster studies, by themselves, do not reveal causal agents in the sense of identifying the risk factors involved in a health problem. Studies of clustering and clusters support the search for relationships that are implicit in a theory of the etiology of a disease, but they are not used to make formal determinations of the risk factors implied. Hence, the inferences drawn from the results of these studies are not without controversy.

A literature review of the last fifteen years (1997-2012) found more than 20 publications, of which 18 focused on acute lymphoblastic leukemia (ALL). Two questions have repeatedly been addressed by cluster studies: the first is whether infections play an important role in the development of childhood acute lymphoblastic leukemia, and the second is whether environmental risk factors that are not well specified influence disease onset. However, some researchers have considered infection a type of environmental exposure, and the two terms may therefore have been used interchangeably in some research.

Cluster studies pursue several kinds of objectives; for example, the objectives can be descriptive, or they can follow from posing an "a priori" hypothesis. Of the 18 studies conducted in children with ALL, eight had a descriptive purpose, proposed a methodological improvement, or only sought to detect at least one cluster in the territory studied [12,26–31]. Among the objectives of cluster studies there are also some that are rare but very interesting, such as exploring the presence of risk factors that could be common to different types of childhood cancer [33], or, more specifically, looking for a cluster of children within a particular subtype of leukemia (pre-B ALL) [21]. Moreover, the cluster studies mentioned so far did not consider any risk factor from the start of the investigation. By contrast, other publications proposed a risk factor that could potentially lead to the development of leukemia, which allowed them to pose a working hypothesis, also called "a priori", with the sole purpose of finding scientific evidence [7,17,24,34–38]. As mentioned above, one of the risk factors considered most important, and proposed in different studies, is the role of infections in the development of leukemia, which gives rise to the so-called infectious hypothesis [17,34–36,39]. Another relevant risk factor that has been considered is the so-called (unspecified) environmental risk factor [7,37]. For example, a 1997 study by Petridou et al. suggested that environmental factors may influence the development of leukemia, but it did not specify these environmental factors or their possible relationship with infections [38].

According to the possible risk factor involved in the development of ALL that they reference in their findings, the cluster studies conducted so far can be divided into three groups: 1) studies that point to infections [17,21,24,29,30,33,34,36,37]; 2) studies that propose an unspecified environmental factor [27,38]; and 3) studies that consider both factors, infections and an unspecified environmental factor [7,13,28].

Likewise, cluster studies have favored the generation of new knowledge about how to conduct such studies [21,31], and have even generated proposals both for subsequent original studies and for confirming previous findings [32,35].

A longstanding discussion concerning the definition, existence, frequency and interpretation of clusters of childhood leukemia (CL) remains unresolved [40]. In 2009, zur Hausen et al. stated that it is very difficult to accept that mere clusters of cases of leukemia, Hodgkin lymphoma, or even multiple myeloma indicate that these tumors have their origin in systemic infections. It has been proposed that viral infections should appear in a geographical area according to a random pattern of geographical distribution. The same author argued that if simple processes in non-infectious disease lead to cancer onset, further modifications of the genome, caused by viruses, are still needed at the cellular level. However, he also considered the possibility that clusters may be the expression of a mutagenic factor in a family or, probably, in a small community or region exposed to the factor [41].

The debate is not over. Controversies about the usefulness of cluster studies for understanding leukemia in children are constant. "Most clusters do not have evidence for obvious, prolonged and biologically plausible exposures. The etiology is obscure or unknown. Very rarely they can lead to a better understanding of causation, usually in situations with well-documented and heavy local contamination" [42]. There are

The objectives that want to answer research questions of cluster studies are several, for example, the objectives can be descriptive, or may be the result of pose a hypothesis "a priori". Of the 18 studies conducted in children with ALL, eight had a descriptive pur‐

involved in the development of cancer, including childhood leukemia.

124 Clinical Epidemiology of Acute Lymphoblastic Leukemia - From the Molecules to the Clinic

show space-time clustering [26].

of these studies are not without controversy.

interchangeably in some research.

According to the possible risk factor for ALL that their findings point to, the cluster studies conducted so far can be divided into three groups: 1) studies that suggest infections [17,21,24,29,30,33,34,36,37]; 2) studies proposing an unspecified environmental factor [27,38]; and 3) studies that consider both factors, infections and an unspecified environmental factor [7,13,28].

Likewise, cluster studies have favored the generation of new knowledge about how to conduct such studies [21,31], and have even generated proposals for subsequent original studies to confirm previous findings [32,35].

A longstanding debate concerning the definition, existence, frequency and interpretation of clusters of childhood leukemia (CL) remains unresolved [40]. In 2009, zur Hausen et al. said that it is very difficult to understand how mere clusters of cases of leukemia, Hodgkin lymphoma, or even multiple myeloma would indicate that these tumors have their origin in systemic infections. It has been proposed that viral infections should be revealed in a geographical area according to a random pattern of geographical distribution. The same author argued that if the simple processes in non-infectious disease lead to cancer onset, as further modifications of the genome are needed at the cellular level, caused by viruses. However, he also considered the possibility that clusters may be the expression of a mutagenic factor in a family or, probably, in a small community or region exposed to the factor [41].

The debate is not over. The controversies surrounding the usefulness of cluster studies for understanding leukemia in children are constant. "Most clusters do not have evidence for obvious, prolonged and biologically plausible exposures. The etiology is obscure or unknown. Very rarely they can lead to a better understanding of causation, usually in situations with well-documented and heavy local contamination" [42]. There are intermediate positions, which neither affirm nor discard the possibility of a cluster. Law, in 2008, stated that "[...] it is difficult to see how these clusters provide evidence for infectious disease being involved in the etiology of childhood leukemia" [43]. But he clarified that the positive results of a study of clusters are, in fact, used as a proxy for a possible etiology. In addition, he relied on evidence generated by other studies, such as the peak age between 2 and 5 years, the incidence of the disease or its increase over time, and seasonal variations in the incidence of leukemia. These findings have been identified as potentially indicative of the role of infections [43].

Perhaps studies of space-time clustering leave less doubt when their results are interpreted. Possibly, this is because they are tightly bounded by the two dimensions examined by these techniques. It must be remembered that this type of study seeks to distinguish clustering patterns of cases in time and in space simultaneously. In 2006, McNally, Alexander and Bithell began a study in the hope of detecting space-time clustering of children with cancer in the United Kingdom. They predicted that if an antecedent infection was involved in cancer development, clusters of this type should be revealed in the territory of the island, provided the infection (either viral or bacterial) was not ubiquitous or endemic. Otherwise there could be differences in spatial or temporal dimensions. The authors clarified, however, that this would only occur when the delay between exposure and cancer diagnosis was short, or at least relatively constant. In the first scenario they wanted to test that, in the etiology of some childhood cancers, the timing of onset of this disease possibly masks the ability to be a rare response to infection [37].
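The idea of looking for closeness in time and in space simultaneously can be illustrated with a Knox-type test, one of the methods named later in this section. The Python sketch below is only an illustration and is not the procedure used in the study cited above [37]: the function names, the distance and time thresholds, and the toy data are all assumptions. It counts pairs of cases that are close in both dimensions and compares that count with the counts obtained after randomly reassigning the dates to the locations.

```python
import math
import random


def knox_statistic(coords, times, space_limit, time_limit):
    """Count pairs of cases that are close in both space and time."""
    close_pairs = 0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            near_in_space = math.dist(coords[i], coords[j]) <= space_limit
            near_in_time = abs(times[i] - times[j]) <= time_limit
            if near_in_space and near_in_time:
                close_pairs += 1
    return close_pairs


def knox_test(coords, times, space_limit, time_limit, n_perm=999, seed=0):
    """Permutation p-value: shuffling the dates over the locations breaks any
    space-time link while leaving the purely spatial and purely temporal
    patterns unchanged."""
    rng = random.Random(seed)
    observed = knox_statistic(coords, times, space_limit, time_limit)
    shuffled = list(times)
    as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if knox_statistic(coords, shuffled, space_limit, time_limit) >= observed:
            as_extreme += 1
    return observed, (as_extreme + 1) / (n_perm + 1)


# Hypothetical toy data: (x, y) in km and dates in days (not real cases).
coords = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0), (10.0, 10.0)]
times = [0, 1, 100, 101, 200, 300]
observed, p_value = knox_test(coords, times, space_limit=1.0, time_limit=5)
print(observed, p_value)
```

Depending on whether the dates and places supplied are those of birth or of diagnosis, the same calculation addresses the different combinations of variables discussed in this section.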

Furthermore, McNally et al. suggested that infections may act in the etiology of certain types of childhood cancer. They also relied on the fact that several studies conducted elsewhere in the world had found space-time clustering of children with acute leukemia, especially acute lymphoblastic leukemia, with similar conclusions. They began to change their language, from infections to environmental exposures. Three years later, in 2009, they presented a retrospective reflection on the study mentioned in the preceding paragraph, and described it in the following words: "These findings provided support for the involvement of environmental agents in etiological processes occurring close to diagnosis" [13]. In effect, they confirmed a shift toward extending the suspicion to the sum of environmental factors, rather than just infection.

In that study, space-time clustering was found among children with various types of cancer. The children were registered in a database from 1969 to 1993. The cancers included leukemia. The clusters were searched for in two ways. The first looked for clusters of children who matched in both their place of residence at birth and their date of birth. The second looked for clusters according to the children's place of birth and the time of diagnosis, regardless of date of birth. Each result has a different interpretation. By residence and date of birth, there were clusters of cases with Hodgkin lymphoma and central nervous system tumors. When searched according to the residence at birth, but considering the date of diagnosis, there were space-time clusters of leukemia cases (specifically for children aged 1 to 4 years) and also of Wilms tumor and Hodgkin lymphoma. Interpretations for each type of cancer have their nuances, but the meanings of space-time clusters in the above variables are as follows. When clusters of cases matched in both time and place of birth, the interpretation given is that this conjunction supports the possible involvement of infections in the etiology of the disease studied. In particular, the space-time clusters of children with Hodgkin lymphoma suggest that a relevant etiological exposure occurred among children at similar ages after birth, or in the in utero period. Moreover, since the dates are the time of birth and diagnosis, this would indicate that there was a heterogeneous latent period from the time of exposure until the time of diagnosis. Clusters of children with central nervous system tumors were interpreted with caution. According to the authors, this finding only strengthens the possibility that infections are implicated in the development of the tumors mentioned.

On the other hand, when the space-time clusters matched the place of birth and date of diagnosis rather than date of birth, the conclusion was different. For example, clusters of cases with non-Hodgkin lymphoma (NHL) suggest an exposure that resulted in the onset of this disease at similar stages before diagnosis. Childhood leukemia received a special mention. According to McNally et al., in a previous study [37] the authors had found clusters of children with leukemia by place and date of diagnosis, whereas in the study cited here they found clusters by place of birth and date of diagnosis. In no case were clusters of children found by location and date of birth. Again, the results supported the infectious hypothesis of childhood leukemia. The result is restricted since, although children with leukemia were between 1 and 4 years of age in the first study, this outcome was found only among children with ALL, the most common type; in the second study, it was found among children with acute leukemia overall.

With regard to leukemia clusters in children, McNally and colleagues reported that there are clusters when they are sought under the variables of time and place of birth [21]. Mulder et al. found clusters around environmental factors: the cases were grouped when analyzed by exposure to petroleum products and pesticides, and they even found a link with having swum in a pond contaminated by a petrochemical spill in previous years [44]. Petridou's team found a correspondence between the ages of cases and their place of residence: in urban areas there were clusters of children between 0 and 4 years, excluding those over 5 years, while in rural areas the ages were higher [45]. In England, clusters were found both by date and place of birth of the cases and by date and place of diagnosis in the same group [17]. Gustafsson and Carstensen, in contrast, dismissed clustering by place and date of diagnosis, but found it by date and place of birth [29]. Gilman came to similar results, finding clusters by date and place of diagnosis, and by date and place of birth, with the added finding that cancer clusters in solid tumors could be ruled out [46]. Finally, Alexander sought and found clusters with variables defined from another perspective: susceptible cases and infected cases [47].

There are also studies of clustering and clusters with negative outcomes. In France, Bellec found no clusters in a study that used data from the national registry, even though many different methods were used to detect clusters: Potthoff-Whittinghill, Moran, Knox and Kulldorff [20]. Dockerty and colleagues [48], and Alexander et al. [49], found nothing when the analysis was based only on raw geographic, demographic and diagnostic information. There are certainly limitations to this type of study; specifically, as early as 1970 it was warned that a number of cases over a long period could lead to detections of an artificial nature [50].

The infectious etiology of acute leukemia has also been examined with other study designs. Wartenberg et al., in 2004, sought to study the infectious origin of leukemia by testing a hypothesis developed for this purpose (the Kinlen hypothesis). The study was conducted from an ecological point of view. This type of study is characterized in that its inferences can only be applied to the environment as a whole, and it cannot support categorical causal statements about the population being studied [51]. Another ecological study, carried out by Knox [52], revealed an association between the prevalence of childhood cancers (including leukemia) and the geographic distribution of air pollutants. A third study, also ecological [53], compared cases of people with different cancers (and ages) in a city in Wales. The authors expected the prevalence of cancers to decrease as people lived at greater distances from a source of pollution (the petrochemical industry). There was an inverse correspondence between distance and the incidence of some cancers, but not the incidence or mortality of leukemia.

The relationship between environmental factors and the development of cancer, including leukemia, has been extensively studied, and the outcomes, far from discouraging the search, prompt further investigation.

The studies of Lehtinen et al. [54], Bogdanovic et al. [55], Roman et al. [56] and Gilham et al. [57] looked for associations between viral agents and the development of leukemia. These studies used a case-control design. Lehtinen hypothesized about *in utero* infection with Epstein-Barr virus and human herpes virus. There were no significant results. Bogdanovic analyzed the relationship between the Epstein-Barr virus and its reactivation in the mother, showing a possible association with childhood acute lymphoblastic leukemia. Roman found positive relationships between the incidence of disease caused by these viruses and leukemia. Gilham assumed that a large exposure to infection, when the child attends day care, was associated with protection that reduced the development of childhood leukemia; the results led to the conclusion that reduced exposure to infections during the first months of the child's life increases the risk of developing acute lymphoblastic leukemia.

The study of clusters suggests possible etiologies. The interest is that these putative risk factors can then be assessed in more detail. For example, cluster analysis is often combined with a case-control study. This gives more relevant results, as relationships between risk factors and disease are measured [44,48,58–61]. In addition, cluster detection techniques can be compared with other techniques using the same data [20,62,63].

When spatial distributions are unusual, it is possible to speculate on the etiological implications [49]. A study of clusters can evaluate multicausal factors. Bellec's findings supported the hypothesis that a community's geographic isolation and low density, possibly combined with the mixture of different populations, may play an important role in the etiology of leukemia. It would not have been possible to consider geographic isolation as a risk factor using any other technique. However, the same author suggests that a new research question should be to determine whether this phenomenon is specific to one age group or diagnosis, and proposes that future statistical models could allow further investigation and better understanding of these findings, especially with respect to the role of population density and population mixing [20].

## **4. Analysis of data from Mexico City**

Acute leukemia among children in Mexico City has been studied for over a decade. Through these studies, we know that its incidence is among the highest in the world. In 2011, Pérez Saldívar et al. reported an incidence of 57.6 cases per million children [64]. This high incidence, coupled with the large population of the city (more than eight million in 2010 in the territory of the Federal District, and more than 20 million people in the metropolitan area of Mexico City in the same year), translates into more than 200 children with leukemia each year. These children and adolescents are treated in nine tertiary hospitals, the highest level of specialist medical care in the Mexican health system. It has been estimated that these hospitals serve approximately 97.5% of the cases of childhood acute leukemia in Mexico City [65].

For over ten years it was suspected that the spatial distribution of children with leukemia in Mexico City was heterogeneous. In 2000, a descriptive longitudinal study conducted by researchers at the Instituto Mexicano del Seguro Social (Mexican Social Security Institute) [66] found morbidity standardized rates (MSR) that suggested a spatial concentration of cases of childhood leukemia. For acute lymphoblastic leukemia (ALL), the MSR were highest in the south of Mexico City; for acute myeloblastic leukemia (AML), the MSR were highest in the west. In addition, further investigations revealed some striking coincidences: among patients with acute leukemia (AL), some were immediate neighbors, suggesting that an environmental factor could be promoting the development of the disease [67]. It was therefore decided to conduct a study of spatial clustering in Mexico City, to test the hypothesis that children with leukemia are grouped into clusters that reflect the aforementioned heterogeneous spatial distribution.

Information was extracted on individuals aged 0-14 years, diagnosed with ALL between 2006 and 2007, who were resident in Mexico City (Federal District). A total of 224 incident cases were identified. We also included 224 children (controls) without leukemia, other cancers, genetic malformations or asthma. The controls were matched to the cases by sex, age and health institution of origin. We located the home addresses, with an accuracy of 0.1 km, where the children resided at diagnosis of leukemia or, for those without the disease, at the time of the interview. The cases were recruited between 2006 and 2007, whereas recruiting the controls required a much longer period, from 1998 to 2011. Because of this discrepancy, we did not attempt to detect clusters in the temporal dimension. We used Kulldorff's scan statistic, based on a Bernoulli model, to identify individual clusters. The complete study region was scanned by constructing a two-dimensional circular window. The window size was varied so that it included at most 50% of the entire geographical area. The variable circular window was centered on the geo-reference of each case [68]. The method has been used previously in an analysis of leukaemia in Sweden [69]. Statistical significance (P<0.05) was evaluated using one-sided tests and 99999 simulations. Only one large statistically significant cluster was identified (see Fig. 2) (O=98, E=74.66, O/E=1.313, p=0.01325). There were no statistically significant secondary clusters. This finding indicates that locally varying environmental factors may be implicated in the origin of ALL in Mexico City. However, the possibility that chance or variable levels of ascertainment may play a role cannot be excluded.
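The calculation behind a Bernoulli scan of this kind can be sketched as follows. This is only an illustrative outline, not the SaTScan implementation used in the study: the function names are hypothetical, the likelihood ratio is written in the usual textbook form for a single candidate window, and the p-value uses the common Monte Carlo rank convention, which is assumed here rather than taken from the chapter. The window counts in the last line are invented for illustration; only O=98 and E=74.66 come from the text.

```python
import math


def bernoulli_llr(c, n, C, N):
    """Log-likelihood ratio of one circular window under a Bernoulli model.

    c: cases inside the window      n: cases + controls inside the window
    C: cases in the whole region    N: cases + controls in the whole region
    Assumes 0 < n < N.
    """
    def xlogy(x, y):
        # Convention: 0 * log(0) = 0
        return 0.0 if x == 0 else x * math.log(y)

    p_in = c / n
    p_out = (C - c) / (N - n)
    if p_in <= p_out:
        return 0.0  # one-sided test: only windows with an excess of cases count
    p_all = C / N
    log_alt = (xlogy(c, p_in) + xlogy(n - c, 1 - p_in)
               + xlogy(C - c, p_out) + xlogy((N - n) - (C - c), 1 - p_out))
    log_null = xlogy(C, p_all) + xlogy(N - C, 1 - p_all)
    return log_alt - log_null


def monte_carlo_p(observed, simulated):
    """Rank-based p-value: the observed statistic is ranked among the replicates."""
    rank = 1 + sum(1 for s in simulated if s >= observed)
    return rank / (len(simulated) + 1)


# Ratio of observed to expected cases, using the figures reported in the text
print(round(98 / 74.66, 3))  # 1.313, matching the reported O/E

# Hypothetical window: 30 cases among 40 children, 100 cases among 200 overall
print(bernoulli_llr(30, 40, 100, 200))
```

In a full scan, the statistic is the maximum of this ratio over all candidate windows, and the same maximum is recomputed on each random relabeling of cases and controls. Under the rank convention assumed above, a p-value of 0.01325 obtained with 99999 replicates would correspond to the observed maximum ranking 1325th out of 100000 values; this is an inference from the formula, not a figure reported by the authors.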




Although we cannot exclude the effect of chance on these results, nor ignore the fact that the data collection periods for cases and controls differ, these results are remarkable. For comparison, in a study conducted in Ohio [62], also carried out with SaTScan, the most significant result was found for the group of children aged 10-14 years, with p=0.33, and only three cases formed part of the cluster. When all the data were analyzed together, including all types of leukemia and all ages, the most likely cluster had 43 cases, with p=0.81. Neither cluster is statistically significant. The author of the paper commented that these results were not entirely surprising, considering that the study area (the state of Ohio) is very large and diverse. He argued that in a large area it is doubtful that a particular risk factor would have a consistent and sustained effect through space, so a cluster is unlikely to be seen in a large geographical area. In another study, Wheeler reached the same conclusion with a further argument: the results are consistent with the literature worldwide, because it is difficult to find statistically significant clusters. Wheeler used several cluster tests (K-function, Cuzick and Edwards', kernel intensity function) plus SaTScan, without finding significant associations with any of them.

Cluster analysis appears to be most effective for small-area spatial analysis [31], on the scale of a city, and not in an area as large as a U.S. state or an entire country. Goujon-Bellec et al. concluded that very few cluster detection techniques have enough power to scan large areas, such as a large country like France [70]. It has also been observed that when the unit of geographic analysis is very large, such as an aggregate of municipalities (about 30 × 30 km) or cantons, the sensitivity of a cluster detection technique is diminished; it is as though the proximity between the cases is "diluted" among these geographical units, and the cluster simply "does not appear". Studies in Germany [7] and France [20] seem to confirm this. Therefore, the difference in surface area between Ohio, in the United States, and Mexico City (116,096 km² vs. 1,485 km², respectively) could be one reason why we detected a cluster in the city. However, a cluster with this many children (98 cases) and a value of p = 0.01325 is not commonly reported. In addition, the data collection period for the cases was two years, against eight in the Ohio study.

When the geographic location of the children who make up the cluster is mapped, it can be seen that this spatial cluster is located to the east of Mexico City (Federal District), slightly to the northeast. Indeed, about half of all children with leukemia investigated are part of this cluster. The cluster clearly excludes the west and southwest of Mexico City. Its extent covers the territories of some of the boroughs of the Mexican Federal District. The hypothetical explanations for this cluster are the following.

Industrial establishments located in the environment of a child have been considered risk factors for developing cancer. Sans et al., in 1995, expected the prevalence of cancers, including childhood acute leukemia, to decrease as people lived farther from a petrochemical plant, as in Britain [53]. According to data from the National Institute of Statistics, Geography and Informatics (INEGI by its Spanish acronym), the two main boroughs by number of economic units of industrial activity are Azcapotzalco and Miguel Hidalgo. In fact, these two areas have formed the most industrialized landscape of Mexico City for over half a century, with an old oil refinery, an auto plant, dozens of railways and many other industries. Paradoxically, these boroughs are not part of the cluster found in the east of Mexico City. At first glance, this suggests that the spatial cluster detected in this study is not related to large industrial facilities. On closer inspection, however, a relationship probably does exist, although a more nuanced one. Iztapalapa borough, which lies within the area of the spatial cluster, ranks third by number of industrial establishments. When Azcapotzalco and Miguel Hidalgo are compared with Iztapalapa, the average number of personnel employed per industrial establishment is very different: 32.01 and 31.29 workers per economic unit in Miguel Hidalgo and Azcapotzalco, respectively, against an average of 11.19 workers per industrial unit in Iztapalapa. This suggests that the facilities most prevalent in the detected spatial cluster include small establishments such as family workshops. If so, work in these workshops may be an important parental exposure.




Another consideration is air pollution, a risk factor that has been studied. Knox suggested an association between the prevalence of cancer and the geographical distribution of air pollutants [71]. According to the National Institute of Ecology (INE by its Spanish acronym), air pollution in Mexico City causes 4,000 premature deaths per year and 2.5 million lost work days [72]. EMBARQ states that, with about 18 million people and 6 million cars, Mexico City's metropolitan area is one of the largest and busiest cities in the world. Around 600 new cars come into service each day, and in 2007 just over 300,000 cars were sold. Most alarming is that, according to the same source, less than 4% of vehicles, namely trucks and buses, generated about 70% of air pollution in 2002. The other 30 percent is attributed to factories, small cars and motorcycles [73]. Air pollution is thus mainly attributed to heavy vehicles. The smog of Mexico City is concentrated in the southern part of the city, whereas the children in the cluster are located mainly in the east. With the evidence found, it is difficult to assert that the spatial cluster is due mainly to the atmospheric concentration of pollution in the city. If it were, we would have expected a spatial cluster in the south-southwest of Mexico City rather than in the east.

In an earlier study, which used the same database as this study, we measured the relationship between socioeconomic status and the development of childhood acute leukemia. The study, conducted by Perez-Saldivar et al., considered the problem according to three indicators [64]. The first is indirect and only represents the existing agricultural activity in each delegation (borough) of Mexico City [74]. The second used the Human Development Index by municipality, developed by the United Nations to measure human development and published for Mexico in 2005 [75]. The third was the average number of people per household [76]. The results showed a relationship only between the incidence of ALL, specifically Pre-B ALL, and the number of people per household. No significant relationship was found for the other two indicators. In this study, the detected cluster is located in an area of relatively low economic level, with several boroughs that suffer from poverty and have a higher number of people per household. However, with the information available, we cannot explain a relationship between socioeconomic status and the cluster reported in this study. Further studies are needed to investigate this point.


In summary, the investigation of spatial clusters in geographic areas of relatively small size, such as a city with a high population density and a high incidence of childhood acute leukemia, is favorable for the detection of clusters.

Unfortunately, we do not have a longer register of children with leukemia, so the interpretation of the results is inconclusive. Nevertheless, the results of this study support the involvement of environmental factors in the development of childhood acute lymphoblastic leukemia.

these studies, we know well that its incidence is among the highest in the world. In 2011 we reported an incidence of 57.6 cases/million children [64]. This high incidence, coupled with the large population of the city―more than eight million in 2010 in the territory of the Federal District, and more than 20 million people in the metropolitan area of Mexico City, in the same year―are expressed in the more than 200 children with childhood leukemia incidents each year. These children and adolescents are treated in nine tertiary hospitals, the highest rank of specialist medical care in the Mexican health system. It has been estimated that these hospitals serve approximately 97.5% of the cases of childhood acute leukemia Mexico City [65].

For over ten years it was suspected that the spatial distribution of children with leukemia in Mexico City was heterogeneous. In 2000, in a descriptive longitudinal study, conducted by researchers at the Mexican Social Security Institute [66], were found morbidity standarized rates (MSR), that suggest a spatial concentration of cases of childhood leukemia. For acute lymphoblastic leukemia (ALL), the MSR were highest at south of Mexico City; for acute myeloblastic leukemia (AML), the MSR were highest in the west. In addition, from further investigations, some matches attracted the attention: it was found that among patients with acute leukemia (AL), some of them were immediate neighbors, suggesting that behind the development of the disease; there is an environmental factor that could promote it [67]. It was, therefore, decided to make a study of spatial clustering in Mexico City, to confirm the hypoth‐ esis that children with leukemia are grouped into clusters that reflect the aforementioned heterogeneous spatial distribution.

Information was extracted on individuals aged 0-14 years, diagnosed with ALL between 2006 and 2007, who were resident in Mexico City. A total of 224 incident cases were identified. We also included 224 children without leukemia, nor other cancers, genetic malformation or asthma. Were matched by sex, age and health institution of origin. We located the addresses of the homes, with an accuracy of 0.1 km, from where children residing at diagnosis of leukemia, or the time of the interview, in those without the disease. The cases were collected between 2006 and 2007, and for the controls needed a much larger period, from 1998 to 2011. Due to this discrepancy, we discard the detection of clusters according to the temporal dimension. We used Kullorff´s scan statistic based on a Bernoulli model to identify individual clusters. The complete study region was scanned by construction of a two-dimensional circular window. The window was varied so at most it included 50% of the entire geographical area. The variable circular window is centered on the geo-reference of each case [68]. The method has been used previously in an analysis of leukaemia in Sweden [69]. Statistical significance (P<0.05) was evaluated using one-sided tests and 99999 simulations. Only one large statistically significant cluster was identified (see Fig. 2) (O = 98, E = 74.66, O/E = 1.313, P = 0.01325). There were no statistically significant secondary clusters. This finding indicates that locally varying environmental factors may be implicated in the origin of ALL in Mexico City. However, the possibility that variable levels of ascertainment may play a role cannot be excluded.

in Mexico City and higher number of people per household. However, given the information that it has, we cannot explain a relationship between socioeconomic status and the cluster

In summary, we can say that the investigation of spatial clusters in geographic areas of a relatively small size, as a city with a high population density and a high incidence of childhood

Unfortunately we do not have a longer register of children with leukemia, so that the inter‐ pretation of the results is inconclusive. The results of this study support the involvement of environmental factors in the development of childhood acute lymphoblastic leukemia.

these studies, we know well that its incidence is among the highest in the world. In 2011 we reported an incidence of 57.6 cases/million children [64]. This high incidence, coupled with the large population of the city―more than eight million in 2010 in the territory of the Federal District, and more than 20 million people in the metropolitan area of Mexico City, in the same year―are expressed in the more than 200 children with childhood leukemia incidents each year. These children and adolescents are treated in nine tertiary hospitals, the highest rank of specialist medical care in the Mexican health system. It has been estimated that these hospitals serve approximately 97.5% of the cases of childhood

For over ten years it was suspected that the spatial distribution of children with leukemia in Mexico City was heterogeneous. In 2000, in a descriptive longitudinal study, conducted by researchers at the Mexican Social Security Institute [66], were found morbidity standarized rates (MSR), that suggest a spatial concentration of cases of childhood leukemia. For acute lymphoblastic leukemia (ALL), the MSR were highest at south of Mexico City; for acute myeloblastic leukemia (AML), the MSR were highest in the west. In addition, from further investigations, some matches attracted the attention: it was found that among patients with acute leukemia (AL), some of them were immediate neighbors, suggesting that behind the development of the disease; there is an environmental factor that could promote it [67]. It was, therefore, decided to make a study of spatial clustering in Mexico City, to confirm the hypoth‐ esis that children with leukemia are grouped into clusters that reflect the aforementioned

Information was extracted on individuals aged 0-14 years, diagnosed with ALL between 2006 and 2007, who were resident in Mexico City. A total of 224 incident cases were identified. We also included 224 children without leukemia, nor other cancers, genetic malformation or asthma. Were matched by sex, age and health institution of origin. We located the addresses of the homes, with an accuracy of 0.1 km, from where children residing at diagnosis of leukemia, or the time of the interview, in those without the disease. The cases were collected between 2006 and 2007, and for the controls needed a much larger period, from 1998 to 2011. Due to this discrepancy, we discard the detection of clusters according to the temporal dimension. We used Kullorff´s scan statistic based on a Bernoulli model to identify individual clusters. The complete study region was scanned by construction of a two-dimensional circular window. The window was varied so at most it included 50% of the entire geographical area.

reported in the study. Further studies are needed to investigate this point.

132 Clinical Epidemiology of Acute Lymphoblastic Leukemia - From the Molecules to the Clinic

acute leukemia, is favorable for the detection of clusters.

acute leukemia Mexico City [65].

heterogeneous spatial distribution.

Although we cannot exclude the effect of chance on these results, or forget the fact of that the periods of data collection between cases and controls are different, these results are remarkable. For comparison, in a study conducted in Ohio [62], also carried out with SaTScan, the most significant data were found for the group of children aged 10-14 years, with a value p = 0.33, and only three cases formed part of cluster. When all the data were analyzed together, including all types of leukemia and all ages, the most likely cluster had 43 cases, with p = 0.81. As we can see, none of the clusters is statistically significant. The author of the paper com‐ mented that these results were not entirely surprising, considering that the study area is very large and diverse (the state of Ohio). He argued that in a large area, it is doubtful that a particular risk factor going to have a consistent and sustained effect through space. It's unlikely to see a cluster in a large geographical area. And again, Wheeler repeated the same sentence with another argument: the results are consistent with the literature worldwide, because it is difficult to find statistically significant clusters. Wheeler used several tests of clusters (Kfunction, Cuzick & Edward's, kernel intensity function) plus SaTScan without finding significant associations with any of them.

Apparently the cluster analysis is most effective for small-area spatial analysis [31], on the scale of a city, and not in an area as large as a U.S. state. Goujon-Bellec et al, concluded that very few cluster detection techniques have enough power to scan large areas, as a large country like France [70]. It has also been observed that when the unit of geographic analysis is very large, such as a county or a municipality, or an aggregate of these, the sensitivity of a technique for detecting clusters is diminished; it is as though the proximity between the cases are "diluted" among these geographical units, and the cluster simply "does not appear". Studies of Germany [7] and France [20] seem to confirm it. Therefore, differences in surface between the territories of Ohio, in the United States, and Mexico City, in the Mexican capital (116.096 km² vs. 1485 km², respectively), could be one reason why we detected a cluster in the city. However, many children (98 cases), and a value of p = 0.01325, statistically significant, is not commonly reported. In addition, the data collection period of the cases was two years, against eight of Ohio study.

When the geographic location of the children who make up the cluster is mapped, this spatial cluster is seen to lie to the east of Mexico City (Federal District), slightly to the northeast. Indeed, about half of all the children with leukemia investigated are part of this cluster. The cluster clearly excludes the west and southwest of Mexico City, and its area covers parts of several boroughs of the Mexican Federal District. The hypothetical explanations for this cluster are the following.

Industrial establishments located in the environment of a child have been considered as risk factors for developing cancer. Sans et al., in 1995, expected the prevalence of cancers, including childhood acute leukemia, to decrease as people lived farther from a petrochemical plant in Britain [53]. According to data from the National Institute of Statistics, Geography and Informatics (INEGI by its Spanish acronym), the two main boroughs by number of economic units of industrial activity are Azcapotzalco and Miguel Hidalgo. In fact, these two areas have formed the most industrialized landscape of Mexico City for over half a century; they housed an old oil refinery, an auto plant, dozens of railways and many other industries. Paradoxically, these boroughs are not part of the cluster found in the east of Mexico City. At first glance, this suggests that the spatial cluster detected in this study is not related to large industrial facilities. A more careful look, however, suggests that a relationship probably does exist, although a more nuanced one. Iztapalapa, which lies within the area of the spatial cluster, ranks third in number of industrial establishments. When Azcapotzalco and Miguel Hidalgo are compared with Iztapalapa, the average number of personnel employed per industrial establishment is very different: Miguel Hidalgo and Azcapotzalco average 32.01 and 31.29 workers per economic unit, respectively, whereas Iztapalapa averages 11.19 workers per industrial unit. This suggests that the facilities most prevalent in the detected spatial cluster are small establishments, such as family workshops. If so, work in these workshops could be an important parental exposure.
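The comparison of establishment sizes is simple arithmetic, and a short sketch makes the contrast explicit. The three averages are the figures quoted above; the dictionary name `WORKERS_PER_UNIT` and the helper `size_ratio` are hypothetical names introduced only for this illustration.

```python
# Average workers per industrial economic unit, as quoted in the text.
WORKERS_PER_UNIT = {
    "Miguel Hidalgo": 32.01,
    "Azcapotzalco": 31.29,
    "Iztapalapa": 11.19,
}


def size_ratio(borough, reference="Iztapalapa"):
    """Ratio of the average establishment size in `borough` to that
    in the reference borough."""
    return WORKERS_PER_UNIT[borough] / WORKERS_PER_UNIT[reference]


for name in ("Miguel Hidalgo", "Azcapotzalco"):
    print(f"{name}: {size_ratio(name):.2f} times the Iztapalapa average")
```

On these figures, establishments in the two most industrialized boroughs are on average nearly three times larger than those in Iztapalapa, which is consistent with the suggestion that small workshops predominate inside the cluster area.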

Another consideration is air pollution, a risk factor that has been studied previously. Knox suggested an association between the prevalence of cancer and the geographical distribution of air pollutants [71]. According to the National Institute of Ecology (INE by its Spanish acronym), air pollution in Mexico City causes 4,000 premature deaths per year and 2.5 million lost work days [72]. EMBARQ states that, with about 18 million people and 6 million cars, Mexico City is one of the largest and busiest cities in the world. Around 600 new cars come into service each day, and just over 300,000 cars were sold in 2007. Most alarming is that, according to the same source, trucks and buses, which make up less than 4% of vehicles, generated about 70% of air pollution in 2002. The other 30% is attributed to factories, small cars and motorcycles [73]. Air pollution is thus mainly attributed to heavy vehicles. The smog of Mexico City is concentrated in the southern part of the city, whereas the children in the cluster are located mainly to the east. With the evidence found, it is difficult to assert that the spatial cluster is due mainly to the atmospheric concentration of pollution in the city. If it were, we would have expected a cluster in the south-southwest of Mexico City rather than in the east.

In an earlier study, which used the same database as this study, we measured the relationship between socioeconomic status and the development of childhood acute leukemia. The study, conducted by Pérez-Saldivar et al., considered the problem according to three indicators [64]. The first is indirect and represents only the existing agricultural activity in each borough (delegación) of Mexico City [74]. The second used the Human Development Index, developed by the United Nations to measure human development, by municipality, published for Mexico in 2005 [75]. The third was the average number of people per household [76]. The results showed a relationship only between the incidence of ALL, specifically Pre-B ALL, and the number of people per household. No significant relationship was found for either of the other two indicators. In this study, the detected cluster is located in an area of relatively low economic level, which includes several of the boroughs of Mexico City that suffer from poverty. However, with the information available, we cannot establish a relationship between socioeconomic status and the cluster reported in this study. Further studies are needed to investigate this point.

**Figure 2.** Location of the spatial cluster of childhood acute leukemia cases in Mexico. The numbers identify each case within the cluster.

## **Acknowledgements**

This work was funded by the Consejo Nacional de la Ciencia y la Tecnología (CONACYT) through its program, Fondo Sectorial de Investigación en Salud y Seguridad Social (SALUD 2007-1-71223/FIS/IMSS/PROT/592); the Fondo Sectorial de Investigación para la Educación (CB-2007-1-83949/FIS/IMSS/PROT/616) and by Instituto Mexicano del Seguro Social (FIS/IMSS/PROT/G10/846).

## **Author details**

David Aldebarán Duarte-Rodríguez¹, Richard J.Q. McNally², Juan Carlos Núñez-Enríquez¹, Arturo Fajardo-Gutiérrez¹ and Juan Manuel Mejía-Aranguré³

¹ Unidad de Investigación en Epidemiología Clínica, Hospital de Pediatría, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social (IMSS), México

² Institute of Health and Society, Newcastle University, UK

³ Coordinación de Investigación en Salud, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social (IMSS), México

## **References**

[1] Elliot, P. NB. Geographic patterns of disease. In: Armitage P, Colton T, editors. International Encyclopaedia of Biostatistics. 2nd ed. Chichester: Wiley; (1998).

[2] Mejía-arangure, J. Pérez-Saldivar M, Rosana Pelayo-Camacho R, Fuentes-Pananá E, Bekker-Mendez C, Morales-Sánchez A, et al. Childhood Acute Leukemias in Hispanic Population: Differences by Age Peak and Immunophenotype. In: Faderl S, editor. Novel Aspects in Acute Lymphoblastic Leukemia [Internet]. InTech; (2011). [cited 2012 Aug 15]. Available from: http://intechopen.com/books/novel-aspects-in-acutelymphoblastic-leukemia/childhood-acute-leukemias-in-hispanic-population-differences-by-age-peak-and-immunophenotype

[3] Poljak, Z, Dewey, C. E, Martin, S. W, Christensen, J, Carman, S, & Friendship, R. M. Spatial clustering of swine influenza in Ontario on the basis of herd-level disease status with different misclassification errors. Preventive Veterinary Medicine [Internet]. (2007). Oct 16 [cited 2012 Aug 15];Available from: http://www.ncbi.nlm.nih.gov/pubmed/17531333, 81(4), 236-49.

[4] Gaudart, J, Poudiougou, B, Dicko, A, Ranque, S, Toure, O, Sagara, I, et al. Space-time clustering of childhood malaria at the household level: a dynamic cohort in a Mali village. BMC Public Health [Internet]. (2006). Jan [cited 2012 Nov 7];6:286. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1684261&tool=pmcentrez&rendertype=abstract

[5] Duczmal, L. H, Moreira, G. J, Burgarelli, D, Takahashi, R. H, Magalhães, F. C, & Bodevan, E. C. Voronoi distance based prospective space-time scans for point data sets: a dengue fever cluster analysis in a southeast Brazilian town. International Journal of Health Geographics [Internet]. (2011). Jan [cited 2012 Nov 14];10:29. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3118312&tool=pmcentrez&rendertype=abstract

[6] Meliker, J. R, & Jacquez, G. M. Space-time clustering of case-control data with residential histories: insights into empirical induction periods, age-specific susceptibility, and calendar year-specific effects. Stochastic Environmental Research and Risk Assessment [Internet]. (2007). Aug [cited 2012 Nov 14];Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2430065&tool=pmcentrez&rendertype=abstract, 21(5), 625-34.

[7] Schmiedel, S, Blettner, M, Kaatsch, P, & Schüz, J. Spatial clustering and space-time clusters of leukemia among children in Germany, 1987-2007. European Journal of Epidemiology [Internet]. (2010). Sep [cited 2012 Aug 15];Available from: http://www.ncbi.nlm.nih.gov/pubmed/20623321, 25(9), 627-33.

[8] Rose, G. Sick individuals and sick populations. International Journal of Epidemiology [Internet]. (1985). Mar [cited 2012 Aug 23];Available from: http://www.ncbi.nlm.nih.gov/pubmed/3872850, 14(1), 32-8.

[9] Lawson, A. B. Statistical Methods in Spatial Epidemiology. Wiley, editor. West Sussex; (2001). , 277.

[10] Zatsepin, I, Verger, P, Robert-gnansia, E, Gagnière, B, Tirmarche, M, Khmel, R, et al. Down syndrome time-clustering in January 1987 in Belarus: link with the Chernobyl accident? Reproductive toxicology (Elmsford, N.Y.) [Internet]. (2007). [cited 2012 Aug 15];24(3-4):289-95. Available from: http://www.ncbi.nlm.nih.gov/pubmed/17706919

[11] Lee, S. S, & Wong, N. S. The clustering and transmission dynamics of pandemic influenza A (H1N1) 2009 cases in Hong Kong. The Journal of Infection [Internet]. (2011). Oct [cited 2012 Aug 23];Available from: http://www.ncbi.nlm.nih.gov/pubmed/21601284, 63(4), 274-80.

[12] Smith, M. Considerations on a possible viral etiology for B-precursor acute lymphoblastic leukemia of childhood. Journal of Immunotherapy [Internet]. (1997). Mar [cited 2012 Aug 15];Available from: http://www.ncbi.nlm.nih.gov/pubmed/9087381, 20(2), 89-100.

[13] McNally RJQ, Bithell JF, Vincent TJ, Murphy MFG. Space-time clustering of childhood cancer around the residence at birth. International Journal of Cancer [Internet]. (2009). Jan 15 [cited 2012 Aug 15];Available from: http://www.ncbi.nlm.nih.gov/pubmed/18844236, 124(2), 449-55.

[14] Albert, D. P, Gesler, W. M, & Levergood, B. Spatial Analysis, GIS, and Remote Sensing Applications in the Health Sciences. Chelsea, Michigan: Ann Arbor Press; (2000). , 221.

[15] Tango, T. Statistical Methods for Disease Clustering. 1st ed. New York: Springer; (2010). , 248.

[16] Porta, M. A Dictionary of Epidemiology. 5th ed. Porta M, editor. Oxford University Press; (2008). , 320.

[17] Birch, J. M, Alexander, F. E, Blair, V, Eden, O. B, Taylor, G. M, & Mcnally, R. J. Space-time clustering patterns in childhood leukaemia support a role for infection. British Journal of Cancer [Internet]. (2000). May;Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2363399&tool=pmcentrez&rendertype=abstract, 82(9), 1571-6.

[18] Boutou, O, Guizard, A-V, Slama, R, Pottier, D, & Spira, A. Population mixing and leukaemia in young people around the La Hague nuclear waste reprocessing plant. British Journal of Cancer [Internet]. (2002). Sep 23 [cited 2012 Aug 15];Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2364264&tool=pmcentrez&rendertype=abstract, 87(7), 740-5.

[19] Glass, A. G, & Mantel, N. Lack of time-space clustering of childhood leukemia in Los Angeles County, 1960-1964. Cancer Research [Internet]. (1969). Available from: http://www.ncbi.nlm.nih.gov/pubmed/5358216, 29(11), 1995-2001.

[20] Bellec, S, Hémon, D, Rudant, J, Goubin, A, & Clavel, J. Spatial and space-time clustering of childhood acute leukaemia in France from 1990 to 2000: a nationwide study. British Journal of Cancer [Internet]. (2006). Mar 13 [cited 2012 Aug 15];Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2374236&tool=pmcentrez&rendertype=abstract, 94(5), 763-70.

[21] McNally RJQ, Alexander FE, Birch JM. Space-time clustering analyses of childhood acute lymphoblastic leukaemia by immunophenotype. British Journal of Cancer [Internet]. (2002). Aug 27 [cited 2012 Aug 15];Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2376144&tool=pmcentrez&rendertype=abstract, 87(5), 513-5.

[22] Wiemels, J. Perspectives on the causes of childhood leukemia. Chemico-Biological Interactions [Internet]. (2012). Apr 5 [cited 2012 Oct 22];Available from: http://www.ncbi.nlm.nih.gov/pubmed/22326931, 196(3), 59-67.

[23] Amin, R, Bohnert, A, Holmes, L, Rajasekaran, A, & Assanasen, C. Epidemiologic mapping of Florida childhood cancer clusters. Pediatric Blood & Cancer [Internet]. (2010). Available from: http://www.ncbi.nlm.nih.gov/pubmed/20054842, 54(4), 511-8.

[24] Alexander, F. E, Chan, L. C, Lam, T. H, Yuen, P, Leung, N. K, Ha, S. Y, et al. Clustering of childhood leukaemia in Hong Kong: association with the childhood peak and common acute lymphoblastic leukaemia and with population mixing. British Journal of Cancer [Internet]. (1997). Jan;Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2063384&tool=pmcentrez&rendertype=abstract, 75(3), 457-63.

[25] Ward, G. The infective theory of acute leukemia. British Journal of Childhood's Disease. (1917). , 14, 10-20.

[26] McNally RJQ, Alston RD, Eden TOB, Kelsey AM, Birch JM. Further clues concerning the aetiology of childhood central nervous system tumours. European Journal of Cancer [Internet]. (2004). Dec [cited 2012 Aug 15];Available from: http://www.ncbi.nlm.nih.gov/pubmed/15571959, 40(18), 2766-72.

[27] Demoury, C, Goujon-bellec, S, Guyot-goubin, A, Hémon, D, & Clavel, J. Spatial variations of childhood acute leukaemia in France, 1990-2006: global spatial heterogeneity and cluster detection at "living-zone" level. European Journal of Cancer Prevention [Internet]. (2012). Jul [cited 2012 Oct 26];Available from: http://www.ncbi.nlm.nih.gov/pubmed/22108445, 21(4), 367-74.

[28] Gilman, E. A. McNally RJQ, Cartwright RA. Space-time clustering of Hodgkin's Disease in parts of the UK, Leukemia & Lymphoma [Internet]. (1999). Available from: http://www.ncbi.nlm.nih.gov/pubmed/10613453, 1984-1993.

[29] Gustafsson, B, & Carstensen, J. Evidence of space-time clustering of childhood acute lymphoblastic leukaemia in Sweden. British Journal of Cancer [Internet]. (1999). Feb;79(3-4):655-7. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2362409&tool=pmcentrez&rendertype=abstract

[30] Gustafsson, B, & Carstensen, J. Space-time clustering of childhood lymphatic leukaemias and non-Hodgkin's lymphomas in Sweden. European Journal of Epidemiology [Internet]. (2000). Jan [cited 2012 Oct 25];Available from: http://www.ncbi.nlm.nih.gov/pubmed/11484799, 16(12), 1111-6.

[31] Kaatsch, P, & Mergenthaler, A. Incidence, time trends and regional variation of childhood leukaemia in Germany and Europe. Radiation Protection Dosimetry [Internet]. (2008). Jan [cited 2012 Aug 15];Available from: http://www.ncbi.nlm.nih.gov/pubmed/18996968, 132(2), 107-13.

[32] Kulkarni, K, Stobart, K, Witol, A, & Rosychuk, R. J. Leukemia and lymphoma incidence in children in Alberta, Canada: a population-based 22-year retrospective study. Pediatric Hematology and Oncology [Internet]. (2011). Nov [cited 2012 Oct 26];Available from: http://www.ncbi.nlm.nih.gov/pubmed/21981741, 28(8), 649-60.

[33] McNally RJQ, Eden TOB, Alexander FE, Kelsey AM, Birch JM. Is there a common aetiology for certain childhood malignancies? Results of cross-space-time clustering analyses. European Journal of Cancer [Internet]. (2005). Dec [cited 2012 Oct 25];Available from: http://www.ncbi.nlm.nih.gov/pubmed/16243517, 41(18), 2911-6.


[34] Alexander, F. E, Boyle, P, Carli, P. M, Coebergh, J. W, Draper, G. J, Ekbom, A, et al. Spatial temporal patterns in childhood leukaemia: further evidence for an infectious origin. EUROCLUS project. British Journal of Cancer [Internet]. (1998). Mar;Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2149966&tool=pmcentrez&rendertype=abstract, 77(5), 812-7.

[35] Alexander, F. E, Boyle, P, Carli, P. M, Coebergh, J. W, Draper, G. J, Ekbom, A, et al. Spatial clustering of childhood leukaemia: summary results from the EUROCLUS project. British Journal of Cancer [Internet]. (1998). Mar;Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2149947&tool=pmcentrez&rendertype=abstract, 77(5), 818-24.

[36] Francis, S. S, Selvin, S, Yang, W, Buffler, P a, & Wiemels, J. L. Unusual space-time patterning of the Fallon, Nevada leukemia cluster: Evidence of an infectious etiology. Chemico-Biological Interactions [Internet]. Elsevier Ireland Ltd; (2012). Apr 5 [cited 2012 Aug 15];Available from: http://www.ncbi.nlm.nih.gov/pubmed/21352818, 196(3), 102-9.

[37] McNally RJQ, Alexander FE, Bithell JF. Space-time clustering of childhood cancer in Great Britain: a national study, 1969-1993. International Journal of Cancer [Internet]. (2006). Jun 1 [cited 2012 Aug 15];Available from: http://www.ncbi.nlm.nih.gov/pubmed/16381003, 118(11), 2840-6.

[38] Petridou, E, Alexander, F. E, Trichopoulos, D, Revinthi, K, Dessypris, N, Wray, N, et al. Aggregation of childhood leukemia in geographic areas of Greece. Cancer Causes & Control [Internet]. (1997). Mar;Available from: http://www.ncbi.nlm.nih.gov/pubmed/9134248, 8(2), 239-45.

[39] Staines, a, Bodansky, H. J, Mckinney, P a, Alexander, F. E, Mcnally, R. J, Law, G. R, et al. Small area variation in the incidence of childhood insulin-dependent diabetes mellitus in Yorkshire, UK: links with overcrowding and population density. International Journal of Epidemiology [Internet]. (1997). Dec;Available from: http://www.ncbi.nlm.nih.gov/pubmed/9447411, 26(6), 1307-13.

[40] MacMahon B. Is acute lymphoblastic leukemia in children virus-related? American Journal of Epidemiology [Internet]. (1992). Oct 15;Available from: http://www.ncbi.nlm.nih.gov/pubmed/1456268, 136(8), 916-24.

[41] Zur Hausen H. Childhood leukemias and other hematopoietic malignancies: interdependence between an infectious event and chromosomal modifications. International Journal of Cancer [Internet]. (2009). Oct 15 [cited 2012 Aug 15];Available from: http://www.ncbi.nlm.nih.gov/pubmed/19330827, 125(8), 1764-70.

[42] McNally RJQ. Lecture for master's Newcastle University. Newcastle; (2012). , 96.

[43] Law, G. R. Host, family and community proxies for infections potentially associated with leukaemia. Radiation Protection Dosimetry [Internet]. (2008). Jan [cited 2012 Aug 15];Available from: http://www.ncbi.nlm.nih.gov/pubmed/18945723, 132(2), 267-72.

[44] Mulder, Y. M, Drijver, M, & Kreis, I a. Case-control study on the association between a cluster of childhood haematopoietic malignancies and local environmental factors in Aalsmeer, The Netherlands. Journal of Epidemiology and Community Health [Internet]. (1994). Apr;Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1059926&tool=pmcentrez&rendertype=abstract, 48(2), 161-5.

[45] Petridou, E, Revinthi, K, Alexander, F. E, Haidas, S, Koliouskas, D, Kosmidis, H, et al. Space-time clustering of childhood leukaemia in Greece: evidence supporting a viral aetiology. British Journal of Cancer [Internet]. (1996). May;Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2074508&tool=pmcentrez&rendertype=abstract, 73(10), 1278-83.

[46] Gilman, E. A, & Knox, E. G. Childhood cancers: space-time distribution in Britain. Journal of Epidemiology and Community Health [Internet]. (1995). Apr [cited 2012 Aug 15];Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1060101&tool=pmcentrez&rendertype=abstract, 49(2), 158-63.

[47] Alexander, F. E. Space-time clustering of childhood acute lymphoblastic leukaemia: indirect evidence for a transmissible agent. British Journal of Cancer [Internet]. (1992). Apr;Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1977570&tool=pmcentrez&rendertype=abstract, 65(4), 589-92.

[48] Dockerty, J. D, Sharples, K. J, & Borman, B. An assessment of spatial clustering of leukaemias and lymphomas among young people in New Zealand. Journal of Epidemiology and Community Health [Internet]. (1999). Mar;Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1756850&tool=pmcentrez&rendertype=abstract, 53(3), 154-8.

[49] Alexander, F, Cartwright, R, Mckinney, P a, & Ricketts, T. J. Investigation of spacial clustering of rare diseases: childhood malignancies in North Humberside. Journal of Epidemiology and Community Health [Internet]. (1990). Mar;Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1060595&tool=pmcentrez&rendertype=abstract, 44(1), 39-46.

[50] Klauber, M. R, & Mustacchi, P. Space-Time Clustering of Childhood Leukemia in San Francisco. Cancer Research. (1970). , 30, 1969-73.

[51] Wartenberg, D, Schneider, D, & Brown, S. Childhood leukaemia incidence and the population mixing hypothesis in US SEER data. British Journal of Cancer [Internet]. (2004). May 4 [cited 2012 Nov 15];Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2409734&tool=pmcentrez&rendertype=abstract, 90(9), 1771-6.

[52] Knox, E. G, & Gilman, E. Leukaemia clusters in Great Britain. 1. Space-time interactions. Journal of Epidemiology and Community Health [Internet]. (1992). Dec;Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1059670&tool=pmcentrez&rendertype=abstract, 46(6), 566-72.

[62] Wheeler, D. C. A comparison of spatial clustering and cluster detection techniques for childhood leukemia incidence in Ohio, International Journal of Health Geographics [Internet]. (2007). Jan [cited 2012 Aug 15];6(13). Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1851703&tool=pmcentrez&rendertype=abstract, 1996-2003.

[63] Bithell, J. F, Dutton, S. J, Draper, G. J, & Neary, N. M. Distribution of childhood leukaemias and non-Hodgkin's lymphomas near nuclear installations in England and Wales. BMJ (Clinical research ed.) [Internet]. (1994). [cited 2012 Aug 15];Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2542713&tool=pmcentrez&rendertype=abstract, 309(6953), 501-5.

[64] Pérez-saldivar, M. L, Fajardo-gutiérrez, A, Bernáldez-ríos, R, Martínez-avalos, A, Medina-sanson, A, Espinosa-hernández, L, et al. Childhood acute leukemias are frequent in Mexico City: descriptive epidemiology. BMC Cancer [Internet]. (2011). Jan [cited 2012 Aug 16];11:355. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3171387&tool=pmcentrez&rendertype=abstract

[65] Fajardo-gutiérrez, A, Sandoval-mex, A. M, Mejía-aranguré, J. M, & Rendón-macías, M. E. Martínez-García M del C. Clinical and social factors that affect the time to diagnosis of Mexican children with cancer. Medical and Pediatric Oncology [Internet]. (2002). Jul [cited 2012 Jul 16];Available from: http://www.ncbi.nlm.nih.gov/pubmed/12116075, 39(1), 25-31.

[66] Mejía-aranguré, J. M, Fajardo-gutiérrez, A, Bernáldez-ríos, R, Paredes-aguilera, R, Flores-aguilar, H, & Martínez-garcía, M. C. Incidencia de las leucemias agudas en niños de la ciudad de México, de 1982 a 1991. Salud Pública de México [Internet]. (2000). [cited 2012 Aug 16];Available from: http://www.ncbi.nlm.nih.gov/pubmed/11125628, 42(5), 431-7.

[67] Duarte Rodríguez DA. Conglomerados espaciales de niños con leucemia aguda infantil en la Ciudad de México. Universidad Nacional Autónoma de México; (2012). , 104.

[68] Kulldorff, M, & Nagarwalla, N. Spatial disease clusters: detection and inference. Statistics in Medicine [Internet]. (1995). Apr 30 [cited 2012 Aug 23];Available from: http://www.ncbi.nlm.nih.gov/pubmed/7644860, 14(8), 799-810.

[69] Hjalmars, U, Kulldorff, M, Gustafsson, G, & Nagarwalla, N. Childhood leukaemia in Sweden: using GIS and a spatial scan statistic for cluster detection. Statistics in Medicine [Internet]. (1996). Available from: http://www.ncbi.nlm.nih.gov/pubmed/9132898

[70] Goujon-bellec, S, Demoury, C, Guyot-goubin, A, Hémon, D, & Clavel, J. Detection of clusters of a rare disease over a large territory: performance of cluster detection methods. International Journal of Health Geographics [Internet]. (2011). Jan;10:53.


tral.nih.gov/articlerender.fcgi?artid=1060598&tool=pmcentrez&rendertype=abstract, 44(1), 55-8.

ble from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?

[53] Sans, S, Elliott, P, Kleinschmidt, I, Shaddick, G, Pattenden, S, Walls, P, et al. Cancer incidence and mortality near the Baglan Bay petrochemical works, South Wales. Oc‐ cupational and Environmental Medicine [Internet]. (1995). Apr;Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1128198&tool=pmcen‐

[54] Lehtinen, M. Maternal Herpesvirus Infections and Risk of Acute Lymphoblastic Leu‐ kemia in the Offspring. American Journal of Epidemiology [Internet]. (2003). Aug 1 [cited 2012 Aug 15];Available from: http://aje.oxfordjournals.org/cgi/content/abstract/

[55] Bogdanovic, G, Jernberg, A. G, Priftakis, P, Grillner, L, & Gustafsson, B. Human her‐ pes virus 6 or Epstein-Barr virus were not detected in Guthrie cards from children who later developed leukaemia. British Journal of Cancer [Internet]. (2004). Aug 31 [cited 2012 Aug 15];Available from: http://www.pubmedcentral.nih.gov/articleren‐

[56] Roman, E, Simpson, J, Ansell, P, Kinsey, S, Mitchell, C. D, Mckinney, P. A, et al. Childhood acute lymphoblastic leukemia and infections in the first year of life: a re‐ port from the United Kingdom Childhood Cancer Study. American Journal of Epi‐ demiology [Internet]. (2007). Mar 1 [cited 2012 Aug 15];Available from: http://

[57] Gilham, C, Peto, J, Simpson, J, & Roman, E. Eden TOB, Greaves MF, et al. Day care in infancy and risk of childhood acute lymphoblastic leukaemia: findings from UK case-control study. BMJ (Clinical research ed.) [Internet]. (2005). Jun 4 [cited 2012 Aug 15];330(7503):1294. Available from: http://www.pubmedcentral.nih.gov/article‐

[58] Knox, E. G. Leukaemia clusters in childhood: geographical analysis in Britain. Jour‐ nal of Epidemiology and Community Health [Internet]. (1994). Aug;Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1059986&tool=pmcen‐

[59] Williams EH (Kuluva HSmith PG (DHSS CE and CTU, Day NE (WHO IA for R on C, Geser A (East AVRI, Ellice J, Tukei P. Space-time clustering of Burkitt's Lymphoma in the West Nile District of Uganda: 1961-1975. British Journal of Cancer. (1978). , 37,

[60] Smith, P. G. Case-control studies of leukaemia clusters. BMJ (Clinical research ed.) [Internet]. (1991). Mar 23 [cited 2012 Aug 15];Available from: http://www.pubmed‐ central.nih.gov/articlerender.fcgi?artid=1669108&tool=pmcentrez&rendertype=ab‐

[61] Morris, V. Space-time interactions in childhood cancers. Journal of Epidemiology and Community Health [Internet]. (1990). Mar;Available from: http://www.pubmedcen‐

der.fcgi?artid=2409878&tool=pmcentrez&rendertype=abstract, 91(5), 913-5.

www.ncbi.nlm.nih.gov/pubmed/17182983, 165(5), 496-504.

render.fcgi?artid=558199&tool=pmcentrez&rendertype=abstract

trez&rendertype=abstract, 48(4), 369-76.

109-22.

stract, 302(6778), 672-3.

artid=1059670&tool=pmcentrez&rendertype=abstract, 46(6), 566-72.

142 Clinical Epidemiology of Acute Lymphoblastic Leukemia - From the Molecules to the Clinic

trez&rendertype=abstract, 52(4), 217-24.

158/3/207, 158(3), 207-13.


Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?ar‐ tid=3204219&tool=pmcentrez&rendertype=abstract

**Chapter 7**

**Sociodemographic and Birth Characteristics in Infant Acute Leukemia: A Review**

ML Pérez-Saldivar, JM Mejía-Aranguré, A Rangel-López and A Fajardo-Gutiérrez

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/54457

**1. Introduction**

Acute leukemias are cancers of the hematopoietic system that involve, in the majority of cases, a malignant transformation of myeloid and lymphoid progenitor cells [1]. They represent the most common type of childhood cancer [2,3]. Acute lymphoblastic leukemia (ALL) is five times more frequent than acute myeloblastic leukemia (AML) and is the most common cancer in children, representing 25% to 35% of all childhood cancers [3,4]. The incidence of ALL varies significantly between developed and developing countries. With a reported annual incidence of 20-45 cases per million children, the highest rates are recorded in the Hispanic populations of California, Texas and Florida and in Costa Rica and Mexico City [4-7]. Despite advances in therapy and improvements in survival, acute leukemia remains one of the main causes of morbidity and mortality in children. The etiology of this disease remains unknown. Only Down syndrome and ionizing radiation have been recognized as risk factors for the development of childhood acute leukemia [8]. However, the risk attributable to these factors is very small. Epidemiological studies exploring different environmental exposures, along with advances in cytogenetics and immunophenotyping, have identified different subgroups of the disease that must be considered separately. One such subgroup is infant leukemia. Although the disease is rare in infants, its molecular characteristics and survival differ from those in older children, suggesting that the etiology is distinct and most likely involves prenatal factors.

The purpose of this chapter is to introduce the reader to a systematic review of the current literature on reported risk factors for childhood acute leukemia (AL). This review reports what is currently known about acute leukemia in infants and outlines future directions.

© 2013 Pérez-Saldivar et al.; licensee InTech. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

