Agent-Based Modeling and Analysis of Cancer Evolution

*Atsushi Niida and Watal M. Iwasaki*

## **Abstract**

Before the development of the next-generation sequencing (NGS) technology, carcinogenesis was regarded as a linear evolutionary process, driven by repeated acquisition of multiple driver mutations and Darwinian selection. However, recent cancer genome analyses employing NGS revealed the heterogeneity of mutations in the tumor, which is known as intratumor heterogeneity (ITH) and generated by branching evolution of cancer cells. In this chapter, we introduce a simulation modeling approach useful for understanding cancer evolution and ITH. We first describe agent-based modeling for simulating branching evolution of cancer cells. We next demonstrate how to fit an agent-based model to observational data from cancer genome analyses, employing approximate Bayesian computation (ABC). Finally, we explain how to characterize the dynamics of the simulation model through sensitivity analysis. We not only explain the methodologies, but also introduce exemplifying applications. For example, simulation modeling of cancer evolution demonstrated that ITH in colorectal cancer is generated by neutral evolution, which is caused by a high mutation rate and stem cell hierarchy. For cancer genome analyses, new experimental technologies are actively being developed; these will unveil various aspects of cancer evolution when combined with the simulation modeling approach.

**Keywords:** cancer, evolution, agent-based model, approximate Bayesian computation, sensitivity analysis

#### **1. Introduction**

Cancer is a clump of abnormal cells that originates from normal cells. Normal cells proliferate or stop proliferating depending on their surrounding environment. For example, when skin cells are injured, they proliferate to cover the wound; however, when the wound heals, they stop proliferating. In contrast, cancer cells continue proliferating by ignoring the surrounding environment. Moreover, cancer cells invade surrounding tissues, metastasize to distant organs, and impair functions in the human body.

Malignant transformation from normal to cancer cells generally results from the accumulation of somatic mutations, which are induced by various causes such as aging, ultraviolet rays, cigarette, alcohol, chemical carcinogens, etc. Mutations that contribute to malignant transformation are known as "driver mutations", whereas genes whose function are impaired by driver mutations are named as "driver genes". There are two types of driver genes are categorized: "oncogenes" and

"tumor suppressor genes". Oncogenes act as gas pedals for cell proliferation, which are constitutively turned on by driver mutations. Tumor suppressor genes act as brakes to stop cell proliferation, and inhibiting the function of the brakes is necessary for malignant transformation.

Normal cells are transformed into cancer cell when two to 10 driver mutations are acquired. Because these mutations are not induced simultaneously, but rather gradually over a long period of time, this process is known as "multi-stage carcinogenesis" [1]. This process is also regarded as a linear evolutionary process, driven by repeated acquisition of multiple driver mutations and Darwinian selection. Understanding cancer from an evolutionary perspective is important, as therapeutic difficulties against cancer originate from the high evolutionary capacity, which easily endows cancer cells with therapeutic resistance.

Mutations in cancer cells are experimentally detected by DNA sequencing. Nextgeneration sequencing (NGS) technology, which raised around 2010, enabled cancer genome analysis to comprehensively detect mutations in cancer cells. During the last decade, cancer genome analysis has revolutionized our understanding of cancer [2]. Cancer genome analysis showed that cancer cells harbor a large number of mutations, only a small fraction of which is driver mutations; namely, most mutations in cancer cells are "neutral mutations", which have no selective advantages (also referred to as "passenger mutations" in paired with driver mutations). By sequencing hundreds of tumor samples from different patients with the same cancer type, the repertories of driver genes were also determined across various types of cancer. Moreover, cancer genome analysis has revealed heterogeneity of mutations within one tumor, which is termed intratumor heterogeneity (ITH) [3]. As described above, carcinogenesis was regarded as a linear evolutionary process until the arrival of NGS; however, ITH is actually generated by branching evolution of cancer cells.

However, cancer genome analysis is not sufficient to explain the origin of ITH. To understand the evolutionary principles underlying the generation of ITH, a simulation modeling approach is useful and increasingly employed in the field of cancer research. In this chapter, we introduce such simulation modeling approaches. We first describe agent-based modeling for simulating branching evolution of cancer cells. We next demonstrate how to fit an agent-based model to observational data obtained by cancer genome analyses, employing approximate Bayesian computation (ABC). Finally, we explain how to characterize the dynamics of the simulation models through sensitivity analysis.

#### **2. Agent-based modeling of cancer evolution**

To simulate heterogenous cancer evolution, agent-based modeling is widely employed. An agent-based model assumes a set of system constituents, known as independent agents, and specifies rules for the independent behavior of the agents themselves, as well as for the interactions between agents and the agent environment [4]. The agent-based model is a flexible representation of the model, and given the initial conditions and parameters of the system, the behavior of the system can be easily analyzed by computational simulation. For modeling of cancer evolution, if each cell is assumed to be an agent, ITH can be easily represented by the differences in the internal states of each agent. As an example, we explain an agent-based model named as the branching evolutionary process (BEP) model, which was originally introduced by Uchi *et al.* [5] to studying ITH of colorectal cancer (**Figure 1A**). In the BEP model, a cell assumed to be an agent has a genome containing *n* genes, each of which is represented as a binary value, 0 (wild-type) or *Agent-Based Modeling and Analysis of Cancer Evolution DOI: http://dx.doi.org/10.5772/intechopen.100140*

#### **Figure 1.**

*Illustration of the BEP model. (A) A flowchart of a simulation based on the BEP model. First, a cell is tested for survival and killed with a provability q. Next, the cell is tested for cell division and replicated with a provability p. Before cell division, each gene in the cell is mutated with a provability r. After this process is applied to each cell, the simulation proceeds to the next time step. (B–D) illustration of division operation (see the main text for details). This image originally appeared in [5].*

1 (mutated). Thus, the genome is represented as a binary vector **g** with length *n*. In a unit time step, a cell replicates with a probability *p* and dies with a probability *q*. When the cell replicates, a wild-type gene is mutated with a probability *r*. The first *d* genes in **g** are considered as driver genes, whose mutations accelerate replication. A normal cell without mutations has a replication probability *p*0, and each driver mutation increases *p* by 10 *<sup>f</sup>* -fold; i.e., *<sup>p</sup>* <sup>¼</sup> *<sup>p</sup>*<sup>0</sup> � <sup>10</sup>*fk*, where *<sup>k</sup>* <sup>¼</sup> <sup>P</sup>*<sup>d</sup> <sup>i</sup>*¼<sup>1</sup>*gi* , the number of mutated driver genes. The death probability is fixed as *q* ¼ *q*0. Let *c* and *t* denote the size of the simulated cell population and number of the time steps, respectively. A simulation is started with *c*<sup>0</sup> normal cells and the unit time step is repeated while the population size *c*≤ *cmax* and time step *t*≤ *tmax*.

The BEP model assumes that a simulated tumor grows in a two-dimensional square lattice where each cell occupies one lattice point. Initially, *c*<sup>0</sup> cells are initialized as close as possible to the center of the lattice. In a unit time step, along an outward spiral starting from the center, we replicate and kill each cell with probabilities *p* and *q*, respectively. When cells replicate, the BEP model places the daughter cell in the neighborhood of the parent cell, assuming a Moore neighborhood (i.e., eight points surrounding a central point). If empty neighbor points exist, we randomly select one of these points. Otherwise, we create an empty point in any of the eight neighboring points as follows. First, for each of the eight directions, we count the number of consecutive occupied points that range from each neighboring point to immediately before the nearest empty cell as indicated in **Figure 1B**. Next, any of the 8 directions is randomly selected proportionally with 1*=li*, where *li*ð Þ 1≤ *i*≤ 8 is the count of the consecutive occupied points for each direction. The consecutive occupied points in the selected direction are then shifted by one point

so that an empty neighboring point appears as shown in **Figure 1C**. Note that simulation results depend on the order of the division operation in the twodimensional square lattice. The BEP model first marks cells to be divided and then applies the division operation to the marked cells along an outward spiral starting from the center. In each round on the spiral, the direction is randomly flipped in order to maintain spatial symmetry. An example of such spirals is shown in **Figure 1D**.

Given that a cell without mutations divides according to this rule, after a normal cell acquires its first driver mutation, which accelerates cell division, the proportion of the clone originating from the cell increases in the whole cell population. By repeating these steps, each cell gradually accumulates driver mutations and accompanying passenger mutations, which do not affect the cell division rate, finally forming a tumor with many mutations. Depending on the parameter values during the course of cancer evolution, each cancer cell can accumulate different combinations of mutations to generate different types of ITH. **Figure 2** show an example of snapshots of two-dimensional tumor growth simulated based on the BEP model with an appropriate parameter setting. In this example, driver mutations gradually accumulated in the cells, and a clone with four mutations was selected through Darwinian selection and finally became dominant in the tumor.

The BEP model is a very simple model and has many limitations. Although this BEP model assumes that driver mutations increase the replication probability, it is considered that driver mutations decrease the death probability. The BEP model also assume that each diver mutation has the same effect on the replication probability; however, actual tumors contain different driver mutations of different strengths. Although actual tumors grow in a three-dimensional space, the BEP model assumes tumor growth on the a two-dimensional square lattice; extension to a three-dimensional lattice should be considered as a future improvement. For onlattice models, various other simulators has been developed for studying tumor growth (off-lattice models which do not assume that tumor growth on the lattice reflects the actual situations more accurately, but are computationally intensive and not commonly used [7]). For example, the pioneering works of agent-based

#### **Figure 2.**

*Visualization of a simulation based on the BEP model. (A) Evolutionary snapshots obtained by simulating twodimensional tumor growth based on the BEP model with an appropriate parameter setting. The region with the same color represents a clone with the same set of mutated genes. (B) Single-cell mutation profiles at three time points in the simulated tumor growth. Top colored bands represent clones, whereas the blue bands on the left represent driver genes. This image was obtained by modifying a figure that originally appeared in [6].*

modeling were performed by Anderson and colleagues [8, 9]. Enderling and colleagues extended the model to incorporate cell differentiation from cancer stem cells where differentiated cells have a limited potential for cell division [10, 11]. Sottoriva *et al.* [12] found that hierarchical organization of cell differentiation affects tumor heterogeneity, which leads to an invasive morphology with fingerlike front. Waclaw *et al.* [13] predicted that dispersal and cell turnover limit intratumor heterogeneity.

Each group developed a model different from the others, and thus only limited conditions were considered in each study. To address this issue, Iwasaki and Innan [14] developed a flexible and comprehensive simulation framework named as *tumopp* so that all previous models were included in a single program. This enables researchers to explore the effects of various model settings with simple commandline options. For example, the combined effect of local density on the cell division rate and method of placing new cells had a major impact on ITH. Under the condition with random push and no local competition, all cells undergo cell division at a constant rate regardless of the local density, and new cells are placed while randomly pushing out pre-existing neighbor cells. This behavior creates shuffled patterns of ITH with weak isolation by distance. In contrast, under the conditions of strong resource competition, the division rate of a cell is higher when it has more space (empty sites) in the neighborhood, which generally applies to cells near the surface of the simulated tumor; new cells are placed to fill the empty space without pushing existing cells. This setting tends to create a biased complex shape with clusters of genetically closely related cells, resulting in strong isolation by distance. Thus, it has been demonstrated that various patterns in the shape and heterogeneity of tumors arose depending on the model setting even without Darwinian selection. This suggests a caveat in analyzing ITH data with simulations using limited settings because another setting may predict a different ITH pattern, which could result in a different conclusion.

Moreover, *tumopp* introduced several other factors to relax various assumptions. First, it adopted a gamma function for the waiting time involved in cell division, whereas all previous studies assumed a simple decreasing (i.e., exponential) function implicitly or explicitly. The shape of the gamma distribution can be specified by parameter *k*, which affects the growth curve and inequality of clones in a tumor. An exponential distribution, which is the most widely used, is included as a special case with *k* ¼ 1, whereas punctual and deterministic cell divisions can be achieved with *k* ¼ ∞. It is reasonable to expect that the true value of *k* lies somewhere inbetween, such as at � 10, because cell division is neither a memoryless Poisson process (*k* ¼ 1; equivalent to an exponential distribution) nor perfectly synchronized (*k* ¼ ∞; equivalent to Dirac delta distribution) [15, 16]. Second, a hexagonal lattice has been implemented, which is thought to be biologically more reasonable because the distance to all neighboring cells is identical so that there is only one definition of the neighborhood. A hexagonal lattice should be used for future work when the spatial pattern is of interest to the study. These factors will contribute to simulating the evolutionary processes of cellular populations under more realistic conditions.
