**Graphical Models of Functional MRI Data for Assessing Brain Connectivity**

Junning Li<sup>1</sup>, Z. Jane Wang<sup>1</sup> and Martin J. McKeown<sup>2</sup>

<sup>1</sup>*Department of Electrical and Computer Engineering* <sup>2</sup>*Department of Medicine (Neurology), Pacific Parkinson's Research Centre, University of British Columbia, Canada*

#### **1. Introduction**

#### **1.1 Brain connectivity and fMRI**

Modern neuroimaging technologies have allowed researchers to non-invasively observe indirect markers of brain activity *in vivo* (Fig. 1). This has resulted in a rapid growth of studies trying to ascertain which brain loci are associated with particular cognitive, sensory and motor tasks. In particular, the development of functional magnetic resonance imaging (fMRI) has allowed researchers to non-invasively investigate brain activity at excellent spatial resolution and relatively good temporal resolution. While probing aspects of brain function typically falls within the domain of neuroscientists, fMRI work is inherently interdisciplinary: it involves MR physicists who determine MRI sequences sensitive to small changes in the brain, neuroscientists who design the behavioural experiments and interpret the observations, statisticians who assess the significance of changes, and, increasingly, people with signal processing expertise who derive more information from the extracted time series.

Analysis of fMRI data sets represents a special challenge for traditional statistical methods, which were originally designed for a large number of samples of low-dimensional data points. The number of "voxels" (i.e. volume elements, each representing a specific locus in the brain) to be analyzed is large (≈ 10<sup>5</sup>), yet the number of time points (≈ 10<sup>2</sup>) is relatively small. Most early fMRI analysis methods were designed to ascertain the regions where brain functions are localized by performing voxel-wise analysis.

Even when simple tasks are performed in the MRI scanner, widespread activation can be observed in the brain with fMRI. These and other studies suggest that the brain is active at multiple spatial and time scales supporting both segregated and distributed information processing (Bassett & Bullmore, 2006). In fact, the advent of non-invasive functional neuroimaging has re-ignited a centuries-old debate about whether or not cognitive and motor tasks are encoded in discrete loci or are more diffusely and fluidly represented, the latter emphasizing the importance of assessing brain connectivity (Catani & ffytche, 2005).

While connectivity appears to be of critical importance for understanding and assessing brain function, it can be difficult to define rigorously with current technologies, which can only probe brain activity at certain spatial and temporal scales (see Fig. 1). Conventionally, brain connectivity is studied at three levels: anatomical, functional, and effective connectivity (see Fig. 2). Anatomical connectivity refers to actual physical connections between brain structures. It can be determined from the rich body of anatomical studies developed over decades or, more recently, using MR techniques such as Diffusion Tensor Imaging (DTI). Functional connectivity is defined as the significant mutual information between the time series found at distinct loci in the brain. However, this raises several problems. If two regions have similar time series, is this because one region influences the other, or because a third region affects both (Figs. 4 and 5)? Thus the term effective connectivity has been used to imply the causal influence that activity in one brain region exerts over the activity of another. The importance of assessing brain effective connectivity is also related to the fact that brain connectivity impairments are associated with many neuropsychiatric diseases such as depression (Schlösser et al., 2008), schizophrenia (Schlösser et al., 2008), Alzheimer's (Supekar et al., 2008) and Parkinson's disease (Palmer et al., 2009).

Fig. 1. Temporal and spatial resolution of current neuro-imaging technology. TMS: transcranial magnetic stimulation, MEG: magnetoencephalography, EEG: electroencephalography, PET: positron emission tomography, and Pharm.: pharmacological. (Adapted from: Churchland, Patricia, and Terrence Sejnowski (1992) The Computational Brain. Cambridge, MA: MIT Press.)

Fig. 2. Conventionally, brain connectivity is studied at three levels: (a) anatomical, (b) functional, and (c) effective connectivity. Anatomical connectivity is actual physical connections between brain structures. Functional connectivity is defined as the significant mutual information between the time series found at distinct loci in the brain. Effective connectivity has been used to imply the causal influence that activity in one brain region exerts over the activity of another. \* Sub-figure (a) is from P. Hagmann, J.-P. Thiran, L. Jonasson, P. Vandergheynst, S. Clarke, P. Maeder and R. Meuli (2003) DTI mapping of human brain connectivity: statistical fibre tracking and virtual dissection, NeuroImage 19(3): 545–554. \*\* Sub-figure (b) is from Daniel S. Margulies, A.M. Clare Kelly, Lucina Q. Uddin, Bharat B. Biswal, F. Xavier Castellanos and Michael P. Milham (2007) NeuroImage 37(2): 579–588.

#### **1.2 Graphical models for brain effective connectivity**

Many methods for inferring connectivity from the four-dimensional fMRI data (three spatial dimensions and one temporal dimension) have been suggested. Proposed methods include correlation thresholding (Cao & Worsley, 1999), linear decomposition (Calhoun et al., 2001; McKeown, 2000), structural equation models (SEM) (Bollen, 1989), multi-variate auto-regression (Valdes-Sosa et al., 2005), dynamic causal models (Friston et al., 2003), Bayesian networks (Li et al., 2008; Zheng & Rajapakse, 2006), wavelet analysis (Bullmore et al., 2004), and clustering (Heller et al., 2006).

Correlation thresholding (Cao & Worsley, 1999) directly examines the correlation between the activities of brain regions. If the correlation is so strong that it is extremely unlikely to have arisen by chance, then the two regions are considered connected, though not necessarily directly. Linear decomposition approaches, e.g. principal component analysis and independent component analysis (ICA) (Calhoun et al., 2001; McKeown, 2000), assume that observed brain activities are a combination of underlying psychological processes that spatially recruit different brain regions or temporally have unrelated behaviours. Regions involved in the same psychological process, as revealed by the decomposition, are considered connected, though again not necessarily directly. Both correlation thresholding and linear decomposition are designed for discovering functional connectivity, and neither can distinguish whether two regions interact directly or indirectly through a third region (Kaminski, 2005). Although correlation thresholding and linear decomposition are generally not regarded as graphical models, both can in fact be related to graphical models (Roweis & Ghahramani, 1999).
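As a rough illustration of correlation thresholding, the sketch below flags pairs of regional time series whose Pearson correlation is significant. It is a simplified stand-in rather than the method of Cao & Worsley (1999): it assumes independent time points (real fMRI series are autocorrelated), uses the Fisher z-transform for the test, and applies a Bonferroni correction over region pairs. The function names, the synthetic signals and the 0.05 level are all illustrative assumptions.

```python
import math
import random
from statistics import NormalDist


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def threshold_correlations(series, alpha=0.05):
    """Return the set of region pairs with a significant correlation.

    series maps a region name to its time series (all of equal length).
    """
    names = sorted(series)
    n = len(series[names[0]])
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    # Bonferroni: split alpha over all tested pairs (two-sided test).
    z_crit = NormalDist().inv_cdf(1 - alpha / (2 * len(pairs)))
    edges = set()
    for a, b in pairs:
        r = pearson(series[a], series[b])
        z = math.atanh(r) * math.sqrt(n - 3)  # Fisher z statistic
        if abs(z) > z_crit:
            edges.add((a, b))
    return edges


# Synthetic data (assumed): A and B are both noisy copies of C; D is unrelated.
rng = random.Random(0)
n = 200
c = [rng.gauss(0, 1) for _ in range(n)]
a = [v + 0.5 * rng.gauss(0, 1) for v in c]
b = [v + 0.5 * rng.gauss(0, 1) for v in c]
d = [rng.gauss(0, 1) for _ in range(n)]
edges = threshold_correlations({"A": a, "B": b, "C": c, "D": d})
print(sorted(edges))
```

In this toy setting the pair (A, B) is flagged even though A and B interact only through C, which is exactly the limitation noted above: a significant correlation does not imply a direct connection.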

Unlike correlation thresholding and linear decomposition, whose results can be visualized as brain images at the voxel level, structural equation models<sup>1</sup> (Bollen, 1989), dynamic causal models (Friston et al., 2003), multivariate auto-regression (Valdes-Sosa et al., 2005), and Bayesian networks (Zheng & Rajapakse, 2006) form another category of methods that normally work at the level of regions, and whose results can be visualized as graphs in which nodes usually represent brain regions and edges represent connections. The brain regions are typically defined anatomically, and some automatic or manual segmentation of brain structures is required to define the nodes of the model. According to the interaction relationships specified by the graph, the joint probability of the node random variables can be decomposed as the product of many local potential functions or local conditional probabilities, as shown in Fig. 3. A node variable usually depends on its neighbor variables and/or parent variables. For example, in Bayesian networks, the activity of a region A is usually modeled as a stochastic function of the activities of its "parent" regions, as in Eq. (1)

$$X_A = f(X_{pa_1[A]}, X_{pa_2[A]}, \dots, X_{pa_n[A]}) \tag{1}$$

where $X_A$ is the activity of region A and the $pa_i[A]$ are the parent nodes of A in the graph. The graph structure of the model is not just for visualization, but encodes conditional-independence relationships among the activities of brain regions. A network structure can be translated into a set of conditional-independence relationships according to the Markov properties; conversely, under certain assumptions, a set of conditional-independence relationships can be encoded by a network structure (Lauritzen, 1996).

<sup>1</sup> Structural equation models allow reciprocal connections and are normally not considered classical graphical models. As advanced graphical models, their Markov properties and equivalence classes have been explored in (Ali et al., 2009; Richardson, 2003; Spirtes et al., 1998).

Fig. 3. Examples of the structures of classical graphical models (panels: Markov random field, Bayesian network, chain graph model, latent layer, time slices). The structure of a Markov random field is an undirected graph. The joint probability is decomposed as the product of clique potential functions, $P(x) = \prod_{c \in C} \Phi_c(x_c)$, where $c$ is a clique in the graph and $x_c$ denotes the variables associated with the nodes in $c$. The structure of a Bayesian network is a directed acyclic graph. The joint probability is decomposed as the product of node conditional probabilities, $P(x) = \prod_{i \in N} P_i(x_i \mid x_{pa[i]})$, where $i$ is a node in the graph and $pa[i]$ denotes the parent nodes of node $i$. Chain graph models unify Markov random fields and Bayesian networks. They allow both directed and undirected edges, but forbid directed cycles. The joint probability is decomposed as the product of chain-component conditional probabilities, $P(x) = \prod_{\tau \in T} P_\tau(x_\tau \mid x_{pa[\tau]})$, where $\tau$ is a chain component and $pa[\tau]$ denotes the parent nodes of the component. The chain-component conditional probability $P_\tau(x_\tau \mid x_{pa[\tau]})$ can be further decomposed as a product of clique potential functions $\Phi_c(x_c)$, where $c$ is a clique in the moral graph derived from the chain component $\tau$. Dynamic causal models (Friston et al., 2003) can be regarded as non-linear Bayesian networks with an observed layer and a latent layer. Multi-variate auto-regression (Valdes-Sosa et al., 2005) can be regarded as linear Bayesian networks with many time slices and directed edges from slices at time $t-1, t-2, \dots$ pointing to the slice at time $t$.
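To make the link between graph structure and conditional independence concrete, the following sketch builds a toy discrete Bayesian network and checks one Markov property numerically. Everything in it is an illustrative assumption: three binary "regions" with edges C → A and C → B, made-up conditional probability tables, and brute-force summation for marginals. It is not the model used for fMRI data in this chapter, only a minimal instance of the factorization $P(x) = \prod_i P_i(x_i \mid x_{pa[i]})$ described in Fig. 3.

```python
from itertools import product

# Hypothetical network: C -> A and C -> B (0 = inactive, 1 = active).
parents = {"C": (), "A": ("C",), "B": ("C",)}
# cpt[node][parent_values] = P(node = 1 | parents); the numbers are made up.
cpt = {
    "C": {(): 0.3},
    "A": {(0,): 0.1, (1,): 0.8},
    "B": {(0,): 0.2, (1,): 0.7},
}


def joint(assign):
    """Joint probability of a full assignment as a product of node conditionals."""
    p = 1.0
    for node, pa in parents.items():
        p1 = cpt[node][tuple(assign[q] for q in pa)]
        p *= p1 if assign[node] == 1 else 1.0 - p1
    return p


def prob(**fixed):
    """Probability of a partial assignment, summing the joint over free nodes."""
    free = [v for v in parents if v not in fixed]
    total = 0.0
    for values in product((0, 1), repeat=len(free)):
        total += joint({**fixed, **dict(zip(free, values))})
    return total


# Given C, the graph implies that A and B are independent ...
p_ab_given_c = prob(A=1, B=1, C=1) / prob(C=1)
p_a_given_c = prob(A=1, C=1) / prob(C=1)
p_b_given_c = prob(B=1, C=1) / prob(C=1)
# ... although they are dependent when C is ignored.
p_ab, p_a, p_b = prob(A=1, B=1), prob(A=1), prob(B=1)

print(p_ab_given_c, p_a_given_c * p_b_given_c)  # equal up to rounding
print(p_ab, p_a * p_b)                          # clearly different
```

The conditional probabilities factorize exactly once C is fixed, while the marginal probabilities do not, which is the conditional-independence statement encoded by the missing edge between A and B.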

#### **1.3 Pair-wise and conditional correlation**

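Graphical models are suitable for modelling brain connectivity not only because their structures can easily be visualized as networks but, more importantly, because their fundamental feature, conditional independence, is a key concept for differentiating effective connectivity from functional connectivity. When two brain regions show similar activation patterns, several underlying possibilities could explain the similarity, as illustrated in Fig. 4:

1. they directly and reciprocally communicate with each other;
2. one region directly influences the other;
3. they indirectly and reciprocally communicate with each other via other brain regions;
4. one region indirectly influences the other via other regions;
5. both are driven by other regions;
6. they communicate through a combination of the above possibilities.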

Fig. 4. When two brain regions show similar activation patterns, they can be connected in different underlying ways: (1) they directly and reciprocally communicate with each other; (2) one region directly influences the other; (3) they indirectly and reciprocally communicate with each other via other brain regions; (4) one region indirectly influences the other via other regions; (5) both are driven by other regions; (6) they communicate through a combination of (1)–(5).


Pair-wise correlation can only indicate that two regions are probably connected; it cannot distinguish among the above possibilities. To distinguish between direct and indirect connections, conditional independence must be considered. The example in Fig. 5 illustrates this motivation. The two signals A and B show strong pair-wise correlation, but once a third signal C is taken into account, the residuals of A and B after C has been regressed out of them hardly show any correlation. In this example, A and B are conditionally independent given C, and both may be driven by C, as in the common-driver case (5) in Fig. 4. It must be noted that conditional independence alone, without temporal information, is not enough to determine causal relationships, i.e. the direction of connections. To infer the direction, criteria that take temporal information into account, such as Granger causality (Granger, 1969), can be employed.
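The residual-based reasoning behind Fig. 5 can be reproduced with a few lines of code. The sketch below is a minimal illustration under assumed conditions: synthetic signals in which A and B are each a noisy copy of C, a simple least-squares regression on C with an intercept, and arbitrary noise levels and random seed. Correlating the residuals in this way gives the partial correlation of A and B given C.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def residuals(y, x):
    """Residuals of y after least-squares regression on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [(b - my) - slope * (a - mx) for a, b in zip(x, y)]


# Synthetic signals (assumed): C drives both A and B; A and B never interact.
rng = random.Random(1)
n = 500
c = [rng.gauss(0, 1) for _ in range(n)]
a = [v + 0.5 * rng.gauss(0, 1) for v in c]
b = [v + 0.5 * rng.gauss(0, 1) for v in c]

r_ab = pearson(a, b)                                      # pair-wise correlation
r_ab_given_c = pearson(residuals(a, c), residuals(b, c))  # partial correlation
print(round(r_ab, 2), round(r_ab_given_c, 2))
```

With these settings the pair-wise correlation is strong while the partial correlation is close to zero, matching the common-driver interpretation. For Gaussian data a vanishing partial correlation corresponds to conditional independence; for other distributions it is only a necessary condition.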

Fig. 5. Two signals A (blue) and B (green) show strong pair-wise correlation, but when a third signal C (red) is considered, the residuals of A and B after removing the projection onto C hardly show any correlation. Panels: scatter plot of A and B; scatter plot of the residuals.

#### **1.4 Challenges in modeling brain connectivity**

Biomedical research explores the highly complex and diverse realm of living organisms and often incorporates clinical needs such as diagnosis and treatment design. Analysis of biomedical data typically emphasizes the reliability, interpretability and generality of reported results.

For example, when brain connections are reported, it is important to control or assess error rates in the claimed discoveries, addressing questions such as "how many among the reported connections are actually true connections?" and "how many true connections can be detected?"

Additionally, the ultimate goal of a biomedical experiment is usually a population inference applicable to a group of people, such as patients with a particular disease. However, subjects classified into the same experimental group according to the factor of interest can still be highly diverse with respect to other factors, such as gender, age, or race. Even repeated experiments with the same subject can be affected by various physical or psychological factors, such as drowsiness or stress. It is therefore important to integrate the information from separate experiments to make inferences on the target topic, and to keep a balance between commonality and diversity.

Finally, because this is a multidisciplinary field, the end users of connectivity analysis reports are often biomedical researchers or clinicians who focus on the biological implications of the results and the effects of medication. Therefore, it is undesirable to simply generate a vast network of potential connections or just report abstract statistical scores without providing an intuitive interpretation. Rather, clinicians prefer interpretable, informative and human-understandable results, for example, which brain regions play the central role in conducting a functional task, or which connections are normalized by a pharmacological manipulation. These considerations have implications for interpretation and feature extraction from graphical models.

As a response to the above common challenges in biomedical research (i.e. reliability, generality and interpretability), in the following sections we will focus on three topics: error control in learning brain connectivity, group analysis taking into account the enhanced inter-subject variability typically seen in patient populations, and brain network analysis. Finally, for completeness, we also briefly overview several popular software packages suitable for assessing fMRI brain connectivity in the Appendix.

#### **2. Error control in structure learning**

In real-world applications, especially in modelling brain connectivity, graphical models are not only a tool for operations such as classification or prediction; more often than not, it is the network structure of the model itself that is of particular interest. Thus a desirable graphical model of fMRI data should not only fit the overall data well statistically, but also accurately reflect the internal brain connectivity structure. Structure-learning algorithms must therefore control or assess the error rate of the connections/edges they detect.

#### **2.1 Criteria for error control**

There are two basic types of statistical errors: type I errors, i.e. falsely claiming connections that do not actually exist; and type II errors, i.e. failing to detect connections that truly exist. Since real data are not free from noise, limited samples may appear to support the existence of a connection when it does not exist, or vice versa. It is therefore impossible to completely prevent both types of errors simultaneously; rather, a balance must be kept between them. This can be done by, for example, minimizing a loss function associated with the two types of errors according to Bayesian decision theory.

There are several criteria available for error-rate control (see Table 2). Generally, no single criterion is universally superior if the research scenario is not specified. Selecting the error rate is largely not the abstract question "which error rate is superior to the others?", but the practical question "which error rate is the researchers' concern?". One error-rate criterion may be favored in one scenario while another may be appropriate in a different scenario, for example:

• We are diagnosing a serious disease whose treatment has serious potential side effects. Due to the risk of the treatment, we hope that less than 0.01% of healthy people will be falsely diagnosed as affected by the disease. In this case, the type I error rate should be controlled under 0.01%.

• We are diagnosing a disease with high mortality, e.g. a type of cancer. Because failure to detect the disease will have catastrophic consequences, we hope that 95% of subjects with the disease will be correctly detected. In this case, the type II error rate should be controlled under 5%.

• In a pilot study, we are selecting candidate genes for genetic research on Parkinson's disease. Because of limited funding, we can only study a limited number of genes, so when selecting candidate genes in the pilot study, we hope that 95% of the selections are
