# **3. Analyzing group differences in brain networks**

After brain network construction, the preprocessed fMRI data from each scan of each subject have been converted to a brain network represented by a graph adjacency matrix. The next question is how to find differences between groups of brain networks. Here we summarize two popular methods for analyzing brain networks further.

## **3.1 Significance analysis**

The most basic method is to analyze functional connectivity (FC) directly. Suppose we are comparing two groups of networks. For each connection, its value is extracted from every network, forming two sets of values, one per group. Statistical hypothesis testing then decides whether the connection differs significantly between the groups and which group has the higher value. After every connection in the network has been compared, the significantly different connections are stored in a group difference network for further discussion. We can also select several regions based on prior knowledge, such as the sensorimotor or visual areas, to further filter the set of significantly different connections.
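The edge-wise comparison above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes symmetric connectivity matrices stacked per group, an uncorrected two-sample t-test, and a hypothetical function name `edgewise_difference`.

```python
import numpy as np
from scipy import stats

def edgewise_difference(group_a, group_b, alpha=0.05):
    """Compare two groups of symmetric networks edge by edge.

    group_a, group_b: arrays of shape (n_subjects, n_nodes, n_nodes).
    Returns a signed difference network (+1: group A higher, -1: group B
    higher, 0: not significant) and the p-value of every upper-triangle edge.
    """
    n_nodes = group_a.shape[1]
    iu = np.triu_indices(n_nodes, k=1)        # each undirected edge once
    a = group_a[:, iu[0], iu[1]]              # (n_subjects, n_edges)
    b = group_b[:, iu[0], iu[1]]
    t, p = stats.ttest_ind(a, b, axis=0)      # assumed test: two-sample t-test
    sig = p < alpha                           # uncorrected threshold
    diff = np.zeros((n_nodes, n_nodes), dtype=int)
    diff[iu[0][sig], iu[1][sig]] = np.sign(t[sig]).astype(int)
    return diff + diff.T, p                   # mirror to the lower triangle
```

Filtering by prior knowledge then amounts to keeping only the rows and columns of the returned matrix that belong to the regions of interest.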

Another method is to calculate graph theory attributes. Graph theory characterizes the topology of the network by nodal and global attributes. Common node-level graph theory attributes are betweenness centrality, clustering coefficient, local efficiency, modularity, and weighted degree, while the network-level graph theory attributes include global efficiency and characteristic path length. Small-worldness is also a common index used in brain network analysis. For multilevel brain networks, we define intra-region features as the attributes calculated on voxel-based local networks, and inter-region features as the attributes calculated on region-based whole-brain networks. We can calculate the global feature of each voxel-based local network (intra-region features) and the nodal feature of the region-based whole-brain network (inter-region features). As a result, for each graph attribute, we obtain a feature vector whose length equals the number of nodes in the network, representing the whole-brain network feature.
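As a rough illustration of a few of these attributes, the sketch below computes weighted degree and clustering coefficient (nodal) together with global efficiency and characteristic path length (global) for one network. The binarization threshold, the use of the binarized graph for path-based measures, and the function name `graph_attributes` are assumptions made for the example.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path

def graph_attributes(adj, threshold=0.0):
    """Nodal and global attributes of one undirected network (a sketch)."""
    w = np.array(adj, dtype=float)
    b = (w > threshold).astype(float)         # binarized adjacency (assumed rule)
    np.fill_diagonal(b, 0.0)                  # no self-connections
    w = w * b                                 # keep weights of retained edges

    weighted_degree = w.sum(axis=1)
    degree = b.sum(axis=1)
    triangles = np.diag(b @ b @ b) / 2.0      # triangles through each node
    possible = degree * (degree - 1) / 2.0    # neighbor pairs of each node
    clustering = np.divide(triangles, possible,
                           out=np.zeros_like(triangles), where=possible > 0)

    # Hop-count shortest paths; unreachable pairs get an infinite distance.
    dist = shortest_path(b, directed=False, unweighted=True)
    d = dist[~np.eye(len(b), dtype=bool)]     # off-diagonal distances
    finite = np.isfinite(d)
    return {
        "weighted_degree": weighted_degree,
        "clustering": clustering,
        "global_efficiency": float(np.mean(1.0 / d)),   # 1/inf counts as 0
        "char_path_length": float(d[finite].mean()) if finite.any() else np.inf,
    }
```

Applied to the region-based whole-brain network, the nodal entries give one value per region; applied to each voxel-based local network, the global entries give one value per region as well, so both routes end in a feature vector over regions.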

After obtaining feature vectors of graph theory attributes, we can perform a statistical comparison on each region, similar to the FC analysis. The feature at each region is extracted, forming two sets of values, and statistical testing is used to find significant regions or significantly different features. Moreover, the clinical relevance of the features can be evaluated by assessing the correlation between features and clinical scores, which yields the significantly correlated features. The intersection of significantly different and significantly correlated features is selected for further discussion and subsequent analysis.
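The intersection step can be sketched as below. The choice of a two-sample t-test for the group comparison and Pearson correlation for the clinical association is an assumption, as are the function and argument names.

```python
import numpy as np
from scipy import stats

def select_features(feat_patients, feat_controls, clinical_scores, alpha=0.05):
    """Keep regions whose feature both differs between groups and
    correlates with the clinical score in the patient group.

    feat_patients, feat_controls: (n_subjects, n_regions) feature matrices.
    clinical_scores: (n_patients,) score for each patient.
    """
    # Group difference, one test per region (uncorrected).
    _, p_diff = stats.ttest_ind(feat_patients, feat_controls, axis=0)

    # Clinical relevance, one correlation per region (uncorrected).
    n_regions = feat_patients.shape[1]
    p_corr = np.empty(n_regions)
    for j in range(n_regions):
        _, p_corr[j] = stats.pearsonr(feat_patients[:, j], clinical_scores)

    selected = np.flatnonzero((p_diff < alpha) & (p_corr < alpha))
    return selected, p_diff, p_corr
```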

We also investigated methods to analyze dynamic graph theory attributes [51]. For dynamic brain networks, the brain network obtained at each sliding window location is static, so graph theory attributes can be calculated on it. As the window slides, graph theory attributes are estimated at each window location, forming the dynamic graph theory attributes of the dynamic network. To combine static and dynamic attributes with clinical scores, we proposed an analysis framework [51]. The strength and stability of dynamic graph attributes were calculated. We found significantly different and significantly correlated features for both static and dynamic networks, as well as their intersection. The resulting features were further analyzed using receiver operating characteristic (ROC) curves to test their classification ability.
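A minimal sketch of the sliding-window idea follows. The text does not define strength and stability, so the summary here is only an assumption: the temporal mean stands in for strength and the temporal standard deviation for (in)stability. Window length, step size, the use of absolute Pearson correlation, and all function names are likewise illustrative.

```python
import numpy as np

def dynamic_weighted_degree(ts, window=30, step=5):
    """Weighted degree of every region at every sliding-window location.

    ts: (n_timepoints, n_regions) regional time series.
    Returns an array of shape (n_windows, n_regions).
    """
    n_t = ts.shape[0]
    degrees = []
    for start in range(0, n_t - window + 1, step):
        corr = np.corrcoef(ts[start:start + window].T)  # static network in window
        np.fill_diagonal(corr, 0.0)
        degrees.append(np.abs(corr).sum(axis=1))        # nodal attribute
    return np.array(degrees)

def summarize_dynamics(dyn):
    """Assumed summaries: temporal mean and temporal standard deviation
    (a lower standard deviation meaning a more stable attribute)."""
    return dyn.mean(axis=0), dyn.std(axis=0)
```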

A controversy regarding the above analysis method is the multiple comparison problem. For a single statistical comparison at a 0.05 significance level, there is a 0.05 chance of obtaining a false positive when no true difference exists. However, when many statistical comparisons are performed at the same time, the chance of getting at least one false positive grows with the number of comparisons. To tackle this problem, correction methods such as Bonferroni correction and false discovery rate (FDR) correction were proposed. The basic idea behind these correction methods is to lower the significance level of each single comparison according to the number of comparisons. However, since the number of comparisons is related to the number of nodes in the network, and certain features show high within-group variance, directly applying a correction might leave no significant result. We argue that statistical comparison can be seen as a feature selection procedure. The significant or selected features are then fed into the next module, such as a classifier. During feature selection, we should keep as much useful information as possible. The uncorrected significant features are preliminary screening results, and taking the intersection of significantly different and significantly correlated features further selects clinically relevant information. Searching for intersected significant features might be an alternative to multiple comparison correction.
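To make the trade-off concrete, the sketch below computes the chance of at least one false positive, which is 1 - (1 - α)^m for m comparisons under the simplifying assumption that the tests are independent, and implements the Bonferroni and Benjamini-Hochberg FDR rules directly in NumPy. The function names are illustrative.

```python
import numpy as np

def familywise_error_rate(alpha, m):
    """Chance of at least one false positive in m independent tests."""
    return 1.0 - (1.0 - alpha) ** m

def bonferroni(p, alpha=0.05):
    """Reject where p is below alpha divided by the number of tests."""
    p = np.asarray(p, dtype=float)
    return p < alpha / p.size

def benjamini_hochberg(p, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR."""
    p = np.asarray(p, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Largest rank k with p_(k) <= alpha * k / m; reject all ranks up to k.
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.flatnonzero(below))
        reject[order[:k + 1]] = True
    return reject

p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
n_uncorrected = int(np.sum(np.asarray(p_values) < 0.05))   # 5 features survive
n_bonferroni = int(bonferroni(p_values).sum())             # 1 feature survives
n_fdr = int(benjamini_hochberg(p_values).sum())            # 2 features survive
```

With only ten comparisons the corrections already discard most of the uncorrected findings, which illustrates why a correction applied over all nodes or edges of a network can leave nothing to pass on to a classifier.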

## **3.2 Network-based statistics**

To overcome the multiple comparison issue for brain networks, network-based statistics (NBS) was proposed, enabling direct comparison of groups of brain networks [52]. NBS assumes that the effect, or group difference, forms a connected structure rather than a set of isolated single connections. The edge-wise comparison is performed first, and the links are thresholded according to the resulting test statistics or p-values, producing a binarized difference network. NBS then searches for connected components in the binarized difference network. The size of a component, defined as its number of edges or nodes, is tested for significance with a permutation test: the group labels of the samples are randomly shuffled, and the same procedure is performed to find the maximum component size. The permutation is repeated 5000 times to obtain the empirical distribution of the maximal component size. An empirical p-value is assigned to the original connected component as the proportion of permutations in which the maximal size is larger than the original size.
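The procedure can be sketched as follows for undirected networks. This is not the reference NBS implementation: it assumes a t-statistic threshold, measures component size in edges, tests only the largest observed component, and counts ties as exceedances (a slightly more conservative variant of the rule stated above). Function names and default values other than the 5000 permutations are illustrative.

```python
import numpy as np
from scipy import stats
from scipy.sparse.csgraph import connected_components

def max_component_size(group_a, group_b, t_threshold):
    """Edge count of the largest component in the thresholded difference network."""
    n = group_a.shape[1]
    iu = np.triu_indices(n, k=1)
    t, _ = stats.ttest_ind(group_a[:, iu[0], iu[1]],
                           group_b[:, iu[0], iu[1]], axis=0)
    keep = np.abs(t) > t_threshold                  # supra-threshold edges
    supra = np.zeros((n, n))
    supra[iu[0][keep], iu[1][keep]] = 1
    _, labels = connected_components(supra + supra.T, directed=False)
    # Both endpoints of an edge share a label, so count edges per label.
    sizes = np.bincount(labels[iu[0][keep]], minlength=n)
    return int(sizes.max())

def nbs_pvalue(group_a, group_b, t_threshold=3.0, n_perm=5000, seed=0):
    """Permutation p-value of the largest observed component."""
    rng = np.random.default_rng(seed)
    observed = max_component_size(group_a, group_b, t_threshold)
    pooled = np.concatenate([group_a, group_b])
    n_a = len(group_a)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))          # shuffle group labels
        null = max_component_size(pooled[idx[:n_a]], pooled[idx[n_a:]],
                                  t_threshold)
        exceed += int(null >= observed)             # ties counted (assumption)
    return observed, exceed / n_perm
```

Because every permutation contributes only its maximal component size, a single test is made per component rather than one per edge, which is where the gain in power over edge-wise correction comes from.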

Compared with edge-wise comparison and direct edge-wise correction, NBS provides higher statistical power at the cost of coarser spatial resolution in detecting differences [52]. In other words, NBS can only declare the connected component as a whole to be significant. It draws no conclusion on the significance of any single connection within the component. Moreover, the original NBS only works for symmetric adjacency matrices, which correspond to undirected functional connectivity.

Based on directed connectivity, we proposed the extended-NBS (e-NBS) to search for altered connected components in groups of directed networks [47]. An overview of the method is shown in **Figure 3**. We search for strongly connected components (SCCs) and weakly connected components (WCCs), that is, components defined with and without direction information, respectively. A classical depth-first search algorithm was adopted to find SCCs and WCCs. The edge-wise p-value was used to filter candidate connections and construct a difference network. Since there is no consensus on how to choose the pre-defined p-value threshold, we varied it within a certain range to test method performance. Specifically, an edge is kept if its p-value is less than the pre-defined p-value threshold. For edge-wise comparison, we tried both the two-sample t-test and the non-parametric Mann-Whitney test. The e-NBS method, together with the CCM-based directed connection estimation method, was verified using a dataset of spinal cord injury patients and healthy controls.

#### **Figure 4.**

*Two-step connected component. The first-level node is directly connected to the ROI in the binarized difference network, while the second-level node is connected with the first-level node.*
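The component search on a directed difference network can be sketched as below. Note the substitution: instead of a hand-written depth-first search, the example calls SciPy's `connected_components`, which supports both strong and weak connectivity. The convention that entry `[i, j]` holds the p-value of the connection from region `i` to region `j`, the default threshold, and the function name are assumptions.

```python
import numpy as np
from scipy.sparse.csgraph import connected_components

def directed_components(p_values, p_threshold=0.01):
    """Label SCCs and WCCs of the thresholded directed difference network.

    p_values: (n, n) array, p_values[i, j] being the edge-wise p-value of
    the directed connection i -> j (assumed convention).
    """
    diff = (p_values < p_threshold).astype(int)   # keep edges below threshold
    np.fill_diagonal(diff, 0)
    # Strong: every node reaches every other node along edge directions.
    _, scc = connected_components(diff, directed=True, connection="strong")
    # Weak: nodes are connected once edge directions are ignored.
    _, wcc = connected_components(diff, directed=True, connection="weak")
    return diff, scc, wcc
```

Nodes that share a label belong to the same component, so component sizes follow from counting nodes or edges per label, and the permutation test proceeds as in the undirected case.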

Moreover, we note that, given the framework of e-NBS, one can define connected components that suit research needs. For example, in a study of motor function alteration following spinal cord injury, researchers are interested in connections related to sensorimotor areas and visual regions. The connected component can then be defined as the significantly different connections related to these regions of interest (ROIs). Furthermore, we can define two-step connected components that comprise connections directly related to the ROIs in the binarized difference network, as well as connections related to the regions (first-level nodes) that connect with the ROIs (**Figure 4**). Either way, the permutation test in e-NBS makes it possible to draw conclusions on the significance of the defined component.
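A two-step component of this kind could be extracted as in the sketch below. It assumes that edge direction is ignored when collecting neighbors and that component size is measured as the number of retained edges; the function name `two_step_component` is hypothetical.

```python
import numpy as np

def two_step_component(diff, roi_nodes):
    """Collect the ROI-centered two-step component of a difference network.

    diff: (n, n) binarized difference network, nonzero = significant edge.
    roi_nodes: indices of the regions of interest.
    Returns boolean masks of first- and second-level nodes and the set of
    edges that touch an ROI or a first-level node.
    """
    und = (diff != 0) | (diff.T != 0)             # ignore direction (assumption)
    np.fill_diagonal(und, False)
    roi = np.zeros(len(und), dtype=bool)
    roi[list(roi_nodes)] = True

    first = und[roi].any(axis=0) & ~roi           # neighbors of the ROIs
    second = und[first].any(axis=0) & ~roi & ~first   # neighbors of first level

    edges = set()
    for i, j in zip(*np.nonzero(np.triu(und, k=1))):
        if roi[i] or roi[j] or first[i] or first[j]:
            edges.add((int(i), int(j)))
    return first, second, edges
```

The number of edges returned can serve as the component size that is recomputed under each label permutation to obtain its empirical p-value.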
