# Step 7. Save results as .csv file
write.csv(results, file="xcmsresults.csv")
```
multiple peaks, and missing values, i.e., failure to detect certain peaks in some samples that can interfere with downstream statistical analysis [8].

**Figure 2.** Representative total ion chromatograms (TICs) of green fruits (A, C) and red fruits (B, D) of the hp‐2dg (A, B) and Manapal (C, D) tomato varieties.

The next step after filtration and peak identification is matching peaks across samples. Peaks representing the same analyte across samples must be placed into groups. This is accomplished with the *group* function, which returns a new xcmsSet object. After matching peaks into groups, XCMS can use those groups to identify and correct correlated drifts in retention time using the *retcor* function. The aligned peaks can then be used for a second pass of peak grouping, which will be more accurate than the first.

After the second pass of peak grouping, there will still be missing peaks from some of the samples. This can occur because peaks were missed during peak identification or because an analyte was not present or below the detection limit in a sample. Missing values can be problematic for statistical methods that require a fully defined data matrix. Those missing data points can be filled in by re‐reading the raw data files and integrating them in the regions of missing peaks using the *fillPeaks* function.

XCMS can generate a report showing fold change differences in analyte intensities and their statistical significance using the *diffreport* function. However, we recommend obtaining the raw peak integration results using the *groupval* function. This function returns a numerical matrix in which each row represents a peak defined by its mass and retention time and each column represents a different sample. In an example used here, 308 peaks were identified with 1.4% missing values. This matrix provides the starting point for downstream statistical analysis.

74 Metabolomics - Fundamentals and Applications

## **3. Descriptive statistics**

Once the raw data have been processed in XCMS, it is often useful to obtain descriptive statistics for each variable. This can be accomplished in R by creating a function to calculate the mean, median, standard deviation, standard error, and coefficient of variation for each variable. The *apply* function executes each of these operations on every row, and the results are then combined into a data frame.

```
sumstats <- function(z) {
      Mean <- apply(z, 1, mean)
      Median <- apply(z, 1, median)
      SD <- apply(z, 1, sd)
      SE <- apply(z, 1, function(x) sd(x)/sqrt(length(x)))
      CV <- apply(z, 1, function(x) sd(x)/mean(x))
      result <- data.frame(Mean, Median, SD, SE, CV)
      return(result)
      }
```
This function creates a new data object containing descriptive statistics for each variable that can be used to rank variables according to mean or median intensity or to assess the degree of dispersion for each variable.
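As a usage sketch, the same statistics can be computed on a small simulated matrix and used to rank the variables. The matrix `peaks` and the helper name `row_summary` are hypothetical stand-ins (they are not part of XCMS), and the helper simply restates *sumstats* with vectorized row functions so that the example runs on its own.

```r
# Simulated stand-in for the groupval() output: 5 peaks (rows) x 6 samples (columns)
set.seed(42)
peaks <- matrix(rlnorm(30, meanlog = 8, sdlog = 1), nrow = 5,
                dimnames = list(paste0("peak", 1:5), paste0("sample", 1:6)))

# Same statistics as sumstats(), computed row by row
row_summary <- function(z) {
  Mean <- rowMeans(z)
  SD   <- apply(z, 1, sd)
  data.frame(Mean,
             Median = apply(z, 1, median),
             SD,
             SE = SD / sqrt(ncol(z)),
             CV = SD / Mean)
}

stats <- row_summary(peaks)

# Rank peaks from most to least variable according to the coefficient of variation
stats[order(stats$CV, decreasing = TRUE), ]
```

Note that *mean*, *median*, and *sd* return NA for any row that still contains missing values, so this step is most informative after imputation or with `na.rm = TRUE` added to each call.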

## **4. Data preprocessing**

Before higher order statistical methods can be applied, it is often necessary to "clean up" the data to improve the analysis. Typical steps include (1) imputation of missing values, (2) transformation, (3) scaling, and (4) normalization.

#### **4.1. Imputation of missing values**

A common phenomenon in metabolomics measurements is the occurrence of missing values, i.e., empty cells where a respective metabolite peak has not been assigned any numerical value. As many multivariate methods require a fully defined matrix or become computationally inefficient for incomplete data, estimation of missing values is an important step in the preparation of the data [8].

Even after using the *fillPeaks* function in XCMS, there are still typically a large number of missing values. There are several strategies for dealing with these, including removing variables with missing values that exceed a certain threshold. However, these are often interesting features that are important in discriminating experimental groups. An alternative and widely used approach is imputation, where missing values are replaced with a small value with the assumption that the feature in question is below the limit of detection in those samples where XCMS fails to detect a peak.

The following function can be used to find the minimum nonzero value in a set of numbers and then divide that value by 2. We can then use the *apply* function to replace the missing values in each row of the matrix using this function.

```
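# Replace zeros and NAs in a vector with half of its smallest positive value
replacezero <- function(x) "[<-"(x, !x | is.na(x), min(x[x > 0], na.rm = TRUE) / 2)
# apply() over rows returns its results as columns, so transpose back to peaks x samples
newdata <- t(apply(data, 1, replacezero))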
```
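The effect of half-minimum imputation is easiest to see on a toy matrix. The following sketch uses a hypothetical 3 × 4 intensity matrix and a more explicit helper, `half_min_impute` (an illustrative name, not an XCMS function), and assumes that intensities are nonnegative.

```r
# Hypothetical 3-peak x 4-sample intensity matrix containing zeros and NAs
peaks <- matrix(c(10, 0, 4, NA,
                   8, 6, 0,  2,
                   5, 5, 5,  5),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("p1", "p2", "p3"), paste0("s", 1:4)))

# Replace zeros and NAs in one row with half of that row's smallest detected value
half_min_impute <- function(x) {
  missing <- is.na(x) | x %in% 0
  x[missing] <- min(x[!missing]) / 2
  x
}

# apply() over rows returns one column per row, so transpose to restore peaks x samples
imputed <- t(apply(peaks, 1, half_min_impute))
imputed
```

Because *apply* with a margin of 1 returns each processed row as a column, the result must be transposed to keep peaks in rows. A row in which every value is missing has no minimum to work from and should be removed beforehand.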
#### **4.2. Log transformation**

Since metabolomics studies are generally concerned with relative changes in metabolite levels, a log or other suitable transformation is normally applied before performing higher order statistical analysis. A log transformation helps to remove heteroscedasticity from the data and correct for a skewed data distribution [7]. This operation is easily performed in R using the *log* function. The default option is to compute the natural logarithm. However, the general form *log* (*x*, base) computes logarithms with any desired base. The base 2 log transformation is commonly used in metabolomics studies. Note that the *log* function will return -Inf for any zero values in the data matrix (and NaN, with a warning, for negative values), so zeros should be replaced before the transformation is applied.
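A short sketch with hypothetical intensity values shows the behavior of the *log* function, including what happens to a zero that has not been imputed.

```r
x <- c(0, 1, 8, 1024)         # hypothetical intensities, including one zero

log(x)                        # natural logarithm (the default)
logx <- log(x, base = 2)      # base 2 logarithm, equivalent to log2(x)
logx

# Flag values that would break downstream statistics
is.finite(logx)
```

The first element of `logx` is -Inf rather than a finite number, which is why missing and zero values are imputed before the transformation.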

#### **4.3. Normalization and scaling**

The purpose of data normalization is to reduce systematic variation and to separate biological variation from nonbiological variation introduced by the experimental process. This is often necessary to improve the results of higher order statistical analysis [7, 23, 25].

Normalization can be sample wise or feature wise or both. Sample wise normalization makes the samples more comparable to each other. Common approaches include normalization to constant sum, to a reference sample or feature, or sample specific normalization such as dry weight or tissue volume.

Feature wise normalization involves centering the data around the mean combined with various types of scaling. Centering focuses the data on the amount of variation instead of the mean intensity. Scaling involves dividing each variable by a factor that approximates the amount of data dispersion. The most common scaling approach is known as unit scaling or autoscaling where each variable is mean centered and then divided by the standard deviation. After autoscaling all variables become equally important and are analyzed on the basis of correlations instead of covariances. A disadvantage of autoscaling is that it tends to inflate the importance of small variables which are more likely to contain measurement errors [7]. Other scaling operations include Pareto scaling, which uses the square root of the standard deviation as the scaling factor, vast scaling, which uses the standard deviation and the coefficient of variation as scaling factors, and range scaling, where the range is used as the scaling factor [7].

The *scale* function in R performs centering and autoscaling on the columns of a matrix. Other scaling procedures can be carried out using custom functions. For example, the following function can be used for Pareto scaling, which is recommended for metabolomics data.

```
paretoscale <- function(z) {
      rowmean <- apply(z, 1, mean) # row means
      rowsd <- apply(z, 1, sd) # row standard deviation
      rowsqrtsd <- sqrt(rowsd) # sqrt of sd
      rv <- sweep(z, 1, rowmean,"-") # mean center
      rv <- sweep(rv, 1, rowsqrtsd, "/") # divide by sqrtsd
      return(rv)
      }
```
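The scaling methods described above differ only in the divisor applied after mean centering. The following sketch makes this explicit with a hypothetical helper, `scale_rows`, applied to simulated data with one low-variance and one high-variance variable. It assumes, as elsewhere in this chapter, that variables are in rows and samples are in columns.

```r
# Center each row, then divide by a method-specific measure of dispersion
scale_rows <- function(z, method = c("auto", "pareto", "range")) {
  method   <- match.arg(method)
  centered <- sweep(z, 1, rowMeans(z), "-")
  rowsd    <- apply(z, 1, sd)
  divisor  <- switch(method,
                     auto   = rowsd,                                   # unit variance
                     pareto = sqrt(rowsd),                             # square root of SD
                     range  = apply(z, 1, function(x) diff(range(x)))) # max - min
  sweep(centered, 1, divisor, "/")
}

# Simulated data: row 1 has SD 0.5, row 2 has SD 5
set.seed(1)
z <- matrix(rnorm(40, mean = 10, sd = rep(c(0.5, 5), each = 20)),
            nrow = 2, byrow = TRUE)

auto   <- scale_rows(z, "auto")
pareto <- scale_rows(z, "pareto")
rng    <- scale_rows(z, "range")

apply(auto, 1, sd)     # both rows have SD 1 after autoscaling
apply(pareto, 1, sd)   # the high-variance row keeps a larger SD after Pareto scaling
```

Since *scale* works on columns, autoscaling a matrix with variables in rows requires a double transpose, `t(scale(t(z)))`, which gives the same values as `scale_rows(z, "auto")`.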
## **5. Principal component analysis**


Principal component analysis (PCA) is the foundation for many multivariate techniques that seek to describe a set of observations based on a large number of variables [25, 26]. The core idea of PCA is to reduce the dimensionality of the data, i.e., the number of variables while retaining as much of the variation as possible. Using PCA, it is possible to extract and display the systematic variation in the data and identify related groups, trends, and outliers.

PCA returns two important types of information: a scores matrix and a loadings matrix. The scores matrix contains the coordinates of the samples (i.e., observations) for each principal component and provides a summary of the observations in a lower dimensional space. The first principal component describes the largest variation in the data matrix, the second component describes the second largest, and so on. All PCA components are mutually orthogonal, meaning they are uncorrelated. Generally, most of the variation is captured in the first two or three principal components. Therefore, a scatter plot of the first two score vectors usually provides a good summary of all the samples and can reveal if there are differences between the groups as well as outliers.

Analogous to the scores matrix, the loadings matrix describes the relationships among the measured variables for each principal component. A scatterplot of the first two loading vectors can reveal the influence (weights) of individual variables in the model. An important aspect of PCA is that directions in the scores plot correspond to directions in the loadings plot, and so a comparison of these two plots can be used to identify which variables (loadings) are most important for separating the different samples (scores) [28].

When there are more than two experimental classes, it is generally appropriate to use multivariate methods such as PCA to find patterns in the data [27]. The primary goal of these methods is to determine if the classes can be predicted from the variables (class assignment) and to identify which variables are important in predicting class membership.

There are several ways to perform PCA in R. Here, we will demonstrate the procedure using the *prcomp* function, which comes with the built‐in R stats package. This method uses singular value decomposition (SVD) to calculate eigenvalues, which is the standard approach in PCA. The following syntax is used.

```
prcomp(data, retx = TRUE, center = TRUE, scale = FALSE)
```

where "data" is a dataframe or matrix containing the data, retx is a logical value that indicates whether the scores will be returned, center is a logical value indicating whether the variables should be mean centered, and scale is a logical value indicating whether the variables should be scaled to unit variance.
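The relationship between the scores and loadings matrices can be checked on simulated data. In this sketch, the matrix `x`, the group labels, and the size of the group shift are all hypothetical. Note that the formal argument name in *prcomp* is `scale.` (with a trailing dot); writing `scale` also works through partial argument matching.

```r
set.seed(123)
# Hypothetical data: 12 samples (rows) x 6 variables (columns);
# group B is shifted upward on variables 1-3
group <- rep(c("A", "B"), each = 6)
x <- matrix(rnorm(72), nrow = 12)
x[group == "B", 1:3] <- x[group == "B", 1:3] + 4
colnames(x) <- paste0("var", 1:6)

pca <- prcomp(x, retx = TRUE, center = TRUE, scale. = FALSE)

scores   <- pca$x          # samples x components
loadings <- pca$rotation   # variables x components

# Loadings are orthonormal and score vectors are mutually uncorrelated
round(crossprod(loadings), 10)
round(cor(scores)[1:3, 1:3], 10)

# Scores are the centered data projected onto the loadings
projected <- scale(x, center = TRUE, scale = FALSE) %*% loadings
```

The component standard deviations in `pca$sdev` are returned in decreasing order, so the first columns of `scores` always carry the largest share of the variance.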

After missing values have been replaced, the data are log transformed and Pareto scaled. Note that the Pareto scale function must first be defined as above.

```
logdata <- log(newdata, 2)
pareto.logdata <- paretoscale(logdata)
```
Now, we perform PCA. The center and scale options are set to FALSE since these operations have already been performed with the *paretoscale* function.

```
pca <- prcomp(t(pareto.logdata), center = FALSE, scale = FALSE)
```

The *t* function is used here to transpose the matrix so that each row represents an observation (sample) and each column represents a variable (peak). This is necessary if we want the scores matrix to correspond to samples and the loadings matrix to correspond to variables.

The *prcomp* function returns several outputs that can be accessed by the *summary* command.

```
pcaresults <- summary(pca)
```
This returns a list that contains the standard deviations (the square roots of the eigenvalues) and proportion of the total variance for each principal component, the scores matrix, and the loadings matrix. We can extract each of these outputs into a new data frame and save the results to file for later use.

```
scree.data <- as.data.frame(pcaresults$importance)
score.data <- as.data.frame(pcaresults$x)
loadings.data <- as.data.frame(pcaresults$rotation)
write.csv(scree.data, "pca_scree.csv")
write.csv(score.data, "pca_scores.csv")
write.csv(loadings.data, "pca_loadings.csv")
```
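The contents of the importance table can be verified by hand. The sketch below uses a hypothetical random matrix in place of the metabolomics data and recomputes the proportion of variance from the component standard deviations.

```r
set.seed(7)
x <- matrix(rnorm(60), nrow = 10)   # hypothetical data: 10 samples x 6 variables

pca <- prcomp(x)
pcaresults <- summary(pca)

# Rows: standard deviation, proportion of variance, cumulative proportion
imp <- pcaresults$importance
imp

# Eigenvalues are the squared standard deviations of the components
eig  <- pca$sdev^2
prop <- eig / sum(eig)
round(prop, 5)
```

The values in the second row of `imp` match `prop` up to rounding, and the cumulative proportion in the last column is 1, since all components together account for the total variance.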

#### **5.1. Scores plot**


The results of a PCA can be easily visualized using the base graphics functions in R. However, it is often desirable to produce a high‐quality figure with custom formatting using ggplot2. To do this, we first import the scores matrix from the PCA. Since the first two principal components capture most of the variance, we will subset the data to include only those values.

```
data <- read.csv("pca_scores.csv", header = TRUE)
data <- data[, c(1:3)] # subset columns 1-3
```

In order to map individual samples to their respective groups, we need to add a new column to the data frame indicating the group to which each sample belongs. To do this, we first create a vector of group names and then add the vector to the data frame with the *cbind* function. We can then generate the scores plot with ggplot2.

```
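library(ggplot2)

ggplot(data, aes(PC1, PC2)) +
  geom_point(aes(shape = Group)) +   # one symbol per experimental group
  geom_text(aes(label = X)) +        # X holds the sample names read from the csv file
  stat_ellipse(aes(fill = Group), geom = "polygon", alpha = 0.2)  # filled ellipse per group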
```
This script defines the data and adds layers for data points, text labels, and confidence ellipses. The resulting plot is shown in **Figure 3**.
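The group column described above might be built as follows. This is a sketch with a simulated stand-in for the scores table; the sample names, the group labels, and the assumption of two samples per group listed in file order are all hypothetical and must be adapted to the actual experimental design.

```r
# Simulated stand-in for the table read from pca_scores.csv
set.seed(1)
scores <- data.frame(X   = paste0("sample", 1:8),
                     PC1 = rnorm(8),
                     PC2 = rnorm(8))

# One label per sample, in the same order as the rows of the scores table
Group <- factor(rep(c("hp2dg_green", "hp2dg_red",
                      "Manapal_green", "Manapal_red"), each = 2))

# Add the vector to the data frame as a new column
scores <- cbind(scores, Group)
table(scores$Group)
```

Storing the labels as a factor lets ggplot2 treat `Group` as a discrete variable when it is mapped to shape or fill.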

The scores plot shows that the two genotypes are well separated along the first PC axis while the developmental stage (green versus red) is separated along the second PC axis. There is more variation among the hp‐2dg green samples than among the other groups.

**Figure 3.** PCA scores plot.

#### **5.2. Loadings plot**

We can create a loadings plot using a similar approach. Since there are a large number of variables, we would also like to know which ones have the largest influence on the PCA. Variables with high loadings (positive and negative) are more likely to be important for discriminating groups that are separated in the scores plot.

One of the major advantages of R is that it has many powerful and flexible functions for subsetting data. One approach might be to identify the maximum and minimum loadings using the *range* function and then subset the data based on a percentage of these values. Alternatively, we can make a plot of PC1 versus PC2 loadings and visually inspect the data for high and low values. The *subset* function can be used to select rows that meet certain criteria. In this example, 0.09 and −0.09 were selected for threshold values.

```
significant.loadings <- subset(loadings, PC1 > 0.09 | PC1 < -0.09 |
                                         PC2 > 0.09 | PC2 < -0.09)
```
For plotting in ggplot2, it is generally recommended to add factor columns to the data frame for the purpose of mapping aesthetics to variables. A factor is a categorical value in R with predefined levels. We can use the *ifelse* function to specify the factor level in the new columns much in the same way as the *subset* function was used to create a new data frame. Loadings above and below the threshold values are marked for subsetting in this way.

```
loadings$pc1.change <-
  ifelse(loadings$PC1 > 0.09,"UP",
         ifelse(loadings$PC1 < -0.09,"DOWN",
                "nochange"))
```
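The alternative mentioned above, deriving thresholds from the *range* of the loadings rather than choosing them by eye, can be sketched as follows. The loadings table and the 50% fraction are hypothetical values chosen for illustration, and *cut* is used here in place of nested *ifelse* calls so that the new column is a true factor.

```r
# Hypothetical loadings for five peaks
loadings <- data.frame(PC1 = c(0.20, -0.15, 0.02, 0.05, -0.01),
                       PC2 = c(0.01, 0.03, -0.12, 0.04, 0.00),
                       row.names = paste0("peak", 1:5))

# Thresholds set at a fraction of the minimum and maximum loading on each axis
frac <- 0.5
lim1 <- frac * range(loadings$PC1)   # lower and upper limits for PC1
lim2 <- frac * range(loadings$PC2)   # lower and upper limits for PC2

significant.loadings <- subset(loadings,
                               PC1 < lim1[1] | PC1 > lim1[2] |
                               PC2 < lim2[1] | PC2 > lim2[2])

# Factor column marking the direction of each PC1 loading
loadings$pc1.change <- cut(loadings$PC1,
                           breaks = c(-Inf, lim1[1], lim1[2], Inf),
                           labels = c("DOWN", "nochange", "UP"))
loadings
```

Any threshold of this kind is a heuristic, so the selected variables should be treated as candidates for follow-up rather than as statistically significant features.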
With the added factors, we can make the plot in ggplot2 and indicate significant loadings with different colors. The grid package provides several options for adding text annotations. The resulting plot is shown in **Figure 4**.

**Figure 4.** PCA loadings plot.


Since the scores matrix and the loadings matrix share similar geometric properties, the direction of the loadings indicates those variables that have the greatest influence on class separation. Based on these criteria, 64 potentially significant peaks were identified out of the original 308 (**Figure 4**). Separation along the PC1 axis identified features that show high variation by genotype while separation along the PC2 axis identified features that show high variation by developmental stage.

It should be emphasized that since PCA is an exploratory method, the interpretation of PCA results for the purpose of inferring biological importance must be done with caution. Potentially interesting features must be further analyzed to assess their biological significance. This can be done using boxplots, heatmaps, or other suitable graphical displays and rechecking the raw data by generating extracted ion chromatograms.

The first step in this process is to subset the original data to include only those variables of interest, i.e., the colored symbols in **Figure 4**. This can be done in R using the extremely useful *merge* function.

```
data.sub <- merge(newdata, significant.loadings, by="row.names")
```

This command creates a new data frame containing the peak intensity values for the 64 variables with high PCA loadings that can be further analyzed as described below.
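The shape of the merged result is worth inspecting on a toy example. In the sketch below, the intensity matrix `intens` and the loadings table `sig` are hypothetical stand-ins for `newdata` and `significant.loadings`.

```r
# Hypothetical intensity matrix: 4 peaks (rows) x 3 samples (columns)
intens <- matrix(1:12, nrow = 4,
                 dimnames = list(paste0("peak", 1:4), paste0("s", 1:3)))

# Hypothetical subset of loadings for two of those peaks
sig <- data.frame(PC1 = c(0.20, -0.15), PC2 = c(0.01, 0.03),
                  row.names = c("peak1", "peak3"))

# Merging on row names keeps only peaks present in both objects
merged <- merge(intens, sig, by = "row.names")
merged

# Alternative: index the matrix directly to keep the intensities only
sub <- intens[rownames(sig), , drop = FALSE]
sub
```

The merged data frame stores the peak identifiers in a column named `Row.names` and carries the loading columns alongside the intensities, so those columns must be dropped before the intensities are passed to numeric functions. Direct indexing by row name avoids this step when only the intensities are needed.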

## **6. Partial least squares-discriminant analysis (PLS-DA)**

Partial least squares‐discriminant analysis (PLS‐DA) is a supervised method that uses multiple linear regression to find the direction of maximum covariance between a data set and class labels [28]. Supervised techniques can be very helpful for highlighting sample/group differences when PCA results are masked by high levels of spectral noise, strong batch effects, or high within group variation [26]. PLS‐DA sharpens the separation between groups of observations by rotating PCA components such that a maximum separation among classes is obtained and identifies variables that carry most of the class separating information. However, contrary to PCA, supervised methods like PLS‐DA aggressively overfit models to the data, almost always yielding scores in which classes are separated [26, 27]. As a result, these methods can generate excellent class separation even with random data. For this reason, results of these types of tests should be critically checked and properly cross‐validated.

The pls, plsdepot, and muma packages can all be used for partial least squares analysis in R [33, 34]. We will demonstrate how to perform an extension of the PLS method known as OPLS‐DA using the muma package below.
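The tendency of supervised methods to separate classes in random data can be illustrated without any of these packages. The sketch below computes only the first PLS component by hand, taking the weight vector as the normalized cross product of the data and the class labels. It is a simplified illustration under that assumption, not the algorithm implemented in pls, plsdepot, or muma, and the data are pure noise with arbitrary class labels.

```r
set.seed(2024)
n <- 20    # samples
p <- 500   # variables, far more than samples

# Pure noise, column-centered; the class labels carry no real information
X <- scale(matrix(rnorm(n * p), nrow = n), center = TRUE, scale = FALSE)
y <- rep(c(-1, 1), each = n / 2)

# Weight vector: direction in variable space with maximum covariance with y
w <- crossprod(X, y)
w <- w / sqrt(sum(w^2))

# Scores of the training samples on the first component
t1 <- drop(X %*% w)
tapply(t1, y, mean)    # class means differ although X is random

# Fresh noise projected onto the same weights, for comparison
Xnew <- matrix(rnorm(n * p), nrow = n)
tnew <- drop(Xnew %*% w)
tapply(tnew, y, mean)
```

By construction the training scores are positively correlated with the labels, whereas scores for new random samples have no such guarantee. This difference between fitted and held-out samples is what cross‐validation is designed to expose.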

## **7. Heatmaps**

Heatmaps are an effective tool for displaying feature variation among groups of samples [35]. The basic concept of a heatmap is to represent relationships among variables as a color image. Rows and columns typically are reordered so that variables and/or samples with similar profiles are closer to one another, making these profiles more visible. Each value in the data matrix is displayed as a color, making it possible to view the patterns graphically.

Heatmaps use an agglomerative hierarchical clustering algorithm to order and display the data as a dendrogram. Two important factors to consider when constructing a heatmap are the type of distance measure and the agglomeration method used. For details on the various methods available see [35].

The *heatmap.2* function in the gplots package can be used to construct a heatmap that is easily customizable. It includes options for both the distance and agglomeration methods, as well as scaling options for rows or columns. Unless the actual numerical values in the data matrix have an explicit meaning, row scaling is usually advisable [35].
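The clustering that underlies a heatmap can be examined directly with the base R functions *dist* and *hclust*. In the following sketch the data are simulated, and the Euclidean distance and average linkage are arbitrary illustrative choices rather than recommendations.

```r
set.seed(10)
# Hypothetical data: 6 features (rows) x 8 samples (columns).
# Samples 5-8 have higher levels of features 1-3 and lower levels of features 4-6.
x <- matrix(rnorm(48), nrow = 6,
            dimnames = list(paste0("feat", 1:6), paste0("s", 1:8)))
x[1:3, 5:8] <- x[1:3, 5:8] + 10
x[4:6, 5:8] <- x[4:6, 5:8] - 10

# Row scaling: convert each feature to z-scores across samples
row_scaled <- t(scale(t(x)))

# Distance measure and agglomeration method used to order the samples
d  <- dist(t(row_scaled), method = "euclidean")
hc <- hclust(d, method = "average")

# Cut the sample dendrogram into two clusters
clusters <- cutree(hc, k = 2)
clusters
```

The same choices can be passed to a heatmap function through its `distfun`, `hclustfun`, and `scale` arguments, which are available in both the base *heatmap* function and *heatmap.2*.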

A heatmap showing the scaled data from the 64 loadings extracted by PCA is shown in **Figure 5**. Four well-defined clusters are evident that correlate well with the four different experimental groups. These variables form a starting point for further experiments and analyses.

**Figure 5.** Heatmap of significant features obtained from PCA loadings.

## **8. Boxplots**


Boxplots are another good way to visualize and compare features among different samples. A boxplot graphically depicts a vector through its five-number summary. The edges of the box represent the lower and upper quartiles, and the median is displayed as a line within the box. The whiskers extend to the most extreme values that are not flagged as outliers (by default in R, values within 1.5 times the interquartile range of the box), so they reach the minimum and maximum only when no outliers are present. Outliers are represented as symbols outside of the whiskers.
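The numbers behind a boxplot can be obtained without drawing it. The sketch below applies the base R functions *fivenum* and *boxplot.stats* to a hypothetical vector containing one extreme value.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 50)   # hypothetical intensities with one extreme value

# Minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(x)
five

# Values actually drawn by boxplot(): whisker ends, box edges, median, and outliers
bs <- boxplot.stats(x)
bs$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
bs$out     # points plotted individually as outliers
```

Here the maximum of the data is 50, but the upper whisker stops at 9 because 50 lies more than 1.5 times the interquartile range above the box and is therefore drawn as an outlier.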

A simple boxplot can be generated from any numeric vector using the *boxplot* function in R. However, a more customizable boxplot can be created using the ggplot2 package. **Figure 6** shows boxplots for four significant features from the PCA results. The data first were log transformed and Pareto scaled to show relative differences.

**Figure 6.** Boxplot of four significant peaks identified from PCA loadings.

The 495.2/2285 peak, which had a high positive PC1 loading, was significantly higher in the Manapal strain, whereas the 529.8/992 peak, which had a high negative PC1 loading, was significantly higher in the hp‐2dg strain. The 1136.4/2038 peak, which had the highest positive loading for PC2, was significantly higher in green fruits of both varieties. Interestingly, the 805.2/2198 peak, which had high negative loadings for both PC1 and PC2, was only significant in red fruits of the hp‐2dg strain.

## **9. Extracted ion chromatograms**

While statistical procedures provide important clues about potentially significant variables, a critical but often overlooked step in analyzing metabolomics data is reinspecting the raw data to assess the validity of these results. Not only can this provide confirmation of meaningful features but it can also reveal false positives caused by scaling artifacts or spurious peak assignment, which is common in XCMS‐processed data.

LCMS ion peaks can be visualized through extracted ion chromatograms (EICs). An EIC is essentially a "slice" of the raw data that covers a specific *m/z* and time range. XCMS automatically generates EICs for peaks that show high significance, but these are low quality "snapshot" images. However, the *plotEIC* function in XCMS can be used to extract the numerical data for any EIC of interest. The following commands describe how to obtain EIC data for all samples in a data set and generate an EIC plot for grouped samples.

We first create a list of xcmsRaw objects from the raw data files with the *lapply* function.


**Figure 6.** Boxplot of four significant peaks identified from PCA loadings.


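```
# List all CDF files under ./cdf, keeping full paths so xcmsRaw can open them
cdffiles <- list.files("./cdf", recursive = TRUE, full.names = TRUE)
# Read each raw data file into an xcmsRaw object
dat.raw <- lapply(cdffiles, xcmsRaw)
```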
Next, we set the upper and lower limits for *m/z* and time. For this example, we will look at the 805.2/2198 peak since PCA and boxplots indicate that this peak was highly significant in red fruits of the hp‐2dg strain.

```
mrange <- c(804.5, 805.5)
trange <- c(2000, 2400)
```
We can use the *lapply* function again to create an EIC for all samples. The data are then merged into a data frame for custom plotting in ggplot2.

```
eicraw <- lapply(dat.raw, plotEIC, mzrange=mrange, rtrange=trange)
eic.df <- do.call(rbind, lapply(eicraw, data.frame))
```
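The merged data frame carries no sample or group label, which is needed to draw one panel per group. The following is a minimal sketch of one way to attach such labels before stacking. It runs on simulated data and assumes that each list element is a two‐column matrix of retention time and intensity; the names `fake.eic`, `eic.sim`, `samples`, and `eic.labeled`, the column names, and the sample table are all hypothetical and not part of the original workflow.

```r
set.seed(7)  # reproducible simulated data

# Stand-in for the list returned by lapply(dat.raw, plotEIC, ...):
# one two-column matrix (retention time, intensity) per sample (assumed shape)
fake.eic <- function() {
  rt <- seq(2000, 2400, by = 10)
  cbind(rt = rt,
        intensity = 1000 * dnorm(rt, mean = 2198, sd = 15) + runif(length(rt)))
}
eic.sim <- replicate(4, fake.eic(), simplify = FALSE)

# Hypothetical sample table; in practice it would be derived from the file names
samples <- data.frame(sample = paste0("s", 1:4),
                      group  = c("hp-2dg green", "hp-2dg red",
                                 "Manapal green", "Manapal red"))

# Attach the sample and group labels to each EIC, then stack the pieces
eic.labeled <- do.call(rbind, lapply(seq_along(eic.sim), function(i) {
  data.frame(eic.sim[[i]],
             sample = samples$sample[i],
             group  = samples$group[i])
}))

str(eic.labeled)
```

With the labels in place, a faceted plot could be drawn in ggplot2 with a call such as `ggplot(eic.labeled, aes(rt, intensity, group = sample)) + geom_line() + facet_wrap(~ group)`.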

**Figure 7.** Extracted ion chromatograms for the *m/z* range 804.5‐805.5 in hp‐2dg green fruits (A), hp‐2dg red fruits (B), Manapal green fruits (C) and Manapal red fruits (D). Each panel represents 10 samples.

The results are shown in **Figure 7**. The grouped EIC data clearly show that this feature is completely absent in green fruits and is very low in red fruits of Manapal. In contrast, there are several large peaks over this mass range in red fruits of hp‐2dg, indicating this feature is a class‐specific biomarker.

## **10. Volcano plots**

A metabolomics experiment often involves a comparison of two groups, e.g., a treatment group versus a control. It is customary in such cases to use univariate methods to obtain a summary of the data and identify potentially important variables before applying multivariate methods [27]. A common tool to identify discriminatory features is to construct a volcano plot. This type of plot displays the fold change differences and the statistical significance for each variable. The log of the fold change is plotted on the *x*‐axis so that changes in both directions (up and down) appear equidistant from the center. The *y*‐axis displays the negative log of the *p*‐value from a two‐sample *t*‐test. Data points that are far from the origin, i.e., near the top of the plot and to the far left or right, are considered important variables with potentially high biological relevance.

The steps required to construct a volcano plot can be carried out using several base R functions. The fold change is typically calculated as the ratio of the two means. We can use the *apply* function to determine the means for each variable.

```
group1mean <- apply(data[, 1:10], 1, FUN=mean)
group2mean <- apply(data[, 11:20], 1, FUN=mean)
```

We then divide the means of each variable to obtain the ratio and take the logarithm so that changes in both directions appear equidistant from the center.

```
FC <- group1mean/group2mean
log2FC <- log(FC,2)
```
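As a quick check of why the logarithm is taken, the following minimal sketch uses made‐up group means (not values from the tomato data set) to show that a fourfold increase and a fourfold decrease fall at equal distances from zero on the log2 scale. The names `demo.mean1`, `demo.mean2`, `demo.FC`, and `demo.log2FC` are hypothetical.

```r
# Hypothetical group means for three features (illustrative values only)
demo.mean1 <- c(up = 400, unchanged = 100, down = 25)
demo.mean2 <- c(up = 100, unchanged = 100, down = 100)

demo.FC     <- demo.mean1 / demo.mean2  # ratio of the two means
demo.log2FC <- log(demo.FC, 2)          # same call as in the text

print(rbind(demo.FC, demo.log2FC))
```

On the ratio scale the two changes are 4 and 0.25, which are not symmetric around 1; on the log2 scale they become 2 and −2.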
The *t.test* function in the R stats package returns the *p*‐value for an unpaired *t*‐test of two independent samples. The default option is Welch's *t*‐test, which does not assume equal variances in the two groups. Note that data preprocessing steps, such as sum normalization and log transformation, are usually applied to make the samples more comparable and to reduce heteroscedasticity.

```
# Welch's t-test per feature: samples 1-10 versus samples 11-20
pvalue <- apply(log.data, 1, function(x) 
          {t.test(x[1:10], x[11:20])$p.value } )
```
It is recommended to use a multiple testing correction when performing *t*‐tests on multiple variables. The *p.adjust* function in R provides several options for this, including the Bonferroni correction, which controls the family‐wise error rate (FWER), and the Benjamini‐Hochberg correction, which controls the false discovery rate (FDR). Controlling the false discovery rate is a less stringent condition than controlling the family‐wise error rate, so this method is preferred when one is interested in having more true positives.


```
pvalue.BHcorr <- p.adjust(pvalue, method = "BH")
```
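To illustrate how much the choice of correction can matter, the following minimal sketch applies both methods to five made‐up *p*‐values (not taken from the tomato data set). The names `p.raw`, `p.bonf`, `p.bh`, `n.bonf`, and `n.bh` are hypothetical.

```r
# Five illustrative raw p-values (not from the tomato data set)
p.raw <- c(0.001, 0.008, 0.020, 0.045, 0.300)

p.bonf <- p.adjust(p.raw, method = "bonferroni")  # controls the FWER
p.bh   <- p.adjust(p.raw, method = "BH")          # controls the FDR

print(data.frame(p.raw, p.bonf, p.bh))

# Number of features passing a 0.05 cutoff under each correction
n.bonf <- sum(p.bonf < 0.05)
n.bh   <- sum(p.bh < 0.05)
```

The Benjamini‐Hochberg adjusted values are never larger than the Bonferroni adjusted values, so at a given cutoff the FDR approach retains at least as many features.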
We take the negative log10 values so that variables with low adjusted *p*‐values (i.e., high significance) appear near the top of the plot.

```
pvalue.BHcorr.neglog <- -log10(pvalue.BHcorr)
```
Finally, the data are merged into a single data frame that can be plotted in ggplot2.

```
volcano.data <- data.frame(log2FC, pvalue.BHcorr.neglog)
```
A volcano plot comparison of the two tomato genotypes in green and red fruits is shown in **Figure 8**. Significant variables are shown as colored symbols. We selected rather conservative cutoff values of 2 and −2 for the log2 fold change (fold change >4) and 2 for the −log10 of the FDR‐adjusted *p*‐value (*p* < 0.01) to highlight those features that showed the largest differences. In green fruits, this led to the identification of 38 metabolite peaks that were significantly higher in Manapal and 20 peaks that were higher in hp‐2dg, while in red fruits, 40 metabolite peaks were higher in Manapal and 24 were higher in hp‐2dg. As with the PCA loadings, these variables can be explored further with heatmaps, boxplots, EICs, etc.

**Figure 8.** Volcano plot analysis of Manapal versus hp‐2dg in green (A) and red (B) fruits.
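The steps of this section can be put together in a self‐contained sketch. It runs on simulated data rather than the tomato data set, assumes the same layout as above (features in rows, samples 1 to 10 in group 1 and samples 11 to 20 in group 2), and uses base graphics instead of ggplot2 so that it has no package dependencies. All object names ending in `.sim` are hypothetical.

```r
set.seed(42)  # reproducible simulated data

# Simulated log2-scale intensities: 200 features x 20 samples
# (columns 1-10 = group 1, columns 11-20 = group 2; hypothetical layout)
log.sim <- matrix(rnorm(200 * 20, mean = 12, sd = 0.5), nrow = 200)
log.sim[1:10, 1:10]  <- log.sim[1:10, 1:10] + 4   # higher in group 1
log.sim[11:20, 1:10] <- log.sim[11:20, 1:10] - 4  # lower in group 1

# Fold change from the means of the back-transformed intensities
raw.sim    <- 2^log.sim
log2FC.sim <- log2(rowMeans(raw.sim[, 1:10]) / rowMeans(raw.sim[, 11:20]))

# Welch's t-test per feature, BH correction, negative log10
pvalue.sim  <- apply(log.sim, 1,
                     function(x) t.test(x[1:10], x[11:20])$p.value)
neglogp.sim <- -log10(p.adjust(pvalue.sim, method = "BH"))

volcano.sim <- data.frame(log2FC = log2FC.sim, neglogp = neglogp.sim)

# Cutoffs from the text: |log2 FC| > 2 and -log10 adjusted p > 2
volcano.sim$status <- "not significant"
volcano.sim$status[log2FC.sim >  2 & neglogp.sim > 2] <- "higher in group 1"
volcano.sim$status[log2FC.sim < -2 & neglogp.sim > 2] <- "higher in group 2"

print(table(volcano.sim$status))

# Base graphics version of the plot; ggplot2 could be used instead
plot(volcano.sim$log2FC, volcano.sim$neglogp, pch = 16,
     col = ifelse(volcano.sim$status == "not significant", "grey60", "red"),
     xlab = "log2 fold change", ylab = "-log10 adjusted p-value")
abline(v = c(-2, 2), h = 2, lty = 2)
```

Because the simulated shift is large relative to the noise, only the 20 features that were deliberately altered are expected to pass both cutoffs; with real data the counts depend strongly on the cutoffs chosen.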

The R package muma (Metabolomic Univariate and Multivariate Analysis) has a more sophisticated procedure for testing significance and returning *p*‐values for a volcano plot [34]. Briefly, the Shapiro‐Wilk test for normality is performed to assess whether each variable has a normal distribution and to decide whether to perform a parametric test (Welch's *t*‐test) or a nonparametric test (Wilcoxon‐Mann‐Whitney test). The analysis returns fold change differences and a merged set of *p*‐values from both tests and also applies a multiple testing correction that the user can specify. Finally, a volcano plot is generated highlighting significant variables based on the corrected *p*‐values.
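The logic of choosing a test according to normality can be sketched in base R. This is not muma's own implementation: how muma applies the normality test internally is not reproduced here, and the rule used below (both groups must pass the Shapiro‐Wilk test at 0.05) is an assumption made for illustration. The data are simulated, and the names `sim.data`, `pick.test`, `pvalue.merged`, and `pvalue.merged.BH` are hypothetical.

```r
set.seed(1)  # reproducible simulated data

# Simulated log-scale intensities: 50 features (rows) x 20 samples (columns);
# columns 1-10 are group 1 and columns 11-20 are group 2 (hypothetical layout)
sim.data <- matrix(rnorm(50 * 20, mean = 10), nrow = 50)
sim.data[1:5, 11:20] <- sim.data[1:5, 11:20] + 3  # five truly different features

pick.test <- function(x, g1 = 1:10, g2 = 11:20, alpha = 0.05) {
  # Shapiro-Wilk on each group; treat the feature as normal only if both pass
  normal <- shapiro.test(x[g1])$p.value > alpha &&
            shapiro.test(x[g2])$p.value > alpha
  if (normal) {
    t.test(x[g1], x[g2])$p.value       # Welch's t-test (R default)
  } else {
    wilcox.test(x[g1], x[g2])$p.value  # Wilcoxon-Mann-Whitney test
  }
}

# One p-value per feature, merged across the two tests, then BH-corrected
pvalue.merged    <- apply(sim.data, 1, pick.test)
pvalue.merged.BH <- p.adjust(pvalue.merged, method = "BH")

summary(pvalue.merged.BH)
```

The merged vector can then be used in place of the plain *t*‐test *p*‐values when building the volcano plot data frame.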

## **11. Orthogonal projection to latent structures-discriminant analysis (OPLS-DA)**

An extension of the PLS technique known as orthogonal projection to latent structures (OPLS) is another very useful tool for analyzing metabolomics data. Like PLS, this is a supervised method that pairs a data matrix *X* with a corresponding matrix *Y* containing sample information. The basic concept in OPLS is to separate the systematic variation in *X* into two parts, one that is correlated to *Y* and one that is not correlated (orthogonal) with *Y* [28]. Only the *Y*‐predictive variation is used to model the data. When working with discrete variables such as class labels, the method is called OPLS discriminant analysis (OPLS‐DA). The main advantages of OPLS‐DA over PCA are better class discrimination and more robust identification of important features. The OPLS‐DA algorithm normally is applied when *Y* comprises only two classes.

**Figure 9.** S‐plots from OPLS‐DA modeling of green and red fruits in Manapal and hp‐2dg varieties.

The muma package can be used to perform OPLS‐DA in R. A numerical class vector must be added to represent the *Y* matrix, i.e., the control group is given a value of 1 and the experimental group is given a value of 2. The data frame is saved in the working directory, and the analysis is carried out with two simple functions.

```
explore.data(file="logdata.csv", scaling="pareto")
oplsda(scaling="pareto")
```
This method automatically creates a new folder in the working directory that contains the OPLS‐DA results in both numerical and graphical formats. The numerical data can be merged into a new data frame for custom plotting in ggplot2.

The loadings from an OPLS‐DA model are displayed by means of an "S‐plot" where the modeled covariance *p*[1] is plotted on the *x*‐axis and the correlation profile *p*(corr)[1] is plotted on the *y*‐axis. Variables with higher *p*[1] values in both positive and negative directions have a larger impact on the variance between the groups, whereas variables with higher *p*(corr)[1] values have more reliability. Therefore, data points that fall in the upper right and lower left quadrants have a high impact on the model and represent possible class‐specific biomarkers.
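The two S‐plot axes can be computed directly from a score vector and the data matrix, following the usual convention that *p*[1] is the covariance and *p*(corr)[1] the correlation between the predictive score and each variable. The sketch below runs on simulated data and, as a loudly stated assumption, uses the first principal component as a stand‐in for the OPLS‐DA predictive score so that it needs no OPLS implementation; the exact scaling applied by muma is not reproduced. The names `X`, `y`, `t1`, `p1`, `pcorr`, and `splot.sim` are hypothetical.

```r
set.seed(3)  # reproducible simulated data

# Simulated data matrix: 20 samples (rows) x 100 variables (columns)
X <- matrix(rnorm(20 * 100), nrow = 20)
y <- rep(c(1, 2), each = 10)          # class vector: 1 = control, 2 = experimental
X[, 1:5] <- X[, 1:5] + 2 * (y - 1.5)  # five variables that differ by class

X <- scale(X, center = TRUE, scale = FALSE)  # mean-center each variable

# Stand-in for the OPLS-DA predictive score vector t[1] (assumption: the
# first principal component is used only so that the sketch is runnable)
t1 <- prcomp(X)$x[, 1]

p1    <- as.vector(cov(t1, X))  # covariance with the score: S-plot x-axis
pcorr <- as.vector(cor(t1, X))  # correlation with the score: S-plot y-axis

splot.sim <- data.frame(variable = seq_len(ncol(X)), p1, pcorr)
head(splot.sim[order(-abs(splot.sim$p1)), ])
```

Because the covariance and the correlation of a variable with the score always share the same sign, the points can only fall in the upper right and lower left quadrants, which is what gives the plot its S shape.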

**Figure 9** shows S‐plots from an OPLS‐DA model of the two tomato varieties in green and red fruits. Variables with |*p*[1]| > 0.004 are highlighted, and the top 10 for each class are listed in tabular form on the graph. In general, there was good agreement between the OPLS‐DA and volcano plot results for identifying significant variables.

## **12. Metabolite identification**


One of the major challenges in LCMS‐based metabolomics is metabolite annotation, i.e., identifying biological molecules from mass spectral data. Although metabolic profiling approaches that do not assign observed features to known metabolites can provide a powerful means of classifying and directly comparing samples, metabolite identification remains a crucial step for obtaining mechanistic insights into cellular processes. However, the complexity of the metabolome combined with the fact that many metabolites have not been structurally identified means that untargeted metabolomic studies typically yield a large number of unknown peaks.

The accurate identification of metabolites usually requires the ability to match candidate spectra with standard compounds run under the same conditions. Ideally, an orthogonal descriptor such as retention index is used for further validation. However, the lack of readily available standards remains a major obstacle in this regard, particularly in plant phytochemical studies [36]. Consequently, a number of strategies are being brought forward to assist in the chemical identification of unknown metabolites, including the development of comprehensive mass spectral libraries [37–39], searchable databases [40–42], and information networks that integrate genomic, transcriptomic, and metabolomic data [43–45]. The construction, maintenance, and integration of these resources are crucial to the advancement of the field of metabolomics.

## **13. Conclusions**

Untargeted metabolomics has become an increasingly powerful tool to investigate biological problems in agriculture, medicine, and a number of other fields. Therefore, efficient processing methods must be developed and refined to enable robust interpretation of metabolomics data. Method development and new software tools have helped address these challenges over the last decade. However, since improvements are still required at the various stages of data processing, establishing and refining new methods will continue to be important in the future of metabolomics research.

In this chapter, we have presented an overview of several common methods used for processing and analyzing LCMS‐based metabolomics data and how to carry out these methods in the R programming environment. Although a variety of open source and web‐based tools are available to support metabolomics data analysis, the ability to tailor the data processing workflow to one's own needs and generate custom graphics in R offers major advantages.

As the data sets used in all scientific disciplines get ever larger and more complex, it is becoming critical for scientists to be knowledgeable about how to use high‐level languages such as R, which allow for easy and intuitive data manipulation. Along with powerful statistical capabilities, graphical tools make R an ideal environment for exploratory data analysis and provide exceptional flexibility for preparing high‐quality publication‐ready figures. Nevertheless, many technical and methodological issues must still be addressed to create analytical platforms that answer biological questions efficiently.

## **Acknowledgements**

The authors would like to thank Arthur Colvis for help with the LCMS experiments.

## **Author details**

Stephen C. Grace\* and Dane A. Hudson

\*Address all correspondence to: scgrace@ualr.edu

Department of Biology, University of Arkansas, Little Rock, AR, USA

### **References**

[1] Fiehn O. Metabolomics: the link between genotypes and phenotypes. Plant Mol Biol. 2002;48:155–171. DOI: 10.1023/A:1013713905833

[2] Kim HK, Verpoorte R. Sample preparation for plant metabolomics. Phytochem. Anal. 2010;12:4–13. DOI: 10.1002/pca.1188



subsequent statistical analyses. Anal. Chem. 2014;86(14):6931–6939. DOI: 10.1021/ac500734c

[18] Xia J, Wishart DS. Web‐based inference of biological patterns, functions and pathways from metabolomic data using MetaboAnalyst. Nat. Protocol. 2011;6:743–760. DOI: 10.1038/nprot.2011.319

[19] Xia J, Sinelnikov I, Han B, Wishart DS. MetaboAnalyst 3.0 ‐ making metabolomics more meaningful. Nucleic Acids Res. 2015;43:251–257. DOI: 10.1093/nar/gkv380

[20] Grace SC, Embry S, Luo H. Haystack, a web‐based tool for metabolomics research. BMC Bioinformatics. 2014;15(Suppl 11):S12. DOI: 10.1186/1471‐2105‐15‐S11‐S12

[21] Pluskal T, Castillo S, Villar‐Briones A, Oresic M. MZmine 2: Modular framework for processing, visualizing, and analyzing mass spectrometry‐based molecular profile data. BMC Bioinformatics. 2010;11:395. DOI: 10.1186/1471‐2105‐11‐395

[22] Lei Z, Li H, Chang J, Zhao PX, Sumner LW. MET‐IDEA version 2.06: improved efficiency and additional functions for mass spectrometry‐based metabolomics data processing. Metabolomics. 2012;8(S1):105–110. DOI: 10.1007/s11306‐012‐0397‐5

[23] Katajamaa M, Orešič M. Data processing for mass spectrometry‐based metabolomics. J. Chromatogr. A. 2007;1158:318–328. DOI: 10.1016/j.chroma.2007.04.021

[24] Sugimoto M, Kawakami M, Robert M, Soga T, Tomita M. Bioinformatics tools for mass spectroscopy‐based metabolomic data processing and analysis. Curr. Bioinformatics. 2012;7:96–108. DOI: 10.2174/157489312799304431

[25] Scholz M, Selbig J. Visualization and analysis of molecular data. In: Weckwerth W. Metabolomics: methods and protocols. Meth. Mol. Biol. 2007;358:87–104. DOI: 10.1007/978‐1‐59745‐244‐1\_6

[26] Worley B, Powers R. Multivariate analysis in metabolomics. Curr. Metabolomics. 2013;1:92–107. DOI: 10.2174/2213235X11301010092

[27] Saccenti E, Hoefsloot H, Smilde AK, Westerhuis JA, Hendriks MM. Reflections on univariate and multivariate analysis of metabolomics data. Metabolomics. 2014;10:361–374. DOI: 10.1007/s11306‐013‐0598‐6

[28] Trygg J, Lundstedt T. Chemometrics techniques for metabonomics. The Handbook of Metabonomics and Metabolomics. J.C. Lindon, J.K. Nicholson, E. Holmes (eds.). Amsterdam, The Netherlands, Elsevier B.V. 2007; pp. 171–199.

[29] Levin I, Frankel P, Gilboa N, Tanny S, Lalazar A. The tomato dark green mutation is a novel allele of the tomato homolog of the DEETIOLATED1 gene. Theor. Appl. Genet. 2003;106:454–460. DOI: 10.1007/s00122‐002‐1080‐4

[30] Peters JL, van Tuinen A, Adams P, Kendrick RE, Koornneef M. High pigment mutants of tomato exhibit high sensitivity for phytochrome action. Plant Physiol. 1989;134:661–666. DOI: 10.1016/S0176‐1617(89)80024‐0

[31] Mustilli AC, Fenzi F, Ciliento R, Alfano F, Bowler C. Phenotype of the tomato high pigment‐2 mutant is caused by a mutation in the tomato homolog of DEETIOLATED1. Plant Cell. 1999;11:145–157.

(ReSpect) for phytochemicals: A plant-specific MS/MS-based data resource and database. Phytochemistry. 2012;82:38–45. DOI: 10.1016/j.phytochem.2012.07.007

[42] Udayakumar M, Prem Chandar D, Arun N, Mathangi J, Hemavathi K, Seenivasagam R. PMDB: Plant metabolome database-a metabolomic approach. Med. Chem. Res. 2012;21(1):47–52. DOI: 10.1007/s00044-010-9506-z

[43] Bais P, Moon SM, He K, Leitao R, Dreher K, Walk T, Sucaet Y, Barkan L, Wohlgemuth G, Roth MR, Wurtele ES, Dixon P, Fiehn O, Lange BM, Shulaev V, Sumner LW, Welti R, Nikolau BJ, Rhee SY, Dickerson JA. PlantMetabolomics.org: a web portal for plant metabolomics experiments. Plant Physiol. 2010;152:1807–1816. DOI: 10.1104/pp.109.151027

[44] Sucaet Y, Wang Y, Li J, Wurtele ES. MetNet Online: a novel integrated resource for plant systems biology. BMC Bioinformatics. 2012;13:267. DOI: 10.1186/1471-2105-13-267

[45] King ZA, Lu J, Dräger A, Miller P, Federowicz S, Lerman JA, Ebrahim A, Palsson BO, Lewis NE. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models. Nucleic Acids Res. 2016;44(D1):D515–D522. DOI: 10.1093/nar/gkv1049

**Metabolomics in Plant Metabolism and Agriculture**

## **<sup>13</sup>C-Isotope-Labeling Experiments to Study Metabolism in** *Catharanthus roseus*

Qifang Pan, Natali Rianika Mustafa, Robert Verpoorte and Kexuan Tang

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/65401

#### **Abstract**

Plant metabolism is a complex network. Pathways are correlated and affect each other. Secondary metabolic pathways in plant cells are strictly regulated, and upon an internal or external stimulus (e.g. stress), the metabolic fluxes change in response, for example, to protect the plant against herbivores or against microbial infections. 13C-isotope-labeling experiments have been performed on cell cultures and hairy roots of *Catharanthus roseus* to measure fluxes through some pathways. However, due to the complexity of the total metabolic network in an intact plant, no experiments have yet been carried out on *C. roseus* plants. In this study, [1-13C] glucose was first applied to *C. roseus* seedlings grown in Murashige and Skoog's (MS) medium. In a time course, the amount and position of 13C incorporation into the metabolites were analyzed by proton nuclear magnetic resonance (<sup>1</sup>H NMR) and <sup>1</sup>H-<sup>13</sup>C heteronuclear single quantum coherence (HSQC) NMR. The results show that the fed <sup>13</sup>C-isotope was efficiently incorporated into and recycled in the metabolism of the intact *C. roseus* plant. The *C. roseus* plants seem to be a good system for metabolic flux analysis.

**Keywords:** 13C-isotope labeling, *Catharanthus roseus*, intact plant, metabolic fluxes, NMR, HSQC

#### **1. Introduction**

Metabolic flux analysis (MFA) aims at the quantitation of the carbon flow through a metabolic network by measuring the enrichment and position of labels in the various measurable metabolites after feeding a labeled precursor *in vivo* or *in vitro*. Though now common in microorganisms, MFA in plants, with their complex secondary metabolic pathways, is so far mostly focused on primary metabolism. In fact, each metabolic flux reflects the function and performance of a specific pathway in a plant's development and its interaction with the environment, for example, defense against herbivores or microorganisms [1]. Consequently, metabolic fluxes represent the fourth dimension of a living organism. There are three dimensions of space, which form the phenotype, but the dynamics, the fluxes, represent life. Flux analysis based on 13C-isotope-labeling experiments (13CLE) has been established as an effective method for determining the flux distribution through the compartmented pathways of primary metabolism in plant cells. The 13C isotope is not radioactive and is therefore convenient for labeling the metabolites in the pathways. Usually, a specifically 13C-isotope-labeled substrate, for example, [1-13C] glucose, is used in a 13CLE study. After feeding, this labeled material is distributed over the various metabolic pathways. At various time points, the distribution of the label over the various measurable metabolites is measured by using different NMR or MS instruments [2, 3]. By NMR, the position of the label as well as the enrichment on every position in a molecule can be measured. By MS, the overall enrichment of a molecule can be determined, whereas the position can only to some extent be determined by analysis of the fragments.

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

There are two strategies for 13C MFA: one is the dynamic-labeling strategy (time course experiments), and the other is the steady-state-labeling strategy. The dynamic-labeling strategy has an advantage in studying small partial networks, and it is particularly effective for the analysis of secondary metabolism [1]. In this approach, a specific labeled advanced precursor of a pathway is pulse fed, and after a given time the incorporation is measured in the products of the pathway involved. In a steady-state-labeling strategy, the organisms are permanently growing in a medium containing a very early substrate for primary metabolism (e.g., a labeled sugar or pyruvate), and the diffusion of the label through all pathways is monitored by measuring the incorporation and position of the label in all measurable metabolites. This approach is usually utilized in studies of central carbon metabolism. In fact, the limiting factor in flux analyses in plants is the detection limit for the various metabolites: because the level of primary metabolites in plants is many fold higher than that of secondary metabolites, the limited dynamic range of the analytical tools hampers the analysis of minor compounds. Therefore, specific and selective extraction methods are often used for the dynamic approach, whereas for the steady-state approach one uses the more general metabolomics analytical protocols.

In *Catharanthus roseus*, 13C label has been applied for both pathway elucidation and system-wide flux quantification. By feeding [1-13C] glucose to a cell culture of *C. roseus* and analyzing the products with 13C NMR spectroscopy [4], the biosynthetic pathway of secologanin was elucidated, and secologanin was found to originate from the triose-phosphate pathway. Salicylic acid biosynthesis was uncovered in *C. roseus* cell cultures by a retrobiosynthetic study based on 13C feeding experiments [5]. Flux quantification in central carbon metabolism of *C. roseus* hairy roots by 13C-labeling-based flux analysis, and quantitative assessment of crosstalk between the two isoprenoid biosynthesis pathways in cell cultures of *C. roseus* were also reported [6, 7]. Antonio et al. (2013) used plant cell suspension cultures of *C. roseus* to study the changes in fluxes after elicitation with jasmonate. The incorporation of fully labeled pyruvate was measured by gas chromatography-mass spectrometry (GC-MS) and ultra performance liquid chromatography (UPLC)-MS. The elicitation was found to disturb various metabolic pathways, as could be concluded from the differences in the incorporation of labels. Up to now, 13CLE-based MFA has not been implemented on intact *C. roseus* plants. The major reason is that intact plants are a more complex metabolic system than cell cultures or hairy root cultures, which only have a few different cell types. For example, previous research showed that some valuable terpenoid indole alkaloids (TIAs) (e.g., vindoline, vinblastine, and vincristine) can only be produced in leaves of *C. roseus*, not in cell cultures and hairy roots, due to the tissue- and cell-specific organization of TIA biosynthesis. So, a more detailed understanding of carbon flux distribution in the complex metabolic networks of intact *C. roseus* plants is a prerequisite for progress in metabolic engineering of TIA production in order to meet the rapidly growing market demand for these important TIAs.

In this study, the fate of [1-13C] glucose fed to intact *C. roseus* plants via the root system was analyzed in considerable detail. Labeling patterns of targeted metabolites were deduced from previous publications [4, 5, 8] (**Figure 1**), and confirmed by the current experiment. By tracing the label in some of the detected primary and secondary metabolites through a time course, we obtained information about the 13C incorporation status of these compounds, and thus about the metabolic fluxes in *C. roseus* plant metabolism and the channeling of carbon into MIA biosynthesis. The metabolic changes after elicitation were also measured in this model.

## **2. Materials and methods**

#### **2.1. Plant material and in vitro culture**

*C. roseus* seeds (Pacifica Cherry Red cultivar) were purchased from PanAmerican Seed Company (USA). The seeds were surface sterilized in 75% ethanol (v/v) for 1 min and in 10% NaClO (v/v) for 15 min, and subsequently washed three times with sterile distilled water. Sterilized seeds were germinated on solid MS [9] basal medium with 1% nonlabeled glucose. In total, 54 seedlings were obtained, subcultured every 2 weeks in the same MS solid medium, and used before flowering for the labeling experiments. After 10 weeks, 19 control plants were transferred to glass tubes, each supplied with 5 ml of 10 g/l nonlabeled glucose solution, whereas the other 35 plants were placed in separate glass tubes containing 5 ml of 10 g/l [1-<sup>13</sup>C] glucose solution. The plant cultures were grown in a climate chamber under a 16-h light and 8-h dark photoperiod at 25 ± 2°C.

#### **2.2. Chemicals**

Murashige & Skoog medium (including vitamins) and gelrite (strength: 550–850 g/cm<sup>2</sup>) were purchased from Duchefa Biochemie. D-(+)-Glucose (>99.0%) was obtained from Fluka Chemie (Buchs, Germany), whereas [1-13C]-D-glucose (>99.0%, with >99% atom 1-13C) was from Campro Scientific (Veenendaal, The Netherlands). Jasmonic acid (JA) was from Sigma-Aldrich Chemie (Steinheim, Germany).

#### **2.3. Jasmonic acid elicitation**

A stock solution of 10 mg/ml of JA in 40% EtOH was prepared, filter sterilized, and used for elicitation. After the plant roots had been submerged for 5 days in 5 ml of [1-<sup>13</sup>C] glucose solution (1%, w/v), 11 μl of the JA stock solution was aseptically spiked into each tube. The control samples received the same volume of 40% EtOH only. The plants were harvested at 0, 6, 24, and 72 h after treatment; young leaves, old leaves, stems, and roots of *C. roseus* plants were harvested separately, immediately frozen and ground to powder in liquid nitrogen, and freeze-dried for 72 h before NMR extraction (five replicates per sample).
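The nominal JA dose implied by these volumes can be worked out with a short calculation. The sketch below is illustrative only: it assumes that volumes are additive, that the tube still holds the full 5 ml at the time of spiking (uptake and evaporation over the 5 days are ignored), and that the molar mass of jasmonic acid (C12H18O3) is 210.27 g/mol. The function names are hypothetical.

```python
JA_MOLAR_MASS_G_PER_MOL = 210.27  # jasmonic acid, C12H18O3 (assumed value)


def final_ja_concentration(stock_mg_per_ml, spike_ul, medium_ml):
    """Return the JA concentration after spiking as (ug/ml, uM)."""
    ja_ug = stock_mg_per_ml * spike_ul            # mg/ml equals ug/ul, so this is ug
    total_ml = medium_ml + spike_ul / 1000.0      # assumes additive volumes
    ug_per_ml = ja_ug / total_ml
    micromolar = ug_per_ml / JA_MOLAR_MASS_G_PER_MOL * 1000.0
    return ug_per_ml, micromolar


def final_ethanol_percent(stock_ethanol_percent, spike_ul, medium_ml):
    """Return the final ethanol content (% v/v) contributed by the spike."""
    total_ul = medium_ml * 1000.0 + spike_ul
    return stock_ethanol_percent * spike_ul / total_ul


# Values from the text: 10 mg/ml stock in 40% EtOH, 11 ul into 5 ml.
ug_per_ml, micromolar = final_ja_concentration(10.0, 11.0, 5.0)
ethanol = final_ethanol_percent(40.0, 11.0, 5.0)
print(f"JA: {ug_per_ml:.1f} ug/ml ({micromolar:.0f} uM), EtOH: {ethanol:.3f}% v/v")
```

Under these assumptions the spike corresponds to roughly 22 μg/ml (about 104 μM) JA and less than 0.1% (v/v) ethanol in each tube.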

#### **2.4. NMR analysis**

<sup>1</sup>H NMR spectra were recorded in CH<sub>3</sub>OH-*d*<sub>4</sub> using a Bruker DMX 600-MHz spectrometer, while the coherence-order selective gradient heteronuclear single quantum coherence (HSQC) spectra were recorded in CH<sub>3</sub>OH-*d*<sub>4</sub> using a Bruker AV 500-MHz spectrometer. HSQC spectra were recorded for a data matrix of 256 × 2048 points covering 30182.7 × 7812.5 Hz with 64 scans for each increment [10]. INEPT transfer delays were optimized for a heteronuclear coupling of 145 Hz, and a relaxation delay of 1.5 s was applied. Data were linear predicted in *F*<sub>1</sub> to 512 × 2048 points using 32 coefficients and then zero-filled to 2048 × 2048 points prior to echo-anti-echo-type two-dimensional (2D) Fourier transformation; a sine bell-shaped window function shifted by π/2 was applied in both dimensions. The one-dimensional (1D) projection along the *F*<sub>1</sub>-axis was extracted using the built-in positive projection tool of Topspin (version 2.1, Bruker Biospin).
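The window and zero-filling steps of this processing scheme can be illustrated on a single time-domain trace. The following is a minimal sketch, not the Topspin implementation: it assumes the common sine-bell form sin(φ + (π − φ)·t) with t running from 0 to 1 and φ = π/2, and it leaves out linear prediction, the echo-anti-echo reconstruction, and the Fourier transform itself. All function names and the toy signal are hypothetical.

```python
import math


def sine_bell(n_points, shift=math.pi / 2):
    """Sine-bell window sin(shift + (pi - shift) * t), t from 0 to 1.

    With shift = pi/2 the window decays like a cosine from 1 to 0.
    """
    if n_points < 2:
        raise ValueError("need at least two points")
    return [
        math.sin(shift + (math.pi - shift) * k / (n_points - 1))
        for k in range(n_points)
    ]


def apodize_and_zero_fill(signal, target_length, shift=math.pi / 2):
    """Multiply a trace by the window, then pad with zeros to target_length."""
    if target_length < len(signal):
        raise ValueError("target_length must not be shorter than the signal")
    window = sine_bell(len(signal), shift)
    weighted = [s * w for s, w in zip(signal, window)]
    return weighted + [0.0] * (target_length - len(weighted))


# Toy F1 trace: 512 points (the size after linear prediction), padded to 2048.
toy_trace = [math.cos(2 * math.pi * 0.05 * k) for k in range(512)]
processed = apodize_and_zero_fill(toy_trace, 2048)
print(len(processed), processed[0], round(processed[511], 6))
```

The window brings the last acquired point smoothly to zero before padding, which avoids the truncation artifacts that zero-filling an unweighted trace would introduce.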

The signal intensity of carbons at certain positions of a given metabolite was obtained from the peak height in the 13C-dimension spectra abstracted from the 2D HSQC spectra. The signal height of CH<sub>3</sub>OH-*d*<sub>4</sub> was selected as the standard and set as 1 in both labeled and non-labeled samples. The other signals were normalized and expressed relative to this signal. The 13C signal intensity ratio was calculated by comparing the normalized 13C signal heights of 13C-labeled and non-labeled samples.
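The normalization and ratio described above can be sketched in a few lines. This is an illustrative reading of the procedure rather than the authors' actual script; the function names and the peak heights in the example are hypothetical and are not taken from the tables.

```python
def normalized_intensity(peak_height, reference_height):
    """Express a 13C peak height relative to the solvent reference peak (set to 1)."""
    if reference_height <= 0:
        raise ValueError("reference height must be positive")
    return peak_height / reference_height


def relative_enrichment_ratio(labeled_peak, labeled_ref, control_peak, control_ref):
    """Ratio X/Y of normalized intensities in labeled (X) and non-labeled (Y) samples."""
    x = normalized_intensity(labeled_peak, labeled_ref)
    y = normalized_intensity(control_peak, control_ref)
    if y == 0:
        raise ValueError("signal not detected in the control sample")
    return x / y


# Hypothetical peak heights, for illustration only.
ratio = relative_enrichment_ratio(
    labeled_peak=3.0e7, labeled_ref=1.0e8,   # X = 0.30
    control_peak=5.0e6, control_ref=2.0e8,   # Y = 0.025
)
print(f"relative enrichment ratio: {ratio:.1f}")
```

Because each spectrum is normalized to its own solvent signal, a raw peak height that is lower in the labeled sample can still yield a ratio above 1 if the reference peak in that spectrum is lower as well.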

## **3. Results and discussion**

#### **3.1. Comparison of growth and metabolism of** *C. roseus* **plants grown in the solid MS medium versus soil**

Two batches of *C. roseus* seeds (each containing 10 seeds) were germinated, one batch in soil and the other in solid MS medium with glucose. They were kept under the same conditions of light and temperature. The height, the leaf size, and the number of leaf pairs of the plants were monitored and recorded regularly from seedling until flowering to determine the growth state of the plants.

After 10–12 days, seeds in both batches germinated and produced their first pair of leaves. In the first 3 weeks after germination, there were no significant differences in height, number of leaf pairs, or leaf size between plantlets grown in MS medium and in soil (**Figure 2**). Thereafter, however, the plantlets in MS medium produced one more pair of leaves than those in soil, but their leaves were much smaller (**Figure 2A** and **B**). Moreover, the soil-grown plantlets grew taller than those grown in MS medium (**Figure 2C**). Plantlets in MS medium started flowering around 100 days after sowing, whereas those in soil flowered at 75 days. The plantlets grown in soil had a higher growth rate and a larger biomass than those grown in MS medium.

Metabolic differences between the plants grown in soil and MS medium were observed by <sup>1</sup>H NMR (**Figure 3**). The <sup>1</sup>H NMR spectra showed that, qualitatively, the metabolites of plants grown in soil or MS medium were similar, but the levels varied (**Table 1**). Plants grown in soil produced higher levels of organic acids and sugars (malate, fumaric acid, glucose, and sucrose) than those grown in MS medium, indicating a reduced level of carbon fixation in the leaves of the MS-grown plants. Also, secondary metabolites (such as secologanin, vindoline, quercetin, and kaempferol) were found in higher levels in soil-grown plants than in the plants grown in MS medium. On the other hand, plants cultured in MS medium displayed significantly higher levels of arginine, glutamine, and asparagine but relatively low levels of glucose and sucrose. The levels of threonine, glutamate, quinic acid, and lactic acid were also higher in plants grown in MS medium than in those grown in soil.

**Figure 2.** Comparison of the number of leaf pairs (A), leaf size (B), and height (C) of *Catharanthus roseus* plants grown in MS medium and soil during the development stage.

**Figure 3.** <sup>1</sup>H NMR spectrum of the crude extracts of *Catharanthus roseus* plants grown in soil (A) and MS medium (B).

**Table 1.** Comparison of metabolite levels in *Catharanthus roseus* plants grown in soil and MS medium, based on <sup>1</sup>H NMR.

Some groups of metabolites have a close correlation with plant growth and biomass, such as the tricarboxylic acid (TCA) cycle intermediates succinate, citrate, or malate, as well as amino acids [11]. Both glutamine and asparagine are the major compounds for nitrogen fixing, transport, and storage in plants [12]. The much more abundant nitrogen source in the medium than in the soil could explain the high levels of these amino acids in the medium-grown plants. Meyer et al. [11] reported a negative correlation between glutamine and plant biomass, which is in line with our findings. Sucrose starvation may lead to the presence of a large excess of asparagine in plant cells [13]. In the present study, the plants cultured on solid MS medium require an aseptic jar with a cap, which limits the space to grow and also affects air exchange, CO<sub>2</sub> availability, and accumulation of volatiles in the head space compared with plants grown in soil. Despite the uptake of carbohydrates from the medium through the roots, the growth was less than that of the plants grown in soil, which are dependent on carbon fixation by leaves. The limited availability of CO<sub>2</sub> in the sterile closed containers may thus be a reason for the lower biomass production.

#### **3.2. [1-13C] glucose feeding experiment and JA elicitation on** *C. roseus* **plantlets**

Samples from different organs (upper and lower leaves, stems, and roots) were measured by proton and carbon NMR. After feeding the plants with [1-13C] glucose for 5 days, the incorporation of 13C label was found in some primary and secondary metabolites detected in all organs of the *C. roseus* plantlets. 13C signals of some primary and secondary metabolites were assigned based on the "in-house" database and some references [14, 15], and confirmed in Chapter 4. In total, 14 amino acids, nine organic acids, two carbohydrates, six phenylpropanoids, five TIAs, two terpenoids, and three other compounds were identified. Among them, only the metabolites whose characteristic signals were clearly visible and non-overlapping in both <sup>1</sup>H and <sup>13</sup>C NMR spectra were quantified (**Figure 4**). Those include some primary metabolites such as amino acids (threonine, alanine, asparagine, aspartate, glutamine, glutamate, and arginine) and malic acid (**Figure 4A**), as well as some secondary metabolites such as phenylpropanoids (chlorogenic acid and 4-*O*-caffeoyl quinic acid), terpenoids (loganic acid and secologanin), and TIA (vindoline) (**Figure 4C**).

**Figure 5** shows the 13C-dimension HSQC spectra and <sup>1</sup>H NMR spectra of the non-labeled sample and the 13C-enriched sample determined in CH<sub>3</sub>OH-*d*<sub>4</sub>. As expected, the superimposed <sup>1</sup>H NMR spectra of leaves (**Figure 5**) and stems (data not shown) did not show any significant difference in proton signal intensity of the metabolites between the control and the 13C-enriched sample. Production of these sugars caused a decrease in the levels of glucose and sucrose in roots but did not affect the metabolite levels in other organs. Apart from this, there was no significant change in metabolite levels between the plants fed with labeled and non-labeled glucose solution (**Table 2**). This information is necessary to confirm that the 13C signals in the spectra of enriched samples are due to incorporation of label, and not to higher levels of production of the metabolites. The superimposed 13C-dimension HSQC spectrum showed that the enriched sample had a much higher intensity of 13C signals than the non-labeled one. The results indicate that the [1-13C] glucose-fed *C. roseus* plants grew normally and incorporated the labeled sugar into their metabolic network. Previous work with *Arabidopsis* supports that 13C feeding does not in itself distort the fluxes through the metabolic network in a plant [16].

#### **3.3. 13C incorporation into primary and secondary metabolites**

The signals in the HSQC spectra of the enriched samples were identified (**Figure 4**). The carbon position of 13C incorporation into a metabolite was investigated by calculating 13C signal intensity ratios between the same carbons of the metabolite in labeled and non-labeled samples (**Table 3**).

**Figure 4.** 2D [<sup>13</sup>C, <sup>1</sup>H] HSQC spectrum of CH<sub>3</sub>OH-*d*<sub>4</sub> extract of *Catharanthus roseus* leaves. A, spectrum region displaying amino acid resonances; B, spectrum region displaying sugar resonances; C, spectrum region displaying aromatic resonances. 1, alanine; 2, threonine; 3, arginine; 4, glutamine; 5, glutamate; 6, malate; 7, aspartate; 8, asparagine; 9, vindoline; 10, loganic acid; 11, chlorogenic acid; 12, 4-*O*-caffeoyl quinic acid.

**Figure 5.** Superimposed <sup>1</sup>H NMR spectra and <sup>13</sup>C-dimension HSQC spectrum of labeled and non-labeled *Catharanthus roseus* plants. Spectra in green are from the non-labeled plant sample; spectra in red are from the 13C-labeled plant sample.

Among amino acids, the signal at δ 16.98, corresponding to C-3 of alanine, exhibited a high 13C relative enrichment ratio. Glycolysis introduces the C-1 or C-6 of glucose into alanine C-3 [8]. Carbon signals at δ 20.47 of threonine and at δ 37.21 of aspartate also showed relatively high labeling. The carbons of arginine and asparagine were apparently less labeled.

Glutamate (C-3 at δ 27.74, C-4 at δ 34.44, and C-2 at δ 55.67) and glutamine (C-3 at δ 27.11, C-4 at δ 31.83, and C-2 at δ 55.02) showed clearly high <sup>13</sup>C incorporation. The relative enrichment ratios of C-3 and C-2 of glutamine were lower than that of C-4, which indicates the entry of a diluting flux of C4 compounds into the TCA cycle [17]. For glutamate, however, C-4 had a lower relative enrichment ratio than C-3 and C-2. Non-symmetrical enrichment ratios of C-2 and C-3 imply that there might be a form of channeling that converts oxoglutarate C-4 to oxaloacetate C-2 or C-3 [18].

**Table 2.** Comparison of metabolite levels in different organs between labeled and non-labeled *Catharanthus roseus* plants.

In plant cells, the labeling of amino acids alanine, glutamate, and aspartate is found to reflect that of the corresponding α-oxoacids: pyruvate, α-oxoglutarate, and oxaloacetate, respectively [19]. The organic acid malate showed a sixfold increased intensity for the carbon signal at δ 43.40.

Besides primary metabolites, secondary metabolites also exhibited clear 13C incorporation. Two phenylpropanoids, chlorogenic acid and its isomer 4-*O*-caffeoyl quinic acid, had an increased 13C intensity of C-6. Incorporation of 13C could be observed for C-3 and C-10 of loganic acid. These results are in agreement with the prediction shown in **Figure 1**. The signal corresponding to C-18 of vindoline in the labeled sample was twofold higher than in the spectrum of the control.

**Table 3.** The chemical shifts, peak height, and relative enrichment ratio of the same carbon signals in metabolites in labeled and non-labeled *Catharanthus roseus* plants.

| Compound | Chemical shift, H | Chemical shift, C | Peak height, control | Peak height, labeled | Relative intensity to CH<sub>3</sub>OH-*d*<sub>4</sub>, control (*Y*) | Relative intensity to CH<sub>3</sub>OH-*d*<sub>4</sub>, labeled (*X*) | Relative enrichment ratio (*X/Y*) |
|---|---|---|---|---|---|---|---|
| Glucose | 4.58 | 97.04 | 3.2E+07 | 1.3E+08 | 0.17 | 2.89 | 17.31 |
| Alanine | 1.49 | 16.98 | 8.0E+06 | 2.1E+07 | 0.04 | 0.47 | 11.26 |
| Glutamine | 2.05 | 27.11 | 4.6E+07 | 3.9E+07 | 0.23 | 0.88 | 3.75 |
| | 2.38 | 31.83 | 1.6E+08 | 1.7E+08 | 0.80 | 3.72 | 4.63 |
| | 3.71 | 55.02 | 3.0E+08 | 2.3E+08 | 1.56 | 5.12 | 3.29 |
| Glutamate | 2.14 | 27.74 | 2.1E+06 | 7.0E+06 | 0.01 | 0.16 | 14.81 |
| | 2.46 | 34.44 | 6.3E+07 | 1.5E+08 | 0.32 | 3.36 | 10.45 |
| | 3.72 | 55.67 | 9.8E+06 | 3.9E+07 | 0.05 | 0.86 | 17.14 |
| Arginine | 1.72 | 24.9 | 1.7E+07 | 6.7E+06 | 0.09 | 0.15 | 1.77 |
| | 1.92 | 28.53 | 2.1E+07 | 8.1E+06 | 0.11 | 0.18 | 1.68 |
| | 3.24 | 41.38 | 5.9E+07 | 2.1E+07 | 0.30 | 0.47 | 1.56 |
| Aspartate | 2.64 | 37.21 | 1.0E+07 | 7.0E+06 | 0.05 | 0.16 | 3.05 |
| Asparagine | 2.83, 2.96 | 35.23 | 1.2E+08 | 4.0E+07 | 0.62 | 0.90 | 1.44 |
| | 3.96 | 52.21 | 1.7E+08 | 4.7E+07 | 0.89 | 1.05 | 1.19 |
| Threonine | 1.34 | 20.47 | 8.8E+06 | 6.2E+06 | 0.05 | 0.14 | 3.08 |
| Malate | 2.35, 2.72 | 43.4 | 6.0E+07 | 8.5E+07 | 0.31 | 1.91 | 6.21 |
| Loganic acid | 1.07 | 12.69 | 2.5E+06 | 5.6E+06 | 0.01 | 0.13 | 9.96 |
| | 7.03 | 146.1 | 6.5E+05 | 5.2E+06 | 0.00 | 0.12 | 34.49 |
| Vindoline | 0.49 | 7.43 | 6.0E+06 | 2.4E+06 | 0.03 | 0.05 | 1.76 |
| Chlorogenic acid | 7.07 | 123.12 | 1.2E+07 | 7.2E+06 | 0.06 | 0.16 | 2.53 |
| 4-*O*-Caffeoyl quinic acid | 7.09 | 124.16 | 1.7E+06 | 2.5E+06 | 0.01 | 0.06 | 6.63 |

#### **3.4. 13C incorporation in different organs**

Based on <sup>1</sup>H NMR spectra, relative levels of primary and secondary metabolites in different organs were calculated by normalizing the integral of signal peaks to the internal standard (TSP). **Table 4** shows that leaves, especially upper leaves, contained higher levels of amino acids, phenylpropanoids, iridoids, and vindoline than stems and roots. In roots, phenylpropanoids and vindoline, whose biosynthesis depends on chloroplasts, were not detected, whereas iridoids displayed a much lower level in roots, while glucose and sucrose had relatively higher levels than in other organs.

**Table 4.** Relative level of metabolites in different organs of *Catharanthus roseus* based on <sup>1</sup>H NMR spectra.

The incorporation of <sup>13</sup>C in different organs (upper leaf, lower leaf, stem, and root) was also investigated by comparison of relative enrichment ratios in order to have a clue about the accumulation of label in different organs and its connection with transport and compartmentation of the pathways in the plants (**Table 5**). From the <sup>13</sup>C dimension of HSQC spectra of all organs, 13C signals of labeled samples showed an apparently higher intensity in the amino acid and sugar areas than those of non-labeled ones (**Figure 6**), which indicated that 13C-isotope was efficiently incorporated into the primary metabolism of intact *C. roseus* plants via the roots. Glucose had higher 13C intensity ratio in lower leaves and roots but relatively low in upper leaves and stems, thus showing a time-dependent distribution through the plant. Glutamate and aspartate, directly derived from α-ketoglutarate and oxaloacetate of the TCA cycle, showed clear 13C enrichment in all organs. So did malate, one of the bricks in the TCA cycle. Meanwhile, glutamate, aspartate, and malate all displayed


**Table 5.** Relative enrichment ratios of the carbons of some metabolites in different organs of *Catharanthus roseus* plants fed with [1-13C] glucose.

the highest 13C intensity ratio in roots. These results indicate that 13C was efficiently incorporated and recycled in the primary metabolism of intact plants. Upper leaves had higher levels and higher relative enrichment ratios of glutamate and malate compared with lower leaves, reflecting the faster rate of TCA cycle in the upper parts for plants growing. The glutamate-derived amino acids glutamine and arginine displayed a different pattern of 13C incorporation. Glutamine showed 13C incorporation in all organs with the highest intensity ratio in roots and the lowest in stems, whereas arginine showed low 13C incorporation in all organs, implying a low flux in its biosynthetic pathway and low usage for other pathways. The 13C incorporation of aspartate-derived amino acids asparagine and threonine was also

13C-Isotope-Labeling Experiments to Study Metabolism in *Catharanthus roseus* http://dx.doi.org/10.5772/65401 111

**Figure 6.** <sup>13</sup>C dimension of HSQC spectra of amino acids (δ 10–55 ppm) and secondary metabolites (δ 105–150 ppm) in different organs of *Catharanthus roseus* after feeding [1-13C] glucose. L, labeled samples; N, non-labeled samples; M, malate.

the highest 13C intensity ratio in roots. These results indicate that 13C was efficiently incorporated and recycled in the primary metabolism of intact plants. Upper leaves had higher levels and higher relative enrichment ratios of glutamate and malate than lower leaves, reflecting the faster TCA cycle rate in the growing upper parts of the plant. The glutamate-derived amino acids glutamine and arginine displayed different patterns of 13C incorporation. Glutamine showed 13C incorporation in all organs, with the highest intensity ratio in roots and the lowest in stems, whereas arginine showed low 13C incorporation in all organs, implying a low flux through its biosynthetic pathway and low usage in other pathways. The 13C incorporation of the aspartate-derived amino acids asparagine and threonine was also different. Threonine had relatively high 13C incorporation in upper leaves and roots but relatively low incorporation in lower leaves and stems, indicating a high turnover in the upper leaves. Asparagine displayed low 13C incorporation in all organs except roots. Pyruvate-derived alanine exhibited the lowest relative enrichment ratio in upper leaves, while the highest was found in stems and roots.

**Table 5.** Relative enrichment ratios of the carbons of some metabolites in different organs of *Catharanthus roseus* plants fed with [1-13C] glucose.

| Compound | <sup>13</sup>C chemical shift (ppm) | Upper leaf | Lower leaf | Stem | Root |
|---|---|---|---|---|---|
| Alanine | 16.98 | 9.76 | 32.38 | 118.89 | 84.86 |
| Threonine | 20.47 | 3.55 | 1.06 | 1.39 | 18.91 |
| Arginine | 24.9 | 1.89 | 1.12 | 1.67 | 2.99 |
| | 28.53 | 1.58 | 1.07 | 1.17 | 1.15 |
| | 41.38 | 1.52 | 0.91 | 0.91 | 1.66 |
| Glutamine | 27.11 | 2.80 | 2.34 | 1.89 | 6.38 |
| | 31.83 | 3.21 | 4.03 | 3.09 | 6.70 |
| | 55.02 | 2.34 | 1.38 | 2.01 | 8.43 |
| Glutamate | 27.74 | 11.79 | 5.39 | 3.36 | 20.65 |
| | 34.44 | 7.21 | 3.37 | 4.63 | 16.75 |
| | 55.67 | 15.49 | 2.72 | 5.11 | 25.06 |
| Asparagine | 35.23 | 1.21 | 0.75 | 1.09 | 5.67 |
| | 52.21 | 1.17 | 0.95 | 1.23 | 4.18 |
| Aspartate | 37.21 | 2.25 | 3.40 | 4.07 | 36.29 |
| Malate | 43.4 | 4.67 | 4.41 | 7.82 | 26.51 |
| β-glc | 97.04 | 30.96 | 55.55 | 15.01 | 32.79 |
| Vindoline | 7.43 | 2.96 | nd | nd | nd |
| Loganic acid | 12.69 | 7.66 | 3.75 | 4.26 | 23.62 |
| | 146.1 | 27.96 | 13.77 | 6.59 | 24.79 |
| Chlorogenic acid | 123.12 | 2.88 | 1.42 | nd | nd |
| 4-*O*-Caffeoyl quinic acid | 124.16 | 10.35 | nd | nd | nd |
| | 146.8 | 93.69 | nd | nd | nd |

Organ columns give the relative enrichment ratio (labeled:control). nd, not detected.
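The relative enrichment ratios (labeled:control) reported in Table 5 can be illustrated with a minimal sketch. It assumes that the ratio is simply the 13C signal intensity of a carbon in the labeled sample divided by the intensity of the same carbon in the non-labeled control; any normalization applied by the authors is not shown here. The function name and the intensity values are hypothetical and serve only as an illustration.

```python
from typing import Optional


def relative_enrichment_ratio(labeled: Optional[float],
                              control: Optional[float]) -> Optional[float]:
    """Return the labeled:control intensity ratio, or None if it cannot be computed."""
    # None stands for "nd" (not detected), as in Table 5.
    if labeled is None or control is None or control == 0:
        return None
    return labeled / control


# Hypothetical HSQC peak intensities (arbitrary units), not measured data:
# metabolite -> (intensity in labeled sample, intensity in control sample)
signals = {
    "metabolite A": (120.0, 10.0),
    "metabolite B": (15.0, 12.0),
    "metabolite C": (None, None),  # not detected in either sample
}

ratios = {name: relative_enrichment_ratio(lab, ctl)
          for name, (lab, ctl) in signals.items()}

for name, ratio in ratios.items():
    print(name, "nd" if ratio is None else f"{ratio:.2f}")
```

A ratio close to 1 would then correspond to a carbon with little 13C incorporation above natural abundance, while larger values indicate stronger incorporation.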

Based on the 13C dimension of the HSQC spectra, leaves had more <sup>13</sup>C signals in the region above δ 100 ppm than stems and roots (**Figure 6**), even after feeding [1-13C] glucose. Upper leaves had relatively high 13C incorporation into vindoline, chlorogenic acid, and 4-*O*-caffeoyl quinic acid, while lower leaves only showed 13C incorporation into chlorogenic acid. The levels of these phenylpropanoids in lower leaves were almost as high as in upper leaves. This means that all three compounds have a high biosynthetic rate in the upper leaves, whereas in the lower leaves the biosynthesis of chlorogenic acid is more active than that of the other two compounds. Lower leaves are older and are entering the senescence phase, which is reflected, among other things, by a lower rate of both primary and secondary metabolism. Studies in previous chapters have shown that vindoline levels show an age-related decrease, consistent with previously reported results [20].

In stems and roots, no 13C signals of vindoline, chlorogenic acid, and 4-*O*-caffeoyl quinic acid were detected with or without feeding [1-13C] glucose. Vindoline is not found in roots because its tissue-specific biosynthesis requires chloroplasts for one of its biosynthetic steps [20–25]. The <sup>13</sup>C signals of loganic acid at δ 12.69 and 146.1 ppm were clearly present and showed a high relative enrichment ratio in the spectra of all organs, while that of secologanin at δ 121.53 ppm was only found in the spectra of leaves. It was difficult to calculate the relative enrichment ratio of secologanin because of signal overlap. In roots and stems, the secologanin level was too low for further analysis. The high levels of loganic acid in the roots are in line with a previous study reporting that LAMT activity, which converts loganic acid into loganin (the direct substrate of secologanin), was four to eight times lower in hairy roots than in the other organs of the plant [22].

#### **3.5. Effect of JA elicitation on <sup>13</sup>C fluxes into metabolic pathways**

JA was added to the labeled glucose solution on the sixth day after the plant roots were submerged in the solution. The control plants were also kept in labeled glucose solution but without JA elicitation. Leaves were harvested at 0, 6, 24, and 72 h after elicitation (6, 7, and 9 d of incubation with the labeled glucose solution) and measured by <sup>1</sup>H NMR and HSQC.

For control plants, NMR spectra showed that the enrichments of malic acid and of the amino acids alanine, arginine, glutamate, glutamine, aspartate, and asparagine in the leaves were nearly identical at 6 and 9 d of incubation with the labeled glucose solution (**Figure 7**), suggesting that a steady state was established at 6 d. However, the incorporation of label into glucose and threonine increased continuously over the measured period of 9 days. In addition, loganic acid and chlorogenic acid kept the same enrichments, while the enrichments of vindoline and 4-*O*-caffeoyl quinic acid increased within 9 days. A previous study with *C. roseus* hairy roots grown in the light showed that the 13C label was not diluted by CO<sub>2</sub> fixation [7]. In tobacco plants grown on agar containing labeled glucose, metabolism was studied on a quantitative basis, showing that the labeled glucose was efficiently absorbed via the root system, metabolized, and recycled [26]. Our results indicate that the *C. roseus* plant system can reach a relatively steady isotopic state with plants growing in 13CLE.

**Figure 7.** Relative enrichment ratio of primary and secondary metabolites during incubation of *Catharanthus roseus* plants with [1-13C] glucose. Gray bars: JA-elicited samples; black bars: control samples; red bars (U): unlabeled samples (without incubation in [1-13C] glucose solution).
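The steady-state reasoning above (enrichments that are nearly identical at 6 and 9 d) can be sketched as a simple comparison. This is only an illustration: the 10% relative tolerance, the function name, and the enrichment values are assumptions made for the example and are not taken from the study.

```python
def is_steady(ratio_day6: float, ratio_day9: float, tolerance: float = 0.10) -> bool:
    """Return True when two enrichment ratios differ by at most `tolerance` (relative)."""
    reference = max(abs(ratio_day6), abs(ratio_day9))
    if reference == 0:
        return True  # both values are zero, so nothing changed
    return abs(ratio_day9 - ratio_day6) / reference <= tolerance


# Hypothetical enrichment ratios at 6 d and 9 d of incubation (not measured data).
enrichment = {
    "metabolite A": (5.0, 5.2),    # nearly identical
    "metabolite B": (10.0, 16.0),  # still increasing
}

steady = {name: is_steady(d6, d9) for name, (d6, d9) in enrichment.items()}

for name, flag in steady.items():
    print(name, "steady" if flag else "not steady")
```

A stricter or looser tolerance would change which metabolites are flagged, so the threshold should be chosen with the measurement error of the NMR data in mind.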

JA elicitation had little effect on the levels of most metabolites, except glutamate, glutamine, vindoline, and loganic acid. Although JA induced an increase in glutamate and glutamine levels (**Figure 8**), their relative enrichment ratios remained unchanged compared with the controls. At the same time, the enrichment of alanine at C-3 increased without any change in its level compared with the controls. Vindoline levels increased and reached their highest level at 72 h after JA treatment (23% higher than the controls) (**Figure 8**). However, the relative enrichment ratio of the C-18 signal of vindoline was lower in JA-elicited samples than in the controls, especially at 6 h (**Figure 7**). The level of loganic acid decreased with time (**Figure 8**), leading to a dramatic decrease of its enrichment at both C-3 and C-10 from 6 to 72 h. The levels of chlorogenic acid and 4-*O*-caffeoyl quinic acid did not change over the time course after JA elicitation (**Figure 8**), but their enrichments were lower than those of the labeled control samples (**Figure 7**). 13C fluxes into various metabolic pathways, such as those of glutamate and loganic acid, could be disturbed within 24 h after MeJA treatment [27].

**Figure 8.** Relative levels of metabolites in *Catharanthus roseus* leaves after JA elicitation. Gray bars: JA-elicited samples; black bars: control samples.

## **4. Future prospects**

Metabolic flux analysis is the quantification of all intracellular fluxes in an organism and is thus an important cornerstone of metabolic engineering and systems biology. This study reports a comprehensive 13C-labeling-based metabolomics analysis of a plant system. [1-13C] glucose was efficiently absorbed via the root system and recycled throughout the whole *C. roseus* plant. The *C. roseus* plant system could reach a relatively steady isotopic state in 13CLE and therefore appears well suited to studying flux contributions to the biosynthesis of sink metabolites for systems biology. Combined with exogenous elicitation, 13C MFA appears to be a good tool for studying the cross-links among pathways in the complicated plant metabolic network. The development of a comprehensive flux analysis tool for the *C. roseus* plant system is expected to be valuable in assessing the metabolic impact of genetic or environmental changes.

## **Author details**

Qifang Pan<sup>1</sup>\*, Natali Rianika Mustafa<sup>2</sup>, Robert Verpoorte<sup>2</sup> and Kexuan Tang<sup>1</sup>

\*Address all correspondence to: panqf@sjtu.edu.cn

1 Plant Biotechnology Research Center, SJTU-Cornell Institute of Sustainable Agriculture and Biotechnology, Fudan-SJTU-Nottingham Plant Biotechnology R&D Center, School of Agriculture and Biology, Shanghai Jiaotong University, Shanghai, PR China

2 Natural Products Laboratory, Institute of Biology, Leiden University, Leiden, The Netherlands

## **References**



[4] Contin A, van der Heijden R, Lefeber AWM, Verpoorte R. The iridoid glucoside secologanin is derived from the novel triose phosphate/pyruvate pathway in a *Catharanthus roseus* cell culture. FEBS Lett. 1998;434: 413–416.

[5] Mustafa NR, Kim HK, Choi YH, Erkelens C, Lefeber AW, et al. Biosynthesis of salicylic acid in fungus elicited *Catharanthus roseus* cells. Phytochemistry. 2009;70: 532–539.

[6] Sriram G, Fulton DB, Shanks JV. Flux quantification in central carbon metabolism of *Catharanthus roseus* hairy roots by 13C labeling and comprehensive bondomer balancing. Phytochemistry. 2007;68: 2243–2257.

[7] Schuhr CA, Radykewicz T, Sagner S, Latzel C, Zenk MH, et al. Quantitative assessment of crosstalk between the two isoprenoid biosynthesis pathways in plants by NMR spectroscopy. Phytochem Rev. 2003;2: 3–16.

[8] Lundström P, Teilum K, Carstensen T, Bezsonova I, Wiesner S, et al. Fractional 13C enrichment of isolated carbons using [1-13C]- or [2-13C]-glucose facilitates the accurate measurement of dynamics at backbone Cα and side-chain methyl positions in proteins. J Biomol NMR. 2007;38: 199–212.

[9] Murashige T, Skoog F. A revised medium for rapid growth and bio assays with tobacco tissue cultures. Physiol Plantarum. 1962;15: 473–497.

[10] Kim HK, Khan S, Wilson EG, Kricun SDP, Meissner A, et al. Metabolic classification of South American Ilex species by NMR-based metabolomics. Phytochemistry. 2010;71: 773–784.

[11] Meyer RC, Steinfath M, Lisec J, Becher M, Witucka-Wall H, et al. The metabolic signature related to high plant growth rate in *Arabidopsis thaliana*. Proc Nat Acad Sci. 2007;104: 4759–4764.

[12] Lea PJ, Sodek L, Parry MA, Shewry PR, Halford NG. Asparagine in plants. Ann App Biol. 2007;150: 1–26.

[13] Genix P, Bligny R, Martin J-B, Douce R. Transient accumulation of asparagine in sycamore cells after a long period of sucrose starvation. Plant Physiol. 1990;94: 717–722.

[14] Choi YH, Tapias EC, Kim HK, Lefeber AWM, Erkelens C, et al. Metabolic discrimination of *Catharanthus roseus* leaves infected by phytoplasma using <sup>1</sup>H-NMR spectroscopy and multivariate data analysis. Plant Physiol. 2004;135: 2398–2410.

[15] Mustafa NR, Kim HK, Choi YH, Verpoorte R. Metabolic changes of salicylic acid-elicited *Catharanthus roseus* cell suspension cultures monitored by NMR-based metabolomics. Biotechnol Let. 2009;31: 1967–1974.

[16] Kruger NJ, Huddleston JE, Le Lay P, Brown ND, Ratcliffe RG. Network flux analysis: Impact of 13C-substrates on metabolism in *Arabidopsis thaliana* cell suspension cultures. Phytochemistry. 2007;68: 2176–2188.

[17] Malloy CR, Sherry AD, Jeffrey F. Evaluation of carbon flux and substrate selection through alternate pathways involving the citric acid cycle of the heart by 13C NMR spectroscopy. J Biol Chem. 1988;263: 6964–6971.

[18] Dieuaide-Noubhani M, Raffard G, Canioni P, Pradet A, Raymond P. Quantification of compartmented metabolic fluxes in maize root tips using isotope distribution from 13C- or 14C-labeled glucose. J Biol Chem. 1995;270: 13147–13159.


#### **Metabolomics Approaches and their Hidden Potential for Explaining the Mycotoxin Contamination Problem**

Laura Righetti, Chiara Dall'Asta, Jana Hajslova and Josep Rubert

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/65647

#### **Abstract**

Food is essential for life, and consumers have a right to expect that the foods they purchase and consume will be safe, authentic and of high quality. On these premises, target compounds such as mycotoxins, pesticides and antibiotics have been commonly investigated in the food chain and subsequently regulated by authorities. This raises the following question: can consumers be protected from these exposure risks? Probably not, because the food chain is becoming longer and more complex than ever before. The food chain is affected by globalized trade, culture, travel and migration, an ageing population, changing consumer trends and habits, new technologies, emergencies, climate change and extreme weather events, all of which are increasing foodborne health risks, especially for mycotoxins. Mycotoxins are natural toxic compounds produced by certain filamentous fungi on many agricultural commodities. These toxins have adverse effects on humans, animals and crops that result in illnesses and economic losses. So far, mycotoxins and their modified forms have been mainly monitored in cereals and cereal-based products; however, can early detection of mycotoxins be considered a reliable strategy? In this chapter, recent metabolomics approaches are reviewed in order to answer this question and to understand future strategies in the field of mycotoxin contamination.

**Keywords:** food metabolomics, mycotoxins, plant metabolome, fungal pathogens

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### **1. Introduction**

Mycotoxins are secondary metabolites (300–800 Da) produced by filamentous fungi that colonize crops in the field and during storage; cereals are among the most affected commodities [1]. Fungal colonization is strongly dependent on environmental conditions and agricultural practices. Climatic factors such as temperature, humidity and rainfall, as well as the concomitant presence of other pests or insects, may support fungal infection. Therefore, climate change is significantly affecting the mycotoxin contamination of crops worldwide. As a consequence, fungal infection and related pathogenic diseases can cause significant yield losses, quality reductions and mycotoxin accumulation in crops, particularly grains [2].

Although regulations, adequate quality controls and good agricultural practices have been implemented in many countries, mycotoxin contamination still represents a serious challenge for global trade in terms of threats to animal and human health and economic losses. For this reason, common standard procedures for fungal biocontrol and mycotoxin mitigation are under investigation.

From a toxicological perspective, mycotoxins can cause both acute and chronic effects in humans and animals. They are responsible for a broad spectrum of toxic activities, ranging from severe adverse effects on the liver, kidney, and the hematopoietic, immune, foetal and reproductive systems to a significant contribution to carcinogenic and mutagenic developments [3]. In fact, the International Agency for Research on Cancer (IARC) has formally classified a number of mycotoxins. For example, four aflatoxins (AFB1, AFB2, AFG1 and AFG2) are classified in Group 1, while ochratoxin A (OTA) is classified in Group 2B [4, 5].

Among mycotoxins, those produced by *Fusarium* spp. are often found in cereals and are also related to pathogenic diseases in plants. In particular, *Fusarium* Head Blight (FHB) is recognized as one of the most destructive global diseases of wheat and barley [6]. FHB can cause significant reductions in grain yield and quality and is associated with the accumulation of mycotoxins such as deoxynivalenol (DON). Thus, besides its severe economic impact due to losses in productivity, FHB represents a serious health risk for consumers and livestock [3]. In order to reduce the economic and health impact of FHB, several cultural practices have been proposed so far. However, crop rotation, tillage and the use of fungicides or other biocontrol agents are generally regarded as insufficient on their own to tackle FHB and mycotoxin contamination [3]. This is mainly because the breeding of grains for superior technological properties has led to a decrease in genetic diversity, with a subsequent increase in susceptibility to pathogenic diseases. Therefore, the study of the plant response to fungal infection is crucial for developing possible strategies to counteract mycotoxin accumulation.

From a biological point of view, the role of mycotoxins in fungal colonization is still to be clarified. Some of them, such as deoxynivalenol (DON), have been shown to be virulence factors for fungal infection [7]. However, the intense cross-talk between plant and pathogen affects the biological cascade, from genes to metabolites, and plays a significant role in mycotoxin accumulation. Fungal infection and mycotoxin contamination are commonly addressed with classical methods, from DNA-based techniques for fungal identification to analytical methodologies for mycotoxin detection. The residual DNA content of fungal pathogens has been used to unequivocally identify fungal species and to associate them with cereals and their mycotoxins [8], essentially allowing toxigenic fungi to be monitored. However, the main disadvantages of this technique are its relatively high cost and the fact that it is time-consuming. In addition, a poor correlation between fungal growth and mycotoxin accumulation has been pointed out.


For this reason, classical chromatographic methods are often used for mycotoxin determination in crops and products thereof [9–11]. Over the last decade, mass spectrometry (MS)-based methods have become the gold standard for mycotoxin analysis, with the multitoxin approach being the most promising strategy to control the occurrence of multiple analytes in the same material [12]. As a complement, rapid diagnostic methods, commonly based on immunochemical assays (e.g., lateral flow devices and dipsticks), are used for early detection at pre- and post-harvest [13]. More recently, nondestructive imaging methods have also been proposed as rapid diagnostic tools [14]. In this context, untargeted methodologies have started to be applied only recently, and only to meet specific needs. In particular, omics strategies have been applied to the mycotoxin issue to investigate the interaction between the plant and the pathogen in the field that leads to mycotoxin accumulation [3, 15–19].

In a top-down view, genomics and transcriptomics studies have been proposed to investigate the biosynthetic pathways for mycotoxin production and their regulation under biotic and abiotic stress. Similarly, proteomics has often been proposed for identifying enzymes and proteins responsive to pathogenic diseases, such as FHB [20], or responsible for mycotoxin modification in plants [21, 22]. Over the last decade, however, the field of metabolomics has gained increasing interest across all disciplines and has found a prominent role in mycotoxin-related studies as well. Metabolomics is an emerging technique that can be considered complementary to the other omics approaches and that offers unique advantages. A metabolic fingerprint may generate thousands of data points, of which only a handful might be needed to describe the problem adequately [23, 24]. Extracting the most meaningful elements of these data is thus key to generating useful new knowledge with mechanistic or explanatory power.

To date, however, in the vast majority of cases, mycotoxin contamination has been directionally explored. As a result, the mycotoxin contamination loop has not been properly closed and many issues are still open. One of the main challenges in mycotoxin analysis will be to improve our limited understanding of the role of plant-pathogen cross-talk at the molecular level. In this context, a global multiomics strategy may be able to identify chemical markers at the earliest stage, to unequivocally characterize resistant varieties and to enable the early detection of mycotoxins. The early detection of toxigenic fungi, or of markers of the interaction between the pathogen and its host, can be usefully exploited to limit the entry of mycotoxins into the food/feed production chain.

## **2. Advanced analytical tools merged with chemometrics**

The multiomics approach has rarely been compared with the classical approaches described in the previous section. Initially, innovative spectral techniques (e.g., imaging analysis, near-infrared and Raman spectroscopy) were proposed for the early detection of fungal pathogens [25, 26]. Since fungal growth is not strictly related to mycotoxin accumulation or to the pattern of occurring mycotoxins, these techniques cannot provide information on mycotoxin occurrence or on chemical markers, which are mainly linked to plant-pathogen interactions. In this framework, metabolomics may represent an ideal tool for understanding the biological pathways involved in mechanisms of plant resistance. Nowadays, gas chromatography (GC) and liquid chromatography (LC), mainly coupled to mass spectrometry (MS), are commonly used for metabolomics approaches [3]. In principle, LC-MS and GC-MS provide a high number of scans per peak, allowing peak picking and alignment (feature extraction) and, if necessary, quantification, as well as a large dynamic range for monitoring metabolites at both low and high concentration levels.

#### **2.1. Liquid chromatography coupled to mass spectrometry (LC-MS)**

LC-MS has been the most commonly used metabolic fingerprinting/profiling approach for understanding plant resistance mechanisms and plant-pathogen cross-talk. For instance, Cajka et al. [27] recently developed an analytical procedure based on an optimized solid-liquid extraction using methanol/water (50:50, v/v) to isolate polar/medium-polar barley metabolites, followed by ultra-high-performance liquid chromatography quadrupole time-of-flight (UHPLC-QTOF) analysis. **Figure 1** shows unique and shared metabolites acquired by UHPLC-QTOF using both positive and negative ionization modes.

**Figure 1.** Venn diagrams illustrating shared and unique features in barley extracts prepared under the different extraction procedures and analyzed by both positive (A) and negative (B) ionization modes UHPLC-QTOF.

The authors demonstrated how a careful, in-depth investigation of sample preparation could support the extraction of the broadest spectrum of metabolites from the matrix, in this particular case barley. As expected, UHPLC-QTOF chemical fingerprints differed significantly depending on the extraction solvent used (see **Figure 2**). For example, when deionized water was used, the extraction efficiency for less polar compounds was lower. In contrast, sample preparation using a mixture of acetonitrile/water (84:16, v/v) or methanol/water (50:50, v/v) enhanced the extraction of less polar compounds, and polar compounds were also detected. As a compromise, the authors chose methanol/water (50:50, v/v), since this extraction mixture permitted the isolation of both highly polar and less polar metabolites. So far, various proportions of aqueous methanol have mainly been applied, as can be seen in **Table 1**. In this way, the changes occurring in both primary carbohydrate and primary nitrogen metabolism upon plant infection have been partially elucidated. On the other hand, lipidomic approaches applying more nonpolar solvents (e.g., hexane, dichloromethane, ethyl acetate) have been used exclusively to investigate plant-pathogen cross-talk in maize [28–30]. Increasing evidence indicates that lipid signalling is an integral part of the complex regulatory network in plant-pathogen cross-talk.
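The comparison of shared and unique features between extraction procedures, as summarized in a Venn diagram, can be sketched with plain set operations. This is a simplified illustration and not the data processing used by the cited authors: features are matched here by rounding m/z and retention time, and the rounding precision, the function name, and the feature values are all hypothetical.

```python
from typing import Iterable, Set, Tuple

Feature = Tuple[float, float]  # (m/z, retention time in min)


def bin_features(features: Iterable[Feature],
                 mz_decimals: int = 2,
                 rt_decimals: int = 1) -> Set[Feature]:
    """Round m/z and retention time so that near-identical features coincide."""
    return {(round(mz, mz_decimals), round(rt, rt_decimals)) for mz, rt in features}


# Hypothetical feature lists from two extraction solvents (not measured data).
extract_a = [(180.0634, 1.52), (255.2330, 7.81), (341.1089, 2.10)]
extract_b = [(180.0631, 1.49), (341.1091, 2.12), (609.1461, 5.40)]

set_a = bin_features(extract_a)
set_b = bin_features(extract_b)

shared = set_a & set_b       # features found with both solvents
unique_a = set_a - set_b     # features found only with solvent A
unique_b = set_b - set_a     # features found only with solvent B

print(len(shared), "shared;", len(unique_a), "unique to A;", len(unique_b), "unique to B")
```

Rounding is a crude matching rule, because two features that fall on opposite sides of a rounding boundary are treated as different; tolerance-based matching would be more robust in practice.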

markers, mainly linked to the plant-pathogen interactions. In this framework, metabolomics may represent the golden tool for understanding the biological pathways involved in mechanisms of plant resistance. Nowadays, gas chromatography (GC) and liquid chromatography (LC) are commonly used for metabolomics approaches, mainly coupled to mass spectrometry (MS) [3]. In principle, LC-MS and GC-MS provide a high number of scans per peak, allowing peak picking and alignment (feature extraction), and if necessary quantification, as well as a large dynamic range in order to monitor low and high concentration levels of metabolites.

LC-MS has been the most commonly used metabolic fingerprinting/profiling approach for understanding plant resistance mechanisms and the plant-pathogen cross-talk. For instance, Cajka et al. [27] have recently developed an analytical procedure based on the optimization of a solid-liquid extraction procedure using methanol/water (50:50, v/v), in order to isolate polar/ medium-polar barley metabolites followed by ultra high performance liquid chromatography quadrupole-time-of-flight (UHPLC-QTOF) [27]. **Figure 1** shows unique and shared metabo-

**Figure 1.** Venn diagrams illustrating shared and unique features in barley extracts prepared under the different extrac-

The authors demonstrated how the carefully in-depth investigation of sample preparation could support the extraction of the broadest spectrum of metabolites isolated from the matrix, in this particular case barley. Obviously, UHPLC-QTOF chemical fingerprints differed significantly depending on the extraction solvent used (see **Figure 2**). For example, when deionized water was used, a lower extraction efficiency of less polar compounds was exhibited. Nevertheless, sample preparation using a mixture of acetonitrile/water (84:16, v/v) or methanol/water (50:50, v/v) enhanced the extraction of less polar and polar compounds were also detected. The authors, as a compromise, chose methanol/water (50:50, v/v), since the extraction mixture permitted isolation of both highly polar and less polar metabolites. So far, various proportion of aqueous methanol has been mainly applied, as it can be seen in **Table 1**. In this way, the changes occurring both in primary carbohydrates and primary nitrogen metabolism upon plant infection have been partially elucidated. On the other hand, lipidomic approaches applying more nonpolar solvent (e.g., hexane, dichloromethane, ethyl acetate) have been exclusively used to investigate the plant-pathogen cross-talk in maize [28–30]. Increasing

tion procedures and analyzed by both positive (A) and negative (B) ionization modes UHPLC-QTOF.

lites acquired by UHPLC-QTOF using both positive and negative ionization modes.

**2.1. Liquid chromatography coupled to mass spectrometry (LC-MS)**


**Figure 2.** Overlaid extracted ion chromatograms (EICs) based on MetExtract data processing output showing the biotransformation products of a sample treated with a mixture of 12C/13C-HT-2 toxin (red trace) and one treated with a mixture of 12C/13C T-2 toxin (blue trace). EICs of nonlabeled metabolites were displayed with positive intensities; those of the corresponding labeled metabolites were displayed as negative intensities.

Besides fingerprinting approaches, a metabolic profiling strategy based on stable isotopic labelling has recently been applied to understand the metabolic fate of HT-2 toxin and T-2 toxin in wheat [31]. In general, untargeted metabolomics approaches rely on generic settings for sample preparation (usually a simple extraction without any purification step, or no sample preparation at all), separation and detection. In contrast, when a particular group of metabolites is preselected, metabolic profiling is carried out, which requires a more specific extraction procedure and chromatographic separation/detection. Accordingly, this study focused on type A trichothecenes, namely HT-2 and T-2 toxins, and their detoxification pathways.

The stable isotopic labelling approach applied is innovative, since monitoring pairs of corresponding nonlabeled and labeled compounds allowed the toxin-derived part of the metabolome to be easily tracked and interpreted. Liquid chromatography high-resolution tandem mass spectrometry (LC-HRMS/MS) spectra of the observed metabolites of HT-2 and T-2 toxins were compared with those obtained in wheat and were shown to be identical. **Figure 2** shows overlaid extracted ion chromatograms of all detected biotransformation products. The authors demonstrated that exposure of wheat to either HT-2 or T-2 toxin primarily activates biotransformations involving hydroxylation, (de)acetylation and various conjugations. Furthermore, kinetic data revealed that detoxification progressed rapidly, resulting in almost complete degradation of the toxins within 1 week after a single exposure.
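The pairing of nonlabeled and labeled signals described above can be illustrated with a minimal Python sketch. It is not the MetExtract algorithm used in the cited study; it only assumes singly charged ions and a fully 13C-labeled carbon skeleton, so that a labeled ion is shifted by the number of carbons times the 13C/12C mass difference (1.003355 Da). The function name, the tolerances and the feature values are hypothetical.

```python
# Mass difference between 13C and 12C in Da (physical constant).
C13_C12_DELTA = 1.003355


def find_labeled_pairs(features, n_carbons, ppm_tol=5.0, rt_tol=0.05):
    """Pair native and fully 13C-labeled features.

    features  : list of (m/z, retention time in min) for singly charged ions
    n_carbons : number of labeled carbon atoms expected in the ion (assumption)
    ppm_tol   : allowed m/z deviation in ppm (illustrative default)
    rt_tol    : allowed retention time difference in min (illustrative default)
    """
    shift = n_carbons * C13_C12_DELTA
    pairs = []
    for mz_native, rt_native in features:
        expected_mz = mz_native + shift
        for mz_labeled, rt_labeled in features:
            ppm_error = abs(mz_labeled - expected_mz) / expected_mz * 1e6
            # Labeled and native forms co-elute, so both m/z and RT must agree.
            if ppm_error <= ppm_tol and abs(rt_labeled - rt_native) <= rt_tol:
                pairs.append(((mz_native, rt_native), (mz_labeled, rt_labeled)))
    return pairs


# Synthetic feature list (not measured data): one native/labeled pair plus
# one unrelated feature.
features = [(400.0000, 5.00), (422.0738, 5.01), (410.0000, 7.00)]
pairs = find_labeled_pairs(features, n_carbons=22)
print(pairs)
```

Because biotransformations such as deacetylation remove carbon atoms, a real search would probably scan a range of carbon numbers rather than a single value.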



#### **2.2. Gas chromatography-mass spectrometry (GC-MS)**

Surprisingly, GC coupled to high-resolution mass spectrometry (HRMS) has not yet been applied to metabolomics strategies. As discussed above, HRMS allowed metabolic pathways to be clearly described. Nevertheless, GC coupled to a single quadrupole offers sufficient selectivity and specificity for metabolomics approaches, since the available databases of mass spectra and retention indices allow metabolites to be tentatively identified thanks to the extensive and reproducible fragmentation pattern obtained in full-scan mode using electron ionisation (EI). A recent study focused on the applicability of GC-EI-MS to understand deoxynivalenol (DON) accumulation in wheat [15]. The experimental design was clearly described and similar to that of the studies discussed above. However, sample preparation took longer than for LC-MS because of the derivatization step, based on silylation. Many metabolites contain polar functional groups and are either thermally labile or insufficiently volatile for separation by GC; therefore, derivatization often has to be applied. Oximation and silylation are the most commonly applied reactions because of their universality and versatility [24].
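Since library matching in GC-MS relies on retention indices as well as mass spectra, a short Python sketch of the linear retention index for temperature-programmed GC may help. It assumes that a series of n-alkanes was analyzed under the same conditions as the sample; the function name and the retention times are hypothetical and do not come from the cited studies.

```python
from bisect import bisect_right


def linear_retention_index(rt, alkane_rts):
    """Linear retention index of a peak eluting at `rt`.

    alkane_rts maps the carbon number of each n-alkane standard to its
    retention time, measured under the same temperature program (assumption).
    """
    ladder = sorted(alkane_rts.items(), key=lambda item: item[1])
    times = [t for _, t in ladder]
    if rt < times[0] or rt > times[-1]:
        raise ValueError("retention time is not bracketed by the alkane ladder")
    # Index of the alkane eluting at or just before the peak.
    i = min(bisect_right(times, rt) - 1, len(ladder) - 2)
    (n, t_n), (n_next, t_next) = ladder[i], ladder[i + 1]
    # Linear interpolation between the two bracketing alkanes.
    return 100.0 * (n + (n_next - n) * (rt - t_n) / (t_next - t_n))


# Illustrative alkane ladder (retention times in min, not measured data).
alkanes = {10: 5.0, 11: 6.0, 12: 7.5}
print(linear_retention_index(5.5, alkanes))   # halfway between C10 and C11
print(linear_retention_index(6.75, alkanes))  # halfway between C11 and C12
```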

| Instrumentation | Extraction | Plant | Chemical group | Markers class | References |
|---|---|---|---|---|---|
| LC-HRMS | Water/methanol (45:65, v/v) | Barley | Fatty acids, flavonoid phenylpropanoids, amino acids, terpenoids, organic acids | RRC, PRr, PRs, RI | Bollina et al. [16] |
| LC-HRMS | Water/methanol (50:50, v/v) | Barley | Fatty acids, flavonoid phenylpropanoids | RRI, RRC, RI | Bollina et al. [17] |
| LC-HRMS | Water/methanol (50:50, v/v) | Barley | Fatty acids, phenylpropanoids | RR, RI | Cajka et al. [27] |
| LC-HRMS | Water/methanol (40:60, v/v) | Wheat | Fatty acids, phenylpropanoids, terpenoids | RRI, RRC, RI, PRp | Gunnaiah et al. [18] |
| LC-HRMS | Water/methanol (40:60, v/v) | Wheat | Flavonoid phenylpropanoids, terpenoids, amino acids, carbohydrates | RRC, RRI, RI | Gunnaiah et al. [19] |
| GC-MS | Water/methanol/formic acid (74:25:1, v/v) | Wheat | Polyamines, amino acids, phenylpropanoids, carbohydrates | RR | Warth et al. [15] |
| GC-MS | Water/methanol/formic acid (74:25:1, v/v) | Wheat | Amino acids, amines, carbohydrates | RR | Nussbaumer et al. [36] |
| GC-MS | Water/methanol (50:50, v/v) | Wheat | Polyamines, amino acids, phenylpropanoids, carbohydrates | RR | Paranidharan et al. [35] |
| 1H NMR | Methanol/water (40:60, v/v) | Wheat | Amines, amino acids, carbohydrates | RR, PR | Cuperlovic-Culf et al. [33] |
| 1H NMR | Methanol/water (40:60, v/v) | Wheat | Amines, amino acids, carbohydrates, phenylpropanoids | RR, PR | Browne et al. [24] |

RRC, resistance-related constitutive; RRI, resistance-related induced; RI, resistance indicator; PR, pathogenesis-related; PRr, pathogenesis-related resistant; PRs, pathogenesis-related susceptible; PRp, pathogenesis-related proteins.

**Table 1.** Putative metabolites involved in *Fusarium* Head Blight resistance, reported in the literature so far.

#### **2.3. Data processing to extract meaningful markers**

To process the massive amount of information generated by separation techniques coupled to mass spectrometry, effective software tools capable of rapid data mining have to be used. Note that data matrices contain thousands of variables (*m/z*, RT, intensity), which have to be converted into more manageable information [24].

Data processing and data pretreatment must be carried out to permit the identification of significant metabolites, i.e., those that capture the bulk of the variation between datasets and may therefore serve as potential biomarkers. Data processing usually involves four basic steps: deconvolution, alignment, filtering and gap filling. The features, defined by their *m/z* and retention time, and their intensities in the different samples are then used for statistical analysis. Sample grouping can be visualized using scores plots, heatmaps or hierarchical clustering. After data pretreatment, a statistical comparison can be performed by multivariate data analysis (MVDA). This step usually involves unsupervised models, such as principal component analysis (PCA), and supervised classification tools, such as PLS-DA and OPLS-DA. The supervised methods are used to maximize the differences between groups and to highlight potential biomarkers. When the experimental design is more complex, the use of the *t*-test or other univariate data analysis (UVDA) tools represents the best choice [32].
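The filtering and gap-filling steps mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: features detected in fewer than half of the samples are discarded, and remaining gaps are filled with half of the lowest observed intensity of that feature. Both rules are common conventions rather than the procedure of any study cited here, and the feature names and intensities are invented.

```python
import math


def filter_and_fill(table, min_present=0.5):
    """Filter sparse features and fill remaining gaps.

    table maps a feature ID to its intensities across samples, with None
    marking a sample in which the feature was not detected.
    """
    cleaned = {}
    for feature, values in table.items():
        observed = [v for v in values if v is not None]
        # Filtering: drop features detected in too few samples.
        if not observed or len(observed) / len(values) < min_present:
            continue
        # Gap filling: half of the lowest observed intensity (assumed rule).
        fill_value = min(observed) / 2.0
        cleaned[feature] = [fill_value if v is None else v for v in values]
    return cleaned


def log2_fold_change(values, group_a, group_b):
    """log2 ratio of the mean intensity in group_b over group_a."""
    mean_a = sum(values[i] for i in group_a) / len(group_a)
    mean_b = sum(values[i] for i in group_b) / len(group_b)
    return math.log2(mean_b / mean_a)


# Invented example: samples 0-1 are mock-inoculated, samples 2-3 inoculated.
table = {
    "M297T5.2": [100.0, 120.0, None, 400.0],
    "M181T3.1": [None, None, None, 50.0],
}
cleaned = filter_and_fill(table)
fold_change = log2_fold_change(cleaned["M297T5.2"], [0, 1], [2, 3])
print(cleaned, fold_change)
```

The resulting complete matrix is the kind of input that PCA, PLS-DA or univariate tests require.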

## **3. Metabolomics to decipher pathways involved in** *Fusarium* **Head Blight resistance**

As already mentioned in the Introduction, fungal pathogens such as *Fusarium graminearum*, and the mycotoxins they produce, cause diseases of wheat (*Triticum aestivum* L.) and barley worldwide [6]. Up to now, however, none of the preventive measures applied, such as fungicides or crop rotation, has proved effective. Breeding for increased pathogen resistance seems to be the most promising and environmentally safe strategy for controlling mycotoxin accumulation in grains. It is known that plant resistance mechanisms may be controlled by several quantitative trait loci (QTLs) that contribute to overall pathogen resistance in three different ways, classified as type 1, 2 and 3 and referred to as resistance to initial infection of spikelets, to spread of the pathogen within spikes and to accumulation of mycotoxins, respectively. The QTLs involved are typically linked to, or contain, the genes that control the phenotype. More than one hundred QTLs for FHB resistance in wheat have already been identified [3, 7, 15, 32]. However, fully resistant varieties have yet to be identified or bred. Thus, there is an urgent need to better understand the mechanisms of resistance against *Fusarium* spp. in order to develop novel strategies and resistant varieties.

Recent advances in metabolomics offer new opportunities to elucidate the complex metabolic pathways involved in *Fusarium* resistance and to identify potential FHB resistance biomarker metabolites in barley and wheat [3, 15–19, 32]. During the last decade, the application of metabolomics in this field has increased significantly. Nevertheless, knowledge is still partial, and there is a long way to go towards a full understanding of plant-pathogen interactions. This new scenario will provide useful knowledge of the plant metabolome, as already illustrated by a few examples in the previous section.

Different strategies have been applied so far: NMR for polar metabolites [33, 34], LC-QTOF for semipolar metabolites [16–19, 32] and GC-MS for volatile compounds [15, 35, 36]. However, it should be kept in mind that no strategy is able to extract and detect the entire metabolome simultaneously. Consequently, the data delivered by metabolomics studies cover only a fraction of the metabolome. In addition, the resistance mechanism results from multiple interactions between genes, proteins, metabolites and environmental factors. Therefore, a multiomics approach based on proteomics and metabolomics could help to overcome the limitations of a single experimental design. For example, an integrated nontargeted metabolo-proteomics approach was recently published [18, 32]. This strategy proved to be a powerful tool for a more comprehensive elucidation of the mechanism, successfully revealing changes in the wheat primary metabolism in response to *F. graminearum*.

## **4. Setting up of the experimental plan**

Depending on the hypotheses to be tested, different combinations of plants and fungal pathogens can be employed to explore the system relationship. To date, metabolomic approaches have been mainly restricted to the study of resistance against *F. graminearum* and *F. culmorum* in wheat and barley [3]. Resistance mechanisms have been elucidated by using wheat/barley genotypes with various levels of resistance, classified as susceptible, intermediate and resistant. However, most studies compare unrelated germplasms, which confounds the interpretation of the data, since the differences in the metabolic profiles may actually result from the cultivar background [3]. Thus, the use of near isogenic lines (NILs) that differ in the QTL conditioning FHB resistance is suggested as the best approach to reduce this complexity and to reach conclusive evidence on resistance functions [18].

As for the comparison, mock-inoculated versus pathogen-inoculated plants is considered the best approach to highlight differences. Gunnaiah et al. [19] instead designed a different experiment in order to elucidate the host biochemical resistance to FHB spread in response to trichothecene-producing and nonproducing isolates of *F. graminearum*. The two *F. graminearum* strains differed in the loss of function of the Tri5 gene [19]. In addition to *F. graminearum* inoculation, Warth et al. [15] also injected DON into the middle florets of spikelets to decipher the mechanism of plant resistance to the toxin. Experiments have been performed under field conditions [27], in the greenhouse [16–18, 33, 36], in some cases with computer-controlled settings for light, temperature and relative air humidity [15], and, more recently, in environmentally controlled growth chambers [34]. All these approaches are summarized in **Table 1**, together with the extraction and detection methodologies applied, the plants used and the main classes of metabolites identified by the authors so far.

## **5. Elucidating FHB resistance mechanisms by metabolomics**


Plant resistance to *Fusarium* Head Blight and related mycotoxin accumulation has been described through five major types of mechanism, mainly defined for wheat and later applied to other cereals. These mechanisms are often host-specific and thus require plant-specific elucidation studies. Type I resistance is related to the initial infection of the floret in wheat and barley, and of the silk in maize [37]. The spread of infection is then limited by type II and type III resistance. Type IV resistance is related to tolerance and the ability to maintain yields, and type V resistance gathers all mechanisms of resistance to mycotoxin accumulation [38–40]. According to Boutigny et al. [41], type V-1 represents resistance to toxin accumulation through metabolic biotransformation [42, 43], while type V-2 corresponds to resistance due to the inhibition of mycotoxin biosynthesis by plant endogenous compounds. In this field, metabolomics has so far been exploited to compare the metabolite composition of resistant and susceptible varieties upon *Fusarium* infection, allowing the definition of a large set of compounds potentially involved in FHB modulation [3, 15–19]. Among those, fatty acids and their derivatives have been found to be involved in the plant-pathogen signalling system, while terpenoids and phenylpropanoids take part in cell wall reinforcement, show antifungal properties and may interfere with mycotoxin biosynthesis [3].

Generally, the workflow of marker identification comprises the following steps: (1) marker identification based on accurate mass (MS), isotopic pattern and MS/MS fragmentation, (2) off- or online database searching and (3) data interpretation. Markers can be tentatively identified without analytical standards, or unambiguously identified using analytical standards. Marker identification usually represents the last step of a metabolomics study. It is crucial for understanding the metabolic pathway, since the markers can be interesting intermediates or final secondary metabolites. In the specific context of mycotoxin contamination, hundreds of metabolites related to FHB resistance have been putatively identified so far by metabolomics strategies [4]. As already mentioned, the number and chemical structures of the metabolites vary significantly according to the experimental design and the analytical strategy applied.
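Step (1) of this workflow, matching a measured accurate mass against candidate structures, can be sketched in Python as follows. The sketch assumes protonated ions ([M+H]+) and a 5 ppm tolerance, both chosen for illustration only; the candidate list, the function names and the observed value are hypothetical, while the elemental formulae of DON (C15H20O6) and DON-3-glucoside (C21H30O11) and the monoisotopic atomic masses are standard values.

```python
# Monoisotopic masses of the most abundant isotopes (Da).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
}
PROTON = 1.007276466  # mass of a proton (Da)


def mz_protonated(formula):
    """Theoretical m/z of the [M+H]+ ion for a formula given as {element: count}."""
    neutral = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return neutral + PROTON


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_candidates(observed_mz, candidates, tol_ppm=5.0):
    """Return candidates whose [M+H]+ lies within tol_ppm of the observed m/z."""
    hits = []
    for name, formula in candidates.items():
        theoretical = mz_protonated(formula)
        error = ppm_error(observed_mz, theoretical)
        if abs(error) <= tol_ppm:
            hits.append((name, round(theoretical, 4), round(error, 2)))
    return hits


# Hypothetical candidate list for illustration.
CANDIDATES = {
    "deoxynivalenol (DON)": {"C": 15, "H": 20, "O": 6},
    "DON-3-glucoside": {"C": 21, "H": 30, "O": 11},
}

hits = match_candidates(297.1340, CANDIDATES)  # invented observed m/z
print(hits)
```

A mass match alone only yields a tentative annotation; the isotopic pattern, MS/MS data and, ideally, an analytical standard are still needed, as noted in the text.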

Biomarker metabolites of resistance can be further subclassified according to their function. Metabolites whose abundance increased in both resistant and susceptible cultivars following pathogen inoculation, as compared with water-inoculated plants, are referred to as pathogenesis-related (PR) metabolites [44]. Accordingly, metabolites that are significantly more abundant in resistant cultivars than in susceptible ones are designated as resistance-related (RR) metabolites.
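The two definitions above can be turned into a simple decision rule, sketched here in Python. This is a deliberately simplified illustration: the cited studies rely on statistical significance, whereas the sketch replaces it with a fold-change threshold, and it treats a metabolite as RR when the resistant genotype exceeds the susceptible one under either treatment. The threshold, the key names and the abundance values are all assumptions.

```python
def classify_metabolite(means, min_fold=1.5):
    """Label a metabolite as PR and/or RR from mean abundances.

    means holds four mean abundances (hypothetical key names):
      "RP" resistant, pathogen-inoculated   "RM" resistant, mock-inoculated
      "SP" susceptible, pathogen-inoculated "SM" susceptible, mock-inoculated
    min_fold is an assumed fold-change cut-off standing in for a statistical test.
    """
    labels = []
    # PR: increased after pathogen inoculation in both genotypes.
    induced_in_resistant = means["RP"] >= min_fold * means["RM"]
    induced_in_susceptible = means["SP"] >= min_fold * means["SM"]
    if induced_in_resistant and induced_in_susceptible:
        labels.append("PR")
    # RR: higher in the resistant than in the susceptible genotype.
    higher_when_inoculated = means["RP"] >= min_fold * means["SP"]
    higher_when_mock = means["RM"] >= min_fold * means["SM"]
    if higher_when_inoculated or higher_when_mock:
        labels.append("RR")
    return labels


# Invented abundances for illustration.
example = {"RP": 30.0, "RM": 10.0, "SP": 12.0, "SM": 5.0}
unchanged = {"RP": 10.0, "RM": 10.0, "SP": 10.0, "SM": 10.0}
print(classify_metabolite(example))
print(classify_metabolite(unchanged))
```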

Among RR metabolites, some have been shown to be constitutive, while others are induced upon fungal infection [16, 17]. Resistance-indicator metabolites [3, 16, 17] include DON and its modified forms, such as DON-3Glc and the other DON biotransformation products (**Figure 3**). Following wheat inoculation by *Fusarium*, DON spreads within the spike, and the host counteracts mycotoxins by conjugating them to endogenous metabolites (i.e., by glycosylation, acylation, and conjugation to amino acids and glutathione). Thus, all the modified forms are designated as resistance indicators, since they show that the plant is also reacting against the infection by converting mycotoxins into less toxic forms. According to the literature [19, 32], the chemical defense against fungal pathogens, including DON-producing *Fusarium* species, is linked to three main mechanisms of resistance: cell wall reinforcement through the deposition of lignin, production of antimicrobial compounds and specific induction of defense signalling pathways. As reported by Gunnaiah et al. [18], the main chemical groups among the metabolites involved in the plant response to FHB in soft wheat are phenylpropanoids and terpenoids, followed by amino acid derivatives. When functional properties are considered, the majority of resistance-related metabolites show antimicrobial activity, followed by cell wall strengthening properties.

**Figure 3.** Chemical structure of deoxynivalenol (DON).

Phenylpropanoids such as flavonoids and phenolic acids have frequently been described for their contribution to plant defense mechanisms. Their activity is exerted either through direct interference with the fungus, or through the reinforcement of plant structural components acting as a mechanical barrier [45, 46]. Flavonoids (especially flavones, flavanones and isoflavonoids), lignans and other phenolic compounds were induced in Sumai-3 as antimicrobial agents following *F. graminearum* inoculation. This is mainly due to their antioxidant activity, which leads to the neutralization of the reactive oxygen species (ROS) produced under biotic stress. A similar profile was identified upon *F. graminearum* inoculation in barley cultivars [16, 47] and in wheat [18]. In addition, phenolic acids have been reported to inhibit mycotoxin biosynthesis in vitro [48, 49]. Among phenolic acids, hydroxycinnamic acid (HCA) derivatives, such as ferulic and caffeic acids, have been reported as important contributors to FHB resistance [4], probably on account of their high antioxidant properties [50].

Among HCAs, chlorogenic acid has been reported as a potential resistance factor in different pathosystems [49, 51, 52]. Concerning cell wall reinforcement, hydroxycinnamic acid amides (HCAAs) are deposited as cell wall appositions at the inner side of plant cell walls after cross-linking with polysaccharides, lignin and suberin [27]. HCAAs are synthesized by condensation of hydroxycinnamoyl-CoA thioesters with amines (e.g., spermidine, spermine, tyramine) derived from amino acids. Thus, the involvement of amino acids in resistance to *Fusarium* may also be related to their role as precursors of cell wall-bound HCAAs. Among those identified so far, N-caffeoylputrescine, 4-coumaroyl-3-hydroxyagmatine and feruloyl-serotonin are significantly upregulated upon *F. graminearum* infection in the resistant cultivar Sumai-3 [27]. With regard to lignin precursors, Sumai-3 was characterized by a higher amount of syringyl lignin precursors, such as sinapyl alcohol and sinapaldehyde, and of syringin, the glucose conjugate of sinapyl alcohol [27]. Lignin results from the polymerization of monolignol glucosides and leads to a reinforced cell wall that is more resistant to fungal cell wall-degrading enzymes [4].

Moreover, changes in the cell wall polysaccharides following infection were described by Cuperlovic-Culf et al. [24]. A large increase in the concentration of sugars and inositols was found in all wheat varieties, particularly in Sumai-3, indicating an attempt to create a cell wall barrier against *F. graminearum* penetration. In addition, fatty acids were suggested to participate in resistance as a physical barrier to pathogen ingress through their role in cuticle formation [4]. Significant results on the involvement of resistance-related metabolites, mainly lipids, in plant signalling pathways are summarized in the last part of this review.

## **6. The role of lipids in the plant-pathogen cross-talk**


Increasing evidence indicates that lipid signalling is an integral part of the complex regulatory network in the plant response to stress/infection. Modifications of membrane lipids produce different classes of signalling messengers, such as phosphatidic acid (PA), diacylglycerol (DAG), DAG pyrophosphate (DAGPP), lysophospholipids, free fatty acids (FFAs), oxylipins, phosphoinositides and inositol polyphosphates. Lipidomic approaches were developed to investigate the plant-pathogen cross-talk in depth, demonstrating a close relationship between the modification of the pathogen oxylipin profile and mycotoxin synthesis [28].

Among the metabolites associated with fatty acid metabolic pathways, a number of compounds have been identified for their potential contribution to cereal resistance towards FHB [53]. Fatty acids and their derivatives play a significant role in plant defense against pathogens: they contribute to basal immunity, gene-mediated resistance and systemic acquired resistance. In addition, fatty acids are involved in the plant defense signalling pathway through the formation of important mediators such as oxylipins and jasmonates. The unsaturated C18:1, C18:2 and C18:3 fatty acids, namely oleic, linoleic and linolenic acid, are often described as involved in defense mechanisms against fungal pathogens [47, 54, 55] and as able to modulate mycotoxin production [55, 56]. Their antimicrobial activity is probably due to their role in modulating ROS production and in cuticle formation, which constitutes a physical barrier to pathogen infection [57]. In addition, they are precursors of the plant oxylipin pathway, which starts with the enzymatic formation of hydroperoxides by lipoxygenase (LOX) [58]. Distinct LOX isoforms, referred to as 9-LOX and 13-LOX, preferentially introduce molecular oxygen at the C9 or C13 position of the fatty acid backbone, leading to 9- and 13-hydroperoxides, respectively. These compounds then act as substrates for two distinct biosynthetic cascades, with the formation of approximately 150 known oxylipins, including hydroxy-, oxo- or keto-fatty acids, green leaf volatiles (GLVs) and jasmonic acid (JA) [59]. Jasmonates originate from 13-LOX products, while 9-LOX products lead to less well-characterized metabolites that act as defense factors in response to fungal attack [60]. Jasmonic acid and methyl jasmonate are well known for their roles as plant stress hormones. They trigger programmed cell death, the production of ROS and the deposition of wax layers on plant tissues [61]. In addition, jasmonates play an active role in the regulation of the phenylpropanoid pathway [62], exhibit antimicrobial properties towards toxigenic fungi [47, 60] and modulate mycotoxin accumulation [63, 64].

Besides these functions, jasmonates have been shown to activate glucosyltransferase in *Arabidopsis thaliana* and barley [65]. This is a key enzyme activity in the DON detoxification pathway that transforms DON into the less phytotoxic DON-3-Glc. Several metabolomic studies have highlighted the involvement of jasmonic acid [15–19, 33] in resistance to DON-producing *Fusarium* species. While the physiological function of jasmonates has been well described in recent years, little is known about 9-LOX-derived compounds. Recent studies demonstrated that 9-oxylipins contribute to maize susceptibility or resistance to fungal pathogens in a pathosystem-dependent way [61]. Several studies have indeed suggested that mycotoxin accumulation is modulated by host oxylipins. In particular, linoleic acid and 9-oxylipins seem to be conserved signal molecules modulating mycotoxin biosynthesis, fungal sporulation and other aspects of fungal differentiation [54]. The effects of LOX gene mutations have often been studied in maize, where inactivation of the 9-LOX gene led to an increased susceptibility to *Aspergillus flavus*, *A. nidulans* and *F. verticillioides* [66–68]. Similarly, modification of LOX genes led to a modulation of fumonisin production in the maize-*F. verticillioides* pathosystem [69, 70]. The involvement of oxylipins in the intense cross-talk between host and pathogen still has to be clarified. Endogenous fungal oxylipins are indeed known to support host colonization as well as mycotoxin biosynthesis. Some authors suggest a possible interaction between fungal oxylipins and plant GPCRs, transmembrane proteins or receptor-like kinases for host manipulation.

## **7. Conclusions**

A metabolomics approach may support the quick growth of this relatively new field of research, allowing for a better understanding of the changes occurring in plant and pathogen metabolites upon interaction. In principle, the analytical methods developed have demonstrated significant advances in sensitivity, robustness, flexibility and discrimination power, which are needed to successfully build statistical models and subsequently identify markers. Increasing evidence indicates that lipid signalling is an integral part of the complex regulatory network in the plant response to stress and infection. Modifications of membrane lipids produce different classes of signalling messengers, such as phosphatidic acid, diacylglycerol pyrophosphate, lysophospholipids, free fatty acids, oxylipins, phosphoinositides and inositol polyphosphates. Lipidomic approaches can be developed to investigate the plant-pathogen cross-talk in depth, demonstrating a close relationship between the modification of the pathogen oxylipin profile and mycotoxin synthesis. Therefore, metabolomics approaches will provide new solutions to old problems. In fact, the early detection of mycotoxins and smart detoxification can be performed by metabolomics strategies for the first time, and these approaches can fill the gap in order to answer these questions and go a step further.

## **Acknowledgements**

Josep Rubert thanks the Generalitat Valenciana (Conselleria d'Educació, Cultura i Esport) for the VALi+d postdoctoral fellowship 'Contractació de personal investigador en formació en fase postdoctoral 2014' [grant number APOSTD/2014/120].

## **Author details**

Laura Righetti1, Chiara Dall'Asta1, Jana Hajslova2 and Josep Rubert2\*

\*Address all correspondence to: rubertbj@vscht.cz

1 Department of Food Science, University of Parma, Parma, Italy

2 Department of Food Analysis and Nutrition, Faculty of Food and Biochemical Technology, University of Chemistry and Technology, Prague, Czech Republic

## **References**


[5] International Agency for Research on Cancer (IARC). Aflatoxins. In Some Traditional Herbal Medicines, Some Mycotoxins, Naphthalene and Styrene. IARC Monogr. Eval. Carcinog. Risks Hum. 2002; 82: 1–556.

[6] International Agency for Research on Cancer (IARC). Ochratoxin A. In Some Naturally Occurring Substances: Food Items and Constituents, Heterocyclic Aromatic Amines and Mycotoxins. IARC Monogr. Eval. Carcinog. Risks Hum. 1993; 56: 489–521.

[7] Lemmens M, Scholz U, Berthiller F, Dall'Asta C, Koutnik A, Schuhmacher R, Adam G, Buerstmayr H, Mesterházy A, Krska R, Ruckenbauer P. The ability to detoxify the mycotoxin deoxynivalenol colocalizes with a major quantitative trait locus for Fusarium head blight resistance in wheat. Mol. Plant Microbe Interact. 2005; 18(12): 1318–1324. DOI: 10.1094/MPMI-18-1318.

[8] Logrieco A, Bottalico A, Mulé G, Moretti A, Perrone G. Epidemiology of toxigenic fungi and their associated mycotoxins for some Mediterranean crops. Eur. J. Plant Pathol. 2003; 109(7): 645–667. DOI: 10.1023/A:1026033021542.

[9] Turner NW, Bramhmbhatt H, Szabo-Vezse M, Poma A, Coker R, Piletsky SA. Analytical methods for determination of mycotoxins: an update (2009–2014). Anal. Chim. Acta 2015; 901: 12–33. DOI: 10.1016/j.aca.2015.10.013.

[10] Rubert J, Dzuman Z, Vaclavikova M, Zachariasova M, Soler C, Hajslova J. Analysis of mycotoxins in barley using ultra high liquid chromatography high resolution mass spectrometry: comparison of efficiency and efficacy of different extraction procedures. Talanta 2012; 99: 712–719. DOI: 10.1016/j.talanta.2012.07.010.

[11] Berthiller F, Brera C, Crews C, Iha M-H, Krska R, Lattanzio V-M-T, MacDonald S, Malone R-J, Maragos C, Solfrizzo M, Stroka J, Whitaker TB. Developments in mycotoxin analysis: an update for 2014–2015. WMJ. 2016; 9(1): 5–29. DOI: http://dx.doi.org/10.3920/WMJ2015.1998.

[12] Dzuman Z, Zachariasova M, Veprikova Z, Godula M, Hajslova J. Multi-analyte high performance liquid chromatography coupled to high resolution tandem mass spectrometry method for control of pesticide residues, mycotoxins, and pyrrolizidine alkaloids. Anal. Chim. Acta 2015; 863: 29–40. DOI: 10.1016/j.aca.2015.01.021.

[13] Zachariasova M, Cuhra P, Hajslova J. Cross-reactivity of rapid immunochemical methods for mycotoxins detection towards metabolites and masked mycotoxins: the current state of knowledge. WMJ. 2014; 7: 449–464. DOI: http://dx.doi.org/10.3920/WMJ2014.1701.

[14] Fox G, Manley M. Applications of single kernel conventional and hyperspectral imaging near infrared spectroscopy in cereals. J. Sci. Food Agric. 2014; 94(2): 174–179. DOI: 10.1002/jsfa.6367.

[15] Warth B, Parich A, Bueschl C, Schoefbeck D, Neumann NKN, Kluger B, Schuster K, Krska R, Adam G, Lemmens M, Schuhmacher R. GC-MS based targeted metabolic profiling identifies changes in the wheat metabolome following deoxynivalenol treatment. Metabolomics. 2015; 11: 722–738. DOI: 10.1007/s11306-014-0731-1.


[25] Del Fiore A, Reverberi M, Ricelli A, Pinzari F, Serranti S, Fabbri AA, Bonifazi G, Fanelli C. Early detection of toxigenic fungi on maize by hyperspectral imaging analysis. Int. J. Food Microbiol. 2010; 144(1): 64–71. DOI: 10.1016/j.ijfoodmicro.2010.08.001.

[26] Berardo N, Pisacane V, Battilani P, Scandolara A, Pietri A, Marocco A. Rapid detection of kernel rots and mycotoxins in maize by near-infrared reflectance spectroscopy. J. Agric. Food Chem. 2005; 53(21): 8128–8134. DOI: 10.1021/jf0512297.

[27] Cajka T, Vaclavikova M, Dzuman Z, Vaclavik L, Ovesna J, Hajslova J. Rapid LC–MS-based metabolomics method to study the Fusarium infection of barley. J. Sep. Sci. 2014; 37: 912–919. DOI: 10.1002/jssc.201301292.

[28] Scala V, Camera E, Ludovici M, Dall'Asta C, Cirlini M, Giorni P, Battilani P, Bello C, Fabbri AA, Fanelli C, Reverberi M. Fusarium verticillioides and maize interaction in vitro: relationship between oxylipin cross-talk and fumonisin synthesis. World Mycotoxin J. 2013; 6(3): 343–351. DOI: http://dx.doi.org/10.3920/WMJ2012.1527.

[29] Ludovici M, Ialongo C, Reverberi M, Beccaccioli M, Scarpari M, Scala M. Quantitative profiling of oxylipins through comprehensive LC-MS/MS analysis of Fusarium verticillioides and maize kernels. Food Addit. Contam. Part A. 2014; 31(12): 2026–2033. DOI: 10.1080/19440049.2014.968810.

[30] Giorni P, Dall'Asta C, Reverberi M, Scala V, Ludovici M, Cirlini M, Galaverna G, Fanelli C, Battilani P. Open field study of some *Zea mays* hybrids, lipid compounds and fumonisins accumulation. Toxins. 2015; 7: 3657–3670. DOI: 10.3390/toxins7093657.

[31] Nathanail AV, Varga E, Meng-Reiterer J, Bueschl C, Michlmayr H, Malachova A, Fruhmann P, Jestoi M, Peltonen K, Adam G, Lemmens M, Schuhmacher R, Berthiller F. Metabolism of the Fusarium mycotoxins T-2 toxin and HT-2 toxin in wheat. J. Agric. Food Chem. 2015; 63: 7862–7872. DOI: 10.1021/acs.jafc.5b02697.

[32] Kushalappa AC, Gunnaiah R. Metabolo-proteomics to discover plant biotic stress resistance genes. Trends Plant Sci. 2013; 11(9): 522–531. DOI: 10.1016/j.tplants.2013.05.002.

[33] Browne RA, Brindle KM. ¹H NMR-based metabolite profiling as potential selection tool for breeding passive resistance against Fusarium head blight (FHB) in wheat. Mol. Plant Pathol. 2007; 8(4): 401–410. DOI: 10.1111/j.1364-3703.2007.00400.x.

[34] Cuperlovic-Culf M, Wang L, Forseille L, Boyle K, Merkley N, Burton I, Fobert PR. Metabolic marker panels to response to Fusarium Head Blight infection in different wheat varieties. PLoS One. 2016; 11(4): e0153642. DOI: 10.1371/journal.pone.0153642.

[35] Paranidharan V, Abu-Nada Y, Hamzehzarghani H, Kushalappa AC, Mamer O, Dion Y, Rioux S, Comeau A, Choiniere L. Resistance-related metabolites in wheat against *Fusarium graminearum* and the virulence factor deoxynivalenol (DON). Botany. 2008; 86: 1168–1179. DOI: 10.1139/B08-052.

[36] Nussbaumer T, Warth B, Sharma S, Ametz C, Bueschl C, Parich A, Pfeifer M, Siegwart G, Steiner B, Lemmens M, Schuhmacher R, Buerstmayr H, Mayer KFX, Kugler KG, Schweiger W. Joint transcriptomic and metabolomic analyses reveal changes in the primary metabolism and imbalances in the subgenome orchestration in the bread wheat molecular response to Fusarium graminearum. G3-Genes Genom Genet. 2015; 5(12): 2579–2592. DOI: 10.1534/g3.115.021550.

[37] Schroeder HW, Christensen JJ. Factors affecting resistance of wheat to scab caused by Gibberella zeae. Phytopathology 1963; 53: 831–838.


and Tri gene expression in Fusarium liquid cultures. Mycol. Res. 2009; 113(6–7): 746–753. DOI: 10.1016/j.mycres.2009.02.010.

[49] Atanasova-Penichon V, Bernillon S, Marchegay G, Lornac A, Pinson-Gadais L, Ponts N, Zehraoui E, Barreau C, Richard-Forget F. Bioguided isolation, characterization, and biotransformation by Fusarium verticillioides of maize kernel compounds that inhibit fumonisin production. Mol. Plant-Microbe Interact. 2014; 27: 1148–1158. DOI: 10.1094/MPMI-04-14-0100-R.

[50] Ponts N, Pinson-Gadais L, Boutigny AL, Barreau C, Richard-Forget F. Cinnamic derived acids significantly affect *Fusarium graminearum* growth and in vitro synthesis of type B trichothecenes. Phytopathology 2011; 101: 929–934. DOI: 10.1094/PHYTO-09-10-0230.

[51] Villarino M, Sandin-Espana P, Melgarejo P, De Cal A. High chlorogenic and neochlorogenic acid levels in immature peaches reduce *Monilinia laxa* infection by interfering with fungal melanin biosynthesis. J. Agric. Food Chem. 2011; 59: 3205–3213. DOI: 10.1021/jf104251z.

[52] Wojciechowska E, Weinert CH, Egert B, Trierweiler B, Schmidt-Heydt M, Horneburg B, Graeff-Honninger S, Kulling SE, Geisen R. Chlorogenic acid, a metabolite identified by untargeted metabolome analysis in resistant tomatoes, inhibits the colonization by *Alternaria alternata* by inhibiting alternariol biosynthesis. Eur. J. Plant Pathol. 2014; 139: 735–747. DOI: 10.1007/s10658-014-0428-3.

[53] Kachroo A, Kachroo P. Fatty acid-derived signals in plant defense. Annu. Rev. Phytopathol. 2009; 47: 153–176. DOI: 10.1146/annurev-phyto-080508-081820.

[54] Burow GB, Nesbitt TC, Dunlap J, Keller NP. Seed lipoxygenase products modulate Aspergillus mycotoxin biosynthesis. Mol. Plant Microbe Interact. 1997; 10: 380–387. DOI: http://dx.doi.org/10.1094/MPMI.1997.10.3.380.

[55] Tiwari RP, Mittal V, Singh G, Bhalla TC, Saini SS, Vadehra DV. Effect of fatty-acids on aflatoxin production by *Aspergillus parasiticus*. Folia Microbiol. 1986; 31: 120–123. DOI: 10.1007/bf02926829.

[56] Walters D, Raynor L, Mitchell A, Walker R, Walker K. Antifungal activities of four fatty acids against plant pathogenic fungi. Mycopathologia. 2004; 157: 87–90. DOI: 10.1023/B:MYCO.0000012222.68156.2c.

[57] Yaeno T, Matsuda O, Iba K. Role of chloroplast trienoic fatty acids in plant disease defense responses. Plant J. 2004; 40: 931–941. DOI: 10.1111/j.1365-313X.2004.02260.x.

[58] Feussner I, Wasternack C. The lipoxygenase pathway. Annu. Rev. Plant Biol. 2002; 53: 275–297. DOI: 10.1146/annurev.arplant.53.100301.135248.

[59] Mosblech A, Feussner I, Heilmann I. Oxylipins: structurally diverse metabolites from fatty acid oxidation. Plant Physiol. Biochem. 2009; 47: 511–517. DOI: 10.1016/j.plaphy.2008.12.011.

[60] Goodrich-Tanrikulu M, Mahoney NE, Rodriguez SB. The plant growth regulator methyl jasmonate inhibits aflatoxin production by *Aspergillus flavus*. Microbiology 1995; 141: 2831–2837. DOI: 10.1099/13500872-141-11-2831.


**Provisional chapter**

## **Impact of Metabolomics in Symbiosis Research**

Alba Chavez-Dozal and Michele K. Nishiguchi

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/66631

#### **Abstract**

In symbiotic associations, a persistent molecular complexity underlies the establishment and maintenance of the relationship. Metabolomic profiles have enabled researchers to explain symbiotic associations in terms of their underlying molecules and the interactions between the symbiotic partners. In this review, we have selected studies on symbioses as examples that have helped to explain the metabolic integration of bacterial symbionts and their hosts, in an effort to understand the molecular fingerprint of animal-microbial symbioses.

**Keywords:** symbiosis, mutualism, metabolomics, co-clustering analysis

## **1. Introduction**

The intimate association between two organisms is a very complex biological phenomenon; nevertheless, it is a very common way of life for every living organism on Earth. Symbiotic associations with one or many phylogenetically different organisms provide a fascinating view into how symbionts adapt and co-evolve. As Chaston and Douglas beautifully described in their comprehensive review [1], the omics revolution has transformed our ability to understand symbiotic associations at the molecular level. Researchers have adopted multiple techniques with great fervor in an effort to decipher the basis and complexity of symbiotic associations. Until recent years, the molecular pathways of symbiotic associations could only be studied in the context of genetic changes (transcriptomic studies) and protein profiles (proteomics); however, it is very likely that the establishment of a mutualistic association involved multiple evolutionary changes in the biochemistry and metabolic network of all the partners involved in the symbiosis [1]. Omics biology brings challenges and opportunities; one of the recent advances is the ability to construct a molecular metabolic catalog of an organism within a symbiotic association.

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Metabolomics refers to the analytical approach used to study different cell products ("chemical fingerprints") that help to understand the physiological state of an organism [2].

In this section, we provide a comprehensive description of four experiments where the approach of metabolomics was selected in a particular type of animal-microbial symbiosis, in order to answer specific questions in symbiosis research.

## **2. Exemplars of metabolomic approaches in symbiosis research**

#### **2.1. Inferring metabolic interactions in arbuscular mycorrhizal symbiosis**

Our exemplar of metabolomics studies of microbe-plant interactions is a set of observations by Schweiger et al. [3] that describe species-specific leaf metabolic responses to arbuscular mycorrhiza (AM) [4]. Arbuscular mycorrhiza is a unique symbiotic association between root arbuscular mycorrhizal fungi (AMF) and plants [4]. This is an ancient and widespread association in which the fungus improves water uptake by the host plant and, in return, receives plant carbohydrates. The fungus is restricted to the roots of the plant; however, the biochemical pathways and the exchanged substances involved have systemic effects beyond the root tissues, affecting the chemical composition of plant tissues (defined as the "phytometabolome") [4].

Comparative studies conducted on five different plant-AMF associations demonstrate that the foliar metabolome is highly plant-species-specific, with low degrees of conservation across species. The experimental design was crucial to the success of this analysis, with the metabolome analysis performed on leaves of five plant species exposed to the globally distributed AMF *Rhizophagus irregularis*. Furthermore, the study took into account the implications of metabolite fluctuation at different leaf developmental stages and plant reproductive status. Additionally, mycorrhizal plants were compared with control plants that received a sterilized inoculum. The results from this study indicate the high specificity of plant metabolome responses to colonization by the same AMF; among the most striking findings is that metabolomic responses related to phosphate uptake, the citric acid cycle, and amino acids were species-specific [3]. **Figure 1** summarizes the most important findings of this study.

#### **2.2. Metabolomic profile of the ryegrass-endophyte symbiosis**

Along the lines of microbe-plant interactions, there is an interesting study conducted by Cao et al. [5] that is of particular relevance for symbiosis research. The metabolomics profile of perennial ryegrass (*Lolium perenne*) infected with endophytic fungus (*Neotyphodium lolii*) provided understanding of regulatory biochemical mechanisms for the production of beneficial alkaloids.

*N. lolii* is a naturally occurring fungus whose complete life cycle occurs within perennial ryegrass. The fungus grows between the cells of the host plant, drawing nutrients from it, and in return the endophyte produces chemical compounds that provide resistance to drought and pests and protection from overgrazing. The aim of this study was therefore to gather metabolomics information and combine it with microarray data in order to obtain a better understanding of the biochemical mechanisms involved in the cross talk between partners, with the eventual purpose of achieving genetic manipulation of beneficial metabolite production (in particular, manipulation of alkaloids).

**Figure 1.** Summary of findings on the leaf phytometabolome when plants are exposed to the same arbuscular mycorrhizal fungi (AMF). Leaf metabolites detected included carbohydrates, organic acids, amino acids and derivatives, cyclic polyols, and sugar alcohols. Metabolites are differentially regulated, primarily affecting the phosphate and citric acid cycles.

Twenty-four perennial ryegrass samples comprising three tissue types (immature leaves, blades, and mature leaves) were examined, from both endophyte-infected plants and endophyte-free controls. Targeted metabolomics analysis was used as the quantitative approach and provided the identities of 70 metabolites based on the available databases of reference compounds. The use of targeted metabolomics in combination with microarray data provided better identification and classification accuracy of compounds, as well as greater insight into the dynamics and fluxes of the newly identified metabolites. Results of this comprehensive study included the identification of alkaloids accumulated in the mature tissues of endophyte-infected ryegrass; in addition, co-clustering analysis of the microarray data identified genes with distinctive expression patterns that coincide with the pattern of alkaloid accumulation [5]. **Figure 2** summarizes the findings of this study. The results also indicate that co-clustering analysis is not a straightforward task, whatever algorithm is used, and that the integration of transcriptomics and metabolomics can generate noisy data. However, this study demonstrated that co-cluster analysis can be a useful choice for gaining a more complete understanding of a complex biological system involving two entirely different taxa that are intertwined in their metabolic capabilities.
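The idea of relating gene expression patterns to alkaloid accumulation can be illustrated with a minimal Python sketch. It is not the co-clustering algorithm used by Cao et al. [5]; as a much simpler stand-in, it flags genes whose expression across tissues correlates with a metabolite profile. All names and values (`alkaloid`, `gene_A`, the 0.9 threshold) are hypothetical placeholders, not data from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical alkaloid levels in three tissues:
# immature leaf, blade, mature leaf (arbitrary units).
alkaloid = [1.0, 2.5, 6.0]

# Hypothetical expression profiles of three genes in the same tissues.
genes = {
    "gene_A": [0.8, 2.0, 5.5],  # rises with the alkaloid
    "gene_B": [4.0, 3.9, 4.1],  # roughly flat
    "gene_C": [6.0, 3.0, 1.0],  # falls as the alkaloid rises
}

THRESHOLD = 0.9  # assumed cut-off, chosen only for illustration

# Genes whose expression pattern coincides with alkaloid accumulation.
co_varying = sorted(
    name for name, profile in genes.items()
    if pearson(profile, alkaloid) >= THRESHOLD
)
print(co_varying)
```

With real data, a dedicated co-clustering method would group genes and metabolites jointly rather than comparing them one pair at a time, which is part of why the authors describe the task as far from straightforward.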

#### **2.3. Metabolomic profile of symbiotic protection against pathogens**

Specific strains of the gut microbiota are believed to influence host immunity and protect against infection by pathogenic bacteria. One example is the early and prevalent gut colonizer *Bifidobacterium*, which is considered part of the healthy normal gut flora. Different strains of *Bifidobacterium* are thought to protect against enteropathogenic *Escherichia coli* O157:H7 infection in mice; however, the potential molecular and cellular mechanisms underpinning this protective effect are still under investigation [6].

**Figure 2.** Summary of the study conducted by Cao et al. Co-clustering analysis of microarray and metabolomics data on endophyte-infected ryegrass indicates a set of genes and metabolites that are important for alkaloid production.

One study conducted by Fukuda et al. [7] used a combined "omics" strategy in an effort to gain a better understanding of the protective effect of *Bifidobacterium* on its mouse host. The experimental design comprised mice inoculated with different species of the symbiotic bacterium *Bifidobacterium* (including *B. longum* and *B. adolescentis*) and with the pathogen *E. coli* O157:H7. The life span of the co-infected mice was observed, and transcriptomic and metabolomic profiles were obtained. **Figure 3** diagrams the experimental design of this study. The analysis combined sequencing with metabolite detection by HPLC-MS (high-performance liquid chromatography-mass spectrometry), and the resulting dataset was subjected to a multivariate analysis method named PLS (partial least squares projection to latent structures). The typical data-processing flow included detection of signal peaks and normalization of the dataset to generate a matrix of the products detected. For statistical analysis, the methods selected were PCA (principal component analysis) and CL (cluster analysis).
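The data-processing flow described above (a normalized matrix of detected products, followed by PCA) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the pipeline used by Fukuda et al. [7]: the peak intensities are invented, total-intensity normalization is only one of several possible choices, and the first principal component is found by simple power iteration instead of a statistics package.

```python
from math import sqrt

def normalize_rows(matrix):
    """Scale each sample (row) so that its peak intensities sum to 1."""
    return [[v / sum(row) for v in row] for row in matrix]

def center_columns(matrix):
    """Subtract the mean of each peak (column) across samples."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[v - m for v, m in zip(row, means)] for row in matrix]

def first_principal_component(centered, iterations=200):
    """Leading eigenvector of the covariance matrix, by power iteration."""
    n, p = len(centered), len(centered[0])
    cov = [[sum(row[i] * row[j] for row in centered) / (n - 1)
            for j in range(p)] for i in range(p)]
    # Non-constant start vector: after row normalization and centering,
    # a constant vector lies in the null space of the covariance matrix.
    vec = [float(i + 1) for i in range(p)]
    for _ in range(iterations):
        new = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    return vec

# Hypothetical peak matrix: rows are samples, columns are three detected peaks.
# Samples 0-1 mimic one treatment group, samples 2-3 the other.
peaks = [
    [80.0, 10.0, 10.0],
    [150.0, 25.0, 25.0],
    [20.0, 40.0, 40.0],
    [30.0, 30.0, 40.0],
]

normalized = normalize_rows(peaks)
centered = center_columns(normalized)
pc1 = first_principal_component(centered)

# Score of each sample on PC1; the sign of PC1 itself is arbitrary.
scores = [sum(v * w for v, w in zip(row, pc1)) for row in centered]

# Crude stand-in for cluster analysis: split the samples by the sign of the score.
groups = [score > 0 for score in scores]
print(scores, groups)
```

In this toy matrix the two groups fall on opposite sides of zero on the first component, which is the kind of separation PCA is used to reveal before a formal cluster analysis is applied.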

Results from this study indicate that mice bearing the strain *B. longum* survived, whereas those infected with *B. adolescentis* died. Metabolomic profiles between the two treatments revealed that the concentration of fatty acids (acetic acid in particular) was significantly elevated in those mice that survived *E. coli* infection. Furthermore, mice that survived showed an increased expression of genes involved in ATP-binding-cassette carbohydrate transporters [7]. Observations from the study suggest that the elevated production of acetic acid improved intestinal defense, thereby enhancing the barrier function of colon epithelial cells inhibiting the transport of *E. coli* toxins.

#### **2.4. Metabolomics of a beneficial marine bacterium**

The marine luminescent bacterium *Vibrio fischeri* establishes a symbiotic association with numerous sepiolid squids and monocentrid fishes. *V. fischeri* infects a specialized light organ in the mantle (body) cavity of host squids and produces bioluminescence that is used by its host to avoid predation in a behavior known as counterillumination. In return, the squid host provides an enriched habitat for *Vibrio* to reproduce and to form bacterial communities of monospecies biofilms. The ability of *V. fischeri* to form a biofilm in the light organ of its squid host plays a central role in the establishment and maintenance of the symbiotic association. This symbiotic association has been the center of attention of many researchers and has been investigated for more than 25 years; however, as indicated for other examples of mutualistic associations, the molecular basis of the squid-*Vibrio* symbiosis is still obscure.

**Figure 3.** Summary of the experiment conducted by Fukuda et al. Mice were coinfected with beneficial strains of *Bifidobacterium* and the pathogenic strain of *Escherichia coli* O157:H7. Combined transcriptomic and metabolomic profiles revealed an increase in acetate and fructose transporters in those mice that survived lethal infection.

One study conducted by Fukuda et al. [7] used a combined "omics" strategy to gain a better understanding of the protective effect of *Bifidobacterium* on its mouse host. In the experiments, mice were infected with different species of the symbiotic bacterium *Bifidobacterium* (including *B. longum* and *B. adolescentis*) and the pathogen *E. coli* O157:H7. The life span of co-infected mice was recorded, and transcriptomic and metabolomic profiles were obtained. **Figure 3** diagrams the experimental design of this study. This sophisticated analysis combined sequencing with metabolite detection by HPLC-MS (high-performance liquid chromatography-mass spectrometry); for the analysis of products, the dataset was subjected to the multivariate analysis method PLS (partial least squares projection to latent structures). The typical data-processing flow included detection of signal peaks and normalization of the dataset to generate a matrix of the products detected. For the statistical analysis, the methods selected were PCA (principal component analysis) and CL (cluster analysis).

**Figure 2.** Summary of the study conducted by Cao et al. Co-clustering analysis of microarray and metabolomics data on endophytic-infected ryegrass indicates a set of genes and metabolites that are important for alkaloid production.

Results from this study indicate that mice bearing the strain *B. longum* survived, whereas those infected with *B. adolescentis* died. Comparison of the metabolomic profiles of the two treatments revealed that the concentration of fatty acids (acetic acid in particular) was significantly elevated in those mice that survived *E. coli* infection. Furthermore, mice that survived showed an increased expression of genes involved in ATP-binding-cassette carbohydrate transporters [7]. Observations from the study suggest that the elevated production of acetic acid improved intestinal defense, enhancing the barrier function of colon epithelial cells and thereby inhibiting the transport of *E. coli* toxins.
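The data-processing flow described for the Fukuda et al. study (peak detection, normalization into a sample-by-metabolite matrix, then PCA) can be illustrated with a small sketch. This is not the authors' pipeline: the peak intensities below are hypothetical, total-signal normalization is only one of several possible normalization choices, and the first principal component is obtained by plain power iteration rather than by a statistics library.

```python
import math

def normalize_by_total_signal(matrix):
    """Scale each sample (row) so that its peak intensities sum to 1."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([value / total for value in row])
    return normalized

def center_columns(matrix):
    """Subtract the mean of each metabolite (column) across samples."""
    n_samples = len(matrix)
    means = [sum(column) / n_samples for column in zip(*matrix)]
    return [[value - mean for value, mean in zip(row, means)] for row in matrix]

def first_principal_component(centered, iterations=200):
    """Return (loadings, scores) of PC1 via power iteration on the covariance matrix."""
    n_samples, n_peaks = len(centered), len(centered[0])
    cov = [[sum(row[i] * row[j] for row in centered) / (n_samples - 1)
            for j in range(n_peaks)] for i in range(n_peaks)]
    # Start from a non-constant vector: after total-signal normalization the
    # constant vector is mapped to zero by the covariance matrix.
    vec = [float(i + 1) for i in range(n_peaks)]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(n_peaks)) for i in range(n_peaks)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    scores = [sum(v * w for v, w in zip(row, vec)) for row in centered]
    return vec, scores

# Hypothetical peak table: rows are samples, columns are three detected peaks.
# The first two rows mimic one treatment group, the last two rows another.
peak_table = [
    [80.0, 10.0, 10.0],
    [70.0, 20.0, 10.0],
    [20.0, 40.0, 40.0],
    [30.0, 30.0, 40.0],
]

loadings, scores = first_principal_component(
    center_columns(normalize_by_total_signal(peak_table))
)
print([round(s, 3) for s in scores])
```

In this toy table, the two groups receive PC1 scores of opposite sign, which is the kind of separation a PCA score plot is used to reveal. The overall sign of a principal component is arbitrary, so only the relative signs are meaningful.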

**2.4. Metabolomics of a beneficial marine bacterium**

In a recent study conducted by Chavez-Dozal et al. [8], proteomic and metabolomic profiling was performed in parallel on *V. fischeri* in its biofilm form, and the results were compared to profiles of free-living (planktonic) *V. fischeri* cells of the same strain. The main objective of this study was to obtain a comprehensive profile of the molecular components of biofilms, and thereby provide the first meta-proteome profile of the biofilms that are important for establishment of this mutualistic association. A summary of this study is illustrated in **Figure 4**.

Biofilms are complex microbial communities composed of cells encased within a self-produced exopolymeric matrix. Expression profiles of biofilm communities reveal the composition of the matrix, which includes a combination of lipids, polysaccharides, proteins, and DNA [9, 10].

**Figure 4.** Summary of the experiment conducted by Chavez-Dozal et al. [5]. Proteomic and metabolomic profiles were performed in planktonic cells and biofilm communities of the same strain of *Vibrio fischeri*. Results revealed an upregulation of biofilm matrix components and molecules related to multiple stress responses.

Results of this study revealed a time-resolved picture of approximately 100 proteins and 200 metabolites present in the biofilm state of *V. fischeri*. The most important components found in this study include proteins, sugars, and molecules that form part of the exopolysaccharide matrix of biofilms; surprisingly, an increased concentration of intermediates of the glycolysis pathway was found to be prevalent during the biofilm state [8]. Results from this study suggest that molecules involved in the construction of the biofilm matrix are essential to bacterial community formation, a process that has been known to activate stress responses such as upregulation of alternative anaerobic pathways. The reported findings of this study have broad implications for *V. fischeri* ecology, since many of the symbiosis-regulated genes are not yet described. The combination of proteomics and metabolomics has therefore provided a link between protein regulation and function during different phases of the symbiosis, improving our understanding of the mechanisms that are important for successful host colonization.

## **3. Concluding remarks**

Metabolomic approaches are increasingly selected for multiple purposes in symbiosis research. Although other "omic" approaches are needed to understand molecular function in symbiotic associations, the emerging use of metabolomics provides a new level of biochemical sophistication. The examples provided in this mini review are only some of the pillar studies that used either metabolomics alone or a combined analysis of metabolomics with transcriptomics or proteomics in different mutualistic systems; many more studies are in progress that use metabolomic profiles to define and characterize the molecular and biochemical pathways that are important for establishment and persistence of symbiotic associations. The advancement of technologies that allow higher resolution of minute concentrations of proteins and their modulation will expand the area of metabolomics research and will enable a better perspective of the physiological state of organisms as single entities (otherwise known as the holobiome).

## **Author details**

Alba Chavez-Dozal and Michele K. Nishiguchi\*

\*Address all correspondence to: nish@nmsu.edu

Department of Biology, New Mexico State University, Las Cruces, NM, USA

## **References**

[1] Chaston J, Douglas AE. Making the most of "omics" for symbiosis research. Biol Bull; 2012;**223**:21–9. doi: 10.1111/j.1365-2133.2012.10859.x.

[2] Tyagi S, Raghvendra, Singh U, Kalra T, Munjal K. Applications of metabolomics, A systematic study of the unique chemical fingerprints: an overview. Int J Pharma Scie Rev and Res; 2010;**3**(1):019.



#### **Metabolomics as a Tool in Agriculture**

Emmanuel Ibarra-Estrada, Ramón Marcos Soto-Hernández and Mariana Palma-Tenango

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/66485

#### **Abstract**

Metabolomics is a field of study through which a better understanding of the complexity of biological systems can be obtained, based on the chemical composition of a plant and its relation to the plant's physiology. The literature offers abundant information on its use in areas such as medicinal plants, chemosystematics and the adulteration of plants, but information is scarce for agriculture. At present, agriculture plays a crucial role for human beings. The demand for food has increased with the continuous growth of the population, and this requires an increase in crop production; in addition, crops are affected by climate change, by attacks of pests and diseases, and by resistance to conventional agrochemicals. Scientists are currently carrying out practices and studies of genetic improvement of crops to increase their production and avoid the problems pointed out. Metabolomics is an important complement to genomic studies; its results could be the basis for genetic improvement based on the chemical composition of crops, and metabolomic studies play a crucial role in determining crop quality for human consumption. The aim of this chapter is to review the literature of the last 10 years, emphasizing the importance of metabolomics in crops of economic and feed value.

**Keywords:** agriculture, metabolomics, crops

## **1. Introduction**

Plants are a large source of natural products with a large chemical diversity and unimaginable properties; therefore, the interest in their study is increasing. Nowadays, the interest in agricultural crops has increased due to their economic importance and their projected importance in the next years in relation to world food security.

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The uses of plant species destined for agriculture are varied, ranging from traditional foods to foods with desirable traits such as nutritional value, and to the industrial products derived from them, such as polymers, fibres, latex, industrial oils and packaging materials, in addition to basic chemical building blocks and fuels [1].

The objectives of the use of metabolomics tools in agriculture are to know the biochemistry and the functions of the species involved to apply that knowledge to food and environmental security; to use their potentials as tools in the improvement of nutrition, diets and health [2]; and to use them for genetic improvement in plants based on chemical composition, that is, taking into account some notable trait.

Nowadays, agriculture plays a crucial role in feeding the human population. Food demand has increased considerably due to the continuous growth of population centres, and thus the demands on the production and diversity of crops, especially basic crops, are increasing. To this problem, we can add the noxious effects of climate change on crops, the attack of pests and diseases, the emergence of new diseases and the development of resistance to commonly used pesticides. Currently, trials and studies of genetic improvement of crops are being carried out to increase the quantity and quality of yield, avoid damage caused by pests and diseases and develop resistance to several factors, especially environmental ones. Some crops are considered basic or important to the world food supply, such as rice, maize, potato, avocado, tomato and citrus fruits, among others; these rank among the priority crops because they support human consumption and thus have been the subject of the aforementioned studies or trials.

## **2. Metabolomics**

Metabolomics represents a field of study with which we can gain a better understanding of the complexity of biological systems. It deals with the identification and quantification of the metabolites present in such systems [3, 4] with molecular weights less than 1500 Da [5], although the range can occasionally be wider (30–3000 Da) [6]. The set of small-molecule metabolites found in a cell, an organ or an organism is called the metabolome [6, 7]. It is made up of a large variety of molecules such as peptides, amino acids, nucleic acids, carbohydrates, organic acids, vitamins, flavonoids, polyphenols, alkaloids, minerals or any other chemical compound that is used, metabolized or synthesized by a cell or by a given organism [7]. The importance of metabolites lies in the fact that they are an essential part of the behaviour of the individual that contains them; these compounds are the final products of the regulatory processes of the cell, and their presence represents the response of biological systems to environmental or genetic changes. For this reason, metabolomics is considered the link between genotype and phenotype [8, 9]. In recent years, metabolomics has managed to position itself as an area of research that is essential and complementary to proteomic, genomic and transcriptomic studies.

Metabolomics is a very important complement of genomic studies, and its results could be the basis of genetic improvement based on the chemical composition of crops, be it in nutritional or functional aspects, or in the participation of some chemical compounds in the resistance of some plant species to the factors that have already been mentioned.

It is important to know the current state of metabolomics studies of the most important crops in the world to use it as the basis for future studies destined to improve their production, or to know the diversity of the chemical composition that we have available, its benefits and its possible uses. Moreover, it would be important to know the advantages and disadvantages of using current analytical techniques, in addition to their current state and possible improvements in future analyses.

## **3. Extraction methods in metabolomics**


Numerous methods are available to extract and isolate the compounds of interest, but it is important to choose one that is simple, robust, fast, repeatable and cheap. The selected solvent should extract diverse groups of metabolites. Traditional extraction methods such as percolation, maceration, Soxhlet extraction, steam distillation or hydrodistillation are convenient for extracting a broad class of metabolites; they are cheap, simple and repeatable and can be used for raw plant extraction. The amounts to be processed can range from a few grams to larger quantities, depending on the source of plant material, but these methods are time consuming (**Figure 1**). At present, they can be complemented with modern techniques such as ultrasonication (sonochemistry), microwave extraction, supercritical fluid extraction or accelerated solvent extraction; these are simple and repeatable and, above all, the extraction time is short, but the cost of the equipment is high. It is also important to consider the temperature reached with the selected method, because some components of the sample can decompose if the temperature is high. Roessner and Dias [10] published a recent review that covers the details to be considered in order to obtain good results in the extraction and isolation of the compounds of interest. Once the sample has been properly extracted, it is ready to be submitted for analysis by GC‐MS, LC‐MS, NMR or MS.

**Figure 1.** Extraction process and analysis of a sample. Modified from Ref. [92].

## **4. Analytical techniques**

Today we have a wide variety of analytical techniques that keep improving and that help us to obtain reliable data about the behaviour of a plant species, or about its response to diverse environmental factors, both biotic and abiotic. In recent years, there has been a boom in genetic studies of many plant species, and the information generated can be used in a wide range of applications, for example, genetic improvement. The inclusion of metabolomic studies could improve the understanding and enrich the information obtained from such studies. Metabolomics has been considered the link between genotype and phenotype; that is, with this kind of study, gene expression would be better understood, and hence the behaviour of the plant species in the face of the aforementioned factors.

An essential aspect of the metabolomic study is the qualitative and quantitative analysis of metabolites, which allows us to know the biochemical state of an organism; that information can be used to monitor and evaluate gene function and the multiple responses of the organism to the conditions where it develops [8]. In spite of the technological advances, it is practically impossible to determine the entire composition of even a single cell in a single study or with a single analytical technique [9]; in order to do that, coupled techniques have been developed, such as high‐performance liquid chromatography (HPLC) and gas chromatography (GC) coupled with mass spectrometry (MS).

The analyses that have already been mentioned are carried out using analytical techniques specialized in separation, identification and quantification. Such techniques must have high resolution, be very precise and very sensitive, and be able to analyse a wide variety of compounds of different chemical nature and origin because the structural complexity of many molecules makes their study difficult. Several analytical technologies can be used, depending on the chemical nature of the compounds. Some of them are nuclear magnetic resonance (NMR), GC and HPLC coupled with MS, as well as capillary electrophoresis coupled with MS [7].

It is known that a single analytical technique is not enough to visualize the whole metabolome [11], and thus, it is necessary to carry out separate studies using different techniques, or to use the aforementioned coupled systems.

NMR has several uses in metabolomics which can be applied or adapted to agriculture, for example, quality control, chemotaxonomy (classification and characterization), analysis of genetically modified plants and the study of interactions with other organisms and the environment, besides the study of diseases in humans. Currently, the study of the metabolome based on NMR is accepted as an efficient analytical tool to study biological systems [6].

The main advantages of the NMR over the rest of the analytical techniques are that it can detect a wide range of chemical compounds of different nature; quantification does not pose any problem; it is highly reproducible and metabolite identification is simple [12]. Perhaps the most important benefit is that the method is quick and simple, and the damage of the existing compounds during the preparation of the extracts is minimal. However, the technique is not very sensitive [13, 14].

As already mentioned, chemotaxonomy is one of the main applications of NMR, since the metabolomic profiles obtained by this technique make it possible to classify and identify plants and their derived preparations. Examples of the application of NMR in chemotaxonomy include the classification of *Ilex* species [15], *Cannabis sativa* [16] and *Ephedra* [17], the metabolomic differentiation and classification of species of *Verbascum* [18], and the discrimination of commercial preparations of *Matricaria* [19] and of commercial samples of catuaba [20], among others.

## **5. Applications of metabolomics in agriculture**


Specifically in agriculture, metabolite content is related to developmental and differentiation processes, fruit maturation processes, resistance to adverse environmental factors, stress‐ related problems and pathogen attack, among others.

The wide range of compounds is analysed through several analytical techniques; some of which are specific to certain compounds. Liquid chromatography coupled with mass spectrometry is a technique that can be used to analyse a wide range of compounds such as vitamins (hydrophilic and hydrophobic), coenzymes, phenylpropanoids, polyketides, terpenoids, amino acids and amines, lipids, carbohydrates, phenolic compounds and alkaloids, among others. With GC‐MS, fewer compounds can be analysed due to the type of compounds that this technique can detect; although by using derivatization reactions, the number of metabolites detected by this technique increases considerably. With this technique, it is easy to analyse essential oils and fatty acids as well as terpenoids, alkaloids, monosaccharides and steroids, among others. Capillary electrophoresis, for its part, helps in the detection of oligosaccharides, hydrophobic vitamins, coenzymes, prosthetic groups, nitrogenous bases, nucleotides and nucleosides, among others [21].
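The coverage described in the paragraph above can be summarized as a simple lookup. The sketch below only transcribes the compound classes named in the text; the names `TECHNIQUE_COVERAGE` and `techniques_for` are illustrative, the lists are not exhaustive (the text ends each with "among others"), and the exact-match lookup is a simplifying assumption.

```python
# Compound classes per technique, transcribed from the paragraph above.
# The lists are illustrative, not exhaustive.
TECHNIQUE_COVERAGE = {
    "LC-MS": {
        "hydrophilic vitamins", "hydrophobic vitamins", "coenzymes",
        "phenylpropanoids", "polyketides", "terpenoids", "amino acids",
        "amines", "lipids", "carbohydrates", "phenolic compounds", "alkaloids",
    },
    "GC-MS": {
        "essential oils", "fatty acids", "terpenoids", "alkaloids",
        "monosaccharides", "steroids",
    },
    "CE": {
        "oligosaccharides", "hydrophobic vitamins", "coenzymes",
        "prosthetic groups", "nitrogenous bases", "nucleotides", "nucleosides",
    },
}

def techniques_for(compound_class):
    """Return the techniques listed for a compound class, in alphabetical order."""
    key = compound_class.strip().lower()
    return sorted(
        technique
        for technique, classes in TECHNIQUE_COVERAGE.items()
        if key in classes
    )

print(techniques_for("terpenoids"))
print(techniques_for("nucleotides"))
```

A class that appears under more than one technique, such as terpenoids or alkaloids, is a candidate for cross-validation between platforms, whereas a class listed under a single technique points to the platform to use first.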


**Table 1.** Metabolomics studies in agricultural crops.

Currently, there are many examples of metabolomic studies that have been applied to agriculture for various purposes (**Table 1**). The main objectives of metabolomic studies in agriculture include the following: to understand metabolic responses to any type of stress; to generate metabolic profiles for genetic mapping, for studying the impact of heredity and for determining the impact of geographic location and season; to study the phenotype of natural variations of certain plant species; to evaluate transgenic varieties; to study metabolic variations among cultivars; to carry out functional characterizations (functional genomics); to analyse metabolic changes during growth, development and differentiation; to elucidate biosynthetic pathways; to analyse population differentiation; to carry out chemotaxonomic analyses and nontargeted studies; and to characterize cultivars [21].

Nowadays, the world's food supply depends on certain crops that are indispensable because of their nutritional value and their accessibility.

Maize (*Zea mays* L.) has been the subject of numerous metabolomic studies, many of which are related to genomic studies. For example, the metabolomic profile of its leaves has been obtained by GC‐MS to carry out genetic mapping [22]. In addition, the genetic basis of the metabolic diversity in its grains was analysed in a study that involved 702 genotypes collected from different geographic zones and growing conditions; in that study, 983 metabolites were quantified [23]. Another study, carried out by Röhlig et al. [24], used GC‐MS to analyse genetic background (heredity), different geographic locations and environmental conditions; the authors could distinguish the chemical composition of the genotypes under analysis, which helped to determine notable chemical characteristics that varied depending on the region of origin.

The development of maize, as well as of other crops, under ideal nutritional conditions is paramount in order to obtain good yield and quality; hence the importance of carrying out studies that involve nutrient analysis. The effect of nitrogen (N) deficiency on the development of maize has been examined in metabolomic studies [25], and the effect of phosphorus (P) deficiency has been analysed through MALDI‐TOF MS [26]. Stress caused by any factor limits the growth and development of any plant species. The effect of salt stress on the growth of maize seedlings was studied through ¹H‐NMR; the metabolic profile obtained in the study can be used to improve the growing conditions [27].

Rice (*Oryza sativa* L.) is one of the most important crops in the world, especially in Asia. Several studies have been carried out in this crop, where the main objective has been to characterize cultivars through the metabolic profiles obtained from several plant structures, mainly the leaf, using capillary electrophoresis coupled with mass spectrometry [28]. This crop has also been subject of studies where genetic and metabolic studies are combined using techniques such as GC‐FID, GC‐MS and GC‐TOF‐MS, in what is called functional genomics [29, 30]. Naturally, in modern times, studies on transgenic individuals could not be absent, where their metabolic profiles have been determined [31].

Potato (*Solanum tuberosum* L.) is a crop that has been widely studied in several areas, among which is metabolomics. In such studies, the objectives have ranged from the elucidation of the metabolic profile to know the chemical characteristics of the tubers through GC‐MS [32, 33] and the characterization of some cultivars [34, 35], to the characterization of the relationship between composition and quality traits [36].

Currently, there are many examples of metabolomic studies that have been applied to agriculture for various purposes (**Table 1**). Some of the main objectives of metabolomic studies in agriculture are as follows: to determine the metabolic responses to any type of stress; to generate metabolic profiles for genetic mapping; and to generate metabolic profiles to study the impact of

| Crop or plant species | Compound/analytical technique | Topic | References |
| --- | --- | --- | --- |
| Avocado | Mannoheptulose (carbohydrate)/LC‐MS | | |
| Avocado | Perseitol (carbohydrate)/LC‐MS | | |
| Avocado | Persenone A and B/HPLC, NMR | | |
| Avocado | Sugars, protein, oil/LC | Fruit ripening | [54] |
| Avocado | | Targeted metabolomics | [53] |
| Maize | Amino acids, carbohydrates, organic acids/NMR | Salt stress | [27] |
| Maize | Protein/MALDI‐TOF‐MS | Metabolic changes under phosphorus deficit | [26] |
| Potato | Glycoalkaloids, fructans/GC‐TOF‐MS, LC‐MS | Genetically modified plants | [37] |
| Potato | Fatty acids, amino acids, organic acids/NMR, HPLC‐UV | Genetic modifications to metabolic pathways | [39] |
| Potato | | Total composition analysis and quality trait | [36] |
| Potato | | Plant‐pathogen interaction | [40] |
| Potato | | Phytochemical diversity | [34] |
| Tomato | 1‐Methyl‐tryptophan/UHLC‐MS | Plant‐pathogen interaction | [50] |
| Tomato | Flavonoids/GC‐MS, LC‐MS | Metabolic engineering | [51] |
| Tomato | | Growth and development | [49] |
| | Cyanidin 3‐O‐glucoside (anthocyanin)/HPLC | Fruit ripening | [84] |
| | Total nitrogen, protein/MS‐MS | Development and growth | [85] |
| | Organic acids/NMR, GC‐MS | Metabolic engineering | [89] |
| | Enzymes/GC‐MS | Fruit development in transgenic plants | [90] |
| | Stearic, palmitic, oleic, linoleic, linolenic and palmitoleic acids (fatty acids), carbohydrates/FID‐GC, LC | | |
| | Sugars, amino acids, organic acids/NMR, GC‐TOF‐MS | | |
| | Sugars, amino acids, organic acids/FIE‐MS, GC‐MS | | |
| | Fatty acids, organic acids, alkaloids and amino acids | | |
| | Amino acids, organic acids, sugars and sugar alcohols | | |
| | | Development and growth | [86, 87, 91] |
| | | Development and growth | [85, 86, 90] |
| | | Development of rapid method | [88] |

**Table 1.** Metabolomics studies in agricultural crops.

As mentioned for maize and rice, transgenic potato crops have also been analysed to determine their metabolic profile; such studies have been carried out in tubers using techniques such as GC‐MS [37, 38], proton NMR and HPLC‐UV [39].

One of the main limitations on crop yield and quality is the damage caused by pathogens. It is important to study the interaction between pathogens and crops to gain a better understanding of its effects. In potato, the interaction with *Rhizoctonia solani* has been studied using FT‐ICR/MS and GC‐EI/MS [40].

Metabolomics in agriculture can be used to obtain a chemotaxonomic classification that distinguishes differences and similarities between related species or cultivars, as was done for genetically modified and conventional potato crops using GC‐TOF‐MS and FIE‐MS [37].

Species in the family Solanaceae, such as potato and tomato (*Solanum lycopersicum* L.), are among the most important crops for agriculture and food and are therefore the subject of many types of study. Metabolomic profiles of tomato have been obtained for genetic mapping of fruits and leaves of both conventional and wild genotypes [41, 42]. Tomato contains a large number of chemical compounds with important uses, and this chemical variability has been elucidated through metabolomic studies. An important group of compounds that has been studied is the volatiles from fruit, peel and pulp, which were analysed through UPLC‐QTOF‐MS [43, 44]. The profile of carotenoid compounds from the fruit has also been characterized by proton NMR [45]. Changes in chemical composition during growth, development and differentiation help to determine the conditions under which these events occur. Chemical profiles of tomato plants at several stages of development have been obtained through LC‐MS [46]. Similarly, Moco et al. [47] built databases of the fruit metabolome using LC‐MS and <sup>1</sup>H‐NMR. Fruits from different cultivars were characterized through MALDI/TOF‐MS [48].

During the day, the chemical composition of any plant species changes in response to the environment, as shown in [49], where 70 metabolites were detected in tomato leaves and 60 in the fruit using NMR and MS. Metabolomics also makes it possible to identify particular compounds with a specific function that is important to the survival of a species. For example, 1‐methyltryptophan was identified as the metabolite involved in the response of the plant to *Botrytis cinerea* and *Pseudomonas syringae* [50].

In addition, the levels and composition of flavonoids in tomato fruits can be altered by modifying biosynthetic pathways using regulatory and structural genes [51].

About 95% of the genetic and chemical diversity of tomato has been lost over time owing to domestication; it is therefore important to know the complete metabolic profiles of both commercial and surviving wild tomatoes in order to determine their chemical and genetic variability and to use that information as the foundation for obtaining better genotypes or varieties [52].

The avocado (*Persea americana* Miller) is one of the most economically important fruit trees owing to characteristics such as flavour, texture and chemical composition. It has also been the subject of studies in which its main chemical traits have been determined. It is a species with important health benefits that derive from several chemical compounds.

The study of the chemical compounds present in avocado is very important: it will help to lay the foundations for using the chemical information of this fruit tree in, for example, genetic improvement, or for increasing interest in varieties or races that are not widely consumed and finding new applications for them. In this approach, the chemical composition of avocado is addressed using strategies that combine the identification and quantification of cellular metabolites through sophisticated analytical techniques with statistical and multivariate methods to analyse and interpret the data. Over time, important technological advances in analytical methods have improved the way biological systems are seen, analysed and interpreted [13].

In recent years, the acetogenin content of avocado has attracted attention. These long‐chain fatty acid derivatives have important medicinal properties and are considered anticancer agents. The profile of acetogenins in the peel, seed and pulp of avocado fruits has been determined to obtain a chemotaxonomic model using linear discriminant analysis [53]. Changes in sugars, total protein and oil in 'Hass' avocado have also been determined using GC‐FID and LC [54], and important proteins from the pulp have been identified using nano‐LC‐MS/MS [55]. The use of coupled analytical systems has brought better resolution, higher sensitivity, faster analysis and a wider diversity of applications. One avocado study investigated fruit maturation through its metabolomic profile using GC‐APCI‐TOF‐MS and showed that this technique is a valuable and powerful tool for improving the understanding of fruit maturation [56].

Over the years, metabolomic knowledge has been applied to agriculture with various objectives. For example, some studies have sought to elucidate the biosynthetic pathways that produce metabolites in herbs in order to discover the mechanisms responsible for the evolution of these pathways and to understand the function of a given natural product within the physiology of the plant where it is found [2]. Such studies have been carried out in ginger (*Zingiber officinale* Roscoe) [57], turmeric (*Curcuma longa* L.) [58] and basil (*Ocimum basilicum* L.) [59]. These plants have enormous potential for the development of products with mainly industrial or medicinal applications.

A metabolomic study of any kind comprises several key stages for determining the real status of a plant: collection and extraction of the sample, followed by the analysis, identification and quantification of the chemical compounds of interest, according to the objective of the study. Another case is the analysis of extracts of *Pisum sativum* L. to define the impact of the environment and of genetic diversity based on the metabolic profile [60, 61].

Several metabolic studies have been carried out in cultivated plant species, using a wide range of tissues and analytical techniques. Within the cereal group, the leaves and roots of barley (*Hordeum vulgare* L.) have been studied using GC‐MS to analyse the metabolic responses of two cultivars to salt [62]. In oat, the metabolic profile of leaves was obtained using <sup>1</sup>H‐NMR to evaluate metabolic variation in European cultivars [63]. Several studies have also been carried out in legumes. For example, in pea (*Pisum sativum* L.), the metabolic fingerprint of transgenic varieties was obtained from the leaves using <sup>1</sup>H‐NMR [61]. In *Medicago truncatula* Gaertn., cell cultures were analysed using HPLC‐MS for the functional characterization of terpene glycosyltransferases and to obtain a metabolomic profile for functional genomics [64, 65]. In the seeds of *Vigna radiata* (L.) R. Wilczek, the metabolic changes that occur during sprouting were investigated through GC‐MS [66].

Tobacco (*Nicotiana tabacum* L.) has been studied in functional genomics to examine the changes induced by jasmonates in the biosynthesis of metabolites in a cell suspension culture, using GC‐MS [67]. Also in this family, the metabolic profile of chili fruit (*Capsicum* sp.) was determined to study its diversity, using HPLC‐PDA/LC‐PDA‐QTOF‐MS‐MS [68].

Fruit crops have also been studied to obtain their metabolic profiles for several purposes. In grape berries (*Vitis vinifera* L.), the metabolic variation among several cultivars was evaluated using <sup>1</sup>H‐NMR [69]. The metabolic profile of raspberry fruits (*Rubus idaeus* L.) was characterized using LC‐MS to identify beneficial compounds [70]. In strawberry (*Fragaria × ananassa* hort.), metabolic studies of fruits and flowers were carried out to determine the chemical composition at several developmental stages [71, 72]. Finally, in muskmelon (*Cucumis melo* L.), the metabolic profile of the fruit was obtained using GC‐MS [73]. Medicinal plants are also studied in the search for medicinal chemical compounds. In *Catharanthus roseus* (L.) G. Don, there have been hundreds of studies aimed at understanding and improving the biosynthetic pathway of indole alkaloids in cell culture; many of them have used LC‐MS and <sup>1</sup>H‐NMR.

Analytical techniques offer many advantages for metabolomic analyses; however, none is universally applicable. It is important to know their limitations and the developments under way to improve them.

Metabolomic approaches are divided into two groups: targeted and untargeted metabolomics. In principle, both approaches can be applied on any analytical platform, such as LC‐MS and GC‐MS. The main difference between them lies in the identification of the analytical signals. In the targeted approach, the metabolites under study are known in advance. In the untargeted approach, no specific metabolites are chosen and all detected signals are taken into account. Both approaches have advantages and disadvantages. The main benefit of the untargeted approach is a more holistic view of metabolite composition, with a low probability of missing key metabolites. If the metabolites that are key to the research are already known, an optimized targeted approach could be more successful. Through the use of internal standards, the analysis can be quantitative or semi‐quantitative.
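The role of an internal standard in quantification can be illustrated with a small calculation. The sketch below is not taken from any of the cited studies. It assumes a single‐point calibration in which the analyte concentration is estimated from the ratio of the analyte peak area to the internal‐standard peak area, the known concentration of the internal standard, and a relative response factor. The function name, parameter names and numerical values are hypothetical.

```python
def quantify_with_internal_standard(area_analyte, area_is, conc_is, response_factor=1.0):
    """Estimate an analyte concentration by the single-point internal-standard method.

    response_factor is the detector response per unit concentration of the analyte
    divided by that of the internal standard. Leaving it at 1.0 (no calibration
    available) gives a result that should be read as semi-quantitative.
    """
    if area_is <= 0 or response_factor <= 0:
        raise ValueError("internal-standard area and response factor must be positive")
    return (area_analyte / area_is) * conc_is / response_factor


# Hypothetical peak areas with an internal standard spiked at 10 (e.g. µg/mL)
semi_quantitative = quantify_with_internal_standard(45000.0, 30000.0, 10.0)
quantitative = quantify_with_internal_standard(45000.0, 30000.0, 10.0, response_factor=1.25)
print(semi_quantitative)  # 15.0
print(quantitative)       # 12.0
```

With a calibrated response factor the estimate is quantitative; without one, the same ratio still allows relative comparison of a metabolite across samples, which is the semi‐quantitative case.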

These tools have been useful for pre‐harvest and postharvest issues of food constituents and are related to food safety and quality control. For example, a pre‐harvest fungal disease of mango and the postharvest contamination of onions have been assessed by GC‐MS, in which the sample headspace is trapped and then submitted to GC‐MS analysis. In fact, metabolomics techniques may find their greatest application in the food industry in monitoring the quality of different batches. Fruit juice adulteration is reported to be quite common and is not easy to detect, but LC‐MS has helped to control it. Fresh squeezed juices have also been distinguished from those that come from pulp washes by <sup>1</sup>H NMR with high accuracy [4]. Recently, Canela et al. [83] reviewed foodomics imaging by mass spectrometry and magnetic resonance (IMS and MRI). They pointed out that these tools have advantages over fluorescence microscopy or immunochemistry for determining the localization and chemical identity of small molecules, because they can detect many compounds in a single multi‐detection measurement. The review describes examples in which these techniques support the visualization of the compounds present in a plant tissue for quality control purposes.

In general, metabolomics has many uses in agriculture, and it is not restricted to particular species. It is possible to work with any plant structure, with any crop, for almost any purpose, and the array of techniques available facilitates these studies. In the end, the results of metabolomics in agriculture will have an impact on other areas of study, such as medicine, food quality control, nutrition and genetic improvement, among others. Finally, an essential part of metabolomic studies is the statistical analysis and the bioinformatics resources used to interpret the results, which must be taken into account in any broad review of metabolomics.

## **6. Plant‐pathogen interaction**

Plants are always influenced by different factors, mainly environmental ones, that force them to adapt and change parts of their functioning, in most cases to protect themselves. One of the main interactions of plants is that with micro‐organisms, which, like the environment, cause changes in plant physiology and development. Normally the effect on the plant is adverse, but on some occasions both organisms benefit, in what is called symbiosis. Pathogen attacks can lead to yield losses, whereas beneficial micro‐organisms may help the plant to improve nutrient uptake. The most characteristic example is the interaction of nitrogen‐fixing bacteria with legumes.

When a plant interacts with a pathogen, one of the changes in the plant is the production of several kinds of compounds that act as defences, such as attractants, repellents and feeding inhibitors; some of the compounds produced are also beneficial to human health.

Metabolomics, through its variants, can help to determine the physiological and biochemical changes that happen during the interaction and provide a general overview of the whole system.

Primary and secondary metabolites are key to the response of the plant to pathogen attack. Many molecular and physiological modifications can occur during such an interaction. They range from changes in primary metabolism, where basic processes like photosynthesis can be affected, through modifications in the cell wall and in some organs of the plant, to the production of secondary metabolites that can be toxic or trigger defensive signals in the plant. Depending on the type of interaction, resistance, tolerance or susceptibility may result [74].

One example is a plant‐fungus interaction studied using electrospray ionization mass spectrometry to detect changes in the levels of lipids and hormones. The researchers had predicted that those molecules were involved in the interaction between *Brachypodium distachyon* and *Magnaporthe grisea*. A variation in the level of phospholipids was detected, which was the main response of the plant to the attack of the fungus [75]. One advantage of metabolomics is that targeted and nontargeted studies can be performed at the same time, as in the case of the interaction between *Lupinus angustifolius* and the fungus *Colletotrichum lupini*. The response was characterized by focusing on the metabolites of the cuticle using GC‐MS (nontargeted study) and on the flavonoids using LC‐MS (targeted study). The variation in the kinds of flavonoids was higher when the interaction took place, and it was more marked in the extract from the plant than in the extract from the fungus [76].

A metabolomic study of the plant‐pathogen interaction can also help to elucidate the genetic mechanisms that underlie resistance. Through a nontargeted study using GC‐MS, the resistance of sunflower (*Helianthus annuus*) to the necrotrophic pathogen *Sclerotinia sclerotiorum* was characterized; 63 metabolites were found, including sugars, organic acids, amino acids and secondary metabolites such as chlorogenic acid, which are associated with tolerant phenotypes [77].

Many studies of plant‐pathogen interactions have focused on relating chemical compounds to diseases caused by bacteria, fungi, oomycetes and even viruses in plants such as Arabidopsis, tobacco, sunflower, barley, rice, potato and grapevine, where changes have been detected in both primary and secondary metabolism [74]. Most of the studies mentioned above used mass spectrometry (coupled with LC or GC) to characterize the metabolomic response of the interaction, but some studies are based on NMR. In a study of the interaction between tobacco (*Nicotiana tabacum* L.) and the tobacco mosaic virus, healthy and diseased leaves were compared through 1D and 2D NMR, and an alteration of the metabolic pathways in infected leaves was proposed [78].

NMR has also been used to characterize the compounds involved in the resistance of host plants to the western flower thrips (*Frankliniella occidentalis*). Three plants of different types were used in this study: Senecio (wild), chrysanthemum (ornamental) and tomato (cultivated). A resistant Senecio hybrid showed significantly higher amounts of pyrrolizidine alkaloids (which are involved in plant chemical defences against herbivores in general), as well as some flavonoids. The resistant chrysanthemum contained high amounts of chlorogenic acid and ferulic acid, phenolic compounds that are produced during plant defence. Cultivated tomato was the most susceptible to attack, but the resistant hybrid of this crop showed high levels of acyl sugars, which confer some protection against attack [79].

There are many examples of the application of metabolomics to plant‐pathogen interactions that help us to understand and interpret this relationship through the physiological and biochemical changes that take place. Advanced analytical platforms of high sensitivity are used for this purpose; they allow the characterization of almost any molecule produced during the interaction. Any kind of plant involved in any type of attack or interaction can be studied, yielding valuable information about the reaction of the plant or of both organisms. Applied analytical techniques can also reveal the best way to exploit such interactions, for example to obtain or increase chemical compounds with some use, or to address other interests such as health.

## **7. Future perspectives of metabolomics**

Over the years, the field of metabolomics has gained interest in several disciplines such as functional genomics, biological and integrative systems, pharmacogenomics, and the discovery of biomarkers for disease prediction, diagnostics and therapy monitoring [80], in addition to agriculture [2].

Modern challenges for metabolomics are diverse in all the fields mentioned, but they are particularly relevant in the discovery of biomarkers [82], especially for diseases, whose detection, monitoring and treatment are important. One of the problems that researchers face is the difficulty of reliably identifying and quantifying many chemical compounds at the same time, since their number can be huge and some metabolites that are undetectable because of their low concentrations may be relevant to some function or application. New analytical techniques are being developed or improved to solve these problems by widening the range of detectable metabolites based on their structural characteristics and by making the methods more sensitive, so that they can detect very low concentrations of compounds.

At present, the data obtained from analysed samples may be interpreted incorrectly, or may contain far more information than is actually extracted, because limitations in the design of the experiment impede their correct interpretation and use.

A future challenge will be to improve such situations so that the information is used better and possible applications for it are found. The validity of a metabolomic study is affected by unbalanced sample sizes (an aspect of experimental design), especially in studies with humans and mainly when statistical methods are used to interpret the data. In some studies that have already been carried out, the numbers of control and diseased individuals are not balanced [81], so such study designs need to be balanced.

One of the challenges of metabolomics is to participate in more fields of study which could be waiting for this type of analyses; although the most important ones are already included, such as disease detection and health in general, studies have already started in fields such as evolution, chemotaxonomy, agriculture, ecology and food quality control, among others.

Nowadays, there are many databases related to chemical compounds, their identification and their structural elucidation, and probably over the years these databases will increase significantly due to the increasing number of studies in this area; therefore, we will have access to a huge number of metabolites, their properties, their possible health benefits and other properties.

Certainly, with the passing of the years, the number of studies related to the functionality of what is already sequenced will increase, since in this field there are still some delays. This is the main challenge of metabolomics for the next years, and it will be conquered by integrating the study of several fields focused on the same objective, or by integrating the latter.

The complete understanding of the function of the cell system and the deciphering of gene function will arrive with time, since metabolomics is integrated with genomics, transcriptomics and proteomics; thus, an integral work will give the result that everyone wants in this branch of knowledge, which is the complete understanding of cell function [81].

Chemical studies in plants, from their origins, have been based on their traditional use and knowledge, since people used them to get some benefit, but without really knowing what caused such effect. Therefore, something that must be addressed is to make the composition and the beneficial effects of the compounds available to the general public in an easy‐to‐understand way.

Metabolomic studies in plants related to natural products will increase simply because they are everywhere, they can be used for everything, and every day new applications are found for them; most of these applications will benefit humans.

It has been mentioned that metabolomics is not a goal in itself, but a tool to improve our understanding of the metabolism and biochemistry of organisms [6]. Therefore, this must be the most important future perspective of all: to know and completely understand metabolism and cell function.

## **8. Conclusion**

Metabolomics is a relatively new field. During the last years, this discipline has been growing because diverse applications have been found for it, and different analytical techniques have been developed and improved; this has allowed an easier interpretation and analysis of the results. The inclusion of a wide variety of crops in this type of studies is paramount, because it is necessary to know the qualities that they have, and take from them the most important traits with the aim of developing an application that benefits food, health or industry.

The field of study of agriculture, with respect to metabolomic aspects, will keep growing, because in the next years, there will be challenges to ensure world food sovereignty. Currently, most crops and their diversity are at risk; therefore, it is necessary to carry out actions focused on their conservation, rescue and rational exploitation.

The range of analytical techniques implemented in metabolomics allows us to be a step ahead in the analysis of extracts or chemical compounds, through which new uses or applications for the plant species studied can be found, or strengthen those already existing, in addition to the development of improvement programmes based on distinctive chemical traits.

## **Author details**


Emmanuel Ibarra‐Estrada<sup>1</sup>, Ramón Marcos Soto‐Hernández<sup>2</sup>\* and Mariana Palma‐Tenango<sup>2</sup>

\*Address all correspondence to: msoto@colpos.mx

1 Ministry of Agriculture, México City, México

2 Postgraduate Studies College, Campus Montecillo, Texcoco, México

## **References**


[12] Verpoorte R, Choi Y H, Mustafa N R, Kim H K: Metabolomics: back to basics. Phytochemistry Reviews. 2008;**7**:525–537.

[13] Sumner L W, Mendes P, Dixon R A: Plant metabolomics: large‐scale phytochemistry in the functional genomics era. Phytochemistry. 2003;**62**:817–836.

[14] Kim H K, Choi Y H, Verpoorte R: NMR‐based metabolomic analysis of plants. Nature Protocols. 2010;**5**:536–549.

[15] Kim H K, Saifullah, Khan S, Wilson E G, Prat Kricun S D, Meissner A, Goraler S, Deelder A M, Choi Y H, Verpoorte R: Metabolic classification of South American Ilex species by NMR‐based metabolomics. Phytochemistry. 2010;**71**:773–784.

[16] Choi Y H, Kim H K, Hazekamp A, Erkelens C, Lefeber A W M, Verpoorte R: Metabolomic differentiation of *Cannabis sativa* cultivars using <sup>1</sup>H‐NMR spectroscopy and principal component analysis. Journal of Natural Products. 2004;**67**:953–957.

[17] Kim H K, Choi Y H, Erkelens C, Lefeber A W M, Verpoorte R: Metabolic fingerprinting of Ephedra species using <sup>1</sup>H‐NMR spectroscopy and principal component analysis. Chemical and Pharmaceutical Bulletin. 2005;**53**:105–109.

[18] Georgiev M I, Ali K, Alipieva K, Verpoorte R, Choi Y H: Metabolic differentiations and classification of Verbascum species by NMR‐based metabolomics. Phytochemistry. 2011;**72**:2045–2051.

[19] Bailey N J C, Sampson J, Hylands P J, Nicholson J K, Holmes E: Multicomponent metabolic classification of commercial feverfew preparations via high‐field <sup>1</sup>H‐NMR spectroscopy and chemometrics. Planta Medica. 2002;**68**:734–738. doi:10.1055/s‐2002‐33793

[20] Daolio C, Beltrame F L, Ferreira A G, Cass Q B, Cortez D A G, Ferreira M C: Classification of commercial catuaba samples by NMR, HPLC and chemometrics. Phytochemical Analysis. 2008;**19**:218–228.

[21] Carreno‐Quintero N, Bouwmeester H J, Keurentjes J J B: Genetic analysis of metabolome‐phenotype interactions: from model to crop species. Trends in Genetics. 2013;**29**:41–50.

[22] Riedelsheimer C, Lisec J, Czedik‐Eysenberg A, Sulpice R, Flis A, Grieder C, Altmann T, Stitt M, Willmitzer L, Melchinger A E: Genome‐wide association mapping of leaf metabolic profiles for dissecting complex traits in maize. Proceedings of the National Academy of Sciences of the United States of America. 2012;**109**:8872–8877.

[23] Wen W, Li D, Li X, Gao Y, Li W, Li H, Liu J, Liu H, Chen W, Luo J, Yan J: Metabolome‐based genome‐wide association study of maize kernel leads to novel biochemical insights. Nature Communications. 2014;**5**:1–10.

[24] Röhlig R M, Eder J, Engel K H: Metabolite profiling of maize grain: differentiation due to genetics and environment. Metabolomics. 2009;**5**:459–477.

[25] Simons M, Saha R, Guillard L, Clément G, Armengaud P, Cañas R, Maranas C D, Lea P J, Hirel B: Nitrogen‐use efficiency in maize (*Zea mays* L.): from 'omics' studies to metabolic modelling. Journal of Experimental Botany. 2014;**65**:5657–5671.

[26] Li K, Xu C, Zhang K, Yang A, Zhang J: Proteomic analysis of roots growth and metabolic changes under phosphorus deficit in maize (*Zea mays* L.) plants. Proteomics. 2007;**7**:1501–1512.


[37] Catchpole G S, Beckmann M, Enot D P, Mondhe M, Zywicki B, Taylor J, Hardy N, Smith A, King R D, Kell D B, Fiehn O, Draper J: Hierarchical metabolomics demonstrates substantial compositional similarity between genetically modified and conventional potato crops. Proceedings of the National Academy of Sciences of the United States of America. 2005;**102**:14458–14462.

[38] Roessner U, Luedemann A, Brust D, Fiehn O, Linke T, Willmitzer L, Fernie A R: Metabolic profiling allows comprehensive phenotyping of genetically or environmentally modified plant systems. Plant Cell. 2001;**13**:11–29.

[39] Defernez M, Gunning Y M, Parr A J, Shepherd L V T, Davies H V, Colquhoun I J: NMR and HPLC‐UV profiling of potatoes with genetic modifications to metabolic pathways. Journal of Agricultural and Food Chemistry. 2004;**52**:6075–6085.

[40] Aliferis K A, Jabali S: FT‐ICR/MS and GC‐EI/MS metabolomics networking unravels global potato sprout's responses to Rhizoctonia solani infection. PLoS One. 2012;**7**:e42576. doi:10.1371/journal.pone.0042576

[41] Schauer N, Zamir D, Fernie A R: Metabolic profiling of leaves and fruit of wild species tomato: a survey of the *Solanum lycopersicum* complex. Journal of Experimental Botany. 2005;**56**:297–307.

[42] Schauer N, Semel Y, Roessner U, Gur A, Balbo I, Carrari F, Pleban T, Pérez‐Melis A, Bruedigam C, Kopka J, Willmitzer L, Zamir D, Fernie A R: Comprehensive metabolic profiling and phenotyping of interspecific introgression lines for tomato improvement. Nature Biotechnology. 2006;**24**:447–454.

[43] Tikunov Y, Lommen A, de Vos C H R, Verhoeven H A, Bino R J, Hall R D, Bovy A G: A novel approach for nontargeted data analysis for metabolomics. Large‐scale profiling of tomato fruit volatiles. Plant Physiology. 2005;**139**:1125–1137.

[44] Mintz‐Oron S, Mandel T, Rogachev I, Feldberg L, Lotan O, Yativ M, Wang Z, Jetter R, Venger I, Adato A, Aharoni A: Gene expression and metabolism in tomato fruit surface tissues. Plant Physiology. 2008;**147**:823–851.

[45] Le Gall G, Colquhoun I J, Davis A L, Collins G J, Verhoeyen V E: Metabolite profiling of tomato (*Lycopersicum esculentum*) using <sup>1</sup>H‐NMR spectroscopy as a tool to detect potential unintended effects following a genetic modification. Journal of Agricultural and Food Chemistry. 2003;**51**:2447–2456.

[46] Moco S, Bino R J, Vorst O, Verhoeven H A, Groot J, van Beek T A, Vervoort J, deVos C H R: A liquid chromatography‐mass spectrometry‐based metabolome database for tomato. Plant Physiology. 2006;**141**:1205–1218.

[47] Moco S, Forshed J, deVos R C H, Bino R J, Vervoort J: Intra‐ and inter‐metabolite correlation spectroscopy of tomato metabolism data obtained by liquid chromatography‐mass spectrometry and nuclear magnetic resonance. Metabolomics. 2008;**4**:202–215.

[48] Fraser P D, Enfissi E M, Goodfellow M, Eguchi T, Bramley P M: Metabolite profiling of plant carotenoids using the matrix‐assisted laser desorption ionization time‐of‐flight mass spectrometry. The Plant Journal. 2007;**49**:552–564.


[59] Iijima Y, Gang D R, Lewinsohn E, Pichersky E: Characterization of geraniol synthase from the peltate glands of sweet basil (*Ocimum basilicum*). Plant Physiology. 2004;**134**:370–379.

[60] Welham T, Domoney C: Temporal and spatial activity of a promoter from a pea enzyme inhibitor gene and its exploitation for seed quality improvement. Plant Science. 2000;**159**:289–299.

[61] Charlton A, Allnutt T, Holmes S, Chisholm J, Bean S, Ellis N, Mullineaux, Oehlschlager S P: NMR profiling of transgenic peas. Plant Biotechnology Journal. 2004;**2**:27–35.

[62] Widodo, Patterson J H, Newbigin E, Tester M, Bacic A, Roessner U: Metabolic responses to salt stress of barley (*Hordeum vulgare* L.) cultivars, Sahara and Clipper, which differ in salinity tolerance. Journal of Experimental Botany. 2009;**60**:4089–4103.

[63] Graham S, Amigues E, Migaud M, Browne R A: Application of NMR based metabolomics for mapping metabolite variation in European wheat. Metabolomics. 2009;**5**:302–306.

[64] Achnine L, Huhman D V, Farag M A, Sumner L W, Blount J W, Dixon R A: Genomics‐based selection and functional characterization of triterpene glycosyltransferases from the model legume *Medicago truncatula*. The Plant Journal. 2005;**41**:875–887.

[65] Farag M A, Deavours B E, de Fátima A, Naoumkina M, Dixon R A, Sumner L W: Integrated metabolite and transcript profiling identify a biosynthetic mechanism for hispidol in *Medicago truncatula* cell cultures. Plant Physiology. 2009;**151**:1096–1113.

[66] Na Jom K, Frank T, Engel K H: A metabolite profiling approach to follow the sprouting process of mung beans (*Vigna radiata*). Metabolomics. 2011;**7**:102–117.

[67] Goossens A, Häkkinen S T, Seppänen‐Laakso T, Biondi S, de Sutter V, Lammertyn F, Nuutila A M, Söderlund H, Zabeau M, Inzé D, Oksman‐Caldentey K M: A functional genomics approach toward the understanding of secondary metabolism in plant cells. Proceedings of the National Academy of Sciences of the United States of America. 2003;**100**:8595–8600.

[68] Wahyuni Y, Ballester A R, Sudarmonowati E, Bino R J, Bovy A G: Metabolite biodiversity in pepper (*Capsicum*) fruits of thirty‐two diverse accessions: variation in health‐related compounds and implications for breeding. Phytochemistry. 2011;**72**:1358–1370.

[69] Pereira G E, Gaudillere J P, van Leeuwen C, Hilbert G, Lavialle O, Maucourt M, Deborde C, Moing A, Rolin D: <sup>1</sup>H‐NMR and chemometrics to characterize mature grape berries in four wine‐growing areas in Bordeaux, France. Journal of Agricultural and Food Chemistry. 2005;**53**:6382–6389.

[70] Stewart D, McDougall G J, Sungurtas J, Verrall S, Graham J, Martinusen I: Metabolomic approach to identifying bioactive compounds in berries: advances toward fruit nutritional enhancement. Molecular Nutrition & Food Research. 2007;**51**:645–651.

[71] Aharoni A, de Vos C H R, Verhoeven H A, Maliepaard C A, Kruppa G, Bino R, Goodenowe D B: Nontargeted metabolome analysis by use of Fourier Transform Ion Cyclotron Mass Spectrometry. OMICS. 2002;**6**:217–234.


[82] Putri S P, Nakayama Y, Matsuda F, Uchikata T, Kobayashi S, Matsubara A, Fukusaki E: Current metabolomics: practical applications. Journal of Bioscience and Bioengineering. 2013;**115**:579–589.

[83] Canela N, Rodríguez M A, Baiges I, Nadal P, Arola L: Foodomics imaging by mass spectrometry and magnetic resonance. Electrophoresis. 2016;**37**:1748–1767.

[84] Ashton O F O, Wong M, McGuie T K, Vather R, Wang Y, Requejo‐Jackman C, Ramankutty P, Woolf A B: Pigments in avocado tissue and oil. Journal of Agricultural and Food Chemistry. 2006;**54**:10151–10158.

[85] Liao C, Peng Y, Ma W, Liu R, Li C, Li X: Proteomic analysis revealed nitrogen‐mediated metabolic, developmental, and hormonal regulation of maize (*Zea mays* L.) ear growth. Journal of Experimental Botany. 2012;**63**:5275–5288.

[86] Liu X, Robinson P W, Madore M A, Witney G W, Arpaia M L: "Hass" avocado carbohydrate fluctuation I. Growth and phenology. Journal of the American Society of Horticultural Science. 1999;**124**:671–675.

[87] Liu X, Sievert J, Arpaia M L, Madore M A: Postulated physiological roles of the seven‐carbon sugars, mannoheptulose, and perseitol in avocado. Journal of the American Society for Horticultural Science. 2002;**127**:108–114.

[88] Meyer M D, Terry L A: Development of a rapid method for the sequential extraction and subsequent quantification of fatty acids and sugars from avocado mesocarp tissue. Journal of Agricultural and Food Chemistry. 2008;**56**:7439–7445.

[89] Morgan M, Osorio S, Gehl B, Baxter C J, Kruger N J, Ratcliffe R G, Fernie A R, Sweetlove L J: Metabolic engineering of tomato fruit organic acid content guided by biochemical analysis of an introgression line. Plant Physiology. 2013;**161**:397–407.

[90] Roessner‐Tunali U, Hegemann B, Lytovchenko A, Carrari F, Bruedigam C, Granot D, Fernie A R: Overexpressing hexokinase reveals that the influence of hexose phosphorylation diminishes during fruit development. Plant Physiology. 2003;**133**:84–99.

[91] Tesfay S Z, Bertling I, Bower J P, Lovatt C: The quest for the function of "Hass" avocado carbohydrates: clues from fruit and seed development as well as seed germination. Australian Journal of Botany. 2012;**60**:79–86.

[92] Kim H K, Choi Y H, Verpoorte R: NMR‐based metabolomic analysis of plants. Nature Protocols. 2010;**5**:536–549. doi:10.1038/nprot.2009.237

**Biomedical and Clinical Metabolomics**

#### **Application of Metabolomics for the Diagnosis and Traditional Chinese Medicine Syndrome Differentiation of Chronic Heart Failure**

Juan Wang, Jianxin Chen, Huihui Zhao and Wei Wang

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/66209

#### **Abstract**

Chronic heart failure (CHF) is characterized by the failure of the heart to supply enough blood to meet the body's metabolic demands, and its prevalence continues to increase globally. The personalized diagnosis of Traditional Chinese Medicine (TCM) classifies CHF into several different syndrome types, and integrating Western medicine and TCM to treat CHF has proved to be a validated therapeutic approach. Over the last few years, there has been a rapidly growing number of metabolomics applications aimed at finding biomarkers that could assist diagnosis, provide therapy guidance, and evaluate response to therapy for individualized intervention in CHF. Thus, in this review, particular attention is paid to past successes in applying state-of-the-art metabolomics technology to biomarker discovery in CHF research.

**Keywords:** metabolomics, chronic heart failure, Chinese Medicine

## **1. Introduction**

Chronic heart failure (CHF), a progressive clinical syndrome characterized by the inability of the heart to adequately pump blood to meet metabolic demands of the body, represents the final common ground of pathogenesis wherein various causes of heart damage converge [1].

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Despite a substantial improvement in the survival rate after the onset of CHF due to the increasing use of pharmacological interventions, mortality among patients suffering from CHF remains high. Since the incidence and prevalence of CHF are expected to increase further with the aging of the population, better strategies for the prevention and treatment of CHF are still needed.

#### **1.1. Epidemiology of CHF**

CHF, in recent years, has been a major cause of morbidity and mortality in the general population [2]. Both the increasing age of the population and success in the treatment of patients with acute myocardial infarction raise the prevalence and thus the economic expenditures of chronic heart failure [3]. Heart failure is not only a common, costly, disabling, and potentially fatal condition but also the leading cause of hospitalization in people older than 65 [4]. In developing countries, 2–3% of the population suffers from heart failure, but in people from 70 to 80 years old, it occurs in 20–30% [5]. In developed countries, around 2% of adults suffer from heart failure, but in people older than 65, this increases to 6–10% [6].

#### **1.2. Personalized intervention of CHF based on Traditional Chinese Medicine (TCM) syndrome**

With progress in bioinformatics and medical science, the view of health and disease in Western life sciences has shifted from standard protocol-based disease management to personalized medicine [7]. Based on personalized health and systematic diagnostic principles, Traditional Chinese Medicine (TCM) has proved effective in restoring the self-regulatory ability of the human system over thousands of years of clinical practice. Integrating TCM and Western medicine to treat CHF has been reported to enhance heart function and reduce related clinical symptoms, including expiratory dyspnea and chronic fatigue, and subsequently to improve echocardiographic measures, 6-min walking distance, and patients' quality of life [8]. TCM physicians also pay more attention to the overall maladjustment of functional status, called the "syndrome type" [9]. It is not simply an assemblage of a disease's signs and symptoms but also a functional status caused by the reaction to, or interaction with, environmental changes and pathogenic factors [10]. In other words, the essence of a TCM "syndrome type" is a disturbance in biological metabolic networks, that is, changes in the concentrations and relative proportions of metabolomic biomarkers resulting from an imbalance of the human system. For example, according to the Clinical Terminology of Traditional Chinese Medical Diagnosis and Treatment—Syndromes, yin deficiency syndrome in CHF patients is described as low fever, night sweats, afternoon redness of the cheeks, dysphoria with a feverish sensation in the chest, palms, and soles, dry mouth and throat, a red tongue with little coating, and a thready, rapid pulse. Yang deficiency syndrome in CHF is a cluster of symptoms including an aversion to cold, dispirited feelings and lack of motivation, diarrhea before dawn, shortness of breath, frequent urination, edema, and a tendency to catch colds. Therefore, the TCM syndrome, also defined as the TCM pattern, is the essence of diagnosis and treatment in TCM.

## **2. Bringing metabolomics into the forefront of CHF research**


The metabolome is the final downstream product of transcription and translation and is thus closest to the phenotype [11]. The dynamics of primary metabolism operate on timescales of seconds. These two characteristics make the metabolome a sensitive and rapid measure of the system phenotype. Metabolomics was first defined in 1998 [12, 13]. Since then, methodological progress has led to the discovery of metabolomic biomarkers and to greater knowledge of disease mechanisms. From the 1960s onward, applications of mass spectrometry (MS) [14] and nuclear magnetic resonance (NMR) spectroscopy [15, 16] drove the first holistic studies of mammalian biofluids. In recent years, technological development has brought metabolomics to its current status. More than 13,000 studies retrievable from the PubMed database show that metabolomics is now a routinely applied tool. However, metabolomics is still the younger and smaller sibling of proteomics, transcriptomics, and genomics. Metabolite profiles can provide a fingerprint of the metabolic changes that characterize the mechanism of CHF, a progressive clinical syndrome, and also highlight the potential of metabolomic analysis in the evaluation of disease condition.

Metabolomics, as an important component of systems biology, can be used to perform dynamic studies on intact tissues and organs in vivo and in vitro using noninvasive approaches under nearly physiological conditions. Therefore, metabolomic detection and analysis of biological samples may contribute to understanding the biochemical changes associated with the progression of diseases. The identification of disease-associated metabolic biomarkers could allow early diagnosis of disease and the establishment of predictive diagnostic systems. Metabolomics is reported to offer unparalleled advantages in the study of heart failure, the most common cardiovascular condition encountered in clinical practice [17]. Metabolomics has also shown significant potential in TCM studies in recent years. Several studies [18, 19] combining metabolomic techniques and TCM syndrome types have demonstrated fingerprints of metabolic changes that characterize Western medicine-diagnosed diseases, which highlights the potential of metabolomics in the evaluation of disease condition and TCM-guided personalized treatment.

#### **2.1. General procedures in which metabolomics can be used for diagnosis and biomarker discovery**

Metabolomics follows a workflow [20, 21] that starts from a biological question and the design of biological and analytical experiments, and proceeds through sample collection and preparation, data acquisition, and data preprocessing and analysis to biological interpretation and reasoning, as shown in **Figure 1**.


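**Figure 1.** The general workflow of metabolomic experiment: (1) biological experiment (design of experiment; sample collection); (2) analytical experiment (sample preparation; data acquisition); (3) data integration (data pre-processing; data analysis; integration with metadata); (4) analysis and metabolite identification; (5) biological interpretation (metabolomic biomarkers; biological mechanism).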

Specifically, analytes in a metabolomic sample comprise a highly complex mixture. Metabolites are identified and quantified by mass spectrometry (MS), optionally after separation by liquid chromatography (LC-MS), gas chromatography (GC-MS), or capillary electrophoresis (CE), or by nuclear magnetic resonance (NMR) spectroscopy. The raw data usually consist of measurements performed on subjects under various conditions. These measurements may be digitized spectra or a list of metabolite levels. XCMS is one of the most widely cited software programs for mass spectrometry-based metabolomics in the scientific literature. Prior to multivariate analysis, the data are imported as variables into SIMCA-P+12 software (Umetrics, Umea, Sweden) and then mean-centered and pareto-scaled. Principal component analysis (PCA) and orthogonal partial least-squares discriminant analysis (OPLS-DA) are commonly applied to the processed data. Score and loading plots are calculated to reveal the metabolites that discriminate each group, and clustering of samples with similar metabolic fingerprints can be detected. This clustering can elucidate patterns and assist in the determination of disease biomarkers.
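The mean-centering and Pareto-scaling step can be sketched in a few lines. The example below is a minimal illustration in plain Python, not the SIMCA-P+ implementation; the function name `pareto_scale` and the intensity values are hypothetical. It assumes the common definition of Pareto scaling, in which each variable is mean-centered and then divided by the square root of its standard deviation.

```python
import math
import statistics


def pareto_scale(matrix):
    """Mean-center each column, then divide by the square root of its sample SD."""
    n_vars = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_vars)]
    scaled_columns = []
    for col in columns:
        mean = statistics.fmean(col)
        sd = statistics.stdev(col)  # sample standard deviation (n - 1 denominator)
        # A constant column has SD 0; after centering it is all zeros, so divide by 1.
        divisor = math.sqrt(sd) if sd > 0 else 1.0
        scaled_columns.append([(x - mean) / divisor for x in col])
    # Transpose back to a samples x variables layout.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical intensities: 4 samples (rows) x 2 metabolite variables (columns).
intensities = [
    [10.0, 1000.0],
    [12.0, 1400.0],
    [14.0, 1800.0],
    [16.0, 2200.0],
]
scaled = pareto_scale(intensities)
```

After this transformation each variable has zero mean, and its standard deviation equals the square root of the original standard deviation, so high-intensity variables dominate less than in the raw data while keeping more of their original weight than under unit-variance scaling.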

Two general experimental strategies are applied: metabolic profiling (or metabolomics) and metabolite-targeted analysis. Metabolic profiling allows comprehensive phenotyping of genetically or environmentally modified systems. The study is devised to acquire data on a large number of metabolites (100–1000s), and these data are then interrogated to identify biological differences. The design of metabolomic experiments is of great importance here, because it is easy to introduce confounding factors that are not recognized during data analysis and which can falsify biological conclusions [22].

#### **2.2. Advances in metabolomics techniques and related statistical methods**

Without knowing which metabolites are of specific biological interest, powerful analytical methods are used to detect thousands of metabolites reproducibly in a single sample. Relative changes in metabolite concentrations are studied, and at this stage the absolute concentrations of metabolites are generally not determined. Detecting all metabolites is the goal, but this is not currently technologically achievable. Instead, a platform of sample preparation and analytical methods is applied to acquire good coverage of identified metabolites (see, e.g., the Husermet project, in which LC-MS, GC-MS, and NMR spectroscopy have all been used; www.husermet.org). Univariate and multivariate analysis tools [23–25] are applied to process the data. These discovery-phase, hypothesis-generating or inductive studies [26], sometimes described by the non-cognoscenti as a "fishing expedition," aim to detect new biological and metabolomic markers. The second strategy is driven by known biology: a limited number of metabolites (generally fewer than 20) are known to be biologically relevant before the biological experiment is conducted, and these metabolites are accurately quantified in a targeted approach. Under the remit of traditional analytical chemistry and the biochemical assays commonly found in clinical laboratories, this strategy has been applied for decades.
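As an illustration of the univariate side of this analysis, the sketch below compares the relative intensity of a single metabolite between two groups using a log2 fold change and Welch's t statistic. This is a generic example in plain Python, not the procedure of any cited study; the function names and the intensity values are hypothetical.

```python
import math
import statistics


def log2_fold_change(case, control):
    """Log2 ratio of the group means (relative, not absolute, change)."""
    return math.log2(statistics.fmean(case) / statistics.fmean(control))


def welch_t(case, control):
    """Welch's t statistic, which does not assume equal group variances."""
    standard_error = math.sqrt(
        statistics.variance(case) / len(case)
        + statistics.variance(control) / len(control)
    )
    return (statistics.fmean(case) - statistics.fmean(control)) / standard_error


# Hypothetical relative intensities of one metabolite in two groups.
case = [8.0, 9.0, 7.0, 8.0]
control = [4.0, 5.0, 3.0, 4.0]

fold_change = log2_fold_change(case, control)  # group means 8 and 4 give log2(2)
t_statistic = welch_t(case, control)
```

In a profiling study the same calculation would be repeated for every detected feature, so the resulting p-values would need correction for multiple testing before any metabolite is put forward as a candidate marker.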

## **3. Biomarkers and metabolomics studies on CHF**


In human studies, metabolomics has been applied to define biomarkers related to the prognosis or diagnosis of a disease or to drug toxicity/efficacy, and these studies are expected to provide greater pathophysiological understanding of disease and of therapeutic toxicity/efficacy [27]. Coupling bioinformatics and biostatistics with metabolomic technology platforms permits the identification and quantification of molecules that characterize the whole organism's response to disease [28]. Studies have demonstrated that metabolomics lends itself ideally to the most common cardiovascular condition encountered in clinical practice, heart failure [29]. Serum metabolites from 52 patients with CHF and 57 controls were analyzed, and 38 peaks showed significant differences between patients and controls. Two metabolites, pseudouridine (a modified nucleotide present in tRNA and rRNA and a marker of cell turnover) and the tricarboxylic acid cycle intermediate 2-oxoglutarate, were at least as diagnostic of heart failure as the current gold-standard biomarker, brain natriuretic peptide (BNP) [30]. Moreover, 2-hydroxy-2-methylpropanoic acid, erythritol, and 2,4,6-trihydroxypyrimidine were also good discriminators between cases and controls. Metabolites identified in such early experiments are candidate biomarkers, and they lay a foundation for larger, prospective, externally validated clinical studies.

Our study applied a metabolomics approach to plasma obtained directly from patients in order to assess its accuracy and reliability in diagnosing CHF, and it showed good performance in terms of both specificity and sensitivity. It should be noted that the heart is known to consume a diverse set of fuel substrates, including lactate, glucose, amino acids, ketones, and particularly free fatty acids (FFAs). The metabolites examined include energy metabolism-related molecules, lipid/protein complexes, and amino acids. Plasma samples from 39 CHF patients and 15 controls were analyzed by NMR spectroscopy. After processing the data, PCA and OPLS-DA were performed. The statistical model showed good explained variance and predictability, and the diagnostic performance assessed by leave-one-out analysis was 92.31% sensitivity and 86.67% specificity. The OPLS-DA score plots of the spectra revealed good separation between cases and controls at the level of metabolites, and multiple biochemical changes indicated hyperlipidemia, alteration of energy metabolism, and other potential biological mechanisms underlying CHF. It was concluded that the NMR-based metabolomics approach performed well in identifying diagnostic plasma markers and provided new insights into the metabolic processes related to CHF.
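The reported sensitivity and specificity can be related to classification counts with a short calculation. The sketch below is illustrative only: the counts of 36 of 39 patients and 13 of 15 controls correctly classified are an assumption, inferred because they are the counts that reproduce the reported percentages for these group sizes; they are not stated in the text. The function name is hypothetical.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = true_pos / (true_pos + false_neg)  # fraction of patients detected
    specificity = true_neg / (true_neg + false_pos)  # fraction of controls cleared
    return sensitivity, specificity


# Assumed counts (not given in the text): 36 of 39 CHF patients and
# 13 of 15 controls classified correctly in the leave-one-out analysis.
sens, spec = sensitivity_specificity(true_pos=36, false_neg=3, true_neg=13, false_pos=2)
print(f"sensitivity = {sens:.2%}, specificity = {spec:.2%}")
# prints "sensitivity = 92.31%, specificity = 86.67%"
```

With only 15 controls, each misclassified control changes the specificity by about 6.7 percentage points, which is one reason larger validation cohorts are needed.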

It has become well recognized in recent years that alteration in energy metabolites is of great importance in chronic heart failure. Impaired extraction of a wide range of metabolites, probably pointing to severe energy deficiency, leads to cardiac dysfunction in CHF patients. Moreover, higher lactate and lower glucose levels in plasma seem to aggravate the impairment of energetic pathways in patients with CHF. A few metabolites relevant only in distinguishing CHF patients from healthy controls could be associated with prolonged exertion, which is consistent with the increase in lactate seen during anaerobic exercise. When stores of glucose are low and oxaloacetate has been exhausted, acetone is generated. Furthermore, alanine is expected to rise as a result of gluconeogenesis when lactate is produced [31]. A rise in creatine, which is phosphorylated to phosphocreatine in muscle, may indicate a physiological state of energy depletion. In addition, glycolate is of great importance in energy generation by mitochondria. Pyruvate derived from glycolysis is diverted away from the pyruvate dehydrogenase reaction and toward the lactate dehydrogenase reaction, which contributes to the increase in lactate. Acetyl-CoA continues to be produced by fatty acid oxidation but accumulates on account of lowered tricarboxylic acid (TCA) cycle flux, and the rise in acetyl-CoA may promote inhibition of pyruvate dehydrogenase. Stress on energy metabolism can also affect other metabolic areas in CHF patients. 3-Hydroxybutyrate, 2-hydroxyisobutyrate, and 3-methyladipate were shown to be significant in glucose and lipid metabolism in Lin's study [32]. Glycoprotein and carnitine are reported to be involved in mitochondrial oxidative metabolism and hypoxemic stress [33]. These results suggest a decrease in oxidative fuel use and a greater reliance on anaerobic metabolism of glucose for energy production, as reflected in the plasma of CHF patients.

In CHF patients, increased low-density lipoprotein (LDL) and decreased high-density lipoprotein (HDL) were observed, which may be related to lipolysis as a backup mechanism for energy generation. A previous proteomic analysis of left-atrial cardiomyocyte and tissue samples from a congestive heart failure model also found significant alterations in apolipoprotein levels [34]. As apolipoproteins play a significant role in lipid metabolism, the alterations in apolipoprotein concentrations indicate that lipid metabolic dysregulation may be related to CHF. Other studies have shown a close relationship between CHF and lipid regulation, which may account for the high comorbidity between metabolic syndrome and CHF [35]. Along these lines, a previous study examining FFA extraction in myocardial ischemia patients also found decreased FFA extraction and oxidation [36]. Choline and its derivatives are important constituents in the phospholipid metabolism of cell membranes and have previously been identified as markers of cellular proliferation. To summarize, CHF patients have limited myocardial metabolic reserve and flexibility, which supports the preliminary hypothesis of an association between lipid metabolic disorder and CHF.

Changes in the TCA cycle may also be a metabolic marker in CHF patients. Impaired TCA cycle flux, which is fed by the catabolism of glucose and amino acids, appears to occur in part through limiting levels of anaplerotic substrates [37]. A fall in glutamate/glutamine uptake was observed in CHF patients; these amino acids would normally be transaminated to form the anaplerotic substrate α-ketoglutarate. An increase in alanine release was also noted. The net production of alanine likely occurs via transamination of pyruvate, with glutamate as the nitrogen donor in the alanine transaminase reaction. This metabolic signature is consistent with impaired glucose oxidation, which results in the diversion of pyruvate away from the TCA cycle and into the alanine transaminase reaction. Some scholars hypothesize that CHF activates proteolysis of skeletal muscle and enhances branched-chain amino acid oxidation, as has been described in inflammatory states [38]. In addition, increased proline has been reported to play a role in coronary atherosclerosis [39]. In view of these studies, the heart's ability to maintain homeostasis via glycogenolysis, gluconeogenesis, and ketogenesis is compromised in CHF, and amino acid metabolism is altered as well.

## **4. Diagnostic power of metabolomics in TCM syndromes of CHF**

Despite advances in drug treatment strategies for CHF, the number of deaths resulting from this condition continues to rise [40]. TCM pays special attention to the integrity and holism of the human body and its interrelationship with nature. Based on different symptoms and signs, TCM adheres to the basic principle of treating the same disease by different methods and different diseases by one method, and it emphasizes personalized treatment, which is the essence of TCM intervention [41]. Therefore, treatment based on syndrome differentiation is the core of TCM therapy for CHF. From the perspective of TCM, CHF may occur in all differentiation types, including qi deficiency and blood stasis, yang deficiency and water retention, yin deficiency, and so on. Many Chinese herbs have demonstrated safety and efficacy in the management of chronic heart failure in either animal models or humans [42, 43]. In addition, modern biological research has entered an era of integrating various research technologies and methods to tackle difficult biological problems at the biomolecular level as a whole, which is exemplified by studies in the new scientific field of metabolomics. It is therefore crucial to investigate the potential correlation between TCM syndrome type and metabolites in order to develop novel therapeutic approaches for better treatment of CHF.

#### **4.1. Qi deficiency and blood stasis syndrome**


Based on TCM, we classified CHF patients into several syndromes. We investigated the plasma metabolites of CHF patients with qi deficiency and blood stasis syndrome to illuminate new approaches to diagnosis and to identify metabolic signatures of TCM syndromes in CHF. Combining plasma metabolomics with TCM syndrome-type diagnosis revealed the distinguishing metabolites of CHF patients with qi deficiency and blood stasis syndrome, including energy metabolites (glucose, lactate, and glycoprotein), lipid/protein complexes [HDL, LDL/very low-density lipoprotein (VLDL)], and amino acids (alanine, glutamate, valine, glycine, proline, and carnitine). Therefore, this metabolomic method may demonstrate potential in understanding TCM syndromes of CHF. As with any new diagnostic approach, this study has limitations. The effects of other confounding factors on the metabolic profiles remain to be analyzed in further studies, and large cohorts are required to validate this method. Because plasma samples represent the effects of metabolism in different organs, it is also difficult to assign a metabolic fingerprint to specific metabolic processes [44]. However, it should be noted that the altered metabolites are reflected in CHF patients with a certain TCM syndrome and can be harnessed as markers of disease.

Qi deficiency and blood stasis syndrome, a major syndrome among CHF patients, shows a distinct signature of altered metabolism, which includes increased levels of lactate, glycoprotein, and low-density lipoprotein (LDL)/very low-density lipoprotein (VLDL), and lower levels of glucose, valine, proline, alanine, and carnitine. Glycoprotein is closely associated with the physiology and pathology of cell growth and can affect human metabolic energy supply and cellular immunity [45]. Furthermore, increased levels of LDL and VLDL were observed in CHF patients with qi deficiency and blood stasis syndrome, and these were the most prominent factors differentiating them from controls. This metabolomic profile could be associated with lipolysis as a backup mechanism for energy utilization, since apolipoproteins are of great importance in lipid metabolism. Meanwhile, the plasma levels of well-known essential and nonessential amino acids (such as alanine and valine) decreased in the CHF patients with qi deficiency and blood stasis syndrome, gradually breaking the internal equilibrium of the body. This is in line with Yan's research: a metabolomics study on a rat model of myocardial ischemia with this syndrome showed that increased inositol and decreased valine, glycine, and serine were closely associated with energy metabolism and the oxidative stress response [46]. In addition, carnitine, an important substance involved in fat metabolism and energy supply, was dramatically reduced in these patients. Studies have confirmed that L-carnitine can increase the uptake of free fatty acids (FFA) and, in certain circumstances, the use of glucose as oxidative fuel [47]. If carnitine is insufficient, the oxidation process in mitochondria will be affected, which leads to an imbalance of cell metabolism and to heart disease. The tricarboxylic acid cycle and carbohydrate, protein, and fat metabolism are all involved in the above processes, indicating a complicated metabolic disorder in CHF patients with qi deficiency and blood stasis syndrome. Therefore, this metabolomic method may demonstrate potential in understanding qi deficiency and blood stasis syndrome of CHF. Every new diagnostic approach has limitations. The effects of other confounding factors on the metabolic profiles remain to be analyzed in further studies, and large cohorts are required to validate this method [47]. Because plasma samples represent the effects of metabolism in different organs, it is also difficult to assign a metabolic fingerprint to specific metabolic processes. However, it should be noted that these altered metabolites are at least partly reflected in CHF patients with qi deficiency and blood stasis syndrome and can be harnessed as markers of disease for personalized treatment based on TCM. Further mechanistic studies regarding this issue are warranted.

#### **4.2. Yang deficiency and water retention syndrome**

Patients with yang deficiency and water retention syndrome showed higher levels of lactate, glycoprotein, pyruvic acid, alanine, and glutamate, and lower levels of glucose, low-density lipoprotein (LDL)/very low-density lipoprotein (VLDL), and high-density lipoprotein (HDL). Based on TCM theory, yang deficiency is associated with chronic weakness, hypofunction, hypometabolism, and degenerative symptoms, and it is extremely common in the late or severe stage of many diseases [48]. The metabolic pattern in CHF patients with yang deficiency and water retention syndrome showed a decrease in glucose and an increase in lactate, alanine, and pyruvate, suggesting that the disorder of carbohydrate and energy metabolism in these patients is more serious. There might be enhanced endogenous glucose production from gluconeogenesis, and the change in pyruvic acid may indicate increased hepatic gluconeogenesis, with extra pyruvate provided as a substrate for glucose [49]. This is consistent with the syndrome typically occurring in patients with stage III and stage IV chronic heart failure. Moreover, increased glycoprotein indicates immune defects in the patients [50], while the generally lower lipoprotein levels, including LDL/VLDL and HDL, also suggest insufficient absorption and utilization of protein in this phase. The higher excretion of measured metabolites (glutamate and alanine) could indicate a greater disturbance of renal function in these patients, with a resultant loss of metabolites that are necessary for carbohydrate and energy metabolism [51]. A previous study [52] investigated the urinary metabolites of this syndrome in patients with chronic kidney disease and found that the key metabolites distinguishing the yang deficiency and water retention syndrome group from the control group included alanine, diethylamine, proline, and so on. Because these are essential substances in cellular activities, their deficiency will affect energy supply throughout the human body. Finally, these alterations are likely important contributing factors to the altered metabolite profile of CHF patients with yang deficiency and water retention syndrome.

## **5. Conclusions and future perspectives**

The future goals for metabolomics are the validation of existing biomarkers in terms of mechanism and translation to humans, together with a focus on characterizing individual health care. So far, metabolomics may be of special clinical relevance for diagnosing the syndromes of CHF and, to some extent, for uncovering metabolic pathways and prognosis, which could lead to a better understanding and improvement of personalized interventions for CHF. Metabolomics has also shown great promise for discovering possible early biomarkers of the development of CHF and for assessing progression during treatment, which can aid the discovery of prognostic indicators of outcome and of disease response to personalized therapy.

## **Acknowledgements**

This work was supported by the National Department Public Benefit Research Foundation of China under grant no. 200807007 and the National Natural Science Foundation of China under grant no. 81302914.

## **Author details**

Juan Wang, Jianxin Chen, Huihui Zhao and Wei Wang\*

\*Address all correspondence to: doctorjuanwang@163.com

Beijing University of Chinese Medicine, Beijing, China

## **References**


[1] David L, Zsuzsanna H, Anna M, Ellamae S, Mayu S, et al. (2011) Molecular signatures of end-stage heart failure. J Cardiac Fail 17:868–872.

[2] Nuria F, Emili V, Montse C, Montse B, Miguel C-A, et al. (2016) Medical resource use and expenditure in patients with chronic heart failure: a population-based analysis of 88 195 patients. Eur J Heart Fail 18(9):1132–40.

[3] Bundkirchen A, Schwinger RHG (2004) Epidemiology and economic burden of chronic heart failure. Eur Heart J 6(suppl D):D57–D60.

[4] McMurray JJ, Pfeffer MA (2005) Heart failure. Lancet 365(9474):1877–1889.

[5] Krumholz HM, Chen YT, Wang Y, Vaccarino V, Radford MJ, et al. (2000) Predictors of readmission among elderly survivors of admission with heart failure. Am Heart J 139:72–77.

[6] Dickstein K, Cohen-Solal A, Filippatos G, McMurray JJV, Ponikowski P, et al. (2008) ESC Guidelines for the diagnosis and treatment of acute and chronic heart failure 2008: The Task Force for the Diagnosis and Treatment of Acute and Chronic Heart Failure 2008 of the European Society of Cardiology. Developed in collaboration with the Heart Failure Association of the ESC (HFA) and endorsed by the European Society of Intensive Care Medicine (ESICM). Eur Heart J 29:2388–2442.

[7] Marcinkiewicz-Siemion M, Ciborowski M, Kretowski A, Musial WJ, Kaminski KA (2016) Metabolomics – A wide-open door to personalized treatment in chronic heart failure? Int J Cardiol 219:156–163.

[8] Yunlun L, Jianqing J, Chuanhua Y, Haiqiang J, Jingwen X, et al. (2014) Oral Chinese herbal medicine for improvement of quality of life in patients with chronic heart failure: a systematic review and meta-analysis. Qual Life Res 23:1177–1192.

[9] Wang Z, Liu X, Ho RL, Lam CW, Chow MS (2016) Precision or personalized medicine for cancer chemotherapy: is there a role for herbal medicine. Molecules 21(7):889.

[10] Prasad S, Tyagi A (2015) Traditional medicine: the goldmine for modern drugs. Adv Tech Biol Med 03(1).

[11] Goodacre R (2010) An overflow of what else but metabolism! Metabolomics 6:1–2.


[24] Madsen R, Lundstedt T, Trygg J (2010) Chemometrics in metabolomics – a review in human disease diagnosis. Anal Chim Acta 659(1–2):23–33.

[25] Cantor GH (2011) Metabolomics and mechanisms: sometimes the fisher catches a big fish. Toxicol Sci. 118(2):321–3.

[26] Mamas M, Dunn WB, Neyses L, Goodacre R (2011) The role of metabolites and metabolomics in clinically applicable biomarkers of disease. Arch Toxicol 85:5–17.

[27] Sabatine MS, Liu E, Morrow DA, Heller E, McCarroll R, et al. (2005) Metabolomic identification of novel biomarkers of myocardial ischemia. Circulation 112:3868–3875.

[28] Dunn WB, Ellis DI (2005) Metabolomics: current analytical platforms and methodologies. Trac-Trend Anal Chem 24:285–294.

[29] Dunn WB, Broadhurst DI, Deepak SM, Buch MH, McDowell G, et al. (2007) Serum metabolomics reveals many novel metabolic markers of heart failure, including pseudouridine and 2-oxoglutarate. Metabolomics 3:413–426.

[30] MacIntyre D, Jimenez B, Lewintre EJ, Martín CR, Schäfer H, et al. (2010) Serum metabolome analysis by 1H-NMR reveals differences between chronic lymphocytic leukaemia molecular subgroups. Leukemia 24:788–797.

[31] Lin D, Hollander Z, Meredith A, Stadnick E, Sasaki M, et al. (2011) Molecular signatures of end-stage heart failure. J Card Fail 17:867–874.

[32] Kumps A, Duez P, Mardens Y (2002) Metabolic, nutritional, iatrogenic, and artifactual sources of urinary organic acids: a comprehensive table. Clin Chem 48:708–717.

[33] De Souza AI, Cardin S, Wait R, Chung YL, Vijayakumar M, et al. (2010) Proteomic and metabolomic analysis of atrial profibrillatory remodelling in congestive heart failure. J Mol Cell Cardiol 49:851–863.

[34] Pauly DF, Pepine CJ (2003) The role of carnitine in myocardial dysfunction. Am J Kidney Dis 41:S35–S43.

[35] Oka T, Itoi T, Terada N, Nakanishi H, Taguchi R, et al. (2008) Change in the membranous lipid composition accelerates lipid peroxidation in young rat hearts subjected to 2 weeks of hypoxia followed by hyperoxia. Circ J. 72(8):1359–66.

[36] Turer AT, Stevens RD, Bain JR, Muehlbauer MJ, van der Westhuizen J, et al. (2009) Metabolomic profiling reveals distinct patterns of myocardial substrate use in humans with coronary artery disease or left ventricular dysfunction during surgical ischemia/reperfusion. Circ J 119:1736–1746.

[37] Russell R (1991) Changes in citric acid cycle flux and anaplerosis antedate the functional decline in isolated rat hearts utilizing acetoacetate. J Clin Invest 87:384.

[38] Zimmerli LU, Schiffer E, Zürbig P, Good DM, Kellmann M, et al. (2008) Urinary proteomic biomarkers in coronary artery disease. Mol Cell Proteomics 7:290–298.

[39] Desai AS, Claggett B, Pfeffer MA, Bello N, Finn PV, Granger CB, McMurray JJV, Pocock SS (2013) Influence of hospitalization for cardiovascular versus noncardiovascular reasons on subsequent mortality in patients with chronic heart failure across the spectrum of ejection fraction. Circulation Heart Journal, 34(suppl 1):284.

cachectic and noncachectic heart failure patients: relationship with neurohormonal and inflammatory biomarkers. Metab: Clin Exp 61(1):37–42.



[51] Wang X, Aihua Z, Hui S (2015) Power of metabolomics in diagnosis and biomarker discovery of hepatocellular carcinoma. Hepatology 5:2072–2077.

[52] Dong FX, Huang D, He LG, Jia W (2008) Research on urine metabolomics in chronic kidney disease Ⅲ with kidney-yang deficiency. China J Trad Chin Med Pharm 12(23):1110–1113.

## **A Metabolomics Approach to Metabolic Diseases**

Luis Aldámiz-Echevarría, Fernando Andrade, Marta Llarena and Domingo González-Lamuño

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/65469

#### **Abstract**

Metabolomics, defined as the comprehensive analysis of compounds in a biological specimen, is an emerging technology that can reveal new biomarkers for several pathologies. Metabolic diseases comprise a group of rare conditions that together represent an important health problem. Historically, small numbers of metabolites have been used to diagnose complex conditions such as diabetes or other metabolic diseases. Thanks to the evolution of clinical chemistry technologies, metabolomic methods can now detect thousands of organic compounds. The metabolomic approach thus provides information on the metabolic pathways and the physiopathology underlying a disease, including the possibility of discovering new markers that could be used for diagnosis or to check the efficacy of treatments. Diabetes, a classic inborn error of metabolism (methylmalonic aciduria), lysosomal diseases and a rare optic neuropathy affecting adults are discussed in this chapter.

**Keywords:** metabolic diseases, diabetes, optic neuropathy, lysosomal diseases

## **1. Introduction**

Rare metabolic diseases comprise a group of more than 7000 conditions that are individually rare but together represent an important health problem. There is no single, internationally accepted definition of a rare metabolic disease; one definition currently in use is a syndrome that occurs in one child per 200,000 births. In Europe, they are estimated to affect 7% of the population. As fewer than 500 metabolic diseases currently have an available and effective treatment, new therapeutic solutions should be developed.

New approaches and initiatives are necessary to advance research for patients suffering from an inborn error of metabolism, because governments pay little attention to these diseases owing to their costs and low incidence. A prompt and accurate diagnosis is essential for patients even when no treatment exists, since it reduces overdiagnosis and, when followed by suitable care, can improve their quality of life.

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The inherited metabolic diseases are biochemical defects diagnosed routinely by neonatal screening programmes. This successful screening methodology continues to broaden and improve, and new biomarkers are added depending on the country.

Metabolomics is an "omics" science focused on the dynamic biology governed by organic metabolites. A metabolome is defined as the group of metabolites detected in, and acting in, the same metabolic pathway during the normal functioning of the cell. Metabolomics complements other analytical sciences such as proteomics and genomics. Physiological changes that result from a particular pattern of gene expression are characterised by a variation in metabolic compounds. Thus, the metabolome is more sensitive than the transcriptome or the proteome: a small variation in protein expression can have a significant effect on the activity of a metabolic pathway and on the concentration of the relevant metabolites.

In summary, the workflow for metabolomics applied to metabolic diseases starts with model design or patient selection (inclusion/exclusion criteria), followed by sample preparation and the separation and detection of metabolites by chromatography and mass spectrometry, respectively. The information obtained then has to be aligned and identified before the statistical analysis is carried out. The last step in this methodology is to obtain useful biomarkers for each inborn error of metabolism (**Figure 1**).

The aim of this chapter is to show the utility of metabolomics as applied to diabetes and inborn errors of metabolism. The latter have traditionally been difficult to diagnose, and their treatment uncertain, because of their low incidence. However, the metabolomic approach could provide a tool for discovering new and effective biomarkers for these diseases.

#### **1.1. Diabetes**

Diabetes, caused by several interactions between genetic and environmental factors, is a frequent disorder related to mutations in several genes, with each individual gene accounting for 1% of disease risk. Type 2 diabetes involves dysfunction of multiple organ systems, including impaired insulin action in muscle and adipose tissue, defective hepatic glucose production and insulin deficiency caused by loss of beta-cell mass and function. Understanding the molecular pathways involved is clearly challenging, but progress in this area may be aided by the recent advent of metabolomic technologies, including nuclear magnetic resonance and mass spectrometry.

Metabolomics could provide some advantages over other "omics" technologies in diabetes research: (1) the number of metabolites or small molecules found and identified is smaller than the number of genes or proteins, so pathway interpretation should be easier, and as effective as that obtained from genomics and proteomics; (2) metabolomics is a tool for describing the mechanisms of action and adverse effects of treatments. However, metabolomics also has limitations, which arise from misuse of the technology or overinterpretation of the data.


**Figure 1.** Experimental methodology involved in metabolomic studies of inborn errors of metabolism, where separation and detection phase could include a combination of gas chromatography, liquid chromatography and capillary electrophoresis with mass spectrometry.

Little information has been published about the metabolic control of diabetic patients or their evolution under treatment. For this reason, several profiles, such as acylcarnitines, amino acids and cardiovascular markers, among others, should be studied in the light of recent metabolomic research on diabetes [1, 2]. Moreover, mitochondrial dysfunction and altered beta-oxidation of fatty acids have been described in diabetes [3].

A characteristic of all types of diabetes is an alteration in energy metabolism due to the lack of insulin action. This change is reflected especially in the metabolism of carnitine (**Figure 2**), which plays a major role as initiator of the beta-oxidation of fatty acids in the mitochondria. Investigating the metabolic state of carnitine, its conjugates with fatty acids and the metabolic intermediates produced during beta-oxidation therefore seems essential for determining the degree of damage or the prognosis of these patients. Since diabetic patients have been reported to show carnitine deficiency and increased esterified carnitine [4], it is of interest to compare their acylcarnitine composition with that of a control group and to correlate it with the various complications expected during the course of the disease. These acylcarnitine profiles can suggest incomplete beta-oxidation of long-chain fatty acids, which can be corrected with treatment. Carnitine and acylcarnitines can be analysed after extraction with an organic solvent, either from plasma (removing proteins) or from dried blood spots (solid-liquid extraction).

As some diabetic patients may suffer from alterations in energy metabolism, the study of creatine levels could be important for indicating a possible deficit of creatine in muscle or brain. Creatine is phosphorylated to creatine phosphate, whose dephosphorylation generates ATP, which is essential in tissues with high energy consumption such as muscle and brain. An alteration of this metabolic pathway would lead to increased guanidinoacetate and to a deficit in the production of creatine and, therefore, of phosphocreatine.

**Figure 2.** Molecular structure of carnitine, extracted from plasma by removing proteins.

In the absence of specific metabolic biomarkers of disease evolution and treatment efficacy in diabetes, a metabolic profile should be obtained based on the amino acids that have been related to the methylation cycle and dementia (Met, Cys, Tau) in diabetes.

l-Arginine, the main substrate of endothelial nitric oxide synthase (NOS), is oxidised to l-citrulline and nitric oxide (NO). Asymmetric dimethylarginine (ADMA) is derived from the proteolysis of proteins and acts as a competitive inhibitor of NOS. It is related to blood pressure, glucose intolerance, carotid intima-media thickness, etc., which makes it a cardiovascular marker. Endothelial NO production has been observed to be reduced in diabetic patients at high cardiovascular risk, partly because of limited synthesis of the substrate arginine and partly because its inhibitor, ADMA, is increased. ADMA has therefore been proposed as a marker of cardiovascular risk in diabetic patients [5].
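The two effects described above (less substrate, more inhibitor) can be made explicit with the textbook rate law for competitive inhibition. The expression below is only an illustrative sketch: it assumes that NOS follows simple Michaelis-Menten kinetics with ADMA as a purely competitive inhibitor, and the symbols $v_{\mathrm{NO}}$, $V_{\max}$, $K_m$ and $K_i$ are introduced here for illustration and do not come from the studies cited.

```latex
v_{\mathrm{NO}} \;=\; \frac{V_{\max}\,[\mathrm{Arg}]}
{K_m\left(1 + \dfrac{[\mathrm{ADMA}]}{K_i}\right) + [\mathrm{Arg}]}
```

Under these assumptions, the rate of NO formation falls both when the arginine concentration decreases and when the ADMA concentration increases, because ADMA raises the apparent $K_m$ for arginine by the factor $1 + [\mathrm{ADMA}]/K_i$ without changing $V_{\max}$.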

Epigenetics may contribute to the metabolic study of diabetes through the global status of DNA methylation, which has scarcely been studied in this disease. Such methylation mechanisms may play a role in the aetiology of monogenic diabetes. We propose the quantification of 5-methylcytosine (5-mC) in order to study changes in the methylation of DNA/RNA, or its possible degradation, during treatment.

#### **1.2. Methylmalonic aciduria**

Methylmalonic aciduria (MMA) is an inherited disorder of the metabolism of certain amino acids and fatty acids, in which levels of methylmalonic acid are increased, mainly in urine. It is caused by an enzymatic deficiency of methylmalonyl-CoA mutase (MUT) in the mitochondria or by a defect in the uptake, transport or synthesis of cobalamin, the cofactor of MUT. The management of this organic aciduria is based on a low-protein diet. One of the critical problems of MMA patients is renal failure, which is not common in other organic acidurias. The cause of renal failure and cardiovascular complications depends on several pathways, and it is thought that impaired nitric oxide formation could cause endothelial dysfunction. The arginine-asymmetric dimethylarginine (ADMA) pathway, in which ADMA acts as an inhibitor of nitric oxide synthase, should therefore be characterised in these patients. ADMA levels are associated with an increase in cardiovascular events and chronic renal disease [6]. In addition, levels of its isomer symmetric dimethylarginine (SDMA), a renal biomarker, are also elevated in patients with chronic kidney disease.

A metabolic approach to evaluating cardiovascular biomarker status is recommended to improve the management of these patients. The objectives of this approach also include the identification and quantification of biomarkers of renal injury and inflammation, such as interleukins, tumour necrosis factor alpha (TNFα) and transforming growth factor beta (TGFβ), in patients at different stages of kidney damage. Defining biomarkers of early renal impairment in MMA patients would make it possible to follow changes in these parameters as kidney damage progresses.

The comparison of MMA subgroups at different stages of kidney damage would allow the identification of the markers of inflammation and tissue fibrosis involved in tubulointerstitial nephritis or primary glomerular injury, and would establish a chronology for the appearance of altered biomarker levels during the development of kidney damage. Finally, we believe that a greater understanding, by means of metabolomics, of the role of these molecules and of inflammatory processes in the associated nephropathy is vital to facilitate the development of novel targets and nephroprotective therapeutic strategies. Anti-inflammatory molecules or antioxidants could then be applied in clinical practice.

#### **1.3. Non-arteritic anterior ischemic optic neuropathy (NAION)**


The cause of optic neuropathy affecting elderly patients can be multifactorial and, in the case of non-arteritic anterior ischemic optic neuropathy (NAION), difficult to diagnose. Little information has been published about the metabolic pathways involved in this rare disease. The purpose of metabolomics here is therefore to suggest effective biomarkers and to describe the metabolic fingerprint of the disease.

Samples from patients and controls can be fingerprinted by liquid chromatography coupled to high-sensitivity mass spectrometry, such as quadrupole time of flight (QTOF). As shown in **Figure 1**, data should be filtered, aligned and statistically analysed before new compounds are identified. Compounds found to be significant for class separation can later be confirmed by acquiring their characteristic MS/MS spectra and comparing them with online databases, such as Metlin or the Human Metabolome Database. NAION patients presented differences in the phospholipid profile in comparison with controls, such as lower levels of lysoPCs.
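The filtering and statistical steps mentioned above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the studies discussed: the presence rule (keep a feature detected in at least half of the samples of at least one group), the use of a log2 fold change with Welch's t statistic, and all names and intensity values (`peak_table`, `feature_A`, `feature_B`) are hypothetical. Alignment is assumed to have been done already.

```python
import math
from statistics import mean, variance


def is_reliable(groups, min_presence=0.5):
    """Keep a feature if it was detected (not None) in at least
    min_presence of the samples of at least one group (assumed rule)."""
    return any(
        sum(v is not None for v in values) / len(values) >= min_presence
        for values in groups.values()
    )


def welch_t(a, b):
    """Welch's t statistic for two groups with possibly unequal variances."""
    return (mean(a) - mean(b)) / math.sqrt(
        variance(a) / len(a) + variance(b) / len(b)
    )


def compare(groups):
    """Return (log2 fold change patients/controls, Welch t) for one feature."""
    p = [v for v in groups["patients"] if v is not None]
    c = [v for v in groups["controls"] if v is not None]
    return math.log2(mean(p) / mean(c)), welch_t(p, c)


# Hypothetical aligned peak table: feature -> intensities per group.
peak_table = {
    "feature_A": {"patients": [80.0, 90.0, 100.0, 110.0],
                  "controls": [160.0, 180.0, 200.0, 220.0]},
    "feature_B": {"patients": [None, None, None, 50.0],
                  "controls": [None, None, None, None]},
}

filtered = {name: g for name, g in peak_table.items() if is_reliable(g)}
results = {name: compare(g) for name, g in filtered.items()}
for name, (log2_fc, t_stat) in results.items():
    print(f"{name}: log2 fold change = {log2_fc:.2f}, Welch t = {t_stat:.2f}")
```

In this toy table, `feature_B` is discarded because it was detected in only one sample, while `feature_A` is retained and shows a negative fold change, i.e. lower levels in patients. A real analysis would also need a missing-value strategy for retained features (`compare` requires at least two detected values per group) and a correction for multiple testing.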

LysoPCs and lysoPEs are products or metabolites of PCs and PEs (**Figure 3**), respectively, which are structural components of cell membranes. The structure of lysoPCs consists of a choline polar group linked to fatty acyls that differ in chain length, position and degree of saturation. These compounds were extracted from plasma obtained by centrifugation of blood samples; proteins were precipitated by adding 300 μL of cold methanol:ethanol (1:1) to the plasma. The samples were vortexed for 1 min and stored at −20 °C for 10 min. After centrifugation, the supernatant can be injected into the MS equipment. Increased levels of lysoPEs and reduced levels of lysoPCs can be a consequence of altered choline phospholipid metabolism. The whole lysoPC profile has been published as a biomarker in other diseases, although the type of lysoPC was not taken into account [7]. Because the lysoPC profile is a biomarker for other diseases, variations in some lysoPC isomers may not be enough to identify a specific disease. For this reason, the individual lysoPC levels produced by a given pathology should be investigated further. As our group has found, lysoPC profiling of plasma shows a characteristic pattern in NAION patients: higher levels of lysoPEs and lower levels of lysoPCs. As PCs are methylated products of PEs, one hypothesis to explain this result is a poor methylation capacity in optic nerve cells. The characteristic profile of lysoPCs and lysoPEs also suggests that the activity of phospholipases (PL) and lysophospholipases (LPL) could be increased [8].

**Figure 3.** Molecular structure of phosphatidylcholine (PC) and phosphatidylethanolamine (PE), the precursors of lysoPC and lysoPE.

NAION patients showed high levels of l-palmitoylcarnitine, which has the highest weight in the Partial Least Squares Discriminant Analysis (PLS-DA) prediction model (**Figure 4**). Nevertheless, these levels are not clearly related to the physiopathology of the disease, because the uptake of acylcarnitines by the mitochondria is a reversible process that serves both the mitochondrial β-oxidation of long-chain fatty acids and the transport of acyl-coenzyme A from mitochondria to cytosol. More studies are therefore necessary to confirm a disturbance in the β-oxidation of fatty acids. However, high levels of l-palmitoylcarnitine have been implicated in the pathology of ischemia, acting as an inhibitor of cardiac Na,K-ATPase. Because of its amphipathic nature, palmitoylcarnitine can induce alterations in membrane fluidity and surface during apoptosis in NAION patients [9]. In this case, the membrane disturbance takes place in the high-energy-consuming cells of the optic nerve, provoking severe blindness. This suggests that carnitine supplementation could increase the *in vivo* regeneration of optic nerve cells after a NAION event. Furthermore, given the profile identified by metabolomics and the physiopathology of NAION as a central nervous system problem, NAION appears to be more closely related to other ischemic processes, such as myocardial ischemia or stroke, than to neuropathological or ophthalmological syndromes.

**Figure 4.** Partial Least Squares Discriminant Analysis (PLS-DA) scores plot of plasma metabolic profiles obtained from NAION patients and controls.
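To make the notion of a variable's "weight" in a PLS-DA model more concrete, the sketch below computes the weights and sample scores of the first PLS component for a two-class problem. It is an illustration under stated assumptions only: the data are invented, the variables are autoscaled, the class is coded as 1 (patient) or 0 (control), and only the first latent component is computed, without the deflation, further components or cross-validation that a full PLS-DA model requires. The software and settings actually used for **Figure 4** are not stated in the text.

```python
import math


def autoscale(column):
    """Mean-centre a variable and scale it to unit variance."""
    n = len(column)
    m = sum(column) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in column) / (n - 1))
    return [(x - m) / sd for x in column]


def first_pls_component(X, y):
    """Weights and scores of the first PLS component (single response).

    X: list of samples, each a list of metabolite intensities.
    y: class labels coded as 1 (patient) and 0 (control).
    Assumes every variable has non-zero variance.
    """
    n, p = len(X), len(X[0])
    cols = [autoscale([row[j] for row in X]) for j in range(p)]
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    # The weight of each variable is proportional to its covariance
    # with the centred class label.
    w = [sum(col[i] * yc[i] for i in range(n)) for col in cols]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: projection of each sample onto the weight vector.
    t = [sum(cols[j][i] * w[j] for j in range(p)) for i in range(n)]
    return w, t


# Hypothetical intensities for three metabolites in three patients
# (first three rows) and three controls (last three rows).
X = [
    [10.0, 5.0, 3.0],
    [11.0, 4.0, 3.5],
    [12.0, 5.5, 2.5],
    [4.0, 5.0, 3.0],
    [5.0, 4.5, 3.2],
    [3.0, 5.2, 2.8],
]
y = [1, 1, 1, 0, 0, 0]

weights, scores = first_pls_component(X, y)
top = max(range(len(weights)), key=lambda j: abs(weights[j]))
print("variable with the highest weight:", top)
print("scores:", [round(s, 2) for s in scores])
```

In this toy example the first metabolite, the only one that differs clearly between the groups, receives the largest absolute weight, and the scores separate the two classes. This is the sense in which a single compound can dominate a scores plot.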

#### **1.4. Lysosomal diseases**


The lysosomal diseases are another group of inherited metabolic disorders, characterised by mutations in genes encoding lysosomal enzymes and proteins. The enzyme deficiencies cause impaired intracellular turnover and a build-up of complex molecules, including sphingolipids, glycosaminoglycans and glycoproteins. Metabolic approaches have established these metabolites as biomarkers. The pathology of lysosomal diseases is typically characterised by intra-lysosomal storage of a variety of substrates in multiple tissues and organs. Thus, the phenotypes of these disorders are complex and characterised by a variable association of visceral, skeletal and neurological manifestations.

Advances in the diagnosis and treatment of lysosomal diseases have been largely stimulated by improved knowledge of their molecular bases and pathophysiology. Metabolomics plays a major role in this progress by providing technologies that allow faster diagnosis. In Fabry disease, caused by deficiency of α-galactosidase A, the accumulation of glycosphingolipids in plasma and urine is used as a biomarker. Two specific biomarkers have been identified and quantified in plasma and urine: globotriaosylceramide (Gb3) and globotriaosylsphingosine (lyso-Gb3). The search continues for biomarkers that might be reliable indicators of disease severity and response to treatment.

Several metabolomic studies have been carried out on lysosomal diseases, such as Fabry [10] and Niemann-Pick [11] diseases, to establish efficient biomarkers. For other lysosomal diseases, such as the mucopolysaccharidoses, these studies could be useful for identifying an increase in a specific glycosaminoglycan in urine, such as dermatan sulphate, chondroitin sulphate, heparan sulphate or keratan sulphate [12].

## **2. Future progress**

Metabolomics is continuously improving its methodology so that new biomarkers can be obtained quickly, by means of advances in mass spectrometry and statistical software. It is therefore likely that little-studied rare diseases with no defined biomarkers will be studied with this methodology. The outstanding breakthrough, however, is expected to come from combining the information obtained by metabolomics, proteomics and genomics. These "omics" sciences are sometimes studied separately, but in the future all of their information should be considered together in order to understand how it connects across the whole metabolic pathway.

## **3. Conclusions**

There is clearly no single perfect tool for global metabolic profiling. However, metabolomics, which combines sensitive analytical techniques with multivariate analysis, seems to be a good approach for identifying biomarkers in diabetes and in rare metabolic diseases, such as methylmalonic acidemia or lysosomal diseases.

## **Acknowledgements**

Research and results in this manuscript were financed by the BioCruces and Carlos III Health Research Institutes.

## **Appendices and nomenclatures**

ADMA Asymmetric dimethylarginine

HMDB Human Metabolome Database

LysoPC Lysophosphatidylcholine

LysoPE Lysophosphatidylethanolamine

MMA Methylmalonic acidurias

NAION Non-arteritic anterior ischemic optic neuropathy

## **Author details**

Luis Aldámiz-Echevarría<sup>1,\*</sup>, Fernando Andrade<sup>1</sup>, Marta Llarena<sup>1</sup> and Domingo González-Lamuño<sup>2</sup>

\*Address all correspondence to: luisjose.aldamiz-echevarazuara@osakidetza.eus

1 Division of Metabolism, BioCruces Health Research Institute, Centre for Biomedical Research on Rare Diseases (CIBER-ER), Barakaldo, Bizkaia, Spain

2 Department of Nephrology and Metabolism, Marqués de Valdecilla University Hospital, Santander, Spain

#### **References**

[6] Lu TM, Chung MY, Lin CC, Hsu CP, Lin SJ. Asymmetric dimethylarginine and clinical outcomes in chronic kidney disease. Clin J Am Soc Nephrol 2011;6:1566-1572. doi: 10.2215/CJN.08490910

[7] Dong J, Cai X, Zhao L, Xue X, Zou L, Zhang X, Liang X. Lysophosphatidylcholine profiling of plasma: discrimination of isomers and discovery of lung cancer biomarkers. Metabolomics 2010;6:478-488. doi: 10.1007/s11306-010-0215-x

[8] Ciborowski M, Rupérez FJ, Martinez-Alcázar MP, Angulo S, Radziwon P, Olszanski R, Kioczko J, Barbas C. Metabolomic approach with LC-MS reveals significant effect of pressure on Diver's plasma. J Proteome Res 2010;9:4131-4137. doi: 10.1021/pr100331j

[9] Nalecz KA, Miecz D, Berezowski V, Cecchelli R. Carnitine: transport and physiological functions in the brain. Mol Aspects Med 2004;25:551-567. doi: 10.1016/j.mam.2004.06.001

[10] Boutin M, Auray-Blais C. Metabolomic discovery of novel urinary galabiosylceramide analogs as Fabry disease biomarkers. J Am Soc Mass Spectrom 2015;26(3):499-510. doi: 10.1007/s13361-014-1060-3

[11] Maekawa M, Shimada M, Ohno K, Togawa M, Nittono H, Iida T, Hofmann AF, Goto J, Yamaguchi H, Mano N. Focused metabolomics using liquid chromatography/electrospray ionization tandem mass spectrometry for analysis of urinary conjugated cholesterol metabolites from patients with Niemann-Pick disease type C and 3β-hydroxysteroid dehydrogenase deficiency. Ann Clin Biochem 2015;52:576-587. doi: 10.1177/0004563214568871

[12] Staples GA, Zaia J. Analysis of glycosaminoglycans using mass spectrometry. Curr Proteomics 2011;8:325-336. doi: 10.2174/157016411798220871

## **Chapter 11**

## **Metabolomics in Neonatology**

Mina H. Hanna and Patrick D. Brophy

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/66295

#### **Abstract**

Throughout recent decades, the incidence of preterm birth has risen worldwide, and although the majority of preterm neonates now survive infancy, many suffer from debilitating morbidities in the short term and/or increased disease risks in the long term. Traditional diagnostic biomarkers suffer from considerable confounders, limiting their use in the early identification of diseases. There is a need to develop novel biomarkers that can identify, in real time, the evolution of organ dysfunction in an early diagnostic, monitoring, and prognostic fashion. Use of "omics," particularly metabolomics, may provide valuable information regarding functional pathways underlying different pathologies and prediction of clinical outcomes. The emerging knowledge generated by the application of metabolomics in neonatology provides new insights that can help to identify markers of early diagnosis, disease progression, response to treatment, and new therapeutic targets. In this chapter, we review the current knowledge of different metabolomics technologies in neonatal‐perinatal medicine, including biomarker discovery, defining as yet unrecognized biologic therapeutic targets, and linking of metabolomics to relevant standard indices and long‐term outcomes.

**Keywords:** metabolomics, biomarkers, personalized healthcare

## **1. Introduction**

"Omics" refers to the collective technologies used to explore the roles, relationships, and actions of the various types of molecules that make up the phenotype of an organism. The complexity and adaptiveness of living systems can be read through self‐organized, highly interconnected networks whose interacting components are dynamically coordinated in hierarchical patterns. Systems biology is a scientific discipline that endeavors to quantify all of the molecular elements of a biological system, to assess their interactions, and to integrate that information into network models. Therefore, systems biology reflects the knowledge acquired by omics in a meaningful manner [1].

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

From genes to metabolites, the omics technologies have progressed significantly in the medical field over the last decade owing to remarkable advances in laboratory methodologies and analytical tools. In this chapter, we discuss the current knowledge of metabolomics technologies in neonatal‐perinatal medicine, including biomarker discovery, defining as yet unrecognized biologic therapeutic targets, and linking of metabolomics to relevant standard indices.

## **2. Metabolomics technologies**

The two major methodologies applied in metabolomics are nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS). Both techniques can deliver data with high sensitivity, selectivity, and throughput and with a high degree of reproducibility [2]. NMR spectroscopy is a quantitative, nondestructive, reproducible technique that provides detailed information on solution‐state molecular structures, based on atom‐centered nuclear interactions. The advantage of NMR is that it uses the magnetic properties of atomic nuclei, delivering simultaneous information on both the structure and the molecular mobility of metabolites without the need for preselection of analytical parameters or sample derivatization procedures. However, sensitivity is a limiting factor, and metabolite concentrations in the range of 1–10 μmol/L are often required for detection and quantification by NMR [3]. Mass spectrometry analytical platforms tend to have much higher sensitivity, enabling extensive assessment of different metabolites in biological fluids or tissue samples. **Figure 1** illustrates the metabolomics workflow.

The first step is sample collection. Consistency in collection and processing through standard operating procedures is important to avoid iatrogenic biases. Variables to consider in this step include (1) circadian variation and time of collection during the day, (2) nutritional impact, and (3) gestational age at birth and postnatal days of life. Following collection, samples may be stored for extended periods of time; however, metabolite stability over time should be part of quality control measurements in conjunction with analytical variability. Prior to analysis, the samples have to be extracted into a suitable solvent and separated by chromatography, most commonly gas or liquid chromatography (GC or LC), followed by ionization in a fluid or matrix. Metabolites are subsequently identified using a mass spectrometer on the basis of their mass‐to‐charge ratio (*m*/*z*) and their representation in the spectrum. Metabolite identification by MS is destructive and relies either on the measurement of molecular mass (indicative of the molecular formula) or on the collection of fragmentation mass spectra (indicative of molecular structure). Therefore, this technology has the advantage of being able to identify novel metabolites not previously described in databases. On the other hand, ion suppression in complex biological samples limits the ability to quantify metabolites, because multiple analytes that are present in the ionization source at the same time interact with one another [4, 5].
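To make the mass‐to‐charge ratio concrete, the following minimal Python sketch computes *m*/*z* for a protonated ion in positive ionization mode. The function name `mz_of_ion` and the choice of glucose as an example are illustrative assumptions and are not taken from the studies cited here; protonation is only one of several possible ion types.

```python
# Illustrative sketch (assumption: the ion is formed by adding protons only).
PROTON_MASS = 1.007276  # mass of a proton in daltons (Da)


def mz_of_ion(neutral_mass: float, charge: int = 1) -> float:
    """Return m/z of the ion [M + nH]^n+ for a neutral monoisotopic mass M (Da)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    # Each added proton contributes its mass and one unit of charge.
    return (neutral_mass + charge * PROTON_MASS) / charge


# Example: glucose (C6H12O6), monoisotopic mass about 180.06339 Da.
glucose_mz = mz_of_ion(180.06339)
print(f"[M+H]+ of glucose: {glucose_mz:.4f}")
```

A doubly charged ion of the same molecule would appear at roughly half this value, which is why charge state has to be known before a molecular mass can be inferred from a peak.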

Metabolome characterization can be performed in a targeted manner or in a nontargeted (pattern‐recognition) manner. The targeted method involves identification and quantification of specific metabolites in a given biofluid or tissue extract by comparing the spectrum of interest to a library of reference spectra of pure compounds. This approach may suffer from an inherent bias, as it captures only a part of the metabolome. Alternatively, the global nontargeted approach serves as an unbiased, hypothesis‐generating tool that runs as a first screening assay in clinical biomarker discovery studies, followed by targeted analysis of the metabolites that show significant differences or changes. The global pattern‐recognition method can also screen for a multitude of key compounds in specific metabolic pathways, which provides valuable information for metabolic fingerprinting.

**Figure 1.** Metabolomics workflow.

The vast amount of data generated by metabolomics methods provides a unique opportunity to investigate alterations in metabolic pathways in response to changes in the cellular environment and/or disease conditions. However, the high complexity of these data makes data analysis challenging and requires careful use of statistical methodologies and computational tools for efficient data visualization and analysis. Metabolic pathway analysis integrates the identified metabolites into metabolic correlation networks in order to better understand the complex relationships among various metabolites. It therefore allows researchers to correlate observed chemometric changes to the underlying pathological mechanisms.
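As a rough illustration of what a metabolic correlation network is, the sketch below links two metabolites whenever the Pearson correlation of their levels across samples is strong. The metabolite names, the intensity values, and the threshold of 0.8 are hypothetical choices made for this example; real pathway analysis tools use more elaborate statistics and curated pathway databases.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sd_x * sd_y)


def correlation_network(profiles, threshold=0.8):
    """Return edges (name_a, name_b, r) for pairs with |r| >= threshold."""
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if abs(r) >= threshold:
            edges.append((a, b, round(r, 3)))
    return edges


# Hypothetical intensities of three metabolites measured in four samples.
profiles = {
    "metabolite_A": [1.0, 2.0, 3.0, 4.0],
    "metabolite_B": [2.0, 4.0, 6.0, 8.0],
    "metabolite_C": [4.0, 1.0, 3.0, 2.0],
}
edges = correlation_network(profiles)
print(edges)
```

In this toy data set only the A and B profiles rise and fall together, so they form the single edge of the network; metabolite C remains unconnected.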

## **3. Metabolomics in neonatology**

#### **3.1. Preterm birth and postnatal maturation**

Preterm birth represents the aggregation of heterogeneous phenotypes; it is a complex disorder caused by multifactorial influences and the interplay of numerous risk factors.

Metabolomic profiling of amniotic fluid was able to distinguish patients who delivered at term from patients who delivered preterm. A decrease in carbohydrates was associated with preterm delivery in the presence or absence of inflammation, whereas an increase in amino acid metabolites was a unique feature of preterm labor with inflammation [6].

Wilson et al. examined the associations between the degree of prematurity and the levels of amino acids, enzymes, and endocrine markers in a large cohort of infants. They concluded that children at different stages of prematurity are metabolically distinct [7]. Similarly, Atzori et al. found distinct urinary metabolic profiles in neonates of different gestational ages, suggesting that gestational age has a strong effect on the metabolic profile of neonates and that this technology may predict the post‐maturation of preterm and term neonates [8]. Furthermore, metabolomic analysis showed significant alterations in three metabolic pathways between neonates with intrauterine growth restriction (IUGR) and controls: (1) arginine and proline; (2) urea cycle; and (3) glycine, serine, and threonine [9].

#### **3.2. Maternal chorioamnionitis and preeclampsia**

The application of metabolomics methods has shown a clear distinction between preterm infants born to mothers with histological chorioamnionitis (HCA) and those born to mothers without HCA. The discriminating metabolites were mannitol, 4‐hydroxyphenylacetate, p‐cresol, myo‐inositol, trimethylamine‐N‐oxide, and 1‐methylnicotinamide [10]. Similarly, metabolomics has the potential to identify changes under clinical conditions, such as preeclampsia (PE), that are associated with placental molecular pathophysiology. Heazell et al. demonstrated that placental tissue from uncomplicated pregnancies cultured in 1% oxygen (hypoxia) had metabolic similarities to explants from preeclamptic pregnancies cultured at 6% oxygen (normoxia). This group of metabolites includes prostaglandins, a number of long‐chain fatty acids, and several amino acids [11]. Metabolic footprinting offers a hypothesis‐generating strategy to investigate factors absorbed by and released from the placenta. Horgan et al. analyzed the metabolic footprint of placental villous explants cultured at different oxygen tensions, comparing explants from women who delivered a small for gestational age (SGA) baby with those from normal controls. On a univariate level, SGA explant media cultured under hypoxic conditions exhibited the same metabolic signature as control media cultured under normoxic conditions for 49% of the metabolites of interest, suggesting that SGA tissue is acclimatized to hypoxic conditions *in vivo* [12].

#### **3.3. Respiratory distress syndrome and bronchopulmonary dysplasia**

Respiratory distress syndrome (RDS), formerly also known as hyaline membrane disease, is a common problem in preterm newborn infants. Surfactant deficiency or inactivation is a major contributing factor in the development of RDS. Metabolic profiling of bronchoalveolar lavage fluid (BALF) is a promising tool for assessing novel biomarkers of RDS in preterm infants. GC‐MS‐based metabolomic analysis revealed 10 metabolites that are overexpressed in BALF collected during mechanical ventilation following surfactant administration [13].

Bronchopulmonary dysplasia (BPD) is the most common chronic lung disease in infants, with a multifactorial pathogenesis arising from a complex interaction between genetic and environmental factors. Comparing the urinary metabolic profiles of preterm neonates at birth, Fanos et al. found five discriminant metabolites: lactate, taurine, trimethylamine‐N‐oxide (TMAO), and myo‐inositol (which were increased in BPD patients) and gluconate (which was decreased) [14]. The increase in urinary lactate in the BPD group may represent a process of anaerobic respiration. Taurine and TMAO have an essential biological role in osmoregulation and membrane stabilization. Additionally, taurine has essential roles in calcium homeostasis, renal cell cycle and apoptosis, nerve cell activity, and detoxification [15]. The data emerging from this study provide better insights into the pathophysiological mechanisms of BPD development.

#### **3.4. Hypoxic ischemic encephalopathy**


Hypoxic ischemic encephalopathy (HIE) is a complex neurological injury, characterized by biphasic depletion in high energy phosphates, with an estimated incidence of two per 1000 deliveries. Walsh et al. performed metabolomic analysis on umbilical cord blood from newborns divided into three groups: those with confirmed HIE (*n* = 31), asphyxiated infants without encephalopathy (*n* = 40), and matched controls (*n* = 71). Targeted metabolomic analysis showed a significant alteration between study groups in 29 metabolites from 3 distinct classes (amino acids, acylcarnitines, and glycerophospholipids). A logistic regression model using five metabolites clearly delineated the severity of asphyxia and classified HIE infants with an area under the curve (AUC) of 0.92 [16].
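For readers less familiar with the AUC, the sketch below shows one common way to compute it: as the probability that a randomly chosen case receives a higher model score than a randomly chosen control. The scores are made‐up values used purely to demonstrate the calculation and bear no relation to the data of Walsh et al.; the function name `roc_auc` is likewise an assumption of this example.

```python
def roc_auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (case, control) pairs in which the case has
    the higher score; tied pairs count as half.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted probabilities from a classification model.
case_scores = [0.9, 0.7, 0.35]
control_scores = [0.4, 0.3, 0.2]
auc = roc_auc(case_scores, control_scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to a model that ranks cases and controls no better than chance, whereas a value of 1.0 means every case scores above every control.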

#### **3.5. Necrotizing enterocolitis/late onset sepsis**

Necrotizing enterocolitis (NEC) and late onset sepsis (LOS) are the leading causes of death among preterm infants. Stewart et al. compared the serum proteomic and metabolomic profiles longitudinally in preterm infants with NEC or LOS, matched to controls. While no single protein or metabolite was detected in all NEC or LOS cases and absent in controls, several proteins were identified that were associated with disease status. The expression of these proteins generally varied between diseased infants, potentially relating to differing pathophysiology of disease [17]. Similarly, Wilcock et al. found metabolomic differences in preterm babies at risk of NEC. However, sample sizes were insufficient to confidently identify a biomarker. Additionally, network modeling of preterm and term metabolomes suggested possible nutritional deficiency and altered pro‐insulin action in preterm babies [18].

#### **3.6. Neonatal kidney injury**

Acute kidney injury (AKI) is common in neonates undergoing cardiac surgery and is associated with increased mortality and ICU length of stay [19]. Mass spectrometry‐based metabolomics was used in a prospective cohort of pediatric cardiac surgery patients (*n* = 40). Twenty‐one of these children developed AKI, defined as an increase in serum creatinine concentration of 50% or greater from baseline after 48–72 h. Homovanillic acid sulfate (HVA‐SO<sub>4</sub>), a dopamine metabolite, was identified as a marker indicating AKI with 90% sensitivity and 95% specificity using a cutoff value of 24 ng/ml at 12 h after surgery [20]. Atzori et al. showed a correlation between urinary metabolic profiles and neutrophil gelatinase‐associated lipocalin (NGAL) concentration in a cohort of young adults born with extremely low birth weight (ELBW), using partial least‐squares discriminant analysis [21].
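The sensitivity and specificity of a biomarker at a given cutoff can be computed as in the following minimal sketch. The concentrations and outcome labels are hypothetical and are not the data of the cited study; the sketch also assumes, for illustration, that values at or above the cutoff are called positive.

```python
def sensitivity_specificity(values, has_disease, cutoff):
    """Return (sensitivity, specificity) of the rule 'value >= cutoff'."""
    tp = fn = tn = fp = 0
    for value, diseased in zip(values, has_disease):
        test_positive = value >= cutoff
        if diseased and test_positive:
            tp += 1  # true positive
        elif diseased:
            fn += 1  # false negative
        elif test_positive:
            fp += 1  # false positive
        else:
            tn += 1  # true negative
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one healthy subject")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical marker concentrations (ng/ml) and outcomes (True = AKI).
values = [30.0, 26.0, 20.0, 10.0, 25.0, 12.0]
has_disease = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(values, has_disease, cutoff=24.0)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Raising the cutoff generally trades sensitivity for specificity, which is why a reported cutoff is only meaningful together with both figures.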

#### *3.6.1. Drug‐induced nephrotoxicity*

Nephrotoxic medications are becoming increasingly recognized as a common and potentially modifiable cause of AKI in neonates. In a single‐center retrospective cohort, 87% of very low birth weight (VLBW) infants were exposed to at least one nephrotoxic medication, and on average these neonates were exposed to 14 days of nephrotoxic medications during their NICU stay [22]. Early identification of renal injury through omics technologies involves defining different biomarkers that rely on the mechanisms of toxicity of each drug or drug class [23]. In our experimental study, gentamicin‐induced acute kidney injury in newborn rats resulted in a distinct urinary metabolic profile characterized by glucosuria, phosphaturia, and aminoaciduria that preceded changes in serum creatinine. Additionally, lower levels of kynurenic acid were noted in the urine of gentamicin‐injected rats, coinciding with higher levels of tryptophan, suggesting a degrading effect of gentamicin toxicity on the tryptophan metabolism pathway [24]. Xu et al. applied integrated pathway analysis and metabolite‐transcript correlation analysis to define perturbed biochemical pathways and molecular functions that may be relevant to the mechanisms of nephrotoxicity. They concluded that transcriptional downregulation of the luminal sodium‐dependent transporters SLC5A1, SLC5A2, SLC6A18, and SLC16A7 might be the central mediator of drug‐induced kidney injury and adaptive response pathways. The integrated pathway analysis performed in these studies indicates that cisplatin‐ or gentamicin‐induced renal Fanconi‐like syndromes, manifested by glucosuria, hyperaminoaciduria, lactic aciduria, and ketonuria, might be better explained by the reduction of functional proximal tubule transporters than by the perturbation of metabolic pathways inside kidney cells [25].

An alternative approach involves the discovery of a limited number of biomarkers that identify injury specific to primary sites in the kidney, such as the glomerulus or the proximal tubule. A prospective observational trial showed that the urinary excretion of biomarkers that signify proximal tubular damage was higher in the gentamicin group than in the control group and preceded the peak in serum creatinine (SCr) and the decrease in urine output [26]. Applying different omics technologies to *in vitro* systems and preclinical models to predict nephrotoxicity allows testing of the safety and efficacy of novel therapies and enhances the development and implementation of new drugs.

Askenazi et al. demonstrated that urinary biomarkers can predict AKI and mortality in very low birth weight infants independent of gestational age and birth weight [27]. We found that urinary NGAL, osteopontin (OPN), and cystatin C (Cys C) increased significantly in infants who developed AKI; in contrast, urinary epidermal growth factor (EGF) and uromodulin (UMOD) decreased significantly in this group. Urinary biomarkers demonstrated a significant change 24 h before the contemporary creatinine‐based neonatal AKI definition was met [28]. It is particularly important to recognize the differences in omics biomarkers across different gestational ages, postnatal days, and fluid balance status when designing future validation studies [29–32].

## **4. Future directions**


Metabolomics approaches are currently being tested on different platforms in neonatology because of their unique ability to generate functional readouts of systems biology, setting the ground for future personalized prenatal, neonatal, and pediatric care. Yet translating this unprecedented amount of data into clinical practice for neonatal health care requires addressing the inherent interindividual variability [33]. Metabolomics has the greatest potential in the field of biomarker discovery because this technique defines the signature of the actual processes occurring within the body rather than merely examining compounds (such as untranscribed DNA or pre- or post-translationally modified proteins) that may be redundant to these processes. Although omics studies are currently mainly descriptive in nature, the goal is that, through integration of experimental approaches and computational modelling, better models for personalized health care delivery will be generated. The following stages delineate how to translate biomarker discovery association studies into clinical applications in a stepwise approach:


Experimental work in model systems and integration with other omics approaches are essential steps to provide insight into the pathophysiologic interactions between selected biomarkers and disease pathogenesis. Finally, large epidemiological cohort studies are needed to assess whether metabolomic biomarkers improve upon existing disease markers and to determine the validity of their application in different clinical settings.

#### **5. Summary**

The rapidly expanding field of metabolomics has been driven in recent years by advances in analytical methods. Metabolomics will have major implications for personalized health care in the future. After establishing metabolomic profiles in the neonatal population, the next step is metabolic fingerprinting. In such metabolomic investigations, the intention is not to identify each observed compound but to compare patterns or fingerprints of metabolites that change in response to disease or drug exposure. The combination of metabolic profiling and fingerprinting will lead to the maximum utilization of metabolomics. In one approach, changes in fingerprints correlating with metabolite profiles may be linked to a physiological or pathological state. As more quantitative metabolomic databases evolve, they can be integrated with data sets from the other "omics" technologies to enhance the data value and provide greater biological insight than any one "omics" technique alone can offer. The promise of this emerging technology lies in translational metabolomics for the identification of biomarkers, the monitoring of postnatal metabolic maturation, and the implementation of tailored management of neonatal disorders.

#### **Author details**

Mina H. Hanna<sup>1</sup>\* and Patrick D. Brophy<sup>2</sup>

\*Address all correspondence to: mina.hanna@uky.edu

1 Department of Pediatrics, University of Kentucky, Lexington, KY, USA

2 Department of Pediatrics, University of Iowa, Iowa City, IA, USA

#### **References**

[1] Pesce F, Pathan S, Schena FP. From -omics to personalized medicine in nephrology: integration is the key. Nephrol Dial Transplant 2013;28:24–8.

[2] Lindon JC, Nicholson JK. Spectroscopic and statistical techniques for information recovery in metabonomics and metabolomics. Annu Rev Anal Chem (Palo Alto Calif) 2008;1:45–69.

[3] Schnackenberg LK, Beger RD. Monitoring the health to disease continuum with global metabolic profiling and systems biology. Pharmacogenomics 2006;7:1077–86.


[16] Walsh BH, Broadhurst DI, Mandal R, et al. The metabolomic profile of umbilical cord blood in neonatal hypoxic ischaemic encephalopathy. PLoS One 2012;7:e50520.

[17] Stewart CJ, Nelson A, Treumann A, et al. Metabolomic and proteomic analysis of serum from preterm infants with necrotising entercolitis and late-onset sepsis. Pediatr Res 2016;79:425–31.

[18] Wilcock A, Begley P, Stevens A, Whatmore A, Victor S. The metabolomics of necrotising enterocolitis in preterm babies: an exploratory study. J Matern Fetal Neonatal Med 2016;29:758–62.

[19] Alabbas A, Campbell A, Skippen P, Human D, Matsell D, Mammen C. Epidemiology of cardiac surgery-associated acute kidney injury in neonates: a retrospective study. Pediatr Nephrol 2013;28:1127–34.

[20] Beger RD, Holland RD, Sun J, et al. Metabonomics of acute kidney injury in children after cardiac surgery. Pediatr Nephrol 2008;23:977–84.

[21] Atzori L, Mussap M, Noto A, et al. Clinical metabolomics and urinary NGAL for the early prediction of chronic kidney disease in healthy adults born ELBW. J Matern Fetal Neonatal Med 2011;24 Suppl 2:40–3.

[22] Rhone ET, Carmody JB, Swanson JR, Charlton JR. Nephrotoxic medication exposure in very low birth weight infants. J Matern Fetal Neonatal Med 2014;27:1485–90.

[23] Bonventre JV, Vaidya VS, Schmouder R, Feig P, Dieterle F. Next-generation biomarkers for detecting kidney toxicity. Nat Biotechnol 2010;28:436–40.

[24] Hanna MH, Segar JL, Teesch LM, Kasper DC, Schaefer FS, Brophy PD. Urinary metabolomic markers of aminoglycoside nephrotoxicity in newborn rats. Pediatr Res 2013;73:585–91.

[25] Xu EY, Perlina A, Vu H, et al. Integrated pathway analysis of rat urine metabolic profiles and kidney transcriptomic profiles to elucidate the systems toxicology of model nephrotoxicants. Chem Res Toxicol 2008;21:1548–61.

[26] Jansen D, Peters E, Heemskerk S, et al. Tubular injury biomarkers to detect gentamicin-induced acute kidney injury in the neonatal intensive care unit. Am J Perinatol 2016;33:180–7.

[27] Askenazi DJ, Montesanti A, Hunley H, et al. Urine biomarkers predict acute kidney injury and mortality in very low birth weight infants. J Pediatr 2011;159:907–12 e1.

[28] Hanna M, Brophy PD, Giannone PJ, Joshi MS, Bauer JA, RamachandraRao S. Early urinary biomarkers of acute kidney injury in preterm infants. Pediatr Res 2016;80(2):218–23.

[29] Saeidi B, Koralkar R, Griffin RL, Halloran B, Ambalavanan N, Askenazi DJ. Impact of gestational age, sex, and postnatal age on urine biomarkers in premature neonates. Pediatr Nephrol 2015;30:2037–44.

[30] Askenazi DJ, Koralkar R, Levitan EB, et al. Baseline values of candidate urine acute kidney injury biomarkers vary by gestational age in premature infants. Pediatr Res 2011;70:302–6.


## *Edited by Jeevan K. Prasain*

Metabolomics: Fundamentals and Applications authoritatively presents the basic principles and applications of metabolomics. Topics covered in this book range from the analysis of metabolites from different biological sources to their data processing and statistical analysis. This book serves as a basic guide for a wide range of audiences, from those less familiar with metabolomics techniques to more experienced researchers seeking to understand complex biological systems from a systems biology approach.
