### **5. Ataxonomic differentiation of cyanobacterial strains on the basis of single-cell fluorescence spectra**

Fluorescence spectra have been used to classify phytoplankton populations since approximately the early 1970s [54, 55]. However, because of the generally low precision and poor availability of the instruments, the rate of species discrimination was relatively low. Recently, new attempts to discriminate among microalgae on the basis of absorption or fluorescence spectra have been reported [7, 10, 13, 15]. Again, in the published experiments only large algal groups with considerable differences in pigment composition could be separated successfully (e.g., cryptophytes, chlorophytes, and cyanobacteria). Moreover, all the authors pointed out that discrimination among cyanobacterial species is complex and ambiguous. Correct discrimination of cyanobacterial species on the basis of their fluorescence signature is usually hampered by variations in pigment composition within a single strain, which depend on the environmental conditions and the physiological state of the culture. These difficulties can be overcome by using single-cell fluorescence spectra instead of bulk ones and by recording 7–8 spectra at different excitation wavelengths for each cell instead of the usual one or two.

In the present investigation, 307 sets of 8 single-cell fluorescence spectra for 23 cyanobacterial strains belonging to 15 genera were analyzed. An optimal set of classification parameters was sought that is sufficient for determining the generic membership of cyanobacterial cells by means of mathematical statistics. The results of this study show that LDA and ANN are able to recognize cyanobacteria down to the species/strain level from the data recorded by means of CLSM. This implies that the classifier (LDA or ANN) is capable of defining a unique niche in a multiparameter space for each of the 23 cyanobacterial strains used in this investigation.
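To make the idea of "classification parameters" concrete, the following is a minimal sketch of how one feature vector per cell could be assembled from a set of 8 spectra. It is not the parameter set used in this study, which is not listed in this section: the choice of reading normalized intensities near the four characteristic emission wavelengths is an assumption made purely for illustration, and the function names and the toy spectrum are hypothetical.

```python
# Hypothetical feature extraction; the study's own 63 parameters are not
# specified in this section, so this feature set is only an illustration.
EXCITATION_NM = [405, 458, 476, 488, 496, 514, 543, 633]  # laser lines from the text
PIGMENT_PEAKS_NM = [580, 656, 682, 715]  # PE, PC/APC, Chl a (PSII), Chl a (PSI)


def normalize_to_max(intensities):
    """Scale a spectrum so that its maximum intensity equals 1."""
    peak = max(intensities)
    if peak <= 0:
        raise ValueError("spectrum has no positive intensity")
    return [value / peak for value in intensities]


def intensity_near(wavelengths, intensities, target_nm):
    """Return the intensity at the sampled wavelength closest to target_nm."""
    index = min(range(len(wavelengths)),
                key=lambda i: abs(wavelengths[i] - target_nm))
    return intensities[index]


def cell_features(spectra):
    """Build one feature vector for a single cell.

    spectra maps an excitation wavelength (nm) to a pair
    (wavelengths, intensities) holding one emission spectrum.
    """
    features = []
    for excitation in EXCITATION_NM:
        wavelengths, intensities = spectra[excitation]
        normalized = normalize_to_max(intensities)
        for peak_nm in PIGMENT_PEAKS_NM:
            features.append(intensity_near(wavelengths, normalized, peak_nm))
    return features


# Toy input (invented): the same triangular band centred at 680 nm
# is reused for every excitation line.
grid = list(range(550, 751, 10))
toy = [max(0.0, 1.0 - abs(w - 680) / 50.0) * 200.0 for w in grid]
cell = {excitation: (grid, toy) for excitation in EXCITATION_NM}
print(len(cell_features(cell)))  # 32 values: 8 excitation lines x 4 peaks
```

In this sketch each cell yields 32 numbers; a real parameter set could add peak ratios, band widths, or shoulder positions, which is one way a larger count of parameters might arise.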

The results of LDA, evaluated over 63 parameters extracted from 307 sets of single-cell fluorescence spectra, are presented in **Figure 7** as 3D plots in the space of canonical discriminant functions. The species are clearly well separated. Moreover, closely related genera (e.g., *Spirulina* and *Oscillatoria*; *Synechococcus* and *Chlorogloea*; *Microcystis*, *Synechocystis*, and *Myxosarcina*) appear close to each other. Genera that include several strains, such as *Leptolyngbya*, *Geitlerinema*, and *Oscillatoria*, form large groups. However, single strains can also be discriminated inside these groups, as demonstrated in the right panel, which shows the enlarged region 1. This is confirmed by the classification diagram plotted in **Figure 7c**. The classification accuracy in the presented example was about 97.4%. The high accuracy is due to the fact that LDA works with the distribution functions of the classification parameters and their statistical characteristics, which makes it possible to build a good classification model.
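The principle behind LDA can be illustrated with a deliberately reduced sketch: a two-class Fisher discriminant in two dimensions, written in plain Python. This is not the analysis performed in the study, which is multiclass and uses 63 parameters with several canonical functions; the data points, class labels "a" and "b", and all function names below are invented for illustration, and equal class priors are assumed.

```python
def mean_vector(points):
    """Mean of a list of 2D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter_matrix(points, center):
    """Sum of outer products of deviations from the class mean (2x2)."""
    sxx = sxy = syy = 0.0
    for x, y in points:
        dx, dy = x - center[0], y - center[1]
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
    return [[sxx, sxy], [sxy, syy]]


def project(w, point):
    """Project a 2D point onto the discriminant direction w."""
    return w[0] * point[0] + w[1] * point[1]


def fit_fisher_lda(class_a, class_b):
    """Return (w, threshold) for a two-class Fisher discriminant in 2D."""
    mean_a, mean_b = mean_vector(class_a), mean_vector(class_b)
    sa = scatter_matrix(class_a, mean_a)
    sb = scatter_matrix(class_b, mean_b)
    # Pooled within-class scatter matrix.
    s = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = [mean_b[0] - mean_a[0], mean_b[1] - mean_a[1]]
    # Discriminant direction: w = S^-1 (mean_b - mean_a).
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Decision threshold halfway between the projected class means
    # (assumes equal priors).
    threshold = 0.5 * (project(w, mean_a) + project(w, mean_b))
    return w, threshold


def predict(model, point):
    """Assign a point to class 'a' or 'b'."""
    w, threshold = model
    return "b" if project(w, point) > threshold else "a"


def accuracy(model, class_a, class_b):
    """Fraction of points assigned to their own class."""
    hits = sum(predict(model, p) == "a" for p in class_a)
    hits += sum(predict(model, p) == "b" for p in class_b)
    return hits / (len(class_a) + len(class_b))


# Invented toy data: two well-separated groups of "cells".
group_a = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
group_b = [(4.0, 4.5), (4.5, 4.0), (5.0, 5.2), (4.2, 4.8)]
model = fit_fisher_lda(group_a, group_b)
print(accuracy(model, group_a, group_b))
```

The classification accuracy reported in the text is the same kind of quantity as the value returned by `accuracy` here: the share of observations assigned to their own class.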

In the legend, all cyanobacterial strains used are named and numbered according to the CALU collection. Solid curves bound the regions occupied by the seven strains (*Anabaena variabilis* Kutz. sp. CALU 824, *Geitlerinema* sp. CALU 1315, *Myxosarcina chroococcoides* sp. CALU 601, *Nostoc* sp. CALU 1763, *Spirulina platensis* (Nordst.) sp. CALU 550, *Synechococcus* CALU 756, and *Synechocystis aquatilis* sp. CALU 1336) used for testing the ANN classifier (in the legend they are indicated in red).

#### **Figure 7.**

*The results of linear discriminant analysis. (a) and (b) Observations in the 3D space of the first three maximal canonical discriminant functions (roots 1, 2, and 3). Solid curves bound the regions occupied by the seven new species. In the legend, new species are indicated in red. (c) Classification diagram for 23 cyanobacterial strains from the CALU collection. Red dots indicate false results.*

*Microalgae - From Physiology to Application*

microscope (CLSM) Leica TCS-SP5, which are placed near each set. Each spectrum in the set was obtained using a different laser line for excitation: 405, 458, 476, 488, 496, 514, 543, or 633 nm. The corresponding excitation wavelengths are given over each spectrum. All spectra are normalized to the maximum intensity and shifted along the x-axis for convenience of observation. The 458 nm laser line mostly excites the in vivo fluorescence of Chl a in both photosystems, PSII and PSI, around 682 and 715 nm, respectively, and the emission spectrum of cyanobacterial cells shows no appreciable emission of PC or APC. In cyanobacteria, the 458 nm excitation is preferentially absorbed by PSI, which contains more Chl a than PSII and is stoichiometrically more abundant than PSII. However, because the reaction center of PSI turns over faster than that of PSII, PSI has a lower fluorescence intensity than the PSII antenna. This is indicated by the PSI emission band at 715 nm, which is much weaker than the PSII emission band at 682 nm. Excitation at intermediate (blue and green) wavelengths (405, 488, and 496 nm) reveals the fluorescence maxima of all photosynthetic pigments, as light in this range is absorbed by all pigment-protein complexes in almost equal portions and fluorescence is emitted at all steps of the energy transfer chain (**Figure 5**). Direct excitation of cells in the PE absorption region at 514 and 543 nm results in an emission spectrum with two main peaks at 580 and 656 nm, which are due to PE, PC, and APC emission; for species that lack PE, the emission is concentrated mostly near 656 nm. Two chlorophyll fluorescence components can be resolved for some species in a number of spectra. The spectra obtained with 633 nm excitation directly give a prominent emission band at 656 nm that originates from C-PC, without the band at 580 nm, which cannot be excited at 633 nm, even for species that have PE (see **Figure 6**). Other small emission bands, corresponding to the fine pigment structure of the antenna complex, are not resolved at room temperature.

These in vivo fluorescence emission spectra reflect the structure of the light-harvesting complex of the corresponding species and the correct or incorrect functioning of its energy transfer chain. Four characteristic wavelengths, corresponding to a fluorescence maximum or shoulder, can be easily distinguished: (1) peak near 580 nm corresponds to the fluorescence of phycoerythrin, (2) peak near 656–660 nm corresponds to the combined fluorescence of phycocyanin and allophycocyanin (they are indistinguishable at room temperature), (3) peak near 682 nm

#### **Figure 6.**

*Four characteristic sets of single-cell fluorescence spectra. The excitation wavelengths (405, 458, 476, 488, 496, 514, 543, and 633 nm) are given over the curves. All spectra are normalized to the maximum intensity and shifted along the x-axis for convenience of observation. The dashed lines indicate the fluorescence maxima of the individual pigments (PE, 580 nm; PC, 656 nm; Chl a, 682 and 715 nm).*

In the classification problem considered here, the quality of the ANN should be judged not only by the absolute value of the classification accuracy but also by the ability of the designed ANN to recognize and properly classify unknown species that did not participate in the training process. Thus, the performance of the ANN was first tested only on discriminating between 16 known cyanobacterial species (**Figure 8a**). The other seven strains were set aside as test strains to verify how well the ANN recognizes new strains (the so-called generalization quality). Analysis of a test set with data from the same monocultures confirmed that the parameters extracted from the sets of fluorescence spectra contain enough information to correctly identify cyanobacterial cells at the species/strain level. The trained neural network presented here did not show the highest rate of correct classification (only about 95.7%), but it showed the best recognition quality for new strains. The results of the ANN recognition are presented in **Figure 8b**.

Bar charts in **Figure 8a** represent the classification of 268 experimental measurements into 16 classes. Each bar represents the classification results as a probability distribution. Each color in the bar corresponds to 1 of the 16 target classes (known cyanobacterial strains). The percentage of each color in the bar shows the probability distribution of belonging to the target classes. The maximal eigenclass probability is indicated above each bar.

#### **Figure 8.**

*The results of ANN classification. (a) The results of recognition of 16 known cyanobacterial strains. (b) The results of recognition of seven unknown strains. Numbers over each bar indicate the maximal class probability for each strain. (c) General classification results. Red dots indicate false results. Strains are numbered according to CALU collection names.*

*Self-Fluorescence of Photosynthetic System: A Powerful Tool for Investigation of Microalgal…*

*DOI: http://dx.doi.org/10.5772/intechopen.88785*


In contrast to standard classifiers, a classifier built on the basis of an ANN has a so-called generalization ability. This means that the ANN is able to recognize new cyanobacterial strains that were previously unknown to it and to suggest possible variants of their generic affiliation to known classes. **Figure 8c** shows the ANN classification results for the 16 target classes and the 7 strains that were not present in the training set. The aim of the ANN classifier was to determine to which of the 16 known classes the 7 unknown strains could be attributed. The results of the ANN classification correlate well with the results predicted by LDA (**Figure 7**). The closely related strains in this case were 1763–666, 601–398, 756–1409, 1315–1718, 550–1416, 824–1817, and 1336–398 (in each pair, the first strain is unknown to the ANN, and the second is one of the nearest target classes). The ANN classifier relates strains 1336 *Synechocystis* and 601 *Myxosarcina* to the close genus *Microcystis* and strain 1315 *Geitlerinema* to *Geitlerinema*. For the remaining strains, it proposed possible classification options. Minor errors in the classification of strains 756 *Synechococcus*, 824 *Anabaena*, and 550 *Spirulina*, which the classifier relates to the genera *Synechococcus*, *Nostoc*, and *Oscillatoria*, respectively, can be explained by the fact that in the space of classification parameters they lie in the wide free regions between the groups of the known strains, at approximately equal distances from the 2 or 3 nearest ones (see **Figure 7**). Therefore, the ANN cannot make a correct decision. The false result of the ANN classifier for 1763 *Nostoc* may be due to an incorrect initial dataset or to false a priori information about the affiliation of strain 1763.
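The situation described above, in which an unknown strain lies at nearly equal distances from two or three known groups, shows up as a class probability distribution without a clear winner. The following sketch illustrates that reading of the bar charts under stated assumptions: the section does not describe the network's output layer, so the softmax conversion is an assumption, the class labels and scores are invented, and the `margin` value of 0.2 is an arbitrary illustrative threshold.

```python
import math


def softmax(scores):
    """Convert raw network outputs into a probability distribution."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_distribution(per_cell_probs):
    """Average the class probabilities over all cells of one strain."""
    n_cells = len(per_cell_probs)
    n_classes = len(per_cell_probs[0])
    return [sum(p[k] for p in per_cell_probs) / n_cells
            for k in range(n_classes)]


def summarize(distribution, class_names, margin=0.2):
    """Return (best class, its probability, ambiguous flag).

    The flag is True when the runner-up class is within `margin`
    of the winner, i.e. the strain sits between known groups.
    """
    ranked = sorted(range(len(distribution)),
                    key=lambda k: distribution[k], reverse=True)
    best, second = ranked[0], ranked[1]
    ambiguous = (distribution[best] - distribution[second]) < margin
    return class_names[best], distribution[best], ambiguous


# Invented example: three hypothetical target classes, two cells.
names = ["class_A", "class_B", "class_C"]
cells = [softmax([2.0, 1.8, 0.1]), softmax([1.9, 2.0, 0.0])]
print(summarize(mean_distribution(cells), names))
```

The maximal class probability printed above each bar in **Figure 8** corresponds to the second element of the tuple returned by `summarize`; the ambiguity flag is an addition of this sketch, not a feature reported in the study.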

To validate the neural network, the results of the ANN classification were compared with the results of the LDA. The neural network-based classification agrees well with the expected results and with the results of LDA. The identification performance of the network for cyanobacterial strains from the same species is slightly lower than for cells from different species, but such strains can still be distinguished well.

### **6. Conclusion**

The automation of cyanobacterial species differentiation is a key problem in both industrial biomass production and environmental monitoring. Unfortunately, none of the methods presently in use can be implemented in online monitoring procedures, for various reasons. In this work, an example of the use of LDA and ANN for online differentiation of cyanobacterial strains according to their in vivo single-cell fluorescence spectra is presented. The novel discrimination technique demonstrated here includes a strict procedure for recording and processing single-cell fluorescence emission spectra, which eliminates most of the usual data processing difficulties and, as a result, achieves a high classification accuracy. Since the initial information is obtained via fluorescence spectroscopy, the experimental data can be processed automatically. Moreover, owing to the use of CLSM microscopic spectroscopy instead of conventional fluorimetry, the initial data show less variation and can be accurately sorted. Undesirable and unpredictable influences are eliminated at the first step, when the fluorescence spectra are obtained. Since a noninvasive and nondestructive method is used, information about vital cell functions (e.g., light harvesting) can additionally be taken into account to obtain the desired precision of discrimination.

The universality of the technique makes it possible to use it for the investigation of any phytoplankton species irrespective of habitat or cultivation. Utilizing data from several fluorescence spectra instead of one yields more fingerprint information, which leads to taxonomic differentiation on a finer scale. The differentiation procedure presented here was carried out by means of statistical analysis on the basis of mathematical characteristics of the intrinsic fluorescence spectra of living single cells; therefore, it is free from the usual subjectivity that can occur with methods of direct optical microscopy. Moreover, the formalization of data processing gives a wide opportunity for automating the classification of cyanobacterial strains in field samples during online monitoring of water bodies.

Undoubtedly, the data set should be expanded to include more species and phytoplankton classes/divisions, grown under different nutrient and light conditions. However, this work already demonstrates the potential of discriminating phytoplankton classes by means of fluorescence microscopic spectroscopy. Combining knowledge of phytoplankton structure with taxon-specific measurements of photosynthetic activity and biochemical cell composition can lead to new models that increase the reliability of online monitoring.


**Author details**

Natalia Grigoryeva

St. Petersburg Scientific Research Centre for Ecological Safety of RAS, Saint-Petersburg, Russia

\*Address all correspondence to: renes3@mail.ru

© 2019 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

