4 Learning Disabilities - An International Perspective

common learning disorders in children. Approximately 5–7% of all children aged 4–12 years have these disorders [5]. In everyday life, a child with such a disorder has delayed speech skills compared with other children of the same age. Children with SLI fail to acquire their native language properly or completely despite having normal nonverbal intelligence and no hearing problems, known neurological dysfunctions, or behavioral, emotional or social problems [4]. These difficulties can disrupt children's social lives and separate them from their peers, which can create a specific social barrier. There is a relationship between the development of a child's language skills, age and success with treatment.

That SLI has a significant genetic component has been demonstrated in various heritability studies, for example, a study of genetic etiology, a study of twins and a study of family evaluations [6]. Another study found that SLI affected more boys than girls [5]. The disorder manifests primarily in the handling of the linguistic rules of derivation and inflection, resulting in incorrect syntactic structures in the child's native tongue. Furthermore, vocabulary development is reduced at early ages. Usually, language production in those with the disorder is worse than their language comprehension. Children with SLI can also have difficulties with nonlinguistic cognitive skills, for example, motor ability, mental rotation or executive functions [7]. Other difficulties can be associated with impairments in reading and problems with working memory [8–11]. Many studies have examined the underlying causes of the observed language difficulties. In these studies, theories of language acquisition as well as language representation and processing have been applied [4, 12]. The most frequently listed hypotheses for the causes of SLI are as follows:

(a) Slower linguistic processing despite relatively normal linguistic representation [4, 12];

(b) Normal linguistic and other cognitive skills with later timing in the onset or triggering of language acquisition processes, leading to developmental delay in language acquisition [13]; and

(c) Problems with grammar and specific subgroups of grammar, while cognitive abilities are relatively intact [14, 15].

The Laboratory of Artificial Neural Network Applications (LANNA) [16] at the Czech Technical University in Prague, with the participation of the R&D Laboratory at the Military Technical Institute, collaborates on a project with the Department of Paediatric Neurology, 2nd Faculty of Medicine of Charles University in Prague, and with the Motol University Hospital. The project focuses on children with SLI. A partial aim of this project is to obtain data about SLI and speech disorders using automatic utterance analysis by self-organizing neural networks. The goal of this research is to identify parameters that correlate with the results of diagnostic examinations (by several different specialists, for example, speech therapists, psychologists and neurologists, and from EEG and MRI tractography) and tests. LANNA uses methods based on computer speech analysis to determine whether children have SLI.

**2. Methods**

#### **2.1. Ethical statement**

The research complied with the ethical standards of the Ethics Committee of Motol University Hospital in Prague, Czech Republic. The parents of the participants were informed about the research and provided written informed consent for participation.

#### **2.2. Speech databases and participants**

To investigate speech problems in children with SLI, it was necessary to create a speech database, which was built by the LANNA research group [17]. The impetus for its creation came from cooperation with the Department of Paediatric Neurology in the 2nd Faculty of Medicine of Charles University in Prague and Motol University Hospital, which was supported by grants from IGA MZ CR (Science Foundation of the Ministry of Health of the Czech Republic). The database comprises three partial databases of speech recordings from the following speaker types: H-CH (children without speech disorders), SLI-CH I (children with SLI), and SLI-CH II (children with SLI at three degrees of diagnosis severity: mild, moderate and severe). This classification of degrees was based on the decisions of speech therapists and specialists from Motol Hospital.

A total of 54 native Czech participants from the SLI-CH II subgroup (hereafter referred to as *"cases"*) consisted of 35 boys and 19 girls, aged 70–131 months (mean age = 96 ± 16.3 months and median age = 94 months). All participants included in the study had to be examined by a clinical psychologist. The examinations were performed in the Department of Paediatric Neurology of the 2nd Faculty of Medicine of Charles University in Prague. The examination lasted all day, and the parents were present throughout. The participants (children) completed the following tests over one day: the Stanford-Binet Intelligence Test (Fourth Edition) [18]; the Gesell Developmental Diagnosis [19]; other standardized tests specialized for the Czech language (word differentiation and sound differentiation tests, auditory analysis and synthesis test); special graphomotor and perceptual skills tests; a test for visuomotor coordination; a test of figure drawing and tracing; and, finally, spontaneous talk evaluations [1, 2, 20, 21]. The inclusion criteria were the following: performance intelligence quotient (PIQ) ≥ 70; disturbed phonemic discrimination; and disturbed language at various levels, which included phonologic, syntactic, lexical, semantic and pragmatic levels [22]. The participants were also assessed by other specialists. Neurological examinations showed no abnormalities. Motor milestones were within normal ranges. None of the children had hearing impairments. None were receiving antiepileptic medications. No child was diagnosed with a pervasive developmental disorder or other dominating behavioral problem. None of the children had a history of language or other cognitive regression [22].

A total of 44 native Czech participants from the H-CH subgroup (hereafter referred to as the *"controls"*) with no history of neurological and/or communication disorders were recruited as a control group. There were 35 boys and 19 girls who were 70–124 months old (mean age = 106 ± 15.4 months and median age = 110 months). None of the controls underwent voice therapy.

All recordings, data and applications are stored on the server of the LANNA research group and are available to authorized users and to those who have access to the server of the LINDAT/CLARIN Centre for Language Research Infrastructure. The stored data contain no identifying information and are free to use for scientific purposes on the LINDAT/CLARIN server (http://hdl.handle.net/11372/LRT-1597) [23].

#### **2.3. Procedures and speaking tasks**

The selected utterances and first seven tasks, with the English translations of the original Czech utterances used in the current research, are listed in **Table 1**. Only words (a total of 38), no phrases or sentences, were chosen for inclusion from all suitable utterances.



Clinical psychologists and speech therapists collaborated on the selection of suitable utterances, and they formulated the test based on their own experience and on established tests. In this test, the participants repeated spoken utterances; repetition was necessary to ensure the same conditions for all participants because the younger children could not yet read. The full set comprised a range of words and phrases for a total of 68 different utterances. All utterances were previously described [17].

Only the participant and the speech therapist were present during each recording so that the participant's attention could be maintained. The recording procedure was as follows: the participant repeated the text after the speech therapist. The same conditions were used for both groups of participants (controls and cases). The recording equipment consisted of digital devices, specifically a digital Dictaphone from Sony Corporation (MD SONY MZ-N710) and an iBook laptop computer by Apple Inc., with professional solution software by Avid Technology, Inc. More information about the recordings of the H-CH and SLI-CH II subgroups can be found in a previously published study [17].

#### **2.4. Processing the recordings and the software used**

The following programs were used: Cool Edit Pro 2 [24] and Labelling [25, 26]. The Labelling program was used to segment the speech signal. It was written in the MATLAB programming environment as part of the SOMLab [26] programming system, which was developed at LANNA. The Statistics Toolbox in MATLAB [27] and R software [28] were used for statistical computing. R is a language and environment for statistical computing and graphics. It is a GNU project similar to the S language and environment, which was developed at Bell Laboratories by John Chambers and colleagues.

## **3. Error analysis: transcriptional analysis**

In this part of the chapter, a new method, called error analysis, is presented to identify cases based on the number of pronunciation errors in the utterances. Pronunciation requires the ability to distinguish the sounds of spoken language by hearing. The cases had a distinctly impaired ability to aurally differentiate phonemes, and they could not distinguish acoustically similar words. These problems occur in the perception and processing of verbal stimuli and in their storage in memory and recall, including memory learning, and they are related to acoustic-verbal processes. The analysis compared the words pronounced by the cases with the words pronounced by the controls, and it focused on the description of errors in individual words. The utterances of the cases included many more errors than those of the controls. These errors occurred across all age categories (our research included children aged 39–131 months).

#### **3.1. Description**

| Task code | Description | # Patterns | Language | Utterances |
|---|---|---|---|---|
| [T1] | Vowels | 5 | Czech | *"a – e – i – o – u"* |
| | | | English | *"a – e – i – o – u"* |
| [T2] | Consonants | 10 | Czech | *"m – b – t – d – r – l – k – g – h – ch"* |
| | | | English | *"m – b – t – d – r – l – k – g – h – ch"* |
| [T3] | Syllables | 9 | Czech | *"pe – la – vla – pro – bě – nos – ber – krk – prst"* |
| | | | English | *"pe – la – vla – for – bě – nose – take – neck – finger"* |
| [T4] | Two-syllable words | 5 | Czech | *"kolo – pivo – sokol – papír – trdlo"* |
| | | | English | *"wheel – beer – falcon – paper – boob"* |
| [T5] | Three-syllable words | 4 | Czech | *"dědeček – pohádka – pokémon – květina"* |
| | | | English | *"grandfather – fairy tale – Pokemon – flower"* |
| [T6] | Four-syllable words | 3 | Czech | *"motovidlo – televize – popelnice"* |
| | | | English | *"niddy noddy – television – dustbin"* |
| [T7] | Five-syllable words | 2 | Czech | *"různobarevný – mateřídouška"* |
| | | | English | *"varicoloured – thyme"* |

**Table 1.** List of the vocal tasks.

Three matrices, that is, the reference matrix *[RM]*, test matrix *[TM]* and confusion matrix *[CM]*, and two utterance parameters, that is, the utterance of the speech therapist *[ut1]* and the utterance of the participant *[ut2]* (see **Table 1**), comprise the basic input for this method. *RM* is defined as a square reference matrix of size *k* × *k*, where *k* is the number of phonemes in *ut1* or *ut2*, whichever is larger (see the following equations):

$$RM = \begin{pmatrix} rm_{11} & rm_{12} & rm_{13} & \dots & rm_{1k} \\ rm_{21} & rm_{22} & rm_{23} & \dots & rm_{2k} \\ rm_{31} & rm_{32} & rm_{33} & \dots & rm_{3k} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ rm_{k1} & rm_{k2} & rm_{k3} & \dots & rm_{kk} \end{pmatrix} \tag{1}$$

$$\sum_{i=1}^{k} rm_{ii} = k \tag{2}$$

The number of errors is obtained as a penalty score from comparing the phonemes from *ut1* and *ut2* (see the following equation):

$$PS = wp + up + mp \tag{3}$$

where *PS* is the penalty score, *wp* is the number of incorrect phonemes, *up* is the number of unspoken phonemes and *mp* is the number of missing phonemes. A detailed description of the error analysis and all algorithms is provided in Ref. [29]. The input data for error analysis were the recorded *ut1* and *ut2*, and the output from error analysis was a *PS* of the analyzed *ut2*. In simple terms, comparison of the *TM* and *RM* matrices generates the *CM* matrix. The *CM* matrix contains all information about the errors in *ut2*.
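To make the idea of a penalty score concrete, the following Python sketch compares a reference phoneme sequence with a participant's sequence. It is not the algorithm from Ref. [29]: it uses a standard Levenshtein alignment as a stand-in, and the function name `penalty_score` and the three counters (`substituted`, `omitted`, `extra`) are hypothetical. How these counters would map onto *wp*, *up* and *mp* is an assumption that the chapter does not settle.

```python
from typing import List, Tuple


def penalty_score(ut1: List[str], ut2: List[str]) -> Tuple[int, int, int, int]:
    """Align two phoneme sequences and count the differences.

    ut1 is the reference (speech therapist), ut2 the participant.
    Returns (substituted, omitted, extra, total); names are illustrative only.
    """
    n, m = len(ut1), len(ut2)
    # cost[i][j] = minimal number of edits turning ut1[:i] into ut2[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diff = 0 if ut1[i - 1] == ut2[j - 1] else 1
            cost[i][j] = min(
                cost[i - 1][j - 1] + diff,  # match or substitution
                cost[i - 1][j] + 1,         # reference phoneme not spoken
                cost[i][j - 1] + 1,         # extra phoneme spoken
            )
    # Walk back through the table to count each kind of error.
    substituted = omitted = extra = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            diff = 0 if ut1[i - 1] == ut2[j - 1] else 1
            if cost[i][j] == cost[i - 1][j - 1] + diff:
                substituted += diff
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            omitted += 1
            i -= 1
        else:
            extra += 1
            j -= 1
    return substituted, omitted, extra, substituted + omitted + extra


# Task T4 word "sokol" repeated as "sokl": one reference phoneme is not spoken.
print(penalty_score(["s", "o", "k", "o", "l"], ["s", "o", "k", "l"]))
```

Phonemes are passed as lists of strings so that Czech digraphs such as "ch" can be treated as a single unit.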

#### **3.2. Statistical evaluation and results**

The research data were divided into two groups, controls (*p\_h*) and cases (*p\_sli*). The Shapiro-Wilk test was used to check whether the data were normally distributed. The obtained *p*-values (*p\_h: W* = 0.9175, *p-val* = 0.00444; *p\_sli: W* = 0.83, *p-val* = 2.28e−06) were below the 0.05 significance level, so the hypothesis that the groups had a normal distribution was rejected. Wilcoxon's rank-sum test, a nonparametric substitute for the *t*-test, was therefore used. The obtained scores for *p\_h* vs. *p\_sli* (*w*) were as follows: *p-val* = 1.01e−15, *zval* = −8.3166 and *ranksum* = 963. The *p*-value was less than the significance level of 0.05; therefore, the null hypothesis of equal medians was rejected. These results indicate a statistically significant difference in the number of errors in the speech of the cases and controls.
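The rank-sum comparison can be illustrated with a short, dependency-free Python sketch. The chapter's statistics were computed with the MATLAB Statistics Toolbox and R; the code below is not that code. It assumes a two-sided test with the normal approximation, average ranks for ties, and no tie or continuity correction, so its numbers may differ slightly from those of a statistics package. The function name `rank_sum_test` and the sample values are made up for illustration and are not study data.

```python
import math
from typing import Sequence, Tuple


def rank_sum_test(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (rank sum of x, z statistic, p-value).
    Assumptions: average ranks for ties, no tie or continuity correction.
    """
    values = sorted(list(x) + list(y))
    # Assign each distinct value the mean of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(values):
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        rank_of[values[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    w = sum(rank_of[v] for v in x)
    n1, n2 = len(x), len(y)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the normal distribution
    return w, z, p


# Invented error counts, for illustration only.
controls = [0, 1, 1, 2, 3, 2, 0, 4]
cases = [7, 9, 12, 6, 15, 8, 11, 10]
w, z, p = rank_sum_test(controls, cases)
print(f"ranksum={w}, z={z:.3f}, p={p:.2e}")
```

A negative *z* here means that the first sample (controls) tends to have lower ranks, that is, fewer errors, than the second.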

The results of the analyses of utterance errors are displayed in **Figure 1**, which presents all participants included in our current study.

Data from the controls are displayed in blue (or at a lower position), and data from the cases are displayed in red (or at a higher position) or in grayscale. Pronunciation errors are displayed in the upper graph, where a higher value indicates more errors. The total number of errors in the utterances of the cases was much greater than the number of errors for the controls. The distributions of utterance errors of the controls and cases are displayed in the middle graph. The distribution of utterance errors of the controls was clustered around lower values than the distribution of the cases. Box plots representing the distributions of utterance errors of the controls and cases show clear differences between these groups. The cases made more errors in their utterances than controls of the same ages. **Table 2** shows the difference in the average number of errors between controls and cases.

**Figure 1.** Evaluation of the error analysis. Data from controls are shown in blue (or at a lower position), and data from cases are shown in red (or at a higher position) or in grayscale. Samples with more errors are at a higher position in the upper graph. The histogram of the errors for each participant is shown in the middle graph. Boxplots represent the distributions of the numbers of utterance errors for controls and cases in the bottom chart.

The final evaluation of the error analysis results is shown in **Table 3**. The best method, *HM* (*"hand-made"*), achieved a success rate of 93.81%. For this method, the maximum and minimum values from both groups were used as the threshold limits for each group.


**Table 2.** Error analysis: comparison of both groups: average number of errors for controls and for cases.


**Table 3.** Final results for classifiers based on the HM and ANN methods. The method with the highest rate of success is shown in bold.

Values located outside the limits set for each group are labeled as misclassifications (*"misclass"* in **Figure 2**). As the final classification criterion, based on the number of words containing an error, the group of cases comprised all children who had more than six words with any error during testing. The other three methods were based on Self-Organizing Maps (SOMs) [30], a type of Artificial Neural Network (ANN) [31]. The ANN parameters were set with a standard approach, that is, the data were split in ratios of 0.7 for training, 0.15 for testing and 0.15 for validation. The three ANN methods differed in the values of the weights: ANN1 used the original default values, ANN2 used the minimum and maximum values from both groups and ANN3 used weights set to the mean values of these groups. **Figure 2** provides a process diagram for the classification of the error analysis method.
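The six-word criterion described above can be written as a very small rule-based classifier. The Python sketch below covers only that criterion and a success-rate calculation; it does not reproduce the minimum and maximum limits of the HM method or any of the ANN variants. The names `classify`, `success_rate` and `ERROR_WORD_LIMIT` are hypothetical, and the example counts are invented.

```python
from typing import List

# From the text: more than six words with any error places a child among the cases.
ERROR_WORD_LIMIT = 6


def classify(words_with_errors: int, limit: int = ERROR_WORD_LIMIT) -> str:
    """Label a participant from the number of words that contained an error."""
    return "case" if words_with_errors > limit else "control"


def success_rate(counts: List[int], labels: List[str]) -> float:
    """Percentage of participants whose predicted label matches the known label."""
    hits = sum(classify(c) == lab for c, lab in zip(counts, labels))
    return 100.0 * hits / len(labels)


# Invented example: the last child is a case with only five erroneous words.
counts = [0, 2, 7, 12, 5]
labels = ["control", "control", "case", "case", "case"]
print(success_rate(counts, labels))
```

A fixed threshold like this is easy to audit, which is consistent with the chapter's remark that the HM classifier is easier to implement and use than the ANN-based alternatives.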

Approximately the same final results were observed for the HM and ANN classifiers, but the HM classifier is easier to implement and use. The results indicate that children with SLI had a greater number of errors in their utterances than typically developing children.

**Figure 2.** Process diagram illustrating the principle of error analysis. Overview and comparison of the classification through ANN and HM method.
