**Abstract**

Reading and reading difficulties are among the most researched topics in psychology and education, and specific subjects such as prediction and prevention attract considerable research interest as well. The present chapter discusses these issues, focusing on screening measures and the characteristics that determine their significance and effectiveness. More specifically, discrimination accuracy, sensitivity, and specificity, as well as validity and reliability, are taken into consideration. An examination of several well-known studies reveals a range of methodological issues that affect the effectiveness of the measures used in the extant research. Although the findings are consistent with the literature, they remain scant and not widely accepted, and they are affected by several limitations regarding sampling and experimental design.

**Keywords:** reading difficulties, screening, discrimination accuracy, sensitivity, specificity

#### **1. Introduction**

Reading difficulties and the prevention of reading failure are among the most important and well-studied subjects in the relevant literature. Two decades ago, Joseph Torgesen, in his influential article "Catch Them Before They Fall: Identification and Assessment to Prevent Reading Failure in Young Children", argued that "*The best solution to the problem of reading failure is to allocate resources for early identification and prevention. The goal is to describe procedures … to identify children who need extra help in reading before they experience serious failure*…" [1].

Indeed, in the following years, great emphasis has been placed on screening for at-risk children, and important research findings have emerged; for example, Ref. [2] showed that most children at risk for early reading difficulties could be effectively identified at the beginning of kindergarten. As the literature review shows, many effective and precise screening tools and procedures have been developed in order to identify at-risk children as early and as accurately as possible.

#### **2. Considerations on effectiveness of screening**

It is widely accepted that diagnostic assessment is not practical for assessing all children for academic risk, while screening procedures could provide reliable and valid information regarding children's current academic skills and meet financial and time constraints [3]. Screening, however, is a preliminary process of identification: it identifies those children who may be at risk of future difficulty in school and in need of further individual diagnostic testing. More specifically, it is a brief assessment that provides predictive information about a child's development in a specific academic area, in order to identify at-risk children who need extra support through early intervention. The screening measure is administered to all children and is used to identify an initial risk pool of children suspected of being at risk of developing reading disabilities. Screening information leads to a risk decision for each child screened. Risk decisions are made by selecting a critical cutpoint along a continuum of scores on a single screening measure or a group of screening measures [4].
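The cutpoint decision described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `flag_at_risk`, the child identifiers, the scores, and the cutpoint of 15 are all hypothetical, and the sketch assumes that lower screening scores indicate higher risk.

```python
def flag_at_risk(scores, cutpoint):
    """Return the IDs of children whose screening score falls below the cutpoint.

    Assumption of this sketch: lower scores indicate higher risk.
    """
    return sorted(child for child, score in scores.items() if score < cutpoint)


# Hypothetical screening scores for five children (illustrative values only).
scores = {"child_a": 12, "child_b": 25, "child_c": 18, "child_d": 31, "child_e": 9}

risk_pool = flag_at_risk(scores, cutpoint=15)
print(risk_pool)  # ['child_a', 'child_e']
```

Raising the cutpoint enlarges the initial risk pool and lowering it shrinks the pool, which is why the choice of cutpoint drives the balance between missed cases and false alarms.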

Screening may include parent interviews or written questionnaires and checklists, observation of the child, or use of specific screening tests. Because the earlier a learning disability is detected, the better a child's chances of succeeding in school and in life, screening is used mainly in kindergarten or at the beginning of the first grade. Often, early identification is delayed, and as a result, at-risk children might experience significant problems in learning to read. The consequences of these delays for the child include prolonged frustration, missed opportunities for special instructional interventions, and cumulative academic deficiencies, as well as lifelong secondary psychological problems.

For many years, there has been a common understanding of the characteristics of effective developmental screening tests. These characteristics are an adequate standardization sample, low cost, ease of administration, appropriate content, and adequate validity and reliability (e.g., see [5, 6]). However, predictive validity or instrument reliability has also been cited as a major problem in screening for children at risk [7–10]. Ref. [11] stated "… *a test with a low predictive value is unlikely to be either efficient or useful*…" (p. 1583). An effective framework is usually judged on the relevance and utility of its measures. Relevance concerns the relationship between the measure and the purpose of the assessment, whereas utility is usually evaluated in terms of cost-effectiveness [12].

Screening studies describe outcomes as poor or good, with poor indicating a subject who exhibits the target disorder and good a subject who does not. Measurement takes place at two points in time. Based on the measurement results, a subject may be placed in one of four cells: cell A, failed screen and poor outcome (true positive); cell B, failed screen and good outcome (false positive); cell C, passed screen and poor outcome (false negative); and cell D, passed screen and good outcome (true negative). The matrix is deceptively simple and easy to misinterpret, because cell information varies in relation to rows, columns, or the entire matrix [7, 13].

The vast majority of studies recommend assessing accuracy in terms of sensitivity and specificity as appropriate indices of the capacity of a screening instrument (**Table 1**). These indices can be calculated using the formulas Sensitivity = TP/(TP + FN) and Specificity = TN/(TN + FP). Sensitivity and specificity are two sides of the same coin. Sensitivity is the probability that a test result will be positive when the criterion (in this case, disability) is present; expressed as a percentage, it is the true positive rate. Specificity, by contrast, is the probability that a test result will be negative when the criterion is not present; expressed as a percentage, it is the true negative rate. The overall classification accuracy can be estimated using the equation (TP + TN)/(TP + FP + FN + TN) [5]. The positive likelihood ratio is the ratio between the probability of a positive test result given the presence of the disease and the probability of a positive test result given its absence, that is, sensitivity/(1 − specificity) (e.g., [4, 8, 12, 14–28]).
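As a worked illustration of the formulas above, the following Python sketch computes the four indices from the cells of a screening table. The function name `screening_indices` and the cell counts (18, 12, 2, 68) are invented for the example and do not come from any study cited here; the sketch also assumes that both outcome groups are non-empty.

```python
def screening_indices(tp, fp, fn, tn):
    """Compute accuracy indices from the four cells of a screening table.

    Assumes at least one poor-outcome and one good-outcome case.
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    # Positive likelihood ratio: P(fail | poor outcome) / P(fail | good outcome).
    lr_positive = (
        sensitivity / false_positive_rate if false_positive_rate > 0 else float("inf")
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "lr_positive": lr_positive,
    }


# Hypothetical counts for 100 screened children (illustrative only).
indices = screening_indices(tp=18, fp=12, fn=2, tn=68)
for name, value in indices.items():
    print(f"{name}: {value:.2f}")
```

With these invented counts, the screen catches 18 of the 20 children with a poor outcome and correctly passes 68 of the 80 children with a good outcome.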


*Screening Young Children at Risk for Reading Failure DOI: http://dx.doi.org/10.5772/intechopen.82081*

| Predictor (screen) | Poor outcome (criterion) | Good outcome (criterion) |
|---|---|---|
| Poor (failed screen) | True positive (TP) | False positive (FP) |
| Good (passed screen) | False negative (FN) | True negative (TN) |

Classification accuracy = (TP + TN)/(TP + FP + FN + TN); Sensitivity = TP/(TP + FN); Specificity = TN/(TN + FP)

**Table 1.** *Screening results table.*




*Early Childhood Education*


Using a risk index can serve as a good alternative to single cut scores. This index expresses the calculated probability of being classified as at risk or not at risk. A weighted regression formula relating the predictors to a specific outcome determines the classification and the construction of the risk index. Moreover, the ability of a test to discriminate diseased cases from normal cases is evaluated using receiver operating characteristic (ROC) curve analysis. ROC curves can also be used to compare the diagnostic performance of two or more screening tests [5, 29].
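One way to picture a risk index built from a weighted regression formula is a logistic model that turns weighted predictor scores into a probability. The sketch below is an assumption-laden illustration: the logistic form, the predictor names, the intercept, and the weights are all hypothetical and are not taken from any of the studies reviewed in this chapter.

```python
import math

# Hypothetical intercept and weights (NOT estimated from real data).
# Predictors are assumed to be z-scores, so 0.0 is an average child.
INTERCEPT = -1.5
WEIGHTS = {
    "phonological_awareness": -0.8,
    "rapid_naming": -0.5,
    "oral_vocabulary": -0.3,
}


def risk_index(z_scores):
    """Return the modelled probability (0 to 1) of a poor reading outcome."""
    linear = INTERCEPT + sum(WEIGHTS[name] * z_scores[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-linear))


average_child = {name: 0.0 for name in WEIGHTS}
low_skill_child = {name: -1.0 for name in WEIGHTS}

print(f"average child:   {risk_index(average_child):.2f}")
print(f"low-skill child: {risk_index(low_skill_child):.2f}")
```

Because the output is a continuous probability rather than a pass or fail on one test, a decision threshold can be placed anywhere along it, which is what makes such an index an alternative to a single cut score.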

A screen that cannot discriminate between cases and non-cases yields an ROC curve that is a straight line passing through the origin with unit slope, whereas effective screens yield a curve that lies above this line. The area under the ROC curve (AUC) provides a measure of screening test performance. This measure goes beyond sensitivity and specificity at a single threshold, integrating the full range of scores that need to be taken into account when deciding on a threshold to separate illness from health. In practice, a value of 0.5 (the area under the straight line of unit slope) indicates a lack of effectiveness, whereas a value very close to 1.0 indicates a very good screen.

Ref. [3] noted that the AUC is an indicator of a screening tool's overall ability to differentiate between children with lower-than-average emergent literacy skills and children with average or better emergent literacy skills, and it is calculated at all possible cut scores. Using optimal cut score statistics allows examination of the utility of the screening tool under the circumstances in which it would typically be used. Ref. [4] suggested that AUC values above 0.90 represent excellent diagnostic accuracy, between 0.80 and 0.90 represent good, 0.70–0.80 fair, and values below 0.70 are considered poor.
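The AUC can also be read as the probability that a randomly chosen case receives a higher risk score than a randomly chosen non-case, with ties counting one half. The Python sketch below computes it that way and maps the result onto the bands attributed to Ref. [4] above. The function names and the scores are hypothetical, higher scores are assumed to mean higher risk, and assigning the boundary values 0.70, 0.80, and 0.90 to the upper band is an assumption of the sketch.

```python
def auc(case_scores, noncase_scores):
    """Area under the ROC curve via pairwise comparison of risk scores."""
    wins = 0.0
    for case in case_scores:
        for noncase in noncase_scores:
            if case > noncase:
                wins += 1.0
            elif case == noncase:
                wins += 0.5  # ties count as half a win
    return wins / (len(case_scores) * len(noncase_scores))


def auc_band(value):
    """Label an AUC value using the bands described in the text."""
    if value >= 0.90:
        return "excellent"
    if value >= 0.80:
        return "good"
    if value >= 0.70:
        return "fair"
    return "poor"


# Hypothetical risk scores (illustrative only).
cases = [0.9, 0.8, 0.7, 0.4]
noncases = [0.6, 0.5, 0.45, 0.2, 0.1]

value = auc(cases, noncases)
print(f"AUC = {value:.2f} ({auc_band(value)})")
```

The pairwise form is quadratic in sample size, which is acceptable for a small illustration; statistical packages typically use a rank-based equivalent.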

#### **3. Single or multiple predictors and criterion measures**

A large number of predictors have been proposed by researchers. Several prereading measures, when administered in kindergarten, are predictors of later reading abilities. These measures include letter name and letter sound knowledge, phonological awareness, verbal short-term memory, and rapid automatized naming [6].

Two related studies [23, 24] found that measures of letter naming, phonological awareness, rapid object naming, and non-word repetition at the beginning of kindergarten were very good predictors of reading outcomes at the end of the first grade. Ref. [2] further showed that measuring at-risk children's response to supplemental intervention during kindergarten can improve the accuracy of identification beyond that of early screening. Even when predicting performance on the state assessment in the third grade, Ref. [5] found that a comprehension measure was the best predictor. In addition, the review in Ref. [34] revealed that the risk factors associated with speech and language delay were male gender, family history, and low parental education.

Moreover, phonological awareness was recognized by Refs. [16, 17] as an important risk factor. Other studies proposed different risk factors: Ref. [8] proposed letter-name knowledge and rapid serial naming; Ref. [20], the Initial Sound Fluency task of the DIBELS; Ref. [21], rapid naming of objects; Ref. [22], the Word Identification, Passage Comprehension, and Word Attack subtests of the WJ-R; and, finally, Ref. [19], Letter-Name Fluency (LNF) and Nonsense Word Fluency (NWF).

Additionally, most of the screening studies used multiple predictors, and all of them used phonological processing measures [8, 16–22]. Some used all or part of a specific screening test in order to examine its validity and reliability [20–22]. Others used measures such as pre-reading behaviors, reading habits [18], or working memory [30]. Others used parent-report or self-report questionnaires and checklists [31, 32], and, finally, some used teacher ratings [28, 33].

Similar risk indicators have been used in more recent screening studies. For example, Ref. [4] administered a multivariate screening battery to 252 beginning first-grade children. The children had low initial reading abilities, and their reading outcomes were measured at the end of the second grade. Logistic regression analyses showed a high degree of accuracy in predicting reading outcomes. The resulting screening model included measures of phonological awareness, rapid digit naming, and oral vocabulary.

Ref. [28] examined 240 fourth-grade children, who were classified as not-at-risk or at-risk readers based on a three-factor model reflecting reading comprehension, word recognition/decoding, and word fluency. More specifically, participants were assessed using measures of reading comprehension, oral language, word recognition, word decoding, phonological processing, auditory memory, and spelling.

As criterion measures, all of the studies used reading ability as assessed by a number of standardized and normalized reading tests. The most popular of them were the Woodcock Diagnostic Reading Battery; Woodcock-Johnson Psycho-Educational Battery-Revised; CTOPP; Reading-Gray Oral Reading Test; WRAT Spelling; and Peabody Individual Achievement Test.

#### **4. Research design considerations and findings**

Regarding the experimental design of the screening studies, many had longitudinal or follow-up designs, while the rest were cross-sectional. Commonly, the follow-up studies had two phases separated by a one-year interval. Others had different designs; for example, Ref. [21] included three phases over a 16-month interval, and Ref. [17] had two phases separated by a 4–6-week interval. These studies administered the set of predictors (tests, parts of tests, or single measures) at the first phase and the criterion measures, that is, the reading ability measures, at the second phase. The studies with cross-sectional designs administered the predictors and the reading measures at the same time.

There are two approaches to the study of reading disabilities. First, the most common approach to reading assessment is to separate children into groups based on their reading scores. In this case, it is important to determine whether variables thought to be related to the development of reading skills predict group membership, that is, whether the child belongs to the at-risk group or not. Second, the alternative approach is to consider reading as a continuum of abilities. In this case, it is important to determine whether the variables thought to influence the development of reading abilities can predict the full range of children's reading scores. Concerning the significant discriminant function models, regardless of which language measure was used, classification accuracy was about as good or better for the typical reading group as it was for the poor reading groups [34]. Screening studies mainly used t-tests, ANOVAs, and MANOVAs; correlations; logistic regression; and discriminant analysis. Often, the cutoff scores used by the studies were arbitrary, usually recommended by the literature (e.g., [16]) or selected through multiple statistical analyses to give the best results [20, 31, 32].

Screening procedures that result in sensitivity levels at or above 90% and specificity levels of at least 80% are generally deemed acceptable [29]. An alternative index of accuracy is the area under the receiver operating characteristic (ROC) curve. According to Ref. [29], an ROC curve is a plot of the true positive rate (sensitivity) against the false positive rate (1 − specificity) for each of the cut points of a decision-making instrument. Therefore, the area under the curve (AUC) may be used as an overall estimate of the accuracy of an assessment. Values above 0.80 are considered good, while values above 0.90 are excellent [29]. Ref. [25] found that AUC was 0.84 when reading outcome was based on individual component measures of reading and 0.86 when reading outcome was based on a composite score for reading.
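To make the construction of an ROC curve concrete, the Python sketch below computes one (sensitivity, false positive rate) point per candidate cut point and then lists the cut points that meet the 90% sensitivity and 80% specificity levels mentioned above. The scores and outcomes are invented, the function names are hypothetical, and the sketch assumes that a child fails the screen when the score is at or below the cut point.

```python
def roc_points(scores, has_poor_outcome):
    """Return (cut, sensitivity, false_positive_rate) for each candidate cut.

    Assumption: a child fails the screen when score <= cut.
    """
    n_pos = sum(has_poor_outcome)
    n_neg = len(has_poor_outcome) - n_pos
    points = []
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, has_poor_outcome) if s <= cut and y)
        fp = sum(1 for s, y in zip(scores, has_poor_outcome) if s <= cut and not y)
        points.append((cut, tp / n_pos, fp / n_neg))
    return points


def acceptable_cuts(points, min_sensitivity=0.90, min_specificity=0.80):
    """Cut points whose sensitivity and specificity both meet the minimums."""
    return [
        cut
        for cut, sensitivity, fpr in points
        if sensitivity >= min_sensitivity and (1.0 - fpr) >= min_specificity
    ]


# Hypothetical screening scores and later reading outcomes (True = poor).
scores = [3, 5, 6, 8, 9, 11, 12, 14, 15, 17]
poor = [True, True, True, False, True, False, False, False, False, False]

points = roc_points(scores, poor)
print(acceptable_cuts(points))  # [9]
```

Plotting sensitivity against the false positive rate for all points gives the ROC curve; in this toy data set only a cut point of 9 satisfies both levels at once.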

Ref. [3] administered two screening tools to 176 preschoolers at two time points. Specifically, the study used the Revised Get Ready to Read! (GRTR-R) tool, the Individual Growth and Development Indicators (IGDIs), and a diagnostic measure. A comparison of the two screening tools based on receiver operating characteristic curve analysis showed that, at optimal cut scores, the IGDIs provided less accurate classification of children's overall emergent literacy skills than the GRTR-R. However, neither measure was particularly good at classifying specific emergent literacy skills.

Ref. [23], in turn, examined whether kindergarten measures of language ability predicted reading comprehension difficulties independently of direct word reading measures. In addition, the authors investigated whether response to language intervention in kindergarten added to the prediction of third-grade reading comprehension. The participants were 263 at-risk kindergarten children and 103 age-matched control children.

Ref. [26] evaluated whether, and to what extent, R-CBM and CBM maze were technically adequate for use in a universal screening program of reading in fourth and fifth grades. The results of the study provide evidence of short- and long-term alternate-form reliability, criterion validity, and predictive validity for both R-CBM and CBM maze. The results also suggest that the two measures may be comparable for use in universal screening at those grade levels. Therefore, the study suggests that R-CBM and CBM maze could be used interchangeably for screening of reading outcomes.

Ref. [34] was a review that aimed to update the evidence on screening for and treating speech and language delay in children through 5 years of age. In 23 studies evaluating the accuracy of screening tools, sensitivity ranged between 50 and 94%, and specificity ranged between 45 and 96%. As noted above, 12 treatment studies improved various outcomes in language, articulation, and stuttering. Evidence was limited concerning interventions that improved other outcomes and concerning adverse effects of treatment. Male gender, family history, and low parental education were the main risk factors related to speech and language delay. The use of various screening tools can lead to accurate identification of children who need diagnostic evaluations and interventions. Evidence, on the other hand, is not adequate concerning their applicability in primary care settings. In addition, some treatments for young children who have been identified with speech and language delays and disorders may be effective.

The recent study of Ref. [35] aimed at the early detection of dyslexia via machine learning, by observing how people interact with a linguistic computer-based game. In order to train a statistical model that predicts readers with and without dyslexia using measures derived from the game, they examined 267 children and adults. Specifically, the model was trained and evaluated in a 10-fold cross-validation experiment. Using the most informative features, it reached an accuracy of 84.62%.
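For readers unfamiliar with the evaluation scheme, the sketch below shows the general shape of a 10-fold cross-validation in Python. It is an illustration only and does not reproduce the model, features, or data of Ref. [35]: the single feature, the midpoint-threshold "model," and every function name are hypothetical stand-ins. Each participant is held out exactly once, and accuracy is pooled over the held-out predictions.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Yield (train, test) index lists; every index is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


def fit_threshold(xs, ys):
    """Toy model (assumption): cut at the midpoint between the class means."""
    m1 = mean(x for x, y in zip(xs, ys) if y == 1)
    m0 = mean(x for x, y in zip(xs, ys) if y == 0)
    return (m0 + m1) / 2.0, m1 > m0


def predict(model, x):
    """Return 1 when x falls on the side of the cut where class 1 lies."""
    cut, high_is_positive = model
    return int((x > cut) == high_is_positive)


def cross_validated_accuracy(xs, ys, k=10):
    """Fit on k-1 folds, score on the held-out fold, pool the hits."""
    correct = 0
    for train, test in k_fold_indices(len(xs), k):
        model = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        correct += sum(predict(model, xs[i]) == ys[i] for i in test)
    return correct / len(xs)


# Hypothetical, cleanly separated data: 20 participants per group.
xs = [float(i) for i in range(20)] + [float(i) for i in range(30, 50)]
ys = [0] * 20 + [1] * 20
accuracy = cross_validated_accuracy(xs, ys, k=10)
```

Because the model is refitted without the held-out fold each time, the pooled accuracy estimates performance on unseen participants rather than fit to the training data.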

Another recent study, Ref. [12], focused on a year-end state reading assessment in two states. The study examined the predictive validity and classification accuracy of individual- and group-administered screening measures related to student performance. A total of 321 students participated in the study, and in the fall of fourth grade, they were assessed on word-level skills, text fluency, and reading comprehension. Logistic regression results, applying a multivariate approach, revealed minimal to no increase in classification accuracy over the single comprehension measure. Receiver operating characteristic (ROC) curve analyses determined local cut scores that held sensitivity constant at 0.90; this resulted in a large number of false positives.
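The trade-off in the last sentence can be sketched in a few lines of Python. The example assumes that students scoring at or below the cut are flagged (low score = risk) and uses invented scores. It is not the analysis of Ref. [12], and the function names are hypothetical. It searches for the lowest cut score that reaches a target sensitivity of 0.90 and then counts the false positives that this cut produces.

```python
def confusion_at_cut(scores, failed, cut):
    """Return (tp, fp, tn, fn) when every score <= cut is flagged as at risk."""
    tp = sum(1 for s, f in zip(scores, failed) if s <= cut and f)
    fn = sum(1 for s, f in zip(scores, failed) if s > cut and f)
    fp = sum(1 for s, f in zip(scores, failed) if s <= cut and not f)
    tn = sum(1 for s, f in zip(scores, failed) if s > cut and not f)
    return tp, fp, tn, fn


def lowest_cut_for_sensitivity(scores, failed, target=0.90):
    """Smallest observed score whose sensitivity reaches the target.

    Assumes at least one student failed the criterion assessment.
    """
    for cut in sorted(set(scores)):
        tp, fp, tn, fn = confusion_at_cut(scores, failed, cut)
        if tp / (tp + fn) >= target:
            return cut
    raise ValueError("no cut reaches the target sensitivity")


# Hypothetical fall screening scores for 10 students who later failed
# the year-end assessment and 10 who later passed it.
failing = [12, 15, 18, 20, 22, 25, 27, 30, 34, 41]
passing = [28, 31, 33, 36, 38, 40, 43, 45, 48, 50]
scores = failing + passing
failed = [True] * len(failing) + [False] * len(passing)

cut = lowest_cut_for_sensitivity(scores, failed, target=0.90)
tp, fp, tn, fn = confusion_at_cut(scores, failed, cut)
print(cut, tp, fp, tn, fn)  # 34 9 3 7 1
```

In this toy sample, catching nine of the ten students who later failed requires a cut of 34, which also flags three of the ten students who later passed.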

Regarding predictive accuracy, Ref. [16], in accordance with findings of the past decade, found that both phonological awareness and letter identification yielded the highest overall results. Moreover, all the constructs were promising as far as the accuracy rates are concerned. The false positive rate ranged from 13 to 27%, depending on the construct. The false negative rate ranged from 0.06 to 0.21%. Researchers continue to struggle with high hit and miss rates in predictive accuracy. Most importantly, researchers must address the high rate of false negatives. As funds and resources to provide reading interventions are limited, this is of particular practical importance to ensure that the most appropriate students are served.
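The error rates discussed above can be computed from the four cells of a classification table, as in the following Python sketch. The counts are invented and the function name is hypothetical. The denominators follow one common convention (false positives over all children without later difficulty, false negatives over all children with later difficulty), which is an assumption here, since individual studies, including Ref. [16], may define these rates differently.

```python
def error_rates(tp, fp, tn, fn):
    """Error rates from a 2x2 classification table.

    tp: flagged, later had difficulty     fp: flagged, no later difficulty
    fn: not flagged, later had difficulty tn: not flagged, no later difficulty
    """
    return {
        # Share of children without later difficulty who were flagged.
        "false_positive_rate": fp / (fp + tn),
        # Share of children with later difficulty who were missed.
        "false_negative_rate": fn / (fn + tp),
        # Share of flagged children who truly needed support.
        "positive_predictive_value": tp / (tp + fp),
    }


# Hypothetical screening of 200 children.
rates = error_rates(tp=40, fp=30, tn=120, fn=10)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

With these invented counts, one in five children who later struggled is missed, and only about 57% of the flagged children truly needed support, which illustrates why both kinds of error matter when intervention places are limited.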

The study of Ref. [17] examined the convergent and concurrent validity of two recently developed measures of phonological processing, the TOPA and the CTOPP. Both of these instruments used in combination appear to be useful in the early identification of children at risk for difficulty in learning to read. Based on the results, however, the use of either, or both, of these instruments as sole predictors of reading outcome cannot be supported.

The study of Ref. [20] compared the DIBELS with the CTOPP. Specifically, the concurrent validity and diagnostic accuracy of the published DIBELS test were examined and compared to those of the well-documented published CTOPP test. Results suggest that the DIBELS correlates strongly with subtest and composite scores of the CTOPP that are designed to measure phonological awareness and memory, and less strongly with rapid naming tasks.

The findings of Ref. [18] indicated that the accuracy of the discrimination was high, 89.7%, with a 6.2% false negative rate. However, when the calibration data from the reference group were used to identify at-risk status in a different sample, the accuracy fell to 80.2% with a 10.2% false negative rate.

Ref. [31] found that the Adult Reading History Questionnaire (ARHQ) was valid. This was demonstrated by the high correlation between the ARHQ and diagnostic measures for adults (rs = 0.57–0.70). However, not every familial case is perfectly detected by the ARHQ. Therefore, it would be preferable for clinicians and researchers to use this questionnaire less as a diagnostic tool and more as a screening instrument.

The findings of Ref. [8] indicated that letter name knowledge and rapid serial naming were most important in predicting later RD. The study had a sensitivity of 0.49 and specificity of 0.76. The findings of Ref. [21] were not consistent with the initial findings of the designers that the DEST was significantly and strongly correlated with later reading ability. Specifically, the rapid naming of objects variable emerged as a consistent predictor of later attainment, predicting significant amounts of variability in reading and spelling, with a correlation coefficient of 0.344 (p ≤ 0.05).

Ref. [22] examined the relations among standardized reading achievement tests, phonological awareness measures (CTOPP), and fluency rates (CBM, subtest of Woodcock-Johnson Tests of Achievement-Revised) and how these measures relate to teacher ratings. The authors suggested that measures of phonological awareness and reading fluency, which provide further information, may be included as part of reading assessment in addition to traditional norm-referenced measures of reading achievement.

Ref. [19] examined whether the measures could accurately identify poor readers in first grade. The sensitivity of phonological awareness was 42.9 and 66.7% for ORF and the WJ-R Word Attack, respectively, missing one-half and one-third of the students who later demonstrated reading problems. In addition, measures of letter name knowledge and letter sound knowledge were not sensitive in identifying students who were performing poorly on either first-grade reading criterion, with sensitivity of 57.1%.

Ref. [32] constructed a parent report checklist including information about the developmental history of the child and some indicators of reading problems. The author reported that this checklist was valid and reliable and that it could discriminate between RD and NRD with 97.2% accuracy.

In the study of Ref. [30], phonological awareness, distinctness of phonological representations, and phonological working memory were assessed through a series of tasks. Furthermore, a questionnaire was designed including two self-report scales: (a) one concerned with typical dyslexic symptoms and (b) one concerned with reading interest. The findings showed that the most powerful discriminator was the self-report data.

Ref. [36] examined the accuracy of teacher ratings. Kindergarten children identified by their teachers as making substandard progress toward one or more academic objectives performed significantly less well than a matched group of nonidentified children on tests of word reading, spelling, mathematics, and knowledge of letter names and letter sounds. Furthermore, by the end of the third school year, greater proportions of identified children than nonidentified children were receiving special learning assistance.

Another study examining teacher ratings was Ref. [33]. Kindergarten teachers appear to be better at predicting which students will not develop academic difficulty, as negative predictive values were consistently high regardless of the predictive variable. Variables associated with learning, rather than behavioral or social variables, may be better indicators of future academic achievement. The authors proposed that effective academic screening measures be used in conjunction with teacher ratings in order to maximize specificity in identifying children who are at risk for later learning disability early in their academic years.

More recently, Ref. [28] compared teacher ratings and reading factors as predictors of future reading competence. Specifically, they administered multiple measures of reading to 230 fourth-grade children. Teachers rated children's reading skills, academic competence, and attention. A three-factor model including reading comprehension, word recognition/decoding, and word fluency was used in order to classify children as not-at-risk or at-risk readers. Predictors of reading status included group-administered tests of reading comprehension, silent word reading fluency, and teacher ratings of reading problems. The receiver operating characteristic (ROC) curve analysis yielded an area under the curve index of 0.90.

**5. Screening in RTI context**

The goal of universal screening is to promote the early identification of reading difficulties or potential reading difficulties. In order to prevent further difficulties,
