**3. New knowledge-based toolkit**

*Some RNA Viruses*

Medical Association Publishing House [5] and the European Centre for Disease Prevention and Control [6]. All the available approaches suggest that the number of new COVID-19 cases plays a key role in mapping its trajectory [7] worldwide.

COVID-19 is an evolving epidemic, and its up-and-down spread (the trend or pattern commonly referred to as the "curve") is a sign of its elusiveness. As of today (July 25, 2020), COVID-19 is striking back with record-setting blows. In general, the COVID-19 issue relates to various facets such as public health and social and cultural characteristics, and the world seems to lack sound methodologies for addressing this problem. Predictive tracking or quantitative forecasting measures can help authorities, officials, organizations, and users to be proactive rather than reactive, and thus better prepared to mitigate potential adversities.

**2. The model**

The literature suggests that the number of new cases and the level of social distancing are the key variables for analyzing COVID-19. In what follows, we provide background information about the four main COVID-19 modeling techniques: system dynamics, agent-based modeling, discrete event simulation, and hybrid simulation [8]. System dynamics uses differential equations to model resources, knowledge, people, and money, and the flows between these parameters explain the simulation behavior. Agent-based techniques are stochastic, enabling the variability of human behavior to be incorporated to help understand the likely effectiveness of proposed protective measures. The discrete event technique is also stochastic and models operations over time, where entities flow through a number of activities. Hybrid simulation combines two or more techniques and is used for complex behavior. These techniques focus mainly on the unfolding phases of disease transmission such as quarantine, lockdown, testing, and health care services. Some of these approaches have been rooted in the literature since 1777 and are complex and cumbersome to implement.

Without adequate specialists in advanced and complex mathematical theories and/or computers, the logical question is thus: how could the proper personnel ascertain the COVID-19 spread in order to make proactive intervention decisions, e.g., to prepare hospitals and intensive care units and to mitigate the adverse impacts of what may happen in the near future? In search of an accurate answer, and based on the popular use of the relationship between the number of COVID-19 cases [9] and population per land area, the idea of a new index was conceptualized in this study. It represents the number of reported confirmed new cases per population in the specific region where the data was recorded. This new concept harnesses the number of cases and the regional crowdedness of people, which varies in the US from single digit to multi-thousand [2]. The index increases with more cases and with denser populations (assumed shorter social distancing).

In this study, a combined linear regression analysis and data-fitting model is used. To deal with data fluctuation, the model adopts a hypothesis that was used successfully in other published studies: forecasting over a short time span of one month at most [10–13]. That hypothesis is rational because the virus is known to spread unpredictably; thus, longer time spans may encompass inaccurate data. The data is obtained from the New York Times database [14], which publishes the daily cases of COVID-19 by state and county in the US. The data from eleven states was used: New York State (NYS), Florida (FL), California (CA), Colorado (CO), Illinois (IL), Texas (TX), Louisiana (LA), Washington (WA), Georgia (GA), New Jersey (NJ), and Michigan (MI).
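The short-span regression idea can be illustrated with a minimal Python sketch. It is not the chapter's VBA implementation: the function names `fit_line` and `forecast_index` and the `max_span_days` parameter are hypothetical, and the one-month limit is assumed here to apply to the window of data used for fitting. The two NYS index values are those quoted in Section 4; the extrapolated number it prints is only an illustration, not a result reported in this study.

```python
from datetime import date, timedelta


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all observations share the same date")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast_index(observations, target, max_span_days=30):
    """Fit a line to the most recent window of (date, index) pairs
    and extrapolate it to the target date.

    Assumption: observations older than max_span_days before the
    latest observation are ignored (the "one month maximum" span).
    """
    latest = max(day for day, _ in observations)
    cutoff = latest - timedelta(days=max_span_days)
    window = [(day, value) for day, value in observations if day >= cutoff]
    # Measure time in days from the start of the window.
    origin = min(day for day, _ in window)
    xs = [(day - origin).days for day, _ in window]
    ys = [value for _, value in window]
    slope, intercept = fit_line(xs, ys)
    return intercept + slope * (target - origin).days


# NYS index values quoted in the text for April 9 and April 21, 2020.
nys = [(date(2020, 4, 9), 2.8), (date(2020, 4, 21), 5.0)]
print(round(forecast_index(nys, date(2020, 4, 28)), 2))
```

Restricting the fit to a recent window keeps older, less representative observations from influencing the slope, which is the motivation given above for the short time span.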


To accurately and proactively capture the big picture of COVID-19 spread, this study transfers problem-solving expertise from humans into a KB toolkit that takes in the same data and yields the same conclusion, but faster. This new KB-statistic hybrid approach assists humans in dealing with the massive daily COVID-19 data and saves time, which is essential in dealing with the virus's elusiveness. The study introduces, for the first time in this field to our knowledge, a novel KB toolkit that visualizes the data and makes it easier to understand and use without either mathematical or computer expertise. CORVITT is a promising incubator for future COVID-19 forecasting platforms. Its VBA-based architecture is built on an open-ended, modular, adaptable structure with a graphical interface that lets users operate it easily. This KB technology has been proven in other applications and is thus applied in this study to COVID-19 [15–17]. To the author's knowledge, the concept of CORVITT has not been attempted to date for COVID-19. **Figure 1** shows the dashboard of CORVITT. The user simply clicks the button that represents the state or province of interest, and the dashboard displays the microdata or the relative comparison of all states. **Figure 2** shows the data used in **Figure 1**. Although the amount of collected data is massive, the dashboard is intuitive and user friendly. Again, there is no need for medical, mathematical, or computer skills to use the dashboard and benefit from its applications. This is one of the takeaways of bringing an artificial brain to help human brains in dealing with complex challenges such as COVID-19.

**Figure 1.** *Dashboard of the CORVITT presented in this chapter.*

**Figure 2.** *The macro-data used in this study.*

#### **4. Results and discussion**

In what follows, we examine the feasibility of the proposed model for COVID-19 in two ways: firstly, by analyzing its forecasted outcomes in eleven US states, and secondly, by comparing the forecasted results with actual on-site data.

Firstly, a database was created at the micro level (counties), for the first time to our knowledge, for the new cases and population per land area on COVID-19 from March 27 to May 1, 2020 in the eleven US states: NYS, FL, CA, TX, LA, WA, CO, GA, IL, NJ, and MI. These states had a steady high number of confirmed cases according to the New York Times. **Table 1** shows some of the collected data. **Figure 2** shows CORVITT gauging of the virus's county-wise distributions in terms of the new index and population. **Figure 2** shows that in NYS, FL, CA, CO, and IL the inhabitants were infected in areas with large social distances, because most of the data is concentrated at low population, i.e. large spaces between the inhabitants. On the contrary, in TX, LA, WA, GA, NJ, and MI the virus spread where the social distance was small, because the data spreads over a wide range of population, i.e. small spaces between the inhabitants. Unlike the general approach applied to all US states at all times since 2019, which is common in news media from tabloids to the New York Times, this discovery unveiled new facets. **Figure 1(d)** shows that on March 27, the indexes were 0.93 and 0.10 in NYS and NJ (large distance), respectively, and 0.08 and 0.07 for TX and GA (small social distance), respectively, though the spread of data appeared similar on the dashboard in each group. In addition, NYS, LA, WA, IL, and MI have high indexes by comparison to FL, CA, TX, CO, GA, and NJ. For example, on April 21, 2020, the indexes were 5.0 and 0.40 for NYS and CA, although the closeness of data in both states was similar. Furthermore, the increase of the index within each state was nonuniform. For example, the index increased from 2.8 to 5 between April 9 and 21 in NYS, and from 0.05 to 1.0 in TX over the same period. From a different angle, **Figure 1(a)** shows that the rate of spreading of the virus differs from one state to another. For example, the spreading rate in IL is very high compared to that in CA. This indicates that the virus can affect more people in IL (57,920 sq. mi and 13 million people) than NJ (8730 sq. mi and 9 million people). Taken as a whole, the CORVITT outcomes suggest that NYS was a favorable region for the virus to spread from March 27 to May 1, 2020, whereas CA was less so. Such information allows the authorities to prioritize resources, giving NYS the highest priority.

Secondly, **Figure 3a** and **b** compare the forecasted and actual on-site data and show a close agreement in different US states. **Figure 3a** shows that whereas the new cases in NYS had reached their peak in the first week of May, the pandemic was worsening in other states such as Illinois, while in California, Georgia, and Colorado the cases had reached a plateau. **Figure 3c** shows the severity of the pandemic as indicated by the skewness of the growth and deterioration of the distributions [18]: positive (to the right) for LA and negative for NYS, NJ, and MI. The positive skewness means a longer deterioration (decline in cases) time. The shallow deterioration rate at the trailing end of the curve in **Figure 1(b)** is a sign of a plateau. **Figure 1** describes the peak, weakness, and steadiness statuses by which the virus trajectory disperses through different stages in various regions. This new discovery is useful for understanding the buildup and collapse of the virus's impacts and thus makes proactive preparations possible.

**Regression statistics**

| Statistic | Value |
| --- | --- |
| Multiple R | 0.992092 |
| R Square | 0.984246 |
| Adjusted R Square | 0.976369 |
| Standard error | 0.372065 |
| Observations | 4 |

**ANOVA**

| | df | SS | MS | F | Significance F |
| --- | --- | --- | --- | --- | --- |
| Regression | 1 | 17.29721 | 17.29721 | 124.9503 | 0.007908 |
| Residual | 2 | 0.276865 | 0.138433 | | |
| Total | 3 | 17.57407 | | | |

| | Coefficients | Standard error | t Stat | P-value | Lower 95% | Upper 95% |
| --- | --- | --- | --- | --- | --- | --- |
| Intercept | −7476.32 | 669.1315 | −11.1732 | 0.007915 | −10355.4 | −4597.28 |
| date | 0.170252 | 0.015231 | 11.17812 | 0.007908 | 0.104719 | 0.235785 |

**Table 1.** *A sample output of the modified regression.*
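As a reading aid for Table 1, the following Python sketch recomputes the summary statistics from the ANOVA entries using the standard relations for a linear regression with one predictor. The variable names are illustrative, and the only inputs are numbers copied from Table 1. It is a consistency check on the published output, not a reproduction of the chapter's toolkit.

```python
import math

# Inputs copied from Table 1.
n_obs = 4
n_predictors = 1              # the single regressor "date"
ss_regression = 17.29721
ss_residual = 0.276865
slope, slope_se = 0.170252, 0.015231

ss_total = ss_regression + ss_residual
df_residual = n_obs - n_predictors - 1

# Standard relations for ordinary least squares.
r_square = ss_regression / ss_total
multiple_r = math.sqrt(r_square)
adjusted_r_square = 1 - (1 - r_square) * (n_obs - 1) / df_residual
ms_residual = ss_residual / df_residual
standard_error = math.sqrt(ms_residual)
f_statistic = (ss_regression / n_predictors) / ms_residual
t_stat_slope = slope / slope_se

for name, value in [
    ("Multiple R", multiple_r),
    ("R Square", r_square),
    ("Adjusted R Square", adjusted_r_square),
    ("Standard error", standard_error),
    ("F", f_statistic),
    ("t Stat (date)", t_stat_slope),
]:
    print(f"{name}: {value:.4f}")
```

With a single predictor, the F statistic equals the square of the slope's t statistic, which is why the slope's P-value and Significance F coincide in Table 1.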

*How Can We Be Ahead of COVID-19 Curve? A Hybrid Knowledge-Based and Modified… DOI: http://dx.doi.org/10.5772/intechopen.93867*



**Figure 3.** *(a) The trajectory of cases' dispersion over time in various US States. (b) A satisfactory agreement between the forecasted and actual data. (c) The growth and deterioration distributions of cases over time.*
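The skewness reading of Figure 3c can be sketched in Python as follows. This assumes the skewness is the standardized third moment of the distribution of cases over time, with each day weighted by its case count; the chapter does not state its exact formula, so `time_skewness` is a hypothetical helper, and the two curves are synthetic shapes rather than data from this study.

```python
def time_skewness(daily_cases):
    """Skewness of the distribution of cases over time.

    Each day index is weighted by that day's case count. A positive
    value means a long right tail, i.e. a slow decline after the peak.
    """
    total = sum(daily_cases)
    if total <= 0:
        raise ValueError("need at least one case")
    mean_day = sum(day * c for day, c in enumerate(daily_cases)) / total
    m2 = sum(c * (day - mean_day) ** 2 for day, c in enumerate(daily_cases)) / total
    m3 = sum(c * (day - mean_day) ** 3 for day, c in enumerate(daily_cases)) / total
    if m2 == 0:
        raise ValueError("all cases fall on a single day")
    return m3 / m2 ** 1.5


# Synthetic shapes only (not study data).
fast_rise_slow_decline = [5, 4, 3, 2, 1]
slow_rise_fast_decline = [1, 2, 3, 4, 5]
print(time_skewness(fast_rise_slow_decline) > 0)
print(time_skewness(slow_rise_fast_decline) < 0)
```

Under this reading, a positive value corresponds to the longer deterioration time described in the text, and a negative value to a curve whose growth phase is the longer one.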

**5. Conclusions**

As the enormity of the COVID-19 threat has become clear, the complexity of the existing COVID-19 analytic methodologies and their all-encompassing approach place serious limitations on their practical usefulness. Computer technologies have advanced beyond what anyone could have imagined, and KB systems have proven very beneficial in many fields. The rational question is: why has it taken so long for a logical approach to appear that makes the complex analytical simulations practical? To answer the question, this chapter introduces machine smartness to assist human intelligence in capturing the big picture of the virus's elusiveness, and thus in taking proactive rather than reactive steps to safely mitigate its inevitable adverse effects. This seed study introduced a hybrid KB-regression analysis model for COVID-19 forecasting. It used data collected from eleven US states at the macro level to foresee the short-term spread trajectory. The outputs unveiled new discoveries and shed light on various facets of COVID-19 in each state. The accuracy of the hybrid approach was gauged by comparing forecasted and actual data, and satisfactory agreement was found. It should be noted that this study is a step forward, and additional development is in progress to improve it.

**Author details**

Rafaat Hussein

State University of New York, Syracuse NY, US

\*Address all correspondence to: ezpsc@yahoo.com

© 2020 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
