*Artificial Intelligence - Latest Advances, New Paradigms and Novel Applications*

*Quest for I (Intelligence) in AI (Artificial Intelligence): A Non-Elusive Attempt. DOI: http://dx.doi.org/10.5772/intechopen.96324*

**6. Definitions of CHC abilities**

These definitions are derived from an integration of the writings of Carroll (1993), Gustafsson and Undheim (1996), Horn (1991), McGrew (1997, 2005), and Schneider and McGrew (2012). **Figures 2** and **3** and **Tables 1** and **2** provide the definitions of the CHC abilities. Definitions of the broad crystallized and fluid abilities are available in **Figures 2** and **3**.

**6.1 Fluid intelligence (Gf)**

**Table 1** shows the definitions of the narrow fluid abilities.

**Table 1.**
*Narrow Gf stratum I ability definitions.*

Fluid Intelligence (*Gf*)

| Narrow Stratum I Name (Code) | Definition |
| --- | --- |
| Induction (I) | Ability to discover the underlying characteristic of a problem. |
| General Sequential Reasoning (RG) | Ability to discover rules to solve a novel problem. |
| Quantitative Reasoning (RQ) | Ability to reason inductively and deductively. |

Note. *Definitions were derived from Carroll (1993) and Schneider and McGrew (2012).*

**6.2 Crystallized intelligence (Gc)**

*Gc* includes both declarative (static) and procedural (dynamic) knowledge. It reflects the knowledge and experience of a person. **Table 2** shows the definitions of the narrow crystallized abilities.

**Table 2.**
*Narrow Gc stratum I ability definitions.*

Crystallized Intelligence (*Gc*): depth and breadth of acquired knowledge.

| Narrow Stratum I Name (Code) | Definition |
| --- | --- |
| Lexical Knowledge (VL) | Content of vocabulary for oral communication. |
| Grammatical Sensitivity (MY) | Knowledge of the grammar of the native language. |
| Language Development (LD) | Understanding of the native language. |
| Oral Production and Fluency (OP) | Narrow oral communication skills. |
| General (verbal) Information (K0) | Domain of knowledge. |
| Communication Ability (CM) | Speaking ability. |

Note. *Definitions were derived from Carroll (1993) and Schneider and McGrew (2012).*

**7. Approximate model of crystallized intelligence: a non-elusive attempt**

Crystallized intelligence (*Gc*) includes both declarative (static) and procedural (dynamic) knowledge (see Section 6.2). In the following section, we try to model *Gc* in an approximate sense through a deep neural network, considering only declarative (static) knowledge. The convolutional neural network (CNN) is one implementation of the deep neural network architecture; there are several variations of the CNN architecture, e.g. AlexNet, Inception, ResNet, DenseNet, etc. The input to the CNN is a static representation of knowledge, represented by a matrix.

### **7.1 Why CNN?**

Suppose we have a 28 × 28 RGB image. The total number of inputs to a neural network will then be 28 × 28 × 3 = 2352.

Now let us have a 1000 × 1000 RGB image. In this case the total number of inputs will be 1000 × 1000 × 3 = 3 million, which is pretty large.

Since the number of inputs has increased, the number of weight parameters will also increase. If there are 1000 nodes in the first layer, the number of elements in the weight matrix of the first layer will be 3 million × 1000 = 3 billion.

We see that with the increase in the dimension of the image there is a huge increase in the number of parameters in a feedforward neural network, and it is pretty difficult to train a network with such a large number of parameters.
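The arithmetic above can be checked with a minimal sketch; the helper name `first_layer_weights` is ours, not from the text:

```python
# Parameter count of the first fully connected layer when a flattened
# image feeds a layer of 1000 hidden nodes, as in the text above.

def first_layer_weights(height, width, channels, hidden_nodes=1000):
    """Return (number of inputs, number of first-layer weight elements)."""
    inputs = height * width * channels
    return inputs, inputs * hidden_nodes

small = first_layer_weights(28, 28, 3)      # 2352 inputs, 2.352 million weights
large = first_layer_weights(1000, 1000, 3)  # 3 million inputs, 3 billion weights
print(small, large)
```

This is exactly why fully connected layers do not scale to large images, motivating the convolutional layers discussed next.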

#### **7.2 Computer vision problem**

Suppose we have a 6 × 6 grayscale image.


We wish to detect vertical edges in it. So, the filter or kernel we use is as follows:


After convolution, the resultant matrix we get is:


The filter itself can be learnt using a neural network, which determines the 9 values of the filter. We treat each element of the filter as a parameter and learn these parameters using back-propagation, just as in an ordinary neural network.
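The convolution step can be sketched in plain NumPy. The 6 × 6 image below is an illustrative stand-in for the matrix lost from the original figures, and we use the classic hand-designed vertical-edge kernel for the filter:

```python
import numpy as np

# Sketch of the 2D "convolution" used in CNNs (strictly, cross-correlation):
# slide the filter over the image and sum the elementwise products.

def conv2d(image, kernel):
    h, w = image.shape
    f = kernel.shape[0]
    out = np.zeros((h - f + 1, w - f + 1))
    for i in range(h - f + 1):
        for j in range(w - f + 1):
            out[i, j] = np.sum(image[i:i+f, j:j+f] * kernel)
    return out

# Illustrative image: bright left half, dark right half -> a vertical edge.
image = np.array([[10, 10, 10, 0, 0, 0]] * 6, dtype=float)
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)  # vertical-edge detector

result = conv2d(image, kernel)
print(result)  # each row is [ 0. 30. 30.  0.]: the edge shows up mid-image
```

Replacing the fixed kernel values with learnable parameters updated by back-propagation is precisely what a convolutional layer does.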

#### **7.3 A short summary of convolutional operations**

Summary of convolutions: for an *n* × *n* image, an *f* × *f* filter, padding *p*, and stride *s*, the output size is

⌊(*n* + 2*p* − *f*)/*s* + 1⌋ × ⌊(*n* + 2*p* − *f*)/*s* + 1⌋
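The output-size rule above can be sketched as a one-liner (the function name is ours):

```python
from math import floor

# Spatial output size of a convolution along one dimension:
# floor((n + 2p - f) / s) + 1.

def conv_output_size(n, f, p=0, s=1):
    return floor((n + 2 * p - f) / s) + 1

print(conv_output_size(6, 3))       # 4: the 6x6 image, 3x3 filter case above
print(conv_output_size(6, 3, p=1))  # 6: "same" padding with stride 1
print(conv_output_size(28, 5, s=2)) # 12
```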

#### **How to do convolutions on RGB Images?**

Since an RGB image consists of 3 channels, the filter needs a channel for each image channel. So, for a 6 × 6 × 3 image, we need a filter of shape 3 × 3 × 3.


#### **How is this convolution computed?**

As in 2D convolution, the first filter channel is convolved with the Red channel, the second with the Green channel, and the third with the Blue channel. The values at each convolutional step are added over the channels to give the final result, which is a single channel, i.e. a 2D matrix.

Suppose the above 3 × 3 × 3 filter is used for detecting vertical edges. Now suppose that we also want to detect horizontal edges. We then need another 3 × 3 × 3 filter for that purpose, which will again output a 2D matrix.

By stacking the output of these two filters, we get an output of shape

[(*H* − *f* + 1) × (*W* − *f* + 1) × 2] (considering no padding).

The number of channels in the output equals the number of filters we are using, and the number of channels in each filter equals the number of channels in the input.

However, before stacking up the outputs, bias is added to the output and passed through the activation function, which is then used as input to the next layer.
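The multi-channel convolution and stacking described above can be sketched as follows (bias and activation are omitted for brevity; `conv3d` and the random filter values are our illustrative assumptions):

```python
import numpy as np

# Convolving a 6x6x3 RGB image with two 3x3x3 filters. Each filter spans
# all input channels; the per-filter 2D outputs are stacked, so the result
# has as many channels as there are filters.

def conv3d(image, filters):                      # image: (H, W, C)
    H, W, C = image.shape
    f = filters[0].shape[0]
    out = np.zeros((H - f + 1, W - f + 1, len(filters)))
    for k, filt in enumerate(filters):           # filt: (f, f, C)
        for i in range(H - f + 1):
            for j in range(W - f + 1):
                # sum the elementwise products over all C channels
                out[i, j, k] = np.sum(image[i:i+f, j:j+f, :] * filt)
    return out

rng = np.random.default_rng(0)
image = rng.random((6, 6, 3))
vertical = rng.random((3, 3, 3))     # stand-in for a vertical-edge filter
horizontal = rng.random((3, 3, 3))   # stand-in for a horizontal-edge filter

out = conv3d(image, [vertical, horizontal])
print(out.shape)  # (4, 4, 2): (H - f + 1) x (W - f + 1) x (number of filters)
```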

**Convolutional Layer.**

Now, there are various types of layers in a CNN:

1. Convolutional
2. Pooling
3. Fully Connected

**Pooling Layers:** Let us consider a 2D matrix for Max-Pooling:



Max pooling takes the max of the elements in an *f* × *f* region.

Suppose we take a 2 × 2 filter with stride 2. The output will be a 2 × 2 matrix, where each element is the max of the elements in the 2 × 2 region the filter is passed over.

Going by this way, the output will be:


If we have a 3D input, the max-pooling output will have the same number of channels as the input. If the number of channels in the input is *nc*, then the number of channels in the output of max-pooling will also be *nc*.
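Max pooling with a 2 × 2 filter and stride 2 can be sketched as follows; the 4 × 4 input values are illustrative:

```python
import numpy as np

# Max pooling: each output element is the max over the f x f region
# the filter covers, moving by stride s.

def max_pool(x, f=2, s=2):
    h, w = x.shape
    out = np.zeros((h // s, w // s))
    for i in range(0, h - f + 1, s):
        for j in range(0, w - f + 1, s):
            out[i // s, j // s] = x[i:i+f, j:j+f].max()
    return out

x = np.array([[1, 3, 2, 1],
              [2, 9, 1, 1],
              [1, 3, 2, 3],
              [5, 6, 1, 2]], dtype=float)
print(max_pool(x))  # [[9. 2.]
                    #  [6. 3.]]
```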

#### **Average Pooling**


Instead of taking the max of the elements, we take the average in this technique.


One important point to note about pooling layers is that they have no trainable parameters.
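Average pooling differs from the max-pooling sketch only in the reduction used; the 4 × 4 input below is illustrative:

```python
import numpy as np

# Average pooling: same sliding window as max pooling, but each output
# element is the mean of its f x f region. Note there is nothing to
# train here, consistent with the point above.

def avg_pool(x, f=2, s=2):
    h, w = x.shape
    out = np.zeros((h // s, w // s))
    for i in range(0, h - f + 1, s):
        for j in range(0, w - f + 1, s):
            out[i // s, j // s] = x[i:i+f, j:j+f].mean()
    return out

x = np.array([[1, 3, 2, 1],
              [2, 9, 1, 1],
              [1, 3, 2, 3],
              [5, 6, 1, 2]], dtype=float)
print(avg_pool(x))  # [[3.75 1.25]
                    #  [3.75 2.  ]]
```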

The two important features of CNNs are:

1. **Parameter Sharing**: A filter learnt can be used to detect a feature over all of the input image.

2. **Sparsity of Connections**: In each layer, each output value depends only on a small number of inputs.
Unfortunately, deep learning models are often problematic. Though deep learning models are robust for declarative (static) knowledge, they are not sufficient for procedural knowledge, which refers to the process of reasoning with previously learned procedures to transform learning. Further, several of the abilities assessed by psychometric intelligence tests are crystallized abilities, which are acquired through experience and which are not distinguishable from (multipurpose) skills. AI tests, on the other hand, focus on capabilities that enable new skill acquisition; hence crystallized abilities are not acceptable for intelligent decision making [6].

#### **8. Approximate model of fluid intelligence: a non-elusive attempt**

In this section, we model, in an approximate sense, the above-said concept of fluid intelligence (*Gf*) for on-the-spot solving of previously unseen problems through a meta-learning (learning-to-learn) approach. Inductive and deductive reasoning are generally considered to be the hallmark narrow ability indicators of *Gf*, but in our study we do not consider such hallmark abilities of *Gf* [7].
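The learning-to-learn idea can be illustrated with a minimal two-level optimization sketch in the spirit of first-order MAML; the toy task family, learning rates, and variable names here are our illustrative assumptions, not the chapter's formulation:

```python
import numpy as np

# Structural sketch of meta-learning's two nested loops. Each task T has a
# simple quadratic loss L_T(w) = (w - a_T)^2 with its own optimum a_T.
# The inner loop adapts a copy of the shared parameters to one task; the
# outer loop improves the shared initialization so that a single inner
# step adapts well to new tasks.

rng = np.random.default_rng(42)
task_centers = rng.uniform(2.0, 4.0, size=200)   # one optimum a_T per task

def grad(w, a):              # dL_T/dw for L_T(w) = (w - a)^2
    return 2.0 * (w - a)

w_meta, inner_lr, outer_lr = 0.0, 0.25, 0.05
for a in task_centers:
    w_task = w_meta - inner_lr * grad(w_meta, a)   # inner: adapt to task T
    w_meta = w_meta - outer_lr * grad(w_task, a)   # outer: first-order meta-step

# The shared initialization drifts toward the center of the task
# distribution, from which one gradient step adapts well to any task.
print(round(w_meta, 2))
```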

From Section 4.2, we understand that crystallized intelligence (*Gc*) is an outgrowth of fluid intelligence (*Gf*). Thus the performance of crystallized intelligence is influenced by a trace of fluid intelligence, though (*Gc*) and (*Gf*) are two separate, distinct components in a putative test of intelligence. Also, from **Figure 2** we understand that the components reasoning (*Gf*) and acquired knowledge (*Gc*) are derived from the top level mental (neural) energy *g*. Hence, to make a very 'crude approximation' of the three stratum C–H–C theory (see **Figure 3**), we adopt a deep meta-learning approach in which we integrate the power of the deep learning approach into meta-learning. (*Gf*) and (*Gc*) are both derived from the top level mental (neural) energy as shown in **Figure 2**, and we try to follow the hierarchy of the three layers to derive the broad and narrow abilities that perform the specific task of a given job. We use the term 'crude approximation' because the top level of **Figure 2** or **Figure 3** can never be reached by the present state of the art of artificial neural networks. In particular, "mood" at the top level of **Figure 3** is a biological phenomenon which generates sufficient mental energy (neural energy) inside the brain under favorable mental conditions. Hence, under such circumstances we assume sufficient neural (mental) energy is generated for C–H–C theory to perform lower level cognitive processes like crystallized and fluid intelligence.

**9. Concept space of deep meta-learning**

**Figure 5** shows the concept space of deep meta-learning. Eq. (4) represents the meta-learning process [8]:

$$\min_{\theta_G,\,\theta_M,\,\theta_D} \; \mathbb{E}_{T \sim P(T),\,(\mathbf{x},\mathbf{y})} \Big[ J\big( L_T(\theta_M, \theta_G),\; L_{(\mathbf{x},\mathbf{y})}(\theta_D, \theta_G) \big) \Big], \tag{4}$$

where *θG*, *θM*, and *θD* are the parameters of deep meta-learning. We assume that the top level mental (neural) energy is available for the C–H–C theory of intelligence, and a crude approximation of the C–H–C theory to mimic human intelligence can be achieved through the deep meta-learning approach. In the deep meta-learning approach we crudely approximate the integration of crystallized intelligence (*Gc*) into fluid intelligence (*Gf*).

**Figure 5.**
*Concept space of deep meta-learning.*

**10. Paradigm shift from Von Neumann computing to neuromorphic computing**

So far, we have approximately modeled the psychometric model of human intelligence and implemented it on a Von Neumann computer system. Now we seek a