**4.3 Cortical manifestation of convolution**

We now turn to the question of how convolution could be implemented in cortical architecture. To this end, we describe the constraints imposed by the computational requirements of convolution and argue that the known cortical architecture satisfies them.
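The argument in this section rests on the Convolution Theorem (Section 2.2): convolution in the spatial domain corresponds to pointwise multiplication in the frequency domain. The following is a minimal numerical sketch of that identity, not a model of PaSH-FFT or of cortical circuitry. It assumes a short one-dimensional signal, periodic (circular) boundary conditions, and a naive discrete Fourier transform as a stand-in for any fast transform; the function names, the toy signal, and the Gaussian width are all illustrative choices that do not come from the paper.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for any FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        # The inverse transform carries the 1/n normalisation.
        out.append(acc / n if inverse else acc)
    return out


def circular_convolve(signal, kernel):
    """Direct circular convolution in the spatial domain."""
    n = len(signal)
    return [
        sum(signal[m] * kernel[(i - m) % n] for m in range(n))
        for i in range(n)
    ]


def convolve_via_frequency(signal, kernel):
    """Same convolution: transform, multiply pointwise, transform back."""
    signal_freq = dft(signal)
    # Fixed per kernel; loosely analogous to the stored Fourier weights.
    kernel_freq = dft(kernel)
    product = [s * k for s, k in zip(signal_freq, kernel_freq)]
    return [value.real for value in dft(product, inverse=True)]


def gaussian_kernel(n, sigma):
    """Periodic Gaussian centred on index 0, normalised to sum to 1."""
    raw = [math.exp(-0.5 * (min(i, n - i) / sigma) ** 2) for i in range(n)]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical toy input; the values carry no biological meaning.
signal = [0.0, 1.0, 4.0, 2.0, 0.0, 3.0, 1.0, 0.0]
kernel = gaussian_kernel(len(signal), sigma=1.0)

direct = circular_convolve(signal, kernel)
via_freq = convolve_via_frequency(signal, kernel)
max_diff = max(abs(a - b) for a, b in zip(direct, via_freq))
print("largest difference between the two routes:", max_diff)
```

The two routes agree up to floating-point error. Note that the kernel's transform is computed once and then applied by a single multiplication per frequency component, which is the step the model attributes to one additional connection.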

#### 4.3.0.22

The key to solving the convolution problem in the neurological domain is provided by the Convolution Theorem, the same theorem employed by numerous digital signal processing applications and discussed in Section 2.2. Its importance is that the convolution of two functions in the spatial domain can be achieved by multiplying the functions in the frequency domain. The implications of this theorem for the cortical convolution conundrum are significant. In our model, the components of the Fourier transform of the function with which the input signal is to be convolved are represented by connection weights. Then, once the input signal has been transformed to the frequency domain, the required convolutions can be performed by mere multiplications. In cortical terms, each component of the signal, in the frequency domain, must traverse a connection to one more neuron to achieve the desired multiplication. However, the resulting convolution, still in the frequency domain, must be transformed back to the spatial domain to complete the operation. This is achieved with an inverse Fourier transform. Accordingly, the sequence of connections along the path that terminates in the output of a convolved value in the spatial domain is given by:

$$\mathbf{(s)Convolution} = \mathbf{(s)}P \odot F \odot P \odot F \odot \mathbf{C} \odot P \odot I \odot P \odot I \odot P \tag{7}$$

#### 4.3.0.23

It is assumed that each component of the input signal traverses parallel paths along the network. Thus, the net time cost to complete a convolution is equivalent to the time required for a component of the input signal to traverse a path connecting 10 neurons. This path is composed of five short-range intrinsic connections and five long-range connections.

#### **4.4 Analysis**

The plausibility of the cortical model of convolution proposed in this paper is fundamentally predicated on the assumptions made in its formulation. Consequently, we summarise these assumptions along with the arguments offered to justify them before we provide an analysis of the model's parameterisation:


#### 4.4.0.24

The first assumption is possibly the most critical, as it establishes the fundamental architectural relationship between the local and global maps and is essential to PaSH-FFT. The second and third assumptions implied that a Fourier transform of the portion of the signal represented in a local map would be completed by each component of the signal traversing one cortical connection and that, on completion of the first iteration of PaSH-FFT, the global signal would consist of 10,000 local discrete Fourier transforms, each at the scale of a local map. Then, on completion of the second iteration of PaSH-FFT, the 10,000 local Fourier transforms would be transformed into a global Fourier transform of size $10{,}000^2$ = 100 million, which by the fourth assumption represents the size of the global signal. From this we were able to assert that the input spatial signal would be transformed to frequency space at a cost of the signal traversing a path connecting four neurons. With the signal in frequency space, we employed the Convolution Theorem to assert that, with each component of the global signal traversing one additional connection, the state of the signal would represent a convolved signal in frequency space. This assertion was predicated on the assumption that the weight of each of these last connections represented the Fourier weight of the appropriate Gaussian. The final step was to transform the convolved signal back from frequency space to the spatial domain. This was achieved with the inverse PaSH-FFT, which would be completed at the additional cost of the signal traversing a path connecting a further five neurons. Putting these three steps together, we arrived at a total path length of 10 connections for the global input signal to be transformed into a representation of global convolution of the visual field.

We also note that this analysis accounted for a single global convolution of the input signal. However, many global convolutions will be required, possibly up to one for every orientation preference and spatial frequency preference represented in a local map. Although the input spatial signal needs to be transformed into the frequency domain only once, each distinct convolution would require a distinct set of parallel paths to transform the signal back into the spatial domain. Consequently, the multiple convolutions would not necessarily result in a longer path. Accordingly, we assert that the transform, PaSH-FFT, with appropriate parameterisation, would deliver a global convolution of the visual field. Moreover, this output signal is generated within the required time constraints imposed by observed contextual modulation. Given our assumptions, the lowest number of iterations required to complete a Fourier transform is two. Consequently, 10 represents the length of the shortest path (see equation 7) possible to deliver a global convolution via PaSH-FFT.

**5. Discussion**

The signal processing literature describes many different types of fast Fourier transform (FFT). Although any one of them represents an alternative candidate to PaSH-FFT, the problem to address is how it might be implemented within the known constraints of cortical architecture. All fast Fourier transforms need to rearrange components between their intermediate steps of multiply and add. PaSH-FFT derives its rearrangements of components with the transform *M*10 that, as argued, is compatible with the distribution and quantity of long-range cortical connections. If any other FFT were to be substituted for PaSH-FFT in the model, one would need to account for the rearrangement phase of that FFT within the known connectivity of area V1.

#### 5.0.0.25

Another issue worthy of discussion is the absence of empirical evidence that would irrefutably demonstrate a cortical implementation of the Fourier transform. Part of the explanation for this lack of evidence may lie in the role the Fourier transform plays in the vision process as suggested by this paper. That is, PaSH-FFT was shown to be a means to an end (convolution), not the end itself. Consequently, the search, through empirical measurement of response properties, for neurons that closely model the profile of a Fourier transform may remain unresolved for some time to come.

#### 5.0.0.26

Underpinning the proposed implementation of PaSH-FFT in cortical architecture is a highly simplistic model of the parallelism inherent in the cortex. The model employed did not take into account at least two well-accepted features of this parallel architecture. First, the system itself somehow synchronises the flow of the signal. Second, the cortex does not
