
defined for each image pair as a standard magnification ($\times 1$) of the threshold for selecting suitable image pairs. Namely, by decreasing the threshold magnification, we can discard more image pairs; conversely, by increasing the magnification, more image pairs can be used for the recovery. Because of the page limit, we show only the results with $\sigma_r^2 = 2.64 \times 10^{-2}$, for which the average amplitude of the optical flow approximately coincides with $\lambda/4$.

**Figure 11** shows the recovered depth for each threshold, set as a constant multiple of the reference value. We also examined the results obtained using all image pairs. These results confirm that, by reducing the magnification, inappropriate image pairs are discarded and the accuracy of the depth recovery is improved. The percentage shown in the caption of the figure indicates the fraction of image pairs used for the recovery, which varies in conjunction with the change in threshold.
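For illustration, this selection step can be sketched in a few lines of Python. The per-pair score and all names below are our own assumptions, not the implementation behind Figure 11:

```python
import numpy as np

def select_pairs(scores, ref_threshold, magnification=1.0):
    """Keep image pairs whose score stays below magnification * reference threshold.

    scores: hypothetical per-pair quality scores (smaller is better).
    Decreasing `magnification` discards more pairs; increasing it
    admits more pairs for the depth recovery.
    """
    scores = np.asarray(scores)
    keep = scores < magnification * ref_threshold
    return np.flatnonzero(keep), float(keep.mean())

# Sweeping the magnification, as in the Figure 11 comparison:
rng = np.random.default_rng(0)
scores = rng.random(200)
for mag in (0.5, 1.0, 2.0):
    idx, frac = select_pairs(scores, ref_threshold=0.4, magnification=mag)
    print(f"x{mag}: {idx.size} pairs used ({frac:.0%})")
```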

**6. Conclusions**

In this chapter, we introduced depth recovery algorithms that use a large number of images with small motions, obtained by camera motion that simulates fixational eye movements, especially the tremor component. The algorithms can be divided into a differential type and an integral type. For the differential type, it is desirable that the motion on the image be relatively small with respect to the texture pattern of the imaged surface; conversely, the integral type is appropriate for textures that are fine compared with the motion on the image. Therefore, ideally, the development of a depth recovery system in which both schemes function adaptively and selectively according to the target texture is the most important task for the future.
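One conceivable selection rule is sketched below; the texture-scale estimator and the $\lambda/4$-style decision constant are our own assumptions, not part of the proposed algorithms:

```python
import numpy as np

def dominant_texture_wavelength(patch):
    """Rough texture scale from the gradient: for a sinusoid of wavelength L,
    rms(grad) / rms(signal) ~ 2*pi / L, so L ~ 2*pi * rms(signal) / rms(grad)."""
    gy, gx = np.gradient(patch.astype(float))
    grad_rms = np.sqrt(np.mean(gx**2 + gy**2)) + 1e-12
    signal_rms = patch.std() + 1e-12
    return 2.0 * np.pi * signal_rms / grad_rms

def choose_scheme(patch, flow_magnitude):
    """Differential type when image motion is small w.r.t. the texture pattern,
    integral type when the texture is fine compared with the motion."""
    wavelength = dominant_texture_wavelength(patch)
    return "differential" if flow_magnitude < 0.25 * wavelength else "integral"

# Coarse texture with small motion -> differential; fine texture -> integral.
yy, xx = np.mgrid[0:64, 0:64]
print(choose_scheme(np.sin(2 * np.pi * xx / 32), flow_magnitude=2.0))  # differential
print(choose_scheme(np.sin(2 * np.pi * xx / 4), flow_magnitude=2.0))   # integral
```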

A detailed technical issue is to determine automatically the parameters that control the smoothness of the depth. This can be achieved by treating all unknowns as stochastic variables and formulating the problem in the variational Bayesian framework. As for the integral method, since the resolution of the recovered depth is low in principle, a composite scheme can be considered in which the differential type is applied again to refine the result obtained by the integral type.
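As a toy illustration only (a 1-D signal, a known noise precision, and an evidence-style point update in place of the full variational Bayesian treatment; all names and the model are our simplification):

```python
import numpy as np

def vb_smoothness_1d(d, beta=100.0, iters=20):
    """Jointly estimate a 1-D 'depth' profile z and its smoothness precision alpha.

    Observation model: d = z + noise, with known noise precision beta.
    Prior: first differences of z are Gaussian with precision alpha.
    Alternates the Gaussian posterior update for z with an update of alpha
    from the expected roughness E[||Dz||^2].
    """
    n = len(d)
    D = np.diff(np.eye(n), axis=0)          # first-difference operator
    DtD = D.T @ D
    alpha = 1.0
    for _ in range(iters):
        Sigma = np.linalg.inv(beta * np.eye(n) + alpha * DtD)  # posterior covariance
        z = beta * Sigma @ d                                   # posterior mean
        e_rough = z @ DtD @ z + np.trace(DtD @ Sigma)          # E[||Dz||^2]
        alpha = (n - 1) / e_rough                              # precision update
    return z, alpha

# Smooth ramp observed in noise; the smoothness weight adapts automatically.
rng = np.random.default_rng(1)
true = np.linspace(0.0, 1.0, 50)
z, alpha = vb_smoothness_1d(true + 0.1 * rng.standard_normal(50))
```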

So far, we have considered a method that assumes only tremor, but in the future we plan to study camera motion that also simulates drift and microsaccades. For the drift component, it is necessary to extend the tremor-based method to an online version that updates the depth estimate while tracking the target as a time-series process. When using microsaccades, large movements between frames must be handled; therefore, based on correspondences of feature points, sparse but highly accurate depth recovery can be expected. Drift itself does not have much merit on its own, but it plays an important role in generating microsaccades. As described above, we believe that an interesting system can be realized by comprehensively using the three components.

On the other hand, stereo vision and motion stereo have difficulty handling objects with little texture. In [24], we proposed a stereo system that takes shading information into account: the images projected to both cameras are computed by computer graphics techniques while the depth estimate is varied, and the depth is determined so that the generated images match the images observed by the cameras. As a result, the correspondence between images is realized indirectly. By introducing this method, it becomes possible to handle textureless objects. We aim to develop a comprehensive depth recovery method that includes this shading-based scheme.
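This generate-and-match loop can be caricatured as follows; the one-pixel "renderers" and the brute-force search are our stand-ins for the computer-graphics projection and the optimization of [24]:

```python
import numpy as np

def estimate_depth(obs_left, obs_right, render_left, render_right, candidates):
    """Pick the depth hypothesis whose rendered images best match both cameras.

    render_left / render_right: callables mapping a depth value to the
    intensity predicted for each camera (stand-ins for the CG projection).
    """
    def cost(z):
        return (render_left(z) - obs_left) ** 2 + (render_right(z) - obs_right) ** 2
    return min(candidates, key=cost)

# Toy shading models standing in for the rendered projections:
rl = lambda z: 1.0 / z
rr = lambda z: 1.1 / z
z_true = 1.25
zs = np.linspace(0.5, 2.0, 151)
print(estimate_depth(rl(z_true), rr(z_true), rl, rr, zs))  # -> 1.25
```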










**Appendix**

Here, the method of calibrating the axis of rotation is explained using **Figure 12**. Let a point in 3-D space be $\mathbf{X}_1 = [X_1, Y_1, Z_1]^T$ in the coordinate system before the camera rotation and $\mathbf{X}_2 = [X_2, Y_2, Z_2]^T$ in the coordinate system after the rotation, and let the coordinates of the corresponding points on the image be $\mathbf{x}_1 = [x_1, y_1, z_1]^T$ and $\mathbf{x}_2 = [x_2, y_2, z_2]^T$, respectively. Similarly, the optical axes before and after the rotation are $\mathbf{z}_1^1 = [0, 0, 1]^T$ and $\mathbf{z}_2^2 = [0, 0, 1]^T$, respectively, where the superscript indicates the coordinate system in which a vector is expressed. If the rotation is taken around the X-axis, the rotation matrix is given by the following equation.

$$\mathbf{R} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix}. \tag{31}$$

The translation $\mathbf{T}^1$ of the lens center generated by this rotation is given, in the coordinate system before the rotation, by the following equation.

**Figure 12.** *Explanation of rotation axis calibration.*

$$\mathbf{T}^1 = Z_0\, \mathbf{z}_2^1 - Z_0\, \mathbf{z}_1^1 = Z_0 (\mathbf{R} - \mathbf{I})\, \mathbf{z}_1^1 \equiv Z_0\, \mathbf{S} \mathbf{z}_1^1, \tag{32}$$

where $\mathbf{z}_2^1$ represents the optical axis after the rotation expressed in the coordinate system before the rotation (i.e., $\mathbf{z}_2^1 = \mathbf{R}\,\mathbf{z}_1^1$). In addition, $\mathbf{X}_1$ and $\mathbf{X}_2$ have the following relationship.

$$\mathbf{X}_2 = \mathbf{R}^T (\mathbf{X}_1 - \mathbf{T}^1) \;\rightarrow\; \mathbf{R} \mathbf{X}_2 = \mathbf{X}_1 - \mathbf{T}^1. \tag{33}$$

Furthermore, by substituting the relations $\mathbf{x}_1 = \mathbf{X}_1^1 / Z_1$ and $\mathbf{x}_2 = \mathbf{X}_2^2 / Z_2$ into Eq. (33), the following equation is obtained.

$$Z_2 \mathbf{R} \mathbf{x}_2 = \mathbf{X}_1 - Z_0\, \mathbf{S} \mathbf{z}_1^1. \tag{34}$$

By expressing this equation in terms of components and organizing it, the following two equations are derived:

$$Z_2 (y_2 \cos\theta - \sin\theta) = Y_1 + Z_0 \sin\theta, \tag{35}$$

$$Z_2 (y_2 \sin\theta + \cos\theta) = Z_1 - Z_0 (\cos\theta - 1). \tag{36}$$

By eliminating $Z_2$ from Eqs. (35) and (36), the solution for $Z_0$ is derived as follows:

$$Z_0 = \frac{Z_1 (y_2 \cos\theta - \sin\theta) - Y_1 (y_2 \sin\theta + \cos\theta)}{\sin\theta (y_2 \sin\theta + \cos\theta) + (\cos\theta - 1)(y_2 \cos\theta - \sin\theta)}. \tag{37}$$
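The derivation can be checked numerically. The following sketch (the angle, the axis offset, and the scene point are arbitrary test values of our choosing) builds the geometry of Eqs. (31)-(33) and confirms that Eq. (37) returns the assumed $Z_0$:

```python
import numpy as np

# Verify Eq. (37) numerically: recover Z0 from one correspondence.
theta, Z0 = np.deg2rad(3.0), 0.15           # assumed rotation angle and axis offset
R = np.array([[1.0, 0.0, 0.0],
              [0.0, np.cos(theta), -np.sin(theta)],
              [0.0, np.sin(theta),  np.cos(theta)]])  # Eq. (31), rotation about X
z1 = np.array([0.0, 0.0, 1.0])              # optical axis before rotation
T1 = Z0 * (R - np.eye(3)) @ z1              # Eq. (32): lens-center translation

X1 = np.array([0.2, -0.1, 1.5])             # scene point, pre-rotation coordinates
X2 = R.T @ (X1 - T1)                        # Eq. (33): same point after rotation
y2 = X2[1] / X2[2]                          # image coordinate (unit focal length)

Y1, Z1 = X1[1], X1[2]
num = Z1 * (y2 * np.cos(theta) - np.sin(theta)) \
      - Y1 * (y2 * np.sin(theta) + np.cos(theta))
den = np.sin(theta) * (y2 * np.sin(theta) + np.cos(theta)) \
      + (np.cos(theta) - 1) * (y2 * np.cos(theta) - np.sin(theta))
print(num / den)                            # Eq. (37): prints the assumed Z0 = 0.15
```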

**References**

[1] Martinez-Conde S, Macknik S L, and Hubel D. The role of fixational eye movements in visual perception. Nature Reviews Neuroscience. 2004;5:229–240.

[2] Stemmler M. A single spike suffices: the simplest form of stochastic resonance in model neurons. Network: Computation in Neural Systems. 1996;7:687–716.

[3] Prokopowicz P and Cooper P. The dynamic retina. Int. J. Computer Vision. 1995;16:191–204.

[4] Hongler M-O, de Meneses Y L, Beyeler A, and Jacot J. The resonant retina: Exploiting vibration noise to optimally detect edges in an image. IEEE Trans. Pattern Anal. Machine Intell. 2003;25:1051–1062.

[5] Ando S, Ono N, and Kimachi A. Involuntary eye-movement vision based on three-phase correlation image sensor. In: Proceedings on 19th Sensor Symposium; 2002. p. 83–86.

[6] Lazaros N, Sirakoulis G-C, and Gasteratos A. Review of stereo vision algorithms: from software to hardware. Int. J. Optomechatronics. 2008;2:435–462.

[7] Wang J and Zickler T. Local detection of stereo occlusion boundaries. In: Proceedings on CVPR; 2019. p. 3818–3827.

[8] Liu F, Zhou S, Wang Y, Hou G, Sun Z, and Tan T. Binocular light-field: imaging theory and occlusion-robust depth perception application. IEEE Trans. Image Processing. 2019;29:1628–1640.

[9] Horn B K P and Schunck B G. Determining optical flow. Artif. Intell. 1981;17:185–203.

[10] Simoncelli E P. Bayesian multi-scale differential optical flow. In: Jahne B, Haussecker H, Geissler P, editors. Handbook of Computer Vision and Applications. Academic Press; 1999. vol. 2. p. 397–422.

[11] Bruhn A, Weickert J, and Schnörr C. Lucas/Kanade meets Horn/Schunck: combining local and global optic flow methods. Int. J. Computer Vision. 2005;61:211–231.

[12] Tagawa N, Kawaguchi J, Naganuma S, and Okubo K. Direct 3-d shape recovery from image sequence based on multi-scale bayesian network. In: Proceedings on ICPR; 2008. p. CD-ROM.

[13] Tagawa N. Depth perception model based on fixational eye movements using Bayesian statistical inference. In: Proceedings on ICPR; 2010. p. 1662–1665.

[14] Tagawa N and Alexandrova T. Computational model of depth perception based on fixational eye movements. In: Proceedings on VISAPP; 2010. p. 328–333.

[15] Tagawa N, Iida Y, and Okubo K. Depth perception model exploiting blurring caused by random small camera motions. In: Proceedings on VISAPP; 2012. p. 329–334.

[16] Tagawa N, Koizumi S, and Okubo K. Direct depth recovery from motion blur caused by random camera rotations imitating fixational eye movements. In: Proceedings on VISAPP; 2013. p. 177–186.

[17] Sorel M and Flusser J. Space-variant restoration of images degraded by camera motion blur. IEEE Trans. Image Processing. 2008;17:105–116.

[18] Paramanand C and Rajagopalan A N. Depth from motion and optical blur with unscented Kalman filter. IEEE Trans. Image Processing. 2012;21:2798–2811.

