… reported by Huber-Mörk et al. (2008), where the combination of shape and local descriptors to capture the unique characteristics of the coin shape and die information was suggested. For ancient coin recognition, features from the Scale-invariant feature transform (SIFT) (Lowe, 2004) were used and compared to algorithms based on shape matching, i.e. a shape context description and a robust correlation algorithm (Zaharieva et al., 2007). Ancient coins are in general not of a perfectly circular shape. From a numismatic point of view, the shape of a coin is a very specific feature. Thus, the shape described by the edge of a coin serves as a first clue in the process of coin identification and discrimination. A shape-based method tuned to the properties of ancient coins was combined with matching of local features through Bayesian fusion (Huber-Mörk et al., 2010).

**3. Coin image preprocessing**

The appearance of coins in 2D images is highly influenced by the lighting conditions and the orientation of the imaged surface. Coins are characterized by a 3D surface, and the light reflected into the camera direction is typically a mixture of strong specular and diffuse reflections, depending on the placement of camera and light sources, the type of light sources, the coin surface structure, dirt and abrasion. In order to diminish the influence of the lighting conditions, a controlled acquisition setup is recommended. Controlled acquisition strongly improves recognition of objects of low intra-class surface variation, e.g. modern coins. Ancient coins are characterized by high surface variation even within a single class; therefore, different types and directions of light sources make small patterns on the coin look very different, which limits, for instance, the use of local image features for coin recognition. Best practice for the acquisition of ancient coins was summarized by Kampel & Zambanini (2008), and Hoßfeld et al. (2006) described a sophisticated system for modern coin acquisition.

In this section, we will discuss preprocessing under controlled illumination for modern coins and under slightly varying conditions for ancient coins. Since the shape of historical coins might not be as regular or flat as the shape of their modern counterparts, calculating 3D models is a promising approach towards higher coin matching rates. Therefore, we will also present the acquisition of 3D data from stereo image pairs and stripe projection in this section.

#### **3.1 Coin detection**

The separation of an object of interest from the background is commonly termed segmentation. Under controlled acquisition, automatic intensity thresholding approaches (Sezgin & Sankur, 2004) are feasible for modern coins (Nölle et al., 2003). Due to textured background, the presence of other objects in the image, inhomogeneous or poor illumination and low contrast, straightforward methods based on global image intensity thresholding tend to fail.

In situations where explicit knowledge on the properties of objects is available, this knowledge can be used to steer segmentation parameters. For example, the compactness measure was used in a comparable application to find an intensity threshold in images showing circular spot welds by Ruisz et al. (2007). Similarly, ancient coins were localized by thresholding the local intensity range, i.e. the difference between maximum and minimum graylevel in a local window, and evaluating the compactness measure (Zambanini & Kampel, 2008). Typically, the shape of modern coins is circular, whereas ancient coins deviate from this shape but still stay close to a circular outline. Therefore, approaches based on edge detection and application of the Hough transform (Duda & Hart, 1972) were applied to modern coins (Reisert et al., 2006) as well as to ancient coins (Arandjelović, 2010), where a modified version of the Hough transform was used.

For a modern coin, such as shown in Fig. 2(a), we suggest an edge-based technique to segment the coin from the background. The detection of the coin employs a common segmentation approach and works reliably under controlled lighting conditions and with a relatively clean background, e.g. a moderately dirty conveyor belt. Problems might be caused by very dark coins, i.e. coins which reflect only a small amount of light towards the camera. A multi-stage segmentation procedure is suggested, proceeding from edge detection to blob extraction; a sketch is given below.

Fig. 2. Image of a modern coin, intermediate detection results and segmentation.
An example of an overlay of the extracted blob onto the input image is shown in Fig. 2(f). Coin position and diameter are estimated from the detected blob, which directly delivers a translation-invariant description.
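As a rough sketch of such an edge-based multi-stage pipeline, under our own assumptions (OpenCV, with illustrative kernel sizes and Canny thresholds), one possible realization is the following; it is a sketch, not the exact procedure:

```python
import cv2
import numpy as np

def segment_modern_coin(gray):
    """Sketch of an edge-based multi-stage segmentation:
    edges -> morphological closing -> largest blob -> filled mask."""
    edges = cv2.Canny(gray, 50, 150)                           # edge detection
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    closed = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)  # close edge gaps
    contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    blob = max(contours, key=cv2.contourArea)                  # keep largest blob
    mask = np.zeros_like(gray)
    cv2.drawContours(mask, [blob], -1, 255, thickness=cv2.FILLED)
    (x, y), radius = cv2.minEnclosingCircle(blob)              # position, diameter
    return mask, (x, y), 2.0 * radius
```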

For ancient coins we employ a measure of compactness *ct* related to a threshold *t*, defined as

$$c_t = 4\pi A_t / P_t^2 \tag{1}$$

where *At* is the area of the region covered by the coin and *Pt* is the perimeter of the coin. The measures *At* and *Pt* are obtained by connected components analysis (Sonka et al., 1998) applied to the binary image which is derived from thresholding the intensity range image. Figure 3(a) shows an intensity image of an ancient coin, Fig. 3(b) is the corresponding intensity range image, and Figs. 3(c)-(e) show thresholded images for different selections of *t* along with calculated values for compactness *ct*. The image thresholded at the optimal level *topt* with highest compactness is given in Fig. 3(f). A sudden decrease of the compactness measure occurs with oversegmentation of the coin into several small regions, e.g. compare to Fig. 3(e).

Fig. 3. Image of an ancient coin, intensity range image and different binary images with corresponding threshold and compactness: (a) coin image, (b) intensity range, (c) binarized, *t* = 5, *c*5 = 0.096, (d) binarized, *t* = 65, *c*65 = 0.859, (e) binarized, *t* = 85, *c*85 = 0.018, (f) binarized, *t* = 49, *c*49 = 0.888.
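As an illustration of Eq. (1), the threshold *topt* can be selected by sweeping *t* over the intensity range image and keeping the most compact result. The following is a minimal sketch; the window size and the threshold grid are our own illustrative assumptions:

```python
import cv2
import numpy as np

def compactness(binary):
    """Compactness c_t = 4*pi*A_t / P_t^2 of the largest region (Eq. 1)."""
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    if not contours:
        return 0.0
    largest = max(contours, key=cv2.contourArea)
    area = cv2.contourArea(largest)
    perimeter = cv2.arcLength(largest, closed=True)
    return 4.0 * np.pi * area / perimeter ** 2 if perimeter > 0 else 0.0

def detect_ancient_coin(gray, window=15, thresholds=range(5, 100, 2)):
    """Threshold the local intensity range image; keep the most compact mask."""
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (window, window))
    local_range = cv2.subtract(cv2.dilate(gray, kernel),   # local maximum
                               cv2.erode(gray, kernel))    # minus local minimum
    best_t, best_c = None, -1.0
    for t in thresholds:
        _, mask = cv2.threshold(local_range, t, 255, cv2.THRESH_BINARY)
        c = compactness(mask)
        if c > best_c:
            best_t, best_c = t, c
    return best_t, best_c
```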

#### **3.2 Invariant preprocessing for 2D images**

Apart from the illumination dependency, the appearance of a coin varies considerably with respect to its grey values depending on dirt and abrasion. These variations are frequently inhomogeneous. This suggests, even if the illumination influence could be neglected, that grey values by themselves will not give appropriate results for recognition purposes. On the other hand, edge information remains more or less stable, or at least degrades gracefully. Therefore, we based the feature extraction for coin recognition on edges. In principle, any edge detector may be used for this purpose. From our experience, the approaches suggested by Canny (1986) and by Rothwell et al. (1995) as well as the Laplacian of Gaussian method (Marr & Hildreth, 1980) work satisfactorily.
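For instance, with OpenCV a Canny edge image is obtained in a single call; the input path and the hysteresis thresholds below are illustrative:

```python
import cv2

gray = cv2.imread("coin.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input image
edges = cv2.Canny(gray, 50, 150)  # hysteresis thresholds chosen for illustration
```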

For reliable matching of coins, invariance with respect to rotation has to be taken into account. Invariance with respect to translation was already discussed and is taken into account by the segmentation approach of Sec. 3.1. Scale variance is accounted for either by using a calibrated acquisition device or by normalization of the segmented image.

In general, rotational invariance is approached either via the use of geometrical moments (Hu, 1962), radial coding of features (Torres-Mendez et al., 2000), or a mapping from Cartesian to polar coordinate representation, e.g. log-polar mapping (Kurita et al., 1998). A method based on the construction of an Eigenspace from uniformly rotated images was published by Uenohara & Kanade (1997). Their approach works by locating a specific small pattern in a larger image. In a later paper (Uenohara & Kanade, 1998), an improvement of the location method based on the discrete cosine transform (DCT) was suggested.

We obtain rotational invariance by estimation of the rotation angle followed by a rotation into a reference pose. Angle estimation is performed on images transformed into polar coordinates. In the polar image, shift invariance, which corresponds to rotational invariance when mapped back to Cartesian coordinates, is achieved through cross-correlation. Cross-correlation is efficiently implemented using the fast Fourier transform (FFT) (Cooley & Tukey, 1965).

Rotational invariance for a coin edge image involves cross-correlation with reference edge images. The edge image is mapped from Cartesian to polar coordinates, see Fig. 4. The result of cross-correlation between the coin image to be classified and a set of reference images is used to derive class hypotheses. In detail, for both sides of a coin under investigation, rotational invariant processing and hypothesis generation proceed as follows:


1. Estimation of the coin diameter from coin detection.
2. Selection of a set of reference images depending on thickness and diameter measure (if available). Each reference image is associated with a coin class.
3. Cross-correlation of the coin side edge image under investigation with all reference coin edge images in the selected reference set, resulting in a cross-correlation value and an associated rotation angle estimate for each reference class.
4. Ranking of the reference set by the maximum correlation value and generation of a set of hypotheses for the highest-ranking classes.

Fig. 4. Processing for rotational invariance.

To obtain reliable estimates for cross-correlation and rotation angle, the polar image is split into *n* bands along the radius coordinate, corresponding to concentric rings in Cartesian coordinates. The peak of the correlation value *Ki* for band *i* is determined for each band, and the position of the peak is taken as an estimate of the rotation angle *αi* in band *i*. The sample mean angle direction *α*¯ is estimated via (Fisher, 1995):

$$\overline{\alpha} = \begin{cases} \arctan(S/C) & \text{if } S \ge 0 \text{ and } C > 0 \\ \arctan(S/C) + \pi & \text{if } C < 0 \\ \arctan(S/C) + 2\pi & \text{if } S < 0 \text{ and } C > 0 \end{cases} \tag{2}$$

with $C = \sum_{i=1}^{n} \delta_i \cos \alpha_i$ and $S = \sum_{i=1}^{n} \delta_i \sin \alpha_i$. If band *i* contains a significant number of edge pixels in both the reference coin and the coin under investigation, $\delta_i = 1$; otherwise $\delta_i = 0$.

A cross-correlation estimate $K$ for the coin under investigation is calculated as $K = \frac{1}{n'} \sum_{i=1}^{n} \delta_i K_i$. The number of bands $n' \le n$ used in cross-correlation and angle estimation varies between images and is simply obtained as $n' = \sum_{i=1}^{n} \delta_i$.
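A compact sketch of this band-wise scheme, under our own assumptions (OpenCV's warpPolar for the polar mapping, the raw circular correlation peak as *Ki*, and a simple pixel-count rule for *δi*), could look as follows:

```python
import cv2
import numpy as np

def to_polar(edge, radius, n_angles=512, n_bands=8):
    """Map a centered binary edge image to a polar grid (rows: angle, cols: band)."""
    center = (edge.shape[1] / 2.0, edge.shape[0] / 2.0)
    return cv2.warpPolar(edge.astype(np.float32), (n_bands, n_angles),
                         center, radius, cv2.WARP_POLAR_LINEAR)

def rotation_and_correlation(polar_a, polar_ref, min_pixels=20.0):
    """Per-band circular cross-correlation via FFT; circular mean angle (Eq. 2)
    and band-averaged correlation estimate K."""
    n_angles, n_bands = polar_a.shape
    alphas, deltas, peaks = [], [], []
    for i in range(n_bands):
        a, r = polar_a[:, i], polar_ref[:, i]
        # cyclic cross-correlation along the angle axis via FFT
        corr = np.real(np.fft.ifft(np.fft.fft(a) * np.conj(np.fft.fft(r))))
        alphas.append(2.0 * np.pi * np.argmax(corr) / n_angles)
        deltas.append(1.0 if min(a.sum(), r.sum()) > min_pixels else 0.0)
        peaks.append(corr.max())
    C = sum(d * np.cos(al) for d, al in zip(deltas, alphas))
    S = sum(d * np.sin(al) for d, al in zip(deltas, alphas))
    alpha_bar = np.arctan2(S, C) % (2.0 * np.pi)  # covers the cases of Eq. 2
    n_used = sum(deltas)
    K = sum(d * k for d, k in zip(deltas, peaks)) / n_used if n_used else 0.0
    return alpha_bar, K
```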

#### **3.3 Surface analysis from 3D data**

Analysis of coin images in 2D might lead to a loss of important features, e.g. highlights due to specular reflections decrease the quality of the images and handicap automatic analysis. Especially ancient coin surfaces are reliefs visualizing inscriptions and symbols. Therefore, the appearance of coins in 2D images is highly influenced by the lighting conditions. Different lighting directions make small patterns on the coin look very different, which limits, for instance, the use of local image features for coin recognition. Since the surface shape of historical coins might not be as regular or flat as the shape of their modern counterparts, we suggest calculating 3D reconstructions for higher coin matching rates. With 3D scans, detailed models of both coin sides are obtained which allow a more accurate analysis (Akca et al., 2007).

However, 3D acquisition is more laborious and expensive and, to our knowledge, 3D vision approaches applied to 3D databases of coins do not exist at the moment. By using 3D coin models, various additional features can be obtained for object matching which are not available in 2D (e.g. changes on the coin's surface, thickness and volume measurements). The profile of an exemplary ancient coin is shown in Fig. 5(a). Two coin cuts, which are clearly visible in the 3D reconstruction in Fig. 5(a), can also be seen in the profile plot in Fig. 5(b).

Fig. 5. 3D reconstruction of an ancient coin: (a) 3D rendering, (b) profile plot.

The Breuckmann stereoSCAN 3D system (http://www.breuckmann.com/index.php?id=stereoscan) was used for coin data acquisition (Zambanini et al., 2009). The scanner is an active stereo system consisting of a projector and two cameras serving as a stereo camera pair, and it combines the shape-from-structured-light and stereo vision approaches (Stoykova et al., 2007). In order to evaluate the accuracy of the coin models acquired by the Breuckmann stereoSCAN 3D, real-world coin data is compared to data gathered from their virtual 3D model counterparts.


A black and white stripe pattern is projected onto the coin's surface. The stripes get deformed by the coin's shape and its surface structure. By using a stereo camera pair, 3D information can be obtained from two 2D images showing the same object at exactly the same time from different views. In active stereo vision, a light source projects artificial features. These features are easy to extract as their properties are known, and they can be matched unambiguously. In the setup used for coin acquisition, the scanner provides a theoretical x-y resolution of 20 *μm* and a theoretical z-resolution limit of 1 *μm*.

The goal of stereo vision is to obtain depth information from 2D input data. Since the two cameras have a fixed relative orientation, the distance between them is not variable, and the position of any point in 3D space can be obtained by triangulation. Therefore, the intersection of two lines from the two images, where each line passes through the projection of the point and the projection center, has to be determined. The setup can be described using epipolar geometry, which is the geometry between two views (Hartley & Zisserman, 2003). As an initial step, corresponding points must be found, which is performed using the projected and deformed stripe pattern on the object's surface. We fixed the coins on a rotation/tilt table in front of the active stereo system, and the object was scanned from eight different but known viewing positions. For aligning the data from different viewpoints, the Iterative Closest Point (ICP) algorithm, presented by Besl & McKay (1992) and Chen & Medioni (1992), is used. Since the position of the rotation/tilt table is known, a preliminary alignment can be performed first. All eight scans are finally aligned and merged into a polygon mesh.
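A minimal sketch of this two-step alignment, assuming the scans are available as point clouds handled with the Open3D library (our choice, not the original implementation; the voxel size and distance threshold are illustrative):

```python
import open3d as o3d

def align_scan(scan, reference, table_pose, voxel=0.05):
    """Coarse alignment from the known rotation/tilt-table pose, refined by ICP."""
    src = scan.voxel_down_sample(voxel)
    tgt = reference.voxel_down_sample(voxel)
    result = o3d.pipelines.registration.registration_icp(
        src, tgt,
        max_correspondence_distance=2.0 * voxel,
        init=table_pose,  # 4x4 matrix derived from the known table position
        estimation_method=o3d.pipelines.registration.
            TransformationEstimationPointToPoint())
    return result.transformation  # refined 4x4 transform for merging the scans
```

The known table pose supplies the initial guess; ICP then refines the registration before the scans are merged into a mesh.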

#### **3.4 Extraction of coin shape features**

As the appearance of an ancient coin is often unique, e.g. due to variations in the hammering process, die, mint signs, shape, scratches, wear, etc., its image contains important information for identification. The uniqueness in the appearance of coins results from variations in the coin blank material and the application of the tools in minting, as well as from wear of the coin. Therefore, for numismatists, the shape of the coin edge is regarded as an important feature to characterize a coin.

Our approach to shape comparison is based on a description of the difference between the shape of a coin and the shape of a circle. Therefore, the suggested approach is called deviation from circular shape matching (DCSM). In order to represent the coin shape, border tracing is performed on the binary image resulting from segmentation. A list of border pixels is obtained and resampled to *l* samples using equidistantly spaced intervals with respect to the arc length. Figures 6(a)-(d) show this operation.

A one-dimensional descriptor, i.e. a curve describing the border, is obtained by fitting the coin edge to a circle and unrolling the polar distances between sample points and the fitted circle into a vector. The center *sc* = (*xc*, *yc*) of the fitted circle is derived from the center of gravity, and the radius *r* is the mean distance between the center and all sample points *si* = (*xi*, *yi*), using

$$x_c = \frac{1}{l} \sum_{i=1}^{l} x_i, \quad y_c = \frac{1}{l} \sum_{i=1}^{l} y_i, \quad r = \frac{1}{l} \sum_{i=1}^{l} \|s_i - s_c\| \tag{3}$$


where (*xi*, *yi*) are the coordinates of sample point *si* and ‖·‖ denotes the *L*2-norm. The 1D representation is given by *D* = (*d*1, ..., *dl*), where

$$d_i = (\|s_i - s_c\| - r)/r, \quad i = 1, \dots, l \tag{4}$$

The division by *r* makes the representation invariant with respect to scale. Figure 6(e) shows the obtained 1D representation.

Fig. 6. Processing of coin contour: (a) coin image, (b) coin edge, (c) fitted circle, (d) sampling along arc, (e) normalized 1D description of coin shape.
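Eqs. (3) and (4) translate directly into a short routine; the sketch below assumes the border pixels come from a border tracing step and resamples them by linear interpolation along the arc length (the interpolation scheme and the sample count are our own choices):

```python
import numpy as np

def dcsm_descriptor(border, l=256):
    """Normalized deviation-from-circle descriptor D = (d_1, ..., d_l)
    (Eqs. 3 and 4). border: (m, 2) array of traced border pixels."""
    closed = np.vstack([border, border[:1]])                # close the contour
    seg = np.linalg.norm(np.diff(closed, axis=0), axis=1)   # segment lengths
    arc = np.concatenate([[0.0], np.cumsum(seg)])           # cumulative arc length
    targets = np.linspace(0.0, arc[-1], l, endpoint=False)  # equidistant positions
    samples = np.stack([np.interp(targets, arc, closed[:, k])
                        for k in range(2)], axis=1)
    center = samples.mean(axis=0)                           # (x_c, y_c), Eq. 3
    dist = np.linalg.norm(samples - center, axis=1)
    r = dist.mean()                                         # mean radius r, Eq. 3
    return (dist - r) / r                                   # d_i, Eq. 4
```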

**4. Matching for classification and identification**

Matching for classification or identification is based on edge-based features extracted as described in the previous section. In this section, we will discuss direct matching of edge features, Eigenspace matching and shape matching.

#### **4.1 Direct matching of edges**

In the direct matching approach for edge points, we start with a binary edge image *E* derived from a coin image. Let $E^c = \{(x - x_m, y - y_m) \,|\, E(x,y) = 1\}$ be the list of Cartesian edge point coordinates with the center of gravity $(x_m, y_m)$ as origin. The polar coordinate representation of $E^c$ is given by

$$E^p = \{(\theta, \rho) \,|\, (x, y) \in E^c\}, \quad \theta = \arctan(y/x), \quad \rho = \sqrt{x^2 + y^2}, \quad x = \rho \cos \theta, \quad y = \rho \sin \theta \tag{5}$$


Assume *Em* is a reference image, or so-called master edge image, and *Ea* is an edge image to be matched. Then, in general, there is an unknown rotation *φ* around the center of gravity that aligns both edge images. In polar coordinates, this rotation transforms into a cyclic translation in the angular direction. To determine *φ* we may deploy a fast correlation method based on the edge images, see Subsec. 3.2. Although correlation methods based on the fast Fourier transform perform efficiently, there are some drawbacks to using the edge images directly. First, to preserve the visual information, the resolution of the edge image cannot be too small. Depending on the diameter of the coin, we typically get coin image resolutions from 100 × 100 to 300 × 300 pixels, and the correlation would add significantly to the overall computational costs. Secondly, the outer border, which in most cases contains a substantial part of the edge points, usually does not help to find *φ* as it comprises too many symmetries. To avoid both, we suggest calculating the correlation on a two-dimensional edge density function restricted to the inner part of the coin. This is given by

$$H^d_{i,j} = |\{(\theta, \rho) \in E^p \,|\, \theta_{i-1} \le \theta < \theta_i,\; \rho_{j-1} < \rho < \rho_j\}| \tag{6}$$

$$E^d_{i,j} = H^d_{i,j}/N, \quad i = 1, \dots, n;\; j = 1, \dots, l, \quad N = \sum_{i,j}^{n,l} H^d_{i,j}$$

The sets $\{\theta_0, \dots, \theta_n\}$ and $\{\rho_0, \dots, \rho_l\}$ define the discrete resolutions in the angular and distance directions, respectively. Now, we may estimate *φ* by correlating $E^d_m$ and $E^d_a$. By choosing a high resolution in the angular direction (i.e. *n* ≥ 512) and a coarse resolution (i.e. *l* ≤ 16) in the distance direction, omitting the coin borders, we found that *φ* usually may be determined up to ±0.5°. Once *φ* is known, we may align the actual coin image to the master. This is done efficiently by calculating the rotated coordinates only for the edge points in $E^c_a$, resulting in the rotated actual coin edge image $E_{a\phi}$. From here we compute two distance measures

$$e_\text{abrasion} = \frac{1}{|E^c_m|} \sum_{(x,y) \in E^c_m} \left(1 - \bar{E}^d_{a\phi}(x, y)\right) \tag{7}$$

$$e_\text{dirt} = \frac{1}{|E^c_{a\phi}|} \sum_{(x,y) \in E^c_{a\phi}} \left(1 - \bar{E}^d_m(x, y)\right) \tag{8}$$

where $\bar{E}^d$ is the result of applying a morphological dilation operation to the binary edge image *E* in order to counteract the remaining uncertainty of the angular position. The measure $e_\text{abrasion}$ tells us how many expected (master) edge points are missing, whereas $e_\text{dirt}$ sums the additional edge points in the actual edge image. If these errors are higher than given thresholds, we have to dismiss the match. In general, we cannot know which master coin corresponds to the actual coin image. Therefore, we have to calculate Eqns. (7) and (8) for all master coin candidates.
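A sketch of the density computation and of Eq. (7) (Eq. (8) is symmetric, with the roles of master and actual image exchanged); the bin counts and the dilation size are illustrative assumptions:

```python
import cv2
import numpy as np

def edge_density(polar_points, n=512, l=16, rho_max=1.0):
    """Normalized 2D edge density E^d = H^d / N on an angle x radius grid (Eq. 6)."""
    theta, rho = polar_points[:, 0], polar_points[:, 1]
    H, _, _ = np.histogram2d(theta, rho, bins=(n, l),
                             range=((-np.pi, np.pi), (0.0, rho_max)))
    return H / H.sum()

def abrasion_error(master_edge, rotated_edge, dilate_px=3):
    """Fraction of master edge points not covered by the dilated,
    rotated actual edge image (Eq. 7)."""
    kernel = np.ones((dilate_px, dilate_px), np.uint8)
    covered = cv2.dilate(rotated_edge, kernel) > 0
    ys, xs = np.nonzero(master_edge)
    return 1.0 - covered[ys, xs].mean()
```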

#### **4.2 Eigenimage representation and matching**

The Eigenspace decomposition for image analysis was introduced by Sirovich & Kirby (1987) and has found numerous applications over the last decades, most prominently in the field of face recognition (Turk & Pentland, 1991). We start with the description of the mathematical procedure of Eigenspace construction employing principal components analysis (PCA). Subsequently, we discuss multiple Eigenspaces in the context of coin recognition.

In the Eigenspace approach, we consider a set of *M* images *B*1 to *BM*. Each image *Bi* is of size *N* × *N* pixels. The images are reformed into vectors Γ1 to Γ*M*, e.g. by scanning the image line by line. If all pixels of an image are used to produce a vector, each vector Γ*i* has length *L* = *N*². An average vector Ψ and difference vectors *ψi* are calculated by

$$\Psi = \frac{1}{M} \sum_{i=1}^{M} \Gamma_i, \qquad \psi_i = \Gamma_i - \Psi, \quad i = 1, \dots, M \tag{9}$$

Principal axes are obtained by the Eigendecomposition of the covariance matrix C defined by

$$C = \frac{1}{M} \sum_{i=1}^{M} \psi_i \psi_i^T = AA^T, \quad \text{where} \quad A = (\psi_1, \psi_2, \dots, \psi_M) \tag{10}$$

The Eigenvectors are sorted in non-increasing order of the corresponding Eigenvalues. A small number *M*′ of significant Eigenvectors is retained from the ranked Eigenvalues, a common practice which leads to the most expressive features (Turk & Pentland, 1991). A weighting factor *ωk* corresponding to the *k*-th Eigenimage for a new reformed image is obtained by projection onto the *k*-th Eigenspace component *uk* using

$$\omega_k = u_k^T (\Gamma - \Psi), \quad k = 1, \dots, M' \tag{11}$$

The weights *ωk* are arranged in a vector $\Omega = (\omega_1, \dots, \omega_{M'})^T$. For the coin recognition task, the full images are not reformed into a vector; only the interior pixels of the coin are rearranged into the vector Γ, see Fig. 7.

Fig. 7. Arrangement of inner coin pixels into a vector.
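The construction of Eqs. (9)-(11) can be realized via an SVD, which yields the Eigenvectors of the covariance matrix without forming it explicitly; a minimal NumPy sketch (the number of retained components *M*′ is illustrative):

```python
import numpy as np

def build_eigenspace(images, m_prime=32):
    """Eigenimage construction (Eqs. 9-10); images: (M, L) matrix of row vectors."""
    psi_mean = images.mean(axis=0)                 # average vector Psi (Eq. 9)
    A = images - psi_mean                          # rows are the psi_i
    # right singular vectors of A are the Eigenvectors of the covariance matrix
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return psi_mean, Vt[:m_prime]                  # M' most expressive Eigenimages

def project(image_vec, psi_mean, U):
    """Weight vector Omega by projection onto the Eigenimages (Eq. 11)."""
    return U @ (image_vec - psi_mean)
```

Matching then amounts to comparing weight vectors Ω, e.g. by Euclidean distance.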

To overcome limitations regarding illumination variation in the Eigenspace approach, a number of solutions were proposed: e.g. Murase & Nayar (1994) investigate the determination of the illumination which gives the best discrimination. The PCA of edge images and smoothed edge images is suggested as an illumination-invariant way of Eigenspace construction by Yilmaz & Gökmen (2000), gradient images are used as input to PCA by Venkatesh et al. (2002), and Bischof et al. (2001) use a set of gradient-based filter banks applied to the Eigenimage representation.

Figure 8(a) shows the first 32 Eigenimages constructed from graylevel images; the top left image is the Eigenimage corresponding to the largest Eigenvalue. Histogram equalization is sometimes suggested as a way to achieve illumination invariance. Figure 8(b) shows the most expressive Eigenimages constructed from histogram equalized images. Figure 8(c) shows the most expressive Eigenimages constructed from edge images. Eigenhills have been suggested by Yilmaz & Gökmen (2000), where Eigenhills are derived from the application of the PCA to edge images covered by a "membrane". We used a 2D Gaussian filter kernel with a σ of 1.5 to smooth the edge images, which are of size 128 × 128 pixels. Figure 8(d) shows the most expressive Eigenimages, i.e. Eigenhills, constructed from smoothed edge images.

Fig. 8. First 32 Eigenimages ranked by corresponding Eigenvalues for different variants of Eigenspace representation: (a) intensity Eigenspace, (b) equalized intensity Eigenspace, (c) edge Eigenspace, (d) Eigenhills.

Figure 9 gives the normalized cumulative sum of the sorted Eigenvalues for all the considered variants of Eigenspace representation. For the intensity Eigenspace, the first 32 Eigenimages retain approximately 78% of the variance present in the original set of intensity images. Approximately 60% of the variance present in the original set of histogram equalized intensity images is contained in the first 32 sorted Eigenimages. For the edge Eigenspace, only about 42% of the variance present in the original set of edge images is contained in the first 32 sorted Eigenimages. Approximately 76% of the variance present in the original set of smoothed edge images is contained in the first 32 sorted Eigenhills. Therefore, the Eigenhills approach achieves a compact representation comparable to the intensity Eigenspace, while also being illumination invariant.

#### **4.3 Shape matching**

The shape descriptions of two coins are compared by a linear combination of global and local shape matching. The local matching is derived from the difference of Fourier shape descriptors, whereas the correlation coefficient between the curves serves as a global measure of shape similarity.
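One possible reading of this combination as code; the number of Fourier coefficients, the mixing weight, and the use of descriptor magnitudes are illustrative assumptions on our part:

```python
import numpy as np

def shape_similarity(d1, d2, weight=0.5, n_coeff=16):
    """Linear combination of a global and a local shape measure for two
    DCSM descriptors d1, d2 (Eq. 4)."""
    global_sim = np.corrcoef(d1, d2)[0, 1]            # correlation coefficient
    f1 = np.abs(np.fft.fft(d1))[1:n_coeff + 1]        # low-order Fourier
    f2 = np.abs(np.fft.fft(d2))[1:n_coeff + 1]        # descriptor magnitudes
    local_dist = np.linalg.norm(f1 - f2)
    return weight * global_sim - (1.0 - weight) * local_dist
```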
