
*New Attributes Extraction System for Arabic Autograph as Genuine and Forged through… DOI: http://dx.doi.org/10.5772/intechopen.96561*

**Figure 1.** *Proposed methodology.*

utilized in image recognition lately. According to Alattas [9], financial institutions are interested in benefiting from the reliability and safety of offline signature recognition systems. Another major reason is that online authentication systems require more complex processing and higher-tech gadgets than offline systems. Offline autographs are usually presented on a piece of paper, which is the norm in documentation. Currently, there is a need for efficient online and offline systems to ascertain the genuineness of personal autographs. Authentication of handwritten autographs usually consists of a series of procedures: pre-processing (where images are enhanced, binarized, divided into fragments, and subjected to other related operations), feature extraction (where features of the signatures are extracted in raw form), feature selection or reduction (where the extracted features are reduced for efficiency), and identification and authentication of the signatures against the signature database based on the selected features. A good verification outcome can be achieved by comparing the strong features of the test sample against the autograph of a known signer using suitable techniques or classifiers [10]. Some methods depend on local tests, which concentrate on the analysis of the essential features of different scripts [10–12]. Other studies utilized evolving curves that do not drift away from nearby features, reducing superfluous fragmentation [13]. Based on the identified gap in the literature, in this paper we propose a new process to identify and authenticate offline Arabic signatures. This method uses a combination of techniques, including an adaptive window positioning procedure for autograph attribute extraction and a feature selection method for reducing and selecting important features. In this paper, an enhanced Discrete Cosine Transform (DCT) and Sparse Principal Component Analysis (SPCA) method is used to extract attributes. Further, these extracted features are reduced to the best features only. In order to classify genuine and forged signatures, two types of classifiers are applied: 1) Decision Tree and 2) Support Vector Machine (SVM). The classification outcomes of the Decision Tree and the SVM are compared to choose the better classifier.

**2. Proposed scheme**

In this part, we introduce an offline Arabic signature identification system based on classification techniques. The procedure consists of four phases: pre-processing, feature extraction, feature selection by the (DCT+SPCA) technique, and matching. The complete process begins with acquiring the signature images, which undergo a pre-processing stage followed by the identification and verification processes, as illustrated in **Figure 1**.

#### **2.1 Pre-processing**

In this step, data are acquired and signature images are pre-processed. For the purpose of this study, an Arabic signature dataset consisting of 500 true samples and 250 forged samples was used. True samples were obtained from 50 different persons. Every signer was asked to sign 10 times using common types of pens. The 10 signatures collected from each person were used as follows: six of these signatures were selected at random for system learning and the remaining four were used for system testing, in addition to five forged samples.

#### *2.1.1 Arabic signature database*

This study employed the Arabic signature database created by Anwar Yahya Ebrahim as the Arabic signature samples for testing the proposed method.

*Applications of Pattern Recognition*


Arabic signatures are written on A4-size paper and then scanned at 300 dpi as 256-gray-level images. The dataset encompasses scanned signatures collected by the signer Anwar Yahya Ebrahim et al. Each signatory has 10 signatures, of which 6 are genuine and 4 are forged. There are enough signatures to ensure sufficient samples for both training and testing: 7 of the samples are assigned to the training set and the remaining 3 to the testing set, from both classes.

The distribution of the number of genuine and forged samples for the different signatories is illustrated in **Figure 2**. The Arabic signature images are then pre-processed in order to improve their quality. Noise, such as irrelevant data, is removed to improve the identification performance. These samples are then converted into binary images before the feature extraction process [14–17].
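The chapter does not spell out its smoothing and binarization routine, so the sketch below is only one plausible reading: a naive 3×3 median filter followed by a global threshold, in pure NumPy. The function names, the threshold value of 128, and the toy image are my own assumptions, not the authors' implementation.

```python
import numpy as np

def median_filter3(img: np.ndarray) -> np.ndarray:
    """Naive 3x3 median filter (edges handled by replicate padding)."""
    padded = np.pad(img, 1, mode="edge")
    # stack the 9 shifted views of the image and take the per-pixel median
    stack = np.stack([padded[r:r + img.shape[0], c:c + img.shape[1]]
                      for r in range(3) for c in range(3)])
    return np.median(stack, axis=0)

def binarize(gray: np.ndarray, threshold: int = 128) -> np.ndarray:
    """Map a 256-level grayscale scan to {0, 1}: ink = 1, background = 0."""
    smoothed = median_filter3(gray)
    return (smoothed < threshold).astype(np.uint8)

# toy 5x5 "scan": a thick dark stroke on a light background
gray = np.full((5, 5), 250, dtype=np.uint8)
gray[1:4, 1:4] = 20
binary = binarize(gray)
print(binary)
```

A real pipeline would likely use an adaptive threshold (e.g., Otsu) rather than a fixed one, but the fixed value keeps the sketch self-contained.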

#### **2.2 Feature extraction**

The adaptive window positioning technique is then applied to separate the Arabic autograph images into small segments, or sub-images. This makes the removal of redundant data easy and facilitates the comparison of segmented fragments. A 14×14 segment size is chosen for the images for optimum output [18]. Further, a group of features (form measures) is extracted, which represents the signature image in a feature space. To analyze the data accurately, a variety of observations, as well as the values of significant individual features, need to be organized so that the data can be processed and analyzed by machines or humans.

The goal of form representation is to obtain form measures. These measures are used as classification features in the models. Moreover, the sub-images are represented by the set of obtained features [19].
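The adaptive window positioning procedure of [18] is not reproduced here; as a simplified stand-in, the sketch below tiles a binary signature image into fixed 14×14 windows and discards the empty ones, which illustrates the "remove redundant data, keep comparable fragments" idea in the paragraph above. The padding scheme and the ink-test are my assumptions.

```python
import numpy as np

def split_into_windows(binary: np.ndarray, size: int = 14):
    """Split a binary signature image into size x size sub-images,
    keeping only windows that contain ink (non-empty fragments)."""
    h, w = binary.shape
    # pad so both dimensions become multiples of the window size
    ph, pw = (-h) % size, (-w) % size
    padded = np.pad(binary, ((0, ph), (0, pw)), mode="constant")
    windows = []
    for r in range(0, padded.shape[0], size):
        for c in range(0, padded.shape[1], size):
            win = padded[r:r + size, c:c + size]
            if win.any():          # discard fragments with no ink
                windows.append(win)
    return windows

# toy example: a 30x40 image with a diagonal stroke
img = np.zeros((30, 40), dtype=np.uint8)
for i in range(28):
    img[i, i] = 1
frags = split_into_windows(img)
print(len(frags), frags[0].shape)
```

A truly adaptive scheme would slide each window to the locally best position before cutting; the fixed grid above is only the simplest approximation.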


**Figure 2.** *Examples of genuine signatures and their respective forged counterparts found in the Arabic signatures.*

The attributes are then normalized using a feature matrix. The normalization process is very important: when attributes span different ranges, higher values may dominate lower ones and distort the results. Normalization places the attribute values on the same scales and ranges to enable comparison. The projection and profile features are normalized by the window height, while the other descriptors are normalized by their maximum possible respective values. After normalization, the features of the main window are composed into a vector. This scales and translates each feature individually to a fixed range on the training set, namely a number between zero and one [20].
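The per-feature scaling to [0, 1] described above can be sketched as column-wise min–max normalization on the training matrix. The toy feature values are placeholders of my own; the chapter normalizes some descriptors by window height rather than by the observed minimum/maximum, so this is the generic form only.

```python
import numpy as np

def minmax_normalize(features: np.ndarray):
    """Scale each feature (column) of a training matrix to [0, 1].
    Constant columns are mapped to 0 to avoid division by zero."""
    fmin = features.min(axis=0)
    fmax = features.max(axis=0)
    span = np.where(fmax > fmin, fmax - fmin, 1.0)
    return (features - fmin) / span, (fmin, fmax)

# toy matrix: 3 windows x 2 features in very different ranges
X = np.array([[10.0, 0.002],
              [30.0, 0.004],
              [20.0, 0.010]])
Xn, (fmin, fmax) = minmax_normalize(X)
print(Xn)
```

The stored (fmin, fmax) pair would be reused to scale test-set features with the training-set statistics, which is what "a fixed range on the training set" implies.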

#### **2.3 Features selection**

This study proposes a fusion of two feature sets, namely Discrete Cosine Transform and Sparse Principal Component Analysis features (DCT⨁SPCA). The former represents the high-pass content of the signature images in the vertical, diagonal, and horizontal directions, whereas the latter is proposed to discriminate between genuine and forged Arabic signatures. The reason for combining DCT and SPCA features is that both are transform-based features, so their homogeneity makes them a natural choice for fusion. Fusion combines the useful information from both representations, and the motivation for combining these features is the numerous similarities found between DCT and SPCA features. The proposed technique uses the high-pass signature images to extract the information necessary for signature verification.

Following feature selection, the twelve DCT features and the eight SPCA features are extracted. These features are then fused in order to classify signatures into genuine and forged classes. Suppose the twelve DCT features are represented by


*α*1, *α*2, *α*3, … , *α*12 and the eight SPCA features by *β*SPCA1, *β*SPCA2, *β*SPCA3, … , *β*SPCA8. These two subsets can be combined by concatenating the DCT features with the SPCA features to form a single feature vector (DCT⨁SPCA) of 20 features, as shown in Eq. (1).

$$\mathrm{DCT} = [\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5, \alpha_6, \alpha_7, \alpha_8, \alpha_9, \alpha_{10}, \alpha_{11}, \alpha_{12}]$$

and

$$\mathrm{SPCA} = [\beta_{\mathrm{SPCA}1}, \beta_{\mathrm{SPCA}2}, \beta_{\mathrm{SPCA}3}, \beta_{\mathrm{SPCA}4}, \beta_{\mathrm{SPCA}5}, \beta_{\mathrm{SPCA}6}, \beta_{\mathrm{SPCA}7}, \beta_{\mathrm{SPCA}8}]$$

$$(\mathrm{DCT} \oplus \mathrm{SPCA}) = [\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5, \alpha_6, \alpha_7, \alpha_8, \alpha_9, \alpha_{10}, \alpha_{11}, \alpha_{12}, \beta_{\mathrm{SPCA}1}, \beta_{\mathrm{SPCA}2}, \beta_{\mathrm{SPCA}3}, \beta_{\mathrm{SPCA}4}, \beta_{\mathrm{SPCA}5}, \beta_{\mathrm{SPCA}6}, \beta_{\mathrm{SPCA}7}, \beta_{\mathrm{SPCA}8}] \tag{1}$$

This set of 20 features represents one signature.
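Eq. (1) is plain vector concatenation, which a short sketch makes concrete. The random values below are placeholders standing in for the actual DCT and SPCA coefficients of one signature; the variable names are mine.

```python
import numpy as np

# hypothetical per-signature feature subsets (values are placeholders)
alpha = np.random.rand(12)       # the twelve DCT features of Eq. (1)
beta_spca = np.random.rand(8)    # the eight SPCA features of Eq. (1)

# Eq. (1): concatenate the two subsets into one 20-feature vector
feature_vector = np.concatenate([alpha, beta_spca])
print(feature_vector.shape)
```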

### **2.4 Classification**

In this step, the model is built through training and testing. The sub-steps performed are as follows:

## **3. Signature alignment**

In order to perform a meaningful comparison of images of different lengths, we applied the Extreme Points Warping (EPW) method [21]. EPW modifies a shape using peaks and valleys as pivoting points, rather than warping the whole shape. The algorithm finds the optimal linear alignment of two vectors by minimizing the overall distance between them. The distances between feature vectors were recalculated at each iteration, and the alignment was considered optimal when the average distance between feature vectors reached a low value. The distance between two signature samples was calculated as the median of the distances between the fully aligned feature vectors.
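EPW's full warping procedure [21] is not reproduced here. The sketch below only illustrates the two ingredients named above, under my own simplifications: detecting the peak/valley pivot points of a 1-D feature profile, and summarizing the distance between two vectors as the median of element-wise differences after a naive linear resampling (which stands in for the actual extreme-point warping).

```python
import numpy as np

def extreme_points(signal: np.ndarray) -> np.ndarray:
    """Indices of local peaks and valleys, the pivot candidates of EPW."""
    d = np.diff(signal)
    # a sign change in the first difference marks a peak or a valley
    return np.where(d[:-1] * d[1:] < 0)[0] + 1

def median_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Distance between two aligned feature vectors, summarized by the
    median of the element-wise absolute differences."""
    n = min(len(a), len(b))
    # naive linear resampling so both vectors have equal length
    ar = np.interp(np.linspace(0, 1, n), np.linspace(0, 1, len(a)), a)
    br = np.interp(np.linspace(0, 1, n), np.linspace(0, 1, len(b)), b)
    return float(np.median(np.abs(ar - br)))

sig = np.array([0.0, 1.0, 0.5, 2.0, 1.0, 3.0])
print(extreme_points(sig))        # pivot candidates
print(median_distance(sig, sig))  # identical signals give distance 0.0
```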

#### **3.1 Enrolment**


For enrolment to the system, 54 signatures were selected from each user for training. Each pair of Arabic signatures was aligned to determine their distance, as described in the previous section. Using these aligned distances, the following measurements were evaluated:

1. Median distance to the farthest sample (dmax).

2. Median distance to the nearest sample (dmin).

The training group of Arabic signature images was used to determine the threshold parameter that distinguishes the dubious class from the genuine class.
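Given a pairwise distance matrix among one signer's reference signatures, the two enrolment statistics above can be computed as sketched below. The chapter does not give the exact formula, so this reading (median over samples of each sample's nearest/farthest distance) and the toy matrix are my assumptions.

```python
import numpy as np

def enrolment_stats(distances: np.ndarray):
    """Return (dmin, dmax): the median distance to the nearest and to the
    farthest sample, over one user's aligned reference signatures."""
    big = distances.astype(float).copy()
    np.fill_diagonal(big, np.inf)              # ignore self-distances
    dmin = float(np.median(big.min(axis=1)))   # nearest non-self sample
    dmax = float(np.median(distances.max(axis=1)))  # farthest sample
    return dmin, dmax

# hypothetical 4x4 symmetric distance matrix for one signer
D = np.array([[0., 2., 4., 6.],
              [2., 0., 3., 5.],
              [4., 3., 0., 1.],
              [6., 5., 1., 0.]])
print(enrolment_stats(D))
```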

### **4. Training**

The 2-dimensional feature vectors (Pmin, Pmax) were obtained using the EPW algorithm, and the feature values were normalized by the matching averages of the reference set (dmin, dmax). These were calculated based on Eqs. (2) and (3) to represent the distribution of the feature group.

$$\mathbf{N}\text{ max} = \text{dmax}/\text{Pmax} \tag{2}$$

$$\text{N min} = \text{dmin} / \text{Pmin} \tag{3}$$

Normalization of the information ensures the genuineness or forgery of the signatures in the training set. We trained a decision tree classifier to recognize the genuine and forged signatures in this normalized feature area (**Figure 3**). To facilitate comparisons, two classifiers were applied to the 2-dimensional attribute vectors: the Tree classifier and the SVM classifier. A linear classification was made by choosing a threshold ratio separating the two classes within the training set; this threshold was used in the verification process.

Result vectors were obtained for each distance in order to reach a good level of generalization capability. To establish the ranking of the signers' relationship to the inquiry samples, we first used these processing points and then combined the results of the entire samples.

#### **4.2 Decision tree classification**

The Evaluation of Tree Classification (Bagged Trees) technique was used in the same way and on the same samples from the Arabic signatures as the SVM. MATLAB 2014 bagged tree classification and tree software were used in the training and classification simulation. To predict a response, the decision procedure in the decision tree was followed from the root (starting) node (feature) down to a leaf (feature) node, where the responses are contained. Decision trees grant responses such as 'true' or 'false'. A Decision Tree was created to perform the classification [20, 24]. The steps are presented in Algorithm 1.

**Algorithm 1**

Step 1: Start with all input features and examine all potential binary splits on each predictor.

Step 2: Choose a split with the best optimization criterion.

Step 3: If the split leads to a child node with fewer observations than the minimum-leaf parameter, choose a split with the best optimization criterion subject to the minimum-leaf constraint.

Step 4: Apply the split and repeat recursively for the two child (feature) nodes.

Step 5: A (feature) node is pure if it is made up of observations of only one category; such a node, or a node with fewer than the minimum parent observations, is not split further.

## **5. Outcomes and discussion**

In this section, we discuss the outcomes of the suggested methodology on a number of samples from the Arabic signatures.

#### **5.1 Pre-processing**

The input image in RGB color space was first converted to a grayscale image, as displayed in **Figure 3(a)**. Then, the image was smoothed with a median filter and converted to binary, as shown in **Figure 3(b)**. Further, the image was passed through a bounding box to find the boundaries of the text area, as presented in (c), while in (d) the image was resized in order to apply the adaptive windowing algorithm that divides it into fragments, as shown in (e).

#### **5.2 Feature extraction**

In this phase, we represent the sub-images by a set of features. The outcome of the feature extraction is shown in **Table 1(a)**. Initially, these features were not normalized. The values shown in **Table 1(a)** represent the frequencies of the designs extracted from each box. Higher ratios mean there is a more specific model within the genuine autograph, which suggests that the Arabic signatures are highly similar to the test signature. The features were then normalized using a composed matrix of features. The projection and profile features were normalized using the window height, while the other descriptors were normalized by their respective maximum possible values. Normalization places the different feature values in the same ranges, as shown in **Table 1(b)**. After normalization, the normalized features of the main window were concatenated into a single feature set representing each signature.
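The threshold-based linear classification described in the training section (choosing a threshold ratio that separates the two classes within the training set) can be sketched as a 1-D search. The normalized distance scores, the labels, and the "genuine signatures score low" convention below are hypothetical stand-ins, not the chapter's data; MATLAB's classifiers are replaced here by a plain NumPy search for illustration only.

```python
import numpy as np

def best_threshold(scores: np.ndarray, labels: np.ndarray):
    """Pick the score threshold that best separates genuine (label 1)
    from forged (label 0) samples in the training set."""
    best_t, best_acc = float(scores.min()), 0.0
    for t in np.sort(scores):
        # convention: a genuine signature has a low normalized distance
        acc = float(np.mean((scores <= t) == (labels == 1)))
        if acc > best_acc:
            best_t, best_acc = float(t), acc
    return best_t, best_acc

# hypothetical normalized distance ratios for six training signatures
scores = np.array([0.1, 0.2, 0.25, 0.8, 0.9, 0.7])
labels = np.array([1, 1, 1, 0, 0, 0])
t, acc = best_threshold(scores, labels)
print(t, acc)
```

At verification time, an inquiry signature would be accepted as genuine when its normalized distance falls below the stored threshold t.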
