**2. CAPTCHA technique based on handwriting**

This technique uses handwritten text in the CAPTCHA images and applies a distinguishing feature (separating the handwritten characters). This feature differentiates it from previous handwritten CAPTCHA techniques and is expected to enhance the security level. Moreover, the CAPTCHA text combines a secondary language with the default language (English), which makes it a multilingual CAPTCHA. The secondary language is selected from a set of languages (French, Spanish, and Arabic) based on the user's region. There are two main reasons for providing a multilingual CAPTCHA: OCR programs for other languages have not yet reached the maturity of English OCR, and multilingual text extends the CAPTCHA's scope so that it can be used worldwide [1].

First, handwritten characters were collected from 100 volunteers; each volunteer wrote the alphabet characters of the four languages adopted in this research in their own handwriting style. The handwritten characters were classified and stored in a database. These characters are used to synthesize the random words that make up the CAPTCHA text, and users must recognize the words in order to pass the CAPTCHA. Furthermore, to protect website services from bot attacks, distortion methods are applied to each handwritten character separately during generation. This makes the CAPTCHA harder for bots to break, in addition to the characteristics of handwriting itself, which are already fairly resistant to such attacks.
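As a rough illustration of the synthesis step described above, the following Python sketch draws random characters for one language, picks a random volunteer's sample for each character, and attaches distortion parameters to each character separately. The `Glyph` record, the in-memory `db` dictionary, and the rotation and scale ranges are assumptions made for illustration; the source does not specify the database layout or the distortion methods.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Glyph:
    """One handwritten character chosen for a CAPTCHA word (hypothetical record)."""
    char: str
    writer_id: int       # which volunteer's sample is used
    rotation_deg: float  # assumed per-character distortion parameter
    scale: float         # assumed per-character distortion parameter


def build_sample_db(alphabets, n_writers=100):
    """Map (language, char) to the IDs of the volunteers who supplied a sample."""
    return {
        (lang, ch): list(range(n_writers))
        for lang, letters in alphabets.items()
        for ch in letters
    }


def synthesize_word(db, language, length, rng):
    """Pick random characters, then a random writer and distortion for each one."""
    letters = sorted({ch for (lang, ch) in db if lang == language})
    glyphs = []
    for _ in range(length):
        ch = rng.choice(letters)
        writer = rng.choice(db[(language, ch)])
        # Distortion is drawn independently for every character.
        glyphs.append(
            Glyph(ch, writer, rng.uniform(-15.0, 15.0), rng.uniform(0.9, 1.1))
        )
    return glyphs


# Only the English alphabet is filled in here to keep the sketch short.
alphabets = {"en": "abcdefghijklmnopqrstuvwxyz"}
db = build_sample_db(alphabets)
word = synthesize_word(db, "en", 6, random.Random(0))
answer = "".join(g.char for g in word)  # text the user is expected to type
```

Because each character carries its own writer and distortion values, two occurrences of the same letter in one word can look different, which is the property the per-character approach relies on.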

In summary, the generation process of this technique has two main phases, each comprising several steps: the first phase is data gathering and preparation, and the second is CAPTCHA implementation.

#### **2.1. Data gathering and preparation**

This phase goes through six steps. They are as follows:

• The first step is the creation of character samples. In this step, samples are made for each character of the four adopted languages (English, Arabic, Spanish, and French) that will be used in the CAPTCHA text [1].


#### **2.2. Algorithm technique**

Completely Automated Public Turing Test to Tell Computers and Humans Apart (CAPTCHA) is considered one of the most common techniques that can be used to distinguish between humans and artificial agents (or bots). At present, the exponential growth of free web services has led to the misuse of automated bots and spam [4], which has resulted in serious security issues in web services. Using CAPTCHA in its various types has proven to be effective in protecting websites, and the services they provide, from any harm caused by bots' attacks [1].

Therefore, it is obvious that stopping such bots by means of a reliable CAPTCHA is essential. More so, in a multilingual world, multilingual CAPTCHAs are indispensable.

164 Multilingualism and Bilingualism

**Figure 2** shows an abstract view of the technique process.

#### **2.3. Handwriting characteristics**

The choice of handwriting as the basis for a new CAPTCHA technique was not made at random or without logical reasons. On the contrary, it was chosen after a lengthy study of the characteristics of handwriting and how they could be utilized in the security field.

Handwriting in general has some characteristics that can only be exploited by humans. Thanks to its superior abilities, the human brain can analyze and recognize unclear handwritten characters and digits; it can also recognize the different handwriting styles of different people. Moreover, the human brain can draw on its experience to figure out incomplete characters or incomplete words that have missing letters. It can even read Arabic words written without any dots on their letters, because the shapes of the words are enough for the brain to identify them. OCR machines, by contrast, mostly cannot recognize words that are incomplete or, in the case of Arabic, written without dots.

Overall, this confirms the human capability in utilizing handwriting characteristics, which cannot be found in any OCR machine, and it motivates this CAPTCHA technique, which is based on handwriting.

**Figure 1.** Steps of the data gathering and preparation phase.

**Figure 2.** Abstract view of the technique process.

Innovative Multilingual CAPTCHA Based on Handwritten Characteristics

http://dx.doi.org/10.5772/intechopen.72599

[...] the country's name is not on any of the four countries' lists, then English will be the default language to use.

In addition, the user's website default language will be determined and compared to the retrieved country's language; if they are different, then the website default language will be used.

The following flowchart (**Figure 3**) illustrates the whole process of the first step in this CAPTCHA technique, which decides the CAPTCHA language to be displayed to the user.

**Figure 3.** Deciding the CAPTCHA language process.

Furthermore, after the language has been decided, the CAPTCHA generation process moves on to the next step, which is choosing the CAPTCHA word length. The word length is chosen randomly from five to eight characters. Next, the word construction process starts by selecting the handwritten characters and distorting them separately. However, this step is done slightly differently if the decided language is Arabic; the following flowchart (**Figure 4**) clarifies the CAPTCHA word construction process in detail.
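The English fallback in the language-decision step and the random word length of five to eight characters described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the contents of `COUNTRY_LISTS`, the language codes, and the function names are hypothetical, since the source mentions four country lists without giving their contents. The comparison with the website's default language and the Arabic-specific word construction are left out of the sketch.

```python
import random

# Hypothetical country lists, one per supported language (contents assumed).
COUNTRY_LISTS = {
    "en": {"United Kingdom", "United States"},
    "fr": {"France"},
    "es": {"Spain", "Mexico"},
    "ar": {"Egypt", "Saudi Arabia"},
}
DEFAULT_LANGUAGE = "en"  # English is the fallback named in the text


def language_for_country(country):
    """Return the language whose list contains the country, else English."""
    for lang, countries in COUNTRY_LISTS.items():
        if country in countries:
            return lang
    return DEFAULT_LANGUAGE


def choose_word_length(rng):
    """Choose a word length uniformly from five to eight characters, inclusive."""
    return rng.randint(5, 8)


rng = random.Random(42)
language = language_for_country("Iceland")  # not on any list in this sketch
lengths = [choose_word_length(rng) for _ in range(200)]
```

Keeping the fallback in a single function means that a country missing from every list always resolves to English, as the text requires, regardless of how the lists are later populated.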
