**3. Technique implementation**

The generation process of this technique starts by obtaining the user's IP address. The IP-API service then returns the name of the country in which the user is located at the time of access. Next, the language for that country is retrieved from the database by country name. The database holds one list of countries for each of the adopted languages (Arabic, English, French, and Spanish), and each country is classified according to its official spoken language. If the country's name is not on any of the four lists, English is used as the default language.

In addition, the default language of the user's website is determined and compared with the retrieved country language; if the two differ, the website default language is used.
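The language decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the chapter's implementation: the country lists are tiny placeholder samples standing in for the database, the function names are hypothetical, the website default is assumed to be one of the four adopted languages, and the lookup assumes the public ip-api.com JSON endpoint.

```python
import json
from typing import Optional
from urllib.request import urlopen

# Placeholder samples only; the chapter's database lists are not reproduced here.
COUNTRY_LISTS = {
    "Arabic": {"Saudi Arabia", "Egypt", "Jordan"},
    "English": {"United States", "United Kingdom", "Australia"},
    "French": {"France", "Senegal"},
    "Spanish": {"Spain", "Mexico", "Argentina"},
}
DEFAULT_LANGUAGE = "English"  # fallback named in the text


def lookup_country(ip: str) -> Optional[str]:
    """Query ip-api.com for the country name of an IP address (network call)."""
    url = f"http://ip-api.com/json/{ip}?fields=status,country"
    with urlopen(url, timeout=5) as response:
        data = json.load(response)
    return data.get("country") if data.get("status") == "success" else None


def country_language(country: Optional[str]) -> str:
    """Return the adopted language whose list contains the country, else English."""
    for language, countries in COUNTRY_LISTS.items():
        if country in countries:
            return language
    return DEFAULT_LANGUAGE


def decide_captcha_language(country: Optional[str],
                            site_default: Optional[str] = None) -> str:
    """Apply the rule in the text: a differing website default overrides the country language."""
    language = country_language(country)
    if site_default is not None and site_default != language:
        return site_default
    return language
```

For example, `decide_captcha_language(lookup_country(ip))` would chain the two steps for a visitor's address, while passing `site_default="French"` makes the website setting take precedence whenever it differs from the country language.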

The flowchart in **Figure 3** illustrates the first step of this CAPTCHA technique, which decides the CAPTCHA language to be displayed to the user.

After the language has been decided, the CAPTCHA generation process moves on to the next step, choosing the CAPTCHA word length, which is selected at random from five to eight characters. Word construction then starts by selecting the handwritten characters and distorting each one separately. This step is done slightly differently if the selected language is Arabic; the flowchart in **Figure 4** shows the CAPTCHA word construction process in detail.
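A rough sketch of the length selection and per-character construction for the non-Arabic case is given below. The chapter does not specify how characters are stored or which distortions are applied, so the alphabet, the function name, and the rotation angle used as a distortion parameter are all hypothetical stand-ins; the Arabic-specific handling is not covered.

```python
import random
from typing import List, Optional, Sequence, Tuple

MIN_LENGTH, MAX_LENGTH = 5, 8  # word length range stated in the text

# Hypothetical alphabet: each entry stands for one handwritten character image.
# Entries are strings rather than single letters so a digraph such as "CH" fits.
SPANISH_ALPHABET = ["X", "b", "CH", "y", "N", "R", "w", "a", "L", "e"]


def build_captcha_word(alphabet: Sequence[str],
                       rng: Optional[random.Random] = None) -> List[Tuple[str, float]]:
    """Pick 5 to 8 characters at random, each with its own distortion parameter."""
    if rng is None:
        rng = random.SystemRandom()  # OS entropy, harder to predict than a seeded generator
    length = rng.randint(MIN_LENGTH, MAX_LENGTH)  # both bounds inclusive
    word = []
    for _ in range(length):
        char = rng.choice(alphabet)
        # Assumed distortion: an independent rotation angle (degrees) per character.
        angle = rng.uniform(-15.0, 15.0)
        word.append((char, angle))
    return word
```

Drawing the distortion parameter inside the loop reflects the statement that characters are distorted separately rather than as a whole word.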

**Figure 3.** Deciding the CAPTCHA language process.

handwriting styles written by different people. Moreover, the human brain can draw on its experience to figure out incomplete characters or incomplete words that have missing letters. It can even read Arabic words written without any dots on their letters, because the shapes of the words can be enough for the brain to recognize them, unlike OCR machines, which mostly cannot recognize words that are incomplete or, in the case of Arabic, written without dots.

Overall, this confirms the human capability to exploit handwriting characteristics, which cannot be found in any OCR machine, and it encourages us to pursue this CAPTCHA technique, which is based on handwriting.

**Figure 2.** Abstract view of the technique process.

Multilingualism and Bilingualism

**Figure 4.** CAPTCHA word construction process.

As shown above, once the CAPTCHA word is generated, it is displayed to the user in one of the adopted languages (Arabic, English, French, and Spanish). **Figures 5**–**8** show examples of the handwritten CAPTCHA technique in each adopted language.

**Figure 5.** English CAPTCHA "M u J F R t Q".

**Figure 6.** French CAPTCHA "F ë x Œ r".

**Figure 7.** Spanish CAPTCHA "X b CH y N R w".

**Figure 8.** Arabic CAPTCHA " ".

Innovative Multilingual CAPTCHA Based on Handwritten Characteristics

http://dx.doi.org/10.5772/intechopen.72599

**4. Experiment techniques**

In the conducted experiments, six different OCRs were used to test the technical performance of the proposed CAPTCHA techniques. The OCRs used are well reviewed by some technical experts, and they provide good results when used to extract regular text.

Moreover, other methods, such as surveys and local web pages, were used to test usability by collecting the users' responses and analyzing them from different perspectives.

**4.1. First OCR**

The first OCR used in the experiments is an application called Free OCR. This application utilizes the most recent version of the Tesseract OCR engine (v3.01), which can ensure a reliable level of text-extraction accuracy. Tesseract is an open-source OCR engine maintained by Google. It supports different languages, with a level of accuracy potentially reaching 98% [1, 5].

**4.2. Second OCR**

Capture2Text is the second OCR tool used in our experiments. Like the first OCR, it is open source and uses the Google-maintained Tesseract engine to capture the written text in images, and it then copies that text to the clipboard.

**4.3. Third OCR**

The third OCR used is a free online OCR called i2OCR, available at http://www.i2ocr.com/. This online OCR supports various recognition languages; it also has the ability to extract text from multiple columns in the images.

**4.4. Fourth OCR**

FreeOCR is the fourth OCR tool used in the experiments. As the name suggests, it is a free online service, available at http://www.free-ocr.com/. Moreover, its extraction speed is considered fast in comparison with other online OCRs, and it produces the extracted text fairly quickly [6].

**4.5. Fifth OCR**

The fifth OCR we adopted in the experiments is an online OCR software called OnlineOCR, available at http://www.onlineocr.net/. Additionally, this
