**Abstract**

Phishing is one of the most common forms of cybercrime worldwide. In recent years, phishing attacks have continued to escalate in severity, frequency and impact, and globally they cause billions of dollars of losses each year. Cybercriminals use phishing for illicit activities such as personal identity theft and fraud, and to perpetrate sophisticated corporate-level attacks against financial institutions, healthcare providers, government agencies and businesses. Several solutions using various methodologies have been proposed in the literature to counter web-phishing threats. This research adopts a novel approach to the detection and prevention of website phishing attacks, with a practical implementation aimed at a browser toolbar add-in. The IPDS is shown to be highly effective both in the detection of phishing attacks and in the identification of fake websites. Experimental results show that the approach using CNN + LSTM achieves 93.28% accuracy with an average detection time of 25 seconds, whilst the approach has a slightly lower accuracy. These times are within the typical time taken to load a web page, which makes integrating the toolbar into a browser a practical option for real-time website phishing detection. The results are compared with previous work and demonstrate better or similar detection performance. This is the first work that considers how best to integrate images, text and frames in a hybrid feature-based phishing detection scheme.

**Keywords:** cybercrime, deep learning, convolutional neural network (CNN), long short-term memory (LSTM), big data

## **1. Introduction**

The use of technology for fraudulent activities has flourished in recent years. The technical resources required to carry out phishing attacks are readily available through private and public sources. Moreover, some of these resources have been automated and streamlined, allowing their use by non-technical criminals. This automation has made phishing more viable and economical, and has made it easier for a larger population of less-sophisticated criminals to commit crimes online.

In recent times, there has been a considerable increase in the variety, technology and complexity of phishing attacks, as phishers respond to improved countermeasures and greater user awareness in order to sustain the profitability of their illegal activities [1]. The ability to detect website phishing attacks may help individual users and organisations to identify legitimate websites. Effective recognition of an attack may contribute significantly to deciding correctly between a fake and a legitimate site [2]. Phishing is a form of social engineering attack in which an attacker, also known as a phisher, attempts to fraudulently obtain sensitive user information by sending an email that claims to come from a legitimate, established organisation. The phisher tricks the user into giving up confidential information that is then used for identity theft [2]. A phisher uses various methods, including email, web pages and malicious software, to steal personal information and account credentials [3]. The aim of a phishing website is to exploit users' private information without their permission, which phishers achieve by building a new website that mimics a trusted one [4].

Consequently, phishing website detection has attracted a great deal of attention from academics, who are attempting to find ways to incorporate mechanisms for detecting malicious sites into web servers as a safety precaution [5]. Although there are several ways to carry out phishing attacks, current phishing detection techniques cover only some attack vectors, such as fake websites and emails [6]. Moreover, phishing has become more sophisticated, and such attacks can now bypass the filters put in place by anti-phishing techniques [7]. Some detection techniques have been proposed, but most of them deal only with spoofed web pages [8]. Detection remains challenging because of the evasion techniques that phishers use.

Machine learning continues to demonstrate its effectiveness in an extensive range of applications. The technology has come to the fore in recent times, owing to the advent of big data [9]. Big data has enabled machine learning algorithms to discover more fine-grained patterns and to make more accurate and timely predictions than ever before [10]. Machine learning techniques are used for object identification in images, the transcription of speech into text, matching news items and products with user interests, and presenting relevant search results [11]. The most common form of machine learning, whether deep or not, is supervised learning [12]. Previous methods have not combined frames, images and text to develop an effective phishing detection method. Relying on text alone, which is the common trend in phishing website detection, is not effective because a phisher can still make changes to the frames and images of a page. Combining these three feature types is therefore the focus of this work, and therein lies its originality, together with the use of the deep learning algorithms Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) for classification.

Given the above, the objective is to develop a solution that includes a decision support system for the detection of phishing attacks, and that also provides insights and improves awareness of how active Internet users can protect themselves against such attacks. It is hoped that this will encourage wider adoption of preventive measures against cyber-security threats. Although various approaches have been used to develop anti-phishing tools, these methods suffer from limited accuracy [1].

The main aim of this research is to develop an intelligent phishing detection and protection scheme for the identification of website-based phishing attacks. This goal involves improving on previous work by building a robust classifier for intelligent phishing detection in online transactions. To achieve this aim, the intelligent phishing detection support system should possess the following characteristics:


These requirements will be met by achieving the following five specific objectives:


V. Develop a plug-in and implement it on a cross-platform operating system.

This section introduces the issue of interest and the significance of this research study. It provides details of the research problem and the research questions to be resolved together with the precise research objectives. It also summarises the existing literature and clarifies the main contributions of this research.
