

*Machine Learning Applications in Misuse and Anomaly Detection*


*DOI: http://dx.doi.org/10.5772/intechopen.92653*



*Security and Privacy From a Legal, Ethical, and Technical Perspective*

working principle of an anomaly detection system is fundamentally different from that of a misuse or signature detection system. Misuse or signature detection systems first need to be equipped with a well-defined set of attack signatures populated in their database. An anomaly detection system, on the other hand, defines a detailed and accurate profile of the normal behavior of the networks and hosts. The normal state of the cyberinfrastructure, consisting of networks and hosts, indicates an attack-free state. When an anomalous activity occurs in the cyberinfrastructure, the anomaly detection system notices a state change from the normal state to a state that is no longer normal. On observing this state change, the anomaly detection system raises an alert of a possible attack on the cyberinfrastructure. Unlike signature or misuse detection systems, anomaly detection systems are capable of detecting novel attacks, as their detection strategy is based on state change information rather than on matching attack signatures. It is precisely for this reason that anomaly detection schemes are capable of detecting many different types of attacks. Some of these attacks include: (i) segmentation of binary code in a user password, (ii) backdoor service on a malicious process on a well-known port number in a computing host, (iii) stealthy reconnaissance attempts, (iv) novel buffer overflow attacks, (v) direction of hypertext transfer protocol (HTTP) on a nonstandard port number, (vi) stealthy attacks on protocol stacks, and (vii) different variants of denial of service (DoS) and distributed denial of service (DDoS) attacks. Early and accurate detection of these attacks poses significant challenges in the design of a robust and accurate anomaly detection system.

In this chapter, we briefly review some of the well-known misuse and anomaly-based detection systems proposed in the literature. We also discuss some hybrid approaches to intrusion detection that combine misuse and anomaly detection so as to improve detection accuracy and reduce false alarms. The rest of the chapter is organized as follows. Section 2 presents a brief discussion on the misuse or signature-based detection approach. In Section 3, we discuss how various machine learning approaches can be applied in misuse or signature-based systems. Section 4 provides a brief overview of anomaly detection, while in Section 5, we discuss how machine learning and data mining algorithms can be effectively deployed in anomaly-based detection systems. In Section 6, we briefly discuss the working principles of some of the well-known hybrid detection systems. Section 7 concludes the chapter while highlighting some of the recent trends in machine learning approaches in network security applications.

**2. Misuse or signature detection**

Misuse detection, also called signature detection, is an approach in which attack patterns or unauthorized and suspicious behaviors are learned from past activities, and the knowledge of the learned patterns is then used to detect or predict similar patterns in a network. The attack or misuse patterns, which are also called signatures, include patterns of log files or data packets that were found to be malicious and identified as threats to the network and the computing hosts. Each log file has its own signature, a unique pattern of binary bits (0s and 1s). For intrusion detection systems protecting host computers, that is, for host-based intrusion detection systems (HIDSs), the attack signature databases may contain various patterns of system calls, each representing a different attack on the host. In the case of a network-based intrusion detection system (NIDS), attack signatures reveal specific patterns in data packets. These patterns may include signatures of the data payload, the packet header, and unauthorized activities such as improper file transfer protocol (FTP) initiation or failed login attempts in Telnet. A typical data packet includes several fields such as: (i) the source Internet protocol (IP) address, (ii) the destination IP address, (iii) the source port number for transmission control protocol (TCP) or user datagram protocol (UDP), (iv) the destination port number for TCP or UDP, (v) the protocol description such as UDP, TCP or Internet control message protocol (ICMP), and (vi) the data payload. An attack signature can be detected in any specific field, or in any combination of these fields.
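The idea that a signature may constrain a single packet field or a combination of fields can be illustrated with a minimal Python sketch. The `Packet` class, the `matches` function, the two example signatures, and the addresses used here are all hypothetical and chosen only for illustration; they are not taken from any particular intrusion detection system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """The six packet fields listed in the text (names are illustrative)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str   # "TCP", "UDP" or "ICMP"
    payload: bytes


def matches(signature: dict, packet: Packet) -> bool:
    """Return True if every field constrained by the signature matches.

    A signature is a dict mapping a field name to an expected value.
    The payload is compared by substring; all other fields by equality.
    """
    for field_name, expected in signature.items():
        actual = getattr(packet, field_name)
        if field_name == "payload":
            if expected not in actual:
                return False
        elif actual != expected:
            return False
    return True


# Hypothetical signatures: one on header fields only, one combining
# header fields with a payload pattern.
telnet_sig = {"protocol": "TCP", "dst_port": 23}
ftp_sig = {"protocol": "TCP", "dst_port": 21, "payload": b"USER anonymous"}

# Example packet using documentation-only addresses (192.0.2.0/24).
pkt = Packet("192.0.2.10", "192.0.2.20", 40000, 21, "TCP",
             b"USER anonymous\r\n")

print(matches(ftp_sig, pkt))     # signature on a combination of fields
print(matches(telnet_sig, pkt))  # signature on header fields only
```

Representing a signature as a partial set of field constraints keeps single-field and multi-field signatures in one uniform structure; real systems typically use richer rule languages.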

**Figure 1** shows how a typical misuse or signature detection system works. These detection systems execute algorithms that match patterns or signatures learned from past attacks against current activities in a network in order to detect possible attacks or malicious activities. If the signature of any current activity in the network matches a signature in the attack signature database, the detection system raises an alert. A module in the detection system then initiates a further investigation of the attack and invokes appropriate security modules to defend against it. If the attack is found to be a real attack and not a false alarm, the existing database of attack signatures is updated with the signature of the new attack. For example, if the signature of an attack is *login name = "Sidra"*, then whenever there is an attempt to log in to any device in the network with the name "Sidra," the signature detection system will raise an alert of an attack.
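The "if-else" character of this workflow can be sketched in a few lines of Python. This is only a rough illustration under simplifying assumptions: the signature database is modeled as a set of login names, the investigation step is reduced to a boolean verdict supplied by the caller, and the function names and the login name `intruder01` are hypothetical.

```python
def inspect_login(login_name, signature_db):
    """If-else rule: alert when the login name matches a stored signature."""
    if login_name in signature_db:
        return "alert"
    else:
        return "allow"


def update_after_investigation(signature_db, new_signature, confirmed_real):
    """Add a signature only if the investigation confirmed a real attack."""
    if confirmed_real:
        signature_db.add(new_signature)
    return signature_db


# The database initially holds the example signature from the text.
signature_db = {"Sidra"}

known = inspect_login("Sidra", signature_db)
before = inspect_login("intruder01", signature_db)

# Assume an investigation (not modeled here) confirms a new attack.
update_after_investigation(signature_db, "intruder01", confirmed_real=True)
after = inspect_login("intruder01", signature_db)

print(known, before, after)
```

The sketch also makes the limitation visible: an activity raises an alert only once its signature is already in the database.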

The approach adopted in a signature-based detection system is primarily meant for detecting already known threats and vulnerabilities in a network. However, these systems suffer from the drawback of producing too many false alarms. A false alarm, or false positive, refers to a situation where the system raises an alert of an attack although no attack has actually happened on the network. As an example, consider the case where a user logs in to a remote server. If the user forgets the login password and makes multiple login attempts, the user's account is most likely to be locked after a certain number of failed attempts. As the signature-based detection system cannot differentiate between a failed login attempt by a legitimate user and an attempt by a malicious user to log in to a legitimate user's account without authorization, both activities are considered attacks.
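The failed-login example can be made concrete with a small sketch. It assumes a hypothetical rule that flags any account with at least three failed login attempts; the threshold, the account names, and the ground-truth labels are invented purely for illustration.

```python
from collections import Counter

MAX_FAILED_ATTEMPTS = 3  # assumed threshold, not taken from the text


def flagged_accounts(events):
    """Return the accounts whose failed logins reach the threshold.

    `events` is an iterable of (account, success) pairs.
    """
    failures = Counter(account for account, success in events if not success)
    return {acct for acct, n in failures.items() if n >= MAX_FAILED_ATTEMPTS}


events = [
    # A legitimate user who forgot the password.
    ("alice", False), ("alice", False), ("alice", False),
    # A malicious user guessing the password of bob's account.
    ("bob", False), ("bob", False), ("bob", False),
    # A normal successful login.
    ("carol", True),
]

# Hypothetical ground truth, known only to us and not to the detector.
truly_attacked = {"bob"}

flagged = flagged_accounts(events)
false_positives = flagged - truly_attacked

print(sorted(flagged))          # both accounts look identical to the rule
print(sorted(false_positives))  # the legitimate user is a false alarm
```

Because the rule sees only the count of failures, the two sequences are indistinguishable to it, which is exactly the source of the false alarm described above.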

**Figure 1.** *Working of misuse or signature detection: Illustration of "if-else" rules.*

The efficacy of a misuse or signature detection system largely depends on the completeness and sufficiency of the knowledge of attack patterns and signatures captured in its attack signature database. Capturing and representing the knowledge of attacks and system vulnerabilities in a cyberinfrastructure or in a network of computing machines is a nontrivial task, and it depends heavily on domain experts. Since the knowledge and skills of domain experts may vary significantly from person to person, the design of signature detection systems can quite often be incomplete and inaccurate. Moreover, a slight variation, evolution, blending, or combination of already known attacks can make signature detection an impossible task. This is a typical problem with any similarity-based learning system, such as a signature-based intrusion detection system.
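A short sketch shows how a slight variation can defeat exact signature matching. It assumes a signature that is a fixed byte string searched for in the payload; the signature and the two requests are hypothetical examples, not entries from a real signature database.

```python
def exact_match(signature: bytes, payload: bytes) -> bool:
    """Naive signature check: the signature must appear verbatim."""
    return signature in payload


# Hypothetical signature for a known path traversal request.
known_signature = b"/etc/passwd"

original_attack = b"GET /../../etc/passwd HTTP/1.1"
# Same target path, written with a redundant "./" segment.
variant_attack = b"GET /../../etc/./passwd HTTP/1.1"

print(exact_match(known_signature, original_attack))  # known form is caught
print(exact_match(known_signature, variant_attack))   # the variant is missed
```

Both requests refer to the same file, yet only the form that was seen before matches the stored pattern.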
