On the use of SVMs to Detect Anomalies in a Stream of SIP Messages
Ferdous, Raihana;Lo Cigno, Renato Antonio;Zorat, Alessandro
2012-01-01
Abstract
Voice and multimedia communications are rapidly migrating from traditional networks to TCP/IP networks (the Internet), where services are provisioned by SIP (Session Initiation Protocol). In this paper we propose an on-line filter that examines the stream of incoming SIP messages and classifies them as good or bad. The classification is carried out in two stages. First, a lexical analysis weeds out those messages that do not belong to the language generated by the grammar defined by the SIP standard. A second filtering stage then identifies messages that differ, in structure or content, from messages that were previously classified as good. The first stage is straightforward because the classification is crisp: either a message belongs to the language or it does not. The second stage requires more delicate handling, as whether a message is semantically meaningful is not a sharp decision. Our approach relies on past experience with previously classified messages, i.e. a “learn-by-examples” approach, which led to a classifier based on Support Vector Machines (SVM) that analyzes each incoming SIP message. The paper describes the overall architecture of the two-stage filter and then explores several points of the SVM configuration space to determine a setting that performs well when classifying a large sample of SIP messages obtained from real traffic collected on a VoIP installation at our institution. Finally, the performance of the classification on additional messages collected from the same source is presented.
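The two-stage idea can be illustrated with a minimal Python sketch. It is not the paper's implementation and rests on several assumptions: stage one checks only the start line with two simplified regular expressions instead of the full SIP grammar, the two features (header count and longest header length) are hypothetical, and the classifier is a small linear SVM trained by subgradient descent on the regularised hinge loss rather than any particular SVM configuration. All function names and sample messages are invented for illustration.

```python
import re

# Stage 1 (simplified assumption): only the start line is checked here.
# A real filter would validate the whole message against the SIP grammar.
REQUEST_LINE = re.compile(r"^[A-Z]+ sips?:\S+ SIP/2\.0$")
STATUS_LINE = re.compile(r"^SIP/2\.0 \d{3} .*$")


def is_well_formed(message):
    start_line = message.split("\r\n", 1)[0]
    return bool(REQUEST_LINE.match(start_line) or STATUS_LINE.match(start_line))


def features(message):
    # Hypothetical features: header count and longest header length, scaled.
    head = message.split("\r\n\r\n", 1)[0]
    headers = head.split("\r\n")[1:]
    longest = max((len(h) for h in headers), default=0)
    return [len(headers) / 10.0, longest / 100.0]


def train_linear_svm(xs, ys, epochs=2000, lr=0.01, lam=0.01):
    # Subgradient descent on the L2-regularised hinge loss; labels are +1/-1.
    w, b = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1.0 - lr * lam) for wi in w]  # regularisation step
            if margin < 1.0:  # hinge loss is active: move towards the example
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def classify(message, model):
    if not is_well_formed(message):
        return "rejected"  # stage 1: not in the (simplified) language
    w, b = model
    score = sum(wi * xi for wi, xi in zip(w, features(message))) + b
    return "good" if score > 0 else "anomalous"  # stage 2


def make_invite(extra_headers=()):
    # Builds a toy INVITE; all addresses are made-up example values.
    lines = [
        "INVITE sip:bob@example.com SIP/2.0",
        "Via: SIP/2.0/UDP host.example.com",
        "From: <sip:alice@example.com>",
        "To: <sip:bob@example.com>",
        "Call-ID: 1@host.example.com",
        "CSeq: 1 INVITE",
        *extra_headers,
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


# Toy training set: two ordinary messages and two with an oversized header.
good = [make_invite(), make_invite(["Subject: weekly project call, room 4"])]
bad = [make_invite(["Subject: " + "A" * 500]), make_invite(["Subject: " + "A" * 300])]
model = train_linear_svm([features(m) for m in good + bad], [1, 1, -1, -1])
```

Calling `classify(message, model)` returns `"rejected"` for a message that fails the crisp first-stage check, and otherwise `"good"` or `"anomalous"` depending on the sign of the SVM score. Keeping the stages separate means the learned classifier only ever sees messages that already passed the syntactic check.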