Classification of HTTP Traffic Based on C5.0 Machine Learning Algorithm

Tomasz Bujlow, Tahir Riaz, and Jens Myrup Pedersen

Proceedings of the Fourth IEEE International Workshop on Performance Evaluation of Communications in Distributed Systems and Web-based Service Architectures (PEDISWESA 2012), pp. 882-887, IEEE, Cappadocia, Turkey, July 2012. DOI: 10.1109/ISCC.2012.6249413.


Abstract

Our previous work demonstrated that several kinds of applications can be distinguished with an accuracy of over 99%. Today, most traffic is generated by web browsers, which provide different kinds of services based on the HTTP protocol: web browsing, file downloads, audio and voice streaming through third-party plugins, etc. This paper proposes and evaluates two approaches to distinguishing various types of HTTP content: a distributed one, running on volunteers' machines, and a centralized one, running in the core of the network. We also assess the accuracy of the global classifier for both HTTP and non-HTTP traffic. We achieved an accuracy of 94%, which is expected to be even higher in real-life usage. Finally, we provide graphical characteristics of different kinds of HTTP traffic.