Spam/Ham E-Mail Classification using Machine Learning Methods based on Bag of Words Technique


Sahin E., AYDOS M., Orhan F.

26th IEEE Signal Processing and Communications Applications Conference (SIU), İzmir, Turkey, 2 - 5 May 2018

  • Publication Type: Conference Paper / Full Text
  • Volume:
  • Doi Number: 10.1109/siu.2018.8404347
  • City: İzmir
  • Country: Turkey
  • Hacettepe University Affiliated: Yes

Abstract

E-mail is now one of the most frequently used electronic communication channels. It plays an important role in our lives for many reasons, including personal communication, business activities, marketing, and advertising. E-mail makes life easier by meeting many different communication needs. On the other hand, it can make life difficult when it is used outside its intended purposes. Spam e-mails are not only annoying for recipients but also dangerous for their information security. Detecting and preventing spam e-mails has therefore become an issue in its own right. In this study, the texts of the links in the e-mail body are processed and classified using machine learning methods and the Bag of Words technique. We analyzed the effect of different N-grams on classification performance and compared the success of different machine learning techniques in classifying spam e-mail, using accuracy as the metric.
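
The two ideas named in the abstract, Bag of Words features built from N-grams and the accuracy metric, can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenization, the function names (word_ngrams, bag_of_words, accuracy), and the sample link text and labels are all assumptions made for illustration, and the classifiers used in the study are not reproduced here.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams of `text` as tuples (lowercased, whitespace-split).

    The tokenization is an assumption; the paper's preprocessing is not specified here.
    """
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(text, n=1):
    """Count how often each n-gram occurs; word order beyond the n-gram is discarded."""
    return Counter(word_ngrams(text, n))


def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical link text, invented for this example.
link_text = "click here to claim your free prize click here"
unigrams = bag_of_words(link_text, n=1)
bigrams = bag_of_words(link_text, n=2)

# Hypothetical true and predicted labels, used only to demonstrate the metric.
score = accuracy(["spam", "ham", "spam", "ham"], ["spam", "ham", "ham", "ham"])
```

Moving from unigrams to bigrams keeps some local word order (for example, "click here" becomes a single feature) at the cost of a larger and sparser feature space, which is the kind of trade-off an N-gram comparison would probe.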