Use of HOG Descriptors in Phishing Detection

BOZKIR A. S., Sezer E. A.

4th International Symposium on Digital Forensic and Security (ISDFS), Arkansas, United States Of America, 25 - 27 April 2016, pp.148-153

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1109/isdfs.2016.7473534
  • City: Arkansas
  • Country: United States Of America
  • Page Numbers: pp.148-153
  • Hacettepe University Affiliated: Yes


Phishing is a scam that deceives computer users visually by presenting fake web pages that mimic legitimate targets in order to steal valuable digital data such as credit card information or e-mail passwords. In contrast to other anti-phishing approaches, this paper addresses the problem with a purely computer-vision-based method built on the concept of web page layout similarity. The proposed approach employs the Histogram of Oriented Gradients (HOG) descriptor to capture page layout cues without the time-consuming intermediate stage of segmentation. The histogram intersection kernel is used as the similarity metric. The result is an efficient and fast phishing page detection scheme intended to combat zero-day phishing page attacks. To verify the efficiency of the detection mechanism, 50 unique phishing pages and their legitimate targets were collected, along with 100 pairs of legitimate pages. The similarity scores in these two groups were then computed and compared. The results are promising and suggest that a similarity degree of around 75% or above can be adequate for raising an alarm.
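
The pipeline the abstract describes (a HOG descriptor per page screenshot, compared with a histogram intersection kernel) might look roughly like the Python sketch below. It is not the authors' implementation. The cell size, the 9 unsigned orientation bins, the L1 normalization of the whole vector (used here in place of HOG block normalization so that scores fall in [0, 1]), and the function names `hog_descriptor`, `histogram_intersection` and `looks_like_phish` are all assumptions made for illustration. Only the 75% threshold comes from the abstract.

```python
import math


def hog_descriptor(gray, cell=8, bins=9):
    """Simplified HOG: per-cell histograms of unsigned gradient orientation.

    `gray` is a 2D list of pixel intensities (for example, a downscaled
    page screenshot). Assumption: block normalization is omitted and the
    whole vector is L1-normalized instead.
    """
    h, w = len(gray), len(gray[0])
    cells_y, cells_x = h // cell, w // cell
    hist = [0.0] * (cells_y * cells_x * bins)
    for y in range(cells_y * cell):
        for x in range(cells_x * cell):
            # Central differences, clamped at the image border.
            gx = gray[y][min(x + 1, w - 1)] - gray[y][max(x - 1, 0)]
            gy = gray[min(y + 1, h - 1)][x] - gray[max(y - 1, 0)][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            # Unsigned orientation, folded into [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            b = min(int(angle / (180.0 / bins)), bins - 1)
            idx = ((y // cell) * cells_x + (x // cell)) * bins + b
            hist[idx] += mag  # vote weighted by gradient magnitude
    total = sum(hist)
    return [v / total for v in hist] if total > 0 else hist


def histogram_intersection(a, b):
    """Histogram intersection kernel: sum of element-wise minima."""
    return sum(min(x, y) for x, y in zip(a, b))


def looks_like_phish(suspect, target, threshold=0.75):
    """Flag a page whose layout similarity to a target reaches the threshold."""
    score = histogram_intersection(hog_descriptor(suspect), hog_descriptor(target))
    return score >= threshold
```

Both images must have the same dimensions so that the descriptors have equal length. In practice the screenshots would be resized to a common resolution first, a step this sketch leaves out.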