SimiLay: A Developing Web Page Layout Based Visual Similarity Search Engine



Bozkır A. S., Sezer E.

in: Machine Learning and Data Mining in Pattern Recognition, Petra Perner, Editor, Springer-Verlag, Berlin, pp.457-470, 2014

  • Publication Type: Book Chapter / Chapter Research Book
  • Publication Date: 2014
  • Publisher: Springer-Verlag
  • City: Berlin
  • Page Numbers: pp.457-470
  • Editors: Petra Perner, Editor

Abstract

Web page visual similarity has been a trending topic over the last decade, and effective methods for measuring it are crucial for phishing detection and related problems. In this study, we aim to develop a search engine for web page visual similarity and propose a novel method for capturing and calculating the layout similarity of web pages. To achieve this, web page elements are classified and mapped with a novel technique. Furthermore, spatial pyramid matching, an extension of the well-known bag-of-features approach, is employed with a histogram intersection scheme to capture and measure partial and whole-page layout similarity. Promising results demonstrate that the spatial pyramid matching kernel can be used in this field.
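
To make the idea of spatial pyramid matching with histogram intersection concrete, the following is a minimal sketch and not the authors' implementation. It assumes, purely for illustration, that a page layout has already been reduced to a small 2D grid of element-type labels (the labels "img" and "text" and the function names are hypothetical). The level weights follow the commonly used spatial pyramid match kernel formulation, and histograms are normalized so that identical layouts score 1.0.

```python
from collections import Counter


def pyramid_histograms(grid, levels):
    """Build one label histogram per (level, cell_row, cell_col).

    `grid` is a list of rows, each a list of element-type labels.
    This grid representation is an assumption made for this sketch.
    """
    height, width = len(grid), len(grid[0])
    hists = {}
    for level in range(levels + 1):
        cells = 2 ** level  # the page is split into cells x cells regions
        for r, row in enumerate(grid):
            for c, label in enumerate(row):
                key = (level, r * cells // height, c * cells // width)
                hists.setdefault(key, Counter())[label] += 1
    return hists


def spatial_pyramid_match(grid_a, grid_b, levels=2):
    """Weighted sum of per-level histogram intersections, in [0, 1]."""
    hists_a = pyramid_histograms(grid_a, levels)
    hists_b = pyramid_histograms(grid_b, levels)
    total_a = len(grid_a) * len(grid_a[0])
    total_b = len(grid_b) * len(grid_b[0])

    score = 0.0
    for level in range(levels + 1):
        # Finer levels get larger weights; the weights sum to 1.
        if level == 0:
            weight = 1.0 / 2 ** levels
        else:
            weight = 1.0 / 2 ** (levels - level + 1)

        intersection = 0.0
        for key, hist_a in hists_a.items():
            if key[0] != level:
                continue
            hist_b = hists_b.get(key, Counter())
            # Histogram intersection: sum of bin-wise minima.
            intersection += sum(
                min(count / total_a, hist_b[label] / total_b)
                for label, count in hist_a.items()
            )
        score += weight * intersection
    return score


# Two hypothetical 4x4 layouts: image column on the left vs. on the right.
page_a = [["img", "img", "text", "text"] for _ in range(4)]
page_b = [["text", "text", "img", "img"] for _ in range(4)]

same = spatial_pyramid_match(page_a, page_a)
mirrored = spatial_pyramid_match(page_a, page_b)
```

In this toy example the mirrored layouts contain the same element types in the same proportions, so they match only at the coarsest level and receive a low score, while a plain bag-of-features comparison (level 0 alone) would treat them as identical. This is the kind of spatial sensitivity that makes the pyramid extension relevant to layout comparison.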