Determining of Discriminative Blog Size for Authorship Attribution on the Turkish Texts


CANBAY P., SEVER H., SEZER E.

6th International Symposium on Digital Forensic and Security (ISDFS), Antalya, Turkey, 22 - 25 March 2018, pp.319-323

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1109/isdfs.2018.8355373
  • City: Antalya
  • Country: Turkey
  • Page Numbers: pp.319-323
  • Hacettepe University Affiliated: Yes

Abstract

Although many features and methods are used to extract information about an author from a text, no standard method or feature set has yet been established in this area. The continuous production of different types of text in the electronic environment has made this task even more difficult. Authorship attribution, which aims to identify the author of an anonymous text, is a field shared by forensic science, computer science, and linguistics. This study addresses the question of what text size is discriminative and sufficient for authorship attribution studies. The study was conducted on Turkish blog posts and aims to provide a standard solution step in this area. Based on the many experiments performed, texts shorter than 500 words appear unsuitable for obtaining meaningful results in authorship attribution studies.