Determining of Discriminative Blog Size for Authorship Attribution on the Turkish Texts

CANBAY P., SEVER H., SEZER E.

6th International Symposium on Digital Forensic and Security (ISDFS), Antalya, Türkiye, 22-25 March 2018, pp. 319-323 (Full Text Paper)

Publication Type: Conference Paper / Full Text Paper
Volume Number:
DOI Number: 10.1109/isdfs.2018.8355373
City of Publication: Antalya
Country of Publication: Türkiye
Pages: pp. 319-323
Affiliated with Hacettepe University: Yes

Abstract

Although many features and methods have been used to extract information about an author from a text, no standard method or feature set has yet been established in this area. The continuous production of different types of text in electronic environments has made the task even more difficult. Authorship attribution, which aims to identify the author of an anonymous text, lies at the intersection of forensic science, computer science, and linguistics. This study addresses the question of what text size is discriminative and sufficient for authorship attribution. The study was conducted on Turkish blog posts and aims to provide a standard step toward a solution in this area. Across many experiments, texts shorter than 500 words appeared unsuitable for obtaining meaningful results in authorship attribution studies.