Intelligent Authorship Identification with using Turkish Newspapers Metadata

Yavanoglu O.

4th IEEE International Conference on Big Data (Big Data), Washington, D.C., United States, 5-8 December 2016, pp. 1895-1900 (Full Text Paper)

Publication Type: Conference Paper / Full Text Paper
Volume Number:
DOI Number: 10.1109/bigdata.2016.7840809
City of Publication: Washington
Country of Publication: United States
Pages: pp. 1895-1900
Affiliated with Hacettepe University: Yes

Abstract

Authorship identification is a data mining and classification problem. Numerous methods and algorithms have been published to understand its nature. However, researchers are still searching for the best and simplest solutions because of its heterogeneous and multilingual characteristics. This study introduces a new authorship identification process based on an artificial neural network (ANN) model that uses embedded stylistic features. It is well known that stylistic features depend largely on the topic or genre of the article. Our dataset contains 22,000 Turkish newspaper articles belonging to different genres. The experimental results indicate that a 97% success rate was achieved with a Levenberg-Marquardt based classifier. It can be concluded that the corpus, presented for the first time in this work, might contribute not only to authorship identification but also to other identification purposes.