Developing Turkish Sentiment Lexicon for Sentiment Analysis Using Online News Media

Saglam F., Sever H., GENÇ B.

13th IEEE/ACS International Conference on Computer Systems and Applications (AICCSA), Aghadir, Morocco, 29 November - 02 December 2016 (Full-Text Paper)

Publication Type: Conference Paper / Full-Text Paper
Volume Number:
DOI Number: 10.1109/aiccsa.2016.7945670
City of Publication: Aghadir
Country of Publication: Morocco
Keywords: sentiment analysis, Turkish, GDELT, Turkish Sentiment Lexicon, SentiWordNet
Affiliated with Hacettepe University: Yes

Abstract

The Internet is a rich source of documents whose sentiment can be extracted through analysis. Sentiment analysis, a subfield of natural language processing, addresses this task. A sentiment lexicon in the target language is an important resource for researchers working in this field. Because most sentiment analysis studies have been conducted on English text, the methods and resources developed for English may not produce the desired results in other languages. Turkish has no rich sentiment lexicon comparable to SentiWordNet for English. In this study, we aimed to develop a Turkish sentiment lexicon, and we extended an existing lexicon of 27K Turkish words to 37K words. To quantify the performance of the enhanced lexicon, we tested both lexicons on domain-independent news texts. The accuracy of determining the polarity of Turkish news texts increased from 60.6% to 72.2%.
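
The abstract does not describe the scoring procedure, so the following is only a minimal sketch of how a lexicon-based polarity test of this kind might look, not the authors' method. It assumes a lexicon that maps each word to a signed score, labels a text by the sign of the summed scores, and measures accuracy against gold labels. The names `TOY_LEXICON`, `polarity`, and `accuracy`, the toy entries, and the whitespace tokenization (which ignores Turkish morphology) are all hypothetical.

```python
# Hypothetical toy lexicon: word -> polarity score in [-1, 1].
# These entries are illustrative only and are not taken from the paper's lexicon.
TOY_LEXICON = {
    "iyi": 1.0,     # "good"
    "güzel": 0.8,   # "beautiful"
    "kötü": -1.0,   # "bad"
    "kriz": -0.7,   # "crisis"
}


def polarity(text, lexicon):
    """Label a text by the sign of its summed word scores."""
    # Naive whitespace tokenization; an assumption, real Turkish text
    # would need stemming or morphological analysis.
    tokens = text.lower().split()
    total = sum(lexicon.get(tok, 0.0) for tok in tokens)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


def accuracy(labelled_texts, lexicon):
    """Fraction of (text, gold_label) pairs whose predicted label matches."""
    correct = sum(
        1 for text, gold in labelled_texts if polarity(text, lexicon) == gold
    )
    return correct / len(labelled_texts)


# Hypothetical labelled sample, invented for illustration.
SAMPLES = [
    ("ekonomi iyi", "positive"),
    ("büyük kriz", "negative"),
    ("hava güzel", "positive"),
    ("toplantı yapıldı", "positive"),  # no lexicon hit, so predicted "neutral"
]

if __name__ == "__main__":
    print(accuracy(SAMPLES, TOY_LEXICON))
```

In this sketch the last sample is misclassified because none of its words appear in the lexicon, which illustrates why wider lexicon coverage could plausibly raise accuracy on news text.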