Extending a sentiment lexicon with synonym-antonym datasets: SWNetTR plus

Saglam, Fatih; Genç, Burkay; Sever, Hayri

doi:10.3906/elk-1809-120

TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, vol. 27, no. 3, pp. 1806-1820, 2019 (SCI-Expanded, Scopus, TRDizin)

Publication Type: Article / Full Article
Volume: 27 Issue: 3
Publication Date: 2019
DOI Number: 10.3906/elk-1809-120
Journal Name: TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, TR DİZİN (ULAKBİM)
Pages: pp. 1806-1820
Keywords: Turkish sentiment lexicon, sentiment analysis, sentiment lexicon, graph model, GDELT, SWNetTR plus
Open Archive Collection: AVESİS Open Access Collection
Hacettepe University Affiliated: Yes

Abstract

In our previous studies on developing a general-purpose Turkish sentiment lexicon, we constructed SWNetTR-PLUS, a sentiment lexicon of 37K words. In this paper, we show how to use Turkish synonym and antonym word pairs to extend SWNetTR-PLUS by almost 33% to obtain SWNetTR++, a Turkish sentiment lexicon of 49K words. The extension was done by transferring the problem into the graph domain, where nodes are words and edges are synonym-antonym relations between words, and propagating the existing tone and polarity scores to the newly added words using an algorithm we have developed. We tested the existing and new lexicons using a manually labeled Turkish news media corpus of 500 news texts. The results show that our method yielded a significantly more accurate lexicon than SWNetTR-PLUS, resulting in an accuracy increase from 72.2% to 80.4%. At this level, we have now maximized the accuracy rates of translation-based sentiment analysis approaches, which first translate a Turkish text to English and then do the analysis using English sentiment lexicons.
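
The abstract does not spell out the propagation algorithm, so the following is only a minimal Python sketch of the general idea, not the authors' method. It assumes that a synonym edge passes a score on unchanged, that an antonym edge flips its sign, and that an unscored word takes the average of the signed scores of its already scored neighbors. The function name `propagate_scores`, the example words, and the example scores are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def propagate_scores(seed_scores, synonyms, antonyms):
    """Extend seed_scores to unscored words reachable via synonym/antonym edges.

    Assumption (not from the paper): a new word receives the mean of its
    scored neighbors' scores, with antonym neighbors contributing a
    sign-flipped score.
    """
    # Signed adjacency list: +1.0 for a synonym edge, -1.0 for an antonym edge.
    neighbors = defaultdict(list)
    for a, b in synonyms:
        neighbors[a].append((b, 1.0))
        neighbors[b].append((a, 1.0))
    for a, b in antonyms:
        neighbors[a].append((b, -1.0))
        neighbors[b].append((a, -1.0))

    scores = dict(seed_scores)
    while True:
        # Each round uses only the scores known at the start of the round,
        # so the result does not depend on iteration order.
        new_scores = {}
        for word, edges in neighbors.items():
            if word in scores:
                continue  # existing lexicon entries are never overwritten
            votes = [sign * scores[n] for n, sign in edges if n in scores]
            if votes:
                new_scores[word] = sum(votes) / len(votes)
        if not new_scores:
            return scores
        scores.update(new_scores)


# Hypothetical seed lexicon and word pairs, for illustration only.
seed = {"iyi": 0.8, "kötü": -0.7}
synonyms = [("iyi", "güzel"), ("güzel", "hoş")]
antonyms = [("güzel", "çirkin")]

extended = propagate_scores(seed, synonyms, antonyms)
```

In this toy run, "güzel" is scored in the first round from its synonym "iyi", and "hoş" and "çirkin" are scored in the second round from "güzel", the latter with the sign flipped. Words with no path to a scored word remain absent from the result.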