Extending a sentiment lexicon with synonym-antonym datasets: SWNetTR plus


Creative Commons License

Saglam F., GENÇ B., Sever H.

TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, vol. 27, no. 3, pp. 1806-1820, 2019 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 27, Issue: 3
  • Publication Date: 2019
  • DOI: 10.3906/elk-1809-120
  • Journal Name: TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, TR DİZİN (ULAKBİM)
  • Pages: pp. 1806-1820
  • Keywords: Turkish sentiment lexicon, sentiment analysis, sentiment lexicon, graph model, GDELT, SWNetTR plus
  • Affiliated with Hacettepe University: Yes

Abstract

In our previous studies on developing a general-purpose Turkish sentiment lexicon, we constructed SWNetTR-PLUS, a sentiment lexicon of 37K words. In this paper, we show how to use Turkish synonym and antonym word pairs to extend SWNetTR-PLUS by almost 33% to obtain SWNetTR++, a Turkish sentiment lexicon of 49K words. The extension was done by transferring the problem into the graph domain, where nodes are words and edges are synonym-antonym relations between words, and propagating the existing tone and polarity scores to the newly added words using an algorithm we have developed. We tested the existing and new lexicons using a manually labeled Turkish news media corpus of 500 news texts. The results show that our method yielded a significantly more accurate lexicon than SWNetTR-PLUS, resulting in an accuracy increase from 72.2% to 80.4%. At this level, we have now maximized the accuracy rates of translation-based sentiment analysis approaches, which first translate a Turkish text to English and then do the analysis using English sentiment lexicons.
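The graph formulation described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the authors' propagation algorithm, so the rule used here is an assumption made purely for illustration: a synonym passes its score on unchanged, an antonym passes on the negated score, and an unscored word receives the mean of the contributions from its already scored neighbours. The function name `propagate_scores`, the single-valued score, and the example words and values are all hypothetical.

```python
from collections import defaultdict


def propagate_scores(seed_scores, synonyms, antonyms):
    """Spread polarity scores from seed words to unscored words.

    Hypothetical rule (NOT the paper's algorithm): a synonym contributes its
    score unchanged, an antonym contributes its negated score, and a new word
    receives the mean of its scored neighbours' contributions.
    """
    # Build an undirected graph; edge sign +1 = synonym, -1 = antonym.
    graph = defaultdict(list)
    for a, b in synonyms:
        graph[a].append((b, 1.0))
        graph[b].append((a, 1.0))
    for a, b in antonyms:
        graph[a].append((b, -1.0))
        graph[b].append((a, -1.0))

    scores = dict(seed_scores)  # seed scores are never overwritten
    while True:
        # One round: score every unscored word adjacent to a scored word.
        new_scores = {}
        for word, neighbours in graph.items():
            if word in scores:
                continue
            contributions = [sign * scores[n] for n, sign in neighbours if n in scores]
            if contributions:
                new_scores[word] = sum(contributions) / len(contributions)
        if not new_scores:
            return scores  # nothing left that can be reached
        scores.update(new_scores)


# Illustrative data only: one seed word and three words to be added.
seed = {"mutlu": 0.8}                                   # "happy"
synonyms = [("mutlu", "memnun"), ("mutsuz", "kederli")]  # pleased; unhappy ~ sorrowful
antonyms = [("mutlu", "mutsuz")]                         # happy vs. unhappy
result = propagate_scores(seed, synonyms, antonyms)
```

In this toy run, "memnun" inherits the seed score, "mutsuz" receives its negation, and "kederli" is reached in a second round through "mutsuz". Words with no path to a scored word stay unscored, which is one reason a real extension would cover only part of a synonym-antonym dataset.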