NEUROCOMPUTING, vol. 405, pp. 149-160, 2020 (SCI-Expanded)
Most pre-trained word embeddings are obtained from context-based learning algorithms trained on a large text corpus. As a result, words that share most of their contexts receive similar vectors even when they express different meanings, so such models cannot fully capture the complex characteristics of words. Sentiment analysis is one natural language processing application that suffers from this problem: common pre-trained word embeddings do not distinguish well between two words with opposite sentiments. This paper addresses this problem and proposes two simple, yet empirically effective, approaches to learning word embeddings for sentiment analysis. Both approaches exploit sentiment lexicons and take the polarity of words into account when learning word embeddings. The first approach encodes the sentiment information of words into existing pre-trained word embeddings, while the second builds synthetic sentimental contexts that are fed to embedding models alongside the usual semantic contexts. The word embeddings obtained from both approaches are evaluated on several sentiment classification tasks using the Skip-gram and GloVe models. Results show that both approaches improve on state-of-the-art results over sentiment analysis benchmarks while using basic deep learning models. (c) 2020 Elsevier B.V. All rights reserved.
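To make the first approach concrete, here is a minimal sketch of one way sentiment information could be encoded into existing pre-trained vectors. The abstract does not specify the injection mechanism, so this is an assumed, illustrative scheme: a polarity score from a sentiment lexicon is appended as an extra dimension to each vector. The toy vectors, the `lexicon` dictionary, and the `encode_sentiment` helper are all hypothetical, not taken from the paper.

```python
import numpy as np

# Hypothetical toy inputs: two words with nearly identical context-based
# vectors but opposite sentiment, plus a lexicon mapping words to a
# polarity score in [-1, 1].
pretrained = {
    "good": np.array([0.52, 0.13, -0.40]),
    "bad":  np.array([0.50, 0.15, -0.38]),  # almost the same context vector
}
lexicon = {"good": 1.0, "bad": -1.0}

def encode_sentiment(vectors, lexicon, weight=1.0):
    """Concatenate a scaled polarity dimension onto each word vector.
    Words missing from the lexicon get a neutral score of 0."""
    return {
        w: np.concatenate([v, [weight * lexicon.get(w, 0.0)]])
        for w, v in vectors.items()
    }

refined = encode_sentiment(pretrained, lexicon)

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# The polarity dimension pushes opposite-sentiment words apart even though
# their context-based parts remain almost identical.
print(cos(pretrained["good"], pretrained["bad"]))  # ~0.999
print(cos(refined["good"], refined["bad"]))        # noticeably lower
```

The `weight` parameter controls the trade-off between preserving the original semantic geometry and separating opposite-sentiment words; larger values separate polar word pairs more aggressively.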
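The second approach can likewise be sketched, again under stated assumptions: the abstract only says that synthetic sentimental contexts are built alongside semantic contexts, so the construction below (pairing each lexicon word with a shared polarity token and adding these pairs to the Skip-gram training data) is one plausible reading, not the paper's method. The tiny corpus and the `<POS>`/`<NEG>` tokens are invented for illustration; the Skip-gram model is gensim's `Word2Vec` with `sg=1`.

```python
from gensim.models import Word2Vec  # pip install gensim

# Hypothetical sentiment lexicon mapping words to polarity tokens.
lexicon = {"good": "<POS>", "great": "<POS>", "bad": "<NEG>", "awful": "<NEG>"}

# Ordinary semantic contexts from a (toy) corpus.
corpus = [
    ["the", "movie", "was", "good"],
    ["the", "movie", "was", "bad"],
    ["a", "great", "film"],
    ["an", "awful", "film"],
]

# Synthetic sentimental contexts: one tiny "sentence" per polar word, so
# every positive word co-occurs with <POS> and every negative word with <NEG>.
sentiment_contexts = [[word, token] for word, token in lexicon.items()]

# Train Skip-gram on the union of semantic and sentimental contexts.
model = Word2Vec(
    sentences=corpus + sentiment_contexts,
    vector_size=50, window=2, min_count=1, sg=1, epochs=50, seed=1,
)

# Polar words now also receive a sentiment signal through the shared
# polarity tokens, in addition to their ordinary context signal.
print(model.wv.similarity("good", "bad"))
```

On a realistically sized corpus, the same idea applies unchanged: the synthetic pairs are simply appended to the training sentences, so the embedding model needs no modification, which matches the abstract's framing of the approaches as simple.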