Evaluation of Semantic Relatedness Measures for Turkish Language


Sopaoglu U., ERCAN G.

17th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing), Konya, Türkiye, 3-9 April 2016, vol. 9623, pp. 600-611

  • Publication Type: Conference Paper / Full Text Paper
  • Volume: 9623
  • DOI: 10.1007/978-3-319-75477-2_43
  • City of Publication: Konya
  • Country of Publication: Türkiye
  • Pages: pp. 600-611
  • Hacettepe University Affiliated: Yes

Abstract

Quantifying the level of semantic relatedness between two words is a fundamental sub-task for many natural language processing systems. While there is a large body of research on measuring semantic relatedness in the English language, the literature lacks a detailed analysis of these methods for agglutinative languages. In this research, two new evaluation resources for the Turkish language are constructed. An extensive set of experiments involving multiple tasks (word association, semantic categorization, and automatic WordNet relationship discovery) is performed to evaluate different semantic relatedness measures in the Turkish language. As Turkish is an agglutinative language, the morphological processing component is important for distributional similarity algorithms. For languages with rich morphological variation and productivity, methods ranging from simple stemming strategies to morphological disambiguation exist. In our experiments, different morphological processing methods for the Turkish language are investigated.