TurkishDelightNLP: A Neural Turkish NLP Toolkit


Alecakir H., BÖLÜCÜ N., Can B.

Conference of the North-American-Chapter-of-the-Association-for-Computational-Linguistics (NAAACL) - Human Language Technologies, Washington, Amerika Birleşik Devletleri, 10 - 15 Temmuz 2022, ss.17-26 identifier identifier

  • Yayın Türü: Bildiri / Tam Metin Bildiri
  • Basıldığı Şehir: Washington
  • Basıldığı Ülke: Amerika Birleşik Devletleri
  • Sayfa Sayıları: ss.17-26
  • Hacettepe Üniversitesi Adresli: Evet

Özet

We introduce a neural Turkish NLP toolkit called TurkishDelightNLP that performs computational linguistic analyses from morphological level to semantic level that involves tasks such as stemming, morphological segmentation, morphological tagging, part-of-speech tagging, dependency parsing, and semantic parsing, as well as high-level NLP tasks such as named entity recognition. We publicly share the open-source Turkish NLP toolkit through a web interface that allows an input text to be analysed in real-time, as well as the open source implementation of the components provided in the toolkit, an API, and several annotated datasets such as word similarity test set to evaluate word embeddings and UCCA-based semantic annotation in Turkish. This will be the first open-source Turkish NLP toolkit that involves a range of NLP tasks in all levels. We believe that it will be useful for other researchers in Turkish NLP and will be also beneficial for other high-level NLP tasks in Turkish.