Automatic Term Extraction with Joint Multilingual Learning


Karaman I. N., ÇİÇEKLİ İ., Ercan G.

7th International Conference on Computer Science and Engineering, UBMK 2022, Diyarbakır, Türkiye, 14 - 16 September 2022, pp.159-164

  • Publication Type: Conference Paper / Full Text Paper
  • DOI Number: 10.1109/ubmk55850.2022.9919455
  • City of Publication: Diyarbakır
  • Country of Publication: Türkiye
  • Page Numbers: pp.159-164
  • Keywords: Automatic term extraction, bidirectional long short-term memory, deep learning, joint multilingual learning, neural sequence labeling
  • Hacettepe University Affiliated: Yes

Abstract

© 2022 IEEE. Automatic term extraction using deep learning achieves promising results when sufficient training data exists. Unfortunately, some languages lack such resources in some domains, which leads to poor performance due to under-fitting. To tackle this problem, we propose a joint multilingual deep learning model that extracts terms through sequence labeling and is trained on multilingual data with aligned word embeddings. Our evaluation results demonstrate that the multilingual model improves automatic term extraction compared with a monolingual model trained on limited training data. Although the improvement varies with the domain and the size of the data, the highest improvement in F1-score is 10.1% in the domain of Computer Science, and the lowest is 7.6% in the domain of Electronic Engineering. Our multilingual model also achieves competitive results compared with a monolingual model trained on sufficient training data.
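
To make the sequence-labeling framing in the abstract concrete, the following is a minimal sketch of how per-token tags can be turned into term spans and scored with F1. It is illustrative only: the abstract does not state the tagging scheme or the exact scoring protocol, so the B/I/O tag set, exact-match span scoring, the function names `decode_terms` and `span_f1`, and the example sentence are all assumptions rather than the authors' implementation.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start index, exclusive end index, term text)


def decode_terms(tokens: List[str], tags: List[str]) -> List[Span]:
    """Collect term spans from B/I/O tags (assumed scheme, not from the paper)."""
    spans: List[Span] = []
    start = None
    # A trailing "O" sentinel closes any span that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag == "I" and start is not None
        if start is not None and not continues:
            spans.append((start, i, " ".join(tokens[start:i])))
            start = None
        # A stray "I" with no open span is treated as the start of a term.
        if tag == "B" or (tag == "I" and start is None):
            start = i
    return spans


def span_f1(gold: List[Span], pred: List[Span]) -> float:
    """Exact-match F1 over spans (assumed protocol for illustration)."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example sentence and tag sequences.
tokens = ["The", "neural", "network", "uses", "backpropagation"]
gold_tags = ["O", "B", "I", "O", "B"]
pred_tags = ["O", "B", "I", "O", "O"]

gold_spans = decode_terms(tokens, gold_tags)
pred_spans = decode_terms(tokens, pred_tags)
score = span_f1(gold_spans, pred_spans)
```

In this sketch the predicted tags recover one of the two gold terms, so precision is perfect while recall is halved. A tagger such as the bidirectional LSTM named in the keywords would supply `pred_tags`; that model is outside the scope of this example.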