Topic segmentation using word-level semantic relatedness functions


ERCAN G., ÇİÇEKLİ İ.

JOURNAL OF INFORMATION SCIENCE, vol.42, no.5, pp.597-608, 2016 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 42 Issue: 5
  • Publication Date: 2016
  • DOI: 10.1177/0165551515602460
  • Journal Name: JOURNAL OF INFORMATION SCIENCE
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Social Sciences Citation Index (SSCI), Scopus
  • Pages: pp.597-608
  • Affiliated with Hacettepe University: Yes

Abstract

Semantic relatedness deals with the problem of measuring how much two words are related to each other. While there is a large body of research on developing new measures, the use of semantic relatedness (SR) measures in topic segmentation has not been explored. In this research, the performance of different SR measures is evaluated on the topic segmentation problem. To this end, two topic segmentation algorithms that use the difference in SR of words are introduced. Our results indicate that using an SR measure trained on general-domain corpora achieves better results than topic segmentation algorithms using WordNet or simple word repetition. Furthermore, when compared with computationally more complex algorithms performing global analysis, our local analysis, enhanced with general-domain lexical semantic information, achieves comparable results.
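
The abstract does not spell out the two algorithms, so the following is only a minimal illustrative sketch of the general idea of local, SR-based segmentation, not the authors' method. It assumes a window-based scheme: for each gap between sentences, the words in a few sentences on either side are compared with a word-level relatedness function, and a boundary is placed where the average relatedness is a local minimum below a threshold. The names `toy_sr`, `TOY_GROUPS`, `gap_scores`, `segment`, and the `window` and `threshold` parameters are all hypothetical; a real SR measure would be trained on a general-domain corpus or derived from WordNet.

```python
from itertools import product
from typing import Callable, List, Sequence

# Hypothetical toy relatedness table, used only to make the sketch runnable.
TOY_GROUPS = [
    {"cat", "dog", "pet", "animal"},
    {"stock", "market", "bank", "price"},
]


def toy_sr(w1: str, w2: str) -> float:
    """Return 1.0 for identical words, 0.8 for words in the same toy group, else 0.0."""
    if w1 == w2:
        return 1.0
    return 0.8 if any(w1 in g and w2 in g for g in TOY_GROUPS) else 0.0


def gap_scores(
    sentences: Sequence[Sequence[str]],
    sr: Callable[[str, str], float],
    window: int = 2,
) -> List[float]:
    """Mean pairwise relatedness between the word blocks on each side of every sentence gap."""
    scores = []
    for gap in range(1, len(sentences)):
        left = [w for s in sentences[max(0, gap - window):gap] for w in s]
        right = [w for s in sentences[gap:gap + window] for w in s]
        pairs = list(product(left, right))
        scores.append(sum(sr(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0)
    return scores


def segment(
    sentences: Sequence[Sequence[str]],
    sr: Callable[[str, str], float],
    window: int = 2,
    threshold: float = 0.2,
) -> List[int]:
    """Return indices i such that a topic boundary is placed just before sentence i.

    A gap qualifies when its score is below the threshold and is not higher
    than the scores of its neighbouring gaps (a local minimum).
    """
    scores = gap_scores(sentences, sr, window)
    boundaries = []
    for i, s in enumerate(scores):
        left_ok = i == 0 or s <= scores[i - 1]
        right_ok = i == len(scores) - 1 or s <= scores[i + 1]
        if s < threshold and left_ok and right_ok:
            boundaries.append(i + 1)
    return boundaries


if __name__ == "__main__":
    doc = [
        ["cat", "pet"],
        ["dog", "animal"],
        ["stock", "price"],
        ["bank", "market"],
    ]
    print(segment(doc, toy_sr))  # prints [2]
```

In this toy document no word is repeated across sentences, so a segmenter relying on word repetition alone would see no cohesion anywhere; the relatedness function is what separates the animal sentences from the finance sentences. Because only a small window around each gap is inspected, the sketch is a local analysis in the sense the abstract contrasts with global approaches.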