Keyphrase extraction through query performance prediction


Ercan G., ÇİÇEKLİ İ.

JOURNAL OF INFORMATION SCIENCE, vol.38, pp.476-488, 2012 (Journal Indexed in SCI)

  • Volume: 38 Issue: 5
  • Publication Date: 2012
  • DOI: 10.1177/0165551512448984
  • Journal Name: JOURNAL OF INFORMATION SCIENCE
  • Pages: pp.476-488

Abstract

Previous research shows that keyphrases are useful tools in document retrieval and navigation. While these findings point to a relation between keyphrases and document retrieval performance, no other work uses this relationship to identify the keyphrases of a given document. This work aims to establish a link between the problems of query performance prediction (QPP) and keyphrase extraction. To this end, features used in QPP are evaluated in keyphrase extraction using a naive Bayes classifier. Our experiments indicate that these features improve the effectiveness of keyphrase extraction in documents of different lengths. More importantly, the commonly used features of frequency and first position in text perform poorly on shorter documents, whereas QPP features are more robust and achieve better results.
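
As an illustration only, the general setup described in the abstract (candidate phrases scored by a naive Bayes classifier over frequency, first position, and a QPP-style feature) might be sketched as follows. The abstract does not list the specific QPP features used, so the averaged inverse document frequency below is an assumed stand-in, and the names `phrase_features` and `TinyGaussianNB`, the Gaussian likelihood, and the toy numbers are all hypothetical choices rather than the authors' implementation.

```python
import math


def phrase_features(phrase, doc_tokens, doc_freq, n_docs):
    """Features for one candidate phrase (a tuple of tokens).

    Returns [relative frequency, normalized first position, averaged IDF].
    Averaged IDF is an assumed example of a QPP-style feature.
    """
    n = len(phrase)
    hits = [i for i in range(len(doc_tokens) - n + 1)
            if tuple(doc_tokens[i:i + n]) == phrase]
    rel_freq = len(hits) / len(doc_tokens)
    # A phrase that never occurs gets the worst possible position (1.0).
    first_pos = hits[0] / len(doc_tokens) if hits else 1.0
    avg_idf = sum(math.log(n_docs / (1 + doc_freq.get(t, 0)))
                  for t in phrase) / n
    return [rel_freq, first_pos, avg_idf]


class TinyGaussianNB:
    """Minimal Gaussian naive Bayes: features are independent given the class."""

    def fit(self, X, y):
        self.params = {}
        for label in set(y):
            rows = [x for x, lab in zip(X, y) if lab == label]
            prior = len(rows) / len(X)
            stats = []
            for col in zip(*rows):
                mean = sum(col) / len(col)
                # Small constant keeps the variance strictly positive.
                var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-6
                stats.append((mean, var))
            self.params[label] = (prior, stats)
        return self

    def log_score(self, x, label):
        """Log prior plus the sum of per-feature Gaussian log-likelihoods."""
        prior, stats = self.params[label]
        score = math.log(prior)
        for v, (mean, var) in zip(x, stats):
            score += -0.5 * math.log(2 * math.pi * var)
            score -= (v - mean) ** 2 / (2 * var)
        return score

    def predict(self, x):
        return max(self.params, key=lambda label: self.log_score(x, label))


# Toy training data (made up): True marks a keyphrase, False a non-keyphrase.
X_train = [[0.15, 0.00, 3.4], [0.05, 0.10, 3.0],
           [0.08, 0.30, 0.2], [0.02, 0.70, 0.4]]
y_train = [True, True, False, False]
model = TinyGaussianNB().fit(X_train, y_train)

doc = ("query performance prediction helps keyphrase extraction because "
       "query performance prediction scores terms").split()
doc_freq = {"query": 50, "performance": 40, "prediction": 30, "helps": 900}
candidate = ("query", "performance", "prediction")
print(candidate, model.predict(phrase_features(candidate, doc, doc_freq, 1000)))
```

In this sketch a phrase made of rare terms receives a high averaged IDF regardless of document length, which hints at why such features could stay informative on short texts where frequency and first position vary little between candidates. A real system would train on labeled documents and use the feature set defined in the paper.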