Keyphrase extraction through query performance prediction


Ercan G., ÇİÇEKLİ İ.

JOURNAL OF INFORMATION SCIENCE, vol.38, no.5, pp.476-488, 2012 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 38 Issue: 5
  • Publication Date: 2012
  • DOI Number: 10.1177/0165551512448984
  • Journal Name: JOURNAL OF INFORMATION SCIENCE
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Social Sciences Citation Index (SSCI), Scopus
  • Pages: pp.476-488
  • Hacettepe University Affiliated: Yes

Abstract

Previous research shows that keyphrases are useful tools in document retrieval and navigation. While these findings point to a relation between keyphrases and document retrieval performance, no other work uses this relationship to identify the keyphrases of a given document. This work aims to establish a link between the problems of query performance prediction (QPP) and keyphrase extraction. To this end, features used in QPP are evaluated in keyphrase extraction using a naive Bayes classifier. Our experiments indicate that these features improve the effectiveness of keyphrase extraction in documents of different lengths. More importantly, the commonly used features of frequency and first position in text perform poorly on shorter documents, whereas QPP features are more robust and achieve better results.
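
The setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list the QPP features or say how the naive Bayes likelihoods are modelled. The sketch assumes an average-IDF score as a stand-in for a QPP feature and Gaussian likelihoods. The names `candidate_features` and `TinyGaussianNB`, and all toy values, are hypothetical.

```python
import math


def candidate_features(phrase, doc_tokens, doc_freq, n_docs):
    """Return [relative frequency, relative first position, average IDF].

    The first two are the conventional features named in the abstract;
    average IDF is an ASSUMED stand-in for a QPP-style feature.
    """
    words = phrase.split()
    n = len(words)
    # Start offsets where the phrase occurs in the tokenized document.
    positions = [i for i in range(len(doc_tokens) - n + 1)
                 if doc_tokens[i:i + n] == words]
    length = max(len(doc_tokens), 1)
    rel_freq = len(positions) / length
    first_pos = positions[0] / length if positions else 1.0
    # Smoothed IDF, averaged over the words of the phrase.
    avg_idf = sum(math.log((n_docs + 1) / (doc_freq.get(w, 0) + 1))
                  for w in words) / n
    return [rel_freq, first_pos, avg_idf]


class TinyGaussianNB:
    """Naive Bayes with one Gaussian per feature and class (an assumption)."""

    def fit(self, X, y):
        self.stats = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            cols = list(zip(*rows))
            means = [sum(c) / len(c) for c in cols]
            # Small constant keeps variances strictly positive.
            variances = [sum((v - m) ** 2 for v in c) / len(c) + 1e-6
                         for c, m in zip(cols, means)]
            self.stats[label] = (len(rows) / len(X), means, variances)
        return self

    def log_posterior(self, x, label):
        prior, means, variances = self.stats[label]
        score = math.log(prior)
        for v, m, s2 in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return score

    def predict(self, x):
        return max(self.stats, key=lambda label: self.log_posterior(x, label))


# Toy training data (made up): label 1 = keyphrase, 0 = not a keyphrase.
X_train = [[0.05, 0.0, 3.0], [0.04, 0.1, 2.8],
           [0.01, 0.8, 0.5], [0.005, 0.9, 0.3]]
y_train = [1, 1, 0, 0]
model = TinyGaussianNB().fit(X_train, y_train)
```

In this sketch each candidate phrase becomes one feature vector, and `model.predict` labels it as keyphrase or non-keyphrase. Further QPP features would be added as extra columns.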