Information retrieval effectiveness of Turkish search engines

Bitirim Y., Tonta Y. A., Sever H.

ADVANCES IN INFORMATION SYSTEMS, vol.2457, pp.93-103, 2002 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 2457
  • Publication Date: 2002
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED)
  • Page Numbers: pp.93-103
  • Hacettepe University Affiliated: Yes


This study investigates the information retrieval performance of Turkish search engines with respect to precision, normalized recall, coverage and novelty ratios. We defined seventeen query topics and ran them on Arabul, Arama, Netbul and Superonline. The queries were carefully selected to assess each search engine's capability for handling broad and narrow topics, excluding particular information, identifying and indexing Turkish characters, retrieving hub/authoritative pages, stemming Turkish words, and correctly interpreting Boolean operators. We classified each document in a retrieval output as "relevant" or "nonrelevant" in order to calculate precision and normalized recall ratios at various cut-off points for each pair of query topic and search engine. We also computed the coverage and novelty ratios for each search engine and tested how the search engines handle meta-tags and dead links. Arama appears to be the best Turkish search engine in terms of average precision and normalized recall ratios, and in its coverage of Turkish sites. Turkish characters, as well as stemming, still cause bottlenecks for Turkish search engines. Superonline and Netbul make use of the indexing information in meta-tag fields to improve retrieval results.
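
As an illustration of the two ranking measures named in the abstract, the following is a minimal Python sketch of how precision at a cut-off point and a normalized recall value might be computed from binary relevance judgments. The abstract does not state which formulas were used, so the normalized recall here is an assumption: it follows one common textbook definition (Rocchio's), applied only to the documents in the judged retrieval output. The function names and the sample judgments are hypothetical and are not data from the study.

```python
from typing import List, Optional


def precision_at(relevance: List[bool], cutoff: int) -> float:
    """Fraction of the first `cutoff` ranked documents judged relevant.

    Assumption: positions beyond the end of a short output count as
    nonrelevant, so the denominator is always `cutoff`.
    """
    if cutoff <= 0:
        raise ValueError("cutoff must be positive")
    return sum(relevance[:cutoff]) / cutoff


def normalized_recall(relevance: List[bool]) -> Optional[float]:
    """Rocchio-style normalized recall over a judged ranked list.

    Assumed formula (not taken from the abstract):
        1 - (sum of actual ranks - sum of ideal ranks) / (n * (N - n))
    where n is the number of relevant documents and N is the length of
    the judged output. Returns None when the measure is undefined
    (no relevant documents, or every document relevant).
    """
    ranks = [i for i, rel in enumerate(relevance, start=1) if rel]
    n, total = len(ranks), len(relevance)
    if n == 0 or n == total:
        return None
    ideal_rank_sum = n * (n + 1) // 2  # relevant documents at ranks 1..n
    return 1 - (sum(ranks) - ideal_rank_sum) / (n * (total - n))


# Hypothetical judgments for one query on one search engine, in rank order.
judged = [True, False, True, False, False]

for cutoff in (1, 2, 5):
    print(f"precision at {cutoff}: {precision_at(judged, cutoff):.2f}")
print(f"normalized recall: {normalized_recall(judged):.3f}")
```

Under this definition a ranking that places all relevant documents first scores 1, and one that places them all last scores 0, which makes values comparable across queries with different numbers of relevant documents.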