Two learning approaches for protein name extraction

Tatar, S.; Cicekli, Ilyas

doi:10.1016/j.jbi.2009.05.004

Tatar S., Cicekli I.

JOURNAL OF BIOMEDICAL INFORMATICS, vol.42, no.6, pp.1046-1055, 2009 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 42, Issue: 6
Publication Date: 2009
DOI: 10.1016/j.jbi.2009.05.004
Journal Name: JOURNAL OF BIOMEDICAL INFORMATICS
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Page Numbers: pp.1046-1055
Hacettepe University Affiliated: No

Abstract

Protein name extraction, one of the basic tasks in automatic extraction of information from biological texts, remains challenging. In this paper, we explore the use of two different machine learning techniques and present the results of the conducted experiments. In the first method, a bigram language model is used to extract protein names. In the second, we use an automatic rule learning method that can identify protein names located in biological texts. In both cases, we generalize protein names by using hierarchically categorized syntactic token types.
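
As a rough illustration of the first approach, the sketch below maps tokens to coarse syntactic types and estimates bigram probabilities over those types. It is not the authors' implementation: the type names (`ALLCAPS`, `ALPHANUM`, and so on), the flat (non-hierarchical) categorization, the add-one smoothing, and the toy training names are all assumptions made for this example.

```python
import re
from collections import Counter

def token_type(token):
    """Map a token to a coarse syntactic type (hypothetical, flat categories)."""
    if re.fullmatch(r"\d+", token):
        return "NUM"
    if re.fullmatch(r"[A-Z]+", token):
        return "ALLCAPS"
    if re.fullmatch(r"[A-Z][a-z]+", token):
        return "CAPITALIZED"
    if re.fullmatch(r"[a-z]+", token):
        return "LOWER"
    if re.fullmatch(r"[A-Za-z]+\d+[A-Za-z\d]*", token):
        return "ALPHANUM"
    return "OTHER"

def bigram_counts(sequences):
    """Count type bigrams and their left contexts, with boundary markers."""
    counts, context = Counter(), Counter()
    for seq in sequences:
        padded = ["<s>"] + seq + ["</s>"]
        for prev, cur in zip(padded, padded[1:]):
            counts[(prev, cur)] += 1
            context[prev] += 1
    return counts, context

def bigram_prob(prev, cur, counts, context, vocab_size, alpha=1.0):
    """Additively smoothed estimate of P(cur | prev)."""
    return (counts[(prev, cur)] + alpha) / (context[prev] + alpha * vocab_size)

# Toy training data (assumed): tokenized protein names.
names = [["IL", "2"], ["Interleukin", "2"], ["p53"]]
typed = [[token_type(t) for t in name] for name in names]
counts, context = bigram_counts(typed)

# Six token types plus the two boundary markers.
VOCAB_SIZE = 8
p_seen = bigram_prob("ALLCAPS", "NUM", counts, context, VOCAB_SIZE)
p_unseen = bigram_prob("ALLCAPS", "LOWER", counts, context, VOCAB_SIZE)
```

Because the model is trained on token types rather than surface strings, a name unseen in training, such as a different all-caps abbreviation followed by a number, receives the same type-bigram probability as "IL 2".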