UniProt: the universal protein knowledgebase

Bateman, Alex; Martin, Maria; O'Donovan, Claire; Magrane, Michele; Alpi, Emanuele; Antunes, Ricardo; Bely, Benoit; Bingley, Mark; Bonilla, Carlos; Britto, Ramona; Bursteinas, Borisas; Bye-A-Jee, Hema; Cowley, Andrew; Da Silva, Alan; De Giorgi, Maurizio; Dogan, Tunca; Fazzini, Francesco; Castro, Leyla; Figueira, Luis; Garmiri, Penelope; Georghiou, George; Gonzalez, Daniel; Hatton-Ellis, Emma; Li, Weizhong; Liu, Wudong; Lopez, Rodrigo; Luo, Jie; Lussi, Yvonne; MacDougall, Alistair; Nightingale, Andrew; Palka, Barbara; Pichler, Klemens; Poggioli, Diego; Pundir, Sangya; Pureza, Luis; Qi, Guoying; Rosanoff, Steven; Saidi, Rabie; Sawford, Tony; Shypitsyna, Aleksandra; Speretta, Elena; Turner, Edward; Tyagi, Nidhi; Volynkin, Vladimir; Wardell, Tony; Warner, Kate; Watkins, Xavier; Zaru, Rossana; Zellner, Hermann; Xenarios, Ioannis; Bougueleret, Lydie; Bridge, Alan; Poux, Sylvain; Redaschi, Nicole; Aimo, Lucila; Argoud-Puy, Ghislaine; Auchincloss, Andrea; Axelsen, Kristian; Bansal, Parit; Baratin, Delphine; Blatter, Marie-Claude; Boeckmann, Brigitte; Bolleman, Jerven; Boutet, Emmanuel; Breuza, Lionel; Casal-Casas, Cristina; de Castro, Edouard; Coudert, Elisabeth; Cuche, Beatrice; Doche, Mikael; Dornevil, Dolnide; Duvaud, Severine; Estreicher, Anne; Famiglietti, Livia; Feuermann, Marc; Gasteiger, Elisabeth; Gehant, Sebastien; Gerritsen, Vivienne; Gos, Arnaud; Gruaz-Gumowski, Nadine; Hinz, Ursula; Hulo, Chantal; Jungo, Florence; Keller, Guillaume; Lara, Vicente; Lemercier, Philippe; Lieberherr, Damien; Lombardot, Thierry; Martin, Xavier; Masson, Patrick; Morgat, Anne; Neto, Teresa; Nouspikel, Nevila; Paesano, Salvo; Pedruzzi, Ivo; Pilbout, Sandrine; Pozzato, Monica; Pruess, Manuela; Rivoire, Catherine; Roechert, Bernd; Schneider, Michel; Sigrist, Christian; Sonesson, Karin; Staehli, Sylvie; Stutz, Andre; Sundaram, Shyamala; Tognolli, Michael; Verbregue, Laure; Veuthey, Anne-Lise; Wu, Cathy; Arighi, Cecilia; Arminski, Leslie; Chen, Chuming; Chen, Yongxing; Garavelli, John; Huang, Hongzhan; Laiho, Kati; McGarvey, Peter; Natale, Darren; Ross, Karen; Vinayaka, C.; Wang, Qinghua; Wang, Yuqi; Yeh, Lai-Su; Zhang, Jian

doi:10.1093/nar/gkw1099

NUCLEIC ACIDS RESEARCH, vol. 45, 2017 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 45
Publication Date: 2017
DOI: 10.1093/nar/gkw1099
Journal Name: NUCLEIC ACIDS RESEARCH
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Open Archive Collection: AVESİS Open Access Collection
Hacettepe University Affiliated: No

Abstract

The UniProt knowledgebase is a large resource of protein sequences and associated detailed annotation. The database contains over 60 million sequences, of which over half a million sequences have been curated by experts who critically review experimental and predicted data for each protein. The remainder are automatically annotated based on rule systems that rely on the expert curated knowledge. Since our last update in 2014, we have more than doubled the number of reference proteomes to 5631, giving a greater coverage of taxonomic diversity. We implemented a pipeline to remove redundant highly similar proteomes that were causing excessive redundancy in UniProt. The initial run of this pipeline reduced the number of sequences in UniProt by 47 million. For our users interested in the accessory proteomes, we have made available sets of pan proteome sequences that cover the diversity of sequences for each species that is found in its strains and sub-strains. To help interpretation of genomic variants, we provide tracks of detailed protein information for the major genome browsers. We provide a SPARQL endpoint that allows complex queries of the more than 22 billion triples of data in UniProt (http://sparql.uniprot.org/).
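
As an illustration of the SPARQL endpoint mentioned in the abstract, the following minimal sketch builds a GET request URL that carries a small query. It is not taken from the paper. The `/sparql` path under the host, the `up:` prefix for the UniProt core vocabulary, the sample query, and the helper name `build_request_url` are all assumptions made for this example; the sketch only constructs the URL and does not contact the service.

```python
from urllib.parse import urlencode

# Assumption: the query service is exposed at /sparql under the host
# given in the abstract (http://sparql.uniprot.org/).
ENDPOINT = "http://sparql.uniprot.org/sparql"

# Assumption: a small illustrative query that lists a few protein resources,
# using the UniProt core vocabulary under the "up:" prefix.
QUERY = """
PREFIX up: <http://purl.uniprot.org/core/>
SELECT ?protein
WHERE { ?protein a up:Protein . }
LIMIT 5
"""


def build_request_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Return a GET URL that passes the query in the 'query' parameter.

    Sending the query text in a 'query' URL parameter follows the
    SPARQL 1.1 Protocol; the helper itself is hypothetical.
    """
    return endpoint + "?" + urlencode({"query": query.strip()})


if __name__ == "__main__":
    # Print the URL; fetching it with an HTTP client would run the query.
    print(build_request_url(QUERY))
```

The resulting URL can be passed to any HTTP client; an `Accept` header is the usual way to request a particular result format, though the formats the service supports are not stated in the abstract.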