URL: | http://www.genomeindia.org/biocuration |
Full name: | Manually Curated Database of Rice Proteins |
Description: | The ‘Manually Curated Database of Rice Proteins’ is a literature-based, manually curated, protein-centric database of rice proteins. The database provides experimental data embedded in published articles in a computer-searchable format. The literature curation workflow is fundamentally different from most other literature mining approaches, which concentrate on mining the text of the articles to extract information. The in-house developed manual curation models enable digitization of the experimental data itself. Emphasis is given to experiments that provide direct information about the expression or activity of a protein or gene. Thus, data from experiments such as quantitative/semi-quantitative RT-PCR, Northern analysis, protein-protein or DNA-protein interaction, enzymatic assays, trait analysis etc. have been manually digitized using these in-house developed data curation models. As a result of such curation, one can search, for example, all RT-PCR-based gene expression data for a particular rice protein, published across several different publications, in a matter of seconds. The data curation models extensively utilize well-known ontologies such as Gene Ontology (GO), Plant Ontology (PO), Trait Ontology (TO), Environmental Ontology (EO) etc. to facilitate seamless integration of experimental data across publications. Since the existing ontologies were not sufficient to represent the immensely diverse data available in published literature, a large number of new terms have also been appended to the existing ontologies. Moreover, several other coding systems were developed to systematically capture aspects of the experimental data that were beyond the scope of the existing ontologies. The current release of the database has data for over 2390 rice proteins from over 550 research articles. |
Year founded: | 2014 |
Last update: | |
Version: | |
Accessibility: | |
Country/Region: | India |
Data type: | |
Data object: | |
Database category: | |
Major species: | |
Keywords: |
University/Institution: | University of Delhi South Campus |
Address: | Department of Plant Molecular Biology, University of Delhi South Campus, Benito Juarez Road, New Delhi – 110021, India |
City: | |
Province/State: | |
Country/Region: | India |
Contact name (PI/Team): | Saurabh Raghuvanshi |
Contact email (PI/Helpdesk): | saurabh@genomeindia.org |
Manually curated database of rice proteins. [PMID: 24214963]
'Manually Curated Database of Rice Proteins' (MCDRP), available at http://www.genomeindia.org/biocuration, is a unique curated database based on published experimental data. Semantic integration of scientific data is essential to gain a higher level of understanding of biological systems. Since the majority of scientific data is available as published literature, text mining is an essential step before the data can be integrated and made available for computer-based search in various databases. However, text mining is a tedious exercise, and thus there is a large gap between the data available in curated databases and the data in published literature. Moreover, data in an experiment can be perceived from several perspectives, which may not be reflected in text-based curation. In order to address such issues, we have demonstrated the feasibility of digitizing the experimental data itself by creating a database on rice proteins based on in-house developed data curation models. Using these models, data from individual experiments have been digitized with the help of universal ontologies. Currently, the database has data for over 1800 rice proteins curated from >4000 different experiments of over 400 research articles. Since every aspect of the experiment, such as gene name, plant type, tissue and developmental stage, has been digitized, experimental data can be rapidly accessed and integrated. |
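To make the idea of digitizing the experiment itself more concrete, the following is a minimal sketch of how one curated experiment might be represented and searched across publications. It is not the MCDRP schema or API: the record fields, the `search` helper, the gene names, the ontology identifiers and the article identifiers are all hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentRecord:
    """One digitized experiment. All field names are hypothetical."""
    protein: str      # rice protein/gene the experiment reports on
    method: str       # experiment type, e.g. "RT-PCR" or "Northern"
    tissue_term: str  # placeholder ontology ID for the tissue
    stage_term: str   # placeholder ontology ID for the developmental stage
    observation: str  # digitized outcome, e.g. "up-regulated"
    source: str       # placeholder identifier of the source article


# Illustrative records only: every identifier below is a placeholder,
# not real curated data.
RECORDS = [
    ExperimentRecord("GeneA", "RT-PCR", "PO:placeholder-1",
                     "PO:placeholder-9", "up-regulated", "article-1"),
    ExperimentRecord("GeneA", "RT-PCR", "PO:placeholder-2",
                     "PO:placeholder-9", "no change", "article-2"),
    ExperimentRecord("GeneA", "Northern", "PO:placeholder-1",
                     "PO:placeholder-9", "up-regulated", "article-3"),
    ExperimentRecord("GeneB", "RT-PCR", "PO:placeholder-1",
                     "PO:placeholder-8", "down-regulated", "article-1"),
]


def search(records, protein, method):
    """Return records for one protein and method, grouped by source article."""
    grouped = defaultdict(list)
    for rec in records:
        if rec.protein == protein and rec.method == method:
            grouped[rec.source].append(rec)
    return dict(grouped)


if __name__ == "__main__":
    # Collect all RT-PCR data for one protein across publications.
    for source, recs in search(RECORDS, "GeneA", "RT-PCR").items():
        for rec in recs:
            print(source, rec.tissue_term, rec.observation)
```

Because each aspect of the experiment is stored as a separate coded field rather than as free text, a query for one protein and one experiment type reduces to a simple filter, and results from different articles can be grouped or compared directly.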