a catalog of biological databases
|Description:||The ITS RefSeq Targeted Loci project is the result of an international collaboration with fungal taxonomic specialists to verify and provide a curated set of complete and near full length sequences with specimen data and correct taxonomic names. Collaborators at MycoBank, Index Fungorum, ISHAM, UNITE and curators from various culture and herbarium collections around the world contributed to this effort. ITS RefSeq accessions (NR_ ) include sequences mostly obtained from type specimens and a few from verified specimens (ultimately to be replaced with sequences from type specimens). The collection source of type material are indicated in each record and collection acronyms follows the collection codes maintained at the NCBI collections database.|
|University/Institution:||National Center for Biotechnology Information|
|Address:||National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA|
|Contact name (PI/Team):||Conrad L. Schoch|
|Contact email (PI/Helpdesk):||email@example.com|
The ITS (nuclear ribosomal internal transcribed spacer) RefSeq database at the National Center for Biotechnology Information (NCBI) is dedicated to the clear association between name, specimen and sequence data. This database is focused on sequences obtained from type material stored in public collections. While the initial ITS sequence curation effort together with numerous fungal taxonomy experts attempted to cover as many orders as possible, we extended our latest focus to the family and genus ranks. We focused on Trichoderma for several reasons, mainly because the asexual and sexual synonyms were well documented, and a list of proposed names and type material were recently proposed and published. In this case study the recent taxonomic information was applied to do a complete taxonomic audit for the genus Trichoderma in the NCBI Taxonomy database. A name status report is available here: https://www.ncbi.nlm.nih.gov/Taxonomy/TaxIdentifier/tax_identifier.cgi. As a result, the ITS RefSeq Targeted Loci database at NCBI has been augmented with more sequences from type and verified material from Trichoderma species. Additionally, to aid in the cross referencing of data from single loci and genomes we have collected a list of quality records of the RPB2 gene obtained from type material in GenBank that could help validate future submissions. During the process of curation misidentified genomes were discovered, and sequence records from type material were found hidden under previous classifications. Source metadata curation, although more cumbersome, proved to be useful as confirmation of the type material designation.
DNA phylogenetic comparisons have shown that morphology-based species recognition often underestimates fungal diversity. Therefore, the need for accurate DNA sequence data, tied to both correct taxonomic names and clearly annotated specimen data, has never been greater. Furthermore, the growing number of molecular ecology and microbiome projects using high-throughput sequencing require fast and effective methods for en masse species assignments. In this article, we focus on selecting and re-annotating a set of marker reference sequences that represent each currently accepted order of Fungi. The particular focus is on sequences from the internal transcribed spacer region in the nuclear ribosomal cistron, derived from type specimens and/or ex-type cultures. Re-annotated and verified sequences were deposited in a curated public database at the National Center for Biotechnology Information (NCBI), namely the RefSeq Targeted Loci (RTL) database, and will be visible during routine sequence similarity searches with NR_prefixed accession numbers. A set of standards and protocols is proposed to improve the data quality of new sequences, and we suggest how type and other reference sequences can be used to improve identification of Fungi. Database URL: http://www.ncbi.nlm.nih.gov/bioproject/PRJNA177353.