Database Commons

a catalog of biological databases

Database information

Rfam (RNA Families)

General information

Description: The Rfam database is a collection of non-coding RNA families represented by manually curated sequence alignments, consensus secondary structures and annotation gathered from corresponding Wikipedia, taxonomy and ontology resources.
Year founded: 2003
Last update: 2019-1
Version: v14.1
Accessibility: Accessible (manual check)
Country/Region: United Kingdom
Data type: RNA

Contact information

University/Institution: European Bioinformatics Institute
Address: Wellcome Genome Campus, Hinxton, Cambridge, UK
City: Cambridge
Province/State: Cambridgeshire
Country/Region: United Kingdom
Contact name (PI/Team): Robert D. Finn
Contact email (PI/Helpdesk): rdf@ebi.ac.uk

Record metadata

Created on: 2015-06-20
Curated by:
Lina Ma [2019-04-18]
[2018-11-27]
Lina Ma [2018-06-04]
Dong Zou [2018-03-05]
Lin Xia [2016-04-01]
Mengwei Li [2016-02-21]
Mengwei Li [2016-02-18]
Lina Ma [2015-06-27]
Lin Xia [2015-06-26]

Ranking

All databases: 50/4499 (98.911%)
Gene genome and annotation: 27/1199 (97.832%)
Total rank: 50
Citations: 2,894
z-index: 180.875
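
The ranking figures above can be cross-checked with a little arithmetic. The sketch below is only a consistency check, not Database Commons' documented method: the two formulas (percentile as the share of databases ranked at or below this position, and z-index as total citations divided by the database's age in years) are assumptions inferred from the numbers on this page, and the function names are hypothetical. The per-paper citation counts are the ones listed under Publications, and the age uses the founding year 2003 and the 2019 citation snapshot.

```python
# Consistency check for the ranking figures on this record.
# ASSUMPTION: both formulas are inferred from the numbers shown here;
# they are not taken from Database Commons documentation.

def percentile(rank: int, total: int) -> float:
    """Percent of databases ranked at or below the given position."""
    return round((total - rank + 1) / total * 100, 3)

def z_index(citations: int, year_founded: int, snapshot_year: int) -> float:
    """Total citations divided by database age in years (assumed definition)."""
    return citations / (snapshot_year - year_founded)

# Citation counts per publication, as listed under Publications.
per_paper_citations = [70, 10, 11, 398, 387, 245, 453, 654, 666]
total_citations = sum(per_paper_citations)

print(percentile(50, 4499))                  # all databases
print(percentile(27, 1199))                  # gene genome and annotation
print(total_citations)                       # total citations
print(z_index(total_citations, 2003, 2019))  # z-index
```

Under these assumptions the script reproduces the values shown in this record (98.911, 97.832, 2894 and 180.875), which suggests the assumed formulas match how the page derives them.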

Community reviews

Not rated. Review criteria: data quality & quantity; content organization & presentation; system accessibility & reliability.

Publications

Rfam 13.0: shifting to a genome-centric resource for non-coding RNA families. [PMID: 29112718]
Ioanna Kalvari, Joanna Argasinska, Natalia Quinones-Olvera, Eric P Nawrocki, Elena Rivas, Sean R Eddy, Alex Bateman, Robert D Finn, Anton I Petrov

The Rfam database is a collection of RNA families in which each family is represented by a multiple sequence alignment, a consensus secondary structure, and a covariance model. In this paper we introduce Rfam release 13.0, which switches to a new genome-centric approach that annotates a non-redundant set of reference genomes with RNA families. We describe new web interface features including faceted text search and R-scape secondary structure visualizations. We discuss a new literature curation workflow and a pipeline for building families based on RNAcentral. There are 236 new families in release 13.0, bringing the total number of families to 2687. The Rfam website is http://rfam.org.

Nucleic Acids Res. 2018:46(D1) | 70 Citations (from Europe PMC, 2019-12-14)

Non-Coding RNA Analysis Using the Rfam Database. [PMID: 29927072]
Ioanna Kalvari, Eric P Nawrocki, Joanna Argasinska, Natalia Quinones-Olvera, Robert D Finn, Alex Bateman, Anton I Petrov

Rfam is a database of non-coding RNA families in which each family is represented by a multiple sequence alignment, a consensus secondary structure, and a covariance model. Using a combination of manual and literature-based curation and a custom software pipeline, Rfam converts descriptions of RNA families found in the scientific literature into computational models that can be used to annotate RNAs belonging to those families in any DNA or RNA sequence. Valuable research outputs that are often locked up in figures and supplementary information files are encapsulated in Rfam entries and made accessible through the Rfam Web site. The data produced by Rfam have a broad application, from genome annotation to providing training sets for algorithm development. This article gives an overview of how to search and navigate the Rfam Web site, and how to annotate sequences with RNA families. The Rfam database is freely available at http://rfam.org. © 2018 by John Wiley & Sons, Inc.

Curr Protoc Bioinformatics. 2018:62(1) | 10 Citations (from Europe PMC, 2019-12-14)

Rfam: annotating families of non-coding RNA sequences. [PMID: 25577390]
Jennifer Daub, Ruth Y Eberhardt, John G Tate, Sarah W Burge

The primary task of the Rfam database is to collate experimentally validated noncoding RNA (ncRNA) sequences from the published literature and facilitate the prediction and annotation of new homologues in novel nucleotide sequences. We group homologous ncRNA sequences into "families" and related families are further grouped into "clans." We collate and manually curate data cross-references for these families from other databases and external resources. Our Web site offers researchers a simple interface to Rfam and provides tools with which to annotate their own sequences using our covariance models (CMs), through our tools for searching, browsing, and downloading information on Rfam families. In this chapter, we will work through examples of annotating a query sequence, collating family information, and searching for data.

Methods Mol Biol. 2015:1269 | 11 Citations (from Europe PMC, 2019-12-14)

Rfam 12.0: updates to the RNA families database. [PMID: 25392425]
Eric P Nawrocki, Sarah W Burge, Alex Bateman, Jennifer Daub, Ruth Y Eberhardt, Sean R Eddy, Evan W Floden, Paul P Gardner, Thomas A Jones, John Tate, Robert D Finn

The Rfam database (available at http://rfam.xfam.org) is a collection of non-coding RNA families represented by manually curated sequence alignments, consensus secondary structures and annotation gathered from corresponding Wikipedia, taxonomy and ontology resources. In this article, we detail updates and improvements to the Rfam data and website for the Rfam 12.0 release. We describe the upgrade of our search pipeline to use Infernal 1.1 and demonstrate its improved homology detection ability by comparison with the previous version. The new pipeline is easier for users to apply to their own data sets, and we illustrate its ability to annotate RNAs in genomic and metagenomic data sets of various sizes. Rfam has been expanded to include 260 new families, including the well-studied large subunit ribosomal RNA family, and for the first time includes information on short sequence- and structure-based RNA motifs present within families. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

Nucleic Acids Res. 2015:43(Database issue) | 398 Citations (from Europe PMC, 2019-12-14)

Rfam 11.0: 10 years of RNA families. [PMID: 23125362]
Sarah W Burge, Jennifer Daub, Ruth Eberhardt, John Tate, Lars Barquist, Eric P Nawrocki, Sean R Eddy, Paul P Gardner, Alex Bateman

The Rfam database (available via the website at http://rfam.sanger.ac.uk and through our mirror at http://rfam.janelia.org) is a collection of non-coding RNA families, primarily RNAs with a conserved RNA secondary structure, including both RNA genes and mRNA cis-regulatory elements. Each family is represented by a multiple sequence alignment, predicted secondary structure and covariance model. Here we discuss updates to the database in the latest release, Rfam 11.0, including the introduction of genome-based alignments for large families, the introduction of the Rfam Biomart as well as other user interface improvements. Rfam is available under the Creative Commons Zero license.

Nucleic Acids Res. 2013:41(Database issue) | 387 Citations (from Europe PMC, 2019-12-14)

Rfam: Wikipedia, clans and the "decimal" release. [PMID: 21062808]
Paul P Gardner, Jennifer Daub, John Tate, Benjamin L Moore, Isabelle H Osuch, Sam Griffiths-Jones, Robert D Finn, Eric P Nawrocki, Diana L Kolbe, Sean R Eddy, Alex Bateman

The Rfam database aims to catalogue non-coding RNAs through the use of sequence alignments and statistical profile models known as covariance models. In this contribution, we discuss the pros and cons of using the online encyclopedia, Wikipedia, as a source of community-derived annotation. We discuss the addition of groupings of related RNA families into clans and new developments to the website. Rfam is available on the Web at http://rfam.sanger.ac.uk.

Nucleic Acids Res. 2011:39(Database issue) | 245 Citations (from Europe PMC, 2019-12-14)

Rfam: updates to the RNA families database. [PMID: 18953034]
Paul P Gardner, Jennifer Daub, John G Tate, Eric P Nawrocki, Diana L Kolbe, Stinus Lindgreen, Adam C Wilkinson, Robert D Finn, Sam Griffiths-Jones, Sean R Eddy, Alex Bateman

Rfam is a collection of RNA sequence families, represented by multiple sequence alignments and covariance models (CMs). The primary aim of Rfam is to annotate new members of known RNA families on nucleotide sequences, particularly complete genomes, using sensitive BLAST filters in combination with CMs. A minority of families with a very broad taxonomic range (e.g. tRNA and rRNA) provide the majority of the sequence annotations, whilst the majority of Rfam families (e.g. snoRNAs and miRNAs) have a limited taxonomic range and provide a limited number of annotations. Recent improvements to the website, methodologies and data used by Rfam are discussed. Rfam is freely available on the Web at http://rfam.sanger.ac.uk/ and http://rfam.janelia.org/.

Nucleic Acids Res. 2009:37(Database issue) | 453 Citations (from Europe PMC, 2019-12-14)

Rfam: annotating non-coding RNAs in complete genomes. [PMID: 15608160]
Sam Griffiths-Jones, Simon Moxon, Mhairi Marshall, Ajay Khanna, Sean R Eddy, Alex Bateman

Rfam is a comprehensive collection of non-coding RNA (ncRNA) families, represented by multiple sequence alignments and profile stochastic context-free grammars. Rfam aims to facilitate the identification and classification of new members of known sequence families, and distributes annotation of ncRNAs in over 200 complete genome sequences. The data provide the first glimpses of conservation of multiple ncRNA families across a wide taxonomic range. A small number of large families are essential in all three kingdoms of life, with large numbers of smaller families specific to certain taxa. Recent improvements in the database are discussed, together with challenges for the future. Rfam is available on the Web at http://www.sanger.ac.uk/Software/Rfam/ and http://rfam.wustl.edu/.

Nucleic Acids Res. 2005:33(Database issue) | 654 Citations (from Europe PMC, 2019-12-14)

Rfam: an RNA family database. [PMID: 12520045]
Sam Griffiths-Jones, Alex Bateman, Mhairi Marshall, Ajay Khanna, Sean R Eddy

Rfam is a collection of multiple sequence alignments and covariance models representing non-coding RNA families. Rfam is available on the web in the UK at http://www.sanger.ac.uk/Software/Rfam/ and in the US at http://rfam.wustl.edu/. These websites allow the user to search a query sequence against a library of covariance models, and view multiple sequence alignments and family annotation. The database can also be downloaded in flatfile form and searched locally using the INFERNAL package (http://infernal.wustl.edu/). The first release of Rfam (1.0) contains 25 families, which annotate over 50 000 non-coding RNA genes in the taxonomic divisions of the EMBL nucleotide database.

Nucleic Acids Res. 2003:31(1) | 666 Citations (from Europe PMC, 2019-12-14)