Standards
1.3 GSA

The Genome Sequence Archive (GSA) is a data repository specialized for archiving raw sequence reads.

1.3.1 ID System

A GSA object consists of a series of Experiments and Runs.

A GSA accession number consists of the prefix 'CRA' followed by 6 digits, for example CRA000001.

An Experiment accession number consists of the prefix 'CRX' followed by 6 digits, for example CRX000001.

A Run accession number consists of the prefix 'CRR' followed by 6 digits, for example CRR000001.
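
The accession formats above can be checked with a short pattern match. The following is a minimal sketch in Python, assuming exactly the three prefixes and the 6-digit width stated above; the function name classify_accession is hypothetical and is not part of any GSA tool.

```python
import re

# Prefix -> object type, as listed in the ID system above.
ACCESSION_TYPES = {"CRA": "GSA", "CRX": "Experiment", "CRR": "Run"}

# A three-letter prefix followed by exactly six ASCII digits
# (assumption: the width is fixed at 6, as the text states).
_ACCESSION_RE = re.compile(r"(CRA|CRX|CRR)([0-9]{6})")


def classify_accession(accession):
    """Return the object type for a well-formed accession, else None."""
    match = _ACCESSION_RE.fullmatch(accession)
    if match is None:
        return None
    return ACCESSION_TYPES[match.group(1)]


if __name__ == "__main__":
    for value in ("CRA000001", "CRX000001", "CRR000001", "CRR0001"):
        print(value, classify_accession(value))
```

Using fullmatch rather than search rejects values with leading or trailing characters, such as an accession pasted with surrounding text.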

1.3.2 Experiment
1.3.2.1 Attributes
Attributes (* = mandatory attribute)
Name Description Tips Value Format
*ID Experiment ID, consisting of the prefix 'E' followed by a natural number, such as E1, E2, E3. Each Experiment ID must be unique.
*Experiment title Short description that will identify the Experiment on public pages. It can have any format, but we suggest that you make it concise, unique, consistent, and as informative as possible. Every Experiment title from the same Sample must be unique. {text}
*BioProject accession BioProject accession. Typically of the form PRJCA[number], not SUBPRJCA[number], for example PRJCA000005.
*BioSample name Sample Name is a name that you choose for the sample. It can have any format, but we suggest that you make it concise, unique and consistent within your lab, and as informative as possible. Every Sample Name from a single Submitter must be unique. {text}
*Platform This column has drop-down menus that allow you to select from a controlled vocabulary. Once specified for one row, these values can be copied-and-pasted down. See the Platform form for details.
*Library Construction / Experimental Design Free-form description of the methods used to create the sequencing library; a brief 'materials and methods' section. e.g., DNA of sorted NCSCs was extracted from the cell line using a QIAamp DNA Mini Kit and sheared to approximately 300-500 bp using a Covaris S220 instrument. The libraries were then constructed through end-repair, A-tailing, and adapter ligation, and bisulfite-converted using a Zymo EZ DNA Methylation Kit. {text}
Library name Name of Library.
*Strategy This column has drop-down menus that allow you to select from a controlled vocabulary. Once specified for one row, these values can be copied-and-pasted down. See the Strategy form for details.
*Source This column has drop-down menus that allow you to select from a controlled vocabulary. Once specified for one row, these values can be copied-and-pasted down. See the Source form for details.
*Selection This column has drop-down menus that allow you to select from a controlled vocabulary. Once specified for one row, these values can be copied-and-pasted down. See the Selection form for details.
*Layout This column has drop-down menus that allow you to select from a controlled vocabulary. Once specified for one row, these values can be copied-and-pasted down.
*Read length for mate 1 (bp) Planned read length of Mate 1 for your submission. When the Platform is a PacBio Sequel or Ion Torrent series sequencer, this column may be left empty.
Read length for mate 2 (bp) Planned read length of Mate 2 for your submission. Required for paired-end data only.
Insert size (bp) Fragment size for paired reads. Please provide a numerical value for the median insert size.
Nominal size (bp) Nominal size
Nominal standard deviation (bp) Standard deviation of insert size
Planned number of cycles Planned number of cycles for your submission. When the Platform is Helicos HeliScope, the Planned number of cycles is required.
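
The rules in the table above can be expressed as a small pre-submission check. The following is a minimal sketch in Python, not the validation GSA itself performs. The dictionary keys (id, title, bioproject, and so on), the layout value "PAIRED", the substring matching on platform names, and the example platform, strategy, source, and selection values are all assumptions made for illustration; the controlled vocabularies are defined in the Platform, Strategy, Source, and Selection forms. The sketch also assumes that a natural number starts at 1, following the examples E1, E2, E3.

```python
import re

# 'E' followed by a natural number (assumed to start at 1).
EXPERIMENT_ID_RE = re.compile(r"E[1-9][0-9]*")
# 'PRJCA' followed by digits; fullmatch also rejects 'SUBPRJCA...'.
BIOPROJECT_RE = re.compile(r"PRJCA[0-9]+")

# Attributes marked with * that have no stated exception.
# Key names are hypothetical.
ALWAYS_REQUIRED = [
    "id", "title", "bioproject", "biosample_name", "platform",
    "library_construction", "strategy", "source", "selection", "layout",
]


def validate_experiment(row):
    """Return a list of problems found in one Experiment row (a dict)."""
    def filled(key):
        value = row.get(key)
        return value is not None and str(value).strip() != ""

    problems = [f"missing mandatory attribute: {key}"
                for key in ALWAYS_REQUIRED if not filled(key)]

    if filled("id") and not EXPERIMENT_ID_RE.fullmatch(str(row["id"]).strip()):
        problems.append("id must be 'E' followed by a natural number")
    if filled("bioproject") and not BIOPROJECT_RE.fullmatch(
            str(row["bioproject"]).strip()):
        problems.append("bioproject must look like PRJCA000005")

    platform = str(row.get("platform") or "").lower()
    layout = str(row.get("layout") or "").lower()

    # Mate 1 length may be empty for PacBio Sequel and Ion Torrent platforms.
    mate1_optional = "pacbio sequel" in platform or "ion torrent" in platform
    if not mate1_optional and not filled("read_length_mate1"):
        problems.append("read_length_mate1 is required for this platform")
    # Mate 2 length is required for paired-end data only.
    if layout == "paired" and not filled("read_length_mate2"):
        problems.append("read_length_mate2 is required for paired layout")
    # Planned number of cycles is required for Helicos HeliScope.
    if "helicos heliscope" in platform and not filled("planned_cycles"):
        problems.append("planned_cycles is required for Helicos HeliScope")
    return problems


def duplicate_ids(rows):
    """Return the set of Experiment IDs that occur more than once."""
    seen, dupes = set(), set()
    for row in rows:
        exp_id = row.get("id")
        if exp_id in seen:
            dupes.add(exp_id)
        seen.add(exp_id)
    return dupes


if __name__ == "__main__":
    example = {
        "id": "E1", "title": "WGBS of NCSCs", "bioproject": "PRJCA000005",
        "biosample_name": "sample-1", "platform": "Illumina HiSeq 2000",
        "library_construction": "See methods.", "strategy": "Bisulfite-Seq",
        "source": "GENOMIC", "selection": "RANDOM", "layout": "PAIRED",
        "read_length_mate1": 150, "read_length_mate2": 150,
    }
    print(validate_experiment(example))
```

Returning a list of problems rather than raising on the first one lets a submitter correct every issue in a row in a single pass. Uniqueness of the Experiment ID is checked separately by duplicate_ids because it depends on the whole sheet, not on a single row.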