Download variation files or useful tools in GVM

Variation Data

All genomic variation data are publicly available. Variation data files in VCF and FASTA formats are listed in the table below.

Note:
   Brief VCF is a VCF file without individual genotypes;
   Detailed VCF is a VCF file with individual genotypes.
 
Organism (version) | SNP (brief VCF) | SNP (detailed VCF) | SNP (FASTA) | Short INDEL (brief VCF) | Short INDEL (detailed VCF) | Short INDEL (FASTA)

About the data

VCF (Variant Call Format) is a tab-delimited text format in which each data line describes one variant at a position in the genome. The eight fixed columns are:
1. #CHROM is the chromosome identifier
2. POS is the position on the chromosome (1-based)
3. ID is the variation identifier in the GVM system
4. REF is the reference allele
5. ALT is the alternate allele (or alleles)
6. QUAL is the variant quality score
7. FILTER is the filter status
8. INFO is additional information for each variant
More detailed information on the VCF format can be found at http://samtools.github.io/hts-specs/VCFv4.1.pdf
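
The eight fixed columns above can be read with a few lines of Python. This is a minimal sketch, not a full VCF parser: the example line, its INFO keys (DP, VALIDATED) and the names VcfRecord and parse_vcf_line are hypothetical and are not taken from any GVM file.

```python
from typing import NamedTuple


class VcfRecord(NamedTuple):
    """The eight fixed VCF columns (field names are illustrative)."""
    chrom: str
    pos: int
    variant_id: str
    ref: str
    alt: list
    qual: str
    filter_status: str
    info: dict


def parse_vcf_line(line):
    """Split one tab-delimited VCF data line into its eight fixed columns."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, variant_id, ref, alt, qual, filter_status, info = fields[:8]

    # INFO is a semicolon-separated list of KEY=VALUE pairs or bare flags.
    info_dict = {}
    if info != ".":
        for item in info.split(";"):
            key, sep, value = item.partition("=")
            info_dict[key] = value if sep else True

    # ALT may hold several comma-separated alleles.
    return VcfRecord(chrom, int(pos), variant_id, ref, alt.split(","),
                     qual, filter_status, info_dict)


# Hypothetical data line, for illustration only.
example = "chr01\t1203\tOSA01S123\tA\tG\t.\tPASS\tDP=14;VALIDATED"
record = parse_vcf_line(example)
print(record.chrom, record.pos, record.ref, record.alt, record.info)
```

Header lines (those starting with "#") should be skipped before calling parse_vcf_line. In a detailed VCF, the genotype columns would follow the eight fields and are ignored by this sketch.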

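The FASTA files provide flanking sequences for each variant (50 nt on each side), which is typically useful for BLAST applications. In the example below, the line between the two flanks holds the variant itself; R is the IUPAC code for A or G.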
>OSA01S123 class=1|alleles="A/G"|version=1
AGGTCCAGGCTGCCAAGCTTGAACTCCGTCTCCCAGACGACGACGGCCGC
R
GGAGGAAGGCGGACCATGTCGCCGGTGAGGTTGTTGCAGACAGACACGCA
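
A record like the one above can be split into its parts and expanded into one full sequence per allele, for example to build BLAST queries. This is a minimal sketch that assumes every record has exactly four lines (header, left flank, allele line, right flank) and that header attributes are separated by "|", as in the example. Neither the layout of other records nor the function names come from the GVM documentation.

```python
def parse_flank_record(text):
    """Parse one four-line record: header, left flank, allele line, right flank."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    header, left, code, right = lines

    # Header layout assumed from the example: ">ID key=value|key="value"|..."
    name, _, attr_text = header[1:].partition(" ")
    attrs = {}
    for item in attr_text.split("|"):
        key, _, value = item.partition("=")
        attrs[key] = value.strip('"')

    return {"id": name, "attrs": attrs, "left": left, "code": code, "right": right}


def allele_sequences(rec):
    """Build one full sequence per allele listed in the header (e.g. "A/G")."""
    alleles = rec["attrs"]["alleles"].split("/")
    return {a: rec["left"] + a + rec["right"] for a in alleles}


example = '''>OSA01S123 class=1|alleles="A/G"|version=1
AGGTCCAGGCTGCCAAGCTTGAACTCCGTCTCCCAGACGACGACGGCCGC
R
GGAGGAAGGCGGACCATGTCGCCGGTGAGGTTGTTGCAGACAGACACGCA
'''

rec = parse_flank_record(example)
seqs = allele_sequences(rec)
for allele, seq in seqs.items():
    print(rec["id"], allele, len(seq))
```

Each resulting sequence is the left flank, one allele, and the right flank joined together, so it can be written back out as an ordinary FASTA entry.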

Useful tools

1. Variant calling tools:
2. Genome alignment tools:
3. Variant annotation tools: