Ag3.7: Anopheles gambiae data resource

Released on 15 May 2015.

Parasite

The MalariaGEN Vector Observatory Anopheles gambiae data resource version 3.7 (Ag3.7) contains single nucleotide polymorphism (SNP) calls, copy number variant (CNV) calls and SNP haplotypes from whole-genome sequencing of mosquitoes collected in Benin (451 samples), Burkina Faso (24 samples), Cameroon (744 samples), Gabon (4 samples), Ghana (528 samples), Guinea (1 sample), and Tanzania (794 samples) from 2007 to 2021. Ag3.7 contains 2546 whole-genome sequences from An. coluzzii, An. arabiensis, An. fontenillei and An. gambiae.
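As a quick consistency check, the per-country sample counts quoted above can be tallied against the stated total. This is only an illustrative sketch; the dictionary name `samples_by_country` is an assumption made for the example and is not part of the release.

```python
# Per-country sample counts as quoted in the release description above.
samples_by_country = {
    "Benin": 451,
    "Burkina Faso": 24,
    "Cameroon": 744,
    "Gabon": 4,
    "Ghana": 528,
    "Guinea": 1,
    "Tanzania": 794,
}

# Sum the counts and compare with the total stated for Ag3.7.
total = sum(samples_by_country.values())
print(total)  # 2546, matching the stated number of whole-genome sequences
```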

Data sets

Downloads

Downloads include sequence alignments (BAM files) and variant calls (VCF files) from both alignment- and assembly-based calling methods.

All of the data files included in this release can be downloaded from the Wellcome Trust Sanger Institute public FTP site.

NOTE: Many browsers no longer support links to FTP sites. If you have difficulty opening the link, you may need to change your browser settings.

Release notes

Organisation of VCF files

There are two VCF files available for each chromosome arm. The first contains all SNPs discovered (e.g., ag1000g.phase1.AR2.2L.vcf.gz); the second contains only those SNPs that passed all quality filters (e.g., ag1000g.phase1.AR2.2L.PASS.vcf.gz). For most analyses we recommend working only with PASS variants, so the PASS.vcf.gz files will be more convenient to use.
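The file layout described above can be sketched in Python. The naming pattern is inferred from the two example file names only, and the helper names (`vcf_names`, `pass_records`, `count_pass`) are hypothetical, not part of any MalariaGEN tooling. The filter relies on the standard VCF layout, in which FILTER is the seventh tab-separated column.

```python
import gzip
from typing import Iterable, Iterator, Tuple

# Prefix taken from the example file names in this release note.
PREFIX = "ag1000g.phase1.AR2"


def vcf_names(arm: str) -> Tuple[str, str]:
    """Return (all-SNPs file, PASS-only file) for a chromosome arm such as '2L'."""
    return f"{PREFIX}.{arm}.vcf.gz", f"{PREFIX}.{arm}.PASS.vcf.gz"


def pass_records(lines: Iterable[str]) -> Iterator[str]:
    """Yield VCF data lines whose FILTER column (7th field) is exactly 'PASS'."""
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 6 and fields[6] == "PASS":
            yield line


def count_pass(path: str) -> int:
    """Count PASS records in a gzip-compressed VCF stored on local disk."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for _ in pass_records(handle))
```

Applying `pass_records` to the all-SNPs file should give the same records as the corresponding PASS file, assuming the PASS file was produced by the same FILTER criterion. In practice, downloading the PASS.vcf.gz file directly avoids this extra pass over the data.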

Known issues

Apply for access

Archived

Our approach to sharing data

Citations

To cite these data directly, please use the following citation format:

The Anopheles gambiae 1000 Genomes Consortium (2014): Ag1000G phase 1 AR2 data release. MalariaGEN. http://www.malariagen.net/data/ag1000g-phase1-AR2

Contacts