P. falciparum open data resources
The P. falciparum Community Project open data resource is updated as the number of samples increases, new locations are incorporated, and sequencing and analytical methods improve. Descriptions of data releases can be found on the following data release resource pages.
Resource 34 – An open dataset of P. falciparum genome variation in 16,000 worldwide samples: This data set is upgraded to v7.0 of the variant discovery and genotype calling pipeline.
Resource 26 – An open dataset of P. falciparum genome variation in 7,000 worldwide samples: This data set is upgraded to v6.0 of the variant discovery and genotype calling pipeline.
Resource 22 – Population genetic structure and adaptation of malaria parasites on the edge of endemic distribution: Genotype calls on 428,402 SNPs in the sequences of 86 samples from Mauritania, generated using v5.0 of the analysis pipeline.
Resource 21 – Whole genome sequencing of P. falciparum from dried blood spots using selective whole genome amplification: The data used in these analyses was generated as a result of epidemiological studies.
Resource 20 – Culture adaptation of malaria parasites selects for convergent loss-of-function mutants: Genotype calls on 944,270 SNPs in sequences from different time points during culture adaptation of six Gambian isolates. These genotypes were generated using v4.0 of the analysis pipeline.
Resource 16 – Genomic epidemiology of artemisinin resistant malaria: This catalogue has more than 900,000 SNPs and allele frequency data based on an analysis of 3,488 samples from 43 locations in 23 countries.
Resource 15 – Genetic architecture of artemisinin-resistant P. falciparum: Data release v3.0 contains sample information, accession numbers, and baseline genotypes for 2,512 samples.
Resource 12 – Multiple populations of artemisinin-resistant P. falciparum in Cambodia: An analysis using the data from Resource 10, which has been iteratively updated.
Resource 10 – Analysis of P. falciparum diversity in natural infections by deep sequencing: Data release using v1.0 of the P. falciparum pipeline. This v1.0 catalogue of genetic variations (SNPs) and allele frequencies has 86,000 SNPs.
The GenRe-Mekong Project open data resource uses SpotMalaria’s version 6 pipeline for variant discovery and genotype calling.
Resource 29 – Genetic surveillance in the Greater Mekong Subregion and South Asia to support malaria control and elimination: This is the first data release of P. falciparum from the region using the SpotMalaria genetic surveillance framework for extracting and analysing genetic data from malaria-infected patients. This release contains sample information, accession numbers and genotype calls for samples.
P. vivax open data resources
The P. vivax Genome Variation project open data resource is regularly updated as the number of samples increases, new locations are incorporated, and sequencing and analytical methods improve. Descriptions of data releases can be found on the following data release resource pages.
Resource 30 – An open dataset of P. vivax genome variation in 1,895 worldwide samples: This data release includes variant discovery using the upgraded v4.0 pipeline. The full text is available at https://wellcomeopenresearch.org/articles/7-136/v1.
Resource 24 – Genomic analysis of a pre-elimination Malaysian P. vivax population reveals selective pressures and changing transmission dynamics: Genotype calls on 527,107 SNPs in 259 samples from Malaysia, Thailand and Indonesia, using v3.0 of the analysis pipeline.
Resource 19 – Independent origin and global distribution of distinct Plasmodium vivax Duffy Binding Protein gene duplications: The data contains P. vivax Duffy Binding Protein gene (PvDBP) haplotypes for 37 Cambodian clinical isolates as well as the P. vivax Sal 1 reference genome (Pv_Sal1_DBP_1) and P. vivax M15 (Pv_M15_DBP_2). Terminal numbers (i.e. “_1” and “_2”) indicate whether the isolate contains one or two copies of PvDBP.
Resource 17 – P. vivax Genome Variation May 2016 data release: This data release contains sample information, accession numbers, and genotype calls for samples used in the analyses described in Pearson et al., 2016. The full text article is accessible.
A host-parasite data resource is available here.