An open dataset of Plasmodium falciparum - v.7.0

Released on 8 Dec 2022.

This page provides information about the Pf7 dataset, which contains genome variation data on over 20,000 worldwide samples of Plasmodium falciparum.

This release contains details on contributing partner studies, sample metadata and key sample attributes inferred from genomic data, and genomic data including raw sequence reads. A description of the dataset can be found here.

These data are available open access. Publications using these data should acknowledge and cite the source of the data using the following format: “This publication uses MalariaGEN data as described in ‘Pf7: an open dataset of Plasmodium falciparum genome variation in 20,000 worldwide samples’. MalariaGEN et al, Wellcome Open Research 2023, 8:22. https://doi.org/10.12688/wellcomeopenres.18681.1”

Data sets

Downloads

Downloads include sequence alignments (BAM files) and variant calls (VCF files) from both alignment- and assembly-based calling methods.

All of the data files included in this release can be downloaded from the Wellcome Trust Sanger Institute public FTP site.

NOTE: Many browsers now do not support links to FTP sites. If you are experiencing difficulties, you may need to change your browser settings.

Release notes

The README file describes in detail every file included in the release and the format and interpretation of each column. It also contains tips for accessing genotype data in VCF and Zarr files.
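The README's own tips are not reproduced on this page. As a rough illustration of what reading genotypes from a VCF file involves, the sketch below parses the GT field for each sample using only the Python standard library. It assumes the standard tab-separated VCF layout (nine fixed columns followed by one column per sample); the sample names and records are made up for illustration and are not taken from Pf7.

```python
# Minimal sketch: pull per-sample genotype calls (GT) out of VCF text.
# The sample names and records below are hypothetical, not Pf7 data.
import io


def read_genotypes(handle):
    """Yield (chrom, pos, {sample: GT string}) for each record in a VCF stream."""
    samples = []
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns follow the FORMAT column
            continue
        chrom, pos = fields[0], int(fields[1])
        format_keys = fields[8].split(":")
        gt_index = format_keys.index("GT")  # position of GT within each sample entry
        calls = {
            sample: value.split(":")[gt_index]
            for sample, value in zip(samples, fields[9:])
        }
        yield chrom, pos, calls


HEADER = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT"]
EXAMPLE_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    "\t".join(HEADER + ["sampleA", "sampleB"]),
    "\t".join(["chr1", "100", ".", "A", "T", ".", "PASS", ".", "GT:AD", "0/0:12,0", "0/1:7,5"]),
    "\t".join(["chr1", "250", ".", "G", "C", ".", "PASS", ".", "GT:AD", "1/1:0,9", "./.:0,0"]),
]) + "\n"

records = list(read_genotypes(io.StringIO(EXAMPLE_VCF)))
for chrom, pos, calls in records:
    print(chrom, pos, calls)
```

For a compressed file on disk, the same function can be given a handle from `gzip.open(path, "rt")`. For files the size of a full release, a dedicated VCF or Zarr library is likely a better fit than line-by-line parsing.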

NOTE: You may need to download a free FTP client to access the FTP links.

Open access

Our approach to sharing data

Citations

Publications using these data should acknowledge and cite the source of the data using the following format:

“This publication uses data from the MalariaGEN Plasmodium falciparum Community Project as described in ‘Pf7: an open dataset of Plasmodium falciparum genome variation in 20,000 worldwide samples’. MalariaGEN et al, Wellcome Open Research 2023, 8:22. DOI: https://doi.org/10.12688/wellcomeopenres.18681.1”

Contacts