Latest Announcements

Wednesday September 30, 2015

The Phase 3 publication, "A global reference for human genetic variation", and the Phase 3 structural variation publication, "An integrated map of structural variation in 2,504 human genomes", are now available from Nature alongside a celebration of 25 years of the Human Genome Project.

The variants from the Phase 3 analysis are available in ftp/release/20130502/ and extended information about the SV dataset can be found in ftp/phase3/integrated_sv_map/.

Both these papers are open access and should be free for everyone to read and download.

If you have any questions about the data these papers are based on, or how to access it, please email info@1000genomes.org.

http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/
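The directory paths in this announcement are written relative to the site, while the link above is a full URL. As a minimal sketch, assuming every `ftp/...` path sits under the same `vol1/` prefix as the release URL above (an inference from that one URL, not something the announcement states), the full address of a directory could be built like this. The names `SITE_ROOT` and `ftp_url` are hypothetical and used only for illustration.

```python
from urllib.parse import urljoin

# Assumed site root, inferred from the release URL quoted above.
SITE_ROOT = "http://ftp.1000genomes.ebi.ac.uk/vol1/"


def ftp_url(site_path: str) -> str:
    """Turn a path written as 'ftp/...' in an announcement into a full URL."""
    # Strip any leading slash so urljoin appends to SITE_ROOT instead of
    # replacing its path component.
    return urljoin(SITE_ROOT, site_path.lstrip("/"))


if __name__ == "__main__":
    # The two directories named in the announcement above.
    for path in ("ftp/release/20130502/", "ftp/phase3/integrated_sv_map/"):
        print(ftp_url(path))
```

The first path reproduces the release URL given above; the second is only as reliable as the assumption that the structural variation directory follows the same layout.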



Recent project announcements

Wednesday December 16, 2015

We have realigned exome data from 2692 samples and high coverage PCR-free data from 24 samples, generated for the 1000 Genomes Project, to the GRCh38 human assembly.

The alignment is against the full assembly, including the GRC-maintained alternate loci sequences, plus decoy sequences and additional HLA sequences from the IMGT.

Our fasta file can be found in the reference directory.

The alignment was carried out using a new alt-aware version of BWA-mem. The alignment files themselves can be found in the data_collections/1000_genomes_project/data directory. The exome alignment index, high coverage alignment index and sequence.index can be found in the data_collections/1000_genomes_project directory.

Please note that these files are now being distributed in CRAM format, rather than BAM format. You can find more details about CRAM in this README. Full details of our alignment pipeline can be found in the alignment pipeline README.

If you have any questions, please email info@1000genomes.org.

 



Friday November 27, 2015

EMBL-EBI has recently rearranged its Globus hosted endpoints.

This means the Globus endpoint for 1000G data is changing from ebi#1000g to ebi#public ('1000g' subfolder). The old endpoint will be discontinued shortly. The new endpoint is configured to achieve increased reliability and performance. If you have any questions about the change please contact info@1000genomes.org.



Friday October 16, 2015

We have aligned the Illumina Platinum pedigree sequence data to GRCh38. The data was aligned to the full assembly, including the GRC-maintained alternate loci, along with decoy sequences and additional HLA sequences from the IMGT. A copy of the FASTA file can be found in our reference directory. The alignment was carried out using a new alt-aware version of BWA-mem.

The alignment files themselves can be found in the data_collections/illumina_platinum_pedigree/data directory.

The alignment index and sequence index can be found in the data_collections/illumina_platinum_pedigree directory.

Please note that alignment files are now being distributed in CRAM format, rather than BAM format. You can find more details about CRAM in this README.

Further details of our alignment pipeline can be found in the data collection README.

If you have any questions please email info@1000genomes.org.



Project Overview

The 1000 Genomes Project is an international collaboration to produce an extensive public catalog of human genetic variation, including SNPs and structural variants, and their haplotype contexts. This resource will support genome-wide association studies and other medical research studies.

The genomes of about 2500 unidentified people from about 25 populations around the world will be sequenced using next-generation sequencing technologies. The results of the study will be freely and publicly accessible to researchers worldwide.

Further information about the project is available in the About tab. Information about downloading, browsing or using the 1000 Genomes data is available in the Data tab.