Latest Announcements

Wednesday October 31, 2012

The Phase 1 publication, "An integrated map of genetic variation from 1,092 human genomes", is now available from Nature and can be downloaded directly from the ftp site. The paper is distributed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported licence. Please share our paper appropriately.

All the data files associated with this paper can be found in our phase1 analysis results directory.



Recent project announcements

Saturday May 25, 2013

The official release of phase3 low coverage and exome data is complete and available on the ftp site. The alignment data were generated by the Sanger Center. All BAMs have gone through the DCC QA process; samples and runs identified as problematic have been withdrawn. The 20130502.analysis.sequence.index has been updated to reflect the withdrawals.

Here are the main alignment index files: 
 
or 
 
There are 2535 samples in the index files; all of them passed QA and have both exome and low coverage data.
 
The alignment_indices directory, ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/alignment_indices/, contains the associated stats files, summary bas files, and an exome HsMetrics file:
 
20130502_20120522.alignment_stats.low_coverage.csv
20130502_20120522.alignment_stats.exome.csv
20130502.low_coverage.alignment.index.bas.gz
20130502.exome.alignment.index.bas.gz  
20130502.exome.alignment.index.HsMetrics.gz
20130502.exome.alignment.index.HsMetrics.gz.stats  
 
A handful of samples passed all QA but have only low coverage data (23) or only exome data (16); we keep the BAM files for these samples at
 
 
Two alignment index files can be found in the same directory:
 
20130502.exome_only.alignment.index
20130502.lc_only.alignment.index
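
The split described above (samples with both data types, exome only, or low coverage only) amounts to a set partition. Here is a minimal sketch in Python, assuming the sample names have already been read from each alignment index into a collection. The function name and the sample names are hypothetical and are not part of the project's tooling.

```python
def partition_samples(exome_samples, low_coverage_samples):
    """Split sample names into (both, exome_only, lc_only) sets."""
    exome = set(exome_samples)
    low_cov = set(low_coverage_samples)
    both = exome & low_cov        # samples with both data types
    exome_only = exome - low_cov  # samples with exome data only
    lc_only = low_cov - exome     # samples with low coverage data only
    return both, exome_only, lc_only


# Made-up sample names, for illustration only.
both, exome_only, lc_only = partition_samples(
    ["S1", "S2", "S3"],
    ["S2", "S3", "S4"],
)
print(sorted(both), sorted(exome_only), sorted(lc_only))
```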


Monday April 22, 2013

The final sequence index file has been released on the FTP site:
http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/sequence_indices/20130422.sequence.index


The corresponding analysis.sequence.index, which contains only Illumina reads longer than 70bp, is:
http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/sequence_indices/20130422.analysis.sequence.index
 
Various stats files for this release are available in http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/sequence_indices. We have achieved our goal of 2500 samples for both low coverage and exome projects! The overlap between >5Gb exome samples and >10Gb low coverage samples is also greater than 2500 (2564).
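
An overlap figure of this kind can be reproduced once per-sample base totals are available. The sketch below is illustrative only: it assumes that "Gb" means 10^9 sequenced bases and that the totals have already been summed per sample from the index files. The function and argument names are hypothetical.

```python
GIGABASE = 10**9  # assumption: "Gb" in the announcement means 10^9 bases


def count_overlap(exome_bases, low_cov_bases, exome_min_gb=5, low_cov_min_gb=10):
    """Count samples that exceed both sequencing thresholds.

    exome_bases and low_cov_bases map sample name -> total sequenced bases.
    """
    exome_ok = {s for s, b in exome_bases.items() if b > exome_min_gb * GIGABASE}
    low_cov_ok = {s for s, b in low_cov_bases.items() if b > low_cov_min_gb * GIGABASE}
    # A sample counts only if it passes both thresholds.
    return len(exome_ok & low_cov_ok)
```

Called with two dictionaries of per-sample totals, it returns the number of samples that pass both the exome and the low coverage threshold.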


Monday April 15, 2013

Another sequence index file has been released on the FTP site:

http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/sequence_indices/20130415.sequence.index

The corresponding analysis.sequence.index, which contains only Illumina reads longer than 70bp, is:

http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/sequence_indices/20130415.analysis.sequence.index
 
Various stats files for this release are available in http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/sequence_indices. We have achieved our goal of 2500 samples for both low coverage and exome projects! The overlap between >5Gb exome samples and >10Gb low coverage samples is 2466.


Project Overview

The 1000 Genomes Project is an international collaboration to produce an extensive public catalog of human genetic variation, including SNPs and structural variants, and their haplotype contexts. This resource will support genome-wide association studies and other medical research studies.

The genomes of about 2500 unidentified people from about 25 populations around the world will be sequenced using next-generation sequencing technologies. The results of the study will be freely and publicly accessible to researchers worldwide.

Further information about the project is available in the About tab. Information about downloading, browsing or using the 1000 Genomes data is available in the Data tab.