Latest Announcements

Tuesday February 14, 2012

This February 2012 release represents a new, improved set of phased genotypes for our integrated phase 1 variant release. This release contains SNPs, short indels and deletions based on low coverage and exome sequencing data across 1092 individuals.

Please note that the sites list has been filtered to remove a small number of indels which were discovered to have a high false positive rate. There is more information about this in the README.

Our FAQ contains instructions on how to get smaller subsections of these files.

Data access links: EBI / NCBI

Link to additional information: README file



Thursday June 23, 2011

Genotypes for 1094 individuals for the May 2011 SNP calls from the 20101123 sequence and alignment release of the 1000 Genomes Project have now been made available. This release is based on the GRCh37 assembly of the human genome and is released in VCF 4.0 format.
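As a rough illustration of what a VCF genotype record looks like once downloaded, the Python sketch below splits one tab-delimited data line into its fixed fields and per-sample genotypes, then counts non-reference alleles. The function names and the example line are made up for illustration and are not taken from the release files; real files also carry header lines starting with "#", which this sketch does not handle.

```python
def parse_vcf_line(line):
    """Split one tab-delimited VCF data line into fixed fields and genotypes."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, var_id, ref, alt = fields[:5]
    # Column 9 (index 8) is FORMAT; it names the colon-separated sample subfields.
    format_keys = fields[8].split(":")
    gt_index = format_keys.index("GT")
    genotypes = [sample.split(":")[gt_index] for sample in fields[9:]]
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": var_id,
        "ref": ref,
        "alt": alt.split(","),
        "genotypes": genotypes,
    }


def alt_allele_count(genotypes):
    """Count alleles that are neither reference ('0') nor missing ('.')."""
    count = 0
    for genotype in genotypes:
        # '|' marks a phased genotype and '/' an unphased one.
        for allele in genotype.replace("|", "/").split("/"):
            if allele not in ("0", "."):
                count += 1
    return count


# Hypothetical data line with three samples (illustrative only).
example = "\t".join(
    ["20", "14370", "rs6054257", "G", "A", "29", "PASS", "NS=3",
     "GT:GQ", "0|0:48", "1|0:48", "1/1:43"]
)
record = parse_vcf_line(example)
print(record["chrom"], record["pos"], alt_allele_count(record["genotypes"]))
```

For whole files, an established VCF library or tabix is likely a better choice than hand-written parsing.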

Our FAQ contains instructions on how to get smaller subsections of these files.

Data access links: EBI / NCBI

Link to additional information: README file



Recent project announcements

Monday January 30, 2012

Additional sequence data from the 1000 Genomes full project are now available. The current sequence.index file can be found at:

20120130.sequence.index

Data access links: EBI / NCBI / Instructions for data download and Aspera

Sequence index and Statistics files

Sequence index file format



Friday January 27, 2012

The EBI 1000 Genomes ftp site is now also available via http.
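For scripts that already hold ftp links, a minimal Python sketch of switching them to http might look like the following. It assumes the http site uses the same host name and directory layout as the ftp site, which this announcement does not spell out; the helper name ftp_to_http is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def ftp_to_http(url):
    """Return the http form of an ftp URL, keeping host and path unchanged.

    Assumption: the http mirror shares the ftp site's host and paths.
    """
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("expected an ftp:// URL, got: " + url)
    return urlunsplit(("http", parts.netloc, parts.path, parts.query, parts.fragment))


ftp_url = "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/pilot2_high_cov_GRCh37_bams"
http_url = ftp_to_http(ftp_url)
print(http_url)
```

Check the converted link in a browser before relying on it in a pipeline.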



Wednesday January 11, 2012

Trio high coverage BAMs generated for the pilot2 project have been moved to:

ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/pilot2_high_cov_GRCh37_bams 

Exon targeted BAMs generated for the pilot3 project have been moved to:

ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/pilot3_exon_targetted_GRCh37_bams

Both sets of BAMs are mapped to GRCh37.



Project Overview

The 1000 Genomes Project is an international collaboration to produce an extensive public catalog of human genetic variation, including SNPs and structural variants, and their haplotype contexts. This resource will support genome-wide association studies and other medical research studies.

The genomes of about 2500 unidentified people from about 25 populations around the world will be sequenced using next-generation sequencing technologies. The results of the study will be freely and publicly accessible to researchers worldwide.

Further information about the project is available in the About tab. Information about downloading, browsing or using the 1000 Genomes data is available in the Data tab.