Latest Announcements

Monday September 15, 2014

Our final release of the Phase 3 variant set is now available on the FTP site.

This update is version 5 of our release. The issues that have been resolved since our initial release are described in the Known Issues README.

This release includes super population allele frequencies in the main release VCFs and functional annotation from the Ensembl Variant Effect Predictor, alongside many other datasets in the supporting directory. The complete list of data is given in the Supporting Directory README.
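
As a rough illustration of how the super population allele frequencies might be read from a release VCF, the sketch below parses the INFO column of a single record. The INFO keys used here (EAS_AF, AMR_AF, AFR_AF, EUR_AF, SAS_AF) and the sample record are assumptions made for illustration; the VCF header lists the keys that are actually present.

```python
# Assumed INFO keys for the five super populations; confirm against the VCF header.
SUPER_POP_KEYS = ("EAS_AF", "AMR_AF", "AFR_AF", "EUR_AF", "SAS_AF")


def super_population_frequencies(vcf_line):
    """Return {key: [frequency per ALT allele]} for one VCF data line."""
    columns = vcf_line.rstrip("\n").split("\t")
    info_column = columns[7]  # INFO is the eighth fixed VCF column
    frequencies = {}
    for entry in info_column.split(";"):
        key, _, value = entry.partition("=")
        if key in SUPER_POP_KEYS and value:
            # Multi-allelic sites carry one comma-separated value per ALT allele
            frequencies[key] = [float(v) for v in value.split(",")]
    return frequencies


# Made-up record for illustration only, not taken from the release
example = "\t".join(["22", "100", "rs0", "A", "G", "100", "PASS",
                     "AC=2;AF=0.25;EAS_AF=0.5;EUR_AF=0.125;VT=SNP"])
print(super_population_frequencies(example))
```

For whole files, an established VCF parser is likely a better choice than splitting lines by hand; the function above only shows where the values sit in each record.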

Please send any questions about this data set to info@1000genomes.org



Recent project announcements

Saturday October 18, 2014

We have now added a set of Chromosome X variants as part of our final release.

The genotypes and sites are available in our main release directory.

We will update the file during November to add functional annotation, super population allele frequencies, and per-site sequence depth information.



Saturday October 18, 2014

We have added two sets of STR predictions and genotypes to the 1000 Genomes dataset.

These are available in the supporting directory strs.

The call sets were created using LobSTR and RepeatSeq, respectively.

The sites are genotyped in all 2535 individuals included in our final release. This includes the 31 individuals who are related to other individuals in the main call set.



Wednesday September 24, 2014

The 1000 Genomes Project is holding a tutorial during ASHG 2014.

The 1000 Genomes Project has released the variants, genotypes, and integrated haplotypes for the complete set of 2504 samples from 26 populations. This tutorial describes the data sets, how to access them, and how to use them.

The tutorial will be held on Sunday, October 19, from 8:00 to 9:30 pm in the Convention Centre, Room 24ABC, on the Upper Level.

The program is listed on our tutorial web page.

No registration is needed.

Please send any questions about the tutorial to info@1000genomes.org



Project Overview

The 1000 Genomes Project is an international collaboration to produce an extensive public catalog of human genetic variation, including SNPs and structural variants, and their haplotype contexts. This resource will support genome-wide association studies and other medical research studies.

The genomes of about 2500 unidentified people from about 25 populations around the world will be sequenced using next-generation sequencing technologies. The results of the study will be freely and publicly accessible to researchers worldwide.

Further information about the project is available in the About tab. Information about downloading, browsing or using the 1000 Genomes data is available in the Data tab.