FAQ
This is the 1000 Genomes FAQ. The list of questions is not exhaustive; if you cannot find the answer to your question here, please email info@1000genomes.org.
- Do I need a password to access 1000 Genomes data?
- Are my SNPs found by the 1000 Genomes Project?
- Are the 1000 Genomes variants in dbSNP?
- Are the 1000 Genomes variant calls phased?
- Are the pilot project SNPs in dbSNP?
- Can I volunteer to be part of the 1000 Genomes Project?
- Do I need permission to use the 1000 Genomes data in my own scientific research?
- How do I find out about new 1000 Genomes releases?
- Can I get cell lines for 1000 Genomes samples?
- Are there any assemblies available for the 1000 Genomes samples?
- How do I contact you?
- How do I get a sub-section of a BAM file?
- How do I get a sub-section of a VCF file?
- Is the 1000 Genomes data available in genome browsers?
- Are there any FASTA files containing 1000 Genomes variants or haplotypes?
- Are all the genotype calls in the current release VCF files bi-allelic?
- Are there any scripts or APIs for use with the 1000 Genomes data sets?
- Can I access the databases associated with the browser?
- Can I convert VCF files to PLINK/PED format?
- Can I find the genomic position for a list of dbSNP rs numbers?
- Can I search the FTP site?
- Can I search the website?
- How do I cite the 1000 Genomes Project?
- How do I download files using Aspera?
- Is there any gene expression data available for the 1000 Genomes Project samples?
- There is a corrupt file on your FTP site
- What Axiom genotype data do you have?
- What High Density Genotyping information do you have?
- What Omni genotype data do you have?
- What are the kgp identifiers?
- What are your filename conventions?
- What capture technology does the exome sequencing use?
- What do the pilot project, phase 1, phase 2 and phase 3 mean?
- What do your population codes like CEU or TSI mean?
- What does Genotype Dosage mean in the phase 1 integrated call set?
- What is a panel file?
- What is a sequence index file?
- What is the Data Slicer?
- What are the completion criteria for samples in the project?
- What is the depth of coverage of your phase 1 variants?
- What strand are the variants in your VCF files on?
- Where are the alignments for the high coverage trios?
- Where is the most recent release?
- Which populations are part of your study?
- Which reference assembly do you use?
- Why do some variants in the phase 1 release have a zero allele frequency?
- Why does a tabix fetch fail?
- What is the difference between the analysis groups exome and exon targetted in the sequence index?
- Are there any statistics about how much sequence data has been generated by the project?
- Are there torrents available for the 1000 Genomes data sets?
- Can I BLAST against the 1000 Genomes data sets?
- Can I get 1000 Genomes data on the Amazon Cloud?
- Can I get access to the 1000 Genomes Wiki?
- Can I get genotypes for a specific individual/population from your VCF files?
- Can I get haplotype data for the 1000 Genomes individuals?
- Can I get image files for any of the 1000 Genomes sequencing runs?
- Can I get individual genotype information from browser.1000genomes.org?
- Can I get phenotype, gender and family relationship information for the samples?
- Does the 1000 Genomes Project use HapMap data?
- Can I map your SNP coordinates between NCBI36 and GRCh37?
- Can I use the 1000 Genomes data for imputation?
- How are your alignments generated?
- How can I get the allele frequency of my variant?
- How many individuals will be sequenced?
- How much disk space is used by the 1000 Genomes Project?
- How much sequence data has been generated for single individuals?
- Is the data for the pilot study still available?
- What sequencing platforms were used for the 1000 Genomes Project?
- What structural variant data is available for the project?
- What are the targets for your exon targeted pilot study?
- What are the targets for your whole exome sequencing?
- What do the names of your fastq files mean?
- What do the names of your variant files mean and what format are the files?
- Why does an individual have a genotype in a location where it has no sequence coverage?
- What format are your alignments in and what do the names mean?
- What is a bas file?
- What is the difference between your data directory and the pilot_data/data directory?
- What library insert sizes were used in the 1000 Genomes Project?
- What read lengths are being used by the project?
- What tools can I use to download 1000 Genomes data?
- What version of VCF are your VCF files in?
- What was the source of the DNA for sequencing?
- Where are the pilot structural variants archived?
- Where are the SNPs for the X/Y/mitochondrial chromosomes?
- Where are your alignment files located?
- Where are your reference data sets?
- Where are your sequence files located?
- Where are your variant files located?
- Where can I get consequence annotations for the 1000 Genomes variants?
- Where does the Ancestral Allele Information for your variants come from?
- Which samples are you sequencing?
- Why are the coordinates of your pilot variants different to what is displayed in Ensembl or UCSC?
- Why do some of your VCF genotype files have genotypes of ./. in them?
- Why is only 85% of the genome assayable?
- Why is the Allele frequency different from Allele Count/Allele Number?
- Why is the sequence data distributed in 2 or 3 files labelled SRR_1, SRR_2 and SRR?
- Why isn't a SNP in dbSNP or HapMap?
- Why isn't my SNP in browser.1000genomes.org?