Detecting natural selection by empirical comparison to random regions of the genome

Title: Detecting natural selection by empirical comparison to random regions of the genome
Publication Type: Journal Article
Year of Publication: 2009
Authors: Yu F, Keinan A, Chen H, Ferland RJ, Hill RS, Mignault AA, Walsh CA, Reich D
Journal: Hum. Mol. Genet.
Keywords: Adaptor Proteins, Computer Simulation, DNA, Forkhead Transcription Factors, G-Protein-Coupled, Gene Frequency, Genetic, Genome, Haplotypes, Human, Humans, Models, Nerve Tissue Proteins, Neurogenesis, Polymorphism, Receptors, Selection, Sequence Analysis, Signal Transducing, Single Nucleotide

Historical episodes of natural selection can skew the frequencies of genetic variants, leaving a signature that can persist for many tens or even hundreds of thousands of years. However, formal tests for selection based on allele frequency skew require strong assumptions about demographic history and mutation, which are rarely well understood. Here, we develop an empirical approach to test for signals of selection that compares patterns of genetic variation at a candidate locus with matched random regions of the genome collected in the same way. We apply this approach to four genes that have been implicated in syndromes of impaired neurological development, comparing the pattern of variation in our re-sequencing data with a large-scale, genomic data set that provides an empirical null distribution. We confirm a previously reported signal at FOXP2, and find a novel signal of selection centered at AHI1, a gene that is involved in motor and behavior abnormalities. The locus is marked by many high-frequency derived alleles in non-Africans that are of low frequency in Africans, suggesting that selection at this or a closely neighboring gene occurred in the ancestral population of non-Africans. Our study also provides a prototype for how empirical scans for ancient selection can be carried out once many genomes are sequenced.
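
The idea of comparing a candidate locus against an empirical null distribution can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `empirical_p_value`, the choice of summary statistic (here imagined as the fraction of high-frequency derived alleles per region), and all numbers are assumptions made for illustration only. The sketch ranks the candidate's statistic among the statistics of matched random regions and reports the fraction of regions that are at least as extreme.

```python
from typing import Sequence


def empirical_p_value(candidate_stat: float, null_stats: Sequence[float]) -> float:
    """One-sided empirical p-value for a candidate locus.

    Counts how many matched random regions have a summary statistic at
    least as large as the candidate's. The +1 in the numerator and the
    denominator is a common convention that keeps the estimate above
    zero when no random region is as extreme as the candidate.
    """
    if len(null_stats) == 0:
        raise ValueError("need at least one random region for the null")
    n_as_extreme = sum(1 for s in null_stats if s >= candidate_stat)
    return (n_as_extreme + 1) / (len(null_stats) + 1)


if __name__ == "__main__":
    # Hypothetical values: one statistic per matched random region.
    random_regions = [0.02, 0.05, 0.01, 0.08, 0.03, 0.04, 0.06, 0.02, 0.07]
    # Hypothetical statistic for the candidate locus.
    candidate = 0.12
    print(empirical_p_value(candidate, random_regions))
```

Because the null comes from regions collected in the same way as the candidate, this kind of comparison does not require an explicit model of demographic history or mutation. Its resolution is limited by the number of random regions: with N regions, the smallest p-value this sketch can report is 1/(N+1).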