Investigating the role of demography and selection in genome scale patterns of common and rare variant diversity in humans


Authors

Mörseburg, Alexander 

Abstract

In the last decade, an unprecedented increase in the availability of whole genome sequence (WGS) data has reshaped the field of human evolutionary genomics. However, many earlier projects, such as the HapMap and 1000 Genomes panels, focussed on a limited set of populations. More research has therefore been required to better characterise genetic diversity in understudied regions, such as Island Southeast Asia and Siberia. This thesis contributes to this ongoing effort in the form of three partially related subprojects. Firstly, population structure and local adaptations in Southeast Asia were investigated using novel autosomal data for 730,000 SNPs from 146 individuals, in the context of a larger worldwide panel of 1,825 humans. The Kankanaey Igorot from the highlands of the Philippine Mountain Province were highlighted as the closest living representatives of the source population that may have given rise to the Austronesian expansion. Furthermore, consistent with archaeological, cultural and linguistic evidence of Indian influence in Southeast Asia starting from 2.5 kya, South Asian admixture in the region was estimated to date back to the last couple of thousand years. Secondly, to provide an unbiased, high-resolution picture of the patterns of functional and rare variants worldwide, high-coverage WGS data from 483 individuals (including 379 novel genomes) were analysed. Ingenuity Variant Analysis and the Ensembl Variant Effect Predictor were applied to a subset of these genomes (n = 382) to create a repository of functional and deleterious variants. Evidence for purifying selection in genes involved in pigmentation and immune defence against viruses was detected in African populations. The most differentiated sites across continental groups were integrated with haplotype-based selection tests and annotations from functional databases to pinpoint disease- and metabolism-related candidate loci.
Finally, a subset of the WGS dataset, designed to maximise coverage of diverse ethnic groups (n = 447), was screened for variants occurring exclusively in two individuals in a heterozygous state (f2 variants). It was shown that f2 sharing correlates well with the results of CHROMOPAINTER, a state-of-the-art method to detect recent gene flow, and allows for the detection of cryptic relatedness among distant populations. This was demonstrated by the example of a previously undetected low-level African ancestry component in the South American Calchaquíes, putatively related to the transatlantic slave trade.
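
The f2 definition above (a variant carried by exactly two individuals, both heterozygous) can be illustrated with a minimal sketch. This is not the pipeline used in the thesis. The function name `f2_sharing`, the input layout (one list of alternate-allele counts per site, coded 0, 1 or 2) and the toy data are assumptions made for illustration only, and missing genotypes are assumed to be absent.

```python
from collections import Counter


def f2_sharing(genotypes):
    """Count f2 variants shared by each pair of individuals.

    genotypes: list of sites; each site is a list holding one
    alternate-allele count (0, 1 or 2) per individual.
    Hypothetical input layout, assumed for this sketch.
    """
    counts = Counter()
    for site in genotypes:
        # Indices of individuals carrying at least one alternate allele.
        carriers = [i for i, g in enumerate(site) if g > 0]
        # An f2 variant has exactly two carriers, both heterozygous.
        if len(carriers) == 2 and all(site[i] == 1 for i in carriers):
            counts[(carriers[0], carriers[1])] += 1
    return counts


# Toy data: six sites, four individuals (illustrative values only).
sites = [
    [1, 1, 0, 0],  # f2 variant shared by individuals 0 and 1
    [0, 1, 0, 1],  # f2 variant shared by individuals 1 and 3
    [2, 0, 0, 0],  # two copies in one homozygote: not an f2 variant
    [1, 1, 1, 0],  # three carriers: not an f2 variant
    [1, 1, 0, 0],  # second f2 variant shared by individuals 0 and 1
    [1, 0, 0, 0],  # singleton: not an f2 variant
]

sharing = f2_sharing(sites)
for pair, n in sorted(sharing.items()):
    print(pair, n)
```

Summing such pairwise counts over the populations the two individuals belong to would give a population-level sharing matrix of the kind that could be compared with other measures of recent gene flow.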

Date

2018-12-20

Advisors

Kivisild, Toomas

Keywords

Human genetics, Population genomics, Population history, Natural selection, Rare genetic variants

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge