The Genomic Formation of South and Central Asia
Abstract
The genetic formation of Central and South Asian populations has been unclear because of an absence of ancient DNA. To address this gap, we generated genome-wide data from 362 ancient individuals, including the first from eastern Iran, Turan (Uzbekistan, Turkmenistan, and Tajikistan), Bronze Age Kazakhstan, and South Asia. Our data reveal a complex set of genetic sources that ultimately combined to form the ancestry of South Asians today. We document a southward spread of genetic ancestry from the Eurasian Steppe, correlating with the archaeologically known expansion of pastoralist sites from the Steppe to Turan in the Middle Bronze Age (2300-1500 BCE). These Steppe communities mixed genetically with peoples of the Bactria Margiana Archaeological Complex (BMAC) whom they encountered in Turan (primarily descendants of earlier agriculturalists of Iran), but there is no evidence that the main BMAC population contributed genetically to later South Asians. Instead, Steppe communities integrated farther south throughout the 2nd millennium BCE, and we show that they mixed with a more southern population that we document at multiple sites as outlier individuals exhibiting a distinctive mixture of ancestry related to Iranian agriculturalists and South Asian hunter-gatherers. We call this group Indus Periphery because they were found at sites in cultural contact with the Indus Valley Civilization (IVC) and along its northern fringe, and also because they were genetically similar to post-IVC groups in the Swat Valley of Pakistan.
By co-analyzing ancient DNA and genomic data from diverse present-day South Asians, we show that Indus Periphery-related people are the single most important source of ancestry in South Asia, consistent with the idea that the Indus Periphery individuals provide the first direct look at the ancestry of peoples of the IVC. We also develop a model for the formation of present-day South Asians in terms of the temporally and geographically proximate sources of Indus Periphery-related, Steppe, and local South Asian hunter-gatherer-related ancestry. Our results show how ancestry from the Steppe genetically linked Europe and South Asia in the Bronze Age, and identify the populations that almost certainly were responsible for spreading Indo-European languages across much of Eurasia.
One Sentence Summary: Genome-wide ancient DNA from 357 individuals from Central and South Asia sheds new light on the spread of Indo-European languages and on parallels between the genetic histories of two sub-continents, Europe and South Asia.