Much Better Population Structure

April 16, 2017

This new study confirms the results of previous studies by Di Gaetano et al. (2012) and Fiorito et al. (2016) but has much better geographical coverage, testing 737 individuals from 20 locations in 15 different regions. As a result, the earlier genetic "gap" between North-Central and Southern Italians disappears: it is filled in by an intermediate Central Italian cluster, creating a continuous cline of variation down the peninsula (with Sardinians as outliers) that mirrors geography.

The four new Italian samples from this study (N_ITA, C_ITA, S_ITA and SARD) cluster right on top of older Italian samples from Bergamo, Tuscany, Abruzzo, Sicily and Sardinia used in earlier studies, which are barely visible underneath, showing that the results are consistent. Northern Italians once again cluster with Spaniards. There's also a small Greek sample this time, but it's not from the areas closest to Southern Italians, and it clusters more with the Central Italians instead.

The study also includes, for the first time, a formal admixture test that models the ancestry of Italians by inferring admixture events using all of the Western Eurasian samples. The results are very interesting in light of the ancient DNA evidence that has come out in the last couple of years.

When top 1% of most significant f3 values were retained according to computed Z-scores, 85% of tested population trios actually involved Italian clusters (Supplementary Table S2). In addition to the pattern described in the main text, the SARD sample seemed to have played a major role as source of admixture for most of the examined populations, especially Italian ones, rather than as recipient of migratory processes. In fact, the most significant f3 scores for trios including SARD indicated peninsular Italians as plausible results of admixture between SARD and populations from Iran, Caucasus and Russia. This scenario could be interpreted as further evidence that Sardinians retain high proportions of a putative ancestral genomic background that was considerably widespread across Europe at least until the Neolithic and that has been subsequently erased or masked in most of present-day European populations.
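For readers unfamiliar with the f3 test mentioned in the excerpt above, here is a minimal sketch of the idea. It is not the study's pipeline: the allele frequencies are made up, and the function omits the finite-sample correction and the block-jackknife Z-scores that real analyses typically rely on. The statistic f3(C; A, B) averages (c - a)(c - b) over SNPs, where a, b and c are allele frequencies in populations A, B and C. A clearly negative value suggests that C is a mixture of populations related to A and B.

```python
from statistics import mean


def f3(target, source_a, source_b):
    """Naive f3(target; source_a, source_b): mean over SNPs of (c - a) * (c - b).

    No sample-size correction and no significance test; illustration only.
    """
    if not (len(target) == len(source_a) == len(source_b)):
        raise ValueError("frequency lists must have equal length")
    return mean((c - a) * (c - b) for c, a, b in zip(target, source_a, source_b))


# Hypothetical allele frequencies at five SNPs (invented for this example).
source_a = [0.10, 0.80, 0.30, 0.90, 0.20]
source_b = [0.70, 0.20, 0.90, 0.10, 0.60]

# A target built as an exact 50/50 mixture of the two sources.
mixed = [0.5 * a + 0.5 * b for a, b in zip(source_a, source_b)]

print(f3(mixed, source_a, source_b))     # negative: consistent with admixture
print(f3(source_a, mixed, source_b))     # positive: no admixture signal for A
```

In the study's terms, a trio such as f3(peninsular Italians; SARD, a Caucasus population) coming out significantly negative is what marks Italians as a plausible mixture of the two.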

It's known that Sardinians are almost identical to Early European Farmers from the Neolithic, and that the Indo-Europeans who spread their languages all across Europe in the Bronze Age were a mix of Eastern Hunter-Gatherers from the Russian Steppe and either Caucasus Hunter-Gatherers or Chalcolithic Iranians (who are very similar).

So it looks like Italians resemble other Europeans in being a mix of Early European Farmers and later Indo-European invaders. The farmer ancestry (which is ultimately from Anatolia) has an expected southeast to northwest cline in Europe, but surprisingly not within Italy. According to the study's estimates, the amount is about the same in the North as in the South. The two regions differ instead in their Indo-European related ancestry, because the "Caucasus/Iran" and "Russian Steppe" components follow inverse clines.

The purple component was predominant in Southern European groups and equally distributed along the peninsula (average frequency of 46%), almost reaching fixation in Sardinians (85%) plausibly due to their long-term isolation especially to Post-Neolithic processes. [...] The green component was considerably represented in samples from Caucasus and Middle East, being also evident in some Southern European populations (e.g. Greeks) and, especially, in Southern Italy (28%), progressively decreasing towards the northern part of the peninsula (12%). [...] The red component characterized most of Central and Eastern European populations, being reduced in Sardinia (7.4%) and showing a decreasing north-south gradient in peninsular Italy (from 39% in N_ITA to 20% in S_ITA).
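The arithmetic behind these figures can be checked with a short sketch. The green and red percentages for N_ITA and S_ITA come from the excerpt above; using the 46% peninsular average of the purple component for both groups is an assumption, since the excerpt only says it is "equally distributed". The function name `summarize` and the dictionary layout are illustrative.

```python
# Component frequencies (%) for the two ends of the peninsula.
# Green and red are quoted values; purple uses the 46% peninsular
# average for both groups, which is an assumption.
components = {
    "N_ITA": {"purple": 46.0, "green": 12.0, "red": 39.0},
    "S_ITA": {"purple": 46.0, "green": 28.0, "red": 20.0},
}


def summarize(freqs):
    """Combine the green and red components and report green's share of that sum."""
    green_plus_red = freqs["green"] + freqs["red"]
    return {
        "green_plus_red": green_plus_red,
        "green_share": freqs["green"] / green_plus_red,
        # Whatever is left after the three named components.
        "other": 100.0 - sum(freqs.values()),
    }


for group, freqs in components.items():
    s = summarize(freqs)
    print(f"{group}: green+red = {s['green_plus_red']:.0f}%, "
          f"green share = {s['green_share']:.2f}, other = {s['other']:.0f}%")
```

Under these assumptions the combined green and red total is similar at both ends (51% in the North, 48% in the South), while the balance between the two flips: green makes up roughly a quarter of that total in the North and more than half in the South.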

This difference could be explained by the fact that Ancient Italy was home to a variety of Indo-European speakers: Italic languages spread everywhere, Celtic languages were spoken in the North, and Greek and Illyrian languages in the South. It's likely that some of these arrived via a southern route through the Balkans while others took a northern route over the Alps, and that the people who brought them thus had different levels of "Caucasus" and "Russian" ancestry.

Sazzini et al. "Complex interplay between neutral and adaptive evolution shaped differential genomic background and disease susceptibility along the Italian peninsula". Scientific Reports, 2016.

Related: Complex Spread of Indo-European Languages