Tracing languages back to their earliest common ancestor through sound shifts

In Turkic, sound transitions among consonants (solid circles) and among vowels (squares) are frequent and regular (indicated by links, arrows indicate direction), but rare between most consonants and vowels.

February 19, 2015

A team of researchers in the U.S. and U.K. has developed a statistical technique that sorts out when changes to words’ pronunciations most likely occurred in the evolutionary history of related languages.

Their model, presented recently in the journal Current Biology, gives researchers a renewed opportunity to trace words and languages back to their earliest common ancestor or ancestors – potentially thousands of years further into prehistory than previous techniques can do with any statistical rigor.

Led by Santa Fe Institute Professor Tanmoy Bhattacharya and University of Reading Professor Mark Pagel, the technique detects historical “concerted sound changes” – a linguistic evolutionary phenomenon where a specific sound changes to another specific sound in many words simultaneously.

For example, the modern languages of English and Latin descended from a common predecessor called proto-Indoeuropean. In English, the words father and foot took on an initial f sound, but in Latin those words retained their p sound, as in pater and ped. This transition occurred across the English language in many words that had featured a p sound.

The researchers tested their new model on Turkic, a family of at least 35 languages spoken by Turkic peoples from Southeastern Europe and the Mediterranean to Siberia and Western China. Their computer analysis automatically considered and evaluated the likelihood (without the potentially biased input of humans) that more than 70 regular sound changes had occurred throughout the 2000-year history of the Turkic languages.

An example is the word pas (head in English) in the Khakassian language. In Turkish, Uzbek, and 16 other Turkic languages the initial sound is b instead, yielding baš. Similarly, pel- (meaning louse) in Khakassian is bil- or bel- in the other languages. The ubiquity of this sound difference strongly supports the hypothesis that a regular sound shift occurred.

“Computers so far have mainly used the presence or absence of words with a common origin in various languages to stitch together trees that describe the descent of the various languages from a common ancestor,” say Bhattacharya. “This has left out the vastly richer data residing in sounds, primarily because sound changes in different words are not independent, as most mutations in genetics are, for example.”

In the paper, the researchers developed the mathematics for detecting and evaluating hypotheses concerning concerted sound changes and showed that the technique provided vastly superior dated trees for the Turkic language family than previous methods.

“Regular correspondences between sounds in different languages have long been an important test for establishing linguistic relatedness in traditional historical linguistics,” Bhattacharya says. “Being able to detect them automatically and score them within a probabilistic framework is a major step forward. Such a probabilistic framework may be able to help us detect and evaluate signals of very old regular sound changes that are much weaker than what is possible to determine unambiguously in a manual fashion.”

"Our new method is another exciting step to understanding how languages and genes evolve,” says Pagel. “It will allow us to go back in time further than before, making it possible to reconstruct ancient proto-languages, words that might have been spoken many thousands of years ago.”

The paper, “Detecting Regular Sound Changes in Linguistics as Events of Concerted Evolution,” was published January 5 in Current Biology. Co-authors include Daniel Hruschka (Arizona State University, a former SFI Omidyar Postdoctoral Fellow), Simon Branford (Reading), D. Eric Smith (SFI and the Krasnow Institute for Advanced Study), Jon Wilkins (SFI and the Ronin Institute), and Andrew Meade (Reading).

Tracing languages back to their earliest common ancestor through sound shifts

February 19, 2015

Share

News Media Contact

Santa Fe Institute

Tags

More SFI News

Tim Kohler to deliver Linda S. Cordell Lecture

To accelerate biosphere science, reconnect three scientific cultures

Mirta Galesic receives prestigious ERC Advanced Grant

Carlo Rovelli receives 2024 Lewis Thomas Prize

Research News Brief: Defining a city using cell-phone data

Complexity tools for USDA nutritional guidelines

Quantifying the potential value of data

Carlo Rovelli joins SFI's Fractal Faculty

New book offers thoughtful approach to modeling complex social systems

Research News Brief: A test of AI “personalities” and behavior

Study: To make sense of history, embrace uncertainty

Study: Predicting steps in a random process

Embodied intelligence & a sense of self

How to track important changes in a dynamic network

African and South Asian students build new connections during inaugural Complexity Global School

New gifts support SFI Education and Postdoctoral programs

The cultural evolution of collective property rights

Applications for Complexity Global School are now open

Life as a planetary regulator: an experimental test

Sam Bowles named IEA Fellow