A network inferred from translated word meaning data drawn from American and Oceanic languages. The researchers' online tool allows for simultaneous visualization of up to four language groups.

We create words to label people, places, actions, thoughts, and more so we can express ourselves meaningfully to others. Do humans' shared cognitive abilities and dependence on languages naturally provide a universal means of organizing certain concepts? Or do environment and culture influence each language uniquely?

Using a new methodology that measures how closely words’ meanings are related within and between languages, an international team of researchers has revealed that for many universal concepts, the world’s languages feature a common structure of semantic relatedness. 

“Before this work, little was known about how to measure [a culture’s sense of] the semantic nearness between concepts,” says co-author and SFI Professor Tanmoy Bhattacharya. “For example, are the concepts of sun and moon close to each other, as they are both bright blobs in the sky? How about sand and sea, as they occur close by? Which of these pairs is the closer? How do we know?” 

Translation, the mapping of relative word meanings across languages, would provide clues. But examining the problem with scientific rigor called for an empirical means to denote the degree of semantic relatedness between concepts. 

To get reliable answers, Bhattacharya needed to fully quantify a comparative method that is commonly used to infer linguistic history qualitatively. (He and collaborators had previously developed this quantitative method to study changes in sounds of words as languages evolve.) 

“Translation uncovers a disagreement between two languages on how concepts are grouped under a single word,” says co-author and SFI researcher Hyejin Youn. “Spanish, for example, groups ‘fire’ and ‘passion’ under ‘incendio,’ whereas Swahili groups ‘fire’ with ‘anger’ (but not ‘passion’).” 

To quantify the problem, the researchers chose a few basic concepts that we see in nature (sun, moon, mountain, fire, and so on). Each concept was translated from English into 81 diverse languages, then back into English. Based on these translations, a weighted network was created. The structure of the network was used to compare languages’ ways of partitioning concepts. 
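As a rough illustration of how translations can become a weighted network, the Python sketch below links two concepts whenever a single word in some language covers both, and weights each link by the number of languages in which that happens. The lexicon, the language and word labels, and the weighting rule are all invented for illustration; the paper's actual data and weighting scheme may differ.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data (not from the study): for each language, each word
# maps to the set of English concepts it can be translated back into.
TOY_LEXICON = {
    "lang_a": {"w1": {"sun", "moon"}, "w2": {"sea", "lake"}},
    "lang_b": {"w3": {"sun", "moon"}, "w4": {"sea", "salt"}},
    "lang_c": {"w5": {"sun", "day"}, "w6": {"sea", "lake"}},
}


def build_network(lexicon):
    """Return {(concept_a, concept_b): number of languages linking them}."""
    weights = defaultdict(int)
    for words in lexicon.values():
        # Collect each concept pair once per language, even if several
        # words in that language happen to cover the same pair.
        linked_pairs = set()
        for meanings in words.values():
            for pair in combinations(sorted(meanings), 2):
                linked_pairs.add(pair)
        for pair in linked_pairs:
            weights[pair] += 1
    return dict(weights)


network = build_network(TOY_LEXICON)
for pair, weight in sorted(network.items()):
    print(pair, weight)
```

In this toy example, "sun" and "moon" share a word in two of the three made-up languages, so their link carries more weight than the link between "sun" and "day".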

The team found that the translated concepts consistently formed three theme clusters in the network (water, solid natural materials, and earth and sky), each densely connected within itself and only weakly connected to the others.
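One simple way to see how clusters of this kind can emerge from a weighted network is to drop the weakest links and look at which concepts remain connected. The Python sketch below does this with a threshold and a connected-components search. The edge weights, the threshold, and the function name `find_clusters` are assumptions made for illustration; this is a stand-in technique, not the clustering procedure used in the paper.

```python
# Hypothetical toy edge weights (not from the study).
TOY_EDGES = {
    ("lake", "sea"): 5,
    ("river", "sea"): 4,
    ("moon", "sun"): 6,
    ("sky", "sun"): 3,
    ("sea", "sky"): 1,  # a weak link between two otherwise separate groups
}


def find_clusters(edges, min_weight):
    """Group concepts connected by links of at least min_weight."""
    neighbors = {}
    for (a, b), weight in edges.items():
        # Register both concepts so isolated ones still form a cluster.
        neighbors.setdefault(a, set())
        neighbors.setdefault(b, set())
        if weight >= min_weight:
            neighbors[a].add(b)
            neighbors[b].add(a)

    seen = set()
    clusters = []
    for start in sorted(neighbors):
        if start in seen:
            continue
        # Depth-first search over the links that survived the threshold.
        group = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbors[node] - group)
        seen |= group
        clusters.append(sorted(group))
    return clusters


strong_clusters = find_clusters(TOY_EDGES, min_weight=2)
all_links = find_clusters(TOY_EDGES, min_weight=1)
print(strong_clusters)
print(all_links)
```

With the weak "sea"/"sky" link removed, the toy network splits into a water-like group and a sky-like group; keeping every link merges them into one.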

“For the first time, we now have a method to quantify how universal these relations are,” says Bhattacharya. “What is universal – and what is not – about how we group clusters of meanings teaches us a lot about psycholinguistics, the conceptual structures that underlie language use.” 

The researchers hope to expand this study’s domain, adding more concepts, then investigating how the universal structure they reveal underlies meaning shift.

Their research was published today in PNAS. Among the paper’s eight co-authors are five SFI-affiliated researchers: SFI Professor Cristopher Moore, External Professors D. Eric Smith and Jon Wilkins, and Youn and Bhattacharya.

Read the paper, "On the universal structure of human lexical semantics," in PNAS (February 1, 2016)

Visualize language meaning networks using the researchers' online tool 

Read the research highlight in Nature (February 11, 2016)

Read the article in Scientific Computing (February 5, 2016)

Read the article in the Christian Science Monitor (February 4, 2016)

Watch the video on Quartz (February 4, 2016)