What common patterns underlie languages? How much variation is possible and how do neighboring languages influence each other? These and many other questions can now be addressed using the newly published Grambank data collection. With over 400,000 data points and 2,400 languages, Grambank is the largest comparative grammatical database. Initial analyses of the data show how diverse the world's languages are, while underscoring the risk that many unique languages will be supplanted by others and lost.
Our language is an essential part of our culture. It determines how we can express ourselves and communicate and even how we think. It provides information about our cultural heritage, and similarities between languages allow conclusions to be drawn about historical migrations. Languages are subject to constant evolution. They are passed down culturally and continue to develop, partly under the influence of other languages. Around 7,000 different languages are currently spoken worldwide. So far, however, the global patterns of human language diversity have not been systematically described.
Cataloged grammatical properties
A team led by Hedvig Skirgård from the Max Planck Institute for Evolutionary Anthropology in Leipzig, in collaboration with researchers from more than 60 institutions worldwide, has now compiled the largest grammatical database to date. The new Grambank database includes over 400,000 data points and 2,400 languages from 215 language families from all inhabited continents. The team coded the recorded languages according to numerous grammatical features, including word order, the way words are inflected and whether there are gender-specific pronouns.
"Initially, we had to revise the questionnaire for entering the language features several times in order to capture as many of the different strategies that languages have developed for encoding grammatical properties," says Skirgård. The database now includes a total of 195 individual grammatical properties. This results in an enormous number of possible combinations. Analyzes of the Grambank data show that the actual variation is also very large, but is subject to certain limits. The languages of the original inhabitants of America, which developed isolated from the rest of the world over many millennia, are nevertheless relatively similar in structure to other languages of the world. This could indicate that certain human cognitive traits make certain grammatical structures more likely than others.
The most unusual languages in the world
"If the processes of language evolution and diversification were to start all over again, there would still be some resemblance to what we have today," says Skirgård's colleague Russel Gray. "Being subject to the constraints of human cognition means that while there is a high degree of historical contingency in the organization of grammatical structures, there are also fixed patterns." The data show that the historical relatedness of languages plays a greater role in their grammatical similarity than the geographic proximity of present-day ranges. Languages are thus more similar to other languages with which they share a common ancestor than languages with which they are only in contact through the geographical proximity of their distribution areas. "Genealogy generally trumps geography," Gray summarizes.
In addition, the team went in search of the most unusual languages in the world, i.e. languages that have the most unique possible combination of grammatical features. "Most of the time, the most unusual languages do not belong to the largest language families, or if they do, they are at the geographical edge of their range," the authors explain. Some of the most unusual languages in the data set include Native American languages such as Movima, and Papua New Guinea languages such as Kuot and Yélî Dnye. These languages are only spoken by a few thousand people and are severely endangered. However, many other languages that are considered less exceptional but still have a number of unique characteristics could also disappear in the future.
Unique cultural source
"The extraordinary diversity of languages is one of the greatest cultural achievements of humankind," says co-author Steven Levinson from the Max Planck Institute for Psycholinguistics in Nijmegen in the Netherlands. “This diversity is under severe threat, particularly in some regions of the world such as northern Australia and parts of South and North America. Without sustained efforts to document and revitalize endangered languages, our future view of human history, cognition, and culture will be severely limited through the window that linguistics offers us.”
The Grambank database helps identify, document, and preserve endangered languages. "Every single language harbors a unique and irreplaceable source of linguistic knowledge," explains the team. With the help of Grambank, connections between linguistic diversity and other cultural and biological characteristics can be explored. "It ranges from religious beliefs to economic behaviors and musical traditions to genetic lineages," says Gray. "These connections to other facets of human behavior will make Grambank a key resource not only for linguistics but more generally for the multidisciplinary effort to understand human diversity."
Source: Hedvig Skirgård (Max Planck Institute for Evolutionary Anthropology, Leipzig) et al., Science Advances, doi: 10.1126/sciadv.adg6175