AI Decodes Chimpanzee Vocabulary Items
Abstract
Research continues to reveal complexity in animal vocal utterances (the emission of vocalizations comprising one or more units). However, identifying what animal utterances mean remains an ongoing challenge. Critical steps include mapping a clearly defined signal form onto a clearly defined signal usage, where signal usage is assigned through a combination of the context of production and the receivers’ response. Here, we address the problem of defining the signal form: the acoustic features that distinguish a call with one usage from a call with another usage, broadly analogous to identifying vocabulary items. Deciphering audio recordings from species that live in noisy environments, such as tropical forests, is particularly challenging, a task that may be aided by AI. To date, AI applied to animal vocalizations recorded in natural environments has shown success in classifying vocalizations by species, individual, or call type. Much rarer are studies classifying vocalizations by the specific socio-ecological contexts in which they are emitted, such as greet, feed or play. We used the ANIMAL-SPOT algorithm, which has demonstrated success in species-level call classification in noisy environments, on a sample of 1396 calls from 60 wild chimpanzees in the Taï Forest, Ivory Coast. We challenged ANIMAL-SPOT with a particularly hard task: classifying two acoustically similar, noisy, graded ‘grunt’ calls to their correct context of production, either feed or greet. The algorithm achieved >80% correct classification, whereas randomized simulations gave around 50% accuracy. We conclude that AI may emerge as a new tool for unravelling vocal communication in graded, noisy vocal systems, automatically and rapidly distinguishing subtle, hard-to-detect acoustic differences. Specifically, we demonstrate initial steps in applying AI to decode vocabulary in chimpanzees and other animals.
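As a purely illustrative sketch, and not the authors’ ANIMAL-SPOT pipeline, the chance-level comparison reported above can be framed as a label-permutation baseline: the feed/greet context labels are shuffled many times and the same set of predictions is scored against each shuffled copy, yielding an empirical chance distribution. Every specific below is an assumption for illustration: the simulated labels, the fabricated ~80%-accurate predictions, and the 1000 permutations are all hypothetical stand-ins, not the study’s data.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical stand-in labels: 1 = "feed", 0 = "greet", for 1396 calls
# (the class balance here is illustrative, not the paper's actual split).
labels = rng.integers(0, 2, size=1396)

# Fabricate predictions with roughly 80% accuracy, purely to illustrate
# the comparison against a shuffled-label baseline.
predictions = labels.copy()
flip = rng.random(labels.size) < 0.20          # corrupt ~20% of predictions
predictions[flip] = 1 - predictions[flip]
observed_acc = (predictions == labels).mean()

# Permutation baseline: shuffle the context labels many times and record
# the accuracy the fixed predictions achieve against each shuffled copy.
# For a roughly balanced two-class problem this distribution centres near 0.5.
n_perm = 1000
null_acc = np.empty(n_perm)
for i in range(n_perm):
    shuffled = rng.permutation(labels)
    null_acc[i] = (predictions == shuffled).mean()

print(f"observed accuracy: {observed_acc:.3f}")
print(f"chance accuracy (mean ± sd): {null_acc.mean():.3f} ± {null_acc.std():.3f}")
# One-sided p-value: fraction of permutations matching or beating the observed accuracy.
print(f"p = {(null_acc >= observed_acc).mean():.4f}")
```

The value of this design is that the permutation distribution provides an empirical chance level tied to the actual class balance of the sample, so an observed accuracy above 80% is judged against that distribution rather than against a nominal 50%.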