2016 Grants:

Robert Henderson, in collaboration with Ryan Bennett from Yale University, received an NSF Grant for the project entitled "Collaborative Research: Investigations into tone and stress in a complex prosodic system"


The sound systems of human languages differ widely, as any student of foreign languages knows. One way in which languages differ is their use of stress and tone. English is a language which uses stress to distinguish word meanings: compare the word "record" when used as a verb ("to recórd") and as a noun ("the récord"). Other languages use stress to mark the edges of words. Similar patterns are found with tone: in Tokyo Japanese, for instance, the meaning of the word "hashi" depends on the pitch melody with which it is spoken (with a high-low melody it means "chopsticks", and with a low-high melody it means "bridge"). Languages which use just one of these features are very common. Less common, and less studied, are languages which use stress and tone together. Until such languages are studied in greater detail, we will not have a complete picture of the sound systems of human languages and the ways in which they can vary.

This project investigates stress and tone in Uspanteko, a highly endangered Mayan language spoken by approximately 1000-3000 people in the central highlands of Guatemala. Uspanteko uses tone to distinguish word meanings, as in "síip" (high tone, means "tick") vs. "siip" (no high tone, means "gift"). Alongside tone, stress is used in Uspanteko to mark the edges of words. The primary aim of this project is to document and analyze the acoustic structure of stress and tone in Uspanteko. As the sound system of Uspanteko is typologically unusual, this project has the potential to contribute substantially to our understanding of cross-linguistic variation in the areas of stress and tone. One product of this research will be a large, annotated corpus of spontaneous speech in Uspanteko. The project also involves a substantial training component for linguistics students in the U.S. and Guatemala.


Tyler Peterson and Ofelia Zepeda received an NSF Grant for the project entitled "Workshop: Assessing and Documenting the Vitality of Native American Languages"


The Native American Languages Act, passed by the U.S. Congress in 1990, enacted into policy the recognition of the unique status and importance of Native American languages. All Native American languages are endangered, although they vary considerably in terms of 'vitality,' that is, who uses the languages, how, and where. Currently, there is no systematic assessment of the Native American languages of the United States and their vitality. Through a workshop followed by a summer course, this project will review existing assessment tools and survey methodologies with the goal of enabling participants to create new and innovative assessment tools that address this need. Participants will be Native Americans who are currently engaged in language work, as citizen scientists, educators, and staff and students at tribal colleges and universities. This project has the potential to inform policy decisions and implementations in national and international contexts. In addition, it will create a cohort of indigenous citizen scientists well-versed in scientific activities that include research protocols, assessment design and use, data analysis and more.

The American Indian Language Development Institute (AILDI) at the University of Arizona is in its fourth decade serving as a training institute for Native Americans in descriptive linguistics, language documentation, assessment, and more. Through this pilot project, Documenting Native Language Vitality, AILDI researchers and workshop participants will begin work toward the long-term goal of creating a model for grass-roots assessment for tribal communities. In the summer course, participants will learn best practices in data collection, management and archiving, as well as how to deploy assessment data. Participants will come from eight Arizona tribal language and four tribal colleges and universities; the involvement of the latter will test the viability of the national network of tribal colleges for future assessments. The NSF Tribal Colleges and Universities Program (TCUP) in EHR is providing support for tribal college participation in this project. Assessing language vitality and collecting and analyzing data in language surveys are the empirical cornerstones of language documentation and revitalization projects in Native American communities. Importantly, this innovative approach has implications and potential utility for endangered language scientists and communities worldwide. Results will be published online and in print to promote an ongoing discussion with a wide audience.

2015 Grants:

Heidi Harley received an NSF Grant for the project entitled "Hiaki Grammar: Documentation and Analysis."


This three-pronged project focuses on the grammar of Hiaki (Yaqui), an endangered Uto-Aztecan language spoken in southern Arizona and northern Mexico. Investigating understudied languages like Hiaki provides researchers the opportunity to discover hitherto undocumented diversity in the human capacity for linguistic expression, and hence to test and improve upon models of human language and cognition. Basic descriptive research on a broad array of grammatical areas will be conducted, enabling the drafting of the second volume of a projected three-volume series on the grammar of Hiaki. A series of subtitled videos, useful in language teaching as well as serving documentary purposes, will be produced.

Two particularly intriguing aspects of Hiaki will be the focus of the theoretical component of this research. The grammar of Hiaki's noun phrases is like that of German or Russian, with concord for number and case marking on articles, nouns and adjectives. In contrast, Hiaki's verb phrases resemble those of Japanese, requiring complex suffixation processes instead of complement clauses in many contexts. Surprisingly, noun phrase concord becomes dissociated when the noun phrase mentions a possessor, as in 'the dancer's beads', such that number marking on the article tracks the number of the possessor but the case of the whole noun phrase. This pattern of dissociation is not expected in many current models of grammatical agreement. Verb phrases also exhibit an unusual property: they permit passivization of already non-active verbs. These two very unusual features of Hiaki will be investigated in detail in consultation with native speakers of the language, and a model of these grammatical patterns will be developed. The implications for our understanding of the structure of human language will include a new grasp of when and why such variation can arise, and, potentially, new theories of agreement and passivization, respectively.

Simin Karimi with co-PIs Andrew Carnie and Heidi Harley received an NSF Grant for a project entitled "A Descriptive and Theoretical Analysis of Complex Predicates in Iranian Languages."


Verbal information is conveyed differently across the world's languages. In English, this information is mostly expressed by a simple verb, while in some other languages it is conveyed by a complex structure consisting of two or more components, which linguists call a complex predicate. For example, gerye kardan 'cry doing' (Farsi) translates as the single verb 'to cry' in English. This project focuses on a descriptive and theoretical analysis of complex predicates in seventeen Iranian languages.

The nature of complex predicates has been the subject of many linguistic studies in the last five to six decades. One major question is whether they are word-like, or rather phrase-like with each component revealing independent properties. Furthermore, the contribution of each part to the meaning of the whole has long been a central issue in linguistic theory. Crucially, complex predicates reveal distinct behavior with respect to certain constructions compared to that observed in well-studied languages such as English. These constructions include passives (The apple was eaten), resultatives (John wiped the table clean), and causatives (John made Mary leave). The nature of these differences, and the reasons for their existence, are not well understood. New data and the microparametric comparison of the Iranian languages will break new ground in our understanding of the underpinnings of complex predicate formation and their behavior in human language.

Data collection will be conducted via linguistic elicitation with native speaker consultants as well as Complex Predicate Questionnaires to be filled out by native speaker consultants trained in linguistics. One of the goals of this project is to transcribe, catalogue and enter the data into a database. All the textual material that forms the basis for the descriptive analysis and pedagogical material will be stored in an accessible format available to the public for further study, in addition to a web-based encyclopedia for the target languages. This study promises many benefits. It trains several students as experts in Iranian languages, provides understanding of languages and cultures Americans are interested in but do not know much about, and fosters research relations between linguists on three continents, which will enrich collaborative activities.


Mike Hammond with co-PIs Diana Archangeli, Heddwen Brooks, Andrew Carnie, Diane Ohala, Adam Ussishkin, Peredur Webb-Davies, and Andy Wedel received an NSF Grant for a project entitled "SBE-RCUK: Experimental and Descriptive Investigations of Welsh (cym) Consonant Mutation."


Most human languages are conveyed through the medium of sound, and thus understanding the way language sound works is central to understanding human language. This project addresses an unusual language sound phenomenon, initial consonant mutation: the initial consonant of a word varies to express grammatical information about that word, as in Welsh possessives, e.g. cath ‘cat’, fy nghath ‘my cat’, dy gath ‘your cat’, ei chath ‘her cat’. In order to address big questions like “How do such systems evolve historically?”, “How can children learn initial mutation?”, and “Why does this not occur in a language like English?”, it is important to first fully understand the properties of the mutating consonants for speakers of a language with initial consonant mutation. This is the research goal, with respect to Welsh, a language with a rich and productive system of initial consonant mutation. In addition, many Americans are of Welsh descent. The primary heritage language of this community is Welsh, a medium for a rich culture of literature, song, poetry, history, etc. It also provides an important window into the world-view and culture of the ancestors of the Welsh-American community. Sadly, the Welsh language is endangered. There are no monolingual speakers. In Wales, the number of speakers is now less than 20% of the population and there’s a real danger that the language will decline below critical mass and no longer be viable. The potential loss to the understanding of the culture and background of so many Welsh and Welsh-Americans is tragic. The research to be conducted here will both help document extraordinary aspects of the grammatical system and provide resources for pedagogical materials essential in the revival of the language.

The interdisciplinary and international team will apply a diversity of investigative tools: perceptual studies using masked priming, production studies using acoustic analysis and ultrasound for articulatory study, traditional field work involving elicitation, judgment tasks, statistical corpus work, and two acquisition studies (one with adults; one with children). These heterogeneous methodologies allow the team to approach the investigation of consonant mutation from a variety of directions, thus providing a uniquely comprehensive view of the phenomenon. The sub-projects are diverse, but focus on two central questions: i) whether mutations are best viewed as lexical or phonological phenomena; and ii) what the precise phonological properties of mutations are. These studies will also impact linguistic theory, including our understanding of lexical access in a language where a lot of morphological information is concentrated in the beginning of the word and our understanding of the organization of phonological systems and alternations. The core of the team has done similar work on Scottish Gaelic and has done logistical and descriptive advance work on Welsh, and so is uniquely positioned for this project.

Andrew Carnie with co-PIs Muriel Fisher and Mike Hammond received an NSF Grant for the project entitled "Collaborative Research: Creating An Audio-Visual Corpus Of Scottish Gaidhlig To Preserve And Investigate Linguistic Diversity"


Scottish Gaelic, a Celtic language closely related to Irish and Welsh and more distantly to English, was once spoken across Scotland by most of the country's population. Today, however, Gaelic remains a community language only in the most remote regions of western Scotland. Gaelic speakers comprise about 1% of the Scottish population; the 2011 census found only 57,375 native speakers, compared to 250,000 a century earlier. This sharp decline makes the language's continued survival uncertain.

Scottish Gaelic is of immense interest because it possesses many rare linguistic features, including initial consonant mutation, in which a word-initial consonant changes depending on the function of that word in a clause. Scottish Gaelic also exhibits pre-aspirated consonants and verb-initial sentence structure. In addition, Scottish Gaelic offers remarkable examples of how knowledge systems particular to its local geography and climate (land management, fishing techniques) are embedded in the language. Should Gaelic become extinct, the global community -- not just Scotland -- will lose an irreplaceable cultural and scientific resource.

Through this two-year project for $189,457, Professors Ian Clayton, from the University of Nevada, Reno, and Andrew Carnie and Mike Hammond, from the University of Arizona, will create a corpus of linguistic interviews with 30 native Gaelic speakers, with the help of native speaker Muriel Fisher (recipient of the Linguistic Society of America's 2015 Excellence in Community Linguistics Award). Speakers will represent a range of ages, geographic origins, and professional backgrounds. The collection will contain more than twenty hours of high-quality audio-visual material, transcribed and translated, with both cultural and scientific value. The collection will offer an invaluable tool to help linguists expand their scientific study of the language's rare features. In addition, the interviews will focus on traditional occupations, folklore, and oral history, the kinds of knowledge and terminology most at risk as Gaelic declines.

When complete, the corpus will be publicly available through the Max Planck Institute's Language Archive and the University of Arizona's Open Repository.

2012 Grants:

Andrew Carnie with co-PIs Natasha Warner, Adam Ussishkin, Mike Hammond, and Diana Archangeli received an NSF Grant for the project entitled "Experimental and Descriptive Investigation of Gaidhlig Consonant Mutations"


As many as 20 million Americans identify themselves as Scottish or Scots/Irish in descent. One of the primary heritage languages of this community, Scottish Gaelic, is the medium for a rich culture of literature, song, poetry, history and indigenous knowledge-systems. It also provides an important window into the world-view and culture of the ancestors of the Scottish-American community. The loss of native speaker knowledge about this endangered language is imminent. The Scottish Gaelic language is of particular interest to scientific linguists. Gaelic is very different from English in the ways it signals grammatical relationships between words. In particular, it has a mechanism for indicating grammatical notions such as tense, gender, aspect, and possession by changing the first consonant of one of the words in the relationship. For example, the initial consonant in the word cù 'dog' (a k sound) is pronounced with a hard ch sound like German Bach when it appears after possessive words like mo 'my' (written as mo chù). These changes, called "initial consonant mutations," are a productive and critical part of the grammatical system of the language. This mechanism for indicating grammatical inflection is extremely rare in the world's languages and is very poorly understood.

In order for linguists to properly understand the sound system of a language, they have to use instrumental measures of how speakers articulate sounds and use psycholinguistic experiments to measure how speakers understand and use the patterns of sounds. Using modern linguistic instrumental and psycholinguistic techniques, Professors Andrew Carnie, Diana Archangeli, Michael Hammond, Natasha Warner and Adam Ussishkin, along with Scottish Gaelic native speaker Muriel Fisher, will investigate the articulation, patterning, and perception of initial consonant mutations. For example, this study will determine whether speakers store both the mutated and non-mutated words in their minds by asking them to identify the word while they listen to obscured (or "masked") speech sounds. A related experiment, where the sounds people hear are subtly artificially modified (or "gated"), will be used to study the exact point in the sound stream at which listeners can identify whether the word is mutated or non-mutated. A study of the relative statistical frequency of sounds in mutated and non-mutated words in a collection of language data (or corpus) will show how productive the process is and how it corresponds to statistics on the frequency of sounds in the larger grammatical system of the language. Psycholinguistic techniques, including judgments and nonsense word tasks, will be used to investigate the mental procedures speakers use to produce these mutations. Finally, the actual articulation of these mutations will be investigated using modern phonetic instrumental measures such as ultrasound and airflow volume. The output of this research project will be a description of the Gaelic consonant mutation that will help complete an on-going description of the language. In addition, in doing this research, graduate students will be trained in the techniques of sound analysis of an endangered language. This training will allow the students to conduct similar experimental studies on other endangered languages. This work has significant implications for documenting and preserving the linguistic traditions of the Scottish and Scottish-American communities.

2010 Grants:

Natasha Warner with co-PI Miguel Simonet received an NSF Grant for the project entitled "Speech Reduction across Languages and Dialects"


The speech humans produce every day in casual conversation is incredibly varied, with sounds and whole syllables changed or missing. American English listeners notice nothing unusual when hearing such "reduced speech" in context; however, second-language speakers and even listeners from other English-speaking countries often find American English reduced speech difficult to understand. The current research centers on how speakers and listeners use reduced, spontaneous speech across languages and dialects, and on how such speech may hinder or even facilitate communication among speakers of different backgrounds. The project will test speakers of Dutch, Spanish, Japanese, and three dialects of English to determine 1) to what extent reduction is language-specific and part of the grammar rather than random or physically-determined variability, 2) whether the sound patterns of the native language influence phonetic variability at the level of spontaneous speech in the second language, 3) how strongly dialect affects understanding of reduced speech, and 4) how degree of proficiency, years of experience, strength of ethnic/national identity, etc. affect production and understanding of reduction. The overarching theoretical question is, what is part of the learned grammar and what is low-level variability. Furthermore, the project will provide data on theoretical questions about exemplar models of speech perception, mutual effects between speakers' first and second languages, and articulatory planning. 

Through globalization, immigration, and telecommunications, humans in the modern world often interact across language backgrounds. Native English speakers interact with non-native speakers and speakers of different English dialects interact with each other. The current project addresses how humans handle the variability of conversational speech in communicative situations. Detailed knowledge of natural, reduced speech, gained through this project, will impact speech technology and how humans interact with computers by voice, benefiting speech synthesis and automatic recognition of spontaneous speech. Because the project includes extensive investigation of English proficiency and language background for the experiment participants, it will also shed light on what factors make it easier or harder for non-native listeners to understand conversational speech in their second language. This project forms a synergistic international collaborative group to answer these questions.

2005 Grants:

Natasha Warner with Community Collaborator Quirina Luna received an NEH Grant for the project entitled "Database of Mutsun, an Extinct California American Indian Language"

The Mutsun language is a Costanoan (Ohlone) language of California that was spoken in the area of the modern towns of Gilroy and San Juan Bautista.  The last fluent speaker passed away in 1930, but there is a large quantity of written documentation of the language, made by early linguists and a missionary working with fluent native speakers of Mutsun from 1807 to 1930.  The purpose of this project is to enter all the data on Mutsun into a database, analyze it, and use this to produce a dictionary and text collection.  These materials also lead to language learning among the project members, and allow creation of language teaching materials for the Mutsun community.  This project has language revitalization in the Mutsun community as its primary purpose, but the data also allows us and future linguists to answer theoretical questions about linguistics, for example about metathesis. More information can be found here.

Natasha Warner, in collaboration with Anne Cutler, received a contract funded by the Max Planck Institute for Psycholinguistics for the project entitled "Perception of Speech Sounds: A Diphone-based Investigation"

Speech sounds overlap in time, with perceptual cues to one sound spreading into the neighboring sound.  For example, /k/ sounds different if the following vowel is /i/ (in "key") vs. /o/ (in "coal"), and it sounds different yet again if it is before /n/ (e.g. in "acknowledge") or /s/ (in "axe").  Listeners are adept at perceiving overlapping cues and using them to extract information from the speech signal gradiently as it becomes available over time.  The purpose of this project is to create a large database of information on how American English listeners perceive sounds of the speech stream over time, and how sure they are of what sounds they're hearing at any given moment during the speech signal.  The data from this project is publicly available, for use by other researchers, and can be downloaded from the Diphones sub-page of Warner's web page (http://www.u.arizona.edu/~nwarner/).