Phonology is the study of sound patterns in spoken language. These patterns include contrastive sound inventories, regularities in sound distribution, and alternations of sounds. Components of sound patterns are usually expressed in terms of five basic phonological category types: (1) distinctive features, which are thought to be the smallest identifiable elements; (2) contrastive segments or phonemes (vowels, consonants, glides), which are typically composed of features; (3) contrastive tones or tonemes; (4) timing or weight units, which are abstract units of duration that can be associated to features, segments, and tones; and (5) prosodic constituents, which are groupings of features and segments into larger units like syllables, metrical feet, and intonational phrases. The arrangement of syllables into feet, feet into prosodic words, words into phrases, and phrases into utterances is referred to as the prosodic hierarchy. Phonology is distinct from phonetics, which investigates physical properties of speech sounds, be they acoustic, aerodynamic, articulatory, or auditory, although there is overlap between the two fields.
History
The study of sound patterns has a long tradition going back more than 2,000 years to Tholkaappiyam, an anonymous Old Tamil [oldt1248] grammar, and to Pāṇini’s grammar of Sanskrit [sans1269] from ca. 350 BCE.¹ Pāṇini’s grammar formulated a contrastive sound inventory, a list of natural classes of sounds, rules of pronunciation targeting these classes, and ordering principles for these rules. About 1,000 years later, Sibawayh’s 8th century treatise on Arabic [stan1318] grammar included a discussion of sound patterns distinct from phonetics and emphasized the rule-governed nature of alternations.
Although many aspects of modern phonology were introduced in these ancient texts, the term phoneme (or a translation equivalent), referring to a minimal unit of contrast, and the modern view of phonology rooted in these units of contrast were formalized only in the late 19th century. These ideas were central to the work of Jan Baudouin de Courtenay of the Kazan School and later to Prague School phonology of the 20th century, led by Nicolai Trubetzkoy and Roman Jakobson (Anderson, 2021). Prague School phonology gave rise to generative phonology (Goldsmith & Laks, 2023), most closely associated with Morris Halle (professor of linguistics at the Massachusetts Institute of Technology from 1951 to 1996), his many students and colleagues, and the theory detailed in The Sound Pattern of English (Chomsky & Halle, 1968). The 20th century also saw independent developments in prosodic, metrical, and tonal phonology. At the University of London, John Rupert Firth proposed autonomous prosodies including tone, nasality, and voice quality, informed by African and Asian languages (Firth, 1948, 1957). These features were independent of specific vowels or consonants and could be associated with larger domains like syllable and word. In the Americas, Kenneth L. Pike’s work on the complex tone patterns of San Miguel El Grande Mixtec [sanm1295] inspired his framework of Tagmemics, bringing phonology into a wider context of language structure and use (Pike, 1948, 1967). In addition, Pike’s techniques for identifying phonological components were used for many years to train Summer Institute of Linguistics fieldworkers, yielding hundreds of detailed phonological descriptions of the world’s languages.
Core concepts
Distinctive features
Although contrasts are typically embodied by phonemes and tonemes, the minimal units of contrast in phonological systems are distinctive features that compose or describe segments. Most phonologists assume a set of 25 to 35 features to describe all contrastive sounds in the world’s languages. To illustrate how features operate, consider the consonant phonemes of Central Rotokas [roto1249]: /p t k b d g/ (Firchow & Firchow, 1969; Robinson, 2006). A minimal analysis of this system of consonantal contrasts makes use of only three binary-valued features, as shown in Table 1.
Table 1. Consonant phonemes of Central Rotokas
Feature | p | t | k | b | d | g |
[voice] | - | - | - | + | + | + |
[labial] | + | - | - | + | - | - |
[coronal] | - | + | - | - | + | - |
The distinctive feature [+voice] specifies /b d g/ as the natural class of sounds produced with vocal fold vibration, whereas /p t k/ are [-voice]. Place of articulation is also contrastive in Rotokas: [+labial] distinguishes /p b/ (made with the lips) from other consonants, whereas [+coronal] distinguishes /t d/ (made with the tongue tip) from other consonants. Although abbreviated as /p/, the phonological representation of this Central Rotokas phoneme is a feature matrix or feature list: /p/ can be described abstractly as [-voice, +labial, -coronal] in contrast to /b/, which is [+voice, +labial, -coronal].
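As a rough illustration of how a feature matrix picks out natural classes, the sketch below encodes the values of Table 1 as a Python dictionary. The function names `natural_class` and `matrix` are invented for this illustration and do not correspond to any standard phonological software.

```python
# Feature values transcribed from Table 1 (True = "+", False = "-").
FEATURES = {
    "p": {"voice": False, "labial": True, "coronal": False},
    "t": {"voice": False, "labial": False, "coronal": True},
    "k": {"voice": False, "labial": False, "coronal": False},
    "b": {"voice": True, "labial": True, "coronal": False},
    "d": {"voice": True, "labial": False, "coronal": True},
    "g": {"voice": True, "labial": False, "coronal": False},
}


def natural_class(**spec):
    """Return the phonemes whose features match every value in spec."""
    return [
        segment
        for segment, feats in FEATURES.items()
        if all(feats[name] == value for name, value in spec.items())
    ]


def matrix(segment):
    """Format a phoneme's feature values as a bracketed feature list."""
    feats = FEATURES[segment]
    signed = [("+" if value else "-") + name for name, value in feats.items()]
    return "[" + ", ".join(signed) + "]"


print(natural_class(voice=True))   # the voiced class
print(natural_class(labial=True))  # the labial class
print(matrix("p"), matrix("b"))
```

Because the six feature matrices are pairwise distinct, three binary features are enough to keep all six consonants apart in this toy encoding.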
Segments and autosegments
All spoken human languages make use of distinct vowels and consonants to distinguish words from each other. Vowels and consonants together are referred to as segments, and the study of segments and the features that compose them is called segmental phonology. There is great diversity in the segmental phonologies of the world’s languages: in Central Rotokas, there are only six distinct consonants (Table 1), whereas in Ubykh [ubyk1235], there are 80; in Nuer dialects [nuer1246], there are more than 20 distinctive vowels, whereas in some Abkhaz-Adyge languages, there are only two.
All spoken languages also make use of intonation encoded phonologically as sequences of level tones. These tone melodies are distinctive fundamental frequency (F0) contours that express meaningful differences at the level of the phrase. For example, in many languages a sentence is interpreted as a statement if it has a falling (high-to-low) pitch but as a question if the same segmental string has a rising (low-to-high) pitch. The study of tone and other features that take domains larger than the segment is called autosegmental phonology.
The most common autosegmental feature is tone. Languages in which relative F0 values on vowels, moras, or syllables encode lexical and grammatical distinctions are called tone languages. As with segments, there is great diversity in the tonal phonologies of the world’s languages: Shona [shon1251] has a minimal tonal contrast between high and low tone; Nuer distinguishes high, mid, and low; and Itunyoso Trique [sanm1298] has nine tonal contrasts on monosyllables, analyzed as level tones 1, 2, 3, and 4 and contour tones 31, 32, 43, 13, and 35, in which 1 is the lowest pitch level and 5 the highest.
Other common autosegments are features for (vowel) backing, rounding, tongue root advancement, and nasality. When any one of these features is associated with multiple syllables, or an entire word, the sound pattern is referred to as harmony. Turkish [nucl1301] has rounding and backing harmony, Paraguayan Guarani [para1311] has nasal harmony, and Akan [akan1250] has advanced tongue root harmony.
Phonological length and segments vs. clusters
Many languages distinguish short and long segments. In rare cases, as in Agar Dinka [sout2833], three degrees of length are found. The representation of phonological length is illustrated in Figure 1, in which X is an abstract timing unit, and the string of Xs is referred to as the timing tier.

Figure 1. Representing phonological length with timing units (X = one unit)
Short, long, and extra-long segments are compared with sequences of short segments, and short complex segments like diphthongs and affricates are compared with segment sequences. Many-to-one and one-to-many mappings between features or segments and the timing tier can encode many other contrasts, including prenasalized stops vs. nasal-stop clusters, aspirated stops vs. stop-h sequences, complex labio-velar stops vs. labial-velar stop sequences, and contour tones vs. sequences of level tones. These contrasts are supported by phonological evidence in many languages, despite the fact that a single segment like /t͡ʃ/ and a cluster like /tʃ/ may be phonetically indistinguishable.
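The many-to-one and one-to-many mappings described here can be sketched as a small data structure. This is only an illustrative encoding: the class name `Association` and the helper functions are hypothetical, and real autosegmental representations carry more structure than a count of timing units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """One or more melodic elements linked to some number of X slots."""
    melody: tuple  # melodic elements (segments or features)
    slots: int     # number of timing units (X) they are linked to


def timing_tier(form):
    """Spell out the timing tier of a form as a string of Xs."""
    return "X" * sum(assoc.slots for assoc in form)


def kind(assoc):
    """Classify one association by its melody-to-slot mapping."""
    if len(assoc.melody) > 1 and assoc.slots == 1:
        return "complex segment"
    labels = {1: "short", 2: "long", 3: "extra-long"}
    if len(assoc.melody) == 1 and assoc.slots in labels:
        return labels[assoc.slots]
    raise ValueError("mapping not covered by this sketch")


# An affricate: two melodic elements sharing one timing unit.
affricate = [Association(("t", "ʃ"), 1)]
# A cluster: two short segments, each with its own timing unit.
cluster = [Association(("t",), 1), Association(("ʃ",), 1)]
# A long vowel: one melodic element linked to two timing units.
long_vowel = [Association(("a",), 2)]

print(timing_tier(affricate), timing_tier(cluster), timing_tier(long_vowel))
```

In this encoding the affricate and the cluster contain the same melodic material and differ only in the number of timing units, which is the contrast the text describes.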
Phonotactics
Rules governing possible phoneme sequences are called phonotactics. In Central Rotokas, every consonant must be followed by a vowel; there are no consonant sequences. This phonotactic may be expressed in terms of the syllable template (C)V, in which parentheses indicate optionality.
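A (C)V template of this kind translates directly into a regular expression. The sketch below is a minimal illustration: the consonants come from Table 1, but the vowel set `aeiou` is an assumption (the text does not list the vowels), vowel length is ignored, and the test strings are made up rather than attested Rotokas words.

```python
import re

CONSONANTS = "ptkbdg"  # consonant phonemes from Table 1
VOWELS = "aeiou"       # assumption: the vowel inventory is not given in the text

# One (C)V syllable: an optional consonant followed by exactly one vowel.
SYLLABLE = f"[{CONSONANTS}]?[{VOWELS}]"
WELL_FORMED = re.compile(f"(?:{SYLLABLE})+")


def is_well_formed(form):
    """True if the whole string can be parsed as a sequence of (C)V syllables."""
    return WELL_FORMED.fullmatch(form) is not None


def syllabify(form):
    """Split a well-formed string into its (C)V syllables."""
    if not is_well_formed(form):
        raise ValueError(f"{form!r} violates the (C)V template")
    return re.findall(SYLLABLE, form)


print(is_well_formed("pita"))  # CV.CV
print(is_well_formed("tka"))   # consonant sequence
print(is_well_formed("pit"))   # final consonant with no following vowel
print(syllabify("aita"))
```

Strings with consonant sequences or a final consonant fail the check because every consonant in the template must be immediately followed by a vowel.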
Prosodic constituents
Prosodic constituents are groupings of features or segments into larger units. The most widely recognized prosodic constituents, from smaller to larger, are syllable (with sub-constituents nucleus, onset, coda), metrical foot, prosodic word (known also as phonological word), phonological phrase, and intonational phrase. Grouping segments into prosodic or metrical constituents makes use of many kinds of phonological evidence, from native speaker intuitions to language play, text-to-tune mappings, stress patterns, intonation contours, and prosodically conditioned alternations, and may also be determined by syntactic constituency (Lahiri & Plank, 2022). Prosodic structure in Yine [yine1238] is shown for the utterance in the first row of Table 2, in which periods mark syllable boundaries (based on native speaker slow/careful speech), and parentheses delineate metrical feet. Yine feet are trochaic: the first syllable is strong, and the second is weak. Penultimate primary stress and alternating secondary stress from the beginning of the word demonstrate that the utterance constitutes a single prosodic word (Hanson, 2010, 25–27, 37; Matteson, 1965). Note that the prosodic structure is built on a word that lacks certain vowels (in bold in the morphological structure) that are underlyingly present in constituent morphemes (see below) [see Morphology].
Table 2. Phonological constituents of a Yine utterance including prosodic structure
Prosodic structure | [(ˌsa.çri).(ˌkhi.ma).(ˌtka.na).kta.(ˈtka.na)]ProsodicWord |
Morphological structure | Ø-saçrɨka-hima-ta-ka-na-kta-tka-na |
Morph-by-morph gloss | 3-surround-QUOT-VCL-PASS-CMPV-GENZ-PFV-3PL |
Translation | 'they were completely surrounded everywhere, reportedly' |
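The stress pattern described for this word (penultimate primary stress, plus alternating secondary stress counted from the beginning of the word) can be approximated by a short procedure. The function `parse_feet` below is a hypothetical simplification inferred from that description; it is not the analysis given by Hanson (2010) or Matteson (1965), and it assumes the syllabification is already known.

```python
def parse_feet(syllables):
    """Group syllables into trochaic feet.

    The final two syllables form the primary-stress foot. The remaining
    syllables are paired from the left into secondary-stress feet, and a
    leftover syllable stays unfooted.
    """
    if len(syllables) < 2:
        raise ValueError("this sketch needs at least two syllables")
    body, final = syllables[:-2], syllables[-2:]
    parts = []
    i = 0
    while i + 1 < len(body):
        parts.append("(ˌ" + ".".join(body[i:i + 2]) + ")")
        i += 2
    if i < len(body):
        parts.append(body[i])  # unfooted syllable
    parts.append("(ˈ" + ".".join(final) + ")")
    return ".".join(parts)


syllables = ["sa", "çri", "khi", "ma", "tka", "na", "kta", "tka", "na"]
print(parse_feet(syllables))
```

Applied to the nine syllables of the Yine word, this procedure yields the footing shown in the prosodic structure row of Table 2, with the seventh syllable left unfooted.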
Alternations
Sometimes phonemes have different phonetic realizations that are noncontrastive. These allophonic variants can occur for many reasons. For instance, their realization may depend on linguistic context (e.g., preceding or following sounds, position within the syllable, position within the word) or rate of speech, or they might be conditioned by sociolinguistic variables or vary freely amongst speakers of the language [see Linguistic Variation]. In Central Rotokas, each of the voiced phonemes /b d g/ in Table 1 has a range of allophones: /b/ can be pronounced as [b], [m], or [β]; /d/ as [d], [n], [l], or [ɾ]; and /g/ as [g], [ŋ], or [ɣ]. The alternations between these sounds highlight a strength of the phonological analysis shown in Table 1: nasality and continuancy are not contrastive features in Central Rotokas; therefore, pronunciation of /b d g/ can vary along these dimensions without compromising segmental discriminability.
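The point about discriminability can be made concrete with a small lookup table. In the sketch below, the allophone sets are the ones listed in the text; the function name `phonemicize` and the sample phone sequence are invented for illustration, and the conditioning of each allophone is not modeled.

```python
# Allophones of the voiced phonemes, as listed in the text.
ALLOPHONES = {
    "b": ["b", "m", "β"],
    "d": ["d", "n", "l", "ɾ"],
    "g": ["g", "ŋ", "ɣ"],
}

# Invert the table: each surface phone maps back to exactly one phoneme.
PHONEME_OF = {
    phone: phoneme
    for phoneme, phones in ALLOPHONES.items()
    for phone in phones
}


def phonemicize(phones):
    """Map a sequence of surface phones to phonemes; other symbols pass through."""
    return [PHONEME_OF.get(phone, phone) for phone in phones]


# A made-up phone sequence, not an attested word.
print(phonemicize(["m", "a", "n", "a"]))
```

No phone appears in more than one allophone set, so the inverse mapping is well defined: whichever variant is pronounced, the phoneme can still be recovered.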
When a phoneme has distinct non-allophonic realizations that are contextually determined, these are referred to as (non-allophonic) alternations. Alternations are often neutralizing, eliminating contrast in a specific environment. Alternations may also involve deletion or insertion of a feature or segment, like the loss of bolded medial vowels in Yine in Table 2. In cases like this, underlying forms, distinct from surface forms, are motivated. Within generative phonology and its descendants, contextually predictable alternations, whether allophonic or non-allophonic, are captured by phonological rules or phonological constraints.
Phonology in other domains
The study of historical phonology is primarily focused on the reconstruction of contrastive sound inventories and regularities in sound distribution and on sound changes as they have occurred over time. Specialists in language acquisition [see Language Acquisition] study how phonological systems are acquired, whereas laboratory phonologists design experimental conditions to test phonological hypotheses.
Questions, controversies, and new developments
Nearly every central tenet of modern phonological theory has been questioned over the last half-century. Take, for example, the proposal in The Sound Pattern of English that distinctive features are innate properties of the human language faculty (Chomsky & Halle, 1968). A careful review of segment inventories and alternations in hundreds of languages suggests that no single distinctive feature system is adequate and that phonological features are most likely emergent, learned properties of grammar (Mielke, 2008).
A wider question in phonology is whether universal tendencies should be directly encoded in phonological grammars or attributed to aspects of general cognition and language use. Consider the universal preference for voiceless sounds like /p t k/ in Table 1 over voiced sounds like /b d g/, which was emphasized in Prague School work of the early 20th century (Anderson, 2021, 126–132). Some approaches, like optimality theory (Kager, 1999; Prince & Smolensky, 1993), encode this preference as an explicit constraint type in the grammar: a markedness constraint determines that /p t k/ are in some sense better segments than /b d g/. Alternative approaches, including laboratory phonology (Ohala, 1974, 1983), evolutionary phonology (Blevins, 2004), and typological usage-based frameworks (Evans & Levinson, 2009; Haspelmath, 2006), suggest extra-grammatical explanations for this and many other universal tendencies. In this particular case, the cross-linguistic preference for voiceless oral stops in sound inventories and neutralizing alternations follows from several factors: the aerodynamics of oral stop production that inhibit vocal fold vibration (Ohala, 1997) and phonetic contexts in which stop voicing is less likely to be produced or perceived (Blevins, 2006). As these controversies continue, increased synergy between phonology and phonetics, especially in laboratory phonology and language documentation and description, provides a basis for incremental progress (Blevins et al., 2020).
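To show what it means to encode such a preference as a ranked constraint, here is a toy evaluation in the spirit of optimality theory. It is a simplified sketch, not the formalism of Prince and Smolensky (1993): the constraint functions, the candidate set, and the input form are all hypothetical, and candidates are assumed to differ from the input only in stop voicing.

```python
VOICED_STOPS = {"b", "d", "g"}


def star_voiced_stop(underlying, candidate):
    """Markedness: one violation per voiced stop in the candidate."""
    return sum(segment in VOICED_STOPS for segment in candidate)


def ident_voice(underlying, candidate):
    """Faithfulness: one violation per segment that differs from the input.

    Candidates here differ from the input only in stop voicing.
    """
    return sum(u != c for u, c in zip(underlying, candidate))


def evaluate(underlying, candidates, ranking):
    """Pick the candidate with the best violation profile.

    Profiles are compared constraint by constraint in ranking order, so a
    higher-ranked constraint always outweighs any lower-ranked one.
    """
    return min(
        candidates,
        key=lambda cand: tuple(con(underlying, cand) for con in ranking),
    )


candidates = ["ba", "pa"]  # made-up forms for illustration
print(evaluate("ba", candidates, [star_voiced_stop, ident_voice]))
print(evaluate("ba", candidates, [ident_voice, star_voiced_stop]))
```

With markedness ranked above faithfulness the voiceless candidate wins; with the opposite ranking the voiced input surfaces unchanged. The extra-grammatical approaches cited above would derive the same preference without positing a constraint like `star_voiced_stop` in the grammar.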
Broader connections
Although phonologists generally focus on human sounds and sound patterns, there are natural human languages that do not use sound. These are sign languages [see Sign Language]. Linguistic work on sign language phonology suggests that it is subject to many of the same cognitive and communicative pressures that guide the development of spoken languages. These findings may shed new light on the controversies mentioned above.
While human languages have sounds and sound patterns, phonology may not be unique to our species. Segmental contrasts and prosodic structure may be found in birdsong, and mammalian vocal tracts may be speech-ready, so that what other mammals lack is the cognitive capacity to talk (Fitch, 2018; Fitch et al., 2016; Mann et al., 2021; Wohlgemuth et al., 2010).
Further reading
Bybee, J. (2001). Phonology and language use. Cambridge University Press.
Goldsmith, J. A., Riggle, J., & Yu, A. C. L. (Eds.) (2011). The handbook of phonological theory (2nd ed.). Blackwell.
Liberman, M. (2018). Towards progress in theories of language sound structure. In D. Brentari & J. Lee (Eds.), Shaping phonology (pp. 201-222). University of Chicago Press.
Odden, D. (2013). Introducing phonology (2nd ed.). Cambridge University Press.
Footnotes
1. In this entry, named languages are followed by Glottolog language codes in brackets, which provide unique, stable identifiers for all known human languages (see https://glottolog.org/).
References
Anderson, S. R. (2021). Phonology in the twentieth century: Second edition, revised and expanded. Language Science Press.
Blevins, J. (2004). Evolutionary phonology: The emergence of sound patterns. Cambridge University Press.
Blevins, J. (2006). A theoretical synopsis of evolutionary phonology. Theoretical Linguistics, 32(2), 117-166. https://doi.org/10.1515/TL.2006.009
Blevins, J., Egurtzegi, A., & Ullrich, J. (2020). Final obstruent voicing in Lakota: Phonetic evidence and phonological implications. Language, 96(2), 294-337. https://doi.org/10.1353/lan.2020.0022
Chomsky, N., & Halle, M. (1968). The sound pattern of English. Harper & Row.
Evans, N., & Levinson, S. C. (2009). The myth of language universals: Language diversity and its importance for cognitive science. Behavioral and Brain Sciences, 32(5), 429-492. https://doi.org/10.1017/S0140525X0999094X
Firchow, I. B., & Firchow, J. (1969). An abbreviated phonemic inventory. Anthropological Linguistics, 11(9), 271-276.
Firth, J. R. (1948). Sounds and prosodies. Transactions of the Philological Society, 47(1), 127-152. https://doi.org/10.1111/j.1467-968X.1948.tb00556.x
Firth, J. R. (1957). Papers in linguistics: 1934–1951. Oxford University Press.
Fitch, W. T. (2018). What animals can teach us about human language: The phonological continuity hypothesis. Current Opinion in Behavioral Sciences, 21, 68-75. https://doi.org/10.1016/j.cobeha.2018.01.014
Fitch, W. T., de Boer, B., Mathur, N., & Ghazanfar, A. A. (2016). Monkey vocal tracts are speech-ready. Science Advances, 2(12), e1600723. https://doi.org/10.1126/sciadv.1600723
Goldsmith, J. A., & Laks, B. (2023). Generative phonology: Its origins, its principles, and its successors. In L. R. Waugh, M. Monville-Burston, & J. E. Joseph (Eds.), The Cambridge history of linguistics (pp. 704-727). Cambridge University Press. https://doi.org/10.1017/9780511842788.035
Hanson, R. (2010). A grammar of Yine (Piro) [Doctoral dissertation, La Trobe University]. Open at La Trobe. https://opal.latrobe.edu.au/articles/thesis/A_grammar_of_Yine_Piro_/21847407?file=38770605
Haspelmath, M. (2006). Against markedness (and what to replace it with). Journal of Linguistics, 42(1), 25-70. https://doi.org/10.1017/S0022226705003683
Kager, R. (1999). Optimality theory. Cambridge University Press.
Lahiri, A., & Plank, F. (2022). Phonological phrasing: Approaches to grouping at lower levels of the prosodic hierarchy. In B. E. Dresher & H. van der Hulst (Eds.), The Oxford history of phonology (pp. 134-162). https://doi.org/10.1093/oso/9780198796800.003.0007
Mann, D. C., Fitch, W. T., Tu, H. W., & Hoeschele, M. (2021). Universal principles underlying segmental structures in parrot song and human speech. Scientific Reports, 11, 776. https://doi.org/10.1038/s41598-020-80340-y
Matteson, E. (1965). The Piro (Arawakan) language. University of California Press.
Mielke, J. (2008). The emergence of distinctive features. Oxford University Press.
Ohala, J. J. (1974). Phonetic explanation in phonology. In A. Bruck, R. A. Fox, & M. W. LaGaly (Eds.), Papers from the parasession on natural phonology (pp. 251-274). Chicago Linguistic Society.
Ohala, J. J. (1983). The origin of sound patterns in vocal tract constraints. In P. F. MacNeilage (Ed.), The production of speech (pp. 189-216). Springer-Verlag. https://doi.org/10.1007/978-1-4613-8202-7_9
Ohala, J. J. (1997). Aerodynamics of phonology. In Proceedings of the 4th Seoul International Conference on Linguistics (pp. 92-97). Linguistic Society of Korea.
Pike, K. L. (1948). Tone languages: A technique for determining the number and type of pitch contrasts in language, with studies in tonemic substitution and fusion. University of Michigan Press.
Pike, K. L. (1967). Language in relation to a unified theory of the structure of human behavior. Mouton.
Prince, A., & Smolensky, P. (1993). Optimality theory: Constraint interaction in generative grammar. Rutgers University. https://doi.org/10.7282/T34M92MV
Robinson, S. (2006). The phoneme inventory of the Aita dialect of Rotokas. Oceanic Linguistics, 45(1), 206-209. https://doi.org/10.1353/ol.2006.0018
Wohlgemuth, M. J., Sober, S. J., & Brainard, M. S. (2010). Linked control of syllable sequence and phonology in birdsong. Journal of Neuroscience, 30(39), 12936-12949. https://doi.org/10.1523/JNEUROSCI.2690-10.2010