Phonetics in Phonology
John J. Ohala
University of California, Berkeley
At least since Trubetzkoy (1933, 1939) many have thought of phonology and phonetics as
separate, largely autonomous, disciplines with distinct goals and distinct methodologies. Some
linguists even seem to doubt whether phonetics is properly part of linguistics at all (Sommerstein
1977:1). The commonly encountered expression ‘the interface between phonology and phonetics’
implies that the two domains are largely separate and interact only at specific, circumscribed points.
In this paper I will attempt to make the case that phonetics is one of the essential areas of study
for phonology. Without phonetics (and allied empirical disciplines such as psycholinguistics and
sociolinguistics), I would maintain, phonology runs the risk of being a sterile, purely
descriptive and taxonomic discipline; with phonetics it can achieve a high level of explanation and
prediction as well as finding applications in areas such as language teaching, communication
disorders, and speech technology (Ohala 1991).
The central task within phonology (as well as in speech technology, etc.) is to explain the
variability and the patterning -- the “behavior” -- of speech sounds. What are regarded as
functionally the ‘same’ units, whether word, syllable, or phoneme, show considerable physical
variation depending on context and style of speaking, not to mention speaker-specific factors.
Documenting and explaining this variation constitutes a major challenge. Variability is evident in
several domains: in everyday speech where the same word shows different phonetic shapes in
different contexts, e.g., the release of the /t/ in tea has more noise than that in toe when spoken in
isolation. Variability also manifests itself dialectally, morphologically, and in sound change. All
of these forms of variation are related. Today’s allophonic variation can lead to tomorrow’s
sound change. Sound change that takes place in one language community and not another leads to
dialectal variation; sound change that occurs in one morphological environment and not another
leads to morphophonemic variation. But the variable behavior of speech sounds is not random;
there are statistically favored patterns in it. Part of our task in explaining sound patternings, then,
is to attempt to understand the universal factors that give rise to allophonic variation and how they
can lead to sound change.
Below I will first give a brief sketch of two areas -- among many possible -- where phonetics
can provide a principled and empirically-supported account of certain sound patterns (see also
Ohala 1992, 1993). Then I will give an account of sound change that connects the phonetic
variation to the phonological variation.
2. Phonetic Accounts of Sound Patterns
2.1. The Aerodynamic Voicing Constraint
The aerodynamic voicing constraint (AVC) (which I treat in more detail in another paper
presented at this SICOL, “Aerodynamics of phonology”) provides an example of a phonetic
constraint on speech production. It is manifest phonetically in everyday speech as well as having
an impact on the phonology of languages through sound change. Briefly, the AVC arises as
follows: voicing requires that the vocal cords be lightly approximated and that there be air flowing
through them. During a stop, even if the vocal cords are in the right configuration, air will
accumulate in the oral cavity and eventually reach the same level of air pressure as that in the
trachea. When the pressure differential across the glottis falls to or near zero, the airflow is
reduced to the point where vocal cord vibration ceases.
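The pressure mechanics just described can be put in a small numerical sketch. This is an illustration only: it assumes a rigid (non-expanding) oral cavity, a simple linear glottal resistance, and round illustrative parameter values, none of which are taken from measured data.

```python
# Minimal sketch of the AVC: oral pressure buildup behind a stop closure.
# Assumptions (illustrative, not measured): rigid oral cavity, linear
# glottal resistance, and a voicing threshold of ~2 cm H2O transglottal
# pressure below which vocal cord vibration ceases.

SUBGLOTTAL_P = 8.0       # cm H2O, a typical speech value
CAVITY_VOL = 100.0       # cm^3, assumed fixed oral cavity volume
GLOTTAL_RES = 40.0       # cm H2O per (l/s), illustrative glottal resistance
VOICING_THRESHOLD = 2.0  # cm H2O minimum transglottal pressure for voicing
DT = 0.001               # s, integration time step

def time_to_devoicing() -> float:
    """Integrate glottal airflow into the closed cavity; return the time (s)
    at which the transglottal pressure drop falls below the voicing threshold."""
    oral_p = 0.0  # cm H2O above atmospheric
    t = 0.0
    while SUBGLOTTAL_P - oral_p > VOICING_THRESHOLD:
        flow = (SUBGLOTTAL_P - oral_p) / GLOTTAL_RES  # l/s through the glottis
        # Pressure rise in a fixed volume (Boyle's law, linearized;
        # atmospheric pressure ~ 1033 cm H2O):
        oral_p += 1033.0 * (flow * 1000.0 * DT) / CAVITY_VOL
        t += DT
    return t

print(f"voicing ceases after ~{time_to_devoicing() * 1000:.0f} ms")
```

With rigid walls the model extinguishes voicing within a few milliseconds; letting the cavity expand, passively or actively, lowers the rate of pressure rise and prolongs voicing, which is exactly the trade-off discussed below.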
There are ways to moderate the effects of the AVC -- it is not an absolute constraint against
voicing in obstruents. Obviously many languages have voiced stops, even voiced geminate stops,
e.g., Hindi. One way to moderate the constraint is to allow the oral cavity to expand passively, thus creating more
room for the accumulating air and in that way delaying the moment when oral pressure equals
subglottal pressure. One can also actively expand the oral cavity, e.g., by enlarging the pharynx,
lowering the larynx and the jaw, and thus prolonging voicing even more. But these maneuvers
have their own limits and costs and therefore phonological consequences. To exploit passive
expansion of the vocal tract, one must keep the duration of the stop somewhat short (at least in
comparison to the duration of cognate voiceless stops). A consequence of this, I believe, is that
intervocalic voiced stops, because they need to have their closure interval kept short, are more
likely to cross the stop vs. “spirant” boundary and become voiced spirants or approximants than is
true of intervocalic voiceless stops. This is evident, e.g., in Spanish, where stops that are voiced
plosives breath-group-initially have voiced spirant allophones intervocalically: /ˈbaɲo/ ‘bath’ but /ˈnaβo/
‘turnip’ (the voiceless stops show no such manner change in the same environments: /ˈpiko/
‘beak’, /ˈkapa/ ‘cape’). Given the “cost” of maintaining voicing in spite of the AVC, one finds an
asymmetrical incidence of voicing in geminate stops. As noted by Jaeger (1978), although both
voiced and voiceless geminate stops are attested, in many languages there are only voiceless
geminates. Moreover, in some cases we can trace the history of geminates and their voicing.
There are many instances of voiced geminate stops becoming voiceless but I am unaware of any
cases of voiceless geminate stops becoming voiced (Klingenheben 1927). Moreover, whether
passive or active expansion of the oral cavity solves the problem of how to maintain voicing during
a stop, the possibilities for such expansion are less with back-articulated stops such as velars and
uvulars than with front-articulated ones such as labials and apicals. Thus there are many instances
of languages having a voicing distinction in stops but lacking a voiced velar stop, e.g., Dutch, Thai,
Czech (in native vocabulary) (Gamkrelidze 1975, Sherman 1975). In Nobiin Nubian
morphologically-derived geminates from voiced stops retain voicing with labials but not with stops
articulated further back: /fabːɔn/ (< /fab/ ‘father’ + suffix) but /mʊkːɔn/ (< /mʊg/ ‘dog’ + suffix)
(Bell 1971, Ohala 1983).
2.2. Acoustic-Perceptual Factors in Changes in Place of Articulation
A quite familiar process in speech sound variation is the assimilation of the place of articulation
of a consonant to that of an adjacent consonant, e.g., in English the final stop of wide is alveolar
but that in the related derived word width is dental under the influence of the adjacent dental
fricative [θ]. Here it is one articulator, the tongue apex, which shifts its place because it is also
involved in making an adjacent sound at a place different from its original place. But there are
some cases of consonantal changes in place of articulation where the articulators involved before
and after the change are distinct. Although these have often been characterized as being
articulatorily-motivated changes, a more careful examination shows that this cannot be the case.1
Representative examples of the cases I am referring to are exemplified in Table 1.
Here, as mentioned, the articulators used in the “before” state and the “after” are different.
This is obviously true when p > t / __i,j and k > p / __u, w, where lips and tongue are involved but
it is also true in the case of k > t, tʃ, ʃ, s / __ i, j (also called ‘velar palatalization’), where the
articulator is the tongue dorsum before the change and the tongue apex afterwards. Although both
apex and dorsum are part of the tongue, they are for the most part functionally independent. Thus
this change cannot be exactly like the [t] ~ [t̪] variation in wide ~ width. Further evidence that velar
palatalization is not articulatorily motivated is the fact that the places of articulation of the after
states ([t, tʃ, ʃ, s]) are further forward than that of the conditioning environment ([i, j]).
Table 1. Examples of sound changes involving large changes in place of articulation.

k > t, tʃ, ʃ, s / __ i, j    e.g., cocc + diminutive; racine [ʁasin] ‘root’ < Gallo-Roman
k > p / ___ u, w
p > t / ___ i, j             e.g., Bohemian tɛt ‘five’; Genoese Italian tʃena ‘full’
If there were a purely articulatory motivation for the shift we should rather expect the outcome of
this change to be the palatal consonants [c, ç]. Instead, as argued in Ohala 1986, 1992, velar
palatalization as well as the other two place changes are best explained by the acoustic-perceptual
similarity and thus confusability of the sounds involved. In fact, laboratory-based confusion
studies duplicate these sound changes, showing a high incidence of confusions of the type [ki] >
[ti] (where ‘>’ means ‘is confused with’), [pi] > [ti] and [ku] > [pu] (Winitz et al. 1972; see also
Guion 1996).2 These results show that sound change can be studied in the laboratory (Ohala 1993).
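The asymmetry of such confusions can be tabulated in a toy example like the following. The counts are invented for illustration; they are not the actual figures from Winitz et al. (1972) or Guion (1996).

```python
# Toy confusion tallies of the kind collected in listening experiments.
# Keys are (intended, heard) syllable pairs; counts are INVENTED for
# illustration, not taken from the cited studies.

confusions = {
    ("ki", "ti"): 19, ("ti", "ki"): 3,   # [ki] is misheard as [ti] far more
    ("pi", "ti"): 14, ("ti", "pi"): 2,   # often than the reverse, and so on
    ("ku", "pu"): 11, ("pu", "ku"): 4,
}

def asymmetry(a: str, b: str) -> float:
    """Ratio of a-misheard-as-b responses to b-misheard-as-a responses."""
    return confusions[(a, b)] / confusions[(b, a)]

# The attested sound changes (k > t / __i, p > t / __i, k > p / __u) run in
# the direction with the higher confusion rate:
for a, b in [("ki", "ti"), ("pi", "ti"), ("ku", "pu")]:
    print(f"{a} -> {b} occurs {asymmetry(a, b):.1f}x as often as {b} -> {a}")
```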
3. From phonetic variations to phonological variations
3.1. Theoretical foundations
The types of phonetic constraints discussed above are constant and timeless. They are
responsible for numerous phonetic variations in pronunciation and perception every day in every
language each time a speaker speaks and a listener listens. What is the relationship between these
constant production and perceptual variations in speech and the events designated as sound change
which occur in a particular language at a particular period in history?
My view of this can be stated very simply (see also Ohala 1992, 1993):
1. Physical phonetic constraints in speech production lead to distortions or perturbations of the
speech signal which may make it ambiguous to the listener. These phonetic constraints may
be of various types: neurological, neuro-muscular, articulatory (inertial and elastic properties
of the speech organs), aerodynamic, as well as the constraints governing the mapping of
articulation onto the acoustic signal.
2. The listener occasionally misinterprets or misparses the speech signal due to these ambiguities
and arrives at a different pronunciation norm from that intended by the speaker. A change in
pronunciation norm constitutes a “mini” sound change.
3. Whether the new pronunciation norm is “nipped in the bud”, i.e., eliminated by being
corrected or whether it spreads through the lexicon and from one speaker to the next is
determined by psychological and sociological factors. Unlike the physical phonetic
constraints, these latter factors have a definite historical aspect. They occur in a definite place at
a definite time.
Phonetics has a role in studying the first two of these stages. This can be characterized as studying
and duplicating “mini” sound changes in the laboratory. Together these constitute what might be
called the initiation of sound change or more colorfully, the germination of the seeds of sound
change. Step three covers the transmission or spread of sound change.
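The three steps can be caricatured as a toy Monte Carlo simulation. The probabilities below are invented purely for illustration; the only thing carried over from the proposal itself is the logic that initiation is probabilistic misperception filtered by correction.

```python
# Toy Monte Carlo of the listener-based model of sound-change initiation.
# Both probabilities are INVENTED illustrative values.
import random

MISPERCEPTION_P = 0.02  # step 2: chance an ambiguous token is misparsed
CORRECTION_P = 0.9      # step 3: chance a mini sound change is "nipped in the bud"

def interactions_until_seed(rng: random.Random) -> int:
    """Count speaker-listener interactions until a mini sound change
    both occurs (misperception) and survives correction."""
    n = 0
    while True:
        n += 1
        misheard = rng.random() < MISPERCEPTION_P
        if misheard and rng.random() >= CORRECTION_P:
            return n  # a new pronunciation norm takes root

rng = random.Random(0)
trials = [interactions_until_seed(rng) for _ in range(1000)]
print(f"mean interactions until a surviving mini change: {sum(trials) / len(trials):.0f}")
```

The point of the caricature is the one made in the text: given enough speaker-listener interactions, some misperceptions will always slip through, so initiation is a matter of probability, not of language-specific causes.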
If the above proposal about the relation of universal phonetics on the one hand to language- and
time-specific sound change on the other is accepted, there are some important implications:
• Any attempt to construct truly general, explanatory, theories of natural sound patterns, i.e.,
ones capable of reflecting natural classes of speech sounds and making the maximal
generalizations about speech sound behavior, will have to exploit physical phonetic models of
speech processes. In short, phonological naturalness is based on universal physical phonetic
constraints. Most of the phonological notations in current use in mainstream phonology, e.g.,
autosegmental notation and feature geometry, are inherently incapable of representing such
naturalness in a principled and general way (Ohala 1990a,b, 1995).
• Because the representations that do reflect the naturalness of sound patterns employ complex
mathematical models using continuous parameters, it is extremely unlikely that any of this is
psychological, i.e., it is unlikely that native speakers are aware or need to be aware of the
naturalness of the sound patterns in their language. Native speakers do not need to be aware
of Boyle’s Law in order to be subject to it any more than they have to know chemistry in order
to digest their food. Thus the attempts in mainstream phonology to attribute phonological
naturalness to “Universal Grammar”, part of the psychological/genetic endowment of all
humans, are simply redundant.
• Sound change is not teleological; it does not serve to optimize articulation, perception, or the
way language is processed in the speaker’s brain. It is just an inadvertent error on the part of the
listener.
• As a corollary to the above: sound change is not implemented by a novel or altered rule of
grammar. Just as the transcription errors of a student taking notes on a teacher’s lectures were
intended neither by the teacher nor by the student, so too a listener’s errors in interpreting the speech
signal were not implemented as a rule changing the pronunciation norm either by the speaker
or the listener.
• Many linguists, e.g., Weinreich, Labov, and Herzog (1968), Martinet (1949), Jakobson
(1978), Lass (1980), Vennemann (1993), believe that it should be possible, ideally or actually,
to answer the questions “[w]hy do changes in a structural feature take place in a particular
language at a given time, but not in other languages with the same feature, or in the same
language at other times?” (Weinreich, et al., 1968:102). Insofar as this question may have an
answer, it is not to be found in the initiation of the sound change. If, as I propose, a new
pronunciation norm is initiated when a listener misapprehends the speech signal, the question
of why this occurred reduces to why some listener made such a mistake. But the studies of
sound change in the laboratory show that some percentage of listeners invariably make such
perceptual mistakes. There is always some probability for misperception -- sometimes higher,
sometimes lower -- associated with any given ambiguous signal. Just as in the lab-based
perception studies no one bothers asking why a specific subject A misperceived stimulus B,
so, too, I believe it is fruitless to ask why a given sound change arose in a specific language A
at a specific time B, and not in some other language or not in language A at some other time.
Rather, in both lab studies and in sound change we should be more concerned with the
probability levels for confusion given the total population of speaker-listener interactions --
where the ‘total population’ is all languages at all points in time. It may be possible (though
difficult) to find the social and/or psychological factors which led a given sound change, once
it had been initiated, to spread to other speakers and to other words in the lexicon, in other
words, to become “popular” enough to be characteristic of a whole speech community.
However, most attempts to identify such factors suffer from the “too many degrees of
freedom” problem: a whole host of causative factors can be drawn upon including the
language’s phonology, morphology, spelling, syntax, lexicon, semantics, pragmatics, even the
“personality” of the speakers, etc., where each of these contains multiple factors. There seems
to be no scientific rigor in invoking these alleged causal factors and, unlike the enterprise of
studying sound change in the laboratory, there have been no controlled tests of these hypotheses.
Phonetics is one of the disciplines that helps to provide answers to phonology’s questions about
why speech sounds behave as they do. Moreover, in its growth over the past couple of centuries it
has developed a respectable level of scientific rigor in creating and testing models of various
aspects of the speech mechanism. Phonology can benefit from phonetics’ methods, data, and
theories (Ohala 1991).
1 For additional challenges to articulatory-based accounts of assimilation, see Ohala 1990b.
2 Regarding the asymmetry in the direction of confusion, see Ohala 1985, 1997, Plauché et al. 1997.
Bell, H. 1971. “The phonology of Nobiin Nubian,” African Language Review 9, 115-159.
Gamkrelidze, T. V. 1975. “On the correlation of stops and fricatives in a phonological system,” Lingua 35.
Guion, S. 1996. Velar palatalization: coarticulation, perception and sound change. Doc. diss., Univ. of
Texas at Austin.
Jaeger, J. J. 1978. “Speech aerodynamics and phonological universals,” Proc., Annual Meeting of the
Berkeley Linguistics Society 4, 311-329.
Jakobson, R. 1978. “Principles of historical phonology,” In P. Baldi and R. N. Werth (eds.), Readings in
historical phonology. University Park, PA: Pennsylvania State University Press. 253-260.
Klingenheben, A. 1927. “Stimmtonverlust bei Geminaten,” In Festschrift Meinhof. Hamburg:
Kommissionsverlag von L. Friederichsen & Co. 134-145.
Lass, R. 1980. On explaining language change. Cambridge: Cambridge University Press.
Martinet, A. 1949. Phonology as functional phonetics. London: Oxford University Press.
Ohala, J. J. 1983. “The origin of sound patterns in vocal tract constraints,” In: P. F. MacNeilage (ed.), The
production of speech. New York: Springer-Verlag. 189 - 216.
Ohala, J. J. 1985. “Linguistics and automatic speech processing,” In: R. De Mori & C. Y. Suen (eds.), New
systems and architectures for automatic speech recognition and synthesis. Berlin: Springer-Verlag.
447 - 475.
Ohala, J. J. 1986. “Discussion,” In: J. S. Perkell & D. H. Klatt (eds.), Invariance and Variability in Speech
Processes. Hillsdale, NJ: Lawrence Erlbaum. 197 - 198.
Ohala, J. J. 1990a. “There is no interface between phonetics and phonology. A personal view,” Journal of
Phonetics 18, 153-171.
Ohala, J. J. 1990b. “The phonetics and phonology of aspects of assimilation,” In J. Kingston & M. Beckman
(eds.), Papers in Laboratory Phonology I: Between the grammar and the physics of speech.
Cambridge: Cambridge University Press. 258-275.
Ohala, J. J. 1991. “The integration of phonetics and phonology,” Proceedings of the XIIth International
Congress of Phonetic Sciences, Aix-en-Provence, 19-24 Aug 1991. Vol. 1, 1-16.
Ohala, J. J. 1992. “What's cognitive, what's not, in sound change,” In G. Kellermann & M. D. Morrissey
(eds.), Diachrony within synchrony: Language history and cognition. Frankfurt/M: Peter Lang Verlag.
Ohala, J. J. 1993. “The phonetics of sound change,” In C. Jones (ed.), Historical Linguistics: Problems and
Perspectives. London: Longman. 237-278.
Ohala, J. J. 1995. “Phonetic explanations for sound patterns: implications for grammars of competence,” K.
Elenius & P. Branderud (eds.), Proc. 13th Int. Congr. Phonetic Sciences, Stockholm, 13-19 August
1995. Vol. 2. 52-59.
Ohala, J. J. 1997. “Comparison of speech sounds: Distance vs. cost metrics,” In S. Kiritani, H. Hirose, & H.
Fujisaki (eds.), Speech Production and Language. In honor of Osamu Fujimura. Berlin: Mouton de
Gruyter. 261 - 270.
Plauché, M., C. Delogu, and J. J. Ohala. 1997. “Asymmetries of consonant confusions,” Proc., Eurospeech 97,
Rhodes, Greece, 22-25 Sept. 1997.
Sherman, D. 1975. “Stop and fricative systems: a discussion of paradigmatic gaps and the question of
language sampling,” Stanford, CA: Stanford Working Papers in Language Universals 17, 1-31.
Sommerstein, A. H. 1977. Modern phonology. Baltimore, MD: University Park Press.
Trubetzkoy, N. 1933. “La phonologie actuelle,” J. de Psychologie. No. 1-4, 227-246.
Trubetzkoy, N. 1939. Grundzüge der Phonologie. Prag. [Bd. 7, Travaux du Cercle Linguistique de Prague.]
Vennemann, T. 1993. “Language change as language improvement,” In Charles Jones (ed.), Historical
Linguistics: Problems and Perspectives. London: Longman. 319-344.
Weinreich, U., W. Labov, and M. I. Herzog. 1968. “Empirical foundations for a theory of language
change.,” In W. P. Lehmann and Y. Malkiel (eds.), Directions for historical linguistics. Austin, TX:
University of Texas Press. 95-188.
Winitz, H., M. Scheib, and J. Reeds. 1972. “Identification of stops and vowels for the burst portion of /p, t,
k/ isolated from conversational speech,” J. Acoust. Soc. Am. 51, 1309-1317.
Department of Linguistics
University of California
Berkeley, CA 94720
+1 510 649 0776