Natural Phonology

The Private and Public Faces of Language

Posted in Phonology by rwojcik on February 21, 2023

Learning to speak a language is not unlike learning to type on a QWERTY keyboard. Typing takes a lot of hand-eye coordination. The keys represent tactile-visual (sensorimotor) units, because tapping them produces a desired or undesired visual effect on a computer screen. The mouth is a kind of oral keyboard that everyone is born with, and the ear is a kind of auditory computer screen. Learning to touch-type on a keyboard requires keeping an eye on the screen while making movements with the fingers, wrists, and arms. Learning to speak requires paying attention to different sounds produced while making different movements with the lips, jaw, tongue, velum, pharynx, larynx, and vocal cords. That’s what infants do within a few months of birth when they start to babble. Eventually, they attempt pronunciations of words, and they need to remember which complex gestures approximate the sounds made by other speakers. All of this requires a lot of practice with mouth-ear coordination, just as learning to touch-type requires a lot of hand-eye coordination. Movements that do not produce the desired auditory or visual results have to be suppressed. Practice makes perfect, and a sense of which movements achieve the desired results provides a basis for well-formedness intuitions over time.

Sensorimotor units link different aspects of cognition. Alphabetic letters are visual signs that can be linked to motions made by the body to produce them on a computer screen. Or they can be linked to hand gestures made with a writing implement to produce them on a visual medium such as paper. Or they can be linked to voice commands made to a computer or cell phone, which also produces visual symbols on a display via speech-to-text technology. Auditory acoustic signals can be linked to articulatory gestures, but they can also be linked to keyboard input that produces an acoustic signal from a speaker via text-to-speech technology. The sensorimotor units themselves are always psychological abstractions. They are created by brains that build associative links that can connect volitional motor activity to various types of perception. Those connections are abstract causal units. That is, they are causal functions with motor inputs and perceptual outputs. Their causal function is what I call an intention. Note that intention is not the same as volition in that intention can be imagined, but volition has to be actuated or effected in bodily motion.

There is an important private-public asymmetry in sensorimotor units such as speech sounds. The articulatory side of a speech sound is always private, but the auditory side can be shared publicly with other speakers. The articulatory activity is not immediately accessible to others, but the acoustic signal is. The articulatory side can be described and recorded. That is what linguists do when they work with articulatory phonetics and phonology. The sensorimotor unit itself is a concept that can be represented symbolically, which is where visual alphabetic symbols come in. One can associate that unit with one or more symbols. Phoneticians use phonetic alphabets to represent speech sounds sui generis, but social communities use them to represent phonemes–the mnemonic speech sounds that can be combined to represent words and morphemes.

Morphemes, like phonemes, have private and public faces. The private face is conceptual. The private meanings we assign mentally to words are inaccessible to others, grounded in personal experiences, and therefore unique to every speaker. However, language is a medium through which we share some of those experiences with others. So there is a public side to morphemes that exists in the social realm, just as the acoustic signal associated with a phonological representation exists as a shared social phenomenon. Phonemic representations are associated with articulatory-acoustic units or speech sounds, but they only make sense in terms of their role in representing morphemes. Morphemes are stored in memory, so I define phonemes as mnemonic speech sounds.

Charles Fillmore, like David Stampe, was one of my earliest mentors in linguistic theory. Several years ago, he told me that he had come to think of language as “word-guided mental telepathy”. Fillmore was referring to the influence that Roger Schank had had on his Frame Semantics theory, but it is a good metaphor for the private-public nature of language as I am describing it here. One might call phonological representations “phoneme-guided mental telepathy,” because phonemes play a heuristic role in identifying the morphemes one uses to encode and decode them.

The two-sided nature of phonemes means that they play two separate distinctive roles–articulatory and auditory. We normally think of phonemes only in terms of the public side of their nature, so that role has become the defining characteristic in linguistic definitions of phonemes. During the 1920s and afterwards, linguists came to reject the idea that phonemes could be auditorily ambiguous. This was not Baudouin’s view, but, for various reasons that I will take up later, most linguists came to define the phoneme as a type of sound that had to be phonetically unambiguous. A phoneme was a type of speech sound that functioned to distinguish words and morphemes. The most popular method for establishing distinctive phonemes in a linguistic system was to demonstrate minimal word pairs, in which a single speech sound is the only way to tell the two words apart.
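As a concrete illustration, the minimal-pair method can be sketched in a few lines of code. The toy lexicon and its rough phonemic transcriptions below are my own illustrative inventions, not drawn from any analysis discussed here:

```python
# A sketch of the minimal-pair method: two words form a minimal pair if
# their transcriptions differ in exactly one segment; that segment is then
# evidence for a phonemic contrast. The tiny lexicon is purely illustrative.
from itertools import combinations

LEXICON = {
    "pat": ["p", "æ", "t"],
    "bat": ["b", "æ", "t"],
    "bad": ["b", "æ", "d"],
    "mat": ["m", "æ", "t"],
}

def minimal_pairs(lexicon):
    pairs = []
    for (w1, s1), (w2, s2) in combinations(lexicon.items(), 2):
        if len(s1) != len(s2):
            continue
        diffs = [(a, b) for a, b in zip(s1, s2) if a != b]
        if len(diffs) == 1:
            pairs.append((w1, w2, diffs[0]))
    return pairs

for w1, w2, (a, b) in minimal_pairs(LEXICON):
    print(f"{w1} / {w2} establishes /{a}/ vs. /{b}/")
```

Running this on the four-word lexicon turns up pat/bat for /p/ vs. /b/ and bat/bad for /t/ vs. /d/, which is exactly the style of argument described above.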

However, there are two different ways in which a phoneme can be distinctive–acoustic and articulatory. A word can be associated with a sequence of distinctive sounds, or a string of distinctive coordinated articulatory gestures. The difference is that what happens in a speech tract is private. It is not accessible to other speakers. So a word in a speaker’s mind can also be associated with a string of articulatory units that are consciously available to speakers but not listeners. If phonemes are defined as mnemonic speech sounds, then the proper venue for establishing their reality is not in the acoustic signal, but in the mind. Hence, Baudouin de Courtenay was inclined to define the phoneme as the “psychic equivalent of a speech sound” rather than merely a sound that could be used to distinguish words.

Baudouin himself was acutely aware of the sensorimotor nature of phonemes. He even gave special names to them. He called the articulatory side a kineme, the auditory side an acouseme, and the sensorimotor unit itself a kinakeme, although these terms, like physiophonetics and psychophonetics, never caught on with subsequent linguists who built their theories on his work. Today, we refer to physiophonetics as phonology and psychophonetics as morphophonology, although the distinction between these two very separate fields of study has blurred in generative approaches to phonology. Here is a passage from Baudouin’s article The Difference between Phonetics and Psychophonetics, as translated by Edward Stankiewicz in his A Baudouin de Courtenay Anthology:

Phoneme, the psychological equivalent of a physical “sound,” the actual and reproducible phonetic unit of linguistic thought. The phoneme consists, in turn, of constituent elements of which we are not aware during linguistic intercourse but which can be obtained by analysis; they are:
the kineme, the articulatory, phonational element of linguistic thought;
the acouseme, the simplest psychological element of audition or acoustic perception; and
the kinakeme, the complex representative of both the articulatory (phonational) and auditory elements.

Różnica między fonetyką a psychofonetyką, STWarsz, 1927; Soviet ed. II, pp. 325-30

Speech Sounds, Phonemes, and Allophones

Posted in Phonology by rwojcik on February 18, 2023

The basic unit of phonological theory is the phoneme. Traditionally, phonemes are defined as the smallest units of sound in speech that can be used to distinguish words in a language. That’s a fairly good definition, but it lacks any reference to the psychological function of phonemes. It does not explain why people cannot just use any speech sounds to distinguish words. For example, why are glottal stops not phonemes of English? Glottal stops occur all the time in the speech of English speakers, but most speakers perceive them as allophonic variants of other phonemes, such as the voiceless stop phonemes /p/, /t/, and /k/. Glottal stops also occur optionally in word-initial position before vowels, but they cannot be used to differentiate words. If English speakers can perceive and produce glottal stops, why can’t they use them as minimal units of sound that distinguish words?

As a native speaker of American English, I usually pronounce the name Sutton with a glottal stop [ˈsəʔn̩], but I “hear” that sound phonemically as a /t/. I can also pronounce the name in slow tempo or careful speech as [ˈsətʰən]. That is, I can suppress the collapse of the final syllable into a syllabic nasal, which triggers the conversion of homorganic [tn̩] to [ʔn̩]. In fact, I can artificially flap the intervocalic /t/ to produce the pronunciation [ˈsəɾ̥ə̃n] and even rephonemicize the voiceless flap to a voiced stop /d/, yielding [ˈsədn̩]. That is, I can make Sutton sound like the word sudden, although that would not be a normal way of pronouncing the word for me. I sometimes hear other speakers pronounce the name that way. I certainly hear speakers pronounce the word “sentence” with a flap, although I invariably pronounce it with a glottal stop at normal tempo; but, if I articulate each syllable carefully as “sen-tence”, the [t] is easily produced.
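The chain of derivations just described can be mimicked with a toy sketch. The transcriptions are simplified (stress marks and the syllabicity diacritic are omitted), and the process formulations are my own rough approximations, not a worked-out Natural Phonology analysis:

```python
# Two optional casual-speech processes applied in sequence to a careful
# pronunciation of Sutton. Simplified transcriptions; illustrative only.

def collapse_final_syllable(form):
    # careful [sətʰən] -> [sətn]: the final schwa drops and the nasal
    # becomes syllabic (syllabicity mark omitted here)
    return form.replace("tʰən", "tn")

def glottalize(form):
    # homorganic [tn] -> [ʔn]
    return form.replace("tn", "ʔn")

careful = "sətʰən"
casual = glottalize(collapse_final_syllable(careful))
print(careful, "->", casual)  # sətʰən -> səʔn
```

Suppressing the first process, as in careful speech, bleeds the second: with no [tn] sequence present, glottalization never applies, which is why the careful pronunciation keeps its [t].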

The point is that I am consciously aware of glottal stops, and I can manipulate my articulation of a word to produce them or not produce them. Moreover, the glottal stop is distinctive in the sense that its occurrence distinguishes the word Sutton from sudden. I cannot pronounce sudden with an intervocalic glottal stop without consciously and gratuitously changing the phonemic character of the word. Glottal stops and flaps occur in my speech, but they are not phonemic speech sounds. They are invariably allophonic or, in the language of the early twentieth-century linguist Edward Sapir, not “psychologically real” speech sounds. Why is that?

Nikolai Trubetzkoy, in his posthumously published Principles of Phonology (Grundzüge der Phonologie), written in the 1930s, set out to define the phoneme. He briefly considered Baudouin de Courtenay’s definition of the phoneme as the “psychic equivalent of a speech sound” but rejected it. I will explain why I disagree with his rejection. Here is the relevant passage from his book:

Originally the phoneme was defined in psychologistic terms. J. Baudouin de Courtenay defined the phoneme as the “psychic equivalent of the speech sound.” This definition was untenable since several speech sounds (as variants) can correspond to the same phoneme, each sound having its own “psychic equivalent,” namely acoustic and motor images corresponding to it. Furthermore, this definition is based on the assumption that the speech sound itself is a concrete, positive given entity. But in reality this is not the case. Only the actual continuous flow of the speech event is a positive entity. When we extract the individual “speech sounds” from the continuum we do so because the respective section of the sound continuum “corresponds” to a word made up of specific phonemes. The speech sound can only be defined in terms of its relation to the phoneme. But if, in the definition of the phoneme, one proceeds from the speech sound, one is caught in a vicious circle.

N.S. Trubetzkoy, Principles of Phonology. Translated by Christiane A.M. Baltaxe. Berkeley: University of California Press, 1969: 37-38.

I think Trubetzkoy was somewhat hasty and off the mark in this passage, but not entirely wrong in his criticism. It is not obvious what Baudouin meant by “psychic equivalent,” but that does not mean that it could be equated with just any putative speech sound. Otherwise, as Trubetzkoy points out, that would trap one in a vicious circle of defining the phoneme as a psychic equivalent of itself. I also disagree with him on another point. Trubetzkoy did not treat speech sounds as discrete (“concrete, positive”) entities. If that is the case, then what are the “acoustic and motor images” that he himself referred to in the same paragraph? To put it another way, what is it that phoneticians transcribe and study? What do the symbols in the International Phonetic Alphabet (IPA) correspond to? Trubetzkoy was able to discard the concept of a psychological phoneme by denying the psychological reality of speech sounds generally, but those allegedly unreal speech sounds are referred to all over his work via the medium of alphabetic symbols for them. In my view, speech sounds themselves are psychologically real entities, not just phonemes. It’s just that phonemes are a special kind of speech sound.

What one needs to do first is to define the meaning of speech sound clearly as a mental image of some kind, and then to define the phoneme as a special type of speech sound. The special property of a phonemic speech sound is that it can occur in the phonological representation of a word. Patricia Donegan has pointed out to me that David Stampe preferred to use the term memorable in connection with phonemes, but I use the term mnemonic, because I want to emphasize the role of phonemes as memory aids. We associate morphemes in memory with strings or sequences of phonemes that evoke lexical meanings. It follows from this mnemonic function that the inventory of phonemic speech sounds in a language will be limited in number and that its members can be easily distinguished from each other. Speech sounds per se do not have to belong to any particular language. They are just sounds that can be articulated with a normal vocal apparatus and perceived with normal hearing.

Trubetzkoy also wrote that “The speech sound can only be defined in terms of its relation to the phoneme.” I believe that he got it backwards here. It ought to be that the phoneme can only be defined in terms of its relation to the speech sound. But what exactly is a “speech sound”? According to Trubetzkoy, it is an “acoustic and motor image”. I agree, but that seems to require that the speech sound itself be defined as a psychological unit independently of its phonemic status in a language. So let’s continue to think of it as an intentional auditory and motor (sensorimotor) image, but let’s separate that from the mnemonic property of a phoneme. Speech sounds are potential phonemes in the sense that anyone can learn new sounds in a new language, but that new language may have a different inventory of phonemic speech sounds that its speakers use to construct lexical representations that are stored in, and retrieved from, memory.

I propose to define a speech sound as a minimal sensorimotor unit of intended articulation that has an auditory effect in an acoustic signal. The sensory side is auditory rather than acoustic, since the acoustic signal can only be processed mentally as an auditory signal by a listener. The motor side is a complex coordinated movement of parts of the speech tract or a complex articulatory gesture. I have deliberately defined speech sound as an intentional movement, because I want to treat the sensorimotor unit as falling within a range of movements and auditory effects that are more or less autonomically produced. This is how I will allow for allophonic divergences in sound and motion during speech production. Allophones can become intentional auditory-motor images in the same sense that breathing can become intentional, but they normally do not require conscious attention during speech production. A phoneme is a special type of speech sound–a mnemonic speech sound.

David Stampe always used to tell us that alphabets and rhyme were two types of direct evidence for the existence of phonemes that linguists have always been subliminally aware of. Alphabets are inventories of visual symbols in which the symbols or combinations of symbols correspond directly to phonemic segments. However, given my distinction between speech sounds generally and phonemic speech sounds, I would prefer to link alphabetic symbols to speech sounds and treat language-specific alphabets as inventories of alphabetic symbols that correspond to phonemic speech sounds. So the International Phonetic Alphabet (IPA) is a real alphabet that is not associated with any particular language or set of phonemes. It is just an inventory of possible speech sounds in human language. Its symbols can be used conventionally to represent not only phonemes, but also divergent allophonic pronunciations of phonemes in different languages.

Phonetics is a separate branch of linguistics from phonology, because it is devoted to the study of speech sounds, not phonemes. Since speech sounds are auditory-motor units, phonetics can be divided into auditory and articulatory phonetics. Phoneticians look at both sides of speech sounds. And since auditory signals are caused by acoustic signals, acoustic phonetics is a third perspective on which to base the study of speech sounds. Phoneticians use phonetic alphabets to transcribe the speech sounds that they study.

Phonology has been clearly defined by Patricia Donegan and David Stampe many times in the past. The following passage is from their paper Hypotheses of Natural Phonology:

Phonology is the study of the categorical discrepancies between speech as it is perceived and intended, and speech as it is targeted for actuation.

One could just define phonology as the study of phonemics, which is how Trubetzkoy wanted the title of his work to be translated into English, according to his introductory section (p. 9). Christiane Baltaxe explained why she did not go with that suggestion in her translator’s footnote to that passage. What is interesting to me about Donegan and Stampe’s definition is that it does not tie the branch of phonology to any particular language or inventory of phonemes. It defines the field in terms of human speech behavior, which also includes language learning behavior, among other things. This expanded concept of the domain of phonology is significant, because Natural Phonology has a lot to say about phonological behavior during language learning. That would not make sense if the goal of phonology (and linguistics generally) were just to limit itself to studying patterns of well-formedness in a language. Language learners, almost by definition, are going to produce a lot of ill-formed linguistic patterns during the learning process. Intentional speech sounds, as opposed to phonemes, play a large role in the acquisition of speech. I’ll take up this subject and others in later posts.

Below is a table that summarizes some terms as I have used them here. I plan to continue to use these terms in the same way in subsequent posts.

Speech sound: A minimal sensorimotor unit of intended articulation that has an auditory effect in an acoustic signal.
Phoneme: A mnemonic speech sound that can be used in the mentally stored representation of a lexeme.
Allophone: A divergent articulatory modification of a phoneme.
Alphabet: An inventory of visual symbols in which a symbol or combination of symbols corresponds to speech sounds.
Phonetics: The study of speech sounds.
Phonology: The study of the categorical discrepancies between speech as it is perceived and intended, and speech as it is targeted for actuation.

Historical Roots of Natural Phonology

Posted in Phonology by rwojcik on February 15, 2023

I have not posted anything on this site since 2010, so it has seen virtually no traffic since that time. Sadly, David Stampe is no longer with us, but I now intend to revive this blog. The main reason is that I have been invited to write a chapter for the proposed Cambridge Handbook of Natural Linguistics, to be edited by Katarzyna Dziubalska-Kołaczyk, Patricia Donegan, and Wolfgang Dressler. My contribution is entitled “Historical Roots of Natural Phonology”. Since the content of that chapter will involve some of the material that I and others posted on this site earlier, it will help me to organize my thoughts about that material here.

My objective in describing the historical roots of Natural Phonology is not to be comprehensive. Rather, I intend to identify some of the historical trends that I view as significant precursors to David Stampe’s theory, even though they might not have directly influenced his own formulation of it. For example, I’m not sure that Panini’s Ashtadhyayi should be viewed as a direct inspiration for Stampe’s rule/process dichotomy, but that astounding work stands as the taproot of all modern linguistic theories. There are aspects of it that relate directly to Stampe’s framework, so it is worth mentioning those aspects when looking at history through the lens of his framework. It is also worth noting that David Stampe himself had a passionate interest in India, its languages, and Panini’s work, so it might be useful to look for some historical precedents there.

There is also ample precedent from ancient Greece, although that civilization had nothing like the linguistic sophistication of the Hindu grammarians of the same era. However, Greece did develop a fairly clean alphabetic writing system, one that mapped to phonemes more directly than the phonological representations of modern generative approaches do. Sanskrit’s Devanagari was actually an abugida (a syllabic alphabet, or alphasyllabary), so it represented vowels as standalone symbols only when they were not combined with consonants. Otherwise, vowels were represented as diacritic symbols attached to a consonant or consonant cluster. This may have given Vedic scholars an edge over the Greeks in discovering more about the nature of phonology, because syllables and metric timing are essential to understanding Sanskrit phonological processes such as voicing, nasalization, assimilation, syncope, epenthesis, aspiration, and deaspiration. Being able to represent individual sounds with visual symbols was an important technological leap forward in the development of linguistic phonology.

I intend to write more about these topics later, but it is useful to point out one direct influence that the ancient Greeks had on the development of Natural Phonology. The word natural in Natural Phonology comes out of the debate between Hermogenes and Cratylus over the correctness of names in Plato’s Cratylus. Hermogenes was an extreme conventionalist, arguing that the only basis for naming things was human convention, whereas Cratylus was an extreme naturalist, arguing that words followed from the nature of the things they named. Socrates, who is supposed to represent Plato’s perspective, gives pro and con arguments for both positions, but ends up mocking the naturalist explanations of Cratylus. There is no solid conclusion in the debate other than that linguistic explanations must rely on both conventional and natural explanations.

Modern linguistic theory seeks to explain linguistic phenomena that appear to limit the range of patterns we observe in human languages. Within the framework of Chomskyan generative grammar, universal traits of human linguistic behavior are grounded in observation of well-formedness intuitions. Everyone inherits the ability to build an internal mental grammar that allows native speakers of a language to recognize well-formedness in their language. The standard view in Chomsky’s framework is that human offspring normally inherit brains that are preloaded with knowledge of the possible forms that a grammar might impose on well-formedness judgments, and this accounts for the rapid development of grammatical knowledge in first language learners. This is often referred to as the innateness hypothesis. Seemingly conventional aspects of linguistic structure are deemed to be grounded in what are sometimes referred to as innate markedness conventions.

Stampe’s perspective on linguistic universals was very different from that of a generative linguist. His alternative explanation of phonological universals was based not on any innate, preinstalled set of markedness conventions, but on the very nature of articulation and auditory perception in humans. Even more importantly, it was not so much about developing well-formedness intuitions as about being able to produce and understand speech, which occurs in patterns that are often not perceived as well-formed. Intuitions of well-formedness in language are certainly very real and very important, but the drive to perceive what is right or wrong about linguistic patterns in a language is not the only concern of a language learner. Even more important is the ability to understand and be understood. What is obviously inherited by everyone is not just a brain, but a body that is used to produce and comprehend spoken language. Linguistic universals follow from the nature of the body that produces language, not just from arbitrary analytical conventions that might be genetically handed down from parents. For more details on this topic, see David Stampe’s On Chapter Nine.

New Old Paper

Posted in Phonology by Geoffrey Nathan on May 29, 2010

I am attempting to scan and convert old NP papers to usable format. Here is the first one, an old Stampe paper on diphthongs. I’ve proofread it pretty well, but the formatting is a little funky because I don’t understand completely how Word does footnotes. If anyone wants to fix it, please do so and let me know.

Removed for reformatting and reconsideration.

The Role of Rules in Speech Production and Perception

Posted in Linguistics, Phonology by rwojcik on March 3, 2009

Since speakers strive to produce grammatical speech, it follows that linguistic grammars are a key component of any theory of speech production.  In generative theory, the relationship between grammar and behavior is indirect.  Grammars describe well-formed structure, but the theoretical framework makes no explicit claim about how such knowledge interacts with speech production.  Natural Phonology entails a different theoretical approach, one that is grounded in linguistic performance.

Rules and Processes represent psychological strategies that directly govern speaker intentions.  Morphophonological Rules represent strategies that alter the string of phonetic targets–usually phonemes of a language–that the speaker intends to produce.  Words and morphemes are associated with strings of phonemes in memory.  So Rules make reference to morpheme boundaries.  They are lexical in nature.  They are fundamentally suppletive operations.  The Rule that replaces /f/ with /v/ in the formation of the plural knives is essentially the same kind of mental operation that replaces /go+d/ with /wEnt/ in the formation of the past tense went.  Inflectional Rules participate directly in the production of syntactically-grouped strings of morphemes and words, and Derivational Rules act directly in the formation of new words.  Rules also play a role in speech understanding, as they help listeners identify the words that they are listening for.  Rules also interact with other strategies that alter strings of phonemes.  For example, they interact with the strategies behind secret languages, or ludlings, which involve extragrammatical modification of linguistic performance.
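Since Rules are lexical and suppletive, a sketch of them need not compute anything phonetic at all; a lookup table over (stem, category) pairs captures the idea. The rough transcriptions and the table itself are my own illustrations:

```python
# Rules as suppletive, lexical operations: the f/v alternation in knives and
# the go/went suppletion are the same kind of stored substitution.
RULES = {
    ("naif", "PLURAL"): "naiv",   # knife -> knive- before the plural suffix
    ("gou", "PAST"): "wEnt",      # go -> went (total suppletion)
}

def apply_rule(stem, category):
    # fall back to the unaltered stem when no Rule applies
    return RULES.get((stem, category), stem)

print(apply_rule("naif", "PLURAL") + "z")  # naivz
print(apply_rule("gou", "PAST"))           # wEnt
print(apply_rule("kat", "PLURAL") + "s")   # kats (no Rule applies)
```

The point of the table is that nothing in it follows from articulation: replacing one segment of knife and replacing all of go are the same kind of lexical operation.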

Phonological Processes do not alter the morphologically grouped strings of sounds that comprise morphemes.  Rather, they alter the articulatory gestures that a speaker ends up using in the production of a fixed string of phonetic segments.  Processes change the articulatory gestures that a speaker uses in producing prosodically-grouped strings.  Morpheme boundaries do not really exist at this level of production, although such boundaries can affect the prosodic grouping.  So the common noun singer lacks the [g] pronounced in the proper noun Singer, because the morpheme boundary affects the prosodic chunking.  That is, the obstruent cluster /ng/ is reduced at the ends of syllables, but not when there is an intervening syllable boundary.  The phenomenon can be seen in the ease with which some of us can flap the nd cluster in the past tense/participle handed, but not so easily in the adjective left-handed, which involves a different kind of morphological parsing.
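The singer/Singer example can be sketched as a process that is blind to morphology but sensitive to syllable boundaries. The syllabifications, marked here with a period, are my own illustrative assumptions:

```python
# Syllable-final /ŋg/ reduction: delete the [g] when the cluster closes a
# syllable, as in the common noun singer, but not when the [g] begins the
# next syllable, as in the proper noun Singer (or finger).

def reduce_ng(form):
    # "." marks syllable boundaries; a syllable ending in "ŋg" loses its "g"
    syllables = form.split(".")
    return ".".join(s[:-1] if s.endswith("ŋg") else s for s in syllables)

print(reduce_ng("sɪŋg.ər"))  # sɪŋ.ər  (common noun singer)
print(reduce_ng("sɪŋ.gər"))  # sɪŋ.gər (proper noun Singer, unchanged)
print(reduce_ng("sɪŋg"))     # sɪŋ     (the verb sing)
```

Note that the function never consults morpheme boundaries; the boundary does its work indirectly, by determining where the syllable breaks fall.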

So my point here has been to emphasize the fact that Rules, like processes, are performance strategies.  They govern the act of producing speech as well as perceiving it.  It is just that the psychological functions of Rules and Processes are entirely different.  The former operate on strings of phonemes grouped according to syntactic and lexical production, whereas the latter help to organize the rhythmic nature of articulatory production.

Hypotheses of Natural Phonology

Posted in Linguistics, Phonology by rwojcik on February 18, 2009

We are pleased to offer you the latest draft of Patricia Donegan and David Stampe’s “Hypotheses of Natural Phonology“, which was originally delivered in Poznan in September 2008.  The final version will appear in Poznan Studies in Contemporary Linguistics.  Those who wish to make private comments can email the authors.  Public comments can be made to this post.

Processes, Speech Impediments, Chewing Gum, and Phonological Intuitions

Posted in Linguistics, Phonology by rwojcik on February 18, 2009

In generative linguistics, a linguistic derivation is a method of calculating well-formedness.  Intuitions of linguistic well-formedness are basically what generativists set out to explain.  So it is very difficult for a generative linguist to understand Natural Phonology.  As I have already pointed out, the inputs to a Process derivation do not have to be well-formed.  They can be any string of phonetic targets imaginable.  Phonological Processes exist in infants well before they have intuitions of well-formedness, and infants simply could not learn to speak if it were Processes that defined grammatical intuitions.  How would they know what to try to pronounce?  Yet it is still a fact that Processes in adults play a big role in determining what we come to believe is phonologically well-formed.  So how does that come about?

Morphophonological Rules set up the inputs to Processes.  They are operations on strings of phonemes arranged in morphological chunks.  Processes are operations on articulatory targets that are arranged in rhythmic chunks.  Just as Rules apply to lexical units, Processes apply to prosodic units.  When I pronounce French poorly with my English accent, one of the reasons is that I don’t feel comfortable with syllable-timed articulation.  I am comfortable with articulatory chunks that are stress-timed, and I happen to like vowel reduction.  Subjectively speaking, that is one reason why I get along better with Russian, another language that invites vowel reduction, albeit of a different sort than English.  But I can mangle Russian pronunciation, too.  It is just that I am better at suppressing bad English habits when I speak Russian.  I learned Russian when I was still a teenager.  French came later.

Processes can be thought of as habits–psychological operations that overcome some articulatory difficulty.  They are also like speech impediments, especially during the acquisition of foreign pronunciation.  What we do with processes is the same in second language acquisition as in first language acquisition.  We struggle to suppress those that impede the pronunciations that we are striving to achieve.  The English-learning child who struggles to suppress devoicing of English voiced obstruents has the same problem as the adult Russian or German speaker who struggles to suppress obstruent devoicing at the ends of syllables in English.  Well, not quite the same.  Children are so much better at suppressing misarticulations.  And that is a hallmark of language acquisition in Natural Phonology–the suppression of misarticulations.  A phonological system plays a major role in controlling muscular coordination during speech.  Morphophonological Rules don’t have much to do with articulatory coordination, because they are just part of the speech production program that moves phonemic strings into the articulatory pipeline.

But what about non-linguistic coordination of the articulators? I can talk while chewing on things, much to the annoyance of my audience. Some people can’t help but speak with a lisp or a stutter. Intoxicated people slur their words, and then there is singing and whispering. Non-linguistic gestures have to get coordinated with linguistic articulatory gestures. How does that happen? I don’t have a simple answer, but I do believe that Natural Phonology offers a better approach than Generative Phonology to the question of how the brain mediates clashes between linguistic and non-linguistic behavior.

But if phonological derivations are part of a program to coordinate articulatory gestures during speech, where do intuitions of well-formedness come from? How do we know that bnick is not a possible English word, but blick is? If any string of phonetic segments can serve as the input to a Process derivation, that helps to explain why I can try to pronounce Russian and French words and why my English processes impede my intended articulations. But it costs me something I got from Generative Phonology–an explanation of why bnick is phonologically ill-formed for English. So where does that knowledge come from?

Intuitions of well-formedness are expectations that generative linguistic theory tends to attribute to a single source–the grammar. But expectations of that sort need not come from a single source of information. In adult native speakers, the Process system is generally a reliable filter on what is pronounceable because misarticulations were suppressed during the language acquisition phase. My sense of well-formedness extends well beyond linguistic structures. I can recognize missteps in dances, and I know it when I make those missteps. Similarly, language learners come to know their own difficulties with pronunciation and whether their speech tract can produce the desired effects with minimal effort. What makes bnick ill-formed is not that it is blocked from phonological inputs. It is that I cannot pronounce it as easily as I can pronounce nonsense words like blick. To get to bnick, I have to suppress an epenthetic vowel that does not impede my pronunciation of blick. (Well, bnick is a problem when I’m not speaking Russian anyway. Russians don’t mind consonant clusters as much as English speakers do.)
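The bnick/blick contrast can be sketched computationally. The toy script below treats well-formedness as a matter of which onset clusters a speaker has practiced articulating, rather than as a grammar-internal filter on inputs; the onset inventories and function names are purely illustrative, not real descriptions of English or Russian phonotactics.

```python
# Toy sketch: well-formedness as practiced articulation rather than a
# grammar-internal filter. Onset inventories are illustrative, not complete.
ENGLISH_ONSETS = {"b", "bl", "br", "n", "p", "pl", "pr", "s", "sn", "st"}
RUSSIAN_ONSETS = ENGLISH_ONSETS | {"bn", "mn", "vz"}  # Russian tolerates more clusters

def onset(word: str) -> str:
    """Return the initial consonant cluster of a word (toy version)."""
    i = 0
    while i < len(word) and word[i] not in "aeiou":
        i += 1
    return word[:i]

def easily_pronounceable(word: str, practiced_onsets: set) -> bool:
    """A word is 'well-formed' if its onset is one the speaker has practiced."""
    return onset(word) in practiced_onsets

print(easily_pronounceable("blick", ENGLISH_ONSETS))  # True
print(easily_pronounceable("bnick", ENGLISH_ONSETS))  # False
print(easily_pronounceable("bnick", RUSSIAN_ONSETS))  # True
```

On this view, nothing blocks bnick from entering a derivation; it simply falls outside the set of articulations the English speaker has practiced to the point of effortlessness.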

Some puzzles for NP

Posted in Linguistics, Phonology by Geoffrey Nathan on February 18, 2009

I’ve been reading in various related areas for several years now, and some researchers outside the NP tradition (or who are actually hostile to it) have some findings that NP theorists need to take account of. I’ll mention some of them here, and it will be interesting to see what folks think.

Abby Cohn has done some work on vowel nasalization in English and French, and argues, based on actual articulatory data, that the vowel nasalization in English (as in ‘bend’) is quite different articulatorily from that in French (as in ‘bonté’). In English the velum lowers gradually from the /b/ towards the /n/, while in French the velum snaps down immediately and remains down till the /t/. Her argument is that nasalization is unspecified in English between the /b/ and the /n/, so the velum begins lowering immediately after the onset consonant, eventually reaching fully open position (she uses the term ‘interpolated’). In short, the nasalized vowel in English isn’t a substituted target, but rather an accident on the way to a nasal consonant, while the nasalized vowel in French is an intentional target, and thus articulatorily different.

Cohn, Abigail. 1993. “Nasalization in English: Phonology or Phonetics”. Phonology 10: 43-81.

Lisa Zsiga has done similar work on the ‘bless you’ assimilation, finding different tongue shapes for the assimilated /ʃ/ than for the underlying one.

Zsiga, Elizabeth C. 2000. “Phonetic alignment constraints: Consonant overlap and palatalization in English and Russian”. Journal of Phonetics 28: 69-102.

The basic point here is that ‘segments’ produced through the operation of processes are different kinds of things from segments that are directly aimed at, even if they sound the same (i.e. have some nasalization, or have a hiss with a higher center of gravity).

What is a phonetic segment?

Posted in Linguistics, Phonology by rwojcik on February 12, 2009

Phonetic segments (phones) are not physical things. They are mental constructs with physical manifestations. Linguists represent them with phonetic symbols enclosed in brackets: [p], [b], [m], [æ]. Linguists also classify phonetic segments according to their acoustic and articulatory properties. So another way to characterize a phonetic segment is as a bundle of phonetic features, which are usually binary to indicate the presence or absence of each feature. So [p] can also be represented as a feature bundle: [+labial], [+stop], [-voice]. Phonetic segments have a two-faced character. They can be manifested in two ways: acoustic and articulatory. So feature classes can represent articulatory characteristics, acoustic characteristics, or both.
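The feature-bundle view of a segment lends itself to a simple data-structure sketch. The snippet below mirrors the bracketed notation in the text ([+labial], [+stop], [-voice]); the feature names and the comparison function are illustrative only, since real feature systems are larger and partly language-specific.

```python
# A phonetic segment as a bundle of binary features, mirroring the
# bracketed notation in the text. Feature names are illustrative.
p = {"labial": True, "stop": True, "voice": False}
b = {"labial": True, "stop": True, "voice": True}

def differing_features(seg1: dict, seg2: dict) -> set:
    """Return the features on which two segments disagree."""
    return {f for f in seg1 if seg1[f] != seg2[f]}

# [p] and [b] differ only in voicing:
print(differing_features(p, b))  # {'voice'}
```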

The physical manifestations of phonetic segments are dynamic, not static, in nature. That is, the articulation of [p] requires sequential and simultaneous events in the speech tract. The speaker needs a very sophisticated program of muscular coordination to produce sequences of segments, and that program of coordination can alter the pronunciation of the sequence so as to do two different things: make the segments easier to hear or easier to produce. Under noisy conditions, the articulation of the segments may be altered in different ways than it is under more casual conditions. For example, an English speaker might reduce or eliminate the vowel in the first syllable of polite in casual speech, pronouncing it more like plight. Likewise, a speaker might pronounce please in emphatic speech more like police: Puh-lease! Casual speech reduces articulation, thus helping the speaker more than the hearer. Emphatic speech enhances the acoustic properties of the speech signal, thus helping the hearer more than the speaker. The two-faced character of phonetic segments affects the mental program that coordinates articulatory gestures. The operations governing such a program are motivated by the opposing interests of the speaker and the hearer. Phonology is the study of the discrepancy between the articulatory targets that speakers intend to produce and the ones that they end up producing. It is also (but maybe to a lesser degree) the study of the discrepancy between the auditory-acoustic signal that hearers receive and the string of phonetic segments that they think the speaker intended to articulate.

Consider the strings of phonetic targets that speakers try to produce.  These phonetic targets differ across languages and dialects.  They are phonetic segments that speakers associate with the words and morphemes of their mental lexicons.  We call those phonetic segments or targets phonemes.  Those are the speech sounds that speakers expect to hear and associate with the words of their language.  Phonemes have to be acoustically distinct from each other in order to help us distinguish words, so phonemes are phonetic segments that can be used to distinguish words in the mental lexicon.  Unfortunately, the coordination of articulatory gestures may reduce or eliminate their distinctive function in running speech.  Police and please can sound a lot more like each other in casual speech than when they are carefully articulated.

Now we have a rudimentary understanding of the nature of phonetic segments or idealized phonetic targets.  Notice that we are not limited to producing just the phonemes of our language.  We can try to articulate foreign sounds.  We can try to pronounce English with a foreign accent or in a different dialect from our native one.  As long as we are speaking a language, as opposed to just making noises with our mouths, the mind will impose a program of articulatory coordination on strings of phonetic targets.  To acquire proper pronunciation in a foreign dialect or language, we may need to suppress or alter parts of the articulatory program that we use to produce native pronunciations.  We also may need to set up new phonetic targets as new phonemes.  The sounds that we try to articulate may change, and the way we pronounce the sounds we try to articulate also changes.

Now I will present the basic idea of Natural Phonology.  It is not a structuralist or generativist theory of phonology, although I’ll have more to say about that in later posts.  Like generative theory (but unlike structuralist theory), Natural Phonology is explicitly a psychological approach to language.  The main difference is that Natural Phonology can only be understood in terms of behavioral strategies.  It posits two distinct types of strategies that directly affect speech production:

  1. Processes: operations that alter the articulation of a sequence of phonetic segments (or phonemes) that a speaker intends to articulate
  2. Rules: operations that alter the string of phonetic segments (or phonemes) that a speaker intends to articulate

From a traditional linguistic perspective, processes are purely phonological operations.  Rules operate outside of phonology, although they also play a role in pronunciation.  They are part of what Trubetzkoy called morphophonology.

Finally, let us consider how Rules and Processes affect foreign language acquisition.  Consider native German speakers who learn to pronounce the plural of knife. Germans have a Process that governs the pronunciation of all final obstruent consonants.  We hear the effect in a German accent, where the speakers pronounce a word like give as giff.  To learn English articulation, native speakers of German must suppress the articulatory constraint or Process that devoices final consonants.  Now, the problem with knife is that the plural form undergoes a Rule that replaces the final /f/ in the stem with /v/.  German speakers must learn to try to pronounce a /v/ phoneme instead of an /f/ in order to produce standard English knives.  So German speakers face a doubly difficult problem with that plural form, because they have to learn to alter the intended phonemic target in the stem while simultaneously suppressing the devoicing of the final consonant cluster.  When German speakers learning English say knifes [nayfs], we do not know whether they are actually intending to pronounce a voiced consonant cluster and can’t do it or a voiceless cluster because they haven’t yet learned what to try to pronounce.
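The knife/knives example can be sketched as a two-stage derivation: a Rule that alters the intended phoneme string, followed by a Process that alters its articulation. The rough transcriptions (e.g. "nayf" for knife) and helper names below are illustrative assumptions, not real IPA or a serious model of either grammar.

```python
# Toy sketch of the Rule/Process pipeline for the knife/knives example.
# Transcriptions are rough (e.g. "nayf" for knife), not real IPA.
VOICED_TO_VOICELESS = {"v": "f", "z": "s", "b": "p", "d": "t", "g": "k"}

def plural_rule(stem: str) -> str:
    """Morphophonological Rule: replace stem-final /f/ with /v/, then add plural /z/."""
    if stem.endswith("f"):
        stem = stem[:-1] + "v"
    return stem + "z"

def final_devoicing_process(phonemes: str) -> str:
    """Process: devoice the final obstruent cluster, as in German give -> 'giff'."""
    chars = list(phonemes)
    i = len(chars) - 1
    while i >= 0 and chars[i] in VOICED_TO_VOICELESS:
        chars[i] = VOICED_TO_VOICELESS[chars[i]]
        i -= 1
    return "".join(chars)

intended = plural_rule("nayf")             # the target an English speaker aims at
print(intended)                            # nayvz
print(final_devoicing_process(intended))   # nayfs -- the German learner's output
```

The sketch makes the learner's double bind visible: producing English knives requires both adopting the Rule (aiming at /v/) and suppressing the Process (not devoicing the final cluster), and the surface form [nayfs] alone does not tell us which step failed.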

The Rule/Process dichotomy in Natural Phonology explains the fundamental psychological difference between phonological and morphonological operations.  Not surprisingly, the dichotomy corresponds exactly to the psychophonetic/physiophonetic alternational dichotomy that the founder of modern phonology, Baudouin de Courtenay, first called to the attention of his linguistic peers.

Welcome to Natural Phonology

Posted in Linguistics, Phonology by rwojcik on February 12, 2009

My name is Rick Wojcik. I received my Ph.D. in linguistics from Ohio State University in 1973 and went on to teach linguistics at Columbia University, Barnard College, and Hofstra University until 1987. I subsequently became a researcher and developer of Natural Language Processing programs in industry.

The main purpose of this blog is to publish on topics that are relevant to David Stampe’s theory of Natural Phonology. The scope need not be limited to Natural Phonology per se, but it is limited to linguistic approaches that are compatible with the vision of phonological theory that David pioneered in the mid-1960s, when generative phonology was under development as a branch of generative theory. Part of my motivation for establishing this blog is to help clear away confusion and misconceptions about Natural Phonology, which is not really a generative linguistic theory of phonology. The theory has actually become somewhat obscure to most working phonologists nowadays, and I have always believed that the main reason for that is the tendency of most linguists to interpret it as just another approach to phonology within their overall framework. My hope here is to provide an alternative point of view that can serve as an outlet for what some of us take to be a revolutionary theory of phonology.

Ironically, Natural Phonology can also be described as reactionary, because it is something of a throwback to Jan Niecisław Ignacy Baudouin de Courtenay‘s original concept of phonological theory, which he described in terms of  physiophonetic alternations.  I will have more to say about this in future posts to this blog.