To express themselves, humans are capable of producing a great many different speech sounds (the phonetic elements of a language), which are formed into words. This ability is the result of a complex interaction between many parts of the body, including the brain, the lungs, the larynx, the pharynx, and a collection of mobile articulators: the tongue, lips, lower jaw, and soft palate (or velum). The lungs and larynx provide the sound source, which is then shaped by the upper airway, known as the vocal tract.
If the vocal folds are close together, pressure supplied by the exhaled air sets them into vibration to produce a tone. This is called phonation. The size and shape of the cavities of the vocal tract (pharynx, nose, and mouth), determined by the positions of the mobile articulators, amplify certain frequencies of the tone. This resonance produces a complex sound that is unique for each phonetic element of the language.
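The interplay between the tone from the vocal folds and the resonance of the vocal tract can be illustrated with a small numerical sketch. It is only a toy model under stated assumptions, not a description of real speech acoustics: the function names, the bell-shaped resonance curve, the 1/k weakening of the source harmonics, and the resonance frequencies are all hypothetical values chosen for illustration.

```python
def resonance_gain(freq_hz, centers_hz, bandwidth_hz=100.0):
    """Gain applied by the vocal tract at one frequency.

    Assumption: each resonance is modeled as a simple bell-shaped peak
    centered on one of the frequencies in centers_hz.
    """
    return sum(
        1.0 / (1.0 + ((freq_hz - center) / bandwidth_hz) ** 2)
        for center in centers_hz
    )


def shaped_harmonics(f0_hz, centers_hz, n_harmonics=30):
    """Return (frequency, amplitude) pairs for the tone after resonance."""
    result = []
    for k in range(1, n_harmonics + 1):
        freq = k * f0_hz
        # Assumption: the k-th harmonic of the source has amplitude 1/k.
        source_amplitude = 1.0 / k
        result.append((freq, source_amplitude * resonance_gain(freq, centers_hz)))
    return result


def strongest_frequency(harmonics):
    """Frequency of the harmonic with the largest amplitude."""
    return max(harmonics, key=lambda pair: pair[1])[0]


# Same tone (100 Hz), two hypothetical vocal tract shapes.
shape_a = shaped_harmonics(100, centers_hz=[300, 2300])
shape_b = shaped_harmonics(100, centers_hz=[700, 1100])
print(strongest_frequency(shape_a), strongest_frequency(shape_b))
```

In this sketch the tone itself never changes; only the assumed resonance frequencies differ, and that alone shifts which harmonics dominate the resulting sound.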
The larynx, situated on top of the trachea, opens into the pharynx and is considered part of the upper airway. It is composed of cartilages linked by ligaments and muscles and is completely covered by mucous membrane. The largest of these cartilages, the thyroid, forms the Adam’s apple, a bump in the neck that is visible in men. Within the thyroid cartilage are the vocal folds.
The vocal folds are long, smooth, rounded bands of muscle tissue that can be lengthened or shortened, tensed or relaxed, and separated or approximated. They are attached to the thyroid cartilage in the front and to the arytenoid cartilages in the back. Activation of the various intrinsic muscles, also attached to the arytenoid cartilages, causes the vocal folds to open wide during respiration or to close, tense, and stretch during phonation. For sound to be produced when exhaled air passes through the vocal folds, their edges must be more or less closed: the amount of closing affects voice quality.
Phonation (1) requires the vocal folds to be closely approximated. The tension of the intrinsic muscles, along with the pressure applied from the lungs, determines the quality of the sound that is produced. On the other hand, no sound is produced when the glottis is wide open (2) and the larynx is used solely for breathing.
A large number of muscles act to position the tongue, lips, jaw, and soft palate in various combinations to articulate different consonant or vowel sounds. Many consonants result from obstruction of the airflow by the tongue and lips, acting against the teeth and hard palate. Occlusive consonants (p, t, k) are produced by the complete obstruction and then sudden release of the airflow, while fricative consonants (f, th, s, sh) are produced with an incomplete obstruction, resulting in noise-like sounds. In both of these categories, sounds can also be produced while the vocal folds are vibrating, resulting in voiced consonants (b, d, g, v, z, j).
Articulation of vowels involves no major obstacles to the passage of sound from the larynx to the mouth opening. Resonance is therefore what differentiates these sounds. The size and shape of the vocal tract, the degree of lip rounding, and the degree of muscular tension are the most important factors affecting vowel articulation. Changes in the oral, labial, and nasal cavities also contribute. In some languages, such as French, the nasal resonator is involved in the articulation of nasal vowels: the soft palate (velum) lowers to let some air pass through the nose, adding a nasal quality to the sound.