Musicians who possess functional sight singing and dictation skills—what Gary Karpinski (2000) refers to as "thinking in music"—have unified into a single entity three discrete bodies of knowledge about music: iconography, nomenclature, and sound. Iconography refers to the visual representation of music: staffs, clefs, time and key signatures, accidentals, note heads, stems, flags, etc. By nomenclature, I refer to any labeling system for pitches, rhythms, and harmonies: solfège, letter names, scale-degree numbers, rhythmic syllables, counting numbers, Roman numerals, pcsets, etc. The third component—musical sound—is both invisible to human eyes and intangible to human touch. Though machines can generate a physical representation of soundwaves, audio frequencies are elusively abstract, even after they reach the eardrum to be processed by the brain.

A significant challenge when guiding students to develop sight singing and dictation skills is that both iconography and nomenclature can be learned and assessed in complete silence as pencil-and-paper tasks, without ever hearing a musical sound. For example, most students can quickly recite the spaces of the treble clef as "f-a-c-e" and the lines as "every good boy does fine." Yet how many can transfer these mnemonics into musical sound—if not in absolute pitch, then at least relative to a starting pitch? How many can communicate that they have indelibly internalized music's iconography and nomenclature with the sounds they represent?

My primary goal in teaching aural skills, as with all my teaching, is to provide opportunities for students to build bridges between prior knowledge and the defined learning outcomes, guiding students from "the known to the unknown" (Alexander, 1932). Most often, my students arrive knowing either letter names or solfège. Their preferred nomenclature system, however, is only loosely connected to iconography and tenuously to sound. An initial learning activity, therefore, is to bind nomenclature to sound, followed thereafter by marrying nomenclature-cum-sound to iconography. In this blessay, I suggest approaches to unifying nomenclature, sound, and iconography, and I propose an innovative curricular reform. I begin with the latter topic, which leads to the former.

Almost twenty years ago, while teaching in a very small music department, I reformed the curriculum to accommodate students who needed a one-semester music fundamentals course before beginning the degree-required theory sequence. To determine readiness for Theory I, incoming students (n < 10) were given access to an online placement test via Blackboard©, the course management system that at the time was still in its infancy. Upon logging in, students found a sample test and two computer-graded tests, each with 15 questions. The topics included time and key signatures, scales, clefs, triads, and intervals. To bypass the Music Fundamentals course, students needed to answer correctly the majority of questions about each individual topic.

Due to the limited size of the department, Theory I and III were offered only in the fall, and II and IV only in the spring. Therefore, students who enrolled in Music Fundamentals in the fall had no theory class to take in the spring; they had to postpone taking Theory I until the beginning of their sophomore year, a whole year behind their cohort. The remedy was to shift Theory I and III to the spring and, correspondingly, Theory II and IV to the fall. Students who required Music Fundamentals in the fall enrolled in Theory I with their cohort after the winter break, rather than waiting nine months for the next fall semester to commence. The four-semester aural sequence, however, did not shift (see Crystal Peebles' description of the theory and aural sequence at Ithaca College in her essay in this volume). Instead, Aural I's learning outcomes, in-class activities, and assessments reoriented to build students' "musical database," in which the nomenclature of music, the sound of music, and the iconography of music were melded together into a single entity called aural skills.

To accommodate students who were concurrently enrolled in Music Fundamentals and Aural I, the nomenclature for all musical structures was provided. For example, after students memorized a pentascale-and-triad pattern in major and minor on solfège (do-re-mi/me-fa-so-fa-mi/me-re-do-mi/me-so-mi/me-do), they were given the notation and spelling for the pattern in all keys as a sing and play assignment. Students performed the pattern on letter names, progressing counter-clockwise around the circle of fifths, each key linked by transforming the third scale degree of the old key into the leading tone of the new key. Not only did students internalize the beginning of every major and minor scale, they also memorized the sound and spelling of consonant triads, as well as the leading tone to each key. Progressing down by fifth not only assured familiarity with the order of keys around the circle of fifths, but also prepared a later assignment of resolving dominant seventh chords down by fifth to major and minor chords: E7 to A, A7 to D, D7 to G, etc. Subsequent assignments added musical structures to the students' musical database: octave scales, intervals, and progressions. Through the combination of singing and piano performance, students internalized not only the sound of these musical structures, but also their spelling. With the sound and nomenclature unified, iconography was the final element. Sing and play activities and assessments thus became sing and notate: sing the label while notating it on the staff. The process of encoding musical notation not only anticipated dictation exercises, but also prepared the skill of decoding musical notation when sight singing.
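The key-to-key link described above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original course materials: the note representation (a letter plus a signed accidental count) and the function names `degree`, `name`, and `descending_fifths` are assumptions introduced here, and only major keys are modeled. In minor, the lowered third (me) would first have to be raised a half step to become the new leading tone.

```python
LETTERS = "CDEFGAB"
NATURAL_PC = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]  # semitones above the tonic for do..ti


def pitch_class(note):
    """Return the pitch class (0-11) of a (letter, accidental) pair."""
    letter, accidental = note
    return (NATURAL_PC[letter] + accidental) % 12


def degree(tonic, k):
    """Spell scale degree k (0 = do ... 6 = ti) of the major key on tonic."""
    letter = LETTERS[(LETTERS.index(tonic[0]) + k) % 7]
    target = (pitch_class(tonic) + MAJOR_STEPS[k]) % 12
    # Signed accidental that moves the natural letter onto the target pitch.
    accidental = (target - NATURAL_PC[letter] + 6) % 12 - 6
    return (letter, accidental)


def name(note):
    """Render a note as text, using '#' for sharps and 'b' for flats."""
    letter, accidental = note
    return letter + ("#" * accidental if accidental > 0 else "b" * -accidental)


def descending_fifths(start, count):
    """List `count` major-key tonics, each a perfect fifth below the last."""
    keys = [start]
    for _ in range(count - 1):
        keys.append(degree(keys[-1], 3))  # fa of the old key is do of the new
    return keys


keys = descending_fifths(("E", 0), 8)
for old, new in zip(keys, keys[1:]):
    triad = " ".join(name(degree(old, k)) for k in (0, 2, 4))
    mi = name(degree(old, 2))   # third scale degree of the old key
    ti = name(degree(new, 6))   # leading tone of the new key
    print(f"{name(old)} major triad {triad}: mi = {mi} -> ti of {name(new)} = {ti}")
```

Starting on E reproduces the order of the dominant-seventh example (E to A, A to D, D to G), and each printed line shows that the third of the old key and the leading tone of the new key share one spelling.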

In addition to singing, playing, and notating assignments, students memorized melodies beginning with tunes that are primarily stepwise (e.g., "Ode to Joy" and "My Country, 'Tis of Thee") and advancing to those with leaps (e.g., the openings of Mozart's Eine Kleine Nachtmusik, "Somewhere Over the Rainbow," and "Let the River Run"). Once internalized, these melodies were notated in various keys and clefs as preparation for dictating unfamiliar melodies. Similarly, students sang chord arpeggios of common progressions, often with musical excerpts in various styles, as a pre-harmonic dictation learning activity. The focus was not on how to construct a common-practice progression; that detail would be explored in theory classes. The goal was to perform and internalize harmonies in relationship to a given tonal center.

Additional in-class activities and assessments included: (1) listening to a pitch, being told its letter name, and notating it in the correct octave on the grand staff; (2) aurally discerning if two motives are the same or different, or if they are sequential, or ornamented versions of one another; and (3) determining which one of three or four notated contours matched a performance, and, vice versa, which one of three performed motives corresponded to a given notation. All were designed to unify musical sound with its nomenclature and iconography.

A guiding principle when creating this revised Aural I course was a rewording of physicist Richard Feynman's oft-quoted aphorism found on his blackboard after his death, from "What I cannot create, I do not understand" to "What I cannot sing and notate from memory, I cannot sight read or notate in dictation." While I do not explicitly quote Feynman or my revision of Feynman in class, I do demonstrate the underlying concept very early in the semester. For more than a decade, my aural skills classes have begun with me reading three literary quotes for students to write down, which replicates with words the process of taking musical dictation. The first excerpt I recite is always in English:

The weight of this sad time we must obey;
Speak what we feel, not what we ought to say.
–– Albany in Shakespeare's King Lear, Act V Scene III

The second in Spanish:

Un brazo de la noche entra por mi ventana.
An arm of the night enters my window.
–– Federico García Lorca's "Nocturnos De La Ventana"

The third in German:

Wer reitet so spät durch Nacht und Wind?
Es ist der Vater mit seinem Kind
Who rides so late through night and wind?
It is the father with his child
–– Goethe's "Erlkönig"

The students' facial response to each reading as I change languages reveals their unspoken reaction: nonchalance, surprise, and, finally, complete puzzlement.

Afterwards, we review the students' work: the product and the process. As to be expected, almost all students write out the Shakespeare quote correctly, even though a few replace "weight" with its homophone "wait," producing an alternative meaning of the text. In response to the Spanish poem, the number of completely correct answers decreases to about a third of the students. Although my classes often include students who are native Spanish-speakers, as well as those who've studied the language formally in school, the majority have had no classroom instruction in Spanish. Nonetheless, by living in an area wherein residents frequently encounter Spanish, they have been exposed to the language. When this casual familiarity combines with English-language skills, non-Spanish speakers can produce a transcription that is incorrect, yet recognizable: "Uñ brassó de la nochí in tra por mi ventãna." This never happens with the German example; rarely does any student transcribe the Goethe quotation correctly. In fact, about a quarter of the students abandon their attempt to write the German and submit an incomplete answer.

Through discussing the process of listening to and writing down the quotations, students articulate the skills they will later activate to take melodic dictation: (1) listen, hold in short-term memory, and mentally replay the excerpt while notating it, and (2) associate the sound(s) you hear with the visual symbols that represent the sound(s). Though students quickly grasp the first process, I tether our exploration of the second process to specific words in each language. With respect to the English vocabulary, I pointedly ask about "ought." No one ever admits to using the word regularly, which prompts a discussion about how they could write "ought" if it is so rarely in their spoken vocabulary. Students mention that "ought" is similar to words they do use regularly, such as "bought." I punctuate the point by saying that they will need a parallel skill both in sight singing and when taking dictation: relate something vaguely familiar to something they know well. I highlight a different process with the Spanish word "brazo," which is often misspelled as "brasso" to match its pronunciation. I compare this error to notating a D♯ as an E♭: the two pitches may sound the same, but they are spelled differently and respond differently based on the musical context. The German provides yet a different vital lesson for students to grasp about sight singing and taking dictation—a lesson that corresponds to my rewording of Feynman's aphorism: "What I cannot sing and notate from memory, I cannot sight read or notate in dictation." Teutonic phonemes and graphemes are enigmatic to most of my students. Therefore, the German word "Wind" is typically spelled "vent," the closest-sounding English word. Students stumble into this naïve error because they do not have an association between what they are hearing and its visual representation: between a sound and its iconography.
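The D♯/E♭ comparison can be made concrete with a small Python sketch. It is a hypothetical illustration only: the (letter, accidental) representation and the helper names `pitch_class` and `leading_tone` are assumptions introduced here, not anything drawn from the course. The sketch shows that the two spellings share a pitch class on the keyboard, yet only one of them is the correct spelling of the leading tone in E major.

```python
LETTERS = "CDEFGAB"
NATURAL_PC = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def pitch_class(note):
    """Return the pitch class (0-11) of a (letter, accidental) pair."""
    letter, accidental = note
    return (NATURAL_PC[letter] + accidental) % 12


def leading_tone(tonic):
    """Spell the leading tone: one letter and one semitone below the tonic."""
    letter = LETTERS[(LETTERS.index(tonic[0]) - 1) % 7]
    target = (pitch_class(tonic) - 1) % 12
    accidental = (target - NATURAL_PC[letter] + 6) % 12 - 6
    return (letter, accidental)


d_sharp = ("D", 1)   # D raised by one semitone
e_flat = ("E", -1)   # E lowered by one semitone

same_sound = pitch_class(d_sharp) == pitch_class(e_flat)
same_spelling = d_sharp == e_flat
in_e_major = leading_tone(("E", 0))

print("Same sound:", same_sound)
print("Same spelling:", same_spelling)
print("Leading tone of E major is D sharp:", in_e_major == d_sharp)
```

As with "brazo" and "brasso," the ear alone cannot settle the spelling; the surrounding key decides which symbol is correct.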

Dedicating a semester to creating the students' musical database at the beginning of the Theory & Aural sequence left Theory IV as an orphan at the end of the sequence. The scope of Aural IV, however, had not only encompassed chromaticism, but also introduced sight singing tone rows as a precursor to learning about serialism. It was necessary to integrate some aural skills into Theory IV. Nonetheless, I found great value in starting the sequence with aural skills. First, Theory I did not need to begin with a review of scales, intervals, and chords. Students were already fluent with the fundamentals, as well as the circle of fifths, key signatures, progressions, and numerous theoretical concepts. For example, by having sung chords ascending from guide tones (Stevens 2016), students had actively performed tonal voice leading parsimoniously: hold any common tone(s), then move the shortest distance. Singing dominant seventh chords as root position arpeggios that concluded with the chordal seventh resolving down by step foreshadowed part writing seventh chords correctly. Moreover, secondary dominants were easily comprehended because V7–tonic resolutions had been internalized around the circle of fifths. Most importantly, the students' sense of pitch space was well established; they were prepared to be successful when sight singing and taking dictation.

Within a larger work titled "I Am Not a Camera," British poet W. H. Auden (1972) wrote, "What we have not named / or beheld as a symbol / escapes our notice." If Auden had been an aural skills instructor, perhaps he would have posited, "What we have not sung on a label / or notated on a staff / eludes our ability to sight read or notate in dictation."


  • Alexander, F. Matthias. 1932. The Use of the Self: Its Conscious Direction in Relation to Diagnosis, Functioning and the Control of Reaction. London: Methuen.
  • Auden, W. H. 1972. Epistle to a Godson. New York: Random House.
  • Karpinski, Gary. 2000. Aural Skills Acquisition. New York: Oxford University Press.
  • Stevens, Daniel. 2016. "Symphonic Hearing: Mastering Harmonic Dictation Using the Do/Ti Test." Journal of Music Theory Pedagogy 30:111-76.

Copyright (c) 2020 Cynthia I. Gonzales

This work is licensed under a Creative Commons Attribution 4.0 International License.

Beginning with Volume 7 (2019), Engaging Students: Essays in Music Pedagogy is published under a Creative Commons Attribution 4.0 International license unless otherwise indicated.

Volumes 1 (2013) through 6 (2018) were published under a Creative Commons Attribution-ShareAlike 3.0 Unported License unless otherwise indicated.

Engaging Students: Essays in Music Pedagogy is published by The Ohio State University Libraries.

ISSN: 2689‐2871