Modular arithmetic is a special form of arithmetic that you’ve been doing all your life; you use it anytime you tell time. If it’s 7:15 right now, then in 27 hours we say it will be 10:15, not 34:15. Every 12 hours (or 24, depending on where you live), we reset the hour. This kind of *resetting every \(n\) units* can be formalized.

We say that \(x\) is congruent to \(y\) modulo \(n\) provided that \(x-y\) is a multiple of \(n\). The notation we use looks like this: for integers \(x, y, n\),

\[x \equiv y \text{ mod }{n} \Leftrightarrow x - y = kn \text{ for some integer } k\]

There are a few equivalent ways of saying the same thing:

If we divide \(x\) and \(y\) by \(n\), we get the same remainder.

\(x - y\) is a multiple of \(n\).

- Alternatively, we can say \(x - y\) is divisible by \(n\).

\(11 \equiv 7 \text{ mod }{4}\) because \(11 - 7 = 4\), which is a multiple of \(4\).

\(213 \equiv 1 \text{ mod }{53}\) because \(213 - 1 = 212\), which is divisible by \(53\).

\(13 \equiv 6 \text{ mod }{7}\) because \(13 - 6 = 7\), which is divisible by \(7\).
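The definition translates directly into a divisibility test. The following Python sketch is only an illustration; the helper name `is_congruent` is our own and is not standard notation.

```python
def is_congruent(x: int, y: int, n: int) -> bool:
    """Return True when x and y are congruent modulo n, i.e. n divides x - y."""
    return (x - y) % n == 0


# The three examples from the text.
print(is_congruent(11, 7, 4))    # 11 - 7 = 4 is a multiple of 4
print(is_congruent(213, 1, 53))  # 213 - 1 = 212 = 4 * 53
print(is_congruent(13, 6, 7))    # 13 - 6 = 7 is a multiple of 7

# A non-example: 10 - 3 = 7 is not a multiple of 4.
print(is_congruent(10, 3, 4))
```

The first three calls print `True` and the last prints `False`.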

In each of the examples above, we confirmed that two numbers were congruent modulo \(n\) for various \(n\)-values. We can also define an operation based on congruence modulo \(n\) as follows:

\(x \text{ mod }{n} = y\), where \(x\equiv y \text{ mod }{n}\) and \(0 \leq y \leq n-1\).

Another way of saying this is that \(y\) is the remainder we get when we divide \(x\) by \(n\).

This is almost exactly how we tell time. If we were really following the formal definition of modular arithmetic, we would use \(0\) rather than \(12\) when telling time.

Here are a few examples:

\(18 \text{ mod }{12} = 6\) because \(18 \div 12 = 1R6\).

\(19 \text{ mod }{7} = 5\) because \(19 \div 7 = 2R5\).

\(37 \text{ mod }{12} = 1\) because \(37 \div 12 = 3R1\).
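Most programming languages have this operation built in. As a small sketch, Python's `%` operator returns the remainder described above when the modulus is positive; the helper `clock_hour` is a hypothetical name introduced here to show the one place where a clock face departs from the formal definition.

```python
# Remainders from the examples above.
print(19 % 7)   # 5
print(37 % 12)  # 1

# Clock arithmetic: 27 hours after 7 o'clock.
print((7 + 27) % 12)  # 10


def clock_hour(h: int) -> int:
    """Map an hour count to a 12-hour clock face, which shows 12 instead of 0."""
    r = h % 12
    return 12 if r == 0 else r


print(clock_hour(24))  # 12
```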

Now you try some:

Evaluate \(34 \text{ mod }{5}\)

\[4\]

Evaluate \(53 \text{ mod }{7}\)

\[4\]

Evaluate \(39 \text{ mod }{3}\)

\[0\]

Evaluate \(44 \text{ mod }{12}\)

\[8\]

In mathematics, a *set* usually refers to an unordered collection of objects, without repetition. As you dig deeper, the definition of a set requires some nuance, but for now we’ll stick to this basic definition.^{1}

We’ll denote sets with curly braces. Here are a few examples:

\(\{1, 2, 3\}\).

\(\{x, x^2, x^3, x^4, \dots\}\) - Note, we use ellipses to indicate that there is a pattern we intend the reader to infer. In this case, we would infer that the exponent on \(x\) continues to increase by 1 with each subsequent term and that this pattern goes on forever (ad infinitum, for the Latin-loving reader).

\(\{\dots,-4,-2,0,2,4,\dots\}\)

\(\{2, 3, 5, 7, 11, 13, \dots\}\) - the set of all primes

Now, we can clarify what we mean by *unordered* and *without repetition*:

Because sets are unordered, \(\{1,2,3\} = \{2, 1, 3\} = \{3, 2, 1\}\).

Because sets don’t include repetition, \(\{1,1,1,2,3,3\} = \{1, 2, 3\}\).

To indicate that a given *element* is contained in a set, we use the character \(\in\).

\(4 \in \{1,2,3,4,5,6,7,8,9\}\) but \(13 \notin \{1,2,3,4,5,6,7,8,9\}\).

\(60 \in \{\dots,-4,-2,0,2,4,\dots\}\) but \(217 \notin \{\dots,-4,-2,0,2,4,\dots\}\).
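Python's built-in `set` type behaves the same way for finite sets, so it can be used to check these properties. This is a sketch for illustration; an infinite set such as the even integers cannot be stored element by element, so it is represented here by a membership rule (`is_even`, a name chosen for this example).

```python
# Order does not matter.
print({1, 2, 3} == {3, 2, 1})            # True

# Repeated elements collapse.
print({1, 1, 1, 2, 3, 3} == {1, 2, 3})   # True

# Membership plays the role of the "element of" symbol.
digits = set(range(1, 10))
print(4 in digits, 13 in digits)         # True False


def is_even(k: int) -> bool:
    """Membership rule for the set {..., -4, -2, 0, 2, 4, ...}."""
    return k % 2 == 0


print(is_even(60), is_even(217))         # True False
```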

Finally, we often name the sets we define with capital letters, though any variable name will do. Some of the more commonly used sets have their own special characters.

Let \(A = \{\dots,-4,-2,0,2,4,\dots\}\). Then we can say, more concisely, \(4 \in A\), \(13 \notin A\).

The set of all integers usually is denoted \(\mathbb{Z}\).

The set of all real numbers is usually denoted \(\mathbb{R}\).

As we saw in the chapter on the Pythagorean scale, the fundamental organizing principle in Western music is the *octave*. An octave is a ratio of \(2 : 1\). Two frequencies, or pitches, that are an octave apart are said to be the same note. While there are still musicians who use tuning systems like the Pythagorean tuning system (based on octaves and perfect fifths) and various other just intonation systems (based on ratios of small whole numbers), most modern music uses a tuning system called *equal temperament*.

With equal temperament, we take the octave and divide it into twelve evenly spaced notes. When we talk about musical *spacing*, we are referring to ratios/multiplication, not addition. So, if we want twelve evenly spaced notes in an octave, we need a multiplier, \(r\), such that \(r^{12} = 2\). Solving for \(r\), we get

\[\boxed{ r = \sqrt[12]{2} \approx 1.05946309 }\]

The advantage of equal temperament is that all of our notes are equally spaced; this means that a given pattern of notes can be shifted (or transposed) to start on any note without changing the pattern of ratios. This is not possible with a tuning system in which the notes are not evenly spaced.
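A quick numerical check of both claims, as a sketch: the function name `transpose` and the two starting frequencies are arbitrary choices made for this illustration.

```python
r = 2 ** (1 / 12)
print(round(r, 8))  # 1.05946309

# Twelve equal steps return to the octave (up to floating-point rounding).
print(abs(r ** 12 - 2) < 1e-12)


def transpose(freq_hz: float, n: int) -> float:
    """Move a frequency up by n equal-tempered notes."""
    return freq_hz * 2 ** (n / 12)


# The ratio produced by a 7-note jump does not depend on the starting pitch.
start_a, start_b = 220.0, 330.0  # arbitrary starting frequencies
ratio_a = transpose(start_a, 7) / start_a
ratio_b = transpose(start_b, 7) / start_b
print(abs(ratio_a - ratio_b) < 1e-12)
```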

Let’s revisit the Pythagorean 12-note scale and see what happens if we try to transpose a melody.

Root | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | Octave |
---|---|---|---|---|---|---|---|---|---|---|---|---|
\(1\) | \(\frac{256}{243}\) | \(\frac{9}{8}\) | \(\frac{32}{27}\) | \(\frac{81}{64}\) | \(\frac{4}{3}\) | \(\frac{729}{512}\) | \(\frac32\) | \(\frac{128}{81}\) | \(\frac{27}{16}\) | \(\frac{16}{9}\) | \(\frac{243}{128}\) | \(2\) |

We’ll start with the root note, jump up to the third note, and then up to the fourth note.

Root | 3 | 4 |
---|---|---|
\(1\) | \(\frac{9}{8}\) | \(\frac{32}{27}\) |

What are the ratios between these notes?

Between the root and the 3rd note: \(9 : 8\)

Between the root and the 4th note: \(32 : 27\)

Between the 3rd and the 4th notes: \(256 : 243\) (we get this by dividing) \[\frac{32}{27} \div \frac98 = \frac{32}{27} \cdot \frac89 = \frac{256}{243}\]

Now, what if we start this same pattern on the 2nd note in the Pythagorean scale? We would start on note 2, jump up to note 4, and then to note 5.

2 | 4 | 5 |
---|---|---|
\(\frac{256}{243}\) | \(\frac{32}{27}\) | \(\frac{81}{64}\) |

Let’s compute the ratios between these notes:

Between the 2nd and 4th notes: \(9 : 8\) \[\frac{32}{27}\div\frac{256}{243} = \frac98\]

Between the 2nd and 5th notes: \(19683 : 16384\) \[\frac{81}{64}\div\frac{256}{243} = \frac{19683}{16384}\]

Between the 4th and 5th notes: \(2187 : 2048\) \[\frac{81}{64}\div\frac{32}{27} = \frac{2187}{2048}\]
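These fraction computations are easy to get wrong by hand, so here is a sketch that repeats them with Python's exact `fractions.Fraction` type. The list `PYTHAGOREAN` copies the table above, and the helper `ratios` is a name introduced for this example; notes are numbered from 1 (the root), as in the text.

```python
from fractions import Fraction as F

# Pythagorean 12-note scale: index 0 is the root, index 12 is the octave.
PYTHAGOREAN = [
    F(1), F(256, 243), F(9, 8), F(32, 27), F(81, 64), F(4, 3), F(729, 512),
    F(3, 2), F(128, 81), F(27, 16), F(16, 9), F(243, 128), F(2),
]


def ratios(note_numbers):
    """Ratios between consecutive notes of a melody (notes numbered from 1)."""
    pitches = [PYTHAGOREAN[k - 1] for k in note_numbers]
    return [b / a for a, b in zip(pitches, pitches[1:])]


print(ratios([1, 3, 4]))  # [Fraction(9, 8), Fraction(256, 243)]
print(ratios([2, 4, 5]))  # [Fraction(9, 8), Fraction(2187, 2048)]
```

The first jump survives the transposition, but the second one changes from \(256 : 243\) to \(2187 : 2048\).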

Compare this with equal temperament, where everything is spaced evenly: no matter what starting note you choose, to go up by \(2\) notes, you multiply by \(2^{2/12} = 2^{1/6}\); to go up by \(5\) notes, you multiply by \(2^{5/12}\); to go up by \(n\) notes, you multiply by \(2^{n/12}\). This means that any melody (or chord) can be transposed without changing the pattern of ratios that makes it up. From this perspective, equal temperament seems like the obvious choice for building instruments. But there’s a cost.

As we saw in previous sections, the naturally occurring overtones of any pitched sound are integer multiples of the pitch we hear, so an equal temperament instrument is, in some sense, out of tune with nature.

Consider the perfect fifth: a ratio of \(3:2\). This was the ratio Pythagoras used to build his entire tuning system. It’s also the ratio between the first and second overtones.^{2} The perfect fifth is the eighth note in a twelve-note scale, which means we go up seven notes from our root. To go up by seven notes in equal temperament, we multiply by \(r\) seven times:

\[r^7 = {(\sqrt[12]{2})}^7 = 2^{7/12} \approx 1.49830708\]

The percentage error here might not seem like much:

\[\text{\% error } = \frac{\frac{3}{2} - 2^{7/12}}{\frac{3}{2}} \approx 0.00112862 \approx \boxed{ 0.11 \%}\]

But, because every note is a little off, once we start playing several notes together, the differences can become quite noticeable. For now, let’s look at the differences between equal temperament and just intonation with respect to some of the other intervals.

The fifth note in a twelve-note scale is called the *major third*. As we saw in the last chapter, the just intonation ratio for a major third is \(5 : 4\). How does this compare to the equal temperament major third (go up by four notes)? Compute the percentage error.

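\[\frac{\frac{5}{4} - 2^{4/12}}{\frac{5}{4}} \approx -0.00793684 \approx -0.79\%\]

The negative sign means the equal temperament major third is slightly wider than the just one; the size of the error is about \(0.79\%\).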

Once again, this might seem like a very tiny error. Later in this chapter, we’ll explore these subtle differences. You may be surprised by how audible these tiny variations actually are.
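Both percentage errors can be reproduced with a few lines of Python. This is a sketch; the function name `percent_error` is our own, and it keeps the sign of the error so that a negative result means the equal temperament interval is wider than the just one.

```python
def percent_error(just_ratio: float, notes_up: int) -> float:
    """Signed error of the equal-tempered interval relative to the just ratio, in percent."""
    equal_tempered = 2 ** (notes_up / 12)
    return 100 * (just_ratio - equal_tempered) / just_ratio


print(round(percent_error(3 / 2, 7), 2))  # perfect fifth: 0.11
print(round(percent_error(5 / 4, 4), 2))  # major third: -0.79
```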

Most of the music you’ve ever heard is made up of just twelve notes. When we say there are just twelve notes, what we mean is that there are twelve unique notes in an octave. We can stitch together as many octaves as we like to create many pitches. The standard starting point in the US is to give the frequency 440 Hz the name A. Between that A and the A one octave up (880 Hz), we have twelve notes. We can then go up and down by as many octaves as we like to create lots of pitches. Take a piano, for example. It has 88 keys, each generating a unique pitch (or frequency), but each of these 88 pitches corresponds to one of twelve notes.

Let’s take a look at the modern convention for naming the twelve notes in our equal temperament system. At first, the way we do this is going to seem a little strange. You might ask yourself ‘why not use twelve different letters for twelve different notes?’ Keep in mind that much of what we do in the Western musical tradition is based on the major scale, which has only seven notes.

To name the twelve notes, we use the letters A through G and modifiers like \(\sharp\) (sharp) and \(\flat\) (flat). The \(\sharp\) symbol means ‘one note above’ while the \(\flat\) symbol means ‘one note below.’ Let’s list the notes here and then say a little bit about what’s happening:

| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| A | A\(\sharp\) | B | C | C\(\sharp\) | D | D\(\sharp\) | E | F | F\(\sharp\) | G | G\(\sharp\) |
|   | B\(\flat\) |   |   | D\(\flat\) |   | E\(\flat\) |   |   | G\(\flat\) |   | A\(\flat\) |

Notice that most of the letters have a note in between them with the exception of B-C and E-F. For the ‘in-between’ notes, we have two (or more) ways of labeling them. E.g., the note between A and B can be called A\(\sharp\) or B\(\flat\). This is definitely confusing at first, but we’ll soon see how this note-naming convention makes communicating and recalling musical information easy, once we get the hang of it.

It’s pretty convenient that we have twelve notes and twelve hours on the clock. In both cases, we’re doing arithmetic modulo 12. In order for modular arithmetic to work here, we want to start counting at \(0\) (like a good computer scientist), and go up to 11. So, the table we made above should look like this instead:

| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| A | A\(\sharp\) | B | C | C\(\sharp\) | D | D\(\sharp\) | E | F | F\(\sharp\) | G | G\(\sharp\) |
|   | B\(\flat\) |   |   | D\(\flat\) |   | E\(\flat\) |   |   | G\(\flat\) |   | A\(\flat\) |

Remembering which notes don’t have ‘in-between’ notes is helpful for identifying where the notes fall on a standard keyboard/piano:

The white keys are the ‘unadorned’ notes A-G.

The black keys are all sharps and flats.

B and C come after the group of three black keys, before the group of two.

You’ve likely heard musical terms like *chord*, *scale*, and *key*. In this section, we are going to define these terms and see how these musical objects are built using simple mathematical patterns.

We’ll define a chord to be *a set of two or more notes played together*.

We’ll begin by defining four primary triads - chords with three notes. There are four kinds: major, minor, diminished, and augmented. The definition of each is as follows:

MAJOR TRIAD: \(\{n, n+4 \text{ mod } 12 , n+7 \text{ mod } 12\}\)

MINOR TRIAD: \(\{n, n+3 \text{ mod } 12, n+7 \text{ mod } 12\}\)

DIMINISHED TRIAD: \(\{n, n+3 \text{ mod } 12, n+6 \text{ mod } 12\}\)

AUGMENTED TRIAD: \(\{n, n+4 \text{ mod } 12, n+8 \text{ mod } 12\}\).

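Here, \(n\) is the *root note* of the triad. From this point on, we number the notes with \(C = 0\) (so C\(\sharp\) is \(1\), D is \(2\), and so on up to B, which is \(11\)), rather than starting from A as in the table above.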

Another way of defining the four kinds of triads is by thinking solely in terms of the *jumps* between the notes. You may have noticed that the differences between the notes in any triad are always either 3 or 4.

MAJOR TRIAD: +4 +3

MINOR TRIAD: +3 +4

DIMINISHED TRIAD: +3 +3

AUGMENTED TRIAD: +4 +4

Let’s build the four primary triads with a root of C:

C-Major: \(\{0, 4, 7\} = \{C, E, G\}\)

C-Minor: \(\{0, 3, 7\} = \{C, E\flat, G\}\)

C-Diminished: \(\{0, 3, 6\} = \{C, E\flat, G\flat\}\)

C-Augmented: \(\{0, 4, 8\} = \{C, E, G\sharp\}\)

Since all of the note numbers were between 0 and 11, we didn’t have to apply modulo 12.
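The "jumps" description is convenient to automate. In the sketch below, `TRIAD_JUMPS` and `triad` are names chosen for this example, and notes are numbered with \(C = 0\). A list is used instead of a set only so that the root stays first when printed.

```python
TRIAD_JUMPS = {
    "major": (4, 3),
    "minor": (3, 4),
    "diminished": (3, 3),
    "augmented": (4, 4),
}


def triad(root: int, quality: str) -> list[int]:
    """Stack the two jumps on the root and reduce modulo 12."""
    first, second = TRIAD_JUMPS[quality]
    return [root % 12, (root + first) % 12, (root + first + second) % 12]


for quality in TRIAD_JUMPS:
    print(quality, triad(0, quality))  # root C = 0

# A root near the top of the range wraps around past 11.
print(triad(7, "major"))  # root G = 7 gives [7, 11, 2]
```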

Let’s do the same for A. In this case, \(n = 9\). This time we’ll have to apply modulo 12. Of course, you can also just draw a musical clock and count your way around.

A-Major: \[\{9, 13 \text{ mod } 12, 16 \text{ mod } 12\} = \{9, 1, 4\} = \{A, C\sharp, E\}\]

A-Minor: \[\{9, 12 \text{ mod } 12, 16 \text{ mod } 12\} = \{9, 0, 4\} = \{A, C, E\}\]

A-Diminished: \[\{9, 12 \text{ mod } 12, 15 \text{ mod } 12\} = \{9, 0, 3\} = \{A, C, E\flat\}\]

A-Augmented: \[\{9, 13 \text{ mod } 12, 17 \text{ mod } 12\} = \{9, 1, 5\} = \{A, C\sharp, E\sharp\}\]

At this point, you might be asking why we used \(C\sharp\) rather than \(D\flat\) or \(E\sharp\) rather than \(F\). Notice that for all four of the A triads, we used the letters A, C, and E. In doing so, we went with *every other letter* in our musical alphabet, skipping over B and D in this case. This is a pattern we’ll follow in naming all triads: use every other letter. For example, I know that any kind of D triad will use the letters D, F, and A; it’s just a matter of adding the appropriate sharps and flats. This is helpful in learning and memorizing the notes in chords and is also essential when writing out traditional musical notation.
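The every-other-letter rule can be turned into a small spelling routine. This is one possible sketch, not a standard algorithm: all names (`NATURAL`, `pitch_class`, `spell`, `spell_triad`) are introduced here, notes are numbered with \(C = 0\), and the ASCII characters `#` and `b` stand in for \(\sharp\) and \(\flat\).

```python
LETTERS = "CDEFGAB"
NATURAL = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}  # C = 0


def pitch_class(name: str) -> int:
    """Note number of a name such as 'A', 'F#' or 'Bb'."""
    shift = name.count("#") - name.count("b")
    return (NATURAL[name[0]] + shift) % 12


def spell(letter: str, target: int) -> str:
    """Write note number `target` using the given letter plus sharps or flats."""
    offset = (target - NATURAL[letter] + 6) % 12 - 6  # signed distance in -6..5
    return letter + ("#" * offset if offset >= 0 else "b" * -offset)


def spell_triad(root: str, jumps: tuple[int, int]) -> list[str]:
    """Spell a triad on every other letter, starting from the root's letter."""
    start = LETTERS.index(root[0])
    letters = [LETTERS[(start + 2 * i) % 7] for i in range(3)]
    base = pitch_class(root)
    targets = [base, base + jumps[0], base + jumps[0] + jumps[1]]
    return [spell(l, t % 12) for l, t in zip(letters, targets)]


print(spell_triad("A", (4, 3)))   # ['A', 'C#', 'E']
print(spell_triad("A", (4, 4)))   # ['A', 'C#', 'E#']
print(spell_triad("Bb", (4, 3)))  # ['Bb', 'D', 'F']
```

Fixing the letters first and then choosing the accidentals is what forces \(E\sharp\) rather than \(F\) in the augmented triad.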

Build each of the following triads:

D-minor

\[\{D, F, A\}\]

F\(\sharp\)-diminished

\[\{F\sharp, A, C\}\]

B\(\flat\)-major

\[\{B\flat, D, F\}\]

G-minor

\[\{G, B\flat, D\}\]

E-augmented

\[\{E, G\sharp, B\sharp\}\] Note: This is a weird situation in which convention tells us to use B\(\sharp\), which is the same pitch as C, so that the triad still uses every other letter (E, G, B).

We can build on the four basic triads by continuing to add notes in increments of 3 or 4. The names we use for these chords will make more sense once we’ve discussed scales, but here are a few examples (try playing them yourself to hear what they sound like).

MAJOR SEVENTH CHORD: \(\{n, n+4, n+7, n+11\}\) or \(+4, +3, +4\). E.g., F-major-seventh (Fmaj7): F-A-C-E.

MINOR SEVENTH CHORD: \(\{n, n+3, n+7, n+10\}\) or \(+3, +4, +3\). E.g., E-minor-seventh (Em7): E-G-B-D.

MINOR NINTH CHORD: \(\{n, n+3, n+7, n+10, n+14\}\) or \(+3, +4, +3, +4\). E.g., D-minor-ninth (Dm9): D-F-A-C-E.

You may have noticed that these seventh and ninth chords contain multiple triads. For example, the Dm9 chord contains a D-minor, an F-major, and an A-minor.
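To see where those inner triads come from, it helps to look at each pair of consecutive jumps. The sketch below uses names of our own (`stack`, `TRIADS`, `inner_triads`) and numbers notes with \(C = 0\), so D is \(2\).

```python
def stack(root: int, jumps: list[int]) -> list[int]:
    """Start at the root and add each jump in turn, reducing modulo 12."""
    notes = [root % 12]
    total = root
    for jump in jumps:
        total += jump
        notes.append(total % 12)
    return notes


TRIADS = {(4, 3): "major", (3, 4): "minor", (3, 3): "diminished", (4, 4): "augmented"}


def inner_triads(jumps: list[int]) -> list[tuple[int, str]]:
    """For each pair of consecutive jumps, report (starting position, triad quality)."""
    return [(i, TRIADS[(a, b)]) for i, (a, b) in enumerate(zip(jumps, jumps[1:]))]


dm9_jumps = [3, 4, 3, 4]
print(stack(2, dm9_jumps))      # [2, 5, 9, 0, 4], i.e. D-F-A-C-E
print(inner_triads(dm9_jumps))  # [(0, 'minor'), (1, 'major'), (2, 'minor')]
```

Positions 0, 1, and 2 of the Dm9 chord are D, F, and A, which gives the D-minor, F-major, and A-minor triads.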

Like chords, scales are just sets of notes. We’ve already encountered one very important scale: the major scale. In the Pythagorean tuning system, we get the notes of the major scale through repeated application of the ratios \(3:2\) and \(2:1\). In doing so, we got the seven notes of the major scale out of order: We started with C, then got G (its perfect fifth), then D, then A, and so on. In tuning systems like the Pythagorean system or Just Intonation, the spacing between notes is not equal, so defining scales as a pattern of intervals (i.e., by counting notes) doesn’t quite work.

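Consider the full twelve-note Pythagorean tuning system. Suppose we want to use an interval of 1 note as part of our scale pattern. If we go from the root to note 2, we go up by a factor of \(\frac{256}{243}\).

What happens when we go from note 2 to note 3 (another jump of 1 note)?

\[\frac{9}{8}\cdot\frac{243}{256} =\frac{2187}{2048}\]

The same one-note jump gives a different ratio, \(\frac{2187}{2048}\) instead of \(\frac{256}{243}\), so counting notes does not pin down an interval in this system.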

Root | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | Octave |
---|---|---|---|---|---|---|---|---|---|---|---|---|
\(1\) | \(\frac{256}{243}\) | \(\frac{9}{8}\) | \(\frac{32}{27}\) | \(\frac{81}{64}\) | \(\frac{4}{3}\) | \(\frac{729}{512}\) | \(\frac32\) | \(\frac{128}{81}\) | \(\frac{27}{16}\) | \(\frac{16}{9}\) | \(\frac{243}{128}\) | \(2\) |
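Running through every one-note step in this table makes the unevenness explicit. The sketch below copies the table into a list (`PYTHAGOREAN`, a name chosen here) and counts how often each step size occurs.

```python
from collections import Counter
from fractions import Fraction as F

# Pythagorean 12-note scale from the table: root first, octave last.
PYTHAGOREAN = [
    F(1), F(256, 243), F(9, 8), F(32, 27), F(81, 64), F(4, 3), F(729, 512),
    F(3, 2), F(128, 81), F(27, 16), F(16, 9), F(243, 128), F(2),
]

# Ratio of each note to the one before it.
steps = [b / a for a, b in zip(PYTHAGOREAN, PYTHAGOREAN[1:])]

for size, count in Counter(steps).items():
    print(size, count)
```

This prints `256/243 7` and `2187/2048 5`: two different step sizes, where equal temperament has a single one.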

Now, let’s define the major scale in terms of a pattern of intervals.

If we begin on note \(n\), the \(n\)-major scale is the following set of notes:

\[\{n, n+2, n+4, n+5, n+7, n+9, n+11\} \; \text{mod } 12\]

Here are a few major scales:

C-major: We start with \(C \rightarrow n = 0\): \[\begin{aligned} \text{C-major } &= \{n, n+2, n+4, n+5, n+7, n+9, n+11\}\text{ mod } 12\\ &= \{0, 0+2, 0+4, 0+5, 0+7, 0+9, 0+11\}\text{ mod } 12\\ &= \{0,2,4,5,7,9,11\}\\ &= \boxed{\{C, D, E, F, G, A, B\}} \end{aligned}\]

E-major: We start with \(E \rightarrow n=4\): \[\begin{aligned} \text{E-major } &= \{n, n+2, n+4, n+5, n+7, n+9, n+11\}\text{ mod } 12\\ &= \{4, 4+2, 4+4, 4+5, 4+7, 4+9, 4+11\}\text{ mod } 12\\ &= \{4, 6, 8, 9, 11, 13, 15\} \text{ mod } 12\\ &= \{4, 6, 8, 9, 11, 1, 3\}\\ &= \boxed{\{E, F\sharp, G\sharp, A, B, C\sharp, D\sharp\}} \end{aligned}\]

B\(\flat\)-major: We start with \(B\flat \rightarrow n = 10\): \[\begin{aligned} \text{B}\flat\text{-major } &= \{n, n+2, n+4, n+5, n+7, n+9, n+11\}\text{ mod } 12\\ &= \{10, 10+2, 10+4, 10+5, 10+7, 10+9, 10+11\}\text{ mod } 12\\ &= \{10, 12, 14, 15, 17, 19, 21\} \text{ mod } 12\\ &= \{10, 0, 2, 3, 5, 7, 9\}\\ &= \boxed{\{B\flat, C, D, E\flat, F, G, A\}} \end{aligned}\]

Notice that for scales, we use each letter exactly once. This is why we use just seven letters to name notes: the major scale contains only seven notes. The naming rule for (most) seven-note scales - of which the major scale is just one example - is to use each letter once.

For E-major, our second note was \(6\). This can be an F\(\sharp\) or a G\(\flat\). Since we started with E, the next note in the scale must use \(F\), so we go with F\(\sharp\).

Similarly, for B\(\flat\), we used E\(\flat\) rather than D\(\sharp\) for the fourth note in the scale because we had already used the letter \(D\) for the third note.
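The interval pattern and the each-letter-once rule combine into a short routine. This is a sketch under the same conventions as the text (\(C = 0\)); the names `NATURAL`, `MAJOR_OFFSETS`, and `major_scale` are introduced for this example, and `#` and `b` stand in for \(\sharp\) and \(\flat\).

```python
LETTERS = "CDEFGAB"
NATURAL = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}  # C = 0
MAJOR_OFFSETS = [0, 2, 4, 5, 7, 9, 11]


def major_scale(root: str) -> list[str]:
    """Spell the major scale on `root` (e.g. 'E' or 'Bb'), using each letter once."""
    start = LETTERS.index(root[0])
    base = NATURAL[root[0]] + root.count("#") - root.count("b")
    names = []
    for degree, offset in enumerate(MAJOR_OFFSETS):
        letter = LETTERS[(start + degree) % 7]           # next letter in order
        target = (base + offset) % 12                    # note number mod 12
        shift = (target - NATURAL[letter] + 6) % 12 - 6  # signed distance in -6..5
        names.append(letter + ("#" * shift if shift >= 0 else "b" * -shift))
    return names


print(major_scale("C"))   # ['C', 'D', 'E', 'F', 'G', 'A', 'B']
print(major_scale("E"))   # ['E', 'F#', 'G#', 'A', 'B', 'C#', 'D#']
print(major_scale("Bb"))  # ['Bb', 'C', 'D', 'Eb', 'F', 'G', 'A']
```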

Write out the notes for each of the following scales:

G-major

\[\{G, A, B, C, D, E, F\sharp\}\]

D-major

\[\{D, E, F\sharp, G, A, B, C\sharp\}\]

E\(\flat\)-major

\[\{E\flat, F, G, A\flat, B\flat, C, D\}\]

You may have noticed that each major scale can be written with just sharps or just flats. This observation provides us with a very nice way to organize the twelve major scales. We’ll revisit this in a later section.

Given any starting note, there are twelve possible intervals, or jumps, that we can make before returning to the octave:

\[+0, +1, +2, +3, +4, +5, +6, +7, +8, +9, +10, +11\]

Rather than working numerically, musicians give each possible interval a name. This may seem confusing at first, but it provides us with more ways of describing the relationships between notes in chords and scales. Here’s the list of intervals within an octave:

Interval/Jump | Name | Short |
---|---|---|
\(+0\) | Unison | P1 |
\(+1\) | Minor Second | m2 |
\(+2\) | Major Second | M2 |
\(+3\) | Minor Third | m3 |
\(+4\) | Major Third | M3 |
\(+5\) | Perfect Fourth | P4 |
\(+6\) | Tritone | d5 or A4 |
\(+7\) | Perfect Fifth | P5 |
\(+8\) | Minor Sixth | m6 |
\(+9\) | Major Sixth | M6 |
\(+10\) | Minor Seventh | m7 |
\(+11\) | Major Seventh | M7 |
\(+12\) | Octave | P8 |

Let’s look at how some of these interval names can help us remember the patterns in some common scales and chords.

Notice that the major scale is made up of only major and perfect intervals:

{P1, M2, M3, P4, P5, M6, M7}

Compare that with the *natural minor scale*:

{P1, M2, m3, P4, P5, m6, m7}

One more: the harmonic minor scale:

{P1, M2, m3, P4, P5, m6, M7}

In terms of intervals, the natural minor and harmonic minor differ only in their seventh note (sometimes referred to as the seventh degree); whereas the natural minor has a minor seventh, the harmonic minor has a major seventh.
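Interval names and numeric jumps are two views of the same data, and a lookup table converts between them. In this sketch the dictionary `SEMITONES` copies the interval table, while `offsets` and `differing_degrees` are helper names of our own.

```python
SEMITONES = {
    "P1": 0, "m2": 1, "M2": 2, "m3": 3, "M3": 4, "P4": 5, "A4": 6, "d5": 6,
    "P5": 7, "m6": 8, "M6": 9, "m7": 10, "M7": 11, "P8": 12,
}

SCALES = {
    "major": ["P1", "M2", "M3", "P4", "P5", "M6", "M7"],
    "natural minor": ["P1", "M2", "m3", "P4", "P5", "m6", "m7"],
    "harmonic minor": ["P1", "M2", "m3", "P4", "P5", "m6", "M7"],
}


def offsets(intervals):
    """Convert interval names to jumps above the root."""
    return [SEMITONES[name] for name in intervals]


def differing_degrees(a, b):
    """Scale degrees (counted from 1) at which two named scales differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(SCALES[a], SCALES[b])) if x != y]


print(offsets(SCALES["major"]))                              # [0, 2, 4, 5, 7, 9, 11]
print(offsets(SCALES["natural minor"]))                      # [0, 2, 3, 5, 7, 8, 10]
print(differing_degrees("natural minor", "harmonic minor"))  # [7]
```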

Major and minor chords both have the pattern **root-third-fifth**. The difference between them is that a major chord contains the major third of the root and a minor chord contains the minor third of the root.

Major Chord: {P1, M3, P5}

Minor Chord: {P1, m3, P5}

^{1} To see why this colloquial definition can be problematic, consider the set, \(A\), which contains all sets that do not contain themselves. If \(A\) doesn’t contain itself, then \(A\) must contain itself. But if \(A\) contains itself, then \(A\) is a set that doesn’t contain itself. This is the same kind of circular logic we end up with when we write, “This statement is false.” If it’s true, then it’s false. But if it’s false, then it’s true! All this is to say that mathematicians have had to refine their definition of a set to avoid such paradoxes and logical loops.

^{2} Remember, a pitched sound with frequency \(f\) creates overtones with frequencies of \(2f, 3f, 4f, 5f, \dots\). The ratio between the overtones \(2f\) and \(3f\) is \(3 : 2\).