We see a distinct preference for denying the premise of the measurement rather than accepting a measured value of zero…

Anil Venkatesh

## Act NULLA<sup>1</sup>

It happened at the peak of remote instruction. I had just finished a Zoom session with my calculus students from my improvised living room workspace. “You mathematicians sure are obsessed with zero,” my spouse commented idly from across the room. I confess that I was a bit indignant at first. Me, obsessed with zero? But then I thought back to the content of the day’s lesson. “As $h$ approaches zero… slope of zero… zero divided by zero…” She may have had a point. Over the next few semesters, I began asking my calculus students for their opinions on the matter, only to confirm that mathematicians are widely regarded as having an unhealthy fixation on the number zero. Let’s wind the clock way back to see where all the trouble started.

## Act I

The origin of the number zero is a hotly debated topic in the history of mathematics, not least because this number serves multiple conceptually different roles. In positional (place value) notation, zero is a placeholder that distinguishes 11 from 101. This may well be the first historical use of zero, attested in the Babylonian numeral system by 700 BCE and in Chinese rod numerals around 400 BCE. Over the next thousand years, zero would gradually gain recognition as a bona fide number, culminating in the work of Brahmagupta (c. 598–c. 668 CE), which laid out the following rules of arithmetic with zero.

$$\begin{array}{ll}
a + 0 = a & a - 0 = a \\
0 + 0 = 0 & 0 - 0 = 0 \\
a \times 0 = 0 & 0 \times 0 = 0 \\
\sqrt{0} = 0 & 0 \div 0 = 0 \\
a \div 0 = \text{?} & 0 \div a = \text{?}
\end{array}$$

Brahmagupta’s rules of arithmetic with zero ($a$ is any nonzero number).

Brahmagupta clearly felt some discomfort with the interaction between zero and division, as he chose not to commit himself to the value of $\frac{a}{0}$ or even $\frac{0}{a}$ for $a\neq 0$. As I look through Brahmagupta’s rules, I can’t help but feel that the average present-day person has no greater fluency with zero than Brahmagupta did nearly 1500 years ago. When compared to the growth in global literacy rate ($<$1% to $87$%), it’s fair to marvel at how stubbornly zero has remained in the domain of the esoteric. Yes, children get a lot more practice reading than they do with the arithmetic of zero, but why? In the next three sections, I attempt to shed some light on why zero is generally excluded from our everyday experience of math.

## Act II

“If you build a straight fence 30 feet long with posts spaced 3 feet apart, how many posts do you need?” If you said 10, you’ve committed what’s nowadays known as the fencepost error: ten 3-foot gaps need 11 posts, one at each end of every gap. The problem was contemplated over 2,000 years ago by the Roman architect and engineer Vitruvius. This is an example of the off-by-one error, one of the most common logical errors in computer programming. Beyond the realms of for-loops and fence construction, we also find this error in music. My childhood violin teacher once commented that in music, 3 is half of 4. To see why, think of the Bee Gees’ famous line “Ah, ah, ah, ah, stayin’ alive, stayin’ alive.” The figure below shows that you’re halfway through the four “ah’s” when you sing the third one.<sup>2</sup> This is because we count musical rhythms by the attack or onset of the note, not its duration. The beats of the song are the fenceposts, but the actual sounds are the fence slats that connect them.

Audio waveform excerpt from “Stayin’ Alive.” The passage of four “ah’s” is highlighted with a white background. Note that the moment the third “ah” is sung is precisely halfway through this passage.
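Both counts can be checked in a few lines of Python. This is only a sketch: the helper names `posts_needed` and `onset_fraction` are made up for illustration, and the musical model assumes that all notes in the passage are equally long.

```python
def posts_needed(fence_length, spacing):
    """Posts for a straight fence: one per gap, plus one to close the far end."""
    gaps = fence_length // spacing   # the "slats": 30 ft / 3 ft = 10 gaps
    return gaps + 1                  # the "fenceposts"


def onset_fraction(note_number, total_notes):
    """Fraction of the passage already elapsed when a note begins.

    Notes are numbered from 1 and assumed to be equally long, so the
    note's onset comes after (note_number - 1) full note lengths.
    """
    return (note_number - 1) / total_notes


print(posts_needed(30, 3))     # 11, not 10
print(onset_fraction(3, 4))    # 0.5: the third "ah" starts at the halfway mark
```

The `- 1` in `onset_fraction` is the whole story: numbering the notes from 1 while measuring elapsed time from 0 is what makes 3 look like half of 4.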

The fencepost error exists because everyone learns to count starting from 1, not zero. Starting from the number 1 is a highly intuitive way to count; for one thing, it’s the only way for the count to line up with the actual quantity of objects counted! Even so, there are many sensible applications of zero-based numbering, particularly in computer science and mathematics. Your computer stores information as bytes, groups of eight 0’s and 1’s. A single byte can hold $2^8$ or 256 different states, each of them a different eight-digit binary number. How would you go about counting these 256 states? The most natural way is to label each state with the value of its binary representation, but this has us counting from $0 = 00000000$ to $255 = 11111111$, not from 1 to 256.

| Byte State | Label (Binary Value) |
|------------|----------------------|
| 00000000 | 0 |
| 00000001 | 1 |
| 00000010 | 2 |
| 00000011 | 3 |
| $\cdots$ | $\cdots$ |
| 11111101 | 253 |
| 11111110 | 254 |
| 11111111 | 255 |

Natural method of counting the different states of a byte.
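The same labeling can be generated directly. The short Python sketch below (the variable names are illustrative) lists every state of a byte and reads each one back as a base-2 number.

```python
# Enumerate every state of one byte as an eight-digit binary string.
states = [format(value, "08b") for value in range(256)]

print(len(states))         # 256 states in total
print(states[0])           # 00000000
print(states[255])         # 11111111

# Reading each state as a base-2 number recovers its label, so the
# labels run from 0 to 255 rather than from 1 to 256.
labels = [int(state, 2) for state in states]
print(labels[0], labels[-1])
```

Note that `range(256)` itself counts from 0 to 255, which is the convention this section describes.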

In a more prosaic example, the notation of 24-hour clocks uses 0 to denote the first hour of the day, just as 0 denotes the first minute of each hour and the first second of each minute. Together with the music paradox, these examples have the following thing in common: they all somehow involve counting intervals along a number line.

Mathematicians:
Count intervallic data with the first fencepost at zero.
Most People:
Count everything from 1.

The branch of mathematics dedicated to counting is combinatorics, which also lends us two of the most egregious uses of zero in the calculus classroom. When we reach Taylor series, students are subjected to the double-barreled revelation that $0! = 1$ and ${n \choose 0} = 1$. The first of these is read “zero factorial” and refers to the formula $$n! = n(n-1)(n-2)\cdots(3)(2)(1)$$ which works out-of-the-box when $n = 1, 2, 3$ and so on. Since factorials are built from multiplication, surely $0!$ is zero… right? The second expression is read “$n$ choose 0” and draws on the more general idea of ${n \choose k}$, the number of ways to select $k$ objects from an assortment of $n$ objects. So, we’re supposed to select zero objects from $n$? If we’re not taking any objects at all, isn’t it fair to say that there are no ways to do this? The notion of selecting zero objects almost feels semantically ill-posed, like stating that the King of France is bald (there is no King of France).
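For what it’s worth, Python’s standard library sides with the mathematicians on both counts. The sketch below uses only `math` and `itertools`; the list `fruit` is an invented example.

```python
import math
from itertools import combinations

print(math.factorial(0))    # 1
print(math.prod([]))        # 1: the product of no factors at all
print(math.comb(5, 0))      # 1

# There is exactly one way to choose nothing: the empty selection.
fruit = ["apple", "banana", "cherry"]
print(list(combinations(fruit, 0)))   # [()]
```

The last line is perhaps the most persuasive: the list of ways to choose zero objects is not empty, because it contains the one selection that takes nothing.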

The most common elementary arguments for these counterintuitive uses of zero make a universalizing appeal: we start with a formula that works for strictly positive numbers, then argue that the formula “wants” to take the value of 1 when the input is forced to zero. The notion that a rule that works in one case should be universalized to another case is an aesthetic judgment that mathematicians typically endorse, but the general public does not. Outside mathematics, there is no aesthetic value to summarizing a complicated decision process into a single universal rule; people much prefer to be given separate directions in case of fire, crime, health emergency, and so on. Therefore, it is distinctly unsatisfying to most students to be told to accept $0! = 1$ on the basis of a universalizing appeal. In the next section, I discuss why 0 is so often the subject of universalizing by introducing a related concept: degeneracy.
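One familiar version of such an argument, sketched here purely for illustration, starts from the recurrence that factorials obey for positive inputs and asks what value $0!$ would need in order for the rule to keep working one step further down:
\begin{align*}
n! &= n \cdot (n-1)! && \text{for } n \geq 2, \\
1! &= 1 \cdot 0! && \text{if the rule is extended to } n = 1, \\
0! &= \frac{1!}{1} = 1.
\end{align*}
Nothing forces us to extend the rule to $n = 1$; the argument only shows that $0! = 1$ is the one choice that avoids a special case.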

## Act III

My spouse started this whole inquiry with her observation about my obsession with zero, but that wasn’t the first mathematical reality check she ever gave me. Many years before, she chided me (and all mathematicians, I guess) for our habit of taking perfectly good English words and repurposing them as jargon. One of the more colorful examples of this practice is the term degeneracy, which in fairness may be the fault of the physicists. In thermodynamics, the term refers to a phenomenon that occurs in gases as the temperature drops toward absolute zero; later, degenerate matter came to describe matter in a state of maximum compression, such as the core of a neutron star. At some point, mathematicians adopted the term by analogy for what happens when a system is pushed to the point of collapsing on itself. Consider an ordinary rectangle with width of 2 and height of 1. Now shrink the height of the rectangle while leaving the width constant. As the height reaches 0, you have a degenerate rectangle on your hands, better known as a line segment. Mathematicians happily accept degenerate rectangles into the family of rectangles, but this practice of inclusivity never fails to infuriate everyone else. Again, the universalizing aesthetic is operative here. To a mathematician, it’s better to include degenerate objects rather than having to make a special case for each of them, but the cognitive affront of accepting line segments (and even single points) as rectangles is too much for most people to accept.

Nine rectangles.

When degeneracy occurs, zero tends to be present. This is because when we use numbers to quantify things in the world, we generally use non-negative scales; think for example of price, distance, weight, duration, and of course quantity. Consequently, zero ends up lurking at the extreme lower end of the scale. While some of these properties can plausibly be extended to the negative numbers, such use is not commonplace or intuitive to most. For example, accountants take care to compute assets and liabilities separately in order to keep both non-negative. Functional MRI studies have shown that negative numbers elicit stronger, yet less differentiated brain activation than positive numbers, suggesting that our neural representation of negative numbers is both less precise and more cognitively costly. Meanwhile, ordinary uses of numbers generally don’t extend very well to zero. You wouldn’t describe a destination 0 miles away or an event 0 hours in the future. As discussed in Act II, you certainly wouldn’t count zero objects. When a measurement would come up as zero, we much prefer to deny the premise of the measurement (“we’re here” rather than “we’re 0 miles from home”).<sup>3</sup> All this is to say that non-mathematicians view negative numbers with suspicion, and zero with outright hostility. The former make their brains work harder, and the latter seems to have no purpose at all.

Mathematicians:
Universalize to zero and admit degenerate cases.
Most People:
Avoid zero measurements and reject degenerate cases.

I first began thinking about non-negative scales when I noticed that my calculus students were reluctant to characterize horizontal lines as having zero slope. The definition of the slope of a line is the ratio of its change in height to a given change in width. The traffic sign “5% grade” means that a road has a slope of 0.05, so every mile you travel leads to an altitude change of 5% of a mile, or 264 feet. (Strictly, the mile in question is measured horizontally, not along the slope of the road, but this approximation yields a very similar result for realistic grade values.) Grade is another example of a non-negative scale as we expect motorists to know which way is up. However, slope in calculus class can be positive or negative depending on whether the line gains or loses altitude from left to right. If the altitude doesn’t change at all, the line is horizontal and its slope is 0. I’ve had many calculus students over the years who don’t take to either of these notions. To some, slope is just mathematical jargon for steepness (i.e., grade); these students struggle to identify whether a given line’s slope is positive or negative. Even more students have difficulty accepting that a horizontal line has a slope of zero, much preferring to state “it doesn’t have a slope.” As with the other examples of non-negative scales, we see a distinct preference for denying the premise of the measurement rather than accepting a measured value of zero.
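The arithmetic in this paragraph is easy to check. The Python sketch below is illustrative only: the function names are invented, and the “road mile” version assumes a perfectly straight incline.

```python
import math

FEET_PER_MILE = 5280


def rise_per_horizontal_mile(grade):
    """Altitude gained, in feet, over one mile measured horizontally."""
    return grade * FEET_PER_MILE


def rise_per_road_mile(grade):
    """Altitude gained, in feet, over one mile measured along the road."""
    angle = math.atan(grade)              # incline angle of the road
    return math.sin(angle) * FEET_PER_MILE


def slope(x1, y1, x2, y2):
    """Signed slope of the line through (x1, y1) and (x2, y2)."""
    return (y2 - y1) / (x2 - x1)


print(rise_per_horizontal_mile(0.05))   # about 264 feet
print(rise_per_road_mile(0.05))         # about 263.7 feet, nearly the same
print(slope(0, 3, 4, 3))                # 0.0 for a horizontal line
print(slope(0, 3, 4, 1))                # negative: the line loses altitude
```

The horizontal line gets a perfectly ordinary answer, `0.0`; the formula has no trouble with it, even if intuition does.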

## Act IV

The most ubiquitous instance of (mathematical) degeneracy in calculus class is in the computation of the slope of a tangent line. The figure below depicts the process of computing the slope of the line that is tangent to a circular arc at the point $(2,1)$. We repeatedly choose a second point elsewhere on the circle and connect it to $(2,1)$ with a secant line; as our choice of point gets closer to $(2,1)$, the resulting secant line gets closer to the true tangent line at $(2,1)$.

Computing the slope of a tangent line of a circular arc as the limit of slopes of secant lines. As the $x$-value is brought closer to $2$, the secant line between the points becomes more and more similar to the true tangent line at $x=2$.

If our choice of second point is $(x, f(x))$, then the slope of the resulting secant line is
\begin{align*}
\frac{\Delta y}{\Delta x} &= \frac{1 - f(x)}{2 - x}.
\end{align*}
We get the best approximation for the slope of the tangent line if we put the second point really close to $(2,1)$. What if we get greedy and pick $(2,1)$ as the second point? Then the formula collapses into $\tfrac{0}{0}$. This is a degenerate state because we’ve attempted to find the slope between $(2,1)$ and itself!
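To watch this happen numerically, we need a specific circle. The text doesn’t name one, so the Python sketch below assumes the upper half of $x^2 + y^2 = 5$, which passes through $(2,1)$ and has a tangent slope of $-2$ there. That choice, and the function names, are illustrative assumptions.

```python
import math


def f(x):
    """Upper half of the circle x^2 + y^2 = 5 (an assumed example)."""
    return math.sqrt(5 - x * x)


def secant_slope(x):
    """Slope of the secant line through (2, 1) and (x, f(x))."""
    return (1 - f(x)) / (2 - x)


# Slide the second point toward (2, 1) and watch the slope settle down.
for x in (1.0, 1.5, 1.9, 1.99, 1.999):
    print(x, secant_slope(x))   # the slopes approach -2

# Getting greedy: secant_slope(2.0) raises ZeroDivisionError,
# because both the numerator and the denominator are zero.
```

Every nearby point gives a sensible slope; only the degenerate choice $x = 2$ refuses to produce a number.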

This brings me to my final principle about zero: debugging. One of the hardest things to teach novice computer programmers is that error messages are their friend; when their code inevitably hits a snag, the best way to find the problem is to read the error message. Yet novice programmers tend to hastily delete the error message rather than mining it for its information. Similarly, when an error occurs in a mathematical formula, this usually means that a potentially interesting “bug” has occurred. In the case of the degeneracy of the slope formula, the bug is that we’ve attempted to find the slope between a point and itself, which makes no sense at all! The fact that this degeneracy reduces the slope formula to $\tfrac{0}{0}$ is a good thing. It’s the mathematical form of an error message that prompts you to reexamine the assumptions that led to it.

Mathematicians:
Interpret divide-by-zero errors as debugging messages.
Most People:
Avoid (fear?) divide-by-zero errors.

Incidentally, interpreting $\tfrac{0}{0}$ as a debugging message about the slope formula helps us to understand what Brahmagupta missed. Since every single tangent line yields this same mysterious fraction, there is no one value that can be ascribed to it. Instead, we are obliged to treat $\tfrac{0}{0}$ as an indeterminate form that can only resolve into its intended value once we squash the relevant bug.
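A small experiment makes the point concrete. In the Python sketch below, two different functions (chosen arbitrarily for illustration) both produce $\tfrac{0}{0}$ at $x = 2$, yet their nearby secant slopes settle on different values. The helper `difference_quotient` is a hypothetical name, not a library function.

```python
def difference_quotient(f, a, x):
    """Slope of the secant line through (a, f(a)) and (x, f(x))."""
    return (f(a) - f(x)) / (a - x)


def square(x):
    return x * x


def triple(x):
    return 3 * x


for f in (square, triple):
    try:
        difference_quotient(f, 2, 2)
    except ZeroDivisionError as error:
        # Read the error message instead of deleting it.
        print(f.__name__, "at 2:", error)
    # Back off slightly from the degenerate point.
    print(f.__name__, "near 2:", difference_quotient(f, 2, 2.001))
```

Both functions trigger the same error at $x = 2$, but near that point one slope is close to 4 and the other is close to 3. No single value assigned to $\tfrac{0}{0}$, including Brahmagupta’s 0, could serve both.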

Some Questions: Is your relationship with zero obsessive, acrimonious, or perhaps something less dramatic? Do you perceive aesthetic value in universalizing or do you prefer to exclude degenerate cases? How many “ah’s” are there in “Stayin’ Alive?” Are you sure?

## Footnotes

1. The Romans had no symbol for zero, instead writing the word nulla when necessary.
2. Perhaps even more shockingly, $2 \div 2 = 2$ in music since the beginning of the second “stayin’ alive” marks the halfway point of that passage!
3. The area of linguistics that describes the difference between these statements is pragmatics, the study of the practical use of language. Math is pragmatically poor compared to natural language, which is also why there are essentially no good math jokes.