
Measurements of Scale, Part Two


And we’re back!

Be sure to check out Measurements of Scale part one, here. Now, we’ll tackle interval and ratio measurements.

Warning: This entry is considerably longer than Part One. If you suffer from TL;DR, please use the “back” button to return to safety.

Interval Scale: It’s all about distance

In the previous article, it was stated that an ordinal scale does not indicate “[d]istance between positions” or “intrinsic value at each position”. An interval scale, however, possesses the qualities of an ordinal scale and includes information about the distance between positions.

As a data practitioner, keep in mind that distance is not a strictly geo-spatial term. For example, consider the words in the sentence “I tried to climb the hill but was too tired.”:

  • The positions at which the words appear are on an ordinal scale

  • The distances between word positions are on an interval scale:

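| | I | tried | to | climb | the | hill | but | was | too | tired |
|---|---|---|---|---|---|---|---|---|---|---|
| I | 0 | -1 | -2 | -3 | -4 | -5 | -6 | -7 | -8 | -9 |
| tried | 1 | 0 | -1 | -2 | -3 | -4 | -5 | -6 | -7 | -8 |
| to | 2 | 1 | 0 | -1 | -2 | -3 | -4 | -5 | -6 | -7 |
| climb | 3 | 2 | 1 | 0 | -1 | -2 | -3 | -4 | -5 | -6 |
| the | 4 | 3 | 2 | 1 | 0 | -1 | -2 | -3 | -4 | -5 |
| hill | 5 | 4 | 3 | 2 | 1 | 0 | -1 | -2 | -3 | -4 |
| but | 6 | 5 | 4 | 3 | 2 | 1 | 0 | -1 | -2 | -3 |
| was | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | -1 | -2 |
| too | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | -1 |
| tired | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |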
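If you'd like to reproduce the position distances yourself, here is a minimal Python sketch. The variable names are my own, and I'm assuming 0-based positions and a plain whitespace split; each entry is simply the row word's position minus the column word's position.

```python
sentence = "I tried to climb the hill but was too tired"
words = sentence.split()

# Ordinal information: the position of each word (0-based here).
positions = list(range(len(words)))

# Interval information: signed distance between every pair of positions.
# Entry [i][j] is (position of row word i) - (position of column word j).
distances = [[i - j for j in positions] for i in positions]

for word, row in zip(words, distances):
    print(f"{word:>6}", row)
```

Note that the distances would be identical if the positions started at 1, or at 100: only the gaps matter, which is the hallmark of an interval scale.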

In addition, interval scales have an infinite domain and no true zero point:

  • Latitude and longitude are on an interval scale. The “zero” location on the globe (0° long., 0° lat.) is arbitrary; it serves only as a reference point from which distance is measured.

  • Temporal demarcations (e.g., dates, timestamps, etc.) are also on an interval scale*. “Nine o’clock” is not three times as “big” as “three o’clock”. Likewise, a clock divided into 27 partitions would be just as valid as the familiar one, as long as the partitions are equal.

* Yes, one could point to cosmological theories of the origin of the universe as a rebuttal, so I ask for your indulgence.
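The clock example can be sketched with Python's standard `datetime` module, which happens to encode the same rule: subtracting two timestamps is allowed, dividing them is not. The specific date below is an arbitrary choice of mine for illustration.

```python
from datetime import datetime, timedelta

three = datetime(2024, 1, 1, 3, 0)   # "three o'clock" on an arbitrary day
nine = datetime(2024, 1, 1, 9, 0)    # "nine o'clock" on the same day
noon = datetime(2024, 1, 1, 12, 0)

# Distances between timestamps are meaningful (interval scale).
gap = nine - three
print(gap)  # 6:00:00

# Ratios of timestamps are not: Python refuses to divide one by the other.
try:
    nine / three
    ratio_allowed = True
except TypeError:
    ratio_allowed = False
print(ratio_allowed)  # False

# Durations, unlike timestamps, have a true zero, so their ratio is fine.
duration_ratio = (noon - three) / (nine - three)
print(duration_ratio)  # 1.5
```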

By way of example:

  • \(X:\) a vector of data
    [99, 95, 45, 50, 100, 60, 94, 78, 11, 68]

  • Ordinal: positions of each value in \(X\)
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

  • Interval: distances from the mean \((X - \bar{X})\)
    [29, 25, -25, -20, 30, -10, 24, 8, -59, -2]
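A short Python sketch of the same example, using only the standard library (the variable names are mine). It also checks that the gap between two values is the same whether measured on the raw data or on the deviations, since moving the reference point does not change distances.

```python
from statistics import mean

x = [99, 95, 45, 50, 100, 60, 94, 78, 11, 68]

# Ordinal: the position of each value in x (1-based, as in the text).
positions = list(range(1, len(x) + 1))

# Interval: the distance of each value from the mean of x.
x_bar = mean(x)
deviations = [value - x_bar for value in x]

print(x_bar)       # 70
print(deviations)  # [29, 25, -25, -20, 30, -10, 24, 8, -59, -2]

# Shifting the reference point leaves distances between values unchanged.
same_gap = (x[0] - x[1]) == (deviations[0] - deviations[1])
print(same_gap)    # True
```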

Division and multiplication are not meaningful on an interval scale, because there is no true zero to anchor them. Fortunately, we have our final measurement scale: the ratio scale.

Ratio

“Alas, poor Yorick! I knew him, Horatio”

Not only does the ratio scale contain the winning personality of an interval scale, it also possesses the distinguishing feature of having an absolute zero point beyond which nothing exists. Physical measurements such as weight and height are examples. Returning to our example sentence, we have the concept of string similarity, which is also on a ratio scale (Jaro-Winkler string similarity values generated with the R package stringdist):

| | I | tried | to | climb | the | hill | but | was | too | tired |
|---|---|---|---|---|---|---|---|---|---|---|
| I | 1 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0 | 0 | 0.0000 | 0.0000 |
| tried | 0 | 1.0000 | 0.5667 | 0.4667 | 0.6889 | 0.4833 | 0 | 0 | 0.5111 | 0.9333 |
| to | 0 | 0.5667 | 1.0000 | 0.0000 | 0.6111 | 0.0000 | 0 | 0 | 0.8889 | 0.5667 |
| climb | 0 | 0.4667 | 0.0000 | 1.0000 | 0.0000 | 0.4667 | 0 | 0 | 0.0000 | 0.4667 |
| the | 0 | 0.6889 | 0.6111 | 0.0000 | 1.0000 | 0.5278 | 0 | 0 | 0.5556 | 0.6889 |
| hill | 0 | 0.4833 | 0.0000 | 0.4667 | 0.5278 | 1.0000 | 0 | 0 | 0.0000 | 0.4833 |
| but | 0 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 1 | 0 | 0.0000 | 0.0000 |
| was | 0 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0 | 1 | 0.0000 | 0.0000 |
| too | 0 | 0.5111 | 0.8889 | 0.0000 | 0.5556 | 0.0000 | 0 | 0 | 1.0000 | 0.5111 |
| tired | 0 | 0.9333 | 0.5667 | 0.4667 | 0.6889 | 0.4833 | 0 | 0 | 0.5111 | 1.0000 |
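For readers without R at hand, here is a from-scratch Python sketch of the plain Jaro similarity (no Winkler prefix bonus). This is my own illustration, not the stringdist implementation; the function name `jaro_similarity` is hypothetical. The values in the table are consistent with this plain formula, for example "tried" versus "tired".

```python
def jaro_similarity(a: str, b: str) -> float:
    """Plain Jaro similarity in [0, 1], without the Winkler prefix bonus."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0

    # Characters match only if equal and no farther apart than this window.
    window = max(max(len(a), len(b)) // 2 - 1, 0)

    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        lo = max(0, i - window)
        hi = min(len(b), i + window + 1)
        for j in range(lo, hi):
            if not b_matched[j] and b[j] == ch:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break

    if matches == 0:
        return 0.0  # the absolute zero: nothing in common

    # Matched characters that line up in a different order count as
    # half-transpositions; two of them make one transposition.
    a_seq = [ch for ch, m in zip(a, a_matched) if m]
    b_seq = [ch for ch, m in zip(b, b_matched) if m]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2

    return (
        matches / len(a)
        + matches / len(b)
        + (matches - transpositions) / matches
    ) / 3


print(round(jaro_similarity("tried", "tired"), 4))  # 0.9333
print(round(jaro_similarity("to", "too"), 4))       # 0.8889
print(jaro_similarity("I", "tried"))                # 0.0
```

The comparison is case-sensitive, which is why "I" scores zero against "tried" and "tired" even though both contain a lowercase "i".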

The Importance of Absolute Zero

What makes a ratio scale so great is that it provides an objective, uniform way to communicate information about the values at each location as well as the distances between values. Count data is a great example of this. For example, consider a set of counts of unique people entering or leaving an auditorium (for the first time) through one of four entrances:

| Entrance | Entered | Exited |
|---|---|---|
| A | 12 | 1 |
| B | 6 | 0 |
| C | 0 | 4 |
| D | 1 | 6 |

Having an absolute zero point allows us to objectively state that nobody entered through Entrance C or exited through Entrance B for the first time during the measurement period. Another example is a sequence of notes played at certain volumes (measured in decibels):
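To make the point about counts concrete, here is a small Python sketch using the entrance figures from the table (the dictionary layout and names are my own). Because zero means "nobody", statements such as "twice as many" are legitimate.

```python
counts = {
    "A": {"entered": 12, "exited": 1},
    "B": {"entered": 6, "exited": 0},
    "C": {"entered": 0, "exited": 4},
    "D": {"entered": 1, "exited": 6},
}

# A true zero lets us say "nobody used this entrance".
nobody_entered = [name for name, c in counts.items() if c["entered"] == 0]
nobody_exited = [name for name, c in counts.items() if c["exited"] == 0]
print(nobody_entered, nobody_exited)  # ['C'] ['B']

# Ratios are meaningful: twice as many people entered through A as through B.
a_to_b = counts["A"]["entered"] / counts["B"]["entered"]
print(a_to_b)  # 2.0

# So are proportions of a total.
total_entered = sum(c["entered"] for c in counts.values())
print(total_entered)  # 19
```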

| Note | Freq (Hz) | Distance from C | dB |
|---|---|---|---|
| A | 220 | -1.5 | -115.73 |
| A# | 233.08 | -1 | 126.68 |
| B | 246.94 | -0.5 | 59.77 |
| C (middle) | 261.63 | 0 | -30.35 |
| C# | 277.18 | 0.5 | -5.39 |
| D | 293.66 | 1 | 6.66 |
| D# | 311.13 | 1.5 | 43.33 |
| E | 329.63 | 2 | 52.84 |

  • With an interval scale, one can correctly state that the E is two steps (distance) from the C, but it would be incorrect to state that E is twice as “high” as D.

  • With a ratio scale, one can correctly state that 68 Hz separates E and C (329.63 − 261.63) and that E has roughly 1.3 times as high a frequency as C (329.63 / 261.63 ≈ 1.26).

  • A final … note (sorry): column “dB” is on a ratio scale, but it isn’t obvious. There doesn’t appear to be an absolute zero given the negative values, so what gives? Decibels (dB) are logarithmic, and a negative logarithm corresponds to a small positive value, not a negative one. To illustrate, consider taking a piece of paper, cutting it in half, and then repeating this process many times: \(\frac{1}{2^0}, \frac{1}{2^1}, \frac{1}{2^2}, \cdots, \frac{1}{2^x}\equiv 2^{-x}: x\in [0, \infty)\). With each iteration, the halves become smaller and smaller. At some point, there would be virtually nothing left to cut, yet the size never drops below zero.
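The three points above can be sketched in a few lines of Python. One loud assumption on my part: the post doesn't say what the dB values are relative to, so I treat them as amplitude ratios against some reference level and convert with 10^(dB/20). The variable names are mine as well.

```python
freq_hz = {"C": 261.63, "D": 293.66, "E": 329.63}

# Ratio scale: both the difference and the ratio of frequencies are meaningful.
difference = freq_hz["E"] - freq_hz["C"]
ratio = freq_hz["E"] / freq_hz["C"]
print(round(difference, 2), round(ratio, 2))  # 68.0 1.26

# Decibels are logarithms. ASSUMPTION: amplitude ratio = 10 ** (dB / 20).
db_values = [-115.73, 126.68, 59.77, -30.35, -5.39, 6.66, 43.33, 52.84]
linear = [10 ** (db / 20) for db in db_values]

# Negative dB values map to ratios between 0 and 1, never below zero.
print(all(value > 0 for value in linear))  # True

# The paper-cutting analogy: 2 ** -x shrinks toward zero but stays positive.
pieces = [2 ** -x for x in range(11)]
print(pieces[0], pieces[1], pieces[10])  # 1 0.5 0.0009765625
```

Seen on the linear scale, the "missing" absolute zero reappears: it is the limit that ever more negative dB values approach.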

Congratulations, you made it! Thanks for taking this brief journey into measurements of scale. In a future post, I’ll relate measurements of scale and central tendency to satisfying analytic requests, but before that, I’ll address the deconstruction of analytic requests after I complete my research.

Until next time, I wish you much success in your journey as a data practitioner!

Life is data, but data is not life: analyze responsibly!