A recent reddit post posed the question of whether there exists an associative metric on the non-negative reals.  I didn’t see it until after most people moved on, so I’m posting my thoughts on the subject here.  It is likely that none of this is new, but perhaps some people will find it interesting to have it collected in one place.

As a brief review just to collect the relevant definitions in one place… a function $d : X^2 \to \mathbb{R}_{\geq 0}$ is a metric if:

1. $d(x,y) = 0$ when $x = y$, and $d(x,y) > 0$ when $x \neq y$.  This is called definiteness.
2. $d(x,y) = d(y,x)$.  This is called symmetry.
3. $d(x,y) \leq d(x,z) + d(z,y)$.  This is the triangle inequality.

In particular, if $X = \mathbb{R}_{\geq 0}$, then $d$ is a binary operation.  One may then ask whether it is associative.  That is, if $d(x, d(y,z)) = d(d(x,y),z)$.  The problem is whether there exists a metric $d$ on the non-negative reals that is in fact associative.

# A Key Observation

To get started, let’s just play some games with manipulating the axioms above.  Let’s consider any non-negative real number $x$, and let $k = d(0,x)$.  That is, $k$ is the distance of $x$ from the origin.  Then something interesting happens.  By the associative property we’ve just discussed, we have:

$d(0, d(x,k)) = d(d(0,x), k)$

But remember, $d(0,x)$ on the right-hand side of that equation is the same as $k$, and by the definiteness property of a metric, $d(k,k) = 0$.  So we really have:

$d(0, d(x,k)) = d(k,k) = 0$

Now, applying definiteness again in the other direction, we can conclude that $0 = d(x,k)$, and then applying it yet a third time, that $x = k$.  In other words, we have:

Proposition 1: Suppose $d$ is an associative metric on the non-negative reals.  Then for all non-negative reals $x$, $d(0,x) = x$.

This is not an unusual property of a metric on the reals.  For example, the standard metric $d(x,y) = |x-y|$ has this property for all non-negative numbers.  However, it’s not true of all metric (not the discrete metric, for example.)  Note that so far, we have used only associativity and definiteness.  The really interesting property of metric, though, is the triangle inequality.  So let’s apply that to try to get some bounds on the possible choices of $d$, by decomposing distances into distances from the origin.  Suppose we choose $x$ and $y$ as arbitrary non-negative reals.  Then we have:

1. $d(0,x) \leq d(0,y) + d(y,x)$
2. $d(0,y) \leq d(0,x) + d(x,y)$
3. $d(x,y) \leq d(x,0) + d(0,y)$

Now, just using symmetry and proposition 1 to simplify those expressions, we get:

Proposition 2: Suppose $d$ is an associative metric on the non-negative reals.  Then for all non-negative reals $x$ and $y$,$|x-y| \leq d(x,y) \leq x + y$.

These are not very tight bounds on the possible values of $d$, really.  But they do tell us something interesting: namely, that a metric that works for us will necessarily have something to do with the magnitudes of the numbers involved.  This was not necessarily a given — one can imagine metrics on many sets of numbers that are entirely unrelated to the magnitudes of the numbers involved.  It turns out that in our case, our willingness to interchange the value of the metric with its inputs via the associative property somehow forces the metric to give significance to the magnitude of the numbers.

# Group Structure

The original post to Reddit made the casual observation that it can be shown that if this is true, then $\mathbb{R}_{\geq 0}$ is a group under this operation.  To review another basical definition, a group is any set together with a binary operation (which I’ll call $\bullet$), which satisfies:

1. Associativity: that is, $x \bullet (y \bullet z) = (x \bullet y) \bullet z$.
2. Identity: that is, there exists $e$ such that for all $x$, $e \bullet x = x \bullet e = x$.
3. Inverses: that is, for any $x$, there is a $x^{-1}$ such that $x \bullet x^{-1} = x^{-1} \bullet x = e$.

As a quick change of notation (so that we’re more consistent with the infix notation normally used for groups), we define $x \bullet y = d(x,y)$.  Of course, the associativity of $\bullet$ was something we assumed from the very beginning.  Proposition 1, together with the symmetry property, give us the identity, which is zero.  Definiteness gives us inverses: for any $x$, we know $d(x,x) = 0$, so that $x$ is its own inverse.  Therefore, $\langle \mathbb{R}_{\geq 0}, \bullet\rangle$ is a group.

A good bit could be said about the structure of this group.  In fact, that every element is its own inverse gives us very specific information about what this group looks like (it’s built from copies of $\mathbb{Z}_2$, for example.)  But even just knowing that it’s a group gives us some interesting ways to think about the problem.  The bounds we found in the previous section were actually compatible with the standard metric on the real numbers.  But seen as a group operation, our associative metric starts to develop some interesting properties that are somewhat different from the standard metric.

For example: suppose you give me a non-negative real $x$, and ask me what other non-negative reals are a distance $y$ away from it.  Because distance is a group operation, there is precisely one answer to this question: if $x \bullet z = y$, then $z = x^{-1} \bullet y$ (and in this case, since every number is its own inverse, that’s the same as $x \bullet y$).  In other words the function $d_x(y) = d(x,y)$ is necessarily one-to-one, for any $x$.

That’s a pretty significant statement!  In fact, it’s fairly simple now to show that for $x \neq 0$, $d_x$ therefore cannot be a continuous function in the topology induced by the standard metric $|x-y|$.  We might previously have hoped that perhaps the metric that worked would look something like the standard metric, though perhaps stretched or distorted in some way.  In fact, the earlier proposition gave us extra hope, by limiting the amount by which our metric can differ from the standard metric.  It’s now clear, though, that an associative metric will be quite  different from the standard metric.

(As an aside, it’s worth noting that $d_x$ above is continuous in the topology induced by the metric $d$.  This is basically equivalent to requiring that for any $\epsilon > 0$, there exists $\delta > 0$ such that whenever $y \bullet y_0 < \delta$, we also have $d_x(y) \bullet d_x(y_0) < \epsilon$.  But $d_x(y) \bullet d_x(y_0) = x \bullet y \bullet x \bullet y_0 = y \bullet y_0$, so choosing $\delta = \epsilon$ always works.)

# Where Next?

Who knows.  We have two conclusions now that seem rather at odds with each other: first, then an associative metric would need to overestimate the standard metric by a bounded (though, admittedly, very loosely bounded) amount.  Second, we have that in terms of topological properties, our metric differs significantly.  Resolving these tensions — or showing that they can’t be resolved — would perhaps be an important next step in addressing the problem.

1. Twan van Laarhoven / Jun 15 2009 2:27 pm

I think I have something that almost works. Define d(a,b) as the digit-wise xor (aka sum mod 2) of a and b. So for example:

d(1/3,2/3) = d(0.010101..,0.101010..) = 0.111111.. = 3/3
d(1/2,1/3) = 0.11010101.. = 5/6
d(1/4,1/3) = 0.00010101.. = 1/12

The problem is of course that the binary expansion of real numbers is not uniquely defined, 0.01111.. = 0.10000.., so you get also

d(1/2,1/3) = 0.00101010.. = 1/6

Maybe we can just pick one of these? Say, the smallest?
Let’s try that.

d(1/2,1/3) = d(0.011111,0.0101010) = 0.00101010 = 1/6
or d(0.100000,0.0101010) = 0.11010101 = 5/6
d(1/2,5/6) = d(0.011111,0.1101010) = 0.10101010 = 2/3
or d(0.100000,0.1101010) = 0.01010101 = 1/3

We can’t have both d(1/2,1/3)=1/6 and d(1/2,5/6)=1/3, since then

1/6 = d(1/2,1/3) = d(1/2,d(1/2,5/6)) = d(d(1/2,1/2),5/6) = d(0,5/6) = 5/6

Maybe we should consistently pick either .011111 or .100000, let’s say we go with the latter. Then d(1/2,1/3) = 5/6 and d(1/2,5/6) = 1/3. But what about

d(1/2,1/6) = d(0.100000,0.0010101) = 0.101010101 = 2/3

So far so good, but now

d(1/3,2/3) = d(0.010101,0.1010101) = 0.111111111 = 1

but

d(1/3,1) = d(0.010101,1.0000000) = 1.010101010 = 4/3

The problem is of course that we switched representation for 0.111111111 to 1.0000000.

• Jade NB / Jun 17 2009 1:43 pm

@Twan: Notice that your operation mirrors very directly the observation made about the structure of the group in question, namely, that it is ‘built up’ from copies of the cyclic group of order 2. Of course, one has to be a little careful with the meaning of ‘built up’, since we don’t have a finitely generated group and so can’t invoke the fundamental theorem; but the fact that your proposed metric is such a natural reflection of an underlying algebraic structure suggests that it must be nearly the right thing.

By the way, note also that the fact that all elements of this putative group square to the identity already gives us the symmetry (since xyxy = (xy)^2 = e = x^2 y^2 = xxyy, so that xy = yx), so that we don’t even need all the axioms of a metric!

• Matthew / Jun 22 2009 4:07 am

So we can check the vector space axioms to show that the group is a vector space over Z2. Then by a cardinality argument its dimension must be the cardinality of the reals, 2^omega. So that pins it down up to isomophism; we know exactly what the group structure must be, the problem is in how to map it to the reals so that the group operation respects the triangle inequality.

Would it help to find a basis for this vector space, using the axiom of choice?

• cdsmith / Jun 22 2009 11:16 am

Actually replying to Matthew’s comment. I’m not 100% certain that simply knowing the cardinality, and that {0,x} for any x is a subgroup isomorphic to Z2, is enough to determine the group up to isomorphism. But then again, it’s early, and I haven’t had my coffee yet. In any case, yes I definitely agree that the hard part is to determine if one can find the specific one-to-one correspondence between this specific group and the real numbers, which happens to preserve the triangle inequality with standard addition on the reals. I have no interesting ideas on how to approach this, though. At the moment, I’m stuck just thinking that the obvious correspondence, which doesn’t work, is the only one I can imagine. Though of course any permutation of the reals that maps 0 to the identity of the group would be a candidate… and there are unfathomably many of them.

• Matthew / Jun 22 2009 3:25 pm

cdsmith: Let me try and convince you (and myself) with the details:

Checking the axioms for a vector space, from http://en.wikipedia.org/wiki/Vector_space#Definition with Z2 as the field, addition as our group operation/metric, and scalar multiplication defined to be 1.x = x, 0.x = 0

Commutativity of addition: again by definition, also follows from the other assumptions as Jade NB notes
Additive identity: Proposition 1 in the article
Inverse elements for addition: we’ve shown that each element is self-inverse

Scalar multiplication axioms are trivial checks of the definition, not sure these even matter to the cardinality argument but here goes:

Distributivity of scalar multiplication with respect to vector addition:
0.(v+w) = 0 = 0.v + 0.w; 1.(v+w) = v+w = 1.v + 1.w

Distributivity of scalar multiplication with respect to field addition:
(1+1).v = 0.v = 0 = v+v = 1.v + 1.v
(0+1).v = 1.v = v = 0+v = 0.v + 1.v
(1+0).v = 1.v = v = v+0 = 1.v + 0.v
(0+0).v = 0.v = 0 = v+v = 1.v + 1.v

Compatibility of scalar multiplication with field multiplication a(bv) = (ab)v [nb 3]
(0.0)v = 0v = 0.(0v)
(0.1)v = 0v = 0 = 0.(1v)
(1.0)v = 0v = 0 = 1.0 = 1.(0v)
(1.1)v = 1v = 1.(1v)

Identity element of scalar multiplication: 1.v = v

So we know we’re looking for the additive group of a vector space over Z2.

Then using a few facts:
– Every vector space has a basis (requires axiom of choice for the infinite-dimensional case)
– All bases of a vector space have equal cardinality, defining the dimension of the space
– An vector space isomorphism can be constructed from a bijection between bases
– All vector spaces of the same dimension are isomorphic
– The dimension of an infinite-dimensional vector space is equal to its cardinality (or that of the field if it’s greater, clearly not in this case when it’s Z2): http://planetmath.org/encyclopedia/DimensionFormulaeForVectorSpaces.html#SECTION00040000000000000000

We get that the space is THE vector space over Z2 of dimension 2^omega.

You might be tempted to think of this space as the space of functions from R -> Z2, ie the space of arbitrary subsets of R. But it is also isomorphic to the space of only /finite/ subsets of R, since this is also a vector space over Z2 with XOR as the operation, and it has the same cardinality. Something a little surprising (as consequences of AC can sometimes be…)

Dunno if that helps at all :)

• Matthew / Jun 22 2009 3:34 pm

Mistake in that last paragraph: it’s NOT the space of arbitrary subsets of R, since that has even bigger cardinality.

You could see it as the space of finite subsets of R with XOR, (or eg of R+, R_>=0 if you prefer) or alternatively as the space of arbitrary subsets of N with XOR.

I was looking for a way to use either of these to construct a nice bijection onto the reals, which kept some relationship with magnitude, but failed.

Gut feeling is that some non-constructive magic is needed, perhaps a cleverer application of axiom of choice, perhaps some hard-hitting theorem from topology… also considered fiddling around with the 2-adic numbers, which might lead somewhere but not just yet…

2. Matthew / Jun 21 2009 2:53 pm

I wonder if you could prove this in the negative, by showing that all attempts to construct such a metric must head down this kind of route, and suffer from the same problem.

I tried, it felt like the triangle inequality condition wasn’t quite strong enough, or my powers too weak :)

3. Matthew / Jun 21 2009 2:54 pm

(clarification: referring to Twan’s post)