It’s a great equation for pulling out of the bag when giving talks at IMO training camps on functional equations, partly because it is so simple and pops up reasonably often, but also because it teaches schoolchildren a valuable lesson about functions: they aren’t always nice. We usually just wave our hands and say that there exist horrible solutions, the circumstantial evidence being that the equation seems to tell us nothing more than the values of f on the rationals (or a translation thereof). We are always a little wary of actually doing the construction, partly because it would be useless for the IMO, and partly because it in some sense isn’t really a construction at all: it requires Zorn’s lemma, and in some sense always will. That said, I think IMO trainees have a right to know what these solutions look like if they’re curious, so I thought I’d try putting an elementary explanation up here.

Firstly, recall that Cauchy’s equation is a property of functions from the reals to the reals, telling us that

f(x+y) = f(x) + f(y) for all x,y \in \mathbb{R}.

An obvious guess is f(x)=kx, which clearly works. We cannot deduce that this is the only solution over the reals, but we can deduce, by a fairly routine induction argument, that f(xy)=xf(y) for all rational x and all real y.
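For anyone who wants the routine argument written out, here is a sketch of the standard steps (zero, negatives, positive integers, then fractions). Here \mathbb{N} means the positive integers, and m, n are just names for the numerator and denominator of the rational x.

```latex
\begin{align*}
f(0) &= f(0+0) = 2f(0) &&\Rightarrow\ f(0) = 0,\\
f(y) + f(-y) &= f(0) = 0 &&\Rightarrow\ f(-y) = -f(y),\\
f\bigl((n+1)y\bigr) &= f(ny) + f(y) &&\Rightarrow\ f(ny) = n f(y) \text{ for } n \in \mathbb{N} \text{ (induction on } n\text{)},\\
n\, f\!\left(\tfrac{m}{n}\, y\right) &= f(my) = m f(y) &&\Rightarrow\ f\!\left(\tfrac{m}{n}\, y\right) = \tfrac{m}{n}\, f(y) \text{ for } m \in \mathbb{Z},\ n \in \mathbb{N}.
\end{align*}
```

Taking y=1 in the last line gives f(x)=xf(1) for rational x, which is as far as the equation alone will take us.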

These two properties taken together tell us that for every x,y \in \mathbb{R}, a,b \in \mathbb{Q} we have f(ax+by) = af(x) + bf(y). If we consider the real numbers to be a set of vectors over the rationals, then this tells us that f is a linear map. However, it is also clear that the converse holds, so in fact any linear map will satisfy both of these conditions and so satisfy the Cauchy equation.

Just to elaborate on what we mean by considering the real numbers to be vectors over the rationals, the standard vectors from A-level maths are usually vectors over the reals. We could represent a 3-dimensional vector by v=xi + yj+zk where x,y,z are real numbers, and i,j,k are fixed vectors which should be somehow considered as abstract objects with the properties that:

  • The sum of any two vectors is a vector (and we can subtract vectors, there is a zero vector, and so on).
  • If we multiply a vector by a scalar (in this case a real number) we get another vector.

These two operations interact with one another in the obvious way. However, there’s no reason vectors have to represent arrows in space. Anything that obeys the above rules, regardless of its interpretation, can be considered as a ‘vector’. To see that the reals can be considered as vectors over the rationals, consider the set of numbers of the form a+b\sqrt 2 + c\sqrt 3 : a,b,c \in \mathbb{Q}. It is not tricky to prove that this behaves, for our purposes, just like the 3D real vector spaces from A-level maths. Sure, we can multiply real numbers together in a way we can’t multiply geometrical vectors, but this is a property we will ignore for now (except the multiplication in which one number is a rational number).

Bearing in mind the above example, it therefore isn’t a giant leap of intuition to consider the entire set of real numbers as a set of vectors over the rational numbers. There is one important potential area of confusion still unresolved though. In all the above examples, we were able to expand any vector out and write it uniquely as a (finite) linear combination of other vectors (e.g. v = x_1 i + x_2 j + x_3 k). A collection with this property, like \{ i,j,k\} is called a basis for the vector space. If you spend a while trying to build a basis for the reals over the rationals, it soon becomes clear that it’s going to have to be pretty big. In fact, you don’t have to think too hard before you find an infinitely large collection of real numbers where none can be written in terms of the others (so the basis is going to have to be infinite). It isn’t at all obvious that a basis even exists.
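One standard infinite collection of this kind (an illustration chosen here, certainly not the only possibility) is the set of logarithms of the primes. Suppose some finite rational combination of \log p_1, \dots, \log p_n vanishes. Writing the coefficients over a common denominator N and multiplying through by N, we may take integer coefficients a_i, and then:

```latex
\begin{align*}
\sum_{i=1}^{n} a_i \log p_i = 0,\quad a_i \in \mathbb{Z}
&\iff \prod_{i=1}^{n} p_i^{a_i} = 1 \\
&\iff \prod_{a_i > 0} p_i^{a_i} = \prod_{a_i < 0} p_i^{-a_i} \\
&\iff a_1 = \dots = a_n = 0 \quad \text{(by unique factorisation)}.
\end{align*}
```

So no \log p can be written in terms of the logarithms of the other primes, and since there are infinitely many primes, any basis has to be infinite.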

However, fortunately, one can prove using Zorn’s Lemma that a basis always exists (in fact, even better, we can extend any set of vectors that aren’t expressible in terms of each other to form a basis). A statement and sketch proof of Zorn can be found elsewhere on my blog, as can the proof that every vector space has a basis. A basis for the reals over the rationals is often called a Hamel basis.

Now that we have a basis, it is easy to construct linear maps. In general, a linear map can be thought of as a ‘square matrix’ (which in the case of the reals over the rationals will have infinite size). A more useful definition is the one given above: f is linear if it satisfies f(ax+by) = af(x)+bf(y). Given our basis \mathcal{B} = \{ b_i \}_{i\in I} (where I is some indexing set and could be larger than \mathbb{N}), we can choose the values f(b_i) however we like and extend to every real by linearity, using the fact that each real is a unique finite rational combination of basis elements. In particular, any permutation of the basis extends to a linear map (and will hence satisfy Cauchy’s equation).

To make this concrete, let’s consider a simple example. Suppose, for simplicity, we only care for now about numbers of the form x+y\sqrt{2} (where x,y are rationals). We define f(x+y\sqrt{2}) = x\sqrt{2}+y (the linear map obtained by swapping the two basis elements 1 and \sqrt{2}).
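If you would like to play with this map, here is a minimal Python sketch. It assumes we store x+y\sqrt{2} as the coordinate pair (x, y) of exact fractions. The helper names add, scale and f are made up for this illustration, and the check only runs over a small grid of sample values, so it is a sanity check rather than a proof.

```python
from fractions import Fraction
from itertools import product

# Represent x + y*sqrt(2), with x and y rational, by the coordinate pair (x, y).

def add(z, w):
    """Add two numbers given in coordinates with respect to the basis {1, sqrt(2)}."""
    return (z[0] + w[0], z[1] + w[1])

def scale(q, z):
    """Multiply the number z by the rational scalar q."""
    return (q * z[0], q * z[1])

def f(z):
    """Swap the coordinates: x + y*sqrt(2) is sent to y + x*sqrt(2)."""
    x, y = z
    return (y, x)

# A small grid of sample points with rational coordinates.
samples = [(Fraction(a, 3), Fraction(b, 5))
           for a, b in product(range(-2, 3), repeat=2)]

# Cauchy's equation f(z + w) = f(z) + f(w) on every pair of samples.
additive = all(f(add(z, w)) == add(f(z), f(w)) for z in samples for w in samples)

# Rational homogeneity f(q z) = q f(z) for one sample scalar q = 7/4.
q = Fraction(7, 4)
homogeneous = all(f(scale(q, z)) == scale(q, f(z)) for z in samples)

print(additive, homogeneous)
```

Working in coordinates keeps everything exact, which matters here: floating point cannot tell a rational number from an irrational one, so the swap could not even be defined on floats.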

On the one hand, it is simple to check that if z_i=x_i+y_i\sqrt{2}: i=1,2 then f(z_1+z_2) = (x_1+x_2)\sqrt{2} + (y_1+y_2) = f(z_1)+f(z_2), i.e. this function does satisfy the Cauchy equation. On the other hand, consider z=14142135-10000000\sqrt{2}, which is roughly -0.62 and so tiny compared with its coefficients. We get f(z) = 14142135 \sqrt{2} - 10000000, which is just under ten million (and it is clear that by taking better and better rational approximations to \sqrt{2}, for instance the continued fraction convergents 3/2, 7/5, 17/12, and so on, one can get arbitrarily large values of f arbitrarily close to zero). So f is a decidedly bizarre function, but it does satisfy the Cauchy equation (and it is easy to see that, given a Hamel basis containing 1 and \sqrt{2}, it can be extended to the entire set of real numbers by just fixing everything else in the basis).
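To watch the blow-up near zero happen, here is a rough Python sketch using the continued fraction convergents p/q of \sqrt{2}. The recurrence (p, q) -> (p+2q, p+q) starting from (1, 1) is the standard one for these convergents, and the function name convergents is just for this illustration. The values printed are floating-point approximations, so treat them as indicative only.

```python
import math

def convergents(n):
    """Return the first n continued-fraction convergents (p, q) of sqrt(2)."""
    p, q = 1, 1
    out = []
    for _ in range(n):
        out.append((p, q))
        p, q = p + 2 * q, p + q
    return out

rows = []
for p, q in convergents(12):
    # z = p - q*sqrt(2), computed as (p^2 - 2q^2) / (p + q*sqrt(2))
    # so that we do not subtract two nearly equal floats.
    z = (p * p - 2 * q * q) / (p + q * math.sqrt(2))
    # f swaps the coordinates, so f(z) = p*sqrt(2) - q.
    fz = p * math.sqrt(2) - q
    rows.append((p, q, z, fz))
    print(f"p={p:6d} q={q:6d}  z={z:+.3e}  f(z)={fz:.3f}")
```

Each convergent satisfies p^2 - 2q^2 = ±1, so |z| = 1/(p+q\sqrt{2}) shrinks towards zero while f(z) = p\sqrt{2} - q grows roughly like q.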