For almost every n….

January 20, 2010 by tlovering

Have just come across a fairly groovy gadget: ultrafilter-based quantifiers, essentially a rigorous way of talking about statements being true “for almost all n“.

The basic plan is that we will work with a set \mathcal{U} of subsets of \mathbb{N} which are to be considered “very large” or “almost every natural number”. We then can define the logical statement

P(n) holds for almost every n (or, \forall_\mathcal{U} n, P(n))

to mean

\{n \in N : P(n) \text{ holds}\}\in \mathcal{U}.

So all we really have to do is work out how ‘large’ sets behave, or how we want these propositions to work out. The following seem like reasonable assumptions:

  • The entire set of natural numbers is a large subset. The empty set is not.
  • P and Q hold for almost all naturals if and only if P holds for almost all naturals and Q holds for almost all naturals.
  • Either P holds or Q holds for almost every natural if either P holds for almost every natural, or if Q does.
  • If not almost every natural has property P, then almost every natural has property “not P”
  • A large subset of \mathbb{N} must be infinite.

Writing in terms of our collection of ‘large sets’ we obtain the rules:

  • \mathbb{N} \in \mathcal{U}, \emptyset \not\in \mathcal{U}
  • A,B \in \mathcal{U} \Leftrightarrow A\cap B \in \mathcal{U}
  • A\cup B \in \mathcal{U} \Leftrightarrow A \in \mathcal{U} \text{ or } B \in \mathcal{U}
  • For every A \subseteq \mathbb{N}, precisely one of A, \mathbb{N}\backslash A is in \mathcal{U}.
  • There is no finite set in \mathcal{U}.

These, it turns out, do axiomatise the structure of consistent sets of subsets of \mathbb{N} called nonprincipal ultrafilters.

Note the “nonprincipal” bit comes from our final condition that \mathcal{U}. If we remove this restriction, we allow the existence of a small number of so-called ‘principal’ ultrafilters of the form “every set containing the number m.” One can easily verify that this set satisfies all the remaining axioms.

Great! So we have a bunch of rules telling us these structures can work. Furthermore, it’s possible without a great amount of thought to construct one using transfinite induction (usually in the form of Zorn’s Lemma). However, it’s clear they still perhaps are not quite what we might intuitively think. For example, at some point during the construction of an ultrafilter we are required to decide whether the set of even numbers is a “large” set or if the odd numbers are. Annoyingly, our above terminology will translate into “Almost every number is even” and “Almost no number is odd” or vice-versa with neither being canonically ‘correct’ (though I can think of meta-arguments for both ways round).

So why is this somewhat eclectic formalism helpful?

Firstly, it helps us to fudge situations that “don’t quite work” as things stand. For example, it is often true that if one takes a product of some number of algebraic structures one gets a similar structure. Groups and topological spaces both have reasonably nicely defined products (with operations componentwise). An obvious type of structure that won’t have this property is a field. For example, let’s take \mathbb{R}. If we take a countable product \mathbb{R}^\mathbb{N} it is obvious we don’t automatically get a field (though we do get a rather large vector space), since an element like (3,\pi,0,2,-4.3,.....) is on the one hand clearly not equal to (0,0,0,...) but on the other hand has no inverse, because the zero in the third entry causes problems.

Oh, what shame… so I guess there’s no hope for constructing lots of new fields from old fields by taking huge products… If only there was some way in which we could say that a sequence (1,0.5,0,0,0,0,……0,0,0,….) has so many zeroes it is basically the same as the zero element, whereas (0,1,1,1,…,1,1,…) has so few zeroes, it could just as well be (1,1,……1,1,…) and we wouldn’t care, in such a way that removes all our problems. Aha! But we can do this! Define an equivalence relation on \mathbb{R}^\mathbb{N} by

x=(x_1,x_2,...,) \sim y=(y_1,...) if for most i, x_i=y_i (where we fix “for most” to be with respect to some ultrafilter \mathcal{U}).

And define the ultraproduct as the quotient of the product space by this equivalence relation

\mathbb{R}_\mathcal{U} := \mathbb{R}^\mathbb{N} / \sim

As a quick exercise, work out what this looks like if \mathcal{U} is a principal ultrafilter.

If \mathcal{U} is nonprincipal this gives rise to a new (much larger) field called the hyper-real numbers where “infinite” and “infinitessimal” numbers can be thought to exist. Taking a product of different fields (and modifying the ultrafilter) may lead to diverse fields with different structures.

As a further application, a couple of days ago I stumbled across a beautiful proof in a paper by Imre Leader of the following result. Unfortunately the details are too long for me to replicate, but I can hopefully sketch some of what is going on, though I must confess that this post derives from awe rather than a deep understanding on my part.

Theorem (Finite Sums Thm): Let the natural numbers be coloured with finitely many colours. Then there exist x_1, x_2, .... \in \mathbb{N}  such that FS((x_i)):= \{x_{i_1}+...+x_{i_m}: m,i_1,...,i_m \in \mathbb{N}\} is monochromatic.

To prove this, one defines an addition operation on ultrafilters, by

\mathcal{U} + \mathcal{V} = \{S \subset \mathbb{N}: (\forall_\mathcal{U} x)(\forall_\mathcal{V} y) x+y \in S\}.

One can check that this is associative (but not commutative) and left-continuous in a standard compact topology on the space of natural ultrafilters, and then deduce (nontrivially) that there is some idempotent ultrafilter: \mathcal{U} + \mathcal{U} = \mathcal{U}. Then the proof can be rapidly finished off by picking a single colour class C lying in \mathcal{U} (which can be seen to exist by an easy induction), and building up the sequence step by step passing to nonempty nested subsets of C still lying inside \mathcal{U} at each stage.

Finally, an amusing application is Godel’s proof of the existence of God, where an ultrafilter structure is noted on the poset of “good properties”. I am not sure if he accounts for the possibility that “goodness” could itself be a primitive property and generate a principal ultrafilter, but I haven’t followed the argument closely, instead referring the reader to the relevent wikipedia page.

Quantisation of the Harmonic Oscillator

December 8, 2009 by tlovering

As its name suggests, Quantum Mechanics is underpinned by the idea of quantisation: physical systems on the quantum scale can be split into discrete states one of which will always be observed when a measurement is made. Part IB Quantum Mechanics includes a section on the harmonic oscillator, possibly the simplest analytic nonzero potential, and then proves that states are quantised by the fairly unrevealing method of ’spotting’ a Gaussian groundstate and then deriving series solutions and only taking those which won’t misbehave at infinity, giving a discrete spectrum of energy values.

In part II, a much more novel and surprising (at least from my current perspective) method is taken – dream up some operators that allow you to hop between energy levels by a constant amount, and then place bounds on where we can travel, giving a fixed lattice of hop locations which yield the discrete energy spectrum. This seems curious, and I’m wondering if there is something more fundamental going on or if this is just a trick. Maybe writing out this post will help (or responses from more physically inclined friends).

The (undimensionalised) harmonic oscillator is that with Hamiltonian

H = \frac{p^2}{\hbar^2} + x^2 = (x-i\frac{p}{\hbar})(x+i\frac{p}{\hbar}) + [i\frac{p}{\hbar},x] = a^\dagger a + 1

Where we have let a=x+i\frac{p}{\hbar}.

Our first nontrivial observation is that the form of the Hamiltonian leads to the relation

\langle \psi |H-1|\psi \rangle = |a |\psi \rangle|^2 \geq 0.

I.e. every state has energy at least 1, and a|\psi\rangle is the zero state if and only if \langle \psi |H|\psi \rangle =1.

Our second nontrivial observation is that a does not commute with the Hamiltonian. Indeed, let’s bash out some commutation relations (deriving everything from [x,p] = i\hbar) and see what we get.

[a^\dagger, a] = [x-i\frac{p}{\hbar},x+i\frac{p}{\hbar}] = [x,i\frac{p}{\hbar}]-[i\frac{p}{\hbar},x]=-2

Hence [H,a] = [a^\dagger a, a] = [a^\dagger, a]a = -2a.

What does this mean for our energy eigenstates (if it’s got energy E we’ll write it |E\rangle? If we measure the energy of a state after the a operator has been applied we get

Ha |E\rangle = (aH + [H,a]) |E\rangle = (E-2)a|E\rangle

In other words, provided E\not=1 (where both sides of the equation vanish by what we already proved about the modulus of a|\psi\rangle) applying the operator a reduces the energy of any eigenstate by 2.

Hence the only possible energy eigenvalues are E=1,3,5,...... Indeed, if there were any more, we could apply a some finite number of times to obtain an eigenstate with energy value less than 1, which we proved was impossible, so the energy spectrum of the harmonic oscillator is indeed discrete.

I still find that this method exists and works wholly mysterious and wonderful (given quite how much neater it is than just bashing out the differential equations directly). If anyone can give me some deeper insight into what is going on I’d be appreciative.

A slightly strange audience participation experiment

October 9, 2009 by tlovering

I have started having a few thoughts that have led me to propose a quick experiment which could be helpful in solving a problem I’ve been working on. If you (a mathematically inclined blog reader) have about 10 minutes free right now and are keen to try a nice problem, read on. Otherwise, stop reading and come back later when you have a little bit of time.

Basically I need you to try the following problem more-or-less immediately (no subconscious thinking time!) for ONLY 10-15 MINUTES and then send me a quick email (my @cam.ac.uk address is tl309) detailing your thoughts on the problem. You don’t have to have solved it. What I’m really interested in is your reaction to the problem and the heuristics going through your mind. How you *feel* after 10-15 minutes of work could be immensely useful to know, and if I get enough responses I’ll post the interesting ones in a later post explaining what I’m trying to do.

Problem: Imre got lost on the way to his office and has found himself wandering along the infinitely long corridor in the Centre for Mathematical Sciences. The floor is covered with large tiles such that the corridor is only one tile wide but there are infinitely many stretching in both directions. Imre’s arch-enemy (whom we shall denote Ermi) has rigged up the corridor with teleporters so that if Imre is on certain tiles at certain times he is transported to a faraway nightmarish universe where the axiom of choice is false.

Imre takes one second to move one tile in either direction (though he could just pause and wave his arms for a second instead). Owing to power limitations (and his unsophisticated doom teleporter software), each second Ermi can program just a single teleporter to activate for just one second at some specified point in the future (so he *could* keep one teleporter activated for the entire time, or could activate lots of teleporters simultaneously but only for a very short period of time at a specified point quite far away in the future). There is a time delay, so Ermi can’t turn any teleporters on instantly (so Imre will always get a chance to move out of the way if a teleporter is activated where he is standing). Is it possible for Ermi to trap Imre, or can Imre evade him forever (if whenever Ermi programs a teleporter Imre knows about it)?

If you’ve seen and thought about an equivalent problem before, please don’t tell anyone else for the time being (and don’t bother sending me an email). Otherwise, enjoy, and I’ll be interested in hearing your thoughts (obviously, once you’ve sent me the email, feel free to try solving the problem ;) ).

Tom.

Edit: It would actually be ok if you just posted a “comment” underneath this thread instead of emailing. I was worried about people reading what other people had written, but since I can wait until I publish the comments publically this isn’t an issue.

The long line: too long to be pathwise connected

September 29, 2009 by tlovering

While revising IB Metric and Topological Spaces, I came across the following cool alternative to the topologists sine curve as an example of a space that is connected but not pathwise connected.

The idea is to build a really really long line (in fact we shall build a ray, but constructing a line can be achieved by an obvious gluing job). It will be connected for the same reasons as the real line, but not pathwise connected just because any path will be too short. To realise our goal, we pick the smallest uncountable ordinal \omega_1 and just take X=\omega_1 \times [0,1) \cup \{\infty\} with the obvious (order) topology generated by interpreting this as placing uncountably many intervals end-to-end (and \infty > (\alpha,x) for all (\alpha,x) \in \omega_1 \times [0,1)).

Consequently we get a really really long ray, which is far longer than any continuous image of a closed interval of real numbers, so it isn’t pathwise connected (any path from (0,0) to \infty must cross uncountably many intervals, but there is a rational number in the inverse image of each interval.

Living space on the unit sphere

September 19, 2009 by tlovering

One of the recent questions from an example sheet in Metric and Topological Spaces gave rise to the following more interesting general question:

What is the greatest number N_d(\theta) of unit vectors in \mathbb{R}^d such that the angle between any two is greater than or equal to angle \theta?

I don’t want to ruin this question by posting a solution, or even a sketch solution on this blog, but there is some very interesting behaviour revealed once you look into it. Anyone who wishes to investigate fully independently should now stop reading. I am going to post a few interesting remarks that break down the kind of behaviour the problem exhibits (without giving any proofs).

Firstly, it is almost obvious that if \theta \geq \frac{\pi}{2} then N_d(\theta) \leq 2d. Unfortunately, the ‘obvious’ proof of this is a proof by ‘worst case scenario’ (an orthogonal set and its negation is clearly the best example), which unfortunately doesn’t really hold water. However, this does turn out to be true, and my process of finding proof gave me an interesting perspective on what ‘really spread out’ vectors look like.

Now we get a bizarre phenomenon: acute angles seem to behave frighteningly differently to right and obtuse angles. Any sort of inductive bound one tries to find seems to fail, yet coming up with reasonable explicit constructions is tricky. In fact it turns out that a sudden jump is made from there being linearly many unit vectors to their being exponentially many unit vectors. Getting the upper bound for this is quite simple. A lower bound is harder.

Cauchy’s Functional Equation Solutions

September 17, 2009 by tlovering

It’s a great equation for pulling out of the bag when giving talks at IMO training camps on functional equations, partly because it is so simple and pops up reasonably often, but also because it teaches a valuable lesson to school children about functions: they aren’t always nice. However, we always wave our hands and say that there exist horrible solutions, and circumstantial evidence for this is that it seems like we can’t get any more information out of the equation than its value on the rationals (or a translation thereof). However, we are always a little wary of actually doing the construction, partly because it would be useless for the IMO, and partly because it in some sense isn’t really a construction at all – it requires Zorn’s lemma, and in some sense always will. That said, I think IMO trainees have a right to know what these solutions look like if they’re curious, so I thought I’d try putting an elementary explanation up here.

Firstly, recall that Cauchy’s equation is a property of functions from the reals to the reals, telling us that

f(x+y) = f(x) + f(y) for all x,y \in \mathbb{R}.

An obvious guess is f(x)=kx, which clearly works. However, we cannot deduce this must be the solution for all reals, but we can deduce, by a fairly routine induction argument, that f(xy)=xf(y) for all rational x.

These two properties taken together tell us that for every x,y \in \mathbb{R}, a,b \in \mathbb{Q} we have f(ax+by) = af(x) + bf(y). If we consider the real numbers to be a set of vectors over the rationals, then this tells us that f is a linear map. However, it is also clear that the converse holds, so in fact any linear map will satisfy both of these conditions and so satisfy the Cauchy equation.

Just to elaborate on what we mean by considering the real numbers to be vectors over the rationals, the standard vectors from A-level maths are usually vectors over the reals. We could represent a 3-dimensional vector by v=xi + yj+zk where x,y,z are real numbers, and i,j,k are fixed vectors which should be somehow considered as abstract objects with the properties that:

  • The sum of any two vectors is a vector (and we can subtract vectors, there is a zero vector, and so on).
  • If we multiply a vector by a scalar (in this case a real number) we get another vector.

These two operations interact with one another in the obvious way. However, there’s no reason vectors have to represent arrows in space. Anything that obeys the above rules, regardless of its interpretation, can be considered as a ‘vector’. To see that the reals can be considered as vectors over the rationals, consider the set of numbers of the form a+b\sqrt 2 + c\sqrt 3 : a,b,c \in \mathbb{Q}. It is not tricky to prove that this behaves, for our purposes, just like the 3D real vector spaces from A-level maths. Sure, we can multiply real numbers together in a way we can’t multiply geometrical vectors, but this is a property we will ignore for now (except the multiplication in which one number is a rational number).

Bearing in mind the above example, it therefore isn’t a giant leap of intuition to consider the entire set of real numbers as a set of vectors over the rational numbers. There is one important potential area of confusion still unresolved though. In all the above examples, we were able to expand any vector out and write it uniquely as a (finite) linear combination of other vectors (e.g. v = x_1 i + x_2 j + x_3 k). A collection with this property, like \{ i,j,k\} is called a basis for the vector space. If you spend a while trying to build a basis for the reals over the rationals, it soon becomes clear that it’s going to have to be pretty big. In fact, you don’t have to think too hard before you find an infinitely large collection of real numbers where none can be written in terms of the others (so the basis is going to have to be infinite). It isn’t at all obvious that a basis even exists.

However, fortunately, one can prove using Zorn’s Lemma that a basis always exists (in fact, even better, we can extend any set of vectors that aren’t expressible in terms of each other to form a basis). A statement and sketch proof of Zorn can be found elsewhere on my blog, as can the proof that every vector space has a basis. A basis for the reals over the rationals is often called a Hamel basis.

Now we have a basis, it is easy to construct linear maps. In general, a linear map can be thought of as any ’square matrix’ (which in the case of the reals over the rationals will have infinite size). A more useful definition is that given above, that it satisfies the equation f(ax+by) = af(x)+bf(y). Given our basis \mathcal{B} = \{ b_i \}_{i\in I} (where I is some indexing set and could be larger than \mathbb{N}) it is clear that any permutation of the basis is a linear map (and will hence satisfy Cauchy’s equation).

To make this concrete, let’s consider a simple example. Suppose, for simplicity, we only care for now about numbers of the form x+y\sqrt{2} (where x,y are rationals). We define f(x+y\sqrt{2}) = x\sqrt{2}+y (a linear map obtained by a permutation of the basis).

One the one hand it is simple to check that if z_i=x_i+y_i\sqrt{2}: i=1,2 then f(z_1+z_2) = (x_1+x_2)\sqrt{2} + (y_1+y_2) = f(z_1)+f(z_2). I.e. this function does satisfy the Cauchy equation. But on the other hand consider z=14142135-10000000\sqrt{2} (which is a very small number). f(z) = 14142135 \sqrt{2} - 10000000 which is very large (and it is clear that by taking better and better rational approximations to \sqrt{2} one can get arbitrarily large values of f arbitrarily close to zero). So f is a decidedely bizarre function, but it does satisfy the Cauchy equation (and it is easy to see that, given a Hamel basis, it can be extended to the entire set of real numbers by just fixing everything else in the basis).

Neat formula for the projective dual of a convex curve y=f(x)

September 11, 2009 by tlovering

In projective geometry we know that ‘well-behaved’ curves such as conics can be viewed in two ways: either as a locus of points or an envelope of lines (the curve being determined by being tangent to every line in the envelope). One of the little gems of the new superficially dry course on ‘Variational Principles’ is a method to give us a way of switching between these two views algebraically (provided the situation is suitably nice).

Firstly, we suppose that y=f(x) describes a locus of points. How might we parameterise the tangents in a way that doesn’t involve points at all? Tangents are lines of the form y=mx-c, so in some sense there are two co-ordinates (m,c) that describe the tangents uniquely. If we are to obtain an equation in a similar form c=f^*(m) we will clearly require that f is strictly convex (else g will fail to be well-defined), so assume this is the case.

We then seek the c such that y=mx-c is the unique tangent to the curve y=f(x) with gradient m. Equating the two values for y we get that c = mx_0 - f(x_0) where x_0 is the unique real number such that f'(x_0) = m. Nothing mind-blowing so far.

But this becomes rather more interesting once we remember the supporting-line inequality for convex functions (a convex function is always bigger than any tangent with equality at the point of tangency). Writing this down at our point we get that f(x) \geq mx - mx_0 + f(x_0). Rearranging, we get

mx- f(x) \leq mx_0 - f(x_0) for all x.

So writing this in a slightly different way, we get the cute formula that f^*(m) = \text{max}_x(mx-f(x)).

And for an encore, it’s now obvious (by projective duality) that if we do this transformation twice we get back to where we started (the curve stays fixed, and our function goes from describing the locus to describing the envelope to describing the locus again), so we get the nice relation that f^{**} = f for all strictly convex f.

Sequences of reals have no countable basis

September 8, 2009 by tlovering

When first asked if V=  \mathbb{R}^{\mathbb{N}} has a countable basis as a vector space over \mathbb{R} almost everyone will think for a second: “of course! (1,0,0,…),(0,1,0,…),… will do it”. Then most people realise this is total rubbish because (1,1,….) isn’t a finite linear combination of these. This then usually raises the suspicion that any basis must be uncountable.

A quick way to check this is just to consider the set of sequences of form a_n = e^{\alpha \log n} where \alpha is any positive real. This is clearly an uncountable linearly independent set.

However, actually constructing a basis seems to be impossible (though the Zorn argument I posted on this blog about 6 months ago shows it has to exist). If anyone knows how to do it, I’d love to hear it.

EDIT: It has been pointed out to me that it’s not entirely obvious that finding an uncountable linearly independent set constitutes a proof that there is no countable basis. I have assumed the dimension theorem, that with every vector space V there is associated a fixed cardinal \text{dim}V which gives the cardinality of any basis. Equivalently, any two bases for V have the same cardinality, and any set of greater cardinality therefore has linear dependencies (since it cannot be extended to a basis). To prove the dimension theorem, we take two bases of different sizes, write out each element of the smaller one in terms of the larger one, and because we only use a finite subset of the elements from the large basis to write out each element of the smaller one, the union of the elements used in this process can’t possibly be equal to the entire large basis, giving the necessary contradiction.

Zorn’s Lemma: You can’t run forever.

September 7, 2009 by tlovering

Imagine you are in one of those computer games where there is a wall of spikes coming from the left of the screen, so you are forced to proceed indefinitely until the level ends, always proceeding in the same direction. Suppose the level never ends: when can you escape? Zorn’s lemma asks, and answers, this question. Perhaps a little surprisingly it reveals that perversely however massively infinite and complicated your set is that the only possible means of escape you have is to dart down a hole that looks like the natural numbers, such that wherever the wall of spikes is there is always some amount of space to the right. If there are no such holes, then however stupidly huge the space in which you have to run, you eventually have no chance.

Sometimes Zorn is viewed as being a rather scary complicated theorem because it is equivalent to the axiom of choice. This is totally false: the axiom of choice (large cartesian products of nonempty sets are nonempty) is a really nonscary fact that is equivalent to the axiom of choice. What seems to make Zorn scary is that it gives really specific information about the structures of monstrously large sets, and there’s no way to prove it without first working out what it means to have a pursuit lasting many many infinities and then ending.

Let’s set up a little notation. Suppose we have a set X on which there is some partial order relation < (so for any pair of distinct elements a,b \in X it may be true that one of a<b or b<a holds, or neither could hold). Since we have to keep running to the right, we will be interested in subsets of X that are well ordered: of the form x_1<x_2<.... (where the subscripts could be members of any set, possibly much larger than the integers). We (perhaps slightly abusing standard usage) call such a set a chain.

We say that a chain is an escape tunnel if there is no m \in X such that for every element x in the chain, m \geq x. If there is an escape tunnel, we can trivially escape down it forever. The only interesting case is therefore when X doesn’t have any escape tunnels. It could perhaps be viewed as being a ‘closed’ set in some sense, but bear in mind that it could be absolutely huge, far bigger than anything we could ever visualise. Can we exploit this sheer hugeness to escape, without needing to cheat and run down an escape tunnel?

Well, let’s give it a shot. Suppose we find ourselves on some point x. If we are going to keep running, all we need is some x' such that x'>x. We then just keep going forever, and because our set is really big we can probably keep going. We try going for a while, and locally it looks like we are darting down an escape tunnel, but because there aren’t any escape tunnels, there is some m ahead of us and our planned route which at some point (possibly infinitely far away in the future) the wall of spikes will get to, so we must get there too. So we keep going ‘forever and a day’ in this way.

Ah, but wait a sec. Can a set X of any size support this kind of running forever? After all, a set does have a fixed cardinality. If we run for ages and ages and ages (starting at some point), we might be able to build an arbitrarily large well-ordered subset of X. But the size of such a subset is bounded: it can never be bigger than X.

And yet if we consider the set of well-orderings of subsets of X, and consider the set S of order-isomorphism classes represented, this is itself a well-ordered set, but cannot possibly be order-isomorphic to any of its elements. In other words, S is a well-ordering bigger than any possible on X.

Yet the set of well-ordering types up to and including S is itself well-ordered (1<2<3 is in some obvious sense smaller than 1<2<3<4<5), and if we keep running forever in our space X, our route is at some point bound to be long enough to be a copy of S. But this is impossible, because S is just too big. So at some point we get stuck at some point from which we cannot escape and die, even if X is really massive.

More formally, X has at least one maximal element (‘a point from which we cannot escape’).

There are bits of rigour missing from the argument above, and somehow without developing the dull bits of the theory of ordinals it’s impossible to fill in the gaps, but the intention of this post was to highlight that it’s somehow ultimately a result about really long pursuits in a large space, and that the proof is really not much more idea-wise than you need in the finite case (once you’ve had the idea of blocking off the escape tunnels).


Student proves to the porters that the reals are uncountable

September 6, 2009 by tlovering

First I’ll begin by mentioning that this is an idea completely stolen from a talk by Prof. Imre Leader on games of pursuit, most recently delivered at the ‘Bath’ IMO Squad Nursery camp in Oxford last week.

Suppose we have a ‘lawn’ on which there is somewhere a student. Being in Cambridge, the student is not supposed to be on the lawn, and a group of porters is charged with apprehending the student. The head porter may deploy some number of porters (whose maximum speed is the same as that of the student) to points on the boundary of the lawn and issue them with a strategy to bring the student in (the student must escape in finite time by reaching a point on the boundary at which there is not a waiting porter). Of course, the porters may never step on the grass. The question is, for various lawns, how many porters do you need? The two lawns we will consider are the square shaped lawn in ‘Great Court’, and the circular-shaped law ‘New Court’. As always, I’d advocate the reader sitting down and drawing pictures of the lawns and thinking about (even proving?) how many porters they think they need before reading on.

For the square lawn, it is clear that 4 porters can do the job by just tracking the student’s ordinate parallel to the side they are guarding. A boring check shows that 3 cannot do it.

For the circular lawn, it seems difficult to prove that any number of porters can do it even though it seems that a trillion porters should probably be able to do it. However, the head porter has a cunning plan. There are infinitely many points on the boundary of the lawn, so if he dispatches infinitely many porters, he can just guard every point, so he sends out porters P_1,P_2,...... However, the student realizes that to avoid all the porters, he can avoid each in turn, so it will suffice if the student can get himself to some position really close to the boundary where he will be able to escape P_1 whatever he does next provided he does so quickly, and then in time at most \frac{1}{1000} get himself to a position where he will always escape P_2 provided he escapes by the same deadline he had for P_1, and so on, taking at most 1000^{-(n-1)} time to get to a position from which he will always escape P_n provided he leaves within some sufficient time at least 1000^{-1}+1000^{-2}+... from getting to his position clear of P_1. It’s then clear that, provided his path tends to the boundary, when he eventually escapes none of the porters will be waiting for him. Having had this idea, the problem becomes very simple owing to the arc length of the circle along any distance being greater than the length of the corresponding chord (so if the student is sufficiently near the edge, he can always get well away from any given porter by heading along some straight line). Hence any infinite set of porters labelled with natural numbers fails to catch the student. What has gone wrong in the head porter’s plan? Clearly if we just permenently put a porter on every point of the boundary they are bound to catch the student. The only problem that could possibly have arisen is that he labelled his infinite set of porters with the natural numbers, and the above escape strategy shows that such a set of porters wasn’t actually big enough to implement the obvious strategy and guard every point of the circle. In other words, the head porter has, to his misfortune, discovered that the set of points on the boundary of the circle is uncountable.