In this post I will present a quite nice proof of the following fact from elementary number theory:

Theorem 1Let be a prime number. There are such that if and only if is a quadratic residue .

Recall that is a quadratic residue if there exist such that , or equivalently if . I recently learned a short proof of Theorem 1 using a nice result of Minkowski in geometry.

One of the directions of Theorem 1 is quite simple: if there are such that , then is co-prime to and hence there exists such that . We have

Therefore is a multiple of and hence . In the next section I will give an elementary proof of the other direction. Observe that, as a corollary of Theorem 1 we obtain the following:

Corollary 2There are infinitely many primes of the form .

*Proof:* Assume, for the sake of a contradiction, that there are only finitely many primes, say . Note that is in this list. Let

Observe that , so is odd. Let be a prime that divides . Then is odd, and hence it is not on the list (because otherwise would divide and ). However, is a quadratic residue and by Theorem 1 is of the form , which is the desired contradiction.

** — 1. From number theory to geometry — **

Let be a prime number for which is a quadratic residue. The idea to prove (the hard direction of) Theorem 1 is that it suffices to find such that

With this in mind we define the sets

We need to show that . Let be such that and . Note that both and . Let be the subgroup of generated by and and note that . Let be the parallelogram with vertices . In other words

Observe that translations of by elements in form a tilling of the plane. More precisely, I claim that for any there exists some point such that . Indeed, let

where is the unique integer in the interval and . Then , and

which proves the claim. We now turn to . Adopting the usual convention

We find that the area of the ellipse is . On the other hand, the area of the parallelogram is . Thinking again of the tilling of by shifts of , we conclude that there exist two points in which belong to different shifts of the same point in . More precisely, there exist and such that

Since is an ellipse centered at we conclude that and thus . The point just found is then in both and and is not , so we have (because is not a square, so can’t be and neither can ) and as desired.

** — 2. Minkowski theorem — **

In this less elementary section I will explore the argument used in the previous section and formalize it. Let . A *lattice* in is a discrete subgroup which linearly generates the whole as a vector space over or, equivalently, it is a subgroup such that the quotient group is compact (for the quotient topology). We will denote by the natural projection from to .

As an example take and let be the set in the proof of Theorem 1 above. Another example is when . A non-example is , still with , because the quotient space can be identified with an infinite vertical strip of width and is not compact.

Lemma 3Let , let be a lattice in and let be the quotient group. Denote by the (-dimensional) Lebesgue measure and, for a Borel set , let

Then is (a normalization of) the Haar measure in .

*Proof:* We have trivially that and for every measurable set . Thus all that is needed for to be a measure is being countably additive.

Let be a sequence of disjoint measurable sets in and let . Fix and, for each , let be such that and . Intersecting with if needed, we can assume that is a disjoint family (to see this, assume , then which is a contradiction). Thus we have that has measure

Since and was arbitrary, we conclude that .

To prove the other inequality, let be such that and . For each , let . Since , we have . Thus for each . Since are disjoint, then also are disjoint. Summing over we deduce

Since is arbitrary we have . This implies that is a measure on .

The fact that is invariant follows directly from the fact that is invariant and is a group homomorphism.

Equivalently one can identify with a subset of (a fundamental domain) and take to be the restriction of to , but we will not do this here.

We can now formulate Minkowski’s theorem, which is really behind the proof of Theorem 1:

Theorem 4 (Minkowski)Let and let be a lattice in . Let and let be the normalization of the Haar measure on given by (1). Let be symmetric with respect to the origin and convex. If , there exists a nonzero point .

*Proof:* Let . Then and I claim that there exist two points such that .

To see this we can use equation (1) to find a set such that and . For each define

Since we have that for every there exists some such that and hence for some . In particular we have that

and hence

Using the fact that is translation invariant we deduce that

But we have that , by construction. Thus the sets can not be disjoint. Let be in and such that and let be in that intersection. Then both are in and , proving the claim.

So let be such that . Then also because is symmetric with respect to the origin, and hence so is . Since is convex we conclude that .

The method from this post illustrates how one can use a variety of tools, from measure theory and geometry to algebra, to deduce a result in number theory. It looks like this could be the beginning of general result, but the fact that the set in the proof of Theorem 1 is a lattice does not generalize to higher dimensions. One can still find a lattice of solutions for the local restriction, but the volume of the fundamental domain will be of the form , which does not work well with the rest of the argument.