Primes of the form x^2+2y^2

In this post I will present a quite nice proof of the following fact from elementary number theory:

Theorem 1 Let {p\in{\mathbb N}} be a prime number. There are {x,y\in{\mathbb N}} such that {x^2+2y^2=p} if and only if {-2} is a quadratic residue {\mod p}.

Recall that {-2} is a quadratic residue {\mod p} if there exist {z,k\in{\mathbb N}} such that {z^2+2=kp}, or equivalently if {z^2\equiv-2\mod p}. I recently learned a short proof of Theorem 1 using a nice result of Minkowski in geometry.

One of the directions of Theorem 1 is quite simple: if there are {x,y\in{\mathbb N}} such that {x^2+2y^2=p}, then {y} is co-prime to {p} and hence there exists {u\in{\mathbb N}} such that {uy\equiv1\mod p}. We have

\displaystyle y^2\big((ux)^2+2\big)=(yux)^2+2y^2\equiv x^2+2y^2\equiv0\mod p

Therefore {y^2\big((ux)^2+2\big)} is a multiple of {p} and hence {(ux)^2\equiv-2\mod p}. In the next section I will give an elementary proof of the other direction. Observe that, as a corollary of Theorem 1 we obtain the following:

Corollary 2 There are infinitely many primes of the form {x^2+2y^2}.

Proof: Assume, for the sake of a contradiction, that there are only finitely many primes, say {p_1,p_2,\dots,p_n}. Note that {2} is in this list. Let

\displaystyle z=\left(\prod_{i=1}^n p_i\right)^2+2

Observe that {x\equiv2\mod 4}, so {z/2} is odd. Let {p} be a prime that divides {z/2}. Then {p} is odd, and hence it is not on the list {p_1,\dots,p_n} (because otherwise {p} would divide {z} and {z-2}). However, {-2} is a quadratic residue {\mod p} and by Theorem 1 {p} is of the form {x^2+2y^2}, which is the desired contradiction. \Box

— 1. From number theory to geometry —

Let {p} be a prime number for which {-2} is a quadratic residue. The idea to prove (the hard direction of) Theorem 1 is that it suffices to find {x,y\in{\mathbb N}} such that

\displaystyle x^2+2y^2<2p\qquad\text{ and }\qquad x^2+2y^2\equiv0\mod p

With this in mind we define the sets

\displaystyle A=\big\{(x,y)\in{\mathbb R}^2:x^2+2y^2<2p\big\} \qquad B=\big\{(x,y)\in{\mathbb Z}^2:x^2+2y^2\equiv0\mod p\big\}

We need to show that {A\cap B\neq\{(0,0)\}}. Let {z\in{\mathbb N}} be such that {z^2\equiv-2\mod p} and {z<p}. Note that both {(z,1)\in B} and {(p,0)\in B}. Let {C} be the subgroup of {{\mathbb R}^2} generated by {(z,1)} and {(p,0)} and note that {C\subset B}. Let {F} be the parallelogram with vertices {(0,0),(z,1),(p,0),(z+p,1)}. In other words

\displaystyle F=\big\{(x,y)\in {\mathbb R}^2:0\leq y<1\quad \text{ and }\quad zy\leq x<zy+p\big\}

Observe that translations of {F} by elements in {C} form a tilling of the plane. More precisely, I claim that for any {(a,b)\in{\mathbb R}^2} there exists some point {(x,y)\in C} such that {(a-x,b-y)\in F}. Indeed, let

\displaystyle (x,y)=\left(\lfloor b\rfloor z+\left\lfloor\frac{a-\{b\}z}p\right\rfloor p,\lfloor b\rfloor\right)=\lfloor b\rfloor(z,1)+\left\lfloor\frac{a-\{b\}z}p\right\rfloor(p,0)

where {\lfloor b\rfloor} is the unique integer in the interval {[b,b+1)} and {\{b\}=b-\lfloor b\rfloor}. Then {(x,y)\in C}, {0\leq b-y=\{b\}<1} and

\displaystyle \begin{array}{rcl} \displaystyle z\{b\}\leq a-x<z\{b\}+p&\iff&\displaystyle z\{b\}\leq a-\lfloor b\rfloor z-\left\lfloor\frac{a-\{b\}z}p\right\rfloor p<z\{b\}+p\\&\iff&\displaystyle 0\leq a-bz-\left\lfloor\frac{a-\{b\}z}p\right\rfloor p<p\\&\iff&\displaystyle 0\leq \frac{a-bz}p-\left\lfloor\frac{a-\{b\}z}p\right\rfloor <1\end{array}

which proves the claim. We now turn to {A}. Adopting the usual convention

\displaystyle \frac A2:=\left\{\left(\frac x2,\frac y2\right):(x,y)\in A\right\}=\left\{(x,y)\in{\mathbb R}^2:x^2+2y^2<\frac p2\right\}

We find that the area of the ellipse {A/2} is {\pi p/(2\sqrt{2})>p}. On the other hand, the area of the parallelogram {F} is {p}. Thinking again of the tilling of {{\mathbb R}^2} by shifts of {F}, we conclude that there exist two points in {A/2} which belong to different shifts of the same point in {F}. More precisely, there exist {(a_1,b_1),(a_2,b_2)\in A/2} and {(x,y)\in C} such that

\displaystyle (a_1,b_1)-(a_2,b_2)=(x,y)\neq(0,0)

Since {A/2} is an ellipse centered at {0} we conclude that {-(a_2,b_2)\in A/2} and thus {(a_1,b_1)-(a_2,b_2)\in A}. The point {(x,y)} just found is then in both {A} and {C\subset B} and is not {(0,0)}, so we have {|x|,|y|\in{\mathbb N}} (because {p} is not a square, so {x} can’t be {0} and neither can {y}) and {|x|^2+|y|^2=p} as desired.

— 2. Minkowski theorem —

In this less elementary section I will explore the argument used in the previous section and formalize it. Let {d\in{\mathbb N}}. A lattice in {{\mathbb R}^d} is a discrete subgroup {\Lambda} which linearly generates the whole {{\mathbb R}^d} as a vector space over {{\mathbb R}} or, equivalently, it is a subgroup such that the quotient group {F:={\mathbb R}^d/\Lambda} is compact (for the quotient topology). We will denote by {\lambda} the natural projection from {{\mathbb R}^d} to {F}.

As an example take {d=2} and let {\Lambda} be the set {C} in the proof of Theorem 1 above. Another example is when {\Lambda={\mathbb Z}^2}. A non-example is {\Lambda={\mathbb Z}\times\{0\}}, still with {d=2}, because the quotient space can be identified with an infinite vertical strip of width {1} and is not compact.

Lemma 3 Let {d\in{\mathbb N}}, let {\Lambda} be a lattice in {{\mathbb R}^d} and let {F={\mathbb R}^d/\Lambda} be the quotient group. Denote by {\ell} the ({d}-dimensional) Lebesgue measure and, for a Borel set {A\subset F}, let

\displaystyle \mu(A)=\inf\left\{\ell(\tilde A):A\subset\lambda(\tilde A)\right\}\ \ \ \ \ (1)


Then {\mu} is (a normalization of) the Haar measure in {F}.

Proof: We have trivially that {\mu(\emptyset)=0} and {\mu(A)\geq0} for every measurable set {A}. Thus all that is needed for {\mu} to be a measure is being countably additive.

Let {(A_n)_{n\in{\mathbb N}}} be a sequence of disjoint measurable sets in {F} and let {A=\bigcup_{n\in{\mathbb N}}A_n}. Fix {\epsilon>0} and, for each {n\in{\mathbb N}}, let {B_n\subset{\mathbb R}^d} be such that {A_n\subset\lambda(B_n)} and {\mu(A_n)>\ell(B_n)-\epsilon/2^n}. Intersecting {B_n} with {\lambda^{-1}(A_n)} if needed, we can assume that {(B_n)} is a disjoint family (to see this, assume {x\in B_n\cap B_m}, then {\lambda(x)\in A_n\cap A_m=\emptyset} which is a contradiction). Thus we have that {B:=\bigcup_{n\in{\mathbb N}}B_n} has measure

\displaystyle \ell(B)=\sum_{n\in{\mathbb N}}\ell(B_n)<\sum_{n\in{\mathbb N}}\mu(A_n)+\epsilon

Since {A\subset\lambda(B)} and {\epsilon>0} was arbitrary, we conclude that {\mu(A)\leq\sum_{n\in{\mathbb N}}\mu(A_n)}.

To prove the other inequality, let {B\subset{\mathbb R}^d} be such that {A\subset\lambda(B)} and {\mu(A)>\ell(B)-\epsilon}. For each {n\in{\mathbb N}}, let {B_n=B\cap\lambda^{-1}(A_n)}. Since {A\subset\lambda(B)}, we have {A_n\subset\lambda(B_n)}. Thus {\mu(A_n)\leq\ell(B_n)} for each {n}. Since {(A_n)} are disjoint, then also {(B_n)} are disjoint. Summing over {n} we deduce

\displaystyle \sum_{n\in{\mathbb N}}\mu(A_n)\leq\sum_{n\in{\mathbb N}}\ell(B_n)\leq\ell(B)<\mu(A)+\epsilon

Since {\epsilon} is arbitrary we have {\mu(A)=\sum_{n\in{\mathbb N}}\mu(A_n)}. This implies that {\mu} is a measure on {F}.

The fact that {\mu} is invariant follows directly from the fact that {\ell} is invariant and {\lambda} is a group homomorphism. \Box

Equivalently one can identify {F} with a subset of {{\mathbb R}^d} (a fundamental domain) and take {\mu} to be the restriction of {\ell} to {F}, but we will not do this here.

We can now formulate Minkowski’s theorem, which is really behind the proof of Theorem 1:

Theorem 4 (Minkowski) Let {d\in{\mathbb N}} and let {\Lambda} be a lattice in {{\mathbb R}^d}. Let {F={\mathbb R}^d/\Lambda} and let {\mu} be the normalization of the Haar measure on {F} given by (1). Let {A\subset{\mathbb R}^d} be symmetric with respect to the origin and convex. If {\ell(A)>2^d\mu(F)}, there exists a nonzero point {x\in A\cap\Lambda}.

Proof: Let {B=A/2=\left\{x/2:x\in A\right\}}. Then {\ell(B)>\mu(F)} and I claim that there exist two points {x,x'\in B} such that {\lambda(x)=\lambda(x')}.

To see this we can use equation (1) to find a set {\tilde F\subset{\mathbb R}^d} such that {\ell(\tilde F)<\ell(B)} and {F=\lambda(\tilde F)}. For each {\gamma\in\Lambda} define

\displaystyle B_\gamma=\{x\in B:\exists y\in \tilde F:x-y=\gamma\}=B\cap(\tilde F+\gamma)

Since {\gamma(\tilde F)=F} we have that for every {x\in{\mathbb R}^d} there exists some {y\in\tilde F} such that {\lambda(x)=\lambda(y)} and hence {x\in \tilde F+\gamma} for some {\gamma\in\Lambda}. In particular we have that

\displaystyle B\subset\bigcup_{\gamma\in\Lambda}B_\gamma

and hence

\displaystyle \ell(B)=\sum_{\gamma\in\Lambda}\ell(B_\gamma)

Using the fact that {\ell} is translation invariant we deduce that

\displaystyle \ell(B)=\sum_{\gamma\in\Lambda}\ell(B_\gamma-\gamma)

But we have that {B_\gamma-\gamma\subset\tilde F}, by construction. Thus the sets {\{B_\gamma-\gamma:\gamma\in\Lambda\}} can not be disjoint. Let {\gamma_1\neq\gamma_2} be in {\Lambda} and such that {(B_{\gamma_1}-\gamma_1)\cap(B_{\gamma_2}-\gamma_2)\neq\empty} and let {y} be in that intersection. Then both {y+\gamma_1,y+\gamma_2} are in {B} and {\lambda(y+\gamma_1)=\lambda(y+\gamma_2)}, proving the claim.

So let {x\neq x'\in B} be such that {x-x'\in\Lambda}. Then also {-x'\in B} because {X} is symmetric with respect to the origin, and hence so is {B}. Since {A} is convex we conclude that {x-x'\in A}. \Box

The method from this post illustrates how one can use a variety of tools, from measure theory and geometry to algebra, to deduce a result in number theory. It looks like this could be the beginning of general result, but the fact that the set {C} in the proof of Theorem 1 is a lattice does not generalize to higher dimensions. One can still find a lattice of solutions for the local restriction, but the volume of the fundamental domain will be of the form {p^{d-1}}, which does not work well with the rest of the argument.


About Joel Moreira

PhD Student at OSU in Mathematics. I'm portuguese.
This entry was posted in Classic results, Number Theory and tagged , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s