Skip to main content

Featured Post

Kryptos - The Cipher (Part 4) - Correctly positioned decryption of the word BERLIN

EASTNORTHEAST - This is not exactly the hint Jim Sanborn (JS) gave for K4 on the 29th of January this year. He only gave NORTHEAST - which refers to the positions 26-34 of K4's plaintext.  Beside BERLIN and CLOCK it is the third revealed plaintext word of K4. However, also this hint does not seem to help much.  However, it just so happened, that a member in the yahoo kryptos group had a conversation with Jim Sanborn due to a submitted solution. Sandborn's answer to the question contained again the last clue which surprisingly was EASTNORTHEAST at position 22-34. Jim Sanborns compass rose at CIA There is disagreement if Jim revealed this on purpose or he did it accidentially, but the new extended clue seem to be serious and valid.Interestingly, EASTNORTHEAST is exactly the direction which is illustrated on the compass rose on one of the stones around kryptos, also created by Jim Sanborn. Actually, i dont really kn...

How to detect primes in polynomial time - Understanding the AKS Algorithm

Detecting primes in deterministic polynomial time was a goal for hundreds of years.

In 2002, a major breakthrough occurred in number theory: for the first time, a proof was published showing how to determine, in deterministic polynomial time, whether a given integer $n$ is prime. The resulting method is now known as the AKS algorithm, named after its authors Agrawal, Kayal, and Saxena. For their work, they received two major international awards: the Gödel Prize and the Fulkerson Prize.

Of course, even before 2002, finding large prime numbers was far from hopeless. In fact, the algorithms used in practice today are essentially the same as those used before AKS, because they are extremely efficient. However, before AKS, all known efficient primality tests had one of the following drawbacks:

  1. they ran in polynomial time but relied on unproven conjectures, such as the Extended Riemann Hypothesis;

  2. or they were probabilistic, meaning that they could make an error, although with astronomically small probability after sufficiently many trials.

Many of these algorithms are based on the simple but powerful Fermat's Little Theorem, which has been known for centuries

Fermat's Little Theorem

Given a prime number $p \in \mathbb{P}$ and integer $b \in \mathbb{Z}$ then \begin{equation} b^{n} \equiv b \pmod{n} \end{equation}

This gives us a very simple test for compositeness: if we find a base $b$ such that \[ b^n \not\equiv b \pmod n,\] then $n$ cannot be prime.

However, the converse is not always true. For certain composite integers $n$ and certain bases $b$, the congruence \[ b^n \equiv b \pmod n \] may still hold. In that case, the composite number $n$ behaves like a prime number with respect to the base $b$.

Video: Fermat's Little Theorem

If $n$ is composite, but the congruence above holds for a particular base $b$, then $n$ is called a base-$b$ pseudoprime. It is called a pseudoprime because, for this specific base, it passes the same test that every prime number would pass.

In the video above, for example, you can see that $15$ behaves like a prime number for certain bases. Nevertheless, for many Fermat-type tests, a composite number can only fool the test for a limited proportion of bases. Therefore, if we test many randomly chosen bases, the probability of mistakenly declaring a composite number to be prime becomes very small. After $10$ independent successful tests, the error probability is already roughly bounded by $1/2^{10}$ in the idealized setting; after $100$ such tests, it is negligible for practical purposes.

This is the basic philosophy behind many primality tests used in practice today: they are fast, simple, and extremely reliable, but still probabilistic.

There is, however, a serious obstacle for the basic Fermat test. Some composite integers pass Fermat's test for every base $b$ that is coprime to $n$. These numbers are called Carmichael numbers. They are rare, but there are infinitely many of them.

One way to deal with Carmichael numbers is to move to the stronger concept of strong pseudoprimes. This leads to much better tests, such as the Miller–Rabin primality test. These tests can expose Carmichael numbers very efficiently, but they still do not remove the probabilistic aspect entirely.

The key idea behind the AKS algorithm is different. Instead of relying only on Fermat's Little Theorem for integers, Agrawal, Kayal, and Saxena considered a polynomial generalization of it.

Generalized Fermat's Little Theorem

Let $n \in \mathbb{N}$ ($n \geq 2$) and $a \in \mathbb{Z}$ then $n$ is prime if and only if \begin{equation} (\text{X}+a)^n \equiv \text{X}^n + a \pmod{n} \end{equation}

As stated in their original paper:

"The above identity (2) suggests a simple test for primality: given an input $n$, choose an $a$ and test whether the congruence is satisfied. However, this takes time $\Omega(n)$ because we need to evaluate $n$ coefficients in the LHS in the worst case."

So testing simply all values for $a$ is not possible. They came up with the following beautiful algorithm.

1. Algorithm

AKS Algorithm
Input: integer n > 1
  1. If $n$ is a PerfectPower return COMPOSITE
    // I.e. test if $n = a^b$ for $a \in \mathbb{N}$ and $b > 1$.
  2. Find the smallest integer $r$ such that the smallest integer $k$ for which $r | (n^k - 1)$ holds, is larger than $\log(n)^2$.
    // It is a critical part of the proof, that it can be shown that $r$ is at most $\log(n)^5$
  3. Test for all integers $0 \leq a \leq r$ if $\gcd(a,n) > 1$. If yes return COMPOSITE
    // Since $r$ is small we test if there is a integer less than $r$ which has a greatest common divisor with $n$ larger than $1$.
  4. If $n \leq r$ return PRIME
    //Since we tested in Step 3 if any integer less than $r$ has a common divisor with $n$, but if $n$ is also less then $r$, then $n$ must be prime.
  5. For $a = 1$ to $\lfloor \sqrt{\varphi(r)}\log(n) \rfloor$
    1. if $(X+a)^n \not\equiv X^n+a \pmod{X^r-1}$ return COMPOSITE
    // If this fails we know from the generalisation of Fermat's Little Theorem that $n$ must be composite.
  6. Return PRIME
    // This is the critical part of the algorithm. If we end here, the integer must be prime, albeit testing only a few values in Step 5.

The critical parts that one need to understand are Step 2 and Step 6. Since the other ones are more or less self-exploratory and do not address the core ideas of their work.

And if you understand Step 2 and Step 6 you can tell yourself that you understand one of the major breakthrough in Number Theory. For hundreds, perhaps thousand of years, nobody though about this solution.

3. Step 2

From now on, and for the rest of the post, let $n$ be the integer whose primality we want to test.

I will not repeat the full proof in all its detail here. Instead, I want to explain the core idea behind why this step works. For Step 2, the key question is:

Why does there exist an integer $r \leq \log^5(n)$ such that $n^k - 1$ is not divisible by $r$ for all integers $k \leq \log^2(n)$?

The main ingredient in this step is the least common multiple (LCM), which most of us probably learned about in school. The $\textsf{lcm}$ of the integers $a_1, a_2, \ldots, a_m$ is the smallest positive integer $M$ that is divisible by each of the integers $a_i$. For example, the $\textsf{lcm}$ of the first 10 positive integers is \[ \textsf{lcm}(1,2,3,4,5,6,7,8,9,10) = 2520. \] There is no smaller positive integer that is divisible by all integers from $1$ to $10$. We will use the following basic theorem.

Theorem: LCM lower bound

For the $\textsf{lcm}$ of the first $m > 7$ positive integers, we have: \begin{equation} \textsf{lcm}(1,2,\ldots,m) \geq 2^m. \end{equation}

In our example this is true, since $2520 \geq 2^{10} = 1024$. This simple estimate is essentially all we need.

Let \[ B = \lceil \log^5(n) \rceil. \] The proof defines the integer \[ M := n^{\lfloor \log B \rfloor} \prod_{i=1}^{\lfloor \log^2(n) \rfloor} (n^i - 1). \] For the moment, ignore the prefactor $n^{\lfloor \log B \rfloor}$; it is included for technical reasons. The important part is the product \[ \prod_{i=1}^{\lfloor \log^2(n) \rfloor} (n^i - 1). \] We want to show that not every integer $r \leq B$ can divide this product. In other words, there must be at least one integer $r \leq B$ that does not divide any of the terms $n^k - 1$ for $k \leq \log^2(n)$.

The trick is to compare the size of $M$ with the least common multiple of all integers from $1$ to $B$. We estimate: \begin{align*} M &\leq n^{\lfloor \log B \rfloor + \frac{1}{2}\log^2(n)(\log^2(n)-1)} \\ &\leq n^{\log^4(n)} \\ & \leq n^{\log^5(n)} \\ & \leq 2^B. \end{align*} Thus, $M$ is at most $2^B$. On the other hand, by the LCM lower bound, \[ \textsf{lcm}(1,2,\ldots,B) \geq 2^B. \] The number $\textsf{lcm}(1,2,\ldots,B)$ is the smallest positive integer divisible by every integer from $1$ to $B$. Therefore, if $M$ is smaller than this LCM, then $M$ cannot be divisible by every integer from $1$ to $B$.

Hence, there must be at least one integer $r \leq B = \lceil \log^5(n) \rceil$ that does not divide $M$. Since each factor $n^k - 1$ divides $M$, this means that this same $r$ cannot divide any $n^k - 1$ for $k \leq \log^2(n)$. This is exactly the integer $r$ we were looking for.

That is the whole idea. We have shown that such an $r$ must exist. Therefore, by testing the integers up to $\log^5(n)$, we are guaranteed to find one.

4. Step 6

Step 6 is the core of the algorithm. Step 6 is the heart of the AKS algorithm. Up to this point, the algorithm has only checked the polynomial congruence for comparatively few values of $a$. Nevertheless, if all these checks succeed, the algorithm returns PRIME.

We set $\lambda = \lfloor \sqrt{\varphi(r)}\log(n) \rfloor$ (this is the number of the values for $a$ to test).

Why is $n$ prime if we reached Step 6, i.e., especially if we only checked $\lambda$ values for $a$ in the equation \[ (X+a)^n \equiv X+a \pmod{X^r-1} \] in Step 5?

To understand the main idea, suppose that $n$ is composite, and let $p$ be a fixed prime divisor of $n$. The goal is to show that, under the assumptions made in the previous steps, this leads to a contradiction unless $n$ is actually a prime power.

The only object we want to define here is the set $\text{I}$: \[ \text{I} = \left\{ \frac{n^i}{p^i}\cdot p^j\;|\; i,j \geq 0\right\} \] The set $\text{I}$ contains infinity integers. If we reduce every integer in the set modulo $r$ (from Step 3), we reduce it to a finite number. Let this number $t$, i.e. \[ t = \left| \left\{ \frac{n^i}{p^i}\cdot p^j\pmod{r}\;|\;i,j \geq 0\right\} \right| \]

Can we give a lower bound for $t$? Yes. By the choice of $r$, we know that \[ \operatorname{ord}_r(n) > \log^2(n).\] This means that the first $\log^2(n)$ powers \[ 1, n, n^2, \ldots \] are all distinct modulo $r$. Therefore, \[ t \geq \log^2(n). \] In particular, \[ \sqrt{t} \geq \log(n) \] and \[ t \geq 1 + \lfloor \sqrt{t}\log n \rfloor \]

The core of the proof is to define another group $\mathcal{G}$ whose size can be bounded in two different ways. The group depends on $n$, $r$, and the polynomial factors that appear in Step 5.

We will not define the group $\mathcal{G}$ in full detail here. What matters is that there are two crucial bounds for its size:

  1. A lower bound: if the congruences in Step 5 hold for all tested values of $a$, then $\mathcal{G}$ must be large.

  2. An upper bound: if $n$ is composite and not a prime power, then $\mathcal{G}$ cannot be too large.

More precisely, it was already known that the size of $\mathcal{G}$ is larger than \[ \binom{t+\lambda}{t-1}. \] On the other hand, AKS proved that if $n$ is not a prime power, then \[ |\mathcal{G}| \leq n^{\sqrt{t}}. \]

This comparison is the key idea. If Step 5 succeeds for all tested values of $a$, then the lower bound becomes so large that it contradicts the upper bound for composite non-prime-powers.

Of course, the algorithm does not actually compute the group $\mathcal{G}$. The group is only used in the proof of correctness. Its size is far too large to compute directly. Instead, the argument shows theoretically that if all tests in Step 5 pass, then $n$ cannot be composite unless it is a prime power.

Let us see why the lower bound becomes too large.

After testing the $\lambda$ values in Step 5, we obtain enough distinct generators for the group $\mathcal{G}$. This gives the lower bound \[ |\mathcal{G}| \geq \binom{t+\lambda}{t-1}. \] Since \[ \lambda = \left\lfloor \sqrt{\varphi(r)}\log(n) \right\rfloor \] and $t \leq \varphi(r)$, we have enough tested values to force the following chain of inequalities:

After checking $\lambda$ values in Step 5 we can found $\lambda$ distinct generators for the group $\mathcal{G}$. Knowing this we can already compute the lower bound $\binom{t+\lambda}{t-1}$ for the current state \begin{align*} \left|\mathcal{G}\right| & \geq \binom{t + \lambda}{t-1} \\ & \geq \binom{1 + \lfloor \sqrt{t}\log n \rfloor + \lambda}{ \lfloor \sqrt{t}\log n \rfloor } \\ & \geq \binom{1 + 2\lfloor \sqrt{t}\log n \rfloor}{ \lfloor \sqrt{t}\log n \rfloor } \\ & > 2^{\lfloor \sqrt{t}\log n \rfloor + 1}\\ & \geq n^{\sqrt{t}} \end{align*} Thus, if all congruences in Step 5 hold, then the group $\mathcal{G}$ must satisfy \[ |\mathcal{G}| \geq n^{\sqrt{t}}. \] But this contradicts the AKS upper bound \[ |\mathcal{G}| \leq n^{\sqrt{t}} \] for composite integers that are not prime powers.

Therefore, if the algorithm reaches Step 6, then $n$ must be a prime power. But Step 1 already checked whether $n$ is a perfect power. Hence the only remaining possibility is \[ n = p \] for some prime number $p$.

This is why Step 6 is valid: if all previous tests succeed, then $n$ cannot be composite. The algorithm may look surprisingly short, but the proof shows that these few tests force the hidden group $\mathcal{G}$ to become too large for a composite number.

Comments

  1. Cool! The prime number 97 in the picture, does this happen to be a nod to Kryptos K4? ☺️

    ReplyDelete

Post a Comment

Popular posts from this blog

Kryptos - The Cipher (Part 4) - Correctly positioned decryption of the word BERLIN

EASTNORTHEAST - This is not exactly the hint Jim Sanborn (JS) gave for K4 on the 29th of January this year. He only gave NORTHEAST - which refers to the positions 26-34 of K4's plaintext.  Beside BERLIN and CLOCK it is the third revealed plaintext word of K4. However, also this hint does not seem to help much.  However, it just so happened, that a member in the yahoo kryptos group had a conversation with Jim Sanborn due to a submitted solution. Sandborn's answer to the question contained again the last clue which surprisingly was EASTNORTHEAST at position 22-34. Jim Sanborns compass rose at CIA There is disagreement if Jim revealed this on purpose or he did it accidentially, but the new extended clue seem to be serious and valid.Interestingly, EASTNORTHEAST is exactly the direction which is illustrated on the compass rose on one of the stones around kryptos, also created by Jim Sanborn. Actually, i dont really kn...

Kryptos - The Cipher (Part 1) - Introduction

Introduction. Since I think that KRYPTOS does not need any introduction, I will only give you a brief description of one of the most famous and only partially solved ciphers known today: KRYPTOS - Von Jim Sanborn - Jim Sanborn, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=8253447 KRYPTOS was constructed in Nov. 1990 on the ground of the CIA Headquarter in Langley, Virginia by Jim Sanborn It contains 4 ciphers (K1,K2,K3,K4) on its left side and some kind of Vigenère-Table on its right side K1, K2 and K3 were solved by James Gillogly in 1999. Afterwards, the CIA and later the NSA claimed that they had a solution to the first three ciphers at an earlier point in time Ed Scheidt, a cryptoanalyst and former director of the CIA, gave Sanborn the input of possible cryptographic techniques to use K1 is a variant of the Vigenère-Cipher (Quagmire 3) with the codewords KRYPTOS and PALIMPSES...

Kryptos - The Cipher (Part 3) - K4 Intentional vs. non-intentional errors

This post is about is more or less a collection of several approaches and facts that has been said as well as some speculations. B-ary integer representation According to [1] during a Question and Answer round, Jim Sanborn was asked again about the hint BERLIN. The question was if N decodes to B, Y decodes to E, etc, etc. and Jim confirmed it does. Emphatically . It is written, that Jim Sanborn rattled through the entire crib: \begin{align}   \texttt{N} &\stackrel{\text{decode}}{\rightarrow} \texttt{B} \\   \texttt{Y} &\stackrel{\text{decode}}{\rightarrow}  \texttt{E} \\   \texttt{P} &\stackrel{\text{decode}}{\rightarrow}  \texttt{R} \\   \texttt{V} &\stackrel{\text{decode}}{\rightarrow}  \texttt{L} \\   \texttt{T} &\stackrel{\text{decode}}{\rightarrow}  \texttt{I} \\   \texttt{T} &\stackrel{\text{decode}}{\rightarrow}  \texttt{N} \end{align} When the same q...