Skip to main content

Featured Post

Backdooring Cryptography - Two characters that break your SSL encryption

In this article, we demonstrate a subtle but devastating backdoor in finite-field Diffie–Hellman. By computing public keys modulo $p^2$ instead of $p$ while restricting the secret exponent to $x \leq p-1$, the discrete logarithm becomes efficiently recoverable using Fermat quotients. We show the full derivation and provide a working Sage implementation. Backdoors are always bad — but they are catastrophic when they are embedded in a fundamental primitive like Diffie–Hellman key exchange. If your browser shows a green lock, you assume your connection is secure. But what if the implementation of Diffie–Hellman contains a tiny change that looks harmless in code review — and yet allows an attacker to recover the private exponent in milliseconds? In this post I’ll show a nasty little backdoor that requires only a tiny modification: using a modulus of $p^2$ instead of $p$, while keeping the secret exponent bounded by $p$ This ...

De Bruijn Sequences and the Discrete Logarithm Problem

This blog post is about De Bruijn sequences and its relationship to cryptography. We focus here on De Bruijn sequences over the alphabet $A=\{0,1\}$.
Definition [De Bruijn sequence (binary)]. A binary string of length $2^n$ is called a De Bruijn sequence of order $n$ if each possible $n$-bit string occurs exactly once (incl. wrap-around) as a substring. 
Picture 1 - DeBruijn graph for n=3
De Bruijn sequences are the shortest possible bitstring that contains all possible $n$-bit substrings. For each $n$ there are $$\frac{2^{2^{n-1}}}{2^n}$$ possible De Bruijn sequences. For example, a De Bruijn sequence of order $n=3$ is $$00011101$$ and of order $n=4$ is $$0000101100111101$$ There are several ways to create De Bruijn sequences. One way is walk along an Hamiltonian Cycle in the corresponding De Bruijn graph (see Picture 1). A De Bruijn graph is a directed graph of degree $2$ which has $2^n$ vertices. There is an edge from vertex $i$ to the two vertices $$(2i\;\text{mod}\:2^{n-1})\;\text{and}\;(2i\;\text{mod}\;2^{n-1})+1$$ The example sequence above for $n=3$ can be obtained by visiting the vertices in the graph from Picture 1 in the following order: $0,1,3,7,6,5,2,4,(0)$. Another way to create De Bruijn sequences is to use a Linear Feedback Shift Register (LFSR), loaded with a primitive polynomial of degree $n$ to create a $m$-sequence. $m$-sequences are De Bruijn sequences that miss the substring $0_n$, i.e., the substring that consists of $n$ zeros (sometimes also called a punctured De Bruijn sequence). There is also a relationship between discrete logarithms in GF$(2^n)$ and $m$-sequences which however is not the relationship i want to cover here. A good point to start reading about this topic is the book Algebraic Shift Registers Sequences from Goresky and Klapper [1].


Based on the definition of De Bruijn sequences it follows that each $n$-bit string is unique. Hence associated with De Bruijn sequences there is the following problem:
Definition [Decoding Problem]. Given a De Bruijn sequence of order $n$, called $S_n$, and a unique $n$-bit substring $s$, find the position of $s$ in $S_n$.
For some special structured De Bruijn sequences the Decoding Problem is efficiently solvable [2]. As mentioned above, $m$-sequences are De Bruijn sequence, that miss the all zero substring $0_n$ and are associated with LFSR which again are associated with discrete logarithms over GF($2^n$). Hence, one can ask the following question:
Q: Take the De Bruijn graph, e.g., from Picture 1, and do not walk along a Hamiltonian Cycle, but a shorter cycle. Then by definition of the De Bruijn graph one gets a sequence that also contains only unique $n$-bit substrings but not all of them. Is there something special about those sequences?
For example, one can walk along the graph from Picture 1 by forming the cycle $1,3,6,4,(1)$ which corresponds to the sequence $$0011$$ and has the corresponding unique substrings $$001,011,110,100$$ There are of course many such cycles in a given graph. And indeed, some of them are special and are called $l$-sequences. They have a special relationship to the (partial) Fermat quotient $q_p(2)$.
Definition [(Partial) Fermat Quotient]. Given a prime number $p$ and let $h$ be the multiplicative order of $2$ in $\mathbb{F}^*_p$, then $$q_p(2) = \frac{2^h-1}{p}$$
If $h=p-1$, i.e.,$2$ is a primitive root in $\mathbb{F}^*_p$, $q_p(2)$ is the usual Fermat Quotient otherwise it is the partial Fermat Quotient. The relationship is, that the binary representation of $q_p(2)$ is an $l$-sequence. The proof is actually not that complicated and the uniqueness of each $n$-bit substring comes from the fact, that each group in $\mathbb{F}^*_p$ contains only unique group elements and i used this relationship also in the paper [3] to link group elements to $p$-adic digit sums. Because some $l$-sequences are equal to a (partial) Fermat quotient gives those sequences a small linear complexity, i.e., you can write a very short "shortcut" for the $l$-sequence (namely $q_p(2)$) rather than writing all its digits one by one.

If $\mathbb{G}_2$ is the subgroup of $\mathbb{F}^*_p$ generated by $2$ and let $[n]$ be the set of all $n$-bit substring of the $l$-sequence $q_p(2)$, then we have an effiently computable function $$M_p: \mathbb{G}_2 \rightarrow [n] $$ with $$M_p: r \in \mathbb{G}_2 \rightarrow \text{Binary}(-rp^{-1}\;\text{mod}\;2^n)$$ for which also the inverse function is efficiently computable.
Theorem. The decoding problem of $l$-sequences of order $n$ is equal to solve the discrete logarithm in $\mathbb{F}^*_p$ to base $g=2$ for an $n$-bit prime number $p$.
Proof. The proof of the theorem is very short. Given a discrete logarithm instance $(p,r,g)$ (i.e. $g^x \equiv r\pmod{p}$) and given the $l$-sequence $L$ obtained from the binary representation of $(2^h-1)/p$, then $L$ always starts with $0_{n-1}1$. Then compute $t \leftarrow M_p(r)$ and solve the Decoding Problem by finding the position of $t$ in $L$ and the solution is $t+n$.■

Example: $p=29$ then it is $$q_{29}(2) = \frac{2^{28}-1}{29}= 9.256.395$$ Then the binary representation of $q_{29}(2)$ is (prepended with zeros to reach a length of $h$) :$$ (*)\;\;\;0000100011010011110111001011$$ It is an order $5$ $l$-sequence of length $28$. It only misses the four substrings $$00000,  01010, 10101, 11111$$ Suppose the given discrete logarithm instance is $(29,21,2)$, i.e., $2^x \equiv 21\pmod{29}$ (The secret is $x=17$). Then $$M_{29}(21) \equiv -21\cdot{29}^{-1} \pmod{2^5} \equiv 7 = 00111_2$$ For $(29,19,2)$ (solution $x=9$) it is $$M_{29}(19) \equiv -19\cdot{29}^{-1} \pmod{2^5} \equiv 17 = 10001_2$$ As you can see from $(*)$, in the first case, the Decoding algorithm returns $12$ (starting from position $0$) and $12+5=17$ and in the second case, the Decoding algorithm returns $4$ and $4+5=9$ which are both the correct solutions.

Lempel's Homomorphism

There is a way to step from a cycle in a De Bruijn graph of order $n$ to a cycle in a De Bruin graph of order $n-1$. It is called Lempel's Homomorphism, see [3]. It is not restricted to De Bruijn sequences and hence can be applied even to short cycles, e.g., $l$-sequences.
Definition [Lempel's Homomorphism]. Let $S_n = b_0b_1b_2\ldots b_{n-1}$ and $S_{n-1} = \beta_0\beta_1\beta_2\ldots \beta_{n-2}$ two binary string of length $n$ and $n-1$ respectively. Then $$\text{D}: \{0,1\}^n \rightarrow \{0,1\}^{n-1}$$ with $$\text{D}(S_n) = S_{n-1}\;\text{with}\;\beta_i = b_i \oplus b_{i+1},\;\;i = 1,\ldots, n-1$$
We need some definitions before we can apply Lempel's Homomorphism. For a bit $b_i$ let $\overline{b}_i$ the inverse, i.e., $b_i = 0$, then $\overline{b}_i = 1$ and vice versa. If $S_n$ is a bitstring then $\overline{S}_n$ means that each bit is inverted. $S_n$ and $\overline{S}_n$ are said to be dual to each other.

Primitive. A bit string $S_n$ (the subscript $n$ refers to the bit length of unique substrings) is primitive if $S_n$ contains the $n$-bit substring $s$ but not the substring $\overline{s}$.
Self dual. A bit string $S_n$ is self dual if $\overline{S}_n$ is a cyclic shift $S_n$. If $S_n$ is self dual its length is always even.

Note that self dual also implies, that for each $n$-bit substring $s_i$ of $S_n$, $\overline{s}_i$ is also a substring of $S_n$.

Then Lempel stated the following Theorems:
Theorem A. Given a bitstring $S_n$ that represents a cycle of length $k$ in a De Bruijn graph of order $n$, then $\text{D}(S_n)$ is a bitstring $S_{n-1}$ that represents a cycle of length $k$ in a De Bruijn graph of order $n-1$.
Theorem A states that if you have a cycle in a De Bruijn graph of order $n$ you can find a cycle of the same length in a De Bruijn graph of order $n-1$. Note that this implies that the cycle length must be less or equal to $2^{n-1}$ since this is the longest possible cycle in a De Bruijn graph of order $n-1$.
Theorem B. Given a bitstring $S_n$ that is self dual and represents a cycle of length $2m$ in a De Bruijn graph of order $n$, then $\text{D}(S_n)$ is a cycle of length $m$ in the De Bruijn graph of order $n-1$.
Theorem B states a way to step down to a De Bruijn graph of smaller order and concurrently halve the length of the cycle. The relationship between self dual sequences and $q_p(2)$ is that self dual implies that the group element $-1$ is element of the group generated by $2$ in $\mathbb{F}^*_p$.

If one applies this two theorems to $l$-sequences that come from the binary representation of $q_p(2)$, can they be used to gain some knowledge about a given discrete logarithm problem in $\mathbb{F}^*_p$? It seems that the linear complexity of a given sequence increases exponentially when $\text{D}$ is applied, but are there some exceptions?

[1] Goresky and Klapper, Algebraic Shift Register Sequences, Cambridge University Press, 498 pages
[2] Chris J. Mitchell, Tuvi Etzion and Kenneth G. Paterson, A Method for Constructing Decodable de Bruijn Sequences, IEEE Transactions on Information Theory, Vol. 42, No. 5, 1996
[3] On a Homomorphism of the de Bruijn Graph and Its Applications to the Design of Feedback Shift Registers, Abraham Lempel, IEEE Transactions on Computers, Vol. C-19, No. 12, 1970
[4] Christian Schridde, Computing discrete logarithms using $\mathcal{O}((log\;q)^2)$ operations  from $\{+, -, \times,\div\, \&\}$, Groups Complexity Cryptology, Vol. 8, No. 2, pp. 91-107, 2016

Comments

Popular posts from this blog

Kryptos - The Cipher (Part 4) - Correctly positioned decryption of the word BERLIN

EASTNORTHEAST - This is not exactly the hint Jim Sanborn (JS) gave for K4 on the 29th of January this year. He only gave NORTHEAST - which refers to the positions 26-34 of K4's plaintext.  Beside BERLIN and CLOCK it is the third revealed plaintext word of K4. However, also this hint does not seem to help much.  However, it just so happened, that a member in the yahoo kryptos group had a conversation with Jim Sanborn due to a submitted solution. Sandborn's answer to the question contained again the last clue which surprisingly was EASTNORTHEAST at position 22-34. Jim Sanborns compass rose at CIA There is disagreement if Jim revealed this on purpose or he did it accidentially, but the new extended clue seem to be serious and valid.Interestingly, EASTNORTHEAST is exactly the direction which is illustrated on the compass rose on one of the stones around kryptos, also created by Jim Sanborn. Actually, i dont really kn...

Kryptos - The Cipher (Part 1) - Introduction

Introduction. Since I think that KRYPTOS does not need any introduction, I will only give you a brief description of one of the most famous and only partially solved ciphers known today: KRYPTOS - Von Jim Sanborn - Jim Sanborn, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=8253447 KRYPTOS was constructed in Nov. 1990 on the ground of the CIA Headquarter in Langley, Virginia by Jim Sanborn It contains 4 ciphers (K1,K2,K3,K4) on its left side and some kind of Vigenère-Table on its right side K1, K2 and K3 were solved by James Gillogly in 1999. Afterwards, the CIA and later the NSA claimed that they had a solution to the first three ciphers at an earlier point in time Ed Scheidt, a cryptoanalyst and former director of the CIA, gave Sanborn the input of possible cryptographic techniques to use K1 is a variant of the Vigenère-Cipher (Quagmire 3) with the codewords KRYPTOS and PALIMPSES...

Kryptos - The Cipher (Part 3) - K4 Intentional vs. non-intentional errors

This post is about is more or less a collection of several approaches and facts that has been said as well as some speculations. B-ary integer representation According to [1] during a Question and Answer round, Jim Sanborn was asked again about the hint BERLIN. The question was if N decodes to B, Y decodes to E, etc, etc. and Jim confirmed it does. Emphatically . It is written, that Jim Sanborn rattled through the entire crib: \begin{align}   \texttt{N} &\stackrel{\text{decode}}{\rightarrow} \texttt{B} \\   \texttt{Y} &\stackrel{\text{decode}}{\rightarrow}  \texttt{E} \\   \texttt{P} &\stackrel{\text{decode}}{\rightarrow}  \texttt{R} \\   \texttt{V} &\stackrel{\text{decode}}{\rightarrow}  \texttt{L} \\   \texttt{T} &\stackrel{\text{decode}}{\rightarrow}  \texttt{I} \\   \texttt{T} &\stackrel{\text{decode}}{\rightarrow}  \texttt{N} \end{align} When the same q...