Imagine Alice, Bob and Eve sit together inside a room. None of them have met before. They get to talk and after some time Alice wants to tell Bob something that Eve should not hear. Also, Eve is a talented eavesdropper, so whispering is not an option. Luckily, Alice has heard of the DiffieHellmanMerkle key exchange and explains it to Bob. They perform the key exchange and although Eve heard everything they said, Alice and Bob end up with a shared key that Eve does not know. They use it to encrypt their further communication and talk about their deepest secrets. From Eve’s perspective Alice and Bob are only talking gibberish.
The first time I heard about the DiffieHellmanMerkle key exchange, this sounded like magic to me. A key exchange entirely taking place over a public channel and still leading to a shared secret between the two parties. How should this even be possible?
Some years later during my cryptography lectures, naturally the DiffieHellmanMerkle key exchange was discussed. We learned about commutative groups and rings, cyclic groups, the discrete logarithm problem and other advanced topics. I had the feeling that I understood everything about the key exchange in depth.
Fast forward to my job as penetration tester.
The question came up whether it is secure to use SSH with diffiehellmangroup14sha256
.
With everything learned during my cryptography lectures, I had to admit to myself that I could not answer this question.
What’s group14
and where do you even need a hashing algorithm as SHA256 during the key exchange?
Thus, I decided to take a closer look at the key exchange.
This blog post discusses parts of the DiffieHellmanMerkle key exchange, that I found to be relevant regarding its security. We are going to start with a bit of theory and go over to take a look at attack vectors against the basic key exchange procedure. Finally, we are going to evaluate its implementation in SSH, according to the previously defined attacker vectors.
Basics
The DiffieHellmanMerkle (DHM) key exchange was developed in 1976 by Whitfield Diffie, Martin Hellman and Ralph Merkle.^{1} However, the term DiffieHellman (DH) key exchange established, undermining Merkle’s contribution to its invention.^{2} The purpose of DHM key exchange is to securely establish a shared secret over an insecure channel.
As such it is very popular in modern cryptography and being used in a wide range of network protocols, such as SSH, TLS and IPSec.
Algebraic Fundamentals
At first, we’re going to take a look at a tiny bit of group theory.
In algebra, the combination of a set of numbers and an operation that fulfills certain requirements is called a group.
The finite group we are interested in, is called the multiplicative group of integers modulo $n$, typically denoted $\mathbb{Z}^*_n$.
The set of numbers in this group is $\{0, 1, \dots, n1\}$ and the operation is multiplication modulo
Let’s look at the multiplication of two elements from $\mathbb{Z}^*_7$ for clarification: $2 \cdot 5 \mod 7 = 10 \mod 7 = 3$.
A special case of such groups are socalled cyclic groups. These are groups that are generated by the powers of a single element $g$, the generator. This means that the cyclic group contains all elements that can be calculated with the term $g^x \mod n$ with $x \in \mathbb{Z}$.
For comprehension, let’s take a look at the illustration of the cyclic groups for $n=7$ generated by $g=2$ and $g=3$.
The elements of $\mathbb{Z}^*_7$ are plotted on the outline of the circle. Consecutive elements are connected by an arrow and calculated like this for $g=2$:
 $x=0$: $2^0 \mod 7 = 1$
 $x=1$: $2^1 \mod 7 = 2$
 $x=2$: $2^2 \mod 7 = 4$
 $x=3$: $2^3 \mod 7 = 8 \mod 7 = 1$
…
Note that every element could be calculated from its preceding element by multiplying it with $g$ and applying the modulo operation.
As we can see, the choice of the generator affects the amount of elements in the group, also called the group order. A very interesting property of group orders is described by Lagrange’s theorem.^{3} It states that the order of any subgroup divides the order of the original group. In the above example, the cyclic group for $g=3$ has an order of $6=3 \cdot 2$. The subgroup for $g=2$ has the order $3$, which indeed divides $6$. In the case of cyclic groups, the statement of the theorem is even stronger. In fact, for every divisor $r$ of the order of a cyclic group, a subgroup with the order $r$ will exist.
Another property we will need is that $g^q = 1$ if $g$ generates a group of order $q$. This is because $g^0 = 1$ and the cyclic group will repeat after $q$ elements.
Note: Another interesting group for DHM is the multiplicative group of points on an elliptic curve. We are not going to discuss elliptic curves in this blog post.
The Algorithm
Equipped with this knowledge, let’s take a look at the typical textbook description of DHM key exchange.
Assume, Alice and Bob want to talk to each other via an insecure channel. To secure their communication, they perform the following steps.
 Alice chooses a large prime number $p$, a positive integer $g \lt p$ and a random positive integer $a \lt p$. $a$ must be kept secret.
 Alice calculates $A=g^a \mod p$.
 Alice sends $p$, $g$ and $A$ to Bob.
 Bob chooses a random positive integer $b \lt p$, which must be kept secret.
 Bob calculates $B=g^b \mod p$.
 Bob sends $B$ to Alice.
 Alice calculates $K = B^a \mod p$.
 Bob calculates $K = A^b \mod p$.
$K$ is the shared secret of Alice and Bob.
The following figure illustrates the procedure:
Note: Sometimes $p$ and $g$ are also assumed to be known to Alice and Bob prior to the key exchange. Both specifications can be considered identical.
The magic of the key exchange lies within the fact that Alice and Bob never exchange their secret keys $a$ and $b$ but end up with the same shared secret $K$. The eavesdropper Eve is not able to derive $K$ from the public values $p$, $g$, $A$ and $B$. The security of the scheme depends on the assumption that there is no efficient solution to the discrete logarithm problem, meaning given $g^x \mod p$, it is hard to find $x$.
Attack Vectors
Assuming that the discrete logarithm problem is indeed hard to solve, still a few attack vectors arise when examining the key exchange in practice.
Machine in the Middle
The most prominent attack can be performed by an active attacker manipulating traffic: the socalled machineinthemiddle (MITM) attack. The attack can be thought of as Mallory making her own specific key exchanges with Alice and Bob, while Alice and Bob think they are talking to each other.
Of course, Mallory could also choose her own $p$ and $g$ for the key exchange with Bob.
In order to defend against this attack, we need to introduce some sort of authentication. However, this is not part of the textbook DHM key exchange. This is a first hint towards the hashing algorithm we might need later on.
Bad Arguments
Even when only considering passive attackers, a lot can go wrong during our key exchange. We don’t yet know how to choose arguments for the parameters $p$, $g$, $a$ and $b$ such that the key exchange is secure.
We are going into a little more detail here. However, I’ll try to clarify the concepts by illustrating cyclic groups as above.
Bad Primes
Although text books usually state that $p$ must be a huge prime, there are sets of prime numbers that are a bad choice for the key exchange. The problem arises from the group order property stated by Lagrange’s theorem. Since the element $0$ is never inside a cyclic group, the order of the largest group is $p1$. This definitely is no prime number and might even have several divisors. Thus, subgroups exist that have an order equal to such a divisor. This puts us in the realm of smallsubgroup attacks.
Let’s take a look at an example. The following figure shows the 12 cyclic groups for $p=13$ and $1 \leq g \leq 12$.
The order of the largest cyclic group is $12=3 \cdot 2 \cdot 2$. Thus, we know that there will be subgroups of order $2$ and $3$. This is confirmed by the illustration through $g=3$, $g=9$, and $g=12$.
If we are in such a subgroup, the number of possible results of the equation $X=g^x \mod p$ is small. This may make an exhaustive search over the key space feasible. Thus, the order of the cyclic group is crucial to the security of DHM key exchange.
In order to prevent the existence of small subgroups, $p$ is usually chosen to be a safe prime. This means that $q:=(p1)/2$ also is a prime number. An example of a safe prime is $p=11$ because $(111)/2=5$ is also prime.^{4} This leaves us with the subgroups of order $1$, $2$ and $q$. The first two are generated by the trivial generators $g=1$ and $g=p1$. An order of $q$ is considered sufficient if $p$ is large enough. Nowadays, $p$ is recommended to be at least 2048 bit long.
Logjam
A popular vulnerability that arises through the use of small primes is called Logjam. In 2015, a paper^{5} was published that showed that the discrete logarithm for exportgrade^{6} prime numbers could be computed within a minute. The attack required a weeklong precomputation for a commonly supported 512bit prime. The research group further stated that nation state attackers could perform the attack on common 1024bit primes.
Generator $g$
Next, let’s take a look at $g$, which is also called the generator of a cyclic group.
The following figure shows the cyclic groups for the safe prime $p=11$ and $2 \leq g \leq 9$.
As expected, we get groups of the two different orders $p1=10$ and $(p1)/2=5$. The subgroups of order $5$ are generated by quadratic residues. Simply put, these are square numbers inside the cyclic group. We can calculate them as we would calculate natural square numbers:
 $1^2 \mod 11 = 1$
 $2^2 \mod 11 = 4$
 $3^2 \mod 11 = 9$
 $4^2 \mod 11 = 5$
 $5^2 \mod 11 = 3$
…
For a generator $g$ of the entire group, an element $h := g^x$ is a square if and only if $x$ is an even number because we can write $x = 2y$. Thus, $g^x = g^{2y} = (g^y)^2$.
We can derive two interesting properties from this definition:
 The half of the elements of the entire cyclic group are squares. If the entire group has order $p1$, the group of quadratic residues has order $(p1)/2=q$.
 The group generated by a quadratic residue only contains quadratic residues itself: Let $r$ be an quadratic residue. We can write $r = s^2$. Thus, $r^x = (s^2)^x = s^{2x} = (s^x)^2$.
These properties give us an easy way to distinguish whether $X=g^x \mod p$ is a quadratic residue. If $X^q = 1$, the elements repeat after $q$ elements. This can only happen if we are in a subgroup of order $q$, which only contains quadratic residues. Thus, we can conclude that the secret exponent $x$ is even. In binary, this reveals that the lowest order bit is $0$. The attack can be conducted against the exponents of Alice’s $A$, Bob’s $B$ and their shared secret $K$. If $K$ is used to derive further keys with a hash function, this is not a security issue. However, there are schemes that might break through this weakness, as described in the great blog post On Generators of DiffieHellmanGroups .
So, how do we find a proper $g$ for a prime $p$? Assuming $p$ is a safe prime, we should use a generator from the group of quadratic residues. We can choose any $g \lt p$ and check whether the element is a quadratic residue. Alternatively, we can also simply use $g=4=2^2$.
Random Numbers $a$ and $b$
The only requirement for $a$ and $b$ is that they are chosen uniformly at random from the set $\{1,2,\dots,p1\}$. At least that’s what most resources on the Internet seem to say. However, it would make more sense to choose from the set $\{1,2,\dots,q\}$, with $q$ being the order of the subgroup of quadratic residues. Values larger than $q$ will not give any security advantage because elements start repeating afterwards. For the sake of simplicity, the set up to $p1$ should also work, without posing a security drawback.
That being said, the numbers $1$ and $q$ are very bad choices because $g^1 \mod p = g$ and $g^q \mod p = 1$. Both results are easily recognizable. We could also argue that all exponents $a$ such that $g^a \lt p$ are easy to recognize because the modulo operation stays unused. However, it is highly unlikely that such numbers are chosen by random.^{7} Thus, keeping them in our set of possible values is a neglectable security risk.
DHM Key Exchange in SSH
Having the theoretical background, we are now going to take a look at real world scenarios of DHM key exchange on the basis of SSH, particularly of OpenSSH. According to Shodan OpenSSH makes up $\sim 87\%$ of SSH services on the Internet.
All implementations of SSH have to comply with the transport layer protocol that is defined in RFC4253. Section 7 and 8 cover the key exchange. The following figure depicts the protocol with the notations from the RFC:

The client and server exchange their identification string. This contains the used protocol and software version, as well as optional comments. An example of an identification string is
SSH2.0OpenSSH_9.0
. 
The client and server exchange name lists of supported key exchange, sometimes abbreviated kex, and server host key algorithms. Further information, which is not relevant for understanding the key exchange, is omitted here.
To agree upon algorithms, both parties choose the first algorithm in the list of the client that the server also supports. Regarding the key exchange algorithm, this defines the group, which itself specifies values for $p, g, q$, and the hash function to use. An SSH server usually has multiple host key pairs, one for each algorithm. The agreed algorithm defines the host key to use, i.e. $K_S$. Note that the client already stores a copy of the server’s public host key from former connections. 
A typical procedure of the DHM key exchange as discussed above is accomplished. Additionally, the server calculates a hash over the exchanged information and signs this hash with the appropriate private host key.
The client verifies that $K_S$ indeed belongs to this host using its local copy, calculates the shared key $K$, the hash $H$ and verifies its signature using $K_S$. 
If everything went well, client and server exchange a NEWKEYS message, which confirms that $K$ can be used as a shared secret.
Generation of Secret Exponents
According to RFC4253 the exponent of the client is a random number $x$, with
Server Authentication
To protect against active MITM attacks, an authentication mechanism was introduced into SSH.
Server and client calculate a hash over all of their exchanged messages
Key Exchange Algorithms
The original RFC defines two key exchange algorithms that must be implemented:
diffiehellmangroup1sha1
uses Oakley Group 2^{8} and SHA1 as hash function.diffiehellmangroup14sha1
uses Oakley Group 14 and SHA1 as well.
The socalled Oakley Groups originate from the Oakley Key Determination Protocol, which formed the basis for the Internet Key Exchange (IKE) protocol, which again is used in IPSec.^{9} The groups are originally defined in RFC2412.
RFC8268 adds more groups and hash functions to SSH. The largest prime number is used by Oakley Group 18 and counts 8192 bits. The current recommendations on which groups to use in SSH were updated on January 13, 2022, in RFC9142. The following table shows an excerpt from the RFC and whether the default configuration of OpenSSH supports the algorithm in their current version.^{10}
Key Exchange Method Name  Recommendation  OpenSSH 

diffiehellmangroup1sha1  SHOULD NOT  
diffiehellmangroup14sha1  MAY  
diffiehellmangroup14sha256  MUST  ✓ 
diffiehellmangroup15sha512  MAY  
diffiehellmangroup16sha512  SHOULD  ✓ 
diffiehellmangroup17sha512  MAY  
diffiehellmangroup18sha512  MAY  ✓ 
diffiehellmangroupexchangesha1  SHOULD NOT  
diffiehellmangroupexchangesha256  MAY  ✓ 
As you can see, group1
, i.e. Oakley Group 2 should no longer be used.
Also the recommended hash algorithm is shifting from SHA1 to SHA256 and SHA512, although group14
may be still used in combination with SHA1.
To get an idea how such groups are defined, let’s look at the definition of Oakley Group 14 from RFC3526. The following is the hexadecimal value of the safe 2048bit prime:
FFFFFFFF FFFFFFFF C90FDAA2 2168C234 C4C6628B 80DC1CD1 29024E08 8A67CC74
020BBEA6 3B139B22 514A0879 8E3404DD EF9519B3 CD3A431B 302B0A6D F25F1437
4FE1356D 6D51C245 E485B576 625E7EC6 F44C42E9 A637ED6B 0BFF5CB6 F406B7ED
EE386BFB 5A899FA5 AE9F2411 7C4B1FE6 49286651 ECE45B3D C2007CB8 A163BF05
98DA4836 1C55D39A 69163FA8 FD24CF5F 83655D23 DCA3AD96 1C62F356 208552BB
9ED52907 7096966D 670C354E 4ABC9804 F1746C08 CA18217C 32905E46 2E36CE3B
E39E772C 180E8603 9B2783A2 EC07A28F B5C55DF0 6F4C52C9 DE2BCBF6 95581718
3995497C EA956AE5 15D22618 98FA0510 15728E5A 8AACAA68 FFFFFFFF FFFFFFFF
The generator is the number $2$, which also is a quadratic residue of the respective cyclic group. This holds for every Oakley Group. We can conclude that Oakley Groups fulfill all of the security requirements we have discussed.
Group Exchange
Beside explicitly defined groups, there is also the possibility to use group exchange, for example diffiehellmangroupexchangesha256
.
In this variant, the server sends a custom prime and generator to the client that meet its requirements.
Group exchange is defined in RFC4419 and extended in RFC8270.
The idea is that “the server can constantly compute new groups in the background”.
The security considerations in the RFC state that only safe primes should be used.
As an aftermath of Logjam, RFC8270 also recommends that at least 2048bit primes should be used.
However, no attention is paid to the generator being a quadratic residue.
This allows to recover the least significant bit of the private exponents, which funnily is also stated within the RFC:
The least significant bit of the private exponent can be recovered when the modulus is a safe prime.
Still, the idea of a group exchange is promising. The Logjam attack is not viable when primes are large enough and groups are not reused, as it depends on precomputation.
OpenSSH
OpenSSH supports group exchange.
However, new groups are not constantly computed in the background, as suggested by the RFC.
Instead groups are distributed with the software package.
These are stored inside the file /etc/ssh/moduli
.
In recent years OpenSSH regenerated its moduli multiple times a year.^{11}
However, these changes often are not propagated into the software packages of Linux distributions.
As an example, the current Debian 11 package of OpenSSH is based on the commit 049297d
from June 4, 2020.
This probably gives enough time for precomputation and makes the use of group exchange pointless.
In order to use the group exchange sensibly, the moduli should be regenerated regularly instead. For OpenSSH, this can be done as follows:^{12}
sshkeygen M generate O bits=2048 {TEMP_FILE}
sshkeygen M screen f {TEMP_FILE} {MODULI_FILE}
The first command generates candidate primes and the second command tests them for suitability.
Another thing to note is that the groups distributed by OpenSSH and generated by sshkeygen
do not use quadratic residues as generator.^{13}
This means that the least significant bit of the private exponents can always be recovered when using group exchange with OpenSSH.
Down the Rabbit Hole
As I started writing this blog post, I had not heard anything about quadratic residues and many more topics that were discussed above. It feels like three new questions came up every time I answered one. Even now, there are still topics I would like to explore further, for example why and how client and server may reduce the size of their private exponents.^{14} However, it feels like this blog post is long enough already and it definitely took long enough to write. Therefore, I will leave it as it is for now.
I hope you learned something and have a great day.

New Directions in Cryptography by Whitfield Diffie and Martin E. Hellman (1976). ↩︎

An interview with Martin Hellman conducted by Jeffrey R. Yost (2004). Page 25. ↩︎

“Lagrange’s theorem states that for any finite group $G$, the order of every subgroup divides the order of $G$.” from Wikipedia: Lagrange’s theorem (group theory). ↩︎

This guarantees that the order of the cyclic group has exactly two prime factors: $2$ and $q:=(p1)/2$. The PohligHellman algorithm computes discrete logarithms in $\mathcal{O}(\sqrt{q})$. Wikipedia: PohligHellmanAlgorithmus ↩︎

Imperfect Forward Secrecy: How DiffieHellman Fails in Practice by David Adrian et al (2015). ↩︎

“Exportgrade cryptography is encryption software which were limited in key size by the government of the USA, as part of the Crypto Wars. Since cryptography was considered to be a technology that would be dangerous if it fell into the hands of the enemy, restrictions were placed on its export. RSA, one of the more common cryptographic schemes, was limited to use 512 bit keys, which would allow easy decryption by intelligence agencies.” from Securitipedia: Export grade cryptography. ↩︎

Let $g=2$ and $p$ be a 1024bit prime number. Then $q$ is a 1023bit prime number. Powers $g^x$ up to $x=1023$ are easy to recognize. Including $0$, there are 1024 bad choices for $x$. The probability to randomly choose those out of a set of $\sim 2^{1023}$ is $\approx\frac{2^{10}}{2^{1023}}=\frac{1}{2^{1013}}$. This probability is similar to correctly guessing the shared secret. ↩︎

SSH’s
group1
indeed uses Oakley Group 2. SSH maintains custom group identifiers that are distinct from Oakley. RFC4253 Section 6.5 ↩︎ 
Wikipedia: Oakley protocol and Internet Key Exchange. ↩︎

OpenSSH 9.0 was considered for this evaluation. ↩︎

The
moduli
file in the GitHub repository opensshportable was updated three times each in 2021 and 2020. ↩︎ 
Linux man pages on sshkeygen. ↩︎

In fact, I’ve checked every historic moduli file from the OpenSSH GitHub repository and none of the groups used a quadratic residue generator. ↩︎

This is stated in RFC4419: Section 6.2 ↩︎