Imagine Alice, Bob and Eve sit together inside a room. None of them have met before. They get to talk and after some time Alice wants to tell Bob something that Eve should not hear. Also, Eve is a talented eavesdropper, so whispering is not an option. Luckily, Alice has heard of the Diffie-Hellman-Merkle key exchange and explains it to Bob. They perform the key exchange and although Eve heard everything they said, Alice and Bob end up with a shared key that Eve does not know. They use it to encrypt their further communication and talk about their deepest secrets. From Eve’s perspective Alice and Bob are only talking gibberish.

The first time I heard about the Diffie-Hellman-Merkle key exchange, this sounded like magic to me. A key exchange entirely taking place over a public channel and still leading to a shared secret between the two parties. How should this even be possible?

Some years later during my cryptography lectures, naturally the Diffie-Hellman-Merkle key exchange was discussed. We learned about commutative groups and rings, cyclic groups, the discrete logarithm problem and other advanced topics. I had the feeling that I understood everything about the key exchange in depth.

Fast forward to my job as penetration tester. The question came up whether it is secure to use SSH with diffie-hellman-group14-sha256. With everything learned during my cryptography lectures, I had to admit to myself that I could not answer this question. What’s group14 and where do you even need a hashing algorithm as SHA-256 during the key exchange? Thus, I decided to take a closer look at the key exchange.

This blog post discusses parts of the Diffie-Hellman-Merkle key exchange, that I found to be relevant regarding its security. We are going to start with a bit of theory and go over to take a look at attack vectors against the basic key exchange procedure. Finally, we are going to evaluate its implementation in SSH, according to the previously defined attacker vectors.


The Diffie-Hellman-Merkle (DHM) key exchange was developed in 1976 by Whitfield Diffie, Martin Hellman and Ralph Merkle.1 However, the term Diffie-Hellman (DH) key exchange established, undermining Merkle’s contribution to its invention.2 The purpose of DHM key exchange is to securely establish a shared secret over an insecure channel.

As such it is very popular in modern cryptography and being used in a wide range of network protocols, such as SSH, TLS and IPSec.

Algebraic Fundamentals

At first, we’re going to take a look at a tiny bit of group theory. In algebra, the combination of a set of numbers and an operation that fulfills certain requirements is called a group. The finite group we are interested in, is called the multiplicative group of integers modulo $n$, typically denoted $\mathbb{Z}^*_n$. The set of numbers in this group is $\{0, 1, \dots, n-1\}$ and the operation is multiplication modulo $n$.

Let’s look at the multiplication of two elements from $\mathbb{Z}^*_7$ for clarification: $2 \cdot 5 \mod 7 = 10 \mod 7 = 3$.

A special case of such groups are so-called cyclic groups. These are groups that are generated by the powers of a single element $g$, the generator. This means that the cyclic group contains all elements that can be calculated with the term $g^x \mod n$ with $x \in \mathbb{Z}$.

For comprehension, let’s take a look at the illustration of the cyclic groups for $n=7$ generated by $g=2$ and $g=3$.

Illustration of the two cyclic groups $g=2$ and $g=3$ in $\mathbb{Z}^*_7$

Illustration of the two cyclic groups $g=2$ and $g=3$ in $\mathbb{Z}^*_7$

The elements of $\mathbb{Z}^*_7$ are plotted on the outline of the circle. Consecutive elements are connected by an arrow and calculated like this for $g=2$:

  • $x=0$: $2^0 \mod 7 = 1$
  • $x=1$: $2^1 \mod 7 = 2$
  • $x=2$: $2^2 \mod 7 = 4$
  • $x=3$: $2^3 \mod 7 = 8 \mod 7 = 1$

Note that every element could be calculated from its preceding element by multiplying it with $g$ and applying the modulo operation.

As we can see, the choice of the generator affects the amount of elements in the group, also called the group order. A very interesting property of group orders is described by Lagrange’s theorem.3 It states that the order of any subgroup divides the order of the original group. In the above example, the cyclic group for $g=3$ has an order of $6=3 \cdot 2$. The subgroup for $g=2$ has the order $3$, which indeed divides $6$. In the case of cyclic groups, the statement of the theorem is even stronger. In fact, for every divisor $r$ of the order of a cyclic group, a subgroup with the order $r$ will exist.

Another property we will need is that $g^q = 1$ if $g$ generates a group of order $q$. This is because $g^0 = 1$ and the cyclic group will repeat after $q$ elements.

Note: Another interesting group for DHM is the multiplicative group of points on an elliptic curve. We are not going to discuss elliptic curves in this blog post.

The Algorithm

Equipped with this knowledge, let’s take a look at the typical textbook description of DHM key exchange.

Assume, Alice and Bob want to talk to each other via an insecure channel. To secure their communication, they perform the following steps.

  1. Alice chooses a large prime number $p$, a positive integer $g \lt p$ and a random positive integer $a \lt p$. $a$ must be kept secret.
  2. Alice calculates $A=g^a \mod p$.
  3. Alice sends $p$, $g$ and $A$ to Bob.
  4. Bob chooses a random positive integer $b \lt p$, which must be kept secret.
  5. Bob calculates $B=g^b \mod p$.
  6. Bob sends $B$ to Alice.
  7. Alice calculates $K = B^a \mod p$.
  8. Bob calculates $K = A^b \mod p$.

$K$ is the shared secret of Alice and Bob.

The following figure illustrates the procedure:

An illustration of the DHM key exchange

An illustration of the DHM key exchange

Note: Sometimes $p$ and $g$ are also assumed to be known to Alice and Bob prior to the key exchange. Both specifications can be considered identical.

The magic of the key exchange lies within the fact that Alice and Bob never exchange their secret keys $a$ and $b$ but end up with the same shared secret $K$. The eavesdropper Eve is not able to derive $K$ from the public values $p$, $g$, $A$ and $B$. The security of the scheme depends on the assumption that there is no efficient solution to the discrete logarithm problem, meaning given $g^x \mod p$, it is hard to find $x$.

Attack Vectors

Assuming that the discrete logarithm problem is indeed hard to solve, still a few attack vectors arise when examining the key exchange in practice.

Machine in the Middle

The most prominent attack can be performed by an active attacker manipulating traffic: the so-called machine-in-the-middle (MITM) attack. The attack can be thought of as Mallory making her own specific key exchanges with Alice and Bob, while Alice and Bob think they are talking to each other.

Illustration of the active attacker Mallory performing a machine-in-the-middle attack

Illustration of the active attacker Mallory performing a machine-in-the-middle attack

Of course, Mallory could also choose her own $p$ and $g$ for the key exchange with Bob.

In order to defend against this attack, we need to introduce some sort of authentication. However, this is not part of the textbook DHM key exchange. This is a first hint towards the hashing algorithm we might need later on.

Bad Arguments

Even when only considering passive attackers, a lot can go wrong during our key exchange. We don’t yet know how to choose arguments for the parameters $p$, $g$, $a$ and $b$ such that the key exchange is secure.

We are going into a little more detail here. However, I’ll try to clarify the concepts by illustrating cyclic groups as above.

Bad Primes

Although text books usually state that $p$ must be a huge prime, there are sets of prime numbers that are a bad choice for the key exchange. The problem arises from the group order property stated by Lagrange’s theorem. Since the element $0$ is never inside a cyclic group, the order of the largest group is $p-1$. This definitely is no prime number and might even have several divisors. Thus, subgroups exist that have an order equal to such a divisor. This puts us in the realm of small-subgroup attacks.

Let’s take a look at an example. The following figure shows the 12 cyclic groups for $p=13$ and $1 \leq g \leq 12$.

Illustration of cyclic groups for $p=13$

Illustration of cyclic groups for $p=13$

The order of the largest cyclic group is $12=3 \cdot 2 \cdot 2$. Thus, we know that there will be subgroups of order $2$ and $3$. This is confirmed by the illustration through $g=3$, $g=9$, and $g=12$.

If we are in such a subgroup, the number of possible results of the equation $X=g^x \mod p$ is small. This may make an exhaustive search over the key space feasible. Thus, the order of the cyclic group is crucial to the security of DHM key exchange.

In order to prevent the existence of small subgroups, $p$ is usually chosen to be a safe prime. This means that $q:=(p-1)/2$ also is a prime number. An example of a safe prime is $p=11$ because $(11-1)/2=5$ is also prime.4 This leaves us with the subgroups of order $1$, $2$ and $q$. The first two are generated by the trivial generators $g=1$ and $g=p-1$. An order of $q$ is considered sufficient if $p$ is large enough. Nowadays, $p$ is recommended to be at least 2048 bit long.


A popular vulnerability that arises through the use of small primes is called Logjam. In 2015, a paper5 was published that showed that the discrete logarithm for export-grade6 prime numbers could be computed within a minute. The attack required a week-long precomputation for a commonly supported 512-bit prime. The research group further stated that nation state attackers could perform the attack on common 1024-bit primes.

Generator $g$

Next, let’s take a look at $g$, which is also called the generator of a cyclic group.

The following figure shows the cyclic groups for the safe prime $p=11$ and $2 \leq g \leq 9$.

Illustration of cyclic groups for $p=11$

Illustration of cyclic groups for $p=11$

As expected, we get groups of the two different orders $p-1=10$ and $(p-1)/2=5$. The subgroups of order $5$ are generated by quadratic residues. Simply put, these are square numbers inside the cyclic group. We can calculate them as we would calculate natural square numbers:

  • $1^2 \mod 11 = 1$
  • $2^2 \mod 11 = 4$
  • $3^2 \mod 11 = 9$
  • $4^2 \mod 11 = 5$
  • $5^2 \mod 11 = 3$

For a generator $g$ of the entire group, an element $h := g^x$ is a square if and only if $x$ is an even number because we can write $x = 2y$. Thus, $g^x = g^{2y} = (g^y)^2$.

We can derive two interesting properties from this definition:

  • The half of the elements of the entire cyclic group are squares. If the entire group has order $p-1$, the group of quadratic residues has order $(p-1)/2=q$.
  • The group generated by a quadratic residue only contains quadratic residues itself: Let $r$ be an quadratic residue. We can write $r = s^2$. Thus, $r^x = (s^2)^x = s^{2x} = (s^x)^2$.

These properties give us an easy way to distinguish whether $X=g^x \mod p$ is a quadratic residue. If $X^q = 1$, the elements repeat after $q$ elements. This can only happen if we are in a subgroup of order $q$, which only contains quadratic residues. Thus, we can conclude that the secret exponent $x$ is even. In binary, this reveals that the lowest order bit is $0$. The attack can be conducted against the exponents of Alice’s $A$, Bob’s $B$ and their shared secret $K$. If $K$ is used to derive further keys with a hash function, this is not a security issue. However, there are schemes that might break through this weakness, as described in the great blog post On Generators of Diffie-Hellman-Groups .

So, how do we find a proper $g$ for a prime $p$? Assuming $p$ is a safe prime, we should use a generator from the group of quadratic residues. We can choose any $g \lt p$ and check whether the element is a quadratic residue. Alternatively, we can also simply use $g=4=2^2$.

Random Numbers $a$ and $b$

The only requirement for $a$ and $b$ is that they are chosen uniformly at random from the set $\{1,2,\dots,p-1\}$. At least that’s what most resources on the Internet seem to say. However, it would make more sense to choose from the set $\{1,2,\dots,q\}$, with $q$ being the order of the subgroup of quadratic residues. Values larger than $q$ will not give any security advantage because elements start repeating afterwards. For the sake of simplicity, the set up to $p-1$ should also work, without posing a security drawback.

That being said, the numbers $1$ and $q$ are very bad choices because $g^1 \mod p = g$ and $g^q \mod p = 1$. Both results are easily recognizable. We could also argue that all exponents $a$ such that $g^a \lt p$ are easy to recognize because the modulo operation stays unused. However, it is highly unlikely that such numbers are chosen by random.7 Thus, keeping them in our set of possible values is a neglectable security risk.

DHM Key Exchange in SSH

Having the theoretical background, we are now going to take a look at real world scenarios of DHM key exchange on the basis of SSH, particularly of OpenSSH. According to Shodan OpenSSH makes up $\sim 87\%$ of SSH services on the Internet.

Top four SSH-2.0 services on the Internet according to Shodan

Top four SSH-2.0 services on the Internet according to Shodan

All implementations of SSH have to comply with the transport layer protocol that is defined in RFC4253. Section 7 and 8 cover the key exchange. The following figure depicts the protocol with the notations from the RFC:

Illustration of DHM key exchange in SSH

Illustration of DHM key exchange in SSH

  1. The client and server exchange their identification string. This contains the used protocol and software version, as well as optional comments. An example of an identification string is SSH-2.0-OpenSSH_9.0.

  2. The client and server exchange name lists of supported key exchange, sometimes abbreviated kex, and server host key algorithms. Further information, which is not relevant for understanding the key exchange, is omitted here.
    To agree upon algorithms, both parties choose the first algorithm in the list of the client that the server also supports. Regarding the key exchange algorithm, this defines the group, which itself specifies values for $p, g, q$, and the hash function to use. An SSH server usually has multiple host key pairs, one for each algorithm. The agreed algorithm defines the host key to use, i.e. $K_S$. Note that the client already stores a copy of the server’s public host key from former connections.

  3. A typical procedure of the DHM key exchange as discussed above is accomplished. Additionally, the server calculates a hash over the exchanged information and signs this hash with the appropriate private host key.
    The client verifies that $K_S$ indeed belongs to this host using its local copy, calculates the shared key $K$, the hash $H$ and verifies its signature using $K_S$.

  4. If everything went well, client and server exchange a NEWKEYS message, which confirms that $K$ can be used as a shared secret.

Generation of Secret Exponents

According to RFC4253 the exponent of the client is a random number $x$, with $1 \lt x \lt q$, where $q$ is the order of the subgroup generated by $g$. The exponent of the server is a random number $y$, with $0 \lt y \lt q$. Why the server is allowed to use 1 but the client is not, I don’t know. Choosing only from a set of values smaller than $q$ makes sense, as discussed above.

Server Authentication

To protect against active MITM attacks, an authentication mechanism was introduced into SSH. Server and client calculate a hash over all of their exchanged messages ($V_C, V_S, I_C, I_S, e, f$), the public host key $K_S$ and the shared secret $K$. The server signs this hash with the private host key of which the client is assumed to already know the public counterpart. During the verification of the signature, the client would notice any interventions made by the MITM and in return abort the connection. Thus, the key exchange is secured against such an attacker.

Key Exchange Algorithms

The original RFC defines two key exchange algorithms that must be implemented:

  • diffie-hellman-group1-sha1 uses Oakley Group 28 and SHA-1 as hash function.
  • diffie-hellman-group14-sha1 uses Oakley Group 14 and SHA-1 as well.

The so-called Oakley Groups originate from the Oakley Key Determination Protocol, which formed the basis for the Internet Key Exchange (IKE) protocol, which again is used in IPSec.9 The groups are originally defined in RFC2412.

RFC8268 adds more groups and hash functions to SSH. The largest prime number is used by Oakley Group 18 and counts 8192 bits. The current recommendations on which groups to use in SSH were updated on January 13, 2022, in RFC9142. The following table shows an excerpt from the RFC and whether the default configuration of OpenSSH supports the algorithm in their current version.10

Key Exchange Method Name Recommendation OpenSSH
diffie-hellman-group1-sha1 SHOULD NOT
diffie-hellman-group14-sha1 MAY
diffie-hellman-group14-sha256 MUST
diffie-hellman-group15-sha512 MAY
diffie-hellman-group16-sha512 SHOULD
diffie-hellman-group17-sha512 MAY
diffie-hellman-group18-sha512 MAY
diffie-hellman-group-exchange-sha1 SHOULD NOT
diffie-hellman-group-exchange-sha256 MAY

As you can see, group1, i.e. Oakley Group 2 should no longer be used. Also the recommended hash algorithm is shifting from SHA-1 to SHA-256 and SHA-512, although group14 may be still used in combination with SHA-1.

To get an idea how such groups are defined, let’s look at the definition of Oakley Group 14 from RFC3526. The following is the hexadecimal value of the safe 2048-bit prime:

FFFFFFFF FFFFFFFF C90FDAA2 2168C234 C4C6628B 80DC1CD1 29024E08 8A67CC74
020BBEA6 3B139B22 514A0879 8E3404DD EF9519B3 CD3A431B 302B0A6D F25F1437
4FE1356D 6D51C245 E485B576 625E7EC6 F44C42E9 A637ED6B 0BFF5CB6 F406B7ED
EE386BFB 5A899FA5 AE9F2411 7C4B1FE6 49286651 ECE45B3D C2007CB8 A163BF05
98DA4836 1C55D39A 69163FA8 FD24CF5F 83655D23 DCA3AD96 1C62F356 208552BB
9ED52907 7096966D 670C354E 4ABC9804 F1746C08 CA18217C 32905E46 2E36CE3B
E39E772C 180E8603 9B2783A2 EC07A28F B5C55DF0 6F4C52C9 DE2BCBF6 95581718
3995497C EA956AE5 15D22618 98FA0510 15728E5A 8AACAA68 FFFFFFFF FFFFFFFF

The generator is the number $2$, which also is a quadratic residue of the respective cyclic group. This holds for every Oakley Group. We can conclude that Oakley Groups fulfill all of the security requirements we have discussed.

Group Exchange

Beside explicitly defined groups, there is also the possibility to use group exchange, for example diffie-hellman-group-exchange-sha256. In this variant, the server sends a custom prime and generator to the client that meet its requirements. Group exchange is defined in RFC4419 and extended in RFC8270. The idea is that “the server can constantly compute new groups in the background”. The security considerations in the RFC state that only safe primes should be used. As an aftermath of Logjam, RFC8270 also recommends that at least 2048-bit primes should be used. However, no attention is paid to the generator being a quadratic residue. This allows to recover the least significant bit of the private exponents, which funnily is also stated within the RFC:

The least significant bit of the private exponent can be recovered when the modulus is a safe prime.

Still, the idea of a group exchange is promising. The Logjam attack is not viable when primes are large enough and groups are not reused, as it depends on precomputation.


OpenSSH supports group exchange. However, new groups are not constantly computed in the background, as suggested by the RFC. Instead groups are distributed with the software package. These are stored inside the file /etc/ssh/moduli.

In recent years OpenSSH regenerated its moduli multiple times a year.11 However, these changes often are not propagated into the software packages of Linux distributions. As an example, the current Debian 11 package of OpenSSH is based on the commit 049297d from June 4, 2020. This probably gives enough time for precomputation and makes the use of group exchange pointless.

In order to use the group exchange sensibly, the moduli should be regenerated regularly instead. For OpenSSH, this can be done as follows:12

ssh-keygen -M generate -O bits=2048 {TEMP_FILE}
ssh-keygen -M screen -f {TEMP_FILE} {MODULI_FILE}

The first command generates candidate primes and the second command tests them for suitability.

Another thing to note is that the groups distributed by OpenSSH and generated by ssh-keygen do not use quadratic residues as generator.13 This means that the least significant bit of the private exponents can always be recovered when using group exchange with OpenSSH.

Down the Rabbit Hole

As I started writing this blog post, I had not heard anything about quadratic residues and many more topics that were discussed above. It feels like three new questions came up every time I answered one. Even now, there are still topics I would like to explore further, for example why and how client and server may reduce the size of their private exponents.14 However, it feels like this blog post is long enough already and it definitely took long enough to write. Therefore, I will leave it as it is for now.

I hope you learned something and have a great day.

  1. New Directions in Cryptography by Whitfield Diffie and Martin E. Hellman (1976). ↩︎

  2. An interview with Martin Hellman conducted by Jeffrey R. Yost (2004). Page 25. ↩︎

  3. “Lagrange’s theorem states that for any finite group $G$, the order of every subgroup divides the order of $G$.” from Wikipedia: Lagrange’s theorem (group theory)↩︎

  4. This guarantees that the order of the cyclic group has exactly two prime factors: $2$ and $q:=(p-1)/2$. The Pohlig-Hellman algorithm computes discrete logarithms in $\mathcal{O}(\sqrt{q})$. Wikipedia: Pohlig-Hellman-Algorithmus ↩︎

  5. Imperfect Forward Secrecy: How Diffie-Hellman Fails in Practice by David Adrian et al (2015). ↩︎

  6. “Export-grade cryptography is encryption software which were limited in key size by the government of the USA, as part of the Crypto Wars. Since cryptography was considered to be a technology that would be dangerous if it fell into the hands of the enemy, restrictions were placed on its export. RSA, one of the more common cryptographic schemes, was limited to use 512 bit keys, which would allow easy decryption by intelligence agencies.” from Securitipedia: Export grade cryptography↩︎

  7. Let $g=2$ and $p$ be a 1024-bit prime number. Then $q$ is a 1023-bit prime number. Powers $g^x$ up to $x=1023$ are easy to recognize. Including $0$, there are 1024 bad choices for $x$. The probability to randomly choose those out of a set of $\sim 2^{1023}$ is $\approx\frac{2^{10}}{2^{1023}}=\frac{1}{2^{1013}}$. This probability is similar to correctly guessing the shared secret. ↩︎

  8. SSH’s group1 indeed uses Oakley Group 2. SSH maintains custom group identifiers that are distinct from Oakley. RFC4253 Section 6.5 ↩︎

  9. Wikipedia: Oakley protocol and Internet Key Exchange↩︎

  10. OpenSSH 9.0 was considered for this evaluation. ↩︎

  11. The moduli file in the GitHub repository openssh-portable was updated three times each in 2021 and 2020. ↩︎

  12. Linux man pages on ssh-keygen↩︎

  13. In fact, I’ve checked every historic moduli file from the OpenSSH GitHub repository and none of the groups used a quadratic residue generator. ↩︎

  14. This is stated in RFC4419: Section 6.2 ↩︎