Deciphering RSA Cryptosystems with Modular Arithmetic

WARNING: The following post features mathematics performed either by professionals or under the supervision of professionals. Accordingly, CyberCrud and its producer must insist that no one attempt to answer or solve any equation or theorem proven in this post.

Most of us are probably somewhat familiar with public-key cryptosystems. In fact, it’s pretty hard not to brush shoulders with them once in a while these days. You probably use things like SSH, GnuPG, and OpenSSL all the time. The level of securiy these tools provide is incredible and it’s partly due to a clever system called RSA. It’s well into it’s third decade and still going strong.

Yet you can’t truly appreciate something like RSA until you actually understand just how your super secret messages are being shattered into unrecognizable pieces and then reconstructed perfectly once again. Fortunately, you don’t need a Ph.D in some obscure branch of ancient Egyptian algebra to understand it either; the mathematics involved aren’t too far-fetched. If you’ve ever taken a class in discrete mathematics, you’ll feel right at home here.

A Little History

Before getting into anything messy, it’s important to understand the basics of how congruence relations apply to cryptology. So let’s go back in time to around 50 BC and take a look at the Caesar cipher. Everyone into the TARDIS! <insert annoying whooshing sound>

As you can probably guess from the name, the Caesar cipher was popularized by Julius Caesar during the Gallic Wars. Caesar kept his messages secret by replacing each letter in the original message with the letter that is three positions further down in the alphabet. That is, the letter A becomes D, the letter B becomes E, C becomes F, and so on.

Expressed mathematically, each letter is first replaced by an integer between 0 and 25, determined by is position in the alphabet. A is replaced by 0, B is replaced by 1, C by 2, and so on until Z becomes 25. The encryption process can be represented using the function f  which assigns to a non-negative integer p  , where p \le 25  , the integer f(p)  in the set \{0, 1, 2, ..., 25\}  with:

f(p) = (p + 3) \mod 26

Consider the example of encrypting the message “THE CAKE IS A LIE”.

Begin by replacing each letter in the message with its corresponding position in the alphabet (beginning at 0). You get:

19 7 4    2 0 10 4    8 18    0    11 8 4

Replacing each of these numbers p  by f(p) = (p + 3) \mod 26  gives:

22 10 7    5 3 13 7    11 21    3    14 11 7

And finally, translating this back to letters produces the ciphertext: “WKH FDNH LV D OLH”.

Simple enough, right? Now to recover the original message, the inverse of f  , denoted f^{-1}  , is used. The function f^{-1}  sends an integer p  from \{0, 1, 2, ..., 25\}  to:

f^{-1}(p) = (p - 3) \mod 26

That is, to find the original message, each letter is shifted back by three letters in the alphabet.

However, the magnitude of the shift in the Caesar cipher does not necessarily have to be restricted to just three. The previous equations can be expressed more generally if we shift each letter by k  .

f(p) = (p + k) \mod 26

f^{-1}(p) = (p - k) \mod 26

Though at this point, it’s more appropriate to refer to this cipher as a shift cipher instead.

It’s pretty obvious that Caesar’s method and shift ciphers in general only offer a laughable amount of security. Each letter changes its identity yet retains its position, making it vulnerable to cryptanalysis using simple frequency analysis.

Public-Key Cryptography

These methods are very basic examples of private-key cryptosystems (sometimes called symmetric-key algorithms). The same cryptographic keys are used for both encryption and decryption. To encrypt a message with our shift ciphers, we shifted right by k  and to decrypt it we shifted left by k  . This is one of its greatest weaknesses since both parties are required to have access to a secret key but without means to securely exchange the key.

In walks the public-key cryptosystem to save the day. In such a system, knowing how to send someone an encrypted message will not help you decrypt other messages sent to that person. Each party has a publicly known encryption key while only the decryption keys are kept secret. That’s why the encryption key is sometimes referred to as the “public key” and the decryption key as the “private key.” Only the intended receiver of a given message can decrypt it since the encryption key simply does not allow someone to derive the decryption key (technically, it is sometimes possible but more on that later.) It basically says “You know what? Fine. You want this key so bad? Take the damn thing. See if I care. In fact, send me anything you like with it. But good luck trying to eavesdrop on anything else I might send.”

Introducing the RSA Cryptosystem

Perhaps the most widely used application of public-key cryptography is RSA which popped its head up during 1977 at MIT. It’s name comes from the researchers involved in its publication: Ron Rivest, Adi Shamir, and Leonard Adleman. I would’ve named it something cool like Really Super Algorithm, though. Their initial research is available here which I strongly suggest reading. (It’s also funny to see them use phrases back then like “the era of ‘electronic mail’ may soon be upon us.”)

What makes RSA so neat is that it doesn’t actually propose anything new. It’s unlike private-key systems whose strength lies in the complexity of some discrete algorithm continuously applied to a bit stream. All they did was utilize existing mathematics that have been sitting around for centuries.

As a public-key cryptosystem, RSA relies not on one key, as in our shift cipher example, but on two: a public key and a private key. The public key is made freely available, while the private key is kept a secret. Should you wish to communicate with someone, you simply send them your public key, allowing them to encrypt any messages sent to you so that only you can decipher them. Conversely, you will also need their public key in order to reply back.

Encryption Phase

So what exactly does an encryption key consist of? How is it that you can be so confident in giving away the key without fear of decryption by an adversary? That seems like magic.

It’s not magic. It’s math! RSA keys are based on modular exponentiation modulo the product of two large primes. That can be a scary phrase but, for now, know that the encryption key consists of two integers e  and n  – expressed as (e, n)  – where e  is the exponent and n  is the modulus used in the encryption function.

The modulus n  is simply the product of two large primes, p  and q  . Expressed n = pq  . These primes can be chosen at random using probabilistic primality tests. When you hear people talk about their 1024-bit keys or 2048-bit keys, they are actually referring to the bit length of n  .

To compute the exponent e  , first we must find the number of positive integers less than n  that are relatively prime to n  . This is called Euler’s fotient function. For prime numbers p  :

\phi(p) = p - 1  .

Because the totient function is multiplicative, in our case:

\phi(n) = \phi(pq) = \phi(p) \cdot \phi(q) = (p - 1)(q - 1)  .

Then e  is chosen as a random value between the range (1, \phi(n))  . That is, 1 < e < \phi(n)  and gcd(e, \phi(n)) = 1  .

Now that you’ve got an encryption key, the actual encryption phase can begin. Encryption begins by translating the message into a sequence of integers, just like we did with the Caesar cipher. The resulting integers are then grouped together to form even larger integers where each group represents a block of letters. The process continues by transforming the integer M  (the original message) into an integer C  (the ciphertext) using the following function:

C = M^e \mod n

However, it’s impractical to first compute M^e  and then find it’s remainder when divided by n  . Remember, we’re dealing with huge integers here. That’s why an algorithm for fast modular exponentiation is used instead. Unfortunately, a full discussion of modular exponentiation would make this post a lot longer than it needs to be. I won’t leave you hanging though. Just so that you’re able to actually perform some of these examples, see the following Ruby code for an implementation of the Right-to-Left Binary method.

# Computes b^e mod m using Right-to-Left Binary method.
def mod_exp(b, e, m)
  x = 1

  while e > 0
    x = (x * b) % m if e.odd?
    e = e >> 1
    b = (b * b) % m
  end

  return x
end

Let’s try an example ourselves. Keep in mind that we’ll be using much smaller values for p  and q  for practical reasons.

For my Portal fans out there, let’s use the same message “THE CAKE IS A LIE” when p = 43  , q = 59  , and e = 13 .

Note that e  and (p - 1)(q - 1)  are relatively prime. That is:

gcd(e, (p - 1)(q - 1)) = gcd(13, 42 \times 58) = 1

Begin by translating each letter into their respective two-digit integer representation and grouping them into blocks of four. However, unlike with the shift cipher, the alphabet begins at 1, not 0. This is because 0 is used to represent a space. Therefore, 00 = space, 01 = A, 02 = B, … 26 = Z. The result would be:

2008  0500  0301  1105  0009  1900  0100  1209  0500

Notice how the last block has been padded with 00. Then we apply the previous formula using the given variables.

C = M^{13} \mod 2537

All that’s left is to substitute M  for each block of integers. Using the method for modular exponentiation, we get:

2008^{13} \mod 2537 = 1230

500^{13} \mod 2537 = 2330

301^{13} \mod 2537 = 129

1105^{13} \mod 2537 = 2348

9^{13} \mod 2537 = 1864

1900^{13} \mod 2537 = 1317

100^{13} \mod 2537 = 197

1209^{13} \mod 2537 = 1108

500^{13} \mod 2537 = 2330

Finally, we’re left with our encrypted message:

1230  2330  0129  2348  1864  1317  0197  1108  2330

Give it a shot yourself. Using the above mod_exp() function, fire up Ruby and run:

p = 43; q = 59; e = 13; n = p * q
[2008, 500, 301, 1105, 9, 1900, 100, 1209, 500].each {|m|
  printf "%04d ", mod_exp(m, e, n)
}

Decryption Phase

To recover the original message, the decryption key is used. As with the encryption key, the decryption key also consists of two integers: the same modulus n  and the integer d  , an inverse of e  modulo \phi(n)  . Such an inverse exists since e  and \phi(n)  are relatively prime. To demonstrate this, recall the theorem which states that the integers a  and b  are congruent modulo m  if and only if there is an integer k  such that a = b + km  . This means that if de \equiv 1 \mod{\phi(n)}  , then there is an integer k  such that de = 1 + k \cdot \phi(n)  . Therefore:

C^d \equiv{(M^e)^d} = M^{de} = M^{1 + k \cdot \phi(n)} \mod n

Assuming that gcd(M, p) = gcd(M, q) = 1  (which holds true except in rare cases), we can use Fermat’s Little Theorem to show that M^{p - 1} \equiv 1 \mod p  and M^{q - 1} \equiv 1 \mod q  . Further simplifying:

C^d \equiv M \cdot (M^{p-1})^{k(q-1)} \equiv M \cdot 1 \equiv M \mod p

and

C^d \equiv M \cdot (M^{q-1})^{k(p-1)} \equiv M \cdot 1 \equiv M \mod q

Finally, because p  and q  are relatively prime, that ol’ crazy Chinese Remainder Theorem lets us conclude that:

C^d \equiv M \mod pq

Let’s test it out and try decrypting our ciphertext. We know that n = 43 \times 59  and e = 13  . To find the decryption key d  we need to find the inverse of e  modulo pq  .

13x \equiv 1 \mod 2436 = 937

We use 937 as our decryption key, d  . To decrypt a ciphertext block C  , we compute:

P = C^{937} \mod 2537

Then it’s just a process of using modular exponentiation like last time.

1230^{937} \mod 2537 = 2008

2330^{937} \mod 2537 = 500

129^{937} \mod 2537 = 301

2348^{937} \mod 2537 = 1105

1864^{937} \mod 2537 = 9

1317^{937} \mod 2537 = 1900

197^{937} \mod 2537 = 100

1108^{937} \mod 2537 = 1209

2330^{937} \mod 2537 = 500

Doesn’t this look familiar?

2008  0500  0301  1105  0009  1900  0100  1209  0500

It’s our original message, “THE CAKE IS A LIE”. It’s numerical representation, anyway.

Security Implications

The security of the RSA cryptosystem is based on what’s called the RSA Problem; that is, the process of performing a private key operation (decryption) when given only a public key. Given the public key pair (e, n)  , this involves finding the e-th root modulo a composite n  in order to recover a value M  such that C \equiv M^e \mod n  .

This approach involves factoring the modulus n  in order to recover the prime factors p  and q  . An attacker could then compute \phi(n)  and then proceed to derive the private key exponent d  . However, the problem isn’t so cut and dry. Remember that we’re dealing with very large integers. If p  and q  were each 500 digits, the n = pq  would be 1,000 digits. No polynomial-time method exists that can factor prime integers of this magnitude in a reasonable amount of time on a modern computer.

So that’s it? Big numbers? That’s it’s secret power? Hasn’t anybody considered that computers may be able to easily factor these numbers some day?

Yes. It is true that cracking RSA encryption is not impossible. It’s just infeasible. As of now, the largest factored key was 768-bits. But even that was on a huge state-of-the-art distributed system comprised of hundreds of machines and took almost two years. Those types of setups aren’t exactly common. Considering that the default key size used by ssh-keygen is 2048, you can be safe in the knowledge that your precious SSH boxes will remain secure for more than a lifetime. Even the overly paranoid can use a 4096-bit key which most believe won’t be factored until D.B. Cooper is found.

Moving Forward

Hopefully, you now have a better understanding of how RSA is implemented. It’s a fascinating subject that really makes you appreciate the works of mathematicians of centuries old that made it all possible. Don’t stop there though. RSA is just one contender in the public-key boxing ring. The Diffie–Hellman algorithm is another method which came slightly before RSA. There’s also the ElGamal signature scheme. Dive in and enjoy!