A public-key cryptography algorithm created by Rivest, Shamir, and Adleman. Based on the hypothesis that generating large prime numbers is easy, but factoring large numbers is hard.

Also a company that develops and markets encryption software.
taken from my discrete mathematics textbook:

The RSA cryptosystem is based on modular exponentiation modulo of the product of two large primes. Each individual has an encryption key consisting of a modulus n = pq where p and q are large primes, say, with 200 digits each, and an exponent e that is relatively prime to (p - 1)(q - 1). To produce a usable key, two large primes must be found. This can be done quickly on a computer using probabilistic primality tests. However, the product of these primes n = pq, with approximatley 400 digits, cannot be factored in a reasonable length of time.

In the RSA encryption method, messages are translated into sequences of integers. This can be done by translationg each letter into an integer, as is done with the Caesar cipher. These integers are grouped together to form larger integers, each representing a block of letters. The encryption proceeds by transforming the integer M, representing the plain-text (the original message), to an integer C, representing the ciphertext (the encrypted message), using the function:

C = M^e mod n

Distilled from: Discrete Mathematics and its Applications
By Kenneth H. Rosen
Published by McGraw Hill

Here you can find an explanation of the mathematics behind the RSA cryptosystem.

Start with an integer n which is the product of two different primes p and q. n is going to be part of the public key but p and q have to be kept private.

Now suppose we have a message for encryption. We need to first convert it into numeric form. How we do this is not really important but as an example, if we have a plain text message then we could convert it into numeric form (not very efficiently) by stringing together the three digit numbers obtained from each character by the ASCII code plus 100. So the message Phear me! gets translated to 180204201197214132209201133.

Because we are going to be working modulo n so we have to break the numeric form of the message into blocks, each block being a nonnegative integer less than n.

The next stage is the encryption process. We need an integer a that is coprime to phi(n), where phi is the Euler Phi function. This means that a is invertible modulo n.

The pair (n,a) is the public key. Let us now encrypt our message with this key. We have converted our message into a sequence of blocks x each of which is a nonnegative integer and less than n. The encrypted version of x is just the uniquely determined nonnegative integer less than n that coincides with xa modulo n.

How could we reverse the process and decrypt the message? The decryption algorithm depends on knowing an inverse b for a modulo phi(n). The pair (n,b) is the private key.

Suppose that we have an encrypted block of our original message, call it y. Then to decrypt all we have to do is find the residue of yb modulo n.

This really works because (xa)b=xab, and we know that ab is congruent to 1 modulo phi(n); that is 1=ab-k.phi(n) and so ab=1+k(p-1)(q-1)

Now we work modulo p. There are two cases, if x is divisible by p then we certainly have that xab is congruent to x mod p. So assume not. Then Fermat's Little Theorem says that xp-1 is congruent to 1 modulo p. Thus


is congruent to x modulo p. The same argument shows that xab is congruent to x modulo q also, and so xab is congruent to x modulo n, as required.

It is believed that cracking the above encryption method is equivalent to factoring n and up to now the latter is a difficult problem. There isn't actually a rigorous proof of this equivalence, just some heuristic arguments, so it is possible, in priciple, that someone will come up with a way to break the encryption without factoring n but this seems fairly unlikely. This system has been use for quite a few years now and has proved to be secure in practice.

A good deal of care has to be taken in choosing the primes p and q. If they are too small then clearly we can easily factor n so we must choose large primes (we are talking over a 100 digits for very modest security). But we also have to avoid some obvious traps. For example we don't want p and q to be too close together, for then again, n is easily factored.

There is some overlap with posthumous's writeup but there's quite a bit more detail here.
In fact, according to the UK government, the RSA folk didn't really invent the RSA algorithm. They say it was invented at Government Communications Headquarters (GCHQ), the british equivalent of the NSA. The man who started this went by the name of James Ellis, who worked in the Communications-Electronics Security Group.

The story starts when he was asked to help reduce the cost of key distribution for secret communications. Instead of finding ways to cut corners, he decided to go all the way, and eliminate the problem entierly. By 1969, he knew that this was possible, and enlisted the help of Clifford Cocks, who helped figured out the solution, but without realizing the significance. He passed it on to another mathmatition named Malcome Williamson, who is trying to find a flaw, actually discovered Diffie-Hellman-Merkle key exchange in 1975, when the DHM trio was just getting started. The GCHQ didn't, however patent the work, due to secrecy restrictions, so DHM and RSA are given credit.

This is not to say that those folk didn't do a wonderful job, they arrived to the same conclusion independently and were truly the first to realize its full potential.

Much of this information is taken for a wonderful book called The Code Book by Simon Singh.

Well there is slightly more to it than this. The public key is the modulus n and a computationally favourable exponent e. The private key is d, the inverse of e mod (p-1)*(q-1) (and of course n). It is important to make n as nasty as possible. This is done by choosing p and q to be of similar size and for them both to be strong primes. There are a few views on this but if p-1 has a large prime factor (and the same for q-1) then a few attacks are thwarted. p and q can be discarded but in general they are stored with the private key (along with a few precalculated helper values) so that exponentiation by d can be performed efficiently.

e is usually 65537 for X509 certificates (https/SSL) and typically n is 1024 or 2048 bits.

A few notes about RSA from a practical perspective. This writeup is not intended to cover all of the various bits of history and details of RSA, simply because there are half a dozen other writeups in this node that seem to be doing a fine job of that.

The Modulus

The modulus n does not actually have to be equal to p*q with p and q prime. It can be the product of any number of prime numbers (ie, any composite number will work fine). Of course, this makes n easier to factor (by how much is not really known; most factoring estimates are based on factoring p*q). The benefit comes when using CRT-based RSA decryption. This method replaces an exponentiation modulo n with 2 exponentiations modulo p and one exponentiation modulo q, plus a few extra operations. Since the time for a modular exponentiation is exponential with the size of the operands, this is a major win for speed, and almost all crypto software, and most hardware, does it in this manner. If n has more factors, you can break the decryption up even more, using many small modular exponentiations instead of one big one. This is generally termed multi-prime RSA, and has been patented by Compaq in the US.

Generally you can get away with between 3 and 6 factors, depending on the total size of the key. The really nice thing about this is that you can give someone your public key, and it looks exactly like a regular RSA key. The only way to tell it's not is to factor the modulus.

The Public Exponent

The public exponent is chosen to be small for several reasons. First, it makes encryption and signature verification very fast, which is nice. In particular, someone will only sign something once, but it might be verified many thousands or even millions of times. More importantly, if the decryption exponent d is too small, then it's possible to recover the key. If e is relatively small (say, less than 232), d will always be sufficiently large to prevent the attack.

Commonly chosen values for the exponent include 3, 17, 257, and 65537. These numbers are all primes of the form 2n+1. Because there are only two 1 bits in the binary representations of these numbers, the exponentiation algorithm is quite fast (this is due to the properties of many common exponentiation algorithms). Prime numbers for the exponent are preferred because p-1 and q-1 must be relatively prime to e (this restriction also applies to additional factors if multi-prime RSA is used). It's easy to see that (assuming p and q are always chosen randomly), the restriction will be easiest to satisfy when e is a prime number.

Converting messages into numbers

Most examples of RSA (including the ones included in this node), basically ignore the problem of encoding a message into something usable by RSA. Many 'toy' examples, designed to showcase the mathematics involved, suggest turning the string to be encrypted into an integer by concatenating the ASCII representation of the characters.

A common size for an RSA modulus is 1024 bits (or 128 bytes). RSA cannot encrypt a message (really, RSA encrypts integers), larger than the modulus. So what about when we want to encrypt a 1 megabyte file? A single encryption using a megabyte long RSA modulus would take quite a few minutes on a fast machine, so that doesn't seem too great.

An alternative is to break the message up into pieces and encrypt them one at a time (in fact, some books suggest this as a serious option). But this too will be quite slow, and additionally insecure. For example, an attacker could undetectably rearrange the RSA encrypted blocks. Unfortunately, some crypto libraries, like Java's JCE, actually support using RSA in this manner (most do not). The real solution is to generate a random key, encrypt it with RSA, and then send the recipient both the RSA-encrypted key, and the message encrypted with a symmetric cipher such as DES, using the key.

Even this doesn't solve all of the problems. It's quite possible in some situations (such as a server that decrypts a message encrypted with RSA and processes it somehow) to attack RSA, if it is used in the manner mentioned above. These attacks usually allow the attacker to gain the decryption of a small number of ciphertexts. They are fairly hard to pull off, requiring a good deal of CPU time and the ability to send many thousands, or even millions, of messages to the server. However, they are much easier to accomplish than factoring the modulus.

To prove that this should not be taken lightly, I'll point out that attacks against SSL have been published which can recover a SSL session key in a fairly practical manner. However, because it would require connecting many many times to the server, it would undoubtably be noticed by the resident network engineers.

The way to work around this problem is to first choose the random key, and then encode it in some way that provides redundancy, and ensures that the input to the actual RSA operation is only a little bit smaller than the modulus. There are many ways to do this, some of which have been standardized in PKCS and IEEE 1363 (in particular, read OAEP).

Similar issues surround the use of RSA for signatures. IEEE 1363a has some solutions that seem fairly workable -- unfortunately, the best ones are patented.

This Writeup Sucks

This has been a brief and most incomplete coverage of some of the practical issues surrounding the use of RSA today. If you have comments or suggestions for additions, please, /msg away.

Log in or register to write something here or to contact authors.