MOAR codey stuff

I turned my last post on cryptography into a two-parter because there was a fair ton of interesting and relevant stuff that I wasn’t able to cover in that particular 1200 words, so here comes the rest of it. I’m not going to bother with an intro this time though, so go and read my last post (if you haven’t already) before this one to make sure we’re all on the same level here.

We all good? OK, let’s talk about public keys.

When you encode or decode a message with a cipher, you perform a slightly different process each way, but the two processes are mathematically related. For example, when encrypting with a Caesar cipher you ‘add three’ to the ‘value’ of each letter, and when decrypting you subtract three; the one process is the inverse of the other. These different types of key, or parts of the overall key, are known as the encryption and decryption keys. Since the two are mathematically related, knowledge of the one allows an enemy cryptanalyst to discover the other with relative ease in most cases; thus, both keys have to be kept very secret to avoid exposure, which makes the distribution of keys a dangerous business.

However, in the RSA algorithm talked about at the end of the last post, the tools for encryption (the massive number M and the power P the message is raised to) are of no use to a foe trying to decrypt a message with any degree of ease unless he also has the two prime factors of M (I still don’t get how that works mathematically). Thus the encryption key needed to send messages to a person secretly can be distributed freely and be known to anyone who wants it, without fear of these secret messages being decoded; incredibly useful for spy networks, since it allows multiple operatives to use the same key to send messages to someone without fear that the capture of one agent could compromise everyone else’s security. In this kind of cryptography, the key distributed publicly, which anyone can access, is known as the ‘public key’, whilst the secret key used to decrypt messages is called the ‘private key’.

RSA is not the only method employed in public key cryptography, but every method it does employ relies on the same kind of trick: a mathematical process that is easy to do and very hard to undo. Public and private keys have other uses too beyond secure encryption; when encrypting a message using somebody else’s public key, it is possible to add a digital ‘signature’ using your private key. The recipient of your message, upon decrypting it with their private key, can then use your public key and a special algorithm to verify your signature, confirming that the message came from you (or at least someone in possession of your private key- I still don’t know how the maths works here). You can also combine your private key with another person’s public key (while they do the same with theirs and yours) to produce a ‘shared secret’, but here my concept of what the hell is going on takes another large step back so I think I’ll leave this subject there.
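
For anyone who wants to see the ‘shared secret’ business in action, here is a minimal sketch. It assumes the idea being described is a Diffie-Hellman style exchange, which is my guess rather than anything stated above, and every number in it (the prime 23, the base 5, the two private keys) is a tiny made-up value chosen so the sums stay readable.

```python
# Toy Diffie-Hellman exchange. All numbers here are illustrative assumptions;
# real systems use primes hundreds of digits long.
p = 23   # public prime modulus (assumed)
g = 5    # public base (assumed)

alice_private = 6    # Alice keeps this secret
bob_private = 15     # Bob keeps this secret

# Each side publishes g^private (mod p); this is their public key.
alice_public = pow(g, alice_private, p)
bob_public = pow(g, bob_private, p)

# Each side combines its own private key with the other's public key.
alice_shared = pow(bob_public, alice_private, p)
bob_shared = pow(alice_public, bob_private, p)

print(alice_public, bob_public, alice_shared, bob_shared)
```

Both sides end up with the same number without either private key ever being sent anywhere, which is the whole point.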

Despite all its inherent security, there is one risk still associated with public-key cryptography and techniques similar to the RSA algorithm. The weak link lies in the key itself; transferring a private key is (mostly) only ever necessary when the old lines of communication are insecure, which means the key itself can often be intercepted by a sharp enemy cryptanalyst. If he is smart, he’ll then send the key straight on to its intended recipient, meaning they are likely to carry on using it, oblivious to the fact that the other side can intercept and translate every message sent to them. Therefore, it is advantageous to remove this weak link by ensuring the recipient can tell if the key has been intercepted; and here we enter the weird and wonderful world of quantum cryptography.

The name is actually a misnomer; quantum theory and effects cannot be used to encrypt secure messages, and the term refers to two ideas that are only related to cryptography. One is the theoretical possibility that future quantum computers may be able to crack the RSA problem and throw the world of cryptanalysis wide open again, whilst the other, far more practical, side of things refers to this method of confirming that a key has not been intercepted (known as quantum key distribution, or QKD). The theory behind it is almost as hard to get your head around as the maths of the RSA algorithm, but I’ll try to explain the basics. The principle behind it concerns Heisenberg’s uncertainty principle; the idea that attempting to observe a quantum effect or system will change it in some way (just go with it). The two parties communicate in two ways; one via a ‘quantum link’ with which to send the bits that will become the key, and another via an open channel (e.g. the internet). The first party (who convention dictates is called Alice) sends a string of random bits via the quantum channel, polarising each bit of quantum data in one of two types of direction (just go with it). The receiving party (traditionally called Bob) receives this polarised quantum data, but since he doesn’t know which type of polarisation Alice has used he just picks one at random each time (just go with it). About half of the time, therefore, he’ll pick the right one. Alice then tells him over the open channel which type of polarisation she used for each bit (usually, for reasons of speed, this is all done automatically via computer), and Bob tells her which type of polarisation he checked for each bit. They both discard the ones where they did it a different way around, and keep the ones where they did it the same way as a shared key- thus is the key exchanged.
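
If the ‘just go with it’s are piling up, a little simulation may help. This is only a sketch of the sifting step described above, with no eavesdropper and a perfect line; the two polarisation types are labelled "+" and "x", and the function name `bb84_sift` and the rule that a wrong-type measurement gives a coin flip are my own simplifying assumptions.

```python
import random

def bb84_sift(n_bits, rng):
    """Simulate key sifting with no eavesdropper and a noiseless channel."""
    alice_bits = [rng.randint(0, 1) for _ in range(n_bits)]
    alice_bases = [rng.choice("+x") for _ in range(n_bits)]
    bob_bases = [rng.choice("+x") for _ in range(n_bits)]

    bob_bits = []
    for bit, a_basis, b_basis in zip(alice_bits, alice_bases, bob_bases):
        if a_basis == b_basis:
            bob_bits.append(bit)                # same type: Bob reads the bit correctly
        else:
            bob_bits.append(rng.randint(0, 1))  # different type: Bob's result is random

    # Over the open channel they compare polarisation types (never the bits)
    # and keep only the positions where the types matched.
    keep = [i for i in range(n_bits) if alice_bases[i] == bob_bases[i]]
    alice_key = [alice_bits[i] for i in keep]
    bob_key = [bob_bits[i] for i in keep]
    return alice_key, bob_key

alice_key, bob_key = bb84_sift(1000, random.Random(42))
print(len(alice_key), alice_key == bob_key)
```

Roughly half the bits survive, and the two surviving keys are identical.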

However, if somebody (Eve, conventionally) has been eavesdropping on this little conversation and has measured the polarisation of the quantum bits, then the polarisation of the bits will have been changed by this process (just go with it). This introduces errors into Bob’s readings, some of which can just be put down to the mechanics of the process; if, however, more than p bits show an error (p is picked to be a suitable number- I couldn’t give you an example), then the line and key are presumed to be insecure and the whole process is started again. Simple, isn’t it?
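
Here is a rough sketch of why Eve gives herself away. It assumes she measures every bit in a randomly chosen polarisation type and re-sends whatever she saw, and that a measurement in the wrong type gives a coin flip; the function names and the 10% threshold standing in for p are hypothetical choices of mine, not real-world values.

```python
import random

def measure(bit, prepared_basis, measuring_basis, rng):
    """Matching types return the bit; mismatched types return a coin flip."""
    return bit if prepared_basis == measuring_basis else rng.randint(0, 1)

def error_rate(n_bits, eavesdrop, rng):
    """Fraction of sifted bits where Bob's reading disagrees with Alice's bit."""
    errors = 0
    kept = 0
    for _ in range(n_bits):
        bit = rng.randint(0, 1)
        alice_basis = rng.choice("+x")
        sent_bit, sent_basis = bit, alice_basis
        if eavesdrop:
            eve_basis = rng.choice("+x")
            sent_bit = measure(bit, alice_basis, eve_basis, rng)
            sent_basis = eve_basis       # Eve re-sends in the type she measured in
        bob_basis = rng.choice("+x")
        bob_bit = measure(sent_bit, sent_basis, bob_basis, rng)
        if bob_basis == alice_basis:     # only sifted bits get compared
            kept += 1
            errors += bob_bit != bit
    return errors / kept

THRESHOLD = 0.10   # hypothetical stand-in for p, expressed as a fraction

quiet = error_rate(20000, False, random.Random(1))
tapped = error_rate(20000, True, random.Random(1))
print(quiet, quiet > THRESHOLD)
print(tapped, tapped > THRESHOLD)
```

With no eavesdropper the error rate is zero in this idealised model; with Eve on the line about a quarter of the kept bits come out wrong, which is very hard to miss.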

Despite all the bother and complexity of these processes, however, it is still acknowledged that perhaps the best way to conceal a message’s content is to simply hide the thing very, very well. One story from the ancient world (told by Herodotus) has a message tattooed onto a slave’s shaved head; once his hair had grown back he was sent to the recipient, who simply shaved his head again to read it. A more advanced approach was employed during WW2 as a direct link between Franklin D. Roosevelt and Winston Churchill. Both had a set of identical tracks of white noise (i.e. random sound), which they would ‘add’ to a recorded audio message and their counterpart would ‘subtract’ when it got to the other end. The random nature of white noise made the link impossible to break (well, at the time; I don’t know what a computer might be able to do with it) without access to the original track. The code was used throughout the war, and was never broken.
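
The ‘add noise, subtract noise’ idea is easy to sketch. The version below is a toy that treats the message as a list of 8-bit values and does the adding and subtracting modulo 256; that representation and the function names are my assumptions for illustration, not a description of how the wartime equipment actually worked.

```python
import secrets

def add_noise(samples, noise):
    """Mask each sample by adding the matching noise value, wrapping at 256."""
    return [(s + n) % 256 for s, n in zip(samples, noise)]

def subtract_noise(masked, noise):
    """Undo the masking by subtracting the same noise values."""
    return [(m - n) % 256 for m, n in zip(masked, noise)]

message = list(b"HELLO WINSTON")
# Both ends must hold an identical copy of this random track.
noise = [secrets.randbelow(256) for _ in message]

masked = add_noise(message, noise)
recovered = bytes(subtract_noise(masked, noise))
print(recovered)
```

Without the noise track the masked values are just as random as the noise itself, which is why the track must never be reused or lost.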

Crypto

Cryptography is a funny business; shady from the beginning, the whole business of codes and ciphers has been specifically designed to hide your intentions and move in the shadows, unnoticed. However, the art of cryptography has been changed almost beyond recognition in the last hundred years thanks to the invention of the computer, and what was once an art limited by the imagination of the nerd responsible has now turned into a question of sheer computing might. But, as always, the best way to start with this story is at the beginning…

There are two different methods of applying cryptography to a message; with a code or with a cipher. A code is a system involving replacing words with other words (‘Unleash a fox’ might mean ‘Send more ammunition’, for example), whilst a cipher involves changing individual letters and their ordering. Codes are generally either limited to a few words that can be easily memorised or require endless cross-referencing with a book of known ‘translations’, as well as being relatively insecure when it comes to highly secretive information. Therefore, most modern encoding (yes, that word is still used; ‘enciphering’ sounds stupid) takes the form of employing ciphers, and has done for hundreds of years; they rely solely on the application of a simple rule, require far smaller reference manuals, and are more secure.

Early attempts at ciphers were charmingly simple; the ‘Caesar cipher’ is a classic example, famously invented and used by Julius Caesar, where each letter is replaced by the one three along from it in the alphabet (so A becomes D, B becomes E and so on). Augustus Caesar, who succeeded Julius, didn’t set much store by cryptography and used a similar system, although with only a one-place shift (so A to B and such)- despite the fact that knowledge of the Caesar cipher was widespread, and his messages were hopelessly insecure. These ‘substitution ciphers’ suffered from a common problem; the relative frequency with which certain letters appear in the English language (E being the most common, followed by T) is well-known, so by analysing the frequency of letters in a substitution-enciphered message one can work out fairly accurately which letter corresponds to which, and work out the rest from there. This problem can be partly overcome by careful phrasing of messages and using only short ones, but it’s nonetheless a problem.
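
To make the frequency trick concrete, here is a small sketch: a Caesar cipher, plus a cracker that simply assumes the most common letter in the ciphertext stands for E. The function names and the (deliberately E-heavy) sample sentence are made up for the example, and on short or oddly worded messages the guess can easily be wrong.

```python
import string
from collections import Counter

ALPHABET = string.ascii_uppercase

def caesar(text, shift):
    """Shift every letter by `shift` places; a negative shift decrypts."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)   # leave spaces and punctuation alone
    return "".join(out)

def guess_shift(ciphertext):
    """Guess the shift by assuming the most common ciphertext letter is E."""
    letters = [ch for ch in ciphertext if ch in ALPHABET]
    most_common = Counter(letters).most_common(1)[0][0]
    return (ALPHABET.index(most_common) - ALPHABET.index("E")) % 26

plaintext = "MEET ME BENEATH THE ELM TREE NEAR THE SEVENTH GATE"
ciphertext = caesar(plaintext, 3)
shift = guess_shift(ciphertext)
print(ciphertext)
print(shift, caesar(ciphertext, -shift))
```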

Another classic method is to use a transposition cipher, which changes the order of letters- the trick lies in having a suitable ‘key’ with which to do the reordering. A classic example is to write the message in a rectangle of a size known to both encoder and recipient, writing in columns but ‘reading it off’ in rows. The recipient can then reverse the process to read the original message. This is a nice method, and it’s very hard to decipher a single message encoded this way, but if the ‘key’ (e.g. the size of the rectangle) is not changed regularly then one’s adversaries can figure it out after a while. The army of ancient Sparta used a kind of transposition cipher based on a tapered wooden rod called a skytale (pronounced skih-tah-ly), around which a strip of paper was wrapped and the message written down it, one letter on each turn of paper. The recipient then wrapped the paper around a skytale of identical girth and taper (the tapering prevented letters being evenly spaced, making it harder to decipher), and read the message off- again, a nice idea, but the need to make a new set of skytales for everyone every time the key needed changing rendered it impractical. Nonetheless, transposition ciphers are a nice idea, and the Union used them to great effect during the American Civil War.
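
The rectangle method can be sketched in a few lines. Here the ‘key’ is just the number of rows, the message is padded with X to fill the grid, and the function names are my own; treat it as an illustration of the idea rather than any historical system.

```python
def encrypt(message, rows, pad="X"):
    """Write the message down the columns of a grid, then read it off along the rows."""
    cols = -(-len(message) // rows)          # ceiling division
    padded = message.ljust(rows * cols, pad)
    # Character i sits at row i % rows, column i // rows.
    return "".join(padded[c * rows + r] for r in range(rows) for c in range(cols))

def decrypt(ciphertext, rows):
    """Reverse the process: write along the rows, then read down the columns."""
    cols = len(ciphertext) // rows
    return "".join(ciphertext[r * cols + c] for c in range(cols) for r in range(rows))

secret = encrypt("ATTACKATDAWN", 3)
print(secret)
print(decrypt(secret, 3))
```

Note that every letter of the original is still there in the ciphertext, just shuffled, so letter frequencies are untouched; only the order is hidden.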

In the last century, cryptography has developed into even more of an advanced science, and most modern ciphers are based on the concept of substitution ciphers- however, to avoid the problem of using letter frequencies to work out the key, modern ciphers use intricate and elaborate systems to change by how much the ‘value’ of the letter changes each time. The German Lorenz cipher machine used during the Second World War (and whose solving I have discussed in a previous post) involved putting the message through a set of wheels and electronic pickups to produce another letter; but the wheels moved on after each letter was typed, totally changing the internal mechanical arrangement. The only way the British cryptographers working against it could find to solve it was through brute force, designing a computer specifically to test every single possible starting position for the wheels against likely messages. This generally took them several hours to work out- but if they had had a computer as powerful as the one I am typing on, then provided it was set up in the correct manner it would have the raw power to ‘solve’ the day’s starting positions within a few minutes. Such is the power of modern computers, and against such opponents must modern cryptographers pit themselves.
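
As a very loose illustration of both ideas (a shift that changes with every letter, and brute-forcing the starting position against a likely word), here is a toy. It is emphatically not a model of the Lorenz machine; the single 26-position ‘wheel’, the function names and the guessed word are all assumptions made up for the example.

```python
import string

ALPHABET = string.ascii_uppercase

def stepping_cipher(text, start, decrypt=False):
    """Shift each letter by an amount that advances one step per letter."""
    out = []
    shift = start
    for ch in text:
        if ch in ALPHABET:
            step = -shift if decrypt else shift
            out.append(ALPHABET[(ALPHABET.index(ch) + step) % 26])
            shift = (shift + 1) % 26     # the 'wheel' clicks on one place
        else:
            out.append(ch)
    return "".join(out)

def brute_force(ciphertext, likely_word):
    """Try every starting position; keep those whose output contains the likely word."""
    return [start for start in range(26)
            if likely_word in stepping_cipher(ciphertext, start, decrypt=True)]

secret = stepping_cipher("WEATHER REPORT FOLLOWS", 7)
print(secret)
print(brute_force(secret, "WEATHER"))
```

With only 26 starting positions this takes no time at all; the real machines had so many possible settings that the same approach needed purpose-built hardware.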

One technique used nowadays presents a computer with a number that is simply too big for it to deal with; these are called ‘trapdoor ciphers’. The principle is relatively simple; it is far easier to find that 17 x 19 = 323 than it is to find the prime factors of 323, even with a computer, so if we upscale this business to start dealing with huge numbers a computer will whimper and hide in the corner just looking at them. If we take two prime numbers, each more than 100 digits long (this is, by the way, the source of the oft-quoted story that the CIA will pay $10,000 to anyone who finds a prime number of over 100 digits due to its intelligence value) and multiply them together, we get a vast number with only two prime factors which we shall, for now, call M. Then, we convert our message into number form (so A=01, B=02, I LIKE TRAINS=0912091105201801091419) and the resulting number is then raised to the power of a third (smaller, three digits will do) prime number. This will yield a number far bigger than M, and successive lots of M are then subtracted from it until it reaches a number less than M (this is known as modulo arithmetic, and can be best visualised by example: so 19+16=35, but 19+16 (mod 24)=11, since 35-24=11). This number is then passed to the intended recipient, who can decode it relatively easily (well, so long as they have a correctly programmed computer) if they know the two prime factors of M (this business is actually known as the RSA problem, and for reasons I cannot hope to understand current mathematical thinking suggests that finding the prime factors of M is the easiest way of solving this; however, this has not yet been proven, and the matter is still open for debate). However, even if someone trying to decode the message knows M and has the most powerful computer on earth, it would take him thousands of years to find out what its prime factors are. To many, trapdoor ciphers have made cryptanalysis (the art of breaking someone else’s codes) a dead art.
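
Here is a toy version of the whole thing using the 17 x 19 = 323 example. It is only a sketch: the power 5 is my own pick (chosen because it shares no factor with 16 x 18 = 288, which the sums need in order to work), the message is encrypted one letter at a time because M is far too small to hold a whole sentence, and the names `encrypt` and `decrypt` are just labels for this example.

```python
from math import gcd

p, q = 17, 19
M = p * q                   # 323, the public modulus
phi = (p - 1) * (q - 1)     # 288; easy to find only if you know p and q

P = 5                       # public power (assumed); must share no factor with phi
assert gcd(P, phi) == 1
D = pow(P, -1, phi)         # private power, the inverse of P mod phi (Python 3.8+)

def encrypt(m):
    """Raise the number to the public power, modulo M."""
    return pow(m, P, M)

def decrypt(c):
    """Raise the ciphertext to the private power, modulo M."""
    return pow(c, D, M)

message = [9, 12, 9, 11, 5]   # I, L, I, K, E using A=01, B=02, ...
cipher = [encrypt(m) for m in message]
print(cipher)
print([decrypt(c) for c in cipher])
```

Anyone can encrypt with M and P, but working out D requires 288, and getting 288 requires knowing that 323 is 17 x 19. With 323 that is trivial; with a 200-digit M it is the ‘thousands of years’ problem.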

Man, there’s a ton of cool crypto stuff I haven’t even mentioned yet… screw it, this is going to be a two-parter. See you with it on Wednesday…