3.1 intro to crypto Flashcards
Definition of crypto
- Cryptography is the art of private communication in a public environment
- The word encryption does not occur in this definition – sometimes when you’re establishing private communication you are not encrypting your data but you’re hiding your data
- Hiding your message is steganography
Services of crypto
- confidentiality
- integrity
- authentication
- nonrepudiation
- availability
- Confidentiality: information kept private and secure – encrypt email so you can’t read it.
- Integrity: data not modified, deleted or added – ensuring the message sent is the message received – if you can decrypt the message back to plain text then you have a good sense that the message hasn’t been tampered with, but with the addition of digital signatures you can mathematically prove that the message hasn’t been identified
- Authentication: providing verification of identities – verify the identity of a person or system – passwords are stored as cryptographic caches – when connecting to the internet for example, both you and ISP know your password, their server send encrypted string with your password, you decrypt with your password and send back the string, meaning you both know the password
- Nonrepudiation: assurance by recording identities and activities – the person that sent the message cannot deny doing so.
- Crypto helps with integrity and confidentiality, authenticity, and accountability
- It does not help with availability – if something is encrypted, I can still delete all the files, even if I can’t read it
o It’s actually anti-availability
o Ransomware encrypts your files and they decrypt once you pay the ransom
o For other examples, you want to make sure you balance availability – too much availability means you put the key in the same place that the data is encrypted and too little availability is like loosing the key
Terminology
- plaintext
- ciphertext
- encryption
- decryption
- algorithm
- cryptoanalysis or cryptanalytic attack
- key
- keyspace
- Plaintext: synonymous with cleartext what you are reading now – text in human readable form
- Ciphertext: the plaintext transformed into a non-human-readable form.
- Encryption: the process of turning plaintext into ciphertext. Transform readable to nonreadable
- Decryption: the process of turning ciphertext back into plaintext.
- Algorithm: a process to be followed in calculation, especially by a computer. The public knowledge set of rules behind cryptography.
o Data encryption standard (DES)
o Advanced encryption standard (AES) - Cryptanalysis or cryptanalytic attack: the art and science of breaking cryptography – throughout history folks wanting to hide their communications have gotten better at doing so, then those wanting to read those messages find ways to break it – this is a cycle that will continue forever
- Key: a numeric value of a given length (expressed in bits). The changing part of the encryption algorithm. In certain types of cryptgraphic systems, protecting the key is paramount.
- Keyspace: the range of values that can be used to construct a key given a specific key length. In other words, every possible combination of a given number of 1s and 0s.
o DES encryption algorithm uses a key of 56 bits, there are 72 quadrillion combinations.
Why keyspace matters
- picture on page 54
- Every time you had a bit the number of possible combinations doubles
- The term brute force attack simply means that you guess every possible combination of a key, eventually one of them will work. This is the one attack against crypto that is always guaranteed to work – it is a question of how long it will take
- When you look at the strength of a cryptographic system, one of the first things you want to know is the length of the key – the longer key means more possible combinations, meaning more guesses to defeat the system via a brute force attack
- The way the odds play out, you will probably guess the key after passing half of the possible combinations – everything is possible, even getting it on your first try, but of course that is very unlikely
o Brute force is like trying to find a specific needle in a needle stack – they all look the same to the naked eye – you have 340 undecillion needle and you have to find one specific needle - Brute force reality: if you assume –
o There are 7 billion people on the planet
o Every person has 10 computers (70 billion computers)
o Each computer can test 1 billion keys per second (70 billion keys per second)
o You crack the key after resting 50% of the possibilities
o Then the earth’s population can crack one 128-bit key in 770 billion centuries – “effective infinity
Strength of cryptosystems
- randomness
- birthday attack
- strong math and peer review
Strength of cryptosystems: randomness is critical
* In addition to being long, the key must also be as random as possible
* A predictable key leads to a weak key attack
* The longer the key the easier it is to generate random values
* Bad news: computers cannot generate a truly random value – on eof the most difficult things to accomplish in software code is a good randomization routine
* Because computers cannot generate true randomness, they utilize a pseudo-random number generator (PRNG) to generate a random enough value for a cryptographic key
An attack against randomness: Birthday attack
* There is an interesting attack in cryptography known as the birthday attack – let’s take a look at how this attack works because it is a randomness issue that leads to the attack
* An important question to ask when looking at attacks if what if we could find a way to predict the most likely keys a PRNG will generate? This would speed up the brute force attack, how can you accomplish this?
* Birthday probability theory:
o If you take 23 people born in the US. There is a 50% probability that two of them will have the same birth date (month and day). If you raise the number to 47 people, it become a virtual certainty.
o The fact is that the most common birthdays in the US are in late September (september 16 to October 1 are especially common).
o The point is that there are certain days on the calendar that are more likely to occur as birth dates than others
o Apply that to PRNG – a pseudo-random number generator will not generate a 40 bit key like all zeros, all ones, or predictable patterns
o A true random number generator would generate predictable patterns like 12345,
o So if we remove the true random generators, then we know there are keys that are more likely to be generated – if we apply this concept to determine the keys most likely to occur and guess those first in our brute force attack, we can accelerate the attack a great deal
Strength of cryptsystems: strong math and peer reviewed algorithms
* In modern crypography, if the algorithm has a flaw this can lead to a mathematical attack – the result is that you can decrypt the data without the key
* Do not buy the “snake oil” (a substance with no real medicinal value sold as a remedy for all diseases) if a vendor tells you they have proprietary encryption algorithm do not buy the product as every case we know is that it got broken within two weeks when it wasn’t peer reviewed
* By constrast, the DES algorithm has been public domain since March 1975, every mathematician in the world has tried to find a flaw in the math, but so far none of them have – peer review gives us a high degree of confidence in that algorithm
* Incidentally, the AES algorithm has been in the public domain since August 1998 – no mathematician has reported any flaws in its math either – but it was not until 2010, twelve years after the peer review began that the mathematical community considered AES to be sufficiently reviewed
* People try to convince management that having a proprietary encryption key is more secure since people don’t know how it works – but eventually you have to release it to the public and others will attack it
The work factor
- calculating the work factor
- Moore’s law
- If you encrypt something and anyone in the world can decrypt it, then everyone in the world can decrypt it
- When calculating the work factor, you take into account everything we talked about – the length and randomness of the key, the algorithm strength, the avalanche effect and so on.
- Also consider Moore’s law – he explained that the number of transitotrs in a dense integrated circuit doubles approximately every two every – meaning also that computers increase in speed every two years (1,2,4,8,16,32,64)
- For example, Cray2 mainframe 1985 1.9 GFLOPs (16 million dollars) – Iphone11 does 1,000 GFLOPs (1teraflop) $1000
- Now think about what that speed difference means when someone is brute-forcing passwords or crypto keys – every two years, the number of guesses they can make per second doubles – it means that the DES algorithm 56-bit key that would take more than 20,000 years to brute force in 1975 can now be brute-forced in hours
- There is no such thing then as an infinite work factor, you need to determine an appropriate work factor
o Example: the secret attack at dawn needs to be protected only approximately 24 hours, after you launch an attack, the other side kind of knows what you were planning. - A long work factor is hard to achieve in large part because of the spped at which computing power is increasing – but what about a piece of data you must protect for 100 years, here’s what you do:
o Encrypt it today with the best crypto available. Then in 20 years from now, AES will be considered weak and something new will be better and out – decrypt it with the old and re-encrupt with the new. Twenty years later, the cycle repeats, and you decrypt with the old and re-encrypty with the new.
o At no time can you transmit the data, if for example, you transmit it today in the AES 256 bit encrypted form and someone captures it in that form, then 25 years from now, they will be able to break it back to plaintext.
Hidden ciphers: origins of steganography
- the shaven head
- invisible ink
- pin pricks
- shoelaces
- sir john trevanion’s letter
- a modern hidden cipher
- Humans secretly communicating information
- The shaven head: take a slave and shave their head, tattoo a message to their scalp, and wait for their hair to grow back, then they would send the salve to the other side, shave the head, and read the message.
- Invisible ink: 100% pure lemon juice, a sharp stick, and a piece of paper: dip the stick in the lemon and write a message, when it dries you cannot see the message, unless you take the paper to a candle and heat it slowly.
- Pin pricks: nazi symptehziers during WWII would poke dots above certain letters in a news paper and would drop it off in a predetermined location
- Shoelaces:
- Sir John Trevanion’s letter:
o A famous example of steganographic communication comes from the 17th centurity. Sir John Trevanion had been improsined and sentenced to death – while in prison one his servants sent him a letter with odd punctionation – but the real message was hidden as the third character after any punctuation mark
o It reads out the panel at east end of chapel slides – this is where he went to escape - A modern hidden cipher: steganography
o The field of study related to covered writing – today though it means hiding a message inside a picture, movie, or similar type of file
o One popular method of hiding messages involves the use of pictures – again there are many ways of using graphics for this, but we will concentrade on a method that works with JPG files
When you look at how a JPG is constructued, each byte of the file defines one pixel – the first 5 bits define the color, contracts, and such – the last 3 bits define insignificant subtleties the human eye cannot discern
The picture on the left is just a JPG image, the picture on the right is the same image but has a 60kb text file hidden inside it – we used the tool called steghide to accomplish this
The human eye cannot discern adifferentce in the two pictures because the text file has only overwritten one of the last 3 bits of each byte
A crypgraphic hash could tell us that the images are different but not how they are different. And to perform a comparasion of that nature would require that you have both the before and after images as you do here, in testing if you embed a large enough file that it overwrites all three of the bits of most of the bytes, there is indeed a slight defradation of the image, but without the before and after, it would be hard to tell. - As stated, we used steghide to embed the text file inside the JPG file, when we ran that software it prompted us for a passphrase, this stypically means that the text file will be encrypted before it is embedded, usually the passphrase becomes the encryption key
- Rumor has it that this type of communication has been used by organized crime and terrorists for years, because they encrypt first then embed, we cannot prove it is happening.
- Steganography can also be used for good reasons like digitally watermarking images, movies, songs so that we can prove if plagiarism occurs, or if something is edited, the watermark will survive.
Steganography vs encryption
- Encryption provides for confidentiality of communication – not secrecy of communication – nobody else knows what you are saying, but they can tell you are talking privately – humans and computers can spot cyphertext
- Steganopgraphy provides secrecy of communication – nobody knows the parties are even talking
- Combined together, you can get the best of both
transposition/permutation/obfuscation
- rail fence cipher
- scytale cipher
- column shift
- pictures on page 60 and beyond
- Transposition ciphers change and reorder the the letters.
- Think of the word cinema being rearranged to create iceman, in crypto though the new version of the letters are not always readable words
- Transposition, permutation, and obfuscation are all synonymous
- Rail fence cipher:
o You have a rail fence, you throw letters of your message against the fence, every other letter strikes a rail of the fence and stops. Every other letter passes between the rails and keeps going – you then write the top line, followed by the bottom line. You draw a line between the two strings to show the demarcation. This is an attack at dawn message. - Scytale cipher was used by the spartans around 400 BC.
o All generals carried a stick of a c ertain diameter and taper – they would wrap a hide around the stick and write their message down the strips as you see above, again our secret message is attach at dawn. When you unwrapped the hide, you had the letters in the order the occur on the strips.
o Without the diameter staff, you would hav e ahard time realigning the letters to read the message
o From time to time one of the generals would be captured and the enemy obtained their stick or the key, which would cause the generals to then create a whole bunch of new sticks and distribute them to all the generals – they are essentially changing the encryption key - Column shift: you build a block of text boces in columns, when we don’t hav enough to fill a column we pad it with binary 01 10
o We then employ some formula to rearrange the order of the columns
Substitution cipher
- monoalphabetic
- polyalphabetic
- atbash
- ceaser cipher
- rot 13
- Where a transposition changes the order of the letters, the substation replaces the letters using one of several formulas
- They fall into one of two broad catogeries:
o Monoalphabetic: you replac ethe plaintext alphabet with one ciphertext alphabet
o Polyalphabetic: you replace the plaintext alphabet with many ciphertext alphabets - Monoalphabetic cipher: works by expediency of replacing the plaintext with a ciphertext. Any number of methods for creating the ciphertext alphabet.
o Atbash – used from 600 to 500 BC
to create the ciphertext alphabet they simply turned the alphabet backwards, so instead of beginning ABC, it begins with ZYX.
o Ceaser cipher or C3. They simply shifted the alphabet by three positions to create the ciphertext alphabet. Because they rotate the alphabet this is known as a rotational substation cipher. The ciphertext alphabet starts with DEF and ends with ABC.
o ROT 13: Rotational substation cipher which rotates the alphabet by 13 positions. In newsgroups people would tell offensive jokes using ROT 13, for those who want to see the joke you decrypt, if you don’t want to be offended then don’t bother.
monoalphabetic cipher: attacks
- frequency analysis attack
- plaintext attack
- When used by themselves, monoalphabetic ciphers are highly susceptible to attacks, regardless of the method used ot create the ciphertext alphabet, in fact, history has several examples of these ciphers being broken
- One such attack is called a frequency analysis attack, also called the cyphertext only attack: the adversary has nothing to work with but your cyphertext, they perform frequency analysis on that cyphertext to attempt to discern patterns. To do this you need to know the language of the original plaintext
- In each language, letters have a frequency of occurenc e- in English the letter E is most common, occurring about 12.7% of the time, T is around 9.1%, A is 8.2%. In addition there are diagraphs that occur together like TH occurring at 1.52%
- So to perform a frequency analysis attach we look for patterns in your ciphertext, perhaps we find the letter Z occurring most often and there is a high frequency of ZZs, we now believe you are replacing Es with Zs.
- Eventually the process of elimination starts to kick in as well, when we know all but eight or ten of the letters, it starts being easier to figure out those remaining
- A monoalphabetic cipher is also susceptible to a known plaintext attack – arabs vs greeks where the greeks always began every message with the words in the name of God. Using those 18 characters, the Arabs figured out how to encrypt their monoalphabetic cipher that they deciphered the entire message.
- A modern day example of the known plaintext attack would be if you know every email message a person sends out contains Microsoft outlook 2013, from bit x to bit y of the message. If the messages always end with a signature block and the paragaphs that says if this is sensitive and you get it by mistake, please disregard. All of these could be examples of plaintext, therefore the attack is theoretically possible today – however, given the extreme amount of randomness that modern ciphers put into ciphertext, it would realistically take millions of years to actually pull the attack off.
Polyalphabetic ciphers
- vigenere in the 16th century
- vigenere table
- running key
- picture on page 64
- The word mono is from ancient Greek and means one – likewise the word poly is also from ancient Greek and means many
- Instead of replacing the plaintext alphabet with a single alphabet, polyalphabetic ciphers use multiple cipher alphabets
- Vigenere in the 16th centur invented this
- Let’s take the vigenere table - to encrypt you the word Grass and you have chosen the key CRYPT – you go across the top plaintext line to get the plaintext and then go to the key and see where G and C meet which is E
- Notice that the double letter here did not surivive encryption – making this a more secure form of encryption
- How would the receiver encrypt? I send you the ciphertext and you already need to know the key – then you would take the C in the key and the E in the cipher text then decrypt to plaintext
- Running key: initial implementatipns of the vigenere cipher utilized a repeating key – for example, if you chose a key of a lemon, you would just repeart that key LEMONLEMONLEMON until you had all your text encrypted – it took many years, but eventually someone figured out an advanced statistical analysis and defeated the repeating key method
Substitution ciphers: code and code books
- picture on page 64
- We now move from replacing individual letters (referred to as ciphers) to replacing words (referred to as codes)
- Code books usually contain the plaintext to code in the front, and the matching code to plaintext in the back this is a classic example – for it to work both parties must have an identical copy of the code book – each book is typically good for a limited time
- Let’s say we need to pass the message that we want air support and cover fire at 0800 hours
- Air support = ZCO – Cover fire = NLG 0800 = QTD
- So to encode the message is ZCO NLG QTD over the unencrepted radio – which would be Zulu Charlie Oscar November Lima Golf Quebec Tango Delta
Substitution cipher: one-time pad (OTP)
- rules/criteria
- XOR - poor-man’s cipher
- picture on page 65
- In 1917 gilbert vernam inveneted the vernam cipher, better known as the one-time pad. In many security books you will read that this the only unbreakable cipher, but this is only true if it meets the following criteria:
o You must use it once and never again
o The pad must be as long as the message
o Both parties must protect the pad at all time
o The pad must be random as possible - This uses XOR – the poor man’s cipher:
o Performs a simple comparison of two binary bits – and produces results based on the comparison if they are identifical the result is always a 0, if the they are different the result is always a 1 - In the one-time pad, first as with the code book both communicating parties must have an identical copy of the pad (the pad amounts to the cryptographic key) as stated earlier, the pad must be random, protected, used only once, and at least as long as the message
- Here you can see the word now in binary on the top line in the box labaled encryption the second line is the one-time pad, perform the XOR operation and you arrive at the third line (the ciphertext)
- You send that line to the person you are communicating with, as stated they already have a copy of the pad
- Moving to the bottom box labeled decryption, the receipent XOR’s the ciphertext against the pad and obtains the original plaintext
- Technical explanation of XOR: remember that in binary there are place values, and if a binary of a certain pattern results in 78 for example, you can see on the ascii chart that this means uppercase N. The XOR takes the two binary bits like a lower and upper case N and XOR’s it by comparing the two. This is called bit switching and is used to increase randomness. From the comparison of the one-time pad and ciphertext when you decrypt you get the binary ASCII and know where the place values are to then look up what the letters are.