Cryptography Flashcards
Kerchoffs’ Principle
if:
* the encryption algorithm was well designed
* the N bit long keys are kept secret
* the algorithm is executed by a trusted party
then:
the only possible attack us the brute force attack which requires a number of trials equal to T = 2^{Bit}
Symmetric cryptography: formulas, characteristics, related problems
- a.k.a secret key crypto
- K1 = K2
- C = enc(K, P)
- P = enc^-1(K, C)
- low computational load -> used for data encryption
- problem: how to share the secret key among sender and receiver
Symmetric cryptography for blocks: algorithms list
DES, 3-DES, IDEA, AES
DES and variants
Symmetric cryptography for blocks
- Data Encryption Standard
- it requires: XOR, shift, permutation -> designed to be efficient in hardware
DES
* key length: 56 + 8
* block length: 64
* time T
3-DES
* key length: 112 + 16 or 168 + 24 -> 112 bit strength
* block length: 64
* time 3T
* it is DES applied three times using 2 or 3 keys of 56 bit
* usually applied in the EDE mode: encryption with K1, decryption with K2, encryption with K3 or K1
* if 2 keys: if the attacker has 2^59B of memory available the strength is of 56 bit, otherwise it is of 112 bit
* if 3 keys: if the attacker has 2^59B of memory available the strength is of 112 bit, otherwise it is of 168 bit
IDEA
Symmetric cryptography for blocks, famous for PGP
AES
Symmetric cryptography for blocks
- Advanced Encryption Standard
- key length: 128 - 192- 256
- block length: 128
Important things about the XOR operation
- it is the inverse of itself
- it is the confusion operator because P(0) = P(1) = 50%
- primitive function available on all CPU
Why 2DES is not used?
Double application of encryption algorithms s subject to a known-plaintext attack named MITM which permits to decrypt data with at most 2^{N+1} attempts if the key is N bit long. This means that only the encryption time is doubled while the effective key length increases by just one bit
Demonstration
How can we apply a block algorithm?
There are different application modes:
* if data to encrypt > algorithm block size: ECB, CBC
* otherwise: padding, CTR, CTS (not sure CTS is a mode)
These are not algorithms but modes of application!!!!
ECB: extended name, formula, characteristics and problems
- Electronic Code Block
- D > B
- C_i = enc(K, P_i) where P_i’s length is the block size length
- P_i = enc^-1(K, C_i)
Advantages
* no propagation of errors: an error in transmission causes an error in the decryption of a single block without propagating to the other blocks;
* CPU parallelization: it is possible to work on several blocks at the same time.
Disadvantages
* swapping attacks: encryption does not depend on the position of the block) it is possible to swap two blocks, or move any block, without this being discovered;
* known-plaintext attacks: identical blocks are encrypted in the same way; an attacker can build a database containing the encryption outputs with all the possible keys of a recurring block (eg header of a Word document), and then find the key by comparing each block of ciphertext for the recurring block (pattern).
CBC: extended name, formula, characteristics and problems
- Cypher Block Chaining
- D > B
- C_i = enc(K, P_i XOR C_{i-1})
- C_0 = IV, random value that has to be known by the receiver -> there are two opposite approaches: IV in clear, IV encrypted
- P_i = enc^-1(K, C_i) XOR C_{i-1} -> XOR is the inverse of itself
characteristics:
- it tries to solve ECB’s problems
- no undetected swapping permitted
- one error in transmission generates an error at the decryption of two block
Padding: characteristics and problems
Bits are added to the last block (derived from the division) till the block length is reached
Characteristics:
* some offer minimal integrity control
* if data length is minor then the ad hoc techniques are preferred such as CTR
* EVEN IF THE PLAINTEXT IS AN EXACT MULTIPLE OF THE BLOCK PADDING MUST BE ADDED TO AVOID ERROR IN THE INTERPRETAZION OF THE LAST BLOCK -> 1≤ L ≤ B -> the biggest padding has to be done when no padding is needed
* attacks also depend from the padding type chosen
Problems:
* the sent/stored data are bigger than the original ones -> a solution could be using CTS
* which value have the padding bits? There are different padding TECHNIQUES
CTR: extended name, formula, characteristics and problems
- Counter Mode
- D < B -> data are small themselves so they are made of N bits -> variable length, a group
characteristics/formula:
- a nonce, a counter and a function are required
- at the sender and at the receiver there is a synchronized register in which a computation f(nonce, counter) is performed. the result is the key and the leftmost bits are taken
- the key is XORed with the plaintext
- no error propagation
- direct access to each block
Example: if we assume that f is the concatenation…
at the sender:
for (x = 0; x < data_size; x++) {
ctext[x] = ptext[x]
XOR
MSB( enc(K, nonce || x) )
}
If CTR is byte-oriented it can be considered a stream algorithm
CTS: extended name, formula, characteristics and problems
Cyphertext Stealing
- it is a technique used when we cannot have encrypted data bigger than the original data or when we cannot use padding
- very important for storage encryption
- the computation time slightly increases
Execution:
- the last block is filled with bytes taken from the second to last block -> the second to last block becomes a partial one
- blocks are encrypted
- the last block and the second to last block positions are exchanged
Which padding techniques are there?
- … 0x00 0x00 0x00 -> if the message length is known
- original DES -> one 1 followed by many 0, …1000000
- one byte with value 128 followed by null bytes, … 0x80 0x00 0x00
- techniques depending explicitly from the data length (L):
- Schneier: … 0x00 0x00 0xL
- SSL/TLS: … 0xL 0xL 0xL
- SSH2: random bytes, … r r 0xL -> equal data give different cypher texts
- IPsec/ESP: progressive number, … 0xL-2 0xL-1 0xL
- bytes with value L-1: … 0xL-1 0xL-1 0xL-1
CTS - example with CBC
slide
Symmetric Stream Encryption Algorithms: characteristics, problems, difference between an ideal and a real algorithm, main algorithm list
They work on a data stream without requiring the split in block, typically they work on a bit or on a byte at a time
- ideal algorithm: one-time pad, that means that a key as long as the message is required
- real algorithms: they use a key that is generated by pseudo-random key generators that are synchronized at the sender and at the receiver. The generators use a shared seed to generate the key. This seed is the main secret shared between sender and receiver.
characteristics and problems:
- very fast
- operations are not complex
- subject to cancellation attack and others
Algorithms: RC4 (old), Salsa20, ChaCha20
Salsa20 and ChaCha20
Symmetric Stream Encryption Algorithms
ChaCha20 is an improvement of Salsa20: more diffusion of bits, faster on some architectures
They are both based on:
- key: 128 or 256
- + nonce + counter
- base operation: ARX (Add, Rotate, XOR). Rotate is not available on all CPUs so if you do not have it by default and you have to implement it it will be slower
Salsa20:
- base function: f(key256, Nonce64, Counter64) = 512-bit kystream block
- O(1) decryption of any block at random
- Salsa20 perfora 20 mixing rounds of the input
ChaCha20:
- 96 bit nonce
- 32 bit block counter -> limit of 2^32 = 256GB 64 bit block
- used by Google and OpenSSH
Minimum secure key length for symmetric and asymmetric algorithms
symm: 256 to have high security, ok 128
asymm (FFC, IFC): 4096 (2048)
asymm (ECC): 512 (256)
Key distribution for symmetric cryptography
- OOB
- by means of key exchange algorithms
Asymmetric cryptography: characteristics, main algorithms
- a.k.a public key crypto
- two different keys, they form a key-pair and they are dependent one from the other -> SK + PK
- one of the keys is used for encryption, the other one for decryption -> they are keys with inverse functionality
- high computational load
- used to encrypt little data such as keys to distribute or to create digital signatures
- main algorithms: DH, RSA, DSA
- encryption with SK -> digital signature
- encryption with PK -> confidentiality
Digital Signature: how is it reached and which security properties does it have
It is reached via asymmetric encryption by using the private key to encrypt the data. Usually the encryption is done on the data summary and not directly on data.
PROVIDES DATA AUTHENTICATION AND INTEGRITY + NON-REPUDIATION
DSA
Digital Signature Algorithm (public key algorithm)
- The security challenge is the discrete logarithm problem, where the attacker sees the result of exponentiation but does not know the base.
- DSA is used exclusively for digital signatures. It employs a one-way lossy compression function, meaning the original plaintext cannot be recovered from the signature
RSA
Rivest - Shamir - Adleman public key algorithm
- Its security is based on the mathematical difficulty of factorizing large numbers, specifically the product of two large prime numbers -> There is no known polynomial-time solution for factorizing very large numbers,
- can be used for both digital signatures and encryption (confidentiality)
Key distribution for asymmetric cryptography: concept and problem
- the private key should never be disclosed
- the public one should be distributed as widely as possible
Problem: who guarantees the binding between the public key and the identity that owns it?
solutions:
1. OOB
2. the public key is distributed by means of a public-key certificate
DH
KEY AGREEMENT ALGORITHM
- first public-key algorithm invented
- A and B choose / agree two public integers p (prime, large) and g (generator) such that:
1 < g < p (typically g=2, 3, or 5) - length of DH key = number of bits of p
- A arbitrarily chooses an integer x>0 and computes: X=g^x mod p
- B arbitrarily chooses an integer y>0 and computes: Y=g^y mod p
- A and B exchange (publish) X and Y
- A computes K_A =Y^x mod p
- B computes K_B =X^y mod p
- result: K_A = K_B = g^{xy} mod p
characteristics and problems:
- resistant to the sniffing attack: if the attacker sniffs A and B he cannot get to know the key because he also needs x and y
- if the attacker can manipulate the data then it is possible to make a man-in-the-middle attack; in this case it requires pre-authentication. One common approach is the use of digital certificates for DH keys. MQV (Menezes-Qu-Vanstone) is an extension of the Diffie-Hellman protocol that incorporates authentication directly into the exchange process, enhancing its security against MITM attacks.
DH MITM attack
Elliptic Curve Cryptography: characteristics and modified algorithms
Instead of using modular arithmetic, the operations are executed on the surface of a 2D (elliptic) curve.
in this way the problem of discrete logarithm on such a curve is more complex than the problem in modular arithmetic and this leads to use shorter keys (about 1/10).
- digital signature = ECDSA
- key agreement = ECDH
- authenticated key agreement = ECMQV
- key distribution = ECIES (EC Integrated Encryption Scheme)
Main concept of integrity
If someone intercepts an encrypted communication he cannot read it but he can modify it -> INTEGRITY IS ABOUT DETECTION
Message digest: what is it, which characteristics must it have
A message digest is a fixed-length “summary” of the message to be protected (of any length)
It must be:
* fast to compute
* impossible or very difficult to invert
* difficult to create “collisions”
The digest is often used to avoid performing operations on the whole message, especially when the message is very large (e.g. because public-key cryptography is very slow)
Digest can be calculated in many ways, but usually via a CRYPTOGRAPHIC hash function
How does a cryptographic hash function work in general?
- split the message M in N blocks M_1 …M_N
- iteratively apply a base function (f)
V_k =f(V_{k-1}, M_k) with V_0 =IV and h=V_N
THE DIGEST IS THE RESULT OF THE LAST ITERATION
Cryptographic hash algorithms
name block digest definition notes
MD2 8 bit 128 bit RFC-1319 obsolete
MD4 512 bit 128 bit RFC-1320 obsolete
MD5 512 bit 128 bit RFC-1321 obsolete
RIPEMD-160 512 bit 160 bit ISO/IEC 10118-3 old + rare
SHA-1 512 bit 160 bit FIPS 180-1 sufficient
SHA-224 512 bit 224 bit FIPS 180-2, FIPS 180-3 good
SHA-256 512 bit 256 bit FIPS 180-2, FIPS 180-3 good
SHA-384 512 bit 384 bit FIPS 180-2, FIPS 180-3 good
SHA-512 512 bit 512 bit FIPS 180-2, FIPS 180-3 good
SHA-3 1152-576 224-512 FIPS 202, FIPS 180-4 excellent
What is aliasing and how is it related to the digest length?
- md1 = h(m1)
- md2 = h(m2)
if m1 ≠ m2 then we’d like to have md1 ≠ md2 to avoid aliasing
If the algorithm is well designed and generates a digest of N bits, then the probability of aliasing is:
P_A = 1 / (2^N_bit)
Thus, digests with many bits are required (because statistical events are involved)
The birthday attack
This is related to the message digest.
A N-bit digest algorithm is insecure when more than 2**(N/2) digests are generated because the probability to have two messages with the same digest is P_A ~ 50%
When do we have a balanced crypto system?
A cryptosystem is “balanced” when the encryption and digest algorithms have the same resistance:
- SHA-256 and SHA-512 have been designed for use respectively with AES-128 and AES-256
- note: SHA-1 (i.e. SHA-160) matched Skipjack-80
KDF (Key Derivation Function)
A cryptographic key must be random (each bit has 50% probability to be 0 or 1) but users typically insert passwords (or better passphrases) guessable and not random.
To solve this problem we can use a key derivation function:
- K = KDF(P, S, I) where
- P = password or passphrase
- S = salt (to make K difficult to guess given P)
- I = number of iterations of the base function (to slow down the computation and make life complex for attackers)
There are some KDF based upon cryptographic hash functions.
An example is PBKDF2: it uses SHA-1, |S| > 64, I ≥ 1000
Another examples is HKDF, based on HMAC
PBKDF2. Can it be attacked?
PBKDF2 can be attacked because the number of iterations may be large but this requires only a lot of computation but not a lot of RAM
Password Hashing Competition (PHC)
PBKDF2 can be attacked because C may be large but this requires only a lot of computation but not a lot of RAM.
Hence it can be attacked by ASIC or GPU -> we need to increase the RAM needed for the attack -> public competition launched in 2013
The winner was Argon2
MAC, MIC, MID
They are codes
- to guarantee the integrity of messages, a code is added to the message: MIC (Message Integrity Code)
- often integrity is not useful without authentication, thus the code (ensuring both security properties) is named: MAC (Message Authentication Code)
- to avoid replay attacks, a unique identifier can be added to the message: MID (Message IDentifier)
How does Authentication by means of keyed-digest work?
Data are sent along with a digest calculated not only on data themselves but also on a shared secret key.
- (sender) d = digest( K, M )
- (transmission) M || d
- (receiver) d’ = digest( K, M )
- (verification) if ( d == d’ ) then OK else ALARM!
Only who knows the key can compare the transmitted digest with the digest calculated on the received data.
Advantages:
- only one operation (digest)
- few additional data
- authentication + integrity
Which properties does a keyed-digest give?
authentication + integrity
HMAC: characteristic, pros and cons
HMAC is a keyed digest that is BASED ON a base hash function.
HMAC provides: integrity through digest and DATA authentication (but not of the sender since anyone could have sent that hash)
base hash function H:
- B byte block
- L byte output, with B > L
definitions:
- ipad = 0x36 repeated B times
- opad = 0x5C repeated B times
- if |K|>B then K’=H(K) else K’=K
- if |K’|<B then K’ is 0-padded up to B bytes
hmac
- H = H(K’ XOR opad||H(K’ XOR ipad||data))
Pros and cons:
+ little data transmitted due to the small size of the digest
+ fast performance
+ confidentiality and authentication with a single hash algorithm
- partial integrity given that collisions can occur