## W4261 Introduction to Cryptography: Fall 2016 Lecture Summaries

These are brief (possibly non-comprehensive) summaries of what was covered, written shortly after each lecture. Readings refer by default to chapters from the required textbook, though they sometimes include pointers to other texts found on the readings webpage, or to materials written by us. The material in class does not always correspond exactly to the material in the textbook: you are responsible for the material taught in class. In particular, the readings pointed to below often contain significantly more details and proofs than were covered in class. Conversely, class occasionally contains material that is not in the textbook (I've tried to indicate when this is the case). Please go over the previous class material (using your notes and textbook as appropriate) before every lecture.
• Lecture 1 (9/6) Introduction to modern cryptography and the goals of this class. Secret Sharing: definition of threshold t-out-of-n secret sharing (correctness and security). Example construction of 2-out-of-2 secret sharing.
Reading: See the following lecture notes on secret sharing.

• Lecture 2 (9/8) Review of secret sharing, and two definitions of security. 2-out-of-2 additive secret sharing. 2-out-of-n secret sharing from 2-out-of-2 secret sharing (proof by reduction, hybrid proof technique). We did not quite finish the proof (will do next time).
Reading: See the following lecture notes on secret sharing.
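
As a rough illustration of 2-out-of-2 additive sharing, here is a minimal sketch. It assumes the secret is an integer modulo a public modulus; the function names and the choice of modulus are hypothetical and not taken from the lecture notes.

```python
import secrets

def share_2_of_2(secret: int, modulus: int) -> tuple[int, int]:
    """Split `secret` (an element of Z_modulus) into two additive shares."""
    if not 0 <= secret < modulus:
        raise ValueError("secret must lie in [0, modulus)")
    s1 = secrets.randbelow(modulus)   # uniform and independent of the secret
    s2 = (secret - s1) % modulus      # on its own, also uniform in Z_modulus
    return s1, s2

def reconstruct_2_of_2(s1: int, s2: int, modulus: int) -> int:
    """Both shares together determine the secret."""
    return (s1 + s2) % modulus

s1, s2 = share_2_of_2(42, 101)
print(reconstruct_2_of_2(s1, s2, 101))
```

Each share alone is uniformly distributed whatever the secret is, which is the intuition behind the security definitions reviewed in this lecture.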

• Lecture 3 by Luke (9/13) Review of 2-out-of-2 additive secret sharing. Finished proof of 2-out-of-n secret sharing from 2-out-of-2 secret sharing (proof by reduction, hybrid proof technique). Introduced some number theory needed for Shamir's secret sharing (t-out-of-n secret sharing scheme). Saw Shamir's secret sharing scheme, proved correctness (through polynomial interpolation theorem presented without full proof) and security (through identical distributions perfect security definition).
Reading: Slides for lecture can be found here. If interested in supplementary material, Rosulek's textbook Chapter 3 covers Shamir's secret sharing, while the book by Cramer, Damgård, Nielsen studies secret sharing in depth (going much beyond what we saw in class). HW1 out on 9/14.
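
Shamir's scheme can be sketched as follows. This is an illustrative sketch only: the field modulus `P` (the Mersenne prime 2^61 - 1), the evaluation points 1..n, and all function names are assumptions made for the example rather than choices from the lecture.

```python
import secrets

P = 2**61 - 1  # a prime; all arithmetic is in the field Z_P (illustrative choice)

def shamir_share(secret: int, t: int, n: int, p: int = P) -> list[tuple[int, int]]:
    """Return n shares (x, f(x)) of a random polynomial f of degree < t with f(0) = secret."""
    if not (1 <= t <= n < p):
        raise ValueError("need 1 <= t <= n < p")
    coeffs = [secret % p] + [secrets.randbelow(p) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):    # Horner evaluation of f(x) mod p
            y = (y * x + c) % p
        shares.append((x, y))
    return shares

def shamir_reconstruct(shares: list[tuple[int, int]], p: int = P) -> int:
    """Lagrange interpolation at x = 0 from t (or more) distinct shares."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = (num * (-xm)) % p
                den = (den * (xj - xm)) % p
        secret = (secret + yj * num * pow(den, -1, p)) % p
    return secret

shares = shamir_share(123456789, t=3, n=5)
print(shamir_reconstruct(shares[:3]), shamir_reconstruct(shares[2:]))
```

Correctness rests on the interpolation theorem: any t points determine the unique polynomial of degree below t, so any t shares recover f(0).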

• Lecture 4 (9/20) Definition of private key encryption (syntax and correctness). Kerckhoffs' principle. Overview of some historical ciphers (Atbash, Caesar, Shift, substitution, Vigenère) and simple attacks on them (exhaustive search, frequency analysis). Motivation and definition of perfect secrecy: two equivalent definitions (we didn't prove equivalence), capturing the intuition that the ciphertext alone contains no information about the plaintext. The shift/substitution cipher is perfectly secret for one-character messages, but not for messages of more than one character. Discussed potential implications of removing 0 from the key space. Defined the one-time pad encryption scheme.
Reading: Ch. 1 (general intro), 2.1,2.2

• Lecture 5 (9/22) Proved that the one-time pad scheme satisfies perfect secrecy. Discussed two inherent issues: First, we showed that the one-time pad cannot be used to send multiple messages securely; this problem is inherent to any (stateless) perfectly secret scheme, but can be fixed by using a stateful encryption algorithm (often not desirable). Second, we proved that for any perfectly secret encryption, the key space must be at least as large as the message space. Discussion and motivation for the computational (rather than perfect) approach, protecting only against "feasible" adversaries, with security except for a "tiny" probability.
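
The one-time pad and the multiple-message problem can be seen in a few lines. This is a minimal sketch over byte strings; the helper names are hypothetical.

```python
import secrets

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def otp_keygen(length: int) -> bytes:
    return secrets.token_bytes(length)      # uniform key, as long as the message

def otp_encrypt(key: bytes, msg: bytes) -> bytes:
    if len(key) != len(msg):
        raise ValueError("key and message must have the same length")
    return xor(key, msg)

otp_decrypt = otp_encrypt                   # XOR is its own inverse

m1, m2 = b"attack at dawn", b"retreat at ten"
key = otp_keygen(len(m1))
c1, c2 = otp_encrypt(key, m1), otp_encrypt(key, m2)

# Reusing the key leaks the XOR of the two plaintexts to an eavesdropper.
print(xor(c1, c2) == xor(m1, m2))
```

The key cancels out of `c1 XOR c2`, so the eavesdropper learns `m1 XOR m2` without knowing the key at all.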

• Lecture 6 (9/27) Brief review of running time, polynomial time, ppt, and negligible definitions; a negligible function times a polynomial is still negligible. Security parameter. Definition of EAV security (indistinguishability for a single message against an eavesdropper). Argued informally that if an EAV-secure scheme existed with a smaller key space than message space, we would have P ≠ NP; thus, proving security of such a scheme is at least as hard as proving P ≠ NP. This is the case for most other cryptographic primitives. In fact we don't even know how to prove their existence if we assume P ≠ NP, but instead we prove security under stronger assumptions (e.g. factoring is hard, or other mathematical assumptions) as we will see. Private key encryption (both EAV security and stronger notions of security) can be shown equivalent to various other primitives (OWF, PRG, PRF, signatures, MAC, ...), some of which we will study (all of them can be constructed from some specific computational hardness assumptions, and all of them imply P ≠ NP). Began motivating pseudorandom generators and how they would help with EAV secure encryption, via a pseudorandom one-time pad.
Reading: 3.1, 3.2.1. Note: our definition of EAV security is the textbook's definition 3.9 - we did not (yet) mention the equivalent def 3.8

• Lecture 7 (9/29) Pseudorandom generators (PRG) definition. Showed that if you can test whether a string is in the image of G or not, then you can distinguish the output of G from uniform. This means that (1) if we required the PRG definition to hold for any distinguisher (not just ppt) then the definition would not be satisfiable (the attack can always be executed in exponential time), (2) if G is such that this attack can be executed efficiently, G is not a PRG, (3) the existence of PRGs implies P ≠ NP. We then went through several candidate constructions of G (e.g. by using an underlying given PRG), and discussed/proved whether or not they can be a PRG.

• Lecture 8 (10/4) Proved that if G is a PRG then G(G(·)) is also a PRG (proof via one hybrid in between the two distributions we need to prove indistinguishable). Discussion of how to extend it to many applications: if you apply G only on the prefix of length n, any polynomial number of applications results in a PRG; if you apply G on the whole output, this is ok for any number of times that results in a polynomial size output (the proof is via a hybrid argument, which we didn't fully present, but it's similar to the mini-hybrid argument that we saw for two applications of G). We concluded that if there is a PRG with any expansion (even one bit expansion), then there's a PRG with any polynomial size expansion. We then saw an example candidate construction of a PRG based on a number theoretic assumption: if p is a prime and g is an element of Zp*, and some conditions on p, g hold, then G(a,b)=(g^a,g^b,g^{ab}) (everything mod p) is believed to be pseudorandom (and is certainly efficient, deterministic, and length expanding). This is called the DDH assumption, which we will revisit later. We then revisited the use of PRGs for encryption, and proved that if PRGs exist, then there's an EAV secure encryption scheme (using a pseudorandom pad).
Reading: 3.3, 7.4.2. The PRG based on DDH is not in the textbook, although the assumption itself (more formally and for general groups) is definition 8.63 in the textbook.
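
The "apply G only on the prefix" extension and the pseudorandom pad can be sketched as follows. The sketch assumes a stand-in `G` built from SHAKE-256 that stretches 16 bytes to 17 bytes; this is only a heuristic placeholder for a PRG (nothing is proven about it), and `stretch`, `encrypt`, and `decrypt` are hypothetical names.

```python
import hashlib

N = 16  # seed length in bytes (illustrative)

def G(seed: bytes) -> bytes:
    """Stand-in PRG with one byte of expansion: N bytes in, N + 1 bytes out."""
    if len(seed) != N:
        raise ValueError("seed must be N bytes")
    return hashlib.shake_256(seed).digest(N + 1)

def stretch(seed: bytes, out_len: int) -> bytes:
    """Re-apply G to the N-byte prefix and emit the extra byte each time."""
    state, out = seed, bytearray()
    for _ in range(out_len):
        block = G(state)
        state, extra = block[:N], block[N:]
        out += extra
    return bytes(out)

def encrypt(key: bytes, msg: bytes) -> bytes:
    """Pseudorandom one-time pad: XOR the message with the stretched key."""
    pad = stretch(key, len(msg))
    return bytes(p ^ m for p, m in zip(pad, msg))

decrypt = encrypt  # XOR with the same pad undoes the encryption

key = bytes(range(N))
print(decrypt(key, encrypt(key, b"attack at dawn")))
```

The scheme is deterministic, so the same key must not be used for a second message.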

• Lecture 9 (10/6) After recalling the EAV-secure scheme based on PRG that we saw last time, we showed that it is not EAV secure for multiple messages. In fact, we proved that there is no (stateless) deterministic encryption scheme that satisfies multiple message (or even two message) EAV security (as long as the message space contains at least two different messages of the same size). We discussed the stateful use of PRG for encrypting multiple messages via stream ciphers, and then moved on to stateless encryption (our default), which must be randomized. We then discussed stronger notions of security, beyond the "ciphertext only" attack that EAV security protects against: known-plaintext attack, chosen plaintext attack (CPA), chosen ciphertext attack (CCA) (and there are more we did not mention). We defined formally CPA security (also known as IND-CPA), and claimed (omitting proof) that CPA security for one message implies CPA security for multiple messages. We then showed that if we could have a secret key encoding a random function from n bits to n bits, we could achieve CPA security by using the value of the function on a random point as a one-time pad. This is not possible to do efficiently (it would require exponential size key), but this motivates pseudorandom functions (PRFs).

• Lecture 10 (10/11) Defined PRFs, gave an example of a function that is not a PRF, discussed the difference between PRF and PRG. We then showed that PRFs imply PRGs, and that PRGs imply PRFs with a small (logarithmic) input size, both via easy/direct constructions (we proved the first and left the proof of the second as an exercise). We then showed the GGM tree-based construction of PRF (with polynomial input size) from PRG, but did not prove it. We discussed why PRFs cannot be used directly as an encryption algorithm (not CPA secure, and not clear how to decrypt). We then defined pseudorandom permutations (PRP) and strong PRP (also called "block ciphers"), which solve the second issue, as they can be inverted (but they are still not CPA secure, so block ciphers can't be directly used as encryption algorithms).
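
The GGM tree walk can be sketched in a few lines. The length-doubling `G` below is a SHAKE-256 stand-in chosen only so the example runs (it is an assumption, not a proven PRG), and `ggm_prf` is a hypothetical name.

```python
import hashlib

N = 16  # seed length in bytes (illustrative)

def G(seed: bytes) -> bytes:
    """Stand-in length-doubling PRG: N bytes in, 2N bytes out."""
    return hashlib.shake_256(seed).digest(2 * N)

def ggm_prf(key: bytes, x: str) -> bytes:
    """Evaluate the GGM function on a bit string such as '0110'.

    The key is the root of a binary tree; each input bit selects the left
    or right half of G applied to the current node.
    """
    node = key
    for bit in x:
        if bit not in "01":
            raise ValueError("input must be a string of 0s and 1s")
        out = G(node)
        node = out[:N] if bit == "0" else out[N:]
    return node

key = bytes(N)
print(ggm_prf(key, "0110").hex())
```

Evaluating on an n-bit input costs n applications of G, even though the tree has 2^n leaves.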

• Lecture 11 (10/13) Proved that PRF implies CPA-secure encryption (we used an encryption scheme that outputs (r, Fk(r) xor m) for a random r). This works for messages whose length is the block size of the PRF. We started discussing encryption of longer messages, by parsing them into blocks (with appropriate padding to have the length a multiple of the block size). Block-by-block encryption remains CPA secure (we didn't formally prove), but ciphertext is long. Discussed ECB mode of encryption (not even EAV secure, do not use!).
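
The scheme (r, Fk(r) xor m) can be sketched as below. HMAC-SHA256 is used as a stand-in for the PRF F purely for illustration; that choice, the 32-byte block size, and the function names are assumptions of this example.

```python
import hashlib
import hmac
import secrets

BLOCK = 32  # block size of the stand-in PRF, in bytes

def F(key: bytes, x: bytes) -> bytes:
    """Stand-in PRF on BLOCK-byte inputs (HMAC-SHA256, an illustrative choice)."""
    return hmac.new(key, x, hashlib.sha256).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))

def keygen() -> bytes:
    return secrets.token_bytes(BLOCK)

def enc(key: bytes, msg: bytes) -> tuple[bytes, bytes]:
    if len(msg) != BLOCK:
        raise ValueError("this sketch handles exactly one block")
    r = secrets.token_bytes(BLOCK)          # fresh randomness per encryption
    return r, xor(F(key, r), msg)

def dec(key: bytes, ct: tuple[bytes, bytes]) -> bytes:
    r, s = ct
    return xor(F(key, r), s)

k = keygen()
m = b"m" * BLOCK
print(dec(k, enc(k, m)) == m, enc(k, m) != enc(k, m))
```

Because r is fresh each time, encrypting the same message twice gives different ciphertexts, which a deterministic scheme cannot do.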

• Lecture 12 (10/18) Modes of encryption: discussed general considerations (security, length of ciphertext, whether or not you need to be able to compute the inverse of the block-cipher/PRF, whether you can do encryption or decryption in parallel for each block at once, and others). Recalled block-by-block and ECB from last time, and showed several others: OFB, CBC, Counter mode. These three are CPA secure (we argued informally why) but we also discussed a stateful slight variant of CBC that is not CPA secure (used to attack SSL 3.0/TLS 1.0). We then switched to constructions of strong PRP from PRF: Defined Feistel networks, and proved that they give a permutation for any choice of round functions and any number of rounds. Showed that one-round Feistel is not a PRP, regardless of the round function. Claimed that 2-round Feistel is also never a PRP - finding the attack was left as an exercise. Luby-Rackoff Theorem: If we instantiate a Feistel network with round functions that are PRFs with independent keys for each round, then after 3 rounds we get a PRP but not a strong PRP, and after 4 rounds we get a strong PRP. (We did not prove this theorem).
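
To see why a Feistel network is a permutation for any round functions, here is a minimal sketch on 32-bit halves. The deliberately weak round functions are made up for the example; none of the names come from the lecture.

```python
MASK = 0xFFFFFFFF  # halves are 32-bit integers (illustrative size)

def feistel_encrypt(round_fns, left: int, right: int) -> tuple[int, int]:
    """One round maps (L, R) to (R, L xor f(R))."""
    for f in round_fns:
        left, right = right, left ^ (f(right) & MASK)
    return left, right

def feistel_decrypt(round_fns, left: int, right: int) -> tuple[int, int]:
    """Undo the rounds in reverse order; f itself is never inverted."""
    for f in reversed(round_fns):
        left, right = right ^ (f(left) & MASK), left
    return left, right

# Deliberately poor round functions: invertibility does not depend on them.
rounds = [lambda x: 0, lambda x: x * x + 7, lambda x: x >> 3]

ct = feistel_encrypt(rounds, 0xDEADBEEF, 0x01234567)
print(feistel_decrypt(rounds, *ct) == (0xDEADBEEF, 0x01234567))
```

With a single round, the left half of the output is just the right half of the input, which is why one round is never a PRP.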

• Lecture 13 (10/20) Showed the attack (distinguisher) on 2 round Feistel (thus, it's not a PRP). Next, we moved on to practical suggested instantiations for PRP, also known as block ciphers. Some general discussion and principles for block ciphers in practice (concrete security with fixed key and block sizes, any attack faster than exhaustive search constitutes a serious weakness, avalanche effect is necessary but far from sufficient). Discussed DES, which follows the Feistel network structure with 16 rounds, with a specific design of the round function, which we showed and discussed. Note that this doesn't fall under the Luby-Rackoff theorem since the round function is not a PRF, and the keys for each round are not independent (however, there are more rounds). Discussed security of DES (remains without any significant practical attacks, but key size and sometimes block size are too small for secure use today). Started discussion of increasing key length for a given block cipher: double encryption (e.g. 2DES) can be attacked.
Reading: 6.2 preamble, 6.2.2, 6.2.3, 6.2.4
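
One standard distinguisher for 2-round Feistel (possibly not the exact one presented in class) uses two queries that share the right half. The sketch below assumes hash-based round functions and a hash-based stand-in for a "random-looking" oracle; all names are hypothetical.

```python
import hashlib

def round_fn(key: bytes, x: int) -> int:
    """Keyed round function on 32-bit values (illustrative stand-in)."""
    d = hashlib.sha256(key + x.to_bytes(4, "big")).digest()
    return int.from_bytes(d[:4], "big")

def feistel2(k1: bytes, k2: bytes, left: int, right: int) -> tuple[int, int]:
    left, right = right, left ^ round_fn(k1, right)
    left, right = right, left ^ round_fn(k2, right)
    return left, right

def looks_like_feistel2(oracle) -> bool:
    """After two rounds the left output is L xor f1(R), so with R fixed the
    XOR of two left outputs equals the XOR of the two left inputs."""
    l1, l2, r = 0, 1, 0
    out1, _ = oracle(l1, r)
    out2, _ = oracle(l2, r)
    return out1 ^ out2 == l1 ^ l2

def random_like(left: int, right: int) -> tuple[int, int]:
    """Stand-in for a random function on 64 bits (not a true permutation)."""
    d = hashlib.sha256(b"rnd" + left.to_bytes(4, "big") + right.to_bytes(4, "big")).digest()
    return int.from_bytes(d[:4], "big"), int.from_bytes(d[4:8], "big")

print(looks_like_feistel2(lambda l, r: feistel2(b"k1", b"k2", l, r)))
print(looks_like_feistel2(random_like))
```

The check always passes for 2-round Feistel regardless of the round functions, while for a truly random permutation it would pass only with probability about 2^-32.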

• Lecture 14 (10/25) Showed triple-blockcipher, to increase security of the block cipher, effectively doubling the key size. For DES, the resulting 3DES is a reasonable choice, although key size 112 bits and block size 64 bits are starting to be a bit too small. We then moved on to AES: high level overview of its history, security (very strong according to all current indications), and construction. Mentioned (without much detail) the underlying idea of Substitution Permutation Networks (different variations of it are used for the round functions in both AES and DES), and the principles of confusion and diffusion, although did not give many details. Overview of how all the primitives we've seen so far fit together. CCA secure private key encryption: motivation (including mention of padding oracle attacks) and definition (mentioning CCA1 and CCA2 versions). Showed that the CPA secure encryption we saw is not CCA secure. Briefly mentioned the notions of malleability / non-malleability / homomorphic encryption.

• Lecture 15 (10/27) Message authentication codes (MACs): discussed motivation and goals (authenticity and integrity). Encryption does not generally provide authentication (saw this with respect to all the encryption schemes we saw so far). Definition of security for MAC (existential unforgeability against adaptive chosen message attacks), including variations of fixed-length MAC and strong MAC (where any valid new pair (m,t) is considered a forgery, even if only t is new and m was already asked before). Noted that if the MAC algorithm is deterministic, then there's a unique tag for each message, and thus standard and strong MAC security definitions are equivalent. Construction of fixed-length MAC by using a PRF, and sketch of security proof.

• Midterm Review by Ghada (11/1)

• Midterm (11/3) In-class, open-book open-notes midterm.

• Lecture 16 (11/10) Domain extension for MAC (using fixed length MAC to build arbitrary length MAC): authenticating block-by-block, including random identifier, length of message, and index of block in each block is secure (high level of proof sketched). Showed attacks on block-by-block authentication if any of these is removed. CBC-MAC (arbitrary length MAC using PRF): showed construction, noted main differences with CBC encryption (no IV and no intermediate values), and stated (without proof) security for fixed length messages (only). Two ways to make it secure for arbitrary length messages: either prepend the length of the message as the first block, or apply one more layer of PRF at the end with a fresh key. (Note: in Lecture 19 we will see hash-based MAC as another alternative).
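
The sketch below illustrates basic CBC-MAC, a well-known forgery when message lengths may vary, and the length-prepending fix. It uses truncated HMAC-SHA256 as a stand-in PRF; that choice, the 16-byte block size, and the function names are assumptions of the example.

```python
import hashlib
import hmac
import secrets

BLOCK = 16

def F(key: bytes, block: bytes) -> bytes:
    """Stand-in PRF on one block (truncated HMAC-SHA256, illustrative)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))

def cbc_mac(key: bytes, msg: bytes) -> bytes:
    """Basic CBC-MAC: no IV, and only the final chaining value is output."""
    if len(msg) == 0 or len(msg) % BLOCK != 0:
        raise ValueError("message must be a positive multiple of BLOCK bytes")
    tag = bytes(BLOCK)
    for i in range(0, len(msg), BLOCK):
        tag = F(key, xor(tag, msg[i:i + BLOCK]))
    return tag

def cbc_mac_len(key: bytes, msg: bytes) -> bytes:
    """Variant that prepends the message length as the first block."""
    return cbc_mac(key, len(msg).to_bytes(BLOCK, "big") + msg)

key = secrets.token_bytes(16)
m = b"A" * BLOCK
t = cbc_mac(key, m)
forged = m + xor(m, t)                  # two-block message built from (m, t)
print(cbc_mac(key, forged) == t)        # the old tag is valid for the new message

t2 = cbc_mac_len(key, m)
print(cbc_mac_len(key, m + xor(m, t2)) == t2)
```

In the forgery, the second block cancels the chaining value, so the final PRF call is again applied to `m`. With the length prepended, the same trick no longer lines up.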

• Lecture 17 (11/15) Authenticated Encryption: discussion and definition -- CCA security and (tailored) unforgeability. Achieving authenticated encryption from underlying schemes of CPA secure encryption and secure MAC: as candidates, we specified algorithms for Encrypt-and-Authenticate, Encrypt-then-Authenticate, Authenticate-then-Encrypt and began their security analysis (proofs were sketched, sometimes in more detail and sometimes less, but not formally written). Encrypt-and-Authenticate: satisfies unforgeability, but is not even CPA secure (do not use). Encrypt-then-Authenticate: saw that if the MAC is not strong, CCA security could be broken (if, given (c,t), it is easy to come up with a valid (c,t') for another t', the adversary can feed that to the decryption oracle and break CCA). However, with a strong MAC, this is CCA secure and unforgeable (authenticated encryption).

Collision resistant hash functions: motivation and definition. Discussed asymptotic definition with a keyed hash function (note that key is known -- no secrets), vs practical unkeyed hash functions, and why proofs of security that yield a collision are meaningful even with unkeyed functions. Mentioned weaker possible definitions: target collision resistance and one-wayness.
Reading: 4.5.1, 4.5.2, 4.5.4, 3.7.2 (several other variations on padding oracle attacks (e.g. POODLE) were launched in practice, if you want to read about them - not part of class material), 5.1.

• Lecture 19 (11/22) Generic attacks on CRHF: brute force (applies even to target collision resistance), and birthday attack which takes time and space about the square root of the brute force (improved versions with efficient space and with time-space tradeoffs exist too, but we did not describe them). This means that we need a longer output (for security against a 2^n adversary, we need to choose output length at least 2n). Merkle-Damgard transform for CRHF domain extension (sketched proof of security). Overview of practical constructions of CRHF: they are typically based on designing a fixed-length CRHF (could be seen as relying on a specially designed underlying block-cipher), and then using Merkle-Damgard or a different way for domain extension. MD5: collisions (including meaningful ones) have been found - do not use. SHA1: vulnerabilities known, but no explicit collision found yet. SHA2 family seems ok. Winner of SHA3 competition by NIST, Keccak, was recently standardized. Sample of applications of CRHF and hash functions more generally. For MAC: Hash-and-MAC (domain extension for MAC), and HMAC overview. Random Oracle Model (ROM) overview. Equality checking (fingerprinting), which has many applications (storing passwords, files, etc). Brief mention of Merkle trees (see textbook for more detail on them).
Reading: 5.1, 5.2, 5.3, 5.4.1, 5.5, 5.6.1, 5.6 (we only covered some of it, and quickly, but you may be interested to read all of it), 6.3
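
The birthday attack is easy to try on a deliberately short hash. The sketch below truncates SHA-256 to 24 bits so that a collision shows up after a few thousand evaluations; the truncation and the function names are choices made for this example only.

```python
import hashlib

def h(data: bytes, out_bytes: int = 3) -> bytes:
    """Toy hash: SHA-256 truncated to out_bytes bytes (24 bits by default)."""
    return hashlib.sha256(data).digest()[:out_bytes]

def birthday_collision(out_bytes: int = 3) -> tuple[bytes, bytes]:
    """Hash distinct inputs, storing every digest, until two inputs collide."""
    seen = {}
    i = 0
    while True:
        msg = i.to_bytes(8, "big")
        digest = h(msg, out_bytes)
        if digest in seen:
            return seen[digest], msg
        seen[digest] = msg
        i += 1

m1, m2 = birthday_collision()
print(m1 != m2, h(m1) == h(m2))
```

With a 24-bit output, brute force needs on the order of 2^24 evaluations to hit a given digest, while the table-based search above is expected to stop after roughly 2^12 of them, at the cost of storing every digest seen so far.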

• Lecture 20 (11/29) Quick review of number theory (the finite group Zn* with respect to modular multiplication is of size φ(n); for any x in the group, x^φ(n)=1 so exponents can be reduced modulo the group order; special case of prime p: Zp* is cyclic, and in poly time we can generate a random n-bit prime p together with a generator g of Zp*; etc.) Discussed the following assumptions over cyclic groups: Discrete Log assumption (DLA), computational Diffie-Hellman assumption (CDH), and decisional Diffie-Hellman assumption (DDH). In any group, if DDH holds then CDH holds, and if CDH holds then DLA holds. The converse is not true in general, and even in specific useful groups where DDH is believed to hold, we don't have a proof that it's equivalent to DLA (thus, DLA is the weakest/best assumption of the three). Diffie-Hellman key exchange, and discussion of its security (the key is indistinguishable from a random group element if the DDH assumption holds). The DDH assumption does not hold in groups of composite order, so we discussed how to choose a group of prime order: start with Zp* where p is a safe prime (p=2q+1 where q is a prime), then use its subgroup of prime order q (the subgroup generated by g^2; it is also the group of quadratic residues QRp). For this group, DDH is believed to hold.
Reading: For number theory background see the relevant parts of Angluin's notes and of appendix B. 8.1.1--8.1.4, 8.3.1--8.3.3, 10.3
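
A toy run of Diffie-Hellman key exchange in the prime-order subgroup QRp might look like this. The safe prime p = 1019 = 2·509 + 1 and the element 4 = 2^2 are illustrative choices that are far too small for any real use; the function names are hypothetical.

```python
import secrets

P = 1019          # toy safe prime: P = 2*Q + 1
Q = 509           # prime order of the subgroup of quadratic residues
G = 4             # 4 = 2^2 is a square other than 1, so it generates that subgroup

def dh_keypair() -> tuple[int, int]:
    """Pick a secret exponent x in Z_Q and publish G^x mod P."""
    x = secrets.randbelow(Q)
    return x, pow(G, x, P)

def dh_shared(private: int, other_public: int) -> int:
    """Raise the other party's public value to our secret exponent."""
    return pow(other_public, private, P)

a, A = dh_keypair()   # first party
b, B = dh_keypair()   # second party
print(dh_shared(a, B) == dh_shared(b, A))
```

Both parties end up with G^(ab) mod P, while an eavesdropper sees only G^a and G^b.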

• Lecture 21 (12/1) Quick number theoretic facts (most without proof) for Zp* (p prime) with a generator g: It is easy to check whether an element in the group has a square root (raise to (p-1)/2 and check whether or not it is 1), and if so the element is called a quadratic residue, and it is easy to compute its square root. The subgroup of quadratic residues QRp is generated by g^2 and its order (size) is half of the size of Zp*, namely (p-1)/2 (if p happens to be a safe prime, then this group is of prime order). We showed why DDH does not hold over the group Zp* with the generator g: g^{xy} is distinguishable from a random group element g^z (where x,y,z are chosen at random), because g^{xy} is more likely to be a quadratic residue than g^z. However, if p is a safe prime and we work in the subgroup QRp, g^{xy} and g^z are distributed identically, and DDH is believed to be true (and thus we can use DH key exchange). Briefly discussed how to go from a random group element (the key selected in the DH protocol) to a random bit string (needed if the key is to be used for symmetric key crypto): use a key derivation function (can be done e.g. using a theoretical tool called a strong extractor, or using a hash function in the random oracle model). Discussed vulnerability of DH key exchange to active attacks such as the "man in the middle" attack, and mentioned that for an authenticated channel we need to handle key distribution (e.g. with a centralized entity like CAs). Introduced public key encryption (PKE), and defined security via indistinguishability. Noted that for PKE, EAV security and CPA security are equivalent definitions (but not CCA). Showed that key exchange in two rounds (single message from each party) is equivalent to PKE (although we did not formally define KE). As a special case of this observation, using DH key exchange we can get a PKE scheme -- this is the El Gamal PKE which we described, and is secure under the DDH assumption.
Reading: same as last lecture, as well as 11.1, 11.2, 11.4.1. We touched upon topics discussed in chapter 10, 11.3, and 13.4.1, but left most of it uncovered.
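
The quadratic-residue test gives a concrete DDH distinguisher in all of Zp*. The sketch below uses the toy prime p = 1019 with generator 2 (both chosen for illustration only) and hypothetical function names.

```python
import secrets

P = 1019   # toy prime; 2 generates all of Z_P* for this P
GEN = 2

def is_qr(a: int) -> bool:
    """a is a quadratic residue mod P iff a^((P-1)/2) = 1 mod P."""
    return pow(a, (P - 1) // 2, P) == 1

def consistent_with_dh(A: int, B: int, T: int) -> bool:
    """g^(xy) is a residue exactly when x or y is even, i.e. when A or B is
    a residue, so a real DH triple always passes this check."""
    return is_qr(T) == (is_qr(A) or is_qr(B))

x, y = secrets.randbelow(P - 1), secrets.randbelow(P - 1)
A, B = pow(GEN, x, P), pow(GEN, y, P)
print(consistent_with_dh(A, B, pow(GEN, x * y, P)))

# For fixed non-residues A, B, only half of all T = g^z pass the check.
A, B = pow(GEN, 1, P), pow(GEN, 3, P)
passing = sum(consistent_with_dh(A, B, pow(GEN, z, P)) for z in range(P - 1))
print(passing, "of", P - 1)
```

A real triple always passes, while a random third element passes only about half the time, which is a non-negligible distinguishing advantage.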

• Recitation by Edo (12/2) Focusing on number theory background and examples.
Reading: Notes from the recitation can be found here. Other reading follows the readings posted for the relevant lectures.

• Lecture 22 (12/6) Review of El-Gamal PKE. Factoring assumption for a product of two large primes, and discussion of its hardness (best algorithm known is super polynomial but subexponential - which means key lengths should be higher than the desired level of security). RSA assumption with respect to the GenRSA experiment to generate (N,e,d) (some variations on how e is chosen in the experiment yield different assumptions, all considered hard). If the factoring assumption is false then the RSA assumption is false, but the other direction is not known (thus, RSA is a stronger assumption). Indeed, factoring is equivalent to finding φ(N) and equivalent to finding the inverse of e, all of which allow taking e-th roots mod N and breaking the RSA assumption, but it's possible that there are other ways to break RSA even if factoring is hard. Textbook (plain) RSA and its insecurity as a PKE (e.g. it is deterministic; there are other attacks as well). The RSA function (raising to power e mod N) is a permutation over ZN*. We discussed informally the concept of a trapdoor permutation (which this is a special case of, assuming the RSA assumption holds): a permutation that is easy to compute and hard to invert, but easy to invert given a trapdoor. Any trapdoor permutation can be used to get a secure PKE, via a hard-core bit (we just mentioned this without details). Specifically for the RSA permutation, it is known that finding the lsb with probability non-negligibly better than 1/2 is equivalent to finding the entire preimage (breaking RSA) with non-negligible probability. This yields a PKE scheme for one bit messages, secure under the RSA assumption: the encryption outputs (r^e, lsb(r) xor m). More efficient is padded RSA (we can prove security when the message is logarithmic, and insecurity when the pad is logarithmic; we do not have a proof or an attack when the message and the pad are the same length). This (and variations) has been used in practice.
Reading: 8.2 (for the starred subsections we just mentioned the takeaway), 11.5.1, 11.5.2, and touched on 13.1. If you are interested (not required for class), you can check here for recommended key lengths.
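
The RSA permutation, the multiplicative weakness of textbook RSA, and the one-bit scheme (r^e, lsb(r) xor m) can be sketched with toy parameters. The primes 61 and 53, the exponent 17, and the function names are illustrative assumptions; real moduli are thousands of bits long.

```python
import secrets
from math import gcd

P_, Q_ = 61, 53                  # toy primes (illustrative only)
N = P_ * Q_
PHI = (P_ - 1) * (Q_ - 1)
E = 17
D = pow(E, -1, PHI)              # the trapdoor: the inverse of E modulo phi(N)

def rsa(x: int) -> int:
    return pow(x, E, N)

def rsa_inv(y: int) -> int:
    return pow(y, D, N)

def enc_bit(m: int) -> tuple[int, int]:
    """Encrypt one bit as (r^e, lsb(r) xor m) for a random r in Z_N*."""
    while True:
        r = secrets.randbelow(N)
        if gcd(r, N) == 1:
            break
    return rsa(r), (r & 1) ^ m

def dec_bit(ct: tuple[int, int]) -> int:
    y, b = ct
    return (rsa_inv(y) & 1) ^ b

# Textbook RSA is deterministic and multiplicative.
print(rsa(7) * rsa(11) % N == rsa(77))
print(dec_bit(enc_bit(1)), dec_bit(enc_bit(0)))
```

The first line printed shows that the product of two ciphertexts is a ciphertext of the product, one of several reasons plain RSA is not a secure PKE.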

• Lecture 23 (12/8) Review of RSA permutations and ways to add randomness to them, towards getting a secure PKE based on the RSA assumption. In addition to the ones from last lecture, we suggested a hashing based scheme, where Enc(m)=(r^e, H(r) xor m) for a random r. This is CPA secure in the random oracle model (ROM). We talked about CCA security, showing CCA attacks on textbook RSA and then on all the other PKE candidates we saw so far. We mentioned OAEP, based on a two-round Feistel structure with two different hash functions as round functions. OAEP is provably CCA secure in the ROM (when the two hash functions are modeled as independent random functions). We also mentioned that there are CCA secure PKE schemes in the standard model based on the RSA assumption, as well as based on DDH (Cramer-Shoup) and other number theoretic assumptions. We noted that typically hybrid encryption is used, starting with PKE to obtain a symmetric key, and then using private key encryption. Briefly mentioned the notion of KEM/DEM for hybrid encryption, and mentioned that the hashing based scheme mentioned above can be used as part of a CCA secure KEM/DEM scheme in ROM. Defined signature schemes and their security. Contrasted signatures with MACs (including public verifiability, transferability and non-repudiation). Showed textbook RSA signatures and their insecurity. Existing approaches to constructing secure signatures: (1) heuristic constructions, loosely based on number theoretic assumptions, (2) constructions provably secure in ROM based on number theoretic assumptions, (3) constructions provably secure in the standard model from (a) number theoretic assumptions, (b) CRHF, (c) primitives like PRF/PRG etc (captured by "one-way functions"). The latter also means that in this sense digital signatures belong to the "private key world" (you don't need a trapdoor or a PKE to obtain signatures).
We gave one example of a construction of signatures, falling under category (2): the RSA full domain hash (RSA-FDH), provably secure in the ROM under the RSA assumption. A note on signatures vs PKE: the notion that they are "dual/opposite" and that "to sign, you decrypt, to verify you encrypt" is wrong (often does not make sense even syntactically, and when it does it's not secure; e.g. this notion can be applied to textbook RSA encryption and textbook RSA signatures, but that is not secure for either signatures or encryption). Wrapped up the class with an overview of some more advanced things happening in cryptography.
Reading: We gave no proofs in this lecture, and stayed on a high overview level. The textbook has many more details, which can be found in 11.3, 11.5, 12.1-12.4
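
A toy sketch contrasting textbook RSA signatures with the hash-then-sign shape of RSA-FDH follows. The modulus 3233 = 61·53, the exponent 17, and the use of SHA-256 reduced mod N as the hash are all illustrative assumptions; in particular, reducing SHA-256 mod a toy N only mimics a full-domain hash, and the names are hypothetical.

```python
import hashlib

N, E = 3233, 17                      # toy public key (N = 61 * 53)
D = pow(E, -1, 60 * 52)              # signing exponent, kept secret

def textbook_sign(m: int) -> int:
    return pow(m, D, N)

def textbook_verify(m: int, sig: int) -> bool:
    return pow(sig, E, N) == m % N

def H(msg: bytes) -> int:
    """Stand-in for a hash onto Z_N (modeled as a random oracle in the proof)."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % N

def fdh_sign(msg: bytes) -> int:
    return pow(H(msg), D, N)

def fdh_verify(msg: bytes, sig: int) -> bool:
    return pow(sig, E, N) == H(msg)

# Forgery on textbook RSA using only the public key: choose the signature
# first, then compute the message it happens to sign.
forged_sig = 1234
forged_msg = pow(forged_sig, E, N)
print(textbook_verify(forged_msg, forged_sig))

sig = fdh_sign(b"hello")
print(fdh_verify(b"hello", sig), fdh_verify(b"hellp", sig))
```

With hashing in place, the same trick would require finding a message whose hash equals a chosen value, which a random oracle makes infeasible at real parameter sizes (though not for this toy modulus).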