W4261 Introduction to Cryptography:
Spring 2022 Lecture Summaries
Lecture notes from class are uploaded to courseworks (under "files")-- both
sections are coordinated, and a single set of lecture notes
(lightly edited) is uploaded after class. Video recordings of
class are also available on courseworks.
Below are brief summaries written shortly after each lecture, of
what was covered, together with recommended (and in some cases
required) readings.
Readings refer by default to chapters from the
required textbook, though sometimes may include pointers to other
texts found on the readings page, or
to handouts written by us. The material in class does not
correspond exactly to the material in the textbook. Often the
readings below contain significantly more details and proofs than
covered in class. Conversely, class sometimes contains material
that is not in the textbook (I'll try to indicate when this is the
case).
You are (only) responsible for (all) the material taught in class,
any readings explicitly marked below as required (even if not
covered in class), and anything covered by our homework.
It will be assumed that before each lecture you carefully go
over the previous lecture and required readings (if any).
- Lecture 1 (1/18)
Introduction to modern cryptography, overview of this class.
Definition of private key encryption scheme syntax and
correctness. Kerckhoffs' principle. Overview of some basic
historical ciphers (Atbash, Caesar, shift) and simple attacks
(brute force, frequency analysis). Motivation for rigorous
definition of security and discussion of what secrecy for
private key encryption should mean
(assuring that the key is hard to guess or that the message is hard
to guess are not sufficient for security;
definition of secrecy should work for any message distribution or
prior knowledge of the adversary).
Reading: Chapter 1.
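The shift cipher and its brute-force attack mentioned above can be sketched in a few lines (a toy illustration; the function names are mine, not the course's):

```python
# Toy shift (Caesar-style) cipher over the lowercase alphabet, with the
# brute-force attack: only 26 possible keys, so an attacker can try them all.

def shift_encrypt(key: int, plaintext: str) -> str:
    """Shift each letter forward by `key` positions (mod 26)."""
    return "".join(chr((ord(c) - ord("a") + key) % 26 + ord("a")) for c in plaintext)

def shift_decrypt(key: int, ciphertext: str) -> str:
    return shift_encrypt(-key % 26, ciphertext)

def brute_force(ciphertext: str) -> list[str]:
    """Return all 26 candidate plaintexts; a human (or frequency
    statistics) can then pick out the meaningful one."""
    return [shift_decrypt(k, ciphertext) for k in range(26)]

ct = shift_encrypt(3, "attackatdawn")
candidates = brute_force(ct)
```

The tiny key space is exactly why the lecture stresses that a rigorous security definition cannot just require "the key is hard to guess".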
- Lecture 2 (1/20)
Discussion leading to two equivalent definitions of perfect
secrecy, capturing the intuition that the ciphertext
contains no new information about the plaintext (we did not
prove equivalence of the definitions).
Proved that the shift cipher is not perfectly secret.
Defined the one-time pad (OTP) encryption scheme and
proved that it is perfectly secret.
Discussed two problems with OTP, related to efficiency and security.
First, it requires the keys to be as long as the messages.
Second, it cannot be used more than one time: we defined
perfect secrecy for encryption of two messages, and claimed
that OTP does not satisfy it. We mentioned (very briefly and
informally - we will discuss later) the stronger notions of
known plaintext and chosen plaintext attacks (that OTP also
fails to achieve).
We mentioned (and will discuss again next time) that both
these problems are in fact inherent to every perfectly secret
scheme: any such scheme must have a key space as large as the
message space, and no such scheme can be perfectly secret for
two messages.
Reading: 2.1-2.3
Required reading: if you need to brush up on asymptotic
notation, ppt algorithms, and negligible functions, do so now!
(see section 3.1, including definition 3.4 of negligible
functions, and appendix A of the textbook; you can also find
more sources in
the background section
here).
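Both OTP itself and the "cannot use the key twice" problem discussed above can be seen concretely (a toy sketch over bytes; not the course's own code):

```python
import secrets

# One-time pad over byte strings, plus the "two-time pad" problem:
# reusing a key leaks the XOR of the two plaintexts, since the key
# cancels out.

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def otp_encrypt(key: bytes, m: bytes) -> bytes:
    assert len(key) == len(m)          # key must be as long as the message
    return xor(key, m)

otp_decrypt = otp_encrypt              # decryption is the same XOR

m1, m2 = b"attack at dawn", b"retreat now!!!"
k = secrets.token_bytes(len(m1))       # fresh uniform key
c1, c2 = otp_encrypt(k, m1), otp_encrypt(k, m2)   # same key reused -- bad!
leak = xor(c1, c2)                     # equals m1 XOR m2
```

The assertion in `otp_encrypt` also reflects the first problem noted above: the key must be as long as the message.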
- Lecture 3 (1/25) Proved that every perfectly secret
scheme must have a key space at least as large as the message
space (so, if these spaces consist of all strings of a
certain length, the key length cannot be shorter than the
message length). Discussed (without showing proof) how no
(stateless) encryption scheme can be perfectly
secret for more than one message. We also mentioned that
deterministic schemes can't even be computationally secret for
more than one message. Discussed general principles for defining
cryptographic security, motivated computational security and our
asymptotic approach (security only against polynomial time
adversaries, and allowing negligible probability of adversary
success). Definition of EAV-security (indistinguishability
in the presence of an eavesdropper--single ciphertext only attack).
Noted that if we remove the restriction on the adversary's
running time, and set the advantage to 0 (rather than
negligible), this would become another equivalent definition of
perfect secrecy. Mentioned that we don't know how to prove that
an EAV-secure encryption scheme exists -- if we could prove it,
we would have a proof that P is not equal to NP. This is why
assumptions are necessary.
Reading: 3.1, 3.2.1. (you may be interested in
3.2.2, but we will not cover it).
Reading ahead (if desired): 3.3.
- Lecture 4 (1/27) Background and motivation for
cryptographic assumptions and complexity:
we outlined (high level only, no
proof) why having an EAV-secure
encryption scheme would imply P ≠ NP, and thus proving that
such a scheme exists is beyond our reach today. Instead, we rely
on assumptions. Cpnstructing secure private key encryption from
the minimal assumption P ≠ NP is not known and is a major
open problem. We know that the existence of private key
encryption is equivalent to many other primitives (e.g.: OWF, PRG,
PRF, Block ciphers, MAC, Signature schemes, commitment schemes,
ZK proofs - we will talk about some of these primitives
later). In turn, these primitives can be constructed from
mathematical assumptions (such as number-theoretic ones) or
engineering designs. Next we moved to discussing and defining
pseudorandom generators (PRG). We showed that if an algorithm
can test whether a string is in the co-domain of G or not,
then it can be used to distinguish the output of G from
uniform. Thus, if this can be done efficiently, G is not a PRG
(and for every G, there's an exponential time
distinguisher). Gave some simple examples of constructions and
whether they are necessarily PRG or not, along the way
discussing how to prove a G is or is not a PRG, and more
generally proofs by reduction [in one of the sections we will
finish this discussion next time].
Reading: 3.3.1, 3.3.2
Reading ahead (if desired): 3.3.3.
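The co-domain test described above can be made concrete with a deliberately bad candidate G (my own toy example, not one from class): here G appends the parity of the seed, so only half of all strings of the output length are possible outputs, and a distinguisher that checks the parity relation has advantage about 1/2.

```python
import random

# A candidate "G" that is easy to distinguish from uniform: it expands
# n bits to n+1 bits by appending the seed's parity. The distinguisher D
# outputs 1 iff the parity relation holds -- always true on G's outputs,
# true only half the time on uniform strings.

def G(s: list[int]) -> list[int]:
    return s + [sum(s) % 2]

def D(w: list[int]) -> int:
    """Output 1 iff w could be an output of G (last bit = parity of rest)."""
    return int(w[-1] == sum(w[:-1]) % 2)

def advantage(n: int, trials: int = 2000) -> float:
    rng = random.Random(0)
    pr_G = sum(D(G([rng.randrange(2) for _ in range(n)])) for _ in range(trials)) / trials
    pr_U = sum(D([rng.randrange(2) for _ in range(n + 1)]) for _ in range(trials)) / trials
    return abs(pr_G - pr_U)
```

The estimated advantage is close to 1/2 -- far from negligible, so this G is not a PRG.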
- First review session (Friday 1/28 2pm, over zoom): led by
Asif and Zeyu (Thomas).
- Lecture 5 (2/1) Reviewed structure of cryptographic
proofs by reduction. Proved that if G is a PRG
then G(G(·)) is also a PRG. Our proof used the hybrid method
(with one hybrid in between the two distributions we need to
prove indistinguishable), which is a very common technique
(although in this case there are also simpler proofs).
We showed that if there's a PRG with
any expansion (even one bit), then there's a PRG with any
polynomial expansion (we sketched the proof, which generates
one bit at a time, sometimes called a stream cipher).
We proved that if a PRG exists, then there's an EAV-secure
encryption scheme (based on a pseudorandom one-time pad). The
proof was (as usual) via a reduction.
Reading: 3.3. The construction increasing the expansion is
described and proved in 8.4.2 (the proof is not part of the
class material).
- Lecture 6 (2/3) Quick review of PRGs, discussing
a common quiz mistake, and the fact that pseudorandomness of PRG output is
meaningful only for long enough input. Also reviewed the
construction of EAV-secure encryption from PRG we saw last time
("pseudorandom OTP"), and discussed how it is not secure for
multiple message encryption. Instead, it can be used in a
stateful mode (encryption via stream cipher), but for stateless
encryption with security against multiple messages, we cannot
have deterministic encryption.
Defined CPA security. Claimed (skipping proof) that any CPA
secure encryption scheme must have super polynomially many
possible encryptions for any given message with a given key, and
that any CPA secure encryption scheme is also CPA secure for
multiple messages. Mentioned that if we could have a
random function as a secret key, we could use it towards
building CPA secure encryption (discussed intuitively, we will
see exactly how later). However, a random function takes
exponential length to describe, so can't be used. This is one of
the motivations for pseudorandom functions (PRF) that we
defined.
Reading: 3.4, 3.5.1.
- Lecture 7 (2/8) Reviewed notion of PRF and how it
compares to the notion of PRG. An example of proving a function
is not a PRF. We proved that PRF implies a PRG, and noted that
this corresponds to a common way in practice to construct stream
ciphers from block ciphers (a primitive we will discuss
later). We also showed that PRG implies a PRF, via the GGM
construction (we showed the construction but not the proof).
Discussed why a PRF directly applied to the message is not a
good encryption scheme (e.g., it's deterministic) - we will see
how to use PRFs to construct a (randomized) CPA secure encryption next
time.
Reading: 3.5.1. 3.6.1 has the construction of PRG from PRF
(although it is in the context of stateful encryption with
stream ciphers, which they define formally and we didn't).
8.5 has the GGM construction of PRF from PRG.
- Lecture 8 (2/10)
Reviewed PRF definition, and defined pseudorandom permutation
(PRP) and strong PRP. The term "blockcipher" typically refers to
a (strong) PRP. Defined Feistel networks, in which each round
(and thus any number of rounds) is always a permutation
(efficiently computable and invertible, for any efficiently
computable round function). Stated (without proof) the
Luby-Rackoff theorem: if we instantiate a Feistel network with
round functions that are PRF with independent key for each
round, then after three rounds we get a PRP (but not a strong
PRP), and after four rounds we get a strong PRP (one or two
rounds do not give a PRP). As a consequence, PRF implies PRP.
We then went back to showing that given a PRF, we can construct
a CPA-secure encryption scheme for fixed length messages (the
length is the output block size). We proved the security of the
constructed CPA secure encryption scheme.
Reading: 3.5.1, 7.2.2, 8.6, 3.5.2
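The key point above -- a Feistel round is a permutation for *any* round function, even a non-invertible one -- is easy to verify in code (a toy sketch; the round functions here are arbitrary, not keyed PRFs):

```python
# One Feistel round maps (L, R) -> (R, L xor f(R)); it is inverted by
# (L', R') -> (R' xor f(L'), L'), regardless of whether f itself is
# invertible. Multiple rounds compose into a permutation as well.

def feistel(round_fns, L: int, R: int):
    for f in round_fns:
        L, R = R, L ^ f(R)
    return L, R

def feistel_inverse(round_fns, L: int, R: int):
    for f in reversed(round_fns):
        L, R = R ^ f(L), L
    return L, R

# any functions at all work as round functions -- even non-injective ones
rounds = [lambda x: (x * 31 + 7) & 0xFF,
          lambda x: (x ** 2 + 1) & 0xFF,
          lambda x: ~x & 0xFF]
L, R = feistel(rounds, 0xAB, 0xCD)
```

With PRF round functions (and independent keys), Luby-Rackoff says three such rounds give a PRP and four give a strong PRP.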
- Second review session (Saturday 2/12 4:30pm, Mudd 633): led
by Cassandra and Swapnil.
- Lecture 9 (2/15)
Reviewed the CPA secure encryption from PRF, and discussed the
fact that its security depends on the (input) block size of the
PRF being long enough (so that in polynomially many queries,
there's negligible probability of repetition). Other
applications of PRF may or may not require long enough block
size (but they always require long enough key). Discussed how to
use PRF (or block-ciphers) to
encrypt shorter messages (by padding to block length), and how
to use them to encrypt longer messages. Towards that end, we
claimed that if there's CPA secure encryption
for a fixed message size, block-by-block encryption remains CPA
secure for longer messages (proof is by a hybrid argument, but
we did not show it). Discussed modes-of-encryption, which are
ways to use PRF/PRP/block ciphers towards encrypting longer
messages, with better efficiency (specifically, better ciphertext
length) than the naive way to encrypt block-by-block, while
maintaining other desirable properties (plus CPA
security). Mentioned ECB mode (not even EAV secure -- do not
use!), CBC mode and CTR mode (both CPA secure). We also
mentioned that while CTR mode can be used in a stateful way,
using CBC mode in a stateful way makes it no longer CPA
secure, which was used to launch a successful attack ("BEAST")
on SSL/TLS in 2011 (lesson: use crypto primitives correctly --
only as intended and proved). Discussed intuition of what is
needed for a mode-of-encryption to maintain CPA security.
Reading: 3.5.2, 3.6.3.
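Randomized CTR mode, one of the CPA-secure modes mentioned above, can be sketched as follows (HMAC-SHA256 stands in for the PRF F_k here -- my modeling choice for the sketch, not the course's instantiation):

```python
import hmac, hashlib, secrets

# Randomized CTR mode: pick a fresh random counter ctr, and output
# (ctr, m[0] xor F_k(ctr), m[1] xor F_k(ctr+1), ...). Decryption
# regenerates the same pad from ctr. HMAC-SHA256 plays the PRF role.

BLOCK = 32  # SHA-256 output size in bytes

def F(k: bytes, x: int) -> bytes:
    return hmac.new(k, x.to_bytes(16, "big"), hashlib.sha256).digest()

def ctr_encrypt(k: bytes, m: bytes) -> tuple[int, bytes]:
    ctr = secrets.randbelow(2 ** 64)   # fresh random starting counter
    pad = b"".join(F(k, ctr + i) for i in range(-(-len(m) // BLOCK)))
    return ctr, bytes(a ^ b for a, b in zip(m, pad))

def ctr_decrypt(k: bytes, ct: tuple[int, bytes]) -> bytes:
    ctr, body = ct
    pad = b"".join(F(k, ctr + i) for i in range(-(-len(body) // BLOCK)))
    return bytes(a ^ b for a, b in zip(body, pad))
```

Security hinges on the counter ranges of different encryptions not overlapping except with negligible probability, which is why the PRF's input block size must be long enough.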
- Lecture 10 (2/18) Reviewed where we are, and gave
a high level overview of some practical blockcipher
constructions: discussed design paradigms, concrete security for
fixed key and block ciphers, etc. Discussed DES structure and
its security: remains secure without significant practical
attacks (could be assumed to be a strong PRP), but since key
size and sometimes block size are too small for use today,
should not be considered secure as-is. Discussed how to increase
effective key length for a given blockcipher: 3DES effectively
doubles key size (no known non-trivial attacks), but 2DES does
not (susceptible to meet-in-the-middle attacks), although
composing PRP twice is still a PRP (so 2DES is still at least as secure
as DES - but 3DES has double key-size security). Discussed AES
structure and its security (very strong according to all current
indications). Lesson from last two lectures: do not design your
own block cipher -- use AES (or 3DES). Moreover, when you use it
(or any other cryptographic primitive) as a building block in
another construction, use it correctly, as intended, in a way
that was proved secure assuming the block cipher is strong PRP.
Reading: 7.2.3, 7.2.4, 7.2.5.
Required reading: Read carefully either Chapter 7 or
Chapter 8 of the textbook (covering practical and theoretical
constructions of private key cryptographic primitives,
respectively). Note that in class we have/will have covered
some parts of both these chapters -- you are responsible for
everything covered in class, plus everything else in one of
these chapters.
- Lecture 11 (2/22) Gave a high level overview of
theoretical constructions of private key primitives: defined
one-way functions (OWF) and talked about their importance as a
minimal building block in cryptography. Defined hard-core bit
(HCB) and mentioned the Goldreich-Levin theorem, which shows
how to get a HCB from every OWF. We also defined one-way
permutations (OWP) and sketched how OWP imply PRG (mentioning
that OWF also imply PRG - we didn't show how).
We then switched to a new goal: authenticity/integrity,
rather than secrecy. Showed that encryption does not
necessarily provide authentication. Motivated and defined
message authentication codes (MACs). Defined security of a MAC
(existential unforgeability against adaptive chosen message
attacks), including the variations of fixed-length MAC and
strong MAC (noting that if the MAC algorithm is deterministic
then there's a unique tag for each message and standard MAC
implies strong MAC in this case).
Reading: 8.1-8.4 (although we only gave a high level overview
of these parts, skipping proofs and many other details).
4.1, 4.2.
- Lecture 12 (2/24) MAC review and
discussion, explained why replay attacks are not handled by the
MAC definition, and that they should be protected at the level
of the message (mentioned approaches using counters or time
stamps). Showed a construction of secure fixed-length MAC by
using a PRF and proved its security. Discussed how to use a
fixed-length MAC to get a MAC for longer messages: simply
authenticating each block separately and some more
sophisticated variations do not work (we discussed
attacks based on reordering blocks, mixing blocks from
different messages, and truncating blocks). But a secure MAC
can be obtained with a block-by-block approach if we include in
each block also a random message identifier, length of message,
and the index of the block. This results in a long tag, while
there are much more efficient schemes (with a single-block tag
even for long messages). We showed one of them: CBC-MAC, which
is secure for fixed-length messages (arbitrarily long but just
one length, agreed upon and fixed in advance). Pointed
out the main differences with CBC encryption (no IV and no
intermediate values), which are essential for security. Two
ways to augment CBC-MAC to be secure for arbitrary length
messages: either prepend the length of the message as a first
block, or apply an additional layer of PRF at the end with a
fresh key. (We will see hash-based MAC later as another
efficient and secure alternative).
Notes: We did not prove security for any of the above longer
message MACs. In one of the
sections we did not cover CBC-MAC - will do so next time.
Reading: 4.2, 4.3, 4.4.1
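Basic CBC-MAC for fixed-length messages, as described above, can be sketched like this (HMAC-SHA256 truncated to one block stands in for the block cipher F_k -- an assumption of this sketch, not the course's instantiation):

```python
import hmac, hashlib

# CBC-MAC: chain the blocks as in CBC mode, but with the "IV" fixed to
# zero and outputting ONLY the final value -- the two deliberate
# differences from CBC encryption that are essential for security.

BLOCK = 16

def F(k: bytes, x: bytes) -> bytes:
    return hmac.new(k, x, hashlib.sha256).digest()[:BLOCK]

def cbc_mac(k: bytes, m: bytes) -> bytes:
    assert len(m) % BLOCK == 0         # fixed-length, whole blocks only
    t = bytes(BLOCK)                   # IV fixed to 0^n
    for i in range(0, len(m), BLOCK):
        block = m[i:i + BLOCK]
        t = F(k, bytes(a ^ b for a, b in zip(t, block)))
    return t                           # only the last value is the tag

def verify(k: bytes, m: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(cbc_mac(k, m), tag)
```

As stated above, this is only secure when the message length is fixed in advance; prepending the length as a first block, or re-keying a final PRF layer, extends it to arbitrary lengths.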
- Third review session (Monday 2/28 4:00pm, Mudd 524): led
by Miranda and Sandip.
- Lecture 13 (3/1) Reviewed some quiz problems, and
CBC-MAC. Defined chosen ciphertext attack (CCA) security, and
mentioned the weaker CCA1 ("lunchtime attack")
version. Discussed motivation, including a high level discussion of
the following. First, padding oracle attacks (a
special case of CCA attacks), which were successfully used on
several practical systems. Second, non-malleability (given an
encryption of a message, cannot get an encryption of a related
message), which is well motivated in many applications (while in
other applications we may want the opposite, "fully homomorphic
encryption", but we did not discuss or define this). CCA
security implies non-malleability.
The encryption schemes we've seen so far are not CCA secure (we
showed an attack on the one where ciphertexts are of the form
(r,Fk(r) xor m)). We claimed that if Fk is a strong PRP, an
encryption scheme where ciphertexts are of the form Fk(r||m) for
a random r, is CCA secure. However, we will next construct a
better CCA secure scheme (one that will also achieve
authentication). Started discussion and motivation for
authenticated encryption.
Reading: 4.4.1, 5.1, 5.2.
- Lecture 14 (3/3) Defined authenticated encryption
(an encryption scheme satisfying both CCA security and an
appropriate version of unforgeability). Given secure
encryption scheme and MAC scheme, how do we construct an
authenticated encryption scheme? We discussed three general
approaches (all were used in real
systems). Encrypt-and-Authenticate: satisfies unforgeability (we
showed the reduction) but not even CPA secure (do not
use). Authenticate-then-encrypt: satisfies unforgeability, as
well as maintaining the encryption security (CPA or CCA) of the
underlying encryption scheme. Encrypt-then-authenticate:
satisfies unforgeability. If the underlying MAC is not a strong
MAC, it is vulnerable to CCA attack. However, if the underlying
MAC is strong (e.g., any deterministic MAC), and the underlying
scheme is CPA secure, this is CCA secure (hence, good
authenticated encryption). Thus, a strong MAC can be used
to upgrade security of private key encryption from CPA to
CCA. As a rule of thumb, whenever active adversaries are a
concern, you should always authenticate as a last step / have
your algorithms start with verifying a MAC on the received
message, before using other cryptographic operations like
decryption.
Reading: 5.2, 5.3.1.
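The recommended Encrypt-then-authenticate composition can be sketched as follows (the stream-style encryption via a SHAKE-derived pad and HMAC-SHA256 as the strong deterministic MAC are stand-ins chosen for this sketch, with independent keys for the two roles):

```python
import hmac, hashlib, secrets

# Encrypt-then-authenticate: encrypt first, then MAC the *ciphertext*
# with an independent key. On receipt, verify the MAC before doing
# anything else -- in particular, before decrypting.

def _pad(k: bytes, nonce: bytes, n: int) -> bytes:
    return hashlib.shake_256(k + nonce).digest(n)

def ete_encrypt(ke: bytes, km: bytes, m: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    body = nonce + bytes(a ^ b for a, b in zip(m, _pad(ke, nonce, len(m))))
    return body + hmac.new(km, body, hashlib.sha256).digest()  # tag over ciphertext

def ete_decrypt(ke: bytes, km: bytes, ct: bytes):
    body, tag = ct[:-32], ct[-32:]
    if not hmac.compare_digest(hmac.new(km, body, hashlib.sha256).digest(), tag):
        return None                    # reject before any decryption work
    nonce, enc = body[:16], body[16:]
    return bytes(a ^ b for a, b in zip(enc, _pad(ke, nonce, len(enc))))
```

Rejecting tampered ciphertexts before decryption is exactly what makes CCA-style decryption queries useless to the adversary.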
- Lecture 15 (3/8) Finished discussion of
authenticated encryption from last time.
Collision resistant hash functions (CRHF): motivation and
definition (for fixed length and arbitrary length). Several
informal discussions: asymptotic keyed definition vs practical
unkeyed hash functions (note that in any case there's never a
secret key - adversary knows everything about the
function). Placing this primitive "in between" private key and
public key crypto worlds (public key since there's some
evidence it's stronger than OWF, and since it does not have
any secret key; secret key in terms of the techniques used in
practice, which are more in line with private key /symmetric
key primitives, and extremely efficient). Generic attacks on
any hash function: "birthday" attack takes time about square
root of brute force attack. This means we need a longer output
(for security against 2^n attack, we need to choose output
length at least 2n). Mentioned some weaker notions of security
for hash function: second preimage resistance/target
collision resistance), preimage resistance/one-wayness, and
mentioned the much stronger notion of random oracle (we'll
discuss next time).
Reading: 5.3.1, 6.1, 6.4.1
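The birthday attack's square-root bound can be observed experimentally (SHA-256 truncated to 24 bits serves as the intentionally weak hash here -- my choice for the experiment):

```python
import hashlib

# Birthday attack on a truncated hash: for a b-bit output, a collision
# appears after roughly 2^(b/2) random inputs, not the ~2^b of brute
# force. With b = 24 we expect a collision after ~2^12 ~ 4096 hashes.

def H(x: bytes, bits: int = 24) -> int:
    """SHA-256 truncated to `bits` bits -- an intentionally weak hash."""
    return int.from_bytes(hashlib.sha256(x).digest(), "big") >> (256 - bits)

def birthday_collision(bits: int = 24):
    seen = {}
    i = 0
    while True:
        x = i.to_bytes(8, "big")
        h = H(x, bits)
        if h in seen:
            return seen[h], x, i + 1   # colliding pair and #hashes used
        seen[h] = x
        i += 1

x1, x2, queries = birthday_collision()
```

This is why, as noted above, security against 2^n-time attackers requires an output length of at least 2n bits.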
- Lecture 16 (3/10) Merkle-Damgard transform for
domain extension (from fixed to variable length CRHF).
Discussed some applications of hash functions (informally and
without proofs): for MAC domain extension, hash-and-MAC is
secure (using any CRHF and fixed length MAC of matching
lengths). To build MAC directly for hashing alone, can use
HMAC, a standardized, provably secure, and very efficient
design, widely used in practice (we did not show how).
Hash functions are used for fingerprinting/equality checking
in many different contexts, such as deduplication, or
verifying integrity of a downloaded file. Password hashing.
A Merkle tree built from a CRHF
constitutes a CRHF if the number of leaves is fixed.
Blockchain (hashing used for fingerprinting previous
blocks/proving integrity of chain, where we need collision
resistance, as well as for proof of work/puzzle solving,
where we need a stronger property). Random oracle model
discussion. Quick review of CRHF in practice: MD5 and SHA1
should not be used (collisions found, including meaningful
collisions). SHA3 - the standardization of Keccak, winner of
the NIST competition - is considered secure.
Reading: 6.2, 6.3, 6.5, 6.6.1, 6.6.2, 6.6.3, 7.3.2, 7.3.3 (but
we skipped many details).
Background reading for the next part of class (recommended):
Appendix B and 9.1 without the starred parts, 9.2.1, 9.3.1.
- Lecture 17 (3/22)
Some motivation and discussion followed by a quick (and
superficial) review of some facts from number and group theory
(without proofs): finding gcd and inverses mod N. Groups,
including the special cases of (ZN,+) wrt modular addition
(identity 0), and (ZN*,×) wrt modular multiplication (identity 1).
Gave the formula for φ(N)=|ZN*|.
If a finite group is of order q, then for every element x in the
group x^q=1 (so exponent can be reduced mod q). In particular
for any N, and any x in ZN*, x^φ(N)=1 mod N (special
case for a prime p: x^(p-1)=1 mod p).
Defined cyclic groups. The following are cyclic:
(a) (ZN,+) for any N, (b) (Zp*,×) for a
prime p, (c) any group of prime order. There is an efficient
algorithm that (on input 1^n) selects a random n-bit prime p
together with a generator of Zp*.
Reading: Appendix B (except the starred parts), 9.1.1-9.1.4, 9.3.1
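The group facts above can be checked numerically in a few lines (Python's built-in pow handles both modular exponentiation and modular inverses):

```python
from math import gcd

# Checking: phi(N) = |Z_N^*|, x^phi(N) = 1 (mod N) for every x in Z_N^*,
# and inverses mod N exist exactly when gcd(x, N) = 1.

def phi(N: int) -> int:
    """|Z_N^*| by direct count (fine for small N; illustration only)."""
    return sum(1 for x in range(1, N) if gcd(x, N) == 1)

N = 15                                  # phi(15) = phi(3) * phi(5) = 2 * 4 = 8
for x in range(1, N):
    if gcd(x, N) == 1:
        assert pow(x, phi(N), N) == 1           # x^phi(N) = 1 mod N
        assert x * pow(x, -1, N) % N == 1       # modular inverse via pow

p = 101                                 # prime case: x^(p-1) = 1 mod p
assert all(pow(x, p - 1, p) == 1 for x in range(1, p))
```

The counting definition of phi is exponential-time in the bit length of N, which previews why phi(N) (equivalently, N's factorization) can serve as a trapdoor later.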
- Lecture 18 (3/24)
Noted that the group (Zp*,×) is isomorphic to
the group (Zp-1,+) via the following isomorphism: map
x ∈ Zp-1 to g^x ∈ Zp*, where g is
a generator of Zp*. Note that this isomorphism (in
this direction) is efficiently computable. Discrete log
assumption (DLA) and discussion. DLA believed to be true for
Zp*. Diffie Hellman key exchange, and discussion of
required assumptions. This led to the definition of the
computational DH assumption (CDH) and the decisional DH
assumption (DDH). DDH implies CDH, which in turn implies DLA.
The converse is not true in general (there are specific useful
groups where DDH is false, but DLA is believed to hold),
and even in specific useful groups where DDH is believed to
hold, we don't have a proof that
it is equivalent to DLA (thus, DLA is the best/weakest
assumption of the three).
Going back to DH key exchange protocol, DDH implies that the key
agreed on is indistinguishable from a random element in the
group. However, while DLA and CDH are believed to be true in
Zp*, DDH does not hold in this group, as we will
see. Instead, DH key
exchange is used in a related group (we will show next time),
where DDH is believed to hold.
Reading: 9.3.1, 9.3.2, 11.3. You're encouraged to also read
11.1,11.2 (but this is not required).
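The DH key exchange protocol above fits in a few lines. This sketch already uses the prime-order subgroup of quadratic residues of a safe prime (the setting where DDH is believed to hold, as discussed in the following lecture); the tiny parameters are for illustration only, since real use needs ~2048-bit primes.

```python
import secrets

# Diffie-Hellman key exchange in the order-q subgroup of Z_p^* for a
# safe prime p = 2q + 1. Alice and Bob each pick a secret exponent,
# exchange g^x and g^y, and both compute g^(xy).

q = 1019                                # prime
p = 2 * q + 1                           # 2039 is also prime: p is a safe prime
g = pow(2, 2, p)                        # a square, so g generates QR_p (order q)

x = secrets.randbelow(q - 1) + 1        # Alice's secret exponent
y = secrets.randbelow(q - 1) + 1        # Bob's secret exponent
A, B = pow(g, x, p), pow(g, y, p)       # the exchanged public messages
k_alice = pow(B, x, p)                  # (g^y)^x
k_bob = pow(A, y, p)                    # (g^x)^y -- same group element
```

Under DDH in this group, the shared key g^(xy) is indistinguishable from a random group element given the transcript (g^x, g^y).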
- Fourth review session (Sunday 3/27 5:00pm, Mudd 627): led
by Asif and Swapnil.
- Lecture 19 (3/29) We defined QRp, the subgroup of
quadratic residues mod p (this group is cyclic and generated
by the g^2 where g is a generator of Zp*; it is of size
(p-1)/2). We argued that for a prime p it's easy to tell
whether an element of Zp* is in QRp or not (by raising it to
(p-1)/2 and checking if we get 1), and that this can be used
to break the DDH assumption in Zp*. A similar attack works for
any composite order group, but this attack does not work in a
prime order group (this is one of several advantages of
working in prime order groups).
If p is a safe prime (p=2q+1 where q is a
prime) then QRp is of prime order q. For this group, DDH is
believed to be true, and so we can apply the DH KE for this
group to get indistinguishability (namely passive security).
We mentioned that for an active attacker, the DH KE scheme we
saw is not secure (eg, there is a devastating "man in the
middle" attack). Still, today's KE protocols (e.g in SSL/TLS)
use this DH KE idea, modified to allow an "authenticated
channel" with the help of a Certificate Authority (CA), to
prevent active attacks (we did not discuss any details about
how, or what a CA is).
We briefly mentioned some other comments about the
discrete-log related assumptions: noted that the DLA
implies OWF (it is one of the candidate OWF we have), and
also noted that there are "bilinear groups" where DDH is easy
(DDH assumption false), and this is used for applications
(while DL is still considered hard / DLA assumed true).
Started discussion of Public Key Encryption (PKE), with a
quick historic overview (in one of the sections this was done
in the beginning of lecture 20).
- Lecture 20 (3/31)
Defined PKE with CPA security. Noted that
EAV security for one message, multiple messages, and CPA
security are all equivalent for PKE (this also means that
there does not exist PKE with perfect security, even for EAV,
single message, and large keys - brute force attack always
breaks security). Noted that, like in the private key case,
Enc must be randomized, and there must be super polynomially
many encryptions for each message. Discussed differences
between PKE and private key encryption.
Showed that PKE is equivalent to 2-message key exchange
(although we didn't formally define the latter, so we showed
this informally).
We showed the PKE obtained from the DH KE
protocol -- this is the El Gamal PKE, secure under the
DDH assumption (proof of security left as an exercise).
Defined the Factoring assumption (which can be phrased as:
multiplying two large primes is a one-way function).
Noted that the grade-school algorithm for factoring is not
polynomial time, and mentioned that we have better algorithms,
but those are still not polynomial time.
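The El Gamal PKE obtained from DH key exchange, as described above, looks as follows (same toy safe-prime subgroup as a stand-in group; real deployments use far larger parameters):

```python
import secrets

# El Gamal PKE: the public key is h = g^x, and an encryption of a group
# element m is (g^r, h^r * m) for fresh randomness r. Decryption strips
# off h^r = (g^r)^x using the secret exponent x.

q = 1019
p = 2 * q + 1                           # safe prime; QR_p has prime order q
g = 4                                   # generator of QR_p

def keygen():
    x = secrets.randbelow(q - 1) + 1
    return pow(g, x, p), x              # (pk, sk)

def encrypt(pk: int, m: int) -> tuple[int, int]:
    r = secrets.randbelow(q - 1) + 1    # fresh randomness every time
    return pow(g, r, p), pow(pk, r, p) * m % p

def decrypt(sk: int, ct: tuple[int, int]) -> int:
    c1, c2 = ct
    return c2 * pow(c1, -sk, p) % p     # divide out c1^sk = h^r

pk, sk = keygen()
m = pow(g, 123, p)                      # message encoded as a group element
ct = encrypt(pk, m)
```

Note that encryption is randomized, as any CPA-secure PKE must be: encrypting the same m twice yields different ciphertexts.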
- Lecture 21 (4/5)
Defined the RSA function, and showed that it is a permutation
over Z_N*. RSA assumption wrt GenRSA experiment generating (N,e,d)
(variations on how e is chosen yield different assumptions,
all considered hard). If factoring assumption is false
(factoring easy) then RSA assumption is false, but the other
direction is not known (thus, RSA is a stronger / worse
assumption). Indeed, factoring N=pq is equivalent to finding
φ(N) and can be used to find the inverse of e, all of
which allow to take e-th roots mod N and break the RSA
assumption. However, it's possible that there are other ways
to break RSA even if factoring is hard. Textbook (plain) RSA
and its insecurity as a PKE (e.g., it is deterministic; there
are other attacks as well). RSA encryption refers (or should)
to secure randomized schemes as described below.
We discussed (informally) the concept of a trapdoor
permutation (the RSA function is one, under the RSA
assumption): a permutation that is easy to compute and hard to
invert, but easy to invert given a trapdoor.
Using a trapdoor permutation directly is not a good PKE, but
any trapdoor permutation can be used to get a secure PKE, via
a hard-core bit. Specifically for the RSA
permutation, it is known (though we did not prove) that under
the RSA assumption, the lsb is a hard-core bit. That is: if
there's a ppt algorithm that given N,e, and x^e mod N can find
the lsb(x) with probability non-negligibly better than 1/2,
then there's a ppt algorithm that given the same input, can
recover x completely with non-negligible probability. This
yields a PKE scheme for one bit messages, secure under the RSA
assumption: the encryption outputs (r^e, lsb(r) xor m).
More efficient options include the following. Padded RSA PKE
(can prove security under RSA assumption when message length
is at most logarithmic in n, insecurity when random pad is at
most logarithmic in n, and we do not have a proof nor an
attack when the message and pad are of the same
length). Hashing-based encryption, where the encryption
outputs (r^e, H(r) xor m), which can be proven CPA secure in
the ROM under the RSA assumption. These, or variations with
specific padding patterns, have often been used in practice.
Defined the Rabin function (squaring mod N=pq). We showed that
this is a 4:1 function (along the way mentioning the Chinese
Remainder Theorem). (In one of the sections Rabin function was
defined in lecture 22).
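Textbook RSA and the trapdoor structure described above can be seen directly (tiny primes for illustration only):

```python
# Textbook (plain) RSA: x -> x^e mod N is a permutation over Z_N^*,
# easy to compute from (N, e) and easy to invert only given the
# trapdoor d, which requires phi(N), i.e. the factorization of N.
# As a PKE it is insecure on its own: it is deterministic.

p, q = 1009, 1013                       # toy primes; real RSA uses ~1024-bit ones
N = p * q
phi = (p - 1) * (q - 1)
e = 65537
d = pow(e, -1, phi)                     # the trapdoor: d = e^{-1} mod phi(N)

def rsa(x: int) -> int:
    return pow(x, e, N)                 # easy given the public (N, e)

def rsa_inv(y: int) -> int:
    return pow(y, d, N)                 # easy only given the trapdoor d

assert rsa_inv(rsa(123456)) == 123456   # (x^e)^d = x mod N
assert rsa(42) == rsa(42)               # deterministic: same message always
                                        # gives the same ciphertext
```

The multiplicative structure rsa(a)*rsa(b) = rsa(a*b) mod N is the malleability exploited by some of the attacks on plain RSA mentioned later for signatures.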
- Lecture 22 (4/7)
Showed that inverting the Rabin function (i.e., taking square
roots mod N=pq) is equivalent to factoring N. Thus, using
this as a basis for encryption relies on a better (weaker)
assumption than RSA. To do this, we mentioned (without proof)
that if p, q are both 3 mod 4, then squaring is a permutation
over QR_N. We can then use it as a trapdoor permutation, and
build PKE from it by incorporating randomness like we
discussed for RSA.
On the other hand, due to equivalence to factoring, schemes
based on this assumption are often more susceptible to CCA
attacks, which can allow an attacker to completely factor N.
Defined CCA secure PKE. PKE schemes we saw so far
are not CCA secure: we demonstrated by showing attack on El
Gamal (given an encryption of m can rerandomize to a new
encryption, as well as get an encryption of related messages;
thus this is a malleable scheme, and this can be used to break
CCA security).
Mentioned that there are provably CCA secure schemes based on
number theoretic assumptions (for example, the Cramer-Shoup
DDH-based efficient scheme, schemes based on Factoring, etc).
There are also hashing-based schemes, e.g. OAEP, which are
secure in the ROM (under number-theoretic assumptions).
However, it is not known how to
construct CCA secure PKE from CPA secure PKE.
The most typical use of PKE is for hybrid encryption, or the
KEM-DEM paradigm: use PKE to
encrypt a random key which is subsequently used for private
key encryption.
- Lecture 23 (4/12)
Reviewed a problem from the quiz.
Digital signatures: motivation and discussion of differences
with MAC (including public verifiability, transferability, and
non-repudiation). Definition of signature schemes and their
security (existential unforgeability against CMA).
Insecurity of plain RSA signatures: First, we showed a
no-chosen-message attack forging a signature on some message
(this attack applies to any TDP which is used directly for
signing). Then we showed a CMA attack that allows forging a
signature on any message (this attack is based on the
malleability of the RSA function).
We discussed the wrong notion that signatures and PKE
are dual in the sense that you can "sign by decrypting,
verify by encrypting". This does not make sense with our
modern terminology, even syntactically,
since a secure PKE must be randomized, and every
message has
exponentially many ciphertexts.
This notion was suggested by Diffie and Hellman in their
seminal work where they propose the idea of PKE and signature
schemes, but what they referred to as PKE is what we call
"trapdoor permutation" today (which is a deterministic
function).
- Fifth review session (Wed, 4/13 2:00pm, over zoom): led
by Miranda and Zeyu (Thomas).
- Lecture 24 (4/14)
Different approaches for constructing secure
signatures: (1) heuristic constructions based on number
theoretic assumptions, (2) constructions provably secure in
the random oracle model (ROM), based on number theoretic
assumptions (or generic TDP) (3) provably secure constructions
(in the standard model) from: (a) one-way functions, (b) CRHF,
(c) number theoretic assumptions.
Our discussion included the following details.
(3a) is the best, minimal assumption: it is possible to
construct signature schemes from OWF, so in terms of
complexity, signatures are equivalent to OWF (and therefore
also to private key encryption, MAC, PRF, etc) - in that way
it is a private key primitive. We didn't show how to construct
signatures from OWF, but we did show a one-time signature from
OWF. This can be used together with CRHF to obtain a signature
scheme (3b). We mentioned that there are more efficient
provably secure constructions from number theoretic
assumptions such as strong RSA. Approaches 1 and 2 give even
more efficient constructions. For (2), we described the RSA
Full Domain Hash (RSA-FDH), as well as any TDP-FDH, which is
secure in the ROM. Towards showing other signature schemes,
and as an important primitive on its own, we informally
defined identification
schemes, which allow Alice to prove to Bob that she holds the
sk associated with her pk, without revealing information that
will allow an adversary to impersonate her later.
We showed the Schnorr identification scheme which is secure
based on DLA (looking ahead to next class: this is a zk-pok
for discrete log). We discussed the Fiat-Shamir paradigm, which
allows one to remove interaction from a 3-round scheme where the
second message is random, by using a hash function, modeled as
a random oracle (security is in the ROM). Using Fiat-Shamir
with Schnorr identification gives Schnorr signature scheme
(another example of (2)). Finally, schemes for (1) include DSA
and ECDSA, which are part of NIST's digital signature
standard. These too are based on an identification scheme whose
security relies on DLA, heuristically made non-interactive by
using a hash function and another public function.
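A minimal sketch of Schnorr identification and the Fiat-Shamir-derived Schnorr signature, over a toy order-11 subgroup of Z_23^* (the tiny parameters and the SHA-256 instantiation of the random oracle are illustrative choices, not standardized ones):

```python
import hashlib
import secrets

# Toy group (insecure parameters): g = 2 generates the subgroup of
# order q = 11 inside Z_p^* for p = 23, since 2^11 = 1 (mod 23).
p, q, g = 23, 11, 2

x = secrets.randbelow(q)        # secret key
y = pow(g, x, p)                # public key y = g^x mod p

# --- Interactive Schnorr identification (one execution) ---
r = secrets.randbelow(q)        # prover's ephemeral randomness
a = pow(g, r, p)                # first message: commitment a = g^r
c = secrets.randbelow(q)        # second message: random challenge
z = (r + c * x) % q             # third message: response
# Verifier checks g^z = a * y^c (mod p).
assert pow(g, z, p) == (a * pow(y, c, p)) % p

# --- Fiat-Shamir: the challenge becomes a hash of (a, m) ---
def _challenge(a, m):
    h = hashlib.sha256(a.to_bytes(4, "big") + m).digest()
    return int.from_bytes(h, "big") % q

def fs_sign(m):
    r = secrets.randbelow(q)
    a = pow(g, r, p)
    c = _challenge(a, m)
    return (a, (r + c * x) % q)

def fs_verify(m, sig):
    a, z = sig
    c = _challenge(a, m)
    return pow(g, z, p) == (a * pow(y, c, p)) % p

assert fs_verify(b"hello", fs_sign(b"hello"))
```

The verifier's check holds because g^z = g^(r + cx) = g^r * (g^x)^c = a * y^c mod p; removing the interaction by hashing the commitment (and the message) is exactly the Fiat-Shamir step described above.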
We started informally discussing zero-knowledge proofs (one
motivation is identification schemes, mentioned above, but
there are many other applications). We talked about how to prove
that two cards have different colors (to a verifier who cannot
see this difference).
- Lecture 25 (4/19)
Defined interactive proof systems. All languages in NP have a
simple interactive proof system (in fact, it's a
non-interactive one) where the prover sends the witness to the
verifier. Mentioned that the corresponding class IP was proven
to be equal to PSPACE (likely much bigger than NP),
demonstrating the power of interaction.
Defined zero knowledge proof systems,
which include the same completeness and soundness properties,
and an additional zk property, defined via the simulation
paradigm (perfect, statistical, or computational). We
mentioned that sometimes there is an efficient prover strategy
(achieving completeness) for a prover holding a witness. Soundness is
still required to hold against any, even unbounded, prover
(when soundness holds only against poly time provers, this is
called an interactive argument, rather than proof). If the ZK
property holds only against the original verifier V, this is
called "honest-verifier zero knowledge".
We showed a ZK proof system for proving quadratic residuosity
modulo a composite. We showed completeness and soundness, and
showed the simulator for the zk property, which runs in
expected polynomial time, and once it halts, its output is
distributed identically to the real execution.
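One round of the quadratic-residuosity protocol can be sketched as follows (N and the witness x are toy values chosen for illustration; a real instance would use a large RSA-type modulus):

```python
import secrets
from math import gcd

# One round of the ZK proof for quadratic residuosity mod N.
# Common input: N and y = x^2 mod N; the prover's witness is x.
N = 3233                   # toy modulus 61 * 53 (insecure)
x = 123                    # prover's secret square root
y = pow(x, 2, N)           # public quadratic residue

def rand_unit(n):          # random element of Z_n^*
    while True:
        r = secrets.randbelow(n - 1) + 1
        if gcd(r, n) == 1:
            return r

# Prover: commit to a fresh random square.
r = rand_unit(N)
a = pow(r, 2, N)
# Verifier: random one-bit challenge.
b = secrets.randbelow(2)
# Prover: reveal r (if b = 0) or r * x (if b = 1).
z = (r * pow(x, b, N)) % N
# Verifier: accept iff z^2 = a * y^b (mod N).
assert pow(z, 2, N) == (a * pow(y, b, N)) % N
```

A cheating prover who commits to a without knowing a square root of y can answer at most one of the two challenges, so each round catches cheating with probability 1/2; the simulator guesses b in advance, sets a = z^2 * y^(-b) for a random z, and retries whenever the verifier's challenge differs from its guess (hence the expected polynomial running time mentioned above).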
All these properties held without making any computational
assumptions. Additionally, this protocol has the property that the
honest verifier simply sends a random challenge to the prover.
For such ZK proofs, the Fiat-Shamir methodology replaces the
verifier's random challenge with the output of a hash function,
modeled as a random oracle, and all the properties are
preserved in the ROM. This can be used to achieve
non-interactive zk (NIZK) in the ROM.
We mentioned that the protocol we saw is actually a zk proof
of knowledge (ZK-POK), where the prover is proving knowledge
of the square root of the input. This is defined via an
extractor algorithm that can extract the witness (in this
case, square root) from a prover that can correctly answer
multiple challenges (we only described this informally).
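The extraction idea can be made concrete for the quadratic-residuosity protocol: rewinding a convincing prover to obtain answers to both challenges on the same commitment yields the witness (toy parameters again, chosen for illustration):

```python
# Sketch of the knowledge extractor for the quadratic-residuosity
# protocol: a prover that can answer both challenges b = 0 and b = 1
# on the same commitment a reveals a square root of y.
N = 3233                   # toy modulus 61 * 53 (insecure)
x = 123                    # honest prover's witness
y = pow(x, 2, N)

# Rewind the prover to feed it both challenges on the same a = r^2.
r = 456                    # prover's randomness (gcd(456, N) = 1)
a = pow(r, 2, N)
z0 = r % N                 # accepting response to challenge b = 0
z1 = (r * x) % N           # accepting response to challenge b = 1
assert pow(z0, 2, N) == a
assert pow(z1, 2, N) == (a * y) % N

# Extract: (z1 / z0)^2 = (a * y) / a = y, so z1 * z0^{-1} is a
# square root of y - the witness.
w = (z1 * pow(z0, -1, N)) % N
assert pow(w, 2, N) == y
```

This is exactly the computation behind the extractor: two accepting transcripts (a, 0, z0) and (a, 1, z1) with the same first message give (z1/z0)^2 = y mod N.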
We stated the theorem that if OWF exist, then any language in
NP has a ZK interactive proof system. We did not have time to
discuss the proof of this theorem or its importance (but
note that the trivial protocol of just sending the witness
is clearly not zero-knowledge).
- Lecture 26 (4/21)
Guest lecture by Dr. Eran Tromer: "crypto for crypto". An
overview of how cryptographic primitives we've seen so far are
used in the blockchain area. Primitives mentioned included
hash functions (in several different ways), signature schemes,
one-way functions, commitment schemes, and zero-knowledge
proofs (or more specifically, zk-snarks: zero-knowledge
succinct, non-interactive, arguments of knowledge). Secure
multi-party computation (MPC) was also hinted at, although we
have not mentioned this area in class (yet?).