Applied Introduction to Cryptography and Cybersecurity

Amir Herzberg
Comcast Professor of Security Innovations
Department of Computer Science and Engineering
University of Connecticut

March 24, 2024

For the updated draft see: http://bit.ly/AI2CS. Comments, corrections and suggestions are appreciated; send them by email to amir.herzberg@uconn.edu.

©Amir Herzberg

Preface

This textbook introduces cybersecurity, with a focus on the key element of applied cryptography. Our goal is to provide sufficient depth and precision for understanding this important and fascinating area, without requiring extensive prior background in mathematics or in the theory of computer science. The textbook presents design principles, discusses practical systems, attacks and vulnerabilities, presents basic cryptographic mechanisms, constructions and definitions, and includes many examples, exercises and several programming labs. The goal is that readers will be able to use the book for self-study, and lecturers will be able to use it as a textbook for a course, or use parts of it for two courses.

We use the term cybersecurity to refer to the protection of systems involving communication and computation mechanisms from attacks by adversaries. This is a broad area, since communication and computation have become so diverse and important, and since there are many threats and many types of adversaries. Ensuring security is challenging; attacks often exploit subtle vulnerabilities and use unexpected strategies, which intuition may fail to consider. This makes careful, adversarial thinking crucial.

Cybersecurity is a very applied area; however, in this applied area, precise analysis, definitions and proofs are critical. This stands in contrast to many other areas of engineering, where designs are often evaluated under typical, expected scenarios and failures. That approach is insufficient for cybersecurity, since security should be ensured against arbitrary attacker strategies, rather than against expected, familiar attacks. Defenses should be designed assuming limitations on the capabilities of the adversary, but without making any assumptions on the adversary's strategy.

Cybersecurity is a broad area, including many aspects, technical as well as otherwise (legal, economic, social and much more). The technical aspects include cryptography, network security, software security, system security, privacy, secure human-computer interaction and more. This textbook focuses on applied cryptography and also introduces some aspects of network security and of secure human-computer interaction. We believe that this is a good choice for a first course in cybersecurity, for three reasons:

• Applied cryptography is essential to many areas of cybersecurity. Hence, this textbook may provide a common basis for students interested in these different areas. Some students may continue by focusing mostly on cryptography, maybe even on the theory of cryptography; for these, we hope to provide a good basis in the applied aspects of cryptography. Others may continue to areas of cybersecurity which ‘just’ use cryptography, such as network security, secure systems, privacy and human-centered security; for those, we hope to provide the necessary background in cryptography.

• The study of cryptography develops adversarial thinking, which is critical for every cybersecurity expert.
Modern cryptography is based on precise definitions of goals and assumptions, with analysis and proofs of security for given adversary capabilities, not assuming a specific adversary strategy or attack method. Other areas of cybersecurity often use intuitive goals, design and analysis, and may even focus on specific adversary strategies. Such approaches may be unavoidable, since precise definitions and proofs are often infeasible; however, these approaches appear less helpful in developing adversarial thinking.

• Finally, there is the pragmatic consideration of scheduling and prerequisites. Modern cryptography is based on both mathematics and the theory of computing; however, as we believe you will find, it is possible to use only a limited amount of math and theory, and to introduce this limited amount within this book. As a result, the text can be used by readers who have not taken math or computer-science courses beyond what is covered in many high-school programs. This is in contrast to other areas of cybersecurity, which require considerable background (e.g., in networking, operating systems, or programming).

The need for security against attackers with arbitrary strategies, restricted only by their capabilities, motivates the use of provable security, as well as the use of standard, well-studied designs whose security was confirmed by experts. Indeed, some of the worst failures of security systems, and especially of cryptographic systems, are due to attempts by non-experts to design schemes and protocols.

This textbook tries to combine practice with theory, applicability with precision, breadth with depth. These are ambitious goals, and we hope we are not completely off the mark. Feedback and suggestions for improvement are highly appreciated.

Organization and usage of this textbook

This textbook is designed for an introductory course in cybersecurity, focusing on applied cryptography. It may be used for self-study, or for one (large) or two (smaller) courses.

Chapter 1 provides an essential introduction to cryptography, with an intuitive discussion of the main mechanisms we cover and an introduction to the challenge of defining security. It also introduces notations used throughout the book and provides critical background information; additional background is provided in Appendix A. The rest of the book was designed so that the necessary background in each topic is very limited; we believe it is not necessary to require learning these topics in a prerequisite course. The introduction also provides a brief historical perspective, which we hope some readers may find of interest.

In Chapters 2 to 4, we begin introducing applied cryptography. These chapters focus on the efficient and conceptually simple shared-key cryptographic functions: encryption (Chapter 2), hashing (Chapter 3) and authentication (Chapter 4). In Chapter 3 we also begin introducing more elaborate applied cryptographic schemes, focusing on hashing-based schemes such as the Merkle digest schemes and blockchains. Chapter 5 introduces applied shared-key cryptographic protocols. By focusing on shared-key protocols, we provide a gentle introduction to the important subject of resiliency to key exposure. This also provides good motivation for public-key cryptography, the topic of the next chapter (Chapter 6). Indeed, Chapter 6 deals extensively with different public-key protocols and applications, and shows the powerful ability of public-key cryptography to ensure resiliency to, and recovery from, key exposure.
The next three chapters cover areas of cybersecurity which are closely related to cryptography, focusing mostly on the cryptographic aspects. Two chapters introduce network security: the important TLS protocol in Chapter 7, and Public Key Infrastructure (PKI) in Chapter 8. Then, Chapter 9 covers the critical topic of human-centered cryptography; too often this aspect is not sufficiently taken into account, and cryptography is circumvented by exploiting human behavior and psychology. We conclude the book in Chapter 10, briefly discussing some advanced topics such as secret sharing, privacy and anonymity, elliptic-curve cryptography and quantum (and post-quantum) cryptography, as well as some of the important aspects of cybersecurity which are beyond this text.

The use of background such as math and theory in this textbook is limited to what appears to be essential or helpful for understanding, at least for a significant fraction of readers. Some of this background is covered in Appendix A. Study of this textbook may be followed by study of other areas in cybersecurity, or by in-depth study of cryptography. There are multiple excellent in-depth textbooks on cryptography or specific cryptographic topics; some of my favorites are [16, 165, 166, 205, 309, 370].

Acknowledgments

I received a lot of help in developing this textbook from friends, colleagues and students; I am very grateful. Specific thanks to:

• The students and my peers at the University of Connecticut, who have been very understanding and supportive.

• Sara Wrotniak, my PhD student, who gave me incredible feedback when she studied using this textbook as an undergrad, and later when I asked her to review the text further.

• Professors and researchers who provided valuable feedback, with or without teaching a course using the textbook, including: Ghada Almashaqbeh, Nimrod Aviram, Ahmed El-Yahyaoui, Peter Gutmann, John Heslen, Walter Krawec, Laurent Michel, David Pointcheval, Ivan Pryvalov, Zhiyun Qian, Amir and Luba Sapir, Jerry Shi, Haya Shulman, Ewa Syta, and Ari Trachtenberg. Walter has also kindly contributed the section on quantum cryptography (Section 10.4).

• Instructors and teaching assistants who used the text and provided important feedback: Justin Furuness, Hemi Liebowiz, Sam Markelon, and Anna Mendonca. I'm especially indebted to Anna for collecting excellent feedback from her students, and for introducing me to Sara.

• Many other readers who helped with feedback, corrections and suggestions, including Pierre Abbat, Yanjing Xu, Bar Meyuchas, Yike (Nicole) Zhang and many others.

• The LaTeX and TikZ communities, who provide amazing resources and support. Special thanks to Nils Fleischhacker for the cool ‘tikz people’ package, and to Shaanan Cohney, who kindly sent me the LaTeX source code from [100], which I turned into Figure 2.37.

I probably forgot to mention some important contributors; please accept my apologies and let me know. Indeed, in general, please let me know of any omission or mistake in the text, and accept my thanks in advance. Many thanks to all of you; feedback is greatly appreciated, definitely including harsh criticism.

Special thanks to my friend, PhD adviser and mentor, Prof. Oded Goldreich, who endured the challenge of trying to teach me both cryptography and proper writing. I surely cannot meet Oded's high standards, but I believe and hope that this textbook does help to provide the necessary foundations of applied cryptography to practitioners of cybersecurity.
Laying foundations is definitely a goal I share with Oded. Let me quote Oded's father:

It is possible to build a cabin with no foundations, but not a lasting building.
(Eng. Isidor Goldreich)

Last but not least, I wish to thank the incredible support and understanding of my family throughout the years, especially my amazing parents, Ada and Michael, my lovely and supportive wife Tatiana, and my wonderful and talented children, Raaz, Tamir, Omri and, last but not least, Karina.

Contents

Preface
List of Figures
List of Tables
List of Labs
List of Principles

1 Introducing Cybersecurity and Cryptography
1.1 Cybersecurity and Cryptography: the basics
1.1.1 Three Cybersecurity Functions: Prevention, Detection and Deterrence
1.1.2 Generic Security Goals
1.1.3 Attack Model
1.1.4 Provable Security
1.1.5 Risk and cost-benefit analysis
1.2 The basic mechanisms: encryption, signatures and hashing
1.2.1 Encryption: symmetric and asymmetric cryptosystems
1.2.2 Kerckhoffs' principle
1.2.3 Digital Signature schemes
1.2.4 Applying Signatures for Evidences and for Public Key Infrastructure (PKI)
1.2.5 Cryptographic hash functions
1.3 Sequence Diagrams and Notations
1.3.1 Sequence diagrams
1.4 A Bit of Background
1.4.1 A bit of Computational Complexity
1.4.2 A bit of Number Theory and Group Theory
1.4.3 A bit of Probability
1.5 Provable-Security and Definitions
1.5.1 Definition of a Signature Scheme
1.5.2 Signature attack models and the conservative design principle
1.5.3 Types of forgery
1.5.4 Game-based Security and the Oracle Notation
1.5.5 The Existential Unforgeability CMA Game
1.5.6 The unforgeability advantage function
1.5.7 Concrete security, asymptotic security and negligible functions
1.5.8 Existentially-unforgeable signature schemes
1.6 A Brief History of Cryptography, Computing and Cybersecurity
1.6.1 A brief history of cryptography
1.6.2 A brief history of computing and cybersecurity
1.7 Lab and Additional Exercises

2 Confidentiality: Encryption Schemes and Pseudo-Randomness
2.1 Historical Ciphers
2.1.1 Ancient Keyless Ciphers
2.1.2 Keyed-Caesar cipher
2.1.3 The General Monoalphabetic Substitution (GMS) Cipher
2.1.4 Frequency analysis attacks on monoalphabetic ciphers
2.1.5 The Polyalphabetic Vigenère ciphers
2.2 Cryptanalysis Attack Models: CTO, KPA, CPA and CCA
2.3 Generic attacks and Effective Key-Length
2.3.1 The generic exhaustive-search CTO attack
2.3.2 The Table Look-up and the Time-Memory Tradeoff Generic CPA attacks
2.3.3 Effective key length
2.4 Unconditional security and the One Time Pad (OTP)
2.5 Pseudo-Randomness, Indistinguishability and Asymptotic Security
2.5.1 Pseudo-Random Generators and Stream Ciphers
2.5.2 The Turing Indistinguishability Test
2.5.3 PRG indistinguishability test
2.5.4 Defining a Secure Pseudo-Random Generator (PRG)
2.5.5 Secure PRG Constructions
2.5.6 RC4: Vulnerabilities and Attacks
2.5.7 Random functions
2.5.8 Pseudorandom functions (PRFs)
2.5.9 PRF: Constructions and Robust Combiners
2.5.10 The key separation principle
2.6 Block Ciphers and PRPs
2.6.1 Random and Pseudo-Random Permutations
2.6.2 Block ciphers
2.6.3 The Feistel Construction: 2n-bit Block Cipher from n-bit PRF
2.7 Defining secure encryption
2.7.1 Attack model
2.7.2 The Indistinguishability-Test for Shared-Key Cryptosystems
2.7.3 The Indistinguishability-Test for Public-Key Cryptosystems (PKCs)
2.7.4 Design of Secure Encryption: the Cryptographic Building Blocks Principle
2.8 Encryption Modes of Operation
2.8.1 The Electronic Code Book (ECB) mode
2.8.2 The CTR and PBR modes
2.8.3 The Output-Feedback (OFB) Mode
2.8.4 The Cipher Feedback (CFB) Mode
2.8.5 The Cipher-Block Chaining (CBC) mode
2.8.6 Modes of Operation Ensuring CCA Security?
2.9 Padding Schemes and Padding Oracle Attacks
2.10 Case study: the (in)security of WEP
2.10.1 CRC-then-XOR does not ensure integrity
2.11 Encryption: Final Words
2.12 Lab and Additional Exercises

3 Integrity: from Hashing to Blockchains
3.1 Introducing cryptographic hash functions, properties and variants
3.1.1 Warm-up: hashing for efficiency
3.1.2 Properties of cryptographic hash functions
3.1.3 Applications of cryptographic hash functions
3.1.4 Standard cryptographic hash functions
3.2 Collision Resistant Hash Functions (CRHF)
3.2.1 Keyless Collision Resistant Hash Functions (Keyless-CRHF)
3.2.2 There are no Keyless CRHFs!
3.2.3 Keyed Collision Resistance
3.2.4 Birthday and exhaustive attacks on CRHFs
3.2.5 CRHF Applications (1): File Integrity
3.2.6 CRHF Applications (2): Hash-then-Sign (HtS)
3.3 Second-preimage resistance (SPR) Hash Functions
3.3.1 The Chosen-Prefix Collisions Vulnerability
3.4 One-Way Functions
3.4.1 OTPw (One-Time Password) Authentication
3.4.2 Using OWF for One-Time Signatures
3.5 Randomness Extraction and Key Derivation Functions
3.5.1 Von Neumann's Biased-Coin Extractor
3.5.2 The Bitwise Randomness Extractor
3.5.3 Key Derivation Functions (KDFs) and Extract-then-Expand
3.6 The Random Oracle Model
3.7 Static Accumulator Schemes and the Merkle-Tree
3.7.1 Definition of a Keyless Static Accumulator
3.7.2 Collision-Resistant Accumulators
3.7.3 The Merkle Tree (MT) Accumulator and its Collision-resistance
3.7.4 The Proof-of-Inclusion (PoI) Requirements
3.7.5 The Merkle-Tree PoI
3.8 Dynamic Accumulators
3.8.1 Dynamic accumulators: motivations, extensions and definition
3.8.2 Dynamic Accumulator Collision-Resistance
3.8.3 Constructing a dynamic accumulator from a static accumulator
3.9 The Merkle-Damgård Construction
3.9.1 The Merkle-Damgård Static Accumulator
3.9.2 The Merkle-Damgård Hash Function
3.9.3 The Merkle-Damgård Dynamic Accumulator
3.10 Blockchains, PoW and Bitcoin
3.10.1 Blockchain Design
3.10.2 The Bitcoin blockchain and cryptocurrency
3.11 Lab and additional exercises

4 Authentication: MAC, Blockchain and Signature Schemes
4.1 Encryption for Authentication?
4.2 Message Authentication Code (MAC) schemes
4.3 Message Authentication Code (MAC): Definitions
4.4 Applying MAC Schemes
4.5 Constructing MAC from a Block Cipher
4.5.1 Every PRF is a MAC
4.5.2 CBC-MAC: ln-bit MAC (and PRF) from n-bit PRF
4.5.3 Constructing Secure VIL MAC from PRF
4.6 Other MAC Constructions
4.6.1 MAC design ‘from scratch’
4.6.2 Robust combiners for MAC
4.6.3 HMAC and other constructions of a MAC from a Hash function
4.7 Combining Authentication, Encryption and Other Functions
4.7.1 Authenticated Encryption (AE) and AEAD schemes
4.7.2 Authentication via EDC-then-Encryption?
4.7.3 Generic Authenticated Encryption Constructions
4.7.4 Single-Key Generic Authenticated-Encryption
4.7.5 Authentication, encryption, compression and error detection/correction codes
4.8 Additional exercises
5 Shared-Key Protocols
5.1 Modeling cryptographic protocols
5.1.1 The session/record protocol
5.1.2 A simple EtA session/record protocol
5.2 Shared-key Entity Authentication Protocols
5.2.1 Interactions and requirements of entity authentication protocols
5.2.2 Vulnerability study: SNA mutual-authentication protocol
5.2.3 Authentication Protocol Design Principles
5.2.4 Secure Mutual Entity Authentication with the 2PP protocol
5.3 Authenticated Request-Response Protocols
5.3.1 Summary of request/response protocols
5.3.2 The 2PP-RR Authenticated Request-Response Protocol
5.3.3 2RT-2PP Authenticated Request-Response protocol
5.3.4 The Counter-based RR Authenticated Request-Response protocol
5.3.5 Time-based RR Authenticated Request-Response protocol
5.4 Shared-key Key Exchange Protocols
5.4.1 The Key Exchange extension of 2PP
5.4.2 Deriving Per-Goal Keys
5.5 Key Distribution Center Protocols
5.5.1 The Kerberos Key Distribution Protocol
5.6 The GSM Key Exchange Protocol
5.6.1 VN-impersonation Replay attack on GSM
5.6.2 Crypto-agility and cipher suite negotiation in GSM
5.6.3 The downgrade to A5/2 attack on GSM
5.7 Resiliency to Exposure: Forward Secrecy and Recover Security
5.7.1 Forward Secrecy 2PP Key Exchange
5.7.2 Recover-Security Key Exchange Protocol
5.7.3 Stronger notions of resiliency to key exposure
5.8 Additional Exercises

6 Public Key Cryptography
6.1 Introduction to PKC
6.1.1 Public key cryptosystems
6.1.2 Signature schemes
6.1.3 Public-Key-based Key Exchange Protocols
6.1.4 Advantages of Public Key Cryptography (PKC)
6.1.5 The price of PKC: assumptions, computation costs and length of keys and outputs
6.1.6 Hybrid Encryption
6.1.7 The Factoring and Discrete Logarithm Hard Problems
6.1.8 The secrecy implied by the discrete logarithm assumption
6.2 The DH Key Exchange Protocol
6.2.1 Physical key exchange
6.2.2 Some candidate key exchange protocols
6.2.3 The Diffie-Hellman Key Exchange Protocol and Hardness Assumptions
6.2.4 Secure derivation of keys from the DH protocol
6.3 Using DH for Resiliency to Exposures
6.3.1 The Authenticated DH protocol: ensuring PFS
6.3.2 The DH-Ratchet protocol: Perfect Forward Secrecy (PFS) and Perfect Recover Security (PRS)
6.4 The DH and El-Gamal PKCs
6.4.1 The DH PKC and the Hashed DH PKC
6.4.2 The El-Gamal PKC
6.4.3 El-Gamal is Multiplicative-Homomorphic Encryption
6.4.4 Types and Applications of Homomorphic Encryption
6.5 The RSA Public-Key Cryptosystem
6.5.1 RSA key generation
6.5.2 Textbook RSA: encryption, decryption, and signing
6.5.3 Efficiency of RSA
6.5.4 Correctness of RSA
6.5.5 The RSA assumption and the vulnerability of textbook RSA
6.5.6 Padded RSA encryption: PKCS#1 v1.5 and OAEP
6.5.7 Bleichenbacher's Padding Side-Channel Attack on PKCS#1 v1.5
6.6 Public key signature schemes
6.6.1 RSA-based signatures
6.7 Labs and Additional Exercises

7 TLS protocols: web-security and beyond
7.1 Introduction to TLS and SSL
7.1.1 A brief history of SSL and TLS
7.1.2 TLS: High-level Overview
7.1.3 TLS: security goals
7.1.4 TLS: Engineering goals
7.1.5 TLS and the TCP/IP Protocol Stack
7.2 The TLS Record Protocol
7.2.1 The Authenticate-then-Encrypt (AtE) Record Protocol
7.2.2 The CPA-Oracle Attack Model
7.2.3 Padding Attacks: Poodle and Lucky13
7.2.4 The BEAST Attack: Exploiting CBC with Predictable-IV
7.2.5 Exploiting RC4 Biases to Recover Plaintext
7.2.6 Exploiting Compress-then-Encrypt: CRIME, TIME, BREACH
7.2.7 The TLS AEAD-based record protocol (TLS 1.3)
7.3 The SSLv2 Handshake Protocol
7.3.1 SSLv2: the ‘basic’ handshake
7.3.2 SSLv2 Key-derivation
7.3.3 SSLv2: ID-based Session Resumption
7.3.4 SSLv2: Client Authentication
7.4 The Handshake Protocol: from SSLv3 to TLSv1.2
7.4.1 SSLv3 to TLSv1.2: improved derivation of keys
7.4.2 SSLv3 to TLSv1.2: DH-based key exchange
7.4.3 The TLS Extensions mechanism
7.4.4 SSLv3 to TLSv1.2: session resumption
7.4.5 SSLv3 to TLSv1.2: Client authentication
7.5 Negotiations and Downgrade Attacks (SSL to TLS 1.2)
7.5.1 SSLv2 cipher suite negotiation and downgrade attack
7.5.2 Handshake Integrity Against Cipher Suite Downgrade
7.5.3 Finished Fails: the Logjam and FREAK cipher suite downgrade attacks
7.5.4 Backward compatibility and protocol version negotiation
7.5.5 The TLS Downgrade Dance and the Poodle Version Downgrade Attack
7.5.6 Securing the TLS downgrade dance: the SCSV cipher suite and beyond
7.5.7 The SSL-Stripping Attack and the HSTS Defense
7.5.8 Three Principles: Secure Extensibility, KISS and Minimize Attack Surface
7.6 The TLS 1.3 Handshake: Improved Security and Performance
7.6.1 TLS 1.3: Negotiation and Backward Compatibility
7.6.2 TLS 1.3 Full (1-RTT) DH Handshake
7.6.3 TLS 1.3 Full (1-RTT) Pre-Shared Key (PSK) Handshake
7.6.4 TLS 1.3 Zero-RTT Pre-Shared Key (PSK) Handshake
7.6.5 TLS 1.3 Key Derivation
7.6.6 Cross-Protocol Attacks on TLS 1.3
7.7 TLS: Final Words and Further Reading
7.8 Additional Exercises

8 Public Key Infrastructure (PKI)
8.1 Introduction: PKI Concepts and Goals
8.1.1 Rogue certificates
8.1.2 Security goals of PKI schemes
8.1.3 The Web PKI
8.2 The X.509 PKI
8.2.1 The X.500 Global Directory Standard
8.2.2 The X.500 Distinguished Name
8.2.3 X.509 Public Key Certificates
8.2.4 The X.509v3 Extensions Mechanism
8.2.5 Trust-Anchor Certificate Validation
8.2.6 The SubjectAltName and the IssuerAltName Extensions
8.2.7 Standard key-usage and policy extensions
8.2.8 Certificate policy (CP) and Domain/Organization/Extended Validation
8.3 Intermediate-CAs and Certificate Path Validation
8.3.1 The certificate path constraints extensions
8.3.2 The basic constraints extension
8.3.3 The name constraint extension
8.3.4 The policy constraints extension
8.4 Certificate Revocation
8.4.1 Certificate Revocation List (CRL)
8.4.2 Online Certificate Status Protocol (OCSP)
8.4.3 OCSP Stapling and the Must-Staple Extension
8.4.4 Reducing OCSP Computational Overhead
8.4.5 Optimized Periodic Revocation Status: OneCRL, CRLSets and CRVs
8.5 Web PKI: Vulnerabilities, Failures and Improvements
8.5.1 Web PKI Certificate Authority Failures
8.5.2 X.509/PKIX Defenses against Corrupt/Negligent CAs
8.6 Certificate Transparency (CT)
8.6.1 CT: concepts, entities and goals
8.6.2 The Honest-Logger Certificate Transparency (HL-CT) design
8.6.3 The Audit-and-Gossip Certificate Transparency (AnG-CT) design
8.6.4 The NTTP-Security Certificate Transparency (NS-CT) design
8.7 Additional Exercises
9 Human-centered Security and Cryptography
9.1 Introducing User-Centered Security and Cryptography
9.1.1 Biases and other challenges of protecting human users
9.1.2 Principles of user-centered security
9.2 Login Ceremonies
9.3 Password-based Login Ceremonies
9.3.1 Password-based web-form login ceremonies
9.3.2 Impersonating website and phishing attacks
9.3.3 Defenses against impersonating websites
9.3.4 Password dictionaries
9.3.5 Online and offline dictionary attacks
9.4 Password-file exposures
9.4.1 Encrypted password file
9.4.2 Hashed password files
9.4.3 Adding salt to hashed passwords
9.4.4 Hashed passwords with pepper
9.4.5 Using a cryptographic co-processor to protect hashed passwords
9.4.6 Password managers
9.4.7 Detecting exposure of passwords and password file
9.5 Password-Authenticated Key Exchange (PAKE)
9.5.1 Two Naïve-PAKE Protocols
9.5.2 The EKE2 PAKE protocol
9.6 Login ceremonies beyond passwords
9.6.1 Something else you know: beyond passwords
9.6.2 One-Time Password and Hash-Chain Login Ceremonies
9.6.3 Device-based Login Ceremonies
9.6.4 Biometrics: ‘something you are’ authentication
9.7 Lab and Additional Exercises

10 Conclusions and a few advanced topics
10.1 Secret sharing and its Applications
10.2 Side-channels
10.3 Elliptic Curves Cryptography
10.4 Quantum and post-quantum cryptography: by Walter Krawec
10.4.1 Quantum cryptanalysis and post-quantum cryptography
10.4.2 Quantum Cryptography
10.5 Privacy and anonymity
10.6 Theory of cryptography

Appendix A Background
A.1 Background: Computational Complexity
A.2 Background: Number Theory and Group Theory
A.2.1 The modulo operation and modular arithmetic
A.2.2 Multiplicative inverses
A.2.3 Fermat's and Euler's Theorems
A.2.4 Group Theory, Cyclic Groups and Generators
A.3 Background: Probability

Index

Bibliography

List of Figures

1.1 Cybersecurity defense approaches: Prevention, Deterrence and Detection
1.2 Generic Security Goals: Confidentiality, Integrity, Authentication, Availability and Non-repudiation
1.3 Encryption: terms and typical use
1.4 Shared key (symmetric) cryptosystem
1.5 Public key (asymmetric) cryptosystem
1.6 Digital Signature Scheme
1.7 Sequence diagram for the initialization and use of a signature scheme
1.8 Comparing linear, quadratic and exponential complexities
2.1 Stateful shared key (symmetric) cryptosystem
2.2 The At-Bash Cipher
2.3 The AzBy, Caesar and ROT13 Ciphers
2.4 The Masonic Cipher
2.5 Letter and Bigram Frequencies in English
2.6 The Ciphertext-Only (CTO) attack model
2.7 The Known-Plaintext Attack (KPA) model
2.8 The Chosen-Plaintext Attack (CPA) model
2.9 The Chosen-Ciphertext Attack (CCA) model
2.10 The One Time Pad (OTP) cipher
2.11 PRG-based Stream Cipher
2.12 The Turing Indistinguishability Test
2.13 Intuition for the PRG Indistinguishability Test
2.14 Feedback Shift Register
2.15 Bit-wise encryption using a random function
2.16 Block (n-bit) encryption using a Random Function f(·)
2.17 Using PRF for secure encryption
2.18 The PRF Indistinguishability Test
2.19 Standard block ciphers (AES and DES)
2.20 The PRP Indistinguishability Test
2.21 Three ‘rounds’ of the Feistel Cipher
2.22 The IND-CPA test for shared-key encryption
2.23 The IND-CPA test for symmetric encryption (E, D)
2.24 IND-CPA-PK, indistinguishability test for public-key encryption
2.25 Electronic Code Book (ECB) mode encryption
2.26 Electronic Code Book (ECB) mode decryption
2.27 Visual demonstration of the weakness of the ECB mode
2.28 Per-Block Random (PBR) mode encryption
2.29 Counter (CTR) mode encryption
2.30 Output Feedback (OFB) mode encryption
2.31 Output Feedback (OFB) mode decryption (adapted from [218])
2.32 Cipher Feedback (CFB) mode encryption
2.33 Cipher Feedback (CFB) mode decryption
2.34 Cipher Block Chaining (CBC) mode encryption
2.35 Cipher Block Chaining (CBC) mode decryption
2.36 The Padding Oracle Attack model
2.37 A single round of the ANSI X9.31 stateful PRG
3.1 Keyless and Keyed Hash Functions
3.2 Load-balancing with (keyless) hash function h(·)
3.3 Algorithmic Complexity Denial-of-Service Attack
3.4 Load balancing with a collision-resistant hash function
3.5 Keyless collision resistant hash function (CRHF)
3.6 Keyed collision resistant hash function (CRHF)
3.7 Target collision resistant (TCR) hash function
3.8 Use of hash function h to validate integrity of a file
3.9 Second-preimage resistance (SPR)
3.10 One-Way Function (aka Preimage-Resistance)
3.11 A one-time signature scheme, limited to a single bit
3.12 A one-time signature scheme, for an l-bit string (denoted d)
3.13 A one-time signature scheme using ‘Hash-then-Sign’
3.14 Bitwise-Randomness Extractor (BRE) Hash Function
3.15 The Merkle-Tree construction using hash function h
3.16 Illustrating the Merkle Tree's Proof-of-Inclusion (PoI)
3.17 Constructing a dynamic accumulator αD from a static accumulator α
3.18 The digest function of the Merkle-Damgård accumulator
3.19 Compression function h
3.20 The simplified Merkle-Damgård hash hMDwo(x)
3.21 The Merkle-Damgård hash hMD(x)
3.22 The Bitcoin Blockchain
4.1 Using a MAC scheme to authenticate messages
4.2 The CBC-MAC construction
4.3 Combining Encryption, Authentication, Reliability and Compression
5.1 Interactions for the record/session protocols
5.2 Interactions for entity authentication protocols
5.3 The (vulnerable) SNA mutual authentication protocol
5.4 Attack on SNA Mutual Entity Authentication protocol
5.5 The 2PP Mutual Entity Authentication protocol
5.6 Interactions for Authenticated Request-Response Protocols
5.7 The 2PP-RR Authenticated Request-Response Protocol
5.8 The 2RT-2PP Authenticated Request-Response protocol
5.9 The Counter-based RR Authenticated Request-Response protocol
5.10 Time-based Authenticated Request-Response protocol
5.11 The 2PP Key Exchange protocol
5.12 The Kerberos Key Distribution Center Protocol
5.13 The GSM Key Exchange Protocol
5.14 The VN-impersonation attack on GSM
5.15 The GSM Key Exchange Protocol with cipher suite negotiation
5.16 Simplified downgrade attack on GSM Key Exchange
5.17 A ‘real’ downgrade attack on GSM Key Exchange
5.18 The Forward-Secrecy 2PP Key Exchange protocol
5.19 Result of running the Forward-Secrecy 2PP Key Exchange
5.20 Running the recover-security Key Exchange protocol for five periods
5.21 Relations between notions of resiliency to key exposures
5.22 Simplified SSL
6.1 The discovery of Public-Key Cryptography
6.2 Operation of two-flows key exchange protocols
6.3 Hybrid encryption
6.4 Physical Key Exchange Protocol
6.5 The (insecure) XOR Key Exchange Protocol
6.6 The (insecure) Exponentiation Key Exchange Protocol
6.7 The Modular-Exponentiation Key Exchange Protocol
6.8 The Diffie-Hellman Key Exchange Protocol
6.9 MitM attack on the DH key-exchange protocol
6.10 The Generalized Diffie-Hellman Key Exchange Protocol
6.11 The Auth-h-DH protocol
6.12 The DH-Ratchet protocol
6.13 The DH public key cryptosystem
6.14 The Hashed DH public key cryptosystem
6.15 The El-Gamal Public-Key Cryptosystem using a DDH group
6.16 Privacy-preserving voting using homomorphic encryption
6.17 RSA encryption: textbook RSA (vulnerable) vs. padded RSA
6.18 Simplified-OAEP-padded RSA Encryption
6.19 OAEP padding
6.20 The chosen-ciphertext side-channel attack (CCSCA) model
6.21 Bleichenbacher's attack on RSA
6.22 Public key certificate issuing and usage processes
6.23 How not to ensure resilient key exchange (for Ex. 6.20)
6.24 Insecure ‘robust-combiner’ authenticated DH protocol, studied in Exercise 6.21
6.25 Insecure variant of the DH-Ratchet Protocol, for Ex. 6.26
7.1 A simplified overview of the operation of TLS
7.2 Placement of TLS in the TCP/IP protocol stack
7.3 Phases of a TLS connection
7.4 The Authenticate-then-Encrypt (AtE) record protocol of SSL and TLS
7.5 Padding in the record protocol of SSL and TLS
7.6 The CPA-Oracle Attack model
7.7 The AEAD Record Protocol (TLS 1.3)
7.8 ‘Basic’ SSLv2 handshake
7.9 SSLv2 handshake, with ID-based session resumption
7.10 The ‘basic’ RSA-based handshake, from SSLv3 till TLS 1.2
7.11 SSLv3 to TLSv1.2: static DH handshake
7.12 SSLv3 to TLSv1.2
7.13 SSLv3 to TLS1.2 handshake, with ID-based session resumption
7.14 Ticket-based session resumption
7.15 Client authentication in SSLv3 to TLS1.2
7.16 SSLv2 handshake, with details of cipher suite negotiation
7.17 Example of SSLv2 cipher suite negotiation
7.18 Cipher suite downgrade attack on SSLv2
7.19 The Logjam cipher suite downgrade attack
7.20 The SSL-Stripping Attack
7.21 TLS 1.3 1-RTT full Diffie-Hellman handshake
7.22 TLS 1.3 Full (1-RTT) Pre-Shared Key (PSK) handshake
7.23 TLS 1.3 Zero-RTT Pre-Shared Key (PSK) Handshake
8.1 PKI Entities and typical application for Web PKI
8.2 Example of the X.500 (and X.509) Distinguished Name (DN) Hierarchy
8.3 The Identifiers Trilemma
8.4 X.509 version 1 certificate
8.5 X.509 version 3 certificate
8.6 A single-hop (length one) certificate-path
8.7 A three-hop certificate-path
8.8 Example of the use of Name Constraint
8.9 Example of the use of name constraint with dNSName
8.10 X.509 Certificate Revocation List (CRL)
8.11 The Online Certificate Status Protocol (OCSP)
8.12 OCSP used by relying party
8.13 The MitM soft-fail Attack on a TLS connection using OCSP
8.14 OCSP stapling in the TLS protocol, using the CSR TLS extension
8.15 MitM soft-fail attack on OCSP-stapling TLS client (browser)
8.16 Use of the TLS-feature X.509 extension indicating Must-Staple
8.17 Certificates-Merkle-tree variant of OCSP
8.18 Signed revocations-status Merkle-tree
8.19 Optimizing OCSP response, using tree-of-revocations and Proof-of-Non-Inclusion
8.20 The issue process for both HL-CT and AnG-CT
8.21 Monitoring in HL-CT
8.22 Omitted-certificate attack by a rogue logger and rogue CA
8.23 Split-world attack on AnG-CT
8.24 Zombie-certificate attack
8.25 NS-CT issue process, with a (possibly corrupt) logger
8.26 NS-CT detection by monitor of a misleading certificate
8.27 NTTP-Security Certificate Transparency (NS-CT)
8.28 NS-CT defending against the omitted-certificate attack, by providing a Proof-of-Misbehavior of a rogue logger
8.29 Split-world attack by a rogue logger against NS-CT (incorrectly) deployed without gossip
8.30 Inter-monitor gossip
9.1 Exposing user's password via DNS poisoning, for a login page sent over http
9.2 Typical phishing attack luring a user to an impersonating website
9.3 Using AI (ChatGPT) to generate a personalized password dictionary
9.4 Password validation using a cryptographic co-processor
9.5 The Naïve-PAKE Protocol
9.6 The Naïve-DH-PAKE Protocol
9.7 A MitM offline dictionary attack against the Naïve-DH-PAKE Protocol
9.8 The EKE2 PAKE Protocol

List of Tables

1.1 Notations used in this manuscript
2.1 Cryptanalysis Attack Models
2.2 Table for Exercise 2.10
2.3 Do-it-yourself table for selecting random permutations ρ1, ρ2 over domain D = {0,1}^2
2.4 Comparison between random function, random permutation, PRG, PRF, PRP and block-cipher
2.5 Encryption Modes of Operation using an n-bit block cipher
2.6 Ciphertexts for Exercise 2.25
3.1 Goals and Requirements for keyless cryptographic hash functions
3.2 Comparison: PRF, KDF and Extractor hash functions
4.1 Authentication schemes: MAC, Authenticated Encryption (AE, AEAD) and Signatures
5.1 Authenticated Request-Response (RR) protocols
5.2 Comparing the impact of transmission rates to the impact of the number of round-trips required by a protocol, assuming a typical RTT (round-trip time) of 100 msec and an overall transmission of 50 KByte
5.3 Notions of resiliency to key exposures of key-setup Key Exchange protocols
6.1 Key length and computing time for asymmetric and symmetric cryptography
6.2 Resiliency to key exposures of Key Exchange protocols
7.1 Derivation of connection keys and IVs, in SSLv3 to TLS1.2
7.2 Important TLS/SSL attacks due to specification vulnerabilities
7.3 Table for Exercise 7.15
8.1 Standard keywords/attributes in X.500 Distinguished Names
8.2 Standard certificate policies
8.3 Comparison of revocation-checking mechanisms
8.4 Revocation-related parameters
8.5 Notable Web PKI Certificate Authority Failures
9.1 Operations required for offline dictionary attacks against a hashed password file
A.1 Euler's function ϕ(n)

List of Labs

Lab 1 (Using cryptography to validate downloads)
Lab 2 (Ransomware and Encryption)
Lab 3 (Checksum and CRC Collisions)
Lab 4 (Breaking textbook and weakly-padded RSA)
Lab 5 (Password Cracking and Crypto Hash Functions)

List of Principles

Principle 1 (Attack model principle: assume capabilities, not strategy)
Principle 2 (Kerckhoffs' principle)
Principle 3 (Conservative design and usage)
Principle 4 (Limit usage of each key)
Principle 5 (Sufficient effective key length)
Principle 6 (Random function design method)
Principle 7 (Key separation)
Principle 8 (Cryptographic Building Blocks)
Principle 9 (Minimize plaintext redundancy)
Principle 10 (Key Separation)
Principle 11 (Crypto-agility)
Principle 12 (Minimize use of public-key cryptography)
Principle 13 (Secure extensibility by design)
Principle 14 (The KISS Principle)
Principle 15 (Minimize the attack surface)
Principle 16 (The UX>Security Precedence Rule)
Principle 17 (Soft-fail security is insecure)
Principle 18 (Bellovin's principle: secure is as securely used)
Principle 19 (Fail-safe defaults, or: Security Should be Default, and Defaults Should be Secure)
Principle 20 (Defend, don't ask and don't warn)
Principle 21 (Use click-whirr responses to improve security)
Principle 22 (Alerts Should Wake Up)
The next three chapters cover areas of cybersecurity which are closely related to cryptography, focusing mostly on the cryptographic aspects. Two chapters introduce network security: the important TLS protocol in Chapter 7, and Public Key Infrastructure (PKI) in Chapter 8. Then, Chapter 9 covers the critical topic of human-centered cryptography; too often this aspect is not sufficiently taken into account, and cryptography is circumvented by exploiting human behavior and psychology. We conclude the book in Chapter 10, briefly discussing some advanced topics such as secret sharing, privacy and anonymity, elliptic-curve cryptography and quantum (and post-quantum) cryptography, as well as some of the important aspects of cybersecurity which are beyond this text.

The use of background such as math and theory in this textbook is limited to what appears to be essential or helpful for understanding, at least for a significant fraction of readers. Some of this background is covered in Appendix A. Study of this textbook may be followed by study of other areas in cybersecurity, or by in-depth study of cryptography. There are multiple excellent in-depth textbooks on cryptography or specific cryptographic topics; some of my favorites are [16, 165, 166, 205, 309, 370].

Acknowledgments

I received a lot of help in developing this textbook from friends, colleagues and students; I am very grateful. Specific thanks to:

• The students and my peers at the University of Connecticut, who have been very understanding and supportive.

• Sara Wrotniak, my PhD student, who gave me incredible feedback when she studied using this textbook as an undergrad, and later when I asked her to review the text further.

• Professors and researchers who provided valuable feedback, with or without teaching a course using the textbook, including: Ghada Almashaqbeh, Nimrod Aviram, Ahmed El-Yahyaoui, Peter Gutmann, John Heslen, Walter Krawec, Laurent Michel, David Pointcheval, Ivan Pryvalov, Zhiyun Qian, Amir and Luba Sapir, Jerry Shi, Haya Shulman, Ewa Syta, and Ari Trachtenberg. Walter has also kindly contributed the section on quantum cryptography (Section 10.4).

• Instructors and teaching-assistants who used the text and provided important feedback: Justin Furuness, Hemi Liebowiz, Sam Markelon, and Anna Mendonca. I'm especially indebted to Anna for collecting excellent feedback from her students - and for introducing me to Sara.

• Many other readers who helped with feedback, corrections and suggestions, including Pierre Abbat, Yanjing Xu, Bar Meyuchas, Yike (Nicole) Zhang, and many others.

• The LaTeX and TikZ communities, who provide amazing resources and support. Special thanks to Nils Fleischhacker for the cool 'tikz people' package, and to Shaanan Cohney who kindly sent me the LaTeX source code from [100], which I turned into Figure 2.37.

I probably forgot to mention some important contributors; please accept my apologies and let me know.
Indeed, in general, please let me know of any omission or mistake in the text, and accept my thanks in advance. Many thanks to all of you; feedback is greatly appreciated, definitely including harsh criticism.

Special thanks to my friend, PhD adviser and mentor, Prof. Oded Goldreich, who endured the challenge of trying to teach me both cryptography and proper writing. I surely cannot meet Oded's high standards, but I believe and hope that this textbook does help to provide the necessary foundations of applied cryptography to practitioners of cybersecurity. Laying foundations is definitely a goal I share with Oded. Let me quote Oded's father:

It is possible to build a cabin with no foundations, but not a lasting building.
Eng. Isidor Goldreich

Last but not least, I wish to thank the incredible support and understanding of my family throughout the years, especially my amazing parents, Ada and Michael, my lovely and supportive wife Tatiana, and my wonderful and talented children, Raaz, Tamir, Omri and, last but not least, Karina.

Chapter 1
Introducing Cybersecurity and Cryptography

Cybersecurity and cryptography are exciting topics, with rich history and with an extensive impact on the practical world, as well as fascinating challenges of engineering, theory and mathematics. In this textbook, our goal is to introduce cybersecurity, focusing on the area of applied cryptography, which is essential to most or all areas of cybersecurity. We believe that this basic knowledge in applied cryptography is desirable for any student of computer science, but especially for students who plan to focus on any area of cybersecurity or, of course, students planning to focus on cryptography.

This chapter provides several sections with important foundations for the rest of this book. The first section (Section 1.1) introduces the areas of cybersecurity and cryptography, their goals and approaches, and a few basic principles. Section 1.2 provides an informal introduction to two basic cryptographic mechanisms, encryption and signatures. Section 1.4 provides essential background and notation, used throughout the text; additional background is provided in Appendix A. In Section 1.5 we discuss the challenge of defining security of cryptographic schemes, focusing on the important definition of security for signature schemes. Finally, Section 1.6 provides a few gems from the fascinating history of cryptography and cybersecurity, which, we believe, may provide an interesting perspective.

1.1 Cybersecurity and Cryptography: the basics

The high-level goal of cybersecurity is to protect 'good' parties, who use computing and communicating systems legitimately, from damages due to misbehaving entities, to whom we usually refer as attackers or adversaries, who abuse the systems to harm the users and/or to obtain an unfair advantage or benefit. Attackers could be rogue authorized users, often referred to as insiders, or outsiders, who only have some access to the systems or the communication, and are not authorized users. Attackers may also control one or multiple devices. Such devices may be legitimately owned by the attacker, or be corrupted, i.e., controlled by the attacker in spite of not being legitimately owned by the attacker; corrupted devices are often referred to as being pwned¹ by the attacker.

¹The verb 'pwn' means to control or 'own' a computing system illegitimately, by exploiting a vulnerability. The verb 'pwn' is taken from gamers' slang, where it was originally used for a player who is completely dominated by an opponent; its origin may come from chess and combine the chess 'pawn' with the verb 'own'. See [310].
Unfortunately, it is not so easy to turn this high-level goal into precise definitions of security, allowing evaluation of the security of different systems and defenses. It is also challenging to design schemes that satisfy these definitions, and to prove their security. In this section, we discuss the main approaches to ensure and define security. But first, in subsection 1.1.1, we introduce the three cybersecurity functions: prevention, detection and deterrence.

1.1.1 Three Cybersecurity Functions: Prevention, Detection and Deterrence

Figure 1.1 illustrates the three cybersecurity functions, i.e., basic ways of protecting against attackers: prevention, detection and deterrence.

Figure 1.1: Cybersecurity defense approaches: Prevention, Deterrence and Detection.

Prevention: mechanisms that prevent an attacker from causing damage, or that reduce the amount of possible damage. Encryption is an example of a cryptographic prevention mechanism; it is usually used to prevent an attacker from disclosing sensitive information. When possible, prevention is obviously the best defense - 'an ounce of prevention is worth a pound of cure'. However, sometimes it is impossible to completely prevent an attack. For example, the TLS protocol (Chapter 7) relies on public key certificates issued and signed by a trusted party, called a Certificate Authority (CA); attacks due to a misbehaving CA cannot be completely prevented. Instead, such attacks are mitigated using detection mechanisms - and deterrence, with significant penalties for misbehaving CAs, as we discuss in Chapter 8. Even when a cryptographic mechanism seems to completely prevent an attack, e.g., when using encryption, it is still prudent to also deploy detection mechanisms, since hidden, subtle vulnerabilities may exist even in thoroughly reviewed designs and systems.

Deterrence: mechanisms designed to discourage (deter) attackers from attacking. Deterrence is achieved either by the (visible) use of strong defenses, making it futile to attack, or by penalizing misbehaving entities. Penalizing requires attribution of the attack or misbehavior to the correct entity. Digital signature schemes are an important cryptographic deterrence mechanism, which we discuss later in this chapter (subsection 1.2.3) and, in more detail, in chapters 4 and 6. A signature verified using the misbehaving party's well-known public key, over a given message, provides evidence that this party signed that message; we refer to such evidence of abuse as Proof of Misbehavior (PoM). PoM can be used to punish or penalize the misbehaving party in different ways - an important deterrent.

Note, however, that deterrence can only be effective against a rational adversary; a penalty may fail to deter an irrational adversary, e.g., a terrorist. Deterrence is effective if the adversary is rational, and would refrain from attacking if her expected profit (from the attack) would be less than the expected penalty. In practice, most attackers are rational; hence, good deterrence is an effective defense.
However, a challenge is that attackers do their best to avoid being detected and penalized; deterrence is only effective when combined with strong penalties - and effective detection.

Detection: by detection we refer to security mechanisms that detect an attack, usually while the attack is ongoing. Detection allows deployment of additional defenses, which are not deployed otherwise (e.g., due to costs). Detection is often a key element in ensuring security. First, detection is required to penalize attackers, and hence effective detection is key to effective deterrence. Second, detection allows reaction, such as using additional defense mechanisms and performing operations to recover security and minimize damages. Therefore, security systems often invest a lot in detection, while attackers usually do their best to avoid detection, including refraining from attacks and actions that may result in detection.

Detection is a major component of important, widely deployed network-security and host-security defenses such as Intrusion Detection Systems (IDS) and honeypots. Detection is not applied as much in cryptography; however, we present some examples of the use of detection which are related to cryptographic security. One example is the PKI detection mechanisms which we discuss in Chapter 8. Another example is the combination of an Error Detection Code and a Message Authentication Code, used to detect attempts to attack authenticated communication, as illustrated in Figure 4.3. Finally, we discuss the detection of password exposure in subsection 9.4.7.

1.1.2 Generic Security Goals

Different systems and schemes often have different security goals; however, some goals are generic, and apply, possibly with some variations in details, to many systems and schemes. The generic goals, illustrated in Figure 1.2, include confidentiality, integrity, authenticity, availability and non-repudiation. Three of them (confidentiality, integrity, and either authenticity or availability) are often referred to as the CIA triad, an easily-memorable acronym.

Figure 1.2: The Generic Security Goals: Confidentiality, Integrity, Authenticity, Availability and Non-repudiation (which extends authentication). The first three are sometimes referred to as the CIA triad. Non-repudiation is an extension of authentication. For each goal, we list some of the corresponding cryptographic mechanisms covered in this textbook: for confidentiality, encryption - symmetric (Chapter 2) and asymmetric or public-key (Chapter 6); for integrity, hash functions, accumulators (e.g., Merkle-tree) and blockchain schemes (Chapter 3); for authenticity, Message Authentication Codes (MAC); for availability, Proof-of-Work (PoW, Section 3.10.2) and Public-Key Infrastructure (PKI, Chapter 8); and for non-repudiation (and authentication), digital signatures.

Confidentiality. A system satisfies the confidentiality goal if it prevents an attacker from disclosing some information defined as confidential. The 'classical' confidentiality mechanism is encryption, illustrated in Figure 1.4, with two main variants: shared key cryptosystems, also referred to as symmetric encryption, and public key cryptosystems, also referred to as asymmetric encryption.

Integrity. Ensuring integrity means prevention or detection of unauthorized operations, such as the modification of data.
Integrity is applied to a wide range of situations, including the integrity of a computer system, information stored Applied Introduction to Cryptography and Cybersecurity 1.1. CYBERSECURITY AND CRYPTOGRAPHY: THE BASICS 9 or transmitted, and more. We cover several cryptographic integrity mechanisms in Chapter 3, including cryptographic hash functions, Merkle digest schemes and blockchains. Authenticity and Non-repudiation. Authentication mechanisms validate that a particular information was originated from a speciőc entity, or that a particular interaction involved a speciőc entity. A unique variant of authentication is non-repudiation, which provides evidence of the origin of information, that can be presented, later, to a third party, to ‘prove’ the identity of the origin. We cover the Message Authentication Codes (MAC) schemes which provides authentication, and signature schemes, which provides non-repudiation (and authentication). Note: we introduce signature schemes already in this chapter (subsection 1.2.3), and discuss the important application of cryptographic hash functions and signature schemes to ensure software integrity and authenticity, as a main defense against malware, in Lab 1. Availability. Availability mechanisms ensure that services can be provided efficiently, even if an attacker tries to disrupt services, in what is called Denialof-Service (DoS) attacks. DoS attacks have become a major concern for network and service providers, and defenses against them are a major challenge of network security; however, there is only limited use of cryptography schemes in defenses against DoS. In fact, some cryptographic protocols are vulnerable to DoS, i.e., a DoS attack may disrupt the security service, potentially resulting in a vulnerability. We discuss this concern for Public Key Infrastructure (PKI), in Chapter 8. We explain how some PKI designs, such as CRLs and CRVs, are resistant to such DoS attacks, while others, e.g., OCSP, may be vulnerable or even abused to create a DoS attack. Bespoke security goals. The generic goals apply to many cybersecurity systems. The security goals of speciőc systems and schemes usually include these generic goals, possibly with some adaptation, as well as addition, system-speciőc goals. 1.1.3 Attack Model Security goals should be deőned and evaluated with respect to the capabilities of the attackers, rather than by assuming a speciőc attacker strategy. This is an important principle of modern cryptology, which already appears in early publications such as the seminal paper of Diffie and Hellman [123]: understand and deőne a clear model of the attacker capabilities (attack model), and then deőne goals/requirements for the scheme/system - against attackers with the speciőed capabilities, regardless of the attacker’s strategy (choices). This principle applies not only in cryptology, but in general in security. Precise articulation of the attack model and of the security requirements is fundamental to the design and analysis of security. Let us re-state this important principle. Applied Introduction to Cryptography and Cybersecurity 10 CHAPTER 1. INTRODUCING CYBERSECURITY AND CRYPTOGRAPHY Principle 1 (Attack model principle: assume capabilities, not strategy). Security requirements should be defined and analyzed with respect to a well defined attack model, which specifies any restrictions of the capabilities of the attacker. The attack model should not restrict the attacker’s strategy. 
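As a toy illustration of this principle (our own sketch, not a construction from the textbook), the following Python code evaluates a forgery game for a shared-key authentication (MAC) scheme in which the adversary is an arbitrary function: the game fixes only the adversary's capabilities - here, at most q queries to a tagging oracle - and quantifies over every possible strategy. The names `forgery_game`, `tag_oracle` and `guessing_adversary` are hypothetical, chosen only for this example.

```python
import hmac, hashlib, secrets

def forgery_game(adversary, q=10):
    """Run one forgery experiment against `adversary`, which may use any strategy
    but is limited to q queries to the tagging oracle (its only capability)."""
    key = secrets.token_bytes(32)
    queried = []

    def tag_oracle(msg: bytes) -> bytes:
        if len(queried) >= q:
            raise RuntimeError("query budget exceeded")
        queried.append(msg)
        return hmac.new(key, msg, hashlib.sha256).digest()

    msg, tag = adversary(tag_oracle)          # adversary outputs an attempted forgery
    fresh = msg not in queried                # forgery must be on an unqueried message
    valid = hmac.compare_digest(tag, hmac.new(key, msg, hashlib.sha256).digest())
    return fresh and valid

# One (unsuccessful) strategy: query a message, then guess a tag for a fresh one.
def guessing_adversary(tag_oracle):
    tag_oracle(b"hello")
    return b"a fresh message", secrets.token_bytes(32)

print(forgery_game(guessing_adversary))       # almost certainly False
```

The point of the sketch is that `adversary` may be any function at all; only its access to the oracle is restricted, mirroring the attack model principle.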
We discuss several attack models in this textbook, and three of them, which apply to digital signatures, already in this chapter, in subsection 1.5.2. The weakest of them is the Key-Only Attack model, where the attacker is only given the public key, and the strongest is the Chosen Message Attack model, where the attacker is even given the ability to ask for signatures for message of its own choosing. As you can see, the models only refer to the capabilities of the attacker, and not to any speciőc attack strategy or even to the requirements; we discuss the security requirements of signature schemes separately, in subsection 1.5.8. The attack model principle applies also to areas of cybersecurity where it is hard to deőne a rigorous attack model. Even in such scenarios, it is important to limit our assumptions to the attacker capabilities, rather than assuming a speciőc attacker strategy. However, where possible, a well-deőned attack model, allowing provable security, is better - if this is done ‘correctly’, as we discuss next. 1.1.4 Provable Security In most areas of science and engineering, design and evaluation are based on experimental analysis, measuring the expected outcomes of the system under typical scenarios. In contrast, security should be ensured against an adversary (attacker), who is not bound to behave in some typical way; new attacks may be very different from past attacks. Observing the behavior of adversaries and designing defenses based solely on these behaviors may leave subtle or even gaping vulnerabilities that can be exploited by adversaries. That said, when a history of previous attacks is available and shows a clear pattern, it does make sense to evaluate defenses against the same type of attacks - in addition to evaluation against arbitrary attacker strategy, as per Principle 1. It is challenging to ensure security against an arbitrary attacker strategy, only limiting the attacker’s capabilities. There are many subtle ways in which our intuition and imprecise arguments may fail, resulting in vulnerabilities. Cryptography and security are exceptions to the popular saying ‘in theory there is no difference between theory and practice, while in practice there is’ [82]. In fact, in security, and especially in cryptography, precise deőnitions and proofs of security, or at least extensive, clear analysis, are necessary to ensure that a system is secure against arbitrary attacker strategies. This approach is usually referred to as provable security. That said, one has to be very careful to understand the limitations and pitfalls of proofs, discussed in a series of papers by Koblitz and Menezes, summarized in [237]. Proofs of security often involve simpliőcations and assumptions, even simplifying ‘assumptions’ known to be incorrect. An important example of such simpliőcation is the Random Oracle Model (ROM), discussed in Section 3.6. Applied Introduction to Cryptography and Cybersecurity 1.1. CYBERSECURITY AND CRYPTOGRAPHY: THE BASICS 11 Using the ROM, a protocol using a speciőc cryptographic hash function, e.g., SHA-1, is analyzed as if the function used was selected at random. Clearly, this is not a correct assumption; yet, a proof using such assumption, gives some indication of security, or at least, limits the possible types of attacks. Simpliőcations can result in a system has a proof of security - leading people to trust its security - however, in reality, this system is vulnerable and exploited. 
One reason this happens is that proofs are based on models of the attacker capabilities and of the system, which often do not fully capture reality; attacks often cleverly abuse and exploit exactly these aspects which were glossed-over and abstracted away. Furthermore, efforts to consider more realistic aspects tend to make analysis, proofs, and provably-secure designs, harder and more complicated. As a result, even for an expert, proofs may be challenging and require extensive effort - to write and to validate. Complex proofs may have subtle errors, which may remain undiscovered for years. Proofs should be clean and enlightening, while reality is dirty and murky; and viruses, bugs, and vulnerabilities thrive in the dirt and the dark. Another challenge for provable security, is to correctly identify and rigorously deőne the security requirements (goals). Rigorous deőnitions of security requirements are often challenging - to deőne and to understand. However, clear deőnitions of security requirements are important, for several reasons: 1. To prove the security of the cryptographic scheme. 2. To prove the security of an application of the scheme, e.g., as part of another cryptographic scheme or of a cryptographic protocol. 3. To allow researchers to explore attacks against the system and demonstrate that requirement are not met. 4. To avoid vulnerabilities in a system using the scheme, due to incorrect usage of the scheme, namely, avoid incorrect use of cryptographic mechanisms. The last item is worth extra attention; incorrect usage is a common reason for cryptography-related vulnerabilities, and the exploits are often devastating. In this textbook, we will discuss several such weaknesses - in widely deployed standard and products. We believe that one of the main reasons for these failures is the fact that the system designers were not sufficiently familiar with the security properties of the cryptographic schemes they used. Our hope is that readers of this textbook will learn enough to allow them, when using cryptographic schemes, to understand their properties and how to securely use them - and to know when they need to consult with a more knowledgeable cryptographer. This will reduce the likelihood of usage errors. The challenge of rigorously deőning security requirements and proving security, also implies that protocols and mechanisms are sometimes used without a rigorous deőnition and/or a proof. It is sometimes necessary, for functionality, business or performance considerations, to take the risk of a subtle vulnerability, Applied Introduction to Cryptography and Cybersecurity 12 CHAPTER 1. INTRODUCING CYBERSECURITY AND CRYPTOGRAPHY which could exist due to informal requirements or analysis. However, as explained by Koblitz and Menezes, when feasible, there are important advantages to follow the provable security approach, with deőnitions of requirements, models and assumptions, and proofs security. The provable security approach helps to avoid, or at least reduce, vulnerabilities and attacks. In particular, a proof of security - often, even a ŕawed one - rules out many possible vulnerabilities. In fact, vulnerabilities are often discovered, and circumvented, when researchers work toward a proof of security. There are also several countermeasures which may help to address the concerns about provable security: 1. The use of computer-aided cryptography [24], i.e., automated tools to aid in generation of proofs and in veriőcation of the correctness of proofs. 2. 
The study of attacks and in particular of cryptanalysis, i.e., attacks against cryptographic mechanisms. Researchers studying attacks on systems would often ‘think outside the box’ and exploit subtle vulnerabilities, which may be hard to identify when we only consider the proofs of security. Attacks also provide a quantitative measure of insecurity, which may be matched against a proven quantitative measure of security, identifying possible gaps which may allow improved proofs - or improved attacks. 3. Improved understanding of the value, as well as the limitations, of provable security, and better intuition, that help practitioners make use of provable security - while avoiding subtle pitfalls. We hope that this textbook will aid in the development of such improved understanding and intuition, even to readers who will not proceed to learn the theory of cryptography. Note that while our discussion focused on cryptographic deőnitions and proofs, these considerations apply to cybersecurity in general. However, rigorous deőnitions and proofs are much less common in other areas of cybersecurity, possibly since it is harder there to őnd useful simpliőcations. We consider this as a good motivation for studying cryptography, as a way to develop the ‘adversarial thinking’ so essential in all areas of cybersecurity. To sum up, a proof of security is a powerful tool, but, like other power tools, should always be used carefully and correctly. Namely, we must fully understand the deőnitions, assumptions and simpliőcations, and never assume additional properties or applicability to scenarios where the assumptions do not hold. Therefore, cybersecurity experts must master both theory and practice. A fair amount of skepticism, paranoia and humility is also advisable. 1.1.5 Risk and costs-benefit analysis The management of computation and communication systems involves multiple challenges, including security risks. Security managers have to consider the costs of deploying security mechanisms, against the probability of an attack and the expected damages from possible attacks, if the mechanisms are not Applied Introduction to Cryptography and Cybersecurity 1.2. THE BASIC MECHANISMS: ENCRYPTION, SIGNATURES AND HASHING 13 deployed. Risk analysis is an attempt to estimate the probabilities of occurrence of different attacks, and of the attacks being successful; and cost-beneőt analysis uses these values, as well as the costs of deploying different security mechanisms, to decide which security mechanisms are worth deploying. In this textbook, we do not further discuss risk and cost-beneőt analysis. The reason is that we are not aware of a sufficiently reliable and generally applicable methodology for such analysis. It seems to the author that practitioners often use crude approximations based on their experience, common sense and industryadopted estimates. Therefore, we focus on design and analysis of secure systems, and on potential vulnerabilities and attacks; we mostly ignore the risks of different attacks and the costs of different defenses. There are a few exceptions; e.g., we focus on computationally efficient schemes and adversaries, where by ‘efficient’ we mean ‘whose runtime is bounded by a polynomial in the size of their inputs’; see more in Section A.1. 1.2 The basic mechanisms: encryption, signatures and hashing In this section, we provide a birds-eye introduction to two basic cryptographic mechanisms: cryptosystems (symmetric and asymmetric) and signature schemes. 
We also introduce Kerckhoffs' principle, a fundamental design principle for cryptographic schemes, which is mostly applied also to other cybersecurity defenses.

1.2.1 Encryption: symmetric and asymmetric cryptosystems

Encryption schemes, also referred to as cryptosystems or ciphers, are the oldest and most well known cryptographic mechanism - and the main mechanism for ensuring confidentiality. Encryption transforms sensitive information, referred to as plaintext, into a form called ciphertext, which allows the intended recipients to decrypt it back into the plaintext; the ciphertext should not expose any information to an attacker. The focus on encryption and confidentiality is evident in the term cryptography, i.e., 'secret writing', which is often used as a synonym for cryptology. In fact, for some reason, in recent years, cryptography seems to be the more common term; so we also use it.

In all but some ancient (and completely insecure) cryptosystems, the encryption and decryption operations use a key. In symmetric cryptosystems, encryption uses the same (secret) key as used for decryption, often denoted k. In contrast, in asymmetric cryptosystems, only the decryption key, denoted d, must be private, and the (related but different) encryption key, denoted e, can be published, i.e., is public. Due to these properties, symmetric cryptosystems are also called shared-key cryptosystems, and asymmetric cryptosystems are also called public key cryptosystems (PKCs). We study shared-key cryptosystems in Chapter 2, and public-key cryptosystems in Chapter 6.

Figure 1.3: Encryption: terms and typical use. Alice needs to send sensitive information (plaintext) to Bob, so that the information will reach Bob - but remain confidential from the attacker, Eavesdropping Eve. To do this, Alice encrypts the plaintext, typically using a key; the encrypted form is called ciphertext. A secure encryption would prevent Eve from learning, from the ciphertext, anything about the plaintext (except how much was sent). However, by decrypting the ciphertext, typically using a key, Bob would recover the sent plaintext. In symmetric cryptosystems, encryption and decryption use the same (shared, symmetric) key, while in asymmetric cryptosystems, encryption uses a public encryption key and decryption uses a different, albeit related, private decryption key.

Figure 1.4: Shared key (symmetric) cryptosystem.

Symmetric (shared-key) cryptosystems use the same key, e.g., k, for both encryption and decryption, as illustrated in Figure 1.4. The key k is chosen as a random bit-string. In the figure, the key length is variable, given as input (denoted l); in practice, many cryptosystems are designed for a specific key length. A symmetric cryptosystem must ensure correctness, i.e., m = Dk(Ek(m)), for every plaintext message m and every key k. The basic security requirement from symmetric cryptosystems is to ensure confidentiality, i.e., an adversary should not be able to learn, from the ciphertext, any information about the plaintext. We present the definition only in subsection 2.7.2, since it involves some subtle aspects.
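To make the (KG, E, D) interface and the correctness property concrete, here is a minimal Python sketch of a shared-key cryptosystem. It uses a one-time pad, i.e., XOR of the plaintext with a random key at least as long as the message; this is only an illustration of the interface and of correctness, m = Dk(Ek(m)), and the function and variable names are ours, not the book's. Note that reusing such a key for more than one message is insecure.

```python
import secrets

def KG(l: int) -> bytes:
    """Key generation: a uniformly random l-byte key."""
    return secrets.token_bytes(l)

def E(k: bytes, m: bytes) -> bytes:
    """Encrypt: XOR each plaintext byte with the corresponding key byte."""
    assert len(m) <= len(k), "one-time pad: key must be at least as long as the message"
    return bytes(mb ^ kb for mb, kb in zip(m, k))

def D(k: bytes, c: bytes) -> bytes:
    """Decrypt: XOR again with the same key (XOR is its own inverse)."""
    return bytes(cb ^ kb for cb, kb in zip(c, k))

k = KG(32)
m = b"attack at dawn"
c = E(k, m)
assert D(k, c) == m   # correctness: m = D_k(E_k(m))
```

The same key k is used by both E and D, which is exactly what makes the scheme symmetric; the harder problem, discussed in the following chapters, is to obtain correctness and confidentiality with a short key that can be reused for many messages.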
Asymmetric (public-key) cryptosystems are illustrated in Figure 1.5. As shown, public key cryptosystems use a pair of two different, but related, keys: a public encryption key e for encryption, and a private decryption key d for decryption. The keypair (e, d) is produced by a key generation algorithm, denoted KG, which is defined as part of the asymmetric cryptosystem. For public-key cryptosystems that support variable key length l, the length l is provided as input to KG (as illustrated). Public key cryptosystems should also satisfy correctness, namely: m = Dd(Ee(m)), for every keypair (e, d) generated by KG.

Figure 1.5: Public key (asymmetric) cryptosystem.

Note that a key generation algorithm is not required for shared-key cryptosystems, since the shared key k can simply be selected as a random string. However, clearly, for asymmetric cryptosystems, we cannot select both the public key e and the (corresponding) private key d at random. Formally, every symmetric cryptosystem can be viewed also as an asymmetric cryptosystem, simply by defining KG as random selection of l bits, which are used for both d and e. However, such a construction will fail to provide the stronger security requirement from public-key cryptosystems, i.e., that an adversary should not be able to learn, from the ciphertext, any information about the plaintext, even when given the encryption key e.

Indeed, it is not trivial to design asymmetric encryption schemes which ensure correctness - and which are not easy to 'break'. Even today, there is only a limited number of different designs for public key encryption schemes. In fact, while the concept of public-key cryptosystems was proposed in a seminal paper [123] by Whit Diffie and Martin Hellman, the first design was published only later, in [334], by Rivest, Shamir and Adleman. We present two of the most well-known public key cryptosystems, RSA and El-Gamal, in Chapter 6. Readers are encouraged to try to come up with their own design for an asymmetric cryptosystem - not necessarily a really secure one, just something which will not be trivial to break. You may find it quite a challenge - and are welcome to peek at Chapter 6 to see how it can be done.

1.2.2 Kerckhoffs' principle

In both symmetric and asymmetric cryptosystems, the key used for decryption must be kept secret. However, what about the algorithms used for encryption and decryption? Knowledge of the algorithms may help an attacker, so, intuitively, it seems that the algorithms should be kept secret. Indeed, some ancient cryptosystems did not use a key at all, and relied entirely on the secrecy of the algorithms; we give a few examples in Chapter 2. Even for keyed cryptosystems, it is harder to attack without knowing the design; see Exercise 2.25. This is often reflected using the expression knowledge is power. Therefore, traditionally, cryptosystems were kept secret, and this secrecy was considered as necessary for their security - an approach we refer to as security by obscurity.

However, in 1883, the Dutch cryptographer Auguste Kerckhoffs realized that 'security by obscurity' has serious disadvantages; in particular, once an attacker obtains a cryptosystem, the system may become completely insecure. This motivated Kerckhoffs to publish the following principle [232], which is now considered the 'basic' rule in applied cryptography.
We extended Kerckhoffs' principle to refer to arbitrary security mechanisms, and not just to cryptosystems or even to cryptographic systems.

Principle 2 (Kerckhoffs' principle). When designing or evaluating the security of (cryptographic) systems, assume that the adversary knows the design, i.e., knows everything except the secret keys.

Kerckhoffs' principle has additional advantages. One advantage is that the resulting design is likely to be more secure, as it was designed against a more powerful attacker - an attacker that knows the details of the design. An even more important advantage is the ability to evaluate the security of the design by experts who were not part of the design team, challenging them to find vulnerabilities; it is often easier for an expert to find a vulnerability in a system designed by somebody else.

Note that Kerckhoffs' principle does not require that the design be made public; it merely allows publication, since it requires that the goal of the designers should be for security to hold even against an attacker that knows the design. In principle, we may follow Kerckhoffs' principle, yet keep the design private, thereby making it 'even harder' to find a vulnerability. Since security does not assume secrecy of the design, we can continue to use the system even if the confidentiality of the design is breached.

However, there are further advantages in going further and relying on the security of public, standard designs. Published, standard designs have the obvious advantage of improving the efficiency of production and use, by allowing interoperable implementations by arbitrary vendors. A more subtle advantage of published, and especially standard, designs is that they facilitate evaluation and cryptanalysis by many experts, and motivate experts to find and publish vulnerabilities. As a result, users will be alerted to vulnerabilities earlier, reducing the risks of using a system with a vulnerability known only to the attacker. Thorough evaluation of security by multiple, motivated experts is the best possible guarantee for security and cryptographic designs - except, possibly, for provably-secure designs. In fact, as mentioned above (and in [237]), even 'provably-secure' designs were found to have vulnerabilities during careful review by experts, due to a mistake in the proof or due to some modeling or other assumption. Therefore, Justice Brandeis' saying that 'sunlight is the best disinfectant' applies also to cryptographic and cybersecurity systems.

This is well demonstrated by two important cryptographic systems whose design was kept confidential, both designed in the 1990s: the GSM mobile telephony network and the CSS encryption for DVDs. In both cases, it did not take very long for the algorithms to be leaked, and quite soon afterwards, they were found to be insecure and vulnerable to successful, practical attacks. The GSM case is particularly interesting and important, with several glaring vulnerabilities, some of which can be considered design errors. The GSM designers did not even plan a proper 'migration plan' for changing from the exposed ciphers and protocols to more secure alternatives, which resulted in devices remaining vulnerable years after the vulnerabilities became known to all experts. We discuss some of these vulnerabilities in Section 5.6.
Sometimes it is feasible to combine the security beneőts of open design and of maintaining the secrecy of the design, by combining two candidate schemes. For example, we use a cryptosystem which is a combination of a published, standard cryptosystems and of another cryptosystem, designed for security (following Kerckhoffs’ principle) but kept conődential (to make attacks harder). We use the term robust combiner for such construction, which ensures security as long as one of the two schemes is not broken; several such robust combiners are known, for different cryptographic schemes; see [191] and subsection 2.5.9. 1.2.3 Digital Signature schemes The goal of cryptosystems (encryption schemes) is to ensure conődentiality, i.e., secrecy of information. Let us now focus on digital signature schemes, whose goal is to ensure the authentication and non-repudiation goals (subsection 1.1.2). Signature schemes play a critical role in applied cryptography; for example, in Lab 1 we show how their critical role in protecting against malware, and in Chapter 8 and Chapter 7 we show their critical role in establishing secure communication, in particular, in the TLS protocol (used, mainly, to protect web communication). Digital signature schemes, like asymmetric encryption, were proposed as a concept in [123], but their őrst published design was in the RSA paper [334]. Later on, we discuss two constructions of signature schemes: the RSA design in Chapter 6, and a hash-based design which is limited to signing a single message, called a one-time signature schemes, in Chapter 3. Handwritten signatures: goals and reality. To provide intuition to the digital signature, let us őrst consider handwritten signatures. Ideally, handwritten signatures should allow everyone to verify a signed document, by comparing the signature on the document to a sample signature, known to be of that signer. The purported security of handwritten signature, is based on two (implicit) assumptions. The őrst assumption is that only the signer herself should be able to sign a document in a way which produces a signature matching her sample Applied Introduction to Cryptography and Cybersecurity 18 CHAPTER 1. INTRODUCING CYBERSECURITY AND CRYPTOGRAPHY Figure 1.6: Digital Signature Scheme. signature. The second assumption is that it is infeasible to change a signed document, without leaving marks that would invalidate the signature. Reality is less ideal; handwritten signatures are forged, hand-signed documents are modiőed. Indeed, for these reasons, there are experts who are called upon to detect forged signatures and documents modiőed after signature - and these experts may fail to make a correct determination - indeed, different experts may disagree. A serious forger may, for example, lure the victim signer into signing a benign-looking document, such as ‘I owe Mal $1’, which may make it easier to modify into ‘I owe Mal $1,000,000’ without leaving marks. It could be quite hard to prove that the document was modiőed after it was signed. Digital signatures and their security. Let us now focus on digital signatures, illustrated in Figure 1.6, and their security properties. Basically, we argue that digital signatures provide the security that is desired, but not really provided, by handwritten signatures. There are some obvious differences between handwritten signatures and digital signatures. 
Obviously, the document and the signature are strings (őles) rather than ink on paper, and the processes of signing and validating are done by applying appropriate functions (or algorithms), as illustrated in Figure 1.6. The signing and veriőcation functions require appropriate keys. The private signing key, s, replaces the unique personal characteristics of the signer, creating the personalized signature; the public verification key, v, replaces the sample signature available to anybody wishing to verify signed documents. This keypair, (s, v), is produced together, using a key generation algorithm, KG. A signature scheme S = (KG, Sign, Verify), therefore, consists of these three algorithms: KG, for producing the keys, Sign, for signing, and Verify, for verifying. These three algorithms are similar to these of a public key cryptosystem (Figure 1.5), and, like there, the key generation algorithm, KG, may receive as input an indication of the required key-length l. The key generation algorithm outputs a pair of keys: a private signing key s, to be known only to the signer, and a corresponding public veriőcation key v, Applied Introduction to Cryptography and Cybersecurity 1.2. THE BASIC MECHANISMS: ENCRYPTION, SIGNATURES AND HASHING 19 to be known to anybody who wishes to verify signatures. The Sign algorithm is given the (private) signing key s and a message m; the output of the signature algorithm, which we denote in Figure 1.6 by σ, is usually referred to as the signature of the message m: σ = Sign s (m). Signature schemes should also ensure security. Intuitively, the security requirement is that an attacker cannot forge messages. Namely, given the public veriőcation key v, an attacker cannot obtain a signature σ such that Vv (m, σ) = True, unless it was also given the private signing key s or the computed value of σ = Ss (m). We provide a precise deőnition of this security requirement in subsection 1.5.1, since this deőnition requires some notations and background that we did not yet discuss. Note, however, that like public-key encryption, it is non-trivial to design a signature scheme (which is not easy to break). Therefore, we present an implementations only much later: of RSA in Chapter 6, and of one-time signatures in subsection 3.4.2. The reader may wonder why do we deőne the security requirements of signature schemes already in this chapter. There are a few reasons: • The security requirements of signature schemes are relatively simple and easy to deőne. By discussing the requirements and deőning them, we demonstrate the tools of probability and computational complexity, which are later used to deőne other cryptographic schemes. • Signatures are one of the most important and widely used cryptographic schemes, yet - they are one of the least known and understood outside the cryptographic community. • Signatures are the main deterrence-based security mechanism, at least in cryptography, and deőnitely in this textbook. Deterrence is not always sufficiently recognized for its potential to ensure security; we discuss this next. Non-repudiation. Everyone can use the public validation key v to verify signatures; this key does not suffice to generate valid signatures! Therefore, a digital signature can provide not only authentication, but also non-repudiation: the signer cannot deny having signed the message. 
As we will see in Chapter 7 and Chapter 8, this property is critical to applications of public-key cryptography, including the TLS protocol, which is key to web-security and other applications. Warning: conflicting use of the term ‘digital signature’. To őnish this high-level introduction to the cryptographic mechanism of digital signatures, we need to warn the reader about a conŕicting use of the same term for a very different mechanism. Speciőcally, the term ‘digital signature’ is often used to refer to the visual appearance of a ‘signature’ in a document, such as a signature included in a PDF őle. This mechanism offers little or no technical defense against forgery; it is quite easy for an attacker to use the same visual ‘signature’ in a different őle, i.e., forgery is often easy. Applied Introduction to Cryptography and Cybersecurity 20 CHAPTER 1. INTRODUCING CYBERSECURITY AND CRYPTOGRAPHY 1.2.4 Applying Signatures for Evidences and for Public Key Infrastructure (PKI) Signatures are asymmetric: signing requires the private signing key, but validation of a signature only requires the corresponding public veriőcation key. This property facilitates two functions which are critical to cybersecurity: provision of evidences and facilitating the Public Key Infrastructure (PKI). Let us brieŕy discuss both of these functions. Signatures facilitate non-repudiation and evidences. The recipient of a signed message ‘knows’ that once she validated a signature, using the veriőcation key v, she would be able to convince other parties that the message was, in fact, signed by the use of the private key corresponding to v. We refer to this property as non-repudiation, since the owner of the private key cannot claim that the ability to verify messages allowed another party to forge a signature, i.e., compute a seemingly-valid signature of a message, without access to the private signing key. The non-repudiation property allows a digitally-signed document to provide an evidence for the agreement of the signer, much like the classical use of handwritten signatures. Indeed, the use of digital signatures to prove agreement, has signiőcant advantages compared to the use of hand-written signatures: Security. Handwritten signatures are prone to forging of the signature itself, as well as to modiőcation of the signed document. If the signature scheme is secure (i.e., existentially unforgeable, see Deőnition 1.6), then production of a valid signature over a document m practically requires the application of the private signing key to sign exactly m. Convenience. Digital signatures can be sent over a network easily, and their veriőcation only requires running of an algorithm. Admittedly, signature veriőcation does involve some non-negligible overhead, but this is incomparably easier than the manual process and expertise required to conőrm handwritten signatures. Later on, digital signatures may be easily archived, backed-up and so on. Non-repudiation is essential for many important applications, such as signing an agreement or a payment order, or for validation of recommendations and reviews. Non-repudiation is also applied extensively in different cryptographic systems and protocols. Legal interpretation of signatures and digitized handwritten signature. 
Digital signatures are covered by legislation in some jurisdictions; however, their legal definition and implications vary significantly between jurisdictions, and often differ considerably from what you may expect based on the cryptographic definitions and properties. For example, many web services use the term 'digital signature' to refer to agreement by a user in a web form, sometimes accompanied by a visual representation of a handwritten signature. Other systems and organizations consider as a 'digital signature' the scanned or scribbled version of a person's signature, which may be better referred to as a digitized handwritten signature. Such services may offer convenience, but not the security of digital signatures (in the sense used in this textbook and by experts). In particular, since digitized handwritten signatures are merely digitally-represented images, they definitely cannot prevent an attacker from modifying the 'signed' document in an arbitrary way, or even reusing the signature to 'sign' a completely unrelated document. From the security point of view, these digitized handwritten signatures are quite insecure - not only compared to cryptographic signatures, but even compared to 'real' handwritten signatures, since 'real' handwritten signatures may be verified with some precision by careful inspection (often by experts).

Signatures facilitate public key infrastructure (PKI) and certificates. Most applied cryptographic systems involve public key cryptosystems (PKCs), e.g., RSA, and key-exchange protocols, e.g., the Diffie-Hellman (DH) protocol, both presented in Chapter 6. In particular, PKCs and key-exchange are central to the TLS protocol (Chapter 7), which is probably the most widely-used and important cryptographic protocol, and the main cryptographic web-security mechanism. However, all of these depend on the use of authentic public keys for remote entities, using only public information (keys). This still leaves the question of establishing the authenticity of the public information (keys).

If the adversary is limited in its abilities to interfere with the communication between the parties, then it may be trivial to ensure the authenticity of the information received from the peer. In particular, if the adversary is passive, i.e., can only eavesdrop on messages, then it suffices to simply send the public key (or other public value). Some designs assume that the adversary is inactive or passive during the initial exchange, and use this exchange to establish information such as keys between the two parties; this is called the trust on first use (TOFU) adversary model. In a few scenarios, the attacker may inject fake messages, but cannot eavesdrop on messages sent between the parties; in this case, parties may easily authenticate a message from a peer, by previously sending a challenge to the peer, which the peer includes in the message. We refer to this as an off-path adversary. Off-path adversaries are mainly studied when focusing on non-cryptographic aspects of network security.

However, all these methods fail against the stronger Man-in-the-Middle (MitM) adversary, who can modify and inject messages as well as eavesdrop on messages. Furthermore, there are many scenarios where attackers may obtain MitM capabilities, and even when this seems harder to believe, it is always better to ensure security against such powerful attackers, following the conservative design principle (Principle 3). To ensure security against a MitM attacker, we must use strong, cryptographic authentication mechanisms.

Signature schemes provide a solution to this dilemma. Namely, a party receiving signed information from a remote peer can validate that information using only the public signature-validation key of the signer. Furthermore, signatures also allow the party performing the signature-validation to first validate the public signature-validation key, even when it is delivered by an insecure channel which is subject to a MitM attack, such as email. This solution is called public key certificates, and we discuss it in Chapter 8.

1.2.5 Cryptographic hash functions

A cryptographic hash function h receives an input string m and outputs a short string h(m). We refer to this output as the hash, fingerprint, digest or checksum of the input string m. Several security properties are defined for cryptographic hash functions; see Chapter 3. The most well-known is collision resistance. Collision resistance means that given the digest h(m) of some string m, it is infeasible to find a different string m′ ≠ m which has the same digest: h(m′) = h(m). The collision resistance property is often used to ensure integrity, e.g., integrity of software downloads; see Lab 1.
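As a concrete illustration (a sketch of ours, not part of the book's labs), the following Python code computes the SHA-256 digest of a downloaded file and compares it to a published digest; any modification of the file would, except with negligible probability, change the digest. The file name and the expected digest value are placeholders for this example.

```python
import hashlib

def sha256_digest(path: str) -> str:
    """Return the SHA-256 digest (as a hex string) of the file at `path`."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical values, for illustration only.
expected = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
if sha256_digest("installer.bin") == expected:
    print("digest matches the published value")
else:
    print("digest mismatch: the file was corrupted or modified")
```

Of course, comparing digests only helps if the expected digest is itself obtained authentically, e.g., signed by the software vendor; this combination of hashing and signatures for software integrity is the subject of Lab 1.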
1.3 Sequence Diagrams and Notations

Notations and figures are essential for precise, effective technical communication. However, it can be frustrating to read text which uses unfamiliar, forgotten or confusing notations. This could be a special challenge for those readers of this text who were not much exposed to some of the notations used in mathematics and the theory of computer science. Unfortunately, there are sometimes multiple notations for the same concept, or multiple conflicting interpretations of the same notation. We tried to choose the more widely used and least conflicting and confusing notations, but that required some difficult tradeoffs. For example, we use the symbol ++ to denote string concatenation, although the symbol || is more commonly used to denote string concatenation in cryptographic literature. The reason for preferring ++ is to avoid confusing readers who are used to the use of the symbol || to denote the logical-OR operator, as in several programming languages.

In this section, we introduce notations and sequence diagrams, which we use in this text. Our choice of notations, as well as of the use of sequence diagrams and their style, is consistent, as much as possible, with common literature. In particular, Table 1.1 presents notations which we use extensively. Please refer to it whenever you see some unclear notation, and alert the author of any missing, incorrect or confusing notation.

Let us discuss separately two important, basic notations: the dot notation, for referring to items within a tuple, and the key notation, for denoting the key when provided as input to a cryptographic function.

The dot notation. We use A.s and A.v to denote the signing and verification keys of Alice, respectively. Here, A stands for 'Alice', and the s and v after the dot represent specific values (keys) associated with Alice.
We find dot notation a convenient way to identify different keys and other values associated with a particular entity. We also use dot notation, when necessary, to avoid ambiguity when referring to the different functions comprising a cryptographic scheme, for example, a signature scheme S = (KG, Sign, Verify), e.g.: (s, v) ←$ S.KG(1^l). Finally, we also use dot notation to refer to different outputs of a function returning multiple values. We use these conventions throughout this textbook.

Table 1.1: Notations used in this manuscript.

S = {a, b, c} : A set S with three elements - a, b and c. Sets are denoted with capital letters.
N, Z, Z+, R : N: natural numbers (integers greater than zero); Z: all integers; Z+: non-negative integers; R: all real numbers.
Zp, Z*p : The sets {0, ..., (p-1)} and {1, ..., (p-1)}, respectively.
{x ∈ X | f(x) = 0} : The subset of elements x ∈ X s.t. f(x) = 0.
(∀x ∈ X)(f(x) > 1) : For all elements x in the set X, holds f(x) > 1. Set X omitted when 'obvious'.
(∃x ∈ X | f(x) > 1) : There is (exists) some x in X s.t. f(x) > 1.
Π_{x∈S} V_x : Multiplication of V_x for every x ∈ S, e.g., Π_{x∈{a,b,c}} V_x = V_a · V_b · V_c. Similar to the use of Σ_{x∈S} for addition.
i! : The factorial of i, defined as: i! ≡ Π_{j∈{1,...,i}} j = 1 · 2 · ... · i.
C ∪ B : Union of sets C and B.
A ⊆ B : Set A is a subset of set B, i.e., a ∈ A ⇒ a ∈ B.
A ⫋ B : Set A is a 'proper' subset of set B, i.e., A ⊆ B but A ≠ B. For example, N ⫋ Z+ ⫋ Z.
A × B : Cross-product of sets A and B, i.e., the set {(a, b) | a ∈ A and b ∈ B}.
0x..., e.g., 0xAF2 : Hexadecimal string, i.e., 0x is followed by a string of hexadecimal digits (from 0 to F).
{a, b}^l : The set of strings of length l over the alphabet {a, b}.
{a, b}* : The set of strings of any length, over the alphabet {a, b}.
++ : Concatenation of strings; abc ++ de = abcde. Note: in cryptographic literature, concatenation is often denoted by ||; we prefer ++ since || is elsewhere used for the logical OR operation.
a^l (e.g., 1^l, 0^l) : String consisting of l concatenations of the letter/sequence a. For example, 0^4 = 0000, 1^3 = 111, and (01)^2 = 0101. Also, 1^l is the number l in unary notation.
|b| : The length of string b; hence, |a^n| = n · |a| and |0^n| = n.
a[i] : The ith most significant character of string a. For a binary string, the ith bit; for a byte-string, the ith byte. E.g., if a = 011, then a[1] = 0 and a[2] = 1.
a[i : j] or a[i . . . j] : Substring containing a[i] ++ ... ++ a[j].
a^R : The 'reverse' of string a, e.g., abcde^R = edcba.
a.b : Dot notation: element b of tuple a (Section 1.3).
x ∧ y : Bitwise logical AND; 0111 ∧ 1010 = 0010.
x ∨ y : Bitwise logical OR; 0111 ∨ 1010 = 1111.
x ⊕ y : Bit-wise exclusive OR (XOR); 0111 ⊕ 1010 = 1101.
x̄ : The bit-wise inverse of binary string or bit x.
x ←$ X : Select element x from set X with uniform distribution.
Pr_{x←$X}(F(x)) : The probability of F(x) to occur, when x is selected uniformly from set X.
A^{B_k(·)}, A^{f_k(·)} : Algorithm A with oracle access to algorithm B or to function f, with key k. Namely, A can give input x and receive B_k(x) or f_k(x). See Definition 1.3.
PPT : The set of efficient (Probabilistic Polynomial Time) algorithms; see Definition A.1.
NEGL(n) : Set of 'negligible functions' in input n ∈ N; see Def. 1.5.
O(f(n)) : Big-O notation, identifies the complexity of an algorithm; see Equation A.1.
⟨ψ⟩_k : String ψ protected ('enveloped') using key k, e.g., by the TLS record protocol.
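For readers who think in code, here is a small Python sketch (ours, not from the textbook) showing how a few of the notations in Table 1.1 map to operations on strings and integers; the variable names are arbitrary.

```python
import secrets

a, b = "abc", "de"
assert a + b == "abcde"              # string concatenation, written a ++ b in the text

x, y = 0b0111, 0b1010
assert x & y == 0b0010               # bitwise AND, x ∧ y
assert x | y == 0b1111               # bitwise OR, x ∨ y
assert x ^ y == 0b1101               # bitwise XOR, x ⊕ y

s = "011"
assert s[0] == "0" and s[1] == "1"   # a[i]; note the text indexes from 1, Python from 0

assert "0" * 4 == "0000"             # 0^4, i.e., l concatenations of a symbol

X = [2, 3, 5, 7]
x = secrets.choice(X)                # x ←$ X: uniform random selection from a set
assert x in X
```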
We use these conventions throughout this textbook.

The key notation. Cryptographic functions often use keys. While the key is, formally, an input to the function like any other input, it is often convenient, and customary, to place the key as a subscript to the function name. For example, we use Sign_s(m) to denote the signing algorithm Sign applied to the message m, using the signing key s. In Figure 1.7 we use this notation together with the dot notation, and write Sign_{A.s}(m) to denote the result of applying the signature algorithm Sign to message m, using the signing key A.s of Alice.

1.3.1 Sequence diagrams

Let us now present sequence diagrams, a widely-used technique for illustrating the interactions between entities over time; sequence diagrams are widely used in cybersecurity, communication protocols and other areas involving interactions between entities. For example, Figure 1.7 is a sequence diagram, like most figures in this textbook, and, indeed, most figures in the cybersecurity, cryptography and networking literature.

A sequence diagram illustrates the progression of events over time. In this figure, and in most figures in this textbook, time proceeds from top to bottom; note that in some other sequence diagrams, time proceeds from left to right. The top of the diagram shows the different parties and processes/algorithms. For example, in Figure 1.7, we have two parties, Alice and Bob, and the three algorithms comprising the signature scheme (KG, Sign, Verify). The arrows represent communication: over the network (between parties) or within the same party (with algorithms). To show network latency (delay), some sequence diagrams may use slanted arrows. However, in this textbook, we mostly ignore communication delays, hence the sequence diagrams mostly use horizontal arrows.

Example: sequence diagram for signature schemes. Figure 1.7 demonstrates sequence diagrams and some of the notations we presented, by presenting a sequence diagram of the initialization and the typical use (signing and verifying) of a signature scheme.

[Figure 1.7: Sequence diagram for the initialization and use of a signature scheme. Alice signs message m with her private signing key A.s, resulting in σ = Sign_{A.s}(m). Alice sends m and σ to Bob, who verifies the signature by computing Verify_{A.v}(m, σ). Since in this example σ is a valid signature of m, the result is Ok; if σ is not a valid signature of m, the result should be Invalid.]
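As a concrete (if informal) counterpart to the flow of Figure 1.7, the following Python sketch runs key generation, signing and verification using Ed25519 signatures from the third-party 'cryptography' package; the package choice and the message are assumptions made for illustration, not part of the text.

    # A sketch of the Figure 1.7 flow (KG, Sign, Verify) using Ed25519 signatures
    # from the 'cryptography' package (pip install cryptography). Illustrative only.
    from cryptography.hazmat.primitives.asymmetric import ed25519
    from cryptography.exceptions import InvalidSignature

    # Key generation (KG): Alice obtains her private signing key A.s and public key A.v.
    A_s = ed25519.Ed25519PrivateKey.generate()   # Alice's private signing key (A.s)
    A_v = A_s.public_key()                       # Alice's public verification key (A.v)

    # Sign: Alice computes sigma = Sign_{A.s}(m) and sends (m, sigma) to Bob.
    m = b"pay Bob 10 coins"
    sigma = A_s.sign(m)

    # Verify: Bob checks Verify_{A.v}(m, sigma); verify() raises InvalidSignature on failure.
    try:
        A_v.verify(sigma, m)
        print("Ok")          # valid signature
    except InvalidSignature:
        print("Invalid")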
1.4 A Bit of Background

Modern cryptography makes extensive use of mathematics, in particular complexity theory, number theory, group theory, and probability. These are large, important and interesting areas, which are often part of the computer science curriculum. However, we believe that it is not essential to study these areas before studying this textbook; the textbook requires only limited use of basic concepts and results from these areas. Instead, we provide the necessary, limited background, for a reader who has not yet learned these areas, in Appendix A. In fact, studying this textbook before studying these topics may provide motivation and prepare the reader for in-depth study of these important and interesting areas.

In this section, we briefly introduce these topics, to allow readers to determine whether they need to learn a bit of any of them, or whether they know enough from prior studies. If readers find it necessary to learn a bit more about these topics, we provide the necessary background in Appendix A. Readers can read the appendices in advance and/or 'as needed', i.e., when the text makes use of the relevant area.

In Section 1.3 and subsection 1.3.1, we introduced notations used throughout this textbook. Notations are important; therefore, we urge readers to read and refer to these sections. Subsection 1.3.1 presents sequence diagrams, a widely-used graphical notation for presenting interactions between entities, which we use extensively to present protocols and attacks, and Section 1.3 presents the convenient dot notation, which identifies values associated with a particular entity, as well as additional notations, many of which may be familiar to readers; these notations are summarized in Table 1.1, for handy reference when reading the text.

1.4.1 A bit of Computational Complexity

Most cryptographic schemes assume restrictions on the computational abilities of the adversary. (There are also some definitions and constructions of unconditionally secure cryptographic schemes; we cover two important unconditionally-secure schemes: One Time Pad (OTP) encryption (Section 2.4) and Secret Sharing (Section 10.1).) The focus on adversaries with restricted computational capabilities allows us to rule out some attacks which require an absurd amount of resources, such as exhaustive search, i.e., trying out all possible keys (see subsection 2.3.1). However, how can we define a clear restriction on the adversary's computational abilities? This can be quite tricky; the theory of computational complexity provides an elegant solution.

In this introductory textbook, we only need to understand some very basic notions and properties from the theory of complexity. Our discussion is restricted to the essential notions; we believe that readers should be able to follow the text even without having learned computational complexity. We list below the main aspects of complexity theory that we use, and explain them in Section A.1. For readers interested in learning more, we recommend one of the many excellent textbooks on computational complexity, e.g., [106, 167], which provide a much more extensive introduction to this important and interesting subject.

The aspects of computational complexity which we use in this textbook, and describe in Section A.1, include:

• The big-O notation and its use for specifying and comparing the asymptotic complexities (overhead) of algorithms. Asymptotic complexity focuses on the overhead of the algorithm as its input size grows toward infinity. This is convenient, since for smaller input sizes, the overhead may be dominated by fixed per-operation and startup factors, which become insignificant as the input size grows; the big-O notation basically ignores these fixed factors. For example, the big-O notation for linear functions is O(n), for quadratic functions it is O(n²), and for exponential functions it is O(aⁿ), where a is the base of the exponent; see the illustration of specific linear, quadratic and exponential functions in Figure 1.8, and the short numeric comparison following this list.

• The definition of Probabilistic Polynomial Time (PPT) algorithms, also referred to as efficient algorithms or polytime algorithms, and the corresponding notion for functions. Basically, an algorithm is efficient, or PPT, if its run time is bounded by some polynomial in its input length.
Therefore, an algorithm is efficient (PPT) if its time complexity is linear, quadratic or any other polynomial in the input length n, e.g., O(nᵃ) for any constant a. In contrast, an algorithm whose time complexity is exponential, e.g., O(2ⁿ), is considered inefficient. For motivation, see how in Figure 1.8 the exponential function exceeds the linear and quadratic functions for sufficiently large input size n.

• The security parameter 1^l, which is the number l encoded in unary (i.e., a string of l bits, each equal to 1). The security parameter is often used as the input to the (randomized) key-generation algorithms; often, the length of the generated key is l. The reason for using a unary input is to allow the key-generation and other algorithms to run in time polynomial in the security parameter l, since the run time of a PPT algorithm is bounded by a polynomial in the length of its input.

• The non-deterministic polynomial-time (NP) class of problems, and the question of whether P = NP.

[Figure 1.8: Comparing linear, quadratic and exponential complexities: the linear function f(n) = 600n + 900 = O(n), the quadratic function g(n) = 80n² + 400 = O(n²), and the exponential function h(n) = 2ⁿ + 100 = O(2ⁿ).]
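To make this asymptotic comparison concrete, the following short Python sketch evaluates the three functions of Figure 1.8 and reports where the exponential one overtakes the other two; the specific functions are taken from the figure, the rest is illustrative.

    # Evaluate the linear, quadratic and exponential functions of Figure 1.8,
    # to see that the exponential term eventually dominates (illustrative sketch).
    def f(n): return 600 * n + 900        # linear, O(n)
    def g(n): return 80 * n * n + 400     # quadratic, O(n^2)
    def h(n): return 2 ** n + 100         # exponential, O(2^n)

    for n in (4, 8, 12, 16, 20, 24):
        print(f"n={n:2d}  f={f(n):>10,}  g={g(n):>12,}  h={h(n):>12,}")

    # First n where the exponential function exceeds both the others.
    n = 1
    while h(n) <= max(f(n), g(n)):
        n += 1
    print("exponential dominates from n =", n)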
1.4.2 A bit of Number Theory and Group Theory

Number theory and group theory are often used in the design and analysis of cryptographic schemes. In this textbook, we use only the tiny subset that is necessary for our study of applied cryptography. The subset of number theory that we need is mostly focused on modular arithmetic, i.e., the computation of expressions involving arithmetic operations over integers, where the operations include modulo operations. Modular arithmetic shares many of the properties of regular arithmetic (over the integers and the real numbers); however, there are also important differences. A reader not familiar with these basic topics is strongly advised to learn them, from Section A.2 or from one of the many good textbooks covering this topic.

One example of a difference between modular arithmetic and regular arithmetic is the subject of multiplicative inverses, which we cover in subsection A.2.2. The multiplicative inverse of a number x is denoted x⁻¹, and is the number satisfying x · x⁻¹ = 1, where we may use modular multiplication or regular multiplication. Over the integers, there are no multiplicative inverses (except for 1, which is its own inverse), while over the reals, every number except zero has an inverse. For modular arithmetic, the situation is a bit more complex: an integer a has a multiplicative inverse modulo an integer m > 0 if and only if a and m are coprime, namely, they do not have a common divisor (except 1). Computing the multiplicative inverse is one of the problems which is widely believed to be computationally hard, i.e., it is believed that there is no efficient (PPT) algorithm to compute multiplicative inverses, but only when applied to numbers which are hard to factor, i.e., numbers chosen as a product of very large random primes. Finding the factors of such numbers is referred to as the factoring problem, and it is considered computationally hard (Section A.1); and when the factors are hard to compute, computing multiplicative inverses is also hard. In fact, this is crucial to the security of the well-known and important RSA public key cryptosystem, which we discuss in Chapter 6. Specifically, in RSA, the private key is the multiplicative inverse of the public key, which means that if it were possible to efficiently compute multiplicative inverses, then RSA would be insecure. For details and more applications of multiplicative inverses, see subsection A.2.2.

In subsection A.2.3, we discuss two important and beautiful results of number theory, which are very important in cryptography: Fermat's and Euler's theorems. Among their applications is the efficient computation of multiplicative inverses, for numbers with known, or small, prime factors. In fact, these theorems can also be used to reduce the complexity of modular exponentiation. This property is key to the design of the RSA public key cryptosystem (see Chapter 6).

Finally, in subsection A.2.4, we introduce basic notions from the domain of group theory, which are used widely in applied cryptography, and a bit in this textbook. In particular, group theory is used to define the discrete logarithm problem, which is another important number-theoretic problem considered computationally hard. The discrete logarithm problem is used as the basis for several public-key schemes, including the important Diffie-Hellman key-exchange protocol (also in Chapter 6).

1.4.3 A bit of Probability

Probabilistic analysis and algorithms are very important for computer science in general and for cryptography in particular. However, luckily, only the very basics are required for our study of applied cryptography. We present this minimal background on probability in Section A.3; for more in-depth coverage, take a course and/or read one of the many excellent textbooks, e.g., [106, 175].

Probability deals with events which result in a value from some predefined set. For simplicity, we only consider a finite set of possible outcomes, and only uniform distributions and independent random variables. We are often interested in the outcome of a randomized (or probabilistic) algorithm A, i.e., an algorithm that can perform random bit-flip operations. We use the notation Pr(π(A)) to denote the probability that predicate π holds for the (randomized) output of A, and the notation y ← A(x) to denote that y is assigned the random outcome of a uniformly-chosen run of A with input x. In Section A.3, we present several simple, yet useful, properties of probability, which we use in this textbook.

1.5 Provable-Security and Definitions

Ensuring security is challenging. It is tempting to identify a list of possible attacks and evaluate security against these; but that is often misleading, resulting in vulnerability to other, unforeseen attacks. Instead, modern cryptography is mostly based on provable security, whose goal is to prove that an attacker with given capabilities is unable, or unlikely, to 'break the security' of a given system or cryptographic scheme. Proving security requires clearly and precisely defining the cryptographic scheme and its interactions, the attacker capabilities, the security requirements, and any additional assumptions.
Only with precise definitions of the scheme, attacker capabilities, security requirements and assumptions can we try to prove security. Note that there are also many opportunities for errors leading to vulnerabilities, either in the proof itself, or in the use of 'incorrect' definitions for attacker capabilities, security requirements or assumptions, e.g., when an attacker has additional capabilities, or when a cryptographic scheme is used incorrectly, i.e., assuming it ensures properties beyond its security requirements. Security and cryptography are rather unique in being very applied, yet requiring 'correct' and precise definitions and analysis.

In this section, we introduce the provable-security approach, by presenting the definition of cryptographic signature schemes (subsection 1.5.1), the relevant attack models (subsection 1.5.2), types of signature forgeries (subsection 1.5.3) and, finally, the security requirements of signature schemes (subsection 1.5.8). This subject was introduced in a seminal paper from 1988 by Goldwasser, Micali and Rivest [170], which is highly recommended reading.

Why signatures? Our choice of illustrating provable security on signature schemes, rather than encryption, has multiple motivations. First, cryptographic signatures are fascinating and widely applied; in particular, they are fundamental to the TLS protocol (Chapter 7) and the public-key infrastructure (PKI, Chapter 8). Second, signatures are less known, and, furthermore, the term 'electronic signature' is often applied, confusingly, to mechanisms with weaker security guarantees. Third, we believe that the definition of the security requirements of signature schemes is, surprisingly, simpler and more intuitive than that of encryption schemes (see subsection 2.7.2). Fourth, signatures allow us to define and compare multiple natural, widely-known security requirements. Finally, these definitions are used in Chapter 3, which covers integrity and hash functions, including the Hash-then-Sign (HtS) paradigm and hash-based signature schemes, as well as in Chapter 4, which covers message authentication, and in Chapter 6, which covers public key cryptography. Presenting them here allows the reader to choose the order of reading these chapters.

1.5.1 Definition of a Signature Scheme

We now present an example of the formal definition of cryptographic signature schemes and their correctness requirements; the same approach is used to study other cryptographic schemes, by first defining the scheme and its correctness requirements. A (digital/cryptographic) signature scheme S consists of three algorithms: S = (KG, Sign, Verify). The Sign algorithm is used to sign messages, using a secret/private key s; the Verify algorithm is used to verify a purported signature over a message, using a known, public key v; and the key generation algorithm KG generates the key-pair (s, v).

The definition, which follows, uses the concepts of efficient (PPT) algorithms and of the security parameter, which we discuss in Appendix A. Intuitively, the security parameter indicates the desired tradeoff between security and performance; a longer security parameter implies more security and more overhead, e.g., longer keys. In most of the cryptographic literature, and specifically in this textbook, the algorithms, including the adversaries, are efficient (PPT).
Namely, the run-time of every algorithm is bounded by a polynomial in the length of its inputs, which usually include the security parameter 1^l. In this textbook, there is only one exception: the unconditionally secure One Time Pad algorithm (Section 2.4).

Definition 1.1 (Signature scheme). A signature scheme is a tuple of three efficient (PPT) algorithms, S = (KG, Sign, Verify), and a set M of messages, such that:

KG is a randomized algorithm whose input is a unary string (the security parameter 1^l) and whose output is a pair of binary strings (s, v), called the private key and the public key, respectively. To refer to only one of the two outputs of KG, we use the dot notation, i.e., s ≡ KG.s(1^l) and v ≡ KG.v(1^l).

Sign is an algorithm that receives two binary strings as input, a signing key s ∈ {0, 1}* and a message m ∈ M, and outputs another binary string σ ∈ {0, 1}*. We call σ the signature of m using signing key s.

Verify is an algorithm that receives three binary strings as input: a verification key v, a message m, and σ, a purported signature over m, and whose output is True or False (i.e., a predicate). Intuitively, Verify should output True if and only if σ is the signature of m using s, where s is the signing key corresponding to v (generated together with v).

Usually, the set of messages M is either the set of all binary strings {0, 1}*, or the set {0, 1}ⁿ of binary strings of some fixed length n. When M is not explicitly mentioned, this implies the set of all binary strings, i.e., M = {0, 1}*. In practice, signature schemes often assume a fixed input length n. To sign longer messages, the message is hashed and then the output of the hash is signed. We refer to this as the Hash-then-Sign paradigm, and discuss it in subsection 3.2.6. For example, the DSA standard signature scheme [297] defines the application of Hash-then-Sign (using a standard cryptographic hash function, SHA). The motivation is efficiency; the fixed-length signature schemes have high overhead (e.g., see Table 6.1), which would further increase, super-linearly, as a function of the message size. Hashing longer messages, and then signing the hash, is much more efficient.

The definition allows the algorithms of the signature scheme to be randomized. This may look unnecessary, but, in fact, some important signing algorithms are randomized, e.g., using the PSS encoding [44, 292].

The correctness requirement of a cryptographic scheme verifies that the scheme operates correctly under benign operating conditions, usually without allowing any probability of error. For a signature scheme, this simply means that if the purported signature σ is indeed the output of the corresponding Sign operation, then the verification will return True, correctly indicating a valid signature. Namely, if (s, v) ←$ KG(1^l) is a pair of a signing key and the corresponding validation key, then validation, using v, of a signature σ over a message m, produced by signing m using s, will always return True. Let us now turn this into a precise definition of the correctness requirement.

Definition 1.2 (Correctness of a signature scheme). We say that a signature scheme (KG, Sign, Verify), with set M of messages, is correct, if for every security parameter 1^l, every key-pair (s, v) ←$ KG(1^l) and every message m ∈ M:

    Verify_v(m, Sign_s(m)) = True    (1.1)
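As an informal illustration of Definition 1.2 (ours, not from the text), the following Python sketch checks the correctness requirement for any scheme given as a triple of functions; the toy 'identity' scheme used to exercise it is a made-up placeholder, not a scheme from the text, and is of course utterly insecure.

    # An informal check of the correctness requirement (Definition 1.2):
    # for randomly chosen keys and messages, Verify_v(m, Sign_s(m)) must be True.
    import os

    def check_correctness(KG, Sign, Verify, l=128, trials=100):
        for _ in range(trials):
            s, v = KG(l)                      # (s, v) <-$ KG(1^l)
            m = os.urandom(32)                # a random message m in M = {0,1}*
            if Verify(v, m, Sign(s, m)) is not True:
                return False
        return True

    # A made-up, insecure placeholder scheme, only to exercise the harness:
    # 'signing' just echoes the message, and Verify accepts iff sigma == m.
    toy_KG     = lambda l: (b"secret", b"public")
    toy_Sign   = lambda s, m: m
    toy_Verify = lambda v, m, sigma: sigma == m

    print(check_correctness(toy_KG, toy_Sign, toy_Verify))   # True

Note that the toy scheme passes the correctness check while offering no security at all; correctness and security are separate requirements.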
Signature scheme security requirements. Intuitively, the security goal of signature schemes is unforgeability, i.e., to prevent an attacker from obtaining a (meaningful) forgery, where a forgery is a valid signature for a message that was not signed by the owner of the private signing key s. However, this goal is not well defined, and may be interpreted in several different ways; in particular, we can consider different attack models and types of forgery. We discuss such variants in the rest of this section, as well as notations and concepts that are relevant to provable security in general, and to signature schemes specifically, such as the oracle notation.

1.5.2 Signature attack models and the conservative design principle

Following the attack model principle (Principle 1), security should be defined and ensured with respect to attacker capabilities, i.e., an attack model, rather than assuming a specific attack strategy. Let us discuss the attack models relevant to the security of signature schemes.

The Key-Only Attack model. The weakest attack model against a signature scheme is the Key-Only Attack model, where the adversary is given (only) the public verification key v. More generally, in this attack model, the attacker is given (only) all the public keys and other public information; this attack model can be applied against any cryptographic scheme. However, the Key-Only Attack model is usually considered too weak. Specifically, in any practical application of signatures, surely the adversary should also be able to observe at least one signed message, in addition to the public key; this motivates the use of stronger attack models.

The Known Message Attack (KMA) model. In the Known Message Attack (KMA) model, the attacker can receive (an arbitrary number of) pairs of a message and its signature. However, the attacker cannot control (choose) the messages. Variants of this model may require the attack to succeed for any given set of signed messages, or for signed random messages. However, this model is also usually considered too weak, since in many applications and scenarios, the attacker may be able to lure the signer into signing messages with a specific format or content. For example, both parties are typically able to influence some of the text of a contract before it is signed.

The Chosen Message Attack (CMA) models. In this textbook, and in most works in modern cryptography, we adopt the stronger Chosen Message Attack (CMA) model, where the attacker can ask for, and receive, the signatures of arbitrary messages of its choosing. Furthermore, we use a strong variant called Adaptive Chosen Message Attack (Adaptive-CMA), where we allow the adversary to choose the messages adaptively, based on the public key and on the signatures it has previously received (but the word 'adaptive' is often omitted). See [170] for the weaker models: directed CMA (where the attacker chooses the messages based only on the public key) and generic CMA (where the attacker chooses the messages without any input).

The Conservative Design Principle. One may argue that the Adaptive Chosen Message Attack model is 'unnecessarily strong'. In many applications, the adversary's ability to impact the contents of the signed message is very limited; and in some, the signer may phrase the contract, not allowing the adversary to have any (substantial) impact on it.
It may seem that the KMA model, or the weaker CMA models mentioned above (directed CMA or generic CMA), would suffice; requiring security against the much stronger adaptive CMA model may seem to impose an unnecessary burden. However, we want to avoid vulnerabilities in systems using a cryptographic scheme, due to incorrect usage of the scheme as well as due to incorrect attack models. It is difficult to predict the actual environment in which a cryptographic scheme will be used, and a subtle difference between the real attacker capabilities and the attack model we use may result in a vulnerability. Specifically, in subsection 3.2.6 we show how an attacker who can receive a signature over a document which is mostly benign, except for a short string selected by the attacker, is able to forge a signature over a very different document. A variant of this attack was even demonstrated to circumvent the critical Web PKI mechanism (Chapter 8). This motivates a more conservative approach, e.g., the use of the stronger CMA model. This is a special case of the important conservative design principle, which basically says that cryptographic mechanisms should be secure under minimal assumptions on the application scenarios, the underlying mechanisms and the attacker.

Principle 3 (Conservative design and usage). Cybersecurity mechanisms, and in particular cryptographic schemes, should be specified and designed with minimal assumptions, simple usage with minimal restrictions, strongest security requirements, and a maximal, well-defined attack model (attacker capabilities), rather than being designed using assumptions which hold for a specific system or application. On the other hand, when using an underlying cryptographic scheme, the design should assume the minimal requirements from the scheme, and limit, as much as possible, the attacks that can be deployed against this underlying scheme.

Both parts of the conservative design principle are very important. Many systems were vulnerable due to the use of mechanisms designed with subtle assumptions or restrictions, or ensuring insufficiently strong properties, e.g., assuming limited attacker capabilities. Other systems used cryptographic mechanisms in a sub-optimal way, which unnecessarily gave the attacker the ability to exploit later-discovered vulnerabilities of the cryptographic mechanisms. Designing security mechanisms to minimize assumptions and restrictions on their usage can also make them easier to use.

1.5.3 Types of forgery

Which forgery is considered a successful attack? Clearly, the forgery must consist of a message which wasn't signed by the legitimate signer (owner of the private key), and a valid signature. However, which message is considered a meaningful forgery? We consider three types of forgeries:

Existential forgery: any forgery is considered meaningful. Namely, the attacker is considered successful if it is able to obtain any pair of a message and a valid signature over it, where the message was not signed by the legitimate signer, even if the message is pure gibberish.

Selective forgery: the attacker selects some message m ∈ M before it begins interacting with the system, and then succeeds in generating a valid signature for m (without asking the signer to sign m, of course).
Universal forgery: the attacker can produce a valid signature for any message m given to it.

Attackers that can perform universal forgery can also perform selective forgery; and attackers that can perform selective forgery can also perform existential forgery. For some application scenarios, it may suffice to prevent universal forgery or selective forgery, for example:

• Assume there are only two (or a few) pre-defined messages to be signed, e.g., an inspector signing either 'valid' or 'invalid'. In this case, security against universal forgery suffices; the attacker cannot choose the message to be forged.

• Assume the application is signing a document, e.g., a contract. The attacker may have significant flexibility in which document it forges, but the forgery must be of a legible, meaningful contract. Existential forgery, where the attacker may only forge a signature over some 'random' gibberish document, may not be a threat.

However, following the conservative design principle, it is preferable to use signature schemes that prevent (even) existential forgery. Such schemes can be safely used even in an application where the attacker may be able to exploit signatures over seemingly-meaningless or benign messages.

Our discussion of signature schemes so far has been intuitive. However, in order to prove security, we need precise definitions, which we present in the following subsections.

1.5.4 Game-based Security and the Oracle Notation

There are different methods of defining security requirements for cryptographic schemes; this textbook, and possibly most works on applied provable security, follows the game-based approach. We believe that game-based definitions are easier, more intuitive and used by more works on provable security of applied protocols, compared to other approaches such as simulation-based definitions.

Games. The term game refers to a well-defined algorithm that returns a binary outcome of one execution, in which an adversary A attacks the scheme: True if the attack succeeded and False if the attack failed. The game is often randomized; randomness may be used by the game itself, e.g., to define random challenges, by the adversary A, and/or by the cryptographic scheme (if it is probabilistic, i.e., uses random bit-flip operations). Game-based security definitions define a game, often using pseudo-code, and then use the game to define the security requirements.

Oracles and the oracle notation. Many cryptographic games allow the adversary to receive the results of specific, limited operations that use the private key, while not providing that key to the attacker. For example, a Chosen Message Attacker (CMA) can receive signatures of messages it chooses, where the signatures are computed using the private signing key, which is obviously not disclosed to the attacker. Basically, the game provides the attacker with 'black box' or 'subroutine' access to a function which, internally, has access to the private key. Specifically, to allow CMA, the attacker is given such access to a function computing Sign_s(m), where m is a message chosen by the attacker, and s is the private signing key (not given to the attacker). The term oracle access is used to refer to such 'black box' access, e.g., to the function computing Sign_s(m) for an attacker-chosen message m. Oracles are used extensively in complexity theory and in cryptography.
We use the oracle notation A^{S.Sign_s(·)} to denote that the adversary A is given oracle access to Sign_s(·), the signing functionality using the private signing key s, for attacker-chosen messages. This means that A can provide input x ∈ {0, 1}* and receive S.Sign_s(x), i.e., a signature of x using the secret key s. Notice that A does not receive the private signing key s, and has no access to the internal operation of S.Sign_s(·).

Definition 1.3 (Oracle notation). Let A be an algorithm, f be a function, and k ∈ {0, 1}* be a string (typically, a secret such as a private key). We use the notation A^{f_k(·)} to denote that algorithm A can provide input x and receive f_k(x). Similarly, we use the notation A^{B_k(·)} to denote that A can provide input x and receive B_k(x), where B is an algorithm. We refer to f_k(·) or B_k(·) as an oracle.

In the next subsection, we use the oracle notation to define the existential-unforgeability CMA game.

1.5.5 The Existential Unforgeability CMA Game

Algorithm 1 presents the pseudocode of the existential unforgeability adaptive chosen-message attack (CMA) game, EUF^{Sign}_{A,S}(1^l). The game returns True if the adversary 'wins', i.e., is able to output some message m and a valid signature σ for it; otherwise, i.e., if the attack fails, the game returns False.

Algorithm 1: The existential unforgeability game EUF^{Sign}_{A,S}(1^l) between signature scheme S = (KG, Sign, Verify) and adversary A.
    (s, v) ←$ S.KG(1^l);
    (m, σ) ←$ A^{S.Sign_s(·)}(v, 1^l);
    return S.Verify_v(m, σ) ∧ (A did not give m as input to S.Sign_s(·));

Explanation of the existential unforgeability game EUF^{Sign}_{A,S}(1^l) (Algorithm 1). The game receives only one input, the security parameter 1^l, and has only three steps:

1. Use the key-generation algorithm of the signature scheme to generate the signing and verification keys: (s, v) ←$ S.KG(1^l). We use the ←$ symbol to emphasize that S.KG is a randomized algorithm, i.e., it returns a random key-pair.

2. Then, we let (m, σ) ←$ A^{S.Sign_s(·)}(v, 1^l), i.e., the adversary outputs a message m and a purported forged signature σ for it. The adversary receives oracle access to the signing algorithm, i.e., it can receive the values S.Sign_s(x) for any input x chosen by the adversary.

3. Finally, the game returns True, i.e., the adversary 'wins', if σ is a valid signature on m (using the verification key v), provided that m is not one of the inputs x whose signature S.Sign_s(x) was received by A from the oracle in the previous step.

Intuitively, an existentially-unforgeable signature scheme S ensures that every efficient (PPT) adversary A would 'almost always' lose, i.e., Pr(EUF^{Sign}_{A,S}(1^l) = True) would be tiny, or negligible, provided that the security parameter 1^l is 'sufficiently large'. We define this requirement below, in Definition 1.4 and Definition 1.6.

The following exercise shows that the adversary A can always 'win' the EUF^{Sign}_{A,S}(1^l) game against an arbitrary signature scheme S, if either we allow A to be inefficient (i.e., not a PPT algorithm), or if the keys generated by S are of limited length. Namely, in these cases, Pr(EUF^{Sign}_{A,S}(1^l) = True) = 1.
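The following Python sketch mirrors Algorithm 1 informally; the scheme and adversary are passed in as functions, and the signing oracle records the queried messages so that the final check can exclude them. This is only an illustration, not part of the formal definition.

    # An informal Python rendering of Algorithm 1 (the EUF-CMA game).
    # 'scheme' is a triple of functions (KG, Sign, Verify); 'adversary' is a function
    # that receives the verification key, the security parameter and a signing oracle,
    # and returns a purported forgery (m, sigma).
    def euf_cma_game(scheme, adversary, l):
        KG, Sign, Verify = scheme
        s, v = KG(l)                       # (s, v) <-$ S.KG(1^l)
        queried = set()                    # messages the adversary asked the oracle to sign

        def sign_oracle(x):                # oracle access to Sign_s(.), without revealing s
            queried.add(x)
            return Sign(s, x)

        m, sigma = adversary(v, l, sign_oracle)
        # The adversary wins iff sigma verifies on m AND m was never queried to the oracle.
        return Verify(v, m, sigma) and m not in queried

For instance, running euf_cma_game with the toy scheme from the sketch after Definition 1.2 and the trivial adversary lambda v, l, O: (b"x", b"x") returns True, confirming that the toy scheme, while correct, is trivially forgeable.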
Exercise 1.1 (Forgery if the adversary is computationally unbounded or if the key length is bounded). Let S be an arbitrary efficient (PPT) signature scheme. Present an adversary A that is able to 'win' EUF^{Sign}_{A,S}(1^l) every time, if we allow either of the following:
1. A does not have to be an efficient (PPT) algorithm.
2. S outputs fixed, or bounded-length, keys.

Sketch of solution to the first item: Since S is efficient, there is some polynomial which bounds its running time. In particular, this bounds the length of the private signing key s. The adversary will try all strings s' up to that length; the adversary will apply the signature algorithm using each such potential signing key s', and each time verify the resulting signature using the public key v; eventually, the correct signing key is found.

Sketch of solution to the second item: For simplicity, assume that the private key is always of some fixed length, say n bits. Then A tries to sign using each of the 2ⁿ possible keys, verifying the signatures using the (known) public key, until it finds the correct key. Since n is fixed, the number of keys is a (large) constant rather than a function of the security parameter; hence, the adversary can test all 2ⁿ possible keys.

1.5.6 The unforgeability advantage function

The existential unforgeability game (Algorithm 1) is a random process, which returns the outcome of a random run of the game with the given adversary A and signature scheme S. The outcome is True in runs where the adversary 'wins', i.e., outputs a forgery, and False in runs where the adversary 'loses', i.e., does not output a forgery. The outcome of the game may depend on the (random) keys output by the (probabilistic) KG algorithm, as well as on the outputs of the (randomized) adversary A. The probability that the adversary wins usually depends on the security parameter 1^l. This probability is called the existential unforgeability advantage of A against S.

Definition 1.4. The existential unforgeability advantage function of adversary A against signature scheme S is defined as:

    ε^{EUF-Sign}_{S,A}(1^l) ≡ Pr( EUF^{Sign}_{A,S}(1^l) = True )    (1.2)

where the probability is taken over the random coin tosses of A and of S during the run of EUF^{Sign}_{A,S}(1^l) with input (security parameter) 1^l, and EUF^{Sign}_{A,S}(1^l) is the game defined in Algorithm 1.

The advantage function gives us a measure of the security of the signature scheme; in particular, clearly, a scheme is secure only if, for any efficient adversary A, the advantage is small, or better yet, negligible (unfortunately, no efficient signature scheme can ensure zero advantage; see Exercise 1.6). Note, however, that for any fixed value of the security parameter 1^l, there is an adversary A that always wins, i.e., such that ε^{EUF-Sign}_{S,A}(1^l) = 1 (Exercise 1.1). Therefore, our definition of security cannot be bound to a specific security parameter, and must consider the advantage as a function.

Which advantage functions are sufficiently small (or negligible)? There are two main ways in which we can deal with this question: asymptotic security and concrete security. In this textbook we adopt the asymptotic security approach, which we explain below, since it is a bit easier to use. However, let us first briefly explain the alternative approach of concrete security, which allows a more detailed analysis of security, but is a bit harder to use.

1.5.7 Concrete security, asymptotic security and negligible functions

Concrete security. The concrete security approach uses the advantage function directly as the measure of security.
Namely, in this approach, there is no explicit definition of a 'secure' scheme; each scheme is only associated with a specific advantage function. This allows the calculation of the advantage as a specific probability value, for any given, concrete value of the security parameter 1^l; this is also the reason for the term concrete security. In fact, in this approach, the advantage function is often given additional parameters. For example, the advantage function for a signature scheme may include the number of messages signed during the execution. In general, inputs to the advantage function often include the number of (different kinds of) oracle calls.

Concrete security allows precise analysis of security for specific key lengths and other parameters, and of the security impacts of different cryptographic constructions. This is a significant advantage of concrete security over asymptotic security (defined next). However, we believe that concrete security may be less appropriate for this introductory textbook. Instead, we decided to adopt the simpler-to-use asymptotic security approach, described next, where a design is either secure or insecure, with no quantitative measure.

Asymptotic security and negligible functions. Asymptotic security requires the advantage function, e.g., ε^{EUF-Sign}_{S,A}(1^l), to be negligible in the security parameter l. What does it mean to say that a function ε : N → R is negligible? Clearly, we expect such a function to converge to zero for large input, i.e., lim_{l→∞} ε(l) = 0. Moreover, a negligible function is a function that converges to zero faster than the inverse of any (non-zero) polynomial; the definition follows.

Definition 1.5 (Negligible function). A function ε : N → R is negligible if for every non-zero polynomial p(l) ≠ 0 holds:

    lim_{l→∞} ε(l) · p(l) = 0    (1.3)

We use NEGL to denote the set of all negligible functions.

Notes:
1. An equivalent condition is to say that ε : N → R is negligible if for every constant c holds lim_{l→∞} ε(l) · l^c = 0.
2. Any non-zero polynomial is not negligible.
3. For any constant x > 1, the inverse exponential function ε(l) = x^{−l}, e.g., 2^{−l}, is negligible.

Here is an exercise to make sure this important concept is well understood.

Exercise 1.2. Which of the following functions are negligible? Why?
(a) f_a(l) = 10^{−8} · l^{−10}, (b) f_b(l) = 2^{−l/2}, (c) f_c(l) = 1/l!, (d) f_d(l) = (−1)^l / l, (e) f_e(l) = 0.5^l.

Working with negligible functions is a useful simplification; here is one convenient property, which shows that if an algorithm has a negligible probability to 'succeed', then running it a polynomial number of times will not help: the probability of success will remain negligible.

Lemma 1.1. Consider a negligible function ϵ : N → R, i.e., ϵ(l) ∈ NEGL(l). Then for any polynomial p(l), the function f(l) = p(l) · ϵ(l) is also negligible, i.e., f(l) ∈ NEGL(l).
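As a small worked example of Definition 1.5 (this example is ours, not from the text), consider Note 3 for the case ε(l) = 2^{−l}. By Note 1, it suffices to check, for every constant c:

    lim_{l→∞} ε(l) · l^c = lim_{l→∞} l^c / 2^l = 0,

which holds since the exponential 2^l grows faster than any fixed power l^c (e.g., apply L'Hôpital's rule c times, or note that 2^l ≥ l^{c+1} for all sufficiently large l). Hence ε(l) = 2^{−l} ∈ NEGL.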
1.5.8 Existentially-unforgeable signature schemes

We now use the definitions of a negligible function and of the existential unforgeability advantage function to define the asymptotic notion of an existentially unforgeable signature scheme.

Definition 1.6 (Existentially-unforgeable signature scheme). A signature scheme S is existentially unforgeable if for all PPT algorithms A, the advantage of A against S is negligible, i.e., ε^{EUF-Sign}_{S,A}(1^l) ∈ NEGL(l), where ε^{EUF-Sign}_{S,A}(1^l) is defined in Definition 1.4.

Recall that ε^{EUF-Sign}_{S,A}(1^l) ≡ Pr( EUF^{Sign}_{A,S}(1^l) = True ) is the probability that the adversary A succeeds in forging a message, in a random run of the existential-unforgeability game, with adversary A, signature scheme S and security parameter 1^l.

Exercise 1.1 shows that a signature scheme cannot be existentially unforgeable if the length of the keys it generates is fixed or bounded, regardless of the security parameter. In spite of this, standard signature schemes are defined for fixed key (and input, output) lengths.

We leave the definition of the corresponding game and notion of selective unforgeability to the reader. Notice that we do not include universal unforgeability in this exercise, since it requires a slightly different type of definition.

Exercise 1.3.
1. Define the selective-unforgeability game.
2. Define a selectively-unforgeable signature scheme.
3. Show: if S is existentially unforgeable, then S is selectively unforgeable.

One-Time Signatures. Our definitions above focused on the 'classical' definitions of signature schemes and their security. However, there are many other variations considered in the cryptographic literature; let us mention one of these variants, One-Time Signature schemes. These are signature schemes which allow only a single signature operation. A similar variant allows a limited number of signature operations (limited-use signatures).

Exercise 1.4.
1. Define the existential-unforgeability game for a one-time signature (or: limited-use signature).
2. Define a one-time existentially-unforgeable signature scheme.

One-time (and limited-use) signatures can be more computationally efficient than the 'classical', unlimited-use signatures; see subsection 3.4.2. They may also be a convenient way to ensure security against attackers with quantum-computing capabilities; see Section 10.4. Adapting the definitions we presented above to support such variants is not difficult; see Exercise 1.7. And there are many applications where we can use one-time signatures. So why do we define, and normally use, 'classical', unlimited-use signatures? One reason is the conservative-design principle above. It is also simpler to use (and define) unlimited-use signatures, and it is convenient that we can use them for many applications. Convenience, reuse and simplicity are all very important properties.

1.6 A Brief History of Cryptography, Computing and Cybersecurity

Cryptography is a surprisingly ancient art, indeed, one of the earliest sciences; and it also played a surprisingly large role in the development of computing. And the history of computing is obviously closely linked to the history of cyberspace and cybersecurity, where cryptography plays a major role. Therefore, we conclude this chapter with a brief review of the history of cryptology (subsection 1.6.1), and of the history of computing and cybersecurity (subsection 1.6.2).

1.6.1 A brief history of cryptography

We now present a brief history of cryptography. We are only able to give a few important highlights from the fascinating history of cryptology; interested readers should consult some of the excellent manuscripts such as [223, 362].
We identify three main eras of cryptography: the heuristic cryptography era (until 1883), the Enigma era (1883 to the 1970s) and the modern cryptography era. We chose the year 1883 to separate the heuristic era from the Enigma era, since in that year Kerckhoffs published his seminal manuscript [232].

The heuristic cryptography era [until 1883]. Cryptology, which literally means the 'science of secrets', has been applied to protect sensitive communication since ancient times, and is therefore much more ancient than computing devices. Originally, cryptology focused on protecting secrecy, i.e., on confidentiality, which is mainly provided by encryption schemes, also called ciphers and cryptosystems; see Figure 1.3. One of the early evidences of encryption is from about 1500BC; it is an encryption of a formula for pottery glaze, which presumably was commercially valuable. We present a few other ancient ciphers in Section 2.1, but are not able to properly cover this topic here; if interested, read some of the excellent books such as [223, 264, 362].

Kerckhoffs' publication (1883). Kerckhoffs' publication [232], in 1883, can be viewed as the end of the heuristic cryptography era. Kerckhoffs' work may have been the first major published work in cryptography; until it, most works in cryptography, and designs of cryptosystems, were kept secret, following the security by obscurity approach (with few limited exceptions). Kerckhoffs realized that it is better to assume that the attacker may be able to capture encryption devices and reverse-engineer them to learn the algorithm, as we paraphrase in the Kerckhoffs principle (Principle 2); see subsection 1.2.2. Kerckhoffs' principle remains a basic principle of cryptography, and is gradually being adopted in other areas of cybersecurity. With the increased use of cryptography, the adoption of standards, software implementations and advanced reverse-engineering tools, Kerckhoffs' principle has only become more important.

Note that Kerckhoffs did not argue that cryptographic designs should be published. Indeed, for many years after Kerckhoffs' book was published, research and development in cryptography remained an area mostly dealt with by intelligence and defense organizations, and mostly in secret. This was definitely true until, and during, World War II.

The Enigma era [1883-1970s]. The most important advances in cryptography after Kerckhoffs' publication were made as part of the efforts of the second world war (WWII), and in the period leading to it. Cryptography, and in particular cryptanalysis, i.e., 'breaking' the security of cryptographic schemes, played a key role during the second world war; an important by-product was the development of the first computer. During the war, both sides used multiple types of encryption devices. The most well known is the Enigma encryption device, used by the Germans. The early versions of the Enigma were in use already from 1924, and it was modified and improved over the years. Due to the importance of the Enigma, we use it as the name of the era.

Details of the design of the Enigma were kept secret; however, the designers clearly tried to follow Kerckhoffs' principle, and to ensure security even if the design became known to cryptanalysts. This fact, together with the continuous improvements to the Enigma, made the Germans believe that the Enigma would not be broken.
Indeed, the Allies were very concerned about their inability to decipher Enigma traffic; the cryptanalysis of the Enigma was a major undertaking and a huge achievement. However, there may also have been a drawback to the German cryptographic effort: they were over-confident in the security of the Enigma, and continued using it long after it was broken. Possibly due to this overconfidence, they also used the Enigma in ways which made it easier to break the cipher, in violation of the conservative design and usage principle (Principle 3); see Section 2.2.

When the Enigma was eventually broken, the Allies took extraordinary measures to prevent the Germans from realizing this, so that the Germans would not change keys, change the Enigma or take other precautions. Usually, this meant creating alternative explanations for the Allies' response to the information in the plaintext; but sometimes, the difficult decision was made to avoid reacting to the information, since the resulting risks to people and/or property were deemed smaller than the risks if the Germans realized that the Enigma was broken. It has been estimated that the successful cryptanalysis of the Enigma shortened the war by about two years, and saved millions of lives.

While the Enigma was designed following Kerckhoffs' principle, i.e., to ensure security even if its design is known, the Germans also made extensive efforts to maintain the secrecy of the design. Indeed, the successful attacks on the Enigma were based on leakage of some information about it. In 1932, French intelligence was able to obtain the secret manual and the settings for the Enigma from a German officer, and shared this information with British intelligence and with the Polish cryptanalysis unit, called the Cipher Bureau. In the Polish Cipher Bureau, Captain Maksymilian Ciężki led a team that used mathematics for cryptanalysis, consisting of three of his cryptography students: Rejewski, Zygalski and Różycki. Using the Enigma's manual and settings, the team reverse-engineered the Enigma, and built Enigma replicas. Furthermore, they developed special-purpose electromechanical devices, called Bombes, that allowed efficient testing of different Enigma keys against limited exposed information, such as known pairs of ciphertext and (likely) plaintext. In 1939, Poland shared their Bombe devices and their cryptanalysis results with the British.

Using plaintext/ciphertext pairs and the Bombe machines, the cryptanalysis center in Bletchley Park, led by Alan Turing, was able to decipher much of the intercepted Enigma traffic. However, the Germans periodically improved the Enigma. Every change in the Enigma required extensive effort to design and create modified Bombe devices, and resulted in a period when traffic could not be deciphered. Furthermore, the somewhat less known Lorenz encryption devices were introduced by the Germans later in the war, and no complete device was captured and available to cryptanalysts. The challenges of adapting Bombe devices to changes in the Enigma, and of breaking the Lorenz devices, motivated the construction of the first electronic computer, called Colossus. For more on the history of computing and cybersecurity, see subsection 1.6.2. Since Colossus was programmable, it was possible to test many possible attacks and to successfully cryptanalyze (different versions of) Lorenz, Enigma and other cryptosystems.
Modern cryptography. Until the 1970s, cryptography remained mainly a topic for intelligence and research organizations. In the 1970s, this changed, quite dramatically, with the beginning of what we now call modern cryptology, which involves extensive academic research, publications, products and standards, and has many important commercial applications. Two important landmarks mark the beginning of modern cryptology.

The first landmark is the development and publication of the Data Encryption Standard (DES) [296]. The publication of DES as an open standard marks the point where cryptography began to be widely deployed in commercial products; in particular, DES was key to the security of the emerging bank networks, and to the security mechanisms of the emerging computer communication networks, which were dominated, for many years, by IBM's SNA.

The second landmark is the introduction of the radical, innovative concept of Public Key Cryptology (PKC), where the key to encrypt messages may be public, allowing easy distribution of encryption keys. The first publication was the seminal paper New Directions in Cryptography [123], by Diffie and Hellman. In [123], Diffie and Hellman introduced the concepts of public-key cryptography, public-key encryption and digital signatures. They also presented the important Diffie-Hellman key-exchange protocol; we discuss public key cryptography and the Diffie-Hellman protocol in Chapter 6. It is notable that in [123], Diffie and Hellman did not yet present a design of a public key cryptosystem. The first published public-key cryptosystem was RSA, by Rivest, Shamir and Adleman [334]. RSA and the Diffie-Hellman protocol remain widely-used public-key cryptographic mechanisms; we discuss them in Chapter 6.

In fact, the same design as RSA was already discovered a few years earlier, by the British intelligence organization GCHQ. However, GCHQ kept this achievement secret until 1997, long after the same design was re-discovered independently and published in [334]. See [264] for more details about the fascinating history of the discovery, and re-discovery, of public key cryptology. These two discoveries of public-key cryptology, with very different impacts on society, illustrate the dramatic change between 'classical' cryptology, studied only in secrecy and with impact mostly limited to the intelligence and defense areas, and modern cryptology, with extensive published research and extensive impact on society and the economy.

1.6.2 A brief history of computing and cybersecurity

Early computing: from Babbage to Colossus. The first known idea of a computing device was proposed by Charles Babbage in 1822. Babbage created two designs of mechanical computing devices: the difference engine and the analytical engine. The difference engine was a special-purpose machine, designed to tabulate logarithms and trigonometric functions; the analytical engine, however, was a general-purpose computer, designed to process input consisting of data and a program, both provided using punched cards. Babbage was also interested in cryptography, and designed the first known practical attack against the Vigenère cipher (Section 2.1). The attack was based on the use of letter frequencies, and one of the applications Babbage designed for the analytical engine was to compute letter frequencies.
Babbage never completed implementing either engine; the designs, however, were correct: they were implemented and tested for historical purposes, from 1989 to 1991, i.e., more than a century after he died. Babbage did build some modules of the engines, and demonstrated and described their designs to peers.

The demonstration and description excited Ada Lovelace, one of the few female mathematicians at the time. Ada wrote a description of the sequence of operations for solving certain mathematical problems with the analytical engine, which is considered the first computer program; Ada was also the first to suggest that computers may be used to manipulate non-numeric data, such as text or music.

It took about a century from the initial design of the engines until the first working computing device was implemented. This was a mechanical device designed and implemented in 1938 by the German inventor Konrad Zuse, who named it the Z1. The Z1 was unreliable and slow. As a result, it did not have any useful applications or use, except as a proof of feasibility. In particular, special-purpose calculating devices were far better at performing applied calculations. In particular, this held for electromechanical designs including the Enigma cipher used for encryption, and the Bombe machines used in Bletchley Park to break the Enigma; both were much more efficient and reliable than the Z1.

However, as we discussed above, there was a repeated struggle to adapt the Bombe devices to changes in the Enigma, and the Bombe failed to break the newer Lorenz cryptosystem. This motivated the construction of the first practical computer, called Colossus. Colossus was designed by Tommy Flowers as part of the Bletchley Park WWII cryptanalysis effort. Colossus was the first fully-electronic computing device, i.e., it did not involve mechanical components. The Colossus was also the first computer, i.e., the first computing device that could be programmed for arbitrary tasks, rather than only performing a predefined set of tasks or computations. This was in contrast to previous devices, including the Enigma and the Lorenz cryptosystems, which were electro-mechanical and were also limited to a predefined computation.

From Colossus to Modern Computers. One critical difference between the Colossus and more modern computers, as well as Babbage's design of the analytical engine, is that the Colossus did not read a program from storage. Instead, setting up a program for the Colossus involved manually setting switches and jack-panel connections. This method of 'programming' the Colossus wasn't very convenient, but it was acceptable for the Colossus, since there were only a few such machines and only a few, simple programs, and simplicity of design and manufacture was more important than making it easier to change programs. Even this crude form of 'programming' was incomparably easier than changing the basic functionality of the machine, as required in special-purpose devices, including the Enigma devices and the Bombe devices used for cryptanalysis of the Enigma traffic.

Designs of an electromechanical computer which supports a stored program were proposed already in 1936/1937, by two independent efforts.
The first was by Konrad Zuse, who mentioned such a design in a patent on floating-point calculations published in 1936 [400]; the second was Alan Turing, who defined and studied a formal model for stored-program computers. This Turing machine model, introduced in Turing's seminal paper 'On Computable Numbers' [372], is still fundamental to the theory of computing. However, practical implementations of stored-program computers appeared only after WWII. Stored-program computers were much easier to use, and allowed larger and more sophisticated programs, as well as the use of the same hardware for multiple purposes (and programs). Hence, stored-program computers quickly became the norm - to the extent that some people argue that earlier devices were not 'real computers'.

Stored-program computers also created a vast market for programs. It became feasible for programs to be created in one location, and then shipped to and installed on a remote computer. For many years, this was done by physically shipping the programs, stored on media such as tapes, discs and others. Now that computer networks are widely available, program distribution is often, in fact usually, done by sending the program over the network. Easier distribution of software also meant that the same program could be used by many computers; indeed, today we have programs that run on billions of computing devices. The ability of a program to run on many computers created an incentive to develop more programs; and the availability of a growing number of programs increased the demand for computers and their impact.

Computer networks and cyberspace. The ability to develop a program and have it applied on multiple computers caused the economic 'network effect' that made computers and programming much more useful. This effect increased dramatically when computer networks began to facilitate inter-computer communication. The introduction of personal computers (1977-1982), and the subsequent introduction of the Internet, the web and the smartphone, each caused a further dramatic increase in the use and impact of computing and of computer networking.

There was also a growing interest in the potential social implications of computers and networks, and a growing number of science-fiction works focused on these aspects. One of these was the short story 'Burning Chrome', published by William Gibson in 1982. It seems that it was this story that introduced the term cyberspace, to refer to the interconnected environment connecting networks, computers, devices and humans. The cyber part of the term cyberspace is taken from the term cybernetics, introduced in [388] to describe the study of communication and control systems in machines, humans and animals. By now, computing is used not only in 'traditional computers', but as a critical component of many other devices, from tiny sensors to vehicles - cyber-physical systems and IoT (Internet of Things) devices. The term cyberspace is now mostly used for the ubiquitous use of devices with different computing capabilities, communicating via networks and interacting with humans.

Cybersecurity and Hacking. With great power comes great responsibility; and the increased importance of the Internet and cyberspace also increased the risks of abuse and cyber-attacks.
The awareness of these risks significantly increased as attacks on computer systems and networks became widespread, especially attacks exploiting software vulnerabilities and/or involving malicious software, i.e., malware. This awareness resulted in the study, research and development of threats and corresponding security mechanisms, including computer security, software security, network security and data/information security.

The awareness of security risks also resulted in important works of fiction. One of these was the 1983 short story 'Cyberpunk', by Bruce Bethke. Bethke coined this term for individuals who are socially inept yet technologically savvy. Originally a derogatory term, cyberpunk was later adopted, with a positive interpretation, as the name of a movement with several manifestos, e.g., [234]. The reverse process happened with the term hacker, which was used, already from the 1960s, to describe proficient programmers who think 'out of the box' and find creative solutions and shortcuts ('hacks'), and, more recently, experts in computer security. However, from the 1980s, the term hacker is often applied, with a negative connotation, to a person who tries to break into computer systems and to circumvent defenses. The terms black-hat hacker (or cracker) and white-hat hacker (or just hacker) are often used to distinguish between the 'attacking hacker' and the 'defending hacker', or between hackers operating illegally and hackers following legal, and hopefully also ethical, principles.

In works of fiction, cyberpunks and hackers are often presented as socially inept yet technology-savvy, with incredible abilities to penetrate systems. These abilities are mostly presented in positive, even glamorous ways, e.g., as saving human society from oppressive, corrupt regimes, governments, agencies and rogue Artificial Intelligence systems. The focus on decentralization and personal freedom is definitely a main part of the cyberpunk manifestos [234]. Indeed, much of the success of the Internet is due to its decentralized nature, and to the use of cryptography to provide security for financial transactions and some level of privacy. Important privacy tools, such as the Tor anonymous communication system [124], are based on cryptography and are inherently decentralized, which may be hoped to provide protection against potentially malicious governments. Furthermore, some of the cryptography and privacy mechanisms, such as the PGP encryption suite [159], were developed in spite of significant resistance by governments. Cryptography is also at the core of the extensive efforts to develop blockchains (see Section 3.10) and other decentralized financial tools and currencies, such as the Bitcoin cryptocurrency.

1.7 Lab and Additional Exercises

Lab 1 (Using cryptography to validate downloads). Malware is among the most common, well-known and harmful cybersecurity threats. In this lab, we explore the use of basic cryptographic mechanisms, specifically cryptographic hashing and signatures, to validate downloads and thereby avoid installing and using downloaded malware that is a fake version of the software that the user wanted to download. As for the other labs in this textbook, we will provide Python scripts for generating and grading this lab (LabGen.py and LabGrade.py). If not yet posted online, professors may contact the author to receive the scripts.
The lab-generation script generates random challenges for each student (or team), as well as solutions which will be used by the grading script. We recommend making the scripts available to the students, as an example of how to use the cryptographic functions; a minimal sketch of the relevant Python calls also appears at the end of this section. It is easy and permitted to modify these scripts to use other languages/libraries, or to modify and customize them as desired.

1. Using hash for download integrity. In this question we use a cryptographic, collision-resistant hash function (see subsection 1.2.5 and Section 3.2) to ensure the integrity of software downloads, i.e., to ensure that the download is of the intended, authentic software, and not of malware impersonating the desired software. Software is often made available via repositories, which may not be fully secure; to ensure integrity, publishers often provide the hash of the software. Namely, to protect the integrity of some software download, say encoded as a string m, the publisher provides, over some secure channel, the value of the hash Hm ≡ h(m). The user then downloads the software from the (insecure) repository, obtaining the downloaded string m′. To confirm its integrity, i.e., to confirm that m′ = m, the user uses m′ only if h(m′) = Hm, i.e., h(m′) = h(m). Based on the collision-resistance property of h, the fact that h(m′) = h(m) is believed to imply that m = m′. Note that other applications of hash functions may rely on other properties, for example, on the one-way property (Section 3.4).

Input: a folder Q1files containing several files, and a file Q1.hash, containing the SHA-256 hash of one of the files. Note that the file contains the hash in binary bytes, not encoded as text.

Goal: identify the file in Q1files whose hash is given in Q1.hash.

Submission: the name of the matching file, and a program, A1.py, that, given a hash file Q1.hash and a folder Q1files, outputs the name of the matching file (or a message if there is no matching file).

2. Using signatures to authenticate downloads. The hash mechanism has two disadvantages. First, an attacker who controls the value of the digest can set it to be the digest of a malware; second, the digest must be updated for every software update. In this question, we show an alternative: authenticating the software using digital signatures (subsection 1.2.3). Practical cryptographic libraries such as PyCryptodome use the Hash-then-Sign paradigm (subsection 3.2.6), i.e., they apply the signing function to the hash of the information to be signed. Hence, you will need to specify both a signature scheme (e.g., use RSA) and a hash function (e.g., use again the SHA-256 hash function). The reason for using the Hash-then-Sign paradigm is that it would be absurdly inefficient to apply the public-key signature algorithm directly over a (usually) long message, rather than over the (short) hash.

Input: The file Q2pk.pem, which contains the public validation key, which can be used to validate files purportedly signed by the legitimate software publisher, and the directory Q2files, which contains several files and the corresponding purported signatures. Note: the signatures were created by the LabGen.py script, using an RSA signature (with PKCS#1 v1.5 padding) and the SHA-256 hash function.

Goal: identify the file(s) in Q2files which are properly signed, as validated using the public key given in Q2pk.pem.
Submission: the name(s) of the properly signed file(s), and a program, A2.py, that, given the public validation key file Q2pk.pem and a folder Q2files, outputs the name(s) of the matching file(s) (or a message if there is no matching file).

3. To get a feeling for the performance of public-key signatures and of cryptographic hash functions, perform experiments and present graphs showing:

a) The time required for hashing, as a function of the input length, from 1000 bits to one million bytes.
b) The time required for generating signing and validation key pairs, for key lengths from 1000 bits up to the maximal length you find feasible (say, requiring up to five minutes).
c) The time required to sign inputs of length from 1000 bits to one million bits, using each of the signing keys you generated in item (b).
d) The time required to validate signatures over inputs of length from 1000 bits to one million bits, using each of the key pairs you generated in item (b).
e) Explain how the results of the previous items make sense, their implications for the time required for hashing, signing and validating, and the relation to the Hash-then-Sign paradigm (subsection 3.2.6).

Note: you may need to repeat some operations many times to be able to measure the times with reasonable precision.

Exercise 1.5. Let S be a signature scheme, and let A_Key and A_σ be the following two simple adversaries:

A_Key(v) randomly guesses the signing key s. After guessing s, the adversary signs the desired message using s, and 'wins' if the signature validates correctly using verification key v.

A_σ(v) randomly guesses the signature σ for a message m chosen by the adversary, and 'wins' if σ validates correctly using verification key v.

Consider a single guess by both adversaries. Let l_s denote the length of the signing key s and l_σ denote the length of the signatures produced by the signing algorithm (assume all signatures have the same length l_σ).

1. Compute the exact probability that each adversary wins after a single guess.
2. What are the relationships between the probabilities computed in the previous item, and the probabilities of winning, after a single guess, under each of the three notions of unforgeability introduced in this chapter? Explain.
3. Compute the exact probability that each adversary wins after two guesses.
4. Consider adversaries A^E_Key and A^E_σ, which operate similarly to A_Key and A_σ, except that instead of guessing only one or two times, they test every possible value (of s or of σ) until winning. What is the maximal and average number of guesses required by each of A^E_Key and A^E_σ?
5. What is the advantage of the algorithms A^E_Key and A^E_σ over S? (Definition 1.4)
6. Do the replies to the previous item imply that S is not existentially unforgeable?

Exercise 1.6 (There is always some probability of forgery.). Show that there is no signature scheme S such that every efficient adversary A has zero advantage against S, i.e., such that ε^{EUF-Sign}_{S,A}(1^l) = 0. Preferably, show that this result holds for any value of 1^l.

Exercise 1.7. Based on the definitions in this chapter, define one-time signature schemes.

Exercise 1.8. Prove that if m is co-prime with n, then for every integer l > 0 holds:

m^l mod n = (m mod n)^(l mod φ(n)) mod n
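To make the lab tasks concrete, here is a minimal, illustrative Python sketch of the library calls involved in questions 1 and 2 of Lab 1: hashing a file with SHA-256 and validating an RSA PKCS#1 v1.5 signature over a file's SHA-256 hash using PyCryptodome. The folder and file names follow the lab description; the assumption that each signature is stored next to the signed file with an added .sig extension is only for illustration - adapt the sketch to the actual naming used by LabGen.py.

    import hashlib, os
    from Crypto.PublicKey import RSA
    from Crypto.Signature import pkcs1_15
    from Crypto.Hash import SHA256

    # Question 1: find the file in Q1files whose SHA-256 digest equals Q1.hash (raw bytes).
    def find_matching_file(hash_path='Q1.hash', folder='Q1files'):
        target = open(hash_path, 'rb').read()          # digest stored as binary bytes
        for name in os.listdir(folder):
            data = open(os.path.join(folder, name), 'rb').read()
            if hashlib.sha256(data).digest() == target:
                return name
        return None                                    # no matching file

    # Question 2: list the files in Q2files whose RSA (PKCS#1 v1.5, SHA-256) signature
    # validates under the public key in Q2pk.pem. The '.sig' naming is an assumption.
    def find_signed_files(pk_path='Q2pk.pem', folder='Q2files'):
        pk = RSA.import_key(open(pk_path, 'rb').read())
        verifier = pkcs1_15.new(pk)
        valid = []
        for name in os.listdir(folder):
            if name.endswith('.sig'):
                continue
            sig_path = os.path.join(folder, name + '.sig')   # assumed naming convention
            if not os.path.exists(sig_path):
                continue
            data = open(os.path.join(folder, name), 'rb').read()
            sig = open(sig_path, 'rb').read()
            try:
                verifier.verify(SHA256.new(data), sig)       # raises ValueError if invalid
                valid.append(name)
            except ValueError:
                pass
        return valid

    if __name__ == '__main__':
        print('Q1 match:', find_matching_file())
        print('Q2 valid:', find_signed_files())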
Chapter 2
Confidentiality: Encryption Schemes and Pseudo-Randomness

Encryption deals with protecting the confidentiality of sensitive information, which we refer to as the plaintext message m, by encoding (encrypting) it into a ciphertext c, as illustrated in Figure 1.3. The ciphertext c should hide the contents of m from the adversary, yet allow recovery of the original information by legitimate parties, using a decoding process called decryption. Encryption is one of the oldest applied sciences; some basic encryption techniques were already used thousands of years ago.

The most important categorization of encryption schemes is between shared-key cryptosystems, also called symmetric cryptosystems (Figure 1.4), and public-key cryptosystems, also called asymmetric cryptosystems (Figure 1.5). In both cases, we use the terms 'encryption scheme' and 'cryptosystem' interchangeably. In this chapter, we mostly focus on shared-key cryptosystems; we discuss public-key cryptography in Chapter 6.

Let us begin by defining (stateless) shared-key cryptosystems and their correctness requirement.

Definition 2.1 (Stateless shared-key cryptosystem and its correctness). A shared-key cryptosystem is a pair of keyed algorithms, ⟨E, D⟩, and sets K, M and C, called the key space, plaintext space and ciphertext space, respectively. A shared-key cryptosystem is correct if for every input key k ∈ K and plaintext m ∈ M, the encryption of m using k returns a ciphertext c ∈ C that decrypts, using key k, back to m. Namely,

(∀k ∈ K, m ∈ M) Dk(Ek(m)) = m    (2.1)

Definition 2.1 does not allow the encryption and decryption algorithms to maintain state, i.e., it defines a stateless shared-key cryptosystem. However, many practical cryptosystems use state, as illustrated in Figure 2.1; for example, the state may be used as a counter. Let us, therefore, extend Definition 2.1 to define a stateful shared-key cryptosystem and its correctness.

[Figure 2.1: Stateful shared-key (symmetric) cryptosystem.]

Definition 2.2 (Stateful shared-key cryptosystem and its correctness). A stateful shared-key cryptosystem is a pair of keyed algorithms, ⟨E, D⟩, and sets K, M, C and S, called the key space, plaintext space, ciphertext space and state space, respectively. A stateful shared-key cryptosystem is correct if for every input key k ∈ K, plaintext m ∈ M and state s ∈ S, the encryption of m using k with state s returns a ciphertext c ∈ C that decrypts, using key k and state s, back to m. Namely,

(∀k ∈ K, m ∈ M, s ∈ S) Dk(Ek(m, s), s) = m    (2.2)

The state is often clear from the context, and then we may omit it, i.e., write simply Ek(m) and Dk(c), as for a stateless cryptosystem.

Shared-key cryptosystems are also sometimes referred to as ciphers, but we use this term for two specific types of shared-key cryptosystems: block ciphers, which encrypt and decrypt fixed-length blocks of bits, and stream ciphers, which are stateful cryptosystems that, typically, encrypt and decrypt bit by bit, i.e., M = C = {0, 1}.

In this chapter we will see a variety of shared-key cryptosystems. Some of these are deterministic, some randomized; some stateless, some stateful; and with different plaintext and key spaces. Note that definitions 2.1 and 2.2 define only the correctness requirements; we did not yet define security requirements.
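Before turning to security, here is a minimal Python sketch (an illustration, not part of the textbook's lab scripts) of a toy, completely insecure stateless shared-key cryptosystem: the key is a single byte, and encryption simply adds the key to each plaintext byte modulo 256. Its only purpose is to make the correctness requirement of Definition 2.1 concrete; it provides no security whatsoever.

    # Toy stateless shared-key cryptosystem: K = {0,...,255}, M = C = byte strings.
    # It satisfies the correctness requirement of Definition 2.1, but is NOT secure.
    def E(k: int, m: bytes) -> bytes:
        return bytes((b + k) % 256 for b in m)

    def D(k: int, c: bytes) -> bytes:
        return bytes((b - k) % 256 for b in c)

    if __name__ == '__main__':
        k, m = 42, b'attack at dawn'
        c = E(k, m)
        assert D(k, c) == m      # correctness: D_k(E_k(m)) = m
        print(c, D(k, c))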
We will define security later in this chapter, and it will be more complex than one may initially expect. Intuitively, the goal is clear: confidentiality, in a strong sense, against powerful adversaries. However, there are subtle issues, as well as multiple variants which differ in their exact requirements and in their assumptions about the adversary's capabilities.

2.1 Historical Ciphers

Cryptology is one of the oldest sciences. To 'warm up' for our discussion of encryption schemes, let us first discuss a few historical ciphers, which were in use from ancient times until the nineteenth century. These simple, historical ciphers help us introduce some of the basic ideas and challenges of cryptography and cryptanalysis, and provide some historical perspective, beyond the little we presented in Section 1.6. For more information, see [223, 362].

One has to keep in mind that the design of these ciphers was mostly kept secret, in the belief that an attacker cannot break a cipher whose design it does not know; a (usually false) belief we refer to as security by obscurity. In fact, some of the ancient ciphers relied only on the secrecy of their design, and did not even use a secret key; we discuss such keyless ciphers in subsection 2.1.1. Even when using a published design, users typically kept their choice secret, and often made minor changes. Indeed, it is harder to cryptanalyze a scheme which is not even known. Still, it is ill-advised to rely on 'security by obscurity'. We explain this in subsection 1.2.2, where we present Kerckhoffs's principle, which essentially says that the security of a cipher should not depend on the secrecy of its design.

Many historical ciphers, and in particular most ancient ciphers, were monoalphabetic substitution ciphers. Monoalphabetic substitution ciphers use a fixed mapping from each plaintext character to a corresponding ciphertext character (or some other symbol). Namely, these ciphers are stateless and deterministic, and are defined by a permutation from the plaintext alphabet to a set of ciphertext characters or symbols. We further discuss general monoalphabetic substitution ciphers in subsection 2.1.3. We also discuss, in subsection 2.1.5, variants of the Vigenère cipher, which is a polyalphabetic substitution cipher.

2.1.1 Ancient Keyless Ciphers

In this section, we discuss a few ancient ciphers. These ciphers are all simple, keyless (no secret key) and monoalphabetic. A cipher is monoalphabetic if it is defined by a single, fixed mapping from each plaintext letter to a ciphertext letter or symbol.

The At-Bash cipher The At-Bash cipher may be the earliest cipher whose use is documented; specifically, it is believed to have been used, three times, in the Old Testament book of Jeremiah. The cipher maps each of the letters of the Hebrew alphabet to a different letter. Specifically, the letters are mapped in 'reverse order': the first letter to the last letter, the second letter to the second-to-last letter, and so on; this mapping is reflected in the name 'At-Bash'. The At-Bash cipher is illustrated in Fig. 2.2. Even if you are not familiar with the letters of the Hebrew alphabet, the mapping may still be identified from the visual appearance. If you still find it hard to match, that's OK; we next describe an adaptation of the At-Bash cipher to the Latin alphabet.

To properly define ciphers, as well as more complex cryptographic schemes, we use pseudocode or a formula.
For monoalphabetic ciphers, a formula usually suffices. We define the cipher as a function of the input letter, where each letter is represented by its distance from the beginning of the alphabet. In Hebrew, there are 22 letters, so we encode them by the numbers from 0, representing the first Hebrew letter, Alef, to 21, representing the last letter, Taf. (The name 'At-Bash' reflects this 'reverse mapping' of the Hebrew alphabet: the 'At' refers to the mapping of the first letter, Aleph, to the last letter, Taf, and the 'Bash' to the mapping of the second letter, Beth, to the second-to-last letter, Shin.)

Let p be a plaintext message consisting of l Hebrew letters, where p[i], for i = 0, . . . , (l − 1), is the encoding of the corresponding letter (p[i] ∈ {0, 1, . . . , 21}). For technical reasons, it is a bit more convenient to use 0, rather than 1, as the index of the first plaintext letter (p[0]). We use c to denote the corresponding l-letter ciphertext, i.e., the encryption of p using the At-Bash cipher: c = EAt-Bash(p). We compute c using the following formula:

(∀i = 0, . . . , (l − 1)) c[i] = 21 − p[i]    (2.3)

It is convenient to denote the alphabet size by n, i.e., in Hebrew, n = 22. With this convention we can rewrite the formula of Equation 2.3 as c[i] = (n − 1) − p[i].

[Figure 2.2: The At-Bash Cipher (the Hebrew alphabet mapped to itself in reverse order).]

The Az-By cipher The Az-By cipher is the same as the At-Bash cipher, except that it uses the Latin alphabet, which has n = 26 letters (from A to Z). We illustrate the Az-By cipher in the top part of Fig. 2.3; below it, we present two other keyless ancient ciphers, which we discuss next - the Caesar and ROT13 ciphers.

Let p be a plaintext message consisting of the encoding of l Latin letters, where p[i] ∈ {0, 1, . . . , 25}, i.e., p[i] ∈ {0, 1, . . . , (n − 1)}. The corresponding Az-By ciphertext, c = EAz-By(p), is given by:

(∀i = 0, . . . , (l − 1)) c[i] = 25 − p[i] = (n − 1) − p[i]    (2.4)

Obviously, this formula is the same as that of the At-Bash cipher (Equation 2.3), except for adjusting to the fact that the Latin alphabet has n = 26 letters, while the Hebrew alphabet has only 22.

Figure 2.3: The AzBy, Caesar and ROT13 Ciphers.
AzBy:   ABCDEFGHIJKLMNOPQRSTUVWXYZ → ZYXWVUTSRQPONMLKJIHGFEDCBA
Caesar: ABCDEFGHIJKLMNOPQRSTUVWXYZ → DEFGHIJKLMNOPQRSTUVWXYZABC
ROT13:  ABCDEFGHIJKLMNOPQRSTUVWXYZ → NOPQRSTUVWXYZABCDEFGHIJKLM

The Caesar cipher. We next present the well-known Caesar cipher. The Caesar cipher was used, as the name implies, by Julius Caesar. It is also a monoalphabetic cipher, and we describe a variant which operates on the set of the n = 26 Latin letters, from A to Z. (In Caesar's time, the alphabet contained only 23 characters; for simplicity, we describe a variant applied to the current Latin alphabet of 26 characters.) In the Caesar cipher, each plaintext letter is replaced by the letter appearing in the alphabet three places after the plaintext letter. To map the last three letters of the alphabet (X, Y and Z), we wrap around to the first three letters (A, B and C), i.e., X is mapped to A, Y to B and Z to C. See the middle row of Figure 2.3.
As with the Az-By cipher, we represent each letter by its distance from the beginning of the Latin alphabet; i.e., we represent the letter 'A' by the number 0, and so on; 'Z' is represented by 25. Let p be a plaintext message consisting of the encoding of l Latin letters, where p[i] ∈ {0, 1, . . . , 25}, and let c denote the corresponding l-letter Caesar ciphertext c = ECaesar(p). Then c is given by:

(∀i = 0, . . . , (l − 1)) c[i] = p[i] + 3 mod 26    (2.5)

For example, consider encryption of the plaintext word 'axe', whose encoding is p[0] = 0, p[1] = 23 and p[2] = 4. This gives c[0] = 3, c[1] = 0, c[2] = 7, i.e., the ciphertext string 'dah'. Simple!

Exercise 2.1. Let p and c be a plaintext string and the corresponding Caesar ciphertext, as above. Show the process for decrypting a given ciphertext c, i.e., computing the corresponding plaintext p. Use an equation similar to Equation 2.5.

The ROT13 cipher ROT13 is a popular variant of the Caesar cipher, with the minor difference that ROT13 'rotates' the letters by 13 positions, while Caesar rotates by 3 positions. Let p be a plaintext message consisting of the encoding of l Latin letters, where p[i] ∈ {0, 1, . . . , 25}, and let c denote the corresponding l-letter ROT13 ciphertext c = EROT13(p). Then c is given by:

(∀i = 0, . . . , (l − 1)) c[i] = p[i] + 13 mod 26    (2.6)

The ROT13 cipher is illustrated by the bottom row of Figure 2.3. We are not aware of usage of ROT13 to obtain secrecy; it is normally used only to prevent inadvertent exposure of the plaintext, such as to hide potentially offensive jokes or to obscure the answer to a puzzle or other spoiler. Because of its utter unsuitability for real secrecy, ROT13 is often invoked to refer to weak encryption schemes (e.g., 'about as secure as ROT13'). A convenient feature of ROT13 is that it is a self-inverse function, i.e., decryption is exactly the same process as encryption.

Exercise 2.2. Show that ROT13 is a self-inverse function. Namely, show that for every plaintext message p holds: p = EROT13(EROT13(p)). Can you identify additional ancient ciphers which are self-inverse functions?

The Masonic cipher. A final example of a historic, keyless, monoalphabetic cipher is the Masonic cipher. The Masonic cipher is from the 18th century and is illustrated in Fig. 2.4. This cipher uses a 'key' to map from plaintext to ciphertext and back, but the key is only meant to assist in the mapping, since it has a regular structure and is considered part of the cipher.

[Figure 2.4: The Masonic Cipher, written graphically and as a mapping from the Latin alphabet to graphic shapes.]

2.1.2 Keyed-Caesar cipher

Keyless ciphers have limited utility; in particular, the design of the cipher becomes a critical secret, whose exposure completely breaks security. Therefore, every modern cipher, and even most historical ciphers, use secret keys. Readers who are interested in these (keyed) historical ciphers should consult manuscripts on the history of cryptology, e.g., [223, 362]. We only discuss, briefly, a few simple and well-known keyed historical ciphers in subsection 2.1.5.
In this subsection, we present a very simple, and trivially vulnerable, keyed cipher: the Keyed-Caesar cipher, also referred to as the shift cipher. The Keyed-Caesar cipher is a simple keyed variant of the Caesar cipher; in fact, some people do not even distinguish between the Keyed-Caesar cipher and the Caesar cipher. The Keyed-Caesar cipher helps us explain the Vigenère cipher (in subsection 2.1.5), and illustrates the fact that using a key - even a long key - is not sufficient for security.

Recall that the Caesar cipher is defined by c[i] = p[i] + 3 mod n, where p[i] ∈ {0, 1, . . . , n − 1} is a single letter (with n = 26 for Latin). The Keyed-Caesar cipher is defined with an additional parameter: a key k ∈ {0, 1, . . . , n − 1}. Given an input plaintext string p consisting of l 'letters', p[0], . . . , p[l − 1], the ciphertext string c = E_k^{KC-n}(p) consists of the l 'letters' c[0], . . . , c[l − 1] computed as:

(∀i = 0, . . . , l − 1) c[i] = p[i] + k mod n    (2.7)

When using n = 26, the Keyed-Caesar cipher encrypts a single Latin character at a time, just like the Caesar cipher; it simply uses an arbitrary rotation k, rather than the fixed rotation of the original Caesar (k = 3) and ROT13 (k = 13) ciphers. A minimal code sketch of this cipher family appears at the end of this subsection.

Obviously, with only n = 26 keys, the Keyed-Caesar cipher is insecure; the attacker only needs to try the 26 possible key values. This attack, where the attacker tries all possible keys, is called exhaustive search, and can be used against any cipher. To foil exhaustive search, we can use longer keys and blocks of plaintext. For example, if we extend our character set to the 16-bit UCS-2 character set, then n = 2^16, making exhaustive search harder. By using even longer keys and blocks, say five UCS-2 characters (80 bits), we have n = 2^80 keys, making exhaustive search impractical.

However, even with a very large n, the Keyed-Caesar cipher is still insecure. In particular, as the following exercise shows, even with huge n, the Keyed-Caesar cipher can easily be broken by an attacker who has access to a single pair of a plaintext p and the corresponding ciphertext c = E_k^{KC-n}(p). Furthermore, any message would do - even a single-letter message (l = 1). This is a very weak form of a known-plaintext attack (KPA); we discuss KPA and other attack models for cryptosystems in Section 2.2.

Exercise 2.3 (Known plaintext attack (KPA) on the Keyed-Caesar cipher). Let p be an arbitrary l-letter plaintext for the Keyed-Caesar cipher with any alphabet size n, i.e., ∀i = 0, . . . , l − 1 : p[i] ∈ {0, . . . , n − 1}, and let c = E_k^{KC-n}(p) be the corresponding ciphertext using key k ∈ {0, . . . , n − 1}. Given one letter from p, say p[1], and the corresponding letter from c, i.e., c[1], show how the attacker can find the key k, allowing decryption of any ciphertext.

Solution: From Equation 2.7, k = c[i] − p[i] mod n for any index i where both the plaintext and ciphertext letters are known, e.g., i = 1 in the setting above.
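The following short Python sketch (an illustration, not part of the lab scripts) implements the shift-cipher family over the Latin alphabet - Caesar is k = 3, ROT13 is k = 13, and the Keyed-Caesar cipher uses an arbitrary key k - and demonstrates the known-plaintext key recovery of Exercise 2.3.

    # Shift (Keyed-Caesar) cipher over the Latin alphabet, n = 26.
    # Caesar is shift_encrypt(p, 3); ROT13 is shift_encrypt(p, 13).
    N = 26

    def shift_encrypt(p: str, k: int) -> str:
        return ''.join(chr((ord(ch) - ord('A') + k) % N + ord('A')) for ch in p)

    def shift_decrypt(c: str, k: int) -> str:
        return shift_encrypt(c, -k % N)

    def recover_key(p_letter: str, c_letter: str) -> int:
        # Exercise 2.3: a single (plaintext, ciphertext) letter pair exposes the key,
        # since k = c[i] - p[i] mod n.
        return (ord(c_letter) - ord(p_letter)) % N

    if __name__ == '__main__':
        c = shift_encrypt('AXE', 3)            # Caesar: 'DAH', as in the text
        print(c, shift_decrypt(c, 3))
        print(recover_key('A', c[0]))          # recovers k = 3 from one letter pair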
2.1.3 The General Monoalphabetic Substitution (GMS) Cipher

Monoalphabetic substitution ciphers are deterministic, stateless mappings from plaintext characters to ciphertext characters or symbols. The use of any other set of symbols instead of letters does not substantially change the security of such ciphers, hence we focus on permutations over a fixed alphabet.

The Masonic, At-Bash, Caesar and Keyed-Caesar ciphers are all monoalphabetic substitution ciphers. The Masonic, At-Bash and Caesar ciphers are keyless, i.e., they are defined by a specific permutation; for example, the Caesar cipher is the rotate-by-three permutation. The Keyed-Caesar cipher uses the rotate-by-k permutation, where the key k is a letter in the alphabet.

The General Monoalphabetic Substitution (GMS) cipher is the simple keyed cipher obtained by applying an arbitrary permutation (mapping) from the plaintext characters to the ciphertext characters; the key is the permutation. Namely, given a plaintext p = p[0], . . . , p[l − 1], the ciphertext c = E_k^{GMS}(p) consists of the l 'letters' c[0], . . . , c[l − 1] computed as:

(∀i = 0, . . . , l − 1) c[i] = k(p[i])    (2.8)

The key, which is a permutation over the alphabet, is often written as a table with two rows: the first containing the plaintext letters and the second containing the corresponding ciphertext letters. Some readers may recall having used, maybe long ago, such 'key tables' to create a simple monoalphabetic cipher; many kids do.

In the Latin alphabet, there are n = 26 letters; any of them could be chosen as the image of A, then any of the remaining 25 as the image of B, and so on. Namely, the total number of permutations (i.e., keys) is 26!, the factorial of 26 (26! ≡ 26 · 25 · . . . · 2 · 1). The factorial grows very fast as a function of n; for example, 26! > 2^88, i.e., there are over 2^88 permutations (keys) for the 26-letter General Monoalphabetic Substitution (GMS) cipher. Namely, the Latin alphabet, with n = 26 characters, suffices to make exhaustive search infeasible for the GMS cipher. This is a significant improvement compared to the Keyed-Caesar cipher, where, to obtain the same number of keys, we would need a huge alphabet of n = 26! > 2^88 letters. However, when used with a small alphabet such as the Latin alphabet, the GMS cipher is vulnerable to the frequency analysis attack, a simple attack that we describe next.

[Figure 2.5: Frequencies of letters and of the most common bigrams in English, based on [302]. The most common letters are E, T, A, O, I, N, S, R, H, L, . . .; the most common bigrams are TH, HE, IN, ER, AN, RE, ON, AT, EN, ND.]

2.1.4 Frequency analysis attacks on monoalphabetic ciphers

Plaintext messages are rarely completely random strings; some messages, or parts of messages, are more common than others. A frequency analysis attack exploits knowledge about the plaintext distribution to facilitate cryptanalysis. Frequency analysis attacks often succeeded against historical ciphers, using only this knowledge about the plaintext distribution and a collection of ciphertext messages; we refer to such attacks as ciphertext-only (CTO) attacks. The frequency analysis attack is effective against any monoalphabetic cipher, including the General Monoalphabetic Substitution (GMS) cipher. The only exceptions, where frequency analysis may fail, are when using extremely large alphabets, and/or when the plaintext has a uniform ('random') distribution.

Classical monoalphabetic ciphers map each letter in the alphabet of a specific natural (human) language, e.g., the Latin alphabet, to a fixed letter (or symbol); namely, the alphabet is not very large. Furthermore, in the typical case, the plaintext is a natural-language message, e.g., in English.
Therefore, such plaintext is not uniformly distributed. In fact, some letters are significantly more common than others; similarly, some pairs of letters (bigrams) and some strings of several letters (n-grams) are significantly more common than others. See Figure 2.5. Knowledge about the language, and the distributions of letters, bigrams and n-grams, is usually available to the attacker; this makes frequency analysis, and other ciphertext-only (CTO) attacks, easier to launch than known-plaintext attacks (KPA).

Furthermore, frequency attacks work exceedingly well against classical monoalphabetic ciphers, given the encryption of text in a known language. Indeed, some deductions, and often complete decryption, may be done manually; for example, in English texts, the most common letter is almost always E (12.49%). Identification of T and H is also quite easy; T is the second-most-common letter (9.28%), and TH and HE are the most common bigrams (3.56% and 3.07%, respectively). Similarly, once we have identified E, it is easy to identify R too, since ER and RE are among the most common bigrams (2.02% and 1.85%). These easy deductions suffice to decrypt about a third of the letters in the ciphertext; and the reader can surely find a few other easy deductions. Of course, there is no reason to work manually; it is easy to write a program that efficiently cryptanalyzes any monoalphabetic cipher, including the General Monoalphabetic Substitution (GMS) cipher.

Exercise 2.4. Write two programs: one that implements the General Monoalphabetic Substitution (GMS) cipher, and another that cryptanalyzes the resulting ciphertexts (without being given the key). Your cryptanalysis program can assume that the encrypted plaintext is typical text in English. (A minimal starting sketch for the frequency-counting part appears below.)

By experimenting with the cryptanalysis program of Exercise 2.4, you will find that it may fail if given a short ciphertext - but is very reliable given sufficiently long ciphertext. This phenomenon exists for other attacks too; cryptanalysis often requires a significant amount of ciphertext encrypted using the same encryption key. This motivates refreshing (changing) the key, thereby limiting the use of each key to a limited amount of plaintext (and ciphertext). Frequent key refresh makes cryptanalysis harder or, ideally, infeasible.

Principle 4 (Limit usage of each key). Systems deploying ciphers/cryptosystems should limit the amount of usage of each key, changing keys as necessary, to foil cryptanalysis attacks.

An extreme example of this is the one-time pad (OTP) cipher, which we discuss later (Section 2.4). The one-time pad is essentially a one-bit substitution cipher - but with a different random mapping for each bit. This turns the insecure substitution cipher into a provably secure cipher!

Another way to defend against letter-frequency attacks is to use a much larger alphabet, or to map sequences of letters rather than individual letters. For example, by simply mapping pairs of plaintext letters rather than single letters, we basically prevent the use of the letter-frequency table; of course, the attacker can still take advantage of the bigram distribution. However, this requires the use of a correspondingly larger table to map between plaintext and ciphertext. Such larger tables are difficult to store and share, and result in high overhead.
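As a starting point for Exercise 2.4, here is a minimal, illustrative Python sketch (not from the textbook's lab scripts) that implements the GMS cipher for a random permutation key and counts letter frequencies in a ciphertext; comparing these counts to the English frequencies of Figure 2.5 is the first step of the frequency analysis attack.

    import random
    from collections import Counter

    ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

    def gms_keygen() -> dict:
        # A GMS key is a random permutation of the alphabet.
        shuffled = random.sample(ALPHABET, len(ALPHABET))
        return dict(zip(ALPHABET, shuffled))

    def gms_encrypt(p: str, key: dict) -> str:
        return ''.join(key[ch] for ch in p if ch in key)

    def letter_frequencies(c: str) -> list:
        # The most common ciphertext letters likely correspond to E, T, A, ... (Figure 2.5).
        counts = Counter(c)
        total = sum(counts.values())
        return [(ch, counts[ch] / total) for ch, _ in counts.most_common()]

    if __name__ == '__main__':
        key = gms_keygen()
        c = gms_encrypt('FREQUENCYANALYSISEXPLOITSTHEDISTRIBUTIONOFLETTERS', key)
        print(c)
        print(letter_frequencies(c)[:5])   # top-5 ciphertext letters and their frequencies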
In Section 2.6 we present block ciphers, which are, basically, efficient monoalphabetic substitution ciphers that use a large 'alphabet', e.g., blocks of 64 bits for DES or 128 bits for AES. Block ciphers use much shorter keys compared to the General Monoalphabetic Substitution (GMS) cipher, yet their keys are sufficiently long to foil exhaustive search, typically from 56 bits (DES) to 256 bits (AES). Unlike the Keyed-Caesar cipher, block ciphers are designed to ensure security against frequency analysis as well as against KPA and other cryptanalysis attacks.

2.1.5 The Polyalphabetic Vigenère ciphers

We conclude our discussion of historical ciphers by discussing polyalphabetic ciphers, and in particular, two variants of the Vigenère cipher. We first describe the simpler (and weaker) repeating-key Vigenère cipher, published in 1553 by Giovan Battista Bellaso. (The repeating-key Vigenère cipher is usually referred to simply as the Vigenère cipher. We add the qualifier 'repeating-key' to avoid confusion with the Autokey, which is the stronger cipher published, later, by Vigenère. The Vigenère cipher was published first by Bellaso, so its name is anyway misleading.) We then describe the stronger Autokey cipher, published in 1586 by Blaise de Vigenère. We refer to these two ciphers as the Vigenère ciphers.

Both Vigenère ciphers are polyalphabetic ciphers, namely, they use multiple mappings from plaintext characters to ciphertext characters. The motivation for using multiple mappings is to defeat frequency analysis attacks, and different variants of the Vigenère cipher were used until the twentieth century; in fact, even the Enigma (Section 1.6) is a polyalphabetic cipher, essentially similar to a cascade of a few Vigenère ciphers.

The (repeating-key) Vigenère cipher. The (repeating-key) Vigenère cipher extends the Keyed-Caesar cipher by using a string of several characters as the key, rather than a single character as in the Keyed-Caesar cipher. The (repeating-key) Vigenère cipher simply applies the letters of the key, one by one, repeating the key after using its last character.

Namely, given an input plaintext string p consisting of l characters, p[0], . . . , p[l − 1], and a key k consisting of λ characters, k[0], . . . , k[λ − 1], the ciphertext string c = E_k^{Vigenère}(p) consists of the l characters c[0], . . . , c[l − 1] computed as:

(∀i = 0, . . . , l − 1) c[i] = p[i] + k[i mod λ] mod n    (2.9)

Example 2.1. Consider encryption of the string p = 'ABCDEFG', whose encoding is p[i] = i (for i = 0, . . . , 6), using the key k = 'BED', whose encoding is k[0] = 1, k[1] = 4 and k[2] = 3. The resulting ciphertext c is the string 'BFFEIIH'.

Attacking the (repeating-key) Vigenère cipher. Since polyalphabetic ciphers such as the Vigenère cipher use different keys (offsets) for different characters, a direct application of frequency analysis would fail. The first attack against the (repeating-key) Vigenère cipher was published by Kasiski in 1863. (This attack was known earlier but not published; in particular, it was found in personal notes written by Babbage in 1846.) This attack proceeds in two steps: finding the length λ of the key k, and then finding the key itself (k[0], . . . , k[λ − 1]).

Once the key length λ is found, we can apply frequency analysis separately to each of the λ sequences s_i (i = 0, . . . , λ − 1) of ciphertext characters generated by the different key letters:
s_0 ≡ (c[0], c[λ], c[2λ], . . .)
s_1 ≡ (c[1], c[λ + 1], c[2λ + 1], . . .)
. . .
s_{λ−1} ≡ (c[λ − 1], c[2λ − 1], . . .)

To find λ, Kasiski looked for repeating sequences of characters (n-grams) in the ciphertext. For example, assume that for some m, m′ holds c[m : m + 3] = c[m′ : m′ + 3]. By substituting c[m : m + 3] and c[m′ : m′ + 3] from Equation 2.9, we have:

(∀i = 0, . . . , 3) p[m + i] + k[(m + i) mod λ] mod n = p[m′ + i] + k[(m′ + i) mod λ] mod n    (2.10)

Very often, the plaintext contains some repeating words or strings, ranging from relatively common words or n-grams to terms which are specific to that plaintext. This is the most common reason for repeating strings in the ciphertext. Namely, when we find such m, m′, it is likely that p[m : m + 3] = p[m′ : m′ + 3]. Hence, from Equation 2.10, holds:

(∀i = 0, . . . , 3) k[(m + i) mod λ] = k[(m′ + i) mod λ]    (2.11)

Usually, when Equation 2.11 holds, then m ≡ m′ (mod λ), i.e., m − m′ is a multiple of λ. The attack usually proceeds by finding a few such repeating sequences, e.g., m1 ≡ m′1 (mod λ) and m2 ≡ m′2 (mod λ); then λ is a common divisor of m1 − m′1 and of m2 − m′2, or one of the (few) common divisors. We then apply frequency analysis to find each of the λ characters of the key, k[0], . . . , k[λ − 1].

The one-time-pad. An important scenario is when we use the Vigenère cipher with a key which is (at least) as long as the plaintext. Namely, each character of the key is used to hide only one character of the plaintext. In this case, Equation 2.9 simplifies to c[i] = p[i] + k[i] mod n; and in the special case of n = 2, we have c[i] = p[i] ⊕ k[i]. When the key k is chosen randomly, and especially when n = 2, this cipher is referred to as the one-time pad; we discuss it in Section 2.4. The one-time pad is unconditionally secure, i.e., the attacker cannot learn from the ciphertext anything about the plaintext, regardless of the attacker's computational capabilities (for any n > 1).

The Autokey. We now describe the Autokey, which is the cipher that Vigenère actually designed; it is an enhancement, or variant, of the repeating-key Vigenère cipher. Both Vigenère ciphers operate in the same way until exhausting the characters of the key; but then they differ. Recall that the repeating-key Vigenère cipher reuses the same key string. Instead, the Autokey uses the plaintext.

Namely, given an input plaintext string p consisting of l characters, p[0], . . . , p[l − 1], and a key k consisting of λ characters, k[0], . . . , k[λ − 1], we compute the ciphertext string c = E_k^{Autokey}(p) as follows. We first define the autokey k′ ≡ k||p, i.e., the concatenation of the plaintext to the key. Next, we compute the ciphertext c, which consists of the l characters c[0], . . . , c[l − 1] computed as:

(∀i = 0, . . . , l − 1) c[i] = p[i] + k′[i] mod n    (2.12)

Example 2.2. Consider encryption of the string p = 'ABCDEFG', whose encoding is p[i] = i (for i = 0, . . . , 6), using the key k = 'BED', whose encoding is k[0] = 1, k[1] = 4 and k[2] = 3. The autokey would be k′ = 'BEDABCDEFG', and the resulting ciphertext c is the string 'BFFDFHJ'.
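Both Vigenère variants are easy to implement; the following minimal Python sketch (an illustration, not part of the lab scripts) encodes letters as numbers in {0, . . . , 25} and reproduces Examples 2.1 and 2.2.

    N = 26

    def encode(s: str) -> list:
        return [ord(ch) - ord('A') for ch in s]

    def decode(v: list) -> str:
        return ''.join(chr(x + ord('A')) for x in v)

    def vigenere_encrypt(p: str, k: str) -> str:
        # Repeating-key Vigenere: c[i] = p[i] + k[i mod lambda] mod n  (Equation 2.9)
        pv, kv = encode(p), encode(k)
        return decode([(pv[i] + kv[i % len(kv)]) % N for i in range(len(pv))])

    def autokey_encrypt(p: str, k: str) -> str:
        # Autokey: the running key is k' = k || p  (Equation 2.12)
        pv, kv = encode(p), encode(k) + encode(p)
        return decode([(pv[i] + kv[i]) % N for i in range(len(pv))])

    if __name__ == '__main__':
        print(vigenere_encrypt('ABCDEFG', 'BED'))   # 'BFFEIIH', as in Example 2.1
        print(autokey_encrypt('ABCDEFG', 'BED'))    # 'BFFDFHJ', as in Example 2.2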
Attacking the Autokey. Let us present two instructive attacks on the Autokey. These attacks use two different models of the attacker's capabilities, the known-plaintext attack (KPA) model and the ciphertext-only attack (CTO) model; we discuss these and other attack models against cryptosystems below, in Section 2.2. A minimal code sketch of the known-plaintext attack of Example 2.3 follows the two examples.

Example 2.3 (A known-plaintext attack against the Autokey). Assume that the attacker captures a ciphertext c, and knows that the plaintext is of the form p = pKnown ++ pSecret; for example, pKnown = 'The password is:'. The attacker can find the key k and the plaintext pSecret as follows, assuming that the key k is not longer than pKnown. Let λ denote the number of characters in the key k, i.e., k = k[0 : λ]. Hence, for i = 0, . . . , λ − 1 holds: (1) k′[i] = k[i] (by the definition of k′), (2) p[i] = pKnown[i], and (3) c[i] = p[i] + k′[i] mod n (Equation 2.12). The attacker can, therefore, find the key k, since for i = 0, . . . , λ − 1 holds: k[i] = k′[i] = c[i] − pKnown[i] mod n. In the (less likely) case that k is longer than pKnown, this attack will expose the first |pKnown| characters of k. This will allow exposure of the first |pKnown| characters of the plaintext of other messages encrypted using k. To expose additional characters, we will need to use a different attack, such as the CTO attack which we describe next.

Example 2.4 (A ciphertext-only (CTO) attack against the Autokey). In spite of their name, ciphertext-only (CTO) attacks require some knowledge about the plaintext. For example, assume that the plaintext is known to be in English. Let us assume also that the attacker knows the number of characters in the key, λ. Assume that the attacker knows some plaintext character p[i]. Then c[i + λ] = p[i + λ] + p[i] mod n; hence the attacker can find p[i + λ]. Now that p[i + λ] is known, the attacker can find p[i + 2λ], and so on. Similarly, the attacker can find p[i − λ], provided, of course, that i − λ ≥ 0. Namely, given a guess for p[i], the attacker can find p[j] for every j with j ≡ i (mod λ).

This allows the attacker to decrypt the ciphertext using frequency analysis; let us sketch how. Since the letter E is very common (about 1/8 of the letters!), the attacker tries different indexes i, testing whether p[i] = 4 (indicating the letter E). To test if this is the case, the attacker uses the above process to find what would be the letters p[j] for every j with j ≡ i (mod λ), and checks the distribution of the resulting sequence of letters. We conclude that the guess was correct (p[i] = 4, i.e., the i-th letter is indeed E) if, and only if, the distribution is close to the letter frequency in English; in particular, if roughly 1/8 of the letters in this sequence are E. After the attacker finds, in this way, the sequence of plaintext letters p[j] for every j with j ≡ i (mod λ), it can continue to decrypt additional ciphertext letters by guessing additional letters. In particular, the attacker can now use the bigram distribution to guess letters adjacent to already-exposed letters.
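The known-plaintext attack of Example 2.3 is only a few lines of code. The Python sketch below (an illustration, not part of the lab scripts, and self-contained) recovers the key from a ciphertext and a known plaintext prefix - assuming, for simplicity, that the key length is known and is not longer than the known prefix - and then decrypts the rest progressively.

    N = 26

    def encode(s): return [ord(ch) - ord('A') for ch in s]
    def decode(v): return ''.join(chr(x + ord('A')) for x in v)

    def autokey_encrypt(p, k):
        pv, kv = encode(p), encode(k) + encode(p)      # running key k' = k || p
        return decode([(pv[i] + kv[i]) % N for i in range(len(pv))])

    def recover_key(c, p_known, key_len):
        # Example 2.3: for i < key_len <= |p_known|, k[i] = c[i] - p_known[i] mod n.
        cv, pv = encode(c), encode(p_known)
        return decode([(cv[i] - pv[i]) % N for i in range(key_len)])

    def autokey_decrypt(c, k):
        # Decrypt progressively: each recovered plaintext letter extends the running key.
        cv, kv, p = encode(c), encode(k), []
        for i in range(len(cv)):
            key_char = kv[i] if i < len(kv) else p[i - len(kv)]
            p.append((cv[i] - key_char) % N)
        return decode(p)

    if __name__ == '__main__':
        c = autokey_encrypt('THEPASSWORDISXYZZY', 'BED')
        k = recover_key(c, 'THEPASSWORDIS', 3)     # assumes the key length (3) is known
        print(k, autokey_decrypt(c, k))            # recovers 'BED' and the full plaintext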
2.2 Cryptanalysis Attack Models: CTO, KPA, CPA and CCA

As discussed in Section 1.1, and in particular in Principle 1, security should be defined and analyzed with respect to a clear, well-defined model of the attacker's capabilities, which we refer to as the attack model. In particular, cryptanalysis attack models define the capabilities of attackers trying to 'break' an encryption scheme.

In this section, we introduce the four basic cryptanalysis attack models: CTO, KPA, CPA and CCA; we already gave a few examples of CTO and KPA attacks. Table 2.1 summarizes these four basic cryptanalysis attack models, as well as two additional cryptanalysis attack models: the chosen-ciphertext side-channel attack (CCSCA), against public-key cryptosystems, presented in Chapter 6, and the CPA-Oracle attack, against shared-key cryptosystems, presented in Chapter 7. Following Kerckhoffs's principle, in all of these attack models, the design of the cryptosystem is known to the attacker; defense is provided only by the secret keys (private keys, for public-key cryptosystems), which are selected or generated randomly.

Table 2.1: Cryptanalysis Attack Models. In all attack types, the cryptanalyst knows the cipher design and a body of ciphertext.

Attack model | Cryptanalyst knowledge, capabilities | Section | Fig.
Ciphertext Only (CTO) | Plaintext distribution (possibly noisy/partial) | 2.2 | 2.6
Known Plaintext Attack (KPA) | Set of (ciphertext, plaintext) pairs | 2.2 | 2.7
Chosen Plaintext Attack (CPA) | Ciphertext for arbitrary plaintext chosen by attacker | 2.2 | 2.8
Chosen Ciphertext Attack (CCA) | Plaintext for arbitrary ciphertext chosen by attacker | 2.2 | 2.9
Chosen-ciphertext side-channel attack (CCSCA) | 'Side-channel feedback' from processing adversary-selected ciphertexts | 6.5.7 | 6.20
CPA-Oracle Attack | Encryptions of pre ++ x ++ post, for challenge x and chosen (pre, post), and feedback from processing selected ciphertexts | 7.2.3 | 2.36

The Ciphertext-Only (CTO) attack model. We discussed above the letter-frequency attack, which relied only on access to a sufficient amount of ciphertext, and on knowing the letter distribution of plaintext messages. Later, in subsection 2.3.1, we present exhaustive search, an attack requiring the ability to identify correctly-decrypted plaintext (with significant probability). Both of these attacks require only access to (sufficient) ciphertext, and some limited knowledge about the plaintext distribution - in these examples, knowledge of the letter frequencies of the plaintext language, or the ability to identify possible plaintexts, respectively. We refer to such attacks, which require only these 'minimal' attacker capabilities, as ciphertext-only (CTO) attacks, or as attacks under the ciphertext-only (CTO) attack model. (For historical reasons, we use the CTO acronym for the ciphertext-only attack, although it is inconsistent with the other acronyms for attack models.) In particular, ciphertext-only attacks do not require a (plaintext, ciphertext) pair.

[Figure 2.6: The Ciphertext-Only (CTO) attack model. Notice the small image representing the plaintext distribution, which is a tiny version of the letter-distribution graph of Figure 2.5; this distribution is used to sample the plaintext, both for the plaintext messages m0, m1, . . . and for sample plaintext messages given to the attacker.]

To facilitate CTO attacks, the attacker must have some knowledge about the distribution of plaintexts. In practice, such knowledge is typically implied by the specific application or scenario.
For example, when it is known that the message is in English, the attacker can apply known statistics such as the letter-distribution histogram of Figure 2.5. For a formal, precise definition, we normally allow the adversary to pick the plaintext distribution. Note that this requires defining security carefully, to prevent absurd 'attacks', which clearly fail in practice, from seeming to fall under the definition.

[Figure 2.7: The Known-Plaintext Attack (KPA) model. As in the CTO model, all plaintext messages, m∗ and m0, m1, . . ., are chosen from a known plaintext distribution, and the eavesdropping adversary receives their encryptions, c∗ and c0, c1, . . ., and tries to learn m∗ - or something about m∗. In the KPA model, the adversary receives, in addition, the plaintext messages m0, m1, . . ., except for the 'challenge message' m∗; the plaintext messages m0, m1, . . . are not given to the CTO attacker.]

The Known-Plaintext Attack (KPA) model. In the known-plaintext attack (KPA) model, the attacker receives one or multiple pairs of plaintext and the corresponding ciphertext. However, the attacker cannot choose the plaintext; one way to model this is to assume that the plaintext is chosen randomly.

In historical attacks on cryptographic systems, known-plaintext attacks were sometimes possible, such as when some text was available both in plaintext and in ciphertext. One interesting example is the 'deciphering' of the Rosetta stone, which contained the same inscription engraved in three different ways: once in Greek and twice in Egyptian, once using hieroglyphic script and once using Demotic script. This is how archaeologists learned to read hieroglyphic script. Previous attempts to decipher hieroglyphics, using 'ciphertext only', i.e., without known plaintext, were in vain.

Another example of a historical known-plaintext attack was the cryptanalysis of the Enigma cipher by the Allies during WWII (Section 1.6). The Germans were often sending encryptions of plaintext which would either be known or could be guessed with reasonable probability. Sometimes the same message was sent to some recipients encrypted, and to others in plaintext; and often the message began with a predictable greeting or otherwise contained some predictable content. The (plaintext m, ciphertext Ek(m)) pairs were fed to the Bombe devices, which tried all possible keys until finding the correct key (exhaustive search). Note that this was incorrect use of the Enigma; following the conservative design and usage principle, the exposure of (plaintext, ciphertext) pairs should have been avoided.

In modern applied cryptography, it is very common for the attacker to obtain KPA capabilities. For example, consider the common use of the TLS protocol (Chapter 7) to protect web communication, referred to as https (for 'HTTP-secure'). In such usage, the entire traffic between browser and web-server is encrypted by TLS. Typically, this includes images and web-pages (encoded in HTML) which are sent to all clients. Namely, the attacker can obtain this plaintext simply by requesting the same page from the web-server.
The Chosen-Plaintext Attack (CPA) model. In subsection 2.3.2, we discuss the table look-up and time-memory tradeoff attacks; in both of these generic attacks, the adversary must be able to obtain the encryption of one or a few specific plaintext messages - the messages used to create the precomputed table. Therefore, these attacks cannot be launched under the known-plaintext attack (KPA) model. Instead, these attacks are facilitated by the chosen-plaintext attack (CPA) model.

[Figure 2.8: The Chosen-Plaintext Attack (CPA) model. Here, the attacker can choose the plaintext messages m0, m1, . . ., in addition to knowing the distribution from which m∗ is sampled. Note: in the CPA attack, the attacker controls the plaintext messages given to Alice; however, we usually still consider this attacker to be an 'eavesdropper', since the attacker cannot modify or inject messages into the communication between Alice and Bob.]

Security against CPA has been studied extensively in modern cryptography, but was not even considered in historical cryptanalysis. For example, in the second world war, cryptanalysts in Bletchley Park made extensive use of some known plaintexts, i.e., used KPA. However, they had no hope of choosing the plaintext, i.e., of using CPA. Therefore, they were not able to use CPA attack techniques, e.g., the attacks of subsection 2.3.2.

However, in modern applied cryptography, it is not that unusual for the attacker to obtain CPA capabilities. For example, consider, as for KPA, the common use of the TLS protocol (Chapter 7) to protect web communication. TLS applies encryption to the traffic from the browser to the server; however, a basic premise of the http protocol is that every web-page (and script) can send requests to any other website, including websites from different domains. Namely, when the user is browsing the attacker's web-page, or when the browser is running a script written by the attacker, the attacker can cause the browser to send arbitrary requests to any website. These requests would often also contain sensitive information, typically a cookie attached by the browser to the request. The browser will often also use the same key to encrypt other requests sent by the browser to the same website.

Every cipher vulnerable to a CTO attack is also vulnerable to a KPA attack; and every cipher vulnerable to KPA is also vulnerable to a CPA attack. We say that the CPA attack model is stronger than the KPA model, and that the KPA model is stronger than the CTO model, and denote this by: CPA > KPA > CTO.

Exercise 2.5 (CPA > KPA > CTO). Explain (informally) why every cryptosystem vulnerable to a CTO attack is also vulnerable to KPA, and every cryptosystem vulnerable to KPA is also vulnerable to CPA.

The Chosen-Ciphertext Attack (CCA) model. Finally, in the chosen-ciphertext attack (CCA) model, the attacker has the ability to receive the decryption of arbitrary ciphertexts, chosen by the attacker. This attack model may seem absurd; if the attacker can receive the decryption of arbitrary ciphertexts, then the encryption does not protect confidentiality even without cryptanalysis, no?
However, this is actually a very important attack model, since, in practice, there are different scenarios where the attacker may be able to obtain some information about the plaintext of some ciphertexts. In particular, in subsection 6.5.7 we present the important Bleichenbacher attack against (insecure) padding schemes applied to the plaintext before RSA encryption; this attack is under the chosen-ciphertext side-channel attack (CCSCA) model, which is a weaker version of the CCA attack model.

There are a few other variants of the CCA model. One set of variants combines the CCA model with the CTO, KPA or CPA models, i.e., also allows the adversary the ability to receive encryptions of random plaintexts from a known distribution, of known plaintexts or of chosen plaintexts, respectively. Another set of variants concerns the timing of the choice of ciphertext messages to be decrypted; some CCA models require the adversary to choose these ciphertexts before receiving the challenge ciphertext c*, and other CCA models allow the adversary to select these ciphertexts after receiving c*, of course forbidding the use of c* as one of the ciphertexts to be decrypted.

We adopt the common definition, where CCA attackers are also allowed to perform CPA attacks, i.e., the attacker can obtain the encryptions of attacker-chosen plaintext messages. With this definition, trivially, CCA > CPA. Combining this with the previous exercise, we have the complete ordering: CCA > CPA > KPA > CTO.

Figure 2.9: The Chosen-Ciphertext Attack (CCA) model. In the CCA model, the adversary can select ciphertext messages c'0, c'1, ..., all different from the 'challenge ciphertext' c*, and receive their decryptions m'i = D_k(c'i).

2.3 Generic attacks and Effective Key-Length

We discussed several ciphers and attack models; how can we evaluate the security of different ciphers under a given attack model? This is a fundamental challenge in cryptography; we discuss it in this section, as well as later on, especially when we introduce our first definition of a cryptographic mechanism, the Pseudo-Random Generator (PRG), in subsection 2.5.2.

One way in which non-experts often compare the security of different ciphers is by their key length. We caution that while a sufficiently long key is required for security, a long key is not sufficient to ensure security; in subsection 2.3.3 we introduce the effective key length concept and principle, which should be used instead of simply relying on the key length. Nevertheless, the key length of a cipher is important, since a short key does allow attacks. In fact, there are attacks, which we call generic attacks, that work against any cryptosystem with an insufficiently long key, or against all cryptosystems that share some (common) property. Generic attacks may differ in the attack model they assume, in the types of cryptosystems they break and in the attack efficiency and overhead. We present three important generic attacks in this section. In subsection 2.3.1, we present the exhaustive search attack, which essentially tries out all the keys until finding the right one. Exhaustive search works on most ciphers and scenarios; it requires the ability to test candidate keys.
Such ability usually exists, but not always, as we explain. In subsection 2.3.2 we discuss two other generic attacks: table look-up and time-memory tradeoff. These attacks further demonstrate that the attacker's success does not depend only on the key length - it also depends on the attack model and attacker capabilities, e.g., storage capacity. Finally, in subsection 2.3.3 we discuss additional challenges in evaluating the security of cryptographic mechanisms, and introduce the effective key length concept and principle.

2.3.1 The generic exhaustive-search CTO attack

Recall that in the CTO attack model, the attacker has some knowledge about the distribution of plaintexts. In this section we discuss generic exhaustive-search CTO attacks, also called exhaustive search or brute force attacks, where the attacker uses this knowledge to cryptanalyze a generic cryptosystem, i.e., without dependency on the design of the specific cryptosystem. In Algorithm 2 we present a simple exhaustive-search CTO attack, which uses a predicate Valid(p) that validates plaintexts, i.e., Valid(p) returns ⊥ if p is not a valid plaintext, for example, if valid plaintexts are English words and p is not one. Other CTO exhaustive key search algorithms do not require such a predicate, and typically use a known probability distribution over the plaintext space, e.g., the distribution of letters in English (Figure 2.5).

Algorithm 2 A CTO exhaustive key search algorithm using predicate Valid(·), for stateless decryption algorithm D.
    Set of possible keys: K = {k1, k2, ..., kn}
    Set of ciphertexts: C = {c1, c2, ...}
    Predicate Valid(p): returns True if p is a valid plaintext, ⊥ otherwise
    for all k in K do
        if (∃c ∈ C) Valid(D_k(c)) = ⊥ then
            Remove k from K
    return set of remaining candidate keys K

Algorithm 2 decrypts each of the known ciphertext messages using every possible key. If the decryption of some ciphertext c using some key k is not a valid plaintext, it follows that k is not a correct key. The process typically eliminates all but one or a few candidate keys, and incorrect keys are discarded quickly, after testing them against a small number of ciphertexts.

Testing candidate keys. Exhaustive search works best when decrypting arbitrary ciphertext with an incorrect key usually results in clearly-invalid plaintext. Notice our use of the term 'usually'; surely there is some probability that decryption with the wrong key will result in seemingly-valid plaintext. Hence, exhaustive search may not return a single correct decryption key. Instead, quite often, exhaustive search may return multiple candidate keys, which all resulted in seemingly-valid decryption. In such cases, the attacker must eliminate some of these candidate keys by trying to decrypt additional ciphertexts and discarding a key when its decryption of some ciphertext appears to result in invalid plaintext.
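To make Algorithm 2 concrete, here is a minimal Python sketch of the exhaustive-search CTO attack; the decrypt and valid callables are hypothetical placeholders for the cipher's decryption function D_k(·) and the Valid(·) predicate, not part of any specific library.

    def exhaustive_search(keys, ciphertexts, decrypt, valid):
        """Return the candidate keys consistent with all observed ciphertexts."""
        candidates = set(keys)
        for k in keys:
            for c in ciphertexts:
                if not valid(decrypt(k, c)):   # a wrong key usually yields
                    candidates.discard(k)      # an invalid plaintext
                    break                      # no need to test more ciphertexts
        return candidates

For a toy cipher one could call exhaustive_search(range(2**20), ciphertexts, decrypt, valid); for realistic key lengths, the cost of the outer loop is, of course, the whole problem - which brings us to the question of feasibility.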
Is CTO exhaustive search feasible? Exhaustive search is not always feasible. One concern is the availability of the Valid(·) function, or of other distinguishing properties based on the probability distribution of plaintexts. In particular, exhaustive search is infeasible if the plaintext space is uniformly random, i.e., any string is equally likely to be used as plaintext.

Let us therefore focus on the simpler case, where a Valid(·) function is known. Furthermore, assume the typical case where decryption of a valid ciphertext with an incorrect key results, with high probability, in an invalid plaintext p (i.e., Valid(p) = ⊥). Further focusing on stateless cryptosystems, we can exclude an incorrect key after a few trial decryptions.

Consider a symmetric cryptosystem (E, D), where the key is chosen as a random binary string of a given length l; namely, the key space contains 2^l keys. The attack succeeds after decrypting one or a few valid ciphertexts for, at most, each of the 2^l possible keys. The attack is therefore feasible when using insufficiently-long keys. Surprisingly, designers have repeatedly underestimated the risk of exhaustive search and used ciphers with insufficiently long keys, i.e., insufficiently large key spaces. Let us elaborate.

Let T_S be the sensitivity period, i.e., the duration for which secrecy must be maintained, and T_D be the time it takes to test each potential key, by performing one or more decryptions. Hence, the attacker can test T_S/T_D keys out of the key space containing 2^l keys. If T_S/T_D > 2^l, then the attacker can test all keys and find the key for certain (with probability 1); otherwise, the attacker succeeds with probability T_S/(T_D · 2^l). By selecting a sufficient key length, we can ensure that the success probability is as low as desired. For example, consider the conservative assumption of testing a billion keys per second, i.e., T_D = 10^{-9} seconds, and requiring security for three thousand years, i.e., T_S ≈ 10^{11} seconds, with probability of the attack succeeding at most 0.1%. To ensure security with these parameters against a brute-force attack, we need keys of length l ≥ log_2(T_S / (0.001 · T_D)) = log_2(10^{23}) ≈ 77 bits.

The above calculation assumed a minimal time to test each key. Of course, attackers will often be able to test many keys in parallel, by using multiple computers and/or parallel processing, possibly with hardware acceleration. Such methods were used during 1994-1999 in multiple demonstrations of the vulnerability of the Data Encryption Standard (DES) to different attacks. The final demonstration was an exhaustive search completing in 22 hours, testing many keys in parallel using a $250,000 dedicated-hardware machine ('Deep Crack') together with distributed.net, a network of computers contributing their idle time. However, the impact of such parallel testing, as well as of improvements in processing time, is easily addressed by a reasonable extension of the key length. Assume that an attacker is able to test 100 million keys in parallel during the same 10^{-9} second, i.e., T_D = 10^{-17}. With the same goals and calculation as above, we find that we need keys of length l ≥ log_2(T_S / (0.001 · T_D)) = log_2(10^{31}) ≈ 103 bits. This is still well below the minimal key length of 128 bits supported by the Advanced Encryption Standard (AES). Therefore, exhaustive search is not a viable attack against AES or other ciphers with sufficiently long keys, e.g., 128 bits.
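The calculation above is easy to reproduce. The following sketch simply plugs the (assumed) parameters into 2^l ≥ T_S / (p · T_D), where p is the maximal acceptable success probability; the function name and parameters are ours, for illustration only.

    import math

    def required_key_length(t_s, t_d, max_success_prob):
        # attacker tests t_s / t_d keys; success probability is (keys tested) / 2**l,
        # so we need 2**l >= (t_s / t_d) / max_success_prob
        keys_tested = t_s / t_d
        return math.ceil(math.log2(keys_tested / max_success_prob))

    print(required_key_length(1e11, 1e-9, 0.001))    # sequential testing: 77 bits
    print(required_key_length(1e11, 1e-17, 0.001))   # massively parallel: 103 bits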
In principle, exhaustive search can be applicable to public-key cryptosystems (KG, E, D), where the public and private keys are generated by the randomized key-generation algorithm, i.e., (e, d) ← KG(1^l). However, this is usually impractical, for two reasons. First, all known public-key cryptosystems have orders of magnitude more overhead than shared-key cryptosystems, which makes testing all keys impractical. Second, known public-key cryptosystems are all vulnerable to attacks which are significantly more efficient than exhaustive search, and are therefore used with significantly longer keys, which make exhaustive-search attacks infeasible. See details in Chapter 6 and specifically in Table 6.1.

Exhaustive search for stateful ciphers. Exhaustive search may be harder against stateful ciphers, since decryption may depend on the (initial) state of the cipher. The natural adaptation of the exhaustive search attack of Algorithm 2 requires considering the initial state as part of the key, which makes the attack harder; this motivated stateful cipher designs, for example the Vigenère and Autokey ciphers (subsection 2.1.5). Of course, if the initial state is known, then this concern does not exist. However, the attack can remain infeasible if the key space is too large. In particular, in Section 2.4 we present the one-time pad (OTP) stream cipher, where every plaintext bit is encrypted with a corresponding key bit, i.e., c = p ⊕ k where |k| = |p| = |c|. OTP cannot be broken by exhaustive search, since for every ciphertext c and every plaintext p of the same length, there is a key k such that c = p ⊕ k.

2.3.2 The Table Look-up and the Time-Memory Tradeoff Generic CPA attacks

Exhaustive search is very computation-intensive; it finds the key, on average, after testing half of the key space. On the other hand, its storage requirements are very modest, and almost independent of the key space (exhaustive search only needs storage for the key guesses). In contrast, the table look-up attack, which we explain next, uses O(2^l) memory, where l is the key length (see Section A.1 for the big-O notation), but only table look-up time. However, this attack requires the ciphertext of some pre-defined plaintext message, which we denote p*. This can be achieved by an attacker with chosen-plaintext attack (CPA) capabilities, or whenever the attacker can obtain encryptions of some well-known message p*. Many communication protocols use predictable, well-known messages at specific times, often upon connection initialization, which provides the attacker with encryptions of this predictable, known plaintext message p* and suffices for this attack.

In the table look-up attack, the attacker first precomputes T(k) = E_k(p*) for every key k, and for every c s.t. c = T(k) for some key k, also stores the inverse table T^{-1}(c) = {k s.t. c = E_k(p*)}. Later, the attacker asks for the encryption of the same plaintext p*, using the unknown secret key, which we denote k*; let c* = E_{k*}(p*) denote the received ciphertext. The attacker now identifies the key as one of the entries in T^{-1}(c*). The number of matching keys is usually one or very small, allowing the attacker to quickly rule out the incorrect keys, e.g., by decrypting some additional ciphertext messages.
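A minimal sketch of the table look-up attack follows; encrypt is a hypothetical placeholder for the cipher's encryption function E_k(·), and p_star is the fixed, predictable plaintext p*.

    from collections import defaultdict

    def build_table(keys, p_star, encrypt):
        """Precompute the inverse table: E_k(p*) -> set of keys producing it."""
        t_inv = defaultdict(set)
        for k in keys:
            t_inv[encrypt(k, p_star)].add(k)
        return t_inv

    def table_lookup_attack(t_inv, c_star):
        """Given c* = E_{k*}(p*), return the candidate keys for k*."""
        return t_inv.get(c_star, set())

The precomputation costs O(2^l) time and storage, but once the table is built, each attacked key costs only a single look-up.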
The table look-up attack requires O(2^l) storage to ensure O(1) computation, while the exhaustive search attack uses O(1) storage and O(2^l) computation. Several more advanced generic attacks allow different tradeoffs between the computing time and the amount of storage (memory) required for the attack. The first and most well-known time-memory tradeoff attack was presented by Martin Hellman [189]. Later works presented other tradeoff attacks, such as the time/memory/data tradeoff of [65] and the rainbow tables technique of [303]. Unfortunately, we will not be able to cover these interesting attacks, and readers are encouraged to read these (and other) papers presenting them.

2.3.3 Effective key length

Cryptanalysis, i.e., developing attacks on cryptographic mechanisms, is a large part of the research in applied cryptography; it includes generic attacks such as those presented earlier in this section, as well as numerous attacks which are tailored to a specific cryptographic mechanism. This may look surprising; why publish attacks? Surely the goal is not to help attacks against cryptographic systems? Cryptanalysis facilitates two critical decisions facing designers of security systems which use cryptography: which cryptographic mechanism to use, and which parameters to use, in particular, which key length to use. Let us focus first on the key length.

All too often, when cryptographic products and protocols are mentioned in the popular press, the key length in use is mentioned as an indicator of their security. Furthermore, this is sometimes used to argue for the security of the cryptographic mechanism, typically by presenting the number of different key values possible with a given key length. The number of different keys determines the time required for the exhaustive search attack, and has a direct impact on the resources required by the other generic attacks we discussed. Hence, clearly, keys must be sufficiently long to ensure security. But how long?

It is incorrect to compare the security of two different cryptographic systems, which use different cryptographic mechanisms (e.g., ciphers), by comparing the key lengths used in the two systems. Let us give two examples:

1. We saw that the general monoalphabetic substitution cipher (subsection 2.1.3) is insecure, although its key space is relatively large. We could easily increase the key length, e.g., by adding more symbols, say different symbols for lowercase and uppercase letters; but this would not significantly improve security.

2. The key length used by symmetric cryptosystems, as discussed in this chapter, rarely exceeds 300 bits, and is usually much smaller - 128 bits is common; more bits are simply considered unnecessary. In contrast, asymmetric, public-key cryptography is usually used with longer keys - often much longer, depending on the specific public-key cryptosystem; see Table 6.1.

It is useful to compare the security of different cryptosystems, when each is used with a specific key length - e.g., with comparable efficiency. As explained above, using the key length alone would be misleading. One convenient, widely used measure for the security of a given cryptosystem, used with a specific key length, is called the effective key length; essentially, it uses exhaustive search as a baseline to compare against. We say that a cipher using k-bit keys has effective key length l if the most effective attack known against it takes about 2^l operations, where k ≥ l. We expect the effective key length of good symmetric ciphers to be close to their real key length, i.e., l should not be 'much smaller' than k. For important symmetric ciphers, any attack which increases the gap between k and l would be of great interest, and as the gap grows, there will be increasing concern with using the cipher.
The use of key lengths of 128 bits or more leaves a 'safety margin' against potentially better future attacks, and gives time to change to a new cipher when a stronger, more effective attack is found. Note that, as shown in Table 6.1, for asymmetric cryptosystems there is often a large gap between the real key length k and the effective key length l. This is considered acceptable, since the design of asymmetric cryptosystems is challenging, and it seems reasonable to expect attacks with performance much better than exhaustive search. In particular, in most public-key systems, the secret key is not an arbitrary, random binary string.

Note that the evaluation of the effective key length depends on the attack model; there are often attacks achieving a much smaller effective key length when assuming a stronger attack model, e.g., CPA compared to KPA or CTO. One should therefore also take into account the expected attack model. Also, notice that the effective key length measure compares based on the time required for the attack; it does not allow for comparing different resources, for example, time-memory tradeoffs.

Normally, we select a sufficient key length to ensure security against any conceivable adversary, e.g., leaving a reasonable margin above an effective key length of, say, 100 bits; a larger margin is required when the sensitivity period of the plaintext is longer. The cost of using longer keys is often justified, considering the damage of loss of security and of having to change, in a hurry, to a cipher with a longer effective key length, or even just having to use longer keys. In some scenarios, however, the use of longer keys may have significant costs; for example, doubling the key length in the RSA cryptosystem increases the computational costs by a factor of about six. We therefore may also consider the risk from exposure, as well as the resources that a (rational) attacker may deploy to break the system. This is summarized by the following principle.

Principle 5 (Sufficient effective key length). Deployed cryptosystems should have sufficient effective key length to foil feasible attacks, considering the maximal expected adversary resources and the most effective yet feasible attack model, as well as cryptanalysis and speed improvements expected over the sensitivity period of the plaintext.

Experts, as well as standardization and security organizations, publish estimates of the required key length for different cryptosystems (and other cryptographic schemes); we present a few estimates in Table 6.1.

2.4 Unconditional security and the One Time Pad (OTP)

The exhaustive search and table look-up attacks are generic - they do not depend on the specific design of the cipher: their complexity is merely a function of the key length. This raises the natural question: is every cipher breakable, given enough resources? Or can encryption be unconditionally secure - even against an attacker with unbounded resources (time, computation speed, storage)? We next present such an unconditionally secure cipher, the one-time pad (OTP). The one-time pad is often attributed to a 1919 patent by Gilbert Vernam [382], although some of the critical aspects may have been due to Mauborgne [49], and in fact the idea was already proposed by Frank Miller in 1882 [48]; we refer readers to the many excellent references on the history of encryption, e.g., [223, 362].
The one-time pad is not just unconditionally secure - it is also an exceedingly simple and computationally efficient cipher. Specifically:

Encryption: To encrypt a message, compute its bitwise XOR with the key. Namely, the encryption of each plaintext bit, say m[i], is one ciphertext bit, c[i], computed as c[i] = m[i] ⊕ k[i], where k[i] is the i-th bit of the key.

Decryption: Decryption simply reverses the encryption, i.e., the i-th decrypted bit is c[i] ⊕ k[i].

Key: The key bits k = {k[1], k[2], ...} should consist of independently drawn fair coin flips, and the key must be at least as long as the plaintext. Notice that this long key should be - somehow - shared between the parties, which is a challenge in many scenarios.

See the illustration in Figure 2.10.

Figure 2.10: The one-time pad (OTP) cipher - an unconditionally-secure stateful stream cipher: (∀i) c[i] = m[i] ⊕ k[i] (bit-wise XOR).

The correctness of OTP, i.e., the fact that decryption recovers the plaintext correctly, follows from the properties of exclusive OR. Namely, given c[i] = m[i] ⊕ k[i], the corresponding decrypted bit is c[i] ⊕ k[i] = (m[i] ⊕ k[i]) ⊕ k[i] = m[i], as required.

The unconditional security of OTP also follows from properties of XOR; let us explain it, albeit informally. First, OTP handles each bit completely independently of the others, so we can focus on the security of a particular bit m[i], encrypted as c[i] = m[i] ⊕ k[i]. Recall that each key bit k[i] is selected randomly, and suppose, for simplicity, that the message bit m[i] was also selected randomly. Namely, both bits can be either 0 or 1 with probability half. Given c[i], there are two equally likely conclusions: either k[i] = 0 and therefore m[i] = c[i], or k[i] = 1 and therefore m[i] = 1 − c[i]. Namely, seeing c[i] does not change our knowledge about m[i], simply because k[i] is a random bit. The precise argument is similar, mainly avoiding the assumption that the message bit is fair, i.e., allowing some prior knowledge about the probability that m[i] = 1; the argument shows that observing c[i] provides no extra knowledge about m[i].

The unconditional secrecy of OTP was recognized early on, and established rigorously in a seminal paper published in 1949 by Claude Shannon [353]. In that paper, Shannon also proved the more challenging fact that every unconditionally-secure cipher must have keys as long as the plaintext; namely, as long as unconditional secrecy is required, this aspect cannot be improved. Interested readers can find this proof in textbooks on cryptography, e.g., [370].

Interestingly, OTP is actually a very special case of the Keyed-Caesar cipher (subsection 2.1.2). Recall that the Keyed-Caesar cipher is defined by c[i] = p[i] + k mod n. The OTP cipher is basically the same, except that we use n = 2 - and a different key bit k[i] for every plaintext bit m[i]. Specifically:

    c[i] = m[i] ⊕ k[i] = m[i] + k[i] mod 2    (2.13)

In fact, the Keyed-Caesar cipher is unconditionally secure also for n > 2, as long as each key letter k[i] is chosen randomly from {0, 1, ..., n − 1} and used to encrypt a single plaintext letter m[i] ∈ {0, 1, ..., n − 1}.
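Before continuing, here is a minimal sketch of OTP over byte strings (XORing byte-by-byte is equivalent to the bit-wise definition); it is for illustration only, and assumes the key is truly random, at least as long as the plaintext, and never reused.

    import secrets

    def otp_encrypt(key: bytes, plaintext: bytes) -> bytes:
        # the key must be truly random and at least as long as the plaintext
        assert len(key) >= len(plaintext)
        return bytes(m ^ k for m, k in zip(plaintext, key))

    otp_decrypt = otp_encrypt               # decryption is the same XOR
    key = secrets.token_bytes(14)           # fresh random pad, never reused
    assert otp_decrypt(key, otp_encrypt(key, b"attack at dawn")) == b"attack at dawn"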
The cryptographic literature has many beautiful results on unconditional security. However, it is rarely practical to use such long keys, and in practice, adversaries - like everyone else - have limited computational abilities. Therefore, in this textbook, we focus on computationally-bounded adversaries. While the key required by OTP makes its use rarely practical, we next show a computationally-secure variant of OTP, where the key can be much smaller than the plaintext, and which can be used in practical schemes. Of course, this variant is - at best - secure only against computationally-bounded attackers.

Note that OTP handles multiple plaintext messages as part of one long plaintext string, i.e., it uses m[i] for the i-th plaintext bit, and similarly for ciphertext and key bits; basically, one could say that OTP treats its input as one long plaintext message, which it encrypts one bit at a time. Some other stateful ciphers operate on multiple plaintext messages, where each message may consist of multiple bits.

Exercise 2.6. Define the stateful encryption and decryption functions ⟨E, D⟩ for the OTP cipher.

Solution: We use the index i of the next bit to be encrypted as the state, initialized with i = 1. Namely, encryption E_k(m[i], i) returns (m[i] ⊕ k[i], i + 1), and decryption D_k(c[i], i) returns (c[i] ⊕ k[i], i + 1).

Stream ciphers. The one-time pad (OTP) is often referred to as a stream cipher. We use the term stream cipher to refer to stateful cryptosystems that use a bit-by-bit encryption process, i.e., a stream cipher maps each plaintext bit m[i] to a corresponding ciphertext bit c[i]. (Other authors use the term stream cipher also for cryptosystems that encrypt byte-by-byte or block-by-block, essentially as a synonym for stateful encryption.) Since stream ciphers map each plaintext bit to a ciphertext bit, and we require decryption to be correct (recover the plaintext), the mapping from plaintext to ciphertext cannot be randomized; but obviously, it also cannot be the same for all bits, or decryption would be trivial. It follows that stream ciphers must be stateful. For example, with the one-time pad, not only must the parties share a key as long as all plaintext bits, they must also maintain an exact, synchronized count of the number of key bits used so far, to ensure correct decryption.

Reuse of key bits with the one-time pad is also insecure. In particular, suppose a design uses the same key bit k[i] to encrypt both m[i] and m[i + 1], i.e., c[i] = m[i] ⊕ k[i] but also c[i + 1] = m[i + 1] ⊕ k[i], reusing k[i]. Then an attacker knowing one plaintext, e.g., m[i] and c[i], and eavesdropping on c[i + 1], can find m[i + 1]. Specifically, in this case, m[i + 1] = c[i + 1] ⊕ (m[i] ⊕ c[i]). See Example 2.5 for a more realistic example of this vulnerability.

Stream ciphers are often used in applied cryptography, and especially in hardware, mainly due to their simple and efficient hardware implementation. Rarely do we use OTP (or another unconditionally-secure cipher); much more commonly, we use a stream cipher with a bounded-length key, providing 'only' computational security. In the following section we introduce pseudo-random generators (PRG) and pseudo-random functions (PRF), and show how to use either of them to design a bounded-key-length stream cipher.

2.5 Pseudo-Randomness, Indistinguishability and Asymptotic Security

Randomness is widely used in cryptography - for example, the one-time pad cipher (Section 2.4) uses random keys to ensure unconditional secrecy.
In this section, we introduce pseudo-randomness, a central concept in cryptography, and three types of pseudo-random schemes: the pseudo-random generator (PRG), the pseudo-random function (PRF) and the pseudo-random permutation (PRP). We also introduce the technique of the indistinguishability test, which is central to the definitions of these three pseudo-random schemes - as well as to the definition of secure encryption, which we present later.

2.5.1 Pseudo-Random Generators and their use for Bounded Key-length Stream Ciphers

In this subsection we introduce the Pseudo-Random Generator (PRG). A PRG is one of the simpler cryptographic definitions, and hence we consider it a good choice for the first definition; however, it is still not that easy. Hence, in this subsection we introduce PRGs but define them only informally. We focus on the classical application of PRGs, which is to construct a stream cipher.

We have already seen a stream cipher: the one-time pad (OTP). The OTP has the advantage of being unconditionally secure, but also the disadvantage of requiring the parties to share a secret key which is as long as all the plaintext bits they may need to encrypt, i.e., to share a key of unbounded length. In contrast, stream ciphers constructed from PRGs only require the parties to share a bounded-length key; this is a critical advantage over the OTP. On the other hand, stream ciphers with bounded key length, such as those constructed from PRGs, cannot be unconditionally secure; instead, they can only be computationally secure - i.e., secure only assuming that the attacker has limited computational capabilities.

In this section, we will see one method to implement a bounded-key-length stream cipher from a pseudo-random generator (PRG). PRGs have other important applications, and are one of the cryptographic mechanisms whose definition is least complex; therefore, they are a good way to introduce the more complex - and even more important - cryptographic mechanisms of pseudo-random functions (PRF), pseudo-random permutations (PRP) and block ciphers, which we present later.

PRG: intuitive definition. In this subsection, we only introduce PRGs informally, focusing on their use in the construction of bounded-key-length stream ciphers. We focus on the following simple definition.

Definition 2.3 (Informal definition of a (stateless) PRG). Given any random input string s, usually referred to as the seed, a PRG f_PRG outputs a longer string, i.e., (∀s ∈ {0,1}*) |f_PRG(s)| > |s|. Furthermore, if s is a random string (of |s| bits), then f_PRG(s) is pseudo-random. Intuitively, this means that f_PRG(s) cannot be efficiently distinguished from a truly random string of the same length |f_PRG(s)|.

We define these concepts of efficient distinguishing and PRG precisely quite soon, in subsection 2.5.2. However, first, let us see that there are some other ways to define a PRG. Let us mention one of these variants, which is widely used; to avoid confusion, we refer to this variant as a stateful PRG, and, where relevant, refer to a PRG defined as in Definition 2.3 as a stateless PRG.

Variant: stateful PRG. A stateful PRG f_SPRG is a function that receives two inputs, a current state (or seed) s and a 'timestamp' t, and outputs a pseudo-random string r and a new state s'. We use the dot notation to refer to the two outputs, f_SPRG.r and f_SPRG.s'.
The timestamp input t is optional; if used, we require the output for a given timestamp t1 to be pseudo-random even if the adversary is given the output f_SPRG(s, t2) for a different timestamp t2 ≠ t1.

Building a stream cipher from a PRG. To obtain a stream cipher, we require a PRG which produces a pseudo-random string as long as the plaintext. (The output of some PRGs may be only slightly longer than their input, e.g., one bit longer. However, we can use such a PRG to construct another PRG whose output length is longer, as a function of its input length. The details and proof are beyond our scope; see, e.g., [165].) We then XOR each plaintext bit with the corresponding pseudo-random bit, as shown in Figure 2.11.

Figure 2.11: PRG-based Stream Cipher. The input to the PRG is usually called either key or seed; if the input is random, or pseudo-random, then the (longer) output string is pseudo-random. The state includes the current bit index i.

The pseudo-random generator stream cipher is very similar to the OTP; the only difference is that instead of using a truly random sequence of bits to XOR with the plaintext bits, we use the output of a Pseudo-Random Generator (PRG). If we denote the i-th output bit of f_PRG(k) by f_PRG(k)[i], then the i-th ciphertext bit c[i] is defined as c[i] = m[i] ⊕ f_PRG(k)[i]. This is a shared-key stream cipher in which k is the shared key, quite similar to the OTP. Specifically, the state is the index of the bit i, encryption of plaintext bit m[i] is m[i] ⊕ f_PRG(k)[i] and changes the state to i + 1, and decryption of ciphertext bit c[i] is c[i] ⊕ f_PRG(k)[i] (and changes the state to i + 1). Note that this may require us to compute the value of f_PRG(k) each time we need a specific bit, or to store all or parts of f_PRG(k).
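As an illustration of Figure 2.11, here is a sketch of a PRG-based stream cipher; for simplicity it encrypts one message at once rather than maintaining the bit-index state, and it uses the SHAKE-256 extendable-output function from hashlib purely as a stand-in for some secure PRG - an assumption for illustration, not a security claim about SHAKE-256 as a PRG in the sense defined below.

    import hashlib

    def prg(seed: bytes, out_len: int) -> bytes:
        # stand-in PRG: stretch the seed to out_len pseudo-random bytes
        return hashlib.shake_256(seed).digest(out_len)

    def stream_encrypt(key: bytes, plaintext: bytes) -> bytes:
        pad = prg(key, len(plaintext))              # the pad f_PRG(k)
        return bytes(m ^ p for m, p in zip(plaintext, pad))

    stream_decrypt = stream_encrypt                 # XOR with the same pad

Exactly as with OTP, pad bits must never be reused; a stateful implementation would keep the index i and use fresh output bits for every plaintext bit.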
In subsection 2.5.4 below, we properly define pseudo-random generators (PRGs), using the PRG indistinguishability test, which we present in subsection 2.5.3. But let us first introduce the ingenious concept of an indistinguishability test, which was introduced by Alan Turing.

2.5.2 The Turing Indistinguishability Test

Intuitively, a PRG is an efficient algorithm whose input is a binary string s, called a seed (or sometimes a key); if the input is either random or pseudo-random, then the (longer) output string is pseudo-random. In order to turn this intuitive description into a definition, we first have to define clearly the notions of 'efficient' and 'pseudo-random'. We discuss these notions in the following subsections; in this subsection, we first present the ingenious but non-trivial concept of an indistinguishability test, which is key to the notion of pseudo-randomness - and to many definitions in cryptography.

The first indistinguishability test was the Turing indistinguishability test, proposed by Alan Turing in 1950, in a seminal paper [373] which laid the foundations for artificial intelligence. Turing proposed the test, illustrated in Figure 2.12, as a possible definition of an intelligent machine. Turing referred to this test as the imitation test; another name often used for this test is simply the Turing test.

Figure 2.12: The Turing Indistinguishability Test. A machine is considered intelligent if a distinguisher (judge) cannot determine which box holds the machine and which holds a human.

Turing stipulated that communication between the distinguisher and the boxes would only be in printed form, to avoid what he considered 'technical' challenges such as voice recognition. Many cryptographic mechanisms are defined using indistinguishability tests. These tests are similar, in their basic concept, to the Turing indistinguishability test. The following subsection presents the first such test, which tests for the important property of pseudo-randomness.

2.5.3 PRG indistinguishability test

We now return to the discussion of pseudo-randomness, and define the PRG indistinguishability test, illustrated in Figure 2.13. The pseudo-randomness test is similar to the Turing indistinguishability test in Figure 2.12, in the sense that a distinguisher is asked to identify which is the 'real thing' (the intelligent person in the Turing test, and the random sequence here) and which is the 'imitation' (the machine in the Turing test, and the sequence output by the function f here).

Intuitively, a pseudo-random generator is a function f whose input is a 'short' random bit string x ← {0,1}^n, and whose output is a longer string f(x) ∈ {0,1}^{l_n} s.t. l_n > n, which is pseudo-random - i.e., indistinguishable from a random string (of the same length l_n). But what does it mean for the output to be indistinguishable from random? This is defined by the PRG indistinguishability test, which we next define and which, in concept, quite resembles the Turing indistinguishability test, although the details are different. The similarity can be seen by comparing the illustration of the PRG indistinguishability test in Figure 2.13 to that of the Turing test in Figure 2.12.

Figure 2.13: Intuition for the Pseudo-Random Generator (PRG) Indistinguishability Test. Intuitively, f : {0,1}* → {0,1}* is a (secure) pseudo-random generator (PRG) if an efficient distinguisher D can't effectively distinguish between f(x), for a random input x, and a random string of the same length |f(x)|, where |f(x)| > |x|.

In order to turn this intuition into a definition of a (secure) Pseudo-Random Generator (PRG), we must specify precisely the capabilities of the distinguisher and the criteria for the outcome of the experiment, i.e., when we would say that f is indeed a (good/secure) PRG. We discuss these two aspects in the following subsection, where we finally present definitions for a (secure) PRG.

2.5.4 Defining Secure Pseudo-Random Generator (PRG)

We now finally define a (secure) pseudo-random generator (PRG). We first define the distinguisher capabilities; next, we define the advantage ε^PRG_{D,f}(n) of D for function f and inputs of length n; and finally we define a (secure) PRG.

Distinguisher capabilities. We model the distinguisher as an algorithm, denoted D, which receives a binary string - either a random string or the 'pseudo-random' output of the PRG f - and outputs its evaluation, which should be 0 if given a truly random string, and 1 otherwise, i.e., if the input is not truly random. The distinguisher algorithm D has to be efficient (PPT). The terms efficient algorithm and PPT (Probabilistic Polynomial Time) algorithm are crucial to the definitions of asymptotic security, which we use in this textbook; see Section A.1.

PRG: the advantage of D for f.
Before we define the criteria for a function f to be considered a (secure) Pseudo-Random Generator (PRG), notice that by simply guessing randomly, the distinguisher may succeed with probability 1/2. Namely, succeeding with probability 1/2 does not imply a vulnerability, and should not result in any advantage for D.

We therefore define the advantage ε^PRG_{D,f}(n) of D for function f as the probability that D outputs 1 (correctly) when given the output of the pseudo-random function f(x), for a random input x, minus the probability that D outputs 1 (incorrectly) when given a truly random string r. As required, this gives no advantage to a distinguisher which simply guesses the bit. However, this introduces a challenge: how should we choose x and r? We solve this challenge with the following assumption on f.

Length-uniform assumption. We simplify the definitions by assuming that f is a length-uniform function, i.e., for every input of length n, the output is of the same length l_n.

We can now present the definition of the advantage ε^PRG_{D,f}(n): the probability that D correctly outputs 1 when given f(x) for a random n-bit input x ← {0,1}^n, minus the probability that D incorrectly outputs 1 when given a random l_n-bit string r.

Definition 2.4. Let f : {0,1}* → {0,1}* be a length-uniform function, i.e., if |x| = n then |f(x)| = l_n, and let D be an algorithm. The PRG-advantage of D for f is denoted ε^PRG_{D,f}(n) and defined as:

    ε^PRG_{D,f}(n) ≡ Pr_{x←{0,1}^n}[D(f(x)) = 1] − Pr_{r←{0,1}^{l_n}}[D(r) = 1]    (2.14)

The probabilities in Equation 2.14 are computed over a uniformly-random n-bit binary string x (the seed), a uniformly-random l_n = |f(1^n)|-bit binary string r, and the uniformly-random coin tosses of the distinguisher D, if D uses random bits. Note that the length of the output of f depends only on the length of the input; hence our use of l_n.

Definition of Secure PRG. Finally, let us define a secure PRG. The definition assumes both the PRG and the distinguisher D are efficient (PPT), i.e., their running time is bounded by a polynomial in the input length. Note that the PRG must be deterministic, but the distinguisher D may be probabilistic (randomized).

Definition 2.5 (Secure Pseudo-Random Generator (PRG)). A length-uniform function f : {0,1}* → {0,1}*, s.t. (∀x ∈ {0,1}^n) l_n = |f(x)|, is a secure Pseudo-Random Generator (PRG) if it is efficiently computable (f ∈ PPT), length-increasing (l_n > n) and ensures indistinguishability, i.e., for every distinguisher D ∈ PPT, the advantage of D for f is negligible: ε^PRG_{D,f}(n) ∈ NEGL, where ε^PRG_{D,f}(n) is defined as in Equation 2.14.

The term 'secure' is often omitted; i.e., when we simply say that algorithm f is a pseudo-random generator (PRG), this implies that it is a secure PRG.

Exercise 2.7. Let x ∈ {0,1}^n. Show that the following are not PRGs: (a) f_a(x) = 3x mod 2^{n+2} (using standard binary encoding), (b) f_b(x) = 3x mod 2^{n+1} (similarly), and (c) f_c(x) = x ++ parity(x), where parity(x) returns 1 if the number of 1 bits in x is odd and 0 if it is even.

Solution for part (a): Notice that here we view x as a number encoded in binary, whose value can be between 0 and 2^n − 1. A simple distinguisher D_a for f_a is: D_a(y) outputs 1 (i.e., 'pseudo-random') if y mod 3 = 0; otherwise, it outputs 0 (i.e., 'random'). Let us show why this distinguisher has significant advantage. First notice that if y = f_a(x), then D_a(y) outputs 1 (correctly), for every x ∈ {0,1}^n. This holds since 3x < 2^{n+2}, and hence y = f_a(x) = 3x mod 2^{n+2} = 3x. Namely, D_a(y) = D_a(3x) = 1, by definition of D_a. It remains to show that the probability that D_a(r) = 1, for r ← {0,1}^{n+2}, is significantly less than 1. If 2^{n+2} mod 3 = 1, this probability is exactly a third; otherwise, it is only about 2^{-(n+2)} higher. In either case, the probability is definitely much less than 1. Therefore, ε^PRG_{D_a,f_a}(n) ≥ 1 − (1/3 + 2^{-(n+2)}) > 1/2, which is clearly non-negligible.
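The distinguisher D_a from the solution is easily written out; as an assumption for illustration, its input is given as a Python integer (the standard binary encoding of the (n+2)-bit string).

    def d_a(y: int) -> int:
        # Output 1 ("pseudo-random") iff y is divisible by 3. f_a(x) = 3x always is,
        # while a uniformly random (n+2)-bit value is divisible by 3 only about a
        # third of the time, so the advantage of d_a is roughly 2/3.
        return 1 if y % 3 == 0 else 0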
2.5.5 Secure PRG Constructions

Note that we did not present a construction of a secure PRG. In fact, if we could present a provably-secure construction of a PRG, satisfying Definition 2.5, this would immediately prove that P ≠ NP, solving one of the most well-known open problems in the theory of complexity. Put differently, if P = NP, then there cannot be any secure PRG algorithm (satisfying Definition 2.5). Since resolving the P =? NP question is believed to be very hard, proving that a given construction is a (secure) PRG must also be very hard; such a proof would settle this famous open problem as a side-product. Similar arguments apply to most of the cryptographic mechanisms we will learn about in this book, including secure encryption when messages may be longer than the key. (The one-time pad (OTP) is a secure encryption scheme when used with a key which is at least as long as the plaintext.)

What is possible is to present a reduction-based construction of a PRG, namely, a construction of a PRG from some other cryptographic mechanism, along with a proof that the PRG is secure if that other mechanism is 'secure'. For example, see [165] for a construction of a PRG f from a different cryptographic mechanism called a one-way function (which we discuss in Section 3.4), and a reduction proof showing that if the construction of f uses a one-way function f_OWF, then the resulting function f is a PRG. We will also present a few reduction proofs; for example, later in this section we prove reductions which construct a PRG from other cryptographic mechanisms such as a pseudo-random function (PRF), see Exercise 2.13, and a block cipher. Courses, books and papers dealing with cryptography are full of reduction proofs, e.g., see [165, 166, 370].

Unfortunately, there is no proof of the existence of any of these - one-way functions, PRFs, block ciphers or most other cryptographic schemes. Indeed, such proofs would imply P ≠ NP. Still, reduction proofs are the main method of ensuring the security of most cryptographic mechanisms - by showing that they are 'at least as secure' as another cryptographic mechanism, typically a mechanism whose security is well established (e.g., by the failure of extensive cryptanalysis efforts). For example, it seems 'easier' to design a one-way function than a PRG. If so, then we could obtain a PRG using a given one-way function and a construction of a PRG from a one-way function. As a more practical example, block ciphers are standardized, with lots of cryptanalysis effort; therefore, block ciphers are a good basis to use for building other cryptographic functions.

Let us give an important example of a reduction-based proof which is specific to PRGs.
This is a construction of a PRG whose output is significantly longer than its input, from a PRG whose output is only one bit longer than its input. Unfortunately, the construction and proof are beyond our scope; see [165]. However, the following exercise (Exercise 2.8) proves a related - albeit much simpler - reduction, showing that a PRG from n bits to n + 1 bits also gives a PRG from n + 1 bits to n + 2 bits, simply by exposing one bit. In other words, this shows that a PRG may expose one (or more) of its input bits - and remain a PRG.

Exercise 2.8. Let f : {0,1}^n → {0,1}^{n+1} be a secure PRG. Is f' : {0,1}^{n+1} → {0,1}^{n+2}, defined as f'(b ++ x) = b ++ f(x), where b ∈ {0,1}, also a secure PRG?

Solution: Yes, if f is a PRG then f' is also a PRG. First, recall the PRG-advantage (Equation 2.14) for distinguisher D, using l_n = n + 1:

    ε^PRG_{D,f}(n) ≡ Pr_{x←{0,1}^n}[D(f(x)) = 1] − Pr_{r←{0,1}^{n+1}}[D(r) = 1]    (2.15)

Next, rewrite Equation 2.14 for f' and distinguisher D', substituting f' for f, x' for x, n + 1 for n and n + 2 for l_n:

    ε^PRG_{D',f'}(n + 1) ≡ Pr_{x'←{0,1}^{n+1}}[D'(f'(x')) = 1] − Pr_{r'←{0,1}^{n+2}}[D'(r') = 1]    (2.16)

We next present a simple construction of a distinguisher D (for f), using, as a subroutine (oracle), a given distinguisher D' (for f'):

    D(y) ≡ { return D'(b ++ y), where b ← {0,1} }    (2.17)

Clearly D is efficient (PPT) if and only if D' is efficient (PPT). We prove that ε^PRG_{D,f}(n) = ε^PRG_{D',f'}(n + 1); therefore, any efficient distinguisher D' with non-negligible advantage against f' yields an efficient distinguisher D with the same advantage against f, and hence f' is a PRG if f is a PRG. We begin by developing the first component of Equation 2.15:

    Pr_{x←{0,1}^n}[D(f(x)) = 1] = Pr_{x←{0,1}^n}[D'(b ++ f(x)) = 1 | b ← {0,1}]    (2.18)
                                = Pr_{x←{0,1}^n}[D'(f'(b ++ x)) = 1 | b ← {0,1}]   (2.19)
                                = Pr_{x'←{0,1}^{n+1}}[D'(f'(x')) = 1]              (2.20)

We now develop the other component of Equation 2.15:

    Pr_{r←{0,1}^{n+1}}[D(r) = 1] = Pr_{r←{0,1}^{n+1}}[D'(b ++ r) = 1 | b ← {0,1}]   (2.21)
                                 = Pr_{r'←{0,1}^{n+2}}[D'(r') = 1]                  (2.22)

Now substitute the two components in Equation 2.15:

    ε^PRG_{D,f}(n) ≡ Pr_{x←{0,1}^n}[D(f(x)) = 1] − Pr_{r←{0,1}^{n+1}}[D(r) = 1]                     (2.23)
                   = Pr_{x'←{0,1}^{n+1}}[D'(f'(x')) = 1] − Pr_{r'←{0,1}^{n+2}}[D'(r') = 1]          (2.24)
                   ≡ ε^PRG_{D',f'}(n + 1)                                                           (2.25)

Hence, ε^PRG_{D,f}(n) = ε^PRG_{D',f'}(n + 1); namely, if f is a PRG then f' is also a PRG.

Feedback Shift Registers (FSR). There are many proposed designs for PRGs. Many of these are based on Feedback Shift Registers (FSRs), with a known linear or non-linear feedback function f, as illustrated in Figure 2.14. For Linear Feedback Shift Registers (LFSRs), the feedback function f is simply the XOR of some of the bits of the register. Given the value of the initial bits r_1, r_2, ..., r_l of an FSR, the value of the next bit r_{l+1} is defined as r_{l+1} = f(r_1, ..., r_l); the following bits are defined similarly: (∀i > l) r_i = f(r_{i−l}, ..., r_{i−1}).

FSRs are well studied, with many desirable properties. However, by definition, their state is part of their output. Hence, they cannot directly be used as cryptographic PRGs. One solution is to define another function g over the bits of the register, which outputs one or more bits which should, hopefully, be pseudo-random.
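To make the FSR mechanics concrete, here is a minimal sketch of a linear feedback shift register; the register size and tap positions are arbitrary, chosen only for illustration.

    def lfsr_bits(initial_state, taps, count):
        """Yield `count` output bits of an LFSR; the feedback bit is the XOR of
        the bits at the tap positions, shifted in at the front of the register."""
        state = list(initial_state)
        for _ in range(count):
            yield state[-1]                         # output the last register bit
            feedback = 0
            for t in taps:
                feedback ^= state[t]
            state = [feedback] + state[:-1]         # shift, inserting the feedback bit

    # example: 4-bit register with (illustrative) taps at positions 0 and 3
    print(list(lfsr_bits([1, 0, 0, 1], taps=(0, 3), count=12)))

As the text notes, the register state leaks directly through the output, so an LFSR by itself is not a cryptographic PRG.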
The following exercise gives some examples.

Figure 2.14: Feedback Shift Register, with (linear or non-linear) feedback function f(·).

Exercise 2.9. For each of the following pairs of functions, show why they are not a secure PRG:

1. f(r_1, ..., r_l) = (Σ_{i=1}^{l} r_i) mod 2, g(r_1, ..., r_l) = r_1. Note: this is an LFSR.

2. f(r_1, ..., r_l) = Π_{i=1}^{l} r_i, and any g.

There are also many other designs of PRGs based on Feedback Shift Registers (FSRs), often combining multiple FSRs (often LFSRs) in different ways; one reason is that FSRs are convenient for efficient hardware implementations. Let us consider the case of GSM encryption.

GSM (Global System for Mobile Communications) is considered the second-generation (2G) cellular network technology. It was first deployed in Finland in 1991, quickly became the dominant cellular standard worldwide, and is still used alongside later-generation cellular technologies. Security was a major concern for the GSM designers, and they defined two PRGs, A5/1 and A5/2, both combining three LFSRs. In the hope of preventing cryptanalysis, the designs of A5/1 and A5/2 were kept secret, contrary to Kerckhoffs' principle (Principle 2). This decision was a mistake; the designs were reverse-engineered and made public quite quickly, and quite soon A5/2, and then also A5/1, were broken. The details are beyond our scope; see, e.g., [26].

Other PRG designs exist, e.g., the RC4 design, which is geared toward convenient software implementation. Let us briefly discuss the vulnerabilities found in RC4, and their potential impact.
For the theoretical deőnition, this holds for any computable difference, regardless of the ability to exploit it for a speciőc practical attack; for example, it suffices that some efficiently computable function f has a different distribution when applied to speciőc output bits of RC4, compared to its distribution for random bits. However, many PRG weaknesses, and in particular, RC4 weaknesses, are simpler: a bias of a particular bit or byte of the RC4 output. In particular, for a random byte sequence, the probability of any byte in the sequence to have a particular value should be exactly 2−8 , since all 28 byte values should be equally likely. However, in 2001, Mantin and Shamir found that given a random seed/key, the second output byte of RC4, denoted Z2 , has observable bias from random. Speciőcally, they found that: P r(Z2 = 08 ) ≈ 2−7 (2.26) Clearly, this shows that RC4 does not fulőll our deőnition of a secure PRG: its output can be efficiently distinguished from a random bit sequence. Based on the conservative design principle (Principle 3), this should have caused the use of alternative mechanisms, at least signiőcantly enhancing the RC4 security mechanism, or a completely different design. However, that did not happen; in spite of this and additional vulnerabilities found, the use of RC4 continued and even increased over more than a decade, with adoption by new standards such as TLS and WPA, in addition to its use by older standards such as SSL and WEP. Apparently, this vulnerability appeared too minor to be a major concern and to stop using RC4. Which shows the difficulty of adopting the conservative design principle in practice. Really, can we justify the principle? Can such apparently-minor vulnerability be exploited in a realistic attack? The answer to both questions is a resounding yes! Applied Introduction to Cryptography and Cybersecurity 88 CHAPTER 2. CONFIDENTIALITY: ENCRYPTION SCHEMES AND PSEUDO-RANDOMNESS First, when we őnd one vulnerability, we should assume more are probably lurking, possibly already known and exploited by some powerful attacker; we better change to a more secure design. In the case of RC4, this proved to be the case. Multiple additional vulnerabilities were discovered over the years, beginning with another important vulnerability in the same year [147]; see some of them in [235]. Even these did not stop the use of RC4, until the effective attack on TLS of [11] (see subsection 7.2.5). It is widely believed that some attackers already exploited attacks against RC4 for years before it was őnally abandoned. Some products still use RC4, at least for ‘backward compatibility’, which is often vulnerable to downgrade attacks, see subsection 5.6.3 and Section 7.5. Second, the Mantin-Shamir attack can be abused in some applications and scenarios. Speciőcally, in some applications, the same secret, x, is encrypted using RC4 many times, using different seed values, s1 , s2 , . . .. This scenario can occur in practice, e.g., when the SSL or TLS protocol is used to secure communication between browser and website; see subsection 7.2.5. Let us show how an attacker can, in such cases, őnd the second byte x[2] of the secret x, if x is encrypted using RC4, with seeds s1 , s2 , . . .. Let Z2 (si ) denote the value of the second byte of the output of RC4, when initiated with seeds s1 , s2 , . . .. Let us focus on the (vulnerable) second byte. The encryption of x[2] using seed si is the ciphertext ci [2] = x[2] ⊕ Z2 (si ). 
First, when we find one vulnerability, we should assume more are probably lurking, possibly already known to and exploited by some powerful attacker; we had better change to a more secure design. In the case of RC4, this proved to be the case. Multiple additional vulnerabilities were discovered over the years, beginning with another important vulnerability in the same year [147]; see some of them in [235]. Even these did not stop the use of RC4, until the effective attack on TLS of [11] (see subsection 7.2.5). It is widely believed that some attackers had already exploited attacks against RC4 for years before it was finally abandoned. Some products still use RC4, at least for 'backward compatibility', which is often vulnerable to downgrade attacks; see subsection 5.6.3 and Section 7.5.

Second, the Mantin-Shamir attack can be abused in some applications and scenarios. Specifically, in some applications, the same secret, x, is encrypted using RC4 many times, using different seed values s_1, s_2, .... This scenario can occur in practice, e.g., when the SSL or TLS protocol is used to secure communication between browser and website; see subsection 7.2.5. Let us show how an attacker can, in such cases, find the second byte x[2] of the secret x, if x is encrypted using RC4 with seeds s_1, s_2, .... Let Z_2(s_i) denote the value of the second byte of the output of RC4, when initiated with seed s_i. Let us focus on the (vulnerable) second byte. The encryption of x[2] using seed s_i is the ciphertext byte c_i[2] = x[2] ⊕ Z_2(s_i). Hence, while each value of a ciphertext byte normally appears with probability about 2^{-8}, from Equation 2.26 we have Pr(c_i[2] = x[2]) ≈ 2^{-7}. Therefore, the second byte of the secret x would simply be the most common second byte among the ciphertexts - by a large margin. Namely, the second byte of a plaintext encrypted by RC4 can be exposed by a simple, efficient and effective ciphertext-only (CTO) attack, if the attacker can obtain a reasonable number of ciphertexts (a few hundred at most). And if this is not sufficiently convincing, see the improved, practical attack on TLS in subsection 7.2.5.

Several variants of RC4 were proposed to defend against this and other attacks. The simplest variant simply discards some initial portion of the output; this is known as RC4-dropN, where N is the number of initial output bytes discarded, e.g., 256 or 1024. RC4-dropN clearly avoids the specific Mantin-Shamir attack, but may fail against other attacks, most notably the attack against the use of RC4 by TLS [11]; see subsection 7.2.5.

Vulnerabilities due to incorrect use of RC4 (or any PRG). While PRGs, and other cryptographic schemes, can be vulnerable, many security failures are due to incorrect, vulnerable usage of cryptographic schemes. Let us give an example of an attack against a vulnerable deployment of a PRG in a system. Specifically, we present an attack against MS-Word 2002, which exploits a vulnerability in the usage of the RC4 PRG, rather than in its design. Namely, this attack could have been carried out if any PRG was used in the same (incorrect and vulnerable) way as RC4 was used by MS-Word 2002.

Example 2.5. MS-Word 2002 used RC4 for document encryption, in the following way. The user provided a password for the document; that password was used as the key to the RC4 PRG, producing a long pseudo-random string referred to as Pad, i.e., Pad = RC4(password). When the document is saved to or restored from storage, it is XORed with Pad. This design is vulnerable; can you spot why? The vulnerability is not specific in any way to the choice of RC4; the problem is in how it is used. Namely, this design re-uses the same pad whenever the document is modified - a 'multi-time pad' rather than OTP. For example, suppose that the document is changed by adding one letter x, say in position i. Clearly, it is possible to find i, given only the two ciphertexts. Furthermore, let c[i] be the original ciphertext in position i, and c'[i] be the ciphertext in position i after the insertion of x; and let y be the i-th plaintext character before the insertion, which then moves to be the (i + 1)-th character. Then c[i] = y ⊕ Pad[i] and c'[i] = x ⊕ Pad[i]. By XORing the two equations, we have: c[i] ⊕ c'[i] = (y ⊕ Pad[i]) ⊕ (x ⊕ Pad[i]) = y ⊕ x. Hence, knowing x gives y and vice versa; and if neither is known, the attacker still learns y ⊕ x, which gives some information about the plaintext. If these exposures do not yet convince the reader of the insecurity of this design, see the details of a complete, practical plaintext-recovery attack in [278].

The fact that the vulnerability is due to the use of RC4 and not to cryptanalysis of RC4 is very typical of vulnerabilities in systems involving cryptography. In fact, cryptanalysis is rarely the cause of vulnerabilities - system, configuration and software vulnerabilities are more common.
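The pad-reuse leak of Example 2.5 takes only a few lines to demonstrate: XORing two ciphertexts produced with the same pad cancels the pad and leaves the XOR of the two plaintexts. The pad value and plaintexts below are made up for illustration.

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    pad = bytes.fromhex("8f3a1c5577d2e601b4")       # same pad reused for both versions
    c1 = xor(b"salary: 9", pad)                     # ciphertext of the original document
    c2 = xor(b"salary: 7", pad)                     # ciphertext after an edit
    leak = xor(c1, c2)                              # the pad cancels out ...
    assert leak == xor(b"salary: 9", b"salary: 7")  # ... leaving plaintext1 XOR plaintext2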
2.5.7 Random functions

One practical drawback of stream ciphers is the fact that they require state, to remember how many bits (or bytes) were already output. What happens if state is lost? Can we eliminate or reduce the use of state? It would be great to allow recovery from loss of state, or to avoid the need to preserve state when encryption is not used, e.g., between one message and the next. In the next section, we introduce another pseudo-random cryptographic mechanism, called a pseudorandom function (PRF), which has many applications in cryptography - including stateless, randomized shared-key cryptosystems. However, before we introduce pseudo-random functions, let us first discuss 'real' random functions.

Selecting a random function. How can we select a random function from a domain D to a range R? One way is as follows: for each input x ∈ D, select a random element in R to be f(x), namely:

(∀x ∈ D)  f(x) ←$ R    (2.27)

This process can be done manually for a small domain and range, by randomly choosing the mapping and writing it in a table, e.g., as in Table 2.2 and Exercise 2.10, where we focus on the typical case where both D and R are sets of binary strings of specific length, i.e., for some integers n, m, we have D = {0,1}^n, R = {0,1}^m.

Table 2.2: Do-it-yourself table for selecting f_1, f_2 randomly, in Exercise 2.10.

Function | Domain  | Range   | 00 | 01 | 10 | 11 | coin-flips
f_1      | {0,1}^2 | {0,1}   |    |    |    |    |
f_2      | {0,1}^2 | {0,1}^3 |    |    |    |    |

Exercise 2.10. Using a coin, select randomly the functions below; count your coin flips.
1. f_1: {0,1}^2 → {0,1} (use a copy of Table 2.2)
2. f_2: {0,1}^2 → {0,1}^3 (use a copy of Table 2.2)
3. f_3: {0,1}^3 → {0,1}^2 (create your own table)
How many coin flips were required for each function? For each of the functions, what is the probability that all its output bits are zero? And of all output bits being 1?

The exercise is limited to very small values (n = 2 and m ∈ {1, 3}), since the number of coin-flips required is 2^n · m, i.e., it grows exponentially as a function of n (and linearly in m). The total number of functions grows even more rapidly; let F_{D,R} denote the set of all functions from a finite domain D to a finite range R (i.e., F_{D,R} ≡ {D → R}). A function f ∈ F_{D,R} maps each element in D to an element in R; hence, F_{D,R} is a finite set. More specifically, each element in D may be mapped to any element in R. The total number of functions is, therefore, |R|^{|D|}, and each specific function is selected with probability |R|^{-|D|}. In the typical case where D is the set of n-bit strings and R is the set of m-bit strings, there are 2^n elements in D and 2^m elements in R, i.e., |D| = 2^n and |R| = 2^m. The total number of functions from D = {0,1}^n to R = {0,1}^m is, therefore, |F_{D,R}| = (2^m)^{2^n} = 2^{m·2^n}, i.e., superexponential in n.

We see that selecting and storing a function from a large domain (and to a large range) would be difficult, as would be sending it - which is required if we want multiple parties to use the same random function. This is a pity; a shared random function can be very useful for cryptographic applications, e.g., it can be used to select random keys, or to implement a stateless stream cipher. This motivates the use of pseudorandom functions, which we discuss in the next subsection. However, let us first discuss such applications of random functions, as well as the relevant security and performance considerations.
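As a small illustration of this counting argument, here is a minimal Python sketch (function names are ours) that selects a random function from {0,1}^n to {0,1}^m by filling in the whole table, and reports how many random bits ('coin flips') that takes:

```python
import secrets
from itertools import product

def random_function(n: int, m: int):
    """Select a random function f: {0,1}^n -> {0,1}^m by choosing,
    independently and uniformly, an m-bit output for every n-bit input."""
    table, coin_flips = {}, 0
    for bits in product("01", repeat=n):                  # all 2^n inputs
        x = "".join(bits)
        table[x] = format(secrets.randbits(m), f"0{m}b")  # m random bits per input
        coin_flips += m
    return table, coin_flips

f2, flips = random_function(2, 3)   # the f_2 of Exercise 2.10
print(f2)      # e.g. {'00': '101', '01': '000', '10': '110', '11': '011'}
print(flips)   # 2^2 * 3 = 12 coin flips; for, say, n = 80 this is infeasible
```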
Stream cipher using a random function. Figure 2.15a presents the design of a stream cipher using a randomly-chosen function f which is shared by the two parties and kept secret. The design could be used for bit-by-bit encryption, with the random function mapping each input i to a single bit f(i), which is then XORed with the corresponding message bit m[i] to form the ciphertext c[i] = m[i] ⊕ f(i). Alternatively, both the input messages m[i] and the output of the random function could be strings of some length, e.g., n; then each invocation of the random function produces n pad bits, XORed with the n message bits to produce n ciphertext bits.

[Figure 2.15: Bit-wise encryption using a random function f(·) ∈ {0,1}, stateful (a) and randomized (b). (a) A stream cipher for stateful encryption, using limited storage (a counter i); communication-optimal, i.e., |ciphertext| = |plaintext|: c[i] = m[i] ⊕ f(i). (b) Stateless, randomized encryption, with high communication overhead of n (random) bits per plaintext bit: c[i] = (m[i] ⊕ f(r_i), r_i), where r_i ←$ {0,1}^n.]

One drawback of the use of stream ciphers is the need to maintain synchronized state between sender and recipient. This refers to the (typical) case where the input is broken into multiple messages, each provided in a separate call to the encryption device. To encrypt all of these messages using a stream cipher - OTP or the design in Figure 2.15a - the two parties must maintain the number of calls i (bits or strings of fixed length). To avoid this requirement, we can use randomized encryption, as we next explain.

Stateless, randomized encryption using a random function f. An even more interesting application of random functions is to avoid the need for the two parties to maintain state (the message/bit counter i). To do this, we use the random function to construct randomized encryption, as shown in Figure 2.15b. Here, again, we use a random function f which outputs a single bit. To encrypt each plaintext bit m[i], we choose a string r_i of n random bits, i.e., r_i ←$ {0,1}^n. The ciphertext c[i] corresponding to plaintext bit m[i] is the pair (m[i] ⊕ f(r_i), r_i).

Security. As with every cryptographic mechanism, we ask: are the designs in Figure 2.15a and Figure 2.15b secure? In this case, we assume that the two parties share a random function f which is unknown to the adversary; essentially, the function is considered a shared secret 'key'. Intuitively, the design of Figure 2.15a is secure as long as we never re-use the same counter value. Similarly, the design of Figure 2.15b is secure as long as we use a sufficient number of random bits. Both statements are correct; but it isn't trivial to understand why. Let us focus on the slightly more complex case of randomized encryption (the design of Figure 2.15b); the argument for the counter-based, stateful stream cipher design (Figure 2.15a) follows similarly.

An obvious concern is that an attacker may try to predict the value of f(r_i) used to encrypt a message (or bit) m[i] from previously-observed ciphertexts {c[j]}_{j<i}. Let us assume, to be safe, that the attacker knows all the corresponding plaintexts m[j], allowing the attacker to find all the corresponding mappings {f(r_j)}_{j<i}.
Using this information, can the attacker guess f(r_i)? It is possible that r_i is the same as one of the previously-used random values, i.e., r_i = r_j for some j < i. In this case, the attacker has already received c[j]; and since we assumed that the attacker knows m[j], it follows that the attacker can expose m[i], by computing:

m[i] = c[i] ⊕ f(r_i) = c[i] ⊕ f(r_j) = c[i] ⊕ (m[j] ⊕ c[j])    (2.28)

Therefore, this case should be avoided, which is easy to do by selecting sufficiently long random strings r_i, making it very unlikely that we will repeat the same value.

Consider, therefore, the case where r_i ∉ {r_j}_{j<i}. Reconsider the process of selecting a random function, as in Exercise 2.10. What we did was to select the entire table - mapping every element in the domain to a random element in the range - before we applied the random function. However, notice that it does not matter if, instead, we choose the mapping for each element r_i ∈ D only the first time we need to compute f(r_i). Think it over! This means that if r_i ∉ {r_j}_{j<i}, then the attacker does not learn anything about f(r_i), even if the attacker is given all of the 'previous' values {f(r_j)}_{j<i}. Until we select (randomly) the value of f(r_i), the attacker cannot know anything about it.

Therefore, the only concern we have is with the case that r_i ∈ {r_j}_{j<i}. Let us return to this issue; what is the probability of that happening? Well, since each r_i is selected randomly, it is simply (i−1)/|D|. Focusing on the typical case where the input domain is {0,1}^n, this is (i−1)/2^n. Therefore, if n is 'sufficiently large', then the maximal number of observations by the attacker would still be negligible compared to 2^n - and i/2^n would be negligible. For example, if the attacker can observe a million encryptions, we 'just' need 2^n to be way larger than one million; and considering that a million is less than 2^20, using any n significantly larger than 20 seems safe enough.

Efficiency. So the scheme is secure - provided n is 'sufficiently large', e.g., 80 or more. However, is it also efficient? To implement the scheme, we need to compute the random function f; since we want a recipient to decipher our messages, we need to compute and send all of f before we begin sending ciphertexts. However, this requires us to flip and (securely) share 2^n bits - for n = 80 (or more), that is infeasible. Fortunately, there is an alternative, efficient solution: use a pseudorandom function (PRF) instead of the random function, providing an efficient solution which is still secure, albeit only against computationally-limited adversaries.

Note that there is another efficiency concern with the scheme: is it really necessary to send a new random string for each bit? Of course not. We can address this concern in two ways:

Large range R = {0,1}^l: this allows us to use the same random string r or counter i to encrypt a block of l plaintext bits, by bitwise XOR of the l message bits with the corresponding l-bit output of f(r) (or f(i)). In this way, the n bits of r allow encryption of l bits of plaintext. See Figure 2.16.

Use f(r) as the seed of a PRG: if we use a sufficiently large range, a PRG could 'expand' f(r) into as many bits as required, to bit-wise XOR with the plaintext: E_f(m) = (r, PRG(f(r)) ⊕ m).
In this way, the n bits of r allow encryption of an arbitrarily long plaintext m - requiring n new random bits only to encrypt new plaintext, and only if the state (of the PRG) was not retained. This is essentially what is done by the Output Feedback (OFB) mode of operation, which we see later on, except that the OFB mode also implements the PRG using the PRF, instead of using two separate functions (a PRF and a PRG).

[Figure 2.16: Block (n-bit) encryption using a random function f(·), using only one function application per n plaintext bits. (a) Stateful block encryption: c_i = m_i ⊕ f(i). (b) Stateless, randomized block encryption: c_i = (m_i ⊕ f(r_i), r_i), where r_i ←$ {0,1}^n.]

2.5.8 Pseudorandom functions (PRFs)

A pseudorandom function (PRF) is an efficient substitute for a random function, which ensures similar properties while requiring the generation and sharing of only a short key. The main limitation is that PRFs are secure only against computationally bounded adversaries. A PRF scheme has two inputs: a secret key k and a 'message' m; we denote it as PRF_k(m). Once k is fixed, the PRF becomes a function of the message only. The basic property of a PRF is that this function (PRF_k(·)) is indistinguishable from a truly random function. Intuitively, this means that a PPT adversary cannot tell if it is interacting with PRF_k(·) with domain D and range R, or with a random function f from D to R. Hence, PRFs can be used in many applications, providing an efficient, easily-deployable alternative to the impractical truly random functions.

For example, PRFs can be used to construct shared-key cryptosystems, as illustrated in Figure 2.17. The figure presents two designs of a cryptosystem from a PRF: stateful encryption, as a stream cipher, in Figure 2.17a, and stateless randomized encryption, in Figure 2.17b.

[Figure 2.17: Block (n-bit) encryption using a pseudorandom function (PRF) f_k(·), using only one PRF application per n plaintext bits. (a) Stateful block encryption: c_i = m_i ⊕ f_k(i). (b) Stateless, randomized block encryption: c_i = (m_i ⊕ f_k(r_i), r_i), where r_i ←$ {0,1}^n.]

Both designs simply use a PRF instead of the random function used in the corresponding designs in Figure 2.15. The security of the PRF-based designs follows from the security of the corresponding random-function-based designs - and from the indistinguishability of a PRF and a random function. Indeed, this is one case of a very useful technique, which we refer to as the random function design principle.

Principle 6 (Random function design method). Design cryptographic protocols and mechanisms using a random function, to make the security analysis easier. Once secure, implement using a pseudorandom function; security would follow since a PRF is indistinguishable from a random function.

Note that the random function design method requires the parties to share the secret and random key of the PRF. In Section 3.6 we present the Random Oracle Model (ROM), which allows a similar approach to be applied when such a shared secret key is not available.
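A minimal sketch of the stateless, randomized design of Figure 2.17b, using HMAC-SHA256 as a stand-in PRF (the construction works with any PRF; the function names and the choice of HMAC are ours, for illustration only):

```python
import hmac, hashlib, secrets

def prf(key: bytes, msg: bytes) -> bytes:
    """PRF stand-in: HMAC-SHA256, truncated to a 16-byte block."""
    return hmac.new(key, msg, hashlib.sha256).digest()[:16]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt_block(key: bytes, m: bytes):
    """Stateless randomized encryption of one 16-byte block (Fig. 2.17b)."""
    r = secrets.token_bytes(16)          # fresh randomness r_i
    return xor(m, prf(key, r)), r        # c_i = (m_i XOR f_k(r_i), r_i)

def decrypt_block(key: bytes, c):
    masked, r = c
    return xor(masked, prf(key, r))      # recover m_i using the shared key and r_i

k = secrets.token_bytes(16)
m = b"attack at dawn!!"                  # exactly 16 bytes
assert decrypt_block(k, encrypt_block(k, m)) == m
```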
We now finally define a secure pseudorandom function (PRF); this definition reuses the oracle notation and other concepts introduced in Section A.1. In this definition, the adversary A has oracle access to one of two functions: a random function from domain D to range R, i.e., f ←$ {D → R}, or the PRF keyed with a random n-bit key, F_k, with k being a random n-bit string, i.e., k ←$ {0,1}^n. We denote these two cases by A^f and A^{F_k}, respectively. The adversary should try to distinguish between these two cases, e.g., by outputting 0 (or 'false') if given oracle access to the random function f, and outputting 1 (or 'true') if given access to the PRF F_k. The idea of the definition is illustrated in Fig. 2.18.

[Figure 2.18: The pseudorandom function (PRF) indistinguishability test. A function F_k(x): {0,1}* × D → R is a (secure) pseudorandom function (PRF) if no efficient distinguisher can distinguish between F_k(·) and a random function f from the same domain D to the same range R, when the key k is a randomly-chosen, sufficiently-long binary string.]

We define a pseudorandom function (PRF) as a function F_k(x): {0,1}* × D → R. The domain(13) consists of the key, which we assume to be an (arbitrarily long) binary string, i.e., from the set {0,1}*, and of an input from an arbitrary set D. The scheme must allow arbitrary length for the key, since security requirements - in this case, indistinguishability - are defined asymptotically, i.e., for sufficiently long keys; see Chapter 1.

Definition 2.6. A pseudorandom function (PRF) is a polynomial-time computable function F_k(x): {0,1}* × D → R s.t. for all PPT algorithms A, ε^{PRF}_{A,F}(n) ∈ NEGL, i.e., is negligible, where the advantage ε^{PRF}_{A,F}(n) of the PRF F against adversary A is defined as:

ε^{PRF}_{A,F}(n) ≡ | Pr_{k←$ {0,1}^n}[ A^{F_k}(1^n) ] − Pr_{f←$ {D→R}}[ A^f(1^n) ] |    (2.29)

The probabilities are taken over the random coin tosses of A, and the random choices of the key k ←$ {0,1}^n and of the function f ←$ {D → R}.

(13) The notation {0,1}* × D simply means a pair: a key from {0,1}* and an element from the set D.

Overview of the PRF indistinguishability test. The basic idea of this definition is the use of an indistinguishability test, much like in the definition of a secure PRG (Definition 2.5), and even the Turing indistinguishability test (Figure 2.12). Namely, a PRF F_k is secure if no PPT algorithm A has a significant advantage in identifying the pseudorandom function. We define the advantage as the probability that A outputs 1 ('true', i.e., pseudorandom) when given oracle access to the pseudorandom function F_k, minus the probability that A outputs 1 ('true', i.e., pseudorandom) when given oracle access to the random function f, where both functions are over the same domain D and range R. 'Significant' here means at least some positive polynomial in n, the length of the key k, often referred to as the security parameter.

The oracle notation. Both A^{F_k} and A^f use the oracle notation introduced in Definition 1.3. Namely, they mean that A is given 'oracle' access to the respective function (F_k(·) or f(·)). Oracle access means that the adversary can give any input x and get back that function applied to x, i.e., F_k(x) or f(x), respectively.

Why allow arbitrary key length (security parameter)?
The definition allows arbitrarily-long keys, although in practice, cryptographic standards often have a fixed key length, or only a few options. The reason that the definition allows arbitrary length is that it requires the success probability to be negligible - smaller than any polynomial in the key length - which is meaningless if the key length is bounded.

Why is 1^n given as input to the adversary? A subtle, yet important, aspect of the definition is the fact that in the two calls to the adversary A, we provide the adversary with the value 1^n as input, where n is the key length (security parameter). The value 1^n simply signifies a string of n consecutive bits whose value is 1, i.e., it is the value of n encoded in unary. But why provide 1^n as input? It makes sense that the adversary should be informed of the key length n, but why use unary encoding? Why not provide n using the 'standard' binary encoding? To understand the reason, first recall that we focus on efficient (PPT) algorithms; namely, the running time of both the pseudorandom function F and the adversary A is bounded by a polynomial in the size of their inputs. The inputs to the PRF include the key, and hence consist of at least n bits; hence, the running time of the PRF is (at least) polynomial in n. It is therefore 'only fair' that the running time of the adversary A is also allowed to be polynomial in n; to ensure this, we provide 1^n to it as input, similarly to our provision of the security parameter 1^l to other algorithms; see Section A.1.

Examples of secure and insecure PRFs. To clarify Definition 2.6, and to demonstrate how to show whether a given function is a PRF or not, we give and solve two exercises. The first exercise shows insecure PRF designs; the following exercise proves that a given construction is a secure PRF.

Exercise 2.11. Examples of insecure PRF constructions:
1. Show that F_k(m) = k ⊕ m is not a secure PRF.
2. Let p be an n-bit prime number, and F_k(m) ≡ k · (m + k) mod p. Show that F is not a secure PRF (over the domain {0, ..., p−1}).
3. Show that F_k(m) ≡ k ∨ (m^k + k^{m+k} mod 2^n) is not a secure PRF.
4. Assume that secure PRF functions exist. Given a function F_k(m), define F̂_k(m) ≡ F_m(k), i.e., F̂ switches between the key and the input of F. Show that even if we know that F is secure, it is possible that F̂ is not a secure PRF.

Solutions:
1. The adversary A^g is given an oracle to a function g, and needs to output 'True' if g(·) = F_k(·) for a random key k, and 'False' if g(·) is a random function. A simple way to do this is for A^g first to make a query for g(0^n); if g(·) = F_k(·), then A receives back g(0^n) = F_k(0^n) = k ⊕ 0^n = k. So in this case (g(·) = F_k(·)), A 'knows' k; it can check whether indeed g(·) = F_k(·) (or g is a random function) by querying any other input m ≠ 0^n. If g(m) = k ⊕ m, then (with very high probability) the function is indeed g(·) = F_k(·), i.e., not a random function, and A returns 'True'; if g(m) ≠ k ⊕ m, then g is definitely not F_k(·), which means, in this case, that it must be a random function, and A returns 'False'.
2. Similarly to the previous item, the adversary first gives the oracle the input 0. If the oracle is to F_k(m) ≡ k · (m + k) mod p, then the adversary receives k^2 mod p. Now, every number whose value is k^2 mod p for some integer k is called a quadratic residue.
We discuss quadratic residues in subsection 6.1.8, in particular explaining that there is an efficient algorithm to determine whether a number is a quadratic residue or not (Claim 6.1). The adversary may use this test, and if the oracle returns a quadratic residue, the adversary assumes that the oracle is to F_k(m) ≡ k · (m + k) mod p and returns 'True', since, with high probability, a random function would not return a quadratic residue.

Let us see another solution, which does not require testing the output for being a quadratic residue. Observe that m + k is even if and only if either both m and k are even, or both m and k are odd. This motivates the following distinguishing adversary A^g, which outputs 'True' with high probability if g(·) = F_k(·), and with lower probability if g is a random function. The adversary asks the oracle to compute both g(0) and g(2). If the two results, g(0) and g(2), have the same parity (both even, or both odd), then the adversary concludes that g is more likely to be pseudo-random, and outputs 'True'. Otherwise, if one of {g(0), g(2)} is even and the other is odd, then g cannot be F_k(m) ≡ k · (m + k) mod p, and hence must be a random function, and A outputs 'False'. Notice that when this adversary outputs 'True', it would be wrong in the cases where the oracle function g was selected at random, but happens to return values of the same parity for both inputs (zero and 2). However, since g was selected at random, the values g(0) and g(2) were also selected at random, and the probability that they have the same parity is only 1/2. Hence, while the adversary could be wrong, it still has a significant (non-negligible) advantage.

3. It seems quite hard to predict much about a specific value of F_k(m) for a random key k. However, since F includes an 'or' operation with k, any bit which is set in k is always set in the output F_k(m) (for every m). Since k almost always has some non-zero bits, the adversary can detect these bits by computing the 'and' of multiple outputs of the oracle (for different inputs). Notice that if the oracle is to a random function, then the probability of any given bit being set in the outputs for l different inputs is only 2^{-l}. This allows the adversary to efficiently distinguish between being given an oracle to F_k(m) ≡ k ∨ (m^k + k^{m+k} mod 2^n) and being given an oracle to a random function.

4. Let F′ be a secure PRF; define:

F_k(m) ≡ { F′_k(m)  if k ≠ 0^n;   0^n  otherwise (k = 0^n) }    (2.30)

As long as the key chosen is not the special all-zeros key, F is the same as F′; hence, F is also a PRF, since keys are selected randomly and k = 0^n is selected only with probability 2^{-n}. It remains to show that with this F function, F̂ is not a secure PRF. Indeed, an adversary given an oracle to either F̂ or a random function can simply give the input 0^n. If the oracle is to F̂, then this returns F̂_k(0^n) = F_{0^n}(k). However, by definition (Equation 2.30), F̂_k(0^n) = F_{0^n}(k) = 0^n. Hence, an adversary can easily distinguish between F̂_k(·) and a random function.

Exercise 2.12 (Proving security (of PRF) by reduction). Let B_n ≡ {0,1}^n denote the set of n-bit binary strings. Assume that F_k(m): B_n × B_n → B_n is a secure PRF.
Prove that F̂_k(m) is also a secure PRF, where:

F̂_k(m) ≡ F_k(m ⊕ (0^{n−1} ++ 1))    (2.31)

Solution: The solution uses the main technique for proving the security of a construction in cryptography, called proof by reduction, which works as follows. We assume, to the contrary, that the construction is insecure. In this exercise, this means that we assume that F̂ is not a PRF. Namely, from Definition 2.6, there exists an efficient adversary (PPT algorithm) Â^f̂, with an oracle denoted f̂, such that Â^f̂ has a significant (non-negligible) probability of distinguishing between the following two cases:

Case 1: oracle f̂ is a random function. The oracle f̂ is selected at random from the set of functions from n bits to n bits, i.e., f̂ ←$ {f: B_n → B_n}.

Case 2: oracle f̂ is the PRF with a random key. The oracle f̂ is to the function F̂_k(·), where k is a random n-bit key. Namely, f̂ ← F̂_k(·), where k ←$ B_n.

In other words, we assume, to the contrary, that the PPT adversary Â has a significant advantage in the PRF indistinguishability test (Figure 2.18) against the constructed PRF F̂, i.e.:

ε^{PRF}_{Â,F̂}(n) ≡ | Pr_{k←$ B_n}[ Â^{F̂_k(·)}(1^n) ] − Pr_{f̂←$ {B_n→B_n}}[ Â^{f̂(·)}(1^n) ] | ∉ NEGL(n)    (2.32)

In the left-hand expression the oracle is F̂_k(·), for a random key k, while in the right-hand expression the oracle f̂(·) is a random function from n bits to n bits. We then present another PPT adversary A^f which we show, under this assumption, to have a significant advantage in the PRF indistinguishability test (Figure 2.18) against F, i.e.:

ε^{PRF}_{A,F}(n) ≡ | Pr_{k←$ B_n}[ A^{F_k(·)}(1^n) ] − Pr_{f←$ {B_n→B_n}}[ A^{f(·)}(1^n) ] | ∉ NEGL(n)    (2.33)

Namely, we show that if F̂ is not a secure PRF (Equation 2.32), then F cannot be a secure PRF (Equation 2.33), which is a contradiction, since we were given that F is a secure PRF.

Given Â^f̂, we define adversary A^f as follows:

A^{f(·)}(1^n) ≡ Return Â^{f̂(·)}(1^n), where f̂(m) ≡ f(m ⊕ (0^{n−1} ++ 1))    (2.34)

Namely, we implement the adversary A^f against F using the adversary Â^f̂ against F̂, where we compute the value of f̂ on a given input m using the oracle we have for f, i.e., by computing f̂(m) = f(m ⊕ (0^{n−1} ++ 1)).

To complete the proof, we show that ε^{PRF}_{Â,F̂}(n) = ε^{PRF}_{A,F}(n), or, more specifically, that the following two equations hold:

Pr_{k←$ B_n}[ A^{F_k}(1^n) ] = Pr_{k←$ B_n}[ Â^{F̂_k}(1^n) ]    (2.35)

Pr_{f←$ {B_n→B_n}}[ A^f(1^n) ] = Pr_{f̂←$ {B_n→B_n}}[ Â^{f̂}(1^n) ]    (2.36)

Equation 2.35 follows by substituting Equation 2.34 and then the definition of F̂_k (Equation 2.31). To prove Equation 2.36, let us first show that the following holds:

Pr_{f←$ {B_n→B_n}}[ A^f(1^n) ] = Pr_{f←$ {B_n→B_n}}[ Â^{f̂}(1^n) | f̂(m) ≡ f(m ⊕ (0^{n−1} ++ 1)) ]    (2.37)

Now, if f̂(m) ≡ f(m ⊕ (0^{n−1} ++ 1)), then f(m) ≡ f̂(m ⊕ (0^{n−1} ++ 1)); namely, the mapping between f and f̂ is one-to-one, so the probability of picking f uniformly is the same as the probability of picking f̂ - in fact, both are equal to 2^{-n·2^n}. Hence:

Pr_{f←$ {B_n→B_n}}[ Â^{f̂}(1^n) | f̂(m) ≡ f(m ⊕ (0^{n−1} ++ 1)) ] = Pr_{f̂←$ {B_n→B_n}}[ Â^{f̂}(1^n) ]

This completes the proof.

Additional PRF applications. PRFs have many additional applications, including:

Message authentication. In Chapter 4, we show that a PRF may be used for message authentication.

Pseudo-random permutation or block cipher.
We discuss the use of a PRF to construct a pseudo-random permutation in the following subsection; later, in Section 2.6, we show how to extend this to construct a block cipher.

Derive independently-random keys/values. In many scenarios, two parties share only one key k, but need to use multiple shared keys which are 'independently random'. This is easily achieved using a PRF f: if g_1, g_2, ... are distinct identifiers, one for each required value, then we can derive the keys as k_1 = f_k(g_1), k_2 = f_k(g_2), and so on. As a concrete example, to derive a separate key for each day d from the same k, we can use k_d = f_k(d); exposure of k_2 and k_4 will not expose any other key, e.g., k_3. We elaborate on this in subsection 2.5.10.

2.5.9 PRF: Constructions and Robust Combiners

The concept of PRFs was proposed in a seminal paper by Goldreich, Goldwasser and Micali [168]; the paper also presents a provably-secure construction of a PRF, given a PRG. That is, if there is a successful attack on the constructed PRF, this attack can be used as a 'subroutine' to construct a successful attack on the underlying PRG. However, the construction of [168] is inefficient: it requires many applications of the PRG for a single application of the PRF. Therefore, this construction is not applied in practice. Instead, practical systems mostly implement PRFs using standard block ciphers. We model block ciphers as invertible Pseudo-Random Permutations (PRPs) and discuss them in Section 2.6. In fact, PRGs are also often implemented from a block cipher. However, for now, let us show a simple construction of a PRG from a PRF.

Exercise 2.13. Let F be a PRF over {0,1}^n, and let k, r ∈ {0,1}^n. Prove that f(k) = (F_k(1) ++ F_k(2)) is a PRG.

Another option is to construct candidate pseudorandom functions directly, without assuming and using any other 'secure' cryptographic function, basing their security on the failure to 'break' them using known techniques and efforts by expert cryptanalysts. In fact, pseudorandom functions are among the cryptographic functions that seem good candidates for such 'ad-hoc' constructions; it is relatively easy to come up with a reasonable candidate PRF which is not trivial to attack.

Finally, it is not difficult to combine two candidate PRFs F′, F′′, over the same domain and range, into a combined PRF F which is secure as long as either F′ or F′′ is a secure PRF. We refer to such a construction as a robust combiner. Constructions of robust combiners are known for many cryptographic primitives. The following lemma, from [191], presents a trivial yet efficient robust combiner for PRFs.

Lemma 2.1 (Robust combiner for PRFs). Let F′, F′′: {0,1}* × D → R be two polynomial-time computable functions, and let:

F_{(k′,k′′)}(x) ≡ F′_{k′}(x) ⊕ F′′_{k′′}(x)    (2.38)

If either F′ or F′′ is a PRF, then F is a PRF. Namely, this construction is a robust combiner for PRFs.

Proof: see [191].

2.5.10 The key separation principle

In the PRF robust combiner (Eq. 2.38), we used separate keys for the two candidate-PRF functions F′, F′′. In fact, this is necessary, as the following exercise shows.

Exercise 2.14 (Independent keys are required for PRF robust combiners).
Let F′, F′′: {0,1}* × D → {0,1}* be two polynomial-time computable functions, and let F_k(x) = F′_k(x) ⊕ F′′_k(x). Demonstrate that the fact that one of F′, F′′ is a PRF may not suffice to ensure that F is a PRF.

Solution: Suppose F′ = F′′. Then for every k, x we have: F_k(x) = F′_k(x) ⊕ F′′_k(x) = F′_k(x) ⊕ F′_k(x) = 0^{|F′_k(x)|}. Namely, for any input x and any key k, the output of F_k(x) is an all-zeros string (F_k(x) ∈ 0*). Hence F is clearly not a PRF.

This is an example of the general key separation principle, which we present below. In fact, the study of robust combiners often helps to better understand the properties of cryptographic schemes and to learn how to write cryptographic proofs.

Principle 7 (Key separation). Use separate, independently-pseudorandom keys for each different cryptographic scheme, as well as for different types/sources of plaintext and different periods.

The principle combines three main motivations for the use of separate, independently-pseudorandom keys:

Per-goal keys: use separate keys for different cryptographic schemes. A system may use multiple different cryptographic functions or schemes, often for different goals, e.g., encryption vs. authentication. In this case, security may fail if the same or related keys are used for multiple different functions. Exercise 2.14 above is an example.

Limit information for cryptanalysis. By using separate, independently-pseudorandom keys, we reduce the amount of information available to the attacker (ciphertext, for example).

Limit the impact of key exposure. Namely, by using separate keys, we ensure that exposure of some of the keys will not jeopardize the secrecy of communication encrypted with the other keys.

One important application of pseudorandom functions (PRFs) is the derivation of multiple separate keys from a single shared secret key k. Namely, a PRF, say f, is handy whenever two parties share one secret key k and need to derive multiple separate, independently pseudorandom keys k_1, k_2, ... from k. A common way to achieve this is for the two parties to use some set of identifiers γ_1, γ_2, ..., a distinct identifier for each derived key, and compute each key k_i as k_i = f_k(γ_i). As another example, system designers often want to limit the impact of key exposure due to cryptanalysis or to system attacks. One way to reduce the damage from key exposures is to change the keys periodically, e.g., use key k_d for day number d:

Example 2.6 (Using a PRF for independent per-period keys). Assume that Alice and Bob share one master key k_M. They may derive a shared secret key for day d as k_d = PRF_{k_M}(d). Even if all the daily keys are exposed, except the key for one day d̂, the key for day d̂ remains secure as long as k_M is kept secret.

2.6 Block Ciphers and PRPs

Modern symmetric encryption schemes are built in a modular fashion, using a basic building block - the block cipher. A block cipher is defined by a pair of keyed functions, E_k, D_k, such that the domain and the range of both E_k and D_k are {0,1}^n, i.e., binary strings of fixed length n; for simplicity, we (mostly) use n for the length of both keys and blocks, as well as for the security parameter, although in some ciphers these are different numbers. Block ciphers should satisfy the correctness requirement: m = D_k(E_k(m)) for every k, m ∈ {0,1}^n. Notice that the correctness requirement should hold always, i.e., not only with high probability.
This is in contrast to security requirements, which are typically defined to hold only against efficient (PPT) adversaries - and allow a negligible failure probability, i.e., the adversary may win with negligible probability. But there is no reason to allow any probability of incorrect decryption.

[Figure 2.19: High-level view of the NIST standard block ciphers: AES (current) and DES (obsolete).]

Block ciphers may be the most important basic cryptographic building blocks. Block ciphers are in wide use in many practical systems and constructions, and two of them were standardized by NIST: the Data Encryption Standard (DES) (1977-2001) [296], the first standardized cryptographic scheme, and its successor, the Advanced Encryption Standard (AES) (2002-present) [108] (Fig. 2.19). DES was replaced since it was no longer considered secure; the main reason was simply that improvements in hardware made exhaustive search feasible, due to the relatively short, 56-bit key. Another reason for reduced confidence in DES - even in longer-key variants - was the presentation of differential cryptanalysis and linear cryptanalysis, two strong cryptanalytic attacks which are quite generic, namely, effective against many cryptographic designs - including DES. Indeed, AES was designed for resiliency against these and other known attacks, and so far, no published attack against AES appears to justify concerns. We present a simplified explanation and example of differential cryptanalysis below, and encourage interested readers to follow up in the extensive literature on cryptanalysis in general and these attacks in particular, e.g., [131, 220, 236]; [236] also gives an excellent overview of block ciphers.

While the definition of correctness for block ciphers (above) is simple and widely accepted, there is not yet universal agreement on the security requirements. We adopt the common approach, which requires block ciphers to be a pair of Pseudo-Random Permutations (PRPs). In the following subsection, we define pseudo-random permutations; then, in subsection 2.6.2, we discuss the security of block ciphers.

2.6.1 Random and Pseudo-Random Permutations

After discussing random functions and PRFs, we now introduce two related concepts: a random permutation and a pseudo-random permutation (PRP).

Random permutations. A permutation is a function π: D → D mapping a domain D onto itself, where every element is mapped to a distinct element, namely:

(π: D → D) is a permutation  ⟺  (∀x ≠ x′ ∈ D) (π(x) ≠ π(x′))    (2.39)

Note that a permutation may map an element onto itself, i.e., π(x) = x is perfectly legitimate. We use Perm(D) to denote the set of all permutations over domain D.

Selection of a random permutation over D, i.e., selecting ρ ←$ Perm(D), is similar to the selection of a random function (Equation 2.27) - except for the need to avoid collisions. A collision is a pair of elements (x, x′), both mapped to the same element: y = ρ(x) = ρ(x′). One natural way to think about this selection is as being done incrementally, mapping one input at a time. Let D′ ⊆ D be the set of elements we have not yet mapped, and R ⊆ D be the set of elements to which we have not yet mapped any element; initially, R = D′ = D. Given any 'new' element x ∈ D′, select ρ(x) ←$ R, and then remove x from D′ and ρ(x) from R.
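A minimal Python sketch of this incremental selection process (our own illustration; each new input is mapped to a uniformly random, still-unused output, so no collisions can occur):

```python
import secrets

class LazyRandomPermutation:
    """Select a random permutation over a finite domain incrementally:
    each 'new' input x is mapped to a random element of the unused range R."""
    def __init__(self, domain):
        self.table = {}                 # mappings chosen so far
        self.unused = list(domain)      # the set R of not-yet-used outputs

    def apply(self, x):
        if x not in self.table:                         # first time we need rho(x)
            i = secrets.randbelow(len(self.unused))
            self.table[x] = self.unused.pop(i)          # remove rho(x) from R
        return self.table[x]

domain = [format(v, "02b") for v in range(4)]           # D = {0,1}^2
rho = LazyRandomPermutation(domain)
print([rho.apply(x) for x in domain])   # a random reordering of ['00','01','10','11']
```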
Using this process, for a small domain, e.g., D = {0,1}^n for small n, the selection of a random permutation ρ is easy and can be done manually - similarly to the process for selecting a random function (over a small domain and range). The process requires O(2^n) coin tosses, time and storage. For example, use Table 2.3 to select two random permutations over the domain D = {0,1}^2, and notice the number of coin-flips required.

Table 2.3: Do-it-yourself table for selecting random permutations ρ_1, ρ_2 over domain D = {0,1}^2.

Function | Domain  | 00 | 01 | 10 | 11 | coin-flips
ρ_1      | {0,1}^2 |    |    |    |    |
ρ_2      | {0,1}^2 |    |    |    |    |

Pseudo-Random Permutation (PRP). Similarly to a PRF, a Pseudo-Random Permutation (PRP) over domain D, denoted E_k(·), is an efficient algorithm which cannot be distinguished efficiently from a random permutation ρ ←$ Perm(D), provided that the key k is 'sufficiently long' and chosen randomly.

In the definition, the adversary A has oracle access to one of two functions: either E_k(·), keyed with a random n-bit key k, or a random permutation over domain D, i.e., ρ ←$ Perm(D). We denote these two cases by A^{E_k(·)} and A^{ρ(·)}, respectively. The adversary should try to distinguish between these two cases, e.g., by outputting the string 'Rand' if given access to the random permutation ρ(·), and outputting, say, 'not random', if given access to the PRP E_k(·). The idea of the definition is illustrated in Fig. 2.20. Note that the definition allows arbitrary length of the key (n), since indistinguishability is only defined asymptotically - for sufficiently long keys.

[Figure 2.20: The Pseudo-Random Permutation (PRP) indistinguishability test. A function E_k(x): {0,1}* × D → D is a (secure) pseudo-random permutation (PRP) if no efficient distinguisher can distinguish between E_k(·) and a random permutation ρ ←$ Perm(D) over domain D, when the key k is a randomly-chosen, sufficiently-long binary string.]

Definition 2.7. A pseudo-random permutation (PRP) is a polynomial-time computable function E_k(x): {0,1}* × D → D s.t. for all PPT algorithms A, ε^{PRP}_{A,E}(n) ∈ NEGL(n), i.e., is negligible, where the advantage ε^{PRP}_{A,E}(n) of the PRP E against adversary A is defined as:

ε^{PRP}_{A,E}(n) ≡ | Pr_{k←$ {0,1}^n}[ A^{E_k}(1^n) ] − Pr_{ρ←$ Perm(D)}[ A^ρ(1^n) ] |    (2.40)

The probabilities are taken over the random coin tosses of A, and the random choices of the key k ←$ {0,1}^n and of the permutation ρ ←$ Perm(D).

One natural - and important - question is the relation between a PRP over domain D, and a PRF whose domain and range are both D. Somewhat surprisingly, it turns out that a PRP over D is indistinguishable from a PRF over D. This important result is called the PRP/PRF Switching Lemma, and has multiple proofs; we recommend the proof in [363]. Note that the lemma provides a relation between the advantage functions; this is an example of concrete security.

Lemma 2.2 (The PRP/PRF Switching Lemma). Let E be a polynomial-time computable function E_k(x): {0,1}* × D → D, and let A be a PPT adversary. Then:

| ε^{PRP}_{A,E}(n) − ε^{PRF}_{A,E}(n) | < q^2 / (2·|D|)    (2.41)

where q is the maximal number of oracle queries performed by A in each run, and the advantage functions are as defined in Equation 2.40 and Equation 2.29.
In particular, if the size of the domain D is exponential in the security parameter n (the length of the key and of the input to A), e.g., D = {0,1}^n, then |ε^{PRP}_{A,E}(n) − ε^{PRF}_{A,E}(n)| ∈ NEGL(n). In this case, E is a PRP over D if and only if it is a PRF over D.

Proof idea: In a polynomial number of queries to a random function, there is only negligible probability of two inputs being mapped to the same value. Hence, it is infeasible to efficiently distinguish between a random function and a random permutation. The lemma follows since a PRF (PRP) is indistinguishable from a random function (resp., permutation).

The PRP/PRF switching lemma is somewhat counter-intuitive, since, for large D, there are many more functions than permutations. Focusing on D = {0,1}^n for convenience, there are (2^n)^{2^n} = 2^{n·2^n} functions over D, and 'only' 2^n!, i.e., the factorial(14) of 2^n, permutations.

Note that the loss of (concrete) security bounded by the switching lemma is a disadvantage of using a block cipher directly as a PRF - it would be an (asymptotically) secure PRF, but the advantage against the PRF definition would be larger than the advantage against the PRP definition. Therefore, we would prefer to use one of several constructions of a PRF from a block cipher/PRP. These constructions are efficient and simple, yet avoid this loss in security; see [39, 183].

(14) For every integer i, the factorial of i is denoted i! and defined as i! ≡ 1 · 2 · ... · (i−1) · i.

2.6.2 Block ciphers

A block cipher is one of the most important cryptographic mechanisms, with multiple applications and implementations, often used as a 'building block' to construct other mechanisms. The symbols (E, D) are often used for the functions of the block cipher, since one basic application of a block cipher is as an encryption scheme; intuitively, E is 'encryption' and D is 'decryption'. However, as we will see, a block cipher does not meet the (strong) definition of secure encryption.

Intuitively, a block cipher is an invertible PRP; this is the reason that we often use the letter E to denote a PRP. The definition, which follows, is an extension of Definition 2.7.

Definition 2.8. Let D be a finite domain. Given a permutation ρ: D → D over domain D, define the inverse of ρ, denoted ρ^{-1}: D → D, as the permutation over D such that (∀x ∈ D) x = ρ^{-1}(ρ(x)). A block cipher over domain D is a pair of keyed polynomial-time computable functions (E_k, D_k) over domain D, which satisfy:

Correctness: for every x ∈ D and every key k ∈ {0,1}*, it holds that x = D_k(E_k(x)).

Indistinguishability: the pair (E, D) is indistinguishable from the pair (ρ, ρ^{-1}), where ρ is a random permutation over domain D. Namely, for every PPT algorithm A, the invertible Pseudo-Random Permutation (iPRP) advantage of A against (E, D), denoted ε^{iPRP}_{A,(E,D)}(n), is a negligible function of n, where ε^{iPRP}_{A,(E,D)}(n) is defined as:

ε^{iPRP}_{A,(E,D)}(n) ≡ | Pr_{k←$ {0,1}^n}[ A^{E_k,D_k}(1^n) ] − Pr_{ρ←$ Perm(D)}[ A^{ρ,ρ^{-1}}(1^n) ] |    (2.42)

Note: A^{f,g}(1^n) is the oracle notation, denoting algorithm A running with input 1^n and oracles to functions f and g; see Definition 1.3.

Let us give an example.

Example 2.7. Let E_k(m) = m ⊕ k and E′_k(m) = m + k mod 2^n. Show the corresponding D, D′ functions such that both (E, D) and (E′, D′) satisfy the correctness requirement; and show that neither of them satisfies the security requirement, i.e., neither is a pair of invertible PRPs.
Solution: D_k(c) = c ⊕ k, D′_k(c) = c − k mod 2^n. Correctness follows from the arithmetic properties. Let us now show that (E, D) is insecure; specifically, let us show that E_k is not a PRP. Recall that we need to provide a PPT adversary A^{E_k(·)} s.t. ε^{PRP}_{A,E}(n) is not negligible. We present a simple adversary A, which makes only two queries, and whose advantage is almost 1.

The first query of A is for input 0^n; if we denote the oracle response by f(·), then A receives f(0^n). If the oracle is for E, A receives E_k(0^n) = 0^n ⊕ k = k, i.e., the key k. Intuitively, this clearly 'breaks' the system; let us show exactly how. From this point, our solution holds for the general case where the adversary found k (if the oracle is for f(·) = E_k(·)).

Our second query can be for any other value (not 0^n), e.g., let us make the query 1^n, so we now receive f(1^n), where f is either E_k or a random permutation. Adversary A checks whether f(1^n) (which it received from the oracle) is the same as E_k(1^n) (which A computes, since it believes it knows k). If the two are identical, then probably f(·) = E_k(·), and A returns 1 (PRP); otherwise, f is surely not E_k(·), hence it must be a random permutation (and A returns 0). So, the advantage of A is almost 1; specifically:

ε^{PRP}_{A,E}(n) = Pr_{k←$ {0,1}^n}[ A^{E_k} = 1 ] − Pr_{f←$ Perm({0,1}^n)}[ A^f = 1 ]
               = Pr_k[ E_k(1^n) = E_k(1^n) ] − Pr_{f←$ Perm({0,1}^n)}[ f(1^n) = E_k(1^n) ]

Now, if f is a random permutation, then f(1^n) is a random n-bit string; since there are 2^n n-bit strings, the probability of f(1^n) being any specific string, including E_k(1^n), is 2^{-n}. Hence:

ε^{PRP}_{A,E}(n) = 1 − 2^{-n} ≈ 1

Now, notice that the same adversary A also distinguishes E′; we leave it to the reader to substitute the values as necessary; these minimal changes are only required until A 'finds' k - from that point, the solution is exactly identical.

See Table 2.4 for a summary and comparison of random functions, random permutations, PRGs, PRFs and pseudo-random permutations (PRPs).

Table 2.4: Comparison between random function, random permutation, PRG, PRF, PRP and block cipher. The domain is denoted D and the range R; for permutations and block ciphers, the domain is also the range.

Function              | Property
PRG f                 | 'Long' output is pseudorandom, if the 'short' input is random
Random function f     | (∀x ∈ D) f(x) ←$ R (random mapping for each input)
Random permutation π  | Random 1-to-1 mapping: (∀x ≠ x′ ∈ D) π(x) ≠ π(x′)
PRF f_k(·)            | Indistinguishable from a random function, if k is selected randomly
PRP f_k(·)            | Indistinguishable from a random permutation, if k is selected randomly
Block cipher (E, D)   | (E, D) is indistinguishable from a random invertible permutation over domain D, and satisfies correctness: (∀k, m) m = D_k(E_k(m))

Robust combiner for block ciphers. Block ciphers have a simple robust combiner, i.e., a method to combine two or more candidate block ciphers (E′, D′), (E′′, D′′) into one 'combined' pair (E, D) which is a secure block cipher provided that one or more of the candidates is a secure block cipher. Basically, assuming both (E′, D′) and (E′′, D′′) satisfy correctness, their cascade is a secure block cipher, i.e., E_{k′,k′′}(x) = E′_{k′}(E′′_{k′′}(x)), D_{k′,k′′}(x) = D′′_{k′′}(D′_{k′}(x)). See [191].
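A minimal sketch of the cascade combiner, wiring together two toy 'candidate ciphers' over 16-byte blocks (the toy candidates are ours, are not secure on their own, and serve only to show how the cascade composes encryption and decryption in opposite order):

```python
import secrets

def xor_bytes(a, b): return bytes(x ^ y for x, y in zip(a, b))

# Toy candidate block ciphers (for illustration only).
E1 = lambda k, m: xor_bytes(m, k)                                   # E'_k(m)  = m XOR k
D1 = lambda k, c: xor_bytes(c, k)
E2 = lambda k, m: (int.from_bytes(m, "big") + int.from_bytes(k, "big")).to_bytes(17, "big")[-16:]  # m + k mod 2^128
D2 = lambda k, c: ((int.from_bytes(c, "big") - int.from_bytes(k, "big")) % 2**128).to_bytes(16, "big")

# Cascade combiner: secure if at least one candidate is a secure block cipher.
def E(k1, k2, m): return E1(k1, E2(k2, m))          # E_{k',k''}(x) = E'_{k'}(E''_{k''}(x))
def D(k1, k2, c): return D2(k2, D1(k1, c))          # D_{k',k''}(x) = D''_{k''}(D'_{k'}(x))

k1, k2, m = secrets.token_bytes(16), secrets.token_bytes(16), b"0123456789abcdef"
assert D(k1, k2, E(k1, k2, m)) == m                 # correctness of the cascade
```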
2.6.3 The Feistel Construction: 2n-bit Block Cipher from an n-bit PRF

It is not too difficult to design a candidate PRF from basic operations. However, designing a candidate PRP directly from basic operations seems harder, since we need to ensure that every input is mapped to a distinct output. Directly constructing a block cipher, an invertible PRP, seems harder still, since we need to ensure the permutation property and find the inverse permutation, while still avoiding vulnerabilities. This motivates the design of a PRP using a PRF.

In fact, the PRP/PRF switching lemma (Lemma 2.2) shows that every PRF can also be used as a PRP. Namely, a PRF over a domain D is indistinguishable from a PRP over D, and vice versa; no computationally-bounded (PPT) adversary is likely to distinguish between a PRP and a PRF (over domain D). So, it seems that we can just use a PRF instead of a PRP. However, a PRF is allowed to have some collisions, i.e., values x ≠ x′ ∈ D such that F_k(x) = F_k(x′) for the same key k. Collisions do not exist for permutations (random or not), and are undesirable for a block cipher; indeed, we required that a block cipher (E, D) ensures correctness, i.e., that m = D_k(E_k(m)) for every key k and message m. Clearly this will not hold if we use a PRF, which may have collisions, as the E function.

In this subsection, we study the Feistel construction of a PRP from a PRF; furthermore, the construction yields a PRP with 2n-bit inputs, given a PRF with n-bit inputs and outputs. Such a design is not trivial; see the following two exercises.

Exercise 2.15. Let f be a PRF from n-bit strings to n-bit strings, and define g_{kL,kR}(mL ++ mR) ≡ f_{kL}(mL) ++ f_{kR}(mR). Show that g is neither a PRF nor a PRP (over 2n-bit strings).

Hint: given a black box containing g or a random permutation over 2n-bit strings, design a distinguishing adversary A as follows. A makes two queries, one with input x = 0^{2n} and the other with input x′ = 0^n ++ 1^n. Denote the corresponding outputs by y = y_{0,...,2n−1} and y′ = y′_{0,...,2n−1}. If the box contained g, then y_{0,...,n−1} = y′_{0,...,n−1}. In contrast, if the box contained a random function, then the probability that y_{0,...,n−1} = y′_{0,...,n−1} is very small - only 2^{-n}. The probability is about as small if the box contained a PRP.

The next exercise presents a slightly more elaborate scheme, which is essentially a reduced version of the Feistel construction (presented next).

Exercise 2.16. Let f be a PRF from n-bit strings to n-bit strings. Show that g_{kL,kR}(mL ++ mR) = (mL ⊕ f_{kR}(mR)) ++ (mR ⊕ f_{kL}(mL)) is not a PRP (over 2n-bit strings).

We next present the Feistel construction, the best known and simplest construction of a PRP - in fact, an invertible PRP (block cipher) - from a PRF. As shown in Fig. 2.21, the Feistel cipher transforms an n-bit PRF into a 2n-bit invertible PRP.

[Figure 2.21: Three 'rounds' of the Feistel cipher, constructing a block cipher (invertible PRP) from a PRF F_k(·). The Feistel cipher is used in DES (but not in AES). Note: most publications present the Feistel cipher a bit differently, by 'switching sides' in each round.]
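A minimal Python sketch of the three-round construction shown in Figure 2.21 (the formal definition follows below); HMAC-SHA256 is used here only as a stand-in PRF, and the half-block size, variable names and helper functions are ours:

```python
import hmac, hashlib

N = 16  # n bytes per half-block

def F(k: bytes, x: bytes) -> bytes:
    """PRF stand-in (HMAC-SHA256 truncated to N bytes); the construction
    works with any PRF on n-bit strings."""
    return hmac.new(k, x, hashlib.sha256).digest()[:N]

def xor(a, b): return bytes(x ^ y for x, y in zip(a, b))

def feistel3_encrypt(k: bytes, m: bytes) -> bytes:
    """Three-round Feistel: maps a 2n-bit block to a 2n-bit block (Fig. 2.21)."""
    left, right = m[:N], m[N:]
    L = xor(left, F(k, right))       # round 1: L = left XOR F_k(right)
    R = xor(F(k, L), right)          # round 2: R = F_k(L) XOR right
    return xor(L, F(k, R)) + R       # round 3: output (L XOR F_k(R)) ++ R

def feistel3_decrypt(k: bytes, c: bytes) -> bytes:
    out_left, R = c[:N], c[N:]
    L = xor(out_left, F(k, R))       # undo round 3
    right = xor(F(k, L), R)          # undo round 2
    left = xor(L, F(k, right))       # undo round 1
    return left + right

k, m = b"k" * N, bytes(range(32))
assert feistel3_decrypt(k, feistel3_encrypt(k, m)) == m   # invertibility
```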
Formally, given a function y = f_k(x) with n-bit keys, inputs and outputs, the three-round Feistel cipher g_k(m) is defined as:

L_k(m) = m_{0,...,n−1} ⊕ F_k(m_{n,...,2n−1})
R_k(m) = F_k(L_k(m)) ⊕ m_{n,...,2n−1}
g_k(m) = (L_k(m) ⊕ F_k(R_k(m))) ++ R_k(m)

Note that we consider only a 'three-round' Feistel cipher, and use the same underlying function F_k in all three rounds, but neither aspect is mandatory. In fact, the Feistel cipher is used in the design of DES and several other block ciphers, typically using more rounds (e.g., 16 in DES), and often using different functions in different rounds. Luby and Rackoff [269] proved that a Feistel cipher of three or more 'rounds', using a PRF as F_k(·), is an invertible PRP, i.e., a block cipher.

One may ask: why use the Feistel design rather than directly design an invertible PRP? Indeed, this is done in AES, which does not follow the Feistel cipher design. An advantage of using the Feistel design is that it allows the designer to focus on the pseudo-randomness requirements when designing the PRF, without having to simultaneously make sure that the design is also an invertible permutation. Try to design a PRP, let alone an invertible PRP, and compare it to using the Feistel cipher!

2.7 Defining secure encryption

In the previous section, we defined PRGs, PRFs and PRPs; in this section, we finally take the next step and define secure encryption. The definition of secure encryption is quite subtle. In fact, people have been designing - and attacking - cryptosystems for millennia, without a precise definition of the security goals! This only changed with the seminal paper of Goldwasser and Micali [169], which presented the first precise definition of secure encryption, along with a design which was proven secure (under reasonable assumptions); this paper is one of the cornerstones of modern cryptography.

It may be surprising that defining secure encryption is so challenging; we therefore urge you to attempt the following exercise, where you are essentially challenged to try to define secure encryption on your own, before reading the rest of this section and comparing with the definition we present.

Exercise 2.17 (Defining secure encryption). Define secure symmetric encryption, as illustrated in Figure 1.4. Refer separately to the two aspects of security definitions: (1) the attack model, i.e., the capabilities of the attacker, and (2) the success criteria, i.e., what constitutes a successful attack and what constitutes a secure encryption scheme.

2.7.1 Attack model

Security definitions require a precise attack model, defining the maximal expected capabilities of the attacker. We have already discussed some of these capabilities. In particular, we already discussed the computational limitations of the attacker: in Section 2.4 we discussed the unconditional security model, where attackers have unbounded computational resources, and from subsection 2.5.2 we focus on Probabilistic Polynomial Time (PPT) adversaries, whose computation time is bounded by some polynomial in their input size. Another important aspect of the attacker model is the interaction with the attacked scheme and its environment. In Section 2.2, we introduced the ciphertext-only (CTO), known-plaintext attack (KPA), chosen-plaintext attack (CPA) and chosen-ciphertext attack (CCA) attack models.
Specifically, in a chosen-plaintext attack, the adversary can choose a plaintext and receive the corresponding ciphertext (the encryption of that plaintext), and in a chosen-ciphertext attack, the adversary can choose a ciphertext and receive the corresponding plaintext (its decryption), or an error message if the ciphertext does not correspond to a correctly-encrypted plaintext.

It is desirable to allow for attackers with maximal capabilities. Therefore, when we evaluate cryptosystems, we are interested in their resistance to all types of attacks, and especially the stronger ones - CCA and CPA. On the other hand, when we design systems using a cipher, we try to limit the attacker's capabilities. For example, one approach to foil CCA attacks is to apply some simple padding function pad to add redundancy to the plaintext before encryption; the padding function may be as simple as appending a fixed string. For example, given message m, key k, encryption scheme (E, D) and a simple padding function, e.g., pad(m) = m ++ 0^l, i.e., concatenating l zeros, we now encrypt by computing c = E_k(pad(m)) = E_k(m ++ 0^l). This allows the decryption process to identify invalid ciphertexts. Namely, given c = E_k(pad(m)) = E_k(m ++ 0^l), then D_k(c) = m ++ 0^l, and we output m as usual; but if the output of D_k does not contain l trailing zeros, then we identify a faulty ciphertext. This approach often helps to make it hard or infeasible for the attacker to apply a chosen-ciphertext attack; in particular, a random ciphertext would almost always be detected as faulty. Note, however, that adding redundancy to the plaintext may make it easier to perform ciphertext-only attacks; see Principle 9. Also, some combinations of encryption and padding functions may still be vulnerable to chosen-ciphertext attacks (CCA), as we show in the following exercise.

Exercise 2.18. Show that the combination of the simple padding function pad(m) = m ++ 0^l and the PRG stream cipher (Fig. 2.11) is vulnerable to a CCA attack. Show this also for the combination of the simple padding pad(m) = m ++ 0^l with two other ciphers we discussed so far.

2.7.2 The Indistinguishability Test for Shared-Key Cryptosystems

Intuitively, the security goal of encryption is confidentiality: to transform plaintext into ciphertext in such a way as to allow specific parties ('recipients') - and only them - to perform decryption, transforming the ciphertext back to the original plaintext. However, the goal as stated may be interpreted to forbid only recovery of the exact, complete plaintext; but what about recovery of partial plaintext? For example, suppose an eavesdropper can decipher half of the characters of the plaintext - is this secure? We believe most readers would not agree. What if she can decipher less, say one character? In some applications, this may be acceptable; in others, even the exposure of one character may have significant consequences.

Intuitively, we require that an adversary cannot learn anything given the ciphertext. This may be viewed as extreme; for example, in many applications the plaintext includes known fields, and their exposure may not be a concern. However, it is best to minimize assumptions and use definitions and schemes which are secure for a wide range of applications.
Indeed, in general, when we design a security system, cryptographic or otherwise, it is important to clearly define both aspects of security: the attack model (e.g., the types of attacks 'allowed' and any computational limitations), as well as the success criteria (e.g., the ability to get merchandise without paying for it). Furthermore, it is difficult to predict the actual environment in which a system will be used. Therefore, following the conservative design principle (Principle 3), our definition should prevent the adversary from learning any information about the plaintext from the ciphertext.

Let us assume that you agree that it would be best to require that an adversary cannot learn anything from the ciphertext. How do we ensure this? This is not so easy. The seminal paper by Goldwasser and Micali [169] presented two definitions and showed them to be equivalent: semantic security and indistinguishability. We only present the latter, since we find it easier to understand and use, and it resembles the PRF, PRG and Turing indistinguishability tests (Figure 2.18, Figure 2.13 and Figure 2.12, respectively). Intuitively, an encryption scheme ensures indistinguishability if an attacker cannot distinguish between the encryptions of any two given messages. But, again, turning this into a 'correct' and precise definition requires care.

The concept of indistinguishability is reminiscent of disguises; it may help to consider the properties we can hope to find in an 'ideal disguise service':

Any two disguised persons are indistinguishable: we cannot distinguish between any two well-disguised persons. Yes, even Rachel from Leah!15 Except, the two persons should have the 'same size': assuming that a disguise is of 'reasonable size' (overhead), a giant can't be disguised to be indistinguishable from a dwarf!

Re-disguises should be different: if we see Rachel in disguise, and then she disappears and we see a new disguise, we should not be able to tell whether it is Rachel again, in a new disguise - or any other disguised person! This means that disguises must be randomized or stateful, i.e., every two disguises of the same person (Rachel) will be different.

We present corresponding properties for indistinguishable encryption:

Encryptions of any two messages are indistinguishable: to allow arbitrary applications, we allow the attacker to choose the two messages. However, there is one restriction: the two messages should be of the same length.

Re-encryptions should be different: the attacker should not be able to distinguish encryptions based on previous encryptions of the same messages. This means that encryption must be randomized or stateful, so that two encryptions of the same message will be different. (A weaker notion of 'deterministic encryption' allows detection of re-encryption of a message, and is sometimes used in scenarios where state and randomization are to be avoided.)

15 See: Genesis 29:23, King James Bible.

We are finally ready to formally present an indistinguishability-based definition of secure encryption. Definition 2.9 defines chosen-plaintext attack (CPA) indistinguishable (IND-CPA) shared-key encryption schemes.

[Figure 2.22: Illustration of the CPA indistinguishability (IND-CPA) test for shared-key encryption, T^{IND-CPA}_{A,⟨E,D⟩}(b, n); see also the pseudocode in Figure 2.23. Throughout the test, the adversary A may ask for the encryption of one or many messages m. At some point, A sends two same-length messages (|m0| = |m1|), and receives the encryption of mb, i.e., Ek(mb). Finally, A outputs its guess b∗, and 'wins' if b = b∗. The encryption is IND-CPA if Pr(b∗ = 1 | b = 1) − Pr(b∗ = 1 | b = 0) is negligible.]
The IND-CPA test receives two inputs: the 'challenge bit' b (which A tries to find), and the security parameter, which in this case is also the key length, n. The adversary is given oracle access to Ek(·); we denote the fact that A has oracle access to Ek(·) by writing A^{Ek(·)}. Namely, A may select a message m and receive its encryption Ek(m) - and possibly repeat this for more messages. The encryption Ek(·) may be either stateless or stateful; for stateful encryption, the state is maintained by the oracle and is not exposed to A. At some point, A gives a pair of messages m0, m1, and receives c∗ = Ek(mb). As discussed above, the two messages must be of equal length, |m0| = |m1|. Finally, A outputs b∗, which is the output of the test. Intuitively, A 'wins' if b∗ = b. We present the IND-CPA test informally in Figure 2.22, and using pseudocode in Figure 2.23.

Oracle notation A^{Ek(·)}. In the IND-CPA test, we use the oracle notation A^{Ek(·)}, defined in Def. 1.3. Namely, A^{Ek(·)} denotes calling the algorithm A with 'oracle access' to the (keyed) PPT algorithm Ek(·), i.e., A can provide an arbitrary plaintext string m and receive Ek(m). The oracle may maintain its own state, allowing stateful encryption.

T^{IND-CPA}_{A,⟨E,D⟩}(b, n) {
    k ←$ {0,1}^n
    (m0, m1) ← A^{Ek(·)}('Choose', 1^n)  s.t. |m0| = |m1|
    c∗ ← Ek(mb)
    b∗ ← A^{Ek(·)}('Guess', c∗)
    Return b∗
}

Figure 2.23: Pseudocode for the chosen-plaintext attack (CPA) indistinguishability (IND-CPA) test for shared-key encryption schemes (E, D), illustrated in Figure 2.22. The two calls to the adversary are often referred to as the 'Choose' phase and the 'Guess' phase.

Adversary A chooses the challenge messages. The IND-CPA test allows A to choose the two challenge messages m0, m1, and then receive c∗ = Ek(mb), where b ∈ {0,1}. Allowing A to select the two messages may make A's task easier; in many applications, the adversary has only very limited knowledge about the possible plaintext messages. This follows the conservative design principle - the encryption should be appropriate for any application, including one in which there are only two possible plaintext messages, both known to the attacker, who 'just' needs to learn which of them was encrypted. One classical example is when the messages are 'attack' or 'retreat'; another would be 'sell' or 'buy'.

Encryption must be randomized or stateful. IND-CPA encryption must be either randomized or stateful. The reason is simple: the adversary is allowed to make queries for arbitrary messages - including the 'challenges' m0, m1. If the encryption scheme is deterministic - and stateless - then all encryptions of a given message, e.g., m0, return the same fixed ciphertext; this allows the attacker to trivially 'win' the IND-CPA experiment. Furthermore, Exercise 2.46 shows that limiting the number of random bits per encryption may lead to vulnerability.

Using the IND-CPA test, we now define IND-CPA encryption, similarly to how we defined PRG and PRF in Definition 2.5 and Definition 2.6, respectively.

Definition 2.9 (IND-CPA shared-key cryptosystems). Let ⟨E, D⟩ be a shared-key cryptosystem.
We say that ⟨E, D⟩ is IND-CPA if every efficient adversary A ∈ PPT has negligible advantage ε^{IND-CPA}_{⟨E,D⟩,A}(n) ∈ NEGL(n), where:

ε^{IND-CPA}_{⟨E,D⟩,A}(n) ≡ Pr[ T^{IND-CPA}_{A,⟨E,D⟩}(1, n) = 1 ] − Pr[ T^{IND-CPA}_{A,⟨E,D⟩}(0, n) = 1 ]    (2.43)

where the probability is over the random coin tosses in the IND-CPA test (including those of A and E).

The following exercise can help in understanding subtle aspects of the definition.

Exercise 2.19. Consider the following two alternative advantage functions, ε̃ and ε̂:

ε̃^{IND-CPA}_{⟨E,D⟩,A}(n) ≡ Pr[ T^{IND-CPA}_{A,⟨E,D⟩}(1, n) = 1 ] − Pr[ T^{IND-CPA}_{A,⟨E,D⟩}(1, n) = 0 ]

ε̂^{IND-CPA}_{⟨E,D⟩,A}(n) ≡ Pr[ T^{IND-CPA}_{A,⟨E,D⟩}(1, n) = 1 ] − Pr[ T^{IND-CPA}_{A,⟨E,D⟩}(0, n) = 0 ]

Show that neither ε̃ nor ε̂ is a reasonable definition of an advantage function, by presenting (simple) adversaries which achieve significant advantage for any cryptosystem, including (randomized) cryptosystems which satisfy indistinguishability, i.e., where any PPT adversary has negligible advantage.

Indistinguishability for the CTO, KPA and CCA attack models. Definition 2.9 focuses on the Chosen-Plaintext Attack (CPA) model. Modifying this definition for the case of chosen-ciphertext (CCA) attacks requires a further (quite minor) change and extension, to prevent the attacker from 'abusing' the decryption oracle to decrypt the challenge ciphertext.

Modifying the definition for Ciphertext-Only (CTO) attacks and Known-Plaintext Attacks (KPA) is more challenging. For KPA, the obvious question is which plaintext-ciphertext messages are known; this may be solved by using random plaintext messages; however, in reality, the known plaintext is often quite specific. It is similarly challenging to modify the definition so that it covers CTO attacks, where the attacker must know some information about the plaintext distribution. This information may be related to the specific application, e.g., when the plaintext is English. In other cases, information about the plaintext distribution may be derived from the system design. One example is text encoded using a code with some built-in error-detection capability, e.g., the ASCII encoding [94], where one of the bits in every character is the parity of the other bits. An even more extreme example is GSM, where the plaintext is the output of an Error-Correcting Code (ECC), providing significant redundancy which even allows a CTO attack on GSM's A5/1 and A5/2 ciphers [26]. In such a case, the amount of redundancy in the plaintext can be compared to that provided by a KPA attack. We consider it a CTO attack as long as the attack does not require knowledge of all or much of the plaintext corresponding to the given ciphertext messages. Some systems, including GSM, allow the attacker to guess all or much of the plaintext for some of the ciphertext messages, e.g., when a predictable message is sent at a specific time. Such systems violate the Conservative Design Principle (Principle 3), since a KPA vulnerability of the cipher renders the system vulnerable. A better system design would limit the adversary's knowledge about the distribution of plaintexts, requiring a CTO vulnerability to attack the system.
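Before moving on to public-key cryptosystems, it may help to 'run' the IND-CPA test of Figure 2.23 against a concrete, insecure scheme. The following Python sketch is ours and not from the text: the toy scheme is deterministic and stateless (HMAC-SHA256 serves as a stand-in PRF), and a simple adversary achieves advantage essentially 1, illustrating why IND-CPA encryption must be randomized or stateful. All names are ours.

import hmac, hashlib, os

def det_enc(k: bytes, m: bytes) -> bytes:
    """Deterministic, stateless 'encryption': XOR with a fixed pseudorandom pad.
    (Illustrative only; HMAC-SHA256 plays the role of the PRF.)"""
    pad = hmac.new(k, b"pad", hashlib.sha256).digest()[:len(m)]
    return bytes(x ^ y for x, y in zip(m, pad))

def ind_cpa_test(E, adversary, b: int, n: int = 16) -> int:
    """The test T^{IND-CPA}_{A,(E,D)}(b, n) of Figure 2.23."""
    k = os.urandom(n)
    oracle = lambda m: E(k, m)              # A's oracle access to E_k(.)
    m0, m1 = adversary.choose(oracle)       # 'Choose' phase, |m0| == |m1|
    c_star = E(k, (m0, m1)[b])
    return adversary.guess(oracle, c_star)  # 'Guess' phase, returns b*

class BreaksDeterministic:
    """Adversary: re-encrypt m0 via the oracle and compare with the challenge."""
    def choose(self, oracle):
        self.m0, self.m1 = b"attack!!", b"retreat!"
        return self.m0, self.m1
    def guess(self, oracle, c_star):
        return 0 if oracle(self.m0) == c_star else 1

# Estimate the advantage Pr[T(1,n)=1] - Pr[T(0,n)=1]; for det_enc it is ~1.
trials = 200
p1 = sum(ind_cpa_test(det_enc, BreaksDeterministic(), 1) for _ in range(trials)) / trials
p0 = sum(ind_cpa_test(det_enc, BreaksDeterministic(), 0) for _ in range(trials)) / trials
print("estimated advantage:", p1 - p0)   # prints 1.0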
2.7.3 The Indistinguishability-Test for Public-Key Cryptosystems (PKCs)

We next define CPA-indistinguishability for public-key cryptosystems (PKC; see Figure 1.5). The definition is a minor variation of the indistinguishability test for shared-key cryptosystems (Definition 2.9), and even a bit simpler. In fact, let us first present the definition, as well as the IND-CPA test for PKCs (Figure 2.24), and only then point out and explain the differences; this allows the reader to play 'find the differences', comparing to Definition 2.9.

T^{IND-CPA}_{A,⟨KG,E,D⟩}(b, n) {
    (e, d) ←$ KG(1^n)
    (m0, m1) ← A('Choose', e)  s.t. |m0| = |m1|
    c∗ ← Ee(mb)
    b∗ ← A('Guess', (c∗, e))
    Return b∗
}

Figure 2.24: The IND-CPA-PK test for public-key encryption (KG, E, D). Notice that this test does not use the decryption key d, generated in the first step.

Definition 2.10 (IND-CPA-PK). Let ⟨KG, E, D⟩ be a public-key cryptosystem. We say that ⟨KG, E, D⟩ is IND-CPA-PK if every efficient adversary A ∈ PPT has negligible advantage ε^{IND-CPA-PK}_{⟨KG,E,D⟩,A}(n) ∈ NEGL(n), where:

ε^{IND-CPA-PK}_{⟨KG,E,D⟩,A}(n) ≡ Pr[ T^{IND-CPA}_{A,⟨KG,E,D⟩}(1, n) = 1 ] − Pr[ T^{IND-CPA}_{A,⟨KG,E,D⟩}(0, n) = 1 ]    (2.44)

where the probability is over the random coin tosses in the IND-CPA test (including those of A and E).

In IND-CPA-PK (Definition 2.10), the adversary is given the public key e. Hence, the adversary can encrypt at will, without the need to make encryption queries, as enabled by the oracle calls in Definition 2.9; therefore, we removed the oracle. Another change is purely syntactic: the cryptosystem includes an explicit key-generation algorithm KG, while for the shared-key cryptosystem we assumed the (typical) case where the keys are just random n-bit strings. We discuss three specific public-key cryptosystems, all in Chapter 6: the DH and El-Gamal PKCs in Section 6.4, and the RSA PKC in Section 6.5.

2.7.4 Design of Secure Encryption: the Cryptographic Building Blocks Principle

We next discuss the design of secure symmetric encryption schemes. It would be great if we could use encryption schemes which are provably secure, e.g., proven to be IND-CPA (Definition 2.9), without assumptions on the computational hardness of some underlying functions. However, this is unlikely; let us explain why.

A provably IND-CPA encryption implies P ≠ NP. IND-CPA implies that there is no efficient (PPT) algorithm that can distinguish between the encryptions of two given messages, i.e., the IND-CPA test is not in the polynomial complexity class P, containing problems which have a polynomial-time algorithm. On the other hand, it is surely easy to 'win' the test given the key, which implies that the IND-CPA test is in the non-deterministic polynomial complexity class NP, containing problems which have a polynomial-time algorithm - if given a hint (in our case, the key). Taken together, this would show that the complexity class P is strictly smaller than the complexity class NP, i.e., P ≠ NP. Now, that would be a solution to the most fundamental open question in the theory of computational complexity! It is not practical to require the encryption algorithm to have a property whose existence implies a solution to such a basic, well-studied open question. Therefore, both theoretical and applied cryptography consider designs whose security relies on failed attempts in cryptanalysis.
The big question is: should we rely on failed cryptanalysis of the scheme itself, or on failed cryptanalysis of underlying components of the scheme? It may seem that the importance of encryption schemes should motivate the őrst approach, i.e., relying of failed attempts to cryptanalyze the scheme. Surely this was the approach in historical and ‘classical’ cryptology. However, in modern applied cryptography, it is much more common to use the second approach, i.e., to construct encryption using ‘simpler’ underlying primitives, and to base the security of the cryptosystem on the security of these component modules. We summarize this approach in the following principle, and then give some justiőcations. Principle 8 (Cryptographic Building Blocks). The security of cryptographic systems should only depend on the security of a few basic building blocks. These blocks should be simple and with well-deőned and easy to test security properties. More complex schemes should be proven secure by reduction to the security of the underlying blocks. The advantages of following the cryptographic building blocks principle include: Efficient cryptanalysis: by focusing cryptanalysis effort on few schemes, we obtain much better validation of their security. The fact that the building Applied Introduction to Cryptography and Cybersecurity 2.7. DEFINING SECURE ENCRYPTION 119 blocks are simple and are selected to be easy to test makes cryptanalysis even more effective. Replacement and upgrade: by using simple, well-deőned modules, we can replace them for improved efficiency - or to improve security, in particular after being broken or when doubts arise. Flexibility and variations: complex systems and schemes naturally involve many options, tradeoffs and variants; it is better to build all such variants using the same basic building blocks. Robust combiners: there are known, efficient robust-combiner designs for the basic cryptographic building blocks [191]. If desired, we can use these as the basic blocks for improved security. The cryptographic building blocks principle is key to both applied and theoretical modern cryptography. From the theoretical perspective, it is important to understand which schemes can be implemented given another scheme. There are many results exploring such relationships between different cryptographic schemes and functions, with many positive results (constructions), few negative results (proofs that efficient constructions are impossible or improbable), and very few challenging open questions. In modern applied cryptography, the principle implies the need to deőne a small number of basic building blocks, which would be very efficient, simple functions - and convenient for many applications. The security of these building blocks would be established by extensive (yet unsuccessful) cryptanalysis efforts - instead of relying on provably-secure reductions from other cryptographic mechanisms. In fact, most cryptographic libraries contain the four such widely-used building blocks: shared-key block ciphers, cryptographic hash functions , publickey encryption and signature schemes. Cryptographic hash functions and block ciphers are much more efficient than the public key schemes (see Table 6.1) and hence are preferred, and used in most practical systems - when public-key operations may be avoided. In particular, block ciphers are widely used as cryptographic building blocks, as they satisfy most of the requirements of the Cryptographic Building Blocks principle. 
They are simple, deterministic functions with Fixed Input Length (FIL), which is furthermore identical to their output length. This should be contrasted with 'full-fledged encryption schemes', which are randomized (or stateful) and have Variable Input Length (VIL). Which brings us to the natural question: can we use block ciphers for secure encryption - and how?

Block ciphers vs. secure encryption. Could we simply use a block cipher for encryption? This seems natural; block ciphers, in particular DES and AES, are often referred to as encryption schemes, and even typically use the notation (E, D) for their keyed functions. However, block ciphers do not satisfy the requirements of most definitions of encryption, e.g., the IND-CPA test of Def. 2.9.

Exercise 2.20. Explain why a PRP, and similarly a block cipher, fail the IND-CPA test (Def. 2.9).

Solution: Consider Ek(m), which is either a PRP or the 'encryption' operation of a block cipher (i.e., of a pair (E, D) of a PRP and its inverse). Then Ek(m) is a function; whenever we apply it to the same message m, with the same key k, we receive the same output Ek(m). The attacker A chooses any two different messages as (m0, m1), confirms that c0 = Ek(m0) ≠ c1 = Ek(m1), and then uses these as the challenge, receiving c∗ = Ek(mb). It then outputs b′ s.t. cb′ = c∗.

On the other hand, in the next section we discuss multiple constructions of secure encryption schemes based on block ciphers; such constructions are often referred to as modes of operation.

2.8 Encryption Modes of Operation

Finally, we get to design symmetric encryption schemes. Following the Cryptographic Building Blocks principle, the designs are based on the much simpler block ciphers. We use the term mode of operation for such a construction of encryption and other cryptographic schemes from block ciphers. This term, and several standard modes of operation, were defined in the DES specifications [296], and redefined in the AES specification [134], which added one more standard mode of operation (the CTR mode). Additional modes of operation were defined and proposed in different standards and publications. In this section, we describe the standard modes of operation from [134, 296], slightly simplifying the CTR mode. For didactic purposes, we add one non-standard (and inefficient) mode of operation, which we call the Per-Block Random (PBR) mode. These modes are summarized in Table 2.5.

The 'modes of operation' in Table 2.5 are designed to turn block ciphers into more complete cryptosystems, handling goals such as:

Variable length and padding: we allow encryption of arbitrary, Variable Input Length (VIL) messages. All modes of operation are defined for input whose length is an integral number l of blocks. If the input is not an integral number of blocks, then it should be padded to an integral number of blocks before applying the encryption (i.e., the mode of operation). Correct padding can be quite simple; surprisingly, however, incorrect padding can result in serious vulnerabilities. We discuss padding and possible vulnerabilities in Section 2.9.

Randomization/state: Most modes use randomness to ensure independence between two encryptions of the same (or of related) messages, as required by indistinguishability-based security definitions. The exceptions are the CTR mode, which uses state instead of randomization, and the ECB mode, which uses neither - and is therefore not IND-CPA.
Mode | Encryption | Flip ci[j] ⇒ | Properties
Electronic Code Book (ECB) | ci = Ek(mi) | corrupt mi | deterministic (distinguishable)
Per-Block Random (PBR) | ri ←$ {0,1}^n, ci = (ri, mi ⊕ Ek(ri)) | flip mi[j] (no integrity) | long ciphertext
Counter (CTR) [simplified] | ci = mi ⊕ Ek(i) | flip mi[j] | fast online, stateful (i)
Output Feedback (OFB) | r0 ←$ {0,1}^n, ri = Ek(ri−1), c0 ← r0, ci ← ri ⊕ mi | flip mi[j] (no integrity) | fast online (precompute)
Cipher Feedback (CFB) | c0 ←$ {0,1}^n, ci ← mi ⊕ Ek(ci−1) | flip mi[j], corrupt mi+1 | can decrypt in parallel
Cipher-Block Chaining (CBC) | c0 ←$ {0,1}^n, ci ← Ek(mi ⊕ ci−1) | corrupt mi, flip mi+1[j] | can decrypt in parallel

Table 2.5: Encryption Modes of Operation using an n-bit block cipher. ECB, OFB, CFB and CBC are from NIST [134, 296]. The plaintext is given as a concatenation of n-bit blocks m1 ++ m2 ++ . . ., i.e., mi ∈ {0,1}^n. Similarly, the ciphertext is produced as a sequence of n-bit blocks c0 ++ c1 ++ . . ., where ci ∈ {0,1}^n (except for PBR, where ci ∈ {0,1}^{2n}). We use mi[j] (ci[j]) to denote the j-th bit of the plaintext (respectively, ciphertext) block.

PRF: Most modes (PBR, OFB, CFB and CTR) use only the encryption function E - even for decryption. This has an important implication: they may be implemented using a PRF instead of a block cipher. This may imply better security, especially when the same key is used for a very large number of messages, due to improved concrete security (smaller advantage to the attacker). However, notice that there is no such advantage if we simply use a block cipher as a PRP, relying on the PRP/PRF switching lemma (Lemma 2.2); we should use one of the (simple and efficient) constructions of a PRF from a block cipher, which avoid the increase in the adversary's advantage; see [33, 39, 183, 338].

Efficiency: Efficiency is important - and multi-faceted. All of the modes we present use one block-cipher operation per message block. The CTR and PBR modes also allow parallel encryption and decryption, as well as 'random access' decryption - decryption of only specific blocks of the plaintext. Another efficiency consideration is offline precomputation; in CTR mode, we may perform all the block-cipher computations offline; after receiving the plaintext/ciphertext, we only need a single XOR operation per block. The OFB mode has a similar property, but only for encryption; decryption requires the ciphertext as input to the block cipher.

Integrity/authentication: Some modes, which, unfortunately, we do not discuss, ensure both confidentiality and integrity, preventing an attacker from modifying intercepted messages to mislead the recipient, or from forging messages as if they were sent by a trusted sender. These include the Counter with CBC-MAC (CCM) mode and the (more efficient) Galois/Counter Mode (GCM). Other modes ensure only authenticity; we discuss one such mode, the CBC-MAC mode, in subsection 4.5.2.

Error localization and weak integrity: In OFB and CTR, corruption of any number of ciphertext bits results in corruption of only the corresponding plaintext bits.
This may help to recover from some corruptions of bits during communication, since no additional bits are lost, but it also implies that the attacker may 'flip' plaintext bits by 'flipping' the corresponding ciphertext bits. In contrast, in the CFB and CBC modes, corruption of a single ciphertext block flips a bit in one block and 'corrupts' another block - with some exceptions; this is sometimes considered a weak form of integrity protection, but the defense is very fragile, and relying on it has resulted in several vulnerabilities.

2.8.1 The Electronic Code Book (ECB) mode

ECB is a naïve mode, which isn't really a proper 'mode': it simply applies the block cipher separately to each block of the plaintext. Namely, to encrypt the plaintext string m = m1 ++ m2 ++ . . ., where each mi is a block (i.e., |mi| = n), we simply compute ci = Ek(mi). Decryption is equally trivial: mi = Dk(ci), and correctness of encryption, i.e., m = Dk(Ek(m)) for every k, m ∈ {0,1}∗, follows immediately from the correctness of the block cipher Ek(·).

[Figure 2.25: Electronic Code Book (ECB) mode encryption of plaintext message m consisting of l blocks, m = m1, . . . , ml; each block is encrypted separately as ci = Ek(mi). Adapted from [218].]

[Figure 2.26: Electronic Code Book (ECB) mode decryption of ciphertext c consisting of l blocks, c = c1, . . . , cl; each block is decrypted separately as mi = Dk(ci). Adapted from [218].]

The reader may have already noticed that ECB is simply a monoalphabetic substitution cipher, as discussed in subsection 2.1.3. The 'alphabet' here is indeed large: each 'letter' is a whole n-bit block. For typical block ciphers, the block size is significant, e.g., nDES = 64 bits for DES and nAES = 128 bits for AES; this definitely improves security, and may make it challenging to decrypt ECB-mode messages in many scenarios. However, ECB obviously may expose some information about the plaintext; in particular, all encryptions of the same plaintext block result in the same ciphertext block. Even with relatively long blocks of 64 or 128 bits, such repeating blocks are quite likely in practical applications and scenarios, since inputs are not random strings. Essentially, this is a generalization of the letter-frequency attack of subsection 2.1.3 (see Fig. 2.5). This weakness of ECB is often illustrated graphically by the example in Fig. 2.27, using the 'Linux Penguin' image [144, 389].

[Figure 2.27: The classical visual demonstration of the weakness of the ECB mode. The middle and the right 'boxes' are encryptions of the bitmap image shown on the left. Which of the two is 'encrypted' using ECB, and which is encrypted with one of the secure encryption modes?]
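The repeated-block leak illustrated by Figure 2.27 is easy to reproduce directly. The following small sketch (ours, not from the text) assumes the third-party pyca/cryptography package and uses AES merely as a convenient instance of a block cipher Ek.

# Equal plaintext blocks yield equal ciphertext blocks under ECB.
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

k = os.urandom(16)
aes_ecb = Cipher(algorithms.AES(k), modes.ECB()).encryptor()

# Two identical 16-byte plaintext blocks, and one different block:
m = b"attack at dawn!!" * 2 + b"retreat at dusk!"
c = aes_ecb.update(m) + aes_ecb.finalize()

blocks = [c[i:i+16] for i in range(0, len(c), 16)]
print(blocks[0] == blocks[1])   # True: repetition in m is visible in c
print(blocks[0] == blocks[2])   # False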
2.8.2 The Counter (CTR) mode and the Per-Block Random (PBR) mode

We next present the Counter (CTR) mode and the Per-Block Random (PBR) mode.

The Per-Block Random (PBR) mode. PBR is not a standard mode, possibly since it is inefficient: a block of random bits is generated for each plaintext block and sent as part of the ciphertext, resulting in a ciphertext whose length is twice that of the plaintext. We present it since it provides a simple way to construct a secure stateless encryption scheme from a PRF, PRP or block cipher, and since it is very similar to the stateful CTR mode.

[Figure 2.28: Per-Block Random (PBR) mode encryption of plaintext message m consisting of l blocks, m = m1, . . . , ml; each block mi is encrypted as ci = (ri, mi ⊕ Ek(ri)) for a fresh random block ri.]

The PBR mode is illustrated in Figure 2.28. Let m = m1 ++ m2 ++ . . . ++ mM be a plaintext message, where each mi is one n-bit block, i.e., mi ∈ {0,1}^n; let E denote a block cipher for n-bit blocks, and let k denote a key for E. Then we compute the PBR-mode encryption of m using block cipher E and key k, denoted PBR.Enc^E_k(m), as follows:

PBR.Enc^E_k(m) ≡ c1 ++ . . . ++ cM, where ri ←$ {0,1}^n and ci ← (ri, mi ⊕ Ek(ri))    (2.45)

Namely, we encrypt each message block mi using a corresponding fresh random block ri. Note that we can encode each ci simply as ci = ri ++ (mi ⊕ Ek(ri)), i.e., as a string of 2n bits; the pairwise notation is equivalent, and a bit easier to work with.

PBR decryption performs the dual operation. Namely, given key k and ciphertext c = (r1, c′1) ++ . . . ++ (rM, c′M), where each ri and c′i is one block (n bits), we compute the PBR-mode decryption of c, denoted PBR.Dec^E_k(c), as:

PBR.Dec^E_k(c) = m1 ++ . . . ++ mM, where mi ← c′i ⊕ Ek(ri)    (2.46)

It is not difficult to show that PBR mode ensures correctness.

Exercise 2.21. Show that PBR mode ensures correctness, i.e., that for every l-block message m = m1, . . . , ml and l random blocks r1, . . . , rl holds: m = PBR.Dec^E_k(PBR.Enc^E_k(m)).

Note that PBR mode does not use the 'decryption' function D of the underlying block cipher (invertible PRP) at all. Indeed, PBR can be instantiated using a PRF or PRP instead of a block cipher. As can be seen from Table 2.5 and Exercise 2.47, this also holds for the OFB and CFB modes.

PBR is not a standard mode, and indeed, we do not recommend it for applications, since it is wasteful: it requires one block of random bits per block of plaintext, and all these random blocks also become part of the ciphertext and are used for decryption, i.e., the length of the ciphertext is double the length of the plaintext. However, PBR is secure - allowing us to discuss a simple provably-secure construction of a symmetric cryptosystem, based on the security of the underlying block cipher (invertible PRP).

Theorem 2.1. If E is a PRF or PRP, or (E, D) is a block cipher (invertible PRP), then (PBR.Enc^E, PBR.Dec^E) is a CPA-indistinguishable symmetric encryption scheme.

Proof. We present the proof for the case where E is a PRF; the other cases are similar. We also focus, for simplicity, on encryption of a single-block message, m = m1 ∈ {0,1}^n. Denote by (PBR.Enc^f, PBR.Dec^f) the same construction, except using, instead of E, a 'truly' random function f ←$ {{0,1}^n → {0,1}^n}. In this case, for any pair of plaintext messages m0, m1 selected by the adversary and randomness r used for encrypting, the probability of c∗ = (r, m0 ⊕ f(r)) is exactly the same as the probability of c∗ = (r, m1 ⊕ f(r)), by symmetry of the random choice of f and r. Hence, the attacker's success probability when 'playing' the IND-CPA game (Figure 2.23) against (PBR.Enc^f, PBR.Dec^f) is exactly half.
Note that this holds even for computationally-unbounded adversary. Assume, to the contrary, that there is some PPT adversary A, that is able to gain a non-negligible advantage against (P BR.EncE , P BR.DecE ). Recall that this holds, even assuming E is a PRF - however, as argued above, A succeeds with probability exactly half, i.e., with exactly zero advantage, against (P BR.Encf , P BR.Decf ), i.e., if instead of the PRF E, we use the truly random function f . We can use A to distinguish between Ek (·) and a random function f , with signiőcant probability, contradicting the assumption that Ek is a PRF; see Def. 2.6. Namely, we run A against the PBR construction instantiated with either a true random function or Ek (·), resulting in either (P BR.Encf , P BR.Decf ) or (P BR.EncE , P BR.DecE ), correspondingly. Since A wins with signiőcant advantage against (P BR.EncE , P BR.DecE ), and with no advantage against (P BR.Encf , P BR.Decf ), this allows distinguishing the pseudorandom function Ek (·) from a truly random function f , proving the contradiction. Applied Introduction to Cryptography and Cybersecurity 126 CHAPTER 2. CONFIDENTIALITY: ENCRYPTION SCHEMES AND PSEUDO-RANDOMNESS Error localization, integrity and CCA security. Since PBR mode encrypts the plaintext by bitwise XOR, i.e., ci = (ri + + mi ⊕ Ek (ri )), ŕipping a bit in the second part results in ŕipping of the corresponding bit in the decrypted plaintext, with no other change in the plaintext. We say that such bit errors are perfectly localized or have no error propagation. On the other hand, bit errors in the random pad part ri corrupt the entire corresponding plaintext block, i.e., are propagated to the entire block. We say that the PBR mode ensures 1-block error localization, since an error in one ciphertext block corrupts at most one plaintext blocks upon decryption. In general, we say that a cryptosystem ensures error localization, and speciőcally b-blocks error localization, if an error in one ciphertext block corrupts at most b plaintext blocks. Error localization is a common property; in fact, all of the modes of operations that we discuss ensure error localization. Error localization limits the damage of bit-ŕip errors, but has security drawbacks. First, we note that with PBR, ŕipping of speciőc ciphertext bits causes ŕip in a corresponding bit in the decrypted plaintext; this also holds for several ciphers that ensure error localization, e.g., the one-time-pad (OTP). Therefore, PBR and other ciphers allowing bit-ŕipping of plaintext do not protect integrity. Of possibly larger concern is that PBR, and every cryptosystem that ensures error localization, cannot be IND-CCA secure; see Exercise 2.22. Exercise 2.22 (Error localization conŕicts with IND-CCA security). that PBR is not IND-CCA secure. 1. Show 2. Generalize part 1 to show that any cryptosystem E, D with localized errors is not IND-CCA secure. Or, prove for the special case where an error in ciphertext block i results in corruption of blocks i and i + 1 of the decrypted plaintext. Solution of part 1: The adversary gives to the test two challenge plaintext, m0 and m1 , both consisting of two blocks, and which differ in their őrst block, i.e., m0 [1] ̸= m1 [1] (the second blocks of m0 and m1 may differ or not), and receives ∗ the encryption c∗ = P BR_EncE k (mb ). From Equation 2.45, the ciphertext c ∗ consists of two pairs: (i ∈ {1, 2})c [i] = (r[i], ĉ[i]) where ĉ[i] ≡ mb [i] ⊕ Ek (ri )) and r[i] is a random block. 
The adversary asks the oracle to decrypt c′, which also consists of two pairs, namely: c′[1] = c∗[1], c′[2] = (r[2] ⊕ 1, ĉ[2]), i.e., flipping one (or more) bits of r[2]. Notice that c′ ≠ c∗; therefore, the adversary is 'allowed' to ask the oracle to decrypt c′. Let m′ = PBR.Dec^E_k(c′) denote the decryption of c′ which the adversary receives from the oracle; from Equation 2.46, m′[1] = mb[1]. Since m0[1] ≠ m1[1], the adversary learns b.

The Counter (CTR) mode. Let us now discuss the Counter (CTR) mode, a standard mode of operation, defined16 in [134], which is similar in design to the PBR mode. CTR mode is unique among the modes of operation we discuss in being stateful: it maintains a counter of the number of blocks encrypted/decrypted so far, which is incremented whenever a new block is encrypted or decrypted. See Figure 2.29.

[Figure 2.29: Counter (CTR) mode encryption of plaintext message m consisting of l blocks, m = m1, . . . , ml; block i is encrypted as ci = mi ⊕ Ek(s + i). The counter (state) s is the number of blocks encrypted so far; initially s = 0, and its value is incremented whenever a new block is encrypted.]

The security of CTR mode against CPA attacks follows, very similarly to that of PBR, from the PRF/PRP assumption on the block cipher E. By using state, we avoid the need to generate and send a new random block for each plaintext block; therefore, when we can reliably use state, CTR mode offers efficiency. Another advantage is that senders and recipients can precompute the block-cipher operations even before they receive the plaintext or ciphertext, requiring only a block-wise XOR when the data (plaintext or ciphertext) arrives. On the other hand, the counter (CTR) mode is vulnerable to CCA attacks; you can show this basically just as shown for PBR in Exercise 2.22.

16 Our description slightly simplifies CTR mode; see exact details in [134].
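The following Python sketch (ours, not from the text) implements the simplified, stateful CTR mode just described. Following the observation in Table 2.5 that CTR uses only the 'encryption' direction, the block cipher is replaced, for illustration only, by a PRF built from HMAC-SHA256; all names are ours.

import hmac, hashlib, os

BLOCK = 16  # n = 128 bits

def F(k: bytes, counter: int) -> bytes:
    """Pad block E_k(counter); HMAC-SHA256 (truncated) stands in for E_k."""
    return hmac.new(k, counter.to_bytes(16, "big"), hashlib.sha256).digest()[:BLOCK]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

class SimplifiedCTR:
    """Stateful, simplified CTR mode as in the text: c_i = m_i xor E_k(s + i)."""
    def __init__(self, k: bytes):
        self.k, self.s = k, 0          # s counts blocks processed so far
    def process(self, data: bytes) -> bytes:
        # encryption and decryption are the same XOR operation
        out = b""
        for i in range(0, len(data), BLOCK):
            self.s += 1
            out += xor(data[i:i+BLOCK], F(self.k, self.s))
        return out

k = os.urandom(16)
m = b"pay 100 USD to Bob; thanks Alice"       # two 16-byte blocks
c = SimplifiedCTR(k).process(m)
assert SimplifiedCTR(k).process(c) == m       # fresh, synchronized state at the receiver

Note that encryption and decryption are the same operation, and that sender and receiver must keep their counters synchronized - the price of avoiding per-block randomness.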
2.8.3 The Output-Feedback (OFB) Mode

We now proceed to discuss standard modes which provably ensure secure encryption, with randomization, for multiple-block messages - yet are more efficient compared to the PBR mode. We begin with the simple Output-Feedback (OFB) mode. In spite of its simplicity, this mode ensures provably-secure encryption - and requires the generation and exchange of only a single block of random bits, compared to one block of random bits per plaintext block as in PBR. The OFB mode is illustrated in Figs. 2.30 (encryption) and 2.31 (decryption).

[Figure 2.30: Output Feedback (OFB) mode encryption. Adapted from [218].]

[Figure 2.31: Output Feedback (OFB) mode decryption. Adapted from [218].]

OFB is a variant of the PRF-based stream cipher discussed in subsection 2.5.1 and illustrated in Fig. 2.17, and, like it, operates on input which consists of l blocks of n bits each. The difference is that OFB uses a PRP (block cipher) Ek instead of the PRF. We use a random Initialization Vector (IV) as a 'seed' to generate a long sequence of pseudo-random n-bit pad blocks, pad1, . . . , padl, to encrypt the plaintext blocks m1, . . . , ml. We then compute the bitwise XOR of the pad blocks pad1, . . . , padl with the corresponding plaintext blocks m1, . . . , ml, resulting in a ciphertext which consists of the random IV c0 and the results of the XOR operations, i.e., c1 = m1 ⊕ pad1, c2 = m2 ⊕ pad2, . . ..

Let us now define OFB.Enc^E_k(m), the OFB mode for a given block cipher (E, D). For simplicity, we define OFB.Enc^E_k(m) for messages m which consist of some number l of n-bit blocks, i.e., m = m1 ++ . . . ++ ml, where (∀i ≤ l) |mi| = n. Then OFB.Enc^E_k(m) is defined as:

OFB.Enc^E_k(m1 ++ . . . ++ ml) = c0 ++ c1 ++ . . . ++ cl    (2.47)

where:

pad0 ←$ {0,1}^n    (2.48)
padi ← Ek(padi−1)    (2.49)
c0 ← pad0    (2.50)
ci ← padi ⊕ mi    (2.51)

Offline pad precomputation. The OFB mode allows both the encryption process and the decryption process to precompute the pad 'offline' - i.e., before the plaintext and ciphertext, respectively, are available. Offline pad precomputation is possible since the pad does not depend on the plaintext (or ciphertext). This can be important, e.g., when a CPU with limited computation speed needs to support a limited number of 'short bursts' without adding latency. Once the plaintext/ciphertext is available, we only need one XOR operation per block.

Parallelism. The pad is computed sequentially; there does not appear to be a way to speed up its computation using parallelism.

Error localization, correction and integrity. Since OFB operates as a bit-wise stream cipher, it is 1-localized (or perfectly localized): a change in any ciphertext bit simply causes a change in the corresponding plaintext bit - and in no other bit. The 'perfect bit error localization' property implies that an error correction and/or detection code can be applied either to the ciphertext or to the plaintext (before encryption, with correction/detection applied to the plaintext after decryption). Without localization, a single bit error in the ciphertext could translate to many bit errors in the plaintext; this implies that error correction, and, to a lesser degree, detection, should be applied to the ciphertext.

Encode-then-Encrypt considered harmful. Some designers prefer to apply error correction to the plaintext, and rely on the 'perfect bit error localization' property to allow recovery from the corresponding errors in the ciphertext. This motivated the use of OFB or a similar XOR-based stream cipher, allowing the application of an error detection code (EDC) or error correction code (ECC) to the plaintext; we refer to this as the Encode-then-Encrypt design. We now explain why this design could cause vulnerability (hence 'considered harmful'). Our discussion of Encode-then-Encrypt applies to both EDC and ECC; let us focus on ECC. An ECC has two functions, encode(·) and decode(·); the input domain of encode, denoted Dom, may be fixed-length strings (block code) or variable-length strings (convolutional code). All ECC codes must satisfy the basic correctness property: for every input string m ∈ Dom holds m = decode(encode(m)). An ECC should ensure the noise-correction property, i.e., it should ensure recovery from some set of possible errors in encoded messages; often, an ECC also allows detection of some additional errors.
For example, one classical noise model is Hamming errors, which are simply bit-flips; these errors are defined using the Hamming distance. The Hamming distance H(x, y) between two equal-length binary strings (|x| = |y|, x, y ∈ {0,1}∗) is the number of bits on which they differ, i.e.:

(∀l ∈ N, x, y ∈ {0,1}^l)  H(x, y) ≡ |{i : x[i] ≠ y[i]}|    (2.52)

An ECC ensures correction of tc bit-flip errors, or corrects errors up to Hamming distance tc, if for every message m ∈ Dom and every binary string y ∈ {0,1}∗:

H(y, encode(m)) ≤ tc ⇒ m = decode(y)    (2.53)

Similarly, an ECC (or EDC) ensures detection of td bit-flip errors, or detects errors up to Hamming distance td, if for every message m ∈ Dom and every binary string y ∈ {0,1}∗:

H(y, encode(m)) ≤ td ⇒ decode(y) ∈ {m, False}    (2.54)

The classical Hamming code allows correction of a single bit-flip error, and detection of two bit-flip errors, i.e., tc = 1 and td = 2.

However, applying error correction or detection codes to the plaintext, and relying on perfect bit error localization to allow error detection/correction for the ciphertext, is not recommended. The reason is that such codes create structured redundancy in the plaintext, which may facilitate CTO attacks. Let us give an example.

Example 2.8 (CTO attack on GSM). The Ciphertext-Only (CTO) attack on the A5/1 and A5/2 stream ciphers [26], which are defined as part of the GSM protocol, exploits the known relationship between ciphertext bits. This known relationship is due to the fact that, in the GSM protocol, an Error Correction Code is applied to the plaintext before encryption. This redundancy suffices to attack the ciphers, using techniques that can normally be applied only in known-plaintext attacks. Unfortunately, the complete details of this beautiful and important result are beyond our scope; for details, see [26]. As a result of this attack, the use of the (weaker) A5/2 cipher was completely discontinued, and the use of A5/1 is not recommended. However, the GSM protocol still applies Encode-then-Encrypt, facilitating CTO cryptanalysis attacks.

One may wonder: why would designers prefer to apply error correction to the plaintext rather than to the ciphertext? One motivation may be the hope that this makes cryptanalysis harder, e.g., by corrupting some plaintext statistics such as letter frequencies. This may hold for some codes; but it is better to design such defenses explicitly into the cryptosystem, and not to rely on such fuzzy properties of the encoding. Another motivation may be the hope that applying error correction/detection to the plaintext provides integrity. Note that, due to the perfect bit error localization of OFB, an attacker can easily flip a specific plaintext bit - by flipping the corresponding ciphertext bit. If we applied error detection to the plaintext, then corruption of a single bit would corrupt the entire plaintext. However, since the attacker can flip multiple ciphertext bits, thereby flipping the corresponding plaintext bits, there are cases where the attacker can modify the ciphertext in such a way as to flip specific bits in the plaintext while also 'fixing' the error detection/correction code, making the message appear correct. We conclude with the following principle.

Principle 9 (Minimize plaintext redundancy). Plaintext should preferably have minimal redundancy. In particular, plaintext should preferably not contain error correction or detection codes.
Namely, applying error correction to the plaintext is a bad idea - certainly when using a stream-cipher design such as OFB. This raises the obvious question: can an encryption mode of a block cipher also protect the integrity of the decrypted plaintext? Both of the following modes, CFB and CBC, provide a limited defense of integrity - by ensuring that errors do propagate.

Provable security of OFB. The weaknesses discussed above are due to incorrect deployments of OFB; correctly used, OFB is secure. Proving the security of OFB follows along similar lines to Theorem 2.1, except that in order to deal with multi-block messages, we need a more elaborate proof technique called a 'hybrid argument'; we leave that for courses and books focusing on cryptology, e.g., [166, 370].

2.8.4 The Cipher Feedback (CFB) Mode

We now present the Cipher Feedback (CFB) mode. Like most standard modes, it uses a random first block (the 'initialization vector', IV). In fact, CFB resembles OFB; the IV is also the first block of the ciphertext (c0 = IV). Then, iteratively, each ciphertext block ci is used to generate the following pseudo-random pad block padi+1 = Ek(ci); note that there is no pad0 (as c0 is simply the IV). Finally, the next ciphertext block, ci+1, is computed as the bitwise XOR of the corresponding pseudorandom pad block padi+1 and the corresponding plaintext block mi+1.

Namely, we define CFB.Enc^E_k(m), the CFB mode for a given block cipher (E, D), as follows. For simplicity, we define CFB.Enc^E_k(m) for messages m which consist of some number l of n-bit blocks, i.e., m = m1 ++ . . . ++ ml, where (∀i ≤ l) |mi| = n. The ciphertext CFB.Enc^E_k(m) consists of l + 1 blocks c0 ++ c1 ++ . . . ++ cl, defined by:

CFB.Enc^E_k(m) ≡ c0 ++ c1 ++ . . . ++ cl, where:
c0 = IV ←$ {0,1}^n
ci = mi ⊕ Ek(ci−1)   (for i ∈ {1, . . . , l})    (2.55)

Note that the difference between CFB and OFB is in the 'feedback' mechanism, namely, the computation of the pads padi (for i > 1): in CFB mode, this is done using the previous ciphertext block rather than the previous pad. See Fig. 2.32.

[Figure 2.32: Cipher Feedback (CFB) mode encryption. Adapted from [218].]

[Figure 2.33: Cipher Feedback (CFB) mode decryption. Adapted from [218].]

Optimizing implementations: parallel decryption, but no precomputation. Unlike OFB, the CFB mode does not support offline precomputation of the pad, since the pad depends on the ciphertext (of the previous block). One optimization that is possible is to parallelize the decryption operation. Namely, decryption may be performed for all blocks in parallel, since the decryption mi of block i is mi = ci ⊕ padi = ci ⊕ Ek(ci−1), i.e., it can be computed based only on the ciphertexts of this block and of the previous block.

Error localization and integrity. Error localization in CFB is not perfect; a single bit error in one ciphertext block completely corrupts the following plaintext block. As we discussed for OFB, this reduction in error localization may be viewed as an advantage for ensuring integrity. Like OFB mode, the CFB mode allows the attacker to flip specific bits in the decrypted plaintext, by flipping the corresponding bits in the ciphertext.
However, as a result of such bit flipping, say in block i, the decrypted plaintext of the following block is completely corrupted. Intuitively, this implies that applying an error-detection code to the plaintext would allow detection of such changes, in contrast to the situation with OFB mode. However, this dependency on an error detection code applied to the plaintext raises some concerns. First, it is an assumption about the way CFB is used; can we provide some defense of integrity that does not depend on such additional mechanisms as an error detection code? Second, it seems challenging to prove that the above intuition is really correct, and this is likely to depend on the specifics of the error detection code used. Finally, adding an error detection code to the plaintext increases its redundancy, in contradiction to Principle 9. We next present the CBC mode, which provides a different defense of integrity, addressing these concerns.

2.8.5 The Cipher-Block Chaining (CBC) mode

Among the modes of operation defined in [134], the most widely used, by far, is the Cipher-Block Chaining (CBC) mode. The CBC mode, like the OFB and CFB modes, uses a random Initialization Vector (IV) as the first block of the ciphertext, c0 ←$ {0,1}^n. However, in contrast to OFB and CFB, to encrypt the i-th plaintext block mi, CBC XORs mi with the previous ciphertext block ci−1, and then applies the block cipher. Namely, ci = Ek(ci−1 ⊕ mi). See Fig. 2.34.

[Figure 2.34: Cipher Block Chaining (CBC) mode encryption. Adapted from [218].]

[Figure 2.35: Cipher Block Chaining (CBC) mode decryption. Adapted from [218].]

More precisely, let (E, D) be a block cipher, and let m = m1 ++ . . . ++ ml be a message (broken into blocks). Then the CBC encryption of m using key k and initialization vector IV ∈ {0,1}^n is defined as:

CBC.Enc^E_k(m1 ++ . . . ++ ml) = c0 ++ c1 ++ . . . ++ cl    (2.56)

where:

c0 = IV ←$ {0,1}^n    (2.57)
ci ← Ek(ci−1 ⊕ mi)   (for i ∈ {1, . . . , l})    (2.58)

We see that the CBC mode, like CFB, allows parallel decryption, but not offline pad precomputation. The CBC mode, like the other modes (except ECB), ensures IND-CPA, i.e., security against CPA attacks, provided that the underlying block cipher is a secure invertible PRP; however, it is vulnerable to CCA attacks.

Exercise 2.23. Demonstrate that CBC mode does not ensure security against CCA attacks. Hint: the solution is quite similar to that of Exercise 2.22.

Incorrect-use vulnerability: BEAST exploit of predictable IV. While CBC mode ensures security in the sense of IND-CPA, i.e., against chosen-plaintext attacks, this is only true if CBC is used correctly; in particular, the IV (also used as c0) must be random. The SSL protocol, as well as version 1.0 of TLS, use CBC encryption in the following incorrect way: they select the IV randomly only for the first message m0 in a connection; for subsequent messages, say mi, the IV is simply the last ciphertext block of the previous message. This creates a vulnerability exploited, e.g., by the BEAST attack. For details, see subsection 7.2.4, Exercise 7.8 and [25, 132].
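The following is a minimal sketch (ours, not the text's) of CBC encryption and decryption per Equations 2.56-2.58, assuming the third-party pyca/cryptography package, with AES standing in for the block cipher (E, D); the input length is assumed to be a whole number of blocks (padding is the topic of Section 2.9), and all helper names are ours.

import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

BLOCK = 16

def E(k, block):   # one block-cipher call E_k(block)
    enc = Cipher(algorithms.AES(k), modes.ECB()).encryptor()
    return enc.update(block) + enc.finalize()

def D(k, block):   # the inverse permutation D_k(block)
    dec = Cipher(algorithms.AES(k), modes.ECB()).decryptor()
    return dec.update(block) + dec.finalize()

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_enc(k: bytes, m: bytes) -> bytes:
    c = [os.urandom(BLOCK)]                        # c0 = random IV
    for i in range(0, len(m), BLOCK):
        c.append(E(k, xor(c[-1], m[i:i+BLOCK])))   # c_i = E_k(c_{i-1} xor m_i)
    return b"".join(c)

def cbc_dec(k: bytes, c: bytes) -> bytes:
    blocks = [c[i:i+BLOCK] for i in range(0, len(c), BLOCK)]
    return b"".join(xor(D(k, blocks[i]), blocks[i-1])   # m_i = D_k(c_i) xor c_{i-1}
                    for i in range(1, len(blocks)))

k = os.urandom(16)
m = b"pay 100 USD to Bob; thanks Alice"   # two full blocks
assert cbc_dec(k, cbc_enc(k, m)) == m

In practice one would of course use the library's own CBC mode rather than re-implementing it; the sketch only makes the chaining explicit.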
Error propagation and integrity Any change in the CBC ciphertext, even of one bit, results in unpredictable output from the block cipher’s ‘decryption’ operation, and hence unpredictable decryption. Namely, ŕipping a bit in the ciphertext block i does not ŕip the corresponding bit in plaintext block i, as it did in the OFB and CFB modes. Applied Introduction to Cryptography and Cybersecurity 2.9. PADDING SCHEMES AND PADDING ORACLE ATTACKS 135 However, the ŕipping of a bit in the ciphertext block ci−1 , without change to block ci , results in the ŕipping of the corresponding bit in the ith decrypted plaintext block. Namely, bit-ŕipping is still possible in CBC, it is just a bit different - and in order to ŕip a bit in the decrypted-plaintext block i, the adversary has to ŕip the corresponding bit in the previous block (i − 1), which results in corruption of the decryption of block i − 1. Indeed, this kind of tampering is used in several attacks on systems deploying CBC, such as the Poodle attack [290]. Note also that bit ŕipping in the őrst decrypted-plaintext block only requires ŕipping of the corresponding IV block - and hence does not corrupt any plaintext block. 2.8.6 Modes of Operation Ensuring CCA Security? We already observed, in Exercise 2.22, that any cryptosystem (and mode) that ensures error localization to some extent cannot be IND-CCA secure. This implies that none of the modes we discussed is IND-CCA secure. Such failure can occur even for the much weaker - and more common - case of Feedback-only CCA attacks, where the attacker does not receive the decrypted plaintext, but only an indication of whether the plaintext was ‘valid’ or not. How can we ensure security against CCA attacks? One intuitive defense is to avoid giving any feedback on invalid-plaintext failures. However, this is harder than it may seem. For example, often, after (successful) decryption, a response is immediately sent, which may be hard to emulate when the plaintext is invalid - we may be even unable to identify the sender, e.g., if the sender identity is encrypted for anonymity. By observing if a response is sent, or the timing of the response, an attacker may obtain feedback on the attack. Such unintentional indications are referred to as side channels; for example, when the feedback is based on the time the response is sent, this is a timing side channel. A better approach may be to prevent response to chosen-ciphertext queries, without decrypting them. One simple way to do this is to authenticate the ciphertext, typically by appending to the ciphertext an authentication tag, which allows secure detection of any modiőcation in the ciphertext. Several of the more modern, widely used modes of operation, e.g., GCM [135, 343], combine authentication and encryption, with one beneőt being the protection against chosen-ciphertext attacks. Authentication is the subject of the next chapter. 2.9 Padding Schemes and Padding Oracle Attacks All modes of operation are deőned for input whose length is an integral number l of blocks. In most applications, the input may not be an integral number of blocks, but an string of arbitrary number of bits or, more commonly, of bytes. The principle of all padding schemes is quite simple. Before encryption, the plaintext is padded to an integral number of blocks, by appending a pad string to the message, which is removed after decryption. 
The length of the pad is between one byte and a whole block (l bytes), and is chosen to ensure that the padded message (message plus pad) őts in an integral number of blocks. Applied Introduction to Cryptography and Cybersecurity 136 CHAPTER 2. CONFIDENTIALITY: ENCRYPTION SCHEMES AND PSEUDO-RANDOMNESS Padding schemes mainly differ in the contents of the pad string, and in the validation of the pad after decryption. Two commonly used padding schemes applied to plaintext before shared-key encryption are: X9.23 padding: Several protocols, most notably the SSL protocol discussed in Chapter 7, use the following padding scheme, which we refer to as X9.23 padding, since it was deőned in the ANSI X9.23 standard [13]. In X9.23 padding, the last byte of the pad contains the length of the pad minus one, i.e., the length except this byte. The length of the pad is restricted17 to the block-length l, hence the value of the last byte must be a number between zero and l − 1. If the results of decryption does not end with a byte between zero and l − 1, then the ciphertext is considered to have invalid padding, and a padding error is returned. The other bytes of the X9.23 pad can have arbitrary values (e.g., all zeros, or random values). PKCS#5 padding: Other protocols, most notably the TLS protocol (also discussed in Chapter 7), use the following padding scheme, which we refer to as PKCS#5 padding. PKCS#5 padding is deőned in several standards including PKCS#5, PKCS#7 and RFC 5652 [206]. It is similar to ANSI X9.23 padding, the main difference being that all padding bytes must contain the same value as the last byte, i.e., the length of the pad minus one. Namely, a one-byte pad will contain the byte 00, a two-byte pad will contain pad 0101, and so on. If the results of decryption does not have this pattern, then a padding error is returned. Another difference is that PKCS#5 padding allows the pad to be longer than a single block (up to 256 bytes, to ensure that the pad length minus one can őt in one byte.). With both X9.23 and PKCS#5 padding schemes, the decrypted plaintext should end with a valid pad, which should be (efficiently) veriőed by the recipient. We say that ciphertext c has invalid pad, and return a padding error, if the decrypted plaintext has invalid pad. Consider an l-bytes block m, or a multi-block message whose last block is m. Let m[i], for i = 1, . . . , l denote the ith most-signiőcant byte of m. Block m has valid ANSI X9.23 padding if m[l] < l. For example, for blocks of l = 16 bytes, the four most-signiőcant bits of the last byte must be all zeroes, i.e., the last byte must be of the form 0x0ϕ in hexadecimal notation. In this case, ϕ can be any hexadecimal digit, from 0 to F , representing the corresponding four bits in binary. Similarly, plaintext string m has valid PKCS#5 padding if the value of the last byte of m, which we denote x = m [|m|], is also the value of preceding x − 1 bytes of m, i.e., m [(|m| − x) : |m|] = xx . Namely, the last x bytes of m contain the same value x. The reader may wonder why do we bother describing these two simple and very similar schemes. The reason is that padding error indications, which we 17 For convenience, we consider this restriction of pad length to one block to be a mandatory property of X9.23 padding, although some implementations may not enforce it. Applied Introduction to Cryptography and Cybersecurity 2.9. PADDING SCHEMES AND PADDING ORACLE ATTACKS 137 refer to as padding oracles, are exploited in many attacks. 
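Before turning to those attacks, the two pad-validity checks just described are easy to state in code. The following sketch (ours, not the text's) implements the variants as defined in this section - where the last byte holds the pad length minus one - for a 16-byte block; all names are ours.

BLOCK = 16

def pad_x923(m: bytes) -> bytes:
    padlen = BLOCK - (len(m) % BLOCK)              # 1..BLOCK bytes of pad
    return m + bytes(padlen - 1) + bytes([padlen - 1])

def valid_x923(padded: bytes) -> bool:
    # valid iff the last byte is between 0 and BLOCK-1
    return len(padded) > 0 and padded[-1] < BLOCK

def pad_pkcs5(m: bytes) -> bytes:
    padlen = BLOCK - (len(m) % BLOCK)
    return m + bytes([padlen - 1]) * padlen        # padlen copies of (padlen - 1)

def valid_pkcs5(padded: bytes) -> bool:
    v = padded[-1]
    return len(padded) >= v + 1 and padded[-(v + 1):] == bytes([v]) * (v + 1)

def unpad(padded: bytes) -> bytes:
    return padded[:-(padded[-1] + 1)]

m = b"attack at dawn"
assert unpad(pad_x923(m)) == m and valid_x923(pad_x923(m))
assert unpad(pad_pkcs5(m)) == m and valid_pkcs5(pad_pkcs5(m))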
In Chapter 7, we discuss practical padding oracle attacks against SSL and TLS. In this section, we describe the Padding Oracle Attack model, and then present a simple padding oracle attack against the ECB and CBC modes using X9.23 padding, extended in Exercise 2.24 to an attack against CBC mode using PKCS#5 padding.

The Padding Oracle Attack model. In many systems, an attacker may be able to detect when a message with invalid padding is received, by observing an explicit 'invalid pad' error message, or otherwise, e.g., by detecting the different behavior of the recipient due to the invalid pad, such as a different timing of the response; the latter is a special case of a timing side channel. In any case, we refer to this ability to detect the validity of the padding of the (decrypted) plaintext, for a given ciphertext, as a padding oracle.

Figure 2.36: The Padding Oracle Attack model. Alice pads and encrypts messages m_0, m_1, ..., i.e., c_i ← E_k(Pad(m_i)); the MitM attacker Mal injects modified ciphertexts c′_1, c′_2, ..., which Bob decrypts, m′_i ← D_k(c′_i); the oracle tells Mal whether each m′_i is well-padded.

We illustrate the basic Padding Oracle Attack model in Figure 2.36; this is basically a CTO attack with the addition of the padding oracle capability. Note that, following Kerckhoffs' principle (Principle 2), we assume that the attacker knows the details of the padding scheme in use, as well as other aspects of the system.

Of course, the padding oracle capability may also complement other attacker capabilities. In particular, in subsection 7.2.2, we present the CPA-Oracle Attack model, where the attacker also has chosen-plaintext capability. The CPA-Oracle Attack model is quite realistic, although it is more powerful than the Padding Oracle Attack model. Indeed, the CPA-Oracle Attack model is often used for security evaluation of practical protocols. In particular, in subsection 7.2.2 we discuss practical attacks against different versions of the SSL and TLS protocols, using the CPA-Oracle Attack model.

Simple padding-oracle attacks. Let us present simple padding oracle attacks against the ECB and CBC modes of operation, when using X9.23 padding. Assume, for example, that we use blocks of 16 bytes. Hence, the length of the pad is between one and 16 bytes, and the encoded value in the last byte should be 0x0ϕ in hexadecimal notation (where ϕ is one hexadecimal digit). Some block ciphers use shorter blocks (e.g., 8 bytes for DES), which requires a tiny change in the attack and slightly increases the exposure due to the attack.

Padding-oracle attack on X9.23 padding using ECB mode. Consider a plaintext message m = m_1 ∥ ... ∥ m_n, containing n-1 full blocks and one empty or non-full block m_n. In ECB mode, each non-final plaintext block m_i (i.e., i < n) is encrypted directly as c_i = E_k(m_i). The final plaintext block m_n is padded before encryption, i.e., encrypted as c_n = E_k(pad(m_n)), where pad(·) is the X9.23 padding function.

The attack can be applied to any block of the ciphertext, except the last block; let us focus on some block c_i where i < n. Instead of sending c_i as an intermediate block of a longer ciphertext message, the attacker sends c_i as if it were the last block of a ciphertext message, e.g., a ciphertext consisting only of this single block c_i. (If the ciphertext should have multiple blocks, prepend some blocks before c_i; we only need c_i to be the last block of the ciphertext.)
Following the Padding Oracle Attack model, the attacker receives an indication of whether the decryption of c_i, i.e., m_i = D_k(c_i), has valid padding or not. If m_i has valid padding, then the value of its last byte must be less than the block length, l = 16; namely, the last byte of m_i must be of the form 0x0ϕ (in hexadecimal notation), i.e., its four most significant bits must be zeros. This is an exposure of (limited) information about these four bits of the plaintext, i.e., an indication of whether these four bits are all zeros or not.

Padding-oracle attack on X9.23 padding using CBC mode. This attack, like the one on ECB mode, is applicable to every non-last ciphertext block c_i. Again, the basic idea is to send c_i as the last block of ciphertext messages to the recipient, and learn information about the value of a few bits of plaintext m_i, using the response of the padding oracle.

Specifically, let m_i[j] denote the j-th bit of m_i, where l now denotes the number of bits in each block, i.e., m_i = m_i[1] ∥ ... ∥ m_i[l]. We show how, when using X9.23 padding, the attack finds the four plaintext bits m_i[j] for j = l-7, l-6, l-5 and l-4. This requires only access to c_i and to the padding oracle for CBC mode. Exercise 2.24 extends the attack to PKCS#5 padding, where the attack finds the entire last byte of m_i.

Recall that in CBC mode, the ciphertext block c_i is computed as c_i = E_k(c_{i-1} ⊕ m_i). The attacker now sends ciphertexts c′ containing i blocks, where c′_i = c_i, using different previous ciphertext blocks c′_{i-1}. The value of c′_{i-1} may be different from the value of c_{i-1}, the original previous ciphertext block. (The other blocks of c′ do not have any impact on the attack and can even be eliminated, but to simplify notation, assume c′ contains at least i blocks.) Let m′_i denote the last decrypted plaintext block; since we use CBC, it is computed as:

m′_i = D_k(c_i) ⊕ c′_{i-1} = (m_i ⊕ c_{i-1}) ⊕ c′_{i-1}   (2.59)

The attacker has (only) access to the padding oracle, which indicates whether the last block of the decrypted message, m′_i, has correct padding or not. This depends only on the value of the last byte of m′_i; correct padding requires that the last byte of m′_i be of the form 0x0ϕ, i.e., its four most significant bits should all be zeros. In other words, m′_i[l-7 : l-4] = 0000 (in binary).

Recall that m′_i = D_k(c_i) ⊕ c′_{i-1}; therefore, the attacker can try the 16 different values of the four most significant bits of the last byte of c′_{i-1}, until it finds a value of c′_{i-1} that results in correct padding, i.e., in a plaintext m′_i whose last byte is of the form 0x0ϕ, i.e., m′_i[l-7 : l-4] = 0000. Hence, from Equation 2.59, we have:

m_i[l-7 : l-4] = c_{i-1}[l-7 : l-4] ⊕ c′_{i-1}[l-7 : l-4]   (2.60)

The reader may have noticed that the attack exploits the fact that X9.23 padding limits the pad length to one block. PKCS#5 padding does not make this restriction, which may seem to make it secure against such a padding oracle attack. However, in fact, Exercise 2.24 extends the attack to the case of PKCS#5 padding. The extended attack requires more padding-oracle queries, but is also more rewarding, as it allows recovery of the entire plaintext.
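The following is a minimal Python sketch of this CBC attack (Equations 2.59-2.60), recovering the four most-significant bits of the last byte of m_i. The function name and the oracle interface padding_oracle(iv, block), returning True when the single-block ciphertext decrypts to a message with a valid X9.23 pad, are hypothetical stand-ins for whatever feedback channel the attacker actually observes.

    def recover_top_nibble(c_prev: bytes, c_i: bytes, padding_oracle) -> int:
        # Recover the 4 most-significant bits of the last byte of plaintext block
        # m_i, given ciphertext blocks c_{i-1} and c_i and a padding oracle.
        for guess in range(16):          # 16 candidate values for the top nibble
            forged = bytearray(c_prev)
            forged[-1] = (guess << 4) | (c_prev[-1] & 0x0F)
            # send the one-block ciphertext c_i with c'_{i-1} (= forged) as the IV;
            # then m'_i = D_k(c_i) XOR forged = m_i XOR c_{i-1} XOR forged  (Eq. 2.59)
            if padding_oracle(bytes(forged), c_i):
                # a valid X9.23 pad means the top nibble of m'_i's last byte is 0, so
                # m_i[l-7:l-4] = c_{i-1}[l-7:l-4] XOR c'_{i-1}[l-7:l-4]     (Eq. 2.60)
                return (c_prev[-1] ^ forged[-1]) >> 4
        raise RuntimeError("padding oracle never reported a valid pad")

Exactly one of the 16 guesses should make the oracle report a valid pad, since the mapping from the guessed nibble to the top nibble of m′_i's last byte is a bijection.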
Exercise 2.24. Present a padding oracle attack, assuming the availability of a padding oracle and the use of PKCS#5 padding. The attack should find the entire last byte of a non-last plaintext block m_i, given only the corresponding ciphertext block c_i, the previous ciphertext block c_{i-1}, and access to the padding oracle.

Hint: A random plaintext block would have valid padding if its last byte contains 0x00, i.e., the entire pad is this single last byte; other random plaintext blocks are unlikely to have valid padding (why?). Use this to find the last byte of m_i. Once this byte is found, repeat a similar logic to find the preceding byte of m_i, using the fact that a valid random plaintext whose last byte contains 0x01 would also have 0x01 in the preceding byte. We then proceed similarly to find all plaintext bytes of m_i. The reader may also find the solution to this exercise by reading [380].

Additional padding oracle attacks, focusing on the SSL and TLS protocols, are described in subsection 7.2.3.

2.10 Case study: the (in)security of WEP

We conclude this chapter, and further motivate the next, by discussing a case study: vulnerabilities of the Wired Equivalent Privacy (WEP) standard [103]. WEP was developed as part of the IEEE 802.11b standard, to provide some protection of data sent over wireless local area networks (also known as WiFi networks). As the name implies, the original goals aimed at a limited level of privacy (meaning confidentiality), which was deemed 'equivalent' to the (physically limited) security offered by a wired connection.

These critical vulnerabilities were discovered long ago, mostly in [77], relatively soon after the standard was published; yet, products and networks supporting WEP still exist. This is an example of the fact that once a standard is published and adopted, it is often very difficult to fix its security. Hence, it is important to carefully evaluate security in advance, in an open process that encourages researchers to find vulnerabilities, and, where possible, with proofs of security. To address these vulnerabilities, WEP was replaced, possibly in too much haste, with a new standard, the Wi-Fi Protected Access (WPA), which so far has three versions (WPA1, WPA2, WPA3). Vulnerabilities were also found in these, e.g., see [378, 379], but these are more subtle and harder to exploit; WEP should only be used for educational purposes, as we do here.

WEP assumes a symmetric key shared between the mobile device and an access point, which is used as the key (seed) for the RC4 cipher. WEP networks share the same key among all mobiles; this means that every device which has the key can eavesdrop on all communication. This vulnerability also exists in the common use of more advanced WiFi security protocols, such as WPA (versions 1 to 3). We discuss additional vulnerabilities which are specific to WEP, and make it insecure even against an attacker that is not given the key to connect to the network.

Confidentiality in WEP is protected using the RC4 PRG, used as a stream cipher as in subsection 2.5.1. RC4 is initialized with a secret shared key, which, in WEP, is specified to be only 40 bits long. This short key size was chosen intentionally, to allow export of the hardware, since when the standard was drafted, the United States and many other countries had export limitations on strong cryptography, which necessarily uses longer keys.
Many WEP implementations also support longer, 104-bit keys for RC4; however, published attacks show that even with 104-bit keys, RC4 is still vulnerable; see subsection 2.5.6, subsection 7.2.5 and [11, 235, 276].

The WEP PRG is initialized with a 24-bit per-packet random Initialization Vector (IV). We use RC4_{IV,k} to denote the string output by RC4 when initialized using a given IV, k pair. More specifically, we use RC4_{IV,k}[l] to denote the first l bits of RC4_{IV,k}, i.e., of the output of RC4 when initialized using the given IV, k pair.

WEP packets use the CRC-32 error detection code, computed over the plaintext message m. CRC-32 [240] is one of the standard variants of the CRC cyclic redundancy check code. CRC codes are popular error-detecting codes (EDC); they are simple to implement and efficiently detect errors in data caused by random noise corruptions. Namely, if m′ is the result of such random corruption of m, then, with high probability, m and m′ will have different CRC codes, i.e., CRC(m) ≠ CRC(m′), allowing detection of the corruption by comparing their CRC codes. Note that, for simplicity, and since we do not discuss other CRC codes, we use CRC(m) to denote the CRC-32 code computed over message m.

CRC codes, and in particular CRC-32, are linear, in the sense that for any two strings m, m′ ∈ {0,1}* of equal length (|m| = |m′|), holds:

CRC(m ⊕ m′) = CRC(m) ⊕ CRC(m′)   (if |m| = |m′|)   (2.61)

WEP uses CRC as follows. To send a message m using secret key k, WEP implementations select a random 24-bit IV, and transmit the IV together with WEP_k(m, IV), defined as:

WEP_k(m, IV) ≡ RC4_{IV,k}[32 + |m|] ⊕ (m ∥ CRC(m))   (2.62)

The length of the WEP transmission is, therefore, the length of the message m, plus 56 bits: 24 bits for the IV and 32 bits for the CRC-32 code.

2.10.1 CRC-then-XOR does not ensure integrity

CRC-32 is a quite good error detection code. By encrypting the output of CRC, specifically by XORing it with the pseudo-random pad generated by RC4, the WEP designers hoped to protect message integrity, i.e., not only detect random corruptions, but also prevent intentional modification or forgery of messages. However, error-detection codes are designed to detect random, not intentional, corruptions; i.e., for reliability, not for security. In particular, it is easy to find a collision, i.e., messages m and m′ such that m ≠ m′ yet CRC(m) = CRC(m′).

Furthermore, we next show how the linearity of the CRC (Equation 2.61) allows an attacker to change the message m sent in a WEP packet, by flipping any desired bits and appropriately adjusting the CRC field. Specifically, let ∆ represent the string of length |m| containing 1 in the bit locations that the attacker wishes to flip. Having eavesdropped and obtained WEP_k(m, IV), the attacker can compute a valid WEP_k(m ⊕ ∆, IV) as follows:

WEP_k(m ⊕ ∆, IV) = RC4_{IV,k}[32 + |m ⊕ ∆|] ⊕ ([m ⊕ ∆] ∥ CRC(m ⊕ ∆))
                 = RC4_{IV,k}[32 + |m|] ⊕ ([m ⊕ ∆] ∥ [CRC(m) ⊕ CRC(∆)])
                 = RC4_{IV,k}[32 + |m|] ⊕ (m ∥ CRC(m)) ⊕ (∆ ∥ CRC(∆))
                 = WEP_k(m, IV) ⊕ (∆ ∥ CRC(∆))

Namely, the CRC mechanism, XOR-encrypted, does not provide any meaningful integrity protection. An attacker can easily flip bits in a WEP message and have it properly received.
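The following is a minimal Python sketch of this forgery. It uses a simplified, purely linear CRC-32 (zero initial value and no final XOR) so that Equation 2.61 holds exactly; the real CRC-32 used by WEP is affine rather than strictly linear, and the same attack applies with a minor adjustment. The rc4_keystream helper is a textbook RC4 included only to make the sketch self-contained, and the example messages are illustrative.

    import os

    def rc4_keystream(seed: bytes, nbytes: int) -> bytes:
        # Textbook RC4: key scheduling (KSA) followed by output generation (PRGA).
        S = list(range(256))
        j = 0
        for i in range(256):
            j = (j + S[i] + seed[i % len(seed)]) % 256
            S[i], S[j] = S[j], S[i]
        out, i, j = bytearray(), 0, 0
        for _ in range(nbytes):
            i = (i + 1) % 256
            j = (j + S[i]) % 256
            S[i], S[j] = S[j], S[i]
            out.append(S[(S[i] + S[j]) % 256])
        return bytes(out)

    def crc32_linear(data: bytes) -> bytes:
        # Simplified CRC-32 (initial value 0, no final XOR): a linear map over GF(2),
        # so crc(a XOR b) = crc(a) XOR crc(b) for equal-length strings (Eq. 2.61).
        crc = 0
        for byte in data:
            crc ^= byte
            for _ in range(8):
                crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
        return crc.to_bytes(4, "big")

    def wep_encrypt(k: bytes, iv: bytes, m: bytes) -> bytes:
        # WEP_k(m, IV) = RC4_{IV,k}[32+|m|] XOR (m || CRC(m))          (Eq. 2.62)
        body = m + crc32_linear(m)
        return bytes(a ^ b for a, b in zip(body, rc4_keystream(iv + k, len(body))))

    def flip_bits(ciphertext: bytes, delta: bytes) -> bytes:
        # Forge WEP_k(m XOR delta, IV) from WEP_k(m, IV), without knowing k.
        adjust = delta + crc32_linear(delta)
        return bytes(a ^ b for a, b in zip(ciphertext, adjust))

    # demonstration: flip '$0001' into '$9999' in a captured packet
    k, iv = os.urandom(5), os.urandom(3)          # 40-bit key, 24-bit IV
    m = b"PAY $0001 TO BOB"
    target = b"PAY $9999 TO BOB"
    delta = bytes(a ^ b for a, b in zip(m, target))
    forged = flip_bits(wep_encrypt(k, iv, m), delta)
    assert forged == wep_encrypt(k, iv, target)   # receiver accepts the forged packet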
WEP authentication-based vulnerabilities. We have just seen that WEP fails to provide integrity; however, as we now show, WEP also fails to ensure confidentiality. Interestingly, the most devastating vulnerability, which is the one we show, takes advantage of WEP's shared-key authentication mode; WEP also defines a mode called open-system authentication, which simply means that there is no authentication, and is therefore not vulnerable to this specific attack. Even when using open-system authentication, i.e., giving up on authentication, WEP is still vulnerable to other cryptanalysis attacks exploiting RC4 weaknesses, e.g., see [276].

However, we focus on the vulnerability of WEP when using the shared-key authentication mode. It works very simply: the access point sends a random challenge R, and the mobile sends back WEP_k(R, IV), i.e., a proper WEP packet containing the 'message' R. This authentication mode is currently rarely used, since it allows attacks on the encryption mechanism.

First, notice that it provides a trivial way for the attacker to obtain 'cribs' (known plaintext-ciphertext pairs). Of course, encryption systems should be protected against known-plaintext attacks; however, following the conservative design principle (Principle 3), system designers should try to make it difficult for attackers to obtain cribs. In the common, standard case of 40-bit WEP implementations, a crib is deadly: an attacker can now do a trivial exhaustive search to find the key.

Even when using longer keys (104 bits), shared-key authentication exposes WEP to a simple cryptanalysis attack. Specifically, since R is known, the attacker learns RC4_{IV,k} for a given, random IV. Since the length of the IV is just 24 bits, it is feasible to obtain a collection of most IV values and the corresponding RC4_{IV,k} pads, allowing decryption of most messages. As a result of these concerns, most WEP systems use only the open-system authentication mode, i.e., do not provide any authentication.

Further WEP encryption vulnerabilities. We briefly mention two further vulnerabilities of the WEP encryption mechanism. The first vulnerability exploits the integrity vulnerability discussed earlier. As explained there, the attacker can flip arbitrary bits in the WEP payload message. WEP is a link-layer protocol; the payload is usually an Internet Protocol (IP) packet, whose header contains, in a known position, the destination address. An attacker can change the destination address, causing forwarding of the packet directly to the attacker! The second vulnerability is the fact that WEP uses 'plain' RC4, which has been shown in [276] to be vulnerable.

2.11 Encryption: Final Words

Confidentiality, as provided by encryption, is the oldest goal of cryptology, and is still critical to the entire area of cybersecurity. Encryption has been studied for millennia, but for many years, the design of cryptosystems was kept secret, in the hope of improving security. Kerckhoffs' principle (Principle 2), however, has been widely adopted and caused cryptography to be widely studied and deployed, in industry and academia. Cryptography was further revolutionized by the introduction of precise definitions and proofs of security by reduction, based on the theory of complexity. In particular, the modern study of applied cryptography makes extensive use of provable security, especially computational security, i.e., ensuring security properties with high probability, against Probabilistic Polynomial Time (PPT) adversaries.
We have seen a small taste of such definitions and proofs in this chapter; we will see a bit more later on, but for a real introduction to the theory of cryptography, see appropriate textbooks, such as [165, 166].

2.12 Lab and Additional Exercises

Lab 2 (Ransomware and Encryption). In this lab, we explore the abuse of cryptography by ransomware. Ransomware encrypts the user's files, and requires the user to pay 'ransom', with the promise of sending back the decryption key or program.

As for the other labs in this textbook, we will provide Python scripts for generating and grading this lab (LabGen.py and LabGrade.py). If not yet posted online, professors may contact the author to receive the scripts. The lab-generation script generates random challenges for each student (or team), as well as solutions which will be used by the grading script. We recommend making the scripts available to the students, as an example of how to use the cryptographic functions. It is easy and permitted to modify these scripts to use other languages/libraries, or to modify and customize them as desired. The lab has two parts.

1. In this part, you are given a ransomware program, R1.py, and your task is to reverse-engineer and break it, i.e., decrypt the files without paying ransom. You will be able to do it, since R1.py is a simple Python program using a shared-key cryptosystem, specifically, the AES block cipher in CBC mode; see Section 2.6. In the next part, we will discuss more realistic ransomware, which uses public-key encryption rather than shared-key encryption, making it infeasible to recover the files without paying ransom by reverse-engineering the ransomware.

The ransomware R1.py has two outputs for each input file, say example.txt: its encryption, example.txt.enc, and a token, example.txt.token, to be sent with the ransom payment. The token is needed since R1.py selects a different random shared key to encrypt each file, e.g., example.txt; the attacker uses example.txt.token to find the decryption key.

Input: The 'weak ransomware program', given conveniently (and unrealistically) as a Python script, R1.py, the encrypted file example.txt.enc and the token example.txt.token.

Goal: Reverse-engineer R1.py and then, using the token example.txt.token, recover the original file, example.txt.

Submission: The recovered example.txt file and your program, A1.py, that produced it (given example.txt.enc and example.txt.token as input).

Note: Before encryption, the plaintext (example.txt) was padded so that its length would be a whole number of 'blocks'. The padding has to be removed after applying the block cipher's decryption function; see the sketch below.
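For reference, the following is a minimal sketch of decrypting an AES-CBC ciphertext and removing the padding, assuming the PyCryptodome package; the file layout (IV prepended to the ciphertext) and PKCS#7-style padding are assumptions made for illustration only. Check what R1.py actually does when reverse-engineering it.

    from Crypto.Cipher import AES
    from Crypto.Util.Padding import unpad

    def decrypt_file(key: bytes, data: bytes) -> bytes:
        iv, ciphertext = data[:16], data[16:]      # assumed layout: IV || ciphertext
        cipher = AES.new(key, AES.MODE_CBC, iv)
        padded = cipher.decrypt(ciphertext)
        return unpad(padded, AES.block_size)       # remove the padding after decryption

    # usage, once the per-file key has been recovered from example.txt.token:
    #   with open("example.txt.enc", "rb") as f:
    #       plaintext = decrypt_file(key, f.read())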
2. In this part, you develop 'strong' ransomware, using public-key (asymmetric) encryption. You will develop and submit the following programs:

a) A key-generation program KG2.py, that outputs a keypair of a public encryption key e and a private decryption key d.

b) A ransomware program R2.py, which uses the encryption key e and outputs, for each input file example.txt, two files: its encryption, example.txt.enc, and a payment token, example.txt.pay, to be sent with the ransom payment.

c) A token-processing program TP2.py, which uses the private decryption key d, and outputs, for each input payment token example.txt.pay, a decryption token example.txt.d.

d) A decryption program D2.py, using the decryption token example.txt.d, and recovering the plaintext example.txt given its encryption, example.txt.enc, as input.

Exercise 2.25. Table 2.6 shows eight ciphertexts, all using one of the four simple substitution ciphers identified in the top four rows. Decipher the ciphertexts, measuring the time it took you to decipher each of them. Fill in the blanks in the table: the plaintexts, the time it took you to decipher each message, and the ciphers and key (when relevant). Did the knowledge of the cipher significantly ease the cryptanalysis process? Did the key?

Identifier | Cipher       | Ciphertext     | Plaintext and key | Time
A          | Caesar       | JUHDW SDUWB    |                   |
B          | AzBy         | ILFMW GZYOV    |                   |
C          | ROT13        | NYBAR NTNVA    |                   |
D          | Keyed Caesar | XRORT SQUIQH   |                   |
E          |              | BLFMT OLEVI    |                   |
F          |              | LGTIE JXKYY    |                   |
G          |              | FZNEG UBHFR    |                   |
H          |              | EUHDN UXOHV    |                   |

Table 2.6: Ciphertexts for Exercise 2.25. All plaintexts are pairs of two simple five to six letter words. The four upper examples have the cipher spelled out; the four lower examples hide it ('obscurity'). It does not make them secure, but decryption may take a bit longer.

Exercise 2.26. ConCrypt Inc. announces a new symmetric encryption scheme, CES. ConCrypt announces that CES uses 128-bit keys, is five times faster than AES, and is the first practical cipher to be secure against computationally-unbounded attackers. Is there any method, process or experiment to validate or invalidate these claims? Describe one, or explain why not.

Exercise 2.27. ConCrypt Inc. announces a new symmetric encryption scheme, CES512. ConCrypt announces that CES512 uses 512-bit keys, and as a result, is proven to be much more secure than AES. Can you point out any concerns with using CES512 instead of AES?

Exercise 2.28. Compare the following pairs of attack models. For each pair (A, B), state whether every cryptosystem secure under attack model A is also secure under attack model B, and vice versa. Prove (or, if you fail to prove, at least give a compelling argument for) your answers. The pairs are:

1. (Ciphertext only, Known plaintext)
2. (Known plaintext, Chosen plaintext)
3. (Known plaintext, Chosen ciphertext)
4. (Chosen plaintext, Chosen ciphertext)

Exercise 2.29. Alice is communicating using the GSM cellular standard, which encrypts all calls between her phone and the access tower. Identify the attacker model corresponding to each of the following cryptanalysis attack scenarios:

1. Assume that Alice and the tower use a different shared key for each call, and that Eve knows that a specific, known message is sent from Bob to Alice at given times.
2. Assume (only) that Alice and the tower use a different shared key for each call.
3. Assume all calls are encrypted using a (fixed) secret key kA shared between Alice's phone and the tower, and that Eve knows that specific, known control messages are sent, encrypted, at given times.
4. Assume (only) that all calls are encrypted using a (fixed) secret key kA shared between Alice's phone and the tower.

Exercise 2.30. We covered several encryption schemes in this chapter, including At-Bash (AzBy), Caesar, shift cipher, general monoalphabetic substitution, OTP, PRG-based stream cipher, RC4, block ciphers, and the 'modes' in Table 2.5. Which of these is: (1) stateful, (2) randomized, (3) FIL, (4) polynomial-time?
Exercise 2.31. Consider the use of AES with key length of 256 bits and block length of 128 bits, for two different 128-bit messages, A and B (i.e., one block each). Bound, or compute precisely if possible, the probability that the encryption of A will be identical to the encryption of B, in each of the following scenarios:

1. Both messages are encrypted with the same randomly-chosen key, using ECB mode.
2. Both messages are encrypted with two keys, each of which is chosen randomly and independently, using ECB mode.
3. Both messages are encrypted with the same randomly-chosen key, using CBC mode.
4. Compute now the probability that the same message is encrypted to the same ciphertext, using a randomly-chosen key and CBC mode.

Exercise 2.32. Present a very efficient CPA attack on the mono-alphabetic substitution cipher, which allows complete recovery of arbitrary messages, using the encryption of one short plaintext string.

Exercise 2.33 (PRG constructions). Let G : {0,1}^n → {0,1}^{n+1} be a secure PRG. Is G′, as defined in each of the following items, a secure PRG? Prove.

1. G′(s) = G(s^R), where s^R means the reverse of s.
2. G′(r ∥ s) = r ∥ G(s), where r, s ∈ {0,1}^n.
3. G′(s) = G(s ⊕ G(s)_{1...n}), where G(s)_{1...n} are the n most-significant bits of G(s).
4. G′(s) = G(π(s)), where π is a (fixed) permutation.
5. G′(s) = G(s + 1).
6. (harder!) G′(s) = G(s ⊕ s^R).

A. Solution to G′(r ∥ s) = r ∥ G(s):

B. Solution to G′(s) = G(s ⊕ G(s)_{1...n}): may not be a PRG. For example, let g be a PRG from any number m of bits to m+1 bits, i.e., the output is a pseudorandom string just one bit longer than the input. Assume even n; for every x ∈ {0,1}^{n/2} and y ∈ {0,1}^{n/2} ∪ {0,1}^{1+n/2}, let G(x ∥ y) = x ∥ g(y). If g is a PRG, then G is also a PRG (why?). However, when used in the above construction:

G′(x ∥ y) = G[(x ∥ y) ⊕ G(x ∥ y)_{1...n}]
          = G[(x ∥ y) ⊕ (x ∥ g(y)_{1...n/2})]
          = G[(x ⊕ x) ∥ (y ⊕ g(y)_{1...n/2})]
          = G[0^{n/2} ∥ (y ⊕ g(y)_{1...n/2})]
          = 0^{n/2} ∥ g(y ⊕ g(y)_{1...n/2})

As this output begins with n/2 zero bits, it can be trivially distinguished from random. Hence G′ is clearly not a PRG.

Exercise 2.34. Let G1, G2 : {0,1}^n → {0,1}^{2n} be two different candidate PRGs (over the same domain and range). Consider the function G defined in each of the following items. Is it a secure PRG, assuming both G1 and G2 are secure PRGs, or assuming only that one of them is a secure PRG? Prove.

1. G(s) = G1(s) ⊕ G2(s).
2. G(s) = G1(s) ⊕ G2(s ⊕ 1^{|s|}).
3. G(s) = G1(s) ⊕ G2(0^{|s|}).

Exercise 2.35. Let G : {0,1}^n → {0,1}^m be a secure PRG, where m > n.

1. Let m = n+1. Use G to construct a secure PRG G′ : {0,1}^n → {0,1}^{2n}.
2. Let m = 2n, and consider G′(x) = G(x) ∥ G(x+1). Is G′ a secure PRG?
3. Let m = 2n. Use G to construct a secure PRG G′ : {0,1}^n → {0,1}^{4n}.
4. Let m = 4n. Use G to construct a secure PRG Ĝ : {0,1}^n → {0,1}^{64n}.

Sketch of solution (to item 2): Assume we are given a PRG f(·) from n-1 bits to m bits. Next, define G(x) = f(x[0 : n-2]), i.e., G(x) returns the value of f() when applied to the string which is the same as x except for the removal of the least significant bit (LSb) of x, which we denote LSb(x). Since f is a PRG, G is also a PRG, although a less efficient one. However, for a random x, with probability half, LSb(x) = 0 holds and hence G(x) = G(x + 1).
Therefore G′(x) = G(x) ∥ G(x+1) is surely not a secure PRG.

Exercise 2.36. Let f, f′ be two functions from n-bit binary strings to n′-bit binary strings, i.e., f, f′ : {0,1}^n → {0,1}^{n′}.

1. Let m1 ≠ m2 ∈ {0,1}^n be two different random n-bit strings. Present the best upper and lower bounds you can for the probability that f(m1) = f(m2), assuming n > n′:

_____ ≤ Pr_{m1 ≠ m2 ← {0,1}^n} [f(m1) = f(m2)] ≤ _____

Justify your answer.

2. Repeat, when f is a random function from {0,1}^n to {0,1}^{n′}:

_____ ≤ Pr_{m1 ≠ m2 ← {0,1}^n} [f(m1) = f(m2)] ≤ _____

3. Repeat items 1 and 2 for the case n = n′.
4. Repeat items 1 and 2 for the case n′ > n.
5. Repeat items 1 and 2, for the probability that f(m1) = f′(m2). In item 2, only f is chosen randomly.

Exercise 2.37. Let fk be a (secure) Pseudo-Random Function (PRF) from n-bit binary strings to n′-bit binary strings, i.e., f : {0,1}* × {0,1}^n → {0,1}^{n′}. Assume that the key k is chosen randomly as a string of length l; the key length l is specified so you can reference it in your responses, but do so only if you find it relevant; otherwise, you may ignore it. Assume l is sufficiently large for the PRF to be secure.

1. Is it possible that n < n′? Is it possible that n′ < n? Explain.
2. Let m1 ≠ m2 ∈ {0,1}^n be two different random n-bit strings. Present the best upper and lower bounds you can for the probability that fk(m1) = fk(m2), assuming n < n′:

_____ ≤ Pr_{m1 ≠ m2 ← {0,1}^n} [fk(m1) = fk(m2)] ≤ _____

Justify your answer.

3. Repeat for the case n = n′.
4. Repeat for the case n > n′.
5. Compare your answers with Exercise A.9.

Exercise 2.38 (Ad-Hoc PRF competition project). In this exercise, you will experiment in trying to directly build a cryptographic scheme, in this case a PRF, as well as in trying to 'break' (cryptanalyze) it. This exercise is best done by multiple groups, with each group consisting of one or a few persons.

1. In the first phase, each group will design a PRF, whose input, key and output are all 64 bits long. The PRF should be written in Python (or some other agreed programming language), and only use the basic mathematical operations: table lookup, modular addition/subtraction/multiplication/division/remainder, XOR, max, min, and rotations. You may also use comparisons and conditional code. The length of your program should not exceed 400 characters, and it must be readable. You will also provide (separate) documentation.

2. All groups will be given the documentation and code of the PRFs of all other groups, and try to design programs to distinguish these PRFs from a random function (over the same input and output domains). A distinguisher is considered successful if it is able to distinguish in more than 1% of the runs.

Exercise 2.39. Let f be a secure Pseudo-Random Function (PRF) with n-bit keys, domain and range, and let k be a secret, random n-bit key. Derive from k, using f, two pseudorandom keys k1, k2, e.g., one for encryption and one for authentication. Each of the derived keys k1, k2 should be 2n bits long, i.e., twice the length of k. Note: the two keys should be independent, i.e., each of them (e.g., k1) should be pseudorandom, even if the adversary is given the other (e.g., k2).

1. k1 =
2. k2 =

Exercise 2.40 (PRF constructions). Let Fk(m) : {0,1}^n × {0,1}^n → {0,1}^n be a secure PRF. Is F̂, as defined in each of the following items, a secure PRF? Justify your answers.
Where the function is not a secure PRF, present the adversary that can distinguish between the function and a random function (as in Exercise 2.11). Where the function is a PRF, a precise proof as in Exercise 2.12 is best, but a good intuitive argument will also do.

1. F̂k(m) = Fk(m^R), where m^R means the reverse of m.
2. F̂k(mL ∥ mR) = Fk(mL) ∥ Fk(mR).
3. F̂k(mL ∥ mR) = (mL ⊕ Fk(mR)) ∥ (Fk(mL) ⊕ mR).
4. F̂k(m) = LSb(Fk(m)), where LSb returns the least-significant bit of the input.

Exercise 2.41 (Key dependent message security). Several works design cryptographic schemes, such as encryption schemes, which are secure against a 'key dependent message attack', where the attacker specifies a function f and receives the encryption Ek(f(k)), i.e., the encryption of the message f(k), where k is the secret key. See [66].

1. Extend the definition of a secure pseudo-random function for security against key-dependent message attacks.
2. Suppose that F is a secure pseudo-random function. Show a ('weird') function F′ which is also a secure pseudo-random function, but not secure against key-dependent message attacks.

Exercise 2.42 (KDM security). Several works design cryptographic schemes, such as encryption schemes, which are secure against a 'key dependent message attack'. Recall that in many cryptographic definitions, the attacker has access to one or more oracle functions, e.g., the chosen-plaintext oracle for encryption Ek(·) using secret key k. In a key-dependent message attack, the attacker has similar access to the oracle function, but instead of specifying directly the value of the input to the oracle, the attacker specifies a function f, and the oracle is applied to f(k), where k is the key of the relevant cryptographic function. For example, for a chosen-plaintext oracle for encryption using a shared key k, the attacker receives Ek(f(k)). See, e.g., [66].

1. Extend the definition of a secure pseudo-random function, to define a PRF scheme secure against key-dependent message attacks.
2. Repeat, for a block cipher (reversible PRP).
3. Suppose that F is a secure pseudo-random function. Show a ('weird') function F′ which is also a secure pseudo-random function, but not secure against key-dependent message attacks.
4. Extend the definition of IND-CPA secure encryption to allow for key-dependent message attacks.

Exercise 2.43 (Stateful PRG, ANSI X9.31 and the DUHK attack). ANSI X9.31 is a well-known stateful PRG design built using a block cipher E, illustrated in Fig. 2.37. In this exercise we investigate a weakness in it, presented in [231]; it was recently shown to be still relevant for some devices using this standard, in the so-called DUHK attack [100]. Our presentation is a slight simplification of the X9.31 design, but retains the important aspects of the attack. The stateful PRG is used in 'rounds', with the state of round i being the output from round i-1. Specifically, the X9.31 PRG works as follows, given the current state s_{i-1}, with s_0 selected randomly, and some 'timestamp' T_i. First, compute an 'internal' value, x_i = E_k(T_i). Then, output the values r_i = E_k(x_i ⊕ s_{i-1}) and s_i = E_k(r_i ⊕ x_i). See Fig. 2.37 and the sketch below.
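A minimal Python sketch of one such round may help make the construction concrete; it assumes PyCryptodome's AES as the block cipher E_k, and the 16-byte key, state and timestamp encodings are illustrative assumptions.

    from Crypto.Cipher import AES

    def x931_round(k: bytes, state: bytes, timestamp: bytes):
        # One round of the (simplified) X9.31 PRG; all values are one AES block (16 bytes).
        E = AES.new(k, AES.MODE_ECB).encrypt
        xor = lambda a, b: bytes(u ^ v for u, v in zip(a, b))
        x = E(timestamp)          # x_i = E_k(T_i)
        r = E(xor(x, state))      # r_i = E_k(x_i XOR s_{i-1}), the output block
        s = E(xor(r, x))          # s_i = E_k(r_i XOR x_i), the next state
        return r, s

Note that anyone who knows k can run this same round function, which is the starting point of the exercise below.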
The specification does not restrict the choice of k, and several implementations use a constant k as part of their code; assume, therefore, that k is known.

1. Assume T_i, k and s_{i-1} are known. Show how the attacker can find r_i and s_i.
2. Assume that the values {T_j} are known for all j, and that k, r_i are known (for a specific i). Show how an attacker can compute {s_j, r_j} for every j.

Figure 2.37: A single round of the ANSI X9.31 stateful pseudorandom generator (PRG), using block cipher E_k(x), e.g., AES: x_i = E_k(T_i), r_i = E_k(x_i ⊕ s_{i-1}), s_i = E_k(r_i ⊕ x_i). This figure was adapted from [100].

Exercise 2.44 (Cascade is not a robust combiner for PRFs). Let F′, F″ : {0,1}* × D → D be two polynomial-time computable functions, and let their cascade, denoted F ≡ F′ ∘ F″, be defined as:

F_{(k′,k″)}(x) ≡ F′_{k′} ∘ F″_{k″}(x) ≡ F′_{k′}(F″_{k″}(x))   (2.63)

Give an example of F′, F″ s.t. one of them is a PRF, yet their cascade F ≡ F′ ∘ F″ is not a PRF. This shows that cascade is not a robust combiner for PRFs.

Exercise 2.45. A message m of length 256 bytes is encrypted using a 128-bit block cipher, resulting in ciphertext c. During transmission, the 200th bit was flipped due to noise. Let c′ denote c with the 200th bit flipped, and m′ denote the result of decryption of c′.

1. Which bits in m′ would be identical to the bits in m, assuming the use of each of the following modes: (1) ECB, (2) CBC, (3) OFB, (4) CFB? Explain (preferably, with a diagram).
2. For each of the modes, specify which bits are predictable as a function of the bits of m and the known fact that the 200th bit was flipped.

Exercise 2.46. Consider a scenario where randomness is scarce, motivating attempts to design encryption schemes that use less randomization. Specifically, consider the following two variants of CBC mode, both using, per message, only twenty random bits, rather than n random bits (block size) as in standard CBC. Both variants are identical to CBC mode, except for the choice of c_0 (the initialization vector), which is as specified below; both use a twenty-bit random string r ←$ {0,1}^{20}. Show that neither variant suffices to ensure IND-CPA.

Append zeros: c_0 = r ∥ 0^{n-20}.
Pseudorandomly: c_0 = E_k(r).

Note: your solution may require up to a few million queries; just make sure the number of queries is polynomial in n.

Exercise 2.47 (Modes of operation: decryption and correctness). Table 2.5 specifies only the encryption process for each mode. Write the decryption process for each mode and show that correctness is satisfied.

Exercise 2.48. Hackme Bank protects money-transfer orders digitally sent between branches by encrypting them using a block cipher. Money-transfer orders have the following structure: m = f ∥ r ∥ t ∥ x ∥ y ∥ p, where f, r are each 20 bits, representing the payer (from) and the payee (recipient), t is a 32-bit field encoding the time, x is a 24-bit field representing the amount, y is a 128-bit comment field defined by the payer, and p is a 32-bit parity field, computed as the bitwise XOR of the preceding 32-bit words. Orders with incorrect parity, an outdated or repeating time field, or an unknown payer/payee are ignored. Mal captures a ciphertext message containing a money-transfer order of 1$ from Alice to his account. You may assume that Mal can 'trick' Alice into including a comment field y selected by Mal. Assume a 64-bit block cipher.
Can Mal cause the transfer of a larger amount to his account, and how, assuming the use of the following modes?

1. ECB
2. CBC
3. OFB
4. CFB

Solution: The first block contains f, r (20 bits each) and the top 24 bits of the time t; the second block contains the 8 remaining bits of the time, x (24 bits) and 32 bits of the comment; block three contains 64 bits of comment; and block four contains 32 bits of comment and the 32 parity bits. Denote these four plaintext blocks by m1 ∥ m2 ∥ m3 ∥ m4. Denote the ciphertext blocks captured by Mal as c0 ∥ c1 ∥ c2 ∥ c3 ∥ c4, where c0 is the IV.

1. ECB: the attacker selects the third block (entirely within the comment field) to be identical to the second block, except for containing the maximal value in the 24 bits from bit 8 to bit 31 (the amount field). The attacker then switches the second and third ciphertext blocks before giving the message to the bank. The parity bits do not change.

2. CBC: the attacker chooses y s.t. m3 = m2. Then, the attacker sends to the bank the manipulated message z0 ∥ c3 ∥ c3 ∥ c3 ∥ c4, where z0 = m1 ⊕ m3 ⊕ c2. As a result, decryption of the first block retrieves m1 correctly (as m1 = z0 ⊕ m3 ⊕ c2), and decryption of the last block similarly retrieves m4 correctly (no change in c3, c4). However, both the second and the third block decrypt to the value (c3 ⊕ c2 ⊕ m3). Hence, the 32-bit XOR of the message does not change. The decryption of the second block (to c3 ⊕ c2 ⊕ m3) is likely to leave the time value valid, and to increase the amount considerably.

3. OFB: the solution is trivial, since Mal can flip arbitrary bits in the decrypted plaintext (by flipping the corresponding bits in the ciphertext).

4. CFB: as in CBC, the attacker chooses y s.t. m3 = m2. The attacker sends to the bank the manipulated message c0 ∥ c1 ∥ c1 ∥ c1 ∥ z4, where z4 = p4 ⊕ c2 ⊕ p2.

Exercise 2.49 (Affine block cipher). Hackme Inc. proposes the following highly-efficient block cipher, using two 64-bit keys k1, k2, for 64-bit blocks: E_{k1,k2}(m) = (m ⊕ k1) + k2 (mod 2^64).

1. Show that E_{k1,k2} is an invertible permutation (for any k1, k2), and present the inverse permutation D_{k1,k2}.
2. Show that (E, D) is not a secure block cipher (invertible PRP).
3. Show that encryption using (E, D) is not IND-CPA, when used in the following modes: (1) ECB, (2) CBC, (3) OFB, (4) CFB.

Exercise 2.50 (How not to build a PRP from a PRF). Suppose F is a secure PRF with input, output and keyspace all of length n bits. For xL, xR ∈ {0,1}^n, let F′_k(xL ∥ xR) = Fk(xL) ∥ Fk(xR) and F″_k(xL ∥ xR) = Fk(xL ⊕ xR) ∥ Fk(xL ⊕ Fk(xL ⊕ xR)). Prove that neither F′ nor F″ is a PRP.

Exercise 2.51 (Building a PRP from a PRF). Suppose you are given a secure PRF F, with input, output and keyspace all of length n bits. Show how to use F to construct:

1. A PRP with input and output length 2n bits and key length n bits,
2. A PRP with input, output and key all of length n bits.

Exercise 2.52. Show that the simple padding function pad(m) = m ∥ 0^l fails to prevent CCA attacks against most of the modes of operation (Fig. 2.5), when l ≤ n. The attacker may perform CPA and CCA queries, and the plaintext contains multiple blocks.

Exercise 2.53 (Indistinguishability definition). Let (E, D) be a stateless shared-key encryption scheme, and let p1, p2 be two plaintexts.
Let x be 1 if the most significant bits of p1, p2 are identical and 0 otherwise, i.e., x = {1 if MSb(p1) = MSb(p2), else 0}. Assume that there exists an efficient algorithm X that computes x given the ciphertexts, i.e., x = X(Ek(p1), Ek(p2)). Show that this implies that (E, D) is not IND-CPA secure, i.e., there is an efficient algorithm ADV which achieves significant advantage in the IND-CPA experiment. Present the implementation of ADV by filling in the missing code below:

ADV^{Ek}('Choose', 1^n): { }

ADV^{Ek}('Guess', s, c*): { }

Exercise 2.54 (Robust combiner for PRG).

1. Given two candidate PRGs, say G1 and G2, design a robust combiner, i.e., a 'combined' function G which is a secure PRG if either G1 or G2 is a secure PRG.
2. In the design of the SSL protocol, there were two candidate PRGs, one (say G1) based on the MD5 hash function and the other (say G2) based on the SHA-1 hash function. The group decided to combine the two; a simplified version of the combined PRG is G(s) = G2(s ∥ G1(s)). Is this a robust combiner, i.e., a secure PRG provided that either G1 or G2 is a secure PRG? Hint: compare to Lemma 2.1. You may read about hash functions in Chapter 3, but the exercise does not require any knowledge of that; you should simply consider the construction G(s) = G2(s ∥ G1(s)) for arbitrary functions G1, G2.

Exercise 2.55 (Using PRG for independent keys). In Example 2.6, we saw how to derive multiple pseudo-random keys from a single pseudorandom key, using a PRF.

1. Show how to derive two pseudo-random keys, using a PRG, say from n bits to 2n bits.
2. Show how to extend your design to derive four keys from the same PRG, or any fixed number of pseudo-random keys.

Exercise 2.56. Let (E, D) be a block cipher which operates on 20-byte blocks; suppose that each computation of E or D takes 10^{-6} seconds (one microsecond) on given chips. Using (E, D), you are asked to implement a secure high-speed encrypting/decrypting gateway. The gateway receives packets at a line speed of 10^8 bytes/second, but with a maximum of 10^4 bytes received in any given second. The goal is to have minimal latency, using a minimal number of chips. Present an appropriate design, and argue why it achieves the minimal latency and why it is secure.

Exercise 2.57. Consider the AES block cipher, with 256-bit keys and 128-bit blocks, two random one-block (128-bit) messages, m1 and m2, and two random (256-bit) keys, k1 and k2. Calculate (or approximate/bound) the probability that E_{k1}(m1) = E_{k2}(m2).

Exercise 2.58 (PRF→PRG). Present a simple and secure construction of a PRG, given a secure PRF.

Exercise 2.59 (Independent PRGs). Often, a designer has one random or pseudo-random 'seed/key' binary string k ∈ {0,1}*, from which it needs to generate two or more independently pseudorandom strings k0, k1 ∈ {0,1}*; i.e., each of these is pseudorandom, even if the other is given to the (PPT) adversary. Let PRG be a pseudo-random generator, which on input of arbitrary length l bits produces 4l pseudorandom output bits. For each of the following designs, prove its security (if secure) or its insecurity (if insecure).

1. For b ∈ {0,1}, let kb = PRG(b ∥ k).
2. For b ∈ {0,1}, let kb = PRG(k)[(b · 2 · |k|) ... ((2 + b) · |k| - 1)].
Solution:

1. Insecure, since it is possible for a secure PRG to ignore the first bit of its input, i.e., PRG(0 ∥ s) = PRG(1 ∥ s) for every s, resulting in k0 = PRG(0 ∥ k) = PRG(1 ∥ k) = k1. We skip the (simple) proof that such a PRG may be secure.

2. Secure, since each of these is a (non-overlapping) subset of the output bits of the PRG.

Exercise 2.60 (Indistinguishability hides partial information). In this exercise we provide an example of the fact that a cryptosystem that ensures indistinguishability (IND-CPA) is guaranteed not to leak partial information about the plaintext, including relationships between the plaintexts corresponding to different ciphertexts. Let (E, D) be an encryption scheme which leaks some information about the plaintexts; specifically, we assume that there exists an efficient adversary A s.t. for two ciphertexts c1, c2 of E, holds A(c1, c2) = 1 if and only if the plaintexts share a common prefix, e.g., c1 = Ek(ID ∥ m1) and c2 = Ek(ID ∥ m2) (same prefix, ID). Show that this implies that (E, D) is not IND-CPA secure.

Exercise 2.61 (Encrypted cloud storage). Consider a set P of n sensitive (plaintext) records P = {p1, ..., pn} belonging to Alice, where n < 10^6. Each record pi is l > 64 bits long ((∀i)(pi ∈ {0,1}^l)). Alice has very limited memory; therefore, she wants to store an encrypted version of her records in an insecure/untrusted cloud storage server S; denote these ciphertext records by C = {c1, ..., cn}. Alice can later retrieve the i-th record by sending i to S, who sends back ci, and then decrypting it back to pi.

1. Alice uses some secure shared-key encryption scheme (E, D), with l-bit keys, to encrypt the plaintext records into the ciphertext records. The goal of this part is to allow Alice to encrypt and decrypt each record i using a unique key ki, but maintain only a single 'master' key k, from which she can easily compute ki for any desired record i. One motivation for this is to allow Alice to give the keys ki of specific record(s) to some other users (Bob, Charlie, ...), allowing decryption of only the corresponding ciphertext ci, i.e., pi = D_{ki}(ci). Design how Alice can compute the key ki for each record (i), using only the key k and a secure block cipher (PRP) (F, F^{-1}), with key and block sizes both l bits. Your design should be as efficient and simple as possible. Note: do not design how Alice gives ki to the relevant users (e.g., she may do this manually), and do not design (E, D).

Solution: ki =

2. Design now the encryption scheme to be used by Alice (and possibly by other users to whom Alice gave keys ki). You may use the block cipher (F, F^{-1}), but no other cryptographic functions. You may use a different encryption scheme (E^i, D^i) for each record i. Ensure confidentiality of the plaintext records from the cloud, from users (not given the key for that record), and from eavesdroppers on the communication. Your design should be as efficient as possible, in terms of the length of the ciphertext (in bits), and in terms of the number of applications of the secure block cipher (PRP) (F, F^{-1}) for each encryption and decryption operation. In this part, assume that Alice stores P only once, i.e., never modifies records pi. Your solution may include a new choice of ki, or simply use the same as in the previous part.

Solution: ki =          , E^i_{ki}(pi) =          , D^i_{ki}(ci) =          .
3. Repeat, when Alice may modify each record pi a few times (say, up to 15 times); let ni denote the number of modifications of pi. The solution should allow Alice to give (only) her key k, and then Bob can decrypt all records, using only the key k and the corresponding ciphertexts from the server. Note: if your solution is the same as before, this may imply that your solution to the previous part is not optimal.

Solution: ki =          , E^i_{ki}(pi) =          , D^i_{ki}(ci) =          .

4. Design an efficient way for Alice to validate the integrity of records retrieved from the cloud server S. This may include storing additional information Ai to help validate record i, and/or changes to the encryption/decryption scheme or keys as designed in the previous parts. As in the previous parts, your design should only use the block cipher (F, F^{-1}).

Solution: ki =          , E^i_{ki}(pi) =          , D^i_{ki}(ci) =          , Ai =          .

5. Extend the keying scheme from the first part, to allow Alice to also compute keys k_{i,j}, for integers i, j ≥ 0 s.t. 1 ≤ i·2^j + 1, (i+1)·2^j ≤ n, where k_{i,j} would allow (efficient) decryption of ciphertext records c_{i·2^j+1}, ..., c_{(i+1)·2^j}. For example, k_{0,3} allows decryption of records c1, ..., c8, and k_{3,2} allows decryption of records c13, ..., c16. If necessary, you may also change the encryption scheme (E^i, D^i) for each record i.

Solution: k_{i,j} =          , E^i_{ki}(pi) =          , D^i_{ki}(ci) =          .

Exercise 2.62 (Modes vs. attack models). For every mode of encryption we learned (see Table 2.5):

1. Is this mode always secure against any of the attack models we discussed (CTO, KPA, CPA, CCA)?
2. Assume this mode is secure against KPA. Is it then also secure against CTO? CPA? CCA?
3. Assume this mode is secure against CPA. Is it then also secure against CTO? KPA? CCA?

Justify your answers.

Exercise 2.63. Recall that WEP encryption is defined as: WEP_k(m; IV) = [IV, RC4_{IV,k} ⊕ (m ∥ CRC(m))], where IV is a random 24-bit initialization vector, and that CRC is an error-detection code which is linear, i.e., CRC(m ⊕ m′) = CRC(m) ⊕ CRC(m′). Also recall that WEP supports a shared-key authentication mode, where the access point sends a random challenge r, and the mobile responds with WEP_k(r; IV). Finally, recall that many WEP implementations use a 40-bit key.

1. Explain how an attacker may efficiently find the 40-bit WEP key, by eavesdropping on the shared-key authentication messages between the mobile and the access point.
2. Present a hypothetical scenario where WEP would have used a fixed value of IV to respond to all shared-key authentication requests, say IV = 0. Show another attack, that also finds the key using the shared-key authentication mechanism, but requires less time per attack. Hint: the attack may use a (reasonable) precomputation process, as well as storage resources; and the attacker may send a 'spoofed' challenge which the client believes was sent by the access point.
3. Identify the attack models exploited in the two previous items: CTO, KPA, CPA or CCA?
4. Suppose now that WEP is deployed with a long key (typically 104 bits). Show another attack which will allow the attacker to decipher (at least part of) the encrypted traffic.

Chapter 3

Integrity: from Hashing to Blockchains

Integrity is the ability to check if an object was modified, using a concise digest of the (original) object.
In this chapter, we discuss the two main types of cryptographic integrity mechanisms: hash functions and accumulator schemes. Much of this chapter deals with cryptographic hash functions, used to compute the digest of a binary string. Cryptographic hash functions are among the most widely-used cryptographic schemes, and have many diverse properties, uses, and applications.

In Section 3.1 we introduce cryptographic hash functions, their properties and variants. In Section 3.2 and Section 3.3, respectively, we discuss the two main integrity properties of hash functions: collision-resistant hash functions (CRHF) and second-preimage resistant (SPR) hash functions. In Section 3.4 we discuss one-way functions (OWF), which is another property often expected from cryptographic hash functions, and used for different purposes. In Section 3.5 we discuss one final important property often expected from cryptographic hash functions: randomness extraction; and in Section 3.6 we introduce the Random Oracle Model, an important paradigm often used to provide a simplified security analysis of protocols and schemes using cryptographic hash functions. Later, in Chapter 4, we discuss the use of cryptographic hash functions for authentication, in MAC and signature schemes.

In Section 3.7 we introduce cryptographic accumulators. Accumulators can be seen as a generalization of hash functions, accepting a sequence/set of binary strings as input, and providing additional functionalities such as Proof of Inclusion (PoI). In Section 3.9 we present the Merkle-Damgård accumulator, a construction that is better known for its use in constructing cryptographic hash functions from their fixed-input-length variant, called compression functions.

Both hash functions and accumulators have been extensively studied and widely deployed in practice. However, as we will see, their correct, secure use requires a precise understanding of their properties; designs based on intuitive understanding have often proved vulnerable, and we will see some examples. Therefore, precise definitions are critical. We define what we consider to be the most important notions, mostly focusing, where possible, on the more-applied case of keyless hash functions; in more advanced texts on cryptography, you will find additional, and sometimes different, definitions.

3.1 Introducing cryptographic hash functions, their properties and variants

Hash functions map variable input length (VIL) binary strings to n-bit strings, referred to as the digest of the input. Since the input may be arbitrarily long, and the output is always n bits, the basic property of hash functions is that the digest (output) is normally shorter than the input, a property often referred to as compression^1. The compression property, on its own, may be achieved trivially, e.g., by truncating the input; hash functions are expected to satisfy additional properties, as we will discuss.

Figure 3.1: Keyless and keyed hash functions, mapping a variable-length input to an n-bit output (digest): (a) keyless hash function h(·) : {0,1}* → {0,1}^n; (b) keyed hash function h_k(·) : {0,1}^n × {0,1}* → {0,1}^n. For simplicity, the keyed hash uses n as the length of both digest and key.

As illustrated in Figure 3.1, hash functions may be keyed (h_k(·) : {0,1}^n × {0,1}* → {0,1}^n) or keyless (h(·) : {0,1}* → {0,1}^n).
We discuss cryptographic hash functions, i.e., hash functions which should satisfy different security properties, e.g., collision resistance, although hash functions are also used for non-security applications.

To key or not to key? Existing standards of cryptographic hash functions are all of keyless hashes, and use a fixed digest length n. For example, the SHA-1 standard cryptographic hash function (subsection 3.1.4) uses n = 160, i.e., the output length is 160 bits. However, as we will see, keyless hash functions cannot ensure important security requirements, e.g., collision resistance, which motivates the use of keyed hash functions.

^1 Do not confuse this with compression functions, which we define later, and which compress m-bit strings to n < m bit strings.

We discuss both keyed and keyless hash functions, focusing, where possible, on keyless hash functions, since they are simpler and more common in applied cryptographic protocols and systems. For the same reasons, we also focus on hash functions of fixed digest length n. The key k of keyed hash functions, including cryptographic keyed hash functions, is usually non-secret^2.

3.1.1 Warm-up: hashing for efficiency

Before we focus on cryptographic hash functions, we first discuss briefly the use of hash functions for randomly mapping data, as used (also) for load balancing and other 'classical', non-adversarial scenarios. Our goal is to provide intuition for the required security properties and awareness of some of the challenges.

A common (non-cryptographic) application of hash functions is to map the inputs into the possible digest values ('bins') in a 'random' manner, i.e., with a roughly equal probability of assignment to each bin (digest value). For the typical case of n-bit digest values, there would be 2^n bins. This property is used in many algorithms and data structures, to improve efficiency and fairness. This is illustrated in Fig. 3.2. Here, a hash function h maps from the set of names (given as unbounded-length strings) to a smaller set, say the set of n-bit binary strings. This is a special case of load balancing, i.e., avoiding uneven use of computing resources, which may result in overload of one resource concurrently with under-utilization of an alternative resource. The goal is to roughly balance the number of entries (names) assigned to each bin.

Of course, in cryptography, and cybersecurity in general, we mainly consider adversarial settings. In the context of load-balancing applications as shown in Fig. 3.2, this refers to an adversary who can manipulate some of the input names, and whose goal may be to cause an imbalanced allocation of names to bins, i.e., many collisions, which can cause bad performance, potentially even a Denial of Service (DoS), i.e., a disruption or degradation of the service provided.

Consider an attacker whose goal is to degrade the performance for a particular name, say Bob, as part of a Denial-of-Service (DoS) attack. The attacker may provide to the system a long list of names x1, x2, ..., deviously selected such that all of them are mapped to the same bin as Bob. We refer to inputs x, x1 that have the same digest, i.e., h(x) = h(x1), as a collision. The attacker's goal, therefore, is to find many values x1, x2, ..., which all collide with the string x = 'Bob', i.e., h('Bob') = h(x1) = h(x2) = .... See Figure 3.3.
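To make the setting concrete, the following is a minimal sketch of hash-based bucketing, assuming Python's standard hashlib (the choice of SHA-256 and of 2^16 bins is purely illustrative); the attacker described above succeeds by finding many names that this mapping sends to the same bin as 'Bob', which should be infeasible when the hash is collision resistant and the digest is long enough.

    import hashlib

    def bin_of(name: str, n_bits: int = 16) -> int:
        # map a name to one of 2^n_bits bins, using the hash digest as a number
        digest = hashlib.sha256(name.encode()).digest()
        return int.from_bytes(digest, "big") % (2 ** n_bits)

    # e.g., bin_of("Bob") and bin_of("Alice") will (most likely) land in different bins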
DoS attacks of this type, which cause high computational overhead, are usually referred to as algorithmic-complexity DoS attacks. One way in which attackers may exploit an algorithmic-complexity DoS attack is to cause excessive overhead for network security devices, such as malware/virus scanners, Intrusion-Detection Systems (IDS) and Intrusion-Prevention Systems (IPS). The attack may cause the IDS/IPS systems to become ineffective, allowing the attacker to avoid detection. For further discussion of algorithmic-complexity and other Denial-of-Service (DoS) attacks, see [107, 193].

² Some works refer to the use of hashes with secret keys. However, the applications are typically of a MAC or PRF function, possibly constructed from a cryptographic hash function, a topic we discuss in subsection 4.6.3.

Figure 3.2: Load-balancing with (keyless) hash function h(·)

Note that for any hash function h and input x (e.g., x = 'Bob'), it is possible to find other inputs {x′_1, x′_2, ...} which collide with x, i.e., (∀i) h(x′_i) = h(x), by randomly testing different inputs and collecting those whose hash is the same as h(x). However, for digest length n, there are 2^n bins; hence the probability that such a random guess is a collision with 'Bob' is only 1/2^n = 2^{-n}, i.e., negligible in n.

For some hash functions, including hash functions used for non-cryptographic applications, there are efficient ways for the attacker to find collisions, rather than testing random inputs. This includes hash functions that provide a sufficiently-randomized mapping for 'natural' inputs, which result from a benign selection process. We say that such hash functions are not collision resistant, i.e., these are functions where collisions can be found efficiently when the inputs are selected by an attacker. See the following exercise.

Exercise 3.1. Given an alphabetic string x, let num(x, i) be the alphabetical position of the i-th letter in x, with the first letter ('a') being in position one. For example, num('hello', 2) = 5 since 'e' is the fifth letter in the alphabet, and num('abcdef', i) = i for 1 ≤ i ≤ 6. Consider the hash function h(x) = ( Σ_{i=1}^{|x|} num(x, i) ) mod 27, i.e., the sum of all the letters (mod 27). Show how an attacker may easily generate a set {x_1, x_2, ...} of any desired number of strings colliding with Bob, i.e., for every x_i holds: h(x_i) = h('Bob'). The attacker should not need to compute the hash value for many different strings. Give three examples of such colliding strings. As an extra challenge, try to have your strings be real names!

Figure 3.3: Algorithmic-complexity Denial-of-Service attack exploiting an insecure hash function h to cause many collisions

It isn't very surprising that collisions can be found efficiently for a hash function not designed for collision resistance. However, this indicates the importance of carefully defining the collision resistance requirement and other security requirements from cryptographic hash functions. This follows the attack model and security requirements principle (Principle 1). In Section 3.2 we define collision-resistant hash functions (CRHF); intuitively, in such functions, an attacker cannot efficiently find a collision. This should foil the algorithmic-complexity Denial-of-Service attack of Fig. 3.3, provided we use a sufficiently long digest length n (i.e., 2^n bins). See Figure 3.4.
Figure 3.4: Load balancing with a collision-resistant hash function (CRHF) with n-bit digests, i.e., using 2^n bins. The probability of a random name colliding with 'Bob' is only 2^{-n}; furthermore, the probability of collision is negligible for guesses by all efficient algorithms.

3.1.2 Properties of cryptographic hash functions

Cryptographic hash functions are used for many different applications, often assuming different security properties; unfortunately, these assumptions are not always made explicitly, and the assumed properties are not always clearly defined. Roughly, these properties fall into three broad goals: integrity, confidentiality and randomness. Intuitively, integrity ensures the uniqueness of the message, given the digest; confidentiality ensures the digest does not 'expose' the message; and randomness ensures that the digest is pseudorandom, provided that the input 'contains sufficient randomness'.

We define four security requirements: collision resistance, second-preimage resistance, one-way function (also referred to as preimage resistance), and randomness extraction. Table 3.1 maps these requirements to the three goals, and gives abridged descriptions of these properties, for keyless hash functions.

Table 3.1: Goals and requirements for keyless cryptographic hash functions, presented for the hash function h: {0,1}* → {0,1}^n.

Goal            | Requirement                                         | Abridged description
Integrity       | Collision resistance (CRHF; Definition 3.1)         | Can't find a collision (m, m′), i.e., m ≠ m′ yet h(m) = h(m′).
Integrity       | Second-preimage resistance (SPR; Definition 3.7)    | Can't find a collision to a random m: m′ ≠ m yet h(m) = h(m′).
Confidentiality | One-way function (OWF; Definition 3.9)              | Given h(m) for random m, can't find m′ s.t. h(m) = h(m′).
Randomness      | Randomness extractor (Section 3.5)                  | If the input is sufficiently random, then the output is pseudorandom.
All             | Random oracle model (ROM; §3.6)                     | Consider h as a random function.

Assuming that a given hash function has these security properties makes it possible to use the hash function for different applications, ensuring security as long as the hash function indeed satisfies the properties assumed. However, we should only assume properties which the hash function was designed and tested to ensure. Even seemingly minor differences between the security requirements for which the hash function was designed and tested, and the security properties required for the application, may result in a vulnerability. For example, Table 3.1 presents two integrity requirements, CRHF and SPR; we later show applications which are secure using a cryptographic hash function which satisfies the CRHF requirement, but may be vulnerable assuming 'only' the SPR requirement. Namely, even if the definitions may appear similar, it is still important to use the correct definition; the differences can be meaningful and even critical.

Table 3.1 also includes the random oracle model (ROM), which we discuss in Section 3.6. In the ROM, we analyze the security of a system using cryptographic hash functions as if we use a random function (from binary strings to n-bit binary strings).
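A common way to picture the random oracle model is 'lazy sampling': the first time an input is queried, a uniformly random n-bit digest is drawn and remembered, so that repeated queries are answered consistently. A minimal sketch of this view (illustration only; the ROM is discussed in Section 3.6):

```python
import os

class RandomOracle:
    """A lazily-sampled random function from {0,1}* to {0,1}^n (n = 8 * out_len bits)."""
    def __init__(self, out_len: int = 32):
        self.out_len = out_len
        self.table = {}                        # remembers answers to previous queries

    def query(self, m: bytes) -> bytes:
        if m not in self.table:                # first query: sample a fresh random digest
            self.table[m] = os.urandom(self.out_len)
        return self.table[m]                   # repeated queries get the same answer

ro = RandomOracle()
assert ro.query(b"abc") == ro.query(b"abc")    # consistent, i.e., behaves like a function
print(ro.query(b"abc").hex())
```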
Here is an exercise which may strengthen your intuitive understanding of the different security requirements in Table 3.1.

Exercise 3.2 (Examples of (insecure, simple) hash functions). Let h(x) = x mod 2^n, h′_k(x) = (k + x) mod 2^n, and h″(x) = x^2 mod 2^n, all computed by considering their inputs as integers in binary representation. To avoid confusion, we denote numbers in binary representation by a subscript of 2, e.g., 10000_2 is the number 16 in decimal, i.e., 2^4, and similarly, 100_2 = 4 = 2^2. Notice that h′ is keyed, while h and h″ are keyless.

1. For n = 4, compute h(11010_2), h(10101010_2), h″(11010_2) and h″(10101010_2). Note: inputs are binary strings, and should be viewed as integers in binary representation, which we denote with the subscript 2.

2. Show that h(x) = x mod 2^n is not a CRHF, SPR, OWF or randomness extractor, based on the abridged descriptions in Table 3.1.

3. Repeat for h′_k(x) = (k + x) mod 2^n.

4. Repeat for h″(x) = x^2 mod 2^n. Beware: the OWF property can be challenging.

Solution for the first three items:

1. Recall that inputs are binary strings, and should be viewed as integers in binary representation; we write the outputs similarly.

h(11010_2) = 11010_2 mod 2^4 = 1010_2
h(10101010_2) = 10101010_2 mod 2^4 = 1010_2
h″(11010_2) = (11010_2)^2 mod 2^4 = (11010_2 mod 2^4)^2 mod 2^4 = (1010_2)^2 mod 2^4 = (1000_2 + 10_2)^2 mod 2^4 = (1000000_2 + 10000_2 + 10000_2 + 100_2) mod 2^4 = 100_2
h″(10101010_2) = 100_2 (similarly)

2. Show that h(x) = x mod 2^n is not a CRHF, SPR, OWF or randomness extractor.

a) SPR and CRHF: We show that h is not a CRHF or an SPR hash function, by showing a collision for any given input x. Specifically, let x′ = x + 2^n. Clearly x′ ≠ x, and yet h(x′) = (x + 2^n) mod 2^n = x mod 2^n = h(x); namely, x′ is a collision (second preimage) with x.

b) OWF: We next show that h is not a one-way function (OWF). Specifically, given h(x) for any preimage x, let x′ = h(x); clearly:

h(x′) = x′ mod 2^n = (x mod 2^n) mod 2^n = x mod 2^n = h(x)

Namely, x′ is a preimage of h(x), and hence h is not an OWF.

c) Finally, we show that h is not a randomness-extractor hash function. Specifically, let r ←$ {0,1}^n be a random n-bit string, and let x = r ++ 0, i.e., let x be an (n+1)-bit binary string whose least significant bit is zero and whose other bits are selected randomly. The value of h(x) = x mod 2^n is the same as the n least-significant bits of x, and in particular, the least significant bit of h(x) is zero. Hence, h(x) is easily distinguishable from a random n-bit string.

3. Show that h′_k(x) = (k + x) mod 2^n is not a CRHF, SPR, OWF or randomness extractor. Recall that the key k is known to the adversary (not secret). This makes it easy to adapt the solutions of the previous item. Specifically:

SPR and CRHF: The same collision x′ = x + 2^n applies here too. Clearly x′ ≠ x, and yet for every key k holds: h′_k(x′) = (k + x + 2^n) mod 2^n = (k + x) mod 2^n = h′_k(x); namely, x′ is a collision (second preimage) with x.

OWF: We next show that h′_k is not a one-way function (OWF), i.e., given h′_k(x) (and k), we can find x′ such that h′_k(x′) = h′_k(x). Notice that we are given the key - in our definitions of keyed hash functions, the key is known to the attacker, i.e., not a secret.
Specifically, given h′_k(x) for any preimage x, and the key k, let x′ = (h′_k(x) − k) mod 2^n; clearly:

h′_k(x′) = (k + x′) mod 2^n = (k + h′_k(x) − k) mod 2^n = h′_k(x) mod 2^n = h′_k(x)

Namely, x′ is a preimage of h′_k(x), and hence h′ is not an OWF.

Randomness extractor: Finally, we show that h′_k is not a randomness-extractor hash function. Specifically, let x = r ++ 0^n, where r is a random n-bit string; for simplicity, assume that the key k is also n bits long. Then h′_k(x) = ((r ++ 0^n) + k) mod 2^n = k, which is obviously not a random string.

3.1.3 Applications of cryptographic hash functions

The broad security requirements of cryptographic hash functions facilitate their use in many systems and for an extensive variety of applications. These different applications and systems rely on different security requirements. As in any security system, it is important to identify the exact security requirements and assumptions; however, published designs, and even standards, do not always define the requirements precisely. Important applications of cryptographic hash functions include:

Integrity of a string or a set: the hash h(m) is a short digest of a typically much longer string m, which allows validation of the integrity of m. Hash functions are also used in the construction of accumulator schemes, e.g., the Merkle tree design; accumulators also produce a digest, but of an ordered or unordered set of strings, rather than of a single string. We discuss accumulators in Section 3.7.

Hash-then-Sign: Signature schemes are usually defined with Fixed Input Length (FIL), typically quite limited (e.g., < 1024 bits). To sign longer messages, we apply the FIL signing function to the digest h(m) of the message m being signed; this is called the Hash-then-Sign paradigm. See subsection 3.2.6.

Improved login mechanisms: Hash functions are used to improve the security of password-based login authentication, in several ways. The most widely deployed method is using a hashed password file, which makes exposure of the server's password file less risky - since it contains only the hashed passwords. Another approach is to use a hash-based one-time password, which is a random number allowing the server to authenticate the user, with the drawbacks of single use and of having to remember or carry this random number; see subsection 3.4.1, and a more extensive discussion of different login mechanisms in Chapter 9.

Proof-of-Work: cryptographic hash functions are often used to provide Proof-of-Work (PoW), i.e., to prove that an entity performed a considerable amount of computation. This is used by Bitcoin and other cryptocurrencies, and for other applications. See Section 3.10.2.

Key derivation and randomness generation: hash functions are used to extract pseudorandom bits, given input with 'sufficient randomness'. In particular, this is used to derive secret shared keys. See Section 3.5.

3.1.4 Standard cryptographic hash functions

Due to their efficiency, simplicity and wide applicability, cryptographic hash functions are probably the most commonly used 'cryptographic building blocks', as discussed in the cryptographic building blocks principle (Principle 8). This implies the importance of defining and adopting standard functions, which can be widely evaluated for security - mainly by cryptanalysis - and the need for definitions of security.
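For concreteness, several standard hash functions are directly available in common libraries; for example, Python's hashlib exposes the SHA-2 and SHA-3 families (and also the broken MD5 and SHA-1, which should be avoided where collision resistance matters, as discussed next):

```python
import hashlib

m = b"example message"
print(hashlib.sha256(m).hexdigest())     # SHA-2 family, 256-bit digest
print(hashlib.sha3_256(m).hexdigest())   # SHA-3 family, 256-bit digest
print(hashlib.sha1(m).hexdigest())       # 160-bit digest; collisions are known - avoid
```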
There have been many proposed cryptographic hash functions; however, since security is based on failed efforts at cryptanalysis, designers usually avoid less-well-known (and hence less tested) designs. The most well-known cryptographic hash functions include the MD4 and MD5 functions proposed by RSA Inc., the SHA-1, SHA-2³ and SHA-3 functions standardized by NIST [114, 323], the RIPEMD and RIPEMD-160 standards, and others, e.g., BLAKE2. Several of these, however, were already 'broken', i.e., shown to fail some of the security requirements. In particular, collisions - and specifically, chosen-prefix collisions - were found for RIPEMD, MD4 and MD5 in [367], and later also for SHA-1 [262]; see subsection 3.3.1. As a result, these functions should be avoided and replaced, at least in applications which depend on the collision-resistance property.

³ The SHA-2 specification defines six variants, with digest lengths of 224, 256, 384 or 512 bits; these variants are named SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, and SHA-512/256.

Existing standards define only keyless cryptographic hash functions. However, as we later explain, there are strong motivations to use keyed cryptographic hash functions, which use a non-secret key. In particular, collision resistance cannot be achieved by any keyless function.

3.2 Collision Resistant Hash Function (CRHF)

3.2.1 Keyless Collision Resistant Hash Function (Keyless-CRHF)

A keyless hash function h: {0,1}* → {0,1}^n maps unbounded-length binary strings m ∈ {0,1}* to their n-bit digest h(m) ∈ {0,1}^n. Since the input domain is the (infinite) set of all binary strings, and the range is finite ({0,1}^n), it follows that there are infinitely many collisions, i.e., messages m ≠ m′ s.t. h(m) = h(m′). Indeed, even if we limit ourselves to the input set of messages m ∈ {0,1}^{n+1}, the number of messages is 2^{n+1} and the number of digests is only 2^n, so clearly at least half (2^n) of the inputs must result in a collision. Namely, collisions are common - in every hash function. However, for large digest length n, it is conceivable that finding a collision may be hard.

Figure 3.5: Keyless collision resistant hash function (CRHF): it is infeasible to efficiently find a collision, i.e., a pair of inputs x, x′ ∈ Domain which are mapped by the hash function h to the same output, h(x) = h(x′), except with negligible probability.

Intuitively, we say that a hash function is collision resistant if it is hard to find any collision, as illustrated in Figure 3.5. The definition follows; notice that in this and (most) other definitions of keyless hash functions, we actually view the hash function as if it is defined for different digest lengths n, allowing us to discuss the computational complexity as a function of n. For more precise definitions that explicitly express the digest length n as a parameter, see, e.g., [165].

Definition 3.1 (Keyless Collision Resistant Hash Function (CRHF)). A keyless hash function h(·): {0,1}* → {0,1}^n is collision-resistant if for every efficient (PPT) algorithm A, the advantage ε^CRHF_{h,A}(n) is negligible in n, i.e., smaller than any positive polynomial for sufficiently large n (as n → ∞), where:

ε^CRHF_{h,A}(n) ≡ Pr[ (x, x′) ← A(1^n) s.t. (x ≠ x′) ∧ (h(x) = h(x′)) ]    (3.1)

where the probability is taken over the random coin tosses of A.
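Definition 3.1 can be read as an experiment: run the adversary on input 1^n, and check whether its output is a valid collision; the adversary's advantage is its probability of winning this experiment. A small sketch of the experiment, with SHA-256 standing in for h and a placeholder adversary (both are assumptions for illustration only):

```python
import hashlib

def h(m: bytes) -> bytes:
    return hashlib.sha256(m).digest()

def crhf_experiment(adversary, n: int) -> bool:
    """One run of the CRHF experiment: True iff the adversary outputs x != x' with h(x) == h(x')."""
    x, x_prime = adversary(b"1" * n)     # the adversary receives 1^n (n in unary)
    return x != x_prime and h(x) == h(x_prime)

def trivial_adversary(unary_n: bytes):
    return b"a", b"b"                    # distinct inputs, but almost surely not a collision

print(crhf_experiment(trivial_adversary, 256))   # False: this adversary has advantage ~0
```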
Let us define h_sum, a simple example of an insecure hash function, which is handy for giving examples of the different cryptographic hash function definitions - all of which h_sum fails to satisfy. While the input to h_sum can be any binary string, we normally use it for input which is a string of decimal digits (encoded in binary); for any other input, h_sum returns the fixed output of zero (0). When the input consists of a string of decimal digits, h_sum repeatedly sums up the digits of its input, until obtaining only one digit, which is the output. For example, h_sum(13) = 4, h_sum(345) = 3 and h_sum(5) = 5. The reader is probably familiar with h_sum from elementary school - where it is introduced without calling it h_sum, of course - as a way to check if a number is divisible by three or by nine. We next define h_sum precisely and find collisions in it; our definition conveniently assumes that we can use the decimal value of a string in numeric operations, when the string is composed of decimal digits.

Example 3.1 (The h_sum hash function and collisions in it). We define h_sum(x) as follows:

h_sum(x) = 0                              if x ∉ {0, 1, ..., 9}*
h_sum(x) = x                              if x < 10
h_sum(x) = h_sum( Σ_{i=1}^{|x|} x[i] )    otherwise

Let us show that h_sum is not a CRHF. This is easy; in fact, given any integer x > 0, let x′ = 10 · x. Then x ≠ x′, yet h_sum(x′) = h_sum(x) - i.e., h_sum is not a CRHF.

So, how do we find a CRHF? Possibly surprisingly, we next show that we cannot; namely, we show that there exists no keyless CRHF.

3.2.2 There are no Keyless CRHFs!

Standard cryptographic hash functions, discussed in subsection 3.1.4, are all keyless; and practical deployments almost always use these designs. By now, the readers should not be surprised to learn that none of these were proven secure; we discussed in subsection 2.7.4 the fact that 'real', unconditional proofs of security for (most) cryptographic schemes would imply that P ≠ NP, and, therefore, would be major news. Cryptographic hash functions are among the basic cryptographic building blocks, which are typically validated by accumulated evidence of failed attempts to cryptanalyze them.

However, the following lemma may be surprising: all keyless hash functions fail to satisfy the keyless-CRHF definition (Definition 3.1) - namely, a keyless CRHF - using this definition - simply does not exist. We present and prove this, and then discuss the implications.

The reader is quite right to suspect a 'trick' here - after all, we just explained that all standard cryptographic hash functions are keyless! Well, that is correct: the proof uses a 'trick' to show that there is an efficient attacker that can find a collision in the keyless hash function. The proof shows that for any given keyless hash function h, there exists an efficient adversarial algorithm A_h that outputs a collision for h, i.e., a pair (m, m′) s.t. m ≠ m′ but h(m) = h(m′). Furthermore, A_h is not just efficient in n: its time complexity is basically the time required to print out the collision. In fact, printing the collision is basically the only thing that A_h does. And note that A_h does not just succeed with a 'significant' probability; it always succeeds. Namely, h is very, very far from the requirements of Definition 3.1!

Would you like to see the trick - or did you already figure it out? You may have, since we have essentially already done the trick - 'hidden in plain sight' - exactly in the paragraph above.
Can you find it? Try to find it, before you read the proof and the explanation of the trick.

Lemma 3.1 (Keyless CRHFs do not exist). There is no keyless CRHF hash function h: {0,1}* → {0,1}^n.

Proof: Given h(·), we prove that there exists an efficient adversary algorithm A_h that always finds a collision, i.e., ε^CRHF_{h,A_h}(n) = 1 - clearly showing that h(·) does not satisfy the definition of a keyless CRHF. Recall that, since the domain of h is infinite while the range is the finite set {0,1}^n, h must have collisions, i.e., pairs of binary messages m ≠ m̂ s.t. h(m) = h(m̂). Let m, m̂ denote one such collision. It does not matter which collision we pick or how we pick it. The adversary A_h simply outputs the collision (m, m̂), i.e., A_h(1^n) = (m, m̂). Obviously, A_h is efficient, and always outputs a collision. Therefore, h(·) is not a keyless CRHF (as defined in Definition 3.1).

The 'trick' was that we proved that such an attacker A_h exists - but in a non-constructive way, i.e., we did not present such an adversary or show an efficient way to find it. We only showed that such an adversary exists. This shows that there is no keyless CRHF, as defined in Definition 3.1. Of course, there may be some other reasonable notion of collision resistance for keyless hash functions, for which the lemma does not apply. Indeed, we later define Second-Preimage Resistant (SPR) hash functions, which satisfy what is essentially a weaker collision-resistance property.

In this textbook, for simplicity, we usually use keyless hash functions, often assuming collision resistance, i.e., that the keyless hash function is a cryptanalysis-resistant CRHF. In the constructions and designs we discuss, it is not too hard to add the 'missing' keys, when desired (e.g., for provably secure reductions). Namely, we use 'keyless CRHFs' as a convenient simplification. A justification may be that if the system using the hash is insecure, then we may be able to use the attack to find a collision - which seems hard, as cryptanalysts have so far failed to find such a collision. Another justification is that in any practical implementation, the output length is fixed, while we only discuss asymptotic security definitions. In fact, many cryptographic designs use an even stronger simplification - the random oracle model (ROM), which we discuss in §3.6.

Another approach is to design the application without assuming a CRHF at all, and instead rely on other properties, which may exist for keyless hash functions. One especially-relevant property is second-preimage resistance (SPR), which is, essentially, a weaker form of collision resistance. Of course, care must be taken to ensure that SPR is really sufficient for the application; there could be subtle vulnerabilities due to use of an SPR hash in an application requiring 'real' collision resistance. We discuss SPR in Section 3.3.

A final alternative is to use a keyed CRHF instead of a keyless CRHF. Considering that existing standards define only keyless hash functions, a common approach is to use a construction of a keyed hash from a keyless hash. Often, this would be the HMAC construction, originally designed as a construction of a MAC function from a hash. We discuss HMAC in subsection 4.6.3.
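For instance, Python's hmac module implements the HMAC construction over a standard keyless hash; used with a non-secret, published key, it yields a keyed hash of the kind discussed above (this usage is distinct from the secret-key MAC usage of HMAC in subsection 4.6.3). A minimal sketch:

```python
import hashlib
import hmac
import os

def keyed_hash(k: bytes, m: bytes) -> bytes:
    """A keyed hash h_k(m), built from the keyless SHA-256 via the HMAC construction."""
    return hmac.new(k, m, hashlib.sha256).digest()

k = os.urandom(32)                # the hash key; here it is non-secret and may be published
m = b"some long message"
print(keyed_hash(k, m).hex())
```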
Figure 3.6: Keyed collision resistant hash function (CRHF): given a random key k, it is hard to find a collision for h_k, i.e., a pair of inputs x, x′ ∈ {0,1}* s.t. h_k(x) = h_k(x′).

3.2.3 Keyed Collision Resistance

We next discuss keyed collision resistant hash functions (keyed CRHF). The definition of a keyed CRHF seems very similar; the only difference is that the probability is also taken over the key, and the key is provided as input to the adversary. Recall that, for simplicity, we use n as the length of both the digest and the key; hence, we do not need to provide n as an additional input (since it is equal to the key length). We next define keyed collision resistance, which we illustrate in Figure 3.6. Recall that for keyed cryptographic hash functions, we assume, for simplicity, that n denotes both the length of the key and the length of the digest, i.e., for every message m ∈ {0,1}* holds: |k| = |h_k(m)| = n.

Definition 3.2 (Keyed Collision Resistant Hash Function (CRHF)). Consider a keyed hash function h_k(·): {0,1}^n × {0,1}* → {0,1}^n, defined for any n ∈ N. We say that h is collision-resistant if for every efficient (PPT) algorithm A, the advantage ε^CRHF_{h,A}(n) is negligible in n, i.e., ε^CRHF_{h,A}(n) ∈ NEGL(n), where:

ε^CRHF_{h,A}(n) ≡ Pr_{k←{0,1}^n}[ (x, x′) ← A(k) s.t. (x ≠ x′) ∧ (h_k(x) = h_k(x′)) ]    (3.2)

where the probability is taken over the random coin tosses of the adversary A and the random choice of k.

Let us now define a simple, insecure keyed hash function - specifically, h^sum_k - essentially, a keyed version of the h_sum hash function (Example 3.1).

Definition 3.3 (The keyed h^sum_k (insecure) hash function). Let k, x ∈ {0, 1, ..., 9}*. Then we define h^sum_k(x) as follows: h^sum_k(x) = h_sum(k || x).

The following exercise uses the simple h^sum_k hash function to demonstrate the CRHF definition.

Exercise 3.3. Show that h^sum_k is not a keyed CRHF. Hint: see Example 3.1 for guidance.

Figure 3.7: Target collision resistant (TCR) hash function: the adversary cannot find a target x for which it would be able to find a collision x′, once it is given the random key k.

Target Collision Resistant (TCR) vs. ACR / Keyed CRHF. Definition 3.2 uses the term keyed CRHF, following Damgård [111]. Another term for this definition is any collision resistance (ACR hash), proposed by Bellare and Rogaway in [45]. They preferred this term to emphasize that this definition allows the attacker to choose the specific collision as a function of the key, since the key is given to the attacker before the attacker outputs the entire collision (both x and x′ s.t. h_k(x) = h_k(x′)).

Bellare and Rogaway preferred the term ACR over the term 'keyed CRHF' to emphasize the difference from a weaker notion of collision resistance that they (and we) call⁴ Target Collision Resistant (TCR) hash. The term TCR emphasizes that, to 'win against' the TCR definition, the attacker has to first select the target x, i.e., one of the two colliding strings, before it receives the (random) key k. Only then is the attacker given the random key k, and it has to output the colliding string x′ s.t. h_k(x) = h_k(x′). Intuitively, this makes sense: it seems that in most applications, a collision between two 'random' strings x, x′ may not help the attacker; the attacker often needs to match some specific 'target' string x.
The TCR definition still allows the attacker to choose the target - but at least not as a function of the key! We next define target collision resistance, which we illustrate in Figure 3.7.

Definition 3.4 (Target collision resistant (TCR) hash). A keyed hash function h_k(·): {0,1}* × {0,1}* → {0,1}* is called a target collision-resistant (TCR) hash if for every efficient (PPT) algorithm A, the advantage ε^TCR_{h,A}(n) is negligible in n, i.e., smaller than any positive polynomial for sufficiently large n (as n → ∞), where:

ε^TCR_{h,A}(n) ≡ Pr_{k←{0,1}^n}[ x ← A(1^n); x′ ← A(x, k) s.t. (x ≠ x′) ∧ (h_k(x) = h_k(x′)) ]    (3.3)

where the probability is taken over the random coin tosses of A and the random choice of k.

⁴ TCR is a different name for a notion which was earlier defined by Naor and Yung in [295], under a different name: universal one-way hash functions.

Clearly, every keyed CRHF, i.e., Any-Collision-Resistant (ACR) hash, is also a Target-Collision-Resistant (TCR) hash function: if there is some value x for which the adversary can find a collision for a random key (with high probability), then surely the adversary can find some collision for a random key (with high probability) - e.g., a collision with that same x. However, the reverse appears, intuitively, unlikely: maybe it is possible to find a collision once given the key k, but not with a pre-committed value x? The following counterexample exercise/argument shows that this is indeed possible, i.e., there may be a keyed hash which is TCR but not a keyed CRHF (not an ACR hash).

Exercise 3.4. Let h_k(·) be a TCR hash function. Show a keyed hash function h′_k(·) which is also TCR but not a keyed CRHF (i.e., not an ACR hash).

Solution: Recall that the length of the key is n bits. Define h′_k(x) as follows:

h′_k(x) = 0^n if x[1:n] = k, and h′_k(x) = h_k(x) otherwise.

Namely, if the n most significant bits of the input x are the same as k, then h′_k(x) = 0^n; otherwise, h′_k(x) = h_k(x). Clearly, for any key k holds h′_k(k) = h′_k(k ++ 0) = 0^n. Recall that in the definition of a keyed CRHF (Definition 3.2), the adversary A is given the key k. Hence it is easy for A to output a collision, e.g., the pair (x, x′) where x = k and x′ = k ++ 0. Namely, h′ is not a keyed CRHF.

It remains to show that h′ is a TCR hash. In the TCR test, the key k is also chosen randomly; the difference is that the key is given to the adversary A only when A selects the second (colliding) input x′, and not earlier, when A selects the first input x. Since k is chosen uniformly at random, there is probability 1/2^n that A picks x whose n most significant bits are the same as k; and 1/2^n is negligible (in n). Therefore, with overwhelming probability, A selects x whose n most significant bits are not the same as k, and therefore h′_k(x) = h_k(x).

Assume, to the contrary, that, given k, the adversary A finds a collision for h′, i.e., x′ ≠ x such that h′_k(x) = h′_k(x′). However, with overwhelming probability h′_k(x) = h_k(x). Also, the n most significant bits of x′ cannot be the same as k (or h′_k(x′) would be 0^n); hence, h′_k(x′) = h_k(x′). Namely, we have found a collision for h: a pair x ≠ x′ such that h_k(x) = h_k(x′), in contradiction to h being a TCR hash. Hence, an adversary A that finds a target-collision for h′ would also find such a collision for h. Since h is a TCR hash, finding such collisions is infeasible; hence, h′ must also be a TCR hash.
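To make the counterexample of Exercise 3.4 concrete, here is a small sketch; the underlying keyed hash h_k is instantiated, for illustration only, as SHA-256(k ++ x), and h′_k outputs 0^n whenever the first n bits of its input equal the key. Once the (non-secret) key is known, a collision for h′ is immediate:

```python
import hashlib
import os

N = 32  # key and digest length in bytes (n = 256 bits)

def hk(k: bytes, x: bytes) -> bytes:
    """Underlying keyed hash (illustration only): SHA-256(k ++ x)."""
    return hashlib.sha256(k + x).digest()

def h_prime(k: bytes, x: bytes) -> bytes:
    """h'_k(x) = 0^n if the first n bits of x equal k, and h_k(x) otherwise."""
    if x[:N] == k:
        return bytes(N)
    return hk(k, x)

k = os.urandom(N)
x, x_prime = k, k + b"0"          # the trivial collision, available once k is known
assert x != x_prime and h_prime(k, x) == h_prime(k, x_prime)
print("collision for h' found, given the key k")
```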
If possible, it is preferable to design protocols which use a TCR keyed hash, rather than protocols requiring the use of a keyed CRHF (ACR hash). That is because, as we have seen, a keyed hash h may satisfy the (weaker) TCR requirement, but not the (stronger) ACR (keyed CRHF) requirement. Furthermore, for a protocol relying on a keyed CRHF rather than on a TCR hash, we must use a sufficient digest length to protect against the birthday attack, which we discuss in the following subsection.

3.2.4 Birthday and exhaustive attacks on CRHFs

Our definitions of collision resistance (for keyless and keyed hash functions, Definition 3.1 and Definition 3.2, respectively) place two significant requirements on the attacker. First, the attacker must find collisions using a Probabilistic Polynomial Time (PPT) algorithm. Second, the attacker must succeed in finding a collision with non-negligible probability. Let us explain why, without these requirements on the attacker, we cannot hope to achieve collision resistance. Both arguments hold for both keyless and keyed CRHFs; for simplicity, we focus on the keyless case.

An attacker can find collisions in exponential time in every hash function. Consider a hash function h: {0,1}* → {0,1}^n, and a set X containing 2^n + 1 distinct input binary strings. The output of h is the set of n-bit strings, which contains 2^n elements; hence, there must be at least two elements x ≠ x′ in the set X which collide, i.e., h(x) = h(x′). An adversary can surely compute h(x) for each of the 2^n + 1 elements in X, and find at least one such collision h(x) = h(x′). Of course, this attack requires 2^n + 1 computations of the hash function, i.e., its runtime is exponential in n. Hence, the definitions require the adversary against a CRHF to run in time polynomial in n.

A PPT attacker can find collisions with exponentially-small probability in every hash function. Consider the same set X as before, and an algorithm that selects two random elements of X; with small probability, this algorithm would output a collision x ≠ x′ s.t. h(x) = h(x′). Therefore, the definitions allow the adversary to have negligible probability of finding a collision.

The birthday paradox and attack on collision resistance. The argument above required the adversary to compute 2^n + 1 hash values, i.e., O(2^n). We next show that, actually, an adversary can find a collision, in any hash function, with only O(2^{n/2}) = O(√(2^n)) expected hash computations, rather than O(2^n). This attack is often called the birthday attack, since it is due to the so-called birthday paradox. Consider a room containing 23 persons. What is the probability of a collision, i.e., two people having a birthday on the same day of the year? Many people expect this probability to be quite small, but in reality, the probability is about half. To understand why this is true, notice that when a person is added to a room currently containing i persons (with no collisions), the probability of a collision with some person in the room is i/365, not 1/365. More precisely, the expected number q of messages {m_1, m_2, ..., m_q} which should be hashed before finding a collision h(m_i) = h(m_j) is approximately:

q ⪅ 2^{n/2} · √(π/2) ⪅ 1.254 · 2^{n/2}    (3.4)
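The birthday bound of Equation 3.4 is easy to observe experimentally on a truncated digest; the sketch below hashes random messages until two of them collide on the first n = 32 bits of SHA-256, which takes roughly 1.25 · 2^16 ≈ 82,000 hash computations on average:

```python
import hashlib
import os

def truncated_hash(m: bytes, n_bytes: int = 4) -> bytes:
    """SHA-256 truncated to n = 8 * n_bytes bits, so that collisions are findable."""
    return hashlib.sha256(m).digest()[:n_bytes]

def birthday_attack(n_bytes: int = 4):
    seen = {}                                    # digest -> message seen so far
    tries = 0
    while True:
        m = os.urandom(16)
        d = truncated_hash(m, n_bytes)
        tries += 1
        if d in seen and seen[d] != m:
            return seen[d], m, tries             # collision after ~1.25 * 2^(n/2) tries
        seen[d] = m

m1, m2, tries = birthday_attack()
assert m1 != m2 and truncated_hash(m1) == truncated_hash(m2)
print(f"collision found after {tries} hash computations")
```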
Hence, to ensure collision resistance against an adversary who can perform 2^q computations, e.g., 2^80 hash calculations, we need the digest length n to be roughly twice that size, e.g., 160 bits. Namely, the effective key length of a CRHF is only q = n/2. This motivates the fact that hash functions often have digest length twice the key length of shared-key cryptosystems used in the same system. Using a longer digest length and/or longer key length does not harm security, but may have performance implications.

Note that the birthday attack applies to both keyed CRHFs and keyless CRHFs; however, it does not apply to Target Collision Resistant (TCR) hash functions. Can you see why? Carefully compare Definition 3.2 vs. Definition 3.4, and you'll find out!

3.2.5 CRHF Applications (1): File Integrity

Collision resistance is a great tool for ensuring integrity. One common application is to distribute a (large) object m, e.g., a file containing the executable code of a program. Suppose the file m is distributed from its producer in LA to a user or repository in Washington DC (step 1 in Fig. 3.8). Next, a user in NY downloads the file from the repository (or peer user) in DC (step 2), receiving m′ (which should be the same as m, of course). To validate the integrity of the received file m′, the user also downloads the digest h(m) of the file, directly from the producer in LA (step 3), and then confirms that h(m) = h(m′) (by computing h(m′) locally). By downloading the large file m from the nearby DC rather than from LA, the transmission costs are reduced; by checking integrity using the digest h(m), we avoid the concern that the file was modified in DC, or modified in transit between LA and DC or between DC and NY. This method of download validation is deployed manually, by savvy users, or in an automated way, by operating systems, applications or a script running within a browser [161].

Figure 3.8: Example of use of hash function h to validate the integrity of a file m downloaded by a user in NY from an untrusted repository in DC. To validate integrity, the user downloads the (short) digest directly from the website of the producer, in LA. This reduces network overhead - and load on the producer's website - compared to downloading the entire file from the producer's website.

A potential remaining concern is modification of the digest h(m) received directly from the producer in LA, by a Man-in-the-Middle (MitM)⁵ attacker. This may be addressed in different ways, including the use of a secure web connection for retrieving h(m), as discussed in Chapter 7, receiving the digest from multiple independent sources, or receiving a signed digest. This last method is basically the same as the Hash-then-Sign method that we discuss next.

⁵ Also called Monster-in-the-Middle.

3.2.6 CRHF Applications (2): Hash-then-Sign (HtS)

Collision resistance is a powerful property; in particular, it facilitates one of the most important applications of cryptographic hash functions - the Hash-then-Sign (HtS) paradigm. The Hash-then-Sign paradigm is essential for efficient deployment of public-key digital signatures, which we introduced in subsection 1.5.1. We present constructions of signature schemes based on public key cryptosystems in Section 6.6; and in subsection 3.4.2 we discuss one-time signatures and present their constructions, based on one-way functions (OWFs).
However, both approaches result in signatures for limited-length inputs; furthermore, extending the input length would significantly further increase the already high overhead of signature computation and validation (Table 6.1). Real applications always use, instead, the Hash-then-Sign (HtS) construction, which we discuss here. It can be applied to either a keyed or a keyless hash; we mostly focus on the keyless-hash variant (or 'keyless HtS').

The Hash-then-Sign solution applies a hash function h(·) to the 'long' message m, and signs the (short) output h(m). Namely, given a signature scheme S defined for domain {0,1}^n and a hash function h with domain {0,1}* and range {0,1}^n (i.e., h: {0,1}* → {0,1}^n), we define the HtS scheme S^h_HtS as follows:

S^h_HtS.KG(1^n) ≡ S.KG(1^n)    (3.5)
S^h_HtS.Sign_s(m) ≡ S.Sign_s(h(m))    (3.6)
S^h_HtS.Verify_v(m, σ) ≡ S.Verify_v(h(m), σ)    (3.7)

The HtS scheme S^h_HtS may be applied to any binary string, i.e., its domain is {0,1}*. The reader may confirm that it is a correct signature scheme over {0,1}* (Exercise 4.21). Theorem 3.1 shows that if h: {0,1}* → {0,1}^n is a collision-resistant hash function (CRHF), as in Definition 3.1, and S is an existentially unforgeable signature scheme over {0,1}^n, then the HtS scheme S^h_HtS is an existentially unforgeable signature scheme over {0,1}*, i.e., applicable to arbitrary-length binary strings. Of course, the HtS method may fail if using an insecure hash function h; see Exercise 4.24.

Given a keyless CRHF h, HtS would be secure. The keyless Hash-then-Sign construction uses a keyless CRHF h: {0,1}* → {0,1}^n. Given a signature scheme (KG, S, V) whose input domain is (or includes) n-bit strings, i.e., {0,1}^n, the keyless Hash-then-Sign signature of any message m ∈ {0,1}* is defined as S^h_s(m) ≡ S_s(h(m)), where we use s for the private signing key of S (and of S^h). We next show that if h is a Collision-Resistant Hash Function (CRHF), and assuming that the signature scheme (S, V) is secure (existentially unforgeable, see Definition 1.6), then (S^h, V^h) is also secure.

Notice that Lemma 3.1 showed that keyless CRHFs do not exist at all, which makes this theorem useless as a basis for proofs of security of applications using a keyless hash function. However, the theorem is still useful, for two reasons. First, it is similar (and simpler) to the corresponding theorems for keyed CRHFs. Second, it justifies, at least intuitively, the common use of the Hash-then-Sign paradigm applied to a keyless hash function h.

Theorem 3.1 (Keyless Hash-then-Sign would be secure (existentially unforgeable)). Let (KG, S, V) be an existentially unforgeable signature scheme over the domain {0,1}^n, let h: {0,1}* → {0,1}^n be a keyless CRHF, and let S^h_HtS be the Hash-then-Sign signature scheme as defined in Equation 3.6. Then S^h_HtS is an existentially unforgeable signature scheme over the domain {0,1}*.

Proof: Assume that the claim does not hold, i.e., that there is an efficient adversary A ∈ PPT s.t. ε^{eu-Sign}_{S^h_HtS,A}(n) ∉ NEGL(n), as defined in Equation 1.2. Namely, with significant probability, A outputs a pair (m, σ) s.t. A did not provide m as input to the S^h_HtS.Sign_s(·) oracle, yet S^h_HtS.Verify_v(m, σ) holds. From Equation 3.7, S^h_HtS.Verify_v(m, σ) = S.Verify_v(h(m), σ). Let φ ≡ h(m); now, either there was another message m′ s.t. φ = h(m′) which A did provide as input to the S^h_HtS.Sign_s(·) oracle, or not.
Let us consider both cases; at least one of the two must occur with significant probability.

If there is another message m′ s.t. φ = h(m′) which A provided as input to S^h_HtS.Sign_s(·), then the pair (m, m′), both of which were produced by A, is a collision for h. This should be infeasible to find efficiently, since h is assumed to be a CRHF.

If there was no such m′, then we can use A to construct an adversary A′ that will find a forgery for the original signature scheme (KG, S, V). The adversary A′ runs A, and whenever A makes an oracle query m, A′ computes φ = h(m), makes the query for S_s(φ), and returns the result to A - obviously, this would be the expected value. Finally, when A returns the forgery (m, σ) for S^h_HtS, then A′ computes h(m) and returns (h(m), σ), which is the corresponding forgery for S. The existence of such a forgery contradicts the assumption that (KG, S, V) is an existentially unforgeable signature scheme.

Hence, there is no efficient adversary A that 'wins' against S^h_HtS, i.e., S^h_HtS is an existentially unforgeable signature scheme over the domain {0,1}*, as claimed.

Two Keyed-Hash HtS Constructions: Keyed-HtS and TCR-HtS. We now present two Hash-then-Sign (HtS) constructions from keyed hash functions. We begin with a very simple construction by Damgård [111], which we refer to as Keyed-HtS, as it is based on the use of a keyed CRHF. This construction is identical to the keyless Hash-then-Sign construction, except for the use of a keyed hash h_k(·): {0,1}^n × {0,1}* → {0,1}^n, defined for arbitrary key and digest length n ∈ N. The hash key is selected once, during the key-generation process, and becomes part of the public verification key and of the private signing key. We define the Keyed-HtS construction S^h_HtS as follows.

Definition 3.5 (The Keyed-HtS construction). Given a signature scheme S with domain {0,1}^n and a keyed hash (CRHF) h_k(·): {0,1}^n × {0,1}* → {0,1}^n, the Keyed-HtS signature using signature S and keyed hash h is defined as follows:

S^h_HtS.KG(1^n) ≡ [ (s, v) ←$ S.KG(1^n); k ←$ {0,1}^n; return ((s, k), (v, k)) ]    (3.8)
S^h_HtS.Sign_{(s,k)}(m) ≡ S.Sign_s(h_k(m))    (3.9)
S^h_HtS.Verify_{(v,k)}(m, σ) ≡ S.Verify_v(h_k(m), σ)    (3.10)

The Keyed-HtS construction requires the underlying keyed hash function to satisfy a relatively strong requirement, namely, that it be infeasible to find any collision (what we referred to as an ACR or keyed-CRHF hash). Bellare and Rogaway show, in [45], the almost-as-simple TCR-HtS construction. The TCR-HtS construction requires only the weaker target-collision-resistant (TCR) keyed hash function. For discussion and comparison of these two definitions, see subsection 3.2.3; in particular, the generic birthday attack is applicable against the keyed-CRHF property, but not against the TCR property, hence a hash function may be a secure TCR hash even when it uses significantly shorter digests (about half of the bits required for a secure keyed CRHF).

The TCR-HtS construction is similar to the keyless Hash-then-Sign construction (and to the Keyed-HtS construction); there are basically two differences. The first difference is obvious: we use a keyed hash h_k(·): {0,1}^n × {0,1}* → {0,1}^n, i.e., a hash function which receives two binary-string inputs, a key k and a message, and outputs a binary string. The second difference is in the construction: we select and transmit the hash key with each signature.
Namely, the hash key is selected, randomly, as part of each signing operation, and is sent together with the output of the underlying signature function. We define the TCR-HtS construction S^h_TCR-HtS as follows.

Definition 3.6 (The TCR-HtS construction). Given a signature scheme S with domain {0,1}^n, defined for any integer n, and a keyed hash h_k(·): {0,1}^n × {0,1}* → {0,1}^n, the TCR-HtS signature using signature S and keyed hash h is defined as follows:

S^h_TCR-HtS.KG(1^n) ≡ S.KG(1^n)    (3.11)
S^h_TCR-HtS.Sign_s(m) ≡ [ k ←$ {0,1}^n; σ ← S.Sign_s(k ++ h_k(m)); return (k, σ) ]    (3.12)
S^h_TCR-HtS.Verify_v(m, (k, σ)) ≡ S.Verify_v(k ++ h_k(m), σ)    (3.13)

Both the Keyed-HtS and the TCR-HtS constructions result in a secure, existentially-unforgeable signature scheme for unbounded input length - provided that the underlying hash function satisfies the required property. This property, however, differs between the two constructions. The Keyed-HtS construction requires a keyed CRHF, a relatively strong requirement, while the TCR-HtS construction only makes the (weaker) requirement of a TCR hash function.

Theorem 3.2 (Keyed Hash-then-Sign is secure (existentially unforgeable)). Let (KG, S, V) be an existentially unforgeable signature scheme over the domain {0,1}^n, and let h_k(·): {0,1}^n × {0,1}* → {0,1}^n be a keyed hash function. Let S^h_HtS be the Keyed-HtS signature scheme, as defined in Equations 3.8-3.10, and let S^h_TCR-HtS be the TCR-HtS signature scheme, as defined in Equations 3.11-3.13; both schemes are defined for arbitrary-length input strings. Then:

1. If h is a keyed collision-resistant hash function (CRHF), then S^h_HtS is an existentially unforgeable signature scheme.

2. If h is a Target Collision-Resistant (TCR) hash function, then S^h_TCR-HtS is an existentially unforgeable signature scheme.

Proof: see [45, 111].

Unfortunately, both constructions involve challenges for deployment; namely, they may necessitate significant changes in existing systems designed for Hash-then-Sign using a keyless hash. The Keyed-HtS construction requires distribution of a longer public key, since the random hash key becomes part of the public key; and the TCR-HtS construction requires a longer signature (k, σ), i.e., the signature has to include also the random key chosen for the hash function. Several papers propose alternative HtS constructions with provable security properties but without such deployment challenges, some of them also based on keyless hash functions; for example, see [314].

3.3 Second-Preimage Resistance (SPR) Hash Functions

The second property we introduce is second-preimage resistance (SPR). We define SPR only for keyless hash functions, although it can also be defined for keyed hash functions [340].

Figure 3.9: Second-preimage resistance (SPR): given a keyless hash function h: {0,1}* → {0,1}^n, for any input length l ≥ n selected by the adversary (1^l ← A(1^n)), given a random first preimage x ∈ {0,1}^l, it is hard to find a collision with x, i.e., a second preimage x′ ∈ {0,1}* s.t. x′ ≠ x yet h(x) = h(x′).

Intuitively, a Second-Preimage Resistant (SPR) hash function h accepts one input, an arbitrary-length binary string m ∈ {0,1}*, outputs an n-bit binary string h(m) ∈ {0,1}^n, and satisfies the SPR property.
The SPR property means, intuitively, that an efficient (PPT) adversary A has negligible probability, when given a random binary string x, to őnd a collision to x, i.e., a different string x′ = ̸ x which has the same hash: h(x′ ) = h(x). We refer to x′ as the second preimage, and to x as the random (őrst) preimage, since they are both preimages of h(x′ ) = h(x). We illustrate the SPR property in Figure 3.9. The reader will notice that we let the adversary select the length l of the random (and őrst) preimage x. By selecting the length of x őrst, we allow x to be a uniformly-random string from the (őnite) set {0, 1}l ); note that we can not select a uniformly-random element from an inőnite set, i.e., we can’t select a uniformly-random string from {0, 1}∗ (see Section A.3). The set {0, 1}l is a natural choice of a őnite set, since it contains all binary strings of length l; we can select a random string by ŕipping l fair coins - giving probability 21l to each of the 2l strings in the set {0, 1}l . We let the adversary select l, which seems prudent - why limit the adversary to preimage of speciőc length? However, we must prevent the adversary from choosing l which would be ‘too long’, since the adversary, as an efficient (PPT) algorithm, is allowed runtime which is polynomial in the length of its inputs, $ which includes the l bit x ← {0, 1}l . For example, if we let the adversary choose l = 2n , then x would have 2n bits, and, when given x and asked to choose x′ , the adversary will be allowed runtime polynomial in 2n , i.e., exponential in n, and to őnd, with high probability, a collision, e.g., by computing h(x′ ) for all values of x′ ∈ {0, 1}n+1 . To ensure that the entire adversary’s runtime is polynomial in n, and in particular that l, the length of x, is polynomial in n, we require the adversary to output l in unary, i.e., as a string 1l consisting of l bits whose value is 1. The deőnition follows. Definition 3.7 (Second-preimage resistance (SPR) Hash Function). A (keyless) hash function h : {0, 1}∗ → {0, 1}n is second-preimage resistant (SPR) if for R every efficient (PPT) algorithm A, the advantage εSP h,A (n) is negligible in n, i.e., smaller than any positive polynomial for sufficiently large n (as n → ∞), Applied Introduction to Cryptography and Cybersecurity 182 where: CHAPTER 3. INTEGRITY: FROM HASHING TO BLOCKCHAINS   R  εSP h,A (n) ≡ Pr   1l ← A(1n )   $   x← {0, 1}l    ′ x ← A(x) ′ ′ x ̸= x ∧ h(x) = h(x ) (3.14) Where the probability is taken over the choice of x and the random coin tosses of A. SPR is sometimes referred to as weak collision resistance, and indeed, as the reader can prove, every CRHF is also an SPR hash function. However, Exercise 3.5 shows that there may be an SPR which is not a CRHF. Indeed, while subsection 3.2.2 shows that there is no keyless CRHF, it is possible, and commonly believed, that (keyless) SPR hash functions do exist. In practice, collision attacks, but not second-preimage attacks, are known against the SHA1 and MD5 standard hash functions. Exercise 3.5. Let h be an SPR hash function. Use h to construct another hash function, h′ , which you will show to be (1) an SPR (like h), but (2) not a CRHF. Therefore, whenever possible, we should design protocols and systems to require only an SPR hash function, rather than a CRHF. Indeed, the SPR property suffices for some protocols and applications. 
For example, the SPR property suffices to authenticate the integrity of a file downloaded from an untrusted repository, whose hash is signed by the (trusted) developer, as in Figure 3.8. However, other important applications require collision resistance, and would be insecure if using a hash function which is only second-preimage resistant (SPR). Importantly, the SPR property is not sufficient for Hash-then-Sign (HtS), as we discuss in the next subsection.

Exercise 3.6. Explain, intuitively, why the SPR property suffices to authenticate the integrity of a file downloaded from an untrusted repository, whose hash is signed by the (trusted) developer; and why your explanation does not apply to the use of such a hash for the Hash-then-Sign construction.

Although the SPR property suffices for some applications, and we know that there is no keyless hash function which is collision-resistant, it is still better to avoid any use of cryptographic hash functions for which collisions have been found. This protects against the common case, where a designer incorrectly believes that an SPR hash suffices, while the system is actually vulnerable to non-SPR collision attacks; and it also protects against implementation errors which use the SPR hash function where the design called for a CRHF.

3.3.1 The Chosen-Prefix Collisions Vulnerability

Theorem 3.1 shows that Hash-then-Sign is secure when used with a CRHF. But would the weaker SPR property suffice to ensure security using the Hash-then-Sign (HtS) paradigm, i.e., using the (keyless) HtS construction of Equation 3.7? In this subsection we show that such SPR-HtS, i.e., use of a hash function which is second-preimage resistant but not collision resistant, may result in a significant vulnerability.

Let us begin with an arguably less-significant vulnerability, showing that SPR-HtS would not ensure existential unforgeability. This requires only the following observation. Consider a (keyless) hash function which is SPR but not a CRHF. Namely, an adversary A may know a collision h(m) = h(m′), although A cannot efficiently find a collision to a randomly-chosen message m_R. Now, consider the HtS signature (Equation 3.6) over m:

S^h_HtS.Sign_s(m) = S.Sign_s(h(m)) = S.Sign_s(h(m′)) = S^h_HtS.Sign_s(m′)

We conclude that S^h_HtS is not an existentially-unforgeable signature scheme (Definition 1.6).

However, could it be that S^h_HtS is 'secure enough' for practical applications? Even if A can find some collision h(m) = h(m′), for some 'random' strings m, m′, how would the attacker convince the signer to sign m, and why should the alternative message m′ be of (significant) value to the attacker? In short: is there a clearly realistic attack which may be possible against an SPR hash h (although this attack fails against a CRHF)? We next show that this is indeed the case by presenting such an attack, exploiting a realistic vulnerability: the chosen-prefix vulnerability, which we next define.

Definition 3.8 (The chosen-prefix collisions vulnerability). A hash function h is said to have the chosen-prefix vulnerability if there is an efficient (PPT) collision-finding algorithm CF s.t., given a (prefix) string p ∈ {0,1}*, the algorithm CF efficiently outputs, with high probability, a collision, i.e., a pair of strings x, x′ ∈ {0,1}*, s.t. for any (suffix) string s ∈ {0,1}* holds h(p ++ x ++ s) = h(p ++ x′ ++ s). Namely:

(∀p) w.h.p.: (x, x′) ← CF(p) s.t. (x ≠ x′) ∧ (∀s)( h(p ++ x ++ s) = h(p ++ x′ ++ s) )    (3.15)
Note that the fact that the collisions hold for any common suffix s is due to the iterative design of many keyless hash functions. Due to this design, if there is a collision for prefix p between x and x′, i.e., x ≠ x′ yet h(p ++ x) = h(p ++ x′), then there is also a collision, for any suffix s, between p ++ x ++ s and p ++ x′ ++ s. Namely, (∀s) h(p ++ x ++ s) = h(p ++ x′ ++ s). In particular, the Merkle-Damgård construction, which is used by many hash functions, has this iterative design, and hence has this property. We discuss this construction in Section 3.9.

Chosen-prefix attacks are practical. The chosen-prefix vulnerability is a realistic concern; in fact, such vulnerabilities were found for widely-used (at the time) standard hash functions including RIPEMD, MD4, and MD5 in [367], and later also for SHA-1 [262], all of which use the Merkle-Damgård construction. Due to this vulnerability, these hash functions are considered insecure and are replaced, in new designs and, when possible, in existing systems, with cryptographic hash functions that are not known to have this or other vulnerabilities, such as SHA-2 and SHA-3 [114].

We next show how the chosen-prefix collisions vulnerability facilitates a realistic attack on the Hash-then-Sign paradigm. This attack allows an attacker to trick users into signing what appears to a third party to be a statement (e.g., a money transfer) that the user never intended to sign. For more elaborate attacks, which also allow forgery of public key certificates, see Chapter 8 and [262, 367, 368].

Chosen-prefix attack on Hash-then-Sign: simplified version. We begin by presenting a simplified version of the chosen-prefix attack on the Hash-then-Sign paradigm. In this version, an attacker, say Mal, uses the chosen-prefix attack to find a pair of strings (x, x′) which collide when appended to the (chosen) prefix 'Pay $'. Namely, the pair (x, x′) satisfies x ≠ x′ and h('Pay $' ++ x ++ s) = h('Pay $' ++ x′ ++ s) (for any suffix s).

The main simplification we make is to assume that Mal can deposit a 'payment order' of the form 'Pay $' ++ x ++ ' to Mal', where x is a binary string interpreted as an integer. Obviously, reality is more complex, e.g., if the payment order is a document, the amount x should also be encoded in printable characters, typically in ASCII. Another simplification is to assume that, as binary numbers, x ≪ x′.

Mal now sets up an online shop and offers for sale an item whose market value is $y, where x < y ≪ x′. Mal offers the item for only $x - a real bargain! But what is really going to happen? Alice comes along and happily buys the item, by sending to Mal a signed payment order to her bank, ready to be deposited at the bank. Namely, Alice sends to Mal the pair (PO, σ) where σ = Sign_{A.s}(h(PO)) and PO = 'Pay $x to Mal'. Alice expects to be charged $x after Mal deposits this signed payment order (PO, σ). However, to her chagrin, she finds that she was charged $x′ ≫ $x - a significantly higher amount. Mal has tricked Alice, by depositing the forged payment order PO′ = 'Pay $x′ to Mal', together with σ, Alice's signature on PO (σ = Sign_{A.s}(h(PO))).
The bank will honor this transaction, and charge Alice $x′, since σ is also a valid signature for PO′, as:

h(PO) = h('Pay $x to Mal') = h('Pay $x′ to Mal') = h(PO′)

Hence, the same signature generated by Alice appears to the bank to be a valid signature over the 'fake' payment order PO′ - and the bank transfers to Mal the larger amount $x′ ≫ $x!

Recall now the simplifications we made in this description of the attack. In particular, banks will not accept a payment order where amounts are indicated as binary numbers; typically, the entire payment order should be encoded using printable characters, e.g., using the ASCII code; more readable formats, such as PDF or HTML, are even more likely. We next discuss a more realistic variant of the attack, which works for payment orders encoded using PDF or HTML.

A more realistic chosen-prefix attack: signing PDF documents. We now improve the chosen-prefix attack to allow forgery of signatures over documents formatted in 'rich' markup languages like PDF, PostScript and HTML. The attacker, Mal, exploits the fact that these (and similar) languages allow documents to contain conditional rendering statements, allowing the document to display different content depending on different conditions.

In the attack, Mal uses the conditional-rendering capability to create two documents D1, DM that have the same hash value, h(D1) = h(DM), but which, when rendered by the correct viewer, e.g., a PDF viewer, are rendered very differently. Namely, viewing D1, the reader displays the text t1 = 'Pay $1 to Amazon', while viewing DM, the reader displays the text tM = 'Pay $1,000,000 to Mal'. The rest of the contents, and even the details of the markup language used, do not materially change the attack, so we ignore them.

Mal creates these two documents as follows. First, the documents share a common prefix and suffix: D1 = p ++ x1 ++ s, DM = p ++ xM ++ s. The prefix p consists of headers and preliminaries as required by the markup language, e.g., %PDF for PDF, or <!DOCTYPE html> for HTML, followed by the 'if' statement in the appropriate syntax. Simplifying, let us say that p = 'if '. Mal next applies the collision-finding algorithm CF (Definition 3.8) to find a collision for prefix p, namely: (x1, xM) ← CF(p). For every suffix s holds: h(p ++ x1 ++ s) = h(p ++ xM ++ s). To complete D1 and DM, Mal sets the suffix s to the string:

s ← '=' ++ x1 ++ ' then display ' ++ t1 ++ ', else display ' ++ tM

Mal is now ready to launch the attack on Alice, similarly to the simplified attack above. Namely, Mal first sends D1 to Alice, who views it and sees the rendering t1. Let us assume that Alice agrees to pay $1 to Amazon, and hence signs D1, i.e., computes σ = S_{A.s}(h(D1)) and sends (D1, σ) back to Mal. Mal forwards the modified message (DM, σ) to the bank. The bank validates the signature, which is Ok since h(D1) = h(p ++ x1 ++ s) = h(p ++ xM ++ s) = h(DM). The bank then views DM, and sees:

tM = 'Pay $1,000,000 to Mal'

As a result, the bank transfers one million dollars from Alice to Mal.

Of course, some work is required to actually deploy the above attack; in particular, it isn't trivial to handle PDF files.
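The following sketch shows, schematically, how Mal could assemble the two documents. It assumes a hypothetical chosen-prefix collision finder CF for the target hash; the HTML/JavaScript framing, the helper name build_documents, and the simplifying assumption that the collision blocks consist of printable characters (so they can sit inside a string literal) are ours, and a real attack must instead handle raw collision blocks and the format's syntax rules.

def build_documents(CF):
    """Assemble D1 = p ++ x1 ++ s and DM = p ++ xM ++ s, sharing prefix p
    and suffix s; s contains the conditional-rendering statement."""
    t1 = 'Pay $1 to Amazon'          # what Alice is shown and agrees to sign
    tM = 'Pay $1,000,000 to Mal'     # what the bank is shown
    p = '<!DOCTYPE html><script>var x="'        # headers + start of the 'if'
    x1, xM = CF(p)                               # hypothetical collision finder
    # The common suffix s compares the embedded value against x1 and
    # displays t1 or tM accordingly.
    s = ('";document.write(x=="' + x1 + '" ? "' + t1 + '" : "' + tM +
         '");</script>')
    return p + x1 + s, p + xM + s                # D1 renders t1, DM renders tM

# Dummy stand-in so the sketch runs; a real CF exploits the hash's weakness.
D1, DM = build_documents(lambda p: ('AAAA', 'BBBB'))

Since h(p ++ x1 ++ s) = h(p ++ xM ++ s), Alice's signature on D1 verifies for DM as well.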
The following exercise challenges the reader to find a similar attack against HTML files; this should not be too difficult, and we hope it will be fun and instructive. Your solution may assume that the browser displaying the HTML file supports JavaScript applets.

Exercise 3.7. Consider the hash function h(x_1 ++ x_2 ++ ... ++ x_l) = Σ_{i=1}^{l} x_i mod p, where each x_i is 64 bits and p is a 64-bit prime. (a) Is h an SPR hash function? A CRHF? (b) Present a collision-finding algorithm CF for h. (c) Create two HTML files D1, DM as above, i.e., s.t. h(D1) = h(DM), yet when they are viewed in a browser, they display the texts t1, tM as above.

3.4 One-Way Functions, aka Preimage Resistance

The third security property we discuss for cryptographic hash functions is called preimage resistance or the One-Way Function (OWF) property. Intuitively, h is a one-way function if, given h(x) for a (sufficiently long) random preimage x, it is infeasible to find either x or any other preimage x′ of h, i.e., s.t. h(x′) = h(x). This is illustrated in Figure 3.10.

Figure 3.10: Intuition of the One-Way Function property, aka a preimage-resistant hash function: given h(x) for a (sufficiently long) random preimage x, it is infeasible to find x, or any other preimage x′ of h, i.e., s.t. h(x′) = h(x). Details in Definition 3.9.

We prefer the term One-Way Function (OWF) to the term preimage-resistant hash, since OWF emphasizes the 'one-way' property: computing h(x) is easy, but 'inverting' it to find x, or a colliding preimage x′ s.t. h(x′) = h(x), is hard. The definition follows. Note that the input x is selected as an l > n bits string, where l is selected by the adversary (in unary). This selection process is similar to the one used in the definition of a second-preimage resistant (SPR) hash, and for the same reasons: to ensure that the adversary is limited to runtime polynomial in n, and to allow random choice of x (from a finite domain).

Definition 3.9 (One-Way Function (OWF), aka preimage resistance). An efficient function h, with range {0,1}^n, is called preimage resistant, or a one-way function, if for every efficient algorithm A ∈ PPT, the advantage ε^{OWF}_{h,A}(n) is negligible in n, i.e., smaller than any positive polynomial for sufficiently large n (as n → ∞), where:

ε^{OWF}_{h,A}(n) ≡ Pr[ h(A(h(x))) = h(x) : 1^l ← A(1^n) ; x ←$ {0,1}^l ]    (3.16)

where the probability is taken over the random coin tosses of A and over the choice x ←$ {0,1}^l.

The definition of an OWF requires the input x to be selected as a random l-bit string. This raises the following question: suppose h is an OWF (as in Definition 3.9), but we select the input x as a random member of some other set (not {0,1}^l), e.g., a random text from a collection of numerous texts. Is it possible that the attacker will be able to find x, or another preimage x′ s.t. h(x) = h(x′)?

We next present two important applications of one-way functions: the OTPw (One-Time Password) Authentication Scheme and a one-time signature scheme.
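Before turning to these applications, here is a small illustrative experiment relating to Definition 3.9 (not part of the formal treatment): we truncate SHA-256 to n bits and brute-force a preimage of h(x) for a random x. The truncation, parameter values and function names are our own choices; the point is that the search succeeds quickly for tiny n, while the expected work grows roughly like 2^n, which is why the definition requires the advantage to be negligible in n.

import hashlib, secrets

def h(data: bytes, n_bits: int) -> int:
    """SHA-256 truncated to n_bits (an illustrative toy parameterization)."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), 'big')
    return digest >> (256 - n_bits)

def invert(target: int, n_bits: int) -> bytes:
    """Brute-force adversary: try counter values until some preimage matches."""
    counter = 0
    while True:
        guess = counter.to_bytes(8, 'big')
        if h(guess, n_bits) == target:
            return guess
        counter += 1

n_bits = 16                                 # tiny digest, so inversion is easy
x = secrets.token_bytes(16)                 # random preimage x
x_prime = invert(h(x, n_bits), n_bits)      # some preimage (rarely x itself)
assert h(x_prime, n_bits) == h(x, n_bits)
print('found a preimage by brute force; infeasible for, say, n = 256')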
3.4.1 The OTPw (One-Time Password) Authentication Scheme

Passwords are the most well-known and widely-used method for user authentication, which we discuss in Chapter 9. In particular, we discuss designs for improved-security password systems, and designs for user-authentication mechanisms which are not based on passwords. Several of these designs are based on the OTPw (One Time Password) Authentication Scheme (footnote 6), proposed already in 1974 [143].

Footnote 6: The idea is often referred to as OTP; however, OTP is also used to refer to the completely different notion of a one-time pad (Section 2.4). Note also that the term One-Time Password is often used for other user-authentication designs, including the hash chain, which is an extension of OTPw, and other designs which are less related; see Chapter 9.

The OTPw design uses a one-way function; indeed, it may be the earliest, or one of the earliest, publications introducing the concept of one-way functions. It is also a nice, simple example of OWFs and their applications, so it is natural for us to use it to introduce one-way functions. The design uses a one-way function which we denote by h: {0,1}^* → {0,1}^n.

In the OTPw scheme, each user, say Alice (denoted by A), selects her password, denoted PW_A, as a random n-bit string, PW_A ←$ {0,1}^n, and computes HPW_A ← h(PW_A). The server receives and saves HPW_A. To authenticate herself, Alice sends PW_A to the server; the server computes h(PW_A) and verifies that it is the same as HPW_A.

The one-time password design offers an important advantage over the naive way of using passwords, where the passwords are kept 'as is' (in plaintext) in the password file. The advantage is that the authentication is secure even if an attacker is able to obtain the value of the validation token HPW_A. Indeed, this design is the key to the popular designs of hashed and salted password files, which we discuss in subsection 9.4.2. If h is a one-way function (OWF), then, from Definition 3.9, an efficient (PPT) adversary cannot find the random preimage PW_A, even if given the validation token HPW_A. Hence, the one-time password scheme is secure: if Bob finds that HPW_A = h(PW_A), this means that PW_A is the correct OTPw. If Alice kept PW_A securely and only discloses it to authenticate herself to Bob, then Bob knows this also means that Alice has initiated the authentication (by disclosing PW_A).

Of course, an attacker which obtains the one-time password PW_A could try to abuse it; therefore, if the communication is not properly protected (encrypted), Bob should only accept PW_A once, i.e., 'one time'. Note that the OTPw scheme requires the transmission of the password to be done over a secure (encrypted) channel; if an attacker can eavesdrop on the password, then security is trivially broken. A secure connection is also required if the user wishes to replace the password (after authenticating successfully).

Another concern with the scheme is that its security depends on the use of random passwords; if the password is contained in a dictionary of common passwords, then the attacker may find the password PW_A from HPW_A by hashing the different passwords in the dictionary - an offline dictionary attack (subsection 9.3.5). In subsection 9.4.2 we present improvements to the OTPw scheme that provide better protection in the case of human, non-random passwords; and in subsection 9.6.2 we discuss other one-time password schemes that provide better security, including against an eavesdropping adversary.
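The following minimal sketch illustrates the OTPw scheme just described, with SHA-256 standing in for the one-way function h (an illustrative choice, not mandated by the text); the function and variable names are ours.

import hashlib, secrets

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

# Registration: Alice picks a random password; the server stores only h(PW_A).
PW_A = secrets.token_bytes(32)        # PW_A <-$ {0,1}^n, here n = 256
server_token = h(PW_A)                # HPW_A, the validation token

# Authentication: Alice discloses PW_A (over a protected channel);
# the server recomputes the hash and compares it to the stored token.
def authenticate(candidate: bytes) -> bool:
    return secrets.compare_digest(h(candidate), server_token)

assert authenticate(PW_A)                          # Alice succeeds
assert not authenticate(secrets.token_bytes(32))   # a random guess fails (w.h.p.)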
In the next subsection, we move to a different, albeit related, application of one-way functions: the one-time signature scheme.

3.4.2 Using OWF for One-Time Signatures

We next show how to use one-way functions to implement a public-key signature scheme which can be used to sign only one message. We call such a scheme a one-time signature scheme; the term is often applied also to extensions which allow a limited number of signature operations. Note that our definition of signature schemes (subsection 1.5.1) did not allow a restriction on the number of applications of the signature scheme; therefore, one-time signature schemes are not within the scope of subsection 1.5.1. However, it is not hard to extend the definitions in subsection 1.5.1 so that they support a limited number of applications; the reader may do this as an exercise.

In spite of their obvious limitations, one-time signatures also have important advantages in security and performance, making them a good choice for many applications. These advantages are in comparison to the widely-used public-key signature schemes such as RSA and DSA, whose security is based on the hardness of factoring (RSA) or of computing discrete logarithms (DSA); we discuss RSA, factoring and discrete-log in Chapter 6. Let us discuss these advantages of one-time signatures.

The main security advantage of one-time signatures is that they are not (too) vulnerable to the potential availability to an attacker of a quantum computing device. This is in contrast to factoring, discrete logarithms and many other computational problems, which may be efficiently solved (broken) using an appropriate quantum computer; we briefly discuss the impact of quantum computing in Section 10.4. Signature schemes based on computational assumptions (factoring, discrete logarithm) are also subject to other possible improvements in the algorithms for solving these problems. One-time signature schemes are also subject to vulnerabilities of the underlying hash function, but changing to another function/scheme, as well as using a significant margin (longer digests), would be easier for hash functions than for signatures. And a quantum computer would provide a much smaller advantage against hash functions, compared to its impact on factoring and discrete logarithms.

The efficiency advantage of one-time signatures is their low computational overhead compared to regular signature schemes. This advantage is further increased if longer key sizes are used for the public-key signatures, to ensure security against a future quantum computer, and against algorithmic advances in solving the underlying 'hard' problem. Note, however, that one-time signatures require rather long public keys and signatures. Of course, the overhead depends on the scheme used. We present a simple scheme; some other schemes require shorter public keys and signatures.

We present the construction in three steps, gradually improving performance.

First One-Time Signature construction: signing a single bit. Figure 3.11 presents a one-time signature scheme which is defined only for the case of a single-bit message.

Figure 3.11: A one-time signature scheme, limited to signing only a single bit b.

This scheme is, basically, an extension of the basic one-time password scheme of subsection 3.4.1. The private key s simply consists of two random strings s_0, s_1, while the public key consists of the hashes of these strings: v_0 = h(s_0), v_1 = h(s_1). To sign a bit b, we simply send σ = s_b; to validate an incoming bit b and its purported signature σ, we validate that v_b = h(σ). Note that the pair of values v = (v_0, v_1) is the public key, and is not secret; only the pair s = (s_0, s_1) is secret, and we disclose s_b upon signing bit b.
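A minimal sketch of this single-bit one-time signature scheme, with SHA-256 standing in for h (an illustrative choice); the function names are ours.

import hashlib, secrets

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def keygen():
    s = (secrets.token_bytes(32), secrets.token_bytes(32))   # private: (s0, s1)
    v = (h(s[0]), h(s[1]))                                    # public:  (v0, v1)
    return s, v

def sign(s, b: int) -> bytes:
    return s[b]                     # disclose s_b; the key must not be reused

def verify(v, b: int, sigma: bytes) -> bool:
    return h(sigma) == v[b]         # check that v_b = h(sigma)

s, v = keygen()
sigma = sign(s, 1)
assert verify(v, 1, sigma) and not verify(v, 0, sigma)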
Second One-Time Signature construction: naïve signing of an l-bit string. We next extend the scheme to allow a one-time signature of an l-bit string d, as illustrated in Figure 3.12. (We use the symbol d for the string which is signed, since we will extend this scheme in the next step, where we use d for the digest of a longer message.) To allow signing of l-bit strings, this scheme basically amounts to l applications of the one-bit one-time signature scheme illustrated in Figure 3.11.

Figure 3.12: A one-time signature scheme, for an l-bit string (denoted d).

The private signing key of this scheme consists of a set of l pairs of random strings, denoted {(s^i_0, s^i_1)}_{i=1}^{l}; and the public validation key is the corresponding set of l pairs of strings, denoted {(v^i_0, v^i_1)}_{i=1}^{l}, computed as: (∀i ∈ {1, ..., l} and b ∈ {0,1}) v^i_b = h(s^i_b). The signature over the binary string d = d_1 ++ ... ++ d_l is the set {s^1_{d_1}, ..., s^l_{d_l}}. To validate that σ = σ_1 ++ ... ++ σ_l is a valid signature of the l-bit string d = d_1 ++ ... ++ d_l, confirm that for every i between 1 and l holds: v^i_{d_i} = h(σ_i).

Third (and final) One-Time Signature construction: using Hash-then-Sign for an efficient one-time signature of an arbitrary-length string m. Finally, we extend the one-time signature scheme further, to efficiently sign arbitrary-length input strings m. This extension, illustrated in Figure 3.13, uses the Hash-then-Sign paradigm. Namely, we first compute the l-bit digest d of the message, as d = h′(m), where h′ denotes a CRHF; note that we denote the CRHF by h′ rather than h, since h and h′ can be different hash functions. After computing d = h′(m), we apply the one-time signature scheme of Figure 3.12.

Figure 3.13: A one-time signature scheme, for a variable-length message m, using 'Hash-then-Sign'.

Note that we described the scheme for a keyless CRHF h′. We leave it to the reader to modify the design for the case when h′ is a keyed CRHF as in Definition 3.5, or a TCR hash function as in Definition 3.6.
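A minimal sketch of the third construction: a Lamport-style one-time signature over an arbitrary-length message, using hash-then-sign. SHA-256 plays both h (for the key pairs) and h′ (the CRHF); this reuse, the parameter values and the function names are illustrative choices of ours, not mandated by the text.

import hashlib, secrets

L = 256                                              # digest length l, in bits

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def bits(m: bytes):
    d = h(m)                                         # d = h'(m): hash-then-sign
    return [(d[i // 8] >> (7 - i % 8)) & 1 for i in range(L)]

def keygen():
    s = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(L)]
    v = [(h(s0), h(s1)) for (s0, s1) in s]           # v^i_b = h(s^i_b)
    return s, v

def sign(s, m: bytes):
    return [s[i][b] for i, b in enumerate(bits(m))]  # reveal s^i_{d_i}

def verify(v, m: bytes, sigma) -> bool:
    return all(h(sigma[i]) == v[i][b] for i, b in enumerate(bits(m)))

s, v = keygen()
msg = b'Pay $1 to Amazon'
sigma = sign(s, msg)                                 # sign once, then discard s
assert verify(v, msg, sigma)
assert not verify(v, b'Pay $1,000,000 to Mal', sigma)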
3.5 Randomness Extraction and Key Derivation Functions

Security and cryptography use randomness for many mechanisms, including encryption, key-generation, and challenge-response authentication. A source of true, perfect randomness is often unavailable; systems often rely on imperfect sources of randomness, such as measurements of delays of different physical actions. In this section, we discuss two related cryptographic tools to deal with this challenge: randomness-extractor hash functions and key derivation functions (KDFs).

Intuitively, a randomness extractor (or simply an extractor) hash is a (keyless or keyed) hash function h whose output h(x) is pseudorandom (footnote 7), provided that its input x has 'sufficient randomness', i.e., satisfies a specified randomness assumption. A keyed extractor h_k(x) also receives as input a non-secret random key k, often referred to as salt; if its input x has 'sufficient randomness', then its output h_k(x) should be pseudorandom.

Footnote 7: This definition only requires pseudorandom output, rather than true randomness; such extractors are often referred to as computational extractors, e.g., in [242]. Extractors whose output is random regardless of the adversary's computational abilities are often referred to as statistical extractors.

The randomness-extraction property is more subtle and harder to define than the collision resistance, SPR and one-way properties. Also, randomness extraction is not listed as one of the goals of standard cryptographic hash functions.

The cryptographic-theory literature mostly deals with extractors which ensure random or pseudorandom output as long as their input has sufficiently high min-entropy; extractors with this (weak) requirement on the randomness of their inputs are referred to as generic extractors. These works also focus on keyed extractors, which use random non-secret keys; this is since keyless generic extractors do not exist [125, 286]. The discussion of generic extraction is beyond our scope; see, for example, [125, 286].

Instead, we will present two much simpler notions of (keyless) extractor hash functions. In subsection 3.5.1, we discuss the simple, keyless biased-coin extractor proposed by Von Neumann, which ensures uniformly random output, provided that its input is a sequence of independently-sampled bits from some biased distribution. Then, in subsection 3.5.2, we present a simple model of a computational extractor, the bitwise randomness extractor, which ensures pseudorandom output, provided that its input contains a sufficient number of random bits.

Practical modern cryptographic systems often extract randomness using (keyed) Key Derivation Functions (KDFs). Key derivation functions can be seen as an extension of keyed extractors, but offer additional functionalities beyond extraction. In subsection 3.5.3, we discuss key derivation functions, and compare them to keyed and keyless extractors. We also present the Extract-then-Expand paradigm for constructing a KDF from a keyed extractor hash and a PRF.

3.5.1 Von Neumann's Biased-Coin Extractor

We first discuss a classical biased-coin extractor model proposed, already in 1951, by Von Neumann [383], one of the pioneers of computer science. In the Von Neumann model, each of the input bits is the result of an independent toss of a coin with a fixed bias. Namely, for every bit generated, the value 1 is generated with probability 0 < p < 1 and the value 0 is generated with probability 1 − p, with no relation to the values of the other bits. We refer to this as the Von Neumann assumption.

Von Neumann proposed the following method to extract perfect randomness from these biased bits. First, arrange the sampled bits in pairs {(x_i, y_i)}. Then, remove the pairs where both bits are identical, i.e., leave only pairs of the form {(x_i, 1 − x_i)}. Finally, output the sequence {x_i}. This simple - if somewhat 'wasteful' in input bits - algorithm is called the Von Neumann extractor. We leave it to the reader to show that if the input satisfies the Von Neumann assumption, then the output is a string of uniformly random bits.

Exercise 3.8 (Von Neumann extractor). Show that, if the input of the Von Neumann extractor satisfies the Von Neumann assumption (above), then the output is uniformly random, i.e., each bit x_i is 1 with probability exactly half, independently of all other bits.
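A minimal sketch of the Von Neumann extractor just described; the biased source below is simulated, purely for illustration, and the function name is ours.

import random

def von_neumann_extract(bits):
    """Pair up the input bits; drop (0,0) and (1,1) pairs; for each
    remaining pair (x, 1-x), output x."""
    out = []
    for x, y in zip(bits[0::2], bits[1::2]):
        if x != y:
            out.append(x)
    return out

# Simulated biased source satisfying the Von Neumann assumption:
# independent bits, each equal to 1 with probability p = 0.8.
biased = [1 if random.random() < 0.8 else 0 for _ in range(100_000)]
unbiased = von_neumann_extract(biased)
print(len(unbiased), sum(unbiased) / len(unbiased))   # fraction of 1s is ~0.5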
The Von Neumann extractor is simple, and the output is proven uniform based only on the Von Neumann assumption, without requiring any computational assumption. However, the Von Neumann assumption is hard to justify for many typical security applications of randomness extraction. In particular, consider the goal of key derivation, where we use some large input x which is 'fairly random' but not truly random, such as many measurements (of time, movements, etc.). We cannot use x directly as a key, so we apply a hash and use h(x). However, can we be certain that the Von Neumann assumption holds? The following simple exercise shows that this assumption may not hold - even when every second bit in the input is random!

Exercise 3.9. Consider the following random process for producing bit sequences {b_1, b_2, ...}:

b_i = { 0, if i = 1 mod 2 ; b_i ←$ {0,1}, otherwise }    (3.17)

Show that this sequence does not satisfy the Von Neumann assumption, and, in fact, that applying the Von Neumann extractor will not result in a random output string.

3.5.2 The Bitwise Randomness Extractor

We now present another model for randomness extraction, which we call the bitwise randomness extractor (BRE) model. We present the BRE model as a simple way to provide readers with an understanding of randomness-extracting hash functions; you will find more advanced and stronger models in the literature, but these are beyond our scope.

Intuitively, a hash function h is a bitwise-randomness extractor if its output h(x) is pseudorandom even if the adversary can select the input x, except for a 'sufficient' number of bits of the input; these bits are selected randomly. This intuition is illustrated in Figure 3.14, which defines a 'game' where an adversarial algorithm A tries to defeat the randomness extraction - by selecting some input message, except for n random bits (footnote 8), and then distinguishing between the output and a random n-bit string.

Figure 3.14: Bitwise-Randomness Extractor (BRE) Hash Function.

Footnote 8: The choice of requiring exactly n random bits in the input is quite arbitrary - we could have required a larger number of random input bits; we use exactly n just for simplicity.

Finally, we define an 'indistinguishability test', much like the ones used in Chapter 2 (for IND-CPA encryption (Definition 2.9), PRF, PRG...). Namely, we select a random bit b, and let y_b be the n-bit output of the hash function h (whose input contains n random bits), and y_{1−b} be n random bits. Notice that 1 − b is simply the complement of b, i.e., it is 0 if b = 1 and 1 if b = 0. The adversary A 'wins' if it correctly guesses the value of b.

Definition 3.10 (Bitwise-Randomness Extractor (BRE) hash function). An efficient hash function h: {0,1}^* → {0,1}^n is called a bitwise-randomness extractor (BRE) if for every efficient algorithm A ∈ PPT, the advantage ε^{BRE}_{h,A}(n) is negligible in n, i.e., smaller than any positive polynomial for sufficiently large n (as n → ∞), where:

ε^{BRE}_{h,A}(n) ≡ Pr[BRE_{A,h}(1, n) = 1] − Pr[BRE_{A,h}(0, n) = 1]    (3.18)

where BRE_{A,h}(·,·) is defined in Algorithm 4 and the probability is taken over the random coin tosses of A and of BRE_{A,h}(·,·).

Algorithm 4 Bitwise-Randomness Extraction Indistinguishability test BRE_{A,h}(b, n):
1: (m, M) ← A(1^n)
2: If |m| ≠ |M| or Σ_{i=1}^{|M|} M(i) < n, return ⊥
3: m′ ←$ {0,1}^{|m|}
4: y_b ← h(m ⊕ (m′ ∧ M))
5: y_{1−b} ←$ {0,1}^n
6: return A(y_0, y_1, m, M)

Exercise 3.14 gives a simple example of a hash function which is not a BRE.
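A small sketch of the indistinguishability test BRE_{A,h}(b, n) of Algorithm 4, with SHA-256 standing in for h; the bit-level encoding (one byte per bit) and the toy adversary are illustrative choices of ours.

import hashlib, secrets

N = 256                                          # digest length n, in bits

def h(bits):
    return hashlib.sha256(bytes(bits)).digest()  # one byte per bit, for simplicity

def bre_game(A_choose, A_guess, b):
    m, M = A_choose(N)                           # step 1: A picks m and mask M
    if len(m) != len(M) or sum(M) < N:           # step 2: need >= n masked bits
        return None                              # corresponds to returning ⊥
    m_rand = [secrets.randbelow(2) for _ in m]   # step 3: m' <-$ {0,1}^|m|
    y = [None, None]
    y[b] = h([mi ^ (ri & Mi) for mi, ri, Mi in zip(m, m_rand, M)])   # step 4
    y[1 - b] = secrets.token_bytes(N // 8)       # step 5: random n bits
    return A_guess(y[0], y[1], m, M)             # step 6: A outputs a guess

# Toy adversary: all-zero message with every bit masked, guessing at random;
# its estimated advantage (difference of the two probabilities) is ~0.
A_choose = lambda n: ([0] * n, [1] * n)
A_guess = lambda y0, y1, m, M: secrets.randbelow(2)
p1 = sum(bre_game(A_choose, A_guess, 1) for _ in range(1000)) / 1000
p0 = sum(bre_game(A_choose, A_guess, 0) for _ in range(1000)) / 1000
print('estimated advantage:', p1 - p0)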
3.5.3 Key Derivation Functions (KDFs) and the Extract-then-Expand Paradigm

Cryptographic protocols use randomness for multiple purposes. For example, the TLS protocol (Chapter 7) uses random bits for multiple keys (authentication/encryption, client-to-server and server-to-client), for random bits sent in the protocol, and for randomized encryption (e.g., as an initialization vector (IV) for block-cipher modes of operation). We could use a randomness-extractor hash function to produce these random (or pseudorandom) bits; however, an extractor only outputs a fixed number of bits, which may be insufficient. We could apply the extractor repeatedly to provide all of these bits, but each application would require its own sufficiently-random input, so this would be inefficient. Therefore, many protocols use a slightly more complex, and more efficient, cryptographic mechanism, called a Key Derivation Function (KDF).

Let us describe the KDF design proposed by Krawczyk [242] and specified by the IETF as RFC 5869 [243], which is deployed in TLS 1.3 (see Section 7.6). This KDF uses the modular and efficient extract-then-expand paradigm, which constructs a KDF from a given keyed randomness-extractor hash ĥ and a given PRF f. Namely, the KDF uses the extractor ĥ to extract one pseudorandom key k_PRF, which is used as the key to the PRF f, which is then used to efficiently generate the desired pseudorandom strings.

The KDF receives four parameters. Two of the four parameters are the same as used by a keyed extractor hash: a (random but non-secret) key/salt k̂, and a 'sufficiently random' input x. The two other parameters of the KDF are (1) the number l of pseudorandom bits which the KDF should produce and (2) an identifier ID for the resulting string. The identifier allows the KDF to be used, with the same input x, to generate multiple independently-pseudorandom strings, e.g., to be used as an encryption key, as an authentication key, and as an IV.

Let us present this KDF construction. For simplicity, assume that the required number of bits l is a multiple of the digest length n. We construct the KDF, based on the keyed extractor hash ĥ and on the PRF f, as:

KDF_k̂(x, l, ID) = T_1 ++ T_2 ++ ... ++ T_{l/n}, where:
    T_0 is the empty string,
    T_i = f_{k_PRF}(T_{i−1} ++ ID ++ i), and
    k_PRF = ĥ_k̂(x)    (3.19)

RFC 5869 [243] also proposes the use of the well-known HMAC construction, which we discuss in subsection 4.6.3. HMAC transforms a keyless hash function h into a keyed function ĥ_k̂(x), defined as:

ĥ_k̂(x) = h( (k̂ ⊕ OPAD) ++ h( (k̂ ⊕ IPAD) ++ x ) )    (3.20)

The values OPAD and IPAD in Equation 3.20 are fixed strings defined in [243]. The keyed function ĥ resulting from this construction is used, in [243] and elsewhere, both as a keyed extractor hash and as a PRF, e.g., for the KDF construction of Equation 3.19. Note that although HMAC is often used both as a PRF and as a keyed extractor hash, the security requirements from these two types of functions are quite different.

Table 3.2 compares the four relevant types of functions: KDF, PRF and keyed/keyless extractor hash functions.

Table 3.2: Comparison of a PRF, a KDF and (keyed and keyless) randomness-extractor hash functions. For all of these functions, the output should be pseudorandom.

                          | Key k   | Input x             | Output
PRF f_k(x)                | Secret  | Arbitrary           | n bits
KDF f_k(x, l, ID)         | Public  | Sufficiently random | l bits
Extractor (keyed) h_k(x)  | Public  | Sufficiently random | n bits
Extractor (keyless) h(x)  | No key  | Sufficiently random | n bits
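A minimal sketch of the extract-then-expand construction of Equation 3.19, using HMAC-SHA256 both as the keyed extractor hash and as the PRF, as suggested by RFC 5869. The byte encodings of ID and of the counter i, and the names below, are illustrative assumptions of ours; the RFC fixes its own exact formatting.

import hmac, hashlib

N = 32                                               # digest length n, in bytes

def hmac_sha256(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()

def kdf(salt: bytes, x: bytes, l: int, ID: bytes) -> bytes:
    """KDF_salt(x, l, ID), for l (in bytes) a multiple of the digest length."""
    k_prf = hmac_sha256(salt, x)                     # extract: k_PRF = h_salt(x)
    T, output = b'', b''                             # T_0 is the empty string
    for i in range(1, l // N + 1):                   # expand: T_i = f(T_{i-1} ++ ID ++ i)
        T = hmac_sha256(k_prf, T + ID + bytes([i]))
        output += T
    return output

ikm = b'imperfect randomness from many measurements'  # 'sufficiently random' x
salt = b'public random salt'                          # non-secret key/salt
enc_key = kdf(salt, ikm, 32, b'encryption')
mac_key = kdf(salt, ikm, 32, b'authentication')       # independent-looking keys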
Exercise 3.10 shows that the keyed-extractor and PRF definitions are 'incomparable': a function can be a PRF but not a keyed extractor hash, and vice versa. Try to solve it first and only then read the solution; if you read the solution directly, it may appear obvious.

Exercise 3.10 (PRF vs. KDF). Let f be a (secure) PRF and h be a (secure) keyed extractor hash, where for both functions, the output and the key are n bits.
1. Use f and/or h to construct a (secure) PRF f′, such that f′ is not a secure keyed extractor hash.
2. Use f and/or h to construct a (secure) keyed extractor hash h′, such that h′ is not a secure PRF.

Solution:

f′_k(x) = { 0^n,    if k = x mod 2^n ;  f_k(x), otherwise }

h′_k(x) = { k,      if 0^n = x mod 2^n ;  h_k(x), otherwise }

Complete the solution by explaining or demonstrating why f′ is a secure PRF but an insecure keyed extractor hash, and why h′ is a secure keyed extractor hash but an insecure PRF. And find other solutions!

3.6 The Random Oracle Model

Often, designers use cryptographic hash functions in their constructions without assuming a specific property such as one-way function or second-preimage resistance. Many of these constructions are found to be insecure; however, some constructions resist attacks for many years, in spite of considerable efforts. This is especially common for keyless hash functions. Furthermore, when a vulnerability is found in a system deploying a hash function, it often turns out that the attack is generic, i.e., it does not exploit a weakness of a specific hash function. In other words, such systems are vulnerable when implemented with any hash function.

It is obviously desirable to identify designs which are vulnerable when used with any hash function - we definitely want to avoid such designs! More positively, it is preferable to use designs which can be proven secure against any 'generic' attack; such designs may still be vulnerable when using a given specific hash function, but cannot be shown vulnerable when implemented with any hash function: any vulnerability must be attributed to a property of the specific hash function used.

The Random Oracle Model (ROM) (footnote 9), proposed by Bellare and Rogaway [42], offers such a method. Intuitively, Random Oracle Model (ROM) constructions and protocols are secure in the (impractical) case that the parties select the hash function h() as a random function over the same domain and range. Of course, when the construction is deployed, it must use a concrete, specific hash function h(), rather than a random function. Namely, we model an 'ideal' keyless hash function as a random function (over the same domain and range, i.e., {0,1}^* → {0,1}^n).

Definition 3.11 (ROM-security). Let H be the set of all functions from binary strings to {0,1}^n, for some digest length n. Consider a parameterized scheme S^h, where h is a given hash function. Also, for any security definition def, let ε^{def}_{S^h,A^h} → [0,1] be the def-advantage function, defined for a given scheme S^h and parameterized adversary A^h, where A^h is a PPT algorithm, using a standard computational model (e.g., Turing machine), with 'black-box' (oracle) access to h. Namely, A^h can provide an arbitrary input x to h and receive back h(x) (as a single operation).
We say that the (parameterized) system Sh is def -ROM-secure, or def secure under the Random Oracle Model, if the advantage of any PPT adversary 9 Often people use the term ‘random oracle methodology’ or instead of ‘random oracle model’. Arguably, this would be more appropriate; but we use ‘random oracle model’, since it is more common. Applied Introduction to Cryptography and Cybersecurity 3.6. THE RANDOM ORACLE MODEL 197 $ A h for scheme S h , for a random hash function h ← H, is negligible, i.e.: Pr εdef ∈ N EGL(n) S h ,A h $ (3.21) h←H Note that when the parties can share a secret, random key, then one should use a Pseduo-Random Function (PRF) rather than the Random Oracle Model (ROM); see Principle 6. Advanced: about the choice of h. The careful reader may have noticed that it is not well deőned how to select an element from an inőnite set with $ uniform probability, hence, h ← H is not well deőned. This choice should be interpreted as follows. For any l > n, let H l be the set of all functions from {0, 1}l to {0, 1}n , i.e., functions from l-bit binary strings to n-bit binary strings: H l = {h : {0, 1}l → {0, 1}n }. The set H l is őnite, hence we can deőne random $ $ sampling from it: hl ← H l . When we write h ← H, it should be interpreted as $ choosing hl ← H l for every integer l > n; and, for any input string m ∈ {0, 1}∗ , let h(m) = h|m| (m). Notice that the deőnition is for a őxed digest length n; however, formally, our deőnitions should be phrased for n being a parameters. To prove that a protocol using a keyless hash function h is secure under the ROM, we analyze the protocol assuming that h is chosen randomly at the beginning of the execution. Once chosen, all parties will have ‘black-box’ (oracle) access to h. This includes the adversary as well as parties running the protocol. Security under the ROM vs. Security under the Standard Model. Analysis of security under the ROM model is widely used. In fact, papers in cryptography often use the term ‘secure in the standard model’ to emphasize that their results are proven secure in the ‘real model’, rather than only proven under the ROM or another simpliőed model. Proofs of security in the standard model usually still make different unproven assumptions; however, these assumptions are ‘standard’ cryptographic assumptions. For many of these standard cryptographic assumptions there are even results showing that there exist schemes satisfying these assumptions, provided that a complexity assumption such as P ̸= N P is true. In contrast, ROM-security does not necessarily imply that the design is ‘really’ secure, when implemented with any speciőc hash function; once the hash function is őxed, there may be an adversary that breaks the system. This is true even when the hash function adopted satisőes ‘standard’ security speciőcations, e.g., CRHF. See examples in subsection 4.6.3. Still, ROM-security is deőnitely a good indication of security, since a vulnerability has to use some property of the speciőc hash function. Indeed, there are widely-deployed designs which are only proven to be ROM-secure. Applied Introduction to Cryptography and Cybersecurity 198 3.7 CHAPTER 3. INTEGRITY: FROM HASHING TO BLOCKCHAINS Static Accumulator Schemes and the Merkle-Tree A collision-resistant hash function h can provide integrity for a binary string x, by computing the digest h(x) (or hk (x), for keyed hash). But what if we require integrity for multiple strings X = {x0 , x1 , . . . , xm−1 }? 
A naive solution is the encode-then-hash design, which encodes the set of strings X as a single string, and then applies the hash function h. Namely, we compute h(encode(X)), where encode is a one-to-one encoding of multiple strings to one string. It is not hard to design such encode function; however, this design has two disadvantages: 1. Validation requires all stings X, even if we only need to validate the integrity of a single string, say xi . 2. The set of digests is őxed (static). For example, suppose we compute h(encode(X)) and then receive an additional string xm or a whole additional set of multiple strings, X ′ . We want a digest value that will allow us to validate the integrity of all of these strings, i.e., of {x0 , x0 , . . . , xm−1 , xm } or X + + X ′ . Computing ‘from scratch’ seems inefficient computationally, and would also requires storage of all input strings. The basic goal of Accumulator schemes is to provide a more efficient way to ensure integrity to multiple binary strings. In this section, we discuss static accumulators, which compute a collisionresistant digest of a given set of input strings, allowing efficient validation of the integrity of one or more of the input strings. The encode-then-hash design h(encode(x0 , x1 , . . . , xm−1 )), explained above, is a naive, inefficient static accumulator; we present the Merkle Tree accumulator, which is widely deployed, e.g., in blockchains (see Section 3.10) and in the Certiőcate Transparency (CT) PKI scheme (see Section 8.6). In Section 3.8 we extend our discussion to dynamic accumulators, which allow efficient validation of the integrity of input strings, given in one or multiple events. Later, in Section 3.9, we discuss the MerkleDamgård (MD) design, which őts the deőnition of a dynamic accumulator, but is mostly known for its use in the design of several standard hash functions. Like for hash functions, we can deőne and use either keyed or keyless accumulators. Similarly to the case for hash functions, most practical applications use a keyless accumulator, while many theoretical works focus on keyed accumulators. Also like the case for hash functions, keyless accumulators cannot have some important properties such as collision-resistance, unless under simplifying models such as the Random Oracle Model (ROM, Section 3.6). In this text, we focus on the slightly simpler case of keyless accumulators. 3.7.1 Definition of a Keyless Static Accumulator A (keyless) static accumulator scheme consists of two algorithms: α.Accum and α.VerPoI. The Accum accumulates input strings into a digest ∆, and the VerPoI algorithm veriőes the inclusion of a given string in the accumulated digest. Applied Introduction to Cryptography and Cybersecurity 3.7. STATIC ACCUMULATOR SCHEMES AND THE MERKLE-TREE 199 Definition 3.12 (A keyless static accumulator). A keyless static accumulator scheme α is defined by two algorithms: α.Accum and α.VerPoI, where: α.Accum(X) → (∆, Π) is a deterministic algorithm that accumulates a sequence m−1 m (ordered set) of binary strings X = ⟨xi ⟩i=0 ∈ {0, 1}m ∈ ({0, 1}∗ ) . The ∗ α.Accum function outputs a digest ∆ ∈ {0, 1} ∪ {⊥} and an ordered set m m−1 Π = {πi }i=0 ∈ ({0, 1}∗ ) of Proofs-of-Inclusions. α.VerPoI(∆, x, ID, m, π) → {True, False} is a deterministic algorithm that verifies if x was one of the strings in the sequence accumulated into ∆, using the Proof-of-Inclusion π. 
The function may use two additional inputs: m, the number of strings in the sequence accumulated into ∆, and ID, the sequential number of x within the sequence. Notations. To refer to a particular output parameter of a function of the accumulator, we append the parameter name to the function name, separated by a dot. Namely, α.Accum.∆(X) denotes the digest (∆) output of the Accum function of α, and α.Accum.π(X) denotes the set of PoIs. 3.7.2 Collision-Resistant Accumulators The őrst security requirement we deőne is collision-resistance. We őrst extend the notion of a collision to multiple input strings x0 , . . . , xm−1 . Two natural definitions for collisions between sets of strings are ordered collisions and unordered collisions. For clarity, we deőne both ordered and unordered collisions; however, later, we focus on unordered collisions, which are used in most applications and publications, and refer to them simply as collisions. Note that every unordered collision is also an ordered collision, therefore, resistance to unordered collisions also implies resistance to ordered collisions. Definition 3.13 (Keyless static collisions and the Im(·) notation). Let α be an accumulator and let X, X ′ be two ordered sets of binary strings such that: α.Accum.∆(X) = α.Accum.∆(X ′ ) ̸= ⊥ If X = ̸ X ′ , then we say that the pair (X, X ′ ) is an ordered collision for α. Given an ordered set (sequence) X, let Im(X) denote the unordered set of elements in X. If Im(X) ̸= Im(X ′ ), then we say that (X, X ′ ) is an unordered collision, or simply a collision, for accumulator α. We next deőne collision-resistance for static keyless accumulators. Note that we could similarly deőne (keyless) second-preimage resistant accumulators. Definition 3.14 (Collision Resistant Accumulator). A keyless static accumulator scheme α is Collision Resistant if for any PPT adversary A there is negligible probability for A to output an (unordered) collision. Namely:   α.Accum.∆(X) = α.Accum.∆(X ′ ) ̸= ⊥, where ∈ N egl(λ) (3.22) P rob Im(X) ̸= Im(X ′ ) and (X, X ′ ) ← A(1λ ) Applied Introduction to Cryptography and Cybersecurity CHAPTER 3. INTEGRITY: FROM HASHING TO BLOCKCHAINS 200 ∆0,6 ≡ MT.Accum.∆({x0 , . . . , x6 }) h ∆0,3 ∆4,6 h h ∆0,1 ∆0,0 h ∆2,3 ∆1,1 ∆2,2 h h x0 x1 h ∆4,5 ∆5,5 h h h x6 x4 x5 ∆3,3 ∆4,4 h h x2 x3 h ∆6,6 Figure 3.15: The Merkle-Tree construction using hash function h, for inputs {x0 , . . . , x6 }, i.e., with k = 7 input strings, using the notation ∆i,j ≡ MT.Accum.∆({xi , . . . , xj }). 3.7.3 The Merkle tree (MT) Accumulator and its Collision-resistance The Merkle tree (MT) accumulator was őrst presented by Merkle in [281]; many variants of this design were used in practice and in different works and proposals. Most of these designs, including Merkle’s original design, are keyless. In this subsection, we describe the digest function of the (keyless) Merkle tree (MT) design used by the Certiőcate Transparency (CT) standard [256], analyzed in [128]. The construction is illustrated in Figure 3.15, using the notation ∆i,j ≡ MT.Accum.∆({xi , . . . , xj }). As shown in Figure 3.15, the Merkle tree (MT) accumulator constructs a (full or partial) binary tree whose leafs are the accumulated input strings, which we denote {xi }m i=0 . We deőne the tree recursively, as follows. We őrst deőne the MT digest of a list containing a single string {xi }, which we denote by ∆i,i , as: ∆i,i ≡ MT.Accum.∆({xi }) = h(0x00 + + xi ) (3.23) Next, let {xi , . . . , xi+j−1 } be a list containing j > 1 elements. 
We compute the MT digest of {xi , . . . , xi+j−1 }, denoted ∆i,i+j−1 , by the following recursive equation: ∆i,i+j−1 ≡MT.Accum.∆({xi , . . . , xi+j−1 }) = =h(0x01 + + ∆i,i+2l −1 + + ∆i+2l ,i+j−1 ) =h(0x01 + + MT.Accum.∆({xi , . . . , xi+2l −1 })+ + + + MT.Accum.∆({xi+2l , . . . , xi+j−1 })) where l is the maximal integer such that 2l < j Applied Introduction to Cryptography and Cybersecurity (3.24) 3.7. STATIC ACCUMULATOR SCHEMES AND THE MERKLE-TREE 201 For the example in Figure 3.15, we őrst compute ∆i,i , for i between 0 and 6, using Equation 3.23; e.g., ∆3,3 ≡ MT.Accum.∆({x3 }) = h(0x00 + + x3 ). We then compute the digests ∆i,i+1 for i ∈ {0, 2, 4}, using Equation 3.24; e.g., + ∆3,3 ). Next, we compute, using Equation 3.24, the + ∆2,2 + ∆2,3 = h(0x01 + + ∆0,1 + + ∆4,5 + digests ∆0,3 = h(0x01 + + ∆2,3 ), ∆4,6 = h(0x01 + + ∆6,6 ) and őnally ∆0,6 = h(0x01 + + ∆0,3 + + ∆4,6 ). Note that we use a different one-byte preőx to the hashed strings in Equation 3.23, computing the digest of a set containing a single input string {xi }, compared to the one-byte preőx we use in Equation 3.24, computing the digest of a set containing multiple strings {xi , . . . , xi+j−1 (for j > 1). This is necessary to ensure collision resistance. Let Accum′ .∆′ be a modiőed digest function, which uses the same preőx (or no preőx) for the two cases, i.e., a digest of a set containing only one string and a digest of a set containing multiple strings. Such modiőed digest function Accum′ .∆′ would have collisions - even second-preimage collisions. For example: MT.Accum′ .∆′ ({x0 , . . . , x4 }) = MT.Accum′ .∆′ ({y0,1 , y3,4 }), where yi,j ≡ MT.Accum′ .∆′ ({xi , xj }) (3.25) We next observe that the digest function of Merkle tree (MT) ensures collision resistance, provided that h is a collision-resistant hash function. Note that since h is keyless, we can only hope for it to be collision-resistant under a simpliőed model such as the Random Oracle Model (ROM), see Section 3.6. Lemma 3.2. Assume that h is a (keyless) collision-resistant hash function (under the random oracle model). Then Merkle tree (MT) is a collision-resistant static accumulator. Proof: Suppose, to the contrary, that Merkle tree (MT) has a collision, i.e., two different sets of strings {x0 , . . . , xm−1 } = ̸ {x′0 , . . . , x′m′ −1 } such that MT.Accum.∆({x0 , . . . , xm−1 }) = MT.Accum.∆({x′0 , . . . , x′m′ −1 }). If m = m′ = 1, then x0 ̸= x′0 yet we have h(0x00 + + x0 ) = h(0x00 + + x′0 ), a collision. If m = 1 and m′ > 1, then we have h(0x00 + + x0 ) = h(0x01 + + ∆′ ), where ′ ∆ is computed as per Equation 3.24. Regardless of the value of ∆′ , this is a collision. A dual argument holds if m > 1 and m′ = 1. Finally, consider m > 1 and m′ > 1, and assume, WLOG, that there is no collision for shorter input sequences. Denote ∆i,j ≡ MT.Accum.∆({xi , . . . , xj }) and ∆′i,j ≡ MT.Accum.∆({x′i , . . . , x′j }); in particular, we have ∆0,m−1 = ∆′0,m′ −1 . Let l be the maximal integer such that 2l < m − 1 and l′ be the ′ maximal integer such that 2l < m′ − 1. Then we have: ∆0,m−1 =h(0x01 + + ∆0,2l −1 + + ∆2l ,m−1 ), and ′ ′ + ∆′2l ,m−1 ) ∆0,m−1 =h(0x01 + + ∆0,2l′ −1 + (3.26) If either ∆0,2l −1 ̸= ∆′0,2l′ −1 or ∆2l ,m−1 ̸= ∆′2l′ ,m′ −1 , we have a collision. If both are equal, then we have a collision for a shorter input sequence. In any case, we have the desired contradiction. Applied Introduction to Cryptography and Cybersecurity 202 3.7.4 CHAPTER 3. 
INTEGRITY: FROM HASHING TO BLOCKCHAINS The Proof-of-Inclusion (PoI) Requirements Many applications use accumulators to verify the integrity of a speciőc input x to the accumulator, namely, verify that x was part of the set of input string X. Such veriőcation of a speciőc input may be done using the VerPoI operation, using the Proof-of-Inclusion (PoI) output, π, from the Accum operation corresponding to x. Note that the integrity of a speciőc input x could also be veriőed using the collision-resistance property of the accumulator, by re-computing the digest of the entire set of all input strings. However, verifying the integrity by using VerPoI has three advantages. First, using VerPoI is (typically) much more efficient than recomputing the digest. Second, validation does not require the entire set of input strings X, which may be unavailable, or require unnecessary overhead to obtain or maintain. Third, the use of VerPoI may also provide some privacy, since a party may verify a particular input string without having access to other inputs, basically following the ‘need to know’ paradigm. Veriőcation involves two complementing requirements: correctness and unforgeability. Intuitively, correctness requires that when VerPoI(∆, x, ID, m, π) will return 1 (true), if the inputs are ‘correct’, i.e., x was the ID-th string in the sequence of m strings accumulated into digest ∆, and π was the corresponding PoI output; and unforgeability requires that VerPoI(∆, x, ID, m, π) will return 0 (false) if ∆ is not the digest of a sequence containing x. Note that the unforgeability requirement does not require VerPoI to validate the position of x within X; this is since some accumulators do not preserve the position, and applications mostly do not depend on preserving the position. Definition 3.15 (PoI Correctness and Unforgeability). Let α be a static accumulator scheme. Correctness. We say that α ensures Proof-of-Inclusion (PoI) Correctness if the following holds for every security parameter λ ∈ N and input sequence X = {x0 , . . . , xm−1 }. Let (∆, {π0 , . . . , πm−1 }) ← α.Accum(X). Then ∀ID < m holds α.VerPoI(∆, xID , ID, m, πID ) = 1. Unforgeability. We say that α ensures Accumulator Proof-of-Inclusion (PoI) Unforgeability if for every security parameter λ ∈ N and PPT adversary A holds:   α.VerPoIpk (∆, xID , ID, m, πID ) = 1 ∧ xID ̸∈ Im(X),  where ∆, X, xID , ID, m and π were generated by:       ∈ N egl(λ) P rob   X ← A(1λ ) ;      (∆, Π) ← α.Accum(X) ;   (xID , ID, m, πID ) ← A(∆, Π) (3.27) 3.7.5 The Merkle-Tree PoI We now complete the deőnition of the static Merkle tree (MT) accumulator by presenting its Accum.Π and VerPoI functions, and proving that it ensures PoI correctness and unforgeability. Applied Introduction to Cryptography and Cybersecurity 3.7. STATIC ACCUMULATOR SCHEMES AND THE MERKLE-TREE 203 ∆0,6 ≡ MT.Accum({x0 , . . . , x6 }) h ∆0,3 h ∆0,1 ∆0,0 h ∆4,6 h ∆2,3 ∆1,1 ∆2,2 h ∆4,5 ∆3,3 ∆4,4 h ∆6,6 ∆5,5 h x6 h h h h h h x0 x1 x2 x3 x4 x5 Figure 3.16: Illustrating the the Merkle Tree’s Proof-of-Inclusion (PoI), for input string x3 . The PoI consists of the three hash values in thick blue rectangles: ∆2,2 , ∆0,1 and ∆4,6 . The other bold, blue hash values, over thick lines, are computed in the PoI veriőcation. The dotted inputs and hash operations are not used for the veriőcation of the PoI of x3 . Recall that the Merkle tree (MT) is basically a (full or partial) binary tree m−1 whose leafs are the accumulated input strings, which we denote {xi }i=0 . 
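As a concrete illustration of the digest recursion just recalled (Equations 3.23 and 3.24), here is a minimal sketch in Python; SHA-256 stands in for h, and the function name is ours.

import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def mt_digest(X):
    """MT.Accum.Delta(X): a single string hashes as h(0x00 ++ x); longer
    lists are split at the largest power of two smaller than their length."""
    if len(X) == 1:
        return h(b'\x00' + X[0])             # Equation 3.23, leaf prefix 0x00
    l = 1
    while 2 * l < len(X):                    # largest power of two < len(X)
        l *= 2
    return h(b'\x01' + mt_digest(X[:l]) + mt_digest(X[l:]))   # Equation 3.24

X = [b'x0', b'x1', b'x2', b'x3', b'x4', b'x5', b'x6']
print(mt_digest(X).hex())                    # Delta_{0,6} of Figure 3.15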
Let πID denote the PoI of an input string xID , i.e., πID ≡ MT.Accum.Π[ID](X). The PoI, πID , is a (usually small) sequence of digests, which are used, together with xID , m and ID, to re-compute the digest of X, i.e., compute ∆(X). If the computation returns the correct, expected value of ∆, this shows that xID was indeed the IDth element in the accumulated sequence. For example, Figure 3.16 shows the sequence of digests required to validate x3 , given in the order of their usage (and of the layers in the tree): {∆2,2 , ∆0,1 , ∆4,6 }. In the őgure, we placed these digest values in thick rectangulars. Therefore, the PoI of x3 is: π3 ≡ MT.Accum.Π[3](X) = {∆2,2 , ∆0,1 , ∆4,6 } (3.28) Computing the Proof of Inclusion of xi , i.e., πID ≡ MT.Accum.Π[ID](X). Consider őrst the base case, where the input is a list X containing a single element x, i.e., X = {x}. In this case, given x and ∆, we should just conőrm that ∆ = h(0x00 + + x) (Equation 3.23), namely, we do not need any PoI. Therefore, the PoI is simply the empty list ∅, namely: π0 ≡ MT.Accum.Π[0]({x}) = ∅ (3.29) Consider now the typical case, where X contains j > 1 elements, say X = {xi , . . . , xi+j−1 }. Note that MT.Accum.πm (X) is the PoI (sequence of digests) for xi+m , since this is the (m + 1)th element in the list. We compute MT.Accum.πm (X) recursively as follows, where l is the maximal integer such Applied Introduction to Cryptography and Cybersecurity 204 CHAPTER 3. INTEGRITY: FROM HASHING TO BLOCKCHAINS that 2l < j and ∆a,b is deőned as in Equation 3.24: πm ≡MT.Accum.Π[m](X) ≡ MT.Accum.Π[m]({xi , . . . , xi+j−1 }) =  If m < l: MT.Accum.Π[m]({xi , . . . , xi+l−1 }) + + ∆i+l,i+j−1 = Else: MT.Accum.Π[m − l]({xi+l , . . . , xi+j−1 }) + + ∆i,i+l−1 (3.30) The reader can conőrm that in Figure 3.16, the resulting sequence of digests for x3 , i.e., π3 ≡ MT.Accum.π[3]({x0 , . . . , x6 }), would be as expected, i.e.: π3 ≡ MT.Accum.π[3]({x0 , . . . , x6 }) = {∆2,2 , ∆0,1 , ∆4,6 } (3.31) Verifying the Proof of Inclusion of xi . The MT.VerPoI function is shown in Algorithm 5. This algorithm computes MT.VerPoI(∆, x, i, m, π), i.e., veriőes, using the sequence of digests in π, that x was the ID + 1-th string in a sequence containing m strings whose digest is ∆. The veriőcation is done ‘bottom up’, from layer j = 0 till layer j = ⌈log m⌉ − 1, re-computing one digest in each layer of the Merkle tree until őnally re-computing the digest over the entire sequence of m strings whose ID + 1-th string is x, at which point it simply remains to compare this digest to ∆. In the j th layer, we compute the digest (hash) of the previous digest, which is always over a subsequence containing x, and of π[j]. The order between the two strings being hashed at each layer depends on the  parity of ID 2j . Algorithm 5 The Verify-PoI algorithm MT.VerPoI(∆, x, i, m, π): verify that x was the ith string, out of m, accumulated into π. 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: if (i ≥ m) then return False δ ← h(0x00 + + x) ▷ If x was ith string then δ = ∆i,i for j = ▷ Loop to compute digest at each layer  ⌈log(m)⌉ − 1 do  0 to is odd then ▷ if in layer j, x was on right, then: if i+1 j 2 δ ← h(0x01 + +δ+ + π[j]) else ▷ else, i.e., if in layer j, x was on the left, then: δ ← h(0x01 + + π[j] + + δ) if δ = ∆ then return True else return False Let us track the computation of MT.VerPoI(∆0,6 , x3 , 3, 7, π3 ) in Figure 3.16, where π3 is as computed in Equation 3.31, i.e., π3 [0] = ∆2,2 , π3 [1] = ∆0,1 and π3 [2] = ∆4,6 . 
Following Algorithm 5, we őnd that we őrst compute   δ← +x3 ) (line 3), which is, as expected, the same as ∆3,3 . Since 3+1 = 4 is h(0x00+ 20 +π3 [0]+ +δ) = h(0x01+ +∆2,2 + +x3 ) +h(0x00+ even, we next compute δ ← h(0x01+ , which is, as expected, ∆2,3 . We similarly compute δ ← ∆0,3 and őnally δ ← ∆0,6 . Therefore, when the loop is completed, we őnd in line 8 that δ = ∆ and therefore return True. Applied Introduction to Cryptography and Cybersecurity 3.8. DYNAMIC ACCUMULATORS 205 Finally, we state, without proof, the correctness and unforgeability of MT. A similar claim is proven in [128] (Lemma 2). Lemma 3.3. Assume that h is a (keyless) collision-resistant hash function (under the random oracle model). Then the MT Digest is a static accumulator that ensures PoI correctness and unforgeability. 3.8 Dynamic Accumulators In some applications, there is a need to accumulate a set of strings that changes over time, typically, by accumulating new strings in addition to the alreadyaccumulated strings. We use the term Dynamic Accumulator to refer to a stateful accumulator that supports incremental accumulation of multiple input sets of strings, e.g., the l input sets of strings {Xi }li=1 , each received in an Accum event. The lth call to the Accum function of a Dynamic Accumulator produces the digest ∆l of the strings received in these l Accum events. More ⃗ l, precisely, the dynamic accumulator outputs the digest of the set of strings X deőned as: ⃗ l ≡ X1 + X + ... + + X1 (3.32) Dynamic accumulators extend upon the mechanisms and properties of static accumulators. In particular, they should satisfy collision resistance and ⃗l PoI unforgeability, extended to support accumulation of the strings in X over multiple Accum events. We present the extended deőnitions for collision resistance in subsection 3.8.2. Before that, in the following subsection, we discuss the motivations for using dynamic accumulators, different ways in which dynamic accumulators extend upon static accumulators, and őnally deőne dynamic accumulators. 3.8.1 Dynamic accumulators: motivations, extensions and definition There are two main motivations for using dynamic accumulators instead of re-applying a static accumulator α, applying α repeatedly on all input strings each time that a new set of strings is added to the accumulator. One motivation is efficiency; by reusing results from previous applications of the accumulator, dynamic accumulators are usually more efficient than the comparable use of the static accumulator α. Another motivation for dynamic accumulators is to provide verifiable consistency, allowing the use of an old digest to validate a new digest. Let ∆ be ⃗ and ∆′ be the digest of the set of strings X ⃗ ′. the digest of the set of strings X, ′ ′ ⃗ ⊆X ⃗ . We say that ∆ is consistent with ∆ if, and only if, X To verify consistency, dynamic accumulators deőne the consistency verification predicate VerUpd. This is a a new operation, i.e., it does not exist in static accumulators. The VerUpd operation efficiently verifies the consistency of an ‘updated digest’ ∆′ with a ‘previous digest’ ∆. To facilitate the veriőcation, Applied Introduction to Cryptography and Cybersecurity 206 CHAPTER 3. INTEGRITY: FROM HASHING TO BLOCKCHAINS VerUpd receives a third parameter U , in addition to the two digests ∆ and ∆′ . The U parameter is a set containing one or more update values. Normally, ∆′ is the result of a series of Accum events following the Accum event which outputted ∆. 
In this case, U contains one update value u from each of these Accum events. The veriőcation should be correct and sound, which intuitively mean: Sound consistency verification: It is infeasible for an attacker to output an inconsistent pair of digests ∆, ∆′ and a set U , that will pass veriőcation as consistent, i.e., s.t. VerUpd(∆, U, ∆′ ) = True. Correct consistency verification: Suppose the accumulator outputs digest ∆0 and then has a sequence of q Accum operations, resulting in the series of update values U ≡ {ui }qi=1 , for some q ≥ 1. Let ∆q be the digest output by the last Accum operation in the sequence. Then VerUpd(∆1 , U, ∆q ) = True. Note that ∆q is indeed consistent with ∆0 . Updating the PoIs. Dynamic accumulators also use the update values u output by the Accum operations to update the Proofs-of-Inclusion (PoIs). This allows veriőcation of that a string x was accumulated into a digest ∆q , even if x was input to an ‘early’ Accum event, producing digest ∆0 , and followed by some number q of additional Accum events until the last of them produced ∆q . Let π be the PoI for x produced in the ‘early’ Accum event. In order to verify that x was accumulated into ∆q , we need to update the PoI, using another new operation of dynamic accumulators: UpdPoI. The UpdPoI operation receives the outdated PoI π, corresponding to a string x ∈ X0 , and a set U containing one or more update values, say U = {u1 , . . . , uq }. The output of UpdPoI is an updated version of the PoI, which we denote πq . The UpdPoI operation should ensure correctness. Namely, let π be the PoI generated for input string x accumulated into ∆0 in an Accum event. Suppose this is followed by a sequence of q Accum operations, resulting in the series of update values U ≡ {ui }qi=1 , for some q ≥ 1. Let ∆q be the digest output by the last Accum operation in the sequence. If π ′ ← UpdPoI(π, U ), then VerPoI(∆q , x, π ′ ) should return True. To summarize, dynamic accumulators extend static accumulators in the following ways: State: Dynamic accumulators are stateful. The state of the accumulator, denoted s, allowing adding new string to the current digest, provided as input to Accum, with the new state produced as output. Additional outputs of Accum: The Accum function of a dynamic accumulator produces two additional outputs, in addition to the digest (∆) and PoI (π) which are produced by both dynamic and static accumulators. These two additional outputs are a new state, s′ , and an update value, u. The update value u is used to update the already-published Proofs-of-Inclusion Applied Introduction to Cryptography and Cybersecurity 3.8. DYNAMIC ACCUMULATORS 207 (PoIs), allowing old PoIs to be used with the new digest, and to validate that the new digest is consistent with the previous digest. Additional functions: Dynamic accumulator schemes include two additional functions: VerUpd, which veriőes that a given new digest ∆′ is consistent with, i.e., extends, the current digest ∆; and UpdPoI, which updates old PoIs so they can be used with an updated digest. Both VerUpd and UpdPoI use the u values produced in the interim Accum operations. Additional and modified properties: To allow for multiple Accum events, we need to make some changes to the collision-resistance and PoI requirements, and to add properties specify to the dynamic behaviour: sound and correct consistency verification, and UpdPoI correctness. We describe the changes to collision-resistance in subsection 3.8.2, and brieŕy discussed the other properties above. 
We complete this subsection by defining (keyless) dynamic accumulators.

Definition 3.16 (Dynamic accumulator schemes). A (keyless) dynamic accumulator scheme α is defined by four algorithms: α.Accum, α.VerPoI, α.VerUpd and α.UpdPoI, where:

α.Accum_s(X) → (s′, ∆, Π, u) is a deterministic algorithm that accumulates a sequence (ordered set) of binary strings X = ⟨x_i⟩_{i=0}^{m−1} ∈ ({0, 1}∗)^m. α.Accum outputs a new state s′, a digest ∆ ∈ {0, 1}∗ ∪ {⊥}, a set Π = {π_i}_{i=0}^{m−1} ∈ ({0, 1}∗)^m of proofs-of-inclusion, and an update value u ∈ {0, 1}∗. In the first call, we use the initial state s = 1^λ.

α.VerPoI(∆, x, ID, m, π) → {True, False} is a deterministic algorithm that verifies if x was part of the inputs accumulated into ∆, using the Proof-of-Inclusion π, optionally using the number of strings m and the sequential number ID.

α.VerUpd(∆, U, ∆′) → {True, False} is a deterministic algorithm that verifies that new digest ∆′ is consistent with previous digest ∆, using a set of update values U.

α.UpdPoI(π, U) → π′ is a deterministic algorithm that computes an updated PoI π′, given the current PoI (π) and a set of update values U.

3.8.2 Dynamic Accumulator Collision-Resistance

Dynamic accumulators should ensure collision resistance as well as PoI correctness and unforgeability; the notions are similar to the corresponding notions for static accumulators, with the main change being that the dynamic PoI properties allow for the use of the UpdPoI function (to update the PoI after accumulating additional strings). In addition, dynamic accumulators should ensure consistency and correctness for the verified-update function (VerUpd).

We define collision resistance in this subsection, and define PoI correctness and unforgeability in the following subsections. But first, let us introduce notation for the result of multiple invocations of the Accum operation, which we use in all of these definitions.

Definition 3.17 (Multiple Accum notation). Consider an ordered set {X_j}_{j=1}^{l}, where each X_j is itself a sequence of binary strings. Let α be a dynamic keyless accumulator and s_0 ≡ 0^{λ+1}. For l ≥ 1, define recursively:

s_l ≡ α.Accum_{s_{l−1}}.s(X_l)   (3.33)

And, using s_l (for l ≥ 1), define:

α.Accum.s({X_j}_{j=1}^{l}) ≡ s_l
α.Accum.∆({X_j}_{j=1}^{l}) ≡ α.Accum.∆_{s_{l−1}}(X_l)   (3.34)
α.Accum.π({X_j}_{j=1}^{l}) ≡ α.Accum.π({X_j}_{j=1}^{l−1}) ++ α.Accum.π_{s_{l−1}}(X_l)

We now use the multiple Accum notation to define dynamic collisions. Similar to Definition 3.13, we define both ordered collisions and unordered collisions. Note that for dynamic accumulators, there is another aspect of ordering: the order among the Accum events, and which messages were received in each Accum event. For unordered collisions, we ignore both aspects of ordering, while for ordered collisions, we respect both, i.e., we expect different digests if the same set of messages is reordered or split in a different way into Accum events.

Definition 3.18 (Dynamic accumulator collisions). Consider two ordered sets X = {X_i}_{i=1}^{l} and X′ = {X′_i}_{i=1}^{l′}, where X_i and X′_i are sequences of binary strings. Let α be a dynamic accumulator such that:

α.Accum.∆(X) = α.Accum.∆(X′) ̸= ⊥

Let Im(X) denote the set of messages in X. If Im(X) ̸= Im(X′), then we say that (X, X′) is an unordered collision, or simply a collision, for α. If X ̸= X′, then we say that the pair (X, X′) is an ordered collision for α.
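The following Python sketch merely restates the interface of Definition 3.16 in code, as one possible mapping; the method names are illustrative and bytes stand in for binary strings.

from abc import ABC, abstractmethod

class DynamicAccumulator(ABC):
    # Sketch of the four algorithms of a (keyless) dynamic accumulator scheme.

    @abstractmethod
    def accum(self, s: bytes, X: list) -> tuple:
        """Accumulate the sequence X of binary strings under state s;
        return (new state s', digest Delta, PoIs Pi, update value u)."""

    @abstractmethod
    def ver_poi(self, delta: bytes, x: bytes, ID: int, m: int, pi: bytes) -> bool:
        """Verify that x was accumulated into digest delta, using PoI pi
        (and, optionally, the number of strings m and sequence number ID)."""

    @abstractmethod
    def ver_upd(self, delta: bytes, U: list, delta_new: bytes) -> bool:
        """Verify that delta_new is consistent with delta, using update values U."""

    @abstractmethod
    def upd_poi(self, pi: bytes, U: list) -> bytes:
        """Update an old PoI so it can be verified against the updated digest."""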
Finally, we define collision-resistance for dynamic accumulators.

Definition 3.19 (Dynamic Accumulator Collision-Resistance). A dynamic accumulator scheme α satisfies a) (unordered) Collision-Resistance, or b) Ordered Collision-Resistance, if for any PPT adversary A there is negligible probability for A to output a collision X = {X_i}_{i=1}^{l} and X′ = {X′_i}_{i=1}^{l′}. Namely:

Prob[ α.Accum.∆(X) = α.Accum.∆(X′) ̸= ⊥, where (a) Im(X) ̸= Im(X′) or (b) X ̸= X′, and pk, X and X′ were generated by: (pk, s_0) ← α.Setup(1^λ); (X, X′) ← A(pk) ] ∈ Negl(λ)   (3.35)

3.8.3 Constructing a dynamic accumulator from a static accumulator

We conclude this section by presenting a construction of a dynamic accumulator αD from a static accumulator α. This construction is simple, and has an efficient accumulation function, producing a short digest which is also used as the state.

Figure 3.17: Constructing a dynamic accumulator αD from a static accumulator α. [The figure shows three applications of the static accumulator α: the first computes B1 ≡ α.∆(0^n, x1, x2, x3) with flag bit 0, the second computes B2 ≡ α.∆(B1, x4, x5, x6) with flag bit 1, and the third computes B3 from B2 and x7, x8, x9, also with flag bit 1.]

The digest function of the construction is illustrated in Figure 3.17. The figure illustrates the digest function αD.∆ of the dynamic accumulator, for the case where the scheme is used in three Accum events, each time accumulating a set of three strings. As illustrated in Figure 3.17, the computation of the digest and state functions of the αD accumulator is simple and efficient, as follows:

αD.∆_{0^{λ+1}}(X) ≡ α.∆(0^{λ+1} ++ X)   (3.36)
αD.s_{0^{λ+1}}(X) ≡ 1 ++ αD.∆_{0^{λ+1}}(X)   (3.37)
αD.∆_{1++s}(X) ≡ α.∆(1 ++ s ++ X)   (3.38)
αD.s_{1++s}(X) ≡ 1 ++ αD.∆_{1++s}(X)   (3.39)

Indeed, the αD accumulator is a good choice if the goal is to have a simple dynamic accumulator that ensures collision resistance. However, many applications of dynamic accumulators require efficient Proof-of-Inclusion and verifiable updates, and for the αD accumulator, these functions are quite simple but inefficient. We leave it to the reader to define these functions. The αD accumulator is very similar to the Merkle-Damgård construction, which we study next; and a variant of the αD accumulator is used by the Bitcoin blockchain, see subsection 3.10.2.

3.9 The Merkle-Damgård Construction

In this section, we present the well-known Merkle-Damgård design. We present two variants of this design. First, given a hash function h, we define the Merkle-Damgård static accumulator MD^h. Then, we modify MD^h to create the Merkle-Damgård hash-function construction hMD of a CRHF, which allows arbitrary-length input strings, from a compression function h, i.e., a CRHF which supports only inputs of some fixed length (see Figure 3.19). Both Merkle-Damgård constructions are very similar to the αD dynamic accumulator of subsection 3.8.3.

3.9.1 The Merkle-Damgård Static Accumulator MD^h

The Merkle-Damgård Static Accumulator, MD^h, is a simple static accumulator, built using a CRHF h. The MD^h accumulator is mostly useful for computing the digest (MD^h.Accum.∆). For completeness, we also define the PoI (MD^h.Accum.π) and the MD^h.VerPoI functions; however, these functions are absurdly inefficient. In practice, MD^h is only used to construct the Merkle-Damgård hash function hMD, which requires only the (reasonably efficient) MD^h.Accum.∆ function (see Section 3.9).
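As a concrete illustration of the αD construction of Equations 3.36-3.39 (subsection 3.8.3 above), here is a minimal Python sketch. The static_digest function below is only a stand-in for the static accumulator digest α.∆ (a SHA-256 hash over the length-prefixed inputs, providing no PoIs), and the one-bit flag is widened to a whole byte; both are assumptions made for this sketch, not part of the construction as defined.

import hashlib

LAMBDA = 256  # security parameter, in bits (an assumption of this sketch)

def static_digest(strings):
    # Stand-in for alpha.Delta: hash the length-prefixed concatenation of the inputs.
    h = hashlib.sha256()
    for s in strings:
        h.update(len(s).to_bytes(8, 'big'))
        h.update(s)
    return h.digest()

class AlphaD:
    # The alpha^D dynamic accumulator: the state is a flag plus the last digest.
    def __init__(self):
        self.state = b'\x00' + bytes(LAMBDA // 8)      # initial state 0^(lambda+1)

    def accum(self, X):
        delta = static_digest([self.state] + list(X))  # digest of state ++ X
        self.state = b'\x01' + delta                   # new state: 1 ++ new digest
        return delta

# Example: three Accum events, as in Figure 3.17.
acc = AlphaD()
for block in ([b'x1', b'x2', b'x3'], [b'x4', b'x5', b'x6'], [b'x7', b'x8', b'x9']):
    print(acc.accum(block).hex())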
Later, in subsection 3.9.3, we also use MD h as the basis for a dynamic accumulator which we denote DMD h ; the DMD h .Accum.π and DMD h .VerPoI functions are are similarly absurdly inefficient, so we don’t expect DMD h to be actually used for any application. We just consider it useful as a simple example of a dynamic accumulator. The MD h .∆ function. Let us őrst deőne the digest function MD h .Accum.∆, which produces the digest resulting from accumulating the sequence of strings {x1 , . . . , xl }. For conciseness, we use the ‘shorthand’ notation MD h .∆ instead of writing the ‘full name’ of the function, MD h .Accum.∆. We compute the digest MD h .∆({x1 , . . . , xl }) as follows:  MD h .∆ {xi }li=1 ≡     +1+ + xl  For l > 1 : h MD h .∆({xi }l−1 i=1 ) +   (3.40) h(0n+1 + + x1 ) For l = 1 : Figure 3.18 illustrates the computation of MD h .∆ for the case of four input strings, {x1 , . . . , x4 }. x1 0 0n h x2 1 MD h .∆({x1 }) h x3 1 MD h .∆({x1 , x2 }) h x4 1 ... h MD h .∆({x1 , x2 , x3 , x4 }) Figure 3.18: The digest function of the Merkle-Damgård accumulator MD h , applied to the input sequence {x1 , . . . , x4 }. Let us give a simple example for the computation of MD h .∆. Applied Introduction to Cryptography and Cybersecurity 3.9. THE MERKLE-DAMGÅRD CONSTRUCTION 211 Example 3.2. Let X = {11, 22, 33} be sequence of three strings. We compute the Merkle-Damgård digest of X, using underlying hash h, by: MD h .∆ ({11, 22, 33}) = h(MD h .∆({11, 22}) + +1+ + 33)     h = h h MD .∆({11}) + +1+ + 22 + + 133   = h h h(0n+1 + + 11) + + 122 + + 133 (3.41) For example, when we use the hash function h ≡ hsum (Example 3.1), we have:   MD h .∆ ({11, 22, 33}) = hsum hsum hsum (0n+1 + + 11) + + 122 + + 133 = hsum (hsum (2 + + 122) + + 133) (3.42) = hsum (7 + + 133) =5 The Merkle-Damgård accumulator ensures collision resistance. Even before we deőne the other functions of MD h , we can already show that MD h ensures collision-resistance, since collision resistance depends only on the digest function (see Deőnition 3.14). Lemma 3.4. If h is a CRHF, then MD h is an any collision-resistant accumulator. Proof: assume, to the contrary, that adversary A M D ∈ P P T has nonnegligible probability to output a collision. We use A M D to construct adversary A h which has non-negligible advantage to őnd a collision for h. Speciőcally, A h runs A M D , and whenever A M D outputs a collision for MD h .∆, then A h outputs a collision for h. Let us explain how. ∗ Let the collision be (B, B ′ ), i.e., : B, B ′ ∈ ({0, 1}∗ ) ∧ (B ̸= B ′ ) ∧ h h MD .∆(B) = MD .∆(B ′ ). Denote the number of messages in B by l and the number of messages in B ′ by l′ , and, without loss of generality, assume l ≥ l′ . The proof is by induction on l. If l = 1 then l′ must also be one, hence: MD h .∆(B) = h(0n+1 + + x1 ) and MD h .∆(B) = h(0n+1 + + m′1 ). Since B ̸= B ′ , and in this case B = {x1 } and B ′ = {m′1 }, it follows that x1 ̸= m′1 . Let m̄ = 0n+1 + + x1 , m̄′ = 0n+1 + + m′1 ; ̸ m̄′ . Hence, (m̄, m̄′ ) is a collision that A h will output, as it follows that m̄ = claimed. Assume therefore that the claim holds for l = i and we prove it holds also for l = i + 1. First assume l′ > 1. We have MD h .∆ ({x1 , . . . , xi+1 }) ≡ h(MD h .∆(x1 , . . . , xi ) + +1+ + xi+1 ) and +1+ + m′l′ ) MD h .∆ ({m′1 , . . . , m′l′ }) ≡ h(MD h .∆(m′1 , . . . , m′l′ −1 ) + and of course MD h .∆ ({x1 , . . . , xi+1 }) = MD h .∆ ({m′1 , . . . , m′l′ }) Applied Introduction to Cryptography and Cybersecurity (3.43) 212 CHAPTER 3. 
INTEGRITY: FROM HASHING TO BLOCKCHAINS Now, if the inputs to the hash are identical, than this means that: MD h .∆(m′1 , . . . , m′l′ −1 ) = MD h .∆(x1 , . . . , xi ) This would contradict the induction hypothesis. If the inputs to the hash are different, then this is a collision. Hence, in both cases, A h can output a collision, as claimed. It remains to consider the case where l′ = 1 (and we prove for l = i + 1 after we proved for l = i). Equation 3.43 still holds, but in this case we have MD h .∆ ({m′1 }) ≡ h(0n+1 + + m′l′ ) and MD h .∆ ({x1 , . . . , xi+1 }) = MD h .∆ ({m′1 }) = h(0n+1 + + m′l′ ) Since the (n + 1)th bit differs between these two inputs to h, but their outputs are the same, it follows that also in this case, A h outputs a collision and the claim is complete. Collisions for ‘slightly modified’ Merkle-Damgård accumulator. The design of the Merkle-Damgård accumulator may appear arbitrary, and some minor modiőcations, simplifying the design or improving performance, may appear harmless. However, changes to well-studied cryptographic mechanisms are dangerous; we will demonstrate this by considering two subtle issues of the design. First, let us consider the string of 0n bits that we append to the őrst message x1 in Equation 3.40, which is often referred to as an Initialization Vector (IV), similarly to the term used for ‘modes of operation’ of block cipher (Section 2.8). One obvious question is whether the value of the IV must be 0n , rather than, say, 1n or any other n-bit string. Here, in fact, the answer is that the choice of 0n is completely arbitrary; any n-bit string could be used, as long as it is a fixed string. But why must the IV be őxed? This seems wasteful. In particular, suppose that |x1 | = n. Can we then save one hash operation, by using x1 as the n-bits input used to hash x2 ? Unfortunately, this seemingly-minor change would allow collisions, as the next exercise shows. Exercise 3.11. Assume that |x1 | = n, and consider a variant on the MD construction where we change Equation 3.40 so that for l = 1, we have: MD h .∆ ({x1 }) ≡ x1 . This variant ‘saves’ a hash operation; however, show that is not collision resistant. Another possibly puzzling aspect of the construction is the fact that the input to the compression function includes one special bit, which is not taken from the input strings xi and also not from the IV or from the previous digest. This bit is needed to allow for the case that an attacker may know a preimage of the IV string, with a particular property; and an attacker may know such preimage even if h is collision-resistant. See the next exercise. Applied Introduction to Cryptography and Cybersecurity 3.9. THE MERKLE-DAMGÅRD CONSTRUCTION 213 Exercise 3.12. Show that collisions may be possible for a variant of the construction where Equation 3.40 is replaced by:    h l−1  For l > 1 : h MD .∆({x } ) + + x  i l i=1  MD h .∆ {xi }li=1 ≡   For l = 1 : h(0n + + x1 ) Hint: Let h be a hash function with a known preimage of 0n , i.e., where we know a value z such that h(z) = 0n . Show a collision for MD h . Then, show that such CRHF h may exist, by assuming some CRHF h′ , and using it, show that there exist a CRHF h with known preimage. The proof of Lemma 3.4 may be helpful. Defining the PoI functions of MD h . As we mentioned above, the MD h PoI functions, namely, the MD h .Accum.π function, computing the PoI values for the string accumulated, and the MD h .VerPoI function, verifying the PoI of a particular string, are both absurdly inefficient. 
Therefore, we do not expect these functions to be deployed in any application. Still, we őnd it useful to deőne them, as an additional example of an accumulator. In fact, we deőne an especially inefficient pair of PoI functions:  m−1 m−1 [j] ≡ {xi }i=0 (∀j : 0 ≤ j < m) MD h .Accum.Π {xi }i=0  if x ̸∈ π  False MD h .VerPoI(∆, x, ID, m, π) ≡ False if ∆ ̸= MD h .∆(π)  True else It is easy to see that this simple implementation ensures PoI correctness and unforgeability. However, as we mentioned, this implementation is absurdly inefficient. Speciőcally, the PoI consists of the entire sequence accumulated, and validation requires recomputing of the digest. The following exercise shows a more efficient - but still absurdly inefficient - implementation. Exercise 3.13. Present a more efficient design for the MD h PoI functions. In particular, the length of the PoI would be about m digests, rather than the entire input sequence {xi }m−1 i=0 . 3.9.2 The Merkle-Damgård Hash Function hMD In this subsection, we present hMD , the Merkle-Damgård construction of a hash function hMD ; this construction is used in the design of multiple hash functions, including the MD5, SHA-1 and SHA-2 hash functions10 . The construction was proposed independently by Merkle [282] and Damgård [112]; I personally őnd Damgård’s text easier to follow. Our presentation is different from the presentation in both papers, since we present the Merkle-Damgård Hash Function 10 The SHA-2 standard defines multiple hash functions using similar design but different sizes, including SHA-256, SHA-512, and others. SHA-3 uses a different design. Applied Introduction to Cryptography and Cybersecurity 214 CHAPTER 3. INTEGRITY: FROM HASHING TO BLOCKCHAINS x ∈ {0, 1} h(x) ∈ {0, 1}n , n′ h n < n′ Standard MD5 SHA-1 SHA-256 SHA-512 n′ 512 512 512 1024 n 128 160 256 512 Figure 3.19: Compression function h: n′ -bit input to n-bit output, n < n′ , and the values of n′ and n used by the MD5, SHA-1, SHA-256 and SHA-512 standard hash functions, which use the Merkle-Damgård construction of a hash function from a compression function. construction using the Merkle-Damgård accumulator, presented in the previous subsection. The Merkle-Damgård construction builds the hash function hMD from a compression function 11 h. The compression function h is a special hash function which maps binary strings of some length n′ into shorter strings of length n < n′ . This is in contrast to most hash functions, including hMD , which allow input strings of arbitrary length. See Figure 3.19. One motivation to build a cryptographic hash function from a compression function, is the ‘cryptographic building blocks’ principle (principle 8): The security of cryptographic systems should only depend on the security of a few basic building blocks. These blocks should be simple and with well-defined and easy-to-test security properties. Cryptographic hash functions are often viewed as a building block of applied cryptography, due to their simplicity and wide range of applications. However, compression functions are even simpler, since their input is restricted to őxed-input length strings. Let us proceed to explain the the Merkle-Damgård hash function construction hMD . We begin with a simpliőed version, where the input messages consist of an integral number of blocks, i.e., strings of (n′ − n − 1)-bits each. The Merkle-Damgård hash construction, simplified: messages consisting of integral blocks. 
We őrst describe a simpliőed Merkle-Damgård construction, which is deőned only for input (binary) strings which contain an integer number of blocks of (n′ − n − 1) bits; let us denote this integer by |x| m ≡ (n′ −n−1) . Given a collision-resistant compression function h, we deőne the following CRHF hMDwo (for inputs of m blocks): hMDwo (x) ≡ MD h .∆ ({xi }m i=0 ) , where xi is deőned by: xi ≡ x [i · (n′ − n − 1) : (i + 1) · (n′ − n − 1) − 1] (3.44) 11 The term compression function may not be the best choice; it may have been clearer to refer to such functions, with Fixed-Input-Length (FIL), as FIL-hash functions. However, the use of ‘compression functions’ is entrenched in the literature; hence, it seems best to use it. Also, note that we denote the compression function by h, although it is not mnemonic, since we build hMD from h using the MD h accumulator, which we defined for a hash/compression function h. Applied Introduction to Cryptography and Cybersecurity 3.9. THE MERKLE-DAMGÅRD CONSTRUCTION 215 Let us explain the design of hMDwo . we őrst split the input m into m strings, (m−1) {xi }i=0 , each containing (n′ − n − 1)-bits. Then, we compute the MD h (m−1) digest of the sequence {xi }i=0 , as in Figure 3.18. We illustrate hMDwo in Figure 3.20; in both the őgure and above, we use x[i : j] to denote the sub-string of x, from the ith bit to the j th bit. x[0 : (n′ − n − 2)] x For SHA-256: 255 bits (n′ − n − 1) bits 0 0n h n bits  (n′ − n − 1) : (2n′ − 2n − 3)  x (n′ − n − 1) bits 1 h n bits  (2n′ − 2n − 2) : (3n′ − 3n − 4)  x (n′ − n − 1) bits 1 h n bits  (3n′ − 3n − 4) : (4n′ − 4n − 5)  (n′ − n − 1) bits 1 h hMDwo (x) Figure 3.20: The simpliőed Merkle-Damgård hash hMDwo (x), deőned for inputs x whose length is an integral number of ‘blocks’, each containing (n′ − n − 1) bits, i.e., |x| = 0 mod (n′ − n − 1). The speciőc numbers used in this example are n′ = 512, n = 256 (as for SHA-256), resulting in block length of 255 bits. The input x is four blocks, i.e., 1020 bits. It is easy to see, from Lemma 3.4, that if h is a collision-resistant compression function, then hMDwo would be collision-resistant. The Merkle-Damgård hash construction with MD-strengthening. The ‘full’ Merkle-Damgård hash function construction, hMD , includes an additional preprocessing step, usually referred to as MD-strengthening. MDstrengthening avoids the restriction that the input x satisőes |x| = 0 mod (n′ − n), required by hMDwo (from Equation 3.44). Namely, the MD-Strengthening step allows us to handle binary strings of arbitrary length as input. MD-strengthening pads the message x with additional bits before hashing it with hMDwo , so that the length of the resulting string, |padM D (x)|, would be an integer number of blocks (of (n′ − n − 1) bits each). Namely, we pad x with p = (n′ − n − 1) − [|x| mod (n′ − n − 1)]. For example, if n′ = 512 and n = 256, as for SHA-256 (Figure 3.19), and x contains ten bytes (80 bits), then we pad x with p = 255 − 80 = 175 bits. The same pad will be used if |x| = 940 bits. The pad is computed using the padding function padM D (x), deőned as: padM D (x) ≡ x + + [0p ∨ bin(p)] where bin(p) is the binary encoding of integer p, and p ≡ (n′ − n − 1) − [|x| mod (n′ − n − 1)] (3.45) Given the hash function hMDwo of the Merkle-Damgård without strengthening, as deőned in Equation 3.44, we can construct the Merkle-Damgård hash hMD as: hMD (x) = hMDwo (padM D (x)). Alternatively, Equation 3.46 deőnes hMD directly from the deőnitions of MD h .∆ (Equation 3.40 and padM D (Equation 3.45). 
Applied Introduction to Cryptography and Cybersecurity CHAPTER 3. INTEGRITY: FROM HASHING TO BLOCKCHAINS 216 hMD (x) ≡ MD h .∆ ({xi }m i=0 ) , where m ≡ l |x| (n′ −n−1) m , (3.46) (i < m) xi ≡ x [i · (n′ − n − 1) : (i + 1) · (n′ − n − 1) − 1] and xm ≡ padM D (x[m · (n′ − n − 1) :]) We illustrate hMD in Figure 3.21, for the SHA-256 example given above, i.e., for n′ = 512, n = 256, |x| = 940 bits and, therefore, p = 255 − [940 mod 255] = 255 − 170 = 85. x[0 : (n′ − n − 2)] x For SHA-256: 255 bits (n′ − n − 1) bits 0 0n h n bits  (n′ − n − 1) : (2n′ − 2n − 3)  x (n′ − n − 1) bits 1 h n bits  (2n′ − 2n − 2) : (3n′ − 3n − 4)  x[(3n′ − 3n − 4) :]+ + + +[0p ∨ bin(p)] (n′ − n − 1) bits (n′ − n − 1) bits 1 h n bits 1 h hMD (x) Figure 3.21: The Merkle-Damgård hash hMD (x), deőned for arbitrary-length input x. Shown here for an example using SHA-256 (n′ = 512, n = 256) and input x of 940 bits, requiring four blocks, with padding of p = 255 − [940 mod 255] = 255 − 170 = 85 bits. Given a collision-resistant compression function h, the Merkle-Damgård hash construction produces a collision resistant hash function (CRHF). This follows from Lemma 3.4. Lemma 3.5. If h is a collision-resistant compression function, then hMD , as defined as in Equation 3.46, is a collision-resistant hash function (CRHF). 3.9.3 The Merkle-Damgård Dynamic Accumulator (DMD h ) We complete this section by observing that the Merkle-Damgård accumulator can easily be extended into a dynamic accumulator, DMD h . The state of DMD h would simply be the last computed digest. Given one or more additional strings as input, the new digest - and new state - can be computed just like computation of the digest over the non-őrst strings xi (i > 1). This digest computation is quite efficient. However, like MD h , the PoI mechanisms of DMD h would be extremely inefficient. Therefore, practical applications would use other, more efficient, dynamic accumulators, such as based on the Merkle Tree design. 3.10 Blockchains, Proof-of-Work (PoW) and Bitcoin Blockchain schemes are an extension of dynamic accumulators, with many important and exciting applications. Like dynamic accumulators, blockchains accumulate a sequence of strings, which are collected over multiple Accum Applied Introduction to Cryptography and Cybersecurity 3.10. BLOCKCHAINS, POW AND BITCOIN 217 events, each time accumulating an additional sub-sequence, which we call a block. In principle, transactions can be arbitrary strings; however, when we discuss blockchain schemes, we use the term transaction rather than simply ‘string’, since the őrst and most well-known application of blockchains is for cryptocurrencies such as Bitcoin, where each transaction represents a payment and the ledger is the complete list of transactions, establishing the ownership of each bitcoin; there are also other applications of blockchains where the term ‘transaction’ is appropriate. For the same reason, we refer to the entire sequence of transactions accumulated by the blockchain as a ledger. We discuss the Bitcoin cryptocurrency in subsection 3.10.2. Like dynamic accumulators, every time we add a new block to the ledger, using the Accum operation, the operation outputs a digest, Proof-of-Inclusion (PoI) value for each transaction, update values for previous PoI values, and a new state for the blockchain. Similarly to dynamic accumulators, blockchain schemes should ensure collision-resistance and the security of the Proof-ofInclusion values. 
Basically, collision resistance means that it is infeasible to őnd two ledgers which produce the same digest; and PoI security means that it is only feasible to őnd a valid tuple of transaction x, digest ∆ and a PoI P oI, if ∆ is the result of accumulating a ledger which includes transaction x. Both requirements are the same as these deőned for dynamic accumulators, see Deőnition 3.18. The big difference between blockchain schemes and accumulators is that blockchains allow multiple participants to cooperate in accumulating blocks of transactions, where different transactions in the same block often come from different participants. Namely, the participants maintain a shared ledger of transactions. This introduces several requirements; we informally describe the most important requirements below. We do not present rigorous deőnitions, since such deőnition involve the feasibility of an execution involving multiple parties, which is beyond our scope. Note also that these requirements can only be satisőed under appropriate models (assumptions), such as maximal delay for communication and benign participants continuously attempting to mine (add a new block to the chain); deőning such models is also beyond our scope. Consistency. One important requirement is consistency. Namely, the ledgers kept by different participants are required to be consistent. Consistency is deőned with respect the the sequence of digests kept by each participant. Namely, let ∆p (i) be the digest that participant p receives after accumulating the ith block. Then for every participant j which also accumulated i or more blocks, it should hold that: ∆p (i) = ∆p′ (i). Formally, we use the special value ⊥ to identify the blocks already accumulated by a participant; i.e., if ∆p (i)[t] = ⊥ then it means that p did not accumulate block i up to time t, while if ∆p (i)[t] ̸= ⊥ then p has already accumulated block i by time t. Consistency boils down to two basic requirements: őrst, we require that blocks are accumulated consecutively; and second, we require that the digests of all participants are always consistent. Namely, for all benign participants p, p′ , Applied Introduction to Cryptography and Cybersecurity 218 CHAPTER 3. INTEGRITY: FROM HASHING TO BLOCKCHAINS block-numbers i, i′ , time t and time t′ > t, we require that: (t ≥ t′ ) ∧ (∆p (i)[t] ̸= ⊥) ⇒ ∆p (i)[t] = ∆p (i)[t′ ] (∆p (i)[t] ̸= ⊥) ∧ (∆p′ (i)[t′ ] ̸= ⊥) ⇒ ∆p (i)[t] = ∆p′ (i)[t′ ] (3.47) (3.48) Consistency ensures that the blockchain can only grow, i.e., is immutable, and is shared among all participants. For example, a blockchain may be used to keep track of the ownership of objects belonging to some participants; a participant p who owns an object X at time t, may sign a transaction x which transfers ownership of X to participant p′ . Once the transaction x is added to the blockchain, i.e., included in a block added to the blockchain, then the participants consider object X as belonging to p′ . Consistency is key to such applications, as the ownership of an item must always be well deőned. Consistency alone is not very useful. For example, consider a trivial blockchain where benign participants always output Bp [t](i) = ⊥, i.e., blocks are never added; consistency is satisőed in a trivial sense. Additional requirements are necessary to make blockchains useful. One such additional requirement is chain growth, i.e., requiring that the chain grows over time. Chain growth. 
The chain growth requirement ensures that new blocks are mined (added to the blockchain) over time, at least in a speciőed minimal chain growth rate g. The minimal chain growth rate g is speciőed in terms of blocks per time unit, e.g., blocks per second. We require that for every benign participant p and every time t such that the blockchain of p accumulated (at least) i blocks at time t, then at any later time t′ > t interval the blockchain of p would contain at least i + ⌊g · (t′ − t)⌋ blocks at time t′ . Namely, the blockchain scheme ensures chain growth of rate g, if: For all i ∈ N, t′ > t > 0 and benign participant p holds:    ′   t −t [t′ ] [∆p (i)[t] ̸= ⊥] ⇒ ∆p i + g (3.49) Bounded delay transactions. The őnal property we discuss is bounded delay T for adding a transaction to the blockchain. Namely, suppose that at time t a benign participant tries to add a new transaction to the ledger. Then, the new transaction should be added by t + T (or earlier). Note that a particular blockchain application may require transactions to satisfy additional requirements in order to be considered ‘valid’, e.g., a Bitcoin transaction specifying payment of a bitcoin is ignore unless it is properly signed by the entity whom, according to the ledger, currently owns that bitcoin. 3.10.1 Blockchain Design: Permissioned and Permissionless Blockchains Blockchain schemes have many diverse designs, and use different mechanisms to accumulate transactions into blocks which are added to the ledger, and to Applied Introduction to Cryptography and Cybersecurity 3.10. BLOCKCHAINS, POW AND BITCOIN 219 ensure the blockchain requirements above, e.g., consistency and chain-growth, as well as other requirements. Blockchain schemes mostly belong to one of two categories: permissioned blockchains and permissionless blockchains. Permissioned blockchains. In permissioned blockchains, only speciőc, authorized parties can add a block to the blockchain. In a typical permissioned blockchain scheme, the digest is signed by an authorized parties; some schemes allow one of multiple parties to sign digests, and other schemes require multiple authorized parties to sign every new digest. Speciőcally, if A.s is an authorized (private) signing key, then the ith digest, denoted ∆i , is sent together with a signature σi ≡ SignA.s (∆i ). Note that this requires all participants to know the corresponding public validation key, A.v. Permissionless blockchains. Permissionless blockchains are egalitarian: every participant may try to add a block to the blockchain, by following a process called mining. The participant to succeed in mining a given block is selected randomly, and the probability of a participant to win is proportional to the resources it allocates to the mining. To motivate participants to participate allocate resources to mining, permissionless blockchains provide a reward to a participant that mines a block. In permissionless blockchain cryptocurrencies such as Bitcoin, the reward is given in the cryptocurrency, e.g., as an amount of bitcoins. In fact, in Bitcoin, the mining rewards are the only mechanism which increases the amount of bitcoins in circulation. This is the reason that this process is referred to as mining; mining bitcoins, like mining gold, is a randomized process where the chance of success depends on the amount of allocated resources. We discuss Bitcoin in the following subsection. 
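To make the permissioned approach concrete, the following sketch has the authorized party sign each new digest ∆_i, and lets any participant verify the signature under the known public key A.v before accepting the digest. It assumes the third-party pyca/cryptography package and Ed25519 signatures, neither of which is prescribed by the text; key distribution and the choice of digest are out of scope here.

from cryptography.hazmat.primitives.asymmetric import ed25519
from cryptography.exceptions import InvalidSignature

signing_key = ed25519.Ed25519PrivateKey.generate()  # A.s, held only by the authorized party
verify_key = signing_key.public_key()               # A.v, known to all participants

def publish_digest(delta_i: bytes):
    # Authorized party: attach sigma_i = Sign_{A.s}(Delta_i) to the new digest.
    return delta_i, signing_key.sign(delta_i)

def accept_digest(delta_i: bytes, sigma_i: bytes) -> bool:
    # Any participant: accept the digest only if the signature verifies under A.v.
    try:
        verify_key.verify(sigma_i, delta_i)
        return True
    except InvalidSignature:
        return False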
3.10.2 The Bitcoin blockchain and cryptocurrency Bitcoin is the őrst and the most well-known cryptocurrency, and also the őrst and most well-known application of blockchains. In Bitcoin, the ledger deőnes which bitcoin numbers are currently valid, and who is the owner of each bitcoin. The owner is identiőed only using the owner’s public key, which allows owners to transfer a bitcoin to other participants, by signing an appropriate transaction using the public key ‘owning’ this bitcoin. Speciőcally, given a valid bitcoin number c and a ledger X, the Bitcoin function owner returns the public validation key v ≡ owner(c, X) which is said to own coin c. If owner(c, X) = ⊥, then we say that c is not a valid coin number according to ledger X. A Bitcoin transaction x is a triple (x.coins, x.v, x.σ) where x.coins is a list of bitcoin numbers, x.v is the public key to which the transaction transfers ownership of the bitcoins in x.coins, and x.σ is a digital signature. We say that x is a valid transaction for ledger X, if there is some public key v such that (∀c ∈ x.coins)v = owner(c, X) and V erif yv ((x.coins, x.v), x.σ) = True, i.e., Applied Introduction to Cryptography and Cybersecurity 220 CHAPTER 3. INTEGRITY: FROM HASHING TO BLOCKCHAINS x.σ is a valid signature over (x.coins, x.v) using the public key v. If X ′ is the ledger after adding a new block which includes transaction x, then the ownership over x.coins will be passed to x.v, i.e., (∀c ∈ x.coins)x.v = owner(c, X ′ ). Pseudonymity, Anonymity and Privacy in Bitcoin. Bitcoin provides pseudonymity, since transactions specify only the public validation key, and do not specify any identiőer of the payer or recipient. The same person or organization may use multiple different public validation keys. For example, Charlie, whose (validation, signing) key pair is (C.v, C.s), may transfer ownership of bitcoin number c to Don, whose key pair is (D.v, D.s), by using C.s to sign the transaction (c, D.v). There is however no need to know who is the payer or the payee. This property is often misunderstood to imply that Bitcoin ensures anonymity and privacy, which motivates the use of bitcoins by people concerned about being associated with their transactions, including criminals. In particular, ransom payments, to ransomware or otherwise, and other payments for illegal products and services, are often made in bitcoins. However, while Bitcoin transactions use pseudonyms rather than names, this does not provide real anonymity. In fact, since all Bitcoin transactions are documented on the public ledger, this makes it easy to trace the movement of bitcoins between different owning public keys. This fact is often used by researchers and law enforcement agencies to track the ŕow of bitcoins, which often allows identiőcation of the user, i.e., deanonymization. The Bitcoin Proof-of-Work (PoW). In Bitcoin, mining a new block requires a solution to a difficult computational problem, namely, a Proofof-Work (PoW). There are other applications of PoW in Cybersecurity, e.g., in defenses against some Denial-of-Service attacks; and there are also other approaches for mining in permission-less blockchains, e.g., Proof-of-Stake, where the mining probability is proportional to the amount of cryptocurrency held by each participant. We őrst discuss the general concept of PoW schemes, and then focus on the Bitcoin PoW and block-mining operation. 
Intuitively, a PoW allows one party, the worker, to solve a challenge, with an approximately known amount of computational resources, resulting in a proof of this success, which can be efficiently verified by anyone. Proof-of-Work schemes belong in this chapter on hash functions, both due to their use in permissionless blockchains, and in particular Bitcoin, and since their most well-known implementation, which is the one used in Bitcoin, is based on a hash function.

Notice that we used the general term 'computational resources' and not a more specific term such as computational power, i.e., number of computations per second, which is the resource used by the Bitcoin PoW. Indeed, some PoW proposals focus on other resources, e.g., storage, or on a combination of resources, e.g., time and storage. However, from this point, let us focus on PoW based on computational power, as used by Bitcoin.

Definition 3.20 (Proof of Work (PoW) - intuitive definition). A PoW scheme PoW consists of two efficient algorithms: PoW.solve and PoW.Validate.

PoW.solve: The PoW.solve algorithm receives three inputs: a challenge c ∈ C_D, a (random) nonce value r ∈ {0, 1}^n, and a work-amount w ∈ [1, 2^n]. The PoW.solve function outputs an n-bit binary string, to which we refer as the solution.

PoW.Validate: The PoW.Validate algorithm has four inputs: the challenge c, the nonce r, the required work w ∈ [1, 2^n], and a purported solution π ∈ {0, 1}∗. The PoW.Validate algorithm returns true or false.

A PoW scheme PoW is secure if finding a solution, i.e., the runtime of PoW.solve, is distributed randomly with an average of 2^λ/w computations of a given difficulty, typically, 2^λ/w computations of a given hash function h. The scheme is correct if:

(∀c, r, w) PoW.Validate(c, r, w, PoW.solve(c, r, w)) = true   (3.50)

Proof-of-work mechanisms are often implemented using a cryptographic hash function. A typical implementation, which is a simplification of the one in Bitcoin, is the PoW scheme PoW_B, defined as:

PoW_B.Validate: On input (c, r, w, π), return true if h(c ++ r ++ π) ≤ w. Otherwise, return false.

PoW_B.solve: On input (c, r, w), repeatedly compute x ≡ h(c ++ r ++ π) for different random π ← {0, 1}^n, aborting and returning π when x ≤ w.

Figure 3.22: The (simplified) Bitcoin blockchain, illustrated for three blocks. Each block begins with the coinbase transaction (c_i), followed by regular transactions (two for blocks 1 and 2, one for block 3). The Merkle-tree digest ∆′_i of the transactions of block i is hashed with the digest ∆_{i−1} of block i−1 (or with 0^λ for the genesis block), and with a nonce n_i. The value of the nonce should ensure that ∆_i is not higher than the current threshold.

The Bitcoin blockchain

We illustrate the Bitcoin blockchain in Figure 3.22. The reader may notice the similarity to the dynamic accumulator scheme illustrated in Figure 3.17. The transactions of block i, where i ≥ 1, are accumulated using a Merkle tree MT scheme. We use ∆′_i to denote the digest of the transactions in block i, from which we later compute the digest ∆_i of the blockchain with i blocks. To compute ∆_i, we apply a hash function h to ∆′_i and two other values: the previous digest ∆_{i−1} and a nonce value n_i, namely ∆_i ≡ h(∆′_i ++ ∆_{i−1} ++ n_i). The value of the nonce n_i is chosen randomly by the miner, until they find a nonce value n_i which ensures that ∆_i ≤ w, where w is the current work amount, following the Bitcoin Proof-of-Work (PoW) mechanism as described above.

Bitcoin operations and concerns: costs of PoW, rewards and adjustable work-amount w. Solving the proof-of-work (PoW) problem requires considerable computational resources; this is intentional, to ensure that all entities have a fair chance to add new blocks. Specifically, the probability of a specific miner to succeed in being the first to solve the PoW and to add the next block is roughly the fraction of the number of hash computations by this miner, out of the total number of hash computations done by all miners (for adding the block to the current blockchain). This prevents an attacker from adding most or all of the blocks, which would have allowed the attacker to abuse the system, e.g., by excluding specific transactions. However, this means that adding blocks requires the use of significant amounts of energy. Most of this energy does not even result in a new block, since only one miner can succeed in adding the new block. This wasteful use of large amounts of energy is one of the criticisms against Bitcoin, and motivates the use of blockchains which are more energy-efficient. There are several alternative designs which avoid this waste of energy, including permissioned blockchains and blockchains using other mining mechanisms, e.g., proof of stake.

To incentivize entities to invest the effort (and energy) required for mining (solving a PoW), Bitcoin rewards them by granting them a certain number of bitcoins every time they succeed in adding a new block to the chain. This reward is calculated through a clever 'reward rule', designed to make the reward sufficient but not excessive. The reward consists of a number of newly-minted bitcoins, and transaction fees paid by the payers of transactions in the block. The reward (new coins and fees) is allocated to the public key that the miner includes in a special transaction called the coinbase transaction, which is included in every Bitcoin block, as illustrated in Figure 3.22. The number of new bitcoins rewarded upon mining is cut in half once per 210,000 blocks mined, and would eventually become zero. For Bitcoin to remain a viable system after that point, transaction fees should provide sufficient incentive to motivate miners. The fee is indicated by the payer in each transaction, and this amount is moved from the payer to the miner of the block in the chain which includes this transaction; this is in addition to the amount of bitcoins transferred to the payee. Different transactions offer different fees, and miners may prefer transactions with higher fees; each block has limited size, so miners may not be able to include all available transactions in their blocks.

The work-amount parameter w of the PoW is adjusted automatically by a feedback mechanism in Bitcoin, whose goal is to ensure that new blocks will be added at a reasonable, but not excessive, rate. Namely, if blocks are added more quickly, then the work amount is increased - making it harder to mine new blocks, i.e., slowing down the rate of adding blocks.
This should maintain a stable rate of mining new blocks, and therefore, balancing between the overhead of block creation, and the delay until a new transaction appears on the chain. The mining rates can be impacted by multiple factors, including the energy costs of mining and the value of the award and fees obtained by a lucky miner, the likelihood of mining a block added to the chain (which depends on competition), and the costs and efficiency of mining hardware. 3.11 Lab and additional exercises Lab 3 (Checksum and CRC Collisions). In this lab, we experiment with attacks against a system using an insecure hash function, specifically, the the Internet Checksum error-detection codes as a hash function. Namely, to hash input x, we compute h(x), where h is the Internet Checksum function. We show that h is not a secure hash function, specifically, we find collisions, i.e., inputs x ̸= x′ such that h(x) = h(x′ ). Finally, we create a partially-chosen collision, i.e., a collision between the CRC/checksum-hash over two given (and very different) documents, by ‘filling in’ a designated area left undefined in each of the two documents. Note that this implies universal forgery of signatures computed using the Hash-then-Sign construction with CRC/checksum as the hash function. As for the other labs in this textbook, we will provide Python scripts for generating and grading this lab (LabGen.py and LabGrade.py). If not yet posted online, professors may contact the author to receive the scripts. The lab-generation script generates random challenges for each student (or team), as well as solutions which will be used by the grading script. We recommend to make the scripts available to the students, as example of how to use the cryptographic functions. It is easy and permitted to modify these scripts to use other languages/libraries or to modify and customize them as desired. 1. Let h denote the Internet Checksum function (of the input őle padded by a single 1 bit and then a minimal number of zero bits to result in legitimate input to the checksum function). Write a program to compute h. In you lab-input folder, in sub-folder checksum, őnd őles f1a.txt, f1b.txt and h1a.txt. File h1a.txt is the result of applying h over őle f1a.txt; use it to test your program. Then, compute the result of applying h over f1b.txt; name the resulting őle h1b.txt and place it in the lab-solutions folder. 2. Find and submit a collision for h Namely, place in sub-folder checksum of the lab-solutions folder three őles, f2a.txt, f2b.txt and h2.txt, such that h2.txt is the result of applying h over both őles (f2a.txt and f2b.txt). Applied Introduction to Cryptography and Cybersecurity 224 CHAPTER 3. INTEGRITY: FROM HASHING TO BLOCKCHAINS 3. In the sub-folder checksum of the lab-input folder, őnd őles f3a.txt and h3a.txt, which contains the Internet Checksum of f3a.txt. In f3a.txt, you will see a section designate łto changež. You should create a new őle f3as.txt which will be identical to f3a.txt, and have the same Internet Checksum (the one in h3a.txt), except that it would have different contents in the “to change” section. Upload f3as.html to the lab-solutions folder. 4. In the sub-folder checksum of the lab-input folder, őnd őle f4b.html. In f4b.html, you will őnd a conditional Javascript statement which compares two copies of the string łto changež to each other (resulting, of course, in ‘true’). 
You should create a new őle f4bs.html which will be identical to f4b.html, and have the same Internet Checksum, except that it would have different contents to one of the “to change” strings. Upload f4bs.html to the lab-solutions folder. Open the two őles in the browser and compare the results! Exercise 3.14 (XOR-hash). Consider the following hash function, defined for input messages consisting of number l of n bits ‘blocks’, i.e., total of l · n bits, where n is the length of the digest (n = |h(m)). Given such message m containing l · n bits, let us denote the ith block (of n bits) by mi , i.e., m = m1 + + m2 + + . . . ml and (∀i)|mi | = n. Define hash function h for such l · n bit messages, as: h(m1 . . . ml ) = Ll i=1 mi . Show that h does not have each of the following properties, or present a convincing argument why it does: 1. Collision-resistance (CRHF), see Section 3.2. 2. Second-preimage resistance (SPR), see Section 3.3. 3. Preimage resistance, i.e., h is not a one-way function (OWF), see Section 3.4. 4. Bitwise randomness extraction (BRE), see subsection 3.5.2. 5. Secure MAC, when h is used in the HMAC construction, see subsection 4.6.3. Solution to part 4 (randomness extraction): For simplicity, we present the solution for even n, i.e., n = 2µ where µ is an integer. Adversary A selects + 1µ + + 0µ + + 1µ . Let yb and y1−b input message m = 02n and mask M = 0µ + be the values computed by BREA,h (b, n); the reader can conőrm that y1−b is random, while yb is of the form 0µ + + r, where r is a random string of µ bits. On input (y0 , y1 , m, M ), the adversary A returns:  0 if m mod 2µ ̸= 0µ (3.51) A(y0 , y1 , m, M ) = 1 otherwise The reader should conőrm that the adversary is correct with overwhelming probability. Hence, h is not a bitwise randomness extractor (BRE). Applied Introduction to Cryptography and Cybersecurity 3.11. LAB AND ADDITIONAL EXERCISES 225 Exercise 3.15 (Insecure double-input hash). Let h be a ‘compression function’, i.e., a cryptographic hash function whose input is of length 2l and output is l of length l. Let h′ : {0, 1}2l·n → Ln{0, 1} extend h to inputs of length 2l · n, as ′ follows: h (m1 + +...+ + mn ) = i=1 h(mi ), where (∀i = 1, . . . , n)|mi | = 2l. For each of the following properties, assume h has the property, and show that h′ may not have the same property. Or, if you believe h′ does retain the property, argue why it does. The properties are: 1. Collision-resistance. 2. Second-preimage resistance. 3. One-wayness (preimage resistance) 4. Randomness extraction. Would any of your answers change, if h and h′ have a random public key as an additional input? Exercise 3.16 (Insecure XOR hash). Consider messages of 2n blocks of l bits each, denoted m1 . . . mn , and let hc be a secure compression function, i.e., a cryptographic hash function from 2n bits to l bits. DefineLhash function h for n such 2n blocks of l bits messages, as: h(m1 . . . m2n ) = i=1 hc (m2i , m2i−1 ). Show that h does not have each of the following properties, although hc has the corresponding property, or present a convincing argument why it does: 1. Collision-resistance. 2. Second-preimage resistance. 3. One-wayness (preimage resistance) 4. Bitwise randomness extraction. 5. Secure MAC, when h is used in the HMAC construction. Exercise 3.17 (Insecure cascade combining of hash). It is proposed to combine two hash functions by cascade, i.e., given hash functions h1 , h2 we define h12 (m) = h1 (h2 (m)) and h21 (m) = h2 (h1 (m). 
Suppose collision are known for h1 ; what does this imply for collisions in h12 and h21 ? Exercise 3.18 (Insecure committee-designed combined hash). Recently, weaknesses were found in few cryptographic hash functions such as hM D5 and hSHA1 , and as a result, there were many proposals for new functions. Dr. Simpleton suggests to combine the two into a new function, hc (m) = hSHA1 (hM D5 (m)), whose output length is 160 bits. Prof. Deville objects; she argued that hash functions should have longer outputs, and suggest a complex function, h666 , whose output size is 666 bits. A committee setup to decide between these two, proposes, instead, to XOR them into a new function: fX (m) = [0506 + + hc (m)] ⊕ h666 (m). Applied Introduction to Cryptography and Cybersecurity 226 CHAPTER 3. INTEGRITY: FROM HASHING TO BLOCKCHAINS 1. Present counterexamples showing that each of these may not be collisionresistant. 2. Present a design where we can be sure that finding a collision is definitely not easier than finding one in hSHA1 and in hc . 3. Repeat the first part for bitwise randomness-extraction. Exercise 3.19 (Ordered-collision resistance). Define ordered-collision resistance for accumulators, and show that every accumulator α which is resistant to unordered collisions, is also resistant to ordered collisions. Exercise 3.20 (SPR accumulator). cumulators. 1. Define second-preimage resistant ac- 2. Show that every collision-resistant accumulator is also a SPR accumulator. 3. Let α be a collision-resistant accumulator. Use α to construct another accumulator α′ , which will be SPR but not collision resistant. , and show res Exercise 3.21 (Insecure no-preőx Merkle Tree). Let MT′ be a variant on the Merle Tree design, which is identical to the one described in the text, except that it does not add the one-byte prefixes of 0x00 or 0x01 as used in Equations 3.23 and 3.24, respectively, using hash function h. Let (KG, S, V ) be a secure (FIL) ′ signature scheme. Let SsMT (X) = Ss (MT′ (X)) follow the ‘hash then sign’ paradigm, to turn (KG, S, V ) into a signature scheme for sequences of strings. ′ Show that SsMT is not a secure, existentially-unforgeable signature scheme, by presenting an efficient adversary (program) that outputs a forged signature. Exercise 3.22 (HMAC simpliőcation). Consider the following slight simplification of the popular HMAC construction: h′k (m) = h(k + + h(k + + m)), where h : {0, 1}∗ → {0, 1}n is a hash function, k is a random, public n-bit key, and m ∈ {0, 1}∗ is a message. 1. Assume h is a CRHF. Is h′k also a CRHF? □ Yes. Suppose h′k is not a CRHF, i.e., there is some adversary A ′ that finds a collision (m′1 , m′2 ) for h′ , i.e., h′k (m′1 ) = h′k (m′2 ). Then at least one of the following pairs of messages (m1,1 , m2,1 ), (m1,2 , m2,2 ) is a collision for h, i.e., either h(m1,1 ) = h(m2,1 ) or h(m1,2 ) = h(m2,2 ) (or both). The strings are: m1,1 = , m1,2 = , m2,1 = , m2,2 = . . Note □ No. Let ĥ be some CRHF, and define h(m) = that h is also a CRHF (you do not have to prove this, just to design h so this would be true and easy to see). Yet, h′k is not a CRHF. Specifically, the following two messages m′1 = , m′2 = ′ ′ ′ are a collision for hk , i.e., hk (m1 ) = hk (m2 ). Applied Introduction to Cryptography and Cybersecurity 3.11. LAB AND ADDITIONAL EXERCISES 227 2. Assume h is an SPR hash function. Is h′k also SPR? □ Yes. 
Suppose h′k is not SPR, i.e., for some l, there is some algorithm A ′ which, given a (random, sufficiently-long) message m′ , outputs a collision, i.e., m′1 ̸= m′ s.t. h′k (m′ ) = h′k (m′1 ). Then we define algorithm A which, given a (random, sufficiently long) message m, outputs a collision, i.e., m1 ̸= m s.t. hk (m) = hk (m1 ). The algorithm A is: Algorithm A(m): { Let m′ = Let m′1 = A ′ (m′ ) Output } □ No. Let ĥ be some SPR, and define h(m) = . Note that h is also an SPR (you do not have to prove this, just to design h so this would be true and easy to see). Yet, h′k is not an SPR. Specifically, given a random message m′ , then m′1 = is a collision, i.e., m′ ̸= m′1 yet h′k (m′1 ) = h′k (m′2 ). 3. Assume h is a OWF. Is h′k also a OWF? □ Yes. Suppose h′k is not OWF, i.e., for some l, there is some algorithm A ′ which, given h′k (m′ ) for a (random, sufficiently-long) message m′ , outputs a preimage, i.e., m′1 ̸= m′ s.t. h′k (m′ ) = h′k (m′1 ). Then we define algorithm A which, given h(m) for a (random, sufficiently long) message m, outputs a preimage, i.e., m1 s.t. hk (m) = hk (m1 ). The algorithm A is: Algorithm A(m): { Let m′ = Let m′1 = A ′ (m′ ) Output } □ No. Let ĥ be some OWF, and define h(m) = . Note that h is also an OWF (you do not have to prove this, just to design h so this would be true and easy to see). Yet, h′k is not an OWF. Specifically, given a random message m′ , then m′1 = is a collision, i.e., m′ ̸= m′1 yet h′k (m′1 ) = h′k (m′ ). 4. Repeat similarly for bitwise randomness extraction. Exercise 3.23 (Insecure prepend-key MAC). Consider the following construction: h′k (m) = h(k + + m), where h : {0, 1}∗ → {0, 1}n is a hash function, k is a secret n-bit key, and m ∈ {0, 1}∗ is a message. Assume you are given some SPR hash function ĥ : {0, 1}∗ → {0, 1}n̂ ; you can use n̂ which is smaller than n. Using ĥ, construct hash function h, so that (1) it is ‘obvious’ that h is also SPR (no need to prove), yet (2) h′k (m) = h(k + + m) is (trivially) not a secure MAC. Hint: design h s.t. it becomes trivial to find k from h′k (m) (for any m). Applied Introduction to Cryptography and Cybersecurity CHAPTER 3. INTEGRITY: FROM HASHING TO BLOCKCHAINS 228 1. h(x) = . 2. (Justification) h is an SPR, since 3. (Justification) h′k (m) = h(k + + m) is not a secure MAC, since: . . Exercise 3.24 (HMAC is secure under ROM). Show that the HMAC construction is secure under the Random Oracle Model (ROM), when used as a PRF, MAC and KDF. Exercise 3.25 (HMAC is insecure using CRHF). Show counterexamples showing that even if the underlying hash function h is collision-resistant, its (simplified) HMAC construction hmack (x) = h(k + + h(k + + m)) is insecure when used as any of PRF, MAC and KDF. Exercise 3.26 (Hash-tree with efficient proof of non-inclusion). The Merkle tree allows efficient proof of inclusion of a leaf (data item) in the tree. Present a variant of this tree which allows efficient proof of either inclusion or of noninclusion of an item with given ‘key’ value. In this tree, each item consists of two strings, a key and data. Assume all data items are given together, sorted by their key values; no need to build the tree dynamically or extend it. Your solution may ‘expose’ one or two additional data items beyond the one queried. Note: try to provide solution which is efficient in number of hash operations required for verification (the number should be about one more than in the regular Merkle tree). 
Hint: You can see example of proof of non-inclusion and its application in the NSEC3 record of DNSSEC (RFC 5155 [257]), and a graphical illustration in Figure 8.19. Exercise 3.27. The Merkle-tree scheme may also be useful for privacy, when some recipients should have access only to some files, e.g., if each file mi contains data which is private to user i. Note, however, that CRHFs - and Merkle-trees - may not ensure confidentiality. Collision-resistance does not ensure that the value of h(m) will not expose some information about m. Let h be a (keyed or keyless) CRHF. Use h to design another hash function g, s.t. (1) g is also a CRHF, yet (2) g exposes one or more bits of its input. Explain why this implies that the Merkle-tree construction does not guarantee privacy. In particular, explain why the P oI of one message may expose information about other messages. Exercise 3.28. This question is about digest and PoI for the Merkle tree scheme. For concreteness, we will also refer to the trivial (and insecure) hsum function (Example 3.1). 1. Compute the Merkle-tree digest, for the input sequence {10, 20}, as a formula for an arbitrary hash function h, and as a value for hsum . Solution: MT.∆(B1 ) = h(h(10) + + h(20)) When using hsum , we have MT.∆(B1 ) = 3. Applied Introduction to Cryptography and Cybersecurity (3.52) 3.11. LAB AND ADDITIONAL EXERCISES 229 2. Compute the Merkle-tree digest for input: B2 = {30, 40, 50, 60, 70, 80, 90, 100}. Present the digest as a formula for an arbitrary hash function h, and as a value for hsum . Solution: MT.∆(B2 ) = h [h (h(h(30) + + h(40)) + + h(h(50) + + h(60))) + + h (h(h(70) + + h(80)) + + h(h(90) + + h(100)))] When using hsum , we have MT.∆(B2 ) = 7. 3. Compute the PoI for the input value 50 in B2 . Present the PoI as a formula for an arbitrary hash function h, and as a value for hsum . Solution: The value 50 was the third input in B2 , therefore the PoI is: MT.P oI(B2 , 3) = MT.∆({70, 80, 90, 100}) + + MT.P oI({30, 40, 50, 60}, 3) = h [h(h(70) + + h(80)) + ++ +h(h(90) + + h(100))] + + + +h(h(30) + + h(40)) + + h(60) For the special case of hsum , we have MT.P oI(B2 , 3) = 7 + +7+ + 6. Applied Introduction to Cryptography and Cybersecurity Chapter 4 Authentication: Message Authentication Code (MAC), Blockchain and Signature Schemes Cybersecurity and cryptography address different goals related to threats to information and communication. The most well-known goals are confidentiality, integrity, authentication and availability. In cryptography, the terms integrity and authentication are used mostly as synonyms, both meaning the validation that communication and information comes from a speciőc entity, or from one of a speciőc set of entities, leaving us with the sassy acronym CIA, often referred to as the CIA triad, for confidentiality, integrity, authentication and availability1 So far, we have mostly focused on conődentiality, to which we dedicated all of Chapter 2. We also brieŕy introduced, in subsection 1.5.1, signature schemes, which are asymmetric (public key) authentication schemes. In this chapter, we discuss symmetric (shared key) authentication schemes, called Message Authentication Code (MAC). MAC schemes efficient - much more than comparably-secure signature schemes. People often expect that encryption will ensure authenticity as a side-product of ensuring conődentiality. 
Therefore, let us begin this chapter by discussing the use of encryption for authentication, and show that this can be vulnerable although, later, in Section 4.7, we also discuss authenticated encryption schemes, designed to ensure, indeed, both conődentiality and authenticity. 1 Originally, the ‘A’ in CIA referred to authentication; indeed, in computer security, ‘integrity’ has a different meaning: protection of a computer or other system from corruption. ‘Availability’ was later identified as another basic goal, with the less-sassy acronym CIAA. Both acronyms are also used sometimes with accountability replacing availability and/or authentication. Oh well, all worthy goals! 231 CHAPTER 4. AUTHENTICATION: MAC, BLOCKCHAIN AND SIGNATURE 232 SCHEMES 4.1 Encryption for Authentication? As we discussed in the previous chapter, encryption schemes ensure confidentiality, i.e., an attacker observing an encrypted message (ciphertext) cannot learn anything about the plaintext (except its length). Sometimes, people expect encryption to also be useful for authentication and integrity. Some encryption schemes indeed have some integrity properties. One important properties is non-malleable encryption. Intuitively, a non-malleable encryption scheme prevents the attacker from modifying the message in a ‘meaningful way’. See deőnition and secure constructions of non-malleable encryption schemes in [126]. However, be warned: achieving, and even deőning, non-malleability is not as easy as it may seem! In fact, many ciphers are malleable; often, an attacker can easily modify a known ciphertext c, to c′ = ̸ c s.t. m′ = Dk (c′ ) ̸= m (and also m′ = ̸ ERROR). Furthermore, often the attacker can ensure useful relations between m′ and m. An obvious example is when using the (unconditionally-secure) one-time-pad (OTP), as well as using Output-Feedback (OFB) mode. Exercise 4.1. Mal is a Man-in-the-Middle, able to intercept and modify messages from Alice to her bank. In this question we explore the ability of Mal to modify ciphertext (encrypted message) message which Alice sends to her bank. Alice’s message is composed of the following fields, in the given order, each consisting of eight bytes: operation, reason, amount, payee, payer, password, date. When operation= 1 and password contains the correct (four-bytes) password for payer, the bank transfers amount from payer to payee, listing it in the bank ledger with the given (four-bytes) reason. Use identifiers 1 for Alice, 2 for Bob, and 3 for Mal. 1. Suppose the parties use One-Time-Pad (OTP) encryption, and Mal intercept the ciphertext c sent from Alice to her bank, which is encryption of a request to transfer $ 3 to Bob. Explain how Mal can modify the ciphertext, causing the bank to transfer (preferably, a larger amount) to Mal rather than Bob. Assume Mal knows all the details (amount, payee, etc.). 2. How would your response change when using, instead of OTP, the following modes of operation of DES, with block size of 64 bits (8 bytes): (a) OFB, (b) CFB, (c) CBC, (d) CTR, (e) ECB. Note: You may not be able to find a successful attack for some of the modes - but don’t give up too easily! Solution for first part: with OTP, the ith ciphertext bit is computed by ci = mi ⊕ ki , and decrypted by mi = ci ⊕ ki . Therefore, i is encrypted by ci We conclude that encryption schemes may not suffice to ensure authentication. 
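To make this concrete, here is a minimal Python sketch of the manipulation behind the first part of Exercise 4.1: with OTP (or any XOR-based stream/CTR encryption), flipping ciphertext bits flips exactly the same plaintext bits, so a MitM who knows the plaintext layout can rewrite fields at will. The field values below are hypothetical, chosen only for illustration.

import os

FIELDS = ["operation", "reason", "amount", "payee", "payer", "password", "date"]
FIELD_LEN = 8  # eight bytes per field, as in Exercise 4.1

def pack(values):
    # values: dict field -> int; each field encoded as 8 big-endian bytes
    return b"".join(values[f].to_bytes(FIELD_LEN, "big") for f in FIELDS)

def otp_encrypt(key, m):
    return bytes(a ^ b for a, b in zip(m, key))  # the same function also decrypts

# Alice's original request: transfer $3 from Alice (1) to Bob (2); values are illustrative.
alice = {"operation": 1, "reason": 7, "amount": 3, "payee": 2,
         "payer": 1, "password": 1234, "date": 20240324}
m = pack(alice)
key = os.urandom(len(m))          # one-time pad, unknown to Mal
c = otp_encrypt(key, m)           # ciphertext observed by Mal

# Mal knows the plaintext layout and values, but not the key.
# Flipping bits of c flips the same bits of the decrypted plaintext:
#   c' = c XOR (m XOR m')  decrypts to  m'.
wanted = dict(alice, amount=1_000_000, payee=3)   # pay Mal (3), a larger amount
m_prime = pack(wanted)
c_prime = bytes(x ^ a ^ b for x, a, b in zip(c, m, m_prime))

assert otp_encrypt(key, c_prime) == m_prime       # the bank decrypts Mal's version
print("forged plaintext accepted:", otp_encrypt(key, c_prime) == m_prime)

The point is that nothing in the decryption detects the change: the attacker applies a chosen XOR difference to the plaintext without ever learning the key.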
This motivates us to introduce, in the next section, another symmetric-key cryptographic scheme, which is designed explicitly to ensure authentication and integrity: the Message Authentication Code (MAC). Later, in Section 4.7, we also discuss how to achieve conődentiality together with authenticity. Applied Introduction to Cryptography and Cybersecurity 4.2. MESSAGE AUTHENTICATION CODE (MAC) SCHEMES 4.2 233 Message Authentication Code (MAC) schemes Message Authentication Code (MAC) schemes are a simple, symmetric key cryptographic functions, designed to verify the authenticity and integrity of information (messages), namely, to detect that a message was not sent by an ‘allowed sender’ (or, was modiőed after it was sent). A MAC function M ACk (m) has two inputs, a (secret) n-bit secret (symmetric) key k, and a message m. As illustrated in Figure 4.1, MAC schemes use the same key k to generate the authenticator (tag), and to validate the authenticator (tag); usually, upon receiving a message m with a purported authenticator σ, the recipient computes σ ′ ← M ACk (m) and veriőes m by conőrming that σ = σ ′ . Notice that this implies that MAC schemes, and authenticators, are usually deterministic. Intuitively, given m, M ACk (m) for a secret, random key k, it is infeasible for a (computationally-bounded) attacker to őnd another message m′ ̸= m together with the value of M ACk (m′ ). Typically, as shown in Fig. 4.1, a secret, symmetric MAC key k is shared between two (or more) parties. Each party can use the key to authenticate a message m, by computing an authentication tag M ACk (m). Given a message m together with a previously-computed tag T , a party veriőes the authenticity of the message m by re-computing M ACk (m) and comparing it to the tag T ; if equal, the message is valid, i.e., the tag must have been previously computed by the same party or another party, using the same secret key k. In a typical use, one party, say Alice, sends a message m to a peer, say Bob, authenticating m by computing and attaching the tag T = M ACk (m). Bob conőrms that T = M ACk (m), thereby validating that Alice sent the message, since he shares k only with Alice. See Fig. 4.1. MAC schemes are related to signature schemes, an asymmetric (public key) authentication mechanism which we introduced in subsection 1.5.1. With signature schemes, each party, e.g., Alice, generates a private signing key A.s and a corresponding public verification key A.v. In Deőnition 1.6, we deőne an existentially unforgeable signature scheme; intuitively, an adversary who is given the public key A.v, and can choose messages m1 , m2 , . . . and receive the corresponding signatures σ1 = S.Sign A.s (m1 ), σ2 = S.Sign A.s (m2 ), . . ., cannot őnd a different message m′ ̸∈ {m1 , m2 , . . .} with a corresponding signature σ ′ such that S.VerifyA.v (m′ , σ ′ ) = True. Note that we usually use the term authenticator to refer to the output M ACk (m) of the MAC function, i.e., if σ = M ACk (m), then σ is the authenticator of m using shared key k. Other terms for the authenticator are a tag, or a signature; we warn that this last term (‘signature’) may cause confusion between MAC schemes and signature schemes, and therefore, we recommend (and try) to avoid it. Repudiation vs. deniability While both signature schemes and MAC schemes are used to authenticate messages, there is a critical difference: a valid MAC can be computed by any entity that knows the shared key. 
Consider the scenario in Figure 4.1; even after Carl successfully validates m̂ using σ̂, he cannot know if m̂ was sent by Alice or Bob. Of course, the sender identity may be indicated as part of the message m; however, there is nothing preventing an entity knowing the shared key k from putting a different identity in the message and computing the MAC.

Figure 4.1: Using a Message Authentication Code (MAC) scheme, and a shared key k, to authenticate messages. The Man-in-the-Middle (MitM) adversary can observe message m and its authenticator σ = MAC_k(m), but cannot forge the MAC, i.e., generate a pair m′, σ′ such that σ′ = MAC_k(m′), for m′ ≠ m. Note, however, that if a key k is shared among more than two entities, e.g., Alice, Bob and Carl, then each entity can authenticate messages using k; e.g., when Carl receives m̂, it cannot know if m̂ was sent by Alice or Bob (or even by Carl himself), except by using some indication within m, e.g., if m includes sender identification. Due to this property, we say that MAC schemes allow repudiation; for non-repudiation, use signatures instead of MAC.

This is in contrast to signature schemes, where the private signing key A.s must be used to produce a valid signature σ for a given message m; namely, knowing the public validation key A.v does not allow forgery of a message as if it was signed using A.s. This property of signature schemes is often referred to as non-repudiation, as it prevents the sender of a (signed) message from repudiating (denying) having sent it. Note that in some situations, e.g., for a whistle-blower, we may have the opposite goal, i.e., of preventing the recipient from proving the identity of the sender to a third person; this goal is usually referred to as deniability.

To validate that a given tag T correctly validates a message m, i.e., T = MAC_k(m), requires the ability to compute MAC_k(·), i.e., knowledge of the shared secret key k. However, this implies the ability to compute (valid) tags for any other message. This allows the entity that computed the tag to later deny having done so, since the tag could also have been computed by other entities. Therefore, MAC schemes do not ensure non-repudiation - and, exactly because of that, allow deniability.

Namely, we should use a signature scheme when we need to ensure non-repudiation, i.e., to ensure that after validating a message with the key associated with Alice, we may assume that Alice indeed sent the message. When we do not need non-repudiation, and especially when deniability is important, then we should use a MAC scheme.

4.3 Message Authentication Code (MAC): Definitions

A MAC scheme is a function F with the following unforgeability property: an attacker, which does not know the key k and is not given F_k(m) for a given message m, is unable to find the value of F_k(m) with better chance than a random guess.
The definition has a lot in common with the definition of signature schemes and their existential-unforgeability requirement, see subsection 1.5.1; in particular, we allow the adversary to obtain the MAC values of any other message. The definition follows. For concreteness, we focus on a MAC whose output is an l-bit binary string.

Definition 4.1 (MAC). An l-bit Message Authentication Code (MAC) over domain D is a function F : {0,1}^* × D → {0,1}^l, such that for all PPT algorithms A, the advantage ε^MAC_{F,A}(n) is negligible in n, i.e., smaller than any positive polynomial for sufficiently large n (as n → ∞), where:

    ε^MAC_{F,A}(n) ≡ Pr_{k←{0,1}^n} [ (m, F_k(m)) ← A^{F_k(·| except m)}(1^n) ] − 1/2^l        (4.1)

where the probability is taken over the random choice of an n-bit key, k ← {0,1}^n, as well as over the coin tosses of A.

Oracle. The expression A^{F_k(·| except m)} refers to the output of the adversary A, where during its run, the adversary can give arbitrary inputs x ≠ m and receive the corresponding values of the function, F_k(x). We say that the adversary A has an oracle to the MAC function F_k(·) (excluding the message m). See Definition 1.3.

The advantage function ε^MAC_{F,A}(n) and key length n. The definition is for an l-bit MAC, i.e., the output is always a binary string of length l. Hence, a random guess at the MAC of any input message m would be correct with probability 2^{-l}. Therefore, we defined the advantage ε^MAC_{F,A}(n) as the probability that the adversary finds a correct MAC value for a message m (not input to the oracle), minus the 'base success probability' of 2^{-l}. The function F is a (secure) MAC if this advantage ε^MAC_{F,A}(n) is negligible.

The key length is denoted n, and is not bounded. The 'advantage' of the adversary over a random guess should be negligible in n, i.e., converge to zero as n grows. In practice, MAC functions are used with a specific key length, which is believed to be 'long enough' to foil attacks (by attackers with reasonable resources and time).

Output length - fixed (l) or as key length (n). In some other definitions of MAC schemes, the output length is also n, i.e., the same as the key. In this case, the 2^{-l} term becomes 2^{-n}, which is negligible in n, and hence can be ignored.

Input domain. Notice that the definition allows an arbitrary input domain D for the MAC function. The two most commonly used domains are D = {0,1}^*, i.e., the set of all binary strings (of unbounded length), and D = {0,1}^{l_in}, i.e., the set of all binary strings of some fixed length l_in. Of course, l_in may also be the same as l. A MAC function whose input is the set of binary strings of a fixed length is called a FIL-MAC, i.e., Fixed Input Length MAC. In contrast, a MAC function whose input is the set of all binary strings is called a VIL-MAC, i.e., Variable Input Length MAC.

To 'warm up', let us show two examples of insecure MAC designs. Our examples follow the definition, i.e., the attacker is allowed to ask for the MAC of some messages, and then has to come up with a different message and a correct MAC for that message. Notice that the definition does not require the 'forged' message to be 'meaningful'; this means that it isn't always trivial to exploit a vulnerable MAC. Following the conservative design principle (Principle 3), the definition does not attempt to predict which forgeries will be meaningful, instead forbidding any forgery.
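The experiment behind Definition 4.1 can also be phrased as code. The following is a simplified, concrete-parameter sketch of the game in Equation (4.1): the adversary is given an oracle for F_k(·), may query it on any messages, and wins if it outputs a correct tag for a message it never queried. The specific function and adversary below are placeholders chosen for illustration, not part of the definition.

import os, secrets

def forgery_game(F, adversary, key_len=16):
    """One run of the (simplified) MAC unforgeability game of Definition 4.1.

    F: function (key, message) -> tag.
    adversary: function taking an oracle and returning a pair (message, tag).
    Returns True iff the adversary forged: tag == F(key, message) for a
    message that was never submitted to the oracle."""
    key = os.urandom(key_len)
    queried = set()

    def oracle(message: bytes) -> bytes:
        queried.add(message)
        return F(key, message)

    message, tag = adversary(oracle)
    return message not in queried and secrets.compare_digest(tag, F(key, message))

# A trivial adversary that just guesses a random tag; it should win with
# probability about 2**(-l), i.e., essentially never for a 128-bit tag.
def guessing_adversary(oracle):
    oracle(b"warm-up query")            # allowed: any message other than the forgery target
    return b"never queried", os.urandom(16)

if __name__ == "__main__":
    import hmac, hashlib
    hmac_sha256 = lambda k, m: hmac.new(k, m, hashlib.sha256).digest()[:16]
    wins = sum(forgery_game(hmac_sha256, guessing_adversary) for _ in range(1000))
    print("forgeries out of 1000 runs:", wins)   # expected: 0

A function is a secure MAC exactly when no efficient adversary wins this game with probability noticeably better than the random-guess baseline.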
Our first example is a very simple FIL-MAC construction which we denote XOR^E; the construction is defined for a given n-bit block cipher E. The XOR^E construction is defined for inputs which are exactly two blocks. Its output is a single block, which is the result of XORing the 'encryption' of each block. Namely, for a given key k:

    XOR^E_k(m) ≡ E_k(m[1:n]) ⊕ E_k(m[n+1:2n])

The following example shows that XOR^E is not a secure MAC. We recommend you try to show it yourself before reading the solution.

Example 4.1. To show that XOR^E is not a secure MAC, observe that:

    XOR^E_k(m) = E_k(m[1:n]) ⊕ E_k(m[n+1:2n])
               = E_k(m[n+1:2n]) ⊕ E_k(m[1:n])
               = XOR^E_k(m[n+1:2n] ++ m[1:n])
               = XOR^E_k(m̄),   where m̄ = m[n+1:2n] ++ m[1:n]

Therefore, for every 2n-bit input message m, we have XOR^E_k(m) = XOR^E_k(m̄), where m̄ is simply the message with the two blocks switched. This suffices to conclude that XOR^E does not satisfy the definition of a secure MAC.

Let us present a specific adversary A that 'breaks' XOR^E, i.e., shows it does not meet the definition of a secure MAC. First, A asks for the MAC of m01 = 0^n ++ 1^n (a block of zeros followed by a block of ones); we could have used almost any 2n-bit message, the choice of m01 is just for simplicity. As per the definition, A receives the MAC of m01, i.e., XOR^E_k(m01) = E_k(0^n) ⊕ E_k(1^n). Then A returns the pair m10 || XOR^E_k(m01), where m10 = 1^n ++ 0^n. This is a successful forgery, since:

    XOR^E_k(m10) = E_k(1^n) ⊕ E_k(0^n) = E_k(0^n) ⊕ E_k(1^n) = XOR^E_k(m01)

Our second example is a 'hairy' function f_k, defined 'from scratch', i.e., not using an underlying block cipher. Such 'hairy' designs may appear to be good candidates for a MAC - but all too often, they are vulnerable. Showing the vulnerability can be tricky, which motivates (1) the use of a strong definition and (2) the use of standard, secure constructions from basic building blocks, following the 'building blocks' principle (Principle 8).

Example 4.2. Consider f_k(m) = (k^3 · m + k^2 · m^2 + k · m^3) mod p, where p is a known number. Let us show, in a simple yet detailed way, that this hairy expression is not a secure MAC. Notice that our solution does not involve any attempt to find k! The idea of the solution is simple. Recall the most basic properties of modular arithmetic (Section A.2). From these it follows that f_k(m) = f_k(m + i·p), for any integer i. In particular, f_k(m) = f_k(m + p). If this isn't clear, try to substitute some small integers for m, k and p; and then read Section A.2 again to see why these equations hold.

This is the crux of the solution, but let us complete the details, by presenting an adversary A s.t. ε^MAC_{F,A}(n) is non-negligible; in fact, we'll show that ε^MAC_{F,A}(n) = 1 − 2^{-l}, for every n. Actually, the fact that we show an advantage of (almost) 1 is quite typical of these exercises, although, of course, it suffices to show any non-negligible advantage. The adversary A is the following simple algorithm:

1. Let m′ be some arbitrary value, e.g. p, or 1, or 0, or whatever you like.
2. Let x ← F_k(m′), i.e., call the oracle on m′.
3. Let m = m′ + p.
4. Output (m, x).

Let us explain why ε^MAC_{F,A}(n) = 1 − 2^{-l}. Given oracle access to f_k(·) (for some random k), the adversary gave some input m′, and received x ≡ f_k(m′), i.e., in our case, x = (k^3 · m′ + k^2 · m′^2 + k · m′^3) mod p.
Then the adver̸ m′ , so the condisary outputs (m, x), where m = m′ + p. Obviously, m = tion of the use of the oracle is satisfied; on the other hand, x = fk (m′ ) = fk (m + p) = the expression is true for any k and we have:  fk (m). Therefore, Fk (·|except m) n AC (m, F (m)) ← A (1 ) = 1, proving that εM Pr $ k F,A (n) = n k←{0,1} 1 − 2−l . Applied Introduction to Cryptography and Cybersecurity CHAPTER 4. AUTHENTICATION: MAC, BLOCKCHAIN AND SIGNATURE 238 SCHEMES 4.4 Applying MAC Schemes A MAC function is a simple cryptographic mechanism, which is quite easy to use; however, it should be applied correctly - with an understanding of its properties and without expecting it to provide other properties. We now discuss a few aspects of the usage of MAC schemes, and give a few examples of common mistakes. Confidentiality A MAC function is a great tool to ensure integrity and authenticity; however, MAC may not ensure confidentiality. Namely, M ACk (m) may expose information about the message m. This is sometimes overlooked by system designers; for example, early versions of the SSH protocol used the so-called ‘Encrypt and Authenticate’ method, where to protect message m, + M ACk (m); one problem with this design is that the system sent Ek (m) + M ACk (m) may expose information about m. Notice that while obviously conődentiality is not a goal of MAC schemes, one may hope that it is derived from the authentication property. To refute such false hopes, it is best to construct a counterexample - a very useful technique to prove that claims about cryptographic schemes are incorrect. The counterexamples are often very simple - and often involve ‘stupid’ or ‘strange’ designs, which are specially designed to meet the requirements of the cryptographic deőnitions - while demonstrating the falseness of the false assumptions. Here is an example showing that MAC schemes may expose the message. Example 4.3 (MAC does not ensure conődentiality.). To show that MAC may not ensure confidentiality, we construct such a Non-confidential MAC function F N cM (where N cM stands for ‘Non-confidential MAC’). Our construction uses an arbitrary secure MAC scheme F (which may or may not ensure confidentiality). Specifically: FkN cM (m) = Fk (m) + + LSb(m) where LSb(m) is the least-significant bit of m. Surely, F N cM does not ensure confidentiality, since it exposes a bit of the message (we could have obviously exposed more bits - even all bits!). On the other hand, we now show that F N cM is a secure MAC. Assume, to the contrary, that there is some adversary A N cM that succeeds (with significant probability) against F N cM . We use A N cM to construct an attacker A that succeeds with the same probability against F . Attacker A works as follows: 1. When A N cM makes a query q to F N cM , then A makes the same query to F , receiving Fk (q); it then returns FkN cM (q) = Fk (q) + + LSb(q), as expected by A N cM . 2. When A N cM outputs its guess m, T , where T is its guess x for FkN cM (m) = Fk (m) + + LSb(m), and m was not used in any of A N cM ’s queries, then A outputs x except for its least-significant bit; namely, if x = FkN cM (m) = Fk (m) + + LSb(m), then A outputs FkN cM (m) = Fk (m). Applied Introduction to Cryptography and Cybersecurity 4.4. APPLYING MAC SCHEMES 239 It follows that F N cM is a secure MAC if and only if F is a secure MAC. We show, later on (subsection 4.5.1), that every PRF is a MAC. The following exercise shows that the reverse is not true: a MAC is not necessarily a PRF. 
This exercise is similar to the example above. Exercise 4.2 (Non-PRF MAC). Show that a MAC function is not necessarily a pseudorandom function (PRF). Solution outline: Let F be an arbitrary secure MAC scheme that outputs n-bit tags. Construct a MAC scheme F ′ , which outputs 2n-bit tags, as follows. Fk′ (m) = Fk (m) + + 0n Clearly, F ′ is not a PRF, because A has a signiőcant chance of distinguishing between an output of F ′ and a random 2n-bit string (since the second half of the output of F ′ is all zeros). Yet, you can show that F ′ is a secure MAC if and only if F is a secure MAC, using a similar method to the one in Example 4.3. Therefore, a MAC function is not necessarily a PRF. Key separation Another problem with the SSH ‘Encrypt and Authenticate’ design, Ek (m) + + M ACk (m), is the fact that the same key is used for both encryption and MAC. This can cause further vulnerability; an example is shown in the following simple exercise. Exercise 4.3 (Separate keys for separate functions). Show that the use of the same key for encryption and MAC in Ek (m) + + M ACk (m) can allow an attacker to succeed in forgery of messages - in addition to the potential loss of confidentiality shown above - even when E and M AC are secure (encryption and MAC, respectively). Solution outline: Let E ′ , M AC ′ be secure encryption and MAC functions, respectively. Deőne EkE ,kM (m) = Ek′ E (m) + + kM and M ACkE ,kM (m) = kE + + M ACk′ M (m). Obviously, , the use of EkE ,kM (m) + + M ACkE ,kM (m) exposes both keys and is therefore insecure. However, using the method of Example 4.3, you can show that E, M AC are also secure encryption and MAC functions, respectively. See also Exercise 4.18. Another motivation to separate between the keys used by a given cryptographic function/scheme, is to reduce the quantity of plaintext available to the cryptanalyst, and especially the amount of known and chosen plaintext. These considerations result in the principle of key separation. Principle 10 (Key Separation). Use separate, independently-pseudorandom keys for: (1) each different cryptographic scheme/function, (2) different types and/or different sources of plaintext, (3) different periods, and (4) different versions of the protocol or scheme. Applied Introduction to Cryptography and Cybersecurity CHAPTER 4. AUTHENTICATION: MAC, BLOCKCHAIN AND SIGNATURE 240 SCHEMES Freshness, replay-prevention and sender authentication A valid MAC received with a message shows that the message was properly authenticated by an entity holding the secret key, which we refer to as message authentication. Message authentication is a useful property, and can facilitate additional important properties. Let us discuss three such important properties, which can be facilitated using message authentication, and are even sometimes (incorrectly) assumed to be implied directly by message authentication: sender authentication, freshness and no-replay. Sender authentication is the ability to identify the identity of the party originating the message. We commented above that MAC does not ensure sender authentication, unless the design ensures that only the speciőc sender will compute the MAC using the speciőc key over the given message. A simple way to ensure this is by including the sender identity as part of the payload being signed. Another way to ensure this is for each sender to use its own authentication key. 
Of course, both methods do not prevent one entity holding a shared key, from impersonating as another entity using the same key; to prevent this, use signatures. Freshness is the ability to conőrm that a message was sent ‘recently’; a related property, replay-prevention, ensures that the message was not already handled previously. We can use MAC to ensure these properties, by including in the authenticated data appropriate őelds, such as a timestamp, a counter or a random number (‘nonce’) selected by the party validating freshness. Each of these options has its corresponding drawback: the need for synchronized clocks, the need to keep a state, or the need for the sender to receive the nonce from the recipient (additional interaction). 4.5 Constructing MAC from a Block Cipher In this section we discuss constructions of a MAC function from a block cipher. More precisely, the constructions are of a MAC from a pseudorandom function (PRF), mainly the CBC-MAC construction. Since a PRF is often not included in cryptographic libraries, it may be tempting to use instead a block cipher, which is part of most cryptographic libraries; recall that a block cipher is modeled by a Pseudo-Random Permutation, PRP, rather than by a PRF. The PRP/PRF switching lemma (Lemma 2.2) shows that we could simply use a block-cipher instead of the PRF, since a block-cipher (PRP) is indistinguishable from a PRF; however, recall this is not advisable, since the use of block cipher instead of PRF involves loss in security. Instead, use one of the efficient, simple constructions of PRF from a block cipher, which avoid the loss of security, e.g., [39, 183]. The section contains three subsections, In the őrst, we observe that given a PRF, we can actually use it directly as a MAC, i.e., every PRF is also a MAC. There is a caveat: the input domain of the MAC is the same as that of the PRF, which, in turn, is the same as of the underlying block cipher (if the PRF is implemented from a block cipher as explained above). Namely, if we use n-bit blocks, i.e. the domain of the block-cipher (and PRF) is {0, 1}n , then the Applied Introduction to Cryptography and Cybersecurity 4.5. CONSTRUCTING MAC FROM A BLOCK CIPHER 241 MAC function also applies (only) to n-bit messages. This is not satisfactory, since typical messages are longer. The second subsection presents the CBC-MAC construction, which constructs a l · n-bit PRF from an n-bit PRF, for a given constant number of blocks l. This allows efficient and secure use of n-bit-input PRF (or block cipher), to encrypt longer, l · n-bits messages. Finally, in the third subsection we discuss extensions that allow a MAC for messages of arbitrary length. In the following section, we discuss other construction of MAC schemes, which are not based on the use of a secure block cipher, most notably, the HMAC construction of a MAC scheme from a cryptographic hash function. 4.5.1 Every PRF is a MAC In this subsection, we take the őrst step toward the CBC-MAC construction. This step is the observation that every PRF whose range is {0, 1}l , is also an l-bit MAC, with the same input and output domains. This is formalized in the following lemma, which we call the PRF-is-MAC lemma. Lemma 4.1 (A PRF is a MAC). Let F be a PRF from input domain D to the range {0, 1}l . Then F is also an l-bit MAC, with input domain D and output domain {0, 1}l . Proof: Assume that F is not a MAC (for same domain D and range {0, 1}l ). AC Namely, assume that there exists some adversary AM AC s.t. 
εM AM AC ,F (n) is non-negligible in n (as deőned in Equation 4.1). We use AM AC to construct RF another adversary, AP RF , s.t. εP AP RF ,F (n) is non-negligible in n (as deőned in Equation 2.29); this shows that F is (also) not a PRF, which proves the claim. Let us now deőne AP RF . First, recall that in Equation 2.29, adversary $ AP RF is given an oracle either to a random function f ← {D → {0, 1}l , or to the pseudorandom function Fk : D → {0, 1}l for some random n-bit key $ k ← {0, 1}n . Adversary AP RF runs AM AC , letting it use the same oracle. Namely, whenever AM AC asks its oracle with input x ∈ D, adversary AP RF calls its oracle with the same input x; and when it receives a result ξ, it returns that result to AM AC . When AM AC terminates, it should return some pair, which we denote by (m, σ). Upon receiving (m, σ), adversary AP RF provides m as input to its oracle; denote the output by σ ′ . If σ ̸= σ ′ , then AP RF returns ‘Rand’; otherwise, i.e., if σ = σ ′ , then AP RF returns ‘Pseudo’. Essentially, AP RF outputs ‘Rand’ (i.e., guess it was given a random function), when AM AC fails was able to predict correctly the output of the oracle for the input m. Let us consider what happens if AP RF is given an oracle to a random $ function f ← {D → R}. In this case, when running AM AC , the values returned from the oracle were for that random function f ; clearly, AM AC cannot be expected to perform as well as when given an oracle to the function Fk (·). In fact, AM AC has to return a pair (m, σ), without giving input m to the oracle. Applied Introduction to Cryptography and Cybersecurity CHAPTER 4. AUTHENTICATION: MAC, BLOCKCHAIN AND SIGNATURE 242 SCHEMES But if the oracle is to a random function f , then f (m), is chosen independently of f (x) for any other input x ̸= m; learning other outputs cannot help you guess the output when the input is m! Hence, the probability of a match is (only) 2−l - the probability between two random l bit strings. Namely,  fof na random match  Pr $ A (1 ) = ‘Rand’ = 2−l . f ←{D→R} Now consider what happens if AP RF is given an oracle to a pseudorandom function Fk (·). The claim is that F is also a MAC, but we assumed, to the contrary, that it is not; so AM AC is able to return a pair (m, Fk (m)) - with probability significantly larger 2−l . In these cases, AP RF will return  F than n k (1 ) = ‘Rand’ = 2−l + p(n), where p(n) is a A ‘PR’. Namely, Pr $ n k←{0,1} significant (not negligible) function. RF It follows that εP A,F (n) is not negligible, and hence, F is not a PRF. 4.5.2 CBC-MAC: ln-bit MAC (and PRF) from n-bit PRF Lemma 4.1 shows that every n-bit PRF is also an n-bit MAC. But how can we deal with longer messages? In this subsection, we present the CBC-MAC construction, which produces an l · n-bit PRF, using a given n-bit PRF. Since every PRF is a MAC, this gives also an l · n-bit MAC. The CBC-MAC construction is a standard from 1989 [213], i.e., prior to the PRF-is-MAC lemma (from [37]), which is why it refers to construction of MAC (from block cipher) and not to construction of l · n-bit PRF from n-bit PRF. Before we present the CBC-MAC construction, let us discuss some insecure constructions. First, consider performing MAC to each block independently, similar to the ECB-mode (Section 2.8). One drawback is that this would result in a long MAC. An even worse drawback is that this is insecure; an attacker may obtain a MAC for a different message, which contains re-ordered and/or duplicated blocks. 
Next, consider adding a counter to the input, to which we refer as CTRMAC. This prevents the trivial attack - but not simple variants, as shown in the following exercise. For simplicity, the exercise is given for l = 2. Of course, this design also has the disadvantage of a longer output tag. Exercise 4.4 (CTR-MAC is insecure). Let E be a secure (n + 1)−bit block + m1 ) = cipher, and define the following 2n−bit domain function: Fk (m0 + Ek (0 + + m0 ) + + Ek (1 + + m1 ) (CTR-MAC). Present a counterexample showing that F is not a secure 2n−bit MAC. Finally, we present the CBC-MAC construction, also known as the CBCMAC mode. This is a widely used, standard construction of an (l · n)−bit MAC from an n−bit block cipher. The CBC-MAC mode, illustrated in Fig. 4.2, is a variant of the CBC mode used for encryption, see Section 2.8. Given a block-cipher E, we deőne CBC − M AC E as in Eq. 4.2, for an l-block input message (i.e., of length l · n bits), m = m1 + + m2 + + ... + + ml : CBC − M ACkE (m) = {c0 ← 0n ; (i = 1 . . . l)ci = Ek (mi ⊕ ci−1 ); output cl } (4.2) Applied Introduction to Cryptography and Cybersecurity 4.5. CONSTRUCTING MAC FROM A BLOCK CIPHER 243 See Fig. 4.2. When E is obvious we may simply write CBC − M ACk (·). m1 m2 m3 Ek Ek Ek 0n CBC − M ACkE (m) Figure 4.2: CBC-MAC: construction of l · n−bit PRF (and MAC), from n−bit PRF. CBC-MAC is the most widely used MAC construction from block ciphers. Other constructions of secure MAC from PRFs and block ciphers, including more efficient constructions, e.g., avoiding the need to know the input length in advance (CMAC [136]) or allowing parallel computation and veriőcation (e.g., XOR-MAC [36]). However, we focus on CBC-MAC, which is not only the most widely used, but also one of the most simple constructions of MAC from a block cipher. We next present Lemma 4.2 which shows that CBC-MAC constructs a secure PRF (and hence also MAC), provided that the underlying function E is a PRF. Lemma 4.2. If E is an n-bit PRF, then CBC − M ACkE (·) is a secure n · l-bit PRF and MAC, for any constant integer l > 0. Proof: see in [37]. CBC-MAC does not support input of arbitrary length . The CBCMAC construction is deőned for input which is an integral number of blocks, i.e., n · l bits. How can we extend it so it does support input of arbitrary length, i.e., a variable input length (VIL) PRF (and MAC) - deőned for input domain domain {0, 1}∗ ? One obvious problem is that an arbitrary binary string, ma y not even consist of an integral number of blocks, while CBC-MAC is deőned only for inputs which are of length n · l, i.e., integral number of blocks. However, let us ignore that problem for now, and focus on the complete-blocks input length (CBIL) domain, i.e., inputs whose length is an integer number of blocks. Let us őrst precisely deőne the CBIL domain.  CBIL ≡ m ∈ {0, 1}n·l |l ∈ mathbbZ + (4.3) In the next exercise we show that CBC-MAC is not a PRF, or a MAC, for the CBIL domain, and, hence, surely not a VIL MAC/PRF. Applied Introduction to Cryptography and Cybersecurity CHAPTER 4. AUTHENTICATION: MAC, BLOCKCHAIN AND SIGNATURE 244 SCHEMES Exercise 4.5 (CBC-MAC is not a VIL MAC). Show that CBC-MAC is not a MAC or PRF for the domain CBIL (Equation 4.3), and hence is definitely not a VIL MAC/PRF (for the domain {0, 1}∗ ). Solution: Let fk (·) = CBC − M ACkE (·) be the CBC-MAC using an underlying n-bit block cipher Ek . 
Namely, for a single-block message a ∈ {0, 1}n , we have fk (a) = Ek (a); and for a two block message a + + b, where a, b ∈ {0, 1}n , we have fk (a + + b) = Ek (b ⊕ Ek (a)). We present a simple adversary Afk , with oracle access to fk , i.e., A is able to make arbitrary query x ∈ {0, 1}∗ to fk and receive the result fk (x). Let X denote all the queries made by A during its run. We show that Afk generates a pair x, fk (x), where x ̸∈ X, which shows that fk (i.e., CBC-MAC) is not a MAC for domain CBIL (and hence also not a {0, 1}∗ -MAC, i.e., VIL MAC). Speciőcally, the adversary A őrst makes an arbitrary single-block query, for arbitrary a ∈ {0, 1}n . Let c denote the result, i.e., c = fk (a) = Ek (a). Then, A computes b = a ⊕ c and outputs the pair of message a + + b and tag c. Note that c = fk (a + + b), since fk (a + + b) = Ek (b ⊕ Ek (a)) = Ek ((a ⊕ c) ⊕ c) = Ek (a) = c. Namely, c is indeed the correct tag for a + + b. Obviously, A did not make a query to receive fk (a + + b). Hence, A succeeds in MAC game against CBC-MAC. However, as we next explain, by merely prepending the length to the input, we can create a VIL MAC from the CBC-MAC. 4.5.3 Constructing Secure VIL MAC from PRF Lemma 4.2 shows that CBC-MAC is a secure ln-bit FIL PRF (and MAC); however, Exercise 4.5 shows that it is not a VIL MAC (and hence surely not VIL PRF). The crux of the example was that we used the CBC-MAC of a one-block string, and presented it as the MAC of a 2-block string. This motivates a minor change to the construction, where we prepend the block-encoded length L(m) of the input m to the input before applying CBC-MAC. We deőne L(m) as an n-bit binary string (i.e., a block), whose binary value is the length |m| of the input m. Lemma 4.3 shows that this construction is indeed a secure VIL MAC. We refer to this variant as length-prepending CBC-MAC. Lemma 4.3 (Length-prepending CBC-MAC is a VIL PRF.). Let fk (m) = CBC − M ACkE (L(m) + + m), where L(m) is the block-encoded length of m (as defined above). Then fk (·) is a PRF over the set of all binary strings(and MAC). Proof: See [37]. Note that the block-encoded length L(m), can only support message up to the maximal length encoded by n bits - i.e., |m| < 2n . In practice, this isn’t an issue - and it is not difficult to extend the construction to avoid this limitation, if you really want to. It is hard to imagine a practical scenario in which you will have to do this, however. Applied Introduction to Cryptography and Cybersecurity 4.6. OTHER MAC CONSTRUCTIONS 4.6 245 Other MAC Constructions In the previous section, we presented constructions of MAC from PRFs and block ciphers. In the following subsections, we discuss other approaches for constructing a MAC function, including: (1) design a MAC ‘from scratch’, i.e., without provable reduction to the security of some other cryptographic scheme (subsection 4.6.1), (2) combine multiple candidate MAC functions (robust combiner, subsection 4.6.2), and, őnally, (3) construct a MAC/PRF from a cryptographic hash function (subsection 4.6.3). This last approach, constructing a MAC/PRF from a hash function, is the most widely-use method to implement a MAC function, usually using the HMAC construction [31, 32]. 4.6.1 MAC design ‘from scratch’ This approach attempts to design a candidate MAC function without requiring a reduction to the security of some cryptographic scheme; typically, the design simply does not involve any other, known cryptographic function. 
Instead, we may use some problems which are considered computationally-hard. The security of such design is based on the failure of signiőcant cryptanalysis efforts against the MAC function. This used to be the main method of design of new cryptographic mechanisms. However, following the cryptographic building block principle (principle 8), MAC functions are rarely designed ‘from scratch’. Let us give an example of one example: a (failed) attempt to construct MAC from EDC, and the resulting vulnerabilities. Two (failed) attempts to construct MAC from EDC Let us consider a speciőc design, which, intuitively, may look promising: constructing a MAC from a (good) Error Detection Code (EDC). Error Detection Codes are designed to ensure integrity, i.e., to detect corruptions in data; however, they are designed to detect random errors, and may fail to detect intentional modifications. We have seen already, in Section 2.10, that the WEP protocol failed to ensure integrity against attack, in spite of its use of the CRC-32 Cyclic Redundancy Check (CRC) error-detecting code before encryption. Let us consider two other simple constructions of a MAC from an EDC, which do not involve encryption: M ACk (m) = EDC(k + + m) and M ACk′ (m) = EDC(m + + k). We next show that these construction are insecure, when using CRC as the error-detection code (EDC). Exercise 4.6 (Insecure CRC-based MACs). Show that both of the following are insecure: (a) CRC-MACk (m) = CRC(k + + m) and (b) CRC-MAC′k (m) = CRC(m + + k). Solution: We only solve (a) and leave (b) as an (easy) exercise to the reader. In fact, we show how the attacker that receives only the MAC CRC-MACk (m) = CRC(k + + m) of any known message m, can compute the MAC for any other Applied Introduction to Cryptography and Cybersecurity CHAPTER 4. AUTHENTICATION: MAC, BLOCKCHAIN AND SIGNATURE 246 SCHEMES message m′ = ̸ m of the same length, i.e., compute CRC-MACk (m′ ) = CRC(k + + ′ m ). Recall that the CRC function is linear, namely, for any two strings of the same length, |x| = |x′ , holds: CRC(x ⊕ x′ ) = CRC(x) ⊕ CRC(x′ ) (Equation 2.61). Hence: CRC-MACk (m′ ) = CRC(k + + m′ )   = CRC (0|k| + + m′ ⊕ m) ⊕ (k + + m) = CRC(0|k| + + m′ ⊕ m) ⊕ CRC(k + + m) (4.4) = CRC-MAC0|k| (m′ ⊕ m) ⊕ CRC-MACk (m) The adversary computes |k| = |CRC-MACk (m′ )| − |CRC(m′ )|, and can therefore compute CRC-MAC0|k| (m′ ⊕ m). By plugging this into Equation 4.4, the adversary őnd CRC-MACk (m′ ). We conclude that CRC-MACis indeed insecure. The same seems to hold for other ECD-based MACs. 4.6.2 Robust combiners for MAC A robust combiner for MAC combines two (or more) candidate MAC functions to create a new composite function, which is proven secure provided that one (or a sufficient number) of the underlying functions is secure. There is actually a very simple robust combiner for MAC schemes: concatenation (denoted + +). In the following exercise we show that concatenation is a robust combiner for MAC functions. Exercise 4.7. Show that concatenation is a robust combiner for MAC functions. Solution (from [191]): Let F ′ , F ′′ be two candidate MAC schemes, and + Fk′′′′ (m). We should show that it suffices that deőne Fk′ ,k′′ (m) = Fk′ ′ (m) + ′ ′′ either F or F is a secure MAC, for F to be a secure MAC scheme as well. Without loss of generality, assume F ′ is secure; and assume, to the contrary, that F is not a secure MAC. Namely, assume an attacker A Fk′ ,k′′ (µ)|µ̸=m that can output a pair m, Fk′ ,k′′ (m), given access to an oracle that computes Fk′ ,k′′ on any value except m. 
We use A to construct an adversary A ′ which succeeds against F ′ . Adversary A ′ operates by running A, as well as selecting a key k ′′ and running Fk′′′′ (·); this is needed to allow A ′ to provide the oracle service to A Fk′ ,k′′ (µ)|µ̸=m , computing Fk′ ,k′′ (µ) for any given input µ. Whenever A makes a query q, then A ′ makes the same query to the Fk′ ′ (·) oracle, to receive Fk′ ′ (q). Then, A ′ computes by itself Fk′′′′ (q), and combines it with Fk′ ′ (q) to produce the required response (Fk′ ′ (q), Fk′′′′ (q)). When A őnally returns the pair (m, Fk′ ,k′′ (m)) = (m, Fk′ ′ (m) + + Fk′′′′ (m)), ′ ′ then A simply returns the pair (m, Fk′ (m)), i.e., omitting the second part of the MAC that A returned. Applied Introduction to Cryptography and Cybersecurity 4.6. OTHER MAC CONSTRUCTIONS 247 However, concatenation is a rather inefficient construction for robust combiner of MAC schemes, since it results in duplication of the length of the output. The following exercise shows that exclusive-or is also a robust combiner for MAC - and since the output length is the same as of the component MAC schemes, it is efficient. Exercise 4.8. Show that exclusive-or is a robust combiner for MAC functions. Namely, that M AC(k′ ,k′′ ) (x) = M ACk′ ′ (x) ⊕ M ACk′′′′ (x) is a secure MAC, if one or both of {M AC ′ , M AC ′′ } is a secure MAC. Guidance: Similar to the solution of Ex. 4.7. 4.6.3 HMAC and other constructions of a MAC from a Hash function Finally, we consider constructions of MAC functions from cryptographic hash functions. Cryptographic hash functions, like block ciphers, are deőned in multiple standards, therefore their use to construct MAC (and other schemes) follows the cryptographic building blocks principle (subsection 2.7.4). Furthermore, since both MAC and hash functions are deőned for arbitrary (variable) input length (VIL), the constructions of MAC from hash functions are simpler than the constructions from block ciphers (subsection 4.5.2). Furthermore, some cryptographic hash functions are extremely efficient, and this efficiency can be mostly inherited by HMAC. For example, the Blake2b [17] cryptographic hash function achieves speeds of over 109 bytes/second, with a relatively-weak CPU (Intel I5-6600 with 3310MHz clock). In fact, the use of hash functions to construct a MAC is so common, that many people use the term ‘keyed hash’ to refer to the resulting MAC function. The meaning is that the hash function uses a secret key k. This differs from our use of the term ‘keyed hash function’, as in subsection 3.2.3, which is also the usage in most works in cryptography, where the key k is not secret (i.e., the key k is known to the adversary). An additional problem with the term ‘keyed hash’ for the use of hash with a secret key to construct MAC, is that it may be interpreted to imply that it is safe to use a keyed CRHF as a MAC, simply by keeping its key secret instead of publishing it. It may be possible, for the same keyed function h to be a keyed CRHF (given a public key) and a MAC (given a secret key); however, it is also possible for h to be a keyed CRHF yet not to be a MAC (given a secret key), as we show in the next exercise. See also Exercise 4.22 and Exercise 4.23. Exercise 4.9. Let hk (m) be a keyed CRHF. Show a keyed hash function h′k (m) which (1) is a CRHF but (2) is not a secure MAC. Solution: Let h′k (m) = k||hk (m). Clearly h′ exposes its key, so it cannot be a secure MAC. However, h′ is still a CRHF, since any collision of h′ is also a collision for h. 
Applied Introduction to Cryptography and Cybersecurity CHAPTER 4. AUTHENTICATION: MAC, BLOCKCHAIN AND SIGNATURE 248 SCHEMES We see that a function may be a keyed CRHF but not a secure MAC; can we, instead, construct a MAC from a cryptographic hash function? Ideally, we would want to construct the MAC from a keyless cryptographic hash function, since existing standard cryptographic hash functions are keyless (subsection 3.1.4). In the reminder of this subsection, we discuss four such constructions, whose goal is to create a MAC from keyless hash functions. We begin with three designs studied by Tsudik [371], and then describe HMAC [30], a more recent construction which is now widely deployed and deőned as an IETF standard [32]. Tsudik’s constructions of MAC from hash: prepend key, append key and message-in-the-middle. Several heuristic proposals for the construction of a MAC from a cryptographic hash function were made, mostly constructing the MAC from a keyless hash function. Three of the most well known heuristics were presented and compared by Tsudik [371]. Given keyless hash function h, key k and message m, these are: Prepend Key: KMkh (m) = h(k + + m) Append Key: M Kkh (m) = h(m + + k) Message-in-the-Middle: KM Kkh (m) = h(k + +m+ + k) An obvious question is whether these schemes are secure - assuming that the cryptographic hash function h satisőes some assumption. Let us őrst observe that all three constructions are secure under the ROM (Section 3.6). Exercise 4.10. Prove that (a) KM h , (b) M K h and (c) KM K h are secure under the Random Oracle Methodology (ROM). Proof sketch: assume an adversary outputs m, σ for a message m which it did not give as input to the ‘oracle’ for h. Then the output of the corresponding h function, was never computed yet, i.e., it is still random. For example, for KMkh (m) = h(k + + m), the value of h(k + + m), for this m, was not computed yet. In fact, we need to pick it only to check the adversary’s guess σ; at that point, we choose it randomly from the set {0, 1}n . The probability that our choice will be the same as σ is only 2−n , i.e., negligible. Hence, KMkh (m) = h(k + + m) is secure under the ROM. This shows (a); essentially the same argument holds for (b) M K h and (c) KM K h . To avoid the possible impression that every construction is secure under the ROM, let us give an example of construction which is insecure even under the ROM. Speciőcally, consider KM KMkh (m) = h(k + + m) + + h(k + + m ⊕ 1|m| ). h Namely, KM KM was made ‘more complex’ - maybe with the futile hope that this will make it more secure - by concatenating two hash values, one of k + +m and the other of k + + m ⊕ 1|m| . Note that m ⊕ 1|m| is just a weird way for writing the negation of m. Example 4.4. Show that KM KM h is insecure, (even) under the ROM. Applied Introduction to Cryptography and Cybersecurity 4.6. OTHER MAC CONSTRUCTIONS 249 Solution: Adversary asks to receive KM KMkh for the message m = 0l (for any length l); let the value returned by denoted σL + + σR , where |σL | = |σR | = n. Then the adversary returns the ‘guess’ (1l , σR + + σL ). Verify that this is the correct pair. We recommend to readers to follow carefully the arguments in Exercise 4.10 and őnd out why they do not hold for KM KMkh (m). It is not trivial - and may help understanding these important concepts. 
We next observe that these three constructions, which are secure under the ROM, can be insecure using a hash function which satisőes standard requirements such as collision-resistance and preimage-resistance (one-way function), as in the following exercise. This illustrates the fact that security under the ROM does not imply security under standard assumptions. Exercise 4.11. Present a keyless hash function h such that: 1. h is a CRHF, yet (a) KM h , (b) M K h , (c) KM K h is not a secure MAC. 2. h is a SPR, yet (a) KM h , (b) M K h , (c) KM K h is not a secure MAC. 3. h is a OWF-hash, yet (a) KM h , (b) M K h , (c) KM K h is not a secure MAC. 4. h is a BRE, yet (a) KM h , (b) M K h , (c) KM K h is not a secure MAC. 5. h is CRHF, OWF and BRE, yet (a) KM h , (b) M K h , (c) KM K h is not a secure MAC. The examples may assume a hash function h′ which has the corresponding property (CRHF, SPR, OWF, BRE or their combination). Partial solution: Let h′ be a hash function h′ which is CRHF, SPR and OWF. We deőne h(x) to return the n most signiőcant bits of x if |x| = 2n and the n least signiőcant bits are all zero, and to return h′ (x) otherwise. We leave it to the reader to prove that h′ is a secure CRHF, SPR and OWF, yet KM h is not a secure MAC. Changing this construction to also cover BRE is not very difficult, as is modifying the constructions to show corresponding results for M K h and KM K h . While the examples in the solutions to Exercise 4.11 would be ‘artiőcial’ and irrelevant to any ‘real’ candidate hash function, some weaknesses of these constructions can apply to realistic hash functions. In particular, many hash functions have the following extend property: given h(x), one can compute h(x + + y), even without knowing anything about x. This property hold for any hash function using the (widely-used) Merkle-Damgård construction, which is used by many hash functions, including the MD5 and SHA-1 standards; see discussion in Section 3.9. In the following exercise we observe, after Tsudik, that if h has the extend property then KM h is not a secure MAC. Exercise 4.12. Show that the KM h is insecure, for any hash function h that has the extend property. Hint: Tsudik has shown this in [371]. Applied Introduction to Cryptography and Cybersecurity CHAPTER 4. AUTHENTICATION: MAC, BLOCKCHAIN AND SIGNATURE 250 SCHEMES HMAC. HMAC [30, 32] is the most widely-used construction of a MAC from a keyless hash function. HMAC is deőned as: HM ACk (m) = h(k ⊕ OP AD + + h(k ⊕ IP AD + + m)) (4.5) Where OPAD, IPAD are őxed constant strings. It is not difficult to see that HMAC is secure under the ROM (Exercise 3.24). However, while the ROM is useful, and security under this model indicates that some attacks are infeasible, it would surely be much better, if we could show that HMAC is secure under some ‘reasonable’ cryptographic assumption. In fact, this was done, in [30]. It would have been great if the assumption was one of the standard hash-function assumptions, e.g., collision resistance; however, the assumption in [30], while arguably reasonable, is somewhat more complex than these standard hash function assumptions, and we will not discuss these details here. Note that HMAC is insecure when using some collision-resistant hash function, i.e., collision-resistance is not a sufficient requirement from the hash function. You will show this in Exercise 3.25, where you should construction a CRHF h(·), for which HMAC is not a secure MAC. To ensure that h is a CRHF, the construction uses a given CRHF, h′ (·). 
Due to the importance and wide use of HMAC, conődence in its security grew over the years, with several additional results establishing its security under ‘even more reasonable’ assumptions (compared to [30]). The conődence in the security of HMAC also grew due to the fact that such important standard has not be ‘broken’ by cryptanalysis during this time. In fact, over time, HMAC is also often used for additional goals, such as a pseudorandom function (PRF) and as a Key Derivation Function (KDF), which is essentially a keyed variant of a randomness extraction hash function; see discussion in Section 3.5. 4.7 Combining Authentication, Encryption and Other Functions Message authentication combines authentication (sender identiőcation) and integrity (detection of modiőcation). However, when transmitting messages, we often have additional goals. These include security goals such as conődentiality, as well as fault-tolerance goals such as error-detection/correction, and even efficiency goals such as compression. In the őrst four subsections, we focus on the combination of the two basic security goals: encryption and authentication. Finally, in subsection 4.7.5, we discuss the complete secure session transmission protocol, which addresses additional goals involving security, reliability and efficiency, for a session (connection) between two parties. We return to these issues in Section 7.2, where we discuss the SSL/TLS protocols, including their record protocol. There are several vulnerabilities of the SSL/TLS protocols which are due to its use of the (less-preferred, often vulnerable) attacks on the SSL/TLS exploited vulnerable, insecure combinations of authentication, conődentiality and other functions, Applied Introduction to Cryptography and Cybersecurity 4.7. COMBINING AUTHENTICATION, ENCRYPTION AND OTHER FUNCTIONS 251 and the record protocol of TLS 1.3 was modiőed to a better design, to foils such attacks. There are two main options for ensuring the conődentiality and authentication/integrity requirements together: (1) by correctly combining an encryption scheme with a MAC scheme, or (2) by using a combined authenticated encryption scheme. In the őrst subsection below, we discuss authenticated encryption schemes and authenticated encryption with associated data (AEAD) schemes, which combine encryption (for conődentiality) and authentication. In the following subsections, we discuss speciőc generic constructions, combining MAC and encryption schemes. 4.7.1 Authenticated Encryption (AE) and AEAD schemes Authenticated Encryption (AE) schemes. The combination of conődentiality and authenticity is often required, but we have seen that incorrect combinations may lead to vulnerabilities. This motivates the design of schemes which combine the authentication and the conődentiality functions. We use the term authenticated encryption (AE) for such schemes [339], which consist of two main functions: encrypt-and-authenticate EnA and decrypt-and-verify DnV , plus, optionally, an explicit key-generation function. The decrypt-and-verify returns ERROR if the ciphertext is found not-authentic; similar veriőcation property can be implemented by a MAC scheme, by comparing the authenticator received with a message to the result of computing the MAC on the message. AE schemes may also have a key-generation function; in particular, this is necessary when the keys are not uniformly random. 
In addition to the support for both encryption and authentication, there is an additional innovative aspect to the deőnition of AE schemes. Namely, the AE encrypt-and-authenticate operation has three inputs. This is in contrast to the standard deőnition of encryption schemes, which deőnes only two inputs: the key and the plaintext. The third input of AE schemes is called a nonce. To ensure security, a different nonce value should be used whenever performing the encryption operation. Essentially, an Authenticated-Encryption resembles, therefore, a mode-of-operation of an encryption scheme, with the nonce taking the role of the IV or counter (state) input. The same nonce should be given to the decrypt-and-verify operation, and the scheme should ensure correctness, namely that for every plaintext message m, key k and nonce n holds: m = DnVkn (EnAnk (m)) (4.6) The use of a combined AE scheme allows simpler, less error-prone implementations compared to the use of two separate schemes, one for encryption and one for authentication. In particular, we need only a call to only one function (encrypt-and-authenticate or decrypt-and-verify) instead of requiring the correct use of both encryption/decryption and MAC functions. Many constructions of authenticated encryption are generic, i.e., built by combining arbitrary implementation of cryptographic schemes, following the Applied Introduction to Cryptography and Cybersecurity CHAPTER 4. AUTHENTICATION: MAC, BLOCKCHAIN AND SIGNATURE 252 SCHEMES ‘cryptographic building blocks’ principle. The combinations of encryption scheme and MAC scheme that we study later in this subsection are good examples for such generic constructions. Other constructions are ‘ad-hoc’, i.e., they are designed using speciőc functions. Such ad-hoc constructions may have better performance than generic constructions, however, that may come at the cost of requiring more complex or less well-tested security assumptions, contrary to the Cryptographic Building Blocks principle. Authenticated Encryption with Associated Data (AEAD). In many applications, e.g., TLS (Chapter 7), some of the data to be authenticated should not be encrypted. Typically, this would be data that is used also by agents which do not have the secret (decryption) key; for example, the identity of the destination. Such data is often referred to as associated data, and authenticated encryption schemes supporting it are referred to as AEAD (Authenticated Encryption with Associated Data) schemes [337]. AEAD schemes have the same three functions (key-generation, encrypt-and-authenticate, decrypt-and-verify), and their input also includes a nonce. However, they also have an additional (fourth) input, the associated-data őeld. Scheme MAC Type Goals Symmetric Authenticity Authenticity and AE Symmetric conődentiality Authenticity, and AEAD Symmetric conődentiality for part of document Authenticity and Signature Asymmetric non-repudiation Authenticity, nonSignCryptionAsymmetric repudiation and conődentiality Metaphor Document with secret mark Document with secret mark, in sealed envelop Document with secret mark, in sealed envelop with window Signed document Signed document in a sealed envelop Table 4.1: Authentication schemes: MAC, Authenticated Encryption (AE), Authenticated Encryption with Associate Data (AEAD), Signature and SignCryption schemes. Together with AE and AEAD schemes, we now have four different cryptographic authentication schemes. 
Authenticated encryption: attack model and success/fail criteria

We now briefly discuss the attack model (attacker capabilities) and the goals (success/fail criteria) for the combination of authentication and confidentiality (encryption), as is essential for any security evaluation (principle 1). Essentially, this combines the corresponding attack model and goals of encryption schemes (indistinguishability test) and of message authentication code (MAC) schemes (forgery test). As in our definitions for encryption and MAC, we consider an efficient (PPT) adversary. We also allow the attacker to have similar capabilities as in the definitions of secure encryption / MAC. In particular, we allow chosen plaintext queries, where the attacker provides input messages (plaintext) and receives their authenticated-encryption, as in the chosen-plaintext attack (CPA) we defined for encryption.

Exercise 4.13. Present precise definitions for IND-CPA and security against forgery for AE and AEAD schemes.

4.7.2 Authentication via EDC-then-Encryption?

Several practical secure communication systems first apply an Error Detecting Code (EDC) to the message, and then encrypt it, i.e.: c = E_k(m ++ EDC(m)). We believe that the motivation for this design is the hope to ensure authentication as well as confidentiality, i.e., the designers were (intuitively) trying to develop an authenticated-encryption scheme. Unfortunately, such designs are often insecure; in fact, often, the application of EDC/ECC before encryption allows attacks on the confidentiality of the design. We saw one example, for WEP, in Section 2.10. Another example of such a vulnerability is in the design of GSM, which employs not just an Error Detecting Code but even an Error Correcting Code, with very high redundancy. In both WEP and GSM, the encryption was performed by XORing the plaintext (after EDC/ECC) with the keystream (output of a PRG).

However, EDC-then-Encrypt schemes are often vulnerable also when using other encryption schemes. For example, the following exercise shows such a vulnerability, albeit against the authentication property, when using CBC-mode encryption.

Exercise 4.14 (EDC-then-CBC does not ensure authentication). Let E be a secure block cipher and let CBC_k^E(m; IV) be the CBC-mode encryption of plaintext message m, using underlying block cipher E, key k and initialization vector IV, as in Eq. (2.56). Furthermore, let EDCtCBC_k^E(m; IV) = CBC_k^E(m ++ h(m); IV), where h is a function outputting one block (an error detecting code). Show that EDCtCBC^E is not a secure authenticated encryption; specifically, that authentication fails.

Hint: the attacker asks for the EDCtCBC^E encryption of the message m' = m ++ h(m); the output gives also the encryption of m.
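To illustrate why EDC-then-encrypt fails when encryption is XOR with a keystream (the WEP/GSM situation above), here is a minimal sketch, assuming CRC-32 as the EDC and a random byte string standing in for the PRG output. Because both XOR-encryption and CRC-32 are linear over GF(2), the attacker can flip chosen plaintext bits and patch the encrypted CRC accordingly, without knowing the key.

    import os
    import zlib

    def crc32(b: bytes) -> bytes:
        return zlib.crc32(b).to_bytes(4, "big")

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    m = b"PAY $0010 TO BOB"
    keystream = os.urandom(len(m) + 4)              # stand-in for the PRG keystream
    ct = xor(m + crc32(m), keystream)               # EDC-then-encrypt

    # Attacker (no key): flip bits to turn $0010 into $9910, then patch the CRC.
    delta = xor(m, b"PAY $9910 TO BOB")
    # CRC-32 is affine over GF(2): crc(m ^ delta) = crc(m) ^ crc(delta) ^ crc(0...0)
    delta_crc = xor(crc32(delta), crc32(bytes(len(m))))
    forged_ct = xor(ct, delta + delta_crc)

    pt = xor(forged_ct, keystream)                  # receiver decrypts and checks the EDC
    assert pt[:-4] == b"PAY $9910 TO BOB" and pt[-4:] == crc32(pt[:-4])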
4.7.3 Generic Authenticated Encryption Constructions

We now discuss 'generic' constructions, combining arbitrary MAC and encryption schemes to ensure both confidentiality and authentication/integrity. As discussed above, these constructions can be used to construct a single, combined 'authenticated encryption' scheme, or to ensure both goals (confidentiality and authenticity) in a system. Different generic constructions were proposed - but not all are secure. Let us consider three constructions, all applied in important, standard applications. For each of the designs, we present the process of authenticating and encrypting a message m, using two keys - k' used for encryption, and k'' used for authentication.

Authenticate and Encrypt (A&E), e.g., used in early versions of the SSH protocol: C = Enc_{k'}(m), A = MAC_{k''}(m); send (C, A).

Authenticate then Encrypt (AtE), e.g., used in the SSL and TLS standards: A = MAC_{k''}(m), C = Enc_{k'}(m, A); send C.

Encrypt then Authenticate (EtA), e.g., used by the IPsec standard: C = Enc_{k'}(m), A = MAC_{k''}(C); send (C, A).

Exercise 4.15 (Generic AE and AEAD schemes). Above we described only the 'encrypt-and-authenticate' function of the authenticated-encryption schemes for the three generic constructions, and even that we described informally, without the explicit implementation. Complete the description by writing explicitly, for each of the three generic constructions above, the implementation of the encrypt-and-authenticate (EnA) and the decrypt-and-verify (DnV) functions. Present also the AEAD (Authenticated Encryption with Associated Data) version.

Partial solution: we present only the solution for the A&E construction. The AE implementations are:

    A&E.EnA_{k',k''}(m):  return (Enc_{k'}(m), MAC_{k''}(m))

    A&E.DnV_{k',k''}(c, a):  m ← Dec_{k'}(c); if a = MAC_{k''}(m), return m; otherwise, return ERROR.

The AEAD implementations are very similar, except also with Associated Data (wAD); we present only the EnA function:

    A&E.EnAwAD_{k',k''}(m, d; r) = (Enc_{k'}(m; r), d, MAC_{k''}(m ++ d))

Some of these three generic constructions are insecure, as we demonstrate below for particular pairs of encryption and MAC functions. Can you identify - or guess - which? The answers were given, almost concurrently, by two beautiful papers [40, 241]; the main points are in the following exercises. Exercise 4.16 shows that A&E is insecure; this is quite straightforward, and hence readers should try to solve it alone before reading the solution.

Exercise 4.16 (Authenticate and Encrypt (A&E) is insecure). Show that a pair of a secure encryption scheme Enc and a secure MAC scheme MAC may be both secure, yet their combination using the A&E construction would be insecure.

Solution: given any secure MAC scheme MAC, let MAC'_{k''}(m) = MAC_{k''}(m) ++ m[1], where m[1] is the first bit of m. If MAC is a secure MAC then MAC' is also a secure MAC. However, MAC' exposes a bit of its input; hence, its use in A&E would allow the adversary to distinguish between encryptions of two messages, i.e., the resulting, combined scheme is not IND-CPA secure - even when the underlying encryption scheme E is secure.
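As a complement to Exercise 4.15, here is one possible (non-authoritative) way to write the three constructions in Python; a minimal sketch, assuming the encryption scheme is supplied as callables enc/dec (hypothetical parameters standing for any IND-CPA scheme), with HMAC-SHA256 standing in for the MAC. The EtA decrypt-and-verify illustrates the verify-before-decrypt order.

    import hmac
    import hashlib
    from typing import Callable

    def mac(k: bytes, x: bytes) -> bytes:
        # HMAC-SHA256 stands in for an arbitrary secure MAC
        return hmac.new(k, x, hashlib.sha256).digest()

    def a_and_e(enc: Callable, ke: bytes, km: bytes, m: bytes):
        """Authenticate-and-Encrypt: tag over the plaintext; send (C, A)."""
        return enc(ke, m), mac(km, m)

    def ate(enc: Callable, ke: bytes, km: bytes, m: bytes):
        """Authenticate-then-Encrypt: tag appended to plaintext, then encrypted; send C."""
        return enc(ke, m + mac(km, m))

    def eta_ena(enc: Callable, ke: bytes, km: bytes, m: bytes):
        """Encrypt-then-Authenticate: tag over the ciphertext; send (C, A)."""
        c = enc(ke, m)
        return c, mac(km, c)

    def eta_dnv(dec: Callable, ke: bytes, km: bytes, c: bytes, a: bytes):
        """EtA decrypt-and-verify: check the tag first, decrypt only if valid."""
        if not hmac.compare_digest(a, mac(km, c)):
            return None  # ERROR
        return dec(ke, c)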
Exercise 4.17 shows that AtE is also insecure. The argument is more elaborate than the A&E argument from Exercise 4.16, and it may not be completely necessary to understand it on a first reading; however, it is a nice example of a cryptographic counterexample, so it may be worth investing the effort. Readers may also consult [241] for more details.

Exercise 4.17 (Authenticate then Encrypt (AtE) is insecure). Show that a pair of a secure encryption scheme Enc and a secure MAC scheme MAC may be both secure, yet their combination using the AtE construction would be insecure.

Solution: Consider the following simplified version of the Per-Block Random (PBR) mode presented in subsection 2.8.2, defined for single-block messages: Enc_k(m; r) = (m ⊕ E_k(r)) ++ r, where E is a block cipher; notice that this is also essentially OFB and CFB mode encryption, applied to single-block messages. When the random bits are not relevant, i.e., simply selected uniformly, then we do not explicitly write them and use the simplified notation Enc_k(m). As shown in Theorem 2.1, if E is a secure block cipher (or even merely a PRF or PRP), then Enc is an IND-CPA secure encryption scheme. Denote the block length by 4n, i.e., assume it is a multiple of 4. Hence, the output of Enc is 8n bits long.

We next define a randomized transform Split : {0,1} → {0,1}^2, i.e., from one bit to a pair of bits. The transform always maps 0 to 00, and randomly transforms 1 to {01, 10, 11} with the corresponding probabilities {49.9%, 50%, 0.1%}. We extend the definition of Split to 2n-bit-long strings, by applying Split to each input bit, i.e., given the 2n-bit input message m = m_1 ++ ... ++ m_2n, where each m_i is a bit, let Split(m) = Split(m_1) ++ ... ++ Split(m_2n). We use Split to define a 'weird' variant of Enc, which we denote Enc', defined as: Enc'_k(m) = Enc_k(Split(m)). The reader should confirm that, assuming E is a secure block cipher, Enc' is an IND-CPA secure encryption scheme (for 2n-bit-long plaintexts).

Consider now AtE_{k,k'}(m) = Enc'_k(m ++ MAC_{k'}(m)) = Enc_k(Split(m ++ MAC_{k'}(m))), where m is an n-bit-long string, and where MAC has inputs and outputs of n-bit-long strings. Hence, the input to Enc' is 2n bits long, and hence, the input to Enc is 4n bits long - as we defined above.

However, AtE is not a secure authenticated-encryption scheme. In fact, given c = AtE_{k,k'}(m), we can decipher m, using merely feedback-only CCA queries. Let us demonstrate how we find the first bit m_1 of m. Denote the 8n bits of c as c = c_1 ++ ... ++ c_8n. Perform the query c' = c̄_1 ++ c̄_2 ++ c_3 ++ c_4 ++ c_5 ++ ... ++ c_8n, i.e., inverting the first two bits of c. Recall that c = AtE_{k,k'}(m) = Enc_k(Split(m ++ MAC_{k'}(m))) and that Enc_k(m; r) = (m ⊕ E_k(r)) ++ r. Hence, by inverting c_1, c_2, we invert the two bits of Split(m_1) upon decryption. The impact depends on the value of m_1. If m_1 = 0, then Split(m_1) = 00; by inverting them, we get 11, whose 'unsplit' transform returns 1 instead of 0, causing the MAC validation to fail, providing the attacker with an 'ERROR' feedback. However, if m_1 = 1, then Split(m_1) is either 01 or 10 (with probability 99.9%), and inverting both bits does not impact the 'unsplit' result, so the MAC validation does not fail. This allows the attacker to determine the first bit m_1, with a very small (0.1%) probability of error (in the rare case where Split(m_1) returned 11).

Note that the AtE construction is secure for specific encryption and MAC schemes. However, it is not secure for arbitrary secure encryption and MAC schemes, i.e., as a generic construction. Namely, Encrypt-then-Authenticate (EtA) is the only remaining candidate generic construction.
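The counterexample hinges on one property of Split: flipping both bits of an encoded pair changes the decoded value only when the encoded bit was 0. A small sketch of the transform (with the probabilities from the solution) may help see this:

    import random

    def split_bit(b: int):
        # 0 -> 00 always; 1 -> 01, 10, or (rarely) 11
        if b == 0:
            return (0, 0)
        return random.choices([(0, 1), (1, 0), (1, 1)], weights=[499, 500, 1])[0]

    def unsplit(pair) -> int:
        return 0 if pair == (0, 0) else 1

    def flip(pair):
        return (1 - pair[0], 1 - pair[1])

    # Flipping Split(0) = 00 gives 11, which decodes to 1 -> the embedded MAC check fails.
    assert unsplit(flip(split_bit(0))) == 1
    # Flipping Split(1) = 01/10 gives 10/01, still decoding to 1 -> the MAC check passes
    # (except in the rare 11 case, which flips to 00 and decodes to 0).
    print(unsplit(flip(split_bit(1))))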
Fortunately, EtA is secure, for any secure encryption and MAC scheme, as the following lemma states. Lemma 4.4 (EtA is secure [241]). Given a CPA-IND encryption scheme Enc and a secure MAC scheme M AC, their EtA construction ensures both CPA-IND and secure MAC. Proof sketch: We őrst show that the IND-CPA property holds. Suppose, to the contrary, that there is an efficient (PPT) adversary A that ‘wins’ against EtA in the IND-CPA game, with signiőcant probability. We construct adversary A ′ that ‘wins’ in the IND-CPA game against the encryption scheme Enc, employed as part of the EtA scheme. Speciőcally, A ′ generates a key k ′′ for the MAC function, and runs A. Whenever A chooses the two challenge messages m0 , m1 , and should be provided with the authenticated-encryption of mb , then A ′ chooses the same two messages and receives c∗ = Enck′ (mb ). Then A ′ uses the key k ′′ it generated to compute a∗ = M ACk′′ (c∗) and return the pair (c∗, a∗) which would be the authenticated-encryption of mb , as required. Similarly, whenever A asks for encryption of a message m, then A ′ uses its oracle to compute c = Enck′ (m), and k ′′ to compute a = M ACk′′ (c). A ′ then returns the pair (c, a) to A, which is exactly the required EtA.EnAk′ ,k′′ (m). Finally, when A guesses a bit b, then A ′ guesses the same bit. If A ‘wins’, i.e., correctly guesses, then A ′ also ‘wins’. It follows that there is no efficient (PPT) adversary A that ‘wins’ against EtA in the IND-CPA game. We next show that EtA also ensures security against forgery, as in Def. 4.1, adjusted for AE / AEAD schemes, as in Ex. 4.13. Suppose there is an efficient (PPT) adversary A that succeeds in forgery of the EtA scheme, Applied Introduction to Cryptography and Cybersecurity 4.7. COMBINING AUTHENTICATION, ENCRYPTION AND OTHER FUNCTIONS 257 with signiőcant probability. Namely, A produces a message c and tag a s.t. m = EtA.DnVk′ ,k′′ (c, a), for some message m, without making a query to EtA.EnAk′ ,k′′ (m). By construction, this implies that a = M ACk′′ (c). However, from the deőnition of encryption (Def. 2.1), speciőcally the correctness property, there is no other message m′ ̸= m whose encryption would result in same ciphertext c. Hence, A did not make a query to EtA.EnAk′ ,k′′ that returned M ACk′′ (c) as the tag - yet A obtained M ACk′′ (c) somehow - in contradiction to the assumed security of MAC. Additional properties of EtA: efficiency and ability to foil DoS, CCA Not only is EtA secure given any secure Encryption and MAC scheme - it also has three additional desirable properties: Efficiency: Any corruption of the ciphertext, intentional or benign, is detected immediately by the veriőcation process (comparing the received tag to the MAC of the ciphertext). This is much more efficient then encryption. Foil DoS: This improved efficiency implies that it is much harder, and rarely feasible, to exhaust the resources of the recipient by sending corrupted messages (ciphertext). Foil CCA: By validating the ciphertext before decrypting it, EtA schemes prevent CCA attacks against the underlying encryption scheme, where the attacker provides specially-crafted ciphertext messages, receives the corresponding plaintext (or failure indication if the ciphertext was not valid encryption), and uses the resulting plaintext and/or failure indication to attack the underlying encryption scheme. 
If the attacker creates such a crafted ciphertext and sends it to the EtA scheme, it should fail the MAC validation, and would not even be input to the decryption process. Therefore, as long as the attacker cannot forge a legitimate MAC, they can only attack the MAC component of EtA, and the encryption scheme is protected from this threat.

4.7.4 Single-Key Generic Authenticated-Encryption

All three constructions above used two separate keys: k' for encryption and k'' for authentication. Sharing two separate keys may be harder than sharing a single key. Can we use a single key k for both the encryption and the MAC functions used in the generic authenticated encryption constructions (or, specifically, in the EtA construction, since it is always secure)? Note that this excludes the obvious naive 'solution' of using a 'double-length' key, split into an encryption key and a MAC key. The following exercise shows that such 'key re-use' is insecure.

Exercise 4.18 (Key re-use is insecure). Let E', MAC' be secure encryption and MAC schemes. Show (contrived) examples of secure encryption and MAC schemes, built using E', MAC', demonstrating vulnerabilities for each of the three generic constructions, when using the same key for authentication and for encryption.

Partial solution:

A&E: Let E_{k',k''}(m) = E'_{k'}(m) ++ k'' and MAC_{k',k''}(m) = k' ++ MAC'_{k''}(m). Obviously, when combined using the A&E construction, the result is completely insecure - both authentication and confidentiality are completely lost.

AtE: To demonstrate loss of authenticity, let E_{k',k''}(m) = E'_{k'}(m) ++ k'' as above.

EtA: To demonstrate loss of confidentiality, let MAC_{k',k''}(m) = k' ++ MAC'_{k''}(m) as above. To demonstrate loss of authentication, with a hint to one elegant solution: combine E_{k',k''}(m) = E'_{k'}(m) ++ k'' as above, with a (simple) extension of Example 4.3.

The reader is encouraged to complete the missing details, and in particular, to show that all the encryption and MAC schemes used in the solution are secure (albeit contrived) - only their combined use, in the three generic constructions, is insecure.

Since we know that we cannot re-use the same key for both encryption and MAC, the next question is - can we derive two separate keys, k', k'', from a single key k, and if so, how? We leave this as a (not too difficult) exercise.

Exercise 4.19 (Generating two keys from one key). Given a secure n-bit-key shared-key encryption scheme (E, D) and a secure n-bit-key MAC scheme MAC, and a single random, secret n-bit key k, show how we can derive two keys (k', k'') from k, s.t. the EtA construction is secure when using k' for encryption and k'' for MAC, given:

1. A secure n-bit-key PRF f.
2. A secure n-bit-key block cipher (Ê, D̂).
3. A secure PRG from n bits to 2n bits.
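As a hint towards the first option of Exercise 4.19, here is a minimal sketch of deriving the two keys by evaluating a PRF on two fixed, distinct inputs. HMAC-SHA256 stands in for the n-bit-key PRF f (an assumption of this sketch, since the exercise leaves f abstract).

    import hmac
    import hashlib

    def derive_keys(k: bytes):
        """Derive (k', k'') from a single shared key k, using a PRF on distinct inputs."""
        k_enc = hmac.new(k, b"\x00", hashlib.sha256).digest()  # k'  = f_k(0)
        k_mac = hmac.new(k, b"\x01", hashlib.sha256).digest()  # k'' = f_k(1)
        return k_enc, k_mac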
4.7.5 Authentication, encryption, compression and error detection/correction codes

Data is often encoded for additional, non-security goals:

Reliability via error detection and/or correction, typically using an Error Detection Code (EDC) such as a checksum, or an Error Correction Code (ECC), such as Reed-Solomon codes. These mechanisms are designed against random errors, and are not secure against intentional, 'malicious' modifications. Note that secure message authentication, such as using a MAC, also ensures error detection; however, as we explain below, it is often desirable to also use the (insecure) error detection codes.

Compression for efficiency. Compression is applied to improve efficiency by reducing message length. As we explain below, this requirement may conflict with the confidentiality requirement.

[Figure 4.3 (diagram): the sender-side pipeline - the message is compressed into the plaintext, the plaintext is encrypted, a MAC tag is computed over the header, sequence number and ciphertext, and finally an error correction/detection code is added.]

Figure 4.3: Combining Security (encryption, authentication) with Reliability (EDC/ECC) and Compression. The application of EDC/ECC after the MAC allows recipients to discard messages corrupted by noise, without computing the MAC function; this saves overhead, and allows the recipient to detect attacks on the authentication.

In this subsection, we discuss how to correctly and securely combine the security goals of encryption and authentication with these additional goals of reliability and compression (for efficiency). Fig. 4.3 presents the recommended process for combining all of these functions. Let us explain this recommendation; a code sketch of the resulting sender-side pipeline follows the list.

• Compression is only effective when applied to data with significant redundancy; plaintext is often redundant, in which case applying compression to it could be effective. In contrast, ciphertext would normally not have redundancy. Hence, if compression is used, it must be applied before encryption. Note, however, that this may conflict with the confidentiality requirement, as we explain below; for better confidentiality, avoid compression completely, or take appropriate measures to limit the possible exposure.

• Encryption is applied next, before authentication (MAC), following the 'Encrypt-then-Authenticate' construction. Alternatively, we may use an authenticated-encryption with associated data (AEAD) scheme, to combine the encryption and authentication functions. Notice that by applying authentication after encryption, or using an AEAD scheme, we facilitate also authentication of a sequence-number or similar field used to prevent replay/reorder/omission, which is often known to the recipient, and hence may not be sent explicitly. We can also authenticate 'header' fields such as the destination address, which are also not encrypted, since they are used to process and route the encrypted message. The Encrypt-then-Authenticate mode also allows prevention of chosen-ciphertext attacks and more efficient handling of corrupted messages.

• Finally, we apply an error correction / detection code. This allows efficient handling of messages corrupted due to noise or other benign reasons. An important side-benefit is that authentication failures of messages in which no errors were detected imply an intentional forgery attack - an attacker made sure that the error-detecting code will be correct.
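The following is a minimal sketch of the sender-side pipeline of Fig. 4.3, under the same assumptions as the earlier EtA sketch: enc is a placeholder for any IND-CPA encryption (hypothetical callable), HMAC-SHA256 is the MAC, zlib provides the (optional) compression, and a CRC-32 stands in for the EDC/ECC stage; the record layout itself is illustrative, not a standard format.

    import hmac
    import hashlib
    import zlib

    def send_record(enc, k_enc: bytes, k_mac: bytes, seq: int, header: bytes, msg: bytes) -> bytes:
        compressed = zlib.compress(msg)                        # 1. compress (optional; see the vulnerability below)
        c = enc(k_enc, compressed)                             # 2. encrypt
        tag = hmac.new(k_mac,                                  # 3. MAC over header, sequence number
                       header + seq.to_bytes(8, "big") + c,    #    and ciphertext (Encrypt-then-Authenticate);
                       hashlib.sha256).digest()                #    the sequence number itself is not transmitted
        record = header + c + tag
        return record + zlib.crc32(record).to_bytes(4, "big")  # 4. EDC, to catch benign corruption cheaply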
Compress-then-Encrypt Vulnerability

Note that there is a subtle vulnerability in applying compression before encryption, since encryption does not hide the length of the plaintext, while the length of compressed messages depends on their contents. In particular, a message containing randomly-generated strings typically does not compress well (the length after compression is roughly as long as before compression), while messages containing lots of redundancy, e.g., strings composed of only one character, compress well (the length after compression is much shorter). This allows an attacker to distinguish between the encryptions of two compressed messages, based on the redundancy of the plaintexts. This vulnerability was first presented in [230], and later exploited in several attacks, including attacks on the record protocol of SSL and TLS; see subsection 7.2.6 and Exercise 7.6.
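A quick way to see the leak: the compressed (and hence the encrypted) length depends on the plaintext's redundancy, which encryption does not hide. For example, with zlib (the exact numbers depend on the library version):

    import os
    import zlib

    redundant = b"A" * 1000          # highly redundant plaintext
    random_pt = os.urandom(1000)     # incompressible plaintext

    print(len(zlib.compress(redundant)))   # roughly a dozen bytes
    print(len(zlib.compress(random_pt)))   # roughly 1000 bytes (slightly more)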
4.8 Additional exercises

Exercise 4.20. Mal intercepts a message sent from Alice to her bank, instructing the bank to transfer 10$ to Bob. Assume that the communication is protected by One-Time-Pad (OTP) encryption, using a random key shared between Alice and her bank, and by including Alice's password as part of the plaintext, validated by the bank. Assume Mal knows that the message is an ASCII encoding of the exact string Transfer 10$ to Bob. From: Alice, PW:, concatenated with Alice's password (unknown to Mal). Show how Mal can change the message so that upon receiving it, the bank will, instead, transfer 99$ to Mal. (The modified message must have the same password!)

Exercise 4.21. Let S be a correct signature scheme over domain {0,1}^n, and let h : {0,1}* → {0,1}^n be a hash function whose output is n bits long. Prove that the HtS construction S^h_HtS, defined as in Equation 3.6, is correct.

Exercise 4.22 (TCR hash is not necessarily a MAC). Let h_k(m) be a target collision resistant (TCR) hash function (subsection 3.2.3). Show a keyed hash function h'_k(m) which (1) is also a TCR hash function but (2) is not a secure MAC.

Exercise 4.23 (A MAC is not necessarily a CRHF). Let MAC_k(m) be a secure MAC function. Show a keyed hash function h_k(m) which (1) is a secure MAC yet (2) is not a (keyed) CRHF or a TCR hash function.

Exercise 4.24. Let S be an existentially unforgeable signature scheme over domain {0,1}^n, and let h(x ++ y) = x ⊕ y be a hash function whose input is 2n bits long, and whose output is the n-bit string resulting from the bit-wise exclusive-OR of the most-significant n input bits with the least-significant n input bits. Show an attacker A that shows that the HtS construction S^h_HtS, defined as in Equation 3.6, is not an existentially unforgeable signature scheme.

Exercise 4.25. Hackme Inc. proposes the following highly-efficient MAC, using two 64-bit keys k_1, k_2, for 64-bit blocks: MAC_{k_1,k_2}(m) = (m ⊕ k_1) + k_2 (mod 2^64). Show that this is not a secure MAC. Hint: Compare to Exercise 2.49.

Exercise 4.26. Let F : {0,1}^n → {0,1}^l be a secure PRF, from n-bit strings to l < n bit strings. Define F' : {0,1}^n → {0,1}^{2l} as: F'_k(m) = F_k(m) ++ F_k(m̄), i.e., concatenate the results of F_k applied to m and to the (bitwise) inverse of m. Present an efficient algorithm ADV^{F'_k} which demonstrates that F' is not a secure MAC, i.e., outputs a tuple (x, t) s.t. x ∈ {0,1}^n and t = F'_k(x). Algorithm ADV^{F'_k} may provide input m ∈ {0,1}^n and receive F'_k(m), as long as x ≠ m. You can present ADV^{F'_k} by 'filling in the blanks' in the 'template' below, modifying and/or extending the template if desired, or simply write your own code if you like.

    ADV^{F'_k}: {  t' = F'_k( _____ );  Return ( _____ );  }

Exercise 4.27. Consider CFB-MAC, defined below, similarly to the definition of CBC-MAC (Eq. (4.2)):

    CFB-MAC^E_k(m_1 ++ m_2 ++ ... ++ m_η):  c_0 ← 0^l;  for i = 1...η: c_i = m_i ⊕ E_k(c_{i-1});  output c_η

1. Show an attack demonstrating that CFB-MAC^E_k is not a secure l·η-bit MAC, even when E is a secure l-bit block cipher (PRP). Your attack should consist of:
a) Up to three 'queries', i.e., messages m, m' and m'', each of one or more blocks, to which the attacker receives CFB-MAC^E_k(m), CFB-MAC^E_k(m') and CFB-MAC^E_k(m''). Note: one query suffices, although you may use up to three.
• m =
• m' =
• m'' =
b) A forgery, i.e., a pair of a message m_F and its authenticator a = _____, such that m_F ∉ {m, m', m''} and a = CFB-MAC^E_k(m_F).
2. Would your attack also work against the 'improved' variant ICFB-MAC^E_k(m) = E_k(CFB-MAC^E_k(m))? If not, present an attack against ICFB-MAC^E_k(m):
• m =
• m' =
• m'' =
• m_F =
• a =

Exercise 4.28.
1. Alice sends to Bob the 16-byte message 'I love you Bobby', where each character is encoded using one-byte (8-bit) ASCII encoding. Assume that the message is encrypted using the (64-bit) DES block cipher, using OFB mode. Show how an attacker can modify the ciphertext message to result in the encryption of 'I hate you Bobby'.
2. Can you repeat for CFB mode? Show or explain why not.
3. Can you repeat for CBC mode? Show or explain why not.
4. Repeat the previous items, if we append to the message its CRC, and verify it upon decryption.

Exercise 4.29.
1. Our definition of FIL CBC-MAC assumed that the input is a complete number of blocks. Extend the construction to allow input of arbitrary length, and prove its security.
2. Repeat, for VIL CBC-MAC.

Exercise 4.30. Consider a variant of CBC-MAC, where the value of the IV is not a constant, but instead the value of the last plaintext block, i.e.:

    CBC-MAC^E_k(m_1 ++ m_2 ++ ... ++ m_η):  c_0 ← m_η;  for i = 1...η: c_i = E_k(m_i ⊕ c_{i-1});  output c_η

Is this a secure MAC? Prove, or present a convincing argument.

Exercise 4.31. Let E be a secure PRF. Show that the following are not secure MAC schemes.
1. ECB-encryption of the message.
2. The XOR of the output blocks of ECB-encryption of the message.

Exercise 4.32 (MAC from a PRF). In Exercise 2.38 you were supposed to construct a PRF, with input, output and keyspace all of 64 bits. Show how to use such a (candidate) PRF to construct a VIL MAC scheme.

Exercise 4.33. This question discusses a (slightly simplified) vulnerability in a recently proposed standard. The goal of the standard is to allow a server S to verify that a given input message was 'approved' by a series of filters, F_1, F_2, ..., F_f (each filter validates certain aspects of the message). The server S shares a secret k_i with each filter F_i. To facilitate this verification, each message m is attached with a tag; the initial value of the tag is denoted T_0, and each filter F_i receives the pair (m, T_{i-1}) and, if it approves of the message, outputs the next tag T_i. The server S will receive the final pair (m, T_f) and use T_f to validate that the message was approved by all filters (in the given order). A proposed implementation is as follows. The length of the tag would be the same as that of the message and of all secrets k_i, and the initial tag T_0 would be set to the message m.
Each filter F_i signals approval by setting T_i = T_{i-1} ⊕ k_i. To validate, the server receives (m, T_f) and computes m' = T_f ⊕ k_1 ⊕ k_2 ⊕ ... ⊕ k_f. The message is considered valid if m' = m.

1. Show that in the proposed implementation, if the tag T_f is computed as planned (i.e., as described above), then the message is considered valid if and only if all filters approved of it.
2. Show that the proposed implementation is insecure.
3. Present a simple, efficient and secure alternative design for the validation process.
4. Present an improvement to your method, with much improved, good performance even when messages are very long (and having a tag as long as the message is impractical).

Note: you may combine the solutions to the last two items; but separating the two is recommended, to avoid errors and minimize the impact of errors.

Exercise 4.34 (Single-block authenticated encryption?). Let E be a block cipher (or PRP or PRF), for input domain {0,1}^l, and let l' < l. For input domain m ∈ {0,1}^{l-l'}, let f_k(m) = E_k(m ++ 0^{l'}).
1. Prove, or present a counterexample: f is a secure MAC scheme.
2. Prove, or present a counterexample: f is an IND-CPA symmetric encryption scheme.

Exercise 4.35. Let F : {0,1}^κ × {0,1}^{l+1} → {0,1}^{l+1} be a secure PRF, where κ is the key length, and both inputs and outputs are l+1 bits long. Let F' : {0,1}^κ × {0,1}^{2l} → {0,1}^{2l+2} be defined as: F'_k(m_0 ++ m_1) = F_k(0 ++ m_0) ++ F_k(1 ++ m_1), where |m_0| = |m_1| = l.
1. Explain why it is possible that F' would not be a secure 2l-bit MAC.
2. Present an adversary and/or counter-example, showing F' is not a secure 2l-bit MAC.
3. Assume that, indeed, it is possible for F' not to be a secure MAC. Could F' then be a secure PRF? Present a clear argument.

Exercise 4.36. Given a keyed function f_k(x), show that if there is an efficient operation ADD such that f_k(x+y) = ADD(f_k(x), f_k(y)), then f is not a secure MAC scheme. Note: a special case is when ADD(a, b) = a + b.

Exercise 4.37 (MAC from other block cipher modes). In subsection 4.5.2 we have seen that given an n-bit block cipher (E, D), the CBC-MAC, as defined in Eq. (4.2), is a secure n·η-bit PRF and MAC, for any integer η > 0; and in Ex. 4.4 we have seen that this does not hold for CTR-mode MAC. Does this property hold for...

ECB-MAC, defined as: ECB-MAC^E_k(m_1 ++ ... ++ m_η) = E_k(m_1) ++ ... ++ E_k(m_η)

PBC-MAC, defined as: PBC-MAC^E_k(m_1 ++ ... ++ m_η) = (m_1 ⊕ E_k(1)) ++ ... ++ (m_η ⊕ E_k(η))

OFB-MAC, defined as: OFB-MAC^E_k(m_1 ++ ... ++ m_η) = pad_0, (m_1 ⊕ E_k(pad_0)) ++ ... ++ (m_η ⊕ E_k(pad_{η-1})), where pad_0 is random and pad_i = E_k(pad_{i-1}) for i ≥ 1.

CFB-MAC, defined as: CFB-MAC^E_k(m_1 ++ ... ++ m_η) = c_0, c_1, ..., c_η, where c_0 is random and c_i = m_i ⊕ E_k(c_{i-1}) for i ≥ 1.

XOR-MAC, defined as: XOR-MAC^E_k(m_1 ++ ... ++ m_η) = ⊕_{i=1..η} E_k(i ⊕ E_k(m_i))

Justify your answers, by presenting a counterexample (for incorrect claims), or by showing how, given an adversary against the MAC function, you can construct an adversary against the block cipher.

Exercise 4.38. Let (Enc, Dec) be an IND-CPA secure encryption scheme, and let Enc'_k(m) = Enc_k(Compress(m)), where Compress is a compression function. Show that Enc' is not IND-CPA secure.

Exercise 4.39.
Figure 4.3 shows the typical sequence of operations when sending a message, with confidentiality (encryption), authentication (MAC), error detection/correction code and (optional) compression. Show the corresponding sequence of operations when receiving such message, in a figure and/or in pseudo-code. In both cases, clarify the reaction if the MAC or EDC/ECC validation fails. Applied Introduction to Cryptography and Cybersecurity Chapter 5 Shared-Key Protocols In the previous chapters, we discussed cryptographic schemes, which consist of one or more functions. For example, MAC, PRF and PRG schemes consist of a single function, with different security criteria, e.g., for MAC, security against forgery (Deőnition 4.1). Similarly, encryption schemes consist of multiple functions (encryption, decryption and possible key generation), with criteria such as CPA-indistinguishability (CPA-IND, Deőnition 2.9). Most often, cryptographic schemes are used as a part of a protocol involving two or more parties (entities). In this chapter, we focus on the following basic shared-key protocols used for securing the communication between two parties: Session/record protocols secure the communication of a session between two parties, which typically includes exchange of multiple messages, using a key shared between the two parties. See Section 5.1. Entity-authentication protocols ensure the identity of a peer involved in the communication. We discuss the vulnerable SNA protocol, and its replacement, the 2PP protocol, in Section 5.2. Request-response protocols ensure the authentication and/or conődentiality of the communication between two parties. We discuss them in Section 5.3. Key Exchange protocols are run between the two parties, to establish shared keys to encrypt and authenticate communication. In this chapter, we focus on shared-key Key Exchange protocols, which use an already shared-key between the parties, to establish keys for encryption and authentication; these protocols ensure improved security compared to direct use of the shared key for these functions. See Section 5.4. Key distribution protocols establish shared keys between two parties, with the help of a trusted third party (TTP), often referred to as the Key Distribution Center (KDC). We discuss key distribution protocols in Section 5.5; much of our discussion is dedicated to studying the vulnerabilities of the GSM protocol. 267 268 CHAPTER 5. SHARED-KEY PROTOCOLS Resilient Key Exchange protocols are Key Exchange protocols with mechanisms to reduce exposure due to key exposure. These include forward secrecy, perfect forward secrecy (PFS) and recover security Key Exchange protocols. See Section 5.7. This chapter is mostly informal. In particular, we do not present rigorous deőnitions of security as we did in the previous chapters, and as we do in Section 5.1.1. This is since precise deőnitions of the execution of protocols and of the corresponding requirements and models, seem to be unavoidably too complex for this textbook. We hope that the informal discussion will clarify the main issues and empower readers to properly use cryptographic protocols and avoid pitfalls. Hopefully, the discussion will also prepare readers interested in design and analysis of cryptographic protocols to study more advanced texts which address these challenges. Some of the many relevant texts include [46, 87, 166, 358] and [196]. 5.1 Modeling cryptographic protocols We begin our discussion informally explaining what we mean by a cryptographic protocol. 
We use the term ‘cryptographic protocols’ to refer to cryptographic algorithms that involve interactions between two or more distinct entities1 , including benign entities and an adversary. In this textbook and many works in cryptography, we focus on protocols involving only two benign parties, often called Alice (A) and Bob (B), and one Man-in-the-Middle (MitM) adversary. Both benign entities run the same protocol P, which is an efficient (PPT) algorithm; to model parties with different roles, e.g., client vs. server, the ‘role’ can be provided as part of their initial state. The adversary also runs an efficient (PPT) algorithm. An execution of a cryptographic protocol P, with an adversary M, is the random outcome of a process that we refer to as the execution process. Typically, all interactions between benign parties, as well as interactions with the adversary, are done via the execution process; the execution process passes outputs from the benign parties to the adversary, and lets the adversary control the inputs to the benign parties, i.e., the adversary has Man-in-the-Middle (MitM) capabilities. We assume that the execution process allows the protocol to securely initialize the parties, which may involve sharing of secret keys or of public keys. Typically, a protocol has at least two interfaces: the interface with the application using the protocol (APP), and the interface with the network, allowing the protocol to communicate with the other entity (NET). A third interface is often used to provide the protocol with system service such as a clock (SYS). 1 In some works, the term ‘cryptographic protocol’ is used in a different way, mostly to mean what we refer to as a ‘cryptographic scheme’, i.e., not necessarily involving interactions between entities; e.g., you may see mention of ‘encryption protocol’. Applied Introduction to Cryptography and Cybersecurity 5.1. MODELING CRYPTOGRAPHIC PROTOCOLS 269 APP Application interface received(m), f ailure send(m) SYS System interface sleep(δ) Alice wake-up(t) Nurse send(µ) received(µ) NET Network interface Figure 5.1: Interactions for the record/session protocols, illustrated for Alice; Bob has the same interfaces. Note that while we use the labels send and received for both the APP and NET layers, the semantics and the messages sent (m and µ) are very different. Typically, µ will contain an encoding of m, possibly encrypted and/or authenticated, and a header that identiőes sender, recipient and key; see text for details. Other protocols we study use the same SYS and NET events, but different APP events. 5.1.1 The session/record protocol We illustrate the concept of a cryptographic protocol by focusing on the simple yet important session/record protocol. A session/record protocol uses a key shared between the two parties, to authenticate and/or encrypt the messages or records sent in a session or connection between them. In Chapter 7, we present a practical session/record protocol; this is the TLS record protocol, which is part of TLS. Session/record protocols are among the simplest practical protocols, in particular, simpler than the other protocols we discuss, and therefore a good example. Let us őrst describe the APP, SYS and NET interfaces of the session/record protocol, and the operations in each of them, as illustrated in Figure 5.1. Application (APP): an interface for input/output interactions between a benign party and the application which uses it, to whom we sometimes refer as the user of the protocol. 
These interactions provide the inputs from the application or user to the protocol running on the (benign) party, and allow the protocol to provide outputs to the application/user. For the session/record protocol, the only input interaction in the APP interface, is transmission of a message m from the application to be sent to the peer Applied Introduction to Cryptography and Cybersecurity 270 CHAPTER 5. SHARED-KEY PROTOCOLS in a send(m) interaction. There are two output interactions, the receipt of a message m from the peer in a received(m) event, and an indication that the protocol cannot send information due to a communication failure, in a f ailure interaction. Here, send, received and f ailure are labels and m is a value. Network (NET): an interface for the communication between benign parties, allowing a benign party to exchange messages with another benign party, subject to manipulations by the adversary. To send a message µ to the peer, the protocol uses the send(µ) output event on the N ET interface; send is the label and µ is the value. We use the symbol µ for the messages in the NET interface, to separate them from the symbol m which we use for messages in the APP interface. Typically, µ consists of two parts: header, identifying the sender, recipient and key(s) used, and payload, which contains an encoding of m, possibly encrypted and/or authenticated. In a received(µ) input event on the N ET interface, the protocol receives a message µ, which typically contains a header which identiőes the purported peer who sent the message, as well as the key(s) to be used, as needed, to decrypt and/or authenticate the message. System (SYS): an interface to other interactions of the protocol, typically to ‘local’ services such as clock, sensors and relays/actuators. In this textbook, we only use clock service, with two interactions. Speciőcally, the protocol may invoke sleep(δ) to request a ‘wake-up call’ after δ time units (e.g., seconds); upon that time, the protocol should receive an incoming wake-up(t) event, with t indicating the current time. This interface allows both ‘wake-up’ service as well as a ‘clock lookup’ service (using sleep(0)). The APP interface events are speciőc to each type of protocols, e.g., differ between session/record protocols and between entity-authentication and other protocols. However, all the protocols we study use the same SYS and NET events, as described above. Adversary capabilities and security requirements Our description of the interactions focused on their intended actions; however, obviously, the adversary can signiőcantly interfere with the interactions in different ways. Following the Attack Model principle (Principle 1), we only limit the adversary’s capabilities, rather than assuming a speciőc attacker strategy. The adversary’s capabilities include both computational capabilities and access capabilities. In most works in cryptography, the computational capabilities of the adversary are limited by requiring the adversary to be an efficient (PPT) algorithm (see Section A.1), i.e., its run time must be polynomial in the length of its inputs. The protocol should also be a PPT algorithm, and have comparable computational capabilities to these of the adversary. To ensure this, we provide an input called the security parameter, which we denote 1l , to both adversary and protocol, which means that they are both limited to run a polynomial Applied Introduction to Cryptography and Cybersecurity 5.1. 
MODELING CRYPTOGRAPHIC PROTOCOLS 271 number of steps in the length of 1l . Note that 1l is a number encoded in unary, i.e., it consists of l bits whose value is all 1 (see Table 1.1). The access capabilities deőne the ability of the adversary to observe the outputs and control the inputs of the benign parties. Like most works in cryptography, we focus on the Man-in-the-Middle (MitM) adversary, which observes all outputs and determines all inputs of the benign parties. Note that some security requirements restrict the adversary’s capabilities; in particular, in subsection 6.1.3 we study the key exchange problem, which assumes an eavesdropping adversary, who can observes messages but cannot modify, inject or drop messages. Following the conservative design principle (Principle 3, we allow the adversary complete control and observation capabilities over the application interface. In the case of the session/record protocol, this means that the adversary determines when a send(m) request from the application occurs, and what is the message m, except for distinguishability games where the adversary can provide two equally-long inputs, one of which will be used as input, and the adversary has to try to guess which input was used. The ability to control the input message m corresponds to the chosen plaintext attack (CPA) model for encryption and the chosen message attack (CMA) for signatures, and to the other attack models deőned in Chapter 2 for encryption and in subsection 1.5.2 for signatures. The authentication requirement of session/record protocols. Intuitively, an execution satisőes existential unforgeability if the sequence of message received by Bob is a preőx of the sequence of message sent by Alice. A session/record protocol P ensures existential unforgeability, if executions with every PPT adversary satisfy existential unforgeability, except with negligible probability. Session/record protocols are typically also expected to ensure conődentiality, which is usually deőned using an indistinguishability-game, along the same principle used in our deőnitions in Chapter 2. 5.1.2 PEtA : a simple EtA session/record protocol We now present the simple EtA session/record protocol, which we denote PEtA . This is a very simple record protocol, following the Encrypt-then-Authenticate (EtA) design (Section 4.7), and assuming an underlying reliable-communication service, such as provided by the widely-used TCP standard; read about TCP in [320] and introductory textbooks on Internet protocols, e.g., [245]. The use of EtA makes PEtA a highly-simpliőed variant of the IPsec ESP protocol [153], an important Internet security standard, while its assumption of an underlying reliable-communication service is similar to the SSL and TLS record protocols, which we discuss in Chapter 7. Note that the TLS record protocol uses either the (less-preferred, often vulnerable) Authenticate-then-Encrypt (AtE) design, or the AEAD design (subsection 4.7.1, which is secure - but different from EtA. Applied Introduction to Cryptography and Cybersecurity CHAPTER 5. SHARED-KEY PROTOCOLS 272 We describe the protocol for two speciőc parties {A, B}, although, of course, it can be trivially extended to support arbitrary pairs of parties. We also make several other simpliőcations, as listed below. Simplifications of PEtA . 
The PEtA protocol makes the following simplifications:

Assumes a reliable network (connection): the protocol delivers messages only if every packet sent by one party to its peer, over the network (NET interface), is delivered reliably. Reliable delivery means that the sequence of packets is delivered exactly as sent: no forgery, reordering, modification, duplication, or loss of packets. However, we do allow that some of the last messages sent are not delivered. The protocol ignores ('drops') any packet received which was not sent, or received out of sequence, with a 'fail' indication to the application. In practice, this simplification implies that the protocol assumes an underlying reliable communication protocol, such as TCP.

Does not ensure a reliable connection: while PEtA assumes an underlying reliable connection, it does not guarantee a reliable connection. The protocol does ensure that messages received have been sent by the peer, but some messages sent may be lost (without indication to the sender or recipient), reordered and duplicated, and we may receive 'stale' messages sent unlimited time ago (no freshness).

No compression, padding or fragmentation: the protocol sends messages of unbounded size 'as-is', without applying compression and without fragmenting long messages (into multiple bounded-length fragments). Furthermore, the protocol assumes that the MAC and encryption functions receive arbitrary-length messages, i.e., the protocol does not perform any padding before applying MAC and encryption.

Assumes initialization of two shared keys: the protocol assumes initialization of a shared encryption key k_E, as well as a shared authentication key k_MAC.

For a more practical session/record protocol, see Chapter 7, where we present the TLS record protocol, possibly the most important and widely used session/record protocol. The TLS record protocol avoids these simplifications, except the first, i.e., it also assumes an underlying reliable network (connection). Practical session/record protocols that avoid all of these simplifications include the DTLS [330] protocol, which is an adaptation of TLS allowing it to run over an unreliable network service, and the IPsec [127, 153] protocol.

Explanation of the simple secure session/record protocol PEtA. The PEtA protocol uses both a shared-key (symmetric) cryptosystem (E, D), using a (shared) key k_E, and a Message Authentication Code (MAC) scheme MAC, using a (shared) key k_MAC. The description is simplified by ignoring the key-generation of both the cryptosystem and the MAC scheme; we assume that the two keys, k_E and k_MAC, are generated and shared securely with the two parties before the protocol begins.

Receiving a message from the application. Assume party p ∈ {A, B} receives message m from the application, i.e., has a send(m) event on the APP interface. The protocol should send the incoming message m, properly encrypted and authenticated. The protocol follows the recommended Encrypt-then-Authenticate (EtA) approach (subsection 4.7.5). Namely, the protocol first encrypts the message m. The ciphertext is c ←$ E_{k_E}(m); we use the ←$ notation to emphasize that the encryption algorithm is randomized, i.e., the ciphertext c is not a deterministic function of the message m. Then the protocol computes the authenticator a ← MAC_{k_MAC}(c ++ (sent+1) ++ p), where sent is the sequence number of the messages sent.
The input to the MAC includes the sequence number sent+1 of this message-send event, and the sender identification p; this prevents manipulation by a MitM adversary. Finally, the protocol performs a send request to the network (NET interface). The packet sent is the pair (c, a), i.e., ciphertext and authenticator; we use the term packet for the information sent by the protocol, to distinguish it from the message which is sent by the application using the protocol. Notice that the protocol does not send the sent-messages counter sent or the sender identification p, since the underlying networking service is assumed to ensure reliable delivery; namely, the messages should be received in the order sent, and therefore the recipient can count them and should have exactly the same inputs to the MAC; see below.

An incoming received(µ) from the network (NET interface). Since we assumed a reliable network service, the incoming packet µ should be the next packet sent by the peer. The protocol parses the incoming packet µ as a pair (c, a). It verifies that the authenticator is valid, i.e., that a = MAC_{k_MAC}(c ++ (rcved+1) ++ p̂), where p̂ denotes the peer (i.e., the sender) and rcved is the counter of messages received successfully so far. If the authenticator a is valid, then the protocol passes to the application the plaintext message m ← D_{k_E}(c), i.e., the decryption of the ciphertext part c. The plaintext message m should be the (rcved+1)-th message sent by the peer p̂ to p; it would now also be the (rcved+1)-th message received by p from p̂. If validation fails, the protocol drops the packet, possibly providing an indication of the failure to the application.
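A minimal sketch of the two PEtA operations just described, under assumptions: enc/dec are placeholders for the shared-key cryptosystem (E, D) (hypothetical callables), HMAC-SHA256 stands in for MAC, and the byte layout of the MAC input is illustrative. Each endpoint keeps its own sent and rcved counters; the counters and party identifiers are fed into the MAC but, as explained above, are not transmitted.

    import hmac
    import hashlib

    def mac(k: bytes, x: bytes) -> bytes:
        return hmac.new(k, x, hashlib.sha256).digest()

    class PEtAEndpoint:
        def __init__(self, me: bytes, peer: bytes, k_e: bytes, k_mac: bytes, enc, dec):
            self.me, self.peer = me, peer
            self.k_e, self.k_mac = k_e, k_mac
            self.enc, self.dec = enc, dec
            self.sent = 0
            self.rcved = 0

        def app_send(self, m: bytes):
            """send(m) on APP: encrypt, then authenticate ciphertext, counter and sender."""
            c = self.enc(self.k_e, m)
            self.sent += 1
            a = mac(self.k_mac, c + self.sent.to_bytes(8, "big") + self.me)
            return (c, a)                      # the packet passed to NET

        def net_receive(self, packet):
            """received(packet) on NET: verify before decrypting; drop on failure."""
            c, a = packet
            expected = mac(self.k_mac, c + (self.rcved + 1).to_bytes(8, "big") + self.peer)
            if not hmac.compare_digest(a, expected):
                return None                    # drop; optionally signal failure to APP
            self.rcved += 1
            return self.dec(self.k_e, c)       # deliver the plaintext to APP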
We refer to the entire set of messages involved with a single run of the authentication protocol as the handshake; a handshake requires at least two message ŕows (e.g., Alice to Bob and response from Bob to Alice). Some protocols require three ŕows, i.e., also a response from Alice to Bob. Concurrent handshakes. Most entity authentication protocols allow multiple concurrent handshakes. For example, a handshake begins with the protocol in Alice receiving an Init request from the application (APP interface), requesting to initialize a new handshake with Bob. However, before this handshake terminates, Alice is requested to initiate another handshake with Bob; or, Bob receives an Init request from its own application, requesting Bob to initiate a handshake with Alice, while Alice is still in the middle of an ongoing handshake. Note that the same party may also be a responder for one or more concurrent handshakes, purportedly initiated by its peer. We elaborate on the motivation for supporting concurrent handshakes at the end of this subsection. Interactions of entity authentication protocols. An entity authentication protocol has the interactions illustrated in Figure 5.2. The interactions in the SYS and NET interfaces are the same as presented in Figure 5.1; let us Applied Introduction to Cryptography and Cybersecurity 5.2. SHARED-KEY ENTITY AUTHENTICATION PROTOCOLS 275 APP Application interface Resp(i), Accept(i) Init(i) SYS System interface sleep(δ) Alice wake-up(t) Nurse send(m) received(m) NET Network interface Figure 5.2: Interactions for entity authentication protocols, illustrated for Alice; Bob has the same interfaces. The handshake identifier i allows association of the different APP interactions related to the same handshake in each party, as explained in the text. explain the APP interface interactions, i.e., the interactions between the entity authentication protocol and the application using it. A handshake begins at an initiator, say Alice, upon Init(i) input interaction (request) from the user or application (APP interface). In the responder, say Bob, the handshake begins with a Resp(i) output interaction, i.e., a notiőcation by the protocol of a new handshake initiated by the peer (e.g., Alice). The identiőer i is called the session identifier; identiőers are required to be distinct for different handshakes in each party. Note that each handshake is assigned a unique identiőer at each party, e.g., iI at the initiator and iR at the responder. The two identiőers may be assigned independently at each party; they only need to be unique for that party. One simple implementation would be as a counter maintained at each party, incremented by the application when it issues a new Init(iI ) request, and by the protocol when it notiőes of a new handshake with a new Resp(iR ) interaction. A handshake terminates at a party, e.g., Alice, when she Accepts, i.e., validates successfully the interaction with the peer (Bob); this is done by the protocol returning Accept to the application (i.e., on the APP interface). If the validation fails - e.g., Bob never responds - the handshake may never terminate, or terminate with a failure indication2 . 2 The failure indicator is used to ensure that handshakes terminate within bounded time; for simplicity, we do not discuss this requirement or utilize it in the protocols and requirements that we cover. Applied Introduction to Cryptography and Cybersecurity 276 CHAPTER 5. 
To summarize, the interactions of the APP interface are as follows:

Init(i) is an input interaction, i.e., a request from the application to initiate a new connection (with a distinct identifier i). Init is the only input interaction.

Resp(i) is an output interaction, where the protocol informs that it is responding to a new session, and assigns this new session an identifier value i. This session identifier must differ from other session identifiers used by this party (in Init or Resp interactions).

Accept(i) is an output interaction, where the protocol informs the application that session i has successfully completed.

Security of entity-authentication protocols. Intuitively, the basic requirement from an entity-authentication protocol is that successful authentication would imply a 'fresh' interaction with the peer. The term 'fresh' may mean either simply 'within the recent ∆ seconds' (for some small ∆), or that there was an overlap between the handshake periods in the two peers. Another requirement that some protocols ensure is that a successful authentication in the responder implies successful authentication also in the initiator. We could also design a protocol with the reverse property, i.e., success at the initiator would ensure success at the responder. However, if we allow communication failures, we cannot ensure that success in either of the parties will imply success in the other party.

Sequential vs. Concurrent Authentication. Note that ensuring mutual authentication is easier if handshakes must be done sequentially, i.e., no concurrent handshakes. However, support for multiple concurrent sessions is a critical aspect of Mutual Entity Authentication protocols. Concurrent sessions are required in many scenarios; e.g., web communication often uses concurrent connections (sessions) between browser and server, to improve performance. Concurrent handshakes are also necessary to prevent 'lock-out' due to synchronization errors (e.g., lost state by the initiator), or an intentional 'lock-out' by a malicious attacker, as part of a denial-of-service attack. In any case, there is no strong motivation to allow only consecutive handshakes; protocols that support concurrent handshakes are as efficient, and not significantly more complex, than protocols that only allow sequential authentication. Therefore, following the conservative design principle (Principle 3), we focus on the case where concurrent handshakes are allowed.

When designers do not consider the threats due to concurrent sessions, yet the implementation allows concurrent sessions, the result is often a vulnerable protocol. This is a typical example of the results of failure to articulate the requirements from the protocol and the adversary model, and of not following Principle 3. We next study such a vulnerability: the SNA mutual-authentication protocol.

[Figure 5.3 (message diagram): Alice picks a random l-bit nonce NA and sends (A, NA); Bob picks a random l-bit nonce NB and replies (NA, NB, Ek(NA)); Alice replies (NB, Ek(NB)).]

Figure 5.3: The (vulnerable) SNA mutual authentication protocol.

5.2.2 Vulnerability study: SNA mutual-authentication protocol

As a simple, yet realistic, example of an (insecure) two-party, shared-key Mutual Entity Authentication protocol, consider the SNA mutual-authentication protocol. IBM's SNA (Systems Network Architecture) was the primary networking technology from 1974 till the late 1980s, and is still in use by some 'legacy' applications.
We describe the original, insecure version of the SNA Mutual Entity Authentication protocol, and later its replacement - the 2PP Mutual Entity Authentication protocol. Both protocols use a shared secret key k to authenticate the two parties to each other, without deriving a session key; we later describe extensions which also provide Key Exchange. We first explain the SNA protocol, illustrated in Figure 5.3, and then discuss its security. The SNA Mutual Entity Authentication protocol operates in three simple flows, as illustrated in Figure 5.3. The protocol uses a block cipher E. The initiator, say Alice, sends to her peer, say Bob, her identifier, which we denote A, and NA, a random l-bit binary string which serves as a challenge; such a random challenge is often called a nonce. Here, l is the size of the inputs and outputs of the block cipher E used by the protocol. The responder, say Bob, replies with a proof of participation Ek(NA), using the shared key k. Bob also sends his own random l-bit challenge (nonce) NB. To help Alice match this response with the correct handshake, as necessary to support concurrent handshakes, Alice also sends an identifier with her handshake message, which Bob returns with his response. Or, as we show in Figure 5.3, Alice only includes her nonce NA, and Bob simply attaches Alice's nonce NA to his handshake message (second flow). Upon receiving Bob's response, Alice validates that the response contains the correct function Ek(NA) of the nonce that she previously selected and sent. If so, Alice concludes that she is indeed communicating with Bob. Alice then completes the Mutual Entity Authentication by sending her own 'proof of participation' Ek(NB). Alice may also include NB, to help Bob match the response with the correct handshake, similarly to the inclusion of NA by Bob; or, a different identifier may be used. Finally, Bob similarly validates that he received the expected function Ek(NB) of his randomly selected nonce NB, and concludes that this Mutual Entity Authentication was initiated, and successfully completed, by Alice. Upon receiving the expected responses, both parties, Alice and Bob, signal to their applications that the Mutual Entity Authentication has completed successfully. The response expected by Alice is Ek(NA), and the response expected by Bob is Ek(NB).

SNA Mutual Entity Authentication ensures sequential, but not concurrent, mutual authentication. The simple SNA Mutual Entity Authentication of Figure 5.3 ensures mutual authentication, but only if restricted to sequential Mutual Entity Authentication; the protocol is vulnerable when concurrent Mutual Entity Authentication is allowed. The attack is illustrated in Figure 5.4. Let us first explain why the protocol ensures mutual authentication when restricted to sequential handshakes. Suppose, first, that Alice completes the protocol successfully, i.e., Alice received the expected second flow, Ek(NA). Assume that this happened without Bob previously receiving NA as the first flow from Alice (and sending Ek(NA) back). Due to the sequential restriction, Alice surely did not receive NA as a challenge in the time since she sent NA, and hence did not compute and send Ek(NA) herself. Since Alice selected NA randomly, from a sufficiently large set (i.e., NA is sufficiently long), it is unlikely that either Alice or Bob has received NA before Alice completed the protocol.
Hence, the adversary must have computed Ek (NA ) rather than intercepted it. However, if the adversary can compute Ek (NA ) then the adversary can distinguish between Ek and a random permutation, contradicting the PRP assumption for E. Note that an eavesdropping attacker may collect such pairs (NA , Ek (NA ) or NB , Ek (NB )), however, since NA , NB are quite long strings (e.g., 64 bits), the probability of such re-use of the same NA , NB is negligible. However, the SNA Mutual Entity Authentication fails to ensure concurrent mutual authentication; namely, if the parties are willing to run two concurrent Mutual Entity Authentication handshakes, then an attacker can cause a party to complete the protocol ‘successfully’, i.e., thinking it has communicated with its peer, while in reality the peer did not receive any message. For example, Figure 5.4 illustrates an attack where Alice initiates one session with Bob, which is actually intercepted by an attacker. The attacker, impersonating as Bob, initiates another session with Alice. Both sessions terminate correctly, i.e., in both, Alice is tricked into believing that it has successfully interacted with Bob; however, in reality, Bob was never involved. Notice that the attack of Figure 5.4 requires that Alice would agree to act both as an initiator of a session and as a responder of a session. One may hope that SNA may be secure for concurrent Mutual Entity Authentication, if each party is only willing to act in one role (initiator or responder). However, this is not the case; see Exercise 5.8. Applied Introduction to Cryptography and Cybersecurity 5.2. SHARED-KEY ENTITY AUTHENTICATION PROTOCOLS 279 A, NA B, NA NA , NA′ , Ek (NA ) NA , NA′ , Ek (NA ) NA , Ek (NA′ ) Nurse Alice NA , Ek (NA′ ) Attacker Figure 5.4: Attack on SNA Mutual Entity Authentication with Alice initiating one session with Bob, which is actually intercepted by an Attacker; ŕows related to this session are marked in black and ‘regular’ arrows. In the attack, the Attacker, impersonating as Bob, initiates another session with Alice; ŕows related to this (second) session are marked in red and with dashed arrows. Both sessions terminate correctly, i.e., Alice would believe that it has successfully interacted with Bob in both sessions, while in reality, Bob was never involved. Notice that this attack requires that Alice would agree to act both as an initiator and as a responder of sessions. In practice, SNA allows concurrent Mutual Entity Authentication handshakes - and for good reasons, as motivated earlier (at the end of subsection 5.2.1). To ensure security, the Mutual Entity Authentication protocol of SNA was changed into the 2PP protocol, which we present in the following subsection. 5.2.3 Authentication Protocol Design Principles Before we present secure authentication protocols, it is useful to identify some weaknesses of the SNA protocol, which were exploited in the attacks on it. This allows us to derive design principles for different protocols which involve authentication. Identify sender and recipient. The SNA protocol allowed redirection: giving to Bob a value from a message which was originally sent - in this case, by Bob himself - to a different entity (Alice). This motivates the following design principle: in each message, identify the sender, the recipient, or, best, identify both of them. Authentication is usually easy and efficient, e.g., by adding an appropriate identiőer(s) to information being authenticated. 
Another solution may be to use independently-pseudorandom keys for sending messages by the two parties.

Authenticate the handshake identifier. The SNA attack sent to Bob (part of) a message sent (by Bob) to Alice, during a different handshake. This motivates the design principle of authenticating the handshake identifier.

Authenticate flow number and initiator/responder bit. The attack sent, in the third protocol flow (the second message from the initiator), the authenticator received in the second protocol flow (the first message from the responder). This motivates us to authenticate the flow number (e.g., second vs. third) and a bit indicating whether the authenticator is sent by the initiator or by the responder.

Authenticate using MAC or signatures. The SNA Mutual Entity Authentication protocol was designed to ensure authentication; however, the (only) cryptographic function it uses is a block cipher. Instead, an authenticating protocol should use an authentication function such as a MAC or signatures. Admittedly, by the Switching lemma (Lemma 2.2), a block cipher, i.e., a PRP, is also a PRF, and by the PRF-is-a-MAC lemma (Lemma 4.1), every PRF is also a MAC. However, the principle still stands: authenticating protocols should use authentication functions.

Never provide an oracle. The SNA protocol applied the block cipher to input - the nonces - received entirely from the network, i.e., fully controlled by a MitM adversary, and returned the result of applying the block cipher to this input. Namely, this provides the adversary with an 'oracle' to the cryptographic function - in this case, the block cipher (PRP) - which often causes a vulnerability. A simple defense to avoid providing such an 'oracle' is to include some random input(s) before applying the cryptographic function, in addition to the adversary-provided inputs. This defense works in many scenarios, including Mutual Entity Authentication protocols.

Adopting even a subset of these design principles would have sufficed to prevent the attack of Figure 5.4. The 2PP protocol, described next, is essentially the result of applying these principles to address the vulnerabilities of the SNA protocol.

5.2.4 Secure Mutual Entity Authentication with the 2PP protocol

In this subsection we present 2PP, a secure two-party shared-key Mutual Entity Authentication protocol; the name 2PP simply stands for two party protocol. The 2PP protocol, presented in [61], was the replacement for the insecure SNA Mutual Entity Authentication protocol. The flows of the 2PP Mutual Entity Authentication protocol are presented in Figure 5.5: (1) A, NA; (2) NA, NB, MACk(2 ++ 'A ← B' ++ NA ++ NB); (3) NB, MACk(3 ++ 'A → B' ++ NA ++ NB).

Figure 5.5: The 2PP Mutual Entity Authentication protocol.

As in the SNA Mutual Entity Authentication protocol, the values NA and NB are n-bit nonces, where n is the security parameter - typically, the length of the shared key k. The nonces NA, NB are selected randomly, by Alice (initiator) and Bob (responder), respectively. The protocol, at both parties, outputs Accept(i) once it correctly authenticates the last flow from the peer for handshake i (the second flow for the initiator, the third flow for the responder). By validating this last flow, 2PP ensures mutual authentication.
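As an illustration of the flows of Figure 5.5, the following minimal sketch computes the two 2PP authenticators, binding the flow number, the direction and both nonces. It is only a sketch: HMAC-SHA256 is our assumed choice of MAC, the ASCII labels are illustrative, and a real implementation would also use an unambiguous (e.g., length-prefixed) encoding of the concatenated fields.

import hmac, hashlib, secrets

def mac(k, msg):
    # Any secure MAC works; we use HMAC-SHA256 as a concrete (assumed) choice.
    return hmac.new(k, msg, hashlib.sha256).digest()

k = secrets.token_bytes(32)               # shared key
N_A = secrets.token_bytes(16)             # initiator's nonce
N_B = secrets.token_bytes(16)             # responder's nonce

# Flow 1 (Alice -> Bob): A, N_A  -- carries no authenticator.
# Flow 2 (Bob -> Alice): N_A, N_B and a MAC binding flow number 2,
#                        direction 'A<-B' and both nonces.
tag2 = mac(k, b"2" + b"A<-B" + N_A + N_B)
# Flow 3 (Alice -> Bob): N_B and a MAC binding flow number 3,
#                        direction 'A->B' and both nonces.
tag3 = mac(k, b"3" + b"A->B" + N_A + N_B)

# Alice accepts after verifying tag2; Bob accepts after verifying tag3.
assert hmac.compare_digest(tag2, mac(k, b"2" + b"A<-B" + N_A + N_B))
assert hmac.compare_digest(tag3, mac(k, b"3" + b"A->B" + N_A + N_B))
print("2PP-style handshake accepted by both parties")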
We will not prove the security of 2PP, but let us give an intuitive argument for why it ensures mutual authentication, i.e., authentication of both responder and initiator. Note that the security of 2PP requires the MAC scheme to be secure, and requires that the key k and the nonces are 'sufficiently long', i.e., at least as long as the security parameter l.

Security of 2PP: responder authentication. Consider an execution in which the initiator (Alice in Figure 5.5) 'accepts', although Bob is not responding. In 2PP, Alice 'accepts' (only) upon receiving MACk(2 ++ 'A ← B' ++ NA ++ NB). In 2PP, a party authenticates a message beginning with 2 only when sending a response, and according to the identifiers, Alice would never send this particular value. If Bob sent it after Alice began this handshake, then responder authentication holds. If Bob computed it before Alice began this handshake, then Alice randomly selected the same NA as was previously received by Bob, which occurs only with negligible probability, since NA and NB are of the same length as the security parameter l. There remains the possibility that Bob also did not compute MACk(2 ++ 'A ← B' ++ NA ++ NB). In this case, the adversary somehow found this value 'on its own', i.e., neither Alice nor Bob computed it. Such an adversary can produce a valid MAC on a message that was never MACed by an 'oracle' knowing the MAC key; this contradicts the assumption that the MAC scheme used is secure. Hence, 2PP seems to ensure responder authentication.

Security of 2PP: initiator authentication. The argument for initiator authentication is very similar; we leave it to the reader to work it out.

5.3 Authenticated Request-Response Protocols

Authenticated request-response protocols extend mutual authentication protocols such as 2PP: not only do they authenticate the entities, they further authenticate the exchange of a request and the corresponding response message between the two parties. More precisely, they ensure the following properties:

Request authentication: every request received by a party was sent by its peer.

Response authentication: every response received by a party was sent by its peer, in response to a request sent by the party.

No replay: every request/response is received by a party at most the number of times it was sent by the peer.

Figure 5.6: Interactions for Authenticated Request-Response Protocols, illustrated for Alice; Bob has the same interfaces. Each party may send either a request or a response (to a previously received request).

Let us first describe the interactions of authenticated request-response protocols, over the different interfaces (APP, NET and SYS). These are illustrated in Figure 5.6. Notice that the interactions on the NET and SYS interfaces are exactly the same as in Figure 5.2 and Figure 5.1. It only remains to explain the interactions on the APP interface, which are:

send-req(r, i): the application instructs the protocol to send request r to the peer, using the identifier i to identify the response (if and when it arrives). The identifier should be unique, i.e., different from previous identifiers used by the application in send-req(r, i) interactions, e.g., a sequential number of this send-req interaction (in this entity).
rcv-req(r, i): the protocol delivers an incoming request r from the peer, specifying the identifier i to be used by the application if and when it provides the corresponding response. The identifier i should be unique, i.e., different from previous identifiers used by the protocol in rcv-req(r, i) events. Note that this identifier is unrelated to those used to identify send-req(r, i) interactions.

send-resp(r, i): the application instructs the protocol to send to the peer a response, r. The response r is to a previously-received request, which was identified by i; a send-resp(r, i) may occur only if a rcv-req(r′, i) occurred earlier. The previous rcv-req(r′, i) must have the same identifier i, but the request r′ is usually different from the response, i.e., usually r′ ≠ r.

rcv-resp(r, i): the protocol delivers a response r from the peer, specifying identifier i; this must be the same identifier as in a previous send-req(r′, i) interaction in this entity. Note that the value r of the response is typically different from the value r′ of the corresponding request.

5.3.1 Summary of request/response protocols

We discuss four authenticated request/response (RR) protocols:

2PP-RR: this protocol extends 2PP by also authenticating a request message from the responder (e.g., Bob) and the corresponding response message from the initiator (e.g., Alice). Note the 'reverse' role of the parties: the 'initiator' is the party sending the response, while the 'responder' is the party sending the request. This is a bit confusing, and worse, it is often inconvenient for applications.

2RT-RR: this Request-Response protocol requires two round-trips (hence the name, 2RT-RR). It allows the initiator to send a query and the responder to respond, which is often the required usage pattern; however, as mentioned, it requires two round trips, i.e., four flows.

Counter-based-RR: this protocol authenticates a session rather than just a request-response pair, and requires only one flow per message. However, it requires both entities to maintain persistent state (a counter).

Time-based-RR: this protocol authenticates a request message and the corresponding response message, using only the minimal two flows: one for the request and one for the response. It requires both parties to have synchronized clocks. The protocol can handle (bounded) latency and clock drift, but this requires the responder to maintain persistent state for a limited time.

We compare these four types of authenticated request-response protocols in Table 5.1. We notice that one of the most important differences among these protocols is the number of flows, which determines the number of round trip times required by the protocol. This number of round trip times is an important consideration in many practical scenarios; let us explain why.

Protocol          | Figure | Section | Flows | Requirements/drawbacks
2PP-RR            | 5.7    | 5.3.2   | 3     | Initiate periodically?
2RT-2PP-RR        | 5.8    | 5.3.3   | 4     | Two round-trips
Counter-based-RR  | 5.9    | 5.3.4   | 2     | Persistent state, one round trip
Time-based-RR     | 5.10   | 5.3.5   | 2     | Synchronization, one round trip

Table 5.1: Authenticated Request-Response (RR) protocols.

Overall delay is dominated by the number of round-trips. The number of round-trips is one of the most important attributes of request-response protocols, and is usually the dominant factor impacting the overall delay. As shown in Table 5.1, the Time-based-RR and the Counter-based-RR require a single round-trip to receive the response; in contrast, the 2RT-2PP-RR requires two round trips. The 2PP-RR is an exception, as it requires 'role reversal': the protocol is initiated by the entity producing the response; but how would this entity know when to initiate the protocol? This seems to require this entity to initiate the protocol periodically (synchronously), which makes it inappropriate for many applications, where the request should be sent asynchronously, as needed, rather than periodically.

The reason for focusing on the number of round trips is that in many applications and scenarios, the number of round trips is the dominant factor impacting overall delay. This is because, often, the round trip time (RTT) is the largest factor contributing to the delay. To emphasize this, let us state it as Fact 5.1 and briefly explain it.

Fact 5.1 (Round trip time (RTT) values are significant). Typical round-trip times, i.e., the delay (latency) from sending a short packet over the Internet until receiving a response, can be quite significant. For example, [289] reports RTT values which are mostly below 200msec. However, in some scenarios, e.g., under a bandwidth denial-of-service (BW-DoS) attack, it can be as high as one second and even more. In particular, when using geostationary satellite communication, the typical delay is around 550 to 600msec. These values are independent of the bandwidth, i.e., they hold also when using high bandwidth connections (fast transmission rates). While network bandwidth has dramatically increased over the years, round trip times did not decrease nearly as much.

Round trips | Protocol(s)   | Bandwidth          | Delay
1           | Time-based RR | 1MB/s (8Mb/s)      | 150msec
1           | Time-based RR | 100MB/s (800Mb/s)  | 100.5msec
2           | 2RT-2PP-RR    | 1MB/s (8Mb/s)      | 250msec
2           | 2RT-2PP-RR    | 100MB/s (800Mb/s)  | 200.5msec

Table 5.2: Comparing the impact of transmission rates to the impact of the number of round-trips required by a protocol, assuming a typical RTT (round-trip time) of 100msec, and assuming overall transmission of 50KByte.

To illustrate the importance of minimizing round-trips, Table 5.2 compares the overall delays in typical request-response scenarios when using protocols requiring one vs. two round trips, and when using a lower bandwidth of 1MB/s vs. a higher bandwidth of 100MB/s. Clearly, the transmission time is negligible compared to the round-trip time; the delay is dominated by the number of round-trips and the round-trip delay. Indeed, TLS 1.3, the updated version of the TLS protocol, has a significantly modified design which allows it to minimize the number of round trips and therefore the delay due to round-trip times; see Section 7.6.

Note that Fact 5.1 may be a bit obscured in most of the sequence diagrams in this text since, for simplicity, compactness and clarity, we use horizontal arrows for transmissions, i.e., the diagrams do not indicate the network latency of messages. We also usually do not show other delays. Most sequence diagrams in the literature use the same style.

Figure 5.7: The 2PP-RR Authenticated Request-Response Protocol: a three-flow nonce-based authenticated Request-Response protocol, based on 2PP. Flows: (1) A, NA; (2) req, NB, MACk(2 ++ 'A ← B' ++ NA ++ NB ++ req); (3) resp, MACk(3 ++ 'A → B' ++ NA ++ NB ++ resp).
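To make the arithmetic behind Table 5.2 above explicit, the following short computation derives the four delay values from the table's assumptions (100msec RTT, 50KByte overall transmission); it is only a back-of-the-envelope model, ignoring queuing and processing delays.

# Reproducing the delays of Table 5.2: total delay ~ (round trips) * RTT
# plus the transmission time of the 50KByte payload at the given bandwidth.

RTT = 0.100            # seconds (the typical 100msec round-trip time)
PAYLOAD = 50_000       # bytes (50KByte overall transmission)

def total_delay(round_trips, bandwidth_bytes_per_sec):
    return round_trips * RTT + PAYLOAD / bandwidth_bytes_per_sec

for rt, bw, label in [(1, 1_000_000,   "1MB/s"),
                      (1, 100_000_000, "100MB/s"),
                      (2, 1_000_000,   "1MB/s"),
                      (2, 100_000_000, "100MB/s")]:
    print(f"{rt} round trip(s), {label}: {total_delay(rt, bw)*1000:.1f} msec")

# Output: 150.0, 100.5, 250.0 and 200.5 msec -- matching Table 5.2, and showing
# that the round-trip count, not the bandwidth, dominates the overall delay.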
5.3.2 The 2PP-RR Authenticated Request-Response Protocol We őrst discuss 2PP-RR, a three ŕow nonce-based authenticated RequestResponse protocol, which is a minor extension to 2PP. The 2PP-RR protocol is illustrated in Figure 5.7. In fact, the only change compared to the 2PP protocol (Figure 5.5), is the addition of the request (req) from responder to initiator, and of the response (resp) from initiator to responder, to the second and third ŕows, respectively. The 2PP-RR protocol is simple and not too difficult to prove secure, by a reduction to the security of the underlying MAC function. Namely, suppose that we know an efficient algorithm (adversary) M which shows that 2PP-RR does not meet the deőnition of a secure authenticated request-response protocol. The reduction shows how, given such M, we can efficiently compute the value of the MAC function on some input, without knowing the key. This contradicts the assumption that we use a secure MAC. Namely, if we use a secure MAC, then 2PP-RR is secure. The 2PP-RR protocol has, however, a signiőcant drawback, which makes it ill-suited for many applications. Speciőcally, in this protocol, the request is sent by the responder, and the initiator sends the response. In most applications, it makes sense for a party to initiate the protocol when it needs to make some request, rather than to wait for the initiator to contact it and only then, as a responder, send the response. The next protocol is a different adaptation of 2PP which avoids this drawback - but requires four ŕows, i.e., two full round trips. Applied Introduction to Cryptography and Cybersecurity CHAPTER 5. SHARED-KEY PROTOCOLS 286 A, NA NB req, M ACk (3 + + ‘A → B’ + + NA + + NB + + req) resp, M ACk (4 + + ‘A ← B’ + + NA + + NB + + resp) Nurse Alice Bob Figure 5.8: 2RT-2PP RR: a two-round-trips Authenticated Request-Response protocol 5.3.3 2RT-2PP Authenticated Request-Response protocol In Figure 5.8 we present 2RT-2PP RR, another authenticated request-response protocol based on 2PP. As the name implies, the 2RT-2PP RR protocol requires four flows, i.e., two round-trips; this is a signiőcant drawback. However, 2RT2PP improves upon 2PP-RR in that it authenticates a request from the initiator, and the corresponding response to it from the responder. The 2RT-2PP Request-Response protocol involves two simple extensions of the basic 2PP protocol. The őrst extension is the transmission and authentication of the request and response, similarly to their addition in 2PP-RR. The second extension is an additional (fourth) ŕow, from the responder back to the initiator, which carries the response of the responder to the request from the initiator. In a sense, 2RT-2PP ‘splits’ the contents of the second ŕow of the 2PP-RR. In 2RT-2PP, these contents are split between the second ŕow (providing the nonce NB ) and the fourth ŕow (providing the authenticated response). 5.3.4 The Counter-based RR Authenticated Request-Response protocol In Figure 5.9 we present the Counter-based RR Authenticated Request-Response protocol. In contrast to the 2PP protocols, this protocol requires only one round trip - sending the (authenticated) request and receiving the (authenticated) response. However, to prevent replay of previously-sent requests, in only one round-trip, this protocol requires both parties to maintain a synchronized counter. 
The challenge for the Counter-based RR protocol, as well as for the timebased RR protocol of the next subsection, is for the responder to verify the freshness of the request, i.e., that the request is not a replay of a request already received in the past. Freshness also implies no reordering; for example, a responder, say Bob, should reject request x from Alice, if Bob already received request x or a later-sent request x′ from Alice. Freshness prevents an attacker from replaying information from previous exchanges. For example, consider the request-response authentication of Figure 5.8; if NB is removed (or őxed), then an eavesdropper to the ŕows between Alice and Bob in one request-response Applied Introduction to Cryptography and Cybersecurity 5.3. AUTHENTICATED REQUEST-RESPONSE PROTOCOLS 287 req, iA , M ACk (1 + + ‘A → B’ + + iA + + req) If iA ̸= iB + 1: ignore Else; iB ← iB + 1 resp, iB , M ACk (2 + + ‘A ← B’ + + iB + + resp) Nurse Alice Bob Accept if iA = iB Figure 5.9: The Counter-based RR Authenticated Request-Response protocol session can copy these ŕows and cause Bob to process the same request again. For some requests, e.g., Transfer $100 from my account to Eve, this can be a concern. To ensure freshness without requiring the extra ŕows, one may use state, as in this subsection, or synchronization, as in the next subsection. Speciőcally, the counter-based RR protocol of Figure 5.9 requires both parties to maintain a counter; we use iA to denote the counter kept by Alice, and iB to denote the counter kept by Bob. Alice’s counter iA represents the number of queries that Alice sent, and Bob’s counter iB represents the number of responses that Bob sent; hence, both are initialized to zero. The protocol maintains these two counters synchronized, in the sense that at any time holds: iB ≤ iA ≤ iB + 1. Note that this design implies that this protocol does not allow concurrent transmission of requests. Furthermore, the protocol does not provide retransmissions or any other mechanisms to handle message-losses or corruptions; any such loss or corruption is likely to prevent any further query/response. However, it is not too difficult to extend the protocol to handle such issues, in particular, to allow concurrent requests and responses. Exercise 5.1. Extend the protocol of Figure 5.9, to allow Alice to send concurrent requests to Bob; allow Bob to respond to requests, even when they are received out-of-order. 5.3.5 Time-based RR Authenticated Request-Response protocol Figure 5.10 presents another alternative single-round Authenticated RequestResponse protocol; this variant allows the Initiator (e.g., Alice) to be stateless, and also limits the time that the responder (e.g., Bob) must keep state. Instead of relying on a counter maintained, in synchronized way, by the two parties, the protocol of Figure 5.10 relies on the use of time and on two synchronization assumptions, speciőcally, bounded delay and bounded clock skew. Applied Introduction to Cryptography and Cybersecurity CHAPTER 5. SHARED-KEY PROTOCOLS 288 TA ← clkA (·) req, TA , M ACk (1 + + ‘A → B’ + + TA + + req) req is valid if TA is larger than before, and TA ≥ clkB (·) − ∆. Nurse Alice resp, M ACk (2 + + ‘A ← B’ + + TA + + resp) resp is valid if received within 2∆, and with correct TA . Bob Figure 5.10: Time-based Authenticated Request-Response protocol, using a bound ∆ on the maximal delay plus maximal clock bias. 
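Before turning to the details of the time-based variant, here is a minimal sketch of the counter discipline of the counter-based protocol of Figure 5.9. It is only an illustration: the MAC computation and verification are as in the earlier sketches and omitted here, as is Alice's check that the returned counter equals her own.

# Sketch of the counter discipline in the counter-based RR protocol (Figure 5.9).
# Both counters start at 0; Alice counts requests she sent, Bob counts responses
# he sent, so at all times iB <= iA <= iB + 1.

class AliceCtr:
    def __init__(self):
        self.iA = 0
    def send_request(self, req):
        self.iA += 1                      # each new request gets the next counter value
        return (req, self.iA)

class BobCtr:
    def __init__(self):
        self.iB = 0
    def receive_request(self, req, iA):
        if iA != self.iB + 1:             # stale or out-of-order request: ignore
            return None
        self.iB += 1                      # accept; counters are back in sync
        return ("resp to " + req, self.iB)

alice, bob = AliceCtr(), BobCtr()
msg = alice.send_request("query 1")
print(bob.receive_request(*msg))          # ('resp to query 1', 1)
print(bob.receive_request(*msg))          # None: a replay of the same request is rejected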
We use clkA (·) to denote the time according to the local clock of Alice upon sending req, and clkB (·) for Bob’s clock upon receiving req. Alice sets TA ← clkA (·) when she sends the request, and authenticates it with the request. Bob uses TA to validate that the request is fresh, using the bound ∆, and ensuring TA is larger than previously received TA values. Bounded delay assumption. Let ∆delay ≥ 0 denote a bound on the maximal delay. Namely, if one party sends a message at time t, then this message is received by t + ∆delay or earlier. Bounded clock skew. Let ∆skew ≥ 0 denote a bound on the maximal clock skew, i.e., the maximal difference between the values of the clocks of two entities at any given time. Let clkA (t) (clkB (t)) denote the value of the clock at Alice (respectively, Bob) at time t; then we have: clkA (t) − ∆skew ≤ clkB (t) ≤ clkA (t) + ∆skew (5.1) The protocol is illustrated in Figure 5.10, with Alice sending the request and Bob responding. For simplicity, we use a combined bound: ∆ ≡ ∆skew + ∆delay , and the notation clkA (·), clkB (·) for the value of clkA (respectively, clkB ) at the time Alice sends (Bob receives) the req message. The protocol at Bob conőrms the received request req is valid, as follows: No modification: compare the received MAC value to the MAC computed with the correct inputs. req is a request from Alice to Bob: The fact that the input to the MAC begins with 1 + + ‘A → B’ ensures this is a request (őrst ŕow) from Alice to Bob. No replay: Bob validates that the received value of TA is larger than the largest previously received value of TA . Freshness (acceptable delay): Bob validates that the received TA is within ∆ from his own clock clkB (·) at the time the req is received. Applied Introduction to Cryptography and Cybersecurity 5.4. SHARED-KEY KEY EXCHANGE PROTOCOLS 289 When Alice receives the response resp, she similarly conőrms it is valid, as follows: No modification: Alice compares the received MAC value to the MAC computed with the correct inputs. resp is a response from Bob to Alice: The fact that the input to the MAC begins with 2 + + ‘A ← B’ ensures this is a response (second ŕow) from Bob to Alice. resp is a response to request req, and not a replay: the value of TA increments whenever Alice sends a new request, and therefore Alice sent only one request with this value of TA ; furthermore, this cannot be a replay of previous response from Bob, since no previous response would use this TA . Freshness (acceptable delay): Alice validates that the response is received within at most3 2 · ∆delay . Note that both Alice and Bob may discard their state (the sA .TA and sB .TB variables) after some time, which the reader may compute. Exercise 5.2. Show that Alice and Bob do not need to keep forever the stored values of TA ; at what time may each of them free up this storage? Like the counter-based protocol, we also presented the time-based protocol allowing only one query-response at any given time, and assuming reliable communication. However, it is not too difficult to extend it to allow multiple concurrent requests and responses, and handle unreliable communication. Exercise 5.3. Extend the protocol of Figure 5.10, to allow Alice to send up to three concurrent requests to Bob; allow for receiving and responding to requests out-of-order. 
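A minimal sketch of the responder's freshness checks in the time-based protocol of Figure 5.10 follows; it covers only the replay and delay/skew checks (MAC verification is as in the earlier sketches), and the value of the combined bound ∆ is illustrative.

import time

DELTA = 1.0   # combined bound: max clock skew plus max delay (seconds); illustrative

class TimeBasedResponder:
    """Bob's freshness checks for the time-based RR protocol of Figure 5.10."""
    def __init__(self):
        self.last_TA = float("-inf")       # largest T_A accepted so far

    def check_request(self, T_A, now=None):
        now = time.time() if now is None else now
        if T_A <= self.last_TA:            # replay / reordering check
            return False
        if T_A < now - DELTA:              # too old: outside the skew+delay bound
            return False
        self.last_TA = T_A                 # remember, to reject later replays
        return True

bob = TimeBasedResponder()
T_A = time.time()                          # Alice stamps the request with her clock
print(bob.check_request(T_A))              # True: fresh request accepted
print(bob.check_request(T_A))              # False: replay of the same request rejected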
5.4 Shared-key Key Exchange Protocols Not all communication follows the request-response pattern, and even when it does, many times we may prefer to secure the communication using a session/record protocol, which secures arbitrary interactions, using a shared key, such as the simpliőed protocol in subsection 5.1.2, or the TLS record protocol in Chapter 7. In principle, the parties could simply use the same shared secret keys for all sessions between them. However, as already stated in Principle 4, it is desirable to limit the use of secret keys. The goal of shared-key Key Exchange protocols is to setup separate session keys {kiS } for each session i, in such a way that exposure of session key kiS will not expose other session keys. In this section 3 For simplicity, Figure 5.10 specifies validation of 2∆, which is also fine but a bit unnecessarily lax. Applied Introduction to Cryptography and Cybersecurity 290 CHAPTER 5. SHARED-KEY PROTOCOLS we focus on two-party shared-key Key Exchange protocols, which derive all session keys from one őxed key k, to which we refer as the master key and/or as the long-term key. These protocols should be contrasted from public-key key exchange protocols, which we study in subsection 6.1.3, and which are often used to establish the shared master key. In Chapter 7 we study the widely-used SSL and TLS protocols, which combine public-key key exchange to establish a shared master key, and shared-key Key Exchange to derive session keys from the master key. The use of session keys kiS securely derived from the master key k has multiple security beneőts: 1. By changing the key periodically, we reduce the number of ciphertext messages, encrypted using the same key, available to the cryptanalyst. This usually makes cryptanalysis harder, and possibly infeasible, compared to the use of a őxed key, which allows the adversary to collect plenty of ciphertext messages, which often makes the attack easier, as we have seen in Chapter 2. We also reduce the amount of plaintext exposed if the cryptanalyst succeeds in őnding a key, thereby reducing the amortized ‘return on investment’ for cryptanalysis. 2. Keys may also be exposed by hacking attacks; an attacker can use an exposed key until it is changed. By changing session keys periodically, and making sure that each of the session keys remains secret (pseudo-random) even if other session keys are exposed, we limit or reduce the damages due to exposure of some of the keys. 3. The separation between session keys and master key allows some or all security to be preserved even after attacks which expose the entire storage of a device. One way to achieve this is when the master key is conőned to a separate Hardware Security Module (HSM), protecting it even when the computer storage - except the internal storage of the HSM - is exposed. We also discuss ways to ensure or restore security following exposure, even without an HSM. 4. Finally, most session/record protocols, including the one in subsection 5.1.2 and the TLS record protocol in Chapter 7, rely on persistent counters kept by the peers; however, counters are often reset, or may otherwise get out of sync. Key Exchange protocols are an extension to mutually-authenticating protocols, with the same functions, inputs and outputs. There is only one more output of the protocol: the session key kiS . This key is provided to the session/record protocol running in each of the peers. 
We say that a two-party shared-key Key Exchange protocol ensures secure key-setup if it ensures the following two requirements. Key agreement: if both parties complete successfully, then they both output the same key kiS . Applied Introduction to Cryptography and Cybersecurity 5.4. SHARED-KEY KEY EXCHANGE PROTOCOLS 291 A, NA,i + ‘A ← B’ + + NA,i + + NB,i ) NA,i , NB,i , P RFkM (2 + + ‘A → B’ + + NA,i + + NB,i ) NB,i , P RFkM (3 + Nurse Alice Bob + NB,i ) kiS = P RFkM (NA,i + + NB,i ) kiS = P RFkM (NA,i + Figure 5.11: The 2PP Key Exchange protocol, shown generating ith session key, kiS . Key secrecy: each session key kiS is secret, or more precisely, pseudo-random, i.e., indistinguishable from a random string of same length, even if the adversary is given all the other session keys. This implies that the master key k must remain pseudorandom, even if all session keys {kiS } are given to the adversary. 5.4.1 The Key Exchange extension of 2PP In this subsection we discuss a simple extension to the 2PP protocol, which ensures secure key-setup. This is achieved by outputting the session key kiS as: kiS = P RFk (NA,i + + NB,i ) (5.2) In Equation 5.2, NA,i and NB,i are the nonces exchanged in the ith session of the protocol, and kiS is the derived ith session key. We use k M to denote the master (long-term) shared secret key, provided to both parties during initialization. The protocol is illustrated in Figure 5.11. Since both parties compute the session key kiS in the same way from NA,i + + NB,i and the master key kiM , it follows that they will receive the same key, i.e., the Key Exchange 2PP extension ensures key agreement. Since the session keys are computed using a pseudorandom function, kiS = P RFk (NA,i + + NB,i ), it follows that the key of each session is pseudo-random, even given all other session keys. Namely, the Key Exchange 2PP extension ensures secure key setup. Notice that there is another, seemingly unrelated change between the Mutual Entity Authentication 2PP (Figure 5.5) and the Key Exchange 2PP (Figure 5.11) protocols, namely, the use of P RF instead of M AC to authenticate the messages in the protocol. This change is needed to avoid using the same key in two different cryptographic schemes (MAC and PRF), which could, at least in some ‘absurd’ scenarios, be insecure. The change is also allowed, since every PRF is also a MAC. Applied Introduction to Cryptography and Cybersecurity CHAPTER 5. SHARED-KEY PROTOCOLS 292 5.4.2 Deriving Per-Goal Keys Following the key separation principle (Principle 7), session protocols often use two separate keyed cryptographic functions, one for encryption and one for authentication (MAC); the key used for each of the two goals should be pseudo-random, even given the key to the other goal. We refer to such keys are per-goal keys. The next exercise explains how we can use a single shared session key, from the 2PP or another key-setup protocol, to derive multiple per-goal session keys. Exercise 5.4 (Per-goal keys). 1. Show why it is necessary to use separately pseudorandom keys for encryption and for authentication (MAC), i.e., per-goal keys. S for encryption and 2. Show how to securely derive one session key kE,i S another session key kA,i for authentication, both from the same session S key kiS , yet each key (e.g., kE,i ) is pseudo-random even given the other S key (resp., kA,i ). Your solution may use any cryptographic scheme or function that we learned - your choice! Explain the security of your solutions. Hints for solutions: 1. 
See Exercise 4.18. S S 2. One simple solution is kE,i = P RFkiS (‘E’), kA,i = P RFkiS (‘A’), where ‘E’, ‘A’ are just two separate inputs to the PRF, ensuring two independentlypseudorandom session keys. To further improve the security of the session/record protocol, we may use two separate pairs of per-goal keys, depending on the direction: one pair A→B A→B (kE,i , kA,i ) for (encryption, authentication) of messages from Alice to Bob, B→A B→A and another pair (kE,i , kA,i ) for (encryption, authentication) of messages from Bob to Alice. Exercise 5.5. 1. How may the use of separate, pseudo-random pairs of per-goal keys for the two ‘directions’ improve security? 2. Show how to securely derive all four keys (both pairs) from the same session key kiS . 3. Show a modification of the Key Exchange 2PP extension, which securely derives all four keys (both pairs) ‘directly’, a bit more efficiently than by deriving them from kiS . Explain the security of your solutions. Applied Introduction to Cryptography and Cybersecurity 5.5. KEY DISTRIBUTION CENTER PROTOCOLS 5.5 293 Key Distribution Center Protocols In this section, we expand a bit beyond our focus on two party protocols, to brieŕy discuss three-party shared-key, Key Distribution Protocols. In general, key distribution protocols establish a shared key between two or more entities. We focus on Key Distribution Protocols which use only symmetric cryptography (shared keys), and involve only three parties: Alice, Bob - and a trusted third party (TTP), often referred to as the Key Distribution Center (KDC); the use of the term KDC is so common, that these protocols are often referred to as KDC protocols. The goal of the protocol - and of the KDC - is to establish a shared key between the other parties. There are many types of Key Distribution Protocols. We present two important and very different protocols, both simpliőed versions of practical protocols: the Kerberos protocol [299], which is the most widely-known and widely-used KDC protocol for computer networks, in subsection 5.5.1, and the GSM protocol [26], which is the őrst widely-deployed cellular communication protocol and still supported by essentially all existing mobile devices and networks, in Section 5.6. Due to the unique nature of GSM, we describe it separately, in the next section. The Kerberos and GSM protocols are very different. They even differ in their assumptions: in Kerberos, the KDC shares a key with each of two parties, while in GSM, the TTP shares a key only with one party, the client (e.g., Alice), and is assumed to have secure connection to the other party (e.g., Bob). Another important difference is that the Kerberos protocol is secure, while the GSM protocol is notoriously insecure - and will provide good opportunity to introduce important attack techniques. These attacks are practical and well known; there are even products that perform these and other attacks on GSM, and on some other cellular-communication protocols. 5.5.1 The Kerberos Key Distribution Protocol In this subsection we present a simpliőed version of the Kerberos [299] keydistribution protocol. Kerberos [299] is the most widely known and deployed shared-key system for authorization and authentication in computed networks and distributed systems. In Kerberos, as in many other KDC protocols, the KDC shares a key with each party: kA with Alice and kB with Bob; using these keys, the KDC helps Alice and Bob to share a symmetric key kAB between them. 
In fact, the term KDC protocol is usually used for protocols following this model. The (simpliőed) protocol is shown in Figure 5.12. The process essentially consists of two exchanges, both resembling the Time-based Authenticated Request-Response (Figure 5.10). The őrst exchange is between Alice and the KDC. In this exchange, the KDC sends to Alice the key kAB that will be shared between Alice and Bob, encrypted (cA ) and authenticated (mA ). In addition, Alice receives the pair cB , mB , which also consist of encryption and authentication of kAB , but this Applied Introduction to Cryptography and Cybersecurity CHAPTER 5. SHARED-KEY PROTOCOLS 294 Alice KDC Bob ‘Bob’, time, M ACkM (time + + ‘Bob’) A cA = EkE (kAB ), mA = M ACkM (time + + ‘Bob’ + + cA + + cB + + mB ) A A cB = EkE (kAB ), mB = M ACkM (time + + ‘Alice’ + + cB ) B B Use mA to validate cA , then extract kAB ; E M ← P RFkAB (‘Enc’) kAB ← P RFkAB (‘MAC’), kAB cB , mB , cReq = EkE (Request), mReq = M ACkM (1 + +A→B+ + time + + cReq ) AB AB Validate and decrypt cB , E M and derive kAB , kAB +A←B+ + time + + cResp ) cResp = EkE (Response), mResp = M ACkM (2 + AB AB Figure 5.12: The Kerberos Key Distribution Center Protocol (simpliőed). The E M E KDC shares with Alice kA for encryption and kA for MAC, and with Bob, kB M for encryption and kB for MAC. The KDC selects a shared session key kAB to be used by Alice and Bob for the speciőc session (request-response). Alice and Bob use kAB and a pseudorandom function P RF to derive two shared E M keys, kAB = P RFkAB (‘Enc’) (for encryption) and kAB = P RFkAB (‘MAC’) (for authentication, i.e., MAC). All parties validate contents of MACs before decrypting authenticated ciphertexts. time, using the keys shared between the KDC and Bob. Alice would next relay these to Bob. Notice that these values are also authenticated when sent to Alice (within mA ). In the second exchange, Alice sends her request to Bob, encrypted and authenticated using kAB . Alice also sends the pair cB , mB , which Bob uses to securely retrieve kAB . Alice and Bob both derive, from kAB , the shared E M encryption and authentication (MAC) keys kAB and kAB , respectively. Note that in the above protocol, the KDC never initiates communication, but only responds to an incoming request; this communication pattern, where a server machine (in this case, the KDC) only responds to incoming requests, is referred to as client-server. Server machines usually use client-server communication, since it relieves the server (e.g., KDC) from the need to maintain state for different clients, except for the long-term keys (e.g., kA and kB ). This makes it easier to implement an efficient service, especially when clients may access different servers. In Kerberos, the TTP has an additional role: access control. Namely, the TTP controls the ability of the client (Alice) to contact the service (Bob). In this case, the mB authenticator will also be a ticket or permit for the use of the server. Access control is an important aspect of computer and network security. Applied Introduction to Cryptography and Cybersecurity 5.6. THE GSM KEY EXCHANGE PROTOCOL 5.6 295 The GSM Key Exchange Protocol We next discuss the GSM Key Distribution and Key Exchange protocol, an important-yet-vulnerable shared-key Key Exchange protocol. This protocol is performed at the beginning of each connection between a Mobile device belonging to a user, e.g., mobile phone, a Visited Network (VN), and the user’s Home Network. 
The mobile is only connected via the Visited Network, i.e., any communication between the mobile and the Home Network must be via the Visited Network. Due to its wide use and importance, there are many publications on GSM; unfortunately, there is no complete agreement on the terms. We try to use terms which are widely used and intuitive, but readers should be ready to see different terms in different publications, e.g., the Visited Network (VN) is often referred to as the Base station (BS), or Visited Location; the Home Network is sometimes referred to either as the Home location or as the Authentication Center (AuC). The GSM protocol is based on a shared key ki associated with mobile user identiőers; this identiőers is called the International Mobile Subscriber Identifier (IMSI), but we refer to it simply as i. This key, ki , is known to the Home Network and to the user’s mobile device. More speciőcally, the mobile device of user with identiőer (IMSI) i, has a copy of ki ; and the Home Network has a mapping from each identiőer i of any of its users, to the corresponding ki . The Visited Network is not fully trusted by the user and by the Home Network; therefore, is not provided with the shared key ki , which should remain secret from it. The GSM design assumes secure communication between the Visited Network and the Home Network; in particular, information sent by the Home Network to the Visited Network is not exposed to any other party. This may be ensured by running TLS (Chapter 7) or another securecommunication protocol between these two parties; however, GSM simply assumes such secure communication, without specifying how it should be secured. Apparently, at least originally, visited and home networks simply often used private communication lines between them, and assumed this is secure enough, although we believe by now they probably all use TLS or a similar protocol. The basic idea of the GSM Key Exchange protocol is for the Home Network to provide the Visited Network with a triplet (r, Kc , s) for every session of the mobile, where: r (or Rand) is a random 128-bits string selected by the Home Network, and used, together with the client’s key ki , to compute (Kc , s), as: (Kc , s) ← A38(ki , r), where A38 is an algorithm4 , which the Home Network operator is free to select, and the GSM speciőcations requires to be a One-Way 4 Actually the GSM specifications defines two separate functions, A3 and A8, to compute each parameter: s ← A3(ki , r), Kc ← A8(ki , r). But since they are always computed together and are expected to have similar cryptographic properties, they are usually considered and implemented by a single algorithm, which we denote A38; you may also see it denoted A3A8. Applied Introduction to Cryptography and Cybersecurity CHAPTER 5. SHARED-KEY PROTOCOLS 296 Function. Since (Kc , s) are derived from ki and r, it suffices for the Visited Network to sends to the Mobile device only r, and the mobile can compute the same values (Kc , s) as computed by the Home Network (which were sent to the Visited Network). Kc is the session key. This key is used by the Visited Network and the client to encrypt the connection between them. s (or SRES) is a secret authenticator/result; the mobile device authenticates itself to the Visited Network by sending to it the same value of s, as the Visited Network received in the triplet (r, Kc , s) from the Home Network. 
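A minimal sketch of the triplet computation follows. Since the real A38 (e.g., COMP128) is operator-chosen and not specified here, we use truncated HMAC-SHA256 purely as a stand-in, keeping the standard output sizes (64-bit Kc, 32-bit SRES).

import hmac, hashlib, secrets

def A38(ki, r):
    """Stand-in for the operator-chosen A38 one-way function: derives the
    session key Kc and the authenticator s from the subscriber key ki and the
    challenge r. (Real operators used, e.g., COMP128; this is only a sketch.)"""
    out = hmac.new(ki, r, hashlib.sha256).digest()
    return out[:8], out[8:12]              # Kc (64 bits), s = SRES (32 bits)

ki = secrets.token_bytes(16)               # shared between mobile i and the Home Network

# Home Network: pick r and compute the triplet (r, Kc, s) for the Visited Network.
r = secrets.token_bytes(16)                # 128-bit random challenge
Kc, s = A38(ki, r)
triplet = (r, Kc, s)

# Visited Network: forward r to the mobile; the mobile derives the same (Kc, s).
Kc_mobile, s_mobile = A38(ki, r)

# The mobile authenticates by returning s; the Visited Network compares it
# to the value in the triplet received from the Home Network.
assert hmac.compare_digest(s_mobile, s)
assert Kc_mobile == Kc                     # both ends now share the session key Kc
print("mobile authenticated; session key Kc established")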
The GSM Key Exchange protocol, and two messages illustrating how GSM protects data transfer with the session key Kc , are illustrated in Figure 5.13. The design uses two ‘cryptographic functions’, A38, introduced above, and A5. In the speciőcations, A38 is described as a One Way Function (OWF), and A5 is referred to as encryption; however, from their use in the protocol, both functions should really be pseudorandom functions (PRFs). The standard deőnes three variants of A5, denoted A5/v for v ∈ {1, 2, 3}, and uses A5/0 to denote no encryption. The GSM speciőcation allows some ŕexibility as to the speciőc functions. In fact, the operator of the Home Network is free select the A38 function. A common choice is an algorithm called COMP128 which was deőned by the GSM consortium, and shared under non-disclosure agreement. The A5/i ‘encryption’ functions, however, must be supported by both mobile and the Visited Network. The GSM speciőcations included two implementations of A5, denoted A5/1 and A5/2; the A5/2 algorithm is an intentionally-weaker variant of A5/1, included to allow export of GSM system for network operators in countries to whom it was, at the time, not allowed to export ‘strong security’ encryption products. Similarly to COMP128, both A5/1 and A5/2 were shared under non-disclosure agreement, i.e., kept ‘secret’. Later, another option was added - the A5/3 algorithm, based on the KASUMI block cipher. Another option is A5/0, which simply means that encryption is not performed at all. As can be seen in the bottom (last) messages of Figure 5.13, the A5/i functions outputs 228 bits. Half of them, bits 1 to 114, are used to encrypt messages from the Mobile client to the Visited Network; the other half, say from bit 115 to bit 228, is used to encrypt ‘responses’, i.e., messages from the Visited Network to the Mobile client. Note that the input to the A5/i functions is always the key k, and a non-secret number - for simplicity, we show it as a sequential counter5 . These values are synchronized between Mobile client and Visited Network; this synchronization is due to the underlying communication protocol (TDMA). Overview of the GSM Key Exchange. The Key Exchange begins with the mobile sending its identiőer IMSI (International Mobile Subscriber Identity) to 5 The actual numbers used in GSM are a bit more complex, but still non-secret. Applied Introduction to Cryptography and Cybersecurity 5.6. THE GSM KEY EXCHANGE PROTOCOL Mobile client 297 Visited network i (IMSI) Home network i (IMSI) $ r ← {0, 1}128 (Kc , s) ← A38(ki , r) r (r, s, Kc ) (Kc , s) ← A38(ki , r) s Ok ECC(m1 ) ⊕ A5/v(Kc , 1)[1 : 114] ECC(resp1 ) ⊕ A5/v(Kc , 1)[115 : 228] ECC(m2 ) ⊕ A5/v(Kc , 2)[1 : 114] ECC(resp2 ) ⊕ A5/v(Kc , 2)[115 : 228] (...and so on for more messages) Figure 5.13: The GSM Key Exchange Protocol; the standard deőnes ‘cryptographic functions’ A38 (deőned in the speciőcations as a OWF, but actually used as a PRF) and A5 (referred in the speciőcations as encryption, but actually also used as a PRF). The standard deőnes three variants of A5 denoted A5/v for v ∈ {0, 1, 2, 3}, where A5/0 denotes no encryption. The GSM standard also speciőes the Error Correction Code function ECC(·). the Visited Network; we denote this as simply i. The Visited Network forwards i to the Home Network. The Home Network uses i to retrieves the key ki of the mobile client. 
Then, the Home Network selects a random 128-bit binary string r, and uses ki and r to compute: (Kc , s) ← A38(ki , r), using the A38 algorithm (see discussion above). The Home Network sends the resulting GSM authentication triplet (r, s, Kc ) to the Visited Network. Figure 5.13 also shows an example of two messages m1 , m2 sent from Mobile client to the Visited Network, and two corresponding ‘responses’, resp1 , resp2 , Applied Introduction to Cryptography and Cybersecurity 298 CHAPTER 5. SHARED-KEY PROTOCOLS sent from the Visited Network to the Mobile client. Note that these do not have to be really responses to the messages; the Visited Network would send in the same way any other message to the Mobile client, e.g., from some remote communicating client - we just used ‘resp’ (for ‘response’) since it seems a bit clearer, avoiding confusion with messages from the Mobile client. Of course, in typical real use, the mobile and the Visited Network exchange more than two messages and responses. All the messages, including responses, that are exchanged between the Mobile client and the Visited host, are encrypted using the connection’s key Kc . The encryption uses the A5 algorithm chosen by the client, e.g., A5/1. More correctly, the messages (and responses) are encrypted by bitwise-XOR to the output of the A5 function, since the A5 algorithms deőne a pseudorandom function (PRF), as we explained above. Error correction then encryption? Before XORing the messages (and responses) with the output of A5, the protocol őrst applies an Error Correcting Code (ECC) to the message/response, to allow recovery from bit errors - common in wireless communication. In fact, GSM uses quite extreme error correction codes, e.g., with input of 184 (non-encoded) bits and output of 456 (encoded) bits6 . Note that since every ŕow XORs the message/response with only 114 bits from the output of A5, this means that every ‘real’ GSM message, would actually be transmitted using multiple of these 114-bit ŕows; what we show for the messages and responses in Figure 5.13 is only a simpliőcation. The use of error-correction before encryption may have been designed as a heuristic attempt to provide authentication as a by-product of encryption. However, as discussed in subsection 4.7.2, this approach is not advisable: it may fail to prevent message modiőcation; it foils the use of authentication and then applying error-detection/correction code, as a way to detect attacks; and, signiőcantly, it may make it easier to attack the encryption scheme, as the plaintext will have a lot of redundancy due to the ECC. This is the case for GSM. In fact, the use of ECC-then-Encrypt in GSM may be the best (or worst...) example of the risk in performing ECC and then encrypting. Speciőcally, see [26] for an effective ‘ciphertext-only’ attack on the GSM encryption using A5/1 or A5/2 encryption. While the attacks are presented as ‘ciphertext-only’, they fully exploit the fact that the plaintext has huge redundancy due to the use of ECC before encryption. While this attack is highly recommended reading, we will not describe it here. Instead, we proceed to discuss protocol-based attacks on the GSM Key Exchange. 6 This is a simplification; in reality, different lengths are used for different types of messages. Applied Introduction to Cryptography and Cybersecurity 5.6. THE GSM KEY EXCHANGE PROTOCOL Mobile 299 VN Attacker i (IMSI) r r s s Ok Ok ECC(m1 ) ⊕ A5/v(Kc , 1)[1 : 114] ECC(m1 ) ⊕ A5/v(Kc , 1)[1 : 114] ... ... 
Figure 5.14: The VN-impersonation attack by a MitM attacker on the GSM Key Exchange. The Key Exchange between the client and the Home Network is exactly as in Figure 5.13, but here we omit the Home Network and the messages exchanged between the Visited Network and the Home Network. This figure is simplified; in particular, it does not include the cipher-negotiation details; see these in Figure 5.15. A5/v denotes the GSM encryption scheme; standard values are v ∈ {0, 1, 2, 3}.

5.6.1 Vulnerability study: VN-impersonation replay attack on GSM

Figure 5.14 shows the simple VN-impersonation replay attack against the GSM Key Exchange protocol. The attack involves a fake Visited Network (VN), i.e., the attacker impersonates as a legitimate VN. The VN-impersonation attack has three phases:

Eavesdrop: in the first phase, the attacker eavesdrops on a legitimate connection between the mobile client and a legitimate Visited Network (VN). The Key Exchange between the client and the Home Network is exactly as in Figure 5.13, except that, for simplicity, Figure 5.14 does not show the Home Network and the messages exchanged between the Visited Network and the Home Network.

Cryptanalysis: in the second phase, the attacker cryptanalyzes the ciphertexts collected during the eavesdrop phase. Assume that the attacker succeeds in finding the session key Kc shared between the client and the Visited Network; this is reasonable, since multiple effective attacks are known on the GSM ciphers A5/1 and A5/2.

Impersonate: finally, once cryptanalysis has exposed the session key Kc, the attacker impersonates as a legitimate Visited Network, and replays the same random challenge r sent by the legitimate Visited Network in the eavesdrop phase. Since the connection key Kc is derived deterministically from ki and r, by (Kc, s) ← A38(ki, r), the client reuses the same connection key Kc as in the eavesdropped connection - the one exposed by the attacker during the cryptanalysis phase. The attacker now uses Kc to communicate correctly with the client; in particular, this allows the adversary to decrypt any messages m′1, ..., m′n′ encrypted and sent by the client in this new connection.

Are MitM attacks feasible against GSM? After the VN-impersonation attack and the MitM downgrade attack (subsection 5.6.3) were published [26], some responses argued that building a MitM adversary is 'too complex' and therefore such attacks are not a real concern. Nevertheless, devices allowing GSM-MitM attacks have been constructed - by academic researchers, students, independent developers - and also by companies; such products are available for purchase from multiple vendors.

Is the VN-impersonation attack effective and a real threat? The VN-impersonation attack allows the attacker to impersonate as a Visited Network and cause the mobile client to send (new) messages encrypted with the (old) key, which the attacker can now decrypt. The attacker may also respond with fake messages, of course, to continue the dialog. However, this attack has one significant drawback: the attacker cannot impersonate as the client to a legitimate Visited Network.
In particular, the attack does not allow the attacker to decrypt new responses sent from a remote peer to the client. In practice, this may also make it hard for the attacker to deploy this attack in some scenarios, e.g., to become a MitM on a complete call between the client and a remote party. The attacker may try to connect to the remote party using a separate call from the attacker’s own mobile device, relaying traffic between the client and the remote party via the client’s device - but the remote party may notice Applied Introduction to Cryptography and Cybersecurity 5.6. THE GSM KEY EXCHANGE PROTOCOL 301 the use of a different client device. This serious limitation is avoided by the downgrade attack we discuss next. 5.6.2 Crypto-agility and cipher suite negotiation in GSM In this subsection, we őrst introduce the important principle of Crypto-agility (also known as cryptographic agility). crypto-agility means that the cryptographic protocol allows the use of different cryptographic functions and schemes, a long as they satisfy some requirements, e.g., an IND-CCA encryption scheme or a secure PRG. We refer to a speciőc choice of functions/schemes used by a protocol as a cipher suite. Then, we explain that the GSM protocol supports crypto-agility, including a cipher suite negotiation mechanism, allowing entities to negotiate the speciőc cryptographic functions and schemes to be used. Finally, we show that the GSM ciphersuite negotiation mechanism is vulnerable to a devastating cipher suite downgrade attack. Principle 11 (Crypto-agility). Cryptographic protocols should be designed using abstract ‘building block’ cryptographic functions and schemes; the set of functions and schemes is called a cipher suite. Each function/scheme should have well-defined requirements. It is desirable for protocols to allow cipher suite negotiation to determine the specific cipher suite to be used in a particular run of the protocol, as long as it is secure. Cipher suite negotiation is secure if the negotiated cipher suite is never inferior to another cipher suite that is supported by both/all parties involved in the protocol. Basically, crypto-agility requires a modular design, where the design of the protocol does not depend on the speciőc components (cipher suite) used, only on their required security properties. This has several important beneőts: 1. crypto-agility allows replacing a cryptographic scheme/function which is found or suspected to be vulnerable, while continuing to use the same protocol. 2. crypto-agility allows different users to use the same protocol, but using different schemes, due to different trust assumptions, different efficiency/security tradeoffs and considerations, or other reasons, such as licensing, availability and legal (often export) restrictions. In particular, some countries restrict the export of some cryptographic mechanisms; speciőcally, until about 2000, the USA restricted export of cryptographic systems using symmetric encryption with keys longer than 40 bits. A protocol may further support a secure cipher suite negotiation to allow the parties to choose the ‘best’ cipher suite supported by both/all of them. 3. The security of a crypto-agile protocol can be established based on the well-deőned requirements from the cipher suite, which makes it easier to design, evaluate, understand and prove security of the protocol. Applied Introduction to Cryptography and Cybersecurity 302 CHAPTER 5. 
SHARED-KEY PROTOCOLS

Note, however, that all too often, protocol designers focus on crypto-agility and cipher suite negotiation but fail to properly ensure the security of the negotiation mechanism, creating serious vulnerabilities and allowing downgrade attacks, which trick the parties into using a particular, vulnerable cipher suite chosen by the attacker. These attacks usually involve a Man-in-the-Middle (MitM) attacker. In this section, we will see downgrade attacks against GSM; later, in Chapter 7, we will see several downgrade attacks on SSL and TLS.

GSM cipher suite negotiation is vulnerable to downgrade attacks. GSM supports crypto-agility, in the sense that the protocol is defined for any stream cipher, with three specific options (A5/1, A5/2 and A5/3) as well as the ability to use other stream ciphers. Furthermore, GSM supports cipher suite negotiation, since the Visited Network and the client (mobile) negotiate which stream cipher to use. Namely, the client sends the list of supported stream ciphers, and the Visited Network indicates which of them it prefers; this stream cipher is then used by GSM. In particular, GSM's cipher suite negotiation was essential to allow interoperability between a client / visited network that supports only an exportable (cryptographically weak) stream cipher, and a visited network / client that supports both the exportable stream cipher and a more secure stream cipher. However, the GSM cipher suite negotiation is not well protected, allowing downgrade attacks by a MitM adversary, as we show next. In fact, we show two attacks. First, in this subsection, we outline the simple downgrade to A5/1 attack, which allows downgrading GSM connections to use A5/1. Then, in subsection 5.6.3, we present the (more advanced) downgrade to A5/2 attack. Both attacks work even when both mobile client and visited network support and prefer a stronger cipher, e.g., A5/3. First, let us explain the GSM ciphersuite negotiation mechanism.

The GSM ciphersuite negotiation. The GSM ciphersuite negotiation process is shown in Figure 5.15. The first message of the Key Exchange, containing the mobile client's identity (IMSI), also lists the ciphers supported by the client, i.e., the A5/v functions. For example, in the figure, the Mobile supports A5/1 and A5/2. The Visited Network selects the stream cipher to be used from this list. Usually, the Visited Network would select the stream cipher that it considers most secure, among those it supports and that are offered by the mobile client. A critical property of the GSM negotiation mechanism is that all clients support the stronger A5/1 cipher. Furthermore, GSM specifies that Visited Networks that support A5/1, as most do, should refuse to use A5/2 - even if A5/2 is the only option on the list. This is an important fact which has significant impact on GSM downgrade attacks:

[Figure 5.15: The GSM Key Exchange Protocol, including details of cipher suite negotiation (omitted in Figure 5.13). Note that the Visited Network aborts if the ciphers offered by the client do not include A5/1.]

Fact 5.2 (GSM Visited Networks refuse to downgrade to A5/2). All GSM Visited Networks support A5/1, and refuse to open (abort) a connection if the cipher suites offered by the mobile client do not include A5/1.

The reason for Fact 5.2 is that the A5/2 cipher is known to have been intentionally designed to provide vulnerable encryption. This vulnerable cipher was necessary to gain government permission to export GSM equipment; GSM network operator equipment was allowed for export to certain countries only if restricted to use only A5/2. Without the defense of Fact 5.2, a MitM attacker could simply remove A5/1 (and any other 'strong' cipher) from the list of ciphers sent by the Mobile, and as a result, communication would use the vulnerable A5/2. Unfortunately, Fact 5.2 does not prevent the simple downgrade to A5/1 attack, which we present below. In the next subsection, we present the (more advanced) downgrade to A5/2 attack.

Downgrade to A5/1 attack. Fact 5.2 is the only defense of GSM against downgrade attacks; yet, it does not refer at all to any cipher other than A5/1. As a result, GSM is vulnerable to a simple downgrade to A5/1 attack. For example, if a mobile supports {A5/1, A5/2, A5/3}, then a MitM attacker can simply remove the A5/3 option from the list sent by the Mobile client, to cause the Visited Network and Mobile client to use the (weaker) A5/1 cipher. This attack is much simpler than the ones we present next, in subsection 5.6.3; therefore, we leave it as an exercise for the reader (Exercise 5.12). This exercise - finding the simple downgrade attack - should not be difficult, especially after learning the more advanced downgrade to A5/2 attack which we describe next.

5.6.3 The downgrade to A5/2 attack on GSM

In this subsection, we present the downgrade to A5/2 attack on GSM. This is a devastating attack, since A5/2 is an absurdly vulnerable algorithm; indeed, it was intentionally designed that way, since such a weakened cipher was necessary to obtain permission to export GSM devices. This is a non-trivial attack, and therefore, we first present a simplified variant that is unlikely to work in practice. Then, we present the 'real' downgrade to A5/2 attack. Both attacks are based on the GSM key-reuse vulnerability, which we describe next.

GSM key-reuse vulnerability. GSM has an unusual additional vulnerability, making downgrade attacks much worse than with most systems/protocols. This vulnerability is due to the following fact:

Fact 5.3 (GSM Key-Reuse vulnerability). The GSM Key Exchange establishes the same key Kc, regardless of the cipher used (e.g., A5/1, A5/2 or A5/3).

Namely, the GSM protocol uses the same key Kc for all ciphers. Note that this vulnerability is due to the fact that the GSM designers completely ignored the key separation principle (Principle 10).
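Key separation would have been easy to add. As a purely illustrative sketch (GSM does not do this, and the function name and constants below are ours), one can derive a distinct key for each cipher from the connection key Kc, using HMAC-SHA256 as a stand-in for a PRF:

```python
import hmac, hashlib

def derive_cipher_key(kc: bytes, cipher_name: str) -> bytes:
    # Mix the cipher's name into the PRF input, so the key used with A5/2
    # is pseudorandomly unrelated to the key used with A5/1 or A5/3.
    # HMAC-SHA256 serves as the PRF; GSM itself performs no such derivation.
    tag = hmac.new(kc, b"GSM cipher key: " + cipher_name.encode(), hashlib.sha256).digest()
    return tag[:8]  # truncate to 64 bits, the length of the GSM key Kc

kc = bytes.fromhex("0123456789abcdef")       # hypothetical connection key Kc
k_a51 = derive_cipher_key(kc, "A5/1")
k_a52 = derive_cipher_key(kc, "A5/2")
assert k_a51 != k_a52   # breaking A5/2 no longer reveals the key used with A5/1
```

With such a derivation, a key recovered by breaking A5/2 traffic would be useless against traffic protected by A5/1 or A5/3.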
Fact 5.3 allows an attacker to őnd a key used with one (weak) scheme, and use it to decipher communication protected with a different (stronger) cipher. This is indeed deployed by the GSM downgrade attacks we present. Another fact used by the attack is that A5/2 is absurdly vulnerable, i.e., very effective and efficient attacks against A5/2 are known. Fact 5.4 (CTO attack on A5/2 requires 900 ciphertext bits and 1 seconds). The ciphertext-only (CTO) attack of [26] finds the connection encryption key Kc , given 900 bits or more of ciphertext, encrypting ECC-encoded messages; the attack takes less than one second (using standard computing capabilities). Applied Introduction to Cryptography and Cybersecurity 5.6. THE GSM KEY EXCHANGE PROTOCOL 305 The following few steps of the Key Exchange as described in Figure 5.15, are exactly as in Figure 5.13. In fact, since the interactions with the Home Network are not impacted or changed by the ciphersuite mechanism or by the attacks, we do not even include them in the discussion and őgures of the downgrade attack and its variants. After receiving the (correct) authenticator, s, from the mobile, the visited network identiőes its choice of A5 cipher to the mobile. This is done in the message CIPHMODCMD : A5/v, where v indicates the cipher to be used, i.e., in this case, v ∈ {0, 1, 2}; this is instead of merely sending ‘Ok’ as in Figure 5.13. The following message from the client, and the first encrypted, is the special message CIP HM ODCOM , i.e., ‘cipher mode complete’, which acknowledges that the Mobile is using the cipher mode indicated in the CIPHMODCMD (‘cipher mode command’) sent by the Visited Network. Recall that GSM is designed for wireless communication, with signiőcant probability for noise and corruption of the transmitted information - which is the motivation for its extensive use of Error Correcting Code, with extensive redundancy. In spite of that, messages may get lost. Hence, important control messages should be acknowledged; in particular, the Mobile client waits for an acknowledgement message from the Visited Network (VN), which we denote CIPHMODOK, to know that the VN received correctly the CIP HM ODCOM message. When the Mobile times-out, i.e., does not receive CIPHMODOK in time, then the Mobile retransmits the CIP HM ODCOM to the Visited Network; this scenario is shown in Figure 5.15 (where we show the case where the őrst retransmission is successful). This happens after a very short time-out, of much less than a second - few milliseconds. As shown in Figure 5.15, each retransmission uses a distinct sequence number (e.g., 1 and 2, in the őgure). This, again, is an important fact for the downgrade attack. Fact 5.5 (CIP HM ODCOM message and its retrasnsmission). After a Mobile receives the CIPHMODCMD (‘cipher mode command’) message, instructing the Mobile to use a specific cipher mode, then the Mobile encrypts and sends CIP HM ODCOM (‘cipher mode completed’) to the VN; this message contains 456 bits (including the ECC). The Mobile then waits for an acknowledgement a CIPHMODOK (‘cipher mode Ok’) message from the VN. If this isn’t received after time-out of few milliseconds, the Mobile re-transmits, with a new counter value. A failure may also occur in the reverse direction, i.e., the Visited Network (VN) may time-out while waiting for the CIP HM ODCOM (‘cipher mode completed’) from the client. In this case, the VN aborts the Key Exchange. Again, this happens after a timeout of few milliseconds. 
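To see why the rapid retransmissions of Fact 5.5 are so valuable to an attacker, the following sketch models GSM-style frame encryption; the keystream function is only a stand-in for the A5 algorithms (which are specified elsewhere), and the framing is simplified. Each retransmission of CIPHMODCOM is encrypted under a new counter value, so it contributes fresh ciphertext bits; two transmissions already provide 912 > 900 bits, enough for the attack of Fact 5.4.

```python
import hmac, hashlib

def keystream(kc: bytes, frame_number: int, nbits: int) -> bytes:
    # Stand-in for the A5 keystream generator, used only so this sketch runs;
    # the real A5/1, A5/2 and A5/3 algorithms are entirely different.
    out, block = b"", 0
    while 8 * len(out) < nbits:
        out += hmac.new(kc, frame_number.to_bytes(4, "big") + block.to_bytes(4, "big"),
                        hashlib.sha256).digest()
        block += 1
    return out[:(nbits + 7) // 8]

def encrypt_frame(kc: bytes, frame_number: int, ecc_encoded: bytes) -> bytes:
    # ECC-then-encrypt: the ECC-encoded message is XORed with keystream selected
    # by the frame (counter) value, so each retransmission of CIPHMODCOM, sent
    # under a new counter, yields new ciphertext bits for the attacker.
    ks = keystream(kc, frame_number, 8 * len(ecc_encoded))
    return bytes(p ^ k for p, k in zip(ecc_encoded, ks))

CIPHMODCOM_BITS = 456            # Fact 5.5: 456 bits after ECC encoding
print(2 * CIPHMODCOM_BITS)       # 912 > 900 bits required by Fact 5.4
```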
Similarly, a failure may occur also in the earlier phases of the Key Exchange, i.e., between sending i and receiving r, or between sending r and receiving s. However, GSM allows for much larger delays in these early phases - up to 12 seconds! This fact is also signiőcant. Applied Introduction to Cryptography and Cybersecurity CHAPTER 5. SHARED-KEY PROTOCOLS 306 Mobile MitM VN CIPHMODCMD : A5/2 CIPHMODCMD : A5/1 phase s Eavesdrop s phase: r Pre-analysis r find Kc i, {A5/1, A5/2} phase i, {A5/1, A5/2} ECC(CIP HM ODCOM, 1)⊕ A5/2(Kc , 1)[1 : 114] ECC(CIP HM ODOK, 1)⊕ A5/2(Kc , 1)[115 : 228] ECC(CIP HM ODOK, 1)⊕ A5/1(Kc , 1)[115 : 228] ECC(m1 ) ⊕ A5/2(Kc , 2)[1 : 114] ECC(mn ) ⊕ A5/1(Kc , 2)[1 : 114] ... ... Figure 5.16: Simpliőed downgrade attack on GSM Key Exchange. Fact 5.6 (GSM Key Exchange allows 12 second delays till CIPHMODCMD). The Mobile and the VN abort the Key Exchange if they do not receive the expected responses, after timeout of about 12 seconds; this holds for the messages until the CIPHMODCMD is sent (and received), from which point, responses are expected to be almost instantaneous (few milliseconds). If responses are not accepted by the timeout, the Key Exchange is aborted. Simplified, unrealistic downgrade attack. We őrst present a simpliőed, unrealistic downgrade attack against GSM in Figure 5.16. In this attack, the client supports A5/1 and A5/2, but the MitM attacker ‘removes’ A5/1 and only offers A5/2 to the Visited Network. As a result, the entire session between Visited Network and client is only protected using the (extremely vulnerable) A5/2 cipher. However, the attack of Figure 5.16 fails, for the following reasons: 1. As per Fact 5.6, the Visited Network would time-out when it does not receive the CIP HM ODCOM message within few milliseconds - while the fastest ciphertext-only cryptanalysis process of A5/2, from [26], takes about a second (Fact 5.4). Applied Introduction to Cryptography and Cybersecurity Cryptanalysis ECC(CIP HM ODCOM, 1)⊕ A5/1(Kc , 1)[1 : 114] 5.6. THE GSM KEY EXCHANGE PROTOCOL 307 2. The length of the CIP HM ODCOM message is only 456 bits (including the ECC), while the cryptanalysis attack of [26] requires 900 bits. Hence, the attack will not succeed to őnd the key at all. We next present the ‘real’ GSM downgrade attack, which overcomes these challenges. The ‘real’ downgrade attack on GSM Key Exchange. In Figure 5.17, we őnally present the ‘real’ downgrade attack on the GSM Key Exchange. This attack addresses the two challenges presented above, by: 1. As per Fact 5.6, GSM allows delays of about 12 seconds until the Visited Network sends CIPHMODCMD. So this attack delays the s response from the client - this gives the MitM attacker 12 seconds, much more than enough for the ciphertext-only cryptanalysis process of A5/2 from [26], which takes only about a second (Fact 5.4)! 2. To obtain a sufficient number of ciphertext bits, this attack intentionally causes the mobile client to time-out while waiting for the CIPHMODOK message, resulting in rapid retrasmission of the CIP HM ODCOM message from the client - re-encrypted, since the counter value is modified, as follows from Fact 5.5. Each of these retransmissions contains 456 bits, providing together more than the 900 bits required for the attack of A5/2 from [26] (Fact 5.4). Several additional variants of this attack are possible; see, for example, the following exercise. Exercise 5.6 (GSM combined replay and downgrade attack). 
Consider an attacker who eavesdrops and records the entire communication between mobile and Visited Network during a connection which is encrypted using a ‘strong’ cipher, say A5/3. Present a sequence diagram, like Figure 5.14, showing a ‘combined replay and downgrade attack’, allowing this attacker to decrypt all of that ciphertext communication by later impersonating as a Visited Network, and performing a downgrade attack. Hint: the attacker will resend the value of r from the eavesdropped-upon communication (encrypted using a ‘strong’ cipher) to cause the mobile to re-use the same key - but with a weak cipher, allowing the attacker to expose the key. Protecting GSM against downgrade attacks. Downgrade attacks involve modiőcation of information sent by the parties - speciőcally, the possible and/or chosen ciphers. Hence, the standard method to defend against downgrade attacks is to authenticate the exchange, or at least, the ciphersuite-related indicators. Note that this requires the parties to agree on the authentication mechanism, typically, a MAC scheme. It may be desirable to also negotiate the Applied Introduction to Cryptography and Cybersecurity CHAPTER 5. SHARED-KEY PROTOCOLS 308 Mobile MitM VN phase CIPHMODCMD : A5/2 Eavesdrop s phase: r find Kc r phase i, {A5/1, A5/2} Pre-analysis i, {A5/1, A5/2} ECC(CIP HM ODCOM, 1)⊕ A5/2(Kc , 1)[1 : 114] ECC(CIP HM ODOK, 2)⊕ A5/2(Kc , 2)[115 : 228] Cryptanalysis ECC(CIP HM ODCOM, 2)⊕ A5/2(Kc , 2)[1 : 114] s CIPHMODCMD : A5/1 ECC(CIP HM ODCOM, 1)⊕ A5/1(Kc , 1)[1 : 114] ECC(CIP HM ODOK, 1)⊕ A5/1(Kc , 1)[115 : 228] ... ... Figure 5.17: A ‘real’ downgrade attack on GSM Key Exchange. authentication mechanism. In such case, the negotiation should be bounded to reasonable time, and the use of the authentication scheme and key limited to a few messages, to foil downgrade attacks on the authentication mechanism. Every authentication mechanism supported should be secure against this (weak) attack. It is also necessary to avoid the use of the same key for different encryption schemes, as done in GSM (Fact 5.3), and exploited, e.g., by the attacks of Figure 5.17 and Exercise 5.6. Using separate keys is quite easy, and does not require any signiőcant resources - it seems that there was no real justiőcation for this design choice in GSM, except for the fact that this allows the Home Network to send just one key Kc , without knowing which cipher would be selected by the mobile and Visited Network. Deployed defense. The ‘real’ attack is not ‘real’ anymore - but it is prevented in a rather ‘crude’ way: the GSM consortium abolished support for the insecure A5/2. Note that downgrading between other versions, e.g., from A5/3 to A5/1, is still possible. Applied Introduction to Cryptography and Cybersecurity 5.7. RESILIENCY TO EXPOSURE: FORWARD SECRECY AND RECOVER SECURITY 309 Learning from GSM Key Exchange vulnerabilities. We have discussed several signiőcant failures of the GSM Key Exchange: the MitM downgrade attack of this subsection, the VN-impersonation attack of Figure 5.14, the use of ‘ECC-then-encrypt’ which allows a ciphertext-only attack on A5/1 or A5/2 [26]. There are more, e.g., efficient known-plaintext attacks. What are the root causes of these vulnerabilities and what can we learn, to avoid such vulnerabilities? We believe that many of these problems are due to the fact the designers violated several basic principles. 
Most notably, the GSM design violates the Kerckhoffs’ principle (Principle 2): it relies on the use of ‘secret’ algorithms such as A38, A5/1 and A5/2. The GSM design also did not undergo careful public security analysis, and in particular, its attack model was never clearly stated, violating Principle 1, clear attack model; and its design did not carefully apply well-deőned, standard cryptographic building blocks, violating Principle 3, conservative design, and Principle 8, cryptographic building blocks. 5.7 Resiliency to key exposure: forward secrecy and recover security One of the goals of deriving pseudorandom keys for each session was to reduce the damage due to exposure of one or some of the session keys. A natural question is, can we improve the resiliency to exposure? In particular, can a Key Exchange protocol provide some security, even when an adversary may sometimes expose also the master key, or, more generally, the entire state of the parties? Notice that with all Key Exchange protocols we studied, exposure of the master key, at any time, allows an adversary to easily expose all (past and future) session keys. One approach to this problem was already mentioned: place the master key κ within a Hardware Security Module (HSM), so that it is assumed not to be part of the state exposed to the attacker. However, often, the use of an HSM is not a realistic, viable option. Furthermore, cryptographic keys may be exposed even when using an HSM - by cryptanalysis or by some weakness of the HSM, such as side-channels allowing (immediate or gradual/partial) exposure of keys. In this section, we discuss a different approach to provide security with resiliency to key exposures. This approach is to design the Key Exchange protocol to ensure some security, even after the adversary obtains the master key (or the contents of the entire storage). We mostly focus on two notions of resiliency to key exposure: forward secrecy and recover security. We explain these two notions and present Key Exchange protocols satisfying them. Both of these notions can be achieved using shared-key only. We also brieŕy discuss additional, even stronger notions of resiliency to key exposures, mainly, an extension for each of the two notions: perfect forward secrecy (PFS) and perfect recover security (PRS). For these stronger notions of resiliency, it seems necessary to use public key cryptography, which we introduce Applied Introduction to Cryptography and Cybersecurity 310 CHAPTER 5. SHARED-KEY PROTOCOLS in Chapter 6. We present PFS and PRS protocols in Section 6.3; later, in Chapter 7, we discuss how PFS is provided by the TLS protocol. Terms for resilient security. Note that the notions of Recover Security and Perfect Recover Security are not widely used in the literature. Also, the term Forward Secrecy is not always used as we deőne it; e.g., often it is used to refer to the notion commonly (and here) referred to as Perfect Forward Secrecy. 5.7.1 Forward Secrecy 2PP Key Exchange We use the term forward secrecy to refer to Key Exchange protocols where exposure of the entire storage of the communicating party in some future time period, including every (master and session) key kept at that future time, would not expose the keys used in previous time periods, or the plaintext encrypted (and sent) during these previous periods. This should hold although all previous communication could have been intercepted and recorded by the attacker. To ensure forward secrecy, each period i would use a separate master key kiM . 
For simplicity, we will map sessions to time periods, i.e., run the Key Exchange protocol once at the beginning of every period. At the beginning of period/session i, we must erase the previous master key (e.g., k_{i-1}^M). The definition follows. Note that some authors refer to this notion as weak forward secrecy, to emphasize the distinction from the stronger notion of perfect forward secrecy (which we present later).

Definition 5.1 (Key Exchange with Forward Secrecy). A Key Exchange protocol P ensures forward secrecy if, once session i has terminated, exposure of the state of the entity will not compromise the confidentiality of information sent by the entity or sent to the entity in session i.

We next discuss the forward-secrecy 2PP Key Exchange, a forward-secrecy variant of the Key Exchange 2PP extension, which we discussed and presented earlier, in subsection 5.4.1. The difference is that instead of using a single master key k, received during initialization, the forward-secrecy Key Exchange uses a sequence of master keys k_0^M, k_1^M, ...; for simplicity, assume that each master key k_i^M is used only for the i-th Key Exchange, with k_0^M received during initialization.

The key to achieving the forward secrecy property is to allow easy derivation of the future master keys k_{i+1}^M, ... from the current master key k_i^M, but prevent the reverse, i.e., keep the previous master keys k_{i-1}^M, k_{i-2}^M, ..., k_0^M pseudorandom, even for an adversary who knows k_i^M, k_{i+1}^M, .... A simple way to achieve this is by using a PRF, namely:

k_i^M = PRF_{k_{i-1}^M}(0)    (5.3)

The session key k_i^S for the i-th session can be derived using the corresponding minor change to Equation 5.2, namely:

k_i^S = PRF_{k_i^M}(N_A ++ N_B)    (5.4)

[Figure 5.18: The Forward-Secrecy 2PP Key Exchange protocol. This protocol is similar to the 2PP Key Exchange protocol (Figure 5.11). The main difference is that this protocol uses a different master key k_i^M for each period i; the initial master key, shared by the two parties, is k_0^M.]

[Figure 5.19: Result of running the Forward-Secrecy 2PP Key Exchange for three periods, with the keys exposed in the second period. Periods prior to the exposure (in this example, only the first period) remain secure even after the period where keys are exposed. Periods from the exposure onward are insecure.]

The resulting Forward-Secrecy 2PP Key Exchange protocol is illustrated in Figure 5.18. The use of N_A and N_B in Eq. (5.4) is not really necessary, since each master key is used only for a single Key Exchange. The Forward-Secrecy 2PP Key Exchange protocol ensures that the communication in any period that completed before a key exposure remains secure, regardless of key exposures in later periods. See Figure 5.19.
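To make the construction concrete, here is a minimal sketch of the key derivation of Equations (5.3) and (5.4); HMAC-SHA256 is used as a stand-in for the PRF, the class and names are ours, and the exchange and authentication of the nonces (the 2PP part of Figure 5.18) is omitted.

```python
import hmac, hashlib, secrets

def prf(key: bytes, data: bytes) -> bytes:
    # HMAC-SHA256 as a stand-in for the PRF of Equations (5.3) and (5.4).
    return hmac.new(key, data, hashlib.sha256).digest()

class ForwardSecrecy2PPEndpoint:
    """Keeps only the current master key; the previous one is overwritten
    (a real implementation must also securely erase it from memory)."""

    def __init__(self, k0_master: bytes):
        self.master = k0_master

    def next_session(self, nonce_a: bytes, nonce_b: bytes) -> bytes:
        self.master = prf(self.master, b"\x00")      # Eq. (5.3): k_i^M = PRF_{k_{i-1}^M}(0)
        return prf(self.master, nonce_a + nonce_b)   # Eq. (5.4): k_i^S = PRF_{k_i^M}(N_A ++ N_B)

k0 = secrets.token_bytes(32)
alice, bob = ForwardSecrecy2PPEndpoint(k0), ForwardSecrecy2PPEndpoint(k0)
for _ in range(3):
    n_a, n_b = secrets.token_bytes(16), secrets.token_bytes(16)
    assert alice.next_session(n_a, n_b) == bob.next_session(n_a, n_b)
# Exposing alice.master now does not reveal the session keys of the three
# completed periods: those master keys have been overwritten, and the PRF
# cannot feasibly be inverted to recover them.
```

The essential point is that each endpoint stores only the current master key; once the previous key is overwritten, the keys of earlier periods can no longer be computed from the endpoint's state.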
5.7.2 Recover-Security Key Exchange Protocol

We use the term recover security to refer to key-setup protocols where a single session without eavesdropping or other attacks suffices to recover security from previous key exposures. The definition follows.

Definition 5.2 (Recover-security Key Exchange). A Key Exchange protocol recovers security if session i is secure whenever either of the following holds:

No attack: during session i, there was no exposure, and all messages were delivered correctly, without eavesdropping, injection or modification.

Preserve security: during session i there was no exposure, and the previous session (i − 1) was secure.

[Figure 5.20: Example of running the recover-security Key Exchange protocol for five periods, with the keys exposed in the second period, and no attack (not even eavesdropping) in the fourth period, allowing recovery of security. Periods prior to the exposure (in this example, only the first period) remain secure even after the period where keys are exposed. Periods after exposure remain insecure, until a 'recovery period' (in this example, period 4) where there is no attack. Following the recovery period, security is maintained until the next exposure.]

The forward-secrecy 2PP Key Exchange protocol (Figure 5.18) ensures forward secrecy - but not recover security. This is since the attacker can use one exposed master key, say k_j^M, to derive all the following master keys, including k_i^M for i > j, using Equation 5.3; in particular, k_{j+1}^M = PRF_{k_j^M}(0). However, a simple extension suffices to ensure recover security, as well as forward secrecy. The extension is simply to use the random values exchanged in each session, i.e., N_{A,i}, N_{B,i}, in the derivation of the next master key, i.e.:

k_i^M = PRF_{k_{i-1}^M}(N_{A,i} ⊕ N_{B,i})    (5.5)

We call this protocol the recover-security Key Exchange protocol, and illustrate its operation in Figure 5.20. Since the new master key is computed from these three values (k_{i-1}^M, N_{A,i} and N_{B,i}), it is secret as long as at least one of them is secret. Since the recover-security requirement assumes at least one session where the attacker does not eavesdrop or otherwise interfere with the communication, in that session both N_{A,i} and N_{B,i} are secret, hence the new master key k_i^M is secret. Indeed, we could have used just one of N_{A,i} and N_{B,i}; by XOR-ing both of them, we ensure secrecy of the master key even if the attacker is able to capture one of the two flows, i.e., even stronger security.

Two notes are in order. The first note is that the protocol is fragile, in the sense that an attacker who sends a corrupted nonce value to one (or both) parties in a given period can prevent recovery of (secure) communication in future rounds. This can be improved with some additional protocol complexity; we leave it as a challenge to the interested reader. The second note is that the recover-security Key Exchange protocol requires the parties to have a source of true randomness, called a True Random Bit Generator (TRBG), i.e., a source which produces random bits even if the party is broken into (and its keys exposed).
In reality, many systems only rely on pseudo-random generators (PRGs), or pseudo-random functions (PRFs), whose future values are computable using a past value or using an exposed key. In such case, it becomes critical to use also the input from the peer (NA,i or NB,i ), and these values should be also used to re-initialize the PRG, so that new nonces (NA,i , NB,i ) are pseudorandom (or truly random) and not predictable. Truly random bit generators require an appropriate hardware device, and relying on physical properties, including thermal noise and quantum phenomena [360]. 5.7.3 Stronger notions of resiliency to key exposure Forward secrecy and recover security signiőcantly improve the resiliency against key exposure. There are additional and even stronger notions of resiliency to key exposure, which are provided by more advanced Key Exchange protocols; we only cover a few of these in this textbook - speciőcally, the ones in Table 5.3. All known protocols that achieve more advanced notions of resiliency use public key cryptology, and in particular, key-exchange protocols such as the Diffie-Hellman (DH) protocol. Indeed, it seems plausible that public-key cryptography is necessary for many of these notions. This includes the important notions of Perfect Forward Secrecy (PFS) and Perfect Recover Security (PRS), which, as the names imply, are stronger variants of forward and recover security, respectively. We now brieŕy discuss PFS and PRS, to understand their advantages and why they require more than the protocols we have seen in this chapter; we discuss these notions further, with implementations, in Section 6.3. Perfect Forward Secrecy (PFS). PFS, like Forward Secrecy, also requires resiliency to exposures of state, including keys, occurring in the future. However, on top of that, PFS also requires resiliency to exposure of the previous state, again including keys, as long as this exposure occurs only after the session ends. We next deőne this notion. PFS was apparently őrst coined by Gunther [178]; unfortunately, the term is not always used with a consistent meaning and deőnition, but the following deőnition seems to capture the meaning usually used by experts. Definition 5.3 (Perfect Forward Secrecy (PFS)). A Key Exchange protocol P ensures perfect forward secrecy (PFS) if data sent during session i is confidential (indistinguishable), provided that either (1) there is not MitM attack during session i, or (2) the master key of session i and of any previous session, is not given to the adversary - or given only after session i. We discuss some PFS Key Exchange protocols in the next chapter, which deals with asymmetric cryptography (also called public-key cryptography, PKC). All known PFS protocols are based on PKC. Applied Introduction to Cryptography and Cybersecurity 314 Notion Secure key-setup Forward Secrecy (FS) Perfect Forward Secrecy (PFS) Recover Security (RS) Perfect Recover Security (PRS) CHAPTER 5. SHARED-KEY PROTOCOLS Session i is secure, when: Attacker is given session keys of other sessions, but master key is never exposed. Crypto Shared key Attacker is given all keys, but only of sessions after session i. Shared key Attacker is given all keys of all sessions except i, but only after session i ends. Public key Attacker is given keys of other sessions, but session i − 1 is secure, or no eavesdropping/MitM during session i. Shared key Attacker is given keys of other sessions, but session i − 1 is secure, or no MitM during session i. 
Public key Table 5.3: Notions of resiliency to key exposures of key-setup Key Exchange protocols. See implementations of forward and recover security in subsection 5.7.1 and subsection 5.7.2 respectively, and for the corresponding ‘perfect’ notions (PFS and PRS) in subsection 6.3.1 and subsection 6.3.2, respectively. Exercise 5.7 (Forward Secrecy vs. Perfect Forward Secrecy (PFS)). Present a sequence diagram, showing that the forward-secrecy 2PP Key Exchange protocol presented in subsection 5.7.1, does not ensure Perfect Forward Secrecy (PFS). Perfect Recover Security (PRS). We introduce the term perfect recover security to refer to Key Exchange protocols where a single session without exposure or MitM attacks suffices to recover security from previous key exposures. Deőnition follows. Definition 5.4 (Perfect Recover Security (PRS) Key Exchange). A Key Exchange protocol ensures Perfect Recover Security (PRS), if security (confidentiality and authentication) is ensured for messages exchanged during session i, provided that there is no exposure during session i and either (1) session i − 1 is secure, or (2) there is no MitM attack during session i (session i is a recovery session). Note the similarity to PFS, in allowing only eavesdropping during the ‘recovery’ session i. Similarly to PFS, we also discuss some PRS Key Exchange protocols in the next chapter, which deals with asymmetric cryptography. Known PRS protocols are all based on asymmetric cryptography. Applied Introduction to Cryptography and Cybersecurity 5.7. RESILIENCY TO EXPOSURE: FORWARD SECRECY AND RECOVER SECURITY 315 Figure 5.21: Relations between notions of resiliency to key exposures. An arrow from notion A to notion B indicates that notion A implies notion B. For example, a protocol that ensures Perfect Forward Secrecy (PFS) also ensures Forward Secrecy. Comparison of the four notions of resiliency. We compare the four notions of resiliency (forward secrecy, PFS, recover security and PRS) in Table 5.3, along with ‘regular’ secure Key Exchange protocols. We also present the relationships between the őve notions in Figure 5.21. Additional notions of resiliency. The research in cryptographic protocols includes additional notions of resiliency to key and state exposures, which we do not cover in this textbook. These include threshold security [117], which ensures that the entire system remains secure even if (up to some threshold) of its modules are exposed or corrupted, proactive security [88], which deals with recovery of security of some modules after exposures, and leakage-resiliency [138], which ensures resiliency to gradual leakage of parts of the storage. Applied Introduction to Cryptography and Cybersecurity CHAPTER 5. SHARED-KEY PROTOCOLS 316 5.8 Additional Exercises Exercise 5.8 (Attack against SNA with ‘őxed roles’). Show that the SNA handshake protocol does not ensure concurrent mutual authentication, also for a scenario where each party is only willing to act in one role, i.e., either as an initiator or as a responder, but not as both. Hint: as in Figure 5.4, the attack will involve two sessions; but you are not required that both sessions will terminate correctly - one of them may fail. Exercise 5.9. Some applications require only one party (e.g., a door) to authenticate the other party (e.g., Alice); this allows a somewhat simpler protocol. 
We describe in the two items below two proposed protocols for this task (one in each item), both using a key k shared between the door and Alice, and a secure symmetric-key encryption scheme (E, D). Analyze the security of the two protocols. 1. The door selects a random string (nonce) n and sends Ek (n) to Alice; Alice decrypts it and sends back n. 2. The door selects and sends n; Alice computes and sends back Ek (n). Repeat the question, when E is a block cipher rather than an encryption scheme. Exercise 5.10. Consider the following mutual-authentication protocol, using shared key k and a (secure) block cipher (E, D): 1. Alice sends NA to Bob. 2. Bob replies with NB , Ek (NA ). 3. Alice completes the handshake by sending Ek (NB ⊕ Ek (NA )). Show an attack against this protocol, and identify the design principles which were violated by the protocol, and which, if followed, should have prevented such attacks. Exercise 5.11 (GSM). In this exercise we study some of the weaknesses of the GSM handshake protocol, as described in Section 5.6. In this exercise we ignore the existence of multiple types of encryption and their choice (‘ciphersuite’). 1. In this exercise, and in usual, we ignore the fact that the functions A8, A3 and the ciphers Ei were kept secret; explain why. 2. Present functions A3, A8 such that the protocol is insecure when using them, against an eavesdropping-only adversary. 3. Present functions A3, A8 that ensure security against MitM adversary, assuming E is a secure encryption. Prove (or at least argue) for security. (Here and later, you may assume a given secure PRF function, f .) Applied Introduction to Cryptography and Cybersecurity 5.8. ADDITIONAL EXERCISES 317 4. To refer to the triplet of a specific connection, say the j th connection, we use the notation: (r(j), sres(j), k(j)). Assume that during connection j ′ attacker received key k(ĵ) of previous connection ĵ < j ′ . Show how a MitM attacker can use this to expose, not only messages sent during connection ĵ, but also messages sent in future connections (after j ′ ) of this mobile. 5. Present a possible fix to the protocol, as simple and efficient as possible, to prevent exposure of messages sent in future connections (after j ′ ). The fix should only involve changes to the mobile and the Visited Network, not to the home. Exercise 5.12 (Downgrade to A5/1 attack on GSM). Consider a mobile client and a visited network that both support A5/3 (or some other strong stream cipher). Present a sequence diagram showing how a MitM attacker can cause them to use the (weaker) A5/1 protocol. Exercise 5.13. Fig. 5.22 illustrates a simplification of the SSL/TLS sessionsecurity protocol; this simplification uses a fixed master key k which is shared in advance between the two participants, Client and Server. This simplified version supports transmission of only two messages, a ‘request’ MC sent by the client to the server, and a ‘response’ MS sent from the server. The two messages are protected using a session key k ′ , which the server selects randomly at the beginning of each session, and sends to the client, protected using the fixed shared master key k. Figure 5.22: Simpliőed SSL The protocol should protect the confidentiality and integrity (authenticity) of the messages (MC , MS ), as well as ‘replay’ of messages, e.g., client sends MC in one session and server receives MC on two sessions. Applied Introduction to Cryptography and Cybersecurity CHAPTER 5. SHARED-KEY PROTOCOLS 318 1. 
The field cipher_suite contains a list of encryption schemes (‘ciphers’) supported by the client, and the field chosen_cipher contains the cipher in this list chosen by the server; this cipher is used in the two subsequent messages (a fixed cipher is used for the first two messages). For simplicity consider only two ciphers, say E1 and E2, and suppose that both client and server support both, but that they prefer E2 since E1 is known to be vulnerable. Show how a MitM attacker can cause the parties to use E1 anyway, allowing it to decipher the messages MC , MS . 2. Suggest a minor modification to the protocol to prevent such ‘downgrade attacks’. 3. Ignore now the risk of downgrade attacks, e.g., assume all ciphers supported are secure. Assume that MC is a request to transfer funds from the clients’ account to a target account, in the following format: Date (3 bytes) Operation type (1 byte) Comment (20 bytes) Amount (8 bytes) Target account (8 bytes) Assume that E is CBC mode encryption using an 8-bytes block cipher. The solution should not rely on replay of the messages (which will not work since only one message is sent in each direction on each usage). Mal is a (malicious) client of the bank, and eavesdrops on a session where Alice is sending a request to transfer 10$ to him (Mal). Show how Mal can abuse his Man-in-the-Middle abilities to cause transfer of larger amount. Explain a simple fix to the protocol to prevent this attack. Exercise 5.14. Consider the following protocol for server-assisted group-sharedkey setup. Every client, say i, shares a key ki with the server. Let G be a group of users; user i ∈ G can send, at any time, a request to the server for the (fixed) group key kG ; the request consists of the list of users in G, the time t (in seconds) according to the clock of user i, and an authenticator M ACki (G, t). If the value of t is within one minute from its own clock value, the server responds by sending to i the encrypted key: xG (t) = kG + Πj∈G P RFkj (t), where Πj∈G P RFkj (t) is multiplication of the values P RFkj (t) for every user j in G, (i) including j = i. User i then computes kG = xG mod P RFki (t). 1. Draw a sequence diagram showing the operation of the protocol. 2. Let i, j be two users in group G, i.e., i, j ∈ G. Suppose user i sends request at time ti and user j sends request at time tj . Explain the conditions for them to receive the same key kG . 3. Present an attack allowing a malicious user m ̸∈ G to learn the key kG for a group it does not belong to. User m may eavesdrop to all messages, and request the key kG′ for any group (set) of users G′ s.t. m ∈ G′ . Exercise 5.15. In the GSM protocol, the home sends to the Visited Network one or more authentication triplets (r, K, s). The Visited Network and the Applied Introduction to Cryptography and Cybersecurity 5.8. ADDITIONAL EXERCISES 319 mobile are to use each triplet only for a single handshake; this is somewhat wasteful, as often the mobile has multiple connections (and handshakes) while visiting the same Visited Network. 1. Suppose a Visited Network decides to re-use the same triplet (r, K, s) in multiple handshakes, for efficiency (less requests to home). Present message sequence diagram showing that this may allow an attacker to impersonate as a client. Namely, that client authentication fails. 2. 
Suggest an improvement to the messages sent between mobile and Visited Network, that will allow the Visited Network to reuse the (r, K, s) triplet received from Visited Network, for multiple secure handshakes with the mobile. Your improvement should consist of a single additional challenge rB which the Visited Network selects randomly and sends to the mobile, together with the challenge r received in the triplet from the home; and a single response sB which the mobile returns to the server, instead of sending the response s as in the original protocol. Show the computation of sB by mobile and Visited Network: sB = . Your solution may use an arbitrary pseudo-random function P RF . 3. GSM sends frames (messages) of 114 bits each, by bit-wise XORing the nth plaintext frame with 114 bits output from A5/iK (n). Here, A5/i, for i = 1, 2, . . . , is a cryptographic function, n is the frame number, and K was a key received from the home. A5/1 and A5/2 are described in the specifications - and both are known to be vulnerable; other functions can be agreed between mobile and Visited Network. Both A5/1 and A5/2 are insecure; for this question, assume the use of a secure cipher, say A5/5. Suppose, again, that a Visited Network decides to re-use the same triplet (r, K, s) in multiple handshakes. A mobile has two connections to the Visited Network, sending message m1 in the first connection and message m2 in the second connection. Assume that the Visited Network re-uses the same triplet (r, K, s) in both connections, and that the attacker knows the contents of m1 . Show how the attacker can find m2 . Note: the improvement suggested in the previous item (rB , sB ) does not have significant impact on this item - you can solve with it or without it. 4. To prevent the threat presented in the previous item, the mobile and Visited Network can use a different key K ′ = (instead of using K). 5. Design a Visited Network-only forward secrecy improvement to A5/5. Namely, even if attacker is given access to the entire memory of the Visited Network after the j th handshake using the same r, the attacker would still not be able to decipher information exchanged in past connections. Your design may send the value of j together with r from Visited Network to mobile, and may change the stored value of s at the end of every handshake; let sj denote the value of s at the j th handshake, where the initial value is Applied Introduction to Cryptography and Cybersecurity CHAPTER 5. SHARED-KEY PROTOCOLS 320 s received from the home (i.e., s1 = s). Your solution consists of defining the value of sj given sj−1 , namely: sj = . Exercise 5.16 (GSM). Many GSM mobile phones use an encryption algorithm referred to as A5/3, when supported by the visited network, since it is considered more secure than A5/1 (and certainly more than A5/2, which was discontinued). The MAL organization records millions of A5/3 encrypted connections by different people ‘of interest’. MAL cryptanalysts find an effective attack against the GSM A5/1 algorithm; the attack exposes the key in few minutes, requiring only one ciphertext message. Suppose now Alice tries to communicate using her mobile and the GSM protocol, and the connection setup is intercepted by MAL. Show a sequence diagram showing how MAL may use this inteception connection attempt and the attack found against A5/1, to decrypt prior GSM communication by Alice, which was encrypted using A5/3. Exercise 5.17. 
Consider the following key establishment protocol between any two users with an assistance of a server S, where each user U shares a secret key KU S with a central server S. A → B : (A, NA ) B → S : (A, NA , B, NB , EKBS (A + + NA )) S → A : (A, NA , B, NB , EKAS (NA + + sk), EKBS (A + + sk), NB ) + sk)) A → B : (A, NA , B, NB , EKBS (A + Assume that E is an authenticated encryption. Show an attack which allows an attacker to impersonate one of the parties to the other, while exposing the secret key sk. Exercise 5.18 (Hashing vs. Forward Secrecy). We discussed in §5.7.1 the use of PRG or PRF to derive future keys, ensuring Forward Secrecy. Could a cryptographic hash function be securely used for the same purpose, as in κi = h(κi−1 )? Evaluate if such design is guaranteed to be secure, when h is a (1) CRHF, (2) OWF, (3) bitwise-randomness extracting. Exercise 5.19 (PFS deőnitions). Below are informal definitions for PFS from the literature. Compare them to our definitions for PFS: are they equivalent? Are they ‘weaker’ - a protocol may satisfy them yet not be PFS as we define, or the other way around? Or are they incomparable (neither is always weaker)? Can you give an absurd example of a protocol meeting the definition, which is ‘clearly’ not sensible to be claimed to be PFS? Any other issue? From Wikipedia, [390] An encryption system has the property of forward secrecy if plain-text (decrypted) inspection of the data exchange that occurs during key agreement phase of session initiation does not reveal the key that was used to encrypt the remainder of the session. Applied Introduction to Cryptography and Cybersecurity 5.8. ADDITIONAL EXERCISES 321 From [279, 309] A protocol has Perfect Forward Secrecy (PFS) if the compromise of long-term keys does not allow an attacker to obtain past session keys. Applied Introduction to Cryptography and Cybersecurity Chapter 6 Public Key Cryptography As we discussed in subsection 1.6.1, cryptography has been applied for over two millennia. However, until relatively recently, cryptography was always based on the use of symmetric keys. In particular, in Chapter 2, we studied symmetric cryptosystems, also called shared-key cryptosystems, which use the same key k for encryption (c ← Ek (m)) and for decryption (m ← Dk (c)); see in Figure 1.4. Similarly, in Chapter 4, we focused on shared-key (symmetric) Message Authentication Code (MAC) schemes, which also used only one key k to compute the authenticator (tag) that we will send with a message to prove its authenticity, and later to compute the authenticator for a message received and conőrm it is identical to the one received with the message, proving its authenticity. This was changed quite dramatically by the publication, in 1976, of [123], a seminal paper by Diffie and Hellman introducing public key cryptography, also known as asymmetric cryptography. Asymmetric cryptography is built on the idea that we may use different keys for different functions, e.g., for encryption and for decryption. Of course, the keys may be related; for example, if we use a key e to encrypt, and a key d to decrypt, the pair (e, d) should be related to properly retrieve the plaintext: m = Dd (Ee (m)). The advantage in asymmetric cryptography is that, for many applications, one key can be public, and only the other kept private. 
For example, Alice can publish her encryption key A.e, allowing everyone to encrypt messages to her by computing c = EA.e (m), but only Alice knows the corresponding decryption key A.d such that m = DA.d (c). We refer to such asymmetric cryptosystems as Public Key Cryptosystem (PKC); see illustration in Figure 1.5. In [123], Diffie and Hellman identiőed three types of public-key schemes: public-key cryptosystem, digital signatures and key exchange. They also presented a design, but only for the DH key exchange protocol. This discovery of the revolutionary concept of asymmetric cryptography is recreated in Figure 6.11 . In this chapter, we introduce public key cryptography. We begin, in the following section, with a brief introduction to public key cryptography. We 1 Thanks to Whit and Marty for blessing this invented dialog. 323 324 CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY Figure 6.1: The discovery of Public-Key Cryptography by Whitőeld Diffie and Martin Hellman. then discuss key-exchange protocols, and later also public-key cryptosystems, mainly El-Auth-h-DH and RSA. We already discussed signature schemes and their security in subsection 1.5.1, but in subsection 6.6.1 we discuss the speciőc case of RSA-based public key signatures. 6.1 Introduction to PKC The basic observation leading to asymmetric cryptography is quite simple in hindsight: security requirements are asymmetric. For example, to protect conődentiality, an encryption scheme should prevent an attacker from decrypting ciphertext, requiring the key used for decryption to be secret. However, conődentiality is not broken if the attacker can encrypt messages. In fact, in Deőnition 2.9 of security against chosen plaintext attack (CPA), we allow the adversary to encrypt plaintexts without any restriction; in contrast, we do not facilitate decrypting ciphertexts. Therefore, if for a given cryptosystem, the encryption key does not allow decryption, and, in particular, does not expose the decryption key, then security under Deőnition 2.9 would imply security under a similar deőnition where the adversary is given the encryption key. We őrst discuss the three basic types of public key schemes introduced in [123], which are still the most important types of public key schemes: public key cryptosystem (PKC), digital signature schemes, and key exchange protocols. 6.1.1 Public key cryptosystems Public key cryptosystems (PKC) are encryption schemes consisting of three algorithms, (KG, E, D), which use a pair of keys: a public key e for encryption, and a private key d for decryption. Both keys are generated by the key generation algorithm KG. The encryption key is not secret; namely, we assume that it is known to the attacker. Applied Introduction to Cryptography and Cybersecurity 6.1. INTRODUCTION TO PKC 325 Let us now deőne a public key cryptosystem, similarly to Deőnition 2.1 for shared-key cryptosystems. As in Deőnition 2.1, we require correctness, i.e., that decryption of an encrypted message will recover that message. One notable difference is that a public key cryptosystem includes a key generation algorithm KG, since the encryption and decryption keys must be related - we cannot just choose them at random. Definition 6.1 (Public-key cryptosystem (PKC)). 
A public-key cryptosystem (PKC) is a triplet of (probabilistic) algorithms (KG, E, D) and a set M (of plaintext messages), ensuring correctness, i.e., for every message m ∈ M and key-pair (e, d) ←$ KG(1^l):

D_d(E_e(m)) = m    (6.1)

See the illustration of a public key cryptosystem in Figure 1.5. Other terms for public key cryptosystems include asymmetric cryptosystems and public key encryption schemes. We will try to stick to the term 'public key cryptosystem', often using just the acronym PKC. We further discuss public key cryptosystems in sections 6.4.2 and 6.5.

6.1.2 Signature schemes

We introduced signature schemes in subsection 1.2.3; in particular, see subsection 1.5.1 and the illustration in Figure 1.7. Let us quickly recall them here. Signature schemes consist of three algorithms, (KG, S, V), for Key Generation, Signing and Verifying, respectively. Key Generation (KG) is a randomized algorithm that outputs a pair of correlated keys: a private signing key s for signing a message, and a public validation key v for validating a given signature on a given message. The validation key v is not secret: it should only allow validation of authenticity, and should not facilitate signing.

Both signature schemes and Message Authentication Code (MAC) functions are used for authentication of messages, which is based on the unforgeability requirement (subsection 1.5.1). The difference is that MAC functions use a single secret key k for authenticating and for validating messages, while signature schemes use a distinct private signing key s for signing (authenticating), and a distinct public verification key v for validating authenticity. The correctness requirement of signature schemes is also similar to the one for MAC schemes (Section 4.3), namely, for security parameter 1^l, message m and key-pair (s, v) ←$ KG(1^l):

V_v(m, S_s(m)) = True    (6.2)

We presented constructions for one-time signatures in subsection 3.4.2. In Section 6.6, we discuss constructions of 'regular' signature schemes, i.e., schemes which may be used to sign an arbitrary number of messages.
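As a concrete illustration of the (KG, S, V) interface, the sketch below uses the Ed25519 signature scheme via the third-party Python package cryptography; the choice of scheme and library here is ours, for illustration only (RSA-based signatures are discussed in subsection 6.6.1).

```python
# pip install cryptography   (third-party package, used here only as an example)
from cryptography.hazmat.primitives.asymmetric import ed25519
from cryptography.exceptions import InvalidSignature

# KG: generate a (private signing key, public verification key) pair.
signing_key = ed25519.Ed25519PrivateKey.generate()
verification_key = signing_key.public_key()

# S: sign with the private signing key.
message = b"transfer $10 to Bob"
signature = signing_key.sign(message)

# V: anyone holding the public verification key can validate (cf. Eq. 6.2);
# verify() raises InvalidSignature if the signature or message was modified.
try:
    verification_key.verify(signature, message)
    print("valid")
except InvalidSignature:
    print("forged or modified")
```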
Or, a combined protocol can be used both to share the master key and then to share session-keys based on the master key; in particular, this is done by the widely-used TLS protocol, which we study in Chapter 7.

The basic operation and goals of unauthenticated public-key-based key exchange protocols are illustrated in Figure 6.2, focusing on the case where the protocol involves only two flows: the first from Alice to Bob, and the second from Bob to Alice. To motivate the unauthenticated scenario, think of Alice and Bob meeting in public and wanting to establish secure communication between them; however, since it is a public place, their discussion may be overheard, e.g., by Eavesdropping Eve. The key exchange protocol would allow them to establish a shared secret key, known only to the two of them, in spite of the possible eavesdropping. Notice that during the run of the protocol, we allow the adversary (Eavesdropping Eve) only to eavesdrop on the communication between the parties; such an adversary cannot modify or inject messages between the parties.

Defining two-flow unauthenticated public-key-based key exchange protocols. Let us now define, informally, an unauthenticated public-key-based key exchange protocol; for simplicity, let us focus on protocols using two flows, as in Figure 6.2. Such a key exchange protocol can be defined by a pair of efficient probabilistic algorithms, (KG, KC) (for key-generation and key-combining, respectively). The key-generation algorithm KG receives as input the security parameter, in unary, 1^l, and outputs a pair of strings, e.g., (a, P_A); we refer to the first (a) as the private key and to the second (P_A) as the public key. The security parameter can be the same as, or related to, the effective key length of the protocol; the actual length of public keys is typically significantly longer, see subsection 6.1.5.

[Figure 6.2 shows Alice computing (P_A, a) ← KG(1^l) and Bob computing (P_B, b) ← KG(1^l); Alice sends P_A, Bob responds with P_B, while Eve eavesdrops on both flows; Alice outputs k_{A,B} ← KC(a, P_B) and Bob outputs k_{B,A} ← KC(b, P_A). Goals: correctness, i.e., both parties derive the same key (k_{A,B} = k_{B,A}), and indistinguishability, i.e., Eve cannot distinguish k_{A,B} from random.] Figure 6.2: Operation of an arbitrary two-flow unauthenticated public-key based key exchange protocol, such as the Diffie-Hellman protocol (presented later). Such a protocol is defined by two efficient probabilistic algorithms: Key Generation (KG), to generate (private, public) key-pairs, and Key Combining (KC), to combine a party's private key with the public key received from its peer into the shared key, e.g., k = KC(a, P_B). An unauthenticated key exchange protocol should be secure against an eavesdropping adversary Eve, but is vulnerable to a Man-in-the-Middle adversary.

The key-combining function KC is run by each party, and receives as input a public value (from the other party) and a private value (of the party running KC); the output of KC is used as the key shared between the two parties. The keys derived by Alice and Bob should be the same, i.e., KC(a, P_B) = KC(b, P_A).

A key exchange protocol should ensure correctness and indistinguishability. The correctness requirement is that both parties will derive the same key. More precisely, for every security parameter 1^l, the following should hold. Let (a, P_A) ← KG(1^l) and (b, P_B) ← KG(1^l) be two key-pairs generated, using the key-generation algorithm KG with security parameter 1^l, for Alice and Bob respectively. Then we have:

    KC(a, P_B) = KC(b, P_A)    (6.3)

Namely, applying the key-combining algorithm KC to combine Alice's private key a with Bob's public key P_B results in the same symmetric key as the one resulting from combining Bob's private key b with Alice's public key P_A.
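The pair (KG, KC) can be viewed as a small programming interface. The sketch below is our illustration, not from the text: it writes the interface as Python type signatures, together with a check of the correctness requirement of Equation (6.3). A concrete instantiation, the Diffie-Hellman protocol, is sketched later in the chapter.

```python
# A sketch of the two-flow key-exchange interface (KG, KC) described above.
# The type names and test harness are illustrative; a concrete instantiation
# (the Diffie-Hellman protocol) is sketched later in this chapter.
from typing import Any, Callable, Tuple

KeyGen = Callable[[int], Tuple[Any, Any]]   # KG: security parameter -> (private key, public key)
KeyComb = Callable[[Any, Any], Any]         # KC: (own private key, peer's public key) -> shared key

def check_correctness(kg: KeyGen, kc: KeyComb, sec_param: int, trials: int = 100) -> bool:
    """Check Equation (6.3): KC(a, P_B) == KC(b, P_A) for independently generated key pairs."""
    for _ in range(trials):
        a, pub_a = kg(sec_param)   # Alice's key pair (a, P_A)
        b, pub_b = kg(sec_param)   # Bob's key pair   (b, P_B)
        if kc(a, pub_b) != kc(b, pub_a):
            return False
    return True
```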
Indistinguishability requires, intuitively, that an eavesdropping adversary, who 'sees' P_A and P_B, cannot learn anything about the shared key; equivalently, it requires that the adversary cannot distinguish between being given randomly-generated P_A, P_B and the key derived from them, versus being given randomly-generated P_A, P_B and a random string of the same length as the key. The following definition states this requirement more precisely.

Definition 6.2 (The indistinguishability requirement). Let (KG, KC) be a key-exchange protocol, and A be an efficient (PPT) adversary. We say that (KG, KC) ensures key-indistinguishability if for every PPT adversary A and for sufficiently-large security parameter 1^l, holds:

    | Pr[ A(P_A, P_B, KC(a, P_B)) = 1 ] - Pr[ A(P_A, P_B, r) = 1 ] | ∈ NEGL(1^l)    (6.4)

where, in both probabilities, (a, P_A) ← KG(1^l) and (b, P_B) ← KG(1^l), and where r ← {0,1}^{|KC(a, P_B)|} is a uniformly random string of the same length as the key.

6.1.4 Advantages of Public Key Cryptography (PKC)

Public key cryptography is not just a cool concept; it is very useful, allowing solutions to problems which symmetric cryptography fails to solve, and making it easier to solve other problems. We first identify three important challenges which require the use of asymmetric cryptography:

Signatures provide evidence. Only the owner of the private key can digitally sign a message, but everyone can validate this signature. This allows a recipient of a signed message to know that, once he validated the signature, he has the ability to convince other parties that the message was signed by the sender. This is impossible using (shared-key) MAC schemes, and allows many applications, such as signing an agreement, payment order or recommendation/review. An important special case is signing a public key certificate, linking an entity and its public key.

Security without assuming shared key. Using public key cryptography, we can establish secure communication between parties, without requiring them to previously share a secret key between them, or to share a secret key and communicate with an additional party (such as a KDC, see Section 5.5). One method to do so is to use an unauthenticated key-exchange protocol; this is secure if the attacker has only eavesdropping capabilities during the exchange (it is not secure against a MitM attacker). Another alternative is when one party (e.g., the client) knows, or can securely receive, the public key of the other party (e.g., the server); in this case, the client can encrypt a shared key and send it to the server. To allow a party, e.g., the client (Alice), to validate the public key of the other party, e.g., the server (Bob), we can send the public key P_B signed by a trusted party. We refer to the signed public key as a public key certificate; public key certificates are a very important aspect of applied cryptography, and we discuss them extensively in Chapter 8.

Stronger resiliency to exposure.
In Section 5.7 we discussed the goal of resiliency to exposure of secret information, in particular, of the ‘master Applied Introduction to Cryptography and Cybersecurity 6.1. INTRODUCTION TO PKC 329 key’ of shared-key key-setup protocols, and presented the forward secrecy key-setup handshake. In subsection 5.7.3, we also brieŕy discussed some stronger resiliency properties, including Perfect Forward Secrecy (PFS), Threshold security and Proactive security. Designs for achieving such stronger resiliency notions are all based on public key cryptography; we discuss these in Section 6.3. Public key cryptography (PKC) also makes it easier to design and deploy secure systems. Speciőcally: Easier key distribution: public keys are easier to distribute, since they can be given in a public forum (such as directory) or in an incoming message; note that the public keys still need to be authenticated, to be sure we are receiving the correct public keys, but there is no need to protect their secrecy. Distribution is also easier since each party only needs to distribute one (public) key to all its peers, rather than setting up different secret keys, one per each peer. Easier key management: public keys are easier to maintain and use, since they may be kept in non-secure storage, as long as they are validated before being used. Less keys: Only one public key is required for each party, compared to a total of n·(n−1) = O(n2 ) shared keys required for each pair of n entities. 2 Namely, we need to maintain - and refresh - less keys. Considering all these advantages, one may wonder why not always use public key cryptography. The reason is that there is also a price to the use of PKC as we next discuss. 6.1.5 The price of PKC: assumptions, computation costs and length of keys and outputs With all the advantages listed above, it may seem that we should always use public key cryptography. However, PKC has three signiőcant drawbacks: computation time, key-length and potential vulnerability. We discuss these in this subsection. All of these drawbacks are due to the fact that when attacking a PKC scheme, the attacker has the public key which corresponds to the private key. The private key is closely related to the public key - for example, the private decryption key ‘reverses’ encryption using the public key; yet, the public key should not expose (information about) the private key. It is challenging to come up with a scheme that allows this relationship between the encryption and decryption keys, and yet where the public key does not expose the private key. In fact, as discussed in Section 1.6, the concept of PKC was ‘discovered’ twice! Considering the challenge of designing asymmetric cryptosystems, it should not be surprising that all known public-key schemes have considerable drawbacks Applied Introduction to Cryptography and Cybersecurity 330 CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY compared to the corresponding shared-key (symmetric) schemes. There are two types of drawbacks: overhead and required assumptions. PKC assumptions and quantum cryptanalysis Applied PKC algorithms, such as RSA, DH, El-Gamal and elliptic-curve PKCs, all rely on speciőc computational assumptions, mostly on the hardness of speciőc number-theoretic problems, mainly two: factoring and discrete logarithm. These speciőc hardness assumptions, and several others, are usually considered well-founded. This is due to the extensive efforts of mathematicians and other experts to őnd efficient algorithms for these problems. 
In particular, factoring and discrete logarithms have been studied for many years, long before their use for PKC was proposed; and efforts increased by far as PKC became known and important. However, it is certainly conceivable that an efficient algorithms exists - and would someday be found. Such a discovery may even occur suddenly and soon - such unpredictability is the nature of algorithmic and mathematical breakthroughs. In particular, a recent draft [352] presented a new factoring method, which claimed to be fast enough to be practical for signiőcant keylengths, and speciőcally to ‘destroy the RSA cryptosystem’. As the time of writing, this draft was withdrawn, and may be incorrect; but it will not be shocking if such an algorithm were to be found, indicating that RSA security may be considerably less than currently estimated, requiring the use of either longer keys or other schemes. Furthermore, since all of the widely-used PKC algorithms are so closely related, it is even possible that some, potentially related, advances in cryptanalysis would apply to all of them - leaving us without any practical PKC algorithm. PKC algorithms are the basis for the security of many systems and protocols; if suddenly there were no viable, practical and unbroken PKC, that would be a major problem. And if all that is not alarming enough, efficient algorithms to solve both the factoring and the discrete logarithm problems are known, requiring an appropriate quantum computer. There has been many efforts to develop quantum computers, with signiőcant progress - but results are still far from the ability to cryptanalyze these PKC schemes, when used with key-lengths which are considered secure (against known attacks, using standard computing devices). However, that may change with improvements in quantum computing. Cryptographers work hard to identify additional candidate PKC systems, which will rely on other, ‘independent’ or - ideally - more general assumptions, as well as schemes which are secure even if large-scale quantum computing becomes feasible, which are referred to as post-quantum cryptography. We discuss the impact of quantum computing on cryptography, including both use for cryptanalysis and development of post-quantum cryptography, in Section 10.4. One particularly interesting approach to the development of cryptographic schemes robust to advances in algorithms for speciőc problems, is the design of PKC schemes based on lattice problems. Lattice problems seem resilient to Applied Introduction to Cryptography and Cybersecurity 6.1. INTRODUCTION TO PKC 331 quantum-computing; furthermore, some of the results in this area have proofs of security based on the general and well-founded complexity assumption of NP-completeness. Details are beyond our scope; see, e.g., [16, 315]. PKC overhead: key-length and computation. Another drawback of asymmetric cryptography, is that all of the proposed schemes - deőnitely, all proposed schemes which were not broken - have much higher overhead, compared to the corresponding shared-key schemes. There are two main types of overhead: computation time and key-length. The system designers choose the key-length of the cryptosystems they use, based on the sufficient effective key length principle (principle 5). These decisions are based on the perceived resources and motivation of the attackers, on their estimation or bounds of the expected damages due to exposure, and on the constraints and overheads of the relevant system resources. 
Finally, a critical consideration are the estimates of the required key length for the cryptosystems in use, based on known and estimated future attacks. Such estimates and recommendations are usually provided by the experts proposing new cryptosystems, and then revised and improved by experts and by different standardization and security organizations, which publish key-length recommendations. We present three well-known recommendations in Table 6.1. These recommendations are marked in the table as LV'01, NIST2014 and BSI'17, and were published, respectively, in a seminal 2001 paper by Lenstra and Verheul [261], by NIST in 2014 [27] and by the German BSI organization in 2017 [86]. See these and much more online at [163].

Recommendations are usually presented with respect to a particular year in which the ciphertexts are to remain confidential (the three rows for 2020, 2030 and 2040 in Table 6.1). Experts estimate the expected improvements in the cryptanalysis capabilities of attackers over the years, due to improved hardware speeds, reduced hardware costs, reduced energy costs (due to improved hardware), and, often more significantly but hardest to estimate, improvements in methods of cryptanalysis. Such predictions cannot be made precisely, and hence, recommendations differ, sometimes considerably.

                   Symmetric Cryptography   Factoring (RSA),         Elliptic curves (ECIES)
                                            Discrete-log (DH)
 Estimation Year   LV     NIST    BSI       LV      NIST    BSI      LV     NIST    BSI
                   2002   2014    2017      2002    2014    2017     2002   2014    2017
 2020              86     112     128       1881    2048    2000     161    224     250
 2030              93     112     128       2493    2048    3000     176    224     250
 2040              101    128     128       3214    3072    3000     191    256     250
 Crypto++ [110]    4.5·10^9 bytes/sec       3·10^5 bytes/sec         3·10^4 bytes/sec
                   128-bit AES              2048-bit RSA/DH          256-bit ECIES

Table 6.1: Comparison of key length and computing time for asymmetric and symmetric cryptography. The table shows three recommendations for key-length, in bits, required for confidentiality against 'commercial' adversaries. The recommendations are given for widely-deployed public-key (asymmetric) and shared-key (symmetric) cryptosystems; the rows refer to the year in which confidentiality is to be preserved (2020, 2030 and 2040). The recommendations are based on predictions of advances in both computing power and cryptanalysis. The LV recommendations are from a 2002 paper [261], the NIST recommendations are from a 2014 publication [27] and the BSI values are from a 2017 publication [86]. The bottom row compares the performance of the schemes for the Crypto++ implementation, based on [110].

Table 6.1 presents the recommendations for typical, important cryptosystems (in columns two to four). Column two presents the recommendations for a symmetric cryptosystem such as AES. The recommendations for symmetric cryptosystems are not limited to AES; they apply to any symmetric (shared-key) cryptosystem. They only require that the best attacks against the system are generic attacks such as exhaustive search (subsection 2.3.1) or table lookup (subsection 2.3.2); symmetric cryptosystems against which there is a more effective attack are typically considered insecure and avoided. Column three presents the recommendations for RSA and El-Gamal, the two oldest and most well-known public-key cryptosystems; we discuss both cryptosystems in sections 6.5 and 6.4.2. This column also applies to the Diffie-Hellman (DH) key-exchange protocol; in fact, the El-Gamal cryptosystem is essentially a variant of the DH protocol, as we explain in subsection 6.4.2. RSA
and El-Gamal/DH are based on two different number-theoretic problems: the factoring problem (for RSA) and the discrete-logarithm problem (for DH/El-Gamal); but the best-known attacks against both are related, with running time which is exponential in half the key-length. We briefly discuss these problems in subsection 6.1.7.

The fourth column of Table 6.1 presents the recommendations for elliptic-curve based public-key cryptosystems such as ECIES. As the table shows, the recommended key-lengths for elliptic-curve based public-key cryptosystems are, quite consistently, much lower than the recommendations for the 'older' RSA and El-Gamal/DH systems; this makes them attractive in applications where longer keys are problematic, due to storage and/or communication overhead. We do not cover elliptic-curve cryptosystems in this textbook; these are covered in other courses and books, e.g., [16, 187, 370].

Table 6.1 shows that the required key-length is considerably higher for public-key schemes, compared to shared-key (symmetric) schemes. Symmetric cryptography requires only about half of the key-length required by elliptic-curve cryptosystems, and only about 5% of the key-length required, for the same level of security, when using the RSA and DH public-key schemes. The lower key-length recommendations for elliptic-curve cryptography make these schemes attractive in the (many) applications where key-length is critical, such as when communication bandwidth and/or storage are limited.

The bottom row of Table 6.1 compares the running time of implementations of AES with a 128-bit key in counter (CTR) mode, RSA with 1024 and 2048-bit keys, and the 256-bit ECIES elliptic-curve cryptosystem. We see that the symmetric cryptosystem (AES) is many orders of magnitude faster: it supports about 4.5·10^9 bytes/second, compared with about 3·10^5 bytes/second for the comparably-secure 2048-bit RSA, and less than 3·10^4 bytes/second for ECIES. We used the values reported for one of the popular cryptographic libraries, Crypto++ [110].

The minimize use of PKC principle. In this subsection we have seen several serious concerns with the use of asymmetric (public key) cryptography. First, practical, deployed public-key cryptographic algorithms are secure only under specific assumptions - which have held for many years, true, but may still be broken, e.g., by new, faster factoring algorithms [352]. Second, applied public-key techniques may be vulnerable to further improvements in quantum computing. Finally, as Table 6.1 shows, asymmetric (public key) cryptography has much higher overhead compared to symmetric cryptography. From all of this, we conclude the following principle:

Principle 12 (Minimize use of public-key cryptography). Designers should avoid, or, where absolutely necessary, minimize the use of public-key cryptography.

In particular, consider that typical messages are much longer than the size of inputs to the public-key algorithms. If we ignored the high costs of asymmetric cryptography, we could split the input into 'blocks' whose size is the allowed input-size of the public-key algorithms, and then use 'modes of operation', like those presented for encryption and MAC, to apply the public-key algorithms to multiple blocks. However, the resulting computation costs would have been absurd.
Even more absurd, although theoretically possible, would be to modify the public-key operation to directly support longer inputs. Luckily, there are simple and efficient solutions, to both encryption and signatures, which are used essentially universally, to apply these schemes to long, typically Variable Input Length (VIL), messages: Signatures: use the Hash-then-Sign (HtS) paradigm, see subsection 3.2.6. Encryption: use the hybrid encryption paradigm, see the following subsection (subsection 6.1.6). 6.1.6 Hybrid Encryption The huge performance overhead of asymmetric cryptosystems implies that they are typically used mainly when the parties do not share a symmetric key. Furthermore, even when the parties do not share a symmetric key, we usually do not use directly the widely-used, ‘classical’ asymmetric cryptosystems (KGA , E A , DA ), e.g., RSA. Instead, we usually combine such ‘classical’ asymmetric cryptosystem (KGA , E A , DA ), with an efficient symmetric cryptosystem (E S , DS ), e.g., AES. Namely, we construct a new, hybrid asymmetric cryptosystem, which we denote (KGH , E H , DH ). Note our use of mnemonic Applied Introduction to Cryptography and Cybersecurity 334 CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY superscripts to distinguish between the three cryptosystems: A for the ‘classical’ asymmetric cryptosystem (KGA , E A , DA ), S for the symmetric cryptosystem (E S , DS ) and H for the hybrid cryptosystem (KGH , E H , DH ). Hybrid encryption is to obtain the beneőts of asymmetric (public key) cryptosystems, yet with much reduced overhead, for the typical case where the plaintext is much longer than the input size of the ‘classical’ public-key encryption. For example, when using RSA with 4000-bit keys, the input size must be less than 4000 bits; with hybrid encryption, we can encrypt much longer messages, with only a single (4000-bit) RSA encryption, plus the number of symmetric-key operations required to encrypt the plaintext. Almost universally, hybrid encryption is achieved using the simple construction illustrated in Figure 6.3. Note that Figure 6.3 shows only the encryption and decryption processes of the hybrid encryption scheme; this is since in this common construction, the hybrid encryption scheme (KGH , E H , DH ) uses the same key-generation function as that of the underlying asymmetric encryption scheme, i.e., KGH (1l ) = KGA (1l ). Let us explain the hybrid encryption and decryption processes, as illustrated in Figure 6.3. Figure 6.3: Hybrid encryption (KGH , E H , DH ), deőned as a combination of an asymmetric (public key) cryptosystem (KGA , E A , DA ) with shared key cryptosystem E S , DS ), to allow efficient public key encryption of long message m using public key e and security parameter 1l . The hybrid encryption process EeH (m). We now explain how we perform hybrid encryption, given message m and using the public key e, as illustrated $ in Figure 6.3. We őrst select a random l-bit symmetric key k ← {0, 1}|e| ; note that, for simplicity, we use the length of the public key as the length of the shared key too (in practice, a shorter key usually suffices). We then use this key k to encrypt the message using symmetric encryption, i.e., compute the cipher-message cM = EkS (m). Finally, we then use the public-key encryption, to encrypt the symmetric key k, i.e., compute the cipher-key cK , as: cK = EeA (k). The ciphertext of the hybrid encryption is the pair of cipher-message and Applied Introduction to Cryptography and Cybersecurity 6.1. 
INTRODUCTION TO PKC cipher-key, i.e., (cM , cK ). More formally, we deőne:   $ k ← {0, 1}|e| EeH (m) ← (cM , cK ) where  cM ← EkS (m)  cK ← EeA (k) 335 (6.5) The hybrid decryption process DdH ((cM , cK )). We now explain how we perform hybrid decryption, given the private key d and the ciphertext, which should be a pair (cM , cK ) of cipher-message and cipher-key. The hybrid decryption process is illustrated in Figure 6.3. We őrst decrypt the symmetric key, by: k ← DdA (cK ), and then use this key k to decrypt the message, using the symmetric decryption function: m ← DkS (cM ). More formally, we deőne the hybrid decryption process as: DdH ((cM , cK )) ← DkS (cM ), where k ← DdA (cK ) (6.6) Exercise 6.1. Prove, or present counterexample to, the following claim: if the asymmetric (KGA , E A , DA ) and symmetric (E S , DS ) cryptosystems ensure correctness (per definitions 6.1 and 2.1, respectively), then the hybrid cryptosystem (KGH , E H , DH ), defined as above, is also an asymmetric cryptosystem (PKC) that ensures correctness. 6.1.7 The Factoring and Discrete Logarithm Hard Problems As discussed in Section A.1, cryptography, and in particular public-key cryptography is based on the theory of complexity, and speciőcally on (computationally) hard problems. Intuitively, a hard problem is a family of computational problems, with two properties: Easy to verify: there is an efficient (PPT) algorithm to verify solutions. Hard to solve: there is no known efficient algorithm that solves the problem (with signiőcant probability). We refer here to known algorithms; it is unreasonable to expect a proof that there is no efficient algorithm to a problem for which there is an efficient (PPT) veriőcation algorithm. The reason for that is that such a proof would also solve the most important, fundamental open problem in the theory of complexity, i.e., it would show that N P ̸= P . See Section A.1 or relevant textbooks, e.g., [165]. Intuitively, public key schemes use hard problems, by having the secret key provide the solution to the problem, and the public key provide the parameters to verify the solution. To make this more concrete, we brieŕy discuss factoring and discrete logarithm, the two hard problems which are the basis for many public key schemes, including the oldest and most well known: RSA, DH, El-Gamal. For more in-depth discussion of these and other schemes, see courses and books on cryptography, e.g., [370]. Note that while so far, known attacks are equally effective against both systems (see Table 6.1), there is not yet a proof that an efficient algorithm for one problem implies an efficient algorithm against the second. Applied Introduction to Cryptography and Cybersecurity 336 CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY Note: both the factoring and the discrete logarithm problems are well-known problems from the domain of number theory. Properly understanding these problems, as well as the operation and security of the public key cryptosystems we present which are related to these problems, may require familiarity with some basic notions of number theory. These are summarized in Section A.2. Factoring. The factoring problem is one of the oldest problems in algorithmic number theory, and is the basis for RSA and other cryptographic schemes. Basically, the factoring problem involves őnding the prime divisors (factors) of a large integer. However, most numbers have small divisors - half of the numbers divide by two, third divide by three and so on... 
This allows efficient number sieve algorithms to factor most numbers. Therefore, the factoring hard problem refers speciőcally to factoring of numbers which have only large prime factors. For the RSA cryptosystem, in particular, we consider factoring of a number n computed as the product of two large random primes: n = pq. The factoring hard problem assumption is that given such n, there is no efficient algorithm to factor it back into p and q. Veriőcation consists simply of multiplying p and q, or, if given only one of the two, say p, of dividing n by p and conőrming that the result is an integer q with no residue. Discrete logarithm. The discrete logarithm problem is another important, well-known problem from algorithmic number theory - and the basis for the DH (Diffie-Hellman) key-exchange protocol, the El-Gamal cryptosystem, ellipticcurve cryptography, and additional cryptographic schemes. Discrete logarithms are deőned for a given cyclic group G and a generator g of G; see background in subsection A.2.4. Given a generator g of a őnite cyclic group G, and an element x ∈ G, an integer y is called the discrete logarithm of x over G with respect to g, if x = g y . Note that multiplication (and exponentiation) are done using the group operation of G; e.g., for the modulo p group Z∗p ≡ {1, 2, . . . , p − 1}, we require x ≡ g y ( mod p). Discrete logarithms are similar to the ‘regular’ logarithm function logb (x) over the real numbers R, which returns the number y ∈ R s.t. y = by . Discrete logarithms, unlike ‘regular’ logarithms, are computed over the őnite cyclic group G rather than over the real numbers, and use the group operation of G rather than multiplication over the real numbers. Intuitively, an algorithm that outputs a discrete logarithm a given an element x ∈ G and the generator g is said to solve the discrete logarithm problem for G. We say that the discrete logarithm problem is hard for őnite cyclic group G, if there is no efficient (PPT) algorithm A that solves the discrete logarithm problem for G (with signiőcant probability of success). This is in contrast to the logarithm function over the real numbers, which is efficiently computable. Note, however, that for any group G, it only requires an exponentiation to verify whether x = g y , and exponentiation can be computed quite efficiently. Applied Introduction to Cryptography and Cybersecurity 6.1. INTRODUCTION TO PKC 337 This discussion is only intuitive, since we did not clearly deőne the input of the algorithm A. This may suffice for most readers; however, for interested readers, we present also a precise deőnition of the discrete logarithm problem. In this deőnition, we consider a PPT algorithm Gen which receives, as input, a security parameter 1l , and generates (outputs) the generator g and the order q of G. Definition 6.3 (The discrete logarithm problem). Let Gen be a PPT algorithm that, on input 1l , outputs (g, q) such that {1, g, . . . , g q } is a cyclic group (using a given group operation). We say that the discrete logarithm problem is hard for groups generated by Gen, if for every PPT algorithm A holds: h  l i $ Pr (g, q) ← Gen 11 ; a ← {1, . . . , q} : a = A(g a ) ∈ N EGL(1l ) (6.7) In practical cryptography, the discrete logarithm problem is used mostly for (cyclic) groups deőned by multiplications modulo a prime p, often the cyclic group Z∗p ≡ {1, 2, . . . , p − 1}. However, for some primes p, the discrete-logarithm problem is easy for Z∗p . In particular: Fact 6.1. Let p be a prime. 
If p − 1 has only ‘small’ prime factors, then there are known algorithms, such as the Pohlig-Hellman algorithm [319], that efficiently compute discrete logarithms. This motivates the use of a modulus p which is a prime without small factors. In this textbook, we focus on a special case called safe prime, as we next deőne. Definition 6.4 (Safe prime). A prime number p ∈ N is called a safe prime, if p = 2q + 1 for some prime q ∈ N. If p is a safe prime, we say that the group Z∗p , containing the numbers from 1 to p − 1, with the modular multiplication operation, is a safe prime group. Many efforts have failed to őnd an efficient algorithm to compute discretelogarithms for safe prime groups. As a result, the discrete-logarithm problem is widely believed to be hard for the mod-p group, Z∗p , if p is a safe prime. For efficiency and/or security considerations, some designs use other őnite cyclic groups for which the discrete logarithm problem is considered hard, which are not safe prime groups; one example are groups deőned using elliptic curves. 6.1.8 The secrecy implied by the discrete logarithm assumption Suppose that the discrete logarithm assumption for safe prime groups holds, i.e., it is computationally-hard to őnd the discrete-log a, given g a mod p, where g is a generator of the safe prime group. Does this mean that the attacker cannot learn any information on a? The answer is no. Furthermore, we show that the attacker can efficiently learn some information about a - speciőcally, its least-significant bit (LSb), i.e., if a is even (LSb(a) = 0) or odd (LSb(a) = 1). As we will see later, this Applied Introduction to Cryptography and Cybersecurity 338 CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY has important implications on the design of some discrete logarithm-based cryptographic schemes, such as the ‘secure’ way to use the Diffie-Hellman protocol; see Claim 6.3. Learning LSb(a) is based on the notion of quadratic residue modulo p, which has many uses in the mathematics of cryptography. Definition 6.5 (Quadratic residue). Let p be a prime number, and let y be a positive integer. We say that y is a quadratic residue modulo p, if there is some integer z s.t. y ≡ z 2 (modp). We őrst claim, without proof, that quadratic residuosity2 can be efficiently determined. Claim 6.1. Given a prime p, there is an efficient algorithm that can determine if a given positive integer y is a quadratic residue modulo p. Proof: omitted; see, e.g., [205]. We next show that y = g x mod p is a quadratic residue modulo p if and only if LSb(x) = 0, i.e., the least signiőcant bit of x is zero, or equivalently, x is even. Combined with Claim 6.1, this shows that we can efficiently őnd the least-signiőcant bit of the exponent x. Claim 6.2. Let p be a prime, g be a generator for Z∗p , and x be a positive integer. Then y ≡ g x mod p is a quadratic residue mod p, if and only if LSb(x) = 0, i.e., x is even. Proof: Let us őrst prove that if x is even, i.e., LSb(x) = 0, then y = g x mod p is a quadratic residue. First observe, that if x is even, then there is some integer z s.t. x = 2z. Hence, y ≡ g 2z ≡ (g z )2 (modp); i.e., y is, indeed, a quadratic residue. We now the other direction, i.e., let y = g x mod p be a quadratic residue mod p, where x is an integer; we prove that LSb(x) = 0, i.e., x is even. This proof uses basic facts from number theory, which we present in subsection A.2.3. For any odd number m, there exists an integer k such that m = 2k + 1. 
Let us assume, to the contrary, that g^m is a quadratic residue mod p for some odd integer m (LSb(m) = 1); namely, g^m ≡ z^2 (mod p) for some integer z. From Fermat's theorem (Theorem A.1) it follows that:

    z^{p-1} ≡ 1 (mod p)    (6.8)

However, on the other hand:

    z^{p-1} ≡ z^{2·(p-1)/2} ≡ (z^2)^{(p-1)/2} ≡ (g^m)^{(p-1)/2} ≡ g^{(2k+1)·(p-1)/2} ≡ g^{k·(p-1)} · g^{(p-1)/2} (mod p)    (6.9)

Now, again from Fermat's theorem, we have:

    g^{k·(p-1)} ≡ (g^{p-1})^k ≡ 1^k ≡ 1 (mod p)    (6.10)

By combining Equations (6.8)-(6.10), we have:

    g^{(p-1)/2} ≡ 1 · g^{(p-1)/2} ≡ g^{k·(p-1)} · g^{(p-1)/2} ≡ z^{p-1} ≡ 1 (mod p)    (6.11)

Namely, g^{(p-1)/2} ≡ 1 (mod p). However, g is a generator of Z*_p, i.e., g^k ≢ 1 (mod p) for every integer k s.t. 1 ≤ k < p-1, and in particular for k = (p-1)/2. This contradicts g^{(p-1)/2} ≡ 1 (mod p), i.e., Equation (6.11).

2 Determination of quadratic residuosity is equivalent to computation of the Legendre symbol (y/p), defined, for a prime p and an integer y, as: 1 if y is a quadratic residue modulo p and y ≢ 0 (mod p); -1 if y is not a quadratic residue modulo p; and 0 if y ≡ 0 (mod p).

Claim 6.2 shows that by (efficiently) finding whether g^x mod p is a quadratic residue modulo p (Claim 6.1), we can find the least-significant bit of x (LSb(x)), indicating whether x is even or odd. Namely, while it may be hard to compute the entire discrete logarithm (x, given g^x mod p), it is possible to efficiently find at least one bit of x - the least significant bit.
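A short sketch may help make Claims 6.1 and 6.2 concrete. One standard efficient test for quadratic residuosity (as claimed in Claim 6.1) is Euler's criterion: for an odd prime p and y not divisible by p, y is a quadratic residue modulo p if and only if y^{(p-1)/2} ≡ 1 (mod p). The code below is illustrative and not from the text; the small safe prime p = 23 and generator g = 5 are our choices, and real schemes use primes of thousands of bits.

```python
# Sketch of the efficient quadratic-residuosity test (Claim 6.1), via Euler's
# criterion, and of using it to learn LSb(x) from g^x mod p (Claim 6.2).

def is_quadratic_residue(y: int, p: int) -> bool:
    """Euler's criterion: for odd prime p and y not divisible by p,
    y is a quadratic residue mod p iff y^((p-1)/2) == 1 (mod p)."""
    return pow(y, (p - 1) // 2, p) == 1

def lsb_of_exponent(gx: int, p: int) -> int:
    """Given g^x mod p, for g a generator of Z*_p, return LSb(x):
    by Claim 6.2, g^x is a quadratic residue iff x is even."""
    return 0 if is_quadratic_residue(gx, p) else 1

p, g = 23, 5  # 23 = 2*11 + 1 is a safe prime; 5 generates Z*_23
for x in range(1, 10):
    assert lsb_of_exponent(pow(g, x, p), p) == x % 2
```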
6.2 The DH Key Exchange Protocol

A major motivation for public key cryptography is to secure communication between parties, without requiring the parties to previously agree on a shared secret key. In their seminal paper [123], Diffie and Hellman introduced the concept of public key cryptography, including the public-key cryptosystem (PKC), which indeed allows secure communication without a preshared secret key. However, this paper did not contain a proposal for implementing a PKC. Instead, [123] introduced the key exchange problem, and presented the Diffie-Hellman (DH) key exchange protocol, often referred to simply as the DH protocol. Although a key-exchange protocol is not a public key cryptosystem, it also allows secure communication - without requiring a previously shared secret key. In fact, the goal of a key exchange protocol is to establish a shared secret key.

In this section, we explain the DH protocol, by developing it in three steps, each in a subsection. In subsection 6.2.1 we discuss a 'physical' variant of the DH protocol, which involves physical padlocks and exchanging a box (locked by one or two locks).

6.2.1 Physical key exchange

To help understand the Diffie-Hellman key exchange protocol, we first describe a physical key exchange protocol, illustrated by the sequence diagram in Fig. 6.4 (Figure 6.4: Physical Key Exchange Protocol). In this protocol, Alice and Bob exchange a secret key, by using a box and two padlocks - one of Alice and one of Bob. Note that initially, Alice and Bob do not have a shared key - and, in particular, Bob cannot open Alice's padlock and vice versa; the protocol, nevertheless, allows them to securely share a key.

Alice initiates the protocol by placing the key to be shared in the box, and locking the box with her padlock. When Bob receives the locked box, he cannot remove Alice's padlock and open the box. Instead, Bob locks the box with his own padlock, in addition to Alice's padlock. Bob now sends the box, locked by both padlocks, to Alice. Upon receiving the box, locked by both padlocks, Alice removes her own padlock and sends the box, now locked only by Bob's padlock, back to Bob. Finally, Bob removes his own padlock, and is now able to open the box and find the key sent by Alice.

We assume that the Man-in-the-Middle adversary cannot remove Alice's or Bob's padlocks, and hence, cannot learn the secret in this way. The Diffie-Hellman protocol replaces this physical assumption by appropriate cryptographic assumptions. However, notice that there is a further limitation on the adversary, which is crucial for the security of this physical key exchange protocol: the adversary should be unable to send a fake padlock. Note that in Figure 6.4, both padlocks are stamped with the initial of their owner - Alice or Bob. The protocol is not secure if the adversary is able to put her own padlock on the box, but stamp it with A or B, and thereby make it appear as if the padlock is Alice's or Bob's, respectively. This corresponds to the fact that the Diffie-Hellman protocol is only secure against an eavesdropping adversary, but insecure against a MitM adversary.

The critical property that facilitated the physical key exchange protocol is that Alice can remove her padlock, even after Bob has added his own padlock. Namely, the 'padlock' operation is 'commutative' - it does not matter that Alice placed her padlock first and Bob second; she can still remove her padlock as if it were applied last. In a sense, the key to cryptographic key exchange protocols such as Diffie-Hellman is to perform a mathematical operation which is also commutative; of course, there are many commutative operations. We next discuss 'insecure prototype' key-exchange protocols based on three commutative operations: addition, multiplication and XOR. However, before that, let us briefly discuss the definition of a secure key exchange protocol.

6.2.2 Some candidate key exchange protocols

In this subsection, we present a few 'prototype' key-exchange protocols, which help us to properly explain the Diffie-Hellman protocol. Unlike the physical key exchange protocol of subsection 6.2.1, these are 'real protocols', i.e., they involve only the exchange of messages - no physical objects or assumptions. We begin with three insecure 'prototypes', each using a different commutative operation: XOR, Addition and Multiplication.

The XOR, Addition and Multiplication key exchange protocols. The sequence diagram in Figure 6.5 presents the first prototype: the XOR key exchange protocol. This prototype tries to use the XOR operator to 'implement the padlocks' of Figure 6.4. XOR is a natural candidate, since we know that XOR can provide confidentiality when used 'correctly', e.g., in the one-time pad construction. Furthermore, XOR is commutative, and it is easy to see that this suffices to ensure the correctness of the XOR key exchange, i.e., the fact that k_{A,B} = k_{B,A}, as follows:

    k_{B,A} = k''' ⊕ k_B
            = (k'' ⊕ k_A) ⊕ k_B
            = ((k' ⊕ k_B) ⊕ k_A) ⊕ k_B
            = (k_{A,B} ⊕ k_A) ⊕ k_A
            = k_{A,B}

However, as the next exercise shows, the XOR key exchange protocol is insecure. In fact, not only does it not satisfy indistinguishability, but worse: an eavesdropper can easily find the exchanged key.
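Both the correctness argument above and the weakness are easy to check in code. The following sketch is illustrative and not from the text; it simulates the XOR key exchange of Figure 6.5 and then mounts the eavesdropping attack. Note that it gives away the answer to Exercise 6.2 below, so readers who want to attempt the exercise first may wish to skip it.

```python
# Illustrative simulation of the (insecure) XOR key exchange and the attack on it.
import secrets

L = 16  # key length in bytes (illustrative choice)

def xor(x: bytes, y: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(x, y))

# The XOR key exchange of Figure 6.5.
k_A  = secrets.token_bytes(L)   # Alice's pad
k_AB = secrets.token_bytes(L)   # the key Alice wants to share
k_B  = secrets.token_bytes(L)   # Bob's pad

m1 = xor(k_A, k_AB)    # Alice -> Bob:   k'   = k_A xor k_{A,B}
m2 = xor(m1, k_B)      # Bob -> Alice:   k''  = k'  xor k_B
m3 = xor(k_A, m2)      # Alice -> Bob:   k''' = k_A xor k''
k_BA = xor(m3, k_B)    # Bob outputs:    k''' xor k_B
assert k_BA == k_AB    # correctness: both parties hold the same key

# Eve's attack: XOR the three eavesdropped messages to recover the key.
eve = xor(xor(m1, m2), m3)
assert eve == k_AB
```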
Exercise 6.2 (XOR key exchange protocol is insecure). Show how an eavesdropping adversary may find the secret key exchanged by the XOR Key Exchange protocol, by (only) using the values sent between the two parties.

Solution (sketch): the attacker XORs all three eavesdropped messages, k', k'' and k''', to obtain: k_{A,B} = (k_{A,B} ⊕ k_A) ⊕ (k_{A,B} ⊕ k_A ⊕ k_B) ⊕ (k_{A,B} ⊕ k_B).

[Figure 6.5 shows the message flow: Alice chooses k_A, k_{A,B} ← {0,1}^l and sends k' ← k_A ⊕ k_{A,B}; Bob chooses k_B ← {0,1}^l and replies with k'' ← k' ⊕ k_B; Alice sends k''' ← k_A ⊕ k''; Bob outputs k_{B,A} ← k''' ⊕ k_B, while Alice outputs k_{A,B}; Eve eavesdrops on all three flows.] Figure 6.5: The (insecure) XOR Key Exchange Protocol; this protocol ensures correctness, k_{A,B} = k_{B,A}, but is insecure. Specifically, by eavesdropping on the three exchanged messages (k', k'' and k'''), Eve can find the key k_{A,B}. Can you find out how? See Exercise 6.2.

Exponentiation key exchange. The attack on the XOR key exchange was due to the fact that the attacker was able to 'remove' elements by applying the XOR again, due to the combination of XOR's commutativity and the fact that, for XOR, every element is its own inverse, i.e., (∀x ∈ {0,1}^l) x ⊕ x = 0^l.

So, let us try to use a different mathematical operation that also ensures commutativity (for correctness), but where elements are (typically) not their own inverses: exponentiation. In Fig. 6.6, we show the resulting Exponentiation Key Exchange protocol. This protocol is obviously very inefficient, but let us ignore the inefficiency; we present it only to show its correctness and vulnerability, as motivation and to build intuition for the modular exponentiation key exchange protocol that we show afterwards.

[Figure 6.6 shows the message flow: Alice chooses random n-bit integers k, r_a ← {0, ..., 2^n - 1} and sends x ← g^{k·r_a}; Bob chooses a random n-bit integer r_b ← {0, ..., 2^n - 1} and replies with y ← x^{r_b}; Alice sends z ← y^{1/r_a}; Alice outputs k_{A,B} ≡ g^k and Bob outputs k_{B,A} ← z^{1/r_b}.] Figure 6.6: The (insecure and inefficient) Exponentiation Key Exchange Protocol, using some random integer g. If the resulting shared key k_{A,B} = k_{B,A} is too long, use only some of its bits. This protocol, like the XOR key exchange protocol, ensures correctness, k_{A,B} = k_{B,A}, but is insecure, as we show in the text.

Let us first show that the Exponentiation Key Exchange protocol ensures correctness, i.e., that k_{A,B} = k_{B,A}:

    k_{B,A} = z^{1/r_b}                     (6.12)
            = (y^{1/r_a})^{1/r_b}           (6.13)
            = (x^{r_b})^{1/(r_b·r_a)}       (6.14)
            = (g^{k·r_a})^{1/r_a}           (6.15)
            = g^k = k_{A,B}                 (6.16)

We relied on the commutativity of exponentiation in Equation 6.15.

Let us now explain an attack recovering the exchanged key k_{A,B}, similar to the attack on the XOR key exchange. The attack uses the fact that the exponentiation operation may be removed, to find the exponent, by computing the inverse operation, i.e., the logarithm (base g). The logarithm function is less efficient than exponentiation, but, over the integers or real numbers, it is still considered an efficient operation, since it can be computed in polynomial time. Namely, an eavesdropper can simply compute the logarithm (base g) to remove the exponentiations from all flows, which reduces the protocol to the multiplication key exchange, shown insecure in Ex. 6.12 (and similarly to the attack on the XOR key exchange in Exercise 6.2). Namely, the attacker applies the logarithm operator, with base g, to the three messages of Fig. 6.6 - resulting in the values k·r_a, k·r_a·r_b and k·r_b. The attacker can now combine these three values to find k, by computing k = ((k·r_a)·(k·r_b)) / (k·r_a·r_b). Of course, this attack used the value of g.
This is justiőed, since in a key exchange protocol, the parties do not have any preshared secret input (see Figure 6.2); indeed, if the parties already share a secret key, why not use it directly? Note that even if g is a preshared secret, the protocol is still vulnerable, with a modiőed attack (Exercise 6.13). Exercise 6.12 shows that a similar vulnerability occurs if we use multiplication or addition instead of XOR or exponentiation. However, we will not give up - and we next show how we ‘őx’ this protocol, and őnally present a protocol which is hoped to be secure. Note that we do not claim that this protocol is secure; indeed, like other designs based on supposedly computationally-hard problem, a proof that the design is secure is unlikely - it would imply a proof that P ̸= N P ; see Section A.1. Modular-Exponentiation Key Exchange. We now ‘őx’ the Exponentiation Key Exchange Protocol (Fig. 6.6). The attack against it used the fact that the computations in Fig. 6.6 are done over the őeld of the real numbers (R), where there are efficient algorithms to compute logarithms. This motivates changing this protocol, to use, instead, operations over a group in which the (discrete) logarithm problem is considered hard. Such groups exist, e.g., the ‘mod p’ group, for a safe prime p. Applied Introduction to Cryptography and Cybersecurity CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY 344 Eavesdropping Eve Alice Bob Nurse $ k, a ← Z∗p ≡ {1, . . . (p − 1)} (large random integers < p) $ b ← Z∗p ≡ {1, . . . (p − 1)} (large random integer < p) x ← g k·a mod p y ← xb mod p z ← ya −1 mod (p−1) Output kA,B ≡ g k mod p mod p Output kB,A ← z b −1 mod (p−1) mod p Figure 6.7: The Modular-Exponentiation Key Exchange Protocol, where p is a prime and g is a generator of Z∗p . The values k, a and b are chosen randomly from Z∗p , i.e., integers between 1 and p − 1 (why? see Exercise 6.14). Alice −1 derives kA,B = g k mod p, and Bob derives kB,A = z b mod (p−1) mod p; Equation 6.17 shows both derive the same key, i.e., kA,B = kB,A . We present this protocol in Fig. 6.7. Notice that this protocol uses multiplicative inverses in the (p − 1) modular group, e.g., a−1 mod (p − 1) is the number in Z∗p−1 ≡ {1, . . . , p − 2} such that: a · a−1 = 1 mod (p − 1). The correctness of the Modular-Exponentiation Key Exchange Protocol follows from the commutativity of modular-exponentiation, much like the correctness of the preceding protocols: kB,A = = = = = −1 mod (p−1) zb mod p  −1 b−1 mod (p−1) mod (p−1) ya −1 b ·a−1 mod (p−1) xb mod p  −1 mod (p−1) k·a a g mod p g k mod p = kA,B mod p (6.17) Is this protocol secure, i.e., does it ensure indistinguishability? This may depend on the prime p used. One way to try to break the Modular-Exponentiation Key Exchange Protocol of Fig. 6.7, is to compute the (discrete) logarithm of the three values exchanged by the protocol - like the attack above against the ‘regular’ Exponentiation Key Exchange Protocol. This works when discrete logarithm can be computed efficiently, e.g., when p − 1 is a smooth number, i.e., has only small prime factors. Examples of primes p s.t. p − 1 is a smooth number. Let us give two simple examples of primes p such that p − 1 is smooth. The őrst example is of Fermat primes, i.e., primes of the form p = 2x + 1 for integer x. 
The second, possibly better,3 example is of Pierpont primes, i.e., primes of the form p = 2^x · 3^y + 1 for integers x, y.

3 Pierpont primes may be a better example since very large Pierpont primes are known; in fact, the number of Pierpont primes is conjectured to be infinite. In contrast, only five Fermat primes are known, and the largest currently known is 65537 = 2^16 + 1.

In contrast, computing discrete logarithms is believed to be computationally hard for certain moduli, making this attack impractical. In particular, the discrete logarithm is assumed to be computationally hard when the modulus p is a large safe prime, i.e., p = 2q + 1 for some prime q; see Definition 6.4 and Definition 6.3.

The attacker can easily detect if a = 1 or b = 1, and then find k. However, the probability of this choice is only 1/(p-2), which is exponentially small in the number of bits of p - i.e., negligible (for sufficiently large p). The attacker can also guess some specific value chosen by the parties, say ã ∈ {1, ..., p-1} as a guess for a, compute ã^{-1} mod (p-1), and check if the guess for a was correct, by comparing z to y^{ã^{-1} mod (p-1)} mod p. If the guess was correct, i.e., if z = y^{ã^{-1} mod (p-1)} mod p, then ã = a mod (p-1), and the attacker computes the key: k_{A,B} = x^{ã^{-1} mod (p-1)} mod p. Note that there is no advantage for the parties in selecting the a, b or k exponents from a larger set (not limited to {1, ..., p-1}); all the values sent by the protocol, as well as the key, will be exactly the same as when using the corresponding exponents mod (p-1).

In the following subsection, we present the Diffie-Hellman protocol - which is essentially an improved and simplified variant of the Modular-Exponentiation Key Exchange Protocol of Fig. 6.7.

6.2.3 The Diffie-Hellman Key Exchange Protocol and Hardness Assumptions

Fig. 6.8 presents the Diffie-Hellman (DH) key exchange protocol. The protocol uses the (safe) prime group Z*_p, i.e., using multiplications modulo p, where p is a (safe) prime. The protocol assumes a given (public) choice of parameters: the (safe) prime p and the generator g. Recall that the order q of Z*_p is q = p-1, i.e., g^q = 1 mod p and {1, ..., p-1} = {g^i}_{i=1}^{q}.

The protocol consists of only two flows: in the first flow, Alice sends g^a mod p, where a ∈ {1, ..., p-1} is a private key chosen randomly by Alice; and in the second flow, Bob responds with g^b mod p, where b is a private key chosen randomly by Bob. The result of the protocol is a shared secret value g^{ab} mod p, computed by Alice as k_{A,B} = (g^b mod p)^a mod p = g^{ba} mod p, and by Bob as k_{B,A} = (g^a mod p)^b mod p = g^{ab} mod p.

The Diffie-Hellman key-exchange protocol is, essentially, a simplified and slightly optimized variant of the Modular-Exponentiation Key Exchange Protocol of Fig. 6.7; in particular, the security of both protocols relies on the difficulty of computing discrete logarithms, and may fail if p-1 has only small factors; the choice of safe primes (p = 2q + 1 for prime q) is hoped to be 'safe', i.e., to ensure security. The basic difference between the two protocols is that in the Diffie-Hellman protocol, the key output is not g^k mod p for some random k, as happens for the Exponentiation key exchange. Instead, the key being output (exchanged) is g^{ab} mod p. This makes the protocol a bit simpler, and more efficient: only two flows instead of three, no need to compute inverses (a^{-1}, b^{-1} mod (p-1)), and one less exponentiation.

[Figure 6.8 shows the two flows: Alice chooses a random a ∈ Z*_p ≡ {1, ..., p-1} and sends x ← g^a mod p; Bob chooses a random b ∈ Z*_p and responds with y ← g^b mod p; Eve only eavesdrops. The output key is k_{A,B} ≡ y^a ≡ (g^b)^a = g^{a·b} = (g^a)^b ≡ x^b ≡ k_{B,A} (mod p).] Figure 6.8: The Diffie-Hellman Key Exchange Protocol. The protocol uses mod p computations, where p is a prime. It is believed to be hard to compute the resulting key g^{ab} mod p when p is a safe prime (p = 2q + 1 where q is a prime).

The correctness of the Diffie-Hellman key exchange protocol, i.e., the fact that k_{A,B} = k_{B,A}, follows from the commutativity of exponentiation (and modular exponentiation), as follows:

    k_{A,B} ≡ y^a mod p         (6.18)
            ≡ (g^b)^a mod p     (6.19)
            ≡ g^{a·b} mod p     (6.20)
            ≡ x^b mod p         (6.21)
            = k_{B,A}           (6.22)
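The following sketch is illustrative and not from the text: the tiny safe-prime group and the use of Python's built-in pow(base, exp, mod) are our choices. It runs the Diffie-Hellman exchange of Figure 6.8 and checks the correctness shown in Equations (6.18)-(6.22).

```python
# Illustrative Diffie-Hellman exchange over a small safe-prime group.
# Real deployments use safe primes of 2048 bits or more (see Table 6.1).
import secrets
from typing import Tuple

p, g = 23, 5  # p = 2*11 + 1 is a safe prime; g = 5 generates Z*_23

def keygen() -> Tuple[int, int]:
    """Return (private, public): a random exponent a in {1,...,p-1} and g^a mod p."""
    a = secrets.randbelow(p - 1) + 1
    return a, pow(g, a, p)

a, x = keygen()   # Alice sends x = g^a mod p
b, y = keygen()   # Bob sends   y = g^b mod p

k_AB = pow(y, a, p)   # Alice computes (g^b)^a mod p
k_BA = pow(x, b, p)   # Bob computes   (g^a)^b mod p
assert k_AB == k_BA   # correctness, Equations (6.18)-(6.22)

# An eavesdropper sees only (p, g, x, y); computing g^(a*b) mod p from these is
# the Computational DH problem, believed hard for large safe primes.
```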
The following exercise may help to get a better feeling for the protocol and how it works.

Exercise 6.3 (Diffie-Hellman (DH) Key Exchange). Let p = 7.

1. Is p = 7 a safe prime?

2. Find a generator g for Z*_7; show that g is a generator and how you found it.

3. Alice and Bob run the DH protocol with the prime p and generator g. Alice selects a = 3, and Bob selects b = 6. Compute the values sent by Alice and Bob, and show the computation of the shared key by each of them, resulting in the same value.

[Figure 6.9 shows the attack: Alice chooses a ← {1, ..., p} and sends g^a mod p, which the MitM adversary intercepts and replaces with g^e mod p for its own choice e ← {1, ..., p}; similarly, Bob's response g^b mod p (for b ← {1, ..., p}) is replaced by g^e mod p towards Alice. As a result, Alice derives (g^e)^a = g^{a·e} mod p and Bob derives (g^e)^b = g^{b·e} mod p, while the adversary computes both (g^a)^e = g^{a·e} mod p and (g^b)^e = g^{b·e} mod p - i.e., it shares a key with each party.] Figure 6.9: MitM attack on the DH key-exchange protocol. The DH protocol is believed to be secure against an eavesdropping adversary - or if the messages are authenticated.

DH is vulnerable to a MitM attacker. Both the Diffie-Hellman and the Modular-Exponentiation key exchange protocols are insecure against a MitM attacker; they are designed only against an eavesdropping adversary. In fact, as shown in Figure 6.9, all a MitM attacker needs to do is to fake the message from a party, allowing it to impersonate that party (establishing a shared key with the other party). Indeed, in practice, we (almost) always use authenticated variants of the DH protocol, as we discuss in subsection 6.3.1.

Security of DH and the Computational DH Assumption. OK, so the DH protocol is vulnerable to MitM; but can we safely use it against an eavesdropper? Namely, can we assume that the key output by the DH protocol cannot be computed by an eavesdropping adversary, when DH is computed over a group in which discrete logarithm is assumed to be a computationally-hard problem, e.g., the 'mod p' group where p is a safe prime? So far, this has not been proven; there is no proof that if DH is computed (using a safe prime p), then the resulting key cannot be efficiently computed by an adversary. There is not even a proof showing that such an attack against DH would imply an efficient method to compute discrete logarithms (modulo a safe prime p). In fact, these are still important open questions.

The common approach is to assume that when the DH protocol is run using a safe prime p, an eavesdropping adversary cannot guess the resulting shared key. This assumption, which is stronger than the assumption of hardness of discrete-log, is called the Computational DH (CDH) assumption.
The CDH assumption essentially means that it is infeasible to compute the DH shared Applied Introduction to Cryptography and Cybersecurity CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY 348 secret g ab mod p, given the two exchanged values g a mod p and g b mod p, if p is a sufficiently-long safe prime (i.e., p = 2q + 1 for a prime q). Both the DH and the discrete-log problems are easy for some other values of p, in particular, when p − 1 is a smooth number, i.e., has only small prime factors; see our discussion of such primes in subsection 6.2.2. Definition 6.6 (Computational DH (CDH) for safe prime groups). The Computational DH (CDH) assumption for safe prime groups holds, if there is no efficient (PPT) adversary A that, given a random n-bit safe prime p (i.e., p = 2q + 1 for prime q) and generator g, and the values (g a mod p, g b mod p) $ for random a, b ← {1, . . . , p − 1}, returns, with non-negligible probability, g ab mod p. Namely, for every PPT algorithm A and random n-bit safe prime p and generator g holds:   (6.23) Pr A(g a mod p, g b mod p) = g ab mod p ∈ N EGL(n) $ a,b←Z∗ p Note that the deőnition allows the random choice of a = 1 (or b = 1), although for a = 1 holds g ab = g b . However, the number of possible values in Z∗p is exponential in n, i.e., the probability of such ‘bad choice’ is negligible. At least one bit of g ab mod p is exposed! Even assuming that the CDH assumption holds for safe prime groups, an eavesdropper is still be able to learn (at least) one bit about g ab mod p. Speciőcally, an attacker, observing g a mod p and g b mod p from a run of the Diffie-Hellman protocol, can efficiently őnd whether g ab mod p is a quadratic residue modulo p, i.e., if there exists some z ∈ Z∗p such that g ab ≡ z 2 mod p (Deőnition 6.5). Let us show how. Claim 6.3. Let p be a prime, g be a generator for Z∗p , a, b be integers, and y ≡ g ab mod p. Given g a mod p and g b mod p, we can efficiently deduce if y is a quadratic residue modulo p. Proof: From Claim 6.1, we can efficiently őnd if g a mod p and g b mod p are quadratic residues modulo p. From Claim 6.2, this gives the least signiőcant bit (parity) of a and of b; obviously, ab is even if either a or b is even. Again from Claim 6.2, the least signiőcant bit of ab indicates the quadratic residuosity of g ab mod p. 6.2.4 Secure derivation of keys from the DH protocol An eavesdropper to the DH key exchange can observe g a mod p and g b mod p; hence, from Claim 6.3, the attacker can know if y ≡ g ab mod p is a quadratic residue modulo p. Therefore, using y ≡ g ab mod p directly as a key may not be advisable, as even assuming that the CDH assumption is true, still an eavesdropper can learn partial information about y (i.e., if it is a quadratic residue). Notice that while we show only exposure of this information - the Applied Introduction to Cryptography and Cybersecurity 6.2. THE DH KEY EXCHANGE PROTOCOL 349 Eavesdropping Eve Alice Bob Nurse $ $ a ← {1, . . . , q} b ← {1, . . . , q} x ← ga y ← gb Output key: kA,B = y a = (g b )a = g a·b = (g a )b = xb = kB,A Figure 6.10: The Generalized Diffie-Hellman Key Exchange Protocol, for group G with order q. All operations are group operations, denoted like the usual multiplication notation. For some types of groups, and sufficiently-large order q, it is believed that it is infeasible not only to compute g ab but even to distinguish between g ab and a random group member, i.e., DDH security. 
The protocol reduces to the ‘regular’ (mod-p) Diffie-Hellman protocol, when the group G is Z∗p , the modular-p group for prime p. quadratic residuosity of g ab mod p - there could be ways to expose more4 information without violating the CDH assumption. So, how can we use the DH protocol to securely exchange a key? One could simply ignore this concern; but let us discuss two other, more prudent, options. First option: generalized DH protocol, using DDH groups. The generalized DH protocol can ensure that the value of the derived key kAB is secret, without any leakage. This protocol uses (and requires) a cyclic group G where the (stronger) Decisional DH (DDH) Assumption is believed to hold. Let us őrst deőne this assumption (a bit informally). Definition 6.7 (The Decisional DH (DDH) Assumption). Group G, with order q, satisfies the Decisional DH (DDH) Assumption if there is no PPT algorithm A candistinguish, with  athat   significant advantage compared to guessing, between g , g b , g ab and g a , g b , g c , for a, b and c selected randomly from {1, . . . , q}. A group for which the DDH assumption is believed to hold is called a DDH group. In Figure 6.10, we present the Generalized Diffie-Hellman protocol, using cyclic group G, rather than the speciőc group Z∗p used in the original DH protocol. We use the group operation of G, denoted as multiplication, in lieu of the mod-p multiplication used by Z∗p . The protocol ensures key secrecy, if G is a DDH group, i.e., if the DDH problem is believed to be computationally infeasible (‘hard’) for G. The generalized DH protocol assumes agreed-upon DDH group G and generator g, and known order q for G. Like the original DH protocol (Figure 6.8), 4 It may possible to expose even 80% of the bits [75]. Applied Introduction to Cryptography and Cybersecurity 350 CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY the protocol has only two ŕows. In the őrst ŕow, Alice sends g a , where a ∈ {1, . . . , q} is a secret chosen randomly by Alice; and in the second ŕow, Bob responds with g b , where b ∈ {1, . . . , q} is a secret chosen randomly by Bob. Both ‘exponentiations’ (g a and g b ) are done by repeatedly applying the group operation (instead of modular multiplication, as in the original DH protocol). The result protocol is a shared secret value g ab , computed by Alice as  of the b ba b a kA,B = g = g , and by Bob as kB,A = (g a ) = g ab . All ‘exponentiations’ are repeated application of the group operation of G. Notice that the secret value exchanged, g ab , is an element of the group G, i.e., it is not a uniformly random string; this requires mapping of g ab into a random string. One group where DDH is assumed to hold is Qp , the subgroup of Z∗p consisting of the quadratic residues in Z∗p , for a safe prime p (i.e., p = 2q + 1, for prime q). Certain elliptic-curve groups are also believed to be DDH groups. See other examples in [75]. Second option: extract a secret, random key from the partiallyThe second option is to use the DH protocol as random g ab mod p. described, i.e., with a safe prime group Z∗p , but to securely extract (derive) a shared key k from g ab mod p. Section 3.5 discusses the two common ways to extract a shared key from a mostly-random shared secret data: using either a randomness extractor hash function or a Key Derivation Function (KDF). 
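As a small illustration of this second option, the sketch below derives a short key from the DH secret using HMAC-SHA256 keyed with a random, non-secret salt, in the spirit of an HKDF-extract step. Treating this specific construction as an adequate extractor/KDF for DH outputs is an assumption, and the function and parameter names are ours.

import hmac, hashlib, secrets

def derive_key(dh_secret: int, salt: bytes, out_bytes: int = 16) -> bytes:
    # Extract a short, hopefully-pseudorandom key from g^(ab) mod p, using
    # HMAC-SHA256 keyed with a random but non-secret salt (in the spirit of
    # HKDF-extract); the salt may be sent in the clear.
    secret_bytes = dh_secret.to_bytes((dh_secret.bit_length() + 7) // 8, "big")
    prk = hmac.new(salt, secret_bytes, hashlib.sha256).digest()
    return prk[:out_bytes]

# Both parties hold the same DH secret and agree on the (public) salt,
# so they derive the same 128-bit key.
salt = secrets.token_bytes(16)
k = derive_key(dh_secret=1234567890, salt=salt)   # placeholder secret value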
This approach requires, basically, that the g ab mod p contains a ‘sufficient’ randomization, to ensure that the output of the extractor or KDF is pseudorandom basically, a variant of DDH with respect to the speciőc extractor or KDF used. 6.3 Using DH for Resiliency to Exposures: the (PFS) Auth-h-DH and (PRS) DH-Ratchet protocols As discussed above, and demonstrated in Ex. 6.9, the DH protocol is vulnerable to a MitM attacker; its security is only against a passive, eavesdropping-only attacker. In most practical scenarios, attackers who are able to eavesdrop, have some or complete ability to also perform active attacks such as message injection; it may seem that DH is only applicable to the relatively few scenarios of eavesdropping-only attackers. In this section, we discuss extensions of the DH protocol, extensively in practice to improve resiliency to adversaries which have MitM abilities, combined with key-exposure abilities. Speciőcally, these extensions allow us to ensure Perfect Forward Secrecy (PFS) and Perfect Recover Security (PRS), the two strongest notions of resiliency to key exposures of secure key setup protocols as presented in Table 5.3 (Section 5.7). Applied Introduction to Cryptography and Cybersecurity 6.3. USING DH FOR RESILIENCY TO EXPOSURES 351 MitM attacker Alice, has M K Bob, has M K Nurse $ ai ← Z∗p $ ≡ {1, . . . (p − 1)} xi ← g ai bi ← Z∗p ≡ {1, . . . (p − 1)} mod p, M ACM K (xi ) yi ← g bi mod p, M ACM K (yi ) (i) kA,B ≡ h(yiai mod p) (i) (i) Session ith key: ki = kA,B = kB,A (i) kB,A ← h(xbi i mod p) Figure 6.11: The Auth-h-DH Protocol, showing ith exchange. This protocol is secure against MitM attackers; furthermore, it ensures Perfect Forward Secrecy (PFS), i.e., exposure of current keys does not expose past keys. The protocol, as presented, uses both a MAC function (M AC) and a keyless randomness extractor hash function h. 6.3.1 The Authenticated DH (Auth-h-DH) protocol: ensuring Perfect Forward Secrecy (PFS) Assuming that the parties share a secret master key M K, it is quite easy to extend the DH protocol in order to protect against MitM attackers. All that is required is to use a Message Authentication Code (MAC) scheme to authenticate the DH ŕows. To ensure security without requiring the use of DDH group (i.e., the DDH assumption), we may extract the key using an extractor hash function h (or a KDF). See Fig. 6.11, showing the Auth-h-DH, the resulting authenticated variant of the DH protocol using hash h to extract the key. Correctness follows similarly to the argument for ‘regular’ Diffie-Hellman protocol: (i) ki ≡ kA,B ≡ = = = = ≡ h (yiai mod p)   a i mod p h g bi   a i mod p h g bi   b h (g ai ) i mod p   h xbi i mod p (i) kB,A (6.24) (6.25) (6.26) (6.27) (6.28) (6.29) The next informal lemma presents the security properties of Auth-h-DH. We only given an informal argument for the validity of the Lemma. Lemma 6.1. Assuming the extended CDH assumption, the Auth-h-DH protocol (Fig. 6.11) ensures secure key-setup (indistinguishability) and Perfect Forward Applied Introduction to Cryptography and Cybersecurity 352 CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY Secrecy (PFS), provided that M AC is a secure MAC function and that h is a randomness extractor hash function. Argument: Let us őrst consider a run where the key M K is unknown to the attacker, i.e., was not exposed. 
Hence, since M AC is a secure Message Authentication Code (MAC), then the protocol outputs a key only if the message it receives was sent by the peer (or by itself); in any case, the exponent used (ai or bi ) was selected randomly by Alice or Bob and unknown to the attacker. From the extended CDH assumption (subsection 6.2.4), and the assumption that h is a randomness extractor hash, it follows that the session key ki (set by (i) (i) Alice to kA,B , and by Bob to kB,A ) is indistinguishable from random, i.e., the Auth-h-DH protocol ensures secure key-setup. Note that the protocol cannot ensure that both parties will generate the same ki , since a MitM attacker may change messages, or simply drop one of the messages (causing the key to be output by only one party). The PFS property also follows, since it requires that key ki is secure, even if M K is exposed, as long as the exposure occurs after session i was completed; and our analysis did not exclude such exposure after the session was over. KDF-based variants of the Auth-h-DH protocol. As mentioned in Section 3.5, keyless generic extractors do not exist, and it is preferable to avoid their use and rely on a Key Derivation Function (KDF), which can also output as many pseudorandom bits as needed. Let us discuss brieŕy two variants of the Auth-h-DH protocol, which use a KDF instead of a (keyless) extractor. Variant 1: a two-keyed KDF-based variant of Auth-h-DH. The őrst variant simply replaces the keyless randomness extractor hash h of Figure 6.11, with the KDF-extract function, i.e., the key is derived as KDFsalt , where salt is a random and non-secret (known) key, which does not expose the master key M K used by the MAC function. Variant 2: a secure variant of Auth-h-DH, using a combined KDF/PRF. Another variant of Auth-h-DH uses the same function f and the same master key M K to replace both the MAC function and the KDF function. This variant requires f to satisfy both the MAC and the KDF functionalities. This is a stronger assumption, but it makes the design simpler and more efficient; therefore, it is often preferred. See Exercise 6.19 for questions related to these and other variants of the Auth-h-DH protocol. And here is a more basic exercise about the Auth-h-DH protocol. Exercise 6.4. Alice and Bob share master key M K and perform the Auth-hDH protocol daily, at the beginning of every day i, to set up a ‘daily key’ ki for day i. Assume that Mal can eavesdrop on communication between Alice and Bob every day, but perform MitM attacks only every even day (i s.t. i ≡ 0 ( mod 2)). Assume further that Mal is given the master key M K, on the fifth day. Could Mal decipher messages sent during day i, for i = 1, . . . , 10? Write your responses in a table. Applied Introduction to Cryptography and Cybersecurity 6.3. USING DH FOR RESILIENCY TO EXPOSURES Protocol 2PP-Key Exchange FS-ratchet RS-ratchet Auth-h-DH DH-ratchet 353 Section Secure key setup Forward secrecy (FS) Perfect Forward Secrecy (PFS) Recover Security (RS) Perfect Recover Security (PRS) 5.4.1 ✓ ✗ ✗ ✗ ✗ 5.7.1 5.7.2 6.3.1 6.3.2 ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✗ ✗ ✓ ✓ ✗ ✓ ✗ ✓ ✗ ✗ ✗ ✓ Table 6.2: Resiliency to key exposures of Key Exchange protocols. Note that the results of Ex. 6.4 imply that the Auth-h-DH protocol does not ensure the Recover Security property. We next show extensions that improve resiliency to key exposures, and speciőcally recover security after exposure, provided that the attacker does not deploy the MitM ability for one handshake. 
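Before turning to those extensions, here is a minimal sketch of a single Auth-h-DH exchange, written symmetrically so that the same helpers serve Alice and Bob. It uses toy DH parameters, HMAC-SHA256 in the role of the MAC, and SHA-256 in the role of the extractor hash h; all names are ours, and this is only an illustration of the message flow, not a vetted implementation.

import hmac, hashlib, secrets

p, g = 23, 5                   # toy DH parameters, illustration only
MK = secrets.token_bytes(32)   # shared master key, used only for the MAC

def to_bytes(v: int) -> bytes:
    return v.to_bytes((v.bit_length() + 7) // 8, "big")

def send_flow(exponent: int):
    # One authenticated flow: (g^exponent mod p, MAC_MK(g^exponent mod p)).
    x = pow(g, exponent, p)
    return x, hmac.new(MK, to_bytes(x), hashlib.sha256).digest()

def receive_flow(x: int, tag: bytes, own_exponent: int) -> bytes:
    # Verify the peer's MAC, then derive the session key k_i = h(x^own mod p).
    expected = hmac.new(MK, to_bytes(x), hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("MAC verification failed: possible MitM")
    return hashlib.sha256(to_bytes(pow(x, own_exponent, p))).digest()

a_i = secrets.randbelow(p - 1) + 1   # Alice's exponent for this exchange
b_i = secrets.randbelow(p - 1) + 1   # Bob's exponent for this exchange
x_i, tag_a = send_flow(a_i)          # Alice -> Bob
y_i, tag_b = send_flow(b_i)          # Bob -> Alice
assert receive_flow(y_i, tag_b, a_i) == receive_flow(x_i, tag_a, b_i)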
6.3.2 The DH-Ratchet protocol: Perfect Forward Secrecy (PFS) and Perfect Recover Security (PRS) The Auth-h-DH protocol ensures perfect forward secrecy (PFS), but does not ensure recover security, and deőnitely not perfect recover security (PRS). In fact, a single key exposure, at some point in time, suffices to make all future handshakes vulnerable to a MitM attacker - even if there has been some intermediate handshakes without (any) attacks, i.e., during which the attacker had neither MitM nor eavesdropper capabilities. To see that the Auth-h-DH protocol does not ensure recover security, see Exercise 6.4. Note that the (shared key) RS-Ratchet protocol presented in subsection 5.7.2 (Fig. 5.20), achieved recovery of security - albeit, not Perfect Recover Security (PRS). Namely, the Auth-h-DH protocol does not even strictly improve resiliency compared to the RS-Ratchet protocol (subsection 5.7.2); see Table 6.2. In this subsection we show how to achieve both PFS and Perfect Recover Security (PRS). Speciőcally, we present the DH-Ratchet protocol, as illustrated in Fig. 6.12, which ensures both PFS and PRS. This protocol uses a function f , which is assumed to be simultaneously both a PRF and a KDF; this is similar to one of the variants of the Auth-h-DH protocol, discussed in subsection 6.3.1. Like the Auth-h-DH protocol presented above, the DH-ratchet protocol also authenticates the DH exchange; hence, as long as the authentication key is unknown to the attacker at the time when the protocol is run, then the key exchanged by the protocol is secret. The improvement, compared to the Authh-DH protocol, is in the key used to authenticate the DH exchange; instead of using a őxed master key (M K) as done by the Auth-h-DH protocol (Fig. 6.11), the DH-ratchet protocol authenticates the ith DH exchange using the session key exchanged in the previous round (exchange), i.e., ki−1 . An initial shared Applied Introduction to Cryptography and Cybersecurity CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY 354 Alice, has ki−1 (from previous round) Bob, has ki−1 (from previous round) MitM attacker Nurse $ $ ai ← Z∗p ≡ {1, . . . (p − 1)} bi ← Z∗p ≡ {1, . . . (p − 1)} xi ← g ai mod p, fki−1 (xi ) yi ← g bi mod p, fki−1 (yi ) (i) kA,B ≡ fki−1 (yiai mod p) (i) (i) Session ith key: ki = kA,B = kB,A (i) kB,A ← fki−1 (xai i mod p) Figure 6.12: The DH-Ratchet key exchange protocol, ensuring PFS and PRS against MitM attacker, assuming that f is both a PRF and KDF. The őgure shows round i of the protocol. secret key k0 is used to authenticate the őrst round, i.e., i = 1. Lemma 6.2 (DH-Ratchet ensures PFS and PRS). The DH-Ratchet protocol (Fig. 6.12) ensures secure key-setup with perfect forward secrecy (PFS) and perfect recovery security (PRS), assuming that f is both a PRF and KDF. Sketch of proof: The PFS property follows, like in Lemma 6.1, from the fact that ki , the session key exchanged during session i, depends on the result of the DH protocol, i.e., is secure against an eavesdropping-only adversary. The protocol also ensures secure key setup, since a MitM adversary cannot learn ki−1 and hence cannot forge the DH messages. The PRS property follows from the fact that if at some session i′ there is only an eavesdropping adversary, then the resulting key ki′ is secure, i.e., unknown to the attacker, since this is assured when running DH against an eavesdropping-only adversary. 
It follows that in the following session (i′ +1), the key used for authentication is unknown to the attacker, hence the execution is again secure - and results in a new key ki′ +1 which is again secure (unknown to attacker). This continues, by induction, as long as the attacker is not (somehow) given the key ki to some session i, before the parties receive the messages of the following session i + 1. Many instant messaging applications use a slightly more advanced version of the DH-Ratchet protocol, usually referred to as the Double Ratchet protocol (or algorithm). The Double-Ratchet protocol does not use the DH-derived keys ki directly to protect the traffic; instead, it derives from ki a series of keys used to protect the traffic. The standard double-ratchet protocol is also asynchronous, allowing the two parties to change keys independently and without dependency on time synchronization. The following exercise presents the Synchronous Double-Ratchet protocol, a slightly simpliőed example of the Double-Ratchet protocol which retains much of its security beneőts but assumes synchronized clocks. Applied Introduction to Cryptography and Cybersecurity 6.4. THE DH AND EL-GAMAL PKCS 355 Exercise 6.5 (The Synchronous Double-Ratchet protocol). Alice and Bob use low-energy devices to communicate. To ensure secrecy, they run, daily, the DH-Ratchet protocol (Fig. 6.12), but want to further improve security, by changing keys every hour. However, to save energy and time, the hourly process should use only very efficient computations - and no exponentiations. Let kij denote the key they share after the j th hour of the ith day, where ki0 = ki (the key exchanged in the ‘daily exchange’ of Fig. 6.12). 1. Show how Alice and Bob should set their hourly shared secret key kij . 2. Identify the security benefits of your solution, compared to the ‘regular’ DH-Ratchet protocol. Solution: 1. kij = fkj−1 (1) (the value 1 is arbitrary of course). i 2. The protocol uses kij as the ‘session key’, i.e., the key used to protect the traffic in the j th hour of the ith day. Assume the őrst hour of the day is numbered j = 1. The advantage is that exposure of kij , for any hour j > 0, does not expose the ‘ratchet master key’ of that day ki = ki0 . Hence, such exposure only exposes traffic sent during this hour but not traffic sent in any other hour (or day). This is in contrast with the DH-Ratchet protocol, where exposure of the session key ki , used throughout day i, exposes all the traffic of day i, and furthermore, exposes future traffic until the key is recovered (in a day in which the attacker does not eavesdrop). 6.4 The DH and El-Gamal Public Key Cryptosystems In this section, we discuss two related public-key cryptosystems, both based on the discrete-logarithm problem: the (well-known) El-Gamal PKCS and the Diffie-Hellman (DH) PKCS, which is basically a transformation of the DH key exchange protocol into a public-key cryptosystem. In the next section (Section 6.5), we present a third system, the (well-known) RSA public key cryptosystem. 6.4.1 The DH PKC and the Hashed DH PKC In their seminal paper [123], Diffie and Hellman presented the concept of publickey cryptosystems - but did not present an implementation. On the other hand, they did present the (Figure 6.8). We next show that a minor tweak allows us to turn the DH key exchange protocol into a PKC; we accordingly refer to this PKC as the DH PKC, and a variant of it that also uses a hash function h as the DH-h PKC. 
Applied Introduction to Cryptography and Cybersecurity CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY 356 Nurse Alice knows dA , g, p, computes: eA = g dA mod p Bob knows eA , g, p Input: message m $ Select b ← {1, . . . , p − 1}  gb eA h i b mod p, m ⊕ (eA ) mod p Figure 6.13: The DH public key cryptosystem (DH PKC). Bob encrypts plaintext message m, using Alice’s public key eA and the public parameters: a safe prime p and generator g of the group Z∗p . The ciphertext consists of the pair of  h i b strings g b mod p, m ⊕ (eA ) mod p . The value b is selected randomly and known only to Bob, who should select a new b for each encryption. The DH PKC. Let us őrst present the DH public key cryptosystem (DH PKC), illustrated in Figure 6.13. As can be seen, this public key cryptosystem is essentially an adaptation of the DH key exchange protocol (Figure 6.8), using safe prime p. Essentially, instead of Alice selecting random secret a and sending g a mod p to Bob in the őrst ŕow of the DH protocol, Alice selects a fixed $ private key dA , exactly in the same way, i.e., dA ← {1, . . . , p − 1}. Next, Alice computes her public key eA , as: eA ≡ g dA mod p. Bob encrypts a message m using Alice’s public key eA by selecting a random value b ∈ [2, p − 2], and then computing two values: g b mod p and m ⊕ ebA mod p. Bob sends these two values to Alice; note that this is essentially Bob’s role in the DH protocol. Namely, Bob computes the ciphertext according to Equation 6.30: EeA (m) =  $  b← [1, p− 1]  Return g b mod p, m ⊕ (eA )b mod p     (6.30) Notice that the ciphertext is the pair of both of these values. Upon receiving such a ciphertext, which we denote (cb , cm ), Alice can decrypt it by computing: DdA (cb , cm ) = h d cm ⊕ (cb ) A mod p i (6.31) To see that the DH PKC ensures correctness, i.e., that decryption recovers Applied Introduction to Cryptography and Cybersecurity 6.4. THE DH AND EL-GAMAL PKCS the plaintext, we observe that:   b mod p DdA (EeA (m)) = DdA g b mod p, m ⊕ (eA )   h dA b = m ⊕ (eA ) mod p ⊕ g b mod p   = m ⊕ g b·dA mod p ⊕ g b·dA mod p = m 357 mod p i The security of DH PKC and the Hashed DH PKC. Intuitively, the security of DH PKC seems to follow from the CDH assumption (Deőnition 6.6). Let us present a reduction argument which supports this intuition. Assume an attacker A DHP KC is able to learn a random message m from eA and EeA (m). Then we can design an attacker A CDH that will be able to compute g ab mod p, given g a mod p and g b mod p, as follows. Given g a mod p and g b mod p, the attacker A CDH will deőne eA = g a mod p and cb = g b mod p, and select b a random message m and compute cm = m ⊕ (g a mod p) mod p. However, this argument has a ŕaw; even if the CDH assumption holds, the DH PKC may not be a secure encryption. Speciőcally, the argument was based on the assumption that the attacker A DHP KC is able to learn the message m. However, a secure encryption scheme should also prevent disclosure of partial information about the plaintext, as formalized by the indistinguishability test for public key cryptosystems, PKC IND-CPA (Deőnition 2.10). In fact, Claim 6.3 shows that an attacker may learn partial information about g ab mod p from the public values g a mod p and g b mod p - even if the CDH assumption holds. This may, therefore, expose partial information about m when using the DH public key cryptosystem (DH-PKC) as presented in Figure 6.13. One solution to this is to modify the design. Recall that DH-PKC uses the mod p group (with a safe prime p). 
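For concreteness, here is a toy sketch of the (unhashed) DH PKC just described, again over a tiny Z∗p; the helper names are ours. As just discussed, this unhashed variant may leak partial information even if CDH holds, so the sketch is shown only to illustrate the mechanics.

import secrets

p, g = 23, 5   # toy safe prime and generator; illustration only

def dhpkc_keygen():
    d_A = secrets.randbelow(p - 2) + 1        # private key d_A
    return d_A, pow(g, d_A, p)                # public key e_A = g^d_A mod p

def dhpkc_encrypt(e_A: int, m: int):
    # Ciphertext is the pair (g^b mod p, m XOR (e_A^b mod p)); fresh b per encryption.
    b = secrets.randbelow(p - 2) + 1
    return pow(g, b, p), m ^ pow(e_A, b, p)

def dhpkc_decrypt(d_A: int, c_b: int, c_m: int) -> int:
    return c_m ^ pow(c_b, d_A, p)

d_A, e_A = dhpkc_keygen()
c_b, c_m = dhpkc_encrypt(e_A, 9)
assert dhpkc_decrypt(d_A, c_b, c_m) == 9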
We could, instead, use a DDH group (Definition 6.7), i.e., a cyclic group G that is believed to hide all partial information, as in subsection 6.2.4.

Fig. 6.14 presents a different solution: the Hashed DH PKC. In the Hashed DH PKC, we apply a cryptographic hash function h to e_A^b, before XORing it with the message. Namely, we compute c_m = m ⊕ h(e_A^b mod p). Specifically, this should be a randomness-extractor hash function. The output of a randomness-extractor hash h should be indistinguishable from random; hence, it should hide all partial information about the key. The ciphertext is, therefore, the pair:

E_{e_A}(m) ≡ ( g^b mod p, m ⊕ h((e_A)^b mod p) )

Yet another variant of the DH PKC uses a keyed key-derivation function KDF_s(g^{d_A·b} mod p), with a uniformly-random key/salt s. See the discussion of randomness extractor hash functions and key derivation functions (KDF) in subsection 3.5.3.

Exercise 6.6. Show the correctness of the DH-h PKC.

Figure 6.14: The Hashed DH Public Key Cryptosystem (Hashed DH PKC): same as the DH PKC (Figure 6.13), except that Bob hashes the 'one-time key' e_A^b mod p, i.e., sends ( g^b mod p, m ⊕ h((e_A)^b mod p) ).

6.4.2 The El-Gamal PKC

The El-Gamal PKC is another encryption scheme based on the DH key exchange protocol, which is closely related to the DH PKC. As for the DH PKC, we discuss three variants of the El-Gamal PKC: (1) using the mod p group (for a safe prime p), (2) using a DDH group, and (3) Hashed El-Gamal.

The original design of the El-Gamal PKC, in [158], uses multiplications mod p where p is a safe prime, like the DH PKC. Key generation is also done as in the DH PKC, i.e., Alice selects her private key randomly as d_A ← {2, . . . , p − 1} and computes her public key e_A as: e_A ≡ g^{d_A} mod p. Even the encryption process is similar to the DH PKC: Bob selects a random value b ∈ [2, p − 1], and computes and sends to Alice a pair of values (c_b, c_m), where c_b ≡ g^b mod p and c_m ≡ m · e_A^b mod p, as in Equation 6.32.

E_{e_A}(m) = (c_b, c_m) ≡ ( g^b mod p, m · e_A^b mod p )   (6.32)

The difference between the original El-Gamal PKC and the DH PKC is in how Bob uses the e_A^b value to encrypt the message m. In the original El-Gamal PKC, Bob multiplies m, i.e., computes c_m ← m · e_A^b mod p, while in the DH PKC, Bob uses exclusive-or, i.e., computes c_m ← m ⊕ e_A^b. Decryption is also modified accordingly, by using (modular) division instead of exclusive-or, i.e.:

D_{d_A}(c_b, c_m) = c_m / (c_b^{d_A}) mod p = c_m · c_b^{−d_A} mod p   (6.33)

Correctness follows similarly to the DH PKC.

Unfortunately, similarly to the DH PKC, the original El-Gamal PKC may expose partial information about the plaintext message m. And, like for the DH PKC, there are two solutions, both similar to the corresponding DH-PKC solution: Hashed El-Gamal, or using El-Gamal with a DDH group (Definition 6.7).

The first solution, Hashed El-Gamal, is similar to the Hashed DH PKC. Namely, in the encryption process, we hash the 'one-time pad' e_A^b before using it to hide the message m. Namely, encryption is, as usual, a pair (c_b, c_m), except that c_m is computed as: c_m ← m · h(e_A^b mod p) mod p. We also compute the hash for decryption:

D_{d_A}(c_b, c_m) = c_m / h(c_b^{d_A} mod p) = c_m · [h(c_b^{d_A} mod p)]^{−1} mod p   (6.34)

The second solution, using El-Gamal with a DDH group, is similar to the use of a DDH group with the DH PKC.
Namely, computations are done over a cyclic group G believed to ensure the DDH assumption (Deőnition 6.7). Using the usual convention where we denote the operation of group G in the same ways that we normally denote multiplication over the reals (or integers), the encrypt and decrypt operations of El-Gamal become even a bit simpler (compared to Equation 6.32 and Equation 6.33:  A EeA (m) = g b , ebA ; DdA (cb , cm ) = cm · c−d (6.35) b The El-Gamal with a DDH group is believed to be secure, i.e., to prevent disclosure of any partial information about the plaintext. In addition, the DDH El-Gamal PKC is multiplicative homomorphic; we explain this property and some of its important applications in the following section. Let us describe the DDH El-Gamal PKC in more details; we also illustrate it in Fig. 6.15. We use g to denote a generator of G, and q to denote the order of G; i.e., G = {g 1 , . . . , g q }. Alice selects her private key dA to be a $ random element in the group, i.e., dA ← {g 1 , . . . , g q }; and her public key is simply eA = g dA . Notice that we use the standard notations of multiplication and exponentiation for the corresponding group operations of G. As shown in Fig. 6.15, the El-Gamal encryption of plaintext m ∈ G, denoted EeA (m), is computed as follows: ) ( $ b ← [1, q]  (6.36) EeA (m) ← Return g b , m · ebA El-Gamal decryption is deőned as: A DdA (cb , cm ) = c−d · cm b (6.37) The correctness property holds since for every message m ∈ G holds: h −d  b i A DdA (g b , m · ebA ) = g b · m · g dA   (6.38) = g −b·dA · m · g b·dA =m Applied Introduction to Cryptography and Cybersecurity CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY 360 $ Alice, private key dA ← {1, . . . , q}, public key eA ← g dA MitM attacker Bob sends plaintext m Nurse $ b ← {1, . . . , q} eA (cb , cm ) where cb ← g b , cm ← ebA · m Output: m′ ≡ x−dA · y m′ = g b −dA Correctness: b −b b · g dA · m = m · (eA ) · m = g dA Figure 6.15: The El-Gamal Public-Key Cryptosystem using a DDH cyclic group G, whose order we denote by q; exponentiations and multiplications are done using the corresponding operations of G. The private key of Alice, denoted dA , and the value b used by Bob, are both selected randomly from the set {1, . . . , q}. Alice computes her public key as eA = g dA . Exercise 6.7. In this exercise, use the mod p modular group (where p is a prime) to compute (the original) El-Gamal encryption. Let p = 5. 1. Find a generator for Z∗p . (There are only three candidates to try!) 2. Let’s select the private key of Alice as dA = 2. Compute Alice’s public key, eA = g dA mod p. 3. Compute El-Gamal encryption of 4 and of 3: c4 ≡ EeA (4), c3 ≡ EeA (3). Comment: this is a randomized encryption, so another encryption may result in a different output! 4. Compute the decryptions of c4 and of c3 . 5. Explain why El-Gamal encryption using mod p group - even for large safe prime p - does not satisfy the requirements of secure encryption. 6.4.3 El-Gamal is Multiplicative-Homomorphic Encryption An encryption scheme (E, D) is multiplicative homomorphic if there is an operation, which we denote ×, deőned over a pair of ciphertext messages, such that for every public-private key pair (e, d) and every pair of plaintext message m1 , m2 holds that Ee (m1 ) × Ee (m2 ) is an encryption of m1 · m2 , namely: m1 · m2 = Dd (Ee (m1 ) × Ee (m2 )) (6.39) Where m1 · m2 is integer multiplication. 
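Before showing this property for El-Gamal, here is a toy sketch of the original (mod-p) El-Gamal PKC itself, in the spirit of Exercise 6.7 but with p = 23; the function names are ours and the parameters are far too small to be secure.

import secrets

p, g = 23, 5   # toy parameters; a real deployment needs a large (DDH) group

def elgamal_keygen():
    d_A = secrets.randbelow(p - 2) + 1        # private key
    return d_A, pow(g, d_A, p)                # public key e_A = g^d_A mod p

def elgamal_encrypt(e_A: int, m: int):
    # Ciphertext (c_b, c_m) = (g^b mod p, m * e_A^b mod p), fresh random b each time.
    b = secrets.randbelow(p - 2) + 1
    return pow(g, b, p), (m * pow(e_A, b, p)) % p

def elgamal_decrypt(d_A: int, c_b: int, c_m: int) -> int:
    # c_m * c_b^(-d_A) mod p; pow with a negative exponent computes the modular inverse.
    return (c_m * pow(c_b, -d_A, p)) % p

d_A, e_A = elgamal_keygen()
c = elgamal_encrypt(e_A, 4)
assert elgamal_decrypt(d_A, *c) == 4

Multiplying two such ciphertexts component-wise yields an encryption of the product of the plaintexts, which is exactly the homomorphic property shown next.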
In this section, we show that the (non-hashed) El-Gamal cryptosystem is multiplicative homomorphic, and discuss some of the applications of this property. We focus on the use of a DDH group G, i.e., a cyclic group which is believed to satisfy the DDH assumption (Definition 6.7).

It is convenient to think of × as a 'multiplication of ciphertexts' operation. Following this, the homomorphic property basically means that the multiplication of two ciphertexts, E_e(m_1) × E_e(m_2), is equivalent to an encryption of the multiplication of the two messages, E_e(m_1 · m_2). Notice that m_1 · m_2 is computed using the group operation rather than normal multiplication.

The El-Gamal × operation is also similar to multiplication. Recall that in the El-Gamal PKC, ciphertexts consist of pairs (c_b, c_m) of elements from the group G. The × operator applied to a pair of ciphertexts, (c_b, c_m) and (c_{b′}, c_{m′}), is defined as:

(c_b, c_m) × (c_{b′}, c_{m′}) ≡ (c_b · c_{b′}, c_m · c_{m′})   (6.40)

The following lemma shows that the × operator correctly computes an encryption of the multiplication of the two plaintext messages m and m′ whose El-Gamal encryptions are (c_b, c_m) and (c_{b′}, c_{m′}), respectively.

Lemma 6.3. Let d_A ← [1, p − 1] and e_A = g^{d_A}. Then for any two messages m, m′ ∈ G and any encryptions E_{e_A}(m), E_{e_A}(m′) of them, it holds that:

m · m′ = D_{d_A}(E_{e_A}(m) × E_{e_A}(m′))   (6.41)

Proof: Let b be the random exponent used to compute E_{e_A}(m) and b′ be the random exponent used to compute E_{e_A}(m′). Then:

E_{e_A}(m) = (g^b, m · e_A^b)   (6.42)
E_{e_A}(m′) = (g^{b′}, m′ · e_A^{b′})   (6.43)

Hence:

E_{e_A}(m) × E_{e_A}(m′) = (g^{b+b′}, m · m′ · e_A^{b+b′})   (6.44)

And:

D_{d_A}(E_{e_A}(m) × E_{e_A}(m′)) = (g^{b+b′})^{−d_A} · m · m′ · (g^{d_A})^{b+b′}   (6.45)
= m · m′   (6.46)

Exercise 6.8. Use the values of p, g from Exercise 6.7, and perform all multiplications mod p.
1. Compute c_M ≡ c_3 × c_4.
2. Compute the decryption of c_M. Explain why the result is as expected from the lemma.
Note: it is not secure to use the mod p group for El-Gamal (last item in Exercise 6.7).

6.4.4 Types and Applications of Homomorphic Encryption

In the next section we discuss another multiplicative-homomorphic scheme, the textbook RSA PKC. Just as we defined multiplicative-homomorphic encryption, we can define other types of homomorphic encryption. One obvious example is additive-homomorphic encryption. There are also encryption schemes which are homomorphic with respect to multiple operations - each using a different operator, of course. In particular, encryption schemes which are homomorphic with respect to both multiplication and addition are referred to as Fully Homomorphic Encryption (FHE).

FHE is a very powerful tool; it allows computation of the encryption of an arbitrary function of the plaintext, given only the ciphertexts. For example, we can encrypt the (secret) inputs, send them to an untrusted computation server which will compute the encryption of the function over the inputs, and then decrypt the results - without exposing the inputs to the untrusted server. Namely, FHE allows arbitrary computations over encrypted data. Such schemes have many applications, e.g., in cloud computing, where an untrusted cloud service performs some computation on encrypted values. However, known FHE schemes are complex and have significant overhead, in terms of computation time and/or key/ciphertext length.
This is in comparison to Partially-Homomorphic Encryption (PHE) schemes that are homomorphic with respect to only one operation, e.g., multiplication. In some applications, a single operations suffices. For example, multiplicative-homomorphic encryption allows multiplication of ciphertexts, as we have just seen; given two ciphertexts, we can compute the encryption of their multiplication - without knowing the plaintexts or the decryption key. In this section, we will give a glimpse of the important applications of homomorphic encryption, limiting ourselves to multiplicative-homomorphic encryption. We focus on the El-Gamal multiplicative homomorphic encryption, and demonstrate how it can be used for applications requiring anonymity, and speciőcally, for anonymous voting. This is a tiny taste from the extensive research on the use of cryptography to ensure privacy, anonymity and secure and private voting. Secure and private voting. Secure voting is essential for democracy; and one of the main requirements is, usually, voting privacy, i.e., preserving the conődentiality of the vote of each individual, and only exposing the tally of entire populations. This may be achieved by use of physical designs such as a ballot box or trusted voting machines. There is extensive research on the use of cryptography to ensure secure electronic voting. We focus on voter privacy, i.e., ensuring the secrecy of the vote of speciőc individuals. First, let us consider a trivial design for an e-voting system: voters encrypt their votes with the public key of a trusted server, to which they then send their votes; the server decrypts and then tallies the votes. This system requires Applied Introduction to Cryptography and Cybersecurity 6.4. THE DH AND EL-GAMAL PKCS Tally server eDS Alice Bob Cora eDS eDS eDS 363 Decrypt server dDS EeDS (piA ) EeDS (piB ) EeDS (piC ) EeDS (piA · piB · piC ) Decrypt and output piA · piB · piC Figure 6.16: Example: Privacy-preserving voting using two servers and multiplicative-homomorphic encryption, e.g., El-Gamal PKC; pi > 1 is a small prime number assigned to candidate i. Voters send encrypted using eDS , the public key of the Decrypt server, the identiőer of their candidate, and send to the tally server. The tally server sends eDS (piA · piB · piC ) to the decrypt server, who outputs the combined vote piA · piB · piC . By factoring this, we őnd how many votes were given to each candidate i. complete trust in the server; in particular, the server can trivially know the vote of each voter. To ensure voter privacy, we separate the two functions of the server and use two servers: a tally server and a decrypt server. We also switch the order of operations: the encrypted votes are sent őrst to the tally server, who aggregates all of them into a single (encrypted) value, and then sent to the decrypt server, who decrypts to produce the őnal outcome. This privacy-preserving voting process is shown in Fig. 6.16. As shown, each candidate i is assigned a unique small prime number: pi > 1. Each voter, e.g. Alice, selects one candidate, say iA (with identiőer piA ), and sends to the tally server EeDS (piA ). The tally server combines the encrypted votes by computing x ≡ EeDS (piA ) × EeDS (piB ) × EeDS (piC ) and sending x to the decrypt server. From Equation 6.39, x = EeDS (piA ) × EeDS (piB ) × EeDS (piC ) = EeDS (piA · piB · piC ), i.e., x is the encryption of piA · piB · piC . Hence the decrypt server outputs piA · piB · piC , i.e., the combined vote. 
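A minimal sketch of this tally flow follows, reusing the toy El-Gamal helpers and parameters from the sketch above (with the keypair (d_A, e_A) standing in for the decrypt server's keys (d_DS, e_DS)); with the toy modulus p = 23, the combined vote must stay below p, a point discussed next.

# Candidate identifiers are small primes; three voters cast ballots.
candidates = {"X": 2, "Y": 3}
ballots = [elgamal_encrypt(e_A, candidates["X"]),
           elgamal_encrypt(e_A, candidates["X"]),
           elgamal_encrypt(e_A, candidates["Y"])]

# Tally server: multiply the ciphertexts component-wise (the x operator),
# without decrypting any individual ballot.
c_b, c_m = 1, 1
for cb_i, cm_i in ballots:
    c_b, c_m = (c_b * cb_i) % p, (c_m * cm_i) % p

# Decrypt server: decrypt only the combined ciphertext (here 2 * 2 * 3 = 12 < p).
combined = elgamal_decrypt(d_A, c_b, c_m)

# Anyone can count votes by factoring the combined vote over the candidate primes.
counts = {}
for name, prime in candidates.items():
    counts[name] = 0
    while combined % prime == 0:
        counts[name] += 1
        combined //= prime
print(counts)   # {'X': 2, 'Y': 1}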
By factoring the combined vote, we őnd how many votes were given to each candidate i. Note that this factoring operation is efficient, since we know exactly all possible factors. Also, note that for the factoring to provide correct result, the combined vote must always be less than p. Obviously, this is not a complete description of a secure voting system. A complete system would not only ensure voter privacy, but also prevent cheating by users and by the servers. Exercise 6.9. We continue with p = 5 and same g and public key from Exercise 6.8; now use this as the public key of the Decrypt server, eDS . Let Applied Introduction to Cryptography and Cybersecurity 364 CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY there be two candidates: 2 and 3 (actually, we can’t have more with p = 5 why?). 1. Alice and Bob vote 2 and Cora votes 3; compute the encrypted votes. 2. Compute the encrypted combined vote (output of Tally). 3. Decrypt the combined vote. 4. The output may not be correct; explain why. 5. Show at least one way in which a corrupt party (voter, tally server or decrypt server) can manipulate the results. Another application: Re-encryption. Let us point out another application of multiplicative homomorphic encryption, such as the El-Gamal PKC: reencryption. Namely, consider encryption (cm , cb ) = EeA (m) of plaintext m using public key eA . Let (c1 , cb,1 ) ← EeA (1) be an encryption of 1 using eA , and let (c′m , c′b ) ← (cm , cb ) × (c1 , cb,1 ). From the homomorphic property (Equation 6.39), (c′m , c′b ) is an encryption of m · 1 = m, i.e., it is also an encryption of m. A further property of re-encryption is that an adversary cannot distinguish between the re-encryption of (cm , cb ) and an encryption of a different message m̂ ̸= m. Re-encryption is used in different protocols, often for anonymous communication; there are other solutions to ensure anonymity, including Tor and anonymous remailer [113], all based on cryptography. Re-encryption preserves the same decryption key. El-Gamal also allows a similar, but different, mechanism, called proxy re-encryption, where another entity called proxy is given a special key, denoted eA→B and computed by Alice, that allows the proxy to transform a ciphertext message encrypted with the key of Alice, c = EeA (m), into an encryption of the same message but with Bob’s key, c′ = EeB (m). Proxy re-encryption has been proposed for different applications, such as monitoring of encrypted traffic. For more details on this mechanism and its applications, see [69]. Re-encryption (and proxy re-encryption) require the use or awareness of the public key eA with which the message was encrypted. In some applications, it is desirable to allow re-encryption without specifying the public key, e.g., for recipient anonymity. In such case, one can use an elegant extension called universal re-encryption, which allows re-encryption without knowledge of the encryption key eA . This is done by appending the encryption EeA (1) to each ciphertext; see details in [171]. Homomorphic encryption cannot be IND-CCA secure! We see that homomorphic encryption has some nice applications. However, there is a caveat, namely, homomorphic encryption cannot be IND-CCA secure. This is especially clear to see, considering the re-encryption application. Namely, an attacker can re-encrypt the encryption c∗ = Ee (m∗ ) of a challenge message m∗ , resulting in ciphertext c′ ̸= c∗ , which also decrypts to m∗ . The CCA attack allows Applied Introduction to Cryptography and Cybersecurity 6.5. 
THE RSA PUBLIC-KEY CRYPTOSYSTEM 365 decryption of c′ , since c′ = ̸ c∗ , and this provides the attacker with the challenge ∗ message m . This argument can be extended to show that homomorphic encryption cannot ensure ‘relaxed CCA security’ (rCCA), where the attacker cannot make a ciphertext query which decrpts to the challenge plaintext (such as c′ ); see Exercise 6.29. The homomorphic property is so useful, that it has multiple applications such as these mentioned above, in spite of the fact that it cannot be INDCCA (or even IND-rCCA) secure. However, this does require extra care and expertise. For example, we brieŕy mentioned that the text book RSA PKC is multiplicative-homomorphic. We will see that this fact was, indeed, abused in an important attack against RSA, which motivates use of RSA with a padding mechanism (which makes it non-homomorphic). With that, let us proceed to discuss RSA. 6.5 The RSA Public-Key Cryptosystem In 1978, Rivest, Shamir and Adelman presented the őrst proposal for a publickey cryptosystem - as well as a digital signature scheme [334]. This beautiful scheme is usually referred to by the initials of the inventors, i.e., RSA; it was awarded the Turing award in 2002, and is still widely used. We will cover here only some basic details of RSA; a more in-depth study is recommended, by taking a course in cryptography and/or reading one of the many books on cryptography covering RSA in depth. The reader may want to refresh on subsection A.2.2 and Section A.2.3 before learning this section. 6.5.1 RSA key generation. Key generation in RSA is more complex than for DH and El-Gamal. We őrst list the steps, and then explain them: • Select a pair of large prime numbers p, q, speciőcally, both p and q would  have N2 bits. Let n = p · q and let ϕn = (p − 1) · (q − 1). As a result, n would have at least N bits. • Select a value e which is co-prime to ϕn , i.e., gcd(e, ϕn ) = 1. • Compute d s.t. e · d mod ϕn = 1. • The public key is (e, n) and the private key is (d, n). Selecting e to be co-prime to ϕn is necessary - and sufficient - to ensure that e has a multiplicative inverse d in the group mod ϕn . To őnd the inverse d, we can use the extended Euclidean algorithm. This algorithm efficiently őnds numbers d, x s.t. e · d + ϕn · x = gcd(e, ϕn ) = 1; namely, e · d = 1 mod ϕn . See subsection A.2.2. Applied Introduction to Cryptography and Cybersecurity CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY 366 The public key of RSA is (e, n) and the private key is (d, n), since the modulus n is required for both encryption and decryption. However, we - and others - often abuse notation and refer to the keys simply as e and d. 6.5.2 Textbook RSA: encryption, decryption, and signing. The RSA cryptosystem is based on the RSA-encrypt function EeRSA (m), applied to plaintext m using public key e, and the RSA decryption function DdRSA (c), applied to ciphertext c and using the private key d. These functions are computed as follows: EeRSA (m) DdRSA (c) = = me mod n cd mod n (6.47) (6.48) Here, the message m is encoded as a positive integer, and limited to m < n, ensuring that m = m mod n. In subsection 6.5.4 we show that this ensures correctness, i.e., correct decryption, namely: For every message m and c ← EeRSA (m) holds: m = DdRSA (c) (6.49) We use the term textbook-RSA encryption, for RSA encryption performed by directly applying the RSA-encrypt function to the plaintext, without any padding or other preprocessing, i.e., using E RSA and DRSA as deőned above. 
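The following toy sketch puts the pieces together: key generation, and the textbook RSA encrypt and decrypt functions. The names and the tiny hard-coded primes are ours; such keys are trivially breakable, so this is illustration only.

import math

def textbook_rsa_keygen():
    # Toy key generation with tiny, hard-coded primes (real p, q must be
    # random primes of roughly 1500 bits each for long-term security; see Table 6.1).
    p, q = 61, 53
    n, phi = p * q, (p - 1) * (q - 1)
    e = 17
    assert math.gcd(e, phi) == 1       # e must be co-prime to phi(n)
    d = pow(e, -1, phi)                # d = e^(-1) mod phi(n), via extended Euclid
    return (e, n), (d, n)              # public key, private key

def rsa_encrypt(pub, m):
    e, n = pub
    return pow(m, e, n)                # E(m) = m^e mod n

def rsa_decrypt(priv, c):
    d, n = priv
    return pow(c, d, n)                # D(c) = c^d mod n

pub, priv = textbook_rsa_keygen()
assert rsa_decrypt(priv, rsa_encrypt(pub, 42)) == 42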
As we explain in subsection 6.5.5, textbook RSA encryption has significant vulnerabilities. Therefore, in practice, the input to RSA is always processed; this preprocessing of the input is referred to as padding. Padding is defined by a pair of functions (pad, unpad). The input to pad, and the output of unpad, are plaintext messages; and the two functions should ensure that unpadding of a padded message recovers the message as it was before padding, namely:

(∀m) m = unpad(pad(m))   (6.50)

Therefore, for every correct public key cryptosystem (KG, E, D) and a corresponding keypair (e, d), i.e., (e, d) ← KG(1^l), the correctness of padded encryption holds:

(∀m, (e, d) ← KG(1^l)) m = unpad(D_d(E_e(pad(m))))   (6.51)

Let us now consider the specific case of (padded) RSA encryption. From the correctness of RSA (Equation 6.49) follows the correctness of padded RSA. Namely, for every message m and c ← E_e^RSA(pad(m)) holds m = unpad(D_d^RSA(c)). Or:

(∀m) m = unpad( ([pad(m)]^e mod n)^d mod n )   (6.52)

We illustrate textbook-RSA vs. padded-RSA in Figure 6.17, and discuss standard RSA padding in subsection 6.5.6.

Figure 6.17: RSA encryption: (a) textbook RSA (vulnerable) vs. (b) padded RSA. (a) Textbook-RSA, presented in subsection 6.5.2 and shown to be vulnerable in subsection 6.5.5: the plaintext m is encrypted directly, c ← m^e mod n. (b) Padded RSA, usually following the PKCS#1 standard (see subsection 6.5.6): the plaintext is first padded, M ← Pad(m), and then encrypted, c ← M^e mod n. Version 1.5 of PKCS#1 is vulnerable; version 2 of PKCS#1, also referred to as OAEP, is considered secure. We use M for the padded plaintext (M = Pad(m)).

Textbook RSA signatures: encrypting with private key? RSA is also the basis for a signature scheme, which we discuss in subsection 6.6.1. Signing uses the same key-generation process as explained above, except that we usually denote the public verification key by v (instead of the public encryption key e) and the private signing key by s (instead of the private decryption key d). Similarly to encryption, we can also define textbook RSA signatures, by computing the RSA function over the message using the private signing key s, as in:

Sign_s^RSA(m) = m^s mod n   (6.53)
Verify_v^RSA(m, σ) = True if m = σ^v mod n, False otherwise   (6.54)

In practice, textbook RSA signatures are never used. One reason is that textbook RSA signatures can only be applied if the message m is less than n, which is rarely, if ever, the case. Therefore, in practice, RSA signatures always involve some additional processing, typically using the Hash-then-Sign paradigm (subsection 3.2.6), i.e., computing Sign_s^RSA(h(m)). Using textbook RSA signatures also introduces vulnerabilities, which are similar to those outlined for textbook RSA encryption in subsection 6.5.5.

Notice that the textbook RSA signing equation, Equation 6.53, is exactly the same as the textbook RSA encryption equation, Equation 6.47, except for the use of the private signing key s instead of the public encryption key e. Therefore, you may find references to RSA signing as 'encryption with the private key'. We recommend avoiding this expression, since it applies only to textbook RSA signatures and textbook RSA encryption, which are both vulnerable and not used in practice, and not to the secure RSA signatures and encryption (using padding and/or hashing).
See further discussion of RSA signatures in subsection 6.6.1. 6.5.3 Efficiency of RSA The RSA algorithms are conceptually simple; however, their computation requires considerable resources. The basic reason for that is that RSA security completely breaks down if an attacker is able to factor n, and őnd its factors (p and q); this allows computation of ϕ(n) = (p − 1) · (q − 1), and using ϕ(n), the computation of the private key d from the public key e, since d = e−1 mod n (and using ϕ(n) we can compute the multiplicative inverses). Factoring is a well-studied problem, and while no polynomial factoring algorithm is known, there are algorithms that improve efficiency considerably. As a result, RSA keys should be quite long, as shown in Table 6.1; notice that the key-length should be chosen based on the maximal time at which the encryption should remain secret, and not based on the current time. For example, for information whose conődentiality should be preserved until 2040, the modulus n should be about 3000 bits, i.e., p and q should be random prime with about 1500 bits. Computations, especially exponentiation, with such extremely long numbers, is computationally intensive. The computations are modulo n, which keeps the results from becoming even longer, but this requires computation of the modulus of the result (and optionally of intermediate values), and the computation of the modulus is also a computationally-intensive problem. Therefore, efficiency is a major consideration for implementation of RSA. In this subsection, we discuss only one of the most basic optimizations: choosing e to improve efficiency. The public exponent e is not secret, and, so far, we only required it to be co-prime to ϕ(n). This motivates choosing e that will improve efficiency usually, to make encryption faster. In particular, choosing e = 3 implies that encryption - i.e., computing me mod n - requires only two multiplications, i.e., is very efficient (compared to exponentiation by larger number). Note, however, that there are several concerns with such extremely-small e; in particular, if m is also small, in particular, if c = me < n (without reducing mod n), then we can efficiently decrypt by 1/e taking the e-th root: c1/e = (me ) = m. This particular concern is relatively easily addressed by padding, as discussed below; however, there are several additional attacks on RSA with very-low exponent e, e.g., [105]. Some of these attacks are for the case where a party may encrypt and send the same message, or ‘highly related’ messages, to multiple recipients. These attacks motivate (1) the use of padding to break any possible relationships between messages , as well as (2) the choice of slightly larger e, such as 17 or 216 + 1 = 65537. The reason to choose these speciőc primes is that exponentiation requires only 5 or 17 multiplications, respectively; see next exercise. Applied Introduction to Cryptography and Cybersecurity 6.5. THE RSA PUBLIC-KEY CRYPTOSYSTEM 369 Exercise 6.10. Given integer m, show how to compute m17 in only five 16 multiplications, and m2 +1 in only 17 multiplications. Hint: use the following idea: compute m8 with three multiplications by   2 2 m8 = m2 . Handling long plaintext: hybrid encryption. To complete our discussion of RSA efficiency, let us comment that RSA encryption, like other publickey cryptosystems, used hybrid encryption to encrypt long messages. 
This is necessary, since the input m to the RSA-encrypt function must be less than the modulus n, or decryption will output m mod n which will differ from m. Theoretically, we could have selected n to be longer than the longest message, but this would have resulted in excessive overhead. Therefore, we select n based on the security requirements (Table 6.1), which is much shorter than (normal) plaintext. To encrypt ‘normal’ messages, which are typically much longer, we apply hybrid encryption. In hybrid encryption, we use the public key encryption to encrypt a shared key k, and then use the shared key to efficiently encrypt the long message m. See subsection 6.1.6 and Figure 6.3. 6.5.4 Correctness of RSA Does RSA decryption really work? Obviously, yes, it does; we now will explain why it does. Before we ‘really’ explain this, an exercise may give some intuition. To solve the exercise (and understand the following discussion), the reader may want to refresh on multiplicative inverses subsection A.2.2 and Euler’s function and theorem Section A.2.3. Exercise 6.11 (Textbook RSA ensures correctness, i.e., decryption recovers message). Let p = 7. 1. Recall: Z∗p , for a prime p, is the group containing the numbers from 1 to p − 1, with the modular multiplication operation. A generator for Z∗p is a number g ∈ {1, . . . , p − 1} such that by multiplying g by itself enough times, each time modulus p, we get all the numbers in Z∗p . Find a generator g for Z∗p ; show that g is a generator and how you found it. 2. What is ϕ(p)? 3. Let q = 11, and let n = q · p. Compute ϕ(q), ϕ(p) and ϕ(pq); for each of them, compute directly from the definition, and using the relevant facts/lemmas that we learned or that appear in the textbook. 4. Let e = 11 be an RSA encryption key for the modulus n; compute the corresponding private key d. Note: this is a correction; previously we had e = 3 - do you see why that value wasn’t good? Applied Introduction to Cryptography and Cybersecurity CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY 370 5. Encode your initials by mapping the letters (A to Z) to the corresponding numeric values (from 1 to 26), resulting in f, l ∈ {1, 2, . . . , 26}. Compute m = f + l + 7. RSA 6. Compute c = Ee,n (m) = me mod n. RSA 7. Compute m′ = Dd,n (c) = cd mod n. 8. Explain why this encryption is insecure - yet why the use of this value e = 11 may be secure in other applications of RSA. Let us now ‘really’ explain why textbook RSA decryption recovers the plaintext, i.e., the correctness of textbook RSA (Equation 6.49), namely, (∀m)DdRSA (EeRSA (m)) = m. RSA’s correctness is based on Euler’s Theorem (Theorem A.3), which says that for any co-prime integers m, n, holds mφ(n) = 1 mod n, where ϕ(n) is the Euler function, deőned as the number of positive integers which are less than n and co-prime to n. See Section A.2.3. We use the theorem, to explain RSA’s correctness, i.e., why DdRSA (EeRSA (m)) = m. Note that for any primes p, q holds ϕ(p) = p − 1, ϕ(q) = q − 1, and ϕ(p · q) = (p − 1)(q − 1) (Lemma A.1). This is the reason for us using ϕn = ϕ(n) = ϕ(p · q) = (p − 1)(q − 1) in the RSA key generation process. Recall that e · d = 1 mod ϕ(n), i.e., for some integer i it holds that e · d = 1 + i · ϕ(n). Hence: me·d mod n m1+i·φ(n) mod n  i m · mφ(n) mod n = = b Recall Eq. (A.5) : ab mod n = (a mod n) mod n. 
Assuming that m and n are co-prime (gcd(m, n) = 1), we can apply Euler’s theorem and can substitute mφ(n) = 1 mod n and receive:  i me·d mod n = m · mφ(n) mod n mod n (6.55) = m · 1i mod n = m mod n (6.56) It follows that: DdRSA (EeRSA (m)) = DdRSA (me = (me mod n) e·d =m  = m · 1i d mod n mod n = m · mφ(n) =m mod n) mod n mod n i (6.57) mod n mod n = m Applied Introduction to Cryptography and Cybersecurity 6.5. THE RSA PUBLIC-KEY CRYPTOSYSTEM 371 Notice that m mod n = m since we restricted plaintext messages m so that m < n. Namely, under the assumption (above) that m and n are co-primes, we have shown the correctness of textbook RSA. What about the assumption that m and n are co-primes? Unfortunately, it does not always hold. Recall that the message m may be any positive integer s.t. 1 < m < n. Most, but not all, of these possible messages - i.e., integers smaller than n - are co-prime to n. In fact, the number of integers smaller than n and co-prime to n is exactly the deőnition of ϕ(n), which we know to be: ϕ(n) = ϕ(p · q) = (p − 1) · (q − 1) = n − q − p + 1. So most possible messages m are indeed co-prime to n. Still, p + q − 2 messages are not co-prime to n; this number is much smaller than n but is 1 still polynomial in n (roughly n 2 ), and our explanation does not hold for these values. We assure the reader, however, that correctness holds also for these values; it ‘just’ requires a slightly more elaborate argument. Such arguments, usually using the Chinese Remainder Theorem, can be found in many textbooks on cryptography and number theory, e.g., [205]. 6.5.5 The RSA assumption and the vulnerability of textbook RSA Now that we have seen that the textbook RSA PKC ensures correctness, it is time to discuss its security. We will őrst discuss the underlying security assumption, and then discuss several vulnerabilities of textbook RSA, which are the reason that in practice, we always use padded RSA, which we discuss in subsection 6.5.6. The security of RSA encryption is based on the RSA assumption. Intuitively, the RSA assumption is that there is only negligible probability that an efficient adversary A correctly recovers the plaintext m, given the ciphertext me mod n and the public key (e, n). Let us restate the RSA assumption a bit more formally. Definition 6.8 (RSA assumption). Choose n, e as explained above, i.e., n = pq for p, q chosen as random l-bit prime numbers, and e is co-prime to ϕn . The RSA assumption is that for any efficient (PPT) algorithm A and constant c, and for sufficiently large l, holds: Pr [A((e, n), me mod n) = m] ∈ N EGL(l) (6.58) $ Where m is chosen randomly m ← [2, n − 2]. The RSA assumption is also referred to sometimes as the RSA trapdoor one-way permutation assumption. The ‘trapdoor’ refers to the fact that d is a ‘trapdoor’ that allows inversion of RSA; the ‘one-way’ refers to the fact that computing RSA (given public key (e, n)) is easy, but inverting is ‘hard’; and the ‘permutation’ is due to RSA being a permutation (and in particular, invertible). See also the related concept of one-way functions in ğ3.4. Applied Introduction to Cryptography and Cybersecurity 372 CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY One obvious question is the relation between the RSA assumption and the assumption that factoring is hard. Assume that some adversary AF can efficiently factor large numbers, speciőcally, the modulus n (which is part of the RSA public key). 
Hence, AF can factor n, őnd q and p, compute ϕn and proceed to compute the decryption key d, given the public key < e, n >, just like done in RSA key generation. We therefore conclude that if factoring is easy, i.e., there exists such adversary A, then the RSA assumption cannot hold (and RSA is insecure). Textbook RSA is multiplicative-homomorphic - and vulnerable. Assume now that we are willing to accept the RSA assumption. What, then, about the security of RSA, when used as a public key cryptosystem (PKC)? In this subsection, we discuss textbook RSA, and argue that it is vulnerable; this motivates the PKCS#1 speciőcations for ‘padded RSA encryption’, which we discuss in the next subsection. Before we discuss the vulnerabilities of textbook RSA, let us point out an important property of it: textbook RSA is multiplicative-homomorphic. Indeed, this follows quite simply: EeRSA (m1 · m2 ) = (m1 · m2 ) = me1 · me2 e mod n mod n = EeRSA (m1 ) · EeRSA (m2 ) mod n As we already discussed in subsection 6.4.3, a homomorphic encryption scheme cannot be IND-CCA secure. This is a drawback, requiring extra care in design of secure applications using a homomorphic encryption scheme. However, in the case of textbook RSA, there are additional vulnerabilities, which makes its use clearly inadvisable: 1. Unlike El-Gamal PKC, the textbook RSA PKC is deterministic; hence, encryption of the same plaintext m, will always result in the same ciphertext c = me mod n. Suppose that the attacker guesses (or knows) a set of likely (or possible) plaintexts, say m1 , m2 and m3 . The attacker can easily compute, say, c1 = me1 mod n; if the plaintext message m was the same as m1 , then c = c1 . Textbook RSA encryption resembles, in this sense, the insecure ECB mode of operation (Section 2.8), which has the same vulnerability. Secure encryption5 - shared key or public key - must be randomized and/or stateful! 2. Textbook-RSA is vulnerable to low-exponent attacks, especially when sending low-value messages (small m); we mentioned the trivial attack when me < n. See more elaborate low-exponent attacks in [105]); these exploit scenarios where we send identical or related messages to multiple recipients. 5 However, some designs of cloud databases employ deterministic encryption, specifically to facilitate identification of encryption of the same element. Of course, this must be done with great care to avoid unintended exposure. Applied Introduction to Cryptography and Cybersecurity 6.5. THE RSA PUBLIC-KEY CRYPTOSYSTEM 373 3. The RSA assumption does not rule out a potential exposure of partial information about the plaintext, e.g., a particular bit. The log(n) leastsigniőcant bits were shown to be as secure as the entire preimage [10]; however, it may be possible to expose other bits, like the one bit of the discrete log (see subsection 6.1.8). 4. Finally, while every homomorphic encryption scheme is vulnerable to CCA attacks, textbook RSA is also vulnerable to much weakened version of CCA attacks, where the attacker only needs to receive very limited information about the ciphertext. In subsection 6.5.7 we present Bleichenbacher’s attack, which can effectively ‘break’ textbook RSA given only a onebit error indicator, speciőcally, indication if the result of textbook RSA decryption is a ‘properly padded plaintext’. Furthermore, this attack breaks not only ‘textbook RSA’, but also some versions of padded RSA, in particular, when using the (widely-used and very simple) RSA PKCS#1 version 1.5 padding (deőned in Equation 6.59). 
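Two of these weaknesses are easy to demonstrate with the toy textbook-RSA helpers from the earlier sketch (rsa_encrypt, rsa_decrypt and the keypair pub, priv; all names ours): determinism allows confirming plaintext guesses, and the multiplicative homomorphism allows meaningful manipulation of ciphertexts.

# (1) Deterministic encryption: an attacker who can guess the plaintext
#     confirms the guess by re-encrypting it and comparing ciphertexts.
c = rsa_encrypt(pub, 42)
recovered = [guess for guess in range(100) if rsa_encrypt(pub, guess) == c]
assert recovered == [42]

# (2) Multiplicative homomorphism: E(m1) * E(m2) mod n decrypts to m1 * m2 mod n,
#     so an attacker can transform ciphertexts without knowing the plaintexts.
n = pub[1]
c_prod = (rsa_encrypt(pub, 6) * rsa_encrypt(pub, 7)) % n
assert rsa_decrypt(priv, c_prod) == (6 * 7) % n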
We discuss padded RSA in the following subsection.

6.5.6 Padded RSA encryption: PKCS#1 v1.5 and OAEP

In the previous subsection, we saw that textbook RSA has significant vulnerabilities, making it inadvisable to use. Therefore, to improve security, practical deployments always use padded RSA, i.e., apply a pad function to the message before applying the encryption operation, and a corresponding unpad function to recover the plaintext after decryption. The unpad function should recover the message as it was before padding, which ensures the correctness of padded RSA, i.e., for every message m we always have m = unpad(D_d^RSA(E_e^RSA(pad(m)))) (Equation 6.52).

Practical deployments of RSA encryption usually follow one of the versions of the PKCS #1 specifications [386]. (PKCS stands for Public Key Cryptography Standards; the PKCS specifications were published by RSA Security LLC, and versions 1.5, 2.0, 2.1 and 2.2 were defined by the IETF, in RFCs [219, 224, 225] and [386], respectively.) Before version 2.0, PKCS#1 defined only one padding, which we refer to as the v1.5 padding; version 2.0 added the OAEP padding, based on the design from [43]. We briefly discuss both of these widely used padding schemes.

PKCS#1 version 1.5 padding. The v1.5 padding was designed mainly to address two of the vulnerabilities of textbook RSA: vulnerability 1 (deterministic output) and vulnerability 2 (low-exponent attacks). Namely:

• To prevent an attacker from identifying multiple encryptions of the same plaintext, possibly by encrypting guesses of possible plaintexts, the padding includes a sufficient number of random bits.

• To prevent the low-exponent attacks (vulnerability 2), the message bytes are prepended with the random bits. To further ensure a sufficiently large value for the padded plaintext, this is prepended with a non-zero byte.

Specifically, the PKCS#1 version 1.5 padding algorithm, Pad_v1.5(·), is defined as follows. Given an input (pre-padding) plaintext message m, and a random string r of at least eight non-zero random bytes, we compute the padded message M = Pad_v1.5(m) as:

M = Pad_v1.5(m) = 0x00 ++ 0x02 ++ r ++ 0x00 ++ m    (6.59)

To ensure that the binary value of M is less than n, as required for correct decryption, the (pre-padding) message m must contain at most l − 11 bytes, where l is the length of the modulus n (in bytes). The value of the second byte (0x02), being non-zero, prevents low-exponent attacks. In addition, the fact that the second byte is 0x02 identifies that the operation applied to the input was RSA encryption, and not, say, RSA signing.

In the decryption process, we first obtain the padded message M, from which we can easily extract and return only the plaintext m. The message should be returned only if the padding is correct, i.e., M begins with 0x0002, followed by at least eight non-zero bytes, followed by a zero byte (separating the random padding from the plaintext). If M deviates from this in any way, an error indicator is returned.

As the readers hopefully agree, the PKCS#1 version 1.5 padding, defined in Equation 6.59, is simple to understand and easy to implement; a short code sketch follows. These properties are part of the reason that this padding was quickly adopted by many systems, and is still quite widely deployed.
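For concreteness, here is a minimal sketch of the v1.5 padding and the corresponding unpadding check; it is a simplified illustration of Equation 6.59, not a hardened or standards-complete implementation.

```python
# Minimal sketch of PKCS#1 v1.5 padding (Equation 6.59) and the corresponding
# unpadding check. Illustrative only -- not a hardened implementation.
import os

def pad_v15(m: bytes, mod_len: int) -> bytes:
    """Return 0x00 ++ 0x02 ++ r ++ 0x00 ++ m, where r has no zero bytes."""
    if len(m) > mod_len - 11:
        raise ValueError("message too long")
    r = b""
    while len(r) < mod_len - 3 - len(m):      # at least eight bytes in practice
        b = os.urandom(1)
        if b != b"\x00":                      # padding bytes must be non-zero
            r += b
    return b"\x00\x02" + r + b"\x00" + m

def unpad_v15(M: bytes) -> bytes:
    """Return m if M is correctly padded; otherwise raise an error."""
    if len(M) < 11 or M[0:2] != b"\x00\x02":
        raise ValueError("bad padding")
    sep = M.find(b"\x00", 2)                  # first zero byte after the prefix
    if sep < 10:                              # fewer than 8 non-zero pad bytes
        raise ValueError("bad padding")
    return M[sep + 1:]
```

In padded RSA encryption, the padded message M is then interpreted as an integer and encrypted as M^e mod n; decryption applies this unpadding check to the result of the textbook RSA decryption.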
However, the PKCS#1 version 1.5 padding turns out to be vulnerable to some attacks. The first of these was Bleichenbacher's padding side-channel attack, which we discuss in the next subsection (and see Figure 6.21).

Optimal Asymmetric Encryption Padding (OAEP). A more secure padding, called Optimal Asymmetric Encryption Padding (OAEP), was proposed by Bellare and Rogaway [43]. From version 2.0 of the PKCS#1 standard up to the current version 2.2 [386], the standard includes both OAEP and the PKCS#1 v1.5 padding. However, OAEP should be used whenever possible: it is more secure, without noticeable extra overhead.

Intuitively, OAEP further improves the security of RSA by introducing two mechanisms, addressing the two main vulnerabilities that remain with the v1.5 padding:

Mix all bits: To deal with the concern that RSA may expose partial information (vulnerability 3), OAEP mixes up all the bits of the plaintext before applying textbook-RSA encryption. The mixing makes it necessary to expose many or all bits of the input to the textbook RSA encryption function, in order to expose any information about the plaintext.

Redundancy against chosen-ciphertext attacks: To foil chosen-ciphertext attacks (CCA), OAEP adds redundancy to the plaintext before applying encryption to it. The OAEP decryption process returns the decrypted plaintext only if it contains the same redundancy. This should make it infeasible for the attacker to learn sensitive information by CCA, i.e., by sending manipulated ciphertext messages, since, almost always, their decryption would not have the correct redundancy (and would be rejected).

Bellare and Rogaway coined the term plaintext-aware encryption for encryption with such a validation function as part of the decryption process, where an attacker must know ('be aware of') a plaintext m in order to create a ciphertext c which will be decrypted into m. If the attacker produces a ciphertext c′ without knowing a corresponding plaintext m′, then the decryption of c′ is bound to result in an error indication rather than in a decrypted plaintext m′.

Following [43], we first present a simplified version of the OAEP padding, which only implements the 'mix all bits' mechanism, to prevent exposure of partial information, but does not add redundancy to defend against CCA attacks. Later, we describe the (non-simplified) OAEP padding, which extends the simplified padding and also adds redundancy to defend against CCA attacks.

Simplified OAEP padding (Pad2(·)). The simplified OAEP padding (Pad2(·)), illustrated in Figure 6.18, is already quite clever; therefore, let us develop it in three stages: Pad0, Pad1 and then Pad2.

First, for Pad0, consider an attacker which can only expose a single bit in the preimage of the RSA encryption; of course, we don't know which bit. To prevent such an attacker from exposing any bit of the plaintext, we let Pad0 first select a random string r of the same length as the plaintext m, and then output (m ⊕ r) ++ r. Namely, the Pad0 padding selects a random one-time pad r for m; the padded message M consists of the 'encrypted' plaintext m ⊕ r, concatenated with the 'pad' r.

The Pad0 padding has two obvious drawbacks. First, it only protects against exposure of a single bit; what if the attacker can expose two bits of the preimage, e.g., the first bit of m ⊕ r and the (corresponding) first bit of r? Second, do we really need to send such a long pad (as long as the plaintext m)?
We address both issues in Pad1, by using a shorter random string r, and applying a cryptographic hash function g to 'stretch' it into a longer string g(r), as long as the plaintext m; we now use g(r) as the one-time pad. Clearly, we reduced the overhead, since r is shorter. Furthermore, if we view g as a 'random oracle', then security is also improved: even if the attacker learns several - but not all - of the bits of the random string r, the output g(r) still looks random.

Suppose, however, that the attacker can expose |r| bits of the preimage of RSA - specifically, all the bits of the (short) random string r. Of course, r cannot be too short if we rely on it to randomize g(r), so this would imply that the RSA encryption function is much weaker than we expect; but still, it would be nice to protect also against such a case - if we can do it efficiently. And we can, as is done by Pad2.

[Figure 6.18: Simplified-OAEP (Pad2(·)) padded RSA encryption: given a plaintext m (n bits) and a random string r (k bits), M = Pad2(m) = (m ⊕ g(r)) ++ (r ⊕ f(m ⊕ g(r))), and c = E_e^RSA(M) = M^e mod n. The goal of the Pad2(·) padding is to protect against partial exposure of the RSA preimage; it does not validate that the ciphertext is the result of applying encryption, i.e., it does not ensure plaintext-awareness.]

In fact, in this third (and final) simplified padding, Pad2, we simply apply again the 'hash then one-time pad' method of Pad1 - this time, to 'protect r'. More specifically, given plaintext m, the Pad2 algorithm selects a random string r, first computes m ⊕ g(r), and then outputs the padding (m ⊕ g(r)) ++ (r ⊕ f(m ⊕ g(r))), where f is yet another cryptographic hash function. To further clarify, let us write down both the Pad2 and the corresponding UnPad2 functions:

Pad2(m) ≡ (m ⊕ g(r)) ++ (r ⊕ f(m ⊕ g(r)))
UnPad2(M) ≡ M[0 : n − 1] ⊕ g(f(M[0 : n − 1]) ⊕ M[n : n + k − 1])    (6.60)

The UnPad2 function takes advantage of the known sizes of the two components of Pad2(m): the n-bit m ⊕ g(r) and the k-bit r ⊕ f(m ⊕ g(r)). As the reader can easily confirm, this padding is correct, i.e., m = UnPad2(Pad2(m)).

OAEP padding. Finally, we describe the 'complete' OAEP padding, as presented in [43] and later standardized by the IETF [386]. The OAEP padding adds to the 'simplified padding' (Pad2 above) a simple redundancy mechanism that ensures plaintext-awareness, and thereby provides an effective defense against CCA attacks. Our description of the padding and of its security is a bit simplified, but we believe it suffices for our (educational) purposes; there are also (minor) differences between the details of the design in [43] and the design standardized by the IETF. Readers interested in the details should refer to [386] and to the security analysis in [43] and in follow-up works.

[Figure 6.19: OAEP padding [43]: for a random string r, x ≡ (m ++ 0^l) ⊕ g(r), y ≡ r ⊕ f(x), M = x ++ y, and c = E_e^RSA(M) = M^e mod n.]

To add redundancy to the plaintext, OAEP simply appends to the plaintext message m a string of l zero bits, i.e., 0^l, as illustrated in Figure 6.19. (A short code sketch of the simplified padding Pad2 follows.)
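Before turning to the choice of l, here is a minimal sketch of the simplified padding Pad2 and its inverse UnPad2 (Equation 6.60); the parameter sizes and the SHA-256-based choices of f and g are illustrative assumptions, not the standardized OAEP.

```python
# Minimal sketch of the simplified-OAEP padding Pad2 / UnPad2 (Equation 6.60).
# Illustrative assumptions: the 'data' part is 32 bytes, the random string r
# is 16 bytes, and g and f are built from SHA-256. Not the standardized OAEP.
import hashlib, os

DATA_LEN, R_LEN = 32, 16

def g(r: bytes) -> bytes:                  # expands r to DATA_LEN bytes
    return hashlib.sha256(b"g" + r).digest()[:DATA_LEN]

def f(x: bytes) -> bytes:                  # compresses x to R_LEN bytes
    return hashlib.sha256(b"f" + x).digest()[:R_LEN]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def pad2(m: bytes) -> bytes:
    assert len(m) == DATA_LEN
    r = os.urandom(R_LEN)
    x = xor(m, g(r))                       # 'mix all bits' of the plaintext
    y = xor(r, f(x))                       # hide r behind f(x)
    return x + y

def unpad2(M: bytes) -> bytes:
    x, y = M[:DATA_LEN], M[DATA_LEN:]
    r = xor(y, f(x))
    return xor(x, g(r))

m = os.urandom(DATA_LEN)
assert unpad2(pad2(m)) == m                # correctness: m = UnPad2(Pad2(m))
```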
The value of l should be sufficiently large to ensure plaintext-aware encryption, i.e., to make it impractical to find a ciphertext which decrypts to a plaintext with the correct set of zero bits, except for ciphertexts obtained by applying the encryption process (to a known plaintext). A typical value for l may be 128 bits.

The design of OAEP mostly inherits the security properties of the simplified OAEP padding, Pad2(·). In addition, OAEP adds redundancy to the plaintext, to ensure plaintext-aware encryption and thereby defend against CCA attacks. Suppose the attacker sends some ciphertext c which is not the result of legitimate encryption. Namely, the attacker does not know the resulting plaintext; and even if we assume that the attacker can find some bits of the preimage, it would definitely not know all of r, and therefore cannot ensure that the decryption will have the correct 0^l redundancy string. Therefore, intuitively, the attacker does not gain information from CCA attacks.

6.5.7 Bleichenbacher's Padding Side-Channel Attack on PKCS#1 v1.5

In 1998, Daniel Bleichenbacher presented possibly the most important attack against public-key cryptosystems based on RSA [72]. Bleichenbacher's attack uses the chosen-ciphertext side-channel attack (CCSCA) model. We begin this subsection by briefly discussing this attack model, and the important area of side-channel attacks in general, and then focus on Bleichenbacher's attack.

Side-channel attacks and the chosen-ciphertext side-channel attack (CCSCA) model. In chosen-ciphertext side-channel attacks, as in other attack models for encryption schemes, the attacker receives a challenge ciphertext c∗, which is the encryption of a challenge plaintext m∗. At the end of the attack, the attacker outputs a guess m, and we say that the attacker wins if its guess is correct, i.e., if m = m∗.

What is unique about the CCSCA model are the attacker capabilities. Specifically, as in a 'regular' CCA attack, the attacker can submit ciphertexts c1, c2, . . . to be decrypted; however, in a chosen-ciphertext side-channel attack (CCSCA), the attacker does not receive the results of the decryption. Instead, the attacker receives only some side-channel information regarding the decryption process and the decrypted message. See Figure 6.20.

[Figure 6.20: The chosen-ciphertext side-channel attack (CCSCA) model. The attacker is given a challenge ciphertext c∗ ← E_e(m∗) and submits ciphertexts c1, c2, . . . for decryption (mi ← D_d(ci)), but receives only side-channel feedback from their processing: timing, errors, power, noise, etc. The attacker 'wins' when its guess mg is identical to the challenge, i.e., when mg = m∗.]

A side-channel is a transmission of information over a non-standard channel, which was not intended for communication by the system designers, and, possibly, not considered when evaluating the security of the system. Such a channel is due to some 'side-effect' of the operation of the system. For example, in the attack models we discussed in Chapter 2, e.g., CPA (see Figure 2.8), the model defines clearly which information is available to the attacker; any other information is therefore excluded. But a side-channel may provide some additional information, 'outside the model'.

There are different types of side-channels, including timing of events, power consumption, electromagnetic radiation, audio signals, error indicators and more. There are also different applications and goals for side-channels, including non-cryptographic side-channels; for example, one common use of side-channels
is to infiltrate information across a firewall or another device which inspects information sent outside of an organization, to detect leakage.

Even focusing on cryptographic side-channels, there are different types and goals. In particular, sometimes side-channels are viewed as increasing the power of the attacker; for example, a side-channel that leaks information about the computation may allow leakage of information regarding the secret/private key. In other side-channel attacks, the side-channel is viewed as a weaker assumption regarding the attacker capabilities; specifically, in Bleichenbacher's attack, the attacker receives 'only' a very limited indication about the decrypted message, instead of receiving the exact plaintext (as in a CCA attack).

The information leaked in each side-channel 'signal' is, typically, extremely limited, e.g., only a single bit. Indeed, in many side-channel attacks, each 'signal' does not provide even one bit of information, since the side-channel signal is obscured by random noise. In spite of that, there have been many successful side-channel attacks on cryptographic systems and other security systems. However, we only cover Bleichenbacher's attack, as designed against RSA encryption implementations that use the PKCS#1 version 1.5 padding, defined in Equation 6.59.

Bleichenbacher's side-channel attack. Bleichenbacher's side-channel attack is one of the most important and well-known side-channel attacks. The attack is against RSA encryption using the PKCS#1 version 1.5 padding; this padding is very popular, and used by numerous systems and standards, including several variants of the important SSL and TLS protocols (Chapter 7).

Bleichenbacher's attack, and variants of it, apply to many systems and standards. In particular, Manger [275] showed a variant that can attack even the variant of OAEP standardized as version 2.0 of PKCS#1. As shown in [341], this attack can also be applied against many implementations of PKCS#1 v1.5, with greater efficiency than the original Bleichenbacher attack. We note that Manger's attack exploits a seemingly-minor detail of PKCS#1 version 2.0, which differs from the original OAEP design; this detail was fixed in version 2.2 of PKCS#1 [292].

[Figure 6.21: Bleichenbacher's attack on RSA PKCS#1 version 1.5. The attacker is given a ciphertext c∗ and outputs a guess plaintext mg. The attacker's goal is to compute the 'correct' plaintext mg = m∗, where m∗ = (c∗)^d mod n. To compute mg, the attacker sends ciphertexts ci = c∗ · (si)^e mod n, where the si are different integers. The attacker receives only the one-bit side-channel indications {fi}, where fi is True iff Mi ← (ci)^d mod n, the ith output of the textbook RSA decryption function, has correct PKCS#1 version 1.5 padding; mi ← Unpad(Mi) denotes the outcome of the PKCS#1 decryption of ci.]

The setup of Bleichenbacher's attack is illustrated in Figure 6.21.
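To make the one-bit side-channel concrete, the following sketch shows the padding-correctness 'oracle' that the attacker effectively queries; the private key d, modulus n and modulus length are hypothetical placeholders, and the check shown is a simplified version of the v1.5 unpadding rule, not any specific real implementation.

```python
# Minimal sketch of the 'padding oracle' queried in Bleichenbacher's attack.
# d, n are the victim's (hypothetical) private key and modulus; mod_len is the
# length of n in bytes.
def padding_oracle(c: int, d: int, n: int, mod_len: int) -> bool:
    """Return True iff the textbook-RSA decryption of c is correctly v1.5-padded."""
    M = pow(c, d, n).to_bytes(mod_len, "big")   # M_i <- c_i^d mod n
    if M[0:2] != b"\x00\x02":
        return False                            # must start with 0x0002
    sep = M.find(b"\x00", 2)                    # first zero byte after the prefix
    return sep >= 10                            # at least eight non-zero pad bytes

# The attacker never sees the decrypted plaintext; it only observes this bit,
# for ciphertexts of the form c_i = c* * (s_i ** e) % n.
```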
The attacker's goal is to compute a string M which is the same size as the modulus n, whose length we denote by l bytes, and such that M = (c∗)^d mod n, for a given 'challenge ciphertext' c∗. Basically, M would be the padded version of the original plaintext m, encoded by the PKCS#1 version 1.5 padding as in Equation 6.59; it would then be trivial for the attacker to unpad M and find the original plaintext m.

Bleichenbacher's attack works for arbitrary c∗, i.e., c∗ is not necessarily the result of PKCS#1 v1.5 encryption. This can be very useful in some scenarios, and is exploited in some attacks based on Bleichenbacher's attack, including attacks we discuss in Chapter 7. However, we focus on the simpler and common case where c∗ is the result of RSA PKCS#1 v1.5 padded encryption of some plaintext m∗, which simplifies the attack a bit.

Often, the attacker obtains c∗ by eavesdropping on a transmission from a benign sender, which computed c∗ by encrypting some challenge message m∗ using RSA with PKCS#1 v1.5 padding. In this case, if the attacker succeeds in computing (c∗)^d mod n, this gives the padded challenge message M∗ = Pad(m∗), which easily provides m∗ itself, since m∗ = UnPad(Pad(m∗)) = UnPad(M∗).

Since c∗ is the result of RSA encryption using PKCS#1 v1.5, we have M∗ = Pad_v1.5(m∗). From the definition of the pad in Equation 6.59, we know that a properly-padded message such as M∗ must begin with 0x0002. Hence:

2B ≤ M∗ < 3B, where B ≡ 2^(8(l−2))    (6.61)

where l is the length, in bytes, of n (and hence of M). Note that Equation 6.61 does not reflect any knowledge about the plaintext m∗ or the private key, only about the padded plaintext M∗ ← Pad_v1.5(m∗) and the public key (n, e).

The attack generates a sequence of sets {Mi}_{i≥0}, each containing one or more intervals of integers, where the correct solution, M∗, is in one of these intervals. The initial set, M0, simply contains one interval containing all the l-byte strings beginning with 0x0002, i.e.:

M0 ≡ { [2B, 3B − 1] }    (6.62)

Obviously, the correct solution, M∗, is within the interval contained in M0. The attack proceeds to iteratively produce additional sets of intervals, {Mi}, such that for every i > 0, the set of values contained in (the intervals of) Mi is a strict subset of the set of values in Mi−1, but always includes the correct solution M∗. The attack completes when Mi contains only one value - which must, therefore, be the correct solution, M∗.

Note that when the number of elements in Mi is sufficiently small, the attacker could also exhaustively look for the solution, i.e., the value Mg ∈ Mi such that c∗ = (Mg)^e mod n. This is often more efficient than continuing the attack until Mi contains only a single element. In particular, we can use the fact that every correctly-padded message, with PKCS#1 v1.5, contains at least one all-zero byte between the random string and the actual (unpadded) plaintext, which should rule out roughly 255/256 of the values in Mi. Similarly, we can use any additional information we may have about the plaintext, such as a specific format or known contents of part of the plaintext.

The attack computes, for every i > 0, two values: first, an integer si, and then, the set of intervals Mi, terminating when Mi contains only a single value (which would be the solution M∗). The computation is done iteratively over i, beginning with i = 1.
Computation of s1. Let us first describe the special case of computing s1. We compute s1 as the smallest integer such that the textbook RSA decryption of c∗ · (s1)^e mod n has correct padding, as we learn from the one-bit feedback upon sending c∗ · (s1)^e mod n to the decryption device. Namely, M∗ · s1 mod n is well-padded, and, in particular, begins with 0x0002. Note: it suffices to begin searching for s1 from the minimal value ⌈n/(3B)⌉, since for smaller values of s1, M∗ · s1 mod n cannot begin with 0x0002 (i.e., is not well padded).

Computation of si for i > 1, if Mi−1 contains more than one interval. In this case, si is the smallest integer such that si > si−1 and the decryption of c∗ · (si)^e mod n has correct padding; in particular, the resulting padded plaintext, M∗ · si mod n, begins with 0x0002.

Computation of si for i > 1, if Mi−1 contains exactly one interval, Mi−1 = {[a, b]}. In this last case, choose small integers ri, si such that the decryption of c∗ · (si)^e mod n has correct padding and the following two conditions hold:

ri ≥ 2 · (b · si−1 − 2B) / n
(2B + ri · n) / b ≤ si < (3B + ri · n) / a    (6.63)

Computation of Mi. Finally, after si has been found, we compute the new set Mi as:

Mi ← ∪_{(a,b,r)} { [ max(a, ⌈(2B + r·n)/si⌉), min(b, ⌊(3B − 1 + r·n)/si⌋) ] }    (6.64)

for all [a, b] ∈ Mi−1 and for all integers r such that (a·si − 3B + 1)/n ≤ r ≤ (b·si − 2B)/n.

The size of the intervals in the Mi sets decreases in each iteration, but analysis of the rate of this decrease, and, in particular, of the required number of padding-correctness feedback queries, is beyond our scope. See the (simplified) analysis in [72], which estimates that the attack requires about 2^20 (about a million) queries.

Finally, we prove that the attack finds M∗.

Lemma 6.4. For every Mi produced by the Bleichenbacher attack as described above, M∗ ∈ Mi holds.

Proof: We already mentioned that M∗ ∈ M0. Assume that M∗ ∈ Mi−1; we prove that M∗ ∈ Mi. Whenever we choose si, we confirm, using the padding-correctness feedback, that M∗ · si mod n is well-padded, and, in particular, begins with 0x0002. Namely, for some integer r holds:

2B ≤ M∗ · si − r · n ≤ 3B − 1    (6.65)

namely:

(2B + r · n) / si ≤ M∗ ≤ (3B − 1 + r · n) / si    (6.66)

Now, since M∗ ∈ Mi−1, there exists an interval [a, b] ∈ Mi−1 which contains M∗, i.e., a ≤ M∗ ≤ b. Substituting this in Equation 6.65, we have:

(a · si − (3B − 1)) / n ≤ r ≤ (b · si − 2B) / n    (6.67)

The combination of Equation 6.66 and Equation 6.67 implies that M∗ must be in one of the intervals of Mi, as defined in Equation 6.64.

6.6 Public key signature schemes

We now discuss the third type of public-key cryptographic schemes: signature schemes, introduced in subsection 1.5.1. Signature schemes consist of three efficient algorithms (KG, S, V), illustrated in Figure 1.6 (a short library-based sketch follows the list below):

Key-generation KG: a randomized algorithm, whose input is the key length l, and which outputs the private signing key s and the public validation key v, each of length l bits.

Signing S: a (deterministic or randomized) algorithm, whose inputs are a message m and the signing key s, and whose output is a signature σ.

Validation V: a deterministic algorithm, whose inputs are a message m, a signature σ and the validation key v, and which outputs an indication of whether σ is a valid signature for this message or not.
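To relate the abstract (KG, S, V) interface to practice, here is a brief sketch using the Python cryptography package; the choice of RSA-PSS and the specific parameter values are illustrative assumptions, not prescribed by the text.

```python
# Sketch: mapping the (KG, S, V) interface to the 'cryptography' package.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.exceptions import InvalidSignature

# KG: generate the private signing key s and public validation key v
s = rsa.generate_private_key(public_exponent=65537, key_size=2048)
v = s.public_key()

pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                  salt_length=padding.PSS.MAX_LENGTH)
m = b"example message"

# S: sign message m with the private key s
sigma = s.sign(m, pss, hashes.SHA256())

# V: validate the signature with the public key v
try:
    v.verify(sigma, m, pss, hashes.SHA256())
    print("valid signature")
except InvalidSignature:
    print("invalid signature")
```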
Figure 1.7 illustrates the process of signing a message (by Alice) and validation of the signature (by Bob). We denote Alice's keys by A.s (for the private signing key) and A.v (for the public validation key); note that this figure assumes that Bob knows A.v - we later explain how signatures also facilitate distribution of public keys such as A.v.

Signature schemes have two critical properties, which make them a key enabler of modern cryptographic systems. First, they facilitate secure remote exchange in the MitM adversary model; second, they facilitate non-repudiation. We begin by briefly discussing these two properties.

Signatures facilitate secure remote exchange of information in the MitM adversary model. Public key cryptosystems and key-exchange protocols facilitate establishing private communication and shared keys between two remote parties, using only public information (keys). However, this still leaves the question of authenticity of the public information (keys).

If the adversary is limited in its abilities to interfere with the communication between the parties, then it may be trivial to ensure the authenticity of the information received from the peer. In particular, many works assume that the adversary is passive, i.e., can only eavesdrop on messages; this is also the basic model for the DH key exchange protocol. In this case, it suffices to simply send the public key (or other public value). Some designs assume that the adversary is inactive or passive during the initial exchange, and use this to exchange information such as keys between the two parties. This is called the trust on first use (TOFU) adversary model. In other cases, the attacker may inject fake messages, but cannot eavesdrop on messages sent between the parties; in this case, parties may easily authenticate a message from a peer, by previously sending a challenge to the peer, which the peer includes in the message.

However, all these methods fail against the stronger Man-in-the-Middle (MitM) adversary, who can modify and inject messages as well as eavesdrop on messages. To ensure security against such an attacker, we must use strong, cryptographic authentication mechanisms. One option is to use message authentication codes; however, this requires the parties to share a secret key in advance - and if that's the case, the parties could use this shared key to establish secure communication directly.

Signature schemes provide a solution to this dilemma. Namely, a party receiving signed information from a remote peer can validate that information using only the public signature-validation key of the signer. Furthermore, signatures also allow the party performing the signature-validation to first validate the public signature-validation key itself, even when it is delivered over an insecure channel which is subject to a MitM attack, such as email. This solution is called public key certificates.

[Figure 6.22: Public key certificate issuing and usage processes.]

As illustrated in Fig. 6.22, a public key certificate is a signature by an entity called the issuer or certificate authority (CA), over the public key of the subject, e.g., Alice. In addition to the public key of the subject, subject.v, the signed information in the certificate contains attributes such as the validity period, and, usually, an identifier and/or name for the subject (Alice).
Once Alice receives her signed certificate Cert, she can deliver it to the relying party (e.g., Bob), possibly via insecure channels such as email or the Internet Protocol (IP). This allows the relying party (Bob) to use Alice's public key, i.e., rely on it, e.g., to validate Alice's signature over a message m, as shown in Fig. 6.22. Note that this requires Bob to trust this CA and to have its validation key, CA.v. This discussion of certificates is very basic; more details are provided in Chapter 8, which discusses public-key infrastructure (PKI), and in Chapter 7, which discusses the important TLS and SSL protocols.

Signatures facilitate non-repudiation. The other unique property of digital signature schemes is that they facilitate non-repudiation. Namely, upon receiving a properly signed document, together with a signature by some well-known authority establishing the public signature-validation key, the recipient is assured that she can convince other parties that she received the properly signed document. This is a very useful property. This property does not hold for message-authentication codes (MAC schemes), where a recipient can validate that an incoming message has the correct MAC code, but cannot prove this to another party - in particular, since the recipient is able to compute the MAC code herself for arbitrary messages.

6.6.1 RSA-based signatures

RSA signatures were proposed in the seminal RSA paper [334], and are based on the RSA assumption, with exactly the same key-generation process as for the RSA PKC. The only difference in key generation is that for signature schemes, the public key is denoted v (as it is used for validation), and the private key is denoted s (as it is used for signing).

There are two main variants of RSA signatures: signature with message recovery, and signature with appendix. We begin with signatures with appendix, since in practice, almost all applications of RSA signatures are with appendix; in fact, we present (later) signatures with message recovery mainly since they are often mentioned, and almost as often, a cause for confusion.

RSA signature with appendix. In the (theoretically-possible) case that input messages are very short, and can be encoded as a positive integer which is less than n, we can sign using RSA by applying the RSA exponentiation directly to the message, resulting in the signature σ. In this case, the signature and validation operations are defined as:

S_s^RSA(m) = m^s mod n
V_v^RSA(σ, m) = { m if m = σ^v mod n, 'error' otherwise }

Above, s is the private signature key, and v is the public validation key. The keys are generated using the RSA key generation process; see subsection 6.5.1.

In practice, as discussed in §3.2.6, input messages are of variable length - and rarely shorter than the modulus. Hence, real signatures apply the Hash-then-Sign (HtS) paradigm, using some cryptographic hash function h whose range is contained in [1, . . . , n − 1], i.e., allowable input to the RSA function. Applied to the RSA FIL signature as defined above, we have the signature scheme (S^{RSA,h}, V^{RSA,h}), defined as follows:

S_s^{RSA,h}(m) = [h(m)]^s mod n
V_v^{RSA,h}(σ, m) = { m if h(m) = σ^v mod n, 'error' otherwise }

The resulting signature scheme is secure, if h is a CRHF; see §3.2. This signature scheme is called signature with appendix since it requires transmission of both the original message and its signature; a toy code sketch follows.
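Here is a toy sketch of the hash-then-sign scheme (S^{RSA,h}, V^{RSA,h}), reusing the toy modulus from the earlier sketches with signing key s = d and validation key v = e; the parameters are illustrative assumptions and the tiny modulus is, of course, insecure.

```python
# Toy sketch of RSA hash-then-sign 'signature with appendix'.
# Toy parameters only: n = 61*53, v = 17, s = 17^(-1) mod phi(n) = 2753.
import hashlib

n, v, s = 3233, 17, 2753

def h(m: bytes) -> int:
    """Hash into the range [1, n-1] (toy construction, for illustration only)."""
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % (n - 1) + 1

def sign(m: bytes) -> int:
    return pow(h(m), s, n)            # sigma = h(m)^s mod n

def validate(sigma: int, m: bytes) -> bool:
    return h(m) == pow(sigma, v, n)   # check h(m) = sigma^v mod n

m = b"pay Bob 10 dollars"
sigma = sign(m)
assert validate(sigma, m)                            # correct signature validates
print(validate(sigma, b"pay Bob 1000 dollars"))      # almost surely False
```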
Signature with appendix is in contrast to a rarely used variant of RSA signatures called signature with message recovery, which we explain next. 'Signature with recovery' is rarely, if ever, applied in practice; we describe it since there are many references to it in the literature, and, in fact, this method causes quite a lot of confusion among practitioners. Hopefully the text below will help to avoid such confusion.

RSA signature with message recovery. RSA signatures with message recovery have the cute property that they only require transmission of one mod-n integer - the signature; the message itself does not need to be sent, as it is recovered from the signature. This cute property would result in a small saving of bandwidth, compared to signature with appendix, when both methods are applicable. However, as we explain below, this method is rarely applicable; furthermore, it is a cause of frequent confusion.

RSA signatures with message recovery require the use of an invertible padding function R(·), which is applied to the messages to be signed. The main goal of R is to ensure sufficient, known redundancy in R(m); this is why we denote it by R. This redundancy, applied to the message before the public key signature operation, should make it unlikely that a random value would appear to be a valid signature. The output R(m) is used as input to the RSA exponentiation; hence, to ensure recovery, the value of R(m) must be an allowed input, i.e., in the range [1, . . . , n − 1] (where n is the RSA modulus). Note that this implies that the length of m has to be even shorter than the length of R(m), since R(m) must contain all of m, as well as the redundancy.

Once R is defined, the signature and validation operations for RSA with Message Recovery (RSAwMR) would be:

S_s^{RSAwMR}(m) = [R(m)]^s mod n    (6.68)
V_v^{RSAwMR}(x) = { R^{-1}(x^v mod n) if defined, else 'error' }    (6.69)

For validation to be meaningful, there should be only a tiny subset of the integers x s.t. x^v mod n would be in the range of R, i.e., the result of the mapping of some message m. Since there are at most n values of x^v mod n to begin with, this means that the range of R - whose size equals the number of legitimate messages - must be tiny in comparison with n; in other words, the message space must be really tiny.

In reality, messages being signed are almost always far too long to fit in the tiny message space available for signatures with message recovery. Hence, the use of this method is almost non-existent. In fact, our description of signature schemes (Figure 1.6) assumed that the message is sent along with its signature, i.e., our definition did not even take into consideration schemes like this, which avoid sending the original message entirely.

Note that RSA signatures with message recovery are often a cause of confusion, due to their syntactic similarity to RSA encryption. Namely, you may come across people referring to the use of 'RSA encryption with the private key' as a method to authenticate or sign messages. What these people really mean is the use of RSA signatures with message recovery. We caution to avoid such confusing use of terminology; RSA signatures are usually used with appendix, but even in the rare cases of using RSA signatures with message recovery, RSA signing is not the same as encryption with the private key!
Applied Introduction to Cryptography and Cybersecurity 388 6.7 CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY Labs and Additional Exercises Lab 4 (Breaking textbook and weakly-padded RSA). In this lab we will break textbook RSA encryption, as well as padded RSA encryption, using specific (weak) padding schemes. As for the other labs in this textbook, we will provide Python scripts for generating and grading most questions in this lab (LabGen.py and LabGrade.py). For this lab, the lab-generation script is called LabGenRSA.py, and should be provided in the lab-scripts folder. In addition, if learning with a professor, the students may be asked to submit a lab report with their results, including results to questions which are not auto-graded. If the programs are not yet posted online, professors may contact the author to receive the scripts. The lab-generation script generates random challenges for each student (or team), as well as solutions which will be used by the grading script. We recommend to make the scripts available to the students, as example of how to use the cryptographic functions. It is easy and permitted to modify these scripts to use other languages/libraries or to modify and customize them as desired. 1. To warm up, we perform textbook RSA decryption. In your lab-input folder, őnd őles e1, d1, n1, ma1, mb1, cx1 and cy1, all generated by the LabGen.py script. Use the private decryption key d1 (and the modulus n1) to decrypt cx1 and cy1; save the results in the corresponding őles mx1 and my1 in the lab-answers folder. To allow you to check your program, one of these two answers (mx1 and my1) should be identical to one of the two input message őles, ma1 and mb1. If you got this one right, most likely you also got the other decryption right. To warm up, implement three variants of RSA decryption, using: (1a) textbook RSA, (1b) PKCS#1 version 1.5, and (1c) OAEP. In your labinput folder, őnd őles e1, d1, n1, ma1, mb1, cx1a, cx1b, cx1c, cy1a, cy1b and cy1c, all generated by the LabGen.py script, as well as a textbook RSA module. Files e1 and d1 contain an RSA encryption and decryption keys, respectively, both using the modulus n1. Use the private key d1 and the modulus n1 to decrypt: (1a) cx1a and cy1b using textbook RSA, (1b) cx1b and cy2b using PKCS#1 version 1.5, and (1c) cx1c and cy1c using OAEP. For the PKCS and OAEP encryptions, you should also check and remove the padding. Save the results in the corresponding őles mx1a, mx1b, mx1c, my1a, my1b, and my1c in the lab-answers folder. To allow you to check your program, one result from each pair should be identical to one of the two input message őles, ma1 and mb1. If you got this one right, most likely you also got the other decryption right. You can also conőrm that during decryption of the PKCS#1 version 1.5 and OAEP ciphertexts, you őnd correctly padded plaintexts. 2. To further warm up, let’s also do textbook RSA encryption. Use the public encryption key e1 (and the modulus n1) to encrypt ma1 and mb; Applied Introduction to Cryptography and Cybersecurity 6.7. LABS AND ADDITIONAL EXERCISES 389 save the results in the corresponding őles ca1 and cb1 in the lab-answers folder. Again, one of these two answers (ca1 and cb1) should be identical to one of the two input ciphertext őles, cx1 and cy1. If you got this one right, most likely you also got the other decryption right. 3. In this item, we break textbook RSA encryption. 
In your lab-input folder, őnd a őle ciphertexts.csv containing ‘eavesdropped ciphertexts’ (and corresponding identiőers), and őle plaintexts.csv containing ‘suspected plaintexts’ (and corresponding identiőers), both using the CSV format (check it out). You should be able to identify two plaintexts from plaintexts.csv as corresponding to two of the ciphertexts (in ciphertexts.csv). The őle pair0-1 in the lab-input folder contains one match (as a pair of commaseparated identiőers of plaintext and ciphertext); check that one of the matches you found is the same as the contents of this őle. If so, then the other pair you found should also be correct; save it in őle pair0-2 in the lab-answers folder, using the same format as of pair0-1. Measure also the runtime, and upload it as őle t0. 4. Now that we see how textbook RSA is insecure, let’s try a naive padding. Speciőcally, let us deőne the NP1 (‘Naive Padding 1’) as: N P 1(m) = 0x02 + +r+ + m, where m is the (pre-padding) plaintext message, r is one byte consisting of four random bits followed by four zero bits. This is an (overly) simpliőed version of the PKCS#1 version 1.5 padding algorithm, P adv1.5 (·), as deőned in Equation 6.59 (subsection 6.5.6). Reuse the ciphertexts.csv and plaintexts.csv őles. You should again be able to identify two of the plaintexts from plaintexts.csv as corresponding to two of the ciphertexts (in ciphertexts.csv), this time, when applying padding N P 1 to the plaintexts before applying RSA textbook encryption (using e1 and n1). The őle pair1-1 in the lab-input folder contains one match (as a pair of comma-separated identiőers of plaintext and ciphertext); check that one of the matches you found is the same as the contents of this őle. If so, then the other pair you found should also be correct; save it in őle pair1-2 in the lab-answers folder, using the same format. Measure also the runtime, and upload it as őle t1. 5. Repeat the previous item, for a random string r containing (1) 8 random bits, (2) 12 random bits followed by four zero bits, (3) 16 random bits and (4) 20 random bits followed by four zero bits. Check your results using the provided pair2-1, pair3-1, pair4-1 and pair5-1 őles, and save your ‘solution pairs’ as őles pair2-2, pair3-2, pair4-2 and pair5-2. Measure also the runtime. a) Create a graph of the runtime for the time it took to decrypt using a random string from 0 to 20 bits. b) Identify the function giving the runtime as a function of the number of random bits. Applied Introduction to Cryptography and Cybersecurity CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY 390 c) Using the function you identiőed, approximate the runtime if you used this attack to decrypt encryption with padding of eight random bytes, the minimal number of random bytes required by PKCS#1 version 1.5. d) Repeat, for padding of 12 random bytes. , and 6. Challenge. Find in lab-input folder the őles c8 − 1, c8 − 2, c12 − 1 and c12 − 2. These are all encryptions using the N P 1 padding, except using a random string r of length 8 bytes (for c8 − 1, c8 − 2) or 12 bytes (for c12 − 1, c12 − 2). Your goal is to őnd the corresponding plaintexts. To assist you in őnding the plaintexts, use the ‘padding-correctness feedback’ interface in the lab web-server, or provided by your professor. This interface allows you to upload a ciphertext c and receive an indication if this ciphertext is properly padded using N P 1 or not. To allow you to test your program, you will őnd the plaintexts p8−1, p12−1 in the lab-input folder. 
Once you őnd the one or both of the other solutions, upload them in the corresponding őles (p8 − 2, p12 − 2) in the lab-answers folder. Upload also the runtimes, as őles t8 and t12, and add them to your graph of run-times from the previous item, comparing them to the runtimes you projected using the attack of the previous item. 7. Discussion. Explain why it is signiőcantly easier to attack the N P 1 padding, compared to the Bleichenbacher’s attack against the PKCS#1 v1.5 padding. Exercise 6.12 (Addition/multiplication key exchange is insecure). Present a sequence diagram similar to Figure 6.5 and Figure 6.6, but using addition or multiplication instead of XOR/exponentiation. Show that the resulting protocol is vulnerable to an eavesdropping attacker. Exercise 6.13. Show that the exponential key exchange (Figure 6.6) is insecure against an eavesdropper, even if the base g used by the protocol is a secret shared between Alice and Bob. Exercise 6.14 (Justiőcation for limitations on possible random inputs for key exchange protocols). Show that in both the Diffie-Hellman key exchange protocol and the Modular Exponentiations key exchange protocol, there will be no gain in security if the parties choose their random inputs (a, b and, for exponential key exchange, also k) from a larger set, say {1, . . . , 2p}. Show that the same holds for the selection of g. Exercise 6.15. The Diffie-Hellman protocol is a special case of a key exchange protocol, defined by the pair of functions (KG, F ), as introduced in subsection 6.1.3. Applied Introduction to Cryptography and Cybersecurity 6.7. LABS AND ADDITIONAL EXERCISES 391 1. Present the Diffie-Hellman protocol as a key exchange protocol, i.e., define the corresponding (KG, F ) functions. 2. We presented two assumptions regarding the security of the DH protocol: the Computational-DH (CDH) assumption and the Decisional-DH (DDH) assumption. Show that one of these assumption does not suffice to ensure key-indistinguishability? What about the other one? Exercise 6.16. It is proposed that to protect the DH protocol against an imposter, we add an additional ‘confirmation’ exchange after the protocol terminated with a shared key k = h(g ab mod p). In this confirmation, Alice will send to Bob M ACk (g b ) and Bob will respond with M ACk (g a ). Show the message-flow of an attack, showing how a MitM (Man-in-the-Middle) attacker can impersonate as Alice (or Bob). The attacker has ‘MitM capabilities’, i.e., it can intercept messages (sent by either Alice or Bob) and inject fake messages (incorrectly identifying itself as Alice or Bob). Exercise 6.17. Suppose that an efficient algorithm to find discrete log is found, so that the DH protocol becomes insecure; however, some public-key cryptosystem (G, E, D) is still considered secure, consisting of algorithms for, respectively, key-generation, encryption and decryption. 1. Design a key-agreement protocol which is secure against an eavesdropping adversary, assuming that (G, E, D) is secure (as a replacement to DH). 2. Explain which benefits the use of your protocol may provide, compared with simple use of the cryptosystem (G, E, D), to protect the confidentiality of messages sent between Alice and Bob against a powerful MitM adversary. Assume Alice and Bob do have known public keys. Exercise 6.18. Assume that there is an efficient (PPT) attacker A that can find a specific bit in g ab mod p, given only g a mod p and g b mod p. 
Show that the DDH assumption does not hold for this group, i.e., that there is an efficient (PPT) attacker A′ that can distinguish, with significant advantage over a random guess, between g^ab mod p and g^x mod p for x taken randomly from [1, . . . , p − 1].

Exercise 6.19. It is frequently proposed to use a PRF as a Key Derivation Function (KDF), e.g., to extract a pseudo-random key k′ = PRF_k(g^ab mod p) from the DH exchanged value g^ab mod p, where k is a uniformly random key (known to the attacker). In particular, in subsection 6.3.1, a variant of the Auth-DH protocol uses a function f assumed to fulfill both the PRF requirements and the KDF requirements. In this exercise, we explore alternatives.

1. Let f be a secure PRF; note that f may be a KDF or not. Show a function f′ which is (1) also a PRF and (2) not a secure KDF.

2. Let g be a secure KDF; note that g may be a PRF or not. Show a function g′ which is (1) also a KDF and (2) not a secure PRF.

3. Present a variant of the Auth-DH protocol, as a modification of Figure 6.11, which uses a PRF (instead of the MAC) and a KDF (for key derivation). Explain why this variant is secure, when the keys of the PRF and of the KDF are chosen independently (using uniform distribution).

4. Let f be a secure PRF and g be a secure KDF. Show functions f′, g′ such that f′ is a PRF and g′ is a KDF, and furthermore the following holds: the protocol in the previous item may be insecure when used with f′ and g′, if both of them use the same symmetric master key MK. Note: this is an example of the principle of key separation (Principle 10).

[Figure 6.23: How not to ensure resilient key exchange: illustration for Ex. 6.20. In round i, Alice picks a random ai and sends xi = g^ai mod p together with MAC_{ki−1}(xi); Bob picks a random bi, derives k(i)_{B,A} ≡ h(xi^bi mod p), and responds with yi = g^bi mod p together with MAC_{k(i)_{B,A}}(yi); Alice derives k(i)_{A,B} ≡ h(yi^ai mod p), and the ith session key is ki = k(i)_{A,B} = k(i)_{B,A}.]

Exercise 6.20 (How not to ensure resilient key exchange). Fig. 6.23 illustrates a slightly different protocol for authenticating the DH protocol, using a changing key ki (to ensure resilient key exchange). Present a sequence diagram showing that this protocol is not secure.

[Figure 6.24: Insecure 'robust-combiner' authenticated DH protocol, studied in Exercise 6.21. Alice and Bob share a master key MK; Alice sends xi = g^ai mod p together with f_MK(xi) and g_MK(xi); Bob responds with yi = g^bi mod p together with f_MK(yi) and g_MK(yi); each party computes the DH value y′ = yi^ai mod p = xi^bi mod p and derives the ith session key ki = f_MK(y′) ⊕ g_MK(y′).]

Exercise 6.21. The protocol in Fig. 6.24 is an (incorrect) attempt at a robust-combiner authenticated DH protocol.

1. Show a sequence diagram for an attack showing that this variant is insecure.

2. Show a simple fix that achieves the goal (a robust-combiner authenticated DH protocol).

Exercise 6.22. Assume it takes 10 seconds for any message to pass between Alice and Bob.

1. Assume that both Alice and Bob initiate the ratchet protocol (Fig. 6.12) every 30 seconds.
Draw a sequence diagram showing the exchange of messages between time 0 and time 60 seconds; mark the keys used by each of the two parties to authenticate messages sent and to verify messages received.

2. Repeat, if Bob's clock is 5 seconds late.

Exercise 6.23. In the DH ratchet protocol, as described (Fig. 6.12), the parties derive symmetric keys ki,j and use them to authenticate the data (application) messages they exchange, as well as the first message of the next handshake.

1. Assume a chosen-message attacker model, i.e., the attacker may define arbitrary data (application) messages to be sent from Alice to Bob and vice versa at any given time, and 'wins' if a party accepts a message never sent by its peer (i.e., that message passes validation successfully). Show that, as described, the protocol is insecure in this model.

2. Propose a simple, efficient and secure way to avoid this vulnerability, by only changing how the protocol is used - without changing the protocol itself.

Exercise 6.24. The DH protocol, as well as the ratchet protocol (as described in Fig. 6.12), are designed for communication between only two parties.

1. Extend DH to support key agreement among three parties.

2. Similarly extend the ratchet protocol.

Exercise 6.25 (DH-Ratchet). Figure 6.12 shows the DH-Ratchet protocol, where the key used to authenticate the DH exchange as well as the data messages changes periodically (as indicated), and where f is a PRF (Pseudo-Random Function). Assume that this protocol is run daily, from day i = 1, where k0 is a randomly-chosen secret initial master key shared between Alice and Bob; messages on day i are encrypted and authenticated using session key ki, by selecting a random string r and sending r and f(r ∥ f_ki(r) ⊕ m). An attacker can eavesdrop on the communication between the parties on all days, and on days 3, 6, 9, . . . it can also spoof messages (send messages impersonating Alice or Bob) and act as a Man-in-the-Middle (MitM). On the fifth day (i = 5), the attacker is also given the initial master key k0.

• Explain why sending r and f(r ∥ f_ki(r) ⊕ m) ensures authenticity and confidentiality, provided that ki is secret.

• What are the days whose messages the attacker will be able to decrypt (find out) by day ten?

• Show a sequence diagram of the attack, and list the calculations done by the attacker.

[Figure 6.25: Insecure variant of the DH-Ratchet protocol, for Ex. 6.26. In round i, Alice picks a random ai and sends xi = g^ai mod p with f_{ki−1}(xi); Bob picks a random bi and responds with yi = g^bi mod p with f_{ki−1}(yi); Alice derives k(i)_{A,B} ≡ f_{ki−1}(yi · g^ai mod p), Bob derives k(i)_{B,A} ≡ f_{ki−1}(xi · g^bi mod p), and the ith session key is ki = k(i)_{A,B} = k(i)_{B,A}.]

Exercise 6.26 (Insecure variant of DH-Ratchet). Figure 6.25 shows a variant of the DH-Ratchet protocol, using a (secure) pseudorandom function f to derive the session key.

1. Does this protocol ensure forward-secrecy (FS)? If so, explain; if not, present a sequence diagram of an attack.

2. Repeat, for PFS.

3. Repeat, for Recover-Security (RS).

4. Repeat, for PRS.

Exercise 6.27 (GSM). Design a more secure variant of the GSM handshake protocol, which foils the attack described in Exercise 5.16; the mobile and visited network can identify support of this variant by referring to it as a new cipher, say A5/33.
The actual data encryption can use any secure shared-key encryption; the critical improvement is to the negotiation, namely, to prevents attacks as in Exercise 5.16. The change may involve one or few new handshake messages between mobile and visited network, but no change to the rest of the GSM network, in particular, no change to the home network. Your solution may require the mobile and/or visited network to use additional cryptographic mechanisms, including public key mechanisms, but only during handshake. Hint: your solution should set the key to be used by A5/3, using cryptographic mechanism(s) we learned. Applied Introduction to Cryptography and Cybersecurity 6.7. LABS AND ADDITIONAL EXERCISES 395 Exercise 6.28. We saw that El-Gamal encryption (Equation 6.36) may be re-randomized, using the recipient’s public key, and mentioned that this may be extended into an encryption scheme which is univerally re-randomizable, i.e. where re-randomization does not require the recipient’s public key. Design such encryption scheme. Hint: begin with El-Gamal encryption, and use as part of the ciphertext, the result of encrypting the number 1. Or see [171]. Exercise 6.29. A public-key cryptosystem is IND-rCCA secure, if it passes the IND-CPA test, when the attacker is restricted to avoid any ciphertext queries whose output is the challenge message m∗ [89]. Show that: 1. The El-Gamal PKC is not IND-rCCA secure. 2. Textbook RSA is not IND-rCCA secure. Exercise 6.30. The RSA algorithm calls for selecting e and then computing d to be its inverse ( mod ϕ(n)). Explain how the key owner can efficiently compute d, and why an attacker cannot do the same. Exercise 6.31. The RSA key generation algorithm requires the selection of two large primes p, q. Would it be secure to save time by using p = q? Or first choose p, then let q be the next-largest prime? Exercise 6.32 (Tiny-message attack on textbook RSA). We discussed that RSA should always be used with appropriate padding, and that ‘textbook RSA’ (no padding) is insecure, in particular, is not randomized so definitely does not ensure indistinguishability. 1. Show that textbook RSA may be completely decipherable, if the message length is less than |n|/e. (This is mostly relevant for e = 3.) 2. Show that textbook RSA may be completely decipherable, if there is only a limited set of possible messages. 3. Show that textbook RSA may be completely decipherable, if the message length is less than |n|/e, except for a limited set of additional (longer) possible messages. Exercise 6.33. Consider the use of textbook RSA for encryption (no padding). Show that it is insecure against a chosen-ciphertext attack. Exercise 6.34. Consider a variation of RSA which uses the same modulus N for multiple users, where each user, say Alice, is given its key-pair (A.e, A.d) by a trusted authority (which knows the factoring of N and hence ϕ(N ). Show that one user, say Mal, given his keys (M.e, M.d) and the public key of other users say A.e, can compute A.d. Note: recall that each users’s private key is the inverse of the public key ( mod ϕ(n), e.g., M.e = M.d−1 mod ϕ(n). Applied Introduction to Cryptography and Cybersecurity 396 CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY Exercise 6.35. Public-key algorithms often use term ‘public key’ to refer to only one component of the public key. For example, with RSA, people often refer to e as the public key, although the actual RSA public key consists of the pair (e, n), i.e., also includes the modulus n. 
Consider an application which receives an RSA signature (eA , n), where eA is the same as in the public key (eA , nA ) of user Alice, but n ̸= nA ; however, the application still concludes that this is a valid signature by Alice. Show how this allows an attacker to trick the recipient into believing - incorrectly - that an incoming message (sent by the attacker) was signed by Alice. Note: similar situation exists with other public key algorithms, e.g., elliptic curves, where the public key consists of a specification of a curve and of a particular ‘public point’ on the curve, but often people refer only to the point as if it is the (entire) public key. In particular, this led to the ‘Curveball’ vulnerability in the Windows certificate-validation mechanism [361], which was due to validation of only the ‘public point’ and use of the curve selected by the attacker. Exercise 6.36. You are given textbook-RSA ciphertext c = 281, with public key e = 3 and modulus n = 3111. Compute the private key d and the message m = cd mod n. Hint: it is probably best to begin by computing the factorization of n. Exercise 6.37. Consider the use of textbook RSA for encryption as well as for signing (using hash-then-sign), with the same public key e used for encryption and for signature-verification, and the same private key d used for decryption and for signing. Show this is insecure against chosen-ciphertext attacks, i.e., allows either forged signatures or decryption. Exercise 6.38. The following design is proposed to send email while preserving sender-authentication and confidentiality, using known public encryption and verification keys for all users. Namely, assume all users know the public encryption and verification keys of all other users. Assume also that all users agree on public key encryption and signature algorithms, denoted E and S respectively. When one user, say Alice, wants to send message m to another user, say + ‘Alice’ + + SA.s (m)), where B.e is Bob, it computes and sends: c = EB.e (m + Bob’s public encryption key, A.s is Alice’s private signature key, and ‘Alice’ is Alice’s (unique, well-known) name, allowing Bob to identify her as the sender. When Bob receives this ciphertext c, it first decrypts it, which implies it was sent to him. To validate that the message was sent by Alice, he looks up Alice’s public verification key A.v, and verifies the signature. 1. Explain how a malicious user Mal can cause Bob to believe it received a message m from Alice, although Alice never sent that message to Bob. (Alice may have sent a different message, or sent that message to somebody else.) Applied Introduction to Cryptography and Cybersecurity 6.7. LABS AND ADDITIONAL EXERCISES 397 2. Propose a simple, efficient and secure fix. Exercise 6.39 (Combining public key signatures and encryption). Many applications require both confidentiality, using recipient’s public encryption key, say B.e, and non-repudiation (signature), using sender’s verification key, say A.v. Namely, to send a message to Bob, Alice uses both her private signature key A.s and Bob’s public encryption key B.e; and to receive a message from Alice, Bob uses his private decryption key B.d and Alice’s public verification key A.v. 1. It is proposed that Alice will select a random key k and send to Bob the triplet: (cK , cM , σ) = (EB.e (k), k ⊕ m, SignA.s (‘Bob′ + + k ⊕ m)). Show this design is insecure, i.e., a MitM attacker may either learn the message m or cause Bob to receive a message ‘from Alice’ - that Alice never sent. 2. 
2. Propose a simple, efficient and secure fix. Define the sending and receiving process precisely.
3. Extend your solution to allow prevention of replay (receiving multiple times a message sent only once).
Note: signcryption schemes combine the public key signature and encryption operations, possibly with greater efficiency than applying the encryption and signing operations separately.

Chapter 7
The TLS protocols for web-security and beyond

In this chapter, we discuss the Transport-Layer Security (TLS) protocol, which is the main protocol used to secure connections over the Internet - and, in particular, web communication. We believe that TLS is the most studied applied cryptographic protocol; it is widely deployed in many applications and has a huge impact on the security of the Internet. Its extensive study has resulted in many attacks and subsequent countermeasures, defenses, improvements and several versions.

7.1 Introduction to TLS and SSL

The TLS protocol is arguably the most 'successful' security protocol - it is definitely very widely used. One reason for this wide use is that TLS is widely applicable; it is used in more diverse scenarios and environments than any other security protocol. Many extensions and changes have been proposed over the years, allowing the use of TLS in new scenarios and satisfying new requirements, as well as improving security. Some of these extensions and changes were adopted as an inherent part of a new revision of the protocol, and many others can be deployed using the built-in extensions mechanism (subsection 7.4.3), which became a standard part of TLS beginning with version TLS 1.1. Indeed, we believe TLS is probably the applied cryptography protocol which was most widely studied and analyzed, with many vulnerabilities exposed and fixed; this gives us considerable confidence in the security of the (later versions of) TLS.

From a security point of view, this popularity is a double-edged sword. On the one hand, this wide popularity motivates extensive efforts by the 'white-hat' security community, including researchers from academia and industry, to identify vulnerabilities and improve the security of the protocols and their implementations. This published research resulted in significant improvements to the security of TLS; many attacks and corresponding countermeasures were published by researchers, and new versions of the protocols were gradually more and more secure, culminating with TLS 1.3, which has significant design changes whose goal is to improve security.

On the other hand, this wide popularity also implies that 'black-hat crackers' have a strong motivation to find vulnerabilities in the TLS protocols and in their popular implementations. In fact, the desire to 'break' secure connections may even motivate powerful organizations, e.g., the NSA, to invest extensive efforts in 'injecting' intentional, hidden vulnerabilities (cryptographic backdoors) into the specifications and implementations of TLS and of other popular cryptographic systems, libraries, protocols and standards. One example of what may be a cryptographic backdoor is the Dual-EC Deterministic Random Bit Generator (DRBG), which was found vulnerable in [96]. The Dual-EC DRBG was included in NIST, ANSI and ISO/IEC standards, and implemented in the widely-used BSAFE cryptographic toolkit from RSA (also used by implementations of TLS).
There are claims that the NSA created and promoted the Dual-EC DRBG as a cryptographic backdoor, and allegedly even paid RSA to make it the default pseudorandom generator in BSAFE. One piece of evidence for the NSA's involvement came from the NSA memos exposed by Edward Snowden in 2013, which also indicated that the NSA spends $250 million per year to insert backdoors into software, hardware and standards; see [55, 392].

Some of this purported effort to insert trapdoors into standards seems to have been directed at TLS. In particular, consider [332], a proposal for a TLS extension by Eric Rescorla (as a consultant to the US government) and Margaret Salter (an NSA employee). The purported goal of this extension was to increase the number of random bits exchanged during the TLS handshake. However, these additional random bits seem to significantly improve the efficiency of the Dual-EC DRBG attack. This may indicate that this was another attempt to insert a cryptographic trapdoor - in this case, into the TLS specifications; see [55].

This widespread use of TLS also has important implications for learning and teaching TLS. Obviously, the importance of TLS motivates studying it; furthermore, the evolution of TLS, and the different attacks and countermeasures, are a valuable, interesting lesson, which can help to identify and avoid vulnerabilities in different protocols and systems. On the other hand, this also means that there is an excessive wealth of important and interesting information - indeed, entire books were dedicated to covering TLS, e.g., [307, 328], and even they do not cover all aspects and attacks. We have tried to maintain a reasonable balance; however, there were many hard choices and surely there is much to improve. As in other aspects, your feedback would be appreciated.

Organization of this chapter. In the following subsection (subsection 7.1.1), we present a brief history of SSL and TLS, and its three main phases: (1) the proprietary SSLv2 design, (2) the evolution from SSLv3 to TLS version 1.2, and finally (3) the TLS 1.3 re-design. We later dedicate a section to each of these three phases.

Why discuss older versions? One motivation to describe the older SSL and TLS versions is to learn about protocol vulnerabilities and attacks, which can help us to develop the intuition to identify vulnerabilities in different protocols, and to design secure protocols. Another motivation is that these attacks are often still relevant, for two reasons. First, many clients and servers still support outdated versions. Second, several downgrade attacks (subsection 5.6.3) break implementations of newer versions of TLS, by exploiting their support for older, vulnerable versions.

7.1.1 A brief history of SSL and TLS

Let us begin with a few words on the history of SSL and TLS. The TLS standards are defined by the Internet Engineering Task Force (IETF), as an evolution of the Secure Socket Layer (SSL) protocols. The SSL protocols are quite similar to TLS in their basic design and goals, but were developed by the Netscape corporation (rather than by the IETF); SSLv3 benefited from some feedback from researchers and the Internet security community. In fact, version 3 of SSL (SSLv3) is closely related to versions 1.0 to 1.2 of TLS, and less similar to version 2 of SSL.

The beginning of SSL was around 1994, with the beginning of the commercial use of the World Wide Web (WWW).
Possibly the first company to focus on the commercial potential of the web was the Netscape corporation, established in 1994. At the time, a major concern was the ability to perform online purchases securely. Credit cards were quickly recognized as an appropriate payment method, since they were already used widely for phone purchases and mail orders; such remotely-authorized transactions were referred to as card not present, to identify the risk due to reliance on card details without visual confirmation of the physical card and handwritten signature, which were required for the more common (at the time) card present transactions. However, the transmission of credit card information over the Internet was considered less secure than the somewhat-protected transmission of credit card information in a phone call or by mail. In both phone and physical mail, there is some level of authentication of the merchant, since the customer initiated the phone call or addressed the physical mail; but Internet communication may be viewed by different providers and, depending on technology, even by other users. A secure solution was considered essential for the commercial use of the Web.

Netscape came out with the first protocol to protect credit card transactions over the Web - the SSL protocol. SSL's initial goal was, basically, to provide security for credit card transactions comparable to the security of credit card transactions performed remotely, over phone or mail (referred to as 'card not present' transactions). SSL-protected web transactions became, basically, a new way to perform 'card not present' transactions. Indeed, very quickly, the use of SSL to protect web credit card transactions became widely adopted and expected by customers.

Importantly, the use of SSL was also not limited to credit card transactions, and it did not take very long for it to be widely used to protect other webpages as well. SSL, as its name (Secure Socket Layer) implies, provides a general-purpose secure communication interface, extending TCP's widely-used socket API [369]. To secure web communication, Netscape added support for the https protocol, which is basically the same as the web http protocol, except that http runs over TCP, without security, while https runs over SSL (or TLS), providing security.

The design of SSL as a secure variant of the socket API gave it important advantages over alternative protocols proposed around the same years. The closest contender was Secure HTTP (SHTTP, [333]), a general-purpose security extension for the HTTP protocol; but SHTTP is significantly more complex to understand, implement and deploy. Even more complex and ambitious were two other designs for credit-card payments over the web: the Secure Electronic Transactions (SET) protocol, developed by Microsoft, Visa and later also Mastercard [8], and the iKP protocol, developed by IBM [34, 35]. Both SET and iKP tried to provide security comparable to 'card present' transactions, by having the client's device digitally sign each purchase; the goal was to provide a secure alternative to the handwritten signature on a credit card slip. As a result, they were significantly more complex to understand, implement and deploy, compared to SSL.

Three main advantages helped SSL to quickly become a success.
First, SSL was quickly implemented and deployed by Netscape, whose browser was, by far, the most popular at the time, with a large lead over all other browsers combined. Second, while SET and iKP provided better security for credit card transactions, they were limited to this credit-card application; in contrast, SSL could be used for other applications requiring secure client-to-server communication, not only for credit card purchases. In particular, although the original motivation for deploying SSL was to encrypt the credit card in transit, protecting against an eavesdropper, it soon became apparent that server authentication provides a critical security function, by allowing clients to identify impersonating websites.

The third and most significant advantage of SSL is that SSL is simple - simple in its concept, simple to implement, simple to integrate in applications, most notably in a browser, and, most significantly, simple to adopt. Specifically, SET and iKP required adoption by credit card processors as well as merchants and customers, with private keys and certificates for each party, while SSL required only adoption by merchants and customers. Furthermore, SSL and TLS always perform server authentication, but client authentication is optional and, in fact, not widely deployed. Namely, deploying SSL (or TLS) only requires the websites (merchants) to generate private keys and obtain public-key certificates. Indeed, once the customer uses an SSL-enabled browser and the merchant offers an SSL-enabled website, all parties (the merchant, customer and credit-card processor) basically operate as in other card-not-present scenarios; no additional change to their systems or processes is required.

The importance of simplicity and ease of deployment and use for applied security mechanisms cannot be overstated, and we will return to it when we introduce, later in this chapter, the Keep it Simple and Secure (KISS) principle (Principle 14). For example, while SSL supports both server-authentication and client-authentication, it is usually deployed with only server authentication, requiring only servers (merchants) to obtain public key certificates; even today, only a few clients obtain certificates (for client authentication).

The very first versions of SSL were not published. The first publication was in June 1995, when Netscape published SSL version 2 (SSLv2) [202]. Later in 1995, Microsoft changed strategy; they published and implemented the Private Communication Technology (PCT) protocol [50], which is similar to SSLv2. SSLv2 had significant design vulnerabilities, and its publication allowed the web security community to expose these vulnerabilities. In November 1996, Netscape published the much-improved, and quite different, SSLv3 [156] (later published as [155]). The specification published was quite complete, allowing independent interoperable implementations.

Also in 1996, the IETF established a working group to develop an agreed standard protocol to replace the proprietary SSL and PCT protocols. To avoid arguments on which of the two names should be used, a new name was chosen: the TLS (Transport Layer Security) protocol. However, TLS 1.0 [120], the first standard produced by the TLS working group, was closely based on SSLv3. Unfortunately, although SSLv3 and TLS 1.0 addressed some of SSLv2's vulnerabilities, they still had serious vulnerabilities, as well as non-security limitations.
In April 2006, the TLS working group of the IETF defined TLS 1.1 [121] to fix the issues discovered in TLS 1.0. However, additional vulnerabilities and concerns were discovered, motivating another release: TLS 1.2 [122] (published in August 2008). These three TLS versions (1.0 to 1.2) were all quite similar to SSLv3, only fixing clearly-exploitable vulnerabilities and adding features. After vulnerabilities were discovered also in TLS 1.2, the working group decided to do a major redesign. This took about 10 years; the IETF published TLS 1.3 [329] only in August 2018. TLS 1.3 is still the latest version of TLS.

In contrast to previous designs, the TLS 1.3 designers gave preference to mechanisms with proven security properties; ideally, we would like the complete TLS protocol to be provably secure. While a complete proof of security has not yet been published for TLS 1.3, there are encouraging and important partial results. As a result of this stronger emphasis on security, as well as due to significant changes to improve performance (mainly, to reduce latency), TLS 1.3 is a major deviation from the previous versions.

7.1.2 TLS: High-level Overview

The TLS and SSL protocols were originally designed to secure the communication between a web-browser and a web-server, and, while they are now widely deployed for additional applications, web-security remains their main application. We present a highly-simplified overview of this typical use-case in Figure 7.1. In Figure 7.1, we show how Alice, a web user, surfs to the TLS-protected website https://b.com; we focus on the simpler variants of TLS, which are based on RSA encryption, and denote b.com's public RSA encryption key by B.e. Notice that the URL of TLS-protected websites begins with the protocol name https, rather than the protocol name http, used for unprotected websites.

Figure 7.1: A simplified overview of the operation of TLS, to secure the login between the browser and the web-server, using RSA for key exchange.

The process consists of the following steps:

Step 0 (in advance): Web server obtains certificate. Before the server of b.com can provide TLS service, it needs to obtain a certificate for its public key B.e, signed by a certificate authority (CA) trusted by the client (in this case, the browser). For that purpose, the server sends to the CA (in flow 0.a) its domain name, b.com, its public encryption key B.e, and optionally other identifiers (IDs). The CA should validate that the server indeed 'owns' domain b.com, and is associated with any additionally provided identifiers (the optional IDs). If validation passes, the CA signs the certificate using its private signing key CA.s, and sends it to the server (flow 0.b). The certificate contains the public key B.e, the domain b.com, the optional IDs and other 'administrative' information, such as the validity period. For more details about certificates see Chapter 8; in particular, the validation process is discussed in subsection 8.2.8.
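As a rough illustration of Step 0 (our own sketch, not part of the protocol specification), the following code uses the third-party pyca/cryptography package to have a hypothetical 'Example CA' sign a certificate binding the domain b.com to the server's public key B.e; a real CA would first perform the validation described above, and would include many more fields and extensions.

```python
import datetime
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa

# Key pairs: ca_key plays the role of CA.s; server_key's public part plays the role of B.e.
ca_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
server_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

now = datetime.datetime.utcnow()
cert = (
    x509.CertificateBuilder()
    .subject_name(x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "b.com")]))
    .issuer_name(x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "Example CA")]))
    .public_key(server_key.public_key())                 # B.e, the key being certified
    .serial_number(x509.random_serial_number())
    .not_valid_before(now)                               # 'administrative' information:
    .not_valid_after(now + datetime.timedelta(days=90))  # validity period
    .sign(ca_key, hashes.SHA256())                       # the CA's signature S_{CA.s}(B.e, b.com, ...)
)
print(cert.subject)
```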
Step 1: client requests website https://b.com. The user (Alice) enters the desired Universal Resource Locator (URL), https://b.com. The URL consists of the protocol (https) and the domain name of the desired web-server (b.com); in addition, the URL may identify a specific path and object on the server. In this example, Alice does not specify any specific path or object, and the browser considers this a request for the default object index.html. The choice of the https protocol instructs the browser to open a secure connection, i.e., send the HTTP requests over a TLS session, rather than directly over an unprotected TCP connection. The request may be specified in one of three ways: (1) by the user 'typing' the URL into the address bar of the browser, i.e., 'manually'; (2) 'semi-automatically', by the user clicking on a hyperlink or bookmark which specifies this URL; or (3) by an instruction from the webpage currently displayed by the browser.

Step 2: resolving domain name into IP address. To communicate with the b.com web server, the browser needs the IP address of the server. The Domain Name System (DNS) provides resolution (mapping) from domain names to IP addresses. We simplify this process into a request from the browser to the DNS (flow 2a), and a response from the DNS to the browser specifying the IP address (flow 2b). This step is skipped if the IP address is already known, typically cached from a previous connection. This step is vulnerable to network attacks, including MitM attacks and off-path attacks exploiting weaknesses of the domain name system (DNS) [200, 201]. Figure 9.1 illustrates a DNS poisoning attack against a login webpage which uses TLS incorrectly, to protect only the password submitted by the user.

Step 3: TCP handshake. The TLS protocol runs over the TCP (Transmission Control Protocol) protocol, which provides important services such as reliability and congestion/flow control. The first two flows of a TCP connection are called the TCP handshake, and contain only control signals, no data. The first flow is referred to as TCP SYN (flow 3a), and the second flow is referred to as TCP SYN/ACK (flow 3b).

Step 4: TLS handshake. The TLS protocol also begins with a handshake, i.e., a few control flows, which establish the secure connection. Different versions of TLS support different handshakes, which we describe in the following sections; we show a simplified handshake based on RSA encryption in Figure 7.1. All TLS handshakes begin with the Client_Hello message (flow 4a). The server responds with Server_Hello and the certificate (flow 4b). This provides the browser with the server's public encryption key (B.e); the browser selects a random key, here denoted simply as k, and shares it with the server by encrypting it using B.e and sending the encryption (E_{B.e}(k)) to the web server (flow 4c, the Key_Exchange message).
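The key idea of flow 4c can be sketched in a few lines; this is a simplification under our own assumptions (textbook RSA with toy parameters, rather than the padded RSA encryption and key-derivation steps of the real protocol): the client chooses a random secret k and sends it encrypted under the server's public key, so that only the holder of the private key can recover it.

```python
import secrets

# Toy RSA key pair standing in for (B.e, B.d); tiny primes, for illustration only.
p, q, e = 1009, 1013, 65537
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))    # B.d (modular inverse via pow requires Python 3.8+)

# Client (flow 4c): choose a random secret k and send it encrypted under B.e = (e, n).
k = secrets.randbelow(n)
key_exchange_msg = pow(k, e, n)       # E_{B.e}(k)

# Server: recover k with its private key B.d; both sides now share k.
assert pow(key_exchange_msg, d, n) == k
```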
Step 5: the TLS session (record protocol), initial webpage. At this point, the browser and server can communicate securely, with their messages protected using the TLS record protocol and the key they shared (which we denoted k). We denote the protection of the record protocol by the envelope symbol, with the key k as subscript. The browser first sends an HTTP GET request, requesting the index.html webpage (flow 5a); the server responds by sending back the page, written in HTML (Hypertext Markup Language), e.g., a login form (flow 5b).

Step 6: page displayed to user. Flow 6a represents the browser displaying the webpage to the user (Alice), together with a few security indicators, such as a padlock icon. Flow 6b represents the user entering a username and password.

Step 7: additional HTTP requests and responses. Flows 7a and 7b are examples of additional HTTP requests and responses, protected by the TLS record protocol. In flow 7a, the browser sends the username and password; the interaction typically continues with the server's HTTP response (flow 7b) and additional requests and responses (not shown).

7.1.3 TLS: security goals

TLS and SSL are designed to ensure security between two computers, usually referred to as a client and a server, in spite of attacks by a MitM (Man-in-the-Middle) attacker. The goals include:

Key exchange: securely set up a secret shared key, preventing exposure of this key to a MitM attacker.

Server authentication: authenticate the identity of the server, i.e., assure the client that it is communicating with the right server.

Client authentication: authenticate the identity of the client. Client authentication is optional; in fact, TLS is usually used without client authentication, allowing an anonymous, unidentified client to connect to the server. When client authentication is desired, it is usually performed by sending a secret credential within the TLS secure connection, such as a password or cookie.

Connection Integrity: validate that the communication received by one party is exactly identical to the communication sent by the peer (in spite of a MitM attacker); if it isn't, abort the connection, with an error message, rather than delivering information without integrity. Note that TLS and SSL run over TCP, which ensures integrity against benign errors; therefore, any failure must be due to an attack, and aborting the connection is a sensible response. TLS and SSL detect not just corruption of an individual message, but also message re-ordering, and truncation attacks where the attacker drops the last message sent by the peer.

Connection confidentiality: ensure that a MitM attacker cannot learn anything about the information sent between the two parties, except for the 'traffic pattern' - the amount of information sent/received.

Perfect forward secrecy (PFS): Version 3 of SSL and all versions of the TLS handshake support the (optional) use of authenticated DH key agreement, which ensures perfect forward secrecy (PFS), as discussed in subsection 6.3.1. PFS is defined in Definition 5.3.

Crypto-agility: We say that a cryptographic protocol, such as TLS, provides cryptographic agility or crypto-agility, if it allows the parties to select the specific cryptographic algorithms they use for a given function (e.g., block cipher, hash function or signatures). We introduced and discussed the importance of crypto-agility in subsection 5.6.2; in particular, crypto-agility is essential when a vulnerability is found or suspected in a particular algorithm. TLS supports crypto-agility; it allows the cryptographic algorithms to be negotiated in each session, through cipher suite negotiation.
All versions of TLS, as well as SSLv3, were designed to meet these goals; for SSLv2, Perfect Forward Secrecy (PFS) was not a goal. Of course, the goals may not be actually met, due to different vulnerabilities; we present some of the important vulnerabilities in this chapter, and in particular, see Table 7.2.

7.1.4 TLS: Engineering goals

In addition to the security goals, the success of TLS is largely due to its focus - from the very first versions - on the generic 'engineering goals', applicable to any system, of efficiency, ease of deployment and use, and flexibility. By addressing these goals, TLS is widely used and applicable in a very wide range of applications and scenarios. Let us briefly discuss these three engineering goals.

Efficiency - and session resumption. Efficiency is always a desirable goal. In the case of TLS, there are two main efficiency considerations: computational overhead and latency. In terms of computational overhead, the main consideration is minimizing the computationally-intensive public-key operations. To minimize public-key operations, once the handshake establishes a shared key (using public key operations), the parties may reuse this key to establish future connections without requiring additional public-key operations. We refer to the set of connections based on the same public-key exchange as a session, and to a handshake that reuses the pre-exchanged shared key as a session-resumption handshake.

In terms of minimizing latency, the main consideration is to minimize the number of round-trip exchanges. End-to-end delays are typically on the order of tens to hundreds of milliseconds, which is usually much higher than the transmission delays, especially for the limited amount of information sent in a TLS exchange. Reducing the number of round trips became even more important as transmission speeds increased; this is reflected by the fact that until TLS 1.3, all designs required a fixed number of two round trips to complete the handshake, only then allowing the client to send a protected message (in the third exchange). In contrast, a TLS 1.3 handshake requires only a single round trip (before sending a protected message), and even allows the client to send a request already in the first exchange (with some limitations and somewhat reduced security properties, see later). A more minor efficiency consideration is minimization of bandwidth; this is mainly significant in scenarios where bandwidth is limited, such as very noisy wireless connections.

Extensibility and versatility. Extensibility is always important - definitely for a widely deployed security protocol such as TLS, which is used in diverse scenarios and environments. Indeed, part of the success of TLS derives from its extensibility and versatility; the protocol supports many optional mechanisms, e.g., client authentication, and flexibility such as crypto-agility (Principle 11). Furthermore, from TLS version 1.1 (and even earlier for some implementations), the TLS protocol supports a built-in extension mechanism, providing even greater flexibility (subsection 7.4.3).

Ease of deployment and use. Finally, the success and wide use of the TLS protocols are largely due to their ease of deployment and usage. As shown in Figure 7.2, the TLS protocol is typically implemented 'on top' of the popular TCP sockets API, and then used by applications, directly or via HTTPS or other protocols.
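This layering is visible in typical application code: the application opens an ordinary TCP socket and wraps it with TLS, as in the following sketch using Python's standard ssl module (the host name is only an example; any HTTPS-enabled server would do).

```python
import socket, ssl

hostname = "www.example.com"              # example host; used here only for illustration
context = ssl.create_default_context()    # default settings: validate the server's certificate

with socket.create_connection((hostname, 443)) as tcp_sock:                    # TCP handshake
    with context.wrap_socket(tcp_sock, server_hostname=hostname) as tls_sock:  # TLS handshake
        print(tls_sock.version())    # negotiated protocol version, e.g., 'TLSv1.3'
        print(tls_sock.cipher())     # negotiated cipher suite (the outcome of cipher suite negotiation)
        request = f"GET / HTTP/1.1\r\nHost: {hostname}\r\nConnection: close\r\n\r\n"
        tls_sock.sendall(request.encode())    # protected by the TLS record protocol
        print(tls_sock.recv(200))             # beginning of the (protected) HTTP response
```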
This architecture makes it easy to install and use TLS, without requiring changes to the operating system and kernel. This is in contrast to some of the other communication-security mechanisms, in particular the IPsec protocol [127, 153], which, like TLS, is also an IETF standard. The ease of deployment and use of TLS is probably the reason that TLS has become an almost-universal security substrate for many systems, even where other protocols may have advantages. For example, IPsec is probably better suited for Virtual Private Networks (VPNs), yet TLS VPNs are more widely deployed.

7.1.5 TLS and the TCP/IP Protocol Stack

See Figure 7.2 for the placement of the TLS protocols with respect to the TCP/IP protocol stack, and Figure 7.3 for a typical connection.

Figure 7.2: Placement of TLS in the TCP/IP protocol stack. The TLS handshake protocol (first box in the top line, in green) establishes keys for, and also uses, the TLS record layer protocol (first box in the second line, also in green). The HTTPS protocol, and other application protocols that use the TLS record protocol, are in the two middle boxes of the top line (in yellow). Application protocols that do not use TLS for security, including the HyperText Transfer Protocol (HTTP), are in the two last boxes in the two top lines (in pink). These protocols, as well as TLS itself, all use the TCP protocol, via the sockets library layer. TCP ensures reliable communication, on top of the (unreliable) Internet Protocol (IP).

Figure 7.3: Phases of a TLS connection. The black flows (Syn+Ack and later Fin+Ack) are the TCP connection setup and tear-down exchanges, required to ensure reliability. The fuchsia flows represent the TLS handshake; note there are often more than the three shown. The blue flows represent the data transfer, protected using the TLS record layer; and the red flows represent the TLS connection tear-down exchange.

[This subsection is yet to be written; this material is well covered in many textbooks on networking, e.g., [245].]

7.2 The TLS Record Protocol

In this section we begin our in-depth discussion of the TLS protocols. Specifically, we focus on the record protocol component of TLS; this protocol protects the communication, using symmetric cryptography, i.e., encryption, authentication (MAC) and/or authenticated encryption. The symmetric key used by the record protocol must previously be set up, securely, by the handshake protocol, which we discuss in the following sections.

We begin the in-depth discussion of TLS with the record protocol, rather than with the more 'interesting' handshake protocol, for two reasons. First, we think that the record protocol is simpler, and can be understood independently of the handshake protocol. Second, by presenting attacks on vulnerable record protocol options, we motivate some of the mechanisms later introduced by the handshake protocol.

We already discussed the basic principles underlying record protocols in Chapter 4 and Chapter 5. In this section, we focus on the TLS record protocol, the most widely-deployed record protocol, with some unique aspects and instructive vulnerabilities. We focus on the common case, where the TLS record protocol is applied 'on top' of an underlying reliable communication protocol - typically, TCP.¹
¹We do not cover DTLS [330], a variant of TLS designed to work over the UDP protocol, i.e., over an unreliable datagram service.

Hence, without an attack, messages sent are received reliably, without losses, duplications or re-ordering; any deviation must indicate an attack and justifies closing the connection. Both TCP and the TLS record protocol treat the data from the application as one long stream of bytes, regardless of the sequence of (usually multiple) calls in which the protocol receives the data from the application. Namely, the application at the receiver should parse the stream of bytes output by the record protocol into the different application-level units (typically called messages).

The record protocol involves authentication, encryption and a few other functions applied to the data. We already discussed such combinations in Section 4.7; in this section, we focus on the TLS-specific aspects. In subsection 7.2.1, we discuss the TLS Authenticate-then-Encrypt (AtE) record protocol, used in SSL and in versions of TLS until TLS 1.2. TLS 1.2 still supports the AtE record protocol, but also supports the alternative AEAD record protocol, which we discuss in subsection 7.2.7 (see also the discussion of AEAD schemes in subsection 4.7.1). TLS 1.3 supports only the AEAD record protocol. In subsections 7.2.3-7.2.6, we present attacks exploiting vulnerabilities in the AtE record protocol. This motivates the adoption of the AEAD record protocol (in TLS 1.3 and, optionally, in TLS 1.2).

Cipher suites. Different versions and implementations of TLS may support different cryptographic algorithms. The list of cryptographic algorithms used by both the record protocol and the handshake protocol in a specific connection is called the cipher suite. For the AtE record protocol, this defines an encryption algorithm, a MAC algorithm, and (optionally) a compression algorithm. For the AEAD record protocol, there are fewer options: it is defined by a single AEAD algorithm.

7.2.1 The Authenticate-then-Encrypt (AtE) Record Protocol

We begin our discussion of the TLS record protocol by focusing on the Authenticate-then-Encrypt (AtE) design, used by SSLv3 and TLS until version 1.2 (which allows AEAD as an alternative) and 1.3 (which only allows AEAD).

Figure 7.4: The Authenticate-then-Encrypt (AtE) design of the record protocol of SSL and TLS. Unfilled fields (pad, IV) are only used for block ciphers; the IV field is added only from TLS 1.1. The MAC is computed over the sequence number (SEQ), type (TYP), version (VER), length (LEN) and compressed fragment, as in Equation 7.1. The type, version and length fields are sent, as plaintext, together with the corresponding encrypted fragment. The record protocol used (only) the AtE design until TLS 1.1; TLS 1.2 supports the use of either the AtE design or the AEAD design (subsection 7.2.7).
Figure 7.4 illustrates the sequence of processing steps applied by the sender running the TLS AtE record protocol. These steps are applied to the input that the sender receives - from the application, or from the TLS alert or handshake protocols. Let us discuss each of the steps in the order they are applied by the sender of the data:

Fragment: break the TCP stream into fragments; namely, a single (long) 'message', sent in one 'send' event by the application, may be parsed by the record protocol into multiple fragments, as shown in the top lines of Figure 7.4. Note that the record protocol may also aggregate multiple (short) 'messages', sent in consecutive 'send' events, into one fragment (this is less common and not shown in the figure). Each fragment consists of up to 16KB. One motivation for fragmenting is to allow pipelined operation, reducing the latency. For example, the sender may process the first fragment, then send it while, in parallel, processing the second fragment. Another motivation for fragmenting is to allow recipients to allocate a fixed-size buffer for incoming fragments, and avoid the risk of buffer overflow bugs and attacks. TLS recipients should discard an incoming fragment larger than 16KB.

Compress: apply lossless compression to each fragment. Compression may reduce the processing overhead and the communication. As discussed in Section 4.7, ciphertext cannot be compressed. Therefore, compression should be done before encryption - or not at all. Note that the length of compressed data depends on the amount of redundancy in the plaintext, and encryption usually does not hide the length of the (compressed) plaintext; hence, there is a risk of exposing the (approximate) amount of redundancy in the plaintext, when applying compress-then-encrypt. Indeed, the fact that TLS applies compression before encryption was exploited in the CRIME, BREACH and TIME compression attacks [29, 283, 335, 354]. These attacks motivated the disabling of TLS compression, and currently TLS compression is rarely used (and not even supported in TLS 1.3). However, compression attacks may still be possible with the (common) use of application-level compression; this is exploited in the BREACH attack [164]. We discuss these attacks in subsection 7.2.6.

Authenticate: The AtE record protocol authenticates the plaintext by applying a MAC function, before encryption. The input to the MAC function consists of the concatenation of a Sequence number (SEQ) field, indicating the sequence number of the record, the type, version and length fields, and the Compressed Fragment itself; the type, version and length fields are defined as:

Type: one byte indicating the type of data in this record. The most common type is 'application data', which is encoded by 0x17; other types are used for the handshake protocol, the alert protocol (error indicators) and for a special Change Cipher Specification (CCS) message, indicating a change to a new set of cryptographic keys (and, optionally, algorithms).

Version: an identifier of the version of TLS.

Length: the number of bytes in the (optionally compressed) fragment.
Namely, the MAC of a message sent by the server is calculated by:

MAC = MAC_{k_S^MAC}( SeqNum ++ Type ++ Version ++ Length ++ Compressed_Fragment )   (7.1)

A similar equation - using k_C^MAC instead of k_S^MAC - is used for the MAC of messages sent by the client.

MAC keys. The sender computes the MAC using a shared key; we use k_C^MAC to denote the key for traffic from client to server, and k_S^MAC to denote the key used for traffic from server to client.² The recipient validates that the value in the received MAC field is the same as the result of the MAC function applied with the corresponding key. Both k_S^MAC and k_C^MAC are generated by the handshake protocol.

²In SSLv2, the same keys are used also for encryption, and hence simply denoted k_C and k_S, and derived as in Eq. (7.20). Also, note that the TLS specifications refer to k_S^MAC as the server's MAC_write_key, i.e., a key used by the server to compute the MAC for segments being sent ('written'), as well as the client's MAC_read_key; and similarly for k_C^MAC. We find our notation simpler.

Figure 7.5: Padding in the AtE record protocol of SSL and TLS, when using a block cipher with blocks of l bytes; l = 8 for DES and l = 16 for AES. The pad contains p = l − (x mod l) bytes, where x is the length of the authenticated fragment (compressed fragment plus MAC); e.g., if l = 16 and x = 35, then p = 13, and the padded authenticated fragment fits in three blocks (as in the figure). In TLS, all p pad bytes must contain p − 1. In the SSL record protocol, only the last pad byte must contain p − 1; the other p − 1 pad bytes may have any value. Stream-cipher encryption does not require padding.

Padding: The input to a block cipher must be exactly one block; however, the length of the output from the authentication, consisting of the compressed fragment and the MAC, would often not be an integral number of blocks. Therefore, when using a block cipher, the TLS AtE record protocol appends a padding string, ensuring that the total length of the input to the encryption is an integral number of blocks, as shown in Figure 7.5. If the length of the authenticated fragment is x bytes, and the block length is l bytes, then the required number of pad bytes is p = l − (x mod l). SSL restricts the pad to fill at most one block (0 < p ≤ l), but TLS allows a longer pad, up to 256 bytes, which can be used to hide the exact length of the fragment.

SSL uses X9.23 padding, and TLS uses PKCS#5 padding, to allow removal of the padding after decryption, and to validate the correctness of the pad; see their description in Section 2.9. Basically, in SSLv3, the length is in the last padding byte, and the value of the other padding bytes is undefined, while in TLS, all padding bytes must contain the number of padding bytes. This seemingly-minor difference is significant, as we show by describing the Poodle padding attack in subsection 7.2.3 below.

Prepend IV: Many block-cipher encryption algorithms, e.g., CBC, require an initialization vector (IV), which is usually selected randomly; see Section 2.8.
In TLS 1.1 and 1.2, the IV is sent by the record protocol, prepended to the ciphertext, as shown in Figure 7.4. SSL and TLS 1.0 try to save the (very limited) resources required to select and send the IV, and do not send the IV. This flawed design was exploited by the devastating BEAST attack [132]; its publication was a main driver for the adoption of TLS 1.1. See subsection 7.2.4.

Encrypt: TLS encrypts the concatenation of the compressed plaintext fragment, the MAC and, if necessary, the padding. Padding is required when using a mode of operation of a block cipher; it is not required for stream ciphers.

A basic problem with the record protocols of SSLv3 and TLS versions 1.0 to 1.2 is their use of the Authenticate-then-Encrypt (AtE) design, rather than the secure Encrypt-then-Authenticate (EtA) design or a secure authenticated-encryption scheme. We discussed these alternatives in Section 4.7. The EtA design has been standardized, as a TLS extension (subsection 7.4.3), to improve the security of TLS 1.0 to 1.2 [179], and authenticated encryption, specifically using AEAD, is used by the TLS 1.3 record protocol, and optionally in TLS 1.2 (subsection 7.2.7). We discuss attacks exploiting the use of AtE in subsections 7.2.3-7.2.5.
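The sender-side steps above can be summarized in a short sketch. This is our own simplification, not interoperable with real TLS: it omits fragmentation and compression, uses illustrative header values, and uses HMAC-SHA256 with AES-CBC (from the third-party pyca/cryptography package) as stand-ins for the negotiated algorithms; it is intended only to make the AtE order of operations concrete.

```python
import hmac, hashlib, os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def ate_protect(k_mac, k_enc, seq, fragment, typ=b"\x17", ver=b"\x03\x03"):
    """Authenticate-then-Encrypt a single fragment, roughly following Figure 7.4."""
    length = len(fragment).to_bytes(2, "big")
    # Authenticate: MAC over SEQ ++ Type ++ Version ++ Length ++ fragment (Equation 7.1)
    mac = hmac.new(k_mac, seq.to_bytes(8, "big") + typ + ver + length + fragment,
                   hashlib.sha256).digest()
    # Pad: p bytes, each containing p-1, so the total length is a multiple of the block size
    l = 16                                        # AES block length, in bytes
    p = l - (len(fragment) + len(mac)) % l
    padded = fragment + mac + bytes([p - 1]) * p
    # Encrypt with CBC, using an explicit random IV (as in TLS 1.1/1.2)
    iv = os.urandom(l)
    enc = Cipher(algorithms.AES(k_enc), modes.CBC(iv)).encryptor()
    ciphertext = enc.update(padded) + enc.finalize()
    # Type, version (and, in real records, the ciphertext length) are sent in the clear
    return typ + ver + iv + ciphertext

record = ate_protect(os.urandom(32), os.urandom(16), seq=0, fragment=b"GET / HTTP/1.1\r\n")
```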
7.2.2 The CPA-Oracle Attack Model

Following Principle 1, we now model the adversary against the TLS record protocol, in what we call the CPA-Oracle Attack model. The model is applicable to different applications of TLS; we focus on the common use of TLS to secure the communication between a web client and a web server. Intuitively, our attack model combines three adversary capabilities:

MitM: the adversary has Man-in-the-Middle (MitM) capability when it can intercept packets sent between client and server, modify them and inject forged packets.

Rogue website: the victim client innocently visits a rogue website controlled by the adversary, e.g., 666.com. A rogue website can send automatically-executed hyperlinks to the browser, e.g., a request to embed an image or script from a victim website. Such attacks are referred to as cross-site attacks, and their goal is usually to abuse the relationship between the client and the victim website. Cross-site attacks are among the most common attacks on web security, and there is extensive study of non-cryptographic defenses against them. These defenses rely on TLS to prevent MitM attacks.

CPA oracle: the adversary can receive an indication of whether the decrypted plaintext is valid or invalid. In some attacks, the adversary also has the ability to distinguish between a padding error and a MAC-validation error; this ability is key to padding attacks. From SSLv3, the error messages are encrypted, but different attacks, against different versions and implementations, were able to detect whether a failure is due to invalid padding or to an invalid MAC, based on differences in timing. See subsection 7.2.3.

Figure 7.6: The CPA-Oracle Attack model on the AtE record protocol of SSL and TLS. The attacker's goal is to find a string x, such as a cookie (or password), sent (encrypted) by a browser to a web server with every request. The model allows the attacker to control the prefix p_i (the request path) and the suffix s_i (the request body) for every request, such that the plaintext input is p_i ++ x ++ s_i. We allow the attacker to intercept and modify the ciphertext, and to receive feedback on the result of validating the decryption of c'_i. Both SSL and TLS break the connection upon each error; the attacker may cause the browser to send a new request (using the same secret x, and possibly changing (p_i, s_i)), but the parties will use a separate key k_i in each connection i. Error messages are encrypted, but some attacks, on some versions/implementations, distinguish between invalid MAC and invalid padding; see subsection 7.2.3.

The CPA-Oracle Attack model (Figure 7.6) is a simplified model of an attacker with MitM, rogue website and padding oracle capabilities. The goal of the attacker is to expose information about a secret string x, taken from some (known) distribution. Often, x is a cookie that the browser automatically includes with every request that it sends to the benign website B.com. The rogue website capability allows the attacker to cause the browser to send different requests to the website, always including the cookie as part of the request; the attacker may control much of the request, including the path (which comes before the cookie) and the body/payload (which comes after the cookie). Furthermore, the attacker often knows the contents of the rest of the request, except the cookie. For simplicity, the CPA-Oracle Attack model lets the attacker choose, with every request, both the prefix p_i and the suffix s_i, so that the plaintext input to the TLS record protocol in the i-th request is m_i = p_i ++ x ++ s_i. For simplicity, we further assume that the attacker knows the length |x| of the secret cookie.

Error handling. The TLS record protocol aborts a connection upon receiving any invalid ciphertext c'_i. The attacker can then send a new request, i.e., a new plaintext prefix p_{i+1} and suffix s_{i+1}. TLS negotiates a new pseudorandom key k_i for each connection i. The attacker could send multiple requests on the same connection until it is aborted; if requests i and i + 1 are sent on the same connection (no abort), then k_i = k_{i+1}.

The Plaintext-Recovery security goal. Ideally, the use of TLS should prevent an attacker that can eavesdrop on the communication between Alice and the benign website B.com from learning any information about x. However, the AtE record protocol allows the use of compression before encryption, which makes it impossible to ensure indistinguishability between the ciphertexts of a highly compressible message and of a mostly-random message; see subsection 4.7.5. Instead, we consider the more modest security goal of preventing plaintext recovery. More specifically, the goal is to prevent exposure of a secret string x, typically a cookie, which is part of the plaintext.
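The interface that this model gives the attacker can also be written down as code; the following is a minimal sketch of our own (with abstract encode/decode functions standing in for the record protocol of Figure 7.4), showing exactly which calls the attacker may make and which feedback it receives.

```python
import os

class CPAOracle:
    """The attacker's interface in the CPA-Oracle Attack model (Figure 7.6)."""
    def __init__(self, secret_x, encode, decode):
        self.x = secret_x                           # the cookie/secret the attacker tries to recover
        self.encode, self.decode = encode, decode   # record-protocol encode/decode (Figure 7.4)
        self.k = None

    def request(self, prefix, suffix):
        """Attacker-chosen prefix/suffix; returns c_i, the encoding of p_i ++ x ++ s_i under k_i."""
        if self.k is None:                          # new connection: fresh pseudorandom key k_i
            self.k = os.urandom(16)
        return self.encode(self.k, prefix + self.x + suffix)

    def deliver(self, ciphertext):
        """MitM-modified ciphertext c'_i; returns only valid/invalid (True/False)."""
        ok = self.decode(self.k, ciphertext) is not None
        if not ok:
            self.k = None                           # the connection is aborted on any error
        return ok
```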
7.2.3 Padding Attacks: Poodle and Lucky13

The use of Authenticate-then-Encrypt (AtE) by SSL and versions 1.0, 1.1 and (optionally) 1.2 of TLS may result in vulnerabilities to padding oracle attacks, introduced in Section 2.9. However, Section 2.9 focused only on encrypted communication, without authentication or integrity checks on the plaintext. In contrast, the TLS AtE record protocol validates a MAC on the plaintext, after the padding is removed. Note that if the pad is formatted correctly but the last byte of the pad contains a wrong value (not the correct pad-length value), then an incorrect number of 'pad bytes' would be removed; see Figure 7.5. This will result in a failed MAC validation, since the presumed MAC field will begin before or after its correct beginning (depending on the value in the last pad byte).

Furthermore, from early on, the TLS designers were aware of the risk of padding attacks, and took steps to prevent them. First, upon detecting an error - invalid pad, invalid MAC or invalid contents - the connection is aborted. Second, all error messages are encrypted, to prevent an attacker from distinguishing between padding errors and MAC errors. Indeed, early padding attacks [380] resulted in only limited information leakage from the encrypted messages. This was quickly followed by more realistic attacks such as [90], which used a timing side channel to distinguish between pad errors and MAC errors, based on differences in the processing time of the two. Effective attacks, based on easy-to-measure timing differences, are known for SSLv3 and TLS 1.0.

These attacks, at least, should have motivated the TLS 1.1 and 1.2 designers to change from the (insecure) AtE of TLS to the secure EtA (Encrypt-then-Authenticate) paradigm, or to an AEAD-based protocol. Abandoning the vulnerable AtE follows the conservative design principle (Principle 3); but, unfortunately, this did not happen. Instead, the TLS 1.1 and 1.2 specifications include additional countermeasures that attempt to prevent distinguishing between pad errors and MAC errors, such as computing the MAC even if the pad is invalid. Such countermeasures are ingenious, but unreliable. Furthermore, surely these steps cannot prevent a padding attack that works without distinguishing between MAC and pad failures!

We discuss two of the most important padding oracle attacks on TLS: the Lucky13 and Poodle padding attacks. Lucky13 is based on circumventing the countermeasures and distinguishing between MAC and pad failures, while Poodle works even if the two errors are indistinguishable - making it much easier to exploit. Both of these attacks are against the use of CBC mode; Lucky13 addresses TLS, which uses PKCS#5 padding, and Poodle addresses SSL, which uses X9.23 padding.

Lucky13. We first briefly discuss the Lucky13 padding attack [9]. Lucky13 extends the padding oracle attacks from Section 2.9, specifically the attack against PKCS#5 padding in Exercise 2.24. Lucky13 uses careful timing side-channel analysis to distinguish between pad errors and MAC errors; i.e., it circumvents the countermeasures against timing side channels in TLS 1.1 and TLS 1.2, which were designed specifically to prevent such distinction. This allows Lucky13 to then follow an approach similar to the padding oracle attack of [90]. Lucky13 uses carefully-constructed plaintexts, allowing it to attack the secret x one byte at a time, like the method used in Exercise 2.24. The attack cleverly uses the fact that in the (rare) cases where the pad is valid, the padding bytes are removed. It designs the plaintext carefully to cause a difference in the number of invocations of the compression function used iteratively by the MAC algorithm; see Section 3.9.
The details are elegant, and while we will not cover them, the reader is encouraged to look them up in [9].

Poodle. Even more significantly, SSLv3 is also vulnerable to the Poodle padding attack, which does not require distinguishing between padding and MAC failures. Furthermore, many implementations of TLS are vulnerable to the Poodle downgrade attack, which we discuss in Section 7.5. By downgrading TLS to SSL, the Poodle downgrade attack allows the Poodle padding attack to succeed against many implementations of TLS 1.0 to 1.2. This is a major motivation for the adoption of TLS 1.3. We therefore describe the Poodle padding attack.

The Poodle padding attack was introduced in [290]; it is based on observations made years earlier, in [287]. Specifically, the Poodle padding attack is based on three observations:

1. SSL uses X9.23 padding, where the only requirement for valid padding is that the last byte contains a number smaller than the block length l. There is no requirement on the values of the other pad bytes (in contrast to PKCS#5 padding, see Section 2.9).

2. The CPA-Oracle Attack model allows the attacker to detect both invalid-padding and invalid-MAC errors.

3. The CPA-Oracle Attack model allows the attacker to prepend a chosen prefix p to the secret (cookie) x, and to append a chosen suffix s to x.

By adding or removing bytes from the prefix p and suffix s, the attacker ensures that the entire plaintext, before padding, contains an integral number of blocks: |m| ≡ 0 (mod l). As a result, the pad will also be a whole block, with the last byte containing (l − 1). For convenience, let us focus on 8-byte blocks, as with DES; then the last plaintext block, denoted m_n, consists of eight bytes containing 0x07, i.e., (∀j : 1 ≤ j ≤ 8) m_n[j] = 0x07.

The Poodle attack takes advantage of an observation similar to the one we used to solve Exercise 2.24, namely, that a random plaintext block would have valid PKCS#5 padding if, and almost always only if, its last byte contains 0x00. The difference is that SSL uses X9.23 padding, and also applies authentication (MAC) before padding and encrypting; in particular, this means that we now need the last byte of the plaintext to contain 0x07, in order to remove an entire padding block and leave the MAC intact. The attack proceeds in three steps.

First Poodle step: collect. In this step, the attacker 'collects' 256 SSL record protocol packets, which we denote r^{0x00}, ..., r^{0xFF}. For convenience, all records should consist of exactly n blocks, i.e., r^i = r^i_1 ... r^i_n. The records should be correctly encoded and, in particular, decrypt into valid-padded plaintext; we further require that the pad fills the entire last block of the plaintext. Since SSL uses CBC mode, this means, for 8-byte blocks:

(∀i ∈ {0x00, ..., 0xFF})   D_k(r^i_n)[8] ⊕ r^i_{n−1}[8] = 0x07   (7.2)

We further require that the value of the last byte of the next-to-last block of each record is identical to the index of the record. Namely:

r^{0x00}_{n−1}[8] = 0x00, ..., r^{0xFF}_{n−1}[8] = 0xFF   (7.3)

This means we need to generate candidate records repeatedly, until we collect all 256 records. This is not a lot of overhead, and should not require much more than 256 requests to the encryption oracle, i.e., hyperlinks sent to the browser to cause it to send a request to the victim server. We will use this collection of records r^i in the following steps of the attack.
Second Poodle step: find the last byte of x. In this step, the attacker finds the last byte of the cookie/secret x; for example, assume the cookie x is 8 bytes, so we find x[8]. The attacker ensures, by adjusting the length of the prefix p, that x[8] is the last byte of some plaintext block. Since we use CBC, there are two consecutive ciphertext blocks, which we denote by c− and c+, such that:

x[8] = c−[8] ⊕ D_k(c+)[8]   (7.4)

The attacker now constructs chosen ciphertexts c^{0x00}, ..., c^{0xFF}, by using c+ to replace the last block of each of the r^i records. Namely:

(∀i ∈ {0x00, ..., 0xFF})   c^i = r^i_1 ++ ... ++ r^i_{n−1} ++ c+   (7.5)

The attacker invokes the oracle of the CPA-Oracle Attack model on each of these chosen ciphertexts. The padding is valid only if the value of the last decrypted plaintext byte is between 0x00 and 0x07. If the padding is valid, the MAC is checked; and it is valid only if the padding consists of the entire last block, i.e., only if the last byte of the plaintext (and padding) contains 0x07. Since we use CBC mode, this occurs when:

0x07 = c^i_{n−1}[8] ⊕ D_k(c^i_n)[8]   (7.6)

By substituting c^i_{n−1} = r^i_{n−1} and c^i_n = c+ (both from Equation 7.5), we have:

0x07 = r^i_{n−1}[8] ⊕ D_k(c+)[8]   (7.7)

Substituting now r^i_{n−1}[8] = i (Equation 7.3) and D_k(c+)[8] = x[8] ⊕ c−[8] (Equation 7.4), we have:

0x07 = i ⊕ (x[8] ⊕ c−[8])   (7.8)

Equation 7.8 holds when i = 0x07 ⊕ x[8] ⊕ c−[8], and therefore, exactly one of the chosen ciphertexts will have valid padding and a valid MAC. Furthermore, when we identify that ciphertext c^i has valid padding and MAC, we can also find x[8], the last byte of the secret/cookie, as x[8] = i ⊕ 0x07 ⊕ c−[8].

Last Poodle step: finally, the attacker repeats the second step, with a minor change, to find the other bytes of the cookie/secret x. This can be done easily for the cookie (or any other secret automatically sent by the browser), as follows. The adversary changes the prefix p to make a different byte of x be the last byte in some plaintext block. Then the attacker can proceed exactly as in step 2 to find this other byte of x.

7.2.4 The BEAST Attack: Exploiting CBC with Predictable-IV

The basic design of the TLS record layer is defined for an arbitrary encryption algorithm; however, the specifications define a limited number of standard cipher suites, and all of the AtE cipher suites that use a block cipher use it in CBC mode (subsection 2.8.5). Several of the attacks on the record protocol, e.g., Poodle and other padding oracle attacks, are based on properties of CBC mode. In this section, we briefly discuss the BEAST attack (BEAST stands for Browser Exploit Against TLS), which is also focused on the use of CBC mode, but addresses a very different vulnerability, existing in SSL and TLS 1.0 (but not in later versions). This vulnerability is the use of a predictable Initialization Vector (IV).

As presented in subsection 2.8.5, CBC mode requires the use of a random IV for each message. The value of the IV does not need to be secret, and in most implementations - including TLS 1.1 and 1.2 - the IV is sent 'in the clear', visible to an eavesdropper. However, the SSL design - adopted also by TLS 1.0 - used the handshake protocol to derive the IV for the first fragment in a connection, similarly to the derivation of the shared keys used by the record protocol.
This means that both sender and recipient have a shared, pseudorandom IV for the first fragment; therefore, the protocol does not send the IV along with the rest of the ciphertext. We conjecture that the designers felt that this is a better design, probably since it appears that keeping the IV secret may, somehow, be beneficial against some future attack against CBC mode (with a particular block cipher). There is also the minor benefit of reducing the number of bytes sent. Using a pseudorandom IV, without sending it, is fine; so there is no problem with this 'implicit IV' method for the first fragment in a connection.

However, what about other fragments, sent over the same connection? Also for these (non-first) fragments, SSL and TLS 1.0 do not send an IV. Instead, for any non-first fragment in the connection, the SSL and TLS 1.0 design uses the last ciphertext block of the previous fragment sent over the connection as the IV for the new fragment. Namely, the IV is the value of the most recently sent ciphertext block. Since the ciphertext is produced by a block cipher, this may seem secure. However, this is another example of the risk of trusting intuition rather than carefully validating the security of a design and relying on the exact cryptographic properties of the underlying mechanisms. Specifically, the security of CBC relies on the assumption that the IV is random, which implies unpredictable; once the IV is fixed (as the value of the last-sent ciphertext block), it is completely predictable and not random any more!

Let us see the cryptographic details of BEAST, under the CPA-Oracle Attack model. The model allows the attacker to choose each prefix-suffix pair (p_i, s_i), potentially as a function of the previous ciphertext c_{i−1}. We later briefly discuss the challenges of actually deploying BEAST in practice, since, obviously, the CPA-Oracle Attack model is only a simplification of reality.

BEAST: cryptographic aspects. Suppose we use 8-byte blocks (as with DES). The attack exposes the secret/cookie x byte by byte; let us first show how we expose the first (most significant) byte, x[1]. The attacker provides a first chosen-plaintext prefix p* (and suffix s*), chosen to ensure that a specific block of the resulting plaintext m* = p* ++ x ++ s* would contain the last seven bytes of p*, followed by the first byte x[1] of the secret/cookie. It is convenient to have p* contain fifteen bytes (two blocks minus one byte). The exact contents are not important, but for our discussion, a convenient choice is: p* = 123456789ABCDEF. As a result, we have:

m*[1:16] = p* ++ x[1] = 123456789ABCDEF ++ x[1]    (7.9)

Let c* denote the resulting encryption of m*. Let c*(1) = c*[1:8], i.e., the first block (eight bytes) of c*, and c*(2) = c*[9:16] denote the second block; similarly, m*(2) = m*[9:16] = 9ABCDEF ++ x[1]. Since we use CBC encryption, we have:

c*(2) = E_k(c*(1) ⊕ m*(2)) = E_k(c*(1) ⊕ (9ABCDEF ++ x[1]))    (7.10)

For i = 0, ..., 255, the attacker next obtains IV_i, the IV that would be used to encrypt the next fragment. In the SSL (and TLS 1.0) design, the IV is the last-sent ciphertext block (the end of the previous ciphertext fragment).
The attacker asks for encryption of plaintext p′_i, computed as:

p′_i = (m*[9:15] ++ i) ⊕ c*(1) ⊕ IV_i = (9ABCDEF ++ i) ⊕ c*(1) ⊕ IV_i    (7.11)

Let c′_i be the CBC encryption of p′_i with the known IV value IV_i; hence, c′_i = E_k((9ABCDEF ++ i) ⊕ c*(1)). We try the 256 different values for i until we find one of them, denoted i*, such that:

c*(2) = c′_{i*} = E_k((9ABCDEF ++ i*) ⊕ c*(1))    (7.12)

Since E_k is a permutation, equal outputs of E_k imply equal inputs to E_k. Hence, from Equation 7.10, we find x[1] by the following deductions:

c*(1) ⊕ (9ABCDEF ++ x[1]) = (9ABCDEF ++ i*) ⊕ c*(1)
(9ABCDEF ++ x[1]) = (9ABCDEF ++ i*)
x[1] = i*    (7.13)

Finding other bytes. Let us explain how we find x[2], by utilizing the fact that we already know x[1]; the method extends to the other bytes too. The attack simply requires choosing a fourteen-byte prefix p̂*, e.g., p̂* = 123456789ABCDE. As a result, we have:

m̂*[1:16] = p̂* ++ x[1:2] = 123456789ABCDE ++ x[1:2]    (7.14)

Since x[1] is already known, we are in a similar situation to before: the last (unknown) byte in the second block is now m̂*[16] = x[2]. Let ĉ* denote the (CBC) encryption of m̂*, with ĉ*(1) = ĉ*[1:8] and ĉ*(2) = ĉ*[9:16], and let m̂*(2) = m̂*[9:16]. Since we use CBC, we have:

ĉ*(2) = E_k(ĉ*(1) ⊕ m̂*(2)) = E_k(ĉ*(1) ⊕ (9ABCDE ++ x[1:2]))    (7.15)

The attacker now performs a similar test to the one before, to find the value of x[2]. Namely, for i = 0, ..., 255, the attacker obtains IV_i, the IV that would be used to encrypt the next fragment, and then asks for encryption of p̂′_i:

p̂′_i = (9ABCDE ++ x[1] ++ i) ⊕ ĉ*(1) ⊕ IV_i    (7.16)

The attacker eavesdrops to obtain ĉ′_i, the CBC encryption of p̂′_i with IV IV_i:

ĉ′_i = E_k(p̂′_i ⊕ IV_i) = E_k((9ABCDE ++ x[1] ++ i) ⊕ ĉ*(1))    (7.17)

Similarly to before, the attacker finds a value î* ∈ {0, ..., 255} for which:

ĉ*(2) = ĉ′_{î*}    (7.18)

Since E_k is a permutation, equal outputs of E_k imply equal inputs to E_k. Substituting the inputs to E_k from Equation 7.15 and Equation 7.17, we have:

ĉ*(1) ⊕ (9ABCDE ++ x[1:2]) = (9ABCDE ++ x[1] ++ î*) ⊕ ĉ*(1)
(9ABCDE ++ x[1:2]) = (9ABCDE ++ x[1] ++ î*)
x[2] = î*    (7.19)

In this way, we find x[2] (as î*); other bytes follow similarly.

BEAST: system aspects. The cryptographic aspects of BEAST were published already in 2004 by Bard [25], and observed for SSH and IPsec years earlier [38, 336]. Hence, following the conservative design principle (Principle 3), the TLS designers should have avoided the use of an observable IV, i.e., the value of the last-sent ciphertext block, preventing this vulnerability - as was finally done from TLS 1.1. Unfortunately, as in many similar scenarios, the designers ignored these well-known warnings, and used the last-sent ciphertext block as the IV for the next fragment anyway. The reason is that deploying Bard's attack [25] seemed too challenging. In particular, the attack requires the attacker to control the very first block of the new fragment; but in the classical use of TLS to secure HTTP communication between browser and website, every HTTP request begins with a fixed header. So it seems that the attacker cannot control the first block and the attack is prevented. Bard's paper [25] showed that this problem may be overcome, but the solution wasn't very practical. As a result, the vulnerability persisted in TLS 1.0.
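To make the cryptographic core of BEAST concrete, here is a minimal sketch of the byte-recovery loop of Equations (7.11)-(7.13); the helpers next_iv and encrypt_block are hypothetical stand-ins for the attacker's ability to observe the predictable IV and to have a chosen block encrypted at the start of the next fragment.

```python
# A minimal sketch of the BEAST byte-recovery loop (hypothetical oracles).
# Assumptions: 8-byte blocks; target_c1, target_c2 are c*(1), c*(2) from Eq. (7.10);
# known7 holds the seven known bytes preceding the unknown byte (e.g., b"9ABCDEF");
# next_iv() returns the predictable IV of the next fragment (the last ciphertext
# block sent); encrypt_block(p) returns the first ciphertext block produced when
# the attacker gets plaintext p encrypted, using the IV just returned by next_iv().

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def recover_next_byte(known7, target_c1, target_c2, next_iv, encrypt_block):
    for i in range(256):
        iv = next_iv()
        guess = known7 + bytes([i])               # candidate plaintext block
        probe = xor(xor(guess, target_c1), iv)    # Eq. (7.11): cancels the known IV
        if encrypt_block(probe) == target_c2:     # Eq. (7.12): inputs to E_k match
            return i                              # the unknown byte (Eq. 7.13)
    return None
```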
The vulnerability was eventually addressed in TLS 1.1; however, deployment of TLS 1.1 was limited for years. One of the main drivers for the adoption of TLS 1.1 was the publication [132], by Duong and Rizzo, of several practical 'implementation tricks' allowing the BEAST attack to deploy the basic cryptanalytical ideas of Bard, addressing the system challenges that made Bard's attack [25] so hard to deploy. Let us briefly mention the two most important implementation tricks, which are relevant for other attacks too. First, they observed that TLS attacks can target the HTTP cookie field, which is often used as a (secret) authenticator sent automatically by the browser whenever sending a request to a specific website. Second, they observed that the use of the WebSocket mechanism [146] made the attack usable against TLS as used by browsers, i.e., when running the HTTP protocol over TLS (denoted HTTPS). Further details are beyond our scope.

7.2.5 Exploiting RC4 Biases to Recover Plaintext

BEAST, Lucky13 and POODLE are all critical attacks against the use of CBC encryption as specified by TLS (1.0 to 1.2). Some countermeasures were proposed to these attacks, the most popular one being to simply use stream-cipher encryption, using RC4, instead of using a block cipher such as DES or AES in CBC mode. However, RC4 was known, for years, to have some vulnerabilities, such as the bias of the second byte observed and exploited in [276]; see subsection 2.5.6. Therefore, following the conservative design principle (Principle 3), it should have been avoided, and definitely not used as a 'more secure' alternative to CBC mode.

This choice to use RC4 as a supposedly more secure alternative to CBC was another example of underestimating the risk due to what appeared to be 'impractical vulnerabilities' - this time, of the RC4 stream cipher (or pseudorandom generator). Indeed, in the common use of TLS to protect HTTP communication, the beginning of the information sent by the client is normally the HTTP header - which begins with well-known bytes. Therefore, we cannot exploit the significant bias of the second RC4 byte [276]. However, in [11], it was shown that RC4 has additional biases, which may allow exposure of confidential, sensitive communication such as the cookie. This included two types of biases:

Single-byte biases: biases were detected for some output bytes of RC4, not just the second byte - although it has the largest bias. In [11], additional biases were found in the first 256 bytes of RC4. By careful analysis of the results of a large number of encryptions of the same secret x, the attack can recover the secret with significant probability, which depends on the number of encryptions and on the positions of the secret (earlier positions usually have more bias). For example, after 2^25 encryptions, the first 50 bytes were recovered with probability of more than 50% per byte.

Double-byte biases: RC4 also has biases of pairs of bytes in different (adjacent) positions, as reported already in 2000 [148]. By carefully analyzing such pairs, using the known biases, the attacker can recover the plaintext from arbitrary positions.

The combination of the attacks on CBC and on RC4 was an important motivation for the development and adoption of other cipher suites, and of improved record protocols, mainly the AEAD record protocols (subsection 7.2.7).
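As a small illustration of the principle behind single-byte bias attacks, the following sketch uses the (largest) second-byte bias to recover one plaintext byte from many RC4 encryptions of the same secret under fresh keys. As noted above, this particular bias is not exploitable against HTTPS, where the second byte is a known header byte; the attack of [11] instead combines many weaker biases at other positions, but the statistical idea is the same.

```python
# Recover one plaintext byte from the second-byte bias of RC4 (keystream byte 2
# equals 0x00 with probability ~1/128 instead of 1/256).
import os
from collections import Counter

def rc4_keystream(key, n):
    S = list(range(256))
    j = 0
    for i in range(256):                        # key-scheduling algorithm (KSA)
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    i = j = 0
    out = []
    for _ in range(n):                          # pseudo-random generation (PRGA)
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return bytes(out)

secret = b"cookie=7"                            # same plaintext, many fresh keys
counts = Counter()
for _ in range(2**16):
    ks = rc4_keystream(os.urandom(16), len(secret))
    counts[ks[1] ^ secret[1]] += 1              # observed ciphertext byte at position 2
guess = counts.most_common(1)[0][0]             # most frequent value ~= plaintext byte,
print(hex(guess), chr(guess))                   # since keystream[1] = 0 is twice as likely
```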
7.2.6 Exploiting Compress-then-Encrypt: The CRIME, TIME and BREACH Attacks

We conclude our discussion of attacks against the TLS record protocol by discussing attacks which focus on the compression of the plaintext before encryption. As we discussed in subsection 4.7.5, when we apply compression to the plaintext and then encrypt the compressed result, there is a risk of exposure of partial information about the plaintext. In particular, an attacker would be able to distinguish between the compressed-then-encrypted ciphertexts of two equal-length plaintexts p1, p2, if p1 has high redundancy (compresses to a much shorter string) while p2 has low redundancy (compression does not reduce its length).

However, the potential exposure due to plaintext compression may not appear to be a serious threat - and, unfortunately, this threat, presented already in 2002 [230], was mostly ignored for many years. In particular, as shown in Figure 7.4, the AtE TLS record protocol includes an (optional) compression process; and applications using TLS often apply their own processing to the messages, before sending them via TLS.

As you will find in Exercise 7.6, under the CPA-Oracle Attack model, it can be quite easy to deploy this attack and expose part or all of the cookie. Deploying it in practice involves several challenges, such as the following. First, the attacker may not be able to completely control the entire plaintext (except for the secret). Second, when using a block cipher, small changes in the length of the plaintext may not be reflected in corresponding changes in the length of the ciphertext. Third, the deployed compression schemes are considerably more complex than the one in Exercise 7.6; and, as the exercise shows, the details of the compression scheme are very relevant to the feasibility of the attack.

The CRIME attack [335], presented by Duong and Rizzo, demonstrated how these and other challenges can be overcome, allowing efficient exposure of cookies or other secrets repeatedly sent in HTTP requests or responses. Specifically, the demonstration focused on exposure of cookies by utilizing TLS compression. We will not describe the attack here: the principle is quite simple (as will be evident from Exercise 7.6), but the attack necessarily involves details of the compression mechanisms in use, which are beyond our scope. The details are not that complex, and interested readers are encouraged to read about them; a good description is provided in [29]. In their presentation, Duong and Rizzo also discussed potential variants of the CRIME attack that can extract secrets sent in HTTP responses, such as CSRF tokens⁴, as well as the potential abuse of other compression mechanisms, such as the widely deployed HTTP compression. In spite of this, and contrary to the conservative design principle (Principle 3), the main response to CRIME was the disabling of TLS compression. Ignoring Principle 3 was obviously a CRIME, and the punishment followed, in the form of two effective, convincing attacks that exploited HTTP compression mechanisms: TIME [29] and BREACH [164].

⁴A CSRF token is a pseudorandom identifier sent by a website to a browser, allowing the browser to submit operations on the user's account. CSRF tokens are the common defense against the Cross-Site Request Forgery (CSRF) attack.
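As a small illustration of the principle exploited by CRIME (and explored further in Exercise 7.6), the following sketch shows how the length of compressed output leaks a secret that is compressed together with attacker-chosen text. The secret value and field names are made up for the example, and real attacks must also handle the deployment challenges listed above (e.g., length differences hidden by block alignment or by Huffman coding), typically by repeating and padding the measurements.

```python
# The compress-then-encrypt leak, reduced to its core: the compressed length
# reveals whether an attacker-chosen guess repeats a substring of the secret.
# Encryption is omitted, since (stream-cipher) ciphertext length equals
# plaintext length; all names and the secret below are illustrative only.
import zlib

SECRET = "Cookie: sid=q7PzW"                    # the secret the attacker wants

def oracle_len(attacker_part):
    # models one request: attacker-controlled text compressed with the secret
    return len(zlib.compress((attacker_part + "\r\n" + SECRET).encode()))

known = "Cookie: sid="
while len(known) < len(SECRET):
    # the guess that compresses best (shortest output) is likely the next byte
    best = min("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789",
               key=lambda ch: oracle_len(known + ch))
    known += best
print(known)
```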
The BREACH attack showed how HTTP compression allows the application of CRIME to expose secrets in HTTP responses, most notably the above-mentioned CSRF tokens. Like CRIME, the disclosure is very effective. TIME, like BREACH, also assumes only HTTP compression (not TLS compression). But TIME also extends CRIME in a more profound way: it shows how to expose secrets in HTTP requests, such as cookies, without requiring eavesdropping capabilities. Namely, the attacker does not have the full capabilities of the CPA-Oracle Attack model; it only controls a website visited by the user. This is the cross-site attack model, which is the most commonly deployed model in studies of non-cryptographic web security. Luckily, TIME seems significantly harder to deploy; namely, it may require an extensive number of queries and much time.

Preventing Compress-then-Encrypt Exposure. There are several possible countermeasures to the Compress-then-Encrypt exposure. We mention three of them.

First, the most certain way to avoid loss of confidentiality due to the use of Compress-then-Encrypt is simple: avoid compression. While the use of compression for data can be critical for performance, it may be possible to avoid compression of the sensitive information. For example, cookies are sent in the HTTP headers, which are usually much shorter than the payload; it may be acceptable to compress only the payload. Or, avoid HTTP compression completely, in spite of the performance hit!

A second possible countermeasure is to apply a special encoding to sensitive data, which prevents compress-then-encrypt exposure of (only) that data while allowing compression of the rest of the data. For example, we can apply a 'randomizing transform' R to sensitive data s, such as cookies and CSRF tokens, before applying compression. One simple randomizing transform would XOR the sensitive data s with a random or pseudorandom string r, which is appended to the data separately. A standard transform may even be embedded into TLS, requiring the application only to mark the sensitive data s. This could be a nice programming project for the interested reader, and may be a useful extension to TLS. One unavoidable challenge of this approach, however, is the need to identify the sensitive data s. Some kinds of sensitive data may be amenable to automated identification (e.g., cookies), but other types may require the programmer to annotate the data. This is a serious disadvantage, as the countermeasure is prone to be applied incorrectly or not at all.

The third and final countermeasure we mention is to add random padding to the compressed data, hiding the exact compressed length. This countermeasure is very intuitive, but could often fail, e.g., when the attacker averages out the randomness using multiple measurements.
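To illustrate the second countermeasure, here is a minimal sketch of such a randomizing transform: the sensitive value is XORed with a fresh random mask, and the mask is sent alongside it, so an attacker-chosen guess can no longer produce a compressible match with the secret. The encoding below is only illustrative, not a standard format.

```python
# Sketch of the 'randomizing transform' countermeasure against compression leaks.
import os, zlib

def randomize(secret: bytes) -> bytes:
    mask = os.urandom(len(secret))
    masked = bytes(s ^ m for s, m in zip(secret, mask))
    return mask + masked            # receiver recovers: secret = masked XOR mask

secret = b"sid=q7PzW"
body = b"GET / HTTP/1.1\r\nCookie: "
guess = b"sid=q7PzW"                # attacker-controlled text containing a correct guess
# with the transform, the correct guess no longer improves compression:
print(len(zlib.compress(body + guess + b"\r\n" + secret)))             # unprotected
print(len(zlib.compress(body + guess + b"\r\n" + randomize(secret))))  # protected
```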
Figure 7.7: The AEAD Record Protocol (used always in TLS 1.3, and optionally in TLS 1.2). The ciphertext produced by the Authenticated Encryption with Additional Data (AEAD) function provides encryption of the plaintext input as well as authentication of both the plaintext and the additional data, using a nonce which should be unique in each invocation. The design contains several 'small print fields', e.g., SEQ and OTYP, which are explained in the text, and can be mostly ignored for intuitive understanding.

7.2.7 The TLS AEAD-based record protocol (TLS 1.3)

Most of the vulnerabilities identified for the TLS AtE record protocol can be avoided by adopting one of the two other designs discussed in Section 4.7: an Encrypt-then-Authenticate (EtA) design, which first uses an encryption scheme and then an authentication scheme, or the authenticated encryption with associated data (AEAD) schemes, which combine encryption (for confidentiality) and authentication.

RFC 7366 [179], Encrypt-then-MAC, applies the 'classical' Encrypt-then-Authenticate (EtA) paradigm. The specifications use TLS extensions to signal the use of EtA rather than AtE. It could be deployed in versions 1.0 to 1.2 of TLS (for version 1.0, provided that extensions are supported). TLS 1.3 supports only the use of an authenticated encryption with associated data (AEAD) scheme, which ensures both confidentiality and authenticity; see subsection 4.7.1. AEAD schemes accept two types of data: plaintext, which is encrypted and authenticated, and additional data, which is only authenticated, not encrypted.

The design of the AEAD record protocol is illustrated in Figure 7.7. Without going into all of the 'small print fields' used by the protocol, which we discuss below, this design is simpler than that of the AtE record protocol (Figure 7.4). First, instead of a separate encryption scheme and authentication scheme, we use just the AEAD scheme; we also need only one key to achieve both confidentiality and authenticity. Second, the AEAD function provides all the functions of a 'mode of operation' and more; it does not require a separate padding operation or a random initialization vector, although it does require a unique nonce for security [74]. This simplifies its use, and in particular avoids the need for separate variants for stream ciphers and for block ciphers. Finally, the AEAD record protocol does not support a TLS compression function, to foil Compress-then-Encrypt vulnerabilities as deployed by the CRIME attack (subsection 7.2.6). The simplicity of the TLS AEAD-based record protocol follows the KISS principle (Principle 14), and helps avoid implementation vulnerabilities.
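The following is a minimal sketch of AEAD-based record protection in the spirit of this design, using AES-GCM from the Python cryptography package: a single key, a per-record nonce derived from the sequence number, and the record header authenticated as additional data. It is a simplification for intuition, not the exact TLS 1.3 encoding.

```python
# Simplified AEAD record protection: one key, nonce derived from SEQ,
# header authenticated as additional data. Requires the 'cryptography' package.
import os, struct
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=128)
static_iv = os.urandom(12)                  # fixed per connection and direction
aead = AESGCM(key)

def protect(seq: int, fragment: bytes, header: bytes) -> bytes:
    # nonce = static IV XOR (padded) sequence number: unique for every record
    nonce = bytes(a ^ b for a, b in zip(static_iv, struct.pack(">4xQ", seq)))
    return aead.encrypt(nonce, fragment, header)    # encrypts fragment, authenticates header

def unprotect(seq: int, ciphertext: bytes, header: bytes) -> bytes:
    nonce = bytes(a ^ b for a, b in zip(static_iv, struct.pack(">4xQ", seq)))
    return aead.decrypt(nonce, ciphertext, header)  # raises InvalidTag on any tampering

hdr = bytes([23, 0x03, 0x03]) + (0).to_bytes(2, "big")  # opaque type, legacy version, length placeholder
ct = protect(0, b"GET / HTTP/1.1\r\n", hdr)
assert unprotect(0, ct, hdr) == b"GET / HTTP/1.1\r\n"
```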
This simplicity also facilitated automated verification of the security of the TLS 1.3 record layer specifications and of appropriately-developed implementations [115].

The seven small-print fields. Unfortunately, the TLS 1.3 record protocol also comes with seven fields whose meaning and use can be a bit obscure. Readers may mostly ignore these 'seven small-print fields', but let us explain them anyway, and reserve judgement on whether including all of these small-print fields in the design was really necessary or desirable. We present these 'small print' fields by their positioning in Figure 7.7, from left to right.

Nonce: The AEAD function requires provision of a unique nonce field in every invocation; the same nonce value should be used for the encrypt-and-authenticate operation and for the corresponding decrypt-and-verify operation. The nonce can be seen as a substitute for the random or unique IV required by most modes of operation, or for the counter used by CTR mode. The nonce input is 32 bits, and since each value should only be used once in the TLS connection, it imposes a (very high) limit on the number of fragments in a connection.

SEQ: The 32-bit sequence number of the fragment. By authenticating this value with each fragment, the protocol can detect any tampering with the order of fragments sent in the connection, e.g., reordering or duplicating fragments. Any detected manipulation results in immediate disconnection, since unintentional errors would already be prevented by the lower-layer TCP, which ensures reliable, ordered connections. Both sender and recipient maintain a count of the fragments sent and received; therefore, there is no need to actually send SEQ.

OTYP: This field is referred to in the TLS 1.3 specifications [329] as the opaque type; it is included for backward compatibility with earlier versions of TLS, and should always contain the fixed type 23, indicating application data - even if, in reality, the fragment contains a different type of record, such as a record of the alert or handshake protocols. The 'real type' of the record is included in the plaintext, with the fragment data, and therefore its value is hidden from an eavesdropper.

VER: The legacy version field. This field is included for backward compatibility with earlier versions of TLS, and should always contain the value 0x0303. The protocol learns the 'real' version of TLS from the handshake protocol, since, in TLS 1.3, the record layer is invoked only after negotiation by the handshake protocol, which authenticates the protocol version.

LEN: The length of the ciphertext fragment, i.e., of the output of the AEAD. The length is provided as part of the additional-data input to the AEAD; therefore, it should be computed before applying the AEAD, taking into account the length of the plaintext input and the expansion performed by the AEAD.

Type: This is the 'real' type of the fragment. TLS 1.3 and earlier versions define only four valid types: application data, handshake, alert, and CCS (Change Cipher Specification), each assigned a non-zero byte value.

Zeros pad: This is an optional field, which can consist of an arbitrary number of bytes whose value is 0x00 (zero). A random or otherwise selected number of zero pad bytes may be used to hide the size of the fragment from an attacker which can observe the TLS ciphertexts.
The motivation to hide the length of the fragment is mainly due to the Compress-then-Encrypt vulnerabilities and attacks (CRIME, TIME and BREACH); see subsection 7.2.6.

7.3 The SSLv2 Handshake Protocol

In this section we discuss the SSLv2 (SSL version 2) handshake protocol, its features - and some of its main vulnerabilities. SSL version 2 is the earliest published version of the SSL protocol [202], and its handshake protocol is interesting - beyond its historical importance. One motivation to study it is that SSLv2 already introduces many of the basic concepts and designs used in later versions - and, since it is a bit simpler, it is a good way for us to introduce these basic TLS concepts and designs. Another motivation is that the SSLv2 handshake has some serious vulnerabilities; understanding these vulnerabilities is instructive, to develop the ability to detect flaws in cryptographic protocols, and to understand and motivate the design of later versions of the TLS handshake protocol. Finally, surprisingly, there are still quite a lot of implementations that support SSLv2, although they also support (and prefer) later versions; this may make them vulnerable to downgrade attacks; see Section 7.5.

The SSLv2 handshake is a non-trivial cryptographic protocol, with support for multiple options and mechanisms - mostly supported also by all later versions (of SSL and TLS), often with extensions and improvements, and with removal of insecure mechanisms. We describe the protocol in the following subsections. In §7.3.1 we present the 'basic' handshake, namely, the handshake when there is no existing session (already-established shared key), and the protocol uses public-key operations to share a key. In contrast, in §7.3.3 we present the session-resumption handshake, allowing re-use of the shared key exchanged in a previous handshake between the same client and server, to open a new connection without additional public-key operations. In §7.5.1 we discuss how SSLv2 handles cipher suite negotiation, and explain how an attacker may exploit the (insecure) SSLv2 cipher suite negotiation mechanism to launch the simple yet effective cipher suite downgrade attack. Finally, in §7.3.4 we discuss how SSLv2 supports the (optional) client-authentication feature.

Terms and notations. SSLv2, as described in the original publications, e.g., in [202], uses several terms and notations which were modified in later versions. For consistency, we use the terms used by the later versions also when describing SSLv2; these terms are often also more intuitive. For example, we use the terms client random r_C and server random r_S, as in SSL3 and TLS. However, the SSLv2 documentation refers to these fields as challenge and connection-ID, respectively.

7.3.1 SSLv2: the 'basic' handshake

In this subsection we discuss the 'basic' SSLv2 handshake, illustrated in Fig. 7.8, which is a simplification of the SSLv2 handshake protocol. This simplified version does not include cipher suite negotiation, session resumption or client authentication. We discuss these additional aspects of SSLv2 in the following subsections.

The Hello messages. The SSLv2 handshake begins with the client sending a ClientHello message to the server, specifying the client's protocol version and the client random, r_C, a random bit-string used to randomize the key derivation.
The server responds with the ServerHello message, which contains the server random, r_S, and the server's public-key certificate. The certificate contains the server's public key - the server's RSA public encryption key S.e - the domain name of the server, e.g., s.com, and additional fields. The certificate is signed by an authorized Certificate Authority (CA), using the CA's private signing key CA.s. The client verifies the certificate. This includes several checks: does the client trust the signing CA? Is the certificate properly signed? Is the domain that the client tries to connect to the same as the domain in the certificate (or one of the domains in the certificate)? Has the certificate expired or been revoked? There are a few more checks; we discuss certificates and their validation in Chapter 8. The ClientHello and ServerHello messages retain these basic functions and fields in later versions of SSL and TLS; the most significant changes are in TLS 1.3.

7.3.2 SSLv2 Key-derivation

The SSLv2 handshake protocol establishes a shared master key, which we denote k_M. The master key is selected by the client, and sent encrypted, using RSA, to the server, in the Client key exchange message. Namely, the client sends E_{S.e}(k_M).

Figure 7.8: 'Basic' SSLv2 handshake: new session, no client authentication, and ignoring cipher suite negotiation. The client and server select random strings (r_C and r_S, respectively) and exchange them in the ClientHello and ServerHello messages. The server sends its certificate. The client verifies that the certificate is valid and properly signed (by a trusted CA), and that the domain in the certificate matches the desired server domain. If so, the client randomly selects a shared master key k_M, encrypts it using RSA encryption with the server's public key S.e, and sends it to the server. The client-to-server key k_C and the server-to-client key k_S are derived from the master key k_M and the randomizers r_C and r_S, using the MD5 cryptographic hash, as in Equation 7.20. Both Finished messages are protected by the SSL record protocol, which we denote by k_C (client-to-server communication) or k_S (server-to-client); the ClientFinished contains the server's random r_S, preventing replay, and the ServerFinished contains an identifier ID, allowing efficient session resumption (Figure 7.9).

The public-key encryption of k_M is the most computationally-intensive operation by the client; therefore, it is desirable for the protocol to be secure even if the client reuses the same master key k_M and its encryption E_{S.e}(k_M) in multiple connections, assuming that the master key was not exposed. To ensure this, we use the client random and server random fields from the Hello messages, r_C and r_S, respectively. Namely, we combine r_C and r_S with the master key, and use the combination to derive session-specific cryptographic keys for the session. The derived cryptographic keys are used to protect communication in the connection, and include keys for encryption/decryption as well as for authentication and verification of authentication (MAC).
In SSLv2, the parties derive and use only two keys from k_M and the random nonces r_C, r_S: the client-to-server key k_C and the server-to-client key k_S. The client uses k_C to encrypt the messages it sends and to compute the MAC attached to them for authentication, and k_S to decrypt the messages it receives from the server and to verify their MAC (recomputing it and comparing it to the received MAC value). These keys are derived as follows:

k_C = MD5(k_M ++ "1" ++ r_C ++ r_S)
k_S = MD5(k_M ++ "0" ++ r_C ++ r_S)    (7.20)

Why is there this separation between k_C, used to protect messages from client to server, and k_S, used to protect messages from server to client? One reason is that, with a stream cipher, using the same key in both directions would result in insecure re-use of the same key-pad for encryption of two different messages. Another motivation, relevant to block ciphers, is to improve security, following the key separation principle (Principle 10). In particular, many websites are public, and send exactly the same information to all users; however, we may want to protect the confidentiality of the contents, e.g., queries, sent by the users. By separating between k_C and k_S, the attacker cannot use the large amount of known plaintext sent from server to client to cryptanalyze the ciphertext sent from client to server. On the other hand, the use of the same secret key for two different cryptographic functions violates the same key separation principle. This was fixed in later versions of TLS, which derive separate keys for encryption and for authentication - with the exception of TLS 1.3, which needs only one key, since it uses a single AEAD scheme to ensure both authentication and encryption.

The use of both random numbers r_C and r_S is required, to ensure that a different key is used in different connections. This has three motivations. First, it is necessary to prevent replay of messages - from either client or server; see Exercise 7.9. Second, it reduces the total amount of known and chosen plaintext that can be available for cryptanalysis of a key, and the amount of plaintext that is exposed by a successful cryptanalysis. Finally, it avoids the possibility that known or chosen plaintext from one connection, e.g., from the public login page of a website, may help attack data sent in another connection, which may not have the same amount of known or chosen plaintext.

Note, however, that the SSLv2 key derivation does not fully follow the key separation principle, since it uses the same key for confidentiality (encryption) and for message authenticity (MAC). This can cause a vulnerability even if both encryption and MAC are independently secure; see Exercise 4.3. Indeed, later versions of TLS use separate keys for encryption and for MAC, e.g., k^E_C and k^MAC_C.

7.3.3 SSLv2: ID-based Session Resumption

The main overhead of the TLS protocol is due to the computationally-intensive public-key operations. Often, there are multiple connections between the same (client, server) pair over a short period of time; in such cases, the server and client may re-use the master key exchanged previously, thereby avoiding additional public-key operations.
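A resumed connection re-derives fresh connection keys from the cached master key and the new client/server randoms, exactly as in Eq. (7.20). The sketch below captures the idea; the exact byte encoding of the fields in SSLv2 differs, so treat it as an illustration only.

```python
# Sketch of the SSLv2-style connection-key derivation of Eq. (7.20).
import hashlib, os

def derive_connection_keys(k_m: bytes, r_c: bytes, r_s: bytes):
    k_c = hashlib.md5(k_m + b"1" + r_c + r_s).digest()   # client-to-server key
    k_s = hashlib.md5(k_m + b"0" + r_c + r_s).digest()   # server-to-client key
    return k_c, k_s

# on resumption, the cached master key is combined with *fresh* randoms,
# so every connection of the session still gets fresh connection keys:
k_m = os.urandom(16)
k_c1, k_s1 = derive_connection_keys(k_m, os.urandom(16), os.urandom(16))
k_c2, k_s2 = derive_connection_keys(k_m, os.urandom(16), os.urandom(16))
assert k_c1 != k_c2 and k_s1 != k_s2
```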
To facilitate re-use of the master key, the server includes an identifier ID at the end of the handshake; to re-use the same master key in another connection, the client sends this ID with its ClientHello, and if the server has the corresponding key, the session is resumed efficiently, avoiding additional public-key operations. We illustrate this ID-based session resumption process in Figure 7.9.

Figure 7.9: SSLv2 handshake, with ID-based session resumption. The client initiates this handshake by including the ID field in the 'Client hello' message. The ID field was received in the 'Server finished' message of a previous connection, and cached, with the corresponding session key k_M, by the client. If the server does not have the (ID, k_M) pair in its cache, then the handshake completes without resumption, as in Fig. 7.8. Otherwise, when (ID, k_M) is in the server's cache, the parties can reuse k_M, i.e., 'resume the session', by deriving new shared keys from k_M, using Eq. (7.20). This avoids the public-key operations - encryption by the client and decryption by the server of the master key k_M - as well as the overhead of transmitting the certificate S_{CA.s}(S.e, s.com). The server indicates such a cache hit by sending the ID-hit flag in its 'Server hello' response, and continuing with the resumption handshake as shown here.

The impact of session resumption can be quite dramatic. The savings are mostly in computation (CPU) time; instead of computing the public-key encryption of the master key k_M (for the client) and its decryption (for the server) for every TCP connection, we require these operations only for the first TCP connection in a session. The ratio of the computation time with and without session resumption is typically on the order of 100 for typical usage, such as protecting web communication using the https protocol, i.e., running http over TLS.

Session resumption in SSLv2 is always based on the use of the ID. This ID-based session resumption mechanism has a significant drawback: it requires the server to be stateful, specifically, to maintain state for each session (the session-ID and the master key). In the typical case where the same web server runs over multiple machines, this requires either that this storage be shared between all of these servers, or ensuring that a client will contact the same machine each time - a difficult requirement that sometimes is infeasible. These drawbacks motivate the adoption of alternative methods for session resumption, most notably the TLS session-ticket resumption mechanism, which we discuss later.

Note that the session resumption protocol is one reason for requiring the use of the client and server random numbers; see the following exercise.

Exercise 7.1. Consider implementations of the SSLv2 protocol, where the (1) client random or (2) server random fields are omitted (or always sent as a fixed string). Show a message sequence diagram for two corresponding attacks, one allowing replay of messages to the client, and one allowing replay of messages to the server.
Hint: perform replay of messages from one connection to a different connection (both using the same master key, i.e., the same session).

7.3.4 SSLv2: Client Authentication

All versions of SSL and TLS, including SSLv2, support an (optional) client authentication mechanism, where the client proves its identity by sending a certificate for a public signature-validation key, and then signs content sent by the server. Client certificates should identify a client approved by the server, and be signed (issued) by a certificate authority (CA) trusted by the server, just like server certificates should be signed by a CA trusted by the client. In SSLv2, the information signed by the client consists of a signature, using the client's private key, over several fields, including a challenge sent by the server with the request for client authentication, the server's certificate, and the shared connection keys. Furthermore, this signature should be sent encrypted (using the appropriate connection key). It may not be immediately clear why all of these elements are used; but as we see in Exercise 7.10, removal of some of them may result in a vulnerability. In the next section we discuss the handshake protocol from SSLv3 to TLSv1.2; the client-authentication design of these versions is simpler and more amenable to security analysis.

7.4 The Handshake Protocol: from SSLv3 to TLSv1.2

We will now discuss the evolution of the TLS handshake protocol after version 2, from version 3 of SSL [155] to versions 1.0, 1.1 and 1.2 of TLS [120, 122, 381]. These four handshake protocols are quite similar - we will mention the few major differences. Later, in §7.6, we present version 1.3 of TLS, which is the latest - and involves more significant differences, compared to the more incremental changes of these earlier versions.

The handshake protocol, especially before TLS 1.3, has multiple mechanisms and options; we will not cover all of them. One important mechanism we will not cover is handshake renegotiation. Renegotiation allows clients and servers to change negotiated aspects of the session. One important use of renegotiation is when a server decides to ask for client authentication after the session began without client authentication (i.e., with an anonymous client). Renegotiation was a rather complex mechanism, and was subject to a few attacks, most notably [57, 331].

Figure 7.10: The 'basic' RSA-based handshake, for SSLv3 and TLS 1.0, 1.1 and 1.2. The master key k_M is computed, as in Eq. (7.21), from the pre-master key k_PM, which is sent in the client key exchange message (third flow). Notice that the client key exchange message simply contains the encryption of k_PM, i.e., E_{S.e}(k_PM). From TLS 1.1, the specifications support (optional) extensions, as illustrated; see subsection 7.4.3.

Figure 7.10 illustrates the 'basic' variant of the handshake protocol of the SSLv3 protocol and the TLS protocol (versions 1.0 to 1.2). Like SSLv2, this 'basic' variant uses RSA encryption to send an encrypted key from client to server.
In the following subsections, we discuss the main improvements introduced in these later versions of TLS, including:

Improved key derivation and k_PM (§7.4.1): the key derivation process was significantly overhauled between SSLv2 and the later versions, beginning with SSLv3. In particular, the client-key-exchange message of the basic exchange includes the premaster key k_PM, from which the protocol derives the master key k_M. As before, the master key k_M is used to derive the keys for the record protocol, used to encrypt and authenticate data on the connection. However, from SSLv3, the protocol correctly separates between the encryption keys and the authentication keys (in contrast to SSLv2).

DH key exchange and PFS (§7.4.2): from SSLv3, the TLS protocols support DH key exchange, as an alternative or complementary mechanism to the use of RSA-based key exchange (the only method in SSLv2). The main advantage is support for perfect forward secrecy (PFS).

Session-Ticket Resumption (§7.4.4): an important TLS extension allows Session-Ticket Resumption, a new mechanism for session resumption. Session-ticket resumption allows the server to avoid keeping state for each session, which is often an important improvement over the ID-based session resumption mechanism supported already in SSLv2 (but which requires servers to maintain state for each session).

Improved handshake integrity and negotiation (§7.5): from SSLv3, the handshake protocol's finished message authenticates the data of the previous flows of the handshake; this prevents the SSLv2 downgrade attack (Figure 7.18) and other violations of handshake integrity. TLS, and to a lesser degree SSLv3 too, also improves other aspects of the negotiation, in particular, support for extensions, negotiation of the protocol version, and negotiation of additional mechanisms, including key-distribution and compression.

Two of these changes - improved key derivation and improved handshake integrity - have an impact already on the 'basic' handshake. To see this impact, compare Figure 7.10 (for SSLv3 to TLS 1.2) to Figure 7.8 (the corresponding 'basic' handshake of SSLv2). We therefore begin our discussion with these two changes.

7.4.1 SSLv3 to TLSv1.2: improved derivation of keys

Deriving the master key from the premaster key. From SSLv3, the handshake protocol exchanges a pre-master key k_PM, instead of the master key k_M exchanged in SSLv2. The parties derive the master key k_M from the pre-master key k_PM, using a PRF, as in Eq. (7.21):

k_M = PRF_{k_PM}("master secret" ++ r_C ++ r_S)    (7.21)

The main motivation for this additional step is that the value exchanged between the parties may not be a perfectly-uniform secret binary string, as required for a cryptographic key. When exchanging the shared key using the 'basic', RSA-based handshake, this may happen when the client does not have a sufficiently good source of randomization, or if the client simply resends the same encrypted premaster key as computed and used in a previous connection to the same server - not a recommended way to use the protocol, of course, but possibly attractive for some very weak clients. When exchanging the shared key using the DH protocol, there is a different motivation for using this additional derivation step, from premaster key to master key.
Namely, the standard DH groups are all based on the use of a safe prime; as we explain in §6.2.3, this implies that we rely on the Computational DH assumption (CDH), and that the attacker may be able to learn at least one bit of information about the exchanged key. By deriving the master key from the premaster key, we hope to ensure that the entire master key is pseudorandom.

Deriving connection keys. Another important improvement of the handshake protocols of SSLv3 to TLS 1.2, compared to the SSLv2 handshake, is in the derivation of the connection keys which are used for encryption and authentication by the record protocol. This aspect is not apparent from looking at the flows (Fig. 7.10). Specifically, recall that in SSLv2, we derived from the master key k_M two keys: k_S for protecting traffic sent by the server S, and k_C for protecting traffic sent by the client C, as in Eq. (7.20). In SSLv3 and TLS, we use k_M to derive, for traffic sent by the client C and server S, three keys/values each, for a total of six keys/values: two authentication (MAC) keys (k^A_C, k^A_S), two encryption keys (k^E_C, k^E_S), and two initialization vectors (IV_C, IV_S), used for initialization of the 'modes of operation' (Section 2.8). In each pair of keys, we use the one with subscript C for traffic from client to server, and the one with subscript S for traffic from server to client.

To derive these six keys/values, we generate from k_M a long string which is referred to as the key block, which we then partition into the six keys/values. The exact details of the derivation differ between these different versions of the handshake protocol, and arguably, none of the derivations is fully justified by standard cryptographic definitions and reductions. We present the following simplification, leaving the exact details for exercises; the interested reader can find the full details in the corresponding RFC specifications. Our simplification is defined using a generic pseudorandom function PRF, whose input is an arbitrary-length string, and whose output is a 'sufficiently long' pseudorandom binary string called the key-block, as follows:

key-block = PRF_{k_M}("key expansion" ++ r_C ++ r_S)    (7.22)

The key-block is then partitioned into the six keys/values, as illustrated in Table 7.1.

Table 7.1: Derivation of connection keys and IVs, in SSLv3 to TLS 1.2

key-block = PRF_{k_M}("key expansion" ++ r_C ++ r_S)
key-block is split into: k^A_C | k^A_S | k^E_C | k^E_S | IV_C | IV_S

7.4.2 SSLv3 to TLSv1.2: DH-based key exchange

From SSLv3, the TLS handshake supports DH key exchange. Three types of DH key exchange are supported: ephemeral (signed), static (certified) and anonymous (unauthenticated). The ephemeral method is the most popular TLS handshake method, due to its significant security benefits. The static (certified) method is rarely, if ever, used, and does not offer increased security; however, the two methods are actually quite similar. The anonymous method is rarely used, since it does not provide server authentication; we focus on the other two methods.

In both methods, the parties derive a shared key k_PM, referred to as the pre-master key, following the DH protocol. Specifically, TLS uses a modular group, with an agreed-upon safe prime p and generator g. The parties exchange their 'public keys', g^{S.x} (for the server) and g^{C.y} (for the client), where each party uses a randomly-generated private key: S.x for the server and C.y for the client.
The parties then derive the pre-master key kP M , again as in ‘plain’ DH key exchange, namely: Applied Introduction to Cryptography and Cybersecurity 7.4. THE HANDSHAKE PROTOCOL: FROM SSLV3 TO TLSV1.2 Client Client hello: version (vC ), random (rC ), cipher suites (. . . DH. . . ), 437 Server Server hello: version (vS ), random (rS ), cipher suite:. . . DH. . . , certificate: SCA.s ((g, p, g S.x mod p), s.com, . . .), [, extensions] Client key exchange: g C.y ; Client finished: P RFkM (‘client finished:’, h(previous flows)) Server finished: P RFkM (‘server finished:’, h(previous flows)) Figure 7.11: SSLv3 to TLSv1.2: the static DH handshake, using static (certiőed) DH public parameter for the server, g S.x mod p. Pre-master key kP M is computed as in Eq. (7.23), and master key kM is computed - from kP M - as in Eq. (7.21). kP M = g C.y·S.x mod p (7.23) Recall that when using a modular group, the value exchanged by the DH protocol is not pseudorandom; namely, security may rely only on the computational DH assumption (CDH), as we know that the stronger DDH (Decisional DH, Deőnition 6.7) assumption does not hold for such groups. This is one motivation for not using kP M directly as a key to cryptographic functions. Instead, we derive from the pre-master key kP M another key, the master key kM , which should be pseudorandom. See ğ7.4.1, where we discuss the derivation of the master key and of the keys for speciőc cryptographic functions, such as PRF, MAC or shared-key encryption. DH Static (certified) handshake. In static (certiőed) DH key exchange, the server’s DH public key is signed as part of the signing process of a public key certificate. Namely, the signing entity is a certificate authority which is trusted by the browser, and the certiőcate contains the domain name (e.g., s.com) and other parameters such as expiration date: SCA.s ((g, p, g S.x mod p), s.com, . . .). See Figure 7.11. In practice, the use of a certiőcate implies that the server’s DH public key, g S.x , is őxed for long periods, similarly to the typical use of RSA or other public key methods. Hence, the static (certiőed) DH key exchange is similar in its properties to the RSA key exchange; the difference is simply that instead of using RSA encryption to exchange the key, and relying on the RSA (and factoring) assumptions, the static (certiőed) DH key exchange relies on the DH (and discrete-logarithm) assumptions. DH Ephemeral (DHE) handshake: ensuring Perfect Forward Secrecy (PFS). The DH Ephemeral (DHE) key exchange uses a different, randomlychosen private key for each exchange; for DH, this means that each party selects Applied Introduction to Cryptography and Cybersecurity 438 CHAPTER 7. TLS PROTOCOLS: WEB-SECURITY AND BEYOND Client Server Client hello: version (vC ), random (rC ), cipher suites (incl. DHE-RSA or DHE-DSS) [, ID] [,extensions] Server hello: version (vS ), random (rS ), cipher suite (DHE-RSA), Certificate: SignCA.s . .), [, extensions]   (S.v, s.com, .S.x (p, g, g mod p), Server key exchange: SignS.s ((p, g, g S.x mod p)) Client key exchange: (p, g, g S.x mod p), g C.y ; Client finished: P RFkM (‘client finished:’), h(previous flows)) Server finished: P RFkM (‘server finished:’), h(previous flows)) Figure 7.12: SSLv3 to TLSv1.2: the DH Ephemeral (DHE) handshake. The DH exponents S.x and C.y are chosen randomly for this handshake. The server signs its DH public key g S.x mod p, using RSA (RSA_DHE ciphersuite) or DSS (DSS_DHE ciphersuite). The pre-master key kP M is computed as in Eq. 
(7.23), and master key kM is computed as in Eq. (7.21). a new private exponent (S.x for the server, C.y for the client) in each handshake. This is illustrated in Figure 7.12. The DH exchange is ‘server-authenticated’, i.e., the server signs its ‘public’ DH value (g S .x mod p), and links it with the particular handshake by including in the signed data also the server and client random numbers (rS and rC ). To allow the client to validate this signature, the server should send a public key certiőcate that speciőes its public signature-verification key, rather than the server’s public decryption key, as used for TLS handshakes where the client encrypts the pre-master key using the server’s public key. Following the key separation principle, these two public keys should be different, but many servers actually use the same public key (and private key) for both purposes. The lack of key separation was exploited in several attacks against TLS [18, 73, 214, 341]. Once the TLS session terminates, the private exponents are erased - as well as any keys derived from them, including the pre-master key kP M , the master key kM , the derived key block (Eq. (7.22)) and the keys derived from A E it (kSA , kSE , kC , kC ). This ensures perfect forward secrecy (PFS), i.e., the ith session between client and server is secure against a powerful MitM attacker, even if the attacker is given, all the keys and other contents of the memory of both client and server before and after the ith session, as long as the keys are given only after the ith handshake is completed. Security assumptions of DH key exchange. An obvious, important difference between the RSA key exchange and the DH key exchange methods, is that instead of using RSA encryption to exchange the key, and relying on the RSA (and factoring) assumptions, the static (certiőed) DH key exchange relies on the computational-DH (and discrete-logarithm) assumptions. Notice, Applied Introduction to Cryptography and Cybersecurity 7.4. THE HANDSHAKE PROTOCOL: FROM SSLV3 TO TLSV1.2 439 however, that in the typical case where the certiőcate uses RSA signatures, the security of the handshake still relies also on the RSA (and factoring) assumptions. Namely, the DH key exchanges require both the computational-DH (and discrete logarithm) assumption, and the RSA (and factoring) assumption. In this sense, DH key exchanges, using RSA signatures, requires more assumptions compared to the RSA key exchange. TLS 1.2 (and 1.3) also support ECDSA signatures, which, like the DH key exchange, are based on the discrete logarithm assumption, avoiding the reliance on an addition assumption (RSA and therefore also hardness of factoring), which retaining the advantage of perfect forward secrecy (PFS). 7.4.3 The TLS Extensions mechanism One of the most important improvements of TLS over SSL, is that TLS support a ŕexible and secure extensions mechanism. This mechanism allows clients to specify additional őelds, not deőned in the protocol, but supported (and ‘understood’) by some of the servers. Once a server receives an extension that it supports, its behavior may change from the ‘standard protocol’ in arbitrary way (as deőned by the extension); however, servers should ignore any unknown extension. Extensions were ‘unofficially’ supported as early as TLS1.0, where servers are required to ignore any unknown őelds appended beyond the known őelds, as deőned in [67, 68]. Support for extensions became a (mandatory) part of the TLS speciőcations from version 1.1. 
Some standard extensions facilitate important functionality, and some are needed for security; and users may deőne additional extensions. Let us discuss one important extension here, and another one in the next subsection. The Server Name Indication (SNI ) is an example of an important, popular extension. Support of SNI became mandatory from TLS 1.1, and was one of the main factors motivating websites and clients to adopt TLS. Many servers refuse handshakes where the client does not include the SNI extension in Client Hello. The main use of SNI is to support the common scenario, where the same web server is used to provide web-pages belonging to multiple different domain names, e.g., a.com and b.org. Each domain name may require a different certiőcate; the SNI extension allows the client to indicate the desired server domain name early on in the protocol, before the server has to send a certiőcate to the client - allowing the server to send the desired certiőcate based on the web-page that is being requested. Before SNI, the common way to a web-server to support multiple web-sites, with different domain names, was by having each site use a dedicated port - an inconvenient and inefficient solution. However, the SNI extension is valuable even for servers which host only a single domain. This is since SNI allows the server to verify that the domain that the client wants to connect with, is the same as the hosted domain, before the server spends the considerable computational resources to complete the TLS handshake. This avoids spending server resources due to incorrect client Applied Introduction to Cryptography and Cybersecurity CHAPTER 7. TLS PROTOCOLS: WEB-SECURITY AND BEYOND 440 Client Server Client hello: client random (rC ), cipher suites, ID Server hello: server random (rS ) Client finished: P RFkM (‘client finished:’), h(previous flows)) Server finished: P RFkM (‘server finished:’), h(previous flows)) Figure 7.13: SSLv3 to TLS1.2 handshake, with ID-based session resumption. requests sent to the server; it also avoids exposing the hosted domain (to an attacker sending Client Hello to őnd out the domain name). By requiring the SNI extension, a server can also prevent a potential Denial of Service (DoS) attack, exhausting server’s resources. Without SNI, a roguewebsite could abuse visits by benign users, to attack other sites; this kind of DoS attack is called a cross-site Denial-of-Service attack; without SNI, it could be especially effective against TLS 1.3. For details, see [193]. 7.4.4 SSLv3 to TLSv1.2: session resumption Both SSLv3 and TLS, like SSLv2, support the (stateful) ID-based session resumption mechanism; however, many TLS servers also support extensions, including the session-ticket extension, which is an alternative, ‘stateless’ method for session resumption. In this subsection we discuss these two methods. ID-based session resumption in SSLv3 and TLS 1.0-1.2 We begin with the (stateful) ID-based session resumption mechanism, which did not change much from its implementation in SSLv2. Figure 7.13 illustrates the handling of ID-based session resumption, in the SSLv3 handshake protocol, and in versions 1.0-1.2 of the TLS protocol. In the őgure, the client-hello message contains the session-ID, denoted simply ID, which was received from server in a previous connection. 
Session resumption is possible, when the server still has the corresponding entry (ID, kM , γ) saved from a previous connection; ID is the session identifier, kM is the session’s master key, and γ contains ‘related information’ such as the cipher suite used in the session. When the server has the (ID, kM , γ) entry, it reuses kM and γ, i.e., ‘resumes the session’, and derives new shared keys from it (using Eq. (7.20). This avoids the public key encryption (by client) and decryption (by server) of master key kM , as well as the transmission of the relevant information, most signiőcantly, the public key certiőcate. Applied Introduction to Cryptography and Cybersecurity 7.4. THE HANDSHAKE PROTOCOL: FROM SSLV3 TO TLSV1.2 441 Note that when either the client or the server, or both, do not have a valid (ID, kM ) pair, then the handshake is essentially the same as for a ‘basic’ handshake (without resumption), as in Fig. 7.8. The only changes are the inclusion of the ID from client (if it has it), and the inclusion of an ID in the ‘server-őnish’ message, to be (optionally) used for future resumption of additional connections (in the same session). The session resumption mechanism can have a signiőcant impact on performance; in particular, websites often involve opening of a very large number of TCP connections to the same server, to download different objects. The reduction in CPU time can easily be a ratio of dozens or even hundreds. Therefore, this is a very important mechanism; however, it also has some signiőcant challenges and concerns, as we next discuss. Session-ID resumption: challenges and concerns. The basic challenge of ID-based session resumption is the need to maintain state, and lookup the state - and key - using the ID. To minimize the storage and lookup time overhead, the cache of saved (ID, kM ) pairs cannot be too large; on the other hand, if the cached is too small, then the resumption mechanism is less effective. This challenge is made much harder, since web servers are usually replicated - to handle high load and to reduce latency by placing the server closer to the clients, e.g., in a Content Distribution Network (CDN). Ensuring PFS with ID-based session resumption Another challenge is that the exposure of the master key kM , exposes the entire communication of every connection to an eavesdropper; namely, the storage of the key may foil the perfect-forward secrecy (PFS) mechanism. To ensure PFS, we must ensure that all copies of the key kM are discarded, without any copies remaining - a non-trivial challenge. This challenge is often made even harder due to the way that web-servers implement the (ID, kM ) cache. Speciőcally, in some popular servers, e.g. Apache, the operator can only deőne the size of the (ID, kM ) cache. Suppose the goal is to ensure PFS on daily basis, i.e., to change keys daily. Then the cache size must be small enough to ensure that entries will be thrown out after at most a day, yet, if it is too small, there will be many cache misses, i.e., the efficiency-gain of the resumption mechanism will be reduced. Furthermore, even if we use a small cache, a client which continues a session for very long time may never get evicted from the cache, and hence we may not achieve the goal of ensuring PFS on daily basis, if the cache uses the (usual) paradigm of throwing out the least-recently-used element; to ensure entries are thrown after one day at most, it should operate as a queue (őrst-in-őrst-out). Exercise 7.2. 
Consider a web server which has, on average, one million daily visitors, but the number in some days may be as low as one thousand. What is the required size of the ID-session cache, in terms of the number of (ID, kM) entries, to ensure PFS on a daily basis, when entries are removed from the cache only when necessary to make room for new entries? Can you estimate or bound how many of the connections will be served from the cache on a typical day? Assume the ID-session cache operates using a FIFO eviction paradigm.

Figure 7.14: Ticket-based session resumption, using a ticket τ which the client sends to the server with the client hello; the client has received τ from the server in a previous handshake. The server should be able to validate the ticket as one that the server issued previously, and not too long ago, and to retrieve the shared pre-master-key encrypted within the ticket. This is usually done by encrypting (and authenticating) the ticket contents under a key known only to the servers.

The Session-Ticket extension and its use for session resumption. The TLS extensions mechanism provides an alternative, stateless session-resumption mechanism. The idea is simple: together with the finish message of a successful handshake, the server attaches a session-ticket extension. Later, when the client re-connects to the same server, it attaches the previously-received session-ticket extension. See Figure 7.14.

The ticket should allow any of the ‘authorized servers’ (e.g., running the website) to recover the value of the master key kM of the session with the client - but prevent attackers, eavesdropping on the ticket as sent by the client, from finding kM. This is achieved by having kM, and other values sent in the ticket, encrypted using a secret, symmetric Session Ticket Encryption Key, which we denote kSTEK, known (only) to all authorized servers. Notice that kSTEK is not shared with clients or derived by TLS; the method of generating it and sharing it between the servers is implementation-specific. Since kSTEK is a shared key, it is usually simply selected randomly. Clients cannot decrypt the tickets; hence, they must store both the ticket and the (plaintext) session master key kM, to allow the client to perform its part of the handshake.

The contents of the session ticket are only used by the servers, and are opaque to the clients, i.e., not ‘understood’ or used by the clients; hence, different implementations may use different tickets. RFC 5077 [344] recommends a structure which uses Encrypt-then-Authenticate, where the encrypted contents include the protocol version, cipher suite, compression method, master secret key, client identity and a timestamp. The timestamp allows the server to limit the validity period of a ticket (and the keys contained within); if the server receives a ticket which has already expired, or is invalid for any other reason, it simply ignores it and proceeds with the ‘regular’ handshake, establishing a new pre-master key (and potentially sending a new ticket to the client).
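The following sketch illustrates the idea of sealing the session state into a ticket under a server-side key. For brevity it uses a single AEAD (AES-GCM, via the ‘cryptography’ package) instead of the separate encryption and MAC of the Encrypt-then-Authenticate structure recommended by RFC 5077, and the field names are illustrative.

```python
import json, os, time
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

STEK = AESGCM.generate_key(bit_length=128)   # session-ticket encryption key, shared by the servers

def issue_ticket(master_key: bytes, cipher_suite: str) -> bytes:
    """Seal the session state into a ticket that only the servers can open."""
    state = json.dumps({
        "master_key": master_key.hex(),
        "cipher_suite": cipher_suite,
        "issued_at": int(time.time()),
    }).encode()
    nonce = os.urandom(12)
    return nonce + AESGCM(STEK).encrypt(nonce, state, b"session-ticket")

def redeem_ticket(ticket: bytes, max_age: int = 86400):
    """Return the saved state, or None if the ticket is invalid or expired."""
    try:
        nonce, body = ticket[:12], ticket[12:]
        state = json.loads(AESGCM(STEK).decrypt(nonce, body, b"session-ticket"))
    except Exception:
        return None                  # invalid ticket: fall back to a full handshake
    if time.time() - state["issued_at"] > max_age:
        return None                  # expired ticket: fall back to a full handshake
    return state
```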
The limited validity period for the ticket is important, to limit the risk from exposure of the keys in a particular ticket - via cryptanalysis or in other ways, such as abuse of a vulnerability of the browser. Limiting the validity of tickets is clearly also necessary to ensure Perfect Forward Secrecy (PFS); however, this is not sufficient. Specifically, to ensure PFS, we should also limit the ability of an attacker to decipher messages from past recorded ciphertexts of long-terminated connections, using an exposed Session Ticket Encryption Key (STEK) kSTEK. Let us discuss this challenge.

PFS with Session Ticket Encryption Keys. To preserve PFS, e.g., on a daily basis, we need to make sure that each Session Ticket Encryption Key kSTEK is kept for only the allowed duration - e.g., up to 24 hours (‘daily’). In principle, this is easy; we can maintain this key only in memory, and never write it to disk or other non-volatile storage, making it easier to ensure it is not kept beyond the desired period (e.g., daily). This rule may require us to maintain several ticket-keys concurrently, e.g., generate a new key once an hour, allowing it to ‘live’ for up to 24 hours. In the typical case of replicated servers, the ticket keys kSTEK should be distributed securely to all replicas. Changing the key becomes even more important, with it being used on so many machines.

Unfortunately, as for ID-based resumption, many popular web-servers implement ticket-based resumption in ways which are problematic for perfect forward secrecy (PFS). These web-server implementations do not provide a mechanism to limit the lifetime of the ticket key, except by restarting the server (to force the server to choose a new ticket key). For some administrators and scenarios, this lack of support for PFS may be a consideration for choosing a server, or for using session-IDs and disabling session-tickets.

7.4.5 SSLv3 to TLSv1.2: Client authentication

The SSL and TLS protocols support, already from SSLv2, a mechanism for authenticating the client, as an optional service of the handshake. In this subsection we describe how this optional client authentication mechanism works, in SSLv3 and in TLS 1.0 to 1.2.

The TLS client authentication mechanism is illustrated in Figure 7.15. The mechanism consists of three additions to the ‘basic’ handshake. First, the server signals the need for client authentication, by including the certificate request field together with the server-hello message. The certificate-request field identifies the certificate-authorities (issuers) which are accepted by this server; namely, client authentication is possible only if the client has a certificate from one of these entities.

Figure 7.15: Client authentication in SSLv3 to TLS 1.2.

Next, the client attaches, to its client key exchange message, two fields. The first is the certificate itself; the second, called certificate verify, is a digital signature over the handshake messages. The ability to produce this signature serves as proof of the identity of the client.
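A minimal sketch of the certificate verify idea: the client signs (a hash of) the handshake messages, and the server verifies the signature using the public key from the client’s certificate. Ed25519 (via the ‘cryptography’ package) is used here only as a convenient stand-in for whatever signature algorithm the client’s certificate actually specifies; all values are illustrative.

```python
import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# Client side: the long-term key pair whose public part appears in the client's certificate.
client_key = Ed25519PrivateKey.generate()
client_pub = client_key.public_key()     # conveyed to the server inside the certificate

handshake_messages = b"ClientHello...ServerHello...Certificate...ClientKeyExchange..."

# CertificateVerify: a signature over (a hash of) all handshake messages so far.
certificate_verify = client_key.sign(hashlib.sha256(handshake_messages).digest())

# Server side: verify the signature using the public key from the client's certificate.
try:
    client_pub.verify(certificate_verify, hashlib.sha256(handshake_messages).digest())
    print("client authenticated")
except InvalidSignature:
    print("abort handshake")
```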
This client authentication mechanism is quite simple and efficient; however, it is not widely deployed. In reality, TLS is typically deployed using only the public key (and certificate) of the server, i.e., only allowing the client to authenticate the server, but without client authentication. The reason is that TLS client authentication requires clients to use a private key, and to obtain a certificate on the corresponding public key; furthermore, that certificate must be signed by an authority trusted by the server. This raises two serious challenges. First, clients often use multiple devices, and this requires them to have access to their private keys on these multiple devices, which raises both usability and security concerns. Second, clients must obtain a certificate - and from an authority trusted by the server. As a result, most websites prefer to avoid the use of TLS client authentication; when user authentication is required, they rely on sending secret credentials, such as passwords or cookies, over the TLS secure connection.

Note also that the client authentication mechanism requires the client to send their certificate ‘in the clear’. This may be a privacy concern, since the certificate may allow identification of the client.

7.5 Negotiations and Downgrade Attacks (SSL to TLS 1.2)

The evolution of TLS, at least until TLS 1.3, saw an increasingly complex set of different options and choices: different usage modes, different protocol versions, different cipher suites (from SSLv2) and different extensions (from TLS 1.1). To allow this flexibility, the specifications and implementations use different negotiation mechanisms.

Figure 7.16: SSLv2 handshake, with details of cipher suite negotiation (underlined). The Client hello message indicates the options supported by the client; the Server hello message contains the subset of these which are also supported by the server. The client chooses one of these, and indicates the choice in the Client key exchange message. Note: the negotiation was modified in later versions.

The basic goal of negotiation is for client and server to agree on the same options/choices. However, there is also a more challenging security goal: to prevent downgrade attacks, where an attacker causes the parties to use a vulnerable option/choice, although both parties are able, and prefer, to use a secure option/choice. As we have seen for GSM (subsection 5.6.3), downgrade attacks can be simple to understand and deploy, yet efficiently break security - and they could persist for years, as old versions die slowly. The situation with TLS is quite similar. In this section, we discuss the pre-TLS-1.3 negotiation mechanisms, and several effective, and quite simple, downgrade attacks. We exclude discussion of TLS 1.3, since its negotiation mechanisms are quite different, mostly due to insights from these downgrade attacks. Let us begin with SSLv2, which is completely vulnerable to such attacks.
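Before examining SSLv2 in detail, the following toy sketch shows the generic negotiation pattern - intersecting the options offered by the client with those supported by the server - and why an unauthenticated offer list invites a downgrade attack. The cipher suite names follow the figures below, but the code is purely illustrative and is not SSL/TLS code.

```python
def negotiate(client_offers, server_supported):
    """Pick the first client-offered option that the server also supports."""
    for suite in client_offers:              # client_offers is ordered by preference
        if suite in server_supported:
            return suite
    return None

client = ["RC4_128_MD5", "RC4_40_MD5"]       # strong suite first, weak 'export' suite last
server = ["RC4_128_MD5", "RC4_40_MD5"]

print(negotiate(client, server))             # -> RC4_128_MD5

# A MitM that can edit the (unauthenticated) offer list simply deletes the strong suites:
tampered = [s for s in client if s != "RC4_128_MD5"]
print(negotiate(tampered, server))           # -> RC4_40_MD5: a downgrade
```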
7.5.1 SSLv2 cipher suite negotiation and downgrade attack

In §5.6.2 we presented the crypto-agility principle (Principle 11), i.e., allowing flexibility, replacement and upgrade of the cryptographic mechanisms. We also discussed how the GSM support for crypto-agility is vulnerable to a downgrade attack. How about SSLv2?

Figure 7.16 illustrates how SSLv2 also supports crypto-agility, i.e., the SSLv2 cipher suite negotiation mechanism. Figure 7.17 gives an example of the negotiation process, when the client supports three cipher suites, all using the MD5 hashing algorithm, but with three different ciphers and key lengths (128-bit keys or 40-bit keys for RC4, and 56-bit keys with DES), and the server supports two ciphers (128-bit RC4 and 40-bit RC4). The 40-bit keys are obviously insecure, but until about 2000, these were the only keys allowed for products sold or distributed outside of the USA, due to USA export controls.

Figure 7.17: Example of SSLv2 cipher suite negotiation. In this example, the client offers three cipher suites, and the server supports two of these. The negotiation was changed from SSLv3.

Figure 7.18: Cipher suite downgrade attack on SSLv2. Server and client end up using a master key kM with only 40 secret bits, which the attacker can find by exhaustive search. The attacker does not need to find the key during the handshake; the parties use the 40-bit key for the entire connection, so the attacker may even just record ciphertexts and decrypt later. Note that while SSLv2 is not used anymore, we later discuss a version downgrade attack that may trick the server and/or client into using SSLv2, exposing them to this (and other) attacks on SSLv2.

In SSLv2, the finish messages only confirm that the parties share the same server and client keys (KS and KC, respectively), but not the integrity of the rest of the hello messages - in particular, there is no authentication of the cipher suites sent by server and client. This allows simple downgrade attacks, removing ‘strong’ ciphers from the list of ciphers supported by client and/or server. Figure 7.18 illustrates how a Man-in-the-Middle (MitM) attacker may perform this downgrade attack on SSLv2; in the example illustrated, the attacker removes the ‘regular-version’ 128-bit RC4 encryption from the list of ciphers supported by the client, leaving only the weaker ‘export-version’ 40-bit RC4 encryption. Indeed, the SSLv2 downgrade attack is even simpler and easier to deploy, compared to the GSM downgrade attack (§5.6.3).

7.5.2 Handshake Integrity Against Cipher Suite Downgrade

From SSLv3, the cipher suite negotiation mechanism was improved.
The most important change is the adoption of an important, simple defense of the handshake integrity, which prevents the cipher suite downgrade attack of Figure 7.18. Two other changes are (1) extended cipher-suites that specify also the key-exchange mechanism (SSLv2 cipher suites specified only the symmetric cryptography), and (2) the server chooses the cipher-suite (in Server hello, i.e., the second flow), instead of the client (in the third flow).

‘Finished:’ handshake integrity foils cipher suite downgrade. Beginning with SSLv3, the handshake protocol includes a simple mechanism for validating the integrity of all handshake messages. The client and server authenticate the entire handshake, using the master key derived for that connection. In particular, if the cipher suite downgrade attack of Figure 7.18 were launched against SSLv3 or TLS, then the server would detect the attack, and abort the connection, upon receiving the Client Finished: message.

Specifically, as can be seen, e.g., in Figure 7.10, both client and server send, in their respective finished message, a validation value, ensuring the integrity of all previously exchanged messages in that handshake. Upon receiving the finished message from the peer (server or client, respectively), the value is checked and, if incorrect, the handshake is aborted. Similarly to the key-derivation process (§7.4.1), the details slightly differ among the different versions, and we present a slight simplification, consistent with the one we used in §7.4.1. Both validation values are computed following the ‘hash-then-authenticate’ paradigm, using a hash function h, assumed to be collision resistant, and a pseudorandom function PRF. The validation sent with the Client finished message, which we denote vC, is computed as:

vC = PRF_kM(‘client finished:’ ++ h(handshake-messages))    (7.24)

Similarly, the validation sent with the Server finished message, which we denote vS, is computed as:

vS = PRF_kM(‘server finished:’ ++ h(handshake-messages))    (7.25)

Note the similarity to the derivation of the key-block (Equation 7.22).

Security Analysis of Finished Validation. Let us give an intuitive explanation of the security provided by the Finished validation mechanism, from SSLv3, focusing on the prevention of cipher suite downgrade attacks. For simplicity, assume that the client and server both support a secure shared cipher suite X, which they prefer over a vulnerable cipher suite V. Also, focus on the (typical) handshake, with server authentication but without client authentication. In this case, the protocol should ensure that if the handshake completes successfully at a client, then it also completed successfully at the server, and the two parties use cipher suite X (or a more preferred secure cipher suite).

Consider an execution in which the client completes the handshake successfully. In such an execution, the client must have received a valid Server finished message, i.e., the client’s value of vS, computed as in Equation 7.25, is the same as the value received from the server, which we denote vS'. Basically, the security follows from the use of ‘hash-then-authenticate’. Let us elaborate a bit, mainly to highlight the assumptions involved. The first assumption is that the master key kM cannot be exposed by an attacker sending a (manipulated or fabricated) Server finished message.
Namely, kM is pseudorandom and known only to the client and to the intended server. Note that this assumption, and the other assumptions identified below, should hold for any supported cipher suite. Also, recall that, from SSLv3, the cipher suite specified both the symmetric-encryption mechanisms and the key exchange mechanisms. Hence, all key exchange mechanisms supported by the client must be secure, at least to the extent of preventing exposure of kM during a successful handshake.

The basic argument for security is that when the client receives a valid Server finished message, with vS = vS', then the server must have previously sent that message, and client and server must have seen identical handshake messages. This argument is based on the assumption that kM is pseudorandom, and on the use of hash-then-authenticate in the computation of both Finished messages. Hence, our analysis assumes that the PRF function is a secure pseudorandom function, and that the hash function h is a collision-resistant hash function. The observant reader will notice that the second ‘assumption’ is not really an assumption as much as a simplification, since h is a keyless hash and there are no keyless collision resistant hash functions (subsection 3.2.2). This is one of multiple reasons that imply that we cannot hope to provide a full, reduction-based proof for the security of TLS - at least, until version 1.3.

The rest of the security argument follows. A PRF provides message authentication (subsection 4.5.1), when used with a pseudorandom key (in this case, kM). Hence, when the client receives vS', this implies that the server previously computed vS' = PRF_kM(‘server finished:’ ++ h(handshake-messages)), providing as input (‘server finished:’ ++ h(handshake-messages)), and then sent vS'. Since the client computed the same value (vS = vS'), it follows that the inputs to the PRF provided by client and server were the same. From the collision-resistance of h, the inputs to the hash were also the same. It follows that both parties have seen identical handshake messages. In particular, they must have seen the same cipher suite negotiation messages, and hence, cipher suite downgrade attacks are impossible from SSLv3.

Details of the hash function h. The computation of the validation values vC, vS involves, in both equations, a cryptographic hash function h, whose definition differs between the different versions. Specifically, in TLS 1.2, the hash function is implemented simply as SHA-256, i.e., h(m) = SHA_256(m). The TLS 1.0 and 1.1 design is more elaborate, and follows the ‘robust combiner for MAC’ design of §4.6.2; specifically, the hash is computed by concatenating the results of two cryptographic hash functions, MD5 and SHA1, as: h(m) = MD5(m) ++ SHA1(m). SSLv3 also similarly combines MD5 and SHA1; however, in SSLv3 the combination is in the computation of PRF itself, and fails to ensure a robust combiner; details omitted.

7.5.3 Finished Fails: the Logjam and FREAK cipher suite downgrade attacks

In spite of the above security analysis of the Finished message validation, two cipher suite downgrade attacks have circumvented the defense: FREAK and Logjam.
Both attacks exploit the fact that, due to USA export regulations until around 2000, SSLv3 and TLS 1.0 support several cipher suites which use short, vulnerable keys; these cipher suites were supposed to be used (only) for exported versions of TLS, i.e., versions distributed outside the USA.

The FREAK attack [56] (FREAK stands for Factoring RSA Export Keys - with an added ‘A’ for fun, I guess) exploits the RSA_EXPORT cipher suite, which uses a weak, 512-bit modulus N, which can be factored in a few hours by affordable hardware. The attack was effective on popular implementations of TLS; however, it exploited a subtle bug in these implementations, which caused the client to accept the weak (512-bit) key although the client was using the strong (non-export) RSA cipher suite.

Similarly, the Logjam attack [7] exploits an exportable version of the Diffie-Hellman Ephemeral (DHE) key exchange (Figure 7.12), which uses 512-bit groups. While the specific exponents are (correctly) chosen randomly in each run, many implementations use the same groups. However, for a given group, it is possible to perform a precomputation step, following which different discrete logs (with the same modulus) can be computed with acceptable, if still significant, computational costs. If the attack can be carried out in real time, the MitM attacker may be able to find the pre-master key derived by the client, and hence forge a valid Server Finished message, thereby successfully impersonating the legitimate server. See Figure 7.19.

Figure 7.19: The Logjam cipher suite downgrade attack against servers supporting the exportable (weak keys) version of the DH Ephemeral (DHE) key exchange. The attack works for the (surprisingly common) case of a known DH group (p, g), allowing the attacker to precompute the most computationally challenging part of the discrete log computation and use it for the different exponents and handshakes. The MitM attack begins once this precomputation is done. The attacker forces TLS clients to use the export-strength Diffie-Hellman Ephemeral (DHE) key exchange (the DHE_Export cipher suite). The attacker modifies Client Hello to request DHE_Export from the server, and modifies Server Hello to appear as if the server uses regular DHE. The client does not detect that the Diffie-Hellman values (p, g, g^b) correspond to the export-version of DH, and continues the handshake, sending g^a mod p and then Client Finished. The attacker now uses the precomputed values to compute b ← DiscLog(g^b mod p), allowing it to find the master key kM and complete the handshake.

What can we learn from these attacks?
One important lesson is yet another example of the risks due to the use of insufficiently-secure cryptography; vulnerabilities die hard, and can often be abused even years after most systems adopted more secure solutions. Another lesson is the risk of relying only on intuitive reasoning, as presented above, for the security of systems. Intuition is useful to identify some attacks and to create initial designs, but cryptographic vulnerabilities can be subtle, and a precise, in-depth security analysis is critical.

7.5.4 Backward compatibility and protocol version negotiation

SSL was an immediate success; it was widely deployed soon after it was released (as SSLv2). Hence, when introducing SSLv3, designers had to seriously consider backward compatibility, namely, allowing a client/server running SSLv3 to interact with a server/client, respectively, running SSLv2. Note that SSLv2 already includes a version number in the client hello and server hello messages, although it did not include a protocol version negotiation mechanism. With the definition of (different versions of) TLS, in parallel to the further proliferation of web devices, the need for backward compatibility only grew stronger; basically, a new version has almost no chance of adoption without backward compatibility with earlier versions.

TLS protocol version negotiation. The Client Hello messages of TLS 1.0-1.2 and SSLv3 are very similar, and the Client Hello and Server Hello messages include protocol-version identification. This allows a simple and efficient version negotiation mechanism for implementations of TLS 1.0-1.2, allowing downgrade to the least-updated among the versions of the client and the server, from SSLv3 to TLS 1.2. This TLS version negotiation mechanism works as follows. When a TLS server receives a Client Hello indicating an older version, it simply continues the negotiation using this older version of the protocol. Similarly, when a TLS server receives a Client Hello indicating a protocol version newer than the one supported by the server, the server continues the negotiation using its own (older) version of the protocol. This version negotiation mechanism allows TLS servers running any version of TLS, or running SSLv3, to interact with clients running any version of TLS or SSLv3; only SSLv2 is incompatible. In all cases, the clients detect the lowest-common version, which is to be used, from the Server Hello message, and continue the protocol using that version. Importantly, the same handshake continues, and therefore the integrity mechanism of the Finished messages validates that the downgrade was, indeed, selected by the server.

Vulnerability of TLS version negotiation. The TLS version negotiation mechanism seems, intuitively, secure; however, is it really? As in many cases, the intuition here can be misleading. Specifically, [341] showed how the use of an optimized version of the Bleichenbacher attack succeeds in decrypting the pre-master key, in a small fraction of the attempts. By trying to open a sufficient number of sessions, the attack succeeds against many TLS implementations. Furthermore, while TLS 1.3 has an improved version-negotiation mechanism (subsection 7.6.1), and does not even rely on RSA encryption, this vulnerability may allow the attack of [341] to downgrade from TLS 1.3 to a lower, vulnerable version.

The SSL version negotiation mechanism and SSLv3 version downgrade attack.
The above TLS protocol version negotiation mechanism does not support SSLv2, since SSLv2 uses a different client-hello format. The SSLv3 speciőcation [155] speciőed a different version negotiation mechanism, which allows SSLv3 clients to interact with SSLv2 servers. The SSL version negotiation mechanism works by the following simple downgrade dance: an SSLv3 client őrst sends the SSLv3 Client Hello message; but if a valid Server Hello is not received, it sends the SSLv2 Client Hello. This allows interoperability of an SSLv3 client with an SSLv2 server. The other case, of SSLv2 client and SSLv3 server, is handled like the TLS mechanism, i.e., the SSLv3 server detects receipt of SSLv2 Client Hello and continues the handshake using SSLv2. It is easy to see, that the SSL version negotiation (downgrade dance) mechanism is vulnerable to a downgrade attack. Basically, a MitM attacker drops the SSLv3 Client Hello message, thereby causing the client to connect using SSLv2. See the following exercise. The designers were aware of this risk, and the SSLv3 standard notes that this method for backward compatibility will be ‘phased out with all due haste6 ’. Exercise 7.3 (The SSLv3 version downgrade attack). Show message sequence diagram for a MitM version downgrade attack, tricking an SSLv3 server and an SSLv3 client who sends SSLv2-format client-hello (for backward compatibility), into completing the handshake using SSLv2 and using a weak (40-bit) cipher. Hint: see this attack (referred to as ‘version rollback attack’) in [384]. Kocher’s ad-hoc defense against downgrade to SSLv2. In practice, interactions between most TLS and SSLv3 servers and clients are protected from the protocol downgrade attack of Exercise 7.3, by an ingenious ad-hoc defense, designed by Paul Kocher. These clients signal their support of SSLv3, by encoding a ‘signal’ of that in the padding used in the RSA encryption. For details on this and other issues related to MitM downgrade attacks (also referred to as version rollback), see [384] and appendix E.2 of the TLS 1.0 speciőcations, RFC2246 [120]. 6 SSLv3 was released in 1996, and never modified to remove this method of backward compatibility. This allowed downgrading even of TLS 1.2 to SSLv2, until March 2011, when TLS versions were redefined [375], removing the ability to downgrade to SSL. That’s haste for IETF, apparently. Applied Introduction to Cryptography and Cybersecurity 7.5. NEGOTIATIONS AND DOWNGRADE ATTACKS (SSL TO TLS 1.2)453 7.5.5 The TLS Downgrade Dance and the Poodle Version Downgrade Attack The TLS backward compatibility mechanism (subsection 7.5.4) requires support by the server: the server should respond correctly to Client Hello using newer versions, indicating to the client the need to move to the older version. This is not trivial; while Client Hello messages use basically the same design, there are several additions from SSLv3 onwards, including the important TLS extensions mechanism (from TLS 1.1). While the additions have been carefully designed to be ignored by implementations following correctly and precisely the speciőcations of the previous versions, many lower-version implementations still fail to process them correctly, or for some other reason, do not continue with the protocol using their (lower-version) protocol. Unfortunately, there are many older version servers which fail, in this way, to support TLS version negotiation. 
Many clients try to work with such servers anyway, by the following process, which we call the TLS downgrade dance: try őrst to connect using the latest version, but if receiving no response (or error), try with older versions. The reader will notice that this downgrade dance is basically an extension of the SSL downgrade dance. Unfortunately, it is also vulnerable to a downgrade attack. An attacker can simply block connection attempts (or send back a fake error message), causing the client to use an older, vulnerable version of the protocol; see exercise below. Exercise 7.4 (The Poodle version downgrade attack). Consider client that supports the downgrade dance described above. Namely, the client first tries to connect using TLS 1.2; if that fails, it tries to connect using TLS 1.1; and if that also fails, it tries to connect using TLS 1.0. Present a message sequence diagram for a MitM attack, which tricks this client into using TLS 1.0, even when the server it tries to connect with supports TLS 1.2. Although the TLS downgrade dance does not follow the TLS speciőcations, it is supported by many TLS clients. This provides a very effective downgrade attack from TLS versions 1.0-1.2 to lower versions, including SSLv3, and sometimes even SSLv2. The ability to perform this downgrade attack was őrst demonstrated in the Poodle attack [290], and therefore we refer to it as the Poodle version downgrade attack. Combined with the Poodle padding attack (subsection 7.2.3), this allowed Poodle to be successfully exploited against most web servers implementing TLS 1.0-1.2. One way to avoid the version downgrade attack is to avoid the downgrade dance. However, evidently, this may cause loss of backward compatibility with many servers (e.g., websites), a price that client developers may not be willing to pay. We next present the SCSV signaling mechanism, which allows secure use of the TLS downgrade dance. Applied Introduction to Cryptography and Cybersecurity 454 7.5.6 CHAPTER 7. TLS PROTOCOLS: WEB-SECURITY AND BEYOND Securing the TLS downgrade dance: the SCSV cipher suite and beyond Exercise 7.4 shows a potential vulnerability for a common case, where clients use ‘downgrade dance’ to ensure backward compatibility with servers supporting older (lower) versions of the TLS protocol. How can we mitigate this risk, while still allowing clients to use the TLS downgrade dance, in order to to interact with servers running older versions, that do not support the TLS negotiation mechanism? A standard solution is the Signaling Cipher Suite Value (SCSV) cipher suite, speciőed in RFC 7507 [288]. Clients that support SCSV, őrst try to connect to the server using their current TLS version - no change from clients not supporting SCSV. The difference is only when this initial connection fails, and the client decides to try the ‘downgrade dance’, to support connections with servers supporting (only) older versions of TLS. In these ‘downgrade dance’ handshakes, the client adds a special ‘cipher suite’ to its list of supported cipher suites, sent as part of the ClientHello message. The special ‘cipher suite’ is called TLS_FALLBACK_SCSV, and is encoded by a speciőc string. Unlike the original (and main) goal of the cipher suites őeld, the SCSV is not an indication of cryptographic algorithms supported by the client. 
Instead, the existence of SCSV indicates to the server that this handshake message is sent as part of a downgrade dance by the client, i.e., that the client supports a higher version than the one specified in the current handshake. If the server receives such a handshake, and itself supports a higher version of the protocol, this would indicate an error or attack, as this client and server should use the higher version. Therefore, in this case, the server responds with an appropriate indication to the client.

This use of the cipher suites field for signaling the downgrade dance is a ‘hack’ - it is not the intended, typical use of this field. A ‘cleaner’ alternative would be to achieve similar signaling using a dedicated extension mechanism; later in this section, we describe the TLS extension mechanism, which is used for this purpose in TLS 1.3. We believe that the reason that SCSV was defined using this ‘hack’ (encoding of a non-existent cipher suite), rather than using an appropriate TLS extension, was the desire to support the downgrade dance to older versions and implementations of TLS that do not support TLS extensions.

7.5.7 The SSL-Stripping Attack and the HSTS Defense

An even more extreme downgrade attack is to trick the client into using an insecure connection, i.e., not to use TLS at all, although the server supports secure (TLS) connections. The attack is designed against the use of TLS to protect web communication. Browsers connect to websites using the protocol specified in the URL, typically either HTTP (unprotected) or HTTPS (TLS protected). The URL is often from a previously-received webpage. If that previously-received webpage is unprotected, then the hyperlink may be modified by a MitM attacker; specifically, changing from a URL specifying the protected HTTPS to a URL specifying the unprotected HTTP. Browsers indicate the protocol used (HTTP or HTTPS) to the user, but many or most users are unlikely to notice a downgrade (from HTTPS to HTTP). This attack is referred to as SSL-Stripping, and was first presented by Marlinspike [277].

Figure 7.20: The SSL-Stripping MitM attack on a TLS connection. The attack replaces https hyperlinks with http hyperlinks (second flow). If the user does not notice and enters their password, then the attacker obtains the password. The attacker may continue with the connection, using a secure connection to the server, to obtain more sensitive information and to reduce the likelihood of the user detecting the attack.

The SSL-Stripping attack is illustrated in Figure 7.20. The attack works by replacing https hyperlinks, sent over insecure connections, with http hyperlinks (to the same resource, except for the change from https to http). The user often would not notice that the web-page is delivered over http, and would send sensitive information such as a password. Of course, the attack only works if the https hyperlink is sent over an insecure connection.
Therefore, the attack surface is minimized by securing more web-pages, especially search engines. Applied Introduction to Cryptography and Cybersecurity 456 CHAPTER 7. TLS PROTOCOLS: WEB-SECURITY AND BEYOND A good defense against SSL-Stripping and similar attacks is for the browsers to detect or prevent HTTP hyperlinks to a website which always uses (or offers) HTTPS connections. The standard mechanism to ensure that is the HSTS (HTTP Strict Transport Security) policy, deőned in RFC 6797 [203]. The HSTS policy indicates that a particular domain name (of a web server), should be used only via HTTPS (secure) connections, and not via unprotected (HTTP) connections. HSTS is sent as an HTTP header őeld (Strict-Transport-Security), in an HTTP response sent by the web server to the client. The HSTS policy speciőes that the speciőc domain name, and optionally also subdomains, should always be connected using HTTPS, i.e., a secure connection. Speciőcally, 1. The browser should only use secure connections to the server; in particular, if the browser receives a request to connect to the server using the (unprotected) HTTP protocol, the browser would, instead, connect to the server using the HTTPS (protected) protocol, i.e., using HTTP over TLS. 2. The browser should terminate any secure transport connection attempts upon any secure transport errors or warnings, e.g., a warning about the use of invalid certiőcate. The HSTS policy is designed to prevent attacks by a MitM attacker, hence, the HSTS policy itself must be protected - and, in particular, the attacker should not be able to ‘drop’ it. For this reason, HSTS policy must be known to the browser before it connects to the server. This may be achieved in two ways: Caching - max-age: The HSTS header őeld has a parameter called max-age, which deőnes a period of time, speciőed in seconds, during which the browser should ‘remember’ the HSTS directive, i.e., keep it in cache. Any connection within this time, would be protected by HSTS. Namely, suppose that at time T , a browser receives an HTTP response containing the HSTS header with max − age = m from site example.com, over a secure connection (i.e., using TLS). Assume that later, but before time T +m, the browser again is directed to request an object from example.com; then the browser will open the link to example.com only over a secure connection, i.e., using TLS. This motivates the use of a large value for max-age; however, notice that if a domain must move back to HTTP for some reason, or there are failures in the secure connection attempts for some reason, e.g., expired certiőcate, then the site may be unreachable for max-age seconds. Pre-loaded HSTS policy: The browser maintains a list of HSTS domains which are preloaded, i.e., do not require a previous visit to the site by this browser. This avoids the risk of a browser accessing an HSTS-using website but without a cached HSTS policy. However, this requires the browser to be preloaded with the HSTS policy - a burden on the site and on the browser, and some overhead for this communication. An optional Applied Introduction to Cryptography and Cybersecurity 7.5. NEGOTIATIONS AND DOWNGRADE ATTACKS (SSL TO TLS 1.2)457 parameter of the HSTS header, instructs search engines to add the site to the HSTS preload list of related browsers. This is used by Google to maintain the pre-loaded HSTS list of the Chrome browser. 
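A minimal sketch of the browser-side HSTS logic described above - upgrading http to https for hosts with a preloaded or cached (unexpired) policy; the includeSubDomains option and error handling are omitted, and all names are illustrative.

```python
import time

HSTS_PRELOAD = {"example.com"}          # shipped with the browser
hsts_cache: dict = {}                   # host -> expiry time (seconds since epoch)

def record_hsts_header(host: str, max_age: int) -> None:
    """Called when an HSTS header arrives over a *secure* connection."""
    hsts_cache[host] = time.time() + max_age

def effective_scheme(host: str, requested_scheme: str) -> str:
    """Upgrade http -> https if an HSTS policy applies to this host."""
    if host in HSTS_PRELOAD or hsts_cache.get(host, 0) > time.time():
        return "https"
    return requested_scheme

# Example: after the server once sent Strict-Transport-Security: max-age=31536000,
# any plain-http link to it is silently upgraded for the next year.
record_hsts_header("bob.com", 31536000)
assert effective_scheme("bob.com", "http") == "https"
```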
7.5.8 Three Principles: Secure Extensibility, KISS and Minimize Attack Surface

The different downgrade attacks show the importance and the challenge of secure backward-compatible upgrades. Backward compatibility is essential, to motivate adoption of new versions of protocols. For example, without backward compatibility, a web-server using SSLv2 is unlikely to upgrade to SSLv3 until most clients have upgraded to SSLv3; and clients would not upgrade to SSLv3 until there are many web-servers they can interact with using SSLv3 - a chicken and egg problem. We conclude that backward compatibility is essential for successful upgrade. On the other hand, we see that backward compatibility mechanisms may introduce vulnerabilities that can be exploited for downgrade attacks. It is therefore necessary to ensure a secure extension mechanism allowing backward compatibility. In particular, every practical security protocol should support a secure version negotiation mechanism. We conclude the principle of secure extensibility by design, which requires built-in secure mechanisms for extensions and backward compatibility. This is another important design principle for secure systems and protocols, cryptographic or otherwise.

Principle 13 (Secure extensibility by design). When designing security systems or protocols, one goal must be to build in secure mechanisms for extensions, downward compatible versions, and negotiation of options and cryptographic algorithms (cipher suite negotiation).

However, backward compatibility, like other options and extensions, increases the attack surface and makes the system more complex, both of which imply greater risk of vulnerabilities. Let us discuss these two issues. First, flexibility brings complexity, and vulnerabilities lurk in complexity: the simpler the system, the easier it is to protect. This is the important KISS principle. (The KISS principle originates from the US Navy, where it meant ‘Keep It Simple, Stupid’. We changed it a bit, to Keep It Simple and Secure.)

Principle 14 (The KISS Principle). Keep It Simple and Secure. The simpler a system is, the easier it is to protect; for better security, minimize complexity, options, flexibility and extensibility.

Now to the attack surface, which is based on the intuitively-defined notion of an attack vector. An attack vector is an element of the system which may have a flaw that can be exploited by an attacker to ‘break’ the system; attack vectors may be defined as functions, classes, lines of code, cryptographic functions, API calls, software/hardware modules or protocol variants. The attack surface is a rough measure of the number or extent of the attack vectors in a given system. In a simplified case, let n denote the attack surface, corresponding to n attack vectors, each with probability pv of a discovered vulnerability. The probability that the overall system will be secure, i.e., not have any discovered vulnerability, is only (1 − pv)^n. Hence, we need to minimize the ‘attack surface’ n, as well as to minimize the probability of vulnerability, pv.

Principle 15 (Minimize the attack surface). Systems should be designed to minimize their attack surface, i.e., minimize the number of their attack vectors. Roughly, the probability of vulnerability of a system is exponential in the number of attack vectors.
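For a quick numerical illustration of Principle 15, with arbitrary, made-up values of pv and n:

```python
# Probability that a system with n attack vectors has no discovered vulnerability,
# assuming each vector independently has a vulnerability with probability p_v.
def prob_secure(n: int, p_v: float) -> float:
    return (1 - p_v) ** n

for n in (10, 100, 1000):
    print(n, f"{prob_secure(n, p_v=0.01):.5f}")
# n=10 -> 0.90438, n=100 -> 0.36603, n=1000 -> 0.00004:
# even a small per-vector risk makes a large attack surface almost certainly vulnerable.
```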
DROWN and other Cross-Protocol Attacks (exploiting lack of key separation). Modern protocols, like (newer versions of) TLS, are adopting the extensibility-by-design principle and support secure extensions and backward compatible versions. However, there is another important element of extensibility that is often neglected: version-based key separation (Principle 10). Namely, suppose the same key - in particular, a public-private key pair - is used by both a vulnerable protocol and a secure protocol. Then, it may be possible to expose the key by running the vulnerable protocol, and exploit this to attack the system also when using the secure protocol.

The DROWN attack [18] is an important example of a cross-protocol attack, due to the lack of version-based key separation. A significant number of web servers were found to support SSLv2 and, furthermore, to use the same key-pair as they use for the improved-security TLS handshake. This makes these servers vulnerable to an improved Bleichenbacher attack presented in [18], which allows the attacker to perform operations using the RSA private key.

7.6 The TLS 1.3 Handshake: Improved Security and Performance

In this section, we (finally) discuss the handshake protocol of TLS 1.3 [329], the current version of TLS. The TLS 1.3 handshake protocol, like the record protocol (Figure 7.7), is a major re-design, providing significant improvements in performance and security compared to the handshake protocol of earlier versions. The main goals of these changes were to improve security and to improve performance. Another goal was to simplify - a goal in its own right, but also reducing the risk of vulnerabilities, following the KISS principle (Principle 14). However, while some significant simplifications were made, we cannot deny that TLS 1.3 introduces its own complexities; arguably, these complexities can be justified by the security benefits they provide to different scenarios. Our description makes some simplifications, in what seem to be more technical details.

Consistency with previous versions was not one of the main goals. Indeed, the TLS 1.3 handshake protocol, and especially the key derivation mechanisms, differ considerably from previous versions, even in the terminology used. For example, the term premaster key is not used. This, unfortunately, adds another challenge to the reader familiar with the previous versions.

TLS 1.3 Security Improvements. The security improvements of the TLS 1.3 handshake include:

• TLS 1.3 does not support the (previously widely used) RSA-based key exchange. This avoids many attacks on RSA implementations, mainly variants of Bleichenbacher’s attack, such as ROBOT [73]. Note, however, that TLS 1.3 still allows the use of RSA signatures for authentication of the handshake flows, including of the public DH values. The signature can be vulnerable to a cross-protocol attack, if (incorrectly) using the same RSA private key for the TLS 1.3 CertificateVerify signature as the key used by insecure protocols such as earlier TLS versions; see subsection 7.6.6.

• The TLS 1.3 handshake always uses DH Ephemeral (DHE) key exchange to provide a fresh secret shared key in every exchange, using either finite-field or elliptic-curve Diffie-Hellman. This ensures perfect forward secrecy (PFS).8 The use of this single key-exchange mechanism also simplifies the handshake.

• TLS 1.3 disallows the use of other cipher suites with known weaknesses, most notably, those using ‘export-grade’ cryptography.
This foils cipher suite downgrade attacks which exploits the use of weak, ‘export-grade’ cryptography, such as the LOGJAM attack [7], exploiting support for 512 bit DH groups, and the FREAK attack [56], exploiting the use of ephemeral 512 bit RSA private keys. See subsection 7.5.3. • Previous versions of TLS allowed the server to specify an arbitrary DiffieHellman group, by sending the modulus p and generator g. This allows different servers to select different groups, which may help to foil discrete-logarithm precomputation attacks such as in the Logjam attack (subsection 7.5.3). However, as in many cases, flexibility resulted in vulnerability; many servers used the same groups - and often, using weak groups. TLS 1.3 uses standard őnite groups or elliptic curves; these should correspond to carefully chosen groups and curves, properly deőned, e.g., in [162, 251]. This use of speciőc, studied groups/curves follows the cryptographic building blocks principle (Principle 8). • In the TLS 1.3 Server_Key_Exchange message, the servers sign the entire handshake, not just the DHE parameters and the client and server random numbers (rC and rS ) as in previous versions; compare to Figure 7.12. In particular, when the client receives and validates this signed 8 However, TLS 1.3 does not ensure perfect recover security (PRS), since it relies on the secrecy of the server’s private key, for signing the exchange; see Exercise 7.16. The reliance on a fixed private key was exploited by the few known attacks against the TLS 1.3 handshake protocol. Applied Introduction to Cryptography and Cybersecurity 460 CHAPTER 7. TLS PROTOCOLS: WEB-SECURITY AND BEYOND Server_Key_Exchange, it conőrms that the server received correctly the supported-versions, cipher suites and supported-groups information sent in Client_Hello. This defends against downgrade attacks (Section 7.5). • For signatures and hashing, TLS 1.3 forbids the use of the vulnerable RSA PKCS#1 v1.5 and SHA-1 algorithms. Secure alternatives are speciőed instead: PSS padding for signatures and for hashing, SHA-256, SHA-384 or SHA-512. See PKCS#1 v2.2 (RFC 8017), RFC 5756 and RFC4055 [292, 347, 374]. • TLS 1.3 removed support for handshake renegotiation. Renegotiation added complexity, vulnerabilities (e.g., [57, 331]) and attack surface. TLS 1.3 provides an alternative mechanism to support the main use case for renegotiation, which is, invoking client authentication only ‘as-needed’, after the handshake completes. • TLS 1.3 design prefers cryptographic designs which are amenable to proofs of security, and hence their security seems more well established. This is done without requiring an explicit attack exploiting the previous, intuitive or less-established design. The improved, and arguably more complex, key derivation process is a good example (subsection 7.6.5). • TLS 1.3 speciőes the use of pre-shared keys (PSKs). This single mechanism replaces multiple mechanisms in previous versions: different PSK-based cipher suites, with and without DH [21, 22, 141] as well as session resumption, both ID-based and session-ticket based (subsection 7.4.4). This simplicity helps ensure security (Principle 14). • TLS 1.3 protects őelds and messages as soon as the necessary keys are established. In particular, it protects the extensions sent in the Server_Hello and later messages, and it protects the Certiőcate (sent after Server_Hello). 
One benefit from protecting the certificate is improved privacy for users against an eavesdropper, who could have used the certificate to identify the website used (even if the website cannot be identified from the addressing information, e.g., when using an anonymity-providing proxy). Note that exactly this feature may cause concern to network administrators who may have relied on the visibility of the certificate to prevent communication with undesired websites.

TLS 1.3 Performance Improvement: reduced latency overhead (fewer round trips). The performance improvement of the TLS 1.3 handshake is mostly due to reduced handshake latency, which results mainly from the reduction in the number of round-trips. Most applications use the request-response communication pattern, where the client sends a request to the server, which sends back a response. The handshake latency is the time from when the client application initiated the connection (and transferred the request to the client’s TLS module), until the server receives the request.

Modern networks use fast transmission rates; hence the latency is mostly due to propagation and queuing delays. Namely, as explained in Fact 5.1 and illustrated in Table 5.2, the dominant factors in determining the delay are the number of round-trips and the round-trip delay; we can mostly ignore the transmission time of the packets. This is especially true since the handshake packets, and most requests, are quite short. This means that it doesn’t matter much if we send somewhat longer messages or multiple messages consecutively, without waiting for a response. The latency is almost entirely determined by the delay due to waiting for a response before sending the next message.

Clearly, request-response interaction requires at least one round trip: sending the request to the server and receiving the response. Therefore, the ‘base’ latency for request-response interaction is one round-trip time (RTT); this is required even without security. TLS 1.3 aims to minimize the latency overhead, which basically means minimizing the number of additional round trips required by the handshake before the request can be sent. Previous versions of TLS required two round trips (Figs. 7.3 and 7.8); TLS 1.3 requires only one round trip. Furthermore, TLS 1.3 supports a zero-round-trip handshake, where one client request, containing some application data, can be sent as part of the initial flow from the client; see subsection 7.6.4.

Our presentation of the TLS 1.3 Handshake protocol. In the following subsections, we present a simplified overview of the TLS 1.3 Handshake protocol. We cover the most important aspects of the protocol. In subsection 7.6.1, we discuss the TLS 1.3 negotiation and backward compatibility mechanisms. In subsection 7.6.2 we discuss the TLS 1.3 1-RTT (‘full’) Diffie-Hellman handshake. In subsection 7.6.3 we discuss the Pre-Shared Key (PSK) handshake, used to support both out-of-band shared keys and session resumption. In subsection 7.6.4 we discuss the zero-RTT handshake, which avoids entirely the delay-overhead of earlier versions of TLS and even of the ‘full’ TLS 1.3 handshake. In subsection 7.6.5, we discuss the key derivation process of the TLS 1.3 handshake protocol. Finally, in subsection 7.6.6 we discuss cross-protocol attacks, which are the only known attacks which exploit a vulnerability that exists in the TLS 1.3 specifications.
Overall, we tried to cover the most important aspects of the TLS 1.3 handshake mechanism. However, we had to make some simplifications and omissions. As one important example, see Exercise 7.19, which discusses the risk of Denial of Service (DoS) attacks on TLS servers, and two defenses against them, including the use of the Cookie extension. The Cookie extension can also be used to off-load state from the TLS server to the client [329].

7.6.1 TLS 1.3: Negotiation and Backward Compatibility

In Section 7.5, we discussed the cipher suite and version negotiation mechanisms of previous versions of TLS. The redesign of TLS 1.3 includes a significant change in these negotiation mechanisms. However, this redesign was done carefully to ensure backward compatibility with earlier versions of TLS.

Cipher suite negotiation. In previous versions of TLS, as well as in SSLv3, the cipher suites defined three separate aspects: (1) the record-protocol ciphers, e.g., AES, (2) the key exchange mechanism (RSA, DH or pre-shared key), and (3) the signature algorithm. This resulted in an exponential explosion in the number of cipher suites, with unnecessary complexity (and room for error). TLS 1.3 separates the cipher suite negotiation into four distinct aspects:

Record protocol cipher suite: the AEAD algorithm and the hash function used for key derivation.

DH group and public share: the Diffie-Hellman group (or elliptic curve), and optionally, a key-share extension (Key_Share) which contains DH public values (key shares).

Signature algorithm: the signature algorithm used for server authentication, i.e., to sign the CertificateVerify message.

Pre-Shared Key: identifies pre-shared keys, and specifies whether such keys are to be used to authenticate a Diffie-Hellman key exchange, providing PFS, or to be used directly as the shared secret between the parties. See subsection 7.6.3.

TLS 1.3 version negotiation: the Supported_Versions extension. TLS included a version negotiation mechanism from early on; however, as we discussed in Section 7.5, many TLS servers did not implement it correctly, leading clients to adopt the insecure 'downgrade dance' and exposing them to the Poodle version downgrade attack. However, upon early, experimental deployment of TLS 1.3, it was discovered that many servers still do not implement this version negotiation mechanism. At first, it appeared that clients should be able to use the 'downgrade dance' securely, by using the SCSV extension (subsection 7.5.6), designed to prevent a MitM attacker from causing unnecessary downgrades. No such luck; it was soon realized that there are also many TLS 1.2 servers that do not support SCSV, which would have allowed a Poodle-like downgrade attack against TLS 1.3. The TLS 1.3 designers decided that the only secure solution is to change the version negotiation mechanism, in a way which is backward compatible with TLS 1.2 servers. Specifically:

1. TLS 1.3 uses a Client_Hello message which is compatible with TLS 1.2, including the version number. Namely, TLS 1.3 Client_Hello messages include the identifier of TLS 1.2, rather than that of TLS 1.3. TLS servers running version 1.2 or an earlier version should handle this correctly (if they implement version negotiation correctly). TLS 1.3 servers will manage, since they use the Supported_Versions extension, which provides a new version negotiation mechanism.
2. TLS 1.3 clients list the versions they support, in order of preference, in the Supported_Versions extension. Supported_Versions is a new mandatory extension, which should be supported by all new TLS servers and clients (running version 1.3 or future versions). If this extension is absent, the server should continue the handshake using TLS 1.2. If the extension is present, the server uses the 'best' version supported both by the client and by itself, and indicates it in the Supported_Versions extension sent back (with Server_Hello).

Let us pray that this new mechanism will be implemented correctly by TLS 1.3 servers, avoiding a similar predicament upon upgrading to TLS 1.4!

Backward compatible Client_Hello. As we explained, the Client_Hello message of TLS 1.3 must be backward compatible with TLS 1.2. In particular, the Client_Hello version field will indicate version 1.2, not 1.3; the Supported_Versions extension will indicate which versions are supported by the client (e.g., version 1.3 and some older versions). To retain backward compatibility of the Client_Hello with TLS 1.2, several fields contain 'legacy' values, used only by legacy (TLS 1.2 and lower) servers receiving the message, and ignored by TLS 1.3 (or newer) servers. Let us discuss each of these 'legacy' fields:

Version: As explained above, TLS 1.3 clients indicate here the version of TLS 1.2.

Session_ID: This field is used for any previously-cached Session_ID for that server (subsection 7.4.4).

Compression: This field is used in earlier versions to identify the compression method. In TLS 1.3, it should contain the indication for the 'null' compression method.

7.6.2 TLS 1.3 Full (1-RTT) DH Handshake

[Figure 7.21 (diagram): message flow. Client to Server: Client_Hello, with client random (rC), cipher suites, and extensions supported_versions, supported_groups, Key_Share, signature_algs, CAs, ... Server to Client: Server_Hello (server random rS, extensions including Key_Share), followed by Certificate, CertificateVerify and Server_Finished, protected with the server's handshake key. Client to Server: Client_Finished, protected with the client's handshake key, followed by application data.]

Figure 7.21: TLS 1.3 1-RTT full Diffie-Hellman handshake; see Figure 7.22 for the version with support for Pre-Shared Keys (PSK). The CertificateVerify message contains a signature over the entire handshake up to it: Client_Hello, Server_Hello and the Certificate. Server_Finished contains a MAC over handshake+, i.e., the entire handshake plus the CertificateVerify message itself. Client_Finished contains a MAC over handshake*, i.e., the entire handshake plus CertificateVerify and Server_Finished. We use k̂X{...} to denote AEAD protection using the handshake key of party X ∈ {C, S}, and kX[...] to denote AEAD protection using the application key of party X ∈ {C, S}.

We next present the TLS 1.3 1-RTT Diffie-Hellman handshake, illustrated in Figure 7.21. This is the typical initial TLS 1.3 handshake, always used by a client and server on their first connection, and optionally used in subsequent connections. In contrast to the 'basic handshake' of the previous sections (and versions), the TLS 1.3 full handshake always uses the DH protocol for key exchange; therefore, it always ensures PFS.
The handshake ensures server authentication using the server's signature over the initial handshake messages, sent in a message called CertificateVerify. This signature authenticates the server's DH component (g_i^{b_i} mod p_i). By including the initial handshake messages in the signed content, the signature also protects against downgrade attacks.

The TLS 1.3 full (1-RTT) handshake allows the client to send the request after a single round trip; that is why it is called a 1-RTT handshake. Namely, the server's (single) flow contains both the Server_Hello message (with the server's DH exponent, extensions, certificate and signature), and the Server_Finished message, which ensures the integrity of the exchange.

In order to allow the server to send the Finished message in its (single) flow, it has to receive all the necessary keying information from the client earlier, i.e., already in the Client_Hello message. Since TLS 1.3 always uses DH key exchange, this means that the client must send sufficient information for the server to determine the group (for finite-field DH) or curve (for elliptic-curve DH) to be used; furthermore, the client even needs to provide its DH key share (e.g., g^a mod p). We can provide all this, efficiently, in the Client_Hello, thanks to TLS 1.3's use of a limited number of standard DH finite groups and elliptic curves. The client indicates the DH groups (including elliptic curves) that it supports in the Supported_Groups extension, and its DH key share in the Key_Share extension. Both extensions must be sent as part of the TLS 1.3 Client_Hello message. There is a cost here in the overhead of computing and sending these values, but the savings in RTT are usually much more important. The client may provide a key share for only some of the groups; if there is no key share for the group preferred by the server, the server can send a special message, HelloRetryRequest, and the client will send a 'corrected' Client_Hello. (In this case, the handshake will require two RTTs.)

Three other mandatory extensions in the TLS 1.3 Client_Hello identify: (1) the supported_versions, (2) the signature_algorithms supported, and (3) the certificate_authorities (CAs) trusted by the client. By requiring the supported_versions extension, TLS 1.3 protects against version downgrades. The goal of the two other mandatory extensions is, apparently, to avoid incompatibilities, i.e., a situation where the server may use a signing algorithm not supported by the client, or a certificate from a CA not trusted by the client.

The Server_Hello message is quite similar to that of previous versions, except that it also provides the server's key share for the DH exchange, in an aptly-named extension. It is followed by the server's certificate and the TLS 1.3 CertificateVerify message, which authenticates the server and ensures the handshake integrity, by including a signature over the entire handshake (to this point). Finally, the server sends its (authenticated) Server_Finished message, again much like in previous versions (just earlier!). The final flow of the handshake contains the Client_Finished message, authenticating the entire handshake (until this point), much like in earlier versions. Both the client and server Finished messages may be followed immediately by application data sent by the respective parties, protected using the shared key.
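To make the Key_Share mechanism concrete, the following is a minimal sketch, not taken from the TLS specification, assuming the client offers (and the server selects) the x25519 group, and using the third-party Python cryptography package; the variable names are ours.

# Sketch: generating and using a Diffie-Hellman key share for the x25519 group.
# Assumes: pip install cryptography; the names below are illustrative only.
from cryptography.hazmat.primitives.asymmetric.x25519 import (
    X25519PrivateKey, X25519PublicKey)
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

# Client: ephemeral key pair; the raw public bytes go into the Key_Share extension.
client_priv = X25519PrivateKey.generate()
client_share = client_priv.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw)

# Server: its own ephemeral pair, returned in Server_Hello's Key_Share extension,
# plus the DH shared secret computed from the client's share.
server_priv = X25519PrivateKey.generate()
server_share = server_priv.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw)
server_secret = server_priv.exchange(X25519PublicKey.from_public_bytes(client_share))

# Client: computes the same shared secret from the server's share.
client_secret = client_priv.exchange(X25519PublicKey.from_public_bytes(server_share))
assert client_secret == server_secret

In the actual protocol, this shared secret is not used directly; it is fed into the key derivation process of subsection 7.6.5.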
Notice that since the TLS 1.3 record protocol uses AEAD to protect the data, it uses only one key in each direction (client to server, kC→S, and server to client, kS→C); the figure includes only a message from the client. We denote the AEAD protection provided by the record protocol by kC→S(Application data).

7.6.3 TLS 1.3 Full (1-RTT) Pre-Shared Key (PSK) Handshake

[Figure 7.22 (diagram): two message flows. Initial handshake (no PSK): Client_Hello, with client random (rC), cipher suites, PSK_key_exchange_modes (PSK-only, PSK-DHE), and extensions supported_versions, supported_groups, signature_algs, CAs, Key_Share, ...; then Server_Hello, Certificate, CertificateVerify and Server_Finished from the server; then Client_Finished and application data from the client; later the server sends NewSessionTicket (ticket_lifetime, ticket_age_add, ticket_nonce, ticket, extensions). Subsequent handshake (with PSK): Client_Hello, with client random (rC), cipher suites, Pre_Shared_Key, PSK_key_exchange_modes and extensions including Key_Share; then Server_Hello (extensions including Pre_Shared_Key and, if DHE, Key_Share) and Server_Finished; then Client_Finished and application data.]

Figure 7.22: TLS 1.3 Full (1-RTT) Pre-Shared Key (PSK) handshake. We show the typical use of a PSK, for session resumption, although a PSK can also be shared off-band. We show two handshakes: an initial handshake where the PSK is established, and a subsequent handshake which uses the PSK. To establish a PSK, the Client_Hello includes the PSK_key_exchange_modes extension, indicating whether the PSK can be used to authenticate a DHE exchange (providing PFS) and/or to provide a shared-key-only handshake (without PFS). See text for details.

In Figure 7.22, we present the TLS 1.3 Full (1-RTT) Pre-Shared Key (PSK) handshake. The TLS 1.3 pre-shared key mechanism can be used for session resumption, using a key shared in an earlier session, or with out-of-band pre-shared keys, as previously supported by several special cipher suites [21, 22, 141]. For both session resumption and out-of-band pre-shared keys, TLS 1.3 supports two pre-shared key modes, which offer different costs and benefits:

The PSK Key Exchange (PSK_KE) mode: the pre-shared key is used to secure the session without the use of a Diffie-Hellman key exchange, or any other public key operation. The advantage is reduced computational cost, and the associated reduction in delay and energy; as shown in Table 6.1, symmetric cryptography requires a tiny fraction of the computational resources required by public-key cryptography. There is also a reduction in the amount of data sent, which is meaningful in some unusual situations; usually, this reduction is insignificant. The disadvantage of the PSK Key Exchange (PSK_KE) mode is that it does not ensure perfect forward secrecy (PFS).
The PSK and DHE (PSK_DHE_KE) mode: the pre-shared key is used to authenticate the Diffie-Hellman key exchange, instead of using the server's certificate and signature (in the CertificateVerify message). This mode has three uses. The first is simply to reduce the overhead of the transmission and verification of the certificate and the signature, while retaining the added security provided by PFS. The second is for scenarios where the server does not have a certificate and cannot perform the certificate-based verification (CertificateVerify message); here, the security of the DH key exchange is based on the shared-key authentication. The third is when the server does send both certificate and signature; in this case, the goal is the added security provided by the shared key, e.g., in case of exposure of the server's private key.

In Figure 7.22, we focus on the session resumption case, by presenting a sequence of two handshakes. The first is an initial handshake, which is essentially a full (1-RTT) DH handshake, and which additionally establishes one or more pre-shared keys, using the PSK_key_exchange_modes extension and the NewSessionTicket message. It is followed by the second, a subsequent handshake, which uses the pre-shared key from the previously sent ticket to resume the session, by performing a pre-shared key handshake.

The PSK_key_exchange_modes extension specifies which pre-shared key mode(s) the client wants to use: the PSK Key Exchange (PSK_KE) mode and/or the PSK and DHE (PSK_DHE_KE) mode. This extension is relevant for the use of the PSK in the current handshake (if it uses a pre-shared key), and for any new pre-shared keys which the server may share using the NewSessionTicket message.

The NewSessionTicket message provides the client with one or more 'tickets'. The ticket(s), sent after a successful handshake, refer to a pre-shared key derived from a dedicated shared secret called the resumption_master_secret. The resumption_master_secret is one of the secrets and keys derived for the session; the derivation uses a keyed function, which we denote hExpand. As a simplification, consider resumption_master_secret to be a secret key, and hExpand to be a pseudorandom function (PRF); for more precise details, see subsection 7.6.5. The PSK associated with the ticket is derived from the ticket_nonce, which is one of the fields sent in the NewSessionTicket, and the resumption_master_secret, as follows:

PSK(ticket_nonce) = hExpand_resumption_master_secret("resumption", ticket_nonce)     (7.26)

The Client_Hello message of the subsequent handshake, using the PSK, includes two PSK-related extensions: the PSK_key_exchange_modes extension, discussed above, and the Pre_Shared_Key extension. The Pre_Shared_Key extension identifies one or more pre-shared keys known to the client, allowing the server to use any of these that it may have cached to establish the new connection. For session resumption, the PSK identifier is the ticket, provided in a previous NewSessionTicket message. The server also uses the Pre_Shared_Key extension to signal that it uses a specific pre-shared key (from the list provided by the client).

In Figure 7.22, the subsequent handshake contains the Key_Share extension. This extension includes the server's DH key share, and therefore is used if, and only if, the DH key exchange is used (with a finite field or an elliptic curve).
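As an illustration of Equation 7.26, here is a minimal sketch in which hExpand is instantiated by a simplified HKDF-Expand (RFC 5869) with SHA-256; the real TLS 1.3 derivation uses HKDF-Expand-Label, which additionally encodes a label prefix and the output length, and the function names below are ours.

import hmac, hashlib

def hkdf_expand(key: bytes, info: bytes, length: int = 32) -> bytes:
    # Simplified HKDF-Expand (RFC 5869) using HMAC-SHA256.
    out, block, counter = b"", b"", 1
    while len(out) < length:
        block = hmac.new(key, block + info + bytes([counter]), hashlib.sha256).digest()
        out += block
        counter += 1
    return out[:length]

def resumption_psk(resumption_master_secret: bytes, ticket_nonce: bytes) -> bytes:
    # Equation 7.26 (sketch): the PSK is derived from the resumption_master_secret
    # and the ticket_nonce sent in the NewSessionTicket message.
    return hkdf_expand(resumption_master_secret, b"resumption" + ticket_nonce)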
Since the subsequent handshake in Figure 7.22 includes the Key_Share extension, we conclude that the client's PSK_key_exchange_modes extension allowed the use of the PSK and DHE (PSK_DHE_KE) mode, which was then chosen by the server.

7.6.4 TLS 1.3 Zero-RTT Pre-Shared Key (PSK) Handshake

[Figure 7.23 (diagram): Client_Hello, with client random (rC), cipher suites, Early_Data, Pre_Shared_Key, PSK_key_exchange_modes and extensions supported_versions, supported_groups, signature_algs, CAs, Key_Share, ..., immediately followed by early application data protected with the PSK; then Server_Hello (extensions including Pre_Shared_Key, Early_Data and, if DHE, Key_Share) and Server_Finished from the server; then EndOfEarlyData (protected with the PSK), Client_Finished and application data from the client.]

Figure 7.23: TLS 1.3 Zero-RTT Pre-Shared Key (PSK) handshake. The client provides some 'early application data' to the server immediately after the Client_Hello message, without waiting for the server's response.

In Figure 7.23 we present the TLS 1.3 Zero-RTT Pre-Shared Key (PSK) handshake. This is a special form of a pre-shared key handshake, in which we use the PSK to secure some data sent together with the Client_Hello message. Specifically, this data is indicated by the dedicated Early_Data extension, and is sent in the first flow, from the client to the server. Since the early data is sent as part of the very first flow of the connection, the delay until it arrives includes only the time from when the client sends this message until the server receives it. Namely, there is no need to wait for any round trip of handshake messages before the client can send this 'early' application data to the server. In other words, the early data is sent without waiting for any round trips to complete, i.e., with zero RTT latency. The only delay is, therefore, the unavoidable time for the data itself to travel from client to server.

The Early_Data does not benefit from all the protections offered by TLS. In particular, it is only protected by the pre-shared key; therefore, it does not benefit from perfect forward secrecy (PFS). Furthermore, an attacker can replay the Client_Hello message; this may cause the server to re-process the Early_Data, unless appropriate countermeasures prevent this (see below). Therefore, Early_Data should only be used for client requests which do not require PFS, and either where re-processing is allowed, or with appropriate countermeasures to prevent re-processing. A typical example of a legitimate, common use of Early_Data (and the zero-RTT handshake), where both the lack of PFS and the possibility of re-processing are not a concern, is when the client sends a query to the server, authenticated by a password or cookie sent with the request, and receives back a (protected) response.

Two alternative countermeasures against re-processing of Early_Data are possible. The first is to allow each ticket to be used only once; this requires the server to keep track of already-used tickets, similarly to the use of session IDs for session resumption in earlier versions. An alternative countermeasure is to limit the lifetime of the ticket, using the ticket_lifetime field of the NewSessionTicket message.
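The following is a minimal server-side sketch of these two countermeasures; the data structures and names are ours, not part of the TLS specification, and a real server would use a bounded, shared store rather than an in-memory set.

import time

used_tickets = set()   # tickets already accepted for early data (single-use)

def accept_early_data(ticket_id: bytes, issued_at: float, ticket_lifetime: float) -> bool:
    # Countermeasure 2: reject tickets past their advertised lifetime.
    if time.time() > issued_at + ticket_lifetime:
        return False
    # Countermeasure 1: accept each ticket at most once.
    if ticket_id in used_tickets:
        return False
    used_tickets.add(ticket_id)
    return True   # the server may process the 0-RTT request; otherwise, fall back to 1-RTT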
7.6.5 TLS 1.3 Key Derivation

TLS 1.3 makes extensive use of key derivation, using a pair of keyed functions: a key expansion function hExpand and a key extraction function hExtract. Both functions are defined in [243], based on the HMAC construction and a given hash function h. The hash function h is defined as part of the TLS 1.3 cipher suite.

The key expansion function hExpand is similar to a PRF; like a PRF, it receives a pseudorandom key and outputs a pseudorandom string. The length of the output string is specified as one of its inputs. It receives one more input, which is basically the input information to the PRF. The TLS specifications define different inputs for each derivation.

The key extraction function hExtract is similar to a keyed Key Derivation Function (KDF). Namely, the output of hExtract_x(y) is a (short) pseudorandom string, provided that either (a) x is pseudorandom (secret) and y is a unique value, or (b) x is a 'salt' (randomly chosen but known to the attacker) and y is a high-entropy string, i.e., intuitively, one which contains 'sufficient secret random bits'.

The TLS 1.3 handshake protocol applies hExtract and hExpand to derive multiple pseudorandom, independent keys for different purposes. Simplifying:

1. When using a pre-shared key PSK, we use it (or, actually, a key derived from it) as a key to hExpand, to generate several pseudorandom 'early' secrets and keys. We use one key to protect the Early_Data (if sent), and another key, which we denote k1, as the key for the next derivation step.

2. We next derive khandshake = hExtract_k1(shared_secret), where shared_secret is the partially-secret output of the Diffie-Hellman key exchange. We use khandshake as the key to hExpand, to generate several pseudorandom keys, including a key to protect client-to-server handshake messages, a key to protect server-to-client handshake messages, and a key k2 which we use for the next derivation step.

3. We next derive kMaster = hExtract_k2(0). We use kMaster as the key to hExpand, to generate several pseudorandom keys, including a key to protect client-to-server application messages, a key to protect server-to-client application messages, and a key resumption_master_secret which we use to derive pre-shared keys, as defined in Equation 7.26.

7.6.6 Cross-Protocol Attacks on TLS 1.3

We complete our discussion of TLS 1.3 by briefly discussing cross-protocol attacks, since these types of attacks can be effective against standard-compliant implementations of TLS 1.3; arguably, the standard could have specified certain countermeasures which would have prevented this, as we discuss below.

Let us first explain what cross-protocol attacks are; the concept is not limited to TLS. Cross-protocol attacks are attacks which exploit a vulnerability in one ('weak') protocol PW, to attack another, 'strong' protocol PS. The attacks exploit two flaws: the vulnerability of PW, and the fact that PS and PW use the same private/secret key, in violation of the key separation principle (Principle 10). In particular, while TLS 1.3 does not use RSA encryption, it does use signatures, which are often RSA signatures, for CertificateVerify. In many implementations, the same private key is used for these TLS 1.3 RSA signatures and for vulnerable RSA decryption, typically using version 1.5 or 2.0 of PKCS#1.
In particular, many TLS 1.3 implementations were found to use the same private key (to sign CertificateVerify) as used for RSA decryption by older TLS implementations (of the same organization). This allows the use of the Bleichenbacher attack and its variants, e.g., Manger's attack, to let the attacker 'perform the private-key RSA operation', potentially allowing the attacker to sign the CertificateVerify message. Two variants of this attack were published. The first attack was published in [214]; however, it required quite extensive computational abilities from the attacker. The DROWN attack, published a year later, is a much more efficient and practical attack, but it required the key to be shared with an implementation of SSLv2; surprisingly, the researchers found that even in 2016, a very significant number of web servers still supported SSLv2 (and reused the same private key for other protocols, e.g., TLS 1.3 signing).

These cross-protocol attacks are not necessarily a major concern for the use of TLS 1.3, for two reasons. The first reason is a technical challenge: the attacker must complete the attack, and in particular abuse the lower-version TLS implementation to sign the CertificateVerify message, before the client aborts the connection (due to not receiving CertificateVerify in time). This makes deployment of the attack challenging, but it may still be possible in some scenarios. The second reason that the attack is of limited concern is that it only works when the TLS 1.3 implementation uses the same private key (for signatures) as used by lower versions of TLS (for encryption). Such reuse of the same key for two different purposes (decryption and signing), and by two different versions of TLS, is a double violation of the key separation principle (Principle 10). Implementations of TLS 1.3 should avoid this, and use a different private (signing) key than the private (decryption) key used by lower versions; in fact, such key separation should have been done even before this attack was published! Furthermore, the same private key is typically used for two different purposes and protocols only when the same certificate is also used. Specifically, TLS 1.3 could require that the certificate contain a key-usage extension that explicitly forbids its use for decryption. While this would not absolutely prevent cross-protocol attacks, it would probably make them extremely unlikely, as certificates of keys currently used (for encryption) by older versions of TLS are unlikely to contain a key-usage extension forbidding the use of the private key for decryption. For details about the key-usage and other certificate extensions, see subsection 8.2.7.

7.7 TLS: Final Words and Further Reading

The TLS protocols are the most widely used, and definitely the most studied, applied cryptographic protocols. Extensive efforts by many cryptographers and security experts were invested in validating and improving the security of TLS. It is instructive to observe that these efforts resulted in the discovery of a large number of serious vulnerabilities. We discussed several of these, focusing on specification vulnerabilities. We summarize the most important TLS specification vulnerabilities in Table 7.2. We did not cover the many implementation vulnerabilities, although some of them, in particular the Heartbleed bug [91, 399], had comparable and even greater impact.
Name                 | Year | References              | Versions         | Ciphers | Type
BEAST                | 2011 | subsection 7.2.4, [132] | SSL, TLS 1.0     | CBC     | Cryptanalysis
CRIME                | 2012 | subsection 7.2.6, [335] | TLS 1.0-1.1      | n/a     | Compression
Lucky13              | 2013 | [9]                     | SSL, TLS 1.0-1.2 | CBC     | Padding, timing
RC4-biases           | 2013 | subsection 7.2.5, [11]  | SSL, TLS 1.0-1.2 | RC4     | Cryptanalysis
BREACH               | 2013 | subsection 7.2.6, [164] | TLS 1.0-1.2      | n/a     | Compression
TIME                 | 2013 | subsection 7.2.6, [29]  | TLS 1.0-1.2      | n/a     | Compression
Poodle-padding       | 2014 | subsection 7.2.3, [290] | SSLv3            | CBC     | Padding
Poodle-downgrade     | 2014 | subsection 7.5.5, [290] | TLS 1.0-1.2      | n/a     | Version downgrade
FREAK                | 2015 | subsection 7.5.3, [56]  | TLS 1.0-1.2      | RSA     | Cipher suite downgrade
Cross-Bleichenbacher | 2015 | subsection 7.6.6, [214] | TLS 1.3          | RSA     | Cross-protocol and Bleichenbacher
DROWN                | 2016 | subsection 7.6.6, [18]  | TLS 1.0-1.3      | RSA     | Cross-protocol and Bleichenbacher
Logjam               | 2015 | subsection 7.5.3, [7]   | TLS 1.0-1.2      | DHE     | Cipher suite downgrade
ROBOT                | 2018 | [73]                    | TLS 1.0-1.2      | RSA     | Bleichenbacher
Bleichenbacher's CAT | 2019 | subsection 7.5.4, [341] | TLS 1.0-1.3      | RSA     | Bleichenbacher and downgrade

Table 7.2: Important TLS/SSL attacks due to specification vulnerabilities.

We can learn some important lessons from this history of attacks and improvements, including:

A crack today, a break tomorrow: many devastating attacks, e.g., BEAST and Poodle, can be traced to vulnerabilities reported years earlier, but ignored since they appeared impractical. Or, as correctly stated in [9], attacks only get better.

Vulnerabilities are resilient and return: even after a vulnerability has been discovered and countermeasures adopted, attackers are often able to continue taking advantage of the vulnerability, in different ways. First, attackers are often able to adjust the attack and defeat the countermeasures. Second, attackers are often able to downgrade the system to use an outdated, vulnerable version, possibly using downgrade attacks. Third, attackers are sometimes able to use cross-protocol attacks, where they exploit a vulnerable system to circumvent the (strong) defenses of another system which uses the same keys. It is much better to design systems securely from early on; of course, advice more easily given than followed!

Separate keys: Cross-protocol attacks such as DROWN [18], as well as BEAST [132], which abuses the continued use of the same key for different messages, remind us of the important principle of key separation (Principle 10). The use of TLS 1.3, or of any secure protocol, will not help if we reuse the same secret key which can be found from its usage in an insecure protocol!

Test, test, test: finally, while we focused on specification flaws, many attacks on TLS, e.g., Heartbleed, exploit implementation flaws. Testing for security is difficult but vital for the security of the system, since vulnerabilities will not be detected by normal use of the system. Of course, testing is harder for larger and more complex systems; which brings us to the next and final item...

KISS! Finally, we see again and again the importance of the KISS principle (Principle 14): keep specifications, design and code small and simple, minimize complexity and attack surface, and avoid unnecessary options and flexibility. The KISS principle is important against both specification and implementation vulnerabilities. The design of TLS 1.3 began with an extensive effort to follow this principle and eliminate unnecessary options.
However, with features creeping in, there are reasons to be concerned that vulnerabilities may yet be found, in implementations and/or in the specifications themselves.

7.8 Additional Exercises

Exercise 7.5 (The TLS record protocol: fragmentation and compression).
1. The TLS record protocol uses fragments of size up to 16KB. Explain a potential disadvantage of using much longer fragments (or no fragmentation).
2. Explain a potential disadvantage of using much shorter fragments.
3. Explain why fragmentation is applied before compression (for the AtE protocol).
4. Suppose compression is not used. Why would we apply fragmentation before authentication and encryption?
5. The TLS record protocol applies compression, then authentication. Is it possible to reverse the order, i.e., apply authentication and then compression? Can you identify advantages to either order?

Exercise 7.6 (Vulnerability of Compress-then-Encrypt). One of the important goals of TLS is to hide the value of cookies sent by a browser, as part of an HTTP-over-TLS connection (marked by the https protocol at the beginning of the URL). Cookies are strings that are sent automatically by the browser to a website; they are often used to authenticate the user. Consider a cross-site attacker, as in subsection 7.2.6, i.e., the attacker controls a rogue website visited by the victim user; further, assume that the attacker can eavesdrop on the (protected) communication, following the CPA-Oracle attack model. This allows the attacker to control the contents of the request sent by the browser, except for the Cookie HTTP header, which is added by the browser and which consists of the string 'Cookie:' followed by the value of the (unknown) cookie.
1. Assume that the length of the cookie is known, that it contains only alphanumeric characters, and the following compression scheme: find the longest string α which appears at least twice in the uncompressed data; replace the occurrences of α by a special character, say '!', and concatenate to the data the same special character ('!'), followed by α. Present an efficient attack which exposes the first character of the cookie.
2. Extend the attack to find the entire cookie.
3. What is the maximal number of requests required by the attack, for a cookie of l characters?
4. What is the expected number of requests required by the attack, for a cookie of l alphabetic characters selected randomly (with uniform distribution)?
Note: compression is used by all versions of TLS before TLS 1.3.

Exercise 7.7 (SSLv2 key derivation). SSL uses MD5 for key derivation. In this question, we explore the properties required from MD5 for the key derivation to be secure.
1. Show that it is not sufficient to assume that MD5 is collision-resistant, for the key derivation to be secure.
2. Repeat, for the one-way function property.
3. Repeat, for the randomness-extraction property.
4. Define a simple assumption regarding MD5 which ensures that the key derivation is secure. The definition should be related to cryptographic functions and properties we defined and discussed.

Exercise 7.8 (BEAST vulnerability). Versions of TLS before TLS 1.1 use CBC encryption in the following way. They select the IV randomly only for the first message m0 in a connection; for subsequent messages, say mi, the IV is simply the last ciphertext block of the previous message.
This creates a vulnerability exploited, e.g., by the BEAST attack and a few earlier works [25, 132].

In this question we explore a simplified version of these attacks. For simplicity, assume that the attacker always knows the next IV to be used in encryption, and can specify a plaintext message and receive its CBC encryption (using the known next IV). Assume a known block length, e.g., 16 bytes.
1. Assume the attacker sees a ciphertext (c0, c1) resulting from CBC encryption, with c0 being the IV, of a single-block message m, which can have only two known values: m ∈ {m0, m1}. To find whether m was m0 or m1, the adversary uses the fact that it knows the next IV to be used, which we denote c'0, and asks for the CBC encryption of a specially-crafted single-block message m'; denote the returned ciphertext by the pair (c'0, c'1), where c'0 is the (previously known) IV, as indicated earlier. The adversary can now determine m from c'1:
   a) What is the value of m' that the adversary will ask to encrypt?
   b) Fill in the missing parts of the adversary's solution: m = m0 if _______; m = m1 if _______.
2. Show pseudo-code for the attacker algorithm used in the previous item.
3. Show pseudo-code for an attack that finds the last byte of message m. Hint: use the previous solution as a routine in your code.
4. Assume now that the attacker tries to find a long secret plaintext string x of length l bytes. Assume the attacker can ask for the encryption of messages m = p++x (the concatenation of p and x), where p is a plaintext string chosen by the attacker. Show pseudo-code for an attack that finds x. Hint: use the previous solution as a routine; it may help to begin by considering a fixed-length x, e.g., four bytes.

Sketch of solution to the second part: the attacker makes a query for the encryption of some one-block message y, and receives (α0, α1), where α1 = Ek(α0 ⊕ y). Suppose that now the attacker knows the value IV to be used for the encryption of the next message. The attacker picks m0 = IV ⊕ y ⊕ α0, and m1 as some random message. If the game picks bit b = 0, then the attacker receives the encryption of m0; this encryption would be Ek(IV ⊕ m0) = Ek(IV ⊕ IV ⊕ y ⊕ α0) = Ek(y ⊕ α0) = α1 (together with the IV). Otherwise, if the game picks b = 1, then the attacker receives some other string.

Sketch of solution to the third part: the solution to the previous part allowed the attacker to check whether the plaintext was a given string; we now simply repeat this for the 256 different strings corresponding to all possible values of the last byte of m.

Exercise 7.9 (Non-random client/server). Some devices may not have a source of random bits; in this exercise, we explore possible resulting vulnerabilities, and a possible work-around.
1. Consider an IoT lock, which receives lock/unlock requests over TLS, acting as a web server, but without access to a source of randomness. Show a replay attack allowing an eavesdropping attacker to open the lock by replaying messages of a legitimate user.
2. Consider a monitoring station displaying images from security cameras, by initiating TLS connections to the cameras and receiving the current images. Show a replay attack allowing an eavesdropping attacker to replay old images (while cracking the safe).
3. Assume that the devices have non-volatile memory. Show how it can be used to ensure secure interactions, even though the devices still do not have a source of random bits.
Your answer should present a sequence diagram and the relevant equations, for SSLv2 or another version of TLS.

Exercise 7.10 (SSLv2 client authentication). The SSLv2 client authentication mechanism requires clients to sign a message containing three main fields: a random challenge sent by the server, the server's certificate, and the shared secret keys exchanged by the protocol. The signature uses RSA with the Hash-then-Sign paradigm, using the MD5 hash function. Furthermore, the client should encrypt the message containing the signature (using the just-established shared key). This is more complex than later client-authentication designs; in this question, we explore attempts to remove some of these multiple requirements.
1. Suppose that the input to the signature did not contain the server's certificate. Present a sequence diagram showing how client authentication fails.
2. Suppose that the signed message was not encrypted. Present a possible vulnerability of the protocol, where you are allowed to replace the use of MD5 with the use of any collision-resistant hash function h. Note: this may require h to have a vulnerability that we may not expect to find in MD5 or other well-designed cryptographic hash functions.

Exercise 7.11 (TLS handshake: resiliency to key exposure). Fig. 7.10 presents the RSA-based TLS handshake. This variant of the handshake protocol was popular in early versions, but was later 'phased out' and completely removed in TLS 1.3. The main reason was the fact that this variant allows an attacker that obtains the server's private key to decrypt all communication with the server using this key, before and after the exposure.
1. Show, in a sequence diagram, how a MitM attacker who is given the private key of the server at time T1 can decrypt communication of the server at a past time T0 < T1.
2. Show, in a sequence diagram, how TLS 1.3 avoids this impact of exposure of the private key.
3. Show, in a sequence diagram, how a MitM attacker who is given the private key of the server at time T1 can decrypt communication of the server at a future time T2 > T1.
4. Explain which feature of TLS 1.3 can reduce the exposure of future communication, and how.

Exercise 7.12 (Protocol version downgrade-dance attack). Implementations of TLS and SSL specify the version of the protocol in the Client_Hello and Server_Hello messages. If the server does not support the client's version, then it replies with an error message. When the client receives this error message ('version not supported'), it re-tries the handshake using the next-best version of TLS supported by the client. This method of ensuring backward compatibility with older versions of TLS is referred to as the downgrade dance.
1. Present a sequence diagram showing how a MitM attacker can exploit the downgrade-dance mechanism to cause the server and client to use an outdated version of the protocol, allowing the attacker to exploit vulnerabilities of that version.
2. The TLS Fallback Signaling Cipher Suite Value (SCSV) [288], discussed in subsection 7.5.6, is designed to mitigate this risk. Let vC denote the TLS version run by the client and vS the TLS version run by the server. Present sequence diagrams showing TLS connections where (1) client and server support SCSV and vC > vS, (2) same, with vC = vS, (3) same, with vC < vS, and (4) any of these, with a MitM attacker who tries to cause the use of a version vM < min(vC, vS).
Note: See also Exercise 8.15.
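To make the 'downgrade dance' described in Exercise 7.12 concrete, here is a minimal sketch of the insecure client-side retry loop; the names are ours, and the point is only that any failure, including one forged by a MitM attacker, pushes the client to an older version.

class HandshakeFailure(Exception):
    """Handshake attempt failed, e.g., the server replied 'version not supported'."""

SUPPORTED_VERSIONS = ["TLS 1.2", "TLS 1.1", "TLS 1.0", "SSLv3"]  # client preference order

def connect_with_downgrade_dance(server, try_handshake):
    # Insecure pattern: a MitM attacker can forge failures for the higher
    # versions, causing client and server to use an outdated, vulnerable one.
    for version in SUPPORTED_VERSIONS:
        try:
            return try_handshake(server, version)
        except HandshakeFailure:
            continue
    raise ConnectionError("no version accepted by the server")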
Exercise 7.13 (Client-chosen cipher suite downgrade attack). In many variants of the TLS handshake, e.g., the RSA-based handshake in Fig. 7.10, the authentication of the (previous) handshake messages in the Finished flows is relied upon to prevent a MitM attacker from performing a downgrade attack and causing the client and server to use a less-preferred (and possibly less secure) cipher suite. However, in this process, the server can choose which of the client's cipher suites would be used. To ensure the use of the cipher suite most preferred by the client, even if it is less preferred by the server, some client implementations send only the most-preferred cipher suites. If none of these is acceptable to the server, then the server responds with an error message. In this case, the client will try to perform the handshake again, specifying now only the next-preferred cipher suite(s), and so on; this is also referred to as a downgrade dance.
1. Show how a MitM attacker can exploit this mechanism to cause the server and client to use a cipher suite that both consider inferior.
2. Suggest a fix to the implementation of the client which achieves the same goal, yet is not vulnerable to this attack. Your fix should not require any change in the server.

Exercise 7.14 (TLS server without randomness). An IoT device provides an HTTP interface to clients, i.e., acts as a tiny web server. For authentication, clients send their commands together with a secret password, e.g., on, <password> and off, <password>. Communication is over TLS for security, with the RSA-based TLS handshake, as in Figure 7.10. The IoT device does not have a source of randomness; hence, it computes the server-random rS from the client-random, using a fixed symmetric key kS (kept only by the device), as: rS = AES_kS(rC).
1. Present a message sequence diagram showing how an attacker, which can eavesdrop on a connection in which the client turned the device 'on', can later turn the device 'on' again, without the client being involved.
2. Would your answer change (and how), if the device supports ID-based session resumption? Ticket-based session resumption?
3. Show a secure method for the server to compute the server-random, which does not require a source of randomness. The IoT device may use and update a state variable s; your solution consists of the computation of the server-random, rS = _______, and of the update to the state variable performed at the end of every handshake, s = _______.

Exercise 7.15 (DH Ephemeral (DHE)). Consider a client and server that use TLS 1.2 with DH Ephemeral (DHE) public keys, as in Fig. 7.12. Assume that the client and server run this protocol daily, at the beginning of every day i. (Within each day, they may use session resumption to avoid additional public key operations; but this is not relevant to the question.) Assume that Mal can (1) eavesdrop on communication every day, (2) perform MitM attacks (only) on every even day (i.e., days i such that i ≡ 0 (mod 2)), and (3) is given all the keys known to the server on the fourth day. Note: the server erases any key once it is no longer in use (i.e., on the fourth day, the attacker is not given the 'session keys' established on previous days). Fill in the 'Exposed on' column of day i in Table 7.3, indicating the first day j ≥ i on which the adversary should be able to decrypt (expose) the traffic sent on day i between client and server.
Write 'never' if the adversary should never be able to decrypt the traffic of day i. Briefly justify.

Day | Eavesdrop? | MitM? | Given keys? | Exposed on... | Justify
1   | Yes        | No    | No          |               |
2   | Yes        | Yes   | No          |               |
3   | Yes        | No    | No          |               |
4   | Yes        | Yes   | Yes         |               |
5   | Yes        | No    | No          |               |
6   | Yes        | Yes   | No          |               |
7   | Yes        | No    | No          |               |
8   | Yes        | Yes   | No          |               |

Table 7.3: Table for Exercise 7.15.

Exercise 7.16 (TLS with PRS). Consider a client that has three consecutive TLS connections to a server, using TLS 1.3. An attacker has different capabilities in each of these connections, as follows:
• In the first connection, the attacker obtains all the information kept by the server (including all keys).
• In the second connection, the attacker is disabled.
• In the third connection, the attacker has MitM capabilities.
Is the communication between client and server exposed during the third connection?
1. Show a sequence diagram showing that with TLS 1.3, communication during the third connection is exposed to the attacker.
2. Present an improvement to TLS 1.3 that will protect communication during the third connection. Simplify your solution by assuming no attack during the second connection.
3. Extend your improvement to provide the same protection even if the attacker can eavesdrop on the communication during the second connection.
4. How can your improvement be implemented using TLS 1.3, allowing backward compatibility, i.e., a 'normal' TLS 1.3 interaction when one of the two parties (client or server) does not support your improvement?

Exercise 7.17. A Pierpont prime is a prime number of the form 2^u · 3^v + 1, where u, v are non-negative integers; Pierpont primes are a generalization of Fermat primes. Assume that Alice's browser sends, in the Client_Hello message of TLS 1.3, the set of exponentiations {g_i^{a_i} mod p_i}, where for some i, say i = 3, the prime p_3 is a Pierpont prime.
1. Assume that the server bob.com selects to use this p_3 and g_3^{a_3} mod p_3, i.e., sends back g_3^{b_3} mod p_3 as part of the Server_Hello message. Present a sequence diagram showing how a MitM attacker would be able to eavesdrop on and modify messages sent between Alice and Bob.
2. Assume that the server prefers p_2, such that there is some prime q_2 such that p_2 = 2 · q_2 + 1. Explain why the MitM attack of the previous item fails.
3. Present a sequence diagram showing that the attacker is still able to impersonate the website.
4. Extend the impersonation attack to a complete MitM attack against Alice and bob.com, assuming typical user authentication (using a cookie or userid/password).
5. Suppose the client holds a pre-shared key (and ticket); would this prevent the attack? Explain.

Exercise 7.18. This exercise continues Exercise 2.43; please see the ANSI X9.31 design presented there. Some TLS implementations use X9.31 as a PRG, to generate keys, nonces and IVs for encryption. Assume a fixed key k (known to the attacker), and that the values T_i are the current time, in seconds. Explain a possible vulnerability; you may make reasonable assumptions, e.g., on the use of different outputs of the PRG and on clock synchronization. Demonstrate how an attacker may exploit the vulnerability, using a sequence diagram. You may present the attack against any variant of TLS that you wish.

Exercise 7.19 (Protecting TLS 1.3 servers from computational DoS).
TLS servers can be subject to a Denial-of-Service (DoS) attack, in which the attacker overloads the server with Client_Hello messages, each time causing the server to perform computationally-intensive operations. In TLS 1.3, the attacker can cause the server to perform two or three computationally-intensive operations: signing the CertificateVerify message and computing the server's Diffie-Hellman key share, and possibly also computing the Diffie-Hellman shared key. All this requires only minimal computational cost from the attacker.
1. Explain the attack, using a sequence diagram.
2. Explain how the Cookie extension of TLS 1.3 can help against this attack, following the details in [329]. Identify assumptions/limitations of this defense.
3. An alternative way to defend against such DoS attacks uses the pre-shared key to authenticate the Client_Hello message. This defense can allow the server to establish connections with clients which have a pre-shared key, even when, due to the attack, other clients cannot establish a connection. Such a defense does not currently exist in TLS 1.3. Design such a defense and explain it, using appropriate sequence diagrams. Your solution should not require any change in the TLS 1.3 handshake protocol.

Chapter 8
Public Key Infrastructure (PKI)

A big advantage of public key cryptography is that public keys are easier to distribute, as they are not secret; we only need to ensure their authenticity. The main mechanism for authenticating public keys is a public key certificate (or simply a certificate), signed by a trusted Certificate Authority (CA). In Chapter 7, we sketched how certificates are used by the TLS protocol, but without discussing certificates or CAs. This chapter focuses on certificates; we discuss how certificates are issued, validated and revoked, and how to determine which certificates, signed by which authorities, are trustworthy. This set of mechanisms is key to the use of public key cryptography, and is therefore referred to as a Public Key Infrastructure (PKI). PKI is, therefore, an essential component of practical deployments of public key cryptography (PKC).

The basic concept of PKI is almost as old as the first publications of public key cryptography, and appeared in [239]. PKI was also standardized quite early, before any application of PKC; this was in the (first version of the) X.509 specifications [92]. The X.509 standard has evolved over the years; the most important change was the publication of X.509 version 3 in 1997, often referred to simply as X.509v3. X.509v3 is still the most important and widely used version of certificates, and much of our discussion in this chapter focuses on it (specifically, Section 8.2 to Section 8.5). X.509v3 is designed for generality; there are several published X.509 profiles, which define different restrictions on the contents and use of X.509 certificates. We mostly focus on the most well-known profile, the PKIX profile, used by Internet protocols and most other deployed PKIs, and defined in RFC 5280 [104, 207]. We also discuss some extensions of X.509, mainly OCSP (RFC 6960) [346] and Certificate Transparency (CT) [253–255]. All of these (X.509 with the PKIX profile, as well as CRLs, OCSP and CT) are used in common implementations of the TLS/SSL handshake protocols, which we discussed in Chapter 7, and in particular in their application to secure the communication between browsers and web servers; this particular application is often referred to as the Web PKI.
The Web PKI is probably the most well-known, and possibly also the most important, application of PKI; see subsection 8.1.3.

For many years, PKI was basically identified with X.509v3, and to a large extent this is still mostly true; all significant PKI deployments and mechanisms follow X.509v3. One exception is the handling of revocations, where the original CRL proposal was found too inefficient, and other mechanisms have been deployed and proposed, with no dominant standard yet; see Section 8.4. However, with the growing importance and use of the Web, there were growing concerns about vulnerabilities of the PKI system, due to repeated, high-profile PKI failures (see subsection 8.5.1). Several proposals were made to improve PKI security, and one of them, Certificate Transparency, has been widely deployed, including by CAs and browsers, and is being standardized by the IETF. However, CT is also based on X.509, and only extends the X.509 specifications, as we discuss in Section 8.6.

Topics. PKI is important and involves many proposals, mechanisms and details; this chapter covers what seemed the most important aspects, and readers may want to further focus, at least in a first reading, on the aspects most important to them. All readers should probably read the 'PKI concepts and goals' in Section 8.1. X.509 is covered mostly in Section 8.2, with the important aspect of intermediate-CAs and certificate paths in Section 8.3; readers may skip some of the details of the different extensions, and even the entire Section 8.3, if their goal is to get a more high-level understanding of PKI. Section 8.4 discusses certificate revocation, including both the CRL and OCSP standards, as well as other, 'optimized' designs. In Section 8.5, we discuss some of the criticisms of the Web PKI, the main currently deployed PKI system, and some proposed improvements, while Section 8.6 focuses on the emerging Certificate Transparency extension to the X.509 PKI.

8.1 Introduction: PKI Concepts and Goals

Basic PKI entities: relying party, issuer (CA) and subject. Public Key Infrastructure (PKI) schemes distribute a public key pk together with a set ATTR of attributes and a signature σ. The signature σ is the result of a signature algorithm applied to an input containing both pk and ATTR. The tuple (pk, ATTR, σ) is called a public key certificate, or simply a certificate. The certificate is issued by an entity referred to as a Certificate Authority (CA) or as the issuer. Most attributes refer to the subject of the certificate, i.e., the entity who knows ('owns') the private key corresponding to the certified public key pk. In addition, there are often additional attributes related to the certificate itself rather than to the subject, such as the certificate validity period and serial number.

The basic feature of public key cryptography is that the party that knows the private key is, usually, different from the party that uses the corresponding public key. To use the public key, we need to authenticate it; usually, this is done by verifying a certificate. Namely, the party that uses a public key relies upon the public key certificate, on the PKI processes used to validate it, and on (at least one) trusted Certificate Authority (CA). Therefore, we refer to this party as the relying party.

The X.509 certificate life cycle.
Figure 8.1 illustrates the X.509 PKI entities and certificate life cycle; some other PKIs have different life cycles, sometimes with additional entities. In particular, Certificate Transparency has additional parties and a more complex life cycle; see Section 8.6. But for now, let us focus on the more basic scenario of X.509 PKI.

[Figure 8.1 (diagram): the Certificate Authority (aka CA or issuer), holding signing key CA.s and public key CA.v, issues the certificate CB = SignCA.s(bob.com, Bob.e, ...) to the subject (e.g., the website bob.com, with public key Bob.e); the subject presents CB to the relying party (Alice).]

Figure 8.1: PKI entities and the typical application for server authentication in the Web PKI.

To request an X.509 certificate, the subject typically generates a (public, private) key pair, and then requests a CA to issue the certificate for the public key, with specific requested attributes, such as identifiers. The CA should validate that the public key was received from a subject which is entitled to the requested attributes; in subsection 8.2.8 we discuss the main validation methods: the 'easy' domain validation, the 'classical' organization validation and the extra-secure extended validation. If the validation passes, the CA constructs the certificate, including the validated attributes requested by the subject, as well as other attributes determined by the CA, such as the serial number and validity period, and then signs it, using the CA's private signing key. After issuing the certificate, the CA sends it to the subject, who provides it to the relying party, often, and in particular for the Web PKI, during the TLS/SSL handshake. The relying party should validate the certificate; this includes validation of the signature, using the public key of the CA, and validation of attributes within the certificate (e.g., the expiration time).

In the typical example of Figure 8.1, the subject is the website bob.com, and the relying party is Alice, or Alice's browser. The figure shows the simple case of a typical identity certificate, issued to the website bob.com directly by a CA trusted by the relying party. Such directly-trusted CAs are called 'trust anchors' or 'root CAs'. In reality, most Web PKI certificates are indirectly issued, i.e., issued by an intermediate-CA (Figure 8.6), or even by a path of multiple intermediate-CAs, as we discuss in Figure 8.7.

Usually, certificates are used until they expire, i.e., throughout their validity period, which is typically a few months (rarely over a year). Around the expiration date, certificates are often re-issued. However, sometimes a certificate should be revoked, i.e., invalidated before its planned and specified expiration time. Revocation is done by the CA, usually upon an appropriate (and authenticated) request from the subject. A simple revocation mechanism called a Certificate Revocation List (CRL) has been part of X.509 from its earliest versions [92]; however, practical, efficient revocation turned out to be quite a challenge, and multiple designs were proposed; we discuss revocation mechanisms in Section 8.4. While revocations can occur for administrative reasons, most revocations are due to security concerns, such as:

Subject key exposure: private keys should be well protected from exposure; however, exposures do happen. Normally, exposures are quite rare and sporadic. However, the discovery of a software vulnerability may cause
exposure of many private keys, as happened due to the Heartbleed Bug [91, 399].

CA failures: usually, certificate authorities have operated in a secure, trustworthy manner, and issued correct certificates to the rightful subjects, as required and expected. However, there have also been several incidents where CAs have failed in different ways, including vulnerable subject identification (e.g., insecure email validation), issuing intermediate-CA certificates to untrusted entities (e.g., to all customers), and even CA compromise and issuing of rogue certificates, or what appears to be intentional issuing of rogue certificates. See subsection 8.5.1 and Table 8.5.

Cryptanalytical certificate forgery: certificate-based PKIs all use and depend on the Hash-then-Sign mechanism, and therefore become vulnerable if the signature scheme is vulnerable, or if the hash function used is vulnerable. Specifically, certificate forgery was demonstrated when using hash functions vulnerable to chosen-prefix collision attacks, specifically using MD5 [367], and later also using SHA-1 [262, 367]. See Chapter 3.

Subject key exposures are the most common reason for revocation, and typically result in a 'steady' rate of several dozen revocations daily; however, software and CA vulnerabilities can result in exceptional 'waves' of revocations. For example, the Heartbleed bug resulted in several days with many revocations, even more than 10,000 on one day [91, 399].

Identity certificates. Many certificates include an identifying attribute, i.e., an identifier of the subject; such certificates are referred to as identity certificates. In the typical server-authentication use of the Web PKI by TLS/SSL, the relying party is the browser, the subject is the website, and the relevant identifier is the domain name of the website, e.g., bob.com. This typical use case is illustrated in Figure 8.1, where the certificate CB contains a signature SignCA.s(bob.com, Bob.e, ...), using CA.s, the private signing key of the CA, over the identifier bob.com, the public key Bob.e, and other fields.

8.1.1 Rogue certificates

The basic goal of PKI is to allow a relying party to determine which public key to use, or whether to use a given public key - typically, included in a certificate. We use the term rogue certificate for a certificate which contains wrong or misleading information, and hence should not be relied upon. The basic goal of rogue certificates is to allow the attacker to mislead the user or security mechanisms, typically by impersonating a trusted entity: impersonating a trusted website (website spoofing), impersonating a web server of a trusted sender (phishing email), or impersonating a trusted software provider (signed malware).
Equivocating certificates. To mislead security mechanisms, a rogue certificate needs to use exactly a specific name that 'belongs' to the legitimate 'owner', namely, an equivocating name. Equivocating (same-name) certificates contain exactly the same name as a legitimate domain name, but certified for an attacker - and, obviously, containing a public key chosen by the attacker. Equivocating certificates can be used to circumvent many important security mechanisms, including the Same-Origin Policy (SOP), blacklists, whitelists and other access-control mechanisms.

Misleading (impersonating) certificates and domain names. Many attacks focus on misleading the user, rather than misleading an automated security mechanism. These attacks take advantage of the fact that humans do not follow a precise algorithm for their trust decisions, in contrast with automated security mechanisms. The attacker's goal is still, usually, impersonation; there is currently no mechanism that prevents rogue entities from obtaining domain names and certificates for non-impersonating nefarious purposes, such as scams. Such a mechanism is probably desirable, but seems very hard to establish. Hence, we focus on impersonating certificates. Notice that equivocating certificates can obviously be used for impersonating the subject, such as for phishing emails and spoofed (fake) websites; however, when the goal is to trick a human user, there are other ways which prove almost as effective, mainly:

Homographic (visually impersonating) domain names: the attacker uses names which appear visually to be the exact names of a legitimate, trusted entity, although they actually just exploit visual similarities between different characters, typically in particular fonts. A simple example is the pair paypal.com and paypaI.com: in many fonts, the lower-case 'L' and capital 'I' are hard to distinguish. Attackers may also use characters from different alphabets, e.g., some Cyrillic letters are visually indistinguishable from Latin letters (e.g., P). Attacks using such visually-impersonating domain names are called homographic attacks.

Domain-name hacking: the attacker uses a different domain name, which the attacker can register and control, but which users, usually not even aware of the structure of domain names, will not distinguish from a trusted domain name. Examples: to impersonate the site accounts.bank.com, use bank.accounts.com, accounts-bank.com or accounts.bank.co. The last example also exploits the human tendency to ignore the end and other minor deviations in text (our 'built-in error correction').

Combo domain names (combosquatting): names which combine a trademark, or a name associated with a legitimate, trusted entity, with another term which seems to either 'make sense' or simply 'appear meaningless/technical'. Combo names are probably one of the most effective forms of misleading names. Example: to impersonate the website bank.com, use accounts-bank.com or bank.accts.com.

Typosquatting: these domain names exploit typical typos, due to typing and/or spelling errors, such as banc.com, baank.com and banl.com.

The high-level goal of PKI is to protect the relying parties from such rogue certificates, as well as from CAs who issue rogue certificates intentionally or due to negligence. In the next subsection, we try to turn this high-level goal into more precise requirements.
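To illustrate the homographic attack, here is a small sketch of how a defender might flag lookalikes of a protected domain. The CONFUSABLES table is a tiny, illustrative subset (real defenses use the full Unicode confusables data and additional checks), and, as the last test shows, such a check does not catch combosquatting or typosquatting.

```python
# Sketch: flag domain names that look like a protected domain after mapping a few
# commonly-confused characters. The CONFUSABLES table is a tiny illustrative subset.
CONFUSABLES = {
    "I": "l",        # capital 'I' vs lower-case 'L' (paypaI.com)
    "\u0430": "a",   # Cyrillic 'a' vs Latin 'a'
    "\u0440": "p",   # Cyrillic 'r' (looks like 'p') vs Latin 'p'
    "0": "o",
    "1": "l",
}

def skeleton(domain: str) -> str:
    # Map each character to its 'visual skeleton' and lower-case the result.
    return "".join(CONFUSABLES.get(c, c) for c in domain).lower()

def looks_like(candidate: str, protected: str) -> bool:
    # Different name, but same visual skeleton => likely homographic impersonation.
    return candidate != protected and skeleton(candidate) == skeleton(protected)

print(looks_like("paypaI.com", "paypal.com"))        # True
print(looks_like("p\u0430ypal.com", "paypal.com"))   # True (Cyrillic 'a')
print(looks_like("accounts-bank.com", "bank.com"))   # False: combosquatting is not
                                                     # caught by this kind of check
```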
The concerns about misleading certificates and domain names are some of the many challenges which a designer faces when trying to protect systems involving human users; we look a bit deeper into this important topic in Chapter 9.

8.1.2 Security goals of PKI schemes

X.509 security goals. The basic, high-level goal of a public key infrastructure (PKI) is to allow a relying party to ensure that it uses a valid public key for its specific needs and application. The mere fact that X.509 certificates are signed by the CA may seem to ensure this goal; however, there are two caveats. First, X.509 allows certification not only by root CAs, trusted directly by the relying party, but also by intermediate CAs, based on a precise policy of the root CA. Second, relying parties should not be fooled into relying on a revoked certificate. These two caveats imply two corresponding security requirements:

Accountability: assume a relying party validated a certificate c, optionally using some 'additional data' D (typically, additional certificates). Yet, assume that the entity identified as the subject in c denies 'owning' the certified public key. Then we can identify an accountable CA, CAA, which has signed some certificate cA, whose subject denies 'owning' the public key certified in cA. If c was issued by a root CA, then D is unnecessary, CAA is the root CA, and cA = c. See X.509's certificate-path mechanisms in Section 8.3.

Revocation: assume that at time t, a CA revokes a certificate c (that it previously issued). Then, after some bounded delay ∆, no relying party will consider the certificate as valid.

Post-X.509 security goals. There are several 'post-X.509' PKI designs which aim to address additional requirements, including:

Transparency: the set of all issued certificates is transparent, i.e., publicly known. See Section 8.6.

Revocation-status transparency: the revocation status of a certificate is publicly known, i.e., it is known whether a particular certificate was revoked.

Equivocation prevention: there cannot exist two valid yet equivocating certificates, i.e., different identity certificates for the same identifier (e.g., domain name).

Equivocation detection: any pair of equivocating certificates would be detected within bounded time after the second one is issued.

Relying-party privacy: the PKI mechanism does not expose which certificates are validated by a given relying party. One case where this does not hold is when using the OCSP protocol (Section 8.4).

8.1.3 The Web PKI

In Chapter 7 we discussed the use of certificates by the SSL/TLS protocol, used to secure web traffic and other applications. The SSL or TLS client (often, the browser) receives a certificate (pk, ATTR, σ) authenticating the server's public key pk, and binding it to the server's domain, e.g., bob.com, which is specified as one of the attributes in ATTR. The certificate contains a signature σ; for the certificate to be valid, σ must be a signature over (pk, ATTR) which validates correctly using the public validation key of some trusted certificate authority (CA).

In web security applications, each browser maintains and/or uses a list of trusted root certificate authorities (root CAs); often, browsers use a list of trusted root CAs maintained by the operating system, possibly combined with a browser-maintained list. These root CAs can also certify
additional CAs, referred to as intermediate CAs; we explain the process later on. To support SSL or TLS, a web server for a domain, e.g., bob.com, needs a certificate for that domain, signed by a root or intermediate CA, and, of course, it needs to know and use the corresponding private key.

Note that SSL and TLS also support (optional) client authentication; however, client authentication requires a client certificate. Only few traditional clients, such as browsers, have such client certificates, but their use is more common for IoT devices. Similarly, client certificates are required for end-to-end secure email services, e.g., using S/MIME [325]; again, only a tiny fraction of users went through the process of obtaining a client certificate, and as a result, these secure email services are not widely used. The difficulties of obtaining client certificates are probably one reason why most secure messaging applications rely on authentication by the provider, and sometimes also by the peer user, but not on client certificates.

For further discussion, focusing on weaknesses of the current Web PKI and some solutions, see subsection 8.5.1.

8.2 The X.509 PKI

In this section, we discuss the basic notions of the X.509 PKI standard, which was developed as part of the X.500 global directory standard. X.509 is the most widely deployed PKI specification, and also includes some of the more advanced PKI concepts which we cover in the following sections.

8.2.1 The X.500 Global Directory Standard

X.500 [95] is an ambitious, extensive set of standards of the International Telecommunication Union (ITU), a United Nations agency whose role is to facilitate international connectivity in communications networks. The goal of X.500 is to facilitate the interconnection of directory services provided by different organizations and systems. The first version of X.500 was published as early as 1988, and numerous extensions and updates were published over the years.

The basic idea of X.500 is to provide a trusted, unified and ideally global directory, by combining the data and services of its multiple component directories. Such a unified directory would be operated by cooperation between trustworthy providers, such as telecommunication companies. Significant aspects of X.500, such as the distinguished names, are deployed by LDAP and other directory services [208], although these are far from the vision of a global directory. Among the possible reasons for that are the high complexity of the X.500 design, concerns that X.500 interoperability may cause exposure of sensitive information, and lack of sufficient trust among different directory providers.

However, some concepts from X.500 live on; we already mentioned LDAP as one example. More relevantly to our subject, the X.500 recommendation contributed extensively to the development of PKI schemes. The X.500 designers observed that an interoperable directory should bind standard identifiers to standard attributes. One important set of attributes defines the public key(s) of each entity. The entity's public encryption key allows relying parties to encrypt messages so that only the intended recipient may decrypt them. Similarly, the entity's public validation key allows relying parties to validate statements signed by the entity. We next discuss the main form of standard identifier defined in X.500: the distinguished name.
8.2.2 The X.500 Distinguished Name

The design of X.500 was extensively informed by the experience of telecommunication companies at the time, which included provision of directory services to phone users. Phone directory services are mostly based on looking up the person's common name; the common name has the obvious advantage of being a meaningful identifier - we usually know the common name of a person when we ask the directory for that person's information. Phone directories would normally also allow specification of the relevant area, e.g., in the form of a locality; by limiting search to specific areas or localities, the directory services can be decentralized. However, obviously, a common name is not a unique identifier - in fact, some common names are quite common, if you excuse the pun. In classical phone directories, this is addressed by returning a set of results containing all relevant entries, along with the relevant common name and other attributes (e.g., location).

The X.500 designers decided that, in order to allow efficient use of large, global directories, returning multiple results is not a viable option. Instead, they decided to use a more refined identifier, with multiple keywords - where the common name is simply one of these keywords. This identifier is the X.500 Distinguished Name (DN). The distinguished name was designed to satisfy the following three main goals for identifiers:

Meaningful: identifiers should be meaningful and recognizable by humans. This makes it easier to memorize the identifier, as well as to link it with an off-net identifier, with potential legal and reputation implications.

Unique: identifiers should be unique, i.e., different subjects should have different identifiers, allowing each identifier to be mapped to a specific subject.

Decentralized management: multiple, 'distributed' issuers can issue identifiers, without restrictions, i.e., any issuer is allowed to issue any identifier.

The uniqueness requirement is an obvious challenge, as common names are obviously not unique. To facilitate unique DNs for people sharing the same common name, X.500 distinguished names consist of a sequence of several keyword-value pairs. The inclusion of multiple keywords - also referred to as attributes - helps to ensure unique identification, when combined with the common name. Typical, standard keywords are shown in Table 8.1; however, a directory is free to use any keyword it desires.

Table 8.1: Standard keywords/attributes in X.500 Distinguished Names
  C  - Country
  L  - Locality
  O  - Organization name
  OU - Organization unit
  CN - Common name

Figure 8.2: Example of the X.500 (and X.509) Distinguished Name (DN) hierarchy.

To satisfy the 'meaningful' goal, identifiers should have readable representations. RFC 1779 [233] specifies a popular string representation for distinguished names, where keyword-value pairs are separated by the equal sign, and different pairs are separated by a comma or semicolon, optionally also with spaces. Other representations are possible too, e.g., Figure 8.2 includes an encoding of a DN using slashes for separation. Let us give two simple examples of different legitimate interpretations (and implementations) of the RFC 1779 representation:

1. As illustrated in Figure 8.2, the distinguished name (DN) for a police officer named John Doe in the Soho precinct of the NYPD may be defined as: C=US/L=NY/O=NYPD/OU=soho/CN=John Doe.
2. The distinguished name (DN) for an IBM UK employee with the name Julian Jones may be written as: CN=Julian Jones, O=IBM, C=GB. Read below on the author's experience with this (realistic) DN.

Note that the two examples use a different order of keywords, with the most specific term being on the right in the first and on the left in the second; this kind of ambiguity is a source of implementation bugs, as indeed happened for a few implementations; it may also cause vulnerabilities.

Note that the keyword-value pairs comprising an X.500 distinguished name are specified in a sequence, i.e., as an ordered list. This allows the distinguished names to be organized as a hierarchy, using the sequence of keywords as the nodes, as illustrated in Figure 8.2. By assigning a specific, single entity to assign identifiers in a sub-tree of the X.500 DN hierarchy, this entity can ensure uniqueness by never allocating the same identifier (DN) to two different subjects; e.g., the Soho precinct of the NYPD may maintain its own sub-directory. This also allows queries over the entire set of distinguished names that begin with a particular prefix of keyword-value pairs. However, this implies that X.500 distinguished names cannot be issued in an entirely decentralized manner - some control and coordination of the allocation of identifiers is required.

Furthermore, there are also some caveats with respect to the other goals - unique and meaningful identifiers. Let us first consider the goal of meaningful identifiers. While the use of subdivisions such as 'organization unit (OU)' may help to reduce the likelihood of two persons with the same common name falling in the same 'bin', this possibility still exists. As a result, administrators may have to enforce uniqueness by 'modifying' the common name. For example, if there are multiple IBM UK employees with the name Julian Jones, one of them may be assigned the DN: CN=Julian Jones2, O=IBM, C=GB. This results in less meaningful distinguished names; e.g., it is easy to confuse the DNs of the two employees. For example, the author has sent to CN=Julian Jones, O=IBM, C=GB messages intended for CN=Julian Jones2, O=IBM, C=GB, when using an email system that used distinguished names as email addresses. Luckily, both Julians were understanding of the mistake.

Another cause of mistakes and ambiguity is the fact that there are no rules governing the order of the keywords, i.e., the structure of the hierarchy, as is evident from the two examples we presented. In particular, some multinational organizations may use the country as the top-level category, as in CN=Julian Jones, O=IBM, C=GB, while others may view the organization itself as the top-level category, as in CN=Julian Jones, C=GB, O=IBM. These two distinguished names are different; this distinction may not be obvious to a non-expert, further detracting from the goal of 'meaningful' names.

There are also cases where uniqueness is not guaranteed. Some namespaces are shared by design, and cannot be segregated with a single authority assigning identifiers in each segment. For example, consider Internet domain names; multiple registrars are authorized to assign names in several top-level domains such as com and org. There is a coordination process between registrars, but if it is not followed correctly, conflicts may occur.
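The following small Python sketch parses such strings into ordered keyword-value pairs and illustrates the order ambiguity discussed above. It is a simplification: it ignores the escaping and quoting rules of RFC 1779.

```python
# Sketch: parse an RFC 1779-style string into an ordered list of (keyword, value)
# pairs, illustrating that distinguished names are ordered - the same pairs in a
# different order form a *different* DN. Simplified: no escaping or quoting.
def parse_dn(dn: str):
    pairs = []
    for part in dn.replace(";", ",").split(","):
        keyword, _, value = part.strip().partition("=")
        pairs.append((keyword.strip().upper(), value.strip()))
    return pairs

dn1 = parse_dn("CN=Julian Jones, O=IBM, C=GB")
dn2 = parse_dn("CN=Julian Jones, C=GB, O=IBM")

print(dn1 == dn2)            # False: same pairs, different order => different DNs
print(set(dn1) == set(dn2))  # True: an implementation that ignores order would
                             # (incorrectly, per the standard) treat them as equal
```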
This problem is more severe with respect to public key certificates for Internet domain names, which can be issued by multiple Certificate Authorities; any faulty authority may issue a certificate to an entity who does not rightfully own the certified domain name. Such incidents have occurred, often due to intentional attack; e.g., see Table 8.5. This is a major concern for the Web PKI, as well as for PKI in general, and we discuss it further later in this chapter.

We conclude that X.500 distinguished names are not perfectly meaningful and definitely not decentralized; furthermore, sometimes, distinguished names may not even perfectly ensure uniqueness. Indeed, there seems to be an inherent challenge in satisfying all three goals, although achieving any two of these three properties is definitely feasible - a classical trilemma scenario.

The identifiers trilemma. We argued that X.500 distinguished names may fail to ensure each of the three goals defined above: uniqueness, meaningfulness and decentralized management. In contrast, several other identifiers ensure pairs of these three properties:

Common names are meaningful - and decentralized, as any person can decide on the name. However, they are definitely not unique.

Public keys and random identifiers are decentralized and (except for very rare collisions) unique. However, obviously, public keys are not directly meaningful to humans.

Email addresses are unique and meaningful. However, they are not decentralized, since each issuer can only assign identifiers (email addresses) in its own domain.

This begs the question: is there a scheme which will ensure identifiers that fully satisfy all three properties, i.e., would be unique, meaningful, and managed and issued in a decentralized way? It seems that this may be hard or impossible, i.e., it may be possible to fully ensure only two of these three goals, but not all three. We refer to this challenge as the Identifiers Trilemma, and illustrate it in Figure 8.3. (This challenge is also referred to as Zooko's triangle; however, Zooko apparently referred to a different trilemma, albeit one also related to identifiers. Specifically, Zooko considered the challenge of identifiers which would be distributed, meaningful for humans, and also self-certifying, allowing recipients to locally confirm the mapping from name to value.)

Figure 8.3: The Identifiers Trilemma: the challenge of co-ensuring unique, decentralized and meaningful identifiers.

Additional concerns regarding X.500 Distinguished Names. We conclude our discussion of X.500 distinguished names by discussing a few additional concerns.

Privacy. The inclusion of multiple categorizing fields in X.500 DNs may expose information in an unnecessary, and sometimes undesired, manner. For example, employees may not always want to expose their location or organizational unit.

Flexibility. People may change locations, organization units and more; with X.500 DNs, this may result in an 'incorrect' DN, or require a change of the DN - both undesirable.

Usability. X.500 DNs are designed to be meaningful, i.e., users can easily understand the different keywords and values. However, sometimes this may not suffice to ensure usability. In particular, consider two of the most important applications for public key cryptography and certificates: secure web-browsing and secure email/messaging.
Secure web-browsing: users, as well as hyperlinks, specify the desired website using an Internet domain name, and not a distinguished name. Hence, the relevant identifier for the website is that domain name - provided by the user or in the hyperlink. This requires mapping from the domain name to the distinguished name. A better solution is for the certificate to directly include the domain name; this is supported by the SubjectAltName extension, defined by PKIX, see subsection 8.2.6.

Secure email/messaging: users also do not use distinguished names to identify the peers with whom they communicate using email and instant messaging applications. Instead, they use email addresses - or application-specific identification. This problem may not be as significant, since most end users do not have a public key certificate at all; and, again, PKIX allows certificates to directly specify an email address.

8.2.3 X.509 Public Key Certificates

The X.500 standard included a dedicated sub-standard, X.509, which defined authentication mechanisms, allowing entities to authenticate themselves to the directory. X.509 defined multiple authentication mechanisms, e.g., the use of password-based authentication. However, one of these authentication methods became a very important, widely used standard: the X.509 public key certificate.

Originally, the main goal of the X.509 authentication was to allow each entity to maintain its own record with the directory, e.g., to change an address. However, it was soon realized that public key certificates allow many more applications, since they allow recipients to authenticate the public key of a party without requiring any prior communication. As a result, X.509 certificates became a widely deployed standard, which is used for SSL and TLS, code signing, secure email (S/MIME), IPsec and more. All this use is in spite of complaints about the complexity of the X.509 specifications and encoding formats - obviously, the wide use is also one reason for the numerous complaints. For details of the encoding, see [226, 376]; see also [306].

Figure 8.4: X.509 version 1 certificate. Note the fields added in later versions, mainly version 3 of X.509 (Figure 8.5), most notably the extensions field.

The definition of the X.509 certificates did not change much from the first version of X.509; the contents (fields) of that first version are shown in Figure 8.4. These fields are, by their order in the certificate:

Version: the version of the X.509 certificate and protocol.

Certificate serial number: a serial number of the certificate, unique among all of the certificates issued by this CA. PKIX [104] specifies that the serial number should be a positive integer of up to 20 bytes, i.e., up to 159 bits. The best practice is to select the serial number randomly, not sequentially. The motivation is attacks [367, 368] that manipulate a CA into issuing a certificate whose hash collides with the contents of a different certificate, when using predictable serial numbers together with a hash function which has the chosen-prefix collisions vulnerability, such as MD5 or SHA-1 (subsection 3.3.1).

Signature-process Object Identifier (OID): this is an identifier of the process used for signing the certificate, typically using the Hash-then-Sign paradigm. This identifier specifies both the underlying public key signature algorithm, e.g., RSA, as well as the hash algorithm, e.g., SHA-256.
The algorithm may be written as a string for readability, and standard string terms are used for widely used methods, e.g., sha256WithRSAEncryption; notice the use of the term 'RSA Encryption' when referring to RSA signatures - a common misnomer. In the certificate itself, the algorithm is typically specified using the Object Identifier (OID) standard; see Note 8.1.

Issuer Distinguished Name: the distinguished name of the certificate authority which issued, and signed, the certificate.

Validity period: the period of time during which the certificate is to be considered valid.

Subject Distinguished Name: the distinguished name of the subject of the certificate, i.e., the entity to whom the certificate was issued. This entity is expected to know the private key corresponding to the certified public key.

Subject public key information: this field contains the public key of the subject, and an object identifier (OID, see Note 8.1) that identifies the algorithm with which the key is used, including key length, e.g., RSA/2048. The allowed usage of the certified public key - e.g., to encrypt messages sent to the subject, or to validate signatures by the subject - is specified in the KeyUsage extension (subsection 8.2.7), not in the subject public key information field.

Signature: finally, this field contains the result of applying the signature algorithm (identified by the signature-process OID field above) to all of the other fields in the certificate, using the private signing key of the issuer (certificate authority). The sequence of all these fields in the certificate, excluding the signature field itself, is referred to as the to-be-signed fields; see Figure 8.4 and Figure 8.5. This allows the relying party to validate the authenticity of the fields in the certificate, e.g., the validity period, the subject distinguished name, and the subject public key.

Exercise 8.1. Provide a security motivation for the fact that the signature process is specified as one of the (signed) fields within the certificate. Do this by constructing two 'artificial' CRHFs, hA and hB; to construct hA and hB, you may use a given CRHF h. Your constructions should allow you to show that it could be insecure to use certificates where the signature process (incl. hashing) is not clearly identified as part of the signed fields. Specifically, design hA, hB to show how an attacker may ask a CA to sign a certificate for one name, say Attacker, and then use the resulting signature over the certificate to forge a certificate for a different name, say Victim.

X.509 Certificates: Versions 2 and 3. Following X.509 version 1, the X.509 certificates were extended by a few additional fields; see Figure 8.5. Version 2 of X.509 added two fields, both of them unique identifiers: one for the subject and one for the issuer (CA). These fields were defined to ensure uniqueness, in situations where the distinguished name may fail to ensure uniqueness, as discussed in subsection 8.2.2. However, these unique-identifier fields are not in wide use, as they are entirely unrelated to the meaningful identifiers used in typical applications.

Note 8.1: Object identifiers (OIDs). The joint ITU and ISO ASN.1 standard [93, 129] defines the concept of an object identifier (OID) as a unique identifier for arbitrary objects. Object identifiers are specified as a sequence of numbers, e.g., 1.16.180.1.45.34, separated by dots (as shown) or spaces. OID numbers are assigned hierarchically to organizations and to 'individual objects'; when an organization is assigned a number, e.g., 1.16, it may assign OIDs whose prefix is 1.16 to other organizations or directly to objects, e.g., 1.16.180.1.45.34. The top-level numbers are either zero (0), allocated to ITU, 1, allocated to ISO, or 2, allocated jointly to ISO and ITU. RFC 3279 [28] defines OIDs for many cryptographic algorithms and processes used in Internet protocols, e.g., RSA, DSA and elliptic-curve signature algorithms; when specifying a signature process, the OID normally specifies both the underlying public key signature algorithm and key length, e.g., RSA/2048, and the hashing function, e.g., SHA-256, used to apply the Hash-then-Sign process. X.509 uses OIDs to identify signature algorithms and other types of objects, e.g., extensions and issuer policies. The use of OIDs allows identification of the specific type of each object, which helps interoperability between different implementations.
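As a small illustration, the signature-process OID of a real certificate can be inspected with the Python cryptography package; the certificate file name below is a placeholder, and this is only one of several ways to look at the field.

```python
# Sketch: inspect the signature-process OID of an X.509 certificate, using the
# Python 'cryptography' package; the certificate file name is a placeholder.
from cryptography import x509
from cryptography.x509.oid import SignatureAlgorithmOID

with open("bob.com.pem", "rb") as f:               # placeholder PEM file
    cert = x509.load_pem_x509_certificate(f.read())

oid = cert.signature_algorithm_oid
print(oid.dotted_string)                           # e.g., 1.2.840.113549.1.1.11
print(oid == SignatureAlgorithmOID.RSA_WITH_SHA256)
print(cert.signature_hash_algorithm.name)          # e.g., 'sha256'
```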
Figure 8.5: X.509 version 3 certificate. Version 2 is identical, except for not having the extensions field; version 1 also does not have the two 'unique identifier' fields (Figure 8.4).

Version 3 of X.509 (X.509v3) is the one in practical use; the main reason for its wide success is that it dramatically increased the expressiveness of X.509 certificates. As can be seen in Figure 8.5, this dramatic improvement is due to just one new field added in version 3: the general-purpose extensions field. The extensions field provides extensive flexibility and expressiveness to certificates, and facilitates many applications and use cases; this field is typically much longer than all other fields combined. The X.509v3 extensions mechanism is the subject of the next subsection.

8.2.4 The X.509v3 Extensions Mechanism

As shown in Figure 8.5, X.509 certificates, from version 3, include a field that can contain one or more extensions. We discuss some specific, important extensions in the following subsections. But first, let us discuss the extensions mechanism itself, since it has a rather clever design, which balances the need to allow extendibility against the concern of using a certificate incorrectly (due to ignoring or incorrectly handling an extension). Each extension has the following three components:

Extension identifier: specifies the type of the extension. The extension identifier is specified using an object identifier (OID), to facilitate interoperability. The following subsections discuss some important extensions, e.g., key usage and name constraints.

Extension value: this is an arbitrary string which provides the value of the extension. For example, a possible value for the key-usage extension would indicate that the certified key is to be used as a public encryption key, while a possible value for the name-constraints extension may be Permit C=GB, allowing the subject of the certificate to issue its own certificates, but only with the value 'GB' (Great Britain) for the 'C' (country) keyword.

Criticality indicator: this is a binary flag, i.e., an extension can be marked as critical or as non-critical.
The value of the criticality indicator flag in an extension instructs relying parties how to handle the certificate if the relying party is not familiar with this type of extension, as indicated by the extension identifier. A relying party should not use a certificate which includes an extension marked as critical, if the relying party is not familiar with this type of extension. Relying parties can use a certificate even if it contains an extension of a type not known to the relying party, ignoring that extension, if the extension is marked as non-critical. When the relying party is familiar with the type of an extension, the value of the criticality indicator is not applicable.

The criticality indicator flag is a simple mechanism - but a very valuable one, since it allows both critical and non-critical extensions. The flexibility offered by the 'criticality indicator' makes the X.509 extensions mechanism very versatile; it is a pity that this idea has not been adopted by other extension mechanisms. For example, TLS clients and servers simply ignore unknown TLS extensions, i.e., treat them as non-critical, as discussed in Chapter 7. It would have been useful if TLS also allowed the definition of critical extensions, i.e., instructing a TLS peer to refuse the connection if the peer sends a critical but unknown TLS extension. This can be achieved quite easily; see the next exercise.

Exercise 8.2. Design how TLS may be extended to support critical extensions. Could you achieve this using the existing TLS extensions mechanism?

X.509, as well as PKIX and other X.509 profiles, define some extensions to be always marked critical, others to be always marked non-critical, and others to be marked differently depending on needs. We next present examples of each of these three types of extensions, focusing on standard extensions.

Example 8.1. The TLS feature X.509 extension: defined to be used as a non-critical extension. An X.509 extension called TLS feature is defined in RFC 7633 [184]. This TLS feature extension is used in TLS server certificates, to indicate that the server supports a specific TLS extension (see subsection 7.4.3). The name chosen for this X.509 extension is TLS feature, rather than TLS extension, to make it clearer that the TLS feature is an X.509 extension, and only refers to the support for a specific TLS extension; unfortunately, confusion is still natural. The TLS feature X.509 extension allows the server to indicate to the client that the server supports certain important TLS extensions ('features'). Some TLS clients may not support the TLS feature X.509 extension, so if this extension were marked critical, these clients would reject the certificate, and the connection would fail. Hence, the TLS feature X.509 extension should be marked as non-critical. Note that the TLS feature extension is not one of the standard extensions defined in either X.509 or PKIX; it was developed later, specifically to allow a certificate to mark that the server always uses the must-staple TLS extension, see subsection 8.4.3.

Example 8.2. The extended key usage extension: can be either critical or non-critical. The extended key usage extension allows the issuer to define allowed usages for the certified public key, in addition to, or in place of, the usage specified in the key usage extension.
In some scenarios, the 'extended key usage' extension should be critical, e.g., to prevent incorrect usage based on the key usage extension, by clients not supporting extended key usage. In other scenarios, the extended key usage extension should be non-critical, e.g., when allowing some additional usage over that specified already in the key usage extension.

Example 8.3. The key usage extension: always critical in PKIX, and a motivating attack. PKIX specifies that the key usage extension must be marked critical, while X.509 allows the key usage extension to be marked as either critical or non-critical. Let us first give a contrived example of a possible attack exploiting a certificate where key-usage was not marked as critical, causing a relying party who does not understand this extension to make a critical security mistake.

Assume that the parties (the key-owner, i.e., the certificate subject, and the relying party) use 'textbook RSA' encryption, i.e., encrypt plaintext mE by computing c = (mE)^e mod n; and 'textbook RSA' signing, i.e., sign message mS by outputting σ = h(mS)^d mod n, i.e., 'decrypting' the hash of the message. Furthermore, assume the key-owner uses its decryption key to authenticate that it is active at a given time, by decrypting an arbitrary challenge ciphertext sent to it; this requires only a relatively weak form of ciphertext-attack resistance, where the attacker must ask for the decryption before seeing the challenge ciphertext it must decrypt, often referred to as IND-CCA1 security and assumed for textbook RSA. A key-owner using this mechanism must use its key only for decrypting these challenges; assume it receives a certificate CE for its encryption key e, with the key-usage extension correctly marking this as an encryption key, but not marked as critical.

An attacker may abuse this, together with the fact that key usage is not understood by some relying parties, to mislead these relying parties into thinking that the key-owner signed some attacker-chosen message mA, as follows. The attacker computes cA = h(mA) and sends it to the key-owner, as if it were a standard challenge ciphertext to be decrypted. The key-owner therefore decrypts cA and outputs the decryption, cA^d mod n = h(mA)^d mod n, which we denote by σA, i.e., σA ≡ h(mA)^d (mod n). Now the attacker sends the pair (mA, σA), along with the certificate CE, to the relying party, claiming that mA was signed by the key-owner with signature σA. Since the relying party is not familiar with the key-usage extension, and it was not marked critical in the key-owner's certificate CE, the relying party would validate (mA, σA), which would validate correctly, and thereby incorrectly consider mA as validly signed by the key-owner.

Let us also point out a more practical attack on a TLS 1.3 client that does not correctly implement the key usage extension. If the TLS server also runs any older version of TLS (or SSL) that is vulnerable to some variant of the Bleichenbacher attack, then the attacker may be able to forge an RSA signature using the private (decryption) key of the old, vulnerable version. If the client ignores the key usage extension and uses the public key for verifying the signature, the attacker succeeds in a cross-protocol attack on TLS 1.3, even if the server has correctly separated between the TLS 1.3 signature-verification public key and the public encryption key used by the old, vulnerable version.
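The following toy Python example works through the textbook-RSA confusion attack above. The tiny RSA parameters and the hash reduced modulo n are purely for illustration; the point is that the key-owner's 'challenge decryption' of h(mA) is exactly a valid textbook-RSA signature on mA.

```python
# Worked toy example of the key-usage confusion attack on 'textbook RSA' described
# above. Tiny RSA parameters and a hash reduced mod n are used for illustration
# only; real RSA moduli are thousands of bits and unpadded RSA must never be used.
import hashlib

n, e, d = 3233, 17, 2753           # toy RSA key: n = 61*53, e*d = 1 mod phi(n)

def h(message: bytes) -> int:
    # Toy hash mapping into Z_n, standing in for the 'hash' in Hash-then-Sign.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def owner_decrypt(c: int) -> int:
    # The key-owner's 'challenge decryption' service: returns c^d mod n.
    return pow(c, d, n)

def verify_signature(message: bytes, sigma: int) -> bool:
    # A relying party verifying a textbook-RSA signature: sigma^e mod n == h(m)?
    return pow(sigma, e, n) == h(message)

m_A = b"pay the attacker 1000"
c_A = h(m_A)                       # attacker submits h(m_A) as a 'ciphertext'
sigma_A = owner_decrypt(c_A)       # the decryption is exactly a signature on m_A
print(verify_signature(m_A, sigma_A))   # True: the forged 'signature' verifies
```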
8.2.5 Trust-Anchor Certificate Validation

Upon receiving a certificate, the relying party must decide whether it can rely on and use the certified public key, for a particular application. In this section, we focus on the case of certificates signed by a trust anchor CA, i.e., a CA trusted by the relying party. In this case, the relying party applies the certificate validation process, using the public signature-validation key of the CA, CA.v, to determine whether the given certificate is valid. If the certificate is not signed by a trust anchor, then the relying party should first perform the certification path validation process, to decide whether to trust this certificate, based on additional certificates; we discuss this in Section 8.3.

Assume, therefore, that a relying party receives a certificate signed (issued) by a trust anchor, i.e., the relying party trusts the issuing CA, denoted I, and knows its public validation key I.v. To validate the certificate, the relying party uses I.v and the contents of the certificate, as follows:

Issuer. The relying party verifies that the issuer I of the certificate, as identified by the issuer distinguished name field, is a trusted CA, i.e., a trust anchor (root CA, for Web PKI).

Validity period. The relying party checks the validity period specified in the certificate. If the public key is used for encryption, or to validate signatures on responses to challenges sent by the relying party, then the certificate should be valid at the relevant times, including the current time. If the public key is used to validate signatures generated in the past, then the certificate should have been valid at a time when these signatures already existed, possibly attested by supporting validation by trusted time-stamping services.

Subject. The relying party verifies that the subject, identified in the subject field using the distinguished name, is the entity that the relying party expected. For example, when the relying party is a browser and it receives a website certificate, the relying party should confirm that the website identity (e.g., domain name) is the same as indicated in the 'subject distinguished name' field of the certificate.

Signature algorithms. The relying party confirms that it can apply and trust the validation algorithm of the signature scheme identified in the signature algorithm OID field of the certificate. If the certificate is signed using an unsupported algorithm, or an algorithm known or suspected to be insecure, validation fails.

Issuer and subject unique identifiers. From version 2, X.509 certificates also include fields for unique identifiers of the issuer and the subject, which the relying party should use to further confirm their identities. In PKIX, these identifiers are usually not used, and PKIX does not require their validation. This is probably because in PKIX, the issuer and subject identifiers are typically in corresponding extensions.

Extensions. The relying party validates that it is familiar with any extension marked as critical; the existence of any unrecognized extension marked as critical invalidates the entire certificate. Then, the relying party validates the existence and contents of any extension that its policy requires.
To avoid incompatibilities, relying parties and CAs usually follow agreed-upon policies for the required and permitted extensions, often referred to as a PKI profile, such as PKIX from the IETF [104] and the profiles defined by the CA/Browser Forum [152].

Validate signature. The relying party next uses the trusted public validation key of the CA, CA.v, and the signature-validation process as specified in the certificate, to validate the signature over all the 'to be signed' fields in the certificate, i.e., all fields except the signature itself.

8.2.6 The SubjectAltName and the IssuerAltName Extensions

Both X.509 and PKIX define the standard SubjectAltName (SAN) and IssuerAltName (IAN) extensions, providing alternative identification mechanisms (names) to complement or replace the Distinguished Name mechanism, identifying, respectively, the subject and the issuer. These alternative fields allow the use of other forms of names, identifiers and addresses for the subject and/or the issuer. Note that a certificate may contain multiple SANs.

The most important form of an alternative name is a Domain Name System (DNS) name, referred to as dNSName, e.g., example.com. These dNSNames are used by most Internet protocols, and are familiar to most users. Also allowed, but rarely used, alternative names include email addresses, IP addresses, and URIs. In fact, the use of alternative names is so common that in many PKIX certificates, the subject and the issuer distinguished-name fields are left empty. Indeed, PKIX (RFC 5280) specifies that this must be done when the Certificate Authority can only validate one (or more) of the alternative name forms, which is often the case in practice. PKIX specifies that in such cases, where the SubjectAltName extension is the only identification and the subject distinguished name is empty, the extension should be marked as critical; otherwise, when there is a subject distinguished name, it should be marked as non-critical. Note that PKIX (RFC 5280) specifies that the Issuer Alternative Name extension should always be marked as non-critical. In contrast, the X.509 standard specifies that both alternative-name extensions may be flagged as either critical or non-critical.

Also, note that implementations of the Secure Socket Layer (SSL) and Transport Layer Security (TLS) protocols often allow wildcard certificates, which, instead of specifying a specific domain name, use the wildcard notation to specify a set of domain names. For TLS, this support is clearly defined in RFC 6125 [342]. Wildcard domain names are domain names where some of the alphanumeric strings are replaced with the wildcard character '*'; there are often restrictions on the location of the wildcard character, e.g., it may be allowed only in the complete left-most label of a DNS domain name, as in *.example.com. Wildcard domain names are not addressed in PKIX (RFC 5280) or X.509, and RFC 6125 mentions several security concerns regarding their use.
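As an illustration, here is a simplified name-matching sketch in Python, allowing a wildcard only as the complete left-most label, in the spirit of the restrictions just mentioned. It is not a complete RFC 6125 implementation; the function names are made up for the example.

```python
# Sketch: match a server's hostname against dNSName entries, allowing a wildcard
# only as the complete left-most label (as in *.example.com). Simplified.
def dns_name_matches(pattern: str, hostname: str) -> bool:
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if p_labels[0] == "*":
        # The wildcard covers exactly one left-most label, never further labels.
        return len(p_labels) == len(h_labels) and p_labels[1:] == h_labels[1:]
    return p_labels == h_labels

def certificate_matches(san_dns_names, hostname: str) -> bool:
    # A certificate may carry multiple dNSName SANs; any single match suffices.
    return any(dns_name_matches(p, hostname) for p in san_dns_names)

print(certificate_matches(["bob.com", "*.bob.com"], "www.bob.com"))  # True
print(certificate_matches(["*.bob.com"], "bob.com"))                 # False
print(certificate_matches(["*.bob.com"], "a.b.bob.com"))             # False
```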
8.2.7 Standard key-usage and policy extensions

We next discuss another set of standard extensions, defined in the X.509 standard with further details in PKIX and other PKI profiles. These extensions deal with the usage of the certified key and with the certificate policies related to the issuing and the usage of the certificate. Some of the more important extensions, and their recommended usage as per the PKIX profile, include:

The authority key identifier extension. Provides an identifier for the issuer's public key, allowing the relying party to identify which public validation key to use to validate the certificate, if the issuer has multiple public keys. It is always non-critical.

The subject key identifier extension. Provides an identifier for the certified subject's public key, allowing the relying party to identify that key when necessary, e.g., when validating a signature signed with one of several signature keys of the subject - including signatures on (other) certificates. It is always non-critical.

The key usage extension. The key usage extension defines the allowed usages of the certified public key of the subject, including signing, encryption and key exchange. The specification allows the use of the same key for multiple purposes, e.g., encryption and validating signatures; however, this should not be done, as the use of the same key for such different purposes may be vulnerable - security would not follow from the separate security definitions for encryption and for signatures. An exception is when using schemes designed specifically to allow both applications, such as signcryption schemes. The PKIX standard [104] requires this extension to be marked as critical; see subsection 8.2.4.

The extended key usage extension. The extended key usage extension allows definition of specific purposes for which the key is to be used, as supported by relying parties. The specification also allows the CA to indicate that other uses, as defined by the key-usage extension, are also allowed; otherwise, only the specified purposes are allowed. This extension may be marked as critical or not; see subsection 8.2.4.

The private key usage period extension. This extension is relevant only for certification of signature-validation public keys; it indicates the allowed period of use of the private key (to generate signatures). Always marked non-critical.

The certificate policies extension. This extension identifies one or more certificate policies which apply to the certificate; for a brief discussion of certificate policies, see subsection 8.2.8. The extension identifies certificate policies using object identifiers (OIDs). In particular, the policy OID in the certificate policies extension is the main mechanism to identify the type of validation of the legitimacy of the certificate, performed by the CA before it issued the certificate - Domain Validation (DV), Organization Validation (OV) or Extended Validation (EV). For more discussion of certificate policies and of the three types of validation, see subsection 8.2.8. The certificate policies extension may be marked as critical or as non-critical.

The policy mappings extension. This extension is used only in certificates issued to another CA, called CA certificates. It specifies that one of the issuer's certificate policies can be considered equivalent to a given (different) certificate policy used by the subject (certified) CA. This extension may be marked as critical or as non-critical.

Exercise 8.3. Some of the extensions presented in this subsection should always be non-critical, while others may be marked either critical or non-critical. Justify each of these designations by appropriate examples.
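As an illustration of how a relying party might read a few of these extensions, and reject unrecognized critical ones as discussed in subsection 8.2.4, here is a sketch using the Python cryptography package. The certificate file name is a placeholder, and the sketch assumes the listed extensions are present (get_extension_for_oid raises an exception otherwise).

```python
# Sketch: reading a few standard extensions from a certificate with the Python
# 'cryptography' package, and rejecting unrecognized critical extensions.
from cryptography import x509
from cryptography.x509.oid import ExtensionOID

RECOGNIZED = {ExtensionOID.KEY_USAGE, ExtensionOID.EXTENDED_KEY_USAGE,
              ExtensionOID.BASIC_CONSTRAINTS, ExtensionOID.SUBJECT_ALTERNATIVE_NAME,
              ExtensionOID.CERTIFICATE_POLICIES}

with open("bob.com.pem", "rb") as f:               # placeholder PEM file
    cert = x509.load_pem_x509_certificate(f.read())

for ext in cert.extensions:
    if ext.critical and ext.oid not in RECOGNIZED:
        raise ValueError(f"unrecognized critical extension {ext.oid.dotted_string}")

key_usage = cert.extensions.get_extension_for_oid(ExtensionOID.KEY_USAGE).value
print("digital_signature allowed:", key_usage.digital_signature)

san = cert.extensions.get_extension_for_oid(
    ExtensionOID.SUBJECT_ALTERNATIVE_NAME).value
print("dNSNames:", san.get_values_for_type(x509.DNSName))
```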
8.2.8 Certificate policy (CP) and Domain/Organization/Extended Validation

A certificate policy (CP) is a set of rules that indicate the applicability of the certificate to a particular use, such as indicating a particular community of relying parties that may rely on the certificate, and/or a class of relying-party applications or security requirements which may rely on the certificate. Certificate policies inform relying parties of the level of confidence they may have in the validity of the bindings between the certified public key and the information in the certificate regarding the subject, including the subject identifiers. Namely, the certificate policy provides information which may assist the relying party in deciding whether or not to trust a certificate for a particular purpose. The certificate policy may also be viewed as a legally-meaningful document, which may define, and often limit, the liability and obligations of the issuer (CA) for potential inaccuracies in the certificate, and define statutes to which the CA, subject, and relying parties should conform; however, these legal aspects are beyond our scope.

Standard certificate policies and types of validations: DV, OV and EV. The certificate policies extension is often used to identify a standard policy; the policy is specified by the policy OID field. Such standard policies are defined by the CA/Browser Forum (CABF), and specific policies are identified in [152]; see Table 8.2. Standard policies often identify the type of validation performed by the CA before issuing the certificate.

Table 8.2: Standard certificate policies. Prices are for a yearly certificate, from https://www.thesslstore.com, June 2021. (The Chrome and IE columns of the original table, showing the browsers' location-bar indicators, are not reproduced here.)

  Name          OID (2.23.140...)  Usage          Validation requirements                             Price
  TLS-DV        ...1.2.1           TLS            Domain validated (confirmation email)               $14
  TLS-OV        ...1.2.2           TLS            Organization validation (registration, phone, ...)  $30
  TLS-EV        ...1.1             TLS            Extended validation (more verifications)            $82.5
  Code-Sign OV  ...1.3             Code signing   Organization validation (registration, phone, ...)  $135
  Code-Sign EV  ...1.3             Code signing   Extended validation (more verifications)            $291

For the important case of website certificates for SSL or TLS, three types of validation are defined, in order of increasing validation: Domain Validation (DV), Organization Validation (OV) and Extended Validation (EV). Domain Validation and Organization Validation follow the 'Baseline certificate requirements' defined by the CA/Browser Forum in [150].

Domain Validation (DV) is a fully-automated - but not very secure - validation process. It involves sending a request to an address associated with the domain, and validating the response. The address may be an IP address or an email address (the latter sometimes referred to as email validation). Domain Validation is vulnerable to network attacks, including MitM attacks and off-path attacks exploiting weaknesses of the domain name system (DNS) or of the routing infrastructure; e.g., see [109].

Organization Validation (OV) also requires validation of the organization which is certified, i.e., the subject of the certificate. Most CAs perform limited validation, typically involving validation that the organization is registered in the specified location, and that the request is validated by a phone call to a registered number.

Extended Validation (EV) involves additional validation requirements, usually following the 'Extended-Validation certificate guidelines' defined by the CA/Browser Forum in [151].
These include registration in official registries, a physical address, and more.

The type of validation could be used by relying parties to determine their use of the certificate; it is conceivable that fraudsters would be less likely to obtain EV or even OV certificates, due to their higher costs and stronger validation requirements. The risk of a rogue certificate depends on the validation requirements (Table 8.2); it is highest for domain-validated certificates, which are only validated by an automated email to the address listed in the Whois records, a process vulnerable to off-path and routing attacks [62, 81].

However, the currently popular web browsers do not appear to treat certificates differently based on their validation method. Until around 2019, most major browsers displayed a visible indication in the location bar for EV certificates, e.g., as shown in Table 8.2 for the IE browser, based on research such as [194]. However, this was mostly abandoned in recent years, and currently, most browsers make minimal use of the type of validation. Browsers display the same indicator for all TLS-protected websites (as shown for Chrome in Table 8.2), and the validation type can only be identified by users using the user interface to look up the details of the certificate. The main justifications given [174] are that these indications were found ineffective, and that they interfere with the browser approach of presenting a warning against sites which are not protected by TLS.

Exercise 8.4. Compare the user-interface indications of the certificate validation method of two browsers. Check the certificates for at least three websites, e.g., a bank, a newspaper and a browser download web page.

8.3 Intermediate CAs and Certificate Path Validation

PKI schemes require the relying parties to trust the contents of the certificate, mainly, the binding between the public key and the identifier. In the simple case, the certificate is signed by a CA trusted directly by the relying party, as in Figure 8.1. Such a CA, which is directly trusted by a relying party, is called a trust anchor or root CA of that relying party.

Direct trust in one or more trust-anchor (directly trusted) CAs might suffice for small, simple PKI systems. However, many PKI systems are more complex. For example, browsers typically directly trust dozens of trust-anchor CAs, referred to in browsers as root CAs, and browsers also indirectly trust certificates signed by other CAs, referred to as intermediate CAs; an intermediate CA must be certified by a root CA, or by a properly-certified, indirectly-trusted intermediate CA.

Relying parties and PKIs may apply different conditions for determining which certificates (and CAs) to trust. For example, in the PGP Web-of-Trust PKI [159], every party can certify other parties. One party, say Bob, may decide to indirectly trust another party, say Alice, if Alice is properly certified by a 'sufficient' number of Bob's trust anchors, or by a 'sufficient' number of parties which Bob trusts indirectly. The trust decision may also be based on ratings specified in certificates, indicating the amount of trust in a peer. Some designs may also allow 'negative ratings', i.e., one party recommending not to trust another party.
The determination of whether to trust an entity based on a set of certificates - and/or other credentials and inputs - is referred to as the trust establishment or trust management problem, which has been studied extensively; see [70, 71, 198, 199, 266] and citations of and within these publications.

We focus on the simpler case, where a single valid certification path suffices to establish trust in a certificate. A certification path is a series of certificates C_1, C_2, ..., where the first certificate is signed by a root CA (trust anchor), and each subsequent certificate is signed using the key certified in the previous one. Different relying parties may validate a certification path differently, based on their different trust anchors and different policies for trusting certificates certified by an intermediate CA. The same CA, say CA_A, may be a trust anchor for Alice, and an intermediate CA for Bob, who has a different trust anchor, say CA_B. This is the mechanism deployed in most PKI systems and by most relying parties, and specified in X.509, specifically in PKIX and the Web-PKI. The validation of the certification path is based on several certificate path constraints extensions, which we discuss in the following subsections.

8.3.1 The certificate path constraints extensions

In this subsection, we present the three certificate path constraints extensions defined in X.509 and PKIX: basic constraints, name constraints and policy constraints. These constraints are relevant only for certificates issued to a subject, e.g., www.bob.com, by some intermediate CA (ICA), i.e., a CA which is not directly trusted by the relying party (say Alice) - not one of Alice's trust anchors. Since an intermediate CA is not a trust anchor for Alice, Alice would only trust certificates issued by the ICA if the ICA is 'properly certified' by some trust anchor CA; we use TACA to refer to a specific Trust Anchor CA which Alice trusts and, based on this trust, may or may not trust a given ICA.

In the simple case, illustrated in Figure 8.6, the relying party (Alice) receives two certificates: a certificate for the subject, e.g., the website www.bob.com, signed by some intermediate CA, which we denote ICA; and a certificate for ICA, signed by the trust anchor CA, TACA. In this case, we say that the subject, www.bob.com, has a single-hop certification path from TACA, since ICA is certified by the trust anchor TACA. The certification path therefore consists of two certificates: C_ICA, the certificate issued by the trust anchor TACA to the intermediate CA ICA, and C_B, the certificate issued by the intermediate CA ICA to the subject (www.bob.com).

In more complex scenarios there are additional intermediate CAs in the certification path from the trust anchor to the subject, i.e., the certification path is indirect, or in other words, contains multiple hops. For example, Figure 8.7 illustrates a scenario where the subject, www.bob.com, is certified via an indirect certification path with three hops, i.e., including three intermediate CAs: ICA1, ICA2 and ICA3. The subject www.bob.com is certified by ICA3, which is certified by ICA2, which is certified by ICA1, and only ICA1 is certified by a trust anchor CA, TACA.
Hence, in this example, the certification path consists of four certificates: (1) C_ICA1, the certificate issued by the trust anchor TACA to the intermediate CA ICA1; (2) and (3) the certificates C_ICA2 and C_ICA3, issued by the intermediate CAs ICA1 and ICA2 to the intermediate CAs ICA2 and ICA3, respectively; and finally (4) C_B, the certificate issued by the intermediate CA ICA3 to the subject (www.bob.com).

We use the term subsequent certificates to refer to the certificates in a certification path which were issued by intermediate CAs, and the terms root certificate or trust-anchor certificate to refer to the 'first' certificate on the path, i.e., the one issued by the trust-anchor CA. The second certificate along the path is issued by the intermediate CA certified by the trust anchor (in the trust-anchor certificate); and any following certificate along the path, say the i-th certificate (for i > 1), is issued by the intermediate CA which was certified in the (i-1)-th certificate in the path. The length of a certification path is the number of intermediate CAs along it, which is one less than the number of certificates along the path.

Note that, somewhat contrary to their name, the certification path constraints cannot prevent or prohibit intermediate CAs from signing certificates which do not comply with these constraints. The constraints only provide information for the relying party, say Alice, instructing Alice to trust a certificate signed by an ICA only if it conforms with the constraints specified in the certificates issued to the intermediate CAs.

8.3.2 The basic constraints extension

The basic constraints extension defines whether the subject of the certificate, say example.com, is allowed to be a CA itself, i.e., whether example.com may also sign certificates (e.g., for other domains or for employees). More specifically, the extension defines two values: a Boolean flag denoted simply cA (with this non-standard capitalization), and an integer called pathLenConstraint (again, with this capitalization).

The cA flag indicates whether the subject (example.com) is 'allowed' to issue certificates, i.e., to act as a CA; if cA = TRUE, then example.com may issue certificates, and if cA = FALSE, then it is not 'allowed' to issue certificates. Recall that this is really just a signal to the relying parties receiving certificates signed by example.com; it only restricts the use of the certificate issued to example.com for validation of certificates issued by example.com. It does not prevent or prohibit example.com from issuing certificates, which a relying party may still trust, either because it directly trusts example.com (i.e., example.com is a trust anchor), or because it also receives an additional certificate for example.com, signed by a different trusted CA, which allows example.com to be a CA, e.g., by having the value TRUE for the cA flag in the basic constraints extension.

The value of pathLenConstraint is relevant only when there is a 'path' of more than one intermediate CA between the Trust Anchor CA and the subject; for example, it is relevant in Figure 8.7, but not in Figure 8.6. In both Figure 8.6 and Figure 8.7, the Trust Anchor CA (TACA) signs the certificate of the first intermediate CA (C_ICA or C_ICA1, respectively), where it should specify that this intermediate CA is a trusted (intermediate) CA. Namely, it must set the cA flag in the basic-constraints extension of that certificate to TRUE.
However, in Figure 8.7, ICA1 further certifies ICA2, which certifies ICA3 - and only ICA3 certifies the subject (www.bob.com). Therefore, for the relying party to 'trust' certificate C_B for the subject, signed by ICA3, the certificate C_ICA1 must also contain the path-length (pathLen) parameter in the basic constraints extension, and this parameter must be at least 2 - allowing two more CAs until the certification of the subject. Similarly, the certificate issued by ICA1 to ICA2 must contain the basic constraints extension, indicating cA as TRUE, as well as a value of at least 1 for the pathLen parameter.

Unfortunately, currently, essentially all browsers fail to enforce path-length constraints on the root CAs. Root CAs sometimes do enforce path-length constraints on intermediate CAs; however, these are usually rather long, e.g., 3, leaving wide room for an end-entity to receive, by mistake, a certificate allowing it to issue certificates. Of course, in most cases, end-entity certificates will not allow issuing certificates, typically since their basic-constraints extension will indicate that they are not a CA. Browsers usually do enforce the basic constraints extension, although failures may happen, especially since this kind of flaw - lack of validation - is not likely to be detected by a normal user.

Exercise 8.5 (IE failure to validate basic constraint). Old versions of the IE browser failed to validate the basic constraint field. Show a sequence diagram for an attack exploiting this vulnerability, allowing a MitM attacker to collect the user's password to trusted sites which authenticate the user using user-id and password, protected using SSL or TLS.

Exercise 8.6. Assume that TACA is concerned that subject-CAs may issue certificates to end-entities (e.g., websites) and neglect to include a basic constraint extension, to prevent the end entity from issuing certificates. Explain how TACA may achieve this, for the scenarios in Figure 8.6 and in Figure 8.7. Identify any remaining potential for such failure by one of the intermediate CAs in these figures.

8.3.3 The name constraint extension

The name constraint extension is used in certificates issued to a subject CA, such as the intermediate CAs in Figure 8.6 and Figure 8.7. The name constraint extension restricts the set of subject-names which may be certified by the subject CA, as well as by any subsequent CA. For example, in Figure 8.7, a name constraint included in certificate C_ICA1, issued by TACA to ICA1, would restrict the certificates issued by ICA1, ICA2 and ICA3. (The name constraints in C_ICA1 would also restrict certificates issued by the subject, www.bob.com; we do not list this, since the subject's certificate, C_B, should prevent the subject from issuing certificates, using the basic constraints extension.)

The name constraint extension has two possible parameters, which we denote permit (to define permitted name spaces) and exclude (to forbid name spaces, typically within the permitted name space); the actual parameter names, permittedSubtrees and excludedSubtrees, are a bit cumbersome. Focusing on the PKIX profile, both parameters are identifiers for names, usually a domain name; we focus on this case. When a domain name is specified, this is taken to include sub-domains; e.g., if a name constraint contains the permit parameter (only) for the domain name com, then this allows subdomains such as google.com, but not names in other top-level domains such as x.org.
The exclude parameter takes precedence; i.e., if a certificate contains both permit for a domain name, say edu, and exclude for a subdomain, say uconn.edu, then this allows subsequent certificates only for domains in the edu top-level domain, excluding domains in the subdomain uconn.edu. See the examples in the tables in Figure 8.6 and Figure 8.7.

[Figure 8.6, message flow: the Root CA / TACA (Trust Anchor CA) issues certificate C_ICA to ICA (the Intermediate CA); the certificates C_ICA, C_B reach the subject (e.g., www.bob.com) and the relying party (e.g., Alice's browser).]

C_ICA constraints extensions and resulting validity of C_B:

# | Basic: cA | Basic: pathLen | Name: Permit | Name: Exclude     | Policy: Req. Policy | C_B valid?
1 | No        | (any)          | (any)        | (any)             | (any)               | No
2 | Yes       | (any)          | bob.com      | none or x.bob.com | none or > 1         | Yes
3 | Yes       | (any)          | cat.com      | (any)             | (any)               | No
4 | Yes       | (any)          | bob.com      | www.bob.com       | (any)               | No
5 | Yes       | (any)          | (any)        | (any)             | 0                   | No
6 | Yes       | (any)          | (any)        | bob.com           | (any)               | No

Figure 8.6: A single-hop (length one) certification path, consisting of trust-anchor CA TACA, an intermediate CA ICA, and a subject (e.g., website www.bob.com). The table shows the impact of six examples of certificate path constraints extensions in certificate C_ICA on the validity of certificate C_B issued by ICA. In these examples, C_B is for domain name www.bob.com, has no certificate policies extension, and has basic constraints indicating cA = No (not a CA); the value (any) in a field indicates that the example holds for any value in this field. Each row is one example of the constraints in C_ICA. In example (row) 1, C_ICA does not have the cA flag set; namely, C_ICA does not indicate that ICA is a CA, and hence C_B is invalid. In contrast, in example 2, certificate C_B is valid, since the cA flag is true, the name constraints permit bob.com and do not exclude www.bob.com, and either there is no policy constraint or its value is more than 1. See discussion in subsection 8.3.1.

[Figure 8.7, message flow: TACA issues C_ICA1 to ICA1; ICA1 issues C_ICA2 to ICA2; ICA2 issues C_ICA3 to ICA3; the certificates C_ICA1, C_ICA2, C_ICA3, C_B reach the subject (www.bob.com) and the relying party (e.g., Alice's browser).]

C_ICA1 constraints extensions and resulting validity of C_B:

# | Basic: cA | Basic: pathLen | Name: Permit | Name: Exclude     | Policy: Req. Policy | C_B valid?
1 | Yes       | < 2            | (any)        | (any)             | (any)               | No
2 | Yes       | none or ≥ 2    | bob.com      | none or x.bob.com | none or > 3         | Yes
3 | Yes       | (any)          | (any)        | (any)             | ≤ 3                 | No
4 | Yes       | (any)          | cat.com      | (any)             | (any)               | No
5 | Yes       | (any)          | (none)       | bob.com           | (any)               | No

Figure 8.7: A length-3 certification path, consisting of trust-anchor CA TACA, three intermediate CAs (ICA1, ICA2, ICA3), and a subject (e.g., website www.bob.com). The table shows the impact of five example values of the certificate path constraints extensions (see subsection 8.3.1), in particular of the pathLen (path length) parameter of the basic constraints extension. For the examples in the table, assume that none of the certificates has the certificate policies extension, that the intermediate certificates C_ICA1, C_ICA2, C_ICA3 all have the cA flag set in 'Basic constraints', and that C_ICA2, C_ICA3 do not have any other constraints. For example, in row 1, C_B is invalid, since the pathLen field in the basic-constraints extension of C_ICA1 is set to less than 2 (and the path from ICA1 to ICA3 is of length two). In contrast, in row 2, the pathLen constraint does not exist (or is satisfied), and the other constraints in C_ICA1 are also set to allow the certification path to be valid (compare to the examples in Figure 8.6).
Note that these examples focus on the typical case of DNS domain names; however, the restrictions may also apply to other types of names, e.g., email addresses or X.509 distinguished names.

Figure 8.8: Example of the use of the name constraint extension, where constraints are over distinguished-name keywords. NTT Japan issues a certificate to IBM Japan with the name constraint Permit O=IBM, i.e., allowing it to certify only distinguished names with the value 'IBM' for the 'O' (organization) keyword, since NTT Japan does not trust IBM Japan to certify other organizations. IBM Japan certifies the global IBM, only for names in the IBM organization (Permit O=IBM), and excluding names in Japan (Exclude C=Japan). As a result, NTT trusts certificates issued by IBM to different parts of IBM, e.g., IBM US, but would not trust certificates issued by IBM to IBM Japan or to other companies, e.g., Symantec. Similarly, IBM certifies Symantec for all names, except names in the IBM organization.

Figure 8.8 presents an example of a typical application of the name constraint extension, using X.509 distinguished names. In this example, the NTT Japan CA issues a certificate to IBM Japan, allowing the IBM Japan CA to certify only certificates with the value 'IBM' for the organization (O) keyword - implying that IBM Japan cannot certify other organizations. Also, IBM Japan certifies the 'main', corporate IBM CA, but excludes names where the value of the country (C) keyword is Japan, i.e., it does not allow the corporate IBM CA to certify sites in Japan, even IBM sites. Notice that such a certificate issued by corporate IBM would also be trusted by relying parties using only NTT Japan as a trust anchor, provided that other relevant constraints, such as certificate path length, are satisfied (or not specified). Figure 8.9 presents a similar example, but using DNS domain names instead of X.509 distinguished names.

Figure 8.9: Example of the use of the name constraint extension with DNS names (dNSName).

Unfortunately, currently, essentially no browser enforces name constraints on the root CAs, and root CAs rarely enforce name constraints on intermediate CAs. Therefore, although we believe most browsers do support name constraints, these constraints are rarely deployed in practice.
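To make the preceding rules concrete, the following minimal sketch (in Python) illustrates the checks a relying party could apply for the basic constraints (cA, pathLenConstraint) and name constraints along a certification path. It is a simplification of the full RFC 5280 path-validation algorithm, which also checks signatures, validity periods, policy constraints and revocation; the dictionary-based certificate representation is only for illustration.

    # Simplified relying-party checks for the path constraints discussed above.

    def name_permitted(name, permit, exclude):
        # A DNS-name constraint covers the domain itself and all its subdomains.
        def covered(n, d):
            return n == d or n.endswith("." + d)
        if any(covered(name, d) for d in exclude):
            return False
        return (not permit) or any(covered(name, d) for d in permit)

    def validate_path(ca_certs, subject_name):
        """ca_certs: the intermediate-CA certificates, ordered from the one
        issued by the trust anchor (e.g., C_ICA1) to the one that issued the
        subject's certificate."""
        permit, exclude = [], []
        for i, c in enumerate(ca_certs):
            if not c["cA"]:
                return False                    # not marked as a CA (Fig. 8.6, row 1)
            remaining = len(ca_certs) - 1 - i   # CAs still to follow on the path
            if c.get("pathLen") is not None and c["pathLen"] < remaining:
                return False                    # pathLenConstraint violated (Fig. 8.7, row 1)
            permit += c.get("permit", [])
            exclude += c.get("exclude", [])
        return name_permitted(subject_name, permit, exclude)

    # Example 2 of Figure 8.6: a single intermediate CA with cA = TRUE and
    # Permit bob.com, certifying the subject www.bob.com.
    print(validate_path([{"cA": True, "permit": ["bob.com"]}], "www.bob.com"))  # True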
8.3.4 The policy constraints extension

In addition to the basic constraints and name constraint extensions, X.509 and PKIX define a third standard extension that imposes constraints on subsequent certificates: the policy constraints extension, which is related to the certificate policies and certificate policy mappings extensions; see subsection 8.2.8. The policy constraints extension allows the CA to define two requirements which must hold for subsequent certificates in a certification path to be considered valid:

requireExplicitPolicy: if specified as a number n, and the path length is longer than n, then all certificates in the path must have a policy acceptable to the relying party.

inhibitPolicyMapping: if specified as a number n, and the certification path is longer than n, say C_1, ..., C_n, C_{n+1}, ..., then C_{n+1} and any subsequent certificate should not have a policy mapping extension.

8.4 Certificate Revocation

In several scenarios, it becomes necessary to revoke an issued certificate prior to its planned expiration date. Different reasons for revoking a certificate are listed in the PKIX [104] and X.509 [212] standards; we roughly categorize them as follows. (While X.509 specifies an optional field for the reason for revocation, most certificates do not include such an indication; hence, unfortunately, we do not know the distribution of reasons.)

Revocation due to security concerns: these revocations are due to potential compromise of the certified key, discovery that the certificate was requested without authorization, that the certificate was misused or could be misleading, or that the subject violated its obligations or Terms of Use.

Revocation due to change: these revocations are due to a change which invalidates the information in the certificate. This can be a change to the certified name or to other certified attributes, e.g., removal of some (certified) privilege. Other reasons are when the certified entity ceases to operate, or simply stops using the certified public-private key pair for a benign reason (such as a change of business, or a change to a more secure key or algorithm).

Other revocations: some revocations may be for other reasons, such as a request by the subject, a legal obligation of the CA, or some mistake or failure of the CA. For example, the Let's Encrypt CA had to revoke multiple certificates when it detected a bug in its CAA (Certificate Authority Authorization) validation code, affecting over 3 million certificates [1], most of which were revoked. (Read about CAA in subsection 8.5.2 and [185].)

Revocation mechanisms: CRLs, OCSP and others. Revocation mechanisms are methods to inform the relying parties that a certificate was revoked. This turns out to be a significant challenge - definitely much larger than originally anticipated. The early X.509 design [92] offered only one method for revocation, the Certificate Revocation List (CRL), which is, essentially, a list of the revoked certificates, typically signed by the issuing CA; see subsection 8.4.1. CRLs are still widely implemented and supported by CAs; however, most relying parties prefer other mechanisms to check for revocations, since CRLs often have excessive overhead, mainly in terms of bandwidth. A daily download of, say, 100MB to 1GB for each relying party is problematic for the CAs - as well as for many relying parties.

Another standardized revocation mechanism is the Online Certificate Status Protocol (OCSP); see subsection 8.4.2 (and the 'Stapled-OCSP' mechanism in subsection 8.4.3). However, OCSP has its own set of drawbacks, including delay, loss of privacy and of availability, overhead, and vulnerability to Denial-of-Service (DoS) attacks, as we discuss. Indeed, there is still no consensus on the 'best' revocation mechanism. In subsection 8.4.5 we discuss several other, non-standardized revocation mechanisms.
These include the (deployed, proprietary) OneCRL and CRLset mechanisms, and the CRV and ∆-CRV mechanisms proposed in [364]. In subsection 8.4.4, we also discuss some non-standardized OCSP optimizations. Table 8.3 compares the CRL, OCSP, OneCRL/CRLset and CRV/∆-CRV revocation mechanisms; the parameters used in the comparison are described in Table 8.4.

Method              | Freshness         | Delay          | Compute                                   | Bandwidth                                     | Storage           | Concerns
CRL (periodic)      | Hours/days        | None (local)   | One signature, one verification           | Very high: r_All·L_CRL                        | High: r_All·L_CRL | Bandwidth
∆-CRL               | Hours/days        | None (local)   | One signature, one verification           | Usually > r_D·L_CRL, sometimes r_All·L_CRL    | High: r_All·L_CRL | Complexity, adoption, storage
OCSP (ignore cache) | Seconds (or less) | Seconds        | q_CA signatures, q_RP verifications       | Medium/Low: q_CA·L_O                          | None              | Delay, overhead, availability, DoS, privacy exposure
Stapled OCSP        | T_S minutes       | None (in TLS)  | c·(24·60/T_S) signatures, q_RP verifications | Medium/Low: q_CA·L_O                       | None              | Delay, overhead, availability, DoS, privacy exposure
OneCRL, CRLset      | Hours/days        | None (local)   | One signature, one verification           | Medium/Low                                    | Medium/Low        | Partial coverage, proprietary, another TTP
CRV                 | Hours/days        | None (local)   | One signature, one verification           | Low: r_All·log c                              | Low: r_All·log c  | Proposed, not yet deployed
∆-CRV               | Hours/days        | None (local)   | One signature, one verification           | Very low: r_D·log c                           | Low: r_All·log c  | Proposed, not yet deployed

Table 8.3: Comparison of revocation-checking mechanisms. Bandwidth and computations are for a 24-hour period. Computation focuses on the public-key signature operations: verifying (done by the relying party) and signing. Signing is done by the CA, except for OneCRL/CRLset, where signing is done by the vendor. The parameters (c, L_CRL, ...) are described in Table 8.4. Values are rough approximations (simplifications).

Parameter                                                | Notation | Typical value
Length of CRL entry                                      | L_CRL    | 100 bytes
Length of OCSP response                                  | L_O      | 1000 bytes
Number of certificates                                   | c        | 10^8
Total number of revocations                              | r_All    | 10^6
Revocations in 24 hours                                  | r_D      | 1000
Validation-queries by a relying-party in 24 hours        | q_RP     | 50
Validation-queries to a CA in 24 hours                   | q_CA     | 10^7
Time between OCSP requests by website (stapled OCSP)     | T_S      | 10 minutes

Table 8.4: Revocation-related parameters. Actual values differ significantly between different PKIs; the values given are just examples.

Validation and freshness of revocation information. Revocation information, sent as a CRL, as an OCSP response or otherwise, should be validated for authenticity and freshness. All of the revocation mechanisms we discuss are based on the provision of signed and time-stamped revocation information, which allows any party to validate it at any time (not just immediately). Furthermore, this allows the relying party to prove later that it received particular revocation information, which can help justify its resulting actions, such as performing a signed order which the relying party validated using the certified public key (not revoked, based on the information available to the relying party at that time), or refusing to perform some order (since the key was revoked). The correct action may also depend on the time (and date) of revocation, which is, therefore, often part of the information for a revoked certificate.
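As a minimal sketch of these two checks - authenticity and freshness - consider the following Python fragment. Here verify_signature() and the statement object are hypothetical placeholders for the actual CRL/OCSP parsing and signature validation, and the freshness window ∆ is discussed next.

    from datetime import datetime, timedelta, timezone

    def revocation_info_acceptable(statement, issuer_public_key,
                                   delta: timedelta) -> bool:
        # Authenticity: the revocation statement must be signed by its issuer.
        if not verify_signature(statement.signed_bytes, statement.signature,
                                issuer_public_key):
            return False
        # Freshness: accept only if t - t' <= Delta, where t' is the timestamp
        # in the statement (e.g., thisUpdate in a CRL, producedAt in OCSP).
        t = datetime.now(timezone.utc)
        t_prime = statement.timestamp
        return (t - t_prime) <= delta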
What is fresh revocation information? Note that revocation information may change over time, even while in transit to the relying party. Therefore, it is possible that the relying party receives, at time t, information indicating that a certificate is still valid, while the certificate was revoked at some time t' ≤ t. This should happen only if t' is 'sufficiently close' to t, i.e., t - t' ≤ ∆, where ∆ is the allowed period to use a certificate after the timestamp in the revocation information. The value of ∆ may be defined by an extension in the certificate or within the revocation information; for example, this is the case for CRL and OCSP. Alternatively, the allowed ∆ may be the same for all certificates (defined by a specific CA, or by any CA), as is done by several non-standard revocation mechanisms such as OneCRL, CRLsets and CRVs.

Distribution of revocation information. Often, the relying party receives the revocation information directly from the CA. However, since the revocation information is signed and time-stamped, it can also be relayed by third parties. This fact is utilized by OCSP stapling, where the revocation information is provided by the subject, 'stapled' to the certificate. OCSP stapling is specified for the common case where the certificate is provided by a TLS server; in this case, it is provided in the Server Hello message. We discuss OCSP stapling in subsection 8.4.3.

Retrieving revocation information: periodically or as-needed (online)? There are two main options for retrieving revocation information: periodically, e.g., daily, or as-needed, i.e., when the relying party needs the revocation information to validate a specific certificate. Online (as-needed) retrieval may reduce the bandwidth overhead, since it avoids downloading unnecessary revocation information; on the other hand, the relying party must wait for the revocation information to arrive, introducing delay (waiting for the response) and the risk of communication failures. Online retrieval may also allow an attacker to perform a Denial-of-Service (DoS) attack against the CA or other OCSP server, by a 'flood' of revocation queries; and there are also privacy concerns, due to the exposure of the identities of the certificates used by the relying party. To reduce the overhead, some relying parties use a cached OCSP response if available, and perform online retrieval only when they do not have a valid OCSP response in cache. The impact of periodical vs. as-needed retrieval can be clearly seen in Table 8.3. OCSP, as its name implies, retrieves revocation information as-needed (online), unless a cached response is available (in OCSP implementations that cache responses). Most other revocation mechanisms, including CRLs, usually retrieve revocation information periodically, although some implementations, mainly of CRLs, retrieve it only as-needed, to save the bandwidth of downloading unnecessary CRLs.

Figure 8.10 (a), X.509 CRL fields: the to-be-signed (tbs) part contains Version (optional), Signature algorithm (OID), Issuer (Distinguished Name), thisUpdate (time), nextUpdate (time; optional), revokedCertificates (optional; CRL Entry 1, ..., CRL Entry n) and crlExtensions (optional); it is followed by the Signature: Sign_{Issuer.s}(tbs).
Figure 8.10 (b), X.509 CRL Entry fields: Certificate serial number, Revocation Date (time), CRL Entry Extensions (optional).
Figure 8.10: X.509 Certificate Revocation List (CRL): CRL fields (a), and CRL Entry fields (b). Most of the CRL fields (version, signature algorithm, issuer, extensions, signature) have corresponding fields in the X.509v3 certificate.
Table 8.3 does not include the less-typical options of OCSP with cached responses and of retrieving CRLs only as-needed.

8.4.1 Certificate Revocation List (CRL)

The X.509 designers probably expected revocation to be a rare incident, with only a small number of certificates revoked (but not yet expired) at any given time. In this case, a simple solution is for the CA to periodically sign and distribute a time-stamped list of all revoked certificates, called a Certificate Revocation List (CRL). The CA may also authorize another entity to issue CRLs; the term CRL issuer refers to the entity issuing the CRLs - the CA or an entity authorized by the CA. However, for simplicity, we mostly refer to the typical case where the CA is also the CRL issuer.

CRLs are defined as part of the X.509 standard, already from its early versions [92]; their contents were enhanced in later versions. Figure 8.10 shows the contents of the widely used CRL defined in [211]. As can be seen, this CRL shares quite a lot with the X.509v3 certificate (Figure 8.5); in particular, it has similar version, OID, issuer Distinguished Name (DN) and extensions fields. (Confusingly, the CRL specification supports extensions from CRL version 2, not 3, as might be expected given X.509v3 certificates. Furthermore, in CRLs the version field is optional; an absent version field indicates version 1, which does not support extensions. In practice, and in this book, CRLs are always version 2, i.e., contain both the version and extension fields.) The fields unique to CRLs are:

thisUpdate: the time at which the CRL was issued (signed).

nextUpdate: if specified, nextUpdate bounds the time by which an updated CRL will be issued. Usually, relying parties request a new CRL prior to the nextUpdate time; i.e., it serves, essentially, as the 'expiration date' of the CRL.

revokedCertificates: this field lists one or more CRL Entries. The contents of a CRL Entry are shown in Figure 8.10 (b): (1) the serial number of the revoked certificate, (2) the revocation date (and time), and (3) optional extensions, much like the X.509v3 certificate (and CRL) extensions.

crlExtensions: an optional field that may contain extensions to the CRL, much like the X.509 certificate extensions.

A relying party should use a valid, non-expired CRL to check whether a certificate issued by the CA was revoked. Almost always, the CRL is cached until it is replaced by a more recently issued CRL (from the same issuer and for the same set of certificates). The relying party typically requests the CRL either periodically, to ensure a fresh CRL is available when needed, or only when needed to validate a certificate.

The CRL Distribution Points ('cRLDistributionPoints') certificate extension. To retrieve the CRL, the relying party usually uses the cRLDistributionPoints certificate extension, defined in [211]; this extension defines how (using what protocol) to retrieve the CRL, and from what address. Specifically, this information is provided by one or more DistributionPoint entries, each of which defines a URI (Uniform Resource Identifier) specifying the protocol and location for downloading the CRL.
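As an illustration, the following sketch (Python, using the pyca/cryptography library) downloads the CRL from the first URI in the certificate's cRLDistributionPoints extension and checks whether the certificate's serial number is listed. It assumes the CRL is DER-encoded at that URI; verifying the CRL signature and its nextUpdate freshness is omitted for brevity, and a real relying party must add these checks.

    import urllib.request
    from cryptography import x509
    from cryptography.x509.oid import ExtensionOID

    def crl_url(cert: x509.Certificate) -> str:
        # First URI listed in the cRLDistributionPoints extension.
        dps = cert.extensions.get_extension_for_oid(
            ExtensionOID.CRL_DISTRIBUTION_POINTS).value
        for dp in dps:
            for name in dp.full_name or []:
                if isinstance(name, x509.UniformResourceIdentifier):
                    return name.value
        raise ValueError("no CRL distribution point URI")

    def is_revoked(cert: x509.Certificate) -> bool:
        der = urllib.request.urlopen(crl_url(cert)).read()
        crl = x509.load_der_x509_crl(der)
        # A real relying party must also verify the CRL signature (with the
        # issuing CA's key) and check that the CRL is not expired (nextUpdate).
        entry = crl.get_revoked_certificate_by_serial_number(cert.serial_number)
        return entry is not None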
Bandwidth overhead is a major concern with CRLs, since CRLs can often be quite large. The size of a single CRL entry, which we denote L_CRL, can differ significantly among providers, but most are close to 100 bytes; and the total number of revocations, which we denote r_All, could be a million or even more. This results in a total CRL length of L_CRL · r_All, around 100 million bytes per CA - considerable overhead for both CAs and relying parties. Measurements of CRL overhead were reported in [399], who found a median CRL of 51KB and a maximal CRL of 76MB (yes, megabytes!), and in [364], who found an average CRL of 173KB. The reason is that the number of revocations may be surprisingly high; specifically, [399] found that about 8% of the non-expired certificates were revoked, mostly due to the Heartbleed bug [91], and Let's Encrypt discovered a bug in their issuance process which required revocation of three million certificates [305]. However, even looking at measurements for periods without such events, about 1% of the non-expired certificates are revoked - which can still result in excessively long CRLs.
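A back-of-the-envelope computation, using the example parameters of Table 8.4, shows why these numbers are a concern; as in Table 8.3, the values are rough approximations.

    # Daily-overhead estimates, with the example parameters of Table 8.4.
    L_CRL, L_O = 100, 1000          # bytes per CRL entry / per OCSP response
    c, r_all, r_d = 10**8, 10**6, 1000
    q_RP, q_CA = 50, 10**7

    crl_per_relying_party = r_all * L_CRL        # full CRL download: ~100 MB/day
    delta_crl_per_relying_party = r_d * L_CRL    # Delta-CRL only: ~100 KB/day
    ocsp_per_relying_party = q_RP * L_O          # ~50 KB/day per relying party
    ocsp_load_on_ca = q_CA * L_O                 # ~10 GB/day and 10^7 signatures by the CA

    print(crl_per_relying_party, delta_crl_per_relying_party,
          ocsp_per_relying_party, ocsp_load_on_ca)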
Three standard X.509 CRL extensions are designed to reduce the bandwidth overhead of CRLs:

The Issuing Distribution Point (IDP) CRL extension and CRL scopes. A CRL, available from some DistributionPoint, should contain all revoked certificates which were issued by the CA, have not yet expired, and belong to the scope of the CRL. By default, all certificates issued by the CA have the same scope, i.e., are available from the same DistributionPoint. However, CAs often prefer to use multiple smaller CRLs, by splitting the set of certificates into separate scopes, e.g., based on issuing time. X.509 [211] specifies that in this case, the CRL must contain the standard, critical CRL extension called Issuing Distribution Point (IDP), which defines the relevant scope. The main motivation for a CA to use multiple CRLs, each with a distinct scope (defined using the IDP CRL extension) and DistributionPoint, is to reduce the bandwidth overhead. The use of multiple DistributionPoints (and scopes) reduces the length of each CRL, at the cost of requiring the CA to sign and distribute multiple CRLs. When relying parties download CRLs only as-needed, this may reduce the required bandwidth, but at the cost of a reduced likelihood that the required CRL is cached, i.e., more cases where the relying party must download the CRL in order to validate a certificate, which can cause significant delay and availability concerns. OCSP can be seen as an extreme case, with each certificate requiring a separate request/response.

The Authorities Revocation List (ARL) extension lists only revocations of CA certificates. This is essentially equivalent to placing CA certificates in a dedicated distribution point, and becomes meaningful mainly when relying parties download only the ARL, and use other mechanisms, such as OCSP, for non-CA certificates.

The Delta CRL extension lists only new revocations, which occurred since the last base-CRL; the base-CRL is retrieved as needed. To validate that a given certificate is not revoked, check that it is contained neither in the Delta-CRL nor in a base-CRL issued no earlier than the time specified in the Delta-CRL. For this method to be effective, relying parties should cache, and periodically download, the base-CRL; hence the storage requirements are the same as when using 'regular' CRLs, and sometimes also the bandwidth requirements. Also, implementation is more complex, especially if the relying party may need to 'prove' to a third party, in the future, that it relied on a certificate that was not revoked at the time. Possibly due to such concerns, Delta-CRLs are not widely deployed.

Even with such optimizations, CRLs may still introduce significant bandwidth overhead.

When to download CRLs: periodically (in advance) or as-needed? The original X.509 CRL design was to download all CRLs in advance, in a periodical process, e.g., daily [92]. This periodical process should be done with sufficient frequency to make sure that the revocation information is reasonably fresh. Since CRLs can be quite long, and many CRLs are not required in any given day, this results in considerable overhead. Therefore, implementations often fetch the CRLs only as-needed (online). However, this may cause increased delay (waiting for the information to arrive), reduced reliability (what to do if revocation information is unavailable?), and privacy concerns (e.g., exposing the website being visited). As a result, the use of CRLs has become less and less common; e.g., it is not done, currently, by major browsers. A standardized alternative to CRLs is the Online Certificate Status Protocol (OCSP), which we discuss in subsection 8.4.2 and subsection 8.4.3. We later also discuss other, non-standardized alternatives, in subsection 8.4.4 and subsection 8.4.5.

8.4.2 Online Certificate Status Protocol (OCSP)

OCSP (Online Certificate Status Protocol) [346], shown in Figure 8.11, is a request-response protocol, providing a secure, signed indication to the relying party of the 'current' status of certificates (details below). The protocol involves two entities: the OCSP client, who sends an OCSP request to request the status of one or more certificates, and the OCSP responder (server), who responds with a (signed) OCSP response, indicating the status of the certificate(s).

Figure 8.11 (message flow): the OCSP Client (usually the relying party or the subject, e.g., a web server) sends to the OCSP Responder (the CA or a trusted OCSP server) an OCSP request: version, {CertID_1, ...} [, signature] [, extensions]; the responder answers with an OCSP response: ResponseStatus, producedAt, responses, signature.
Figure 8.11: The Online Certificate Status Protocol (OCSP). The request includes one or more certificate identifiers {CertID_1, ...}; requests are optionally signed. The OCSP response is signed by the responder, and includes a response for each CertID in the request. Each of these 'individual responses' includes the CertID, the cert-status, the time of this update, the time of the next update, and optional extensions. The cert-status is either revoked, good or unknown.

The OCSP client, i.e., the entity that sends the OCSP request, is either the relying party or another party. In this subsection, we focus on the 'classical' OCSP deployment, where the relying party, e.g., a browser, sends the OCSP request to the CA (or other OCSP responder), as in Figure 8.12; in this case, the relying party (often a browser) acts as the OCSP client.
Later, in subsection 8.4.3, we discuss the stapled-OCSP deployment, where the OCSP request is sent by the subject, e.g., the website; i.e., the subject (often a website) acts as the OCSP client.

Figure 8.12 (message flow): the TLS client (browser) sends a TLS Client Hello to the TLS (web) server; the server answers with a TLS Server Hello (including its certificate); the browser sends an OCSP request to the OCSP Responder (often the CA) and receives an OCSP response; if the certificate is revoked or invalid: abort; on timeout: abort (hard-fail) or proceed (soft-fail); if valid: proceed with the TLS key exchange and finish messages.
Figure 8.12: OCSP used by the relying party (as OCSP client). There are several concerns with this form of using OCSP, including privacy exposure, overhead on the CA, and the handling of a delayed or missing OCSP response by the client/browser. This last concern, illustrated in Figure 8.13, motivated updated browsers to support and prefer OCSP-stapling (see Figure 8.14), where the TLS/web server makes the OCSP request, instead of the client/browser, and 'staples' the OCSP response to the TLS Server Hello message.

The OCSP responder, i.e., the entity that processes OCSP requests and sends responses, is an entity trusted by the relying party; we will assume this is the CA itself, although it could also be another entity, delegated by the CA. Each OCSP response message is signed by the OCSP responder or the CA, allowing the relying party to validate it, even if received via an untrusted intermediary, e.g., the subject (website).

Improving efficiency with multi-cert OCSP requests. To improve efficiency, a single OCSP request may specify (request status for) multiple certificates (CertIDs). (Certificate identifiers, CertIDs, may be specified using the hash of the issuer name and key, and a certificate serial number.) Correspondingly, a single OCSP response, using a single signature, may include (signed) responses for multiple certificates. The support for OCSP requests and responses for multiple certificates is especially important when certificates are signed by intermediate CAs, using a certificate path; see Note 8.2.

Note 8.2: Using OCSP to validate a Certificate-Path (CP)
An indirectly trusted certificate, certified via a certificate chain consisting of certificates issued by (one or more) intermediate CAs, may be invalidated by the revocation of any of these certificates. A relying party wishing to validate the status of the indirectly-trusted certificate needs to check for revocation of the intermediate-CA certificates in the chain, not only of the indirectly-trusted certificate itself. Since intermediate CAs are critical elements of the PKI, and their number is much smaller than that of end-entities, relying parties may use other mechanisms to check for revocation of intermediate-CA certificates. Specifically, intermediate-CA certificates are often validated using (special) CRLs or proprietary mechanisms such as OneCRL or CRLset. However, sometimes their validity should be checked using OCSP. The fact that an OCSP request may include multiple certificates allows this process to be more efficient; a single OCSP request-response interaction may suffice to obtain updated status for all of these certificates, provided that the same OCSP responder is able to provide (signed) OCSP responses for all of these certificates (issued by different CAs). The original OCSP stapling specification, RFC 6066 [139], does not support stapling of multiple certificates. This is addressed in TLS 1.3 [329], which allows the RFC 6066 information to be attached to every certificate in the chain sent by the server. Alternatively, implementations of older versions of TLS can use the (later-defined) 'multiple certificate status' extension, RFC 6961 [317].
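The following sketch builds an OCSP request for a single certificate, sends it to the responder listed in the certificate's Authority Information Access (AIA) extension, and returns the certificate status. It is written in Python using the pyca/cryptography library and a plain HTTP POST; the exact API details are assumptions and may differ between library versions, and, as noted in the comments, a real relying party must also validate the responder's signature and the freshness of the response.

    import urllib.request
    from cryptography import x509
    from cryptography.x509 import ocsp
    from cryptography.hazmat.primitives import hashes, serialization
    from cryptography.x509.oid import ExtensionOID, AuthorityInformationAccessOID

    def ocsp_status(cert: x509.Certificate, issuer: x509.Certificate) -> str:
        # OCSP responder URL, from the Authority Information Access extension.
        aia = cert.extensions.get_extension_for_oid(
            ExtensionOID.AUTHORITY_INFORMATION_ACCESS).value
        url = next(d.access_location.value for d in aia
                   if d.access_method == AuthorityInformationAccessOID.OCSP)

        # Build a request for one CertID (issuer name/key hash + serial number).
        # Note: some responders expect SHA-1 in the CertID rather than SHA-256.
        req = ocsp.OCSPRequestBuilder().add_certificate(
            cert, issuer, hashes.SHA256()).build()
        http = urllib.request.Request(
            url, data=req.public_bytes(serialization.Encoding.DER),
            headers={"Content-Type": "application/ocsp-request"})
        resp = ocsp.load_der_ocsp_response(urllib.request.urlopen(http).read())

        # A real relying party must also verify the responder's signature and
        # the this_update/next_update freshness; omitted here for brevity.
        if resp.response_status != ocsp.OCSPResponseStatus.SUCCESSFUL:
            return "failure: " + resp.response_status.name
        return resp.certificate_status.name      # GOOD, REVOKED or UNKNOWN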
OCSP vs. CRLs. The length of an OCSP response is linear in the number of CertIDs in the corresponding OCSP request, rather than a function of the total number of revoked certificates of this CA, as is the case for CRLs. Furthermore, the computation required for sending an OCSP response is just one signature operation, plus some hash function applications, regardless of the number of revoked certificates or of the number of certificates whose status is requested in this OCSP request. In the common case where the total number of revoked certificates is large, this significantly reduces the overhead of generating and distributing the often-large CRLs. Namely, OCSP provides an alternative which is often more efficient than CRLs: with CRLs, the CA must 'push' the list of all revocations to all relying parties, while with OCSP, a relying party receives information only about the relevant certificates. In addition, OCSP responses are sent in a timely fashion, when the relying party is validating the relevant certificate - which may provide a fresher indication compared to the periodical CRL.

As a result of these advantages, OCSP appears to be deployed more than CRLs. However, OCSP has its own set of challenges, so its use is also far from universal. Let us discuss these challenges.

OCSP challenges: ambiguity, failures and delay. OCSP status responses for each certificate may specify one of three values: revoked, good or unknown. The 'unknown' response is typically sent when the OCSP responder does not serve OCSP requests for the issuer of the certificate in question, or cannot resolve its status at the time (e.g., due to lack of response from the CA). These unknown responses are ambiguous; relying parties are left to decide how to interpret and respond to them. These ambiguous responses are quite problematic, as we explain below. But first let us discuss another OCSP scenario that also leads to similar ambiguity: failed requests.

An OCSP request may fail in multiple ways. One way is when the OCSP client fails to establish communication with, or to receive a response from, the OCSP server. Another is when the OCSP responder sends back an OCSP failure return code, indicating a reason for failure. These reasons include:

• Lack of a signature on the OCSP request (when required by the OCSP responder).
• Request not properly authorized/authenticated, e.g., not from a known IP address, or missing/incorrect authentication information, when required by the OCSP responder. Authentication information should be provided by the client in an appropriate OCSP extension.
• Technical reasons, such as overload or internal error.

Recall that in the 'classical' OCSP deployment, the OCSP client is the relying party, typically the browser, as in Figure 8.12. However, this creates a dilemma for the browser (or other relying party): how should the relying party respond to OCSP failures and ambiguous responses, e.g., when a response does not arrive (within reasonable time) or indicates an OCSP failure? The following are the main options - and why each of them seems unsatisfactory:

Wait: if the problem is a timeout, then the relying party may simply continue waiting for the OCSP response, possibly resending the request periodically, and never 'giving up'.
However, OCSP servers could fail or become inaccessible forever, or for an extremely long time, leaving the relying party stuck in this state. We do not believe any relying party has taken or will take this approach; also, it does not address the other types of OCSP ambiguities. In fact, even when the relying party 'times out' if the OCSP response is not received within a reasonable time, the delay of waiting for the OCSP response is often a concern.

Hard-fail: abort the connection (and inform the user). This is clearly a 'safe' alternative, i.e., it prevents use of a revoked certificate. However, the OCSP interaction may often fail or return an ambiguous response due to benign reasons, such as network connectivity issues or overload of the OCSP responder. In particular, the OCSP responder is usually the CA, and CAs often do not have sufficient resources to handle a high load of OCSP requests. Therefore, this approach is not widely adopted.

Ask user: the relying party may, after some timeout, invoke a user-interface dialog and ask the user to decide whether to continue with the connection or abort it. For example, a browser may invoke a dialog informing the user that the certificate-validation process is taking longer than usual, and ask the user what action to take. While this option may seem to empower the user, in reality, users are rarely able to understand the situation and make an informed decision, and are very likely to continue with the connection; see the discussion of usability in Chapter 9. Hence, except for 'shifting the responsibility' to the user, this option is inferior to direct soft-fail, discussed next.

Soft-fail: finally, the relying party may simply continue as if it received a valid OCSP response. By far, this is the most widely-adopted option. In the typical case of a benign failure to receive the OCSP response, there is no harm in picking this option. However, this choice leaves the user vulnerable to an impersonation attack using a revoked certificate, when the attacker can block the OCSP response; see Figure 8.13. Since our need for cryptography is mainly due to concerns about a Man-in-the-Middle attacker, who can surely block communication, this option results in a vulnerability.

As Figure 8.13 shows, the soft-fail approach essentially nullifies the value of OCSP validation against an attacker that can block or sufficiently delay the OCSP request/response, if the attacker has exposed the private key of the TLS (web) server, or has obtained a fake certificate for the server's domain (that was later revoked). Both exposing the private key and obtaining a fake certificate are challenging attacks. However, such attacks do occur, which is one reason we need revocation; see examples of such attacks in Table 8.5. The other condition, being able to block the OCSP response, is often surprisingly easy for an attacker to meet, e.g., by sending an excessive number of OCSP requests to the OCSP responder (e.g., the CA) at the same time as the OCSP request from the relying party. In particular, an attacker is likely to be able to launch such an attack by intentionally invoking appropriate links from a website controlled by the attacker, in a so-called web-puppet attack; see the web-security chapter of [192].

Figure 8.13 (message flow): the MitM (fake server, holding a revoked certificate) answers the browser's TLS Client Hello with a TLS Server Hello carrying the revoked certificate; the browser sends an OCSP request to the OCSP Responder (CA), but the attacker drops the request or the response; after a time-out, the browser soft-fails and completes the TLS key exchange and finish with the attacker.
Figure 8.13: The MitM soft-fail attack on a TLS connection using OCSP. The attack assumes the 'classical' OCSP deployment, where the TLS client (browser) sends the OCSP request (acts as the OCSP client), and (vulnerable) soft-fail handling of timeouts and ambiguous OCSP responses. The attacker impersonates a website for which the attacker has the private key; the corresponding certificate is already revoked, but the attack tricks the browser into accepting it anyway, allowing the impersonation to succeed. The browser queries the CA (or other OCSP server) to receive a fresh certificate status. However, the attacker 'kills' the OCSP request or the OCSP response (the figure illustrates dropping of the response). After waiting for some time, the browser times out and accepts the revoked certificate sent by the impersonating website, although no OCSP response was received. This soft-fail behavior is used by most browsers, since the alternatives (very long timeout, asking the user, or hard-fail) are not well received by users.
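The dilemma can be summarized in a few lines. The following sketch (Python; query_ocsp is a hypothetical callable, e.g., along the lines of the OCSP sketch above) shows how a single soft_fail flag determines whether a blocked or delayed OCSP response lets the connection proceed - exactly the behavior the attack of Figure 8.13 exploits.

    import socket
    import urllib.error

    def certificate_acceptable(query_ocsp, soft_fail=True, timeout=3.0) -> bool:
        try:
            status = query_ocsp(timeout=timeout)   # "GOOD", "REVOKED" or "UNKNOWN"
        except (socket.timeout, urllib.error.URLError):
            # No response: soft-fail proceeds anyway, hard-fail aborts.
            return soft_fail
        return status == "GOOD"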
In spite of this, soft-fail is the common choice of browsers and most other relying parties, basically because developers give more weight to user-experience (UX) considerations than to security considerations - see the UX>Security precedence rule (Note 8.3). Unfortunately, as we explained, this allows attackers to circumvent OCSP and use revoked certificates, by intentionally causing a failure of the OCSP request-response communication.

There are several additional problems with the use of the 'classical' OCSP deployment, where the OCSP request is sent by the relying party (often, the browser):

Delay: since OCSP is an online, request-response protocol, its deployment at the beginning of a connection often results in considerable delay.

Privacy exposure: the stream of OCSP requests (and responses) may expose the identities of websites visited by the user to the OCSP responder, or to other agents able to inspect the network traffic. By default, OCSP requests and responses are not encrypted, exposing this information even to an eavesdropper; but even if encryption is used, privacy is at risk. First, the CA is still exposed to the identities of websites visited by a particular user. Second, even with encryption of OCSP requests and responses, the timing patterns create a side-channel that may allow an eavesdropper to identify visited websites.

Processing overhead: while OCSP often reduces overhead significantly compared to CRLs, it still requires each response to be signed, which is a computational burden on the OCSP responder. In addition to this computational overhead, there is the overhead of processing each of the (many) OCSP requests; this overhead remains even when applying optimizations that reduce the OCSP computational overhead, e.g., as in subsection 8.4.4 and Exercise 8.20. The processing overhead is especially a concern for the OCSP responder. Consider the typical case of a CA providing the OCSP responder service; the signatures in OCSP responses imply significant processing overhead, which can be a real concern for the CA. Normally, CAs cannot charge for the overhead of handling these OCSP requests; and to provide reliable service, they should be ready to respond to a Flash Crowd of requests from visitors of a (suddenly popular) website, or to requests sent as part of an intentional Denial-of-Service attack (on the CA or on a subject of a certificate). (The term Flash Crowd is the name of a sci-fi novella by Larry Niven, describing a 'physical flash crowd' due to the use of a transfer booth.)

Due to the overhead concerns, an OCSP responder may limit its services to authorized OCSP clients. To support this, OCSP requests may be signed; some servers may use other ways to authenticate their clients, e.g., using the optional extensions mechanism supported by OCSP requests.

Note 8.3: The UX>Security Precedence Rule
In the OCSP soft-fail vulnerability, as described in Section 8.4.2, most browsers support OCSP, but only using soft-fail; namely, if the OCSP response is not received within some time, the browser simply continues with the connection, i.e., 'gives up' on the OCSP validation and continues using the received certificate, basically assuming that the certificate was not revoked. It is well understood that this allows a MitM attacker to foil the OCSP validation, i.e., the use of the soft-fail approach results in a known vulnerability. Still, browser developers usually prefer to have this vulnerability over the secure alternative of hard-fail, namely, aborting a connection after 'giving up' on the OCSP response.
The reason is that there are also benign causes for the OCSP response not to arrive, such as unusually high delay due to network congestion or high load on the OCSP responder (typically, the CA). Aborting a connection in such cases would result in loss of availability. If the response is only delayed and eventually arrives, waiting for a long time would result in poor performance. Loss of availability, performance, reliability and functionality are all immediately visible to the end users, i.e., they harm the user experience (UX). User experience has a direct, immediate impact on the success of a product. In contrast, security and privacy considerations are rarely visible to the users. As a result, even when vendors and developers care about security and privacy, they usually prefer to compromise on these goals, to avoid harming the user experience (UX) aspects: availability, functionality, performance, usability and reliability. We refer to this as the UX>Security Precedence Rule.

Principle 16 (The UX>Security Precedence Rule). Vendors and developers give precedence to the user experience (UX) considerations (availability, functionality, performance, usability and reliability) over the security and privacy considerations.

Of course, the UX>Security Precedence Rule is just a simplification; real decisions are more complex, and some vulnerabilities will be considered so critical that developers will prefer to fix them, even at the cost of some reduction in UX. However, usually, the challenge for designers and researchers is to find solutions which ensure sufficient security while avoiding or minimizing harm to the user experience (UX).

We next describe OCSP stapling, where the OCSP client is the subject of the certificate rather than the relying party. The goal of OCSP stapling is to mitigate these security, privacy and efficiency concerns. In subsection 8.4.4 we discuss additional methods to reduce the computational overhead of OCSP.
8.4.3 OCSP Stapling and the Must-Staple Extension

In the previous subsection, we have seen several disadvantages of the 'classical' OCSP deployment, where the relying party sends the OCSP requests (i.e., acts as the OCSP client). In this section we discuss an alternative approach, the OCSP stapling deployment, where the OCSP request is sent by the subject, typically the website, acting as the OCSP client. Namely, this design moves the responsibility to obtain 'fresh' signed OCSP responses to the subject (e.g., the web-server), rather than placing this responsibility (and burden) on every client (e.g., browser). This addresses the privacy exposure and reduces the overhead on the OCSP responder (typically, the CA), since the responder now only needs to send a single signed OCSP response to each subject (website) - much less overhead than sending one to every relying party (browser). Furthermore, since now only the subject is supposed to make OCSP requests, the CA may limit the service to its customers, the subjects.

Therefore, of all the concerns discussed for the relying-party-based OCSP, only one remains: the handling of ambiguous OCSP responses, and in particular, the MitM soft-fail attack (Figure 8.13). We discuss two variants of OCSP stapling, which handle such ambiguities and failures in two different ways.

OCSP Stapling. OCSP stapling is a different way to deploy OCSP, where the subject runs the OCSP client and periodically sends OCSP requests to the OCSP responder, requesting an OCSP response for the server's certificate, e.g., C_B. Let us focus on the typical scenario, where the relying party is a browser running TLS, which receives a certificate C_B from the web (and TLS) server, e.g., bob.com, the subject of the certificate C_B.

In OCSP stapling, the subject (web server) periodically sends an OCSP request to the OCSP responder (CA), without waiting for the TLS Client Hello message from the client. The CA (or other OCSP responder) sends back the OCSP response; usually, the response indicates that C_B is still Ok (not revoked) at the current time time(·). We denote this response by σ; importantly, σ = Sign_{CA.s}(C_B, Ok, time(·)), i.e., it contains a signature by the private signing key CA.s of the CA, over the web-server's certificate C_B and the current time. This response should satisfy browsers (as relying parties), at least until bob.com 'refreshes' it by again sending an OCSP request for C_B. The web-server, e.g., bob.com, keeps the response σ, providing it to all connections from OCSP-stapling-supporting browsers, until it requests and receives a newer OCSP response in the next period.

When an OCSP-stapling-supporting browser connects to bob.com, it indicates its support for OCSP stapling by including the CSR TLS extension; CSR stands for the Certificate Status Request TLS extension. If the server supports stapling and has a valid OCSP response σ, then it staples (includes) the OCSP response σ, placing it in the CSR TLS extension sent in the server's response. See this scenario in Figure 8.14.
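The subject's side of this process amounts to a simple refresh-and-cache loop, sketched below (Python). Here fetch_ocsp_response is a hypothetical helper performing the OCSP exchange with the CA, e.g., along the lines of the OCSP sketch above, and the actual stapling of σ into the TLS handshake is done by the TLS implementation, which is not shown.

    import threading
    import time

    class StapleCache:
        """Periodically fetch and cache a fresh OCSP response (sigma) for the
        server's own certificate C_B; the TLS layer staples the cached value
        into the CSR extension of each handshake."""

        def __init__(self, fetch_ocsp_response, refresh_seconds=600):  # T_S = 10 minutes
            self.fetch = fetch_ocsp_response
            self.refresh_seconds = refresh_seconds
            self.sigma = None                 # latest DER-encoded OCSP response
            threading.Thread(target=self._loop, daemon=True).start()

        def _loop(self):
            while True:
                try:
                    self.sigma = self.fetch()     # sigma = Sign_{CA.s}(C_B, Ok, time)
                except Exception:
                    pass                          # keep the previous, still-valid response
                time.sleep(self.refresh_seconds)

        def stapled_response(self):
            return self.sigma                     # None if no valid response is available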
Note that we discuss here the variant of OCSP deployment where stapling is optional; i.e., the web-server may not staple an OCSP response, e.g., if the web-server did not receive the OCSP response from the OCSP responder. We later discuss OCSP Must-Staple, a variant of OCSP deployment where the subject commits to sending a valid OCSP response.

Once the browser receives the OCSP response (in the CSR TLS-extension), it validates it, i.e., it verifies the signature of the CA (using the CA's public validation key CA.v), and then validates that the response indicates non-revocation (which we marked by Ok) and that the time indicated is 'recent enough'. When all is Ok, the browser completes the TLS handshake with bob.com and then continues with the TLS connection.

We described the OCSP-stapling process for a TLS connection between a browser and a web-server, for the case where the certificate was issued by a root CA (directly trusted by the browser). However, the process is exactly the same for other TLS clients and servers, and the modifications for the (typical) case of an intermediate CA are simple, following the multi-cert OCSP request-response as discussed earlier, including in Note 8.2.

[Figure 8.14 here: message-sequence diagram between the web/TLS server bob.com (subject of CB and OCSP client), the browser (TLS client and relying party), and the CA (OCSP responder); messages: OCSP request (for CB), OCSP response σ = SignCA.s(CB Ok:time(·)), TLS Client Hello with the CSR TLS-extension, TLS Server Hello with the CSR extension carrying σ, TLS key exchange and finish.]

Figure 8.14: (Optional) OCSP stapling in the TLS protocol, using the Certificate Status Request (CSR) TLS extension, for a typical TLS connection between a browser and the web-server bob.com, the subject of certificate CB. bob.com received CB from the CA (not shown); the CA is also the OCSP responder. The web (and TLS) server bob.com periodically sends OCSP requests to the CA (also the OCSP responder), requesting the status of its own certificate CB. The CA sends back the OCSP response, σ = SignCA.s(CB Ok:time(·)), signaling that CB was not revoked up to time time(·). The browser sends the TLS CSR extension to bob.com with the TLS Client Hello, to request OCSP-stapling. The server sends back σ, the OCSP response, also in the CSR extension. The TLS handshake then completes as usual.

Handling ambiguous OCSP responses and the MitM soft-fail attack. Let us now return to discuss the handling of ambiguous OCSP responses, and in particular, handling of the case where no OCSP response is received. For stapled OCSP, such a failure may happen either between the subject of the certificate, typically the web-server, who acts as the OCSP client, and the CA (OCSP responder); or between the relying party, typically the browser, and the subject (web-server). In particular, this will happen if the web-server does not support OCSP stapling. In any case, the bottom line is that the browser does not receive a stapled OCSP response from the web-server. In the 'optional' OCSP stapling design, this simply directs the browser to attempt to resolve the revocation situation by itself. Typically, the browser would now perform an OCSP query directly with the OCSP responder (typically, the CA), or even request the CRL. However, now we are basically back in the 'classical OCSP' deployment, where OCSP (and/or CRL) are deployed by the relying party.
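Before turning to the failure cases, here is a minimal sketch of the validation steps described above, for the simple case where the OCSP response is signed directly by the CA using RSA; it uses Python's cryptography package, and MAX_AGE (what counts as 'recent enough') is a policy choice of ours. A real client must also match the response to the certificate's serial number and handle delegated responder certificates, other signature algorithms, and nonces.

import datetime
from cryptography import x509
from cryptography.hazmat.primitives.asymmetric import padding
from cryptography.x509 import ocsp

MAX_AGE = datetime.timedelta(days=7)   # policy: how recent the response must be

def stapled_response_ok(stapled_der: bytes, ca_cert: x509.Certificate) -> bool:
    response = ocsp.load_der_ocsp_response(stapled_der)
    if response.response_status != ocsp.OCSPResponseStatus.SUCCESSFUL:
        return False
    # Verify the CA's signature (using CA.v) over the response; raises InvalidSignature on failure.
    ca_cert.public_key().verify(
        response.signature, response.tbs_response_bytes,
        padding.PKCS1v15(), response.signature_hash_algorithm)
    # The response must say 'good' (Ok, i.e., not revoked) ...
    if response.certificate_status != ocsp.OCSPCertStatus.GOOD:
        return False
    # ... and must be 'recent enough'.
    age = datetime.datetime.utcnow() - response.this_update
    return age <= MAX_AGE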
So, let us consider again the browser's response if it fails to receive a response to its OCSP (or CRL) request. This places the browser in a dilemma similar to the one discussed earlier - and most implementations would adopt the soft-fail approach, i.e., use the certificate, assuming that it was not revoked. Unfortunately, this implies that we are again vulnerable to a MitM soft-fail attack, similar to the one presented earlier (Figure 8.13). The attack is only slightly modified due to the failed attempt at OCSP stapling, and should be quite clear from Figure 8.15. One way to defend against the MitM soft-fail attack (Figure 8.15) is to use the Must-Staple extension of the server's X.509 certificate, which we discuss next.

The Must-Staple X.509 extension: enforcing a stapled OCSP response. The attacks of Figure 8.13 and Figure 8.15 show the risk of adopting the soft-fail approach. The soft-fail mechanism is the equivalent of deciding to allow bypassing of airport security screening whenever the line becomes too long. A likely outcome of such a policy would be that an attacker will find ways to cause the line to be congested, and then use the bypass to avoid screening and perform an attack. We sum this up with the following principle.

Principle 17 (Soft-fail security is insecure). Defenses should not be bypassed due to failures: if defenses are bypassed upon failure, an attacker will cause failures in order to bypass the defenses. Namely, soft-fail security is insecurity.

Awareness of the risk of the soft-fail approach motivates adoption of the harsher, hard-fail approach. However, this conflicts with the UX>Security precedence rule (Principle 16). Definitely, it would be absurd for a browser to refuse a connection to a website only because it did not receive the OCSP response; this is very likely due to a benign reason, such as the website simply not supporting OCSP stapling!

The TLS-feature X.509 extension [184] is the standard solution to this dilemma. This extension to the website's X.509 certificate can be used to indicate that the website always staples OCSP responses. To a large extent, this moves the UX vs. security decision from the browser to the website: the browser applies the 'must-staple' policy only to a website that requests it, by using the 'must-staple' extension in its X.509 certificate. As shown in Figure 8.16, this foils the MitM soft-fail attack on the OCSP-stapling TLS client of Figure 8.15. Note, however, that the TLS-feature extension is only effective when the attacker tries to abuse a certificate issued to the legitimate website (with the TLS-feature extension) and later revoked, e.g., after key-exposure was detected or suspected.

[Figure 8.15 here: message-sequence diagram between the TLS client (browser), the MitM (fake server, with a revoked certificate), and the OCSP responder (CA); messages: TLS Client Hello with the CSR extension, TLS Server Hello without an OCSP response, OCSP request and response (dropped by the MitM), time-out leading to soft-fail, TLS key exchange and finish, then data.]

Figure 8.15: MitM soft-fail attack on an OCSP-stapling TLS client (browser), using a revoked TLS server (website) certificate; assume that the attacker has the certified (and revoked) private key. The browser sends the CSR TLS extension; however, the website's certificate does not have the X.509 Must-Staple extension, or the client does not respect this extension.
The attacker impersonates the web-server, and sends the TLS server-hello and certificate messages; the attacker does not send the OCSP response (which would have indicated revocation). The client is misled into thinking that the server does not support OCSP stapling. The client may now send an OCSP request to the appropriate OCSP responder, e.g., the relevant CA, but the MitM attacker would 'kill' the OCSP request or response (the figure shows killing of the response). After a time-out, the client 'gives up' on the OCSP response and 'soft-fails', i.e., accepts the certificate and establishes the connection with the impersonated website.

[Figure 8.16 here: message-sequence diagram between the TLS client (browser) and the MitM (fake server, with a revoked certificate); messages: TLS Client Hello with the CSR extension, TLS Server Hello without the CSR extension while the certificate has the TLS-feature X.509 extension [184] indicating Must-Staple, and the client aborts (and alerts/reports?).]

Figure 8.16: The use of the TLS-feature X.509 extension [184], to indicate Must-Staple, defends against the MitM soft-fail attack on the OCSP-stapling TLS client of Figure 8.15. As in Figure 8.15, the attacker tries to impersonate a website, for which the attacker has the private key and the corresponding certificate, which was already revoked. As in Figure 8.15, the client sends the Client-Hello request, with the CSR TLS extension, i.e., asking the server to staple an OCSP response. As in Figure 8.15, the attacker responds without the CSR extension, i.e., trying to mislead the client into falling back to sending an OCSP request (and then soft-failing). However, the Must-Staple extension instructs the client to refuse to continue without the OCSP response from the server.

In the common case where the attacker is able to get a CA to issue a certificate for a request sent by the attacker, the attacker can surely ask not to include this extension, allowing the attacker to avoid the must-staple mechanism.

Mandatory Must-Staple? The inclusion of the Must-Staple certificate extension in a certificate C prevents an attacker from abusing C after C was revoked, when C was revoked due to (suspected or detected) exposure of the private key. However, the Must-Staple extension does not prevent an attacker from abusing a rogue certificate CR ≠ C, e.g., a certificate with a misleading domain-name (subsection 8.1.1), even after the CA revokes CR (and/or C), if the rogue certificate CR does not include the must-staple extension. Such a rogue certificate CR can still be used to attack a client which does not request and wait for OCSP approval. One way to prevent this would be a mandatory Must-Staple extension, but this seems unlikely to happen. In fact, there are significant challenges to the adoption of the Must-Staple extension, as we now explain.

Must-Staple Adoption Challenges. The UX>Security precedence rule (Principle 16) applies also to websites; website developers would be reluctant to adopt the Must-Staple extension if they believe this may jeopardize the availability of their website. That may be due to different reasons, such as clients processing the extension incorrectly, web-servers not supporting the extension or the OCSP process correctly, or the web-server not receiving the OCSP response from the OCSP responder (usually, the CA). Unfortunately, measurements of adoption published so far were not very encouraging [98].
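As a concrete illustration of the client-side behavior discussed above, a relying party can detect the Must-Staple request by looking for the TLS-feature extension in the server's certificate, and apply the hard-fail policy only then. A minimal sketch, using Python's cryptography package; the policy helper is ours, not a standard API:

from cryptography import x509
from cryptography.x509 import TLSFeature, TLSFeatureType

def requires_stapling(cert: x509.Certificate) -> bool:
    # True if the certificate carries the TLS-feature ('Must-Staple') extension,
    # asking for a stapled status_request (OCSP) response.
    try:
        ext = cert.extensions.get_extension_for_class(TLSFeature)
    except x509.ExtensionNotFound:
        return False
    return TLSFeatureType.status_request in ext.value

def accept_without_stapled_response(cert: x509.Certificate) -> bool:
    # Must-Staple certificate but no stapled response: hard-fail (abort, Figure 8.16).
    # Otherwise, the client is back to the soft-fail dilemma of Figure 8.15.
    return not requires_stapling(cert)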
One possible reason is that CAs may be reluctant to support Must-Staple; indeed, with the proliferation of websites, it is likely that the use of Must-Staple will result in an increased rate of OCSP responses by CAs, each requiring a new signature - a potentially significant overhead. See some (non-standardized) possible optimizations in subsection 8.4.4. However, we still hope that Must-Staple will be gradually adopted, as it offers significant security advantages, with high efficiency for relying parties and subjects. There does not appear to be any technical reason for either the incorrect processing of the extension or for failures of web-servers to receive OCSP responses (and then provide them to the browsers). Indeed, this is an example of the significant adoption challenges facing designers of new Internet and web security mechanisms. Adoption considerations should be an important part of the design process. In the following exercise, we discuss some issues which may help - or hinder - the adoption of the OCSP Must-Staple extension.

Exercise 8.7. For each of the following variants of the OCSP Must-Staple extension process, explain possible impacts on adoption, security and performance:

1. Mark the Must-Staple extension as a critical X.509 extension.

2. Mark the Must-Staple extension as a non-critical X.509 extension.

3. When a browser receives from a website a certificate with the Must-Staple extension, but without the stapled OCSP response, then the browser would not abort the connection, but request an OCSP response directly from the CA, and abort the connection only if this request also fails.

4. Same as the previous item; however, the website/CA will have the ability to indicate whether the client should try sending an OCSP request to the CA (if it does not receive it stapled from the web-server). Consider three ways to indicate this: (a) an option of the OCSP Must-Staple extension, (b) a separate extension, or (c) an option indicated in a TLS extension returned by the web server.

Notice that the Must-Staple extension requires support by the CA, to include it in the web-server's certificate and to provide a sufficiently-reliable OCSP service. An alternative solution, which does not require such a special certificate-extension, is discussed in Exercise 8.18.

8.4.4 Reducing OCSP Computational Overhead

As can be seen in Table 8.3, OCSP performance is typically better in most aspects: low delay, relatively low bandwidth, and no storage required. However, the OCSP computational overhead can be higher. For the CA, this overhead is mainly due to the signature operations; for the relying party, it is mostly due to signature verifications.

[Figure 8.17 here: a Merkle tree over eight certificates c1, ..., c8 and their statuses s1, ..., s8; the leaves are hi = h(ci, si), the internal nodes are h1−2 = h(h1 ++ h2), h3−4, h5−6, h7−8, h1−4, h5−8, and the root h1−8 is signed together with the time: σ1−8 = SignCA.s(h1−8 ++ time).]

Figure 8.17: Certificates-Merkle-tree variant of OCSP: optimizing the OCSP response by signing the digest of a Merkle-tree whose leaves are the certificates ci and their statuses si ∈ {good, revoked, unknown} (subsection 3.7.3). The signature is computed over the root of the hash-tree and the time. Every internal node is the hash of its children; in particular, for every i holds hi = h(ci, si), and hi−(i+1) = h(hi ++ hi+1).
To validate any certificate, say c3, provide the signature over the certificate hash-tree, i.e., σ1−8, the time-of-signing, and the digest scheme's Proof-of-Inclusion (PoI), i.e., the values of the internal hash nodes required to recompute the signed hash, namely h4, h1−2 and h5−8.

In this subsection, we discuss several non-standard optimizing variants of OCSP, which can reduce its computational overhead.

The Certificate-Hash-Tree. This OCSP variant uses the Merkle-tree scheme, introduced in subsection 3.7.3, to allow the CA or OCSP responder to periodically perform a single signature operation, used to provide OCSP responses indicating the status for any OCSP request. Assume that the CA issued a large set of certificates c1, c2, ..., cn, but each OCSP request will contain only one or few certificate-identifiers. As shown in Figure 8.17, the signature is computed over the result of a hash-tree applied to the entire set of certificates issued by the CA (and their statuses), concatenated with the current time. The leaves of the hash-tree are the pairs of individual certificates c1, ..., cn and their corresponding statuses s1, ..., sn. The construction uses a collision-resistant hash function (CRHF) denoted h. The OCSP response for a query for the status of certificate ci consists of this signature, the time of signing, and the values of 'few' internal nodes, essentially one node per layer of the hash tree. This allows the OCSP client to recompute the result of the hash tree, and then validate the signature. For example, to validate the value of c6, the response should include h5, h7−8 and h1−4. To validate, compute h6 = h(c6, s6), then h5−6 = h(h5 ++ h6), then h5−8 = h(h5−6 ++ h7−8), and then h1−8 = h(h1−4 ++ h5−8); finally, verify the signature over h1−8 and time, by validating that VerifyCA.v(σ1−8, h1−8 ++ time) returns true. We refer to this set of values (e.g., c6, h5, h7−8, h1−4) as the Proof-of-Inclusion (PoI) of c6.

Exercise 8.8. Consider the Certificates-Merkle-tree variant of OCSP, described above and illustrated in Figure 8.17.

1. Present pseudo-code for the validation of the OCSP responses by a client (relying party), when using this variant.

2. Let n be the number of certificates issued by a CA, r be the number of revoked certificates, and i be the number of certificate-identifiers sent in a given OCSP request. Note that r < n and, typically, i << r. What is the number of signature and hash operations required to (a) produce and send a CRL, (b) produce and send an OCSP response, (c) produce a certificate-hash-tree OCSP response?

3. This variant uses a (keyless) collision-resistant hash function (CRHF) h. Explain a disadvantage of this requirement and suggest a change to the design that will avoid this disadvantage.

Signed Revocations-Status Merkle-Tree. We can further significantly reduce the overhead of OCSP, by using a Merkle-tree of revocation statuses instead of a Merkle-tree of certificates. This Merkle tree will still contain one leaf per certificate. However, the value of leaf i will be a bit bi, corresponding to the revocation of certificate i; i.e., bi = 1 if certificate i is revoked, and bi = 0 otherwise. The CA applies the Merkle-tree scheme to these leaves, and obtains the digest of the entire tree of revocations, which it signs.
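The following is a minimal sketch of these two tree constructions, using SHA-256 for h and byte-string concatenation for ++; the leaf encodings and the sign() placeholder are our own illustrative choices, and the number of leaves is assumed to be a power of two. Recomputing the root from a Proof-of-Inclusion and verifying the signature is left for Exercise 8.8.

import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    # levels[0] holds the leaf hashes; each higher level hashes pairs of children;
    # levels[-1][0] is the root of the tree.
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def proof_of_inclusion(levels, index):
    # Sibling hashes, one per level, needed to recompute the root from leaf `index`;
    # whether each sibling is the left or right child follows from the bits of `index`.
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])
        index //= 2
    return proof

# Certificates-Merkle-tree (Figure 8.17): leaf i is h(c_i ++ s_i).
certs = [b"cert-%d" % i for i in range(1, 9)]            # stand-ins for c_1..c_8
status = [b"good"] * 4 + [b"revoked"] + [b"good"] * 3    # s_1..s_8 (here only c_5 is revoked)
cert_levels = build_levels([h(c + s) for c, s in zip(certs, status)])
cert_root = cert_levels[-1][0]

# Revocations-status Merkle-tree: leaf i is just the bit b_i.
bits = [b"0", b"0", b"0", b"0", b"1", b"0", b"0", b"0"]  # b_5 = 1 (revoked)
bit_root = build_levels([h(b) for b in bits])[-1][0]

# In either variant, the CA periodically signs the root together with the time:
#   sigma = Sign_{CA.s}(root ++ time)        # hypothetical sign(), not shown
# and an OCSP response for certificate i carries sigma, the time, and
# proof_of_inclusion(levels, i - 1) for the ith leaf.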
To provide the OCSP response for a query for the status of a particular certificate, say certificate i, the CA includes in the response this signature, together with a Proof-of-Inclusion (PoI) of the value bi as the ith leaf of the tree. This PoI can be further optimized by observing that revocations are not very common, i.e., most leaves will be zero (not revoked). There is no need to include the hash of any subtree whose leaves are all zero (i.e., none of the certificates in it was revoked).

Revoked-certificates Merkle tree. The disadvantage of the revocation-status approach is that it provides information about all certificates. Assuming that only a very small fraction of the certificates are revoked, other optimizations are possible - and possibly even more effective. For example, we present the revoked-certificates Merkle tree approach. This approach is also interesting since it introduces an additional optional mechanism for Merkle digest schemes: a Proof of Non-Inclusion (PoNI).

[Figure 8.18 here: a Merkle tree over the revocation-status bits b1, ..., b8 of eight certificates (in this example b5 = b8 = 1, i.e., revoked, and the rest are 0); the root h1−8 is signed together with the time: σ1−8 = SignCA.s(h1−8 ++ time).]

Figure 8.18: Signed revocations-status Merkle-tree; leaf i contain