Applied Introduction to
Cryptography and Cybersecurity
Amir Herzberg
Comcast Professor of Security Innovations
Department of Computer Science and Engineering
University of Connecticut
March 24, 2024
For updated draft see: http://bit.ly/AI2CS.
Comments, corrections and suggestions are appreciated; send by email to
amir.herzberg@uconn.edu.
©Amir Herzberg
Preface
This textbook introduces cybersecurity, with a focus on the key element of
applied cryptography. Our goal is to provide sufficient depth and precision for
understanding this important and fascinating area, without requiring extensive
prior background in mathematics or in the theory of computer science.
The textbook presents design principles, discusses practical systems, attacks
and vulnerabilities, presents basic cryptographic mechanisms, constructions and
definitions, and includes many examples, exercises and several programming
labs. The goal is that readers will be able to use the book for self-study, and
lecturers will be able to use it as a textbook for a course, or use parts of it for
two courses.
We use the term cybersecurity to refer to the protection of systems involving
communication and computation mechanisms from attacks by adversaries. This
is a broad area, since communication and computation have become so diverse
and important, and since there are many threats and many types of adversaries.
Ensuring security is challenging; attacks often exploit subtle vulnerabilities and
use unexpected strategies, which intuition may fail to consider. Careful,
adversarial thinking is therefore crucial.
Cybersecurity is a very applied area; however, even in this applied area, precise
analysis, definitions and proofs are critical. This stands in contrast to many
other areas of engineering, where designs are often evaluated under typical,
expected scenarios and failures. That approach is insufficient for cybersecurity,
since security should be ensured against arbitrary attacker strategies, rather
than only against expected, familiar attacks. Defenses should be designed
assuming limitations on the capabilities of the adversary, but without making
any assumptions on the adversary's strategy.
Cybersecurity is a broad area, including many aspects, technical as well
as non-technical (legal, economic, social and more). The technical aspects
include cryptography, network security, software security, system security, privacy,
secure human-computer interaction and more. This textbook focuses on
applied cryptography, and also introduces some aspects of network security and
of secure human-computer interaction. We believe that this is a good choice for
a first course in cybersecurity, for three reasons:
• Applied cryptography is essential to many areas of cybersecurity. Hence,
this textbook may provide a common basis for students interested in
these different areas. Some students may continue by focusing mostly on
cryptography, maybe even on the theory of cryptography; for these, we
hope to provide a good basis in the applied aspects of cryptography. Others
may continue to areas of cybersecurity which ‘just’ use cryptography, such
as network security, secure systems, privacy and human-centered security;
for those, we hope to provide the necessary background in cryptography.
• The study of cryptography develops adversarial thinking, which is critical
for every cybersecurity expert. Modern cryptography is based on precise
definitions of goals and assumptions, with analysis and proofs of security
for given adversary capabilities, not assuming a specific adversary strategy
or attack method. Other areas of cybersecurity often use intuitive goals,
design and analysis, and may even focus on specific adversary strategies.
Such approaches may be unavoidable, since precise definitions and proofs
are often infeasible; however, these approaches appear less helpful in
developing adversarial thinking.
• Finally, there is the pragmatic consideration of scheduling and prerequisites.
Modern cryptography is based on both mathematics and the theory of
computing; however, as we believe you will find, it is possible to use only a
limited amount of math and theory, and we introduce this limited amount
within this book. As a result, the text can be used by readers who have not
taken math or computer-science courses beyond what is covered in many
high-school programs. This is in contrast to other areas of cybersecurity,
which require considerable background (e.g., in networking, operating
systems, or programming).
The need for security against attackers with arbitrary strategies, restricted
only by their capabilities, motivates the use of provable security, as well as
the use of standard, well-studied designs, whose security was confirmed by
experts. Indeed, some of the worst failures of security systems, and especially of
cryptographic systems, are due to attempts by non-experts to design schemes
and protocols.
This textbook tries to combine practice with theory, applicability with
precision, breadth with depth. These are ambitious goals, and we hope we are
not completely off the mark. Feedback and suggestions for improvement are
highly appreciated.
Organization and usage of this textbook
This textbook is designed for an introductory course in cybersecurity, focusing
on applied cryptography. It may be used for self-study, or for one (large) or
two (smaller) courses.
Chapter 1 provides an essential introduction to cryptography, with an intuitive
discussion of the main mechanisms we cover, and introduces the challenge of
defining security. It also introduces notations used throughout the book, and
provides critical background information; additional background is provided
in Appendix A. The rest of the book was designed so that the necessary
background in each topic is very limited; we believe it is not necessary to require
learning these topics in a prerequisite course. The introduction also provides a
brief historical perspective, which we hope some readers may find of interest.
In Chapters 2 to 4, we begin introducing applied cryptography. These chapters
focus on the efficient and conceptually-simple shared-key cryptographic functions:
encryption (Chapter 2), authentication (Chapter 4) and hashing (Chapter 3).
In Chapter 3, we also begin introducing more elaborate applied cryptographic
schemes, focusing on hashing-based schemes such as the Merkle digest schemes
and blockchains.
Chapter 5 introduces applied shared-key cryptographic protocols. By focusing
on shared-key protocols, we provide a gentle introduction to the important
subject of resiliency to key exposure. This also provides good motivation for
public-key cryptography, the topic of the next chapter (Chapter 6). Indeed,
Chapter 6 deals extensively with different public-key protocols and applications,
and shows the powerful ability of public-key cryptography to ensure
resiliency to, and recovery from, key exposure.
The next three chapters cover areas of cybersecurity which are closely related
to cryptography, focusing mostly on the cryptographic aspects. Two chapters
introduce network security: the important TLS protocol in Chapter 7, and
Public Key Infrastructure (PKI) in Chapter 8. Then, Chapter 9 covers the
critical topic of human-centered cryptography; too often, this aspect is not
sufficiently taken into account, and cryptography is circumvented by exploiting
human behavior and psychology.
We conclude the book in Chapter 10, briefly discussing some advanced topics
such as secret sharing, privacy and anonymity, elliptic curves cryptography and
quantum (and post-quantum) cryptography, as well as some of the important
aspects of cybersecurity which are beyond this text.
The use of background such as math and theory in this textbook is limited to
what appears to be essential or helpful for understanding, at least for a significant
fraction of readers. Some of this background is covered in Appendix A.
Study of this textbook may be followed by study of other areas in cybersecurity,
or by in-depth study of cryptography. There are multiple excellent
in-depth textbooks on cryptography or specific cryptographic topics; some of
my favorites are [16, 165, 166, 205, 309, 370].
Acknowledgments
I received a lot of help in developing this textbook from friends, colleagues and
students; I am very grateful. Specific thanks to:
• The students and my peers in the University of Connecticut, who have
been very understanding and supportive.
• Sara Wrotniak, my PhD student, who gave me incredible feedback when
she studied using this textbook as an undergrad, and later when I asked
her to review the text further.
• Professors and researchers who provided valuable feedback, with or without teaching a course using the textbook, including: Ghada Almashaqbeh,
Nimrod Aviram, Ahmed El-Yahyaoui, Peter Gutmann, John Heslen, Walter Krawec, Laurent Michel, David Pointcheval, Ivan Pryvalov, Zhiyun
Qian, Amir and Luba Sapir, Jerry Shi, Haya Shulman, Ewa Syta, and Ari
Trachtenberg. Walter has also kindly contributed the section on quantum
cryptography (Section 10.4).
• Instructors and teaching-assistants who used the text and provided important feedback: Justin Furuness, Hemi Liebowiz, Sam Markelon, and
Anna Mendonca. I’m especially indebted to Anna for collecting excellent
feedback from her students, and for introducing me to Sara.
• Many other readers who helped with feedback, corrections and suggestions,
including Pierre Abbat, Yanjing Xu, Bar Meyuchas, and Yike (Nicole)
Zhang and many others.
• The LaTeX and TikZ communities, who provide amazing resources and
support. Special thanks to Nils Fleischhacker for the cool ‘tikz people’
package, and to Shaanan Cohney, who kindly sent me the LaTeX source
code from [100], which I turned into Figure 2.37.
I probably forgot to mention some important contributors; please accept
my apologies and let me know. Indeed, in general, please let me know of any
omission or mistake in the text, and accept my thanks in advance. Many thanks
to all of you; feedback is greatly appreciated, definitely including harsh criticism.
Special thanks to my friend, PhD adviser and mentor, Prof. Oded Goldreich,
who endured the challenge of trying to teach me both cryptography and proper
writing. I surely cannot meet Oded’s high standards, but I believe and hope
that this textbook does help to provide the necessary foundations of applied
cryptography to practitioners of cybersecurity. Laying foundations is definitely
a goal I share with Oded. Let me quote Oded’s father:
It is possible to build a cabin with no foundations, but not a lasting
building
Eng. Isidor Goldreich
Last but not least, I wish to thank my family for their incredible support and
understanding throughout the years, especially my amazing parents, Ada and
Michael, my lovely and supportive wife Tatiana, and my wonderful and talented
children, Raaz, Tamir, Omri and, last but not least, Karina.
Contents

Preface . . . . . iii
Contents . . . . . vii
List of Figures . . . . . xvi
List of Tables . . . . . xxi
List of Labs . . . . . xxiii
List of Principles . . . . . xxiv
Preface . . . . . 1

1 Introducing Cybersecurity and Cryptography . . . . . 5
1.1 Cybersecurity and Cryptography: the basics . . . . . 5
1.1.1 Three Cybersecurity Functions: Prevention, Detection and Deterrence . . . . . 6
1.1.2 Generic Security Goals . . . . . 8
1.1.3 Attack Model . . . . . 9
1.1.4 Provable Security . . . . . 10
1.1.5 Risk and cost-benefit analysis . . . . . 12
1.2 The basic mechanisms: encryption, signatures and hashing . . . . . 13
1.2.1 Encryption: symmetric and asymmetric cryptosystems . . . . . 13
1.2.2 Kerckhoffs’ principle . . . . . 15
1.2.3 Digital Signature schemes . . . . . 17
1.2.4 Applying Signatures for Evidences and for Public Key Infrastructure (PKI) . . . . . 20
1.2.5 Cryptographic hash functions . . . . . 22
1.3 Sequence Diagrams and Notations . . . . . 22
1.3.1 Sequence diagrams . . . . . 24
1.4 A Bit of Background . . . . . 24
1.4.1 A bit of Computational Complexity . . . . . 26
1.4.2 A bit of Number Theory and Group Theory . . . . . 27
1.4.3 A bit of Probability . . . . . 28
1.5 Provable-Security and Definitions . . . . . 29
1.5.1 Definition of a Signature Scheme . . . . . 30
1.5.2 Signature attack models and the conservative design principle . . . . . 31
1.5.3 Types of forgery . . . . . 33
1.5.4 Game-based Security and the Oracle Notation . . . . . 34
1.5.5 The Existential Unforgeability CMA Game . . . . . 35
1.5.6 The unforgeability advantage function . . . . . 37
1.5.7 Concrete security, asymptotic security and negligible functions . . . . . 37
1.5.8 Existentially-unforgeable signature schemes . . . . . 39
1.6 A Brief History of Cryptography, Computing and Cybersecurity . . . . . 40
1.6.1 A brief history of cryptography . . . . . 40
1.6.2 A brief history of computing and cybersecurity . . . . . 43
1.7 Lab and Additional Exercises . . . . . 46
2 Confidentiality: Encryption Schemes and Pseudo-Randomness 51
2.1 Historical Ciphers . . . . . . . . . . . . . . . . . . . . . . . . . 52
2.1.1 Ancient Keyless Ciphers . . . . . . . . . . . . . . . . . . 53
2.1.2 Keyed-Caesar cipher . . . . . . . . . . . . . . . . . . . . 57
2.1.3 The General Monoalphabetic Substitution (GMS) Cipher 58
2.1.4 Frequency analysis attacks on monoalphabetic ciphers . 59
2.1.5 The Polyalphabetic Vigenère ciphers . . . . . . . . . . . 61
2.2 Cryptanalysis Attack Models: CTO, KPA, CPA and CCA . . . 64
2.3 Generic attacks and Effective Key-Length . . . . . . . . . . . . 69
2.3.1 The generic exhaustive-search CTO attack . . . . . . . . 70
2.3.2 The Table Look-up and the Time-Memory Tradeoff Generic
CPA attacks . . . . . . . . . . . . . . . . . . . . . . . . 72
2.3.3 Effective key length . . . . . . . . . . . . . . . . . . . . 73
2.4 Unconditional security and the One Time Pad (OTP) . . . . . 75
2.5 Pseudo-Randomness, Indistinguishability and Asymptotic Security 77
2.5.1 Pseudo-Random Generators and Stream Ciphers . . . . 78
2.5.2 The Turing Indistinguishability Test . . . . . . . . . . . 80
2.5.3 PRG indistinguishability test . . . . . . . . . . . . . . . 80
2.5.4 Defining Secure Pseudo-Random Generator (PRG) . . . 81
2.5.5 Secure PRG Constructions . . . . . . . . . . . . . . . . 83
2.5.6 RC4: Vulnerabilities and Attacks . . . . . . . . . . . . . 86
2.5.7 Random functions . . . . . . . . . . . . . . . . . . . . . 89
2.5.8 Pseudorandom functions (PRFs) . . . . . . . . . . . . . 94
2.5.9 PRF: Constructions and Robust Combiners . . . . . . . 101
2.5.10 The key separation principle . . . . . . . . . . . . . . . 101
2.6 Block Ciphers and PRPs . . . . . . . . . . . . . . . . . . . . . . 103
2.6.1 Random and Pseudo-Random Permutations . . . . . . . 104
2.6.2 Block ciphers . . . . . . . . . . . . . . . . . . . . . . . . 106
2.6.3 The Feistel Construction: 2n-bit Block Cipher from n-bit
PRF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
viii
2.7 Defining secure encryption . . . . . . . . . . . . . . . . . . . . . 111
2.7.1 Attack model . . . . . . . . . . . . . . . . . . . . . . . . 111
2.7.2 The Indistinguishability-Test for Shared-Key Cryptosystems . . . 112
2.7.3 The Indistinguishability-Test for Public-Key Cryptosystems (PKCs) . . . . . . . . . . . . . . . . . . . . . . . . 117
2.7.4 Design of Secure Encryption: the Cryptographic Building
Blocks Principle . . . . . . . . . . . . . . . . . . . . . . 118
2.8 Encryption Modes of Operation . . . . . . . . . . . . . . . . . . 120
2.8.1 The Electronic Code Book (ECB) mode . . . . . . . . . . 122
2.8.2 The CTR and PBR modes . . . . . . . . . . . . . . . . 124
2.8.3 The Output-Feedback (OFB) Mode . . . . . . . . . . . 127
2.8.4 The Cipher Feedback (CFB) Mode . . . . . . . . . . . . 131
2.8.5 The Cipher-Block Chaining (CBC) mode . . . . . . . . 133
2.8.6 Modes of Operation Ensuring CCA Security? . . . . . . 135
2.9 Padding Schemes and Padding Oracle Attacks . . . . . . . . . . 135
2.10 Case study: the (in)security of WEP . . . . . . . . . . . . . . . 139
2.10.1 CRC-then-XOR does not ensure integrity . . . . . . . . 141
2.11 Encryption: Final Words . . . . . . . . . . . . . . . . . . . . . 142
2.12 Lab and Additional Exercises . . . . . . . . . . . . . . . . . . . 144
3 Integrity: from Hashing to Blockchains . . . . . 159
3.1 Introducing cryptographic hash functions, properties and variants . . . 160
3.1.1 Warm-up: hashing for efficiency . . . . . . . . . . . . . . 161
3.1.2 Properties of cryptographic hash functions . . . . . . . . 164
3.1.3 Applications of cryptographic hash functions . . . . . . 167
3.1.4 Standard cryptographic hash functions . . . . . . . . . . 168
3.2 Collision Resistant Hash Function (CRHF) . . . . . . . . . . . 168
3.2.1 Keyless Collision Resistant Hash Function (Keyless-CRHF) . . . 168
3.2.2 There are no Keyless CRHFs! . . . . . . . . . . . . . . . 170
3.2.3 Keyed Collision Resistance . . . . . . . . . . . . . . . . 172
3.2.4 Birthday and exhaustive attacks on CRHFs . . . . . . . 175
3.2.5 CRHF Applications (1): File Integrity . . . . . . . . . . 176
3.2.6 CRHF Applications (2): Hash-then-Sign (HtS) . . . . . 176
3.3 Second-preimage resistance (SPR) Hash Functions . . . . . . . 180
3.3.1 The Chosen-Prefix Collisions Vulnerability . . . . . . . . 183
3.4 One-Way Functions . . . . . . . . . . . . . . . . . . . . . . . . . 186
3.4.1 OTPw (One-Time Password) Authentication . . . . . . 187
3.4.2 Using OWF for One-Time Signatures . . . . . . . . . . 188
3.5 Randomness Extraction and Key Derivation Functions . . . . . 191
3.5.1 Von Neumann’s Biased-Coin Extractor . . . . . . . . . . 192
3.5.2 The Bitwise Randomness Extractor . . . . . . . . . . . . 192
3.5.3 Key Derivation Functions (KDFs) and Extract-then-Expand . . . 194
3.6 The Random Oracle Model . . . . . . . . . . . . . . . . . . . . 196
3.7 Static Accumulator Schemes and the Merkle-Tree . . . . . . . . 198
3.7.1 Definition of a Keyless Static Accumulator . . . . . . . . 198
3.7.2 Collision-Resistant Accumulators . . . . . . . . . . . . . 199
3.7.3 The Merkle tree (MT) Accumulator and its Collision-resistance . . . 200
3.7.4 The Proof-of-Inclusion (PoI) Requirements . . . . . 202
3.7.5 The Merkle-Tree PoI . . . . . 202
3.8 Dynamic Accumulators . . . . . 205
3.8.1 Dynamic accumulators: motivations, extensions and definition . . . . . 205
3.8.2 Dynamic Accumulator Collision-Resistance . . . . . 207
3.8.3 Constructing a dynamic accumulator from a static accumulator . . . . . 209
3.9 The Merkle-Damgård Construction . . . . . 209
3.9.1 The Merkle-Damgård Static Accumulator . . . . . 210
3.9.2 The Merkle-Damgård Hash Function . . . . . 213
3.9.3 The Merkle-Damgård Dynamic Accumulator . . . . . 216
3.10 Blockchains, PoW and Bitcoin . . . . . 216
3.10.1 Blockchain Design . . . . . 218
3.10.2 The Bitcoin blockchain and cryptocurrency . . . . . 219
3.11 Lab and additional exercises . . . . . 223

4 Authentication: MAC, Blockchain and Signature Schemes . . . . . 231
4.1 Encryption for Authentication? . . . . . 232
4.2 Message Authentication Code (MAC) schemes . . . . . 233
4.3 Message Authentication Code (MAC): Definitions . . . . . 235
4.4 Applying MAC Schemes . . . . . 238
4.5 Constructing MAC from a Block Cipher . . . . . 240
4.5.1 Every PRF is a MAC . . . . . 241
4.5.2 CBC-MAC: ln-bit MAC (and PRF) from n-bit PRF . . . . . 242
4.5.3 Constructing Secure VIL MAC from PRF . . . . . 244
4.6 Other MAC Constructions . . . . . 245
4.6.1 MAC design ‘from scratch’ . . . . . 245
4.6.2 Robust combiners for MAC . . . . . 246
4.6.3 HMAC and other constructions of a MAC from a Hash function . . . . . 247
4.7 Combining Authentication, Encryption and Other Functions . . . . . 250
4.7.1 Authenticated Encryption (AE) and AEAD schemes . . . . . 251
4.7.2 Authentication via EDC-then-Encryption? . . . . . 253
4.7.3 Generic Authenticated Encryption Constructions . . . . . 253
4.7.4 Single-Key Generic Authenticated-Encryption . . . . . 257
4.7.5 Authentication, encryption, compression and error detection/correction codes . . . . . 258
4.8 Additional exercises . . . . . 261
5 Shared-Key Protocols . . . . . 267
5.1 Modeling cryptographic protocols . . . . . . . . . . . . . . . . . 268
5.1.1 The session/record protocol . . . . . . . . . . . . . . . . 269
5.1.2 A simple EtA session/record protocol . . . . . . . . . . 271
5.2 Shared-key Entity Authentication Protocols . . . . . 273
5.2.1 Interactions and requirements of entity authentication protocols . . . . . 274
5.2.2 Vulnerability study: SNA mutual-authentication protocol . . . . . 277
5.2.3 Authentication Protocol Design Principles . . . . . 279
5.2.4 Secure Mutual Entity Authentication with the 2PP protocol . . . . . 280
5.3 Authenticated Request-Response Protocols . . . . . 281
5.3.1 Summary of request/response protocols . . . . . 283
5.3.2 The 2PP-RR Authenticated Request-Response Protocol . . . . . 285
5.3.3 2RT-2PP Authenticated Request-Response protocol . . . . . 286
5.3.4 The Counter-based RR Authenticated Request-Response protocol . . . . . 286
5.3.5 Time-based RR Authenticated Request-Response protocol . . . . . 287
5.4 Shared-key Key Exchange Protocols . . . . . 289
5.4.1 The Key Exchange extension of 2PP . . . . . 291
5.4.2 Deriving Per-Goal Keys . . . . . 292
5.5 Key Distribution Center Protocols . . . . . 293
5.5.1 The Kerberos Key Distribution Protocol . . . . . 293
5.6 The GSM Key Exchange Protocol . . . . . 295
5.6.1 VN-impersonation Replay attack on GSM . . . . . 299
5.6.2 Crypto-agility and cipher suite negotiation in GSM . . . . . 301
5.6.3 The downgrade to A5/2 attack on GSM . . . . . 304
5.7 Resiliency to Exposure: Forward Secrecy and Recover Security . . . . . 309
5.7.1 Forward Secrecy 2PP Key Exchange . . . . . 310
5.7.2 Recover-Security Key Exchange Protocol . . . . . 311
5.7.3 Stronger notions of resiliency to key exposure . . . . . 313
5.8 Additional Exercises . . . . . 316
6 Public Key Cryptography . . . . . 323
6.1 Introduction to PKC . . . . . . . . . . . . . . . . . . . . . . . . 324
6.1.1 Public key cryptosystems . . . . . . . . . . . . . . . . . 324
6.1.2 Signature schemes . . . . . . . . . . . . . . . . . . . . . 325
6.1.3 Public-Key-based Key Exchange Protocols . . . . . . . 326
6.1.4 Advantages of Public Key Cryptography (PKC) . . . . . 328
6.1.5 The price of PKC: assumptions, computation costs and
length of keys and outputs . . . . . . . . . . . . . . . . 329
6.1.6 Hybrid Encryption . . . . . . . . . . . . . . . . . . . . . 333
6.1.7 The Factoring and Discrete Logarithm Hard Problems . 335
6.1.8 The secrecy implied by the discrete logarithm assumption 337
6.2 The DH Key Exchange Protocol . . . . . . . . . . . . . . . . . 339
6.2.1 Physical key exchange . . . . . . . . . . . . . . . . . . . 339
6.2.2 Some candidate key exchange protocols . . . . . . . . . 341
6.2.3 The Diffie-Hellman Key Exchange Protocol and Hardness
Assumptions . . . . . . . . . . . . . . . . . . . . . . . . 345
6.2.4 Secure derivation of keys from the DH protocol . . . . . 348
6.3 Using DH for Resiliency to Exposures . . . . . 350
6.3.1 The Authenticated DH protocol: ensuring PFS . . . . . 351
6.3.2 The DH-Ratchet protocol: Perfect Forward Secrecy (PFS) and Perfect Recover Security (PRS) . . . . . 353
6.4 The DH and El-Gamal PKCs . . . . . 355
6.4.1 The DH PKC and the Hashed DH PKC . . . . . 355
6.4.2 The El-Gamal PKC . . . . . 358
6.4.3 El-Gamal is Multiplicative-Homomorphic Encryption . . . . . 360
6.4.4 Types and Applications of Homomorphic Encryption . . . . . 362
6.5 The RSA Public-Key Cryptosystem . . . . . 365
6.5.1 RSA key generation . . . . . 365
6.5.2 Textbook RSA: encryption, decryption, and signing . . . . . 366
6.5.3 Efficiency of RSA . . . . . 368
6.5.4 Correctness of RSA . . . . . 369
6.5.5 The RSA assumption and the vulnerability of textbook RSA . . . . . 371
6.5.6 Padded RSA encryption: PKCS#1 v1.5 and OAEP . . . . . 373
6.5.7 Bleichenbacher’s Padding Side-Channel Attack on PKCS#1 v1.5 . . . . . 378
6.6 Public key signature schemes . . . . . 383
6.6.1 RSA-based signatures . . . . . 385
6.7 Labs and Additional Exercises . . . . . 388
7 TLS protocols: web-security and beyond . . . . . 399
7.1 Introduction to TLS and SSL . . . . . . . . . . . . . . . . . . . 399
7.1.1 A brief history of SSL and TLS . . . . . . . . . . . . . . 401
7.1.2 TLS: High-level Overview . . . . . . . . . . . . . . . . . 403
7.1.3 TLS: security goals . . . . . . . . . . . . . . . . . . . . 406
7.1.4 TLS: Engineering goals . . . . . . . . . . . . . . . . . . 407
7.1.5 TLS and the TCP/IP Protocol Stack . . . . . . . . . . . 408
7.2 The TLS Record Protocol . . . . . . . . . . . . . . . . . . . . . 409
7.2.1 The Authenticate-then-Encrypt (AtE) Record Protocol 410
7.2.2 The CPA-Oracle Attack Model . . . . . . . . . . . . . . 414
7.2.3 Padding Attacks: Poodle and Lucky13 . . . . . . . . . . 416
7.2.4 The BEAST Attack: Exploiting CBC with Predictable-IV 419
7.2.5 Exploiting RC4 Biases to Recover Plaintext . . . . . . . 423
7.2.6 Exploiting Compress-then-Encrypt: CRIME, TIME, BREACH . . . 424
7.2.7 The TLS AEAD-based record protocol (TLS 1.3) . . . . 426
7.3 The SSLv2 Handshake Protocol . . . . . . . . . . . . . . . . . . 428
7.3.1 SSLv2: the ‘basic’ handshake . . . . . . . . . . . . . . . 429
7.3.2 SSLv2 Key-derivation . . . . . . . . . . . . . . . . . . . 429
7.3.3 SSLv2: ID-based Session Resumption . . . . . . . . . . 431
7.3.4 SSLv2: Client Authentication . . . . . . . . . . . . . . . 433
7.4 The Handshake Protocol: from SSLv3 to TLSv1.2 . . . . . . . 433
7.4.1 SSLv3 to TLSv1.2: improved derivation of keys . . . . . 435
7.4.2 SSLv3 to TLSv1.2: DH-based key exchange . . . . . . . 436
7.4.3 The TLS Extensions mechanism . . . . . 439
7.4.4 SSLv3 to TLSv1.2: session resumption . . . . . 440
7.4.5 SSLv3 to TLSv1.2: Client authentication . . . . . 443
7.5 Negotiations and Downgrade Attacks (SSL to TLS 1.2) . . . . . 444
7.5.1 SSLv2 cipher suite negotiation and downgrade attack . . . . . 445
7.5.2 Handshake Integrity Against Cipher Suite Downgrade . . . . . 447
7.5.3 Finished Fails: the Logjam and FREAK cipher suite downgrade attacks . . . . . 449
7.5.4 Backward compatibility and protocol version negotiation . . . . . 451
7.5.5 The TLS Downgrade Dance and the Poodle Version Downgrade Attack . . . . . 453
7.5.6 Securing the TLS downgrade dance: the SCSV cipher suite and beyond . . . . . 454
7.5.7 The SSL-Stripping Attack and the HSTS Defense . . . . . 454
7.5.8 Three Principles: Secure Extensibility, KISS and Minimize Attack Surface . . . . . 457
7.6 The TLS 1.3 Handshake: Improved Security and Performance . . . . . 458
7.6.1 TLS 1.3: Negotiation and Backward Compatibility . . . . . 461
7.6.2 TLS 1.3 Full (1-RTT) DH Handshake . . . . . 464
7.6.3 TLS 1.3 Full (1-RTT) Pre-Shared Key (PSK) Handshake . . . . . 466
7.6.4 TLS 1.3 Zero-RTT Pre-Shared Key (PSK) Handshake . . . . . 469
7.6.5 TLS 1.3 Key Derivation . . . . . 470
7.6.6 Cross-Protocol Attacks on TLS 1.3 . . . . . 471
7.7 TLS: Final Words and Further Reading . . . . . 472
7.8 Additional Exercises . . . . . 473

8 Public Key Infrastructure (PKI) . . . . . 481
8.1 Introduction: PKI Concepts and Goals . . . . . 482
8.1.1 Rogue certificates . . . . . 485
8.1.2 Security goals of PKI schemes . . . . . 486
8.1.3 The Web PKI . . . . . 487
8.2 The X.509 PKI . . . . . 488
8.2.1 The X.500 Global Directory Standard . . . . . 488
8.2.2 The X.500 Distinguished Name . . . . . 489
8.2.3 X.509 Public Key Certificates . . . . . 493
8.2.4 The X.509v3 Extensions Mechanism . . . . . 497
8.2.5 Trust-Anchor Certificate Validation . . . . . 500
8.2.6 The SubjectAltName and the IssuerAltName Extensions . . . . . 501
8.2.7 Standard key-usage and policy extensions . . . . . 502
8.2.8 Certificate policy (CP) and Domain/Organization/Extended Validation . . . . . 503
8.3 Intermediate-CAs and Certificate Path Validation . . . . . 505
8.3.1 The certificate path constraints extensions . . . . . 506
8.3.2 The basic constraints extension . . . . . 507
8.3.3 The name constraint extension . . . . . 508
8.3.4 The policy constraints extension . . . . . 512
8.4 Certificate Revocation . . . . . 512
8.4.1 Certificate Revocation List (CRL) . . . . . 516
8.4.2 Online Certificate Status Protocol (OCSP) . . . . . 519
8.4.3 OCSP Stapling and the Must-Staple Extension . . . . . 526
8.4.4 Reducing OCSP Computational Overhead . . . . . 532
8.4.5 Optimized Periodic Revocation Status: OneCRL, CRLsets and CRVs . . . . . 537
8.5 Web PKI: Vulnerabilities, Failures and Improvements . . . . . 538
8.5.1 Web PKI Certificate Authority Failures . . . . . 538
8.5.2 X.509/PKIX Defenses against Corrupt/Negligent CAs . . . . . 541
8.6 Certificate Transparency (CT) . . . . . 544
8.6.1 CT: concepts, entities and goals . . . . . 545
8.6.2 The Honest-Logger Certificate Transparency (HL-CT) design . . . . . 548
8.6.3 The Audit-and-Gossip Certificate Transparency (AnG-CT) design . . . . . 552
8.6.4 The NTTP-Security Certificate Transparency (NS-CT) design . . . . . 557
8.7 Additional Exercises . . . . . 564

9 Human-centered Security and Cryptography . . . . . 573
9.1 Introducing User-Centered Security and Cryptography . . . . . 574
9.1.1 Biases and other challenges of protecting human users . . . . . 574
9.1.2 Principles of user-centred security . . . . . 576
9.2 Login Ceremonies . . . . . 578
9.3 Password-based Login Ceremonies . . . . . 579
9.3.1 Password-based web-form login ceremonies . . . . . 582
9.3.2 Impersonating website and phishing attacks . . . . . 583
9.3.3 Defenses against impersonating websites . . . . . 587
9.3.4 Password dictionaries . . . . . 588
9.3.5 Online and offline dictionary attacks . . . . . 590
9.4 Password-file exposures . . . . . 592
9.4.1 Encrypted password file . . . . . 593
9.4.2 Hashed password files . . . . . 594
9.4.3 Adding salt to hashed passwords . . . . . 596
9.4.4 Hashed passwords with pepper . . . . . 597
9.4.5 Using cryptographic co-processor to protect hashed passwords . . . . . 599
9.4.6 Password managers . . . . . 600
9.4.7 Detecting exposure of passwords and password file . . . . . 603
9.5 Password-Authenticated Key Exchange (PAKE) . . . . . 604
9.5.1 Two Naïve-PAKE Protocols . . . . . 606
9.5.2 The EKE2 PAKE protocol . . . . . 609
9.6 Login ceremonies beyond passwords . . . . . 610
9.6.1 Something else you know: beyond passwords . . . . . 611
9.6.2 One-Time Password and Hash-Chain Login Ceremonies . . . . . 613
9.6.3 Device-based Login Ceremonies . . . . . 616
9.6.4 Biometrics: something you are authentication . . . . . 617
9.7 Lab and Additional Exercises . . . . . 617
10 Conclusions and few advanced topics 623
10.1 Secret sharing and its Applications . . . 623
10.2 Side-channels . . . 623
10.3 Elliptic Curves Cryptography . . . 623
10.4 Quantum and post-quantum cryptography: by Walter Krawec . . . 623
10.4.1 Quantum cryptanalysis and post-quantum cryptography . . . 625
10.4.2 Quantum Cryptography . . . 627
10.5 Privacy and anonymity . . . 629
10.6 Theory of cryptography . . . 629
Appendix A Background 631
A.1 Background: Computational Complexity . . . 631
A.2 Background: Number Theory and Group Theory . . . 636
A.2.1 The modulo operation and modular arithmetic . . . 636
A.2.2 Multiplicative inverses . . . 638
A.2.3 Fermat’s and Euler’s Theorems . . . 640
A.2.4 Group Theory, Cyclic Groups and Generators . . . 644
A.3 Background: Probability . . . 646
Index . . . 653
Bibliography . . . 667
List of Figures
1.1 Cybersecurity defense approaches: Prevention, Deterrence and Detection . . . 6
1.2 Generic Security Goals: Confidentiality, Integrity, Authentication, Availability and Non-repudiation . . . 8
1.3 Encryption: terms and typical use . . . 14
1.4 Shared key (symmetric) cryptosystem . . . 14
1.5 Public key (asymmetric) cryptosystem . . . 15
1.6 Digital Signature Scheme . . . 18
1.7 Sequence diagram for the initialization and use of a signature scheme . . . 25
1.8 Comparing linear, quadratic and exponential complexities . . . 27
2.1 Stateful shared key (symmetric) cryptosystem . . . 52
2.2 The At-Bash Cipher . . . 54
2.3 The AzBy, Caesar and ROT13 Ciphers . . . 55
2.4 The Masonic Cipher . . . 56
2.5 Letter and Bigram Frequencies in English . . . 59
2.6 The Ciphertext-Only (CTO) attack model . . . 65
2.7 The Known-Plaintext Attack (KPA) model . . . 66
2.8 The Chosen-Plaintext Attack (CPA) model . . . 67
2.9 The Chosen-Ciphertext Attack (CCA) model . . . 69
2.10 The One Time Pad (OTP) cipher . . . 76
2.11 PRG-based Stream Cipher . . . 79
2.12 The Turing Indistinguishability Test . . . 80
2.13 Intuition for the PRG Indistinguishability Test . . . 81
2.14 Feedback Shift Register . . . 86
2.15 Bit-wise encryption using a random function . . . 91
2.16 Block (n-bits) encryption using a Random Function f (·) . . . 93
2.17 Using PRF for secure encryption . . . 94
2.18 The PRF Indistinguishability Test . . . 95
2.19 Standard block ciphers (AES and DES) . . . 103
2.20 The PRP Indistinguishability Test . . . 105
2.21 Three ‘rounds’ of the Feistel Cipher . . . 110
2.22 The IND-CPA test for shared-key encryption . . . 114
2.23 The IND-CPA test for symmetric encryption (E, D) . . . 115
2.24 IND-CPA-PK, indistinguishability test for public-key encryption . . . 117
2.25 Electronic Code Book (ECB) mode encryption . . . 122
2.26 Electronic Code Book (ECB) mode decryption . . . 123
2.27 Visual demonstration of the weakness of the ECB mode . . . 123
2.28 Per-Block Random (PBR) mode encryption . . . 124
2.29 Counter (CTR) mode encryption . . . 127
2.30 Output Feedback (OFB) mode encryption . . . 128
2.31 Output Feedback (OFB) mode decryption. Adapted from [218]. . . . 128
2.32 Cipher Feedback (CFB) mode encryption . . . 132
2.33 Cipher Feedback (CFB) mode decryption . . . 132
2.34 Cipher Block Chaining (CBC) mode encryption . . . 133
2.35 Cipher Block Chaining (CBC) mode decryption . . . 134
2.36 The Padding Oracle Attack model . . . 137
2.37 A single round of the ANSI X9.31 stateful PRG . . . 151
3.1 Keyless and Keyed Hash Functions . . . 160
3.2 Load-balancing with (keyless) hash function h(·) . . . 162
3.3 Algorithmic Complexity Denial-of-Service Attack . . . 163
3.4 Load balancing with a collision-resistant hash function . . . 164
3.5 Keyless collision resistant hash function (CRHF) . . . 169
3.6 Keyed collision resistance hash function (CRHF) . . . 172
3.7 Target collision resistant (TCR) hash function . . . 173
3.8 Use of hash function h to validate integrity of file . . . 177
3.9 Second-preimage resistance (SPR) . . . 181
3.10 One-Way Function (aka Preimage-Resistance) . . . 186
3.11 A one-time signature scheme, limited to a single bit . . . 189
3.12 A one-time signature scheme, for l-bit string (denoted d) . . . 190
3.13 A one-time signature scheme using ‘Hash-then-Sign’ . . . 190
3.14 Bitwise-Randomness Extractor (BRE) Hash Function . . . 193
3.15 The Merkle-Tree construction using hash function h . . . 200
3.16 Illustrating the Merkle Tree’s Proof-of-Inclusion (PoI) . . . 203
3.17 Constructing a dynamic accumulator αD from a static accumulator α . . . 209
3.18 The digest function of the Merkle-Damgård accumulator . . . 210
3.19 Compression function h . . . 214
3.20 The simplified Merkle-Damgård hash hMDwo (x) . . . 215
3.21 The Merkle-Damgård hash hMD (x) . . . 216
3.22 The Bitcoin Blockchain . . . 221
4.1 Using a MAC scheme to authenticate messages . . . . . . . . . . . 234
4.2 The CBC-MAC construction . . . . . . . . . . . . . . . . . . . . . 243
4.3 Combining Encryption, Authentication, Reliability and Compression 259
5.1 Interactions for the record/session protocols . . . 269
5.2 Interactions for entity authentication protocols . . . 275
5.3 The (vulnerable) SNA mutual authentication protocol . . . 277
5.4 Attack on SNA Mutual Entity Authentication protocol . . . 279
5.5 The 2PP Mutual Entity Authentication protocol . . . 280
5.6 Interactions for Authenticated Request-Response Protocols . . . 282
5.7 The 2PP-RR Authenticated Request-Response Protocol . . . 285
5.8 The 2RT-2PP Authenticated Request-Response protocol . . . 286
5.9 The Counter-based RR Authenticated Request-Response protocol . . . 287
5.10 Time-based Authenticated Request-Response protocol . . . 288
5.11 The 2PP Key Exchange protocol . . . 291
5.12 The Kerberos Key Distribution Center Protocol . . . 294
5.13 The GSM Key Exchange Protocol . . . 297
5.14 The VN-impersonation attack on GSM . . . 299
5.15 The GSM Key Exchange Protocol with cipher suite negotiation . . . 303
5.16 Simplified downgrade attack on GSM Key Exchange . . . 306
5.17 A ‘real’ downgrade attack on GSM Key Exchange . . . 308
5.18 The Forward-Secrecy 2PP Key Exchange protocol . . . 311
5.19 Result of running the Forward-Secrecy 2PP Key Exchange . . . 311
5.20 Running the recover-security Key Exchange protocol for five periods . . . 312
5.21 Relations between notions of resiliency to key exposures . . . 315
5.22 Simplified SSL . . . 317
6.1 The discovery of Public-Key Cryptography . . . 324
6.2 Operation of two-flows key exchange protocols . . . 327
6.3 Hybrid encryption . . . 334
6.4 Physical Key Exchange Protocol . . . 340
6.5 The (insecure) XOR Key Exchange Protocol . . . 342
6.6 The (insecure) Exponentiation Key Exchange Protocol . . . 342
6.7 The Modular-Exponentiation Key Exchange Protocol . . . 344
6.8 The Diffie-Hellman Key Exchange Protocol . . . 346
6.9 MitM attack on the DH key-exchange protocol . . . 347
6.10 The Generalized Diffie-Hellman Key Exchange Protocol . . . 349
6.11 The Auth-h-DH protocol . . . 351
6.12 The DH-Ratchet protocol . . . 354
6.13 The DH public key cryptosystem . . . 356
6.14 The Hashed DH public key cryptosystem . . . 358
6.15 The El-Gamal Public-Key Cryptosystem using DDH group . . . 360
6.16 Privacy-preserving voting using homomorphic encryption . . . 363
6.17 RSA encryption: textbook RSA (vulnerable) vs. padded RSA . . . 367
6.18 Simplified-OAEP-padded RSA Encryption . . . 376
6.19 OAEP padding . . . 377
6.20 The chosen-ciphertext side-channel attack (CCSCA) model . . . 379
6.21 Bleichenbacher’s attack on RSA . . . 380
6.22 Public key certificate issuing and usage processes . . . 384
6.23 How not to ensure resilient key exchange (for Ex. 6.20) . . . 392
6.24 Insecure ‘robust-combiner’ authenticated DH protocol, studied in Exercise 6.21 . . . 392
6.25 Insecure variant of the DH-Ratchet Protocol, for Ex. 6.26 . . . 394
7.1 A simplified overview of the operation of TLS . . . 404
7.2 Placement of TLS in the TCP/IP protocol stack . . . 409
7.3 Phases of TLS connection . . . 409
7.4 The Authenticate-then-Encrypt (AtE) record protocol of SSL and TLS . . . 411
7.5 Padding in the record protocol of SSL and TLS . . . 413
7.6 The CPA-Oracle Attack model . . . 415
7.7 The AEAD Record Protocol (TLS 1.3) . . . 426
7.8 ‘Basic’ SSLv2 handshake . . . 430
7.9 SSLv2 handshake, with ID-based session resumption . . . 432
7.10 The ‘basic’ RSA-based handshake, from SSLv3 till TLS 1.2 . . . 434
7.11 SSLv3 to TLSv1.2: static DH handshake . . . 437
7.12 SSLv3 to TLSv1.2 . . . 438
7.13 SSLv3 to TLS1.2 handshake, with ID-based session resumption . . . 440
7.14 Ticket-based session resumption . . . 442
7.15 Client authentication in SSLv3 to TLS1.2 . . . 444
7.16 SSLv2 handshake, with details of cipher suite negotiation . . . 445
7.17 Example of SSLv2 cipher suite negotiation . . . 446
7.18 Cipher suite downgrade attack on SSLv2 . . . 446
7.19 The Logjam cipher suite downgrade attack . . . 450
7.20 The SSL-Stripping Attack . . . 455
7.21 TLS 1.3 1-RTT full Diffie-Hellman handshake . . . 464
7.22 TLS 1.3 Full (1-RTT) Pre-Shared Key (PSK) handshake . . . 466
7.23 TLS 1.3 Zero-RTT Pre-Shared Key (PSK) Handshake . . . 469
8.1 PKI Entities and typical application for Web PKI . . . 484
8.2 Example of the X.500 (and X.509) Distinguished Name (DN) Hierarchy . . . 490
8.3 The Identifiers Trilemma . . . 493
8.4 X.509 version 1 certificate . . . 494
8.5 X.509 version 3 certificate . . . 496
8.6 A single-hop (length one) certificate-path . . . 509
8.7 A three-hop certificate-path . . . 510
8.8 Example of the use of Name Constraint . . . 511
8.9 Example of the use of name constraint with dNSName . . . 512
8.10 X.509 Certificate Revocation List (CRL) . . . 516
8.11 The Online Certificate Status Protocol (OCSP) . . . 519
8.12 OCSP used by relying party . . . 520
8.13 The MitM soft-fail Attack on a TLS connection using OCSP . . . 524
8.14 OCSP stapling in the TLS protocol, using the CSR TLS extension . . . 528
8.15 MitM soft-fail attack on OCSP-stapling TLS client (browser) . . . 530
8.16 Use of the TLS-feature X.509 extension indicating Must-Staple . . . 531
8.17 Certificates-Merkle-tree variant of OCSP . . . 533
8.18 Signed revocations-status Merkle-tree . . . 535
8.19 Optimizing OCSP response, using tree-of-revocations and Proof-of-Non-Inclusion . . . 536
8.20 The issue process for both HL-CT and AnG-CT . . . 549
8.21 Monitoring in HL-CT . . . 550
8.22 Omitted-certificate attack by a rogue logger and rogue CA . . . 551
8.23 Split-world attack on AnG-CT . . . 555
8.24 Zombie-certificate attack . . . 558
8.25 NS-CT issue process, with a (possibly corrupt) logger . . . 560
8.26 NS-CT detection by monitor of a misleading certificate . . . 561
8.27 NTTP-Security Certificate Transparency (NS-CT) . . . 562
8.28 NS-CT defending against the omitted-certificate attack, by providing a Proof-of-Misbehavior of a rogue logger . . . 563
8.29 Split-world attack by a rogue logger against NS-CT (incorrectly) deployed without gossip . . . 564
8.30 Inter-monitor gossip . . . 565
9.1 Exposing user’s password via DNS poisoning for login page sent over http . . . 584
9.2 Typical phishing attack luring a user to an impersonating website . . . 585
9.3 Using AI (ChatGPT) to generate a personalized password dictionary . . . 589
9.4 Password validation using cryptographic co-processor . . . 599
9.5 The Naïve-PAKE Protocol . . . 607
9.6 The Naïve-DH-PAKE Protocol . . . 608
9.7 A MitM offline dictionary attack against the Naïve-DH-PAKE Protocol . . . 609
9.8 The EKE2 PAKE Protocol . . . 610
List of Tables
1.1 Notations used in this manuscript . . . 23
2.1 Cryptanalysis Attack Models . . . 65
2.2 Table for Exercise 2.10 . . . 90
2.3 Do-it-yourself table for selecting random permutations ρ1 , ρ2 over domain D = {0, 1}2 . . . 104
2.4 Comparison between random function, random permutation, PRG, PRF, PRP and block-cipher . . . 108
2.5 Encryption Modes of Operation using n-bit block cipher . . . 121
2.6 Ciphertexts for Exercise 2.25 . . . 145
3.1 Goals and Requirements for keyless cryptographic hash functions . . . 165
3.2 Comparison: PRF, KDF and Extractor hash functions . . . 195
4.1 Authentication schemes: MAC, Authenticated Encryption (AE, AEAD) and Signatures . . . 252
5.1 Authenticated Request-Response (RR) protocols . . . 284
5.2 Comparing the impact of transmission rates to the impact of the number of round-trips required by a protocol, assuming a typical RTT (round-trip time) of 100 msec, and assuming overall transmission of 50 KByte . . . 284
5.3 Notions of resiliency to key exposures of key-setup Key Exchange protocols . . . 314
6.1 Key length and computing time for asymmetric and symmetric cryptography . . . 332
6.2 Resiliency to key exposures of Key Exchange protocols . . . 353
7.1 Derivation of connection keys and IVs, in SSLv3 to TLS1.2 . . . 436
7.2 Important TLS/SSL attacks due to specification vulnerabilities . . . 472
7.3 Table for Exercise 7.15 . . . 478
8.1 Standard keywords/attributes in X.500 Distinguished Names . . . 490
8.2 Standard certificate policies . . . 504
8.3 Comparison of revocation-checking mechanisms . . . 514
8.4 Revocation-related parameters . . . 514
8.5 Notable Web PKI Certificate Authority Failures . . . 539
9.1 Operations required for offline dictionary attacks against a hashed password file . . . 595
A.1 Euler’s function ϕ(n) . . . 641
List of Labs
Lab 1 (Using cryptography to validate downloads) . . . . . . . . . . 46
Lab 2 (Ransomware and Encryption) . . . . . . . . . . . . . . . . . . 144
Lab 3 (Checksum and CRC Collisions) . . . . . . . . . . . . . . . . . 223
Lab 4 (Breaking textbook and weakly-padded RSA) . . . . . . . . . 388
Lab 5 (Password Cracking and Crypto Hash Functions) . . . . . . . 617
List of Principles
Principle 1 (Attack model principle: assume capabilities, not strategy) 10
Principle 2 (Kerckhoffs’ principle) . . . . . . . . . . . . . . . . . . . . 16
Principle 3 (Conservative design and usage) . . . . . . . . . . . . . . 33
Principle 4 (Limit usage of each key) . . . . . . . . . . . . . . . . . 60
Principle 5 (Sufficient effective key length) . . . . . . . . . . . . . . 74
Principle 6 (Random function design method) . . . . . . . . . . . . 94
Principle 7 (Key separation) . . . . . . . . . . . . . . . . . . . . . . 102
Principle 8 (Cryptographic Building Blocks) . . . . . . . . . . . . . 118
Principle 9 (Minimize plaintext redundancy) . . . . . . . . . . . . . 131
Principle 10 (Key Separation) . . . . . . . . . . . . . . . . . . . . . . 239
Principle 11 (Crypto-agility) . . . . . . . . . . . . . . . . . . . . . . . 301
Principle 12 (Minimize use of public-key cryptography)
. . . . . . . 333
Principle 13 (Secure extensibility by design) . . . . . . . . . . . . . . 457
Principle 14 (The KISS Principle) . . . . . . . . . . . . . . . . . . . 457
Principle 15 (Minimize the attack surface) . . . . . . . . . . . . . . . 458
Principle 16 (The UX>Security Precedence Rule) . . . . . . . . . . . 525
Principle 17 (Soft-fail security is insecure) . . . . . . . . . . . . . . . 529
Principle 18 (Bellovin’s principle: secure is as securely used) . . . . 574
Principle 19 (Fail-safe defaults, or: Security Should be Default, and
Defaults Should be Secure.) . . . . . . . . . . . . . . . . . . . . 576
Principle 20 (Defend, don’t ask and don’t warn.) . . . . . . . . . . . 577
Principle 21 (Use click-whirr responses to improve security) . . . . . 577
Principle 22 (Alerts Should Wake Up) . . . . . . . . . . . . . . . . . 577
Preface
This textbook introduces cybersecurity, with focus on the key element of
applied cryptography. Our goal is to provide sufficient depth and precision for
understanding of this important and fascinating area, but without requiring
extensive prior background in mathematics or in the theory of computer science.
The textbook presents design principles, discusses practical systems, attacks
and vulnerabilities, covers basic cryptographic mechanisms, constructions and
definitions, and includes many examples, exercises and several programming
labs. The goal is that readers will be able to use the book for self-study, and
lecturers will be able to use it as a textbook for a course, or use parts of it for
two courses.
We use the term cybersecurity to refer to the protection of systems involving
communication and computation mechanisms, from attacks by adversaries. This
is a broad area, since communication and computation have become
so diverse and important, and since there are many threats and many types
of adversaries. Ensuring security is challenging: attacks often exploit subtle
vulnerabilities and use unexpected strategies, which intuition may fail to consider;
this is why careful, adversarial thinking is crucial.
Cybersecurity is a very applied area; nevertheless, precise
analysis, definitions and proofs are critical. This stands in contrast to many
other areas of engineering, where designs are often evaluated under typical,
expected scenarios and failures. This approach is insufficient for cybersecurity,
since security should be ensured against arbitrary attacker strategies, rather
than against expected, familiar attacks. Defenses should be designed assuming limitations on the capabilities of the adversary, but without making any
assumptions on the adversary’s strategy.
Cybersecurity is a broad area, including many aspects, technical as well
as non-technical (legal, economic, social and much more). The technical aspects
include cryptography, network security, software security, system security, privacy, secure human-computer interaction and more. This textbook focuses on
applied cryptography, and also introduces some aspects of network security and
of secure human-computer interaction. We believe that this is a good choice for
a first course in cybersecurity, for three reasons:
• Applied cryptography is essential to many areas of cybersecurity. Hence,
this textbook may provide a common basis for students interested in
these different areas. Some students may continue by focusing mostly on
cryptography, maybe even on the theory of cryptography; for these, we
hope to provide a good basis in the applied aspects of cryptography. Others
may continue to areas of cybersecurity which ‘just’ use cryptography, such
as network-security, secure systems, privacy and human-centered security;
for those, we hope to provide the necessary background in cryptography.
• The study of cryptography develops adversarial thinking, which is critical
for every cybersecurity expert. Modern cryptography is based on precise
definitions of goals and assumptions, with analysis and proofs of security
for given adversary capabilities, not assuming a specific adversary strategy
or attack method. Other areas of cybersecurity often use intuitive goals,
design and analysis, and may even focus on specific adversary strategies.
Such approaches may be unavoidable, since precise definitions and proofs
are often infeasible; however, these approaches appear less helpful in
developing adversarial thinking.
• Finally, there is the pragmatic consideration of scheduling and prerequisites.
Modern cryptography is based on both mathematics and the theory of
computing; however, as we believe you will find, it is possible to
use only a limited amount of math and theory, and we introduce this limited
amount within this book. As a result, the text can be used by readers who
have not taken math or computer-science courses beyond what
is covered in many high-school programs. This is in contrast to other
areas of cybersecurity, which require considerable background (e.g., in
networking, operating systems, or programming).
The need for security against attackers with arbitrary strategies, restricted
only by their capabilities, motivates the use of provable security, as well as
the use of standard, well-studied designs, whose security was confirmed by
experts. Indeed, some of the worst failures of security systems, and especially of
cryptographic systems, are due to attempts by non-experts to design schemes
and protocols.
This textbook tries to combine practice with theory, applicability with
precision, breadth with depth. These are ambitious goals, and we hope we are
not completely off the mark. Feedback and suggestions for improvement are
highly appreciated.
Organization and usage of this textbook
This textbook is designed for an introductory course in cybersecurity, focusing
on applied cryptography. It may be used for self-study, or for one (large) or
two (smaller) courses.
Chapter 1 provides an essential introduction to cryptography, with an intuitive
discussion of the main mechanisms we cover, and introduces the challenge of
defining security. It also introduces notations used throughout the book, and
provides critical background information; additional background is provided
in Appendix A. The rest of the book was designed so that the necessary
background in each topic is very limited; we believe it is not necessary to require
learning these topics in a prerequisite course. The introduction also provides a
brief historical perspective, which we hope some readers may find of interest.
In Chapters 2 to 4, we begin introducing applied cryptography. These chapters
focus on the efficient and conceptually-simple shared-key cryptographic functions:
encryption (Chapter 2), hashing (Chapter 3) and authentication (Chapter 4).
In Chapter 3 we also begin introducing more elaborate applied cryptographic
schemes, focusing on hashing-based schemes such as the Merkle digest schemes
and blockchains.
Chapter 5 introduces applied shared-key cryptographic protocols. By focusing on shared-key protocols, we provide a gentle introduction to the important
subject of resiliency to key-exposure. This also provides good motivation for
public-key cryptography, the topic of the next chapter (Chapter 6). Indeed,
Chapter 6 deals extensively with different public-key protocols and applications, and shows the powerful ability to use public-key cryptography to ensure
resiliency to, and recovery from, key exposure.
The next three chapters cover areas of cybersecurity which are closely related
to cryptography, focusing mostly on the cryptographic aspects. Two chapters
introduce network security: the important TLS protocol in Chapter 7, and
Public Key Infrastructure (PKI) in Chapter 8. Then, Chapter 9 covers the
critical topic of human-centered cryptography; too often this aspect is not
sufficiently taken into account, and cryptography is circumvented by exploiting
human behavior and psychology.
We conclude the book in Chapter 10, briefly discussing some advanced topics
such as secret sharing, privacy and anonymity, elliptic curves cryptography and
quantum (and post-quantum) cryptography, as well as some of the important
aspects of cybersecurity which are beyond this text.
The use of background such as math and theory in this textbook is limited to
what appears to be essential or helpful for understanding, at least for a significant
fraction of readers. Some of this background is covered in Appendix A.
Study of this textbook may be followed by study of other areas in cybersecurity, or by in-depth study of cryptography. There are multiple excellent
in-depth textbooks on cryptography or specific cryptographic topics; some of
my favorites are [16, 165, 166, 205, 309, 370].
Acknowledgments
I received a lot of help in developing this textbook from friends, colleagues and
students; I am very grateful. Specific thanks to:
• The students and my peers in the University of Connecticut, who have
been very understanding and supportive.
• Sara Wrotniak, my PhD student, who gave me incredible feedback when
she studied using this textbook as an undergrad, and later when I asked
her to review the text further.
• Professors and researchers who provided valuable feedback, with or without teaching a course using the textbook, including: Ghada Almashaqbeh,
Nimrod Aviram, Ahmed El-Yahyaoui, Peter Gutmann, John Heslen, Walter Krawec, Laurent Michel, David Pointcheval, Ivan Pryvalov, Zhiyun
Qian, Amir and Luba Sapir, Jerry Shi, Haya Shulman, Ewa Syta, and Ari
Trachtenberg. Walter has also kindly contributed the section on quantum
cryptography (Section 10.4).
• Instructors and teaching-assistants who used the text and provided important feedback: Justin Furuness, Hemi Liebowiz, Sam Markelon, and
Anna Mendonca. I’m especially indebted to Anna for collecting excellent
feedback from her students - and for introducing me to Sara.
• Many other readers who helped with feedback, corrections and suggestions,
including Pierre Abbat, Yanjing Xu, Bar Meyuchas, and Yike (Nicole)
Zhang and many others.
• The Latex and Tikz communities, who provide amazing resources and
support. Special thanks to Nils Fleischhacker for the cool ‘tikz people’
package, and to Shaanan Cohney who kindly sent me the LaTeX source
code from [100], which I turned into Figure 2.37.
I probably forgot to mention some important contributors; please accept
my apologies and let me know. Indeed, in general, please let me know of any
omission or mistake in the text, and accept my thanks in advance. Many thanks
to all of you; feedback is greatly appreciated, definitely including harsh criticism.
Special thanks to my friend, PhD adviser and mentor, Prof. Oded Goldreich,
who endured the challenge of trying to teach me both cryptography and proper
writing. I surely cannot meet Oded’s high standards, but I believe and hope
that this textbook does help to provide the necessary foundations of applied
cryptography to practitioners of cybersecurity. Laying foundations is definitely
a goal I share with Oded. Let me quote Oded’s father:
It is possible to build a cabin with no foundations, but not a lasting
building
Eng. Isidor Goldreich
Last but not least, I wish to thank my family for their incredible support and understanding throughout the years, especially my amazing parents, Ada and
Michael, my lovely and supportive wife Tatiana, and my wonderful and talented
children, Raaz, Tamir, Omri and, last but not least, Karina.
Chapter 1
Introducing Cybersecurity and Cryptography
Cybersecurity and cryptography are exciting topics, with rich history and with
an extensive impact on the practical world, as well as fascinating challenges of
engineering, theory and mathematics. In this textbook, our goal is to introduce
cybersecurity, focusing on the area of applied cryptography, which is essential
to most or all areas of cybersecurity. We believe that this basic knowledge
in applied cryptography is desirable for any student of computer science, but especially for students who plan to focus on any area of cybersecurity or, of course, students planning to focus on cryptography.
This chapter provides several sections with important foundations for the rest of this book. The first section (Section 1.1) introduces the areas of cybersecurity and cryptography, their goals and approaches, and a few basic principles. Section 1.2 provides an informal introduction to two basic cryptographic mechanisms, encryption and signatures. Section 1.4 provides essential background and notation, used throughout the text; additional background is provided in Appendix A. In Section 1.5 we discuss the challenge of defining security of cryptographic schemes, focusing on the important definition of security for signature schemes. Finally, Section 1.6 provides a few gems from the fascinating history of cryptography and cybersecurity, which, we believe, may provide an interesting perspective.
1.1 Cybersecurity and Cryptography: the basics
The high-level goal of cybersecurity is to protect 'good' parties, who use computing and communication systems legitimately, from damage due to misbehaving entities, usually referred to as attackers or adversaries, who abuse the systems to harm the users and/or to obtain an unfair advantage or benefit.
Attackers could be rogue authorized users, often referred to as insiders, or outsiders, who are not authorized users and only have some access to the systems or the communication. Attackers may also control one or multiple devices. Such devices may be legitimately owned by the attacker, or be corrupted, i.e., controlled by the attacker in spite of not being legitimately owned by the attacker; corrupted devices are often referred to as being pwned¹ by the attacker.
Unfortunately, it is not so easy to turn this high-level goal into precise definitions of security, allowing evaluation of the security of different systems and defenses. It is also challenging to design schemes that satisfy these definitions, and to prove their security. In this section, we discuss the main approaches to ensuring and defining security. But first, in subsection 1.1.1, we introduce the three cybersecurity functions: prevention, detection and deterrence.
1.1.1 Three Cybersecurity Functions: Prevention, Detection and Deterrence
Figure 1.1 illustrates the three cybersecurity functions, i.e., basic ways of protecting against attackers: prevention, detection and deterrence.

Figure 1.1: Cybersecurity defense approaches: Prevention, Deterrence and Detection.
Prevention: mechanisms that prevent an attacker from causing damage,
or that reduce the amount of possible damage. Encryption is an example of a cryptographic prevention mechanism; it is usually used to prevent an attacker from learning sensitive information. When possible, prevention is
obviously the best defense - ‘an ounce of prevention is worth a pound of cure’.
However, sometimes it is impossible to completely prevent an attack. For
example, the TLS protocol (Chapter 7) relies on public key certificates issued
and signed by a trusted party, called a Certificate Authority (CA); attacks due
¹ The verb 'pwn' means to control or 'own' a computing system illegitimately, by exploiting a vulnerability. It is taken from gamers' slang, where it was originally used for a player who is completely dominated by an opponent; its origin may come from chess, combining the chess 'pawn' with the verb 'own'. See [310].
to a misbehaving CA cannot be completely prevented. Instead, such attacks are mitigated using detection mechanisms - and deterrence, with significant penalties for misbehaving CAs, as we discuss in Chapter 8. Even when a cryptographic mechanism seems to completely prevent an attack, e.g., when using encryption, it is still prudent to also deploy detection mechanisms, since hidden, subtle vulnerabilities may exist even in thoroughly reviewed designs and systems.
Deterrence: mechanisms designed to discourage (deter) attackers from attacking. Deterrence is achieved either by the (visible) use of strong defenses, making it futile to attack, or by penalizing misbehaving entities. Penalizing requires attribution of the attack or misbehavior to the correct entity. Digital signature schemes are an important cryptographic deterrence mechanism, which we discuss later in this chapter (subsection 1.2.3) and, in more detail, in chapters 4 and 6. A signature verified using the misbehaving party's well-known public key, over a given message, provides evidence that this party signed that message; we refer to such evidence of abuse as a Proof of Misbehavior (PoM). A PoM can be used to punish or penalize the misbehaving party in different ways - an important deterrent. Note, however, that deterrence can only be effective against a rational adversary; a penalty may fail to deter an irrational adversary, e.g., a terrorist.
Deterrence is effective if the adversary is rational, and would refrain from attacking if her expected profit (from the attack) would be less than the expected penalty. In practice, most attackers are rational; hence, good deterrence is an effective defense. However, a challenge is that attackers do their best to avoid being detected and penalized; deterrence is only effective when combined with strong penalties - and effective detection.
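This rationality condition can be stated as a one-line cost-benefit check. The following sketch is our illustration (the numbers are hypothetical, not from any real case): a rational adversary is deterred when the expected penalty, i.e., the probability of detection and attribution times the penalty, is at least the expected profit.

```python
# Toy illustration (ours): a rational adversary is deterred exactly when
# the expected penalty is at least the expected profit from attacking.
def deterred(profit: float, penalty: float, p_detect: float) -> bool:
    """True if expected penalty (p_detect * penalty) >= expected profit."""
    return p_detect * penalty >= profit

# A $1M penalty deters a $10K-profit attack only if the probability of
# detection (and correct attribution) is at least 1%.
print(deterred(profit=10_000, penalty=1_000_000, p_detect=0.02))   # True
print(deterred(profit=10_000, penalty=1_000_000, p_detect=0.005))  # False
```

This also shows why attackers invest so much in avoiding detection: driving the detection probability down defeats deterrence regardless of how severe the penalty is.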
Detection: By detection we refer to security mechanisms that detect an attack,
usually while the attack is ongoing. Detection allows deployment of additional
defenses, which are not deployed otherwise (e.g., due to costs).
Detection is often a key element in ensuring security. First, detection is
required to penalize attackers and hence effective detection is key to effective
deterrence. Second, detection allows reaction, such as using additional defense
mechanisms and performing operations to recover security and minimize damages. Therefore, security systems often invest a lot in detection, while attackers
usually do their best to avoid detection, including refraining from attacks and
actions that may result in detection.
Detection is a major component of important, widely deployed network-security and host-security defenses such as Intrusion Detection Systems (IDS) and honeypots. Detection is not as widely applied in cryptography; however, we present some examples of the use of detection related to cryptographic security. One example is the PKI detection mechanisms which we discuss in Chapter 8. Another example is the combination of an Error Detection Code and a Message Authentication Code, used to detect attempts to attack authenticated
communication, as illustrated in Figure 4.3. Finally, we discuss the detection of
password exposure in subsection 9.4.7.
1.1.2 Generic Security Goals
Different systems and schemes often have different security goals; however, some goals are generic, and apply, possibly with some variations in details, to many systems and schemes. The generic goals, illustrated in Figure 1.2, include confidentiality, integrity, authenticity, availability and non-repudiation. Three of them (confidentiality, integrity, and either authenticity or availability) are often referred to as the CIA triad, an easily-memorable acronym.
[Figure 1.2: a diagram mapping each generic goal to cryptographic mechanisms. Confidentiality: encryption, symmetric (Chapter 2) and asymmetric or public-key (Chapter 6). Integrity: hash functions, accumulators (e.g., Merkle-tree) and blockchain schemes (Chapter 3). Authenticity: Message Authentication Code (MAC). Availability: Proof-of-Work (PoW, Section 3.10.2), Public-Key Infrastructure (PKI, Chapter 8). Non-repudiation (and authentication): digital signatures.]

Figure 1.2: The Generic Security Goals: Confidentiality, Integrity, Authentication, Availability and Non-repudiation (which extends authentication). The first three are sometimes referred to as the CIA triad. For each goal, we list some of the corresponding cryptographic mechanisms covered in this textbook.
Confidentiality. A system satisfies the confidentiality goal if it prevents an attacker from learning information defined as confidential. The 'classical' confidentiality mechanism is encryption, illustrated in Figure 1.4, with two main variants: shared key cryptosystems, also referred to as symmetric encryption, and public key cryptosystems, also referred to as asymmetric encryption.
Integrity. Ensuring integrity means prevention or detection of unauthorized operations, such as the modification of data. Integrity applies to a wide range of situations, including the integrity of a computer system, of information stored or transmitted, and more. We cover several cryptographic integrity mechanisms in Chapter 3, including cryptographic hash functions, Merkle digest schemes and blockchains.
Authenticity and Non-repudiation. Authentication mechanisms validate that particular information originated from a specific entity, or that a particular interaction involved a specific entity. A unique variant of authentication is non-repudiation, which provides evidence of the origin of information that can be presented, later, to a third party, to 'prove' the identity of the origin. We cover Message Authentication Code (MAC) schemes, which provide authentication, and signature schemes, which provide non-repudiation (and authentication). Note: we introduce signature schemes already in this chapter (subsection 1.2.3), and discuss the important application of cryptographic hash functions and signature schemes to ensure software integrity and authenticity, as a main defense against malware, in Lab 1.
Availability. Availability mechanisms ensure that services can be provided efficiently, even if an attacker tries to disrupt services, in what are called Denial-of-Service (DoS) attacks. DoS attacks have become a major concern for network and service providers, and defenses against them are a major challenge of network security; however, there is only limited use of cryptographic schemes in defenses against DoS. In fact, some cryptographic protocols are vulnerable to DoS, i.e., a DoS attack may disrupt the security service, potentially resulting in a vulnerability. We discuss this concern for Public Key Infrastructure (PKI) in Chapter 8. We explain how some PKI designs, such as CRLs and CRVs, are resistant to such DoS attacks, while others, e.g., OCSP, may be vulnerable or even abused to create a DoS attack.
Bespoke security goals. The generic goals apply to many cybersecurity systems. The security goals of specific systems and schemes usually include these generic goals, possibly with some adaptation, as well as additional, system-specific goals.
1.1.3 Attack Model
Security goals should be defined and evaluated with respect to the capabilities of the attackers, rather than by assuming a specific attacker strategy. This is an important principle of modern cryptology, which already appears in early publications such as the seminal paper of Diffie and Hellman [123]: understand and define a clear model of the attacker capabilities (attack model), and then define goals/requirements for the scheme/system - against attackers with the specified capabilities, regardless of the attacker's strategy (choices). This principle applies not only in cryptology, but in security in general. Precise articulation of the attack model and of the security requirements is fundamental to the design and analysis of security. Let us re-state this important principle.
Principle 1 (Attack model principle: assume capabilities, not strategy). Security requirements should be defined and analyzed with respect to a well-defined attack model, which specifies any restrictions on the capabilities of the attacker. The attack model should not restrict the attacker's strategy.
We discuss several attack models in this textbook; three of them, which apply to digital signatures, appear already in this chapter, in subsection 1.5.2. The weakest of them is the Key-Only Attack model, where the attacker is only given the public key, and the strongest is the Chosen Message Attack model, where the attacker is even given the ability to ask for signatures on messages of its own choosing. As you can see, the models only refer to the capabilities of the attacker, and not to any specific attack strategy or even to the requirements; we discuss the security requirements of signature schemes separately, in subsection 1.5.8.
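The 'capabilities, not strategy' idea can be pictured in code. The sketch below is our illustration, not a scheme from this chapter, and it uses an HMAC tag merely as a stand-in for a signature; the point is the oracle interface granted by the Chosen Message Attack model, not the underlying mechanism.

```python
import hashlib
import hmac
import os

# Our sketch of the Chosen Message Attack capability; an HMAC tag stands in
# for a signature, since the point here is the oracle interface, not the scheme.
_signing_key = os.urandom(32)   # the signer's secret: never given to the attacker

def sign_oracle(msg: bytes) -> bytes:
    """The capability granted by the CMA model: a 'signature' (here, a tag)
    on any message of the attacker's choosing."""
    return hmac.new(_signing_key, msg, hashlib.sha256).digest()

# The attacker may query any messages it likes...
tag = sign_oracle(b"message chosen by the attacker")
# ...the model does not restrict *which* messages (the strategy); it only
# fixes the capability: oracle access, but no direct access to the key.
assert len(tag) == 32
```

In the weaker Key-Only Attack model, even this oracle is absent; the attacker receives the public key alone.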
The attack model principle applies also to areas of cybersecurity where it is hard to define a rigorous attack model. Even in such scenarios, it is important to limit our assumptions to the attacker capabilities, rather than assuming a specific attacker strategy. However, where possible, a well-defined attack model, allowing provable security, is better - if this is done 'correctly', as we discuss next.
1.1.4 Provable Security
In most areas of science and engineering, design and evaluation are based on
experimental analysis, measuring the expected outcomes of the system under
typical scenarios. In contrast, security should be ensured against an adversary
(attacker), who is not bound to behave in some typical way; new attacks may
be very different from past attacks. Observing the behavior of adversaries and
designing defenses based solely on these behaviors may leave subtle or even
gaping vulnerabilities that can be exploited by adversaries. That said, when a
history of previous attacks is available and shows a clear pattern, it does make
sense to evaluate defenses against the same type of attacks - in addition to
evaluation against arbitrary attacker strategy, as per Principle 1.
It is challenging to ensure security against an arbitrary attacker strategy,
only limiting the attacker’s capabilities. There are many subtle ways in which
our intuition and imprecise arguments may fail, resulting in vulnerabilities.
Cryptography and security are exceptions to the popular saying ‘in theory there
is no difference between theory and practice, while in practice there is’ [82]. In
fact, in security, and especially in cryptography, precise definitions and proofs
of security, or at least extensive, clear analysis, are necessary to ensure that a
system is secure against arbitrary attacker strategies. This approach is usually
referred to as provable security.
That said, one has to be very careful to understand the limitations and pitfalls of proofs, discussed in a series of papers by Koblitz and Menezes, summarized in [237]. Proofs of security often involve simplifications and assumptions, even simplifying 'assumptions' known to be incorrect. An important example of such simplification is the Random Oracle Model (ROM), discussed in Section 3.6.
Using the ROM, a protocol using a specific cryptographic hash function, e.g., SHA-1, is analyzed as if the function used was selected at random. Clearly, this is not a correct assumption; yet, a proof using such an assumption gives some indication of security, or, at least, limits the possible types of attacks.
Simplifications can result in a system that has a proof of security - leading people to trust its security - while, in reality, the system is vulnerable and exploited. One reason this happens is that proofs are based on models of the attacker capabilities and of the system, which often do not fully capture reality; attacks often cleverly abuse and exploit exactly those aspects which were glossed over and abstracted away. Furthermore, efforts to consider more realistic aspects tend to make analysis, proofs, and provably-secure designs harder and more complicated. As a result, even for an expert, proofs may be challenging and require extensive effort - to write and to validate. Complex proofs may have subtle errors, which may remain undiscovered for years. Proofs should be clean and enlightening, while reality is dirty and murky; and viruses, bugs, and vulnerabilities thrive in the dirt and the dark.
Another challenge for provable security is to correctly identify and rigorously define the security requirements (goals). Rigorous definitions of security requirements are often challenging - to define and to understand. However, clear definitions of security requirements are important, for several reasons:
1. To prove the security of the cryptographic scheme.
2. To prove the security of an application of the scheme, e.g., as part of another cryptographic scheme or of a cryptographic protocol.
3. To allow researchers to explore attacks against the system and demonstrate that requirements are not met.
4. To avoid vulnerabilities in a system using the scheme, due to incorrect usage of the scheme, namely, to avoid incorrect use of cryptographic mechanisms.
The last item is worth extra attention; incorrect usage is a common reason for cryptography-related vulnerabilities, and the exploits are often devastating. In this textbook, we will discuss several such weaknesses - in widely deployed standards and products. We believe that one of the main reasons for these failures is that the system designers were not sufficiently familiar with the security properties of the cryptographic schemes they used. Our hope is that readers of this textbook will learn enough to allow them, when using cryptographic schemes, to understand their properties and how to securely use them - and to know when they need to consult a more knowledgeable cryptographer. This will reduce the likelihood of usage errors.
The challenge of rigorously defining security requirements and proving security also implies that protocols and mechanisms are sometimes used without a rigorous definition and/or a proof. It is sometimes necessary, for functionality, business or performance considerations, to take the risk of a subtle vulnerability, which could exist due to informal requirements or analysis. However, as explained by Koblitz and Menezes, when feasible, there are important advantages to following the provable security approach, with definitions of requirements, models and assumptions, and proofs of security. The provable security approach helps to avoid, or at least reduce, vulnerabilities and attacks. In particular, a proof of security - often, even a flawed one - rules out many possible vulnerabilities. In fact, vulnerabilities are often discovered, and circumvented, when researchers work toward a proof of security.
There are also several countermeasures which may help to address the concerns about provable security:
1. The use of computer-aided cryptography [24], i.e., automated tools to aid in the generation of proofs and in the verification of the correctness of proofs.
2. The study of attacks, and in particular of cryptanalysis, i.e., attacks against cryptographic mechanisms. Researchers studying attacks on systems often 'think outside the box' and exploit subtle vulnerabilities, which may be hard to identify when we only consider the proofs of security. Attacks also provide a quantitative measure of insecurity, which may be matched against a proven quantitative measure of security, identifying possible gaps which may allow improved proofs - or improved attacks.
3. Improved understanding of the value, as well as the limitations, of provable security, and better intuition, that help practitioners make use of provable security - while avoiding subtle pitfalls. We hope that this textbook will aid in the development of such improved understanding and intuition, even for readers who will not proceed to learn the theory of cryptography.
Note that while our discussion focused on cryptographic definitions and proofs, these considerations apply to cybersecurity in general. However, rigorous definitions and proofs are much less common in other areas of cybersecurity, possibly since it is harder there to find useful simplifications. We consider this a good motivation for studying cryptography, as a way to develop the 'adversarial thinking' so essential in all areas of cybersecurity.
To sum up, a proof of security is a powerful tool, but, like other power
tools, should always be used carefully and correctly. Namely, we must fully
understand the definitions, assumptions and simplifications, and never assume
additional properties or applicability to scenarios where the assumptions do not
hold. Therefore, cybersecurity experts must master both theory and practice.
A fair amount of skepticism, paranoia and humility is also advisable.
1.1.5 Risk and cost-benefit analysis
The management of computation and communication systems involves multiple
challenges, including security risks. Security managers have to consider the
costs of deploying security mechanisms, against the probability of an attack
and the expected damages from possible attacks, if the mechanisms are not
deployed. Risk analysis is an attempt to estimate the probabilities of occurrence of different attacks, and of the attacks being successful; cost-benefit analysis uses these values, as well as the costs of deploying different security mechanisms, to decide which security mechanisms are worth deploying.
In this textbook, we do not further discuss risk and cost-benefit analysis. The reason is that we are not aware of a sufficiently reliable and generally applicable methodology for such analysis. It seems to the author that practitioners often use crude approximations based on their experience, common sense and industry-adopted estimates. Therefore, we focus on the design and analysis of secure systems, and on potential vulnerabilities and attacks; we mostly ignore the risks of different attacks and the costs of different defenses. There are a few exceptions; e.g., we focus on computationally efficient schemes and adversaries, where by 'efficient' we mean 'whose runtime is bounded by a polynomial in the size of their inputs'; see more in Section A.1.
1.2 The basic mechanisms: encryption, signatures and hashing
In this section, we provide a bird's-eye introduction to two basic cryptographic mechanisms: cryptosystems (symmetric and asymmetric) and signature schemes. We also introduce Kerckhoffs' principle, a fundamental design principle for cryptographic schemes, which is mostly applied also to other cybersecurity defenses.
1.2.1 Encryption: symmetric and asymmetric cryptosystems
Encryption schemes, also referred to as cryptosystems or ciphers, are the oldest
and most well known cryptographic mechanism - and the main mechanism for
ensuring confidentiality.
Encryption transforms sensitive information, referred to as plaintext, into a form called ciphertext, which allows the intended recipients to decrypt it back into the plaintext; the ciphertext should not expose any information to an attacker. The focus on encryption and confidentiality is evident in the term cryptography, i.e., 'secret writing', which is often used as a synonym for cryptology. In fact, in recent years, cryptography seems to be the more common term, so we also use it.
In all but some ancient (and completely insecure) cryptosystems, the encryption and decryption operations use a key. In symmetric cryptosystems,
encryption uses the same (secret) key as used for decryption, often denoted k.
In contrast, in asymmetric cryptosystems, only the decryption key, denoted d,
must be private, and the (related but different) encryption key, denoted e, can
be published, i.e., is public. Due to these properties, symmetric cryptosystems
are also called shared-key cryptosystems, and asymmetric cryptosystems are also
called public key cryptosystems (PKCs). We study shared-key cryptosystems in
Chapter 2, and public-key cryptosystems in Chapter 6.
Figure 1.3: Encryption: terms and typical use. Alice needs to send sensitive
information (plaintext) to Bob, so that the information will reach Bob - but
remain confidential from the attacker, Eavesdropping Eve. To do this, Alice
encrypts the plaintext, typically using a key; the encrypted form is called
ciphertext. A secure encryption would prevent Eve from learning, from the
ciphertext, anything about the plaintext (except how much was sent). However,
by decrypting the ciphertext, typically using a key, Bob would recover the sent
plaintext. In symmetric cryptosystems, encryption and decryption use the same
(shared, symmetric) key, while in asymmetric cryptosystems, encryption uses
a public encryption key and decryption uses a different, albeit related, private
decryption key.
Figure 1.4: Shared key (symmetric) cryptosystem.
Symmetric (shared-key) cryptosystems use the same key, e.g., k, for both encryption and decryption, as illustrated in Figure 1.4. The key k is chosen as a random bit-string. In the figure, the key length is variable, given as input (denoted l); in practice, many cryptosystems are designed for a specific key length. A symmetric cryptosystem must ensure correctness, i.e., m = D_k(E_k(m)), for every plaintext message m and every key k.
The basic security requirement from symmetric cryptosystems is to ensure confidentiality, i.e., an adversary should not be able to learn, from the ciphertext, any information about the plaintext. We present the definition only in subsection 2.7.2, since it involves some subtle aspects.
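To make this interface concrete, here is a minimal sketch (ours, not a construction from this chapter) using the one-time pad, where both encryption and decryption are the same XOR operation with the shared key; correctness holds because XORing twice with the same key cancels out.

```python
import os

# One-time pad: a minimal symmetric cryptosystem, shown only to illustrate
# the (E, D) interface and the correctness requirement m = D_k(E_k(m)).
def E(k: bytes, m: bytes) -> bytes:
    """Encrypt: XOR the plaintext with the key (requires len(k) == len(m))."""
    return bytes(mb ^ kb for mb, kb in zip(m, k))

def D(k: bytes, c: bytes) -> bytes:
    """Decrypt: XOR again with the same shared key k."""
    return bytes(cb ^ kb for cb, kb in zip(c, k))

m = b"attack at dawn"
k = os.urandom(len(m))   # the shared key: a random bit-string
c = E(k, m)
assert D(k, c) == m      # correctness: m = D_k(E_k(m))
```

Note that here the key must be as long as the message and must never be reused; practical symmetric ciphers, covered in Chapter 2, avoid these limitations.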
Asymmetric (public-key) cryptosystems are illustrated in Figure 1.5. As
shown, public key cryptosystems use a pair of two different, but related, keys:
Figure 1.5: Public key (asymmetric) cryptosystem.
a public encryption key e for encryption, and a private decryption key d for decryption. The keypair (e, d) is produced by a key generation algorithm, denoted KG, which is defined as part of the asymmetric cryptosystem. For public-key cryptosystems that support variable key length l, the length l is provided as input to KG (as illustrated). Public key cryptosystems should also satisfy correctness, namely: m = D_d(E_e(m)), for every keypair (e, d) generated by KG.
Note that a key generation algorithm is not required for shared-key cryptosystems, since the shared key k can simply be selected as a random string. However, clearly, for asymmetric cryptosystems, we cannot randomly select both the public key e and the (corresponding) private key d.
Formally, every symmetric cryptosystem can also be viewed as an asymmetric cryptosystem, simply by defining KG as the random selection of l bits, which are used for both d and e. However, such a construction will fail to provide the stronger security requirement from public-key cryptosystems, i.e., that an adversary should not be able to learn, from the ciphertext, any information about the plaintext, even when given the encryption key e.
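This trivial construction can be sketched in a few lines (our illustration): KG outputs the same random l-bit string as both keys, which satisfies the syntax of an asymmetric cryptosystem but obviously fails its security requirement, since publishing e hands the adversary d.

```python
import os

# The trivial construction from the text: view a shared-key cryptosystem as
# 'asymmetric' by letting KG output the same random l-bit string as both
# the encryption key e and the decryption key d.
def KG(l: int):
    k = os.urandom(l // 8)    # a random l-bit string (l assumed a multiple of 8)
    return k, k               # e = d = k

e, d = KG(128)
# Syntactically a keypair; but anyone given the 'public' key e also holds d,
# so confidentiality against an adversary that knows e is impossible.
assert e == d
```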
Indeed, it is not trivial to design asymmetric encryption schemes which ensure correctness - and which are not easy to 'break'. Even today, there is only a limited number of different designs for public key encryption schemes. In fact, while the concept of public-key cryptosystems was proposed in a seminal paper [123] by Whit Diffie and Martin Hellman, a design was first published only later, in [334], by Rivest, Shamir and Adleman. We present two of the most well-known public key cryptosystems, RSA and El-Gamal, in Chapter 6.
Readers are encouraged to try to come up with their own design for an asymmetric cryptosystem - not necessarily a really secure one, just something which will not be trivial to break. You may find it quite a challenge - and are welcome to peek into Chapter 6 to see how it can be done.
1.2.2 Kerckhoffs' principle
In both symmetric and asymmetric cryptosystems, the key used for decryption must be kept secret. However, what about the algorithms used for encryption and decryption? Knowledge of the algorithms may help an attacker, so, intuitively, it seems that the algorithms should be kept secret. Indeed, some ancient cryptosystems did not use a key at all, and relied entirely on the secrecy of the algorithms; we give a few examples in Chapter 2. Even for keyed cryptosystems, it is harder to attack without knowing the design; see Exercise 2.25. This is often reflected in the expression knowledge is power. Therefore, traditionally, cryptosystems were kept secret, and this secrecy was considered necessary for their security - an approach we refer to as security by obscurity.
However, in 1883, the Dutch cryptographer Auguste Kerckhoffs realized that 'security by obscurity' has serious disadvantages; in particular, once an attacker obtains a cryptosystem, the system may become completely insecure. This motivated Kerckhoffs to publish the following principle [232], which is now considered the 'basic' rule of applied cryptography. We extend Kerckhoffs' principle to refer to arbitrary security mechanisms, and not just to cryptosystems or even to cryptographic systems.
Principle 2 (Kerckhoffs’ principle). When designing or evaluating the security
of (cryptographic) systems, assume that the adversary knows the design, i.e.,
knows everything except the secret keys.
Kerckhoffs' principle has additional advantages. One advantage is that the resulting design is likely to be more secure, as it was designed against a more powerful attacker - an attacker that knows the details of the design. An even more important advantage is the ability to have the security of the design evaluated by experts who were not part of the design team, challenging them to find vulnerabilities; it is often easier for an expert to find a vulnerability in a system designed by somebody else.
Note that Kerckhoffs' principle does not require that the design be made public; it merely allows publication, since it requires that the goal of the designers should be for security to hold even against an attacker that knows the design. In principle, we may follow Kerckhoffs' principle, yet keep the design private, thereby making it 'even harder' to find a vulnerability. Since security does not assume secrecy of the design, we can continue to use the system even if the confidentiality of the design is breached.
However, there are further advantages in going further and relying on
the security of public, standard designs. Published, standard designs have
the obvious advantage of improving the efficiency of production and use, by
allowing interoperable implementations by arbitrary vendors. A more subtle
advantage of published, and especially standard, designs is that they facilitate
evaluation and cryptanalysis by many experts, and motivate experts to find
and publish vulnerabilities. As a result, users will be alerted to vulnerabilities
earlier, reducing the risks of using a system with a vulnerability known only to
the attacker. Thorough evaluation of security by multiple, motivated experts
is the best possible guarantee for security of cryptographic designs - except,
possibly, for provably-secure designs. In fact, as mentioned above (and in [237]),
even ‘provably-secure’ designs were found to have vulnerabilities during careful
review by experts, due to a mistake in the proof or due to some modeling or
other assumption. Therefore, Justice Brandeis’ saying that ‘sunlight is the best
disinfectant’ applies also to cryptographic and cybersecurity systems.

Applied Introduction to Cryptography and Cybersecurity
This is well demonstrated by two important cryptographic systems whose
designs were kept confidential, both designed in the 1990s: the GSM mobile
telephony network and the CSS encryption for DVDs. In both cases, it
did not take very long for the algorithms to be leaked, and quite soon afterwards,
they were found to be insecure and vulnerable to successful, practical attacks.
The GSM case is particularly interesting and important, with several glaring
vulnerabilities, some of which can be considered design errors. The GSM
designers did not even plan a proper ‘migration plan’ for changing from the
exposed ciphers and protocols to more secure alternatives, which resulted in
devices remaining vulnerable years after the vulnerabilities became known to
all experts. We discuss some of these vulnerabilities in Section 5.6.
Sometimes it is feasible to combine the security benefits of an open design and
of maintaining the secrecy of the design, by combining two candidate schemes.
For example, we may use a cryptosystem which is a combination of a published,
standard cryptosystem and of another cryptosystem, designed for security
(following Kerckhoffs’ principle) but kept confidential (to make attacks harder).
We use the term robust combiner for such a construction, which ensures security
as long as at least one of the two schemes is not broken; several such robust
combiners are known, for different cryptographic schemes; see [191] and subsection 2.5.9.
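To make the combiner idea concrete, here is a minimal sketch of a cascade-style combiner, using toy XOR ‘ciphers’ whose keystreams are derived from a hash; the cipher construction, labels and function names are illustrative assumptions only - these are not secure ciphers, and this is not one of the combiners of [191].

```python
import hashlib

def keystream(key: bytes, label: bytes, n: int) -> bytes:
    # Toy keystream: hash of key, label and a counter (illustration only, NOT secure).
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + label + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def enc(key: bytes, label: bytes, m: bytes) -> bytes:
    # XOR the message with the keystream; applying enc twice decrypts.
    return bytes(a ^ b for a, b in zip(m, keystream(key, label, len(m))))

def cascade_enc(k1: bytes, k2: bytes, m: bytes) -> bytes:
    # Cascade combiner: encrypt with 'cipher 1', then with 'cipher 2'.
    return enc(k2, b"cipher2", enc(k1, b"cipher1", m))

def cascade_dec(k1: bytes, k2: bytes, c: bytes) -> bytes:
    # Undo the layers in reverse order.
    return enc(k1, b"cipher1", enc(k2, b"cipher2", c))
```

The intuition: the plaintext stays hidden as long as at least one of the two keystreams remains unpredictable, which is exactly the ‘secure if one scheme is not broken’ guarantee of a robust combiner for confidentiality.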
1.2.3 Digital Signature schemes
The goal of cryptosystems (encryption schemes) is to ensure confidentiality, i.e.,
secrecy of information. Let us now focus on digital signature schemes, whose
goal is to ensure the authentication and non-repudiation goals (subsection 1.1.2).
Signature schemes play a critical role in applied cryptography; for example,
in Lab 1 we show their critical role in protecting against malware, and
in Chapter 8 and Chapter 7 we show their critical role in establishing secure
communication, in particular, in the TLS protocol (used, mainly, to protect
web communication).
Digital signature schemes, like asymmetric encryption, were proposed as a
concept in [123], but their first published design was in the RSA paper [334].
Later on, we discuss two constructions of signature schemes: the RSA design in
Chapter 6, and a hash-based design which is limited to signing a single message,
called a one-time signature scheme, in Chapter 3.
Handwritten signatures: goals and reality. To provide intuition for
digital signatures, let us first consider handwritten signatures. Ideally, handwritten
signatures should allow everyone to verify a signed document, by comparing
the signature on the document to a sample signature, known to be of that signer.
The purported security of handwritten signatures is based on two (implicit)
assumptions. The first assumption is that only the signer herself should be able
to sign a document in a way which produces a signature matching her sample
signature. The second assumption is that it is infeasible to change a signed
document without leaving marks that would invalidate the signature.

Figure 1.6: Digital Signature Scheme.
Reality is less ideal; handwritten signatures are forged, and hand-signed
documents are modified. Indeed, for these reasons, there are experts who are
called upon to detect forged signatures and documents modified after signing
- and these experts may fail to make a correct determination; indeed, different
experts may disagree. A serious forger may, for example, lure the victim signer
into signing a benign-looking document, such as ‘I owe Mal $1’, which may
make it easier to modify into ‘I owe Mal $1,000,000’ without leaving marks.
It could be quite hard to prove that the document was modified after it was
signed.
Digital signatures and their security. Let us now focus on digital signatures,
illustrated in Figure 1.6, and their security properties. Basically, we
argue that digital signatures provide the security that is desired, but not really
provided, by handwritten signatures.
There are some obvious differences between handwritten signatures and
digital signatures. Obviously, the document and the signature are strings (files)
rather than ink on paper, and the processes of signing and validating are done
by applying appropriate functions (or algorithms), as illustrated in Figure 1.6.
The signing and verification functions require appropriate keys. The private
signing key, s, replaces the unique personal characteristics of the signer, creating
the personalized signature; the public verification key, v, replaces the sample
signature available to anybody wishing to verify signed documents. This key pair,
(s, v), is produced together, using a key generation algorithm, KG.
A signature scheme S = (KG, Sign, Verify), therefore, consists of these
three algorithms: KG, for producing the keys, Sign, for signing, and Verify,
for verifying. These three algorithms are similar to those of a public key
cryptosystem (Figure 1.5), and, as there, the key generation algorithm, KG,
may receive as input an indication of the required key length l.
The key generation algorithm outputs a pair of keys: a private signing key
s, to be known only to the signer, and a corresponding public verification key v,
to be known to anybody who wishes to verify signatures. The Sign algorithm is
given the (private) signing key s and a message m; the output of the signature
algorithm, which we denote in Figure 1.6 by σ, is usually referred to as the
signature of the message m: σ = Sign s (m).
Signature schemes should also ensure security. Intuitively, the security
requirement is that an attacker cannot forge signatures. Namely, given the
public verification key v, an attacker cannot obtain a signature σ such that
Verify v (m, σ) = True, unless it was also given the private signing key s or the
computed value of σ = Sign s (m).
We provide a precise definition of this security requirement in subsection 1.5.1,
since this definition requires some notations and background that we have not
yet discussed. Note, however, that as with public-key encryption, it is non-trivial to
design a signature scheme which is not easy to break. Therefore, we present
implementations only much later: of RSA in Chapter 6, and of one-time
signatures in subsection 3.4.2.
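As a preview of the hash-based, single-message approach mentioned above, the following is a toy sketch in the spirit of Lamport’s one-time signature idea; the parameter choices (SHA-256, signing the 256-bit digest of the message) are illustrative assumptions, and this is not the exact construction of subsection 3.4.2.

```python
import hashlib
import secrets

def H(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def keygen():
    # Private key s: two random preimages per digest bit.
    # Public key v: the hashes of those preimages.
    s = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    v = [(H(a), H(b)) for a, b in s]
    return s, v

def digest_bits(m: bytes):
    d = H(m)
    return [(d[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]

def sign(s, m: bytes):
    # Reveal, for each digest bit, the preimage matching that bit's value.
    return [s[i][b] for i, b in enumerate(digest_bits(m))]

def verify(v, m: bytes, sig) -> bool:
    # Check each revealed preimage against the published hash.
    return all(H(sig[i]) == v[i][b] for i, b in enumerate(digest_bits(m)))
```

Each private preimage may be revealed only once; signing a second message would disclose enough preimages to enable forgery, which is why such schemes are ‘one-time’.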
The reader may wonder why we define the security requirements of
signature schemes already in this chapter. There are a few reasons:
• The security requirements of signature schemes are relatively simple and
easy to define. By discussing and defining these requirements, we
demonstrate the tools of probability and computational complexity, which
are later used to define other cryptographic schemes.
• Signatures are one of the most important and widely used cryptographic
schemes, yet they are among the least known and understood outside
the cryptographic community.
• Signatures are the main deterrence-based security mechanism, at least in
cryptography, and definitely in this textbook. Deterrence is not always
sufficiently recognized for its potential to ensure security; we discuss this
next.
Non-repudiation. Everyone can use the public validation key v to verify
signatures; this key does not suffice to generate valid signatures! Therefore, a
digital signature can provide not only authentication, but also non-repudiation:
the signer cannot deny having signed the message. As we will see in Chapter 7
and Chapter 8, this property is critical to applications of public-key cryptography,
including the TLS protocol, which is key to web-security and other applications.
Warning: conflicting use of the term ‘digital signature’. To finish this
high-level introduction to the cryptographic mechanism of digital signatures,
we need to warn the reader about a conflicting use of the same term for a very
different mechanism. Specifically, the term ‘digital signature’ is often used to
refer to the visual appearance of a ‘signature’ in a document, such as a signature
included in a PDF file. This mechanism offers little or no technical defense
against forgery; it is quite easy for an attacker to use the same visual ‘signature’
in a different file, i.e., forgery is often easy.
1.2.4 Applying Signatures for Evidence and for Public Key Infrastructure (PKI)
Signatures are asymmetric: signing requires the private signing key, but validation
of a signature only requires the corresponding public verification key. This
property facilitates two functions which are critical to cybersecurity: the provision
of evidence, and support for the Public Key Infrastructure (PKI). Let us briefly
discuss both of these functions.
Signatures facilitate non-repudiation and evidence. The recipient of a
signed message ‘knows’ that once she has validated a signature, using the verification
key v, she will be able to convince other parties that the message was, in
fact, signed using the private key corresponding to v. We refer to this
property as non-repudiation, since the owner of the private key cannot claim
that the ability to verify messages allowed another party to forge a signature,
i.e., compute a seemingly-valid signature of a message, without access to the
private signing key.
The non-repudiation property allows a digitally-signed document to provide
evidence of the agreement of the signer, much like the classical use of handwritten
signatures. Indeed, the use of digital signatures to prove agreement
has significant advantages compared to the use of handwritten signatures:
Security. Handwritten signatures are prone to forging of the signature itself, as
well as to modification of the signed document. If the signature scheme is
secure (i.e., existentially unforgeable, see Definition 1.6), then production
of a valid signature over a document m practically requires the application
of the private signing key to sign exactly m.
Convenience. Digital signatures can be sent over a network easily, and their
verification only requires running an algorithm. Admittedly, signature verification
does involve some non-negligible overhead, but this is
incomparably easier than the manual process and expertise required to
confirm handwritten signatures. Moreover, digital signatures may be easily
archived, backed up and so on.
Non-repudiation is essential for many important applications, such as signing
an agreement or a payment order, or for validation of recommendations and
reviews. Non-repudiation is also applied extensively in different cryptographic
systems and protocols.
Legal interpretation of signatures and digitized handwritten signatures.
Digital signatures are covered by legislation in some jurisdictions; however, their
legal definition and implications vary significantly between jurisdictions, and
often differ considerably from what you may expect based on the cryptographic
definitions and properties. For example, many web services use the term
‘digital signature’ to refer to agreement by a user in a web form, sometimes
accompanied by a visual representation of a handwritten signature. Other
systems and organizations consider as a ‘digital signature’ the scanned or
scribbled version of a person’s signature, which may be better referred to as
a digitized handwritten signature. Such services may offer convenience, but
not the security of digital signatures (in the sense used in this textbook and
by experts). In particular, since digitized handwritten signatures are merely
digitally-represented images, they definitely cannot prevent an attacker from
modifying the ‘signed’ document in arbitrary ways, or even reusing the signature
to ‘sign’ a completely unrelated document. From the security point of view,
these digitized handwritten signatures are quite insecure - not only compared to
cryptographic signatures, but even compared to ‘real’ handwritten signatures,
since ‘real’ handwritten signatures may be verified with some precision by
careful inspection (often by experts).
Signatures facilitate public key infrastructure (PKI) and certificates.
Most applied cryptographic systems involve public key cryptosystems (PKCs),
e.g., RSA, and key-exchange protocols, e.g., the Diffie-Hellman (DH) protocol,
both presented in Chapter 6. In particular, PKCs and key exchange are central
to the TLS protocol (Chapter 7), which is probably the most widely-used and
important cryptographic protocol, and the main cryptographic web-security
mechanism. However, all of these depend on the use of authentic public keys
of the remote entities. This still leaves the question of establishing the
authenticity of the public information (keys).
If the adversary is limited in its ability to interfere with the communication
between the parties, then it may be trivial to ensure the authenticity of the
information received from the peer. In particular, if the adversary is passive,
i.e., can only eavesdrop on messages, then it suffices to simply send the public
key (or other public value).
Some designs assume that the adversary is inactive or passive during the
initial exchange, and use this exchange to establish information, such as keys,
between the two parties. This is called the trust on first use (TOFU) adversary model.
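A minimal sketch of the TOFU idea, assuming we pin a hash (‘fingerprint’) of the public key seen on first use; the function names and the in-memory store are hypothetical, for illustration only.

```python
import hashlib

known: dict[str, str] = {}  # host -> fingerprint of the key seen on first use

def fingerprint(pubkey: bytes) -> str:
    return hashlib.sha256(pubkey).hexdigest()

def check_tofu(host: str, pubkey: bytes) -> bool:
    fp = fingerprint(pubkey)
    if host not in known:
        known[host] = fp    # first use: trust the key and remember it
        return True
    return known[host] == fp  # later uses: key must match the pinned one
```

Note the model’s assumption made explicit by the code: if the adversary is already active at the very first exchange, the wrong key gets pinned and later checks pass.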
In some scenarios, the attacker may inject fake messages, but cannot eavesdrop
on messages sent between the parties; in this case, parties may easily authenticate
a message from a peer, by previously sending a challenge to the peer, which the
peer includes in the message. We refer to this as an off-path adversary. Off-path
adversaries are mainly studied when focusing on non-cryptographic aspects of
network security.
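The challenge-response idea against an off-path attacker can be sketched as follows; the message format and function names are assumptions for illustration. The point is that only a peer who can receive the challenge - which an off-path attacker, by assumption, cannot - can echo it back.

```python
import secrets

def make_challenge() -> str:
    # Unpredictable 128-bit challenge, sent to the peer.
    return secrets.token_hex(16)

def peer_response(challenge: str, message: str) -> tuple[str, str]:
    # The peer includes the received challenge in its message.
    return message, challenge

def accept(expected: str, response: tuple[str, str]) -> bool:
    message, echoed = response
    # Constant-time comparison of the echoed challenge.
    return secrets.compare_digest(expected, echoed)
```

A real protocol would also bind the challenge to the message content (e.g., cryptographically); this sketch only illustrates why guessing the challenge is the off-path attacker’s obstacle.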
However, all these methods fail against the stronger Man-in-the-Middle
(MitM) adversary, who can modify and inject messages as well as eavesdrop
on messages. Furthermore, there are many scenarios where attackers may
obtain MitM capabilities; and even when this seems unlikely, it is
always better to ensure security against such powerful attackers, following the
conservative design principle (Principle 3). To ensure security against a MitM
attacker, we must use strong, cryptographic authentication mechanisms.
Signature schemes provide a solution to this dilemma. Namely, a party
receiving signed information from a remote peer can validate that information
using only the public signature-validation key of the signer. Furthermore,
signatures also allow the party performing the signature validation to first
validate the public signature-validation key itself, even when it is delivered over an
insecure channel which is subject to a MitM attack, such as email. This solution
is called public key certificates, and we discuss it in Chapter 8.
1.2.5 Cryptographic hash functions
A cryptographic hash function h receives an input string m and outputs a short
string h(m). We refer to this output as the hash, fingerprint, digest or checksum
of the input string m.
Several security properties are defined for cryptographic hash functions; see
Chapter 3. The most well-known is collision resistance. Collision resistance
means that given the digest h(m) of some string m, it is infeasible to find
a different string m′ ≠ m which has the same digest: h(m′) = h(m). The
collision resistance property is often used to ensure integrity, e.g., integrity of
software downloads; see Lab 1.
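For example, using SHA-256 (via Python’s standard hashlib) as a stand-in for the hash function h, a software download can be checked against a vendor-published digest; the data below is made up for illustration.

```python
import hashlib

def digest(data: bytes) -> str:
    # SHA-256: a widely used cryptographic hash function.
    return hashlib.sha256(data).hexdigest()

software = b"example installer bytes"
published = digest(software)            # digest published by the vendor

downloaded = b"example installer bytes"
assert digest(downloaded) == published  # integrity check passes

tampered = b"example installer bytes + malware"
assert digest(tampered) != published    # any change is detected
```

If it is infeasible to find a different input with the same digest, matching the published digest implies the downloaded file was not modified.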
1.3 Sequence Diagrams and Notations
Notations and figures are essential for precise, effective technical communication.
However, it can be frustrating to read text which uses unfamiliar, forgotten or
confusing notations. This could be a special challenge for those readers of this
text who have not been much exposed to some of the notations used in mathematics
and the theory of computer science.
Unfortunately, there are sometimes multiple notations for the same concept,
or multiple conflicting interpretations of the same notation. We tried to choose
the most widely used and least conflicting and confusing notations, but that
required some difficult tradeoffs. For example, we use the symbol ++ to denote
string concatenation, although the symbol || is more commonly used to denote
string concatenation in the cryptographic literature. The reason for preferring ++ is
to avoid confusing readers who are used to the use of the symbol || to denote
the logical-OR operator, as in several programming languages.
In this section, we introduce the notations and sequence diagrams which we use
in this text. Our choice of notations, as well as of the use of sequence diagrams
and their style, is consistent, as much as possible, with the common literature. In
particular, Table 1.1 presents notations which we use extensively. Please refer to
it whenever you see some unclear notation, and alert the author of any missing,
incorrect or confusing notation.
Let us discuss separately two important, basic notations: the dot notation,
for referring to items within a tuple, and the key notation, for denoting the key
when provided as input to a cryptographic function.
The dot notation. We use A.s and A.v to denote the signing and verification
keys of Alice, respectively. Here, A stands for ‘Alice’, and the s and v after the
Table 1.1: Notations used in this manuscript.

S = {a, b, c}  :  A set S with three elements: a, b and c. Sets are denoted with capital letters.
N, Z, Z+  :  N: natural numbers (integers greater than zero); Z: all integers; Z+: non-negative integers; R: all real numbers.
Zp, Z*p  :  The sets {0, ..., (p − 1)} and {1, ..., (p − 1)}, respectively.
{x ∈ X | f(x) = 0}  :  The subset of elements x ∈ X s.t. f(x) = 0.
(∀x ∈ X)(f(x) > 1)  :  For all elements x in the set X, f(x) > 1 holds. Set X omitted when ‘obvious’.
(∃x ∈ X)(f(x) > 1)  :  There is (exists) some x in X s.t. f(x) > 1.
Π_{x∈S} V_x  :  Multiplication of V_x for every x ∈ S, e.g., Π_{x∈{a,b,c}} V_x = V_a · V_b · V_c. Similar to the use of Σ_{x∈S} for addition.
i!  :  The factorial of i, defined as: i! ≡ Π_{j∈{1,...,i}} j = 1 · 2 · ... · i.
C ∪ B  :  Union of sets C and B.
A ⊆ B  :  Set A is a subset of set B, i.e., a ∈ A ⇒ a ∈ B.
A ⫋ B  :  Set A is a ‘proper’ subset of set B, i.e., A ⊆ B but A ≠ B. For example, N ⫋ Z+ ⫋ Z.
A × B  :  Cross-product of sets A and B, i.e., the set {(a, b) | a ∈ A and b ∈ B}.
0x..., e.g., 0xAF2  :  Hexadecimal string, i.e., 0x is followed by a string of hexadecimal digits (from 0 to F).
{a, b}^l  :  The set of strings of length l over the alphabet {a, b}.
{a, b}*  :  The set of strings of any length, over the alphabet {a, b}.
++  :  Concatenation of strings; abc ++ de = abcde. Note: in cryptographic literature, concatenation is often denoted by ||; we prefer ++ since || is elsewhere used for the logical OR operation.
a^l (e.g., 1^l, 0^l)  :  String consisting of l concatenations of the letter/sequence a. For example, 0^4 = 0000, 1^3 = 111, and (01)^2 = 0101. Also, 1^l is the number l in unary notation.
|b|  :  The length of string b; hence, |a^n| = n · |a| and |0^n| = n.
a[i]  :  The ith most significant character of string a. For a binary string, the ith bit; for a byte-string, the ith byte. E.g., if a = 011, then a[1] = 0 and a[2] = 1.
a[i : j] or a[i . . . j]  :  Substring containing a[i] ++ ... ++ a[j].
a^R  :  The ‘reverse’ of string a, e.g., (abcde)^R = edcba.
a.b  :  Dot notation: element b of tuple a (Section 1.3).
x ∧ y  :  Bit-wise logical AND; 0111 ∧ 1010 = 0010.
x ∨ y  :  Bit-wise logical OR; 0111 ∨ 1010 = 1111.
x ⊕ y  :  Bit-wise exclusive OR (XOR); 0111 ⊕ 1010 = 1101.
x̄  :  The bit-wise inverse of binary string or bit x.
x ←$ X  :  Select element x from set X with uniform distribution.
Pr_{x←$X}(F(x))  :  The probability of F(x) to occur, when x is selected uniformly from set X.
A^{B_k(·)}, A^{f_k(·)}  :  Algorithm A with oracle access to algorithm B or to function f, with key k. Namely, A can give input x and receive B_k(x) or f_k(x). See Definition 1.3.
PPT  :  The set of efficient (Probabilistic Polynomial Time) algorithms; see Definition A.1.
NEGL(n)  :  Set of ‘negligible functions’ in input n ∈ N; see Def. 1.5.
O(f(n))  :  Big-O notation, identifies the complexity of an algorithm; see Equation A.1.
⟨ψ⟩_k  :  String ψ protected (‘enveloped’) using key k, e.g., by the TLS record protocol.
dot represent specific values (keys) associated with Alice. We find dot notation
a convenient way to identify different keys and other values associated with a
particular entity. We also use dot notation, when necessary, to avoid ambiguity
when referring to the different functions comprising a cryptographic scheme, for
example, a signature scheme S = (KG, Sign, Verify), e.g.: (s, v) ←$ S.KG(1^l).
Finally, we also use dot notation to refer to different outputs of a function
returning multiple values. We use these conventions throughout this textbook.
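For readers who think in code, several of the string notations of Table 1.1 map directly onto common operations; here is a small illustration (Python chosen arbitrarily; note the book’s a[i] is 1-indexed, while code is 0-indexed).

```python
a, b = "0111", "1010"

# x ⊕ y: bit-wise XOR of equal-length bit strings
xor = "".join(str(int(x) ^ int(y)) for x, y in zip(a, b))
assert xor == "1101"

# abc ++ de = abcde: string concatenation
assert "abc" + "de" == "abcde"

# a[i]: the i-th most significant character; the book's a[1] is Python's a[0]
s = "011"
assert s[0] == "0" and s[1] == "1"

# a^R: the 'reverse' of string a
assert "abcde"[::-1] == "edcba"

# 0^4 = 0000: l concatenations of a letter
assert "0" * 4 == "0000"
```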
The key notation. Cryptographic functions often use keys. While the key
is, formally, an input to the function like any other input, it is often convenient,
and customary, to place the key as subscript to the function name. For example,
we use Sign s (m) to denote the signing algorithm Sign applied to the message
m, using the signing key s. In Figure 1.7 we use this notation together with the
dot notation, and write Sign A.s (m) to denote the result of applying signature
algorithm Sign to message m, using the signing key A.s of Alice.
1.3.1 Sequence diagrams
Let us now present sequence diagrams, a widely-used technique for illustrating
the interactions between entities over time; sequence diagrams are widely used
in cybersecurity, communication protocols and other areas involving interactions
between entities. For example, Figure 1.7 is a sequence diagram, like most figures
in this textbook, and, indeed, most figures in the cybersecurity, cryptography
and networking literature. A sequence diagram illustrates the progression of
events over time. In this figure, and in most figures in this textbook,
time proceeds from top to bottom; note that in some other sequence diagrams,
time proceeds from left to right. The top of the diagram shows the different
parties and processes/algorithms. For example, in Figure 1.7, we have two
parties, Alice and Bob, and the three algorithms comprising the signature
scheme (KG, Sign, Verify). The arrows represent communication: over the
network (between parties) or within the same party (with algorithms).
To show network latency (delay), some sequence diagrams may use slanted
arrows. However, in this textbook, we mostly ignore the communication delays,
hence the sequence diagrams mostly use horizontal arrows.
Example: sequence diagram for signature schemes. Figure 1.7 demonstrates
sequence diagrams and some of the notations we presented, via a sequence
diagram of the initialization and the typical use (signing and verifying)
of a signature scheme.
1.4 A Bit of Background
Modern cryptography makes extensive use of mathematics, in particular, complexity theory, number theory, group theory, and probability. These are large,
Figure 1.7: Sequence diagram for the initialization and use of a signature
scheme. Alice signs message m with her private signing key A.s, resulting in
σ = Sign A.s (m). Alice sends σ and m to Bob, who verifies the signature by
computing Verify A.v (m, σ). Since in this example σ is a valid signature of m,
the result is Ok. If σ is not a valid signature of m, the result should be Invalid.
important and interesting areas, which are often part of a computer science
curriculum. However, we believe that it is not essential to study these areas before
studying this textbook; the textbook only requires limited use of basic concepts
and results from these areas. Instead, we provide the necessary, limited background,
for a reader who has not yet learned these areas, in Appendix A. In fact,
studying this textbook before studying these topics may provide motivation
and prepare the reader for in-depth study of these important and interesting
areas.
In this section, we briefly introduce these topics, to allow readers to determine
if they need to learn a bit from any of them, or if they know enough from
prior studies. If readers find it necessary to learn a bit more about these topics,
we provide the necessary background in Appendix A. Readers can read the
appendixes in advance and/or ‘as needed’, i.e., when the text makes use of the
relevant area.
In Section 1.3, we introduced the notations used throughout this
textbook. Notations are important; therefore, we urge readers to read and refer
to that section. There we presented sequence diagrams, a widely-used
graphical notation for presenting interactions between entities, which we use
extensively to present protocols and attacks, and the convenient dot notation,
which identifies values associated with a particular entity. Additional notations,
many of which may be familiar to readers, are summarized in Table 1.1, for
handy reference when reading the text.
1.4.1 A bit of Computational Complexity
Most cryptographic schemes assume restrictions on the computational abilities
of the adversary.² The focus on adversaries with restricted computational
capabilities allows us to rule out some attacks which require an absurd amount of
resources, such as exhaustive search - trying out all possible keys (see subsection 2.3.1).
However, how can we define a clear restriction on the adversary’s computational
abilities? This can be quite tricky; fortunately, the theory of computational
complexity provides an elegant solution. In this introductory textbook, we only
need to understand some very basic notions and properties from the theory
of complexity. Our discussion is restricted to the essential notions; we
believe that readers should be able to follow the text even without learning
computational complexity. We list below the main aspects of complexity theory
that we use, and explain them in Section A.1. For readers interested in learning
more, we recommend one of the many excellent textbooks on computational
complexity, e.g., [106, 167], which provide a much more extensive introduction
to this important and interesting subject.
The aspects of computational complexity which we use in the textbook, and
describe in Section A.1, include:
• The big-O notation and its use for specifying and comparing the asymptotic
complexities (overhead) of algorithms. Asymptotic complexity focuses on
the overhead of the algorithm as its input size grows toward infinity. This
is convenient, since for smaller input sizes, the overhead may be dominated
by fixed per-operation and startup factors, which become insignificant
as the input size grows. The big-O notation basically ignores these fixed
factors. For example, the big-O notation for linear functions is O(n),
for quadratic functions O(n^2), and for exponential functions it is O(a^n),
where a is the base of the exponent; see the illustration of specific linear, quadratic
and exponential functions in Figure 1.8.
• The definition of Probabilistic Polynomial Time (PPT) algorithms, also
referred to as efficient algorithms or polytime algorithms, and the corresponding
notion for functions. Basically, an algorithm is efficient, or
PPT, if its running time is bounded by some polynomial in its input length.
Therefore, an algorithm is efficient (PPT) if its time complexity is linear,
quadratic or any other polynomial in the input length n, e.g., O(n^a) for
any constant a. In contrast, an algorithm whose time complexity is exponential,
e.g., O(2^n), is considered inefficient. For motivation, see how in Figure 1.8
the exponential function exceeds the linear and quadratic functions for
sufficiently large input size n.
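Evaluating the three functions of Figure 1.8 at a few input sizes shows how the exponential function, despite its smaller constants, eventually dominates:

```python
# The three functions of Figure 1.8.
f = lambda n: 600 * n + 900      # linear, O(n)
g = lambda n: 80 * n ** 2 + 400  # quadratic, O(n^2)
h = lambda n: 2 ** n + 100       # exponential, O(2^n)

for n in (4, 10, 16, 30):
    print(n, f(n), g(n), h(n))
```

For small n the exponential h(n) is the smallest of the three, but by n = 30 it exceeds both f(n) and g(n) by several orders of magnitude; this is why big-O ignores the fixed constants.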
2 There are also some definitions and constructions of unconditionally secure cryptographic
schemes. We cover two important unconditionally-secure schemes: One Time Pad (OTP)
encryption (Section 2.4) and Secret Sharing (Section 10.1).
Figure 1.8: Comparing linear, quadratic and exponential complexities: the
linear function f(n) = 600n + 900 = O(n), the quadratic function
g(n) = 80n^2 + 400 = O(n^2) and the exponential function h(n) = 2^n + 100 =
O(2^n).
• The security parameter 1^l, which is the number l encoded in unary (i.e., a
string of l bits, each of value 1). The security parameter is often used as
the input to the (randomized) key-generation algorithms; the length of
the key may be l. The reason for using this input is to allow
the key-generation and other algorithms to run in time polynomial in
their input length, and hence polynomial in l.
• The non-deterministic polynomial-time (NP) class of problems, and the
‘NP = P?’ question.
1.4.2 A bit of Number Theory and Group Theory
Number theory and group theory are often used in the design and analysis
of cryptographic schemes. In this textbook, we use only a tiny subset that is
necessary for our study of applied cryptography.
The subset of number theory that we need is mostly focused on modular
arithmetic, i.e., the computation of expressions involving arithmetic operations
over integers, where the operations include modulo operations. Modular arithmetic
shares many of the properties of regular arithmetic (over the integers and
the real numbers); however, there are also important differences. A reader not
familiar with these basic topics is strongly advised to learn them, from
Section A.2 or from one of the many good textbooks covering this topic.
One example of a difference between modular arithmetic and regular arithmetic is the subject of multiplicative inverses, which we cover in subsection A.2.2.
The multiplicative inverse of a number x is denoted x^-1, and is the number satisfying x · x^-1 = 1, where the multiplication may be modular or regular. Over the integers, there are no multiplicative inverses (except for 1 and −1, which are their own inverses), while over the reals, every number except zero has an inverse. For modular arithmetic, the situation is a bit more complex: an integer a has a multiplicative inverse modulo an integer m > 0 if and only if a and m are coprime, namely, they do not have a common divisor (except 1).
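The coprimality condition can be checked directly; a small sketch in Python (note that the built-in pow computes modular inverses, given exponent -1, since Python 3.8):

```python
from math import gcd

def mod_inverse(a, m):
    """Return a^-1 mod m if it exists (i.e., if gcd(a, m) == 1), else None."""
    if gcd(a, m) != 1:
        return None       # a and m are not coprime: no inverse exists
    return pow(a, -1, m)  # Python 3.8+ computes modular inverses directly

# 3 and 10 are coprime, so 3 has an inverse modulo 10: 3 * 7 = 21 = 1 (mod 10)
print(mod_inverse(3, 10))   # → 7
# 4 and 10 share the divisor 2, so 4 has no inverse modulo 10
print(mod_inverse(4, 10))   # → None
```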
Computing the multiplicative inverse is one of the problems which is widely believed to be computationally hard, i.e., it is believed that there is no efficient (PPT) algorithm to compute multiplicative inverses - but only when applied to numbers which are hard to factor, i.e., numbers chosen as a product of very large random primes. Finding the factors of such numbers is referred to as the factoring problem, and it is considered computationally hard (Section A.1); and when the factors are hard to compute, computing multiplicative inverses is also hard. In fact, this is crucial to the security of the well-known and important RSA public key cryptosystem, which we discuss in Chapter 6. Specifically, in RSA, the private key is the multiplicative inverse of the public key; hence, if it were possible to efficiently compute such multiplicative inverses, then RSA would be insecure.
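To make the RSA connection concrete, here is a toy sketch with tiny textbook primes (our own illustrative parameters, far too small for any real security), showing that the private exponent is the multiplicative inverse of the public exponent, computable by anyone who knows the factors of the modulus:

```python
# Toy RSA key relation (tiny textbook primes; NOT secure in any way):
p, q = 61, 53
n = p * q                  # public modulus: 3233
phi = (p - 1) * (q - 1)    # 3120; computable only by whoever can factor n
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent = inverse of e modulo phi
assert (e * d) % phi == 1  # d is indeed the multiplicative inverse of e

# Schoolbook RSA signing, for illustration only: sigma = m^d mod n
m = 65
sigma = pow(m, d, n)
assert pow(sigma, e, n) == m   # verification with the public key recovers m
print(d)   # → 2753
```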
For details and more applications of multiplicative inverses, see subsection A.2.2.
In subsection A.2.3, we discuss two important and beautiful results of number theory, which are very important in cryptography: Fermat's and Euler's theorems. Among their applications is the efficient computation of multiplicative inverses - for numbers with known, or small, prime factors. In fact, these theorems also allow reducing the complexity of modular exponentiation. This property is key to the design of the RSA public key cryptosystem (see Chapter 6).
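For example, Fermat's little theorem states that a^(p-1) ≡ 1 (mod p) for a prime p and any a not divisible by p; hence a^(p-2) mod p is the multiplicative inverse of a. A quick check (the specific prime and base are our own choices):

```python
p = 101       # a small prime modulus
a = 7         # any a with 1 <= a < p
# Fermat's little theorem: a^(p-1) = 1 (mod p) for prime p not dividing a
assert pow(a, p - 1, p) == 1
# Hence a^(p-2) mod p is the multiplicative inverse of a modulo p
inv = pow(a, p - 2, p)   # pow uses fast modular exponentiation, O(log p) steps
assert (a * inv) % p == 1
print(inv)   # → 29, since 7 * 29 = 203 = 2*101 + 1
```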
Finally, in subsection A.2.4, we introduce basic notions from the domain of group theory, which are used widely in applied cryptography, and a bit in this textbook. In particular, group theory is used to define the discrete logarithm problem, which is another important number-theoretic problem considered computationally hard. The discrete logarithm problem is used as the basis for several public-key schemes, including the important Diffie-Hellman key-exchange protocol (also in Chapter 6).
1.4.3 A bit of Probability
Probabilistic analysis and algorithms are very important for computer science
in general and for cryptography in particular. However, luckily, only the very
basics are required for our study of applied cryptography. We present this
minimal background on probability in Section A.3; for more in-depth coverage,
take a course and/or read one of the many excellent textbooks, e.g., [106, 175].
Probability deals with events which result in a value from some predefined set. For simplicity, we only consider a finite set of possible outcomes, and only uniform distributions and independent random variables.
We are often interested in the outcome of a randomized (or probabilistic)
algorithm A, i.e., an algorithm that can perform bit flip operations. We use
the notation Pr(π(A)) to denote the probability that predicate π holds for the
(randomized) output of A, and the notation y ← A(x) to denote that y is
assigned the random outcome of a uniformly-chosen run of A with input x.
In Section A.3, we present several simple, yet useful, properties of probability,
which we use in this textbook.
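To illustrate the notation, one can estimate Pr(π(A)) for a toy randomized algorithm by sampling many independent runs y ← A(x); the algorithm and predicate below are our own illustrative choices:

```python
import random
random.seed(1)   # fixed seed so the estimate is reproducible

def A(x):
    """A toy randomized algorithm: adds a uniformly random 8-bit value to x."""
    return x + random.getrandbits(8)

def pi(y):
    """A sample predicate: is the output even?"""
    return y % 2 == 0

# Estimate Pr(pi(A(7))) over many independent runs y <- A(7)
trials = 100_000
count = sum(1 for _ in range(trials) if pi(A(7)))
print(count / trials)   # close to 0.5: the low bit of A(7) is uniform
```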
1.5 Provable-Security and Definitions
Ensuring security is challenging. It is tempting to identify a list of possible
attacks, and evaluate security against these; but that is often misleading, resulting in vulnerability to other, unforeseen attacks. Instead, modern cryptography
is mostly based on provable security, whose goal is to prove that an attacker
with given capabilities is unable or unlikely to ‘break security’ of a given system
or cryptographic scheme.
Proving security requires us to clearly and precisely define the cryptographic scheme and its interactions, the attacker capabilities, the security requirements, and any additional assumptions. Only with precise definitions of the scheme, attacker capabilities, security requirements and assumptions, can we try to prove security. Note that there are also many opportunities for errors leading to vulnerabilities, either in the proof itself, or in the use of 'incorrect' definitions for attacker capabilities, security requirements or assumptions, e.g., when an attacker may have additional capabilities or when a cryptographic scheme is used incorrectly, i.e., assuming it ensures properties beyond its security requirements. Security and cryptography are rather unique in being very applied, yet requiring 'correct' and precise definitions and analysis.
In this section, we introduce the provable-security approach, by presenting the definition of a cryptographic signature scheme (subsection 1.5.1), the relevant attack models (subsection 1.5.2), types of signature forgeries (subsection 1.5.3) and, finally, the security requirements of signature schemes (subsection 1.5.8). This subject was introduced in a seminal paper from 1988 by Goldwasser, Micali and Rivest [170], which is highly recommended reading.
Why signatures? Our choice of illustrating provable security on signature schemes, rather than encryption, has multiple motivations. First, cryptographic signatures are fascinating and widely applied; in particular, they are fundamental to the TLS protocol (Chapter 7) and the public-key infrastructure (PKI, Chapter 8). Second, signatures are less known, and, furthermore, the term 'electronic signature' is often applied, confusingly, to mechanisms with weaker security guarantees. Third, we believe that the definition of the security requirements of signature schemes is, surprisingly, simpler and more intuitive than that of encryption schemes (see subsection 2.7.2). Fourth, signatures allow us to define and compare multiple natural, widely-known security requirements. Finally, these definitions are used in Chapter 3, which covers integrity and
hash functions, including the Hash-then-Sign (HtS) paradigm and hash-based signature schemes, as well as in Chapter 4, which covers message authentication, and in Chapter 6, which covers public key cryptography. Presenting them here allows the reader to choose the order of reading these chapters.
1.5.1 Definition of a Signature Scheme
We now present an example of the formal definition of cryptographic signature schemes and their correctness requirements; the same approach is used to study other cryptographic schemes, by first defining a scheme and its correctness requirements.
A (digital/cryptographic) signature scheme S consists of three algorithms:
S = (KG, Sign, Verify). The Sign algorithm is used to sign messages, using
a secret/private key s; the Verify algorithm is used to verify the purported
signature over a message, using a known, public key v; and the key generation
algorithm KG generates the keypair (s, v).
The definition, which follows, uses the concepts of efficient (PPT) algorithms and of the security parameter, which we discuss in Appendix A. Intuitively, the security parameter indicates the desired tradeoff between security and performance; a longer security parameter implies more security and more overhead, e.g., longer keys. In most of the cryptographic literature, and specifically in this textbook, the algorithms, including the adversaries, are efficient (PPT). Namely, the run-time of every algorithm is bounded by a polynomial in the length of its inputs, which usually includes the security parameter 1^l. In this textbook, there is only one exception: the unconditionally secure One Time Pad algorithm (Section 2.4).
Definition 1.1 (Signature scheme). A signature scheme is a tuple of three efficient (PPT) algorithms, S = (KG, Sign, Verify), and a set M of messages, such that:

KG is a randomized algorithm whose input is a unary string (security parameter 1^l) and whose output is a pair of binary strings (s, v), called the private key and the public key, respectively. To refer to only one of the two outputs of KG, we use the dot notation, i.e., s ≡ KG.s(1^l) and v ≡ KG.v(1^l).

Sign is an algorithm that receives two binary strings as input, a signing key s ∈ {0,1}* and a message m ∈ M, and outputs another binary string σ ∈ {0,1}*. We call σ the signature of m using signing key s.

Verify is an algorithm that receives three binary strings as input: a verification key v, a message m, and σ, a purported signature over m, and whose output is True or False (i.e., a predicate). Intuitively, Verify should output True if and only if σ is the signature of m using s, where s is the signing key corresponding to v (generated with v).
Usually, the set of messages M is either the set of all binary strings {0,1}*, or the set {0,1}^n of binary strings of some fixed length n. When M is not explicitly mentioned, this implies the set of all binary strings, i.e., M = {0,1}*.
In practice, signature schemes often assume a fixed input length n. To sign longer messages, the message is hashed and then the output of the hash is signed. We refer to this as the Hash-then-Sign paradigm, and discuss it in subsection 3.2.6. For example, the DSA standard signature scheme [297] defines the application of Hash-then-Sign (using a standard cryptographic hash function, SHA). The motivation is efficiency; the fixed-length signature schemes have high overhead (e.g., see Table 6.1), which further increases, super-linearly, as a function of the message size. Hashing longer messages, and then signing the hash, is much more efficient.
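The Hash-then-Sign idea can be sketched as follows; the `toy_sign_fixed` stand-in below is our own placeholder for a fixed-input-length signer, not a real (or secure) signature scheme:

```python
import hashlib

def hash_then_sign(sign_fixed, message: bytes) -> bytes:
    """Sign an arbitrary-length message by signing its fixed-length hash."""
    digest = hashlib.sha256(message).digest()   # always 32 bytes
    return sign_fixed(digest)

# Stand-in fixed-input-length "signer", for illustration only (NOT secure):
# it merely tags the 32-byte digest with a constant label.
def toy_sign_fixed(digest: bytes) -> bytes:
    assert len(digest) == 32       # the signer accepts only fixed-length input
    return b"SIG:" + digest

message = b"a very long contract ... " * 1000
sig = hash_then_sign(toy_sign_fixed, message)
print(len(sig))   # → 36: fixed size, regardless of the message length
```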
The definition allows the algorithms of the signature scheme to be randomized. This may look unnecessary, but, in fact, some important signing algorithms are randomized, e.g., using the PSS encoding [44, 292].
The correctness requirement of a cryptographic scheme verifies that the scheme operates correctly under benign operating conditions, usually without allowing any probability of error. For a signature scheme, this simply means that if the purported signature σ is indeed the output of the corresponding Sign operation, then the verification will return True, correctly indicating a valid signature. Namely, if (s, v) ←$ KG(1^l) is a pair of signing key and corresponding validation key, then validation, using v, of a signature σ over a message m, produced by signing m using s, would always return True. Let us now turn this into a precise definition of the correctness requirement.
Definition 1.2 (Correctness of a signature scheme). We say that a signature scheme (KG, Sign, Verify), with set M of messages, is correct, if for every security parameter 1^l, every key-pair (s, v) ←$ KG(1^l) and every message m ∈ M holds:

  Verify_v(m, Sign_s(m)) = True    (1.1)
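To make the interface and the correctness requirement concrete, here is a deliberately insecure toy 'scheme' of our own (the verification key equals the secret key, so it offers no security whatsoever); it only illustrates the (KG, Sign, Verify) triple and Equation (1.1):

```python
import hashlib, secrets

def KG(l: int):
    """Toy key generation: the verification key equals the secret key (INSECURE)."""
    s = secrets.token_bytes(l)
    return s, s                      # (signing key s, verification key v)

def Sign(s: bytes, m: bytes) -> bytes:
    return hashlib.sha256(s + m).digest()

def Verify(v: bytes, m: bytes, sigma: bytes) -> bool:
    return sigma == hashlib.sha256(v + m).digest()

# Correctness (Equation 1.1): Verify_v(m, Sign_s(m)) = True for every m in M
s, v = KG(16)
for m in [b"", b"hello", b"x" * 1000]:
    assert Verify(v, m, Sign(s, m))
print("correctness holds")   # → correctness holds
```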
Signature scheme security requirements. Intuitively, the security goal of signature schemes is unforgeability, i.e., to prevent an attacker from obtaining a (meaningful) forgery, where a forgery is a valid signature for a message that was not signed by the owner of the private signing key s. However, this goal is not well defined, and may be interpreted in several different ways; in particular, we can consider different attack models and types of forgery. We discuss such variants in the rest of this section, as well as notations and concepts that are relevant to provable security, in general and for signature schemes specifically, such as the oracle notation.
1.5.2 Signature attack models and the conservative design principle
Following the attack model principle (Principle 1), security should be defined and ensured with respect to attacker capabilities, i.e., attack model, rather than
assuming a specific attack strategy. Let us discuss the attack models relevant to the security of signature schemes.
The Key-Only Attack model. The weakest attack model against a signature scheme is the Key-Only Attack model, where the adversary is given (only) the public verification key v. More generally, in this attack model, the attacker is given (only) all the public keys and other public information; this attack model can be applied against any cryptographic scheme. However, the Key-Only Attack model is usually considered too weak. Specifically, for any practical application of signatures, surely the adversary should also be able to observe at least one signed message, in addition to the public key; this motivates the use of stronger attack models.
The Known Message Attack (KMA) model. In the Known Message Attack (KMA) model, the attacker can receive (an arbitrary number of) pairs of a message and its signature. However, the attacker cannot control (choose) the messages. Variants of this model may require the attack to succeed for any given set of signed messages, or for signed random messages. However, this model is also usually considered too weak, since in many applications and scenarios, the attacker may be able to lure the signer into signing some messages with specific format or content. For example, both parties are typically able to influence some of the text of a contract before it is signed.
The Chosen Message Attack (CMA) models. In this textbook, and in most works in modern cryptography, we adopt the stronger Chosen Message Attack (CMA) model, where the attacker can ask for, and receive, the signatures of arbitrary messages of its choosing. Furthermore, we use a strong variant called Adaptive Chosen Message Attack (Adaptive-CMA), where we allow the adversary to choose the messages adaptively, based on the public key and on the signatures it has previously received (but the word 'adaptive' is often omitted). See [170] for the weaker models: directed CMA (where the attacker chooses the messages only based on the public key) and generic CMA (the attacker chooses the messages without any input).
The Conservative Design Principle. One may argue that the Adaptive Chosen Message Attack model is 'unnecessarily strong'. In many applications, the adversary's ability to impact the contents of the signed message is very limited; and in some, the signer may phrase the contract, not allowing the adversary to have any (substantial) impact on it. It may seem that the KMA model, or the weaker CMA models mentioned above (directed CMA or generic CMA), may suffice; requiring security against the much stronger CMA model may seem to impose unnecessary burden.
However, we want to avoid vulnerabilities in systems using cryptographic schemes, due to incorrect usage of the schemes as well as due to incorrect attack models. It is difficult to predict the actual environment in which a
cryptographic scheme would be used, and a subtle difference between the real attacker capabilities and the attack model we use may result in a vulnerability. Specifically, in subsection 3.2.6 we show how an attacker who can receive a signature over a document which is mostly benign, except for a short string selected by the attacker, is able to forge a signature over a very different document. A variant of this attack was even demonstrated to circumvent the critical Web PKI mechanism (Chapter 8). This motivates a more conservative approach, e.g., the use of the stronger CMA model.
This is a special case of the important conservative design principle, which
basically says that cryptographic mechanisms should be secure under minimal
assumptions on the application scenarios, the underlying mechanisms and the
attacker.
Principle 3 (Conservative design and usage). Cybersecurity mechanisms,
and in particular cryptographic schemes, should be specified and designed with
minimal assumptions, simple usage with minimal restrictions, strongest security
requirements, and maximal, well-defined attack model (attacker capabilities),
rather than being designed using assumptions which hold for a specific system or
application. On the other hand, when using an underlying cryptographic scheme,
the design should assume the minimal requirements from the scheme, and limit,
as much as possible, the attacks that can be deployed against this underlying
scheme.
Both parts of the conservative design principle are very important. Many systems were vulnerable due to the use of mechanisms designed with subtle assumptions or restrictions, or ensuring insufficiently strong properties, e.g., assuming limited attacker capabilities. Other systems used cryptographic mechanisms in a sub-optimal way, which unnecessarily gave the attacker the ability to exploit later-discovered vulnerabilities of the cryptographic mechanisms. Designing security mechanisms to minimize assumptions and restrictions on their usage can also make their use easier.
1.5.3 Types of forgery
Which forgery is considered as a successful attack? Clearly, the forgery must
consist of a message which wasn’t signed by the legitimate signer (owner of the
private key), and a valid signature. However, which message is considered a
meaningful forgery? We consider three types of forgeries:
Existential forgery: any forgery is considered meaningful. Namely, the attacker is considered successful if it is able to obtain any pair of a message
and a valid signature over it, where the message was not signed by the
legitimate signer - even if the message is pure gibberish.
Selective forgery: The attacker selects some message m ∈ M before it begins
interacting with the system, and then succeeds in generating a valid
signature for m (without asking the signer to sign m, of course).
Universal forgery: The attacker can produce a valid signature to any message
m given to it.
Attackers that can perform universal forgery, can also perform selective
forgery; and attackers that can perform selective forgery, can also perform
existential forgery.
For some application scenarios, it may suffice to prevent universal forgery
or selective forgery, for example:
• Assume there are only two (or few) pre-defined messages to be signed, e.g., an inspector signing either 'valid' or 'invalid'. In this case, security against universal forgery suffices; the attacker cannot choose the message to be forged.
• Assume the application of signing a document, e.g., a contract. The attacker may have significant flexibility in which document it forges, but the forgery must be of a legible, meaningful contract. Existential forgery, where the attacker may only forge a signature over some 'random' gibberish document, may not be a threat.
However, following the conservative design principle, it is preferable to use signature schemes that prevent (even) existential forgery. Such schemes can be safely used even in an application where the attacker may be able to exploit signatures over seemingly-meaningless or benign messages.
Our discussion of signature schemes so far has been intuitive. However, in order to prove security, we need precise definitions, which we present in the following subsections.
1.5.4 Game-based Security and the Oracle Notation
There are different methods of defining security requirements for cryptographic schemes; this textbook, like most works on applied provable security, follows the game-based approach. We believe that game-based definitions are easier and more intuitive, and are used by more works on provable security of applied protocols, compared to other approaches such as simulation-based definitions.
Games. The term game refers to a well-defined algorithm that returns a binary outcome of one execution, in which an adversary A attacks the scheme: True if the attack succeeded and False if the attack failed. The game is often randomized; randomness may be used by the game itself, e.g., to define random challenges, by the adversary A and/or by the cryptographic scheme (if it is probabilistic, i.e., uses random bit-flip operations).
Game-based security definitions define a game, often using pseudo-code, and then use it to define the security requirements.
Oracles and the oracle notation. Many cryptographic games allow the adversary to receive the result of specific, limited operations that use the private key - while not providing that key to the attacker. For example, a Chosen Message Attacker (CMA) can receive signatures of messages it chooses, where the signatures are computed using the private signing key, which is obviously not disclosed to the attacker.
Basically, the game provides the attacker with 'black box' or 'subroutine' access to a function which, internally, has access to the private key. Specifically, to allow CMA, the attacker is given such access to a function computing Sign_s(m), where m is a message chosen by the attacker, and s is the private signing key (not given to the attacker).
The term oracle access is used to refer to such 'black box' access, e.g., to the function computing Sign_s(m) for an attacker-chosen message m. Oracles are used extensively in complexity theory and in cryptography.
We use the oracle notation A^{S.Sign_s(·)} to denote that the adversary A is given oracle access to S.Sign_s(·), the signing functionality using the private signing key s, for attacker-chosen messages. This means that A can provide input x ∈ {0,1}* and receive S.Sign_s(x), i.e., a signature of x using the secret key s. Notice that A does not receive the private signing key s, and has no access to the internal operation of S.Sign_s(·).
Definition 1.3 (Oracle notation). Let A be an algorithm, f be a function, and k ∈ {0,1}* be a string (typically, a secret such as a private key). We use the notation A^{f_k(·)} to denote that algorithm A can provide input x and receive f_k(x). Similarly, we use the notation A^{B_k(·)} to denote that A can provide input x and receive B_k(x), where B is an algorithm. We refer to f_k(·) or B_k(·) as an oracle.
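In code, oracle access is naturally modeled as a closure: the adversary receives a callable that uses the secret key internally, but never the key itself. A minimal sketch (our own illustration, with a keyed hash standing in for the signing function):

```python
import hashlib, secrets

def make_sign_oracle(s: bytes):
    """Return black-box ('oracle') access to Sign_s(.), without revealing s."""
    def oracle(m: bytes) -> bytes:
        return hashlib.sha256(s + m).digest()   # keyed hash standing in for Sign_s(m)
    return oracle

s = secrets.token_bytes(32)          # the private signing key, held by the game
sign_oracle = make_sign_oracle(s)    # the adversary receives only this callable

# The adversary may query signatures of chosen messages (CMA-style access) ...
sigma = sign_oracle(b"chosen message")
# ... but the closure interface never exposes the key s itself.
print(len(sigma))   # → 32
```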
In the next subsection, we use the oracle notation to define the existential-unforgeability CMA game.
1.5.5 The Existential Unforgeability CMA Game
Algorithm 1 presents the pseudocode of the existential unforgeability adaptive chosen-message attack (CMA) game, EUF^Sign_{A,S}(1^l). The game returns True if the adversary 'wins', i.e., is able to output some message m and a valid signature σ for it; otherwise, i.e., if the attack fails, the game returns False.
Algorithm 1 The existential unforgeability game EUF^Sign_{A,S}(1^l) between signature scheme S = (KG, Sign, Verify) and adversary A.

  (s, v) ←$ S.KG(1^l);
  (m, σ) ←$ A^{S.Sign_s(·)}(v, 1^l);
  return (S.Verify_v(m, σ) ∧ (A didn't give m as input to S.Sign_s(·)));
Explanation of the existential unforgeability game EUF^Sign_{A,S}(1^l) (Algorithm 1). The game receives only one input, the security parameter 1^l, and has only three steps:

1. Use the key-generation algorithm of the signature scheme to generate the signing and verification keys: (s, v) ←$ S.KG(1^l). We use the ←$ symbol to emphasize that S.KG is a randomized algorithm, i.e., returns a random key pair.

2. Then, we let (m, σ) ←$ A^{S.Sign_s(·)}(v, 1^l), i.e., the adversary outputs a message m and a purported forged signature for it, σ. The adversary receives the public key v and oracle access to the signing algorithm, i.e., can receive the values S.Sign_s(x) for any input x chosen by the adversary.

3. Finally, the game returns True, i.e., the adversary 'wins', if σ is a valid signature on m (using the verification key v), provided that m is not one of the inputs x whose signature S.Sign_s(x) was received by A from the oracle in the previous step.
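The game can be sketched in code; the keyed-hash 'signer' below is our own stand-in for S.Sign (not a real signature scheme), and the sketch only illustrates the bookkeeping of oracle queries and the winning condition:

```python
import hashlib, secrets

def EUF_game(l: int, adversary) -> bool:
    """Existential-unforgeability CMA game for a toy keyed-hash 'signer'."""
    s = secrets.token_bytes(l)              # (s, v) <- KG(1^l); the toy has no v
    queried = set()                         # messages the adversary gave the oracle
    def sign_oracle(m: bytes) -> bytes:     # oracle access to Sign_s(.)
        queried.add(m)
        return hashlib.sha256(s + m).digest()
    m, sigma = adversary(sign_oracle)       # (m, sigma) <- A^{Sign_s(.)}
    valid = sigma == hashlib.sha256(s + m).digest()
    return valid and m not in queried       # win = valid signature on a fresh m

# A trivial adversary that replays an oracle answer always loses:
# the message it outputs was queried, so it is not a forgery.
def replay_adversary(sign_oracle):
    m = b"some message"
    return m, sign_oracle(m)

print(EUF_game(32, replay_adversary))   # → False
```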
Intuitively, an existentially-unforgeable signature scheme S ensures that every efficient (PPT) adversary A would 'almost always' lose, i.e., Pr(EUF^Sign_{A,S}(1^l) = True) would be tiny or negligible, provided that the security parameter 1^l is 'sufficiently large'. We define this requirement below, in Definition 1.4 and Definition 1.6.

The following exercise shows that the adversary A can always 'win' the EUF^Sign_{A,S}(1^l) game against an arbitrary signature scheme S, if either we allow A to be inefficient (i.e., not a PPT algorithm), or if the keys generated by S are of limited length. Namely, in these cases, Pr(EUF^Sign_{A,S}(1^l) = True) = 1.
Exercise 1.1 (Forgery if adversary is computationally unbounded or if key length is bounded). Let S be an arbitrary efficient (PPT) signature scheme. Present an adversary A that is able to 'win' EUF^Sign_{A,S}(1^l) every time, if we allow either of the following:

1. A does not have to be an efficient (PPT) algorithm.

2. S outputs fixed, or bounded-length, keys.
Sketch of solution to first item: Since S is efficient, there is some polynomial
which bounds its running time. In particular, this bounds the length of the
private signing key s. The adversary will try all strings s′ up to that length; the
adversary will apply the signature algorithm using each such potential signature
key s′ , and each time, verify the signature using the public key v; eventually,
the correct signing key is found.
Sketch of solution to the second item: For simplicity, assume that the private key is always of length l. Then A tries to sign using each of the 2^l possible keys, verifying the signatures using the (known) public key, until it finds the correct key. Since l is fixed, the number of keys is a (large) constant rather than a
function of the security parameter; hence, the adversary can test all 2^l possible keys.
1.5.6 The unforgeability advantage function
The existential unforgeability game (Algorithm 1) is a random process, which
returns the outcome of a random run of the game, with the given adversary A
and signature scheme S. The outcome is True in runs where the adversary
‘wins’, i.e., outputs a forgery, and False in runs where the adversary ‘loses’,
i.e., does not output a forgery.
The outcome of the game may depend on the (random) keys output by
the (probabilistic) KG algorithm, as well as the outputs of the (randomized)
adversary A. The probability that the adversary wins usually depends on the security parameter 1^l. This probability is called the existential unforgeability advantage of A against S.
Definition 1.4. The existential unforgeability advantage function of adversary A against signature scheme S is defined as:

  ε^{EUF-Sign}_{S,A}(1^l) ≡ Pr[ EUF^Sign_{A,S}(1^l) = True ]    (1.2)

where the probability is taken over the random coin tosses of A and of S during the run of EUF^Sign_{A,S}(1^l) with input (security parameter) 1^l, and EUF^Sign_{A,S}(1^l) is the game defined in Algorithm 1.
The advantage function gives us a measure of the security of the signature scheme; in particular, clearly, a scheme is secure only if for any efficient adversary A, the advantage is small, or better yet, negligible³. Note, however, that for any fixed value of the security parameter 1^l, there is an adversary A that always wins - i.e., such that ε^{EUF-Sign}_{S,A}(1^l) = 1 (Exercise 1.1). Therefore, our definition of security cannot be bounded to a specific security parameter, and must consider the advantage as a function.
Which advantage functions are sufficiently-small (or negligible)? There are
two main ways in which we can deal with this question: asymptotic security
and concrete security. In this textbook we will adopt the asymptotic security
approach, which we explain below, since it is a bit easier to use. However, let us first briefly explain the alternative approach of concrete security, which allows a more detailed analysis of security - but is a bit harder to use.
1.5.7 Concrete security, asymptotic security and negligible functions
Concrete security. The concrete security approach uses the advantage
function directly as the measure of security. Namely, in this approach, there
is no explicit definition of a 'secure' scheme; each scheme is only associated
³ Unfortunately, no efficient signature scheme can ensure zero advantage; see Exercise 1.6.
with a specific advantage function. This allows the calculation of the advantage as a specific probability value - for any given, concrete values of the security parameter 1^l; this is also the reason for the term concrete security. In fact, in this approach, the advantage function is often given additional parameters. For example, the advantage function for a signature scheme may include the number of messages signed during the execution. In general, inputs to the advantage function often include the number of (different kinds of) oracle calls.

Concrete security allows precise analysis of security for specific key lengths and other parameters, and of the security impacts of different cryptographic constructions. This is a significant advantage of concrete security over asymptotic security (defined next). However, we believe that concrete security may be less appropriate for this introductory textbook. Instead, we decided to adopt the simpler-to-use asymptotic security approach, described next, where a design is either secure or insecure - with no quantitative measure.
Asymptotic security and negligible functions. Asymptotic security requires the advantage function, e.g., ε^{EUF-Sign}_{S,A}(1^l), to be negligible in the security parameter l.

What does it mean when we say that a function ε : N → R is negligible? Clearly, we expect such a function to converge to zero for large input, i.e.: lim_{l→∞} ε(l) = 0. Moreover, a negligible function is a function that converges to zero faster than the inverse of any (non-zero) polynomial; the definition follows.
Definition 1.5 (Negligible function). A function ε : N → R is negligible if for every non-zero polynomial p(l) holds:

  lim_{l→∞} ε(l) · p(l) = 0    (1.3)

We use NEGL to denote the set of all negligible functions.
Notes:

1. An equivalent condition is to say that ε : N → R is negligible if for every c holds lim_{l→∞} ε(l) · l^c = 0.

2. Any non-zero polynomial is not negligible.

3. For any constant x > 1, the inverse exponential function ε(l) = x^{-l}, e.g., 2^{-l}, is negligible.
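A quick numeric sanity check (our own sample values): multiplying the negligible 2^{-l} by the polynomial l^10 still drives the product to zero, while the inverse polynomial l^{-10} fails the definition already for p(l) = l^11:

```python
# epsilon(l) = 2^-l is negligible: multiplying by the polynomial p(l) = l^10
# still drives the product to 0 as l grows.
for l in (10, 50, 100):
    print(l, (l ** 10) * 2.0 ** (-l))

# epsilon(l) = l^-10 is NOT negligible: multiplying by p(l) = l^11 gives l,
# which grows without bound instead of tending to 0.
for l in (10, 50, 100):
    print(l, (l ** 11) * float(l) ** (-10.0))
```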
Here is an exercise to make sure this important concept is well understood.
Exercise 1.2. Which of the following functions are negligible? Why?
(a) f_a(l) = 10^{-8} · l^{-10}, (b) f_b(l) = 2^{-l/2}, (c) f_c(l) = 1/l!, (d) f_d(l) = (−1)^l / l, (e) f_e(l) = 0.5^l.
Working with negligible functions is a useful simplification; here is one convenient property, which shows that if an algorithm has negligible probability to 'succeed', then running it a polynomial number of times will not help - the probability of success will remain negligible.
Lemma 1.1. Consider a negligible function ϵ : N → R, i.e., ϵ ∈ NEGL. Then for any polynomial p(l), the function f(l) = p(l) · ϵ(l) is also negligible, i.e., f ∈ NEGL.
1.5.8 Existentially-unforgeable signature schemes

We now use the definitions of a negligible function and of the existential unforgeability advantage function, to define the asymptotic notion of an existentially unforgeable signature scheme.
Definition 1.6 (Existentially-unforgeable signature scheme). A signature scheme S is existentially unforgeable if for all PPT algorithms A, the advantage of A over S is negligible, i.e.: ε^{EUF-Sign}_{S,A}(1^l) ∈ NEGL, where ε^{EUF-Sign}_{S,A}(1^l) is defined in Definition 1.4.
Recall that ε^{EUF-Sign}_{S,A}(1^l) ≡ Pr[EUF^{Sign}_{A,S}(1^l) = True] is the probability that the adversary A succeeds in forging a message, in a random run of the existential-unforgeability game, with adversary A, signature scheme S and security parameter 1^l.
Exercise 1.1 shows that a signature scheme cannot be existentially unforgeable if the length of the keys it generates is fixed or bounded, regardless of the security parameter. In spite of this, standard signature schemes are defined for fixed key (and input, output) lengths.
We leave the definition of the corresponding game and notion of selective-unforgeability to the reader. Notice we do not include in this exercise universal-unforgeability, since it requires a slightly different type of definition.
Exercise 1.3.
1. Define the selective-unforgeability game.
2. Define a selectively-unforgeable signature scheme.
3. Show: if S is existentially-unforgeable, then S is selectively-unforgeable.
One-Time Signatures. Our definitions above focused on the ‘classical’ definitions of signature schemes and their security. However, there are many other variations considered in the cryptographic literature; let us mention one of these variants, One-Time Signature schemes. These are signature schemes which allow only a single signature operation. A similar variant allows a limited number of signature operations (limited-use signatures).
Exercise 1.4.
1. Define the existential-unforgeability game for a one-time
signature (or: limited-use signature).
2. Define a one-time existentially-unforgeable signature scheme.
One-time (and limited-use) signatures can be more computationally efficient
than the ‘classical’, unlimited-use signatures; see subsection 3.4.2. They may
also be a convenient way to ensure security against attackers with quantum-computing capabilities; see Section 10.4. Adapting the definitions we presented above to support such variants is not difficult; see Exercise 1.7. And there are many applications where one-time signatures suffice.
So why do we define and normally use ‘classical’, unlimited-use signatures? One reason is the conservative-design principle above. It is also simpler to define and use unlimited-use signatures, and it is convenient that we can reuse them across many applications. Convenience, reuse and simplicity are all very important properties.
1.6 A Brief History of Cryptography, Computing and Cybersecurity
Cryptography is a surprisingly ancient art, indeed one of the earliest sciences, and it also played a surprisingly large role in the development of computing. And the history of computing is obviously closely linked to the history of cyberspace and cybersecurity, where cryptography plays a major role.
Therefore, we conclude this chapter with a brief review of the history of
cryptology (subsection 1.6.1), and of the history of computing and cybersecurity
(subsection 1.6.2).
1.6.1 A brief history of cryptography
We now present a brief history of cryptography. We are only able to give a few important highlights from the fascinating history of cryptology; interested readers should consult some of the excellent manuscripts such as [223, 362]. We identify three main eras of cryptography: the heuristic cryptography era (until 1883), the Enigma era (1883 to 1970s) and the modern cryptography era. We chose the year 1883 to separate the heuristic era from the Enigma era, since in that year Kerckhoffs published his seminal manuscript [232].
The heuristic cryptography era [-1883]. Cryptology, which literally means the ‘science of secrets’, has been applied to protect sensitive communication since ancient times, and is therefore far older than computing devices. Originally, cryptology focused on protecting secrecy, i.e., on confidentiality, which is mainly provided by encryption schemes, also called ciphers and cryptosystems; see Figure 1.3.
One of the earliest evidences of encryption is from about 1500 BC: an encryption of a formula for pottery glaze, which presumably was commercially valuable. We present a few other ancient ciphers in Section 2.1, but cannot properly cover this topic here; if interested, read some of the excellent books such as [223, 264, 362].
Kerckhoffs’ publication (1883). Kerckhoffs’ publication [232], in 1883,
could be viewed as the end of the heuristic cryptography era. Kerckhoffs’ work
may have been the first major published work in cryptography; until then, most works in cryptography, and designs of cryptosystems, were kept secret, following the security by obscurity approach (with a few limited exceptions). Kerckhoffs realized that it is better to assume that the attacker may be able to capture encryption devices and reverse-engineer them to learn the algorithm, as we paraphrase in Kerckhoffs' principle (Principle 2); see subsection 1.2.2.
Kerckhoffs’ principle remains a basic principle of cryptography, and is
gradually being adopted into other areas of cybersecurity. With the increased
use of cryptography, the adoption of standards, software implementations and advanced reverse-engineering tools, Kerckhoffs' principle has only become more important.
Note that Kerckhoffs did not argue that cryptographic designs should be published. Indeed, for many years after Kerckhoffs' book was published, research and development in cryptography remained an area mostly dealt with by intelligence and defense organizations, and mostly in secret. This was definitely true until, and during, World War II.
The Enigma era [1883-1970s]. The most important advances in cryptography after Kerckhoffs' publication were made as part of the efforts of the second world war (WWII), and in the period leading to it. Cryptography, and in particular cryptanalysis, i.e., ‘breaking’ the security of cryptographic schemes, played a key role during the second world war; an important by-product was the development of the first computer.
During the war, both sides used multiple types of encryption devices. The most well known is the Enigma encryption device, used by the Germans. Early versions of the Enigma were in use already from 1924, and it was modified and improved over the years. Due to the importance of the Enigma, we use it as the name of the era.
Details of the design of the Enigma were kept secret; however, the designers clearly tried to follow Kerckhoffs' principle, and to ensure security even if the design were known to the cryptanalysts. This fact, together with continuous improvements to Enigma, made the Germans believe that Enigma would not be broken.
Indeed, the Allies were very concerned about their inability to decipher Enigma traffic; the cryptanalysis of Enigma was a major undertaking and a huge achievement. However, there may have also been a drawback to the German cryptographic effort: they were over-confident in the security of Enigma, and continued using it long after it was broken. Possibly due to this overconfidence, they also used Enigma in ways which made it easier to break the cipher, in violation of the conservative design and usage principle (Principle 3); see Section 2.2.
When the Enigma was eventually broken, the Allies took extraordinary measures to prevent the Germans from realizing this, so that the Germans would not change keys, change Enigma or take other precautions. Usually, this meant creating alternate explanations for the Allies' response to the information in the
plaintext; but sometimes, the difficult decision was made to avoid reacting to the information, since the resulting risks to people and/or property were deemed smaller than the risks if the Germans realized that Enigma was broken. It has been estimated that the successful cryptanalysis of Enigma shortened the war by about two years, and saved millions of lives.
While Enigma was designed following Kerckhoffs' principle, i.e., to ensure security even if its design is known, the Germans also made extensive efforts to maintain the secrecy of the design. Indeed, the successful attacks on Enigma were based on leakage of some information about it. In 1932, French intelligence was able to obtain the secret manual and the settings for the Enigma from a German officer, and shared this information with British intelligence and the Polish cryptanalysis unit, called the Cypher Bureau.
In the Polish Cypher Bureau, Captain Maksymilian Ciężki led a team that used mathematics for cryptanalysis, consisting of three of his cryptography students: Rejewski, Zygalski and Różycki. Using Enigma's manual and settings, the team reverse-engineered the Enigma and built Enigma replicas. Furthermore, they developed special-purpose electromechanical devices called Bombes, that allowed efficient testing of different Enigma keys against limited exposed information, such as known pairs of ciphertext and (likely) plaintext.
In 1939, Poland shared their Bombe devices and their cryptanalysis results
with the British. Using plaintext/ciphertext pairs and the Bombe machines,
the cryptanalysis center in Bletchley Park, led by Alan Turing, was able to
decipher much of the intercepted Enigma traffic.
However, the Germans periodically improved the Enigma. Every change in the Enigma required extensive effort to design and create modified Bombe devices, and meant a period when traffic could not be deciphered.
Furthermore, the somewhat less known Lorenz encryption devices were
introduced by the Germans later in the war, and no complete device was
captured and available to cryptanalysts.
The challenges of adapting Bombe devices to changes in Enigma, and of
breaking the Lorenz devices, motivated the construction of the first electronic
computer, called Colossus. For more on the history of computing and cybersecurity, see subsection 1.6.2. Since Colossus was programmable, it was possible to
test many possible attacks and to successfully cryptanalyze (different versions
of) Lorenz, Enigma and other cryptosystems.
Modern cryptography. Until the 1970s, cryptography remained mainly a
topic for intelligence and research organizations. In the 1970s, this changed,
quite dramatically, with the beginning of what we now call modern cryptology,
which involves extensive academic research, publication, products and standards,
and has many important commercial applications.
Two important landmarks mark the beginning of modern cryptology. The first landmark is the development and publication of the Data Encryption Standard (DES) [296]. The publication of DES as an open standard marks the point at which cryptography began to be widely deployed in commercial products; in
particular, DES was key to the security of the emerging bank networks, and to the security mechanisms of the emerging computer communication networks, which were dominated, for many years, by IBM's SNA.
The second landmark is the introduction of the radical, innovative concept of Public Key Cryptology (PKC), where the key to encrypt messages may be public, allowing easy distribution of encryption keys. The first publication was the seminal paper New directions in cryptography [123], by Diffie and Hellman. In [123], Diffie and Hellman introduced the concepts of public-key cryptography, public-key encryption and digital signatures. They also presented the important Diffie-Hellman Key Exchange protocol; we discuss public key cryptography and the Diffie-Hellman protocol in Chapter 6.
It is notable that in [123], Diffie and Hellman did not yet present a design of a public key cryptosystem. The first published public-key cryptosystem was RSA, by Rivest, Shamir and Adleman [334]. RSA and the Diffie-Hellman protocol remain widely-used public-key cryptographic mechanisms; we discuss them in Chapter 6.
In fact, the same design as RSA was discovered a few years earlier, by the British intelligence organization GCHQ. However, GCHQ kept this achievement secret until 1997, long after the same design was re-discovered independently and published in [334]. See [264] for more details about the fascinating history of the discovery, and re-discovery, of public key cryptology.
These two discoveries of public-key cryptology, with very different impacts
on society, illustrate the dramatic change between ‘classical’ cryptology, studied
only in secrecy and with impact mostly limited to the intelligence and defense
areas, and modern cryptology, with extensive published research and extensive
impact on society and economy.
1.6.2 A brief history of computing and cybersecurity
Early computing: from Babbage to Colossus. The first known idea of computing was proposed by Charles Babbage in 1822. Babbage created two designs of mechanical computing devices: the difference engine and the analytical engine. The difference engine was a special-purpose machine, designed to tabulate logarithms and trigonometric functions; the analytical engine, however, was a general purpose computer, designed to process input consisting of data and a program, both provided using punched cards.
Babbage was also interested in cryptography, and designed the first known practical attack against the Vigenère cipher (Section 2.1). The attack was based on letter frequencies, and one of the applications Babbage designed for the analytical engine was to compute letter frequencies.
Babbage never completed implementing either engine, but the designs were correct; they were implemented and tested for historical purposes, from 1989 to 1991, i.e., more than a century after he died. Babbage did, however, build some modules of the engines, and demonstrated and described their designs to peers.
The demonstration and description excited Ada Lovelace, one of the few female mathematicians at the time. Ada wrote a description of the sequence of operations for solving certain mathematical problems by the analytical engine, which is considered the first computer program; Ada was also the first to suggest that computers may be used to manipulate non-numeric data, such as text or music.
It took about a century from the initial design of the engines until the first working computing device was implemented. This was a mechanical device designed and implemented in 1938 by the German inventor Konrad Zuse; Zuse named it the Z1.
The Z1 was unreliable and slow. As a result, it did not have any useful applications, except as a proof of feasibility. In particular, special-purpose calculating devices were far better at performing applied calculations. This held, in particular, for electromechanical designs, including the Enigma cipher used for encryption and the Bombe machines used in Bletchley Park to break the Enigma; both were much more efficient and reliable than the Z1.
However, as we discussed above, there was a repeated struggle to adapt the
Bombe devices to changes in Enigma; and the Bombe failed to break the newer
Lorenz cryptosystem. This motivated the construction of the first practical
computer, called Colossus.
Colossus was designed by Tommy Flowers as part of the Bletchley Park WWII cryptanalysis effort. Colossus was the first fully-electronic computing device, i.e., it did not involve mechanical components. The Colossus was also the first computer, i.e., the first computing device that could be programmed for arbitrary tasks, rather than only perform a predefined set of tasks or computations. This was in contrast to previous devices, including the Enigma and the Lorenz cryptosystems, which were electro-mechanical and also limited to a predefined computation.
From Colossus to Modern Computers. One critical difference between the Colossus and more modern computers, as well as Babbage's design of the analytical engine, is that the Colossus did not read a program from storage. Instead, setting up a program for the Colossus involved manually setting switches and jack-panel connections.
This method of ‘programming’ the Colossus wasn’t very convenient, but it
was acceptable for the Colossus, since there were only a few such machines and
only a few, simple programs, and the simplicity of design and manufacture was
more important than making it easier to change programs. Even this crude form
of ‘programming’ was incomparably easier than changing the basic functionality
of the machine, as required in special-purpose devices - including the Enigma
devices and the Bombe devices used for cryptanalysis of the Enigma traffic.
Designs of an electromechanical computer which supports a stored program were proposed already in 1936/1937, by two independent efforts. The first was by Konrad Zuse, who mentioned such a design in a patent on floating-point calculations published in 1936 [400]; the second was by Alan Turing, who defined
and studied a formal model for stored-program computers. This Turing machine model, introduced in Turing's seminal paper ‘On Computable Numbers’ [372], is still fundamental to the theory of computing.
However, practical implementations of stored-program computers appeared
only after WWII. Stored-program computers were much easier to use, and
allowed larger and more sophisticated programs as well as the use of the
same hardware for multiple purposes (and programs). Hence, stored-program
computers quickly became the norm, to the extent that some people argue that earlier devices were not ‘real computers’.
Stored-program computers also created a vast market for programs. It now
became feasible for programs to be created in one location, and then shipped to
and installed in a remote computer. For many years, this was done by physically
shipping the programs, stored in media such as tapes, discs and others. Now
that computer networks are widely available, program distribution is often, in
fact usually, done by sending the program over the network.
Easier distribution of software meant also that the same program could be
used by many computers; indeed, today we have programs that run on billions
of computing devices. The ability of a program to run on many computers
created an incentive to develop more programs; and the availability of a growing
number of programs increased the demand for computers and their impact.
Computer networks and cyberspace. The ability to develop a program and have it applied in multiple computers caused the economic ‘network effect’ that made computers and programming much more useful. This effect increased dramatically when computer networks began to facilitate inter-computer communication.
The introduction of personal computers (1977-1982), and the subsequent introductions of the Internet, the web and the smartphone, each caused a further dramatic increase in the use and impact of computing and of computer networking.
There was also a growing interest in the potential social implications of computers and networks, and a growing number of science-fiction works focused on these aspects. One of these was the short story ‘Burning Chrome’, published by William Gibson in 1982. Actually, it seems that it was this work that introduced the term cyberspace, to refer to the interconnected environment connecting networks, computers, devices and humans.
The cyber part of the term cyberspace is taken from the term cybernetics,
introduced in [388] to describe the study of communication and control systems
in machines, humans and animals.
By now, computing is used not only in ‘traditional computers’, but as a critical component of many other devices, from tiny sensors to vehicles: cyber-physical systems and IoT (Internet of Things) devices. The term cyberspace is now mostly used for the ubiquitous use of devices with different computing capabilities, communicating via networks and interacting with humans.
Cybersecurity and Hacking. With great power comes great responsibility; and the increased importance of the Internet and cyberspace also increased the risks of abuse and cyber-attacks. The awareness of these risks significantly increased as attacks on computer systems and networks became widespread, especially attacks exploiting software vulnerabilities, and/or involving malicious software, i.e., malware. This awareness resulted in the study, research and development of threats and corresponding security mechanisms, including computer security, software security, network security and data/information security.
The awareness of security risks also resulted in important works of fiction. One of these was the 1983 story ‘Cyberpunk’, by Bruce Bethke. Bethke coined this term for individuals who are socially inept yet technologically savvy. Originally a derogatory term, cyberpunk was later adopted, with a positive interpretation, as the name of a movement with several manifestos, e.g., [234].
The reverse process happened with the term hacker, which was originally used, already from the 1960s, to describe proficient programmers who think ‘out of the box’ and find creative solutions and shortcuts (‘hacks’), and, more recently, experts in computer security. However, from the 1980s, the term hacker is often applied with a negative connotation, to a person who tries to break into computer systems and to circumvent defenses. The terms black-hat hacker (or cracker) and white-hat hacker (or just hacker) are often used to distinguish between the ‘attacking hacker’ and the ‘defending hacker’, or between hackers operating illegally and hackers following legal, and hopefully also ethical, principles.
In works of fiction, cyberpunks and hackers are often presented as socially inept yet technology-savvy, with incredible abilities to penetrate systems. These abilities are mostly presented in positive, even glamorous ways, e.g., as saving human society from oppressive, corrupt regimes, governments, agencies and rogue Artificial Intelligence systems. The focus on decentralization and personal freedom is definitely a main part of the cyberpunk manifestos [234].
Indeed, much of the success of the Internet is due to its decentralized nature, and to the use of cryptography to provide security for financial transactions and some level of privacy. Important privacy tools, such as the Tor anonymous communication system [124], are based on cryptography and are inherently decentralized, which, one may hope, defends against potentially malicious governments. Furthermore, some cryptography and privacy mechanisms, such as the PGP encryption suite [159], were developed in spite of significant resistance by governments. Cryptography is also at the core of the extensive efforts to develop blockchains (see Section 3.10) and other decentralized financial tools and currencies, such as the Bitcoin cryptocurrency.
1.7 Lab and Additional Exercises
Lab 1 (Using cryptography to validate downloads). Malware is among the
most common, well-known and harmful cybersecurity threats. In this lab, we
explore the use of basic cryptographic mechanisms, specifically, cryptographic
hashing and signatures, to validate downloads and thereby avoid installing and
using downloaded malware that is a fake version of the software that the user
wanted to download.
As for the other labs in this textbook, we provide Python scripts for generating and grading this lab (LabGen.py and LabGrade.py). If not yet posted online, professors may contact the author to receive the scripts. The lab-generation script generates random challenges for each student (or team), as well as the solutions, which are used by the grading script. We recommend making the scripts available to the students, as an example of how to use the cryptographic functions. It is easy, and permitted, to modify these scripts to use other languages/libraries, or to customize them as desired.
1. Using hash for download integrity. In this question we use a cryptographic, collision-resistant hash function (see subsection 1.2.5 and Section 3.2) to ensure the integrity of software downloads, i.e., to ensure that the download is of the intended, authentic software, and not of malware impersonating the desired software. Software is often made available via repositories, which may not be fully secure; to ensure integrity, publishers often provide the hash of the software. Namely, to protect the integrity of some software download, say encoded as a string m, the publisher provides over some secure channel the value of the hash Hm ≡ h(m). The user then downloads the software from the (insecure) repository, obtaining the downloaded string m′. To confirm its integrity, i.e., confirm that m′ = m, the user uses m′ only if h(m′) = Hm, i.e., h(m′) = h(m). Based on the collision-resistance property of h, the fact that h(m′) = h(m) is believed to imply that m = m′. Note that other applications of hash functions may rely on other properties, for example, on the one-way property (Section 3.4).
Input: a folder Q1files containing several files, and a file Q1.hash, containing the SHA-256 hash of one of the files. Note that the file contains the hash in binary bytes, not encoded as text.
Goal: identify the file in Q1files whose hash is given in Q1.hash.
Submission: the name of the matching file, and a program A1.py that, given a hash file Q1.hash and a folder Q1files, outputs the name of the matching file (or a message if there is no matching file).
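The hash-comparison step above can be sketched in Python using the standard hashlib library; the function names here are illustrative, not the required lab interface, and the snippet compares raw digest bytes, as in the lab.

```python
import hashlib

def sha256_file(path):
    """Return the binary SHA-256 digest of the file at `path`."""
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(8192), b''):
            h.update(chunk)
    return h.digest()

def find_matching_file(hash_path, candidate_paths):
    """Return the first candidate whose SHA-256 digest equals the
    binary digest stored in `hash_path`, or None if none matches."""
    with open(hash_path, 'rb') as f:
        target = f.read()
    for path in candidate_paths:
        if sha256_file(path) == target:
            return path
    return None
```

Reading the file in chunks keeps memory use constant even for large downloads; comparing the binary digests directly matches the binary (non-text) format of Q1.hash.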
2. Using signatures to authenticate downloads. The hash mechanism has two disadvantages. First, an attacker who controls the value of the digest can set it to be the digest of a malware; second, the digest must be updated for every software update. In this question, we show an alternative: authenticating the software using digital signatures (subsection 1.2.3). Practical cryptographic libraries such as PyCryptodome use the Hash-then-Sign paradigm (subsection 3.2.6), i.e., they apply the signing function to the hash of the information to be signed. Hence, you will need to specify both a signature scheme (e.g., use RSA) and a hash function (e.g., use
again the SHA-256 hash function). The reason for using the Hash-then-Sign paradigm is that it is absurdly inefficient to apply the public-key signature algorithm directly over a (usually) long message, rather than over the (short) hash.
Input: The file Q2pk.pem, which contains the public validation key, which can be used to validate files purportedly signed by the legitimate software publisher, and the directory Q2files, which contains several files and the corresponding purported signatures. Note: the signatures were created by the LabGen.py script, using RSA signatures (with PKCS#1 v1.5 padding) and the SHA-256 hash function.
Goal: identify the file(s) in Q2files which are properly signed, as validated using the public key given in Q2pk.pem.
Submission: the name(s) of the properly signed file(s), and a program A2.py that, given the public validation key file Q2pk.pem and a folder Q2files, outputs the name(s) of the matching file(s) (or a message if there is no matching file).
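To illustrate the Hash-then-Sign paradigm itself, here is a toy, dependency-free sketch using textbook RSA with tiny, insecure parameters; all numeric values are illustrative only, and this is not a substitute for a real library. For the lab itself, use PyCryptodome (with proper PKCS#1 v1.5 padding), as the lab specifies.

```python
import hashlib

# Toy textbook-RSA parameters; INSECURE, illustration only.
p_, q_ = 61, 53
n = p_ * q_                            # modulus n = 3233
e = 17                                 # public (validation) exponent
d = pow(e, -1, (p_ - 1) * (q_ - 1))    # private exponent (Python 3.8+)

def sign(message: bytes) -> int:
    """Hash-then-Sign: apply RSA to the (truncated) SHA-256 digest."""
    h = int.from_bytes(hashlib.sha256(message).digest(), 'big') % n
    return pow(h, d, n)

def verify(message: bytes, sigma: int) -> bool:
    """Validate: recompute the digest and compare with sigma^e mod n."""
    h = int.from_bytes(hashlib.sha256(message).digest(), 'big') % n
    return pow(sigma, e, n) == h
```

Note that only the short digest is processed by the expensive modular exponentiation, regardless of the message length; this is exactly the efficiency argument for Hash-then-Sign discussed above.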
3. To get a feeling for the performance of public key signatures and of cryptographic hash functions, perform experiments and present graphs showing:
a) The time required for hashing, as a function of the input length, from 1000 bits to one million bytes.
b) The time required for generating signing and validation public key pairs, for keys of lengths from 1000 bits to the maximal length you find feasible (say, up to five minutes).
c) The time required to sign inputs of length from 1000 bits to one million bits, using each of the private signing keys you generated in the previous item.
d) The time required to validate signatures for inputs of length from 1000 bits to one million bits, using each of the key pairs you generated in item (b).
e) Explain how the results received in the previous items make sense, and the implications for the time required for hashing, signing and validating, and the relation to the Hash-then-Sign paradigm (subsection 3.2.6).
Note: you may need to repeat some operations many times to be able to measure the times with reasonable precision.
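A minimal timing harness for the hashing measurements in item (a) can use Python's time module; the input sizes and repetition count below are illustrative choices.

```python
import hashlib
import time

def time_hash(n_bytes, repeats=20):
    """Average wall-clock time (seconds) to SHA-256 an input of n_bytes bytes."""
    data = b'\x00' * n_bytes
    start = time.perf_counter()
    for _ in range(repeats):
        hashlib.sha256(data).digest()
    return (time.perf_counter() - start) / repeats

# 125 bytes = 1000 bits, up to one million bytes, as in item (a).
for size in [125, 1000, 10_000, 100_000, 1_000_000]:
    print(size, time_hash(size))
```

Averaging over repeats, as the note above suggests, reduces measurement noise; the same pattern applies to timing key generation, signing and validation.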
Exercise 1.5. Let S be a signature scheme, and let A_Key and A_σ be the following two simple adversaries:
A_Key(v) randomly guesses the signing key s. After guessing s, the adversary signs the desired message using s, and ‘wins’ if the signature validates correctly using verification key v.
A_σ(v) randomly guesses the signature σ for a message m chosen by the adversary, and ‘wins’ if σ validates correctly using verification key v.
Consider a single guess by both adversaries. Let l_s denote the length of the signing key s, and l_σ the length of the signatures produced by the signing algorithm (assume all signatures have the same length l_σ).
1. Compute the exact probability that each adversary wins, after a single
guess.
2. What are the relationships between the probabilities computed in the previous item, and the probabilities of winning, after a single guess, under each of the three notions of unforgeability introduced in this chapter? Explain.
3. Compute the exact probability that each adversary wins, after two guesses.
4. Consider adversaries A_EKey and A_Eσ which operate similarly to A_Key and A_σ, except that instead of guessing only one or two times, they test every possible value (of s or of σ) until winning. What are the maximal and average numbers of guesses required by each of A_EKey and A_Eσ?
5. What is the advantage of algorithms A_EKey and A_Eσ over S? (Definition 1.4)
6. Do the replies to the previous item imply that S is not existentially
unforgeable?
Exercise 1.6 (There is always some probability of forgery). Show that there is no signature scheme S such that every efficient adversary A has zero advantage against S, i.e., such that ε^{EUF-Sign}_{S,A}(1^l) = 0. Preferably, show that this result holds for any value of 1^l.
Exercise 1.7. Based on the definitions in this chapter, define one-time signature schemes.
Exercise 1.8. Prove that if m is co-prime with n, then for every integer l > 0 holds:

m^l mod n = (m mod n)^{l mod φ(n)} mod n
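A quick numeric check of this identity (not a proof) can be done with Python's built-in pow; the primes and exponent chosen here are illustrative, and φ(n) is computed directly for n a product of two distinct primes.

```python
from math import gcd

# Illustrative parameters: n = p*q for distinct primes p, q,
# so phi(n) = (p-1)*(q-1).
p_, q_ = 101, 113
n = p_ * q_
phi = (p_ - 1) * (q_ - 1)

m, l = 7, 12345
assert gcd(m, n) == 1                 # the co-primality hypothesis
lhs = pow(m, l, n)                    # m^l mod n
rhs = pow(m % n, l % phi, n)          # (m mod n)^(l mod phi(n)) mod n
assert lhs == rhs
```

The identity follows from Euler's theorem, m^φ(n) ≡ 1 (mod n) for m co-prime with n, so exponents may be reduced modulo φ(n).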
Chapter 2
Confidentiality: Encryption Schemes and Pseudo-Randomness
Encryption deals with protecting the confidentiality of sensitive information, which we refer to as the plaintext message m, by encoding (encrypting) it into a ciphertext c, as illustrated in Figure 1.3. The ciphertext c should hide the contents of m from the adversary, yet allow recovery of the original information by the legitimate parties, using a decoding process called decryption. Encryption is one of the oldest applied sciences; some basic encryption techniques were already used thousands of years ago.
The most important categorization of encryption schemes is between shared
key cryptosystems, also called symmetric cryptosystems (Figure 1.4), and public
key cryptosystems, also called asymmetric cryptosystems (Figure 1.5). In both
cases, we use the terms ‘encryption scheme’ and ‘cryptosystem’ interchangeably.
In this chapter, we mostly focus on shared-key cryptosystems; we discuss
public-key cryptography in Chapter 6. Let us begin by defining (stateless)
shared-key cryptosystems and their correctness requirement.
Definition 2.1 (Stateless shared-key cryptosystem and their correctness). A
shared-key cryptosystem is a pair of keyed algorithms, ⟨E, D⟩, and sets K, M
and C, called the key space, plaintext space and ciphertext space, respectively.
A shared-key cryptosystem is correct if for every input key k ∈ K and
plaintext m ∈ M , the encryption of m using k returns ciphertext c ∈ C that
decrypts, using key k, back to m. Namely,
(∀k ∈ K, m ∈ M ) Dk (Ek (m)) = m
(2.1)
Definition 2.1 does not allow the encryption and decryption algorithms to
maintain state, i.e., it defines a stateless shared-key cryptosystem. However, many
practical cryptosystems use state, as illustrated in Figure 2.1; for example, the
state may be used as a counter. Let us, therefore, extend Definition 2.1 to
define a stateful shared-key cryptosystem and its correctness.
Figure 2.1: Stateful shared key (symmetric) cryptosystem.
Definition 2.2 (Stateful shared-key cryptosystem and their correctness). A
stateful shared-key cryptosystem is a pair of keyed algorithms, ⟨E, D⟩, and sets
K, M , C and S, called the key space, plaintext space, ciphertext space and
state space, respectively.
A stateful shared-key cryptosystem is correct if for every input key k ∈ K,
plaintext m ∈ M and state s ∈ S, the encryption of m using k with state s
returns ciphertext c ∈ C, that decrypts, using key k and state s, back to m.
Namely,
(∀k ∈ K, m ∈ M, s ∈ S) Dk (Ek (m, s), s) = m
(2.2)
The state is often clear from the context, and then we may omit it, i.e.,
write simply Ek (m) and Dk (c), as for a stateless cryptosystem.
Shared-key cryptosystems are also sometimes referred to as ciphers, but we
use this term for two specific types of shared-key cryptosystems: block ciphers,
which encrypt and decrypt fixed-length blocks of bits, and stream ciphers, which
are stateful cryptosystems that, typically, encrypt and decrypt bit by bit, i.e.,
M = C = {0, 1}.
In this chapter we will see a variety of shared-key cryptosystems. Some of
these are deterministic, some randomized; some stateless, some stateful; and
with different plaintext and key spaces.
Note that definitions 2.1 and 2.2 define only the correctness requirements;
we did not yet define security requirements. We will define security later in this
chapter, but it will be more complex than one may initially expect. Intuitively,
the goal is clear: confidentiality, in a strong sense, against powerful adversaries.
However, there are subtle issues, as well as multiple variants which differ in
their exact requirements and assumptions about the adversary capabilities.
2.1 Historical Ciphers
Cryptology is one of the oldest sciences. To ‘warm up’ for our discussion
of encryption schemes, let us first discuss a few historical ciphers, which were
in use from ancient times till the nineteenth century. These simple, historical
ciphers help us introduce some of the basic ideas and challenges of cryptography
and cryptanalysis, and provide some historical perspective, beyond the little we
presented in Section 1.6. For more information, see [223, 362].
One has to keep in mind that the design of these ciphers was mostly kept
secret, in the belief that an attacker cannot break a cipher if it does not know
its design; a (usually false) belief we refer to as security by obscurity. In fact,
some of the ancient ciphers relied only on the secrecy of their design, and did
not even use a secret key; we discuss such keyless ciphers in subsection 2.1.1.
Even when using a published design, users typically kept their choice secret,
and often made minor changes. Indeed, it is harder to cryptanalyze a scheme
which is not even known. Still, it is ill-advised to rely on ‘security by obscurity’.
We explain this in subsection 1.2.2, where we present Kerckhoffs’s principle,
which essentially says that the security of a cipher should not depend on the secrecy
of its design.
Many historical ciphers, and in particular most ancient ciphers, were monoalphabetic substitution ciphers. Monoalphabetic substitution ciphers use a fixed
mapping from each plaintext character to a corresponding ciphertext character
(or some other symbol). Namely, these ciphers are stateless and deterministic,
and defined by a permutation from the plaintext alphabet to a set of ciphertext
characters or symbols. We further discuss general monoalphabetic substitution
ciphers in subsection 2.1.3. We also discuss, in subsection 2.1.5, variants
of the Vigenère cipher, which are polyalphabetic substitution ciphers.
2.1.1 Ancient Keyless Ciphers
In this section, we discuss a few ancient ciphers. These ciphers are all simple,
keyless (no secret key) and monoalphabetic. A cipher is monoalphabetic if it
is defined by a single, fixed mapping from each plaintext letter to a ciphertext
letter or symbol.
The At-Bash cipher The At-Bash cipher may be the earliest cipher whose
use is documented; specifically, it is believed to have been used, three times, in the
Old Testament book of Jeremiah. The cipher maps each of the letters in the
Hebrew alphabet to a different letter. Specifically, the letters are mapped in
‘reverse order’: the first letter to the last letter, the second letter to the second-to-last
letter, and so on; this mapping is reflected in the name ‘At-Bash’¹.
The At-Bash cipher is illustrated in Fig. 2.2. Even if you are not familiar
with the letters of the Hebrew alphabet, the mapping may still be identified
by the visual appearance. If you still find it hard to match, that’s OK; we next
describe an adaptation of the At-Bash cipher to the Latin alphabet.
To properly define ciphers, as well as more complex cryptographic schemes,
we use pseudocode or a formula. For monoalphabetic ciphers, a formula usually
¹ The name ‘At-Bash’ reflects the ‘reverse mapping’ of the Hebrew alphabet. The ‘At’
refers to the mapping of the first letter (‘Aleph’, א) to the last letter (‘Taf’, ת), and of the
second letter (‘Beth’, ב) to the second-to-last letter (‘Shin’, ש).
[Figure: the Hebrew alphabet (א, ב, ג, . . . , ר, ש, ת) mapped in reverse order to (ת, ש, ר, . . . , ג, ב, א).]
Figure 2.2: The At-Bash Cipher.
suffices. We define the cipher as a function of the input letter, where each letter
is represented by its distance from the beginning of the alphabet.
In Hebrew, there are 22 letters, so we encode them by the numbers from 0,
representing the Hebrew letter Alef (א), to 21, representing the Hebrew letter
Taf (ת).
Let p be a plaintext message consisting of l Hebrew letters, where p[i], for i =
0, . . . , (l − 1),² is the encoding of the corresponding letter (p[i] ∈ {0, 1, . . . , 21}).
We use c to denote the corresponding l-letter ciphertext, i.e., the encryption
of p using the At-Bash cipher: c = E_At-Bash(p). We compute c using the
following formula:
(∀i = 0, . . . , (l − 1)) c[i] = 21 − p[i]    (2.3)
It is convenient to denote the alphabet size by n, i.e., in Hebrew, n = 22. With
this convention we can rewrite the formula in Equation 2.3 as c[i] = (n−1)−p[i].
The Az-By cipher The Az-By cipher is the same as the At-Bash cipher,
except using the Latin alphabet, which has n = 26 letters (from A to Z).
We illustrate the Az-By cipher in the top part of Fig. 2.3; below it, we
present two other keyless ancient ciphers, which we discuss next - the Caesar
and ROT13 ciphers.
Let p be a plaintext message consisting of the encoding of l Latin letters,
where p[i] ∈ {0, 1, . . . , 25}, i.e., p[i] ∈ {0, 1, . . . , (n − 1)}. The corresponding
Az-By ciphertext, c = E_Az-By(p), is given by:
(∀i = 0, . . . , (l − 1)) c[i] = 25 − p[i] = (n − 1) − p[i]    (2.4)
Obviously, this formula is the same as that of the At-Bash cipher (Equation 2.3),
except for adjusting for the fact that the Latin alphabet has n = 26 letters, while
the Hebrew alphabet has only 22.
² For technical reasons, it is a bit more convenient to use 0, rather than 1, as the index of
the first plaintext letter (p[0]).
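Equation 2.4 can be sketched in a few lines of Python; `azby_encrypt` is our own illustrative helper name. Note that, since (n − 1) − ((n − 1) − p[i]) = p[i], the same function also decrypts:

```python
def azby_encrypt(plaintext: str) -> str:
    """Az-By cipher (Equation 2.4): c[i] = (n - 1) - p[i], with n = 26."""
    n = 26
    out = []
    for ch in plaintext.upper():
        p = ord(ch) - ord('A')               # encode letter as 0..25
        out.append(chr((n - 1 - p) + ord('A')))
    return ''.join(out)

print(azby_encrypt('AXE'))                   # -> 'ZCV'
print(azby_encrypt(azby_encrypt('AXE')))     # -> 'AXE' (self-inverse)
```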
Applied Introduction to Cryptography and Cybersecurity
2.1. HISTORICAL CIPHERS
AzBy
Caesar
ROT13
55
A
B
C
D
E
F
G
H
I
J
K
L
M
Z
Y
X
W
V
U
T
S
R
Q
P
O
N
N
O
P
Q
R
S
T
U
V
W
X
Y
Z
M
L
K
J
I
H
G
F
E
D
C
B
A
A
B
C
D
E
F
G
H
I
J
K
L
M
D
E
F
G
H
I
J
K
L
M
N
O
P
N
O
P
Q
R
S
T
U
V
W
X
Y
Z
Q
R
S
T
U
V
W
X
Y
Z
A
B
C
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Y
Z
N
O
P
Q
R
S
T
U
V
W
X
Y
Z
A
B
C
D
E
F
G
H
I
J
K
L
M
Figure 2.3: The AzBy, Caesar and ROT13 Ciphers.
The Caesar cipher. We next present the well-known Caesar cipher. The
Caesar cipher was used, as the name implies, by Julius Caesar. It is also a
monoalphabetic cipher, and we describe a variant³ which operates on the set of
the n = 26 Latin letters, from A to Z.
In the Caesar cipher, each plaintext letter is replaced by the letter appearing
in the alphabet three places after it. To map the last three letters of the
alphabet (X, Y and Z), we wrap around to the first three letters (A, B and C),
i.e., X is mapped to A, Y to B and Z to C. See the middle row of Figure 2.3.
As with the Az-By cipher, we represent each letter by its distance from the
beginning of the Latin alphabet; i.e., we represent the letter ‘A’ by the number
0, and so on; ‘Z’ is represented by 25.
Let p be a plaintext message consisting of the encoding of l Latin letters,
where p[i] ∈ {0, 1, . . . , 25}, and let c denote the corresponding l-lettered Caesar
ciphertext, c = E_Caesar(p). Then c is given by:
(∀i = 0, . . . , (l − 1)) c[i] = p[i] + 3 mod 26    (2.5)
For example, consider encryption of plaintext word ‘axe’, whose encoding is
³ In Caesar’s time, the alphabet contained only 23 characters; for simplicity, we describe a
variant of it which is applied to the current Latin alphabet of 26 characters.
p[0] = 0, p[1] = 23 and p[2] = 4. This gives c[0] = 3, c[1] = 0, c[2] = 7, i.e., the
ciphertext string ‘dah’. Simple!
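The ‘axe’ → ‘dah’ computation can be reproduced with a short Python sketch of Equation 2.5 (the helper name `caesar_encrypt` is ours):

```python
def caesar_encrypt(plaintext: str, shift: int = 3) -> str:
    """Caesar cipher (Equation 2.5): c[i] = p[i] + 3 mod 26."""
    out = []
    for ch in plaintext.upper():
        p = ord(ch) - ord('A')               # 'A' -> 0, ..., 'Z' -> 25
        out.append(chr((p + shift) % 26 + ord('A')))
    return ''.join(out)

print(caesar_encrypt('AXE'))                 # -> 'DAH'
```

Decryption, as Exercise 2.1 asks, is left to the reader.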
Exercise 2.1. Let p and c be a plaintext string and the corresponding Caesar
ciphertext, as above. Show the process for decrypting a given ciphertext c,
i.e., computing the corresponding plaintext p. Use an equation similar to
Equation 2.5.
The ROT13 cipher ROT13 is a popular variant of the Caesar cipher, with
the minor difference that ROT13 ‘rotates’ the letters by 13 positions, while
Caesar rotates by 3 positions.
Let p be a plaintext message consisting of the encoding of l Latin letters,
where p[i] ∈ {0, 1, . . . , 25}, and let c denote the corresponding l-lettered ROT13
ciphertext, c = E_ROT13(p). Then c is given by:
(∀i = 0, . . . , l − 1) c[i] = p[i] + 13 mod 26    (2.6)
The ROT13 cipher is illustrated by the bottom row in Figure 2.3.
We are not aware of ROT13 being used to obtain secrecy; it is normally used
only to prevent inadvertent exposure of the plaintext, such as to hide potentially
offensive jokes or to obscure the answer to a puzzle or other spoiler. Because of
its utter unsuitability for real secrecy, ‘ROT13’ is often used to refer to weak encryption
schemes (e.g., ‘about as secure as ROT13’).
A convenient feature of ROT13 is that it is a self-inverse function, i.e.,
decryption is exactly the same process as encryption.
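The self-inverse property is easy to check numerically; Python even ships a ROT13 codec in its standard library. The `rot13` helper below is our own sketch of Equation 2.6 (a demonstration, not the proof Exercise 2.2 asks for):

```python
import codecs

def rot13(text: str) -> str:
    # Equation 2.6: c[i] = p[i] + 13 mod 26, applied to uppercase letters
    return ''.join(
        chr((ord(ch) - ord('A') + 13) % 26 + ord('A')) if 'A' <= ch <= 'Z' else ch
        for ch in text
    )

msg = 'HELLO'
enc = rot13(msg)
print(enc)                                   # -> 'URYYB'
print(rot13(enc))                            # -> 'HELLO' (encrypting twice recovers p)
assert enc == codecs.encode(msg, 'rot_13')   # Python's built-in ROT13 agrees
```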
Exercise 2.2. Show that ROT13 is a self-inverse function. Namely, show
that for every plaintext message p holds: p = E_ROT13(E_ROT13(p)). Can you
identify additional ancient ciphers which are self-inverse functions?
The Masonic cipher. A final example of a historic, keyless, monoalphabetic
cipher is the Masonic cipher. The Masonic cipher is from the 18th century and
is illustrated in Fig. 2.4. This cipher uses a ‘key’ to map from plaintext to
ciphertext and back, but the key is only meant to assist in the mapping, since
it has a regular structure and is considered part of the cipher.
Figure 2.4: The Masonic Cipher, written graphically and as a mapping from
the Latin alphabet to graphic shapes.
2.1.2 Keyed-Caesar cipher
Keyless ciphers have limited utility; in particular, the design of the cipher
becomes a critical secret, whose exposure completely breaks security. Therefore,
every modern cipher, and even most historical ciphers, use secret keys.
Readers who are interested in these (keyed) historical ciphers should consult
manuscripts on the history of cryptology, e.g., [223, 362]. We only discuss, briefly,
a few simple and well-known keyed historical ciphers in subsection 2.1.5.
In this subsection, we present a very simple, and trivially vulnerable, keyed
cipher: the Keyed-Caesar cipher, also referred to as the shift cipher. The
Keyed-Caesar cipher is a simple keyed variant of the Caesar cipher. In fact,
some people do not even distinguish between the Keyed-Caesar cipher and the
Caesar cipher. The Keyed-Caesar cipher helps us explain the Vigenère cipher
(in subsection 2.1.5), and illustrates the fact that using a key - even a long key - is not sufficient for security.
Recall that the Caesar cipher is defined by c[i] = p[i] + 3 mod n, where
p[i] ∈ {0, 1, . . . , n − 1} is a single letter (with n = 26 for Latin). The Keyed-Caesar cipher is defined with an additional parameter: a key k ∈ {0, 1, . . . , n−1}.
Given an input plaintext string p consisting of l ‘letters’, p[0], . . . , p[l − 1],
the ciphertext string c = E_k^{KC-n}(p) consists of the l ‘letters’ c[0], . . . , c[l − 1]
computed as:
(∀i = 0, . . . , l − 1) c[i] = p[i] + k mod n    (2.7)
When using n = 26, the Keyed-Caesar cipher encrypts a single Latin
character at a time, just like the Caesar cipher; it simply uses an arbitrary
rotation k, rather than the fixed rotation used in the original Caesar (k = 3)
and ROT13 (k = 13) ciphers.
Obviously, with only n = 26 keys, the Keyed-Caesar cipher is also insecure;
the attacker only needs to try the 26 possible key values. This attack, where
the attacker tries all possible keys, is called exhaustive search, and can be used
against any cipher.
To foil exhaustive search, we can use longer keys and blocks of plaintext. For
example, if we extend our character set to the 16-bit UCS-2 character
set, then n = 2^16, making exhaustive search much harder. By using even longer
keys and blocks of, say, five UCS-2 characters (80 bits), we have n = 2^80 keys,
making exhaustive search impractical.
However, even with very large n, the Keyed-Caesar cipher is still insecure.
In particular, as the following exercise shows, even with huge n, the Keyed-Caesar
cipher can still be easily broken by an attacker who has access to a single pair
of plaintext p and the corresponding ciphertext c = E_k^{KC-n}(p). Furthermore,
any message would do - even a single-letter message (l = 1). This is a very weak
form of a known plaintext attack (KPA); we discuss KPA and other attack
models for cryptosystems in Section 2.2.
Exercise 2.3 (Known plaintext attack (KPA) on the Keyed-Caesar cipher).
Let p be an arbitrary l-lettered plaintext for the Keyed-Caesar cipher with any
alphabet size n, i.e., ∀i = 0, . . . , l − 1 : p[i] ∈ {0, . . . , n − 1}, and let c = E_k^{KC-n}(p)
be the corresponding ciphertext using key k ∈ {0, . . . , n − 1}. Given
one letter from p, say p[0], and the corresponding letter from c, i.e., c[0], show
how the attacker can find the key k, allowing decryption of any ciphertext.
Solution: From Equation 2.7, k = c[i] − p[i] mod n for any i, e.g., for i = 0
(the first plaintext letter and the corresponding first ciphertext letter).
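This key recovery can be sketched in Python; even with a huge alphabet (here n = 2^16), a single known letter pair reveals k. The helper names are our own:

```python
def kc_encrypt(p: list, k: int, n: int) -> list:
    """Keyed-Caesar (Equation 2.7): c[i] = p[i] + k mod n."""
    return [(x + k) % n for x in p]

def recover_key(p_letter: int, c_letter: int, n: int) -> int:
    """KPA of Exercise 2.3: k = c[i] - p[i] mod n, from any single pair."""
    return (c_letter - p_letter) % n

n, k = 2**16, 12345            # large alphabet, 'secret' key
p = [7, 1000, 42]
c = kc_encrypt(p, k, n)
assert recover_key(p[0], c[0], n) == k    # key exposed from one letter
print('recovered key:', recover_key(p[0], c[0], n))
```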
2.1.3 The General Monoalphabetic Substitution (GMS) Cipher
Monoalphabetic substitution ciphers are deterministic, stateless mappings from
plaintext characters to ciphertext characters or symbols. The use of any other
set of symbols instead of letters does not substantially change the security of
such ciphers, hence we focus on permutations of a fixed alphabet.
The Mason, At-Bash, Caesar and Keyed-Caesar ciphers are all monoalphabetic substitution ciphers. The Mason, At-Bash and Caesar ciphers are keyless,
i.e., are defined by a specific permutation. For example, the Caesar cipher is
the rotate-by-three permutation. The Keyed-Caesar cipher uses the rotate-by-k
permutation, where k is the key, which is a letter in the alphabet.
The General Monoalphabetic Substitution (GMS) cipher is the simple keyed
cipher obtained by applying an arbitrary permutation (mapping) from the
plaintext characters to the ciphertext characters. The key is the permutation.
Namely, given a plaintext p = p[0], . . . , p[l − 1], the ciphertext c = E_k^{GMS}(p)
consists of the l ‘letters’ c[0], . . . , c[l − 1] computed as:
(∀i = 0, . . . , l − 1) c[i] = k(p[i])    (2.8)
The key, which is a permutation over the alphabet, is often written as a table
with two rows: the first containing the plaintext letters and the other containing
the corresponding ciphertext letters. Some readers may recall having used,
sometime, maybe long ago, such ‘key tables’ to create a simple monoalphabetic
cipher; many kids do.
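A GMS key table is just a permutation. The Python sketch below (helper names ours) picks a random permutation as the key, applies Equation 2.8, and decrypts with the inverse permutation:

```python
import random

def gms_keygen(n: int = 26, seed=None) -> list:
    """A GMS key: a uniformly random permutation of {0, ..., n-1}."""
    rng = random.Random(seed)
    key = list(range(n))
    rng.shuffle(key)
    return key

def gms_encrypt(p: list, key: list) -> list:
    """Equation 2.8: c[i] = k(p[i])."""
    return [key[x] for x in p]

def gms_decrypt(c: list, key: list) -> list:
    inverse = [0] * len(key)
    for plain, cipher in enumerate(key):
        inverse[cipher] = plain              # invert the permutation
    return [inverse[y] for y in c]

key = gms_keygen(seed=1)
p = [0, 23, 4]                               # 'AXE'
assert gms_decrypt(gms_encrypt(p, key), key) == p
```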
In the Latin alphabet, there are n = 26 letters; each could be chosen for A,
then any of the remaining 25 could be used for B, and so on. Namely, the total
number of permutations (i.e., keys) is 26!, the factorial⁴ of 26. The factorial
grows very fast as a function of n; for example, 26! > 2^88, i.e., there are over 2^88
permutations (keys) for the 26-letter General Monoalphabetic Substitution (GMS)
cipher.
Namely, the Latin alphabet, with n = 26 characters, suffices to make
exhaustive search infeasible for the General Monoalphabetic Substitution (GMS)
cipher. This is a significant improvement compared to the Keyed-Caesar cipher, where,
to obtain the same number of keys, we would need a huge alphabet of n = 26! > 2^88
letters.
However, when using a small alphabet such as that of Latin, the General Monoalphabetic Substitution (GMS) cipher is vulnerable to the frequency analysis attack,
a simple attack that we describe next.
⁴ 26! ≡ 26 · 25 · . . . · 2 · 1.
[Figure: bar charts of letter and bigram frequencies (in percent). Letters, in decreasing frequency: E, T, A, O, I, N, S, R, H, L, D, C, U, M, F, P, G, W, Y, B, V, K, X, J, Q, Z. Most common bigrams, in decreasing frequency: TH, HE, IN, ER, AN, RE, ON, AT, EN, ND.]
Figure 2.5: Frequencies of letters and (most common) bigrams in English, based
on [302]
2.1.4 Frequency analysis attacks on monoalphabetic ciphers
Plaintext messages are rarely completely random strings; some messages, or
parts of messages, are more common than others. A frequency analysis attack
exploits knowledge about the plaintext distribution to facilitate cryptanalysis.
Frequency analysis attacks often succeeded against historical ciphers, using only
this knowledge about the plaintext distribution and a collection of ciphertext
messages; we refer to such attacks as ciphertext only (CTO) attacks. The
frequency analysis attack is effective against any monoalphabetic cipher, including the General Monoalphabetic Substitution (GMS) cipher. The only exceptions,
where frequency analysis may fail, are when using extremely large alphabets,
and/or when the plaintext has a uniform (‘random’) distribution.
Classical monoalphabetic ciphers map each letter in the alphabet of a specific
natural (human) language, e.g., the Latin alphabet, to a fixed letter (or symbol).
Namely, the alphabet is not very large. Furthermore, in the typical case,
the plaintext is a natural language message, e.g., in English; therefore, the
plaintext is not uniformly distributed. In fact, some letters are significantly
more common than others; similarly, some pairs of letters (bigrams) and some
strings of several letters (n-grams) are significantly more common than others.
See Figure 2.5.
Knowledge about the language, and the distributions of letters, bigrams and
n-grams, is usually available to the attacker; this makes frequency analysis, and
other ciphertext only (CTO) attacks, easier to launch than known plaintext
attacks (KPA).
Furthermore, frequency attacks work exceedingly well against classical
monoalphabetic ciphers, given the encryption of text in a known language. Indeed,
some deductions, and often complete decryption, may be done manually; for
example, in English texts, the most common letter is almost always E
(12.49%). Identification of T and H is also quite easy; T is the second-most-common letter (9.28%), and TH and HE are the most common bigrams (3.56% and
3.07%, respectively). Similarly, once we have identified E, it is easy to identify
R too, since ER and RE are among the most common bigrams (2.02% and 1.85%).
These easy deductions suffice to decrypt about a third of the letters in
the ciphertext, and the reader can surely find a few other easy deductions. Of
course, there is no reason to work manually; it is easy to write a program to efficiently
cryptanalyze any monoalphabetic cipher, including the General Monoalphabetic
Substitution (GMS) cipher.
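The first, crude step of such a program can be sketched as follows: match ciphertext letters to English letters by frequency rank (as in Figure 2.5), leaving refinement via bigram statistics to the reader. The helper name, and this exact frequency order, are illustrative:

```python
from collections import Counter

# English letters in (roughly) decreasing frequency order, per Figure 2.5
ENGLISH_FREQ_ORDER = 'ETAOINSRHLDCUMFPGWYBVKXJQZ'

def frequency_guess(ciphertext: str) -> dict:
    """Guess a decryption table: map the i-th most common ciphertext
    letter to the i-th most common English letter."""
    counts = Counter(ch for ch in ciphertext.upper() if ch.isalpha())
    by_freq = [letter for letter, _ in counts.most_common()]
    return dict(zip(by_freq, ENGLISH_FREQ_ORDER))

# the dominant ciphertext letter is guessed to decrypt to E:
print(frequency_guess('XXXXXXYYZ')['X'])     # -> 'E'
```

As the text notes, such naive rank-matching only becomes reliable given a sufficiently long ciphertext.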
Exercise 2.4. Write two programs: one that implements the General Monoalphabetic Substitution (GMS) cipher, and another that cryptanalyzes the resulting
ciphertexts (without being given the key). Your cryptanalysis program can
assume that the encrypted plaintext is typical text in English.
By experimenting with the cryptanalysis program of Exercise 2.4, you will
find that it may fail if given a short ciphertext - but is very reliable given
a sufficiently long ciphertext. This phenomenon exists for other attacks too;
cryptanalysis often requires a significant amount of ciphertext encrypted using
the same encryption key. This motivates refreshing (changing) the key, thereby
limiting the use of each key to a limited amount of plaintext (and ciphertext).
Frequent key refresh makes cryptanalysis harder or, ideally, infeasible.
Principle 4 (Limit usage of each key). Systems deploying ciphers/cryptosystems
should limit the amount of usage of each key, changing keys as necessary, to
foil cryptanalysis attacks.
An extreme example of this is the one time pad (OTP) cipher, which we
discuss later (Section 2.4). The one-time pad is essentially a one-bit substitution
cipher - but with a different random mapping for each bit. This turns the
insecure substitution cipher into a provably secure cipher!
Another way to defend against letter frequency attacks is to use a much
larger alphabet, or to map sequences of letters rather than individual letters.
For example, by simply mapping pairs of plaintext letters rather than single
letters, we basically prevent the use of the letter-frequency table; of course,
the attacker can still take advantage of the bigram distribution. However, this
requires the use of an accordingly-larger table to map between plaintext and
ciphertext. Such larger tables are difficult to store and share, and result in high
overhead.
In Section 2.6 we present block ciphers, which are, basically, efficient monoalphabetic substitution ciphers that use a large ‘alphabet’, e.g., 64 bits for
DES or 128 bits for AES. Block ciphers use much shorter keys compared to
the General Monoalphabetic Substitution (GMS) cipher. Block ciphers use
sufficiently-long keys to foil exhaustive search, typically from 56 bits (DES)
to 256 bits (AES). Unlike the Keyed-Caesar cipher, however, block ciphers are
designed to ensure security against frequency analysis as well as KPA and other
cryptanalysis attacks.
2.1.5 The Polyalphabetic Vigenère ciphers
We conclude our discussion of historical ciphers by discussing polyalphabetic
ciphers, and in particular, two variants of the Vigenère cipher. We first describe the simpler (and weaker) (repeating-key) Vigenère cipher, published in
1553 by Giovan Battista Bellaso⁵. We then describe the stronger Autokey,
published in 1586 by Blaise de Vigenère. We refer to these two ciphers as the
Vigenère ciphers.
Both Vigenère ciphers are polyalphabetic ciphers, namely, they use multiple
mappings from plaintext characters to ciphertext characters. The motivation
for using multiple mappings is to defeat frequency analysis attacks, and different
variants of the Vigenère cipher were used until the twentieth century; in fact, even
the Enigma (Section 1.6) is a polyalphabetic cipher, essentially similar to
a cascade of a few Vigenère ciphers.
The (repeating-key) Vigenère cipher. The (repeating-key) Vigenère cipher
extends the Keyed-Caesar cipher by using a string of a few characters as the key,
rather than a single character as in the Keyed-Caesar cipher. The (repeating-key) Vigenère cipher simply applies the letters of the key, one by one, repeating
the key after using its last character.
Namely, given an input plaintext string p consisting of l characters, p[0], . . . , p[l−1], and a key k consisting of λ characters, k[0], . . . , k[λ − 1], the ciphertext string
c = E_k^{Vigenère}(p) consists of the l characters c[0], . . . , c[l − 1] computed as:
(∀i = 0, . . . , l − 1) c[i] = p[i] + k[i mod λ] mod n    (2.9)
Example 2.1. Consider encryption of the string p = ‘ABCDEFG’, whose
encoding is p[i] = i (for i = 0, . . . , 6), using the key k = ‘BED’, whose encoding is
k[0] = 1, k[1] = 4 and k[2] = 3. The resulting ciphertext c is the string
‘BFFEIIH’.
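Example 2.1 can be reproduced with a direct Python sketch of Equation 2.9 (the helper names are ours):

```python
def vigenere_encrypt(p: list, k: list, n: int = 26) -> list:
    """Repeating-key Vigenère (Equation 2.9): c[i] = p[i] + k[i mod λ] mod n."""
    lam = len(k)
    return [(p[i] + k[i % lam]) % n for i in range(len(p))]

def to_str(nums) -> str:
    """Decode 0..25 back to letters A..Z."""
    return ''.join(chr(x + ord('A')) for x in nums)

p = list(range(7))                 # 'ABCDEFG', p[i] = i
k = [1, 4, 3]                      # 'BED'
print(to_str(vigenere_encrypt(p, k)))        # -> 'BFFEIIH', as in Example 2.1
```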
Attacking the (repeating-key) Vigenère cipher. Since polyalphabetic
ciphers such as the Vigenère cipher use different keys (offsets) for different
characters, direct application of frequency analysis would fail. The first attack
against the (repeating-key) Vigenère cipher was published by Kasiski in 1863⁶.
This attack proceeds in two steps: finding the length λ of the key k, and then
finding the key itself (k[0], . . . , k[λ − 1]).
Once the key length λ is found, we can apply frequency analysis separately
to each of the λ sequences s_i (i = 0, . . . , λ − 1) of ciphertext characters generated
⁵ The (repeating-key) Vigenère cipher is usually referred to simply as the Vigenère cipher.
We add the qualifier (repeating-key) to avoid confusion with the Autokey, which is the stronger
cipher published, later, by Vigenère. The Vigenère cipher was published first by Bellaso, so
its name is anyway misleading.
⁶ This attack was known earlier but not published; in particular, it was found in personal
notes written by Babbage in 1846.
by the different key letters:
s_0 ≡ (c[0], c[λ], . . .)
s_1 ≡ (c[1], c[λ + 1], . . .)
. . .
s_{λ−1} ≡ (c[λ − 1], c[2λ − 1], . . .)
To find λ, Kasiski looked for repeating sequences of characters (n-grams) in
the ciphertext. For example, assume that for some m, m′ holds c[m : m + 3] =
c[m′ : m′ + 3]. By substituting c[m : m + 3] and c[m′ : m′ + 3] from Equation 2.9,
we have:
(∀i = 0, . . . , 3) p[m + i] + k[(m + i) mod λ] mod n = p[m′ + i] + k[(m′ + i) mod λ] mod n    (2.10)
Very often, the plaintext contains some repeating words or strings, ranging
from relatively-common words or n-grams, to terms which are specific to that
plaintext. This would be the most common reason for repeating strings in the
ciphertext. Namely, when we find such m, m′, it is likely that p[m : m + 3] =
p[m′ : m′ + 3]. Hence, from Equation 2.10, holds:
(∀i = 0, . . . , 3) k[(m + i) mod λ] mod n = k[(m′ + i) mod λ] mod n    (2.11)
Usually, when Equation 2.11 holds, then m ≡ m′ (mod λ), i.e., m − m′ is
either λ or a multiple of λ. The attack usually proceeds by finding a few such
repeating sequences, e.g., m1 ≡ m′1 (mod λ) and m2 ≡ m′2 (mod λ); then λ is
the common divisor of m1 − m′1 and of m2 − m′2, or one of the (few) common
divisors. We then apply frequency analysis to find each of the λ characters of
the key, k[0], . . . , k[λ − 1].
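Kasiski's first step can be sketched as follows: collect the distances between repeated trigrams and take their greatest common divisor as the guess for λ (or a small multiple of it). The helper name is ours, and a real attack would also filter out coincidental repeats:

```python
from collections import defaultdict
from math import gcd

def kasiski_guess(c: str, gram: int = 3) -> int:
    """Distances between repeated n-grams tend to be multiples of the key
    length; their gcd is a guess for λ (or a multiple of it)."""
    positions = defaultdict(list)
    for i in range(len(c) - gram + 1):
        positions[c[i:i + gram]].append(i)
    guess = 0
    for pos in positions.values():
        for first, later in zip(pos, pos[1:]):
            guess = gcd(guess, later - first)
    return guess

# a ciphertext with period 8 yields repeats at multiples of 8:
print(kasiski_guess('ABCDEFGH' * 3))         # -> 8
```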
The one-time pad. An important scenario is when we use the Vigenère cipher
with a key which is (at least) as long as the plaintext. Namely, each character
of the key is used to hide only one character of the plaintext. In this case,
Equation 2.9 simplifies to c[i] = p[i] + k[i] mod n; and in the special case of
n = 2, we have c[i] = p[i] ⊕ k[i].
When the key k is chosen randomly, and especially when n = 2, this cipher
is referred to as the one-time pad; we discuss it in Section 2.4. The one-time
pad is unconditionally secure, i.e., the attacker cannot learn from the ciphertext
anything about the plaintext, regardless of the attacker’s computational
capabilities (for any n > 1).
The Autokey. We now describe the Autokey, which is the cipher that Vigenère
actually designed; it is an enhancement, or variant, of the ‘repeating-key
Vigenère cipher’. Both Vigenère ciphers operate in the same way until
exhausting the characters of the key; but then they differ. Recall that the
repeating-key Vigenère cipher reuses the same key string. Instead, the Autokey
uses the plaintext.
Namely, given an input plaintext string p consisting of l characters, p[0], . . . , p[l−1], and a key k consisting of λ characters, k[0], . . . , k[λ − 1], we compute the
ciphertext string c = E_k^{Autokey}(p) as follows. We first define the autokey k′, as
k′ ≡ k||p, i.e., the concatenation of the plaintext to the key. Next, we compute
the ciphertext c, which consists of the l characters c[0], . . . , c[l − 1] computed as:
(∀i = 0, . . . , l − 1) c[i] = p[i] + k′[i] mod n    (2.12)
Example 2.2. Consider encryption of the string p = ‘ABCDEFG’, whose
encoding is p[i] = i (for i = 0, . . . , 6), using the key k = ‘BED’, whose encoding is
k[0] = 1, k[1] = 4 and k[2] = 3. The autokey would be k′ = ‘BEDABCDEFG’,
and the resulting ciphertext c is the string ‘BFFDFHJ’.
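Example 2.2 follows directly from Equation 2.12; in the Python sketch below (helper name ours), the autokey k′ = k||p is just list concatenation:

```python
def autokey_encrypt(p: list, k: list, n: int = 26) -> list:
    """Autokey (Equation 2.12): c[i] = p[i] + k'[i] mod n, where k' = k || p."""
    kp = k + p                    # the autokey: key followed by the plaintext
    return [(p[i] + kp[i]) % n for i in range(len(p))]

p = list(range(7))                # 'ABCDEFG'
k = [1, 4, 3]                     # 'BED'
c = autokey_encrypt(p, k)
print(''.join(chr(x + ord('A')) for x in c))  # -> 'BFFDFHJ', as in Example 2.2
```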
Attacking the Autokey. Let us present instructive attacks on the Autokey.
These attacks use two different models of the attacker capabilities, the Known
plaintext attack (KPA) model and the Ciphertext only attack (CTO) model;
we discuss these and other attack models against cryptosystems below, in
Section 2.2.
Example 2.3 (A known plaintext attack against Autokey). Assume that
the attacker captures ciphertext c, and knows that the plaintext is of the form
p = pKnown || pSecret; for example, pKnown = ‘The password is:’.
The attacker can find the key k and the plaintext pSecret as follows, assuming
that the key k is not longer than pKnown. Let λ denote the number of characters
in the key k, i.e., k = k[0 : λ]. Hence, for i = 0, . . . , λ − 1 holds: (1) k′[i] = k[i] (by
definition of k′), (2) p[i] = pKnown[i] and (3) c[i] = p[i] + k′[i] (Equation 2.12).
The attacker can, therefore, find the key k, since for i = 0, . . . , λ − 1 holds: k[i] =
k′[i] = c[i] − pKnown[i].
In the (less likely) case that k is longer than pKnown , this attack will expose
the first |pKnown | characters of k. This will allow exposure of the first |pKnown |
characters of the plaintext of other messages encrypted using k. To expose
additional characters, we will need to use a different attack, such as the CTO
attack which we describe next.
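The key recovery of Example 2.3 can be sketched as follows (helper names ours); the known plaintext prefix plays the role of pKnown:

```python
def autokey_kpa(c: list, p_known: list, lam: int, n: int = 26) -> list:
    """Example 2.3: if the known prefix is at least as long as the key,
    then k[i] = c[i] - p_known[i] mod n, for i = 0, ..., λ-1."""
    assert len(p_known) >= lam
    return [(c[i] - p_known[i]) % n for i in range(lam)]

# reconstruct Example 2.2's ciphertext and recover its key:
p, k = list(range(7)), [1, 4, 3]
kp = k + p                                   # autokey k' = k || p
c = [(p[i] + kp[i]) % 26 for i in range(len(p))]
assert autokey_kpa(c, p[:3], lam=3) == k     # key fully recovered
print('recovered key:', autokey_kpa(c, p[:3], lam=3))
```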
Example 2.4 (A ciphertext only attack (CTO) against Autokey). In spite
of their name, ciphertext only (CTO) attacks require some knowledge about the
plaintext. For example, assume that the plaintext is known to be in English. Let
us assume also that the attacker knows the number of characters in the key, λ.
Assume that the attacker knows some plaintext letter p[i]. Then c[i + λ] = p[i +
λ] + p[i] mod n; hence the attacker can find p[i + λ]. Now that p[i + λ] is
known, the attacker can find p[i + 2λ], and so on. Similarly, the attacker can
find p[i − λ], provided, of course, that i − λ ≥ 0. Namely, given a guess for p[i],
the attacker can find p[j] for every j ≡ i (mod λ).
This allows the attacker to decrypt the ciphertext, by using frequency analysis;
let us sketch how. Since the letter E is very common (more than 1/8 of the letters!),
the attacker tries different indexes i, testing if p[i] = 4 (indicating the letter
E). To test if this is the case, the attacker uses the above process to find what
would be the letters p[j] for every j ≡ i (mod λ), and checks the distribution
of the resulting sequence of letters. We conclude that the guess was correct
(p[i] = 4, i.e., the ith letter is indeed E), if and only if the distribution is close
to the letter frequency in English; in particular, if roughly 1/8 of the letters in
this sequence are E.
After the attacker finds, in this way, the sequence of plaintext letters p[j] for
every j ≡ i (mod λ), it can continue to decrypt additional ciphertext letters by
guessing additional letters. In particular, the attacker can now use the bigram
distribution, to guess letters adjacent to already-exposed letters.
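The chaining step of this attack - recovering p[j] for every j ≡ i (mod λ), j ≥ i, from a single guess - can be sketched as follows; an illustrative sketch over A-Z (n = 26), with hypothetical function names:

```python
def chain_from_guess(ciphertext: str, i: int, p_i: int, key_len: int) -> dict:
    # Given a guessed plaintext value p[i] (0..25), recover p[j] for every
    # j >= i with j ≡ i (mod key_len), using the Autokey relation
    # c[j + key_len] = p[j + key_len] + p[j] mod 26.
    c = [ord(ch) - ord('A') for ch in ciphertext]
    recovered = {i: p_i}
    j = i
    while j + key_len < len(c):
        recovered[j + key_len] = (c[j + key_len] - recovered[j]) % 26
        j += key_len
    return recovered
```

For the running example (ciphertext 'BFFDFHJ', λ = 3), guessing p[0] = 0 (the letter A) yields p[3] = 3 and p[6] = 6, matching the plaintext 'ABCDEFG'; the attacker would then check whether the letter distribution of each such chain looks like English.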
2.2 Cryptanalysis Attack Models: CTO, KPA, CPA and CCA
As discussed in Section 1.1 and in particular in Principle 1, security should be
defined and analyzed with respect to a clear, well-defined model of the attacker
capabilities, which we refer to as the attack model. In particular, cryptanalysis
attack models define the capabilities of attackers trying to 'break' an encryption
scheme.
In this section, we introduce the four basic cryptanalysis attack models:
CTO, KPA, CPA and CCA; we already gave a few examples of CTO and KPA
attacks. Table 2.1 summarizes these four basic cryptanalysis attack models,
as well as two additional cryptanalysis attack models: the chosen-ciphertext
side-channel attack (CCSCA), against public-key cryptosystems, presented in
Chapter 6, and the CPA-Oracle attack, against shared-key cryptosystems,
presented in Chapter 7. Following Kerckhoffs's principle, in all of these attack
models, the design of the cryptosystem is known to the attacker; defense is
only provided via the secret keys,7 which are selected or generated randomly.
The Ciphertext-Only (CTO) attack model. We discussed above the
letter-frequency attack, which relied only on access to a sufficient amount of
ciphertext, and on knowing the letter distribution of plaintext messages. Later,
in subsection 2.3.1, we present exhaustive search, an attack requiring the ability
to identify correctly-decrypted plaintext (with significant probability). Both of
these attacks require only access to (sufficient) ciphertext, and some limited
knowledge about the plaintext distribution - in these examples, knowledge of
the letter frequencies in the plaintext language, or the ability to identify possible
plaintexts, respectively. We refer to such attacks, which require only these
'minimal' attacker capabilities, as ciphertext-only (CTO8) attacks, or as attacks
under the ciphertext-only (CTO) attack model. In particular, ciphertext-only
attacks do not require a (plaintext, ciphertext) pair.
7 Private keys, for public-key cryptosystems.
8 For historical reasons, we use the CTO acronym for the ciphertext-only attack, although
it is inconsistent with the other acronyms for attack models.
Attack model                    | Cryptanalyst knowledge, capabilities             | Section | Fig.
Ciphertext Only (CTO)           | Plaintext distribution (possibly noisy/partial)  | 2.2     | 2.6
Known Plaintext Attack (KPA)    | Set of (ciphertext, plaintext) pairs             | 2.2     | 2.7
Chosen Plaintext Attack (CPA)   | Ciphertext for arbitrary plaintext chosen by     | 2.2     | 2.8
                                | attacker                                         |         |
Chosen Ciphertext Attack (CCA)  | Plaintext for arbitrary ciphertext chosen by     | 2.2     | 2.9
                                | attacker                                         |         |
Chosen-ciphertext side-channel  | 'Side-channel feedback' from processing          | 6.5.7   | 6.20
attack (CCSCA)                  | adversary-selected ciphertexts                   |         |
CPA-Oracle Attack               | Attacker receives encryptions of pre ++ x ++ post, | 7.2.3 | 2.36
                                | for challenge x and chosen (pre, post), and      |         |
                                | feedback from processing selected ciphertexts    |         |
Table 2.1: Cryptanalysis Attack Models. In all attack types, the cryptanalyst
knows the cipher design and a body of ciphertext.
Figure 2.6: The Ciphertext-Only (CTO) attack model. Notice the small image
representing the plaintext distribution, which is a tiny version of the
letter-distribution graph of Figure 2.5; this is used to sample the plaintext, both for
the plaintext messages m0 , m1 , . . . and for sample plaintext messages given to
the attacker.
To facilitate CTO attacks, the attacker must have some knowledge about
the distribution of plaintexts. In practice, such knowledge is typically implied
by the specific application or scenario. For example, when it is known that the
message is in English, the attacker can apply known statistics such as the
letter-distribution histogram of Figure 2.5. For a formal, precise definition, we
normally allow the adversary to pick the plaintext distribution. Note that this
requires defining security carefully, to prevent absurd 'attacks', which clearly
fail in practice, from seeming to fall under the definition.
Figure 2.7: The Known-Plaintext Attack (KPA) model. As in the CTO model,
all plaintext messages, m∗ and m0 , m1 , . . ., are chosen from a known plaintext
distribution, and the eavesdropping adversary receives their encryptions, c∗ and
c0 , c1 , . . ., and tries to learn m∗ - or something about m∗ . In the KPA model,
the adversary receives, in addition, the plaintext messages m0 , m1 , . . ., but not
the 'challenge message' m∗ . The plaintext messages m0 , m1 , . . . are not
given to the CTO attacker.
The Known-Plaintext Attack (KPA) model.
In the known-plaintext
attack (KPA) model, the attacker receives one or multiple pairs of plaintext
and the corresponding ciphertext. However, the attacker cannot choose the
plaintext; one way to model this is to assume that the plaintext is chosen
randomly.
In the historical attacks on cryptographic systems, known plaintext attacks
were sometimes possible, such as when some text was available both in plaintext
and in ciphertext. One interesting example is the 'deciphering' of the Rosetta
stone, which contained the same inscription, engraved in three different ways:
once in Greek and twice in Egyptian, once using hieroglyphic script and once
using Demotic script. This is how archaeologists learned to read hieroglyphics.
Previous attempts to decipher hieroglyphics, using 'ciphertext only', i.e., without
known plaintext, were in vain.
Another example of historical known-plaintext attack was the cryptanalysis
of the Enigma cipher by the Allies during WW II (Section 1.6). The Germans
were often sending encryptions of plaintext which would either be known or
could be guessed with reasonable probability. Sometimes the same message
was sent to some recipients encrypted, and to others in plaintext; and often
the message began with a predictable greeting or otherwise contained some
predictable content. The (plaintext m, ciphertext Ek (m)) pairs were fed to
the Bombe devices, which tried all possible keys, until finding the correct key
(exhaustive search). Note that this was incorrect use of Enigma; following the
conservative design and usage principle, the exposure of (plaintext, ciphertext)
pairs should have been avoided.
In modern applied cryptography, it is very common for the attacker to obtain
KPA capabilities. For example, consider the common use of the TLS protocol
(Chapter 7) to protect web communication, referred to as https (for 'http-secure').
In such usage, the entire traffic between browser and web-server is encrypted by
TLS. Typically, this includes images and web-pages (encoded in HTML) which
are sent to all clients. Namely, the attacker can obtain this plaintext, simply by
requesting the same page from the web-server.
The Chosen-Plaintext Attack (CPA) model. In subsection 2.3.2, we
discuss the table look-up and time-memory tradeoff attacks; in both of these
generic attacks, the adversary must be able to obtain the encryption of one or a few
specific plaintext messages - the messages used to create the precomputed table.
Therefore, these attacks cannot be launched under the Known-Plaintext Attack
(KPA) model. Instead, these attacks are facilitated by the chosen-plaintext
attack (CPA) model.
Figure 2.8: The Chosen-Plaintext Attack (CPA) model. Here, the attacker
can choose the plaintext messages m0 , m1 , . . ., in addition to knowing the
distribution from which m∗ is sampled. Note: in the CPA attack, the attacker
controls the plaintext messages given to Alice; however, we usually still consider
this attacker to be an ‘eavesdropper’, since the attacker cannot modify or inject
messages to the communication between Alice and Bob.
Security against CPA has been studied extensively in modern cryptography,
but was not even considered in historical cryptanalysis. For example, in the
second world war, cryptanalysts in Bletchley Park made extensive use of some
known plaintexts, i.e., used KPA. However, they had no hope to choose the
plaintext, i.e., to use CPA. Therefore, they were not able to use CPA attack
techniques, e.g., the attacks in subsection 2.3.2.
However, in modern applied cryptography, it is not that unusual for the
attacker to obtain CPA capabilities. For example, consider, as for KPA, the
common use of the TLS protocol (Chapter 7) to protect web communication.
TLS applies encryption to the traffic from the browser to the server; however, a
basic premise of the http protocol is that every web-page (and script) can
send requests to any other website, including websites from different domains.
Namely, when the user is browsing the attacker's web-page, or when the
browser is running a script written by the attacker, the attacker can cause the
browser to send arbitrary requests to any website. These requests would often
also contain sensitive information, typically a cookie attached by the browser
to the request. The browser will often also use the same key to encrypt other
requests sent by the browser to the same website.
Every cipher vulnerable to a CTO attack is also vulnerable to a KPA attack;
and every cipher vulnerable to KPA is also vulnerable to a CPA attack. We say
that the CPA attack model is stronger than the KPA model, and that the KPA
model is stronger than the CTO model, and denote this by: CPA > KPA > CTO.
Exercise 2.5 (CPA > KPA > CTO). Explain (informally) why every cryptosystem vulnerable to a CTO attack is also vulnerable to KPA, and every cryptosystem vulnerable to KPA is also vulnerable to CPA.
The Chosen-Ciphertext Attack (CCA) model. Finally, in the chosen-ciphertext
attack (CCA) model, the attacker has the ability to receive
the decryption of arbitrary ciphertexts, chosen by the attacker. This attack
model may seem absurd; if the attacker can receive the decryption of arbitrary
ciphertexts, then the encryption does not protect confidentiality even without
cryptanalysis, no? However, this is actually a very important attack model, since,
in practice, there are different scenarios where the attacker may be able to obtain
some information about the plaintext of some ciphertexts. In particular,
in subsection 6.5.7 we present the important Bleichenbacher attack against
(insecure) padding schemes applied to the plaintext before RSA encryption; this
attack is under the chosen-ciphertext side-channel attack (CCSCA) model, which is a
weaker version of the CCA attack model.
There are a few other variants to the CCA model. One set of variants
combines the CCA model with the CTO, KPA or CPA models, i.e., allows the
adversary also the ability to receive encryptions of random plaintext from a
known distribution, known plaintexts or chosen plaintexts, respectively. Another
set of variants concerns the timing of the choice of ciphertext messages to be
decrypted; some CCA models require the adversary to choose these ciphertexts
before receiving the challenge ciphertext c∗ , and other CCA models allow the
adversary to select these ciphertexts after receiving c∗ , of course, forbidding the
use of c∗ as one of these ciphertexts to be decrypted.
We adopt the common definition, where CCA-attackers are also allowed to
perform CPA attacks, i.e., the attacker can obtain the encryptions of attacker-chosen
plaintext messages. With this definition, trivially, CCA > CPA.
Combining this with the previous exercise, we have the complete ordering:
CCA > CPA > KPA > CTO.
Figure 2.9: The Chosen-Ciphertext Attack (CCA) model. In the CCA model,
the adversary can select ciphertext messages c′0 , c′1 , . . ., all different from the
‘challenge ciphertext’ c∗ , and receive their decryptions m′i = Dk (c′i ).
2.3 Generic attacks and Effective Key-Length
We discussed several ciphers and attack models; how can we evaluate the security
of different ciphers, under a given attack model? This is a fundamental challenge
in cryptography; we will discuss this challenge in this section, as well as later
on, especially when we introduce our first definition of a cryptographic mechanism -
the Pseudo-Random Generator (PRG), in subsection 2.5.2.
One way in which non-experts often compare the security of different ciphers
is using their key length. We caution that while a sufficiently-long key is required
for security, a long key is not sufficient to ensure security; in subsection 2.3.3
we introduce the effective key length concept and principle, which should be
used instead of simply relying on the key length.
Nevertheless, the key length of a cipher is important - since a short key
does allow attacks. In fact, there are attacks, which we call generic attacks,
that work against any cryptosystem with an insufficiently long key, or against all
cryptosystems that share some (common) property. Generic attacks may differ
in the attack model they assume, in the types of cryptosystems they break and
in the attack efficiency and overhead.
We present three important generic attacks in this section. In subsection 2.3.1,
we present the exhaustive search attack, which essentially tries out
all the keys until finding the right one. Exhaustive search works on most ciphers
and scenarios; it requires the ability to test candidate keys. Such ability usually
exists, but not always, as we explain. In subsection 2.3.2 we discuss two
other generic attacks: table look-up and time-memory tradeoff. These attacks
further demonstrate that the attacker's success does not depend only on the
key-length - it also depends on the attack model and attacker capabilities, e.g.,
storage capacity. Finally, in subsection 2.3.3 we discuss additional challenges in
evaluating security of cryptographic mechanisms, and introduce the effective
key length concept and principle.
2.3.1 The generic exhaustive-search CTO attack
Recall that in the CTO attack model, the attacker has some knowledge about
the distribution of plaintexts. In this section we discuss generic exhaustive-search
CTO attacks, also called exhaustive search or brute force attacks, where
the attacker uses this knowledge to cryptanalyze a generic cryptosystem, i.e.,
without dependency on the design of the specific cryptosystem.
In Algorithm 2 we present a simple exhaustive-search CTO attack, which
uses a predicate Valid(p) that validates plaintexts, i.e., Valid(p) returns ⊥ if
p is not a valid plaintext, for example, if p is not a word in English. Other CTO
exhaustive key search algorithms do not require such a predicate, and typically
use a known probability distribution for the plaintext space, e.g., using the
distribution of letters in English (Figure 2.5).
Algorithm 2 A CTO exhaustive key search algorithm using predicate Valid(·),
for stateless decryption algorithm D.
Set of possible keys: K = {k1 , k2 , . . . , kn }
Set of ciphertexts: C = {c1 , c2 , . . .}
Predicate Valid(p): returns True if p is a valid plaintext, ⊥ otherwise
for all k in K do
    if (∃c ∈ C) Valid(Dk (c)) = ⊥ then
        Remove k from K
return set of remaining candidate keys K
Algorithm 2 decrypts each of the known ciphertext messages, using every
possible key. If the decryption of some ciphertext c using some key k is not
a valid plaintext, it follows that k is not a correct key. The process typically
eliminates all but one or few candidate keys, and incorrect keys are discarded
quickly, after testing them against a small number of ciphertexts.
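Algorithm 2 can be sketched in runnable form. Here a toy shift cipher and a small dictionary-based Valid predicate stand in for the cryptosystem and the plaintext language; both are assumptions made for illustration, not part of the textbook's algorithm:

```python
# Sketch of the exhaustive-search CTO attack (Algorithm 2), using a toy
# shift cipher and an illustrative Valid predicate.

def shift_decrypt(c: str, k: int) -> str:
    # Toy cipher: decryption shifts each letter back by k positions.
    return ''.join(chr((ord(ch) - ord('A') - k) % 26 + ord('A')) for ch in c)

def exhaustive_search(ciphertexts, keys, decrypt, valid):
    candidates = set(keys)
    for k in list(candidates):
        # Discard k if some ciphertext decrypts to an invalid plaintext.
        if any(not valid(decrypt(c, k)) for c in ciphertexts):
            candidates.discard(k)
    return candidates

DICTIONARY = {'HELLO', 'WORLD'}        # toy plaintext language
valid = lambda p: p in DICTIONARY
ciphertexts = [shift_decrypt('HELLO', -3)]   # i.e., encryption with k=3
```

Running `exhaustive_search(ciphertexts, range(26), shift_decrypt, valid)` eliminates every key whose decryption is not in the dictionary, leaving only the correct key; with more ciphertexts, incorrect keys are typically eliminated even faster, as described above.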
Testing candidate keys. Exhaustive search works best when decrypting
arbitrary ciphertext with an incorrect key usually results in clearly-invalid plaintext.
Notice our use of the term 'usually'; surely there is some probability that
decryption with the wrong key will result in seemingly-valid plaintext. Hence,
exhaustive search may not return a single correct decryption key. Instead, quite
often, exhaustive search may return multiple candidate keys, which all resulted
in seemingly-valid decryption. In such cases, the attacker must eliminate
some of these candidate keys by trying to decrypt additional ciphertexts,
discarding a key when its decryption of some ciphertext appears to result in
invalid plaintext.
Is CTO Exhaustive Search feasible? Exhaustive search is not always
feasible. One concern is the availability of the Valid(·) predicate, or of other
distinguishing properties based on the probability distribution of plaintexts. In
particular, exhaustive search is infeasible if the plaintext space is uniformly
random, i.e., any string is equally likely to be used as plaintext. Let us therefore
focus on the simpler case, where a Valid(·) predicate is known. Furthermore,
assume the typical case where decryption of a valid ciphertext with an incorrect
key results, with high probability, in an invalid plaintext p (i.e., Valid(p) = ⊥).
Further focusing on stateless cryptosystems, we can exclude an incorrect
key after a few trial decryptions. Consider a symmetric cryptosystem (E, D),
where the key is chosen as a random binary string of given length l; namely,
the key space contains 2^l keys. The attack succeeds after decrypting one or a few
valid ciphertexts with, at most, each of the 2^l possible keys. The attack is
therefore feasible when using insufficiently-long keys. Surprisingly, designers
have repeatedly underestimated the risk of exhaustive search and used ciphers
with insufficiently long keys, i.e., insufficiently large key spaces. Let us elaborate.
Let TS be the sensitivity period, i.e., the duration required for maintaining
secrecy, and TD be the time it takes to test each potential key, by performing
one or more decryptions. Hence, the attacker can test TS /TD keys out of the
key-space containing 2^l keys. If TS /TD > 2^l, then the attacker can test all
keys, and find the key for certain (with probability 1); otherwise, the attacker
succeeds with probability TS /(TD · 2^l). By selecting a sufficient key length, we can
ensure that the success probability is as low as desired.
For example, consider the conservative assumption of testing a billion keys
per second, i.e., TD = 10^-9, and requiring security for three thousand years,
i.e., TS = 10^11, with probability of the attack succeeding at most 0.1%. We find
that to ensure security with these parameters against brute force attack, we
need keys of length l ≥ log2 (TS /(TD · 10^-3)) = log2 (10^23) < 77 bits.
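This calculation can be checked numerically; the following is an illustrative sketch under the stated assumptions (a billion key tests per second, a sensitivity period of about three thousand years, and a 0.1% success bound), with hypothetical helper names:

```python
import math

# Numeric sketch of the exhaustive-search trade-off described above.
T_S = 1e11   # sensitivity period in seconds (roughly three thousand years)
T_D = 1e-9   # time to test one key: a billion key tests per second

def success_probability(key_bits: int) -> float:
    # Fraction of the 2^l key space the attacker can cover within T_S.
    return min(1.0, (T_S / T_D) / 2 ** key_bits)

# Smallest key length keeping the success probability at or below 0.1%:
needed = math.ceil(math.log2((T_S / T_D) / 1e-3))
```

Under these assumptions, `needed` comes out to 77 bits, and `success_probability(needed)` is indeed at most 0.1%.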
The above calculation assumed a minimal time to test each key. Of course,
attackers will often be able to test many keys in parallel, by using multiple
computers and/or parallel processing, possibly with hardware acceleration.
Such methods were used during 1994-1999 in multiple demonstrations of the
vulnerability of the Data Encryption Standard (DES) to different attacks. The
final demonstration was an exhaustive search completing in 22 hours, testing many
keys in parallel using a $250,000 dedicated-hardware machine ('Deep Crack')
together with distributed.net, a network of computers contributing their idle
time.
However, the impact of such parallel testing, as well as improvements in
processing time, is easily addressed by a reasonable extension of the key length.
Assume that an attacker is able to test 100 million keys in parallel during the
same 10^-9 second, i.e., TD = 10^-17. With the same goals and calculation as
above, we find that we need keys of length l ≥ log2 (TS /(TD · 10^-3)) = log2 (10^31) < 103 bits.
This is still well below the minimal key length of 128 bits supported by the
Advanced Encryption Standard (AES). Therefore, exhaustive search is not a
viable attack against AES or other ciphers with keys of 128 bits or more.
In principle, exhaustive search can also be applied to public key cryptosystems
(KG, E, D), where the public and private keys are generated by the
randomized key-generation algorithm, i.e., (e, d) ←$ KG(1^l). However, this is
usually impractical, for two reasons. First, all known public key cryptosystems
have orders of magnitude more overhead than shared key cryptosystems, which
makes testing all keys impractical. Second, known public key cryptosystems
are all vulnerable to attacks which are significantly more efficient than exhaustive
search, and are therefore used with significantly longer keys, which make
exhaustive-search attacks infeasible. See details in Chapter 6 and specifically in
Table 6.1.
Exhaustive search for stateful ciphers. Exhaustive search may be harder
against stateful ciphers, since decryption may depend on the (initial) state of the
cipher. The natural adaptation of the exhaustive search attack of Algorithm 2
requires considering the initial state as a part of the key, which makes the attack
harder; this motivated stateful cipher designs, for example the Vigenère and
Autokey ciphers (subsection 2.1.5).
Of course, if the initial state is known, then this concern does not exist.
However, the attack can remain infeasible, if the key space is too large. In
particular, in Section 2.4 we present the one time pad (OTP) stream cipher,
where every plaintext bit is encrypted with a corresponding key bit, i.e., c = p⊕k
where |k| = |p| = |c|. OTP cannot be broken by exhaustive search, since for
every ciphertext c and every plaintext p of the same length, there is a key k
such that c = p ⊕ k.
2.3.2 The Table Look-up and the Time-Memory Tradeoff Generic CPA attacks
Exhaustive search is very computation-intensive; it finds the key, on average,
after testing half of the keyspace. On the other hand, its storage requirements
are very modest, and almost9 independent of the key space.
In contrast, the table look-up attack, which we explain next, uses O(2^l)10
memory, where l is the key length, but only table-lookup time. However, it
requires the ciphertext of some pre-defined plaintext message, which we denote
p∗ . This can be achieved by an attacker with Chosen Plaintext Attack (CPA)
capabilities, or whenever the attacker can obtain encryptions of some well-known
message p∗ . Many communication protocols use predictable, well-known
messages at specific times, often upon connection initialization, which provides
the attacker with encryptions of this predictable known plaintext message p∗ -
and suffices for this attack.
In the table look-up attack, the attacker first precomputes T (k) = Ek (p∗ )
for every key k, and for every c s.t. c = T (k) for some key k, also stores the
inverse table T −1 (c) = {k s.t. c = Ek (p∗ )}. Later, the attacker asks for the
encryption of the same plaintext p∗ , using the unknown secret key, which we
denote k ∗ ; let c∗ = Ek∗ (p∗ ) denote the received ciphertext. The attacker now
identifies the key as one of the entries in T −1 (c∗ ). The number of matching
9 Exhaustive search needs storage for the key guesses.
10 See Section A.1 for the big-O notation, O(2^l).
keys is usually one or very small, allowing the attacker to quickly rule out the
incorrect keys, e.g., by decrypting some additional ciphertext messages.
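The two phases of the table look-up attack can be sketched as follows, again with a toy shift cipher standing in for the cryptosystem (an assumption made for illustration only):

```python
# Sketch of the table look-up attack, using a toy shift cipher; any
# cipher with a small enough key space works the same way.

def shift_encrypt(p: str, k: int) -> str:
    return ''.join(chr((ord(ch) - ord('A') + k) % 26 + ord('A')) for ch in p)

P_STAR = 'HELLO'     # the well-known / chosen plaintext p*
KEYS = range(26)

# Precomputation phase: build the inverse table c -> {k s.t. c = E_k(p*)}.
inverse_table = {}
for k in KEYS:
    inverse_table.setdefault(shift_encrypt(P_STAR, k), []).append(k)

def lookup_attack(c_star: str):
    # Online phase: a single table look-up instead of trying every key.
    return inverse_table.get(c_star, [])
```

Given the ciphertext c* = E_k*(p*), `lookup_attack(c_star)` returns the matching candidate keys in O(1) time; the O(2^l) cost has been moved entirely into the precomputation.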
The table look-up attack requires O(2^l) storage to ensure O(1) computation,
while the exhaustive search attack uses O(1) storage and O(2^l) computations.
Several more advanced generic attacks allow different tradeoffs between the
computing time and the amount of storage (memory) required for the attack.
The first and most well-known time-memory tradeoff attack was presented by
Martin Hellman [189]. Later works presented other tradeoff attacks, such as
the time/memory/data tradeoff of [65] and the rainbow tables technique of [303].
Unfortunately, we will not be able to cover these interesting attacks, and
readers are encouraged to read these (and other) papers presenting them.
2.3.3 Effective key length
Cryptanalysis, i.e., developing attacks on cryptographic mechanisms, is a large
part of the research in applied cryptography; it includes generic attacks such as
those presented earlier in this section, as well as numerous attacks which are
tailored to a specific cryptographic mechanism. This may look surprising; why
publish attacks? Surely the goal is not to help attacks against cryptographic
systems?
Cryptanalysis facilitates two critical decisions facing designers of security
systems which use cryptography: which cryptographic mechanism to use, and
what parameters to use, in particular, which key length to use.
Let us focus first on the key length. All too often, when cryptographic
products and protocols are mentioned in the popular press, the key-length in use
is mentioned as an indicator of their security. Furthermore, this is sometimes
used to argue for the security of the cryptographic mechanism, typically by
presenting the number of different key values possible with a given key length.
The number of different keys determines the time required for the exhaustive search
attack, and has direct impact on the resources required by the other generic
attacks we discussed. Hence, clearly, keys must be sufficiently long to ensure
security. But how long?
It is incorrect to compare the security of two different cryptographic systems,
which use different cryptographic mechanisms (e.g., ciphers), by comparing the
key length used in the two systems. Let us give two examples:
1. We saw that the general monoalphabetic substitution cipher (subsection 2.1.3)
is insecure, although its key space is relatively large. We could
easily increase the key length, e.g., by adding more symbols, such as
different symbols for lowercase and uppercase letters; but this will not
significantly improve security.
2. The key length used by symmetric cryptosystems, as discussed in this
chapter, rarely exceeds 300 bits, and is usually much smaller - 128 bits
is common; more bits are simply considered unnecessary. In contrast,
asymmetric, public-key cryptography is usually used with longer keys -
often much longer, depending on the specific public key cryptosystem; see
Table 6.1.
It is useful to compare the security of different cryptosystems, when each is
used with a specific key-length - e.g., with comparable efficiency. As explained
above, using the key-length alone would be misleading. One convenient, widely
used measure for the security of a given cryptosystem, used with a specific key
length, is called the effective key length; essentially, this uses exhaustive search
as a measure to compare against.
We say that a cipher using k-bit keys has effective key length l if the most
effective attack known against it takes about 2^l operations, where k ≥ l. We
expect the effective key length of good symmetric ciphers to be close to their real
key length, i.e., l should not be 'much smaller' than k. For important
symmetric ciphers, any attack which increases the gap between k and l would
be of great interest, and as the gap grows, there will be increasing concern with
using the cipher. The use of key lengths of 128 bits or more leaves a
'safety margin' against potential better future attacks, and gives time to change
to a new cipher when a stronger, more effective attack is found.
Note that, as shown in Table 6.1, for asymmetric cryptosystems, there is
often a large gap between the real key length k and the effective key length l.
This is considered acceptable, since the design of asymmetric cryptosystems is
challenging, and it seems reasonable to expect attacks with performance much
better than exhaustive search. In particular, in most public key systems, the
secret key is not an arbitrary, random binary string.
Note that the evaluation of the effective key length depends on the attack
model; the effective key length is often much smaller when assuming a stronger
attack model, e.g., CPA compared to KPA or CTO. One should therefore also
take into account the expected attack model.
Also, notice that the effective key length measure compares ciphers based on the
time required for the attack; it does not account for tradeoffs among different
resources, for example, the memory used in a time-memory tradeoff attack.
Normally, we select a sufficient key length to ensure security against any
conceivable adversary, e.g., leaving a reasonable margin above an effective key
length of, say, 100 bits; a larger margin is required when the sensitivity period
of the plaintext is longer. The cost of using longer keys is often justified,
considering the damages of loss of security and of having to change in a hurry
to a cipher with longer effective key length, or even of having to use longer keys.
In some scenarios, however, the use of longer keys may have significant costs;
for example, doubling the key length in the RSA cryptosystem increases the
computational costs by a factor of about six. We therefore may also consider the risk from
exposure, as well as the resources that a (rational) attacker may deploy to break
the system. This is summarized by the following principle.
Principle 5 (Sufficient effective key length). Deployed cryptosystems should
have sufficient effective key length to foil feasible attacks, considering the maximal expected adversary resources and most effective yet feasible attack model,
as well as cryptanalysis and speed improvements expected over the sensitivity
period of the plaintext.
Experts, as well as standardization and security organizations, publish
estimates of the required key length for different cryptosystems (and other
cryptographic schemes); we present a few estimates in Table 6.1.
2.4 Unconditional security and the One Time Pad (OTP)
The exhaustive search and table look-up attacks are generic - they do not depend
on the specific design of the cipher: their complexity is merely a function of
key length. This raises the natural question: is every cipher breakable, given
enough resources? Or, can encryption be secure unconditionally - even against
an attacker with unbounded resources (time, computation speed, storage)?
We next present such an unconditionally secure cipher, the one time pad
(OTP). The one time pad is often attributed to a 1919 patent by Gilbert
Vernam [382], although some of the critical aspects may have been due to
Mauborgne [49], and in fact, the idea was already proposed by Frank Miller
in 1882 [48]; we refer readers to the many excellent references on history of
encryption, e.g., [223, 362].
The one time pad is not just unconditionally secure - it is also an exceedingly
simple and computationally efficient cipher. Specifically:
Encryption: To encrypt a message, compute its bitwise XOR with the key.
Namely, the encryption of each plaintext bit, say m[i], is one ciphertext
bit, c[i], computed as: c[i] = m[i] ⊕ k[i], where k[i] is the ith bit of the
key.
Decryption: Decryption simply reverses the encryption, i.e., the ith decrypted
bit would be c[i] ⊕ k[i].
Key: The key bits k = (k[1], k[2], . . .) should consist of independently drawn
fair coin flips, and the key must be at least as long as the
plaintext. Notice that this long key should be - somehow - shared between
the parties, which is a challenge in many scenarios.
See illustration in Figure 2.10.
The correctness of OTP, i.e., the fact that decryption recovers the plaintext
correctly, follows from the properties of exclusive OR. Namely, given c[i] =
m[i]⊕k[i], the corresponding decrypted bit is c[i]⊕k[i] = (m[i]⊕k[i])⊕k[i] = m[i],
as required.
The unconditional security of OTP also follows from properties of XOR;
let us explain it, albeit informally. First, OTP handles each bit completely
independently of others, so we can focus on the security of a particular bit
m[i], encrypted as c[i] = m[i] ⊕ k[i]. Recall that each key bit k[i] is selected
randomly, and suppose, for simplicity, that the message bit m[i] was also selected
randomly. Namely, both bits can be either 0 or 1 with probability half. Given
CHAPTER 2. CONFIDENTIALITY: ENCRYPTION SCHEMES AND
PSEUDO-RANDOMNESS

Figure 2.10: The one time pad (OTP) cipher - an unconditionally-secure stateful
stream cipher: (∀i) c[i] = m[i] ⊕ k[i] (bit-wise XOR).
c[i], there are two equally likely conclusions: either k[i] = 0 and therefore
m[i] = c[i], or k[i] = 1 and therefore m[i] = 1 − c[i]. Namely, seeing c[i] does not
change our knowledge about m[i], simply since k[i] is a random bit. The precise
argument is similar, mainly avoiding the assumption that the message bit is
fair, i.e., allowing some prior knowledge about the probability that m[i] = 1;
the argument would show that there is no extra knowledge about m[i] from
observing c[i].
The unconditional secrecy of OTP was recognized early on, and established
rigorously in a seminal paper published in 1949 by Claude Shannon [353].
In that paper, Shannon also proved the more challenging fact that every
unconditionally-secure cipher must have keys as long as the plaintext; namely,
as long as unconditional secrecy is required, this aspect cannot be improved.
Interested readers can find this proof in textbooks on cryptography, e.g., [370].
Interestingly, OTP is actually a very special case of the Keyed-Caesar
cipher (subsection 2.1.2). Recall that the Keyed-Caesar cipher is defined by:
c[i] = p[i] + k mod n. The OTP cipher is basically the same, except that we
use n = 2 - and a different key bit k[i] for every plaintext bit m[i]. Specifically:

c[i] = m[i] ⊕ k[i] = m[i] + k[i] mod 2        (2.13)
In fact, the Keyed-Caesar cipher is unconditionally secure also for n > 2, as
long as each key letter k[i] is chosen randomly from {0, 1, . . . , n − 1} and used
to encrypt a single plaintext letter m[i] ∈ {0, 1, . . . , n − 1}.
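The general mod-n case just described can be sketched in a few lines (a Python sketch for illustration; the numeric letter encoding and fixed seed are our own choices):

```python
import random

def enc(m, k, n):
    """Keyed-Caesar 'one time pad' over the alphabet {0, ..., n-1}:
    a fresh random key letter k[i] for every plaintext letter m[i]."""
    return [(mi + ki) % n for mi, ki in zip(m, k)]

def dec(c, k, n):
    return [(ci - ki) % n for ci, ki in zip(c, k)]

random.seed(3)                          # fixed seed, for a reproducible example
n = 26                                  # e.g., the Latin alphabet
m = [7, 4, 11, 11, 14]                  # "hello", encoded as letters 0..25
k = [random.randrange(n) for _ in m]    # one fresh random key letter per plaintext letter
assert dec(enc(m, k, n), k, n) == m     # decryption recovers the plaintext
```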
The cryptographic literature has many beautiful results on unconditional
security. However, it is rarely practical to use such long keys, and in practice,
adversaries - like everyone else - have limited computational abilities. Therefore,
in this textbook, we focus on computationally-bounded adversaries.
While the key required by OTP makes its use rarely practical, we next show
a computationally-secure variant of OTP, where the key can be much smaller
than the plaintext, and which can be used in practical schemes. Of course, this
variant is - at best - secure only against computationally-bounded attackers.
Note that OTP handles multiple plaintext messages as part of one long
plaintext string, i.e., it uses m[i] for the ith plaintext bit, and similarly for
ciphertext and key bits; basically, one could say that OTP treats its entire input
as one long plaintext message, which it encrypts one bit at a time. Some other
stateful ciphers operate on multiple plaintext messages, where each message
may consist of multiple bits.
Exercise 2.6. Define the stateful encryption and decryption functions ⟨E, D⟩
for the OTP cipher.
Solution: We use the index i of the next bit to be encrypted as the state,
initialized with i = 1. Namely, encryption Ek (m[i], i) returns (m[i] ⊕ k[i], i + 1),
and decryption Dk (c[i], i) returns (c[i] ⊕ k[i], i + 1).
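The solution above can be sketched in Python (an illustrative sketch; the class interface, and 0-based indexing instead of the 1-based indexing of the text, are our own choices):

```python
import random

class OTP:
    """One time pad as a stateful stream cipher: the state is the index
    of the next key bit, advanced by both encryption and decryption."""
    def __init__(self, key_bits):
        self.key = key_bits   # must be at least as long as the total plaintext
        self.i = 0            # state: index of the next key bit (0-based here)

    def enc_bit(self, m_bit):
        c = m_bit ^ self.key[self.i]
        self.i += 1
        return c

    def dec_bit(self, c_bit):
        m = c_bit ^ self.key[self.i]
        self.i += 1
        return m

random.seed(7)
key = [random.randrange(2) for _ in range(32)]   # shared random key bits
msg = [1, 0, 1, 1, 0, 0, 1, 0]
sender, receiver = OTP(key), OTP(key)            # both start with state i = 0
ct = [sender.enc_bit(b) for b in msg]
pt = [receiver.dec_bit(b) for b in ct]
assert pt == msg        # synchronized states ensure correct decryption
```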
Stream ciphers. The one-time pad (OTP) is often referred to as a stream
cipher. We use the term stream ciphers11 to refer to stateful cryptosystems that
use a bit-by-bit encryption process, i.e., a stream cipher maps
each plaintext bit m[i] to a corresponding ciphertext bit c[i].
Since stream ciphers map each plaintext bit to a ciphertext bit, and we
require decryption to be correct (recover the plaintext), the mapping from
plaintext to ciphertext cannot be randomized; but obviously, it also cannot
be the same for all bits, or an eavesdropper could trivially decrypt. It follows
that stream ciphers must be stateful. For example, with the one time pad, not
only must the parties share a key as long as all the plaintext bits, they must
also maintain an exact, synchronized count of the number of key bits used so
far, to ensure correct decryption.
Reuse of key bits with the one-time pad is also insecure. In particular,
suppose the design uses the same key bit k[i] to encrypt both m[i] and m[i + 1],
i.e., c[i] = m[i] ⊕ k[i] but also c[i + 1] = m[i + 1] ⊕ k[i], reusing k[i]. Then an
attacker with one known plaintext, e.g., m[i] and c[i], who eavesdrops on
c[i+1], can find m[i+1]. Specifically, in this case, m[i+1] = c[i+1] ⊕ (m[i] ⊕ c[i]).
See Example 2.5 for a more realistic example of this vulnerability.
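This attack is easy to demonstrate (a Python sketch; for readability we reuse a whole pad across two byte strings, rather than a single key bit, and the messages are our own examples):

```python
import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

k = os.urandom(16)            # one-time pad... reused twice (the mistake)
m1 = b"known plaintext!"      # known to the attacker
m2 = b"secret plaintext"      # the target
c1 = xor(m1, k)
c2 = xor(m2, k)

# The attacker knows m1 and c1, and eavesdrops on c2:
recovered = xor(c2, xor(m1, c1))   # xor(m1, c1) = k, so this equals m2
assert recovered == m2
```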
Stream ciphers are often used in applied cryptography, especially in hardware,
mainly due to their simple and efficient implementation. Rarely do we use
the OTP (or another unconditionally-secure cipher); much more commonly, we
use a stream cipher with a bounded-length key, providing ‘only’ computational
security. In the following section we introduce pseudo-random generators (PRG)
and pseudo-random functions (PRF), and show how to use either of them to
design a bounded key-length stream cipher.
2.5 Pseudo-Randomness, Indistinguishability and Asymptotic Security
Randomness is widely used in cryptography - for example, the one time pad
cipher (Section 2.4) uses random keys to ensure unconditional secrecy. In this
11 Other authors use the term stream cipher also for cryptosystems that use byte-by-byte
or block-by-block encryption, essentially as a synonym to stateful encryption.
section, we introduce pseudo-randomness, a central concept in cryptography,
and three types of pseudo-random schemes: pseudo-random generator (PRG),
pseudo-random function (PRF) and pseudo-random permutation (PRP). We
also introduce the technique of the indistinguishability test, which is central
to the definitions of these three pseudo-random schemes - as well as to the
definition of secure encryption, which we present later.
2.5.1 Pseudo-Random Generators and their use for Bounded Key-length Stream Ciphers
In this subsection we introduce the Pseudo-Random Generator (PRG). The
PRG has one of the simpler cryptographic definitions, and hence we consider it a
good choice for a first definition; however, it is still not that easy. Hence, in this
subsection we introduce PRGs but define them only informally. We focus on
the classical application of PRGs, which is to construct a stream cipher.
We have already seen a stream cipher: the one time pad (OTP). The OTP
has the advantage of being unconditionally-secure, but also the disadvantage of
requiring the parties to share a secret key which is as long as all the plaintext bits
they may need to encrypt, i.e., to share a key of unbounded length. In contrast,
stream ciphers constructed from PRGs only require the parties to share a
bounded-length key; this is a critical advantage over the OTP. On the other hand,
stream ciphers with bounded key-length, such as those constructed from PRGs,
cannot be unconditionally-secure; instead, they can only be computationally
secure - i.e., secure only assuming that the attacker has limited computational
capabilities.
In this section, we will see one method to implement a bounded-key-length
stream cipher from a pseudo-random generator (PRG). PRGs have other important
applications, and are among the cryptographic mechanisms whose definitions
are least complex; therefore, they are a good way to introduce the more complex
- and even more important - cryptographic mechanisms of pseudo-random
functions (PRF), pseudo-random permutations (PRP) and block ciphers, which
we present later.
PRG: intuitive definition. In this subsection, we only introduce PRGs
informally, focusing on their use in the construction of bounded-key-length
stream ciphers. We focus on the following simple definition.
Definition 2.3 (Informal definition of a (stateless) PRG). Given any input
string s, usually referred to as the seed, a PRG fPRG outputs a longer
string, i.e., (∀s ∈ {0, 1}^∗) (|fPRG(s)| > |s|). Furthermore, if s is a random
string (of |s| bits), then fPRG(s) is pseudo-random. Intuitively, this means that
fPRG(s) cannot be efficiently distinguished from a true random string of the
same length |fPRG(s)|.
We define these concepts of efficient distinguishing and PRG precisely quite
soon, in subsection 2.5.2. However, first, let us see that there are some other
ways to define a PRG. Let us mention one of these variants, which is widely
used; to avoid confusion, we refer to this variant as a stateful PRG and, where
relevant, refer to a PRG as in Definition 2.3 as a stateless PRG.
Variant: stateful PRG. A stateful PRG fSPRG is a function that receives
two inputs, a current state (or seed) s and a ‘timestamp’ t, and outputs a
pseudo-random string r and a new state s′. We use dot notation to refer
to the two outputs, fSPRG.r and fSPRG.s′. The timestamp input t is optional;
if used, we require the outputs for a given timestamp t1 to be pseudo-random,
even if the adversary is given the outputs fSPRG(s, t2) for a different timestamp
t2 ̸= t1.
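As an illustration of this interface (our own hypothetical construction, not one from the text; SHA-256 here is only a stand-in and is not a proven PRG), a stateful PRG might look like:

```python
import hashlib

def stateful_prg(s: bytes, t: int):
    """Hypothetical stateful PRG sketch: from state s and timestamp t,
    derive an output block r and a new state s'. The 'out'/'st' prefixes
    separate the two derivations (a domain-separation choice of ours)."""
    r = hashlib.sha256(b"out" + s + t.to_bytes(8, "big")).digest()
    s_next = hashlib.sha256(b"st" + s).digest()
    return r, s_next

s0 = b"initial seed"
r1, s1 = stateful_prg(s0, t=1)
r2, s2 = stateful_prg(s1, t=2)
assert r1 != r2 and s1 != s2    # outputs and states evolve
```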
Building a stream cipher from a PRG. To obtain a stream cipher, we
require a PRG which produces a pseudo-random string as long as the plaintext.12
We then XOR each plaintext bit with the corresponding pseudo-random
bit, as shown in Figure 2.11.
Figure 2.11: PRG-based Stream Cipher: c[i] = m[i] ⊕ fPRG(k)[i], where
|fPRG(k)| = |m| > |k|. The input to the PRG is usually called
either key or seed; if the input is random, or pseudo-random, then the (longer)
output string is pseudo-random. The state includes the current bit index i.
The pseudo-random generator stream cipher is very similar to the OTP;
the only difference is that instead of using a truly random sequence of bits
to XOR with the plaintext bits, we use the output of a Pseudo-Random Generator
(PRG). If we denote the ith output bit of fPRG(k) by fPRG(k)[i], we have that
the ith ciphertext bit c[i] is defined as: c[i] = m[i] ⊕ fPRG(k)[i]. This is a
shared-key stream cipher in which k is the shared key, quite similar to the
OTP. Specifically, the state is the index of the bit i, encryption of plaintext bit
m[i] is m[i] ⊕ fPRG(k)[i] and changes the state to i + 1, and decryption of the
ciphertext bit c[i] is c[i] ⊕ fPRG(k)[i] (and changes the state to i + 1). Note
12 The output of some PRGs may be only slightly longer than their input, e.g., one bit
longer. However, we can use such a PRG to construct another PRG, whose output length is
longer (as a function of its input length). The details and proof are beyond our scope; see,
e.g., [165].
that this may require us to compute the value of fPRG(k) each time we need a
specific bit, or to store all or parts of fPRG(k).
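A stream cipher along these lines can be sketched as follows (Python sketch; expanding the seed by hashing it with a counter serves only as a stand-in for a real PRG, and is not a proven PRG):

```python
import hashlib

def prg(seed: bytes, nbytes: int) -> bytes:
    """Stand-in PRG: expand `seed` into `nbytes` bytes by hashing
    seed||counter. Illustrative only - not a proven PRG."""
    out = b""
    ctr = 0
    while len(out) < nbytes:
        out += hashlib.sha256(seed + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:nbytes]

def stream_encrypt(key: bytes, m: bytes) -> bytes:
    pad = prg(key, len(m))                      # pseudo-random pad, as long as m
    return bytes(a ^ b for a, b in zip(m, pad))

stream_decrypt = stream_encrypt                 # XOR with the same pad inverts itself

key = b"short shared key"
m = b"attack at dawn"
c = stream_encrypt(key, m)
assert stream_decrypt(key, c) == m
```

Note that, as in the OTP, decryption is the same XOR operation; only the bounded-length key k replaces the unbounded pad.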
In subsection 2.5.4 below, we properly define pseudo-random generators
(PRGs), using the PRG indistinguishability test, which we present in subsection
2.5.3. But first, let us introduce the ingenious concept of the indistinguishability
test, due to Alan Turing.
2.5.2 The Turing Indistinguishability Test
Intuitively, a PRG is an efficient algorithm, whose input is a binary string s,
called seed (or sometimes key); if the input is either random or pseudo-random,
then the (longer) output string is pseudo-random. In order to turn this intuitive
description into a definition, we must first clearly define the notions of ‘efficient’
and ‘pseudo-random’. We discuss these notions in the following subsections;
in this subsection, we first present the ingenious but non-trivial concept of the
indistinguishability test, which is key to the notion of pseudo-randomness - and
to many definitions in cryptography.
The first indistinguishability test was the Turing indistinguishability test,
proposed by Alan Turing in 1950, in a seminal paper [373] which laid the
foundations for artificial intelligence. Turing proposed the test, illustrated in
Figure 2.12, as a possible definition of an intelligent machine. Turing referred
to this test as the imitation game; another name often used for this test is simply
the Turing test.
Figure 2.12: The Turing Indistinguishability Test. A machine is considered
intelligent if a distinguisher (judge) cannot determine which box holds the
machine and which holds a human. Turing stipulated that communication
between the distinguisher and the boxes would only be in printed form, to avoid
what he considered ‘technical’ challenges such as voice recognition.
Many cryptographic mechanisms are defined using indistinguishability tests.
These tests are similar, in their basic concept, to the Turing indistinguishability
test. The following subsection presents the first such test, which tests for
the important property of pseudo-randomness.
2.5.3 PRG indistinguishability test
We now return to the discussion of pseudo-randomness, and define the PRG
indistinguishability test, illustrated in Figure 2.13. The PRG indistinguishability
test is similar to the Turing indistinguishability test in Figure 2.12, in the sense
that a distinguisher is asked to identify which is ‘real’ (the intelligent person in
the Turing test, and the random sequence here) and which is the ‘imitation’
(the machine in the Turing test, and the sequence output by function f here).
Intuitively, a pseudo-random generator is a function f whose input is a ‘short’
random bit string x ←$ {0,1}^n, and whose output is a longer string f(x) ∈ {0,1}^{l_n},
s.t. l_n > n, which is pseudo-random - i.e., indistinguishable from a random
string (of the same length l_n).
But what does it mean for the output to be indistinguishable from random?
This is defined by the PRG indistinguishability test, which we next define
and which, in concept, quite resembles the Turing indistinguishability test,
although the details are different. The similarity can be seen by comparing
the illustration of the PRG indistinguishability test in Figure 2.13 to that of
the Turing test in Figure 2.12.
Figure 2.13: Intuition for the Pseudo-Random Generator (PRG) Indistinguishability
Test. Intuitively, f : {0,1}^∗ → {0,1}^∗ is a (secure) pseudo-random
generator (PRG), if an efficient distinguisher D can’t effectively distinguish
between f(x), for a random input x, and a random string of the same length
(|f(x)|), where |f(x)| > |x|.
In order to turn this intuition into a definition of a (secure) Pseudo-Random
Generator (PRG), we must specify precisely the capabilities of the distinguisher
and the criteria for the outcome of the experiment, i.e., when we would say that
f is indeed a (good/secure) PRG. We discuss these two aspects in the
following subsection, where we finally present definitions for a (secure) PRG.
2.5.4 Defining Secure Pseudo-Random Generator (PRG)
We now finally define a (secure) pseudo-random generator (PRG). We first define
the distinguisher capabilities; next, we define the advantage ε^{PRG}_{D,f}(n) of D for
function f and inputs of length n; and finally we define a (secure) PRG.
Distinguisher capabilities. We model the distinguisher as an algorithm,
denoted D, which receives a binary string - either a random string or the
‘pseudo-random’ output of the PRG f - and outputs its evaluation, which should
be 0 if given a truly random string, and 1 otherwise, i.e., if the input is not truly
random. The distinguisher algorithm D has to be efficient (or PPT). The terms
efficient algorithm and PPT (Probabilistic Polynomial Time) algorithm are
crucial to definitions of asymptotic security, which we use in this textbook; see
Section A.1.
PRG: the advantage of D for f. Before we define the criteria for a function
f to be considered a (secure) Pseudo-Random Generator (PRG), we notice that
by simply guessing randomly, the distinguisher may succeed with probability
1/2. Namely, succeeding with probability 1/2 does not imply a vulnerability, and
should not result in any advantage for D.
We therefore define the advantage ε^{PRG}_{D,f}(n) of D for function f as the
probability that D outputs 1 (correctly) when given the output of the pseudo-random
function f(x), for random input x, minus the probability that D outputs 1
(incorrectly) when given a truly random string r. As required, this gives no
advantage to a distinguisher which simply guesses the bit. However, this introduces
a challenge: how should we choose x and r? We solve this challenge with
the following assumption on f.
Length-uniform assumption. We simplify the definitions by assuming that
f is a length-uniform function, i.e., for every input of length n, the output is
of the same length l_n.
We can now present the definition of the advantage ε^{PRG}_{D,f}(n): the probability
that D correctly outputs 1 when given f(x) for a random n-bit input x ←$ {0,1}^n,
minus the probability that D incorrectly outputs 1 when given a random l_n-bit
string r.

Definition 2.4. Let f : {0,1}^∗ → {0,1}^∗ be a length-uniform function, i.e., if
|x| = n then |f(x)| = l_n, and let D be an algorithm. The PRG-advantage of D
for f is denoted ε^{PRG}_{D,f}(n) and defined as:

ε^{PRG}_{D,f}(n) ≡ Pr_{x←{0,1}^n}[D(f(x)) = 1] − Pr_{r←{0,1}^{l_n}}[D(r) = 1]        (2.14)
The probabilities in Equation 2.14 are computed over the uniformly-random
n-bit binary string x (seed), the uniformly-random l_n = |f(1^n)|-bit binary string
r, and the uniformly-random coin tosses of the distinguisher D, if D uses random
bits. Note that the length of the output of f depends only on the length of the
input; hence, our use of l_n.
Definition of Secure PRG. Finally, let us define a secure PRG. The definition
assumes both the PRG and the distinguisher D are efficient (PPT), i.e., their
running time is bounded by a polynomial in the input length. Note that the
PRG must be deterministic, but the distinguisher D may be probabilistic
(randomized).
Definition 2.5 (Secure Pseudo-Random Generator (PRG)). A length-uniform
function f : {0,1}^∗ → {0,1}^∗, s.t. (∀x ∈ {0,1}^n) l_n = |f(x)|, is a secure
Pseudo-Random Generator (PRG), if it is efficiently-computable (f ∈ PPT),
length-increasing (l_n > n) and ensures indistinguishability, i.e., for every
distinguisher D ∈ PPT, the advantage of D for f is negligible, i.e., ε^{PRG}_{D,f}(n) ∈
NEGL, where ε^{PRG}_{D,f}(n) is defined as in Equation 2.14.
The term ‘secure’ is often omitted; i.e., when we simply say that algorithm
f is a pseudo-random generator (PRG), this implies that it is a secure PRG.
Exercise 2.7. Let x ∈ {0,1}^n. Show that the following are not PRGs: (a)
f_a(x) = 3x mod 2^{n+2} (using standard binary encoding), (b) f_b(x) = 3x
mod 2^{n+1} (similarly), and (c) f_c(x) = x ++ parity(x), where parity(x) returns 1
if the number of 1 bits in x is odd and 0 if it is even.
Solution for part (a): Notice that here we view x as a number encoded in
binary, whose value can be between 0 and 2^n − 1.
A simple distinguisher D_a for f_a is: D_a(y) outputs 1 (i.e., pseudo-random)
if y mod 3 = 0; otherwise, it outputs 0 (i.e., random). Let us show why this
distinguisher has significant advantage.
First notice that if y = f_a(x), then D_a(y) outputs 1 (correctly), for every x ∈
{0,1}^n. This holds since 3x < 2^{n+2}, and hence y = f_a(x) = 3x mod 2^{n+2} =
3x. Namely, D_a(y) = D_a(3x) = 1, by definition of D_a.
It remains to show that the probability that D_a(r) = 1, for r ←$ {0,1}^{n+2},
is significantly less than 1. If 2^{n+2} mod 3 = 1, this probability is exactly a third;
otherwise, the probability is only 2^{−(n+2)} higher. In either case, the probability
is definitely much less than 1!
Therefore, ε^{PRG}_{D_a,f_a}(n) ≥ 1 − (1/3 + 2^{−(n+2)}) > 1/2, i.e., is clearly non-negligible.
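The distinguisher D_a and its advantage are easy to check empirically (Python sketch; the choice of n, the seed and the trial counts are arbitrary):

```python
import random

def f_a(x: int, n: int) -> int:
    return (3 * x) % (2 ** (n + 2))    # equals 3x, since 3x < 2^(n+2)

def D_a(y: int) -> int:
    return 1 if y % 3 == 0 else 0      # 1 = 'pseudo-random', 0 = 'random'

random.seed(1)
n, trials = 16, 10000
# D_a always outputs 1 on outputs of f_a:
assert all(D_a(f_a(random.randrange(2 ** n), n)) == 1 for _ in range(1000))
# ...but outputs 1 on only about a third of random (n+2)-bit strings:
hits = sum(D_a(random.randrange(2 ** (n + 2))) for _ in range(trials))
assert 0.25 < hits / trials < 0.42     # close to 1/3, far below 1
```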
2.5.5 Secure PRG Constructions
Note that we did not present a construction of a secure PRG. In fact, if we
could have presented a provably-secure construction of a PRG, satisfying
Def. 2.5, this would have immediately proven that P ̸= NP, solving one of the most
well-known open problems in the theory of complexity. Put differently, if
P = NP, then there cannot be any secure PRG algorithm (satisfying Def. 2.5).
Since resolving the P =? NP question is believed to be very hard, proving that a given
construction is a (secure) PRG must also be a very hard problem, unlikely
to be solved as a side-product of analyzing some candidate function. Similar
arguments apply to most of the cryptographic mechanisms we will learn in this
book, including secure encryption, when messages may be longer than the key.
(The one time pad (OTP) is a secure encryption scheme when used with a key
which is at least as long as the plaintext.)
What is possible is to present a reduction-based construction of a PRG,
namely, a construction of a PRG from some other cryptographic mechanism, along
with a proof that the PRG is secure if that other mechanism is ‘secure’. For
example, see [165] for a construction of a PRG f from a different cryptographic
mechanism called a one-way function (which we discuss in Section 3.4), and a
reduction proof, showing that if the construction of f uses a OWF fOWF, then the
resulting function f would be a PRG. We will also present a few reduction proofs;
for example, later in this section we prove reductions which construct a PRG
from other cryptographic mechanisms such as a pseudo-random function (PRF),
see Exercise 2.13, and a block cipher. Courses, books and papers dealing with
cryptography are full of reduction proofs, e.g., see [165, 166, 370].
Unfortunately, there is no proof of the existence of any of these - one-way
functions, PRFs, block ciphers or most other cryptographic schemes. Indeed, such
proofs would imply P ̸= NP. Still, reduction proofs are the main method
of ensuring the security of most cryptographic mechanisms - by showing that
they are ‘at least as secure’ as another cryptographic mechanism, typically
a mechanism whose security is well established (e.g., by the failure of extensive
cryptanalysis efforts).
For example, it seems ‘easier’ to design a one-way function than a PRG. If so,
then we could obtain a PRG by applying, to a given one-way function, a construction
of a PRG from a one-way function. As a more practical example, block ciphers
are standardized, with lots of cryptanalysis efforts; therefore, block ciphers are
a good basis for building other cryptographic functions.
Let us give an important example of a reduction-based proof which is specific
to PRGs. This is a construction of a PRG whose output is significantly larger
than its input, from a PRG whose output is only one bit longer than its input.
Unfortunately, the construction and proof are beyond our scope; see [165].
However, the following exercise (Exercise 2.8) proves a related - albeit much
simpler - reduction, showing that a PRG G from n bits to n + 1 bits also gives
a PRG G′ from n + 1 bits to n + 2 bits, simply by exposing one bit. In other
words, this shows that a PRG may expose one (or more) bits - but remain a
PRG.
Exercise 2.8. Let f : {0,1}^n → {0,1}^{n+1} be a secure PRG. Is f′ : {0,1}^{n+1} →
{0,1}^{n+2}, defined as f′(b ++ x) = b ++ f(x), where b ∈ {0,1}, also a secure
PRG?
Solution: Yes, if f is a PRG then f′ is also a PRG. First, recall the PRG-advantage
(Equation 2.14) for distinguisher D, using l_n = n + 1:

ε^{PRG}_{D,f}(n) ≡ Pr_{x←{0,1}^n}[D(f(x)) = 1] − Pr_{r←{0,1}^{n+1}}[D(r) = 1]        (2.15)

Next, rewrite Equation 2.14 for f′ and a distinguisher D′, by substituting f′
for f, x′ for x, n + 1 for n and n + 2 for l_n:

ε^{PRG}_{D′,f′}(n + 1) ≡ Pr_{x′←{0,1}^{n+1}}[D′(f′(x′)) = 1] − Pr_{r′←{0,1}^{n+2}}[D′(r′) = 1]        (2.16)

We next present a simple construction of a distinguisher D (for f), using,
as a subroutine (oracle), a given distinguisher D′ (for f′):

D(y) ≡ { Return D′(b ++ y), where b ←$ {0,1} }        (2.17)

Clearly D is efficient (PPT) if and only if D′ is efficient (PPT).
We prove that ε^{PRG}_{D′,f′}(n + 1) = ε^{PRG}_{D,f}(n); therefore, f′ is a PRG if and only
if f is a PRG. We begin by developing the first component of Equation 2.15:

Pr_{x←{0,1}^n}[D(f(x)) = 1] = Pr_{x←{0,1}^n, b←{0,1}}[D′(b ++ f(x)) = 1]        (2.18)
 = Pr_{x←{0,1}^n, b←{0,1}}[D′(f′(b ++ x)) = 1]        (2.19)
 = Pr_{x′←{0,1}^{n+1}}[D′(f′(x′)) = 1]        (2.20)

We now develop the other component of Equation 2.15:

Pr_{r←{0,1}^{n+1}}[D(r) = 1] = Pr_{r←{0,1}^{n+1}, b←{0,1}}[D′(b ++ r) = 1]        (2.21)
 = Pr_{r′←{0,1}^{n+2}}[D′(r′) = 1]        (2.22)

Now substitute the two components in Equation 2.15:

ε^{PRG}_{D,f}(n) ≡ Pr_{x←{0,1}^n}[D(f(x)) = 1] − Pr_{r←{0,1}^{n+1}}[D(r) = 1]        (2.23)
 = Pr_{x′←{0,1}^{n+1}}[D′(f′(x′)) = 1] − Pr_{r′←{0,1}^{n+2}}[D′(r′) = 1]        (2.24)
 ≡ ε^{PRG}_{D′,f′}(n + 1)        (2.25)

Hence, ε^{PRG}_{D,f}(n) = ε^{PRG}_{D′,f′}(n + 1); namely, f′ is a PRG if and only if f is a
PRG.
Feedback Shift Registers (FSR). There are many proposed designs for
PRGs. Many of these are based on Feedback Shift Registers (FSRs), with
a known linear or non-linear feedback function f, as illustrated in Fig. 2.14.
For Linear Feedback Shift Registers (LFSRs), the feedback function f is
simply the XOR of some of the bits of the register. Given the value of the
initial bits r_1, r_2, . . . , r_l of an FSR, the value of the next bit r_{l+1} is defined
as r_{l+1} = f(r_1, . . . , r_l); and following bits are defined similarly: (∀i > l) r_i =
f(r_{i−l}, . . . , r_{i−1}).
FSRs are well-studied and have many desirable properties. However, by definition,
their state is part of their output. Hence, they cannot directly be used as
cryptographic PRGs. One solution is to define another function g over the
bits of the register, which outputs one or more bits which should hopefully be
pseudo-random. The following exercise gives some examples.
Figure 2.14: Feedback Shift Register, with (linear or non-linear) feedback
function f().
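The FSR mechanism, specialized to a toy LFSR, can be sketched as follows (Python sketch; the register length and tap positions are our own illustrative choices):

```python
def fsr_stream(init_bits, taps, nbits):
    """Feedback shift register: the next bit is f(r_1, ..., r_l); for an
    LFSR, f is the XOR of the tapped bits."""
    r = list(init_bits)
    out = []
    for _ in range(nbits):
        out.append(r[0])          # the state leaks directly into the output
        fb = 0
        for t in taps:            # linear feedback: XOR of selected bits
            fb ^= r[t]
        r = r[1:] + [fb]
    return out

# A 4-bit LFSR with feedback r[0] XOR r[1] (characteristic polynomial
# x^4 + x + 1, which is primitive): period 15 for any nonzero initial state.
ks = fsr_stream([1, 0, 0, 1], taps=[0, 1], nbits=30)
assert ks[:15] == ks[15:30]       # the output repeats with period 15
```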
Exercise 2.9. For each of the following pairs of functions, show why they are
not a secure PRG:
1. f(r_1, . . . , r_l) = Σ_{i=1}^{l} r_i mod 2, g(r_1, . . . , r_l) = r_1. Note: this is an
LFSR.
2. f(r_1, . . . , r_l) = Π_{i=1}^{l} r_i; and any g.
There are also many other designs of PRGs based on Feedback Shift Registers
(FSRs), often combining multiple FSRs (often LFSRs) in different ways; one
reason is that FSRs are convenient for efficient hardware implementations. Let
us consider the case of GSM encryption.
GSM (Global System for Mobile Communications) is considered the second-generation
(2G) cellular network technology. It was first deployed in Finland
in 1991 and quickly became the dominant cellular standard worldwide, and
is still used alongside later-generation cellular technologies. Security was a
major concern to the GSM designers, and they defined two PRGs, A5/1 and
A5/2, both combining three LFSRs. In the hope of preventing cryptanalysis,
the design of A5/1 and A5/2 was kept secret, contrary to Kerckhoffs’
principle (Principle 2). This decision was a mistake; the design was reverse-engineered
and made public quite quickly, and quite soon A5/2, and then also
A5/1, were broken. The details are beyond our scope; see, e.g., [26].
Other PRG designs exist, e.g., RC4, which is designed for convenient
software implementation. Let us briefly discuss the vulnerabilities found
in RC4, and their potential impact.
2.5.6 RC4: Vulnerabilities and Attacks
The RC4 PRG design features simplicity and good efficiency (including for
software implementation). This design has been publicly available since its
anonymous, unofficial disclosure in September 1994 [393]. It was therefore
adopted by several standards, including the WEP and WPA wireless-LAN
standards (Section 2.10) and the SSL and TLS protocols (Chapter 7), which
further increased the cryptanalytical efforts to identify exploitable weaknesses.
Several works have, in fact, found vulnerabilities in RC4.
Describing the details of RC4’s design and the cause of the vulnerabilities
is beyond our scope. However, let us briefly discuss the impact of the first
major reported vulnerability, the Mantin-Shamir attack on RC4 [276], as we
find this vulnerability and its history to be instructive in several ways. We also
discuss another important vulnerability due to incorrect usage of RC4, which is
not due to a weakness of RC4 at all - the same problem would arise when
incorrectly using any PRG. Such incorrect-usage vulnerabilities are even more
common than vulnerabilities due to cryptanalysis attacks, which is one reason
that it is so important to understand the exact definition and security goals of
cryptographic mechanisms.
The Mantin-Shamir and other RC4 vulnerabilities. Since the output of a PRG should be indistinguishable from random, any detectable difference between the output of the PRG and the uniformly random distribution is a failure of the PRG. Under the theoretical definition, this holds for any computable difference, regardless of the ability to exploit it in a specific practical attack; for example, it suffices that some efficiently computable function f has a different distribution when applied to specific output bits of RC4, compared to its distribution on random bits. However, many PRG weaknesses, and in particular RC4 weaknesses, are simpler: a bias of a particular bit or byte of the RC4 output.
In particular, for a random byte sequence, the probability of any byte in the sequence having a particular value should be exactly 2^-8, since all 2^8 byte values should be equally likely. However, in 2001, Mantin and Shamir found that given a random seed/key, the second output byte of RC4, denoted Z2, has an observable bias from random. Specifically, they found that:

    Pr(Z2 = 0^8) ≈ 2^-7        (2.26)
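This bias is easy to observe empirically. The following sketch implements the (publicly documented) RC4 key scheduling and output generation, and estimates Pr(Z2 = 0^8) over many random keys; the helper names and the number of trials are our own choices, not part of any standard.

```python
import os

def rc4_keystream(key: bytes, nbytes: int) -> bytes:
    # Key-scheduling algorithm (KSA).
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    # Pseudo-random generation algorithm (PRGA).
    out, i, j = [], 0, 0
    for _ in range(nbytes):
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return bytes(out)

def second_byte_zero_rate(trials: int) -> float:
    # Estimate Pr[Z2 = 0^8] over random 128-bit keys.
    hits = sum(rc4_keystream(os.urandom(16), 2)[1] == 0
               for _ in range(trials))
    return hits / trials

rate = second_byte_zero_rate(20000)
print(f"Pr[Z2=0] ~ {rate:.4f}; 2^-8 = {2**-8:.4f}, 2^-7 = {2**-7:.4f}")
```

Running this, the measured frequency clusters around 2^-7, roughly double what a random byte sequence would give.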
Clearly, this shows that RC4 does not fulfill our definition of a secure PRG: its output can be efficiently distinguished from a random bit sequence. Based on the conservative design principle (Principle 3), this should have led to the use of alternative mechanisms - at least a significantly enhanced RC4 security mechanism, or a completely different design. However, that did not happen; in spite of this and additional vulnerabilities found, the use of RC4 continued and even increased over more than a decade, with adoption by new standards such as TLS and WPA, in addition to its use by older standards such as SSL and WEP.
Apparently, this vulnerability appeared too minor to be a major concern and to stop the use of RC4 - which shows the difficulty of adopting the conservative design principle in practice. Really, can we justify the principle? Can such an apparently-minor vulnerability be exploited in a realistic attack? The answer to both questions is a resounding yes!
CHAPTER 2. CONFIDENTIALITY: ENCRYPTION SCHEMES AND
PSEUDO-RANDOMNESS
First, when we find one vulnerability, we should assume more are probably lurking, possibly already known and exploited by some powerful attacker; we had better change to a more secure design. In the case of RC4, this proved to be the case. Multiple additional vulnerabilities were discovered over the years, beginning with another important vulnerability in the same year [147]; see some of them in [235]. Even these did not stop the use of RC4, until the effective attack on TLS of [11] (see subsection 7.2.5). It is widely believed that some attackers had already exploited attacks against RC4 for years before it was finally abandoned. Some products still use RC4, at least for 'backward compatibility', which is often vulnerable to downgrade attacks; see subsection 5.6.3 and Section 7.5.
Second, the Mantin-Shamir attack can be abused in some applications and scenarios. Specifically, in some applications, the same secret x is encrypted using RC4 many times, using different seed values s1, s2, .... This scenario can occur in practice, e.g., when the SSL or TLS protocol is used to secure communication between browser and website; see subsection 7.2.5. Let us show how an attacker can, in such cases, find the second byte x[2] of the secret x, if x is encrypted using RC4 with seeds s1, s2, ....
Let Z2(si) denote the second byte of the output of RC4 when initialized with seed si. Let us focus on the (vulnerable) second byte. The encryption of x[2] using seed si is the ciphertext ci[2] = x[2] ⊕ Z2(si). Hence, while the probability of most values of this ciphertext byte is about 2^-8, from Equation 2.26 we have Pr(ci[2] = x[2]) ≈ 2^-7.
Therefore, the second byte of the secret x would simply be the most common second byte of the ciphertexts - by a large margin. Namely, the second byte of a plaintext encrypted by RC4 can be exposed by a simple, efficient and effective ciphertext-only (CTO) attack, if the attacker can obtain a reasonable number of ciphertexts (a few hundred at most). And if this is not sufficiently convincing, see the improved, practical attack on TLS in subsection 7.2.5.
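The majority-vote attack can be sketched in a few lines. To keep the example fast, we simulate the Mantin-Shamir bias directly (a second pad byte equal to 0^8 with extra probability 2^-7) instead of running full RC4, and use a generous number of ciphertexts; the function names are ours.

```python
import random
from collections import Counter

def biased_second_pad_byte(rng: random.Random) -> int:
    # Simulated Mantin-Shamir bias: return 0 with extra probability 2^-7,
    # otherwise a uniformly random byte.
    if rng.random() < 2**-7:
        return 0
    return rng.randrange(256)

def recover_second_byte(secret_byte: int, n_ciphertexts: int = 20000) -> int:
    rng = random.Random(1)
    # Each observed ciphertext byte is c_i[2] = x[2] XOR Z2(s_i).
    counts = Counter(secret_byte ^ biased_second_pad_byte(rng)
                     for _ in range(n_ciphertexts))
    # The bias of Z2 toward zero makes x[2] the most common value.
    return counts.most_common(1)[0][0]

recovered = recover_second_byte(0x42)
print(hex(recovered))
```

The most common ciphertext byte is exactly the secret byte, since it is the one masked by the biased pad value 0^8.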
Several variants of RC4 were proposed to defend against this and other attacks. The simplest variant simply discards some initial portion of the output; this is known as RC4-dropN, where N is the number of initial output bytes discarded, e.g., 256 or 1024. RC4-dropN clearly avoids the specific Mantin-Shamir attack, but may fail against other attacks, most notably the attack against the use of RC4 by TLS [11]; see subsection 7.2.5.
Vulnerabilities due to incorrect use of RC4 (or any PRG). While PRGs, and other cryptographic schemes, can be vulnerable, many security failures are due to incorrect, vulnerable usage of cryptographic schemes. Let us give an example of an attack against a vulnerable deployment of a PRG in a system. Specifically, we present an attack against MS-Word 2002, which exploits a vulnerability in the usage of the RC4 PRG, rather than in its design. Namely, this attack could have been carried out if any PRG were used in the same (incorrect and vulnerable) way as RC4 was used by MS-Word 2002.
Example 2.5. MS-Word 2002 used RC4 for document encryption, in the following way. The user provided a password for the document; that password was
used as a key to the RC4 PRG, producing a long pseudo-random string referred to as Pad, i.e., Pad = RC4(password). When the document is saved to or restored from storage, it is XORed with Pad. This design is vulnerable; can you spot why?
The vulnerability is not specific in any way to the choice of RC4; the problem is in how it is used. Namely, this design re-uses the same pad whenever the document is modified - a 'multi-time pad' rather than an OTP. For example, suppose that the document is changed by adding one letter x, say in position i. Clearly, it is possible to find i, given only the two ciphertexts.
Furthermore, let c[i] be the original ciphertext in position i, and c′[i] be the ciphertext in position i after the insertion of x; and let y be the i-th plaintext character before the insertion, which then moves to be the (i+1)-th character. Then c[i] = y ⊕ pad[i] and c′[i] = x ⊕ pad[i]. By XORing the two equations, we have: c[i] ⊕ c′[i] = (y ⊕ pad[i]) ⊕ (x ⊕ pad[i]) = y ⊕ x. Hence, knowing x gives y and vice versa; and if neither is known, the attacker still learns y ⊕ x, which gives some information on the plaintext.
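A few lines of code illustrate the insertion attack. This is a toy demonstration: the pad is generated by a standard PRG stand-in (SHAKE-128), since, as noted, the specific PRG is irrelevant to the flaw.

```python
import hashlib

def pad_from_password(password: bytes, length: int) -> bytes:
    # Stand-in PRG expanding the password into a pseudo-random pad.
    # (Any PRG used this way - including RC4 - has the same flaw.)
    return hashlib.shake_128(password).digest(length)

def encrypt(doc: bytes, password: bytes) -> bytes:
    pad = pad_from_password(password, len(doc))
    return bytes(d ^ p for d, p in zip(doc, pad))

doc1 = b"attack at dawn"
i = 7                         # insert one letter x at position i
x = ord("X")
doc2 = doc1[:i] + bytes([x]) + doc1[i:]

c1 = encrypt(doc1, b"hunter2")
c2 = encrypt(doc2, b"hunter2")   # same password => same pad re-used!

# c1[i] = y XOR pad[i] and c2[i] = x XOR pad[i], so c1[i] XOR c2[i] = y XOR x.
y = c1[i] ^ c2[i] ^ x
print(chr(y))                    # the original character at position i
```

Knowing the inserted character x, the attacker recovers the plaintext character y without ever learning the pad.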
If these exposures do not yet convince the reader of the insecurity of this design, see the details of a complete, practical plaintext-recovery attack in [278]. The fact that the vulnerability is due to the use of RC4 rather than to cryptanalysis of RC4 is very typical of vulnerabilities in systems involving cryptography. In fact, cryptanalysis is rarely the cause of vulnerabilities - system, configuration and software vulnerabilities are more common.
2.5.7 Random functions
One practical drawback of stream ciphers is the fact that they require state, to remember how many bits (or bytes) were already output. What happens if state is lost? Can we eliminate or reduce the use of state? It would be great to allow recovery from loss of state, or to avoid the need to preserve state when encryption is not used, e.g., between one message and the next. In the next section, we introduce another pseudo-random cryptographic mechanism, called a pseudorandom function (PRF), which has many applications in cryptography - including stateless, randomized shared-key cryptosystems. However, before we introduce pseudo-random functions, let us first discuss 'real' random functions.
Selecting a random function. How can we select a random function from a domain D to a range R? One way is as follows: for each input x ∈ D, select a random element in R to be f(x), namely:

    (∀x ∈ D) f(x) ←$ R        (2.27)

This process can be done manually for a small domain and range, by randomly choosing the mapping and writing it in a table, e.g., as in Table 2.2 and
Exercise 2.10, where we focus on the typical case where both D and R are sets of binary strings of a specific length, i.e., for some integers n, m, we have D = {0,1}^n, R = {0,1}^m.
Function | Domain  | Range   | f(00) | f(01) | f(10) | f(11)
f1       | {0,1}^2 | {0,1}   |       |       |       |
f2       | {0,1}^2 | {0,1}^3 |       |       |       |
(fill in each empty entry by coin-flips)

Table 2.2: Do-it-yourself table for selecting f1, f2 randomly, in Exercise 2.10.
Exercise 2.10. Using a coin, randomly select the functions below; count your coin flips.

1. f1: {0,1}^2 → {0,1} (use a copy of Table 2.2)
2. f2: {0,1}^2 → {0,1}^3 (use a copy of Table 2.2)
3. f3: {0,1}^3 → {0,1}^2 (create your own table)

How many coin flips were required for each function? For each of the functions, what is the probability that all its output bits are zero? And of all output bits being 1?
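The coin-flipping procedure of the exercise can be mimicked in code: build the full table with one random bit per output bit, counting the 2^n · m 'flips'. This is a sketch; the function and variable names are ours.

```python
import random

def select_random_function(n: int, m: int, rng: random.Random):
    # Choose f: {0,1}^n -> {0,1}^m by filling in the whole table,
    # one coin flip per output bit: 2^n * m flips in total.
    flips = 0
    table = {}
    for x in range(2**n):                  # each domain element
        bits = []
        for _ in range(m):                 # each output bit
            bits.append(rng.randrange(2))  # one coin flip
            flips += 1
        table[x] = tuple(bits)
    return table, flips

rng = random.Random(0)
f1, flips1 = select_random_function(2, 1, rng)   # 2^2 * 1 = 4 flips
f2, flips2 = select_random_function(2, 3, rng)   # 2^2 * 3 = 12 flips
f3, flips3 = select_random_function(3, 2, rng)   # 2^3 * 2 = 16 flips
print(flips1, flips2, flips3)
```

Since every output bit is an independent coin flip, the probability that all output bits of f1 are zero is 2^-4, for f2 it is 2^-12, and for f3 it is 2^-16 (and the same holds for all output bits being 1).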
The exercise is limited to very small values (n = 2 and m ∈ {1,3}), since the number of coin-flips required is 2^n · m, i.e., grows exponentially as a function of n (and linearly in m). The total number of functions grows even more rapidly; let F_{D,R} denote the set of all functions from a finite domain D to a finite range R (i.e., F_{D,R} ≡ {D → R}). A function f ∈ F_{D,R} maps each element in D to an element in R; hence, F_{D,R} is a finite set. More specifically, we can map each element in D to any element in R. The total number of functions is, therefore, |R|^|D|, and each specific function is selected with probability |R|^(-|D|). In the typical case where D is the set of n-bit strings and R is the set of m-bit strings, there are 2^n elements in D and 2^m elements in R, i.e., |D| = 2^n and |R| = 2^m. The total number of functions from D = {0,1}^n to R = {0,1}^m is, therefore, |F_{D,R}| = (2^m)^(2^n) = 2^(m·2^n), i.e., superexponential in n.
We see that selecting and storing a function from a large domain (and to a large range) would be difficult, as would be sending it - which is required if we want multiple parties to use the same random function. This is a pity; a shared random function can be very useful for cryptographic applications, e.g., it can be used to select random keys, or to implement a stateless stream cipher. This motivates the use of pseudorandom functions, which we discuss in the next subsection. However, let us first discuss such applications of random functions, as well as the relevant security and performance considerations.
Stream cipher using a random function. Figure 2.15a presents the design
of a stream cipher using a randomly-chosen function f which is shared by the
two parties and kept secret. The design could be used either for bit-by-bit
encryption, with the random function mapping each input i to a single bit f(i), which is then XORed with the corresponding message bit m[i] to form the ciphertext c[i] = m[i] ⊕ f(i). Alternatively, both the input messages m[i] and the output of the random function could be strings of some length, e.g., n, and then each invocation of the random function will produce n pad bits, XORed with n message bits to produce n ciphertext bits.
[Figure 2.15: Bit-wise encryption using a random function f(·) with one-bit output, stateful (a) and randomized (b). (a) A stream cipher for stateful encryption, using limited storage (for a counter i); communication-optimal, i.e., |ciphertext| = |plaintext|: c[i] = m[i] ⊕ f(i). (b) Stateless, randomized encryption, with ri ←$ {0,1}^n and high communication overhead (n random bits per plaintext bit): c[i] = (m[i] ⊕ f(ri), ri).]
One drawback of the use of stream ciphers is the need to maintain synchronized state between sender and recipient. This refers to the (typical) case where the input is broken into multiple messages, each provided in a separate call to the encryption device. To encrypt all of these messages using a stream cipher - OTP or the design in Figure 2.15a - the two parties must maintain the number of calls i (bits or strings of fixed length). To avoid this requirement, we can use randomized encryption, as we next explain.
Stateless, randomized encryption using a random function f. An even more interesting application of random functions is to avoid the need for the two parties to maintain state (the message/bit counter i). To do this, we use the random function to construct randomized encryption, as shown in Figure 2.15b. Here, again, we use a random function f which outputs a single bit.

To encrypt each plaintext bit m[i], we choose a string ri of n random bits, i.e., ri ←$ {0,1}^n. The ciphertext c[i] corresponding to plaintext bit m[i] is the pair (m[i] ⊕ f(ri), ri).
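This randomized scheme can be sketched as follows. For convenience, the shared random function f is sampled lazily - its table is filled in only when a new input first appears, an implementation trick justified by the security argument below; the class and function names are ours, and in reality both parties would need to hold the same (huge) table.

```python
import os
import random

class SharedRandomFunction:
    """A random function f: {0,1}^n -> {0,1}, sampled lazily:
    each output bit is chosen at random the first time its input is seen."""
    def __init__(self, n: int, rng: random.Random):
        self.n, self.rng, self.table = n, rng, {}
    def __call__(self, r: bytes) -> int:
        if r not in self.table:
            self.table[r] = self.rng.randrange(2)
        return self.table[r]

def encrypt_bit(f: SharedRandomFunction, m_bit: int) -> tuple:
    r = os.urandom(f.n // 8)        # fresh random n-bit string r_i
    return (m_bit ^ f(r), r)        # c[i] = (m[i] XOR f(r_i), r_i)

def decrypt_bit(f: SharedRandomFunction, c: tuple) -> int:
    masked, r = c
    return masked ^ f(r)            # the recipient shares the same f

f = SharedRandomFunction(n=64, rng=random.Random(7))
bits = [1, 0, 1, 1, 0]
cts = [encrypt_bit(f, b) for b in bits]
print([decrypt_bit(f, c) for c in cts])   # -> [1, 0, 1, 1, 0]
```

Note the overhead the figure warns about: each encrypted bit carries n fresh random bits.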
Security. As with every cryptographic mechanism, we ask: are the designs
in Figure 2.15b and Figure 2.15a secure? In this case, we assume that the two
parties share a random function f which is unknown to the adversary; basically,
the function is considered as a shared secret ‘key’.
Intuitively, the design of Figure 2.15a is secure as long as we never re-use
the same counter value. Similarly, the design of Figure 2.15b is secure as long
as we use a sufficient number of random bits. Both statements are correct; but
it isn’t trivial to understand why. Let us focus on the slightly more complex
case of randomized encryption (the design of Figure 2.15b); the argument for
the counter-based, stateful stream cipher design (Figure 2.15a) follows similarly.
An obvious concern is that an attacker may try to predict the value of f(ri) used to encrypt a message (or bit) m[i], from previously-observed ciphertexts {c[j]}_{j<i}. Let us assume, to be safe, that the attacker knows all the corresponding plaintexts m[j], allowing the attacker to find all the corresponding mappings {f(rj)}_{j<i}. Using this information, can the attacker guess f(ri)?
It is possible that ri is the same as one of the previously-used random values, i.e., ri = rj for some j < i. In this case, the attacker has already received c[j]; and since we assumed that the attacker knows m[j], it follows that the attacker can expose m[i] by computing:

    m[i] = c[i] ⊕ f(ri) = c[i] ⊕ f(rj) = c[i] ⊕ (m[j] ⊕ c[j])        (2.28)
Therefore, this case should be avoided, which is easy to do by selecting sufficiently long random strings ri, making it very unlikely that we will repeat the same value. Consider, therefore, the case where ri ∉ {rj}_{j<i}. Reconsider the process of selecting a random function, as in Exercise 2.10. What we did was to select the entire table - mapping every element in the domain to a random element in the range - before we applied the random function. However, notice that it does not matter if, instead, we choose the mapping for each element ri ∈ D in the domain only the first time we need to compute f(ri). Think it over!
This means that if ri ∉ {rj}_{j<i}, then the attacker does not learn anything about f(ri), even if the attacker is given all of the 'previous' {f(rj)}_{j<i} values. Until we select (randomly) the value of f(ri), the attacker cannot know anything about it. Therefore, the only concern we have is with the case that ri ∈ {rj}_{j<i}.

Let us return to this issue; what is the probability of that happening? Well, since each mapping is selected randomly, it is simply (i−1)/|D|. Focusing on the typical case where the input domain is {0,1}^n, this is (i−1)/2^n. Therefore, if n is 'sufficiently large', then the maximal number of observations i by the attacker would still be negligible compared to 2^n - and i/2^n would be negligible. For example, if the attacker can observe a million encryptions, we 'just' need 2^n to be way larger than one million; and considering that a million is less than 2^20, using any n significantly larger than 20 seems safe enough.
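For concreteness, the collision probability (i−1)/2^n can be computed directly for a million observed encryptions and a few (hypothetical) choices of n:

```python
# Probability that the i-th fresh value r_i collides with an earlier one,
# for i = one million observed encryptions.
i = 10**6
probs = {n: (i - 1) / 2**n for n in (20, 40, 80)}
for n, p in probs.items():
    print(f"n = {n}: (i-1)/2^n ~ {p:.3e}")
```

With n = 20 a repeat is nearly certain, with n = 40 it is already below one in a million, and with n = 80 it is far below any practical concern - matching the rule of thumb in the text.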
Efficiency. So the scheme is secure - provided n is ‘sufficiently large’, e.g.,
80 or more. However, is it also efficient? To implement the scheme, we need
to compute the random function f ; since we want a recipient to decipher
our messages, we need to compute and send all of f before we begin sending
ciphertexts. However, this requires us to flip and (securely) share 2^n bits - for n = 80 (or more). Unfortunately, that's infeasible. Fortunately, there is an alternative, efficient solution: use a pseudorandom function (PRF) instead of the random function, providing an efficient solution which is still secure, albeit only against computationally-limited adversaries.
Note that there is another efficiency concern with the scheme: is it really necessary to send a new random string for each bit? Of course not. We can address this concern in two ways:

Large range R = {0,1}^l: this allows us to use the same random string r or counter i to encrypt a block of l plaintext bits, by bitwise XOR of the l message bits with the corresponding l-bit output of f(r) (or f(i)). In this way, the n bits of r allow encryption of l bits of plaintext. See Figure 2.16.

Use f(r) as a seed of a PRG: if we use a sufficiently large range, a PRG could 'expand' f(r) into as many bits as required to bit-wise XOR with the plaintext: Ef(m) = (r, PRG(f(r)) ⊕ m). In this way, the n bits of r allow encryption of an arbitrarily long plaintext m - requiring n new random bits only to encrypt new plaintext, and only if the state (of the PRG) was not retained. This is essentially what is done by the Output Feedback (OFB) mode of operation, which we see later on, except that the OFB mode also implements the PRG using the PRF, instead of using two separate functions (a PRF and a PRG).
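The second option can be sketched as follows, with SHAKE-128 standing in both for the shared function f and for the PRG. This only illustrates the structure Ef(m) = (r, PRG(f(r)) ⊕ m); the stand-ins and parameter sizes are our own choices, not the book's constructions.

```python
import hashlib
import os

SECRET = os.urandom(32)            # models the shared random function f

def f(r: bytes) -> bytes:
    # Stand-in for the shared random function f(r), with a large range.
    return hashlib.shake_128(SECRET + r).digest(32)

def prg(seed: bytes, nbytes: int) -> bytes:
    # Stand-in PRG expanding the seed f(r) to the required length.
    return hashlib.shake_128(b"prg" + seed).digest(nbytes)

def encrypt(m: bytes) -> tuple:
    r = os.urandom(16)             # n fresh random bits per message
    pad = prg(f(r), len(m))
    return (r, bytes(a ^ b for a, b in zip(m, pad)))

def decrypt(c: tuple) -> bytes:
    r, masked = c
    pad = prg(f(r), len(masked))
    return bytes(a ^ b for a, b in zip(masked, pad))

msg = b"arbitrarily long plaintext; one short r suffices"
print(decrypt(encrypt(msg)) == msg)    # -> True
```

One short random string r now suffices for an arbitrarily long plaintext, as the text describes.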
[Figure 2.16: Block (n-bit) encryption using a random function f(·), using only one function application per n plaintext bits. (a) Stateful block encryption: ci = mi ⊕ f(i). (b) Stateless, randomized block encryption, with ri ←$ {0,1}^n: ci = (mi ⊕ f(ri), ri).]
2.5.8 Pseudorandom functions (PRFs)
A pseudorandom function (PRF) is an efficient substitute for a random function, which ensures similar properties while requiring the generation and sharing of only a short key. The main limitation is that PRFs are secure only against computationally-bounded adversaries.
A PRF scheme has two inputs: a secret key k and a 'message' m; we denote it as PRFk(m). Once k is fixed, the PRF becomes a function of the message only. The basic property of a PRF is that this function (PRFk(·)) is indistinguishable from a truly random function. Intuitively, this means that a PPT adversary cannot tell if it is interacting with PRFk(·) with domain D and range R, or with a random function f from D to R. Hence, PRFs can be used in many applications, providing an efficient, easily-deployable alternative to the impractical truly random functions.
For example, PRFs can be used to construct shared-key cryptosystems, as illustrated in Figure 2.17. The figure presents two designs of a cryptosystem from a PRF: stateful encryption, as a stream cipher, in Figure 2.17a, and stateless, randomized encryption, in Figure 2.17b.
[Figure 2.17: Block (n-bit) encryption using a pseudorandom function (PRF) fk(·), using only one PRF application per n plaintext bits. (a) Stateful block encryption: ci = mi ⊕ fk(i). (b) Stateless, randomized block encryption, with ri ←$ {0,1}^n: ci = (mi ⊕ fk(ri), ri).]
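The stateless design of Figure 2.17b can be sketched with HMAC-SHA256 playing the role of the PRF fk. HMAC is widely conjectured to behave as a PRF; this is an illustrative sketch under that assumption, not a security claim, and the block size and names are our own choices.

```python
import hashlib
import hmac
import os

def prf(k: bytes, x: bytes) -> bytes:
    # HMAC-SHA256 as the PRF f_k(x), with 32-byte (256-bit) output blocks.
    return hmac.new(k, x, hashlib.sha256).digest()

def encrypt_block(k: bytes, m: bytes) -> tuple:
    assert len(m) == 32                  # one 256-bit block
    r = os.urandom(32)                   # fresh random r_i
    return (bytes(a ^ b for a, b in zip(m, prf(k, r))), r)

def decrypt_block(k: bytes, c: tuple) -> bytes:
    masked, r = c
    return bytes(a ^ b for a, b in zip(masked, prf(k, r)))

k = os.urandom(32)     # a short shared key replaces the huge shared table
m = b"0123456789abcdef0123456789abcdef"
print(decrypt_block(k, encrypt_block(k, m)) == m)   # -> True
```

The point of the construction is visible in the code: the parties share only the short key k, yet obtain the functionality of the (infeasible) shared random function.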
Both designs simply use a PRF instead of the random function used in the corresponding designs in Figure 2.15. The security of the PRF designs follows from the security of the corresponding random-function-based designs - and from the indistinguishability of a PRF from a random function. Indeed, this is one case of a very useful technique, which we refer to as the random function design principle.

Principle 6 (Random function design method). Design cryptographic protocols and mechanisms using a random function, to make the security analysis easier.
Once proven secure, implement using a pseudorandom function; security would follow, since a PRF is indistinguishable from a random function.

Note that the random function design method requires the parties to share the secret, random key of the PRF. In Section 3.6 we present the Random Oracle Model (ROM), which allows a similar approach to be applied when such a shared secret key is not available.
We now need to finally, properly define a secure pseudorandom function (PRF); this definition reuses the oracle notation and other concepts introduced in Section A.1. In this definition, the adversary A has oracle access to one of two functions: a random function from domain D to range R, i.e., f ←$ {D → R}, or the PRF keyed with a random n-bit key, Fk, with k being a random n-bit string, i.e., k ←$ {0,1}^n. We denote these two cases by A^f and A^{Fk}, respectively. The adversary should try to distinguish between these two cases, e.g., by outputting 0 (or 'false') if given oracle access to the random function f, and outputting 1 (or 'true') if given access to the PRF Fk. The idea of the definition is illustrated in Fig. 2.18.
Figure 2.18: The pseudorandom function (PRF) indistinguishability test. We say that a function Fk(x): {0,1}^* × D → R is a (secure) pseudorandom function (PRF) if no PPT distinguisher A can efficiently distinguish between Fk(·) and a random function f from the same domain D to the same range R, when the key k is a randomly-chosen, sufficiently-long binary string.
We now finally define a pseudorandom function (PRF), Fk(x): {0,1}^* × D → R. The domain consists of the key, which we assume to be an (arbitrarily long) binary string, i.e., from the set {0,1}^*, and of an input from an arbitrary set D. (The notation {0,1}^* × D simply means a pair: a key from {0,1}^* and an element from the set D.) The scheme must allow arbitrary length for the key, since security
requirements - in this case, indistinguishability - are defined asymptotically, i.e., for sufficiently long keys; see Chapter 1.
Definition 2.6. A pseudorandom function (PRF) is a polynomial-time computable function Fk(x): {0,1}^* × D → R s.t. for all PPT algorithms A, ε_{A,F}^PRF(n) ∈ NEGL, i.e., is negligible, where the advantage ε_{A,F}^PRF(n) of the PRF F against adversary A is defined as:

    ε_{A,F}^PRF(n) ≡ | Pr_{k ←$ {0,1}^n}[ A^{Fk}(1^n) = 1 ] − Pr_{f ←$ {D→R}}[ A^f(1^n) = 1 ] |        (2.29)

The probabilities are taken over the random coin tosses of A, and the random choices of the key k ←$ {0,1}^n and of the function f ←$ {D → R}.
Overview of the PRF indistinguishability test. The basic idea of this definition is the use of an indistinguishability test, much like in the definition of a secure PRG (Definition 2.5), and even the Turing indistinguishability test (Figure 2.12). Namely, a PRF (Fk) is secure if no PPT algorithm A can have a significant advantage in identifying the pseudorandom function. We define the advantage as the probability that A outputs 1 ('true', i.e., pseudorandom) when given oracle access to the pseudorandom function Fk, minus the probability that A outputs 1 ('true', i.e., pseudorandom) when given oracle access to the random function f, where both functions are over the same domain D and range R. 'Significant' here means at least the inverse of some positive polynomial in n, the length of the key k, often referred to as the security parameter.
The oracle notation. Both A^{Fk} and A^f use the oracle notation introduced in Definition 1.3. Namely, they mean that A is given 'oracle' access to the respective function (Fk(·) or f(·)). Oracle access means that the adversary can give any input x and get back that function applied to x, i.e., Fk(x) or f(x), respectively.
Why allow arbitrary key length (security parameter)? The definition allows arbitrarily-long keys, although in practice, cryptographic standards often have a fixed key length, or only a few options. The reason that the definition allows arbitrary length is that it requires the success probability to be negligible - smaller than any inverse polynomial in the key length - which is meaningless if the key length is bounded.
Why is 1^n given as input to the adversary? A subtle, yet important, aspect of the definition is the fact that in the two calls to the adversary A, we provide the adversary with the value 1^n as input, where n is the key length (security parameter). The value 1^n simply signifies a string of n consecutive bits whose value is 1, i.e., it is the value of n encoded in unary. But why provide 1^n as input? It makes sense that the adversary should be informed of the key
length n, but why use unary encoding? Why not provide n using the 'standard' binary encoding?

To understand the reason, first recall that we focus on efficient (PPT) algorithms; namely, the running time of both the pseudorandom function F and the adversary A is bounded by a polynomial in the size of their inputs. The inputs to the PRF include the key, and hence consist of at least n bits; hence, the running time of the PRF is (at least allowed to be) polynomial in n. It is therefore 'only fair' that the running time of the adversary A is also allowed to be polynomial in n; to ensure this, we provide it with 1^n as input, similarly to our provision of the security parameter 1^l to other algorithms; see Section A.1.
Examples of secure and insecure PRFs. To clarify Definition 2.6, and to demonstrate how to show whether a given function is a PRF or not, we give and solve two exercises. The first exercise shows several insecure PRF designs; the second exercise proves that a given construction is a secure PRF.
Exercise 2.11. Examples of insecure PRF constructions:

1. Show that Fk(m) = k ⊕ m is not a secure PRF.

2. Let p be an n-bit prime number, and Fk(m) ≡ k · (m + k) mod p. Show that F is not a secure PRF (over the domain {0, ..., p − 1}).

3. Show that Fk(m) ≡ k ∨ (m^k + k^(m+k) mod 2^n) is not a secure PRF.

4. Assume that secure PRF functions exist. Given a function Fk(m), define F̂k(m) ≡ Fm(k), i.e., F̂ switches between the key and the input of F. Show that even if we know that F is secure, it is possible that F̂ is not a secure PRF.
Solutions:

1. The adversary A^g is given an oracle to a function g, and needs to output 'True' if g(·) = Fk(·) for a random key k, and 'False' if g(·) is a random function. A simple way to do this is for A^g first to make a query for g(0^n); if g(·) = Fk(·), then A receives back g(0^n) = Fk(0^n) = k ⊕ 0^n = k. So in this case (g(·) = Fk(·)), A 'knows' k; it can check whether indeed g(·) = Fk(·) (or g is a random function) by giving any other input m ≠ 0^n. If g(m) = k ⊕ m, then (with very high probability) the function is indeed g(·) = Fk(·), i.e., not a random function, and A returns 'True'; if g(m) ≠ k ⊕ m, then g is definitely not Fk(·), which means, in this case, it must be a random function, and A returns 'False'.
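This distinguisher can be sketched directly, representing n-bit strings as integers (the bit-length and names are our own choices):

```python
import random

N = 32                                    # bit-length n (hypothetical)

def make_prf_oracle(rng: random.Random):
    k = rng.getrandbits(N)
    return lambda m: k ^ m                # the insecure F_k(m) = k XOR m

def make_random_oracle(rng: random.Random):
    table = {}                            # lazily-sampled random function
    def f(m):
        if m not in table:
            table[m] = rng.getrandbits(N)
        return table[m]
    return f

def distinguisher(g) -> bool:
    # Query g(0^n); if g = F_k, this returns k. Then verify on another input.
    k_guess = g(0)
    m = 1
    return g(m) == k_guess ^ m            # True => guess 'pseudorandom'

rng = random.Random(0)
prf_votes = sum(distinguisher(make_prf_oracle(rng)) for _ in range(100))
rnd_votes = sum(distinguisher(make_random_oracle(rng)) for _ in range(100))
print(prf_votes, rnd_votes)
```

The distinguisher is always right on the F_k oracle, and wrong on a random function only with probability 2^-n, so its advantage is essentially 1.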
2. Similarly to the previous item, the adversary first gives the oracle the input 0. If the oracle is to Fk(m) ≡ k · (m + k) mod p, then the adversary receives k^2 mod p. Now, every number whose value is k^2 mod p for some integer k is called a quadratic residue. We discuss quadratic residues in subsection 6.1.8, in particular, explaining that there is an efficient
algorithm to determine whether a number is a quadratic residue or not (Claim 6.1). The adversary may use this test, and if the oracle returns a quadratic residue, the adversary assumes that the oracle is to Fk(m) ≡ k · (m + k) mod p and returns 'True'; a random function would return a non-residue with probability about one half, giving the adversary a significant advantage.
Let us see another solution, which does not require testing the output for being a quadratic residue. Observe that m + k is even if and only if either both m and k are even, or both m and k are odd. This motivates the following distinguishing adversary A^g, which outputs 'True', with high probability, if g(·) = Fk(·), and with low probability, if g is a random function. The adversary asks the oracle to compute both g(0) and g(2). If the two results, g(0) and g(2), have the same parity (both even or both odd), then the adversary concludes that g is more likely to be pseudo-random, and outputs 'True'. Otherwise, if one of {g(0), g(2)} is even and the other is odd, then g cannot be Fk(m) ≡ k · (m + k) mod p, and hence must be a random function, and A outputs 'False'.

Notice that when this adversary outputs 'True', it would be wrong in the cases where the oracle function g was selected at random, but happens to return an even number for both inputs (zero and 2). However, since g was selected at random, the values g(0) and g(2) were also selected at random, and the probability of both g(0) and g(2) being even would be only 1/4. Hence, while the adversary could be wrong, it still has a significant (non-negligible) advantage.
3. It seems quite hard to predict much about a specific value of Fk(m) for a random key k. However, since F includes an 'or' operation with k, any bit which is set in k is always set in the output Fk(m) (for every m). Since k almost always has some non-zero bits, the adversary can detect these bits by computing the 'and' of multiple outputs of the oracle (for different inputs). Notice that if the oracle is to a random function, then the probability of any given bit being set in the outputs of l different inputs is only 2^-l. This allows the adversary to efficiently distinguish between being given an oracle to Fk(m) ≡ k ∨ (m^k + k^(m+k) mod 2^n) and being given an oracle to a random function.
4. Let F′ be a secure PRF; define:

    Fk(m) ≡ { F′k(m) if k ≠ 0^n; 0^n otherwise (if k = 0^n) }        (2.30)

As long as the chosen key is not the special all-zeros key, F is the same as F′; hence, F is also a PRF, since keys are selected randomly, so k = 0^n is selected only with probability 2^-n. It remains to show that with this function F, the construction F̂ is not a secure PRF.

Indeed, an adversary given an oracle to either F̂ or a random function can simply give the input 0^n. If the oracle is to F̂, then this returns F̂k(0^n) = F_{0^n}(k); and by definition (Equation 2.30), F_{0^n}(k) = 0^n.
Hence, the adversary can easily distinguish between F̂k(·) and a random function: a random function returns 0^n on input 0^n only with probability 2^-n.
Exercise 2.12 (Proving security of a PRF by reduction). Let Bn ≡ {0,1}^n denote the set of n-bit binary strings. Assume that Fk(m): Bn × Bn → Bn is a secure PRF. Prove that F̂k(m) is also a secure PRF, where:

    F̂k(m) ≡ Fk(m ⊕ (0^(n−1)++1))        (2.31)

(Here ++ denotes concatenation, so 0^(n−1)++1 is the n-bit string 0···01, i.e., F̂ flips the last bit of the input before applying F.)
Solution: The solution uses the main technique for proving security of a cryptographic construction, called proof by reduction, which works as follows. We assume, to the contrary, that the construction is insecure; in this exercise, this means that we assume that F̂ is not a PRF. Namely, from Definition 2.6, there exists an efficient adversary (PPT algorithm) Â^{f̂}, with an oracle denoted f̂, such that Â^{f̂} has a significant (non-negligible) probability of distinguishing between the following two cases:

Case 1: oracle f̂ is a random function: The oracle f̂ is selected at random from the set of functions from n bits to n bits, i.e., f̂ ←$ {f : B_n → B_n}.

Case 2: oracle f̂ is the PRF with a random key: The oracle f̂ is to the function F̂_k(·), where k is a random n-bit key. Namely, f̂ ← F̂_k(·) where k ←$ B_n.
In other words, we assume, to the contrary, that the PPT adversary Â has a significant advantage in the PRF indistinguishability test (Figure 2.18) against the constructed PRF F̂, i.e.:

ε^{PRF}_{Â,F̂}(n) ≡ | Pr_{k←$B_n}[ Â^{F̂_k(·)}(1^n) ] − Pr_{f̂←${B_n→B_n}}[ Â^{f̂(·)}(1^n) ] | ∉ NEGL(n)    (2.32)

In the left-hand expression the oracle is F̂_k(·), for a random key k, while in the right-hand expression the oracle f̂(·) is a random function from n bits to n bits.
We then present another PPT adversary A^f, which we show, under this assumption, to have a significant advantage in the PRF indistinguishability test (Figure 2.18) against F, i.e.:

ε^{PRF}_{A,F}(n) ≡ | Pr_{k←$B_n}[ A^{F_k(·)}(1^n) ] − Pr_{f←${B_n→B_n}}[ A^{f(·)}(1^n) ] | ∉ NEGL(n)    (2.33)

Namely, we show that if F̂ is not a secure PRF (Equation 2.32), then F cannot be a secure PRF (Equation 2.33), which is a contradiction, since we were given that F is a secure PRF.
Given Â^{f̂}, we define adversary A^f as follows:

A^{f(·)}(1^n) ≡ { Return Â^{f̂(·)}(1^n), where f̂(m) ≡ f(m ⊕ (0^{n−1} ++ 1)) }    (2.34)

Namely, we implement the adversary A^f against F using the adversary Â^{f̂} against F̂, where we compute the value of f̂ on a given input m using the oracle we have for f, i.e., by computing f̂(m) = f(m ⊕ (0^{n−1} ++ 1)).
To complete the proof, we show that ε^{PRF}_{Â,F̂}(n) = ε^{PRF}_{A,F}(n), or, more specifically, that the following two equations hold:

Pr_{k←$B_n}[ A^{F_k}(1^n) ] = Pr_{k←$B_n}[ Â^{F̂_k}(1^n) ]    (2.35)

Pr_{f←${B_n→B_n}}[ A^{f}(1^n) ] = Pr_{f̂←${B_n→B_n}}[ Â^{f̂}(1^n) ]    (2.36)
Equation 2.35 follows by substituting Equation 2.34 and then applying the definition of F̂_k (Equation 2.31). To prove Equation 2.36, we first observe that:

Pr_{f←${B_n→B_n}}[ A^{f}(1^n) ] = Pr_{f←${B_n→B_n}}[ Â^{f̂}(1^n) | f̂(m) ≡ f(m ⊕ (0^{n−1} ++ 1)) ]    (2.37)

Now, if f̂(m) ≡ f(m ⊕ (0^{n−1} ++ 1)), then f(m) ≡ f̂(m ⊕ (0^{n−1} ++ 1)); namely, the mapping between f and f̂ is a bijection, so the probability of picking f uniformly is the same as the probability of picking f̂ - in fact, both are equal to 2^{−n·2^n}. Hence:

Pr_{f←${B_n→B_n}}[ Â^{f̂}(1^n) | f̂(m) ≡ f(m ⊕ (0^{n−1} ++ 1)) ] = Pr_{f̂←${B_n→B_n}}[ Â^{f̂}(1^n) ]
This completes the proof.
Additional PRF applications. PRFs have many additional applications, including:
Message Authentication. In Chapter 4, we show that a PRF may be used
for message authentication.
Pseudo-random permutation or block cipher. We discuss the use of PRF
to construct a pseudo-random permutation in the following subsection;
later, in Section 2.6, we show how to extend this to construct a block
cipher.
Derive independently-random keys/values. In many scenarios, two parties share only one key k, but need multiple shared keys which are 'independently random'. This is easily achieved using a PRF f: if g_1, g_2, ... are distinct identifiers, one for each required value, then we can derive the keys as k_1 = f_k(g_1), k_2 = f_k(g_2), and so on. As a concrete example, to derive a separate key for each day d from the same k, we can use k_d = f_k(d); exposure of k_2 and k_4 will not expose any other key, e.g., k_3. We elaborate on this in subsection 2.5.10.
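This derivation is easy to sketch in code. The following minimal Python example models the PRF f with HMAC-SHA256 (an assumed, illustrative choice - the text does not fix a concrete PRF), and derives per-purpose keys from distinct, hypothetical identifiers:

```python
import hmac
import hashlib

def derive_key(k: bytes, identifier: str) -> bytes:
    # k_i = f_k(g_i), modeling the PRF f with HMAC-SHA256 (an assumption)
    return hmac.new(k, identifier.encode(), hashlib.sha256).digest()

k = bytes(32)                        # demo master key; use a random key in practice
k1 = derive_key(k, "encryption")     # hypothetical identifiers g_1, g_2, ...
k2 = derive_key(k, "authentication")
k_day3 = derive_key(k, "day-3")      # per-period key k_d = f_k(d)

assert k1 != k2 and k1 != k_day3     # distinct identifiers yield distinct keys
```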
2.5.9 PRF: Constructions and Robust Combiners
The concept of PRFs was proposed in a seminal paper by Goldreich, Goldwasser and Micali [168]; the paper also presents a provably-secure construction of a PRF given a PRG. That is, if there is a successful attack on the constructed PRF, this attack can be used as a 'subroutine' to construct a successful attack on the underlying PRG. However, the construction of [168] is inefficient: it requires many applications of the PRG for a single application of the PRF. Therefore, this construction is not applied in practice.
Instead, practical systems mostly implement PRFs using standard block
ciphers. We model block ciphers as invertible Pseudo-Random Permutation
(PRP) and discuss them in Section 2.6. In fact, PRGs are also often implemented
from a block cipher. However, for now, let us show a simple construction of a
PRG from a PRF.
Exercise 2.13. Let F be a PRF over {0,1}^n, and let k ∈ {0,1}^n. Prove that f(k) = F_k(1) ++ F_k(2) is a PRG.
Another option is to construct candidate pseudorandom functions directly, without assuming and using any other 'secure' cryptographic function, basing their security on the failure of expert cryptanalysts to 'break' them using known techniques and considerable effort. In fact, pseudorandom functions are among the cryptographic functions that seem to be good candidates for such 'ad-hoc' constructions; it is relatively easy to come up with a reasonable candidate PRF which is not trivial to attack.
Finally, it is not difficult to combine two candidate PRFs F ′ , F ′′ , over the
same domain and range, into a combined PRF F which is secure as long as
either F ′ or F ′′ is a secure PRF. We refer to such a construction as a robust
combiner. Constructions of robust combiners are known for many cryptographic
primitives. The following lemma, from [191], presents a trivial yet efficient
robust combiner for PRFs.
Lemma 2.1 (Robust combiner for PRFs). Let F′, F′′ : {0,1}^* × D → R be two polynomial-time computable functions, and let:

F_{(k′,k′′)}(x) ≡ F′_{k′}(x) ⊕ F′′_{k′′}(x)    (2.38)

If either F′ or F′′ is a PRF, then F is a PRF. Namely, this construction is a robust combiner for PRFs.

Proof: see [191].
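A sketch of Lemma 2.1 in Python, using two HMAC-based candidates as F′ and F′′ (an arbitrary, illustrative choice); the combiner simply XORs their outputs under independent keys:

```python
import hmac
import hashlib

def F1(k: bytes, x: bytes) -> bytes:
    # first candidate PRF (illustrative: HMAC-SHA256)
    return hmac.new(k, x, hashlib.sha256).digest()

def F2(k: bytes, x: bytes) -> bytes:
    # second candidate PRF (illustrative: HMAC-SHA3-256)
    return hmac.new(k, x, hashlib.sha3_256).digest()

def combined(k1: bytes, k2: bytes, x: bytes) -> bytes:
    # F_{(k',k'')}(x) = F'_{k'}(x) XOR F''_{k''}(x), as in Equation 2.38
    return bytes(a ^ b for a, b in zip(F1(k1, x), F2(k2, x)))

k1, k2 = b"A" * 32, b"B" * 32        # demo keys; must be independent in practice
y = combined(k1, k2, b"some input")
assert len(y) == 32
```

The XOR output remains pseudorandom as long as either component is pseudorandom; as Exercise 2.14 below shows, this fails if the two candidates share a key.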
2.5.10 The key separation principle
In the PRF robust combiner (Eq. 2.38), we used separate keys for the two
candidate-PRF functions F ′ , F ′′ . In fact, this is necessary, as the following
exercise shows.
Exercise 2.14 (Independent keys are required for PRF robust combiners). Let F′, F′′ : {0,1}^* × D → {0,1}^* be two polynomial-time computable functions, and let F_k(x) = F′_k(x) ⊕ F′′_k(x). Demonstrate that the fact that one of F′, F′′ is a PRF may not suffice to ensure that F would be a PRF.

Solution: Suppose F′ = F′′. Then for every k, x holds: F_k(x) = F′_k(x) ⊕ F′′_k(x) = F′_k(x) ⊕ F′_k(x) = 0^{|F′_k(x)|}. Namely, for any input x and any key k, the output of F_k(x) is an all-zeros string (F_k(x) ∈ 0^*). Hence F is clearly not a PRF.
This is an example of the general Key separation principle, which we present
below. In fact, the study of robust combiners often helps to better understand
the properties of cryptographic schemes and to learn how to write cryptographic
proofs.
Principle 7 (Key separation). Use separate, independently-pseudorandom keys
for each different cryptographic scheme, as well as for different types/sources
of plaintext and different periods.
The principle combines three main motivations for the use of separate,
independently-pseudorandom keys:
Per-goal keys: use separate keys for different cryptographic schemes. A
system may use multiple different cryptographic functions or schemes,
often for different goals, e.g., encryption vs. authentication. In this case,
security may fail if the same or related keys are used for multiple different
functions. Exercise 2.14 above is an example.
Limit information for cryptanalysis. By using separate, independently-pseudorandom keys, we reduce the amount of information available to the attacker for cryptanalysis of each key (e.g., the ciphertext encrypted under that key).
Limit the impact of key exposure. Namely, by using separate keys, we
ensure that exposure of some of the keys will not jeopardize the secrecy
of communication encrypted with the other keys.
One important application of pseudorandom functions (PRFs) is the derivation of multiple separate keys from a single shared secret key k. Namely, a PRF, say f, is handy whenever two parties share one secret key k and need to derive multiple separate, independently-pseudorandom keys k_1, k_2, ... from k. A common way to achieve this is for the two parties to use some set of identifiers γ_1, γ_2, ..., a distinct identifier for each derived key, and compute each key k_i as k_i = f_k(γ_i).
As another example, system designers often want to limit the impact of key
exposure due to cryptanalysis or to system attacks. One way to reduce the
damage from key exposures is to change the keys periodically, e.g., use key kd
for day number d:
Example 2.6 (Using a PRF for independent per-period keys). Assume that Alice and Bob share one master key k_M. They may derive a shared secret key for day d as k_d = PRF_{k_M}(d). Even if all the daily keys are exposed, except the key for one day d̂, the key for day d̂ remains secure as long as k_M is kept secret.
2.6 Block Ciphers and PRPs
Modern symmetric encryption schemes are built in a modular fashion, using a basic building block - the block cipher. A block cipher is defined by a pair of keyed functions, E_k, D_k, such that the domain and the range of both E_k and D_k are {0,1}^n, i.e., binary strings of fixed length n; for simplicity, we (mostly) use n for the length of both keys and blocks, as well as for the security parameter, although in some ciphers these are different numbers.
Block ciphers should satisfy the correctness requirement: m = D_k(E_k(m)) for every k, m ∈ {0,1}^n. Notice that the correctness requirement should hold always, i.e., not only with high probability. This is in contrast to security requirements, which are typically defined to hold only against efficient (PPT) adversaries, and allow a negligible failure probability, i.e., the adversary may win with negligible probability. But there is no reason to allow any probability of incorrect decryption.
Figure 2.19: High-Level view of the NIST standard block ciphers: AES (current)
and DES (obsolete).
Block ciphers may be the most important basic cryptographic building blocks. Block ciphers are in wide use in many practical systems and constructions, and two of them were standardized by NIST - the Data Encryption Standard (DES) (1977-2001) [296], the first standardized cryptographic scheme, and its successor, the Advanced Encryption Standard (AES) (2002-present) [108] (Fig. 2.19). DES was replaced since it was no longer considered secure; the main reason was simply that improvements in hardware made exhaustive search feasible, due to the relatively short, 56-bit key. Another reason for reduced confidence in DES - even in longer-key variants - was the presentation of differential cryptanalysis and linear cryptanalysis, two strong cryptanalysis attacks which are quite generic, namely, effective against many cryptographic designs - including DES. Indeed, AES was designed for resiliency against these and other known attacks, and, so far, no published attack against AES appears to justify concerns.
We present a simplified explanation and example of differential cryptanalysis below, and encourage interested readers to follow up in the extensive literature on cryptanalysis in general and on these attacks in particular, e.g., [131, 220, 236]; [236] also gives an excellent overview of block ciphers.
While the definition of correctness for block ciphers (above) is simple and widely accepted, there is not yet universal agreement on the security requirements. We adopt the common approach, which requires a block cipher to be a pair of Pseudo-Random Permutations (PRPs). In the following subsection, we define pseudo-random permutations; then, in subsection 2.6.2, we discuss the security of block ciphers.
2.6.1 Random and Pseudo-Random Permutations
After discussing random functions and PRFs, we now introduce two related
concepts: a random permutation and a pseudo-random permutation (PRP).
Random permutations. A permutation is a function π : D → D mapping a domain D onto itself, where every element is mapped to a distinct element, namely:

(π : D → D) is a permutation ⇐⇒ (∀x, x′ ∈ D, x ≠ x′) (π(x) ≠ π(x′))    (2.39)
Note that a permutation may map an element onto itself, i.e., π(x) = x is
perfectly legitimate.
We use Perm(D) to denote the set of all permutations over domain D. Selection of a random permutation over D, i.e., selecting ρ ←$ Perm(D), is similar to selection of a random function (Equation 2.27) - except for the need to avoid collisions. A collision is a pair of elements (x, x′), both mapped to the same element: y = ρ(x) = ρ(x′).
One natural way to think about this selection is as being done incrementally, mapping one input at a time. Let D′ ⊆ D be the set of elements we did not map yet, and R ⊆ D be the set of elements to which we did not map any element yet; initially, R = D′ = D. Given any 'new' element x ∈ D′, select ρ(x) ←$ R, and then remove x from D′ and ρ(x) from R.
Using this process, for a small domain, e.g., D = {0,1}^n for small n, the selection of a random permutation ρ is easy and can be done manually - similarly to the process for selecting a random function (over a small domain and range). The process requires O(2^n) coin tosses, time and storage. For example, use Table 2.3 to select two random permutations over domain D = {0,1}^2, and notice the number of coin-flips required.
Function   Domain     00   01   10   11   coin-flips
ρ1         {0,1}^2
ρ2         {0,1}^2

Table 2.3: Do-it-yourself table for selecting random permutations ρ1, ρ2 over domain D = {0,1}^2.
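The incremental selection process above can be sketched directly in Python; each new domain element is mapped to a uniformly chosen, not-yet-used range element:

```python
import random

def random_permutation(domain):
    # R: range elements not yet used as images; initially R = D
    R = list(domain)
    rho = {}
    for x in domain:                          # take each 'new' element x in D'
        y = R.pop(random.randrange(len(R)))   # rho(x) <-$ R, then remove from R
        rho[x] = y
    return rho

D = ["00", "01", "10", "11"]                  # the domain {0,1}^2 of Table 2.3
rho1 = random_permutation(D)
assert sorted(rho1.values()) == sorted(D)     # a bijection: no collisions
```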
Pseudo-Random Permutation (PRP). Similarly to a PRF, a Pseudo-Random Permutation (PRP) over domain D, denoted E_k(·), is an efficient algorithm which cannot be distinguished efficiently from a random permutation ρ ←$ Perm(D), provided that the key k is 'sufficiently long' and chosen randomly.

In the definition, the adversary A has oracle access to one of two functions: either E_k(·), keyed with a random n-bit key k, or a random permutation over domain D, i.e., ρ ←$ Perm(D). We denote these two cases by A^{E_k(·)} and A^{ρ(·)}, respectively. The adversary should try to distinguish between these two cases, e.g., by outputting the string 'Rand' if given access to the random permutation ρ(·), and outputting, say, 'not random', if given access to the PRP E_k(·). The idea of the definition is illustrated in Fig. 2.20.
Note that the definition allows arbitrary length of the key (n), since indistinguishability is only defined asymptotically - for sufficiently long keys.
Figure 2.20: The Pseudo-Random Permutation (PRP) Indistinguishability Test.
We say that a function E_k(x) : {0,1}^* × D → D is a (secure) pseudo-random permutation (PRP) if no PPT adversary can efficiently distinguish between E_k(·) and a random permutation ρ ←$ Perm(D) over domain D, when the key k is a randomly-chosen, sufficiently-long binary string.
Definition 2.7. A pseudo-random permutation (PRP) is a polynomial-time computable function E_k(x) : {0,1}^* × D → D s.t. for all PPT algorithms A, ε^{PRP}_{A,E}(n) ∈ NEGL(n), i.e., is negligible, where the advantage ε^{PRP}_{A,E}(n) of the PRP E against adversary A is defined as:

ε^{PRP}_{A,E}(n) ≡ | Pr_{k←${0,1}^n}[ A^{E_k}(1^n) ] − Pr_{ρ←$Perm(D)}[ A^{ρ}(1^n) ] |    (2.40)

The probabilities are taken over the random coin tosses of A, and the random choices of the key k ←$ {0,1}^n and of the permutation ρ ←$ Perm(D).
One natural - and important - question is the relation between a PRP
over domain D, and a PRF whose domain and range are both D. Somewhat
surprisingly, it turns out that a PRP over D is indistinguishable from a PRF
over D. This important result is called the PRP/PRF Switching Lemma, and
has multiple proofs; we recommend the proof in [363]. Note that the lemma
provides a relation between the advantage functions; this is an example of
concrete security.
Lemma 2.2 (The PRP/PRF Switching Lemma). Let E be a polynomial-time computable function E_k(x) : {0,1}^* × D → D, and let A be a PPT adversary. Then:

| ε^{PRP}_{A,E}(n) − ε^{PRF}_{A,E}(n) | < q^2 / (2·|D|)    (2.41)

where q is the maximal number of oracle queries performed by A in each run, and the advantage functions are as defined in Equation 2.40 and Equation 2.29.
In particular, if the size of the domain D is exponential in the security parameter n (the length of the key and of the input to A), e.g., D = {0,1}^n, then ε^{PRP}_{A,E}(n) − ε^{PRF}_{A,E}(n) ∈ NEGL(n). In this case, E is a PRP over D if and only if it is a PRF over D.
Proof idea: In a polynomial number of queries to a random function over a large domain, there is only negligible probability of finding two inputs which map to the same output. Hence, it is infeasible to efficiently distinguish between a random function and a random permutation. The lemma follows since a PRF (PRP) is indistinguishable from a random function (resp., permutation).
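The proof idea can be checked empirically. The sketch below estimates, by simulation, the probability that q queries to a random function collide, and compares it to the q²/(2·|D|) term of the lemma (the parameters are arbitrary toy choices):

```python
import random

def collision_prob(q, domain_size, trials=5000):
    # Empirical probability that q outputs of a random function
    # (sampled lazily, with replacement) contain a collision.
    hits = 0
    for _ in range(trials):
        seen = set()
        for _ in range(q):
            y = random.randrange(domain_size)
            if y in seen:
                hits += 1
                break
            seen.add(y)
    return hits / trials

q, size = 16, 2 ** 16
est = collision_prob(q, size)
bound = q * q / (2 * size)          # the q^2 / (2|D|) term of Lemma 2.2
# For these parameters both values are tiny (well below 1%).
assert est < 0.01 and bound < 0.002
```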
The PRP/PRF switching lemma is somewhat counter-intuitive since, for large D, there are many more functions than permutations. Focusing on D = {0,1}^n for convenience, there are (2^n)^{2^n} = 2^{n·2^n} functions over D, and 'only' 2^n!, i.e., the factorial^14 of 2^n, permutations.
Note that the loss of (concrete) security bounded by the switching lemma is a disadvantage of using a block cipher directly as a PRF: it would be an (asymptotically) secure PRF, but the advantage against the PRF definition would be larger than the advantage against the PRP definition. Therefore, we would prefer to use one of several constructions of a PRF from a block cipher/PRP. These constructions are efficient and simple, yet avoid this loss in security; see [39, 183].
2.6.2 Block ciphers
A block cipher is one of the most important cryptographic mechanisms, with
multiple applications and implementations, often used as a ‘building block’
to construct other mechanisms. The symbols (E, D) are often used for the
functions of the block cipher, since one basic application of a block cipher is
as an encryption scheme; intuitively, E is ‘encryption’ and D is ‘decryption’.
^14 For every integer i, the factorial of i is denoted i! and defined as i! ≡ 1 · 2 · ... · (i−1) · i.
However, as we will see, a block cipher does not meet the (strong) definition of secure encryption.

Intuitively, a block cipher is an invertible PRP; this is the reason that we often use the letter E to denote a PRP. The definition, which follows, is an extension of Definition 2.7.
Definition 2.8. Let D be a finite domain. Given a permutation ρ : D → D over domain D, define the inverse of ρ, denoted ρ^{−1} : D → D, as the permutation over D such that (∀x ∈ D) x = ρ^{−1}(ρ(x)).

A block cipher over domain D is a pair of keyed, polynomial-time computable functions (E_k, D_k) over domain D, which satisfy:

Correctness: for every x ∈ D and every key k ∈ {0,1}^* holds x = D_k(E_k(x)).

Indistinguishability: the pair (E, D) is indistinguishable from the pair (ρ, ρ^{−1}), where ρ is a random permutation over domain D. Namely, for every PPT algorithm A, the invertible Pseudo-Random Permutation (iPRP) advantage of A against (E, D), denoted ε^{iPRP}_{A,(E,D)}(n), is a negligible function of n, where ε^{iPRP}_{A,(E,D)}(n) is defined as:

ε^{iPRP}_{A,(E,D)}(n) ≡ | Pr_{k←${0,1}^n}[ A^{E_k,D_k}(1^n) ] − Pr_{ρ←$Perm(D)}[ A^{ρ,ρ^{−1}}(1^n) ] |    (2.42)

Note: A^{f,g}(1^n) is the oracle notation, denoting algorithm A running with input 1^n and oracles to functions f and g; see Definition 1.3.
Let us give an example.
Example 2.7. Let E_k(m) = m ⊕ k and E′_k(m) = m + k mod 2^n. Show the corresponding D, D′ functions such that both (E, D) and (E′, D′) satisfy the correctness requirement; and show that neither of them satisfies the security requirement, i.e., neither is a pair of invertible PRPs.
Solution: D_k(c) = c ⊕ k, D′_k(c) = c − k mod 2^n. Correctness follows from the arithmetic properties. Let us now show that (E, D) is insecure; specifically, let us show that E_k is not a PRP. Recall that we need to provide a PPT adversary A^{E_k(·)} s.t. ε^{PRP}_{A,E} is not negligible. We present a simple adversary A, that makes only two queries, and whose advantage is almost 1. The first query of A will be for input 0^n; if we denote the oracle response by f(·), then A receives f(0^n). If the oracle is for E, A receives E_k(0^n) = 0^n ⊕ k = k, i.e., the key k. Intuitively, this clearly 'breaks' the system; let us show exactly how. From this point, our solution holds for the general case where the adversary found k (if the oracle is for f(·) = E_k(·)).

Our second query can be for any other value (not 0^n); e.g., let us make the query 1^n, so we now receive f(1^n), where f is either E_k or a random permutation. Adversary A checks if f(1^n) (which it received from the oracle) is the same as E_k(1^n) (which A computes, since it believes it knows k). If the two are identical, then probably f(·) = E_k(·), i.e., A returns 1 (PRP); otherwise,
then f is certainly a random permutation (and A returns 0).

Function               Property
PRG f                  'Long' output is pseudorandom, if 'short' input is random
Random function f      (∀x ∈ D) f(x) ←$ R (random mapping for each input)
Random permutation π   Random 1-to-1 mapping: (∀x ≠ x′ ∈ D) π(x) ≠ π(x′)
PRF f_k(·)             Indistinguishable from a random function, if k is selected randomly
PRP f_k(·)             Indistinguishable from a random permutation, if k is selected randomly
Block cipher (E, D)    Indistinguishable from a random invertible permutation over domain D, and satisfies correctness: (∀k, m) m = D_k(E_k(m))

Table 2.4: Comparison between random function, random permutation, PRG, PRF, PRP and block cipher. Domain is denoted D, range is denoted R; for permutations and block ciphers, the domain is also the range.

So, the advantage of A is almost 1, specifically:
ε^{PRP}_{A,E}(n) = | Pr_{k←${0,1}^n}[ A^{E_k} = 1 ] − Pr_{f←$Perm({0,1}^n)}[ A^{f} = 1 ] |
                 = | Pr_{k}[ E_k(1^n) = E_k(1^n) ] − Pr_{f←$Perm({0,1}^n)}[ E_k(1^n) = f(1^n) ] |
Now, if f is a random permutation, then f(1^n) is a random n-bit string; since there are 2^n n-bit strings, the probability that f(1^n) equals any specific string, including E_k(1^n), is 2^{−n}. Hence:

ε^{PRP}_{A,E}(n) = 1 − 2^{−n} ≈ 1
Now, notice that the same adversary A also distinguishes E′; we leave it to the reader to substitute the values as necessary. These minimal changes are only required until A 'finds' k; from that point, the solution is exactly identical.
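The two-query adversary of the solution is simple enough to run. Below is a minimal Python sketch, with a lazily-sampled random permutation as the alternative oracle:

```python
import secrets

N = 64                               # toy block size, in bits

def oracle_E():
    # oracle to E_k(m) = m XOR k, for a random key k
    k = secrets.randbits(N)
    return lambda m: m ^ k

def oracle_random_perm():
    # lazily sample a random permutation of {0,1}^N
    table, used = {}, set()
    def rho(m):
        while m not in table:
            y = secrets.randbits(N)
            if y not in used:
                used.add(y)
                table[m] = y
        return table[m]
    return rho

def adversary(f):
    k_guess = f(0)                   # if f = E_k: E_k(0^n) = 0^n XOR k = k
    m2 = (1 << N) - 1                # second query: the string 1^n
    return 1 if f(m2) == (m2 ^ k_guess) else 0

assert adversary(oracle_E()) == 1                 # always recognizes E_k
assert adversary(oracle_random_perm()) == 0       # wrong only w.p. ~2^-n
```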
See Table 2.4 for a summary and comparison of random function, random
permutation, PRG, PRF and Pseudo-random Permutation (PRP).
Robust combiner for block ciphers. An advantage of block ciphers is that they have a simple robust combiner, i.e., a method to combine two or more candidate block ciphers (E′, D′), (E′′, D′′) into one 'combined' pair (E, D), which is a secure block cipher provided that one or more of the candidates is a secure block cipher. Basically, assuming both (E′, D′) and (E′′, D′′) satisfy correctness, then their cascade is a secure block cipher, i.e., E_{k′,k′′}(x) = E′_{k′}(E′′_{k′′}(x)), D_{k′,k′′}(x) = D′′_{k′′}(D′_{k′}(x)). See [191].
2.6.3 The Feistel Construction: 2n-bit Block Cipher from n-bit PRF
It is not too difficult to design a candidate PRF from basic operations. However, designing a candidate PRP directly from basic operations seems harder, since we need to ensure that every input is mapped to a distinct output. Directly constructing a block cipher, i.e., an invertible PRP, seems harder still, since we need to ensure the permutation property and to efficiently compute the inverse permutation, while still avoiding vulnerabilities. This motivates the design of a PRP using a PRF.
In fact, the PRP/PRF switching lemma (Lemma 2.2) shows that every PRF can also be used as a PRP. Namely, a PRF over a domain D is indistinguishable from a PRP over D, and vice versa; no computationally-bounded (PPT) adversary is likely to distinguish between a PRP and a PRF (over domain D). So, it seems that we can just use a PRF instead of a PRP.
However, a PRF is allowed to have some collisions, i.e., values x ≠ x′ ∈ D such that F_k(x) = F_k(x′) for the same key k. Collisions do not exist for permutations (random or not), and are undesirable for a block cipher; indeed, we required that a block cipher (E, D) ensure correctness, i.e., that m = D_k(E_k(m)) for every key k and message m. Clearly this will not hold if we use, as the E function, a PRF which has collisions.
In this subsection, we study the Feistel construction of a PRP from a PRF;
furthermore, the construction is of a PRP with input of 2n bits, given a PRF
with inputs and outputs of n bits. Such a design is not trivial; see the following
two exercises.
Exercise 2.15. Let f be a PRF from n-bit strings to n-bit strings, and define g_{kL,kR}(mL ++ mR) ≡ f_{kL}(mL) ++ f_{kR}(mR). Show that g is neither a PRF nor a PRP (over 2n-bit strings).

Hint: given a black box containing g or a random permutation over 2n-bit strings, design a distinguishing adversary A as follows. A makes two queries, one with input x = 0^{2n} and the other with input x′ = 0^n ++ 1^n. Denote the corresponding outputs by y = y_{0,...,2n−1} and y′ = y′_{0,...,2n−1}. If the box contained g, then y_{0,...,n−1} = y′_{0,...,n−1}. In contrast, if the box contained a random function, then the probability that y_{0,...,n−1} = y′_{0,...,n−1} is very small - only 2^{−n}. The probability is about as small if the box contained a PRP.
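The hint's two-query distinguisher can be demonstrated concretely. In this Python sketch, the PRF f is modeled with truncated HMAC-SHA256 (an assumed, illustrative choice); the left halves of the two outputs of g always agree:

```python
import hmac
import hashlib
import secrets

N = 16                               # half-block length, in bytes

def f(k: bytes, x: bytes) -> bytes:
    # candidate n-bit PRF, modeled by truncated HMAC-SHA256 (an assumption)
    return hmac.new(k, x, hashlib.sha256).digest()[:N]

def g(kL: bytes, kR: bytes, m: bytes) -> bytes:
    # g_{kL,kR}(mL ++ mR) = f_{kL}(mL) ++ f_{kR}(mR)
    return f(kL, m[:N]) + f(kR, m[N:])

kL, kR = secrets.token_bytes(16), secrets.token_bytes(16)
y1 = g(kL, kR, bytes(2 * N))                  # query x  = 0^{2n}
y2 = g(kL, kR, bytes(N) + b"\xff" * N)        # query x' = 0^n ++ 1^n
assert y1[:N] == y2[:N]   # equal left halves expose g; a random
                          # permutation agrees only w.p. ~2^-n
```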
The next exercise presents a slightly more elaborate scheme, which is essentially a reduced version of the Feistel construction (presented next).
Exercise 2.16. Let f be a PRF from n-bit strings to n-bit strings. Show that g_{kL,kR}(mL ++ mR) = (mL ⊕ f_{kR}(mR)) ++ (mR ⊕ f_{kL}(mL)) is not a PRP (over 2n-bit strings).
We next present the Feistel construction, the best-known and simplest construction of a PRP - in fact, of an invertible PRP (block cipher) - from a PRF. As shown in Fig. 2.21, the Feistel cipher transforms an n-bit PRF into a 2n-bit invertible PRP.
Figure 2.21: Three ‘rounds’ of the Feistel Cipher, constructing a block cipher
(invertible PRP) from a PRF Fk (·). The Feistel cipher is used in DES (but not
in AES). Note: most publications present the Feistel cipher a bit differently, by
‘switching sides’ in each round.
Formally, given a function y = F_k(x) with n-bit keys, inputs and outputs, the three-round Feistel g_k(m) is defined as:

L_k(m) = m_{0,...,n−1} ⊕ F_k(m_{n,...,2n−1})
R_k(m) = F_k(L_k(m)) ⊕ m_{n,...,2n−1}
g_k(m) = (L_k(m) ⊕ F_k(R_k(m))) ++ R_k(m)
Note that we consider only a 'three-round' Feistel cipher, and use the same underlying function F_k in all three rounds, but neither aspect is mandatory. In fact, the Feistel cipher is used in the design of DES and several other block ciphers, typically using more rounds (e.g., 16 in DES), and often using different functions in different rounds.
Luby and Rackoff [269] proved that a Feistel cipher of three or more ‘rounds’,
using a PRF as Fk (·), is an invertible PRP, i.e., a block cipher.
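The three-round construction is short enough to implement and check for correctness. In this Python sketch the round function F_k is modeled with truncated HMAC-SHA256 (an assumed, illustrative PRF); decryption simply undoes the rounds in reverse order:

```python
import hmac
import hashlib
import secrets

N = 16   # half-block length in bytes, so the cipher acts on 2n-bit blocks

def F(k: bytes, x: bytes) -> bytes:
    # round function: an n-bit PRF, modeled by truncated HMAC-SHA256
    return hmac.new(k, x, hashlib.sha256).digest()[:N]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))

def feistel_encrypt(k: bytes, m: bytes) -> bytes:
    mL, mR = m[:N], m[N:]
    L = xor(mL, F(k, mR))            # L_k(m)
    R = xor(F(k, L), mR)             # R_k(m)
    return xor(L, F(k, R)) + R       # g_k(m)

def feistel_decrypt(k: bytes, c: bytes) -> bytes:
    cL, R = c[:N], c[N:]
    L = xor(cL, F(k, R))             # undo the third round
    mR = xor(F(k, L), R)             # undo the second round
    mL = xor(L, F(k, mR))            # undo the first round
    return mL + mR

k = secrets.token_bytes(32)
m = secrets.token_bytes(2 * N)
assert feistel_decrypt(k, feistel_encrypt(k, m)) == m   # correctness holds
```

Note how invertibility follows even though F itself need not be invertible: each round only XORs an output of F, and XOR is self-inverse.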
One may ask, why use the Feistel design rather than directly design an
invertible PRP? Indeed, this is done in AES, which does not follow the Feistel
cipher design. An advantage of using the Feistel design is that it allows the designer to focus on the pseudo-randomness requirements when designing the PRF, without simultaneously having to make sure that the design is also an invertible permutation. Try designing a PRP, let alone an invertible PRP, and compare the effort to using the Feistel cipher!
2.7 Defining secure encryption
In the previous section, we defined PRGs, PRFs and PRPs; in this section, we finally take the next step and define secure encryption.

The definition of secure encryption is quite subtle. In fact, people have been designing - and attacking - cryptosystems for millennia, without a precise definition of the security goals! This only changed with the seminal paper of Goldwasser and Micali [169], which presented the first precise definition of secure encryption, along with a design which was proven secure (under reasonable assumptions); this paper is one of the cornerstones of modern cryptography.
It may be surprising that defining secure encryption is so challenging; we therefore urge you to attempt the following exercise, where you are essentially challenged to define secure encryption on your own, before reading the rest of this section and comparing with the definition we present.
Exercise 2.17 (Defining secure encryption). Define secure symmetric encryption, as illustrated in Figure 1.4. Refer separately to the two aspects of security definitions: (1) the attack model, i.e., the capabilities of the attacker, and (2) the success criteria, i.e., what constitutes a successful attack and what constitutes a secure encryption scheme.
2.7.1 Attack model
Security definitions require a precise attack model, defining the maximal expected capabilities of the attacker. We have already discussed some of these capabilities. In particular, we discussed the computational limitations of the attacker: in Section 2.4 we discussed the unconditional security model, where attackers have unbounded computational resources, and from subsection 2.5.2 on, we focus on Probabilistic Polynomial Time (PPT) adversaries, whose computation time is bounded by some polynomial in their input size.
Another important aspect of the attack model is the interaction with the attacked scheme and the environment. In Section 2.2, we introduced the ciphertext-only (CTO), known-plaintext attack (KPA), chosen-plaintext attack (CPA) and chosen-ciphertext attack (CCA) attack models. Specifically, in a chosen-plaintext attack, the adversary can choose plaintext and receive the corresponding ciphertext (the encryption of that plaintext), and in a chosen-ciphertext attack, the adversary can choose ciphertext and receive the corresponding plaintext (its decryption), or an error message if the ciphertext does not correspond to a properly-encrypted plaintext.
It is desirable to allow for attackers with maximal capabilities. Therefore, when we evaluate cryptosystems, we are interested in their resistance to all types of attacks, especially the stronger ones - CPA and CCA. On the other hand, when we design systems using a cipher, we try to limit the attacker's capabilities.
For example, one approach to foil CCA attacks is to apply some simple
padding function pad to add redundancy to the plaintext before encryption; the
padding function may be as simple as appending a fixed string. For example,
given message m, key k, encryption scheme (E, D) and a simple padding
function, e.g., pad(m) = m ++ 0^l, i.e., concatenate l zeros, we now encrypt by
computing c = Ek(pad(m)) = Ek(m ++ 0^l). This allows the decryption process
to identify invalid ciphertexts. Namely, given c = Ek(pad(m)) = Ek(m ++ 0^l),
then Dk(c) = m ++ 0^l, and we output m as usual; but if the output of Dk does
not contain l trailing zeros, then we identify a faulty ciphertext. This approach
often helps to make it hard or infeasible for the attacker to apply a chosen-ciphertext
attack; in particular, a random ciphertext would almost always be
detected as faulty.
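This redundancy check can be sketched in a few lines of Python. The names pad/unpad, the choice of byte (rather than bit) granularity, and l = 4 are our own illustrative choices; the cipher itself is abstracted away:

```python
# Sketch of redundancy-based rejection of invalid ciphertexts:
# pad appends l zero bytes; unpad rejects any candidate plaintext
# that lacks the trailing zeros.

L = 4  # redundancy length (the 'l' of the text, here in bytes)

def pad(m: bytes) -> bytes:
    """pad(m) = m ++ 0^l : append L zero bytes."""
    return m + b"\x00" * L

def unpad(p: bytes):
    """Return the message if the redundancy is intact, else signal error."""
    if len(p) < L or p[-L:] != b"\x00" * L:
        return None  # 'faulty ciphertext' indication
    return p[:-L]

# With a well-designed cipher, a random ciphertext decrypts to a
# random-looking string, which passes unpad() only with probability 2^(-8L).
assert unpad(pad(b"attack at dawn")) == b"attack at dawn"
assert unpad(b"attack at dawn") is None  # missing redundancy: rejected
```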
Note, however, that adding redundancy to the plaintext may make it easier
to perform ciphertext-only attacks; see Principle 9.
Also, some combinations of encryption and padding functions may still be
vulnerable to chosen ciphertext attacks (CCA), as we show in the following
exercise.
Exercise 2.18. Show that the combination of the simple padding function
pad(m) = m ++ 0^l and the PRG-stream-cipher (Fig. 2.11) is vulnerable to a CCA
attack. Show this also for the combination of the simple padding pad(m) = m ++ 0^l
with two other ciphers we discussed so far.
2.7.2 The Indistinguishability-Test for Shared-Key Cryptosystems
Intuitively, the security goal of encryption is confidentiality: to transform
plaintext into ciphertext in such a way as to allow specific parties ('recipients'),
and only them, to perform decryption, transforming the ciphertext back to
the original plaintext. However, the goal as stated may be interpreted to only
forbid recovery of the exact, complete plaintext; but what about recovery of
partial plaintext?
For example, suppose an eavesdropper can decipher half of the characters
of the plaintext. Is this secure? We believe most readers would not agree.
What if she can decipher less, say one character? In some applications, this may
be acceptable; in others, even exposure of one character may have significant
consequences.
Intuitively, we require that an adversary cannot learn anything given the
ciphertext. This may be viewed as extreme; for example, in many applications
the plaintext includes known fields, and their exposure may not be a concern.
However, it is best to minimize assumptions and use definitions and schemes
which are secure for a wide range of applications.
Indeed, in general, when we design a security system, cryptographic or
otherwise, it is important to clearly define both aspects of security: the attack
model (e.g., the types of attacks 'allowed' and any computational limitations), as
well as the success criteria (e.g., the ability to get merchandise without paying for
it). Furthermore, it is difficult to predict the actual environment in which a
system would be used. Therefore, following the conservative design principle
(Principle 3), our definition should prevent the adversary from learning any
information about the plaintext from the ciphertext.
Let us assume that you agree that it would be best to require that an
adversary cannot learn anything from the ciphertext. How do we ensure this?
This is not so easy. The seminal paper by Goldwasser and Micali [169] presented
two definitions and showed them to be equivalent: semantically-secure encryption
and indistinguishability. We will only present the latter, since we find it
easier to understand and use, and it resembles the PRF, PRG and Turing
indistinguishability tests (Figure 2.18, Figure 2.13 and Figure 2.12, respectively).
Intuitively, an encryption scheme ensures indistinguishability if an attacker
cannot distinguish between the encryptions of any two given messages. But, again,
turning this into a 'correct' and precise definition requires care.
The concept of indistinguishability is reminiscent of disguises; it may help
to consider the properties we can hope to find in an 'ideal disguise service':

Any two disguised persons are indistinguishable: we cannot distinguish
between any two well-disguised persons. Yes, even Rachel from Leah!^{15}

Except, the two persons should have the 'same size': assuming that a
disguise is of 'reasonable size' (overhead), a giant can't be disguised to be
indistinguishable from a dwarf!

Re-disguises should be different: if we see Rachel in disguise, and then she
disappears and we see a new disguise, we should not be able to tell if it is
Rachel again, in a new disguise, or any other disguised person! This means
that disguises must be randomized or stateful, i.e., every two disguises of
the same person (Rachel) will be different.
We will present corresponding properties for indistinguishable encryption:

Encryptions of any two messages are indistinguishable: to allow arbitrary
applications, we allow the attacker to choose the two messages.
However, there is one restriction: the two messages should be of the same length.

Re-encryptions should be different: the attacker should not be able to
distinguish encryptions based on previous encryptions of the same messages.
This means that encryption must be randomized or stateful, so that
two encryptions of the same message will be different. (A weaker notion of
'deterministic encryption' allows detection of re-encryption of a message,
and is sometimes used for scenarios where state and randomization are to
be avoided.)

^{15} See: Genesis 29:23, King James Bible.
We are finally ready to formally present an indistinguishability-based definition
of secure encryption. Definition 2.9 defines chosen-plaintext attack (CPA)
indistinguishable (IND-CPA) shared-key encryption schemes.

Figure 2.22: Illustration of the CPA indistinguishability (IND-CPA) test for
shared-key encryption, T^{IND-CPA}_{A,⟨E,D⟩}(b, n); see also the pseudocode in Figure 2.23.
Throughout the test, the adversary A may ask for the encryption of one or many
messages m. At some point, A sends two same-length messages (|m0| = |m1|),
and receives the encryption of mb, i.e., Ek(mb). Finally, A outputs its guess b*,
and 'wins' if b = b*. The encryption is IND-CPA if Pr(b* = 1|b = 1) − Pr(b* = 1|b = 0) is negligible.
The IND-CPA test receives two inputs: the 'challenge bit' b (that A tries
to find), and the security parameter, which in this case is also the key length, n.
The adversary is given oracle access to Ek(·); we denote the fact that A has
oracle access to Ek(·) by writing A^{Ek(·)}. Namely, A may select a message m
and receive its encryption Ek(m), and possibly repeat this for more messages.
The encryption Ek(·) may be either stateless or stateful; for stateful encryption
Ek(·), the state is maintained by the oracle, not exposed to A. At some point,
A gives a pair of messages m0, m1, and receives c* = Ek(mb). As we discussed
above, the two messages must be of equal length, |m0| = |m1|. Finally, A
outputs b*, which is the output of the test. Intuitively, A 'wins' if b* = b.
We present the IND-CPA test informally in Figure 2.22, and using pseudocode in Figure 2.23.

Oracle notation A^{Ek(·)}. In the IND-CPA test, we use the oracle notation
A^{Ek(·)}, defined in Def. 1.3. Namely, A^{Ek(·)} denotes calling the A algorithm,
with 'oracle access' to the (keyed) PPT algorithm Ek(·), i.e., A can provide
T^{IND-CPA}_{A,⟨E,D⟩}(b, n) {
    k ←$ {0,1}^n
    (m0, m1) ← A^{Ek(·)}('Choose', 1^n) s.t. |m0| = |m1|
    c* ← Ek(mb)
    b* = A^{Ek(·)}('Guess', c*)
    Return b*
}

Figure 2.23: Pseudocode for the chosen-plaintext attack (CPA) indistinguishability (IND-CPA) test for shared-key encryption schemes (E, D), illustrated in
Figure 2.22. The two calls to the adversary are often referred to as the 'Choose'
phase and the 'Guess' phase.
arbitrary plaintext string m and receive Ek (m). The oracle may maintain its
own state, allowing stateful encryption.
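The test of Figure 2.23 can be rendered as a short Python harness. The callback interface for the two adversary phases, and the toy deterministic 'cipher' in the usage example, are our own illustrative choices, not part of the formal definition:

```python
import secrets

def ind_cpa_test(E, A_choose, A_guess, b: int, n: int) -> int:
    """Sketch of the test T^{IND-CPA}_{A,<E,D>}(b, n) of Figure 2.23.

    E(k, m) is the (possibly randomized) encryption algorithm; the adversary
    is modeled as two callbacks ('Choose' and 'Guess'), each given oracle
    access to E_k(.) via a closure. Illustrative only."""
    k = secrets.token_bytes(n)            # k <-$ {0,1}^n (n *bytes* here)
    oracle = lambda m: E(k, m)            # oracle access to E_k(.)
    m0, m1 = A_choose(oracle)             # 'Choose' phase
    assert len(m0) == len(m1)             # required: |m0| = |m1|
    c_star = E(k, (m0, m1)[b])            # c* <- E_k(m_b)
    return A_guess(oracle, c_star)        # 'Guess' phase outputs b*

# A trivial adversary that ignores its inputs guesses correctly with
# probability exactly 1/2, i.e., it has zero advantage.
A_choose = lambda oracle: (b"attack", b"defend")
A_guess  = lambda oracle, c_star: 0

def E_toy(k, m):
    # deterministic toy 'cipher' (repeating-key XOR); of course insecure
    ks = (k * (len(m) // len(k) + 1))[:len(m)]
    return bytes(x ^ y for x, y in zip(m, ks))

assert ind_cpa_test(E_toy, A_choose, A_guess, b=0, n=16) == 0
```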
Adversary A chooses the challenge messages. The IND-CPA test allows A to
choose the two challenge messages m0, m1, and then receive c* = Ek(mb), where
b ∈ {0, 1}. Allowing A to select the two messages completely may make it easier
for A; in many applications, the adversary only has very limited knowledge
about the possible plaintext messages. This follows the conservative design
principle: the encryption should be appropriate for any application, including
one in which there are only two possible plaintext messages, known to the
attacker, who 'just' needs to know which of them was encrypted. One classical
example is when the messages are 'attack' or 'retreat'; another would be 'sell'
or 'buy'.
Encryption must be randomized or stateful. IND-CPA encryption must
either be randomized or stateful. The reason is simple: the adversary is allowed
to make queries for arbitrary messages, including the 'challenges' m0, m1. If
the encryption scheme is deterministic and stateless, then all encryptions of
a message, e.g., m0, will return a fixed ciphertext; this allows the attacker
to trivially 'win' the IND-CPA experiment. Furthermore, Exercise 2.46
shows that limiting the number of random bits per encryption may lead to
vulnerability.
Using the IND-CPA test, we now define IND-CPA encryption, similarly to
how we defined PRG and PRF, in Definition 2.5 and Definition 2.6, respectively.
Definition 2.9 (IND-CPA shared-key cryptosystems). Let ⟨E, D⟩ be a shared-key
cryptosystem. We say that ⟨E, D⟩ is IND-CPA, if every efficient adversary
A ∈ PPT has negligible advantage ε^{IND-CPA}_{⟨E,D⟩,A}(n) ∈ NEGL(n), where:

ε^{IND-CPA}_{⟨E,D⟩,A}(n) ≡ Pr[ T^{IND-CPA}_{A,⟨E,D⟩}(1, n) = 1 ] − Pr[ T^{IND-CPA}_{A,⟨E,D⟩}(0, n) = 1 ]    (2.43)

Where the probability is over the random coin tosses in the IND-CPA test (including those of
A and E).
The following exercise can help in understanding subtle aspects of the definition.
Exercise 2.19. Consider the two following alternative advantage functions, ε̃
and ε̂:

ε̃^{IND-CPA}_{⟨E,D⟩,A}(n) ≡ Pr[ T^{IND-CPA}_{A,⟨E,D⟩}(1, n) = 1 ] − Pr[ T^{IND-CPA}_{A,⟨E,D⟩}(1, n) = 0 ]

ε̂^{IND-CPA}_{⟨E,D⟩,A}(n) ≡ Pr[ T^{IND-CPA}_{A,⟨E,D⟩}(1, n) = 1 ] − Pr[ T^{IND-CPA}_{A,⟨E,D⟩}(0, n) = 0 ]

Show that both ε̃ and ε̂ are not reasonable definitions for an advantage function,
by presenting (simple) adversaries which achieve significant advantage for any
cryptosystem, including (randomized) cryptosystems which satisfy indistinguishability, i.e., where any PPT adversary has negligible advantage.
Indistinguishability for the CTO, KPA and CCA attack models

Definition 2.9 focuses on the Chosen-Plaintext Attack (CPA) model.
Modifying this definition for the case of chosen-ciphertext attacks (CCA)
requires a further (quite minor) change and extension, to prevent the attacker
from 'abusing' the decryption oracle to decrypt the challenge ciphertext.
Modifying the definition for Cipher-Text-Only (CTO) attacks and Known-Plaintext
Attacks (KPA) is more challenging. For KPA, the obvious question
is which plaintext-ciphertext messages are known; this may be solved by using
random plaintext messages; however, in reality, the known plaintext is often
quite specific.
It is similarly challenging to modify the definition so it covers CTO attacks,
where the attacker must know some information about the plaintext distribution.
This information may be related to the specific application, e.g., when the
plaintext is English.
In other cases, information about the plaintext distribution may be derived
from system design. One example is text encoded using a code with some
built-in error-detection capability, e.g., the ASCII encoding [94], where one of
the bits in every character is the parity of the other bits. An even more extreme
example is in GSM, where the plaintext is the result of the application of an
Error-Correcting Code (ECC), providing significant redundancy which even
allows a CTO attack on GSM's A5/1 and A5/2 ciphers [26]. In such a case,
the amount of redundancy in the plaintext can be compared to that provided
by a KPA attack. We consider it a CTO attack, as long as the attack does not
require knowledge of all or much of the plaintext corresponding to the given
ciphertext messages.
Some systems, including GSM, allow the attacker to guess all or much
of the plaintext for some of the ciphertext messages, e.g., when sending a
predictable message at a specific time. Such systems violate the Conservative
Design Principle (Principle 3), since a KPA vulnerability of the cipher renders
the system vulnerable. A better system design would limit the adversary's
knowledge about the distribution of plaintexts, requiring a CTO vulnerability
to attack the system.
2.7.3 The Indistinguishability-Test for Public-Key Cryptosystems (PKCs)

We next define CPA-indistinguishability for public-key cryptosystems (PKCs;
see Figure 1.5). The definition is a minor variation of the indistinguishability
test for shared-key cryptosystems (Definition 2.9), and even a bit simpler. In
fact, let us first present the definition, as well as the IND-CPA test for PKCs
(Figure 2.24), and only then point out and explain the differences; this would
allow the reader to play 'find the differences', comparing to Definition 2.9.
T^{IND-CPA}_{A,⟨KG,E,D⟩}(b, n) {
    (e, d) ←$ KG(1^n)
    (m0, m1) ← A('Choose', e) s.t. |m0| = |m1|
    c* ← Ee(mb)
    b* = A('Guess', (c*, e))
    Return b*
}

Figure 2.24: The IND-CPA-PK test for public-key encryption (KG, E, D).
Notice that this test does not use the decryption key d, generated in the first
step.
Definition 2.10 (IND-CPA-PK). Let ⟨KG, E, D⟩ be a public-key cryptosystem.
We say that ⟨KG, E, D⟩ is IND-CPA-PK, if every efficient adversary A ∈ PPT
has negligible advantage ε^{IND-CPA-PK}_{⟨KG,E,D⟩,A}(n) ∈ NEGL(n), where:

ε^{IND-CPA-PK}_{⟨KG,E,D⟩,A}(n) ≡ Pr[ T^{IND-CPA}_{A,⟨KG,E,D⟩}(1, n) = 1 ] − Pr[ T^{IND-CPA}_{A,⟨KG,E,D⟩}(0, n) = 1 ]    (2.44)

Where the probability is over the random coin tosses in the IND-CPA-PK test (including those of
A and E).
In IND-CPA-PK (Definition 2.10), the adversary is given the public key e.
Hence, the adversary can encrypt at will, without the need to make encryption queries,
as enabled by the oracle calls in Definition 2.9, and we removed the oracle.
Another change is purely syntactic: the cryptosystem includes an explicit key
generation algorithm KG, while for the shared-key cryptosystem, we assumed
the (typical) case where the keys are just random n-bit strings.
We discuss three specific public-key cryptosystems, all in Chapter 6: the
DH and El-Gamal PKCs in Section 6.4, and the RSA PKC in Section 6.5.
2.7.4 Design of Secure Encryption: the Cryptographic Building Blocks Principle
We next discuss the design of secure symmetric encryption schemes. It would
be great if we could use encryption schemes which are provably secure, e.g.,
proven to be IND-CPA (Definition 2.9), without assumptions on the computational
hardness of some underlying functions. However, this is unlikely; let us explain
why.

A provably IND-CPA encryption implies P ≠ NP. IND-CPA implies
that there is no efficient (PPT) algorithm that can distinguish between the encryptions
of two given messages, i.e., the IND-CPA test is not in the polynomial
complexity class P, containing problems which have a polynomial-time algorithm.
On the other hand, surely it is easy to 'win' the test given the key, which
implies that the IND-CPA test is in the non-deterministic polynomial complexity
class NP, containing problems which have a polynomial-time algorithm if
given a hint (in our case, the key). Taken together, this would have shown that
the complexity class P is strictly smaller than the complexity class NP, i.e.,
P ≠ NP. Now, that would be a solution to the most fundamental open question in
the theory of computational complexity!
It is not practical to require the encryption algorithm to have a property
whose existence implies a solution to such a basic, well-studied open question.
Therefore, both theoretical and applied cryptography consider designs whose
security relies on failed attempts in cryptanalysis. The big question is: should
we rely on failed cryptanalysis of the scheme itself, or on failed cryptanalysis of
underlying components of the scheme?
It may seem that the importance of encryption schemes should motivate
the first approach, i.e., relying on failed attempts to cryptanalyze the scheme.
Surely this was the approach in historical and 'classical' cryptology.
However, in modern applied cryptography, it is much more common to use
the second approach, i.e., to construct encryption using 'simpler' underlying
primitives, and to base the security of the cryptosystem on the security of these
component modules. We summarize this approach in the following principle,
and then give some justifications.
Principle 8 (Cryptographic Building Blocks). The security of cryptographic
systems should only depend on the security of a few basic building blocks. These
blocks should be simple, with well-defined and easy-to-test security properties.
More complex schemes should be proven secure by reduction to the security of
the underlying blocks.
The advantages of following the cryptographic building blocks principle
include:

Efficient cryptanalysis: by focusing cryptanalysis effort on a few schemes, we
obtain much better validation of their security. The fact that the building
blocks are simple and are selected to be easy to test makes cryptanalysis
even more effective.

Replacement and upgrade: by using simple, well-defined modules, we can
replace them for improved efficiency, or to improve security, in particular
after a block is broken or when doubts arise.

Flexibility and variations: complex systems and schemes naturally involve
many options, tradeoffs and variants; it is better to build all such variants
using the same basic building blocks.

Robust combiners: there are known, efficient robust-combiner designs for
the basic cryptographic building blocks [191]. If desired, we can use these
as the basic blocks for improved security.
The cryptographic building blocks principle is key to both applied and theoretical
modern cryptography. From the theoretical perspective, it is important
to understand which schemes can be implemented given another scheme. There
are many results exploring such relationships between different cryptographic
schemes and functions, with many positive results (constructions), few negative
results (proofs that efficient constructions are impossible or improbable), and
very few challenging open questions.
In modern applied cryptography, the principle implies the need to define a
small number of basic building blocks, which should be very efficient, simple
functions, and convenient for many applications. The security of these building
blocks would be established by extensive (yet unsuccessful) cryptanalysis efforts,
instead of relying on provably-secure reductions from other cryptographic
mechanisms.
In fact, most cryptographic libraries contain four such widely-used
building blocks: shared-key block ciphers, cryptographic hash functions, public-key
encryption and signature schemes. Cryptographic hash functions and block
ciphers are much more efficient than the public-key schemes (see Table 6.1)
and hence are preferred, and used in most practical systems, whenever public-key
operations may be avoided.
In particular, block ciphers are widely used as cryptographic building blocks,
as they satisfy most of the requirements of the Cryptographic Building Blocks
principle. They are simple, deterministic functions with Fixed Input Length
(FIL), which is furthermore identical to their output length. This should be
contrasted with 'full-fledged encryption schemes', which are randomized (or
stateful) and have Variable Input Length (VIL). Which brings us to the natural
question: can we use block ciphers for secure encryption, and how?
Block ciphers vs. secure encryption. Could we simply use a block cipher
for encryption? This seems natural; block ciphers, in particular DES and
AES, are often referred to as encryption schemes, and even typically use the
notation (E, D) for their keyed functions. However, block ciphers do not satisfy
the requirements of most definitions of encryption, e.g., the IND-CPA test of
Def. 2.9.

Exercise 2.20. Explain why a PRP, as well as a block cipher, fails the IND-CPA test
(Def. 2.9).
Solution: Consider Ek(m), which is either a PRP or the 'encryption' operation of a block cipher (i.e., a pair (E, D) of a PRP and its inverse). Then Ek(m)
is a function; whenever we apply it to the same message m, with the same key
k, we will receive the same output Ek(m). The attacker A would choose any
two different messages as (m0, m1), confirm that c0 = Ek(m0) ≠ c1 = Ek(m1),
and then use these as the challenge, to receive c* = Ek(mb). It then outputs b′
s.t. cb′ = c*.
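The attack described in this solution can be spelled out as a short Python sketch; the repeating-key-XOR 'cipher' below is a stand-in for any deterministic, stateless Ek (purely illustrative, and of course not a real PRP):

```python
import secrets

def E_det(k: bytes, m: bytes) -> bytes:
    """Toy *deterministic, stateless* function standing in for a PRP or
    block-cipher E_k; any such function falls to the attack below."""
    ks = (k * (len(m) // len(k) + 1))[:len(m)]
    return bytes(x ^ y for x, y in zip(m, ks))

def ind_cpa_test(E, b: int, n: int) -> int:
    """The IND-CPA test of Figure 2.23, with the distinguishing adversary
    of Exercise 2.20 built in."""
    k = secrets.token_bytes(n)
    m0, m1 = b"\x00" * 16, b"\xff" * 16   # any two distinct messages
    c0, c1 = E(k, m0), E(k, m1)           # oracle queries ('Choose' phase)
    assert c0 != c1
    c_star = E(k, (m0, m1)[b])            # challenge c* = E_k(m_b)
    return 0 if c_star == c0 else 1       # 'Guess': match c* to a query

# Against a deterministic cipher the adversary recovers b every time,
# i.e., its advantage is maximal (1), so E_det is not IND-CPA.
assert all(ind_cpa_test(E_det, b, 16) == b for b in (0, 1))
```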
On the other hand, in the next section, we discuss multiple constructions of
secure encryption schemes based on block ciphers; such constructions are often
referred to as modes of operation.
2.8 Encryption Modes of Operation
Finally, we get to design symmetric encryption schemes. Following the Cryptographic
Building Blocks principle, the designs are based on the much simpler
block ciphers. We use the term mode of operation for such a construction of
encryption and other cryptographic schemes from block ciphers. This term,
and several standard modes of operation, were defined in the DES specifications [296], and redefined in the AES specification [134], which added one more
standard mode of operation (the CTR mode). Additional modes of operation
were defined and proposed in different standards and publications.
In this section, we describe the standard modes of operation from [134, 296],
slightly simplifying the CTR mode. For didactic purposes, we add one non-standard
(and inefficient) mode of operation, which we call the Per-Block
Random (PBR) mode. These modes are summarized in Table 2.5.
The 'modes of operation' in Table 2.5 are designed to turn block ciphers
into more complete cryptosystems, handling goals such as:

Variable length and padding: we allow encryption of arbitrary, Variable
Input Length (VIL) messages. All modes of operation are defined for input
whose length is an integral number l of blocks. If the input may not
be an integral number of blocks, then the input should be padded to an
integral number of blocks, before applying the encryption (i.e., the mode
of operation). Correct padding can be quite simple; however, surprisingly,
incorrect padding can result in serious vulnerability. We discuss padding
and possible vulnerabilities in Section 2.9.

Randomization/state: Most modes use randomness to ensure independence
between two encryptions of the same (or of related) messages, as required
for indistinguishability-based security definitions. The exceptions are the
Mode                         | Encryption                                          | Flip ci[j] ⇒              | Properties
Electronic Code Book (ECB)   | ci = Ek(mi)                                         | Corrupt mi                | Deterministic (distinguishable)
Per-Block Random (PBR)       | ri ←$ {0,1}^n, ci = (ri, mi ⊕ Ek(ri))               | Flip mi[j] (no integrity) | Long ciphertext
Counter (CTR) [simplified]   | ci = mi ⊕ Ek(i)                                     | Flip mi[j] (no integrity) | Fast online, stateful (i)
Output Feedback (OFB)        | r0 ←$ {0,1}^n, ri = Ek(ri−1), c0 ← r0, ci ← ri ⊕ mi | Flip mi[j] (no integrity) | Fast online (precompute)
Cipher Feedback (CFB)        | c0 ←$ {0,1}^n, ci ← mi ⊕ Ek(ci−1)                   | Corrupt mi+1, flip mi[j]  | Can decrypt in parallel
Cipher-Block Chaining (CBC)  | c0 ←$ {0,1}^n, ci ← Ek(mi ⊕ ci−1)                   | Flip mi+1[j], corrupt mi  | Can decrypt in parallel

Table 2.5: Encryption modes of operation using an n-bit block cipher. ECB,
OFB, CFB and CBC are from NIST [134, 296]. The plaintext is given
as a concatenation of n-bit blocks m1 ++ m2 ++ ..., where each block has n
bits, i.e., mi ∈ {0,1}^n. Similarly, the ciphertext is produced as a sequence of n-bit
blocks c0 ++ c1 ++ ..., where ci ∈ {0,1}^n (except for PBR, where
ci ∈ {0,1}^{2n}). We use mi[j] (ci[j]) to denote the j-th bit of the plaintext
(respectively, ciphertext).
CTR mode, which uses state instead of randomization, and the ECB
mode, which uses neither, and is therefore not IND-CPA.
PRF: Most modes (PBR, OFB, CFB and CTR) use only the encryption
function E, even for decryption. This has an important implication: they
may be implemented using a PRF instead of a block cipher. This may
imply better security, especially when the same key is used for an extensive
number of messages, due to improved concrete security (smaller advantage
to the attacker). However, notice that there will not be such an advantage if we
simply use a block cipher as a PRP, relying on the PRP/PRF switching
lemma (Lemma 2.2); we should use one of the (simple and efficient)
constructions of a PRF from a block cipher, which avoid an increase in the
adversary's advantage; see [39, 183]. See also [33, 338].
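As an illustration of instantiating a mode with a PRF, here is a sketch of the simplified, stateful CTR mode of Table 2.5, with HMAC-SHA256 (truncated to the block size) playing the role of the PRF; this instantiation, and all names below, are our illustrative choices, not part of any standard:

```python
import hmac, hashlib

BLOCK = 16  # block size in bytes

def prf(k: bytes, x: bytes) -> bytes:
    """PRF F_k(x): HMAC-SHA256 truncated to one block (illustrative)."""
    return hmac.new(k, x, hashlib.sha256).digest()[:BLOCK]

class CtrMode:
    """Simplified stateful CTR of Table 2.5: c_i = m_i xor F_k(i).
    The counter i is the encryptor's state; it must never repeat."""
    def __init__(self, k: bytes):
        self.k, self.i = k, 0
    def encrypt_block(self, m: bytes) -> bytes:
        assert len(m) == BLOCK
        keystream = prf(self.k, self.i.to_bytes(8, "big"))
        self.i += 1
        return bytes(a ^ b for a, b in zip(m, keystream))

def ctr_decrypt_block(k: bytes, i: int, c: bytes) -> bytes:
    # decryption also uses only F, never its inverse
    keystream = prf(k, i.to_bytes(8, "big"))
    return bytes(a ^ b for a, b in zip(c, keystream))

k = b"k" * 16
enc = CtrMode(k)
c0 = enc.encrypt_block(b"A" * 16)
c1 = enc.encrypt_block(b"A" * 16)
assert c0 != c1                                  # same block, fresh counter
assert ctr_decrypt_block(k, 0, c0) == b"A" * 16  # correctness
```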
Efficiency: Efficiency is important, and multi-faceted. All of the modes we
present use one block-cipher operation per message block, and allow
parallel decryption. The OFB, CTR and PBR modes also allow parallel
encryption, or 'random access' decryption: decryption of only specific
blocks of the plaintext. Another efficiency consideration is offline
precomputation; in the CTR mode, we may conduct all the block-cipher
computations offline; after receiving the plaintext/ciphertext, we only
need a single XOR operation (per block). The OFB mode has a similar
property but only for encryption; decryption requires the ciphertext as
input to the block-cipher.
Integrity/authentication: Some modes, which, unfortunately, we do not
discuss, ensure both confidentiality and integrity, preventing an attacker
from modifying intercepted messages to mislead the recipient, or from
forging messages as if they were sent by a trusted sender. These include
the Counter with CBC-MAC (CCM) mode and the (more efficient) Galois/Counter Mode (GCM). Other modes ensure only authenticity;
we discuss one such mode, the CBC-MAC mode, in subsection 4.5.2.
Error localization and weak integrity: In the OFB and CTR modes, corruption
of any number of ciphertext bits results in corruption of only the
corresponding plaintext bits. This may help to recover from some corruption of bits during communication, since no additional bits are lost,
but also implies that the attacker may 'flip' plaintext bits by 'flipping'
the corresponding ciphertext bits. In contrast, in the CFB and CBC
modes, corruption of a single ciphertext bit flips a bit in one block,
and 'corrupts' another block, with some exceptions; this is sometimes
considered a weak form of integrity protection, but the defense is
very fragile, and relying on it has resulted in several vulnerabilities.
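This bit-flipping weakness is easy to demonstrate: in any mode where ci = mi ⊕ padi for some keystream block padi (e.g., Ek(i) in CTR, or Ek(ri) in PBR), flipping a ciphertext bit flips exactly the corresponding plaintext bit. A sketch, with the keystream block abstracted as random bytes:

```python
import secrets

# In OFB/CTR/PBR, each ciphertext block is c = m xor pad for some
# keystream block pad. Decryption computes m = c xor pad, so flipping
# bit j of c flips exactly bit j of the recovered plaintext.
pad = secrets.token_bytes(16)              # stands for E_k(i), E_k(r_i), etc.
m   = b"pay alice $100.0"                  # one 16-byte plaintext block
c   = bytes(a ^ b for a, b in zip(m, pad))

c_tampered = bytearray(c)
c_tampered[0] ^= 0x01                      # attacker flips one ciphertext bit
m_tampered = bytes(a ^ b for a, b in zip(c_tampered, pad))

assert m_tampered[1:] == m[1:]             # rest of the block is intact
assert m_tampered[0] == m[0] ^ 0x01        # exactly that plaintext bit flipped
```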
2.8.1 The Electronic Code Book (ECB) mode
ECB is a naïve mode, which isn't really a proper 'mode': it simply applies the
block cipher separately to each block of the plaintext. Namely, to encrypt the
plaintext string m = m1 ++ m2 ++ ..., where each mi is a block (i.e., |mi| = n),
we simply compute ci = Ek(mi). Decryption is equally trivial: mi = Dk(ci),
and the correctness of encryption, i.e., m = Dk(Ek(m)) for every k, m ∈ {0,1}*,
follows immediately from the correctness of the block cipher Ek(·).
Figure 2.25: Electronic Code Book (ECB) mode encryption of plaintext message
m consisting of l blocks, m = m1, . . . , ml; each block is encrypted separately as
ci = Ek(mi). Adapted from [218].
The reader may have already noticed that ECB is simply a monoalphabetic
substitution cipher, as discussed in subsection 2.1.3. The ‘alphabet’ here is
Figure 2.26: Electronic Code Book (ECB) mode decryption of ciphertext c
consisting of l blocks, c = c1, . . . , cl; each block is decrypted separately as
mi = Dk(ci). Adapted from [218].
indeed large: each 'letter' is a whole n-bit block. For typical block ciphers, the
block size is significant, e.g., nDES = 64 bits for DES and nAES = 128 bits for AES; this
definitely improves security, and may make it challenging to decrypt ECB-mode
messages in many scenarios.
However, obviously, this means that ECB may expose some information
about the plaintext; in particular, all encryptions of the same plaintext block will
result in the same ciphertext block. Even with relatively long blocks of 64 or 128
bits, such repeating blocks are quite likely in practical applications and scenarios,
since inputs are not random strings. Essentially, this is a generalization of the
letter-frequency attack of subsection 2.1.3 (see Fig. 2.5).
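The repeated-block leak is easy to reproduce. In the sketch below, a keyed hash stands in for Ek (it is not invertible, hence not a real block cipher, but it is deterministic per key, which is all the leak depends on):

```python
import hashlib

def toy_block_enc(k: bytes, block: bytes) -> bytes:
    """Stand-in for E_k on 16-byte blocks: a keyed hash, truncated.
    Not invertible, but deterministic like a real block cipher."""
    return hashlib.sha256(k + block).digest()[:16]

def ecb_encrypt(k: bytes, m: bytes) -> bytes:
    """ECB: apply the 'cipher' to each block separately, c_i = E_k(m_i)."""
    assert len(m) % 16 == 0
    return b"".join(toy_block_enc(k, m[i:i+16]) for i in range(0, len(m), 16))

k = b"0123456789abcdef"
m = b"YES!" * 4 + b"no.." * 4 + b"YES!" * 4     # plaintext blocks 0 and 2 equal
c = ecb_encrypt(k, m)
blocks = [c[i:i+16] for i in range(0, len(c), 16)]
# equal plaintext blocks yield equal ciphertext blocks: the pattern leaks
assert blocks[0] == blocks[2] and blocks[0] != blocks[1]
```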
This weakness of ECB is often illustrated graphically by the example in Fig. 2.27, using the 'Linux Penguin' image [144, 389].

Figure 2.27: The classical visual demonstration of the weakness of the ECB
mode. The middle and the right 'boxes' are encryptions of the bitmap image
shown on the left. Which of the two is 'encrypted' using ECB, and which is
encrypted with one of the secure encryption modes?
2.8.2 The Counter (CTR) mode and the Per-Block Random (PBR) mode
We next present the Counter (CTR) mode and the Per-Block Random (PBR) mode.
The Per-Block Random mode (PBR). The PBR mode is not a standard,
possibly since it is inefficient: a block of random bits is generated for each
plaintext block, and sent as part of the ciphertext, resulting in a ciphertext whose
length is twice that of the plaintext. We present it since it provides a simple way
to construct a secure stateless encryption scheme from a PRF, PRP or block
cipher, and since it is very similar to the stateful CTR mode.
Figure 2.28: Per-Block Random (PBR) mode encryption of plaintext message
m consisting of l blocks, m = m1, . . . , ml; each block is encrypted as
ci = (ri, mi ⊕ Ek(ri)), for a fresh random block ri.
The PBR mode is illustrated in Figure 2.28. Let m = m1 ++ m2 ++ ... ++ mM
be a plaintext message, where each mi is one n-bit block, i.e., mi ∈ {0,1}^n,
let E denote a block cipher for n-bit blocks, and let k denote a key for E.
Then we compute the PBR-mode encryption of m using block cipher E and key
k, denoted PBR.Enc^E_k(m), as follows:

PBR.Enc^E_k(m) ≡ c1 ++ ... ++ cM   where   ri ←$ {0,1}^n ,  ci ← (ri, mi ⊕ Ek(ri))    (2.45)
Namely, we encrypt each message block mi with the corresponding random
block ri. Note that we can encode each ci simply as ci = ri ++ (mi ⊕ Ek(ri)), i.e.,
as a string of 2n bits; the pairwise notation is equivalent, and a bit easier to
work with.
PBR decryption performs the dual operation. Namely, given key k and
ciphertext c = (r1, c′1) ++ ... ++ (rM, c′M), where each ri and c′i is one block (n
bits), we compute the PBR-mode decryption of c, denoted PBR.Dec^E_k(c), as:

PBR.Dec^E_k(c) = m1 ++ ... ++ mM   where   mi ← c′i ⊕ Ek(ri)    (2.46)
It is not difficult to show that PBR mode ensures correctness.

Exercise 2.21. Show that the PBR mode ensures correctness, i.e., that for every
l-block message m = m1, . . . , ml and l random blocks r1, . . . , rl:
m = PBR.Dec^E_k(PBR.Enc^E_k(m)).
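A sketch of the PBR mode in Python, instantiating E with a PRF built from HMAC-SHA256 (an illustrative choice, as PBR may use any PRF), lets us check correctness empirically:

```python
import hmac, hashlib, secrets

BLOCK = 16  # block size in bytes

def prf(k: bytes, r: bytes) -> bytes:
    """E_k instantiated as a PRF: HMAC-SHA256, truncated (illustrative)."""
    return hmac.new(k, r, hashlib.sha256).digest()[:BLOCK]

def pbr_encrypt(k: bytes, m: bytes) -> list:
    """PBR.Enc^E_k(m): per block, c_i = (r_i, m_i xor E_k(r_i))  (Eq. 2.45)."""
    assert len(m) % BLOCK == 0
    out = []
    for i in range(0, len(m), BLOCK):
        r = secrets.token_bytes(BLOCK)   # r_i <-$ {0,1}^n, fresh per block
        out.append((r, bytes(a ^ b for a, b in zip(m[i:i+BLOCK], prf(k, r)))))
    return out

def pbr_decrypt(k: bytes, c: list) -> bytes:
    """PBR.Dec^E_k(c): m_i = c'_i xor E_k(r_i)  (Eq. 2.46)."""
    return b"".join(bytes(a ^ b for a, b in zip(ci, prf(k, r))) for r, ci in c)

k = secrets.token_bytes(16)
m = b"attack at dawn!!" * 2
assert pbr_decrypt(k, pbr_encrypt(k, m)) == m   # correctness (Exercise 2.21)
# randomized: two encryptions of the same message differ (w.h.p.)
assert pbr_encrypt(k, m) != pbr_encrypt(k, m)
```

Note that, as the text observes, decryption calls only the forward function E (here, the PRF), never an inverse.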
Note that PBR mode does not use the ‘decryption’ function D of
the underlying block cipher (invertible PRP) at all. Indeed, PBR can be instantiated
using a PRF or PRP instead of a block cipher. As can be seen from
Table 2.5 and Exercise 2.47, this also holds for the OFB and CFB modes.
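To make the construction concrete, the following is a minimal Python sketch of PBR mode. As an assumption of this sketch, the PRF/block cipher Ek is emulated by HMAC-SHA256 truncated to one block (PBR never needs to invert E, so a one-way PRF suffices); the names `pbr_encrypt` and `pbr_decrypt` are ours, not standard.

```python
import hmac, hashlib, os

BLOCK = 16  # n = 128 bits

def prf(key: bytes, x: bytes) -> bytes:
    # Stand-in for E_k: HMAC-SHA256 truncated to one block (an assumption).
    return hmac.new(key, x, hashlib.sha256).digest()[:BLOCK]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def pbr_encrypt(key: bytes, m: bytes) -> bytes:
    # m must be an integral number of blocks; each block gets a fresh random r_i.
    assert len(m) % BLOCK == 0
    out = b""
    for i in range(0, len(m), BLOCK):
        r = os.urandom(BLOCK)                      # r_i <-$ {0,1}^n
        out += r + xor(m[i:i+BLOCK], prf(key, r))  # c_i = r_i ++ (m_i XOR E_k(r_i))
    return out

def pbr_decrypt(key: bytes, c: bytes) -> bytes:
    # Ciphertext is twice the plaintext length: (r_i, c'_i) pairs.
    assert len(c) % (2 * BLOCK) == 0
    out = b""
    for i in range(0, len(c), 2 * BLOCK):
        r, ci = c[i:i+BLOCK], c[i+BLOCK:i+2*BLOCK]
        out += xor(ci, prf(key, r))                # m_i = c'_i XOR E_k(r_i)
    return out
```

Note how the sketch makes the ciphertext-doubling overhead visible: every plaintext block costs one random block on the wire.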
PBR is not a standard mode, and indeed, we do not recommend it for applications, since it is wasteful: it requires the use of one block of random bits per
each block of the plaintext, and all these random blocks also become part of
the ciphertext and are used for decryption, i.e., the length of the ciphertext
is double the length of the plaintext. However, PBR is secure, allowing us
to discuss a simple provably-secure construction of a symmetric cryptosystem,
based on the security of the underlying block cipher (invertible PRP).
Theorem 2.1. If E is a PRF or PRP, or (E, D) is a block cipher (invertible
PRP), then (P BR.EncE , P BR.DecE ) is a CPA-indistinguishable symmetric
encryption.
Proof. We present the proof when E is a PRF; the other cases are similar. We
also focus, for simplicity, on encryption of a single-block message, m = m1 ∈
{0, 1}^n.

Denote by (PBR.Enc^f, PBR.Dec^f) the same construction, except using,
instead of E, a ‘truly’ random function f ←$ {{0, 1}^n → {0, 1}^n}. In this
case, for any pair of plaintext messages m0, m1 selected by the adversary and
randomness r used for encrypting, the probability of c∗ = (r, m0 ⊕ f(r)) is
exactly the same as the probability of c∗ = (r, m1 ⊕ f(r)), from the symmetry of
the random choice of f and r. Hence, the attacker’s success probability, when
‘playing’ the IND-CPA game (Def. 2.1) ‘against’ (PBR.Enc^f, PBR.Dec^f), is exactly
half. Note that this holds even for a computationally-unbounded adversary.

Assume, to the contrary, that there is some PPT adversary A that is able
to gain a non-negligible advantage against (PBR.Enc^E, PBR.Dec^E). However, as
argued above, A succeeds with probability exactly half, i.e., with exactly zero
advantage, against (PBR.Enc^f, PBR.Dec^f), i.e., if instead of the PRF E, we use
the truly random function f.

We can use A to distinguish between Ek(·) and a random function f, with significant probability, contradicting the assumption that Ek is a PRF; see Def. 2.6.
Namely, we run A against the PBR construction instantiated with either a
truly random function or Ek(·), resulting in either (PBR.Enc^f, PBR.Dec^f)
or (PBR.Enc^E, PBR.Dec^E), correspondingly. Since A wins with significant
advantage against (PBR.Enc^E, PBR.Dec^E), and with no advantage against
(PBR.Enc^f, PBR.Dec^f), this allows distinguishing the pseudorandom function
Ek(·) from a truly random function f, establishing the contradiction.
CHAPTER 2. CONFIDENTIALITY: ENCRYPTION SCHEMES AND
PSEUDO-RANDOMNESS
Error localization, integrity and CCA security. Since PBR mode encrypts the plaintext by bitwise XOR, i.e., ci = (ri, mi ⊕ Ek(ri)), flipping a bit
in the second part results in flipping of the corresponding bit in the decrypted
plaintext, with no other change in the plaintext. We say that such bit errors are
perfectly localized or have no error propagation. On the other hand, bit errors
in the random-pad part ri corrupt the entire corresponding plaintext block,
i.e., are propagated to the entire block. We say that the PBR mode ensures
1-block error localization, since an error in one ciphertext block corrupts at most
one plaintext block upon decryption. In general, we say that a cryptosystem
ensures error localization, and specifically b-blocks error localization, if an error
in one ciphertext block corrupts at most b plaintext blocks. Error localization
is a common property; in fact, all of the modes of operation that we discuss
ensure error localization.
Error localization limits the damage of bit-flip errors, but has security
drawbacks. First, we note that with PBR, flipping of specific ciphertext bits
causes a flip in the corresponding bit of the decrypted plaintext; this also holds
for several ciphers that ensure error localization, e.g., the one-time pad (OTP).
Therefore, PBR and other ciphers allowing bit-flipping of the plaintext do not
protect integrity.

Of possibly larger concern is that PBR, and every cryptosystem that ensures
error localization, cannot be IND-CCA secure; see Exercise 2.22.
Exercise 2.22 (Error localization conflicts with IND-CCA security).

1. Show that PBR is not IND-CCA secure.

2. Generalize part 1 to show that any cryptosystem (E, D) with localized errors
is not IND-CCA secure. Or, prove it for the special case where an error in
ciphertext block i results in corruption of blocks i and i + 1 of the decrypted
plaintext.
Solution of part 1: The adversary gives to the test two challenge plaintexts, m0
and m1, both consisting of two blocks, and which differ in their first block, i.e.,
m0[1] ≠ m1[1] (the second blocks of m0 and m1 may differ or not), and receives
the encryption c∗ = PBR.Enc_k^E(mb). From Equation 2.45, the ciphertext c∗
consists of two pairs: (i ∈ {1, 2}) c∗[i] = (r[i], ĉ[i]), where ĉ[i] ≡ mb[i] ⊕ Ek(r[i])
and r[i] is a random block. The adversary asks the oracle for decryption of c′,
which also consists of two pairs, namely: c′[1] = c∗[1], c′[2] = (r[2] ⊕ 1, ĉ[2]), i.e.,
flipping one (or more) bits of r[2]. Notice that c′ ≠ c∗; therefore, the adversary
is ‘allowed’ to give c′ for decryption by the oracle.

Let m′ = PBR.Dec_k^E(c′) denote the decryption of c′ which the adversary
receives from the oracle; from Equation 2.46, m′[1] = mb[1]. Since m0[1] ≠
m1[1], the adversary learns b.
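This CCA adversary can be run end-to-end in a few lines of Python. As before, the sketch emulates Ek with HMAC-SHA256 (an assumption), and `cca_adversary` is our name for the adversary of part 1; it recovers the challenge bit b with certainty.

```python
import hmac, hashlib, os

BLOCK = 16

def prf(k, x):  # toy stand-in for E_k (an assumption of this sketch)
    return hmac.new(k, x, hashlib.sha256).digest()[:BLOCK]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def pbr_enc(k, m):
    c = b""
    for i in range(0, len(m), BLOCK):
        r = os.urandom(BLOCK)
        c += r + xor(m[i:i+BLOCK], prf(k, r))
    return c

def pbr_dec(k, c):
    m = b""
    for i in range(0, len(c), 2 * BLOCK):
        r, ci = c[i:i+BLOCK], c[i+BLOCK:i+2*BLOCK]
        m += xor(ci, prf(k, r))
    return m

def cca_adversary(k, b):
    m0 = b"A" * BLOCK + b"Z" * BLOCK   # challenge plaintexts differing in block 1
    m1 = b"B" * BLOCK + b"Z" * BLOCK
    c_star = pbr_enc(k, m0 if b == 0 else m1)
    # Flip one bit of r[2] (the first byte of the second pair): c' != c*, so the
    # decryption oracle must answer, yet block 1 of the decryption is unchanged.
    c_prime = bytearray(c_star)
    c_prime[2 * BLOCK] ^= 1
    m_prime = pbr_dec(k, bytes(c_prime))
    return 0 if m_prime[:BLOCK] == m0[:BLOCK] else 1   # recover b
```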
Figure 2.29: Counter (CTR) mode encryption of plaintext message m consisting
of l blocks, m = m1, . . . , ml; each block is encrypted as ci = mi ⊕ Ek(s + i).
The counter (state) s is the number of blocks encrypted so far. Initially, s = 0,
and its value is incremented whenever encrypting a new block.

The Counter (CTR) mode. Let us now discuss the counter mode (CTR),
a standard mode of operation, defined in [134] (our description slightly simplifies
CTR mode; see [134] for exact details), which is similar in design to
the PBR mode. CTR mode is unique among the modes of operation we discuss
in being stateful; it maintains a counter of the number of blocks encrypted/decrypted so far, which is incremented whenever encrypting/decrypting a new
block. See Figure 2.29.
The security of CTR mode against CPA attack follows, very similarly to
that of PBR, from the PRF/PRP assumption of the block cipher E. By using
state, we avoid the need to generate and send a new random block for each
plaintext block; therefore, when we can reliably use state, CTR mode offers
efficiency. Another advantage is that senders and recipients can pre-compute
the block-cipher operations even before they receive the plaintext or ciphertext,
requiring only block-wise XOR when the data (plaintext or ciphertext) arrives.
On the other hand, counter (CTR) mode is vulnerable to CCA attack; this
can be shown essentially as for PBR in Exercise 2.22.
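A stateful CTR endpoint can be sketched as follows; the key point is that sender and recipient each keep their own counter s, so no randomness travels on the wire. The class name `CtrState` and the HMAC-based stand-in for Ek are assumptions of this sketch.

```python
import hmac, hashlib

BLOCK = 16

def prf(k, x):  # toy stand-in for the block cipher E_k
    return hmac.new(k, x, hashlib.sha256).digest()[:BLOCK]

def xor(a, b):
    return bytes(p ^ q for p, q in zip(a, b))

class CtrState:
    """Stateful CTR endpoint; s counts blocks processed so far (initially 0)."""
    def __init__(self, key):
        self.key, self.s = key, 0

    def process(self, data):
        # Encryption and decryption are the same XOR-with-pad operation.
        out = b""
        for i in range(0, len(data), BLOCK):
            self.s += 1
            pad = prf(self.key, self.s.to_bytes(BLOCK, "big"))  # E_k(s + i)
            out += xor(data[i:i+BLOCK], pad)
        return out
```

Note that the two endpoints stay synchronized only if every message is processed exactly once and in order, which is exactly the "reliable state" assumption discussed above.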
2.8.3 The Output-Feedback (OFB) Mode
We now proceed to discuss standard modes, which provably ensure secure
encryption, with randomization, for multiple-block messages, yet are more
efficient compared to the PBR mode.

We begin with the simple Output-Feedback (OFB) mode. In spite of its
simplicity, this mode ensures provably-secure encryption, and requires the
generation and exchange of only a single block of random bits, compared to one
block of random bits per plaintext block, as in PBR.
The OFB mode is illustrated in Figs. 2.30 (encryption) and 2.31 (decryption).
OFB is a variant on the PRF-based stream cipher discussed in subsection 2.5.1
and illustrated in Fig. 2.17, and, like it, operates on input which consists of l
Figure 2.30: Output Feedback (OFB) mode encryption. Adapted from [218].
Figure 2.31: Output Feedback (OFB) mode decryption. Adapted from [218].
blocks of n bits each. The difference is that OFB uses a PRP (block cipher) Ek
instead of the PRF PRFk.
We use a random Initialization Vector (IV) as a ‘seed’ to generate a long
sequence of pseudo-random n−bit pad blocks, pad1 , . . . , padl , to encrypt plaintext blocks m1 , . . . , ml . We next compute the bitwise XOR of the pad blocks
pad1 , . . . , padl , with the corresponding plaintext blocks m1 , . . . , ml , resulting in
the ciphertext which consists of the random IV c0 and the results of the XOR
operation, i.e. c1 = m1 ⊕ pad1 , c2 = m2 ⊕ pad2 , . . ..
Let us now define OFB.Enc_k^E(m), the OFB mode for a given block cipher
(E, D). For simplicity we define OFB.Enc_k^E(m) for messages m which consist of
some number l of n-bit blocks, i.e., m = m1 ++ ... ++ ml, where (∀i ≤ l) |mi| = n.
Then OFB.Enc_k^E(m) is defined as:

OFB.Enc_k^E(m1 ++ ... ++ ml) = (c0 ++ c1 ++ ... ++ cl)    (2.47)

where:

pad0 ←$ {0, 1}^n    (2.48)
padi ← Ek(padi−1)    (2.49)
c0 ← pad0    (2.50)
ci ← padi ⊕ mi    (2.51)
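Equations 2.48-2.51 translate directly into a short sketch. Only one random block (the IV) is generated, and the same pad chain serves both directions; as in the earlier sketches, HMAC-SHA256 stands in for Ek (an assumption), and the function names are ours.

```python
import hmac, hashlib, os

BLOCK = 16

def prf(k, x):  # toy stand-in for the block cipher E_k
    return hmac.new(k, x, hashlib.sha256).digest()[:BLOCK]

def xor(a, b):
    return bytes(p ^ q for p, q in zip(a, b))

def ofb_encrypt(key, m):
    pad = os.urandom(BLOCK)            # pad_0 = IV, also sent as c_0
    c = pad
    for i in range(0, len(m), BLOCK):
        pad = prf(key, pad)            # pad_i = E_k(pad_{i-1}): output feedback
        c += xor(m[i:i+BLOCK], pad)    # c_i = m_i XOR pad_i
    return c

def ofb_decrypt(key, c):
    pad, body = c[:BLOCK], c[BLOCK:]   # c_0 is the IV
    m = b""
    for i in range(0, len(body), BLOCK):
        pad = prf(key, pad)
        m += xor(body[i:i+BLOCK], pad)
    return m
```

The ciphertext is only one block longer than the plaintext, in contrast to the doubling in PBR.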
Offline pad precomputation. The OFB mode allows both the encryption
process and the decryption process to precompute the pad ‘offline’, i.e.,
before the plaintext and ciphertext, respectively, are available. Offline pad
precomputation is possible since the pad does not depend on the plaintext (or
ciphertext). This can be important, e.g., when a CPU with limited computation
speed needs to support a limited number of ‘short bursts’, without adding
latency. Once the plaintext/ciphertext is available, we only need one XOR
operation per block.
Parallelism. The pad is computed sequentially; there does not appear to be
a way to speed up its computation using parallelism.
Error localization, correction and integrity. Since OFB operates as a
bit-wise stream cipher, it is 1-localized (or perfectly localized): a change
in any ciphertext bit simply causes a change in the corresponding plaintext bit,
and in no other bit.
The ‘perfect bit error localization’ property implies that an error correction
and/or detection code can be applied either to the ciphertext or to the
plaintext (before encryption, with correction/detection applied to the plaintext
after decryption). Without localization, a single bit error in the ciphertext
could translate to many bit errors in the plaintext, which implies that error
correction, and, to a lesser degree, detection, should be applied to the ciphertext.
Encode-then-Encrypt considered harmful. Some designers prefer to
apply error correction to the plaintext, and rely on the ‘perfect bit error localization’ property to allow recovery from corresponding errors in the ciphertext.
This motivated the use of OFB or a similar XOR-based stream cipher, allowing
application of error detection code (EDC) or error correction code (ECC) on the
plaintext; we refer to this as the Encode-then-Encrypt design. We now explain
why this design could cause vulnerability (hence ‘considered harmful’).
Our discussion of Encode-then-Encrypt applies to both EDC and ECC; let
us focus on ECC. ECC codes have two functions, encode(·) and decode(·); the
input domain to encode, denoted Dom, may be fixed-length strings (block code)
or variable-length strings (convolution code). All ECC codes must satisfy the
basic correctness property, which is that for every input string m ∈ Dom holds:
m = decode(encode(m)).
An ECC should ensure the noise correction property, i.e., it should ensure
recovery from some set of possible errors in encoded messages; often, an ECC
also allows detection of some additional errors. For example, one classical
noise model is Hamming errors, which are simply bit-flips; these errors are
defined using the Hamming distance. The Hamming distance H(x, y) between
two equal-length binary strings (|x| = |y| and x, y ∈ {0, 1}*) is the number of
differing bits, i.e.:

(∀l ∈ N, x, y ∈ {0, 1}^l)  H(x, y) ≡ |{i : x[i] ≠ y[i]}|    (2.52)
An ECC ensures correction of tc bit-flip errors, or corrects errors up to Hamming
distance tc, if for every message m ∈ Dom and every binary string y ∈ {0, 1}*:

H(y, encode(m)) ≤ tc ⇒ m = decode(y)    (2.53)
Similarly, an ECC (or EDC) ensures detection of td bit-flip errors, or detects
errors up to Hamming distance td, if for every message m ∈ Dom and every
binary string y ∈ {0, 1}*:

H(y, encode(m)) ≤ td ⇒ (decode(y) ∈ {m, False})    (2.54)
The classical Hamming code allows correction of a single bit-flip error, and
detection of two bit-flip errors, i.e., tc = 1 and td = 2.
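The definitions above can be illustrated with the simplest possible ECC: the 3-repetition code (not the Hamming code of the text, but a smaller example with the same tc = 1), together with a direct implementation of the Hamming distance of Equation 2.52.

```python
def hamming(x: bytes, y: bytes) -> int:
    """H(x, y): number of differing bits between equal-length strings (Eq. 2.52)."""
    assert len(x) == len(y)
    return sum(bin(a ^ b).count("1") for a, b in zip(x, y))

# The 3-repetition code: each bit is sent three times. Its minimum distance is 3,
# so it corrects t_c = 1 bit-flip (or, used for detection only, detects t_d = 2).
def encode(bit: int) -> list:
    return [bit, bit, bit]

def decode(word: list) -> int:
    return 1 if sum(word) >= 2 else 0   # majority vote corrects one flip
```

For instance, decoding [1, 0, 1] (a codeword of 1 with the middle bit flipped) still yields 1, matching Equation 2.53 with tc = 1.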
However, applying error correction or detection codes to the plaintext, and
relying on perfect bit error localization to allow error detection/correction
of the ciphertext, is not recommended. The reason is that such codes create
structured redundancy in the plaintext, which may facilitate CTO attacks. Let
us give an example.
Example 2.8 (CTO attack on GSM). The Ciphertext-Only (CTO) attack on
the A5/1 and A5/2 stream ciphers [26], which are defined as part of the GSM
protocol, exploits the known relationship between ciphertext bits. This known
relationship is due to the fact that an Error Correction Code is applied to the
plaintext before encryption in the GSM protocol. This redundancy suffices to
attack the ciphers, using techniques that normally can be applied only in Known
Plaintext attacks. Unfortunately, complete details of this beautiful and important
result are beyond our scope; for details, see [26]. As a result of this attack, the
use of the (weaker) A5/2 was completely discontinued, and the use of A5/1 is
not recommended. However, the GSM protocol still applies Encode-then-Encrypt,
facilitating CTO cryptanalysis attacks.
One may wonder, why would designers prefer to apply error correction to the
plaintext rather than to the ciphertext? One motivation may be the hope that
this may make cryptanalysis harder, e.g., corrupt some plaintext statistics such
as letter frequencies. This may hold for some codes; but it is better to design such
defenses explicitly into the cryptosystem, and not to rely on such fuzzy properties
of encoding.
Another motivation may be the hope that applying error correction/detection
to the plaintext may provide integrity. Note that due to the perfect bit error
localization of OFB, an attacker can easily flip a specific plaintext bit, by
flipping the corresponding ciphertext bit. If we applied error detection to
the plaintext, then corruption of a single bit would be detected, invalidating
the entire plaintext. However, since the attacker can flip multiple ciphertext
bits, thereby flipping the corresponding plaintext bits, there are cases where
the attacker can modify the ciphertext in such a way as to flip specific bits
in the plaintext while also ‘fixing’ the error detection/correction code, to make
the message appear correct.
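This ‘fixing the code’ attack is concrete whenever the EDC is linear, e.g., CRC-32: for equal-length strings, crc(a ⊕ b) = crc(a) ⊕ crc(b) ⊕ crc(0...0). The sketch below (message and byte positions are hypothetical) shows an attacker flipping plaintext bits through an XOR-pad stream cipher, such as OFB, and patching the CRC so the tampered message still verifies.

```python
import os, zlib

# Victim: Encode-then-Encrypt with a linear EDC (CRC-32) under an XOR pad.
m = b"PAY $0100 TO ALICE"                       # hypothetical plaintext
pt = m + zlib.crc32(m).to_bytes(4, "big")       # append EDC to the plaintext
pad = os.urandom(len(pt))                       # stands in for the OFB pad stream
ct = bytes(a ^ b for a, b in zip(pt, pad))

# Attacker: flip plaintext bits via the ciphertext, then 'fix' the CRC using
# linearity: crc(m XOR d) = crc(m) XOR crc(d) XOR crc(0^|m|).
delta = bytearray(len(m))
delta[5] = ord("0") ^ ord("9")                  # change the amount: $0100 -> $9100
crc_patch = zlib.crc32(bytes(delta)) ^ zlib.crc32(bytes(len(m)))
tampered = bytearray(ct)
for i, d in enumerate(delta):
    tampered[i] ^= d                            # flip the chosen plaintext bits
for i, b in enumerate(crc_patch.to_bytes(4, "big")):
    tampered[len(m) + i] ^= b                   # patch the encrypted CRC field

# Recipient: decrypts, and the CRC still verifies -- tampering goes undetected.
rec = bytes(a ^ b for a, b in zip(bytes(tampered), pad))
rec_m, rec_crc = rec[:-4], int.from_bytes(rec[-4:], "big")
assert rec_m == b"PAY $9100 TO ALICE"
assert rec_crc == zlib.crc32(rec_m)
```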
We conclude the following principle.
Principle 9 (Minimize plaintext redundancy). Plaintext should preferably have
minimal redundancy. In particular, plaintext should preferably not contain Error
Correction or Detection codes.
Namely, applying error correction to plaintext is a bad idea, certainly when
using a stream-cipher design such as OFB. This raises the obvious question: can
an encryption mode of a block cipher also protect the integrity of the decrypted
plaintext? Both of the following modes, CFB and CBC, provide a limited
defense of integrity, by ensuring that errors do propagate.
Provable security of OFB. The weaknesses discussed above are due to
incorrect deployments of OFB; correctly used, OFB is secure. Proving the
security of OFB follows along similar lines to Theorem 2.1, except that in order
to deal with multi-block messages, we will need to use a more elaborate proof
technique called ‘hybrid proof’; we leave that for courses and books focusing on
Cryptology, e.g., [166, 370].
2.8.4 The Cipher Feedback (CFB) Mode
We now present the Cipher Feedback (CFB) mode. Like most standard modes,
it uses a random first block (‘initialization vector’, IV). In fact, CFB resembles
OFB; the IV is also the first block of the ciphertext (c0 = IV). Then, iteratively,
each ciphertext block ci is used to generate the following pseudo-random pad
block padi+1 = Ek(ci); note that there is no pad0 (as c0 is simply the IV).
Finally, the next ciphertext block, ci+1, is computed by bitwise XOR between
the corresponding pseudorandom pad block padi+1 and the corresponding
plaintext block mi+1.
Namely, we define CFB.Enc_k^E(m), the CFB mode for a given block cipher
(E, D), as follows. For simplicity we define CFB.Enc_k^E(m) for messages m
which consist of some number l of n-bit blocks, i.e., m = m1 ++ ... ++ ml,
where (∀i ≤ l) |mi| = n. The ciphertext CFB.Enc_k^E(m) consists of l + 1 blocks
c0 ++ c1 ++ ... ++ cl, defined by:

CFB.Enc_k^E(m) ≡ c0 ++ c1 ++ ... ++ cl, where:
c0 = IV ←$ {0, 1}^n
ci = mi ⊕ Ek(ci−1)  (for i ∈ {1, . . . , l})    (2.55)
Note that the difference between CFB and OFB is in the ‘feedback’ mechanism, namely, the computation of the pads padi (for i > 1). In CFB mode, this
is done using the ciphertext rather than the previous pad. See Fig. 2.32.
Figure 2.32: Cipher Feedback (CFB) mode encryption. Adapted from [218].
Figure 2.33: Cipher Feedback (CFB) mode decryption. Adapted from [218].
Optimizing implementations: parallel decryption, but no precomputation. Unlike OFB, the CFB mode does not support offline precomputation
of the pad, since the pad depends on the ciphertext (of the previous block).

One optimization that is possible is to parallelize the decryption operation.
Namely, decryption may be performed for all blocks in parallel, since the
decryption mi of block i is mi = ci ⊕ padi = ci ⊕ Ek(ci−1), i.e., it can be computed
based on the ciphertexts of this block and of the previous block.
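A CFB sketch makes the ciphertext feedback and the block-wise-independent decryption visible; as in the earlier sketches, HMAC-SHA256 stands in for Ek (an assumption), and the function names are ours.

```python
import hmac, hashlib, os

BLOCK = 16

def prf(k, x):  # toy stand-in for E_k; CFB, like OFB, never inverts E
    return hmac.new(k, x, hashlib.sha256).digest()[:BLOCK]

def xor(a, b):
    return bytes(p ^ q for p, q in zip(a, b))

def cfb_encrypt(key, m):
    c = [os.urandom(BLOCK)]                      # c_0 = IV
    for i in range(0, len(m), BLOCK):
        pad = prf(key, c[-1])                    # pad_i = E_k(c_{i-1}): ciphertext feedback
        c.append(xor(m[i:i+BLOCK], pad))         # c_i = m_i XOR pad_i
    return b"".join(c)

def cfb_decrypt(key, c):
    blocks = [c[i:i+BLOCK] for i in range(0, len(c), BLOCK)]
    # Each m_i = c_i XOR E_k(c_{i-1}) depends only on two ciphertext blocks,
    # so the iterations below are independent and could run in parallel.
    return b"".join(xor(blocks[i], prf(key, blocks[i-1]))
                    for i in range(1, len(blocks)))
```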
Error localization and integrity. Error localization in CFB is not perfect:
a single bit error in ciphertext block i flips the corresponding bit in plaintext
block i, and completely corrupts the following plaintext block (i + 1).
As we discussed for OFB, this reduction in error localization may be viewed
as an advantage in ensuring integrity. Like OFB mode, the CFB mode allows the
attacker to flip specific bits in the decrypted plaintext, by flipping corresponding
bits in the ciphertext. However, as a result of such bit flipping, say in block i, the
decrypted plaintext of the following block is completely corrupted. Intuitively,
this implies that applying an error-detection code to the plaintext would allow
detection of such changes, in contrast to the situation with OFB mode.
However, this dependency on the error detection code applied to the plaintext
may cause some concerns. First, it is an assumption about the way that CFB
is used; can we provide some defense for integrity that does not depend on such
additional mechanisms as an error detection code? Second, it seems challenging
to prove that the above intuition is really correct, and this is likely to depend
on the specifics of the error detection code used. Finally, adding an error detection
code to the plaintext increases its redundancy, in contradiction to Principle 9.
We next present the CBC mode, which provides a different defense for integrity,
which addresses these concerns.
2.8.5 The Cipher-Block Chaining (CBC) mode
Among the modes of operation defined in [134], the most widely-used, by far, is
the Cipher-Block Chaining (CBC) mode.
The CBC mode, like the OFB and CFB modes, uses a random Initialization
Vector (IV) as the first block of the ciphertext, c0 ←$ {0, 1}^n. However, in
contrast to OFB and CFB, to encrypt the i-th plaintext block mi, CBC XORs
mi with the previous ciphertext block ci−1, and then applies the block cipher.
Namely, ci = Ek(ci−1 ⊕ mi). See Fig. 2.34.
Figure 2.34: Cipher Block Chaining (CBC) mode encryption. Adapted from
[218].
More precisely, let (E, D) be a block cipher, and let m = m1 ++ ... ++ ml
be a message (broken into blocks). Then the CBC encryption of m using key k
and initialization vector IV ∈ {0, 1}^n is defined as:

CBC.Enc_k^E(m1 ++ ... ++ ml) = (c0 ++ c1 ++ ... ++ cl)    (2.56)
Figure 2.35: Cipher Block Chaining (CBC) mode decryption. Adapted from
[218].
where:

c0 = IV ←$ {0, 1}^n    (2.57)
(∀i ∈ {1, . . . , l}) ci ← Ek(ci−1 ⊕ mi)    (2.58)
We see that the CBC mode, like CFB, allows parallel decryption, but not
offline pad precomputation.
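Unlike the previous modes, CBC needs the block cipher's decryption function D. Since this sketch has no real block cipher available, it builds a toy invertible permutation from a 4-round Feistel network over an HMAC round function; this stand-in (and the function names) are assumptions of the sketch, not a vetted cipher.

```python
import hmac, hashlib, os

BLOCK = 16

def F(key, r, half):  # Feistel round function; need not be invertible
    return hmac.new(key, bytes([r]) + half, hashlib.sha256).digest()[:BLOCK // 2]

def xor(a, b):
    return bytes(p ^ q for p, q in zip(a, b))

def E(key, block):  # toy 4-round Feistel: an invertible permutation on 16 bytes
    L, R = block[:8], block[8:]
    for r in range(4):
        L, R = R, xor(L, F(key, r, R))
    return L + R

def D(key, block):  # invert by running the rounds in reverse
    L, R = block[:8], block[8:]
    for r in reversed(range(4)):
        L, R = xor(R, F(key, r, L)), L
    return L + R

def cbc_encrypt(key, m):
    c = [os.urandom(BLOCK)]                             # c_0 = IV, fresh and random
    for i in range(0, len(m), BLOCK):
        c.append(E(key, xor(c[-1], m[i:i+BLOCK])))      # c_i = E_k(c_{i-1} XOR m_i)
    return b"".join(c)

def cbc_decrypt(key, c):
    blocks = [c[i:i+BLOCK] for i in range(0, len(c), BLOCK)]
    return b"".join(xor(D(key, blocks[i]), blocks[i-1])  # m_i = D_k(c_i) XOR c_{i-1}
                    for i in range(1, len(blocks)))
```

Note that each decrypted block depends only on ci and ci−1, which is exactly why CBC decryption parallelizes while encryption does not.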
The CBC mode, like the other modes (except ECB), ensures IND-CPA, i.e.,
security against CPA attacks, provided that the underlying block cipher is a
secure invertible PRP; however, it is vulnerable to CCA attacks.
Exercise 2.23. Demonstrate that CBC mode does not ensure security against
CCA attacks.
Hint: the solution is quite similar to that of Exercise 2.22.
Incorrect use vulnerability: BEAST exploit of predictable IV. While
CBC mode ensures security in the sense of IND-CPA, i.e., against Chosen
Plaintext Attacks, this is only true if CBC is used correctly; in particular, the
IV (also used as c0) must be random. The SSL protocol, as well as version
1.0 of TLS, uses CBC encryption in the following incorrect way: the IV is selected
randomly only for the first message m0 in a connection; for subsequent
messages, say mi, the IV is simply the last ciphertext block of the previous
message. This creates a vulnerability exploited, e.g., by the BEAST attack. For
details, see subsection 7.2.4, Exercise 7.8 and [25, 132].
Error propagation and integrity. Any change in the CBC ciphertext, even
of one bit, results in unpredictable output from the block cipher’s ‘decryption’
operation, and hence unpredictable decryption. Namely, flipping a bit in
ciphertext block i does not flip the corresponding bit in plaintext block i, as it
did in the OFB and CFB modes.
However, flipping a bit in ciphertext block ci−1, without change
to block ci, results in the flipping of the corresponding bit in the i-th decrypted
plaintext block. Namely, bit-flipping is still possible in CBC, it is just a bit
different: in order to flip a bit in decrypted-plaintext block i, the
adversary has to flip the corresponding bit in the previous block (i − 1), which
results in corruption of the decryption of block i − 1. Indeed, this kind of
tampering is used in several attacks on systems deploying CBC, such as the
POODLE attack [290]. Note also that bit flipping in the first decrypted-plaintext
block only requires flipping of the corresponding bit of the IV, and hence does not
corrupt any plaintext block.
2.8.6 Modes of Operation Ensuring CCA Security?
We already observed, in Exercise 2.22, that any cryptosystem (and mode) that
ensures error localization to some extent cannot be IND-CCA secure. This
implies that none of the modes we discussed is IND-CCA secure. Such failure
can occur even for the much weaker - and more common - case of Feedback-only
CCA attacks, where the attacker does not receive the decrypted plaintext, but
only an indication of whether the plaintext was ‘valid’ or not.
How can we ensure security against CCA attacks? One intuitive defense
is to avoid giving any feedback on invalid-plaintext failures. However, this is
harder than it may seem. For example, often, after (successful) decryption, a
response is immediately sent, which may be hard to emulate when the plaintext
is invalid; we may even be unable to identify the sender, e.g., if the sender
identity is encrypted for anonymity. By observing whether a response is sent, or the
timing of the response, an attacker may obtain feedback on the attack. Such
unintentional indications are referred to as side channels; for example, when the
feedback is based on the time the response is sent, this is a timing side channel.
A better approach may be to prevent responses to chosen-ciphertext queries,
without decrypting them. One simple way to do this is to authenticate the
ciphertext, typically by appending to the ciphertext an authentication tag, which
allows secure detection of any modification of the ciphertext. Several of the
more modern, widely used modes of operation, e.g., GCM [135, 343], combine
authentication and encryption, with one benefit being the protection against
chosen-ciphertext attacks. Authentication is the subject of the next chapter.
2.9 Padding Schemes and Padding Oracle Attacks
All modes of operation are defined for input whose length is an integral number
l of blocks. In most applications, the input may not be an integral number of
blocks, but a string of an arbitrary number of bits or, more commonly, of bytes.
The principle of all padding schemes is quite simple. Before encryption, the
plaintext is padded to an integral number of blocks, by appending a pad string
to the message, which is removed after decryption. The length of the pad is
between one byte and a whole block (l bytes), and is chosen to ensure that the
padded message (message plus pad) fits in an integral number of blocks.
Padding schemes mainly differ in the contents of the pad string, and in the
validation of the pad after decryption. Two commonly used padding schemes
applied to plaintext before shared-key encryption are:

X9.23 padding: Several protocols, most notably the SSL protocol discussed
in Chapter 7, use the following padding scheme, which we refer to as
X9.23 padding, since it was defined in the ANSI X9.23 standard [13]. In
X9.23 padding, the last byte of the pad contains the length of the pad
minus one, i.e., the length excluding this byte. The length of the pad is
restricted to the block-length l (for convenience, we consider this restriction
of the pad length to one block to be a mandatory property of X9.23 padding,
although some implementations may not enforce it); hence the value of the
last byte must be a number between zero and l − 1. If the result of decryption
does not end with a byte between zero and l − 1, then the ciphertext is considered
to have invalid padding, and a padding error is returned. The other bytes
of the X9.23 pad can have arbitrary values (e.g., all zeros, or random
values).

PKCS#5 padding: Other protocols, most notably the TLS protocol (also
discussed in Chapter 7), use the following padding scheme, which we refer
to as PKCS#5 padding. PKCS#5 padding is defined in several standards,
including PKCS#5, PKCS#7 and RFC 5652 [206]. It is similar to ANSI
X9.23 padding, the main difference being that all padding bytes must
contain the same value as the last byte, i.e., the length of the pad minus
one. Namely, a one-byte pad will contain the byte 00, a two-byte pad will
contain the bytes 0101, and so on. If the result of decryption does not have
this pattern, then a padding error is returned. Another difference is that
PKCS#5 padding allows the pad to be longer than a single block (up to
256 bytes, to ensure that the pad length minus one can fit in one byte).
With both the X9.23 and PKCS#5 padding schemes, the decrypted plaintext
should end with a valid pad, which should be (efficiently) verified by the recipient.
We say that ciphertext c has an invalid pad, and return a padding error, if the
decrypted plaintext has an invalid pad.

Consider an l-byte block m, or a multi-block message whose last block is
m. Let m[i], for i = 1, . . . , l, denote the i-th most-significant byte of m. Block
m has valid ANSI X9.23 padding if m[l] < l. For example, for blocks of l = 16
bytes, the four most-significant bits of the last byte must be all zeros, i.e., the
last byte must be of the form 0x0ϕ in hexadecimal notation. In this case, ϕ
can be any hexadecimal digit, from 0 to F, representing the corresponding four
bits in binary.

Similarly, plaintext string m has valid PKCS#5 padding if the value of the
last byte of m, which we denote x = m[|m|], is also the value of the x preceding
bytes of m, i.e., m[(|m| − x) : |m|] = x x · · · x. Namely, the last x + 1 bytes of m
(the entire pad) contain the same value x.
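Both schemes, as described in this section (pad bytes hold the pad length minus one; note that common PKCS#7 implementations instead store the pad length itself), can be sketched in a few lines. The function names are ours, and for simplicity this sketch restricts the PKCS#5 pad to a single block as well.

```python
BLOCK = 16  # block length l, in bytes

def x923_pad(m: bytes) -> bytes:
    n = BLOCK - (len(m) % BLOCK)               # pad length: 1..BLOCK bytes
    return m + bytes(n - 1) + bytes([n - 1])   # zero filler; last byte = length - 1

def x923_unpad(p: bytes):
    if p[-1] >= BLOCK:
        return None                            # padding error
    return p[:-(p[-1] + 1)]

def pkcs5_pad(m: bytes) -> bytes:
    n = BLOCK - (len(m) % BLOCK)
    return m + bytes([n - 1]) * n              # every pad byte holds length - 1

def pkcs5_unpad(p: bytes):
    n = p[-1] + 1
    if n > len(p) or p[-n:] != bytes([p[-1]]) * n:
        return None                            # padding error
    return p[:-n]
```

For example, a 15-byte message gets the one-byte X9.23/PKCS#5 pad 00, while a full 16-byte message gets a whole extra block of padding.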
The reader may wonder why we bother describing these two simple and
very similar schemes. The reason is that padding error indications, which we
refer to as padding oracles, are exploited in many attacks. In Chapter 7, we
discuss practical padding oracle attacks against SSL and TLS. In this section,
we describe the Padding Oracle Attack model, and then present a simple padding
oracle attack against ECB and CBC modes using X9.23 padding, extended in
Exercise 2.24 to an attack against CBC mode using PKCS#5 padding.
The Padding Oracle Attack Model. In many systems, an attacker may be
able to detect when a message with invalid padding is received, by observing
an explicit ‘invalid pad’ error message, or otherwise, e.g., by detecting the
different behavior of the recipient due to an invalid pad, such as different timing
of a response; the latter is a special case of a timing side channel. In any case,
we refer to this ability to detect the validity of the padding of the (decrypted)
plaintext, for a given ciphertext, as a padding oracle.
Figure 2.36: The Padding Oracle Attack model.
We illustrate the basic Padding Oracle Attack model in Figure 2.36; this
is basically a CTO attack with the addition of the padding oracle capability.
Note that, following the Kerckhoffs’ principle (Principle 2), we assume that
the attacker knows the details of the padding scheme in use, as well as other
aspects of the system.
Of course, the padding oracle capability may also complement other attacker
capabilities. In particular, in subsection 7.2.2, we present the CPA-Oracle
Attack model, where the attacker also has chosen-plaintext capability. The
CPA-Oracle Attack model is quite realistic, although it is more powerful than
the Padding Oracle Attack model. Indeed, the CPA-Oracle Attack model
is often used for security evaluation of practical protocols. In particular, in
subsection 7.2.2 we discuss practical attacks against different versions of the
SSL and TLS protocols, using the CPA-Oracle Attack model.
Simple padding-oracle attacks. Let us present simple padding oracle
attacks against the ECB and CBC modes of operation, when using X9.23
padding. Assume, for example, that we use blocks of 16 bytes (as in AES). Hence, the
length of the pad is between one and 16 bytes, and the encoded value in the last byte
should be 0x0ϕ in hexadecimal notation (where ϕ is one hexadecimal digit).
Some block ciphers use shorter blocks (e.g., 8 bytes for DES), which requires a
tiny change in the attack and slightly increases the exposure due to the attack.
Padding-Oracle attack on X9.23 padding using ECB mode. Consider plaintext message m = m1 ++ m2 ++ ... ++ mn, containing n − 1 full blocks and one empty or non-full block mn. In ECB mode, each non-final plaintext block mi (i.e., i < n) is encrypted directly as ci = Ek(mi). The final plaintext block mn is padded before encryption, i.e., encrypted as cn = Ek(pad(mn)), where pad(·) is the X9.23 padding function.
The attack can be applied to any block of the ciphertext except the last block; let us focus on some block ci where i < n. Instead of sending ci as an intermediate block of a longer ciphertext message, the attacker sends ci as if it were the last block of a ciphertext message; e.g., the ciphertext may consist only of this single block ci. (If the ciphertext should have multiple blocks, prepend some blocks before ci; we only need ci to be the last block of the ciphertext.)
Following the Padding Oracle Attack model, the attacker receives an indication whether the decryption of ci , i.e., mi = Dk (ci ), has valid padding or not.
If mi has valid padding, then its last byte must encode a valid pad length; namely, the last byte of mi must be of the form 0x0ϕ (in hexadecimal notation), i.e., its four most significant bits must be zeros. This is an exposure of (limited) information about these four bits of the plaintext, i.e., an indication of whether these four bits are all zeros or not.
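To make the leak concrete, here is a minimal Python sketch of this ECB scenario. The 16-byte block cipher is a toy Feistel construction (a hash-based stand-in, used only so the example is self-contained, not a vetted cipher), and the oracle implements the simplified validity check used in the text: padding is considered ‘valid’ iff the top nibble of the last decrypted byte is zero.

```python
import hashlib
import os

BLOCK = 16

def _f(k, half, r):
    # Feistel round function (toy stand-in, not a real cipher)
    return hashlib.sha256(k + bytes([r]) + half).digest()[:BLOCK // 2]

def enc_block(k, b):
    L, R = b[:BLOCK // 2], b[BLOCK // 2:]
    for r in range(4):
        L, R = R, bytes(x ^ y for x, y in zip(L, _f(k, R, r)))
    return L + R

def dec_block(k, b):
    L, R = b[:BLOCK // 2], b[BLOCK // 2:]
    for r in reversed(range(4)):
        L, R = bytes(x ^ y for x, y in zip(R, _f(k, L, r))), L
    return L + R

def x923_pad(m):
    # X9.23: random fill bytes, last byte encodes the pad length (1..16)
    p = BLOCK - len(m) % BLOCK
    return m + os.urandom(p - 1) + bytes([p])

def ecb_encrypt(k, m):
    m = x923_pad(m)
    return b"".join(enc_block(k, m[i:i + BLOCK])
                    for i in range(0, len(m), BLOCK))

def padding_oracle(k, c):
    # Recipient-side check, simplified as in the text:
    # 'valid' iff the last decrypted byte has the form 0x0phi
    return dec_block(k, c[-BLOCK:])[-1] >> 4 == 0

k = os.urandom(16)
# Two full (non-final) plaintext blocks the attacker probes:
blk_a = b"fifteen bytes.." + bytes([0x07])  # last byte 0x07: top nibble 0
blk_b = b"fifteen bytes.." + b"Z"           # last byte 0x5A: top nibble 5
for blk in (blk_a, blk_b):
    c = ecb_encrypt(k, blk + b"more plaintext here")
    ci = c[:BLOCK]                # attacker cuts out E_k(blk)
    print(padding_oracle(k, ci))  # sent as a final block: the leak
```

Sending the interior block `ci` as a final block thus reveals whether the four most significant bits of the last byte of that plaintext block are all zeros.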
Padding-Oracle attack on X9.23 padding using CBC mode. This attack, like the one on ECB mode, is applicable to every non-last ciphertext block ci. Again, the basic idea is to send ci as the last block of ciphertext messages to the recipient, and to learn information about the value of a few bits of the plaintext mi, using the response of the padding oracle.
Specifically, let mi[j] denote the j-th bit of mi. Recall that we use l to denote the number of bits in each block, i.e., mi = mi[1] ++ ... ++ mi[l]. We show how, when using X9.23 padding, the attack finds the four plaintext bits mi[j] for j = l − 7, l − 6, l − 5 and l − 4. This requires only access to ci and to the padding oracle for CBC mode. Exercise 2.24 extends the attack to PKCS#5 padding, where the attack finds the entire last byte of mi.
Recall that in CBC mode, the ciphertext block ci is computed as ci = Ek(ci−1 ⊕ mi). The attacker now sends ciphertexts c′ containing i blocks, where c′i = ci, using different values for the previous ciphertext block c′i−1. The value of c′i−1 may differ from the value of ci−1, the original previous ciphertext block. (The other blocks of c′ do not have any impact on the attack and can even be eliminated, but to simplify notation, assume c′ contains at least i blocks.)
Let m′i denote the last decrypted plaintext block; since we use CBC, it is computed as:

m′i = Dk(ci) ⊕ c′i−1 = (mi ⊕ ci−1) ⊕ c′i−1    (2.59)
The attacker has (only) access to the padding oracle, which indicates whether the last block of the decrypted message, m′i, has correct padding or not. This depends only on the value of the last byte of m′i; correct padding requires that the last byte of m′i be of the form 0x0ϕ, i.e., its four most significant bits should be all zeros. In other words, m′i[l − 7 : l − 4] = 0000 (in binary).
Recall that m′i = Dk(ci) ⊕ c′i−1; therefore, the attacker can try the 16 different values of the four most significant bits of the last byte of c′i−1, until it finds a value c′i−1 that results in correct padding, i.e., in a plaintext m′i whose last byte is of the form 0x0ϕ, i.e., m′i[l − 7 : l − 4] = 0000. Hence, from Equation 2.59, we have:

mi[l − 7 : l − 4] = ci−1[l − 7 : l − 4] ⊕ c′i−1[l − 7 : l − 4]    (2.60)
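The nibble-recovery step can be sketched in a few lines of Python. Since the attacker only ever submits the one fixed block ci (with varying c′i−1), we can model the secret value Dk(ci) as an unknown random block, without implementing an actual cipher; the names below are illustrative only.

```python
import os

BLOCK = 16

# Secret values held by the recipient. D_k(c_i) is modeled as a fixed
# unknown block, since the attacker only ever submits this one block c_i.
Dk_ci = os.urandom(BLOCK)    # stands in for D_k(c_i)
c_prev = os.urandom(BLOCK)   # the genuine previous block c_{i-1}
m_i = bytes(a ^ b for a, b in zip(Dk_ci, c_prev))  # plaintext under attack

def padding_oracle(c_prime_prev):
    # Recipient computes m'_i = D_k(c_i) XOR c'_{i-1} and accepts iff
    # the last byte has the form 0x0phi, i.e., its top nibble is zero.
    return (Dk_ci[-1] ^ c_prime_prev[-1]) >> 4 == 0

# Attack: vary the top nibble of the last byte of c'_{i-1} until the
# padding oracle accepts; at most 16 queries are needed.
forged = bytearray(os.urandom(BLOCK))
for nib in range(16):
    forged[-1] = (nib << 4) | (forged[-1] & 0x0F)
    if padding_oracle(bytes(forged)):
        break

# Acceptance means m'_i[l-7 : l-4] = 0000, so Equation (2.60) gives:
recovered = (c_prev[-1] >> 4) ^ (forged[-1] >> 4)
print("top nibble of last plaintext byte:", recovered)
```

The loop always terminates with an accepting query, since exactly one of the 16 nibble values zeroes the top nibble of m′i[-1].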
The reader may have noticed that the attack exploits the fact that X9.23 padding limits the pad length to one block. PKCS#5 padding does not impose this restriction, which may seem to make it secure against such a padding oracle attack. However, Exercise 2.24 extends the attack to the case of PKCS#5 padding. The extended attack requires more padding-oracle queries, but is also more rewarding, as it allows recovery of the entire plaintext.
Exercise 2.24. Present a padding attack, assuming the availability of a padding oracle and the use of PKCS#5 padding. The attack should find the entire last byte of a non-last plaintext block mi, given only the corresponding ciphertext block ci and previous ciphertext block ci−1, and access to the padding oracle.
Hint: A random plaintext block would have valid padding if its last byte contains 0x00, i.e., the entire pad is this single last byte; other random plaintext blocks are unlikely to have valid padding (why?). Use this to find the last byte of mi. Once this byte is found, we repeat a similar logic to find the preceding byte of mi, using the fact that a valid random plaintext whose last byte contains 0x01 would also have 0x01 in the preceding byte. We then proceed similarly to find all plaintext bytes of mi.
The reader may also find the solution to this exercise by reading [380].
Additional padding oracle attacks, focusing on the SSL and TLS protocols, are
described in subsection 7.2.3.
2.10 Case study: the (in)security of WEP
We conclude this chapter, and further motivate the next, by discussing a case study: vulnerabilities of the Wired Equivalent Privacy (WEP) standard [103]. WEP was developed as part of the IEEE 802.11b standard, to provide some protection of data over wireless local area networks (also known as WiFi networks). As the name implies, the original goals aimed at a limited level of privacy (meaning confidentiality), which was deemed ‘equivalent’ to the (physically limited) security offered by a wired connection.
WEP’s critical vulnerabilities were discovered long ago, mostly in [77], relatively soon after the standard was published; yet, products and networks supporting WEP still exist. This is an example of the fact that once a standard is published and adopted, it is often very difficult to fix its security. Hence, it is important to carefully evaluate security in advance, in an open process that encourages researchers to find vulnerabilities, and, where possible, with proofs of security. To address these vulnerabilities, WEP was replaced, possibly in too much haste, with a new standard, the Wi-Fi Protected Access (WPA), which so far has three versions (WPA1, WPA2, WPA3). Vulnerabilities were also found in these, e.g., see [378, 379], but they are more subtle and harder to exploit; WEP itself should only be used for educational purposes, as we do here.
WEP assumes a symmetric key shared between the mobile device and an access point, which is used as the key (seed) for the RC4 cipher. WEP networks share the same key among all mobiles; this means that each device that has the key can eavesdrop on all communication. This vulnerability also exists in the common deployments of more advanced WiFi security protocols, such as WPA (versions 1 to 3). We discuss additional vulnerabilities which are specific to WEP, and make it insecure even against an attacker that is not given the key to connect to the network.
Confidentiality in WEP is protected using the RC4 PRG, used as a stream cipher as in subsection 2.5.1. RC4 is initiated with a secret shared key, which, in WEP, is specified to be only 40 bits long. This short key size was chosen intentionally, to allow export of the hardware: when the standard was drafted, the United States and many other countries had export limitations on strong cryptography, which necessarily uses longer keys. Many WEP implementations also support longer, 104-bit keys for RC4; however, published attacks show that even with 104-bit keys, RC4 is still vulnerable; see subsection 2.5.6, subsection 7.2.5 and [11, 235, 276].
The WEP PRG is initiated with a 24-bit per-packet random Initialization Vector (IV). We use RC4IV,k to denote the string output by RC4 when initialized using a given IV, k pair. More specifically, we use RC4IV,k[l] to denote the first l bits of RC4IV,k, i.e., of the output of RC4 when initialized using the given IV, k pair.
WEP packets use the CRC-32 error detection code, computed over the plaintext message m. CRC-32 [240] is one of the standard variants of the CRC (cyclic redundancy check) codes. CRC codes are popular error-detecting codes (EDC); they are simple to implement and efficiently detect errors in data caused by random noise corruptions. Namely, if m′ is the result of such random corruption of m, then, with high probability, m and m′ will have different CRC codes, i.e., CRC(m) ̸= CRC(m′), allowing detection of the corruption by comparing their CRC codes. Note that, for simplicity, and since we do not discuss other CRC codes, we use CRC(m) to denote the CRC-32 code computed over message m.
CRC codes, and in particular CRC-32, are linear, in the sense that for any
two strings m, m′ ∈ {0, 1}∗ of equal length (|m| = |m′ |), holds:
CRC(m ⊕ m′) = CRC(m) ⊕ CRC(m′)   (if |m| = |m′|)    (2.61)
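One can check this property with Python’s built-in CRC-32. Note that zlib’s crc32 pre- and post-complements its register, so it is affine rather than strictly linear, and the identity picks up a constant term CRC(0^|m|); for the idealized linear CRC of Equation 2.61, this term vanishes.

```python
import os
import zlib

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

a, b = os.urandom(32), os.urandom(32)
zeros = bytes(32)

# zlib's CRC-32 is affine: CRC(a XOR b) = CRC(a) XOR CRC(b) XOR CRC(0^len)
assert zlib.crc32(xor(a, b)) == zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(zeros)
```

The constant term depends only on the length, so it cancels in differences; this is exactly what the bit-flipping attack below exploits.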
WEP uses CRC as follows. To send a message m using secret key k, WEP implementations select a random 24-bit IV, and transmit the IV together with WEPk(m, IV), defined as:

WEPk(m, IV) ≡ RC4IV,k[32 + |m|] ⊕ (m ++ CRC(m))    (2.62)
The length of the WEP transmission is, therefore, the length of the message
m, plus 56 bits: 24 bits for the IV and 32 bits for the CRC-32 code.
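As an illustration, here is a sketch of this encapsulation in Python, with RC4 implemented directly (the standard library has no RC4). Details such as seeding RC4 with IV ++ k and the little-endian encoding of the CRC field follow common WEP implementations; treat them as assumptions of this sketch rather than a precise rendering of the standard.

```python
import zlib

def rc4(seed, n):
    """RC4 keystream: n bytes from the given seed (KSA + PRGA)."""
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + seed[i % len(seed)]) % 256
        S[i], S[j] = S[j], S[i]
    out, i, j = [], 0, 0
    for _ in range(n):
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return bytes(out)

def wep_encrypt(k, m, iv):
    """WEP body per Eq. (2.62): RC4_{IV,k}[|m|+32] XOR (m ++ CRC(m));
    the 3-byte IV is transmitted in the clear alongside."""
    crc = zlib.crc32(m).to_bytes(4, "little")
    ks = rc4(iv + k, len(m) + 4)
    return iv, bytes(x ^ y for x, y in zip(m + crc, ks))

k, iv = b"\x0b\xad\xc0\xde\x42", b"\x00\x00\x07"  # 40-bit key, 24-bit IV
iv, body = wep_encrypt(k, b"hello, wep", iv)
# Receiver: regenerate the pad, strip and verify the CRC field
ks = rc4(iv + k, len(body))
p = bytes(x ^ y for x, y in zip(body, ks))
assert p[:-4] == b"hello, wep"
assert p[-4:] == zlib.crc32(p[:-4]).to_bytes(4, "little")
```

The transmission is len(m) + 4 ciphertext bytes plus the 3-byte cleartext IV, matching the 56-bit overhead above.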
2.10.1 CRC-then-XOR does not ensure integrity
CRC-32 is a quite good error detection code. By encrypting the output of CRC, specifically by XORing it with the pseudo-random pad generated by RC4, the WEP designers hoped to protect message integrity, i.e., not only to detect random corruptions, but also to prevent intentional modification or forgery of messages. However, error-detection codes are designed to detect random, not intentional, corruptions; i.e., for reliability, not for security. In particular, it is easy to find a collision, i.e., messages m and m′ such that m ̸= m′ yet CRC(m) = CRC(m′). Furthermore, we next show how the linearity of the CRC (Equation 2.61) allows an attacker to change the message m sent in a WEP packet, by flipping any desired bits and appropriately adjusting the CRC field.
Specifically, let ∆ represent the string of length |m| containing 1 in the bit locations that the attacker wishes to flip. Having eavesdropped and obtained WEPk(m, IV), the attacker can compute a valid WEPk(m ⊕ ∆, IV) as follows:
WEPk(m ⊕ ∆, IV) = RC4IV,k[32 + |m ⊕ ∆|] ⊕ ([m ⊕ ∆] ++ CRC(m ⊕ ∆))
                = RC4IV,k[32 + |m|] ⊕ ([m ⊕ ∆] ++ [CRC(m) ⊕ CRC(∆)])
                = RC4IV,k[32 + |m|] ⊕ (m ++ CRC(m)) ⊕ (∆ ++ CRC(∆))
                = WEPk(m, IV) ⊕ (∆ ++ CRC(∆))
Namely, the XOR-encrypted CRC mechanism does not provide any meaningful integrity protection: an attacker can easily flip bits in a WEP message and have it properly received.
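The forgery is easy to demonstrate in Python. The keystream below is just a secret random string standing in for RC4IV,k. Note that zlib’s CRC-32 complements its register before and after, making it affine rather than strictly linear, so the fix-up term is CRC(∆) ⊕ CRC(0^|m|); for the idealized linear CRC of Equation 2.61, CRC(∆) alone would do.

```python
import os
import zlib

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def crc(m):
    return zlib.crc32(m).to_bytes(4, "little")

# Sender: encrypt m ++ CRC(m) with a secret keystream (stand-in for RC4_{IV,k})
keystream = os.urandom(20)
m = b"PAY $0100 TO BOB"
c = xor(m + crc(m), keystream)

# Attacker (no key): flip chosen bits via Delta and fix up the CRC field.
delta = xor(b"PAY $0100 TO BOB", b"PAY $9999 TO EVE")
crc_fix = xor(crc(delta), crc(bytes(len(m))))  # affine correction term
forged = xor(c, delta + crc_fix)

# Receiver: decrypts, and the integrity check passes on the forged message.
p = xor(forged, keystream)
m2, tag = p[:-4], p[-4:]
assert tag == crc(m2)
print(m2)  # the attacker-chosen message
```

The attacker never touches the keystream or the key; only the publicly computable pair (∆, CRC fix-up) is XORed into the ciphertext.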
WEP authentication-based vulnerabilities. We have just seen that WEP fails to provide integrity; however, as we now show, WEP also fails to ensure confidentiality. Interestingly, the most devastating vulnerability, which is the one we show, takes advantage of WEP’s shared-key authentication mode; WEP also defines a mode called open-system authentication, which simply means that there is no authentication, and is therefore not vulnerable to this specific attack.
Even when using open-system authentication, i.e., giving up on authentication,
WEP is still vulnerable to other cryptanalysis attacks exploiting RC4 weaknesses,
e.g., see [276].
However, we focus on the vulnerability of WEP when using the shared-key authentication mode. It works very simply: the access point sends a random challenge R, and the mobile sends back WEPk(R, IV), i.e., a proper WEP packet containing the ‘message’ R.
This authentication mode is currently rarely used, since it allows attacks on the encryption mechanism. First, notice that it provides a trivial way for the attacker to obtain ‘cribs’ (known plaintext-ciphertext pairs). Of course, encryption systems should be protected against known-plaintext attacks; however, following the conservative design principle (Principle 3), system designers should try to make it difficult for attackers to obtain cribs. In the common, standard case of 40-bit WEP implementations, a crib is deadly: an attacker can now do a trivial exhaustive search to find the key.
Even when using longer keys (104 bits), shared-key authentication exposes WEP to a simple cryptanalysis attack. Specifically, since R is known, the attacker learns RC4IV,k for a given, random IV. Since the length of the IV is just 24 bits, it is feasible to obtain a collection of most IV values and the corresponding RC4IV,k pads, allowing decryption of most messages.
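A sketch of this keystream-recovery-and-replay attack follows, modeling the per-IV RC4 pads as secret random strings (an abstraction; no RC4 or key is implemented, since the attack never needs them).

```python
import os
import zlib

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def crc(m):
    return zlib.crc32(m).to_bytes(4, "little")

# Secret, per-IV RC4 pads (stand-in for RC4_{IV,k}; only 2^24 IVs exist).
pads = {}
def wep_body(iv, m):
    ks = pads.setdefault(iv, os.urandom(64))
    return xor(m + crc(m), ks)

iv = b"\x00\x00\x07"
# 1. Shared-key authentication: challenge R is public, response is WEP(R).
R = os.urandom(16)
response = wep_body(iv, R)
# Attacker recovers the first |R|+4 keystream bytes without any key:
recovered_pad = xor(response, R + crc(R))

# 2. Later traffic that happens to reuse this IV is now readable:
secret = b"meet at midnight"
c = wep_body(iv, secret)
print(xor(c, recovered_pad)[:len(secret)])
```

Collecting such pads for most of the 2^24 IVs is feasible, after which most traffic decrypts by a simple table lookup.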
As a result of these concerns, most WEP systems use only open-system
authentication mode, i.e., do not provide any authentication.
Further WEP encryption vulnerabilities. We briefly mention two further vulnerabilities of the WEP encryption mechanism.
The first vulnerability exploits the integrity vulnerability discussed earlier. As explained there, the attacker can flip arbitrary bits in the WEP payload message. WEP is a link-layer protocol; the payload is usually an Internet Protocol (IP) packet, whose header contains, in a known position, the destination address. An attacker can change the destination address, causing forwarding of the packet directly to the attacker!
The second vulnerability is the fact that WEP uses ‘plain’ RC4, which has
been shown in [276] to be vulnerable.
2.11 Encryption: Final Words
Confidentiality, as provided by encryption, is the oldest goal of cryptology, and is still critical to the entire area of cybersecurity. Encryption has been studied for millennia, but for many years, the design of cryptosystems was kept secret, in the hope of improving security. Kerckhoffs’ principle (Principle 2), however, has been widely adopted, and has caused cryptography to be widely studied and deployed, in industry and academia.
Cryptography was further revolutionized by the introduction of precise definitions and proofs of security by reduction, based on the theory of complexity. In particular, the modern study of applied cryptography makes extensive use of provable security, especially computational security, i.e., ensuring security properties with high probability against Probabilistic Polynomial Time (PPT) adversaries. We have seen a small taste of such definitions and proofs in this
chapter; we will see a bit more later on, but for a real introduction to the theory
of cryptography, see appropriate textbooks, such as [165, 166].
2.12 Lab and Additional Exercises
Lab 2 (Ransomware and Encryption). In this lab, we explore the abuse of cryptography by ransomware. Ransomware encrypts the user’s files, and requires the user to pay ‘ransom’, with the promise of sending back the decryption key or program.
As for the other labs in this textbook, we will provide Python scripts for generating and grading this lab (LabGen.py and LabGrade.py). If not yet posted online, professors may contact the author to receive the scripts. The lab-generation script generates random challenges for each student (or team), as well as solutions which will be used by the grading script. We recommend making the scripts available to the students, as an example of how to use the cryptographic functions. It is easy and permitted to modify these scripts to use other languages/libraries, or to modify and customize them as desired.
The lab has two parts.
1. In this part, you are given a ransomware program, R1.py, and your task is to reverse-engineer and break it, i.e., decrypt the files without paying ransom. You will be able to do it, since R1.py is a simple Python program using a shared-key cryptosystem, specifically, the AES block cipher in CBC mode; see Section 2.6. In the next part, we will discuss more realistic ransomware, that uses public-key encryption rather than shared-key encryption, making it infeasible to recover the files without paying ransom, by reverse-engineering of the ransomware.
The ransomware R1.py has two outputs for each input file, say example.txt: its encryption, example.txt.enc, and a token, example.txt.token, to be sent with the ransom payment. The token is needed since R1.py selects a different random shared key to encrypt each file, e.g., example.txt; the attacker uses example.txt.token to find the decryption key.
Input: The ‘weak ransomware program’, given conveniently (and unrealistically) as a Python script, R1.py, the encrypted file example.txt.enc and the token example.txt.token.
Goal: reverse-engineer R1.py and then, using the token example.txt.token, recover the original file, example.txt.
Submission: the recovered example.txt file and your program, A1.py, that produced it (given example.txt.enc and example.txt.token as input).
Note: Before encryption, the plaintext (example.txt) was padded so that its length would be a whole number of ‘blocks’. The padding has to be removed after applying the block-cipher’s decryption function.
2. In this part you develop ‘strong’ ransomware, using public-key (asymmetric) encryption. You will develop and submit the following programs:
Identifier | Cipher       | Ciphertext   | Plaintext and key | Time
A          | Caesar       | JUHDW SDUWB  |                   |
B          | AzBy         | ILFMW GZYOV  |                   |
C          | ROT13        | NYBAR NTNVA  |                   |
D          | Keyed Caesar | XRORT SQUIQH |                   |
E          |              | BLFMT OLEVI  |                   |
F          |              | LGTIE JXKYY  |                   |
G          |              | FZNEG UBHFR  |                   |
H          |              | EUHDN UXOHV  |                   |

Table 2.6: Ciphertexts for Exercise 2.25. All plaintexts are pairs of two simple five- to six-letter words. The four upper examples have the cipher spelled out; the four lower examples hide it (‘obscurity’). That does not make them secure, but decryption may take a bit longer.
a) A key-generation program KG2.py, that outputs a keypair of a public
encryption key e and a private decryption key d.
b) A ransomware program R2.py, which uses the encryption key e and
outputs, for each input őle example.txt, two őles: its encryption,
example.txt.enc, and a payment token, example.txt.pay, to be sent
with the ransom payment.
c) A token-processing program TP2.py, which uses the private decryption key d, and outputs, for each input payment token example.txt.pay,
a decryption token example.txt.d.
d) A decryption program D2.py, using the decryption token example.txt.d, and recovering the plaintext example.txt given its encryption,
example.txt.enc, as input.
Exercise 2.25. Table 2.6 shows eight ciphertexts, each using one of the four simple substitution ciphers identified in the top four rows. Decipher the ciphertexts, measuring the time it takes you to decipher each of them. Fill in the blanks in the table: the plaintexts, the time it took you to decipher each message, and the ciphers and keys (when relevant). Did knowledge of the cipher significantly ease the cryptanalysis process? Did knowledge of the key?
Exercise 2.26. ConCrypt Inc. announces a new symmetric encryption scheme, CES. ConCrypt announces that CES uses 128-bit keys, is five times faster than AES, and is the first practical cipher to be secure against computationally-unbounded attackers. Is there any method, process or experiment to validate or invalidate these claims? Describe one, or explain why there is none.
Exercise 2.27. ConCrypt Inc. announces a new symmetric encryption scheme, CES512. ConCrypt announces that CES512 uses 512-bit keys, and as a result, is proven to be much more secure than AES. Can you point out any concerns with using CES512 instead of AES?
Exercise 2.28. Compare the following pairs of attack models. For each pair (A, B), state whether every cryptosystem secure under attack model A is also secure under attack model B, and vice versa. Prove your answers (if you fail to prove, at least give a compelling argument). The pairs are:
1. (Ciphertext only, Known plaintext)
2. (Known plaintext, Chosen plaintext)
3. (Known plaintext, Chosen ciphertext)
4. (Chosen plaintext, Chosen ciphertext)
Exercise 2.29. Alice is communicating using the GSM cellular standard, which
encrypts all calls between her phone and the access tower. Identify the attacker
model corresponding to each of the following cryptanalysis attack scenarios:
1. Assume that Alice and the tower use a different shared key for each call, and that Eve knows that a specific, known message is sent from Bob to Alice at given times.
2. Assume (only) that Alice and the tower use a different shared key for each
call.
3. Assume all calls are encrypted using a (fixed) secret key kA shared between
Alice’s phone and the tower, and that Eve knows that specific, known
control messages are sent, encrypted, at given times.
4. Assume (only) that all calls are encrypted using a (fixed) secret key kA shared between Alice’s phone and the tower.
Exercise 2.30. We covered several encryption schemes in this chapter, including At-Bash (AzBy), Caesar, Shift-cipher, general monoalphabetic substitution, OTP, PRG-based stream cipher, RC4, block ciphers, and the ‘modes’
in Table 2.5. Which of these is: (1) stateful, (2) randomized, (3) FIL, (4)
polynomial-time?
Exercise 2.31. Consider the use of AES with a key length of 256 bits and block length of 128 bits, for two different 128-bit messages, A and B (i.e., one block each). Bound, or compute precisely if possible, the probability that the encryption of A will be identical to the encryption of B, in each of the following scenarios:
1. Both messages are encrypted with the same randomly-chosen key, using
ECB mode.
2. Both messages are encrypted with two keys, each of which is chosen
randomly and independently, and using ECB mode.
3. Both messages are encrypted with the same randomly-chosen key, using
CBC mode.
4. Now compute the probability that the same message is encrypted to the same ciphertext twice, using a randomly-chosen key and CBC mode.
Exercise 2.32. Present a very efficient CPA attack on the mono-alphabetic
substitution cipher, which allows complete recovery of arbitrary messages, using
the encryption of one short plaintext string.
Exercise 2.33 (PRG constructions). Let G : {0, 1}^n → {0, 1}^{n+1} be a secure PRG. Is G′, as defined in each of the following items, a secure PRG? Prove your answers.
1. G′ (s) = G(sR ), where sR means the reverse of s.
2. G′(r ++ s) = r ++ G(s), where r, s ∈ {0, 1}^n.
3. G′ (s) = G(s ⊕ G(s)1...n ), where G(s)1...n are the n most-significant bits
of G(s).
4. G′ (s) = G(π(s)) where π is a (fixed) permutation.
5. G′ (s) = G(s + 1).
6. (harder!) G′ (s) = G(s ⊕ sR ).
A. Solution to G′(r ++ s) = r ++ G(s): this is a secure PRG, stretching 2n bits to 2n + 1 bits. The first n output bits, r, are truly random, and the remaining n + 1 bits, G(s), are pseudorandom even given r, since s is independent of r. Any distinguisher for G′ yields a distinguisher for G: given a challenge string z, pick r at random and run the G′-distinguisher on r ++ z.
B. Solution to G′(s) = G(s ⊕ G(s)1...n): may not be a PRG. For example, let g be a PRG from any number m of bits to m + 1 bits, i.e., whose output is a pseudorandom string just one bit longer than the input. Assume even n; for every x ∈ {0, 1}^{n/2} and y ∈ {0, 1}^{n/2} ∪ {0, 1}^{1+n/2}, let G(x ++ y) = x ++ g(y). If g is a PRG, then G is also a PRG (why?). However, when used in the above construction:

G′(x ++ y) = G[(x ++ y) ⊕ G(x ++ y)1...n]
           = G[(x ++ y) ⊕ (x ++ g(y)1...n/2)]
           = G[(x ⊕ x) ++ (y ⊕ g(y)1...n/2)]
           = G[0^{n/2} ++ (y ⊕ g(y)1...n/2)]
           = 0^{n/2} ++ g(y ⊕ g(y)1...n/2)

As this output begins with n/2 zero bits, it can be trivially distinguished from random. Hence G′ is clearly not a PRG.
Exercise 2.34. Let G1 , G2 : {0, 1}n → {0, 1}2n be two different candidate
PRGs (over the same domain and range). Consider the function G defined in each of the following items. Is it a secure PRG, assuming both G1 and G2 are secure PRGs? Assuming only that one of them is a secure PRG? Prove your answers.
1. G(s) = G1 (s) ⊕ G2 (s).
2. G(s) = G1 (s) ⊕ G2 (s ⊕ 1|s| ).
3. G(s) = G1 (s) ⊕ G2 (0|s| ).
Exercise 2.35. Let G : {0, 1}n → {0, 1}m be a secure PRG, where m > n.
1. Let m = n + 1. Use G to construct a secure PRG G′ : {0, 1}n → {0, 1}2n .
2. Let m = 2n, and consider G′(x) = G(x) ++ G(x + 1). Is G′ a secure PRG?
3. Let m = 2n. Use G to construct a secure PRG G′ : {0, 1}n → {0, 1}4·n .
4. Let m = 4n. Use G to construct a secure PRG Ĝ : {0, 1}n → {0, 1}64·n .
Sketch of solution (item 2): Assume we are given a PRG f(·) from n − 1 bits to m bits. Next, define G(x) = f(x[0 : n − 2]), i.e., G(x) returns the value of f() applied to the string which is the same as x except for the removal of the least significant bit (LSb) of x, which we denote LSb(x). Since f is a PRG, G is also a PRG, although a less efficient one. However, for a random x, with probability half, LSb(x) = 0 and hence G(x) = G(x + 1). Therefore G′(x) = G(x) ++ G(x + 1) is surely not a secure PRG.
Exercise 2.36. Let f, f′ be two functions from n-bit binary strings to n′-bit binary strings, i.e., f, f′ : {0, 1}^n → {0, 1}^{n′}.
1. Let m1 ̸= m2 ∈ {0, 1}^n be two different random n-bit strings. Present the best upper and lower bounds you can for the probability that f(m1) = f(m2), assuming n > n′:

____ ≤ Pr_{m1 ̸= m2 ← {0,1}^n} [f(m1) = f(m2)] ≤ ____

Justify your answer.
2. Repeat, when f is a random function from {0, 1}^n to {0, 1}^{n′}:

____ ≤ Pr_{m1 ̸= m2 ← {0,1}^n} [f(m1) = f(m2)] ≤ ____

3. Repeat items 1 and 2 for the case n = n′.
4. Repeat items 1 and 2 for the case n′ > n.
5. Repeat items 1 and 2, for the probability that f(m1) = f′(m2). In item 2, only f is chosen randomly.
Exercise 2.37. Let fk be a (secure) Pseudo-Random Function (PRF) from n-bit binary strings to n′-bit binary strings, i.e., f : {0, 1}^∗ × {0, 1}^n → {0, 1}^{n′}. Assume that the key k is chosen randomly as a string of length l; the key length l is specified so you can reference it in your responses, but do so only if you find it relevant; otherwise, you may ignore it. Assume l is sufficiently large for the PRF to be secure.
1. Is it possible that n < n′ ? Is it possible that n′ < n? Explain.
2. Let m1 ̸= m2 ∈ {0, 1}^n be two different random n-bit strings. Present the best upper and lower bounds you can for the probability that fk(m1) = fk(m2), assuming n < n′:

____ ≤ Pr_{m1 ̸= m2 ← {0,1}^n} [fk(m1) = fk(m2)] ≤ ____

Justify your answer.
Justify your answer.
3. Repeat for the case n = n′ .
4. Repeat for the case n > n′ .
5. Compare your answers with Exercise A.9.
Exercise 2.38 (Ad-Hoc PRF competition project). In this exercise, you will
experiment in trying to build directly a cryptographic scheme - in this case, a
PRF - as well as in trying to ‘break’ (cryptanalyze) it. This exercise is best
done by multiple groups, with each group consisting of one or a few persons.
1. In the first phase, each group will design a PRF, whose input, key and
output are all 64 bits long. The PRF should be written in Python (or some
other agreed programming language), and only use the basic mathematical
operations: table lookup, modular addition/subtraction/multiplication/division/remainder, XOR, max, min, and rotations. You may also use
comparisons and conditional code. The length of your program should
not exceed 400 characters, and it must be readable. You will also provide
(separate) documentation.
2. All groups will be given the documentation and code of the PRFs of all
other groups, and try to design programs to distinguish these PRFs from a
random function (over same input and output domains). A distinguisher
is considered successful if it is able to distinguish in more than 1% of the
runs.
Exercise 2.39. Let f be a secure Pseudo-Random Function (PRF) with n bit
keys, domain and range, and let k be a secret, random n bit key. Derive from
k, using f , two pseudorandom keys k1 , k2 , e.g., one for encryption and one for
authentication. Each of the derived keys k1 , k2 should be 2n-bits long, i.e., twice
the length of k. Note: the two keys should be independent, i.e., each of them
(e.g., k1 ) should be pseudorandom, even if the adversary is given the other (e.g.,
k2 ).
1. k1 =
2. k2 =
Exercise 2.40 (PRF constructions). Let Fk(m) : {0, 1}^n × {0, 1}^n → {0, 1}^n be a secure PRF. Is F̂, as defined in each of the following items, a secure PRF? Justify your answers. Where the function is not a secure PRF, present
the adversary that can distinguish between the function and a random function
(as in Exercise 2.11). Where the function is a PRF, a precise proof as in
Exercise 2.12 is best, but a good intuitive argument will also do.
1. F̂k (m) = Fk (mR ), where mR means the reverse of m.
2. F̂k(mL ++ mR) = Fk(mL) ++ Fk(mR).
3. F̂k(mL ++ mR) = (mL ⊕ Fk(mR)) ++ (Fk(mL) ⊕ mR).
4. F̂k (m) = LSb(Fk (m)), where LSb returns the least-significant bit of the
input.
Exercise 2.41 (Key dependent message security). Several works design cryptographic schemes, such as encryption schemes, which are secure against a ‘key dependent message attack’, where the attacker specifies a function f and receives the encryption Ek(f(k)), i.e., the encryption of the message f(k), where k is the secret key. See [66].
1. Extend the definition of secure pseudo-random function for security against
key-dependent message attacks.
2. Suppose that F is a secure pseudo-random function. Show a (‘weird’) function F′ which is also a secure pseudo-random function, but is not secure against key-dependent message attacks.
Exercise 2.42 (KDM security). Several works design cryptographic schemes, such as encryption schemes, which are secure against a ‘key dependent message attack’. Recall that in many cryptographic definitions, the attacker has access to one or more oracle functions, e.g., the chosen-plaintext oracle for encryption Ek(·) using secret key k. In a key-dependent message attack, the attacker has similar access to the oracle function, but instead of specifying directly the value of the input to the oracle, the attacker specifies a function f, and the oracle is applied to f(k), where k is the key of the relevant cryptographic function. For example, for the chosen-plaintext oracle for encryption using a shared key k, the attacker receives Ek(f(k)). See, e.g., [66].
1. Extend the definition of secure pseudo-random function, to define a PRF
scheme secure against key-dependent message attacks.
2. Repeat, for a block cipher (reversible PRP).
3. Suppose that F is a secure pseudo-random function. Show a (‘weird’) function F′ which is also a secure pseudo-random function, but is not secure against key-dependent message attacks.
4. Extend the definition of IND-CPA secure encryption to allow for key-dependent message attacks.
Exercise 2.43 (Stateful PRG, ANSI X9.31 and the DUHK attack). ANSI X9.31 is a well-known stateful PRG design built using a block cipher E, illustrated in Fig. 2.37. In this exercise we investigate a weakness in it, presented in [231]; it was recently shown to be still relevant for some devices using this standard, in the so-called DUHK attack [100]. Our presentation is a slight simplification of the X9.31 design, but retains the aspects important for the attack.
The stateful PRG is used in ‘rounds’, with the state of round i being the output of round i − 1. Specifically, the X9.31 PRG works as follows, given the current state si−1, with s0 selected randomly, and some ‘timestamp’ Ti. First, compute an ‘internal’ value, xi = Ek(Ti). Then, output the values ri = Ek(xi ⊕ si−1) and si = Ek(ri ⊕ xi). See Fig. 2.37. The specification does not restrict the choice of k, and several implementations use a constant k as part of their code; assume, therefore, that k is known.
1. Assume Ti , k and si−1 are known. Show how the attacker can find ri and
si .
2. Assume that the values {Tj } are known for all j, and that k, ri are known
(for specific i). Show how an attacker can compute {sj , rj } for every j.
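The round computation and the roll-forward recovery of item 2 can be sketched in code. The block cipher below is a toy stand-in (truncated SHA-256; an assumption for illustration, not a real PRP and not invertible), which suffices here because the roll-forward only needs forward evaluations of Ek:

```python
import hashlib, os

def E(k, x):
    # Toy 16-byte stand-in for a block cipher (NOT a real PRP, NOT secure):
    # truncated SHA-256. The roll-forward attack only evaluates E forward.
    return hashlib.sha256(k + x).digest()[:16]

def xor(a, b):
    return bytes(u ^ v for u, v in zip(a, b))

def x931_round(k, s_prev, T):
    x = E(k, T)                 # internal value x_i = E_k(T_i)
    r = E(k, xor(x, s_prev))    # output r_i = E_k(x_i XOR s_{i-1})
    s = E(k, xor(r, x))         # next state s_i = E_k(r_i XOR x_i)
    return r, s

k = b'hard-coded-key!!'         # constant key in the code => k known to attacker
s = os.urandom(16)              # secret initial state s_0
Ts = [i.to_bytes(16, 'big') for i in range(1, 5)]   # known 'timestamps'

outputs = []
for T in Ts:
    r, s = x931_round(k, s, T)
    outputs.append(r)

# Attacker: knows k, all T_j, and a single output r_1 (= outputs[0]).
# Recover s_1 = E_k(r_1 XOR x_1), then roll forward to predict r_2, r_3, ...
x1 = E(k, Ts[0])
s_att = E(k, xor(outputs[0], x1))
predicted = []
for T in Ts[1:]:
    r, s_att = x931_round(k, s_att, T)
    predicted.append(r)

assert predicted == outputs[1:]   # all later outputs predicted
```

Knowing k and the timestamps, a single output ri reveals the state si, and with it every subsequent output; this is the essence of the DUHK attack.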
Figure 2.37: A single round of the ANSI X9.31 stateful pseudorandom generator
(PRG), using block cipher Ek (x), e.g., AES. This figure was adapted from [100].
Exercise 2.44 (Cascade is not a robust combiner for PRFs). Let F ′, F ′′ :
{0, 1}∗ × D → D be two polynomial-time computable functions, and let their
cascade, denoted F ≡ F ′ ◦ F ′′, be defined as:

F(k′,k′′) (x) ≡ F ′k′ ◦ F ′′k′′ (x) ≡ F ′k′ (F ′′k′′ (x))    (2.63)
Give an example of F ′, F ′′ s.t. one of them is a PRF, yet their cascade F ≡
F ′ ◦ F ′′ is not a PRF. This shows that cascade is not a robust combiner for
PRFs.
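A minimal sketch of one such counterexample: take F ′′ to be a constant function (a legal, polynomial-time candidate), so the cascade collapses to a constant function even when F ′ is a PRF. The PRF stand-in below (truncated SHA-256) is an illustrative assumption:

```python
import hashlib

def Fp(k, x):   # F', a PRF stand-in (truncated SHA-256; an assumption for illustration)
    return hashlib.sha256(k + x).digest()[:16]

def Fpp(k, x):  # F'', a legal (polynomial-time) but useless candidate: constant output
    return b'\x00' * 16

def cascade(kp, kpp, x):   # F = F' o F'': F'_{k'}(F''_{k''}(x))
    return Fp(kp, Fpp(kpp, x))

kp, kpp = b'k' * 16, b'j' * 16
# The cascade maps every input to the same value, hence it is trivially
# distinguishable from a random function, even though F' is (assumed) a PRF.
assert cascade(kp, kpp, b'a' * 16) == cascade(kp, kpp, b'b' * 16)
```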
Exercise 2.45. A message m of length 256 bytes is encrypted using a 128-bit
block cipher, resulting in ciphertext c. During transmission, the 200th bit was
flipped due to noise. Let c′ denote c with the 200th bit flipped, and m′ denote
the result of decryption of c′ .
1. Which bits in m′ would be identical to the bits in m, assuming the use
of each of the following modes: (1) ECB, (2) CBC, (3) OFB, (4) CFB?
Explain (preferably, with diagram).
2. For each of the modes, specify which bits are predictable as a function of
the bits of m and the known fact that the 200th bit was flipped.
Exercise 2.46. Consider a scenario where randomness is scarce, motivating
attempts to design encryption schemes that use less randomization. Specifically,
consider the following two variants of CBC mode, both using, per message,
only twenty random bits, rather than the n random bits (block size) used in
standard CBC. Both variants are identical to CBC mode, except for the choice
of c0 (the initialization vector), which is as specified below; both use a twenty-bit
random string r ←$ {0, 1}^20. Show that neither variant suffices to ensure IND-CPA.

Append zeros: c0 = r||0^{n−20}.

Pseudorandomly: c0 = Ek (r).

Note: your solution may require up to a few million queries; just make sure the
number of queries is polynomial in n.
Exercise 2.47 (Modes of operation: decryption and correctness). Table 2.5
specifies only the encryption process for each mode. Write the decryption process
for each mode and show that correctness is satisfied.
Exercise 2.48. Hackme Bank protects money-transfer orders digitally sent
between branches, by encrypting them using a block cipher. Money-transfer
orders have the following structure: m = f || r || t || x || y || p, where f and r
are each 20-bit fields representing the payer (from) and the payee (recipient), t is a
32-bit field encoding the time, x is a 24-bit field representing the amount, y is a
128-bit comment field defined by the payer, and p is a 32-bit parity field, computed
as the bitwise-XOR of the preceding 32-bit words. Orders with incorrect parity,
outdated or repeating time field, or unknown payer/payee are ignored.
Mal captures a ciphertext message c containing a money-transfer order of $1
from Alice to his account. You may assume that Mal can ‘trick’ Alice into
including a comment field y selected by Mal. Assume a 64-bit block cipher. Can
Mal cause a transfer of a larger amount to his account, and if so, how, assuming
use of the following modes:
1. ECB
2. CBC
3. OFB
4. CFB
Solution: The first block contains f, r (20 bits each) and the top 24 bits of the
time t; the second block contains the 8 remaining bits of the time, x (24 bits) and
32 bits of the comment; block three contains 64 bits of comment; and block four
contains 32 bits of comment and the 32 bits of parity. Denote these four plaintext
blocks by m1 || m2 || m3 || m4, and denote the ciphertext blocks captured by Mal
by c0 || c1 || c2 || c3 || c4, where c0 is the IV.
1. ECB: the attacker selects the third block (entirely within the comment field)
to be identical to the second block, except for containing the maximal value
in the 24 bits from bit 8 to bit 31 (the amount field). The attacker then
switches the second and third ciphertext blocks before the message reaches
the bank. The parity bits do not change.
2. CBC: the attacker chooses y s.t. m3 = m2. Then, the attacker sends
to the bank the manipulated message z0 || c3 || c3 || c3 || c4, where
z0 = m1 ⊕ m3 ⊕ c2. As a result, decryption of the first block retrieves
m1 correctly (as m1 = z0 ⊕ m3 ⊕ c2), and decryption of the last block
similarly retrieves m4 correctly (no change in c3, c4). However, both the
second and the third blocks decrypt to the value (c3 ⊕ c2 ⊕ m3). Hence,
the 32-bit XOR of the message does not change. The decryption of the
second block (to c3 ⊕ c2 ⊕ m3) is likely to leave the time value valid, and
to increase the amount considerably.
3. OFB: the solution is trivial, since Mal can flip arbitrary bits in the decrypted plaintext (by flipping the corresponding bits in the ciphertext).
4. CFB: as in CBC, the attacker chooses y s.t. m3 = m2. The attacker sends
to the bank the manipulated message c0 || c1 || c1 || c1 || z4, where
z4 = m4 ⊕ c2 ⊕ m2.
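The CBC manipulation of item 2 can be checked mechanically. The sketch below uses the (insecure, but invertible) affine cipher of Exercise 2.49 as a stand-in block cipher, an assumption made only so that the example is self-contained:

```python
import os

M64 = 1 << 64

def E(k, m):  # toy invertible 64-bit block cipher stand-in (insecure; Ex. 2.49)
    k1, k2 = k
    return ((m ^ k1) + k2) % M64

def D(k, c):
    k1, k2 = k
    return ((c - k2) % M64) ^ k1

def cbc_enc(k, iv, blocks):
    c, prev = [iv], iv
    for m in blocks:
        prev = E(k, m ^ prev)   # c_i = E_k(m_i XOR c_{i-1})
        c.append(prev)
    return c                    # c[0] is the IV (= c_0)

def cbc_dec(k, c):
    return [D(k, c[i]) ^ c[i - 1] for i in range(1, len(c))]

rnd = lambda: int.from_bytes(os.urandom(8), 'big')
k = (rnd(), rnd())
m2 = rnd()
m1, m3, m4 = rnd(), m2, rnd()          # attacker arranged m3 = m2
c = cbc_enc(k, rnd(), [m1, m2, m3, m4])

# Mal's manipulation: z0 || c3 || c3 || c3 || c4, with z0 = m1 XOR m3 XOR c2
z0 = m1 ^ m3 ^ c[2]
p = cbc_dec(k, [z0, c[3], c[3], c[3], c[4]])

assert p[0] == m1 and p[3] == m4              # first and last blocks unchanged
assert p[1] == p[2] == c[3] ^ c[2] ^ m3       # both middle blocks decrypt to this
assert p[0] ^ p[1] ^ p[2] ^ p[3] == m1 ^ m2 ^ m3 ^ m4   # XOR of blocks unchanged
```

Since the XOR of all blocks is unchanged, the parity field of the forged order remains valid.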
Exercise 2.49 (Affine block cipher). Hackme Inc. proposes the following
highly-efficient block cipher, using two 64-bit keys k1, k2, for 64-bit blocks:
Ek1,k2 (m) = (m ⊕ k1) + k2 (mod 2^64).
1. Show that Ek1,k2 is an invertible permutation (for any k1, k2), and present
the inverse permutation Dk1,k2.
2. Show that (E, D) is not a secure block cipher (invertible PRP).
3. Show that encryption using (E, D) is not IND-CPA, when used in the
following modes: (1) ECB, (2) CBC, (3) OFB, (4) CFB.
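For item 1, the inverse simply undoes the addition and then the XOR; a quick sketch:

```python
import os

M = 1 << 64

def E(k1, k2, m):
    # E_{k1,k2}(m) = (m XOR k1) + k2  (mod 2^64)
    return ((m ^ k1) + k2) % M

def D(k1, k2, c):
    # Undo in reverse order: subtract k2 (mod 2^64), then XOR k1
    return ((c - k2) % M) ^ k1

k1, k2 = (int.from_bytes(os.urandom(8), 'big') for _ in range(2))
m = int.from_bytes(os.urandom(8), 'big')
assert D(k1, k2, E(k1, k2, m)) == m   # inversion holds for any keys
```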
Exercise 2.50 (How not to build a PRP from a PRF). Suppose F is a secure PRF
with input, output and keyspace all of length n bits. For xL, xR ∈ {0, 1}^n, let
F ′k (xL || xR) = Fk (xL) || Fk (xR) and F ′′k (xL || xR) = Fk (xL ⊕ xR) || Fk (xL ⊕
Fk (xL ⊕ xR)). Prove that neither F ′ nor F ′′ is a PRP.
Exercise 2.51 (Building a PRP from a PRF). Suppose you are given a secure
PRF F, with input, output and keyspace all of length n bits. Show how to use
F to construct:
1. A PRP, with input and output of length 2n bits and key of length n bits,
2. A PRP, with input, output and key all of length n bits.
Exercise 2.52. Show that the simple padding function pad(m) = m || 0^l fails
to prevent CCA attacks against most of the modes-of-operation (Fig. 2.5), when
l ≤ n. The attacker may perform CPA and CCA queries, and the plaintext
contains multiple blocks.
Exercise 2.53 (Indistinguishability definition). Let (E, D) be a stateless shared-key encryption scheme, and let p1, p2 be two plaintexts. Let x be 1 if the most
significant bits of p1 and p2 are identical and 0 otherwise, i.e., x = 1 if
MSb(p1) = MSb(p2), else x = 0. Assume that there exists an efficient algorithm X that
computes x given the ciphertexts, i.e., x = X(Ek (p1), Ek (p2)). Show that this
implies that (E, D) is not IND-CPA secure, i.e., there is an efficient algorithm
ADV which achieves significant advantage in the IND-CPA experiment. Present
the implementation of ADV by filling in the missing code below:
ADV^{Ek} (‘Choose’, 1^n) : {
}
ADV^{Ek} (‘Guess’, s, c∗) : {
}
Exercise 2.54 (Robust combiner for PRG).
1. Given two candidate PRGs, say G1 and G2, design a robust combiner,
i.e., a ‘combined’ function G which is a secure PRG if either G1 or G2 is
a secure PRG.
2. In the design of the SSL protocol, there were two candidate PRGs, one
(say G1) based on the MD5 hash function and the other (say G2) based
on the SHA-1 hash function. The group decided to combine the two; a
simplified version of the combined PRG is G(s) = G2 (s || G1 (s)). Is this
a robust combiner, i.e., a secure PRG provided that either G1 or G2 is a
secure PRG?
Hint: Compare to Lemma 2.1. You may read on hash functions in Chapter 3,
but the exercise does not require any knowledge of that; you should simply
consider the construction G(s) = G2 (s || G1 (s)) for arbitrary functions G1, G2.
Exercise 2.55 (Using PRG for independent keys). In Example 2.6, we saw
how to use a PRF to derive multiple pseudo-random keys from a single pseudo-random key.
1. Show how to derive two pseudo-random keys, using a PRG, say from n
bits to 2n bits.
2. Show how to extend your design to derive four keys from the same PRG,
or any fixed number of pseudo-random keys.
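One natural approach (a sketch, not the only solution): apply a length-doubling PRG recursively, in a binary tree, so that d levels yield 2^d keys. The PRG stand-in below, built from SHA-256, is an illustrative assumption:

```python
import hashlib

def G(s: bytes) -> bytes:
    # Length-doubling PRG stand-in: |s| bytes -> 2|s| bytes (illustrative, not proven)
    return hashlib.sha256(s + b'0').digest()[:len(s)] + \
           hashlib.sha256(s + b'1').digest()[:len(s)]

def derive(seed: bytes, depth: int):
    # Binary key tree: each level splits every key into the two halves of G(key)
    keys = [seed]
    for _ in range(depth):
        keys = [half for key in keys
                for half in (G(key)[:len(key)], G(key)[len(key):])]
    return keys

k = b'0123456789abcdef'
assert len(derive(k, 1)) == 2    # two keys from one application of G
assert len(derive(k, 2)) == 4    # four keys from recursive application
```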
Exercise 2.56. Let (E, D) be a block cipher which operates on 20-byte blocks;
suppose that each computation of E or D takes 10^{-6} seconds (one microsecond)
on the given chips. Using (E, D), you are asked to implement a secure high-speed
encrypting/decrypting gateway. The gateway receives packets at a line speed of
10^8 bytes/second, but with a maximum of 10^4 bytes received in any given second.
The goal is to have minimal latency, using a minimal number of chips. Present
an appropriate design, and argue why it achieves the minimal latency and why it
is secure.
Exercise 2.57. Consider the AES block cipher, with 256-bit keys and 128-bit
blocks, two random one-block (128-bit) messages, m1 and m2, and
two random (256-bit) keys, k1 and k2. Calculate (or approximate/bound) the
probability that Ek1 (m1) = Ek2 (m2).
Exercise 2.58 (PRF→PRG). Present a simple and secure construction of a
PRG, given a secure PRF.
Exercise 2.59 (Independent PRGs). Often, a designer has one random or
pseudo-random ‘seed/key’ binary string k ∈ {0, 1}∗, from which it needs to
generate two or more independently pseudorandom strings k0, k1 ∈ {0, 1}∗; i.e.,
each of these is pseudorandom, even if the other is given to the (PPT) adversary.
Let PRG be a pseudo-random generator, which on input of arbitrary length l
bits, produces 4l pseudorandom output bits. For each of the following designs,
prove its security (if secure) or its insecurity (if insecure).
1. For b ∈ {0, 1}, let kb = PRG(b || k).
2. For b ∈ {0, 1}, let kb = PRG(k) [(b · 2 · |k|) . . . ((2 + b) · |k| − 1)].
Solution:
1. Insecure, since it is possible for a secure PRG to ignore the first bit,
i.e., PRG(0 || s) = PRG(1 || s) for every s, resulting in k0 = PRG(0 || k) =
PRG(1 || k) = k1. We skip the (simple) proof that such a PRG may be
secure.
2. Secure, since each of these is a (non-overlapping) subset of the output of
the PRG.
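Design 2 can be pictured as slicing disjoint ranges out of the PRG output; the sketch below works with bytes rather than bits, and uses a SHA-256-based stand-in PRG (both are illustrative assumptions):

```python
import hashlib

def prg(seed: bytes) -> bytes:
    # Stand-in PRG stretching |seed| bytes to 4|seed| bytes (counter mode over SHA-256)
    out, i = b'', 0
    while len(out) < 4 * len(seed):
        out += hashlib.sha256(seed + i.to_bytes(4, 'big')).digest()
        i += 1
    return out[:4 * len(seed)]

k = b'0123456789abcdef'
out = prg(k)
# With the exercise's indices (bytes standing in for bits):
#   k_0 = out[0 : 2|k|]  and  k_1 = out[2|k| : 3|k|]
# The two index ranges never overlap, which is the key point of the proof.
k0 = out[0 : 2 * len(k)]
k1 = out[2 * len(k) : 3 * len(k)]
assert len(out) == 4 * len(k)
assert len(k0) == 2 * len(k) and len(k1) == len(k)
```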
Exercise 2.60 (Indistinguishability hides partial information). In this exercise
we demonstrate that a cryptosystem that ensures indistinguishability (IND-CPA)
is guaranteed not to leak partial information about the plaintext, including
relationships between the plaintexts corresponding to different ciphertexts. Let
(E, D) be an encryption scheme which leaks some information about the plaintexts; specifically, we assume that there exists an efficient
adversary A s.t. for two ciphertexts c1, c2 of E, holds A(c1, c2) = 1 if and
only if the plaintexts share a common prefix, e.g., c1 = Ek (ID || m1) and
c2 = Ek (ID || m2) (same prefix, ID). Show that this implies that (E, D) is
not IND-CPA secure.
Exercise 2.61 (Encrypted cloud storage). Consider a set P of n sensitive
(plaintext) records P = {p1, . . . , pn} belonging to Alice, where n < 10^6. Each
record pi is l > 64 bits long ((∀i)(pi ∈ {0, 1}^l)). Alice has very limited memory;
therefore, she wants to store an encrypted version of her records in an
insecure/untrusted cloud storage server S; denote these ciphertext records by
C = {c1, . . . , cn}. Alice can later retrieve the ith record by sending i to S, who
sends back ci, and then decrypting it back to pi.
1. Alice uses some secure shared-key encryption scheme (E, D), with l-bit
keys, to encrypt the plaintext records into the ciphertext records. The goal
of this part is to allow Alice to encrypt and decrypt each record i using a
unique key ki, but maintain only a single ‘master’ key k, from which she
can easily compute ki for any desired record i. One motivation for this
is to allow Alice to give the keys ki of specific record(s) to some other users
(Bob, Charlie, ...), allowing decryption of only the corresponding ciphertext
ci, i.e., pi = Dki (ci). Design how Alice can compute the key ki for each
record i, using only the key k and a secure block cipher (PRP) (F, F −1),
with key and block sizes both l bits. Your design should be as efficient
and simple as possible. Note: do not design how Alice gives ki to relevant
users; e.g., she may do this manually. Also, do not design (E, D).
Solution: ki =
2. Design now the encryption scheme to be used by Alice (and possibly by
other users to whom Alice gave keys ki). You may use the block cipher
(F, F −1), but no other cryptographic functions. You may use a different
encryption scheme (E i, Di) for each record i. Ensure confidentiality of
the plaintext records from the cloud, from users (not given the key for
that record), and from eavesdroppers on the communication. Your design
should be as efficient as possible, in terms of the length of the ciphertext
(in bits), and in terms of the number of applications of the secure block cipher
(PRP) (F, F −1) for each encryption and decryption operation. In this
part, assume that Alice stores P only once, i.e., never modifies records pi.
Your solution may include a new choice of ki, or simply use the same as
in the previous part.
Solution: ki = ______ , E i_ki (pi) = ______ , Di_ki (ci) = ______ .
3. Repeat, when Alice may modify each record pi a few times (say, up to 15
times); let ni denote the number of modifications of pi. The solution should
allow Alice to give (only) her key k, and then Bob can decrypt all records,
using only the key k and the corresponding ciphertexts from the server.
Note: if your solution is the same as before, this may imply that your
solution to the previous part is not optimal.
Solution: ki = ______ , E i_ki (pi) = ______ , Di_ki (ci) = ______ .
4. Design an efficient way for Alice to validate the integrity of records
retrieved from the cloud server S. This may include storing additional
information Ai to help validate record i, and/or changes to the encryption/decryption scheme or keys as designed in previous parts. As in
previous parts, your design should only use the block cipher (F, F −1).
Solution: ki = ______ , Ai = ______ , E i_ki (pi) = ______ , Di_ki (ci) = ______ .
5. Extend the keying scheme from the first part, to allow Alice to also compute
keys k_{i,j}, for integers i, j ≥ 0 s.t. 1 ≤ i·2^j + 1 and (i + 1)·2^j ≤ n, where k_{i,j}
would allow (efficient) decryption of the ciphertext records c_{i·2^j+1}, . . . , c_{(i+1)·2^j}.
For example, k_{0,3} allows decryption of records c1, . . . , c8, and k_{3,2} allows
decryption of records c13, . . . , c16. If necessary, you may also change the
encryption scheme (E i, Di) for each record i.

Solution: k_{i,j} = ______ , E i_ki (pi) = ______ , Di_ki (ci) = ______ .
Exercise 2.62 (Modes vs. attack models). For every mode of encryption we
learned (see Table 2.5):
1. Is this mode always secure against any of the attack models we discussed
(CTO, KPA, CPA, CCA)?
2. Assume this mode is secure against KPA. Is it then also secure against
CTO? CPA? CCA?
3. Assume this mode is secure against CPA. Is it then also secure against
CTO? KPA? CCA?
Justify your answers.
Exercise 2.63. Recall that WEP encryption is defined as: WEPk (m; IV) =
[IV, RC4_{IV,k} ⊕ (m || CRC(m))], where IV is a random 24-bit initialization vector, and that CRC is an error-detection code which is linear, i.e.,
CRC(m ⊕ m′) = CRC(m) ⊕ CRC(m′). Also recall that WEP supports a shared-key authentication mode, where the access point sends a random challenge r,
and the mobile responds with WEPk (r; IV). Finally, recall that many WEP
implementations use a 40-bit key.
1. Explain how an attacker may efficiently find the 40-bit WEP key, by
eavesdropping on the shared-key authentication messages between the
mobile and the access point.
2. Consider a hypothetical scenario where WEP uses a fixed value of IV, say
IV = 0, to respond to all shared-key authentication requests. Show
another attack, that also finds the key using the shared-key authentication
mechanism, but requires less time per attack. Hint: the attack may use
a (reasonable) precomputation process, as well as storage resources; and the
attacker may send a ‘spoofed’ challenge which the client believes was sent
by the access point.
3. Identify the attack models exploited in the two previous items: CTO, KPA,
CPA or CCA?
4. Suppose now that WEP is deployed with a long key (typically 104 bits).
Show another attack which allows the attacker to decipher (at least part
of) the encrypted traffic.
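The keystream recovery behind item 1 can be sketched as follows. The sketch implements standard RC4, omits the CRC for brevity, and brute-forces a one-byte key as a stand-in for the 40-bit key space (these simplifications are assumptions for illustration):

```python
import itertools, os

def rc4(key: bytes, n: int) -> bytes:
    # Standard RC4: key-scheduling (KSA) followed by keystream generation (PRGA)
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    out, i, j = [], 0, 0
    for _ in range(n):
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return bytes(out)

# Shared-key authentication: AP sends challenge r; client answers r XOR keystream,
# where the RC4 key is IV || k and the IV is sent in the clear.
iv, key = os.urandom(3), os.urandom(1)   # 1-byte key: toy stand-in for 40 bits
r = os.urandom(16)
response = bytes(a ^ b for a, b in zip(r, rc4(iv + key, 16)))

# Eavesdropper sees both r and the response: XOR recovers the keystream,
# then an exhaustive search over the (small) key space finds the key.
keystream = bytes(a ^ b for a, b in zip(r, response))
found = next(bytes(g) for g in itertools.product(range(256), repeat=1)
             if rc4(iv + bytes(g), 16) == keystream)
assert found == key
```

With a real 40-bit key the same search takes 2^40 RC4 evaluations, which is entirely feasible for a determined attacker.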
Chapter 3
Integrity: from Hashing to
Blockchains
Integrity is the ability to check if an object was modified, using a concise
digest of the (original) object. In this chapter, we discuss the two main types of
cryptographic integrity mechanisms: hash functions and accumulator schemes.
Much of this chapter deals with cryptographic hash functions, used to
compute the digest of a binary string. Cryptographic hash functions are
among the most widely-used cryptographic schemes, and have many diverse
properties, uses, and applications. In Section 3.1 we introduce cryptographic
hash functions, their properties and variants. In Section 3.2 and Section 3.3,
respectively, we discuss the two main integrity properties of hash functions,
collision resistant hash functions (CRHF) and second preimage resistant (SPR)
hash functions. In Section 3.4 we discuss one-way functions (OWF), which is
another property often expected from cryptographic hash functions, and used
for different purposes. In Section 3.5 we discuss one final important property
often expected from cryptographic hash functions: randomness extraction; and
in Section 3.6 we introduce the Random Oracle Model, an important paradigm
often used to provide a simplified security analysis of protocols and schemes
using cryptographic hash functions. Later, in Chapter 4, we discuss the use of
cryptographic hash functions for authentication, in MAC and signature schemes.
In Section 3.7 we introduce cryptographic accumulators. Accumulators
can be seen as a generalization of hash functions, accepting a sequence/set of
binary strings as input, and providing additional functionalities such as Proof of
Inclusion (PoI). In Section 3.9 we present the Merkle-Damgård accumulator, a
construction that is better known for its use in constructing cryptographic hash
functions from their fixed-input-length variant, called compression functions.
Both hash functions and accumulators have been extensively studied and
widely deployed in practice. However, as we will see, their correct, secure use
requires precise understanding of their properties; designs based on intuitive
understanding have often proved vulnerable, and we will see some examples.
Therefore, precise definitions are critical. We define what we consider as the
most important notions, mostly focusing, where possible, on the more-applied
case of keyless hash functions; in more advanced texts on cryptography, you
will find additional, and sometimes different, definitions.
3.1 Introducing cryptographic hash functions, their properties and variants
Hash functions map variable input length (VIL) binary strings to n-bit strings,
referred to as the digest of the input. Since the input may be arbitrarily long,
and the output is always n bits, the basic property of hash functions is that the
digest (output) is normally shorter than the input, a property often referred
to as compression1 . The compression property, on its own, may be achieved
trivially, e.g., by truncating the input; hash functions are expected to satisfy
additional properties, as we will discuss.
Figure 3.1: Keyless and keyed hash functions: (a) keyless hash function
h(·) : {0, 1}∗ → {0, 1}^n; (b) keyed hash function hk (·) : {0, 1}^n × {0, 1}∗ → {0, 1}^n.
Both map a variable-length input to an n-bit output (digest). For simplicity,
the keyed hash uses n as the length of both digest and key.
As illustrated in Figure 3.1, hash functions may be keyed (hk (·) : {0, 1}^n ×
{0, 1}∗ → {0, 1}^n) or keyless (h(·) : {0, 1}∗ → {0, 1}^n). We discuss cryptographic hash functions, i.e., hash functions which should satisfy different
security properties, e.g., collision resistance, although hash functions are also
used for non-security applications.
To key or not to key? Existing standards of cryptographic hash functions
are all of keyless hashes, and use a fixed digest length n. For example, the
SHA-1 standard cryptographic hash function (subsection 3.1.4) uses n = 160, i.e.,
the output length is 160 bits. However, as we will see, keyless hash functions
cannot ensure important security requirements, e.g., collision resistance, which
motivates using keyed hash functions.
1 Do not be confused with compression functions, which we define later, and which
compress from m-bit strings to n-bit strings, where n < m.
We discuss both keyed and keyless hash functions, focusing, where possible,
on keyless hash functions, since they are simpler and more common in applied
cryptographic protocols and systems. For the same reasons, we also focus on
hash functions of fixed digest length n.
The key k of keyed hash functions, including cryptographic keyed hash
functions, is usually non-secret2.
3.1.1 Warm-up: hashing for efficiency
Before we focus on cryptographic hash functions, we first briefly discuss the use
of hash functions for randomly mapping data, as used (also) for load-balancing
and other ‘classical’, non-adversarial scenarios. Our goal is to provide intuition
for the required security properties and awareness of some of the challenges.
A common (non-cryptographic) application of hash functions is to map the
inputs into the possible digest values (‘bins’) in a ‘random’ manner, i.e., with
a roughly equal probability of assignment to each bin (digest value). For the
typical case of n-bit digest values, there would be 2^n bins. This property is used
in many algorithms and data structures, to improve efficiency and fairness, as
illustrated in Fig. 3.2. Here, a hash function h maps from the set of names
(given as unbounded-length strings) to a smaller set, say the set of n-bit binary
strings. This is a special case of load balancing, i.e., avoiding uneven use of
computing resources, which may result in overload of one resource concurrently
with under-utilization of an alternative resource. The goal is to roughly balance
the number of entries (names) assigned to each bin.
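A small sketch of this mapping, using truncated SHA-256 as the hash (an illustrative choice):

```python
import hashlib

def bin_of(name: str, n: int = 8) -> int:
    # Map a name (arbitrary-length string) to one of 2^n bins
    digest = hashlib.sha256(name.encode()).digest()
    return int.from_bytes(digest, 'big') % (2 ** n)

names = ['Alice', 'Bob', 'Charlie', 'Dana']
bins = {name: bin_of(name) for name in names}
# Every name lands in some bin in the range [0, 2^n); with a good hash,
# benign names spread roughly evenly across the bins.
assert all(0 <= b < 2 ** 8 for b in bins.values())
```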
Of course, in cryptography, and cybersecurity in general, we mainly consider
adversarial settings. In the context of load-balancing applications as shown
in Fig. 3.2, this refers to an adversary who can manipulate some of the input
names, and whose goal may be to cause imbalanced allocation of names to
bins, i.e., many collisions - which can cause bad performance, potentially even a
Denial of Service (DoS), i.e., a disruption or degradation of the service provided.
Consider an attacker whose goal is to degrade the performance for a particular
name, say Bob, as part of a Denial-of-Service (DoS) attack. The attacker may
provide to the system a long list of names x1 , x2 , . . ., deviously selected such
that all of them are mapped to the same bin as Bob. We refer to inputs x, x1
that have the same digest, i.e., h(x) = h(x1 ), as a collision. The attacker’s goal,
therefore, is to find many values x1, x2, . . ., which all collide with the string
x = ‘Bob’, i.e., h(‘Bob’) = h(x1) = h(x2) = . . .. See Figure 3.3.
DoS attacks of this type, which cause high computational overhead, are
usually referred to as Algorithmic-Complexity DoS Attacks. One way in
which attackers may exploit an algorithmic complexity DoS attack, is to cause
excessive overhead for network security devices, such as malware/virus scanners,
Intrusion-Detection Systems (IDS) and Intrusion-Prevention Systems (IPS).
The attack may cause the IDS/IPS systems to become ineffective, allowing the
2 Some works refer to the use of hashes with secret keys. However, the applications are
typically of a MAC or PRF function, possibly constructed from a cryptographic hash function,
a topic we discuss in subsection 4.6.3.
Figure 3.2: Load-balancing with (keyless) hash function h(·)
attacker to avoid detection. For further discussion of algorithmic-complexity
and other Denial-of-Service (DoS) attacks, see [107, 193].
Note that for any hash function h and input x (e.g., x = ‘Bob’), it is possible
to find other inputs {x′1, x′2, . . .} which collide with x, i.e., (∀i) h(x′i) = h(x), by
randomly testing different inputs and collecting those whose hash is the same as
h(x). However, for digest length n, there are 2^n bins; hence the probability that
such a random guess collides with ‘Bob’ is only 1/2^n = 2^{−n}, i.e., negligible
in n.
For some hash functions, including hash functions used for non-cryptographic
applications, there are efficient ways for the attacker to find collisions, rather
than testing random inputs. This includes hash functions that provide sufficiently-randomized mapping for ‘natural’ inputs, which result from a benign selection
process. We say that such hash functions are not collision resistant, i.e., these
are functions where collisions can be found efficiently when the inputs are
selected by an attacker. See the following exercise.
Figure 3.3: Algorithmic Complexity Denial-of-Service Attack exploiting insecure
hash function h to cause many collisions

Exercise 3.1. Given an alphabetic string x, let num(x, i) be the alphabetical
position of the ith letter in x, with the first letter (‘a’) being in position one.
For example, num(‘hello’, 2) = 5 since ‘e’ is the fifth letter in the alphabet,
and num(‘abcdef’, i) = i for 1 ≤ i ≤ 6. Consider the hash function h(x) =
Σ_{i=1}^{|x|} num(x, i) mod 27, i.e., the sum of all the letters (mod 27).
attacker may easily generate a set {x1 , x2 , . . .} of any desired number of strings
colliding with Bob, i.e., for every xi holds: h(xi ) = h(‘Bob’). The attacker
should not need to compute the hash value for many different strings. Give
three examples of such colliding strings. As an extra challenge, try to have your
strings be real names!
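As a sanity check (not a full solution), note that h depends only on the multiset of letters, so any sum-preserving rearrangement or substitution collides; a sketch:

```python
def num(c: str) -> int:
    # alphabetical position: 'a' -> 1, ..., 'z' -> 26
    return ord(c.lower()) - ord('a') + 1

def h(x: str) -> int:
    # sum of the alphabetical positions of all letters, mod 27
    return sum(num(c) for c in x) % 27

# Permuting the letters preserves the sum, hence the digest:
assert h('Bob') == h('obB')
# So does any other string with the same letter-sum mod 27, e.g. the name 'Dan':
# 'Bob' -> 2 + 15 + 2 = 19, and 'Dan' -> 4 + 1 + 14 = 19.
assert h('Bob') == h('Dan') == 19
```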
It isn’t very surprising that collisions can be found efficiently for a hash
function not designed for collision resistance. However, this indicates the
importance of carefully defining the collision resistance requirement and other
security requirements from cryptographic hash functions. This follows the
attack model and security requirements principle (Principle 1). In Section 3.2 we
define collision-resistant hash functions (CRHF); intuitively, in such functions,
an attacker cannot efficiently find a collision. This should foil the Algorithmic
Complexity Denial-of-Service Attack of Fig. 3.3, provided we use a sufficiently
long digest length n (i.e., 2^n bins). See Figure 3.4.
Figure 3.4: Load balancing with a collision-resistant hash function (CRHF) with
n-bit digests, i.e., using 2^n bins. The probability that a random name collides
with ‘Bob’ is only 2^{−n}; furthermore, the probability of collision is negligible
for guesses by all efficient algorithms.
3.1.2 Properties of cryptographic hash functions
Cryptographic hash functions are used for many different applications, often
assuming different security properties; unfortunately, these assumptions are not
always made explicitly, and the assumed properties are not always clearly defined.
Roughly, these properties fall into three broad goals: integrity, confidentiality
and randomness. Intuitively, integrity ensures the uniqueness of the message,
given the digest; confidentiality ensures the digest does not ‘expose’ the message;
and randomness ensures that the digest is pseudorandom, provided that the
input ‘contains sufficient randomness’.
We define four security requirements: collision resistance, second-preimage
resistance, one-way function (also referred to as preimage resistance), and
randomness extraction. Table 3.1 maps these requirements to the three goals,
and gives abridged descriptions of these properties, for keyless hash functions.
Assuming that a given hash function has these security properties makes it
possible to use the hash function for different applications, ensuring security as
long as the hash function indeed satisfies the properties assumed. However, we
should only assume properties which the hash function was designed and tested
Goal | Requirement | Abridged description
Integrity | Collision resistance (CRHF; Definition 3.1) | Can’t find a collision (m, m′), i.e., m ≠ m′ yet h(m) = h(m′).
Integrity | Second-preimage resistance (SPR; Definition 3.7) | Can’t find a collision to a random m: m′ ≠ m yet h(m) = h(m′).
Confidentiality | One-way function (OWF; Definition 3.9) | Given h(m) for random m, can’t find m′ s.t. h(m) = h(m′).
Randomness | Randomness extractor (Section 3.5) | If the input is sufficiently random, the output is pseudorandom.
All | Random oracle model (ROM; §3.6) | Consider h as a random function.

Table 3.1: Goals and requirements for keyless cryptographic hash functions,
presented for the hash function h : {0, 1}∗ → {0, 1}^n.
to ensure. Even seemingly minor differences between the security requirements
for which the hash function was designed and tested, and the security properties
required for the application, may result in a vulnerability. For example, Table 3.1
presents two integrity requirements, CRHF and SPR; we later show applications
which are secure using a cryptographic hash function which satisfies the CRHF
requirement, but may be vulnerable assuming ‘only’ the SPR requirement.
Namely, even if the definitions may appear similar, it is still important to use the
correct definition; the differences can be meaningful and even critical. Table 3.1
also includes the random oracle model (ROM), which we discuss in Section 3.6.
In the ROM, we analyze the security of a system using cryptographic hash
functions as if we use a random function (from binary strings to n-bit binary
strings).
Here is an exercise which may strengthen your intuitive understanding of the different security requirements in Table 3.1.

Exercise 3.2 (Examples of (insecure, simple) hash functions). Let h(x) = x mod 2^n, h′_k(x) = k + x mod 2^n, and h''(x) = x^2 mod 2^n, all computed by considering their inputs as integers in binary representation. To avoid confusion, we denote numbers in binary representation by a subscript of 2, e.g., 10000_2 is the number 16 in decimal, i.e., 2^4, and similarly, 100_2 = 4 = 2^2. Notice that h′ is keyed, while h and h'' are keyless.

1. For n = 4, compute h(11010_2), h(10101010_2), h''(11010_2) and h''(10101010_2). Note: inputs are binary strings, and should be viewed as integers in binary representation, which we denote with the subscript 2.

2. Show that h(x) = x mod 2^n is not a CRHF, SPR, OWF or randomness extractor, based on the abridged descriptions in Table 3.1.

3. Repeat for h′_k(x) = k + x mod 2^n.

4. Repeat for h''(x) = x^2 mod 2^n. Beware: the OWF property can be challenging.
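The computations of the first item are easy to check mechanically. Below is a small sketch (ours, not from the textbook) of the three toy hash functions; the names h, h_prime and h2 are our own choices.

```python
# Toy hash functions of Exercise 3.2, with n = 4 by default.
# These are deliberately insecure examples, used only for intuition.

def h(x: int, n: int = 4) -> int:
    """h(x) = x mod 2^n."""
    return x % 2**n

def h_prime(k: int, x: int, n: int = 4) -> int:
    """h'_k(x) = k + x mod 2^n (keyed; the key k is public)."""
    return (k + x) % 2**n

def h2(x: int, n: int = 4) -> int:
    """h''(x) = x^2 mod 2^n."""
    return x**2 % 2**n

# Item 1, with inputs given as binary strings:
for s in ("11010", "10101010"):
    x = int(s, 2)
    print(f"h({s}_2) = {h(x):b}_2, h''({s}_2) = {h2(x):b}_2")
# → h(11010_2) = 1010_2, h''(11010_2) = 100_2
# → h(10101010_2) = 1010_2, h''(10101010_2) = 100_2
```

Note that both inputs already collide under h (and under h''), hinting at the attacks of the following items.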
Applied Introduction to Cryptography and Cybersecurity
CHAPTER 3. INTEGRITY: FROM HASHING TO BLOCKCHAINS
Solution for the first three items:

1. Recall that inputs are binary strings, and should be viewed as integers in binary representation; we write the outputs similarly.

   h(11010_2) = 11010_2 mod 2^4 = 1010_2
   h(10101010_2) = 10101010_2 mod 2^4 = 1010_2
   h''(11010_2) = (11010_2)^2 mod 2^4 = (11010_2 mod 2^4)^2 mod 2^4
                = (1010_2)^2 mod 2^4 = (1000_2 + 10_2)^2 mod 2^4
                = (1000000_2 + 10000_2 + 10000_2 + 100_2) mod 2^4 = 100_2
   h''(10101010_2) = ... = 100_2 (similarly)
2. Show that h(x) = x mod 2^n is not a CRHF, SPR, OWF or randomness extractor.

   a) SPR and CRHF: We show that h is not a CRHF or an SPR hash function, by showing a collision for any given input x. Specifically, let x′ = x + 2^n. Clearly x′ ≠ x, and yet h(x′) = (x + 2^n) mod 2^n = x mod 2^n = h(x); namely, x′ is a collision (second preimage) with x.

   b) OWF: We next show that h is not a one-way function (OWF). Specifically, given h(x) for any preimage x, let x′ = h(x); clearly:

      h(x′) = x′ mod 2^n = h(x) mod 2^n = (x mod 2^n) mod 2^n = x mod 2^n = h(x)

      Namely, x′ is a preimage of h(x), and hence h is not an OWF.

   c) Finally, we show that h is not a randomness-extractor hash function. Specifically, let r ←$ {0, 1}^n be a random n-bit string, and let x = r ++ 0, i.e., let x be an (n+1)-bit binary string whose least significant bit is zero and whose other bits are selected randomly. The value of h(x) = x mod 2^n is the same as the n least-significant bits of x, and in particular, the least significant bit of h(x) is zero. Hence, h(x) is easily distinguishable from a random n-bit string.
3. Show that h′_k(x) = k + x mod 2^n is not a CRHF, SPR, OWF or randomness extractor. Recall that the key k is known to the adversary (it is not secret). This makes it easy to adapt the solutions of the previous item. Specifically:

   SPR and CRHF: The same collision x′ = x + 2^n applies here too. Clearly x′ ≠ x, and yet for every key k: h′_k(x′) = k + (x + 2^n) mod 2^n = k + x mod 2^n = h′_k(x); namely, x′ is a collision (second preimage) with x.

   OWF: We next show that h′_k is not a one-way function (OWF), i.e., given h′_k(x) (and k), we can find x′ such that h′_k(x′) = h′_k(x). Notice that we are given the key: in our definitions of keyed hash functions, the key is known to the attacker, i.e., it is not a secret. Specifically, given h′_k(x) for any preimage x, and the key k, let x′ = h′_k(x) − k; clearly:

      h′_k(x′) = k + x′ mod 2^n = k + (h′_k(x) − k) mod 2^n = h′_k(x) mod 2^n
               = (k + x mod 2^n) mod 2^n = k + x mod 2^n = h′_k(x)

      Namely, x′ is a preimage of h′_k(x), and hence h′_k is not an OWF.

   Randomness extractor hash: Finally, we show that h′_k is not a randomness-extractor hash function. Specifically, let x = r ++ 0^n, where r is a random n-bit string; for simplicity, assume that the key k is also n bits long. Then h′_k(x) = ((r ++ 0^n) + k) mod 2^n = k, which is obviously not a random string.
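The attacks above can be checked directly in code. The following sketch (ours, not from the textbook) exercises the collision, preimage, and extractor-distinguishing attacks, using an illustrative digest length n = 16.

```python
# Checking the attacks on h(x) = x mod 2^n and h'_k(x) = k + x mod 2^n.
import secrets

n = 16  # digest length in bits (illustrative)

def h(x):
    return x % 2**n

def hk(k, x):
    return (k + x) % 2**n

# SPR/CRHF break: x' = x + 2^n collides with x, for both functions.
x = secrets.randbits(n + 8)           # arbitrary "given" input
k = secrets.randbits(n)               # public key
assert h(x) == h(x + 2**n)
assert hk(k, x) == hk(k, x + 2**n)

# OWF break: recover *a* preimage from the digest (and key) alone.
y = h(x)
assert h(y) == y == h(x)              # x' = h(x) is a preimage of h(x)
yk = hk(k, x)
assert hk(k, (yk - k) % 2**n) == yk   # x' = h_k(x) - k (mod 2^n)

# Extractor break: x = r ++ 0 forces the low bit of h(x) to zero.
r = secrets.randbits(n)
x2 = r << 1                           # append a zero bit
assert h(x2) % 2 == 0                 # so h(x2) is never uniformly random
```

Each assertion holds for every input, matching the "always wins" nature of these attacks.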
3.1.3 Applications of cryptographic hash functions
The broad security requirements of cryptographic hash functions facilitate their use in many systems and for an extensive variety of applications. These different applications and systems rely on different security requirements. As in any security system, it is important to identify the exact security requirements and assumptions; however, published designs, and even standards, do not always define the requirements precisely. Important applications of cryptographic hash functions include:
Integrity of a string or a set: the hash h(m) is a short digest of a typically much longer string m, which allows validation of the integrity of m. Hash functions are also used in the construction of accumulator schemes, e.g., the Merkle tree design; accumulators also produce a digest, but of an ordered or unordered set of strings, rather than of a single string. We discuss accumulators in Section 3.7.

Hash-then-Sign: Signature schemes are usually defined with Fixed Input Length (FIL), typically quite limited (e.g., < 1024 bits). To sign longer messages, we apply the FIL signing function to the digest h(m) of the message m being signed; this is called the Hash-then-Sign paradigm. See subsection 3.2.6.

Improved login mechanisms: Hash functions are used to improve the security of password-based login authentication, in several ways. The most widely deployed method is using a hashed password file, which makes exposure of the server's password file less risky, since it contains only the hashed passwords. Another approach is to use a hash-based one-time password, which is a random number allowing the server to authenticate the user, with the drawbacks of single use and of having to remember or store this random number; see subsection 3.4.1, and a more extensive discussion of different login mechanisms in Chapter 9.

Proof-of-Work: cryptographic hash functions are often used to provide Proof-of-Work (PoW), i.e., to prove that an entity performed a considerable amount of computation. This is used by Bitcoin and other cryptocurrencies, and for other applications. See Section 3.10.2.

Key derivation and randomness generation: hash functions are used to extract pseudorandom bits, given input with 'sufficient randomness'. In particular, this is used to derive secret shared keys. See Section 3.5.
3.1.4 Standard cryptographic hash functions
Due to their efficiency, simplicity and wide applicability, cryptographic hash functions are probably the most commonly used 'cryptographic building blocks', as discussed in the cryptographic building blocks principle (Principle 8). This implies the importance of defining and adopting standard functions, which can be widely evaluated for security - mainly by cryptanalysis - and the need for definitions of security.

There have been many proposed cryptographic hash functions; however, since security is based on failed efforts at cryptanalysis, designers usually avoid less-well-known (and hence less tested) designs. The most well-known cryptographic hash functions include the MD4 and MD5 functions proposed by RSA Inc., the SHA-1, SHA-2 and SHA-3 functions standardized by NIST [114, 323], the RIPEMD and RIPEMD-160 standards, and others, e.g., BLAKE2. Several of these, however, were already 'broken', i.e., shown to fail some of the security requirements. In particular, collisions - and specifically, chosen-prefix collisions - were found for RIPEMD, MD4 and MD5 in [367], and later also for SHA-1 [262]; see subsection 3.3.1. As a result, these functions should be avoided and replaced, at least in applications which depend on the collision-resistance property.

Existing standards define only keyless cryptographic hash functions. However, as we later explain, there are strong motivations to use keyed cryptographic hash functions, which use a non-secret key. In particular, collision-resistance cannot be achieved by any keyless function.
3.2 Collision Resistant Hash Function (CRHF)

3.2.1 Keyless Collision Resistant Hash Function (Keyless-CRHF)
A keyless hash function h : {0, 1}∗ → {0, 1}^n maps unbounded-length binary strings m ∈ {0, 1}∗ to their n-bit digest h(m) ∈ {0, 1}^n. Since the input domain is the (infinite) set of all binary strings, and the range is finite ({0, 1}^n), it follows that there are infinitely many collisions, i.e., messages m ≠ m′ s.t. h(m) = h(m′). Indeed, even if we limit ourselves to the set of input messages m ∈ {0, 1}^{n+1}, the number of messages is 2^{n+1} and the number of digests is only 2^n, so clearly at least half (2^n) of the inputs must result in a collision. Namely, collisions are common - in every hash function.

³The SHA-2 specification defines six variants, with digest lengths of 224, 256, 384 or 512 bits; these variants are named SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, and SHA-512/256.

Figure 3.5: Keyless collision resistant hash function (CRHF): it is infeasible to efficiently find a collision, i.e., a pair of inputs x, x′ ∈ Domain, which are mapped by the hash function h to the same output, h(x) = h(x′), except with negligible probability.
However, for large digest length n, it is conceivable that finding a collision may be hard. Intuitively, we say that a hash function is collision resistant if it is hard to find any collision, as illustrated in Figure 3.5. The definition follows; notice that, in this and (most) other definitions of keyless hash functions, we actually view the hash function as if it is defined for different digest lengths n, allowing us to discuss the computational complexity as a function of n. For more precise definitions that explicitly express the digest length n as a parameter, see, e.g., [165].
Definition 3.1 (Keyless Collision Resistant Hash Function (CRHF)). A keyless hash function h(·) : {0, 1}∗ → {0, 1}^n is collision-resistant if for every efficient (PPT) algorithm A, the advantage ε^CRHF_{h,A}(n) is negligible in n, i.e., smaller than any positive polynomial for sufficiently large n (as n → ∞), where:

    ε^CRHF_{h,A}(n) ≡ Pr[(x, x′) ← A(1^n) s.t. (x ≠ x′) ∧ (h(x) = h(x′))]    (3.1)

where the probability is taken over the random coin tosses of A.
Let us define hsum, a simple example of an insecure hash function, which is handy for giving examples of the different cryptographic hash function definitions - all of which hsum fails to satisfy. While the input to hsum can be any binary string, we normally use it for input which is a string of decimal digits (encoded in binary); for any other input, hsum returns the fixed output of zero (0). When the input consists of a string of decimal digits, hsum repeatedly sums up the digits of its input, until obtaining only one digit, which is the output. For example, hsum(13) = 4, hsum(345) = 3 and hsum(5) = 5. The reader is probably familiar with hsum from elementary school, where it is introduced - without calling it hsum, of course - as a way to check if a number is divisible by three or by nine. We next define hsum precisely and find collisions in it; our definition conveniently assumes that we can use the decimal value of a string in numeric operations, when the string is composed of decimal digits.
Example 3.1 (The hsum hash function and collisions in it). We define hsum(x) as follows:

    hsum(x) = { 0                            if x ∉ {0, 1, . . . , 9}∗
              { x                            if x < 10
              { hsum( Σ_{i=1}^{|x|} x[i] )   otherwise

Let us show that hsum is not a CRHF. This is easy; in fact, given any integer x > 0, let x′ = 10 · x. Then x ≠ x′, yet hsum(x′) = hsum(x) - i.e., hsum is not a CRHF.
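The definition and the collision are easy to check in code. The following sketch (ours, not from the textbook) implements hsum over strings of decimal digits, matching the examples above, and verifies that appending a zero digit (i.e., x′ = 10 · x) never changes the digest.

```python
# hsum: repeatedly sum the decimal digits until a single digit remains;
# return 0 for any input that is not a string of decimal digits.

def hsum(x: str) -> int:
    if not x.isdigit():
        return 0
    v = int(x)
    if v < 10:
        return v
    return hsum(str(sum(int(d) for d in x)))

print(hsum("13"), hsum("345"), hsum("5"))   # → 4 3 5

# The collision of Example 3.1: x' = 10·x has the same digit sum as x.
for x in (7, 13, 345, 987654):
    assert hsum(str(10 * x)) == hsum(str(x))
```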
So, how do we find a CRHF? Possibly surprisingly, we next show that we cannot; namely, we show that there exists no keyless CRHF.
3.2.2 There are no Keyless CRHFs!
Standard cryptographic hash functions, discussed in subsection 3.1.4, are all keyless, and practical deployments almost always use these designs. By now, the readers should not be surprised to learn that none of these were proven secure; we discussed in subsection 2.7.4 the fact that 'real', unconditional proofs of security for (most) cryptographic schemes would imply that P ≠ NP, and, therefore, would be major news. Cryptographic hash functions are among the basic cryptographic building blocks, which are typically validated by accumulated evidence of failed attempts to cryptanalyze them.
However, the following lemma may be surprising: all keyless hash functions fail to satisfy the keyless-CRHF definition (Definition 3.1) - namely, a keyless CRHF, under this definition, simply does not exist. We present and prove this, and then discuss the implications.

The reader is quite right to suspect a 'trick' here - after all, we just explained that all standard cryptographic hash functions are keyless! Well, that is correct: the proof uses a 'trick' to show that there is an efficient attacker that can find a collision in the keyless hash function. The proof shows that for any given keyless hash function h, there exists an efficient adversarial algorithm A_h that outputs a collision for h, i.e., a pair (m, m′) s.t. m ≠ m′ but h(m) = h(m′). Furthermore, A_h is not just efficient in n: its time complexity is basically the time required to print out the collision. In fact, printing the collision is basically the only thing that A_h does. And note that A_h does not just succeed with a 'significant' probability; it always succeeds. Namely, h is very, very far from the requirements of Definition 3.1!

Would you like to see the trick - or did you already figure it out? You may have, since we have essentially already done the trick - 'hidden in plain sight' - exactly in the paragraph above. Can you find it? Try to find it, before you read the proof and the explanation of the trick.
Lemma 3.1 (Keyless CRHFs do not exist). There is no keyless CRHF hash function h : {0, 1}∗ → {0, 1}^n.

Proof: Given h(·), we prove that there exists an efficient adversary algorithm A_h that always finds a collision, i.e., ε^CRHF_{h,A_h}(n) = 1 - clearly showing that h(·) does not satisfy the definition of a keyless CRHF.

Recall that, since the domain of h is infinite while the range is the finite set {0, 1}^n, h must have collisions, i.e., pairs of binary messages m ≠ m̂ s.t. h(m) = h(m̂). Let m, m̂ denote one such collision. It does not matter which collision we pick, or how we pick it.

The adversary A_h simply outputs the collision (m, m̂), i.e., A_h(1^n) = (m, m̂). Obviously, A_h is efficient, and always outputs a collision. Therefore, h(·) is not a keyless CRHF (as defined in Definition 3.1).
The 'trick' was that we proved that such an attacker A_h exists - but in a non-constructive way, i.e., we did not present such an adversary or show an efficient way to find it. We only showed that such an adversary exists. This shows that there is no keyless CRHF, as defined in Definition 3.1. Of course, there may be some other reasonable notion of collision resistance for keyless hash functions, for which the lemma does not apply. Indeed, we later define Second-Preimage Resistant (SPR) hash, which is essentially a weaker collision-resistance property.

In this textbook, for simplicity, we usually use keyless hash functions, often assuming collision resistance, i.e., that the keyless hash function is a cryptanalysis-resistant CRHF. In the constructions and designs we discuss, it is not too hard to add the 'missing' keys, when desired (e.g., for provably secure reductions). Namely, we use 'keyless CRHFs' as a convenient simplification. A justification may be that if the system using the hash is insecure, then we may be able to use the attack to find the collision - which seems hard, as cryptanalysts have failed so far to find such a collision. Another justification is that in any practical implementation, the output length is fixed, while we only discuss asymptotic security definitions. In fact, many cryptographic designs use an even stronger simplification - the random oracle model (ROM), which we discuss in §3.6.

Another approach is to design the application without assuming a CRHF at all, and instead rely on other properties, which may exist for keyless hash functions. One especially-relevant property is second-preimage resistance (SPR), which is, essentially, a weaker form of collision resistance. Of course, care must be taken to ensure that SPR is really sufficient for the application; there could be subtle vulnerabilities due to use of SPR in an application requiring 'real' collision-resistance. We discuss SPR in Section 3.3.

A final alternative is to use a keyed CRHF instead of a keyless CRHF. Considering that existing standards define only keyless hash functions, a common approach is to construct a keyed hash from a keyless hash. Often, this is done with the HMAC construction, originally designed as a construction of a MAC function from a hash. We discuss HMAC in subsection 4.6.3.
Figure 3.6: Keyed collision resistant hash function (CRHF): given a random key k, it is hard to find a collision for h_k, i.e., a pair of inputs x, x′ ∈ {0, 1}∗ s.t. h_k(x) = h_k(x′).
3.2.3 Keyed Collision Resistance
We next discuss keyed collision resistant hash functions (keyed CRHF). The definition for keyed CRHF seems very similar; the only difference is that the probability is also taken over the key, and the key is provided as input to the adversary. Recall that, for simplicity, we use n as the length of both the digest and the key; hence, we do not need to provide n as an additional input (since it is equal to the key length). We next define keyed collision resistance, which we illustrate in Figure 3.6. Recall that for keyed cryptographic hash functions we assume, for simplicity, that n denotes both the length of the key and the length of the digest, i.e., for every message m: |k| = |h_k(m)| = n.
Definition 3.2 (Keyed Collision Resistant Hash Function (CRHF)). Consider a keyed hash function h_k(·) : {0, 1}^n × {0, 1}∗ → {0, 1}^n, defined for any n ∈ N. We say that h is collision-resistant if for every efficient (PPT) algorithm A, the advantage ε^CRHF_{h,A}(n) is negligible in n, i.e., ε^CRHF_{h,A}(n) ∈ NEGL(n), where:

    ε^CRHF_{h,A}(n) ≡ Pr_{k←{0,1}^n}[(x, x′) ← A(k) s.t. (x ≠ x′) ∧ (h_k(x) = h_k(x′))]    (3.2)

where the probability is taken over the random coin tosses of the adversary A and the random choice of k.
Let us now define a simple, insecure keyed hash function - specifically, h^sum_k - essentially, a keyed version of the hsum hash function (Example 3.1).

Definition 3.3 (The keyed h^sum_k (insecure) hash function). Let k, x ∈ {0, 1, . . . , 9}∗. Then we define h^sum_k(x) as follows:

    h^sum_k(x) = hsum(k||x)

The following exercise uses the simple h^sum_k hash function to demonstrate the CRHF definition.
Exercise 3.3. Show that h^sum_k is not a keyed CRHF.
Hint: see Example 3.1 for guidance.
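As a quick sanity check of the hint (a sketch of ours, not a full solution), note that the trailing-zero collision of Example 3.1 survives any prefix key k, since prepending k and appending a zero digit leaves the digit sum unchanged:

```python
# h^sum_k(x) = hsum(k||x): the collision x' = x||"0" (i.e., 10·x) works
# for every key k, since a zero digit never changes the digit sum.

def hsum(x: str) -> int:
    if not x.isdigit():
        return 0
    v = int(x)
    return v if v < 10 else hsum(str(sum(int(d) for d in x)))

def hsum_k(k: str, x: str) -> int:
    return hsum(k + x)

for k in ("0", "42", "999"):
    assert hsum_k(k, "345") == hsum_k(k, "3450")   # collision for every key
```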
Figure 3.7: Target collision resistant (TCR) hash function: the adversary cannot find a target x for which it would be able to find a collision x′, once it is given the random key k.
Target Collision Resistant (TCR) vs. ACR / Keyed CRHF. Definition 3.2 uses the term keyed CRHF, following Damgård [111]. Another term for this definition is any collision resistance (ACR hash), proposed by Bellare and Rogaway in [45]. They preferred this term to emphasize that this definition allows the attacker to choose the specific collision as a function of the key, since the key is given to the attacker before the attacker outputs the entire collision (both x and x′ s.t. h_k(x) = h_k(x′)).

Bellare and Rogaway preferred the term ACR to the term 'keyed CRHF', to emphasize the difference from a weaker notion of collision-resistance that they (and we) call⁴ Target Collision Resistant (TCR) hash. The term TCR emphasizes that, to 'win against' the TCR definition, the attacker has to first select the target x, i.e., one of the two colliding strings, before it receives the (random) key k. Only then is the attacker given the random key k, and it has to output the colliding string x′ s.t. h_k(x) = h_k(x′). Intuitively, this makes sense: it seems that in most applications, a collision between two 'random' strings x, x′ may not help the attacker; the attacker often needs to match some specific 'target' string x. The TCR definition still allows the attacker to choose the target - but at least not as a function of the key!

We next define target collision resistance, which we illustrate in Figure 3.7.
Definition 3.4 (Target collision resistant (TCR) hash). A keyed hash function h_k(·) : {0, 1}∗ × {0, 1}∗ → {0, 1}∗ is called a target collision-resistant (TCR) hash, if for every efficient (PPT) algorithm A, the advantage ε^TCR_{h,A}(n) is negligible in n, i.e., smaller than any positive polynomial for sufficiently large n (as n → ∞), where:

    ε^TCR_{h,A}(n) ≡ Pr_{k←{0,1}^n}[ x ← A(1^n); x′ ← A(x, k) s.t. (x ≠ x′) ∧ (h_k(x) = h_k(x′)) ]    (3.3)

where the probability is taken over the random coin tosses of A and the random choice of k.

⁴TCR is a different name for the notion, which was earlier defined by Naor and Yung in [295], but with a different name: universal one-way hash functions.
Clearly, every keyed CRHF, i.e., Any-Collision-Resistant (ACR) hash, is also a Target-Collision-Resistant (TCR) hash function: if there is some value x with which the adversary can find a collision for a random key (with high probability), then surely the adversary can find some collision for a random key (with high probability) - e.g., a collision with that same x. However, the reverse appears, intuitively, unlikely: maybe it is possible to find a collision once given the key k, but not with a pre-committed value x? The following counterexample exercise/argument shows that indeed, this may be possible, i.e., there may be a keyed hash which is TCR but not a CRHF (not an ACR hash).
Exercise 3.4. Let h_k(·) be a TCR hash function. Show a keyed hash function h′_k(·) which is also TCR but not a keyed CRHF (i.e., not an ACR hash).

Solution: Recall that the length of the key is n bits. Define h′_k(x) as follows:

    h′_k(x) = { 0^n      if x[1 : n] = k
              { h_k(x)   otherwise

Namely, if the n most significant bits of the input x are the same as k, then h′_k(x) = 0^n; otherwise, h′_k(x) = h_k(x). Clearly, for any key k: h′_k(k) = h′_k(k||0) = 0^n. Recall that in the definition of a keyed CRHF (Definition 3.2), the adversary A is given the key k. Hence it is easy for A to output a collision, e.g., the pair (x, x′) where x = k and x′ = k ++ 0. Namely, h′ is not a keyed CRHF.
It remains to show that h′ is a TCR hash. In the TCR test, the key k is also chosen randomly; the difference is that the key is given to the adversary A only when A selects the second (colliding) input x′, and not earlier, when A selects the first input x. Since k is chosen uniformly at random, the probability that A picks x whose n most significant bits are the same as k is 1/2^n; and 1/2^n is negligible (in n). Therefore, with overwhelming probability, A selects x whose n most significant bits are not the same as k, and therefore h′_k(x) = h_k(x).

Assume, to the contrary, that, given k, the adversary A finds a collision for h′, i.e., x′ ≠ x such that h′_k(x) = h′_k(x′). However, with overwhelming probability h′_k(x) = h_k(x). Also, the n most significant bits of x′ cannot be the same as k (or h′_k(x′) would be 0^n); hence, h′_k(x′) = h_k(x′). Namely, we have found a collision for h: a pair x ≠ x′ such that h_k(x) = h_k(x′), in contradiction to h being a TCR hash. Hence, an adversary A that finds a target-collision for h′ would also find such a collision for h. Since h is a TCR hash, finding such collisions is infeasible; hence, h′ must also be a TCR hash.
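The 'not a keyed CRHF' half of the argument is easy to see running. Below is a sketch (ours, not from the textbook) of the counterexample h′_k; the underlying keyed hash h_k is a stand-in built from SHA-256, used only so the sketch executes (any TCR hash would play the same role).

```python
# The counterexample h'_k of Exercise 3.4: output 0^n whenever the input
# starts with the key, otherwise fall through to the underlying hash.
import hashlib

n = 16  # key/digest length in bytes, for illustration

def hk(k: bytes, x: bytes) -> bytes:
    """Stand-in keyed hash (assumed TCR for the argument's sake)."""
    return hashlib.sha256(k + x).digest()[:n]

def h_prime(k: bytes, x: bytes) -> bytes:
    if x[:n] == k:              # n most significant bytes equal the key?
        return bytes(n)         # ... then the digest is 0^n
    return hk(k, x)

k = b"0123456789abcdef"         # the (public) random key, known to A
# The ACR attack: given k, both k and k||0 hash to 0^n - a trivial collision.
assert h_prime(k, k) == h_prime(k, k + b"\x00") == bytes(n)
```

The TCR half of the argument, of course, cannot be demonstrated by running code; it relies on the target x being committed before k is revealed.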
If possible, it is preferable to design protocols which use a TCR keyed hash, rather than protocols requiring the use of a keyed CRHF (ACR hash). That is since, as we have seen, a keyed hash h may satisfy the (weaker) TCR requirement, but not the (stronger) ACR (keyed CRHF) requirement. Furthermore, for a protocol relying on a keyed CRHF rather than a TCR hash, we must use a sufficient digest length to protect against the birthday attack, which we discuss in the following subsection.
3.2.4 Birthday and exhaustive attacks on CRHFs
Our definitions of collision-resistance (for both keyless and keyed hash functions, Definition 3.1 and Definition 3.2, respectively) place two significant requirements on the attacker. First, the attacker must find collisions using a Probabilistic Polynomial Time (PPT) algorithm. Second, the attacker must succeed in finding a collision with non-negligible probability. Let us explain why, without these requirements on the attacker, we cannot hope to achieve collision resistance. Both arguments hold for both keyless and keyed CRHFs; for simplicity, we focus on the keyless hash case.

A PPT attacker can find collisions with exponentially-small probability in every hash function. Consider a set X of 2^n + 1 distinct input strings (as in the following argument), and an algorithm that selects two random elements of X; with small probability, this algorithm outputs a collision x ≠ x′ s.t. h(x) = h(x′). Therefore, the definitions allow the adversary to have negligible probability of finding a collision.
An attacker can find collisions in exponential time in every hash function. Consider a hash function h : {0, 1}∗ → {0, 1}^n, and a set X containing 2^n + 1 distinct input binary strings. The output of h is in the set of n-bit strings, which contains 2^n elements; hence, there must be at least two elements x ≠ x′ in the set X which collide, i.e., h(x) = h(x′). An adversary can surely compute h(x) for each of the 2^n + 1 elements in X, and find at least one such collision h(x) = h(x′). Of course, this attack requires 2^n + 1 computations of the hash function, i.e., its runtime is exponential in n. Hence, the definitions require the adversary against a CRHF to run in time polynomial in n.
The birthday paradox and attack on collision resistance. The argument above required the adversary to compute 2^n + 1 hash values, i.e., O(2^n). We next show that, actually, an adversary can find a collision, in any hash function, with only O(2^{n/2}) = O(√(2^n)) expected hash computations, rather than O(2^n).

This attack is often called the birthday attack, since it is due to the so-called birthday paradox. Consider a room containing 23 persons. What is the probability of a collision, i.e., two people having a birthday on the same day of the year? Many people expect this probability to be quite small, but in reality the probability is about half. To understand why this is true, notice that when a person is added to a room currently containing i persons (with no collisions), the probability of a collision with some person in the room is i/365, not 1/365.
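The '23 people' claim is quick to verify numerically; the following sketch (ours, not from the textbook) multiplies out the no-collision probabilities person by person.

```python
# Birthday paradox: probability that some two of `people` persons share a
# birthday, assuming 365 equally likely days.

def p_collision(people: int, days: int = 365) -> float:
    p_none = 1.0
    for i in range(people):            # add people one at a time;
        p_none *= (days - i) / days    # each must avoid the i taken days
    return 1.0 - p_none

print(round(p_collision(23), 3))   # → 0.507
```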
More precisely, the expected number q of messages {m_1, m_2, . . . , m_q} which should be hashed before finding a collision h(m_i) = h(m_j) is approximately:

    q ⪅ 2^{n/2} · √(π/2) ≈ 1.254 · 2^{n/2}    (3.4)
Hence, to ensure collision-resistance against an adversary who can perform 2^q computations, e.g., 2^80 hash calculations (q = 80), we need the digest length n to be roughly twice that size, e.g., 160 bits. Namely, the effective key length of a CRHF is only n/2. This motivates the fact that hash functions often have a digest length that is twice the key length of the shared-key cryptosystems used in the same system. Using a longer digest length and/or a longer key length does not harm security, but may have performance implications.

Note that the birthday attack applies to both keyed CRHF and keyless CRHF; however, it does not apply to Target Collision Resistant (TCR) hash functions. Can you see why? Carefully compare Definition 3.2 vs. Definition 3.4, and you'll find out!
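The birthday attack is simple enough to run against a deliberately weakened hash. The sketch below (ours, not from the textbook) truncates SHA-256 to n = 32 bits; by Equation 3.4, a collision is expected after roughly 1.254 · 2^16 ≈ 82,000 hash computations, instead of the ≈ 2^32 an exhaustive argument would suggest.

```python
# Birthday attack against a 32-bit truncation of SHA-256.
import hashlib

def h32(m: bytes) -> bytes:
    return hashlib.sha256(m).digest()[:4]   # 32-bit digest

seen = {}                                    # digest -> message
i = 0
while True:
    m = i.to_bytes(8, "big")                 # distinct messages 0, 1, 2, ...
    d = h32(m)
    if d in seen:                            # a repeated digest is a collision
        m1, m2 = seen[d], m
        break
    seen[d] = m
    i += 1

assert m1 != m2 and h32(m1) == h32(m2)
print(f"collision after {i + 1} hashes")     # typically on the order of 2^16
```

Storing every digest in a dictionary trades memory for time; cycle-finding variants of the attack achieve the same O(2^{n/2}) time with constant memory.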
3.2.5 CRHF Applications (1): File Integrity

Collision resistance is a great tool for ensuring integrity. One common application is to distribute a (large) object m, e.g., a file containing the executable code of a program. Suppose the file m is distributed from its producer in LA to a user or repository in Washington DC (step 1 in Fig. 3.8). Next, a user in NY downloads the file from the repository (or peer user) in DC (step 2), receiving m′ (which should be the same as m, of course). To validate the integrity of the received file m′, the user also downloads the digest h(m) of the file, directly from the producer in LA (step 3), and then confirms that h(m) = h(m′) (by computing h(m′) locally). By downloading the large file m from the nearby DC rather than from LA, the transmission costs are reduced; by checking integrity using the digest h(m), we avoid the concern that the file was modified in DC, or modified in transit between LA and DC or between DC and NY.

This method of download validation is deployed manually, by savvy users, or in an automated way, by operating systems, applications or a script running within a browser [161].
A potential remaining concern is modification of the digest h(m), received directly from the producer in LA, by a Man-in-the-Middle (MitM)⁵ attacker. This may be addressed in different ways, including the use of a secure web connection for retrieving h(m), as discussed in Chapter 7, receiving the digest from multiple independent sources, or receiving a signed digest. This last method is basically the same as the Hash-then-Sign method that we discuss next.
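The validation step itself is a one-line comparison. Below is a minimal sketch (ours, not from the textbook) of the pattern: the (large) file comes from an untrusted mirror, the (short) digest from the trusted producer, and the user recomputes and compares; SHA-256 stands in for the CRHF h.

```python
# Download validation: compare the digest of the mirror's file against the
# digest fetched directly from the producer.
import hashlib

def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

producer_file = b"... executable code ..."   # what the producer published
trusted_digest = digest(producer_file)       # fetched directly from producer

mirror_file = b"... executable code ..."     # what the untrusted mirror served
if digest(mirror_file) == trusted_digest:
    print("integrity OK")                    # → integrity OK
else:
    print("file was modified; reject")
```

Collision resistance is what makes this meaningful: a malicious mirror cannot find a different file with the same digest.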
3.2.6 CRHF Applications (2): Hash-then-Sign (HtS)

Figure 3.8: Example of use of a hash function h to validate the integrity of a file m downloaded by a user in NY from an untrusted repository in DC. To validate integrity, the user downloads the (short) digest directly from the website of the producer, in LA. This reduces network overhead - and load on the producer's website - compared to downloading the entire file from the producer's website.

Collision-resistance is a powerful property; in particular, it facilitates one of the most important applications of cryptographic hash functions - the Hash-then-Sign (HtS) paradigm. The Hash-then-Sign paradigm is essential for efficient deployment of public-key digital signatures, which we introduced in subsection 1.5.1. We present constructions for signature schemes based on public-key cryptosystems in Section 6.6; and in subsection 3.4.2 we discuss one-time signatures and present their constructions, based on one-way functions (OWFs). However, both approaches result in signatures for limited-length inputs; furthermore, extending the input length would significantly further increase the already high overhead of signature computation and validation (Table 6.1). Real applications always use, instead, the Hash-then-Sign (HtS) construction, which we discuss here. It can be applied to either a keyed or a keyless hash; we mostly focus on the keyless-hash variant (or 'keyless HtS').

⁵Also called Monster-in-the-Middle.
The Hash-then-Sign solution applies a hash function h(·) to the 'long' message m, and signs the (short) output h(m). Namely, given a signature scheme S defined for domain {0, 1}^n and a hash function h with domain {0, 1}∗ and range {0, 1}^n (i.e., h : {0, 1}∗ → {0, 1}^n), we define the HtS scheme S^h_HtS as follows:

    S^h_HtS.KG(1^n)         ≡ S.KG(1^n)               (3.5)
    S^h_HtS.Sign_s(m)       ≡ S.Sign_s(h(m))          (3.6)
    S^h_HtS.Verify_v(m, σ)  ≡ S.Verify_v(h(m), σ)     (3.7)
The HtS scheme S^h_HtS may be applied to any binary string, i.e., its domain is {0, 1}*. The reader may confirm that it is a correct signature scheme over {0, 1}* (Exercise 4.21). Theorem 3.1 shows that if h : {0, 1}* → {0, 1}^n is a collision-resistant hash function (CRHF), as in Definition 3.1, and S is an existentially unforgeable signature scheme over {0, 1}^n, then the HtS scheme S^h_HtS is an existentially unforgeable signature scheme over {0, 1}*, i.e., applicable to arbitrary-length binary strings. Of course, the HtS method may fail if using an insecure hash function h; see Exercise 4.24.
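To make the construction concrete, here is a minimal Python sketch of the HtS wrapper. The underlying fixed-input-length scheme S is stubbed with a toy HMAC-based stand-in (a shared-key construct, not a real public-key signature scheme); the class names and interface are illustrative, not from the text.

```python
import hashlib
import hmac
import os

class ToySig:
    """Stand-in for a signature scheme S with fixed-length (n-bit) inputs.
    NOT a real public-key scheme: signing and verification share one key."""
    def kg(self):
        key = os.urandom(32)
        return key, key                      # (signing key s, verification key v)

    def sign(self, s, digest):
        assert len(digest) == 32             # S accepts only n-bit (32-byte) inputs
        return hmac.new(s, digest, hashlib.sha256).digest()

    def verify(self, v, digest, sigma):
        return hmac.compare_digest(self.sign(v, digest), sigma)

class HtS:
    """Hash-then-Sign wrapper: sign the short digest h(m), not the long m."""
    def __init__(self, S):
        self.S = S
        self.h = lambda m: hashlib.sha256(m).digest()   # keyless hash h

    def kg(self):
        return self.S.kg()

    def sign(self, s, m):
        return self.S.sign(s, self.h(m))     # Sign_s(m) = S.Sign_s(h(m))

    def verify(self, v, m, sigma):
        return self.S.verify(v, self.h(m), sigma)

scheme = HtS(ToySig())
s, v = scheme.kg()
m = b"a message of arbitrary length " * 1000
sigma = scheme.sign(s, m)
assert scheme.verify(v, m, sigma)
assert not scheme.verify(v, m + b"!", sigma)   # any change invalidates
```

Note that the wrapper signs the short digest h(m) regardless of the length of m, which is exactly what lets a fixed-input-length scheme handle arbitrary-length messages.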
CHAPTER 3. INTEGRITY: FROM HASHING TO BLOCKCHAINS
Given a keyless CRHF h, HtS would be secure. The keyless Hash-then-Sign construction uses a keyless CRHF h : {0, 1}* → {0, 1}^n. Given a signature scheme (KG, S, V) whose input domain is (or includes) n-bit strings, i.e., {0, 1}^n, the keyless Hash-then-Sign signature of any message m ∈ {0, 1}* is defined as S^h_s(m) ≡ S_s(h(m)), where we use s for the private signing key of S (and of S^h).
We next show that if h is a Collision-Resistant Hash Function (CRHF), and assuming that the signature scheme (S, V) is secure (existentially unforgeable, see Definition 1.6), then (S^h, V^h) is also secure. Notice that Lemma 3.1 showed that keyless CRHFs do not exist at all, which makes this theorem useless as a basis for proofs of security of applications using a keyless hash function. However, the theorem is still useful, for two reasons. First, it is similar to, and simpler than, the analogous theorems for keyed CRHFs. Second, it justifies, at least intuitively, the common use of the Hash-then-Sign paradigm applied to a keyless hash function h.
Theorem 3.1 (Keyless Hash-then-Sign would be secure (existentially unforgeable)). Let (KG, S, V) be an existentially unforgeable signature scheme over the domain {0, 1}^n, and let h : {0, 1}* → {0, 1}^n be a keyless CRHF. Let S^h_HtS be the Hash-then-Sign signature scheme as defined in Equations 3.5-3.7. Then S^h_HtS is an existentially unforgeable signature scheme over domain {0, 1}*.
Proof: Assume that the claim does not hold, i.e., that there is an efficient adversary A ∈ PPT s.t. ε^{eu-Sign}_{S^h_HtS, A}(n) ∉ NEGL(n), as defined in Equation 1.2. Namely, with significant probability, A outputs a pair (m, σ) s.t. A did not provide m as input to the S^h_HtS.Sign_s(·) oracle, yet S^h_HtS.Verify_v(m, σ) holds. From Equation 3.7, S^h_HtS.Verify_v(m, σ) = S.Verify_v(h(m), σ). Let φ ≡ h(m); now, either there was another message m′ s.t. φ = h(m′) which A did provide as input to the S^h_HtS.Sign_s(·) oracle, or not. Let us consider both cases; at least one of the two must occur with significant probability.
If there is another message m′ s.t. φ = h(m′) which A provided as input to S^h_HtS.Sign_s(·), then the pair (m, m′), both produced by A, is a collision for h. This should be infeasible to find efficiently, since h is assumed to be a CRHF.

If there was no such m′, then we can use A to construct an adversary A′ that will find a forgery for the original signature scheme (KG, S, V). The adversary A′ runs A, and whenever A makes an oracle query m, A′ computes φ = h(m), queries its own oracle for S_s(φ), and returns the result to A; obviously, this is the value that A expects. Finally, when A returns the forgery (m, σ) for S^h_HtS, A′ computes h(m) and returns (h(m), σ), which is the corresponding forgery for S. The existence of such a forgery contradicts the assumption that (KG, S, V) is an existentially unforgeable signature scheme. Hence, there is no efficient adversary A that ‘wins’ against S^h_HtS, i.e., S^h_HtS is an existentially unforgeable signature scheme over domain {0, 1}*, as claimed.
Two Keyed-Hash HtS Constructions: Keyed-HtS and TCR-HtS. We now present two Hash-then-Sign (HtS) constructions from keyed hash functions. We begin with a very simple construction by Damgård [111], which we refer to as Keyed-HtS, as it is based on the use of a keyed CRHF. This construction is identical to the keyless Hash-then-Sign construction, except for the use of a keyed hash h_k(·) : {0, 1}^n × {0, 1}* → {0, 1}^n, defined for arbitrary key and digest length n ∈ N. The hash key is selected once, during the key-generation process, and becomes part of the public verification key and of the private signing key. We define the Keyed-HtS construction S^h_HtS as follows.
Definition 3.5 (The Keyed-HtS construction). Given a signature scheme S with domain {0, 1}^n and a keyed hash (CRHF) h_k(·) : {0, 1}^n × {0, 1}* → {0, 1}^n, the Keyed-HtS signature using signature S and keyed-hash h is defined as follows:

S^h_HtS.KG(1^n) ≡ { (s, v) ←$ S.KG(1^n); k ←$ {0, 1}^n; return ((s, k), (v, k)) }    (3.8)
S^h_HtS.Sign_(s,k)(m) ≡ S.Sign_s(h_k(m))    (3.9)
S^h_HtS.Verify_(v,k)(m, σ) ≡ S.Verify_v(h_k(m), σ)    (3.10)
The Keyed-HtS construction requires the underlying keyed hash function to satisfy a relatively strong requirement, namely, to make it infeasible to find any collision (what we referred to as ACR or keyed-CRHF hash).
Bellare and Rogaway show, in [45], the almost-as-simple TCR-HtS construction. The TCR-HtS construction requires only the weaker target-collision-resistant (TCR) keyed-hash function. For discussion and comparison of these two definitions, see subsection 3.2.3; in particular, the generic birthday attack is applicable against the keyed-CRHF property, but not against the TCR property, hence a hash function may be a secure TCR hash even when it uses significantly shorter digests (about half of the bits required for a secure keyed-CRHF).
The TCR-HtS construction is similar to the keyless Hash-then-Sign construction (and to the Keyed-HtS construction); there are basically two differences. The first difference is obvious: we use a keyed hash h_k(·) : {0, 1}^n × {0, 1}* → {0, 1}^n, i.e., a hash function which receives two binary strings as inputs, a key k and a message, and outputs a binary string. The second difference is in the construction: we select and transmit the hash key with each signature. Namely, the hash key is selected, randomly, as part of each signing operation, and is sent together with the output of the underlying signature function. We define the TCR-HtS construction S^h_TCR-HtS as follows.
Definition 3.6 (The TCR-HtS construction). Given a signature scheme S with domain {0, 1}^n, defined for any integer n, and a keyed hash h_k(·) : {0, 1}^n × {0, 1}* → {0, 1}^n, the TCR-HtS signature using signature S and keyed-hash h
is defined as follows:

S^h_TCR-HtS.KG(1^n) ≡ S.KG(1^n)    (3.11)
S^h_TCR-HtS.Sign_s(m) ≡ { k ←$ {0, 1}^n; σ ← S.Sign_s(k ++ h_k(m)); return (k, σ) }    (3.12)
S^h_TCR-HtS.Verify_v(m, (k, σ)) ≡ S.Verify_v(k ++ h_k(m), σ)    (3.13)
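A hedged Python sketch of the TCR-HtS signing flow may help: a fresh hash key k is drawn per signature, the signer signs k ++ h_k(m), and k is transmitted as part of the signature. Here BLAKE2b's keyed mode stands in for the TCR keyed hash, and the underlying scheme S is again a toy HMAC stand-in (a shared-key construct, not a real public-key signature); all function names are illustrative.

```python
import hashlib
import hmac
import os

def keyed_hash(k, m):
    # keyed hash h_k(m); BLAKE2b's keyed mode stands in for a TCR hash
    return hashlib.blake2b(m, key=k, digest_size=32).digest()

def toy_sign(s, data):
    # stand-in for S.Sign_s; NOT a real public-key signature
    return hmac.new(s, data, hashlib.sha256).digest()

def toy_verify(v, data, sigma):
    return hmac.compare_digest(toy_sign(v, data), sigma)

def tcr_hts_sign(s, m):
    k = os.urandom(32)                         # fresh hash key per signature
    sigma = toy_sign(s, k + keyed_hash(k, m))  # sign k ++ h_k(m)
    return (k, sigma)                          # k is part of the signature

def tcr_hts_verify(v, m, sig):
    k, sigma = sig
    return toy_verify(v, k + keyed_hash(k, m), sigma)

s = v = os.urandom(32)    # toy stand-in keys (shared, since toy_sign is keyed)
sig = tcr_hts_sign(s, b"a long message" * 500)
assert tcr_hts_verify(v, b"a long message" * 500, sig)
assert not tcr_hts_verify(v, b"another message", sig)
```

The sketch makes the deployment cost visible: the returned signature is the pair (k, σ), longer than σ alone.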
Both the Keyed-HtS and the TCR-HtS constructions result in a secure, existentially-unforgeable signature scheme for unbounded input length, provided that the underlying hash function satisfies the required property. This property, however, is different between the two constructions. The Keyed-HtS construction requires a keyed-CRHF, a relatively strong requirement, while the TCR-HtS construction only makes the (weaker) requirement of a TCR hash function.
Theorem 3.2 (Keyed Hash-then-Sign is secure (existentially unforgeable)). Let (KG, S, V) be an existentially unforgeable signature scheme over the domain {0, 1}^n, and let h_k(·) : {0, 1}^n × {0, 1}* → {0, 1}^n be a keyed hash function. Let S^h_HtS be the Keyed-HtS signature scheme, as defined in Equation 3.10, and let S^h_TCR-HtS be the TCR-HtS signature scheme, as defined in Equation 3.13; both schemes are defined for arbitrary-length input strings. Then:

1. If h is a keyed collision-resistant hash function (CRHF), then S^h_HtS is an existentially unforgeable signature scheme.

2. If h is a Target Collision-Resistant (TCR) hash function, then S^h_TCR-HtS is an existentially unforgeable signature scheme.
Proof: see [45, 111].
Unfortunately, both constructions involve challenges for deployment; namely, they may necessitate significant changes in existing systems designed for Hash-then-Sign using a keyless hash. The Keyed-HtS construction requires distribution of a longer public key, since the random hash key becomes part of the public key; and the TCR-HtS construction requires a longer signature (k, σ), i.e., the signature has to include also the random key chosen for the hash function. Several papers propose alternative HtS constructions with provable security properties but without such deployment challenges, some of them also based on keyless hash functions; for example, see [314].
3.3 Second-Preimage Resistance (SPR) Hash Functions
The second property we introduce is second-preimage resistance (SPR). We define SPR only for keyless hash functions, although it can also be defined for keyed hash functions [340].

Intuitively, a Second-Preimage Resistant (SPR) hash function h accepts one input, an arbitrary-length binary string m ∈ {0, 1}*, and outputs an n-bit
Figure 3.9: Second-preimage resistance (SPR): given keyless hash function h : {0, 1}* → {0, 1}^n, for any input length l ≥ n, given a random first preimage x ∈ {0, 1}^l, for l ← A(1^n), it is hard to find a collision with x, i.e., a second preimage x′ ∈ {0, 1}* s.t. x′ ≠ x yet h(x) = h(x′).
long binary string h(m) ∈ {0, 1}^n, and satisfies the SPR property. The SPR property means, intuitively, that an efficient (PPT) adversary A has negligible probability, when given a random binary string x, to find a collision to x, i.e., a different string x′ ≠ x which has the same hash: h(x′) = h(x). We refer to x′ as the second preimage, and to x as the random (first) preimage, since they are both preimages of h(x′) = h(x).
We illustrate the SPR property in Figure 3.9. The reader will notice that we let the adversary select the length l of the random (and first) preimage x. By selecting the length of x first, we allow x to be a uniformly-random string from the (finite) set {0, 1}^l; note that we cannot select a uniformly-random element from an infinite set, i.e., we can't select a uniformly-random string from {0, 1}* (see Section A.3). The set {0, 1}^l is a natural choice of a finite set, since it contains all binary strings of length l; we can select a random string by flipping l fair coins, giving probability 1/2^l to each of the 2^l strings in the set {0, 1}^l.
We let the adversary select l, which seems prudent: why limit the adversary to a preimage of specific length? However, we must prevent the adversary from choosing l which would be ‘too long’, since the adversary, as an efficient (PPT) algorithm, is allowed runtime which is polynomial in the length of its inputs, which include the l-bit x ←$ {0, 1}^l. For example, if we let the adversary choose l = 2^n, then x would have 2^n bits, and, when given x and asked to choose x′, the adversary would be allowed runtime polynomial in 2^n, i.e., exponential in n, and could find, with high probability, a collision, e.g., by computing h(x′) for all values of x′ ∈ {0, 1}^{n+1}. To ensure that the entire adversary's runtime is polynomial in n, and in particular that l, the length of x, is polynomial in n, we require the adversary to output l in unary, i.e., as a string 1^l consisting of l bits whose value is 1. The definition follows.
Definition 3.7 (Second-preimage resistant (SPR) Hash Function). A (keyless) hash function h : {0, 1}* → {0, 1}^n is second-preimage resistant (SPR) if for every efficient (PPT) algorithm A, the advantage ε^{SPR}_{h,A}(n) is negligible in n, i.e., smaller than any positive polynomial for sufficiently large n (as n → ∞), where:

ε^{SPR}_{h,A}(n) ≡ Pr[ x′ ≠ x ∧ h(x) = h(x′) : 1^l ← A(1^n); x ←$ {0, 1}^l; x′ ← A(x) ]    (3.14)

where the probability is taken over the choice of x and the random coin tosses of A.
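To illustrate what it means for an adversary to win the SPR game, the following toy Python example uses a deliberately weak ‘hash’ (XOR of 8-byte blocks), an assumption made purely for illustration; no real cryptographic hash behaves this way. Given the random first preimage x, the adversary appends an all-zero block, obtaining a distinct second preimage with the same digest.

```python
import os
from functools import reduce

def toy_hash(m):
    """Toy 'hash': XOR of 8-byte blocks. Deliberately NOT second-preimage
    resistant; used only to illustrate winning the SPR game."""
    blocks = [m[i:i + 8].ljust(8, b'\0') for i in range(0, max(len(m), 1), 8)]
    return reduce(lambda a, b: bytes(p ^ q for p, q in zip(a, b)), blocks)

# SPR game: the adversary first outputs a length (say l = 128 bits, i.e.,
# 16 bytes, in unary), then the challenger picks a random first preimage x.
x = os.urandom(16)

# Adversary's move: appending an all-zero block leaves the XOR unchanged,
# so x2 is a different string with the same digest -- a second preimage.
x2 = x + bytes(8)

assert x2 != x
assert toy_hash(x2) == toy_hash(x)   # adversary wins the SPR game
```

For a secure SPR hash, no efficient adversary should succeed in this game except with negligible probability.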
SPR is sometimes referred to as weak collision resistance, and indeed, as
the reader can prove, every CRHF is also an SPR hash function. However,
Exercise 3.5 shows that there may be an SPR which is not a CRHF. Indeed,
while subsection 3.2.2 shows that there is no keyless CRHF, it is possible, and
commonly believed, that (keyless) SPR hash functions do exist. In practice,
collision attacks, but not second-preimage attacks, are known against the SHA1
and MD5 standard hash functions.
Exercise 3.5. Let h be an SPR hash function. Use h to construct another
hash function, h′ , which you will show to be (1) an SPR (like h), but (2) not a
CRHF.
Therefore, whenever possible, we should design protocols and systems to require only an SPR hash function, rather than a CRHF. Indeed, the SPR property suffices for some protocols and applications. For example, the SPR property suffices to authenticate the integrity of a file downloaded from an untrusted repository, whose hash is signed by the (trusted) developer, as in Figure 3.8.
However, other important applications require collision resistance and would be insecure if using a hash function which is only second-preimage resistant (SPR). Importantly, the SPR property is not sufficient for Hash-then-Sign (HtS), as we discuss in the next subsection.
Exercise 3.6. Explain, intuitively, why the SPR property suffices to authenticate
the integrity of a file downloaded from an untrusted repository, whose hash is
signed by the (trusted) developer; and why your explanation does not apply to
the use of such hash for the Hash-then-Sign construction.
Although the SPR property suffices for some applications, and we know that there is no keyless hash function which is collision-resistant, it is still better to avoid any use of cryptographic hash functions for which collisions have been found. This protects against the common case, where a designer incorrectly believes that an SPR hash suffices, while the system is actually vulnerable to non-SPR collision attacks; and also protects against implementation errors which use the SPR hash function where the design called for a CRHF.
3.3.1 The Chosen-Prefix Collisions Vulnerability
Theorem 3.1 shows that Hash-then-Sign is secure when used with a CRHF. But would the weaker SPR property suffice to ensure security using the Hash-then-Sign (HtS) paradigm, i.e., using the (keyless) HtS construction of Equation 3.7? In this subsection we show that such SPR-HtS, i.e., use of a hash function which is second-preimage resistant but not collision resistant, may result in a significant vulnerability.
Let us begin with an arguably less-significant vulnerability, showing that SPR-HtS would not ensure existential unforgeability. This requires only the following observation. Consider a (keyless) hash function which is SPR but not a CRHF. Namely, an adversary A may know a collision h(m) = h(m′), although A cannot efficiently find a collision to a randomly-chosen message m_R. Now, consider the HtS signature (Equation 3.6) over m:

S^h_HtS.Sign_s(m) = S.Sign_s(h(m)) = S.Sign_s(h(m′)) = S^h_HtS.Sign_s(m′)

We conclude that S^h_HtS is not an existentially-unforgeable signature scheme (Definition 1.6).
However, could it be that S^h_HtS is ‘secure enough’ for practical applications? Even if A can find some collision h(m) = h(m′), for some ‘random’ strings m, m′, how would the attacker convince the signer to sign m, and why should the alternative message m′ be of (significant) value to the attacker? In short: is there a clearly realistic attack which may be possible against an SPR hash h (although this attack fails against a CRHF)? We next show that this is indeed the case by presenting such an attack, exploiting a realistic vulnerability: the chosen-prefix vulnerability, which we next define.
Definition 3.8 (The chosen-prefix collisions vulnerability). Hash function h is said to have the chosen-prefix vulnerability if there is an efficient (PPT) collision-finding algorithm CF, s.t. given a (prefix) string p ∈ {0, 1}*, the algorithm CF efficiently outputs, with high probability, a collision, i.e., a pair of strings x, x′ ∈ {0, 1}*, s.t. for any (suffix) string s ∈ {0, 1}* holds h(p ++ x ++ s) = h(p ++ x′ ++ s). Namely:

(∀p) w.h.p.: (x, x′) ← CF(p) s.t. (x ≠ x′) ∧ (∀s)(h(p ++ x ++ s) = h(p ++ x′ ++ s))    (3.15)
Note that the fact that the collisions hold for any common suffix s is due to the fact that many keyless hash functions have an iterative design. Due to this design, if there is a collision for prefix p between x and x′, i.e., x ≠ x′ yet h(p ++ x) = h(p ++ x′), then there is also a collision, for any suffix s, between p ++ x ++ s and p ++ x′ ++ s. Namely, (∀s) h(p ++ x ++ s) = h(p ++ x′ ++ s). In particular, the Merkle-Damgård construction, which is used by many hash functions, has this iterative design, and hence has this property. We discuss this construction in Section 3.9.
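This suffix-extension behavior can be demonstrated with a toy iterative chain in Python. The compression step here is a deliberately trivial byte-sum, an assumption chosen only so that collisions are easy to exhibit; real iterative hashes use strong compression functions, but share the chaining structure.

```python
def toy_md(m):
    """Toy iterative hash in Merkle-Damgard style: a (deliberately weak)
    compression function is chained over the message, one byte at a time."""
    state = 0
    for byte in m:
        state = (state + byte) % 256   # toy compression: byte sum mod 256
    return state

p = b"Pay $"                            # the chosen prefix
x, x2 = b"\x01\x02", b"\x02\x01"        # equal byte sums => collide after p
assert toy_md(p + x) == toy_md(p + x2)

# Since hashing is iterative, the intermediate states after p ++ x and
# p ++ x2 are equal, so the collision extends to ANY common suffix s:
for s in (b"", b" to Mal", b"arbitrary suffix"):
    assert toy_md(p + x + s) == toy_md(p + x2 + s)
```

The key observation is that once the chaining states coincide, every subsequent byte of s is absorbed identically by both computations.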
Chosen-prefix attacks are practical. The chosen-prefix vulnerability is a realistic concern; in fact, such vulnerabilities were found for widely-used (at the time) standard hash functions including RIPEMD, MD4, and MD5 in [367], and later also for SHA1 [262], which all use the Merkle-Damgård construction. Due to this vulnerability, these hash functions are considered insecure and replaced, in new designs and, when possible, in existing systems, with cryptographic hash functions that are not known to have this or other vulnerabilities, such as SHA-2 and SHA-3 [114].
We next show how the chosen-prefix collisions vulnerability facilitates a realistic attack on the Hash-then-Sign paradigm. This attack allows an attacker to trick users into signing what appears to a third party to be a statement (e.g., a money transfer) that the user never intended to sign. For more elaborate attacks, which also allow forgery of public key certificates, see Chapter 8 and [262, 367, 368].
Chosen-prefix attack on Hash-then-Sign: simplified version. We begin by presenting a simplified version of the chosen-prefix attack on the Hash-then-Sign paradigm. In this version, an attacker, say Mal, uses the chosen-prefix attack to find a pair of strings (x, x′), which would collide when appended to the (chosen) prefix ‘Pay $’. Namely, the pair (x, x′) would satisfy x ≠ x′ and h(‘Pay $’ ++ x ++ s) = h(‘Pay $’ ++ x′ ++ s) (for any suffix s).
The main simplification we make is to assume that Mal can deposit a ‘payment order’ of the form ‘Pay $’ ++ x ++ ‘ to Mal’, where x is a binary string interpreted as an integer. Obviously, reality is more complex; e.g., if the payment order is a document, the amount x should also be encoded in printable characters, typically in ASCII. Another simplification is to assume that, as binary numbers, x << x′.
Mal now sets up an online shop and offers for sale an item whose market value is $y, where x < y << x′. Mal offers the item for only $x: a real bargain! But what is really going to happen?
Alice comes along and happily buys the item, by sending to Mal a signed payment order to her bank, ready to be deposited at the bank. Namely, Alice sends to Mal the pair (PO, σ) where σ = Sign_{A.s}(h(PO)) and PO = ‘Pay $x to Mal’.

Alice expects to be charged $x after Mal deposits this signed payment order (PO, σ). However, to her chagrin, she finds that she was charged $x′ >> $x, a significantly higher amount. Mal has tricked Alice, by depositing the forged payment order PO′ = ‘Pay $x′ to Mal’, together with σ, Alice's signature on PO (σ = Sign_{A.s}(h(PO))). The bank will honor this transaction, and charge Alice $x′, since σ is also a valid signature for PO′, as:

h(PO) = h(‘Pay $x to Mal’) = h(‘Pay $x′ to Mal’) = h(PO′)

Hence, the same signature generated by Alice appears to the bank to be a valid signature over the ‘fake’ payment order PO′, and the bank transfers to Mal the larger amount $x′ >> $x!
Recall now the simplifications we made in this description of the attack. In particular, banks will not accept a payment order where amounts are indicated as binary numbers; typically, the entire payment order should be encoded using printable characters, e.g., using ASCII code; more readable formats, such as PDF or HTML, are even more likely. We next discuss a more realistic variant of the attack, which works for payment orders encoded using PDF or HTML.
A more realistic Chosen-prefix Attack: Signing PDF documents

We now improve the chosen-prefix attack to allow forgery of signatures over documents formatted in ‘rich’ markup languages like PDF, PostScript, and HTML. The attacker, Mal, exploits the fact that these (and similar) languages allow documents to contain conditional rendering statements, allowing the document to display different content depending on different conditions.

In the attack, Mal uses the conditional rendering capability to create two documents D1, DM that have the same hash value, h(D1) = h(DM), but when rendered by the correct viewer, e.g., a PDF viewer, the two documents are rendered very differently. Namely, viewing D1, the reader displays text t1 = ‘Pay $1 to Amazon’, while viewing DM, the reader displays text tM = ‘Pay $1,000,000 to Mal’. The rest of the contents, and even the details of the markup language used, do not materially change the attack, so we ignore them.
Mal creates these two documents as follows. First, the documents share a common prefix and suffix: D1 = p ++ x1 ++ s, DM = p ++ xM ++ s.

The prefix p consists of headers and preliminaries as required by the markup language, e.g., %PDF for PDF, or <!DOCTYPE html> for HTML, followed by the ‘if’ statement in the appropriate syntax. Simplifying, let's say that p = ‘if ’.

Mal next applies the collision-finding algorithm CF (Definition 3.8), to find a collision for prefix p, namely: (x1, xM) ← CF(p). For every suffix s holds: h(p ++ x1 ++ s) = h(p ++ xM ++ s).

To complete D1 and DM, Mal sets the suffix s to the string:

s ← ‘=’ ++ x1 ++ ‘then display’ ++ t1 ++ ‘, else display’ ++ tM
Mal is now ready to launch the attack on Alice, similarly to the simplified attack above. Namely, Mal first sends D1 to Alice, who views it and sees the rendering t1. Let us assume that Alice agrees to pay $1 to Amazon, and hence signs D1, i.e., computes σ = S_{A.s}(h(D1)) and sends (D1, σ) back to Mal.

Mal forwards to the bank the modified message (DM, σ). The bank validates the signature, which would be OK since h(D1) = h(p ++ x1 ++ s) = h(p ++ xM ++ s) = h(DM). The bank then views DM, and sees:

tM = ‘Pay $1,000,000 to Mal’

As a result, the bank transfers one million dollars from Alice to Mal.
Of course, some work is required to actually deploy the above attack; in particular, it isn't trivial to handle PDF files. The following exercise challenges the readers to find a similar attack against HTML files; this should not be too
Figure 3.10: Intuition of the One-Way Function property, aka Preimage-Resistant hash function: given h(x) for a (sufficiently long) random preimage x, it is infeasible to find x, or any other preimage x′ of h, i.e., s.t. h(x′) = h(x). Details in Definition 3.9.
difficult, and we hope it will be fun and instructive. Your solution may assume that the browser displaying the HTML file supports JavaScript.
Exercise 3.7. Consider the hash function h(x1 ++ x2 ++ ... ++ xl) = Σ_{i=1}^{l} xi mod p, where each xi is 64 bits and p is a 64-bit prime. (a) Is h an SPR hash function? A CRHF? (b) Present a collision-finding algorithm CF for h. (c) Create two HTML files D1, DM as above, i.e., s.t. h(D1) = h(DM), yet when they are viewed in a browser, they display texts t1, tM as above.
3.4 One-Way Functions, aka Preimage Resistance
The third security property we discuss for cryptographic hash functions is called preimage resistance or One-Way Function (OWF). Intuitively, h is a one-way function if, given h(x) for a (sufficiently long) random preimage x, it is infeasible to find either x or any other preimage x′ of h, i.e., s.t. h(x′) = h(x). This is illustrated in Fig. 3.10.
We prefer the term One-Way Function (OWF) to the term preimage-resistant hash, since OWF emphasizes the ‘one-way’ property: computing h(x) is easy, but ‘inverting’ it to find x, or a colliding preimage x′ s.t. h(x′) = h(x), is hard.
The definition follows. Note that the input x is selected as an l-bit string, l > n, where l is selected by the adversary (in unary). This selection process is similar to the one used in the definition of second-preimage resistant (SPR) hash, and for the same reasons: to ensure that the adversary is limited to runtime polynomial in n, and to allow a random choice of x (from a finite domain).
Definition 3.9 (One-Way Function (OWF), aka Preimage resistance). An efficient function h, with range {0, 1}^n, is called preimage resistant, or a one-way function, if for every efficient algorithm A ∈ PPT, the advantage ε^{OWF}_{h,A}(n) is negligible in n, i.e., smaller than any positive polynomial for sufficiently large n (as n → ∞), where:

ε^{OWF}_{h,A}(n) ≡ Pr[ h(A(h(x))) = h(x) : 1^l ← A(1^n); x ←$ {0, 1}^l ]    (3.16)

where the probability is taken over the random coin tosses of A and over the choice of x ←$ {0, 1}^l.
The definition of an OWF requires the input x to be selected as a random l-bit string. This raises the following question: suppose h is an OWF (as in Definition 3.9), but we select input x as a random member of some other set (not {0, 1}^l), e.g., a random text from a collection of numerous texts. Is it possible that the attacker will be able to find x, or another preimage x′ s.t. h(x) = h(x′)?
We next present two important applications of one-way functions: the OTPw
(One-Time Password) Authentication Scheme and a one-time signature scheme.
3.4.1 The OTPw (One-Time Password) Authentication Scheme
Passwords are the most well-known and widely-used method for user authentication, which we discuss in Chapter 9. In particular, we discuss designs for improved-security password systems, and designs for user-authentication mechanisms which are not based on passwords. Several of these designs are based on the OTPw^6 (One-Time Password) Authentication Scheme, proposed already in 1974 [143].
The OTPw design uses a one-way function; indeed, it may be the earliest, or one of the earliest, publications introducing the concept of one-way functions. It is also a nice, simple example of OWFs and their applications, so it is natural for us to use it to introduce one-way functions. The design uses a one-way function which we denote by h : {0, 1}* → {0, 1}^n.
In the OTPw, each user, say Alice (denoted by A), selects her password, denoted PW_A, as a random n-bit string, PW_A ←$ {0, 1}^n, and computes HPW_A ← h(PW_A). The server receives and saves HPW_A. To authenticate herself, Alice sends PW_A to the server; the server computes h(PW_A) and verifies that it is the same as HPW_A.
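The registration and verification steps can be sketched in a few lines of Python, with SHA-256 standing in for the one-way function h (variable and function names here are illustrative, not from the text):

```python
import hashlib
import hmac
import os

def h(x):
    return hashlib.sha256(x).digest()        # the one-way function h

# Setup: Alice picks a random n-bit password; the server stores only its hash.
pw_alice = os.urandom(32)                    # PW_A <-$ {0,1}^n
server_db = {"alice": h(pw_alice)}           # server saves HPW_A = h(PW_A)

def authenticate(user, pw):
    # The server recomputes h(pw) and compares it to the stored token HPW_A.
    return hmac.compare_digest(server_db[user], h(pw))

assert authenticate("alice", pw_alice)               # correct password accepted
assert not authenticate("alice", os.urandom(32))     # wrong password rejected
```

Note that the server never stores PW_A itself, only HPW_A; this is the point of the design.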
The one-time password design offers an important advantage over the naive way of using passwords, where the passwords are kept ‘as is’ (in plaintext) in the password file. The advantage is that the authentication is secure even if an attacker is able to obtain the value of the validation token HPW_A. Indeed, this design is the key to the popular designs of hashed and salted password files, which we discuss in subsection 9.4.2.
If h is a one-way function (OWF), then, from Definition 3.9, an efficient (PPT) adversary cannot find the random preimage PW_A, even if given the validation token HPW_A. Hence, the one-time password scheme is secure: if Bob finds that HPW_A = h(PW_A), this means that PW_A is the
6 The idea is often referred to as OTP; however, OTP is also used to refer to the completely different notion of one-time pad (Section 2.4). Note also that the term One-Time Password is often used for other user-authentication designs, including hash chains, which are an extension of OTPw, and other designs which are less related; see Chapter 9.
correct OTPw. If Alice kept PW_A securely and only discloses it to authenticate herself to Bob, then Bob knows this also means that Alice has initiated the authentication (by disclosing PW_A). Of course, an attacker who obtains the one-time password PW_A could try to abuse it; therefore, if the communication is not properly protected (encrypted), Bob should only accept PW_A once, i.e., ‘one time’.
Note that the OTPw scheme requires the transmission of the password to be done over a secure (encrypted) channel; if an attacker can eavesdrop on the password, then security is trivially broken. A secure connection is also required if the user wishes to replace the password (after authenticating successfully). Another concern with the scheme is that its security depends on the use of random passwords; if the password is contained in a dictionary of common passwords, then the attacker may find the password PW_A from HPW_A by hashing the different passwords in the dictionary, an offline dictionary attack (subsection 9.3.5).
In subsection 9.4.2 we present improvements to the OTPw scheme that provide better protection in the case of human, non-random passwords; and in subsection 9.6.2 we discuss other one-time password schemes that provide better security, including against an eavesdropping adversary. In the next subsection, we move to a different, albeit related, application of one-way functions: the one-time signature scheme.
3.4.2 Using OWF for One-Time Signatures
We next show how to use one-way functions to implement a public key signature
scheme, which can be used to sign only one message. We call such a scheme a
one-time signature scheme; the term is often applied also to extensions which
allow a limited number of signature operations.
Note that our definition of signature schemes (subsection 1.5.1) did not allow a restriction on the number of applications of the signature scheme; therefore, one-time signature schemes are not within the scope of subsection 1.5.1. However, it is not hard to extend the definitions in subsection 1.5.1 so that they support a limited number of applications; the reader may do it as an exercise.
In spite of their obvious limitations, one-time signatures also have important advantages in security and performance, making them a good choice for many applications. These advantages are in comparison to the widely-used public-key signature schemes such as RSA and DSA, whose security is based on the hardness of factoring (RSA) or of computing discrete logarithms (DSA); we discuss RSA, factoring and discrete-log in Chapter 6. Let us discuss these advantages of one-time signatures.
The main security advantage of one-time signatures is that they are not (too) vulnerable to the potential availability to an attacker of a quantum computing device. This is in contrast to factoring, discrete logarithms and many other computational problems, which may be efficiently solved (broken) using an appropriate quantum computer; we briefly discuss the impact of quantum computing in Section 10.4. Signature schemes based on computational
assumptions (factoring, discrete logarithm) are also subject to other possible improvements in the algorithms to solve these problems. One-time signature schemes are also subject to vulnerabilities of the underlying hash function, but changing to another function/scheme, as well as using a significant margin (longer digests), would be easier for hash functions than for signatures. And a quantum computer would provide a much smaller advantage against hash functions, compared to its impact on factoring and discrete logarithms.
The efficiency advantage of one-time signatures is their low computational overhead compared to regular signature schemes. This advantage can be further increased if longer key sizes are used for the public key signatures, to ensure security against a future quantum computer, and against algorithmic advances in solving the underlying ‘hard’ problem.
Note, however, that one-time signatures require rather long public keys and
signatures. Of course, the overhead depends on the scheme used. We present a
simple scheme; some other schemes require shorter public keys and signatures.
Figure 3.11: A one-time signature scheme, limited to signing only a single bit b.
We present the construction in three steps, gradually improving performance.
First One-Time Signature construction: signing a single bit. Figure 3.11
presents a one-time signature scheme which is defined only for the case of a
single-bit message. This scheme is, basically, an extension of the basic
one-time password scheme of subsection 3.4.1.
The private key s simply consists of two random strings s_0, s_1, while the
public key consists of the hashes of these strings: v_0 = h(s_0), v_1 = h(s_1). To
sign a bit b, we simply send σ = s_b; to validate an incoming bit b and its
purported signature σ, we verify that v_b = h(σ). Note that the pair of values
v = (v_0, v_1) is the public key, and is not secret; only the pair s = (s_0, s_1) is secret,
and we disclose s_b upon signing bit b.
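As a concrete illustration, here is a minimal Python sketch of this one-bit scheme, with SHA-256 standing in for h; the function names and the 32-byte secret size are our own illustrative choices, not part of the scheme's specification:

```python
import hashlib
import os

def h(data):
    # SHA-256 stands in for the one-way/hash function h.
    return hashlib.sha256(data).digest()

def keygen():
    # Private key: two random strings s = (s_0, s_1);
    # public key: their hashes v = (h(s_0), h(s_1)).
    s = (os.urandom(32), os.urandom(32))
    v = (h(s[0]), h(s[1]))
    return s, v

def sign(s, b):
    # Signing bit b discloses the preimage s_b.
    return s[b]

def verify(v, b, sigma):
    # Accept if v_b = h(sigma).
    return v[b] == h(sigma)
```

Note that once s_b is disclosed, it must never be reused; signing both bit values with the same key reveals the entire private key.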
Second One-Time Signature construction: naïve signing of an l-bit string.
We next extend the scheme to allow a one-time signature of an l-bit string d,
as illustrated in Figure 3.12. (We use the symbol d for the string which is
signed, since we will extend this scheme in the next step, where we use d for
the digest of a longer message.) To allow signing of l-bit strings, this scheme
basically amounts to l applications of the one-bit one-time signature scheme
illustrated in Figure 3.11.
Figure 3.12: A one-time signature scheme, for l-bit string (denoted d).
The private signing key of this scheme consists of a set of l pairs of
random strings, denoted {(s^i_0, s^i_1)}_{i=1}^{l}; and the public validation key is the
corresponding set of l pairs of strings denoted {(v^i_0, v^i_1)}_{i=1}^{l}, computed as:
(∀i ∈ {1, . . . , l} and b ∈ {0, 1}) v^i_b = h(s^i_b).
The signature over the binary string d = d_1 ∥ . . . ∥ d_l is the set {s^1_{d_1}, . . . , s^l_{d_l}}.
To validate that σ = σ_1 ∥ . . . ∥ σ_l is a valid signature of the l-bit string
d = d_1 ∥ . . . ∥ d_l, confirm that for every i between 1 and l: v^i_{d_i} = h(σ_i).
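This l-fold scheme can be sketched in Python as follows (a toy sketch with a small l and SHA-256 standing in for h; the names are ours, for illustration only):

```python
import hashlib
import os

l = 8  # toy string length; in practice l is a digest length, e.g., 256

def h(data):
    return hashlib.sha256(data).digest()

def keygen():
    # Private key: l pairs of random strings; public key: their hashes.
    s = [(os.urandom(32), os.urandom(32)) for _ in range(l)]
    v = [(h(s0), h(s1)) for (s0, s1) in s]
    return s, v

def sign(s, d):
    # d is a list of l bits; reveal s^i_{d_i} for each position i.
    return [s[i][d[i]] for i in range(l)]

def verify(v, d, sigma):
    # Check v^i_{d_i} = h(sigma_i) for every i.
    return all(v[i][d[i]] == h(sigma[i]) for i in range(l))
```

Flipping even one bit of d invalidates the signature, since the revealed preimage for that position no longer matches the corresponding public hash.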
Third (and final) One-Time Signature construction: using Hash-then-Sign
for an efficient one-time signature for an arbitrary-length string m.
Finally, we extend the one-time signature scheme further, to efficiently sign
an arbitrary-length input string m. This extension, illustrated in Figure 3.13, uses
the Hash-then-Sign paradigm. Namely, we first compute the l-bit digest d of
the message, as d = h′(m), where h′ denotes a CRHF; note that we denote the
CRHF by h′ rather than h, since h and h′ can be different hash functions. After
computing d = h′(m), we apply the one-time signature scheme of Figure 3.12.
Note that we described the scheme for a keyless CRHF h′. We leave it to
the reader to modify the design for the case when h′ is a keyed CRHF as in
Definition 3.5, or a TCR hash function as in Definition 3.6.
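A sketch of the complete Hash-then-Sign construction, where (for illustration only) both h and h′ are instantiated with SHA-256, so l = 256:

```python
import hashlib
import os

l = 256  # digest length of h' (SHA-256), in bits

def h(data):
    return hashlib.sha256(data).digest()

def digest_bits(m):
    # d = h'(m), expanded into a list of l bits (here h' is also SHA-256).
    d = hashlib.sha256(m).digest()
    return [(d[i // 8] >> (7 - i % 8)) & 1 for i in range(l)]

def keygen():
    s = [(os.urandom(32), os.urandom(32)) for _ in range(l)]
    v = [(h(a), h(b)) for (a, b) in s]
    return s, v

def sign(s, m):
    # Hash-then-Sign: sign the l bits of d = h'(m).
    return [s[i][b] for i, b in enumerate(digest_bits(m))]

def verify(v, m, sigma):
    return all(v[i][b] == h(sigma[i]) for i, b in enumerate(digest_bits(m)))
```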
Figure 3.13: A one-time signature scheme, for variable-length message m, using
‘Hash-then-Sign’.
3.5 Randomness Extraction and Key Derivation Functions
Security and cryptography use randomness for many mechanisms, including
encryption, key-generation, and challenge-response authentication. A source of
true, perfect randomness is often unavailable; systems often rely on imperfect
sources of randomness, such as measurements of delays of different physical
actions. In this section, we discuss two related cryptographic tools to deal
with this challenge: randomness extractor hash functions and key derivation
functions (KDFs).
Intuitively, a randomness extractor (or simply an extractor) hash is a (keyless
or keyed) hash function h, whose output h(x) is pseudo-random^7, provided that
its input x has ‘sufficient randomness’, i.e., satisfies a specified randomness
assumption. A keyed extractor h_k(x) also receives as input a non-secret random
key k, often referred to as salt; if its input x has ‘sufficient randomness’, then
its output h_k(x) should be pseudo-random.
The randomness extraction property is more subtle and harder to define
than the collision resistance, SPR and one-way properties. Also, randomness
extraction is not listed as one of the goals of standard cryptographic hash
functions. The cryptographic-theory literature mostly deals with extractors which
ensure random or pseudorandom output, as long as their input has sufficiently
high min-entropy; extractors with this (weak) requirement on the randomness of
their inputs are referred to as generic extractors. These works also focus on
keyed extractors, which use random non-secret keys; this is since keyless generic
extractors do not exist [125, 286].
The discussion of generic extraction is beyond our scope; see, for example,
[125, 286]. Instead, we will present two much simpler notions of (keyless)
extractor hash functions. In subsection 3.5.1, we discuss the simple, keyless
Biased-Coin Extractor proposed by Von Neumann, which ensures uniformly
random output, provided that its input is a sequence of independently-sampled
bits from some biased distribution. Then, in subsection 3.5.2, we present a
simple model of a computational extractor, the bitwise randomness extractor,
which ensures pseudorandom output, provided that its input contains a sufficient
number of random bits.
Practical modern cryptographic systems often extract randomness using a
(keyed) Key Derivation Function (KDF). Key derivation functions can be seen
as an extension of keyed extractors, but offer additional functionalities beyond
extraction. In subsection 3.5.3, we discuss key derivation functions, and compare
them to keyed and keyless extractors. We also present the Extract-then-Expand
paradigm for constructing a KDF from a keyed extractor hash and a PRF.
7 This definition only requires pseudorandom output, rather than true randomness; such
extractors are often referred to as computational extractors, e.g., in [242]. Extractors whose
output is random regardless of the adversary’s computational abilities are often referred to as
statistical extractors.
3.5.1 Von Neumann’s Biased-Coin Extractor
We first discuss a classical biased-coin extractor model proposed, already in 1951,
by Von Neumann [383], one of the pioneers of computer science. In the Von
Neumann model, each of the input bits is the result of an independent toss of a
coin with fixed bias. Namely, for every bit generated, the value 1 is generated
with probability 0 < p < 1 and the value 0 is generated with probability 1 − p,
with no relation to the values of other bits. We refer to this as the Von Neumann
assumption.
Von Neumann proposed the following method to extract perfect randomness
from these biased bits. First, arrange the sampled bits in pairs {(x_i, y_i)}.
Then, remove pairs where both bits are identical, i.e., leave only pairs of the
form {(x_i, 1 − x_i)}. Finally, output the sequence {x_i}. This simple - if somewhat
‘wasteful’ in input bits - algorithm is called the Von Neumann extractor. We
leave it to the reader to show that if the input satisfies the Von Neumann
assumption, then the output is a string of uniformly random bits.
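The method above translates directly into a short Python sketch (illustrative code; the input is a list of 0/1 values):

```python
import random

def von_neumann_extract(bits):
    # Pair up the input bits; a discordant pair (x, y) with x != y
    # contributes its first bit x; concordant pairs are discarded.
    out = []
    for i in range(0, len(bits) - 1, 2):
        x, y = bits[i], bits[i + 1]
        if x != y:
            out.append(x)
    return out

# Biased coin: 1 with probability p = 0.8, independently per bit.
biased = [1 if random.random() < 0.8 else 0 for _ in range(100_000)]
output = von_neumann_extract(biased)
# The output is much shorter, but unbiased: about half the bits are 1.
```

Note the ‘waste’: a pair survives only with probability 2p(1 − p) (0.32 for p = 0.8), and each surviving pair yields a single output bit.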
Exercise 3.8. [Von Neumann extractor] Show that, if the input of the Von
Neumann extractor satisfies the Von Neumann assumption (above), then the
output is uniformly random, i.e., each bit x_i is 1 with probability exactly half,
independently of all other bits.
The Von Neumann extractor is simple, and the output is proven uniform
based only on the Von Neumann assumption, without requiring any computational
assumption. However, the Von Neumann assumption is hard to justify
for many typical security applications of randomness-extraction. In particular,
consider the goal of key derivation, where we use some large input x which is
‘fairly random’ but not truly random, such as many measurements (of time,
movements, etc.). We cannot use x directly as a key, so we apply a hash and
use h(x). However, can we be certain that the Von Neumann assumption holds?
The following simple exercise shows that this assumption may not hold - even when
every second bit in the input is random!
Exercise 3.9. Consider the following random process for producing bit sequences
{b_1, b_2, . . .}:

    b_i = 0,            if i ≡ 1 (mod 2)
    b_i ←$ {0, 1},      otherwise                                       (3.17)

Show that this sequence does not satisfy the Von Neumann assumption, and,
in fact, that applying the Von Neumann extractor will not result in a random
output string.
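To see the failure concretely, the following sketch applies the Von Neumann extractor to the sequence of Exercise 3.9. Every pair has the form (0, b_{2k}), so every discordant pair is (0, 1), and the ‘extracted’ output is the constant 0:

```python
import random

def von_neumann_extract(bits):
    out = []
    for i in range(0, len(bits) - 1, 2):
        if bits[i] != bits[i + 1]:
            out.append(bits[i])
    return out

# Sequence of Equation 3.17: b_i = 0 for odd i, uniformly random for even i.
bits = [0 if i % 2 == 1 else random.randrange(2) for i in range(1, 10_001)]

output = von_neumann_extract(bits)
# Every surviving pair is (0, 1), so every output bit is 0: not random at all,
# even though half of the input bits are uniformly random.
```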
3.5.2 The Bitwise Randomness Extractor
We now present another model for randomness extraction, which we call the
bitwise randomness extractor (BRE) model. We present the BRE model as a
simple way to provide readers with an understanding of randomness-extracting
Figure 3.14: Bitwise-Randomness Extractor (BRE) Hash Function.
hash functions; you will find more advanced and stronger models in the literature,
but these are beyond our scope.
Intuitively, a hash function h is a bitwise-randomness extractor if its output
h(x) is pseudorandom, even if the adversary can select the input x, except for a
‘sufficient’ number of bits of the input; these bits are selected randomly. This
intuition is illustrated in Fig. 3.14, which defines a ‘game’ where an adversarial
algorithm A tries to defeat the randomness extraction - by selecting the input
message, except for n random bits^8, and then distinguishing between the output
and a random n-bit string.
Finally, we define an ‘indistinguishability test’, much like the ones used in
Chapter 2 (for IND-CPA encryption (Definition 2.9), PRF, PRG...). Namely,
we select a random bit b, and let y_b be the n-bit output of the hash function
h (whose input contains n random bits), and y_{1−b} be n random bits. Notice
that 1 − b is simply the complement of b, i.e., it is 0 if b = 1 and 1 if b = 0. The
adversary A ‘wins’ if it correctly guesses the value of b.
Definition 3.10 (Bitwise-Randomness extractor (BRE) hash function). An
efficient hash function h : {0, 1}^* → {0, 1}^n is called a bitwise-randomness
extractor (BRE) if for every efficient algorithm A ∈ PPT, the advantage
ε^{BRE}_{h,A}(n) is negligible in n, i.e., smaller than any positive polynomial
for sufficiently large n (as n → ∞), where:

    ε^{BRE}_{h,A}(n) ≡ Pr[BRE_{A,h}(1, n) = 1] − Pr[BRE_{A,h}(0, n) = 1]    (3.18)

where BRE_{A,h}(·, ·) is defined in Algorithm 4 and the probability is taken over
the random coin tosses of A and of BRE_{A,h}(·, ·).
8 The choice of requiring exactly n random bits in the input is quite arbitrary - we could
have required a larger number of input random bits; we used exactly n just for simplicity.
Algorithm 4 Bitwise-Randomness Extraction Indistinguishability test BRE_{A,h}(b, n).

    (m, M) ← A(1^n)
    if |m| ≠ |M| or Σ_{i=1}^{|M|} M(i) < n, return ⊥
    m′ ←$ {0, 1}^{|m|}
    y_b ← h(m ⊕ (m′ ∧ M))
    y_{1−b} ←$ {0, 1}^n
    return A(y_0, y_1, m, M)
Exercise 3.14 gives a simple example of a hash function which is not a BRE.
3.5.3 Key Derivation Functions (KDFs) and the Extract-then-Expand paradigm
Cryptographic protocols use randomness for multiple purposes. For example,
the TLS protocol (Chapter 7) uses random bits for multiple keys (authentication/encryption,
client-to-server and server-to-client), for random bits sent in
the protocol, and for randomized encryption (e.g., as an initialization vector
(IV) for block-cipher modes of operation).
We could use a randomness extractor hash function to produce these random
(or pseudorandom) bits; however, an extractor only outputs a fixed number
of bits, which may be insufficient. We could apply the extractor repeatedly
to provide all of these bits, but then each application would require its
own sufficiently-random input, so this would be inefficient. Therefore, many
protocols use a slightly more complex - and more efficient - cryptographic
mechanism, called a Key Derivation Function (KDF).
Let us describe the KDF design proposed by Krawczyk [242] and specified
by the IETF as RFC 5869 [243], which is deployed in TLS 1.3 (see Section 7.6).
This KDF uses the modular and efficient Extract-then-Expand paradigm, which
constructs a KDF from a given keyed randomness extractor hash ĥ and a given
PRF f. Namely, the KDF uses the extractor ĥ to extract one pseudorandom key
k_PRF, which is used as the key to the PRF f, which is then used to efficiently
generate the desired pseudorandom strings.
The KDF receives four parameters. Two of the four parameters are the
same as used by a keyed extractor hash: a (random but non-secret) key/salt k̂,
and a ‘sufficiently random’ input x. The two other parameters of the KDF are
(1) the number l of pseudorandom bits which the KDF should produce and (2)
an identifier ID for the resulting string. The identifier allows the KDF to be
used, with the same input x, to generate multiple independently-pseudorandom
strings, e.g., to be used as an encryption key, as an authentication key and as an IV.
Let us present this KDF construction. For simplicity, assume that the
required number of bits l is a multiple of the digest length n. We construct the
KDF, based on the keyed extractor hash ĥ and on the PRF f, as:
            PRF          KDF                   Extractor (keyed)     Extractor (keyless)
            f_k(x)       f_k(x, l, ID)         h_k(x)                h(x)
Key k       Secret       Public                Public                No key
Input x     Arbitrary    Sufficiently random   Sufficiently random   Sufficiently random
Output      n bits       l bits                n bits                n bits

Table 3.2: Comparison: PRF, KDF and (keyed and keyless) randomness
extractor hash functions. For all functions, outputs should be pseudorandom.
    KDF_k̂(x, l, ID) = T_1 ∥ T_2 ∥ . . . ∥ T_{l/n}, where:
        T_0 is an empty string,
        T_i = f_{k_PRF}(T_{i−1} ∥ ID ∥ i),
        and k_PRF = ĥ_k̂(x)                                              (3.19)
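The following Python sketch implements Equation 3.19, using HMAC-SHA256 to stand in both for the keyed extractor ĥ and for the PRF f (as RFC 5869 does); the function and parameter names are our own illustrative choices:

```python
import hashlib
import hmac

n = 32  # digest length in bytes (SHA-256)

def prf(key, data):
    # HMAC-SHA256, used here both as the keyed extractor and as the PRF.
    return hmac.new(key, data, hashlib.sha256).digest()

def kdf(salt, x, l, ID):
    # Extract-then-Expand (Equation 3.19); l (in bytes) must be a multiple of n.
    k_prf = prf(salt, x)                 # extract: k_PRF = h-hat_salt(x)
    out, T = b'', b''                    # T_0 is the empty string
    for i in range(1, l // n + 1):
        T = prf(k_prf, T + ID + bytes([i]))   # T_i = f(T_{i-1} || ID || i)
        out += T
    return out
```

The same input x with different identifiers ID yields independent-looking pseudorandom strings, e.g., one for encryption and one for authentication.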
RFC 5869 [243] also proposes the use of the well-known HMAC construction,
which we discuss in subsection 4.6.3. HMAC transforms a keyless hash function
h into a keyed function ĥ_k̂(x), defined as:

    ĥ_k̂(x) = h((k̂ ⊕ OPAD) ∥ h((k̂ ⊕ IPAD) ∥ x))                          (3.20)

The values OPAD and IPAD in Equation 3.20 are fixed strings defined in [243].
The keyed function ĥ resulting from this construction is used, in [243] and
elsewhere, both as a keyed extractor hash and as a PRF, e.g., for the KDF
construction of Equation 3.19.
Note that although HMAC is often used both as a PRF and as a keyed
extractor hash, the security requirements from these two types of functions are
quite different. Table 3.2 compares the four relevant types of functions: KDF,
PRF and keyed/keyless extractor hash functions. And Exercise 3.10 shows that
the keyed extractor and PRF definitions are ‘incomparable’: a function can be
a PRF but not a keyed extractor hash function, and vice versa. Try to solve it
first and only then read the solution, since if you directly read the solution, it
may appear obvious.
Exercise 3.10 (PRF vs. KDF). Let f be a (secure) PRF and h be a (secure)
keyed extractor hash, where for both functions, the output and the key are n
bits.
1. Use f and/or h to construct a (secure) PRF f ′ , such that f ′ is not a
secure keyed extractor hash.
2. Use f and/or h to construct a (secure) keyed extractor hash h′ , such that
h′ is not a secure PRF.
Solution:

    f′_k(x) = { 0^n       if k = x mod 2^n
              { f_k(x)    otherwise

    h′_k(x) = { k         if 0^n = x mod 2^n
              { h_k(x)    otherwise
Complete the solution by explaining or demonstrating why f′ is a secure PRF
but an insecure keyed extractor hash, and why h′ is a secure keyed extractor hash
but an insecure PRF. And find other solutions!
3.6 The Random Oracle Model
Often, designers use cryptographic hash functions in their constructions, without
assuming a specific property such as one-way function or second-preimage
resistance. Many of these constructions are found to be insecure; however, some
constructions resist attacks for many years, even in spite of considerable efforts.
This is especially common for keyless hash functions.
Furthermore, when a vulnerability is found in a system deploying
a hash function, it often turns out that the attack is generic, i.e., does not
exploit a weakness of a specific hash function. In other words, these systems
are vulnerable when implemented with any hash function.
It is obviously desirable to identify designs which are vulnerable when used
with any hash function; we definitely want to avoid such a design! Put more
positively, it is preferable to use designs which can be proven secure
against any ‘generic’ attack. Such designs may still be vulnerable when a
given specific hash function is used, but cannot be shown vulnerable when
implemented with any hash function; any vulnerability must be attributed to a
property of the specific hash function used.
The Random Oracle Model (ROM)^9, proposed by Bellare and Rogaway [42],
offers such a method.
Intuitively, Random Oracle Model (ROM) constructions and protocols are
secure in the (impractical) case that the parties select the hash function h()
as a random function over the same domain and range. Of course, when the
construction is deployed, it must use a concrete, specific hash function h(), rather
than a random function. Namely, we model an ‘ideal’ keyless hash function as
a random function (over the same domain and range, i.e., {0, 1}^* → {0, 1}^n).
Definition 3.11 (ROM-security). Let H be the set of all functions from binary
strings to {0, 1}^n, for some digest length n. Consider a parameterized scheme
S^h, where h is a given hash function. Also, for any security definition def,
let ε^{def}_{S^h,A^h} : N → [0, 1] be the def-advantage function, defined for a given scheme
S^h and parameterized adversary A^h, where A^h is a PPT algorithm, using a
standard computational model (e.g., Turing machine), with ‘black-box’ (oracle)
access to h. Namely, A^h can provide an arbitrary input x to h and receive back
h(x) (as a single operation).
We say that the (parameterized) system S^h is def-ROM-secure, or def-secure
under the Random Oracle Model, if the advantage of any PPT adversary
9 Often people use the term ‘random oracle methodology’ instead of ‘random oracle
model’. Arguably, this would be more appropriate; but we use ‘random oracle model’, since
it is more common.
A^h for scheme S^h, for a random hash function h ←$ H, is negligible, i.e.:

    Pr_{h ←$ H} [ ε^{def}_{S^h,A^h} ] ∈ NEGL(n)                          (3.21)
Note that when the parties can share a secret, random key, then one should
use a Pseudo-Random Function (PRF) rather than the Random Oracle Model
(ROM); see Principle 6.
Advanced: about the choice of h. The careful reader may have noticed
that it is not well defined how to select an element from an infinite set with
uniform probability; hence, h ←$ H is not well defined. This choice should be
interpreted as follows. For any l > n, let H^l be the set of all functions from
{0, 1}^l to {0, 1}^n, i.e., functions from l-bit binary strings to n-bit binary strings:
H^l = {h : {0, 1}^l → {0, 1}^n}. The set H^l is finite, hence we can define random
sampling from it: h^l ←$ H^l. When we write h ←$ H, it should be interpreted as
choosing h^l ←$ H^l for every integer l > n; and, for any input string m ∈ {0, 1}^*,
let h(m) = h^{|m|}(m).
Notice that the definition is for a fixed digest length n; however, formally,
our definitions should be phrased with n as a parameter.
To prove that a protocol using a keyless hash function h is secure under
the ROM, we analyze the protocol assuming that h is chosen randomly at
the beginning of the execution. Once chosen, all parties have ‘black-box’
(oracle) access to h; this includes the adversary as well as the parties running the
protocol.
Security under the ROM vs. Security under the Standard Model.
Analysis of security under the ROM is widely used. In fact, papers
in cryptography often use the term ‘secure in the standard model’ to emphasize
that their results are proven secure in the ‘real model’, rather than only
proven under the ROM or another simplified model. Proofs of security in the
standard model usually still make different unproven assumptions; however,
these assumptions are ‘standard’ cryptographic assumptions. For many of
these standard cryptographic assumptions there are even results showing that
there exist schemes satisfying these assumptions, provided that a complexity
assumption such as P ≠ NP is true.
In contrast, ROM-security does not necessarily imply that the design is
‘really’ secure when implemented with a specific hash function; once the hash
function is fixed, there may be an adversary that breaks the system. This is true
even when the adopted hash function satisfies ‘standard’ security specifications,
e.g., CRHF. See examples in subsection 4.6.3.
Still, ROM-security is definitely a good indication of security, since a
vulnerability has to use some property of the specific hash function. Indeed, there
are widely-deployed designs which are only proven to be ROM-secure.
3.7 Static Accumulator Schemes and the Merkle-Tree
A collision-resistant hash function h can provide integrity for a binary string x,
by computing the digest h(x) (or h_k(x), for a keyed hash). But what if we require
integrity for multiple strings X = {x_0, x_1, . . . , x_{m−1}}? A naive solution is the
encode-then-hash design, which encodes the set of strings X as a single string,
and then applies the hash function h. Namely, we compute h(encode(X)), where
encode is a one-to-one encoding of multiple strings into one string. It is not hard
to design such an encode function; however, this design has two disadvantages:
1. Validation requires all strings in X, even if we only need to validate the
integrity of a single string, say x_i.
2. The set of digested strings is fixed (static). For example, suppose we
compute h(encode(X)) and then receive an additional string x_m, or a whole
additional set of strings X′. We want a digest value that
will allow us to validate the integrity of all of these strings, i.e., of
{x_0, x_1, . . . , x_{m−1}, x_m} or X ∥ X′. Computing ‘from scratch’ seems
computationally inefficient, and would also require storage of all input
strings.
The basic goal of accumulator schemes is to provide a more efficient way to
ensure integrity of multiple binary strings.
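For instance, a simple one-to-one encode can prefix each string with its length (a sketch; the fixed 8-byte length field is an arbitrary illustrative choice):

```python
import hashlib

def encode(X):
    # Length-prefixed concatenation: one-to-one, since each length is explicit.
    return b''.join(len(x).to_bytes(8, 'big') + x for x in X)

def naive_digest(X):
    # The naive encode-then-hash static accumulator digest.
    return hashlib.sha256(encode(X)).digest()

# Without the length prefixes, [b'ab', b'c'] and [b'a', b'bc'] would collide.
assert naive_digest([b'ab', b'c']) != naive_digest([b'a', b'bc'])
```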
In this section, we discuss static accumulators, which compute a collision-resistant
digest of a given set of input strings, allowing efficient validation
of the integrity of one or more of the input strings. The encode-then-hash
design h(encode(x_0, x_1, . . . , x_{m−1})), explained above, is a naive, inefficient static
accumulator; we present the Merkle tree accumulator, which is widely deployed,
e.g., in blockchains (see Section 3.10) and in the Certificate Transparency (CT)
PKI scheme (see Section 8.6). In Section 3.8 we extend our discussion to dynamic
accumulators, which allow efficient validation of the integrity of input strings
given in one or multiple events. Later, in Section 3.9, we discuss the Merkle-Damgård
(MD) design, which fits the definition of a dynamic accumulator, but
is mostly known for its use in the design of several standard hash functions.
As with hash functions, we can define and use either keyed or keyless
accumulators. Similarly to the case of hash functions, most practical applications
use a keyless accumulator, while many theoretical works focus on keyed
accumulators. Also as with hash functions, keyless accumulators cannot have
some important properties such as collision-resistance, except under simplifying
models such as the Random Oracle Model (ROM, Section 3.6). In this text, we
focus on the slightly simpler case of keyless accumulators.
3.7.1 Definition of a Keyless Static Accumulator
A (keyless) static accumulator scheme consists of two algorithms: α.Accum and
α.VerPoI. The Accum algorithm accumulates input strings into a digest ∆, and the
VerPoI algorithm verifies the inclusion of a given string in the accumulated digest.
Definition 3.12 (A keyless static accumulator). A keyless static accumulator
scheme α is defined by two algorithms: α.Accum and α.VerPoI, where:

α.Accum(X) → (∆, Π) is a deterministic algorithm that accumulates a sequence
(ordered set) of binary strings X = ⟨x_i⟩_{i=0}^{m−1} ∈ ({0, 1}^*)^m. The
α.Accum function outputs a digest ∆ ∈ {0, 1}^* ∪ {⊥} and an ordered set
Π = {π_i}_{i=0}^{m−1} ∈ ({0, 1}^*)^m of Proofs-of-Inclusion.

α.VerPoI(∆, x, ID, m, π) → {True, False} is a deterministic algorithm that
verifies whether x was one of the strings in the sequence accumulated into ∆,
using the Proof-of-Inclusion π. The function may use two additional
inputs: m, the number of strings in the sequence accumulated into ∆, and
ID, the sequential number of x within the sequence.
Notations. To refer to a particular output parameter of a function of the
accumulator, we append the parameter name to the function name, separated
by a dot. Namely, α.Accum.∆(X) denotes the digest (∆) output of the Accum
function of α, and α.Accum.π(X) denotes the set of PoIs.
3.7.2 Collision-Resistant Accumulators
The first security requirement we define is collision-resistance. We first extend
the notion of a collision to multiple input strings x_0, . . . , x_{m−1}. Two natural
definitions for collisions between sets of strings are ordered collisions and unordered
collisions. For clarity, we define both ordered and unordered collisions; however,
later, we focus on unordered collisions, which are used in most applications and
publications, and refer to them simply as collisions. Note that every unordered
collision is also an ordered collision; therefore, resistance to ordered collisions
also implies resistance to unordered collisions.
Definition 3.13 (Keyless static collisions and the Im(·) notation). Let α be
an accumulator and let X, X′ be two ordered sets of binary strings such that:

    α.Accum.∆(X) = α.Accum.∆(X′) ≠ ⊥

If X ≠ X′, then we say that the pair (X, X′) is an ordered collision for α.
Given an ordered set (sequence) X, let Im(X) denote the unordered set of
elements in X. If Im(X) ≠ Im(X′), then we say that (X, X′) is an unordered
collision, or simply a collision, for accumulator α.
We next define collision-resistance for static keyless accumulators. Note that
we could similarly define (keyless) second-preimage resistant accumulators.

Definition 3.14 (Collision Resistant Accumulator). A keyless static accumulator
scheme α is Collision Resistant if for any PPT adversary A there is
negligible probability for A to output an (unordered) collision. Namely:

    Prob[ α.Accum.∆(X) = α.Accum.∆(X′) ≠ ⊥ and
          Im(X) ≠ Im(X′), where (X, X′) ← A(1^λ) ] ∈ Negl(λ)             (3.22)
Figure 3.15: The Merkle-Tree construction using hash function h, for inputs
{x_0, . . . , x_6}, i.e., with k = 7 input strings, using the notation ∆_{i,j} ≡
MT.Accum.∆({x_i, . . . , x_j}).
3.7.3 The Merkle tree (MT) Accumulator and its Collision-resistance
The Merkle tree (MT) accumulator was first presented by Merkle in [281];
many variants of this design have been used in practice and in different works and
proposals. Most of these designs, including Merkle’s original design, are keyless.
In this subsection, we describe the digest function of the (keyless) Merkle
tree (MT) design used by the Certificate Transparency (CT) standard [256],
analyzed in [128]. The construction is illustrated in Figure 3.15, using the
notation ∆_{i,j} ≡ MT.Accum.∆({x_i, . . . , x_j}).
As shown in Figure 3.15, the Merkle tree (MT) accumulator constructs a
(full or partial) binary tree whose leaves are the accumulated input strings, which
we denote {x_i}_{i=0}^{m−1}. We define the tree recursively, as follows.
We first define the MT digest of a list containing a single string {x_i}, which
we denote by ∆_{i,i}, as:

    ∆_{i,i} ≡ MT.Accum.∆({x_i}) = h(0x00 ∥ x_i)                          (3.23)
Next, let {x_i, . . . , x_{i+j−1}} be a list containing j > 1 elements. We compute
the MT digest of {x_i, . . . , x_{i+j−1}}, denoted ∆_{i,i+j−1}, by the following recursive
equation:

    ∆_{i,i+j−1} ≡ MT.Accum.∆({x_i, . . . , x_{i+j−1}})
                = h(0x01 ∥ ∆_{i,i+2^l−1} ∥ ∆_{i+2^l,i+j−1})
                = h(0x01 ∥ MT.Accum.∆({x_i, . . . , x_{i+2^l−1}}) ∥
                           MT.Accum.∆({x_{i+2^l}, . . . , x_{i+j−1}})),
    where l is the maximal integer such that 2^l < j                     (3.24)
For the example in Figure 3.15, we first compute ∆_{i,i}, for i between 0 and
6, using Equation 3.23; e.g., ∆_{3,3} ≡ MT.Accum.∆({x_3}) = h(0x00 ∥ x_3). We
then compute the digests ∆_{i,i+1} for i ∈ {0, 2, 4}, using Equation 3.24; e.g.,
∆_{2,3} = h(0x01 ∥ ∆_{2,2} ∥ ∆_{3,3}). Next, we compute, using Equation 3.24, the
digests ∆_{0,3} = h(0x01 ∥ ∆_{0,1} ∥ ∆_{2,3}) and ∆_{4,6} = h(0x01 ∥ ∆_{4,5} ∥ ∆_{6,6}), and
finally ∆_{0,6} = h(0x01 ∥ ∆_{0,3} ∥ ∆_{4,6}).
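Equations 3.23 and 3.24 translate directly into a short recursive Python sketch (with SHA-256 standing in for h; an illustration, not the CT reference implementation):

```python
import hashlib

def h(data):
    return hashlib.sha256(data).digest()

def mt_root(X):
    # MT.Accum.Delta of Equations 3.23-3.24; X is a non-empty list of bytes.
    if len(X) == 1:
        return h(b'\x00' + X[0])      # leaf: 0x00 prefix (Equation 3.23)
    k = 1
    while 2 * k < len(X):             # k = 2^l, the maximal power of two < |X|
        k *= 2
    # internal node: 0x01 prefix (Equation 3.24)
    return h(b'\x01' + mt_root(X[:k]) + mt_root(X[k:]))

X = [b'x0', b'x1', b'x2', b'x3', b'x4', b'x5', b'x6']
root = mt_root(X)   # Delta_{0,6} of Figure 3.15
```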
Note that we use a different one-byte prefix for the hashed strings in Equation 3.23,
computing the digest of a set containing a single input string {x_i},
compared to the one-byte prefix we use in Equation 3.24, computing the digest
of a set containing multiple strings {x_i, . . . , x_{i+j−1}} (for j > 1). This is necessary
to ensure collision resistance. Let Accum′.∆′ be a modified digest function
which uses the same prefix (or no prefix) for the two cases, i.e., for a digest of a set
containing only one string and for a digest of a set containing multiple strings. Such a
modified digest function Accum′.∆′ would have collisions - even second-preimage
collisions. For example:

    MT.Accum′.∆′({x_0, . . . , x_4}) = MT.Accum′.∆′({y_{0,1}, y_{3,4}}),
    where y_{i,j} ≡ MT.Accum′.∆′({x_i, x_j})                             (3.25)
We next observe that the digest function of the Merkle tree (MT) ensures
collision resistance, provided that h is a collision-resistant hash function. Note
that since h is keyless, we can only hope for it to be collision-resistant under a
simplified model such as the Random Oracle Model (ROM); see Section 3.6.

Lemma 3.2. Assume that h is a (keyless) collision-resistant hash function
(under the random oracle model). Then the Merkle tree (MT) is a collision-resistant
static accumulator.
Proof: Suppose, to the contrary, that the Merkle tree (MT) has a collision,
i.e., two different sets of strings {x_0, . . . , x_{m−1}} ≠ {x′_0, . . . , x′_{m′−1}} such that
MT.Accum.∆({x_0, . . . , x_{m−1}}) = MT.Accum.∆({x′_0, . . . , x′_{m′−1}}). If m =
m′ = 1, then x_0 ≠ x′_0, yet we have h(0x00 ∥ x_0) = h(0x00 ∥ x′_0), a collision.
If m = 1 and m′ > 1, then we have h(0x00 ∥ x_0) = h(0x01 ∥ ∆′), where
∆′ is computed as per Equation 3.24. Regardless of the value of ∆′, this is a
collision. A dual argument holds if m > 1 and m′ = 1.
Finally, consider m > 1 and m′ > 1, and assume, WLOG, that there is no
collision for shorter input sequences. Denote ∆_{i,j} ≡ MT.Accum.∆({x_i, . . . , x_j})
and ∆′_{i,j} ≡ MT.Accum.∆({x′_i, . . . , x′_j}); in particular, we have ∆_{0,m−1} =
∆′_{0,m′−1}. Let l be the maximal integer such that 2^l < m and l′ be the
maximal integer such that 2^{l′} < m′. Then we have:

    ∆_{0,m−1}   = h(0x01 ∥ ∆_{0,2^l−1} ∥ ∆_{2^l,m−1}), and
    ∆′_{0,m′−1} = h(0x01 ∥ ∆′_{0,2^{l′}−1} ∥ ∆′_{2^{l′},m′−1})           (3.26)

If either ∆_{0,2^l−1} ≠ ∆′_{0,2^{l′}−1} or ∆_{2^l,m−1} ≠ ∆′_{2^{l′},m′−1}, we have a
collision. If both are equal, then we have a collision for a shorter input sequence.
In either case, we have the desired contradiction.
3.7.4 The Proof-of-Inclusion (PoI) Requirements
Many applications use accumulators to verify the integrity of a specific input x to the accumulator, namely, to verify that x was part of the set of input strings X. Such verification of a specific input may be done using the VerPoI operation, using the Proof-of-Inclusion (PoI) output, π, from the Accum operation corresponding to x.
Note that the integrity of a specific input x could also be verified using the collision-resistance property of the accumulator, by re-computing the digest of the entire set of all input strings. However, verifying the integrity using VerPoI has three advantages. First, using VerPoI is (typically) much more efficient than recomputing the digest. Second, validation does not require the entire set of input strings X, which may be unavailable, or require unnecessary overhead to obtain or maintain. Third, the use of VerPoI may also provide some privacy, since a party may verify a particular input string without having access to other inputs, basically following the 'need to know' paradigm.
Verification involves two complementary requirements: correctness and unforgeability. Intuitively, correctness requires that VerPoI(∆, x, ID, m, π) returns 1 (true) if the inputs are 'correct', i.e., x was the ID-th string in the sequence of m strings accumulated into digest ∆, and π was the corresponding PoI output; and unforgeability requires that VerPoI(∆, x, ID, m, π) returns 0 (false) if ∆ is not the digest of a sequence containing x. Note that the unforgeability requirement does not require VerPoI to validate the position of x within X; this is since some accumulators do not preserve the position, and applications mostly do not depend on preserving the position.
Definition 3.15 (PoI Correctness and Unforgeability). Let α be a static accumulator scheme.
Correctness. We say that α ensures Proof-of-Inclusion (PoI) Correctness if the following holds for every security parameter λ ∈ N and input sequence X = {x0, . . . , xm−1}. Let (∆, {π0, . . . , πm−1}) ← α.Accum(X). Then ∀ID < m holds α.VerPoI(∆, xID, ID, m, πID) = 1.
Unforgeability. We say that α ensures Accumulator Proof-of-Inclusion (PoI) Unforgeability if for every security parameter λ ∈ N and PPT adversary A holds:

Prob[ α.VerPoI(∆, xID, ID, m, πID) = 1 ∧ xID ∉ Im(X),
  where ∆, X, xID, ID, m and πID were generated by:
  X ← A(1^λ); (∆, Π) ← α.Accum(X); (xID, ID, m, πID) ← A(∆, Π) ] ∈ Negl(λ)    (3.27)
3.7.5 The Merkle-Tree PoI
We now complete the definition of the static Merkle tree (MT) accumulator by presenting its Accum.Π and VerPoI functions, and proving that it ensures PoI correctness and unforgeability.
Figure 3.16: Illustrating the Merkle Tree's Proof-of-Inclusion (PoI), for input string x3, over the tree computing ∆0,6 ≡ MT.Accum.∆({x0, . . . , x6}). The PoI consists of the three hash values in thick blue rectangles: ∆2,2, ∆0,1 and ∆4,6. The other bold, blue hash values, over thick lines, are computed in the PoI verification. The dotted inputs and hash operations are not used for the verification of the PoI of x3.
Recall that the Merkle tree (MT) is basically a (full or partial) binary tree whose leaves are the accumulated input strings, which we denote {xi} for i = 0, . . . , m−1. Let πID denote the PoI of an input string xID, i.e., πID ≡ MT.Accum.Π[ID](X). The PoI, πID, is a (usually small) sequence of digests, which are used, together with xID, m and ID, to re-compute the digest of X, i.e., compute ∆(X). If the computation returns the correct, expected value of ∆, this shows that xID was indeed the ID-th element in the accumulated sequence.
For example, Figure 3.16 shows the sequence of digests required to validate x3, given in the order of their usage (and of the layers in the tree): {∆2,2, ∆0,1, ∆4,6}. In the figure, we placed these digest values in thick rectangles. Therefore, the PoI of x3 is:

π3 ≡ MT.Accum.Π[3](X) = {∆2,2, ∆0,1, ∆4,6}    (3.28)
Computing the Proof of Inclusion of xi, i.e., πID ≡ MT.Accum.Π[ID](X). Consider first the base case, where the input is a list X containing a single element x, i.e., X = {x}. In this case, given x and ∆, we should just confirm that ∆ = h(0x00 ++ x) (Equation 3.23), namely, we do not need any PoI. Therefore, the PoI is simply the empty list ∅, namely:

π0 ≡ MT.Accum.Π[0]({x}) = ∅    (3.29)
Consider now the typical case, where X contains j > 1 elements, say X = {xi, . . . , xi+j−1}. Note that MT.Accum.Π[m](X) is the PoI (sequence of digests) for xi+m, since this is the (m + 1)-th element in the list. We compute MT.Accum.Π[m](X) recursively as follows, where l is the maximal integer such that 2^l < j and ∆a,b is defined as in Equation 3.24:

πm ≡ MT.Accum.Π[m](X) ≡ MT.Accum.Π[m]({xi, . . . , xi+j−1}) =
  If m < 2^l:  MT.Accum.Π[m]({xi, . . . , xi+2^l−1}) ++ ∆i+2^l,i+j−1
  Else:        MT.Accum.Π[m − 2^l]({xi+2^l, . . . , xi+j−1}) ++ ∆i,i+2^l−1    (3.30)
The reader can confirm that in Figure 3.16, the resulting sequence of digests for x3, i.e., π3 ≡ MT.Accum.Π[3]({x0, . . . , x6}), would be as expected, i.e.:

π3 ≡ MT.Accum.Π[3]({x0, . . . , x6}) = {∆2,2, ∆0,1, ∆4,6}    (3.31)
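The recursion of Equation 3.30 can be sketched in Python as follows (a sketch, with SHA-256 assumed as the underlying hash h; mt_digest computes the ∆a,b values per Equations 3.23/3.24):

```python
import hashlib

def h(data: bytes) -> bytes:
    # stand-in for the underlying hash h (assumed SHA-256)
    return hashlib.sha256(data).digest()

def mt_digest(xs: list) -> bytes:
    """Digest per Equations 3.23/3.24."""
    if len(xs) == 1:
        return h(b"\x00" + xs[0])
    split = 1
    while 2 * split < len(xs):  # largest power of two smaller than len(xs)
        split *= 2
    return h(b"\x01" + mt_digest(xs[:split]) + mt_digest(xs[split:]))

def mt_poi(xs: list, m: int) -> list:
    """PoI for the m-th element of xs (Equation 3.30): at each level,
    append the digest of the sibling subtree not containing x."""
    if len(xs) == 1:
        return []  # base case, Equation 3.29
    split = 1
    while 2 * split < len(xs):
        split *= 2
    if m < split:  # x is in the left subtree
        return mt_poi(xs[:split], m) + [mt_digest(xs[split:])]
    return mt_poi(xs[split:], m - split) + [mt_digest(xs[:split])]
```

For seven leaves {x0, . . . , x6}, mt_poi(xs, 3) returns the three sibling digests ∆2,2, ∆0,1, ∆4,6 of Equation 3.31, in that order.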
Verifying the Proof of Inclusion of xi. The MT.VerPoI function is shown in Algorithm 5. This algorithm computes MT.VerPoI(∆, x, i, m, π), i.e., verifies, using the sequence of digests in π, that x was the (i + 1)-th string in a sequence containing m strings whose digest is ∆. The verification is done 'bottom up', from layer j = 0 till layer j = ⌈log m⌉ − 1, re-computing one digest in each layer of the Merkle tree until finally re-computing the digest over the entire sequence of m strings whose (i + 1)-th string is x, at which point it simply remains to compare this digest to ∆. In the j-th layer, we compute the digest (hash) of the previous digest, which is always over a subsequence containing x, and of π[j]. The order between the two strings being hashed at each layer depends on the parity of ⌈(i + 1)/2^j⌉.
Algorithm 5 The Verify-PoI algorithm MT.VerPoI(∆, x, i, m, π): verify that x was the i-th string, out of m, accumulated into digest ∆, using the PoI π.
1: if (i ≥ m) then return False
2: δ ← h(0x00 ++ x)    ▷ If x was the i-th string then δ = ∆i,i
3: for j = 0 to ⌈log(m)⌉ − 1 do    ▷ Loop to compute digest at each layer
4:   if ⌈(i + 1)/2^j⌉ is odd then    ▷ if in layer j, x was on the left, then:
5:     δ ← h(0x01 ++ δ ++ π[j])
6:   else    ▷ else, i.e., if in layer j, x was on the right, then:
7:     δ ← h(0x01 ++ π[j] ++ δ)
8: if δ = ∆ then
9:   return True
10: else
11:   return False
Let us track the computation of MT.VerPoI(∆0,6, x3, 3, 7, π3) in Figure 3.16, where π3 is as computed in Equation 3.31, i.e., π3[0] = ∆2,2, π3[1] = ∆0,1 and π3[2] = ∆4,6. Following Algorithm 5, we first compute δ ← h(0x00 ++ x3), which is, as expected, the same as ∆3,3. Since ⌈(3 + 1)/2^0⌉ = 4 is even, we next compute δ ← h(0x01 ++ π3[0] ++ δ) = h(0x01 ++ ∆2,2 ++ h(0x00 ++ x3)), which is, as expected, ∆2,3. We similarly compute δ ← ∆0,3 and finally δ ← ∆0,6. Therefore, when the loop is completed, we find that δ = ∆ and therefore return True.
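The full round trip can be sketched in Python for the simpler case where m is a power of two, so that every PoI contains exactly log2(m) digests (SHA-256 is assumed for h; mt_digest and mt_poi follow Equations 3.24 and 3.30, and mt_verpoi follows Algorithm 5):

```python
import hashlib
from math import ceil, log2

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def mt_digest(xs):
    """Digest per Equations 3.23/3.24."""
    if len(xs) == 1:
        return h(b"\x00" + xs[0])
    split = 1
    while 2 * split < len(xs):
        split *= 2
    return h(b"\x01" + mt_digest(xs[:split]) + mt_digest(xs[split:]))

def mt_poi(xs, m):
    """PoI (sibling digests) per Equation 3.30."""
    if len(xs) == 1:
        return []
    split = 1
    while 2 * split < len(xs):
        split *= 2
    if m < split:
        return mt_poi(xs[:split], m) + [mt_digest(xs[split:])]
    return mt_poi(xs[split:], m - split) + [mt_digest(xs[:split])]

def mt_verpoi(delta, x, i, m, pi):
    """Algorithm 5, for m a power of two: recompute one digest per layer,
    ordering delta and pi[j] by the side x occupies at that layer."""
    if i >= m:
        return False
    d = h(b"\x00" + x)
    for j in range(ceil(log2(m))):
        if ceil((i + 1) / 2 ** j) % 2 == 1:  # x in the left half at layer j
            d = h(b"\x01" + d + pi[j])
        else:                                # x in the right half at layer j
            d = h(b"\x01" + pi[j] + d)
    return d == delta
```

Note that for unbalanced trees (m not a power of two), the PoI may contain fewer than ⌈log m⌉ digests for some leaves, so a general implementation needs slightly more careful layer bookkeeping.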
Finally, we state, without proof, the correctness and unforgeability of MT.
A similar claim is proven in [128] (Lemma 2).
Lemma 3.3. Assume that h is a (keyless) collision-resistant hash function
(under the random oracle model). Then the MT Digest is a static accumulator
that ensures PoI correctness and unforgeability.
3.8 Dynamic Accumulators
In some applications, there is a need to accumulate a set of strings that changes over time, typically by accumulating new strings in addition to the already-accumulated strings. We use the term Dynamic Accumulator to refer to a stateful accumulator that supports incremental accumulation of multiple input sets of strings, e.g., the l input sets of strings X1, . . . , Xl, each received in an Accum event. The l-th call to the Accum function of a Dynamic Accumulator produces the digest ∆l of the strings received in these l Accum events. More precisely, the dynamic accumulator outputs the digest of the set of strings X⃗l, defined as:

X⃗l ≡ X1 ++ . . . ++ Xl    (3.32)
Dynamic accumulators extend the mechanisms and properties of static accumulators. In particular, they should satisfy collision resistance and PoI unforgeability, extended to support accumulation of the strings in X⃗l over multiple Accum events. We present the extended definitions for collision resistance in subsection 3.8.2. Before that, in the following subsection, we discuss the motivations for using dynamic accumulators, the different ways in which dynamic accumulators extend static accumulators, and finally define dynamic accumulators.
3.8.1 Dynamic accumulators: motivations, extensions and definition
There are two main motivations for using dynamic accumulators instead of re-applying a static accumulator α, i.e., applying α repeatedly to all input strings each time that a new set of strings is added to the accumulator. One motivation is efficiency; by reusing results from previous applications of the accumulator, dynamic accumulators are usually more efficient than the comparable use of the static accumulator α.
Another motivation for dynamic accumulators is to provide verifiable consistency, allowing the use of an old digest to validate a new digest. Let ∆ be the digest of the set of strings X⃗, and let ∆′ be the digest of the set of strings X⃗′. We say that ∆′ is consistent with ∆ if, and only if, X⃗ ⊆ X⃗′.
To verify consistency, dynamic accumulators define the consistency verification predicate VerUpd. This is a new operation, i.e., it does not exist in static accumulators. The VerUpd operation efficiently verifies the consistency of an 'updated digest' ∆′ with a 'previous digest' ∆. To facilitate the verification, VerUpd receives a third parameter U, in addition to the two digests ∆ and ∆′. The U parameter is a set containing one or more update values. Normally, ∆′ is the result of a series of Accum events following the Accum event which outputted ∆. In this case, U contains one update value u from each of these Accum events. The verification should be correct and sound, which intuitively mean:
Sound consistency verification: It is infeasible for an attacker to output an inconsistent pair of digests ∆, ∆′ and a set U that will pass verification as consistent, i.e., s.t. VerUpd(∆, U, ∆′) = True.
Correct consistency verification: Suppose the accumulator outputs digest ∆0 and then has a sequence of q Accum operations, resulting in the series of update values U ≡ {u1, . . . , uq}, for some q ≥ 1. Let ∆q be the digest output by the last Accum operation in the sequence. Then VerUpd(∆0, U, ∆q) = True. Note that ∆q is indeed consistent with ∆0.
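For intuition, here is a minimal Python sketch of VerUpd for a hypothetical hash-chain accumulator, where (by assumption, for illustration) each Accum event computes ∆i = h(∆i−1 ++ ui) and outputs ui as its update value; VerUpd simply replays the chain:

```python
import hashlib

def h(data: bytes) -> bytes:
    # stand-in hash (assumed SHA-256)
    return hashlib.sha256(data).digest()

def ver_upd(delta: bytes, updates: list, delta_new: bytes) -> bool:
    """Replay the update values on top of the old digest and compare the
    result to the claimed new digest. Correctness holds by construction;
    soundness rests on the collision resistance of h."""
    d = delta
    for u in updates:
        d = h(d + u)  # one Accum event: new digest covers old digest and u
    return d == delta_new
```

This is exactly the pattern that makes an old digest usable as a 'checkpoint' for validating a newer one.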
Updating the PoIs. Dynamic accumulators also use the update values u output by the Accum operations to update the Proofs-of-Inclusion (PoIs). This allows verification that a string x was accumulated into a digest ∆q, even if x was input to an 'early' Accum event, producing digest ∆0, followed by some number q of additional Accum events until the last of them produced ∆q. Let π be the PoI for x produced in the 'early' Accum event. In order to verify that x was accumulated into ∆q, we need to update the PoI, using another new operation of dynamic accumulators: UpdPoI. The UpdPoI operation receives the outdated PoI π, corresponding to a string x ∈ X0, and a set U containing one or more update values, say U = {u1, . . . , uq}. The output of UpdPoI is an updated version of the PoI, which we denote πq.
The UpdPoI operation should ensure correctness. Namely, let π be the PoI generated for input string x accumulated into ∆0 in an Accum event. Suppose this is followed by a sequence of q Accum operations, resulting in the series of update values U ≡ {u1, . . . , uq}, for some q ≥ 1. Let ∆q be the digest output by the last Accum operation in the sequence. If π′ ← UpdPoI(π, U), then VerPoI(∆q, x, π′) should return True.
To summarize, dynamic accumulators extend static accumulators in the following ways:
State: Dynamic accumulators are stateful. The state of the accumulator, denoted s, allows adding new strings to the current digest; the state is provided as input to Accum, and a new state is produced as output.
Additional outputs of Accum: The Accum function of a dynamic accumulator produces two additional outputs, in addition to the digest (∆) and PoI (π) which are produced by both dynamic and static accumulators. These two additional outputs are a new state, s′, and an update value, u. The update value u is used to update the already-published Proofs-of-Inclusion (PoIs), allowing old PoIs to be used with the new digest, and to validate that the new digest is consistent with the previous digest.
Additional functions: Dynamic accumulator schemes include two additional functions: VerUpd, which verifies that a given new digest ∆′ is consistent with, i.e., extends, the current digest ∆; and UpdPoI, which updates old PoIs so they can be used with an updated digest. Both VerUpd and UpdPoI use the u values produced in the interim Accum operations.
Additional and modified properties: To allow for multiple Accum events, we need to make some changes to the collision-resistance and PoI requirements, and to add properties specific to the dynamic behavior: sound and correct consistency verification, and UpdPoI correctness. We describe the changes to collision-resistance in subsection 3.8.2; the other properties were briefly discussed above.
We complete this subsection by defining (keyless) dynamic accumulators.
Definition 3.16 (Dynamic accumulator schemes). A (keyless) dynamic accumulator scheme α is defined by four algorithms: α.Accum, α.VerPoI, α.VerUpd and α.UpdPoI, where:
α.Accum_s(X) → (s′, ∆, Π, u) is a deterministic algorithm that accumulates a sequence (ordered set) of binary strings X = ⟨x0, . . . , xm−1⟩ ∈ ({0,1}*)^m. α.Accum outputs a new state s′, a digest ∆ ∈ {0,1}* ∪ {⊥}, a set Π = {π0, . . . , πm−1} ∈ ({0,1}*)^m of proofs-of-inclusion, and an update value u ∈ {0,1}*. In the first call, we use the initial state s = 1^λ.
α.VerPoI(∆, x, ID, m, π) → {True, False} is a deterministic algorithm that verifies if x was part of the inputs accumulated into ∆, using the Proof-of-Inclusion π, optionally using the number of strings m and the sequential number ID.
α.VerUpd(∆, U, ∆′) → {True, False} is a deterministic algorithm that verifies that a new digest ∆′ is consistent with a previous digest ∆, using a set of update values U.
α.UpdPoI(π, U) → π′ is a deterministic algorithm that computes an updated PoI π′, given the current PoI (π) and a set of update values U.
3.8.2 Dynamic Accumulator Collision-Resistance
Dynamic accumulators should ensure collision resistance as well as PoI correctness and unforgeability; the notions are similar to the corresponding notions for static accumulators, with the main change being that the dynamic PoI properties allow for the use of the UpdPoI function (to update the PoI after accumulating additional strings). In addition, dynamic accumulators should ensure consistency and correctness for the verified-update function (VerUpd).
We define collision resistance in this subsection, and define PoI correctness and unforgeability in the following subsections. But first, let us introduce notation for the result of multiple invocations of the Accum operation, which we use in all of these definitions.
Definition 3.17 (Multiple Accum notation). Consider an ordered set {Xj} for j = 1, . . . , l, where each Xj is itself a sequence of binary strings. Let α be a dynamic keyless accumulator and s0 ≡ 0^(λ+1). For l ≥ 1, define recursively:

sl ≡ α.Accum_{sl−1}.s(Xl)    (3.33)

And, using sl (for l ≥ 1), define:

α.Accum.s({X1, . . . , Xl}) ≡ sl
α.Accum.∆({X1, . . . , Xl}) ≡ α.Accum.∆_{sl−1}(Xl)    (3.34)
α.Accum.π({X1, . . . , Xl}) ≡ α.Accum.π({X1, . . . , Xl−1}) ++ α.Accum.π_{sl−1}(Xl)
We now use the multiple Accum notation to define dynamic collisions. Similar to Definition 3.13, we define both ordered collisions and unordered collisions. Note that for dynamic accumulators, there is another aspect of ordering: the order among the Accum events, and which messages were received in each Accum event. For unordered collisions, we ignore both aspects of ordering, while for ordered collisions, we respect both, i.e., we expect different digests if the same set of messages is reordered or split in a different way into Accum events.
Definition 3.18 (Dynamic accumulator collisions). Consider two ordered sets X = {X1, . . . , Xl} and X′ = {X′1, . . . , X′l′}, where Xi and X′i are sequences of binary strings. Let α be a dynamic accumulator such that:

α.Accum.∆(X) = α.Accum.∆(X′) ≠ ⊥

Let Im(X) denote the set of messages in X. If Im(X) ≠ Im(X′), then we say that (X, X′) is an unordered collision, or simply a collision, for α. If X ≠ X′, then we say that the pair (X, X′) is an ordered collision for α.
Finally, we define collision-resistance for dynamic accumulators.
Definition 3.19 (Dynamic Accumulator Collision-Resistance). A dynamic accumulator scheme α satisfies (a) (unordered) Collision-Resistance, or (b) Ordered Collision-Resistance, if for any PPT adversary A there is negligible probability for A to output a collision X = {X1, . . . , Xl} and X′ = {X′1, . . . , X′l′}. Namely:

Prob[ α.Accum.∆(X) = α.Accum.∆(X′) ≠ ⊥, where
  (a) Im(X) ≠ Im(X′) or (b) X ≠ X′,
  and pk, X and X′ were generated by:
  (pk, s0) ← α.Setup(1^λ); (X, X′) ← A(pk) ] ∈ Negl(λ)    (3.35)
3.8.3 Constructing a dynamic accumulator from a static accumulator
We conclude this section by presenting a construction of a dynamic accumulator αD from a static accumulator α. This construction is simple, and has an efficient accumulation function, producing a short digest which is also used as the state.
Figure 3.17: Constructing a dynamic accumulator αD from a static accumulator α. Three Accum events are shown, each accumulating three strings, with B1 ≡ α.∆(0^n, x1, x2, x3), B2 ≡ α.∆(B1, x4, x5, x6) and B3 ≡ α.∆(B2, x7, x8, x9).
The digest function of the construction is illustrated in Figure 3.17. The figure illustrates the digest function αD.∆ of the dynamic accumulator, for the case where the scheme is used in three Accum events, each time accumulating a set of three strings.
As illustrated in Figure 3.17, the computation of the digest and state functions of the αD accumulator is simple and efficient, as follows:

αD.∆_{0^(λ+1)}(X) ≡ α.∆(0^(λ+1) ++ X)    (3.36)
αD.s_{0^(λ+1)}(X) ≡ 1 ++ αD.∆_{0^(λ+1)}(X)    (3.37)
αD.∆_{1++s}(X) ≡ α.∆(1 ++ s ++ X)    (3.38)
αD.s_{1++s}(X) ≡ 1 ++ αD.∆_{1++s}(X)    (3.39)
Indeed, the αD accumulator is a good choice if the goal is to have a simple dynamic accumulator that ensures collision resistance. However, many applications of dynamic accumulators require efficient Proof-of-Inclusion and verifiable updates, and for the αD accumulator, these functions are quite simple but inefficient. We leave it to the reader to define these functions.
The αD accumulator is very similar to the Merkle-Damgård Construction, which we study next; and a variant of the αD accumulator is used by the Bitcoin blockchain, see subsection 3.10.2.
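A byte-granularity Python sketch of the αD digest and state functions (Equations 3.36-3.39), with SHA-256 standing in for the static accumulator digest α.∆ (an assumption); joining the strings with b"".join is a simplification, since a real design must encode the sequence of strings unambiguously:

```python
import hashlib

LAMBDA_BYTES = 32  # security parameter lambda, in bytes (sketch)

def static_delta(data: bytes) -> bytes:
    # stand-in for the static accumulator digest alpha.Delta (assumption)
    return hashlib.sha256(data).digest()

def initial_state() -> bytes:
    # the 0^{lambda+1} initial state, rounded up to bytes in this sketch
    return b"\x00" * (LAMBDA_BYTES + 1)

def dyn_accum(state: bytes, xs: list):
    """One Accum event of alpha^D: digest = alpha.Delta(state ++ X),
    new state = 1 ++ digest (Equations 3.36-3.39)."""
    delta = static_delta(state + b"".join(xs))
    return b"\x01" + delta, delta  # (new state s', digest)
```

Each new digest covers the previous state, and hence, recursively, all previously accumulated strings - the same chaining idea used, in variant form, by the Bitcoin blockchain.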
3.9 The Merkle-Damgård Construction
In this section, we present the well-known Merkle-Damgård design. We present two variants of this design. First, given a hash function h, we define the Merkle-Damgård static accumulator MD^h. Then, we modify MD^h to create the Merkle-Damgård hash-function construction h_MD of a CRHF, which allows arbitrary-length input strings, from a compression function h, i.e., a CRHF which supports only inputs of some fixed length (see Figure 3.19). Both Merkle-Damgård constructions are very similar to the αD dynamic accumulator of subsection 3.8.3.
3.9.1 The Merkle-Damgård Static Accumulator MD^h
The Merkle-Damgård Static Accumulator, MD^h, is a simple static accumulator, built using a CRHF h. The MD^h accumulator is mostly useful for computing the digest (MD^h.Accum.∆). For completeness, we also define the PoI (MD^h.Accum.Π) and the MD^h.VerPoI functions; however, these functions are absurdly inefficient. In practice, MD^h is only used to construct the Merkle-Damgård hash function h_MD, which requires only the (reasonably efficient) MD^h.Accum.∆ function (see subsection 3.9.2). Later, in subsection 3.9.3, we also use MD^h as the basis for a dynamic accumulator which we denote DMD^h; the DMD^h.Accum.Π and DMD^h.VerPoI functions are similarly absurdly inefficient, so we do not expect DMD^h to be actually used for any application. We just consider it useful as a simple example of a dynamic accumulator.
The MD^h.∆ function. Let us first define the digest function MD^h.Accum.∆, which produces the digest resulting from accumulating the sequence of strings {x1, . . . , xl}. For conciseness, we use the 'shorthand' notation MD^h.∆ instead of writing the 'full name' of the function, MD^h.Accum.∆.
We compute the digest MD^h.∆({x1, . . . , xl}) as follows:

MD^h.∆({x1, . . . , xl}) ≡
  For l = 1:  h(0^(n+1) ++ x1)
  For l > 1:  h(MD^h.∆({x1, . . . , xl−1}) ++ 1 ++ xl)    (3.40)

Figure 3.18 illustrates the computation of MD^h.∆ for the case of four input strings, {x1, . . . , x4}.
Figure 3.18: The digest function of the Merkle-Damgård accumulator MD^h, applied to the input sequence {x1, . . . , x4}: starting from the IV 0^n with flag bit 0, each step hashes the previous digest, the flag bit 1, and the next string.
Let us give a simple example for the computation of MD^h.∆.
Example 3.2. Let X = {11, 22, 33} be a sequence of three strings. We compute the Merkle-Damgård digest of X, using underlying hash h, by:

MD^h.∆({11, 22, 33}) = h(MD^h.∆({11, 22}) ++ 1 ++ 33)
  = h(h(MD^h.∆({11}) ++ 1 ++ 22) ++ 1 ++ 33)
  = h(h(h(0^(n+1) ++ 11) ++ 1 ++ 22) ++ 1 ++ 33)    (3.41)

For example, when we use the hash function h ≡ hsum (Example 3.1), we have:

MD^h.∆({11, 22, 33}) = hsum(hsum(hsum(0^(n+1) ++ 11) ++ 1 ++ 22) ++ 1 ++ 33)
  = hsum(hsum(2 ++ 1 ++ 22) ++ 1 ++ 33)
  = hsum(7 ++ 1 ++ 33)
  = 5    (3.42)
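At byte granularity (the separator bit of Equation 3.40 becomes a separator byte in this sketch), the MD^h digest can be implemented as follows, with SHA-256 assumed for h:

```python
import hashlib

N_BYTES = 32  # n/8, for an assumed n = 256 (SHA-256)

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def md_delta(xs: list) -> bytes:
    """MD^h digest per Equation 3.40: an all-zero IV plus flag byte 0x00
    before the first string, flag byte 0x01 before every later string."""
    d = h(b"\x00" * N_BYTES + b"\x00" + xs[0])
    for x in xs[1:]:
        d = h(d + b"\x01" + x)
    return d
```

The flag byte plays the role of the special separator bit discussed below: it distinguishes the first application of h (over the IV) from all later applications (over a digest).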
The Merkle-Damgård accumulator ensures collision resistance. Even before we define the other functions of MD^h, we can already show that MD^h ensures collision-resistance, since collision resistance depends only on the digest function (see Definition 3.14).
Lemma 3.4. If h is a CRHF, then MD^h is a collision-resistant accumulator.
Proof: Assume, to the contrary, that adversary A_MD ∈ PPT has non-negligible probability to output a collision. We use A_MD to construct an adversary A_h which has non-negligible advantage to find a collision for h. Specifically, A_h runs A_MD, and whenever A_MD outputs a collision for MD^h.∆, then A_h outputs a collision for h. Let us explain how.
Let the collision be (B, B′), i.e., B, B′ ∈ ({0,1}*)* ∧ (B ≠ B′) ∧ MD^h.∆(B) = MD^h.∆(B′). Denote the number of messages in B by l and the number of messages in B′ by l′, and, without loss of generality, assume l ≥ l′. The proof is by induction on l.
If l = 1 then l′ must also be one, hence MD^h.∆(B) = h(0^(n+1) ++ x1) and MD^h.∆(B′) = h(0^(n+1) ++ m′1). Since B ≠ B′, and in this case B = {x1} and B′ = {m′1}, it follows that x1 ≠ m′1. Let m̄ = 0^(n+1) ++ x1 and m̄′ = 0^(n+1) ++ m′1; it follows that m̄ ≠ m̄′. Hence, (m̄, m̄′) is a collision that A_h will output, as claimed.
Assume therefore that the claim holds for l = i, and we prove it holds also for l = i + 1. First assume l′ > 1. We have

MD^h.∆({x1, . . . , xi+1}) ≡ h(MD^h.∆(x1, . . . , xi) ++ 1 ++ xi+1)

and

MD^h.∆({m′1, . . . , m′l′}) ≡ h(MD^h.∆(m′1, . . . , m′l′−1) ++ 1 ++ m′l′)

and of course

MD^h.∆({x1, . . . , xi+1}) = MD^h.∆({m′1, . . . , m′l′})    (3.43)

Now, if the inputs to the hash are identical, then MD^h.∆(m′1, . . . , m′l′−1) = MD^h.∆(x1, . . . , xi) and xi+1 = m′l′; since B ≠ B′, the two shorter sequences must differ, and this would contradict the induction hypothesis. If the inputs to the hash are different, then this is a collision. Hence, in both cases, A_h can output a collision, as claimed.
It remains to consider the case where l′ = 1 (and we prove for l = i + 1, after we proved for l = i). Equation 3.43 still holds, but in this case we have

MD^h.∆({m′1}) ≡ h(0^(n+1) ++ m′1)

and

MD^h.∆({x1, . . . , xi+1}) = MD^h.∆({m′1}) = h(0^(n+1) ++ m′1)

Since the (n + 1)-th bit differs between these two inputs to h, but their outputs are the same, it follows that also in this case, A_h outputs a collision, and the claim is complete.
Collisions for ‘slightly modified’ Merkle-Damgård accumulator. The
design of the Merkle-Damgård accumulator may appear arbitrary, and some
minor modiőcations, simplifying the design or improving performance, may
appear harmless. However, changes to well-studied cryptographic mechanisms
are dangerous; we will demonstrate this by considering two subtle issues of the
design.
First, let us consider the string of 0n bits that we append to the őrst message
x1 in Equation 3.40, which is often referred to as an Initialization Vector (IV),
similarly to the term used for ‘modes of operation’ of block cipher (Section 2.8).
One obvious question is whether the value of the IV must be 0n , rather than,
say, 1n or any other n-bit string. Here, in fact, the answer is that the choice
of 0n is completely arbitrary; any n-bit string could be used, as long as it is a
fixed string.
But why must the IV be őxed? This seems wasteful. In particular, suppose
that |x1 | = n. Can we then save one hash operation, by using x1 as the n-bits
input used to hash x2 ? Unfortunately, this seemingly-minor change would allow
collisions, as the next exercise shows.
Exercise 3.11. Assume that |x1| = n, and consider a variant of the MD construction where we change Equation 3.40 so that for l = 1, we have: MD^h.∆({x1}) ≡ x1. This variant 'saves' a hash operation; however, show that it is not collision resistant.
Another possibly puzzling aspect of the construction is the fact that the input to the compression function includes one special bit, which is taken neither from the input strings xi nor from the IV or the previous digest. This bit is needed to handle the case where an attacker may know a preimage of the IV string with a particular property; an attacker may know such a preimage even if h is collision-resistant. See the next exercise.
Exercise 3.12. Show that collisions may be possible for a variant of the construction where Equation 3.40 is replaced by:

MD^h.∆({x1, . . . , xl}) ≡
  For l = 1:  h(0^n ++ x1)
  For l > 1:  h(MD^h.∆({x1, . . . , xl−1}) ++ xl)

Hint: Let h be a hash function with a known preimage of 0^n, i.e., where we know a value z such that h(z) = 0^n. Show a collision for MD^h. Then, show that such a CRHF h may exist: assuming some CRHF h′, use it to show that there exists a CRHF h with a known preimage. The proof of Lemma 3.4 may be helpful.
Defining the PoI functions of MD^h. As we mentioned above, the MD^h PoI functions, namely, the MD^h.Accum.Π function, computing the PoI values for the accumulated strings, and the MD^h.VerPoI function, verifying the PoI of a particular string, are both absurdly inefficient. Therefore, we do not expect these functions to be deployed in any application. Still, we find it useful to define them, as an additional example of an accumulator. In fact, we define an especially inefficient pair of PoI functions:

MD^h.Accum.Π({x0, . . . , xm−1})[j] ≡ {x0, . . . , xm−1}    (∀j : 0 ≤ j < m)

MD^h.VerPoI(∆, x, ID, m, π) ≡
  False  if x ∉ π
  False  if ∆ ≠ MD^h.∆(π)
  True   else

It is easy to see that this simple implementation ensures PoI correctness and unforgeability. However, as we mentioned, this implementation is absurdly inefficient. Specifically, the PoI consists of the entire accumulated sequence, and validation requires recomputing the digest. The following exercise shows a more efficient - but still absurdly inefficient - implementation.
Exercise 3.13. Present a more efficient design for the MD^h PoI functions. In particular, the length of the PoI should be about m digests, rather than the entire input sequence {x0, . . . , xm−1}.
3.9.2 The Merkle-Damgård Hash Function h_MD
In this subsection, we present h_MD, the Merkle-Damgård construction of a hash function; this construction is used in the design of multiple hash functions, including the MD5, SHA-1 and SHA-2 hash functions^10. The construction was proposed independently by Merkle [282] and Damgård [112]; I personally find Damgård's text easier to follow. Our presentation is different from the presentation in both papers, since we present the Merkle-Damgård hash function construction using the Merkle-Damgård accumulator, presented in the previous subsection.
^10 The SHA-2 standard defines multiple hash functions using a similar design but different sizes, including SHA-256, SHA-512, and others. SHA-3 uses a different design.

Figure 3.19: Compression function h: n′-bit input x ∈ {0,1}^n′ to n-bit output h(x) ∈ {0,1}^n, n < n′, and the values of n′ and n used by the MD5, SHA-1, SHA-256 and SHA-512 standard hash functions, which use the Merkle-Damgård construction of a hash function from a compression function.

Standard   n′     n
MD5        512    128
SHA-1      512    160
SHA-256    512    256
SHA-512    1024   512
The Merkle-Damgård construction builds the hash function h_MD from a compression function^11 h. The compression function h is a special hash function which maps binary strings of some length n′ into shorter strings of length n < n′. This is in contrast to most hash functions, including h_MD, which allow input strings of arbitrary length. See Figure 3.19.
One motivation to build a cryptographic hash function from a compression function is the 'cryptographic building blocks' principle (principle 8): The security of cryptographic systems should only depend on the security of a few basic building blocks. These blocks should be simple and with well-defined and easy-to-test security properties. Cryptographic hash functions are often viewed as a building block of applied cryptography, due to their simplicity and wide range of applications. However, compression functions are even simpler, since their input is restricted to fixed-length strings.
Let us proceed to explain the Merkle-Damgård hash function construction h_MD. We begin with a simplified version, where the input messages consist of an integral number of blocks, i.e., strings of (n′ − n − 1) bits each.
The Merkle-Damgård hash construction, simplified: messages consisting of integral blocks. We first describe a simplified Merkle-Damgård
construction, which is defined only for input (binary) strings which contain
an integer number of blocks of (n′ − n − 1) bits; let us denote this integer by
m ≡ |x|/(n′ − n − 1). Given a collision-resistant compression function h, we define the
following CRHF hMDwo (for inputs of m blocks):

hMDwo(x) ≡ MD^h.∆({x_i}_{i=0}^{m−1}), where x_i is defined by:
x_i ≡ x[i · (n′ − n − 1) : (i + 1) · (n′ − n − 1) − 1]
(3.44)
11 The term compression function may not be the best choice; it may have been clearer to
refer to such functions, with Fixed-Input-Length (FIL), as FIL-hash functions. However, the
use of ‘compression functions’ is entrenched in the literature; hence, it seems best to use it.
Also, note that we denote the compression function by h, although it is not mnemonic, since
we build hMD from h using the MD h accumulator, which we defined for a hash/compression
function h.
Let us explain the design of hMDwo. We first split the input x into m strings,
{x_i}_{i=0}^{m−1}, each containing (n′ − n − 1) bits. Then, we compute the MD^h
digest of the sequence {x_i}_{i=0}^{m−1}, as in Figure 3.18. We illustrate hMDwo in
Figure 3.20; in both the figure and above, we use x[i : j] to denote the substring
of x, from the ith bit to the jth bit.
[Figure 3.20 diagram: the input x is split into blocks of (n′ − n − 1) bits each (255 bits for SHA-256); the first block, x[0 : (n′ − n − 2)], is compressed by h together with the prefix bit 0 and the initial value 0^n, and each subsequent block is compressed with the prefix bit 1 and the previous n-bit output, the final output being hMDwo(x).]
Figure 3.20: The simplified Merkle-Damgård hash hMDwo(x), defined for inputs
x whose length is an integral number of ‘blocks’, each containing (n′ − n − 1)
bits, i.e., |x| = 0 mod (n′ − n − 1). The specific numbers used in this example
are n′ = 512, n = 256 (as for SHA-256), resulting in block length of 255 bits.
The input x is four blocks, i.e., 1020 bits.
It is easy to see, from Lemma 3.4, that if h is a collision-resistant compression
function, then hMDwo would be collision-resistant.
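The simplified construction can be sketched in Python. This is an illustration only: we use SHA-256 over the full 512-bit input as a stand-in compression function (not SHA-256's actual internal compression function), represent bit strings as '0'/'1' characters, and assume one particular ordering of the prefix bit, chaining value, and message block inside h.

```python
import hashlib

N_PRIME, N = 512, 256          # compression function: 512-bit input, 256-bit output
BLOCK = N_PRIME - N - 1        # 255-bit message blocks; one input bit is the 0/1 prefix

def h(bits: str) -> str:
    """Stand-in compression function: 512 bits -> 256 bits (sketch only).
    Inputs and outputs are strings of '0'/'1' characters."""
    assert len(bits) == N_PRIME
    digest = hashlib.sha256(bits.encode()).digest()
    return ''.join(f'{byte:08b}' for byte in digest)

def h_mdwo(x: str) -> str:
    """Simplified Merkle-Damgard hash, for inputs of an integral number of blocks."""
    assert len(x) % BLOCK == 0
    state = '0' * N                      # initial chaining value 0^n
    for i in range(len(x) // BLOCK):
        prefix = '0' if i == 0 else '1'  # domain-separation bit, as in Figure 3.20
        state = h(prefix + state + x[i * BLOCK:(i + 1) * BLOCK])
    return state
```

Each iteration consumes one 255-bit block, so the chaining value after the last block is the digest.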
The Merkle-Damgård hash construction with MD-strengthening.
The ‘full’ Merkle-Damgård hash function construction, hMD, includes an additional preprocessing step, usually referred to as MD-strengthening. MD-strengthening removes the restriction that the input x satisfies |x| = 0 mod (n′ − n − 1), required by hMDwo (from Equation 3.44). Namely, the MD-strengthening
step allows us to handle binary strings of arbitrary length as input.
MD-strengthening pads the message x with additional bits before hashing
it with hMDwo, so that the length of the resulting string, |pad_MD(x)|, would
be an integral number of blocks (of (n′ − n − 1) bits each). Namely, we pad x
with p = (n′ − n − 1) − [|x| mod (n′ − n − 1)] bits. For example, if n′ = 512 and
n = 256, as for SHA-256 (Figure 3.19), and x contains ten bytes (80 bits), then
we pad x with p = 255 − 80 = 175 bits. The same pad is used whenever
|x| mod 255 = 80, e.g., for |x| = 335 bits. The pad is computed using the padding function pad_MD(x), defined as:

pad_MD(x) ≡ x ++ [0^p ∨ bin(p)],
where bin(p) is the binary encoding of integer p,
and p ≡ (n′ − n − 1) − [|x| mod (n′ − n − 1)]
(3.45)
Given the hash function hMDwo of the Merkle-Damgård construction without strengthening, as defined in Equation 3.44, we can construct the Merkle-Damgård
hash hMD as: hMD(x) = hMDwo(pad_MD(x)). Alternatively, Equation 3.46 defines hMD directly from the definitions of MD^h.∆ (Equation 3.40) and pad_MD
(Equation 3.45).
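The padding step of Equation 3.45 can be sketched as follows; this is an illustration under the simplifying assumption that bin(p) always fits in p bits (true for realistic block sizes, e.g., p ≥ 9 for 255-bit blocks), with bit strings again represented as '0'/'1' characters.

```python
def bin_enc(p: int, width: int) -> str:
    """Binary encoding of the integer p, zero-padded on the left to `width` bits."""
    return format(p, 'b').zfill(width)

def pad_md(x: str, block: int = 255) -> str:
    """MD-strengthening pad (Equation 3.45): append p = block - (|x| mod block)
    bits, whose value is the binary encoding of p itself, i.e., 0^p OR bin(p).
    If |x| is already a multiple of the block length, a full extra block is added."""
    p = block - (len(x) % block)
    return x + bin_enc(p, p)
```

For the SHA-256 parameters (block length 255), an 80-bit input gets a 175-bit pad, and a 940-bit input gets an 80-bit pad, for a total of 1020 bits, i.e., four blocks.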
hMD(x) ≡ MD^h.∆({x_i}_{i=0}^{m}), where m ≡ ⌊|x|/(n′ − n − 1)⌋,
(∀i < m) x_i ≡ x[i · (n′ − n − 1) : (i + 1) · (n′ − n − 1) − 1], and
x_m ≡ pad_MD(x[m · (n′ − n − 1) :])
(3.46)
We illustrate hMD in Figure 3.21, for the SHA-256 example given above,
i.e., for n′ = 512, n = 256, |x| = 940 bits and, therefore, p = 255 − [940
mod 255] = 255 − 175 = 80.
[Figure 3.21 diagram: as in Figure 3.20, except that the last block is x[3(n′ − n − 1) :] ++ [0^p ∨ bin(p)], i.e., the remaining input bits followed by the MD-strengthening pad, and the final output is hMD(x).]
Figure 3.21: The Merkle-Damgård hash hMD(x), defined for arbitrary-length
input x. Shown here for an example using SHA-256 (n′ = 512, n = 256) and
input x of 940 bits, requiring four blocks, with padding of p = 255 − [940
mod 255] = 255 − 175 = 80 bits.
Given a collision-resistant compression function h, the Merkle-Damgård
hash construction produces a collision resistant hash function (CRHF). This
follows from Lemma 3.4.
Lemma 3.5. If h is a collision-resistant compression function, then hMD, as
defined in Equation 3.46, is a collision-resistant hash function (CRHF).
3.9.3 The Merkle-Damgård Dynamic Accumulator (DMD^h)
We complete this section by observing that the Merkle-Damgård accumulator can
easily be extended into a dynamic accumulator, DMD^h. The state of DMD^h
would simply be the last computed digest. Given one or more additional strings
as input, the new digest - and new state - can be computed just like the computation
of the digest over the non-first strings x_i (i > 1).
This digest computation is quite efficient. However, like MD^h, the PoI
mechanisms of DMD^h would be extremely inefficient. Therefore, practical
applications would use other, more efficient, dynamic accumulators, such as
ones based on the Merkle Tree design.
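The dynamic accumulator idea can be sketched directly: keep only the last digest as state, and extend the chain for each new string. Here SHA-256 over state ++ block is a stand-in for the compression step; the class and method names are ours, for illustration.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Stand-in compression step (sketch): hash of state || new string."""
    return hashlib.sha256(data).digest()

class DMD:
    """Sketch of the Merkle-Damgard dynamic accumulator DMD^h: the state is
    simply the last computed digest; accumulating more strings continues
    the chain exactly like the non-first blocks of MD^h."""
    def __init__(self):
        self.state = b'\x00' * 32            # initial digest 0^n (n = 256 bits)

    def accum(self, *strings: bytes) -> bytes:
        for s in strings:
            self.state = h(self.state + s)   # new digest = new state
        return self.state
```

Accumulating strings one at a time, or all at once, yields the same digest, which is exactly the dynamic property.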
3.10 Blockchains, Proof-of-Work (PoW) and Bitcoin
Blockchain schemes are an extension of dynamic accumulators, with many
important and exciting applications. Like dynamic accumulators, blockchains
accumulate a sequence of strings, which are collected over multiple Accum
events, each time accumulating an additional sub-sequence, which we call a
block. In principle, transactions can be arbitrary strings; however, when we
discuss blockchain schemes, we use the term transaction rather than simply
‘string’, since the first and most well-known application of blockchains is for
cryptocurrencies such as Bitcoin, where each transaction represents a payment
and the ledger is the complete list of transactions, establishing the ownership
of each bitcoin. There are also other applications of blockchains where the term
‘transaction’ is appropriate. For the same reason, we refer to the entire sequence
of transactions accumulated by the blockchain as a ledger.
Bitcoin cryptocurrency in subsection 3.10.2.
Like dynamic accumulators, every time we add a new block to the ledger,
using the Accum operation, the operation outputs a digest, Proof-of-Inclusion
(PoI) value for each transaction, update values for previous PoI values, and a
new state for the blockchain. Similarly to dynamic accumulators, blockchain
schemes should ensure collision-resistance and the security of the Proof-of-Inclusion values. Basically, collision resistance means that it is infeasible to
find two ledgers which produce the same digest; and PoI security means that
it is only feasible to find a valid tuple of a transaction x, a digest ∆ and a proof
PoI, if ∆ is the result of accumulating a ledger which includes transaction x.
Both requirements are the same as those defined for dynamic accumulators; see
Definition 3.18.
The big difference between blockchain schemes and accumulators is that
blockchains allow multiple participants to cooperate in accumulating blocks of
transactions, where different transactions in the same block often come from
different participants. Namely, the participants maintain a shared ledger of
transactions. This introduces several requirements; we informally describe the
most important ones below. We do not present rigorous definitions,
since such definitions involve the feasibility of an execution involving multiple
parties, which is beyond our scope. Note also that these requirements can only
be satisfied under appropriate models (assumptions), such as a maximal delay for
communication and benign participants continuously attempting to mine (add
a new block to the chain); defining such models is also beyond our scope.
Consistency. One important requirement is consistency. Namely, the ledgers
kept by different participants are required to be consistent. Consistency is
defined with respect to the sequence of digests kept by each participant.
Namely, let ∆p(i) be the digest that participant p receives after accumulating
the ith block. Then for every participant p′ which also accumulated i or more
blocks, it should hold that ∆p(i) = ∆p′(i). Formally, we use the special
value ⊥ to identify the blocks not yet accumulated by a participant; i.e., if
∆p(i)[t] = ⊥ then it means that p did not accumulate block i up to time
t, while if ∆p(i)[t] ≠ ⊥ then p has already accumulated block i by time t.
Consistency boils down to two basic requirements: first, we require that blocks
are accumulated consecutively; and second, we require that the digests of all
participants are always consistent. Namely, for all benign participants p, p′,
block-numbers i, i′ , time t and time t′ > t, we require that:
(t ≥ t′ ) ∧ (∆p (i)[t] ̸= ⊥) ⇒ ∆p (i)[t] = ∆p (i)[t′ ]
(∆p (i)[t] ̸= ⊥) ∧ (∆p′ (i)[t′ ] ̸= ⊥) ⇒ ∆p (i)[t] = ∆p′ (i)[t′ ]
(3.47)
(3.48)
Consistency ensures that the blockchain can only grow, i.e., is immutable,
and is shared among all participants. For example, a blockchain may be used
to keep track of the ownership of objects belonging to some participants; a
participant p who owns an object X at time t, may sign a transaction x which
transfers ownership of X to participant p′ . Once the transaction x is added
to the blockchain, i.e., included in a block added to the blockchain, then the
participants consider object X as belonging to p′ . Consistency is key to such
applications, as the ownership of an item must always be well defined.
Consistency alone is not very useful. For example, consider a trivial
blockchain where benign participants always output ∆p(i)[t] = ⊥, i.e., blocks are
never added; consistency is satisfied in a trivial sense. Additional requirements
are necessary to make blockchains useful. One such additional requirement is
chain growth, i.e., requiring that the chain grows over time.
Chain growth. The chain growth requirement ensures that new blocks are
mined (added to the blockchain) over time, at least at a specified minimal
chain-growth rate g. The minimal chain-growth rate g is specified in terms of
blocks per time unit, e.g., blocks per second. We require that for every benign
participant p and every time t such that the blockchain of p accumulated (at
least) i blocks at time t, at any later time t′ > t the blockchain of p
would contain at least i + ⌊g · (t′ − t)⌋ blocks. Namely, the blockchain
scheme ensures chain growth of rate g, if:
For all i ∈ ℕ, t′ > t > 0 and every benign participant p:

[∆p(i)[t] ≠ ⊥] ⇒ ∆p(i + ⌊g · (t′ − t)⌋)[t′] ≠ ⊥
(3.49)
Bounded delay transactions. The final property we discuss is a bounded
delay T for adding a transaction to the blockchain. Namely, suppose that
at time t a benign participant tries to add a new transaction to the ledger.
Then, the new transaction should be added by time t + T (or earlier). Note that a
particular blockchain application may require transactions to satisfy additional
requirements in order to be considered ‘valid’; e.g., a Bitcoin transaction
specifying payment of a bitcoin is ignored unless it is properly signed by the
entity which, according to the ledger, currently owns that bitcoin.
3.10.1 Blockchain Design: Permissioned and Permissionless Blockchains
Blockchain schemes have many diverse designs, and use different mechanisms
to accumulate transactions into blocks which are added to the ledger, and to
Applied Introduction to Cryptography and Cybersecurity
3.10. BLOCKCHAINS, POW AND BITCOIN
219
ensure the blockchain requirements above, e.g., consistency and chain-growth,
as well as other requirements. Blockchain schemes mostly belong to one of two
categories: permissioned blockchains and permissionless blockchains.
Permissioned blockchains. In permissioned blockchains, only specific, authorized parties can add a block to the blockchain. In a typical permissioned
blockchain scheme, the digest is signed by an authorized party; some
schemes allow any one of multiple parties to sign digests, and other schemes
require multiple authorized parties to sign every new digest.
Specifically, if A.s is an authorized (private) signing key, then the ith
digest, denoted ∆_i, is sent together with a signature σ_i ≡ Sign_{A.s}(∆_i).
Note that this requires all participants to know the corresponding public
validation key, A.v.
Permissionless blockchains. Permissionless blockchains are egalitarian: every participant may try to add a block to the blockchain, by following
a process called mining. The participant that succeeds in mining a given
block is selected randomly, and the probability that a participant wins
is proportional to the resources it allocates to mining. To motivate
participants to allocate resources to mining, permissionless
blockchains provide a reward to a participant that mines a block. In
permissionless blockchain cryptocurrencies such as Bitcoin, the reward
is given in the cryptocurrency, e.g., as an amount of bitcoins. In fact, in
Bitcoin, the mining rewards are the only mechanism which increases the
amount of bitcoins in circulation. This is the reason that this process is
referred to as mining; mining bitcoins, like mining gold, is a randomized
process where the chance of success depends on the amount of allocated
resources. We discuss Bitcoin in the following subsection.
3.10.2 The Bitcoin blockchain and cryptocurrency
Bitcoin is the first and the most well-known cryptocurrency, and also the first
and most well-known application of blockchains. In Bitcoin, the ledger defines
which bitcoin numbers are currently valid, and who is the owner of each bitcoin.
The owner is identified only by the owner’s public key; this allows owners
to transfer a bitcoin to other participants, by signing an appropriate transaction
using the private key corresponding to the public key ‘owning’ this bitcoin.
Specifically, given a valid bitcoin number c and a ledger X, the Bitcoin
function owner returns the public validation key v ≡ owner(c, X) which is
said to own coin c. If owner(c, X) = ⊥, then we say that c is not a valid coin
number according to ledger X.
A Bitcoin transaction x is a triple (x.coins, x.v, x.σ) where x.coins is a list
of bitcoin numbers, x.v is the public key to which the transaction transfers
ownership of the bitcoins in x.coins, and x.σ is a digital signature. We say that
x is a valid transaction for ledger X, if there is some public key v such that
(∀c ∈ x.coins) v = owner(c, X) and Verify_v((x.coins, x.v), x.σ) = True, i.e.,
Applied Introduction to Cryptography and Cybersecurity
220
CHAPTER 3. INTEGRITY: FROM HASHING TO BLOCKCHAINS
x.σ is a valid signature over (x.coins, x.v) using the public key v. If X ′ is the
ledger after adding a new block which includes transaction x, then the ownership
over x.coins will be passed to x.v, i.e., (∀c ∈ x.coins)x.v = owner(c, X ′ ).
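The ownership and validity rules above can be sketched as a toy ledger. This is an illustration only: the `genesis` mapping and the abstract `verify` callback are our additions (real Bitcoin uses ECDSA signatures and a more elaborate transaction format).

```python
def owner(c, ledger, genesis):
    """Return the public key owning coin number c after applying the ledger
    (a list of transactions, in order), or None if c is not a valid coin."""
    v = genesis.get(c)                 # initial assignment of coin numbers to keys
    for coins, new_v, sig in ledger:   # each transaction passes all its coins to new_v
        if c in coins:
            v = new_v
    return v

def valid_tx(tx, ledger, genesis, verify):
    """A transaction (coins, v, sig) is valid for the ledger if a single key
    owns all the listed coins and sig verifies over (coins, v) under that key."""
    coins, new_v, sig = tx
    owners = {owner(c, ledger, genesis) for c in coins}
    if len(owners) != 1 or None in owners:
        return False
    (v,) = owners
    return verify(v, (tuple(coins), new_v), sig)
```

Once a valid transaction is added to the ledger, `owner` reports the new key as the owner of each transferred coin, matching the text.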
Pseudonymity, Anonymity and Privacy in Bitcoin. Bitcoin provides
pseudonymity, since transactions specify only the public validation key, and do
not specify any identifier of the payer or recipient. The same person or organization may use multiple different public validation keys. For example, Charlie,
whose (validation, signing) key pair is (C.v, C.s), may transfer ownership of
bitcoin number c to Don, whose key pair is (D.v, D.s), by using C.s to sign the
transaction (c, D.v). There is however no need to know who is the payer or the
payee.
This property is often misunderstood to imply that Bitcoin ensures anonymity
and privacy, which motivates the use of bitcoins by people concerned about
being associated with their transactions, including criminals. In particular,
ransom payments, to ransomware or otherwise, and other payments for illegal
products and services, are often made in bitcoins. However, while Bitcoin
transactions use pseudonyms rather than names, this does not provide real
anonymity. In fact, since all Bitcoin transactions are documented on the public
ledger, it is easy to trace the movement of bitcoins between different
owning public keys. This fact is often used by researchers and law-enforcement
agencies to track the flow of bitcoins, which often allows identification of the
user, i.e., deanonymization.
The Bitcoin Proof-of-Work (PoW). In Bitcoin, mining a new block
requires a solution to a difficult computational problem, namely, a Proof-of-Work (PoW). There are other applications of PoW in cybersecurity, e.g.,
in defenses against some Denial-of-Service attacks; and there are also other
approaches for mining in permissionless blockchains, e.g., Proof-of-Stake, where
the mining probability is proportional to the amount of cryptocurrency held
by each participant. We first discuss the general concept of PoW schemes, and
then focus on the Bitcoin PoW and block-mining operation.
Intuitively, a PoW allows one party, the worker, to solve a challenge using
an approximately known amount of computational resources, resulting in a proof
of this success, which can be efficiently verified by anyone.
Proof-of-Work schemes belong in this chapter on hash functions, both due
to their use in permissionless blockchains, in particular Bitcoin, and since
their best-known implementation, the one used in Bitcoin, is based on a hash
function.
Notice that we used the general term ‘computational resources’ and not a
more specific term such as computational power, i.e., number of computations
per second, which is the resource used by the Bitcoin PoW. Indeed, some
PoW proposals focus on other resources, e.g., storage, or on a combination of
resources, e.g., time and storage. However, from this point, let us focus on PoW
based on computational power, as used by Bitcoin.
Definition 3.20 (Proof-of-Work (PoW) - intuitive definition). A PoW scheme
PoW consists of two efficient algorithms: PoW.solve and PoW.Validate.

PoW.solve: The PoW.solve algorithm receives three inputs: a challenge c ∈ C_D,
a (random) nonce value r ∈ {0, 1}^n, and a work-amount w ∈ [1, 2^n]. The
PoW.solve function outputs an n-bit binary string, to which we refer as
the solution.

PoW.Validate: The PoW.Validate algorithm has four inputs: the challenge
c, the nonce r, the required work w ∈ [1, 2^n], and a purported solution
π ∈ {0, 1}*. The PoW.Validate algorithm returns true or false.

A PoW scheme PoW is secure, if finding a solution, i.e., the runtime of
PoW.solve, is distributed randomly with an average of 2^λ/w computations of a given
difficulty; typically, 2^λ/w computations of a given hash function h, where λ is
the output length of h. The scheme is correct if:

(∀c, r, w) PoW.Validate(c, r, w, PoW.solve(c, r, w)) = true
(3.50)
Proof-of-work mechanisms are often implemented using a cryptographic
hash function. A typical implementation, which is a simplification of the one in
Bitcoin, is the PoW scheme PoW_B, defined as:

PoW_B.Validate: On input (c, r, w, π), return true if h(c ++ r ++ π) ≤ w. Otherwise, return false.

PoW_B.solve: On input (c, r, w), repeatedly compute x ≡ h(c ++ r ++ π) for
different π ←$ {0, 1}^n, aborting and returning π when x ≤ w.
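A minimal sketch of PoW_B, with h instantiated as SHA-256 (so λ = 256 and the expected number of hash computations is about 2^256/w); the function names mirror the scheme above but are otherwise our choice.

```python
import hashlib
import secrets

def h_int(data: bytes) -> int:
    """Hash to an integer, so the solution test is a simple comparison."""
    return int.from_bytes(hashlib.sha256(data).digest(), 'big')

def pow_solve(c: bytes, r: bytes, w: int) -> bytes:
    """Repeatedly try random candidates pi until h(c ++ r ++ pi) <= w.
    Expected number of hash computations is roughly 2**256 / w."""
    while True:
        pi = secrets.token_bytes(32)
        if h_int(c + r + pi) <= w:
            return pi

def pow_validate(c: bytes, r: bytes, w: int, pi: bytes) -> bool:
    """Anyone can check a purported solution with a single hash computation."""
    return h_int(c + r + pi) <= w
```

Note the asymmetry: solving takes many hash computations on average, while validation takes exactly one.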
[Figure 3.22 diagram: block i contains a coinbase transaction c_i followed by regular transactions x_{i,j}; a Merkle tree (MT) accumulates them into ∆′_i, which is hashed (h) together with the previous digest ∆_{i−1} (0^λ for the first block) and a nonce n_i, producing the digest ∆_i.]
Figure 3.22: The (simplified) Bitcoin blockchain, illustrated for three blocks.
Each block begins with the coinbase transaction (c_i), followed by regular transactions (two for blocks 1 and 2, one for block 3). The Merkle-tree digest of the
transactions of block i is hashed with the digest of block i − 1 (or with 0^λ for
the genesis block), and with a nonce n_i. The value of the nonce should ensure
that ∆_i is not higher than the current threshold.
The Bitcoin blockchain. We illustrate the Bitcoin blockchain in Figure 3.22.
The reader may notice the similarity to the dynamic accumulator scheme
illustrated in Figure 3.17.
The transactions of block i, where i ≥ 1, are accumulated using a Merkle
tree MT scheme. We use ∆′_i to denote the digest of the transactions in block
i, from which we later compute the digest ∆_i of the blockchain with i blocks.
To compute ∆_i, we apply a hash function h to ∆′_i and two other values: the
previous digest ∆_{i−1} and a nonce value n_i, namely ∆_i ≡ h(∆′_i ++ ∆_{i−1} ++ n_i).
The value of the nonce n_i is chosen randomly by the miner, until they find a
nonce value n_i which ensures that ∆_i ≤ w, where w is the current work amount,
following the Bitcoin Proof-of-Work (PoW) mechanism as described above.
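The chained mining step can be sketched as follows; again SHA-256 stands in for h, and counting nonces up from zero is a simplification of the miner's random search.

```python
import hashlib

def mine_block(prev_digest: bytes, txs_digest: bytes, w: int):
    """Search for a nonce n_i such that
    Delta_i = h(Delta'_i ++ Delta_{i-1} ++ n_i) satisfies Delta_i <= w,
    as in Figure 3.22; returns (Delta_i, n_i)."""
    nonce = 0
    while True:
        digest = hashlib.sha256(
            txs_digest + prev_digest + nonce.to_bytes(8, 'big')).digest()
        if int.from_bytes(digest, 'big') <= w:
            return digest, nonce
        nonce += 1
```

Anyone can verify the block with a single hash computation, by recomputing h over the same three values and comparing against the threshold w.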
Bitcoin operations and concerns: costs of PoW, rewards and adjustable work-amount w. Solving the proof-of-work (PoW) problem requires considerable computational resources; this is intentional, to ensure that
all entities have a fair chance to add new blocks. Specifically, the probability
that a specific miner succeeds in being the first to solve the PoW and to add
the next block is roughly the fraction of the hash computations performed by
this miner, out of the total number of hash computations done by all miners
(for adding the block to the current blockchain). This prevents an attacker
from adding most or all of the blocks, which would have allowed the attacker
to abuse the system, e.g., by excluding specific transactions.
However, this means that adding blocks requires the use of significant
amounts of energy. Most of this energy does not even result in a new block,
since only one miner can succeed in adding the new block. This wasteful use of
large amounts of energy is one of the criticisms against Bitcoin, and motivates
the use of blockchains which are more energy-efficient. There are several
alternative designs which avoid this waste of energy, including permissioned
blockchains and blockchains using other mining mechanisms, e.g., proof of stake.
To incentivize entities to invest the effort (and energy) required for mining
(solving a PoW), Bitcoin rewards them by granting them a certain number of
Bitcoins every time they succeed in adding a new block to the chain. This
reward is calculated through a clever ‘reward rule’, designed to make the reward
sufficient but not excessive. The reward consists of a number of newly-minted
bitcoins, and transaction fees paid by the payers of transactions in the block.
The reward (new coins and fees) is allocated to the public key that the miner
includes in a special transaction called the coinbase transaction, which is included
in every Bitcoin block, as illustrated in Figure 3.22.
The number of new bitcoins rewarded upon mining is cut in half once
per 210,000 blocks mined, and will eventually become zero. For Bitcoin
to remain a viable system after that point, transaction fees should provide
sufficient incentive to motivate miners. The fee is indicated by the payer in each
transaction, and this amount is moved from the payer to the miner of the block
in the chain which includes this transaction; this is in addition to the amount of
bitcoins transferred to the payee. Different transactions offer different fees, and
miners may prefer transactions with higher fees; each block has limited size, so
miners may not be able to include all available transactions in their blocks.
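The halving schedule described above can be sketched directly. We use Bitcoin's actual initial reward of 50 bitcoins (a well-known parameter, though not stated in the text), expressed in satoshi, i.e., 10^-8 bitcoin; the function name is our choice.

```python
HALVING_INTERVAL = 210_000           # blocks between halvings (from the text)
INITIAL_SUBSIDY = 50 * 100_000_000   # initial reward, in satoshi (10^-8 bitcoin)

def block_subsidy(height: int) -> int:
    """Newly-minted coins for the block at the given height: the initial
    subsidy, halved once per 210,000 blocks, reaching zero after 64 halvings
    (when a further right-shift would always give zero)."""
    halvings = height // HALVING_INTERVAL
    return 0 if halvings >= 64 else INITIAL_SUBSIDY >> halvings
```

Summing the schedule over all halving periods shows why the total supply stays below the famous 21 million bitcoin cap.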
The work-amount parameter w of the PoW is adjusted automatically by
a feedback mechanism in Bitcoin, whose goal is to ensure that new blocks
will be added at a reasonable, but not excessive, rate. Namely, if blocks are
added more quickly, then the threshold w is decreased - making it harder to
mine new blocks, i.e., slowing down the rate of adding blocks. This should
maintain a stable rate of mining new blocks, and therefore balance between
the overhead of block creation and the delay until a new transaction appears
on the chain. The mining rates can be impacted by multiple factors, including
the energy costs of mining, the value of the reward and fees obtained by a
lucky miner, the likelihood of mining a block added to the chain (which depends
on competition), and the costs and efficiency of mining hardware.
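The feedback rule can be sketched after Bitcoin's actual retargeting mechanism (adjusting every 2016 blocks toward a 10-minute spacing, with the change clamped to a factor of 4 - real parameters, though not stated in the text); integer arithmetic keeps the huge thresholds exact.

```python
TARGET_SPACING = 600   # desired seconds per block (Bitcoin aims at ~10 minutes)
WINDOW = 2016          # blocks between adjustments (Bitcoin's choice)

def retarget(w: int, actual_seconds: int) -> int:
    """Feedback rule: if the last WINDOW blocks took less time than expected,
    shrink the threshold w (so h(...) <= w is harder to satisfy); if more,
    grow it. The change is clamped to a factor of 4 in either direction."""
    expected = TARGET_SPACING * WINDOW
    actual = max(expected // 4, min(expected * 4, actual_seconds))
    return w * actual // expected
```

Note the direction: blocks arriving too fast shrink w, which increases the expected work 2^λ/w and slows mining back down.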
3.11 Lab and additional exercises
Lab 3 (Checksum and CRC Collisions). In this lab, we experiment with attacks
against a system using an insecure hash function, specifically, the Internet
Checksum error-detection code as a hash function. Namely, to hash input x,
we compute h(x), where h is the Internet Checksum function. We show that h
is not a secure hash function; specifically, we find collisions, i.e., inputs x ≠ x′
such that h(x) = h(x′). Finally, we create a partially-chosen collision, i.e., a
collision between the CRC/checksum-hash over two given (and very different)
documents, by ‘filling in’ a designated area left undefined in each of the two
documents. Note that this implies universal forgery of signatures computed using
the Hash-then-Sign construction with CRC/checksum as the hash function.
As for the other labs in this textbook, we will provide Python scripts for
generating and grading this lab (LabGen.py and LabGrade.py). If not yet
posted online, professors may contact the author to receive the scripts. The
lab-generation script generates random challenges for each student (or team),
as well as solutions which will be used by the grading script. We recommend
making the scripts available to the students, as an example of how to use the
cryptographic functions. It is easy, and permitted, to modify these scripts to use
other languages/libraries, or to modify and customize them as desired.
1. Let h denote the Internet Checksum function (of the input file padded
by a single 1 bit and then a minimal number of zero bits to result in a
legitimate input to the checksum function). Write a program to compute
h. In your lab-input folder, in sub-folder checksum, find files f1a.txt, f1b.txt
and h1a.txt. File h1a.txt is the result of applying h over file f1a.txt; use it
to test your program. Then, compute the result of applying h over f1b.txt;
name the resulting file h1b.txt and place it in the lab-solutions folder.
2. Find and submit a collision for h. Namely, place in sub-folder checksum
of the lab-solutions folder three files, f2a.txt, f2b.txt and h2.txt, such that
h2.txt is the result of applying h over both files (f2a.txt and f2b.txt).
3. In the sub-folder checksum of the lab-input folder, find files f3a.txt and
h3a.txt, which contains the Internet Checksum of f3a.txt. In f3a.txt, you
will see a section designated “to change”. You should create a new file
f3as.txt which will be identical to f3a.txt, and have the same Internet
Checksum (the one in h3a.txt), except that it would have different contents
in the “to change” section. Upload f3as.txt to the lab-solutions folder.
4. In the sub-folder checksum of the lab-input folder, find file f4b.html. In
f4b.html, you will find a conditional Javascript statement which compares
two copies of the string “to change” to each other (resulting, of course,
in ‘true’). You should create a new file f4bs.html which will be identical
to f4b.html, and have the same Internet Checksum, except that it would
have different contents in one of the “to change” strings. Upload f4bs.html
to the lab-solutions folder. Open the two files in the browser and compare
the results!
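A sketch of the raw Internet Checksum (RFC 1071) may help with item 1; note it omits the lab's 1-then-zeros padding step, which you should add yourself. The checksum is just a folded sum of 16-bit words, so it is invariant under reordering words — already an easy source of collisions.

```python
def internet_checksum(data: bytes) -> int:
    """RFC 1071 Internet Checksum: the one's-complement of the
    one's-complement sum of the data, taken as 16-bit big-endian words
    (odd-length inputs are zero-padded)."""
    if len(data) % 2:
        data += b'\x00'
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold the end-around carry
    return (~total) & 0xFFFF
```

For example, swapping the two 16-bit words of a four-byte input leaves the checksum unchanged, giving a collision.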
Exercise 3.14 (XOR-hash). Consider the following hash function, defined
for input messages consisting of some number l of n-bit ‘blocks’, i.e., a total of l · n
bits, where n is the length of the digest (n = |h(m)|). Given such a message
m containing l · n bits, let us denote the ith block (of n bits) by m_i, i.e.,
m = m_1 ++ m_2 ++ ... ++ m_l and (∀i)|m_i| = n.
Define hash function h for such l · n bit messages as: h(m_1 ++ ... ++ m_l) =
⊕_{i=1}^{l} m_i. Show that h does not have each of the following properties, or present
a convincing argument why it does:
1. Collision-resistance (CRHF), see Section 3.2.
2. Second-preimage resistance (SPR), see Section 3.3.
3. Preimage resistance, i.e., h is not a one-way function (OWF), see Section 3.4.
4. Bitwise randomness extraction (BRE), see subsection 3.5.2.
5. Secure MAC, when h is used in the HMAC construction, see subsection 4.6.3.
Solution to part 4 (randomness extraction): For simplicity, we present the
solution for even n, i.e., n = 2µ where µ is an integer. Adversary A selects
input message m = 0^{2n} and mask M = 0^µ ++ 1^µ ++ 0^µ ++ 1^µ. Let y_b and y_{1−b}
be the values computed by BRE_{A,h}(b, n); the reader can confirm that y_{1−b} is
random, while y_b is of the form 0^µ ++ r, where r is a random string of µ bits.
On input (y_0, y_1, m, M), the adversary A returns:

A(y_0, y_1, m, M) = 0 if the first µ bits of y_0 are 0^µ, and 1 otherwise
(3.51)

The reader should confirm that the adversary is correct with overwhelming
probability. Hence, h is not a bitwise randomness extractor (BRE).
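The failure of collision-resistance (part 1) is easy to demonstrate concretely. This sketch uses byte-sized blocks (rather than n-bit blocks) for convenience; since XOR is commutative, reordering the blocks never changes the digest.

```python
def xor_hash(m: bytes, n: int = 4) -> bytes:
    """The XOR-hash of Exercise 3.14, with block size n in bytes for
    convenience: the digest is the XOR of the n-byte blocks of m."""
    assert len(m) % n == 0
    digest = bytes(n)
    for i in range(0, len(m), n):
        digest = bytes(a ^ b for a, b in zip(digest, m[i:i + n]))
    return digest
```

Any two messages whose blocks are permutations of each other collide; so do all messages consisting of a repeated pair of identical blocks, which all hash to the all-zero digest.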
Exercise 3.15 (Insecure double-input hash). Let h be a ‘compression function’,
i.e., a cryptographic hash function whose input is of length 2l and output is
of length l. Let h′ : {0, 1}^{2l·n} → {0, 1}^l extend h to inputs of length 2l · n, as
follows: h′(m_1 ++ ... ++ m_n) = ⊕_{i=1}^{n} h(m_i), where (∀i = 1, ..., n)|m_i| = 2l. For
each of the following properties, assume h has the property, and show that h′
may not have the same property; or, if you believe h′ does retain the property,
argue why it does. The properties are:
1. Collision-resistance.
2. Second-preimage resistance.
3. One-wayness (preimage resistance).
4. Randomness extraction.
Would any of your answers change, if h and h′ have a random public key as an
additional input?
Exercise 3.16 (Insecure XOR hash). Consider messages of 2n blocks of l bits
each, denoted m_1 ... m_{2n}, and let h_c be a secure compression function, i.e., a
cryptographic hash function from 2l bits to l bits. Define hash function h for
such messages of 2n blocks of l bits, as: h(m_1 ... m_{2n}) = ⊕_{i=1}^{n} h_c(m_{2i}, m_{2i−1}).
Show that h does not have each of the following properties, although h_c has the
corresponding property, or present a convincing argument why it does:
1. Collision-resistance.
2. Second-preimage resistance.
3. One-wayness (preimage resistance).
4. Bitwise randomness extraction.
5. Secure MAC, when h is used in the HMAC construction.
Exercise 3.17 (Insecure cascade combining of hash). It is proposed to combine two hash functions by cascade, i.e., given hash functions h_1, h_2, we define
h_{12}(m) = h_1(h_2(m)) and h_{21}(m) = h_2(h_1(m)). Suppose collisions are known for
h_1; what does this imply for collisions in h_{12} and h_{21}?
Exercise 3.18 (Insecure committee-designed combined hash). Recently, weaknesses were found in a few cryptographic hash functions such as h_{MD5} and h_{SHA1},
and as a result, there were many proposals for new functions. Dr. Simpleton
suggests combining the two into a new function, h_c(m) = h_{SHA1}(h_{MD5}(m)),
whose output length is 160 bits. Prof. Deville objects; she argues that hash functions should have longer outputs, and suggests a complex function, h_{666}, whose
output size is 666 bits. A committee, set up to decide between these two, proposes,
instead, to XOR them into a new function: f_X(m) = [0^{506} ++ h_c(m)] ⊕ h_{666}(m).
1. Present counterexamples showing that each of these may not be collision-resistant.
2. Present a design where we can be sure that finding a collision is definitely
not easier than finding one in hSHA1 and in hc .
3. Repeat the first part for bitwise randomness-extraction.
Exercise 3.19 (Ordered-collision resistance). Define ordered-collision resistance
for accumulators, and show that every accumulator α which is resistant to
unordered collisions, is also resistant to ordered collisions.
Exercise 3.20 (SPR accumulator).

1. Define second-preimage resistant accumulators.

2. Show that every collision-resistant accumulator is also a SPR accumulator.

3. Let α be a collision-resistant accumulator. Use α to construct another
accumulator α′, which will be SPR but not collision resistant.
Exercise 3.21 (Insecure no-prefix Merkle Tree). Let MT′ be a variant on the
Merkle Tree design, which is identical to the one described in the text, except that
it does not add the one-byte prefixes of 0x00 or 0x01 as used in Equations 3.23
and 3.24, respectively, using hash function h. Let (KG, S, V) be a secure (FIL)
signature scheme. Let Ss^{MT′}(X) = Ss(MT′(X)) follow the ‘hash then sign’
paradigm, to turn (KG, S, V) into a signature scheme for sequences of strings.
Show that S^{MT′} is not a secure, existentially-unforgeable signature scheme, by
presenting an efficient adversary (program) that outputs a forged signature.
Exercise 3.22 (HMAC simplification). Consider the following slight simplification of the popular HMAC construction: h′k(m) = h(k ++ h(k ++ m)), where
h : {0,1}* → {0,1}^n is a hash function, k is a random, public n-bit key, and
m ∈ {0,1}* is a message.

1. Assume h is a CRHF. Is h′k also a CRHF?

□ Yes. Suppose h′k is not a CRHF, i.e., there is some adversary A′ that
finds a collision (m′1, m′2) for h′k, i.e., h′k(m′1) = h′k(m′2). Then at least
one of the following pairs of messages (m1,1, m2,1), (m1,2, m2,2) is a collision for h, i.e., either h(m1,1) = h(m2,1) or h(m1,2) = h(m2,2) (or both).
The strings are: m1,1 = ______, m1,2 = ______,
m2,1 = ______, m2,2 = ______.

□ No. Let ĥ be some CRHF, and define h(m) = ______. Note
that h is also a CRHF (you do not have to prove this, just design h so this
would be true and easy to see). Yet, h′k is not a CRHF. Specifically, the following two messages m′1 = ______, m′2 = ______
are a collision for h′k, i.e., h′k(m′1) = h′k(m′2).
2. Assume h is an SPR hash function. Is h′k also SPR?

□ Yes. Suppose h′k is not SPR, i.e., for some l, there is some algorithm A′
which, given a (random, sufficiently-long) message m′, outputs a collision,
i.e., m′1 ≠ m′ s.t. h′k(m′) = h′k(m′1). Then we define algorithm A which,
given a (random, sufficiently long) message m, outputs a collision, i.e.,
m1 ≠ m s.t. h(m) = h(m1). The algorithm A is:

Algorithm A(m):
{
  Let m′ = ______
  Let m′1 = A′(m′)
  Output ______
}

□ No. Let ĥ be some SPR, and define h(m) = ______. Note
that h is also an SPR (you do not have to prove this, just design h so
this would be true and easy to see). Yet, h′k is not an SPR. Specifically,
given a random message m′, then m′1 = ______ is a collision,
i.e., m′ ≠ m′1 yet h′k(m′) = h′k(m′1).
3. Assume h is a OWF. Is h′k also a OWF?

□ Yes. Suppose h′k is not a OWF, i.e., for some l, there is some algorithm
A′ which, given h′k(m′) for a (random, sufficiently-long) message m′,
outputs a preimage, i.e., m′1 s.t. h′k(m′) = h′k(m′1). Then we define
algorithm A which, given h(m) for a (random, sufficiently long) message
m, outputs a preimage, i.e., m1 s.t. h(m) = h(m1). The algorithm A
is:

Algorithm A(m):
{
  Let m′ = ______
  Let m′1 = A′(m′)
  Output ______
}

□ No. Let ĥ be some OWF, and define h(m) = ______. Note
that h is also an OWF (you do not have to prove this, just design h so
this would be true and easy to see). Yet, h′k is not a OWF. Specifically,
given a random message m′, then m′1 = ______ is a preimage,
i.e., h′k(m′1) = h′k(m′).
4. Repeat similarly for bitwise randomness extraction.
Exercise 3.23 (Insecure prepend-key MAC). Consider the following construction: h′k(m) = h(k ++ m), where h : {0,1}* → {0,1}^n is a hash function, k is
a secret n-bit key, and m ∈ {0,1}* is a message. Assume you are given some
SPR hash function ĥ : {0,1}* → {0,1}^n̂; you can use n̂ which is smaller than
n. Using ĥ, construct hash function h, so that (1) it is ‘obvious’ that h is also
SPR (no need to prove), yet (2) h′k(m) = h(k ++ m) is (trivially) not a secure
MAC. Hint: design h s.t. it becomes trivial to find k from h′k(m) (for any m).
1. h(x) = ______.

2. (Justification) h is an SPR, since ______.

3. (Justification) h′k(m) = h(k ++ m) is not a secure MAC, since: ______.
Exercise 3.24 (HMAC is secure under ROM). Show that the HMAC construction is secure under the Random Oracle Model (ROM), when used as a PRF,
MAC and KDF.

Exercise 3.25 (HMAC is insecure using CRHF). Show counterexamples showing that even if the underlying hash function h is collision-resistant, its (simplified) HMAC construction hmac_k(m) = h(k ++ h(k ++ m)) is insecure when used
as any of PRF, MAC and KDF.
Exercise 3.26 (Hash-tree with efficient proof of non-inclusion). The Merkle
tree allows efficient proof of inclusion of a leaf (data item) in the tree. Present
a variant of this tree which allows efficient proof of either inclusion or non-inclusion of an item with a given ‘key’ value. In this tree, each item consists of
two strings, a key and data. Assume all data items are given together, sorted by
their key values; there is no need to build the tree dynamically or extend it. Your
solution may ‘expose’ one or two additional data items beyond the one queried.
Note: try to provide a solution which is efficient in the number of hash operations
required for verification (the number should be about one more than in the
regular Merkle tree).

Hint: You can see an example of proof of non-inclusion and its application in
the NSEC3 record of DNSSEC (RFC 5155 [257]), and a graphical illustration
in Figure 8.19.
Exercise 3.27. The Merkle-tree scheme may also be useful for privacy, when
some recipients should have access only to some files, e.g., if each file mi contains
data which is private to user i. Note, however, that CRHFs - and Merkle-trees
- may not ensure confidentiality. Collision-resistance does not ensure that the
value of h(m) will not expose some information about m.

Let h be a (keyed or keyless) CRHF. Use h to design another hash function g,
s.t. (1) g is also a CRHF, yet (2) g exposes one or more bits of its input. Explain
why this implies that the Merkle-tree construction does not guarantee privacy.
In particular, explain why the PoI of one message may expose information
about other messages.
Exercise 3.28. This question is about the digest and PoI for the Merkle tree
scheme. For concreteness, we will also refer to the trivial (and insecure) hsum
function (Example 3.1).

1. Compute the Merkle-tree digest for the input sequence B1 = {10, 20}, as a formula for an arbitrary hash function h, and as a value for hsum. Solution:

MT.∆(B1) = h(h(10) ++ h(20))    (3.52)

When using hsum, we have MT.∆(B1) = 3.
2. Compute the Merkle-tree digest for input B2 = {30, 40, 50, 60, 70, 80, 90, 100}.
Present the digest as a formula for an arbitrary hash function h, and as a
value for hsum.

Solution:

MT.∆(B2) = h[ h( h(h(30) ++ h(40)) ++ h(h(50) ++ h(60)) ) ++
              h( h(h(70) ++ h(80)) ++ h(h(90) ++ h(100)) ) ]

When using hsum, we have MT.∆(B2) = 7.
3. Compute the PoI for the input value 50 in B2. Present the PoI as a
formula for an arbitrary hash function h, and as a value for hsum.

Solution: The value 50 was the third input in B2, therefore the PoI is:

MT.PoI(B2, 3) = MT.∆({70, 80, 90, 100}) ++ MT.PoI({30, 40, 50, 60}, 3)
              = h[ h(h(70) ++ h(80)) ++ h(h(90) ++ h(100)) ] ++ h(h(30) ++ h(40)) ++ h(60)

For the special case of hsum, we have MT.PoI(B2, 3) = 7 ++ 7 ++ 6.
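These hsum values can be checked mechanically. The sketch below assumes one plausible reading of the toy hsum of Example 3.1 - the repeated decimal digit-sum (digital root) of the concatenated input - which reproduces the values 3 and 7 above; the prefix bytes of the full Merkle construction are omitted for brevity:

```python
def hsum(s: str) -> str:
    """One plausible reading of the toy hsum: repeated decimal digit-sum
    (digital root) of the input string, returned as a one-digit string."""
    v = sum(int(c) for c in s)
    while v >= 10:
        v = sum(int(c) for c in str(v))
    return str(v)

def merkle_digest(items, h):
    """Merkle-tree digest of a power-of-two item list, hashing the
    concatenation of child digests (prefix bytes omitted for brevity)."""
    level = [h(str(x)) for x in items]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

assert merkle_digest([10, 20], hsum) == '3'                          # B1
assert merkle_digest([30, 40, 50, 60, 70, 80, 90, 100], hsum) == '7' # B2
assert merkle_digest([70, 80, 90, 100], hsum) == '7'  # first PoI element
```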
Chapter 4
Authentication: Message
Authentication Code (MAC),
Blockchain and Signature Schemes
Cybersecurity and cryptography address different goals related to threats to
information and communication. The most well-known goals are confidentiality,
integrity, authentication and availability. In cryptography, the terms integrity
and authentication are used mostly as synonyms, both meaning the validation
that communication and information comes from a specific entity, or from one of
a specific set of entities, leaving us with the sassy acronym CIA, often referred
to as the CIA triad, for confidentiality, integrity, and authentication or availability.¹

So far, we have mostly focused on confidentiality, to which we dedicated
all of Chapter 2. We also briefly introduced, in subsection 1.5.1, signature
schemes, which are asymmetric (public key) authentication schemes. In this
chapter, we discuss symmetric (shared key) authentication schemes, called
Message Authentication Code (MAC). MAC schemes are efficient - much more so
than comparably-secure signature schemes.

People often expect that encryption will ensure authenticity as a side-product
of ensuring confidentiality. Therefore, let us begin this chapter by discussing
the use of encryption for authentication, and show that this can be vulnerable -
although, later, in Section 4.7, we also discuss authenticated encryption schemes,
designed to ensure, indeed, both confidentiality and authenticity.

¹ Originally, the ‘A’ in CIA referred to authentication; indeed, in computer security,
‘integrity’ has a different meaning: protection of a computer or other system from corruption.
‘Availability’ was later identified as another basic goal, with the less-sassy acronym CIAA.
Both acronyms are also used sometimes with accountability replacing availability and/or
authentication. Oh well, all worthy goals!
4.1 Encryption for Authentication?
As we discussed in the previous chapter, encryption schemes ensure confidentiality, i.e., an attacker observing an encrypted message (ciphertext) cannot learn
anything about the plaintext (except its length). Sometimes, people expect
encryption to also be useful for authentication and integrity.

Some encryption schemes indeed have some integrity properties. One important property is non-malleable encryption. Intuitively, a non-malleable
encryption scheme prevents the attacker from modifying the message in a
‘meaningful way’. See definitions and secure constructions of non-malleable
encryption schemes in [126]. However, be warned: achieving, and even defining,
non-malleability is not as easy as it may seem!

In fact, many ciphers are malleable; often, an attacker can easily modify a
known ciphertext c into c′ ≠ c s.t. m′ = Dk(c′) ≠ m (and also m′ ≠ ERROR).
Furthermore, often the attacker can ensure useful relations between m′ and m.
An obvious example is when using the (unconditionally-secure) one-time-pad
(OTP), as well as when using Output-Feedback (OFB) mode.
Exercise 4.1. Mal is a Man-in-the-Middle, able to intercept and modify messages from Alice to her bank. In this question we explore the ability of Mal to
modify the ciphertext (encrypted message) which Alice sends to her bank.
Alice's message is composed of the following fields, in the given order, each consisting of eight bytes: operation, reason, amount, payee, payer, password, date.
When operation = 1 and password contains the correct (four-byte) password
for payer, the bank transfers amount from payer to payee, listing it in the bank
ledger with the given (four-byte) reason. Use identifiers 1 for Alice, 2 for Bob,
and 3 for Mal.

1. Suppose the parties use One-Time-Pad (OTP) encryption, and Mal intercepts the ciphertext c sent from Alice to her bank, which is an encryption of a
request to transfer $3 to Bob. Explain how Mal can modify the ciphertext,
causing the bank to transfer (preferably, a larger amount) to Mal rather
than to Bob. Assume Mal knows all the details (amount, payee, etc.).

2. How would your response change when using, instead of OTP, the following
modes of operation of DES, with block size of 64 bits (8 bytes): (a) OFB,
(b) CFB, (c) CBC, (d) CTR, (e) ECB. Note: You may not be able to find
a successful attack for some of the modes - but don't give up too easily!
Solution for first part: with OTP, the ith ciphertext bit is computed by
ci = mi ⊕ ki, and decrypted by mi = ci ⊕ ki. Therefore, flipping ciphertext
bit ci flips the corresponding plaintext bit mi; since Mal knows the plaintext m,
she can XOR c with m ⊕ m̂ to cause the bank to decrypt any message m̂ of her choice.
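This bit-flipping attack can be sketched as follows, with the message layout shrunk to two 2-byte fields (amount, payee) instead of the exercise's eight-byte fields; the field sizes and values here are illustrative only:

```python
import os

def otp(key: bytes, data: bytes) -> bytes:
    # One-time pad: encryption and decryption are the same XOR operation.
    return bytes(k ^ d for k, d in zip(key, data))

# Simplified layout, two 2-byte fields: amount, payee.
m = bytes([0, 3, 0, 2])            # transfer $3 to Bob (identifier 2)
key = os.urandom(len(m))
c = otp(key, m)                    # what Alice sends; Mal intercepts it

# Mal knows m, so she XORs the ciphertext with m XOR m_target -
# no knowledge of the key is needed:
m_target = bytes([99, 99, 0, 3])   # large amount, payee Mal (identifier 3)
delta = bytes(a ^ b for a, b in zip(m, m_target))
c_forged = bytes(a ^ b for a, b in zip(c, delta))

assert otp(key, c_forged) == m_target  # the bank decrypts Mal's message
```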
We conclude that encryption schemes may not suffice to ensure authentication. This motivates us to introduce, in the next section, another symmetric-key
cryptographic scheme, which is designed explicitly to ensure authentication and
integrity: the Message Authentication Code (MAC). Later, in Section 4.7, we
also discuss how to achieve confidentiality together with authenticity.
4.2 Message Authentication Code (MAC) schemes
Message Authentication Code (MAC) schemes are simple, symmetric-key
cryptographic functions, designed to verify the authenticity and integrity of
information (messages), namely, to detect that a message was not sent by
an ‘allowed sender’ (or was modified after it was sent). A MAC function
MACk(m) has two inputs, a secret n-bit (symmetric) key k, and a
message m. As illustrated in Figure 4.1, MAC schemes use the same key k
to generate the authenticator (tag) and to validate the authenticator (tag);
usually, upon receiving a message m with a purported authenticator σ, the
recipient computes σ′ ← MACk(m) and verifies m by confirming that σ = σ′.
Notice that this implies that MAC schemes, and authenticators, are usually
deterministic. Intuitively, given m and MACk(m) for a secret, random key k, it is
infeasible for a (computationally-bounded) attacker to find another message
m′ ≠ m together with the value of MACk(m′).
Typically, as shown in Fig. 4.1, a secret, symmetric MAC key k is shared
between two (or more) parties. Each party can use the key to authenticate a
message m, by computing an authentication tag MACk(m). Given a message
m together with a previously-computed tag T, a party verifies the authenticity
of the message m by re-computing MACk(m) and comparing it to the tag T;
if equal, the message is valid, i.e., the tag must have been previously computed
by the same party or another party, using the same secret key k.

In a typical use, one party, say Alice, sends a message m to a peer, say Bob,
authenticating m by computing and attaching the tag T = MACk(m). Bob
confirms that T = MACk(m), thereby validating that Alice sent the message,
since he shares k only with Alice. See Fig. 4.1.
MAC schemes are related to signature schemes, an asymmetric (public
key) authentication mechanism which we introduced in subsection 1.5.1. With
signature schemes, each party, e.g., Alice, generates a private signing key A.s
and a corresponding public verification key A.v. In Definition 1.6, we define
an existentially unforgeable signature scheme; intuitively, an adversary who is
given the public key A.v, and can choose messages m1, m2, . . . and receive the
corresponding signatures σ1 = S.Sign_A.s(m1), σ2 = S.Sign_A.s(m2), . . ., cannot
find a different message m′ ∉ {m1, m2, . . .} with a corresponding signature σ′
such that S.Verify_A.v(m′, σ′) = True.
Note that we usually use the term authenticator to refer to the output
MACk(m) of the MAC function, i.e., if σ = MACk(m), then σ is the authenticator of m using shared key k. Other terms for the authenticator are tag
and signature; we warn that this last term (‘signature’) may cause confusion
between MAC schemes and signature schemes, and therefore we recommend
(and try) to avoid it.
Repudiation vs. deniability While both signature schemes and MAC
schemes are used to authenticate messages, there is a critical difference: a valid
MAC can be computed by any entity that knows the shared key. Consider
the scenario in Figure 4.1; even after Carl successfully validates m̂ using σ̂, he
[Figure 4.1 (diagram): Alice, Bob and Carl share key k; the MitM has no key.
Alice sends m with σ ← MACk(m); Bob checks σ = MACk(m) and outputs m.
Bob sends m̂ with σ̂ ← MACk(m̂), which Carl validates and outputs. The MitM's
injected pair (m′ ≠ m, σ′) is discarded, since σ′ ≠ MACk(m′).]
Figure 4.1: Using a Message Authentication Code (MAC) scheme, and a shared
key k, to authenticate messages. The Man-in-the-Middle (MitM) adversary
can observe message m and its authenticator σ = MACk(m), but cannot forge
the MAC, i.e., generate a pair m′, σ′ such that σ′ = MACk(m′), for m′ ≠ m.
Note, however, that if a key k is shared among more than two entities, e.g.,
Alice, Bob and Carl, then each entity can authenticate messages using k; e.g.,
when Carl receives m̂, he cannot know if m̂ was sent by Alice or Bob (or even
by Carl himself), except by using some indication within m̂, e.g., if m̂ includes
sender identification. Due to this property, we say that MAC schemes allow
repudiation; for non-repudiation, use signatures instead of MAC.
cannot know if m̂ was sent by Alice or Bob. Of course, the sender identity may
be indicated as part of the message m̂; however, there is nothing preventing
an entity knowing the shared key k from putting a different identity in the
message and computing the MAC. This is in contrast to signature schemes,
where the private signing key A.s must be used to produce a valid signature
σ for a given message m, namely, knowing the public validation key A.v does
not allow forgery of a message as if it was signed using A.s. This property of
signature schemes is often referred to as non-repudiation, as it prevents the
sender of a (signed) message from repudiating (denying) having sent it. Note
that in some situations, e.g., for a whistle-blower, we may have the opposite
goal, i.e., of preventing the recipient from proving the identity of the sender to
a third party; this goal is usually referred to as deniability.
Validating that a given tag T correctly authenticates a message m, i.e., that T =
MACk(m), requires the ability to compute MACk(·), i.e., knowledge of the
shared secret key k. However, this implies the ability to compute (valid) tags
for any other message. This allows the entity that computed the tag to later
deny having done so, since it could also have been computed by other entities.
Therefore, MAC schemes do not ensure non-repudiation - and, exactly because
of that, allow deniability.
Namely, we should use a signature scheme when we need to ensure non-repudiation, i.e., to ensure that after validating a message with the key associated
with Alice, we can assume that Alice indeed sent the message.
When we do not need non-repudiation, and especially when deniability is
important, then we should use a MAC scheme.
4.3 Message Authentication Code (MAC): Definitions
A MAC scheme is a function F with the following unforgeability property: an
attacker, which does not know the key k and is not given Fk(m) for a given
message m, is unable to find the value of Fk(m) with better chance than a
random guess. The definition has a lot in common with the definition of signature
schemes and their existential-unforgeability requirement, see subsection 1.5.1;
in particular, we allow the adversary to obtain the MAC values of any other
message. The definition follows. For concreteness, we focus on MAC whose
output is an l-bit binary string.
Definition 4.1 (MAC). An l-bit Message Authentication Code (MAC) over
domain D is a function F : {0,1}* × D → {0,1}^l, such that for all PPT
algorithms A, the advantage ε^MAC_F,A(n) is negligible in n, i.e., smaller than any
positive polynomial for sufficiently large n (as n → ∞), where:

ε^MAC_F,A(n) ≡ Pr_{k←$ {0,1}^n} [ (m, Fk(m)) ← A^{Fk(·|except m)}(1^n) ] − 1/2^l    (4.1)

where the probability is taken over the random choice of an n-bit key, k ←$
{0,1}^n, as well as over the coin tosses of A.
Oracle. The expression A^{Fk(·|except m)} refers to the output of the adversary
A, where during its run, the adversary can give arbitrary inputs x ≠ m and
receive the corresponding values of the function, Fk(x). We say that the
adversary A has an oracle to the MAC function Fk(·) (excluding the message
m). See Definition 1.3.
The advantage function ε^MAC_F,A(n) and key length n. The definition is for an
l-bit MAC, i.e., the output is always a binary string of length l. Hence, a random
guess at the MAC of any input message m would be correct with probability
2^−l. Therefore, we defined the advantage ε^MAC_F,A(n) as the probability that the
adversary finds a correct MAC value for a message m (not input to the oracle),
minus the ‘base success probability’ of 2^−l. The function F is a (secure) MAC
if this advantage ε^MAC_F,A(n) is negligible.
The key length is denoted n, and is not bounded. The ‘advantage’ of the
adversary over a random guess should be negligible in n, i.e., converge to zero as
n grows. In practice, MAC functions are used with a specific key length, which
is believed to be ‘long enough’ to foil attacks (by attackers with reasonable
resources and time).
Output length - fixed (l) or as key length (n). In some other definitions
of MAC schemes, the output length is also n, i.e., the same as the key. In this case,
the 1/2^l term becomes 1/2^n, which is negligible in n, and hence can be ignored.
Input domain. Notice that the definition allows an arbitrary input domain D
for the MAC function. The two most commonly used domains are D = {0,1}*,
i.e., the set of all binary strings (of unbounded length), and D = {0,1}^lin, i.e.,
the set of all binary strings of some fixed length lin. Of course, lin may also
be the same as l. A MAC function whose input is the set of binary strings of a
fixed length is called a FIL-MAC, i.e., Fixed Input Length MAC. In contrast, a
MAC function whose input is the set of all binary strings is called a VIL-MAC,
i.e., Variable Input Length MAC.
To ‘warm up’, let us show two examples of insecure MAC designs. Our
examples follow the definition, i.e., the attacker is allowed to ask for the
MAC of some messages, and then has to come up with a different message
and a correct MAC for that message. Notice that the definition does not
require the ‘forged’ message to be ‘meaningful’; this means that it isn't always
trivial to exploit a vulnerable MAC. Following the conservative design principle
(Principle 3), the definition does not attempt to predict which forgeries will be
meaningful, instead forbidding any forgery.
Our first example is a very simple FIL-MAC construction which we denote
XOR^E; the construction is defined for a given n-bit block cipher E. The XOR^E
construction is defined for inputs which are exactly two blocks. Its output is
a single block, which is the result of XORing the ‘encryption’ of each block.
Namely, for given key k:

XOR^E_k(m) ≡ Ek(m[1:n]) ⊕ Ek(m[n+1:2n])
The following example shows that XOR^E is not a secure MAC. We recommend
you try to show it yourself before reading the solution.
Example 4.1. To show that XOR^E is not a secure MAC, observe that:

XOR^E_k(m) = Ek(m[1:n]) ⊕ Ek(m[n+1:2n])
           = Ek(m[n+1:2n]) ⊕ Ek(m[1:n])
           = XOR^E_k(m[n+1:2n] ++ m[1:n])
           = XOR^E_k(m̄), where m̄ = m[n+1:2n] ++ m[1:n]

Therefore, for every 2n-bit input message m, we have XOR^E_k(m) =
XOR^E_k(m̄), where m̄ is simply the message with the two blocks switched. This
suffices to conclude that XOR^E does not satisfy the definition of a secure MAC.

Let us present a specific adversary A that ‘breaks’ XOR^E, i.e., shows it
does not meet the definition of a secure MAC. First, A asks for the MAC of
m01 = 0^n ++ 1^n (a block of zeros followed by a block of ones); we could have used
almost any 2n-bit message, the choice of m01 is just for simplicity.

As per the definition, A receives the MAC of m01, i.e., XOR^E_k(m01) =
Ek(0^n) ⊕ Ek(1^n). Then A returns the pair (m10, XOR^E_k(m01)), where m10 =
1^n ++ 0^n. This is a successful forgery, since:

XOR^E_k(m10) = Ek(1^n) ⊕ Ek(0^n) = Ek(0^n) ⊕ Ek(1^n) = XOR^E_k(m01)
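This forgery can be reproduced concretely. The sketch below stands in a keyed, truncated SHA-256 for the block cipher Ek (our assumption for illustration; it is not a real permutation, but only the symmetry of XOR matters for the attack):

```python
import hashlib

N = 16  # block size in bytes (standing in for n bits)

def E(k: bytes, block: bytes) -> bytes:
    # Stand-in for the block cipher Ek: keyed hash truncated to one block.
    return hashlib.sha256(k + block).digest()[:N]

def xor_mac(k: bytes, m: bytes) -> bytes:
    # The XOR^E construction: XOR of the 'encryptions' of the two blocks.
    assert len(m) == 2 * N
    a, b = E(k, m[:N]), E(k, m[N:])
    return bytes(x ^ y for x, y in zip(a, b))

k = b'secret-key-bytes'
m01 = bytes(N) + b'\xff' * N    # zero-block followed by one-block
m10 = b'\xff' * N + bytes(N)    # the same blocks, swapped

tag = xor_mac(k, m01)           # obtained from the MAC oracle
assert m10 != m01 and xor_mac(k, m10) == tag  # forgery succeeds
```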
Our second example is of a ‘hairy’ function fk, defined ‘from scratch’, i.e.,
not using an underlying block cipher. Such ‘hairy’ designs may appear to be
good candidates for MAC - but all too often, they are vulnerable. Showing the
vulnerability can be tricky, which motivates (1) the use of a strong definition
and (2) the use of standard, secure constructions from basic building blocks,
following the ‘building blocks’ principle (Principle 8).
Example 4.2. Consider fk(m) = k³·m + k²·m² + k·m³ mod p, where p
is a known number. Let us show, in a simple yet detailed way, that this hairy
expression is not a secure MAC. Notice that our solution does not involve any
attempt to find k!

The idea of the solution is simple. Recall the most basic properties of modular
arithmetic (Section A.2). From these it follows that fk(m) = fk(m + i·p),
for any integer i. In particular, fk(m) = fk(m + p). If this isn't clear, try to
substitute some small integers for m, k and p, and then read again Section A.2
to see why these equations hold.

This is the crux of the solution, but let us complete the details, by presenting
an adversary A s.t. ε^MAC_F,A(n) is non-negligible; in fact, we'll show that
ε^MAC_F,A(n) = 1 − 2^−l, for every n. Actually, the fact that we show an advantage of
(almost) 1 is quite typical of these exercises, although, of course, it suffices to
show any non-negligible advantage.

The adversary A is the following simple algorithm:

1. Let m′ be some arbitrary value, e.g. p, or 1, or 0, or whatever you like.
2. Let x ← Fk(m′), i.e., call the oracle on m′.
3. Let m = m′ + p.
4. Output (m, x).

Let us explain why ε^MAC_F,A(n) = 1 − 2^−l. Given oracle access to fk(·) (for
some random k), the adversary gave some input m′, and received x ≡ fk(m′),
i.e., in our case, x = k³·m′ + k²·m′² + k·m′³ mod p. Then the adversary
outputs (m, x), where m = m′ + p. Obviously, m ≠ m′, so the condition
on the use of the oracle is satisfied; on the other hand, x = fk(m′) =
fk(m′ + p) = fk(m). Therefore, the expression is true for any k and we have:
Pr_{k←$ {0,1}^n} [ (m, Fk(m)) ← A^{Fk(·|except m)}(1^n) ] = 1, proving that ε^MAC_F,A(n) =
1 − 2^−l.
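A quick numeric check of this adversary, with an arbitrary small modulus p of our choosing, for illustration:

```python
import random

def f(k: int, m: int, p: int) -> int:
    # The 'hairy' candidate MAC: f_k(m) = k^3*m + k^2*m^2 + k*m^3 mod p.
    return (k**3 * m + k**2 * m**2 + k * m**3) % p

p = 101                      # known, public modulus
k = random.randrange(1, p)   # secret key, unknown to the adversary

# Adversary: query the oracle on m', then forge for m = m' + p.
m_prime = 7
x = f(k, m_prime, p)         # the single oracle query
m = m_prime + p              # a *different* message with the same tag
assert m != m_prime and f(k, m, p) == x  # valid forgery, for any key k
```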
4.4 Applying MAC Schemes
A MAC function is a simple cryptographic mechanism, which is quite easy
to use; however, it should be applied correctly - with an understanding of its
properties and without expecting it to provide other properties. We now discuss
a few aspects of the usage of MAC schemes, and give a few examples of common
mistakes.
Confidentiality A MAC function is a great tool to ensure integrity and
authenticity; however, MAC may not ensure confidentiality. Namely, MACk(m)
may expose information about the message m. This is sometimes overlooked
by system designers; for example, early versions of the SSH protocol used the
so-called ‘Encrypt and Authenticate’ method, where to protect message m,
the system sent Ek(m) ++ MACk(m); one problem with this design is that
MACk(m) may expose information about m.

Notice that while confidentiality is obviously not a goal of MAC schemes,
one may hope that it is derived from the authentication property. To refute such
false hopes, it is best to construct a counterexample - a very useful technique
to prove that claims about cryptographic schemes are incorrect. The counterexamples are often very simple - and often involve ‘stupid’ or ‘strange’ designs,
which are specially crafted to meet the requirements of the cryptographic
definitions - while demonstrating the falseness of the false assumptions. Here is
an example showing that MAC schemes may expose the message.
Example 4.3 (MAC does not ensure confidentiality). To show that MAC
may not ensure confidentiality, we construct such a non-confidential MAC
function F^NcM (where NcM stands for ‘Non-confidential MAC’). Our construction uses an arbitrary secure MAC scheme F (which may or may not
ensure confidentiality). Specifically:

F^NcM_k(m) = Fk(m) ++ LSb(m)

where LSb(m) is the least-significant bit of m. Surely, F^NcM does not ensure
confidentiality, since it exposes a bit of the message (we could have obviously
exposed more bits - even all bits!).

On the other hand, we now show that F^NcM is a secure MAC. Assume, to
the contrary, that there is some adversary A^NcM that succeeds (with significant
probability) against F^NcM. We use A^NcM to construct an attacker A that
succeeds with the same probability against F. Attacker A works as follows:

1. When A^NcM makes a query q to F^NcM, then A makes the same query
to F, receiving Fk(q); it then returns F^NcM_k(q) = Fk(q) ++ LSb(q), as
expected by A^NcM.

2. When A^NcM outputs its guess (m, T), where T is its guess for F^NcM_k(m) =
Fk(m) ++ LSb(m), and m was not used in any of A^NcM's queries, then A
outputs m together with T except for its least-significant bit; namely, if
T = F^NcM_k(m) = Fk(m) ++ LSb(m), then A outputs (m, Fk(m)).
It follows that F^NcM is a secure MAC if and only if F is a secure MAC.
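A concrete sketch of F^NcM, using HMAC-SHA256 as a stand-in for the underlying secure MAC F (the construction is the example's; the instantiation is our assumption):

```python
import hashlib
import hmac

def mac(k: bytes, m: bytes) -> bytes:
    # Stand-in for an arbitrary secure MAC F_k (here: HMAC-SHA256).
    return hmac.new(k, m, hashlib.sha256).digest()

def mac_ncm(k: bytes, m: bytes) -> bytes:
    # F^NcM_k(m) = F_k(m) ++ LSb(m): still unforgeable, but leaks a bit of m.
    lsb = m[-1] & 1
    return mac(k, m) + bytes([lsb])

k = b'shared-secret'
tag = mac_ncm(k, b'pay $1 to account 7')  # ASCII '7' is odd: last bit is 1
assert tag[-1] == 1  # an eavesdropper learns a plaintext bit from the tag
```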
We show, later on (subsection 4.5.1), that every PRF is a MAC. The
following exercise shows that the reverse is not true: a MAC is not necessarily
a PRF. This exercise is similar to the example above.

Exercise 4.2 (Non-PRF MAC). Show that a MAC function is not necessarily
a pseudorandom function (PRF).

Solution outline: Let F be an arbitrary secure MAC scheme that outputs
n-bit tags. Construct a MAC scheme F′, which outputs 2n-bit tags, as follows:

F′k(m) = Fk(m) ++ 0^n

Clearly, F′ is not a PRF, because an adversary A has a significant chance of
distinguishing between an output of F′ and a random 2n-bit string (since the
second half of the output of F′ is all zeros). Yet, you can show that F′ is a
secure MAC if and only if F is a secure MAC, using a similar method to the
one in Example 4.3. Therefore, a MAC function is not necessarily a PRF.
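The distinguisher implicit in this outline can be sketched as follows, again with HMAC-SHA256 standing in for the arbitrary MAC F (our illustrative choice):

```python
import hashlib
import hmac
import os

N = 32  # tag length (bytes) of the underlying MAC F (HMAC-SHA256 stand-in)

def f_prime(k: bytes, m: bytes) -> bytes:
    # F'_k(m) = F_k(m) ++ 0^n : still a secure MAC, but clearly not a PRF.
    return hmac.new(k, m, hashlib.sha256).digest() + bytes(N)

def distinguisher(output: bytes) -> str:
    # Guesses 'f_prime' iff the second half of the output is all zeros.
    return 'f_prime' if output[N:] == bytes(N) else 'random'

k = os.urandom(16)
assert distinguisher(f_prime(k, b'msg')) == 'f_prime'
# A random 2n-byte string has an all-zero second half only with prob. 2^-256:
assert distinguisher(os.urandom(2 * N)) == 'random'
```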
Key separation Another problem with the SSH ‘Encrypt and Authenticate’
design, Ek(m) ++ MACk(m), is the fact that the same key is used for both
encryption and MAC. This can cause further vulnerability; an example is shown
in the following simple exercise.

Exercise 4.3 (Separate keys for separate functions). Show that the use of
the same key for encryption and MAC in Ek(m) ++ MACk(m) can allow an
attacker to succeed in forgery of messages - in addition to the potential loss of
confidentiality shown above - even when E and MAC are secure (encryption
and MAC, respectively).

Solution outline: Let E′, MAC′ be secure encryption and MAC functions,
respectively. Define E_{kE,kM}(m) = E′_{kE}(m) ++ kM and MAC_{kE,kM}(m) = kE ++
MAC′_{kM}(m). Obviously, the use of E_{kE,kM}(m) ++ MAC_{kE,kM}(m) exposes
both keys and is therefore insecure. However, using the method of Example 4.3,
you can show that E, MAC are also secure encryption and MAC functions,
respectively. See also Exercise 4.18.
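The contrived schemes of this solution outline can be sketched concretely; the XOR-stream E′ and the HMAC-based MAC′ below are our illustrative stand-ins for arbitrary secure schemes:

```python
import hashlib
import hmac
import os

def enc(kE: bytes, kM: bytes, m: bytes) -> bytes:
    # Contrived encryption E_{kE,kM}(m) = E'_{kE}(m) ++ kM :
    # E' here is a simple XOR stream derived from kE (illustration only).
    stream = hashlib.sha256(kE).digest()[:len(m)]
    return bytes(a ^ b for a, b in zip(m, stream)) + kM

def mac(kE: bytes, kM: bytes, m: bytes) -> bytes:
    # Contrived MAC: MAC_{kE,kM}(m) = kE ++ MAC'_{kM}(m).
    return kE + hmac.new(kM, m, hashlib.sha256).digest()

kE, kM = os.urandom(16), os.urandom(16)
m = b'transfer $5 to 2'
c, t = enc(kE, kM, m), mac(kE, kM, m)

# When the same key pair is used for both, an eavesdropper reads kM from
# the ciphertext and kE from the tag - both keys leak:
assert c[-16:] == kM and t[:16] == kE
```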
Another motivation to separate the keys used by a given cryptographic function/scheme is to reduce the quantity of plaintext available to the
cryptanalyst, and especially the amount of known and chosen plaintext. These
considerations result in the principle of key separation.
Principle 10 (Key Separation). Use separate, independently-pseudorandom
keys for: (1) each different cryptographic scheme/function, (2) different types
and/or different sources of plaintext, (3) different periods, and (4) different
versions of the protocol or scheme.
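A minimal sketch of the principle in code: per-purpose subkeys derived from one master key with HMAC-SHA256. The label strings are hypothetical examples encoding the scheme and the version, per items (1) and (4) of the principle; a production design would typically use a standard KDF such as HKDF (RFC 5869).

```python
import hmac
import hashlib
import secrets

def derive_subkey(master: bytes, label: bytes) -> bytes:
    # One subkey per purpose: distinct labels yield independently-
    # pseudorandom subkeys (assuming HMAC-SHA256 is a PRF).
    return hmac.new(master, label, hashlib.sha256).digest()

master = secrets.token_bytes(32)
k_enc = derive_subkey(master, b"encryption:v1")  # hypothetical label
k_mac = derive_subkey(master, b"mac:v1")         # hypothetical label
```

Changing the label (e.g., bumping `v1` to `v2` for a new protocol version) yields an unrelated subkey, which is exactly what item (4) of the principle asks for.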
Applied Introduction to Cryptography and Cybersecurity
CHAPTER 4. AUTHENTICATION: MAC, BLOCKCHAIN AND SIGNATURE SCHEMES
Freshness, replay-prevention and sender authentication A valid MAC
received with a message shows that the message was properly authenticated
by an entity holding the secret key, which we refer to as message authentication.
Message authentication is a useful property, and can facilitate additional
important properties. Let us discuss three such properties, which can be
facilitated using message authentication, and are even sometimes (incorrectly)
assumed to be implied directly by message authentication: sender
authentication, freshness and no-replay.
Sender authentication is the ability to identify the party originating the
message. We commented above that a MAC does not ensure sender
authentication, unless the design ensures that only the specific sender
will compute the MAC using the specific key over the given message. A simple
way to ensure this is by including the sender identity as part of the payload
being signed. Another way is for each sender to use its own authentication
key. Of course, neither method prevents one entity holding a shared key
from impersonating another entity using the same key; to prevent this, use
signatures.
Freshness is the ability to confirm that a message was sent ‘recently’; a
related property, replay-prevention, ensures that the message was not already
handled previously. We can use a MAC to ensure these properties, by including
in the authenticated data appropriate fields, such as a timestamp, a counter or
a random number (‘nonce’) selected by the party validating freshness. Each of
these options has its corresponding drawback: the need for synchronized clocks,
the need to keep state, or the need for the sender to receive the nonce from
the recipient (additional interaction).
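The timestamp option can be sketched as follows, assuming HMAC-SHA256 as the MAC and a hypothetical 30-second freshness window; a real protocol must also handle clock skew and would likely combine this with a counter or nonce.

```python
import hmac
import hashlib
import time

def authenticate(key: bytes, msg: bytes, timestamp: int) -> bytes:
    # Include the timestamp in the authenticated data.
    data = timestamp.to_bytes(8, "big") + msg
    return hmac.new(key, data, hashlib.sha256).digest()

def verify_fresh(key: bytes, msg: bytes, timestamp: int, tag: bytes,
                 now: int, max_age: int = 30) -> bool:
    # Reject both forged tags and stale (possibly replayed) messages.
    expected = authenticate(key, msg, timestamp)
    return hmac.compare_digest(tag, expected) and 0 <= now - timestamp <= max_age

key = b"\x07" * 16
t = int(time.time())
tag = authenticate(key, b"transfer 100", t)
print(verify_fresh(key, b"transfer 100", t, tag, now=int(time.time())))  # True, if checked within 30s
```

Note that an attacker can still replay the message within the 30-second window; fully preventing replay requires the recipient to also remember recently-seen messages or use a counter.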
4.5 Constructing MAC from a Block Cipher
In this section we discuss constructions of a MAC function from a block cipher.
More precisely, the constructions are of a MAC from a pseudorandom function
(PRF), mainly the CBC-MAC construction. Since a PRF is often not included in
cryptographic libraries, it may be tempting to use instead a block cipher, which
is part of most cryptographic libraries; recall that a block cipher is modeled by
a Pseudo-Random Permutation (PRP), rather than by a PRF. The PRP/PRF
switching lemma (Lemma 2.2) shows that we could simply use a block cipher
instead of the PRF, since a block cipher (PRP) is indistinguishable from a PRF;
however, recall that this is not advisable, since the use of a block cipher instead
of a PRF involves a loss in security. Instead, use one of the efficient, simple
constructions of a PRF from a block cipher, which avoid the loss of security,
e.g., [39, 183].
The section contains three subsections. In the first, we observe that given a
PRF, we can actually use it directly as a MAC, i.e., every PRF is also a MAC.
There is a caveat: the input domain of the MAC is the same as that of the
PRF, which, in turn, is the same as that of the underlying block cipher (if the
PRF is implemented from a block cipher as explained above). Namely, if we use
n-bit blocks, i.e., the domain of the block cipher (and PRF) is {0, 1}^n, then the
MAC function also applies (only) to n-bit messages. This is not satisfactory,
since typical messages are longer.
The second subsection presents the CBC-MAC construction, which constructs
an l·n-bit PRF from an n-bit PRF, for a given constant number of blocks
l. This allows efficient and secure use of an n-bit-input PRF (or block cipher)
to authenticate longer, l·n-bit messages.
Finally, in the third subsection we discuss extensions that allow a MAC for
messages of arbitrary length.
In the following section, we discuss other constructions of MAC schemes,
which are not based on the use of a secure block cipher, most notably the
HMAC construction of a MAC scheme from a cryptographic hash function.
4.5.1 Every PRF is a MAC
In this subsection, we take the first step toward the CBC-MAC construction.
This step is the observation that every PRF whose range is {0, 1}^l is also an
l-bit MAC, with the same input and output domains. This is formalized in the
following lemma, which we call the PRF-is-MAC lemma.
Lemma 4.1 (A PRF is a MAC). Let F be a PRF from input domain D to the
range {0, 1}^l. Then F is also an l-bit MAC, with input domain D and output
domain {0, 1}^l.
Proof: Assume that F is not a MAC (for the same domain D and range {0, 1}^l).
Namely, assume that there exists some adversary A^MAC s.t. ε^MAC_{A^MAC,F}(n) is
non-negligible in n (as defined in Equation 4.1). We use A^MAC to construct
another adversary, A^PRF, s.t. ε^PRF_{A^PRF,F}(n) is non-negligible in n (as defined in
Equation 2.29); this shows that F is (also) not a PRF, which proves the claim.
Let us now define A^PRF. First, recall that in Equation 2.29, adversary
A^PRF is given an oracle either to a random function f ←$ {D → {0, 1}^l}, or
to the pseudorandom function F_k : D → {0, 1}^l for some random n-bit key
k ←$ {0, 1}^n. Adversary A^PRF runs A^MAC, letting it use the same oracle.
Namely, whenever A^MAC queries its oracle with input x ∈ D, adversary A^PRF
calls its oracle with the same input x; and when it receives a result ξ, it returns
that result to A^MAC.
When A^MAC terminates, it should return some pair, which we denote by
(m, σ). Upon receiving (m, σ), adversary A^PRF provides m as input to its oracle;
denote the output by σ′. If σ ≠ σ′, then A^PRF returns ‘Rand’; otherwise, i.e.,
if σ = σ′, then A^PRF returns ‘Pseudo’. Essentially, A^PRF outputs ‘Rand’ (i.e.,
guesses it was given a random function) when A^MAC failed to correctly predict
the output of the oracle for the input m.
Let us consider what happens if A^PRF is given an oracle to a random
function f ←$ {D → R}. In this case, when running A^MAC, the values returned
from the oracle were for that random function f; clearly, A^MAC cannot be
expected to perform as well as when given an oracle to the function F_k(·). In
fact, A^MAC has to return a pair (m, σ) without giving input m to the oracle.
But if the oracle is to a random function f, then f(m) is chosen independently
of f(x) for any other input x ≠ m; learning other outputs cannot help in guessing
the output for the input m! Hence, the probability of a match is (only) 2^{-l}
- the probability of a random match between two random l-bit strings. Namely,

    Pr_{f ←$ {D→R}} [ A^PRF(1^n) = ‘Pseudo’ ] = 2^{-l}.
Now consider what happens if A^PRF is given an oracle to a pseudorandom
function F_k(·). The claim is that F is also a MAC, but we assumed, to the
contrary, that it is not; so A^MAC is able to return a pair (m, F_k(m)) with
probability significantly larger than 2^{-l}. In these cases, A^PRF will return
‘Pseudo’. Namely,

    Pr_{k ←$ {0,1}^n} [ A^PRF(1^n) = ‘Pseudo’ ] = 2^{-l} + p(n),

where p(n) is a significant (not negligible) function.
It follows that ε^PRF_{A^PRF,F}(n) is not negligible, and hence, F is not a PRF.
4.5.2 CBC-MAC: l·n-bit MAC (and PRF) from n-bit PRF
Lemma 4.1 shows that every n-bit PRF is also an n-bit MAC. But how can
we deal with longer messages? In this subsection, we present the CBC-MAC
construction, which produces an l·n-bit PRF, using a given n-bit PRF. Since
every PRF is a MAC, this also gives an l·n-bit MAC. The CBC-MAC
construction is a standard from 1989 [213], i.e., it predates the PRF-is-MAC
lemma (from [37]), which is why the standard refers to the construction of a
MAC (from a block cipher) and not to the construction of an l·n-bit PRF
from an n-bit PRF.
Before we present the CBC-MAC construction, let us discuss some insecure
constructions. First, consider computing the MAC of each block independently,
similar to the ECB mode (Section 2.8). One drawback is that this would result
in a long MAC. An even worse drawback is that this is insecure; an attacker
may obtain a MAC for a different message, which contains re-ordered and/or
duplicated blocks.
Next, consider adding a counter to the input, to which we refer as CTR-MAC.
This prevents the trivial attack - but not simple variants, as shown in
the following exercise. For simplicity, the exercise is given for l = 2. Of course,
this design also has the disadvantage of a longer output tag.
Exercise 4.4 (CTR-MAC is insecure). Let E be a secure (n + 1)-bit block
cipher, and define the following 2n-bit domain function: F_k(m_0 || m_1) =
E_k(0 || m_0) || E_k(1 || m_1) (CTR-MAC). Present a counterexample showing
that F is not a secure 2n-bit MAC.
Finally, we present the CBC-MAC construction, also known as the CBC-MAC
mode. This is a widely used, standard construction of an (l·n)-bit MAC
from an n-bit block cipher. The CBC-MAC mode, illustrated in Fig. 4.2,
is a variant of the CBC mode used for encryption, see Section 2.8. Given a
block cipher E, we define CBC-MAC^E as in Eq. 4.2, for an l-block input
message (i.e., of length l·n bits), m = m_1 || m_2 || ... || m_l:
    CBC-MAC^E_k(m) = { c_0 ← 0^n ; for i = 1...l: c_i = E_k(m_i ⊕ c_{i-1}) ; output c_l }    (4.2)
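The construction of Eq. 4.2 can be sketched as follows. To keep the example self-contained and runnable, HMAC-SHA256 truncated to 16 bytes serves as a toy stand-in for the n-bit PRF E_k; a real deployment would use a PRF built from a block cipher such as AES.

```python
import hmac
import hashlib

N = 16  # block size in bytes

def prf(key: bytes, block: bytes) -> bytes:
    # Toy n-bit PRF standing in for E_k: HMAC-SHA256 truncated to N bytes.
    return hmac.new(key, block, hashlib.sha256).digest()[:N]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_mac(key: bytes, msg: bytes) -> bytes:
    # Eq. 4.2: c_0 = 0^n; c_i = E_k(m_i XOR c_{i-1}); output c_l.
    assert len(msg) % N == 0, "fixed-length CBC-MAC: whole blocks only"
    c = b"\x00" * N
    for i in range(0, len(msg), N):
        c = prf(key, xor(msg[i:i + N], c))
    return c
```

Note that for a one-block message the chaining value is all zeros, so `cbc_mac(k, m)` is simply `prf(k, m)`, matching the single-block case discussed below.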
See Fig. 4.2.
When E is obvious we may simply write CBC-MAC_k(·).
[Figure 4.2 here: the blocks m_1, m_2, m_3 are each XORed with the previous E_k output (starting from 0^n) and fed into E_k; the final E_k output is CBC-MAC^E_k(m).]
Figure 4.2: CBC-MAC: construction of an l·n-bit PRF (and MAC), from an n-bit PRF.
CBC-MAC is the most widely used MAC construction from block ciphers.
There are other constructions of a secure MAC from PRFs and block ciphers,
including more efficient constructions, e.g., avoiding the need to know the input
length in advance (CMAC [136]) or allowing parallel computation and
verification (e.g., XOR-MAC [36]). However, we focus on CBC-MAC, which is
not only the most widely used, but also one of the simplest constructions of a
MAC from a block cipher.
We next present Lemma 4.2 which shows that CBC-MAC constructs a
secure PRF (and hence also MAC), provided that the underlying function E is
a PRF.
Lemma 4.2. If E is an n-bit PRF, then CBC-MAC^E_k(·) is a secure n·l-bit
PRF and MAC, for any constant integer l > 0.
Proof: See [37].
CBC-MAC does not support input of arbitrary length. The CBC-MAC
construction is defined for input which is an integral number of blocks,
i.e., n·l bits. How can we extend it so it does support input of arbitrary length,
i.e., obtain a variable input length (VIL) PRF (and MAC), defined for input
domain {0, 1}*?
One obvious problem is that an arbitrary binary string may not even
consist of an integral number of blocks, while CBC-MAC is defined only for
inputs whose length is n·l bits, i.e., an integral number of blocks. However, let
us ignore that problem for now, and focus on the complete-blocks input length
(CBIL) domain, i.e., inputs whose length is an integral number of blocks. Let us
first precisely define the CBIL domain:
    CBIL ≡ { m ∈ {0, 1}^{n·l} | l ∈ Z^+ }    (4.3)
In the next exercise we show that CBC-MAC is not a PRF, or a MAC, for
the CBIL domain, and, hence, surely not a VIL MAC/PRF.
Exercise 4.5 (CBC-MAC is not a VIL MAC). Show that CBC-MAC is not a
MAC or PRF for the domain CBIL (Equation 4.3), and hence is definitely not
a VIL MAC/PRF (for the domain {0, 1}∗ ).
Solution: Let f_k(·) = CBC-MAC^E_k(·) be the CBC-MAC using an underlying
n-bit block cipher E_k. Namely, for a single-block message a ∈ {0, 1}^n, we
have f_k(a) = E_k(a); and for a two-block message a || b, where a, b ∈ {0, 1}^n,
we have f_k(a || b) = E_k(b ⊕ E_k(a)).
We present a simple adversary A^{f_k}, with oracle access to f_k, i.e., A is able
to make an arbitrary query x ∈ {0, 1}* to f_k and receive the result f_k(x). Let X
denote all the queries made by A during its run. We show that A^{f_k} generates
a pair (x, f_k(x)), where x ∉ X, which shows that f_k (i.e., CBC-MAC) is not a
MAC for the domain CBIL (and hence also not a {0, 1}*-MAC, i.e., not a VIL MAC).
Specifically, the adversary A first makes a single-block query, for an
arbitrary a ∈ {0, 1}^n. Let c denote the result, i.e., c = f_k(a) = E_k(a). Then, A
computes b = a ⊕ c and outputs the pair of message a || b and tag c.
Note that c = f_k(a || b), since f_k(a || b) = E_k(b ⊕ E_k(a)) = E_k((a ⊕ c) ⊕ c) =
E_k(a) = c. Namely, c is indeed the correct tag for a || b. Obviously, A did not
make a query to receive f_k(a || b). Hence, A succeeds in the MAC game against
CBC-MAC.
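The forgery in the solution can be demonstrated concretely, using the same toy HMAC-based stand-in for the block cipher E_k as before; the specific key and message below are arbitrary.

```python
import hmac
import hashlib

N = 16  # block size in bytes

def prf(key: bytes, block: bytes) -> bytes:
    # Toy n-bit PRF standing in for the block cipher E_k.
    return hmac.new(key, block, hashlib.sha256).digest()[:N]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_mac(key: bytes, msg: bytes) -> bytes:
    c = b"\x00" * N
    for i in range(0, len(msg), N):
        c = prf(key, xor(msg[i:i + N], c))
    return c

key = b"\x01" * 16
a = b"A" * N
c = cbc_mac(key, a)   # the adversary's single oracle query: c = E_k(a)
b = xor(a, c)         # choose the second block b = a XOR c
forged = a + b        # a two-block message that was never queried
# E_k(b XOR E_k(a)) = E_k((a XOR c) XOR c) = E_k(a) = c
print(cbc_mac(key, forged) == c)  # True: a valid tag for a || b
```

The adversary never touches the key: one oracle query on a suffices to forge a tag for the two-block message a || b.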
However, as we next explain, by merely prepending the length to the input,
we can create a VIL MAC from the CBC-MAC.
4.5.3 Constructing Secure VIL MAC from PRF
Lemma 4.2 shows that CBC-MAC is a secure l·n-bit FIL PRF (and MAC);
however, Exercise 4.5 shows that it is not a VIL MAC (and hence surely not a VIL
PRF). The crux of the example was that we used the CBC-MAC of a one-block
string, and presented it as the MAC of a two-block string. This motivates a minor
change to the construction, where we prepend the block-encoded length L(m) of
the input m to the input before applying CBC-MAC. We define L(m) as an
n-bit binary string (i.e., a block), whose binary value is the length |m| of the
input m. Lemma 4.3 shows that this construction is indeed a secure VIL MAC.
We refer to this variant as length-prepending CBC-MAC.
Lemma 4.3 (Length-prepending CBC-MAC is a VIL PRF). Let f_k(m) =
CBC-MAC^E_k(L(m) || m), where L(m) is the block-encoded length of m (as
defined above). Then f_k(·) is a PRF (and MAC) over the set of all binary
strings.
Proof: See [37].
Note that the block-encoded length L(m) can only support messages up to
the maximal length encoded by n bits, i.e., |m| < 2^n. In practice, this isn’t an
issue - and it is not difficult to extend the construction to avoid this limitation,
if you really want to. It is hard to imagine a practical scenario in which you
will have to do this, however.
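The length-prepending fix can be sketched as a small wrapper around CBC-MAC (again with the toy HMAC-based stand-in for the block cipher; inputs are still restricted to whole blocks, i.e., the CBIL domain - handling arbitrary bit strings additionally requires padding).

```python
import hmac
import hashlib

N = 16  # block size in bytes

def prf(key: bytes, block: bytes) -> bytes:
    # Toy n-bit PRF standing in for E_k (illustrative, not a real block cipher).
    return hmac.new(key, block, hashlib.sha256).digest()[:N]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_mac(key: bytes, msg: bytes) -> bytes:
    c = b"\x00" * N
    for i in range(0, len(msg), N):
        c = prf(key, xor(msg[i:i + N], c))
    return c

def lp_cbc_mac(key: bytes, msg: bytes) -> bytes:
    # Length-prepending CBC-MAC: prepend the block-encoded bit-length L(m).
    L = (8 * len(msg)).to_bytes(N, "big")
    return cbc_mac(key, L + msg)

# The forgery of Exercise 4.5 no longer works: the length block now differs.
key = b"\x01" * 16
a = b"A" * N
c = lp_cbc_mac(key, a)
print(lp_cbc_mac(key, a + xor(a, c)) == c)  # False (except with negligible probability)
```

Intuitively, a one-block and a two-block message now start from different first blocks (their encoded lengths), so the tag of the short message can no longer be reused for the extended one.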
4.6 Other MAC Constructions
In the previous section, we presented constructions of a MAC from PRFs and
block ciphers. In the following subsections, we discuss other approaches for
constructing a MAC function, including: (1) designing a MAC ‘from scratch’,
i.e., without a provable reduction to the security of some other cryptographic
scheme (subsection 4.6.1), (2) combining multiple candidate MAC functions
(robust combiner, subsection 4.6.2), and, finally, (3) constructing a MAC/PRF
from a cryptographic hash function (subsection 4.6.3). This last approach,
constructing a MAC/PRF from a hash function, is the most widely-used method
to implement a MAC function, usually using the HMAC construction [31, 32].
4.6.1 MAC design ‘from scratch’
This approach attempts to design a candidate MAC function without a
reduction to the security of some cryptographic scheme; typically, the design
simply does not involve any other, known cryptographic function. Instead,
it may rely on some problems which are considered computationally hard. The
security of such a design is based on the failure of significant cryptanalysis
efforts against the MAC function.
This used to be the main method of designing new cryptographic mechanisms.
However, following the cryptographic building blocks principle (Principle 8),
MAC functions are rarely designed ‘from scratch’. Let us give one
example: a (failed) attempt to construct a MAC from an EDC, and the resulting
vulnerabilities.
Two (failed) attempts to construct MAC from EDC Let us consider
a specific design which, intuitively, may look promising: constructing a MAC
from a (good) Error Detection Code (EDC). Error Detection Codes are designed
to ensure integrity, i.e., to detect corruptions in data; however, they are designed
to detect random errors, and may fail to detect intentional modifications. We
have already seen, in Section 2.10, that the WEP protocol failed to ensure
integrity against attack, in spite of its use of the CRC-32 Cyclic Redundancy
Check (CRC) error-detecting code before encryption. Let us consider two other
simple constructions of a MAC from an EDC, which do not involve encryption:
MAC_k(m) = EDC(k || m) and MAC′_k(m) = EDC(m || k). We next show
that these constructions are insecure, when using CRC as the error-detection
code (EDC).
Exercise 4.6 (Insecure CRC-based MACs). Show that both of the following
are insecure:
(a) CRC-MAC_k(m) = CRC(k || m) and (b) CRC-MAC′_k(m) = CRC(m || k).
Solution: We only solve (a) and leave (b) as an (easy) exercise to the reader.
In fact, we show how an attacker that receives only the MAC CRC-MAC_k(m) =
CRC(k || m) of any known message m, can compute the MAC for any other
message m′ ≠ m of the same length, i.e., compute CRC-MAC_k(m′) = CRC(k || m′).
Recall that the CRC function is linear; namely, for any two strings of the same
length, |x| = |x′|, holds: CRC(x ⊕ x′) = CRC(x) ⊕ CRC(x′) (Equation 2.61).
Hence:

    CRC-MAC_k(m′) = CRC(k || m′)
                  = CRC( (0^|k| || (m′ ⊕ m)) ⊕ (k || m) )
                  = CRC(0^|k| || (m′ ⊕ m)) ⊕ CRC(k || m)
                  = CRC-MAC_{0^|k|}(m′ ⊕ m) ⊕ CRC-MAC_k(m)        (4.4)
The adversary, knowing (or guessing) the key length |k|, can compute
CRC-MAC_{0^|k|}(m′ ⊕ m). By plugging this into Equation 4.4, the adversary
finds CRC-MAC_k(m′).
We conclude that CRC-MAC is indeed insecure. The same seems to hold for
other EDC-based MACs.
4.6.2 Robust combiners for MAC
A robust combiner for MAC combines two (or more) candidate MAC functions
to create a new composite function, which is proven secure provided that one
(or a sufficient number) of the underlying functions is secure. There is actually
a very simple robust combiner for MAC schemes: concatenation (denoted ||).
In the following exercise we show that concatenation is a robust combiner for
MAC functions.
Exercise 4.7. Show that concatenation is a robust combiner for MAC functions.
Solution (from [191]): Let F′, F′′ be two candidate MAC schemes, and
define F_{k′,k′′}(m) = F′_{k′}(m) || F′′_{k′′}(m). We should show that it suffices that
either F′ or F′′ is a secure MAC, for F to be a secure MAC scheme as well.
Without loss of generality, assume F′ is secure; and assume, to the contrary,
that F is not a secure MAC. Namely, assume an attacker A^{F_{k′,k′′}(µ)|µ≠m} that
can output a pair (m, F_{k′,k′′}(m)), given access to an oracle that computes F_{k′,k′′}
on any value except m. We use A to construct an adversary A′ which succeeds
against F′.

Adversary A′ operates by running A, as well as selecting a key k′′ and
running F′′_{k′′}(·); this is needed to allow A′ to provide the oracle service to
A^{F_{k′,k′′}(µ)|µ≠m}, computing F_{k′,k′′}(µ) for any given input µ. Whenever A
makes a query q, then A′ makes the same query to the F′_{k′}(·) oracle, to receive
F′_{k′}(q). Then, A′ computes by itself F′′_{k′′}(q), and combines it with F′_{k′}(q) to
produce the required response (F′_{k′}(q), F′′_{k′′}(q)).

When A finally returns the pair (m, F_{k′,k′′}(m)) = (m, F′_{k′}(m) || F′′_{k′′}(m)),
then A′ simply returns the pair (m, F′_{k′}(m)), i.e., omitting the second part of
the MAC that A returned.
However, concatenation is a rather inefficient robust combiner for MAC
schemes, since it doubles the length of the output. The following exercise shows
that exclusive-or is also a robust combiner for MAC - and since its output
length is the same as that of the component MAC schemes, it is efficient.
Exercise 4.8. Show that exclusive-or is a robust combiner for MAC functions.
Namely, that MAC_{(k′,k′′)}(x) = MAC′_{k′}(x) ⊕ MAC′′_{k′′}(x) is a secure MAC, if
one or both of {MAC′, MAC′′} is a secure MAC.
Guidance: Similar to the solution of Ex. 4.7.
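A sketch of the XOR combiner, instantiating the two candidate MACs with HMAC over two different hash functions (an arbitrary illustrative choice - any two candidate MACs with equal tag lengths would do):

```python
import hmac
import hashlib

def mac1(key: bytes, msg: bytes) -> bytes:
    # First candidate MAC (illustrative choice).
    return hmac.new(key, msg, hashlib.sha256).digest()

def mac2(key: bytes, msg: bytes) -> bytes:
    # Second candidate MAC, truncated to the same 32-byte tag length.
    return hmac.new(key, msg, hashlib.sha512).digest()[:32]

def xor_combined_mac(k1: bytes, k2: bytes, msg: bytes) -> bytes:
    # Secure as long as at least one of mac1, mac2 is a secure MAC
    # (Exercise 4.8), with no increase in tag length, unlike concatenation.
    t1, t2 = mac1(k1, msg), mac2(k2, msg)
    return bytes(a ^ b for a, b in zip(t1, t2))
```

Note that the two keys must be independent; reusing one key for both candidates would violate the key separation principle (Principle 10).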
4.6.3 HMAC and other constructions of a MAC from a Hash function
Finally, we consider constructions of MAC functions from cryptographic hash
functions. Cryptographic hash functions, like block ciphers, are defined in
multiple standards; therefore, their use to construct a MAC (and other schemes)
follows the cryptographic building blocks principle (subsection 2.7.4). Furthermore,
since both MAC and hash functions are defined for arbitrary (variable) input
length (VIL), the constructions of a MAC from hash functions are simpler than
the constructions from block ciphers (subsection 4.5.2). In addition, some
cryptographic hash functions are extremely efficient, and this efficiency is
mostly inherited by HMAC. For example, the Blake2b [17] cryptographic hash
function achieves speeds of over 10^9 bytes/second, on a relatively weak CPU
(Intel i5-6600 with 3310MHz clock).
In fact, the use of hash functions to construct a MAC is so common, that
many people use the term ‘keyed hash’ to refer to the resulting MAC function.
The meaning is that the hash function uses a secret key k. This differs from
our use of the term ‘keyed hash function’, as in subsection 3.2.3, which is also
the usage in most works in cryptography, where the key k is not secret (i.e., the
key k is known to the adversary).
An additional problem with the term ‘keyed hash’ for the use of hash with
a secret key to construct MAC, is that it may be interpreted to imply that it is
safe to use a keyed CRHF as a MAC, simply by keeping its key secret instead of
publishing it. It may be possible, for the same keyed function h to be a keyed
CRHF (given a public key) and a MAC (given a secret key); however, it is also
possible for h to be a keyed CRHF yet not to be a MAC (given a secret key),
as we show in the next exercise. See also Exercise 4.22 and Exercise 4.23.
Exercise 4.9. Let h_k(m) be a keyed CRHF. Show a keyed hash function h′_k(m)
which (1) is a CRHF but (2) is not a secure MAC.
Solution: Let h′_k(m) = k || h_k(m). Clearly h′ exposes its key, so it cannot be
a secure MAC. However, h′ is still a CRHF, since any collision of h′ is also a
collision for h.
We see that a function may be a keyed CRHF but not a secure MAC; can we,
instead, construct a MAC from a cryptographic hash function? Ideally, we would
want to construct the MAC from a keyless cryptographic hash function, since
existing standard cryptographic hash functions are keyless (subsection 3.1.4).
In the remainder of this subsection, we discuss four such constructions, whose
goal is to create a MAC from keyless hash functions. We begin with three
designs studied by Tsudik [371], and then describe HMAC [30], a more recent
construction which is now widely deployed and defined as an IETF standard [32].
Tsudik’s constructions of MAC from hash: prepend key, append
key and message-in-the-middle. Several heuristic proposals for the
construction of a MAC from a cryptographic hash function were made, mostly
constructing the MAC from a keyless hash function. Three of the most
well-known heuristics were presented and compared by Tsudik [371]. Given
keyless hash function h, key k and message m, these are:

    Prepend Key:           KM^h_k(m)  = h(k || m)
    Append Key:            MK^h_k(m)  = h(m || k)
    Message-in-the-Middle: KMK^h_k(m) = h(k || m || k)
An obvious question is whether these schemes are secure - assuming that the
cryptographic hash function h satisfies some assumption. Let us first observe
that all three constructions are secure under the ROM (Section 3.6).
Exercise 4.10. Prove that (a) KM h , (b) M K h and (c) KM K h are secure
under the Random Oracle Methodology (ROM).
Proof sketch: Assume an adversary outputs (m, σ) for a message m which it
did not give as input to the ‘oracle’ for h. Then the output of the corresponding
h function was never computed yet, i.e., it is still random. For example, for
KM^h_k(m) = h(k || m), the value of h(k || m), for this m, was not computed yet.
In fact, we need to pick it only to check the adversary’s guess σ; at that point,
we choose it randomly from the set {0, 1}^n. The probability that our choice
will be the same as σ is only 2^{-n}, i.e., negligible. Hence, KM^h_k(m) = h(k || m)
is secure under the ROM. This shows (a); essentially the same argument holds
for (b) MK^h and (c) KMK^h.
To avoid the possible impression that every construction is secure under
the ROM, let us give an example of a construction which is insecure even under
the ROM. Specifically, consider KMKM^h_k(m) = h(k || m) || h(k || (m ⊕ 1^|m|)).
Namely, KMKM^h was made ‘more complex’ - maybe with the futile hope that
this will make it more secure - by concatenating two hash values, one of k || m
and the other of k || (m ⊕ 1^|m|). Note that m ⊕ 1^|m| is just a weird way of
writing the negation of m.
Example 4.4. Show that KM KM h is insecure, (even) under the ROM.
Applied Introduction to Cryptography and Cybersecurity
4.6. OTHER MAC CONSTRUCTIONS
249
Solution: The adversary asks to receive KMKM^h_k for the message m = 0^l (for
any length l); let the value returned be denoted σ_L || σ_R, where |σ_L| = |σ_R| = n.
Then the adversary returns the ‘guess’ (1^l, σ_R || σ_L). Verify that this is the
correct pair.
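The forgery can be checked concretely, simulating the oracle h with SHA-256 and working at byte granularity (so 1^|m| becomes bytewise negation); the key and length below are arbitrary.

```python
import hashlib

def h(x: bytes) -> bytes:
    # SHA-256 simulates the random oracle h.
    return hashlib.sha256(x).digest()

def neg(m: bytes) -> bytes:
    # m XOR 1^|m| : bitwise negation of the message.
    return bytes(b ^ 0xFF for b in m)

def kmkm(key: bytes, m: bytes) -> bytes:
    # KMKM^h_k(m) = h(k || m) || h(k || (m XOR 1^|m|))
    return h(key + m) + h(key + neg(m))

key = b"\xaa" * 16
l = 8  # message length in bytes (the text counts bits; bytes are analogous)
sigma = kmkm(key, b"\x00" * l)            # the adversary's single query, on m = 0^l
sigma_l, sigma_r = sigma[:32], sigma[32:]
forged = sigma_r + sigma_l                 # claimed tag for m' = 1^l
print(forged == kmkm(key, b"\xff" * l))   # True: a forgery, with no further queries
```

The symmetry is the whole attack: the tag of m contains both h(k || m) and h(k || ~m), which are exactly the two halves (swapped) of the tag of ~m.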
We recommend that readers carefully follow the arguments in Exercise 4.10
and find out why they do not hold for KMKM^h_k(m). It is not trivial - and
may help in understanding these important concepts.
We next observe that these three constructions, which are secure under
the ROM, can be insecure using a hash function which satisfies standard
requirements such as collision-resistance and preimage-resistance (one-way
function), as in the following exercise. This illustrates the fact that security
under the ROM does not imply security under standard assumptions.
Exercise 4.11. Present a keyless hash function h such that:
1. h is a CRHF, yet (a) KM^h, (b) MK^h, (c) KMK^h is not a secure MAC.
2. h is an SPR, yet (a) KM^h, (b) MK^h, (c) KMK^h is not a secure MAC.
3. h is an OWF-hash, yet (a) KM^h, (b) MK^h, (c) KMK^h is not a secure MAC.
4. h is a BRE, yet (a) KM^h, (b) MK^h, (c) KMK^h is not a secure MAC.
5. h is a CRHF, OWF and BRE, yet (a) KM^h, (b) MK^h, (c) KMK^h is not
a secure MAC.
The examples may assume a hash function h′ which has the corresponding
property (CRHF, SPR, OWF, BRE or their combination).
Partial solution: Let h′ be a hash function which is a CRHF, SPR and
OWF. We define h(x) to return the n most significant bits of x if |x| = 2n and
the n least significant bits are all zero, and to return h′(x) otherwise. We leave
it to the reader to prove that h is a CRHF, SPR and OWF, yet KM^h
is not a secure MAC. Changing this construction to also cover BRE is not very
difficult, as is modifying the constructions to show corresponding results for
MK^h and KMK^h.
While the examples in the solutions to Exercise 4.11 are ‘artificial’
and irrelevant to any ‘real’ candidate hash function, some weaknesses of these
constructions can apply to realistic hash functions. In particular, many hash
functions have the following extend property: given h(x), one can compute
h(x || y), even without knowing anything about x. This property holds for any
hash function using the (widely-used) Merkle-Damgård construction, including
the MD5 and SHA-1 standards; see discussion in Section 3.9. In the following
exercise we observe, after Tsudik, that if h has the extend property then KM^h
is not a secure MAC.
Exercise 4.12. Show that KM^h is insecure, for any hash function h that
has the extend property.
Hint: Tsudik has shown this in [371].
HMAC. HMAC [30, 32] is the most widely-used construction of a MAC from
a keyless hash function. HMAC is defined as:

    HMAC_k(m) = h( (k ⊕ OPAD) || h( (k ⊕ IPAD) || m ) )    (4.5)

where OPAD and IPAD are fixed constant strings.
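Eq. 4.5 can be checked against a library implementation. For HMAC-SHA256, the standard constants are IPAD = 0x36 and OPAD = 0x5C repeated to the 64-byte hash block size, and the key is zero-padded to the block size (or first hashed, if longer than a block).

```python
import hashlib
import hmac

BLOCK = 64  # SHA-256 block size in bytes
IPAD = bytes([0x36] * BLOCK)
OPAD = bytes([0x5C] * BLOCK)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def hmac_sha256(key: bytes, msg: bytes) -> bytes:
    # Eq. 4.5: h((k XOR OPAD) || h((k XOR IPAD) || m)), with the key
    # zero-padded (or first hashed, if longer than a block) to BLOCK bytes.
    if len(key) > BLOCK:
        key = hashlib.sha256(key).digest()
    key = key.ljust(BLOCK, b"\x00")
    inner = hashlib.sha256(xor(key, IPAD) + msg).digest()
    return hashlib.sha256(xor(key, OPAD) + inner).digest()

print(hmac_sha256(b"key", b"msg") == hmac.new(b"key", b"msg", hashlib.sha256).digest())  # True
```

The inner hash keys the message; the outer hash blocks, among other things, the length-extension attack on Merkle-Damgård hashes discussed above for KM^h.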
It is not difficult to see that HMAC is secure under the ROM (Exercise 3.24).
However, while the ROM is useful, and security under this model indicates that
some attacks are infeasible, it would surely be much better if we could show
that HMAC is secure under some ‘reasonable’ cryptographic assumption. In
fact, this was done in [30]. It would have been great if the assumption was one
of the standard hash-function assumptions, e.g., collision resistance; however,
the assumption in [30], while arguably reasonable, is somewhat more complex
than these standard hash-function assumptions, and we will not discuss these
details here.
Note that HMAC is insecure when using some collision-resistant hash
functions, i.e., collision-resistance is not a sufficient requirement from the hash
function. You will show this in Exercise 3.25, where you construct
a CRHF h(·) for which HMAC is not a secure MAC. To ensure that h is a
CRHF, the construction uses a given CRHF, h′(·).
Due to the importance and wide use of HMAC, confidence in its security
grew over the years, with several additional results establishing its security
under ‘even more reasonable’ assumptions (compared to [30]). The confidence
in the security of HMAC also grew due to the fact that such an important
standard has not been ‘broken’ by cryptanalysis during this time. In fact, over
time, HMAC has also come to be used for additional goals, such as a
pseudorandom function (PRF) and as a Key Derivation Function (KDF), which
is essentially a keyed variant of a randomness-extraction hash function; see
discussion in Section 3.5.
4.7 Combining Authentication, Encryption and Other Functions
Message authentication combines authentication (sender identification) and
integrity (detection of modification). However, when transmitting messages, we
often have additional goals. These include security goals such as confidentiality,
as well as fault-tolerance goals such as error detection/correction, and even
efficiency goals such as compression.
In the first four subsections, we focus on the combination of the two basic
security goals: encryption and authentication. Finally, in subsection 4.7.5, we
discuss the complete secure session transmission protocol, which addresses
additional goals involving security, reliability and efficiency, for a session
(connection) between two parties. We return to these issues in Section 7.2, where
we discuss the SSL/TLS protocols, including their record protocol. Several
attacks on the SSL/TLS protocols exploited their use of vulnerable, insecure
combinations of authentication, confidentiality and other functions, and the
record protocol of TLS 1.3 was modified to a better design, which foils such
attacks.
There are two main options for ensuring the confidentiality and
authentication/integrity requirements together: (1) correctly combining an
encryption scheme with a MAC scheme, or (2) using a combined authenticated
encryption scheme. In the first subsection below, we discuss authenticated
encryption schemes and authenticated encryption with associated data (AEAD)
schemes, which combine encryption (for confidentiality) and authentication. In
the following subsections, we discuss specific generic constructions, combining
MAC and encryption schemes.
4.7.1 Authenticated Encryption (AE) and AEAD schemes
Authenticated Encryption (AE) schemes. The combination of confidentiality and authenticity is often required, but we have seen that incorrect combinations may lead to vulnerabilities. This motivates the design of schemes which combine the authentication and the confidentiality functions. We use the term authenticated encryption (AE) for such schemes [339], which consist of two main functions: encrypt-and-authenticate EnA and decrypt-and-verify DnV, plus, optionally, an explicit key-generation function. The decrypt-and-verify function returns ERROR if the ciphertext is found not authentic; a similar verification property can be implemented with a MAC scheme, by comparing the authenticator received with a message to the result of computing the MAC over the message. AE schemes may also have a key-generation function; in particular, this is necessary when the keys are not uniformly random.
In addition to supporting both encryption and authentication, there is an additional innovative aspect to the definition of AE schemes: the AE encrypt-and-authenticate operation has three inputs. This is in contrast to the standard definition of encryption schemes, which defines only two inputs: the key and the plaintext.
The third input of AE schemes is called a nonce. To ensure security, a different nonce value should be used whenever performing the encryption operation. An authenticated-encryption scheme therefore resembles a mode-of-operation of an encryption scheme, with the nonce taking the role of the IV or counter (state) input. The same nonce should be given to the decrypt-and-verify operation, and the scheme should ensure correctness, namely, that for every plaintext message m, key k and nonce n:

m = DnV_k^n(EnA_k^n(m))    (4.6)
The use of a combined AE scheme allows simpler, less error-prone implementations, compared to using two separate schemes, one for encryption and one for authentication. In particular, we need only a call to a single function (encrypt-and-authenticate or decrypt-and-verify), instead of requiring the correct use of both the encryption/decryption and the MAC functions.
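The single-call interface, and the correctness requirement of Eq. (4.6), can be illustrated with a minimal Python sketch. This is a toy construction, not a standard: the SHA-256 counter-mode keystream and the "enc"/"mac" key-derivation labels are our own illustrative choices, and internally the sketch simply verifies a MAC over the nonce and ciphertext before decrypting.

```python
import hashlib, hmac

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream: SHA-256 in counter mode (for illustration only).
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def EnA(key: bytes, nonce: bytes, m: bytes) -> bytes:
    kE = hmac.new(key, b"enc", hashlib.sha256).digest()  # derived encryption key
    kM = hmac.new(key, b"mac", hashlib.sha256).digest()  # derived MAC key
    c = bytes(a ^ b for a, b in zip(m, _keystream(kE, nonce, len(m))))
    tag = hmac.new(kM, nonce + c, hashlib.sha256).digest()  # MAC covers nonce too
    return c + tag

def DnV(key: bytes, nonce: bytes, ct: bytes):
    kE = hmac.new(key, b"enc", hashlib.sha256).digest()
    kM = hmac.new(key, b"mac", hashlib.sha256).digest()
    c, tag = ct[:-32], ct[-32:]
    if not hmac.compare_digest(tag, hmac.new(kM, nonce + c, hashlib.sha256).digest()):
        return None  # ERROR: ciphertext is not authentic
    return bytes(a ^ b for a, b in zip(c, _keystream(kE, nonce, len(c))))
```

Decrypting with the same key and nonce recovers m, as Eq. (4.6) requires; any modification of the ciphertext, or use of a different nonce, yields ERROR (None).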
Many constructions of authenticated encryption are generic, i.e., built by combining arbitrary implementations of cryptographic schemes, following the 'cryptographic building blocks' principle. The combinations of an encryption scheme and a MAC scheme that we study later in this subsection are good examples of such generic constructions. Other constructions are 'ad-hoc', i.e., designed using specific functions. Such ad-hoc constructions may have better performance than generic constructions; however, that may come at the cost of requiring more complex or less well-tested security assumptions, contrary to the Cryptographic Building Blocks principle.
Authenticated Encryption with Associated Data (AEAD). In many applications, e.g., TLS (Chapter 7), some of the data to be authenticated should not be encrypted. Typically, this would be data that is also used by agents which do not have the secret (decryption) key; for example, the identity of the destination. Such data is often referred to as associated data, and authenticated encryption schemes supporting it are referred to as AEAD (Authenticated Encryption with Associated Data) schemes [337]. AEAD schemes have the same three functions (key-generation, encrypt-and-authenticate, decrypt-and-verify), and their input also includes a nonce. However, they also have an additional (fourth) input: the associated-data field.
Scheme       | Type       | Goals                                                  | Metaphor
MAC          | Symmetric  | Authenticity                                           | Document with secret mark
AE           | Symmetric  | Authenticity and confidentiality                       | Document with secret mark, in a sealed envelope
AEAD         | Symmetric  | Authenticity, and confidentiality for part of document | Document with secret mark, in a sealed envelope with a window
Signature    | Asymmetric | Authenticity and non-repudiation                       | Signed document
SignCryption | Asymmetric | Authenticity, non-repudiation and confidentiality      | Signed document in a sealed envelope

Table 4.1: Authentication schemes: MAC, Authenticated Encryption (AE), Authenticated Encryption with Associated Data (AEAD), Signature and SignCryption schemes.
Together with AE and AEAD schemes, we now have several different cryptographic authentication schemes. In Table 4.1, we sum up five such schemes; for each scheme, we list its intuitive goal, along with a metaphor for it. Like all metaphors, these should not be taken too seriously; hopefully, readers will find them helpful and not confusing.
Authenticated encryption: attack model and success/fail criteria
We now briefly discuss the attack model (attacker capabilities) and the goals (success/fail criteria) for the combination of authentication and confidentiality (encryption), as is essential for any security evaluation (principle 1). Essentially, this combines the corresponding attack model and goals of encryption schemes (indistinguishability test) and of message authentication code (MAC) schemes (forgery test).
As in our definitions for encryption and MAC, we consider an efficient (PPT) adversary. We also allow the attacker capabilities similar to those in the definitions of secure encryption and MAC. In particular, we allow chosen-plaintext queries, where the attacker provides input messages (plaintext) and receives their authenticated encryption, as in the chosen-plaintext attack (CPA) we defined for encryption.
Exercise 4.13. Present precise definitions for IND-CPA and security against
forgery for AE and AEAD schemes.
4.7.2 Authentication via EDC-then-Encryption?
Several practical secure communication systems first apply an Error Detecting Code (EDC) to the message, and then encrypt it, i.e., c = E_k(m ++ EDC(m)). We believe that the motivation for this design is the hope of ensuring authentication as well as confidentiality, i.e., the designers were (intuitively) trying to develop an authenticated-encryption scheme. Unfortunately, such designs are often insecure; in fact, the application of EDC/ECC before encryption often allows attacks on the confidentiality of the design. We saw one example, for WEP, in Section 2.10. Another example of such a vulnerability is in the design of GSM, which employs not just an Error Detecting Code but even an Error Correcting Code, with very high redundancy. In both WEP and GSM, the encryption was performed by XORing the plaintext (after EDC/ECC) with the keystream (output of a PRG).
However, EDC-then-Encrypt schemes are often vulnerable also when using other encryption schemes. For example, the following exercise shows such a vulnerability, albeit against the authentication property, when using CBC-mode encryption.
Exercise 4.14 (EDC-then-CBC does not ensure authentication). Let E be a secure block cipher and let CBC_k^E(m; IV) be the CBC-mode encryption of plaintext message m, using underlying block cipher E, key k and initialization vector IV, as in Eq. (2.56). Furthermore, let EDCtCBC_k^E(m; IV) = CBC_k^E(m ++ h(m); IV), where h is a function outputting one block (an error detecting code). Show that EDCtCBC^E is not a secure authenticated encryption; specifically, that authentication fails.
Hint: the attacker asks for the EDCtCBC^E encryption of the message m′ = m ++ h(m); the output also gives the encryption of m.
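The WEP-style failure of EDC-then-Encrypt can also be made concrete in code. In the toy sketch below, CRC-32 serves as the EDC and a SHA-256 counter-mode keystream stands in for the PRG output (both our own illustrative choices). Because both the XOR-stream cipher and CRC-32 are linear under XOR, the attacker can flip plaintext bits and patch the checksum without knowing the key.

```python
import binascii, hashlib

def crc(m: bytes) -> bytes:
    return binascii.crc32(m).to_bytes(4, "little")

def keystream(key: bytes, length: int) -> bytes:
    out, ctr = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:length]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(key, m):                 # c = keystream XOR (m ++ CRC(m))
    p = m + crc(m)
    return xor(p, keystream(key, len(p)))

def decrypt_verify(key, c):
    p = xor(c, keystream(key, len(c)))
    m, tag = p[:-4], p[-4:]
    return m if tag == crc(m) else None

def forge(c, delta):                 # attacker: needs no key
    # CRC-32 is affine-linear: crc(m ^ delta) = crc(m) ^ crc(delta) ^ crc(0...0)
    zeros = bytes(len(delta))
    fix = (binascii.crc32(delta) ^ binascii.crc32(zeros)).to_bytes(4, "little")
    return xor(c, delta + fix)
```

Choosing delta as the XOR of the known plaintext with a target plaintext, the forged ciphertext decrypts to the target and still passes the CRC check.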
4.7.3 Generic Authenticated Encryption Constructions
We now discuss 'generic' constructions, combining arbitrary MAC and encryption schemes to ensure both confidentiality and authentication/integrity. As discussed above, these constructions can be used to construct a single, combined 'authenticated encryption' scheme, or to ensure both goals (confidentiality and authenticity) in a system.
Different generic constructions were proposed - but not all are secure. Let us consider three constructions, all applied in important, standard applications. For each of the designs, we present the process of authenticating and encrypting a message m, using two keys: k′ used for encryption, and k′′ used for authentication.
Authenticate and Encrypt (A&E), e.g., used in early versions of the SSH protocol: C = Enc_k′(m), A = MAC_k′′(m); send (C, A).

Authenticate then Encrypt (AtE), e.g., used in the SSL and TLS standards: A = MAC_k′′(m), C = Enc_k′(m ++ A); send C.

Encrypt then Authenticate (EtA), e.g., used by the IPsec standard: C = Enc_k′(m), A = MAC_k′′(C); send (C, A).
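The three constructions can also be written out as code. In the following sketch, HMAC-SHA256 stands in for the MAC, and a toy XOR-stream cipher stands in for the encryption scheme (deterministic here for brevity; a real IND-CPA scheme must be randomized or nonce-based). The function names are our own.

```python
import hashlib, hmac

def _stream(k: bytes, n: int) -> bytes:
    out, i = b"", 0
    while len(out) < n:
        out += hashlib.sha256(k + i.to_bytes(4, "big")).digest()
        i += 1
    return out[:n]

def Enc(kE: bytes, m: bytes) -> bytes:
    # Stand-in cipher: XOR with a key-derived stream (illustration only).
    return bytes(a ^ b for a, b in zip(m, _stream(kE, len(m))))

Dec = Enc                            # XOR stream: decryption = encryption

def MAC(kM: bytes, m: bytes) -> bytes:
    return hmac.new(kM, m, hashlib.sha256).digest()

def AandE(kE, kM, m):                # Authenticate and Encrypt (A&E)
    return Enc(kE, m), MAC(kM, m)    # tag computed over the plaintext

def AtE(kE, kM, m):                  # Authenticate then Encrypt (AtE)
    return Enc(kE, m + MAC(kM, m))   # tag encrypted together with the plaintext

def EtA(kE, kM, m):                  # Encrypt then Authenticate (EtA)
    c = Enc(kE, m)
    return c, MAC(kM, c)             # tag computed over the ciphertext
```

Note how only EtA allows the recipient to verify the tag before decrypting, a property we return to below.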
Exercise 4.15 (Generic AE and AEAD schemes). Above, we described only the 'encrypt-and-authenticate' function of the authenticated-encryption schemes for the three generic constructions, and even that we described informally, without the explicit implementation. Complete the description by writing explicitly, for each of the three generic constructions above, the implementation of the encrypt-and-authenticate (EnA) and the decrypt-and-verify (DnV) functions. Present also the AEAD (Authenticated Encryption with Associated Data) version.
Partial solution: we present only the solution for the A&E construction. The AE implementations are:

A&E.EnA_(k′,k′′)(m) = (Enc_k′(m), MAC_k′′(m))

A&E.DnV_(k′,k′′)(c, a): compute m ← Dec_k′(c); if a = MAC_k′′(m), return m; otherwise, return ERROR.

The AEAD implementations are very similar, except also with Associated Data (wAD); we present only the EnA function:

A&E.EnAwAD_(k′,k′′)(m, d; r) = (Enc_k′(m; r), d, MAC_k′′(m ++ d))
Some of these three generic constructions are insecure, as we demonstrate below for particular pairs of encryption and MAC functions. Can you identify - or guess - which? The answers were given, almost concurrently, by two beautiful papers [40, 241]; the main points are in the following exercises.
Exercise 4.16 shows that A&E is insecure; this is quite straightforward, and hence readers should try to solve it alone before reading the solution.
Exercise 4.16 (Authenticate and Encrypt (A&E) is insecure). Show that a pair of an encryption scheme Enc and a MAC scheme MAC may both be secure, yet their combination using the A&E construction would be insecure.
Solution: given any secure MAC scheme MAC, let

MAC′_k′′(m) = MAC_k′′(m) ++ m[1]

where m[1] is the first bit of m.
If MAC is a secure MAC then MAC′ is also a secure MAC. However, MAC′ exposes a bit of its input; hence, its use in A&E would allow the adversary to distinguish between encryptions of two messages, i.e., the resulting combined scheme is not IND-CPA secure - even when the underlying encryption scheme Enc is secure.
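This counterexample is easy to make concrete. The sketch below uses HMAC-SHA256 as a stand-in for the underlying secure MAC and appends the first bit of the message to the tag; in A&E the tag travels in the clear, so that bit is exposed to any eavesdropper.

```python
import hashlib, hmac

def MAC(k: bytes, m: bytes) -> bytes:
    return hmac.new(k, m, hashlib.sha256).digest()

def MACp(k: bytes, m: bytes) -> bytes:
    # MAC'(m) = MAC(m) ++ m[1]: still unforgeable (the tag contains a full
    # secure MAC), but the appended bit leaks plaintext information.
    first_bit = (m[0] >> 7) & 1   # most-significant bit of the first byte
    return MAC(k, m) + bytes([first_bit])
```

An eavesdropper seeing the A&E pair (C, A) reads the last byte of A and immediately distinguishes messages whose first bits differ, breaking IND-CPA.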
Exercise 4.17 shows that AtE is also insecure. The argument is more elaborate than the A&E argument from Exercise 4.16, and it may not be necessary to fully understand it on a first reading; however, it is a nice example of a cryptographic counterexample, so it may be worth investing the effort. Readers may also consult [241] for more details.
Exercise 4.17 (Authenticate then Encrypt (AtE) is insecure). Show that a pair of an encryption scheme Enc and a MAC scheme MAC may both be secure, yet their combination using the AtE construction would be insecure.
Solution: Consider the following simplified version of the Per-Block Random (PBR) mode presented in subsection 2.8.2, defined for single-block messages: Enc_k(m; r) = (m ⊕ E_k(r)) ++ r, where E is a block cipher; notice that this is also essentially OFB and CFB mode encryption, applied to single-block messages. When the random bits are not relevant, i.e., simply selected uniformly, we do not write them explicitly and use the simplified notation Enc_k(m).
As shown in Theorem 2.1, if E is a secure block cipher (or even merely a PRF or PRP), then Enc is an IND-CPA secure encryption scheme. Denote the block length by 4n, i.e., assume it is a multiple of 4. Hence, the output of Enc is 8n bits long.
We next define a randomized transform Split: {0,1} → {0,1}^2, i.e., from one bit to a pair of bits. The transform always maps 0 to 00, and randomly transforms 1 to {01, 10, 11} with the corresponding probabilities {49.9%, 50%, 0.1%}. We extend the definition of Split to 2n-bit-long strings, by applying Split to each input bit, i.e., given the 2n-bit input message m = m_1 ++ ... ++ m_2n, where each m_i is a bit, let Split(m) = Split(m_1) ++ ... ++ Split(m_2n).
We use Split to define a 'weird' variant of Enc, which we denote Enc′, defined as: Enc′_k(m) = Enc_k(Split(m)). The reader should confirm that, assuming E is a secure block cipher, Enc′ is an IND-CPA secure encryption scheme (for 2n-bit-long plaintexts).
Consider now AtE_{k,k′}(m) = Enc′_k(m ++ MAC_k′(m)) = Enc_k(Split(m ++ MAC_k′(m))), where m is an n-bit-long string, and where MAC has n-bit-long inputs and outputs. Hence, the input to Enc′ is 2n bits long, and therefore the input to Enc is 4n bits long - as we defined above.
However, AtE is not a secure authenticated-encryption scheme. In fact, given c = AtE_{k,k′}(m), we can decipher m using merely feedback-only CCA queries.
Let us demonstrate how we find the first bit m_1 of m. Denote the 8n bits of c as c = c_1 ++ c_2 ++ ... ++ c_8n. Perform the query c′ = c̄_1 ++ c̄_2 ++ c_3 ++ c_4 ++ ... ++ c_8n, i.e., inverting the first two bits of c. Recall that c = AtE_{k,k′}(m) = Enc_k(Split(m ++ MAC_k′(m))) and that Enc_k(m; r) = (m ⊕ E_k(r)) ++ r. Hence, by inverting c_1, c_2, we invert the two bits of Split(m_1) upon decryption.
The impact depends on the value of m_1. If m_1 = 0, then Split(m_1) = 00; by inverting these bits, we get 11, whose 'unsplit' transform returns 1 instead of 0, causing the MAC validation to fail and providing the attacker with an 'ERROR' feedback. However, if m_1 = 1, then Split(m_1) is either 01 or 10 (with probability 99.9%), and inverting both bits does not change the 'unsplit' result, so the MAC validation does not fail. This allows the attacker to determine the first bit m_1, with a very small (0.1%) probability of error (in the rare case where Split(m_1) returned 11).
Note that the AtE construction is secure for specific encryption and MAC schemes. However, it is not secure for arbitrary secure encryption and MAC schemes, i.e., as a generic construction. This leaves Encrypt-then-Authenticate (EtA) as the only remaining candidate generic construction. Fortunately, EtA is secure for any secure encryption and MAC scheme, as the following lemma states.
Lemma 4.4 (EtA is secure [241]). Given an IND-CPA encryption scheme Enc and a secure MAC scheme MAC, their EtA construction ensures both IND-CPA security and secure MAC.
Proof sketch: We first show that the IND-CPA property holds. Suppose, to the contrary, that there is an efficient (PPT) adversary A that 'wins' against EtA in the IND-CPA game with significant probability. We construct an adversary A′ that 'wins' in the IND-CPA game against the encryption scheme Enc, employed as part of the EtA scheme. Specifically, A′ generates a key k′′ for the MAC function, and runs A. Whenever A chooses the two challenge messages m_0, m_1, and should be provided with the authenticated encryption of m_b, then A′ chooses the same two messages and receives c* = Enc_k′(m_b). Then A′ uses the key k′′ it generated to compute a* = MAC_k′′(c*) and returns the pair (c*, a*), which is the authenticated encryption of m_b, as required.
Similarly, whenever A asks for the encryption of a message m, then A′ uses its oracle to compute c = Enc_k′(m), and uses k′′ to compute a = MAC_k′′(c). A′ then returns the pair (c, a) to A, which is exactly the required EtA.EnA_{k′,k′′}(m).
Finally, when A guesses a bit b, then A′ guesses the same bit. If A 'wins', i.e., guesses correctly, then A′ also 'wins'. It follows that there is no efficient (PPT) adversary A that 'wins' against EtA in the IND-CPA game.
We next show that EtA also ensures security against forgery, as in Def. 4.1, adjusted for AE/AEAD schemes as in Ex. 4.13. Suppose there is an efficient (PPT) adversary A that succeeds in forgery against the EtA scheme with significant probability. Namely, A produces a ciphertext c and tag a s.t. m = EtA.DnV_{k′,k′′}(c, a), for some message m, without making a query to EtA.EnA_{k′,k′′}(m). By construction, this implies that a = MAC_k′′(c).
However, from the definition of encryption (Def. 2.1), specifically the correctness property, there is no other message m′ ≠ m whose encryption would result in the same ciphertext c. Hence, A did not make any query to EtA.EnA_{k′,k′′} that returned MAC_k′′(c) as the tag - yet A obtained MAC_k′′(c) somehow - in contradiction to the assumed security of the MAC.
Additional properties of EtA: efficiency and ability to foil DoS, CCA
Not only is EtA secure given any secure encryption and MAC scheme - it also has three additional desirable properties:

Efficiency: Any corruption of the ciphertext, intentional or benign, is detected immediately by the verification process (comparing the received tag to the MAC of the ciphertext). This is much more efficient than performing the decryption.

Foil DoS: This improved efficiency implies that it is much harder, and rarely feasible, to exhaust the resources of the recipient by sending corrupted messages (ciphertext).

Foil CCA: By validating the ciphertext before decrypting it, EtA schemes prevent CCA attacks against the underlying encryption scheme, where the attacker provides specially-crafted ciphertext messages, receives the corresponding plaintext (or a failure indication if the ciphertext was not a valid encryption), and uses the resulting plaintext and/or failure indication to attack the underlying encryption scheme. If the attacker creates such a crafted ciphertext and sends it to the EtA scheme, it should fail the MAC validation, and would not even be input to the decryption process. Therefore, as long as the attacker cannot forge a legitimate MAC, they can only attack the MAC component of EtA, and the encryption scheme is protected from this threat.
4.7.4 Single-Key Generic Authenticated-Encryption
All three constructions above used two separate keys: k′ for encryption and k′′ for authentication. Sharing two separate keys may be harder than sharing a single key. Can we use a single key k for both the encryption and the MAC functions used in the generic authenticated encryption constructions (or, specifically, in the EtA construction, since it is always secure)? Note that this excludes the obvious naive 'solution' of using a 'double-length' key, split into an encryption key and a MAC key. The following exercise shows that such 'key re-use' is insecure.
Exercise 4.18 (Key re-use is insecure). Let E′, MAC′ be secure encryption and MAC schemes. Show (contrived) examples of secure encryption and MAC schemes, built using E′, MAC′, demonstrating vulnerabilities for each of the three generic constructions, when using the same key for authentication and for encryption.
Partial solution:

A&E: Let E_{k′,k′′}(m) = E′_{k′}(m) ++ k′′ and MAC_{k′,k′′}(m) = k′ ++ MAC′_{k′′}(m). Obviously, when combined using the A&E construction, the result is completely insecure - both authentication and confidentiality are completely lost.

AtE: To demonstrate the loss of authenticity, let E_{k′,k′′}(m) = E′_{k′}(m) ++ k′′ as above.

EtA: To demonstrate the loss of confidentiality, let MAC_{k′,k′′}(m) = k′ ++ MAC′_{k′′}(m) as above. To demonstrate the loss of authentication, here is a hint to one elegant solution: combine E_{k′,k′′}(m) = E′_{k′}(m) ++ k′′ as above, with a (simple) extension of Example 4.3.
The reader is encouraged to complete the missing details, and in particular, to show that all the encryption and MAC schemes used in the solution are secure (albeit contrived) - only their combined use, in the three generic constructions, is insecure.
Since we know that we cannot re-use the same key for both encryption and MAC, the next question is: can we derive two separate keys k′, k′′ from a single key k, and if so, how? We leave this as a (not too difficult) exercise.
Exercise 4.19 (Generating two keys from one key). Given a secure n-bit-key shared-key encryption scheme (E, D), a secure n-bit-key MAC scheme MAC, and a single random, secret n-bit key k, show how we can derive two keys (k′, k′′) from k, s.t. the EtA construction is secure when using k′ for encryption and k′′ for MAC, given:
1. A secure n-bit-key PRF f.
2. A secure n-bit-key block cipher (Ê, D̂).
3. A secure PRG from n bits to 2n bits.
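For item 1, a common approach is to evaluate the PRF on two fixed, distinct inputs. The sketch below uses HMAC-SHA256 as a stand-in for the PRF f, and the labels 0x00/0x01 are arbitrary distinct points (our own choice); a block cipher or a PRG can be used similarly.

```python
import hashlib, hmac

def derive_keys(k: bytes):
    # k' = f_k(0), k'' = f_k(1): two pseudorandom, independent-looking keys
    # from a single shared key k. HMAC-SHA256 stands in for the PRF here.
    k_enc = hmac.new(k, b"\x00", hashlib.sha256).digest()
    k_mac = hmac.new(k, b"\x01", hashlib.sha256).digest()
    return k_enc, k_mac
```

Since f is a PRF, the pair (f_k(0), f_k(1)) is indistinguishable from two independent random keys, which is exactly what the EtA security argument requires.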
4.7.5 Authentication, encryption, compression and error detection/correction codes
Data is often encoded for additional, non-security goals:

Reliability via error detection and/or correction, typically using an Error Detection Code (EDC), such as a checksum, or an Error Correction Code (ECC), such as Reed-Solomon codes. These mechanisms are designed against random errors, and are not secure against intentional, 'malicious' modifications. Note that secure message authentication, such as using a MAC, also ensures error detection; however, as we explain below, it is often desirable to also use the (insecure) error detection codes.
Compression for efficiency. Compression is applied to improve efficiency by reducing message length. As we explain below, this requirement may conflict with the confidentiality requirement.
[Figure 4.3 shows the recommended pipeline: the message is first compressed into the plaintext; the plaintext is encrypted into the ciphertext; the MAC is computed over the header, sequence number and ciphertext, producing a tag; finally, the EDC/ECC code is computed over the header, ciphertext and tag.]

Figure 4.3: Combining Security (encryption, authentication) with Reliability (EDC/ECC) and Compression. Applying the EDC/ECC after the MAC allows recipients to discard messages corrupted by noise, without computing the MAC function; this saves overhead, and allows the recipient to detect attacks on the authentication.
In this subsection, we discuss how to correctly and securely combine the security goals of encryption and authentication with these additional goals of reliability and compression (for efficiency). Fig. 4.3 presents the recommended process for combining all of these functions. Let us explain this recommendation:

• Compression is only effective when applied to data with significant redundancy; plaintext is often redundant, in which case applying compression to it could be effective. In contrast, ciphertext would normally not have redundancy. Hence, if compression is used, it must be applied before encryption. Note, however, that this may conflict with the confidentiality requirement, as we explain below; for better confidentiality, avoid compression completely, or take appropriate measures to limit the possible exposure.
• Encryption is applied next, before authentication (MAC), following the 'Encrypt-then-Authenticate' construction. Alternatively, we may use an authenticated-encryption with associated data (AEAD) scheme to combine the encryption and authentication functions. Notice that by applying authentication after encryption, or by using an AEAD scheme, we also facilitate authentication of a sequence number or similar field used to prevent replay/reordering/omission, which is often known to the recipient, and hence may not be sent explicitly. We can also authenticate 'header' fields such as the destination address, which are also not encrypted, since they are used to process and route the encrypted message. The Encrypt-then-Authenticate mode also allows the prevention of chosen-ciphertext attacks and more efficient handling of corrupted messages.
• Finally, we apply the error correction/detection code. This allows efficient handling of messages corrupted due to noise or other benign reasons. An important side benefit is that an authentication failure for a message in which no errors were detected implies an intentional forgery attack - the attacker made sure that the error-detecting code would be correct.
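The recommended process can be sketched as follows. This is a toy illustration of the pipeline in Fig. 4.3, not the TLS record protocol: the keystream construction, field layout and helper names (send, recv) are our own, and the sequence number serves as an implicit nonce that is authenticated but never transmitted.

```python
import binascii, hashlib, hmac, zlib

def _stream(k: bytes, nonce: bytes, n: int) -> bytes:
    out, i = b"", 0
    while len(out) < n:
        out += hashlib.sha256(k + nonce + i.to_bytes(4, "big")).digest()
        i += 1
    return out[:n]

def send(kE: bytes, kM: bytes, seq: int, header: bytes, msg: bytes) -> bytes:
    plaintext = zlib.compress(msg)               # 1. compress (see caveat below)
    nonce = seq.to_bytes(8, "big")
    ct = bytes(a ^ b for a, b in zip(plaintext, _stream(kE, nonce, len(plaintext))))
    # 2. Encrypt-then-Authenticate: tag covers implicit seq, header, ciphertext
    tag = hmac.new(kM, nonce + header + ct, hashlib.sha256).digest()
    body = header + ct + tag
    return body + binascii.crc32(body).to_bytes(4, "big")   # 3. EDC applied last

def recv(kE: bytes, kM: bytes, seq: int, header_len: int, pkt: bytes):
    body, edc = pkt[:-4], pkt[-4:]
    if binascii.crc32(body).to_bytes(4, "big") != edc:
        return None       # noise: discarded cheaply, without computing the MAC
    header, ct, tag = body[:header_len], body[header_len:-32], body[-32:]
    nonce = seq.to_bytes(8, "big")
    if not hmac.compare_digest(tag, hmac.new(kM, nonce + header + ct, hashlib.sha256).digest()):
        return None       # EDC passed but MAC failed: likely intentional forgery
    pt = bytes(a ^ b for a, b in zip(ct, _stream(kE, nonce, len(ct))))
    return zlib.decompress(pt)
```

Note the two distinct failure modes in recv: an EDC failure is handled cheaply as noise, while a MAC failure on an error-free packet signals an attack.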
Compress-then-Encrypt Vulnerability. Note that there is a subtle vulnerability in applying compression before encryption, since encryption does not hide the length of the plaintext, while the length of compressed messages depends on their contents. In particular, a message containing randomly-generated strings typically does not compress well (the length after compression is roughly the same as before compression), while messages containing lots of redundancy, e.g., strings composed of only one character, compress well (the length after compression is much shorter). This allows an attacker to distinguish between the encryptions of two compressed messages, based on the redundancy of the plaintexts.
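The length leak is easy to observe with a standard compressor. In this sketch, zlib stands in for whatever compression the protocol applies; since typical encryption preserves (approximately) the plaintext length, comparing compressed lengths is enough to see what the ciphertext lengths would reveal.

```python
import os, zlib

def ciphertext_len(plaintext: bytes) -> int:
    # A length-preserving cipher (e.g., a stream cipher) reveals exactly the
    # compressed length, so the compressed size models the ciphertext size.
    return len(zlib.compress(plaintext))

redundant = b"a" * 1000          # highly redundant: compresses to a few bytes
random_data = os.urandom(1000)   # random: essentially incompressible
```

An eavesdropper comparing ciphertext lengths immediately distinguishes the redundant plaintext from the random one, despite the encryption.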
This vulnerability was first presented in [230], and later exploited in several attacks, including attacks on the record protocol of SSL and TLS; see subsection 7.2.6 and Exercise 7.6.
4.8 Additional exercises
Exercise 4.20. Mal intercepts a message sent from Alice to her bank, instructing the bank to transfer 10$ to Bob. Assume that the communication is protected by One-Time-Pad (OTP) encryption, using a random key shared between Alice and her bank, and by including Alice's password as part of the plaintext, validated by the bank. Assume Mal knows that the message is an ASCII encoding of the exact string Transfer 10$ to Bob. From: Alice, PW:, concatenated with Alice's password (unknown to Mal). Show how Mal can change the message so that, upon receiving it, the bank will instead transfer 99$ to Mal. (The modified message must have the same password!)
Exercise 4.21. Let S be a correct signature scheme over domain {0,1}^n, and let h: {0,1}* → {0,1}^n be a hash function whose output is n bits long. Prove that the HtS construction S^h_HtS, defined as in Equation 3.6, is correct.
Exercise 4.22 (TCR hash is not necessarily a MAC). Let h_k(m) be a target collision resistant (TCR) hash function (subsection 3.2.3). Show a keyed hash function h′_k(m) which (1) is also a TCR hash function but (2) is not a secure MAC.
Exercise 4.23 (A MAC is not necessarily a CRHF). Let MAC_k(m) be a secure MAC function. Show a keyed hash function h_k(m) which (1) is a secure MAC yet (2) is not a (keyed) CRHF or a TCR hash function.
Exercise 4.24. Let S be an existentially unforgeable signature scheme over domain {0,1}^n, and let h(x ++ y) = x ⊕ y be a hash function whose input is 2n bits long, and whose output is the n-bit string resulting from the bit-wise exclusive-OR of the most-significant n input bits with the least-significant n input bits. Show an attacker A demonstrating that the HtS construction S^h_HtS, defined as in Equation 3.6, is not an existentially unforgeable signature scheme.
Exercise 4.25. Hackme Inc. proposes the following highly-efficient MAC, using two 64-bit keys k1, k2, for 64-bit blocks: MAC_{k1,k2}(m) = (m ⊕ k1) + k2 (mod 2^64). Show that this is not a secure MAC.
Hint: Compare to Exercise 2.49.
Exercise 4.26. Let F: {0,1}^n → {0,1}^l be a secure PRF, from n-bit strings to l < n bit strings. Define F′: {0,1}^n → {0,1}^{2l} as: F′_k(m) = F_k(m) ++ F_k(m̄), i.e., concatenate the results of F_k applied to m and to the (bitwise) inverse m̄ of m. Present an efficient algorithm ADV^{F′_k} which demonstrates that F′ is not a secure MAC, i.e., outputs a tuple (x, t) s.t. x ∈ {0,1}^n and t = F′_k(x). Algorithm ADV^{F′_k} may provide input m ∈ {0,1}^n and receive F′_k(m), as long as x ≠ m. You can present ADV^{F′_k} by 'filling in the blanks' in the 'template' below, modifying and/or extending the template if desired, or simply write your own code if you like.

ADV^{F′_k}: { t′ = F′_k( ____ ); Return ( ____ ); }
Exercise 4.27. Consider CFB-MAC, defined below, similarly to the definition of CBC-MAC (Eq. (4.2)):

CFB-MAC^E_k(m_1 ++ m_2 ++ ... ++ m_η) = { c_0 ← 0^l; for i = 1...η: c_i = m_i ⊕ E_k(c_{i−1}); output c_η }
1. Show an attack demonstrating that CFB-MAC^E_k is not a secure l·η-bit MAC, even when E is a secure l-bit block cipher (PRP). Your attack should consist of:

a) Up to three 'queries', i.e., messages m, m′ and m′′, each of one or more blocks, for which the attacker receives CFB-MAC^E_k(m), CFB-MAC^E_k(m′) and CFB-MAC^E_k(m′′). Note: one query suffices, although you may use up to three.
• m =
• m′ =
• m′′ =

b) A forgery, i.e., a pair of a message m_F = ____ and its authenticator a = ____, such that m_F ∉ {m, m′, m′′} and a = CFB-MAC^E_k(m_F).
2. Would your attack also work against the 'improved' variant ICFB-MAC^E_k(m) = E_k(CFB-MAC^E_k(m))? If not, present an attack against ICFB-MAC^E_k(m):
• m =
• m′ =
• m′′ =
• m_F =
• a =
Exercise 4.28.
1. Alice sends Bob the 16-byte message 'I love you Bobby', where each character is encoded using one-byte (8-bit) ASCII encoding. Assume that the message is encrypted using the (64-bit) DES block cipher, in OFB mode. Show how an attacker can modify the ciphertext message so that it decrypts to 'I hate you Bobby'.
2. Can you repeat this for CFB mode? Show how, or explain why not.
3. Can you repeat this for CBC mode? Show how, or explain why not.
4. Repeat the previous items, if we append to the message its CRC, and verify it upon decryption.
Exercise 4.29.
1. Our definition of FIL CBC-MAC assumed that the input is a complete number of blocks. Extend the construction to allow input of arbitrary length, and prove its security.
2. Repeat, for VIL CBC-MAC.
Exercise 4.30. Consider a variant of CBC-MAC, where the value of the IV is not a constant, but instead the value of the last plaintext block, i.e.:

CBC-MAC^E_k(m_1 ++ m_2 ++ ... ++ m_η) = { c_0 ← m_η; for i = 1...η: c_i = E_k(m_i ⊕ c_{i−1}); output c_η }

Is this a secure MAC? Prove it, or present a convincing argument.
Exercise 4.31. Let E be a secure PRF. Show that the following are not secure
MAC schemes.
1. ECB-encryption of the message.
2. The XOR of the output blocks of ECB-encryption of the message.
Exercise 4.32 (MAC from a PRF). In Exercise 2.38 you were supposed to construct a PRF, with input, output and keyspace all of 64 bits. Show how to use such a (candidate) PRF to construct a VIL MAC scheme.
Exercise 4.33. This question discuss a (slightly simplified) vulnerability in
a recently proposed standard. The goal of the standard is to allow a server
S to verify that a given input message was ‘approved’ by a series of őlters,
F1 , F2 , . . . , Ff (each filter validates certain aspects of the message). The server
S shares a secret ki with each filter Fi . To facilitate this verification, each
message m is attached with a tag; the initial value of the tag is denoted T0 and
and each filter Fi receives the pair (m, Ti−1 ) and, if it approves of the message,
outputs the next tag Ti . The server s will receive the final pair (m, Tf ) and use
Tf to validate that the message was approved by all filters (in the given order).
A proposed implementation is as follows. The length of the tag would be the
same as of the message and of all secrets ki , and that the initial tag T0 would be
set to the message m. Each filter Fi signals approval by setting Ti = Ti−1 ⊕ ki .
To validate, the server receives (m, Tf ) and computes m′ = Tf ⊕k1 ⊕k2 ⊕. . .⊕kf .
The message is considered valid if m′ = m.
1. Show that, in the proposed implementation, if the tag Tf is computed as
planned (i.e., as described above), then the message is considered valid if
and only if all filters approved of it.
2. Show that the proposed implementation is insecure.
3. Present a simple, efficient and secure alternative design for the validation
process.
Applied Introduction to Cryptography and Cybersecurity
CHAPTER 4. AUTHENTICATION: MAC, BLOCKCHAIN AND SIGNATURE SCHEMES
4. Improve your method to achieve good performance even when messages
are very long (having a tag as long as the message is impractical).
Note: you may combine the solutions to the last two items; but separating
the two is recommended, to avoid errors and to minimize their impact.
Exercise 4.34 (Single-block authenticated encryption?). Let E be a block
cipher (or PRP or PRF) with input domain {0,1}^l, and let l′ < l. For input
domain m ∈ {0,1}^{l−l′}, let fk(m) = Ek(m ∥ 0^{l′}).
1. Prove or present counterexample: f is a secure MAC scheme.
2. Prove or present counterexample: f is an IND-CPA symmetric encryption
scheme.
Exercise 4.35. Let F : {0,1}^κ × {0,1}^{l+1} → {0,1}^{l+1} be a secure PRF,
where κ is the key length, and both inputs and outputs are l + 1 bits long. Let
F′ : {0,1}^κ × {0,1}^{2l} → {0,1}^{2l+2} be defined as F′_k(m0 ∥ m1) = Fk(0 ∥ m0) ∥ Fk(1 ∥ m1), where |m0| = |m1| = l.
1. Explain why it is possible that F′ would not be a secure 2l-bit MAC.
2. Present an adversary and/or counter-example, showing F′ is not a secure
2l-bit MAC.
3. Assume that, indeed, F′ is not a secure MAC. Could F′ then be a secure
PRF? Present a clear argument.
Exercise 4.36. Given a keyed function fk (x), show that if there is an efficient
operation ADD such that fk (x+y) = ADD(fk (x), fk (y)), then f is not a secure
MAC scheme. Note: a special case is when ADD(a, b) = a + b.
Exercise 4.37 (MAC from other block cipher modes). In subsection 4.5.2 we
have seen that, given an n-bit block cipher (E, D), the CBC-MAC, as defined in Eq.
(4.2), is a secure n · η-bit PRF and MAC, for any integer η > 0; and in Ex. 4.4
we have seen this does not hold for CTR-mode MAC. Does this property hold
for...
ECB-MAC, defined as: ECB-MAC_k^E(m1 ∥ ... ∥ mη) = Ek(m1) ∥ ... ∥ Ek(mη)
PBC-MAC, defined as: PBC-MAC_k^E(m1 ∥ ... ∥ mη) = m1 ⊕ Ek(1) ∥ ... ∥ mη ⊕ Ek(η)
OFB-MAC, defined as: OFB-MAC_k^E(m1 ∥ ... ∥ mη) = pad0, m1 ⊕ Ek(pad0) ∥ ... ∥ mη ⊕ Ek(padη−1), where pad0 is random and padi = Ek(padi−1)
CFB-MAC, defined as: CFB-MAC_k^E(m1 ∥ ... ∥ mη) = c0, c1, ..., cη, where c0 is random and ci = mi ⊕ Ek(ci−1) for i ≥ 1
XOR-MAC, defined as: XOR-MAC_k^E(m1 ∥ ... ∥ mη) = ⊕_{i=1}^{η} Ek(i ⊕ Ek(mi))
Justify your answers, by presenting a counterexample (for incorrect claims) or
by showing how, given an adversary against the MAC function, you can construct
an adversary against the block cipher.
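To make these candidate definitions concrete, here is a small Python sketch of the first three modes. It is not from the textbook: the 'block cipher' E is a toy keyed function built from SHA-256, good enough to illustrate the definitions but carrying no security claim, and all helper names are ours.

```python
import hashlib

BLOCK = 16  # toy block size in bytes

def E(k: bytes, x: bytes) -> bytes:
    """Toy stand-in for the block cipher E_k: a keyed function built
    from SHA-256, truncated to one block. NOT a real block cipher."""
    assert len(x) == BLOCK
    return hashlib.sha256(k + x).digest()[:BLOCK]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def blocks(m: bytes):
    assert len(m) % BLOCK == 0, "message must be a whole number of blocks"
    return [m[i:i + BLOCK] for i in range(0, len(m), BLOCK)]

def cbc_mac(k: bytes, m: bytes) -> bytes:
    c = bytes(BLOCK)            # IV = 0^n, as in Eq. (4.2)
    for mi in blocks(m):
        c = E(k, xor(mi, c))
    return c                    # only the last block is output

def ecb_mac(k: bytes, m: bytes) -> bytes:
    return b"".join(E(k, mi) for mi in blocks(m))

def pbc_mac(k: bytes, m: bytes) -> bytes:
    out = b""
    for i, mi in enumerate(blocks(m), start=1):
        out += xor(mi, E(k, i.to_bytes(BLOCK, "big")))
    return out
```

Note, e.g., that in ECB-MAC swapping two message blocks simply swaps the corresponding tag blocks; whether such structure breaks MAC security is exactly what the exercise asks you to determine.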
Exercise 4.38. Let (Enc, Dec) be an IND-CPA secure encryption scheme,
and let Enc′k (m) = Enck (Compress(m)), where Compress is a compression
function. Show that Enc′ is not IND-CPA secure.
Exercise 4.39. Figure 4.3 shows the typical sequence of operations when sending
a message, with confidentiality (encryption), authentication (MAC), error
detection/correction code and (optional) compression. Show the corresponding
sequence of operations when receiving such a message, in a figure and/or
in pseudo-code. In both cases, clarify the reaction if the MAC or EDC/ECC
validation fails.
Chapter 5
Shared-Key Protocols
In the previous chapters, we discussed cryptographic schemes, which consist of
one or more functions. For example, MAC, PRF and PRG schemes consist of a
single function, with different security criteria, e.g., for MAC, security against
forgery (Definition 4.1). Similarly, encryption schemes consist of multiple
functions (encryption, decryption and possibly key generation), with criteria
such as CPA-indistinguishability (CPA-IND, Definition 2.9).
Most often, cryptographic schemes are used as a part of a protocol involving
two or more parties (entities). In this chapter, we focus on the following basic
shared-key protocols used for securing the communication between two parties:
Session/record protocols secure the communication of a session between two
parties, which typically includes exchange of multiple messages, using a
key shared between the two parties. See Section 5.1.
Entity-authentication protocols ensure the identity of a peer involved in
the communication. We discuss the vulnerable SNA protocol, and its
replacement, the 2PP protocol, in Section 5.2.
Request-response protocols ensure the authentication and/or confidentiality
of the communication between two parties. We discuss them in Section 5.3.
Key Exchange protocols are run between the two parties, to establish shared
keys to encrypt and authenticate communication. In this chapter, we focus
on shared-key Key Exchange protocols, which use a key already shared
between the parties to establish keys for encryption and authentication;
these protocols ensure improved security compared to direct use of the
shared key for these functions. See Section 5.4.
Key distribution protocols establish shared keys between two parties, with
the help of a trusted third party (TTP), often referred to as the Key Distribution Center (KDC). We discuss key distribution protocols in Section 5.5;
much of our discussion is dedicated to studying the vulnerabilities of the
GSM protocol.
Resilient Key Exchange protocols are Key Exchange protocols with mechanisms
to reduce the damage due to key exposure. These include forward
secrecy, perfect forward secrecy (PFS) and recover-security Key Exchange
protocols. See Section 5.7.
This chapter is mostly informal. In particular, we do not present rigorous
definitions of security as we did in the previous chapters, and as we do in
Section 5.1.1. This is because precise definitions of the execution of protocols,
and of the corresponding requirements and models, seem to be unavoidably
too complex for this textbook. We hope that the informal discussion will
clarify the main issues and empower readers to properly use cryptographic
protocols and avoid pitfalls. Hopefully, the discussion will also prepare readers
interested in the design and analysis of cryptographic protocols to study more
advanced texts which address these challenges. Some of the many relevant texts
include [46, 87, 166, 358] and [196].
5.1 Modeling cryptographic protocols
We begin our discussion by informally explaining what we mean by a cryptographic
protocol. We use the term 'cryptographic protocol' to refer to cryptographic
algorithms that involve interactions between two or more distinct entities1,
including benign entities and an adversary.
In this textbook, as in many works in cryptography, we focus on protocols
involving only two benign parties, often called Alice (A) and Bob (B), and
one Man-in-the-Middle (MitM) adversary. Both benign entities run the same
protocol P, which is an efficient (PPT) algorithm; to model parties with different
roles, e.g., client vs. server, the 'role' can be provided as part of their initial
state. The adversary also runs an efficient (PPT) algorithm.
An execution of a cryptographic protocol P, with an adversary M, is the
random outcome of a process that we refer to as the execution process. Typically,
all interactions between benign parties, as well as interactions with the adversary,
are done via the execution process; the execution process passes outputs from the
benign parties to the adversary, and lets the adversary control the inputs to the
benign parties, i.e., the adversary has Man-in-the-Middle (MitM) capabilities.
We assume that the execution process allows the protocol to securely initialize
the parties, which may involve sharing of secret keys or of public keys.
Typically, a protocol has at least two interfaces: the interface with the
application using the protocol (APP), and the interface with the network,
allowing the protocol to communicate with the other entity (NET). A third
interface is often used to provide the protocol with system services such as a
clock (SYS).
1 In some works, the term ‘cryptographic protocol’ is used in a different way, mostly to
mean what we refer to as a ‘cryptographic scheme’, i.e., not necessarily involving interactions
between entities; e.g., you may see mention of ‘encryption protocol’.
[Figure 5.1 diagram: Alice's protocol, with its APP interface (input send(m); outputs received(m), failure), SYS interface (output sleep(δ); input wake-up(t)) and NET interface (output send(µ); input received(µ)).]
Figure 5.1: Interactions for the record/session protocols, illustrated for Alice;
Bob has the same interfaces. Note that while we use the labels send and
received for both the APP and NET layers, the semantics and the messages
sent (m and µ) are very different. Typically, µ will contain an encoding of m,
possibly encrypted and/or authenticated, and a header that identifies sender,
recipient and key; see text for details. Other protocols we study use the same
SYS and NET events, but different APP events.
5.1.1 The session/record protocol
We illustrate the concept of a cryptographic protocol by focusing on the simple
yet important session/record protocol. A session/record protocol uses a key
shared between the two parties, to authenticate and/or encrypt the messages or
records sent in a session or connection between them. In Chapter 7, we present
a practical session/record protocol; this is the TLS record protocol, which is
part of TLS.
Session/record protocols are among the simplest practical protocols, in
particular, simpler than the other protocols we discuss, and therefore a good
example.
Let us first describe the APP, SYS and NET interfaces of the session/record
protocol, and the operations in each of them, as illustrated in Figure 5.1.
Application (APP): an interface for input/output interactions between a
benign party and the application which uses it, to which we sometimes
refer as the user of the protocol. These interactions provide the inputs
from the application or user to the protocol running on the (benign) party,
and allow the protocol to provide outputs to the application/user. For the
session/record protocol, the only input interaction in the APP interface
is the transmission of a message m from the application, to be sent to the peer
in a send(m) interaction. There are two output interactions: the receipt
of a message m from the peer, in a received(m) event, and an indication
that the protocol cannot send information due to a communication failure,
in a failure interaction. Here, send, received and failure are labels and
m is a value.
Network (NET): an interface for the communication between benign parties,
allowing a benign party to exchange messages with another benign party,
subject to manipulations by the adversary. To send a message µ to the
peer, the protocol uses the send(µ) output event on the NET interface;
send is the label and µ is the value. We use the symbol µ for the messages
in the NET interface, to separate them from the symbol m which we use
for messages in the APP interface. Typically, µ consists of two parts: a
header, identifying the sender, recipient and key(s) used, and a payload,
which contains an encoding of m, possibly encrypted and/or authenticated.
In a received(µ) input event on the NET interface, the protocol receives a
message µ, which typically contains a header which identifies the purported
peer who sent the message, as well as the key(s) to be used, as needed, to
decrypt and/or authenticate the message.
System (SYS): an interface to other interactions of the protocol, typically
to 'local' services such as clock, sensors and relays/actuators. In this
textbook, we only use the clock service, with two interactions. Specifically, the
protocol may invoke sleep(δ) to request a 'wake-up call' after δ time units
(e.g., seconds); at that time, the protocol should receive an incoming
wake-up(t) event, with t indicating the current time. This interface provides
both a 'wake-up' service and a 'clock lookup' service (using sleep(0)).
The APP interface events are specific to each type of protocol, e.g., they differ
between session/record protocols and entity-authentication protocols. However,
all the protocols we study use the same SYS and NET events, as described
above.
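As a concrete (hypothetical) sketch of these three interfaces, a protocol party can be organized as an event-driven object; all class and method names below are our own, chosen for illustration only:

```python
class ProtocolParty:
    """Skeleton of a protocol entity with APP, NET and SYS interfaces.
    The surrounding execution process calls the input handlers and
    collects the output events that the party emits."""

    def __init__(self, role: str):
        self.role = role          # e.g., 'initiator' or 'responder'
        self.outputs = []         # recorded (interface, label, value) events

    # --- APP interface: inputs from / outputs to the application ---
    def app_send(self, m: bytes):          # input event: send(m)
        raise NotImplementedError          # protocol-specific

    def app_deliver(self, m: bytes):       # output event: received(m)
        self.outputs.append(("APP", "received", m))

    # --- NET interface: messages to / from the peer (via the adversary) ---
    def net_receive(self, mu: bytes):      # input event: received(mu)
        raise NotImplementedError          # protocol-specific

    def net_send(self, mu: bytes):         # output event: send(mu)
        self.outputs.append(("NET", "send", mu))

    # --- SYS interface: clock service ---
    def sleep(self, delta: float):         # output event: sleep(delta)
        self.outputs.append(("SYS", "sleep", delta))

    def wake_up(self, t: float):           # input event: wake-up(t)
        pass                               # protocol-specific
```

The execution process would drive such objects by invoking the input handlers (app_send, net_receive, wake_up) and routing the emitted output events, typically through the adversary for NET events.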
Adversary capabilities and security requirements. Our description of the
interactions focused on their intended actions; however, obviously, the adversary
can significantly interfere with the interactions in different ways. Following the
Attack Model principle (Principle 1), we only limit the adversary's capabilities,
rather than assuming a specific attacker strategy. The adversary's capabilities
include both computational capabilities and access capabilities.
In most works in cryptography, the computational capabilities of the adversary
are limited by requiring the adversary to be an efficient (PPT) algorithm
(see Section A.1), i.e., its run time must be polynomial in the length of its
inputs. The protocol should also be a PPT algorithm, and have computational
capabilities comparable to those of the adversary. To ensure this, we provide
an input called the security parameter, which we denote 1^l, to both adversary
and protocol, which means that they are both limited to run a polynomial
number of steps in the length of 1^l. Note that 1^l is the number l encoded in unary,
i.e., it consists of l bits, all with value 1 (see Table 1.1).
The access capabilities define the ability of the adversary to observe the
outputs and control the inputs of the benign parties. Like most works in
cryptography, we focus on the Man-in-the-Middle (MitM) adversary, which
observes all outputs and determines all inputs of the benign parties. Note that
some security requirements restrict the adversary's capabilities; in particular,
in subsection 6.1.3 we study the key exchange problem, which assumes an
eavesdropping adversary, who can observe messages but cannot modify, inject
or drop messages.
Following the conservative design principle (Principle 3), we allow the adversary
complete control and observation capabilities over the application interface.
In the case of the session/record protocol, this means that the adversary
determines when a send(m) request from the application occurs, and what
the message m is, except in distinguishability games, where the adversary provides
two equally-long inputs, one of which is used as the input, and the adversary
has to try to guess which input was used.
The ability to control the input message m corresponds to the chosen
plaintext attack (CPA) model for encryption and the chosen message attack
(CMA) model for signatures, and to the other attack models defined in Chapter 2 for
encryption and in subsection 1.5.2 for signatures.
The authentication requirement of session/record protocols. Intuitively,
an execution satisfies existential unforgeability if the sequence of messages
received by Bob is a prefix of the sequence of messages sent by Alice. A
session/record protocol P ensures existential unforgeability if executions with
every PPT adversary satisfy existential unforgeability, except with negligible
probability.
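The prefix condition is easy to state in code; the following helper (ours, purely illustrative) checks it for one execution transcript:

```python
def existentially_unforgeable(sent: list, received: list) -> bool:
    """Check the authentication condition for one execution: the
    sequence of messages the receiver delivered must be a prefix of
    the sequence of messages the sender sent."""
    return len(received) <= len(sent) and sent[:len(received)] == received
```

For example, receiving [m1] after [m1, m2] were sent satisfies the condition (some trailing messages may be lost), while receiving [m2] first does not.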
Session/record protocols are typically also expected to ensure confidentiality,
which is usually defined using an indistinguishability game, along the same
principles used in our definitions in Chapter 2.
5.1.2 PEtA: a simple EtA session/record protocol
We now present the simple EtA session/record protocol, which we denote PEtA.
This is a very simple record protocol, following the Encrypt-then-Authenticate
(EtA) design (Section 4.7), and assuming an underlying reliable-communication
service, such as provided by the widely-used TCP standard; read about TCP
in [320] and in introductory textbooks on Internet protocols, e.g., [245]. The use
of EtA makes PEtA a highly-simplified variant of the IPsec ESP protocol [153],
an important Internet security standard, while its assumption of an underlying
reliable-communication service is similar to the SSL and TLS record protocols,
which we discuss in Chapter 7. Note that the TLS record protocol uses either
the (less-preferred, often vulnerable) Authenticate-then-Encrypt (AtE) design,
or the AEAD design (subsection 4.7.1), which is secure, but different from EtA.
We describe the protocol for two specific parties {A, B}, although, of course,
it can be trivially extended to support arbitrary pairs of parties. We also make
several other simplifications, as listed below.
Simplifications of PEtA. The PEtA protocol makes the following simplifications:
Assumes a reliable network (connection): the protocol delivers messages
only if every packet sent by one party to its peer, over the network (NET
interface), is delivered reliably. Reliable delivery means that the sequence
of packets is delivered exactly as sent: no forgery, reordering, modification,
duplication, or loss of packets. However, we do allow that some of the
last messages sent are not delivered. The protocol ignores ('drops') any
packet received which was not sent, or which was received out of sequence, with a
'fail' indication to the application. In practice, this simplification implies
that the protocol assumes an underlying reliable communication protocol,
such as TCP.
Does not ensure a reliable connection: while PEtA assumes an underlying
reliable connection, it does not guarantee a reliable connection. The
protocol does ensure that messages received have been sent by the peer,
but some messages sent may be lost (without indication to the sender or
recipient), reordered or duplicated, and we may receive 'stale' messages
sent an unlimited time ago (no freshness).
No compression, padding or fragmentation: the protocol sends messages
of unbounded size 'as-is', without applying compression and without
fragmenting long messages (into multiple bounded-length fragments).
Furthermore, the protocol assumes that the MAC and encryption functions
receive arbitrary-length messages, i.e., the protocol does not perform
any padding before applying MAC and encryption.
Assumes initialization of two shared keys: the protocol assumes initialization
of a shared encryption key kE, as well as a shared authentication
key kMAC.
For a more practical session/record protocol, see Chapter 7, where we
present the TLS record protocol, possibly the most important and widely used
session/record protocol. The TLS record protocol avoids these simplifications,
except the first, i.e., it also assumes an underlying reliable network (connection).
Practical session/record protocols that avoid all these simplifications include the
DTLS [330] protocol, which is an adaptation of TLS allowing it to run over an
unreliable network service, and the IPsec [127, 153] protocol.
Explanation of the simple secure session/record protocol PEtA. The
PEtA protocol uses both a shared-key (symmetric) cryptosystem (E, D), using
a (shared) key kE, and a Message Authentication Code (MAC) scheme MAC,
using a (shared) key kMAC. The description is simplified by ignoring the key-generation
of both the cryptosystem and the MAC scheme; we assume that the
two keys, kE and kMAC, are generated and shared securely with the
two parties before the protocol begins.
Receiving a message from the application. Assume party p ∈ {A, B}
receives message m from the application, i.e., has a send(m) event on the
APP interface. The protocol should send the incoming message m, properly
encrypted and authenticated. The protocol follows the recommended Encrypt-then-Authenticate
(EtA) approach (subsection 4.7.5). Namely, the protocol
first encrypts the message m. The ciphertext is c ←$ EkE(m); we use the
←$ notation to emphasize that the encryption algorithm is randomized, i.e.,
the ciphertext c is not a deterministic function of the message m. Then the
protocol computes the authenticator: a ← MACkMAC(c ∥ (sent + 1) ∥ p),
where sent is the number of messages sent so far. The input to the
MAC includes the sequence number sent + 1 of this message-send event, and
the sender identification p; this prevents manipulation by a MitM adversary.
Finally, the protocol performs a send request on the network (NET interface).
The packet sent is the pair (c, a), i.e., ciphertext and authenticator; we use the
term packet for the information sent by the protocol, to distinguish it
from the message which is sent by the application using the protocol. Notice
that the protocol does not send the sent-messages counter sent or the sender
identification p, since the underlying networking service is assumed to ensure
reliable delivery; namely, the messages should be received in the order sent,
and therefore the recipient can count them and should have exactly the same inputs
to the MAC; see below.
An incoming received(µ) from the network (NET interface). Since
we assumed a reliable network service, the incoming packet µ should be the
next packet sent by the peer. The protocol parses the incoming packet µ
as a pair (c, a). It verifies that the authenticator is valid, i.e., that a =
MACkMAC(c ∥ (rcved + 1) ∥ p̂), where p̂ denotes the peer (i.e., the sender) and
rcved is the counter of messages received successfully so far. If the authenticator
a is valid, then the protocol passes to the application the plaintext message
m ← DkE(c), i.e., the decryption of the ciphertext part c. The plaintext
message m should be the (rcved + 1)-th message sent by the peer p̂ to p; it would
now also be the (rcved + 1)-th message received by p from p̂. If validation fails, the
protocol drops the packet, possibly providing an indication of the failure to the
application.
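The send and receive processing just described can be sketched in Python. This is our own toy rendering, not the book's or any standard's code: the MAC is HMAC-SHA-256, the randomized 'encryption' is a throwaway SHA-256-based stream construction standing in for any IND-CPA scheme, and ∥ is realized by fixed-width encoding of the fields.

```python
import hmac, hashlib, os

def _stream(k: bytes, nonce: bytes, n: int) -> bytes:
    """Toy keystream: counter-mode expansion of SHA-256. Illustration only."""
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(k + nonce + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def enc(kE: bytes, m: bytes) -> bytes:
    nonce = os.urandom(16)                  # fresh nonce: randomized encryption
    return nonce + bytes(a ^ b for a, b in zip(m, _stream(kE, nonce, len(m))))

def dec(kE: bytes, c: bytes) -> bytes:
    nonce, body = c[:16], c[16:]
    return bytes(a ^ b for a, b in zip(body, _stream(kE, nonce, len(body))))

class PEtA:
    def __init__(self, me: str, peer: str, kE: bytes, kMAC: bytes):
        self.me, self.peer = me, peer
        self.kE, self.kMAC = kE, kMAC
        self.sent = 0     # messages sent so far
        self.rcved = 0    # messages received (and validated) so far

    def _tag(self, c: bytes, seq: int, party: str) -> bytes:
        # a = MAC_kMAC(c || seq || party), computed over the ciphertext (EtA)
        data = c + seq.to_bytes(8, "big") + party.encode()
        return hmac.new(self.kMAC, data, hashlib.sha256).digest()

    def app_send(self, m: bytes) -> tuple:
        c = enc(self.kE, m)                       # encrypt first...
        a = self._tag(c, self.sent + 1, self.me)  # ...then authenticate
        self.sent += 1
        return (c, a)                             # the packet sent on NET

    def net_receive(self, packet: tuple):
        c, a = packet
        expected = self._tag(c, self.rcved + 1, self.peer)
        if not hmac.compare_digest(a, expected):
            return None                           # drop: validation failed
        self.rcved += 1
        return dec(self.kE, c)                    # plaintext delivered to APP
```

For example, replaying a packet fails: after the first delivery the receiver's rcved counter has advanced, so the recomputed MAC input differs and the packet is dropped, matching the protocol's reliance on matching counters at both ends.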
5.2 Shared-key Entity Authentication Protocols
In this section we discuss shared-key protocols for authenticating an interaction
between a pair of parties (entities). Entity authentication protocols are among
the simplest and oldest cryptographic protocols, and were in extensive use for
years, although they were later mostly replaced by authenticated Key Exchange
protocols, which combine entity authentication with the exchange of a session key.
We discuss Key Exchange protocols in the following sections. We focus on the
common case of mutual entity authentication, i.e., both parties authenticate
each other; however, our discussion mostly applies also to the simpler case
where only one party authenticates its peer.
To refer to the party that initiates the handshake, we use the term initiator,
and similarly the term responder for the party that responds. For convenience,
in our examples we usually have Alice as the initiator (i.e., Alice initiates the
handshake), and Bob as the responder; however, the roles could be reversed,
i.e., Bob can be the initiator and Alice can be the responder.
5.2.1 Interactions and requirements of entity authentication protocols
Basically, the goal of an entity authentication protocol is to validate an interaction
with the (correct) peer. In Mutual Entity Authentication protocols, both
peers should authenticate each other, which requires at least one exchange of
messages between the two parties, e.g., a message from Alice to Bob and a
response from Bob to Alice. Mutual authentication means that both parties
validate that they interacted successfully with their peer; one of the two parties,
say x ∈ {Alice, Bob}, should validate successfully only if its peer validated x
successfully.
We refer to the entire set of messages involved with a single run of the
authentication protocol as the handshake; a handshake requires at least two
message flows (e.g., Alice to Bob and a response from Bob to Alice). Some
protocols require three flows, i.e., also a response from Alice to Bob.
Concurrent handshakes. Most entity authentication protocols allow multiple
concurrent handshakes. For example, a handshake begins with the protocol
in Alice receiving an Init request from the application (APP interface), requesting
to initialize a new handshake with Bob. However, before this handshake
terminates, Alice is requested to initiate another handshake with Bob; or, Bob
receives an Init request from its own application, requesting Bob to initiate a
handshake with Alice, while Alice is still in the middle of an ongoing handshake.
Note that the same party may also be a responder for one or more concurrent
handshakes, purportedly initiated by its peer.
We elaborate on the motivation for supporting concurrent handshakes at
the end of this subsection.
Interactions of entity authentication protocols. An entity authentication
protocol has the interactions illustrated in Figure 5.2. The interactions in
the SYS and NET interfaces are the same as presented in Figure 5.1; let us
[Figure 5.2 diagram: Alice's entity authentication protocol, with its APP interface (input Init(i); outputs Resp(i), Accept(i)), SYS interface (output sleep(δ); input wake-up(t)) and NET interface (output send(m); input received(m)).]
Figure 5.2: Interactions for entity authentication protocols, illustrated for Alice;
Bob has the same interfaces. The handshake identifier i allows association of
the different APP interactions related to the same handshake in each party, as
explained in the text.
explain the APP interface interactions, i.e., the interactions between the entity
authentication protocol and the application using it.
A handshake begins at an initiator, say Alice, upon an Init(i) input interaction
(request) from the user or application (APP interface). In the responder, say
Bob, the handshake begins with a Resp(i) output interaction, i.e., a notification
by the protocol of a new handshake initiated by the peer (e.g., Alice). The
identifier i is called the session identifier; identifiers are required to be distinct
for different handshakes in each party.
Note that each handshake is assigned a unique identifier at each party,
e.g., iI at the initiator and iR at the responder. The two identifiers may be
assigned independently at each party; they only need to be unique for that
party. One simple implementation would be as a counter maintained at each
party, incremented by the application when it issues a new Init(iI) request,
and by the protocol when it notifies of a new handshake with a new Resp(iR)
interaction.
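A minimal sketch of this counter-based assignment (hypothetical code, for illustration):

```python
class HandshakeIds:
    """Per-party assignment of unique handshake identifiers, as a counter.
    The application draws an identifier for each Init request; the protocol
    draws one when emitting a Resp notification. Uniqueness is per party only."""

    def __init__(self):
        self._counter = 0

    def next_id(self) -> int:
        self._counter += 1
        return self._counter
```

Each party keeps its own instance, so the initiator's iI and the responder's iR are assigned independently and need only be unique within that party.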
A handshake terminates at a party, e.g., Alice, when she Accepts, i.e.,
successfully validates the interaction with the peer (Bob); this is done by the
protocol returning Accept to the application (i.e., on the APP interface). If the
validation fails, e.g., Bob never responds, the handshake may never terminate,
or may terminate with a failure indication2.
2 The failure indicator is used to ensure that handshakes terminate within bounded time;
for simplicity, we do not discuss this requirement or utilize it in the protocols and requirements
that we cover.
To summarize, the interactions of the APP interface are as follows:
Init(i) is an input interaction, i.e., a request from the application to initiate
a new connection (with a distinct identifier i). Init is the only input
interaction.
Resp(i) is an output interaction, where the protocol informs that it is responding
to a new session, and assigns this new session an identifier value i. This
session identifier must differ from other session identifiers used by this
party (in Init or Resp interactions).
Accept(i) is an output interaction, where the protocol informs the application
that session i has successfully completed.
Security of entity-authentication protocols. Intuitively, the basic requirement
from an entity-authentication protocol is that successful authentication
implies a 'fresh' interaction with the peer. The term 'fresh' may mean
either simply 'within the last ∆ seconds' (for some small ∆), or that there
was an overlap in the handshake periods of the two peers.
Another requirement that some protocols ensure is that a successful authentication
at the responder implies successful authentication also at the initiator.
We could also design a protocol with the reverse property, i.e., success at the
initiator would ensure success at the responder. However, if we allow communication
failures, we cannot ensure that success at either of the parties will imply
success at the other party.
Sequential vs. Concurrent Authentication. Note that ensuring mutual
authentication is easier if handshakes must be done sequentially, i.e., with no
concurrent handshakes. However, support for multiple concurrent sessions is a
critical aspect of Mutual Entity Authentication protocols. Concurrent sessions
are required in many scenarios; e.g., web communication often uses concurrent
connections (sessions) between browser and server, to improve performance.
Concurrent handshakes are also necessary to prevent 'lock-out' due to
synchronization errors (e.g., lost state at the initiator), or an intentional 'lock-out'
by a malicious attacker, as part of a denial-of-service attack. In any case,
there is no strong motivation to allow only sequential handshakes; protocols
that support concurrent handshakes are as efficient, and not significantly more
complex, than protocols that only allow sequential authentication. Therefore,
following the conservative design principle (Principle 3), we focus on the case
where concurrent handshakes are allowed.
When designers do not consider the threats due to concurrent sessions, yet
the implementation allows concurrent sessions, the result is often a vulnerable
protocol. This is a typical example of the result of failing to articulate the
requirements from the protocol and the adversary model, and of not following
Principle 3. We next study such a vulnerability: the SNA mutual-authentication
protocol.
[Figure 5.3 diagram: Alice picks NA ←$ {0,1}^l and sends (A, NA) to Bob; Bob picks NB ←$ {0,1}^l and replies (NA, NB, Ek(NA)); Alice completes with (NB, Ek(NB)).]
Figure 5.3: The (vulnerable) SNA mutual authentication protocol.
5.2.2 Vulnerability study: SNA mutual-authentication protocol
As a simple, yet realistic, example of an (insecure) two-party, shared-key Mutual
Entity Authentication protocol, consider the SNA mutual-authentication protocol.
IBM's SNA (Systems Network Architecture) was the primary networking
technology from 1974 till the late 1980s, and is still in use by some 'legacy'
applications.
We describe the original, insecure version of the SNA Mutual Entity Authentication
protocol, and later its replacement, the 2PP Mutual Entity
Authentication protocol. Both protocols use a shared secret key k to authenticate
the two parties to each other, without deriving a session key; we later describe
extensions which also provide Key Exchange. We first explain the SNA protocol,
illustrated in Figure 5.3, and then discuss its security.
The SNA Mutual Entity Authentication protocol operates in three simple
flows, as illustrated in Figure 5.3. The protocol uses a block cipher E. The
initiator, say Alice, sends to her peer, say Bob, her identifier, which we denote
A, and NA, a random l-bit binary string which serves as a challenge; such a
random challenge is often called a nonce. Here, l is the size of the inputs and
outputs of the block cipher E used by the protocol.
The responder, say Bob, replies with a proof of participation, Ek(NA), using
the shared key k. Bob also sends his own random l-bit challenge (nonce), NB.
To help Alice match this response with the correct handshake, as necessary to
support concurrent handshakes, Alice could also send an identifier with her handshake
message, which Bob would return with his response. Or, as we show in Figure 5.3,
Alice only includes her nonce NA, and Bob simply attaches Alice's nonce NA
to his handshake message (second flow).
Upon receiving Bob's response, Alice validates that the response contains the correct function Ek(NA) of the nonce that she previously selected and sent. If so, Alice concludes that she is indeed communicating with Bob. Alice then completes the Mutual Entity Authentication by sending her own 'proof of participation' Ek(NB). Alice may also include NB, to help Bob match the response with the correct handshake, similarly to the inclusion of NA by Bob; or, a different identifier may be used.
Finally, Bob similarly validates that he received the expected function Ek(NB) of his randomly selected nonce NB, and concludes that this Mutual Entity Authentication was initiated, and successfully completed, by Alice.
Upon receiving the expected responses, both parties, Alice and Bob, signal to their applications that the Mutual Entity Authentication has completed successfully. The response expected by Alice is Ek(NA), and the response expected by Bob is Ek(NB).
SNA Mutual Entity Authentication ensures sequential, but not concurrent, mutual authentication. The simple SNA Mutual Entity Authentication of Figure 5.3 ensures mutual authentication, but only if restricted to sequential Mutual Entity Authentication. The protocol is vulnerable when allowing concurrent Mutual Entity Authentication. The attack is illustrated in Figure 5.4.
Let us first explain why the protocol ensures mutual authentication when restricted to sequential handshakes. Suppose, first, that Alice completes the protocol successfully; namely, Alice received the expected second flow, Ek(NA). Assume that this happened without Bob previously receiving NA as the first flow from Alice (and sending Ek(NA) back). Due to the sequential restriction, Alice surely did not receive NA as a challenge in the time since she sent NA, and hence did not compute and send Ek(NA) herself. Since Alice selected NA randomly, from a sufficiently large set (i.e., NA is sufficiently long), it is unlikely that either Alice or Bob has received NA before Alice completed the protocol. Hence, the adversary must have computed Ek(NA) rather than intercepted it. However, if the adversary can compute Ek(NA), then the adversary can distinguish between Ek and a random permutation, contradicting the PRP assumption for E. Note that an eavesdropping attacker may collect such pairs (NA, Ek(NA)) or (NB, Ek(NB)); however, since NA and NB are quite long strings (e.g., 64 bits), the probability of re-use of the same NA or NB is negligible.
However, the SNA Mutual Entity Authentication fails to ensure concurrent mutual authentication; namely, if the parties are willing to run two concurrent Mutual Entity Authentication handshakes, then an attacker can cause a party to complete the protocol 'successfully', i.e., thinking it has communicated with its peer, while in reality the peer did not receive any message. For example, Figure 5.4 illustrates an attack where Alice initiates one session with Bob, which is actually intercepted by an attacker. The attacker, impersonating Bob, initiates another session with Alice. Both sessions terminate correctly, i.e., in both, Alice is tricked into believing that she has successfully interacted with Bob; however, in reality, Bob was never involved.
Notice that the attack of Figure 5.4 requires that Alice would agree to act both as an initiator of a session and as a responder of a session. One may hope that SNA may be secure for concurrent Mutual Entity Authentication if each party is only willing to act in one role (initiator or responder). However, this is not the case; see Exercise 5.8.
    Alice                                                Attacker
      ------ A, NA ------>                                (session 1)
      <- - - B, NA - - - -                                (session 2)
      - - -  NA, NA', Ek(NA)  - - >                       (session 2)
      <----- NA, NA', Ek(NA) ------                       (session 1)
      ------ NA, Ek(NA') ------>                          (session 1)
      <- - - NA, Ek(NA')  - - - -                         (session 2)
Figure 5.4: Attack on SNA Mutual Entity Authentication, with Alice initiating one session with Bob, which is actually intercepted by an Attacker; flows related to this session are marked with 'regular' (solid) arrows. In the attack, the Attacker, impersonating Bob, initiates another session with Alice; flows related to this (second) session are marked with dashed arrows. Both sessions terminate correctly, i.e., Alice would believe that she has successfully interacted with Bob in both sessions, while in reality, Bob was never involved. Notice that this attack requires that Alice would agree to act both as an initiator and as a responder of sessions.
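The interleaving of Figure 5.4 can be simulated in a few lines. The sketch below models the block cipher Ek with HMAC-SHA256 as an illustrative stand-in (the attack never breaks E; it only reflects Alice's own answers back to her), and the variable names and encoding are ours, not part of the SNA specification.

```python
# Simulation of the concurrent-session attack of Figure 5.4.
# E_k is modeled by HMAC-SHA256 (an illustrative PRF stand-in);
# the attacker only reflects Alice's answers, it never computes E itself.
import hmac
import hashlib
import secrets

k = secrets.token_bytes(32)              # key shared by Alice and Bob

def E(key, block):                       # stand-in for the block cipher E_k
    return hmac.new(key, block, hashlib.sha256).digest()

# Session 1: Alice initiates with challenge NA (intercepted, never reaches Bob).
NA = secrets.token_bytes(8)

# Session 2: the attacker, impersonating Bob, initiates with the same NA.
# Alice, acting as responder, replies with her fresh nonce NA2 and E_k(NA).
NA2 = secrets.token_bytes(8)
alices_reply = (NA, NA2, E(k, NA))       # second flow of session 2, from Alice

# The attacker replays Alice's own proof as the second flow of session 1.
# Alice, as initiator of session 1, validates it against her challenge NA:
assert alices_reply[2] == E(k, NA)       # Alice accepts: 'Bob' authenticated
# Alice then sends E_k(NA2), which the attacker replays to complete session 2.
print("both sessions completed; Bob never participated")
```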
In practice, SNA allows concurrent Mutual Entity Authentication handshakes, and for good reasons, as motivated earlier (at the end of subsection 5.2.1). To ensure security, the Mutual Entity Authentication protocol of SNA was changed into the 2PP protocol, which we present in the following subsection.
5.2.3 Authentication Protocol Design Principles
Before we present secure authentication protocols, it is useful to identify some
weaknesses of the SNA protocol, which were exploited in the attacks on it.
This allows us to derive design principles for different protocols which involve
authentication.
Identify sender and recipient. The SNA protocol allowed redirection: giving to Bob a value from a message which was originally sent, in this case by Bob himself, to a different entity (Alice). This motivates the following design principle: in each message, identify the sender, the recipient, or, best, both of them. This is usually easy and efficient, e.g., by adding appropriate identifier(s) to the information being authenticated. Another solution may be to use independently-pseudorandom keys for the messages sent by each of the two parties.
Authenticate the handshake identifier. The SNA attack sent to Bob (part of) a message sent (by Bob) to Alice during a different handshake. This motivates the design principle of authenticating the handshake identifier.
Authenticate flow number and initiator/responder bit. The attack sent, in the third protocol flow (second message from the initiator), the authenticator received in the second protocol flow (first message from the responder). This motivates us to authenticate the flow number (e.g., second vs. third) and a bit indicating whether the authenticator is sent by the initiator or the responder.

    Alice                                                    Bob
      ------ A, NA ------>
      <----- NA, NB, MACk(2 ++ 'A ← B' ++ NA ++ NB) ------
      ------ NB, MACk(3 ++ 'A → B' ++ NA ++ NB) ------>

Figure 5.5: The 2PP Mutual Entity Authentication protocol.
Authenticate using MAC or signatures. The SNA Mutual Entity Authentication protocol was designed to ensure authentication; however, the (only) cryptographic function it uses is a block cipher. Instead, an authenticating protocol should use an authentication function such as a MAC or signatures. Admittedly, from the Switching lemma (Lemma 2.2), a block cipher, i.e., a PRP, is also a PRF, and from the PRF-is-a-MAC lemma (Lemma 4.1), every PRF is also a MAC. However, the principle still stands: authenticating protocols should use authentication functions.
Never provide an oracle. The SNA protocol applied the block cipher to inputs, the nonces, received entirely from the network, i.e., fully controlled by a MitM adversary, and returned the result of applying the block cipher to these inputs. Namely, this provides the adversary with an 'oracle' to the cryptographic function, in this case, the block cipher (PRP), which often causes a vulnerability. A simple defense to avoid providing an 'oracle' is to include some random input(s), in addition to the adversary-provided inputs, before applying the cryptographic function. This defense works in many scenarios, including Mutual Entity Authentication protocols.
Adopting even a subset of these design principles would have sufficed to prevent the attack of Figure 5.4. The 2PP protocol, described next, is essentially the result of applying these principles to address the vulnerabilities of the SNA protocol.
5.2.4 Secure Mutual Entity Authentication with the 2PP protocol
In this subsection we present 2PP, a secure two-party shared-key Mutual Entity Authentication protocol; the name 2PP simply stands for two-party protocol. The 2PP protocol, presented in [61], was the replacement for the insecure SNA Mutual Entity Authentication protocol.
The flows of the 2PP Mutual Entity Authentication protocol are presented in Figure 5.5. As in the SNA Mutual Entity Authentication protocol, the values NA and NB are n-bit nonces, where n is the security parameter, typically the length of the shared key k. The nonces NA, NB are selected randomly, by Alice (initiator) and Bob (responder), respectively.
The protocol, at both parties, outputs Accept(i) once it correctly authenticates the last flow from the peer for handshake i (second flow for the initiator, third flow for the responder). By validating this last flow, 2PP ensures mutual authentication.
We will not prove the security of 2PP, but let us give an intuitive argument for why it ensures mutual authentication, i.e., authentication of both responder and initiator. Note that the security of 2PP requires the MAC scheme to be secure, and requires that the key k and the nonces are 'sufficiently long', i.e., at least as long as the security parameter.
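To make the argument concrete, the two 2PP authenticators can be sketched as follows; we use HMAC-SHA256 as the MAC, and the field encoding (length-prefixing each field to realize the ++ concatenation unambiguously) is our own illustrative choice, not the book's specification.

```python
# Sketch of the two 2PP authenticators (Figure 5.5), using HMAC-SHA256
# as the MAC; the encoding of the ++ concatenation is illustrative.
import hmac
import hashlib
import secrets

def mac(k, *fields):
    # length-prefix each field so the concatenation is unambiguous
    msg = b"".join(len(f).to_bytes(2, "big") + f for f in fields)
    return hmac.new(k, msg, hashlib.sha256).digest()

k = secrets.token_bytes(32)                      # shared key
NA, NB = secrets.token_bytes(16), secrets.token_bytes(16)

flow2 = mac(k, b"2", b"A<-B", NA, NB)            # second flow, Bob to Alice
flow3 = mac(k, b"3", b"A->B", NA, NB)            # third flow, Alice to Bob

# The flow number and direction fields make the two authenticators distinct,
# so neither can be reflected or redirected as the other:
assert flow2 != flow3
# Alice accepts only a tag that verifies as a second flow over her NA, NB:
assert flow2 == mac(k, b"2", b"A<-B", NA, NB)
```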
Security of 2PP: responder authentication. Consider an execution in which the initiator (Alice in Figure 5.5) 'accepts', although Bob is not responding. In 2PP, Alice 'accepts' (only) upon receiving MACk(2 ++ 'A ← B' ++ NA ++ NB). In 2PP, a party authenticates a message beginning with 2 only when sending a response, and, according to the identifiers, Alice would never send this. If Bob sent it after Alice began this handshake, then responder authentication holds. If Bob computed it before Alice began this handshake, then Alice randomly selected the same NA as previously received by Bob, which would occur with negligible probability, since NA and NB are of the same length as the security parameter.
There remains the possibility that Bob also did not compute MACk(2 ++ 'A ← B' ++ NA ++ NB). In this case, the adversary somehow found this value 'on its own', i.e., neither Alice nor Bob computed it. Such an adversary can produce a valid MAC on a message that was never MAC-ed by an 'oracle' knowing the MAC key; this contradicts the assumption that the MAC scheme used is secure. Hence, 2PP seems to ensure responder authentication.
Security of 2PP: sender authentication. The argument for sender authentication is very similar; we leave it to the reader to work it out.
5.3 Authenticated Request-Response Protocols
Authenticated request-response protocols extend mutual authentication protocols such as 2PP: not only do they authenticate the entities, they further authenticate the exchange of a request and the corresponding response message between the two parties. More precisely, they ensure the following properties:
Request authentication: every request received by a party was sent by its peer.
[Figure 5.6 illustrates the three interfaces of the protocol at Alice: the application interface (APP), with the events send-req(r, i), send-resp(r, i), rcv-req(r, i) and rcv-resp(r, i); the system interface (SYS), with the events sleep(δ) and wake-up(t); and the network interface (NET), with the events send(m) and received(m).]

Figure 5.6: Interactions for Authenticated Request-Response Protocols, illustrated for Alice; Bob has the same interfaces. Each party may send either a request or a response (to a previously received request).
Response authentication: every response received by a party was sent by its peer, in response to a request sent by the party.
No Replay: every request/response is received by a party at most the number of times it was sent by the peer.
Let us first describe the interactions of authenticated request-response protocols, over the different interfaces (APP, NET and SYS). These are illustrated in Figure 5.6. Notice that the interactions on the NET and SYS interfaces are exactly the same as in Figure 5.2 and Figure 5.1. It only remains to explain the interactions on the APP interface, which are:
send-req(r, i): the application instructs the protocol to send request r to the peer, using the identifier i to identify the response (if and when it arrives). The identifier should be unique, i.e., different from previous identifiers used by the application in send-req(r, i) interactions, e.g., a sequential number of this send-req interaction (in this entity).
rcv-req(r, i): the protocol delivers an incoming request r from the peer, specifying the identifier i to be used by the application if and when it provides the corresponding response. The identifier i should be unique, i.e., different from previous identifiers used by the protocol in rcv-req(r, i) events. Note that these identifiers are unrelated to those used to identify send-req(r, i) interactions.
send-resp(r, i): the application instructs the protocol to send to the peer a response r. The response r is to a previously-received request, which was identified by i; a send-resp(r, i) may occur only if a rcv-req(r′, i) occurred earlier. The previous rcv-req(r′, i) must have the same identifier i, but the request r′ is usually different from the response, i.e., usually r′ ≠ r.
rcv-resp(r, i): the protocol delivers a response r from the peer, specifying identifier i; this must be the same identifier as in a previous send-req(r′, i) interaction in this entity. Note that the value r of the response is typically different from the value r′ of the corresponding request.
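The identifier bookkeeping implied by these events can be sketched as follows; the class and method names are ours, chosen only to mirror the send-req and rcv-resp events described above.

```python
# A minimal sketch of the APP-interface bookkeeping described above:
# send-req must use fresh identifiers, and rcv-resp must match a pending one.
# Names and structure are illustrative, not from the textbook.

class RequestResponseApp:
    def __init__(self):
        self.next_id = 0          # sequential identifiers for send-req
        self.pending = {}         # identifier -> request awaiting a response

    def send_req(self, r):
        i = self.next_id          # unique: never reused by this entity
        self.next_id += 1
        self.pending[i] = r
        return i                  # identifier matching the future response

    def rcv_resp(self, r, i):
        if i not in self.pending:  # unknown or already-answered identifier
            raise ValueError("no pending request with identifier %d" % i)
        req = self.pending.pop(i)
        return req, r             # the matched request and its response

app = RequestResponseApp()
i = app.send_req("balance?")
assert app.rcv_resp("$100", i) == ("balance?", "$100")
```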
5.3.1 Summary of request/response protocols
We discuss four authenticated request/response (RR) protocols:
2PP-RR: this protocol extends 2PP by also authenticating a request message from the responder (e.g., Bob) and the corresponding response message from the initiator (e.g., Alice). Note the 'reverse' role of the parties: the 'initiator' is the party sending the response, while the 'responder' is the party sending the request. This is a bit confusing, and worse, it is often inconvenient for applications.
2RT-RR: this Request-Response protocol requires two round-trips (hence the name, 2RT-RR). It allows the initiator to send a query and the responder to respond, which is often the required usage pattern; however, as mentioned, it requires two round trips, i.e., four flows.
Counter-based-RR: this protocol authenticates a session rather than just a request-response pair, and requires only one flow per message. However, it requires both entities to maintain persistent state (a counter).
Time-based-RR: this protocol authenticates a request message and the corresponding response message, using only the minimal two flows: one for the request and one for the response. It requires both parties to have synchronized clocks. The protocol can handle (bounded) latency and clock drift, but this requires the responder to maintain persistent state for a limited time.
We compare these four types of authenticated request-response protocols in Table 5.1. We notice that one of the most important differences among these protocols is the number of flows, which determines the number of round trips required by the protocol. The number of round trips is an important consideration in many practical scenarios; let us explain why.
Overall delay is dominated by the number of round-trips. The number of round-trips is one of the most important attributes of request-response protocols, and is usually the dominant factor impacting the overall delay. As shown in Table 5.1, the Time-based-RR and the Counter-based-RR require a single round-trip to receive the response; in contrast, the 2RT-2PP-RR requires two round trips. The 2PP-RR is an exception, as it requires 'role reversal': the protocol is initiated by the entity producing the response; but how would this entity know when to initiate the protocol? This seems to require this entity to initiate the protocol periodically (synchronously), which makes it inappropriate for many applications, where the request should be sent asynchronously, as needed, rather than periodically.

  Protocol            Figure   Section   Flows   Requirements/drawbacks
  2PP-RR              5.7      5.3.2     3       Initiate periodically?
  2RT-2PP-RR          5.8      5.3.3     4       Two round-trips
  Counter-based-RR    5.9      5.3.4     2       Persistent state, one round trip
  Time-based-RR       5.10     5.3.5     2       Synchronization, one round trip

Table 5.1: Authenticated Request-Response (RR) protocols.
The reason for focusing on the number of round trips is that, in many applications and scenarios, it is the dominant factor impacting overall delay; often, the round trip time (RTT) is the largest factor contributing to the delay. To emphasize this, let us state it as Fact 5.1 and briefly explain it.
Fact 5.1 (Round trip time (RTT) values are significant). Typical round-trip times, i.e., the delay (latency) from sending a short packet over the Internet until receiving a response, can be quite significant. For example, [289] reports RTT values which are mostly below 200msec. However, in some scenarios, e.g., under a bandwidth denial-of-service (BW-DoS) attack, it can be as high as one second or even more. In particular, when using geostationary satellite communication, the typical delay is around 550 to 600msec. These values are independent of the bandwidth, i.e., they hold also when using high-bandwidth connections (fast transmission rates). While network bandwidth has dramatically increased over the years, round trip times have not decreased as much.
  Round trips   Protocol(s)      Bandwidth           Delay
  1             Time-based RR    1MB/s (8Mb/s)       150msec
  1             Time-based RR    100MB/s (800Mb/s)   100.5msec
  2             2RT-2PP-RR       1MB/s (8Mb/s)       250msec
  2             2RT-2PP-RR       100MB/s (800Mb/s)   200.5msec

Table 5.2: Comparing the impact of transmission rates to that of the number of round-trips required by a protocol, assuming a typical RTT (round-trip time) of 100msec and an overall transmission of 50KByte.
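The entries of Table 5.2 follow from a one-line computation: the overall delay equals the number of round trips times the RTT, plus the time to transmit the 50KByte at the given bandwidth.

```python
# Reproducing the delays of Table 5.2: overall delay (in msec) is
# round_trips * RTT plus the transmission time of 50KByte at the bandwidth.
RTT_MSEC = 100                     # assumed typical round-trip time
SIZE_BYTES = 50_000                # overall transmission: 50 KByte

def delay_msec(round_trips, bandwidth_bytes_per_sec):
    transmission_msec = SIZE_BYTES * 1000 / bandwidth_bytes_per_sec
    return round_trips * RTT_MSEC + transmission_msec

assert delay_msec(1, 1_000_000) == 150.0      # one round trip at 1MB/s
assert delay_msec(1, 100_000_000) == 100.5    # one round trip at 100MB/s
assert delay_msec(2, 1_000_000) == 250.0      # two round trips at 1MB/s
assert delay_msec(2, 100_000_000) == 200.5    # two round trips at 100MB/s
```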
To illustrate the importance of minimizing round-trips, Table 5.2 compares the overall delays in typical request-response scenarios when using protocols requiring one vs. two round trips, and when using a lower bandwidth of 1MB/s vs. a higher bandwidth of 100MB/s. Clearly, the transmission time is negligible compared to the round-trip time; the delay is dominated by the number of round-trips and the round-trip delay.

    Alice                                                    Bob
      ------ A, NA ------>
      <----- req, NB, MACk(2 ++ 'A ← B' ++ NA ++ NB ++ req) ------
      ------ resp, MACk(3 ++ 'A → B' ++ NA ++ NB ++ resp) ------>

Figure 5.7: The 2PP-RR Authenticated Request-Response Protocol: a three-flow, nonce-based authenticated Request-Response protocol, based on 2PP.
Indeed, TLS 1.3, the updated version of the TLS protocol, has a significantly modified design which allows it to minimize the number of round trips, and therefore the round-trip delays; see Section 7.6.
Note that Fact 5.1 may be a bit obscured in most of the sequence diagrams in this text since, for simplicity, compactness and clarity, we use horizontal arrows for transmissions, i.e., the diagrams do not indicate the network latency of messages. We also usually do not show other delays. Most sequence diagrams in the literature use the same style.
5.3.2 The 2PP-RR Authenticated Request-Response Protocol
We first discuss 2PP-RR, a three-flow, nonce-based authenticated Request-Response protocol, which is a minor extension of 2PP. The 2PP-RR protocol is illustrated in Figure 5.7. In fact, the only change compared to the 2PP protocol (Figure 5.5) is the addition of the request (req) from responder to initiator, and of the response (resp) from initiator to responder, to the second and third flows, respectively.
The 2PP-RR protocol is simple and not too difficult to prove secure, by a reduction to the security of the underlying MAC function. Namely, suppose that we know an efficient algorithm (adversary) M which shows that 2PP-RR does not meet the definition of a secure authenticated request-response protocol. The reduction shows how, given such M, we can efficiently compute the value of the MAC function on some input, without knowing the key. This contradicts the assumption that we use a secure MAC. Namely, if we use a secure MAC, then 2PP-RR is secure.
The 2PP-RR protocol has, however, a significant drawback, which makes it ill-suited for many applications. Specifically, in this protocol, the request is sent by the responder, and the initiator sends the response. In most applications, it makes sense for a party to initiate the protocol when it needs to make some request, rather than to wait for the initiator to contact it and only then, as a responder, send the request. The next protocol is a different adaptation of 2PP which avoids this drawback, but requires four flows, i.e., two full round trips.
    Alice                                                    Bob
      ------ A, NA ------>
      <----- NB ------
      ------ req, MACk(3 ++ 'A → B' ++ NA ++ NB ++ req) ------>
      <----- resp, MACk(4 ++ 'A ← B' ++ NA ++ NB ++ resp) ------

Figure 5.8: 2RT-2PP RR: a two-round-trip Authenticated Request-Response protocol.
5.3.3 2RT-2PP Authenticated Request-Response protocol
In Figure 5.8 we present 2RT-2PP RR, another authenticated request-response protocol based on 2PP. As the name implies, the 2RT-2PP RR protocol requires four flows, i.e., two round-trips; this is a significant drawback. However, 2RT-2PP improves upon 2PP-RR in that it authenticates a request from the initiator, and the corresponding response to it from the responder.
The 2RT-2PP Request-Response protocol involves two simple extensions of the basic 2PP protocol. The first extension is the transmission and authentication of the request and response, similarly to their addition in 2PP-RR. The second extension is an additional (fourth) flow, from the responder back to the initiator, which carries the response of the responder to the request from the initiator. In a sense, 2RT-2PP 'splits' the contents of the second flow of 2PP-RR: these contents are split between the second flow (providing the nonce NB) and the fourth flow (providing the authenticated response).
5.3.4 The Counter-based RR Authenticated Request-Response protocol
In Figure 5.9 we present the Counter-based RR Authenticated Request-Response protocol. In contrast to the 2PP protocols, this protocol requires only one round trip: sending the (authenticated) request and receiving the (authenticated) response. However, to prevent replay of previously-sent requests using only one round-trip, this protocol requires both parties to maintain a synchronized counter.
The challenge for the Counter-based RR protocol, as well as for the time-based RR protocol of the next subsection, is for the responder to verify the freshness of the request, i.e., that the request is not a replay of a request already received in the past. Freshness also implies no reordering; for example, a responder, say Bob, should reject request x from Alice if Bob already received request x or a later-sent request x′ from Alice. Freshness prevents an attacker from replaying information from previous exchanges. For example, consider the request-response authentication of Figure 5.8; if NB is removed (or fixed), then an eavesdropper on the flows between Alice and Bob in one request-response session can copy these flows and cause Bob to process the same request again. For some requests, e.g., Transfer $100 from my account to Eve, this can be a concern.
To ensure freshness without requiring the extra flows, one may use state, as in this subsection, or synchronization, as in the next subsection.

    Alice                                                    Bob
      ------ req, iA, MACk(1 ++ 'A → B' ++ iA ++ req) ------>
                                          If iA ≠ iB + 1: ignore;
                                          else iB ← iB + 1
      <----- resp, iB, MACk(2 ++ 'A ← B' ++ iB ++ resp) ------
    Accept if iA = iB

Figure 5.9: The Counter-based RR Authenticated Request-Response protocol.
Specifically, the counter-based RR protocol of Figure 5.9 requires both parties to maintain a counter; we use iA to denote the counter kept by Alice, and iB to denote the counter kept by Bob. Alice's counter iA represents the number of queries that Alice sent, and Bob's counter iB represents the number of responses that Bob sent; hence, both are initialized to zero. The protocol keeps these two counters synchronized, in the sense that, at any time, iB ≤ iA ≤ iB + 1 holds.
Note that this design implies that the protocol does not allow concurrent transmission of requests. Furthermore, the protocol does not provide retransmissions or any other mechanism to handle message losses or corruptions; any such loss or corruption is likely to prevent any further query/response. However, it is not too difficult to extend the protocol to handle such issues, and in particular, to allow concurrent requests and responses.
Exercise 5.1. Extend the protocol of Figure 5.9, to allow Alice to send concurrent requests to Bob; allow Bob to respond to requests, even when they are
received out-of-order.
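The counter checks of Figure 5.9 can be sketched as follows; HMAC-SHA256 stands in for the MAC, and the field encoding is our illustrative choice.

```python
# Sketch of the counter-based RR protocol of Figure 5.9; HMAC-SHA256
# stands in for the MAC, and the field encoding is illustrative.
import hmac
import hashlib
import secrets

def mac(k, *fields):
    msg = b"".join(len(f).to_bytes(2, "big") + f for f in fields)
    return hmac.new(k, msg, hashlib.sha256).digest()

k = secrets.token_bytes(32)
iA = iB = 0                                # both counters start at zero

# Alice sends a request, incrementing her counter iA.
iA += 1
req = b"balance?"
tag_req = mac(k, b"1", b"A->B", str(iA).encode(), req)

# Bob accepts only if iA = iB + 1 (freshness), then advances iB.
assert iA == iB + 1
assert tag_req == mac(k, b"1", b"A->B", str(iA).encode(), req)
iB += 1
resp = b"$100"
tag_resp = mac(k, b"2", b"A<-B", str(iB).encode(), resp)

# Alice accepts the response only if iB matches her counter iA.
assert iB == iA
assert tag_resp == mac(k, b"2", b"A<-B", str(iB).encode(), resp)

# A replayed copy of the same request now fails Bob's freshness check:
assert iA != iB + 1                        # counters already advanced: ignored
```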
5.3.5 Time-based RR Authenticated Request-Response protocol
Figure 5.10 presents another single-round-trip Authenticated Request-Response protocol; this variant allows the Initiator (e.g., Alice) to be stateless, and also limits the time during which the responder (e.g., Bob) must keep state. Instead of relying on a counter maintained, in a synchronized way, by the two parties, the protocol of Figure 5.10 relies on the use of time and on two synchronization assumptions, specifically, bounded delay and bounded clock skew.
    Alice                                                    Bob
    TA ← clkA(·)
      ------ req, TA, MACk(1 ++ 'A → B' ++ TA ++ req) ------>
                                     req is valid if TA is larger than
                                     before, and TA ≥ clkB(·) − ∆
      <----- resp, MACk(2 ++ 'A ← B' ++ TA ++ resp) ------
    resp is valid if received within 2∆,
    and with the correct TA

Figure 5.10: Time-based Authenticated Request-Response protocol, using a bound ∆ on the maximal delay plus maximal clock bias. We use clkA(·) to denote the time according to the local clock of Alice upon sending req, and clkB(·) for Bob's clock upon receiving req. Alice sets TA ← clkA(·) when she sends the request, and authenticates it with the request. Bob uses TA to validate that the request is fresh, using the bound ∆, and ensuring TA is larger than previously received TA values.
Bounded delay assumption. Let ∆delay ≥ 0 denote a bound on the maximal
delay. Namely, if one party sends a message at time t, then this message is
received by t + ∆delay or earlier.
Bounded clock skew. Let ∆skew ≥ 0 denote a bound on the maximal clock
skew, i.e., the maximal difference between the values of the clocks of two entities
at any given time. Let clkA (t) (clkB (t)) denote the value of the clock at Alice
(respectively, Bob) at time t; then we have:
clkA (t) − ∆skew ≤ clkB (t) ≤ clkA (t) + ∆skew
(5.1)
The protocol is illustrated in Figure 5.10, with Alice sending the request and
Bob responding. For simplicity, we use a combined bound: ∆ ≡ ∆skew + ∆delay ,
and the notation clkA (·), clkB (·) for the value of clkA (respectively, clkB ) at
the time Alice sends (Bob receives) the req message.
The protocol at Bob confirms that the received request req is valid, as follows:
No modification: Bob compares the received MAC value to the MAC computed with the correct inputs.
req is a request from Alice to Bob: the fact that the input to the MAC begins with 1 ++ 'A → B' ensures this is a request (first flow) from Alice to Bob.
No replay: Bob validates that the received value of TA is larger than the largest previously received value of TA.
Freshness (acceptable delay): Bob validates that the received TA is within ∆ of his own clock clkB(·) at the time req is received.
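Bob's three checks can be sketched as follows; HMAC-SHA256 stands in for the MAC, and the clock model, encoding, and the value of ∆ are illustrative assumptions.

```python
# Sketch of Bob's freshness checks in the time-based RR protocol
# (Figure 5.10); the clock model, ∆ value, and encoding are illustrative.
import hmac
import hashlib
import secrets

DELTA = 2.0                     # bound: max delay plus max clock skew (seconds)

def mac(k, *fields):
    msg = b"".join(len(f).to_bytes(2, "big") + f for f in fields)
    return hmac.new(k, msg, hashlib.sha256).digest()

k = secrets.token_bytes(32)
last_TA = float("-inf")         # largest T_A that Bob has accepted so far

def bob_validate(TA, req, tag, clkB_now):
    global last_TA
    ok = (tag == mac(k, b"1", b"A->B", repr(TA).encode(), req)  # no modification
          and TA > last_TA                                      # no replay
          and TA >= clkB_now - DELTA)                           # acceptable delay
    if ok:
        last_TA = TA            # remember, to reject future replays
    return ok

TA = 1000.0                     # Alice's clock value when sending the request
tag = mac(k, b"1", b"A->B", repr(TA).encode(), b"balance?")
assert bob_validate(TA, b"balance?", tag, clkB_now=1001.0)      # fresh: accept
assert not bob_validate(TA, b"balance?", tag, clkB_now=1001.5)  # replay: reject
```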
When Alice receives the response resp, she similarly confirms that it is valid, as follows:
No modification: Alice compares the received MAC value to the MAC computed with the correct inputs.
resp is a response from Bob to Alice: the fact that the input to the MAC begins with 2 ++ 'A ← B' ensures this is a response (second flow) from Bob to Alice.
resp is a response to request req, and not a replay: the value of TA increases whenever Alice sends a new request, and therefore Alice sent only one request with this value of TA; furthermore, this cannot be a replay of a previous response from Bob, since no previous response would use this TA.
Freshness (acceptable delay): Alice validates that the response is received within at most 2 · ∆delay.³
Note that both Alice and Bob may discard their state (the sA.TA and sB.TB variables) after some time, which the reader may compute.
Exercise 5.2. Show that Alice and Bob do not need to keep forever the stored
values of TA ; at what time may each of them free up this storage?
As with the counter-based protocol, we presented the time-based protocol as allowing only one query-response at any given time, and assuming reliable communication. However, it is not too difficult to extend it to allow multiple concurrent requests and responses, and to handle unreliable communication.
Exercise 5.3. Extend the protocol of Figure 5.10, to allow Alice to send up to
three concurrent requests to Bob; allow for receiving and responding to requests
out-of-order.
5.4 Shared-key Key Exchange Protocols
Not all communication follows the request-response pattern, and even when it does, we may often prefer to secure the communication using a session/record protocol, which secures arbitrary interactions using a shared key, such as the simplified protocol in subsection 5.1.2, or the TLS record protocol in Chapter 7.
In principle, the parties could simply use the same shared secret keys for all sessions between them. However, as already stated in Principle 4, it is desirable to limit the use of secret keys. The goal of shared-key Key Exchange protocols is to set up separate session keys {kiS} for each session i, in such a way that
exposure of session key kiS will not expose other session keys.

³ For simplicity, Figure 5.10 specifies validation of 2∆, which is also fine but a bit unnecessarily lax.

In this section we focus on two-party shared-key Key Exchange protocols, which derive all
session keys from one fixed key k, to which we refer as the master key or the long-term key. These protocols should be contrasted with public-key key exchange protocols, which we study in subsection 6.1.3, and which are often used to establish the shared master key. In Chapter 7 we study the widely-used SSL and TLS protocols, which combine public-key key exchange, to establish a shared master key, with shared-key Key Exchange, to derive session keys from the master key.
The use of session keys kiS, securely derived from the master key k, has multiple security benefits:
1. By changing the key periodically, we reduce the number of ciphertext messages, encrypted using the same key, available to the cryptanalyst. This usually makes cryptanalysis harder, and possibly infeasible, compared to the use of a fixed key, which allows the adversary to collect plenty of ciphertext messages, often making attacks easier, as we have seen in Chapter 2. We also reduce the amount of plaintext exposed if the cryptanalyst succeeds in finding a key, thereby reducing the amortized 'return on investment' for cryptanalysis.
2. Keys may also be exposed by hacking attacks; an attacker can use an
exposed key until it is changed. By changing session keys periodically, and
making sure that each session key remains secret (pseudo-random)
even if other session keys are exposed, we limit the damage due to
exposure of some of the keys.
3. The separation between session keys and the master key allows some or all
security to be preserved even after attacks which expose the entire storage
of a device. One way to achieve this is to confine the master key to
a separate Hardware Security Module (HSM), protecting it even when the
computer storage - except the internal storage of the HSM - is exposed.
We also discuss ways to ensure or restore security following exposure, even
without an HSM.
4. Finally, most session/record protocols, including the one in subsection 5.1.2
and the TLS record protocol in Chapter 7, rely on persistent counters
kept by the peers; however, counters are often reset, or may otherwise get
out of sync. Setting up a fresh session key allows the peers to safely
restart these counters.
Key Exchange protocols are an extension of mutually-authenticating protocols,
with the same functions, inputs and outputs. There is only one additional output
of the protocol: the session key kiS. This key is provided to the session/record
protocol running in each of the peers.
We say that a two-party shared-key Key Exchange protocol ensures secure
key-setup if it ensures the following two requirements.
Key agreement: if both parties complete successfully, then they both output
the same key kiS .
Alice → Bob:    A, NA,i
Bob → Alice:    NA,i, NB,i, PRFkM(2 ++ ‘A←B’ ++ NA,i ++ NB,i)
Alice → Bob:    NB,i, PRFkM(3 ++ ‘A→B’ ++ NA,i ++ NB,i)
Alice:          kiS = PRFkM(NA,i ++ NB,i)
Bob:            kiS = PRFkM(NA,i ++ NB,i)

Figure 5.11: The 2PP Key Exchange protocol, shown generating the ith session
key, kiS.
Key secrecy: each session key kiS is secret, or more precisely, pseudo-random,
i.e., indistinguishable from a random string of the same length, even if the
adversary is given all the other session keys. This implies that the master
key k must remain pseudorandom, even if all session keys {kiS} are given
to the adversary.
5.4.1 The Key Exchange extension of 2PP
In this subsection we discuss a simple extension to the 2PP protocol, which
ensures secure key-setup. This is achieved by outputting the session key kiS as:

    kiS = PRFkM(NA,i ++ NB,i)    (5.2)

In Equation 5.2, NA,i and NB,i are the nonces exchanged in the ith session of the
protocol, and kiS is the derived ith session key. We use kM to denote the master
(long-term) shared secret key, provided to both parties during initialization.
The protocol is illustrated in Figure 5.11.
Since both parties compute the session key kiS in the same way, from NA,i ++
NB,i and the master key kM, it follows that they will obtain the same key,
i.e., the Key Exchange 2PP extension ensures key agreement. Since the session
keys are computed using a pseudorandom function, kiS = PRFkM(NA,i ++ NB,i),
it follows that the key of each session is pseudo-random, even given all other
session keys. Namely, the Key Exchange 2PP extension ensures secure key
setup.
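As a concrete illustration, here is a minimal sketch of the derivation of Equation 5.2, modeling the PRF with HMAC-SHA256; the instantiation and the key and nonce lengths are our assumptions, not part of the protocol specification:

```python
import hmac, hashlib, os

def prf(key: bytes, data: bytes) -> bytes:
    # Model the PRF as HMAC-SHA256 (an assumed instantiation).
    return hmac.new(key, data, hashlib.sha256).digest()

k_master = os.urandom(32)          # master (long-term) key kM

# Nonces exchanged in session i of 2PP:
n_a = os.urandom(16)               # Alice's nonce NA,i
n_b = os.urandom(16)               # Bob's nonce NB,i

# Both parties derive the session key from the same inputs (Equation 5.2):
k_session_alice = prf(k_master, n_a + n_b)
k_session_bob   = prf(k_master, n_a + n_b)
assert k_session_alice == k_session_bob    # key agreement

# A different session (fresh nonces) yields an independent-looking key:
k_other = prf(k_master, os.urandom(16) + os.urandom(16))
assert k_other != k_session_alice
```

Note that neither party can control the derived key on its own: each session key depends on both nonces and on the master key.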
Notice that there is another, seemingly unrelated change between the Mutual
Entity Authentication 2PP (Figure 5.5) and the Key Exchange 2PP (Figure 5.11)
protocols, namely, the use of PRF instead of MAC to authenticate the messages
in the protocol. This change is needed to avoid using the same key in two
different cryptographic schemes (MAC and PRF), which could, at least in some
‘absurd’ scenarios, be insecure. The change is also allowed, since every PRF is
also a MAC.
5.4.2 Deriving Per-Goal Keys
Following the key separation principle (Principle 7), session protocols often
use two separate keyed cryptographic functions, one for encryption and one
for authentication (MAC); the key used for each of the two goals should be
pseudo-random, even given the key for the other goal. We refer to such keys as
per-goal keys. The next exercise explains how we can use a single shared session
key, from the 2PP or another key-setup protocol, to derive multiple per-goal
session keys.
Exercise 5.4 (Per-goal keys).
1. Show why it is necessary to use separately pseudorandom keys for encryption
and for authentication (MAC), i.e., per-goal keys.
2. Show how to securely derive one session key kE,iS for encryption and
another session key kA,iS for authentication, both from the same session
key kiS, yet each key (e.g., kE,iS) is pseudo-random even given the other
key (resp., kA,iS). Your solution may use any cryptographic scheme or
function that we learned - your choice!
Explain the security of your solutions.
Hints for solutions:
1. See Exercise 4.18.
2. One simple solution is kE,iS = PRFkiS(‘E’), kA,iS = PRFkiS(‘A’), where
‘E’, ‘A’ are just two separate inputs to the PRF, ensuring two independently-pseudorandom
session keys.
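The hinted derivation can be sketched as follows; HMAC-SHA256 as the PRF and the byte labels are assumed instantiations:

```python
import hmac, hashlib, os

def prf(key: bytes, data: bytes) -> bytes:
    # Assumed PRF instantiation: HMAC-SHA256.
    return hmac.new(key, data, hashlib.sha256).digest()

k_s = os.urandom(32)            # session key kiS

k_enc = prf(k_s, b'E')          # per-goal key for encryption
k_mac = prf(k_s, b'A')          # per-goal key for authentication (MAC)

# Distinct PRF inputs yield distinct, independently-pseudorandom keys:
assert k_enc != k_mac

# Four directional per-goal keys (cf. Exercise 5.5), using distinct labels:
keys = {label: prf(k_s, label) for label in
        (b'E:A->B', b'A:A->B', b'E:B->A', b'A:B->A')}
assert len(set(keys.values())) == 4
```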
To further improve the security of the session/record protocol, we may
use two separate pairs of per-goal keys, depending on the direction: one pair
(kE,iA→B, kA,iA→B) for (encryption, authentication) of messages from Alice to Bob,
and another pair (kE,iB→A, kA,iB→A) for (encryption, authentication) of messages
from Bob to Alice.
Exercise 5.5.
1. How may the use of separate, pseudo-random pairs of per-goal keys for
the two ‘directions’ improve security?
2. Show how to securely derive all four keys (both pairs) from the same
session key kiS.
3. Show a modification of the Key Exchange 2PP extension, which securely
derives all four keys (both pairs) ‘directly’, a bit more efficiently than by
deriving them from kiS.
Explain the security of your solutions.
5.5 Key Distribution Center Protocols
In this section, we expand a bit beyond our focus on two-party protocols, to
briefly discuss three-party shared-key Key Distribution protocols. In general,
key distribution protocols establish a shared key between two or more entities.
We focus on Key Distribution protocols which use only symmetric cryptography
(shared keys), and involve only three parties: Alice, Bob - and a trusted third
party (TTP), often referred to as the Key Distribution Center (KDC); the use
of the term KDC is so common that these protocols are often referred to as
KDC protocols. The goal of the protocol - and of the KDC - is to establish a
shared key between the other two parties.
There are many types of Key Distribution protocols. We present two
important and very different protocols, both simplified versions of practical
protocols: the Kerberos protocol [299], the most widely-known and
widely-used KDC protocol for computer networks, in subsection 5.5.1; and the
GSM protocol [26], the first widely-deployed cellular communication
protocol, still supported by essentially all existing mobile devices and
networks, in Section 5.6. Due to the unique nature of GSM, we describe it
separately, in the next section.
The Kerberos and GSM protocols are very different. They even differ in
their assumptions: in Kerberos, the KDC shares a key with each of the two parties,
while in GSM, the TTP shares a key with only one party, the client (e.g., Alice),
and is assumed to have a secure connection to the other party (e.g., Bob).
Another important difference is that the Kerberos protocol is secure, while
the GSM protocol is notoriously insecure - and will provide a good opportunity
to introduce important attack techniques. These attacks are practical and well
known; there are even products that perform these and other attacks on GSM,
and on some other cellular-communication protocols.
5.5.1 The Kerberos Key Distribution Protocol
In this subsection we present a simplified version of the Kerberos [299]
key-distribution protocol. Kerberos [299] is the most widely known and deployed
shared-key system for authorization and authentication in computer networks
and distributed systems.
In Kerberos, as in many other KDC protocols, the KDC shares a key with
each party: kA with Alice and kB with Bob; using these keys, the KDC helps
Alice and Bob to share a symmetric key kAB between them. In fact, the term
KDC protocol is usually used for protocols following this model. The (simplified)
protocol is shown in Figure 5.12.
The process essentially consists of two exchanges, both resembling the
Time-based Authenticated Request-Response (Figure 5.10).
The first exchange is between Alice and the KDC. In this exchange, the
KDC sends to Alice the key kAB that will be shared between Alice and Bob,
encrypted (cA) and authenticated (mA). In addition, Alice receives the pair
cB, mB, which also consists of an encryption and an authentication of kAB,
but this time using the keys shared between the KDC and Bob. Alice next
relays these to Bob. Notice that these values are also authenticated when sent
to Alice (within mA).

Alice → KDC:  ‘Bob’, time, MACkAM(time ++ ‘Bob’)
KDC → Alice:  cA = EkAE(kAB), mA = MACkAM(time ++ ‘Bob’ ++ cA ++ cB ++ mB),
              cB = EkBE(kAB), mB = MACkBM(time ++ ‘Alice’ ++ cB)
Alice:        use mA to validate cA, then extract kAB;
              kABE ← PRFkAB(‘Enc’), kABM ← PRFkAB(‘MAC’)
Alice → Bob:  cB, mB, cReq = EkABE(Request), mReq = MACkABM(1 ++ A→B ++ time ++ cReq)
Bob:          validate and decrypt cB, and derive kABE, kABM
Bob → Alice:  cResp = EkABE(Response), mResp = MACkABM(2 ++ A←B ++ time ++ cResp)

Figure 5.12: The Kerberos Key Distribution Center protocol (simplified). The
KDC shares with Alice kAE for encryption and kAM for MAC, and with Bob,
kBE for encryption and kBM for MAC. The KDC selects a shared session key
kAB to be used by Alice and Bob for the specific session (request-response).
Alice and Bob use kAB and a pseudorandom function PRF to derive two shared
keys, kABE = PRFkAB(‘Enc’) (for encryption) and kABM = PRFkAB(‘MAC’)
(for authentication, i.e., MAC). All parties validate the contents of MACs
before decrypting authenticated ciphertexts.
In the second exchange, Alice sends her request to Bob, encrypted and
authenticated using kAB. Alice also sends the pair cB, mB, which Bob uses
to securely retrieve kAB. Alice and Bob both derive, from kAB, the shared
encryption and authentication (MAC) keys kABE and kABM, respectively.
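The two exchanges can be sketched end-to-end as follows. This is a toy model, not Kerberos itself: HMAC-SHA256 stands in for the MAC and PRF, a PRF-derived one-time pad stands in for the encryption scheme E, and all key sizes are illustrative.

```python
import hmac, hashlib, os, time

def prf(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()

def enc(key: bytes, msg: bytes) -> bytes:
    # Toy stream encryption: XOR with a PRF-derived pad (messages <= 32 bytes).
    pad = prf(key, b'pad' + len(msg).to_bytes(4, 'big'))
    assert len(msg) <= len(pad)
    return bytes(m ^ p for m, p in zip(msg, pad))

dec = enc  # XOR encryption is its own inverse

# Long-term keys the KDC shares with Alice (kAE, kAM) and Bob (kBE, kBM):
kAE, kAM = os.urandom(32), os.urandom(32)
kBE, kBM = os.urandom(32), os.urandom(32)

# --- KDC: respond to Alice's request to talk to Bob ---
t = int(time.time()).to_bytes(8, 'big')
kAB = os.urandom(32)                               # fresh session key
cA = enc(kAE, kAB)
cB = enc(kBE, kAB)
mB = hmac.new(kBM, t + b'Alice' + cB, hashlib.sha256).digest()
mA = hmac.new(kAM, t + b'Bob' + cA + cB + mB, hashlib.sha256).digest()

# --- Alice: validate mA, recover kAB, then relay (cB, mB) to Bob ---
assert hmac.compare_digest(
    mA, hmac.new(kAM, t + b'Bob' + cA + cB + mB, hashlib.sha256).digest())
kAB_alice = dec(kAE, cA)

# --- Bob: validate mB, recover kAB ---
assert hmac.compare_digest(
    mB, hmac.new(kBM, t + b'Alice' + cB, hashlib.sha256).digest())
kAB_bob = dec(kBE, cB)
assert kAB_alice == kAB_bob == kAB

# Both derive the per-goal keys kABE and kABM:
kABE, kABM = prf(kAB, b'Enc'), prf(kAB, b'MAC')
```

Note how the KDC never contacts Bob directly: Alice relays (cB, mB), which only Bob's keys can validate and decrypt.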
Note that in the above protocol, the KDC never initiates communication, but
only responds to an incoming request; this communication pattern, where a server
machine (in this case, the KDC) only responds to incoming requests, is referred
to as client-server. Server machines usually use client-server communication,
since it relieves the server (e.g., KDC) from the need to maintain state for
different clients, except for the long-term keys (e.g., kA and kB ). This makes
it easier to implement an efficient service, especially when clients may access
different servers.
In Kerberos, the TTP has an additional role: access control. Namely, the
TTP controls the ability of the client (Alice) to contact the service (Bob). In
this case, the mB authenticator will also be a ticket or permit for the use of the
server. Access control is an important aspect of computer and network security.
5.6 The GSM Key Exchange Protocol
We next discuss the GSM Key Distribution and Key Exchange protocol, an
important-yet-vulnerable shared-key Key Exchange protocol. This protocol
is performed at the beginning of each connection between a Mobile device
belonging to a user (e.g., a mobile phone), a Visited Network (VN), and the
user’s Home Network. The mobile is connected only via the Visited Network,
i.e., any communication between the mobile and the Home Network must pass
through the Visited Network.
Due to its wide use and importance, there are many publications on GSM;
unfortunately, there is no complete agreement on the terms. We try to use
terms which are widely used and intuitive, but readers should be ready to see
different terms in different publications, e.g., the Visited Network (VN) is often
referred to as the Base station (BS), or Visited Location; the Home Network
is sometimes referred to either as the Home location or as the Authentication
Center (AuC).
The GSM protocol is based on a shared key ki associated with the mobile user’s
identifier; this identifier is called the International Mobile Subscriber Identity
(IMSI), but we refer to it simply as i. This key, ki, is known to the Home
Network and to the user’s mobile device. More specifically, the mobile device
of the user with identifier (IMSI) i has a copy of ki; and the Home Network has
a mapping from each identifier i of any of its users to the corresponding ki.
The Visited Network is not fully trusted by the user and by the Home
Network; therefore, it is not provided with the shared key ki, which should
remain secret from it. The GSM design assumes secure communication between
the Visited Network and the Home Network; in particular, information sent
by the Home Network to the Visited Network is not exposed to any other
party. This may be ensured by running TLS (Chapter 7) or another
secure-communication protocol between these two parties; however, GSM simply
assumes such secure communication, without specifying how it should be
secured. Apparently, at least originally, visited and home networks often
simply used private communication lines between them, and assumed this was
secure enough, although we believe by now they probably all use TLS or a
similar protocol.
The basic idea of the GSM Key Exchange protocol is for the Home Network
to provide the Visited Network with a triplet (r, Kc, s) for every session of the
mobile, where:

r (or RAND) is a random 128-bit string selected by the Home Network, and
used, together with the client’s key ki, to compute (Kc, s) ← A38(ki, r),
where A38 is an algorithm4 which the Home Network operator
is free to select, and which the GSM specifications require to be a One-Way
Function. Since (Kc, s) are derived from ki and r, it suffices for the Visited
Network to send to the Mobile device only r, and the mobile can compute
the same values (Kc, s) as computed by the Home Network (which were
sent to the Visited Network).

Kc is the session key. This key is used by the Visited Network and the client
to encrypt the connection between them.

s (or SRES) is a secret authenticator/result; the mobile device authenticates
itself to the Visited Network by sending to it the same value of s that the
Visited Network received in the triplet (r, Kc, s) from the Home Network.

4 Actually, the GSM specifications define two separate functions, A3 and A8, to compute
each parameter: s ← A3(ki, r), Kc ← A8(ki, r). But since they are always computed together
and are expected to have similar cryptographic properties, they are usually considered and
implemented as a single algorithm, which we denote A38; you may also see it denoted A3A8.
The GSM Key Exchange protocol, and two messages illustrating how GSM
protects data transfer with the session key Kc, are illustrated in Figure 5.13.
The design uses two ‘cryptographic functions’: A38, introduced above, and
A5. In the specifications, A38 is described as a One Way Function (OWF),
and A5 is referred to as encryption; however, from their use in the protocol,
both functions should really be pseudorandom functions (PRFs). The standard
defines three variants of A5, denoted A5/v for v ∈ {1, 2, 3}, and uses A5/0 to
denote no encryption.
The GSM specification allows some flexibility as to the specific functions.
In fact, the operator of the Home Network is free to select the A38 function. A
common choice is an algorithm called COMP128, which was defined by the GSM
consortium and shared under non-disclosure agreement.
The A5/i ‘encryption’ functions, however, must be supported by both the mobile
and the Visited Network. The GSM specifications included two implementations
of A5, denoted A5/1 and A5/2; the A5/2 algorithm is an intentionally-weaker
variant of A5/1, included to allow export of GSM systems to network operators
in countries to which it was, at the time, not permitted to export ‘strong’
encryption products. Similarly to COMP128, both A5/1 and A5/2 were shared
under non-disclosure agreement, i.e., kept ‘secret’. Later, another option was
added - the A5/3 algorithm, based on the KASUMI block cipher. Another
option is A5/0, which simply means that encryption is not performed at all.
As can be seen in the bottom (last) messages of Figure 5.13, the A5/i
functions output 228 bits. Half of them, bits 1 to 114, are used to encrypt
messages from the Mobile client to the Visited Network; the other half, bits
115 to 228, is used to encrypt ‘responses’, i.e., messages from the
Visited Network to the Mobile client.
Note that the input to the A5/i functions is always the key Kc and a non-secret
number - for simplicity, we show it as a sequential counter5. These values are
synchronized between the Mobile client and the Visited Network; this synchronization
is provided by the underlying communication protocol (TDMA).
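The splitting of the 228-bit output between the two directions can be sketched as follows; a real A5/v is an LFSR-based stream cipher, so the HMAC-based stand-in below is only an assumed PRF model.

```python
import hmac, hashlib, os

def a5_like(kc: bytes, counter: int) -> list:
    # Toy stand-in for A5/v: derive 228 pseudorandom bits (as a list of 0/1)
    # from the session key Kc and the non-secret per-frame counter.
    digest = hmac.new(kc, counter.to_bytes(4, 'big'), hashlib.sha256).digest()
    bits = [(byte >> (7 - i)) & 1 for byte in digest for i in range(8)]
    return bits[:228]       # 256 bits derived, keep the first 228

kc = os.urandom(8)              # GSM session key Kc (64 bits)
ks = a5_like(kc, 1)             # keystream for frame/counter 1
uplink   = ks[0:114]            # bits 1..114: Mobile client -> Visited Network
downlink = ks[114:228]          # bits 115..228: Visited Network -> Mobile client

msg_bits = [1, 0, 1, 1] * 28 + [0, 1]            # 114 "encoded" message bits
cipher = [m ^ k for m, k in zip(msg_bits, uplink)]
# The receiver, holding the same Kc and counter, recovers the plaintext:
assert [c ^ k for c, k in zip(cipher, uplink)] == msg_bits
```

Using disjoint halves for the two directions ensures the same keystream bits are never reused for both a message and a response under the same counter.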
Overview of the GSM Key Exchange. The Key Exchange begins with the
mobile sending its identifier IMSI (International Mobile Subscriber Identity)
to the Visited Network; we denote this identifier simply as i. The Visited
Network forwards i to the Home Network.

Mobile client → Visited network:   i (IMSI)
Visited network → Home network:    i (IMSI)
Home network:                      r ←$ {0,1}^128; (Kc, s) ← A38(ki, r)
Home network → Visited network:    (r, s, Kc)
Visited network → Mobile client:   r
Mobile client:                     (Kc, s) ← A38(ki, r)
Mobile client → Visited network:   s
Visited network → Mobile client:   Ok
Mobile client → Visited network:   ECC(m1) ⊕ A5/v(Kc, 1)[1:114]
Visited network → Mobile client:   ECC(resp1) ⊕ A5/v(Kc, 1)[115:228]
Mobile client → Visited network:   ECC(m2) ⊕ A5/v(Kc, 2)[1:114]
Visited network → Mobile client:   ECC(resp2) ⊕ A5/v(Kc, 2)[115:228]
(...and so on for more messages)

Figure 5.13: The GSM Key Exchange protocol. The standard defines
‘cryptographic functions’ A38 (defined in the specifications as a OWF, but
actually used as a PRF) and A5 (referred to in the specifications as
encryption, but actually also used as a PRF). The standard defines three
variants of A5, denoted A5/v for v ∈ {1, 2, 3}, and uses A5/0 to denote no
encryption. The GSM standard also specifies the Error Correction Code
function ECC(·).

The Home Network uses i to retrieve the key ki of the mobile client. Then,
the Home Network selects a random 128-bit binary string r, and uses ki and
r to compute (Kc, s) ← A38(ki, r), using the A38 algorithm (see discussion
above). The Home Network sends the resulting GSM authentication triplet
(r, s, Kc) to the Visited Network.
5 The actual numbers used in GSM are a bit more complex, but still non-secret.
Figure 5.13 also shows an example of two messages m1, m2 sent from the Mobile
client to the Visited Network, and two corresponding ‘responses’, resp1, resp2,
sent from the Visited Network to the Mobile client. Note that these need not
really be responses to the messages; the Visited Network would send any other
message to the Mobile client in the same way, e.g., a message from some remote
communicating client - we just use ‘resp’ (for ‘response’) since it seems a bit
clearer, avoiding confusion with messages from the Mobile client. Of course, in
typical real use, the mobile and the Visited Network exchange more than two
messages and responses.
All the messages, including responses, that are exchanged between the
Mobile client and the Visited Network are encrypted using the connection’s key
Kc. The encryption uses the A5 variant negotiated between the client and the
Visited Network, e.g., A5/1.
More precisely, the messages (and responses) are encrypted by bitwise-XOR with
the output of the A5 function, since the A5 algorithms define a pseudorandom
function (PRF), as we explained above.
Error correction then encryption? Before XORing the messages (and
responses) with the output of A5, the protocol first applies an Error Correcting
Code (ECC) to the message/response, to allow recovery from bit errors - common
in wireless communication. In fact, GSM uses quite extreme error correction
codes, e.g., with input of 184 (non-encoded) bits and output of 456 (encoded)
bits6. Note that since every flow XORs the message/response with only 114
bits from the output of A5, every ‘real’ GSM message would
actually be transmitted using multiple such 114-bit flows; what we show for
the messages and responses in Figure 5.13 is only a simplification.
The use of error-correction before encryption may have been designed as
a heuristic attempt to provide authentication as a by-product of encryption.
However, as discussed in subsection 4.7.2, this approach is not advisable: it
may fail to prevent message modification; it precludes applying authentication
first and then an error-detection/correction code, as a way to detect attacks; and,
significantly, it may make it easier to attack the encryption scheme, since the
plaintext will have a lot of redundancy due to the ECC. This is the case for
GSM.
In fact, the use of ECC-then-Encrypt in GSM may be the best (or worst...)
example of the risk of performing ECC and then encrypting. Specifically, see [26]
for an effective ‘ciphertext-only’ attack on the GSM encryption using A5/1 or
A5/2. While the attacks are presented as ‘ciphertext-only’, they
fully exploit the fact that the plaintext has huge redundancy due to the use of
ECC before encryption.
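To see why ECC-induced redundancy helps the cryptanalyst, consider a toy 3-repetition ‘ECC’ with XOR-keystream encryption (both are stand-ins, far simpler than GSM’s actual codes and ciphers): XORing ciphertext bits within one repeated triple cancels the unknown plaintext bit entirely, handing the attacker linear relations that involve only keystream bits.

```python
import random

random.seed(1)
msg = [random.getrandbits(1) for _ in range(10)]        # unknown plaintext bits
ecc = [b for b in msg for _ in range(3)]                # toy ECC: 3x repetition
key = [random.getrandbits(1) for _ in range(len(ecc))]  # keystream bits
ct  = [e ^ k for e, k in zip(ecc, key)]                 # "ECC-then-encrypt"

# Eavesdropper's view: within each repeated triple, XORing ciphertext bits
# cancels the plaintext bit, leaving a relation over keystream bits alone:
for i in range(0, len(ct), 3):
    assert ct[i] ^ ct[i+1] == key[i] ^ key[i+1]
    assert ct[i] ^ ct[i+2] == key[i] ^ key[i+2]
# For an LFSR-based keystream (as in A5/2), such known linear relations over
# keystream bits let the attacker solve for the cipher's internal state.
```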
While this attack is highly recommended reading, we will not describe it
here. Instead, we proceed to discuss protocol-based attacks on the GSM Key
Exchange.
6 This is a simplification; in reality, different lengths are used for different types of messages.
Eavesdrop phase (Mobile ↔ legitimate VN):
  Mobile → VN:        i (IMSI)
  VN → Mobile:        r
  Mobile → VN:        s
  VN → Mobile:        Ok
  Mobile → VN:        ECC(m1) ⊕ A5/v(Kc, 1)[1:114]
  ...
  Mobile → VN:        ECC(mn) ⊕ A5/v(Kc, 1)[1:114]

Cryptanalysis phase: the attacker recovers Kc from the recorded ciphertexts.

Impersonate phase (Mobile ↔ Attacker, who replays r):
  Mobile → Attacker:  i (IMSI)
  Attacker → Mobile:  r
  Mobile → Attacker:  s
  Attacker → Mobile:  Ok
  Mobile → Attacker:  ECC(m′1) ⊕ A5/v(Kc, 1)[1:114]
  ...
  Mobile → Attacker:  ECC(m′n′) ⊕ A5/v(Kc, 1)[1:114]

Figure 5.14: The VN-impersonation attack by a MitM attacker on the GSM
Key Exchange. The Key Exchange between the client and the Home Network
is exactly like in Figure 5.13, but here we omit the Home Network and the
messages exchanged between the Visited Network and the Home Network. This
figure is simplified; in particular, it does not include the cipher-negotiation
details, which are shown in Figure 5.15. A5/v denotes the GSM encryption
scheme; standard values are v ∈ {0, 1, 2, 3}.
5.6.1 Vulnerability study: VN-impersonation replay attack on GSM
Figure 5.14 shows the simple VN-impersonation replay attack against the GSM
Key Exchange protocol. The attack involves a fake Visited Network (VN), i.e.,
the attacker impersonates a legitimate VN.
The VN-impersonation attack has three phases:
Eavesdrop: in the first phase, the attacker eavesdrops on a legitimate
connection between the mobile client and a legitimate Visited Network (VN). The
Key Exchange between the client and the Home Network is exactly like
in Figure 5.13, except that, for simplicity, Figure 5.14 does not show the
Home Network and the messages exchanged between the Visited Network
and the Home Network.

Cryptanalysis: in the second phase, the attacker cryptanalyzes the
ciphertexts collected during the eavesdrop phase. Assume that the attacker
succeeds in finding the session key Kc shared between the client and the Visited
Network; this is reasonable, since multiple effective attacks are known on
the GSM ciphers A5/1 and A5/2.

Impersonate: finally, once cryptanalysis has exposed the session key Kc, the
attacker impersonates a legitimate Visited Network, and replays the
same random challenge r sent by the legitimate Visited Network in the
eavesdrop phase. Since the connection key Kc is derived deterministically
from r, by (Kc, s) ← A38(ki, r), the client reuses the same
connection key Kc as in the eavesdropped connection - the one exposed
by the attacker during the cryptanalysis phase. The attacker now uses
Kc to communicate correctly with the client; in particular, this allows the
adversary to decrypt any messages m′1, . . . , m′n′ encrypted and sent by
the client in this new connection.
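The core of the impersonate phase is that the derivation (Kc, s) ← A38(ki, r) is deterministic, so replaying r forces the client to reuse the exposed key. A minimal model, with a hash standing in for the operator-chosen A38:

```python
import hashlib, os

def a38(ki: bytes, r: bytes):
    # Toy deterministic model of A38: derive (Kc, s) from ki and r.
    d = hashlib.sha256(ki + r).digest()
    return d[:8], d[8:12]          # Kc (64 bits), s (32 bits)

ki = os.urandom(16)                # client's long-term key

# Eavesdrop phase: the legitimate VN uses challenge r; the client derives (Kc, s).
r = os.urandom(16)
kc_old, s_old = a38(ki, r)

# Impersonate phase: the attacker replays the same r; the client, following
# the protocol, derives exactly the same session key and authenticator:
kc_new, s_new = a38(ki, r)
assert kc_new == kc_old and s_new == s_old
# If cryptanalysis exposed kc_old, the attacker can now decrypt the new session.
```

The fix would require the client to contribute freshness to the key derivation, which the GSM design does not do.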
Are MitM attacks feasible against GSM? After the VN-impersonation
attack and the MitM downgrade attack (subsection 5.6.3) were published [26],
some responses argued that building a MitM adversary is ‘too complex’ and
that therefore such attacks are not a real concern. Nevertheless, devices allowing
GSM-MitM attacks have been constructed - by academic researchers, students,
independent developers, and also companies; such products are available for
purchase from multiple vendors.
Is the VN-impersonation attack effective and a real threat? The
VN-impersonation attack allows the attacker to impersonate a Visited Network
and cause the mobile client to send (new) messages encrypted with the (old)
key, which the attacker can now decrypt. The attacker may also respond with
fake messages, of course, to continue the dialog. However, this attack has
one significant drawback: the attacker cannot impersonate the client to a
legitimate Visited Network. In particular, the attack does not allow the attacker
to decrypt new responses sent from a remote peer to the client. In practice, this
may also make it hard for the attacker to deploy this attack in some scenarios,
e.g., to become a MitM on a complete call between the client and a remote
party. The attacker may try to connect to the remote party using a separate
call from the attacker’s own mobile device, relaying traffic between the client
and the remote party via the attacker’s device - but the remote party may notice
the use of a different client device. This serious limitation is avoided by the
downgrade attack we discuss next.
5.6.2 Crypto-agility and cipher suite negotiation in GSM
In this subsection, we first introduce the important principle of crypto-agility
(also known as cryptographic agility). Crypto-agility means that the cryptographic
protocol allows the use of different cryptographic functions and schemes,
as long as they satisfy some requirements, e.g., an IND-CCA encryption scheme
or a secure PRG. We refer to a specific choice of functions/schemes used by a
protocol as a cipher suite. Then, we explain that the GSM protocol supports
crypto-agility, including a cipher suite negotiation mechanism, allowing entities
to negotiate the specific cryptographic functions and schemes to be used. Finally,
we show that the GSM cipher suite negotiation mechanism is vulnerable to a
devastating cipher suite downgrade attack.
Principle 11 (Crypto-agility). Cryptographic protocols should be designed
using abstract ‘building block’ cryptographic functions and schemes; the set of
functions and schemes is called a cipher suite. Each function/scheme should
have well-defined requirements. It is desirable for protocols to allow cipher suite
negotiation to determine the specific cipher suite to be used in a particular run
of the protocol, as long as it is secure. Cipher suite negotiation is secure if the
negotiated cipher suite is never inferior to another cipher suite that is supported
by both/all parties involved in the protocol.
Basically, crypto-agility requires a modular design, where the design of the
protocol does not depend on the specific components (cipher suite) used, only
on their required security properties. This has several important benefits:
1. Crypto-agility allows replacing a cryptographic scheme/function which is
found or suspected to be vulnerable, while continuing to use the same
protocol.
2. Crypto-agility allows different users to use the same protocol, but with
different schemes, due to different trust assumptions, different
efficiency/security tradeoffs and considerations, or other reasons, such as licensing,
availability and legal (often export) restrictions. In particular, some
countries restrict the export of some cryptographic mechanisms; specifically,
until about 2000, the USA restricted export of cryptographic systems
using symmetric encryption with keys longer than 40 bits. A protocol
may further support a secure cipher suite negotiation to allow the parties
to choose the ‘best’ cipher suite supported by both/all of them.
3. The security of a crypto-agile protocol can be established based on the
well-defined requirements from the cipher suite, which makes it easier to
design, evaluate, understand and prove the security of the protocol.
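The modular-design idea can be sketched as protocol code written against an abstract PRF interface, with the concrete scheme supplied as a (negotiable) parameter; the registry and suite names below are illustrative only:

```python
import hmac, hashlib, os

# Illustrative cipher-suite registry: the protocol code below depends only
# on the abstract PRF interface, not on which entry is selected.
SUITES = {
    'PRF-SHA256': lambda key, data: hmac.new(key, data, hashlib.sha256).digest(),
    'PRF-SHA1':   lambda key, data: hmac.new(key, data, hashlib.sha1).digest(),
}

def derive_session_key(suite_name: str, master_key: bytes,
                       na: bytes, nb: bytes) -> bytes:
    # Crypto-agile derivation: the scheme is a parameter, chosen per run.
    prf = SUITES[suite_name]
    return prf(master_key, na + nb)

k = os.urandom(32)
na, nb = os.urandom(16), os.urandom(16)
# Replacing a suspected-weak suite requires no change to the protocol logic:
k1 = derive_session_key('PRF-SHA1', k, na, nb)
k2 = derive_session_key('PRF-SHA256', k, na, nb)
assert k1 != k2
```

The protocol’s security argument can then be stated once, for any registry entry that satisfies the PRF requirement.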
Note, however, that all too often, protocol designers focus on crypto-agility
and cipher suite negotiation - but fail to properly ensure the security of the
negotiation mechanism, creating serious vulnerabilities and allowing different
downgrade attacks, which let an attacker trick the parties into using a particular,
vulnerable cipher suite chosen by the attacker. These attacks usually involve
a Man-in-the-Middle (MitM) attacker. In this section, we will see downgrade
attacks against GSM; later, in Chapter 7, we will see several downgrade attacks
on SSL and TLS.
GSM cipher suite negotiation is vulnerable to downgrade attacks.
GSM supports crypto-agility, in the sense that the protocol is defined for any
stream cipher, with three specific options (A5/1, A5/2 and A5/3) as well as the
ability to use other stream ciphers. Furthermore, GSM supports cipher suite
negotiation, since the Visited Network and the client (mobile) negotiate which
stream cipher to use. Namely, the client sends the list of supported stream
ciphers, and the Visited Network indicates which of them it prefers; this stream
cipher is then used by GSM. In particular, GSM’s cipher suite negotiation
was essential to allow interoperability between a client / visited network that
supports only an exportable (cryptographically weak) stream cipher, and a
visited network / client that supports both the exportable stream cipher and a
more secure stream cipher.
However, the GSM cipher suite negotiation is not well protected, allowing
downgrade attacks by a MitM adversary, as we show next. In fact, we show
two attacks. First, in this subsection, we outline the simple downgrade to A5/1
attack, which allows downgrading GSM connections to use A5/1. Then, in
subsection 5.6.3, we present the (more advanced) downgrade to A5/2
attack. Both attacks work even when both the mobile client and the visited
network support and prefer a stronger cipher, e.g., A5/3.
First, let us explain the GSM cipher suite negotiation mechanism.
The GSM cipher suite negotiation. The GSM cipher suite negotiation
process is shown in Figure 5.15. In the first message of the Key Exchange,
containing the mobile client’s identity (IMSI), the Mobile client also lists the
supported ciphers, i.e., the A5/v functions. For example, in the figure, the
Mobile supports A5/1 and A5/2. The Visited Network selects the stream cipher
to be used from this list. Usually, the Visited Network would select the stream
cipher considered most secure, among those offered by the mobile client, that
this Visited Network supports.
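Since the client’s offer list is sent in the clear and is not authenticated, a MitM can strip strong options before the Visited Network sees them; a minimal sketch (the rankings are illustrative):

```python
# Illustrative strength ranking of the GSM stream ciphers (higher = stronger).
RANK = {'A5/3': 3, 'A5/1': 2, 'A5/2': 1}

def vn_choose(offered, supported):
    # The Visited Network picks the strongest cipher it supports from the
    # (unauthenticated!) list offered by the mobile client.
    common = [c for c in offered if c in supported]
    return max(common, key=RANK.get) if common else None

client_offer = ['A5/3', 'A5/1', 'A5/2']
vn_supported = ['A5/3', 'A5/1', 'A5/2']

# Honest run: the strongest common cipher is chosen.
assert vn_choose(client_offer, vn_supported) == 'A5/3'

# MitM downgrade: the attacker removes 'A5/3' from the offer in transit,
# so the two honest parties end up using the weaker A5/1.
tampered = [c for c in client_offer if c != 'A5/3']
assert vn_choose(tampered, vn_supported) == 'A5/1'
```

Neither party detects the tampering, because nothing in the GSM handshake authenticates the offered list.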
A critical property of the GSM negotiation mechanism is that all clients
support the stronger A5/1 protocol. Furthermore, GSM specifies that Visited
networks that support A5/1, as most do, should refuse to use A5/2 - even if A5/2
is the only option on the list. This is an important fact which has significant
impact on GSM downgrade attacks:
[Figure 5.15 - message flow:]
1. Mobile client → Visited Network: i (IMSI), Ciphers: {A5/1, A5/2, ...}
2. Visited Network: abort if A5/1 ∉ Ciphers
3. Visited Network → Home Network: i (IMSI)
4. Home Network: r ←$ {0,1}^128; (Kc, s) ← A38(ki, r); Home Network → Visited Network: (r, s, Kc)
5. Visited Network → Mobile: r
6. Mobile: (Kc, s) ← A38(ki, r); Mobile → Visited Network: s
7. Visited Network → Mobile: CIPHMODCMD: A5/v (v ∈ {0, 1, 2})
8. Mobile → Visited Network: ECC(CIPHMODCOM) ⊕ A5/v(Kc, 1)[1 : 114]
9. Timeout and retransmission (no CIPHMODOK received): Mobile → Visited Network: ECC(CIPHMODCOM) ⊕ A5/v(Kc, 2)[1 : 114]
10. Visited Network → Mobile: ECC(CIPHMODOK) ⊕ A5/v(Kc, 2)[115 : 228]
11. (continue as in Figure 5.13)
Figure 5.15: The GSM Key Exchange Protocol, including details of cipher suite
negotiation (omitted in Figure 5.13). Note that the visited network aborts if
the ciphers offered by the client do not include A5/1.
Fact 5.2 (GSM Visited Networks refuse to downgrade to A5/2). All GSM
Visited networks support A5/1, and refuse to open (abort) a connection if the
cipher suites offered by the mobile client do not include A5/1.
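This selection logic can be sketched as follows; this is a hypothetical model for illustration, not actual GSM code, and the preference order is our assumption:

```python
# Sketch of a Visited Network's cipher selection (hypothetical model,
# not actual GSM code). The preference order is an illustrative assumption.
PREFERENCE = ["A5/3", "A5/1", "A5/2"]  # most to least preferred

def select_cipher(offered):
    """Return the preferred cipher from the client's list, enforcing
    Fact 5.2: abort (return None) if A5/1 is not offered."""
    if "A5/1" not in offered:
        return None  # refuse to open the connection (Fact 5.2)
    for cipher in PREFERENCE:
        if cipher in offered:
            return cipher
    return None

assert select_cipher(["A5/1", "A5/2", "A5/3"]) == "A5/3"
assert select_cipher(["A5/1", "A5/2"]) == "A5/1"
assert select_cipher(["A5/2"]) is None  # A5/1 missing -> abort
```

Note how the abort of Fact 5.2 blocks the trivial 'offer only A5/2' manipulation; nothing in this logic, however, prevents an attacker from removing A5/3 from the offered list.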
The reason for Fact 5.2 is that the A5/2 cipher is known to have been
designed intentionally to provide vulnerable encryption. This vulnerable cipher
was necessary to gain government permission to export GSM equipment; GSM
network operator equipment was allowed for export to certain countries, only if
restricted to use only A5/2.
Without the defense of Fact 5.2, a MitM attacker could have simply removed
A5/1 (and any other ‘strong’ cipher) from the list of ciphers sent by the
Mobile, and as a result, communication would have used the vulnerable A5/2.
Unfortunately, Fact 5.2 does not prevent the simple downgrade to A5/1 attack,
which we present below. In the next subsection, we present the (more advanced)
downgrade to A5/2 attack.
Downgrade to A5/1 attack. Fact 5.2 is the only defense of GSM against
downgrade attacks; yet, it does not refer at all to other ciphers, except A5/1.
As a result, GSM is vulnerable to a simple downgrade to A5/1 attack.
For example, if a mobile supports {A5/1, A5/2, A5/3}, then a MitM attacker
can simply remove the A5/3 option from the list sent by the Mobile client, to
cause the Visited Network and Mobile client to use the (weaker) A5/1 protocol.
This attack is much simpler than the ones we present next, in subsection 5.6.3;
therefore, we leave it as an exercise for the reader (Exercise 5.12). This exercise
- finding the simple downgrade attack - should not be difficult, especially after
learning the more advanced downgrade to A5/2 attack which we describe next.
5.6.3 The downgrade to A5/2 attack on GSM
In this subsection, we present the downgrade to A5/2 attack on GSM. This is a
devastating attack, since A5/2 is an absurdly vulnerable algorithm; indeed, it
was intentionally designed that way, since such a weakened cipher was necessary
to obtain permission to export GSM devices. This is a non-trivial attack, and
therefore, we first present a simplified variant that is unlikely to work in practice.
Then, we present the ‘real’ downgrade to A5/2 attack.
Both attacks are based on the GSM key-reuse vulnerability, which we
describe next.
GSM key-reuse vulnerability. GSM has an unusual additional vulnerability,
making downgrade attacks much worse than with most systems/protocols. This
vulnerability is due to the following fact:
Fact 5.3 (GSM Key-Reuse vulnerability). The GSM Key Exchange establishes
the same key Kc , regardless of the cipher used (e.g., A5/1, A5/2 or A5/3).
Namely, the GSM protocol uses the same key Kc for all ciphers.
Note that this vulnerability is due to the fact that the GSM designers
completely ignored the Key separation principle (Principle 10).
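Had the key separation principle been followed, each cipher would use its own key derived from Kc via a PRF. A minimal sketch of such a derivation, using HMAC-SHA256 as the PRF (our own illustration; GSM does not do this):

```python
import hmac, hashlib

def derive_cipher_key(kc: bytes, cipher_name: str) -> bytes:
    # Key separation: a distinct key per cipher, derived with a PRF
    # (HMAC-SHA256 here), so exposing the A5/2 key reveals nothing
    # about the A5/1 or A5/3 keys.
    return hmac.new(kc, cipher_name.encode(), hashlib.sha256).digest()

kc = bytes(8)  # stand-in for the GSM connection key Kc
k1 = derive_cipher_key(kc, "A5/1")
k2 = derive_cipher_key(kc, "A5/2")
assert k1 != k2  # keys differ per cipher, unlike GSM's single Kc
```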
Fact 5.3 allows an attacker to find a key used with one (weak) scheme, and
use it to decipher communication protected with a different (stronger) cipher.
This is indeed exploited by the GSM downgrade attacks we present. Another
fact used by the attack is that A5/2 is absurdly vulnerable, i.e., very effective
and efficient attacks against A5/2 are known.
Fact 5.4 (CTO attack on A5/2 requires 900 ciphertext bits and one second).
The ciphertext-only (CTO) attack of [26] finds the connection encryption key
Kc , given 900 bits or more of ciphertext, encrypting ECC-encoded messages;
the attack takes less than one second (using standard computing capabilities).
The next few steps of the Key Exchange, as described in Figure 5.15, are
exactly as in Figure 5.13. In fact, since the interactions with the Home Network
are not impacted or changed by the ciphersuite mechanism or by the attacks,
we do not even include them in the discussion and figures of the downgrade
attack and its variants.
After receiving the (correct) authenticator, s, from the mobile, the visited
network identifies its choice of A5 cipher to the mobile. This is done in the
message CIPHMODCMD : A5/v, where v indicates the cipher to be used, i.e.,
in this case, v ∈ {0, 1, 2}; this is instead of merely sending 'Ok' as in Figure 5.13.
The following message from the client, which is also the first encrypted message,
is the special message CIPHMODCOM, i.e., 'cipher mode complete', which
acknowledges that the Mobile is using the cipher mode indicated in the
CIPHMODCMD ('cipher mode command') sent by the Visited Network.
Recall that GSM is designed for wireless communication, with significant
probability of noise and corruption of the transmitted information - which is
the motivation for its extensive use of Error Correcting Codes, with extensive
redundancy. In spite of that, messages may get lost. Hence, important control
messages should be acknowledged; in particular, the Mobile client waits for an
acknowledgement message from the Visited Network (VN), which we denote
CIPHMODOK, to know that the VN correctly received the CIPHMODCOM
message.
When the Mobile times out, i.e., does not receive CIPHMODOK in time,
the Mobile retransmits the CIPHMODCOM to the Visited Network;
this scenario is shown in Figure 5.15 (where we show the case where the first
retransmission is successful). This happens after a very short timeout of
much less than a second - a few milliseconds. As shown in Figure 5.15, each
retransmission uses a distinct sequence number (e.g., 1 and 2, in the figure).
This, again, is an important fact for the downgrade attack.
Fact 5.5 (CIPHMODCOM message and its retransmission). After a Mobile
receives the CIPHMODCMD ('cipher mode command') message, instructing
the Mobile to use a specific cipher mode, the Mobile encrypts and sends
CIPHMODCOM ('cipher mode completed') to the VN; this message contains
456 bits (including the ECC). The Mobile then waits for an acknowledgement,
a CIPHMODOK ('cipher mode Ok') message, from the VN. If this is not received
after a timeout of a few milliseconds, the Mobile retransmits, with a new counter
value.
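The counter-dependent frame encryption, and why each retransmission hands the attacker fresh ciphertext, can be sketched as follows; HMAC-SHA256 stands in for the A5/v keystream generator (an assumption for illustration, not the real cipher):

```python
import hmac, hashlib

def keystream(kc: bytes, count: int, nbytes: int) -> bytes:
    # Stand-in for A5/v(Kc, count): HMAC-SHA256 used as a PRF and
    # expanded to nbytes (NOT the real GSM cipher).
    out = b""
    block = 0
    while len(out) < nbytes:
        out += hmac.new(kc, count.to_bytes(4, "big") + block.to_bytes(4, "big"),
                        hashlib.sha256).digest()
        block += 1
    return out[:nbytes]

def encrypt_frame(kc: bytes, count: int, frame: bytes) -> bytes:
    # Stream-cipher encryption: XOR the (ECC-encoded) frame with the
    # keystream for this counter value.
    ks = keystream(kc, count, len(frame))
    return bytes(a ^ b for a, b in zip(frame, ks))

kc = bytes(8)
frame = b"CIPHMODCOM + ECC"       # stand-in for the 456-bit encoded message
c1 = encrypt_frame(kc, 1, frame)  # original transmission, counter 1
c2 = encrypt_frame(kc, 2, frame)  # retransmission, counter 2
assert c1 != c2  # same plaintext, fresh keystream: more material for the attacker
```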
A failure may also occur in the reverse direction, i.e., the Visited Network
(VN) may time out while waiting for the CIPHMODCOM ('cipher mode
completed') from the client. In this case, the VN aborts the Key Exchange.
Again, this happens after a timeout of a few milliseconds. Similarly, a failure may
also occur in the earlier phases of the Key Exchange, i.e., between sending i
and receiving r, or between sending r and receiving s. However, GSM allows
for much larger delays in these early phases - up to 12 seconds! This fact is also
significant.
[Figure 5.16 - message flow (Mobile, MitM, VN):]
Pre-analysis phase:
1. Mobile → MitM → VN: i, {A5/1, A5/2} (forwarded unchanged)
2. VN → MitM → Mobile: r
Eavesdrop phase:
3. Mobile → MitM → VN: s
4. VN → MitM: CIPHMODCMD: A5/1; MitM → Mobile: CIPHMODCMD: A5/2
5. Mobile → MitM: ECC(CIPHMODCOM, 1) ⊕ A5/2(Kc, 1)[1 : 114]
Cryptanalysis phase: MitM finds Kc from the A5/2 ciphertext
6. MitM → VN: ECC(CIPHMODCOM, 1) ⊕ A5/1(Kc, 1)[1 : 114]
7. VN → MitM: ECC(CIPHMODOK, 1) ⊕ A5/1(Kc, 1)[115 : 228]; MitM → Mobile: ECC(CIPHMODOK, 1) ⊕ A5/2(Kc, 1)[115 : 228]
8. Data frames are translated by the MitM: e.g., ECC(m1) ⊕ A5/2(Kc, 2)[1 : 114] from the Mobile is forwarded as ECC(m1) ⊕ A5/1(Kc, 2)[1 : 114] to the VN, ...
Figure 5.16: Simplified downgrade attack on GSM Key Exchange.
Fact 5.6 (GSM Key Exchange allows 12-second delays until CIPHMODCMD).
The Mobile and the VN abort the Key Exchange if they do not receive the
expected responses after a timeout of about 12 seconds; this holds for the messages
until the CIPHMODCMD is sent (and received), from which point responses
are expected to be almost instantaneous (a few milliseconds). If responses are not
received by the timeout, the Key Exchange is aborted.
Simplified, unrealistic downgrade attack. We first present a simplified,
unrealistic downgrade attack against GSM in Figure 5.16. In this attack, the
client supports A5/1 and A5/2; the MitM forwards the cipher list unchanged,
and the Visited Network selects A5/1, but the MitM sends the Mobile a modified
command, CIPHMODCMD: A5/2. The Mobile thus protects its traffic using only
the (extremely vulnerable) A5/2 cipher; the MitM recovers Kc by cryptanalysis
of the A5/2 ciphertext and, exploiting the key-reuse vulnerability (Fact 5.3),
'translates' the traffic to A5/1 towards the Visited Network.
However, the attack of Figure 5.16 fails, for the following reasons:
1. As per Fact 5.6, the Visited Network would time out when it does not
receive the CIPHMODCOM message within a few milliseconds - while
the fastest ciphertext-only cryptanalysis of A5/2, from [26], takes
about a second (Fact 5.4).
2. The length of the CIPHMODCOM message is only 456 bits (including
the ECC), while the cryptanalysis attack of [26] requires 900 bits. Hence,
the attack will not succeed in finding the key at all.
We next present the ‘real’ GSM downgrade attack, which overcomes these
challenges.
The 'real' downgrade attack on GSM Key Exchange. In Figure 5.17,
we finally present the 'real' downgrade attack on the GSM Key Exchange. This
attack addresses the two challenges presented above, as follows:
1. As per Fact 5.6, GSM allows delays of about 12 seconds until the Visited
Network sends CIPHMODCMD. So this attack delays the s response
from the client - this gives the MitM attacker 12 seconds, much more than
enough for the ciphertext-only cryptanalysis of A5/2 from [26],
which takes only about a second (Fact 5.4)!
2. To obtain a sufficient number of ciphertext bits, this attack intentionally
causes the mobile client to time out while waiting for the CIPHMODOK
message, resulting in rapid retransmission of the CIPHMODCOM message
from the client - re-encrypted, since the counter value is modified, as
follows from Fact 5.5. Each of these (re)transmissions contains 456 bits,
together providing more than the 900 bits required for the attack on A5/2
from [26] (Fact 5.4).
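The ciphertext arithmetic behind the second point is simple: each (re)transmission of CIPHMODCOM contributes 456 bits (Fact 5.5), while the attack of [26] requires 900 bits (Fact 5.4). A tiny sketch:

```python
CIPHMODCOM_BITS = 456  # bits per CIPHMODCOM message, including ECC (Fact 5.5)
REQUIRED_BITS = 900    # ciphertext bits needed by the CTO attack of [26] (Fact 5.4)

transmissions = 0
collected = 0
while collected < REQUIRED_BITS:
    transmissions += 1  # MitM withholds CIPHMODOK -> Mobile (re)transmits
    collected += CIPHMODCOM_BITS

assert transmissions == 2  # a single retransmission already suffices
assert collected == 912    # 912 >= 900 bits collected
```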
Several additional variants of this attack are possible; see, for example, the
following exercise.
Exercise 5.6 (GSM combined replay and downgrade attack). Consider an
attacker who eavesdrops and records the entire communication between mobile
and Visited Network during a connection which is encrypted using a ‘strong’
cipher, say A5/3. Present a sequence diagram, like Figure 5.14, showing a
'combined replay and downgrade attack', allowing this attacker to decrypt all
of that ciphertext communication by later impersonating a Visited Network,
and performing a downgrade attack.
Hint: the attacker will resend the value of r from the eavesdropped-upon
communication (encrypted using a ‘strong’ cipher) to cause the mobile to re-use
the same key - but with a weak cipher, allowing the attacker to expose the
key.
Protecting GSM against downgrade attacks. Downgrade attacks involve
modification of information sent by the parties - specifically, the possible and/or
chosen ciphers. Hence, the standard method to defend against downgrade
attacks is to authenticate the exchange, or at least, the ciphersuite-related
indicators.
Note that this requires the parties to agree on the authentication mechanism, typically, a MAC scheme. It may be desirable to also negotiate the
[Figure 5.17 - message flow (Mobile, MitM, VN):]
Pre-analysis phase:
1. Mobile → MitM → VN: i, {A5/1, A5/2} (forwarded unchanged)
2. VN → MitM → Mobile: r
Eavesdrop phase:
3. Mobile → MitM: s (the MitM delays forwarding s to the VN)
4. MitM → Mobile: CIPHMODCMD: A5/2
5. Mobile → MitM: ECC(CIPHMODCOM, 1) ⊕ A5/2(Kc, 1)[1 : 114]
6. (no CIPHMODOK; the Mobile retransmits) Mobile → MitM: ECC(CIPHMODCOM, 2) ⊕ A5/2(Kc, 2)[1 : 114]
Cryptanalysis phase: MitM finds Kc from the collected A5/2 ciphertext
7. MitM → Mobile: ECC(CIPHMODOK, 2) ⊕ A5/2(Kc, 2)[115 : 228]
8. MitM → VN: s
9. VN → MitM: CIPHMODCMD: A5/1
10. MitM → VN: ECC(CIPHMODCOM, 1) ⊕ A5/1(Kc, 1)[1 : 114]
11. VN → MitM: ECC(CIPHMODOK, 1) ⊕ A5/1(Kc, 1)[115 : 228]
12. The MitM continues translating between A5/2 (towards the Mobile) and A5/1 (towards the VN), ...
Figure 5.17: A ‘real’ downgrade attack on GSM Key Exchange.
authentication mechanism. In such a case, the negotiation should be bounded
to a reasonable time, and the use of the authentication scheme and key limited
to a few messages, to foil downgrade attacks on the authentication mechanism.
Every authentication mechanism supported should be secure against this (weak)
attack.
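Authenticating the ciphersuite indicators can be sketched as follows, with HMAC-SHA256 as the MAC; this is our illustration of the standard defense, not a mechanism of GSM:

```python
import hmac, hashlib

def negotiation_tag(k: bytes, offered: list, chosen: str) -> bytes:
    # MAC over the offered cipher list and the chosen cipher; a MitM
    # who removes or changes a cipher cannot forge a matching tag
    # without the shared authentication key k.
    msg = ("|".join(offered) + "#" + chosen).encode()
    return hmac.new(k, msg, hashlib.sha256).digest()

k = bytes(16)  # authentication key shared by mobile and network
offered = ["A5/3", "A5/1", "A5/2"]
tag = negotiation_tag(k, offered, "A5/3")

# The mobile recomputes the tag over what it actually sent and received:
assert hmac.compare_digest(tag, negotiation_tag(k, offered, "A5/3"))
# A downgrade (e.g., the attacker removed A5/3 from the list) is detected:
assert not hmac.compare_digest(tag, negotiation_tag(k, ["A5/1", "A5/2"], "A5/1"))
```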
It is also necessary to avoid the use of the same key for different encryption schemes, as done in GSM (Fact 5.3), and exploited, e.g., by the attacks
of Figure 5.17 and Exercise 5.6. Using separate keys is quite easy, and does not
require any significant resources - it seems that there was no real justification
for this design choice in GSM, except for the fact that this allows the Home
Network to send just one key Kc , without knowing which cipher would be
selected by the mobile and Visited Network.
Deployed defense. The ‘real’ attack is not ‘real’ anymore - but it is prevented
in a rather ‘crude’ way: the GSM consortium abolished support for the insecure
A5/2. Note that downgrading between other versions, e.g., from A5/3 to A5/1,
is still possible.
Learning from GSM Key Exchange vulnerabilities. We have discussed
several significant failures of the GSM Key Exchange: the MitM downgrade
attack of this subsection, the VN-impersonation attack of Figure 5.14, the
use of ‘ECC-then-encrypt’ which allows a ciphertext-only attack on A5/1 or
A5/2 [26]. There are more, e.g., efficient known-plaintext attacks. What are
the root causes of these vulnerabilities and what can we learn, to avoid such
vulnerabilities?
We believe that many of these problems are due to the fact that the designers
violated several basic principles. Most notably, the GSM design violates the
Kerckhoffs’ principle (Principle 2): it relies on the use of ‘secret’ algorithms
such as A38, A5/1 and A5/2. The GSM design also did not undergo careful
public security analysis, and in particular, its attack model was never clearly
stated, violating Principle 1, clear attack model; and its design did not carefully
apply well-defined, standard cryptographic building blocks, violating Principle 3,
conservative design, and Principle 8, cryptographic building blocks.
5.7 Resiliency to key exposure: forward secrecy and recover security
One of the goals of deriving pseudorandom keys for each session was to reduce
the damage due to exposure of one or some of the session keys. A natural
question is, can we improve the resiliency to exposure? In particular, can a
Key Exchange protocol provide some security, even when an adversary may
sometimes expose also the master key, or, more generally, the entire state of
the parties? Notice that with all Key Exchange protocols we studied, exposure
of the master key, at any time, allows an adversary to easily expose all (past
and future) session keys.
One approach to this problem was already mentioned: place the master key
κ within a Hardware Security Module (HSM), so that it is assumed not to be
part of the state exposed to the attacker. However, often, the use of an HSM is
not a realistic, viable option. Furthermore, cryptographic keys may be exposed
even when using an HSM - by cryptanalysis or by some weakness of the HSM,
such as side-channels allowing (immediate or gradual/partial) exposure of keys.
In this section, we discuss a different approach to provide security with
resiliency to key exposures. This approach is to design the Key Exchange
protocol to ensure some security, even after the adversary obtains the master
key (or the contents of the entire storage). We mostly focus on two notions of
resiliency to key exposure: forward secrecy and recover security. We explain
these two notions and present Key Exchange protocols satisfying them. Both
of these notions can be achieved using shared-key only.
We also briefly discuss additional, even stronger notions of resiliency to key
exposures, mainly, an extension for each of the two notions: perfect forward
secrecy (PFS) and perfect recover security (PRS). For these stronger notions of
resiliency, it seems necessary to use public key cryptography, which we introduce
in Chapter 6. We present PFS and PRS protocols in Section 6.3; later, in
Chapter 7, we discuss how PFS is provided by the TLS protocol.
Terms for resilient security. Note that the notions of Recover Security and
Perfect Recover Security are not widely used in the literature. Also, the term
Forward Secrecy is not always used as we define it; e.g., often it is used to refer
to the notion commonly (and here) referred to as Perfect Forward Secrecy.
5.7.1 Forward Secrecy 2PP Key Exchange
We use the term forward secrecy to refer to Key Exchange protocols where
exposure of the entire storage of the communicating party in some future time
period, including every (master and session) key kept at that future time, would
not expose the keys used in previous time periods, or the plaintext encrypted
(and sent) during these previous periods. This should hold although all previous
communication could have been intercepted and recorded by the attacker.
To ensure forward secrecy, each period i would use a separate master key
$k_i^M$. For simplicity, we will map sessions to time periods, i.e., run the Key
Exchange protocol once at the beginning of every period. At the beginning of
period/session i, we must erase any previous master key (e.g., $k_{i-1}^M$). Definition
follows. Note that some authors refer to this notion as weak forward secrecy, to
emphasize the distinction from the stronger notion of perfect forward secrecy
(which we present later).
Definition 5.1 (Key Exchange with Forward Secrecy). A Key Exchange
protocol P ensures forward secrecy if, once session i has terminated, exposure of
the state of the entity will not compromise the confidentiality of information sent
by the entity or sent to the entity in session i.
We next discuss forward secrecy 2PP Key Exchange, a forward-secrecy
variant of the Key Exchange 2PP extension, which we discussed and presented
earlier, in subsection 5.4.1. The difference is that instead of using a single
master key k, received during initialization, the forward-secrecy Key Exchange
uses a sequence of master keys $k_0^M, k_1^M, \ldots$; for simplicity, assume that each
master key $k_i^M$ is used only for the i-th Key Exchange, with $k_0^M$ received during
initialization.
The key to achieving the forward secrecy property is to allow easy derivation
of the future master keys $k_{i+1}^M, \ldots$ from the current master key $k_i^M$, but
prevent the reverse, i.e., maintain the previous master keys $k_{i-1}^M, k_{i-2}^M, \ldots, k_0^M$
pseudorandom, even for an adversary who knows $k_i^M, k_{i+1}^M, \ldots$. A simple way
to achieve this is by using a PRF, namely:
$$k_i^M = PRF_{k_{i-1}^M}(0) \qquad (5.3)$$
The session key $k_i^S$ for the i-th session can be derived using the corresponding
minor change to Equation 5.2, shown in Eq. (5.4) below.
[Figure 5.18 - message flow (Alice, Bob):]
1. Alice → Bob: A, $N_{A,i}$
2. Bob → Alice: $N_{A,i}, N_{B,i}, PRF_{k_i^M}(2 ++ \text{'A} \leftarrow \text{B'} ++ N_{A,i} ++ N_{B,i})$
3. Alice → Bob: $N_{B,i}, PRF_{k_i^M}(3 ++ \text{'A} \rightarrow \text{B'} ++ N_{A,i} ++ N_{B,i})$
Both parties compute: $k_i^M = PRF_{k_{i-1}^M}(0)$ and $k_i^S = PRF_{k_i^M}(N_{A,i} ++ N_{B,i})$.
Figure 5.18: The Forward-Secrecy 2PP Key Exchange protocol. This protocol
is similar to the 2PP Key Exchange protocol (Figure 5.11). The main difference
is that this protocol uses a different master key $k_i^M$ for each period i; the initial
master key, shared by the two parties, is $k_0^M$.
[Figure 5.19 - keys per period:]
Period 1 (secure): $k_1^M = PRF_{k_0^M}(0)$; $k_1^S = PRF_{k_1^M}(N_{A,1} ++ N_{B,1})$
Period 2 (exposed): $k_2^M = PRF_{k_1^M}(0)$; $k_2^S = PRF_{k_2^M}(N_{A,2} ++ N_{B,2})$
Period 3 (remains insecure): $k_3^M = PRF_{k_2^M}(0)$; $k_3^S = \ldots$
Figure 5.19: Result of running the Forward-Secrecy 2PP Key Exchange for
three periods, with the keys exposed in the second period. Periods prior to the
exposure (in this example, only the first period) remain secure even after the
period where keys are exposed. Periods from the exposure onward are insecure.
$$k_i^S = PRF_{k_i^M}(N_A ++ N_B) \qquad (5.4)$$
The resulting Forward-secrecy 2PP Key Exchange protocol is illustrated in
Figure 5.18. The use of NA and NB in Eq. (5.4) is not really necessary, since
each master key is used only for a single Key Exchange.
The Forward-Secrecy 2PP Key Exchange protocol ensures that the communication
in any period that completed before any key exposure remains secure,
regardless of key exposures in later periods. See Figure 5.19.
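The derivations of Equations (5.3) and (5.4) can be sketched as follows, with HMAC-SHA256 standing in for the PRF (an assumption; any secure PRF would do):

```python
import hmac, hashlib

def prf(key: bytes, msg: bytes) -> bytes:
    # HMAC-SHA256 used as a stand-in PRF.
    return hmac.new(key, msg, hashlib.sha256).digest()

def next_master_key(k_prev: bytes) -> bytes:
    # Equation (5.3): k_i^M = PRF_{k_{i-1}^M}(0); after this, the old
    # master key must be erased, so exposure of k_i^M does not allow
    # going backwards to k_{i-1}^M.
    return prf(k_prev, b"\x00")

def session_key(k_master: bytes, n_a: bytes, n_b: bytes) -> bytes:
    # Equation (5.4): k_i^S = PRF_{k_i^M}(N_A ++ N_B)
    return prf(k_master, n_a + n_b)

k0 = bytes(32)            # stand-in for the initial shared master key k_0^M
k1 = next_master_key(k0)  # master key for period 1
k2 = next_master_key(k1)  # master key for period 2
assert k1 != k2
assert session_key(k1, b"NA", b"NB") != session_key(k2, b"NA", b"NB")
# Forward derivation is easy; recovering k1 from k2 requires inverting the PRF.
```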
5.7.2 Recover-Security Key Exchange Protocol
We use the term recover security to refer to key setup protocols where a single
session without eavesdropping or other attacks suffices to recover security from
previous key exposures. Definition follows.
Definition 5.2 (Recover security Key Exchange). A Key Exchange protocol
recovers security if session i is secure, when either of the following holds:
[Figure 5.20 - keys per period:]
Period 1, secure (MitM attacker): $k_1^M = PRF_{k_0^M}(N_{A,1} \oplus N_{B,1})$; $k_1^S = PRF_{k_1^M}(N_{A,1} ++ N_{B,1})$
Period 2, $k_1^M$ or $k_2^M$ exposed: $k_2^M = PRF_{k_1^M}(N_{A,2} \oplus N_{B,2})$; $k_2^S = PRF_{k_2^M}(N_{A,2} ++ N_{B,2})$
Period 3, remains insecure (if attacker eavesdrops): $k_3^M = PRF_{k_2^M}(N_{A,3} \oplus N_{B,3})$; $k_3^S = PRF_{k_3^M}(N_{A,3} ++ N_{B,3})$
Period 4, recover security (if no attack): $k_4^M = PRF_{k_3^M}(N_{A,4} \oplus N_{B,4})$; $k_4^S = PRF_{k_4^M}(N_{A,4} ++ N_{B,4})$
Period 5, remains secure (MitM attacker): $k_5^M = PRF_{k_4^M}(N_{A,5} \oplus N_{B,5})$; $k_5^S = \ldots$
Figure 5.20: Example of running the recover-security Key Exchange protocol for
five periods, with the keys exposed in the second period, and no attack (not even
eavesdropping) in the fourth period, allowing recovery of security. Periods prior
to the exposure (in this example, only the first period) remain secure even after
the period where keys are exposed. Periods after exposure remain insecure,
until a 'recovery period' (in this example, period 4) where there is no attack.
Following the recovery period, security is maintained until the next exposure.
No attack: during session i, there was no exposure, and all messages were
delivered correctly, without eavesdropping, injection or modification.
Preserve security: during session i there was no exposure, and the previous
session (i − 1) was secure.
The forward-secrecy 2PP Key Exchange protocol (Figure 5.18) ensures forward
secrecy - but not recover security. This is since the attacker can use one
exposed master key, say $k_j^M$, to derive all the following master keys, including
$k_i^M$ for i > j, using Equation 5.3; in particular, $k_{j+1}^M = PRF_{k_j^M}(0)$.
However, a simple extension suffices to ensure recover security, as well as
forward secrecy. The extension is simply to use the random values exchanged
in each session, i.e., NA,i , NB,i , in the derivation of the next master key, i.e.:
$$k_i^M = PRF_{k_{i-1}^M}(N_{A,i} \oplus N_{B,i}) \qquad (5.5)$$
We call this protocol the recover-security Key Exchange protocol, and illustrate
its operation in Figure 5.20.
The new master key is computed from three values - the previous master key
$k_{i-1}^M$ and the nonces $N_{A,i}, N_{B,i}$ - and remains secret as long as at least one
of these three values is secret. Since the recover security requirement assumes
at least one session where the attacker does not eavesdrop or otherwise interfere
with the communication, in that session both $N_{A,i}$ and $N_{B,i}$ are secret, hence
the new master key $k_i^M$ is secret. Indeed, we could have used just one of $N_{A,i}$
and $N_{B,i}$; by XOR-ing both of them, we ensure secrecy of the master key even
if the attacker is able to capture one of the two flows, i.e., even stronger security.
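The derivation of Equation (5.5) can be sketched similarly, again with HMAC-SHA256 as a stand-in PRF; the example illustrates why a single un-eavesdropped session recovers security:

```python
import hmac, hashlib, os

def prf(key: bytes, msg: bytes) -> bytes:
    # HMAC-SHA256 used as a stand-in PRF.
    return hmac.new(key, msg, hashlib.sha256).digest()

def next_master_key(k_prev: bytes, n_a: bytes, n_b: bytes) -> bytes:
    # Equation (5.5): k_i^M = PRF_{k_{i-1}^M}(N_{A,i} XOR N_{B,i})
    x = bytes(a ^ b for a, b in zip(n_a, n_b))
    return prf(k_prev, x)

# Suppose the attacker exposed k_1^M, and eavesdrops on session 2,
# so it also learns the nonces and hence k_2^M:
k1 = os.urandom(32)
na2, nb2 = os.urandom(16), os.urandom(16)
k2 = next_master_key(k1, na2, nb2)

# Session 3: no eavesdropping, so the nonces stay secret; k_3^M is
# then unpredictable to the attacker, even though it knows k_2^M.
na3, nb3 = os.urandom(16), os.urandom(16)
k3 = next_master_key(k2, na3, nb3)
guess = next_master_key(k2, os.urandom(16), os.urandom(16))  # attacker's guess
assert guess != k3  # fails only with negligible probability
```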
Two notes are in order. The first note is that the protocol is fragile, in the
sense that an attacker who sends a corrupted nonce value to one (or both) parties
in a given period, can prevent recovery of (secure) communication in future
rounds. This can be improved with some additional protocol complexity; we
leave it as a challenge to the interested reader.
The second note is that the Recover-Security Key Exchange protocol requires
the parties to have a source of true randomness, which is called a True Random
Bit Generator (TRBG), i.e., a source which produces random bits even if the
party is broken into (and keys exposed). In reality, many systems rely only on
pseudo-random generators (PRGs), or pseudo-random functions (PRFs), whose
future values are computable using a past value or using an exposed key. In such
a case, it becomes critical to also use the input from the peer ($N_{A,i}$ or $N_{B,i}$), and
these values should also be used to re-initialize the PRG, so that new nonces
($N_{A,i}$, $N_{B,i}$) are pseudorandom (or truly random) and not predictable. Truly
random bit generators require an appropriate hardware device, relying on
physical properties such as thermal noise and quantum phenomena [360].
5.7.3 Stronger notions of resiliency to key exposure
Forward secrecy and recover security significantly improve the resiliency against
key exposure. There are additional and even stronger notions of resiliency to
key exposure, which are provided by more advanced Key Exchange protocols;
we only cover a few of these in this textbook - specifically, the ones in Table 5.3.
All known protocols that achieve more advanced notions of resiliency use
public key cryptography, and in particular, key-exchange protocols such as the
Diffie-Hellman (DH) protocol. Indeed, it seems plausible that public-key
cryptography is necessary for many of these notions. This includes the important
notions of Perfect Forward Secrecy (PFS) and Perfect Recover Security (PRS),
which, as the names imply, are stronger variants of forward and recover security,
respectively. We now briefly discuss PFS and PRS, to understand their
advantages and why they require more than the protocols we have seen in this
chapter; we discuss these notions further, with implementations, in Section 6.3.
Perfect Forward Secrecy (PFS). PFS, like Forward Secrecy, also requires
resiliency to exposures of state, including keys, occurring in the future. However,
on top of that, PFS also requires resiliency to exposure of the previous state,
again including keys, as long as this exposure occurs only after the session ends.
We next define this notion. The term PFS was apparently first coined by Günther [178];
unfortunately, the term is not always used with a consistent meaning and
definition, but the following definition seems to capture the meaning usually
used by experts.
Definition 5.3 (Perfect Forward Secrecy (PFS)). A Key Exchange protocol P
ensures perfect forward secrecy (PFS) if data sent during session i is confidential
(indistinguishable), provided that either (1) there is no MitM attack during
session i, or (2) the master key of session i, and of any previous session, is not
given to the adversary - or is given only after session i.
We discuss some PFS Key Exchange protocols in the next chapter, which
deals with asymmetric cryptography (also called public-key cryptography, PKC).
All known PFS protocols are based on PKC.
Notion | Session i is secure, when: | Crypto
Secure key-setup | Attacker is given session keys of other sessions, but master key is never exposed. | Shared key
Forward Secrecy (FS) | Attacker is given all keys, but only of sessions after session i. | Shared key
Perfect Forward Secrecy (PFS) | Attacker is given all keys of all sessions except i, but only after session i ends. | Public key
Recover Security (RS) | Attacker is given keys of other sessions, but session i − 1 is secure, or no eavesdropping/MitM during session i. | Shared key
Perfect Recover Security (PRS) | Attacker is given keys of other sessions, but session i − 1 is secure, or no MitM during session i. | Public key
Table 5.3: Notions of resiliency to key exposures of key-setup Key Exchange
protocols. See implementations of forward and recover security in subsection 5.7.1
and subsection 5.7.2, respectively, and of the corresponding 'perfect' notions
(PFS and PRS) in subsection 6.3.1 and subsection 6.3.2, respectively.
Exercise 5.7 (Forward Secrecy vs. Perfect Forward Secrecy (PFS)). Present a
sequence diagram, showing that the forward-secrecy 2PP Key Exchange protocol
presented in subsection 5.7.1, does not ensure Perfect Forward Secrecy (PFS).
Perfect Recover Security (PRS). We introduce the term perfect recover
security to refer to Key Exchange protocols where a single session without
exposure or MitM attacks suffices to recover security from previous key exposures.
Definition follows.
Definition 5.4 (Perfect Recover Security (PRS) Key Exchange). A Key Exchange protocol ensures Perfect Recover Security (PRS), if security (confidentiality and authentication) is ensured for messages exchanged during session i,
provided that there is no exposure during session i and either (1) session i − 1 is
secure, or (2) there is no MitM attack during session i (session i is a recovery
session).
Note the similarity to PFS, in allowing only eavesdropping during the
‘recovery’ session i. Similarly to PFS, we also discuss some PRS Key Exchange
protocols in the next chapter, which deals with asymmetric cryptography.
Known PRS protocols are all based on asymmetric cryptography.
Figure 5.21: Relations between notions of resiliency to key exposures. An
arrow from notion A to notion B indicates that notion A implies notion B. For
example, a protocol that ensures Perfect Forward Secrecy (PFS) also ensures
Forward Secrecy.
Comparison of the four notions of resiliency. We compare the four
notions of resiliency (forward secrecy, PFS, recover security and PRS) in Table 5.3, along with ‘regular’ secure Key Exchange protocols. We also present
the relationships between the five notions in Figure 5.21.
Additional notions of resiliency. The research in cryptographic protocols
includes additional notions of resiliency to key and state exposures, which we
do not cover in this textbook. These include threshold security [117], which
ensures that the entire system remains secure even if some (up to a threshold) of
its modules are exposed or corrupted; proactive security [88], which deals with
recovery of security of some modules after exposures, and leakage-resiliency [138],
which ensures resiliency to gradual leakage of parts of the storage.
5.8 Additional Exercises
Exercise 5.8 (Attack against SNA with 'fixed roles'). Show that the SNA
handshake protocol does not ensure concurrent mutual authentication, even in
a scenario where each party is only willing to act in one role, i.e., either as an
initiator or as a responder, but not as both.
Hint: as in Figure 5.4, the attack will involve two sessions; but it is not
required that both sessions terminate correctly - one of them may fail.
Exercise 5.9. Some applications require only one party (e.g., a door) to
authenticate the other party (e.g., Alice); this allows a somewhat simpler protocol.
We describe in the two items below two proposed protocols for this task (one
in each item), both using a key k shared between the door and Alice, and a
secure symmetric-key encryption scheme (E, D). Analyze the security of the
two protocols.
1. The door selects a random string (nonce) n and sends Ek (n) to Alice;
Alice decrypts it and sends back n.
2. The door selects and sends n; Alice computes and sends back Ek (n).
Repeat the question when E is a block cipher rather than an encryption
scheme.
Exercise 5.10. Consider the following mutual-authentication protocol, using
shared key k and a (secure) block cipher (E, D):
1. Alice sends NA to Bob.
2. Bob replies with NB , Ek (NA ).
3. Alice completes the handshake by sending Ek (NB ⊕ Ek (NA )).
Show an attack against this protocol, and identify the design principles that
the protocol violates and that, if followed, would have prevented such attacks.
Exercise 5.11 (GSM). In this exercise we study some of the weaknesses of the
GSM handshake protocol, as described in Section 5.6. We ignore
the existence of multiple types of encryption and their choice (‘ciphersuite’).
1. In this exercise, and as usual, we ignore the fact that the functions A8, A3
and the ciphers Ei were kept secret; explain why.
2. Present functions A3, A8 such that the protocol is insecure when using
them, against an eavesdropping-only adversary.
3. Present functions A3, A8 that ensure security against MitM adversary,
assuming E is a secure encryption. Prove (or at least argue) for security.
(Here and later, you may assume a given secure PRF function, f .)
4. To refer to the triplet of a specific connection, say the j-th connection, we
use the notation (r(j), sres(j), k(j)). Assume that during connection
j′ the attacker received the key k(ĵ) of a previous connection ĵ < j′. Show how
a MitM attacker can use this to expose, not only messages sent during
connection ĵ, but also messages sent in future connections (after j ′ ) of
this mobile.
5. Present a possible fix to the protocol, as simple and efficient as possible,
to prevent exposure of messages sent in future connections (after j ′ ). The
fix should only involve changes to the mobile and the Visited Network, not
to the home.
Exercise 5.12 (Downgrade to A5/1 attack on GSM). Consider a mobile client
and a visited network that both support A5/3 (or some other strong stream
cipher). Present a sequence diagram showing how a MitM attacker can cause
them to use the (weaker) A5/1 protocol.
Exercise 5.13. Fig. 5.22 illustrates a simplification of the SSL/TLS session-security protocol; this simplification uses a fixed master key k which is shared
in advance between the two participants, Client and Server. This simplified
version supports transmission of only two messages, a ‘request’ MC sent by
the client to the server, and a ‘response’ MS sent from the server. The two
messages are protected using a session key k ′ , which the server selects randomly
at the beginning of each session, and sends to the client, protected using the
fixed shared master key k.
Figure 5.22: Simplified SSL
The protocol should protect the confidentiality and integrity (authenticity)
of the messages (MC , MS ), and should also prevent ‘replay’ of messages, e.g.,
where the client sends MC in one session but the server receives MC in two sessions.
1. The field cipher_suite contains a list of encryption schemes (‘ciphers’)
supported by the client, and the field chosen_cipher contains the cipher
in this list chosen by the server; this cipher is used in the two subsequent
messages (a fixed cipher is used for the first two messages). For simplicity
consider only two ciphers, say E1 and E2, and suppose that both client
and server support both, but that they prefer E2 since E1 is known to be
vulnerable. Show how a MitM attacker can cause the parties to use E1
anyway, allowing it to decipher the messages MC , MS .
2. Suggest a minor modification to the protocol to prevent such ‘downgrade
attacks’.
3. Ignore now the risk of downgrade attacks, e.g., assume all ciphers supported
are secure. Assume that MC is a request to transfer funds from the client's
account to a target account, in the following format:
Date (3 bytes) | Operation type (1 byte) | Comment (20 bytes) | Amount (8 bytes) | Target account (8 bytes)
Assume that E is CBC mode encryption using an 8-byte block cipher.
The solution should not rely on replay of the messages (which will not
work since only one message is sent in each direction on each usage).
Mal is a (malicious) client of the bank, who eavesdrops on a session where
Alice is sending a request to transfer $10 to him (Mal). Show how Mal can
abuse his Man-in-the-Middle abilities to cause the transfer of a larger amount.
Explain a simple fix to the protocol to prevent this attack.
Exercise 5.14. Consider the following protocol for server-assisted group-shared-key setup. Every client, say i, shares a key ki with the server. Let G be a
group of users; user i ∈ G can send, at any time, a request to the server for the
(fixed) group key kG ; the request consists of the list of users in G, the time t (in
seconds) according to the clock of user i, and an authenticator M ACki (G, t).
If the value of t is within one minute from its own clock value, the server
responds by sending to i the encrypted key: xG(t) = kG + Πj∈G PRFkj(t), where
Πj∈G PRFkj(t) is the product of the values PRFkj(t) for every user j in G,
including j = i. User i then computes kG(i) = xG(t) mod PRFki(t).
1. Draw a sequence diagram showing the operation of the protocol.
2. Let i, j be two users in group G, i.e., i, j ∈ G. Suppose user i sends request
at time ti and user j sends request at time tj . Explain the conditions for
them to receive the same key kG .
3. Present an attack allowing a malicious user m ̸∈ G to learn the key kG
for a group it does not belong to. User m may eavesdrop on all messages,
and request the key kG′ for any group (set) of users G′ s.t. m ∈ G′ .
Exercise 5.15. In the GSM protocol, the home sends to the Visited Network
one or more authentication triplets (r, K, s). The Visited Network and the
mobile are to use each triplet only for a single handshake; this is somewhat
wasteful, as often the mobile has multiple connections (and handshakes) while
visiting the same Visited Network.
1. Suppose a Visited Network decides to re-use the same triplet (r, K, s)
in multiple handshakes, for efficiency (fewer requests to the home). Present
a message sequence diagram showing that this may allow an attacker to
impersonate a client; namely, that client authentication fails.
2. Suggest an improvement to the messages sent between mobile and Visited
Network, that will allow the Visited Network to reuse the (r, K, s) triplet
received from the home, for multiple secure handshakes with the
mobile. Your improvement should consist of a single additional challenge
rB which the Visited Network selects randomly and sends to the mobile,
together with the challenge r received in the triplet from the home; and
a single response sB which the mobile returns to the server, instead of
sending the response s as in the original protocol. Show the computation
of sB by mobile and Visited Network: sB = ________. Your
solution may use an arbitrary pseudo-random function PRF.
3. GSM sends frames (messages) of 114 bits each, by bit-wise XORing the
n-th plaintext frame with 114 bits of output from A5/i_K(n). Here, A5/i, for
i = 1, 2, . . ., is a cryptographic function, n is the frame number, and K
is the key received from the home. A5/1 and A5/2 are described in the
specifications, and both are known to be insecure; other functions can
be agreed between mobile and Visited Network. For this question,
assume the use of a secure cipher, say A5/5.
Suppose, again, that a Visited Network decides to re-use the same triplet
(r, K, s) in multiple handshakes. A mobile has two connections to the
Visited Network, sending message m1 in the first connection and message
m2 in the second connection. Assume that the Visited Network re-uses
the same triplet (r, K, s) in both connections, and that the attacker knows
the contents of m1 . Show how the attacker can find m2 .
Note: the improvement suggested in the previous item (rB , sB ) does not
have a significant impact on this item - you can solve it with or without that improvement.
4. To prevent the threat presented in the previous item, the mobile and Visited
Network can use a different key K′ = ________ (instead of using K).
5. Design a Visited-Network-only forward secrecy improvement to A5/5.
Namely, even if the attacker is given access to the entire memory of the Visited
Network after the j-th handshake using the same r, the attacker would still
not be able to decipher information exchanged in past connections. Your
design may send the value of j together with r from Visited Network to
mobile, and may change the stored value of s at the end of every handshake;
let sj denote the value of s at the j-th handshake, where the initial value is
s received from the home (i.e., s1 = s). Your solution consists of defining
the value of sj given sj−1, namely: sj = ________.
Exercise 5.16 (GSM). Many GSM mobile phones use an encryption algorithm
referred to as A5/3, when supported by the visited network, since it is considered
more secure than A5/1 (and certainly more than A5/2, which was discontinued). The MAL organization records millions of A5/3 encrypted connections by
different people ‘of interest’. MAL cryptanalysts find an effective attack against
the GSM A5/1 algorithm; the attack exposes the key in few minutes, requiring
only one ciphertext message. Suppose now Alice tries to communicate using
her mobile and the GSM protocol, and the connection setup is intercepted by
MAL. Present a sequence diagram showing how MAL may use this intercepted
connection attempt, and the attack found against A5/1, to decrypt prior GSM
communication by Alice, which was encrypted using A5/3.
Exercise 5.17. Consider the following key establishment protocol between any
two users, with the assistance of a server S, where each user U shares a secret
key KUS with the central server S.
A → B : (A, NA)
B → S : (A, NA, B, NB, EKBS(A ++ NA))
S → A : (A, NA, B, NB, EKAS(NA ++ sk), EKBS(A ++ sk), NB)
A → B : (A, NA, B, NB, EKBS(A ++ sk))
Assume that E is an authenticated encryption. Show an attack which allows an
attacker to impersonate one of the parties to the other, while exposing the secret
key sk.
Exercise 5.18 (Hashing vs. Forward Secrecy). We discussed in §5.7.1 the
use of PRG or PRF to derive future keys, ensuring Forward Secrecy. Could
a cryptographic hash function be securely used for the same purpose, as in
κi = h(κi−1)? Evaluate whether such a design is guaranteed to be secure when h is a
(1) CRHF, (2) OWF, (3) bitwise-randomness extracting.
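For concreteness, the key-evolution mechanism asked about here can be sketched as follows; this is only an illustration of the mechanics of κi = h(κi−1), with SHA-256 standing in for h, and says nothing about which properties of h suffice - that is the question posed by the exercise.

```python
from hashlib import sha256

def evolve(kappa: bytes) -> bytes:
    # kappa_i = h(kappa_{i-1}); the previous key is then erased.
    return sha256(kappa).digest()

# Derive three successive period keys from an initial key kappa_0.
kappa = b"initial key kappa_0 (32 random bytes in practice)"
period_keys = []
for _ in range(3):
    kappa = evolve(kappa)
    period_keys.append(kappa)
# A device exposed at period i holds only kappa_i; recovering an earlier
# kappa_{i-1} requires inverting h on its output.
```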
Exercise 5.19 (PFS definitions). Below are informal definitions of PFS from
the literature. Compare them to our definition of PFS: are they equivalent?
Are they ‘weaker’, i.e., may a protocol satisfy them yet not be PFS as we define it -
or the other way around? Or are they incomparable (neither is always weaker)?
Can you give an absurd example of a protocol meeting the definition which is
‘clearly’ not sensible to claim as PFS? Any other issues?
From Wikipedia, [390] An encryption system has the property of forward
secrecy if plain-text (decrypted) inspection of the data exchange that occurs
during key agreement phase of session initiation does not reveal the key
that was used to encrypt the remainder of the session.
From [279, 309] A protocol has Perfect Forward Secrecy (PFS) if the compromise of long-term keys does not allow an attacker to obtain past session
keys.
Chapter 6
Public Key Cryptography
As we discussed in subsection 1.6.1, cryptography has been applied for over
two millennia. However, until relatively recently, cryptography was always
based on the use of symmetric keys. In particular, in Chapter 2, we studied
symmetric cryptosystems, also called shared-key cryptosystems, which use the
same key k for encryption (c ← Ek(m)) and for decryption (m ← Dk(c)); see
Figure 1.4. Similarly, in Chapter 4, we focused on shared-key (symmetric)
Message Authentication Code (MAC) schemes, which also use only one key k:
to compute the authenticator (tag) sent with a message, proving its
authenticity; and later to recompute the authenticator for a received message
and confirm that it is identical to the one received with the message.
This was changed quite dramatically by the publication, in 1976, of [123], a
seminal paper by Diffie and Hellman introducing public key cryptography, also
known as asymmetric cryptography. Asymmetric cryptography is built on the
idea that we may use different keys for different functions, e.g., for encryption
and for decryption. Of course, the keys may be related; for example, if we use a
key e to encrypt, and a key d to decrypt, the pair (e, d) should be related to
properly retrieve the plaintext: m = Dd (Ee (m)).
The advantage in asymmetric cryptography is that, for many applications,
one key can be public, and only the other kept private. For example, Alice can
publish her encryption key A.e, allowing everyone to encrypt messages to her
by computing c = EA.e (m), but only Alice knows the corresponding decryption
key A.d such that m = DA.d (c). We refer to such asymmetric cryptosystems as
Public Key Cryptosystem (PKC); see illustration in Figure 1.5.
In [123], Diffie and Hellman identified three types of public-key schemes:
public-key cryptosystems, digital signatures and key exchange. They also presented a design, but only for the DH key exchange protocol. This discovery of the
revolutionary concept of asymmetric cryptography is recreated in Figure 6.1.[1]
In this chapter, we introduce public key cryptography. We begin, in the
following section, with a brief introduction to public key cryptography. We
[1] Thanks to Whit and Marty for blessing this invented dialog.
Figure 6.1: The discovery of Public-Key Cryptography by Whitfield Diffie and
Martin Hellman.
then discuss key-exchange protocols, and later also public-key cryptosystems,
mainly El-Gamal and RSA. We already discussed signature schemes and
their security in subsection 1.5.1, but in subsection 6.6.1 we discuss the specific
case of RSA-based public key signatures.
6.1
Introduction to PKC
The basic observation leading to asymmetric cryptography is quite simple
in hindsight: security requirements are asymmetric. For example, to protect
confidentiality, an encryption scheme should prevent an attacker from decrypting ciphertext, requiring the key used for decryption to be secret. However,
confidentiality is not broken if the attacker can encrypt messages. In fact, in
Definition 2.9 of security against chosen plaintext attack (CPA), we allow the
adversary to encrypt plaintexts without any restriction; in contrast, we do not
facilitate decrypting ciphertexts. Therefore, if for a given cryptosystem, the
encryption key does not allow decryption, and, in particular, does not expose
the decryption key, then security under Definition 2.9 would imply security
under a similar definition where the adversary is given the encryption key.
We first discuss the three basic types of public key schemes introduced
in [123], which are still the most important types of public key schemes: public
key cryptosystem (PKC), digital signature schemes, and key exchange protocols.
6.1.1
Public key cryptosystems
Public key cryptosystems (PKC) are encryption schemes consisting of three
algorithms, (KG, E, D), which use a pair of keys: a public key e for encryption,
and a private key d for decryption. Both keys are generated by the key generation
algorithm KG. The encryption key is not secret; namely, we assume that it is
known to the attacker.
Let us now define a public key cryptosystem, similarly to Definition 2.1 for
shared-key cryptosystems. As in Definition 2.1, we require correctness, i.e., that
decryption of an encrypted message will recover that message. One notable
difference is that a public key cryptosystem includes a key generation algorithm
KG, since the encryption and decryption keys must be related - we cannot just
choose them at random.
Definition 6.1 (Public-key cryptosystem (PKC)). A public-key cryptosystem
(PKC) is a triplet of (probabilistic) algorithms, (KG, E, D), and a set M (of
plaintext messages), ensuring correctness, i.e., for every message m ∈ M and
key-pair (e, d) ←$ KG(1^l) holds:

Dd(Ee(m)) = m    (6.1)
See the illustration of a public key cryptosystem in Figure 1.5.
Other terms for public key cryptosystems include asymmetric cryptosystems
and public key encryption schemes. We will try to stick to the term ‘public key
cryptosystems’, often using just the acronym PKC.
We further discuss public key cryptosystems in sections 6.4.2 and 6.5.
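To make the (KG, E, D) interface and the correctness requirement (6.1) concrete, here is a minimal sketch using textbook RSA with tiny, hardcoded primes; these hypothetical parameters are hopelessly insecure, and real PKC encryption also requires randomized padding - this only illustrates the interface, ahead of the proper treatment of RSA later in this chapter.

```python
from math import gcd

def KG(p: int = 1009, q: int = 1013):
    # Toy RSA key generation with fixed small primes (illustration only).
    n = p * q
    phi = (p - 1) * (q - 1)
    e = 65537
    while gcd(e, phi) != 1:   # ensure e is invertible modulo phi
        e += 2
    d = pow(e, -1, phi)       # modular inverse (Python 3.8+)
    return (e, n), (d, n)     # (public encryption key, private decryption key)

def E(pub, m: int) -> int:
    e, n = pub
    return pow(m, e, n)

def D(priv, c: int) -> int:
    d, n = priv
    return pow(c, d, n)

pub, priv = KG()
m = 123456
assert D(priv, E(pub, m)) == m   # correctness: Dd(Ee(m)) = m, Equation (6.1)
```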
6.1.2
Signature schemes
We introduced signature schemes in subsection 1.2.3; in particular, see subsection 1.5.1 and the illustration in Figure 1.7. Let us quickly recall them here.
Signature schemes consist of three algorithms, (KG, S, V ), for Key Generation, Signing and Verifying, respectively. Key Generation (KG) is a randomized
algorithm, and it outputs a pair of correlated keys: a private signing key s for
‘signing’ a message, and a public validation key v for validating a given signature
for a given message. The validation key v is not secret: it should only allow
validation of authenticity, and should not facilitate signing.
Both signature schemes and Message Authentication Code (MAC) functions
are used for authentication of messages, which is based on the unforgeability
requirement (subsection 1.5.1). The difference is that MAC functions use a single
secret key k for authenticating and for validating messages, while signature
schemes use a distinct private signing key s for signing (authenticating), and a
distinct public verification key v for validating authenticity.
The correctness requirement of signature schemes is also similar to the one
for MAC schemes (Section 4.3): namely, for security parameter 1^l, message m,
and key-pair (s, v) ←$ KG(1^l) holds:

Vv(m, Ss(m)) = True    (6.2)
We presented constructions for one-time signatures in subsection 3.4.2. In
Section 6.6, we discuss constructions of ‘regular’ signature schemes, i.e., schemes
which may be used to sign an arbitrary number of messages.
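As a concrete instance of the (KG, S, V) interface and of correctness (6.2), here is a sketch of a Lamport one-time signature (one classical construction in the spirit of the one-time schemes mentioned above), signing the 256-bit hash of the message; it is safe for signing one message only.

```python
import os
from hashlib import sha256

def KG():
    # Private key: 256 pairs of random strings; public key: their hashes.
    s = [[os.urandom(32), os.urandom(32)] for _ in range(256)]
    v = [[sha256(x).digest() for x in pair] for pair in s]
    return s, v

def _bits(m: bytes):
    h = sha256(m).digest()
    return [(h[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]

def S(s, m: bytes):
    # Reveal one preimage per bit of the message hash.
    return [s[i][b] for i, b in enumerate(_bits(m))]

def V(v, m: bytes, sig) -> bool:
    return all(sha256(sig[i]).digest() == v[i][b]
               for i, b in enumerate(_bits(m)))

s, v = KG()
assert V(v, b"pay Bob 10", S(s, b"pay Bob 10"))  # correctness, Equation (6.2)
```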
6.1.3
Public-Key-based Key Exchange Protocols
Public-key based Key Exchange protocols establish a shared secret key among
two (or more) parties, based on the use of public keys. Such protocols may be
authenticated, using keys shared in advance between the parties, or unauthenticated, not using any pre-shared keys. In this chapter, we focus on different
variants of the Diffie-Hellman (DH) key-exchange protocol. We first discuss
the unauthenticated Diffie-Hellman (DH) key-exchange protocol, in subsection 6.2.3, and later, in Section 6.3, discuss authenticated Diffie-Hellman (DH)
key-exchange protocols.
Like other key-exchange protocols, the final output is a shared key, known to
both (or all) participants; the goal is that the key would be completely hidden
from the attacker. Public-key-based key exchange protocols are much more
computationally-demanding than the shared-key key-exchange protocols we
discussed in Chapter 5, and therefore, they are typically used only periodically,
and the shared-key they output is usually referred to as a master key, and
used later to derive shared session keys, possibly using (efficient) shared-key
key-exchange protocols, as discussed in Section 5.4. Or, a combined protocol
can be used both to share the master key and then to share session-keys based
on the master key; in particular, this is done by the widely-used TLS protocol,
which we study in Chapter 7.
The basic operation and goals of unauthenticated public-key-based key
exchange protocols are illustrated in Figure 6.2, focusing on the case where the
protocol involves only two flows: the first from Alice to Bob, and the second
from Bob to Alice. To motivate the unauthenticated scenario, suppose that Alice
and Bob meet in public, and want to establish secure communication between
them; however, it being a public place, their discussion may be overheard,
e.g., by Eavesdropping Eve. The key exchange protocol would allow them to
establish a shared secret key, known only to the two of them, in spite of the
possible eavesdropping.
Notice that during the run of the protocol, we allow the adversary (Eavesdropping Eve) only to eavesdrop on the communication between the parties;
such an adversary cannot modify or inject messages between the parties.
Defining two-flow unauthenticated public-key-based key exchange
protocols. Let us now define, informally, an unauthenticated public-key-based
key exchange protocol; for simplicity, let us focus on protocols using two flows, as
in Figure 6.2. Such a key exchange protocol can be defined by a pair of efficient
probabilistic algorithms, (KG, KC) (for key-generation and key-combining,
respectively). The key-generation algorithm is given the security parameter, in
unary, 1^l, and outputs a pair of strings, e.g., (a, PA); we refer to the first (a) as
the private key and to the second (PA) as the public key. The security parameter
can be the same as, or related to, the effective key length of the protocol; the actual
length of public keys is typically significantly longer, see subsection 6.1.5.
The key generation function KG receives as input a security parameter 1^l,
and outputs a pair of keys, e.g., (PA , a), where PA is a public key and a is a
[Figure 6.2 here. Depicted flow: Alice computes (PA, a) ←$ KG(1^l) and Bob
computes (PB, b) ←$ KG(1^l); they exchange PA and PB while Eavesdropping
Eve listens; Alice computes kA,B ← KC(a, PB) and Bob computes kB,A ←
KC(b, PA). Goals: Indistinguishability (Eve cannot distinguish kA,B from
random) and Correctness (both parties get the same key, i.e., kA,B = kB,A).]
Figure 6.2: Operation of an arbitrary two-flow unauthenticated public-key
based key exchange protocol, such as the Diffie-Hellman protocol (presented
later). Such a protocol is defined by two efficient probabilistic algorithms: Key
Generation (KG), to generate (private, public) key-pairs, and Key Combining
(KC), to combine a party's own private key with the exchanged public key of
its peer into the shared key: kA,B = KC(a, PB) = KC(b, PA) = kB,A. An
unauthenticated key exchange protocol should be secure against an eavesdropping
adversary Eve, but is vulnerable to a Man-in-the-Middle adversary.
private key. The key-combining function KC is run by each party, and receives
as input a public value (from the other party) and a private value (of the party
running KC); the output of KC is used as the key shared between the
two parties. The keys derived by Alice and Bob should be the same, i.e.:
KC(a, PB) = KC(b, PA).
A key exchange protocol should ensure correctness and indistinguishability.
The correctness requirement is that both parties will derive the same key.
More precisely, for every security parameter 1^l, the following should hold. Let
(a, PA) ←$ KG(1^l) and (b, PB) ←$ KG(1^l) be the two key-pairs generated, using the
key-generation algorithm KG with security parameter 1^l, for Alice and Bob
respectively. Then we have:

KC(a, PB) = KC(b, PA)    (6.3)
Namely, applying the key-combining algorithm KC to combine Alice’s private
key a with Bob’s public key PB , results in the same symmetric key as the one
resulting from combining Bob’s private key b with Alice’s public key PA .
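The (KG, KC) interface and the correctness requirement (6.3) can be illustrated with a sketch of the Diffie-Hellman protocol (presented in subsection 6.2.3); the small hardcoded group parameters here are hypothetical and far too short for real security.

```python
import secrets

# Toy public parameters: a prime modulus p and base g (illustration only).
p, g = 2147483647, 5

def KG():
    a = secrets.randbelow(p - 2) + 1   # private key: random exponent
    PA = pow(g, a, p)                  # public key: g^a mod p
    return a, PA

def KC(private: int, peer_public: int) -> int:
    return pow(peer_public, private, p)   # shared key: g^(ab) mod p

a, PA = KG()   # Alice's key-pair
b, PB = KG()   # Bob's key-pair
assert KC(a, PB) == KC(b, PA)   # correctness, Equation (6.3)
```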
Indistinguishability requires, intuitively, that an eavesdropping adversary,
who ‘sees’ PA and PB, cannot learn anything about the shared key; equivalently,
it requires that the adversary cannot distinguish between being given randomly-generated PA, PB and the key derived from them, versus being given randomly-generated PA, PB and a random string of the same length as the key. The
following definition states this requirement more precisely.
Definition 6.2 (The indistinguishability requirement). Let (KG, KC) be a
key-exchange protocol. We say that (KG, KC) ensures key-indistinguishability
if for every efficient (PPT) adversary A and every sufficiently-large security
parameter 1^l holds:

| Pr[ A(PA, PB, KC(a, PB)) = 1 ] − Pr[ A(PA, PB, r) = 1 ] | ∈ NEGL(1^l)    (6.4)

where (a, PA) ←$ KG(1^l), (b, PB) ←$ KG(1^l), and r ←$ {0, 1}^|KC(a,PB)| is a
uniformly-random string of the same length as the key.
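The requirement can also be read as a game, sketched below: the adversary receives (PA, PB, x), where x is either the real combined key or a uniformly-random string of the same length, and must guess which. The instantiation of (KG, KC) used here (a hypothetical, tiny Diffie-Hellman-style group) is for illustration only; a blind guesser wins with probability exactly 1/2, and key-indistinguishability means no efficient adversary does noticeably better.

```python
import secrets

def indist_round(KG, KC, keylen: int, adversary) -> bool:
    # One round of the experiment behind Definition 6.2.
    a, PA = KG()
    b, PB = KG()
    real = KC(a, PB).to_bytes(keylen, "big")   # the actual shared key
    rand = secrets.token_bytes(keylen)         # r <- {0,1}^|key|
    bit = secrets.randbelow(2)
    challenge = real if bit == 0 else rand
    return adversary(PA, PB, challenge) == bit  # did the adversary guess bit?

# Hypothetical toy instantiation (insecure parameters, illustration only):
p, g = 2147483647, 5
def KG():
    x = secrets.randbelow(p - 2) + 1
    return x, pow(g, x, p)
def KC(priv, pub):
    return pow(pub, priv, p)

guesser = lambda PA, PB, x: secrets.randbelow(2)  # ignores its input
wins = sum(indist_round(KG, KC, 4, guesser) for _ in range(1000))
# wins is close to 500: a blind guesser has no advantage.
```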
6.1.4
Advantages of Public Key Cryptography (PKC)
Public key cryptography is not just a cool concept; it is very useful, allowing
solutions to problems which symmetric cryptography fails to solve, and making
it easier to solve other problems.
We first identify three important challenges which require the use of asymmetric cryptography:
Signatures provide evidence. Only the owner of the private key can digitally
sign a message, but everyone can validate the signature. Hence, once a
recipient of a signed message has validated the signature, the recipient can
also convince other parties that the message was signed
by the sender. This is impossible using (shared-key) MAC schemes, and
it enables many applications, such as signing an agreement, payment order
or recommendation/review. An important special case is signing a public
key certificate, linking an entity and its public key.
Security without assuming shared key. Using public key cryptography,
we can establish secure communication between parties, without requiring
them to previously share a secret key between them, or to share a secret key
and communicate with an additional party (such as a KDC, see Section 5.5).
One method to do so is to use an unauthenticated key-exchange protocol;
this is secure if the attacker has only eavesdropping capabilities during the
exchange (this is not secure against a MitM attacker). Another alternative
is when one party (e.g., the client) knows, or can securely receive, the
public key of the other party (e.g., the server); in this case, the client can
encrypt a shared key and send it to the server. To allow a party, e.g., the
client (Alice), to validate the public key of the other party, e.g., the server
(Bob), we can send the public key PB signed by a trusted party. We refer
to the signed public key as a public key certificate; public key certificates
are a very important aspect of applied cryptography, and we discuss them
extensively in Chapter 8.
Stronger resiliency to exposure. In Section 5.7 we discussed the goal of
resiliency to exposure of secret information, in particular, of the ‘master
key’ of shared-key key-setup protocols, and presented the forward secrecy
key-setup handshake. In subsection 5.7.3, we also briefly discussed some
stronger resiliency properties, including Perfect Forward Secrecy (PFS),
Threshold security and Proactive security. Designs for achieving such
stronger resiliency notions are all based on public key cryptography; we
discuss these in Section 6.3.
Public key cryptography (PKC) also makes it easier to design and deploy
secure systems. Specifically:
Easier key distribution: public keys are easier to distribute, since they can
be given in a public forum (such as a directory) or in an incoming message;
note that the public keys still need to be authenticated, to be sure we
are receiving the correct public keys, but there is no need to protect
their secrecy. Distribution is also easier since each party only needs to
distribute one (public) key to all its peers, rather than setting up different
secret keys, one per each peer.
Easier key management: public keys are easier to maintain and use, since
they may be kept in non-secure storage, as long as they are validated
before being used.
Fewer keys: Only one public key is required for each party, compared to a
total of n·(n−1)/2 = O(n^2) shared keys required for all pairs among n entities.
Namely, we need to maintain - and refresh - fewer keys.
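The quadratic gap is easy to see numerically; e.g., for a (hypothetical) system of n = 1000 parties:

```python
n = 1000                      # number of parties
pairwise = n * (n - 1) // 2   # one shared key per pair: 499500 keys
public = n                    # one public key per party: 1000 keys
```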
Considering all these advantages, one may wonder why not always use public
key cryptography. The reason is that there is also a price to the use of PKC as we next discuss.
6.1.5
The price of PKC: assumptions, computation costs and
length of keys and outputs
With all the advantages listed above, it may seem that we should always
use public key cryptography. However, PKC has three significant drawbacks:
computation time, key-length and potential vulnerability. We discuss these in
this subsection.
All of these drawbacks are due to the fact that when attacking a PKC
scheme, the attacker has the public key which corresponds to the private key.
The private key is closely related to the public key - for example, the private
decryption key ‘reverses’ encryption using the public key; yet, the public key
should not expose (information about) the private key. It is challenging to
come up with a scheme that allows this relationship between the encryption
and decryption keys, and yet where the public key does not expose the private
key. In fact, as discussed in Section 1.6, the concept of PKC was ‘discovered’
twice!
Considering the challenge of designing asymmetric cryptosystems, it should
not be surprising that all known public-key schemes have considerable drawbacks
compared to the corresponding shared-key (symmetric) schemes. There are two
types of drawbacks: overhead and required assumptions.
PKC assumptions and quantum cryptanalysis. Applied PKC algorithms,
such as RSA, DH, El-Gamal and elliptic-curve PKCs, all rely on specific
computational assumptions, mostly on the hardness of specific number-theoretic
problems, mainly two: factoring and discrete logarithm.
These specific hardness assumptions, and several others, are usually considered well-founded. This is due to the extensive efforts of mathematicians
and other experts to find efficient algorithms for these problems. In particular,
factoring and discrete logarithms have been studied for many years, long before
their use for PKC was proposed; and efforts increased by far as PKC became
known and important.
However, it is certainly conceivable that an efficient algorithm exists - and
would someday be found. Such a discovery may even occur suddenly and
soon - such unpredictability is the nature of algorithmic and mathematical
breakthroughs. In particular, a recent draft [352] presented a new factoring
method, which was claimed to be fast enough to be practical for significant
key-lengths, and specifically to ‘destroy the RSA cryptosystem’. As of the time of
writing, this draft was withdrawn, and may be incorrect; but it would not be
shocking if such an algorithm were to be found, indicating that RSA security
may be considerably less than currently estimated, requiring the use of either
longer keys or other schemes.
Furthermore, since all of the widely-used PKC algorithms are so closely
related, it is even possible that some, potentially related, advances in cryptanalysis would apply to all of them - leaving us without any practical PKC
algorithm. PKC algorithms are the basis for the security of many systems and
protocols; if suddenly there were no viable, practical and unbroken PKC, that
would be a major problem.
And if all that is not alarming enough, efficient algorithms to solve both
the factoring and the discrete logarithm problems are known - requiring,
however, an appropriate quantum computer. There have been many efforts to
develop quantum computers, with significant progress - but results are still far
from the ability to cryptanalyze these PKC schemes, when used with key-lengths
which are considered secure (against known attacks, using standard computing devices).
However, that may change with improvements in quantum computing.
Cryptographers work hard to identify additional candidate PKC systems, which rely on other, 'independent' or - ideally - more general assumptions, as well as schemes which are secure even if large-scale quantum computing becomes feasible, referred to as post-quantum cryptography. We discuss the impact of quantum computing on cryptography, including both its use for cryptanalysis and the development of post-quantum cryptography, in Section 10.4.
One particularly interesting approach to the development of cryptographic schemes robust to advances in algorithms for specific problems, is the design of PKC schemes based on lattice problems. Lattice problems seem resilient to
quantum-computing; furthermore, some of the results in this area have proofs
of security based on the general and well-founded complexity assumption of
NP-completeness. Details are beyond our scope; see, e.g., [16, 315].
PKC overhead: key-length and computation. Another drawback of asymmetric cryptography is that all of the proposed schemes - definitely, all proposed schemes which were not broken - have much higher overhead compared to the corresponding shared-key schemes. There are two main types of overhead: computation time and key-length.
The system designers choose the key-length of the cryptosystems they use,
based on the sufficient effective key length principle (principle 5). These decisions
are based on the perceived resources and motivation of the attackers, on their
estimation or bounds of the expected damages due to exposure, and on the
constraints and overheads of the relevant system resources. Finally, a critical
consideration is the estimates of the required key length for the cryptosystems
in use, based on known and estimated future attacks. Such estimates and
recommendations are usually provided by experts proposing new cryptosystems,
and then revised and improved by experts and different standardization and
security organizations, publishing key-length recommendations.
We present three well-known recommendations in Table 6.1. These recommendations are marked in the table as LV’01, NIST2014 and BSI’17, and were
published, respectively, in a seminal 2001 paper by Lenstra and Verheul [261],
by NIST in 2014 [27] and by the German BSI organization in 2017 [86]. See
these and much more online at [163].
Recommendations are usually presented with respect to a particular year in which the ciphertexts are to remain confidential (the three rows for 2020, 2030 and 2040 in Table 6.1). Experts estimate the expected improvements in the cryptanalysis capabilities of attackers over the years, due to improved hardware speeds, reduced hardware costs, reduced energy costs (due to improved hardware), and, often more significantly but hardest to estimate, improvements in methods of cryptanalysis. Such predictions cannot be made precisely, and hence, recommendations differ, sometimes considerably.
Table 6.1 presents the recommendations for four typical, important cryptosystems (in columns two to four). Column two presents the recommendations for a symmetric cryptosystem such as AES. The recommendations for symmetric cryptosystems are not limited to AES; they apply to any symmetric (shared-key) cryptosystem. They only require that the best attacks against the system are generic attacks such as exhaustive search (subsection 2.3.1) or table lookup (subsection 2.3.2); symmetric cryptosystems against which there is a more effective attack are typically considered insecure and avoided.
Column three presents the recommendations for RSA and El-Gamal, the two oldest and most well-known public-key cryptosystems; we discuss both cryptosystems in sections 6.5 and 6.4.2. This column also applies to the Diffie-Hellman (DH) key-exchange protocol; in fact, the El-Gamal cryptosystem is essentially a variant of the DH protocol, as we explain in subsection 6.4.2. RSA
                  Symmetric                Factoring (RSA),          Elliptic curves
 Estimation       Cryptography             Discrete-log (DH)         (ECIES)
 Year             LV     NIST   BSI        LV      NIST    BSI       LV     NIST   BSI
                  2002   2014   2017       2002    2014    2017      2002   2014   2017
 2020             86     112    128        1881    2048    2000      161    224    250
 2030             93     112    128        2493    2048    3000      176    224    250
 2040             101    128    128        3214    3072    3000      191    256    250
 Crypto++ [110]   4.5·10^9 bytes/sec       3·10^5 bytes/sec          3·10^4 bytes/sec
                  (128-bit AES)            (2048-bit RSA/DH)         (256-bit ECIES)
Table 6.1: Comparison of key length and computing time for asymmetric and symmetric cryptography. The table shows three recommendations for key-length, in bits, required for confidentiality against 'commercial' adversaries. The recommendations are given for widely-deployed public-key (asymmetric) and shared-key (symmetric) cryptosystems; the rows refer to the year in which confidentiality is to be preserved (2020, 2030 and 2040). The recommendations are based on predictions of advances in both computing power and cryptanalysis. The LV recommendations are from a 2002 paper [261], the NIST recommendations are from a 2014 publication [27] and the BSI values are from a 2017 publication [86]. The bottom row compares the performance of the schemes for the Crypto++ implementation, based on [110].
and El-Gamal/DH are based on two different number-theoretic problems: the factoring problem (for RSA) and the discrete-logarithm problem (for DH/El-Gamal); but the best-known attacks against both problems are related, with sub-exponential running time as a function of the key-length. We briefly discuss these problems in subsection 6.1.7.
The fourth column of Table 6.1 presents the recommendations for elliptic-curve based public-key cryptosystems such as ECIES. As the table shows, the recommended key-lengths for elliptic-curve based public-key cryptosystems are, quite consistently, much lower than the recommendations for the 'older' RSA and El-Gamal/DH systems; this makes them attractive in applications where longer keys are problematic, due to storage and/or communication overhead. We do not cover elliptic-curve cryptosystems in this textbook; these are covered in other courses and books, e.g., [16, 187, 370].
Table 6.1 shows that the required key-length is considerably higher for public-key schemes, compared to shared-key (symmetric) schemes. Symmetric cryptography requires only about half of the key-length required by elliptic-curve cryptosystems, and only about 5% of the key-length required, for the same level of security, when using the RSA and DH public-key schemes. The lower key-length recommendations for elliptic-curve cryptography make these schemes attractive in the (many) applications where key-length is critical, such as when communication bandwidth and/or storage are limited.
The bottom row of Table 6.1 compares the running time of implementations
of AES with 128 bit key in counter (CTR) mode, RSA with 1024 and 2048
bit key, and 256 bit ECIES elliptic curve cryptosystem. We see that the
symmetric cryptosystem (AES) is many orders of magnitude faster. It supports about 4.5 · 10^9 bytes/second, compared with about 3 · 10^5 bytes/second for the comparably-secure 2048-bit RSA, and less than 3 · 10^4 bytes/second for ECIES. We used the values reported for one of the popular cryptographic libraries, Crypto++ [110].
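To get a feel for this gap, the reported rates translate into the following rough times to encrypt one gigabyte. This is a back-of-the-envelope sketch using the Crypto++ figures from Table 6.1; actual speeds vary by hardware and implementation.

```python
# Approximate throughput figures from the bottom row of Table 6.1 [110].
rates = {
    "AES-128 (CTR)": 4.5e9,   # bytes/second
    "RSA-2048":      3e5,
    "ECIES-256":     3e4,
}
gigabyte = 10**9
for scheme, rate in rates.items():
    print(f"{scheme}: about {gigabyte / rate:,.0f} seconds per gigabyte")
```

Encrypting a gigabyte directly with 2048-bit RSA would take roughly an hour, and with ECIES roughly ten hours, versus a fraction of a second for AES.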
The minimize use of PKC principle. In this subsection we have seen several serious concerns with the use of asymmetric (public key) cryptography. First, practical, deployed public-key cryptographic algorithms are secure only under specific assumptions - which have held for many years, true, but still may be broken, e.g., by new, faster factoring algorithms [352]. Second, applied public-key techniques may be vulnerable to further improvements in quantum computing. Finally, as Table 6.1 shows, asymmetric (public key) cryptography has much higher overhead compared to symmetric cryptography. From all of this, we conclude the following principle:
Principle 12 (Minimize use of public-key cryptography). Designers should avoid the use of public-key cryptography, or, where it is absolutely necessary, minimize its use.
In particular, consider that typical messages are much longer than the size of inputs to the public-key algorithms. If we ignored the high costs of asymmetric cryptography, we could split the input into 'blocks' whose size is the allowed input-size of the public-key algorithms, and then use 'modes of operation', like those presented for encryption and MAC, to apply the public-key algorithms to multiple blocks. However, the resulting computation costs would be absurd. Even more absurd, although theoretically possible, would be to modify the public-key operation to directly support longer inputs. Luckily, there are simple and efficient solutions, for both encryption and signatures, which are used essentially universally, to apply these schemes to long, typically Variable Input Length (VIL), messages:
Signatures: use the Hash-then-Sign (HtS) paradigm, see subsection 3.2.6.
Encryption: use the hybrid encryption paradigm, see the following subsection
(subsection 6.1.6).
6.1.6
Hybrid Encryption
The huge performance overhead of asymmetric cryptosystems implies that they are typically used only when the parties do not share a symmetric key. Furthermore, even when the parties do not share a symmetric key, we usually do not directly use the widely-used, 'classical' asymmetric cryptosystems (KG^A, E^A, D^A), e.g., RSA. Instead, we usually combine such a 'classical' asymmetric cryptosystem (KG^A, E^A, D^A) with an efficient symmetric cryptosystem (E^S, D^S), e.g., AES. Namely, we construct a new, hybrid asymmetric cryptosystem, which we denote (KG^H, E^H, D^H). Note our use of mnemonic
superscripts to distinguish between the three cryptosystems: A for the 'classical' asymmetric cryptosystem (KG^A, E^A, D^A), S for the symmetric cryptosystem (E^S, D^S) and H for the hybrid cryptosystem (KG^H, E^H, D^H).
The goal of hybrid encryption is to obtain the benefits of asymmetric (public key) cryptosystems, yet with much reduced overhead, for the typical case where the plaintext is much longer than the input size of the 'classical' public-key encryption. For example, when using RSA with 4000-bit keys, the input size must be less than 4000 bits; with hybrid encryption, we can encrypt much longer messages, with only a single (4000-bit) RSA encryption, plus the number of symmetric-key operations required to encrypt the plaintext.
Almost universally, hybrid encryption is achieved using the simple construction illustrated in Figure 6.3. Note that Figure 6.3 shows only the encryption and decryption processes of the hybrid encryption scheme; this is because, in this common construction, the hybrid encryption scheme (KG^H, E^H, D^H) uses the same key-generation function as that of the underlying asymmetric encryption scheme, i.e., KG^H(1^l) = KG^A(1^l).
Let us explain the hybrid encryption and decryption processes, as illustrated in Figure 6.3.
Figure 6.3: Hybrid encryption (KG^H, E^H, D^H), defined as a combination of an asymmetric (public key) cryptosystem (KG^A, E^A, D^A) with a shared-key cryptosystem (E^S, D^S), to allow efficient public key encryption of a long message m using public key e and security parameter 1^l.
The hybrid encryption process E^H_e(m). We now explain how we perform hybrid encryption, given message m and using the public key e, as illustrated in Figure 6.3. We first select a random symmetric key k ←$ {0,1}^|e|; note that, for simplicity, we use the length of the public key as the length of the shared key too (in practice, a shorter key usually suffices). We then use this key k to encrypt the message using symmetric encryption, i.e., compute the cipher-message cM = E^S_k(m). Finally, we use the public-key encryption to encrypt the symmetric key k, i.e., compute the cipher-key cK = E^A_e(k). The ciphertext of the hybrid encryption is the pair of cipher-message and cipher-key, i.e., (cM, cK). More formally, we define:

    E^H_e(m) ← (cM, cK),  where  k ←$ {0,1}^|e|,  cM ← E^S_k(m),  cK ← E^A_e(k)    (6.5)
The hybrid decryption process D^H_d((cM, cK)). We now explain how we perform hybrid decryption, given the private key d and the ciphertext, which should be a pair (cM, cK) of cipher-message and cipher-key. The hybrid decryption process is illustrated in Figure 6.3. We first decrypt the symmetric key, by k ← D^A_d(cK), and then use this key k to decrypt the message, using the symmetric decryption function: m ← D^S_k(cM). More formally, we define the hybrid decryption process as:

    D^H_d((cM, cK)) ← D^S_k(cM),  where  k ← D^A_d(cK)    (6.6)
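To make Equations (6.5) and (6.6) concrete, here is a minimal Python sketch of the hybrid construction. It is an illustration only: the 'asymmetric' scheme is textbook RSA with tiny, insecure primes, and the 'symmetric' scheme is a simple SHA-256-based stream cipher standing in for a real cipher such as AES; all names and parameters here are ours, not a standard API.

```python
import hashlib
import secrets

# Toy symmetric cipher (E^S, D^S): XOR with a SHA-256 keystream in counter
# mode. A stand-in for a real cipher such as AES; illustration only.
def keystream(key: bytes, n: int) -> bytes:
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def sym_encrypt(key: bytes, m: bytes) -> bytes:      # E^S_k(m)
    return bytes(a ^ b for a, b in zip(m, keystream(key, len(m))))

sym_decrypt = sym_encrypt   # XOR with the same keystream also decrypts

# Toy asymmetric cipher (E^A, D^A): textbook RSA with tiny primes (insecure!).
p, q, e = 1000003, 1000033, 65537
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))        # private exponent (Python 3.8+)

def hybrid_encrypt(m: bytes):            # E^H_e(m), per Equation (6.5)
    k = secrets.randbelow(n)             # fresh symmetric key (toy size)
    cM = sym_encrypt(k.to_bytes(8, "big"), m)    # cM = E^S_k(m)
    cK = pow(k, e, n)                    # cK = E^A_e(k): one RSA operation
    return cM, cK

def hybrid_decrypt(cM: bytes, cK: int) -> bytes:   # D^H_d, per Equation (6.6)
    k = pow(cK, d, n)                    # k = D^A_d(cK)
    return sym_decrypt(k.to_bytes(8, "big"), cM)   # m = D^S_k(cM)

msg = b"a long message, encrypted with just one public-key operation"
cM, cK = hybrid_encrypt(msg)
assert hybrid_decrypt(cM, cK) == msg
```

Note that no matter how long m is, there is exactly one public-key encryption (of the key k); all the per-byte work is symmetric.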
Exercise 6.1. Prove, or present counterexample to, the following claim: if
the asymmetric (KGA , E A , DA ) and symmetric (E S , DS ) cryptosystems ensure
correctness (per definitions 6.1 and 2.1, respectively), then the hybrid cryptosystem (KGH , E H , DH ), defined as above, is also an asymmetric cryptosystem
(PKC) that ensures correctness.
6.1.7
The Factoring and Discrete Logarithm Hard Problems
As discussed in Section A.1, cryptography, and in particular public-key cryptography, is based on the theory of complexity, and specifically on (computationally) hard problems. Intuitively, a hard problem is a family of computational problems with two properties:
Easy to verify: there is an efficient (PPT) algorithm to verify solutions.
Hard to solve: there is no known efficient algorithm that solves the problem (with significant probability). We refer here to known algorithms; it is unreasonable to expect a proof that there is no efficient algorithm for a problem for which there is an efficient (PPT) verification algorithm. The reason is that such a proof would also resolve the most important, fundamental open problem in the theory of complexity, i.e., it would show that NP ≠ P. See Section A.1 or relevant textbooks, e.g., [165].
Intuitively, public key schemes use hard problems by having the secret key provide the solution to the problem, and the public key provide the parameters to verify the solution. To make this more concrete, we briefly discuss factoring and discrete logarithm, the two hard problems which are the basis for many public key schemes, including the oldest and most well known: RSA, DH and El-Gamal. For more in-depth discussion of these and other schemes, see courses and books on cryptography, e.g., [370]. Note that while, so far, known attacks are equally effective against both problems (see Table 6.1), there is not yet a proof that an efficient algorithm for one problem implies an efficient algorithm for the second.
Note: both the factoring and the discrete logarithm problems are well-known problems from the domain of number theory. Properly understanding these problems, as well as the operation and security of the related public-key cryptosystems we present, may require familiarity with some basic notions of number theory. These are summarized in Section A.2.
Factoring. The factoring problem is one of the oldest problems in algorithmic number theory, and is the basis for RSA and other cryptographic schemes. Basically, the factoring problem involves finding the prime divisors (factors) of a large integer. However, most numbers have small divisors - half of the numbers are divisible by two, a third by three, and so on. This allows efficient sieve algorithms to factor most numbers. Therefore, the factoring hard problem refers specifically to factoring of numbers which have only large prime factors.
For the RSA cryptosystem, in particular, we consider factoring of a number n computed as the product of two large random primes: n = pq. The factoring hard problem assumption is that given such n, there is no efficient algorithm to factor it back into p and q.
Verification consists simply of multiplying p and q, or, if given only one of the two, say p, of dividing n by p and confirming that the result is an integer q, with no remainder.
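The asymmetry between solving and verifying is easy to see even in a toy example (the primes here are tiny and chosen by us for illustration; RSA moduli use primes of hundreds of digits):

```python
# n is the product of two primes; factoring n is the (believed) hard
# direction, while verifying a claimed factor takes a single division.
p, q = 1000003, 1000033        # toy-sized primes (real RSA: ~1024 bits each)
n = p * q

claimed_factor = 1000003
assert n % claimed_factor == 0           # efficient verification
assert 1 < claimed_factor < n            # and it is a non-trivial factor
print(n // claimed_factor)               # recovers the cofactor q = 1000033
```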
Discrete logarithm. The discrete logarithm problem is another important, well-known problem from algorithmic number theory - and the basis for the DH (Diffie-Hellman) key-exchange protocol, the El-Gamal cryptosystem, elliptic-curve cryptography, and additional cryptographic schemes.
Discrete logarithms are defined for a given cyclic group G and a generator g of G; see background in subsection A.2.4. Given a generator g of a finite cyclic group G, and an element x ∈ G, an integer y is called the discrete logarithm of x over G with respect to g, if x = g^y. Note that multiplication (and exponentiation) are done using the group operation of G; e.g., for the modulo-p group Z*_p ≡ {1, 2, ..., p − 1}, we require x ≡ g^y (mod p).
Discrete logarithms are similar to the 'regular' logarithm function log_b(x) over the real numbers R, which returns the number y ∈ R s.t. x = b^y. Discrete logarithms, unlike 'regular' logarithms, are computed over the finite cyclic group G rather than over the real numbers, and use the group operation of G rather than multiplication over the real numbers.
Intuitively, an algorithm that outputs the discrete logarithm a, given an element x ∈ G and the generator g, is said to solve the discrete logarithm problem for G. We say that the discrete logarithm problem is hard for a finite cyclic group G, if there is no efficient (PPT) algorithm A that solves the discrete logarithm problem for G (with significant probability of success). This is in contrast to the logarithm function over the real numbers, which is efficiently computable. Note, however, that for any group G, it only requires an exponentiation to verify whether x = g^y, and exponentiation can be computed quite efficiently.
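The contrast between verifying and solving can be seen in a small example over Z*_23 (toy parameters of our choosing; real groups are astronomically larger, so the exhaustive search below becomes infeasible):

```python
p, g = 23, 5            # 5 generates the group Z*_23 (toy-sized parameters)
x = pow(g, 9, p)        # x = g^9 mod p; the discrete log of x is 9

# Verifying a claimed discrete log is one fast modular exponentiation:
assert pow(g, 9, p) == x

# Solving, absent a better algorithm, means searching over the exponents:
y = next(e for e in range(1, p - 1) if pow(g, e, p) == x)
assert y == 9
```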
This discussion is only intuitive, since we did not clearly define the input of the algorithm A. This may suffice for most readers; however, for interested readers, we also present a precise definition of the discrete logarithm problem. In this definition, we consider a PPT algorithm Gen which receives, as input, a security parameter 1^l, and generates (outputs) the generator g and the order q of G.
Definition 6.3 (The discrete logarithm problem). Let Gen be a PPT algorithm that, on input 1^l, outputs (g, q) such that {1, g, ..., g^q} is a cyclic group (using a given group operation). We say that the discrete logarithm problem is hard for groups generated by Gen, if for every PPT algorithm A holds:

    Pr[ (g, q) ← Gen(1^l); a ←$ {1, ..., q} : a = A(g^a) ] ∈ NEGL(1^l)    (6.7)
In practical cryptography, the discrete logarithm problem is used mostly for (cyclic) groups defined by multiplication modulo a prime p, often the cyclic group Z*_p ≡ {1, 2, ..., p − 1}. However, for some primes p, the discrete-logarithm problem is easy for Z*_p. In particular:
Fact 6.1. Let p be a prime. If p − 1 has only ‘small’ prime factors, then
there are known algorithms, such as the Pohlig-Hellman algorithm [319], that
efficiently compute discrete logarithms.
This motivates the use of a prime modulus p such that p − 1 has no small prime factors. In this textbook, we focus on a special case called a safe prime, as we next define.
Definition 6.4 (Safe prime). A prime number p ∈ N is called a safe prime, if
p = 2q + 1 for some prime q ∈ N. If p is a safe prime, we say that the group
Z∗p , containing the numbers from 1 to p − 1, with the modular multiplication
operation, is a safe prime group.
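Definition 6.4 is easy to check directly. The following sketch uses naive trial-division primality testing, which is adequate only for small illustrative numbers; real implementations use probabilistic tests such as Miller-Rabin.

```python
def is_prime(n: int) -> bool:
    # naive trial division; fine for small illustrative numbers only
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def is_safe_prime(p: int) -> bool:
    # p is a safe prime if p is an odd prime and p = 2q + 1 for a prime q
    return p % 2 == 1 and is_prime(p) and is_prime((p - 1) // 2)

# 23 = 2*11 + 1 is a safe prime; 13 = 2*6 + 1 is not, since 6 is composite
assert is_safe_prime(23)
assert not is_safe_prime(13)
```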
Many efforts have failed to find an efficient algorithm to compute discrete logarithms for safe prime groups. As a result, the discrete-logarithm problem is widely believed to be hard for the mod-p group Z*_p, if p is a safe prime.
For efficiency and/or security considerations, some designs use other finite cyclic groups for which the discrete logarithm problem is considered hard, which are not safe prime groups; one example is groups defined using elliptic curves.
6.1.8
The secrecy implied by the discrete logarithm assumption
Suppose that the discrete logarithm assumption for safe prime groups holds, i.e., it is computationally-hard to find the discrete-log a, given g^a mod p, where g is a generator of the safe prime group. Does this mean that the attacker cannot learn any information about a?
The answer is no. Furthermore, we show that the attacker can efficiently learn some information about a - specifically, its least-significant bit (LSb), i.e., whether a is even (LSb(a) = 0) or odd (LSb(a) = 1). As we will see later, this
has important implications on the design of some discrete logarithm-based
cryptographic schemes, such as the ‘secure’ way to use the Diffie-Hellman
protocol; see Claim 6.3.
Learning LSb(a) is based on the notion of quadratic residue modulo p, which
has many uses in the mathematics of cryptography.
Definition 6.5 (Quadratic residue). Let p be a prime number, and let y be a positive integer. We say that y is a quadratic residue modulo p, if there is some integer z s.t. y ≡ z^2 (mod p).
We first claim, without proof, that quadratic residuosity² can be efficiently determined.
Claim 6.1. Given a prime p, there is an efficient algorithm that can determine if a given positive integer y is a quadratic residue modulo p.
Proof: omitted; see, e.g., [205].
We next show that y = g^x mod p is a quadratic residue modulo p if and only if LSb(x) = 0, i.e., the least significant bit of x is zero, or equivalently, x is even. Combined with Claim 6.1, this shows that we can efficiently find the least-significant bit of the exponent x.
Claim 6.2. Let p be a prime, g be a generator for Z*_p, and x be a positive integer. Then y ≡ g^x mod p is a quadratic residue mod p, if and only if LSb(x) = 0, i.e., x is even.
Proof: Let us first prove that if x is even, i.e., LSb(x) = 0, then y = g^x mod p is a quadratic residue. First observe that if x is even, then there is some integer z s.t. x = 2z. Hence, y ≡ g^{2z} ≡ (g^z)^2 (mod p); i.e., y is, indeed, a quadratic residue.
We now prove the other direction, i.e., let y = g^x mod p be a quadratic residue mod p, where x is an integer; we prove that LSb(x) = 0, i.e., x is even. This proof uses basic facts from number theory, which we present in subsection A.2.3.
For any odd number m, there exists an integer k such that m = 2k + 1. Let us assume, to the contrary, that g^m is a quadratic residue mod p, for some odd integer m (LSb(m) = 1); namely, g^m ≡ z^2 mod p for some integer z. From Fermat's theorem (Theorem A.1) it follows that:

    z^{p−1} ≡ 1 (mod p)    (6.8)
² Determination of quadratic residuosity is equivalent to computation of the Legendre symbol (y/p), where p is a prime and y is an integer, defined as: (y/p) ≡ 1 if y is a quadratic residue modulo p and y ≢ 0 (mod p); (y/p) ≡ −1 if y is not a quadratic residue modulo p; and (y/p) ≡ 0 if y ≡ 0 (mod p).
However, on the other hand:

    z^{p−1} ≡ z^{2·(p−1)/2} ≡ (z^2)^{(p−1)/2} ≡ (g^m)^{(p−1)/2} ≡ g^{(2k+1)·(p−1)/2} ≡ g^{k·(p−1)} · g^{(p−1)/2}  (mod p)    (6.9)

Now, again from Fermat's theorem, we have:

    g^{k·(p−1)} ≡ (g^{p−1})^k ≡ 1^k ≡ 1 (mod p)    (6.10)

By combining Equations (6.8-6.10), we have:

    g^{(p−1)/2} ≡ 1 · g^{(p−1)/2} ≡ g^{k·(p−1)} · g^{(p−1)/2} ≡ z^{p−1} ≡ 1 (mod p)    (6.11)

Namely, g^{(p−1)/2} ≡ 1 (mod p). However, g is a generator of Z*_p, i.e., g^k ≢ 1 (mod p) for every integer k s.t. 1 ≤ k < p − 1, and in particular for k = (p−1)/2. This contradicts g^{(p−1)/2} ≡ 1 (mod p), i.e., Equation 6.11.
Claim 6.2 shows that by (efficiently) determining whether g^x mod p is a quadratic residue modulo p (Claim 6.1), we can find the least-significant bit of x (LSb(x)), indicating whether x is even or odd. Namely, while it may be hard to compute the entire discrete logarithm (x, given g^x mod p), it is possible to efficiently find at least one bit of x - the least significant bit.
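Claims 6.1 and 6.2 can be checked concretely using Euler's criterion, which states that y is a quadratic residue modulo an odd prime p iff y^((p−1)/2) ≡ 1 (mod p); this is one standard way to efficiently test residuosity. The toy parameters below are our choice for illustration.

```python
p, g = 23, 5     # 23 is a safe prime (23 = 2*11 + 1); 5 generates Z*_23

def lsb_of_exponent(y: int) -> int:
    # Euler's criterion: y is a quadratic residue mod p iff
    # y^((p-1)/2) = 1 (mod p); for y = g^x, this holds iff x is even.
    return 0 if pow(y, (p - 1) // 2, p) == 1 else 1

# The eavesdropper sees only g^x mod p, yet efficiently learns LSb(x):
for x in range(1, p - 1):
    assert lsb_of_exponent(pow(g, x, p)) == x % 2
```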
6.2
The DH Key Exchange Protocol
A major motivation for public key cryptography is to secure communication between parties, without requiring the parties to previously agree on a shared secret key. In their seminal paper [123], Diffie and Hellman introduced the concept of public key cryptography, including the public-key cryptosystem (PKC), which indeed allows secure communication without a preshared secret key. However, this paper did not contain a proposal for implementing a PKC. Instead, [123] introduced the key exchange problem, and presented the Diffie-Hellman (DH) key exchange protocol, often referred to simply as the DH protocol. Although a key-exchange protocol is not a public key cryptosystem, it also allows secure communication - without requiring a previously shared secret key. In fact, the goal of a key exchange protocol is to establish a shared secret key.
In this section, we explain the DH protocol, by developing it in three steps each in a subsection. In subsection 6.2.1 we discuss a ‘physical’ variant of the
DH protocol, which involves physical padlocks and exchanging a box (locked by
one or two locks).
6.2.1
Physical key exchange
To help understand the Diffie-Hellman key exchange protocol, we first describe a physical key exchange protocol, illustrated by the sequence diagram in Fig. 6.4. In this protocol, Alice and Bob exchange a secret key using a box and two padlocks - one of Alice's and one of Bob's. Note that initially, Alice and Bob do
Figure 6.4: Physical Key Exchange Protocol
not have a shared key - and, in particular, Bob cannot open Alice’s padlock
and vice versa; the protocol nevertheless, allows them to securely share a key.
Alice initiates the protocol by placing the key to be shared in the box, and
locking the box with her padlock. When Bob receives the locked box, he cannot
remove Alice’s padlock and open the box. Instead, Bob locks the box with his
own padlock, in addition to Alice’s padlock. Bob now sends the box, locked by
both padlocks, to Alice.
Upon receiving the box, locked by both padlocks, Alice removes her own
padlock and sends back the box, now locked only by Bob’s padlock, back to Bob.
Finally, Bob removes his own padlock, and is now able to open the box and find the key sent by Alice. We assume that the man-in-the-middle adversary cannot remove Alice's or Bob's padlocks, and hence, cannot learn the secret in this way. The Diffie-Hellman protocol replaces this physical assumption with appropriate cryptographic assumptions.
However, notice that there is a further limitation on the adversary, which is
crucial for the security of this physical key exchange protocol: the adversary
should be unable to send a fake padlock. Note that in Figure 6.4, both padlocks
are stamped by the initial of their owner - Alice or Bob. The protocol is not
secure, if the adversary is able to put her own padlock on the box, but stamp it
with A or B, and thereby make it appear as if the padlock is Alice’s or Bob’s,
respectively. This corresponds to the fact that the Diffie-Hellman protocol is
only secure against an eavesdropping adversary, but insecure against a MitM
adversary.
The critical property that facilitated the physical key exchange protocol is that Alice can remove her padlock, even after Bob has added his own padlock. Namely, the 'padlock' operation is 'commutative' - it does not matter that Alice placed her padlock first and Bob second; she can still remove her padlock
as if it was applied last. In a sense, the key to cryptographic key exchange protocols such as Diffie-Hellman is to perform a mathematical operation which is also commutative; of course, there are many commutative operations. We next discuss 'insecure prototype' key-exchange protocols based on three commutative operations: addition, multiplication and XOR. However, before that, let us briefly discuss the definition of a secure key exchange protocol.
6.2.2
Some candidate key exchange protocols
In this subsection, we present a few 'prototype' key-exchange protocols, which help us to properly explain the Diffie-Hellman protocol. Unlike the physical key exchange protocol of subsection 6.2.1, these are 'real protocols', i.e., they involve only the exchange of messages - no physical objects or assumptions. We begin with three insecure 'prototypes', each using a different commutative operation: XOR, addition and multiplication.
The XOR, Addition and Multiplication key exchange protocols. The sequence diagram in Figure 6.5 presents the first prototype: the XOR key exchange protocol. This prototype tries to use the XOR operator to 'implement the padlocks' of Figure 6.4.
XOR is a natural candidate, since we know that XOR can provide confidentiality when used 'correctly', e.g., in the one-time pad construction. Furthermore, XOR is commutative, and it is easy to see that this suffices to ensure the correctness of the XOR key exchange, i.e., the fact that kA,B = kB,A, as follows:
    kB,A = k''' ⊕ kB
         = (k'' ⊕ kA) ⊕ kB
         = ((k' ⊕ kB) ⊕ kA) ⊕ kB
         = (((kA,B ⊕ kA) ⊕ kB) ⊕ kA) ⊕ kB
         = kA,B
However, as the next exercise shows, the XOR key exchange protocol is insecure. In fact, not only does it not satisfy indistinguishability, but worse: an eavesdropper can easily find the exchanged key.
Exercise 6.2 (XOR key exchange protocol is insecure). Show how an eavesdropping adversary may find the secret key exchanged by the XOR Key Exchange protocol, by (only) using the values sent between the two parties.
Solution (sketch): the attacker XORs all three messages, to obtain: k = (k ⊕ kA) ⊕ (k ⊕ kA ⊕ kB) ⊕ (k ⊕ kB).
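The full protocol run and Eve's attack fit in a few lines of Python (variable names follow Figure 6.5; here l is the key length in bytes):

```python
import secrets

l = 16   # key length (bytes)
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Alice and Bob run the XOR key exchange of Figure 6.5:
kA, kAB = secrets.token_bytes(l), secrets.token_bytes(l)   # Alice's values
kB = secrets.token_bytes(l)                                # Bob's value
k1 = xor(kA, kAB)    # k'   = kA xor kA,B   (Alice -> Bob)
k2 = xor(k1, kB)     # k''  = k' xor kB     (Bob -> Alice)
k3 = xor(kA, k2)     # k''' = kA xor k''    (Alice -> Bob)
kBA = xor(k3, kB)    # Bob's output
assert kBA == kAB    # correctness: both sides share kA,B ...

# ... but Eve, who saw only k', k'', k''', recovers the key (Exercise 6.2):
eve_key = xor(xor(k1, k2), k3)
assert eve_key == kAB
```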
Exponentiation key exchange. The attack on the XOR key exchange was due to the fact that the attacker was able to 'remove' elements by applying the XOR again, due to the combination of XOR's commutativity and the fact that for XOR, every element is its own inverse, i.e., (∀x ∈ {0,1}^l) x ⊕ x = 0^l.
    Alice                                               Bob
    kA, kA,B ←$ {0,1}^l
                            k' ← kA ⊕ kA,B    →
                                                        kB ←$ {0,1}^l
                            ←    k'' ← k' ⊕ kB
                            k''' ← kA ⊕ k''   →
    Output kA,B                                         Output kB,A ← k''' ⊕ kB

Figure 6.5: The (insecure) XOR Key Exchange Protocol; this protocol ensures correctness kA,B = kB,A, but is insecure. Specifically, by eavesdropping on the three exchanged messages (k', k'' and k'''), Eve can find the key kA,B. Can you find out how? See Exercise 6.2.
    Alice                                               Bob
    k, ra ←$ {0, ..., 2^n − 1}
    (random n-bit integers)
                            x ← g^{k·ra}      →
                                                        rb ←$ {0, ..., 2^n − 1}
                                                        (random n-bit integer)
                            ←    y ← x^{rb}
                            z ← y^{1/ra}      →
    Output kA,B ← g^k                                   Output kB,A ← z^{1/rb}

Figure 6.6: The (insecure and inefficient) Exponentiation Key Exchange Protocol, using some random integer g. If the resulting shared key kA,B = kB,A is too long, use only some of its bits. This protocol, like the XOR key exchange protocol, ensures correctness kA,B = kB,A, but is insecure, as we show in the text.
So, let us try to use a different mathematical operation that also ensures commutativity (for correctness), but where elements are (typically) not their own inverses: exponentiation. In Fig. 6.6, we show the resulting Exponentiation Key Exchange protocol. This protocol is obviously very inefficient, but let us ignore the inefficiency; we present it only to show its correctness and vulnerability, as motivation and to build intuition for the modular exponentiation key exchange protocol that we show afterwards.
Let us first show that the Exponentiation Key Exchange protocol ensures correctness, i.e., that kA,B = kB,A:

kB,A = z^(1/rb)                          (6.12)
     = (y^(1/ra))^(1/rb)                 (6.13)
     = (x^rb)^(1/(rb·ra))                (6.14)
     = (g^(k·ra))^(1/ra)                 (6.15)
     = g^k = kA,B                        (6.16)
We relied on the commutativity of exponentiation in Equation 6.15.
Let us now explain an attack recovering the exchanged key kA,B, similar to the attack on the XOR key exchange. The attack uses the fact that the exponentiation operation may be removed, to find the exponent, by computing the inverse operation, i.e., logarithm (base g). The logarithm function is less efficient than exponentiation, but, over the integers or real numbers, it is still considered an efficient operation, since it can be computed in polynomial time.

Namely, an eavesdropper can simply compute the logarithm (base g) to remove the exponentiations from all flows, which reduces the protocol to the multiplication key exchange, shown insecure in Ex. 6.12 (and similar to the attack on the XOR key exchange in Exercise 6.2). Specifically, the attacker applies the logarithm operator, with base g, to the three messages of Fig. 6.6, resulting in the values k·ra, k·ra·rb and k·rb. The attacker can now combine these three values to find k, by computing k = ((k·ra)·(k·rb))/(k·ra·rb).
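To make the attack concrete, here is a sketch with tiny, illustrative exponents (a real run would use large random n-bit integers). Since all flows are exact powers of g over the integers, the base-g logarithm can be computed by repeated division:

```python
# Tiny, illustrative values for g, k, ra, rb
g, k, ra, rb = 3, 5, 4, 7

x = g ** (k * ra)        # Alice -> Bob:   g^(k*ra)
y = x ** rb              # Bob -> Alice:   g^(k*ra*rb)
z = g ** (k * rb)        # Alice -> Bob:   y^(1/ra) = g^(k*rb)

def log_g(v: int) -> int:
    """Base-g logarithm of an exact power of g, by repeated division."""
    e = 0
    while v > 1:
        v //= g
        e += 1
    return e

# Eve takes base-g logarithms of the three flows ...
e1, e2, e3 = log_g(x), log_g(y), log_g(z)   # k*ra, k*ra*rb, k*rb
# ... and combines them: k = ((k*ra)*(k*rb)) / (k*ra*rb)
recovered_k = (e1 * e3) // e2
assert recovered_k == k
```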
Of course, this attack used the value of g. This is justified, since in a key exchange protocol, the parties do not have any preshared secret input (see Figure 6.2); indeed, if the parties already share a secret key, why not use it directly? Note that even if g is a preshared secret, the protocol is still vulnerable, with a modified attack (Exercise 6.13).
Exercise 6.12 shows that a similar vulnerability occurs if we use multiplication or addition instead of XOR or exponentiation. However, we will not give up; we next show how we ‘fix’ this protocol, and finally present a protocol which is hoped to be secure. Note that we do not claim that this protocol is secure; indeed, like other designs based on a supposedly computationally-hard problem, a proof that the design is secure is unlikely, as it would imply a proof that P ̸= NP; see Section A.1.
Modular-Exponentiation Key Exchange. We now ‘fix’ the Exponentiation Key Exchange Protocol (Fig. 6.6). The attack against it used the fact that the computations in Fig. 6.6 are done over the field of the real numbers (R), where there are efficient algorithms to compute logarithms. This motivates changing the protocol to use, instead, operations over a group in which the (discrete) logarithm problem is considered hard. Such groups exist, e.g., the ‘mod p’ group, for a safe prime p.
Figure 6.7: The Modular-Exponentiation Key Exchange Protocol, run between Alice and Bob with eavesdropping Eve, where p is a prime and g is a generator of Z*_p. Alice picks k, a ←$ Z*_p ≡ {1, . . . , p − 1} (large random integers < p) and sends x ← g^(k·a) mod p; Bob picks b ←$ Z*_p and replies y ← x^b mod p; Alice sends z ← y^(a^(−1) mod (p−1)) mod p. The values k, a and b are chosen randomly from Z*_p, i.e., integers between 1 and p − 1 (why? see Exercise 6.14). Alice derives kA,B ≡ g^k mod p, and Bob derives kB,A ← z^(b^(−1) mod (p−1)) mod p; Equation 6.17 shows both derive the same key, i.e., kA,B = kB,A.
We present this protocol in Fig. 6.7. Notice that this protocol uses multiplicative inverses in the mod-(p − 1) group, e.g., a^(−1) mod (p − 1) is the number in Z*_(p−1) ≡ {1, . . . , p − 2} such that a · a^(−1) = 1 (mod (p − 1)).
The correctness of the Modular-Exponentiation Key Exchange Protocol follows from the commutativity of modular exponentiation, much like the correctness of the preceding protocols:

kB,A = z^(b^(−1) mod (p−1)) mod p
     = (y^(a^(−1) mod (p−1)))^(b^(−1) mod (p−1)) mod p
     = (x^b)^(b^(−1)·a^(−1) mod (p−1)) mod p
     = (g^(k·a))^(a^(−1) mod (p−1)) mod p
     = g^k mod p = kA,B                                   (6.17)
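A toy run of the protocol of Fig. 6.7 can be sketched as follows. The parameters are tiny and purely illustrative; note also a detail the figure glosses over: a and b must be coprime to p − 1 for the inverses mod (p − 1) to exist.

```python
# Toy run of the Modular-Exponentiation Key Exchange (Fig. 6.7).
# a and b must be invertible mod p-1: gcd(a, p-1) = gcd(b, p-1) = 1.
p, g = 23, 5          # safe prime 23 = 2*11 + 1; 5 generates Z*_23

k, a = 4, 3           # Alice's secrets (gcd(3, 22) = 1)
b = 7                 # Bob's secret    (gcd(7, 22) = 1)

x = pow(g, k * a, p)                 # Alice -> Bob:  g^(k*a) mod p
y = pow(x, b, p)                     # Bob -> Alice:  x^b mod p
z = pow(y, pow(a, -1, p - 1), p)     # Alice -> Bob:  y^(a^-1 mod (p-1)) mod p

k_BA = pow(z, pow(b, -1, p - 1), p)  # Bob's output
k_AB = pow(g, k, p)                  # Alice's output
assert k_AB == k_BA                  # correctness: both equal g^k mod p
```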
Is this protocol secure, i.e., does it ensure indistinguishability? This may depend on the prime p used. One way to try to break the Modular-Exponentiation Key Exchange Protocol of Fig. 6.7 is to compute the (discrete) logarithm of the three values exchanged by the protocol, like the attack above against the ‘regular’ Exponentiation Key Exchange Protocol. This works when the discrete logarithm can be computed efficiently, e.g., when p − 1 is a smooth number, i.e., has only small prime factors.
Examples of primes p s.t. p − 1 is a smooth number. Let us give two simple examples of primes p such that p − 1 is smooth. The first example is of Fermat primes, i.e., primes of the form p = 2^x + 1 for integer x. The second, possibly better³ example is of Pierpont primes, i.e., primes of the form p = 2^x · 3^y + 1 for integers x, y.

³Pierpont primes may be a better example since very large Pierpont primes are known; in fact, the number of Pierpont primes is conjectured to be infinite. In contrast, only five Fermat primes are known, and the largest currently known is 65537 = 2^16 + 1.
In contrast, computing discrete logarithms is believed to be computationally hard for certain moduli, making this attack impractical. In particular, the discrete logarithm is assumed to be computationally hard when the modulus p is a large safe prime, i.e., p = 2q + 1 for some prime q; see Definition 6.4 and Definition 6.3.
The attacker can easily detect if a = 1 or b = 1, and then find k. However, the probability of this choice is only 1/(p − 2), which is exponentially small in the number of bits of p, i.e., negligible (for sufficiently large p). The attacker can also guess some specific value chosen by the parties, say ã ∈ {1, . . . , p − 1} as a guess for a, compute ã^(−1) mod (p − 1), and check if the guess was correct, by comparing z to y^(ã^(−1)) mod p. If the guess was correct, i.e., if z = y^(ã^(−1)) mod p, then ã = a mod (p − 1), and the attacker computes the key: kA,B = x^(a^(−1) mod (p−1)) mod p. Note that there is no advantage for the parties to select the a, b or k exponents from a larger set (not limited to {1, . . . , p − 1}); all the values sent by the protocol, as well as the key, will be exactly the same as when using the corresponding exponents mod (p − 1).
In the following subsection, we present the Diffie-Hellman protocol, which is essentially an improved and simplified variant of the Modular-Exponentiation Key Exchange Protocol of Fig. 6.7.
6.2.3 The Diffie-Hellman Key Exchange Protocol and Hardness Assumptions
Fig. 6.8 presents the Diffie-Hellman (DH) key exchange protocol. The protocol uses the (safe) prime group Z*_p, i.e., multiplication modulo p, where p is a (safe) prime. The protocol assumes a given (public) choice of parameters: the (safe) prime p and the generator g. Recall that the order q of Z*_p is q = p − 1, i.e., g^q = 1 mod p and {1, . . . , p − 1} = {g^i}_(i=1..q).
The protocol consists of only two flows: in the first flow, Alice sends g^a mod p, where a ∈ {1, . . . , p − 1} is a private key chosen randomly by Alice; and in the second flow, Bob responds with g^b mod p, where b is a private key chosen randomly by Bob. The result of the protocol is a shared secret value g^(ab) mod p, computed by Alice as kA,B = (g^b mod p)^a mod p = g^(ba) mod p, and by Bob as kB,A = (g^a mod p)^b mod p = g^(ab) mod p.
The Diffie-Hellman key-exchange protocol is, essentially, a simplified and slightly optimized variant of the Modular-Exponentiation Key Exchange Protocol of Fig. 6.7; in particular, the security of both protocols relies on the difficulty of computing discrete logarithms, and may fail if p − 1 has only small prime factors. The choice of safe primes (p = 2q + 1 for prime q) is hoped to be ‘safe’, i.e., to ensure security.
The basic difference between the two protocols is that in the Diffie-Hellman protocol, the key output is not g^k mod p for some random k, as happens for the Modular-Exponentiation key exchange. Instead, the key being output (exchanged) is
Figure 6.8: The Diffie-Hellman Key Exchange Protocol, run between Alice and Bob with eavesdropping Eve. Alice picks a ←$ Z*_p ≡ {1, . . . , p − 1} (a random integer < p) and sends x ← g^a mod p; Bob picks b ←$ Z*_p and replies y ← g^b mod p. Both output the key kA,B ≡ y^a ≡ (g^b)^a = g^(a·b) = (g^a)^b ≡ x^b ≡ kB,A (mod p). The protocol uses mod p computations, where p is a prime. It is believed to be hard to compute the resulting key g^(ab) mod p when p is a safe prime (p = 2q + 1 where q is a prime).
g^(ab) mod p. This makes the protocol a bit simpler and more efficient: only two flows instead of three, no need to compute inverses (a^(−1), b^(−1) mod (p − 1)), and one less exponentiation.
The correctness of the Diffie-Hellman key exchange protocol, i.e., the fact that kA,B = kB,A, follows from the commutativity of exponentiation (and modular exponentiation), as follows:

kA,B ≡ y^a mod p          (6.18)
     ≡ (g^b)^a mod p      (6.19)
     ≡ g^(a·b) mod p      (6.20)
     ≡ x^b mod p          (6.21)
     = kB,A               (6.22)
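The derivation can be checked with a toy run (illustrative, insecurely small parameters):

```python
# Toy Diffie-Hellman run (Fig. 6.8) over Z*_23.
p, g = 23, 5

a, b = 6, 15               # Alice's and Bob's private keys (toy values)
x = pow(g, a, p)           # Alice -> Bob:  g^a mod p
y = pow(g, b, p)           # Bob -> Alice:  g^b mod p

k_AB = pow(y, a, p)        # Alice: (g^b)^a mod p
k_BA = pow(x, b, p)        # Bob:   (g^a)^b mod p
assert k_AB == k_BA == pow(g, a * b, p)
```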
The following exercise may help to get a better feeling for the protocol and
how it works.
Exercise 6.3 (Diffie-Hellman (DH) Key Exchange). Let p = 7.
1. Is p = 7 a safe prime?
2. Find a generator g for Z∗7 ; show that g is a generator and how you found
it.
3. Alice and Bob run the DH protocol with the prime p and generator g.
Alice selects a = 3, and Bob selects b = 6. Compute the values sent by
Alice and Bob, and show the computation of the shared key by each of
them, resulting in the same value.
Figure 6.9: MitM attack on the DH key-exchange protocol. Alice, Bob and the MitM adversary pick random a, b, e ←$ {1, . . . , p}, respectively. Alice sends g^a mod p, which the adversary intercepts and replaces with g^e mod p; similarly, the adversary intercepts Bob’s g^b mod p and sends g^e mod p to Alice. As a result, Alice computes (g^e)^a = g^(a·e) mod p, Bob computes (g^e)^b = g^(b·e) mod p, and the adversary computes both keys: (g^a)^e = g^(a·e) mod p and (g^b)^e = g^(b·e) mod p. The DH protocol is believed to be secure against an eavesdropping adversary, or if the messages are authenticated.
DH is vulnerable to a MitM attacker. Both the Diffie-Hellman and the Modular-Exponentiation key exchange protocols are insecure against a MitM attacker; they are designed only against an eavesdropping adversary. In fact, as shown in Figure 6.9, all a MitM attacker needs to do is to fake the message from a party, allowing it to impersonate that party (establishing a shared key with the other party). Indeed, in practice, we (almost) always use authenticated variants of the DH protocol, as we discuss in subsection 6.3.1.
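The attack of Fig. 6.9 can be sketched as follows (toy parameters): Eve substitutes g^e mod p for both flows and computes each party's key from the intercepted messages.

```python
# MitM attack on unauthenticated DH (Fig. 6.9), toy parameters.
p, g = 23, 5
a, b, e = 6, 15, 9             # Alice's, Bob's and Eve's secret exponents

x = pow(g, a, p)               # Alice -> (intercepted by Eve)
y = pow(g, b, p)               # Bob   -> (intercepted by Eve)
fake = pow(g, e, p)            # Eve sends g^e mod p to both parties

k_alice = pow(fake, a, p)      # Alice's key: g^(a*e) mod p
k_bob = pow(fake, b, p)        # Bob's key:   g^(b*e) mod p

# Eve computes both keys from the flows she intercepted:
assert pow(x, e, p) == k_alice
assert pow(y, e, p) == k_bob
assert k_alice != k_bob        # Alice and Bob do not even share a key
```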
Security of DH and the Computational DH Assumption. Ok, so the DH protocol is vulnerable to MitM; but can we safely use it against an eavesdropper? Namely, can we assume that the key output by the DH protocol cannot be computed by an eavesdropping adversary, when DH is computed over a group in which the discrete logarithm is assumed to be a computationally-hard problem, e.g., the ‘mod p’ group where p is a safe prime? So far, this has not been proven; there is no proof that if DH is computed (using a safe prime p), then the resulting key cannot be efficiently computed by an adversary. There is not even a proof showing that such an attack against DH would imply an efficient method to compute discrete logarithms (modulo a safe prime p). In fact, these are still important open questions.
The common approach is to assume that when the DH protocol is run using a safe prime p, an eavesdropping adversary cannot guess the resulting shared key. This assumption, which is stronger than the assumption of hardness of discrete-log, is called the Computational DH (CDH) assumption. The CDH assumption essentially means that it is infeasible to compute the DH shared secret g^(ab) mod p, given the two exchanged values g^a mod p and g^b mod p, if p is a sufficiently-long safe prime (i.e., p = 2q + 1 for a prime q). Both the DH and the discrete-log problems are easy for some other values of p, in particular, when p − 1 is a smooth number, i.e., has only small prime factors; see our discussion of such primes in subsection 6.2.2.
Definition 6.6 (Computational DH (CDH) for safe prime groups). The Computational DH (CDH) assumption for safe prime groups holds if there is no efficient (PPT) adversary A that, given a random n-bit safe prime p (i.e., p = 2q + 1 for prime q) and generator g, and the values (g^a mod p, g^b mod p) for random a, b ←$ {1, . . . , p − 1}, returns, with non-negligible probability, g^(ab) mod p. Namely, for every PPT algorithm A and random n-bit safe prime p and generator g holds:

Pr_{a,b ←$ Z*_p} [ A(g^a mod p, g^b mod p) = g^(ab) mod p ] ∈ NEGL(n)    (6.23)
Note that the definition allows the random choice of a = 1 (or b = 1), although for a = 1 we have g^(ab) = g^b. However, the number of possible values in Z*_p is exponential in n, i.e., the probability of such a ‘bad choice’ is negligible.
At least one bit of g^(ab) mod p is exposed! Even assuming that the CDH assumption holds for safe prime groups, an eavesdropper is still able to learn (at least) one bit about g^(ab) mod p. Specifically, an attacker, observing g^a mod p and g^b mod p from a run of the Diffie-Hellman protocol, can efficiently find whether g^(ab) mod p is a quadratic residue modulo p, i.e., if there exists some z ∈ Z*_p such that g^(ab) ≡ z^2 mod p (Definition 6.5). Let us show how.
Claim 6.3. Let p be a prime, g be a generator for Z*_p, a, b be integers, and y ≡ g^(ab) mod p. Given g^a mod p and g^b mod p, we can efficiently deduce if y is a quadratic residue modulo p.

Proof: From Claim 6.1, we can efficiently find if g^a mod p and g^b mod p are quadratic residues modulo p. From Claim 6.2, this gives the least significant bit (parity) of a and of b; obviously, ab is even if and only if a or b is even. Again from Claim 6.2, the least significant bit of ab indicates the quadratic residuosity of g^(ab) mod p.
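The claim can be checked exhaustively for a toy group (illustrative parameters), using Euler's criterion as the quadratic-residuosity test:

```python
# One-bit leak of Claim 6.3: from the eavesdropped g^a and g^b mod p,
# decide the quadratic residuosity of g^(ab) mod p.
p, g = 23, 5    # toy parameters; 5 generates Z*_23

def is_qr(v: int) -> bool:
    # Euler's criterion: v is a QR mod p iff v^((p-1)/2) = 1 mod p
    return pow(v, (p - 1) // 2, p) == 1

# For a generator g, g^a is a QR iff a is even; ab is even iff a or b is.
for a in range(1, p):
    for b in range(1, p):
        eve_view = is_qr(pow(g, a, p)) or is_qr(pow(g, b, p))
        assert eve_view == is_qr(pow(g, a * b, p))
```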
6.2.4 Secure derivation of keys from the DH protocol
An eavesdropper on the DH key exchange can observe g^a mod p and g^b mod p; hence, from Claim 6.3, the attacker can know if y ≡ g^(ab) mod p is a quadratic residue modulo p. Therefore, using y ≡ g^(ab) mod p directly as a key may not be advisable: even assuming that the CDH assumption is true, an eavesdropper can still learn partial information about y (i.e., whether it is a quadratic residue). Notice that while we show only exposure of this information - the
Figure 6.10: The Generalized Diffie-Hellman Key Exchange Protocol, for a group G with order q, run between Alice and Bob with eavesdropping Eve. Alice picks a ←$ {1, . . . , q} and sends x ← g^a; Bob picks b ←$ {1, . . . , q} and replies y ← g^b. Both output the key kA,B = y^a = (g^b)^a = g^(a·b) = (g^a)^b = x^b = kB,A. All operations are group operations, denoted like the usual multiplication notation. For some types of groups, and sufficiently-large order q, it is believed that it is infeasible not only to compute g^(ab) but even to distinguish between g^(ab) and a random group member, i.e., DDH security. The protocol reduces to the ‘regular’ (mod-p) Diffie-Hellman protocol when the group G is Z*_p, the mod-p group for prime p.
quadratic residuosity of g^(ab) mod p - there could be ways to expose more⁴ information without violating the CDH assumption.
So, how can we use the DH protocol to securely exchange a key? One could
simply ignore this concern; but let us discuss two other, more prudent, options.
First option: generalized DH protocol, using DDH groups. The generalized DH protocol can ensure that the value of the derived key kA,B is secret, without any leakage. This protocol uses (and requires) a cyclic group G where the (stronger) Decisional DH (DDH) Assumption is believed to hold. Let us first define this assumption (a bit informally).
Definition 6.7 (The Decisional DH (DDH) Assumption). A group G, with order q, satisfies the Decisional DH (DDH) Assumption if there is no PPT algorithm A that can distinguish, with a significant advantage compared to guessing, between (g^a, g^b, g^(ab)) and (g^a, g^b, g^c), for a, b and c selected randomly from {1, . . . , q}. A group for which the DDH assumption is believed to hold is called a DDH group.
In Figure 6.10, we present the Generalized Diffie-Hellman protocol, using a cyclic group G, rather than the specific group Z*_p used in the original DH protocol. We use the group operation of G, denoted as multiplication, in lieu of the mod-p multiplication used by Z*_p. The protocol ensures key secrecy if G is a DDH group, i.e., if the DDH problem is believed to be computationally infeasible (‘hard’) for G.
The generalized DH protocol assumes an agreed-upon DDH group G and generator g, and a known order q for G. Like the original DH protocol (Figure 6.8),
⁴It may be possible to expose even 80% of the bits [75].
the protocol has only two flows. In the first flow, Alice sends g^a, where a ∈ {1, . . . , q} is a secret chosen randomly by Alice; and in the second flow, Bob responds with g^b, where b ∈ {1, . . . , q} is a secret chosen randomly by Bob. Both ‘exponentiations’ (g^a and g^b) are done by repeatedly applying the group operation (instead of modular multiplication, as in the original DH protocol). The result of the protocol is a shared secret value g^(ab), computed by Alice as kA,B = (g^b)^a = g^(ba), and by Bob as kB,A = (g^a)^b = g^(ab). All ‘exponentiations’ are repeated applications of the group operation of G. Notice that the secret value exchanged, g^(ab), is an element of the group G, i.e., it is not a uniformly random string; this requires mapping of g^(ab) into a random string.
One group where DDH is assumed to hold is Qp , the subgroup of Z∗p
consisting of the quadratic residues in Z∗p , for a safe prime p (i.e., p = 2q + 1,
for prime q). Certain elliptic-curve groups are also believed to be DDH groups.
See other examples in [75].
Second option: extract a secret, random key from the partially-random g^(ab) mod p. The second option is to use the DH protocol as described, i.e., with a safe prime group Z*_p, but to securely extract (derive) a shared key k from g^(ab) mod p. Section 3.5 discusses the two common ways to extract a shared key from mostly-random shared secret data: using either a randomness-extractor hash function or a Key Derivation Function (KDF). This approach requires, basically, that g^(ab) mod p contains ‘sufficient’ randomness, to ensure that the output of the extractor or KDF is pseudorandom; basically, a variant of DDH with respect to the specific extractor or KDF used.
6.3 Using DH for Resiliency to Exposures: the (PFS) Auth-h-DH and (PRS) DH-Ratchet protocols
As discussed above, and demonstrated in Ex. 6.9, the DH protocol is vulnerable to a MitM attacker; it is secure only against a passive, eavesdropping-only attacker. In most practical scenarios, attackers who are able to eavesdrop have some or complete ability to also perform active attacks, such as message injection; it may therefore seem that DH is only applicable to the relatively few scenarios of eavesdropping-only attackers.
In this section, we discuss extensions of the DH protocol, used extensively in practice to improve resiliency to adversaries which have MitM abilities, combined with key-exposure abilities. Specifically, these extensions allow us to ensure Perfect Forward Secrecy (PFS) and Perfect Recover Security (PRS), the two strongest notions of resiliency to key exposures of secure key setup protocols, as presented in Table 5.3 (Section 5.7).
Figure 6.11: The Auth-h-DH Protocol, showing the ith exchange, between Alice and Bob who share a master key MK, against a MitM attacker. Alice picks ai ←$ Z*_p ≡ {1, . . . , p − 1} and sends xi ← g^(ai) mod p together with MAC_MK(xi); Bob picks bi ←$ Z*_p and replies yi ← g^(bi) mod p together with MAC_MK(yi). Alice computes k^(i)_A,B ≡ h(yi^(ai) mod p) and Bob computes k^(i)_B,A ← h(xi^(bi) mod p); the ith session key is ki = k^(i)_A,B = k^(i)_B,A. This protocol is secure against MitM attackers; furthermore, it ensures Perfect Forward Secrecy (PFS), i.e., exposure of current keys does not expose past keys. The protocol, as presented, uses both a MAC function (MAC) and a keyless randomness-extractor hash function h.
6.3.1 The Authenticated DH (Auth-h-DH) protocol: ensuring Perfect Forward Secrecy (PFS)
Assuming that the parties share a secret master key MK, it is quite easy to extend the DH protocol to protect against MitM attackers. All that is required is to use a Message Authentication Code (MAC) scheme to authenticate the DH flows. To ensure security without requiring the use of a DDH group (i.e., the DDH assumption), we may extract the key using an extractor hash function h (or a KDF). See Fig. 6.11, showing Auth-h-DH, the resulting authenticated variant of the DH protocol, using hash h to extract the key.
Correctness follows similarly to the argument for the ‘regular’ Diffie-Hellman protocol:

ki ≡ k^(i)_A,B ≡ h(yi^(ai) mod p)        (6.24)
             = h((g^(bi))^(ai) mod p)    (6.25)
             = h(g^(bi·ai) mod p)        (6.26)
             = h((g^(ai))^(bi) mod p)    (6.27)
             = h(xi^(bi) mod p)          (6.28)
             ≡ k^(i)_B,A                 (6.29)
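One round of Auth-h-DH can be sketched as follows, with HMAC-SHA256 standing in for the MAC and SHA-256 standing in for the extractor hash h; both are illustrative choices, not the book's prescribed instantiations, and the group parameters are toy values.

```python
# One round of Auth-h-DH (Fig. 6.11), sketched.
import hashlib, hmac, secrets

p, g = 23, 5                        # toy group parameters
MK = secrets.token_bytes(32)        # preshared master key

def mac(key: bytes, v: int) -> bytes:
    return hmac.new(key, str(v).encode(), hashlib.sha256).digest()

def h(v: int) -> bytes:             # extractor hash (sketch)
    return hashlib.sha256(str(v).encode()).digest()

a = secrets.randbelow(p - 1) + 1    # Alice's a_i in {1, ..., p-1}
b = secrets.randbelow(p - 1) + 1    # Bob's b_i
x = pow(g, a, p); tag_x = mac(MK, x)   # Alice's flow: x_i, MAC_MK(x_i)
y = pow(g, b, p); tag_y = mac(MK, y)   # Bob's flow:   y_i, MAC_MK(y_i)

# Each party verifies the peer's MAC before deriving the session key.
assert hmac.compare_digest(tag_y, mac(MK, y))
assert hmac.compare_digest(tag_x, mac(MK, x))

k_AB = h(pow(y, a, p))              # Alice: h(y_i^(a_i) mod p)
k_BA = h(pow(x, b, p))              # Bob:   h(x_i^(b_i) mod p)
assert k_AB == k_BA
```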
The next informal lemma presents the security properties of Auth-h-DH. We only give an informal argument for the validity of the lemma.
Lemma 6.1. Assuming the extended CDH assumption, the Auth-h-DH protocol (Fig. 6.11) ensures secure key-setup (indistinguishability) and Perfect Forward Secrecy (PFS), provided that MAC is a secure MAC function and that h is a randomness-extractor hash function.
Argument: Let us first consider a run where the key MK is unknown to the attacker, i.e., was not exposed. Since MAC is a secure Message Authentication Code (MAC), the protocol outputs a key only if the message it receives was sent by the peer (or by itself); in any case, the exponent used (ai or bi) was selected randomly by Alice or Bob and is unknown to the attacker. From the extended CDH assumption (subsection 6.2.4), and the assumption that h is a randomness-extractor hash, it follows that the session key ki (set by Alice to k^(i)_A,B, and by Bob to k^(i)_B,A) is indistinguishable from random, i.e., the Auth-h-DH protocol ensures secure key-setup. Note that the protocol cannot ensure that both parties will generate the same ki, since a MitM attacker may change messages, or simply drop one of the messages (causing the key to be output by only one party).

The PFS property also follows, since it requires that key ki is secure even if MK is exposed, as long as the exposure occurs after session i was completed; our analysis did not exclude such exposure after the session was over.
KDF-based variants of the Auth-h-DH protocol. As mentioned in Section 3.5, keyless generic extractors do not exist, and it is preferable to avoid their use and rely on a Key Derivation Function (KDF), which can also output as many pseudorandom bits as needed. Let us discuss briefly two variants of the Auth-h-DH protocol, which use a KDF instead of a (keyless) extractor.

Variant 1: a two-keyed KDF-based variant of Auth-h-DH. The first variant simply replaces the keyless randomness-extractor hash h of Figure 6.11 with the KDF-extract function, i.e., the key is derived using KDF_salt, where salt is a random and non-secret (known) key, which does not expose the master key MK used by the MAC function.

Variant 2: a secure variant of Auth-h-DH, using a combined KDF/PRF. Another variant of Auth-h-DH uses the same function f and the same master key MK to replace both the MAC function and the KDF function. This variant requires f to satisfy both the MAC and the KDF functionalities. This is a stronger assumption, but it makes the design simpler and more efficient; therefore, it is often preferred.
See Exercise 6.19 for questions related to these and other variants of the
Auth-h-DH protocol. And here is a more basic exercise about the Auth-h-DH
protocol.
Exercise 6.4. Alice and Bob share master key MK and perform the Auth-h-DH protocol daily, at the beginning of every day i, to set up a ‘daily key’ ki for day i. Assume that Mal can eavesdrop on communication between Alice and Bob every day, but can perform MitM attacks only every even day (i s.t. i ≡ 0 (mod 2)). Assume further that Mal is given the master key MK on the fifth day. Could Mal decipher messages sent during day i, for i = 1, . . . , 10? Write your responses in a table.
Protocol            Section   Secure key setup   FS   PFS   RS   PRS
2PP-Key Exchange    5.4.1     ✓                  ✗    ✗     ✗    ✗
FS-ratchet          5.7.1     ✓                  ✓    ✗     ✗    ✗
RS-ratchet          5.7.2     ✓                  ✓    ✗     ✓    ✗
Auth-h-DH           6.3.1     ✓                  ✓    ✓     ✗    ✗
DH-ratchet          6.3.2     ✓                  ✓    ✓     ✓    ✓

Table 6.2: Resiliency to key exposures of Key Exchange protocols. FS: Forward Secrecy; PFS: Perfect Forward Secrecy; RS: Recover Security; PRS: Perfect Recover Security.
Note that the results of Ex. 6.4 imply that the Auth-h-DH protocol does not ensure the Recover Security property. We next show extensions that improve resiliency to key exposures, and specifically recover security after exposure, provided that the attacker does not deploy the MitM ability for one handshake.
6.3.2 The DH-Ratchet protocol: Perfect Forward Secrecy (PFS) and Perfect Recover Security (PRS)
The Auth-h-DH protocol ensures perfect forward secrecy (PFS), but does not ensure recover security, and definitely not perfect recover security (PRS). In fact, a single key exposure, at some point in time, suffices to make all future handshakes vulnerable to a MitM attacker, even if there have been some intermediate handshakes without (any) attacks, i.e., during which the attacker had neither MitM nor eavesdropper capabilities. To see that the Auth-h-DH protocol does not ensure recover security, see Exercise 6.4.

Note that the (shared key) RS-Ratchet protocol presented in subsection 5.7.2 (Fig. 5.20) achieved recovery of security, albeit not Perfect Recover Security (PRS). Namely, the Auth-h-DH protocol does not even strictly improve resiliency compared to the RS-Ratchet protocol (subsection 5.7.2); see Table 6.2.
In this subsection we show how to achieve both PFS and Perfect Recover Security (PRS). Specifically, we present the DH-Ratchet protocol, as illustrated in Fig. 6.12, which ensures both PFS and PRS. This protocol uses a function f, which is assumed to be simultaneously both a PRF and a KDF; this is similar to one of the variants of the Auth-h-DH protocol discussed in subsection 6.3.1.

Like the Auth-h-DH protocol presented above, the DH-Ratchet protocol also authenticates the DH exchange; hence, as long as the authentication key is unknown to the attacker at the time when the protocol is run, the key exchanged by the protocol is secret. The improvement, compared to the Auth-h-DH protocol, is in the key used to authenticate the DH exchange; instead of using a fixed master key (MK), as done by the Auth-h-DH protocol (Fig. 6.11), the DH-Ratchet protocol authenticates the ith DH exchange using the session key exchanged in the previous round (exchange), i.e., ki−1. An initial shared
Figure 6.12: The DH-Ratchet key exchange protocol, showing round i, run between Alice and Bob (who hold ki−1 from the previous round) against a MitM attacker. Alice picks ai ←$ Z*_p ≡ {1, . . . , p − 1} and sends xi ← g^(ai) mod p together with f_(ki−1)(xi); Bob picks bi ←$ Z*_p and replies yi ← g^(bi) mod p together with f_(ki−1)(yi). Alice computes k^(i)_A,B ≡ f_(ki−1)(yi^(ai) mod p) and Bob computes k^(i)_B,A ← f_(ki−1)(xi^(bi) mod p); the ith session key is ki = k^(i)_A,B = k^(i)_B,A. The protocol ensures PFS and PRS against a MitM attacker, assuming that f is both a PRF and a KDF.
secret key k0 is used to authenticate the first round, i.e., i = 1.
Lemma 6.2 (DH-Ratchet ensures PFS and PRS). The DH-Ratchet protocol (Fig. 6.12) ensures secure key-setup with perfect forward secrecy (PFS) and perfect recover security (PRS), assuming that f is both a PRF and a KDF.

Sketch of proof: The PFS property follows, like in Lemma 6.1, from the fact that ki, the session key exchanged during session i, depends on the result of the DH protocol, i.e., is secure against an eavesdropping-only adversary. The protocol also ensures secure key setup, since a MitM adversary cannot learn ki−1 and hence cannot forge the DH messages.

The PRS property follows from the fact that if at some session i′ there is only an eavesdropping adversary, then the resulting key k(i′) is secure, i.e., unknown to the attacker, since this is assured when running DH against an eavesdropping-only adversary. It follows that in the following session (i′ + 1), the key used for authentication is unknown to the attacker; hence the execution is again secure, and results in a new key k(i′+1) which is again secure (unknown to the attacker). This continues, by induction, as long as the attacker is not (somehow) given the key ki of some session i before the parties receive the messages of the following session i + 1.
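A few rounds of the ratchet can be sketched as follows, with HMAC-SHA256 as an illustrative stand-in for the combined PRF/KDF f and toy group parameters; MAC verification of the flows mirrors Auth-h-DH and is elided here.

```python
# Three rounds of the DH-Ratchet (Fig. 6.12), sketched.
import hashlib, hmac, secrets

p, g = 23, 5                          # toy group parameters
k_prev = secrets.token_bytes(32)      # k_0: the initial shared secret key

def f(key: bytes, v: int) -> bytes:
    return hmac.new(key, str(v).encode(), hashlib.sha256).digest()

keys = []
for i in range(1, 4):                 # rounds i = 1, 2, 3
    a = secrets.randbelow(p - 1) + 1  # Alice's a_i
    b = secrets.randbelow(p - 1) + 1  # Bob's b_i
    x, y = pow(g, a, p), pow(g, b, p)
    # The flows also carry f_{k_{i-1}}(x_i) and f_{k_{i-1}}(y_i) as
    # authentication tags; verification is elided in this sketch.
    k_A = f(k_prev, pow(y, a, p))     # Alice's k_i
    k_B = f(k_prev, pow(x, b, p))     # Bob's k_i
    assert k_A == k_B
    k_prev = k_A                      # ratchet: k_i authenticates round i+1
    keys.append(k_A)
```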
Many instant messaging applications use a slightly more advanced version of the DH-Ratchet protocol, usually referred to as the Double Ratchet protocol (or algorithm). The Double-Ratchet protocol does not use the DH-derived keys ki directly to protect the traffic; instead, it derives from ki a series of keys used to protect the traffic. The standard Double-Ratchet protocol is also asynchronous, allowing the two parties to change keys independently and without dependency on time synchronization. The following exercise presents the Synchronous Double-Ratchet protocol, a slightly simplified example of the Double-Ratchet protocol which retains much of its security benefits but assumes synchronized clocks.
Exercise 6.5 (The Synchronous Double-Ratchet protocol). Alice and Bob use low-energy devices to communicate. To ensure secrecy, they run, daily, the DH-Ratchet protocol (Fig. 6.12), but want to further improve security by changing keys every hour. However, to save energy and time, the hourly process should use only very efficient computations, and no exponentiations. Let k_i^j denote the key they share after the jth hour of the ith day, where k_i^0 = ki (the key exchanged in the ‘daily exchange’ of Fig. 6.12).

1. Show how Alice and Bob should set their hourly shared secret key k_i^j.

2. Identify the security benefits of your solution, compared to the ‘regular’ DH-Ratchet protocol.
Solution:

1. k_i^j = f_(k_i^(j−1))(1) (the value 1 is arbitrary, of course).

2. The protocol uses k_i^j as the ‘session key’, i.e., the key used to protect the traffic in the jth hour of the ith day. Assume the first hour of the day is numbered j = 1. The advantage is that exposure of k_i^j, for any hour j > 0, does not expose the ‘ratchet master key’ of that day, k_i = k_i^0, nor the keys of earlier hours (that would require inverting the PRF f). Hence, such exposure does not expose traffic sent in earlier hours or in other days; note, however, that later hourly keys of the same day are computable from k_i^j by iterating f. This is in contrast with the DH-Ratchet protocol, where exposure of the session key ki, used throughout day i, exposes all the traffic of day i, and furthermore exposes future traffic until the key is recovered (in a day in which the attacker does not eavesdrop).
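The hourly schedule can be sketched as follows, again with HMAC-SHA256 as an illustrative stand-in for the PRF/KDF f:

```python
# Hourly key schedule of Exercise 6.5: k_i^j = f_{k_i^{j-1}}(1).
import hashlib, hmac, secrets

def f(key: bytes, v: int) -> bytes:
    return hmac.new(key, str(v).encode(), hashlib.sha256).digest()

k_day = secrets.token_bytes(32)    # k_i^0: the daily DH-Ratchet key
hourly = [k_day]
for j in range(1, 25):             # derive keys for hours 1..24
    hourly.append(f(hourly[-1], 1))

# Exposing hour j's key does not reveal k_i^0 or earlier hours (that
# would require inverting the PRF), though later hours follow from it.
assert len(set(hourly)) == 25      # all distinct (with overwhelming prob.)
```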
6.4 The DH and El-Gamal Public Key Cryptosystems
In this section, we discuss two related public-key cryptosystems (PKCs), both based on the discrete-logarithm problem: the (well-known) El-Gamal PKC and the Diffie-Hellman (DH) PKC, which is basically a transformation of the DH key exchange protocol into a public-key cryptosystem. In the next section (Section 6.5), we present a third system, the (well-known) RSA public-key cryptosystem.
6.4.1 The DH PKC and the Hashed DH PKC
In their seminal paper [123], Diffie and Hellman presented the concept of public-key cryptosystems, but did not present an implementation. On the other hand, they did present the DH key exchange protocol (Figure 6.8). We next show that a minor tweak allows us to turn the DH key exchange protocol into a PKC; we accordingly refer to this PKC as the DH PKC, and to a variant of it that also uses a hash function h as the DH-h PKC.
Applied Introduction to Cryptography and Cybersecurity
CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY
[Figure omitted: Alice knows d_A, g, p and computes e_A = g^{d_A} mod p; Bob knows e_A, g, p. On input message m, Bob selects b ←$ {1, . . . , p − 1} and sends the pair (g^b mod p, m ⊕ ((e_A)^b mod p)) to Alice.]
Figure 6.13: The DH public key cryptosystem (DH PKC). Bob encrypts plaintext message m, using Alice's public key e_A and the public parameters: a safe prime p and a generator g of the group Z_p^*. The ciphertext consists of the pair of strings (g^b mod p, m ⊕ ((e_A)^b mod p)). The value b is selected randomly and known only to Bob, who should select a new b for each encryption.
The DH PKC. Let us first present the DH public key cryptosystem (DH PKC), illustrated in Figure 6.13. As can be seen, this public key cryptosystem is essentially an adaptation of the DH key exchange protocol (Figure 6.8), using a safe prime p. Essentially, instead of Alice selecting a random secret a and sending g^a mod p to Bob in the first flow of the DH protocol, Alice selects a fixed private key d_A, exactly in the same way, i.e., d_A ←$ {1, . . . , p − 1}. Next, Alice computes her public key e_A as: e_A ≡ g^{d_A} mod p.
Bob encrypts a message m using Alice's public key e_A by selecting a random value b ←$ {1, . . . , p − 1}, and then computing two values: g^b mod p and m ⊕ ((e_A)^b mod p). Bob sends these two values to Alice; note that this is essentially Bob's role in the DH protocol. Namely, Bob computes the ciphertext according to Equation 6.30:

E_{e_A}(m) = [ b ←$ {1, . . . , p − 1}; Return (g^b mod p, m ⊕ ((e_A)^b mod p)) ]    (6.30)

Notice that the ciphertext is the pair of these two values.
Upon receiving such a ciphertext, which we denote (c_b, c_m), Alice can decrypt it by computing:

D_{d_A}(c_b, c_m) = c_m ⊕ ((c_b)^{d_A} mod p)    (6.31)
To see that the DH PKC ensures correctness, i.e., that decryption recovers the plaintext, we observe that:

D_{d_A}(E_{e_A}(m)) = D_{d_A}(g^b mod p, m ⊕ ((e_A)^b mod p))
  = (m ⊕ ((e_A)^b mod p)) ⊕ ((g^b)^{d_A} mod p)
  = (m ⊕ (g^{b·d_A} mod p)) ⊕ (g^{b·d_A} mod p)
  = m
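The correctness argument above is easy to check in a few lines of Python. This is only a sketch: the parameters (p = 23, g = 5) are toy values chosen so the arithmetic is visible, not remotely secure sizes, and the function names are ours:

```python
import secrets

# Toy safe prime p = 23 = 2*11 + 1 and a generator g of Z_p*.
# Real deployments require a safe prime of roughly 3000 bits.
p, g = 23, 5

def keygen():
    d_A = secrets.randbelow(p - 1) + 1        # d_A in {1, ..., p-1}
    return d_A, pow(g, d_A, p)                # private key, public key e_A = g^d_A mod p

def encrypt(e_A, m):
    b = secrets.randbelow(p - 1) + 1          # fresh b for each encryption
    return pow(g, b, p), m ^ pow(e_A, b, p)   # (g^b mod p, m XOR (e_A)^b mod p)

def decrypt(d_A, c_b, c_m):
    return c_m ^ pow(c_b, d_A, p)             # XOR with the same 'one-time key'

d_A, e_A = keygen()
for m in range(p):                            # plaintexts encoded as small integers
    assert decrypt(d_A, *encrypt(e_A, m)) == m
```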
The security of DH PKC and the Hashed DH PKC. Intuitively, the security of the DH PKC seems to follow from the CDH assumption (Definition 6.6). Let us present a reduction argument which supports this intuition. Assume an attacker A_DHPKC is able to learn a random message m from e_A and E_{e_A}(m). Then we can design an attacker A_CDH that will be able to compute g^{ab} mod p, given g^a mod p and g^b mod p, as follows. Given g^a mod p and g^b mod p, the attacker A_CDH defines e_A = g^a mod p and c_b = g^b mod p, and selects a uniformly random string c_m; note that (c_b, c_m) is a valid encryption, under e_A, of the message m = c_m ⊕ (g^{ab} mod p). Hence, when A_DHPKC outputs this m, the attacker A_CDH outputs c_m ⊕ m = g^{ab} mod p.
However, this argument has a flaw; even if the CDH assumption holds, the DH PKC may not be a secure encryption. Specifically, the argument was based on the assumption that the attacker A_DHPKC is able to learn the message m. However, a secure encryption scheme should also prevent disclosure of partial information about the plaintext, as formalized by the indistinguishability test for public key cryptosystems, PKC IND-CPA (Definition 2.10).
In fact, Claim 6.3 shows that an attacker may learn partial information about g^{ab} mod p from the public values g^a mod p and g^b mod p - even if the CDH assumption holds. This may, therefore, expose partial information about m when using the DH public key cryptosystem (DH PKC) as presented in Figure 6.13.
One solution is to modify the design. Recall that the DH PKC uses the mod p group (with a safe prime p). We could, instead, use a DDH group (Definition 6.7), i.e., a cyclic group G that is believed to hide all partial information, as in subsection 6.2.4.
Fig. 6.14 presents a different solution: the Hashed DH PKC. In the Hashed DH PKC, we apply a cryptographic hash function h to (e_A)^b, before XORing it with the message. Namely, we compute c_m = m ⊕ h((e_A)^b mod p). Specifically, this should be a randomness-extractor hash function. The output of a randomness-extractor hash h should be indistinguishable from random; hence, it should hide all partial information about the key. The ciphertext is, therefore, the pair:

E_{e_A}(m) ≡ (g^b mod p, m ⊕ h((e_A)^b mod p))
Yet another variant of the DH PKC uses a keyed key-derivation function, computing KDF_s(g^{d_A·b} mod p) with a uniformly-random key/salt s. See the discussion of randomness-extractor hash functions and key derivation functions (KDFs) in subsection 3.5.3.
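A sketch of the hashed variant, using SHA-256 as a stand-in for the randomness-extractor hash h (again with toy group parameters, and with function names of our choosing):

```python
import hashlib, secrets

p, g = 23, 5   # toy parameters; the hash plays the randomness-extractor role

def h(x: int) -> int:
    # hash the 'one-time key' (e_A)^b mod p into a 256-bit mask
    return int.from_bytes(hashlib.sha256(x.to_bytes(32, 'big')).digest(), 'big')

d_A = secrets.randbelow(p - 1) + 1
e_A = pow(g, d_A, p)

def encrypt(m: int):
    b = secrets.randbelow(p - 1) + 1
    return pow(g, b, p), m ^ h(pow(e_A, b, p))    # c_m = m XOR h((e_A)^b mod p)

def decrypt(c_b, c_m):
    return c_m ^ h(pow(c_b, d_A, p))              # same mask: h((c_b)^d_A mod p)

m = 1234567
assert decrypt(*encrypt(m)) == m
```

Correctness holds because both parties mask with the same value h(g^{b·d_A} mod p).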
Exercise 6.6. Show the correctness of the DH-h PKC.
[Figure omitted: same message flow as Figure 6.13, except that Bob sends (g^b mod p, m ⊕ h((e_A)^b mod p)).]
Figure 6.14: The Hashed DH Public Key Cryptosystem (Hashed DH PKC): same as the DH PKC (Figure 6.13), except for hashing the 'one-time key' (e_A)^b mod p.
6.4.2 The El-Gamal PKC
The El-Gamal PKC is another encryption scheme based on the DH key exchange
protocol, which is closely related to the DH PKC. As for the DH PKC, we
discuss three variants of the El-Gamal PKC: (1) using the mod p group (for a safe prime p), (2) using a DDH group, and (3) Hashed El-Gamal.
The original design of the El-Gamal PKC, in [158], uses multiplications mod p where p is a safe prime, like the DH PKC. Key generation is also done as in the DH PKC, i.e., Alice selects her private key randomly as d_A ←$ {2, . . . , p − 1} and computes her public key e_A as: e_A ≡ g^{d_A} mod p. Even the encryption process is similar to the DH PKC: Bob selects a random value b ∈ [2, p − 1], and computes and sends to Alice a pair of values (c_b, c_m), where c_b ≡ g^b mod p and c_m ≡ m · (e_A)^b mod p, as in Equation 6.32.
E_{e_A}(m) = (c_b, c_m) ≡ (g^b mod p, m · (e_A)^b mod p)    (6.32)
The difference between the original El-Gamal PKC and the DH PKC is in how Bob uses the (e_A)^b value to encrypt the message m. In the original El-Gamal PKC, Bob multiplies m, i.e., computes c_m ← m · (e_A)^b mod p, while in the DH PKC, Bob uses exclusive-or, i.e., computes c_m ← m ⊕ (e_A)^b. Decryption is also modified accordingly, by using (modular) division instead of exclusive-or, i.e.:

D_{d_A}(c_b, c_m) = c_m / (c_b)^{d_A} mod p = c_m · (c_b)^{−d_A} mod p    (6.33)
Correctness follows similarly to DH PKC.
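A toy sketch of original El-Gamal encryption and decryption (Equations 6.32 and 6.33), with illustrative parameters and function names of our choosing; Python's three-argument pow with a negative exponent (Python 3.8+) computes the modular inverse (c_b)^{−d_A} mod p:

```python
import secrets

p, g = 23, 5   # toy safe prime and generator; plaintexts are elements of Z_p*

d_A = secrets.randbelow(p - 1) + 1
e_A = pow(g, d_A, p)

def encrypt(m):
    b = secrets.randbelow(p - 1) + 1
    return pow(g, b, p), (m * pow(e_A, b, p)) % p    # Equation 6.32

def decrypt(c_b, c_m):
    return (c_m * pow(c_b, -d_A, p)) % p             # Equation 6.33: c_m * c_b^{-d_A}

m = 17
assert decrypt(*encrypt(m)) == m
```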
Unfortunately, similarly to DH PKC, the original El-Gamal PKC may expose
partial information about the plaintext message m. And, like for DH PKC,
there are two solutions, both similar to the corresponding DH PKC solutions: Hashed El-Gamal, or using El-Gamal with a DDH group (Definition 6.7).
The first solution, Hashed El-Gamal, is similar to the Hashed DH PKC. Namely, in the encryption process, we hash the 'one-time pad' (e_A)^b before using it to hide the message m. Namely, the ciphertext is, as usual, a pair (c_b, c_m), except that c_m is computed as: c_m ← m · h((e_A)^b mod p). We also use the hash in decryption:

D_{d_A}(c_b, c_m) = c_m / h((c_b)^{d_A} mod p) = c_m · [h((c_b)^{d_A} mod p)]^{−1} mod p    (6.34)
The second solution, using El-Gamal with a DDH group, is similar to the use of a DDH group with the DH PKC. Namely, computations are done over a cyclic group G believed to satisfy the DDH assumption (Definition 6.7). Using the usual convention, where we denote the operation of group G in the same way that we normally denote multiplication over the reals (or integers), the encrypt and decrypt operations of El-Gamal become even a bit simpler (compared to Equation 6.32 and Equation 6.33):

E_{e_A}(m) = (g^b, m · (e_A)^b);   D_{d_A}(c_b, c_m) = c_m · (c_b)^{−d_A}    (6.35)
The El-Gamal with a DDH group is believed to be secure, i.e., to prevent
disclosure of any partial information about the plaintext. In addition, the DDH
El-Gamal PKC is multiplicative homomorphic; we explain this property and
some of its important applications in the following section.
Let us describe the DDH El-Gamal PKC in more detail; we also illustrate it in Fig. 6.15. We use g to denote a generator of G, and q to denote the order of G, i.e., G = {g^1, . . . , g^q}. Alice selects her private key d_A as a random exponent, i.e., d_A ←$ {1, . . . , q}; and her public key is simply e_A = g^{d_A}. Notice that we use the standard notations of multiplication and exponentiation for the corresponding group operations of G. As shown in Fig. 6.15, the El-Gamal encryption of plaintext m ∈ G, denoted E_{e_A}(m), is computed as follows:
E_{e_A}(m) ← [ b ←$ {1, . . . , q}; Return (g^b, m · (e_A)^b) ]    (6.36)
El-Gamal decryption is defined as:

D_{d_A}(c_b, c_m) = (c_b)^{−d_A} · c_m    (6.37)
The correctness property holds since for every message m ∈ G:

D_{d_A}(g^b, m · (e_A)^b) = (g^b)^{−d_A} · m · (g^{d_A})^b    (6.38)
  = g^{−b·d_A} · m · g^{b·d_A}
  = m
[Figure omitted: Alice holds private key d_A ←$ {1, . . . , q} and public key e_A ← g^{d_A}; a MitM attacker observes the channel. Bob, to send plaintext m, selects b ←$ {1, . . . , q} and sends (c_b, c_m), where c_b ← g^b and c_m ← (e_A)^b · m. Alice outputs m′ = (c_b)^{−d_A} · c_m; correctness: m′ = (g^b)^{−d_A} · (g^{d_A})^b · m = m.]
Figure 6.15: The El-Gamal Public-Key Cryptosystem using a DDH cyclic group G, whose order we denote by q; exponentiations and multiplications are done using the corresponding operations of G. The private key of Alice, denoted d_A, and the value b used by Bob, are both selected randomly from the set {1, . . . , q}. Alice computes her public key as e_A = g^{d_A}.
Exercise 6.7. In this exercise, use the mod p modular group (where p is a
prime) to compute (the original) El-Gamal encryption. Let p = 5.
1. Find a generator for Z∗p . (There are only three candidates to try!)
2. Let’s select the private key of Alice as dA = 2. Compute Alice’s public
key, eA = g dA mod p.
3. Compute El-Gamal encryption of 4 and of 3: c4 ≡ EeA (4), c3 ≡ EeA (3).
Comment: this is a randomized encryption, so another encryption may
result in a different output!
4. Compute the decryptions of c4 and of c3 .
5. Explain why El-Gamal encryption using mod p group - even for large
safe prime p - does not satisfy the requirements of secure encryption.
6.4.3 El-Gamal is Multiplicative-Homomorphic Encryption
An encryption scheme (E, D) is multiplicative homomorphic if there is an operation, which we denote ×, defined over a pair of ciphertext messages, such that for every public-private key pair (e, d) and every pair of plaintext messages m1, m2, it holds that Ee(m1) × Ee(m2) is an encryption of m1 · m2, namely:

m1 · m2 = Dd(Ee(m1) × Ee(m2))    (6.39)

where m1 · m2 denotes the multiplication operation over plaintexts, e.g., integer multiplication, or the operation of the group from which plaintexts are taken.
In this section, we show that the (non-hashed) El-Gamal cryptosystem is multiplicative homomorphic, and discuss some of the applications of this property. We focus on the use of a DDH group G, i.e., a cyclic group which is believed to satisfy the DDH assumption (Definition 6.7).
It is convenient to think of × as a 'multiplication of ciphertexts' operation. Following this, the homomorphic property basically means that the multiplication of two ciphertexts, Ee(m1) × Ee(m2), is equivalent to the encryption of the multiplication of the two messages, Ee(m1 · m2). Notice that here m1 · m2 is done using the group operation rather than normal multiplication.
The El-Gamal × operation is also similar to multiplication. Recall that in the El-Gamal PKC, ciphertexts consist of pairs (c_b, c_m) of elements from the group G. The × operator, applied to a pair of ciphertexts (c_b, c_m) and (c_b′, c_m′), is defined as:

(c_b, c_m) × (c_b′, c_m′) ≡ (c_b · c_b′, c_m · c_m′)    (6.40)
The following lemma shows that the × operator correctly computes the
encryption of the multiplication of the two plaintext messages m and m′ whose
El-Gamal encryptions are (cb , cm ) and (cb′ , cm′ ), respectively.
Lemma 6.3. Let d_A ←$ {1, . . . , q} and e_A = g^{d_A}. Then for any two messages m, m′ ∈ G, and any encryptions E_{e_A}(m), E_{e_A}(m′) of them, it holds that:

m · m′ = D_{d_A}(E_{e_A}(m) × E_{e_A}(m′))    (6.41)
Proof: Let b be the random exponent used to compute E_{e_A}(m), and b′ be the random exponent used to compute E_{e_A}(m′). Then:

E_{e_A}(m) = (g^b, m · (e_A)^b)    (6.42)
E_{e_A}(m′) = (g^{b′}, m′ · (e_A)^{b′})    (6.43)

Hence:

E_{e_A}(m) × E_{e_A}(m′) = (g^{b+b′}, m · m′ · (e_A)^{b+b′})    (6.44)

And:

D_{d_A}(E_{e_A}(m) × E_{e_A}(m′)) = (g^{b+b′})^{−d_A} · m · m′ · (g^{d_A})^{b+b′}    (6.45)
  = m · m′    (6.46)
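The lemma is easy to check numerically. The following sketch (toy parameters, function names ours) multiplies two El-Gamal ciphertexts component-wise, per Equation 6.40, and decrypts the product:

```python
import secrets

p, g = 23, 5   # toy group Z_p*; real use requires cryptographic sizes
d_A = secrets.randbelow(p - 1) + 1
e_A = pow(g, d_A, p)

def encrypt(m):
    b = secrets.randbelow(p - 1) + 1
    return pow(g, b, p), (m * pow(e_A, b, p)) % p

def decrypt(c_b, c_m):
    return (c_m * pow(c_b, -d_A, p)) % p

def ctxt_mul(c1, c2):
    # Equation 6.40: component-wise multiplication of ciphertext pairs
    return (c1[0] * c2[0]) % p, (c1[1] * c2[1]) % p

m1, m2 = 3, 4
c = ctxt_mul(encrypt(m1), encrypt(m2))
assert decrypt(*c) == (m1 * m2) % p   # decrypts to m1 * m2, as the lemma states
```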
Exercise 6.8. Use the values of p, g from Exercise 6.7, and perform all multiplications mod p.
1. Compute cM ≡ c3 × c4 .
2. Compute the decryption of cM . Explain why the result is as expected from
the lemma.
Note: it is not secure to use the mod p group for El-Gamal (last item in Exercise 6.7).
6.4.4 Types and Applications of Homomorphic Encryption
In the next section, we discuss another multiplicative-homomorphic scheme, the textbook RSA PKC. Just as we defined multiplicative-homomorphic encryption, we can define other types of homomorphic encryption. One obvious example is additive-homomorphic encryption. There are also encryption schemes which are homomorphic with respect to multiple operations - each using a different operator, of course.
In particular, encryption schemes which are homomorphic with respect to both multiplication and addition are referred to as Fully Homomorphic Encryption (FHE). FHE is a very powerful tool; it allows computation of the encryption of an arbitrary function of the plaintext, given only the ciphertexts. For example, we can encrypt the (secret) inputs, send them to an untrusted computation server which will compute the encryption of the function over the inputs, and then decrypt the results - without exposing the inputs to the untrusted server. Namely, FHE allows arbitrary computations over encrypted data. Such schemes have different applications, e.g., in cloud computing, where an untrusted cloud service performs some computation on encrypted values.
However, known FHE schemes are complex and have significant overhead, in terms of computation time and/or key/ciphertext length. This is in comparison to Partially-Homomorphic Encryption (PHE) schemes, which are homomorphic with respect to only one operation, e.g., multiplication. In some applications, a single operation suffices. For example, multiplicative-homomorphic encryption allows multiplication of ciphertexts, as we have just seen; given two ciphertexts, we can compute the encryption of their multiplication - without knowing the plaintexts or the decryption key.
In this section, we will give a glimpse of the important applications of homomorphic encryption, limiting ourselves to multiplicative-homomorphic encryption. We focus on the El-Gamal multiplicative-homomorphic encryption, and demonstrate how it can be used for applications requiring anonymity, and specifically for anonymous voting. This is a tiny taste of the extensive research on the use of cryptography to ensure privacy, anonymity, and secure and private voting.
Secure and private voting. Secure voting is essential for democracy; and one of the main requirements is, usually, voting privacy, i.e., preserving the confidentiality of the vote of each individual, and only exposing the tally of the entire population. This may be achieved by use of physical designs such as a ballot box, or by trusted voting machines.
There is extensive research on the use of cryptography to ensure secure electronic voting. We focus on voter privacy, i.e., ensuring the secrecy of the votes of specific individuals.
First, let us consider a trivial design for an e-voting system: voters encrypt
their votes with the public key of a trusted server, to which they then send
their votes; the server decrypts and then tallies the votes. This system requires
complete trust in the server; in particular, the server can trivially know the vote of each voter.
[Figure omitted: Alice, Bob and Cora each hold e_DS, the public key of the Decrypt server, and send their encrypted votes E_{e_DS}(p_{i_A}), E_{e_DS}(p_{i_B}), E_{e_DS}(p_{i_C}) to the Tally server; the Tally server sends E_{e_DS}(p_{i_A} · p_{i_B} · p_{i_C}) to the Decrypt server (holding d_DS), which decrypts and outputs p_{i_A} · p_{i_B} · p_{i_C}.]
Figure 6.16: Example: Privacy-preserving voting using two servers and multiplicative-homomorphic encryption, e.g., the El-Gamal PKC; p_i > 1 is a small prime number assigned to candidate i. Voters encrypt the identifier of their candidate using e_DS, the public key of the Decrypt server, and send it to the Tally server. The Tally server sends E_{e_DS}(p_{i_A} · p_{i_B} · p_{i_C}) to the Decrypt server, which outputs the combined vote p_{i_A} · p_{i_B} · p_{i_C}. By factoring this, we find how many votes were given to each candidate i.
To ensure voter privacy, we separate the two functions of the server and use two servers: a tally server and a decrypt server. We also switch the order of operations: the encrypted votes are sent first to the tally server, which aggregates all of them into a single (encrypted) value; this value is then sent to the decrypt server, which decrypts it to produce the final outcome.
This privacy-preserving voting process is shown in Fig. 6.16. As shown, each candidate i is assigned a unique small prime number p_i > 1. Each voter, e.g., Alice, selects one candidate, say i_A (with identifier p_{i_A}), and sends E_{e_DS}(p_{i_A}) to the tally server. The tally server combines the encrypted votes by computing x ≡ E_{e_DS}(p_{i_A}) × E_{e_DS}(p_{i_B}) × E_{e_DS}(p_{i_C}), and sends x to the decrypt server. From Equation 6.39, x = E_{e_DS}(p_{i_A}) × E_{e_DS}(p_{i_B}) × E_{e_DS}(p_{i_C}) = E_{e_DS}(p_{i_A} · p_{i_B} · p_{i_C}), i.e., x is the encryption of p_{i_A} · p_{i_B} · p_{i_C}. Hence, the decrypt server outputs p_{i_A} · p_{i_B} · p_{i_C}, i.e., the combined vote. By factoring the combined vote, we find how many votes were given to each candidate i. Note that this factoring operation is efficient, since we know exactly all possible factors. Also, note that for the factoring to provide the correct result, the combined vote must always be less than p.
Obviously, this is not a complete description of a secure voting system. A
complete system would not only ensure voter privacy, but also prevent cheating
by users and by the servers.
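The two-server voting flow of Fig. 6.16 can be sketched as follows. The parameters (p = 47 = 2·23 + 1, g = 5) and all names are illustrative choices of ours, and, as noted above, the sketch omits all the anti-cheating machinery a real system would need:

```python
import secrets

p, g = 47, 5                 # toy safe prime; large enough that 2*2*3 = 12 < p
candidates = [2, 3]          # each candidate is assigned a distinct small prime

d_DS = secrets.randbelow(p - 1) + 1   # decrypt server's private key
e_DS = pow(g, d_DS, p)                # published to all voters

def encrypt(m):
    b = secrets.randbelow(p - 1) + 1
    return pow(g, b, p), (m * pow(e_DS, b, p)) % p

def decrypt(c_b, c_m):
    return (c_m * pow(c_b, -d_DS, p)) % p

# Alice and Bob vote for candidate 2, Cora for candidate 3
ballots = [encrypt(2), encrypt(2), encrypt(3)]

# Tally server: multiply ciphertexts component-wise, without decrypting
tally = ballots[0]
for c in ballots[1:]:
    tally = ((tally[0] * c[0]) % p, (tally[1] * c[1]) % p)

combined = decrypt(*tally)   # decrypt server outputs 2 * 2 * 3 = 12

# Factor the combined vote to recover per-candidate counts (all factors known)
counts = {}
for q in candidates:
    counts[q] = 0
    while combined % q == 0:
        combined //= q
        counts[q] += 1

assert counts == {2: 2, 3: 1}
```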
Exercise 6.9. We continue with p = 5 and same g and public key from
Exercise 6.8; now use this as the public key of the Decrypt server, eDS . Let
there be two candidates: 2 and 3 (actually, we can't have more with p = 5 - why?).
1. Alice and Bob vote 2 and Cora votes 3; compute the encrypted votes.
2. Compute the encrypted combined vote (output of Tally).
3. Decrypt the combined vote.
4. The output may not be correct; explain why.
5. Show at least one way in which a corrupt party (voter, tally server or
decrypt server) can manipulate the results.
Another application: re-encryption. Let us point out another application of multiplicative-homomorphic encryption, such as the El-Gamal PKC: re-encryption. Namely, consider an encryption (c_m, c_b) = E_{e_A}(m) of plaintext m using public key e_A. Let (c_1, c_{b,1}) ← E_{e_A}(1) be an encryption of 1 using e_A, and let (c′_m, c′_b) ← (c_m, c_b) × (c_1, c_{b,1}). From the homomorphic property (Equation 6.39), (c′_m, c′_b) is an encryption of m · 1 = m, i.e., it is also an encryption of m. A further property of re-encryption is that an adversary cannot distinguish between the re-encryption of (c_m, c_b) and an encryption of a different message m̂ ≠ m. Re-encryption is used in different protocols, often for anonymous communication; there are other solutions to ensure anonymity, including Tor and anonymous remailers [113], all based on cryptography.
Re-encryption preserves the same decryption key. El-Gamal also allows a similar, but different, mechanism, called proxy re-encryption, where another entity, called a proxy, is given a special key, denoted e_{A→B} and computed by Alice, that allows the proxy to transform a ciphertext encrypted with the key of Alice, c = E_{e_A}(m), into an encryption of the same message under Bob's key, c′ = E_{e_B}(m). Proxy re-encryption has been proposed for different applications, such as monitoring of encrypted traffic. For more details on this mechanism and its applications, see [69].
Re-encryption (and proxy re-encryption) requires knowledge of the public key e_A with which the message was encrypted. In some applications, it is desirable to allow re-encryption without specifying the public key, e.g., for recipient anonymity. In such cases, one can use an elegant extension called universal re-encryption, which allows re-encryption without knowledge of the encryption key e_A. This is done by appending the encryption E_{e_A}(1) to each ciphertext; see details in [171].
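Basic re-encryption, as described above, is a one-liner on top of the homomorphic × operation. A sketch with toy parameters (function names ours):

```python
import secrets

p, g = 23, 5   # toy group parameters, illustrative only
d_A = secrets.randbelow(p - 1) + 1
e_A = pow(g, d_A, p)

def encrypt(m):
    b = secrets.randbelow(p - 1) + 1
    return pow(g, b, p), (m * pow(e_A, b, p)) % p

def decrypt(c_b, c_m):
    return (c_m * pow(c_b, -d_A, p)) % p

def re_encrypt(c):
    one = encrypt(1)                           # fresh encryption of the identity
    return (c[0] * one[0]) % p, (c[1] * one[1]) % p

m = 17
c = encrypt(m)
c2 = re_encrypt(c)                             # fresh-looking ciphertext
assert decrypt(*c2) == m                       # still decrypts to m
```

Note that re-encryption needs only the public key e_A, never the plaintext or d_A.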
Homomorphic encryption cannot be IND-CCA secure! We see that homomorphic encryption has some nice applications. However, there is a caveat: homomorphic encryption cannot be IND-CCA secure. This is especially easy to see considering the re-encryption application. Namely, an attacker can re-encrypt the encryption c∗ = E_e(m∗) of a challenge message m∗, resulting in a ciphertext c′ ≠ c∗ which also decrypts to m∗. The CCA attack allows decryption of c′, since c′ ≠ c∗, and this provides the attacker with the challenge message m∗. This argument can be extended to show that homomorphic encryption cannot even ensure 'relaxed CCA security' (rCCA), where the attacker cannot make a ciphertext query which decrypts to the challenge plaintext (such as c′); see Exercise 6.29.
The homomorphic property is so useful that it has multiple applications, such as those mentioned above, in spite of the fact that it cannot be IND-CCA (or even IND-rCCA) secure. However, this does require extra care and expertise. For example, we briefly mentioned that the textbook RSA PKC is multiplicative-homomorphic. We will see that this fact was, indeed, abused in an important attack against RSA, which motivates the use of RSA with a padding mechanism (which makes it non-homomorphic). With that, let us proceed to discuss RSA.
6.5 The RSA Public-Key Cryptosystem
In 1978, Rivest, Shamir and Adleman presented the first proposal for a public-key cryptosystem - as well as a digital signature scheme [334]. This beautiful scheme is usually referred to by the initials of its inventors, i.e., RSA; its inventors were awarded the Turing Award in 2002, and the scheme is still widely used. We will cover here only some basic details of RSA; a more in-depth study is recommended, by taking a course in cryptography and/or reading one of the many books on cryptography covering RSA in depth.
The reader may want to review subsection A.2.2 and Section A.2.3 before learning this section.
6.5.1 RSA key generation
Key generation in RSA is more complex than for DH and El-Gamal. We first list the steps, and then explain them:
• Select a pair of large prime numbers p, q; specifically, both p and q would have N/2 bits. Let n = p · q and let ϕ_n = (p − 1) · (q − 1). As a result, n would have (at least) N bits.
• Select a value e which is co-prime to ϕ_n, i.e., gcd(e, ϕ_n) = 1.
• Compute d s.t. e · d mod ϕ_n = 1.
• The public key is (e, n) and the private key is (d, n).
Selecting e to be co-prime to ϕ_n is necessary - and sufficient - to ensure that e has a multiplicative inverse d modulo ϕ_n. To find the inverse d, we can use the extended Euclidean algorithm. This algorithm efficiently finds numbers d, x s.t. e · d + ϕ_n · x = gcd(e, ϕ_n) = 1; namely, e · d = 1 mod ϕ_n. See subsection A.2.2.
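The key-generation steps above, including the extended Euclidean computation of d, can be sketched as follows (with toy primes far too small for real use, and function names of our choosing):

```python
def egcd(a, b):
    # extended Euclidean algorithm: returns (g, x, y) with a*x + b*y = g = gcd(a, b)
    if b == 0:
        return a, 1, 0
    g, x, y = egcd(b, a % b)
    return g, y, x - (a // b) * y

def rsa_keygen(p, q, e):
    n, phi = p * q, (p - 1) * (q - 1)
    g, d, _ = egcd(e, phi)                 # e*d + phi*x = 1, so d = e^{-1} mod phi
    assert g == 1, "e must be co-prime to phi(n)"
    return (e, n), (d % phi, n)            # public key (e, n), private key (d, n)

# toy primes; real keys need p and q of roughly 1500 bits each
public, private = rsa_keygen(61, 53, e=17)
assert (public[0] * private[0]) % ((61 - 1) * (53 - 1)) == 1
```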
The public key of RSA is (e, n) and the private key is (d, n), since the
modulus n is required for both encryption and decryption. However, we - and
others - often abuse notation and refer to the keys simply as e and d.
6.5.2 Textbook RSA: encryption, decryption, and signing
The RSA cryptosystem is based on the RSA-encrypt function E_e^RSA(m), applied to plaintext m using the public key e, and the RSA-decrypt function D_d^RSA(c), applied to ciphertext c using the private key d. These functions are computed as follows:

E_e^RSA(m) = m^e mod n    (6.47)
D_d^RSA(c) = c^d mod n    (6.48)
Here, the message m is encoded as a positive integer, and limited to m < n, ensuring that m = m mod n. In subsection 6.5.4 we show that this ensures correctness, i.e., correct decryption, namely:

For every message m and c ← E_e^RSA(m) holds: m = D_d^RSA(c)    (6.49)
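A sketch of the RSA-encrypt and RSA-decrypt functions of Equations 6.47 and 6.48, using a classic toy keypair (n = 61 · 53 = 3233, e = 17, d = 2753); this is textbook RSA, shown only to illustrate correctness - as discussed below, it must not be used as-is:

```python
e, d, n = 17, 2753, 3233   # toy keypair: n = 61 * 53, and 17 * 2753 = 1 mod 3120

def rsa_encrypt(m):
    return pow(m, e, n)     # Equation 6.47: c = m^e mod n

def rsa_decrypt(c):
    return pow(c, d, n)     # Equation 6.48: m = c^d mod n

m = 65                      # plaintext encoded as a positive integer m < n
c = rsa_encrypt(m)
assert rsa_decrypt(c) == m  # correctness, per Equation 6.49
```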
We use the term textbook-RSA encryption for RSA encryption performed by directly applying the RSA-encrypt function to the plaintext, without any padding or other preprocessing, i.e., using E^RSA and D^RSA as defined above. As we explain in subsection 6.5.5, textbook RSA encryption has significant vulnerabilities. Therefore, in practice, the input to RSA is always processed; this preprocessing of the input is referred to as padding.
Padding is defined by a pair of functions (pad, unpad). The input to pad, and the output of unpad, are plaintext messages; and the two functions should ensure that unpadding of a padded message recovers the message as it was before padding, namely:

(∀m) m = unpad(pad(m))    (6.50)

Therefore, for every correct public key cryptosystem (KG, E, D) and a corresponding keypair (e, d), i.e., (e, d) ← KG(1^l), holds the correctness of padded encryption:

(∀m, (e, d) ← KG(1^l)) m = unpad(D_d(E_e(pad(m))))    (6.51)
Let us now consider the specific case of (padded) RSA encryption. From the correctness of RSA (Equation 6.49) follows the correctness of padded RSA. Namely, for every message m and c ← E_e^RSA(pad(m)) holds m = unpad(D_d^RSA(c)). Or:

(∀m) m = unpad(([pad(m)]^e mod n)^d mod n)    (6.52)
We illustrate textbook-RSA vs. padded-RSA in Figure 6.17, and discuss
standard RSA padding in subsection 6.5.6.
[Figure omitted: (a) plaintext m → RSA-encrypt: c ← m^e mod n → ciphertext c. (b) plaintext m → Pad: M ← Pad(m) → RSA-encrypt: c ← M^e mod n → ciphertext c.]
(a) Textbook RSA, presented in subsection 6.5.2 and shown to be vulnerable in subsection 6.5.5. (b) Padded RSA, usually following the PKCS#1 standard; see subsection 6.5.6. Version 1.5 of PKCS#1 is vulnerable; version 2 of PKCS#1, also referred to as OAEP, is considered secure. We use M for the padded plaintext (M = Pad(m)).
Figure 6.17: RSA encryption: (a) textbook RSA (vulnerable) vs. (b) padded RSA.
Textbook RSA signatures: encrypting with private key? RSA is also the basis for a signature scheme, which we discuss in subsection 6.6.1. Signing uses the same key-generation process as explained above, except that we usually denote the public verification key by v (instead of public encryption key e) and the private signing key by s (instead of private decryption key d).
Similarly to encryption, we can also define textbook RSA signatures, by computing the RSA function over the message using the private signing key s, as in:

Sign_s^RSA(m) = m^s mod n    (6.53)
Verify_v^RSA(m, σ) = True if m = σ^v mod n, False otherwise    (6.54)
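Textbook RSA signing and verification (Equations 6.53 and 6.54) can be sketched with the same toy modulus as before; again, this is purely illustrative - as explained next, textbook signatures are never used in practice:

```python
v, s, n = 17, 2753, 3233    # toy keypair: verification key v, signing key s

def sign(m):
    return pow(m, s, n)                 # Equation 6.53: sigma = m^s mod n

def verify(m, sigma):
    return m == pow(sigma, v, n)        # Equation 6.54

m = 42
sigma = sign(m)
assert verify(m, sigma)                 # valid signature accepted
assert not verify(m + 1, sigma)         # signature on a different message rejected
```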
In practice, textbook RSA signatures are never used. One reason is that textbook RSA signatures can only be applied if the message m is less than n, which is rarely, if ever, the case. Therefore, in practice, RSA signatures always involve some additional processing, typically using Hash-then-Sign (subsection 3.2.6), i.e., computing Sign_s^RSA(h(m)). Using textbook RSA signatures also introduces vulnerabilities, which are similar to those outlined for textbook RSA encryption in subsection 6.5.5.
Notice that the textbook RSA signing equation, Equation 6.53, is exactly the same as the textbook RSA encryption equation, Equation 6.47, except for the use of the private signing key s instead of the public encryption key e. Therefore, you may find RSA signing referred to as 'encryption with the private key'. We recommend avoiding this expression, since it applies only to textbook RSA signatures and textbook RSA encryption, which are both vulnerable and not used in practice, and not to the secure RSA signatures and encryption (using padding and/or hashing). See further discussion of RSA signatures in subsection 6.6.1.
6.5.3 Efficiency of RSA
The RSA algorithms are conceptually simple; however, their computation requires considerable resources. The basic reason is that RSA security completely breaks down if an attacker is able to factor n and find its factors (p and q); this allows computation of ϕ(n) = (p − 1) · (q − 1), and, using ϕ(n), the computation of the private key d from the public key e, since d = e^{−1} mod ϕ(n) (knowing ϕ(n), we can efficiently compute multiplicative inverses).
Factoring is a well-studied problem, and while no polynomial-time factoring algorithm is known, there are algorithms that improve efficiency considerably. As a result, RSA keys should be quite long, as shown in Table 6.1; notice that the key length should be chosen based on the maximal time at which the encryption should remain secret, and not based on the current time. For example, for information whose confidentiality should be preserved until 2040, the modulus n should be about 3000 bits, i.e., p and q should be random primes of about 1500 bits.
Computations with such extremely long numbers, especially exponentiation, are computationally intensive. The computations are modulo n, which keeps the results from becoming even longer; but this requires computation of the modulus of the result (and, optionally, of intermediate values), and the computation of the modulus is also computationally intensive.
Therefore, efficiency is a major consideration for implementation of RSA. In
this subsection, we discuss only one of the most basic optimizations: choosing e
to improve efficiency.
The public exponent e is not secret, and, so far, we only required it to be co-prime to ϕ(n). This motivates choosing an e that will improve efficiency - usually, to make encryption faster.
In particular, choosing e = 3 implies that encryption - i.e., computing m^e mod n - requires only two multiplications, i.e., is very efficient (compared to exponentiation by a larger number). Note, however, that there are several concerns with such an extremely small e. In particular, if m is also small, specifically if c = m^e < n (without reducing mod n), then we can efficiently decrypt by taking the e-th root: c^{1/e} = (m^e)^{1/e} = m. This particular concern is relatively easily addressed by padding, as discussed below; however, there are several additional attacks on RSA with a very low exponent e, e.g., [105]. Some of these attacks apply when a party may encrypt and send the same message, or 'highly related' messages, to multiple recipients. These attacks motivate (1) the use of padding to break any possible relationships between messages, as well as (2) the choice of a slightly larger e, such as 17 or 2^16 + 1 = 65537. The reason to choose these specific primes is that exponentiation requires only 5 or 17 multiplications, respectively; see the next exercise.
Exercise 6.10. Given integer m, show how to compute m^17 in only five multiplications, and m^(2^16 + 1) in only 17 multiplications.
Hint: use the following idea: compute m^8 with three multiplications, as m^8 = ((m^2)^2)^2.
Handling long plaintext: hybrid encryption. To complete our discussion of RSA efficiency, let us comment that RSA encryption, like other public-key cryptosystems, uses hybrid encryption to encrypt long messages. This is necessary, since the input m to the RSA-encrypt function must be less than the modulus n, or decryption will output m mod n, which will differ from m. Theoretically, we could have selected n to be longer than the longest message, but this would have resulted in excessive overhead. Therefore, we select n based on the security requirements (Table 6.1), which makes it much shorter than (normal) plaintext. To encrypt ‘normal’ messages, which are typically much longer, we apply hybrid encryption. In hybrid encryption, we use the public key encryption to encrypt a shared key k, and then use the shared key to efficiently encrypt the long message m. See subsection 6.1.6 and Figure 6.3.
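As a rough illustration of this structure, here is a toy hybrid-encryption sketch in Python. The tiny primes, the `xor_stream` stand-in symmetric cipher, and all function names are ours for illustration only; none of this is secure for real use:

```python
import hashlib, secrets

# Toy RSA parameters (real deployments use moduli of 2048 bits or more).
p, q, e = 1009, 3643, 17
n, d = p * q, pow(17, -1, (p - 1) * (q - 1))

def xor_stream(key: bytes, data: bytes) -> bytes:
    # Stand-in symmetric cipher: SHA-256(key || counter) keystream, XORed with data.
    out, i = b"", 0
    while len(out) < len(data):
        out += hashlib.sha256(key + i.to_bytes(4, "big")).digest()
        i += 1
    return bytes(a ^ b for a, b in zip(data, out))

def hybrid_encrypt(msg: bytes):
    k = secrets.randbelow(n - 2) + 2               # fresh shared key, must be < n
    c_key = pow(k, e, n)                           # public-key part: RSA on the key k
    c_msg = xor_stream(k.to_bytes(8, "big"), msg)  # symmetric part: msg under k
    return c_key, c_msg

def hybrid_decrypt(c_key, c_msg):
    k = pow(c_key, d, n)                           # recover k with the private key d
    return xor_stream(k.to_bytes(8, "big"), c_msg)

c_key, c_msg = hybrid_encrypt(b"a message far longer than the modulus allows")
assert hybrid_decrypt(c_key, c_msg) == b"a message far longer than the modulus allows"
```

Note that only the short key k goes through the (slow) RSA operation; the message itself, of arbitrary length, is handled by the fast symmetric cipher.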
6.5.4 Correctness of RSA
Does RSA decryption really work? Obviously, yes, it does; we will now explain why it does. Before we ‘really’ explain this, an exercise may give some intuition. To solve the exercise (and understand the following discussion), the reader may want to refresh on multiplicative inverses (subsection A.2.2) and Euler’s function and theorem (Section A.2.3).
Exercise 6.11 (Textbook RSA ensures correctness, i.e., decryption recovers
message). Let p = 7.
1. Recall: Z∗p, for a prime p, is the group containing the numbers from 1 to p − 1, with the modular multiplication operation. A generator for Z∗p is a number g ∈ {1, . . . , p − 1} such that by multiplying g by itself enough times, each time modulo p, we get all the numbers in Z∗p. Find a generator g for Z∗p; show that g is a generator and how you found it.
2. What is ϕ(p)?
3. Let q = 11, and let n = q · p. Compute ϕ(q), ϕ(p) and ϕ(pq); for each
of them, compute directly from the definition, and using the relevant
facts/lemmas that we learned or that appear in the textbook.
4. Let e = 11 be an RSA encryption key for the modulus n; compute the
corresponding private key d. Note: this is a correction; previously we had
e = 3 - do you see why that value wasn’t good?
5. Encode your initials by mapping the letters (A to Z) to the corresponding
numeric values (from 1 to 26), resulting in f, l ∈ {1, 2, . . . , 26}. Compute
m = f + l + 7.
6. Compute c = E^RSA_{e,n}(m) = m^e mod n.
7. Compute m′ = D^RSA_{d,n}(c) = c^d mod n.
8. Explain why this encryption is insecure - yet why the use of this value
e = 11 may be secure in other applications of RSA.
Let us now ‘really’ explain why textbook RSA decryption recovers the plaintext, i.e., the correctness of textbook RSA (Equation 6.49), namely, (∀m) D^RSA_d(E^RSA_e(m)) = m.
RSA’s correctness is based on Euler’s Theorem (Theorem A.3), which says that for any co-prime integers m, n holds m^ϕ(n) = 1 mod n, where ϕ(n) is the Euler function, defined as the number of positive integers which are less than n and co-prime to n. See Section A.2.3. We use the theorem to explain RSA’s correctness, i.e., why D^RSA_d(E^RSA_e(m)) = m.
Note that for any primes p, q, it holds that ϕ(p) = p − 1, ϕ(q) = q − 1, and ϕ(p · q) = (p − 1)(q − 1) (Lemma A.1). This is the reason for us using ϕ_n = ϕ(n) = ϕ(p · q) = (p − 1)(q − 1) in the RSA key generation process.
Recall that e · d = 1 mod ϕ(n), i.e., for some integer i it holds that e · d = 1 + i · ϕ(n). Hence:

m^(e·d) mod n = m^(1+i·ϕ(n)) mod n = m · (m^ϕ(n))^i mod n

Recall Eq. (A.5): a^b mod n = (a mod n)^b mod n. Assuming that m and n are co-prime (gcd(m, n) = 1), we can apply Euler’s theorem, substitute m^ϕ(n) mod n = 1, and receive:

m^(e·d) mod n = m · (m^ϕ(n) mod n)^i mod n    (6.55)
             = m · 1^i mod n = m mod n    (6.56)
It follows that:

D^RSA_d(E^RSA_e(m)) = D^RSA_d(m^e mod n)
                    = (m^e mod n)^d mod n
                    = m^(e·d) mod n
                    = m · (m^ϕ(n))^i mod n    (6.57)
                    = m · 1^i mod n
                    = m mod n = m
Notice that m mod n = m since we restricted plaintext messages m so that m < n. Namely, under the assumption (above) that m and n are co-prime, we have shown the correctness of textbook RSA.
What about the assumption that m and n are co-prime? Unfortunately, it does not always hold. Recall that the message m may be any positive integer s.t. 1 < m < n. Most, but not all, of these possible messages - i.e., integers smaller than n - are co-prime to n. In fact, the number of integers smaller than n and co-prime to n is exactly the definition of ϕ(n), which we know to be:
ϕ(n) = ϕ(p · q) = (p − 1) · (q − 1) = n − q − p + 1.
So most possible messages m are indeed co-prime to n. Still, p + q − 2 messages are not co-prime to n; this number is much smaller than n but is still polynomial in n (roughly n^(1/2)), and our explanation does not hold for these values. We assure the reader, however, that correctness holds also for these values; it ‘just’ requires a slightly more elaborate argument. Such arguments, usually using the Chinese Remainder Theorem, can be found in many textbooks on cryptography and number theory, e.g., [205].
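A quick numerical check (toy parameters, for illustration only) confirms that decryption recovers every message m, including the p + q − 2 messages that are not co-prime to n:

```python
p, q, e = 61, 53, 17
n = p * q                      # 3233
phi = (p - 1) * (q - 1)        # 3120
d = pow(e, -1, phi)            # private exponent, e*d = 1 mod phi

# Correctness holds for every 1 <= m < n, including multiples of p or q
# (the messages NOT co-prime to n), as the CRT argument guarantees.
for m in range(1, n):
    assert pow(pow(m, e, n), d, n) == m
print("decryption recovers every m, co-prime to n or not")
```

In particular, m = p = 61 (clearly not co-prime to n) still decrypts correctly, matching the Chinese-Remainder-Theorem argument referenced above.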
6.5.5 The RSA assumption and the vulnerability of textbook RSA
Now that we have seen that the textbook RSA PKC ensures correctness, it is time to discuss its security. We will first discuss the underlying security assumption, and then discuss several vulnerabilities of textbook RSA, which are the reason that, in practice, we always use padded RSA, which we discuss in subsection 6.5.6.
The security of RSA encryption is based on the RSA assumption. Intuitively,
the RSA assumption is that there is only negligible probability that an efficient
adversary A correctly recovers the plaintext m, given the ciphertext me mod n
and the public key (e, n). Let us restate the RSA assumption a bit more formally.
Definition 6.8 (RSA assumption). Choose n, e as explained above, i.e., n = pq for p, q chosen as random l-bit prime numbers, and e co-prime to ϕ_n. The RSA assumption is that for any efficient (PPT) algorithm A and constant c, and for sufficiently large l, holds:

Pr[A((e, n), m^e mod n) = m] ∈ NEGL(l)    (6.58)

where m is chosen randomly: m ←$ [2, n − 2].
The RSA assumption is also referred to sometimes as the RSA trapdoor
one-way permutation assumption. The ‘trapdoor’ refers to the fact that d is a
‘trapdoor’ that allows inversion of RSA; the ‘one-way’ refers to the fact that
computing RSA (given public key (e, n)) is easy, but inverting is ‘hard’; and the
‘permutation’ is due to RSA being a permutation (and in particular, invertible).
See also the related concept of one-way functions in §3.4.
One obvious question is the relation between the RSA assumption and the assumption that factoring is hard. Assume that some adversary A_F can efficiently factor large numbers, specifically, the modulus n (which is part of the RSA public key). Hence, A_F can factor n, find q and p, compute ϕ_n and proceed to compute the decryption key d, given the public key ⟨e, n⟩, just as done in RSA key generation. We therefore conclude that if factoring is easy, i.e., there exists such an adversary A_F, then the RSA assumption cannot hold (and RSA is insecure).
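This argument takes only a few lines of Python to sketch (toy parameters; `recover_private_key` is our illustrative name):

```python
# Given the factorization of n, recovering the private key is easy:
def recover_private_key(n, e, p, q):
    assert p * q == n                 # p, q obtained by (hypothetically) factoring n
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)            # same computation as honest key generation

p, q, e = 61, 53, 17
n = p * q
d = recover_private_key(n, e, p, q)

m = 65
assert pow(pow(m, e, n), d, n) == m   # the recovered d decrypts correctly
```

Whether the converse holds - i.e., whether breaking RSA is as hard as factoring - remains, of course, an open question.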
Textbook RSA is multiplicative-homomorphic - and vulnerable. Assume now that we are willing to accept the RSA assumption. What, then, about the security of RSA, when used as a public key cryptosystem (PKC)? In this subsection, we discuss textbook RSA, and argue that it is vulnerable; this motivates the PKCS#1 specifications for ‘padded RSA encryption’, which we discuss in the next subsection.
Before we discuss the vulnerabilities of textbook RSA, let us point out an important property of it: textbook RSA is multiplicative-homomorphic. Indeed, this follows quite simply:

E^RSA_e(m1 · m2) = (m1 · m2)^e mod n
                 = m1^e · m2^e mod n
                 = E^RSA_e(m1) · E^RSA_e(m2) mod n
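A toy numerical check of this homomorphic property (illustrative parameters):

```python
p, q, e = 61, 53, 17
n = p * q

m1, m2 = 42, 99
c1, c2 = pow(m1, e, n), pow(m2, e, n)

# The encryption of the product equals the product of the encryptions (mod n):
assert pow(m1 * m2 % n, e, n) == c1 * c2 % n
```

An attacker can thus transform the ciphertext of m into a valid ciphertext of m · s mod n, for any s of its choice, without knowing m; this is exactly the malleability exploited later in Bleichenbacher's attack.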
As we already discussed in subsection 6.4.3, a homomorphic encryption scheme cannot be IND-CCA secure. This is a drawback, requiring extra care in the design of secure applications using a homomorphic encryption scheme. However, in the case of textbook RSA, there are additional vulnerabilities, which make its use clearly inadvisable:
1. Unlike El-Gamal PKC, the textbook RSA PKC is deterministic; hence, encryption of the same plaintext m will always result in the same ciphertext c = m^e mod n. Suppose that the attacker guesses (or knows) a set of likely (or possible) plaintexts, say m1, m2 and m3. The attacker can easily compute, say, c1 = m1^e mod n; if the plaintext message m was the same as m1, then c = c1. Textbook RSA encryption resembles, in this sense, the insecure ECB mode of operation (Section 2.8), which has the same vulnerability. Secure encryption⁵ - shared key or public key - must be randomized and/or stateful!
2. Textbook-RSA is vulnerable to low-exponent attacks, especially when sending low-value messages (small m); we mentioned the trivial attack when m^e < n. See more elaborate low-exponent attacks in [105]; these exploit scenarios where we send identical or related messages to multiple recipients.
⁵ However, some designs of cloud databases employ deterministic encryption, specifically to facilitate identification of encryptions of the same element. Of course, this must be done with great care to avoid unintended exposure.
3. The RSA assumption does not rule out a potential exposure of partial information about the plaintext, e.g., a particular bit. The log(n) least-significant bits were shown to be as secure as the entire preimage [10]; however, it may be possible to expose other bits, like the one bit of the discrete log (see subsection 6.1.8).
4. Finally, while every homomorphic encryption scheme is vulnerable to CCA attacks, textbook RSA is also vulnerable to a much weaker version of CCA attacks, where the attacker only needs to receive very limited information about the ciphertext. In subsection 6.5.7 we present Bleichenbacher’s attack, which can effectively ‘break’ textbook RSA given only a one-bit error indicator, specifically, an indication whether the result of textbook RSA decryption is a ‘properly padded plaintext’. Furthermore, this attack breaks not only ‘textbook RSA’, but also some versions of padded RSA, in particular, when using the (widely-used and very simple) RSA PKCS#1 version 1.5 padding (defined in Equation 6.59). We discuss padded RSA in the following subsection.
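Vulnerability 1 is easy to demonstrate. In this toy Python sketch (illustrative parameters), the attacker recovers the plaintext simply by re-encrypting candidate guesses under the public key:

```python
p, q, e = 61, 53, 17
n = p * q

# Victim encrypts one of a few likely plaintexts with textbook (deterministic) RSA:
secret = 1123
c = pow(secret, e, n)

# Attacker, knowing the candidate set and the public key (e, n),
# re-encrypts every guess and compares against the observed ciphertext:
candidates = [17, 512, 1123, 2500]
recovered = [m for m in candidates if pow(m, e, n) == c]
assert recovered == [1123]
```

Since RSA is a permutation, exactly one candidate matches; no private key is needed when the plaintext space is small and guessable.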
6.5.6 Padded RSA encryption: PKCS#1 v1.5 and OAEP
In the previous subsection, we saw that textbook RSA has significant vulnerabilities, making its use inadvisable. Therefore, to improve security, practical deployments always use padded RSA, i.e., apply a pad function to the message before applying the encryption operation, and a corresponding unpad function to recover the plaintext after decryption. The unpad function should recover the message before padding, which ensures the correctness of padded RSA, i.e., for every message m we always have m = unpad(D^RSA_d(E^RSA_e(pad(m)))) (Equation 6.52).
Practical deployments of RSA encryption usually follow one of the versions of the PKCS #1 specifications⁶ [386].
Before version 2.0, PKCS#1 defined only one padding, which we refer to as the v1.5 padding; version 2.0 added OAEP padding, based on the design from [43]. We briefly discuss both of these widely used padding schemes.
PKCS#1 version 1.5 padding. The v1.5 padding was designed mainly
to address two of the vulnerabilities of textbook RSA: vulnerability 1 (due to
deterministic output) and vulnerability 2 (low-exponent attacks). Namely:
• To prevent an attacker from identifying multiple encryptions of the same
plaintext, possibly by the attacker encrypting some guesses of possible
plaintexts, the padding would include a sufficient number of random bits.
⁶ PKCS stands for Public Key Cryptography Standards. The PKCS specifications were published by RSA Security LLC; versions 1.5, 2.0, 2.1 and 2.2 were defined by the IETF, in RFCs [219, 224, 225] and [386], respectively.
• To prevent the low-exponent attacks (vulnerability 2), prepend the message
bytes with the random bits. To further ensure a sufficiently large value
for the padded-plaintext, prepend this with a non-zero byte.
Specifically, the PKCS#1 version 1.5 padding algorithm, Pad_v1.5(·), is defined as follows. Given an input (pre-padding) plaintext message m, and a random string r of at least eight non-zero random bytes, we compute the padded message M = Pad_v1.5(m) as:

M = Pad_v1.5(m) = 0x00 ++ 0x02 ++ r ++ 0x00 ++ m    (6.59)
To ensure that the binary value of M is less than n, as required for correct decryption, the (pre-padding) message m must contain at most l − 11 bytes, where l is the length of the modulus n (in bytes). The value of the second byte (0x02) prevents low-exponent attacks (by being non-zero). In addition, the fact that the second byte is 0x02 identifies that the operation applied to the input was RSA encryption, and not, say, RSA signing.
In the decryption process, we first obtain the padded message M, from which we can easily extract and return only the plaintext m. The message should be returned only if the padding is correct, i.e., begins with 0x0002, followed by at least eight non-zero bytes, and finally followed by a zero byte. If M deviates from this in any way, an error indicator is returned.
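The padding and unpadding steps can be sketched as follows. This is a simplified Python illustration, not a conforming PKCS#1 implementation; the function names are ours:

```python
import secrets

def pad_v15(m: bytes, l: int) -> bytes:
    # l = modulus length in bytes; m may contain at most l - 11 bytes.
    assert len(m) <= l - 11
    # Random pad of non-zero bytes, filling the message up to l bytes total:
    r = bytes(secrets.choice(range(1, 256)) for _ in range(l - 3 - len(m)))
    return b"\x00\x02" + r + b"\x00" + m

def unpad_v15(M: bytes) -> bytes:
    # Return m only if the padding is correct; otherwise signal an error.
    if M[:2] != b"\x00\x02":
        raise ValueError("bad padding")
    sep = M.index(b"\x00", 2)          # first zero byte after the random pad
    if sep < 10:                       # fewer than 8 non-zero random bytes
        raise ValueError("bad padding")
    return M[sep + 1:]

M = pad_v15(b"hello", 32)
assert unpad_v15(M) == b"hello"
```

Note that the error indicator raised on malformed padding is precisely the one-bit feedback that Bleichenbacher's attack, discussed in subsection 6.5.7, exploits.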
As the readers hopefully agree, the PKCS#1 version 1.5 padding, defined in Equation 6.59, is simple to understand and easy to implement. These properties are part of the reason that this padding was quickly adopted by many systems, and is still quite widely deployed. However, the PKCS#1 version 1.5 padding turns out to be vulnerable to some attacks. The first of these was Bleichenbacher’s Padding Side Channel Attack, which we discuss in the next subsection (and see Figure 6.21).
Optimal Asymmetric Encryption Padding (OAEP). A more secure padding, called Optimal Asymmetric Encryption Padding (OAEP), was proposed by Bellare and Rogaway [43]. From version 2.0 of the PKCS#1 standard until version 2.2 [386], the current version, the standard includes both OAEP and the PKCS#1 v1.5 padding. However, OAEP should be used whenever possible: it is more secure, without noticeable extra overhead.
Intuitively, OAEP further improves the security of RSA by introducing two mechanisms, each addressing one of the two main vulnerabilities of the v1.5 padding:
Mix all bits: To deal with the concern that RSA may expose partial information (vulnerability 3), OAEP mixes up all the bits of the plaintext before applying textbook-RSA encryption. The mixing makes it necessary to expose many or all bits of the input to the textbook RSA encrypt function, in order to expose any information about the plaintext.
Redundancy against chosen-ciphertext attacks: To foil chosen-ciphertext attacks (CCA), OAEP adds redundancy to the plaintext before applying encryption to it. The OAEP decryption process returns the decrypted plaintext only if it contains the same redundancy. This should make it infeasible for the attacker to learn sensitive information by CCA, i.e., sending manipulated ciphertext messages, since, almost always, their decryption would not have the correct redundancy (and be rejected). Bellare and Rogaway coined the term plaintext-aware encryption for encryption with such a validation function as part of the decryption process, where an attacker must know (‘be aware of’) a plaintext m in order to create a ciphertext c which will be decrypted into m. If the attacker produces ciphertext c′ without knowing a corresponding plaintext m′, then the decryption of c′ is bound to result in an error indication rather than in a decrypted plaintext m′.
Following [43], we first present a simplified version of the OAEP padding, which only implements the ‘mix all bits’ mechanism, to prevent exposure of partial information, but does not add redundancy to defend against CCA attacks. Later, we describe the (non-simplified) OAEP padding, which extends the simplified padding and also adds redundancy to defend against CCA attacks.
Simplified OAEP padding (P ad2 (·)). The simpliőed OAEP padding (P ad2 (·)),
illustrated in Figure 6.18, is already quite clever, and therefore, let us develop
it in three stages: P ad0 , P ad1 and then P ad2 . First, for P ad0 , consider an
attacker which can only expose a single bit in the preimage of the RSA encryption; of course we don’t know which bit. To prevent such attacker from exposing
any bit of the plaintext, we let P ad0 őrst select a random string r of the same
length as the plaintext m, and then output m ⊕ r +
+ r. Namely, padding P ad0
selects a random one-time pad r to m; the padded message M consists of the
‘encrypted’ plaintext m ⊕ r, concatenated to the ‘pad’ r.
Obviously, the Pad0 padding is inefficient. First, it only protects against exposure of a single bit; what if we can expose two bits of the preimage, e.g., the first bit of m ⊕ r and the (corresponding) first bit of r? Second, do we really need to send such a long pad (as long as the plaintext m)?
We address both issues in Pad1, by using a shorter random string r, and then applying a cryptographic hash function g, whose output is as long as the plaintext m; we now use g(r) as the one-time pad. Clearly, we reduced the overhead, since r is shorter. Furthermore, if we view g as a ‘random oracle’, then security also improved: even if the attacker knows several - but not all - bits of the random string r, the output bits of g are still random.
Suppose, however, that the attacker can expose |r| bits of the preimage of RSA - or, specifically, all the bits of the (short) random string r. Of course, r cannot be too short if we rely on it to randomize g(r), so this would imply that the RSA encryption function is much weaker than we expect; but still, it would be nice to protect also against such a case - if we can do it efficiently - and we can, as is done by Pad2.
[Figure 6.18 here: a random k-bit string r and the n-bit plaintext m are combined as x = m ⊕ g(r) and y = r ⊕ f(x); the padded message is M = Pad2(m) = (m ⊕ g(r)) ++ (r ⊕ f(m ⊕ g(r))), and the ciphertext is c = E^RSA_e(M) = M^e mod n.]
Figure 6.18: Simplified-OAEP (Pad2(·)) padded RSA encryption. The Pad2(·) padding goal is to protect against partial exposure of the RSA preimage, but it does not validate that the ciphertext is the result of applying encryption, i.e., does not ensure plaintext-awareness.
In fact, in this third (and final) simplified padding, Pad2, we simply apply again the ‘hash then one-time pad’ method of Pad1 - this time, to ‘protect r’. More specifically, given plaintext m, the Pad2 algorithm first computes m ⊕ g(r), for a random string r, and then outputs the padding (m ⊕ g(r)) ++ (r ⊕ f[m ⊕ g(r)]), where f is yet another cryptographic hash function. To further clarify, let us write down both the Pad2 and the corresponding UnPad2 functions:

Pad2(m) ≡ (m ⊕ g(r)) ++ (r ⊕ f(m ⊕ g(r)))
UnPad2(c) ≡ c[0 : n − 1] ⊕ g(f(c[0 : n − 1]) ⊕ c[n : n + k − 1])    (6.60)
The UnPad2 function takes advantage of the known size of the two components of Pad2(m, r) (the n-bit m ⊕ g(r) and the k-bit r ⊕ f(m ⊕ g(r))).
As the reader can easily confirm, this padding is correct, i.e., m = UnPad2(Pad2(m)).
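A toy Python sketch of Pad2/UnPad2 (using SHA-256 to instantiate both g and f, with illustrative sizes n = 256 and k = 128 bits) confirms the roundtrip:

```python
import hashlib, secrets

N_BITS, K_BITS = 256, 128   # |m| = n bits, |r| = k bits (toy sizes)

def g(r: bytes) -> bytes:   # hash with n-bit output (masks the plaintext)
    return hashlib.sha256(r).digest()

def f(x: bytes) -> bytes:   # hash with k-bit output (masks r)
    return hashlib.sha256(x).digest()[:K_BITS // 8]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))

def pad2(m: bytes) -> bytes:
    r = secrets.token_bytes(K_BITS // 8)
    x = xor(m, g(r))               # x = m XOR g(r)
    y = xor(r, f(x))               # y = r XOR f(x)
    return x + y

def unpad2(M: bytes) -> bytes:
    x, y = M[:N_BITS // 8], M[N_BITS // 8:]
    r = xor(y, f(x))               # first recover r ...
    return xor(x, g(r))            # ... then unmask m

m = secrets.token_bytes(N_BITS // 8)
assert unpad2(pad2(m)) == m
```

Unpadding must recover r before it can unmask m, mirroring Equation 6.60.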
OAEP padding. Finally, we describe the ‘complete’ OAEP padding, as presented in [43] and later standardized by the IETF [386]. The OAEP padding adds to the ‘simplified padding’ (Pad2 above) a simple redundancy mechanism that ensures plaintext-awareness, and thereby provides an effective defense against CCA attacks. Our description of the padding and its security is a bit simplified, but we believe it suffices for our (educational) purposes; there are also (minor) differences between the details of the design in [43] and the one standardized by the IETF. Readers interested in details should refer to [386] and to the security analysis in [43] and in follow-up works.
[Figure 6.19 here: the n-bit plaintext m is extended with l zero bits, and a random l-bit string r is used to mask it: x ≡ (m ++ 0^l) ⊕ g(r), y ≡ r ⊕ f(x); the padded message is M = x ++ y, and the ciphertext is c = E^RSA_e(M) = M^e mod n.]
Figure 6.19: OAEP padding [43]
To add redundancy to the plaintext, OAEP simply appends to the plaintext message m a string of l zero bits, i.e., 0^l, as illustrated in Figure 6.19. The value of l should be sufficiently large to ensure plaintext-aware encryption, i.e., to make it impractical to find a ciphertext which decrypts with the correct set of zero bits, except for ciphertexts obtained by applying the encryption process (to a known plaintext). A typical value for l may be 128 bits.
The design of OAEP mostly inherits the security properties of the simplified OAEP padding, Pad2(·). In addition, OAEP adds redundancy to the plaintext, to ensure plaintext-aware encryption and thereby defend against CCA attacks. Suppose the attacker sends some ciphertext c which is not the result of legitimate encryption. Namely, the attacker does not know the resulting plaintext; and even if we assume that the attacker can find some bits of the preimage, it would definitely not know all of r, and therefore cannot ensure that the decryption will have the correct 0^l redundancy string. Therefore, intuitively, the attacker does not gain information from CCA attacks.
6.5.7 Bleichenbacher’s Padding Side-Channel Attack on PKCS#1 v1.5
In 1998, Daniel Bleichenbacher presented possibly the most important attack against public-key cryptosystems based on RSA [72]. Bleichenbacher’s attack uses the chosen-ciphertext side-channel attack (CCSCA) model. We begin this subsection by briefly discussing this attack model, and the important area of side-channel attacks in general, and then focus on Bleichenbacher’s attack.
Side-channel attacks and the chosen-ciphertext side-channel attack (CCSCA) model. In chosen-ciphertext side-channel attacks, as in other attack models for encryption schemes, the attacker receives a challenge ciphertext c∗, which is the encryption of a challenge plaintext m∗. At the end of the attack, the attacker outputs a guess m, and we say that the attacker wins if its guess is correct, i.e., if m = m∗. What is unique about the CCSCA model are the attacker capabilities. Specifically, as in a ‘regular’ CCA attack, the attacker can give ciphertexts c1, c2, . . . to be decrypted; however, in a chosen-ciphertext side-channel attack (CCSCA), the attacker does not receive the results of the decryption. Instead, the attacker receives only some side-channel information regarding the decryption process and the decrypted message. See Figure 6.20.
A side-channel is a transmission of information using a non-standard channel, which was not intended for communication by the system designers, and, possibly, not considered when evaluating the security of the system. Such a channel is due to some ‘side-effect’ of the operation of the system. For example, in the attack models we discussed in Chapter 2, e.g., CPA (see Figure 2.8), the model defines clearly which information is available to the attacker; any other information is therefore excluded. But a side-channel may provide some additional information, ‘outside the model’.
There are different types of side-channels, including timing of events, power consumption, electromagnetic radiation, audio signals, error indicators and more. There are also different applications and goals for side-channels, including non-cryptographic side-channels; for example, one common use of side-channels is to infiltrate information across a firewall or other device which inspects information sent outside of an organization, to detect leakage. Even focusing on cryptographic side-channels, there are different types and goals. In particular, sometimes side-channels are viewed as increasing the power of the attacker, for example, a side-channel that leaks information about the computation, which may allow leakage of information regarding the secret/private key. In other side-channel attacks, the side-channel is viewed as a weaker assumption regarding the attacker capabilities; specifically, in Bleichenbacher’s attack, the attacker receives ‘only’ a very limited indication about the decrypted message, instead of receiving the exact plaintext (as in a CCA attack).

[Figure 6.20 here: the MitM attacker Mal receives the challenge ciphertext c∗ ← E_e(m∗), submits ciphertexts c1, c2, . . . for decryption (mi ← D_d(ci)), and receives only side-channel signals (timing, errors, power, noise, ...) before outputting its guess mg.]
Figure 6.20: The chosen-ciphertext side-channel attack (CCSCA) model. The attacker receives ‘side-channel feedback’ from the processing of adversary-selected ciphertexts. The attacker ‘wins’ when its ‘guess’ mg is identical to the ‘challenge’ m∗, i.e., when mg = m∗.
The information leaked in each side-channel ‘signal’ is typically extremely limited, e.g., only a single bit. Indeed, in many side-channel attacks, each ‘signal’ does not provide even one bit of information, since the side-channel signal is obscured by random noise. In spite of that, there have been many successful side-channel attacks on cryptographic systems and other security systems. However, we only cover Bleichenbacher’s attack, as designed against RSA encryption implementations that use the PKCS#1 version 1.5 padding, defined in Equation 6.59.
Bleichenbacher’s side-channel attack. Bleichenbacher’s side-channel attack is one of the most important and well-known side-channel attacks. The attack is against RSA encryption using the PKCS#1 version 1.5 padding; this padding is very popular, and used by numerous systems and standards, including several variants of the important SSL and TLS protocols (Chapter 7). Bleichenbacher’s attack, and variants of it, apply to many systems and standards. In particular, Manger [275] showed a variant that can attack even the variant of OAEP standardized as version 2.0 of PKCS#1. As shown in [341], this attack can also be applied against many implementations of PKCS#1 v1.5, with greater efficiency than the original Bleichenbacher attack. We note that Manger’s attack exploits a seemingly-minor detail of PKCS#1 version 2.0, which differs from the original OAEP design; this detail was fixed in version 2.2 of PKCS#1 [292].

[Figure 6.21 here: the MitM attacker Mal, given the challenge ciphertext c∗, submits ciphertexts c1, c2, . . .; the decryption device computes Mi ← ci^d mod n and mi ← Unpad(Mi), and the attacker receives only the one-bit indicators f1, f2, . . ., where fi = True iff Mi has correct PKCS#1 version 1.5 padding.]
Figure 6.21: Bleichenbacher’s attack on RSA PKCS#1 version 1.5. The attacker is given ciphertext c∗ and outputs a guess plaintext mg. The attacker’s goal is to compute the ‘correct’ plaintext, mg = m∗, where m∗ = (c∗)^d mod n. To compute mg, the attacker sends ciphertexts ci = c∗ · si^e mod n, where the si are different integers. The attacker receives only the one-bit-feedback side-channel indications {fi}, where fi is true if Mi, the i-th output of the textbook RSA decryption function, has correct PKCS#1 version 1.5 padding. We use mi to denote the outcome of the PKCS#1 decryption of ci, i.e., the result of unpadding Mi.
The setup of Bleichenbacher’s attack is illustrated in Figure 6.21. The attacker’s goal is to compute a string M which is the same size as the modulus n, whose length we denote by l bytes, and such that M = (c∗)^d mod n, for a given ‘challenge ciphertext’ c∗. Basically, M would be the padded version of the original plaintext m encoded by PKCS#1 version 1.5 padding, as in Equation 6.59; it would then be trivial for the attacker to unpad M and find the original plaintext m.
Bleichenbacher’s attack works for arbitrary c∗, i.e., c∗ is not necessarily the result of PKCS#1 v1.5 encryption. This can be very useful in some scenarios, and is exploited in some attacks based on Bleichenbacher’s attack, including attacks we discuss in Chapter 7. However, we focus on the simpler and common case where c∗ is the result of RSA PKCS#1 v1.5 padded encryption of some plaintext m∗, which simplifies the attack a bit.
Often, the attacker obtains c∗ by eavesdropping on a transmission from a benign sender, which obtained c∗ by encrypting some challenge message m∗ using RSA with PKCS#1 v1.5 padding. In this case, if the attacker succeeds in computing (c∗)^d mod n, this gives the padded challenge message M∗ = Pad(m∗), which easily provides m∗ itself, since m∗ = UnPad(Pad(m∗)) = UnPad(M∗).
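The blinding step at the heart of the attack relies on RSA's multiplicative property: the decryption of c∗ · s^e mod n is M∗ · s mod n. A toy check (illustrative parameters, with m standing in for the padded value M∗):

```python
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))

m = 2024                          # stands in for the padded challenge M*
c = pow(m, e, n)                  # challenge ciphertext seen by the attacker
s = 7                             # attacker-chosen blinding factor
c_blind = c * pow(s, e, n) % n    # the attacker submits this for decryption

# The decryption device obtains m*s mod n -- a value related to m in a way
# the attacker controls, even though the attacker never learns d:
assert pow(c_blind, d, n) == m * s % n
```

Each query thus tells the attacker (via the one-bit padding indicator) whether M∗ · si mod n falls in the well-padded range, and these constraints are what shrink the interval sets below.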
Since c∗ is the result of RSA encryption using PKCS#1 v1.5, we have M∗ = Pad_v1.5(m∗). From the definition of the pad in Equation 6.59, we know that a properly-padded message such as M∗ must begin with 0x0002. Hence:

2B ≤ M∗ < 3B, where B ≡ 2^(8(l−2))    (6.61)

where l is the length, in bytes, of n (and hence of M∗). Note that Equation 6.61 does not reflect any knowledge about the plaintext m∗ or the private key, only about the padded-plaintext M∗ ← Pad_v1.5(m∗) and the public key (n, e).
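A quick check of these bounds (toy modulus length l and illustrative byte values):

```python
l = 32                               # modulus length in bytes (toy value)
B = 2 ** (8 * (l - 2))

# A correctly padded M* is l bytes beginning 0x00 0x02 (here: 8 pad bytes,
# a zero separator, then a 21-byte message); as an integer, 2B <= M* < 3B:
padded = b"\x00\x02" + b"\xaa" * 8 + b"\x00" + b"x" * 21
assert len(padded) == l
M = int.from_bytes(padded, "big")
assert 2 * B <= M < 3 * B
```

The leading 0x00 byte ensures M < n, and the 0x02 byte pins the integer value of M into the interval [2B, 3B), which is exactly what Equation 6.61 states.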
The attack generates a sequence of sets {Mi}, i ≥ 0, each containing one or more intervals of integers, where the correct solution, M∗, is in one of these intervals. The initial set, M0, simply contains one interval containing all the l-byte strings beginning with 0x0002, i.e.:

M0 ≡ {[2B, 3B − 1]}    (6.62)
Obviously, the correct solution, M∗, is within the interval contained in M0. The attack proceeds to iteratively produce additional sets of intervals, {Mi}, such that for every i > 0, the set of values contained in (the intervals in) Mi is a strict subset of the set of values in Mi−1, but always includes the correct solution M∗. The attack completes when Mi contains only one value - which must, therefore, be the correct solution, M∗.
Note that when the number of elements in Mi is sufficiently small, the attacker could also exhaustively look for the solution, i.e., the value Mg ∈ Mi such that c∗ = Mg^e mod n. This is often more efficient than continuing the attack until Mi contains only a single element. In particular, we can use the fact that every correctly-padded message, with PKCS#1 v1.5, would contain at least one byte of only zero bits between the random string and the actual (unpadded) plaintext, which should rule out roughly 255/256 of the values in Mi. Similarly, we can use any additional information we may have about the plaintext, such as a specific format or known contents of part of the plaintext.
The attack computes, for every i > 0, two values: first, an integer si, and then, the set of intervals Mi, terminating when Mi contains only a single value (which would be the solution M∗). The computation is done iteratively over i, beginning with i = 1.
Computation of s1. Let us first describe the special case of computing s1. We compute s1 as the smallest integer such that the textbook RSA decryption of c∗ · (s1)^e mod n has correct padding, as we learn from the one-bit feedback upon sending c∗ · (s1)^e mod n to the decryption device. Namely, M∗ · s1 mod n is well-padded, and, in particular, begins with 0x0002. Note: it suffices to begin searching for s1 from the minimal value of ⌈n/(3B)⌉, since for smaller values of s1, M∗ · s1 mod n cannot begin with 0x0002 (i.e., is not well padded).
Computation of si for i > 1, if Mi−1 contains more than one interval. In this case, si is the smallest integer such that si > si−1 and the decryption of c∗ · (si)^e mod n has correct padding, and, in particular, the resulting padded plaintext, M∗ · si mod n, begins with 0x0002.
Computation of si for i > 1, if Mi−1 contains exactly one interval, Mi−1 = {[a, b]}. In this last case, choose small integers ri, si such that the decryption of c∗ · (si)^e mod n has correct padding and the following two conditions hold:

ri ≥ 2 · (b · si−1 − 2B)/n,    (2B + ri · n)/b ≤ si < (3B + ri · n)/a    (6.63)
Computation of Mi. Finally, after si has been found, we compute the new set Mi as:

Mi ← ⋃(a,b,r) { [ max(a, ⌈(2B + r · n)/si⌉), min(b, ⌊(3B − 1 + r · n)/si⌋) ] }    (6.64)

for all [a, b] ∈ Mi−1 and for ⌈(a · si − 3B + 1)/n⌉ ≤ r ≤ ⌊(b · si − 2B)/n⌋.
The size of the intervals in the Mi sets decreases in each iteration, but analysis of the rate of convergence, and, in particular, of the required number of padding-correctness feedback queries, is beyond our scope. See the (simplified) analysis in [72], which estimates that the attack requires about 2^20 (about a million) queries.
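The two core steps — finding the next si via the feedback, and the interval narrowing of Equation 6.64 — can be sketched as follows (a minimal sketch; `oracle` stands in for the padding-correctness feedback, and the toy parameters in the usage example are illustrative, not real RSA values):

```python
def iceil(x, y):
    """Ceiling of x / y for (possibly very large) integers."""
    return -(-x // y)

def find_next_s(oracle, c_star, e, n, s_start):
    """Smallest s >= s_start such that c_star * s^e mod n decrypts to a
    well-padded plaintext, i.e., m_star * s mod n lies in [2B, 3B - 1]."""
    s = s_start
    while not oracle(c_star * pow(s, e, n) % n):
        s += 1
    return s

def narrow_intervals(M_prev, s, n, B):
    """One narrowing step (Equation 6.64): given the intervals M_prev and an
    s such that m_star * s mod n is well-padded, compute the new interval set."""
    M = set()
    for a, b in M_prev:
        r_lo = iceil(a * s - 3 * B + 1, n)
        r_hi = (b * s - 2 * B) // n
        for r in range(r_lo, r_hi + 1):
            lo = max(a, iceil(2 * B + r * n, s))
            hi = min(b, (3 * B - 1 + r * n) // s)
            if lo <= hi:
                M.add((lo, hi))
    return sorted(M)
```

For example, with toy values n = 1000 and B = 100 (so 'well-padded' means lying in [2B, 3B − 1] = [200, 299]) and hidden plaintext m∗ = 250, the feedback first accepts s = 5 (since 250 · 5 mod 1000 = 250), and narrowing the interval [200, 299] with s = 5 yields [240, 259], which still contains m∗.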
Finally, we prove that the attack finds M∗.
Lemma 6.4. For every Mi produced by the Bleichenbacher attack as described above, M∗ ∈ Mi holds.
Proof: We already mentioned that M∗ ∈ M0. Assume that M∗ ∈ Mi−1; we prove that M∗ ∈ Mi.
Whenever we choose si, we confirm, using the padding-correctness feedback, that m∗ · si mod n is well-padded, and, in particular, begins with 0x0002. Namely, for some integer r holds:

2B ≤ M∗ · si − r · n ≤ 3B − 1    (6.65)

namely:

(2B + r · n)/si ≤ M∗ ≤ (3B − 1 + r · n)/si    (6.66)
Now, since M∗ ∈ Mi−1, there exists an interval [a, b] ∈ Mi−1 which contains M∗, i.e., a ≤ M∗ ≤ b. Substituting M∗ in Equation 6.65, we have:

(a · si − (3B − 1))/n ≤ r ≤ (b · si − 2B)/n    (6.67)
The combination of Equation 6.66 and Equation 6.67 implies that M ∗ must be
in one of the intervals in Mi, as defined in Equation 6.64.
6.6 Public key signature schemes
We now discuss the third type of public-key cryptographic schemes: signature
schemes, introduced in subsection 1.5.1. Signature schemes consist of three
efficient algorithms (KG, S, V ), illustrated in Figure 1.6:
Key-generation KG: a randomized algorithm, whose input is the key length
l, and which outputs the private signing key s and the public validation
key v, each of length l bits.
Signing S: a (deterministic or randomized) algorithm, whose inputs are a
message m and the signing key s, and whose output is a signature σ.
Validation V : a deterministic algorithm, whose inputs are a message m, signature σ and validation key v, and which outputs an indication whether
this is a valid signature for this message or not.
Figure 1.7 illustrates the process of signing a message (by Alice) and validation of the signature (by Bob). We denote Alice's keys by A.s (for the private signing key) and A.v (for the public validation key); note that this figure assumes that Bob knows A.v; we later explain how signatures also facilitate distribution of public keys such as A.v.
Signature schemes have two crucial properties, which make them a key enabler of modern cryptographic systems. First, they facilitate secure remote exchange in the MitM adversary model; second, they facilitate non-repudiation. We begin by briefly discussing these two properties.
Signatures facilitate secure remote exchange of information in the MitM adversary model. Public-key cryptosystems and key-exchange protocols facilitate the establishment of private communication and shared keys between two remote parties, using only public information (keys). However, this still leaves the question of the authenticity of the public information (keys).
If the adversary is limited in its abilities to interfere with the communication
between the parties, then it may be trivial to ensure the authenticity of the
information received from the peer. In particular, many works assume that the
adversary is passive, i.e., can only eavesdrop on messages; this is also the basic
model for the DH key exchange protocol. In this case, it suffices to simply send
the public key (or other public value).
Some designs assume that the adversary is inactive or passive during the
initial exchange, and use this to exchange information such as keys between the
two parties. This is called the trust on first use (TOFU) adversary model.
In other cases, the attacker may inject fake messages, but cannot eavesdrop
on messages sent between the parties; in this case, parties may easily authenticate
a message from a peer, by previously sending a challenge to the peer, which the
peer includes in the message.
However, all these methods fail against the stronger Man-in-the-Middle
(MitM) adversary, who can modify and inject messages as well as eavesdrop
on messages. To ensure security against such an attacker, we must use strong, cryptographic authentication mechanisms. One option is to use message authentication codes; however, this requires the parties to share a secret key in advance - and if that's the case, the parties could use this shared key to establish secure communication directly.
Signature schemes provide a solution to this dilemma. Namely, a party
receiving signed information from a remote peer, can validate that information,
using only the public signature-validation key of the signer. Furthermore,
signatures also allow the party performing the signature-validation to first validate the public signature-validation key, even when it is delivered by an
insecure channel which is subject to a MitM attack, such as email. This solution
is called public key certificates.
Figure 6.22: Public key certificate issuing and usage processes.
As illustrated in Fig. 6.22, a public key certificate is a signature by an entity called the issuer or certificate authority (CA), over the public key of the subject, e.g., Alice. In addition to the public key of the subject, subject.v, the signed information in the certificate contains attributes such as the validity period, and, usually, an identifier and/or name for the subject (Alice).
Once Alice receives her signed certificate Cert, she can deliver it to the
relying party (e.g., Bob), possibly via insecure channels such as email or the
Internet Protocol (IP). This allows the relying party (Bob) to use Alice’s public
key, i.e., rely on it, e.g., to validate Alice’s signature over a message m, as
shown in Fig. 6.22. Note that this requires Bob to trust this CA and to have
its validation key, CA.v.
This discussion of certificates is very basic; more details are provided in Chapter 8, which discusses public-key infrastructure (PKI), and in Chapter 7, which discusses the important TLS and SSL protocols.
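The issue-and-use flow of Fig. 6.22 can be sketched as follows (a minimal sketch using a toy hash-then-sign RSA signature; all parameters and certificate fields are hypothetical illustrations, far too small and simplified for real use):

```python
import hashlib, json

# Toy CA signature key pair (illustration only).
p, q = 61, 53
n = p * q                     # toy modulus, 3233
phi = (p - 1) * (q - 1)
CA_v = 17                     # CA's public validation exponent
CA_s = pow(CA_v, -1, phi)     # CA's private signing exponent

def h(m: bytes) -> int:
    """Hash into [1, n-1], as in hash-then-sign."""
    return int.from_bytes(hashlib.sha256(m).digest(), 'big') % (n - 1) + 1

def sign(m: bytes, s_key: int) -> int:
    return pow(h(m), s_key, n)

def validate(m: bytes, sig: int, v_key: int) -> bool:
    return h(m) == pow(sig, v_key, n)

# The CA issues a certificate: its signature over the subject's public key
# and attributes (here a hypothetical key value and validity period).
tbs = json.dumps({'subject': 'Alice', 'key': 2753,
                  'valid_until': '2025-12-31'}).encode()
cert = (tbs, sign(tbs, CA_s))

# A relying party (Bob), who trusts the CA and holds CA_v, validates the
# certificate before relying on the subject key found inside it.
tbs_rcvd, sig = cert
assert validate(tbs_rcvd, sig, CA_v)
alice_key = json.loads(tbs_rcvd)['key']
```

Note that Bob only needs CA_v in advance; the certificate itself may arrive over an insecure channel, since any modification would invalidate the CA's signature.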
Signatures facilitate non-repudiation. The other unique property of
digital signature schemes is that they facilitate non-repudiation. Namely, upon
receiving a properly signed document, together with a signature by some well-known authority establishing the public signature-validation key, the recipient
is assured that she can convince other parties that she received the document
signed properly. This is a very useful property, which does not hold for message-authentication codes (MAC schemes): a recipient can validate that an incoming message has the correct MAC code, but cannot prove this to another party - in particular, since the recipient can herself compute the MAC code for arbitrary messages.
6.6.1 RSA-based signatures
RSA signatures were proposed in the seminal RSA paper [334], and are based
on the RSA assumption, with exactly the same key-generation process as for the
RSA PKC. The only difference in key generation is that, for signature schemes, the public key is denoted v (as it is used for validation), and the private key is denoted s (as it is used for signing).
There are two main variants of RSA signatures: signature with message
recovery, and signature with appendix. We begin with signatures with appendix,
as in practice, almost all applications of RSA signatures are with appendix; in
fact, we present (later) signatures with message recovery mainly since they are
often mentioned, and almost as often, a cause for confusion.
RSA signature with appendix. In the (theoretically-possible) case that input messages are very short, and can be encoded as a positive integer which is less than n, we can sign using RSA by applying the RSA exponentiation directly to the message, resulting in the signature σ. In this case, the signature and validation operations are defined as:

S^{RSA}_s(m) = (m^s mod n, m)

V^{RSA}_v(σ, m) = {m if m = σ^v mod n; 'error' otherwise}
Above, s is the private signature key, and v is the public validation key. The keys are generated using the RSA key generation process; see subsection 6.5.1.
In practice, as discussed in §3.2.6, input messages are of variable length - and rarely shorter than the modulus. Hence, real signatures apply the Hash-then-Sign (HtS) paradigm, using some cryptographic hash function h, whose range is contained in [1, . . . , n − 1], i.e., allowable input to the RSA function. Applied to the RSA FIL signature as defined above, we have the signature scheme (S^{RSA,h}_s, V^{RSA,h}_v), defined as follows:

S^{RSA,h}_s(m) = ([h(m)]^s mod n, m)

V^{RSA,h}_v(σ, m) = {m if h(m) = σ^v mod n; 'error' otherwise}
The resulting signature scheme is secure if h is a CRHF; see §3.2.
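A minimal sketch of this hash-then-sign scheme follows (toy parameters, far too small for real use; real implementations use a standardized encoding such as PKCS#1 rather than this naive reduction of SHA-256 into [1, n − 1]):

```python
import hashlib

# Toy RSA signature keys (illustration only).
p, q = 61, 53
n = p * q                    # 3233
phi = (p - 1) * (q - 1)
v = 17                       # public validation exponent
s = pow(v, -1, phi)          # private signing exponent

def h(m: bytes) -> int:
    """Hash function with range contained in [1, n - 1]."""
    return int.from_bytes(hashlib.sha256(m).digest(), 'big') % (n - 1) + 1

def S(m: bytes):
    """Sign: output (signature, message) - a signature with appendix."""
    return (pow(h(m), s, n), m)

def V(sigma: int, m: bytes):
    """Validate: accept m iff h(m) == sigma^v mod n."""
    return m if h(m) == pow(sigma, v, n) else 'error'
```

A round trip `V(*S(m))` returns the message itself, while tampering with the message or signature is overwhelmingly likely to yield 'error'.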
This signature scheme is called signature with appendix since it requires transmission of both the original message and its signature. This is in contrast to a rarely used variant of RSA signatures called signature with message recovery, which we explain next. 'Signature with recovery' is rarely, if ever, applied in practice; we describe it since it is often referenced in the literature and, in fact, causes quite a lot of confusion among practitioners. Hopefully the text below will help to avoid such confusion.
RSA signature with message recovery. RSA signatures with message recovery have the cute property that they only require transmission of a single mod-n integer - the signature; the message itself does not need to be sent, as it is recovered from the signature. This property would result in a small savings of bandwidth, compared to signature with appendix, when both methods are applicable. However, as we explain below, this method is rarely applicable; furthermore, it is a cause of frequent confusion.
RSA signatures with message recovery require the use of an invertible padding function R(·), which is applied to the messages to be signed. The main goal of R is to ensure sufficient, known redundancy in R(m) (this is why we denote it by R). This redundancy, applied to the message before the public key signature operation, should make it unlikely that a random value would appear to be a valid signature.
The output of R(m) is used as input to the RSA exponentiation; hence, to ensure recovery, the value of R(m) must be an allowed input, i.e., in the range [1, . . . , n − 1] (where n is the RSA modulus). Note that this implies that m has to be even shorter than R(m), since R(m) must contain all of m, as well as the redundancy.
Once R is defined, the signature and validation operations for RSA with Message Recovery (RSAwMR) would be:

S^{RSAwMR}_s(m) = [R(m)]^s mod n    (6.68)

V^{RSAwMR}_v(x) = R^{−1}(x^v mod n) if defined, else 'error'    (6.69)
For validation to be meaningful, there should be only a tiny subset of the integers x s.t. x^v mod n would be in the range of R, i.e., the result of the mapping of some message m. Since there are at most n values of x^v mod n to begin with, this means that the range of R, i.e., the set of legitimate messages, must be tiny in comparison with n.
In reality, messages being signed are almost always much longer than the
tiny message space available for signatures with message recovery. Hence, the
use of this method is almost non-existent. In fact, our description of signature
schemes (Figure 1.6) assumed that the message is sent along with its signature,
i.e., our definition did not even take into consideration schemes like this, which
avoid sending the original message entirely.
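For concreteness, here is a sketch of RSAwMR with a hypothetical redundancy function R that simply repeats the (tiny) message byte; both R and the toy parameters are illustrative assumptions, not a standardized scheme:

```python
# Toy RSA parameters (illustration only).
p, q = 61, 53
n = p * q                 # 3233
phi = (p - 1) * (q - 1)
v = 17
s = pow(v, -1, phi)

def R(m: int) -> int:
    """Hypothetical redundancy: duplicate the message byte, R(m) = (m << 8) | m.
    Only m in [1, 12] keeps R(m) < n for this toy modulus."""
    assert 1 <= m <= 12
    return (m << 8) | m

def R_inv(x: int):
    """Invert R where defined: the two bytes must match and be in range."""
    hi, lo = x >> 8, x & 0xFF
    return lo if 1 <= lo <= 12 and hi == lo else None

def sign_mr(m: int) -> int:
    return pow(R(m), s, n)        # Equation 6.68

def validate_mr(x: int):
    y = pow(x, v, n)              # Equation 6.69
    return R_inv(y)               # the recovered message, or None ('error')
```

Only the single integer sign_mr(m) is transmitted, and validation recovers m. Of the n = 3233 possible values of x, exactly 12 pass validation (one per legitimate message), illustrating why the legitimate message space must be tiny relative to n.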
Note that RSA signatures with message recovery are often a cause of
confusion, due to their syntactic similarity to RSA encryption. Namely, you
may come across people referring to the use of ‘RSA encryption with the private
key' as a method to authenticate or sign messages. What these people really mean is the use of RSA signatures with message recovery. We caution to
avoid such confusing use of terminology; RSA signatures are usually used with
appendix, but even in the rare cases of using RSA signatures with message
recovery, RSA signing is not the same as encryption with the private key!
6.7 Labs and Additional Exercises
Lab 4 (Breaking textbook and weakly-padded RSA). In this lab we will break
textbook RSA encryption, as well as padded RSA encryption, using specific
(weak) padding schemes.
As for the other labs in this textbook, we will provide Python scripts for
generating and grading most questions in this lab (LabGen.py and LabGrade.py).
For this lab, the lab-generation script is called LabGenRSA.py, and should be
provided in the lab-scripts folder. In addition, if learning with a professor, the
students may be asked to submit a lab report with their results, including results
to questions which are not auto-graded.
If the programs are not yet posted online, professors may contact the author
to receive the scripts. The lab-generation script generates random challenges for
each student (or team), as well as solutions which will be used by the grading
script. We recommend making the scripts available to the students, as examples of how to use the cryptographic functions. It is easy, and permitted, to adapt these scripts to other languages/libraries, or to modify and customize them as desired.
1. To warm up, implement three variants of RSA decryption, using: (1a) textbook RSA, (1b) PKCS#1 version 1.5, and (1c) OAEP. In your lab-input folder, find files e1, d1, n1, ma1, mb1, cx1a, cx1b, cx1c, cy1a, cy1b and cy1c, all generated by the LabGen.py script, as well as a textbook RSA module. Files e1 and d1 contain RSA encryption and decryption keys, respectively, both using the modulus n1. Use the private key d1 and the modulus n1 to decrypt: (1a) cx1a and cy1a using textbook RSA, (1b) cx1b and cy1b using PKCS#1 version 1.5, and (1c) cx1c and cy1c using OAEP. For the PKCS and OAEP encryptions, you should also check and remove the padding. Save the results in the corresponding files mx1a, mx1b, mx1c, my1a, my1b, and my1c in the lab-answers folder. To allow you to check your program, one result from each pair should be identical to one of the two input message files, ma1 and mb1. If you got this one right, most likely you also got the other decryption right. You can also confirm that during decryption of the PKCS#1 version 1.5 and OAEP ciphertexts, you find correctly padded plaintexts.
2. To further warm up, let's also do textbook RSA encryption. Use the public encryption key e1 (and the modulus n1) to encrypt ma1 and mb1; save the results in the corresponding files ca1 and cb1 in the lab-answers folder. Again, one of these two answers (ca1 and cb1) should be identical to one of the two input ciphertext files, cx1a and cy1a. If you got this one right, most likely you also got the other encryption right.
3. In this item, we break textbook RSA encryption. In your lab-input folder, find a file ciphertexts.csv containing 'eavesdropped ciphertexts' (and corresponding identifiers), and a file plaintexts.csv containing 'suspected plaintexts' (and corresponding identifiers), both using the CSV format (check it out). You should be able to identify two plaintexts from plaintexts.csv as corresponding to two of the ciphertexts (in ciphertexts.csv). The file pair0-1 in the lab-input folder contains one match (as a pair of comma-separated identifiers of plaintext and ciphertext); check that one of the matches you found is the same as the contents of this file. If so, then the other pair you found should also be correct; save it in file pair0-2 in the lab-answers folder, using the same format as pair0-1. Measure also the runtime, and upload it as file t0.
4. Now that we have seen that textbook RSA is insecure, let's try a naive padding. Specifically, let us define NP1 ('Naive Padding 1') as: NP1(m) = 0x02 ∥ r ∥ m, where ∥ denotes concatenation, m is the (pre-padding) plaintext message, and r is one byte consisting of four random bits followed by four zero bits. This is an (overly) simplified version of the PKCS#1 version 1.5 padding algorithm, Pad_v1.5(·), as defined in Equation 6.59 (subsection 6.5.6).
Reuse the ciphertexts.csv and plaintexts.csv files. You should again be able to identify two of the plaintexts from plaintexts.csv as corresponding to two of the ciphertexts (in ciphertexts.csv), this time, when applying padding NP1 to the plaintexts before applying textbook RSA encryption (using e1 and n1). The file pair1-1 in the lab-input folder contains one match (as a pair of comma-separated identifiers of plaintext and ciphertext); check that one of the matches you found is the same as the contents of this file. If so, then the other pair you found should also be correct; save it in file pair1-2 in the lab-answers folder, using the same format. Measure also the runtime, and upload it as file t1.
5. Repeat the previous item, for a random string r containing (1) 8 random bits, (2) 12 random bits followed by four zero bits, (3) 16 random bits, and (4) 20 random bits followed by four zero bits. Check your results using the provided pair2-1, pair3-1, pair4-1 and pair5-1 files, and save your 'solution pairs' as files pair2-2, pair3-2, pair4-2 and pair5-2. Measure also the runtime.
a) Create a graph of the attack's runtime as a function of the number of random bits in r, from 0 to 20 bits.
b) Identify the function giving the runtime as a function of the number of random bits.
c) Using the function you identified, approximate the runtime if you used this attack to decrypt an encryption with a padding of eight random bytes, the minimal number of random bytes required by PKCS#1 version 1.5.
d) Repeat, for a padding of 12 random bytes.
6. Challenge. Find in the lab-input folder the files c8-1, c8-2, c12-1 and c12-2. These are all encryptions using the NP1 padding, except using a random string r of length 8 bytes (for c8-1, c8-2) or 12 bytes (for c12-1, c12-2). Your goal is to find the corresponding plaintexts.
To assist you in finding the plaintexts, use the 'padding-correctness feedback' interface in the lab web-server, or provided by your professor. This interface allows you to upload a ciphertext c and receive an indication of whether this ciphertext is properly padded using NP1 or not.
To allow you to test your program, you will find the plaintexts p8-1 and p12-1 in the lab-input folder. Once you find one or both of the other solutions, upload them in the corresponding files (p8-2, p12-2) in the lab-answers folder.
Upload also the runtimes, as files t8 and t12, and add them to your graph of runtimes from the previous item, comparing them to the runtimes you projected there.
7. Discussion. Explain why it is significantly easier to attack the NP1 padding, compared to Bleichenbacher's attack against the PKCS#1 v1.5 padding.
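The plaintext-to-ciphertext matching of items 4-5 can be sketched as follows (a hypothetical helper, not one of the lab's provided scripts; the toy key in the test values is illustrative, while the lab files use realistic RSA keys):

```python
def matches_np1(c: int, m: bytes, e: int, n: int) -> bool:
    """Does ciphertext c match plaintext m under NP1 padding and textbook
    RSA encryption? Try all 16 possible padding bytes r (four random bits
    followed by four zero bits)."""
    m_int = int.from_bytes(m, 'big')
    k = len(m)  # length of m in bytes
    for r4 in range(16):
        # NP1(m) = 0x02 || r || m, with r = (r4 << 4)
        padded = (0x02 << (8 * (k + 1))) | ((r4 << 4) << (8 * k)) | m_int
        if pow(padded, e, n) == c:
            return True
    return False

def match_all(ciphertexts, plaintexts, e, n):
    """Return all (plaintext-id, ciphertext-id) pairs that match."""
    return [(pid, cid) for pid, m in plaintexts.items()
                       for cid, c in ciphertexts.items()
                       if matches_np1(c, m, e, n)]
```

The inner loop enumerates the 16 possible values of the four random bits; adapting it to the variants of item 5 changes only that enumeration.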
Exercise 6.12 (Addition/multiplication key exchange is insecure). Present a
sequence diagram similar to Figure 6.5 and Figure 6.6, but using addition or
multiplication instead of XOR/exponentiation. Show that the resulting protocol
is vulnerable to an eavesdropping attacker.
Exercise 6.13. Show that the exponential key exchange (Figure 6.6) is insecure
against an eavesdropper, even if the base g used by the protocol is a secret shared
between Alice and Bob.
Exercise 6.14 (Justification for limitations on possible random inputs for key
exchange protocols). Show that in both the Diffie-Hellman key exchange protocol
and the Modular Exponentiations key exchange protocol, there will be no gain
in security if the parties choose their random inputs (a, b and, for exponential
key exchange, also k) from a larger set, say {1, . . . , 2p}. Show that the same
holds for the selection of g.
Exercise 6.15. The Diffie-Hellman protocol is a special case of a key exchange protocol, defined by the pair of functions (KG, F ), as introduced in
subsection 6.1.3.
1. Present the Diffie-Hellman protocol as a key exchange protocol, i.e., define
the corresponding (KG, F ) functions.
2. We presented two assumptions regarding the security of the DH protocol:
the Computational-DH (CDH) assumption and the Decisional-DH (DDH)
assumption. Show that one of these assumptions does not suffice to ensure key-indistinguishability. What about the other one?
Exercise 6.16. It is proposed that to protect the DH protocol against an imposter, we add an additional 'confirmation' exchange after the protocol terminates with a shared key k = h(g^{ab} mod p). In this confirmation, Alice will send to Bob MAC_k(g^b) and Bob will respond with MAC_k(g^a). Show the message-flow of an attack, showing how a MitM (Man-in-the-Middle) attacker can impersonate Alice (or Bob). The attacker has 'MitM capabilities', i.e., it can intercept messages (sent by either Alice or Bob) and inject fake messages (incorrectly identifying itself as Alice or Bob).
Exercise 6.17. Suppose that an efficient algorithm to find discrete log is found,
so that the DH protocol becomes insecure; however, some public-key cryptosystem
(G, E, D) is still considered secure, consisting of algorithms for, respectively,
key-generation, encryption and decryption.
1. Design a key-agreement protocol which is secure against an eavesdropping
adversary, assuming that (G, E, D) is secure (as a replacement to DH).
2. Explain which benefits the use of your protocol may provide, compared with
simple use of the cryptosystem (G, E, D), to protect the confidentiality of
messages sent between Alice and Bob against a powerful MitM adversary.
Assume Alice and Bob do have known public keys.
Exercise 6.18. Assume that there is an efficient (PPT) attacker A that can find a specific bit in g^{ab} mod p, given only g^a mod p and g^b mod p. Show that the DDH assumption does not hold for this group, i.e., that there is an efficient (PPT) attacker that can distinguish, with significant advantage over a random guess, between g^{ab} mod p and g^x mod p for x taken randomly from [1, . . . , p − 1].
Exercise 6.19. It is frequently proposed to use a PRF as a Key Derivation Function (KDF), e.g., to extract a pseudo-random key k′ = PRF_k(g^{ab} mod p) from the DH-exchanged value g^{ab} mod p, where k is a uniform random key (known to the attacker). In particular, in subsection 6.3.1, a variant of the Auth-DH protocol uses a function f assumed to fulfill both the PRF requirements and the KDF requirements. In this exercise, we explore alternatives.
1. Let f be a secure PRF; note that f may be a KDF or not. Show a function
f ′ which is (1) also a PRF and (2) not a secure KDF.
2. Let g be a secure KDF; note that g may be a PRF or not. Show a function
g ′ which is (1) also a KDF and (2) not a secure PRF.
[Sequence diagram: Alice and Bob, each holding k_{i−1} from the previous round, with a MitM attacker in between. Alice picks a_i ←$ Z*_p ≡ {1, . . . , (p − 1)} and sends x_i ← g^{a_i} mod p together with MAC_{k_{i−1}}(x_i). Bob picks b_i ←$ Z*_p, computes k^{(i)}_{B,A} ≡ h(x_i^{b_i} mod p), and replies with y_i ← g^{b_i} mod p together with MAC_{k^{(i)}_{B,A}}(y_i). Alice computes k^{(i)}_{A,B} ≡ h(y_i^{a_i} mod p). The ith session key is k_i = k^{(i)}_{A,B} = k^{(i)}_{B,A}.]

Figure 6.23: How not to ensure resilient key exchange: illustration for Ex. 6.20
[Sequence diagram: Alice and Bob both know a master key MK, with a MitM attacker in between. Alice picks a_i ←$ Z*_p ≡ {1, . . . , (p − 1)} and sends x_i ← g^{a_i} mod p together with f_{MK}(x_i) and g_{MK}(x_i). Bob picks b_i ←$ Z*_p and replies with y_i ← g^{b_i} mod p together with f_{MK}(y_i) and g_{MK}(y_i). Alice computes y′_i ← y_i^{a_i} mod p and k^{(i)}_{A,B} ≡ f_{MK}(y′_i) ⊕ g_{MK}(y′_i); Bob computes x′_i ← x_i^{b_i} mod p and k^{(i)}_{B,A} ≡ f_{MK}(x′_i) ⊕ g_{MK}(x′_i). The ith session key is k_i = k^{(i)}_{A,B} = k^{(i)}_{B,A}.]

Figure 6.24: Insecure 'robust-combiner' authenticated DH protocol, studied in Exercise 6.21.
3. Present a variant of the Auth-DH protocol, as a modification of Figure 6.11, which uses a PRF (instead of a MAC) and a KDF (for key derivation). Explain why this variant is secure when the keys of the PRF and the KDF are chosen independently (using uniform distribution).
4. Let f be a secure PRF and g be a secure KDF. Show functions f ′ , g ′ such
that f ′ is a PRF and g ′ is a KDF, and furthermore the following holds:
the protocol in the previous item may be insecure when used with f ′ and
g ′ , if both of them use the same symmetric master key M K. Note: this
is an example of the principle of key separation (Principle 10).
Exercise 6.20 (How not to ensure resilient key exchange). Fig. 6.23 illustrates
a slightly different protocol for authenticating the DH protocol, using a changing
key ki (to ensure resilient key exchange). Present a sequence diagram showing
that this protocol is not secure.
Exercise 6.21. The protocol in Fig. 6.24 is an (incorrect) attempt at a robust-combiner authenticated DH protocol.
1. Show a sequence diagram for an attack showing that this variant is insecure.
2. Show a simple fix that achieves the goal (robust combiner authenticated
DH protocol).
Exercise 6.22. Assume it takes 10 seconds for any message to pass between
Alice and Bob.
1. Assume that both Alice and Bob initiate the ratchet protocol (Fig. 6.12)
every 30 seconds. Draw a sequence diagram showing the exchange of messages between time 0 and time 60 seconds; mark the keys used by each of the two parties to authenticate messages sent and to verify messages received.
2. Repeat, if Bob’s clock is 5 seconds late.
Exercise 6.23. In the DH ratchet protocol, as described (Fig. 6.12), the parties
derive symmetric keys ki,j and use them to authenticate data (application)
messages they exchange between them, as well as the first message of the next
handshake.
1. Assume a chosen-message attacker model, i.e., the attacker may define arbitrary data (application) messages to be sent from Alice to Bob and vice versa at any given time, and 'wins' if a party accepts a message never sent by its peer (i.e., that message passes validation successfully). Show that, as described, the protocol is insecure in this model.
2. Propose a simple, efficient and secure way to avoid this vulnerability, by
only changing how the protocol is used - without changing the protocol
itself.
Exercise 6.24. The DH protocol, as well as the ratchet protocol (as described
in Fig. 6.12), are designed for communication between only two parties.
1. Extend DH to support key agreement among three parties.
2. Similarly extend the ratchet protocol.
Exercise 6.25 (DH-Ratchet). Figure 6.12 shows the DH-Ratchet protocol,
where the key used to authenticate the DH exchange as well as the data messages
is changing periodically (as indicated), and where f is a PRF (Pseudo-Random
Function). Assume that this protocol is run daily, from day i = 1, where k0 is a randomly-chosen secret initial master key, shared between Alice and Bob; messages on day i are encrypted and authenticated using session key ki, by selecting a random string r and sending r and f(r∥f_{ki}(r) ⊕ m). The attacker eavesdrops on the communication between the parties on all days, and on days 3, 6, 9, . . . it can also spoof messages (send messages impersonating either Alice or Bob), and act as Man-in-the-Middle (MitM). On the fifth day (i = 5), the attacker is also given the initial master key k0.
[Sequence diagram: Alice and Bob, each holding k_{i−1} from the previous round, with a MitM attacker in between. Alice picks a_i ←$ Z*_p ≡ {1, . . . , (p − 1)} and sends x_i ← g^{a_i} mod p together with f_{k_{i−1}}(x_i). Bob picks b_i ←$ Z*_p and replies with y_i ← g^{b_i} mod p together with f_{k_{i−1}}(y_i). Alice computes k^{(i)}_{A,B} ≡ f_{k_{i−1}}(y_i · g^{a_i} mod p) and Bob computes k^{(i)}_{B,A} ≡ f_{k_{i−1}}(x_i · g^{b_i} mod p). The ith session key is k_i = k^{(i)}_{A,B} = k^{(i)}_{B,A}.]

Figure 6.25: Insecure variant of the DH-Ratchet Protocol, for Ex. 6.26.
• Explain why sending r and f(r∥f_{ki}(r) ⊕ m) ensures authenticity and confidentiality, provided that ki is secret.
• Which days' messages will the attacker be able to decrypt (find out) by day ten?
• Show a sequence diagram of the attack, and list the calculations done by the attacker.
Exercise 6.26 (Insecure variant of DH-Ratchet). Figure 6.25 shows a variant
of the DH-Ratchet protocol, using a (secure) pseudorandom function f to derive
the session key.
1. Does this protocol ensure forward-secrecy (FS)? If so, explain; if not, present a sequence diagram of an attack.
2. Repeat, for PFS.
3. Repeat, for Recover-Security (RS).
4. Repeat, for PRS.
Exercise 6.27 (GSM). Design a more secure variant of the GSM handshake protocol, which foils the attack described in Exercise 5.16; the mobile and visited network can identify support of this variant by referring to it as a new cipher, say A5/33. The actual data encryption can use any secure shared-key encryption; the critical improvement is to the negotiation, namely, to prevent attacks as in Exercise 5.16.
The change may involve one or few new handshake messages between mobile
and visited network, but no change to the rest of the GSM network, in particular,
no change to the home network. Your solution may require the mobile and/or
visited network to use additional cryptographic mechanisms, including public
key mechanisms, but only during handshake. Hint: your solution should set the
key to be used by A5/3, using cryptographic mechanism(s) we learned.
Exercise 6.28. We saw that El-Gamal encryption (Equation 6.36) may be re-randomized using the recipient's public key, and mentioned that this may be extended into an encryption scheme which is universally re-randomizable, i.e., where re-randomization does not require the recipient's public key. Design such an encryption scheme. Hint: begin with El-Gamal encryption, and use, as part of the ciphertext, the result of encrypting the number 1. Or see [171].
Exercise 6.29. A public-key cryptosystem is IND-rCCA secure if it passes the IND-CCA test when the attacker is restricted to avoid any ciphertext (decryption) queries whose output is the challenge message m∗ [89]. Show that:
1. The El-Gamal PKC is not IND-rCCA secure.
2. Textbook RSA is not IND-rCCA secure.
Exercise 6.30. The RSA algorithm calls for selecting e and then computing d to be its inverse (mod ϕ(n)). Explain how the key owner can efficiently compute d, and why an attacker cannot do the same.
Exercise 6.31. The RSA key generation algorithm requires the selection of two large primes p, q. Would it be secure to save time by using p = q? Or to first choose p, and then let q be the smallest prime larger than p?
Exercise 6.32 (Tiny-message attack on textbook RSA). We discussed that
RSA should always be used with appropriate padding, and that ‘textbook RSA’
(no padding) is insecure; in particular, it is not randomized, so it definitely does
not ensure indistinguishability.
1. Show that textbook RSA may be completely decipherable, if the message
length is less than |n|/e. (This is mostly relevant for e = 3.)
2. Show that textbook RSA may be completely decipherable, if there is only a
limited set of possible messages.
3. Show that textbook RSA may be completely decipherable, if the message
length is less than |n|/e, except for a limited set of additional (longer)
possible messages.
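To illustrate part 1, here is a minimal sketch of the e = 3 ‘cube-root’ attack. The modulus below is just a 2048-bit stand-in, not a real RSA key (a valid key would also need gcd(e, ϕ(n)) = 1); the attack only uses the public operation, so the stand-in suffices for the demonstration.

```python
# Sketch of the tiny-message attack (part 1): with e = 3 and no padding,
# any message with m^3 < n is recovered by an ordinary integer cube root.
def icbrt(x):
    # Integer cube root by binary search (floats lose precision here).
    lo, hi = 0, 1 << ((x.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= x:
            lo = mid
        else:
            hi = mid - 1
    return lo

e = 3
n = 1 << 2048        # stand-in for a 2048-bit modulus (NOT a valid RSA key);
                     # the attack never needs the factors of n
m = int.from_bytes(b"PIN=4321", "big")   # short message, so m^3 < n
c = pow(m, e, n)                         # textbook-RSA 'encryption': just m^3
recovered = icbrt(c)                     # no private key needed
assert recovered == m
```

Since m^3 never ‘wraps around’ the modulus, the modular reduction does nothing, and the attacker inverts plain integer cubing.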
Exercise 6.33. Consider the use of textbook RSA for encryption (no padding).
Show that it is insecure against a chosen-ciphertext attack.
Exercise 6.34. Consider a variation of RSA which uses the same modulus N
for multiple users, where each user, say Alice, is given its key-pair (A.e, A.d)
by a trusted authority (which knows the factorization of N and hence ϕ(N)). Show
that one user, say Mal, given his keys (M.e, M.d) and the public key of other
users, say A.e, can compute A.d. Note: recall that each user’s private key is
the inverse of the public key (mod ϕ(N)), e.g., M.e = M.d^(−1) mod ϕ(N).
CHAPTER 6. PUBLIC KEY CRYPTOGRAPHY
Exercise 6.35. Public-key algorithms often use the term ‘public key’ to refer to
only one component of the public key. For example, with RSA, people often
refer to e as the public key, although the actual RSA public key consists of the
pair (e, n), i.e., also includes the modulus n.
Consider an application which receives an RSA signature with public key (eA , n),
where eA is the same as in the public key (eA , nA ) of user Alice, but n ̸= nA ; however,
the application still concludes that this is a valid signature by Alice. Show how
this allows an attacker to trick the recipient into believing - incorrectly - that an
incoming message (sent by the attacker) was signed by Alice.
Note: a similar situation exists with other public key algorithms, e.g., elliptic
curves, where the public key consists of a specification of a curve and of a
particular ‘public point’ on the curve, but often people refer only to the point
as if it is the (entire) public key. In particular, this led to the ‘Curveball’
vulnerability in the Windows certificate-validation mechanism [361], which was
due to validation of only the ‘public point’ and use of the curve selected by the
attacker.
Exercise 6.36. You are given textbook-RSA ciphertext c = 281, with public
key e = 3 and modulus n = 3111. Compute the private key d and the message
m = cd mod n.
Hint: it is probably best to begin by computing the factorization of n.
Exercise 6.37. Consider the use of textbook RSA for encryption as well as for
signing (using hash-then-sign), with the same public key e used for encryption
and for signature-verification, and the same private key d used for decryption
and for signing. Show this is insecure against chosen-ciphertext attacks, i.e.,
allows either forged signatures or decryption.
Exercise 6.38. The following design is proposed to send email while preserving
sender-authentication and confidentiality, using known public encryption and
verification keys for all users. Namely, assume all users know the public
encryption and verification keys of all other users. Assume also that all users
agree on public key encryption and signature algorithms, denoted E and S
respectively.
When one user, say Alice, wants to send message m to another user, say
Bob, it computes and sends: c = EB.e (m ++ ‘Alice’ ++ SA.s (m)), where B.e is
Bob’s public encryption key, A.s is Alice’s private signature key, and ‘Alice’ is
Alice’s (unique, well-known) name, allowing Bob to identify her as the sender.
When Bob receives this ciphertext c, he first decrypts it, which implies it was
sent to him. To validate that the message was sent by Alice, he looks up Alice’s
public verification key A.v, and verifies the signature.
1. Explain how a malicious user Mal can cause Bob to believe he received a
message m from Alice, although Alice never sent that message to Bob.
(Alice may have sent a different message, or sent that message to somebody
else.)
2. Propose a simple, efficient and secure fix.
Exercise 6.39 (Combining public key signatures and encryption). Many
applications require both confidentiality, using recipient’s public encryption key,
say B.e, and non-repudiation (signature), using sender’s verification key, say
A.v. Namely, to send a message to Bob, Alice uses both her private signature
key A.s and Bob’s public encryption key B.e; and to receive a message from
Alice, Bob uses his private decryption key B.d and Alice’s public verification
key A.v.
1. It is proposed that Alice will select a random key k and send to Bob the
triplet: (cK , cM , σ) = (EB.e (k), k ⊕ m, SignA.s (‘Bob’ ++ k ⊕ m)). Show
this design is insecure, i.e., a MitM attacker may either learn the message
m or cause Bob to receive a message ‘from Alice’ - that Alice never sent.
2. Propose a simple, efficient and secure fix. Define the sending and receiving
process precisely.
3. Extend your solution to allow prevention of replay (receiving multiple
times a message sent only once).
Note: signcryption schemes combine the public key signature and encryption
operations, possibly with greater efficiency than applying the encryption and
signing operations separately.
Chapter 7
The TLS protocols for web-security and beyond
In this chapter, we discuss the Transport-Layer Security (TLS) protocol, which
is the main protocol used to secure connections over the Internet - and, in
particular, web-communication. We believe that TLS is the most studied
applied cryptographic protocol; it is widely deployed in many applications and
has a huge impact on the security of the Internet. Its extensive study has resulted
in many attacks and subsequent countermeasures, defenses, improvements and
several versions.
7.1 Introduction to TLS and SSL
The TLS protocol is arguably the most ‘successful’ security protocol - it is
definitely very widely used. One reason for this wide use is that TLS is widely
applicable; it is used in more diverse scenarios and environments than any other
security protocol. Many extensions and changes have been proposed over the
years, allowing the use of TLS in new scenarios and satisfying new requirements,
as well as improving security.
Some of these extensions and changes were adopted as an inherent part of a
new revision of the protocol, and many others can be deployed using the built-in
extensions mechanism (subsection 7.4.3), which became a standard part of TLS
beginning with version TLS 1.1. Indeed, we believe TLS is probably the applied
cryptography protocol which was most widely studied and analyzed, with many
vulnerabilities exposed and fixed; this gives us considerable confidence in the
security of the (later versions of) TLS.
From a security point of view, this popularity is a double-edged sword. On
the one hand, this wide popularity motivates extensive efforts by the ‘white-hat’
security community, including researchers from academia and industry,
to identify vulnerabilities and improve the security of the protocols and their
implementations. This published research resulted in significant improvements
to the security of TLS; many attacks and corresponding countermeasures were
published by researchers, and successive versions of the protocols became
more and more secure, culminating with TLS 1.3, which includes significant design
changes whose goal is to improve security.
On the other hand, this wide popularity also implies that ‘black-hat crackers’
have a strong motivation to find vulnerabilities in the TLS protocols and in their
popular implementations. In fact, the desire to ‘break’ secure connections may
even motivate powerful organizations, e.g., the NSA, to invest extensive efforts in
‘injecting’ intentional, hidden vulnerabilities (cryptographic backdoors) into the
specifications and implementations of TLS and of other popular cryptographic
systems, libraries, protocols and standards.
One example of what may be a cryptographic backdoor is the Dual-EC
Deterministic Random Bit Generator (DRBG), which was found vulnerable
in [96]. The Dual-EC DRBG was included in NIST, ANSI and ISO/IEC
standards, and implemented in the widely-used BSAFE cryptographic toolkit
from RSA (also used by implementations of TLS). There are claims that the
NSA created and promoted the Dual-EC DRBG as a cryptographic backdoor,
and allegedly even paid RSA to make it the default pseudorandom generator
in BSAFE. One piece of evidence for the NSA’s involvement came from the NSA memos
exposed by Edward Snowden in 2013, which also indicated that the NSA spends
$250 million per year to insert backdoors into software, hardware and standards;
see [55, 392].
Some of this purported effort to insert trapdoors into standards seems to
have been directed at TLS. In particular, consider [332], a proposal for a TLS
extension by Eric Rescorla (as a consultant to the US government) and Margaret
Salter (an NSA employee). The purported goal of this extension was to increase
the number of random bits exchanged during the TLS handshake. However,
these additional random bits seem to significantly improve the efficiency of the
Dual-EC DRBG attack. This may indicate that this was another attempt to
insert a cryptographic trapdoor - in this case, into the TLS specifications; see [55].
This widespread use of TLS also has important implications for learning and
teaching TLS. Obviously, the importance of TLS motivates studying it; furthermore, the evolution of TLS, and the different attacks and countermeasures, are a
valuable, interesting lesson, which can help to identify and avoid vulnerabilities
in different protocols and systems. On the other hand, this also means that
there is a vast wealth of important and interesting information - indeed,
entire books were dedicated to covering TLS, e.g., [307, 328], and even they do not
cover all aspects and attacks. We have tried to maintain a reasonable balance;
however, there were many hard choices and surely there is much to improve.
As in other aspects, your feedback would be appreciated.
Organization of this chapter. In the following subsection (subsection 7.1.1),
we present a brief history of SSL and TLS, and its three main phases: (1) the
proprietary SSLv2 design, (2) the evolution from SSLv3 to TLS version 1.2,
and finally (3) the TLS 1.3 re-design. We later dedicate a section to each of
these three phases.
Why discuss older versions? One motivation to describe the older SSL and
TLS versions is to learn about protocol vulnerabilities and attacks, which can
help us to develop the intuition to identify vulnerabilities in different protocols,
and to design secure protocols.
Another motivation is that these attacks are often still relevant, for two
reasons. First, many clients and servers still support outdated versions. Second,
several downgrade attacks (subsection 5.6.3) break implementations of newer
versions of TLS, by exploiting their support for older, vulnerable versions.
7.1.1 A brief history of SSL and TLS
Let us begin with a few words on the history of SSL and TLS. The TLS
standards are defined by the Internet Engineering Task Force (IETF), as an
evolution of the Secure Socket Layer (SSL) protocols. The SSL protocols are
quite similar to TLS in their basic design and goals, but were developed by the
Netscape corporation (rather than by the IETF); SSLv3 benefited from some
feedback from researchers and the Internet security community. In fact, version
3 of SSL (SSLv3) is closely related to versions 1.0 to 1.2 of TLS, and less similar
to version 2 of SSL.
SSL began around 1994, with the start of the commercial
use of the World Wide Web (WWW). Possibly the first company to focus on
the commercial potential of the web was the Netscape corporation, established
in 1994. At the time, a major concern was the ability to perform online
purchases securely. Credit cards were quickly recognized as an appropriate
payment method, since they were already used widely for phone purchases
and mail orders; such remotely-authorized transactions were referred to as
card not present, to identify the risk due to reliance on card details without
visual confirmation of the physical card and handwritten signature, which were
required for the more common (at the time) card present transactions. However,
the transmission of credit card information over the Internet was considered less
secure than the somewhat-protected transmission of credit card information in
a phone call or by mail. In both phone and physical mail, there is some level
of authentication of the merchant, since the customer initiates the phone call
or addresses the physical mail; but Internet communication may be viewed by
different providers and, depending on technology, even other users. A secure
solution was considered essential for the commercial use of the Web.
Netscape came out with the first protocol to protect credit card transactions
over the Web - the SSL protocol. SSL’s initial goal was, basically, to provide
security for credit card transactions which would be comparable to the security
of credit card transactions performed remotely, over phone or mail (referred to
as ‘card not present’ transactions). SSL-protected web transactions became,
basically, a new way to perform ‘card not present’ transactions. Indeed, very
quickly, the use of SSL to protect web credit card transactions became widely
adopted and expected by customers.
Importantly, the use of SSL was also not limited to credit card transactions,
and it did not take very long for it to be widely used to protect other webpages as
well. SSL, as its name (Secure Socket Layer) implies, provides a general-purpose
secure communication interface, extending TCP’s widely-used socket API [369].
To secure web communication, Netscape added support for the https protocol,
which is basically the same as the web http protocol, except that http runs over
TCP, without security, while https runs over SSL (or TLS), providing security.
The design of SSL as a secure variant of the socket API gave it important
advantages over alternative protocols proposed around the same years. The
closest contender was Secure HTTP (SHTTP, [333]), a general-purpose security
extension for the HTTP protocol; but SHTTP is significantly more complex to
understand, implement and deploy.
Even more complex and ambitious were two other designs for credit-card
payments over the web: the Secure Electronic Transactions (SET) protocol,
developed by Microsoft, Visa and later also Mastercard [8], and the iKP
protocol, developed by IBM [34, 35]. Both SET and iKP tried to provide
security comparable to ‘card present’ transactions, by having the client’s device
digitally-sign each purchase; the goal was to provide a secure alternative to the
handwritten signature on a credit card slip. As a result, they were significantly
more complex to understand, implement and deploy, compared to SSL.
Three main advantages helped SSL to quickly become a success. First, SSL
was quickly implemented and deployed by Netscape, whose browser was, by far,
the most popular at the time, with a large lead over all other browsers combined.
Second, while SET and iKP provided better security for credit card transactions,
they were limited to this credit-card application; in contrast, SSL could be used
for other applications requiring secure client to server communication, not only
for credit card purchases. In particular, although the original motivation for
deploying SSL was to encrypt the credit-card information in transit, protecting against an
eavesdropper, it soon became apparent that server authentication provides a
critical security function, by allowing clients to identify impersonating websites.
The third and most significant advantage of SSL is that SSL is simple - simple
in its concept, simple to implement, simple to integrate in applications, most
notably, in a browser, and, most significantly, simple to adopt. Specifically,
SET and iKP required adoption by credit card processors as well as merchants
and customers, with private keys and certificates for each party, while SSL
required only adoption by merchants and customers. Furthermore, SSL and TLS
always perform server authentication, but client authentication is optional, and,
in fact, not widely deployed. Namely, deploying SSL (or TLS) only requires the
websites (merchants) to generate private keys and obtain public-key certificates.
Indeed, once the customer uses an SSL-enabled browser and the merchant offers
an SSL-enabled website, all parties (the merchant, customer and credit-card
processor) basically operate as in other card-not-present scenarios; no
additional change to their systems or processes is required.
The importance of simplicity and ease of deployment and use for applied
security mechanisms cannot be overstated, and we will return to it when we
introduce, later in this chapter, the Keep it Simple and Secure (KISS) principle
(Principle 14). For example, while SSL supports both server-authentication
and client-authentication, it is usually deployed with only server authentication,
requiring only servers (merchants) to obtain public key certificates; even today,
only a few clients obtain certificates (for client authentication).
The very first versions of SSL were not published. The first publication was
in June 1995, when Netscape published SSL version 2 (SSLv2) [202]. Later in
1995, Microsoft changed strategy; they published and implemented the Private
Communication Technology (PCT) Protocol [50], which is similar to SSLv2.
SSLv2 had significant design vulnerabilities, and its publication allowed the
web security community to expose these vulnerabilities. In November 1996,
Netscape published the much-improved, and quite different, SSLv3 [156] (later
published as [155]). The published specification was quite complete, allowing
independent, interoperable implementations.
Also in 1996, the IETF established a working group to develop an agreed
standard protocol to replace the proprietary-developed SSL and PCT protocols.
To avoid arguments on which of the two names should be used, a new name was
chosen: the TLS (Transport Layer Security) protocol. However, TLS 1.0 [120],
the first standard produced by the TLS working group, was closely based on
SSLv3.
Unfortunately, although SSLv3 and TLS 1.0 addressed some of SSLv2’s
vulnerabilities, they still had serious vulnerabilities, as well as non-security
limitations. In April 2006, the TLS working group of the IETF defined TLS 1.1 [121]
to fix the issues discovered in TLS 1.0. However, additional vulnerabilities and
concerns were discovered, motivating another release: TLS 1.2 [122] (published
in August 2008). These three TLS versions (1.0 to 1.2) were all quite similar to
SSLv3, only fixing clearly-exploitable vulnerabilities and adding features.
After vulnerabilities were discovered also in TLS 1.2, the working group
decided to do a major redesign. This took about 10 years; the IETF published
TLS 1.3 [329] only in August 2018. TLS 1.3 is still the latest version of
TLS. In contrast to previous designs, the TLS 1.3 designers gave preference to
mechanisms with proven security properties; ideally, we would like the complete
TLS protocol to be provably secure. While a complete proof of security has not
yet been published for TLS 1.3, there are encouraging and important partial results.
As a result of this stronger emphasis on security, as well as due to significant
changes to improve performance (mainly, to reduce latency), TLS 1.3 is a major
deviation from the previous versions.
7.1.2 TLS: High-level Overview
The TLS and SSL protocols were originally designed to secure the communication
between a web-browser and a web-server, and, while they are now widely
deployed for additional applications, web-security remains their main application.
We present a highly-simplified overview of this typical use-case in Figure 7.1.
In Figure 7.1, we show how Alice, a web user, surfs to the TLS-protected
website https://b.com; we focus on the simpler variants of TLS, which are based
on RSA encryption, and denote b.com’s public RSA encryption key by B.e.
Notice that the URL of TLS-protected websites begins with the protocol name
Figure 7.1: A simplified overview of the operation of TLS, to secure the login
between the browser and the web-server, using RSA for key exchange. The figure
shows the parties (Alice, her browser, the DNS, the CA and the web server) and
the following flows:
0a. Request certificate for B.e, b.com [,IDs]
0b. Certificate: SCA.s (B.e, b.com[,IDs], . . .)
1. https://b.com
2a. Resolve name b.com
2b. b.com A 1.2.3.4
3a. TCP handshake: SYN
3b. TCP SYN/ACK
4a. TLS Handshake: Client_Hello
4b. Server_Hello and certificate: (B.e, SCA.s (B.e, b.com, . . .))
4c. Key_Exchange: EB.e (k)
5a. TLS session: k (HTTP GET b.com)
5b. k (HTTP response (HTML login form))
6a. Display login form
6b. username, pw
7a. k (HTTP POST (username, pw))
7b. k (HTTP response . . .)
https, rather than the protocol name http, used for unprotected web sites. The
process consists of the following steps:
Step 0 (in advance): Web server obtains certificate. Before the server
of b.com can provide TLS service, it needs to obtain a certificate for its
public key B.e, signed by a certificate authority (CA) trusted by the
client (in this case, browser). For that purpose, the server sends to the
CA (in flow 0a) its domain name, b.com, its public encryption key B.e,
and optionally other identifiers (IDs). The CA should validate that the
server indeed ‘owns’ domain b.com, and is associated with any additionally
provided identifiers (the optional IDs). If validation passes, the CA signs
the certificate using its private signing key CA.s, and sends it to the server
(flow 0b). The certificate contains the public key B.e, the domain b.com,
the optional IDs and other ‘administrative’ information, such as validity
period. For more details about certificates see Chapter 8; in particular,
the validation process is discussed in subsection 8.2.8.
Step 1: client requests website https://b.com. The user (Alice) enters
the desired Universal Resource Locator (URL), https://b.com. The URL
consists of the protocol (https) and the domain name of the desired
web-server (b.com); in addition, the URL may contain identification of a specific
path and object in the server. In this example, Alice does not specify any
specific path or object, and the browser considers this a request for the
default object index.html. The choice of the https protocol instructs the
browser to open a secure connection, i.e., send the HTTP requests over
a TLS session, rather than directly over an unprotected TCP connection.
The request may be specified in one of three ways: (1) by the user ‘typing’
the URL into the address bar of the browser, i.e., ‘manually’, (2) ‘semi-automatically’,
by the user clicking on a hyperlink or bookmark which
specified this URL, or (3) by an instruction from the webpage currently
displayed by the browser.
Step 2: resolving domain name into IP address. To communicate with
the b.com web server, the browser needs the IP address of the server.
The Domain Name System (DNS) provides resolution (mapping) from
domain names to IP addresses. We simplify this process into a request
from the browser to the DNS (flow 2a), and a response from the DNS
to the browser specifying the IP address (flow 2b). The step is skipped
if the IP address is already known, typically, cached from a previous
connection. This step is vulnerable to network attacks, including MitM
attacks and off-path attacks exploiting weaknesses of the domain name
system (DNS) [200, 201]. Figure 9.1 illustrates a DNS poisoning attack
against a login webpage which uses TLS incorrectly, to only protect the
password submitted by the user.
Step 3: TCP handshake. The TLS protocol runs over the TCP (Transmission
Control Protocol) protocol, which provides important services such
as reliability and congestion/flow control. The first two flows of a TCP
connection are called the TCP handshake, and contain only control signals,
no data. The first flow is referred to as TCP SYN (flow 3a), and
the second flow is referred to as TCP SYN/ACK (flow 3b).
Step 4: TLS handshake. The TLS protocol also begins with a handshake,
i.e., a few control flows, which establishes the secure connection. Different
versions of TLS support different handshakes, which we describe in the
following sections; we simplify the handshake based on RSA encryption
in Figure 7.1. All TLS handshakes begin with the Client_Hello message (flow 4a).
The server responds with Server_Hello and the certificate (flow 4b). This
provides the browser with the server’s public encryption key (B.e); the
browser selects a random key, here denoted simply as k, and shares it
with the server by encrypting it using B.e and sending the encryption
(EB.e (k)) to the web server (flow 4c, the Key_Exchange message).
Step 5: the TLS session (record protocol), initial webpage. At this point,
the browser and server can communicate securely, with their messages
protected using the TLS record protocol and the key they shared (which
we denoted k). We denote the protection of the record protocol by the
envelope symbol, with the key as subscript, i.e., k (·). The browser first
sends an HTTP GET request, requesting the index.html webpage (flow
5a); the server responds by sending back the page, written in HTML
(Hypertext Markup Language), e.g., a login form (flow 5b).
Step 6: page displayed to user. Flow 6a represents the browser displaying
the webpage to the user (Alice), together with a few security indicators
such as a padlock. Flow 6b represents the user entering username and
password.
Step 7: additional HTTP requests and responses. Flows 7a and 7b are
examples of additional HTTP requests and responses, protected by the
TLS record protocol. In flow 7a, the browser sends the username and
password; the interaction typically continues with the server’s HTTP response
(flow 7b) and additional requests and responses (not shown).
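The steps above can be sketched with Python’s standard ssl module. The host name b.com and the https_get helper are illustrative (no network call is executed here); the default-context settings show how a client enforces server authentication.

```python
# Sketch of steps 2-5 of Figure 7.1 using Python's standard ssl module.
import socket
import ssl

ctx = ssl.create_default_context()   # loads trusted CAs for step 0's chain
# The default client context enforces server authentication:
assert ctx.check_hostname                      # server name must match URL
assert ctx.verify_mode == ssl.CERT_REQUIRED    # certificate must validate

def https_get(host, path="/"):
    # Steps 2-3: DNS resolution and TCP handshake (create_connection);
    # Step 4: TLS handshake (wrap_socket); Steps 5/7: protected HTTP.
    with socket.create_connection((host, 443)) as tcp:
        with ctx.wrap_socket(tcp, server_hostname=host) as tls:
            request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"
            tls.sendall(request.encode())
            return tls.recv(4096)

# e.g., https_get("b.com")  # would perform flows 2a-5b of Figure 7.1
```

Note that the application code never handles the handshake or record protection directly; the TLS library exposes the same socket-style interface as plain TCP, which is exactly the ease-of-use point discussed below.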
7.1.3 TLS: security goals
TLS and SSL are designed to ensure security between two computers, usually
referred to as a client and a server, in spite of attacks by a MitM (Man-in-the-Middle)
attacker. The goals include:
Key exchange: securely setup a secret shared key, preventing exposure of this
key to a MitM attacker.
Server authentication: authenticate the identity of the server, i.e., assure
the client that it is communicating with the right server.
Client authentication: authenticate the identity of the client. Client authentication is optional; in fact, TLS is usually used without client authentication, allowing an anonymous, unidentified client to connect to the server.
When client authentication is desired, it is usually performed by sending
a secret credential within the TLS secure connection, such as a password
or cookie.
Connection Integrity: validate that the communication received by one
party is exactly identical to the communication sent by the peer (in
spite of a MitM attacker); if it isn’t, abort the connection, with an error
message, rather than delivering information without integrity. Note that
TLS and SSL are run over TCP, which ensures integrity against benign
errors; therefore any failure must be due to an attack, and aborting the
connection is a sensible response. TLS and SSL detect not just corruption
of an individual message, but also message re-ordering, and truncation
attacks where the attacker drops the last message sent by the peer.
Connection confidentiality: Ensure that a MitM attacker cannot learn anything about the information sent between the two parties, except for the
‘traffic pattern’ - the amount of information sent/received.
Perfect forward secrecy (PFS): Version 3 of SSL and all versions of the
TLS handshake support the (optional) use of authenticated DH key
agreement, which ensures perfect forward secrecy (PFS), as discussed in
subsection 6.3.1. PFS is defined in Definition 5.3.
Crypto-agility: We say that a cryptographic protocol, such as TLS, provides
cryptographic agility or crypto-agility, if it allows the parties to select the
specific cryptographic algorithms they use for a given function (e.g., block
cipher, hash function or signatures). We introduced and discussed the importance
of crypto-agility in subsection 5.6.2; in particular, crypto-agility
is essential when a vulnerability is found or suspected in a particular algorithm.
TLS supports crypto-agility; it allows the cryptographic algorithms
to be negotiated in each session, through cipher suite negotiation.
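As an illustration of crypto-agility in practice, Python’s standard ssl module exposes both version and cipher-suite negotiation controls; the cipher string below is just an example policy, not a recommendation from this text.

```python
# Sketch: crypto-agility knobs exposed by Python's standard ssl module.
import ssl

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.minimum_version = ssl.TLSVersion.TLSv1_2   # refuse older TLS versions
# Restrict the (pre-TLS 1.3) suites offered in Client_Hello, e.g. to
# ephemeral-DH key exchange (for PFS) with AES-GCM:
ctx.set_ciphers("ECDHE+AESGCM")
offered = [suite["name"] for suite in ctx.get_ciphers()]
assert offered   # the non-empty list the peers would negotiate from
```

Cipher suite negotiation then picks, during the handshake, a suite supported by both client and server from such a configured list.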
All versions of TLS, as well as SSLv3, were designed to meet these goals;
for SSLv2, Perfect Forward Secrecy (PFS) was not a goal. Of course, the goals
may not be actually met, due to different vulnerabilities; we present some of
the important vulnerabilities in this chapter, and in particular, see Table 7.2.
7.1.4 TLS: Engineering goals
In addition to the security goals, the success of TLS is largely due to its focus
- from the very first versions - on the generic ‘engineering goals’, applicable
to any system, of efficiency, ease of deployment and use, and flexibility. By
addressing these goals, TLS is widely used and applicable in a very wide range
of applications and scenarios. Let us briefly discuss these three engineering
goals.
Efficiency - and session resumption. Efficiency is always a desirable goal.
In the case of TLS, there are two main efficiency considerations: computational
overhead and latency. In terms of computational overhead, the main consideration is minimizing the computationally-intensive public-key operations. To
minimize public-key operations, once the handshake establishes a shared key
(using public key operations), the parties may reuse this key to establish future
connections without requiring additional public-key operations. We refer to the
set of connections based on the same public-key exchange as a session, and a
handshake that reuses the pre-exchanged shared key as a session-resumption
handshake.
In terms of minimizing latency, the main consideration is to minimize
the number of round-trip exchanges. End-to-end delays are typically on the
order of tens to hundreds of milliseconds, which is usually much higher than the
transmission delays, esp. for the limited amount of information sent in a TLS
exchange. Reducing the number of round trips became even more important as
transmission speeds increased; this is reflected by the fact that until TLS 1.3,
all designs required a fixed two round trips to complete the handshake,
only then allowing the client to send a protected message (already in the third
exchange). In contrast, a TLS 1.3 handshake requires only a single round trip
(before sending a protected message), and even allows the client to send a
request already in the first exchange (with some limitations and somewhat
reduced security properties, see later).
A more minor efficiency consideration is minimization of bandwidth; this is
mainly significant in scenarios where bandwidth is limited, such as very noisy
wireless connections.
Extensibility and versatility. Extensibility is always important - definitely
for a widely deployed security protocol such as TLS, which is used in diverse
scenarios and environments. Indeed, part of the success of TLS derives from its
extensibility and versatility; the protocol supports many optional mechanisms,
e.g., client authentication, and flexibility such as crypto-agility (Principle 11).
Furthermore, from TLS version 1.1 (and even earlier for some implementations),
the TLS protocol supports a built-in extension mechanism, providing even
greater flexibility (subsection 7.4.3).
Ease of deployment and use. Finally, the success and wide use of the TLS
protocols are largely due to their ease of deployment and usage. As shown in
Figure 7.2, the TLS protocol is typically implemented ‘on top’ of the popular
TCP sockets API, and then used by applications, directly or via HTTPS or
other protocols. This architecture makes it easy to install and use TLS, without
requiring changes to the operating system and kernel. This is in contrast to
some of the other communication-security mechanisms, in particular the IPsec
protocol [127, 153], which, like TLS, is also an IETF standard. The ease of
deployment and use of TLS is probably the reason that TLS has become an
almost-universal security substrate for many systems, even where other protocols
may have advantages. For example, IPsec is probably better suited for Virtual
Private Networks (VPNs), yet TLS VPNs are more widely deployed.
7.1.5 TLS and the TCP/IP Protocol Stack
See Figure 7.2 for the placement of the TLS protocols with respect to the
TCP/IP protocol stack, and Figure 7.3 for a typical connection.
Figure 7.2: Placement of TLS in the TCP/IP protocol stack. The TLS handshake
protocol (first box in top line, in green) establishes keys for, and also uses, the
TLS record protocol (first box in second line, also in green). The HTTPS
protocol, and other application protocols that use the TLS record protocol, are
in the two middle boxes of the top line (in yellow). Application protocols that do
not use TLS for security, including the HyperText Transfer Protocol (HTTP),
are in the two last boxes of the two top lines (in pink). These protocols, as
well as TLS itself, all use the TCP protocol, via the sockets library layer. TCP
ensures reliable communication, on top of the (unreliable) Internet Protocol
(IP).
Figure 7.3: Phases of a TLS connection. The black flows (Syn+Ack and later
Fin+Ack) are the TCP connection setup and tear-down exchanges, required
to ensure reliability. The fuchsia flows represent the TLS handshake; notice
there are often more flows than the three shown. The blue flows represent the data
transfer, protected using the TLS record layer; and the red flows represent the TLS
connection tear-down exchange.
[This subsection is yet to be written; this material is well covered in many
textbooks on networking, e.g., [245].]
7.2 The TLS Record Protocol
In this section we begin our in-depth discussion of the TLS protocols. Specifically,
we focus on the record protocol component of TLS; this protocol protects the
communication using symmetric cryptography, i.e., encryption, authentication
(MAC) and/or authenticated encryption. The symmetric key used by the record
protocol must be previously set up, securely, by the handshake protocol, which
we discuss in the following sections.
We begin the in-depth discussion of TLS with the record protocol, rather
than with the more ‘interesting’ handshake protocol, for two reasons. First, we
think that the record protocol is simpler, and can be understood independently
of the handshake protocol. Second, by presenting attacks on vulnerable record
protocol options, we motivate some of the mechanisms later introduced by the
handshake protocol.
We already discussed the basic principles underlying record protocols in
Chapter 4 and Chapter 5. In this section, we focus on the TLS record
protocol, the most widely-deployed record protocol, which has some unique
aspects and instructive vulnerabilities.
We focus on the common case, where the TLS record protocol is applied
‘on top’ of an underlying reliable communication protocol - typically, TCP¹.
Hence, without an attack, messages sent are received reliably, without losses,
duplications or re-ordering; any deviation must indicate an attack, and justifies
closing the connection.
Both TCP and the TLS record protocol treat the data from the application
as one long stream of bytes, regardless of the sequence of (usually multiple)
calls in which the protocol receives the data from the application. Namely,
the application at the receiver should parse the stream of bytes which the
record protocol outputs into the different application-level units (typically
called messages).
The record protocol involves authentication, encryption and a few other functions
applied to the data. We already discussed such combinations in Section 4.7;
in this section, we focus on the TLS-specific aspects. In subsection 7.2.1, we
discuss the TLS Authenticate-then-Encrypt (AtE) record protocol, used in SSL
and in versions of TLS up to TLS 1.2. TLS 1.2 still supports the AtE record
protocol, but also supports the alternative AEAD record protocol, which we discuss
in subsection 7.2.7 (see also the discussion of AEAD schemes in subsection 4.7.1).
TLS 1.3 supports only the AEAD record protocol.
In subsections 7.2.3-7.2.6, we present attacks exploiting vulnerabilities in the
AtE record protocol. This motivates the adoption of the AEAD record protocol
(in TLS 1.3 and, optionally, in TLS 1.2).
Cipher suites. Different versions and implementations of TLS may support
different cryptographic algorithms. The list of cryptographic algorithms used
by both the record protocol and the handshake protocol, in a specific connection, is
called the cipher suite. For the AtE record protocol, this defines an encryption
algorithm, a MAC algorithm, and (optionally) a compression algorithm. For
the AEAD record protocol, there are fewer options: the record protocol is defined by
a single AEAD algorithm.
7.2.1 The Authenticate-then-Encrypt (AtE) Record Protocol
We begin our discussion of the TLS record protocol by focusing on the
Authenticate-then-Encrypt (AtE) design, used by SSLv3 and by TLS until
¹We do not cover DTLS [330], a variant of TLS designed to work over the UDP protocol,
i.e., over an unreliable datagram service.
Figure 7.4: The Authenticate-then-Encrypt (AtE) design of the record protocol
of SSL and TLS. Unfilled fields (pad, IV) are only used for block ciphers; the
IV field is added only from TLS 1.1. The MAC is computed over the sequence
number (SEQ), type (TYP), version (VER), length (LEN) and compressed
fragment, as in Equation 7.1. The type, version and length fields are sent, as
plaintext, together with the corresponding encrypted fragment. The record
protocol used (only) the AtE design until TLS 1.1; TLS 1.2 supports the use of
either the AtE design or the AEAD design (subsection 7.2.7).
version 1.2 (which allows AEAD as an alternative) and 1.3 (which only allows
AEAD).
Figure 7.4 illustrates the sequence of processing steps applied by a sender
running the TLS AtE record protocol. These steps are applied to the input that
the sender receives - from the application, or from the TLS alert or handshake
protocols. Let us discuss each of the steps, in the order they are applied by the
sender of the data:
Fragment: break the TCP stream into fragments; namely, a single (long)
‘message’, sent in one ‘send’ event by the application, may be parsed by
the record protocol into multiple fragments, as shown in the top lines of
Figure 7.4. Note that the record protocol may also aggregate multiple
(short) ‘messages’, sent in consecutive ‘send’ events, into one fragment (this
is less common and not shown in the figure). Each fragment consists of up
to 16KB. One motivation for fragmenting is to allow pipelined operation,
reducing latency. For example, the sender may process the first
fragment, then send it while, in parallel, processing the second fragment.
Another motivation for fragmenting is to allow recipients to allocate a
fixed-size buffer for incoming fragments, and avoid the risk of buffer
overflow bugs and attacks. TLS recipients should discard an incoming
fragment larger than 16KB.
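The fragmentation rule can be sketched in a few lines of Python; this is an illustration of the 16KB limit, not an excerpt of any TLS implementation (the function name is ours):

```python
MAX_FRAGMENT = 16 * 1024  # the TLS record-layer limit: 16KB per fragment

def fragment_stream(data):
    """Break an application byte stream into record-protocol fragments
    of at most 16KB each; one long 'send' may yield several fragments."""
    return [data[i:i + MAX_FRAGMENT] for i in range(0, len(data), MAX_FRAGMENT)]

# A 40,000-byte 'message' becomes two full fragments plus a short one.
frags = fragment_stream(b"x" * 40000)
assert [len(f) for f in frags] == [16384, 16384, 7232]
```

The receiver simply concatenates the decrypted fragments back into one stream; the application-level message boundaries must be recovered by the application itself, as discussed above.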
Compress: apply lossless compression to each fragment. Compression may
reduce the processing overhead and the communication. As discussed in
Section 4.7, ciphertext cannot be compressed. Therefore, compression
should be done before encryption - or not at all. Note that the length of
the compressed data depends on the amount of redundancy in the plaintext,
and encryption usually does not hide the length of the (compressed)
plaintext; hence, applying compress-then-encrypt risks exposing the
(approximate) amount of redundancy in the plaintext.
Indeed, the fact that TLS applies compression before encryption was
exploited in the CRIME, BREACH and TIME compression attacks [29,
283, 335, 354]. These attacks motivated the disabling of TLS compression, and
currently TLS compression is rarely used (and not even supported in TLS
1.3). However, compression attacks may still be possible with the (common)
use of application-level compression; this is exploited in the BREACH
attack [164]. We discuss these attacks in subsection 7.2.6.
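The length leakage behind these compression attacks can be demonstrated with a toy Python sketch; the secret and the guesses below are hypothetical, and `zlib` merely stands in for whatever compression precedes encryption. A guess that matches the secret compresses into a back-reference, so the compressed (and hence encrypted) record is shorter:

```python
import zlib

SECRET = b"cookie=SeCrEtToKeN-1234567890"  # hypothetical secret, for illustration

def compressed_len(attacker_chosen):
    # Length of the compressed plaintext; (stream-cipher) encryption
    # would not hide this length from an eavesdropper.
    return len(zlib.compress(attacker_chosen + SECRET))

# A correct guess repeats a long substring of the secret, so it
# compresses better than an incorrect guess of the same length:
wrong = compressed_len(b"cookie=WrOnGgUeSs-0987654321")
right = compressed_len(b"cookie=SeCrEtToKeN-1234567890")
assert right < wrong
```

By observing which guess yields the shortest ciphertext, an attacker in the CPA-Oracle model can recover the secret byte-by-byte; this is the core idea of CRIME-style attacks.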
Authenticate: The AtE record protocol authenticates the plaintext by applying
a MAC function, before encryption. The input to the MAC function
consists of the concatenation of a Sequence number (SEQ) field, indicating
the sequence number of the record, the type, version and length fields, and
the Compressed Fragment itself; the type, version and length fields are
defined as:
Type: one byte indicating the type of data in this record. The most
common type is ‘application data’, which is encoded by 0x17; other
types are used for the handshake protocol, the alert protocol (error
indicators) and for a special Change Cipher Specification (CCS)
message, indicating a change to a new set of cryptographic keys (and,
optionally, algorithms).
Version: an identifier of the version of TLS.
Length: the number of bytes in the (optionally compressed) fragment.
Namely, the MAC of a message sent by the server is calculated by:

MAC = MAC_{k_S^MAC}(SeqNum ++ Type ++ Version ++ Length ++ Compressed_Fragment)    (7.1)

A similar equation - using k_C^MAC instead of k_S^MAC - is used for the MAC
of messages sent by the client.
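As a concrete illustration of Equation 7.1, the following Python sketch computes the record MAC using HMAC-SHA1 (one of the MAC algorithms used by TLS cipher suites); the key and field values are hypothetical. The field sizes follow the TLS record format: an 8-byte sequence number, a 1-byte type, a 2-byte version and a 2-byte length:

```python
import hashlib
import hmac
import struct

def record_mac(mac_key, seq_num, rec_type, version, fragment):
    # MAC input per Equation 7.1:
    # SeqNum ++ Type ++ Version ++ Length ++ Compressed_Fragment
    header = struct.pack("!QBHH", seq_num, rec_type, version, len(fragment))
    return hmac.new(mac_key, header + fragment, hashlib.sha1).digest()

# e.g., the first application-data record (type 0x17) of a TLS 1.2 connection
tag = record_mac(b"k" * 20, 0, 0x17, 0x0303, b"GET / HTTP/1.1\r\n")
assert len(tag) == 20  # HMAC-SHA1 produces a 20-byte tag
```

Note that the sequence number is never transmitted; both parties maintain it implicitly, so a reordered or replayed record fails MAC validation at the receiver.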
MAC keys. The sender computes the MAC using a shared key; we use
k_C^MAC to denote the key for traffic from client to server, and k_S^MAC to
denote the key used for traffic from server to client². The recipient
validates that the value in the received MAC field is the same as the
result of the MAC function applied with the corresponding key. Both
k_S^MAC and k_C^MAC are generated by the handshake protocol.

Figure 7.5: Padding in the AtE record protocol of SSL and TLS, when using
a block cipher with blocks of l bytes; l = 8 for DES and l = 16 for AES. The pad
contains p = l − (x mod l) bytes, where x is the length of the authenticated
fragment (compressed fragment plus MAC); e.g., if l = 16 and x = 35, then
p = 13, and the padded authenticated fragment fits in three blocks (as in the
figure). In TLS, all p pad bytes must contain p − 1. In the SSL record protocol,
only the last pad byte must contain p − 1; the other p − 1 pad bytes may have
any value. Stream-cipher encryption does not require padding.
Padding: The input to a block cipher must be exactly one block; however, the
length of the output from the authentication step, consisting of the compressed
fragment and the MAC, would often not be an integral number of blocks.
Therefore, when using a block cipher, the TLS AtE record protocol
appends a padding string, ensuring that the total length of the input to
the encryption is an integral number of blocks, as shown in Figure 7.5. If
the length of the authenticated fragment is x bytes, and the block length
is l bytes, then the required number of pad bytes is p = l − (x mod l).
SSL restricts the pad to fill at most one block (0 < p ≤ l), but
TLS allows a longer pad, up to 256 bytes, which can be used to hide the
exact length of the fragment.
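The pad computation of Figure 7.5 can be sketched as follows (a minimal illustration, following the convention in the figure that each TLS pad byte holds p − 1; the function name is ours):

```python
def ate_pad(x, l):
    """TLS-style pad for an authenticated fragment of x bytes and block
    length l: p = l - (x mod l) bytes, each containing the value p - 1."""
    p = l - (x % l)
    return bytes([p - 1]) * p

# The example of Figure 7.5: l = 16, x = 35 gives p = 13 pad bytes,
# so the padded authenticated fragment fills exactly three 16-byte blocks.
pad = ate_pad(35, 16)
assert len(pad) == 13 and set(pad) == {12}
assert (35 + len(pad)) % 16 == 0
```

An SSL pad would differ only in that all bytes except the last may hold arbitrary values; as we will see, this seemingly-minor laxity is what Poodle exploits.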
² In SSLv2, the same keys are used also for encryption, and hence are simply denoted k_C and
k_S, and derived as in Eq. (7.20). Also, note that the TLS specifications refer to k_S^MAC as the
server’s MAC_write_key, i.e., a key used by the server to compute the MAC for segments
being sent (‘written’), as well as the client’s MAC_read_key; and similarly for k_C^MAC. We
find our notation simpler.
SSL uses X9.23 padding, and TLS uses PKCS#5 padding, to allow removal
of the padding after decryption, and to validate the correctness of the pad;
see their description in Section 2.9. Basically, in SSLv3, the length is in
the last padding byte, and the values of the other padding bytes are undefined,
while in TLS, all padding bytes must contain the number of padding bytes.
This seemingly-minor difference is significant, as we show by describing
the Poodle padding attack in subsection 7.2.3 below.
Prepend IV: Many block-cipher encryption algorithms, e.g., CBC, require
an initialization vector (IV), which is usually selected randomly; see
Section 2.8. In TLS 1.1 and 1.2, the IV is sent by the record protocol,
prepended to the ciphertext, as shown in Figure 7.4. SSL and
TLS 1.0 try to save the (very limited) resources required to select and
send the IV, and do not send the IV. This flawed design was exploited by
the devastating BEAST attack [132]; its publication was a main driver
for the adoption of TLS 1.1. See subsection 7.2.4.
Encrypt: TLS encrypts the concatenation of the compressed plaintext fragment,
the MAC and, if necessary, the padding. Padding is required when using a
mode-of-operation of a block cipher; it is not required for stream ciphers.
A basic problem with the record protocols of SSLv3 and TLS versions 1.0
to 1.2 is their use of the Authenticate-then-Encrypt (AtE) design, rather than the
secure Encrypt-then-Authenticate (EtA) design, or the use of a secure authenticated
encryption. We discussed these alternatives in Section 4.7. The EtA design
has been standardized, as a TLS extension (subsection 7.4.3), to improve the
security of TLS 1.0 to 1.2 [179], and authenticated encryption, specifically
using AEAD, is used by the TLS 1.3 record protocol, and optionally in
TLS 1.2 (subsection 7.2.7). We discuss attacks exploiting the use of AtE in
subsections 7.2.3-7.2.5.
7.2.2 The CPA-Oracle Attack Model
Following Principle 1, we now model the adversary against the TLS record
protocol, in what we call the CPA-Oracle Attack model. The model is applicable
to different applications of TLS. We focus on the common use of TLS to secure
the communication between a web client and a web server. Intuitively, our attack
model combines three adversary capabilities:
MitM: the adversary has Man-in-the-Middle (MitM) capability when it can
intercept packets sent between client and server, modify them and inject
forged packets.
Rogue website: the victim client innocently visits a rogue website controlled
by the adversary, e.g., 666.com. A rogue website can send automatically-executed
hyperlinks to the browser, e.g., a request to embed an image or
script from a victim website. Such attacks are referred to as cross site
attacks, and their goal is usually to abuse the relationship between the
client and the victim website. Cross site attacks are among the most common
attacks on web security, and there is extensive study of non-cryptographic
defenses against them. These defenses rely on TLS for the prevention of MitM
attacks.
CPA oracle: the adversary can receive an indication of whether the decrypted plaintext
is valid or invalid. In some attacks, the adversary also has the ability to
distinguish between a padding error and a MAC-validation error; this
ability is key to padding attacks. From SSLv3, the error messages are
encrypted, but different attacks, against different versions and implementations,
were able to detect whether a failure is due to invalid padding
or to invalid MAC, based on differences in timing. See subsection 7.2.3.
Figure 7.6: The CPA-Oracle Attack model on the AtE record protocol of SSL
and TLS. The attacker’s goal is to find a string x, such as a cookie (or password),
sent (encrypted) by a browser to a web server with every request. The model
allows the attacker to control the prefix p_i (the request path), and the suffix s_i
(the request body) for every request, such that the plaintext input is m_i = p_i ++ x ++ s_i.
We allow the attacker to intercept and modify the ciphertext, and to receive
feedback on the results of the validation of the decryption of c′_i. Both SSL and TLS
break the connection upon each error; the attacker may cause the browser to
send a new request (using the same secret x, and possibly changing (p_i, s_i)), but
the parties will use a separate key k_i in each connection i. Error messages are
encrypted, but some attacks, on some versions/implementations, distinguish
between invalid MAC and invalid padding; see subsection 7.2.3.
The CPA-Oracle Attack model (Figure 7.6) is a simplified model of an
attacker with MitM, rogue website and padding oracle capabilities. The goal
of the attacker is to expose information about a secret string x, taken from
some (known) distribution. Often, x is a cookie that the browser automatically
includes with every request that it sends to the benign website B.com. The rogue
website capability allows the attacker to cause the browser to send different
requests to the website, always including the cookie as part of the request; the
attacker may control much of the request, including the path (which comes before
the cookie) and the body/payload (which comes after the cookie). Furthermore,
the attacker often knows the contents of the rest of the request, except the
cookie. For simplicity, the CPA-Oracle Attack model lets the attacker choose,
with every request, both the prefix p_i and the suffix s_i, so that the plaintext input to
the TLS record protocol in the ith request is m_i = p_i ++ x ++ s_i. We further
assume, also for simplicity, that the attacker knows the length |x| of the secret cookie.
Error handling. The TLS record protocol aborts a connection upon receiving
any invalid ciphertext c′_i. The attacker can then send a new request, i.e., a new
plaintext prefix p_{i+1} and suffix s_{i+1}. TLS negotiates a new pseudorandom key
k_i for each connection i. The attacker could send multiple requests on the same
connection until it is aborted; if requests i and i + 1 are sent on the same connection
(no abort), then k_i = k_{i+1}.
The Plaintext-Recovery security goal. Ideally, the use of TLS should prevent
an attacker that can eavesdrop on the communication between Alice and the
benign website B.com from learning any information about x. However, the
AtE record protocol allows the use of compression before encryption, which
makes it impossible to ensure indistinguishability between the ciphertexts of a highly
compressible message and of a mostly-random message; see subsection 4.7.5.
Instead, we consider the more modest security goal of preventing plaintext
recovery. More specifically, the goal is to prevent exposure of a secret string x,
typically a cookie, which is part of the plaintext.
7.2.3 Padding Attacks: Poodle and Lucky13
The use of Authenticate-then-Encrypt (AtE) by SSL and by versions 1.0, 1.1 and
(optionally) 1.2 of TLS may result in vulnerabilities to padding oracle attacks,
introduced in Section 2.9. However, Section 2.9 focused only on encrypted
communication, without authentication or integrity checks on the plaintext. In
contrast, the TLS AtE record protocol validates a MAC on the plaintext, after
padding is removed. Note that if the pad is formatted correctly but the last
byte of the pad contains a wrong value (not the value p = l − (x mod l)), then
an incorrect number of ‘pad bytes’ would be removed; see Figure 7.5. This will
result in a failed MAC validation, since the input to the MAC will begin before
or after the correct beginning of the MAC field (depending on the value in the
last pad byte).
Furthermore, from early on, TLS designers were aware of the risk of padding
attacks, and took steps to prevent them. First, upon detecting an error - invalid
pad, invalid MAC or invalid contents - the connection is aborted. Second,
all error messages are encrypted, to prevent an attacker from distinguishing
between padding errors and MAC errors. Indeed, early padding attacks [380]
resulted in only limited information leakage from the encrypted messages.
This was quickly followed by more realistic attacks such as [90], which used
a timing side channel to distinguish between pad errors and MAC errors, based
on differences in the processing time of the two. Effective attacks, based on
easy-to-measure timing differences, are known for SSLv3 and TLS 1.0.
These attacks, at least, should have motivated the TLS 1.1 and 1.2 designers
to change from the (insecure) AtE of TLS to the secure EtA (Encrypt-then-Authenticate)
paradigm, or to an AEAD-based protocol. Abandoning the
vulnerable AtE follows the conservative design principle (Principle 3); but,
unfortunately, this did not happen.
Instead, the TLS 1.1 and 1.2 specifications include additional countermeasures
that attempt to prevent distinguishing between pad errors and MAC errors,
such as computing the MAC even if the pad is invalid. Such countermeasures
are ingenious, but unreliable. Furthermore, surely these steps cannot prevent a
possible padding attack that works without distinguishing between MAC and
pad failures!
We discuss two of the most important padding oracle attacks on TLS:
the Lucky13 and Poodle padding attacks. Lucky13 is based on circumventing
the countermeasures and distinguishing between MAC and pad failures, while
Poodle works even if the two errors are indistinguishable - making it much easier
to exploit. Both of these attacks are against the use of CBC mode; Lucky13
addresses TLS, which uses PKCS#5 padding, and Poodle addresses SSL, which
uses X9.23 padding.
Lucky13. We first briefly discuss the Lucky13 padding attack [9]. Lucky13
extends the padding oracle attacks from Section 2.9, specifically the attack
against PKCS#5 padding in Exercise 2.24. Lucky13 uses careful timing side-channel
analysis to distinguish between pad errors and MAC errors; i.e., it
circumvents the countermeasures against timing side channels in TLS 1.1 and
TLS 1.2, which were designed specifically to prevent such distinction. This
allows Lucky13 to then follow a similar approach to the padding oracle attack
of [90].
Lucky13 uses carefully-constructed plaintexts, allowing it to attack the secret
x one byte at a time, like the method used in Exercise 2.24. The attack cleverly
uses the fact that, in the (rare) cases that the pad is valid, the padding
bytes are removed. It designs the plaintext carefully to cause a difference in the
number of invocations of the compression function used iteratively by the MAC
algorithm; see Section 3.9. The details are elegant, and while we will not cover
them, the reader is encouraged to look them up in [9].
Poodle. Even more significantly, SSLv3 is also vulnerable to the Poodle
padding attack, which does not require distinguishing between padding and MAC
failures. Furthermore, many implementations of TLS are vulnerable to the
Poodle downgrade attack, which we discuss in Section 7.5. By downgrading
TLS to SSL, the Poodle downgrade attack allows the Poodle padding attack
to succeed against many implementations of TLS 1.0 to 1.2. This is a major
motivation for the adoption of TLS 1.3. We therefore describe the Poodle padding
attack.
The Poodle padding attack was introduced in [290]; it is based on observations
made years earlier, in [287]. Specifically, the Poodle padding attack is based on
three observations:
1. SSL uses X9.23 padding, where the only requirement for valid padding
is that the last byte contains a number smaller than the block length l.
There is no requirement on the values of the other pad bytes (in contrast
to PKCS#5 padding, see Section 2.9).
2. The CPA-Oracle Attack model allows the attacker to detect both invalid-padding
and invalid-MAC errors.
3. The CPA-Oracle Attack model allows the attacker to prepend a chosen
prefix p to the secret (cookie) x, and to append a chosen suffix s to x.
By adding or removing bytes from the prefix p and suffix s, the attacker
ensures that the entire plaintext, before padding, contains an integral number
of blocks: |m| ≡ 0 (mod l). As a result, the pad will also fill a whole
block, with the last byte containing (l − 1). For convenience, let us focus on
8-byte blocks, as with DES; then the last plaintext block, denoted m_n, consists
of eight bytes containing 0x07, i.e., (∀j : 1 ≤ j ≤ 8) m_n[j] = 0x07.
The Poodle attack takes advantage of an observation similar to the one we used to
solve Exercise 2.24, namely, that a random plaintext block would have valid
PKCS#5 padding if, and almost always only if, its last byte contains 0x00. The
difference is that SSL uses X9.23 padding, and also applies authentication
(MAC) before padding and encrypting; in particular, this means that now
we need the last byte of the plaintext to contain 0x07, in order to remove an
entire padding block and leave the MAC intact. The attack proceeds in three
steps.
First Poodle step: collect. In this step, the attacker ‘collects’ 256 SSL
record protocol packets, which we denote r^{0x00}, ..., r^{0xFF}. For convenience, all
records should consist of exactly n blocks, e.g., r^i = r^i_1 ... r^i_n. The records should
be correctly encoded and, in particular, decrypt into valid-padded plaintext; we
further require that the pad fill the entire last block of the plaintext. Since
SSL uses CBC mode, for 8-byte blocks:

(∀i ∈ {0x00, ..., 0xFF}) D_k(r^i_n)[8] ⊕ r^i_{n−1}[8] = 0x07    (7.2)

We further require that the value of the last byte of the before-last block of
each record be identical to the index of the record. Namely:

r^{0x00}_{n−1}[8] = 0x00, ..., r^{0xFF}_{n−1}[8] = 0xFF    (7.3)

This means we need to generate candidate records repeatedly, until we collect
all 256 records. This is not a lot of overhead, and should not require much more
than 256 requests to the encryption oracle, i.e., hyperlinks sent to the browser
to cause it to send a request to the victim server. We will use this collection of
records r^i in the following steps of the attack.
Second Poodle step: find the last byte of x. In this step, the attacker finds
the last byte of the cookie/secret x; for example, assume the cookie x is 8 bytes,
so we find x[8]. The attacker ensures, by adjusting the length of the prefix p,
that x[8] is the last byte of some plaintext block. Since we use CBC, there are
two consecutive ciphertext blocks, which we denote by c− and c+, such that:

x[8] = c−[8] ⊕ D_k(c+)[8]    (7.4)

The attacker now constructs chosen ciphertexts c^{0x00}, ..., c^{0xFF}, by using
c+ to replace the last block of each of the r^i records. Namely:

(∀i ∈ {0x00, ..., 0xFF}) c^i = r^i_1 ++ ... ++ r^i_{n−1} ++ c+    (7.5)

The attacker invokes the CPA oracle on each of these chosen
ciphertexts. The padding is valid only if the value of the last decrypted plaintext
byte is between 0x00 and 0x07. If the padding is valid, the MAC is
checked; and it is valid only if the padding consists of the entire last block, i.e.,
only if the last byte of the plaintext (and padding) contains 0x07. Since we use
CBC mode, this occurs when:

0x07 = c^i_{n−1}[8] ⊕ D_k(c^i_n)[8]    (7.6)

By substituting c^i_{n−1} = r^i_{n−1} and c^i_n = c+ (both from Equation 7.5), we have:

0x07 = r^i_{n−1}[8] ⊕ D_k(c+)[8]    (7.7)

Substituting now r^i_{n−1}[8] = i (Equation 7.3) and D_k(c+)[8] = x[8] ⊕ c−[8] (Equation 7.4), we have:

0x07 = i ⊕ (x[8] ⊕ c−[8])    (7.8)

Equation 7.8 holds when i = 0x07 ⊕ x[8] ⊕ c−[8], and therefore, exactly one
of the chosen ciphertexts will have valid padding and a valid MAC. Furthermore,
when we identify the ciphertext c^i that has valid padding and MAC, we can also
find x[8], the last byte of the secret/cookie, as x[8] = i ⊕ 0x07 ⊕ c−[8].
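The arithmetic of this step can be checked with a small Python simulation. This is a toy model, not an attack on a real implementation: the block cipher's decryption D_k is replaced by a memoized random function (the attacker never calls it directly, only observes the oracle's accept/reject answers), and all names are ours:

```python
import os

BLOCK = 8  # 8-byte blocks, as with DES, matching the text's example

# Toy model of the block cipher's decryption D_k: a memoized random
# function on blocks; the attacker only sees the oracle's answers.
_dk = {}
def D_k(block):
    return _dk.setdefault(block, os.urandom(BLOCK))

secret_x8 = 0x5A  # x[8], the last byte of the cookie (unknown to the attacker)

# c_minus, c_plus: consecutive ciphertext blocks satisfying Equation 7.4,
# i.e., x[8] = c_minus[8] xor D_k(c_plus)[8].
c_plus = os.urandom(BLOCK)
c_minus = os.urandom(BLOCK - 1) + bytes([D_k(c_plus)[-1] ^ secret_x8])

def oracle_accepts(i):
    """Record c^i has r^i_{n-1}[8] = i (Eq. 7.3) and last block c_plus
    (Eq. 7.5); it passes pad+MAC validation iff Equation 7.6 holds."""
    return (i ^ D_k(c_plus)[-1]) == 0x07

hits = [i for i in range(256) if oracle_accepts(i)]
assert len(hits) == 1                      # exactly one record is accepted
recovered = hits[0] ^ 0x07 ^ c_minus[-1]   # x[8] = i xor 0x07 xor c_minus[8]
assert recovered == secret_x8
```

As Equation 7.8 predicts, exactly one index i is accepted, and XORing it with 0x07 and c−[8] recovers the secret byte.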
Last Poodle step: finally, the attacker repeats the second step, with a minor
change, to find the other bytes of the cookie/secret x. This can be done easily
for the cookie (or any other secret automatically sent by the browser), as follows.
The adversary changes the prefix p to make a different byte of x be the last
byte in some plaintext block. Then the attacker can proceed exactly as in step
2 to find this other byte of x.
7.2.4 The BEAST Attack: Exploiting CBC with Predictable-IV
The basic design of the TLS record layer is defined for an arbitrary encryption
algorithm; however, the specifications define a limited number of standard cipher
suites. Furthermore, all of the EtA cipher suites use CBC mode encryption
(subsection 2.8.5). Several of the attacks on the record protocol, e.g., Poodle
and other padding oracle attacks, are based on properties of CBC mode. In
this section, we briefly discuss the BEAST³ attack, which also focuses on
the use of CBC mode, but addresses a very different vulnerability, existing in
SSL and TLS 1.0 (but not in later versions). This vulnerability is the use of a
predictable Initialization Vector (IV).
As presented in subsection 2.8.5, CBC mode requires the use of a random IV for each message. The value of the IV does not need to be secret, and in most implementations - including TLS 1.1 and 1.2 - the IV is sent 'in the clear', visible to an eavesdropper. However, the SSL design - adopted also by TLS 1.0 - used the handshake protocol to derive the IV for the first fragment in a connection, similarly to the derivation of the shared keys used by the record protocol. This means that both sender and recipient have a shared, pseudorandom IV for the first fragment; therefore, the protocol does not send the IV along with the rest of the ciphertext. We conjecture that the designers felt that this is a better design, probably since it appears that keeping the IV secret may, somehow, be beneficial against some future attack against CBC mode (with a particular block cipher). There is also the minor benefit of reducing the number of bytes sent.
Using a pseudorandom IV, without sending it, is fine. So, there is no problem with this 'implicit IV' method, for the first fragment in a connection. However, what about other fragments, sent over the same connection? Also for these (non-first) fragments, SSL and TLS 1.0 do not send an IV. Instead, for any non-first fragment in the connection, the SSL and TLS 1.0 design uses the last ciphertext block of the previous fragment sent over the connection as the IV for the new fragment. Namely, their IV is the value of the previous ciphertext block sent (from the most-recently-sent fragment).
Since the ciphertext is produced by a block cipher, this may seem secure. However, this is another example of the risk of trusting intuition rather than carefully validating the security of a design and relying on the exact cryptographic properties of the underlying mechanisms. Specifically, the security of CBC relies on the assumption that the IV is random, and hence unpredictable; once the IV is fixed (as the value of the last-sent ciphertext block), it is completely predictable and not random any more!
Let us see the cryptographic details of BEAST, under the CPA-Oracle Attack model. The model allows the attacker to choose each prefix-suffix pair (p_i, s_i), potentially as a function of the previous ciphertext c_{i−1}. We later briefly discuss the challenges in actually deploying BEAST in practice, since, obviously, the CPA-Oracle Attack model is only a simplification of reality.
BEAST: cryptographic aspects. Suppose we use 8-byte blocks (as with DES). The attack exposes the secret/cookie x byte by byte; let us first show how we expose the first (most significant) byte, x[1]. The attacker first provides a chosen-plaintext prefix p∗ (and suffix s∗), chosen to ensure that a specific

3 BEAST stands for Browser Exploit Against TLS.
block of the resulting plaintext m∗ = p∗ ++ x ++ s∗ would contain the last seven bytes of p∗, followed by the first byte x[1] of the secret/cookie. It is convenient to have p∗ contain fifteen bytes (two blocks minus one byte). The exact contents are not important, but for our discussion, a convenient choice is p∗ = 123456789ABCDEF. As a result, we have:

m∗[1 : 16] = p∗ ++ x[1] = 123456789ABCDEF ++ x[1]     (7.9)

Let c∗ denote the resulting encryption of m∗. Let c∗(1) = c∗[1 : 8], i.e., the first block (eight bytes) of c∗, and c∗(2) = c∗[9 : 16] denote the second block; similarly, m∗(2) = m∗[9 : 16] = 9ABCDEF ++ x[1]. Since we use CBC encryption, we have:

c∗(2) = E_k(c∗(1) ⊕ m∗(2)) = E_k(c∗(1) ⊕ (9ABCDEF ++ x[1]))     (7.10)

For i = 0, . . . , 255, the attacker next obtains IV_i, the IV that would be used to encrypt the next fragment. In the SSL (and TLS 1.0) design, this IV is the last-sent ciphertext block (the end of the previous ciphertext fragment). The attacker asks for encryption of plaintext p′_i, computed as:

p′_i = (m∗[9 : 15] ++ i) ⊕ c∗(1) ⊕ IV_i = (9ABCDEF ++ i) ⊕ c∗(1) ⊕ IV_i     (7.11)

Let c′_i be the CBC encryption of p′_i with the known IV value IV_i; hence, c′_i = E_k((9ABCDEF ++ i) ⊕ c∗(1)). We try the 256 different values for i until we find one of them, denoted i∗, such that:

c∗(2) = c′_{i∗} = E_k((9ABCDEF ++ i∗) ⊕ c∗(1))     (7.12)

Since E_k is a permutation, equal outputs of E_k imply equal inputs to E_k. Hence, from Equation 7.10, we find x[1] by the following deductions:

c∗(1) ⊕ (9ABCDEF ++ x[1]) = (9ABCDEF ++ i∗) ⊕ c∗(1)
(9ABCDEF ++ x[1]) = (9ABCDEF ++ i∗)
x[1] = i∗     (7.13)
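This derivation can be simulated end to end. The sketch below is a toy model under stated assumptions: a 4-round Feistel construction replaces the real block cipher, the ChainedIVOracle class (an illustrative name) encrypts attacker-chosen prefixes followed by the secret, and "predicting the IV" is modeled by letting the attacker read the last ciphertext block, exactly as an eavesdropper on SSL/TLS 1.0 could.

```python
import hashlib
import os

BLOCK = 8

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def E(k, block):
    # toy 4-round Feistel permutation, standing in for the block cipher E_k
    L, R = block[:4], block[4:]
    for r in range(4):
        L, R = R, xor(L, hashlib.sha256(k + bytes([r]) + R).digest()[:4])
    return L + R

class ChainedIVOracle:
    """CBC encryption where each fragment's IV is the last ciphertext
    block of the previous fragment (SSL 3.0 / TLS 1.0 style); encrypts
    an attacker-chosen prefix followed by the secret."""
    def __init__(self, k, secret):
        self.k, self.secret = k, secret
        self.prev = os.urandom(BLOCK)   # IV for the first fragment
    def encrypt(self, prefix):
        m = prefix + self.secret
        m += bytes(-len(m) % BLOCK)     # zero-pad to a block boundary
        out = []
        for j in range(0, len(m), BLOCK):
            self.prev = E(self.k, xor(m[j:j + BLOCK], self.prev))
            out.append(self.prev)
        return b"".join(out)

oracle = ChainedIVOracle(os.urandom(8), secret=b"Xtoken")
p_star = b"123456789ABCDEF"              # fifteen bytes: two blocks minus one
c_star = oracle.encrypt(p_star)
c1, c2 = c_star[:BLOCK], c_star[BLOCK:2 * BLOCK]   # c*(1), c*(2)

recovered = None
for i in range(256):
    iv_i = oracle.prev                   # the predictable next IV
    p_i = xor(xor(b"9ABCDEF" + bytes([i]), c1), iv_i)   # Equation 7.11
    c_i = oracle.encrypt(p_i)            # first block is E_k(p_i XOR IV_i)
    if c_i[:BLOCK] == c2:                # matches c*(2): the guess is correct
        recovered = i
        break
assert recovered == oracle.secret[0]     # x[1] = i*, as in Equation 7.13
```

Since E_k is a permutation, the first block of the probe ciphertext equals c∗(2) exactly when the guessed byte i equals x[1].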
Finding other bytes. Let us explain how we find x[2], by utilizing the fact that we already know x[1]; the method extends to the other bytes too. The attack simply requires choosing a fourteen-byte prefix p̂∗, e.g., p̂∗ = 123456789ABCDE. As a result, we have:

m̂∗[1 : 16] = p̂∗ ++ x[1 : 2] = 123456789ABCDE ++ x[1 : 2]     (7.14)

Since x[1] is already known, we are in a similar situation to earlier, of finding the last (unknown) byte in the second block, which is now m̂∗[16] = x[2]. Let ĉ∗ denote the resulting encryption of m̂∗, with ĉ∗(1) = ĉ∗[1 : 8], ĉ∗(2) = ĉ∗[9 : 16] and m̂∗(2) = m̂∗[9 : 16]. Since we use CBC, we have:

ĉ∗(2) = E_k(ĉ∗(1) ⊕ m̂∗(2)) = E_k(ĉ∗(1) ⊕ (9ABCDE ++ x[1 : 2]))     (7.15)
The attacker now performs a similar test to the one before to find the value of x[2]. Namely, for i = 0, . . . , 255, the attacker obtains IV_i, the IV that would be used to encrypt the next fragment, and then asks for encryption of p̂′_i:

p̂′_i = (9ABCDE ++ x[1] ++ i) ⊕ ĉ∗(1) ⊕ IV_i     (7.16)

The attacker eavesdrops to obtain ĉ′_i, the CBC encryption of p̂′_i with IV IV_i:

ĉ′_i = E_k(p̂′_i ⊕ IV_i) = E_k((9ABCDE ++ x[1] ++ i) ⊕ ĉ∗(1))     (7.17)

Similarly to before, the attacker finds a value î∗ ∈ {0, . . . , 255} for which:

ĉ∗(2) = ĉ′_{î∗}     (7.18)

Since E_k is a permutation, equal outputs of E_k imply equal inputs to E_k. Substituting the inputs to E_k from Equation 7.15 and Equation 7.17, we have:

ĉ∗(1) ⊕ (9ABCDE ++ x[1 : 2]) = (9ABCDE ++ x[1] ++ î∗) ⊕ ĉ∗(1)
(9ABCDE ++ x[1 : 2]) = (9ABCDE ++ x[1] ++ î∗)
x[2] = î∗     (7.19)

In this way, we find x[2] (as î∗); other bytes follow similarly.
BEAST: system aspects. The cryptographic aspects of BEAST were published already in 2004 by Bard [25], and observed for SSH and IPsec years earlier [38, 336]. Hence, following the conservative design principle (Principle 3), the TLS designers should have avoided the use of an observable IV, i.e., the value of the last-sent ciphertext block, preventing this vulnerability - as was finally done from TLS 1.1.
Unfortunately, as in many similar scenarios, the designers ignored these well-known warnings, and used the last-sent ciphertext block as the IV for the next fragment anyway. The reason is that deploying Bard's attack [25] seemed too challenging. In particular, the attack requires the attacker to control the very first block of the new fragment; but in the classical use of TLS to secure HTTP communication between browser and website, every HTTP request begins with a fixed header. So it seems that the attacker cannot control the first block - and the attack is prevented. Bard's paper [25] showed this problem may be overcome, but the solution wasn't very practical.
As a result, the vulnerability persisted in TLS 1.0. It was eventually addressed in TLS 1.1; however, deployment of TLS 1.1 was limited for years. One of the main drivers for the adoption of TLS 1.1 was the publication [132], by Duong and Rizzo, of several practical 'implementation tricks' allowing the BEAST attack to deploy the basic cryptanalytical ideas of Bard, addressing the system challenges that made Bard's attack [25] so difficult. Let us briefly mention the two most important implementation tricks, which are relevant for
other attacks too. First, they observed that TLS attacks can target the HTTP cookie field, which is often used as a (secret) authenticator sent automatically by the browser whenever sending a request to a specific website. Second, they observed that the use of the WebSocket mechanism [146] made the attack usable against TLS as used by browsers, i.e., when running the HTTP protocol over TLS (denoted as HTTPS). Further details are beyond our scope.
7.2.5 Exploiting RC4 Biases to Recover Plaintext
BEAST, Lucky13 and POODLE are all critical attacks against the use of CBC encryption as specified by TLS (1.0 to 1.2). Some countermeasures were proposed to these attacks, with the most popular one being to simply use stream-cipher encryption, using RC4, instead of using a block cipher such as DES or AES in CBC mode.
However, RC4 was known, for years, to have some vulnerabilities, such as the bias of the second byte observed and exploited in [276]; see subsection 2.5.6. Therefore, following the conservative design principle (Principle 3), it should have been avoided, and definitely not used as a 'more secure' alternative to CBC mode.
This choice to use RC4 as a supposedly more secure alternative to CBC was another example of underestimating the risk due to what appeared as 'impractical vulnerabilities' - this time, of the RC4 stream cipher (or pseudorandom generator). Indeed, in the common use of TLS to protect HTTP communication, the beginning of the information sent by the client is normally the HTTP header - which begins with well-known bytes. Therefore, we cannot exploit the significant bias of the second RC4 byte [276].
However, in [11], it was shown that RC4 has additional biases, which may allow exposure of confidential, sensitive communication such as the cookie. This included two types of biases:
Single-byte biases: biases were detected for some output bytes of RC4, not just the second byte - although it has the largest bias. In [11], additional biases were found in the first 256 bytes of RC4. By careful analysis of the results of a large number of encryptions of the same secret x, the attack can recover the secret with significant probability, which depends on the number of encryptions and on the positions of the secret (earlier positions usually had more bias). For example, after 2^25 encryptions, the first 50 bytes were recovered with probability of more than 50% per byte.
Double-byte biases: RC4 also has biases of pairs of bytes in different (adjacent) positions, as reported already in 2000 [148]. By carefully analyzing such pairs, using the known biases, the attacker can recover the plaintext from arbitrary positions.
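The single-byte biases are easy to observe empirically. The sketch below implements standard RC4 and estimates the best-known single-byte bias, the Mantin-Shamir bias of the second output byte toward zero; the trial count and key length are arbitrary choices for the demonstration.

```python
import random

def rc4_keystream(key, n):
    # standard RC4: key-scheduling (KSA) followed by n PRGA output bytes
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    i = j = 0
    out = []
    for _ in range(n):
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return out

rng = random.Random(1)
trials = 10000
# count how often the SECOND keystream byte equals zero, over random keys
zeros = sum(
    rc4_keystream(bytes(rng.randrange(256) for _ in range(16)), 2)[1] == 0
    for _ in range(trials))
```

With uniform output the second byte would be 0 with probability 1/256; RC4 produces it with probability close to 2/256, which is exactly the kind of bias the broadcast attacks of [11] accumulate over many encryptions.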
The combination of the attacks on CBC and on RC4 was an important motivation for the development and adoption of other cipher suites and improved record protocols, mainly the AEAD record protocols (subsection 7.2.7).
7.2.6 Exploiting Compress-then-Encrypt: The CRIME, TIME and BREACH Attacks
We conclude our discussion of attacks against the TLS record protocol by discussing attacks which focus on the compression of the plaintext before encryption. As we discussed in subsection 4.7.5, when we compress the plaintext and then encrypt the result, there is a risk of exposure of partial information about the plaintext. In particular, an attacker would be able to distinguish between the compressed-then-encrypted ciphertexts of two equal-length plaintexts p1, p2, if p1 has high redundancy (compresses to a much shorter string) while p2 has low redundancy (compression does not reduce its length).
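This distinguishing exposure is easy to demonstrate. The following sketch uses zlib (DEFLATE) compression and assumes a length-preserving (stream) cipher, so the ciphertext length equals the compressed length:

```python
import os
import zlib

# two equal-length plaintexts: p1 highly redundant, p2 incompressible
p1 = b"A" * 64
p2 = os.urandom(64)

# with compress-then-encrypt under a length-preserving (stream) cipher,
# the ciphertext length equals the compressed-plaintext length
c1_len = len(zlib.compress(p1))
c2_len = len(zlib.compress(p2))

# an eavesdropper distinguishes the two plaintexts by length alone
assert c1_len < c2_len
```

The redundant plaintext compresses to a handful of bytes, while the random one does not compress at all, so even a passive observer who sees only ciphertext lengths learns which plaintext was sent.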
However, the potential exposure due to plaintext compression may not appear to be a serious threat - and, unfortunately, this threat, presented already in 2002 [230], was mostly ignored for many years. In particular, as shown in Figure 7.4, the AtE TLS record protocol includes an (optional) compression process; and applications using TLS often apply their own processing to the messages, before sending them via TLS.
As you will find in Exercise 7.6, under the CPA-Oracle Attack model, it can be quite easy to deploy this attack and expose part or all of the cookie. Deploying it in practice involves several challenges, such as the following. First, the attacker may not be able to completely control the entire plaintext (except for the secret). Second, when using a block cipher, small changes in the length of the plaintext may not be reflected in corresponding changes in the length of the ciphertext. Third, the deployed compression schemes are considerably more complex than Exercise 7.6; and, as the exercise shows, details of the compression scheme are very relevant to the feasibility of the attack.
The CRIME attack [335], presented by Duong and Rizzo, demonstrated how these and other challenges can be overcome, allowing efficient exposure of the cookie or other secrets repeatedly sent in HTTP requests or responses. Specifically, the demonstration focused on exposure of cookies by utilizing TLS compression. We will not describe the attack here: the principle is quite simple (as will be evident from Exercise 7.6), but the attack necessarily involves details of the compression mechanisms in use, which are beyond our scope. The details are not that complex, and interested readers are encouraged to read about them; a good description is provided in [29].
In their presentation, Duong and Rizzo also discussed potential variants of the CRIME attack that can extract secrets sent in HTTP responses, such as CSRF tokens4, as well as the potential abuse of other compression mechanisms, such as the widely-deployed HTTP compression.
In spite of this, and contrary to the conservative design principle (Principle 3), the main response to CRIME was the disabling of TLS compression. Ignoring Principle 3 was obviously a CRIME, and the punishment followed, in the form of

4 A CSRF token is a pseudorandom identifier sent by a website to a browser, allowing the browser to submit operations on the user's account. CSRF tokens are the common defense against the Cross-Site Request Forgery (CSRF) attack.
two effective, convincing attacks that exploited HTTP compression mechanisms: TIME [29] and BREACH [164].
The BREACH attack showed how HTTP compression allows the application of CRIME to expose secrets in HTTP responses, most notably the above-mentioned CSRF tokens. Like CRIME, the disclosure is very effective.
TIME, like BREACH, also assumes only HTTP compression (not TLS compression). But TIME also extends CRIME in a more profound way: it shows how to expose secrets in HTTP requests, such as cookies, without requiring eavesdropping capabilities. Namely, the attacker does not have the full capabilities of the CPA-Oracle Attack model; it only controls a website visited by the user. This is the cross-site attack model, which is the model most commonly used in studies of non-cryptographic web security. Luckily, TIME seems significantly harder to deploy, as it may require an extensive number of queries and much time.
Preventing Compress-then-Encrypt Exposure. There are several possible countermeasures to the Compress-then-Encrypt exposure. We mention three of them.
First, the most certain way to avoid loss of confidentiality due to the use of Compress-then-Encrypt is simple: avoid compression. While the use of compression for data can be critical for performance, it may be possible to avoid compression of the sensitive information. For example, cookies are sent in the HTTP headers, which are usually much shorter than the payload; it may be acceptable to compress only the payload. Or, avoid HTTP compression completely, in spite of the performance hit!
A second possible countermeasure is to apply a special encoding to sensitive data, that will prevent compress-then-encrypt exposure of (only) that data - while allowing compression of the rest of the data. For example, we can apply a 'randomizing transform' R to sensitive data s, such as cookies and CSRF tokens, before applying compression. One simple randomizing transform would XOR the sensitive data s with a random or pseudorandom string r, which is appended to the data separately. A standard transform may even be embedded into TLS, requiring the application only to mark the sensitive data s. This could be a nice programming project for the interested reader, and may be a useful extension to TLS.
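A minimal sketch of such a randomizing transform is shown below; the function names randomize and derandomize are illustrative assumptions, and the transform simply transmits the random mask r alongside the masked value:

```python
import os

def randomize(secret):
    # mask the secret with a fresh random pad r; transmit r ++ (secret XOR r)
    r = os.urandom(len(secret))
    return r + bytes(a ^ b for a, b in zip(secret, r))

def derandomize(t):
    # split into mask and masked value, then XOR them back together
    half = len(t) // 2
    return bytes(a ^ b for a, b in zip(t[:half], t[half:]))

cookie = b"secret=7f3a9c"
masked = randomize(cookie)
# the masked bytes share no compressible match with an attacker-injected
# copy of the cookie, yet the recipient recovers the cookie exactly:
assert derandomize(masked) == cookie
```

Since the transmitted bytes are freshly randomized on every message, an attacker-chosen guess placed elsewhere in the plaintext can no longer shorten the compressed output by matching the secret.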
One unavoidable challenge of this approach, however, is the need to identify the sensitive data s. Some kinds of sensitive data may be amenable to automated identification (e.g., cookies), but other types may require the programmer to annotate the data. This is a serious disadvantage, as the countermeasure is prone to be applied incorrectly or not at all.
The third and final countermeasure we mention is to add random padding to the compressed data, hiding its exact length. This countermeasure is very intuitive, but may often fail, e.g., when the attacker averages out the randomness using multiple measurements.
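The following sketch shows why: the random pad hides the length of any single ciphertext, but the average over many observations still separates two true lengths that differ by one byte (the lengths, pad range and trial count are arbitrary illustrative choices).

```python
import random

def padded_length(true_len, rng):
    # observed ciphertext length: true length plus 0-15 random pad bytes
    return true_len + rng.randrange(16)

rng = random.Random(7)
trials = 4000
avg_a = sum(padded_length(100, rng) for _ in range(trials)) / trials
avg_b = sum(padded_length(101, rng) for _ in range(trials)) / trials

# any single sample is ambiguous, but the averages expose the one-byte
# difference between the two true lengths
diff = avg_b - avg_a
```

An attacker who can trigger many encryptions of the same secret (as CRIME-style attacks do) thus averages the padding away; random padding raises the cost of the attack but does not eliminate it.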
[Figure 7.7 illustration: a message sent by the application, e.g., an HTTP request, is split into fragments of up to 16KB each. Each fragment, together with its Type field and an optional Zeros pad, forms the plaintext input to the AEAD function under key k; the OTYP, VER and LEN fields form the additional data; and the nonce is derived from the sequence number SEQ. The AEAD produces ciphertext fragments 1, 2, 3, and so on.]
Figure 7.7: The AEAD Record Protocol (used always in TLS 1.3, and optionally in TLS 1.2). The ciphertext produced by the Authenticated Encryption with Additional Data (AEAD) function provides encryption of the plaintext input as well as authentication of both the plaintext and the additional data, using a nonce which should be unique in each invocation. The design contains several 'small print fields', e.g., SEQ and OTYP, which are explained in the text, and can be mostly ignored for intuitive understanding.
7.2.7 The TLS AEAD-based record protocol (TLS 1.3)
Most of the vulnerabilities identified for the TLS AtE record protocol can be avoided by adopting one of the two other designs discussed in Section 4.7: an Encrypt-then-Authenticate (EtA) design, which first applies an encryption scheme and then an authentication scheme, or an authenticated encryption with associated data (AEAD) scheme, which combines encryption (for confidentiality) and authentication.
RFC 7366 [179], Encrypt-then-MAC, applies the 'classical' Encrypt-then-Authenticate (EtA) paradigm. The specifications use TLS extensions to signal the use of EtA rather than AtE. It could be deployed in versions 1.0 to 1.2 of TLS (for version 1.0, provided that extensions are supported).
TLS 1.3 supports only the use of an authenticated encryption with associated data (AEAD) scheme, which ensures both confidentiality and authenticity; see subsection 4.7.1. AEAD schemes accept two types of data: plaintext, which is encrypted and authenticated, and additional data, which is only authenticated, not encrypted.
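The AEAD interface can be illustrated with a generic Encrypt-then-MAC composition built from standard-library primitives. This is a didactic sketch, not one of the AEAD schemes standardized for TLS 1.3 (those use AES-GCM or ChaCha20-Poly1305); the function names and the hash-based keystream are illustrative assumptions.

```python
import hashlib
import hmac

def _keystream(k, nonce, n):
    # counter-mode keystream from a hash, standing in for a stream cipher
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(k + nonce + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def _subkeys(k):
    # derive separate encryption and MAC keys (key separation principle)
    return (hashlib.sha256(b"enc" + k).digest(),
            hashlib.sha256(b"mac" + k).digest())

def aead_seal(k, nonce, plaintext, ad):
    ke, ka = _subkeys(k)
    ct = bytes(a ^ b for a, b in
               zip(plaintext, _keystream(ke, nonce, len(plaintext))))
    # the tag covers nonce, additional data and ciphertext; the additional
    # data itself is authenticated but NOT transmitted in encrypted form
    tag = hmac.new(ka, nonce + ad + ct, hashlib.sha256).digest()
    return ct + tag

def aead_open(k, nonce, sealed, ad):
    ct, tag = sealed[:-32], sealed[-32:]
    ke, ka = _subkeys(k)
    want = hmac.new(ka, nonce + ad + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, want):
        raise ValueError("authentication failed")
    return bytes(a ^ b for a, b in zip(ct, _keystream(ke, nonce, len(ct))))

k, nonce = b"k" * 16, b"n" * 12
sealed = aead_seal(k, nonce, b"fragment data", b"LEN=13")
assert aead_open(k, nonce, sealed, b"LEN=13") == b"fragment data"
```

Decryption with tampered additional data fails, which is exactly the property the record protocol relies on when it feeds the OTYP, VER and LEN fields into the AEAD as additional data.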
The design of the AEAD record protocol is illustrated in Figure 7.7. Without going into all of the 'small print fields' used by the protocol, which we discuss below, this design is simpler than that of the AtE record protocol (Figure 7.4). First, instead of a separate encryption scheme and authentication scheme, we use just the AEAD scheme; we also need only one key to achieve both confidentiality and authenticity. Second, the AEAD function provides all functions of a 'mode of operation' and more; it does not require a separate padding operation or a random initialization vector, although it does require a unique nonce for security [74]. This simplifies its use, and in particular, avoids the need for variants (for stream ciphers and for block ciphers). Finally, the AEAD record protocol does not support a TLS compression function, to foil Compress-then-Encrypt vulnerabilities as deployed by the CRIME attack (subsection 7.2.6).
The simplicity of the TLS AEAD-based record protocol follows the KISS principle (Principle 14), and helps avoid implementation vulnerabilities. This simplicity also facilitated automated verification of the security of the TLS 1.3 record layer specifications and of an appropriately-developed implementation [115].
The seven small print fields. Unfortunately, the TLS 1.3 record protocol also comes with seven fields whose meaning and use can be a bit obscure. Readers may mostly ignore these 'seven small print fields', but let us explain them anyway, and reserve judgement on whether including all of these small print fields in the design was really necessary or desirable. We present these 'small print' fields by their positioning in Figure 7.7, from left to right.
Nonce: The AEAD function requires provision of a unique nonce field in every invocation; the same nonce value should be used for the encrypt-and-authenticate operation and for the corresponding decrypt-and-verify operation. The nonce can be seen as a substitute for the random or unique IV required by most modes of operation, or for the counter used by CTR mode. The nonce input is 32 bits, and as the value should only be used once in the TLS connection, it imposes a (very high) limit on the number of fragments in a connection.
SEQ: The 32-bit sequence number of the fragment. By authenticating this data with each fragment, the protocol can detect any tampering with the order of fragments sent in the connection, e.g., reordering or duplicating some fragments. Any detected manipulation results in immediate disconnection, since unintentional errors would be prevented by the underlying TCP layer, which ensures reliable, ordered connections. Both sender and recipient maintain a count of the fragments sent and received; therefore, there is no need to actually send SEQ.
OTYP: This field is referred to in the TLS 1.3 specifications [329] as the opaque type; it is included for backward compatibility with earlier versions of TLS, and should always contain the fixed type value 23, indicating application data - even if, in reality, the fragment contains a different type of record, such as of the alert or handshake protocols. The 'real type' of the record is included in the plaintext, with the fragment data, and therefore its value is hidden from an eavesdropper.
VER: The legacy version field. This field is included for backward compatibility with earlier versions of TLS, and should always contain the value 0x0303. The protocol learns the 'real' version of TLS from the handshake protocol, since, in TLS 1.3, the record layer is invoked only after negotiation by the handshake protocol, which authenticates the protocol version.
LEN: The length of the ciphertext fragment, i.e., of the output of the AEAD. The length is provided as part of the additional data input to the AEAD; therefore, it should be computed before applying the AEAD, taking into account the length of the plaintext input and the expansion performed by the AEAD.
Type: This is the 'real' type of the fragment. TLS 1.3 and earlier versions define only four valid types: application data, handshake, alert, and CCS (Change Cipher Specification), each assigned a non-zero byte value.
Zeros pad: This is an optional field, which can have an arbitrary number of bytes whose value is 0x00 (zero). A random or otherwise selected number of zero pad bytes may be used to hide the size of the fragment from an attacker that can observe the TLS ciphertexts. The motivation to hide the length of the fragment is mainly due to the Compress-then-Encrypt vulnerabilities and attacks (CRIME, TIME and BREACH); see subsection 7.2.6.
7.3 The SSLv2 Handshake Protocol
In this section we discuss the SSLv2 (SSL version 2) handshake protocol, its features - and some of its main vulnerabilities. SSL version 2 is the earliest published version of the SSL protocol [202], and its handshake protocol is interesting beyond its historical importance. One motivation to study it is that SSLv2 already introduces many of the basic concepts and designs used in later versions - and, since it is a bit simpler, it is a good way for us to introduce these basic TLS concepts and designs. Another motivation is that the SSLv2 handshake has some serious vulnerabilities; understanding these vulnerabilities is instructive, to develop the ability to detect flaws in cryptographic protocols, and to understand and motivate the design of later versions of the TLS handshake protocol. Finally, surprisingly, there are still quite a lot of implementations that support SSLv2, although they also support (and prefer) later versions; this may make them vulnerable to downgrade attacks; see Section 7.5.
The SSLv2 handshake is a non-trivial cryptographic protocol, with support for multiple options and mechanisms - mostly supported also by all later versions (of SSL and TLS), often with extensions and improvements, and removal of insecure mechanisms. We describe the protocol in the following subsections. In §7.3.1 we present the 'basic' handshake, namely, the handshake when there is no existing session (already established shared key), and the protocol uses public-key operations to share a key. In contrast, in §7.3.3 we present the session-resumption handshake, allowing reuse of the shared key exchanged in a previous handshake between the same client and server, to open a new connection without additional public-key operations. In §7.5.1 we discuss how SSLv2 handles cipher suite negotiation, and explain how an attacker may exploit the (insecure) SSLv2 cipher suite negotiation mechanism to launch the simple yet effective cipher suite downgrade attack. Finally, in §7.3.4 we discuss how SSLv2 supports the (optional) client-authentication feature.
Terms and notations. SSLv2, as described in the original publications, e.g., in [202], uses several terms and notations which were modified in later versions. For consistency, we use the terms used by the later versions, also when describing SSLv2; these terms are often also more intuitive. For example, we use the terms client random rC and server random rS, as in SSL3 and TLS. However, the SSLv2 documentation refers to these fields as challenge and connection-ID, respectively.

7.3.1 SSLv2: the 'basic' handshake

In this subsection we discuss the 'basic' SSLv2 handshake, illustrated in Fig. 7.8, which is a simplification of the SSLv2 handshake protocol. This simplified version does not include cipher suite negotiation, session resumption and client authentication. We discuss these additional aspects of SSLv2 in the following subsections.
The Hello messages. The SSLv2 handshake begins with the client sending a ClientHello message to the server, specifying the client's protocol version and the client random, rC, a random bit-string used to randomize the key derivation. The server responds with the ServerHello message, which contains the server random, rS, and the server's public-key certificate. The certificate contains the server's public key, i.e., the server's RSA public encryption key S.e, the domain name of the server, e.g., s.com, and additional fields. The certificate is signed by an authorized Certificate Authority (CA), using the CA's private signing key CA.s.
The client verifies the certificate. This includes several checks: does the client trust the signing CA? Is the certificate properly signed? Is the domain that the client tries to connect to the same as the domain in the certificate (or one of the domains in the certificate)? Has the certificate expired or been revoked? There are a few more checks; we discuss certificates and their validation in Chapter 8.
The ClientHello and ServerHello messages retain these basic functions and fields in later versions of SSL and TLS; the most significant changes are in TLS 1.3.
7.3.2 SSLv2 Key-derivation

The SSLv2 handshake protocol establishes a shared master key, which we denote kM. The master key is selected by the client, and sent encrypted, using RSA, to the server, in the ClientKeyExchange message. Namely, the client sends E_S.e(kM).
[Figure 7.8 illustration - message flow between Client C and Server S (s.com):
C → S ClientHello: client random (rC)
S → C ServerHello: server random (rS); Certificate: S_CA.s(S.e, s.com, . . .) (signed by the CA's signing key CA.s, and containing the server's public encryption key S.e)
C selects kM and derives kC, kS
C → S ClientKeyExchange: E_S.e(kM); ClientFinished: kC(rS)
S recovers kM ← D_S.d(E_S.e(kM)) and derives kC, kS
S → C ServerFinished: kS(ID)]
Figure 7.8: 'Basic' SSLv2 handshake: new session, no client authentication, and ignoring cipher suite negotiation. The client and server select random strings (rC, rS, respectively) and exchange them in the ClientHello and ServerHello messages. The server sends its certificate. The client verifies that the certificate is valid and properly signed (by a trusted CA), and that the domain in the certificate matches the desired server domain. If so, the client selects randomly a shared master key kM, encrypts it using RSA encryption with the server's public key S.e, and sends it to the server. The client-to-server key kC and the server-to-client key kS are derived from the master key kM and the randomizers rC and rS, using the MD5 cryptographic hash, as in Equation 7.20. Both Finished messages are protected by the SSL record protocol, which we denote by kC (client-to-server communication) or kS (server-to-client); the ClientFinished contains the server's random rS, preventing replay, and the ServerFinished contains an identifier ID, allowing efficient session resumption (Figure 7.9).
The public-key encryption of kM is the most computationally-intensive operation by the client; therefore, it is desirable for the protocol to be secure even if the client reuses the same master key kM and its encryption E_S.e(kM) in multiple connections, assuming that the master key was not exposed. To ensure this, we use the client random and server random fields from the Hello messages, rC and rS, respectively. Namely, we combine rC and rS with the master key, and use the combination to derive session-specific cryptographic keys. The derived cryptographic keys are used to protect communication in the connection, and include keys for encryption/decryption as well as for authentication and verification of authentication (MAC).
In SSLv2, the parties derive and use only two keys from kM and the random nonces rC, rS: the client-to-server key kC and the server-to-client key kS. The client uses kC to encrypt the messages it sends and to compute the MAC attached to them for authentication, and uses kS to decrypt the messages it receives from the server and to compute the MAC over the ciphertext and compare it to the received MAC value, for authentication. These keys are derived as follows:
kC = MD5(kM ++ "1" ++ rC ++ rS)
kS = MD5(kM ++ "0" ++ rC ++ rS)     (7.20)
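Equation 7.20 maps directly to code; a small sketch follows, using Python's hashlib implementation of MD5 and random test values:

```python
import hashlib
import os

def sslv2_derive(km, rc, rs):
    # Equation 7.20: kC = MD5(kM ++ "1" ++ rC ++ rS),
    #                kS = MD5(kM ++ "0" ++ rC ++ rS)
    kC = hashlib.md5(km + b"1" + rc + rs).digest()
    kS = hashlib.md5(km + b"0" + rc + rs).digest()
    return kC, kS

km, rc, rs = os.urandom(16), os.urandom(16), os.urandom(16)
kC, kS = sslv2_derive(km, rc, rs)
assert kC != kS and len(kC) == len(kS) == 16
```

The fixed "1"/"0" labels are what separate the two directions: the inputs to MD5 differ in one byte, so the two 16-byte keys are independent-looking even though they share kM, rC and rS.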
Why this separation between kC , used to protect messages from client to
server, and kS , used to protect messages from server to client? One reason is
that, with a stream cipher, using the same key in both directions would result
in insecure reuse of the same key-pad for encryption of two different messages.
Another motivation, relevant also to block ciphers, is to improve security, following
the key separation principle (Principle 10). In particular, many websites are
public, and send exactly the same information to all users; however, we may
want to protect the confidentiality of the contents, e.g., queries, sent by the users.
By separating kC from kS , the attacker cannot use the large amount of
known plaintext sent from server to client to cryptanalyze the ciphertext sent
from client to server.
On the other hand, the use of the same secret key for two different
cryptographic functions violates this same key separation principle. This was
fixed in later versions of TLS, which derive separate keys for encryption and for
authentication - with the exception of TLS 1.3, which needs only one key, since
it uses a single AEAD scheme to ensure both authentication and encryption.
The use of both random numbers rC and rS is required to ensure that a
different key is used in each connection. This has three motivations. First,
it is necessary to prevent replay of messages - from either client or server; see
Exercise 7.9. Second, it reduces the total amount of known and chosen plaintext
available for cryptanalysis of any single key, and the amount of plaintext
exposed by a successful cryptanalysis. Finally, it avoids the possibility
that known or chosen plaintext from one connection, e.g., from the public login
page of a website, may help attack data sent in another connection,
which may not have the same amount of known or chosen plaintext.
Note, however, that the SSLv2 key derivation does not fully follow the key
separation principle, since it uses the same key for confidentiality (encryption)
and for message authenticity (MAC). This can cause a vulnerability even if both
the encryption and the MAC are independently secure; see Exercise 4.3. Indeed, later
versions of TLS use separate keys for encryption and for MAC, e.g., kC^E and kC^MAC.
7.3.3 SSLv2: ID-based Session Resumption
The main overhead of the TLS protocol is due to the computationally-intensive
public key operations. Often, there are multiple connections between the same
(client, server) pair, over a short period of time; in such cases, the server
and client may re-use the master key exchanged previously, thereby avoiding
additional public key operations. To facilitate re-use of the master key, the
server includes an identiőer ID at the end of the handshake; to re-use the same
master key in another connection, the client sends this ID with its client-hello,
Client
Server
Client hello: client random (rC ) , ID
Server hello: server random (rS ), ID-hit
Client finished: kC (rS )
Server finished: kS (rC ), kS (ID)
Figure 7.9: SSLv2 handshake, with ID-based session resumption. The client
initiates this handshake by including the ID field in the ‘Client hello’ message.
The ID field was received in the ‘Server finished’ message of a previous
connection, and cached, with the corresponding master key kM , by the client.
If the server does not have the (ID, kM ) pair in its cache, then the handshake
completes without resumption, as in Fig. 7.8. Otherwise, when (ID, kM ) is in
the server’s cache, the parties can reuse kM , i.e., ‘resume the session’, by
deriving new shared keys from kM , using Eq. (7.20). This avoids the public
key operations, encryption by client and decryption by server of the master key kM ,
as well as the overhead of transmitting the certificate SCA.s (S.e, s.com). The
server indicates such a cache hit by sending the ID-hit flag in its ‘Server hello’
response, and continuing with the resumption handshake as shown here.
and if the server has the corresponding key, the session is resumed efficiently,
avoiding additional public key operations. We illustrate this ID-based session
resumption process in Figure 7.9.
The impact of session resumption can be quite dramatic. The savings
are mostly in computation (CPU) time; instead of computing the public-key
encryption of the master key kM (by the client) and its decryption (by the
server) for every TCP connection, we now require these operations only for the
first TCP connection in a session. The ratio between the computation time
without and with session resumption is typically on the order of 100 for typical
usage, such as protecting web communication using the https protocol, i.e.,
running http over TLS.
Session resumption in SSLv2 is always based on the use of the ID. This
ID-based session resumption mechanism has a significant drawback: it requires
the server to be stateful, specifically, to maintain state for each session (the
session ID and the master key). In the typical case where the same web server
runs over multiple machines, this requires either that this storage be shared
among all of these servers, or that a client always contact the same machine
- a difficult requirement that is sometimes infeasible. These drawbacks motivate
the adoption of alternative methods for session resumption, most notably the
TLS session-ticket resumption mechanism, which we discuss later.
Note that the session resumption protocol is one reason for requiring the
use of client and server random numbers; see the following exercise.
Exercise 7.1. Consider implementations of the SSLv2 protocol where (1) the
client random or (2) the server random field is omitted (or always sent as a fixed
string). Show a message sequence diagram for two corresponding attacks, one
allowing replay of messages to the client, and one allowing replay of messages
to the server.
Hint: replay messages from one connection to a different connection (both using the same master key, i.e., same session).
7.3.4 SSLv2: Client Authentication
All versions of SSL and TLS, including SSLv2, support an (optional) client
authentication mechanism, where the client proves its identity by sending a
certificate for a public signature-validation key, and then signing content sent by
the server. Client certificates should identify a client approved by the server,
and be signed (issued) by a certificate authority (CA) trusted by the server,
just as server certificates should be signed by a CA trusted by the client.
In SSLv2, the information signed by the client consists of a signature, using
the client’s private key, over several fields, including a challenge sent by the
server with the request for client authentication, the server’s certificate, and the
shared connection keys. Furthermore, this signature should be sent encrypted
(using the appropriate connection key). It may not be immediately obvious why all
of these elements are used, but as we see in Exercise 7.10, removing some of
them may result in a vulnerability.
In the next section we discuss the handshake protocol from SSLv3 to TLSv1.2;
the client-authentication design of these versions is simpler and more amenable
to security analysis.
7.4 The Handshake Protocol: from SSLv3 to TLSv1.2
We will now discuss the evolution of the TLS handshake protocol after version
2, from version 3 of SSL [155] to versions 1.0, 1.1 and 1.2 of TLS [120, 122, 381].
These four handshake protocols are quite similar - we will mention the few major
differences. Later, in §7.6, we present version 1.3 of TLS, which is the latest
version - and involves more significant differences, compared to the more
incremental changes of these earlier versions.
The handshake protocol, especially before TLS 1.3, has multiple mechanisms
and options; we will not cover all of them. One important mechanism we
will not cover is handshake renegotiation. Renegotiation allows clients and
servers to change negotiated aspects of the session. One important use for
renegotiation is when a server decides to ask for client authentication after the
session began without client authentication (i.e., with an anonymous client).
Renegotiation was a rather complex mechanism, and subject to a few attacks,
most notably [57, 331].
Client
Server
Client hello: version (vC ), random (rC ), cipher suites, [extensions,]
Server hello: version (vS ), random (rS ), cipher suite, [extensions,]
certificate: SCA.s (S.e, s.com, . . .)
Client key exchange: ES.e (kP M );
Client finished: PRFkM (‘client finished:’, h(previous flows))
Server finished: PRFkM (‘server finished:’, h(previous flows))
Figure 7.10: The ‘basic’ RSA-based handshake, for SSLv3 and TLS 1.0, 1.1 and
1.2. The master key kM is computed, as in Eq. (7.21), from the pre-master
key kP M , which is sent in the client key exchange message (third flow). Notice
that the client key exchange message simply contains an encryption of kP M , i.e.,
ES.e (kP M ). From TLS 1.1, the specification supports (optional) extensions,
as illustrated; see subsection 7.4.3.
Figure 7.10 illustrates the ‘basic’ variant of the handshake protocol of
SSLv3 and of TLS (versions 1.0 to 1.2). Like SSLv2, this ‘basic’ variant uses
RSA encryption to send an encrypted key from client to server.
In the following subsections, we discuss the main improvements introduced
in these later versions of TLS, including:
Improved key derivation and kP M (§7.4.1): the key derivation process
was significantly overhauled between SSLv2 and the later versions, beginning
with SSLv3. In particular, the client-key-exchange message of the
basic exchange includes the premaster key kP M , from which the protocol
derives the master key kM . As before, the master key kM is used to derive
the keys for the record protocol, used to encrypt and authenticate data
on the connection. However, from SSLv3, the protocol correctly separates
the encryption keys from the authentication keys (in contrast to SSLv2).
DH key exchange and PFS (§7.4.2): From SSLv3, the TLS protocols support
DH key exchange, as an alternative or complementary mechanism
to RSA-based key exchange (the only method in SSLv2). The
main advantage is support for perfect forward secrecy (PFS).
Session-Ticket Resumption (§7.4.4): an important TLS extension allows
session-ticket resumption, a new mechanism for session resumption.
Session-ticket resumption allows the server to avoid keeping state for
each session, which is often an important improvement over the ID-based
session resumption mechanism supported already in SSLv2 (which
requires servers to maintain state for each session).
Improved handshake integrity and negotiation (§7.5): from SSLv3, the
handshake protocol’s finished message authenticates the data of the previous
flows of the handshake; this prevents the SSLv2 downgrade attack
(Figure 7.18) and other violations of handshake integrity. TLS, and to a
lesser degree SSLv3 too, also improve other aspects of the negotiation,
in particular, support for extensions, negotiation of the protocol version,
and negotiation of additional mechanisms, including key-distribution and
compression.
Two of these changes - improved key derivation and improved handshake
integrity - already have an impact on the ‘basic’ handshake. To see this impact,
compare Figure 7.10 (for SSLv3 to TLS 1.2) to Figure 7.8 (the corresponding
‘basic’ handshake of SSLv2). We therefore begin our discussion with these two
changes.
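The finished-message computation shown in Figure 7.10 can be sketched as follows, with HMAC-SHA256 standing in for the version-specific TLS PRF (an assumption for illustration; the actual PRF construction differs across SSLv3, TLS 1.0-1.1 and TLS 1.2):

```python
import hashlib
import hmac

def finished_message(k_m: bytes, label: bytes, transcript: bytes) -> bytes:
    """Compute PRF_kM(label, h(previous flows)), authenticating the handshake
    transcript. HMAC-SHA256 stands in for the version-specific TLS PRF."""
    transcript_hash = hashlib.sha256(transcript).digest()
    return hmac.new(k_m, label + transcript_hash, hashlib.sha256).digest()
```

Each side recomputes the value over its own view of the handshake flows; any tampering with an earlier message, e.g., a downgraded version field, yields a mismatch and aborts the handshake.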
7.4.1 SSLv3 to TLSv1.2: improved derivation of keys
Deriving the master key from the premaster key. From SSLv3, the
handshake protocol exchanges a pre-master key kP M , instead of the master key
kM exchanged in SSLv2. The parties derive the master key kM from the
pre-master key kP M , using a PRF, as in Eq. (7.21):
kM = PRFkP M (‘master secret’ ++ rC ++ rS )        (7.21)
The main motivation for this additional step is that the value exchanged
between the parties may not be a perfectly-uniform secret binary string, as
required for a cryptographic key. When exchanging the shared key using the
‘basic’, RSA-based handshake, this may happen when the client does not have
a sufficiently good source of randomness, or if the client simply resends the
same encrypted premaster key as computed and used in a previous connection
to the same server - not a recommended way to use the protocol, of course, but
possibly attractive for some very weak clients.
When exchanging the shared key using the DH protocol, there is a different
motivation for this additional derivation step from premaster key to
master key. Namely, the standard DH groups are all based on the use of a safe
prime; as we explain in §6.2.3, this implies that we rely on the computational
DH assumption (CDH), and that the attacker may be able to learn at least
one bit of information about the exchanged key. By deriving the master key
from the premaster key, we hope to ensure that the entire master key is
pseudorandom.
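A sketch of Eq. (7.21), using HMAC-SHA256 in counter mode as a stand-in for the generic PRF (the real PRF is version-specific; the function names and the 48-byte output length follow common TLS practice but are our assumptions here):

```python
import hashlib
import hmac

def prf(secret: bytes, data: bytes, n: int) -> bytes:
    """Generic PRF stand-in: HMAC-SHA256 in counter mode, n output bytes."""
    out, ctr = b"", 0
    while len(out) < n:
        out += hmac.new(secret, bytes([ctr]) + data, hashlib.sha256).digest()
        ctr += 1
    return out[:n]

def derive_master_key(k_pm: bytes, r_c: bytes, r_s: bytes) -> bytes:
    """Eq. (7.21): derive a master key from the (possibly non-uniform)
    premaster key, mixing in both random nonces."""
    return prf(k_pm, b"master secret" + r_c + r_s, 48)
```

Even if kP M is biased (e.g., a reused RSA-encrypted value, or a DH value leaking a bit), the derived kM should be pseudorandom as long as the PRF is secure and kP M retains enough entropy.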
Deriving connection keys. Another important improvement of the handshake
protocols of SSLv3 to TLS 1.2, compared to the SSLv2 handshake, is
in the derivation of the connection keys, which are used for encryption and
authentication by the record protocol. This aspect is not apparent from looking
at the flows (Fig. 7.10).
Specifically, recall that in SSLv2, we derived from the master key kM two
keys: kS for protecting traffic sent by the server S, and kC for protecting traffic
Table 7.1: Derivation of connection keys and IVs, in SSLv3 to TLS 1.2

key-block = PRFkM (‘key expansion’ ++ rC ++ rS )
key-block is partitioned, in order, into: kC^A, kS^A, kC^E, kS^E, IVC , IVS
sent by the client C, as in Eq. (7.20). In SSLv3 and TLS, we use kM to derive,
for traffic sent by the client C and the server S, three keys/values each, for a total
of six keys/values: two authentication (MAC) keys (kC^A, kS^A), two encryption
keys (kC^E, kS^E), and two initialization vectors (IVC , IVS ), used for initialization
of the ‘modes of operation’ (Section 2.8). In each pair of keys, we use the one
with subscript C for traffic from client to server, and the one with subscript S
for traffic from server to client.
To derive these six keys/values, we generate from kM a long string
referred to as the key block, which we then partition into the six keys/values.
The exact details of the derivation differ between these versions of the
handshake protocol, and arguably, none of the derivations is fully justified by
standard cryptographic definitions and reductions. We present the following
simplification, leaving the exact details for exercises; the interested reader can
find the full details in the corresponding RFC specifications.
Our simplification is defined using a generic pseudorandom function PRF,
whose input is an arbitrary-length string, and whose output is a ‘sufficiently
long’ pseudorandom binary string called the key-block, as follows:
key-block = PRFkM (‘key expansion’ ++ rC ++ rS )        (7.22)
The key-block is then partitioned into the six keys/values, as illustrated in
Table 7.1.
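The expansion and partition of Table 7.1 can be sketched as follows; the counter-mode HMAC stand-in for the PRF, and the key/IV lengths (for a hypothetical cipher suite with 20-byte MAC keys, 16-byte encryption keys and 16-byte IVs), are our assumptions:

```python
import hashlib
import hmac

def prf(secret: bytes, data: bytes, n: int) -> bytes:
    """Generic PRF stand-in: HMAC-SHA256 in counter mode, n output bytes."""
    out, ctr = b"", 0
    while len(out) < n:
        out += hmac.new(secret, bytes([ctr]) + data, hashlib.sha256).digest()
        ctr += 1
    return out[:n]

def derive_connection_keys(k_m, r_c, r_s, mac_len=20, key_len=16, iv_len=16):
    """Eq. (7.22) plus Table 7.1: expand kM into a key-block, then slice it,
    in order, into (kC_A, kS_A, kC_E, kS_E, IV_C, IV_S)."""
    need = 2 * (mac_len + key_len + iv_len)
    block = prf(k_m, b"key expansion" + r_c + r_s, need)
    sizes = [mac_len, mac_len, key_len, key_len, iv_len, iv_len]
    parts, pos = [], 0
    for s in sizes:
        parts.append(block[pos:pos + s])
        pos += s
    return tuple(parts)
```

Because the six values are disjoint slices of one pseudorandom block, each key is (computationally) independent of the others, following the key separation principle.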
7.4.2 SSLv3 to TLSv1.2: DH-based key exchange
From SSLv3, the TLS handshake supports DH key exchange. Three types
of DH key exchange are supported: ephemeral (signed), static (certified) and
anonymous (unauthenticated). The ephemeral method is the most popular TLS
handshake method, due to its significant security benefits. The static (certified)
method is rarely, if ever, used, and does not offer increased security; however,
the two methods are actually quite similar. The anonymous method is also rarely
used, since it does not provide server authentication; we focus on the other two
methods.
In both methods, the parties derive a shared key kP M , referred to as the
pre-master key, following the DH protocol. Specifically, TLS uses a modular
group, with an agreed-upon safe prime p and generator g. The parties exchange
their ‘public keys’, g^S.x (for the server) and g^C.y (for the client), where each
party uses a randomly-generated private key: S.x for the server and C.y for the
client. The parties then derive the pre-master key kP M , as in ‘plain’ DH
key exchange, namely:
Client                                                              Server
Client hello: version (vC ), random (rC ), cipher suites (. . . DH . . . )
Server hello: version (vS ), random (rS ), cipher suite: . . . DH . . . ,
certificate: SCA.s ((g, p, g^S.x mod p), s.com, . . .) [, extensions]
Client key exchange: g^C.y ;
Client finished: PRFkM (‘client finished:’, h(previous flows))
Server finished: PRFkM (‘server finished:’, h(previous flows))
Figure 7.11: SSLv3 to TLSv1.2: the static DH handshake, using a static (certified)
DH public parameter for the server, g^S.x mod p. The pre-master key kP M is
computed as in Eq. (7.23), and the master key kM is computed - from kP M - as in
Eq. (7.21).
kP M = g^(C.y·S.x) mod p        (7.23)
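A toy sketch of this exchange, illustrating that both parties reach the same kP M of Eq. (7.23). The parameters p = 23 (a safe prime, 23 = 2·11 + 1) and g = 5 are for illustration only; real TLS groups use primes of 2048 bits or more:

```python
import secrets

P, G = 23, 5  # toy safe-prime group; real TLS groups are far larger

def dh_premaster(p: int = P, g: int = G) -> int:
    s_x = secrets.randbelow(p - 2) + 1           # server's private exponent S.x
    c_y = secrets.randbelow(p - 2) + 1           # client's private exponent C.y
    g_sx, g_cy = pow(g, s_x, p), pow(g, c_y, p)  # exchanged 'public keys'
    k_server = pow(g_cy, s_x, p)                 # server computes (g^C.y)^S.x mod p
    k_client = pow(g_sx, c_y, p)                 # client computes (g^S.x)^C.y mod p
    assert k_server == k_client                  # both equal g^(C.y*S.x) mod p
    return k_server
```

An eavesdropper sees only g^S.x and g^C.y; recovering kP M from these is the computational DH problem.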
Recall that when using a modular group, the value exchanged by the
DH protocol is not pseudorandom; namely, security may rely only on the
computational DH assumption (CDH), as we know that the stronger DDH
(Decisional DH, Definition 6.7) assumption does not hold for such groups. This
is one motivation for not using kP M directly as a key to cryptographic functions.
Instead, we derive from the pre-master key kP M another key, the master key
kM , which should be pseudorandom. See §7.4.1, where we discuss the derivation
of the master key and of the keys for specific cryptographic functions, such as
PRF, MAC or shared-key encryption.
DH Static (certified) handshake. In static (certiőed) DH key exchange,
the server’s DH public key is signed as part of the signing process of a public key
certificate. Namely, the signing entity is a certificate authority which is trusted
by the browser, and the certiőcate contains the domain name (e.g., s.com) and
other parameters such as expiration date: SCA.s ((g, p, g S.x mod p), s.com, . . .).
See Figure 7.11.
In practice, the use of a certificate implies that the server’s DH public key,
g^S.x, is fixed for long periods, similarly to the typical use of RSA or other
public key methods. Hence, the static (certified) DH key exchange is similar in
its properties to the RSA key exchange; the difference is simply that instead
of using RSA encryption to exchange the key, and relying on the RSA (and
factoring) assumptions, the static (certified) DH key exchange relies on the DH
(and discrete-logarithm) assumptions.
DH Ephemeral (DHE) handshake: ensuring Perfect Forward Secrecy
(PFS). The DH Ephemeral (DHE) key exchange uses a different, randomly-chosen
private key for each exchange; for DH, this means that each party selects
Client                                                              Server
Client hello: version (vC ), random (rC ),
cipher suites (incl. DHE-RSA or DHE-DSS) [, ID] [, extensions]
Server hello: version (vS ), random (rS ), cipher suite (DHE-RSA),
Certificate: SignCA.s (S.v, s.com, . . .) [, extensions]
Server key exchange: SignS.s ((p, g, g^S.x mod p))
Client key exchange: g^C.y ;
Client finished: PRFkM (‘client finished:’, h(previous flows))
Server finished: PRFkM (‘server finished:’, h(previous flows))
Figure 7.12: SSLv3 to TLSv1.2: the DH Ephemeral (DHE) handshake. The
DH exponents S.x and C.y are chosen randomly for this handshake. The server
signs its DH public key g^S.x mod p, using RSA (DHE-RSA ciphersuite) or
DSS (DHE-DSS ciphersuite). The pre-master key kP M is computed as in Eq.
(7.23), and the master key kM is computed as in Eq. (7.21).
a new private exponent (S.x for the server, C.y for the client) in each handshake.
This is illustrated in Figure 7.12.
The DH exchange is ‘server-authenticated’, i.e., the server signs its ‘public’
DH value (g^S.x mod p), and links it with the particular handshake by including
in the signed data also the server and client random numbers (rS and rC ). To
allow the client to validate this signature, the server should send a public key
certificate that specifies its public signature-verification key, rather than the
server’s public decryption key, as used for TLS handshakes where the client
encrypts the pre-master key using the server’s public key. Following the key
separation principle, these two public keys should be different, but many servers
actually use the same public key (and private key) for both purposes. The lack
of key separation was exploited in several attacks against TLS [18, 73, 214, 341].
Once the TLS session terminates, the private exponents are erased - as
are any keys derived from them, including the pre-master key kP M , the
master key kM , the derived key block (Eq. (7.22)) and the keys derived from
it (kS^A, kS^E, kC^A, kC^E). This ensures perfect forward secrecy (PFS), i.e., the ith
session between client and server is secure against a powerful MitM attacker,
even if the attacker is given all the keys and other contents of the memory of
both client and server before and after the ith session, as long as the keys are
given only after the ith handshake is completed.
Security assumptions of DH key exchange. An obvious, important
difference between the RSA key exchange and the DH key exchange methods
is that, instead of using RSA encryption to exchange the key, and relying on
the RSA (and factoring) assumptions, the static (certified) DH key exchange
relies on the computational-DH (and discrete-logarithm) assumptions. Notice,
however, that in the typical case where the certificate uses RSA signatures, the
security of the handshake still relies also on the RSA (and factoring) assumptions.
Namely, the DH key exchanges require both the computational-DH (and discrete
logarithm) assumption and the RSA (and factoring) assumption. In this
sense, DH key exchange using RSA signatures requires more assumptions
than the RSA key exchange. TLS 1.2 (and 1.3) also support ECDSA
signatures, which, like the DH key exchange, are based on the discrete logarithm
assumption, avoiding reliance on an additional assumption (RSA, and therefore
also hardness of factoring), while retaining the advantage of perfect forward
secrecy (PFS).
7.4.3 The TLS Extensions mechanism
One of the most important improvements of TLS over SSL is that TLS supports
a flexible and secure extensions mechanism. This mechanism allows clients
to specify additional fields, not defined in the protocol, but supported (and
‘understood’) by some of the servers. Once a server receives an extension that
it supports, its behavior may change from the ‘standard protocol’ in an arbitrary
way (as defined by the extension); however, servers should ignore any unknown
extension.
Extensions were ‘unofficially’ supported as early as TLS 1.0, where servers
are required to ignore any unknown fields appended beyond the known fields,
as defined in [67, 68]. Support for extensions became a (mandatory) part of
the TLS specifications from version 1.1. Some standard extensions facilitate
important functionality, and some are needed for security; users may also define
additional extensions. Let us discuss one important extension here, and another
one in the next subsection.
The Server Name Indication (SNI) is an example of an important, popular
extension. Support of SNI became mandatory from TLS 1.1, and was one of
the main factors motivating websites and clients to adopt TLS. Many servers
refuse handshakes where the client does not include the SNI extension in the
Client Hello.
The main use of SNI is to support the common scenario where the same
web server is used to provide web pages belonging to multiple different domain
names, e.g., a.com and b.org. Each domain name may require a different
certificate; the SNI extension allows the client to indicate the desired server
domain name early in the protocol, before the server has to send a certificate
to the client - allowing the server to send the desired certificate based on the
web page that is being requested. Before SNI, the common way for a web server
to support multiple websites, with different domain names, was by having each
site use a dedicated port - an inconvenient and inefficient solution.
However, the SNI extension is valuable even for servers which host only a
single domain. This is because SNI allows the server to verify that the domain
that the client wants to connect to is the same as the hosted domain, before
the server spends the considerable computational resources needed to complete
the TLS handshake. This avoids spending server resources on incorrect client
Client                                                              Server
Client hello: client random (rC ), cipher suites, ID
Server hello: server random (rS )
Client finished: PRFkM (‘client finished:’, h(previous flows))
Server finished: PRFkM (‘server finished:’, h(previous flows))
Figure 7.13: SSLv3 to TLS 1.2 handshake, with ID-based session resumption.
requests sent to the server; it also avoids exposing the hosted domain (to an
attacker sending Client Hello to find out the domain name).
By requiring the SNI extension, a server can also prevent a potential Denial
of Service (DoS) attack that exhausts the server’s resources. Without SNI, a
rogue website could abuse visits by benign users to attack other sites; this kind
of DoS attack is called a cross-site Denial-of-Service attack; without SNI, it
could be especially effective against TLS 1.3. For details, see [193].
7.4.4 SSLv3 to TLSv1.2: session resumption
Both SSLv3 and TLS, like SSLv2, support the (stateful) ID-based session
resumption mechanism; however, many TLS servers also support extensions,
including the session-ticket extension, which provides an alternative, ‘stateless’
method for session resumption. In this subsection we discuss these two methods.
ID-based session resumption in SSLv3 and TLS 1.0-1.2. We begin with
the (stateful) ID-based session resumption mechanism, which did not change
much from its implementation in SSLv2.
Figure 7.13 illustrates the handling of ID-based session resumption in the
SSLv3 handshake protocol and in versions 1.0-1.2 of the TLS protocol. In
the figure, the client-hello message contains the session ID, denoted simply ID,
which was received from the server in a previous connection.
Session resumption is possible when the server still has the corresponding
entry (ID, kM , γ) saved from a previous connection; ID is the session identifier,
kM is the session’s master key, and γ contains ‘related information’ such as the
cipher suite used in the session.
When the server has the (ID, kM , γ) entry, it reuses kM and γ, i.e., ‘resumes
the session’, and derives new shared keys from kM (using Eq. (7.22)). This avoids
the public key encryption (by client) and decryption (by server) of the master
key kM , as well as the transmission of the relevant information, most significantly,
the public key certificate.
Note that when either the client or the server, or both, do not have a
valid (ID, kM ) pair, the handshake is essentially the same as a ‘basic’
handshake (without resumption), as in Fig. 7.8. The only changes are the
inclusion of the ID from the client (if it has one), and the inclusion of an ID in
the ‘server-finished’ message, to be (optionally) used for future resumption of
additional connections (in the same session).
The session resumption mechanism can have a significant impact on performance;
in particular, websites often involve opening a very large number
of TCP connections to the same server, to download different objects. The
reduction in CPU time can easily be a ratio of dozens or even hundreds. Therefore,
this is a very important mechanism; however, it also has some significant
challenges and concerns, as we discuss next.
Session-ID resumption: challenges and concerns. The basic challenge of
ID-based session resumption is the need to maintain state, and to look up the
state - and key - using the ID. To minimize the storage and lookup-time overhead,
the cache of saved (ID, kM ) pairs cannot be too large; on the other hand, if
the cache is too small, then the resumption mechanism is less effective.
This challenge is made much harder because web servers are usually replicated
- to handle high load, and to reduce latency by placing the server closer to the
clients, e.g., in a Content Distribution Network (CDN).
Ensuring PFS with ID-based session resumption. Another challenge is
that the exposure of the master key kM exposes the entire communication of
every connection to an eavesdropper; namely, the storage of the key may foil
the perfect forward secrecy (PFS) mechanism. To ensure PFS, we must ensure
that all copies of the key kM are discarded, without any copies remaining - a
non-trivial challenge.
This challenge is often made even harder by the way that web servers
implement the (ID, kM ) cache. Specifically, in some popular servers, e.g.,
Apache, the operator can only define the size of the (ID, kM ) cache. Suppose
the goal is to ensure PFS on a daily basis, i.e., to change keys daily. Then the
cache size must be small enough to ensure that entries will be evicted after
at most a day; yet, if it is too small, there will be many cache misses, i.e., the
efficiency gain of the resumption mechanism will be reduced. Furthermore, even
if we use a small cache, a client which continues a session for a very long time
may never get evicted from the cache, and hence we may not achieve the goal of
ensuring PFS on a daily basis, if the cache uses the (usual) paradigm of evicting
the least-recently-used element; to ensure entries are evicted after one day at
most, the cache should operate as a queue (first-in-first-out).
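The difference between LRU and FIFO eviction can be sketched as follows; the class name and interface are ours, for illustration:

```python
from collections import OrderedDict

class SessionCache:
    """Session-ID cache with FIFO eviction: an entry leaves the cache after
    at most `capacity` subsequent insertions, regardless of how often it is
    used (unlike LRU, where a frequently-resumed session may never leave)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._entries = OrderedDict()  # insertion order = eviction order

    def put(self, session_id: bytes, k_m: bytes) -> None:
        if session_id not in self._entries and len(self._entries) >= self.capacity:
            self._entries.popitem(last=False)  # evict the oldest-inserted entry
        self._entries[session_id] = k_m

    def get(self, session_id: bytes):
        # No reordering on lookup - that is what makes this FIFO, not LRU.
        return self._entries.get(session_id)
```

With FIFO, even a session that is resumed constantly is evicted once `capacity` newer sessions have been inserted, bounding the lifetime of each cached kM .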
Exercise 7.2. Consider a web server which has, on average, one million daily
visitors, but where the number on some days may be as low as one thousand.
What is the required size of the ID-session cache, in terms of the number of
(ID, kM ) entries, to ensure PFS on a daily basis, when entries are removed from
the cache only when necessary to make room for new entries? Can you estimate
or bound
Client
Server
Client hello: client random (rC ), cipher suites, ticket-extension(τ )
Server hello: server random (rS )
Client finished: PRFkM (‘client finished:’, h(previous flows))
Server finished: PRFkM (‘server finished:’, h(previous flows)),
ticket-extension(τ ′ )
Figure 7.14: Ticket-based session resumption, using a ticket τ which the client
sends to the server with its client hello; the client received τ from the server
in a previous handshake. The server should be able to validate the ticket as
one that the server issued previously, and not too long ago, and to retrieve
the shared pre-master key encrypted within the ticket. This is usually done by
having the ticket encrypted, and authenticated, using keys known only to the
servers.
how many of the connections will be served from the cache on a typical day?
Assume the ID-session cache operates using a FIFO eviction paradigm.
The Session-Ticket extension and its use for session resumption. The
TLS extensions mechanism provides an alternative, stateless session-resumption
mechanism. The idea is simple: together with the finished message of a successful
handshake, the server attaches a session-ticket extension. Later, when the client
re-connects to the same server, it attaches the previously-received session-ticket
extension. See Figure 7.14.
The ticket should allow any of the ‘authorized servers’ (e.g., those running the
website) to recover the value of the master key kM of the session with the client
- but prevent attackers, eavesdropping on the ticket as sent by the client, from
finding kM . This is achieved by having kM , and other values sent in the ticket,
encrypted using a secret, symmetric Session Ticket Encryption Key, which we
denote kST EK , known (only) to all authorized servers. Notice that kST EK is
not shared with clients or derived by TLS; the method of generating it and
sharing it between the servers is implementation-specific. Since kST EK is a
shared key, it is usually simply selected randomly.
Clients cannot decrypt the tickets; hence, they must store both the ticket and
the (plaintext) session's master key kM, to allow the client to perform its part of
the handshake.
The contents of the session ticket are only used by the servers, and are
opaque to the clients, i.e., not ‘understood’ or used by the clients; hence,
different implementations may use different tickets. RFC5077 [344] recommends
a structure which uses Encrypt-then-Authenticate, where the encrypted contents
include the protocol version, cipher suite, compression method, master secret
key, client identity and a timestamp.
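Such a ticket construction can be sketched as follows, using the Encrypt-then-Authenticate structure. This is a simplified illustration, not the RFC 5077 wire format: real implementations encrypt with AES, while this sketch derives a keystream from HMAC-SHA256 purely so it runs with the standard library, and the function names, the split of the STEK into encryption and MAC halves, and the 16-byte nonce are all our own illustrative choices.

```python
import hashlib
import hmac
import secrets

def _keystream(key, nonce, length):
    # Illustrative PRF-based stream; a real ticket would use AES (RFC 5077).
    out, counter = b'', 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(4, 'big'),
                        hashlib.sha256).digest()
        counter += 1
    return out[:length]

def seal_ticket(stek_enc, stek_mac, state):
    """Encrypt-then-authenticate the session state under the STEK."""
    nonce = secrets.token_bytes(16)
    ct = bytes(p ^ s for p, s in
               zip(state, _keystream(stek_enc, nonce, len(state))))
    body = nonce + ct
    tag = hmac.new(stek_mac, body, hashlib.sha256).digest()
    return body + tag  # opaque to the client, which just echoes it back

def open_ticket(stek_enc, stek_mac, ticket):
    body, tag = ticket[:-32], ticket[-32:]
    # Verify the MAC first: reject forged/modified tickets before decrypting.
    if not hmac.compare_digest(
            tag, hmac.new(stek_mac, body, hashlib.sha256).digest()):
        return None  # invalid ticket -> fall back to a full handshake
    nonce, ct = body[:16], body[16:]
    return bytes(c ^ s for c, s in
                 zip(ct, _keystream(stek_enc, nonce, len(ct))))
```

The MAC-then-decrypt order on the server side matches the Encrypt-then-Authenticate rationale: a tampered ticket is rejected before any decryption takes place.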
The timestamp allows the server to limit the validity period of a ticket (and
the keys contained within); if the server receives a ticket which has already expired,
or is invalid for any other reason, it simply ignores it and proceeds with the
'regular' handshake, establishing a new pre-master key (and potentially sending
a new ticket to the client).
The limited validity period of the ticket is important, to limit the risk from
exposure of the keys in a particular ticket, whether via cryptanalysis or in other
ways, such as abuse of a vulnerability of the browser. Limiting the validity of tickets is
clearly also necessary to ensure Perfect Forward Secrecy (PFS); however, it is
not sufficient. Specifically, to ensure PFS, we should also limit the ability of an
attacker to decipher messages from past recorded ciphertexts of long-terminated
connections, using an exposed Session Ticket Encryption Key (STEK) kSTEK.
Let us discuss this challenge.
PFS with Session Ticket Encryption Keys. To preserve PFS, e.g., on a
daily basis, we need to make sure that each Session Ticket Encryption Key
kSTEK is kept for only the allowed duration, e.g., up to 24 hours ('daily'). In
principle, this is easy; we can maintain this key only in memory, and never
write it to disk or other non-volatile storage, making it easier to ensure it is
not kept beyond the desired period (e.g., a day). This rule may require us to
maintain several ticket-keys concurrently, e.g., generating a new key once an hour,
while allowing each key to 'live' for up to 24 hours.
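Such a rotation scheme can be sketched as follows; the class and method names are our own, and a real deployment would also distribute each new key to all server replicas:

```python
import secrets
import time

class StekStore:
    """In-memory Session-Ticket-Encryption-Key store: a new key is added
    periodically, and keys older than `lifetime` seconds are discarded,
    bounding how far back an exposed key can decrypt recorded tickets."""
    def __init__(self, lifetime=24 * 3600):
        self.lifetime = lifetime
        self.keys = []  # list of (creation_time, key_id, key), newest last

    def rotate(self, now=None):
        now = time.time() if now is None else now
        self.keys.append(
            (now, secrets.token_bytes(16), secrets.token_bytes(32)))
        # Drop keys past their lifetime; keys only ever live in RAM.
        self.keys = [(t, kid, k) for (t, kid, k) in self.keys
                     if now - t < self.lifetime]

    def encryption_key(self):
        return self.keys[-1]  # always issue tickets under the newest key

    def decryption_key(self, key_id, now=None):
        now = time.time() if now is None else now
        for t, kid, k in self.keys:
            if kid == key_id and now - t < self.lifetime:
                return k
        return None  # unknown or expired key -> full handshake
```

Issuing always under the newest key, while accepting any unexpired key, lets tickets created just before a rotation still resume, without extending the exposure window.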
In the typical case of replicated servers, the ticket keys kSTEK should be
distributed securely to all replicas. Changing the key frequently becomes even
more important when it is used on so many machines.
Unfortunately, as with ID-based resumption, many popular web servers
implement ticket-based resumption in ways which are problematic for perfect
forward secrecy (PFS). These web-server implementations do not provide a
mechanism to limit the lifetime of the ticket key, except by restarting the server
(to force the server to choose a new ticket key). For some administrators and
scenarios, this lack of support for PFS may be a consideration for choosing a
server, or for using session-IDs and disabling session-tickets.
7.4.5 SSLv3 to TLSv1.2: Client authentication
The SSL and TLS protocols support, already from SSLv2, a mechanism for
authenticating the client, as an optional service of the handshake. In this subsection
we describe how this optional client authentication mechanism works, in SSLv3
and in TLS 1.0 to 1.2.
The TLS client authentication mechanism is illustrated in Figure 7.15. The
mechanism consists of three additions to the 'basic' handshake. First, the server
signals the need for client authentication, by including the certificate request field
together with the server-hello message. The certificate-request field identifies
the certificate-authorities (issuers) which are accepted by this server; namely,
client authentication is possible only if the client has a certificate from one of
these entities.
Client → Server: Client hello: client random (rC), cipher suites
Server → Client: Server hello: server random (rS), cipher suite, Certificate, CertificateRequest: {CAs}
Client → Server: Client-certificate: SignCAC.s(C.v, ...), CertificateVerify: SignC.s(h(handshake)), ClientKeyExchange: ..., Client finished: PRFkM('client finished:', h(previous flows))
Server → Client: Server finished: PRFkM('server finished:', h(previous flows))

Figure 7.15: Client authentication in SSLv3 to TLS 1.2.
Next, the client attaches, to its client key exchange message, two fields.
The first is the certificate itself; the second, called certificate verify, is a digital
signature over the handshake messages. The ability to produce this signature
serves as proof of the identity of the client.
This client authentication mechanism is quite simple and efficient; however,
it is not widely deployed. In reality, TLS is typically deployed using only
the public key (and certificate) of the server, i.e., only allowing the client to
authenticate the server, but without client authentication. The reason is
that TLS client authentication requires clients to use a private key, and to
obtain a certificate on the corresponding public key; furthermore, that certificate
must be signed by an authority trusted by the server.
This raises two serious challenges. First, clients often use multiple devices,
and this requires them to have access to their private keys on these multiple
devices, which raises both usability and security concerns. Second, clients
must obtain a certificate, and from an authority trusted by the server. As a
result, most websites prefer to avoid the use of TLS client authentication; when
user authentication is required, they rely on sending secret credentials, such as
passwords or cookies, over the TLS secure connection.
Note also that the client authentication mechanism requires the client to
send its certificate 'in the clear'. This may be a privacy concern, since the
certificate may allow identification of the client.
7.5 Negotiations and Downgrade Attacks (SSL to TLS 1.2)
The evolution of TLS, at least until TLS 1.3, saw an increasingly complex set of
different options and choices: different usage modes, different protocol versions,
different cipher suites (from SSLv2) and different extensions (from TLS 1.1).
To allow this flexibility, the specifications and implementations use different
negotiation mechanisms. The basic goal of negotiation is for client and server
to agree on the same options/choices.

Client → Server: Client hello: version (vC), client random (rC), cipher suites
Server → Client: Server hello: server random (rS), certificate: SCA.s(S.e, s.com, ...) and cipher suites
Client → Server: Client key exchange: cipher suite, ES.e(kM); Client finished: EkC(rS)
Server → Client: Server finished: EkS(rC)

Figure 7.16: SSLv2 handshake, with details of cipher suite negotiation. The
Client hello message indicates the options supported by the client; the Server
hello message contains the subset of these which are also supported by the
server. The client chooses one of these, and indicates the choice in the Client
key exchange message. Note: the negotiation was modified in later versions.
However, there is also a more challenging security goal: to prevent downgrade
attacks, where an attacker causes the parties to use a vulnerable option/choice,
although both parties are able, and prefer, to use a secure option/choice. As
we have seen for GSM (subsection 5.6.3), downgrade attacks can be simple to
understand and deploy, yet effectively break security, and they can persist
for years, as old versions die slowly. The situation with TLS is quite similar.
In this section, we discuss the pre-TLS-1.3 negotiation mechanisms, and
several effective, and quite simple, downgrade attacks. We exclude discussion
of TLS 1.3, since its negotiation mechanisms are quite different, mostly due
to insights from these downgrade attacks. Let us begin with SSLv2, which is
completely vulnerable to such attacks.
7.5.1 SSLv2 cipher suite negotiation and downgrade attack
In §5.6.2 we presented the crypto-agility principle (Principle 11), i.e., allowing
flexibility, replacement and upgrade of the cryptographic mechanisms. We also
discussed how the GSM support for crypto-agility is vulnerable to a downgrade
attack. How about SSLv2?
Figure 7.16 illustrates how SSLv2 also supports crypto-agility, i.e., the
SSLv2 cipher suite negotiation mechanism. Figure 7.17 gives an example of
the negotiation process, when the client supports three cipher suites, all using
the MD5 hashing algorithm, but with three different ciphers and key lengths
(128-bit or 40-bit keys for RC4, and 56-bit keys for DES), and the
server supports two of them (128-bit RC4 and 40-bit RC4). The 40-bit keys are
obviously insecure, but until about 2000, these were the only key lengths allowed
for products sold or distributed outside of the USA, due to USA export controls.

Client → Server: Client hello: client random (rC), cipher suites=RC4_128_MD5, RC4_40_MD5, DES_64_MD5
Server → Client: Server hello: server random (rS), certificate: SCA.s(S.e, s.com, ...) and cipher suites=RC4_128_MD5, RC4_40_MD5
Client → Server: Client key exchange: RC4_128_MD5, ES.e(kM); Client finished: EkC(rS)
Server → Client: Server finished: EkS(rC)

Figure 7.17: Example of SSLv2 cipher suite negotiation. In this example, the
client offers three cipher suites, and the server supports two of these. The
negotiation was changed from SSLv3.

Client → MitM: Client hello: client random (rC), cipher suites=RC4_128_MD5, RC4_40_MD5
MitM → Server: Client hello: client random (rC), cipher suites=RC4_40_MD5
Server → Client: Server hello: server random (rS), certificate: SCA.s(S.e, s.com, ...) and cipher suites=RC4_40_MD5
Client → Server: Client key exchange (RC4_40_MD5): ES.e(kM); Client finished: EkC(rS)
Server → Client: Server finished: EkS(rC), EkS(ID)

Figure 7.18: Cipher suite downgrade attack on SSLv2. Server and client end
up using a master key kM with only 40 secret bits, which the attacker can find
by exhaustive search. The attacker does not need to find the key during the
handshake; the parties use the 40-bit key for the entire connection, so the
attacker may even just record ciphertexts and decrypt them later. Note that
while SSLv2 is not used anymore, we later discuss a version downgrade attack
that may trick the server and/or client into using SSLv2, exposing them to this
(and other) attacks on SSLv2.
In SSLv2, the finished messages only confirm that the parties share the same
server and client keys (kS and kC, respectively), but not the integrity of
the rest of the hello messages; in particular, there is no authentication of
the cipher suites sent by server and client. This allows simple downgrade
attacks, removing 'strong' ciphers from the list of ciphers supported by client
and/or server. Figure 7.18 illustrates how a Man-in-the-Middle (MitM) attacker
may perform this downgrade attack on SSLv2; in the example illustrated, the
attacker removes the 'regular-version' 128-bit RC4 encryption from the list of
ciphers supported by the client, leaving only the weaker 'export-version' 40-bit
RC4 encryption. Indeed, the SSLv2 downgrade attack is even simpler and easier
to deploy than the GSM downgrade attack (§5.6.3).
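To see why 40-bit keys are so weak, consider the following toy brute-force sketch. The 'cipher' here is a stand-in (a hash-derived keystream XORed with the plaintext), not RC4, and we search a 16-bit key space so the demonstration runs instantly; the loop has exactly the same structure for 2^40 keys, which takes hours on commodity hardware rather than milliseconds.

```python
import hashlib

def toy_encrypt(key_int, key_bits, data):
    # Toy cipher for illustration only: XOR with a SHA-256-derived keystream.
    key = key_int.to_bytes((key_bits + 7) // 8, 'big')
    stream = hashlib.sha256(key).digest()
    return bytes(d ^ s for d, s in zip(data, stream))

def brute_force(key_bits, ciphertext, known_prefix):
    """Try every key; recognize the right one by a known plaintext prefix
    (e.g., 'GET ' at the start of an HTTP request). The attacker can run
    this offline, on recorded ciphertexts, long after the connection ended."""
    for guess in range(1 << key_bits):
        if toy_encrypt(guess, key_bits, ciphertext).startswith(known_prefix):
            return guess
    return None
```

Since the XOR construction is its own inverse, decryption is just `toy_encrypt` again; the only secret protecting the recorded traffic is the short key.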
7.5.2 Handshake Integrity Against Cipher Suite Downgrade
From SSLv3, the cipher suite negotiation mechanism was improved. The
most important change is the adoption of an important, simple defense of
handshake integrity, which prevents the cipher suite downgrade attack of
Figure 7.18. Two other changes are (1) extended cipher-suites that also specify
the key-exchange mechanism (SSLv2 cipher suites specified only the symmetric
cryptography), and (2) the server chooses the cipher-suite (in Server hello, i.e.,
the second flow), instead of the client (in the third flow).
'Finished:' handshake integrity foils cipher suite downgrade. Beginning with SSLv3, the handshake protocol includes a simple mechanism for
validating the integrity of all handshake messages. The client and server authenticate the entire handshake, using the master key derived for that connection.
In particular, if the cipher suite downgrade attack of Figure 7.18 were
launched against SSLv3 or TLS, then the server would detect the attack, and
abort the connection, upon receiving the Client Finished message.
Specifically, as can be seen, e.g., in Figure 7.10, both client and server send,
in their respective finished messages, a validation value, ensuring the integrity
of all previously exchanged messages in that handshake. Upon receiving the
finished message from the peer (server or client, respectively), the value is
checked; if incorrect, the handshake is aborted.
Similarly to the key-derivation process (§7.4.1), the details differ slightly
among the different versions, and we present a slight simplification, consistent
with the one we used in §7.4.1. Both validation values are computed following
the 'hash-then-authenticate' paradigm, using a hash function h, assumed to be
collision resistant, and a pseudorandom function PRF. The validation sent
with the Client finished message, which we denote vC, is computed as:

vC = PRFkM('client finished:' ++ h(handshake-messages))    (7.24)

Similarly, the validation sent with the Server finished message, which we denote
vS, is computed as:

vS = PRFkM('server finished:' ++ h(handshake-messages))    (7.25)

Note the similarity to the derivation of the key-block (Equation 7.22).
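The computation and checking of these validation values can be sketched as follows; HMAC-SHA256 stands in for the version-specific TLS PRF, and the function names are illustrative, not taken from any TLS library:

```python
import hashlib
import hmac

def finished_value(master_key, label, handshake_messages):
    """v = PRF_kM(label ++ h(handshake-messages)), as in Eqs. 7.24/7.25.
    HMAC-SHA256 stands in here for the version-specific TLS PRF."""
    transcript_hash = hashlib.sha256(handshake_messages).digest()
    return hmac.new(master_key, label + transcript_hash,
                    hashlib.sha256).digest()

def verify_finished(master_key, label, own_transcript, received_value):
    # Each side computes over its OWN view of the handshake; any MitM
    # modification (e.g., a trimmed cipher-suite list) changes the hash
    # input, so the received and locally computed values will not match.
    expected = finished_value(master_key, label, own_transcript)
    return hmac.compare_digest(expected, received_value)
```

The key point is that the transcript hash covers the hello messages, including the cipher-suite lists, which SSLv2 left unauthenticated.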
Security Analysis of Finished Validation. Let us give an intuitive explanation of the security provided by the Finished validation mechanism, from SSLv3,
focusing on the prevention of cipher suite downgrade attacks.
For simplicity, assume that the client and server both support a secure shared
cipher suite X, which they prefer over a vulnerable cipher suite V. Also, we focus
on the (typical) handshake, with server authentication but without client
authentication. In this case, the protocol should ensure that if the handshake
completes successfully at a client, then it also completed successfully at the
server, and the two parties use cipher suite X (or a more preferred secure cipher
suite).
Consider an execution in which the client completes the handshake successfully. In such an execution, the client must have received a valid Server finished
message, i.e., the client's value of vS, computed as in Equation 7.25, is the same
as the value received from the server, which we denote vS′.
Basically, the security follows from the use of 'hash-then-authenticate'.
Let us elaborate a bit, mainly to highlight the assumptions involved. The
first assumption is that the master key kM cannot be exposed by an attacker
sending a (manipulated or fabricated) Server finished message. Namely, kM is
pseudorandom and known only to the client and to the intended server.
Note that this assumption, and the other assumptions identified below, should
hold for any supported cipher suite. Also, recall that, from SSLv3, the
cipher suite specifies both the symmetric-encryption mechanisms and the key
exchange mechanisms. Hence, all key exchange mechanisms supported by the
client must be secure, at least to the extent of preventing exposure of kM during
a successful handshake.
The basic argument for security is that when the client receives a valid
Server finished message, with vS = vS′, then the server must have previously
sent that message, and client and server must have seen identical handshake
messages. This argument is based on the assumption that kM is pseudorandom,
and on the use of hash-then-authenticate in the computation of both Finished
messages.
Hence, our analysis assumes that the PRF function is a secure pseudorandom
function, and that the hash function h is a collision-resistant hash function.
The observant reader will notice that the second 'assumption' is not really an
assumption as much as a simplification, since h is a keyless hash and there are
no keyless collision-resistant hash functions (subsection 3.2.2). This is one of
multiple reasons why we cannot hope to provide a full, reduction-based
proof for the security of TLS, at least until version 1.3.
The rest of the security argument follows. A PRF provides message authentication (subsection 4.5.1) when used with a pseudorandom key (in this
case, kM). Hence, when the client receives vS′, this implies that the server
previously computed vS′ = PRFkM('server finished:' ++ h(handshake-messages)),
providing as input ('server finished:' ++ h(handshake-messages)), and then sent
it. Since the client computed the same value (vS = vS′), it follows that the
inputs to the PRF provided by client and server were the same. From the
collision-resistance of h, the inputs to the hash were also the same. It follows
that both parties have seen identical handshake messages. In particular, they
must have seen the same cipher suite negotiation messages, and hence, cipher
suite downgrade attacks are impossible from SSLv3.
Details of the hash function h. The computation of the validation values
vC, vS involves, in both equations, a cryptographic hash function h, whose
definition differs between the different versions. Specifically, in TLS 1.2, the
hash function is implemented simply as SHA-256, i.e., h(m) = SHA_256(m).
The TLS 1.0 and 1.1 design is more elaborate, and follows the 'robust combiner
for MAC' design of §4.6.2; specifically, the hash is computed by concatenating
the results of two cryptographic hash functions, MD5 and SHA1, as: h(m) =
MD5(m) ++ SHA1(m). SSLv3 also similarly combines MD5 and SHA1; however,
in SSLv3 the combination is in the computation of PRF itself, and fails to
ensure a robust combiner; details omitted.
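The two version-specific definitions of h can be written directly; a minimal sketch, with function names of our own choosing:

```python
import hashlib

def tls10_handshake_hash(m: bytes) -> bytes:
    """h(m) = MD5(m) ++ SHA1(m), the TLS 1.0/1.1 robust-combiner design:
    finding a collision in h requires colliding MD5 and SHA-1
    simultaneously, on the same pair of inputs."""
    return hashlib.md5(m).digest() + hashlib.sha1(m).digest()

def tls12_handshake_hash(m: bytes) -> bytes:
    """TLS 1.2 replaced the combiner with plain SHA-256."""
    return hashlib.sha256(m).digest()
```

Note the combined output is 16 + 20 = 36 bytes, versus 32 bytes for SHA-256.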
7.5.3 Finished Fails: the Logjam and FREAK cipher suite downgrade attacks
In spite of the above security analysis of the Finished message validation, two
cipher suite downgrade attacks have circumvented the defense: FREAK and
Logjam. Both attacks exploit the fact that, due to USA export regulations
until around 2000, SSLv3 and TLS 1.0 support several cipher suites which use
short, vulnerable keys; these cipher suites were supposed to be used (only) for
exported versions of TLS, i.e., versions distributed outside the USA.
The FREAK5 attack [56] exploits the RSA_EXPORT cipher suite, which
uses a weak, 512-bit modulus N, which can be factored in a few hours by affordable
hardware. The attack was effective on popular implementations of TLS; however,
it exploited a subtle bug in these implementations, which caused the client to
accept the weak (512-bit) key although the client was using the strong (non-export) RSA cipher suite.
Similarly, the Logjam attack [7] exploits an exportable version of the Diffie-Hellman Ephemeral (DHE) key exchange (Figure 7.12), which uses 512-bit
groups. While the specific exponents are (correctly) chosen randomly in each
run, many implementations use the same groups. However, for a given group,
it is possible to perform a precomputation step, following which different
discrete logs (with the same modulus) can be computed at acceptable, if
still significant, computational cost. If the attack can be carried out in real
time, the MitM attacker may be able to find the pre-master key derived by the
client, and hence forge a valid Server Finished message, thereby successfully
impersonating the legitimate server. See Figure 7.19.
5 FREAK stands for Factoring RSA Export Keys. With an added 'A' for fun, I guess.
After the MitM attacker completes the precomputation phase of the discrete log for the group (g, p), the attack proceeds as follows:

Client A (Alice) → MitM: Client Hello: random rC, cipher suite: DHE
MitM → Server B (Bob.com): Client Hello: random rC, cipher suite: DHE_Export
Server B → MitM: Server Hello: random rS, cipher suite: DHE_Export
MitM → Client A: Server Hello: random rS, cipher suite: DHE
Server B → Client A: CertB, SignS.s(rC ++ rS ++ (p ++ g ++ g^b mod p))
Client A → Server B: g^a mod p
Client A computes: kPM ← (g^b)^a mod p; kM ← PRFkPM('master secret' ++ rC ++ rS)
MitM computes: b ← DiscLog(g^b mod p), then derives kPM and kM
Client A → : Client Finished: PRFkM('client finished:' ++ h(previous flows))
→ Client A: Server Finished: PRFkM('server finished:' ++ h(previous flows))

Figure 7.19: The Logjam cipher suite downgrade attack against servers supporting the exportable (weak keys) version of the DH Ephemeral (DHE) key
exchange. The attack works for the (surprisingly common) case of a known DH
group (p, g), allowing the attacker to precompute the most computationally
challenging part of the discrete log computation and use it for the different
exponents and handshakes. The MitM attack begins once this precomputation
is done. The attacker forces TLS clients to use export-strength Diffie-Hellman
Ephemeral (DHE) key exchange (the DHE_Export cipher suite). The attacker
modifies Client Hello to request DHE_Export from the server, and modifies
Server Hello to appear as if the server uses regular DHE. The client does not
detect that the Diffie-Hellman values (p, g, g^b) correspond to the export-version
of DH, and continues the handshake, sending g^a mod p and then Client Finished.
The attacker now uses the precomputed values to compute b ← DiscLog(g^b mod p),
allowing it to find the master key kM and complete the handshake.
What can we learn from these attacks? One important lesson is yet another
example of the risks due to the use of insufficiently-secure cryptography;
vulnerabilities die hard, and can often be abused even years after most systems
adopted more secure solutions. Another lesson is the risk of relying only on
intuitive reasoning, as presented above, for the security of systems. Intuition is
useful to identify some attacks and to create initial designs, but cryptographic
vulnerabilities can be subtle, and a precise, in-depth security analysis is critical.
7.5.4 Backward compatibility and protocol version negotiation
SSL was an immediate success; it was widely deployed soon after it was released
(as SSLv2). Hence, when introducing SSLv3, designers had to seriously consider
backward compatibility, namely, allowing a client/server running SSLv3 to
interact with a server/client, respectively, running SSLv2. Note that SSLv2
already included a version number in the client hello and server hello messages,
although it did not include a protocol version negotiation mechanism. With the
definition of (different versions of) TLS, in parallel to further proliferation of
web devices, the need for backward compatibility only grew stronger; basically,
a new version has almost no chance of adoption without backward compatibility
with earlier versions.
TLS protocol version negotiation. The Client Hello messages of TLS 1.0-1.2 and SSLv3 are very similar, and the Client Hello and Server Hello messages
include protocol-version identification. This allows a simple and efficient version
negotiation mechanism for implementations of TLS 1.0-1.2, allowing downgrade
to the least-updated among the versions of the client and the server, from SSLv3
to TLS 1.2.
This TLS version negotiation mechanism works as follows. When a TLS
server receives a Client Hello indicating an older version, it simply continues
the negotiation using this older version of the protocol. Similarly, when a TLS
server receives a Client Hello indicating a protocol version newer than the one
supported by the server, then the server continues the negotiation using its own
(older) version of the protocol.
This version negotiation mechanism allows TLS servers running any version
of TLS, or running SSLv3, to interact with clients running any version of TLS
or SSLv3; only SSLv2 is incompatible. In all cases, the client learns, from
the Server Hello message, the lowest-common version to be used, and
continues the protocol using that version. Importantly, the same handshake
continues, and therefore the integrity mechanism of the Finished messages
validates that the downgrade was, indeed, selected by the server.
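The negotiation rule amounts to taking the minimum of the two versions; a minimal sketch, where versions are encoded as (major, minor) pairs following the TLS wire format (e.g., TLS 1.2 is (3, 3), SSLv3 is (3, 0)):

```python
def negotiate_version(client_version, server_max_version):
    """TLS 1.0-1.2 / SSLv3 version negotiation: the server continues with
    the lower of the client's offered version and its own highest version.
    Tuple comparison handles (major, minor) pairs lexicographically."""
    return min(client_version, server_max_version)
```

Note that this rule alone is only safe because the Finished messages later authenticate the hello messages carrying these version fields.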
Vulnerability of TLS version negotiation. The TLS version negotiation
mechanism seems, intuitively, secure; however, is it really? As in many cases,
the intuition here can be misleading. Specifically, [341] showed how the use of
an optimized version of the Bleichenbacher attack succeeds in decrypting the
premaster key, in a small fraction of the attempts. By trying to open a sufficient
number of sessions, the attack succeeds against many TLS implementations.
Furthermore, while TLS 1.3 has an improved version-negotiation mechanism
(subsection 7.6.1), and does not even rely on RSA encryption, this vulnerability
may allow the attack of [341] to downgrade from TLS 1.3 to a lower, vulnerable
version.
The SSL version negotiation mechanism and the SSLv3 version downgrade
attack. The above TLS protocol version negotiation mechanism does not
support SSLv2, since SSLv2 uses a different client-hello format. The SSLv3
specification [155] specified a different version negotiation mechanism, which
allows SSLv3 clients to interact with SSLv2 servers.
The SSL version negotiation mechanism works by the following simple
downgrade dance: an SSLv3 client first sends the SSLv3 Client Hello message;
but if a valid Server Hello is not received, it sends the SSLv2 Client Hello. This
allows interoperability of an SSLv3 client with an SSLv2 server. The other case,
of an SSLv2 client and an SSLv3 server, is handled like the TLS mechanism, i.e., the
SSLv3 server detects receipt of an SSLv2 Client Hello and continues the handshake
using SSLv2.
It is easy to see that the SSL version negotiation (downgrade dance) mechanism is vulnerable to a downgrade attack. Basically, a MitM attacker drops
the SSLv3 Client Hello message, thereby causing the client to connect using
SSLv2. See the following exercise. The designers were aware of this risk, and
the SSLv3 standard notes that this method for backward compatibility will be
'phased out with all due haste'.6
Exercise 7.3 (The SSLv3 version downgrade attack). Show a message sequence
diagram for a MitM version downgrade attack, tricking an SSLv3 server and an
SSLv3 client who sends an SSLv2-format client-hello (for backward compatibility)
into completing the handshake using SSLv2 and using a weak (40-bit) cipher.
Hint: see this attack (referred to as 'version rollback attack') in [384].
Kocher's ad-hoc defense against downgrade to SSLv2. In practice,
interactions between most TLS and SSLv3 servers and clients are protected
from the protocol downgrade attack of Exercise 7.3 by an ingenious ad-hoc
defense, designed by Paul Kocher. Clients signal their support of SSLv3
by encoding a 'signal' in the padding used in the RSA encryption. For
details on this and other issues related to MitM downgrade attacks (also referred
to as version rollback), see [384] and appendix E.2 of the TLS 1.0 specification,
RFC2246 [120].
6 SSLv3 was released in 1996, and never modified to remove this method of backward
compatibility. This allowed downgrading even of TLS 1.2 to SSLv2, until March 2011, when
TLS versions were redefined [375], removing the ability to downgrade to SSL. That's haste
for IETF, apparently.
7.5.5 The TLS Downgrade Dance and the Poodle Version Downgrade Attack
The TLS backward compatibility mechanism (subsection 7.5.4) requires support
by the server: the server should respond correctly to a Client Hello using a newer
version, indicating to the client the need to move to the older version. This is
not trivial; while Client Hello messages use basically the same design, there are
several additions from SSLv3 onwards, including the important TLS extensions
mechanism (from TLS 1.1). While the additions have been carefully designed to
be ignored by implementations that correctly and precisely follow the specifications
of the previous versions, many lower-version implementations still fail to process
them correctly, or, for some other reason, do not continue with the protocol
using their own (lower) version. Unfortunately, there are many older-version
servers which fail, in this way, to support TLS version negotiation.
Many clients try to work with such servers anyway, by the following process,
which we call the TLS downgrade dance: first try to connect using the latest
version, but if receiving no response (or an error), try older versions. The
reader will notice that this downgrade dance is basically an extension of the SSL
downgrade dance. Unfortunately, it is also vulnerable to a downgrade attack.
An attacker can simply block connection attempts (or send back a fake error
message), causing the client to use an older, vulnerable version of the protocol;
see the exercise below.
Exercise 7.4 (The Poodle version downgrade attack). Consider a client that
supports the downgrade dance described above. Namely, the client first tries to
connect using TLS 1.2; if that fails, it tries to connect using TLS 1.1; and if
that also fails, it tries to connect using TLS 1.0. Present a message sequence
diagram for a MitM attack which tricks this client into using TLS 1.0, even
when the server it tries to connect with supports TLS 1.2.
Although the TLS downgrade dance does not follow the TLS specifications,
it is supported by many TLS clients. This provides a very effective downgrade
attack from TLS versions 1.0-1.2 to lower versions, including SSLv3, and
sometimes even SSLv2. The ability to perform this downgrade attack was first
demonstrated in the Poodle attack [290], and therefore we refer to it as the
Poodle version downgrade attack. Combined with the Poodle padding attack
(subsection 7.2.3), this allowed Poodle to be successfully exploited against most
web servers implementing TLS 1.0-1.2.
One way to avoid the version downgrade attack is to avoid the downgrade
dance. However, evidently, this may cause loss of backward compatibility with
many servers (e.g., websites), a price that client developers may not be willing
to pay. We next present the SCSV signaling mechanism, which allows secure
use of the TLS downgrade dance.
7.5.6 Securing the TLS downgrade dance: the SCSV cipher suite and beyond
Exercise 7.4 shows a potential vulnerability for a common case, where clients use
the 'downgrade dance' to ensure backward compatibility with servers supporting
older (lower) versions of the TLS protocol. How can we mitigate this risk, while
still allowing clients to use the TLS downgrade dance, in order to interact
with servers running older versions that do not support the TLS negotiation
mechanism?
A standard solution is the Signaling Cipher Suite Value (SCSV) cipher
suite, specified in RFC 7507 [288]. Clients that support SCSV first try to
connect to the server using their current TLS version, with no change from clients
not supporting SCSV. The difference arises only when this initial connection fails,
and the client decides to try the 'downgrade dance', to support connections
with servers supporting (only) older versions of TLS.
In these ‘downgrade dance’ handshakes, the client adds a special ‘cipher
suite’ to its list of supported cipher suites, sent as part of the ClientHello
message. The special ‘cipher suite’ is called TLS_FALLBACK_SCSV, and is
encoded by a specific string. Unlike the original (and main) goal of the cipher
suites field, the SCSV is not an indication of cryptographic algorithms supported
by the client. Instead, the presence of the SCSV indicates to the server that this
handshake message is sent as part of a downgrade dance by the client, i.e.,
that the client supports a higher version than the one specified in the current
handshake. If the server receives such a handshake, and itself supports a higher
version of the protocol, this indicates an error or attack, as this client and
server should use the higher version. Therefore, in this case, the server responds
with an appropriate indication to the client.
This use of the cipher suites field for signaling the downgrade dance is a
‘hack’ - it is not the intended, typical use of this field. A ‘cleaner’ alternative
would be to achieve similar signaling using a dedicated extension mechanism;
later in this section, we describe the TLS extension mechanism, which is used
for this purpose in TLS 1.3. We believe that the reason that SCSV was defined
using this ‘hack’ (encoding of a non-existent cipher suite), rather than using an
appropriate TLS extension, was the desire to support the downgrade dance to older
versions and implementations of TLS that do not support TLS extensions.
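The downgrade dance with SCSV can be sketched as follows. This is a toy model, not a real TLS stack: the version list, suite names, and the `handshake` interface are illustrative, and real servers signal rejection with an inappropriate_fallback alert rather than a boolean.

```python
# Toy model of the downgrade dance with SCSV (RFC 7507); all names illustrative.
TLS_FALLBACK_SCSV = 0x5600  # the special 'cipher suite' value

VERSIONS = ["TLS1.2", "TLS1.1", "TLS1.0"]  # best (index 0) to worst

class Server:
    def __init__(self, best_version):
        self.best = VERSIONS.index(best_version)

    def handshake(self, client_version, cipher_suites):
        offered = VERSIONS.index(client_version)
        # SCSV present while we support a better version than offered: the
        # client is 'dancing down' needlessly - suspect a downgrade attack.
        if TLS_FALLBACK_SCSV in cipher_suites and self.best < offered:
            return False  # real servers send an inappropriate_fallback alert
        return offered >= self.best  # accept our best version, or lower offers

def client_connect(server):
    """Try the best version first; on each retry ('dance'), add the SCSV."""
    for i, version in enumerate(VERSIONS):
        suites = ["SUITE_A", "SUITE_B"] + ([TLS_FALLBACK_SCSV] if i > 0 else [])
        if server.handshake(version, suites):
            return version
    return None
```

In this model, an old server is still reached via the dance, while a fallback handshake forced by a MitM against an up-to-date server is detected by the SCSV check and rejected.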
7.5.7 The SSL-Stripping Attack and the HSTS Defense
An even more extreme downgrade attack is to trick the client into using an
insecure connection, i.e., not to use TLS at all, although the server supports
secure (TLS) connections. The attack is designed against the use of TLS to
protect web communication.
Browsers connect to websites using the protocol specified in the URL,
typically either HTTP (unprotected) or HTTPS (TLS protected). The URL is
often from a previously-received webpage. If that previously-received webpage
is unprotected, then the hyperlink may be modified by a MitM attacker;
[Figure 7.20 diagram: Alice (client) requests bob.com/index.htm via the MitM
attacker, who forwards it to Bob (server); the attacker rewrites the
https://bob.com/login.htm hyperlink in the response to http://bob.com/login.htm;
Alice then sends GET bob.com/login.htm?pw=IluvBob,... in plaintext to the
attacker, who relays it to Bob over a TLS connection (Client Hello, Server Hello,
client key-exchange and finish, server finish); the HTTP connection continues in
plaintext between Alice and the attacker, and encrypted between the attacker and Bob.]

Figure 7.20: The SSL-Stripping MitM Attack on a TLS connection. The
attack replaces https hyperlinks with http hyperlinks (second flow). If users
do not notice and enter their password, then the attacker obtains the password.
The attacker may continue with the connection, using a secure connection to
the server, to obtain more sensitive information and to reduce the likelihood of
the user detecting the attack.
specifically, changing from a URL specifying the protected HTTPS to a URL
specifying the unprotected HTTP. Browsers indicate the protocol used (HTTP or
HTTPS) to the user, but many or most users are unlikely to notice a downgrade
(from HTTPS to HTTP). This attack is referred to as SSL-Stripping, and was
first presented by Marlinspike [277].
The SSL-Stripping attack is illustrated in Figure 7.20. The attack works
by replacing https hyperlinks, sent over insecure connections, with http
hyperlinks (to the same resource, except for the change from https to http). The
user often would not notice that the web-page is delivered over http, and may send
sensitive information such as a password.
Of course, the attack only works if the https hyperlink is sent over an
insecure connection. Therefore, the attack surface is reduced by securing
more web-pages, especially search engines.
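The core rewriting step of the attack can be illustrated with a few lines; this is a toy sketch, and real tools (such as Marlinspike's sslstrip) handle many more tags, encodings, and redirects.

```python
# Toy illustration of SSL-Stripping: a MitM relaying an unprotected (http)
# page simply rewrites https hyperlinks to http before forwarding the page.
def strip_https_links(html: str) -> str:
    return html.replace('href="https://', 'href="http://')

page = '<a href="https://bob.com/login.htm">login</a>'
stripped = strip_https_links(page)  # the login link is now plain http
```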
A good defense against SSL-Stripping and similar attacks is for browsers
to detect or prevent HTTP hyperlinks to a website which always uses (or offers)
HTTPS connections. The standard mechanism to ensure that is the HSTS
(HTTP Strict Transport Security) policy, defined in RFC 6797 [203]. The HSTS
policy indicates that a particular domain name (of a web server) should be
used only via HTTPS (secure) connections, and not via unprotected (HTTP)
connections. HSTS is sent as an HTTP header field (Strict-Transport-Security),
in an HTTP response sent by the web server to the client.
The HSTS policy specifies that the specific domain name, and optionally also
subdomains, should always be connected using HTTPS, i.e., a secure connection.
Specifically,
1. The browser should only use secure connections to the server; in particular,
if the browser receives a request to connect to the server using the
(unprotected) HTTP protocol, the browser would, instead, connect to the
server using the HTTPS (protected) protocol, i.e., using HTTP over TLS.

2. The browser should terminate any secure transport connection attempts
upon any secure transport errors or warnings, e.g., a warning about the
use of an invalid certificate.
The HSTS policy is designed to prevent attacks by a MitM attacker; hence,
the HSTS policy itself must be protected - in particular, the attacker
should not be able to ‘drop’ it. For this reason, the HSTS policy must be known to
the browser before it connects to the server. This may be achieved in two ways:
Caching - max-age: The HSTS header field has a parameter called max-age,
which defines a period of time, specified in seconds, during which the
browser should ‘remember’ the HSTS directive, i.e., keep it in cache.
Any connection within this time would be protected by HSTS. Namely,
suppose that at time T, a browser receives an HTTP response containing
the HSTS header with max-age = m from site example.com, over a
secure connection (i.e., using TLS). Assume that later, but before time
T + m, the browser is again directed to request an object from example.com;
then the browser will open the link to example.com only over a secure
connection, i.e., using TLS. This motivates the use of a large value for
max-age; however, notice that if a domain must move back to HTTP for
some reason, or there are failures in the secure connection attempts for
some reason, e.g., an expired certificate, then the site may be unreachable
for max-age seconds.
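The caching behavior above can be sketched as a tiny browser-side model; the header parsing here is deliberately minimal (it ignores includeSubDomains and other directives), and the function names are illustrative.

```python
# Toy model of a browser's HSTS cache (RFC 6797): per domain, remember the
# policy expiry derived from max-age, and upgrade http:// URLs while unexpired.
hsts_cache = {}  # domain -> expiry timestamp (seconds)

def on_response(domain, headers, over_tls, now):
    """Record an HSTS policy - but only if received over a secure connection."""
    value = headers.get("Strict-Transport-Security")
    if over_tls and value:
        max_age = int(value.split("max-age=")[1].split(";")[0])
        hsts_cache[domain] = now + max_age

def effective_url(url, now):
    """Rewrite http:// to https:// for domains with an unexpired HSTS policy."""
    if url.startswith("http://"):
        rest = url[len("http://"):]
        domain = rest.split("/")[0]
        if hsts_cache.get(domain, 0) > now:
            return "https://" + rest
    return url
```

Note that a policy received over an insecure connection is ignored: otherwise a MitM attacker could plant (or, worse, drop) policies.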
Pre-loaded HSTS policy: The browser maintains a list of HSTS domains
which are preloaded, i.e., do not require a previous visit to the site by
this browser. This avoids the risk of a browser accessing an HSTS-using
website without a cached HSTS policy. However, this requires the
browser to be preloaded with the HSTS policy - a burden on the site and
on the browser, and some overhead for this communication. An optional
parameter of the HSTS header instructs search engines to add the site
to the HSTS preload list of related browsers. This is used by Google to
maintain the pre-loaded HSTS list of the Chrome browser.
7.5.8 Three Principles: Secure Extensibility, KISS and Minimize Attack Surface
The different downgrade attacks show the importance and the challenge
of secure backward-compatible upgrades. Backward compatibility is essential,
to motivate adoption of new versions of protocols. For example, without
backward compatibility, a web-server using SSLv2 is unlikely to upgrade to
SSLv3 until most clients upgrade to SSLv3; and clients would not upgrade
to SSLv3 until there are many web-servers they can interact with using SSLv3
- a chicken and egg problem. We conclude that backward compatibility is
essential for a successful upgrade. On the other hand, we see that backward
compatibility mechanisms may introduce vulnerabilities that can be exploited for
downgrade attacks. It is therefore necessary to ensure a secure extension
mechanism allowing backward compatibility. In particular, every practical
security protocol should support a secure version negotiation mechanism.
We conclude with the principle of secure extensibility by design, which requires
built-in secure mechanisms for extensions and backward compatibility. This is
another important design principle for secure systems and protocols, cryptographic
or otherwise.

Principle 13 (Secure extensibility by design). When designing security systems
or protocols, one goal must be to build in secure mechanisms for extensions,
downward compatible versions, and negotiation of options and cryptographic
algorithms (cipher suite negotiation).
However, backward compatibility, like other options and extensions, increases
the attack surface and makes the system more complex, both of which imply
greater risk of vulnerabilities. Let us discuss these two issues.

First, flexibility brings complexity, and vulnerabilities lurk in complexity:
the simpler the system, the easier it is to protect. This is the important KISS
principle.7

Principle 14 (The KISS Principle). Keep It Simple and Secure: the simpler a
system is, the easier it is to protect; for better security, minimize complexity,
options, flexibility and extensibility.
Now to the attack surface, which is based on the intuitively-defined notion of
an attack vector. An attack vector is an element of the system which may have a
flaw that can be exploited by an attacker to ‘break’ the system; attack vectors
may be defined as functions, classes, lines of code, cryptographic functions, API
calls, software/hardware modules or protocol variants. The attack surface is a
rough measure of the number or extent of the attack vectors in a given system.
In a simplified case, let n denote the attack surface, corresponding to n attack
vectors, each with probability p_v of a discovered vulnerability. The probability
that the overall system will be secure, i.e., not have any discovered vulnerability,
is only (1 − p_v)^n. Hence, we need to minimize the ‘attack surface’ n - as well
as to minimize the probability of vulnerability, p_v.

7 The KISS principle originates from the US navy, where it meant ‘Keep It Simple, Stupid’.
We changed it a bit, to Keep It Simple and Secure.
Principle 15 (Minimize the attack surface). Systems should be designed to
minimize their attack surface, i.e., minimize the number of their attack vectors.
Roughly, the probability of vulnerability of a system is exponential in the number
of attack vectors.
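The simplified estimate above is easy to evaluate numerically; the function name is ours, and the numbers are only meant to show how fast a small per-vector risk compounds.

```python
# The simplified attack-surface estimate from the text: with n attack vectors,
# each independently having probability p_v of a discovered vulnerability,
# the system has no discovered vulnerability with probability (1 - p_v)**n.
def p_secure(n, p_v):
    return (1 - p_v) ** n

# Even a small per-vector risk erodes quickly as the attack surface grows:
assert round(p_secure(10, 0.01), 3) == 0.904   # 10 vectors: ~90% secure
assert round(p_secure(100, 0.01), 3) == 0.366  # 100 vectors: ~37% secure
```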
DROWN and other Cross-Protocol Attacks (exploiting lack of key
separation). Modern protocols, like (newer versions of) TLS, are adopting the
extensibility-by-design principle and support secure extensions and backward-compatible
versions. However, there is yet an important element of extensibility
that is often neglected: version-based key separation (Principle 10). Namely,
suppose the same key - in particular, public-private key pair - is used by both a
vulnerable protocol and a secure protocol. Then, it may be possible to expose
the key by running the vulnerable protocol, and exploit this to attack the system
also when using the secure protocol.

The DROWN attack [18] is an important example of a cross-protocol attack,
due to the lack of version-based key separation. A significant number of web-servers
were found to support SSLv2 and, furthermore, to use the same key-pair
as they use for the improved-security TLS handshake. This makes these servers
vulnerable to an improved Bleichenbacher attack presented in [18], which allows
the attacker to perform operations using the RSA private key.
7.6 The TLS 1.3 Handshake: Improved Security and Performance
In this section, we (finally) discuss the handshake protocol of TLS 1.3 [329] -
the current version of TLS. The TLS 1.3 handshake protocol, like the record
protocol (Figure 7.7), is a major re-design, providing significant improvements
in performance and security compared to the handshake protocol of earlier
versions. The main goals of these changes were to improve security and to improve
performance. Another goal was to simplify - a goal in its own right, but also
reducing the risk of vulnerabilities, following the KISS principle (Principle 14).
However, while some significant simplifications were made, we cannot deny
that TLS 1.3 introduces its own complexities; arguably, these complexities can
be justified by the security benefits they provide in different scenarios. Our
description makes some simplifications, in what seem to be more technical
details.
Consistency with previous versions was not one of the main goals. Indeed,
the TLS 1.3 handshake protocol, and especially the key derivation mechanisms,
differ considerably from previous versions, even in the terminology used. For
example, the term premaster key is not used. This, unfortunately, adds another
challenge to the reader familiar with the previous versions.
TLS 1.3 Security Improvements. The security improvements of the TLS
1.3 handshake include:
• TLS 1.3 does not support the (previously widely used) RSA-based key
exchange. This avoids many attacks on RSA implementations, mainly
variants of Bleichenbacher’s attack, such as ROBOT [73]. Note, however,
that TLS 1.3 still allows the use of RSA signatures for authentication of
the handshake flows, including of the public DH values. The signature can
be vulnerable to a cross-protocol attack, if (incorrectly) using the same
RSA private key for the TLS 1.3 CertificateVerify signature as the key used
by insecure protocols such as earlier TLS versions; see subsection 7.6.6.
• The TLS 1.3 handshake always uses DH Ephemeral (DHE) key exchange to
provide a fresh secret shared key in every exchange, using either finite-field
or elliptic-curve Diffie-Hellman. This ensures perfect forward secrecy
(PFS).8 The use of this single key-exchange mechanism also simplifies the
handshake.
• TLS 1.3 disallows the use of other cipher suites with known weaknesses,
most notably, those using ‘export-grade’ cryptography. This foils cipher
suite downgrade attacks which exploit the use of weak, ‘export-grade’
cryptography, such as the LOGJAM attack [7], exploiting support for
512-bit DH groups, and the FREAK attack [56], exploiting the use of
ephemeral 512-bit RSA private keys. See subsection 7.5.3.
• Previous versions of TLS allowed the server to specify an arbitrary
Diffie-Hellman group, by sending the modulus p and generator g. This allows
different servers to select different groups, which may help to foil
discrete-logarithm precomputation attacks such as in the Logjam attack
(subsection 7.5.3). However, as in many cases, flexibility resulted in
vulnerability; many servers used the same groups - and often, weak
groups. TLS 1.3 uses standard finite groups or elliptic curves; these
should correspond to carefully chosen groups and curves, properly defined,
e.g., in [162, 251]. This use of specific, studied groups/curves follows the
cryptographic building blocks principle (Principle 8).
• In the TLS 1.3 CertificateVerify message, the server signs the
entire handshake, not just the DHE parameters and the client and server
random numbers (rC and rS) as in previous versions; compare to Figure 7.12.
In particular, when the client receives and validates this signed

8 However, TLS 1.3 does not ensure perfect recover security (PRS), since it relies on the
secrecy of the server’s private key for signing the exchange; see Exercise 7.16. The reliance
on a fixed private key was exploited by the few known attacks against the TLS 1.3 handshake
protocol.
CertificateVerify message, it confirms that the server correctly received the
supported-versions, cipher suites and supported-groups information sent
in Client_Hello. This defends against downgrade attacks (Section 7.5).
• For signatures and hashing, TLS 1.3 forbids the use of the vulnerable RSA
PKCS#1 v1.5 and SHA-1 algorithms. Secure alternatives are specified
instead: PSS padding for signatures and, for hashing, SHA-256, SHA-384
or SHA-512. See PKCS#1 v2.2 (RFC 8017), RFC 5756 and RFC 4055 [292,
347, 374].
• TLS 1.3 removed support for handshake renegotiation. Renegotiation
added complexity, vulnerabilities (e.g., [57, 331]) and attack surface. TLS
1.3 provides an alternative mechanism to support the main use case for
renegotiation, which is invoking client authentication only ‘as needed’,
after the handshake completes.
• The TLS 1.3 design prefers cryptographic designs which are amenable to proofs
of security, and hence whose security is better established. This is
done without requiring an explicit attack exploiting the previous, intuitive
or less-established design. The improved, and arguably more complex,
key derivation process is a good example (subsection 7.6.5).
• TLS 1.3 specifies the use of pre-shared keys (PSKs). This single mechanism
replaces multiple mechanisms in previous versions: different PSK-based
cipher suites, with and without DH [21, 22, 141], as well as session resumption,
both ID-based and session-ticket based (subsection 7.4.4). This
simplicity helps ensure security (Principle 14).
• TLS 1.3 protects fields and messages as soon as the necessary keys are
established. In particular, it protects the extensions sent in the Server_Hello
and later messages, and it protects the Certificate (sent after Server_Hello).
One benefit of protecting the certificate is improved privacy for users
against an eavesdropper, who could have used the certificate to identify
the website used (even if the website cannot be identified from the addressing
information, e.g., when using an anonymity-providing proxy).
Note that exactly this feature may cause concern to network administrators,
who may have relied on the visibility of the certificate to prevent
communication with undesired websites.
TLS 1.3 Performance Improvement: reduced latency overhead (fewer
round trips). The performance improvement of the TLS 1.3 handshake is mostly
due to reduced handshake latency, which results mainly from the reduction
in the number of round-trips. Most applications use the request-response
communication pattern, where the client sends a request to the server, who
sends back a response. The handshake latency is the time from when the client
application initiates the connection (and transfers the request to the client’s
TLS module) until the server receives the request.
Modern networks use fast transmission rates; hence the latency is mostly
due to propagation and queuing delays. Namely, as explained in Fact 5.1 and
illustrated in Table 5.2, the dominant factors in determining the delay are the
number of round-trips and the round-trip delay; we can mostly ignore the
transmission time of the packets. This is especially true since the handshake
packets, and most requests, are quite short. This means that it does not matter
much if we send somewhat longer messages or multiple messages consecutively,
without waiting for a response. The latency is almost entirely determined by
the delay due to waiting for a response before sending the
next message.
Clearly, request-response interaction requires at least one round trip: sending
the request to the server and receiving the response. Therefore, the ‘base’ latency
for request-response interaction is one round trip time (RTT); this is required
even without security.
TLS 1.3 aims to minimize the latency overhead, which basically means
minimizing the number of additional round trips required by the handshake before
the request can be sent.
Previous versions of TLS required two round trips (Figs. 7.3 and 7.8); TLS
1.3 requires only one round trip. Furthermore, TLS 1.3 supports a zero round
trip handshake, where one client request, containing some application data,
can be sent as part of the initial flow from the client; see subsection 7.6.4.
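The latency accounting above reduces to simple arithmetic; the function name and the 50 ms round-trip time are illustrative.

```python
# Back-of-the-envelope handshake latency: ignoring transmission time (Fact 5.1),
# the time until the server receives the request is about (1 + h) * RTT, where
# h is the number of extra round trips the handshake adds before the request.
def request_latency_ms(rtt_ms, handshake_rtts):
    return (1 + handshake_rtts) * rtt_ms

rtt = 50.0  # an illustrative round-trip time, in milliseconds
assert request_latency_ms(rtt, 2) == 150.0  # pre-1.3 TLS handshake (2-RTT)
assert request_latency_ms(rtt, 1) == 100.0  # TLS 1.3 full handshake (1-RTT)
assert request_latency_ms(rtt, 0) == 50.0   # TLS 1.3 0-RTT handshake
```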
Our presentation of the TLS 1.3 Handshake protocol. In the following
subsections, we present a simplified overview of the TLS 1.3 Handshake protocol.
We cover the most important aspects of the protocol. In subsection 7.6.1, we
discuss the TLS 1.3 negotiation and backward compatibility mechanisms. In
subsection 7.6.2 we discuss the TLS 1.3 1-RTT (‘full’) Diffie-Hellman handshake.
In subsection 7.6.3 we discuss the Pre-Shared Key (PSK) handshake, used to
support both off-band shared keys and session resumption. In subsection 7.6.4
we discuss the zero-RTT handshake, which entirely avoids the delay-overhead
of earlier versions of TLS and even of the ‘full’ TLS 1.3 handshake. In
subsection 7.6.5, we discuss the key derivation process of the TLS 1.3 handshake
protocol. Finally, in subsection 7.6.6 we discuss cross-protocol attacks, which
are the only known attacks which exploit a vulnerability that exists in the TLS
1.3 specifications.
Overall, we tried to cover the most important aspects of the TLS 1.3
handshake mechanism. However, we had to make some simplifications and
omissions. As one important example, see Exercise 7.19, which discusses the
risk of Denial of Service (DoS) on TLS servers, and two defenses against it,
including using the Cookie extension. The Cookie extension can also be used to
off-load state from the TLS server to the client [329].
7.6.1 TLS 1.3: Negotiation and Backward Compatibility
In Section 7.5, we discussed the cipher suite and version negotiation mechanisms
of previous versions of TLS. The redesign of TLS 1.3 includes a significant
change in these negotiation mechanisms. However, this redesign was done
carefully, to ensure backward compatibility with earlier versions of TLS.
Cipher suite negotiation. In previous versions of TLS, as well as SSLv3,
the cipher suites defined three separate aspects: (1) the record protocol ciphers,
e.g., AES, (2) the key exchange mechanism (RSA, DH or pre-shared key), and
(3) the signature algorithm. This resulted in an exponential explosion in the
number of cipher suites, with unnecessary complexity (and room for error).
TLS 1.3 separates the cipher suite negotiation into four distinct aspects:
Record protocol cipher suite: the AEAD algorithm and the hash function
used for key derivation.
DH group and public share: the Diffie-Hellman group (or elliptic curve),
and optionally, a public key share extension which contains DH public
values (key shares).
Signature algorithm: the signature algorithm used for server authentication,
i.e., to sign the CertificateVerify message.
Pre-Shared Key: identifies pre-shared keys, and specifies whether such keys
are to be used to authenticate a Diffie-Hellman key exchange, providing
PFS, or to be used directly as the shared secret between the parties. See
subsection 7.6.3.
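Negotiating each aspect independently avoids the combinatorial explosion of monolithic cipher suites. A sketch of such per-aspect negotiation, with illustrative names and omitting the PSK aspect for brevity:

```python
# Per-aspect negotiation sketch: for each aspect, pick the first value in the
# server's preference list that the client also offered. Names illustrative.
def negotiate(client, server):
    choice = {}
    for aspect in ("cipher_suite", "group", "signature_alg"):
        common = [x for x in server[aspect] if x in client[aspect]]
        if not common:
            raise ValueError("no common " + aspect)
        choice[aspect] = common[0]  # first common value, in server preference
    return choice

client_offer = {"cipher_suite": ["AES128-GCM-SHA256", "CHACHA20-POLY1305-SHA256"],
                "group": ["x25519", "secp256r1"],
                "signature_alg": ["ecdsa_secp256r1_sha256", "rsa_pss_rsae_sha256"]}
server_prefs = {"cipher_suite": ["CHACHA20-POLY1305-SHA256", "AES128-GCM-SHA256"],
                "group": ["x25519"],
                "signature_alg": ["rsa_pss_rsae_sha256"]}
```

With three aspects negotiated separately, supporting a new group requires one new entry, not a new cipher suite for every existing combination.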
TLS 1.3 version negotiation: the Supported_Versions extension.
TLS included a version negotiation mechanism from early on; however, as we
discussed in Section 7.5, many TLS servers did not implement it correctly,
leading clients to adopt the insecure ‘downgrade dance’ and exposing them to
the Poodle version downgrade attack.
However, upon early, experimental deployment of TLS 1.3, it was discovered
that many servers still do not implement this version negotiation mechanism. At
first, it appeared that clients would be able to use the ‘downgrade dance’ securely,
by using the SCSV mechanism (subsection 7.5.6), designed to prevent a MitM
attacker from causing unnecessary downgrades. No such luck; it was soon
realized that there are also many TLS 1.2 servers that do not support SCSV,
which would have allowed a Poodle-like downgrade attack against TLS 1.3.
The TLS 1.3 designers decided that the only secure solution is to change
the version negotiation mechanism, in a way which is backward compatible with
TLS 1.2 servers. Specifically:
1. TLS 1.3 uses a Client_Hello message which is compatible with TLS 1.2,
including the version number. Namely, TLS 1.3 Client_Hello messages
include the identifier of TLS 1.2, rather than that of TLS 1.3. TLS servers
running version 1.2 or an earlier version should handle this correctly
(if they implement version negotiation correctly). TLS 1.3 servers will
manage, since they use the Supported_Versions extension, which provides
a new version negotiation mechanism.
2. TLS 1.3 clients list the versions they support, in order of preference, in the
Supported_Versions extension. Supported_Versions is a new mandatory
extension, which should be supported by all new TLS servers and clients
(running version 1.3 or future versions). If this extension is absent, the
server should continue the handshake using TLS 1.2. If the extension is
present, the server uses the ‘best’ version supported both by the client
and by itself, and indicates it in the Supported_Versions extension sent
back (with Server_Hello).
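The server-side selection rule above can be sketched as follows; the version names and the dict-based message are illustrative, not the wire format.

```python
# Sketch of TLS 1.3 server-side version selection with Supported_Versions.
SERVER_VERSIONS = ["TLS1.3", "TLS1.2"]  # what this server supports, best first

def select_version(client_hello):
    offered = client_hello.get("supported_versions")
    if offered is None:
        # No extension: a pre-1.3 client; continue with TLS 1.2, as indicated
        # by the legacy version field.
        return "TLS1.2"
    for v in SERVER_VERSIONS:  # pick the 'best' version both sides support
        if v in offered:
            return v
    return None  # no common version - abort the handshake
```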
Let us pray that this new mechanism will be implemented correctly by TLS
1.3 servers, avoiding a similar predicament upon upgrading to TLS 1.4!
Backward compatible Client_Hello. As we explained, the Client_Hello
message of TLS 1.3 must be backwards compatible with TLS 1.2. In particular,
the Client_Hello version field will indicate version 1.2, not 1.3; the ‘Supported
Versions’ extension will indicate which versions are supported by the client (e.g.,
version 1.3 and some older versions).

To retain backward compatibility of the Client_Hello with TLS 1.2, several
fields contain ‘legacy’ values, used only by legacy (TLS 1.2 and lower) servers
receiving the message, and ignored by TLS 1.3 (or newer) servers. Let us discuss
each of these ‘legacy’ fields:
Version: As explained above, TLS 1.3 clients indicate here the version of TLS
1.2.

Session_ID: This field is used for any previously-cached Session_ID for that
server (subsection 7.4.4).

Compression: This field is used in earlier versions to identify the compression
method. In TLS 1.3, it should contain the indication for the ‘null’
compression method.
7.6.2 TLS 1.3 Full (1-RTT) DH Handshake
[Figure 7.21 message flow:
C→S Client_Hello: client random (rC), cipher suites, supported_groups,
Key_Share; extensions: supported_versions, signature_algs, CAs, . . .
S→C Server_Hello: server random (rS), extensions: k̂_X{Key_Share, . . .};
Certificate: k̂_S{Sign_CA.s(S.v, . . .)};
CertificateVerify: k̂_S{Sign_S.s(handshake)};
Server_Finished: k̂_S{MAC_Finished_key(Server_Finished:handshake+)}
C→S Client_Finished: k̂_C{MAC_Finished_key(Client_Finished:handshake*)};
kC[Application data]]

Figure 7.21: TLS 1.3 1-RTT full Diffie-Hellman handshake; see Figure 7.22 for
the version with support for Pre-Shared Keys (PSK). The CertificateVerify
message contains a signature over the entire handshake until it: Client_Hello,
Server_Hello and the Certificate. Server_Finished contains a MAC over
handshake+, i.e., the entire handshake, plus the CertificateVerify message itself.
Client_Finished contains a MAC over handshake*, i.e., the entire handshake, plus
CertificateVerify and Server_Finished. We use k̂_X{. . .} to denote AEAD
protection using the handshake key of party X ∈ {C, S}, and k_X[. . .] to denote
AEAD protection using the application key of party X ∈ {C, S}.
We next present the TLS 1.3 1-RTT Diffie-Hellman Handshake, illustrated in
Figure 7.21. This is the typical initial TLS 1.3 handshake, always used by a
client and server on their first connection, and optionally used in subsequent
connections.
In contrast to the ‘basic handshake’ of the previous sections (and versions),
the TLS 1.3 full handshake always uses the DH protocol for key exchange;
therefore, it always ensures PFS.
The handshake ensures server authentication using the server’s signature
over the initial handshake messages, sent in a message called CertificateVerify.
This signature authenticates the server’s DH component (g^b mod p). By
including the initial handshake messages in the signed content, the signature
also protects against downgrade attacks.
The TLS 1.3 full (1-RTT) handshake allows the client to send the request
after a single round-trip; that’s why it is called a 1-RTT handshake. Namely, the
server’s (single) flow contains both the Server_Hello message (with the server’s
DH exponent, extensions, certificate and signature), and the Server_Finished
message, which ensures the integrity of the exchange.

In order to allow the server to send the Server_Finished message in its (single) flow,
it has to receive all the necessary keying information from the client earlier -
i.e., already in the Client_Hello message. Since TLS 1.3 always uses DH key
exchange, this means that the client must send sufficient information for the
server to determine the group (for finite-field DH) or curve (for elliptic-curve
DH) to be used; furthermore, the client needs to even provide its DH key-share
(e.g., g^a mod p). We can provide all this - efficiently - in the Client_Hello,
thanks to TLS 1.3's use of a limited number of standard DH finite groups
and elliptic curves.
The client indicates the DH groups (including elliptic curves) that it supports
in the Supported_Groups extension, and its DH key-share in the
Key_Share extension. Both extensions must be sent as part of the TLS
1.3 Client_Hello message. There is a cost here, in the overhead of computing and
sending these values - but the savings in RTT are usually much more important.
The client may provide a key_share only for some of the groups; if there
is no key_share for the group preferred by the server, the server can send a
special message, HelloRetryRequest, and the client will send a ‘corrected’
Client_Hello. (In this case, the handshake will require two RTTs.)
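The server's decision between proceeding in 1-RTT and requesting a corrected Client_Hello can be sketched as follows; the group names and message representations are illustrative.

```python
# Sketch of the server's Key_Share handling in the TLS 1.3 1-RTT handshake:
# use a client key share if one matches an acceptable group, else request a
# corrected Client_Hello (HelloRetryRequest, costing one extra RTT).
SERVER_GROUPS = ["x25519", "secp256r1"]  # groups this server accepts, best first

def respond(supported_groups, key_shares):
    """supported_groups: client's list; key_shares: dict group -> public value."""
    for group in SERVER_GROUPS:
        if group not in supported_groups:
            continue
        if group in key_shares:
            return ("Server_Hello", group)   # 1-RTT: use this key share
        return ("HelloRetryRequest", group)  # 2-RTT: ask client for this group
    return ("Alert", None)                   # no common group: abort
```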
Three other mandatory extensions in the TLS 1.3 Client_Hello identify:
(1) the supported_versions, (2) the signature_algorithms supported, and
(3) the certificate_authorities (CAs) trusted by the client. By requiring
the supported versions extension, TLS 1.3 protects against version downgrades.
The goal of the two other mandatory extensions is, apparently, to avoid
incompatibilities, i.e., a situation where the server may use a signing algorithm not
supported by the client, or a certificate from a CA not supported by the client.
The Server_Hello message is quite similar to that of previous versions,
except that it also provides the server’s key_share for the DH exchange, in an
aptly named extension. It is followed by the server’s certificate and the TLS
1.3 CertificateVerify message, which authenticates the server and ensures the
handshake integrity, by including a signature over the entire handshake (to this
point). Finally, the server sends its (authenticated) Server_Finished message,
again much like in previous versions (just earlier!).

The final flow of the handshake contains the Client_Finished message,
authenticating the entire handshake (until this point), much like in earlier
versions.
Both client and server Finished messages may be followed immediately by
application data sent by the respective parties, protected using the shared key.
Notice that since the TLS 1.3 record protocol uses AEAD to protect the data,
it uses only one key in each direction (client to server, kC→S , and server to
client, kS→C ); the őgure includes only a message from the client. We denote the
AEAD protection provided by the record protocol by kC→S (Application data).
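To illustrate the one-key-per-direction design, here is a toy encrypt-then-MAC construction playing the role of an AEAD scheme. This is only a sketch: a real TLS 1.3 record layer uses AES-GCM or ChaCha20-Poly1305, not this hash-based stand-in, and all function names here are hypothetical.

```python
import hashlib, hmac, os

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy counter-mode keystream built from SHA-256 (stands in for a real cipher).
    out, ctr = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:length]

def aead_seal(key: bytes, nonce: bytes, aad: bytes, plaintext: bytes) -> bytes:
    # Derive separate encryption and MAC subkeys (key separation, Principle 10).
    enc_key = hashlib.sha256(key + b"enc").digest()
    mac_key = hashlib.sha256(key + b"mac").digest()
    ct = bytes(p ^ k for p, k in zip(plaintext, keystream(enc_key, nonce, len(plaintext))))
    tag = hmac.new(mac_key, nonce + aad + ct, hashlib.sha256).digest()
    return ct + tag  # ciphertext followed by the authentication tag, as in AEAD

def aead_open(key: bytes, nonce: bytes, aad: bytes, sealed: bytes) -> bytes:
    enc_key = hashlib.sha256(key + b"enc").digest()
    mac_key = hashlib.sha256(key + b"mac").digest()
    ct, tag = sealed[:-32], sealed[-32:]
    if not hmac.compare_digest(tag, hmac.new(mac_key, nonce + aad + ct, hashlib.sha256).digest()):
        raise ValueError("record authentication failed")
    return bytes(c ^ k for c, k in zip(ct, keystream(enc_key, nonce, len(ct))))

# One independent key per direction, as in the TLS 1.3 record protocol:
k_c2s, k_s2c = os.urandom(32), os.urandom(32)
nonce = (0).to_bytes(12, "big")  # per-record nonce (sequence-number based)
record = aead_seal(k_c2s, nonce, b"record header", b"GET / HTTP/1.1")
plaintext = aead_open(k_c2s, nonce, b"record header", record)
```

The point of the sketch is structural: each direction needs only a single AEAD key, since the AEAD scheme provides both confidentiality and integrity.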
Applied Introduction to Cryptography and Cybersecurity
CHAPTER 7. TLS PROTOCOLS: WEB-SECURITY AND BEYOND

7.6.3 TLS 1.3 Full (1-RTT) Pre-Shared Key (PSK) Handshake
Initial handshake (no PSK):
  Client → Server, Client_Hello: client random (rC), cipher suites; extensions: PSK_key_exchange_modes (PSK-only, PSK-DHE), supported_versions, supported_groups, signature_algs, CAs, Key_Share, ...
  Server → Client, Server_Hello: server random (rS); extensions: Key_Share (if DHE), ...
  Server → Client, Certificate: k̂S{SignCA.s(S.v, ...)}
  Server → Client, CertificateVerify: k̂S{SignS.s(handshake)}
  Server → Client, Server_Finished: k̂S{MAC_Finished_key(handshake)}
  Client → Server, Client_Finished: k̂C{MAC_Finished_key(handshake)}
  Client → Server: kC→S(Application data)
  Server → Client, NewSessionTicket: ticket_lifetime, ticket_age_add, ticket_nonce, ticket, extensions
  ...
Subsequent handshake (with PSK):
  Client → Server, Client_Hello: client random (rC), cipher suites; extensions: Pre_Shared_Key, PSK_key_exchange_modes, supported_versions, supported_groups, signature_algs, CAs, Key_Share, ...
  Server → Client, Server_Hello: server random (rS); extensions: Pre_Shared_Key, Key_Share (if DHE), ...
  Server → Client, Server_Finished: k̂S{MAC_Finished_key(handshake)}
  Client → Server, Client_Finished: k̂C{MAC_Finished_key(handshake)}
  Client → Server: kC→S(Application data)
Figure 7.22: TLS 1.3 Full (1-RTT) Pre-Shared Key (PSK) handshake. We
show the typical use of a PSK, for session resumption, although a PSK can
also be shared out-of-band. We show two handshakes: an initial handshake where
the PSK is established, and a subsequent handshake which uses the PSK. To
establish a PSK, the Client_Hello includes the PSK_key_exchange_modes
extension, indicating if the PSK can be used to authenticate a DHE (providing
PFS) and/or to provide a shared-key-only handshake (without PFS). See text for
details.
7.6. THE TLS 1.3 HANDSHAKE: IMPROVED SECURITY AND PERFORMANCE
In Figure 7.22, we present the TLS 1.3 Full (1-RTT) Pre-Shared Key (PSK)
Handshake. The TLS 1.3 pre-shared key mechanism can be used for session
resumption, using a key shared in an earlier session, or with out-of-band pre-shared keys, as previously supported by several special cipher suites [21, 22, 141].
For both session resumption and out-of-band pre-shared keys, TLS 1.3
supports two Pre-Shared Key modes, which offer different costs and benefits:
The PSK Key Exchange (PSK_KE) mode: the pre-shared key is used
to secure the session without the use of Diffie-Hellman key exchange, or any
other public key operation. The advantage is reduced computational costs
and associated delay and energy costs; as shown in Table 6.1, symmetric
cryptography requires a tiny fraction of the computational resources
required by public-key cryptography. There is also a reduction in the
amount of data sent, which is meaningful in some unusual situations;
usually, this reduction is insignificant. The disadvantage of the PSK
Key Exchange (PSK_KE) mode is that it does not ensure perfect forward
secrecy (PFS).
The PSK and DHE (PSK_DHE_KE) mode: the pre-shared key is used
to authenticate the Diffie-Hellman key exchange, instead of using the
server’s certificate and signature (with the CertificateVerify message).
This mode has three uses. The first is simply to reduce the overhead
of the transmission and verification of the certificate and the signature,
while retaining the added security provided by PFS. The second is for
scenarios where the server does not have a certificate and cannot
perform the certificate-based verification (CertificateVerify message), so
the security of the DH key exchange is based on the shared-key authentication. The third usage is when the server does send both certificate
and signature; in this case, the goal is the added security provided by the
shared key, e.g., in case of exposure of the server’s private key.
In Figure 7.22, we focus on the session resumption case, by presenting
a sequence of two handshakes. The first is an initial handshake, which is
essentially a full (1-RTT) DH handshake, which further establishes one or
more Pre-Shared Keys, using the PSK_key_exchange_modes extension and
the NewSessionTicket message. It is followed by the second, a subsequent
handshake, which uses the Pre-Shared Key from the previously sent ticket to
resume the session, by performing a pre-shared key handshake.
The PSK_key_exchange_modes extension specifies which Pre-Shared Key
mode(s) the client wants to use: the PSK Key Exchange (PSK_KE) mode
and/or the PSK with DHE (PSK_DHE_KE) mode. This extension is relevant for
the use of the PSK in the current handshake (if it uses a pre-shared key), and for
any new pre-shared keys which the server may share using the NewSessionTicket
message.
The NewSessionTicket message provides the client with one or more ‘tickets’.
The ticket(s), sent after a successful handshake, refer to a pre-shared key
derived from a dedicated shared secret called the resumption_master_secret.
The resumption_master_secret is one of the secrets and keys derived for the
session; the derivation uses a keyed function, which we denote hExpand. As
a simplification, consider resumption_master_secret to be a secret key, and
hExpand to be a pseudorandom function (PRF); for more precise details, see
subsection 7.6.5.
The PSK associated with the ticket is derived from the ticket_nonce,
which is one of the fields sent in the NewSessionTicket, and the resumption_master_secret, as follows:

PSK(ticket_nonce) = hExpand_resumption_master_secret("resumption", ticket_nonce)    (7.26)
The Client_Hello message of the subsequent handshake, using the PSK,
includes two PSK-related extensions: the PSK_key_exchange_modes extension,
discussed above, and the Pre_Shared_Key extension. The Pre_Shared_Key
extension identifies one or more pre-shared keys known to the client, allowing
the server to use any of these that the server may have cached to establish
the new connection. For session resumption, the PSK identifier is the ticket,
provided in a previous NewSessionTicket message. The server also uses the
Pre_Shared_Key extension to signal that it uses a specific pre-shared key (from
the list provided by the client).
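The server-side selection logic can be sketched as follows; the function and variable names are hypothetical, chosen only to illustrate the lookup. The server scans the ticket identities offered in the client's Pre_Shared_Key extension and signals, by index, the one it selects:

```python
# Hypothetical server-side lookup: the client's Pre_Shared_Key extension offers
# a list of ticket identities; the server picks the first one it still has cached.
def select_psk(offered_tickets, psk_cache):
    for index, ticket in enumerate(offered_tickets):
        if ticket in psk_cache:
            return index, psk_cache[ticket]  # index is echoed in Server_Hello
    return None, None                        # no match: fall back to a full handshake

cache = {b"ticket-A": b"psk-A"}
idx, psk = select_psk([b"ticket-X", b"ticket-A"], cache)
assert (idx, psk) == (1, b"psk-A")
```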
In Figure 7.22, the subsequent handshake contains the Key_Share extension.
This extension includes the server’s DH key-share, and therefore is used if,
and only if, using the DH key exchange (with finite field or elliptic curve). We
conclude, therefore, that the client’s PSK_key_exchange_modes extension
allowed the use of PSK_DHE_KE, which was then chosen by the server.
7.6.4 TLS 1.3 Zero-RTT Pre-Shared Key (PSK) Handshake
  Client → Server, Client_Hello: client random (rC), cipher suites; extensions: Early_Data, Pre_Shared_Key, PSK_key_exchange_modes, supported_versions, supported_groups, signature_algs, CAs, Key_Share, ...
  Client → Server: PSK_C(Early application data)
  Server → Client, Server_Hello: server random (rS); extensions: Pre_Shared_Key, Early_Data, Key_Share (if DHE), ...
  Server → Client, Server_Finished: k̂S{MAC_Finished_key(handshake)}
  Client → Server: PSK_C(EndOfEarlyData)
  Client → Server, Client_Finished: k̂C{MAC_Finished_key(handshake)}
  Client → Server: kC→S(Application data)
Figure 7.23: TLS 1.3 Zero-RTT Pre-Shared Key (PSK) Handshake. The client
provides some ‘early application data’ to the server immediately after the
Client_Hello message, without waiting for the server’s response.
In Figure 7.23 we present the TLS 1.3 Zero-RTT Pre-Shared Key (PSK)
Handshake. This is a special form of a pre-shared key handshake, in which we
use the PSK to secure some data sent in the Client_Hello message. Specifically,
this data is contained in the dedicated Early_Data extension, which is sent
in the first flow, sent from the client to the server.
Since Early_Data is sent as part of the very first flow of the connection, the
delay until it arrives includes only the time since the client sends this message,
and until the server receives it. Namely, there is no need to wait for any round-trip of handshake messages before the client can send this ‘early’ application
data to the server. In other words, the Early_Data is sent without waiting
for any round-trips to complete, i.e., with zero RTT latency. The only delay
is, therefore, the unavoidable time for the data itself to transfer from client to
server.
The Early_Data does not benefit from all the protections offered by TLS.
In particular, it is only protected by the pre-shared key, therefore it does
not benefit from perfect forward secrecy (PFS). Furthermore, an attacker
can replay the Client_Hello message; this may cause the server to re-process
the Early_Data, unless appropriate countermeasures prevent this (see below).
Therefore, Early_Data should only be used for client requests which do not
require PFS, and either where re-processing is allowed, or with appropriate
countermeasures to prevent re-processing. A typical example of a legitimate,
common use of Early_Data (and the zero-RTT handshake), where both the lack of
PFS and the possibility of re-processing are not a concern, is when the client
sends a query to the server, authenticated by a password or cookie sent with
the request, and receives back a (protected) response.
Two alternative countermeasures against re-processing of Early_Data are
possible. The first countermeasure is to allow each ticket to be used
only once; this requires the server to keep track of already-used tickets,
similarly to the use of session-IDs for session resumption in earlier versions.
An alternative countermeasure is to limit the lifetime of the ticket, using the
ticket_lifetime field of the NewSessionTicket message.
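Both countermeasures can be sketched together in a few lines; the class and method names below are hypothetical, and a production server would also need to bound the size of the used-ticket set:

```python
import time

class TicketGuard:
    """Sketch of the two countermeasures: single-use tickets and ticket lifetime."""
    def __init__(self, lifetime_seconds):
        self.lifetime = lifetime_seconds
        self.used = set()  # tickets already redeemed (single-use rule)

    def accept_early_data(self, ticket, issued_at, now=None):
        now = time.time() if now is None else now
        if now - issued_at > self.lifetime:  # ticket_lifetime expired
            return False
        if ticket in self.used:              # replayed Client_Hello
            return False
        self.used.add(ticket)
        return True

g = TicketGuard(lifetime_seconds=600)
ok = g.accept_early_data(b"t1", issued_at=1000, now=1100)        # accepted
replay = g.accept_early_data(b"t1", issued_at=1000, now=1100)    # rejected: reuse
expired = g.accept_early_data(b"t2", issued_at=1000, now=2000)   # rejected: too old
```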
7.6.5 TLS 1.3 Key Derivation
TLS 1.3 makes extensive use of key derivation, using a pair of keyed functions:
a key expansion function hExpand and a key extraction function hExtract. Both
functions are defined in [243], based on the HMAC construction and a given
hash function h. The hash function h is defined as part of the TLS 1.3 cipher
suite.
The key expansion function hExpand is similar to a PRF; like a PRF, it
should receive a pseudorandom key, and it outputs a pseudorandom string. The
length of the output pseudorandom string is specified as one of its inputs. It
receives one more input, which is basically the input information to the PRF.
The TLS specification defines different inputs for each derivation.
The key extraction function hExtract is similar to a keyed Key Derivation
Function (KDF). Namely, the output of h_x(y) is a (short) pseudorandom string,
provided that either (a) x is pseudorandom (secret) and y is a unique value, or
(b) x is a ‘salt’ (randomly chosen but known to the attacker), and y is a high-entropy
string, i.e., intuitively, contains ‘sufficient secret random bits’.
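Both functions can be sketched with HMAC-SHA256, following the HKDF construction of [243]. This is a simplification: the actual TLS 1.3 derivation also wraps each label and output length in a fixed HkdfLabel structure.

```python
import hmac, hashlib

def h_extract(salt_or_key: bytes, ikm: bytes) -> bytes:
    # hExtract: HKDF-Extract is simply HMAC keyed by the first argument.
    return hmac.new(salt_or_key, ikm, hashlib.sha256).digest()

def h_expand(key: bytes, info: bytes, length: int) -> bytes:
    # hExpand: HKDF-Expand; note that the output length is an explicit input,
    # as described in the text above.
    out, block, i = b"", b"", 1
    while len(out) < length:
        block = hmac.new(key, block + info + bytes([i]), hashlib.sha256).digest()
        out += block
        i += 1
    return out[:length]

prk = h_extract(b"salt", b"high-entropy input keying material")
okm = h_expand(prk, b"context label", 42)
```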
The TLS 1.3 handshake protocol applies hExtract and hExpand to derive
multiple pseudorandom, independent keys for different purposes. Simplifying:

1. When using a pre-shared key PSK, we use it (or actually, a key derived
from it) as a key to hExpand, to generate several pseudorandom ‘early
secrets/keys’. We use one key to protect the Early_Data (if sent), and
another key, which we denote k1, as the key for the next derivation step.

2. We next derive k_handshake = hExtract_k1(shared_secret), where shared_secret
is the partially-secret output of the Diffie-Hellman key exchange. We use
k_handshake as the key to hExpand, to generate several pseudorandom keys,
including a key to protect client-to-server handshake messages, a key to
protect server-to-client handshake messages, and a key k2 which we use
for the next derivation step.

3. We next derive k_Master = hExtract_k2(0). We use k_Master as the key to
hExpand, to generate several pseudorandom keys, including a key to
protect client-to-server application messages, a key to protect server-to-client
application messages, and a key resumption_master_secret which
we use to derive pre-shared keys, as defined in Equation 7.26.
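The three derivation steps above can be sketched end-to-end as follows. The labels are simplified placeholders, not the exact strings of the TLS 1.3 specification, and hExpand here returns a single HMAC-SHA256 output block.

```python
import hmac, hashlib, os

def extract(key, ikm):   # hExtract, modeled as HMAC-SHA256 (HKDF-Extract)
    return hmac.new(key, ikm, hashlib.sha256).digest()

def expand(key, info):   # hExpand, one 32-byte output block of HKDF-Expand
    return hmac.new(key, info + b"\x01", hashlib.sha256).digest()

psk = os.urandom(32)               # pre-shared key (or a constant, if none)
dh_shared_secret = os.urandom(32)  # output of the (EC)DHE exchange

# Step 1: early secrets from the PSK.
early_secret = extract(b"\x00" * 32, psk)
early_data_key = expand(early_secret, b"early data")
k1 = expand(early_secret, b"derived")

# Step 2: handshake secrets, mixing in the DH shared secret.
k_handshake = extract(k1, dh_shared_secret)
hs_key_c2s = expand(k_handshake, b"c hs traffic")
hs_key_s2c = expand(k_handshake, b"s hs traffic")
k2 = expand(k_handshake, b"derived")

# Step 3: master secrets for application traffic and resumption.
k_master = extract(k2, b"\x00" * 32)
app_key_c2s = expand(k_master, b"c ap traffic")
app_key_s2c = expand(k_master, b"s ap traffic")
resumption_master_secret = expand(k_master, b"res master")

# Equation 7.26: the PSK for a new ticket, for a hypothetical ticket_nonce 0x00.
next_psk = expand(resumption_master_secret, b"resumption" + b"\x00")
```

Note how each stage ends by deriving the key for the next stage, so the final application keys depend on both the PSK and the DH shared secret.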
7.6.6 Cross-Protocol Attacks on TLS 1.3
We complete our discussion of TLS 1.3 by briefly discussing cross-protocol attacks,
since these types of attacks can be effective against standard-compliant implementations
of TLS 1.3; arguably, the standard could have specified certain countermeasures
which would have prevented this, as we discuss below.
Let us first explain what cross-protocol attacks are; the concept is not
limited to TLS. Cross-protocol attacks are attacks which exploit a vulnerability
in one (‘weak’) protocol PW, to attack another, ‘strong’ protocol, PS. The
attacks exploit two flaws: the vulnerability of PW, and the fact that PS and
PW use the same private/secret key, in violation of the key separation principle
(Principle 10).
In particular, while TLS 1.3 does not use RSA encryption, it does use
signatures, which are often RSA signatures, for CertificateVerify. In many
implementations, the same private key is used for these TLS 1.3 RSA signatures
and for vulnerable RSA decryption, typically using version 1.5 or 2.0 of PKCS#1.
In particular, many TLS 1.3 implementations were found to use the same
private key (to sign CertificateVerify) as used for RSA decryption by older TLS
implementations (of the same organization). This allows the use of the Bleichenbacher
attack and variants of it, e.g., Manger’s attack, to allow the attacker to ‘perform
the private-key RSA operation’, potentially allowing the attacker to sign the
CertificateVerify message. Two variants of this attack were published. The first
attack was published in [214]; however, it required quite extensive computational
abilities from the attacker. The DROWN attack, published a year later, is a
much more efficient and practical attack, but it required the key to be shared
with an implementation of SSLv2; surprisingly, the researchers found that even
in 2016, there was a very significant number of web servers which supported
SSLv2 (and reused the same private key for other protocols, e.g., TLS 1.3
signing).
These cross-protocol attacks are not necessarily a major concern for the
use of TLS 1.3, for two reasons. The first reason is a technical challenge: the
attacker must complete the attack, and in particular abuse the lower-version
TLS implementation to sign the CertificateVerify message, before the client aborts the
connection (due to not receiving CertificateVerify in time). This makes
deployment of the attack challenging, but it may still be possible in
some scenarios.
The second reason for the attack to be of limited concern is that the attack
only works when the TLS 1.3 implementation uses the same private key (for
signatures) as used by lower versions of TLS (for encryption). Such reuse of
the same key for two different purposes (decryption and signing) and by two
different versions of TLS is a double violation of the key separation principle
(Principle 10). Implementations of TLS 1.3 should avoid this, and use a different
private (signing) key than the private (decryption) key used by lower versions;
in fact, such key separation should have been done even before this attack was
published! Furthermore, typically, the same private key is used for two different
purposes and protocols only when also using the same certificate. Specifically,
TLS 1.3 could require that the certificate contains a key-usage extension that
explicitly forbids its use for decryption. While this would not absolutely prevent
cross-protocol attacks, it would probably make them extremely unlikely, as
certificates of keys currently used (for encryption) by older versions of TLS are
unlikely to contain a key-usage extension forbidding the use of the private key
for decryption. For details about the key-usage and other certificate extensions,
see subsection 8.2.7.
7.7 TLS: Final Words and Further Reading
The TLS protocols are the most widely used, and definitely the most studied,
applied cryptographic protocols. Extensive efforts by many cryptographers and
security experts were invested in validating and improving the security of TLS.
It is instructive to observe that these efforts resulted in the discovery of a large
number of serious vulnerabilities. We discussed several of these, focusing on
specification vulnerabilities. We summarize the most important TLS specification
vulnerabilities in Table 7.2. We did not cover the many implementation
vulnerabilities, although some of them, in particular the Heartbleed bug [91, 399],
had comparable and even greater impact.
Name                 | Year | References              | Versions         | Ciphers | Type
BEAST                | 2011 | subsection 7.2.4, [132] | SSL, TLS 1.0     | CBC     | Cryptanalysis
CRIME                | 2012 | subsection 7.2.6, [335] | TLS 1.0-1.1      | n/a     | Compression
Lucky13              | 2013 | [9]                     | SSL, TLS 1.0-1.2 | CBC     | Padding, timing
RC4-biases           | 2013 | subsection 7.2.5, [11]  | SSL, TLS 1.0-1.2 | RC4     | Cryptanalysis
BREACH               | 2013 | subsection 7.2.6, [164] | TLS 1.0-1.2      | n/a     | Compression
TIME                 | 2013 | subsection 7.2.6, [29]  | TLS 1.0-1.2      | n/a     | Compression
Poodle-padding       | 2014 | subsection 7.2.3, [290] | SSLv3            | CBC     | Padding
Poodle-downgrade     | 2014 | subsection 7.5.5, [290] | TLS 1.0-1.2      | n/a     | Version downgrade
FREAK                | 2015 | subsection 7.5.3, [56]  | TLS 1.0-1.2      | RSA     | Cipher suite downgrade
Cross-Bleichenbacher | 2015 | subsection 7.6.6, [214] | TLS 1.3          | RSA     | Cross-protocol and Bleichenbacher
DROWN                | 2016 | subsection 7.6.6, [18]  | TLS 1.0-1.3      | RSA     | Cross-protocol and Bleichenbacher
Logjam               | 2018 | subsection 7.5.3, [7]   | TLS 1.0-1.2      | DHE     | Cipher suite downgrade
ROBOT                | 2018 | [73]                    | TLS 1.0-1.2      | RSA     | Bleichenbacher
Bleichenbacher's CAT | 2019 | subsection 7.5.4, [341] | TLS 1.0-1.3      | RSA     | Bleichenbacher and downgrade

Table 7.2: Important TLS/SSL attacks due to specification vulnerabilities.
We can learn some important lessons from this history of attacks and
improvements, including:

A crack today, a break tomorrow: many devastating attacks, e.g., BEAST
and Poodle, can be traced to vulnerabilities reported years earlier, but
ignored since they appeared impractical. Or as correctly stated in [9],
attacks only get better.

Vulnerabilities are resilient and return: even after a vulnerability has been
discovered and countermeasures adopted, attackers are often able to continue
taking advantage of the vulnerability, in different ways. First,
attackers are often able to adjust the attack and defeat the countermeasures.
Second, attackers are often able to downgrade the system to use an
outdated, vulnerable version, possibly using downgrade attacks. Third,
attackers are sometimes able to use cross-protocol attacks, where they
exploit a vulnerable system to circumvent the (strong) defenses of another
system which uses the same keys. It is much better to design systems
securely from early on; of course, this advice is more easily given than followed!
Separate keys: cross-protocol attacks such as DROWN [18], as well as
BEAST [132], which abuses the continued use of the same key for different
messages, remind us of the important principle of key separation (Principle 10). The use of TLS 1.3, or of any secure protocol, will not help
if we reuse the same secret key, which can be found from its usage in an
insecure protocol!
Test, test, test: finally, while we focused on specification flaws, many attacks
on TLS, e.g., Heartbleed, exploit implementation flaws. Testing for
security is difficult, but vital for the security of the system, since
vulnerabilities will not be detected by normal use of the system. Of course,
testing is harder for larger and more complex systems; which brings us to
the next and final item...

KISS! Finally, we see again and again the importance of the KISS principle
(Principle 14): keep specifications, design and code small and simple,
minimize complexity and attack surface, and avoid unnecessary options
and flexibility. The KISS principle is important against both specification
and implementation vulnerabilities. The design of TLS 1.3 began with an
extensive effort to follow this principle and eliminate unnecessary options.
However, with features creeping in, there are reasons to be concerned
that vulnerabilities may be found, in implementations and/or in the
specifications themselves.
7.8 Additional Exercises
Exercise 7.5 (The TLS record protocol fragmentation and compression).

1. The TLS record protocol uses fragments of size up to 16KB. Explain a potential disadvantage of using much longer fragments (or no fragments).

2. Explain a potential disadvantage of using much shorter fragments.

3. Explain why fragmentation is applied before compression (for the AtE protocol).

4. Suppose compression is not used. Why would we apply fragmentation before authentication and encryption?

5. The TLS record protocols apply compression, then authentication. Is it possible to reverse the order, i.e., apply authentication and then compression? Can you identify advantages to either order?
Exercise 7.6 (Vulnerability of Compress-then-Encrypt). One of the important
goals of TLS is to hide the value of cookies sent by a browser, as part of an
HTTP-over-TLS connection (marked by the https protocol in the beginning of
the URL). Cookies are strings that are sent automatically by the browser to
a website; they are often used to authenticate the user. Consider a cross-site
attacker, as in subsection 7.2.6, i.e., the attacker controls a rogue website visited
by the victim user; further, assume that the attacker can eavesdrop on the
(protected) communication, following the CPA-Oracle Attack model. This allows
the attacker to control the contents of the request sent by the browser, except
for the Cookie HTTP header, which is added by the browser and which consists
of the string ‘Cookie:’ followed by the value of the (unknown) cookie.

1. Assume that the length of the cookie is known, that it contains only alphanumeric characters, and the following compression scheme. Check what is the longest string α which appears at least twice in the uncompressed data; replace the occurrences of α by a special character, say ‘!’, and concatenate to the data the same special character (‘!’), followed by α. Present an efficient attack which exposes the first character of the cookie.

2. Extend the attack, to find the entire cookie.

3. What is the maximal number of requests required by the attack, for a cookie of l characters?

4. What is the expected number of requests required by the attack, for a cookie of l alphabetic characters selected randomly (with uniform distribution)?

Note: compression is used by all versions of TLS before TLS 1.3.
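The principle behind this exercise can be demonstrated with a standard compressor: an eavesdropper who sees only ciphertext lengths learns the compressed length, and a guess matching the secret extends a repeated substring, so it compresses no worse (and typically better) than a wrong guess. The request format and cookie value below are hypothetical, and zlib is used here only as a stand-in for the exercise's toy compression scheme.

```python
import zlib

SECRET_HEADER = b"\r\nCookie: secret=s3cr3tcookie\r\n"  # unknown to the attacker

def observed_length(attacker_data: bytes) -> int:
    # The eavesdropper sees ciphertext of the same length as the compressed request.
    return len(zlib.compress(attacker_data + SECRET_HEADER, 9))

# Guess the first character of the cookie value: the correct guess 's' extends
# the repeated substring "Cookie: secret=", so it should compress no worse than
# a wrong guess such as 'z'.
prefix = b"GET /?q=Cookie: secret="
right = observed_length(prefix + b"s")
wrong = observed_length(prefix + b"z")
```

Repeating this over the alphabet, character by character, recovers the cookie; this is the core of the CRIME and BREACH attacks listed in Table 7.2.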
Exercise 7.7 (SSLv2 key derivation). SSL uses MD5 for key derivation. In
this question, we explore the properties required from MD5 for the key derivation
to be secure.

1. Show that it is not sufficient to assume that MD5 is collision-resistant, for the key derivation to be secure.

2. Repeat, for the one-way function property.

3. Repeat, for the randomness-extraction property.

4. Define a simple assumption regarding MD5 which ensures that the key derivation is secure. The definition should be related to cryptographic functions and properties we defined and discussed.
Exercise 7.8 (BEAST vulnerability). Versions of TLS before TLS 1.1 use
CBC encryption in the following way. They select the IV randomly only for
the first message m0 in a connection; for subsequent messages, say mi, the
IV is simply the last ciphertext block of the previous message. This creates a
vulnerability exploited, e.g., by the BEAST attack and a few earlier works [25, 132].
In this question we explore a simplified version of these attacks. For simplicity,
assume that the attacker always knows the next IV to be used in encryption,
and can specify a plaintext message and receive its CBC encryption (using the
known next IV). Assume a known block length, e.g., 16 bytes.

1. Assume the attacker sees ciphertext (c0, c1) resulting from CBC encryption, with c0 being the IV, of a single-block message m, which can have only two known values: m ∈ {m0, m1}. To find if m was m0 or m1, the adversary uses the fact that it knows the next IV to be used, which we denote c′0, and asks for CBC encryption of a specially-crafted single-block message m′; denote the returned ciphertext by the pair (c′0, c′1), where c′0 is the (previously known) IV, as indicated earlier. The adversary can now determine m from c′1:

a) What is the value of m′ that the adversary will ask to encrypt?

b) Fill in the missing parts in the solution of the adversary:
   m = m0 if ________ ; m = m1 if ________

2. Show pseudo-code for the attacker algorithm used in the previous item.

3. Show pseudo-code for an attack that finds the last byte of message m. Hint: use the previous solution as a routine in your code.

4. Assume now that the attacker tries to find a long secret plaintext string x of length l bytes. Assume the attacker can ask for encryption of messages m = p ++ x, where p is a plaintext string chosen by the attacker. Show pseudo-code for an attack that finds x. Hint: use the previous solution as a routine; it may help to begin by considering a fixed-length x, e.g., four bytes.
Sketch of solution to the second part: the attacker makes a query for encryption
of some one-block message y, and receives (α0, α1), where α1 = Ek(α0 ⊕ y). Suppose
that now, the attacker knows the value IV to be used for encryption of the
next message. The attacker picks m0 = IV ⊕ y ⊕ α0, and m1 some random message.
If the game picks bit b = 0, then the attacker receives the encryption of m0; this
encryption would be Ek(IV ⊕ m0) = Ek(IV ⊕ IV ⊕ y ⊕ α0) = Ek(y ⊕ α0) = α1
(and the IV). Otherwise, if the game picks b = 1, then the attacker receives
some other string.
Sketch of solution to the third part: the solution to the previous part allowed the attacker
to check if the plaintext was a given string; we now simply repeat this for the
256 different strings corresponding to all possible values of the last byte of m.
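This sketch implements the distinguishing attack of the first item as executable code. A hash-based PRF stands in for AES, since the attack only ever uses the forward direction of the block cipher; all primitives here are toys, not a real TLS implementation.

```python
import hashlib, os

BLOCK = 16

def E(k: bytes, x: bytes) -> bytes:
    # Toy 16-byte "block cipher": a PRF stands in for AES, since the attack
    # only needs the encryption direction.
    return hashlib.sha256(k + x).digest()[:BLOCK]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt_block(k: bytes, iv: bytes, m: bytes) -> bytes:
    return E(k, xor(iv, m))

k = os.urandom(BLOCK)

# Victim encrypts one of two known messages; attacker sees (c0, c1).
m0, m1 = b"PAY ALICE $100.-", b"PAY ALICE $999.-"
c0 = os.urandom(BLOCK)
c1 = cbc_encrypt_block(k, c0, m0)  # secretly, m = m0

# Pre-TLS-1.1 flaw: the next IV is the last ciphertext block, so the
# attacker knows it in advance.
next_iv = c1

# Attacker asks for the encryption of m' = next_iv XOR c0 XOR m0; if the
# guess m0 is right, the returned ciphertext block equals c1.
m_prime = xor(xor(next_iv, c0), m0)
c1_prime = cbc_encrypt_block(k, next_iv, m_prime)
recovered = m0 if c1_prime == c1 else m1
```

The check works because E(k, next_iv ⊕ m′) = E(k, c0 ⊕ m0) = c1 exactly when the guess is correct; repeating it per candidate byte yields the attacks of the later items.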
Exercise 7.9 (Non-random client/server). Some devices may not have a source
of random bits; in this exercise, we explore possible resulting vulnerabilities, and
a possible work-around.
1. Consider an IoT lock, which receives lock/unlock requests over TLS, acting as a web server, but without access to a source of randomness. Show a replay attack allowing an eavesdropping attacker to open the lock by replaying messages of a legitimate user.

2. Consider a monitoring station displaying images from security cameras, by initiating TLS connections to the cameras and receiving the current images. Show a replay attack allowing an eavesdropping attacker to replay old images (while cracking the safe).

3. Assume that the devices have non-volatile memory. Show how it can be used to ensure secure interactions, even though the devices still do not have a source of random bits.

Your answer should present a sequence diagram and the relevant equations, for SSLv2 or another version of TLS.
Exercise 7.10 (SSLv2 client authentication). The SSLv2 client authentication
mechanism requires clients to sign a message containing three main fields: a
random challenge sent by the server, the server’s certificate, and the shared secret
keys exchanged by the protocol. The signature uses RSA with the Hash-then-Sign
paradigm, using the MD5 hash function. Furthermore, the client should
encrypt the message containing the signature (using the just-established shared
key). This is more complex than the later client-authentication designs; in this
question, we explore attempts to remove some of these multiple requirements.

1. Suppose that the input to the signature did not contain the server’s certificate. Show a sequence diagram showing how client authentication fails.

2. Suppose that the signed message was not encrypted. Present a possible vulnerability of the protocol, where you are allowed to replace the use of MD5 with the use of any collision-resistant hash function h. Note: this may require h to have a vulnerability that we may not expect to find in MD5 or other well-designed cryptographic hash functions.
Exercise 7.11 (TLS handshake: resiliency to key exposure). Fig. 7.10 presents
the RSA-based TLS Handshake. This variant of the handshake protocol was
popular in early versions, but later ‘phased out’ and completely removed in TLS
1.3. The main reason was the fact that this variant allows an attacker that
obtains the server’s private key to decrypt all communication with the server
using this key, before and after the exposure.

1. Show, in a sequence diagram, how a MitM attacker who is given the private key of the server at time T1, can decrypt communication of the server at a past time T0 < T1.

2. Show, in a sequence diagram, how TLS 1.3 avoids this impact of exposure of the private key.
3. Show, in a sequence diagram, how a MitM attacker who is given the private key of the server at time T1, can decrypt communication of the server at a future time T2 > T1.

4. Explain which feature of TLS 1.3 can reduce the exposure of future communication, and how.
Exercise 7.12 (Protocol version downgrade-dance attack). Implementations
of TLS and SSL specify the version of the protocol in the ClientHello and
ServerHello messages. If the server does not support the client’s version, then
it replies with an error message. When the client receives this error message
(‘version not supported’), it re-tries the handshake using the next-best version
of TLS supported by the client. This method of ensuring backward compatibility
with older versions of TLS is referred to as downgrade dance.

1. Present a sequence diagram showing how a MitM attacker can exploit the downgrade dance mechanism, to cause the server and client to use an outdated version of the protocol, allowing the attacker to exploit vulnerabilities of that version.

2. The TLS Fallback Signaling Cipher Suite Value (SCSV) [288], discussed in subsection 7.5.6, is designed to mitigate this risk. Let vC denote the TLS version run by the client and vS denote the TLS version run by the server. Present sequence diagrams showing TLS connections where (1) client and server support SCSV and vC > vS, (2) same, with vC = vS, (3) same, with vC < vS, (4) any of these, with a MitM attacker who tries to cause use of version vM < min(vC, vS).

Note: See also Exercise 8.15.
Exercise 7.13 (Client-chosen cipher suite downgrade attack). In many variants of the TLS handshake, e.g., the RSA-based handshake in Fig. 7.10, the
authentication of the (previous) handshake messages in the Finish flows is
relied upon to prevent a MitM attacker from performing a downgrade attack and
causing the client and server to use a less-preferred (and possibly less secure)
cipher suite. However, in this process, the server can choose which of the client’s
cipher suites would be used. To ensure the use of the cipher suite most preferred
by the client, even if less preferred by the server, some client implementations
send only the most-preferred cipher suites. If none of these is acceptable to the
server, then the server responds with an error message. In this case, the client
will try to perform the handshake again, specifying now only the next-preferred
cipher suite(s), and so on; this is also referred to as downgrade dance.

1. Show how a MitM attacker can exploit this mechanism to cause the server and client to use a cipher suite that both consider inferior.

2. Suggest a fix to the implementation of the client which achieves the same goal, yet is not vulnerable to this attack. Your fix should not require any change in the server.
Applied Introduction to Cryptography and Cybersecurity
CHAPTER 7. TLS PROTOCOLS: WEB-SECURITY AND BEYOND
Exercise 7.14 (TLS server without randomness). An IoT device provides an
HTTP interface to clients, i.e., acts as a tiny web server. For authentication, clients
send their commands together with a secret password, e.g., on, <password> and
off, <password>. Communication is over TLS for security, with the RSA-based
TLS handshake, as in Figure 7.10.
The IoT device does not have a source of randomness; hence, it computes
the server-random rS from the client-random, using a fixed symmetric key kS
(kept only by the device), as: rS = AESkS (rC ).
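The device's deterministic computation can be mimicked in a few lines. Here HMAC-SHA256 stands in for AES as the keyed function, an assumption made only to keep the example self-contained; the replay issue is identical:

```python
import hashlib, hmac, os

# Sketch of the device's deterministic server-random. HMAC-SHA256
# stands in for AES as the keyed function (an assumption, for a
# self-contained example).

DEVICE_KEY = os.urandom(16)  # k_S, kept only by the device

def server_random(r_c: bytes) -> bytes:
    """r_S = F_{k_S}(r_C): fully determined by the client-random."""
    return hmac.new(DEVICE_KEY, r_c, hashlib.sha256).digest()

# An eavesdropper who replays the recorded client-random makes the
# device derive the same server-random - and hence the same session
# keys - so the recorded, encrypted 'on' command can be replayed too.
r_c = os.urandom(32)
assert server_random(r_c) == server_random(r_c)
```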
1. Present a message sequence diagram showing how an attacker, which can
eavesdrop on a connection in which the client turned the device ‘on’, can
later turn the device ‘on’ again, without the client being involved.
2. Would your answer change (and how), if the device supports ID-based
session resumption? Ticket-based session resumption?
3. Show a secure method for the server to compute the server-random,
which does not require a source of randomness. The IoT device may use
and update a state variable s; your solution consists of the computation of
the server-random, rS = ______ , and of the update to the state
variable performed at the end of every handshake, s = ______ .
Exercise 7.15 (DH Ephemeral (DHE)). Consider a client and server that use
TLSv1.2 with DH Ephemeral (DHE) public keys, as in Fig. 7.12. Assume that
the client and server run this protocol daily, at the beginning of every day i.
(Within each day, they may use session resumption to avoid additional public
key operations; but this is not relevant to the question.) Assume that Mal can
(1) eavesdrop on communication every day, (2) perform MitM attacks (only)
every even day (i s.t. i ≡ 0 (mod 2)), and (3) is given all the keys known to the
server on the fourth day. Note: the server erases any key once it is no longer
in use (i.e., on the fourth day, the attacker is not given the 'session keys' established on
previous days).
Fill in the 'Exposed on' column of day i in Table 7.3, indicating the first day
j ≥ i on which the adversary should be able to decrypt (expose) the traffic sent
on day i between client and server. Write 'never' if the adversary should never
be able to decrypt the traffic of day i. Briefly justify.
Day | Eavesdrop? | MitM? | Given keys? | Exposed on... | Justify
----|------------|-------|-------------|---------------|--------
 1  | Yes        | No    | No          |               |
 2  | Yes        | Yes   | No          |               |
 3  | Yes        | No    | No          |               |
 4  | Yes        | Yes   | Yes         |               |
 5  | Yes        | No    | No          |               |
 6  | Yes        | Yes   | No          |               |
 7  | Yes        | No    | No          |               |
 8  | Yes        | Yes   | No          |               |

Table 7.3: Table for Exercise 7.15.
7.8. ADDITIONAL EXERCISES
Exercise 7.16 (TLS with PRS). Consider a client that has three consecutive
TLS connections to a server, using TLS 1.3. The attacker has different capabilities
in each of these connections, as follows:
• In the first connection, the attacker obtains all the information kept by the
server (including all keys).
• In the second connection, the attacker is disabled.
• In the third connection, the attacker has MitM capabilities.
Is the communication between client and server exposed during the third connection?
1. Present a sequence diagram showing that with TLS 1.3, communication
during the third connection is exposed to the attacker.
2. Present an improvement to TLS 1.3 that will protect communication
during the third connection. Simplify your solution by assuming no attack
during the second connection.
3. Further improve your solution to provide the same protection even if the
attacker can eavesdrop on the communication during the second connection.
4. How can your improvement be implemented using TLS 1.3, allowing
backward compatibility, i.e., a 'normal' TLS 1.3 interaction when one of
the two parties (client or server) does not support your improvement?
Exercise 7.17. A Pierpont prime is a prime number of the form 2^u · 3^v + 1,
where u, v are non-negative integers; Pierpont primes are a generalization of
Fermat primes. Assume that Alice's browser sends, in the Client Hello message
of TLS 1.3, the set of exponentiations {gi^ai mod pi}, where for some i, say
i = 3, the prime p3 is a Pierpont prime.
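The cryptanalytic point is that for a Pierpont prime p, the order p − 1 = 2^u · 3^v has only tiny prime factors, so discrete logarithms modulo p are easy via the Pohlig-Hellman algorithm. A short sketch (the helper name is assumed) that recognizes this form:

```python
def is_pierpont(p: int) -> bool:
    """True iff p is prime and p - 1 = 2^u * 3^v for non-negative u, v.
    Such p - 1 is '3-smooth', which is what makes discrete logarithms
    modulo p easy to compute via the Pohlig-Hellman algorithm."""
    if p < 2 or any(p % d == 0 for d in range(2, int(p ** 0.5) + 1)):
        return False  # not prime (trial division; fine for small p)
    n = p - 1
    for f in (2, 3):
        while n % f == 0:
            n //= f
    return n == 1  # p - 1 had no prime factors other than 2 and 3

print([p for p in range(2, 100) if is_pierpont(p)])
# -> [2, 3, 5, 7, 13, 17, 19, 37, 73, 97]
```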
1. Assume that the server bob.com selects to use this p3 and g3^a3 mod p3,
i.e., sends back g3^b3 mod p3 as part of the server-hello message. Present a
sequence diagram showing how a MitM attacker would be able to eavesdrop
and modify messages sent between Alice and Bob.
2. Assume that the server prefers p2, such that there is some prime q2 such
that p2 = 2 · q2 + 1. Explain why the MitM attack of the previous item
fails.
3. Present a sequence diagram showing that the attacker is still able to
impersonate as the website.
4. Extend the impersonation attack to a complete MitM attack against Alice
and bob.com, assuming typical user authentication (using cookie or
userid/password).
5. Suppose the client holds a pre-shared key (and ticket); would this prevent
the attack? Explain.
Exercise 7.18. This exercise continues Exercise 2.43; please see the ANSI
X9.31 design presented there.
Some TLS implementations use X9.31 as a PRG to generate keys, nonces
and IVs for encryption. Assume a fixed key k (known to the attacker), and that the
values Ti are the current time, in seconds. Explain a possible vulnerability; you
may make reasonable assumptions, e.g., on the use of different outputs of the
PRG and on clock synchronization. Demonstrate how an attacker may exploit
the vulnerability, using a sequence diagram. You may present the attack against
any variant of TLS that you wish.
Exercise 7.19 (Protecting TLS 1.3 servers from computational DoS). TLS
servers can be subject to a Denial-of-Service (DoS) attack, in which the attacker
overloads the server with Client_Hello messages, each time causing the server
to perform computationally-intensive operations. In TLS 1.3, the attacker can
cause the server to perform two or three computationally-intensive operations:
signing the CertificateVerify message and computing the server's Diffie-Hellman
key share, and possibly also computing the Diffie-Hellman shared key. All this
requires minimal computational cost for the attacker.
1. Explain the attack, using a sequence diagram.
2. Explain how the Cookie extension of TLS 1.3 can help against this attack,
following the details in [329]. Identify assumptions/limitations of this
defense.
3. An alternative way to defend against such DoS attacks uses the pre-shared
key to authenticate the Client_Hello message. This defense can allow
the server to establish connections with clients that have a pre-shared key,
even when, due to the attack, other clients cannot establish a connection.
Such a defense does not currently exist in TLS 1.3. Design such a defense
and explain it, using appropriate sequence diagrams. Your solution should
not require any change in the TLS 1.3 handshake protocol.
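The core idea behind the Cookie extension mentioned in item 2 can be sketched as a stateless MAC-based cookie; this is an illustration of the concept (a keyed MAC over the ClientHello, echoed back before the server does any expensive work), not the actual TLS 1.3 encoding:

```python
import hashlib, hmac, os

# Sketch of the stateless-cookie idea behind the TLS 1.3 Cookie
# extension: the server commits no state and does no expensive
# public-key work until the client echoes a valid cookie.
# (Not the actual TLS 1.3 encoding; key handling is simplified.)

COOKIE_KEY = os.urandom(32)  # server-local secret, rotated in practice

def make_cookie(client_hello: bytes) -> bytes:
    """MAC over a hash of the ClientHello; cheap for the server."""
    digest = hashlib.sha256(client_hello).digest()
    return hmac.new(COOKIE_KEY, digest, hashlib.sha256).digest()

def check_cookie(client_hello: bytes, cookie: bytes) -> bool:
    """Only clients that actually received the cookie (i.e., are
    reachable at their claimed address) can echo it back."""
    return hmac.compare_digest(cookie, make_cookie(client_hello))

ch = b"ClientHello: versions, key shares, ..."
cookie = make_cookie(ch)         # sent back in HelloRetryRequest
assert check_cookie(ch, cookie)  # retried ClientHello passes
```

Since the cookie is self-authenticating, the server keeps no per-client state, which is what makes the defense cheap under attack.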
Chapter 8
Public Key Infrastructure (PKI)
A big advantage of public key cryptography is that public keys are easier to
distribute, as they are not secret; we only need to ensure their authenticity.
The main mechanism for authenticating public keys is a public key certificate
(or simply a certificate), signed by a trusted Certificate Authority (CA).
In Chapter 7, we sketched how certificates are used by the TLS protocol, but
without discussing certificates or CAs. This chapter focuses on certificates; we
discuss how certificates are issued, validated and revoked, and how to determine
which certificates, signed by which authorities, are trustworthy. This set of
mechanisms is key to the use of public keys, and is therefore referred to as a Public
Key Infrastructure (PKI). PKI is, therefore, an essential component for practical
deployments of public key cryptography (PKC).
The basic concept of PKI is almost as old as the first publications of public
key cryptography, and appeared in [239]. PKI was also standardized quite
early, before any application of PKC; this was in the (first version of the) X.509
specifications [92]. The X.509 standard has evolved over the years; the most
important change was the publication of X.509 version 3 in 1997, often
referred to simply as X.509v3. X.509v3 is still the most important and widely
used version of certificates, and much of our discussion in this chapter is focused
on it (specifically, Section 8.2 to Section 8.5). X.509v3 is designed for generality;
there are several published X.509 profiles, which define different restrictions
on the contents and use of X.509 certificates. We mostly focus on the most
well-known profile, the PKIX profile, used by Internet protocols and most other
deployed PKIs, and defined in RFC 5280 [104, 207].
We also discuss some extensions of X.509, mainly OCSP (RFC 6960) [346]
and Certificate Transparency (CT) [253-255]. All of these (X.509 with the PKIX
profile, as well as CRLs, OCSP and CT) are used in common implementations
of the TLS/SSL handshake protocols, which we discussed in Chapter 7, and in
particular in their application to securing the communication between browsers and
web-servers; this particular application is often referred to as the Web PKI. The
Web PKI is probably the most well-known, and possibly also most important,
application of PKI; see subsection 8.1.3.
For many years, PKI was basically identified with X.509v3, and to a large
extent, this is still mostly true; all significant PKI deployments and mechanisms
follow X.509v3. One exception was the handling of revocations, where the
original CRL proposal was found too inefficient, and other mechanisms have
been deployed and proposed, with no dominant standard yet; see Section 8.4.
However, with the growing importance and use of the Web, there were
growing concerns about vulnerabilities of the PKI system, due to repeated,
high-profile PKI failures (see subsection 8.5.1). Several proposals were made
to improve PKI security, and one of them, Certificate Transparency, has been
widely deployed, including by CAs and browsers, and is being standardized by
the IETF. However, CT is also based on X.509, and only extends the X.509
specifications, as we discuss in Section 8.6.
Topics. PKI is important and involves many proposals, mechanisms and
details; this chapter covers what seem the most important aspects, and
readers may want to further focus, at least on first reading, on the aspects most
important to them. All readers should probably read the 'PKI concepts and goals'
in Section 8.1. X.509 is covered mostly in Section 8.2, with the important aspect
of intermediate-CAs and certificate-paths in Section 8.3; readers may skip some
of the details of the different extensions, and even the entire Section 8.3, if their goal
is to get a more high-level understanding of PKI. Section 8.4 discusses certificate
revocation, including both the CRL and OCSP standards, as well as other,
'optimized' designs. In Section 8.5, we discuss some of the criticisms of the Web
PKI, the main currently-deployed PKI system, and some proposed improvements,
while Section 8.6 focuses on the emerging Certificate Transparency extension to
the X.509 PKI.
8.1 Introduction: PKI Concepts and Goals
Basic PKI entities: relying party, issuer (CA) and subject. Public
Key Infrastructure (PKI) schemes distribute a public key pk together with a
set ATTR of attributes and a signature σ. The signature σ is the result of
a signature algorithm applied to input containing both pk and ATTR. The
tuple (pk, ATTR, σ) is called a public key certificate, or simply a certificate. The
certificate is issued by an entity referred to as a Certificate Authority (CA) or
as the issuer.
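The tuple (pk, ATTR, σ) can be made concrete with a toy sketch. The 'CA signature' below is schoolbook RSA with tiny fixed primes and no padding standard, an assumption made purely for a self-contained illustration; it is utterly insecure and not how real CAs sign:

```python
import hashlib
from dataclasses import dataclass

# Toy sketch of a certificate as the tuple (pk, ATTR, sigma).
# Schoolbook RSA, tiny assumed primes, no padding: illustration only.
P, Q = 10007, 10009
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))   # CA's private exponent (Python 3.8+)

def _h(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def _tbs(pk: str, attrs: dict) -> bytes:
    """'To-be-signed' encoding of (pk, ATTR); the format is assumed."""
    return repr((pk, sorted(attrs.items()))).encode()

@dataclass
class Certificate:
    pk: str        # subject's public key (opaque string here)
    attrs: dict    # e.g. {"subject": "bob.com", "serial": 7}
    sigma: int     # CA's signature over (pk, attrs)

def issue(pk: str, attrs: dict) -> Certificate:
    """CA signs (pk, ATTR) with its private key D."""
    return Certificate(pk, attrs, pow(_h(_tbs(pk, attrs)), D, N))

def validate(cert: Certificate) -> bool:
    """Relying party checks sigma with the CA's public key (N, E)."""
    return pow(cert.sigma, E, N) == _h(_tbs(cert.pk, cert.attrs))
```

A relying party that knows the CA's public key (N, E) can call validate; tampering with any attribute invalidates σ.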
Most attributes refer to the subject of the certificate, i.e., the entity who
knows ('owns') the private key corresponding to the certified public key pk. In
addition, there are often additional attributes related to the certificate itself
rather than to the subject, such as the certificate's validity period and serial
number.
The basic feature of public key cryptography is that the party that knows
the private key is, usually, different from the party that uses the corresponding
public key. To use the public key, we need to authenticate it; usually this is
done by verifying a certificate. Namely, the party that uses a public key relies
upon the public key certificate, on the PKI processes used to validate it, and on
(at least one) trusted Certificate Authority (CA). Therefore, we refer to this
party as the relying party.
The X.509 certificate life cycle. Figure 8.1 illustrates the X.509 PKI
entities and certificate life cycle; some other PKIs have different life cycles,
sometimes with additional entities. In particular, Certificate Transparency has
additional parties and a more complex life cycle; see Section 8.6. But for now,
let's focus on the more basic scenario of the X.509 PKI.
To request an X.509 certificate, the subject typically generates a (public,
private) key pair, and then requests a CA to issue the certificate for the public
key, with specific requested attributes, such as identifiers. The CA should
validate that the public key was received from a subject which is entitled to
the requested attributes; in subsection 8.2.8 we discuss the main validation
methods: the 'easy' domain validation, the 'classical' organization validation and
the extra-secure extended validation. If the validation passes, the CA constructs
the certificate, including the validated attributes from the subject, as well
as other attributes determined by the CA, such as the serial number and validity
period, and then signs it, using the CA's private signing key.
After issuing the certificate, the CA sends it to the subject, who provides it
to the relying party, often, and in particular for the Web PKI, during the TLS/SSL
handshake. The relying party should validate the certificate; this includes
validation of the signature, using the public key of the CA, and validation of
attributes within the certificate (e.g., expiration time).
In the typical example of Figure 8.1, the subject is the website bob.com,
and the relying party is Alice, or Alice's browser. The figure shows the simple
case of a typical identity certificate, issued to website bob.com directly by a
CA trusted by the relying party. Such directly-trusted CAs are called 'trust
anchors' or 'root CAs'. In reality, most Web PKI certificates are indirectly
issued, i.e., issued by an intermediate-CA (Figure 8.6), or even by a path of
multiple intermediate-CAs, as we discuss in Figure 8.7.
Usually, certificates are used until they expire, i.e., throughout their validity
period, which is typically a few months (rarely over a year). Around the expiration
date, certificates are often re-issued. However, sometimes a certificate should
be revoked, i.e., invalidated before its planned and specified expiration time.
Revocation is done by the CA, usually upon an appropriate (and authenticated)
request from the subject. A simple revocation mechanism called a Certificate
Revocation List (CRL) has been part of X.509 from its earliest versions [92];
however, practical, efficient revocation turned out to be quite a challenge,
and multiple designs were proposed; we discuss revocation mechanisms in
Section 8.4.
While revocations can occur for administrative reasons, most revocations
are due to security concerns, such as:
Subject key exposure: private keys should be well protected from exposure;
however, exposures do happen. Normally, exposures are quite rare and
sporadic. However, the discovery of a software vulnerability may cause
exposure of many private keys, as happened due to the Heartbleed Bug [91,
399].

Figure 8.1: PKI entities and a typical application for server authentication in
the Web PKI. Here, we show the simple case of a typical identity certificate
issued by a trusted CA ('trust anchor' or 'root CA') to website bob.com. The
certificate CB = SignCA.s (bob.com, Bob.e, . . .) is signed using the CA's private
signing key, denoted CA.s, and validated by the relying party (Alice) using the
CA's public key CA.v, which should be known to the relying party. Dashed
arrows represent the certificate issuing process, occurring once, before client
connections. See also Figure 8.6 and Figure 8.7 for issuing with one or more
intermediate-CAs, and Figure 8.5 for the certificate fields.
CA failures: usually, certificate authorities have operated in a secure, trustworthy
manner, and issued correct certificates to the rightful subjects,
as required and expected. However, there have also been several incidents
where CAs have failed in different ways, including vulnerable subject
identification (e.g., insecure email validation), issuing intermediate-CA
certificates to untrusted entities (e.g., to all customers), and even CA compromise
and issuing of rogue certificates, or what appears to be intentional
issuing of rogue certificates. See subsection 8.5.1 and Table 8.5.
Cryptanalytical certificate forgery: certificate-based PKIs all use and depend
on the Hash-then-Sign mechanism, and therefore become vulnerable
if the signature scheme is vulnerable, or if the hash function used is
vulnerable. Specifically, certificate forgery was demonstrated when using
hash functions vulnerable to chosen-prefix collision attacks, specifically
MD5 [367], and later also SHA-1 [262, 367]. See Chapter 3.
Subject key exposures are the most common reason for revocation, and
typically result in a 'steady' rate of several dozen revocations daily; however,
software and CA vulnerabilities can result in exceptional 'waves' of revocations.
For example, the Heartbleed bug resulted in several days with many revocations,
even more than 10,000 on one day [91, 399].
Identity certificates. Many certificates include an identifying attribute,
i.e., an identifier of the subject; such certificates are referred to as identity
certificates. In the typical server-authentication use of the Web PKI by TLS/SSL,
the relying party is the browser, the subject is the website, and the relevant
identifier is the domain name of the website, e.g., bob.com. This typical use-case
is illustrated in Figure 8.1, where the certificate CB contains a signature
SignCA.s (bob.com, Bob.e, . . .), computed using CA.s, the private signing key of the CA,
over the identifier bob.com, the public key Bob.e, and other fields.
8.1.1 Rogue certificates
The basic goal of PKI is to allow a relying party to determine which public key
to use, or whether to use a given public key - typically, one included in a certificate.
We use the term rogue certificate for a certificate which contains wrong or
misleading information, and hence should not be relied upon.
The basic goal of rogue certificates is to allow the attacker to mislead the
user or security mechanisms, typically by impersonating a trusted entity:
impersonating a trusted website (website spoofing), impersonating a web-server
of a trusted sender (phishing email), or impersonating a trusted software provider
(signed malware).
Equivocating certificates. To mislead security mechanisms, a rogue certificate
needs to use exactly a specific name that 'belongs' to the legitimate
'owner', namely, an equivocating name. Equivocating (same-name) certificates
contain exactly the same name as a legitimate domain name, but are certified for
an attacker - and, obviously, contain a public key chosen by the attacker.
Equivocating certificates can be used to circumvent many important security
mechanisms, including the Same-Origin Policy (SOP), blacklists, whitelists and
other access-control mechanisms.
Misleading (impersonating) certificates and domain names. Many
attacks focus on misleading the user, rather than misleading an automated
security mechanism. These attacks take advantage of the fact that humans do not
follow a precise algorithm for their trust decisions, in contrast with automated
security mechanisms. The attacker's goal is still, usually, impersonation; there
is currently no mechanism that prevents rogue entities from obtaining domain
names and certificates for non-impersonating nefarious purposes, such as scams.
Such a mechanism is probably desirable, but seems very hard to establish. Hence,
we focus on impersonating certificates. Notice that equivocating certificates can
obviously be used for impersonating the subject, such as for phishing emails
and spoofed (fake) websites; however, when the goal is to trick a human user,
there are other ways which prove almost as effective, mainly:
Homographic (visually impersonating) domain names: the attacker uses
names which visually appear to be the exact names of a legitimate, trusted
entity, although they actually just exploit visual similarities between different
characters, typically in particular fonts. A simple example is the pair
paypal.com and paypaI.com - in many fonts, the lower-case 'L' and the
capital 'I' are hard to distinguish. The attacker may also use characters from
different alphabets, e.g., some Cyrillic letters are visually indistinguishable from
Latin letters (e.g., P). Attacks using such visually-impersonating
domain names are called homographic attacks.
Domain-name hacking: the attacker uses a different domain name, which
the attacker can register and control, but which users, usually not even
aware of the structure of domain names, will not distinguish from a trusted
domain name. Examples: to impersonate the site accounts.bank.com,
use bank.accounts.com, accounts-bank.com or accounts.bank.co. The last
example also exploits the human tendency to ignore the end of, and other minor
deviations in, text (our 'built-in error correction').
Combo domain names (combosquatting): names which combine a trademark
or a name associated with a legitimate, trusted entity with another
term which seems either to 'make sense' or simply to 'appear meaningless/technical'.
Combo names are probably one of the most effective forms
of misleading names. Example: to impersonate the website bank.com,
use accounts-bank.com or bank.accts.com.
Typosquatting: these domain names exploit typical typos, due to
typing and/or spelling errors, such as banc.com, baank.com and banl.com.
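The look-alike patterns above can be partially detected by mapping confusable characters to a canonical 'skeleton' and by flagging mixed scripts. The confusable map below is a tiny, assumed subset of the real Unicode confusables data, for illustration only:

```python
import unicodedata

# Tiny, assumed subset of the Unicode 'confusables' data:
# capital I, digit 1 -> l; digit 0 -> o; Cyrillic a, e, er -> a, e, p.
CONFUSABLES = {"I": "l", "1": "l", "0": "o",
               "\u0430": "a", "\u0435": "e", "\u0440": "p"}

def skeleton(domain: str) -> str:
    """Map visually-confusable characters to one canonical form."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in domain).lower()

def looks_like(candidate: str, trusted: str) -> bool:
    """Different strings that render (almost) identically."""
    return candidate != trusted and skeleton(candidate) == skeleton(trusted)

def mixed_script(domain: str) -> bool:
    """Latin mixed with another alphabet is a homograph warning sign."""
    scripts = {unicodedata.name(ch).split()[0] for ch in domain if ch.isalpha()}
    return "LATIN" in scripts and len(scripts) > 1

print(looks_like("paypaI.com", "paypal.com"))  # -> True  (capital I vs 'l')
print(mixed_script("p\u0430ypal.com"))         # -> True  (Cyrillic 'a')
```

Such checks catch only the visual tricks; domain-name hacking and combosquatting use genuinely different names, and require reputation- or policy-based defenses instead.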
The high-level goal of PKI is to protect relying parties from such rogue
certificates, as well as from CAs who issue rogue certificates intentionally or due to
negligence. In the next subsection, we try to turn this high-level goal into more
precise requirements. The concerns about misleading certificates and domain
names are some of the many challenges which a designer faces when trying to
protect systems involving human users; we look a bit deeper into this important
topic in Chapter 9.
8.1.2 Security goals of PKI schemes
X.509 security goals. The basic, high-level goal of a public key infrastructure
(PKI) is to allow a relying party to ensure that it uses a valid public key for its
specific needs and application.
The mere fact that X.509 certificates are signed by the CA may seem to
ensure this goal; however, there are two caveats. First, X.509 allows certification
not only by root CAs, trusted directly by the relying party, but also by
intermediate CAs, based on a precise policy of the root CA. Second, relying
parties should not be fooled into relying on a revoked certificate. These two
caveats imply two corresponding security requirements:
Accountability: assume a relying party validated a certificate c, optionally
using some 'additional data' D (typically, additional certificates). Yet,
assume that the entity identified as the subject in c denies 'owning' the
certified public key. Then we can identify an accountable CA, CAA, which
has signed some certificate cA, whose subject denies 'owning' the public
key certified in cA. If c was issued by a root CA, then D is unnecessary,
CAA is the root CA and cA = c. See X.509's certificate-path mechanisms
in Section 8.3.
Revocation: assume that at time t, a CA revokes a certificate c (that it
previously issued). Then, after some bounded delay ∆, no relying party
will consider the certificate valid.
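The bounded-delay revocation requirement can be sketched from the relying party's side: cache the revocation status and fail closed once the status is older than ∆. The bound ∆ and the fail-closed policy below are assumptions of this sketch, not part of the X.509 standard:

```python
DELTA = 24 * 3600  # assumed bound on revocation propagation (seconds)

class RelyingParty:
    """Treats a certificate as valid only while its revocation status
    is fresh; once the status is older than DELTA, it fails closed, so
    a revocation at time t is honored by time t + DELTA at the latest."""
    def __init__(self):
        self.status = {}  # serial -> (revoked, fetched_at)

    def update_status(self, serial, revoked, now):
        self.status[serial] = (revoked, now)

    def is_valid(self, serial, now):
        if serial not in self.status:
            return False  # no status known: fail closed
        revoked, fetched_at = self.status[serial]
        return (not revoked) and (now - fetched_at <= DELTA)
```

Failing open instead (accepting a certificate when fresh status is unavailable) voids the bounded-delay guarantee, which is exactly the trade-off debated for CRL and OCSP deployments in Section 8.4.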
Post-X.509 security goals. There are several 'post-X.509' PKI designs
which aim to address additional requirements, including:
Transparency: the set of all issued certificates is transparent, i.e., publicly
known. See Section 8.6.
Revocation-status transparency: the revocation status of a certificate is
publicly known, i.e., it is known whether a particular certificate was
revoked.
Equivocation-prevention: there cannot exist two valid yet equivocating certificates, i.e., different identity certificates for the same identifier (e.g.,
domain name).
Equivocation detection: any pair of equivocating certificates would be detected within bounded time after the second one is issued.
Relying party privacy: the PKI mechanism does not expose which certificates are validated by a given relying party. One case where this does not
hold is when using the OCSP protocol (Section 8.4).
8.1.3 The Web PKI
In Chapter 7 we discussed the use of certificates by the SSL/TLS protocols, used
to secure web traffic and other applications. The SSL or TLS client (often, the
browser) receives a certificate (pk, ATTR, σ) authenticating the server's public
key pk, and binding it to the server's domain, e.g., bob.com, which is specified
as one of the attributes in ATTR. The certificate contains a signature σ; for the
certificate to be valid, σ must be a signature over (pk, ATTR) which validates
correctly using the public validation key of some trusted certificate authority
(CA).
In web security applications, each browser maintains and/or uses¹ a list of
trusted root certificate authorities (root CAs). These root CAs can also certify
additional CAs, referred to as intermediate CAs; we explain the process later
on. To support SSL or TLS, a web-server for a domain, e.g., bob.com, needs a
certificate for that domain, signed by a root or intermediate CA, and, of course,
to know and use the corresponding private key.
¹ Often, browsers use a list of trusted root CAs maintained by the operating system,
possibly combining it with a browser-maintained list.
Note that SSL and TLS also support (optional) client authentication;
however, client authentication requires a client certificate; only a few traditional
clients, such as browsers, have these client certificates, but their use is more
common for IoT devices. Similarly, client certificates are required for end-to-end
secure email services, e.g., using S/MIME [325]; again, only a tiny fraction
of users have gone through the process of obtaining a client certificate, and as
a result, these secure email services are not widely used. The difficulty of
obtaining client certificates is probably one reason for the fact that most secure
messaging applications rely on authentication by the provider, and sometimes
also by the peer user, but not on client certificates.
For further discussion, focusing on weaknesses of the current Web PKI and
some solutions, see subsection 8.5.1.
8.2 The X.509 PKI
In this section, we discuss the basic notions of the X.509 PKI standard, which
was developed as part of the X.500 global directory standard. X.509 is the most
widely deployed PKI specification, and it also includes some of the more advanced
PKI concepts which we cover in the following sections.
8.2.1 The X.500 Global Directory Standard
X.500 [95] is an ambitious, extensive set of standards of the International
Telecommunication Union (ITU), a United Nations agency whose role is to
facilitate international connectivity in communications networks. The goal
of X.500 is to facilitate the interconnection of directory services provided by
different organizations and systems. The first version of X.500 was published
as early as 1988, and numerous extensions and updates were published over the
years.
The basic idea of X.500 is to provide a trusted, unified and ideally global directory,
by combining the data and services of its multiple component directories.
Such a unified directory would be operated by cooperation between trustworthy
providers, such as telecommunication companies. Significant aspects of X.500,
such as distinguished names, are deployed by LDAP and other directory services [208],
although these are far from the vision of a global directory. Among
the possible reasons for that are the high complexity of the X.500 design, concerns
that X.500 interoperability may cause exposure of sensitive information, and
lack of sufficient trust among different directory providers.
However, some concepts from X.500 live on; we already mentioned LDAP
as one example. More relevant to our subject, the X.500 recommendation
contributed extensively to the development of PKI schemes. The X.500 designers
observed that an interoperable directory should bind standard identifiers to
standard attributes.
One important set of attributes defines the public key(s) of each entity. The
entity's public encryption key allows relying parties to encrypt messages so that
only the intended recipient may decrypt them. Similarly, the entity's public
validation key allows relying parties to validate statements signed by the entity.
We next discuss the main form of standard identifier defined in X.500: the
distinguished name.
8.2.2 The X.500 Distinguished Name
The design of X.500 was extensively informed by the experience of telecommunication
companies at the time, which included the provision of directory services
to phone users. Phone directory services are mostly based on looking up a
person's common name; the common name has the obvious advantage of being
a meaningful identifier - we usually know the common name of a person when
we ask the directory for that person's information. Phone directories would
normally also allow specification of the relevant area, e.g., in the form of a locality;
by limiting search to specific areas or localities, the directory services can be
decentralized.
However, obviously, a common name is not a unique identifier - in fact,
some common names are quite common, if you excuse the pun. In classical
phone directories, this is addressed by returning a set of results containing all
relevant entries, along with the relevant common name and other attributes
(e.g., location).
The X.500 designers decided that, in order to allow efficient use of large,
global directories, returning multiple results is not a viable option. Instead,
they decided to use a more refined identifier, with multiple keywords - where the
common name is simply one of these keywords. This identifier is the X.500
Distinguished Name (DN). The distinguished name was designed to satisfy the
following three main goals for identifiers:
Meaningful: identifiers should be meaningful and recognizable by humans.
This makes it easier to memorize the identifier, as well as to link it with
off-net identifiers, with potential legal and reputation implications.
Unique: identifiers should be unique, i.e., different subjects should have different
identifiers, allowing each identifier to be mapped to a specific
subject.
Decentralized management: multiple, 'distributed' issuers can issue identifiers,
without restrictions, i.e., any issuer is allowed to issue any identifier.
The uniqueness requirement is an obvious challenge, as common names are
obviously not unique. To facilitate unique DNs for people sharing the same
common name, X.500 distinguished names consist of a sequence of several
keyword-value pairs. The inclusion of multiple keywords - also referred to as
Applied Introduction to Cryptography and Cybersecurity
490
CHAPTER 8. PUBLIC KEY INFRASTRUCTURE (PKI)
C: Country
L: Locality
O: Organization name
OU: Organization unit
CN: Common name
Table 8.1: Standard keywords/attributes in X.500 Distinguished Names
Figure 8.2: Example of the X.500 (and X.509) Distinguished Name (DN) Hierarchy.
attributes - helps to ensure unique identification, when combined with the
common name. Typical, standard keywords are shown in Table 8.1; however, a
directory is free to use any keyword it desires.
To satisfy the ‘meaningful’ goal, identifiers should have readable representations. RFC 1779 [233] specifies a popular string representation for distinguished
names, where each keyword is separated from its value by an equals sign, and
different pairs are separated by a comma or semicolon, optionally also with
spaces. Other representations are possible too, e.g., Figure 8.2 includes an
encoding of a DN using slashes for separation.
Let us give two simple examples of different legitimate interpretations (and
implementations) of the RFC 1779 representation:
1. As illustrated in Figure 8.2, the distinguished name (DN) for a police
officer named John Doe in the Soho precinct of the NYPD may be defined
as: C=US/L=NY/O=NYPD/OU=soho/CN=John Doe.
2. The distinguished name (DN) for an IBM UK employee with the name
Julian Jones may be written as: CN=Julian Jones, O=IBM, C=GB. Read
below on the author’s experience with this (realistic) DN.
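The string representations above can be parsed mechanically. The following is a minimal, illustrative Python sketch that handles only the simple forms shown in the two examples; it ignores the quoting and escaping rules that RFC 1779 also defines, and the separator set is an assumption covering both the comma/semicolon representation and the slash-separated encoding of Figure 8.2.

```python
import re

# Toy parser for a simplified DN string representation.
# Real RFC 1779 strings also allow quoting, escaping and
# multi-valued RDNs, which this sketch deliberately ignores.
def parse_dn(dn, separators=",;/"):
    pairs = []
    # split on any of the allowed pair separators
    for part in re.split("[" + re.escape(separators) + "]", dn):
        part = part.strip()
        if not part:
            continue
        # keyword and value are separated by an equals sign
        keyword, _, value = part.partition("=")
        pairs.append((keyword.strip(), value.strip()))
    return pairs

print(parse_dn("CN=Julian Jones, O=IBM, C=GB"))
# [('CN', 'Julian Jones'), ('O', 'IBM'), ('C', 'GB')]
```

Note that the parser preserves the order of the keyword-value pairs, which matters for the hierarchy discussed below.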
8.2. THE X.509 PKI
Note that the two examples use a different order of keywords, with the most
specific term being on the right in the first and on the left in the second; this
kind of ambiguity is a source of implementation bugs, as indeed happened for
a few implementations; it may also cause vulnerabilities.
Note that the keyword-value pairs comprising an X.500 distinguished name are
specified in a sequence, i.e., as an ordered list. This allows the distinguished
names to be organized as a hierarchy, using the sequence of keywords as the
nodes, as illustrated in Figure 8.2. By assigning a specific, single entity to
allocate identifiers in a sub-tree of the X.500 DN hierarchy, this entity can ensure
uniqueness by never allocating the same identifier (DN) to two different subjects;
e.g., the Soho precinct of the NYPD may maintain its own sub-directory. This
also allows queries over the entire set of distinguished names that begin with a
particular prefix of keyword-value pairs.
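The hierarchical allocation just described can be sketched in a few lines of Python. The DNRegistry class and its method names are illustrative, not part of X.500: it models a single authority for a sub-tree that refuses to allocate the same DN twice, and supports queries over all DNs under a given prefix.

```python
# Toy registry illustrating hierarchical allocation of Distinguished
# Names: one authority per sub-tree, which never assigns the same DN
# twice, and supports prefix queries. All names are illustrative.
class DNRegistry:
    def __init__(self):
        self._names = {}  # DN (tuple of keyword-value pairs) -> subject

    def register(self, dn, subject):
        dn = tuple(dn)
        if dn in self._names:
            raise ValueError(f"DN already allocated: {dn!r}")
        self._names[dn] = subject

    def query_prefix(self, prefix):
        # return all registered DNs beginning with the given prefix
        prefix = tuple(prefix)
        return [dn for dn in self._names if dn[:len(prefix)] == prefix]

reg = DNRegistry()
reg.register([("C", "US"), ("O", "NYPD"), ("OU", "soho"), ("CN", "John Doe")],
             "officer-1")
print(reg.query_prefix([("C", "US"), ("O", "NYPD")]))
```

The duplicate-rejection in register is exactly the uniqueness guarantee a sub-tree authority provides; the prefix query mirrors the hierarchical queries mentioned above.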
However, this implies that X.500 distinguished names cannot be issued in an
entirely decentralized manner - some control and coordination over the allocation
of identifiers is required. Furthermore, there are also some caveats with respect
to the other goals - unique and meaningful identifiers.
Let us first consider the goal of meaningful identifiers. While the use of subdivisions such as ‘organization unit (OU)’ may help to reduce the likelihood
of two persons with the same common name falling in the same ‘bin’, this possibility
still exists. As a result, administrators may have to enforce uniqueness by
‘modifying’ the common name. For example, if there are multiple IBM UK
employees with the name Julian Jones, one of them may be assigned the DN:
CN=Julian Jones2, O=IBM, C=GB
This results in less meaningful distinguished names; e.g., it is easy to confuse
the DNs of the two employees. For example, the author has sent messages
intended for CN=Julian Jones2, O=IBM, C=GB to CN=Julian Jones, O=IBM,
C=GB, when using an email system that used distinguished names as
email addresses. Luckily, both Julians were understanding of the mistake.
Another cause of mistakes and ambiguity is the fact that there are no rules
governing the order of the keywords, i.e., the structure of the hierarchy, as is
evident from the two examples we presented. In particular, some multinational
organizations may use the country as the top-level category, as in CN=Julian
Jones, O=IBM, C=GB, while others may view the organization itself as the top-level category, as in CN=Julian Jones, C=GB, O=IBM. These two distinguished
names are different; this distinction may not be obvious to a non-expert, further
detracting from the goal of ‘meaningful’ names.
There are also cases where uniqueness is not guaranteed. Some namespaces
are shared by design, and cannot be segregated with a single authority assigning
identifiers in each segment. For example, consider Internet domain names;
multiple registrars are authorized to assign names in several top-level domains
such as com and org. There is a coordination process between registrars, but if
it is not followed correctly, conflicts may occur.
This problem is more severe with respect to public key certificates for Internet
domain names, which can be issued by multiple Certificate Authorities; any
faulty authority may issue a certificate to an entity who does not rightfully own
the certified domain name. Such incidents have occurred - often due to intentional
attack; e.g., see Table 8.5. This is a major concern for Web PKI, as well as PKI
in general, and we discuss it further later in this chapter.
We conclude that X.500 distinguished names are not perfectly meaningful
and definitely not decentralized; furthermore, sometimes, distinguished names
may not even perfectly ensure uniqueness. Indeed, there seems to be an inherent
challenge in satisfying all three goals, although achieving any two of these three
properties is definitely feasible - a classical trilemma scenario.
The identifiers trilemma. We argued that X.500 distinguished names may
fail to ensure each of the three goals defined above - uniqueness, meaningfulness
and decentralized management. In contrast, several other identifiers ensure pairs
of these three properties:
Common names are meaningful - and decentralized, as any person can decide
on the name. However, they are definitely not unique.
Public keys and random identifiers are decentralized and (except for very
rare collisions) unique. However, obviously, public keys are not directly
meaningful to humans.
Email addresses are unique and meaningful. However, they are not decentralized, since each issuer can only assign identifiers (email addresses) in
its own domain.
This raises the question: is there a scheme which will ensure identifiers that
fully satisfy all three properties, i.e., would be unique, meaningful, and managed
and issued in a decentralized way? It seems that this may be hard or impossible,
i.e., it may be possible to fully ensure only two of these three goals, but not all
three. We refer to this challenge as the Identifiers Trilemma,² and illustrate it
in Figure 8.3.
Additional concerns regarding X.500 Distinguished Names. We conclude our discussion of X.500 distinguished names by discussing a few additional
concerns.
Privacy. The inclusion of multiple categorizing fields in X.500 DNs may
expose information in an unnecessary, and sometimes undesired, manner. For example, employees may not always want to expose their location or organizational
unit.
Flexibility. People may change locations, organization units and more; with
X.500 DNs, this may result in an ‘incorrect’ DN, or require a change of the DN -
both undesirable.
² This challenge is also referred to as Zooko’s triangle; however, Zooko has apparently
referred to a different trilemma, albeit also related to identifiers. Specifically, Zooko considered
the challenge of identifiers which will be distributed, meaningful for humans, and also self-certifying, allowing recipients to locally confirm the mapping from name to value.
Figure 8.3: The Identifiers Trilemma: the challenge of co-ensuring unique, decentralized and meaningful identifiers.
Usability. X.500 DNs are designed to be meaningful, i.e., users can easily
understand the different keywords and values. However, sometimes this may
not suffice to ensure usability. In particular, consider two of the most important
applications for public key cryptography and certificates: secure web-browsing
and secure email/messaging.
Secure web-browsing: users, as well as hyperlinks, specify the desired website
using an Internet domain name, and not a distinguished name. Hence,
the relevant identifier for the website is that domain name - provided by
the user or in the hyperlink. This requires mapping from the domain
name to the distinguished name. A better solution is for the certificate to
directly include the domain name; this is supported by the SubjectAltName
extension, defined by PKIX, see subsection 8.2.6.
Secure email/messaging: users also do not use distinguished names to identify peers with whom they communicate using email and instant messaging
applications. Instead, they use email addresses - or application-specific
identification. This problem may not be as significant, since most end
users do not have a public key certificate at all; and, again, PKIX allows
certificates to directly specify an email address.
8.2.3 X.509 Public Key Certificates
The X.500 standard included a dedicated sub-standard, X.509, which defined
authentication mechanisms, allowing entities to authenticate themselves to the
directory. X.509 defined multiple authentication mechanisms, e.g., the use of
password-based authentication. However, one of these authentication methods
became a very important, widely used standard: the X.509 public key certificate.
Originally, the main goal of the X.509 authentication was to allow each entity
to maintain its own record with the directory, e.g., to change its address. However,
it was soon realized that public key certificates allow many more applications,
Figure 8.4: X.509 version 1 certificate. Note the fields added in later versions,
mainly version 3 of X.509 (Figure 8.5), most notably the extensions field.
since they allow recipients to authenticate the public key of a party without
requiring any prior communication. As a result, X.509 certificates became a
widely deployed standard, which is used for SSL and TLS, code-signing, secure
email (S/MIME), IPsec and more. All this use is in spite of complaints about
the complexity of the X.509 specifications and encoding formats - obviously,
the wide use is also one reason for the numerous complaints. For details of the
encoding, see [226, 376]; see also [306].
The definition of the X.509 certificates did not change much from the
first version of X.509; the contents (fields) of that first version are
shown in Figure 8.4. These fields are, by their order in the certificate:
Version: the version of the X.509 certificate and protocol.
Certificate serial number: a serial number of the certificate, unique among
all of the certificates issued by this CA. PKIX [104] specifies that the
serial number should be a positive integer of up to 20 bytes, i.e., up to
159 bits. The best practice is to select the serial number randomly, not
sequentially. The motivation is attacks [367, 368] that manipulate a
CA into issuing a certificate whose hash collides with the contents of a
different certificate, when using predictable sequence numbers, together
with a hash function which has the chosen-prefix collisions vulnerability,
such as MD5 or SHA-1 (subsection 3.3.1).
Signature-process Object Identifier (OID): this is an identifier of the process used for signing the certificate, typically using the Hash-then-Sign
paradigm. This identifier specifies both the underlying public key signature
algorithm, e.g., RSA, and the hash algorithm, e.g., SHA-256. The
algorithm may be written as a string for readability, and standard string
terms are used for widely used methods, e.g., sha256WithRSAEncryption;
notice the use of the term ‘RSA Encryption’ when referring to RSA signatures - a common misnomer. In the certificate itself, the algorithm is
typically specified using the Object Identifier (OID) standard; see Note 8.1.
Issuer Distinguished Name: the distinguished name of the certificate authority which issued, and signed, the certificate.
Validity period: the period of time during which the certificate is to be
considered valid.
Subject Distinguished Name: the distinguished name of the subject of the
certificate, i.e., the entity to whom the certificate was issued. This entity
is expected to know the private key corresponding to the certified public
key.
Subject public key information: this field contains the public key of the
subject, and an object identifier (OID, see Note 8.1) that identifies the
algorithm with which the key is used, including key-length, e.g., RSA/2048.
The allowed usage of the certified public key - e.g., to encrypt messages
sent to the subject, or to validate signatures by the subject - is specified
in the KeyUsage extension (subsection 8.2.7), not in the subject public key
information field.
Signature: finally, this field contains the result of the application of the
signature algorithm (identified by the signature-process OID field above)
to all of the other fields in the certificate, using the private signing key
of the issuer (certificate authority). The sequence of all these fields in
the certificate, excluding the signature field itself, is referred to as the
to-be-signed fields; see Figure 8.4 and Figure 8.5. This allows the relying
party to validate the authenticity of the fields in the certificate, e.g., the
validity period, the subject distinguished name, and the subject public
key.
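Two of the points above can be illustrated with a short, hedged Python sketch: choosing a random serial number within the PKIX bound (a positive integer of up to 159 bits), and computing the digest of the ‘to-be-signed’ fields, i.e., everything except the signature itself. The dict-based certificate model and the JSON serialization stand in for the real DER encoding and are purely illustrative.

```python
import hashlib
import json
import secrets

def random_serial():
    # PKIX: positive integer of up to 20 bytes; keeping it to 159 bits
    # leaves the top bit clear, so it fits 20 DER octets as positive.
    while True:
        serial = secrets.randbits(159)
        if serial > 0:
            return serial

def tbs_digest(cert):
    # hash every field except the signature itself (the 'to-be-signed'
    # fields); JSON stands in for the real DER encoding here.
    tbs = {k: v for k, v in cert.items() if k != "signature"}
    return hashlib.sha256(json.dumps(tbs, sort_keys=True).encode()).hexdigest()

cert = {
    "version": 1,
    "serial": random_serial(),
    "sig_alg": "sha256WithRSAEncryption",
    "issuer": "CN=Example CA",
    "validity": ("2024-01-01", "2025-01-01"),
    "subject": "CN=Julian Jones, O=IBM, C=GB",
    "public_key": "(subject public key)",
    "signature": None,  # would hold Sign(CA.s, tbs_digest(cert))
}
print(cert["serial"].bit_length() <= 159, tbs_digest(cert)[:16])
```

Note how changing any to-be-signed field changes the digest, while the signature field itself does not contribute to it.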
Exercise 8.1. Provide a security motivation for the fact that the signature
process is specified as one of the (signed) fields within the certificate. Do this by
constructing two ‘artificial’ CRHFs, hA and hB; to construct hA and hB, you
may use a given CRHF h. Your constructions should allow you to show that it
could be insecure to use certificates where the signature process (incl. hashing)
is not clearly identified as part of the signed fields. Specifically, design hA, hB
to show how an attacker may ask a CA to sign a certificate for one name, say
Attacker, and then use the resulting signature over the certificate to forge a
certificate for a different name, say Victim.
X.509 Certificates: Versions 2 and 3. Following X.509 version 1, the
X.509 certificates were extended by a few additional fields; see Figure 8.5.
Version 2 of X.509 added two fields, both of them for unique identifiers -
one for the subject and one for the issuer (CA). These fields were defined to
Note 8.1: Object identifiers (OIDs)
The joint ITU and ISO ASN.1 standard [93, 129] defines the concept of an object
identifier (OID) as a unique identifier for arbitrary objects. Object identifiers are
specified as a sequence of numbers, e.g., 1.16.180.1.45.34, separated by dots (as
shown) or spaces. OID numbers are assigned hierarchically to organizations and
to ‘individual objects’; when an organization is assigned a number, e.g., 1.16, it
may assign OIDs whose prefix is 1.16 to other organizations or directly to objects,
e.g., 1.16.180.1.45.34. The top level numbers are either zero (0), allocated to ITU,
1, allocated to ISO, or 2, allocated jointly to ISO and ITU. RFC 3279 [28] defines
OIDs for many cryptographic algorithms and processes used in Internet protocols,
e.g., RSA, DSA and elliptic-curve signature algorithms; when specifying a signature
process, the OID normally also specifies both the underlying public key signature
algorithm and key length, e.g., RSA/2048, and the hashing function, e.g., SHA-256,
used to apply the ‘Hash-then-Sign’ process. X.509 uses OIDs to identify signature
algorithms and other types of objects, e.g., extensions and issuer-policies. The
use of OIDs allows identification of the specific type of each object, which helps
interoperability between different implementations.
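The hierarchical allocation of OIDs described in Note 8.1 can be checked mechanically. The following small Python sketch parses dotted (or space-separated) OID strings and tests whether an OID falls under a given arc. The first example OID is taken from the note; 2.5.29.15 is the OID commonly used for the key-usage extension.

```python
# Helpers illustrating hierarchical OID allocation: an organization
# holding an arc (prefix) may assign any OID under that arc.
def parse_oid(oid):
    # accept dot- or space-separated OID strings, per the note above
    return tuple(int(x) for x in oid.replace(" ", ".").split("."))

def under_arc(oid, arc):
    oid, arc = parse_oid(oid), parse_oid(arc)
    return oid[:len(arc)] == arc

print(under_arc("1.16.180.1.45.34", "1.16"))  # True
print(under_arc("2.5.29.15", "1.16"))         # False
```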
Figure 8.5: X.509 version 3 certificate. Version 2 is identical, except for not
having the extensions field; version 1 also does not have the two ‘unique identifier’
fields (Figure 8.4).
ensure uniqueness in situations where the distinguished name may fail to ensure
uniqueness, as discussed in subsection 8.2.2. However, these unique identifier
fields are not in wide use, as they are entirely unrelated to the meaningful
identifiers used in typical applications.
Version 3 of X.509 (X.509v3) is the one in practical use; the main reason for
its wide success is that it dramatically increased the expressiveness of X.509
certificates. As can be seen in Figure 8.5, this dramatic improvement is due to
just one new field added in version 3: the general-purpose extensions field. The
extensions field provides extensive flexibility and expressiveness to certificates,
and facilitates many applications and use cases; this field is typically much
longer than all other fields combined. The X.509v3 extensions mechanism is
the subject of the next subsection.
8.2.4 The X.509v3 Extensions Mechanism
As shown in Figure 8.5, X.509 certificates, from version 3, include a field that
can contain one or more extensions. We discuss some specific, important
extensions in the following subsections. But first, let us discuss the extensions
mechanism itself, since this mechanism has a rather clever design, which
balances between the need to allow extendibility, and the concern of using a
certificate incorrectly (due to ignoring or incorrectly handling an extension).
Each extension has the following three components:
Extension identifier: specifies the type of the extension. The extension
identifier is specified using an object identifier (OID), to facilitate interoperability. The following subsections discuss some important extensions,
e.g., key usage and name constraints.
Extension value: this is an arbitrary string which provides the value of the
extension. For example, a possible value for the key-usage extension
would indicate that the certified key is to be used as a public encryption
key, while a possible value for the name constraints extension may be
Permit C=GB, allowing the subject of the certificate to issue its own
certificates, but only with the value ‘GB’ (Great Britain) as their ‘C’
(country) keyword.
Criticality indicator: this is a binary flag, i.e., an extension can be marked
as critical or as non-critical. The value of the criticality indicator flag
in an extension instructs relying parties how to handle the certificate if
the relying party is not familiar with this type of extension, as indicated
by the extension identifier. A relying party should not use a certificate
which includes an extension marked as critical, if the relying party is not
familiar with this type of extension. Relying parties can use a certificate
even if it contains an extension of a type not known to the relying party,
ignoring that extension, if the extension is marked as non-critical. When
the relying party is familiar with the type of an extension, the value of
the criticality indicator is not applicable.
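The decision rule above can be sketched as a small Python function: reject the certificate on an unknown critical extension, silently ignore unknown non-critical ones, and process the rest normally. The triple-based extension model and the example OIDs are illustrative.

```python
# Sketch of the relying party's handling of the criticality indicator.
def check_extensions(extensions, known_types):
    """extensions: list of (oid, critical_flag, value) triples."""
    to_process = []
    for oid, critical, value in extensions:
        if oid in known_types:
            to_process.append((oid, value))  # handle per its semantics
        elif critical:
            # unknown critical extension: the certificate must be rejected
            raise ValueError("unknown critical extension: " + oid)
        # unknown and non-critical: silently ignored
    return to_process

exts = [("2.5.29.15", True, "digitalSignature"),  # key usage (known)
        ("1.3.6.1.4.1.99999.1", False, "x")]      # unknown, non-critical
print(check_extensions(exts, known_types={"2.5.29.15"}))
```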
The criticality indicator flag is a simple mechanism - but a very valuable one,
as it allows both critical and non-critical extensions. The flexibility
offered by the ‘criticality indicator’ makes the X.509 extensions mechanism very
versatile; it is a pity that this idea has not been adopted by other extension
mechanisms. For example, TLS clients and servers simply ignore unknown
TLS extensions, i.e., treat them as non-critical, as discussed in Chapter 7. It
would have been useful if TLS also allowed the definition of critical extensions, i.e.,
instructing a TLS peer to refuse the connection if it is sent a critical but
unknown TLS extension. This can be achieved quite easily; see the next exercise.
Exercise 8.2. Design how TLS may be extended to support critical extensions.
Could you achieve this using the existing TLS extensions mechanism?
X.509, as well as PKIX and other X.509 profiles, define some extensions to
be always marked critical, others to be always marked non-critical, and others
to be marked differently depending on needs. We next present examples of each
of these three types of extensions, focusing on standard extensions.
Example 8.1. The TLS feature X.509 extension: defined to be used as a
non-critical extension.
An X.509 extension called TLS feature is defined in RFC 7633 [184]. This
TLS feature extension is used in TLS server certificates, to indicate that the
server supports a specific TLS extension (see subsection 7.4.3). The name
chosen for this X.509 extension is TLS feature, rather than TLS extension, to
make it clearer that the TLS feature is an X.509 extension, and only refers
to the support for a specific TLS extension; unfortunately, confusion is still
natural.
The TLS feature X.509 extension allows the server to indicate to the client
that the server supports certain important TLS extensions (‘features’). Some
TLS clients may not support the TLS feature X.509 extension, so if this extension
were marked critical, these clients would reject the certificate, and the
connection would fail. Hence, the TLS feature X.509 extension should be marked
as non-critical. Note that the TLS feature extension isn’t one of the standard
extensions defined in either X.509 or PKIX; it was developed later, specifically
to allow a certificate to mark that the server always uses the must-staple TLS
extension, see subsection 8.4.3.
Example 8.2. The extended key usage extension: can be either critical or
non-critical.
The extended key usage extension allows the issuer to define allowed usages
for the certified public key, in addition to or in place of the usage
specified in the key usage extension. In some scenarios, the ‘extended key
usage’ should be critical, e.g., to prevent incorrect usage based on the key usage
extension by clients not supporting extended key usage. In other scenarios, the
extended key usage extension should be non-critical, e.g., when allowing some
additional usage over that already specified in the key usage extension.
Example 8.3. The key usage extension: always critical in PKIX, and a
motivating attack.
PKIX specifies that the key usage extension must be marked critical, while
X.509 allows the key usage extension to be marked as either critical or non-critical. Let us first give a contrived example of a possible attack exploiting
a certificate where key-usage was not marked as critical, causing a relying
party who does not understand this extension to make a critical security
mistake. Assume that the parties (the key-owner, i.e., certificate subject, and
the relying party) use ‘textbook RSA’ encryption, i.e., encrypt plaintext m_E by
computing c = m_E^e mod n; and ‘textbook RSA’ signing, i.e., sign message m_S
by outputting σ = h(m_S)^d mod n, i.e., ‘decrypting’ the hash of the message.
Furthermore, assume the key-owner uses its decryption key to authenticate that
it is active at a given time, by decrypting an arbitrary challenge ciphertext sent
to it; this requires only a relatively weak form of ciphertext-attack resistance,
where the attacker must ask for the decryption before seeing the challenge
ciphertext it must decrypt, often referred to as IND-CCA1 security and assumed
for textbook RSA. A key-owner using this mechanism must use its key only for
decrypting these challenges; assume it receives a certificate C_E for its encryption
key e, with the key-usage extension correctly marking this as an encryption key,
but not marked as critical.
An attacker may abuse this, together with the fact that key usage is not
understood by some relying parties, to mislead these relying parties into thinking
that the key-owner signed some attacker-chosen message m_A, as follows. The
attacker computes c_A = h(m_A) and sends it to the key-owner, as if it were
a standard challenge ciphertext to be decrypted. The key-owner therefore
decrypts c_A and outputs the decryption, c_A^d mod n = h(m_A)^d mod n, which
we denote by σ_A, i.e., σ_A ≡ h(m_A)^d mod n. Now the attacker sends the pair
(m_A, σ_A), along with the certificate C_E, to the relying party, claiming m_A was
signed by the key-owner with signature σ_A. Since the relying party is not
familiar with the key-usage extension, and it was not marked critical in the key-owner’s certificate C_E, the relying party would validate (m_A, σ_A), which
would validate correctly, and thereby incorrectly consider m_A as validly signed
by the key-owner.
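The attack above can be demonstrated concretely with textbook RSA over tiny, insecure parameters (a toy sketch: the modulus, exponents, and the reduction of the hash mod n are all illustrative and far too small for real use). The key-owner's decryption oracle, applied to c_A = h(m_A), outputs exactly the textbook-RSA signature h(m_A)^d mod n on the attacker's chosen message.

```python
import hashlib

# Toy textbook-RSA parameters (insecure, for illustration only)
p, q, e = 1009, 1013, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent

def h(message):                      # toy hash: SHA-256 reduced mod n
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def decrypt_oracle(c):               # key-owner: 'decrypt this challenge'
    return pow(c, d, n)

def verify(message, sigma):          # textbook RSA signature verification
    return pow(sigma, e, n) == h(message)

m_A = b"pay the attacker one million dollars"
sigma_A = decrypt_oracle(h(m_A))     # the decryption IS a forged signature
print(verify(m_A, sigma_A))          # the forgery verifies: True
```

The demonstration works because textbook RSA decryption and signing are the same operation, x → x^d mod n; enforcing the key-usage extension is exactly what prevents this cross-purpose abuse.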
Let us also point out a more practical attack on a TLS 1.3 client that does
not correctly implement the key usage extension. If the TLS server also
runs any older version of TLS (or SSL) that is vulnerable to some variant of
the Bleichenbacher attack, then the attacker may be able to forge an RSA
signature using the private (decryption) key of the old, vulnerable version. If
the client ignores the key usage extension and uses the public key for verifying
the signature, the attacker succeeds in a cross-protocol attack on TLS 1.3, even
if the server has correctly separated between the TLS 1.3 signature-verification
public key and the public encryption key used by the old, vulnerable version.
8.2.5 Trust-Anchor Certificate Validation
Upon receiving a certificate, the relying party must decide whether it can rely
on and use the certified public key, for a particular application. In this section,
we focus on the case of certificates signed by a trust anchor CA, i.e., a CA
trusted by the relying party. In this case, the relying party would apply the
certificate validation process, using the public signature-validation key of the
CA, CA.v, to determine if the given certificate is valid. If the certificate is
not signed by a trust anchor, then the relying party should first perform the
certification path validation process, to decide whether to trust this certificate,
based on additional certificates; we discuss this in Section 8.3.
Assume, therefore, that a relying party receives a certificate signed (issued)
by a trust anchor, i.e., the relying party trusts the issuing CA, denoted I, and
knows its public validation key I.v. To validate the certificate, the relying party
uses I.v and the contents of the certificate, as follows:
Issuer. The relying party verifies that the issuer I of the certificate, as identified
by the issuer distinguished name field, is a trusted CA, i.e., a trust anchor
(root CA, for the Web PKI).
Validity period. The relying party checks the validity period specified in the
certificate. If the public key is used for encryption or to validate signatures
on responses to challenges sent by the relying party, then the certificate
should be valid at the relevant times, including at the current time. If
the public key is used to validate signatures generated in the past, then it
should be valid at a time when these signatures already existed, possibly
attested by supporting validation by trusted time-stamping services.
Subject. The relying party verifies that the subject, identified in the subject
field using the distinguished name, is the entity that the relying party
expected. For example, when the relying party is a browser and it receives
a website certificate, the relying party should confirm that the
website identity (e.g., domain name) is the same as indicated in the
‘subject distinguished name’ field of the certificate.
Signature algorithms. The relying party confirms that it can apply and trust
the validation algorithm of the signature scheme identified in the signature
algorithm OID field of the certificate. If the certificate is signed using
an unsupported algorithm, or an algorithm known or suspected to be
insecure, validation fails.
Issuer and subject unique identifiers. From version 2, X.509 certificates
also include fields for unique identifiers for the issuer and the subject,
which the relying party should use to further confirm their identities. In
PKIX, these identifiers are usually not used, and PKIX does not require
their validation. This is probably because, in PKIX, the issuer and subject
identifiers are typically in corresponding extensions.
Extensions. The relying party validates that it is familiar with any extension
marked as critical; the existence of any unrecognized extension marked
as critical would invalidate the entire certificate. Then, the relying
party validates the existence and contents of any extension that its policy
requires. To avoid incompatibilities, relying parties and CAs usually
follow agreed-upon policies for the required and permitted extensions,
often referred to as a PKI profile, such as PKIX from the IETF [104] and
profiles defined by the CA/Browser Forum [152].
Validate signature. The relying party next uses the trusted public validation
key of the CA, CA.v, and the signature-validation process as specified in
the certificate, to validate the signature over all the ‘to-be-signed’ fields
in the certificate, i.e., all fields except the signature itself.
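The validation steps above can be condensed into a sketch. The dict-based certificate model, the field names, and the caller-supplied verify_sig stub are assumptions for illustration, not the real DER processing; the function returns the first failing check, mirroring the order of the steps listed above.

```python
import datetime

# Condensed sketch of trust-anchor certificate validation over a
# dict-modeled certificate; all field names are illustrative.
def validate_certificate(cert, trust_anchors, expected_subject,
                         supported_algs, known_extensions, verify_sig,
                         now=None):
    now = now or datetime.date.today()
    if cert["issuer"] not in trust_anchors:
        return False, "issuer is not a trust anchor"
    not_before, not_after = cert["validity"]
    if not (not_before <= now <= not_after):
        return False, "certificate not valid at current time"
    if cert["subject"] != expected_subject:
        return False, "subject does not match expected identity"
    if cert["sig_alg"] not in supported_algs:
        return False, "unsupported or untrusted signature algorithm"
    for oid, critical, _ in cert.get("extensions", []):
        if critical and oid not in known_extensions:
            return False, "unknown critical extension: " + oid
    anchor_key = trust_anchors[cert["issuer"]]
    if not verify_sig(anchor_key, cert):       # over the to-be-signed fields
        return False, "signature validation failed"
    return True, "certificate accepted"

cert = {"issuer": "CN=Root CA", "subject": "www.example.com",
        "validity": (datetime.date(2024, 1, 1), datetime.date(2030, 1, 1)),
        "sig_alg": "sha256WithRSAEncryption", "extensions": []}
ok, reason = validate_certificate(
    cert, trust_anchors={"CN=Root CA": "anchor-key"},
    expected_subject="www.example.com",
    supported_algs={"sha256WithRSAEncryption"},
    known_extensions=set(),
    verify_sig=lambda key, c: True,            # stub for the real check
    now=datetime.date(2025, 1, 1))
print(ok, reason)
```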
8.2.6 The SubjectAltName and the IssuerAltName Extensions
Both X.509 and PKIX define the standard SubjectAltName (SAN) and IssuerAltName (IAN) extensions, providing alternative identification mechanisms
(names) to complement or replace the Distinguished Name mechanism for
identifying, respectively, the subject and the issuer. These
alternative fields allow the use of other forms of names, identifiers and addresses
for the subject and/or the issuer. Note that a certificate may contain multiple
SANs.
The most important form of an alternative name is a Domain Name System
(DNS) name, referred to as dNSName, e.g., example.com. These dNSNames are
used by most Internet protocols, and are familiar to most users. Other allowed
but rarely used alternative names include email addresses, IP addresses, and
URIs.
In fact, the use of alternative names is so common that in many PKIX
certificates, the subject and the issuer distinguished-name fields are left empty.
Indeed, PKIX (RFC 5280) specifies that this must be done when the Certificate
Authority can only validate one (or more) of the alternative name forms, which
is often the case in practice. PKIX specifies that in such cases, where the SubjectAltName extension is the only identification and the subject distinguished
name is empty, the extension should be marked as critical; otherwise,
when there is a subject distinguished name, it should be marked as non-critical.
Note that PKIX (RFC 5280) specifies that the Issuer Alternative Name
extension should always be marked as non-critical. In contrast, the X.509
standard specifies that both alternative-name extensions may be flagged as
either critical or non-critical.
Also, note that implementations of the Secure Socket Layer (SSL) and
Transport Layer Security (TLS) protocols, often allow certiőcates to include
wildcard certificates, which, instead of specifying a speciőc domain name, use
the wildcard notation to specify a set of domain name. For TLS, this support is
clearly deőned in RFC 6125 [342]. Wildcard domain names are domain names
Applied Introduction to Cryptography and Cybersecurity
502
CHAPTER 8. PUBLIC KEY INFRASTRUCTURE (PKI)
where some of the alphanumeric strings are replaced with the wildcard character ‘*’; there are often restrictions on the location of the wildcard character, e.g., it may be allowed only as the complete left-most label of a DNS domain name, as in *.example.com. Wildcard domain names are not addressed in PKIX (RFC 5280) or X.509, and RFC 6125 mentions several security concerns regarding their use.
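To make the left-most-label restriction concrete, the following sketch matches a hostname against a certificate's dNSName entries. It assumes the restrictive rule above (a wildcard only as the entire left-most label, matching exactly one label); the function names are ours, not from any standard library.

```python
# Sketch, not a vetted implementation: hostname matching against
# dNSName entries, allowing '*' only as the complete left-most label
# (as in *.example.com), where '*' matches exactly one label.

def matches_dns_name(pattern: str, hostname: str) -> bool:
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False  # '*' matches one label, not several
    if p_labels[0] == "*":
        return h_labels[0] != "" and p_labels[1:] == h_labels[1:]
    return p_labels == h_labels

def hostname_in_sans(hostname: str, sans: list) -> bool:
    """True if the hostname matches any of the certificate's dNSNames."""
    return any(matches_dns_name(p, hostname) for p in sans)
```

Under these rules, *.example.com matches www.example.com, but neither example.com nor a.b.example.com; RFC 6125 mentions further concerns that this sketch ignores.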
8.2.7 Standard key-usage and policy extensions
We next discuss another set of standard extensions, defined in the X.509 standard with further details in PKIX and other PKI profiles. These extensions deal with the usage of the certified key and with the certificate policies related to the issuing and the usage of the certificate. Some of the more important extensions, and their recommended usage as per the PKIX profile, include:
The authority key identifier extension. Provides an identifier for the issuer's public key, allowing the relying party to identify which public validation key to use to validate the certificate, if the issuer has multiple public keys. It is always non-critical.
The subject key identifier extension. Provides an identifier for the certified subject's public key, allowing the relying party to identify that key when necessary, e.g., when validating a signature signed by one of several signature keys of the subject, including signatures on (other) certificates. It is always non-critical.
The key usage extension. The key usage extension defines the allowed usages of the certified public key of the subject, including for signing, encryption and key exchange. The specification allows the use of the same key for multiple purposes, e.g., encryption and validating signatures; however, this should be avoided, as the use of the same key for such different purposes may be vulnerable: security would not follow from the respective security definitions for encryption and for signatures. An exception is when using schemes designed specifically to allow both applications, such as signcryption schemes. The PKIX standard [104] requires this extension to be marked as critical; see subsection 8.2.4.
The extended key usage extension. The extended key usage extension allows definition of specific purposes for which the key is to be used, as supported by relying parties. The specification also allows the CA to indicate that other uses, as defined by the key-usage extension, are also allowed; otherwise, only the specified purposes are allowed. This extension may be marked as critical or not; see subsection 8.2.4.
The private key usage period extension. This extension is relevant only for certification of signature-validation public keys; it indicates the allowed period of use of the private key (to generate signatures). Always marked non-critical.
The certificate policies extension. This extension identifies one or more certificate policies which apply to the certificate; for a brief discussion of certificate policies, see subsection 8.2.8. The extension identifies certificate policies using object identifiers (OIDs). In particular, the policy OID in the certificate policies extension is the main mechanism to identify the type of validation of the legitimacy of the certificate performed by the CA before it issued the certificate: Domain Validation (DV), Organization Validation (OV) or Extended Validation (EV). For more discussion on certificate policies and on the three types of validation, see subsection 8.2.8. The certificate policies extension may be marked as critical or as non-critical.
The policy mappings extension. This extension is used only in certificates issued to another CA, called CA certificates. It specifies that one of the issuer's certificate policies can be considered equivalent to a given (different) certificate policy used by the subject (certified) CA. This extension may be marked as critical or as non-critical.
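The critical flag discussed above determines how a relying party treats extensions it does not recognize: an unrecognized critical extension forces rejection of the certificate, while an unrecognized non-critical extension is simply ignored (see subsection 8.2.4). A minimal sketch of this rule, with an illustrative 'supported' set of our own choosing:

```python
# Sketch of criticality processing: reject only when an extension is
# both unrecognized and marked critical. The SUPPORTED set and the
# extension names are illustrative, not taken from a standard.

SUPPORTED = {"keyUsage", "extendedKeyUsage", "basicConstraints",
             "subjectAltName", "certificatePolicies"}

def extensions_acceptable(extensions):
    """extensions: iterable of (name, is_critical) pairs."""
    for name, critical in extensions:
        if name not in SUPPORTED and critical:
            return False  # critical but not understood: must reject
    return True  # unrecognized non-critical extensions are ignored
```

For example, a certificate carrying an unknown extension marked critical is rejected by this relying party, while the same extension marked non-critical is ignored.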
Exercise 8.3. Some of the extensions presented in this subsection should always
be non-critical, while others may be marked either critical or non-critical. Justify
each of these designations by appropriate examples.
8.2.8 Certificate policy (CP) and Domain/Organization/Extended Validation
A certificate policy (CP) is a set of rules that indicate the applicability of the certificate to a particular use, such as indicating a particular community of relying parties, and/or a class of relying-party applications or security requirements, which may rely on the certificate. Certificate policies inform relying parties of the level of confidence they may have in the validity of the bindings between the certified public key and the information in the certificate regarding the subject, including the subject identifiers. Namely, the certificate policy provides information which may assist the relying party to decide whether or not to trust a certificate for a particular purpose. The certificate policy may also be viewed as a legally-meaningful document, which may define, and often limit, the liability and obligations of the issuer (CA) for potential inaccuracies in the certificate, and define statutes to which the CA, subject, and relying parties should conform; however, these legal aspects are beyond our scope.
Standard certificate policies and types of validations: DV, OV and EV. The certificate policies extension is often used to identify a standard policy; the policy is specified by the policy OID field. Such standard policies are defined by the CA/Browser Forum (CABF), and specific policies are identified in [152]; see Table 8.2. Standard policies often identify the type of validation performed by the CA before issuing the certificate.
Name          Usage         OID (2.23.140...)   Validation requirements     Price
TLS-DV        TLS           ...1.2.1            Domain validated            $14
                                                (confirmation email)
TLS-OV        TLS           ...1.2.2            Organization validation     $30
                                                (registration, phone...)
TLS-EV        TLS           ...1.1              Extended validation         $82.5
                                                (more verifications)
Code-Sign OV  Code signing  ...1.3              Organization validation     $135
                                                (registration, phone...)
Code-Sign EV  Code signing  ...1.3              Extended validation         $291
                                                (more verifications)

Table 8.2: Standard certificate policies. Prices for a yearly certificate, from https://www.thesslstore.com, June 2021. The Chrome and IE columns of the original table, showing the browsers' location-bar indicators, are images and are not reproduced here.
For the important case of website certificates for SSL or TLS, three types of validation are defined: Domain Validation (DV), Organization Validation (OV) and Extended Validation (EV), in order of increasing validation. Domain Validation and Organization Validation follow the ‘Baseline certificate requirements’ defined by the CA/Browser Forum in [150].
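A relying party can recover the validation type from the policy OIDs of Table 8.2, under the CABF arc 2.23.140. A minimal sketch (the function name is ours):

```python
# Sketch: classify a certificate's validation type from the policy
# OIDs in its certificate-policies extension, using the CA/Browser
# Forum OIDs shown in Table 8.2 (under the 2.23.140 arc).

CABF_TLS_POLICY = {
    "2.23.140.1.2.1": "DV",  # domain validated
    "2.23.140.1.2.2": "OV",  # organization validated
    "2.23.140.1.1":   "EV",  # extended validation
}

def validation_type(policy_oids):
    """Return DV/OV/EV for the first recognized policy OID, else 'unknown'."""
    for oid in policy_oids:
        if oid in CABF_TLS_POLICY:
            return CABF_TLS_POLICY[oid]
    return "unknown"
```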
Domain Validation (DV) is a fully-automated, but not very secure, validation process. It involves sending a request to an address associated with the domain, and validating the response. The address may be an IP address or email address (sometimes referred to as email validation). Domain Validation is vulnerable to network attacks, including MitM attacks and off-path attacks exploiting weaknesses of the domain name system (DNS) or of the routing infrastructure; e.g., see [109].
Organization Validation (OV) also requires validation of the organization which is certified, i.e., the subject of the certificate. Most CAs perform limited validation, typically checking that the organization is registered in the specified location, and validating the request by a phone call to a registered number.
Extended Validation (EV) involves additional validation requirements, usually following the ‘Extended-Validation certificate guidelines’ defined by the CA/Browser Forum in [151]. These include registration in official registries, a physical address and more.
The type of validation could be used by relying parties to determine their use of the certificate; it is conceivable that fraudsters would be less likely to obtain EV or even OV certificates, due to their higher costs and stronger validation requirements. The risk of a rogue certificate depends on the validation requirements (Table 8.2); it is highest for domain-validated certificates, which are only validated by automated email to the address listed in the Whois records, a process vulnerable to off-path and routing attacks [62, 81].
However, the current popular web-browsers do not appear to treat certificates differently based on their validation method. Until around 2019, most major browsers displayed a visible indication in the location bar for EV certificates,
e.g., as shown in Table 8.2 for the IE browser, based on research such as [194]. However, this was mostly abandoned in recent years, and currently, most browsers make minimal use of the type of validation. Browsers display the same indicator for all TLS-protected websites (as shown for Chrome in Table 8.2), and the validation type can only be identified by users using the user-interface to look up the details of the certificate. The main justifications given [174] are that these indications were found ineffective, and that they interfere with the browser approach of presenting a warning against sites which are not protected by TLS.
Exercise 8.4. Compare the user-interface indications of the certificate validation method of two browsers. Check the certificates for at least three websites,
e.g., a bank, a newspaper and a browser download web-page.
8.3 Intermediate-CAs and Certificate Path Validation
PKI schemes require the relying parties to trust the contents of the certificate, mainly, the binding between the public key and the identifier. In the simple case, the certificate is signed by a CA trusted directly by the relying parties, as in Figure 8.1. Such a CA, which is directly trusted by a relying party, is called a trust anchor or root CA of that relying party.
Direct trust in one or more trust-anchor (directly trusted) CAs might suffice for small, simple PKI systems. However, many PKI systems are more complex. For example, browsers typically directly trust dozens of trust-anchor CAs, referred to in browsers as root CAs, and also indirectly trust certificates signed by other CAs, referred to as intermediate CAs; an intermediate CA must be certified by a root CA, or by a properly-certified, indirectly-trusted intermediate CA.
Relying parties and PKIs may apply different conditions for determining which certificates (and CAs) to trust. For example, in the PGP Web-of-trust PKI [159], every party can certify other parties. One party, say Bob, may decide to indirectly trust another party, say Alice, if Alice is properly certified by a ‘sufficient’ number of Bob's trust anchors, or by a ‘sufficient’ number of parties which Bob trusts indirectly. The trust decision may also be based on ratings specified in certificates, indicating the amount of trust in a peer. Some designs may also allow ‘negative ratings’, i.e., one party recommending not to trust another party. The determination of whether to trust an entity based on a set of certificates, and/or other credentials and inputs, is referred to as the trust establishment or trust management problem, and is studied extensively; see [70, 71, 198, 199, 266] and citations of and within these publications.
We focus on the simpler case, where a single valid certification path suffices to establish trust in a certificate; a certification path is a series of certificates C1, C2, . . ., where each certificate is signed using the private key corresponding to the public key certified in the previous one, and the first one is signed by a root CA (trust anchor). Different relying parties may validate a certificate path differently, based on their different trust anchors and different policies for trusting certificates certified by an intermediate CA, using a certificate path. The same CA, say CAA, may be a trust anchor
for Alice, and an intermediate CA for Bob, who has a different trust anchor, say CAB. This is the mechanism deployed in most PKI systems and by most relying parties, and specified in X.509, and specifically in PKIX and Web-PKI. The validation of the certificate path is based on several certificate path constraints extensions, which we discuss in the following subsections.
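The chain structure just described can be sketched as follows; certificates are reduced to (issuer, subject) pairs, and signature verification is abstracted into a name comparison, so this illustrates only the path structure, not full path validation (which also checks signatures, validity periods, constraints and revocation):

```python
# Sketch of the chain structure of a certification path: each
# certificate must be issued by the entity certified in the previous
# one, and the first by a trust anchor. A real validator verifies
# signatures; here this is abstracted to comparing names.

from dataclasses import dataclass

@dataclass
class Cert:
    issuer: str   # name of the CA that signed this certificate
    subject: str  # name (or key identifier) being certified

def chains_to_anchor(path, trust_anchors):
    if not path or path[0].issuer not in trust_anchors:
        return False
    for prev, cur in zip(path, path[1:]):
        if cur.issuer != prev.subject:
            return False
    return True

# The single-hop path of Figure 8.6: TACA certifies ICA,
# and ICA certifies the subject www.bob.com.
path = [Cert("TACA", "ICA"), Cert("ICA", "www.bob.com")]
```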
8.3.1 The certificate path constraints extensions
In this subsection, we present the three certificate path constraints extensions that are defined in X.509 and PKIX: basic constraints, name constraints and policy constraints. These constraints are relevant only for certificates issued to a subject, e.g., www.bob.com, by some intermediate CA (ICA), i.e., a CA which is not directly trusted by the relying party (say Alice); that is, it is not one of Alice's trust anchors.
Since an intermediate CA (ICA) is not a trust anchor for Alice (the relying party), Alice would only trust certificates issued by the ICA if the ICA is ‘properly certified’ by some trust anchor CA; we use TACA to refer to a specific Trust Anchor CA which Alice trusts, and based on this trust, may or may not trust a given ICA.
In the simple case, illustrated in Figure 8.6, the relying party (Alice) receives two certificates: a certificate for the subject, e.g., the website www.bob.com, signed by some Intermediate CA, which we denote ICA; and a certificate for ICA, signed by the trust anchor CA, TACA. In this case, we will say that the subject, www.bob.com, has a single-hop certification path from TACA, since ICA is certified by the trust anchor TACA. In this case, therefore, the certification path consists of two certificates: CICA, the certificate issued by the trust anchor TACA to the intermediate CA ICA, and CB, the certificate issued by the intermediate CA ICA to the subject (www.bob.com).
In more complex scenarios there are additional Intermediate CAs in the certification path from the trust anchor to the subject, i.e., the certification path is indirect, or in other words, contains multiple hops. For example, Figure 8.7 illustrates a scenario where the subject, www.bob.com, is certified via an indirect certification path with three hops, i.e., including three intermediate CAs: ICA1, ICA2 and ICA3. The subject www.bob.com is certified by ICA3, which is certified by ICA2, which is certified by ICA1, and only ICA1 is certified by a trust anchor CA, TACA. Hence, in this example, the certification path consists of four certificates: (1) CICA1, the certificate issued by the trust anchor TACA to the intermediate CA ICA1, (2) and (3), the two certificates CICA2 and CICA3, issued by the intermediate CAs ICA1 and ICA2, respectively, to the intermediate CAs ICA2 and ICA3, respectively, and finally (4) CB, the certificate issued by the intermediate CA ICA3 to the subject (www.bob.com).
We use the term subsequent certificates to refer to the certificates in a certification path which were issued by intermediate CAs, and the terms root certificate or trust-anchor certificate to refer to the ‘first’ certificate on the path, i.e., the one issued by the trust-anchor CA. The second certificate along the path is certified by the intermediate CA certified by the trust anchor (in the
trust-anchor certificate); and any following certificate along the path, say the ith certificate along the path (for i > 1), is certified by the intermediate CA which was certified in the (i − 1)th certificate in the path. The length of a certificate path is the number of intermediate CAs along it, which is one less than the number of certificates along the path.
Note that, somewhat contrary to their name, the certification path constraints cannot prevent or prohibit intermediate CAs from signing certificates which do not comply with these constraints; the constraints only provide information for the relying party, say Alice, instructing Alice to trust a certificate signed by ICA only if it conforms with the constraints specified in the certificates issued to the intermediate CAs.
8.3.2 The basic constraints extension
The basic constraints extension defines whether the subject of the certificate, say example.com, is allowed to be a CA itself, i.e., if example.com may also sign certificates (e.g., for other domains or for employees). More specifically, the extension defines two values: a Boolean flag denoted simply cA (with this non-standard capitalization), and an integer called pathLenConstraint (again, with this capitalization).
The cA flag indicates if the subject (example.com) is ‘allowed’ to issue certificates, i.e., act as a CA; if cA = TRUE, then example.com may issue certificates, and if cA = FALSE, then it is not ‘allowed’ to issue certificates. Recall that this is really just a signal to the relying parties receiving certificates signed by example.com; also, this only restricts the use of the certificate issued to example.com for validation of certificates issued by example.com. It does not prevent or prohibit example.com from issuing certificates, which a relying party may still trust, either since it directly trusts example.com (i.e., it is a trust anchor), or since it also receives an additional certificate for example.com, signed by a different trusted CA, and that certificate allows example.com to be a CA, e.g., by having the value TRUE for the cA flag in the basic constraints extension.
The value of the pathLenConstraint is relevant only when there is a ‘path’ of more than one intermediate CA between the Trust Anchor CA and the subject. For example, it is relevant only in Figure 8.7, and not in Figure 8.6.
For example, in both Figure 8.6 and Figure 8.7, the Trust Anchor CA (TACA) signs certificate CICA1, where it should specify that ICA1 is a trusted (intermediate) CA. Namely, it must set the cA flag in the basic-constraints extension of CICA1 to TRUE. However, in Figure 8.7, ICA1 further certifies ICA2, which certifies ICA3, and only ICA3 certifies the subject (www.bob.com). Therefore, for the relying party to ‘trust’ certificate CB for the subject, signed by ICA3, it is required that CICA1 also contains the path-length (pathLen) parameter in the basic constraints extension, and this parameter must be at least 2, allowing two more CAs until the certification of the subject. Similarly, the certificate issued by ICA1 to ICA2 must contain the basic constraints extension, indicating cA as TRUE, as well as a value of at least 1 for the pathLen parameter.
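The cA and pathLen checks described above can be sketched as follows, under a simplified reading: every intermediate-CA certificate in the path must have cA = TRUE, and a pathLenConstraint, when present, must be at least the number of CA certificates that follow it on the path. This is our own simplified model, not the full RFC 5280 algorithm.

```python
# Sketch of the basic-constraints checks on the intermediate-CA
# certificates of a path, trust-anchor side first. Simplified model.

from dataclasses import dataclass
from typing import Optional

@dataclass
class CACert:
    ca: bool                        # the cA flag
    path_len: Optional[int] = None  # pathLenConstraint, if present

def basic_constraints_ok(intermediates):
    """intermediates: list of intermediate-CA certificates."""
    for i, cert in enumerate(intermediates):
        if not cert.ca:
            return False  # not marked as a CA
        remaining = len(intermediates) - 1 - i  # CA certs after this one
        if cert.path_len is not None and cert.path_len < remaining:
            return False  # pathLenConstraint exceeded
    return True

# Figure 8.7: three intermediate CAs; CICA1 needs pathLen >= 2 (or none).
ok = basic_constraints_ok([CACert(True, 2), CACert(True, 1), CACert(True)])
```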
Unfortunately, currently, essentially all browsers do not enforce path-length constraints on the root CAs. Root CAs sometimes do enforce path-length constraints on intermediate CAs; however, these are usually rather long, e.g., 3, leaving wide room for an end-entity to receive, by mistake, a certificate allowing it to issue certificates. Of course, in most cases, end-entity certificates will not allow issuing certificates, typically since their basic-constraints extension will indicate that they are not a CA.
Browsers usually enforce the basic constraints extension, although failures may happen, especially since this kind of flaw, a lack of validation, is not likely to be detected by a normal user.
Exercise 8.5 (IE failure to validate basic constraint). Old versions of the IE
browser failed to validate the basic constraint field. Show a sequence diagram
for an attack exploiting this vulnerability, allowing a MitM attacker to collect
the user’s password to trusted sites which authenticate the user using user-id
and password, protected using SSL or TLS.
Exercise 8.6. Assume that TACA is concerned that subject-CAs may issue
certificates to end-entities (e.g., websites) and neglect to include a basic constraint extension, to prevent the end entity from issuing certificates. Explain
how TACA may achieve this, for the scenarios in Figure 8.6 and in Figure 8.7.
Identify any remaining potential for such failure by one of the intermediate CAs
in these figures.
8.3.3 The name constraint extension
The name constraint extension is used in certificates issued to a subject CA, such as the intermediate CAs in Figure 8.6 and Figure 8.7. The name constraint extension restricts the set of subject-names to be certified by the subject CA, as well as by any subsequent CA. For example, in Figure 8.7, a name constraint included in certificate CICA1, issued by TACA to ICA1, would restrict certificates issued by ICA1, ICA2 and ICA3.³
The name constraint extension has two possible parameters, which we denote by the names⁴ permit (to define permitted name spaces) and exclude (to forbid name spaces, typically within the permitted name space). Focusing on the PKIX profile, both parameters are identifiers for names, usually a domain name; we focus on this case. When a domain name is specified, it is taken to include sub-domains, e.g., if a name constraint contains the parameter permit (only) for domain name com, then this allows subdomains such as google.com, but not names in other top-level domains such as x.org. The exclude parameter takes precedence; i.e., if a certificate contains both permit for a domain name, say edu,
³ The name constraints in CICA1 would also restrict certificates issued by the subject (www.bob.com); we didn't list this above, since the subject's certificate, CB, should prevent the subject from issuing certificates, using the basic constraints extension.
⁴ The actual parameter names are permittedSubtrees and excludedSubtrees, which are a bit cumbersome.
[Figure 8.6 diagram: the Root CA / TACA (Trust Anchor CA) issues certificate CICA to ICA (Intermediate CA); ICA issues certificate CB to the subject (e.g., www.bob.com); the relying party (e.g., Alice's browser) receives the certificates CICA, CB.]

      CICA constraints extensions
      Basic              Name                             Policy
 #    cA    pathLen      Permit    Exclude                Req. Policy    CB valid?
 1    No    (any)        (any)     (any)                  (any)          No
 2    Yes   (any)        bob.com   none or x.bob.com      none or > 1    Yes
 3    Yes   (any)        cat.com   (any)                  (any)          No
 4    Yes   (any)        bob.com   www.bob.com            (any)          No
 5    Yes   (any)        (any)     (any)                  0              No
 6    Yes   (any)        (any)     bob.com                (any)          No
Figure 8.6: A single-hop (length one) certificate-path, consisting of trust-anchor CA TACA, an intermediate CA ICA, and a subject (e.g., website www.bob.com). The table shows the impact of six examples of certificate path constraints extensions in certificate CICA, on the validity of certificate CB issued by ICA. In these examples, CB is for domain name www.bob.com, has no certificate policies extension and has basic constraints indicating cA = No (not a CA); the value (any) in a field indicates that the example holds for any value in this field. Each row is one example of the constraints in CICA. In example (row) 1, CICA does not have the cA flag set (to true); namely, CICA does not indicate that ICA is a CA, and hence CB is invalid. In contrast, in example 2, certificate CB is valid, since the cA flag is true, the name constraints permit bob.com and do not exclude www.bob.com, and either there is no policy constraint or its value is more than 1. See discussion in subsection 8.3.1.
[Figure 8.7 diagram: TACA issues certificate CICA1 to ICA1; ICA1 issues CICA2 to ICA2; ICA2 issues CICA3 to ICA3; ICA3 issues certificate CB to the subject (e.g., www.bob.com); the relying party (e.g., Alice's browser) receives the certificates CICA1, CICA2, CICA3, CB.]

      CICA1 constraints extensions
      Basic                    Name                             Policy
 #    cA    pathLen            Permit    Exclude                Req. Policy    CB valid?
 1    Yes   < 2                (any)     (any)                  (any)          No
 2    Yes   none or ≥ 2        bob.com   none or x.bob.com      none or > 3    Yes
 3    Yes   (any)              (any)     (any)                  ≤ 3            No
 4    Yes   (any)              cat.com   (any)                  (any)          No
 5    Yes   (any)              (none)    bob.com                (any)          No
Figure 8.7: A length-3 certificate-path, consisting of trust-anchor CA TACA, three intermediate CAs (ICA1, ICA2, ICA3), and a subject (e.g., website www.bob.com). The table shows the impact of the five example values of the certificate path constraints extensions (see subsection 8.3.1), in particular, of the pathLen (path length) parameter of the basic constraints extension. For the examples in the table, assume that none of the certificates has the certificate policies extension, that the intermediate certificates CICA1, CICA2, CICA3 all have the cA flag set in ‘Basic constraints’, and that CICA2, CICA3 do not have any other constraints. For example, in row 1, CB is invalid, since the pathLen field in the basic-constraints extension of CICA1 is set to less than 2 (and the path from ICA1 to ICA3 is of length two). In contrast, in row 2, the pathLen constraint does not exist (or is satisfied), and the other constraints in CICA1 are also set to allow the certificate path to be valid (compare to the examples in Figure 8.6).
Figure 8.8: Example of the use of the name constraint extension, where constraints are over distinguished name keywords. NTT Japan issues a certificate to IBM Japan, with the name constraint Permit O=IBM, i.e., allowing it to certify only distinguished names with the value ‘IBM’ for the ‘O’ (organization) keyword, since NTT Japan does not trust IBM Japan to certify other organizations. IBM Japan certifies the global IBM, only for names in the IBM organization (Permit O=IBM), and excluding names in Japan (Exclude C=Japan). As a result, NTT trusts certificates issued by IBM to different parts of IBM, e.g., IBM US, but would not trust certificates issued by IBM to IBM Japan or to other companies, e.g., Symantec. Similarly, IBM certifies Symantec for all names, except names in the IBM organization.
and exclude for subdomain uconn.edu, then this allows subsequent certificates only for domains in the edu top-level domain, excluding domains in the subdomain uconn.edu. See examples in the tables in Figure 8.6 and Figure 8.7. Note that these examples focus on the typical case of DNS domain names; however, the restrictions may apply to other types of names, e.g., email addresses or X.509 distinguished names.
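The subtree rules above, with exclude taking precedence over permit, can be sketched for DNS names as follows (a simplified model; RFC 5280 defines the precise matching):

```python
# Sketch of the name-constraint check for DNS names: a constraint
# domain covers itself and its subdomains, and exclusions take
# precedence over permissions. Simplified model.

def in_subtree(name: str, constraint: str) -> bool:
    name, constraint = name.lower(), constraint.lower()
    return name == constraint or name.endswith("." + constraint)

def name_allowed(name, permit, exclude):
    if any(in_subtree(name, e) for e in exclude):
        return False  # exclude takes precedence
    if permit and not any(in_subtree(name, p) for p in permit):
        return False  # a non-empty permit list restricts to those subtrees
    return True
```

With permit=['edu'] and exclude=['uconn.edu'], as in the example above, engr.uconn.edu is rejected while cs.yale.edu is accepted.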
Figure 8.8 presents an example of a typical application of the name constraint extension, using X.509 distinguished names. In this example, the NTT Japan CA issues a certificate to IBM Japan, allowing the IBM Japan CA to certify any certificates with the value ‘IBM’ for the organization (O) keyword, implying that IBM Japan cannot certify other organizations. Also, see IBM Japan certifying the ‘main’, corporate IBM CA, but excluding sites where the value of the country (C) keyword is Japan, i.e., not allowing the corporate IBM CA to certify sites in Japan, even IBM sites. Notice that such a certificate issued by corporate IBM would also be trusted by relying parties using only NTT Japan as a trust anchor, provided that other relevant constraints, such as certificate path length, are satisfied (or not specified).
Figure 8.9 presents a similar example, but using DNS domain names instead
of X.509 distinguished names.
Figure 8.9: Example of the use of the name constraint extension with DNS
names (dNSName).
Unfortunately, currently, essentially all browsers do not enforce any Name
constraints on the root CAs, and root CAs rarely enforce Name constraints on
intermediate CAs. Therefore, although we believe most browsers do support
name constraints, these are rarely actually deployed in practice.
8.3.4 The policy constraints extension
In addition to the basic constraints and name constraint extensions, X.509 and PKIX also define a third standard extension that defines additional constraints on subsequent certificates. This is the policy constraints extension, which is related to the certificate policies and certificate policy mappings extensions; see subsection 8.2.8.
The policy constraints extension allows the CA to define two requirements which must hold for subsequent certificates in a certificate path to be considered valid:
requireExplicitPolicy: if specified as a number n, and the path length is longer than n, then all certificates in the path must have a policy required by the user.
inhibitPolicyMapping: if specified as a number n, and the certificate path is longer than n, say C1, . . . , Cn, Cn+1, . . ., then Cn+1 and any subsequent certificate should not have a policy mapping extension.
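Under the simplified reading above, these two checks can be sketched as follows; certificates are reduced to two flags, and the n values count certificates from the start of the path (our own simplification of the standard's semantics):

```python
# Sketch of the two policy-constraint checks, as simplified above.
# A certificate is modeled by two flags only.

from dataclasses import dataclass

@dataclass
class PCert:
    has_policy: bool = False   # has a policy acceptable to the relying party
    has_mapping: bool = False  # has a policy-mappings extension

def policy_constraints_ok(path, require_explicit=None, inhibit_mapping=None):
    # requireExplicitPolicy = n: if the path is longer than n,
    # every certificate must carry a required policy.
    if require_explicit is not None and len(path) > require_explicit:
        if not all(c.has_policy for c in path):
            return False
    # inhibitPolicyMapping = n: certificates after position n
    # must not carry a policy-mappings extension.
    if inhibit_mapping is not None:
        if any(c.has_mapping for c in path[inhibit_mapping:]):
            return False
    return True
```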
8.4 Certificate Revocation
In several scenarios, it becomes necessary to revoke an issued certificate prior to its planned expiration date. Different reasons for revoking a certificate are
listed in the PKIX [104] and X.509 [212] standards; we roughly categorize⁵ them as follows:
Revocation due to security concerns: these revocations are due to potential compromise of the certified key, discovery that the certificate was requested without authorization, that the certificate was misused or could be misleading, or that the subject violated its obligations or Terms of Use.
Revocation due to change: these revocations are due to a change which invalidates the information in the certificate. This can be a change to the certified name or to other certified attributes, e.g., removal of some (certified) privilege. Other reasons for change are when the certified entity ceases to operate, or just stops using the certified public-private key pair, for a benign reason (such as a change of business or a change to a more secure key or algorithm).
Other revocations: some revocations may be for other reasons, such as a request by the subject, a legal obligation of the CA, or some mistake or failure of the CA. For example, the Let's Encrypt CA had to revoke multiple certificates when it detected a bug in its CAA (Certificate Authority Authorization) validation code, affecting over 3 million certificates [1], most of which were revoked. (Read about CAA in subsection 8.5.2 and [185].)
Revocation mechanisms: CRLs, OCSP and others. Revocation mechanisms are methods to inform the relying parties that a certificate was revoked. This turns out to be a significant challenge, definitely much larger than originally anticipated.
The early X.509 design [92] only offered one method for revocation, the Certificate Revocation List (CRL), which is, essentially, a list of the revoked certificates, typically signed by the issuing CA; see subsection 8.4.1. CRLs are still widely implemented and supported by CAs; however, most relying parties prefer to use other mechanisms to check for revocations, since CRLs often have excessive overhead, mainly in terms of bandwidth. A daily download of, say, 100 MB to 1 GB for each relying party is problematic for the CAs, as well as for many relying parties.
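A back-of-envelope check of this estimate, using the example parameter values of Table 8.4 (the values are illustrative, as that table notes):

```python
# Rough daily-overhead estimates per relying party, using Table 8.4's
# example values: full-CRL download versus per-query OCSP responses.

L_CRL = 100     # bytes per CRL entry (LCRL in Table 8.4)
r_all = 10**6   # total number of revocations (rAll)
L_O   = 1000    # bytes per OCSP response (LO)
q_RP  = 50      # validation queries by a relying party per day (qRP)

crl_daily  = r_all * L_CRL  # downloading the full CRL once a day
ocsp_daily = q_RP * L_O     # fetching one OCSP response per query

print(crl_daily, ocsp_daily)  # 100000000 50000
```

This yields 100,000,000 bytes (100 MB) per day for the full CRL, versus only 50,000 bytes (50 KB) for OCSP, matching the 100 MB range quoted above and the ‘Very high’ versus ‘Medium/Low’ bandwidth entries of Table 8.3.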
Another revocation mechanism which was standardized is the Online Certificate Status Protocol (OCSP); see subsection 8.4.2 (and the ‘Stapled-OCSP’ mechanism in subsection 8.4.3). However, OCSP also has its own set of drawbacks, including delay, loss of privacy and of availability, overhead and vulnerability to Denial-of-Service (DoS) attacks, as we discuss.
Indeed, there is still no consensus on the ‘best’ revocation mechanism. In subsection 8.4.5 we discuss several other, non-standardized revocation mechanisms. These include the (deployed, proprietary) OneCRL and CRLset mechanisms,
⁵ While X.509 specifies an optional field to specify the reason for revocation, most certificates do not include such an indication; hence, unfortunately, we do not know the distribution of reasons.
Method           Freshness    Delay           Compute                    Bandwidth               Storage            Concerns
CRL              Hours/days   None (local)    One signature,             Very high:              High:              Bandwidth
(periodic)                                    one verification           rAll · LCRL             rAll · LCRL
∆-CRL            Hours/days   None (local)    One signature,             Usually: > rD · LCRL,   High:              Complexity,
                                              one verification           sometimes: rAll · LCRL  rAll · LCRL        adoption, storage
OCSP             Seconds      Seconds         qCA signatures,            Medium/Low:             None               Delay, overhead,
(ignore cache)   (or less)                    qRP verifications          qCA · LO                                   availability, DoS,
                                                                                                                    privacy exposure
Stapled OCSP     TS minutes   None (in TLS)   c · (24·60/TS) signatures, Medium/Low:             None               Delay, overhead,
                                              qRP verifications          qCA · LO                                   availability, DoS,
                                                                                                                    privacy exposure
OneCRL, CRLset   Hours/days   None (local)    One signature,             Medium/Low              Medium/Low         Partial coverage,
                                              one verification                                                      proprietary,
                                                                                                                    another TTP
CRV              Hours/days   None (local)    One signature,             Low: rAll · log c       Low: rAll · log c  Proposed, not
                                              one verification                                                      yet deployed
∆-CRV            Hours/days   None (local)    One signature,             Very low: rD · log c    Low: rAll · log c  Proposed, not
                                              one verification                                                      yet deployed

Table 8.3: Comparison of revocation-checking mechanisms. Bandwidth and computations are for a 24-hour period. Computation focuses on the public-key signature operations: verifying (done by the relying party) and signing. Signing is done by the CA, except for OneCRL/CRLset, where signing is done by the vendor. The parameters (c, LCRL, . . .) are described in Table 8.4. Values are rough approximations (simplifications).
Parameter | Notation | Typical value
Length of CRL entry | LCRL | 100 bytes
Length of OCSP response | LO | 1000 bytes
Number of certificates | c | 10^8
Total number of revocations | rAll | 10^6
Revocations in 24 hours | rD | 1000
Validation-queries by a relying-party in 24 hours | qRP | 50
Validation-queries to a CA in 24 hours | qCA | 10^7
Time between OCSP requests by website (stapled OCSP) | TS | 10 minutes

Table 8.4: Revocation-related parameters. Actual values differ significantly between different PKIs; the values given are just examples.
the CRV and ∆-CRV mechanisms proposed in [364]. In subsection 8.4.4, we
also discuss some non-standardized OCSP optimizations.
Table 8.3 compares the CRL, OCSP, OneCRL/CRLset and CRV/∆-CRV revocation mechanisms.
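As a rough sanity check, the order-of-magnitude entries of Tables 8.3 and 8.4 can be recomputed directly; the sketch below simply evaluates the tables' simplified approximation formulas with the example parameter values (illustrative, not measured):

```python
# Example parameters from Table 8.4 (illustrative values only).
L_CRL = 100      # bytes per CRL entry
L_O = 1000       # bytes per OCSP response
c = 10**8        # number of (non-expired) certificates
r_all = 10**6    # total number of revocations
q_RP = 50        # validation queries by one relying party per 24 hours
q_CA = 10**7     # validation queries reaching the CA per 24 hours
T_S = 10         # minutes between stapled-OCSP refreshes

# Approximation formulas from Table 8.3 (per 24 hours):
crl_download = r_all * L_CRL            # full-CRL bandwidth per relying party
ocsp_traffic = q_RP * L_O               # OCSP bandwidth per relying party
ocsp_signing = q_CA                     # one CA signature per (uncached) response
stapled_signing = c * (24 * 60) // T_S  # one signature per cert per T_S minutes

print(crl_download)     # 100000000  (about 100 MB/day, as in the text)
print(ocsp_traffic)     # 50000      (about 50 KB/day)
print(stapled_signing)  # 14400000000
```

The huge gap between the 100 MB/day CRL download and the 50 KB/day OCSP traffic is exactly the bandwidth argument made in the text; the price is the per-response signing load on the CA.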
Validation and freshness of revocation information. Revocation information, sent as CRL, as OCSP response or otherwise, should be validated for
authenticity and freshness. All of the revocation mechanisms we discuss are
based on provision of signed and time-stamped revocation information, which
allows any party to validate it at any time (not just immediately). Furthermore,
this allows the relying party to prove later that it received particular revocation information, which can help justify its resulting actions, such as performing a signed order which the relying party validated using the certified public key (not revoked based on information available to the relying party at that time), or refusing to perform some order (since the key was revoked). The correct action may also depend on the time (and date) of revocation, which is, therefore, often part of the information for a revoked certificate.
What is fresh revocation information? Note that revocation information may change over time, even while in transit to the relying party. Therefore, it is possible that the relying party receives, at time t, information indicating a certificate is still valid, while the certificate was revoked at some time t′ ≤ t. This should happen only if t′ is ‘sufficiently close’ to t, i.e., t − t′ ≤ ∆, where ∆ is the allowed period to use a certificate after the timestamp in the revocation information. The value of ∆ may be defined by an extension in the certificate or within the revocation information; for example, this is the case for CRL and OCSP. Alternatively, the allowed ∆ may be the same for all certificates (defined by a specific CA, or by any CA), as is done by several (non-standard) revocation mechanisms such as OneCRL, CRLsets and CRVs.
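The rule t − t′ ≤ ∆ amounts to a one-line check; a minimal sketch (the function and parameter names are ours, not from any standard):

```python
from datetime import datetime, timedelta

def is_fresh(timestamp: datetime, now: datetime, delta: timedelta) -> bool:
    """Accept revocation information whose signing time `timestamp` is at
    most `delta` old at the time of use `now` (the rule t - t' <= Delta)."""
    return now - timestamp <= delta

now = datetime(2024, 3, 24, 12, 0)
print(is_fresh(now - timedelta(hours=3), now, timedelta(days=1)))  # True
print(is_fresh(now - timedelta(days=2), now, timedelta(days=1)))   # False
```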
Distribution of revocation information. Often, the relying party receives the revocation information directly from the CA. However, since the revocation information is signed and time-stamped, it could also be relayed by third parties. This fact is utilized by OCSP stapling, where the revocation information is provided by the subject, ‘stapled’ to the certificate. OCSP stapling is specified for the common case where the certificate is provided by a TLS server, and in this case, it is provided in the Server Hello message. We discuss OCSP stapling in subsection 8.4.3.
Retrieving revocation information: periodically or as-needed (online)? There are two main options for retrieving revocation information: periodically, e.g., daily, or as-needed, i.e., when the relying party needs the revocation information to validate a specific certificate. Online (as-needed) retrieval may reduce the bandwidth overhead, since it avoids downloading unnecessary revocation information; on the other hand, the relying party must wait for the revocation information to arrive, introducing delay (waiting for the response) and the risk of communication failures. Online retrieval may also allow an attacker to perform a Denial-of-Service (DoS) attack against the CA or another OCSP server, by a ‘flood’ of revocation-queries; and there are also privacy concerns, due to the exposure of the identity of the certificates used by the relying party. To reduce the overhead, some relying parties use a cached OCSP response if available, and perform online retrieval only when they do not have a valid OCSP response in cache.
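The cache-then-fetch behavior just described can be sketched as follows; `fetch_ocsp` is a hypothetical placeholder for a real OCSP request/response exchange, and the cache layout is ours, for illustration:

```python
from datetime import datetime

cache = {}  # certificate serial number -> (status, next_update)

def cert_status(serial, now, fetch_ocsp):
    """Return the certificate status, using a cached OCSP response while it
    is still valid; otherwise retrieve the status online (as-needed) and
    cache the fresh response."""
    entry = cache.get(serial)
    if entry is not None and now < entry[1]:  # cached and not yet expired
        return entry[0]
    status, next_update = fetch_ocsp(serial)  # online (as-needed) retrieval
    cache[serial] = (status, next_update)
    return status

calls = []
def fake_fetch(serial):           # stands in for a real OCSP exchange
    calls.append(serial)
    return ("good", datetime(2024, 1, 1, 12, 0))

now = datetime(2024, 1, 1, 9, 0)
print(cert_status(42, now, fake_fetch), len(calls))  # good 1
print(cert_status(42, now, fake_fetch), len(calls))  # good 1 (cache hit)
```

The second call is answered from the cache, avoiding the online query, its delay, and the corresponding privacy exposure.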
The impact of periodic vs. as-needed retrieval can be clearly
seen in Table 8.3. OCSP, as its name implies, retrieves revocation information
as-needed (online), unless it is cached (in OCSP implementations that cache responses). Most other revocation mechanisms, including CRLs, usually retrieve revocation information periodically, although some implementations, mainly of CRLs, retrieve them only as-needed, to save bandwidth by not downloading unnecessary CRLs. Table 8.3 does not include the less-typical options of OCSP which caches responses and of retrieving CRLs only as-needed.

Figure 8.10: X.509 Certificate Revocation List (CRL): CRL fields (a), and CRL Entry fields (b). (a) The to-be-signed (tbs) part of the CRL contains: CRL version (optional), signature algorithm (OID), issuer (Distinguished Name), thisUpdate (time), nextUpdate (time; optional), revokedCertificates (optional; CRL Entry 1, ..., CRL Entry n) and crlExtensions (optional); the CRL ends with the issuer's signature over the tbs fields. (b) Each CRL Entry contains: certificate serial number, revocation date (time) and optional CRL Entry extensions. Fields with white background have corresponding fields in the X.509v3 certificate.
8.4.1 Certificate Revocation List (CRL)
The X.509 designers probably expected revocation to be a rare incident, with a small number of certificates which were revoked (but not yet expired) at any given time. In this case, a simple solution is for the CA to periodically sign and distribute a time-stamped list of all revoked certificates, which is called a Certificate Revocation List (CRL). The CA may also authorize another entity to issue CRLs; the term CRL issuer refers to the entity issuing the CRLs, i.e., the CA or an entity authorized by the CA. However, for simplicity, we mostly refer to the typical case where the CA is also the CRL issuer.
CRLs are defined as part of the X.509 standard, already from its early versions [92]; their contents were enhanced in later versions. Figure 8.10 shows the contents of the widely used CRL defined in [211]. As can be seen, this CRL shares quite a lot with the X.509v3 certificate (Figure 8.5); in particular, it has similar version6, OID, issuer Distinguished Name (DN), subject DN and extensions fields. The fields which are unique to CRLs are:
6 Confusingly, the CRL specification supports extensions from CRL version 2, not 3 (as might be expected given the X.509v3 certificate). Furthermore, in CRLs, the version field is optional; when the version field is absent, this indicates version 1 (which does not support extensions). In practice (and in this book), CRLs are always version 2, i.e., contain both version and extension fields.
thisUpdate: the time at which the CRL was issued (signed).

nextUpdate: if specified, the nextUpdate bounds the time when an updated CRL will be issued. Usually, relying parties request a new CRL prior to the nextUpdate time, i.e., it serves, essentially, as the ‘expiration date’ for the CRL.

revokedCertificates: this field lists one or more CRL Entries. The contents of CRL Entries are shown in Figure 8.10 (b). Each CRL Entry contains (1) the serial number of the revoked certificate, (2) the revocation date (and time), and (3) optional extensions, much like the X.509v3 certificate (and the CRL) extensions.

crlExtensions: an optional field that may contain extensions to the CRL, much like the X.509 certificate extensions.
A relying party should use a valid, non-expired CRL to check if a certificate issued by the CA was revoked. Almost always, the CRL is cached until it is replaced by a more-recently-issued CRL (from the same issuer and set of certificates). The relying party typically requests the CRL either periodically, to ensure a fresh CRL is available when needed, or only when needed to validate a certificate.
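The relying party's CRL check can be sketched as follows; we model a parsed CRL with only the fields discussed above (field names follow Figure 8.10, but the class itself is ours, for illustration):

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class CRL:
    this_update: datetime   # time the CRL was issued (signed)
    next_update: datetime   # effectively the CRL's expiration time
    revoked: dict           # certificate serial number -> revocation date

def check_with_crl(crl: CRL, serial: int, now: datetime):
    """Return the revocation date if `serial` appears in the CRL, or None
    if it does not; refuse to use an expired (stale) CRL."""
    if now >= crl.next_update:
        raise ValueError("CRL expired; a newer CRL must be downloaded first")
    return crl.revoked.get(serial)

crl = CRL(this_update=datetime(2024, 1, 1),
          next_update=datetime(2024, 1, 2),
          revoked={17: datetime(2024, 1, 1, 8, 0)})
now = datetime(2024, 1, 1, 12, 0)
print(check_with_crl(crl, 17, now))  # 2024-01-01 08:00:00 (revoked)
print(check_with_crl(crl, 99, now))  # None (not revoked)
```

Returning the revocation date, not just a boolean, matches the earlier observation that the correct action may depend on when the certificate was revoked.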
The CRL Distribution Points (‘cRLDistributionPoints’) certificate extension. To retrieve the CRL, the relying party usually uses the cRLDistributionPoints certificate extension, defined in [211]; this extension defines how (using what protocol) to retrieve the CRL, and what address to use. Specifically, this information is provided by one or more DistributionPoints entries, each of which defines a URI (Uniform Resource Identifier), specifying the protocol and location for downloading the CRL.
Bandwidth overhead is a major concern with CRLs, since CRLs can often be quite large. The size of a single CRL entry, which we denote LCRL, can differ significantly among providers, but most are close to 100 bytes; and the total number of revocations, which we denote rAll, could be a million or even more. This results in a total CRL length of LCRL · rAll, around 100 million bytes per CA: considerable overhead for both CAs and relying parties. Measurements of CRL overhead were reported in [399], who found a median CRL of 51KB and a maximal CRL of 76MB (yes, megabytes!); and in [364], who found an average CRL of 173KB. The reason is that the number of revocations may be surprisingly high; specifically, [399] found that about 8% of the non-expired certificates were revoked, mostly due to the Heartbleed bug [91], and Let’s Encrypt discovered a bug in their issuing process which required revocation of three million certificates [305]. However, even looking at measurements for periods without such events, we see that about 1% of the non-expired certificates are revoked, which can still result in excessively long CRLs.
Three standard X.509 CRL extensions are designed to reduce the bandwidth
overhead of CRLs:
The Issuing Distribution Point (IDP) CRL extension and CRL scopes. A CRL, available from some DistributionPoint, should contain all revoked certificates which were issued by the CA and not yet expired, belonging to the scope of the CRL. By default, all certificates issued by the CA have the same scope, i.e., are available from the same DistributionPoint. However, often, CAs prefer to use multiple smaller CRLs, by splitting the set of certificates into separate scopes, e.g., based on issuing time. X.509 [211] specifies that in this case, the CRL must contain the standard, critical CRL extension called Issuing Distribution Point (IDP), which defines the relevant scope. The main motivation for a CA to use multiple CRLs, each with a distinct scope (defined using the IDP CRL extension) and DistributionPoint, is to reduce the bandwidth overhead. The use of multiple DistributionPoints (and scopes) reduces the length of each CRL, at the cost of requiring the CA to sign and distribute multiple CRLs. When relying parties download CRLs only as-needed, this may reduce the required bandwidth, but at the cost of reduced likelihood that the required CRL is cached, i.e., more cases where the relying party must download the CRL to validate a certificate, which can cause significant delay and availability concerns. OCSP can be seen as an extreme case, with each certificate requiring a separate request/response.
The Authority Revocation List (ARL) extension lists only revocations of CA certificates. This is essentially equivalent to placing CA certificates in a dedicated distribution point, and becomes meaningful mainly when relying parties download only the ARL, and use other mechanisms, such as OCSP, for non-CA certificates.
The Delta CRL extension lists only new revocations, which occurred since the last base-CRL, which is retrieved as needed. To validate that a given certificate is not revoked, check that it is contained neither in the Delta-CRL nor in a base-CRL issued not earlier than the time specified in the Delta-CRL. For this method to be effective, relying parties should cache, and periodically download, the base-CRL; hence, the storage requirements are the same as when using ‘regular’ CRLs, and sometimes so are the bandwidth requirements. Also, implementation is more complex, especially if the relying party may need to ‘prove’ to a third party, in the future, that she relied on a certificate that was not revoked at the time. Possibly due to such concerns, Delta-CRLs are not widely deployed.
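The Delta-CRL validation rule above can be sketched as follows; this is a simplified illustration (real Delta-CRLs reference their base CRL via a CRL-number extension, which we reduce here to a simple time comparison, and all names are ours):

```python
from datetime import datetime

def revoked_per_delta(serial, base_serials, base_issued,
                      delta_serials, required_base_time):
    """Delta-CRL check: the cached base-CRL must have been issued no earlier
    than the base-CRL time referenced by the Delta-CRL; the certificate is
    treated as revoked if its serial appears in either list."""
    if base_issued < required_base_time:
        raise ValueError("base CRL is too old for this Delta-CRL")
    return serial in delta_serials or serial in base_serials

base = {10, 11}                 # serials revoked in the cached base-CRL
delta = {12}                    # new revocations since the base-CRL
issued = datetime(2024, 1, 2)   # when the cached base-CRL was issued
required = datetime(2024, 1, 1) # base-CRL time referenced by the Delta-CRL
print(revoked_per_delta(12, base, issued, delta, required))  # True
print(revoked_per_delta(13, base, issued, delta, required))  # False
```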
Even with such optimizations, CRLs may still introduce significant bandwidth overhead.
When to download CRLs: periodically (in advance) or as-needed? The original X.509 CRL design was to download all CRLs in advance, in a periodical process, e.g., daily [92]. This periodical process should be done with sufficient frequency, to make sure that the revocation information is reasonably
updated (fresh). Since CRLs can be quite long, and many CRLs are not required on any given day, this results in considerable overhead.

Figure 8.11: The Online Certificate Status Protocol (OCSP). The OCSP client (usually the relying party or the subject, e.g., a web server) sends to the OCSP responder (the CA or a trusted OCSP server) an OCSP request, containing: version, one or more certificate identifiers {CertID1, . . .}, an optional signature and optional extensions. The responder returns an OCSP response, containing: ResponseStatus, producedAt, responses and signature. The OCSP response is signed by the responder, and includes a response for each CertID in the request. Each of these ‘individual responses’ includes the CertID, cert-status, time of this update, time of the next update, and optional extensions. Cert-status is either revoked, good or unknown.
Therefore, implementations often fetch the CRLs only as-needed (online). However, this may cause increased delay (waiting for information to arrive), reduced reliability (what to do if revocation information is unavailable?), and privacy concerns (e.g., exposing the website being visited). As a result, the use of CRLs has become less and less common; e.g., it is not currently done by major browsers.
A standardized alternative to CRLs is the Online Certificate Status Protocol (OCSP) standard, which we discuss in subsection 8.4.2 and subsection 8.4.3. We later also discuss other, non-standardized alternatives, in subsection 8.4.4 and subsection 8.4.5.
8.4.2 Online Certificate Status Protocol (OCSP)
OCSP (Online Certificate Status Protocol) [346], shown in Figure 8.11, is a request-response protocol, providing a secure, signed indication to the relying party, showing the ‘current’ status of certificates (details below). The protocol involves two entities: the OCSP client, who sends an OCSP request to request the status of one or more certificates, and the OCSP responder (server), who responds with a (signed) OCSP response, indicating the status of the certificate(s).
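The message contents of Figure 8.11 can be summarized as simple records; this is only a schematic sketch (real OCSP messages are DER-encoded ASN.1 structures, per the OCSP specification), with field names loosely following the figure:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CertID:                 # identifies one certificate in the request
    issuer_name_hash: bytes   # hash of the issuer's name
    issuer_key_hash: bytes    # hash of the issuer's public key
    serial: int               # certificate serial number

@dataclass
class OCSPRequest:
    version: int
    cert_ids: list                      # one or more CertIDs per request
    signature: Optional[bytes] = None   # requests are only optionally signed

@dataclass
class SingleResponse:         # one 'individual response' per CertID
    cert_id: CertID
    cert_status: str          # 'good', 'revoked' or 'unknown'
    this_update: str
    next_update: str

@dataclass
class OCSPResponse:
    response_status: str
    produced_at: str
    responses: list           # a SingleResponse for each requested CertID
    signature: bytes          # one signature covers all the responses

cid = CertID(b"\x01", b"\x02", serial=1234)
req = OCSPRequest(version=1, cert_ids=[cid])
print(len(req.cert_ids))  # 1
```

Note that a single signature on the OCSPResponse covers all the individual responses, which is what makes multi-certificate requests efficient.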
The OCSP client, i.e., the entity that sends the OCSP request, is either the relying party or another party. In this subsection, we focus on the ‘classical’ OCSP deployment, where the relying party, e.g., a browser, sends the OCSP request to the CA (or other OCSP responder), as in Figure 8.12; in this case, the relying party (often a browser) acts as the OCSP client. Later, in subsection 8.4.3, we discuss the stapled-OCSP deployment, where it is the subject, e.g., a website,
who sends the OCSP request, i.e., the subject (often a website) acts as the OCSP client.

Figure 8.12: OCSP used by the relying party (as OCSP client). The TLS client (browser) sends a TLS Client Hello to the TLS (web) server, and receives a TLS Server Hello (which includes the certificate); the client then sends an OCSP request to the OCSP responder (often the CA) and waits for the OCSP response. If the response indicates revoked or invalid, the client aborts; on timeout, it aborts (hard-fail) or proceeds (soft-fail); if valid, it proceeds with the TLS key exchange and finish messages. There are several concerns with this form of using OCSP, including privacy exposure, overhead on the CA, and handling of a delayed/missing OCSP response by the client/browser. This last concern, illustrated in Figure 8.13, motivated updated browsers to support and prefer OCSP-stapling (see Figure 8.14), where the TLS/web server makes the OCSP request, instead of the client/browser, and ‘staples’ the OCSP response to the TLS Server Hello message.
The OCSP responder, i.e., the entity that processes OCSP requests and sends responses, is an entity trusted by the relying party; we will assume this is the CA itself, although it could also be another entity, delegated by the CA. Each OCSP response message is signed by the OCSP responder or the CA, allowing the relying party to validate it, even if received via an untrusted intermediary, e.g., the subject (website).
Improving efficiency with multi-cert OCSP requests. To improve efficiency, a single OCSP request may specify (request status for) multiple certificates (CertIDs)7. Correspondingly, a single OCSP response, using a single signature, may include (signed) responses for multiple certificates. The support for OCSP requests and responses for multiple certificates is especially important when certificates are signed by intermediate CAs, using a Certificate-Path; see Note 8.2.
7 Certificate identifiers (CertIDs) may be specified using the hash of the issuer name and
key, and a certificate serial number.
Note 8.2: Using OCSP to validate a Certificate-Path (CP)
An indirectly trusted certificate, certified via a certificate chain consisting of certificates issued by (one or more) intermediate CAs, may be invalidated via revocation of any of these certificates. A relying party wishing to validate the status of the indirectly-trusted certificate needs to check for revocation of the intermediate-CA certificates in the chain, not only of the indirectly-trusted certificate itself.
Since intermediate CAs are critical elements of the PKI, and their number is much smaller than end-entities, relying parties may use other mechanisms to check for revocation of intermediate-CA certificates. Specifically, intermediate-CA certificates are often validated using (special) CRLs or proprietary mechanisms such as OneCRL or CRLset. However, sometimes, their validity should be checked using OCSP.
The fact that an OCSP request may include multiple certificates allows this process to be more efficient; a single OCSP request-response interaction may suffice to obtain updated status for all of these certificates, provided that the same OCSP responder is able to provide (signed) OCSP responses for all of these certificates (issued by different CAs).
The original OCSP stapling specification, RFC 6066 [139], does not support stapling
of multiple certificates. This is addressed in TLS 1.3 [329], which allows the RFC
6066 information to be attached to every certificate in the chain sent by the server.
Alternatively, implementations of older versions of TLS can use the (later-defined)
‘multiple certificate status’ extension, RFC 6961 [317].
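The chain-wide check of Note 8.2 can be sketched as follows; certificates are identified here by serial number only, and the names are ours, for illustration:

```python
def path_status_ok(chain_serials, statuses):
    """A Certificate-Path is acceptable only if every certificate in the
    chain (end-entity and intermediate-CA certificates alike) has status
    'good'. `statuses` maps serial -> 'good'/'revoked'/'unknown', as would
    be collected from a single multi-certificate OCSP response."""
    return all(statuses.get(serial) == "good" for serial in chain_serials)

# One multi-cert request can cover the end-entity and the intermediate CA:
chain = [1001, 7]                     # serial numbers along the chain
statuses = {1001: "good", 7: "good"}  # statuses from one signed response
print(path_status_ok(chain, statuses))                      # True
print(path_status_ok(chain, {1001: "good", 7: "revoked"}))  # False
```

Note that a revoked (or unknown) intermediate-CA certificate invalidates the whole path, even when the end-entity certificate itself is 'good'.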
OCSP vs. CRLs. The length of an OCSP response is linear in the number of CertIDs in the corresponding OCSP request, rather than a function of the total number of revoked certificates of this CA, as is the case for CRLs. Furthermore, the computation required for sending an OCSP response is just one signature operation, plus some hash function applications, regardless of the number of revoked certificates or the number of certificates whose status is requested in this OCSP request. In the common case where the total number of revoked certificates is large, this significantly reduces the overhead, compared to generating and distributing the often-large CRLs. Namely, OCSP provides an alternative which is often more efficient than CRLs; with CRLs, the CA must ‘push’ the list of all revocations to all relying parties, while with OCSP, a relying party receives information only about relevant certificates. In addition, OCSP responses are sent in a timely fashion, when the relying party is validating the relevant certificate, which may provide a ‘fresher’ indication compared to the periodic CRL. As a result of these advantages, OCSP appears to be deployed more than CRLs. However, OCSP has its own set of challenges, which limit its deployment as well. Let us discuss these challenges.
OCSP Challenges: ambiguity, failures and delay. OCSP status responses for each certificate may specify one of three values: revoked, good or unknown. The ‘unknown’ response is typically sent when the OCSP responder does not serve OCSP requests for the issuer of the certificate in question, or cannot resolve its status at the time (e.g., due to lack of response from the CA). These unknown responses are ambiguous; relying parties are left to
decide how to interpret and respond to them. These ambiguous responses are quite problematic, as we explain below. But first let us discuss another OCSP scenario that also leads to similar ambiguity: failed requests.
An OCSP request may fail in multiple ways. One way is when the OCSP client fails to establish communication with, or receive a response from, the OCSP server. Another is when the OCSP responder sends back an OCSP failure return code, indicating a reason for failure. These reasons include:
• Lack of signature on OCSP request (when required by OCSP responder)
• Request not properly authorized/authenticated, e.g., not from known IP
address, or missing/incorrect authentication information, when required
by OCSP responder. Authentication information should be provided by
the client in an appropriate OCSP extension.
• Technical reasons, such as overload or internal error.
Recall now that in the ‘classical’ OCSP deployment, the OCSP client is the relying party, typically the browser, as in Figure 8.12. However, this creates a dilemma for the browser (or other relying party): how should the relying party respond to OCSP failures and ambiguous responses, e.g., when a response does not arrive (within reasonable time) or indicates an OCSP failure? The following are the main options, and why each of them seems unsatisfactory:
Wait: if the problem is a timeout, then the relying party may simply continue waiting for the OCSP response, possibly resending the request periodically, and never ‘giving up’. However, OCSP servers could fail or become inaccessible forever, or for extremely long periods, leaving the relying party in this state. We do not believe any relying party has taken or will take this approach; also, it does not address the other types of OCSP ambiguities. In fact, even when the relying party times out if the OCSP response is not received within reasonable time, the delay of waiting for the OCSP response is often a concern.
Hard-fail: abort the connection (and inform the user). This is clearly the ‘safe’ alternative, i.e., it prevents the use of a revoked certificate. However, the OCSP interaction may often fail or return an ambiguous response due to benign reasons, such as network connectivity issues or overload of the OCSP responder. In particular, usually, the OCSP responder is the CA, and CAs often do not have sufficient resources to handle a high load of OCSP requests. Therefore, this approach is not widely adopted.
Ask user: the relying party may, after some timeout, invoke a user-interface dialog and ask the user to decide whether to continue with the connection or abort it. For example, a browser may invoke a dialog, informing the user that the certificate-validation process is taking longer than usual, and ask the user what action to take. While this option may seem to empower the user, in reality, users are rarely able to understand the
situation and make an informed decision, and are very likely to continue
with the connection; see discussion of usability in Chapter 9. Hence,
except for ‘shifting the responsibility’ to the user, this option is inferior
to direct soft-fail, discussed next.
Soft-fail: finally, the relying party may simply continue as if it received a valid OCSP response. By far, this is the most widely-adopted option. In the typical case of a benign failure to receive the OCSP response, there is no harm in picking this option. However, this choice leaves the user vulnerable to an impersonation attack using a revoked certificate, when the attacker can block the OCSP response; see Figure 8.13. Since our need for cryptography is mainly due to concerns about a Man-in-the-Middle attacker, who can surely block communication, this option results in vulnerability.
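The trade-off among these options can be expressed as a small policy function; a hedged sketch of the decision logic discussed above, not any browser's actual implementation:

```python
def handle_ocsp_outcome(outcome, policy):
    """Map an OCSP outcome to a connection decision under a given policy.
    `outcome` is 'good', 'revoked', or 'failure' (timeout, error return
    code or an ambiguous 'unknown' response); `policy` is 'hard-fail' or
    'soft-fail'. The names are ours, for illustration."""
    if outcome == "revoked":
        return "abort"
    if outcome == "good":
        return "proceed"
    # Failure or ambiguous response: the policy decides.
    return "abort" if policy == "hard-fail" else "proceed"

# Soft-fail (the common choice) proceeds even when an attacker blocks
# the OCSP response; hard-fail aborts, at the cost of availability:
print(handle_ocsp_outcome("failure", "soft-fail"))  # proceed
print(handle_ocsp_outcome("failure", "hard-fail"))  # abort
```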
As Figure 8.13 shows, the soft-fail approach essentially nullifies the value of OCSP validation against an attacker that can block or sufficiently delay the OCSP request/response, if the attacker has exposed the private key of the TLS (web) server, or has obtained a fake certificate for the server’s domain (that was later revoked). Both exposing the private key and obtaining a fake certificate are challenging attacks. However, such attacks do occur, which is one reason we need revocations; see examples of such attacks in Table 8.5. The other condition, of being able to block the OCSP response, is often surprisingly easy for an attacker, e.g., by sending an excessive number of OCSP requests to the OCSP responder (e.g., the CA) at the same time as the OCSP request from the relying party. In particular, an attacker is likely to be able to launch such an attack by intentionally invoking appropriate links from a website controlled by the attacker, in a so-called web-puppet attack; see the web-security chapter of [192]. In spite of this, soft-fail is the common choice of browsers and most other relying parties, basically since developers give more weight to user-experience (UX) considerations than to security considerations; see the UX>security precedence rule (Note 8.3). Unfortunately, as we explained, this allows attackers to circumvent OCSP and use revoked certificates, by intentionally causing a failure of the OCSP request-response communication.
There are several additional problems with the use of the ‘classical’ OCSP deployment, where the OCSP request is sent by the relying party (often, the browser):
Delay: since OCSP is an online, request-response protocol, its deployment at
the beginning of a connection often results in considerable delay.
Privacy exposure: the stream of OCSP requests (and responses) may expose
the identities of websites visited by the user to the OCSP responder, or
to other agents able to inspect the network traffic. By default, OCSP
requests and responses are not encrypted, exposing this information even
to an eavesdropper; but even if encryption is used, privacy is at risk. First,
the CA is still exposed to the identities of websites visited by a particular
Figure 8.13: The MitM soft-fail attack on a TLS connection using OCSP. The MitM (fake server, with a revoked certificate) answers the TLS client's (browser's) TLS Client Hello with a TLS Server Hello carrying the revoked certificate; the browser's OCSP request to the OCSP responder (CA) is dropped by the attacker, so no OCSP response arrives; after a time-out, the soft-fail browser completes the TLS key exchange and finish, and data flows to the attacker. The attack assumes the ‘classical’ OCSP deployment, where the TLS client (browser) sends the OCSP request (acts as OCSP client), and (vulnerable) soft-fail handling of timeouts and ambiguous OCSP responses. The attacker impersonates a website for which the attacker has the private key; the corresponding certificate is already revoked, but the attack tricks the browser into accepting it anyway, allowing the impersonation attack to succeed. The browser queries the CA (or other OCSP server) to receive a fresh certificate-status; however, the attacker ‘kills’ the OCSP request or the OCSP response (the figure illustrates dropping of the response). After waiting for some time, the browser times out and accepts the revoked certificate sent by the impersonating website, although no OCSP response was received. This soft-fail behavior is used by most browsers, since the alternatives (very long timeout, asking the user, or hard-fail) are not well received by users.
Note 8.3: The UX>Security Precedence Rule
In the OCSP soft-fail vulnerability, as described in Section 8.4.2, most browsers support OCSP, but only using soft-fail; namely, if the OCSP response is not received within some time, then the browser simply continues with the connection, i.e., ‘gives up’ on the OCSP validation and continues using the received certificate, basically assuming that the certificate was not revoked. It is well understood that this allows a MitM attacker to foil the OCSP validation, i.e., the use of the soft-fail approach results in a known vulnerability. Still, browser developers usually prefer having this vulnerability to the secure alternative of hard-fail, namely, aborting a connection after ‘giving up’ on the OCSP response. The reason is that there are also benign causes for an OCSP response not to arrive, such as unusually high delay due to network congestion or high load on the OCSP responder (typically, the CA). Aborting a connection in such cases would result in loss of availability. If the response is only delayed and eventually arrives, waiting for a long time would result in poor performance.
Loss of availability, performance, reliability and functionality, are all immediately
visible to the end users, i.e., they harm the user experience (UX). User experience
has a direct, immediate impact on the success of a product. In contrast, security
and privacy considerations are rarely visible to the users. As a result, even when
vendors and developers care about security and privacy, they usually prefer to
compromise on these goals, to avoid harming the user experience (UX) aspects:
availability, functionality, performance, usability and reliability. We refer to this as
the UX>Security Precedence Rule.
Principle 16 (The UX>Security Precedence Rule). Vendors and developers give precedence to the user experience (UX) considerations (availability, functionality, performance, usability and reliability) over the security and privacy considerations.
Of course, the UX>Security Precedence Rule is just a simplification; real decisions are more complex, and some vulnerabilities will be considered so critical that developers will prefer to fix them, even at the cost of some reduction in UX. However, usually, the challenge for designers and researchers is to find solutions which ensure sufficient security, while avoiding or minimizing harm to the user experience (UX).
Applied Introduction to Cryptography and Cybersecurity
CHAPTER 8. PUBLIC KEY INFRASTRUCTURE (PKI)
user. Second, even with encryption of OCSP requests and responses, the
timing patterns create a side-channel that may allow an eavesdropper to
identify visited websites.
Processing overhead: while OCSP often reduces overhead significantly compared to CRLs, it still requires each response to be signed, which is a
computational burden on the OCSP responder. In addition to this computational overhead, there is the overhead of processing each of the (many)
OCSP requests; this overhead remains even when applying optimizations
that reduce the OCSP computational overhead, e.g., as in subsection 8.4.4
and Exercise 8.20.
The processing overhead is especially a concern for the OCSP responder.
Consider the typical case, of a CA providing OCSP responder service; the
signatures in OCSP responses imply significant processing overhead, which
can be a significant concern to the CA. Normally, CAs cannot charge for the
overhead of handling these OCSP requests; and to provide reliable service,
they should be ready to respond to a Flash Crowd⁸ of requests, from visitors
of a (suddenly popular) website, or to respond to requests sent as part of an
intentional Denial-of-Service attack (on the CA or on a subject of a certificate).
Due to the overhead concerns, an OCSP responder may limit its services
to authorized OCSP clients. To support this, OCSP requests may be signed;
some servers may use other ways to authenticate their clients, e.g., using the
optional extensions mechanism supported by OCSP requests.
We next describe OCSP stapling, where the OCSP client is the subject of
the certificate rather than the relying party. The goal of OCSP stapling is to
mitigate these security, privacy and efficiency concerns. In subsection 8.4.4 we
discuss additional methods to reduce the computational overhead of OCSP.
8.4.3 OCSP Stapling and the Must-Staple Extension
In the previous subsection, we have seen several disadvantages of the ‘classical’
OCSP deployment, where the relying party sends the OCSP requests (i.e., acts
as the OCSP client). In this section we discuss an alternative approach, the
OCSP stapling deployment, where the OCSP request is sent by the subject,
typically the website, acting as the OCSP client. Namely, this design moves
the responsibility to obtain ‘fresh’ OCSP signed responses to the subject (e.g.,
web-server), rather than placing this responsibility (and burden) on every client
(e.g., browser). This addresses the privacy exposure and reduces the overhead
on the OCSP responder (typically, the CA), since it now needs only to send a
single signed OCSP response to each subject (website) - much less overhead
than sending to every relying party (browser). Furthermore, since now only the
subject is supposed to make OCSP requests, the CA may limit the service to
its customers, the subjects.
⁸The term Flash Crowd is the name of a sci-fi novella by Larry Niven, describing a ‘physical
flash crowd’ due to the use of transfer booths.
8.4. CERTIFICATE REVOCATION
Therefore, of all the concerns discussed for the relying-party-based OCSP,
only one remains: handling of ambiguous OCSP responses, and in particular,
the MitM soft-fail attack (Figure 8.13). We discuss two variants of OCSP
stapling, which handle, in two different ways, such ambiguities and failures.
OCSP Stapling. OCSP stapling is a different way to deploy OCSP, where
the subject runs the OCSP client and periodically sends OCSP requests to the
OCSP responder for an OCSP response for the server’s certificate, e.g., CB.
Let us focus on the typical scenario, where the relying party is a browser
running TLS, which receives a certificate CB from the web (and TLS) server,
e.g., bob.com, the subject of the certificate CB. In OCSP stapling, the
subject (web server) periodically sends an OCSP request to the OCSP
responder (CA). The web-server does this periodically, without waiting for the
TLS Client Hello message from the client. The CA (or other OCSP responder)
sends back the OCSP response; usually, the response indicates that CB is still
Ok (not revoked) at the current time time(·). We denote this response as σ;
importantly, σ = SignCA.s(CB Ok:time(·)), i.e., it contains a signature by the
private signing key CA.s of the CA, on the web-server’s certificate CB and
the current time. This response should satisfy browsers (as relying parties), at
least until bob.com ‘refreshes’ it by again sending an OCSP request for CB. The
web-server, e.g., bob.com, keeps the response σ, providing it to all connections
by OCSP-stapling-supporting browsers, until it requests and receives a
newer OCSP response, in the next period.
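The server side of this process is essentially a periodic refresh of a cached response. The following is a minimal sketch of this logic; the class and method names are our own illustration, not the API of any real TLS library:

```python
# Hypothetical sketch of the web-server's stapling logic: periodically
# refresh the cached OCSP response (sigma), and staple the cached copy
# into every TLS handshake that requests it.
class StapleCache:
    def __init__(self, fetch_response):
        self.fetch_response = fetch_response  # queries the OCSP responder (CA)
        self.response = None                  # last signed response, sigma

    def refresh_once(self):
        """Called once per period; on failure, keep serving the old response."""
        try:
            self.response = self.fetch_response()
        except OSError:
            pass  # responder unreachable: the previous sigma remains usable

    def staple(self):
        """Value for the CSR TLS-extension; None means 'omit the extension'."""
        return self.response
```

Note that on a failed refresh the server keeps serving the previous response, which relying parties will accept as long as it is still ‘recent enough’.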
When an OCSP-stapling-supporting browser connects to bob.com, it indicates
its support for OCSP-stapling by including the CSR TLS extension; CSR stands
for the Certificate Status Request TLS-extension. If the server supports stapling
and has a valid OCSP response σ, then it staples (includes) the OCSP response
σ, placing it in the CSR TLS-extension, sent in the server’s response. See
this scenario in Figure 8.14.
Note that we discuss here the variant of OCSP deployment, where stapling
is optional ; i.e., the web-server may not staple an OCSP response, e.g., if the
web-server did not receive the OCSP response from the OCSP responder. We
later discuss OCSP Must-Staple, a variant of OCSP deployment where the
subject commits to sending a valid OCSP response.
Once the browser receives the OCSP response (in the CSR TLS-extension),
it validates it, i.e., it verifies the signature of the CA (using the CA’s public
validation key CA.v), and then validates that the response indicates non-revocation
(which we marked by Ok) and that the time indicated is ‘recent
enough’. When all is Ok, the browser completes the TLS handshake with
bob.com and then continues with the TLS connection.
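These browser-side checks can be sketched in a few lines. The field names, the freshness window, and the HMAC stand-in for the CA’s signature are our assumptions for illustration; real OCSP uses ASN.1/DER encoding and RSA/ECDSA signatures (RFC 6960):

```python
# Hedged sketch of the relying party's validation of a stapled OCSP
# response: check the CA's signature, the 'Ok' status, and freshness.
import hmac, hashlib

MAX_AGE = 24 * 3600  # assumed 'recent enough' window, in seconds

def verify_sig(ca_key: bytes, message: bytes, sig: bytes) -> bool:
    # Stand-in for signature verification with the CA's public key CA.v;
    # modeled with HMAC so the sketch is self-contained.
    expected = hmac.new(ca_key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, sig)

def validate_stapled(cert_id: bytes, status: bytes, signed_at: int,
                     sig: bytes, ca_key: bytes, now: int) -> bool:
    msg = cert_id + b"|" + status + b"|" + str(signed_at).encode()
    if not verify_sig(ca_key, msg, sig):
        return False          # not signed by the CA
    if status != b"Ok":
        return False          # revoked (or unknown) status
    if now - signed_at > MAX_AGE:
        return False          # response not 'recent enough'
    return True
```

Only when all three checks pass does the client proceed with the handshake.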
We described the OCSP-stapling process for a TLS connection between a
browser and a web-server, for the case where the certificate was issued by a root
CA (directly trusted by the browser). However, the process is exactly the same
for other TLS clients and servers, and the modifications for the (typical) case
[Figure 8.14 shows three parties: the web+TLS server bob.com (subject of CB and OCSP client), the browser (TLS client and relying party), and the CA (OCSP responder). Message flow: bob.com sends an OCSP request (for CB) to the CA; the CA returns the OCSP response σ = SignCA.s(CB Ok:time(·)); the browser sends TLS Client Hello with the CSR TLS-extension; bob.com replies with TLS Server Hello carrying σ in the CSR extension; the TLS key exchange and finish messages complete the handshake.]
Figure 8.14: (Optional) OCSP stapling in the TLS protocol, using the Certificate
Status Request (CSR) TLS extension, for a typical TLS connection between
browser and web-server bob.com, the subject of certificate CB. bob.com received
CB from the CA (not shown); the CA is also the OCSP responder. The web
(and TLS) server bob.com periodically sends OCSP requests to the CA (also
OCSP responder), requesting the status of its own certificate CB. The CA sends
back the OCSP response, σ = SignCA.s(CB Ok:time(·)), signaling that CB
was not revoked up to time time(·). The browser sends the TLS CSR extension
to bob.com with TLS Client Hello, to request OCSP-stapling. The server sends
back σ, the OCSP response, also in the CSR extension. The TLS handshake
now completes as usual.
of intermediate CA are simple, following the multi-cert OCSP request-response
as discussed earlier, including in Note 8.2.
Handling Ambiguous OCSP responses and the MitM soft-fail attack.
Let us now return to discuss the handling of ambiguous OCSP responses, and
in particular, handling of the case where no OCSP response is received. For
stapled OCSP, such failure may happen either between the subject of the certificate,
typically the web-server, which acts as the OCSP client, and the CA (OCSP
responder); or between the relying party, typically the browser, and the subject
(web-server). In particular, this will happen if the web-server does not support
OCSP stapling.
In any case, the bottom line is that the browser does not receive a stapled
OCSP response from the web-server. In the ‘optional’ OCSP stapling design,
this simply directs the browser to attempt to resolve the revocation situation
by itself. Typically, the browser would now perform an OCSP query directly
with the OCSP responder (typically, the CA), or even request the CRL.
However, now we are basically back in the ‘classical OCSP’ deployment,
where OCSP (and/or CRL) are deployed by the relying party. So, let us consider
again the browser’s reaction if it fails to receive a response to its OCSP (or
CRL) request. This places the browser in a similar dilemma to the one discussed
earlier - and most implementations would adopt the soft-fail approach, i.e., use
the certificate assuming that it was not revoked.
Unfortunately, this implies we are again vulnerable to a MitM soft-fail
attack, similar to the one presented earlier (Figure 8.13). The attack is only
slightly modified due to the failed attempt at OCSP stapling, and should
be quite clear from Figure 8.15.
One way to defend against the MitM soft-fail attack (Figure 8.15) is using
the Must-Staple extension to the server’s X.509 certificate, which we discuss
next.
The Must-Staple X.509 extension: enforcing OCSP stapled response.
The attacks of Figure 8.13 and Figure 8.15 show the risk of adopting the soft-fail
approach. The soft-fail mechanism is the equivalent of deciding to allow
bypassing of airport security screening whenever the line becomes too long.
A likely outcome of such a policy would be that an attacker will find ways to
cause the line to be congested, and then use the bypass to avoid screening and
perform an attack. We sum this up with the following principle.
Principle 17 (Soft-fail security is insecure). Defenses should not be bypassed
due to failures: if defenses are bypassed upon failure, attackers will cause failures
to bypass defenses. Namely, soft-fail security is insecurity.
Awareness of the risk of the soft-fail approach motivates adoption of the
harsher, hard-fail approach. However, this conflicts with the UX>Security
precedence rule (Principle 16). Certainly, it would be absurd for a browser to
refuse connection to a website only because it did not receive the OCSP response;
this is very likely due to a benign reason, such as the website simply not
supporting OCSP stapling!
The TLS-feature X.509 extension [184] is the standard solution to this
dilemma. This extension to the website’s X.509 certificate can be used to
indicate that the website always staples OCSP responses. To a large extent,
this moves the UX vs. security decision from the browser to the website: the
browser applies the ‘must-staple’ policy only to a website that requests
it, by using the ‘must-staple’ extension in its X.509 certificate. As shown in
Figure 8.16, this foils the MitM soft-fail attack on OCSP-stapling TLS clients
of Figure 8.15. Note, however, that the TLS-feature extension is only effective when
the attacker tries to abuse a certificate issued to the legitimate website (with
the TLS-feature extension) and later revoked, e.g., after key-exposure was detected
or suspected. In the common case where the attacker is able to get a CA to
[Figure 8.15 shows three parties: the TLS client (browser), the MitM (fake server, with a revoked certificate), and the OCSP responder (CA). Message flow: the client sends TLS Client Hello with the CSR extension; the MitM replies with TLS Server Hello without an OCSP response; the client sends an OCSP request, and the MitM drops the OCSP response; after a time-out, the client soft-fails: the TLS key exchange and finish messages complete, and data flows to the attacker.]
Figure 8.15: MitM soft-fail attack on an OCSP-stapling TLS client (browser), using
a revoked TLS server (website) certificate; assume that the attacker has the
certified (and revoked) private key. The browser sends the CSR TLS extension;
however, the website’s certificate does not have the X.509 Must-Staple extension,
or the client does not respect this extension. The attacker impersonates the
web-server, and sends the TLS server-hello and certificate messages; the attacker
does not send the OCSP response (which would have indicated revocation). The
client is misled into thinking that the server does not support OCSP stapling.
The client may now send an OCSP request to the appropriate OCSP responder,
e.g., the relevant CA, but the MitM attacker would ‘kill’ the OCSP request or
response (the figure shows killing of the response). After a time-out, the client
‘gives up’ on the OCSP response and ‘soft-fails’, i.e., accepts the certificate and
establishes the connection with the impersonating attacker.
[Figure 8.16 shows two parties: the TLS client (browser) and the MitM (fake server, with a revoked certificate). The client sends TLS Client Hello with the CSR extension; the MitM replies with TLS Server Hello without the CSR extension, but the certificate carries the TLS-feature X.509 extension [184] indicating Must-Staple; the client aborts (and possibly alerts/reports).]
Figure 8.16: The use of the TLS-feature X.509 extension [184], to indicate Must-Staple,
defends against the MitM soft-fail attack on OCSP-stapling TLS clients
of Figure 8.15. As in Figure 8.15, the attacker tries to impersonate a website for
which the attacker has the private key and the corresponding certificate, which
was already revoked. As in Figure 8.15, the client sends the Client-Hello request,
with the CSR TLS extension, i.e., asking the server to staple an OCSP response.
As in Figure 8.15, the attacker responds without the CSR extension, i.e., trying
to mislead the client into falling back to sending an OCSP request (and then
soft-failing). However, the Must-Staple extension instructs the client to refuse
to continue without the OCSP response from the server.
issue a certificate for a request sent by the attacker, the attacker can surely ask
not to include this extension, allowing the attacker to avoid the must-staple
mechanism.
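The client-side policy described in the last few paragraphs can be condensed into a few lines. This is our own sketch of the decision, not code from any actual browser:

```python
# Condensed sketch of the relying party's decision, combining optional
# stapling, the Must-Staple extension, and the soft-fail fallback.
def accept_certificate(must_staple: bool, stapled_ok: bool,
                       direct_ocsp_ok: bool, soft_fail: bool = True) -> bool:
    if stapled_ok:
        return True       # fresh, CA-signed 'Ok' response was stapled
    if must_staple:
        return False      # certificate promised stapling: hard-fail (abort)
    if direct_ocsp_ok:
        return True       # client's own OCSP query confirmed non-revocation
    # No revocation evidence either way; soft-fail accepts (the behavior
    # exploited by the MitM soft-fail attack), hard-fail aborts.
    return soft_fail
```

The default `soft_fail=True` reflects the UX>Security precedence rule; a Must-Staple certificate overrides it, since the website itself opted into hard-fail.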
Mandatory Must-Staple? The inclusion of the Must-Staple certificate
extension in a certificate C prevents an attacker from abusing C after C was
revoked, when C was revoked due to (suspected or detected) exposure of the
private key. However, the Must-Staple extension does not prevent an attacker
from abusing a rogue certificate CR ≠ C, e.g., a certificate with a misleading
domain-name (subsection 8.1.1), even after the CA revokes CR (and/or C),
if the rogue certificate CR does not include the must-staple extension. Such a
rogue certificate CR can still be used to attack a client which does not request
and wait for OCSP approval. One way to prevent this would be a mandatory
Must-Staple extension, but this seems unlikely to happen. In fact, there are
significant challenges to the adoption of the Must-Staple extension, as we now
explain.
Must-Staple Adoption Challenges. The UX>Security precedence rule
(Principle 16) applies also to websites; website developers would be reluctant
to adopt the Must-Staple extension if they believe this may jeopardize the
availability of their website. That may happen for different reasons, such as
clients processing the extension incorrectly, web-servers not supporting the
extension or the OCSP process correctly, or failure to receive the OCSP response
from the OCSP responder (usually, the CA).
Unfortunately, measurements of adoption published so far were not very
encouraging [98]. One possible reason is that CAs may be reluctant to support
Must-Staple; indeed, with the proliferation of websites, it is likely that the
use of Must-Staple will result in an increased rate of OCSP responses by CAs,
each requiring a new signature - a potentially significant overhead. See some
(non-standardized) possible optimizations in subsection 8.4.4.
However, we still hope that Must-Staple will be gradually adopted, as it
offers significant security advantages, with high efficiency for relying parties
and subjects. There does not appear to be any inherent technical reason for either the
incorrect processing or for failures of the web-servers to receive OCSP responses
(and then provide them to the browsers).
Indeed, this is an example of the significant adoption challenges facing
designers of new Internet and web security mechanisms. Adoption considerations
should be an important part of the design process. In the following exercise,
we discuss some issues which may help - or hinder - the adoption of the OCSP
Must-Staple extension.
Exercise 8.7. For each of the following variants of the OCSP Must-Staple extension process, explain possible impacts on adoption, security and performance:
1. Mark the Must-Staple extension as a critical X.509 extension.
2. Mark the Must-Staple extension as a non-critical X.509 extension.
3. When a browser receives from a website a certificate with the Must-Staple
extension, but without the stapled OCSP response, the browser would
not abort the connection, but would request the certificate status (OCSP
response) from the CA, and abort the connection only if this request also fails.
4. Same as previous item, however, the website/CA will have the ability to
indicate if the client should try sending OCSP request to the CA (if it
does not receive it stapled from the web-server). Consider three ways to
indicate this: (a) an option of the OCSP Must-Staple extension, (b) a
separate extension, or (c) an option indicated in a TLS extension returned
by the web server.
Notice that the Must-Staple extension requires support by the CA, to include it
in the web-server’s certificate, and to provide sufficiently-reliable OCSP service.
An alternative solution, which does not require such a special certificate-extension,
is discussed in Exercise 8.18.
8.4.4 Reducing OCSP Computational Overhead
As can be seen in Table 8.3, OCSP performance is typically better than that
of CRLs in most aspects: low delay, relatively low bandwidth, and no storage required. However,
OCSP computational overhead can be higher. For the CA, this overhead is
mainly due to the signature operations; for the relying party, it is mostly due to
signature verifications.
[Figure 8.17 depicts a binary Merkle tree of eight leaves. Leaf i holds the pair (ci, si); the leaf hashes are h1, . . . , h8, with hi = h(ci, si). The internal nodes are h1−2, h3−4, h5−6, h7−8, above them h1−4 and h5−8, and the root is h1−8. The CA signs the root together with the time: σ1−8 = SignCA.s(h1−8 ++ time).]
Figure 8.17: Certificates-Merkle-tree variant of OCSP: optimizing the OCSP response by signing the digest of a Merkle-tree whose leaves are the certificates
ci and their statuses si ∈ {good, revoked, unknown} (subsection 3.7.3). The
root value σ1−8 is the signature over the hash-tree digest and the time. Every internal node
is the hash of its children; in particular, for every i holds hi = h(ci, si), and,
for odd i, hi−(i+1) = h(hi ++ hi+1). To validate any certificate, say c3, provide the signature of the certificate hash-tree, i.e., σ1−8, the time-of-signing and the digest
scheme’s Proof-of-Inclusion (PoI), i.e., the values of internal hash nodes required
to validate the signed hash, namely h4, h1−2 and h5−8.
In this subsection, we discuss several non-standard optimization variants of
OCSP, which can reduce its computational overhead.
The Certificate-Hash-Tree. This OCSP variant uses the Merkle-tree scheme,
introduced in subsection 3.7.3, to allow the CA or OCSP responder to periodically
perform a single signature operation, which suffices to provide OCSP responses
indicating the status for any OCSP request. Assume that the CA issued a large
set of certificates c1, c2, . . . , cn, but each OCSP request will contain only one or
few certificate-identifiers.
As shown in Figure 8.17, the signature is computed over the result of a
hash-tree applied to the entire set of certificates issued by the CA (and their
statuses), concatenated with the current time. The leaves of the hash-tree are
the pairs of individual certificates c1, . . . , cn and their corresponding statuses
s1, . . . , sn. The construction uses a collision-resistant hash function (CRHF)
denoted h.
The OCSP response for a query for the status of certificate ci consists of this
signature, the time of signing, and the values of ‘a few’ internal nodes - essentially,
one node per layer of the hash tree. This allows the OCSP client to recompute the
result of the hash tree, and then validate the signature. For example, to validate
the value of c6, the response should include h5, h7−8 and h1−4. To validate,
compute h6 = h(c6, s6), then h5−6 = h(h5 ++ h6), then h5−8 = h(h5−6 ++ h7−8),
then h1−8 = h(h1−4 ++ h5−8), and finally verify the signature over h1−8 and
time, by validating that VerifyCA.v(σ1−8, h1−8 ++ time) returns true. We refer
to this set of values (e.g., c6, h5, h7−8, h1−4) as the Proof-of-Inclusion (PoI) of c6.
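The validation just described is easy to express in code. Below is a minimal sketch, with h instantiated as SHA-256 and our own helper names; the bits of the 0-based leaf index tell the verifier, level by level, whether the supplied sibling goes on the left or the right:

```python
# Minimal sketch of PoI validation for the certificates-Merkle-tree variant.
import hashlib

def h(*parts: bytes) -> bytes:
    # h applied to the concatenation (++) of its inputs.
    return hashlib.sha256(b"".join(parts)).digest()

def leaf_hash(cert: bytes, status: bytes) -> bytes:
    return h(cert, status)            # h_i = h(c_i, s_i)

def root_from_poi(leaf: bytes, siblings: list, index: int) -> bytes:
    node = leaf
    for sib in siblings:              # bottom-up: one sibling per level
        if index % 2 == 0:
            node = h(node, sib)       # our node is the left child
        else:
            node = h(sib, node)       # our node is the right child
        index //= 2
    return node                       # should equal the signed root h_{1-8}
```

For c6 in Figure 8.17 (0-based index 5), the siblings are h5, h7−8 and h1−4; the recomputed root is then checked against the signed value σ1−8.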
Exercise 8.8. Consider the Certificates-Merkle-tree variant of OCSP, described
above and illustrated in Figure 8.17.
1. Present pseudo-code for the validation of the OCSP responses by a client
(relying party), when using this variant.
2. Let n be the number of certificates issued by a CA, r the number of
revoked certificates, and i the number of certificate-identifiers sent in
a given OCSP request. Note that r < n and, typically, i ≪ r. What
is the number of signature and hash operations required to (a) produce
and send a CRL, (b) produce and send an OCSP response, (c) produce a
certificate-hash-tree OCSP response?
3. This variant uses a (keyless) collision-resistant hash function (CRHF) h.
Explain a disadvantage of this requirement and suggest a change to the
design that will avoid this disadvantage.
Signed Revocations-Status Merkle-Tree. We can further significantly
reduce the overhead of OCSP by using a Merkle-tree of revocation statuses
instead of a Merkle-tree of certificates. This Merkle tree will still contain one
leaf per certificate. However, the value of leaf i will be a bit bi, corresponding
to the revocation of certificate i; i.e., bi = 1 if certificate i is revoked, and bi = 0
otherwise.
The CA applies the Merkle-tree scheme to these leaves, and obtains the digest
of the entire tree of revocations, which it signs. To provide the OCSP response
for a query on the status of a particular certificate, say certificate i, the CA
includes in the response this signature, together with a Proof-of-Inclusion (PoI)
of the value bi as the ith leaf of the tree.
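As a sketch, computing the digest that the CA signs (before appending the time) might look as follows; the byte encoding of the single-bit leaves is our own assumption:

```python
# Sketch of the revocations-status Merkle-tree digest: one single-bit leaf
# per certificate; the CA signs the resulting root (together with the time).
import hashlib

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

def revocation_tree_root(bits: list) -> bytes:
    # For simplicity, assume the number of leaves is a power of two.
    assert len(bits) > 0 and len(bits) & (len(bits) - 1) == 0
    level = [h(b"1" if b else b"0") for b in bits]   # leaf i holds bit b_i
    while len(level) > 1:
        level = [h(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]                                  # digest the CA signs
```

Note that the root changes whenever any bit flips, so a single signature over the root (and the time) covers the status of every certificate.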
This PoI can be further optimized by observing that revocations are not
very common, i.e., most leaves will be zero (not revoked). There is no need
to include the hash of any subtree whose leaves are all zero (i.e., none of the
certificates in it was revoked).
Revoked-certificates Merkle tree. The disadvantage of the revocation-status
approach is that it provides information about all certificates. Assuming
that only a very small fraction of the certificates are revoked, other optimizations
are possible - and possibly even more effective. For example, we
present the revoked-certificates Merkle tree approach. This approach is also
interesting since it introduces an additional optional mechanism for Merkle
digest schemes: a Proof of Non-Inclusion (PoNI).
[Figure 8.18 depicts the signed revocations-status Merkle-tree for eight certificates, of which certificates 5 and 8 are revoked: the leaves are the bits b1, . . . , b8 = 0, 0, 0, 0, 1, 0, 0, 1; each leaf is hashed, each internal node is the hash of its children, and the root h1−8 is signed together with the time: σ1−8 = SignCA.s(h1−8 ++ time).]
Figure 8.18: Signed revocations-status Merkle-tree; leaf i contains the revocation-status bit bi.