3

An encryption scheme consists of the following data

  • a plaintext space $\mathcal M$
  • a ciphertext space $\mathcal C$
  • a key space $\mathcal K$
  • a key generating algorithm Gen
  • encryption and decryption algorithms $\operatorname{Enc}_k$ and $\operatorname{Dec}_k$ for each key $k\in \mathcal K$

I would like some clarification on the following extract from Katz & Lindell:

https://i.stack.imgur.com/Q3fzK.png

More specifically, my question is: What does, say, $P(K=k)$ mean exactly? What is the domain and codomain of the random variable $K$? What is the set $\{K=k\}$? Also, what is the domain of Gen?

2 Answers2

1

$K$ is the random variable of "chosen key from the keyspace used in one instance of encryption passing." So, given a key $k \in \mathcal{K}$, $P(K =k)$ is the probability that the key $k$ is chosen for a round of encryption. Most crypto protocols (and certainly analysis of perfect secrecy) require this to be uniformly distributed.

Randall
  • 18,542
  • 2
  • 24
  • 47
  • Yes, thank you @Randall. What is the sample space on which $K$ is defined? Does $K$ take real values? Exactly what is meant by $P(K=k)$? Usually it's shorthand for $P({K=k})$ but I'm not even sure what this set means here. – CRYPTONEWBIE Sep 07 '17 at 18:44
  • I suppose the sample space for $K$ is $\mathcal{K}$, the set of all potential keys. I wouldn't think about it any more deeply than that. – Randall Sep 07 '17 at 18:54
  • But this is also the codomain of $K$? I don't understand. This is bugging me too much, I need to clarify this, @Randall. – CRYPTONEWBIE Sep 07 '17 at 19:27
  • The sample space is $\mathcal{K}$, the random variable $K$ is the identity function on $\mathcal{K}$, the probability function is uniform. Thus $P(K=k)$ means $P({k})$. – Matthew Towers Sep 07 '17 at 19:40
1

Gen is a key generation "algorithm". An algorithm is modelled as a non-deterministic Turing machine. The algorithm has a "security parameter", here the key size $n$ (which can be $128$ for AES). To get a uniform key space: the Turing machine has $1^n$ as input ,a sequence of $n$ $1$'s (or $0$'s if you prefer), and starts at the left, overwriting each $1$ (or $0$ with) with a $0$ or $1$ using the randomised algorithm (given by coin flips), moving 1 to the right. It stops at the end of the input, at the first empty cell. The result is what was written on the tape.

Usually this is written as $\text{Gen}: 1^n \to \mathcal{K}$, to denote that $k$ is chosen uniformly at random from $\{0,1\}^n$. So then $P(K=k) = \frac{1}{2^n}$, as all keys are equally likely. So the sample space for $K$ (as a random variable) is all subsets of $\mathcal{K}$ (so $\mathscr{P}(\mathcal{K})$), and $P(A) = P(K \in A) = \frac{|A|}{2^n}$.

So $P$ is just a probability measure on $\mathscr{P}(\mathcal{K})$ and the random variable $K$ is somewhat superfluous: we can talk about $P(A)$, which we interpret to mean that the key we chose lies in $A$. Usually a random variable maps events in a probability space to numeric values, but in a general sense a random variable is a measurable function from $(S, P)$, a probability space, to any measurable space (like the reals). In this book, Gen is seen as the random function $K$ that produces values in the measurable space $(\mathcal{K},P)$ with the powerset as the $\sigma$-algebra and $P$ uniform as described.

Henno Brandsma
  • 242,131