Unicity Distance

Theorem
  H(K|C^n) = H(K) + H(P^n) - H(C^n)

Proof
Note that:

    H(K,C^n,P^n) = H(C^n|K,P^n) + H(K,P^n)
                 = H(P^n|K,C^n) + H(K,C^n)
and that H(C^n|K,P^n) = H(P^n|K,C^n) = 0, since the ciphertext is completely determined by the key and the plaintext, and the plaintext by the key and the ciphertext. Further, K and P^n are chosen independently. Therefore:
   H(K,C^n,P^n) = H(K,C^n) = H(K) + H(P^n)
   H(K,C^n) = H(K|C^n) + H(C^n)
Combining the equalities completes the proof.
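As a sanity check, the identity can be verified numerically on a toy cryptosystem. The shift cipher, alphabet size, and plaintext distribution below are illustrative choices, not taken from the text:

```python
from itertools import product
from math import log2

# Verify H(K|C) = H(K) + H(P) - H(C) for a shift cipher over a
# 3-letter alphabet (n = 1); all parameters here are illustrative.
M = 3                                  # alphabet size
p_plain = {0: 0.5, 1: 0.3, 2: 0.2}     # non-uniform plaintext distribution
p_key = {k: 1 / M for k in range(M)}   # keys uniform, independent of P

def H(dist):
    """Shannon entropy in bits of a probability dictionary."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

# Joint distribution over (key, ciphertext), where c = (x + k) mod M.
joint = {}
for k, x in product(range(M), range(M)):
    c = (x + k) % M
    joint[(k, c)] = joint.get((k, c), 0) + p_key[k] * p_plain[x]

p_cipher = {}
for (k, c), pr in joint.items():
    p_cipher[c] = p_cipher.get(c, 0) + pr

H_K_given_C = H(joint) - H(p_cipher)          # H(K|C) = H(K,C) - H(C)
rhs = H(p_key) + H(p_plain) - H(p_cipher)     # theorem's right-hand side
print(abs(H_K_given_C - rhs) < 1e-12)         # True
```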

The entropy H(K|C^n) is bounded above by log( s_n + 1 ), where s_n is the average number of spurious keys associated with a ciphertext. That is, define:

    K(y) = { k in K | exists x in P^n, p(x)>0, and e_k(x)=y }
the set of keys under which the ciphertext y could be the encryption of some plaintext of positive probability. Exactly one of these keys is correct, so |K(y)| - 1 counts the spurious keys for y; averaging over ciphertexts gives s_n:
    s_n = sum_{y in C^n} p(y)( |K(y)|-1 )
        = -1 + sum_{y in C^n} p(y) |K(y)| 
So
    H(K|C^n) = sum_y p(y) H(K|y)
             <= sum_y p(y) log |K(y)|
             <= log( sum_y p(y) |K(y)| )     (Jensen's inequality)
             = log( s_n + 1 )
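The quantities K(y) and s_n can be computed exhaustively for a small system. The sketch below uses a shift cipher on two-letter "words" drawn from a tiny illustrative dictionary (not real English statistics); the pairs "he"/"if" and "as"/"me" are shifts of one another, so spurious keys do occur:

```python
from math import log2

# Spurious-key count s_n for a shift cipher on two-letter words.
# The dictionary is an illustrative toy, not real language data.
ALPHA = 26
words = ["he", "if", "as", "me", "at", "on"]   # "if" = shift("he", 1), "me" = shift("as", 12)
p_plain = {w: 1 / len(words) for w in words}   # uniform over the dictionary
p_key = {k: 1 / ALPHA for k in range(ALPHA)}

def enc(word, k):
    return "".join(chr((ord(ch) - 97 + k) % 26 + 97) for ch in word)

def H(dist):
    return -sum(p * log2(p) for p in dist.values() if p > 0)

# p(y) and K(y) = { k : some plaintext of positive probability maps to y }
p_cipher, keys_for, joint = {}, {}, {}
for w, pw in p_plain.items():
    for k, pk in p_key.items():
        y = enc(w, k)
        p_cipher[y] = p_cipher.get(y, 0) + pw * pk
        keys_for.setdefault(y, set()).add(k)
        joint[(k, y)] = joint.get((k, y), 0) + pw * pk

s_n = sum(py * (len(keys_for[y]) - 1) for y, py in p_cipher.items())
H_K_given_C = H(joint) - H(p_cipher)           # H(K|C^n) = H(K,C^n) - H(C^n)
print(H_K_given_C <= log2(s_n + 1) + 1e-12)    # True: the bound holds
```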

Definition
The entropy of a natural language L is defined as the limit

    H_L = lim_{n to infinity} H(P^n) / n
and the redundancy of the language as:
   R_L = 1 - ( H_L / log |P| )
See Stinson, the class text, for further discussion.
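Plugging in the commonly quoted estimate H_L ≈ 1.25 bits per letter for English (see Stinson for the range of estimates) gives a feel for the redundancy:

```python
from math import log2

# Redundancy of English from the estimate H_L ~ 1.25 bits/letter;
# |P| = 26. The value of H_L is an approximation, not exact.
H_L = 1.25
R_L = 1 - H_L / log2(26)
print(round(R_L, 2))   # 0.73, i.e. English is roughly 75% redundant
```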

We make a few more assumptions and approximations:

    H(P^n) =(approx) n H_L = n(1-R_L)log |P|,
    H(C^n) <= n log |C|.
Combining these with the theorem and the bound above, and assuming |C| = |P|, yields:
    log( s_n + 1 ) >= H(K) - n R_L log |P|

Theorem (Unicity distance)
The expected number of spurious keys in a cryptosystem where |C| = |P| and where keys are chosen with equal probability, so that H(K) = log |K|, satisfies:

   s_n >= ( |K| / |P|^(n R_L) ) - 1 
The unicity distance is the value of n at which the right-hand side becomes zero; it is a lower bound on the number of ciphertext characters needed to discover the key:
     n =(approx) log |K| / ( R_L log |P| )

As an example, for the substitution cipher and English, the unicity distance is approximately 25.
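The figure of 25 can be reproduced directly from the formula: the substitution cipher has |K| = 26! keys, |P| = 26, and we take R_L ≈ 0.75 for English (the approximation used in Stinson):

```python
from math import log2, factorial

# Unicity distance of the substitution cipher over English:
# |K| = 26!, |P| = 26, R_L ~ 0.75 (an approximation).
R_L = 0.75
n0 = log2(factorial(26)) / (R_L * log2(26))
print(round(n0))   # 25
```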