27 October 2017

I'm currently reading Liza Mundy's Code Girls, a book about the role that American women played in World War II cryptanalysis. (By coincidence, it came out around the same time as The Woman Who Smashed Codes, a biography of Elizebeth Friedman, one of the greatest cryptanalysts in history.) Mundy notes that the attack on Japan's PURPLE machine was aided by a design feature: PURPLE encrypted 20 letters separately from 6 other letters. But why should the machine have been designed that way?

PURPLE, it turns out, was a descendant of RED, which had the same 20/6 split. In RED, though, the 6 letters were the vowels; the ciphertext thus preserved the consonant versus vowel difference from the plaintext. But why was that a desirable goal?

The answer was economy. Telegraph companies of the time charged by the word—but what is a "word"? Is ATOY a word? Two words? What about "GROUP LEADER"? In English, that's two words, but the German "GRUPPENFÜHRER" is one word. Could an English speaker write "GROUPLEADER" instead?

The exact rules were a subject of much debate and were codified into international regulations. One rule that was adopted permitted artificial words if they were pronounceable, which in turn was defined as a minimum density of vowels. So, to save money, the Japanese cryptologists designed RED to keep the (high) vowel density of Japanese as rendered in Romaji.
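To make the vowel-preservation idea concrete, here is a toy sketch in Python. It is not a model of RED itself, which was a stepping machine; it uses a single fixed substitution, and it assumes the six "vowels" were A, E, I, O, U, and Y. The function names and the sample plaintext are invented for illustration. The point is only that when vowels map to vowels and consonants to consonants, the ciphertext has exactly the same vowel density as the plaintext.

```python
import random
import string

# Assumption: the six-letter group is the vowels plus Y.
VOWELS = "AEIOUY"
CONSONANTS = "".join(c for c in string.ascii_uppercase if c not in VOWELS)


def make_split_cipher(seed):
    """Build a toy substitution that permutes vowels among themselves
    and consonants among themselves (a 6/20 split)."""
    rng = random.Random(seed)
    vowels = list(VOWELS)
    consonants = list(CONSONANTS)
    rng.shuffle(vowels)
    rng.shuffle(consonants)
    return str.maketrans(VOWELS + CONSONANTS,
                         "".join(vowels) + "".join(consonants))


def vowel_pattern(text):
    """Return the consonant/vowel skeleton of the letters, e.g. 'CVCV'."""
    return "".join("V" if ch in VOWELS else "C"
                   for ch in text if ch.isalpha())


def vowel_density(text):
    """Fraction of the letters in text that are vowels."""
    letters = [ch for ch in text if ch.isalpha()]
    return sum(ch in VOWELS for ch in letters) / len(letters)


# Hypothetical uppercase Romaji plaintext, chosen only for illustration.
plaintext = "YOROSHIKU ONEGAI SHIMASU"
table = make_split_cipher(seed=1927)
ciphertext = plaintext.translate(table)

print(plaintext, vowel_pattern(plaintext))
print(ciphertext, vowel_pattern(ciphertext))
print(vowel_density(plaintext) == vowel_density(ciphertext))
```

The same property that keeps the ciphertext "pronounceable" also leaks structure: an eavesdropper sees which positions held vowels, which is the kind of regularity a cryptanalyst can exploit.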

One bitter opponent of any such rules was William Friedman, himself a great cryptanalyst (and the husband of Elizebeth) and the administrative head of the US Army group that eventually broke PURPLE.

So: if Friedman's 1927 advice had been followed, RED would not have treated vowels differently, PURPLE wouldn't have had the 20/6 split, and Friedman's group might have been denied its greatest triumph.