- Newsgroups: sci.crypt
- Path: sparky!uunet!think.com!ames!saimiri.primate.wisc.edu!nntp.msstate.edu!willis1.cis.uab.edu!sloan
- From: sloan@cis.uab.edu (Kenneth Sloan)
- Subject: Re: Attack Methods
- Message-ID: <1992Nov18.224350.11512@cis.uab.edu>
- Organization: CIS, University of Alabama at Birmingham
- References: <1992Nov18.134243.24089@qiclab.scn.rain.com> <1992Nov18.190513.10997@cis.uab.edu> <1992Nov18.203413.11509@rchland.ibm.com>
- Date: Wed, 18 Nov 1992 22:43:50 GMT
- Lines: 98
-
- In article <1992Nov18.203413.11509@rchland.ibm.com> lwloen@vnet.ibm.com writes:
- >In message <1992Nov18.190513.10997@cis.uab.edu> Jim Dailey writes:
-
- [ahem...actually, I wrote it. Perhaps I'm taking blame, instead of credit?]
- >
- > [ long description of transposition system with some
- > "random" pad bytes thrown in before the ordinary
- > transposition step ]
- >
- >Transpositions are weak because they fall to some pretty universal,
- >system-dependent attacks.
- >
- >The technique of interest here is called "multiple anagramming".
- >...
- >
- >Now, the most common two words in English are "the" and "and".
- >Whenever both ended up at the same location, the fragment:
- >
- >Msg1 t...h...e
- >Msg2 a...n...d (where ... represents other ciphertext in between, of the same length).
- >
- >would stand out. After all, there is no substitution, so the real words are
- >in there somewhere!
-
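To make the quoted attack concrete, here is a minimal sketch of why two messages under the same transposition help the attacker. Everything in it is an assumption for illustration: the two 17-letter messages, the seeded shuffle standing in for a key, and the helper names `transpose` and `invert`. It is not the system described in the earlier posts.

```python
import random

def transpose(text, perm):
    """Output position i takes the character at input position perm[i]."""
    return "".join(text[p] for p in perm)

def invert(perm):
    """Return the permutation that undoes `perm`."""
    inv = [0] * len(perm)
    for out_pos, in_pos in enumerate(perm):
        inv[in_pos] = out_pos
    return inv

# Hypothetical messages of equal length (17 letters each).
msg1 = "thecatsatonthemat"
msg2 = "andthedogranawayx"

# Hypothetical shared key: one fixed permutation of the 17 positions.
perm = list(range(len(msg1)))
random.Random(1992).shuffle(perm)

ct1 = transpose(msg1, perm)
ct2 = transpose(msg2, perm)

# Stack the ciphertexts: each column holds one letter from each message,
# and both letters come from the same plaintext position.
columns = list(zip(ct1, ct2))

# The attacker reorders whole columns, so every trial ordering is checked
# against both rows at once. The correct ordering is the inverse key.
inv = invert(perm)
row1 = "".join(columns[j][0] for j in inv)
row2 = "".join(columns[j][1] for j in inv)
```

Because the columns move as units, the pairs (t,a), (h,n) and (e,d) from "the" over "and" survive the transposition intact. That is what lets a guess in one row be confirmed or rejected by the other.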
- Very good. Thank you. Let me try to summarize where we are so far:
-
- 0) original post: histogram to tell *if* a transposition cipher is in
- use.
-
- 1) my post: how about flattening the histogram, to defeat this?
-
- 2) your post: no good - we'll try anyway, and use second order statistics to
- begin to unravel the transposition. That is - you seem to reject the
- point of view represented by the original scheme, and will look for
- transpositions anyway (perhaps because you know from other sources
- that I'm using transpositions - the original poster was replying to
- a "what if you don't know the method" question)
-
- Have I missed anything, yet?
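Points 0 and 1 above can be sketched in a few lines. This is only an illustration under stated assumptions: the index of coincidence is used as a stand-in for "looking at the histogram", the sample plaintext is made up, and `flattening_pad` is a hypothetical helper that tops every letter up to the count of the most frequent one.

```python
import random
import string
from collections import Counter

def index_of_coincidence(text):
    """Chance that two letters drawn without replacement are the same."""
    letters = [c for c in text.lower() if c in string.ascii_lowercase]
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))

def flattening_pad(text):
    """Pad letters that bring every letter up to the most frequent count."""
    counts = Counter(c for c in text.lower() if c in string.ascii_lowercase)
    top = max(counts.values(), default=0)
    return "".join(ch * (top - counts[ch]) for ch in string.ascii_lowercase)

# Hypothetical plaintext, letters only.
plain = "attackthenorthernbridgeatdawnandholdthesouthernroaduntilrelieved"

# A transposition only moves letters around, so the histogram is unchanged.
chars = list(plain)
random.Random(0).shuffle(chars)
shuffled = "".join(chars)

# Padding before the transposition flattens the histogram.
padded = plain + flattening_pad(plain)
```

The shuffled text has exactly the same index of coincidence as the plaintext, which is the tell in point 0. The padded text has a perfectly flat histogram, so that first-order test no longer distinguishes it from random letters.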
-
- If not - let's see if we can defeat the second order statistics. Recall
- that my suggestion (old as the hills, I'm sure) was to add pad
- characters to defeat the first order statistics attack. Can I instead
- add pad bytes to defeat the second order statistics? For
- example, you show an attack based on common triples - such as {t,h,e}
- and {a,n,d}. Does it help me to include lots of these triples,
- scrambled, in the padding?
-
- That is - if flattening the character histogram doesn't help, does it
- help to add (scrambled) padding with the *same* distribution of
- characters as the likely plaintext? The theory is that this will give
- the multiple anagrammer lots more "possible matches", and cause him to
- chase down too many blind alleys. Perhaps I can even arrange for the
- triples (quadruples, whatever) that you are looking for to match many
- times in the padding, rather than letting the statistics do this for me?
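The two padding ideas in this paragraph might look roughly like the sketch below. The function names, the reference text, and the choice of triples are all assumptions made for illustration. Nothing here measures whether such padding actually slows a multiple anagrammer.

```python
import random
from collections import Counter

def matched_padding(reference, n, rng):
    """Draw n pad letters with the single-letter frequencies of `reference`."""
    counts = Counter(c for c in reference.lower() if c.isalpha())
    letters = sorted(counts)
    weights = [counts[c] for c in letters]
    return "".join(rng.choices(letters, weights=weights, k=n))

def decoy_padding(triples, n_triples, rng):
    """Build padding from scrambled copies of common triples."""
    out = []
    for _ in range(n_triples):
        letters = list(rng.choice(triples))
        rng.shuffle(letters)
        out.extend(letters)
    return "".join(out)

rng = random.Random(1992)

# Hypothetical stand-in for "likely plaintext".
reference = "the enemy will attack the bridge at dawn and hold the road"

pad_a = matched_padding(reference, 40, rng)
pad_b = decoy_padding(["the", "and"], 10, rng)
```

`pad_a` relies on the statistics to produce false matches, while `pad_b` plants the letters of "the" and "and" deliberately. Both are inserted before the transposition step, so the opponent sees them scattered through the ciphertext along with the real message.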
-
- Ah...a light begins to dawn. Never mind...
-
- ==================
-
- My other point was - the padding scheme relies on the receiver being
- able to recognize the intended message. You point out that this annoys
- people. To get around this, I proposed that the padding be chosen so
- that no fragment of the padding formed a word in the lexicon - so that
- an automatic dictionary lookup could extract the words in the message
- from the jumble - this eliminates all whitespace, and eliminates the
- need for a human to look at the full padded text. I thought that an
- advantage of this scheme might be that message text would be more evenly
- distributed in the (untransposed) text. I asked if this would be
- counterbalanced by the "common lexicon" providing too much of a lever
- for the opponent.
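The receiver's side of the lexicon scheme could be sketched as follows. The three-word lexicon, the padded string, and the greedy longest-match scan are assumptions for illustration. A real lexicon would also have to rule out words formed across the boundary between a pad fragment and a message word, which this sketch does not check.

```python
# Hypothetical shared lexicon.
LEXICON = {"attack", "at", "dawn"}

def padding_is_clean(pad, lexicon):
    """True if no lexicon word appears anywhere inside the pad fragment."""
    return not any(word in pad for word in lexicon)

def extract_words(text, lexicon):
    """Scan left to right, taking the longest lexicon word at each position."""
    longest = max(map(len, lexicon))
    words, i = [], 0
    while i < len(text):
        for size in range(min(longest, len(text) - i), 0, -1):
            if text[i:i + size] in lexicon:
                words.append(text[i:i + size])
                i += size
                break
        else:
            i += 1  # no word starts here, so treat this character as padding
    return words

# Untransposed text: pad fragments "xq", "zzv", "kk", "qj" around the words.
padded = "xqattackzzvatkkdawnqj"
recovered = extract_words(padded, LEXICON)
```

Here `recovered` is the word list `["attack", "at", "dawn"]` with no whitespace or human inspection needed. The same lexicon is of course available to the opponent, which is the lever asked about above.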
-
- I gather from your one-line reply to this point that the restricted
- lexicon simply makes multiple anagramming easier. Is this still true if
- the padding has the same statistics as the message - and even contains
- intentionally misleading matches? It appears so - even matches in the
- padding tend to reveal bits of the transposition...oh well.
-
- But...distributing the message differently within the padded blocks
- makes it less likely that "common 3 letter combinations" will fall in
- the same place in 2 different messages. Does this matter?
-
- ===================
-
- Finally (I promise, I'll go away after this) - is multiple anagramming
- bothered by schemes which re-write the plain text in shorthand-style
- alphabets designed so that the second-order statistics are flattened?
- Perhaps this is too much of a tangent from the original point - if so,
- sorry.
-
- Thanks for the reply - I learned from it.
-
-
- --
- Kenneth Sloan Computer and Information Sciences
- sloan@cis.uab.edu University of Alabama at Birmingham
- (205) 934-2213 115A Campbell Hall, UAB Station
- (205) 934-5473 FAX Birmingham, AL 35294-1170
-