home *** CD-ROM | disk | FTP | other *** search
- Xref: sparky sci.crypt:7059 alt.security.pgp:583
- Path: sparky!uunet!munnari.oz.au!comp.vuw.ac.nz!cavebbs!tornado!sai
- From: sai@tornado.welly.gen.nz (Simon McAuliffe)
- Newsgroups: sci.crypt,alt.security.pgp
- Subject: Was Sidelnikov right???
- Message-ID: <7V2qXB1w165w@tornado.welly.gen.nz>
- Date: Thu, 21 Jan 93 23:03:17 +1300
- Organization: Tornado BBS, Wellington, New Zealand.
- Lines: 189
-
- After reading Dr. Sidelnikov's analysis of PGP I, along with many other
- crypt/pgp readers thought something must be wrong with his analysis and
- since nobody else seemed to be doing much other than saying it was being
- verified I decided to have a look myself.
-
- Now I'm no statistics expert, in fact quite the opposite, so please
- excuse any blunders and try to forgive the rather informal treatment I
- give the problem.
-
- PGP has three random number generators. One of these is based on timing
- between keystrokes and I didn't look at this one. I haven't looked at
- the code, but I guess it uses 4 bits from each 16 bit 8253 (PC version)
- timing value (since you need to enter twice as many keystrokes as random
- bytes you get from it). And if the low 4 bits are used, I guess they're
- fairly random, and the MD5 hashing must help things along nicely too.
- In any case, I ignored all of this.
-
- There are also two other random number generators. One of these is not
- intended to be even remotely secure, in fact it is probably just about
- one of the simplest and least random of any RNG you have ever seen;
-
- static int randseed=0; /* used only by pseudorand() function. */
- int pseudorand(void)
- /* Home-grown 16-bit LCG pseudorandom generator. */
- { randseed = (randseed*31421 + 6927) & 0xffff;
- return (randseed);
- } /* pseudorand */
-
- This random number generator is used when the "pool" of good random
- numbers from the first RNG (mentioned above) runs out. However the
- number of values in the pool is alway sufficient for key generation etc
- so this RNG is never called for anything important. It will be called at
- several points when overwriting information in memory etc etc.
-
- A function randombyte() returns a random number from the pool of good
- numbers or it returns a weak pseudo RN if the pool is empty.
-
- I decided that perhaps Dr. Sidelnikov had seen this function
- randombyte() and decided that it was PGP's RNG and tested this one. Of
- course without initialising the pool of numbers this function would
- return the terrible weak numbers. This seemed to explain to me why PGP
- was failing the tests.
-
- In fact this random number generator is so poor that it simply repeats a
- permuted sequence of 256 bytes over and over. I tested the linear
- correlation and found it to be 0.016750. Aha! Terrible! Okay, that
- must be what Dr. Sidelnikov did.
-
- In any case, I decided to check the third RNG as well (I was bored and
- there was always a chance he did test this after all). The third RNG is
- used for creating random session keys for encryption of messages with
- IDEA. The RNG itself uses IDEA to generate the random numbers. In
- order to simulate PGP conditions fairly closely I generated my test data
- in blocks of 16 random bytes (as PGP does) just in case this has some
- effect. For each collection of data, I generated 65536 blocks giving
- files of 1048576 (1 meg) in size and I created 20 such files. This
- required a small change to crypto.c so that instead of rewriting the
- file each time, it stored it in main memory. For each file, the "truely
- random" key timing random number generator was called for the initial
- seed value.
-
- > - the sequence of random numbers has strong prevalences on
- > bytes (up to 0.05 ... 0.1 on material of 10000 byte) and strong
- > correlation dependence between contiguous bytes;
-
- Strong prevalences? I assumed this to mean certain characters showed up
- more than others. I found this hard to believe, since this must surely
- show some kind of weakness in IDEA so I tested this first:
-
- I'm not sure how statistically valid my tests are, but I used a Chi^2
- goodness-of-fit-test to check if each occurence of byte values matched
- the flat distribution it should have. Each byte should have occured
- 4096 times in a perfectly flat distribution. The following table shows
- the results and the observed significance level of the test (Chi^2 with
- 255 degrees of freedom).
-
- Roughness | OSL (Chi^2(255))
- ------------+-----------------
- 258.034668 | 43.51% (high values in the OSL column
- 265.982422 | 30.54% mean a good fit. For testing
- 278.838379 | 14.60% at level x%, any value >x
- 239.045898 | 75.56% means the test is passed).
- 276.901855 | 16.54%
- 286.939941 | 8.25%
- 247.055176 | 62.78%
- 252.427734 | 53.38%
- 268.672363 | 26.62%
- 239.907715 | 74.29%
- 211.188477 | 97.90%
- 292.719238 | 5.22%
- 229.750977 | 87.02%
- 263.931152 | 33.71%
- 278.236816 | 15.18%
- 271.500977 | 22.82%
- 253.606934 | 51.27%
- 278.595215 | 14.84%
- 237.597168 | 77.61%
- 252.171875 | 53.83%
- Mean = 259.155249 --> 41.59% (Can I really do this? No? Ooops...)
-
- Well I'm not going into more detail on this one, but a hypothesis test
- at the 5% level (or 1%) shows the data to fit a flat distribution quite
- nicely. So what did Sidelnikov mean? Who knows...
-
- Oh well so much for that, but what about the "strong correlation." This
- seems to be what has more people worried. I decided to check this out
- next. I had seen the correlation test a while ago, but didn't know the
- formula so I used the one posted. I had enormous trouble getting this to
- work because of the typo's but after looking at some books myself I
- found the correct formula in a statistics textbook (of course a
- correction has been posted now but a bit late). Anyway, plugging in the
- numbers to the other formulae posted gives:
-
- mu = 0.000001907
- sigma = 0.001381073
- for n = 524288 (half a meg)
-
- So, we want to do some hypothesis tests something along the lines of;
-
- H0n: Cn = mu (null hypothesis)
- H1n: Cn =/= mu (alternative hypothesis)
-
- Following are the tests for each of the 20 sample files. The table
- shows the correlation value, the normalised/standardised value and
- corresponding significance level.
-
- We expect a critical value fairly near 0 when standardised if H0 is to
- be accepted.
-
- C | Crit +/- | Two sided OSL
- ----------+------------+--------------
- -0.000151 | -80.1672 | 0.0000 (Ouch! The OSL should be high
- 0.001282 | 671.1347 | 0.0000 if we want to accept H0)
- -0.001413 | -741.8161 | 0.0000
- 0.001116 | 584.1032 | 0.0000
- 0.001013 | 530.1017 | 0.0000
- -0.002072 |-1087.3206 | 0.0000
- 0.000340 | 177.2572 | 0.0000
- -0.001034 | -543.1117 | 0.0000
- -0.001972 |-1034.8920 | 0.0000
- -0.000230 | -121.5858 | 0.0000
- 0.000508 | 265.3373 | 0.0000
- -0.000425 | -223.8215 | 0.0000
- 0.002383 | 1248.3735 | 0.0000
- -0.001605 | -842.4790 | 0.0000
- 0.000506 | 264.2887 | 0.0000
- 0.002695 | 1411.9508 | 0.0000
- -0.001196 | -628.0461 | 0.0000
- 0.000450 | 234.9287 | 0.0000
- 0.000016 | 7.3886 | 0.0000
- 0.002822 | 1478.5351 | 0.0000
-
- This is kinda frightening... Every single test failed miserably,
- supporting the alternative hypothesis. Even if I tried my best I
- couldn't fiddle these tests. These tests simply cannot be passed at any
- significance level.
-
- It looks like _THIS_IS_ what Dr. Sidelnikov also did, and it means one
- of three things: (ranked in order of likelihood as I see it)
-
- 1) Both Dr. Sidelnikov and I stuffed up.
-
- 2) This correlation test isn't appropriate for this situation.
-
- 3) Something funny is going on and either the RNG is badly
- implemented or IDEA has a problem.
-
- Number 1 sounds like the most likely and number 3 is just plain scary.
-
- Incidentally, it's of no consequence whatsoever, but just in case
- anybody wants know, there were no cycles in the 20 samples files. At
- least this is something! Of course it really means nothing because
- using a 1048576 byte sample isn't really a sensible way of testing for
- cycles.
-
- Well, after all this, perhaps Dr. Sidelnikov isn't just a crock. If I
- didn't make any mistakes, there may be something to what he says after
- all. Can somebody who really knows some real statistics shed a brighter
- light on the subject?
-
- I hope this has been of some interest/help in any case.
-
-
-
- Sai (e-mail sai@tornado.welly.gen.nz)
- (s-mail PO Box 3598 Wellington, New Zealand)
-
-
- PS Can we have a little more sci in sci.crypt?
-