- [Sat Mar 2 10:32:33 PST 1996 KERNEL_BUG-HOWTO lm@sgi.com (Larry McVoy)]
-
- This is how to track down a bug if you know nothing about kernel hacking.
- It's a brute force approach but it works pretty well.
-
- You need:
-
- . A reproducible bug - it has to happen predictably (sorry)
- . All the kernel tar files from a revision that worked to the
- revision that doesn't
-
- You will then do:
-
- . Rebuild a revision that you believe works, install, and verify that.
- . Do a binary search over the kernels to figure out which one
- introduced the bug. For example, suppose 1.3.28 didn't have the bug, but
- you know that 1.3.69 does. Pick a kernel in the middle and build
- that, like 1.3.50. Build & test; if it works, pick the mid point
- between .50 and .69, else the mid point between .28 and .50.
- . You'll narrow it down to the kernel that introduced the bug. You
- can probably do better than this but it gets tricky.
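-
- A rough sketch of that search in shell, assuming all the revisions are
- patch levels of one series (1.3.N). The names midpoint, bisect_kernels
- and test_kernel are made up for illustration and are not part of any
- tool. test_kernel is a hook you would write yourself; in real life it
- means build, install, reboot, and try to reproduce the bug by hand.
-
```shell
#!/bin/sh
# Sketch only. Assumes revisions are patch levels N of one series (1.3.N).
# test_kernel is a hypothetical hook you supply: it must return 0 if
# 1.3.N works and non-zero if 1.3.N shows the bug.

# midpoint GOOD BAD: print the patch level halfway between the two.
midpoint() {
    echo $(( ($1 + $2) / 2 ))
}

# bisect_kernels GOOD BAD: halve the range until BAD is GOOD + 1,
# then print the first patch level that shows the bug.
bisect_kernels() {
    good=$1
    bad=$2
    while [ $(( bad - good )) -gt 1 ]; do
        mid=$(midpoint "$good" "$bad")
        if test_kernel "$mid"; then
            good=$mid    # mid works: the bug came in later
        else
            bad=$mid     # mid is broken: the bug came in here or earlier
        fi
    done
    echo "$bad"
}
```
-
- With 1.3.28 good and 1.3.69 bad, that comes to about six builds
- instead of forty.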
-
- . Narrow it down to a subdirectory
-
- - Copy the kernel that works into "test". Let's say that 1.3.62 works,
- but 1.3.63 doesn't. So you diff -r those two kernels and come
- up with a list of directories that changed. For each of those
- directories:
-
- Copy the non-working directory next to the working directory
- as "dir.63".
- One directory at a time, move the working directory aside as
- "dir.62" and put dir.63 in its place:
-
- mv dir dir.62
- mv dir.63 dir
- find dir -name '*.[oa]' -print | xargs rm -f
-
- And then rebuild and retest. Assuming that all related
- changes were contained in the sub directory, this should
- isolate the change to a directory.
-
- Problems: changes in header files may have occurred; I've
- found in my case that they were self explanatory - you may
- or may not want to give up when that happens.
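-
- The directory step can be scripted along these lines. This is only a
- sketch: changed_dirs, swap_in and swap_out are hypothetical helper
- names, the .62/.63 suffixes just follow the example above, and
- swap_out (putting the working directory back before trying the next
- one) is an assumption about what "one directory at a time" implies.
- Run swap_in and swap_out from inside the "test" tree.
-
```shell
#!/bin/sh
# Sketch only; all function names here are hypothetical helpers.

# changed_dirs OLD NEW: print the top-level subdirectories of NEW that
# differ from (or are missing in) OLD, one per line.
changed_dirs() {
    old=$1
    new=$2
    for d in "$new"/*/; do
        [ -d "$d" ] || continue
        name=$(basename "$d")
        if ! diff -r "$old/$name" "$new/$name" >/dev/null 2>&1; then
            echo "$name"
        fi
    done
}

# swap_in DIR: move the working DIR aside as DIR.62, put the non-working
# DIR.63 in its place, and remove stale objects so they get rebuilt.
swap_in() {
    dir=$1
    mv "$dir" "$dir.62" &&
    mv "$dir.63" "$dir" &&
    find "$dir" -name '*.[oa]' -print | xargs rm -f
}

# swap_out DIR: undo swap_in before trying the next directory.
swap_out() {
    dir=$1
    mv "$dir" "$dir.63" &&
    mv "$dir.62" "$dir" &&
    find "$dir" -name '*.[oa]' -print | xargs rm -f
}
```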
-
- . Narrow it down to a file
-
- - You can apply the same technique to each file in the directory,
- hoping that the changes in that file are self contained.
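-
- For the file step, something like the following may help. Again a
- sketch under assumptions: changed_files and swap_file are hypothetical
- names, the non-working copy of the directory is assumed to still sit
- next to the working one as "dir.63", and the object file is assumed
- to be named after the .c file.
-
```shell
#!/bin/sh
# Sketch only; both function names are hypothetical helpers.

# changed_files OLD_DIR NEW_DIR: print the regular files in NEW_DIR whose
# contents differ from (or are missing in) OLD_DIR, one per line.
changed_files() {
    old=$1
    new=$2
    for f in "$new"/*; do
        [ -f "$f" ] || continue
        name=$(basename "$f")
        cmp -s "$old/$name" "$f" 2>/dev/null || echo "$name"
    done
}

# swap_file DIR FILE: keep the working FILE as FILE.62, copy in the
# version from DIR.63, and remove the stale object so it gets rebuilt.
swap_file() {
    dir=$1
    file=$2
    mv "$dir/$file" "$dir/$file.62" &&
    cp "$dir.63/$file" "$dir/$file" &&
    rm -f "$dir/${file%.c}.o"
}
```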
-
- . Narrow it down to a routine
-
- - You can take the old file and the new file and manually create
- a merged file that has
-
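- #ifdef VER62
- /* routine as it appears in the working (.62) file */
- routine()
- {
- ...
- }
- #else
- /* routine as it appears in the non-working (.63) file */
- routine()
- {
- ...
- }
- #endif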
-
- And then walk through that file, one routine at a time, and
- wrap that routine's pair with
-
- #define VER62
- /* both routines here */
- #undef VER62
-
- Then recompile, retest, and move the #define/#undef pair to the
- next routine until you find the one that makes the difference.
-
- Finally, you take all the info that you have, kernel revisions, bug
- description, the extent to which you have narrowed it down, and pass
- that off to whoever you believe is the maintainer of that section.
- A post to linux.dev.kernel isn't such a bad idea if you've done some
- work to narrow it down.
-
- If you get it down to a routine, you'll probably get a fix in 24 hours.
-
- My apologies to Linus and the other kernel hackers for describing this
- brute force approach; it's hardly what a kernel hacker would do. However,
- it does work and it lets non-hackers help bug fix. And it is cool
- because Linux snapshots will let you do this - something that you can't
- do with vendor-supplied releases.
-
-