- Newsgroups: comp.unix.bsd
- Path: sparky!uunet!europa.asd.contel.com!emory!swrinde!cs.utexas.edu!news.uta.edu!utacfd.uta.edu!rwsys!sneaky!gordon
- From: gordon@sneaky.lonestar.org (Gordon Burditt)
- Subject: [386bsd] gdb and 386bsd vs. floating point problems (+gdb patches)
- Message-ID: <C1EH04.45M@sneaky.lonestar.org>
- Organization: Gordon Burditt
- Date: Mon, 25 Jan 1993 08:02:23 GMT
- Lines: 247
-
- Is anyone else having terrible floating point problems on 386bsd?
- I'm running on a 486DX/33, so the coprocessor is built in.
- I seem to be getting npx stack-underflow faults in contexts where it
- makes absolutely no sense to be getting them (e.g. fldl 16(%ebp);
- faddl 44(%ebp); fstpl 24(%ebp), where I get a stack-empty fault on the
- fstpl instruction). Breaking npxprobe() so that it does not recognize
- the coprocessor, and using the math emulator instead, makes the program
- run VERY slowly, but it works. The math emulator doesn't seem to
- generate stack-underflow faults.
-
- A certain program I'm trying to port keeps dying on SIGFPEs. It's a
- client of a server program run on the same machine, and so far only the
- client gets the SIGFPEs, although both use floating-point, and there's
- a lot of context-switching going on also. So I get out gdb, and the
- first thing I discover is that gdb doesn't do anything useful when you
- ask for a stack trace, and complains about "Operation not permitted".
- I discover that the user virtual address for the stack end is not a
- constant, so I fix gdb to properly access the stack in core dumps:
-
- Index: /usr/src/usr.bin/gdb/config/i386bsd-dep.c
- ***************
- *** 942,947 ****
- --- 942,957 ----
- */
- reg_offset = (int) u.u_ar0 - KERNEL_U_ADDR;
- #else
- + /*
- + * 386bsd does not put the stack end in a fixed virtual
- + * location, so we get the beginning and depend on the
- + * MAXSSIZ constant for the full length of the stack to
- + * find the end.
- + * (See code & comments in kern_execve.c, search for USRSTACK)
- + */
- + stack_end = (CORE_ADDR) u.u_kproc.kp_eproc.e_vm.vm_maxsaddr
- + + MAXSSIZ;
- +
- data_end = data_start +
- NBPG * u.u_kproc.kp_eproc.e_vm.vm_dsize;
- stack_start = stack_end -
-
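The stack-end calculation in the hunk above can be sketched in isolation. This is a minimal illustration, not gdb code: EXAMPLE_MAXSSIZ is a made-up stand-in for the kernel's MAXSSIZ, and the stack size in bytes is passed in as a parameter because the hunk is cut off before it shows where gdb gets it.

```c
#include <stdint.h>

/* Assumed value, for illustration only; the real MAXSSIZ comes from
 * the kernel headers of the system being debugged. */
#define EXAMPLE_MAXSSIZ ((uintptr_t)8 * 1024 * 1024)

struct stack_range {
    uintptr_t start;  /* lowest address currently used by the stack */
    uintptr_t end;    /* one past the highest stack address */
};

/* vm_maxsaddr is the lowest address the stack may ever grow down to,
 * so the fixed-size region above it ends at vm_maxsaddr + MAXSSIZ.
 * stack_bytes is a hypothetical parameter: the current stack size. */
static struct stack_range
stack_range_from_base(uintptr_t vm_maxsaddr, uintptr_t stack_bytes)
{
    struct stack_range r;

    r.end = vm_maxsaddr + EXAMPLE_MAXSSIZ;
    r.start = r.end - stack_bytes;
    return r;
}
```

The point of the design is that only the base of the region is read from the core file; the end is derived from it rather than assumed to be a compile-time constant.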
- Ok, now I can get a stack trace. The SIGFPEs seem to be coming at random
- places all over the code. Further, getting an assembly file for the problem
- code, adding "fwait" instructions before and after every floating-point
- instruction, assembling it, and testing the new code doesn't change anything.
-
- "info float" in gdb doesn't do anything. The code is conditionally compiled out.
- So I fixed it. Well, this is a bit kludgey, and I'd love to have someone
- point out a mistake so that the problem really isn't as weird as it seems,
- but it seems to work. I couldn't figure out where to get an exception
- status value in addition to the stored one. There are some fundamental
- disagreements between the original code and my reading of Intel manuals
- regarding the order of floating-point registers saved by fsave/fnsave.
- Nothing but code in gdb seems to care, though.
-
- Index: /usr/src/usr.bin/gdb/config/i386bsd-dep.c
- ***************
- *** 1758,1771 ****
-
- top = (ep->status >> 11) & 7;
-
- ! printf ("regno tag msb lsb value\n");
- ! for (fpreg = 7; fpreg >= 0; fpreg--)
- {
- double val;
-
- ! printf ("%s %d: ", fpreg == top ? "=>" : " ", fpreg);
-
- ! switch ((ep->tag >> (fpreg * 2)) & 3)
- {
- case 0: printf ("valid "); break;
- case 1: printf ("zero "); break;
- --- 1758,1773 ----
-
- top = (ep->status >> 11) & 7;
-
- ! printf ("regno tag msb lsb value\n");
- ! for (fpreg = 0; fpreg <= 7; fpreg++)
- {
- double val;
-
- ! printf ("%s ST%d: ", ((fpreg+top)&7) == 7 ? "=>" : " ", fpreg);
-
- ! /* according to Intel 486 documentation, the registers are stored */
- ! /* in LOGICAL order but the tag bits correspond to PHYSICAL registers */
- ! switch ((ep->tag >> (((top + fpreg)&7) * 2)) & 3)
- {
- case 0: printf ("valid "); break;
- case 1: printf ("zero "); break;
- ***************
- *** 1787,1792 ****
- --- 1789,1804 ----
- if (ep->r3)
- printf ("warning: reserved3 is 0x%x\n", ep->r3);
- }
- + #ifdef __386BSD__
- + /*
- + * 386BSD name for saved fpu state. This had better have the same
- + * layout as the env387 struct. Note that the size of struct fpacc87
- + * in <machine/npx.h> is actually wrong, due to struct padding, but
- + * the data layout seems to be correct anyway.
- + */
- + #define U_FPSTATE(u) u.u_pcb.pcb_savefpu
- + #define fpstate save87
- + #endif
-
- #ifndef U_FPSTATE
- #define U_FPSTATE(u) u.u_fpstate
- ***************
- *** 1798,1823 ****
- --- 1810,1851 ----
- int i;
- #ifndef __386BSD__
- /* fpstate defined in <sys/user.h> */
- + #else
- + /* save87 defined in <machine/npx.h> */
- + #endif
- struct fpstate *fpstatep;
- char buf[sizeof (struct fpstate) + 2 * sizeof (int)];
- unsigned int uaddr;
- + #ifndef __386BSD__
- char fpvalid;
- + #else
- + int fpvalid;
- + #endif
- unsigned int rounded_addr;
- unsigned int rounded_size;
- extern int corechan;
- int skip;
-
- + #ifndef __386BSD__
- uaddr = (char *)&u.u_fpvalid - (char *)&u;
- + #else
- + uaddr = (char *)&u.u_pcb.pcb_flags - (char *)&u;
- + #endif
- if (have_inferior_p())
- {
- unsigned int data;
- unsigned int mask;
-
- + #ifndef __386BSD__
- rounded_addr = uaddr & -sizeof (int);
- data = ptrace (3, inferior_pid, rounded_addr, 0);
- mask = 0xff << ((uaddr - rounded_addr) * 8);
-
- fpvalid = ((data & mask) != 0);
- + #else
- + data = ptrace(3, inferior_pid, (caddr_t)uaddr, 0);
- + fpvalid = (data & FP_WASUSED) != 0;
- + #endif
- }
- else
- {
- ***************
- *** 1825,1831 ****
- perror ("seek on core file");
- if (myread (corechan, &fpvalid, 1) < 0)
- perror ("read on core file");
- !
- }
-
- if (fpvalid == 0)
- --- 1853,1861 ----
- perror ("seek on core file");
- if (myread (corechan, &fpvalid, 1) < 0)
- perror ("read on core file");
- ! #ifdef __386BSD__
- ! fpvalid = (fpvalid & FP_WASUSED) != 0;
- ! #endif
- }
-
- if (fpvalid == 0)
- ***************
- *** 1847,1853 ****
- ip = (int *)buf;
- for (i = 0; i < rounded_size; i++)
- {
- ! *ip++ = ptrace (3, inferior_pid, rounded_addr, 0);
- rounded_addr += sizeof (int);
- }
- }
- --- 1877,1883 ----
- ip = (int *)buf;
- for (i = 0; i < rounded_size; i++)
- {
- ! *ip++ = ptrace (3, inferior_pid, (caddr_t)rounded_addr, 0);
- rounded_addr += sizeof (int);
- }
- }
- ***************
- *** 1861,1866 ****
- --- 1891,1900 ----
- }
-
- fpstatep = (struct fpstate *)(buf + skip);
- + # ifdef __386BSD__
- + /* not sure where to get exception status */
- + print_387_status (0, (struct env387 *)fpstatep);
- + #else
- print_387_status (fpstatep->status, (struct env387 *)fpstatep->state);
- #endif
- }
-
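The logical-versus-physical register mapping that the patch relies on can be sketched on its own. This assumes the standard x87 layout (TOP in bits 11-13 of the status word, two tag bits per physical register); the function and enum names are hypothetical and do not appear in gdb.

```c
#include <stdint.h>

/* Two-bit tag values, one pair per PHYSICAL register in the tag word. */
enum x87_tag {
    X87_TAG_VALID   = 0,
    X87_TAG_ZERO    = 1,
    X87_TAG_SPECIAL = 2,  /* NaN, infinity, denormal, unsupported */
    X87_TAG_EMPTY   = 3
};

/* TOP-of-stack field: bits 11..13 of the status word. */
static unsigned x87_top(uint16_t status)
{
    return (status >> 11) & 7;
}

/* ST(i) lives in physical register (TOP + i) mod 8. */
static unsigned x87_physical_reg(uint16_t status, unsigned st_index)
{
    return (x87_top(status) + st_index) & 7;
}

/* Look up the tag for logical register ST(st_index). */
static enum x87_tag
x87_tag_for_st(uint16_t status, uint16_t tag_word, unsigned st_index)
{
    unsigned phys = x87_physical_reg(status, st_index);

    return (enum x87_tag)((tag_word >> (phys * 2)) & 3);
}
```

With TOP = 6 and only physical register 6 tagged valid, ST(0) reads as valid and ST(1) (physical register 7) reads as empty, which is the situation after a single fld on a freshly initialized stack that has wrapped down to 6.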
- Ok, now I can see what happens when I get a SIGFPE. Invariably the
- exceptions shown are INVALID, LOS, and FSTACK, and the stack is shown
- as empty. The address of the last npx exception is often 0, but sometimes
- it shows the address of an instruction that's actually in my code.
-
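For reference, the exception flags named above sit in the low bits of the x87 status word. The sketch below decodes them, assuming the standard bit layout; the label strings are chosen to match the ones quoted in this post where possible and are otherwise hypothetical, and none of this is gdb's actual code. When the stack-fault bit is set, condition bit C1 (bit 9) distinguishes overflow (1) from underflow (0).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

static const struct {
    uint16_t mask;
    const char *name;
} x87_exc[] = {
    { 0x0001, "INVALID" },  /* invalid operation */
    { 0x0002, "DENORM"  },  /* denormalized operand */
    { 0x0004, "DIVZ"    },  /* divide by zero */
    { 0x0008, "OVERF"   },  /* numeric overflow */
    { 0x0010, "UNDERF"  },  /* numeric underflow */
    { 0x0020, "LOS"     },  /* loss of precision */
    { 0x0040, "FSTACK"  }   /* register stack fault */
};

/* Write a space-separated list of the exception flags set in status
 * into out; the result is truncated if the buffer is too small. */
static void
x87_describe_exceptions(uint16_t status, char *out, size_t outlen)
{
    size_t i, used = 0;

    if (outlen == 0)
        return;
    out[0] = '\0';
    for (i = 0; i < sizeof x87_exc / sizeof x87_exc[0]; i++) {
        if (status & x87_exc[i].mask) {
            int n = snprintf(out + used, outlen - used, "%s%s",
                             used ? " " : "", x87_exc[i].name);
            if (n < 0 || (size_t)n >= outlen - used)
                return;  /* output truncated; stop here */
            used += (size_t)n;
        }
    }
}

/* Nonzero if status reports an invalid-operation stack fault with
 * C1 clear, i.e. a stack underflow rather than an overflow. */
static int x87_is_stack_underflow(uint16_t status)
{
    return (status & 0x0041) == 0x0041 && (status & 0x0200) == 0;
}
```

A status word of 0x0061 decodes to "INVALID LOS FSTACK" with C1 clear, which is the underflow pattern described in this post.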
- Ok, so why am I getting these exceptions? I can think of several
- reasons:
-
- - Flakey hardware. It's fairly new hardware, but there might be problems
- with it. It's a 486DX, though, so there's less a motherboard manufacturer
- can goof up than if they were wiring a 386 and 387 together.
- - Flakey code generation by gcc. Well, when I see code like:
-
- fldl 16(%ebp)
- faddl 44(%ebp)
- fstpl 24(%ebp)
- fwait
- ...
-
- if I'm going to get an exception from the fstp (why? - the stack should
- have something on it), shouldn't I get the exception at the fwait
- or before (486 eip, not npx exception address), NOT dozens to hundreds
- of instructions later? And what's wrong with this code, anyway? There
- should be something on the stack.
-
- - Flakey library code. I suspected for a while that library code was
- doing "fninit" instructions when it shouldn't, but I think I have
- isolated the problem to exclude library code.
-
- - Flakey OS floating point save/restore code. I suspect this the most,
- but I haven't been able to prove it. I've tried putting fwait
- instructions around just about every floating-point instruction
- in the kernel, and nothing makes much difference. (If you do an
- fnsave, which starts saving the context, and yank the address space
- out from under the npx by re-loading control register 3 or doing
- a task switch, don't you NEED an fwait first? There isn't one. But
- fixing it doesn't change the problem.) I tried setting the 486 NE
- bit in control register 0. No change. I'm still wondering what
- those outb calls for ports 0xb1 and 0xf0 do.
-
- - Flakey debugger. If my fixes to "info float" don't do what I think
- they do, the problem might be something entirely different.
-
- Gordon L. Burditt
- sneaky.lonestar.org!gordon
-