Netware Super Library

Internet Message Format | 1990-07-26 | 8.2 KB

From nelson@sun.soe.clarkson.edu Thu Jul 26 15:25:24 1990
Received: from omnigate.clarkson.edu by pear.ecs.clarkson.edu with SMTP id AA2818 ; Thu, 26 Jul 90 15:25:23 GMT
Received: from sun.soe.clarkson.edu by omnigate.clarkson.edu id aa09445; 26 Jul 90 15:14 EDT
Received: by sun.soe.clarkson.edu (4.1/SMI-4.0) id AA01795; Thu, 26 Jul 90 15:14:20 EDT
Message-Id: <9007261914.AA01795@sun.soe.clarkson.edu>
Return-Path: <@po5.andrew.cmu.edu:ddp+@andrew.cmu.edu>
Date: Fri, 20 Jul 90 04:18:41 -0400 (EDT)
From: Drew Daniel Perkins <ddp+@andrew.cmu.edu>
To: nelson@sun.soe.clarkson.edu
Subject: Re:
In-Reply-To: <2802@pear.ecs.clarkson.edu>
References: <2802@pear.ecs.clarkson.edu>

nelson@pear.ecs.clarkson.edu writes:
> Why?
>
>	in	al,dx			;get master mask
>	and	al,not (1 shl 2)	; and clear slave cascade bit in mask
>	out	dx,al			;set new master mask (enable slave int)

That solves a bug that I'm truly amazed that you haven't run into before (I'm equally amazed that it exists). We have a few very old original IBM PC/ATs (the ones with the strange piggybacked 256KB DRAMs which together made 512KB). It seems that the BIOS on those old versions does NOT initialize the master 8259 for you so that the slave 8259 can interrupt. If you want it to (i.e. if you want to use ints 8-15), you had better make sure that the bit is cleared. The obvious outcome of leaving it set is that you don't get interrupts and the packet driver doesn't work. I think the reason that you didn't see this is that you didn't really have any cards that supported ints 8-15 (at least not WD varieties). In any case, these three instructions shouldn't ever cause anyone any harm.
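The effect of the three quoted instructions can be sketched in C as a small model. This is only an illustration under stated assumptions: it works on plain byte values rather than doing real port I/O, the names unmask_cascade and slave_irq_deliverable are hypothetical, and it assumes the AT wiring in which the slave 8259 feeds master input IR2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: on the PC/AT the slave 8259 output is wired to master IR2. */
#define CASCADE_IRQ 2

/* Same bit operation as `and al,not (1 shl 2)` on the master mask:
 * clear bit 2 so the slave's cascaded requests are no longer masked. */
uint8_t unmask_cascade(uint8_t master_imr)
{
    return (uint8_t)(master_imr & ~(1u << CASCADE_IRQ));
}

/* Hypothetical helper: can a request on IRQ 8..15 reach the CPU?
 * It must be unmasked at the slave AND the cascade input must be
 * unmasked at the master. A mask bit of 1 means "disabled". */
bool slave_irq_deliverable(uint8_t master_imr, uint8_t slave_imr, int irq)
{
    assert(irq >= 8 && irq <= 15);
    bool slave_open = !(slave_imr & (1u << (irq - 8)));
    bool cascade_open = !(master_imr & (1u << CASCADE_IRQ));
    return slave_open && cascade_open;
}
```

With a master mask of 0FCh (bit 2 set), slave_irq_deliverable(0xFC, 0x00, 10) is false even though the slave mask is fully open. After unmask_cascade the master mask becomes 0F8h and the same request can get through, which is the situation the message describes on the old PC/AT BIOS.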
Drew

From nelson@sun.soe.clarkson.edu Thu Jul 26 15:42:53 1990
Received: from omnigate.clarkson.edu by pear.ecs.clarkson.edu with SMTP id AA2821 ; Thu, 26 Jul 90 15:42:52 GMT
Received: from sun.soe.clarkson.edu by omnigate.clarkson.edu id aa09631; 26 Jul 90 15:31 EDT
Received: by sun.soe.clarkson.edu (4.1/SMI-4.0) id AA02405; Thu, 26 Jul 90 15:31:25 EDT
Message-Id: <9007261931.AA02405@sun.soe.clarkson.edu>
Return-Path: <ddp+@andrew.cmu.edu>
Date: Tue, 12 Jun 90 00:56:40 -0400 (EDT)
From: Drew Daniel Perkins <ddp+@andrew.cmu.edu>
To: pcip@twg.com, drivers@sun.soe.clarkson.edu
Subject: Dell System 325 hardware bug

I seem to have run into a real hardware bug in the Dell System 325 Chips & Technologies 8259 clone interrupt controller.

Summary: Sending this interrupt controller a Non Specific End of Interrupt (EOI) command causes it to reset all In Service Register (ISR) bits instead of only the most recent one with the highest priority.

Long Winded Explanation: I had a serious bug with my high-performance Western Digital wd80x3 packet driver. Transmitting and receiving on it at high rates caused it to go west in many different ways. After tearing my hair out for a while, I added logging code which logged all procedure entries and exits along with detailed chip status in a large ring. I finally discovered that the impossible was happening. During my packet copy routine (which can take > 1.5ms to copy a 1500 byte packet), my interrupt handler was being reentered and was trashing the stack. This "shouldn't happen" since I was not giving an EOI command to the interrupt controller until the very end of the interrupt handler. The interrupt handler did, however, reenable processor and ethernet chip interrupts fairly early. After tearing my hair out some more and checking my code thoroughly, I decided that I must be getting some other interrupt in the middle of my code somewhere.
I added some more logging code to record interrupt controller status, and changed my packet copy routine to enable processor interrupts AFTER the copy instead of before it. Sure enough, at the point the bug hit, my log indicated that the timer had fired and a timer interrupt was now pending. Also, I had received a new packet, and the ethernet chip also had a new interrupt pending although it was still blocked because it already had an interrupt in service. However, immediately after reenabling processor interrupts, my log indicated that my interrupt handler was reentered. This indicated to me that the timer interrupt handler was somehow resetting not only its ISR bit but mine also.

After disassembling the timer interrupt handler, I determined that the only thing it was doing was sending a Non Specific EOI to the primary interrupt controller (using mov al,20h; out 20h,al). To make my case for a hardware bug even stronger, I next coded my own timer interrupt handler. Just before and just after the mov/out instructions, I made log entries. Sure enough, while my log showed the ISR register reading 21h (IR5 and IR0 in service) just before the EOI was sent, it read 0 just after. I then changed the code to use a specific EOI instruction to reset the timer interrupt instead of a non specific EOI. The problem went away! Finally, I tested the code with a non specific EOI on a stock IBM PC/AT with a real Intel 8259. It didn't exhibit the problem.

Since I can't change the real timer interrupt handler (it's in BIOS), I had to use a different workaround. Just before reenabling processor interrupts, I now disable further ethernet device interrupts by setting its Interrupt Mask Register (IMR) bit. At the end of the interrupt handler, I reset the bit. This ensures that I can't get further device interrupts even if the timer interrupt clears my ISR bit.

Synopsis: If you write high performance drivers where:

1. The interrupt handler runs with other interrupts enabled at the processor and at the interrupt controller,
2. The interrupt handler reenables interrupts at the device while 1. is true and further device interrupts are possible before the interrupt handler again disables interrupts and returns,
3. You want the driver to work in clones with C&T chips

Then, you had better use a technique like the one I use to guarantee that you can't get reentrant interrupts.

Drew

From nelson@sun.soe.clarkson.edu Thu Jul 26 15:43:12 1990
Received: from omnigate.clarkson.edu by pear.ecs.clarkson.edu with SMTP id AA2822 ; Thu, 26 Jul 90 15:43:11 GMT
Received: from sun.soe.clarkson.edu by omnigate.clarkson.edu id aa09633; 26 Jul 90 15:31 EDT
Received: by sun.soe.clarkson.edu (4.1/SMI-4.0) id AA02415; Thu, 26 Jul 90 15:31:47 EDT
Message-Id: <9007261931.AA02415@sun.soe.clarkson.edu>
Return-Path: <nelson>
Date: Wed, 13 Jun 90 10:16:38 EDT
To: drivers@sun.soe.clarkson.edu
From: Drew Daniel Perkins <ddp+@andrew.cmu.edu>
Sender: nelson@sun.soe.clarkson.edu
In-Reply-To: <9006120558.AA14002@endor.harvard.edu>
Subject: Re: Dell System 325 hardware bug
Reply-To: nelson@clutx.clarkson.edu

ddl@das.harvard.edu (Dan Lanciani) writes:
> Do you have a little demo program I can try on machines?

Unfortunately, no I don't. Producing the bug took a lot of effort including 1 machine (a router) with two interfaces and two other machines pinging each other through the first. To generate enough traffic, the first machine had a packet exploder which generated 50 packets for every one going through it. I certainly believe that it should be possible to write a much simpler program to generate the bug, but I definitely don't have time to try it...

I guess what I would do is write a program that would:

1. Initialize a "reentered" variable to zero.
2. Cause some device to generate an interrupt.
3. Have the interrupt handler check the "reentered" variable. If it is equal to zero, continue to 4. Else, goto 8.
4. Reset the interrupt at the device.
5. Cause the device to generate another interrupt. 4 and 5 must be done in this order to get the interrupt controller's edge trigger latch to be set.
6. Reenable processor interrupts. Do NOT reenable interrupt controller interrupts. I.e. do NOT send an EOI.
7. Wait in an infinite loop. Higher priority (i.e. timer) interrupts should be able to occur but interrupts from this device should not.
8. Print "bad interrupt controller". If you got here then a timer interrupt fired and managed to reenable your device interrupts by doing a Non Specific EOI to a broken interrupt controller.

Drew
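The scenario in the two messages above can be walked through with a toy software model of a single 8259-style controller. This is a rough sketch under assumptions, not the demo program asked for and not real port I/O: pic_t, its clears_all_on_eoi flag (standing in for the reported clone behaviour) and device_reentered are all hypothetical names, and only fully nested priority with IR0 highest is modelled. The device is placed on IR5 and the timer on IR0, matching the 21h ISR reading reported in the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one 8259-style controller; all names are hypothetical. */
typedef struct {
    uint8_t irr;             /* latched, pending requests */
    uint8_t isr;             /* requests currently in service */
    uint8_t imr;             /* mask register: 1 = line disabled */
    bool clears_all_on_eoi;  /* models the reported clone bug */
} pic_t;

/* Lowest-numbered set bit is the highest priority (IR0 first). */
static int highest(uint8_t bits)
{
    for (int i = 0; i < 8; i++)
        if (bits & (1u << i))
            return i;
    return -1;
}

static void pic_raise(pic_t *p, int irq)
{
    assert(irq >= 0 && irq < 8);
    p->irr |= (uint8_t)(1u << irq);
}

/* Deliver the best unmasked request if it outranks everything in service. */
static int pic_ack(pic_t *p)
{
    int req = highest((uint8_t)(p->irr & ~p->imr));
    int svc = highest(p->isr);
    if (req < 0 || (svc >= 0 && req >= svc))
        return -1;
    p->irr &= (uint8_t)~(1u << req);
    p->isr |= (uint8_t)(1u << req);
    return req;
}

/* Non Specific EOI: a correct part clears only the highest in-service bit. */
static void pic_nonspecific_eoi(pic_t *p)
{
    if (p->clears_all_on_eoi) {
        p->isr = 0;
        return;
    }
    int svc = highest(p->isr);
    if (svc >= 0)
        p->isr &= (uint8_t)~(1u << svc);
}

static void pic_specific_eoi(pic_t *p, int irq)
{
    p->isr &= (uint8_t)~(1u << irq);
}

/* Returns true if the device handler would be reentered before its own EOI. */
bool device_reentered(bool buggy, bool timer_specific_eoi, bool mask_workaround)
{
    enum { TIMER = 0, NIC = 5 };
    pic_t p = { .clears_all_on_eoi = buggy };

    pic_raise(&p, NIC);
    if (pic_ack(&p) != NIC)          /* device handler starts, ISR = 20h */
        return false;
    if (mask_workaround)
        p.imr |= (uint8_t)(1u << NIC);  /* set own IMR bit before STI */
    pic_raise(&p, NIC);              /* second device edge is latched */
    pic_raise(&p, TIMER);
    if (pic_ack(&p) != TIMER)        /* timer preempts, ISR = 21h */
        return false;
    if (timer_specific_eoi)
        pic_specific_eoi(&p, TIMER);
    else
        pic_nonspecific_eoi(&p);
    return pic_ack(&p) == NIC;       /* is the device handler entered again? */
}
```

In this model, device_reentered(false, false, false) is false, which corresponds to the stock PC/AT result. device_reentered(true, false, false) is true, reproducing the reentry. Both device_reentered(true, true, false), the specific EOI experiment, and device_reentered(true, false, true), the IMR workaround, come out false.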