home *** CD-ROM | disk | FTP | other *** search
- {
- >I need to find the minimum and maximum values in large arrays (~20,000
- >points) of type word. Looking for a faster way using TP7 (assembler?)
- >to do it than:
- >
- >min := data^[1];
- >max := data^[1];
- >
- >for i := 2 to num_points do
- >begin
- > data_value := data^[i];
- > if data_value < min then min := data_value;
- > if data_value > max then max := data_value;
- >end;
- >
- Lets try some asm here:
-
- From: terjem@hda.hydro.com (Terje Mathisen)
- }
-
- Procedure FindMinMax(var data; num_points : word; var min, max : word);
- Assembler;
- asm
- push ds
- lds si,[data]
- mov cx,[num_points]
- sub cx,1
- jc @done {Empty array! }
- mov dx,[si] {Min value}
- lea si,[si+2] {Point at second table entry}
- mov bx,dx {Max value}
- jz @store {Single-entry array!}
-
- @loop:
- mov ax,[si]
- add si,2
- cmp ax,dx
- jb @new_min
- cmp ax,bx
- ja @new_max
- dec cx
- jnz @loop
- jmp @store
-
- @new_min:
- mov dx,ax {Save new min value}
- dec cx
- jnz @loop
- jmp @store
-
- @new_max:
- mov bx,ax {Save new max value}
- dec cx
- jnz @loop
-
- @store: {Return the values found!}
- lds si,[min]
- mov [si],dx
- lds si,[max]
- mov [si],bx
- @done:
- pop ds
- end;
-
- {
- This was written from scratch, so no testing whatsoever! It should be quite
- well optimized for both 486 and Pentium-class machines, running in about 10
- cycles/word on a 486, and just 4 cycles/word on a Pentium, since the
- 8 inner-loop instructions will pair perfectly. With 20,000 points in your
- array, this should correspond to 6ms on a 486-33, and less than a milli-
- second on a Pentium-90.
-
- The most important feature to note is that new extremal values will be quite
- rare, averaging just O(log(n)) for a random n-element array. That's why I
- jump out of the loop to handle these cases, making the normal case much
- faster. With a worst-case, already sorted array, we will find a new max
- value on each iteration, which will increase the running time to 12 cycles
- on the 486, while the Pentium will stay constant at 4 cycles.
-
- If the array is pre-sorted in reverse (declining) order, the 486 is back to
- 10 cycles, while the P5 is actually faster, at just 3 cycles/word.
-
- I think the only way to improve on this code is by unrolling it, which will
- save up to 4 cycles/word for the 486, and just a single P5 cycle.
-
- PS. Make sure that your array is naturally aligned (16-bit word), if not it
- will run a lot slower, esp. on a P5.
- }