- Short: Patch CopyMem/Quick for 68060(040) v1.5
- Uploader: dbusse@primus-online.de (Dirk Busse)
- Author: dbusse@primus-online.de (Dirk Busse)
- Type: util/boot
- Requires: 68060 or 68040
-
- Description:
- This is a small patch that replaces the CopyMem and CopyMemQuick
- functions of exec.library.
-
- These functions are optimized for the 68060 processor. They should
- also work with the 68040 processor.
- The patch tests for a 68040 or 060 processor. If it can't find one,
- it doesn't install the patch and exits with a return code of 20 (=fail).
- It also fails if it can't allocate the necessary memory.
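-
- As a rough illustration of the check described above, here is a minimal
- sketch in portable C. It assumes the CPU type is read from the AttnFlags
- field of ExecBase; this readme does not say how CMQ060 actually detects the
- processor. The bit masks mirror AFF_68040 and AFF_68060 from
- exec/execbase.h, the return code mirrors RETURN_FAIL from dos/dos.h, and
- cmq_startup_code is a hypothetical name used only for this example.

```c
#include <stdint.h>

/* Values mirrored from exec/execbase.h and dos/dos.h so that the sketch
   compiles without the AmigaOS headers (an assumption of this example). */
#define AFF_68040   (1u << 3)
#define AFF_68060   (1u << 7)
#define RETURN_OK   0
#define RETURN_FAIL 20

/* Hypothetical helper: decide the exit code of the patch installer.
   attn_flags  - copy of the AttnFlags word
   have_memory - non-zero if the memory for the new routines was allocated */
int cmq_startup_code(uint16_t attn_flags, int have_memory)
{
    if ((attn_flags & (AFF_68040 | AFF_68060)) == 0)
        return RETURN_FAIL;   /* neither a 68040 nor a 68060 was found */
    if (!have_memory)
        return RETURN_FAIL;   /* allocation failed */
    return RETURN_OK;         /* safe to install the patch */
}
```

- On a real system the flags would come from SysBase->AttnFlags.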
-
- In some cases these new functions are four times faster than the
- original functions.
-
- Installation:
- Just copy CMQ060 into C:
- and add a CMQ060 line to your S:Startup-Sequence.
-
- Some notes about Move16:
- Move16 is a new assembler instruction of the 68040 and 060 processors. It
- moves 16 bytes at once, using burst accesses to do so.
- Andreas Kleinert and Thomas Richter told me there could be problems with
- the Move16 instruction on the Amiga, especially in Chipram, caused by
- the DMA of the custom chips.
- I couldn't reproduce such an error, but it may happen on other systems.
- So V1.4+ of CMQ060 doesn't use Move16 from or into memory below $01000000
- (Chipram, ZorroII-Fastram, I/O-Space, Kickstart,...). Move16 is only
- used when the source and destination addresses are both higher than
- $00ffffff (32-bit-Fastram,...).
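-
- The address rule above can be sketched in C as follows. This is only an
- illustration of the rule as stated; the real patch is written in
- assembler, and the names move16_allowed and MOVE16_MIN_ADDR are
- hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* First address above the 24-bit space ($00ffffff + 1). */
#define MOVE16_MIN_ADDR 0x01000000u

/* Hypothetical helper: true only when BOTH the source and the destination
   are above $00ffffff, i.e. in 32-bit Fastram rather than Chipram,
   ZorroII-Fastram, I/O space or Kickstart. */
bool move16_allowed(uint32_t src, uint32_t dst)
{
    return src >= MOVE16_MIN_ADDR && dst >= MOVE16_MIN_ADDR;
}
```

- A copy from $00001000 (Chipram) to $08000000 would therefore fall back to
- the copy loops without Move16, even though the destination alone qualifies.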
-
- (If you didn't get any errors with V1.3 and want the greatest speed
- improvement, you can use CMQ060Move16. It is identical to CMQ060
- V1.3 and also uses Move16 to and from Chipram. But you may run into
- problems.)
-
- (If you want to avoid all problems which Move16 could cause [the 68040
- has some Move16 bugs], you should use Aminet:util/boot/CMQ030. This one
- never uses Move16 and is still faster than the other available patches.)
-
- The source code is also in the archive.
-
- Author:
- Dirk Busse
- Kropsburgstraße 8
- D-67141 Neuhofen
- Germany
- <dbusse@primus-online.de>
- <100.141999@germanynet.de>
-
- How often are these functions used?
- Some people told me they couldn't notice a speed improvement.
- You won't get an overall speed improvement by a factor of two. But there
- is a small speed improvement, even if you can't notice it.
- To show you how often the patched functions are called, I've added two
- modified patches to Version 1.1b of this archive.
- CMQ060beep:
- Every time one of the patched functions CopyMem or CopyMemQuick is
- called, your AMIGA makes a DisplayBeep. After calling LoadWB your
- AMIGA beeps many times per second. If you boot your AMIGA without
- Startup-Sequence and install CMQ060beep, you can see that every
- AmigaDOS command like Dir, List, Avail, Resident... uses the patched
- functions.
- They all use the CopyMem function. And this is the function with
- the most speed improvement.
- CMQ060beepCMQ:
- This will only make a DisplayBeep if the patched CopyMemQuick
- function is called. So it shows you which programs use the
- patched CopyMemQuick function. For example: PageStream3.3 while moving
- a scrollbar or making a redraw, or TeleInfo2, or ... .
- The two patches above aren't for real use. They are only there to
- demonstrate how often the functions are used.
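-
- The idea behind the two demonstration patches can be sketched in portable
- C: a wrapper notes every call and then hands the work to the real copy
- routine. This is not the code from the archive. All names are
- hypothetical, a counter stands in for DisplayBeep, and memcpy stands in
- for the copy routine.

```c
#include <stddef.h>
#include <string.h>

/* Same argument order as CopyMem: source, destination, size. */
typedef void (*copy_fn)(const void *src, void *dst, size_t size);

unsigned long call_count;   /* stands in for the DisplayBeep */
copy_fn real_copy;          /* the routine that does the actual work */

/* Stand-in for the real copy routine. */
void plain_copy(const void *src, void *dst, size_t size)
{
    memcpy(dst, src, size);
}

/* Wrapper installed in place of the original function. */
void counting_copy(const void *src, void *dst, size_t size)
{
    call_count++;               /* CMQ060beep would beep here */
    real_copy(src, dst, size);  /* then copy as usual */
}
```

- After setting real_copy to plain_copy, every call of counting_copy both
- copies the data and increases call_count by one.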
-
-
- Speed comparison:
- There are already some similar patches available on the Aminet:
- CopyMemQuicker V2.8 from 1994 -> Aminet:util/boot/COPMQR28.lha
- PCM V1.0 from 1996 -> Aminet:util/boot/PCM_1.0.lha
- Also MCP patches these functions.
-
- CopyMemQuicker is optimized for the 68000, 010 and 020 processors.
- But on a 68060 (I think also on a 68040) you can get some more
- speed improvement.
-
- PCM is optimized for the 68040 and 060 processors. But some copy modes
- like Long to Even aren't optimized, and the copy mode Long+1 to Even+1
- takes twice as long as the original exec function.
- PCM works only with a 68040 or 060, because it also uses the Move16
- command (see the note above).
-
- In a lot of cases the patched functions from MCP are the slowest of all.
- Some copy modes are even slower than the original Kickstart 3.1
- functions.
-
- Here are some test results. All results are measured on the same
- AMIGA 2000 with a DKB WildFire060-50MHz:
-
- "TestIt" from CopyMemQuicker V2.8:
-
-                       original      CopyMemQuicker  MCP       PCM     CMQ030  CMQ060  CMQ060
-                       Kickstart3.1  V2.8            V1.32b12  V1.0    V1.1    V1.4    Move16
-                       routines                                                        V1.4
- CopyMem
-   565×64kB L->L       1.85          1.85            1.85      1.35    1.79    1.31    1.31
-   147×64kB L->L+1     1.33          1.14            1.07      1.07    0.47    0.45    0.47
-   413×64kB L->E       2.21          2.21            2.21      2.23    1.31    1.31    1.31
-   147×64kB L->E+1     1.35          1.15            1.07      1.07    0.45    0.45    0.45
-   147×64kB L+1->L     1.35          1.15            0.51      0.47    0.47    0.45    0.47
-   382×64kB L+1->L+1   2.11          1.23            2.88      0.91    1.21    0.89    0.87
-   147×64kB L+1->E     1.33          1.15            0.81      0.79    0.47    0.45    0.47
-   501×64kB L+1->E+1   1.71          1.70            3.81      3.71    1.59    1.57    1.59
-   501×64kB E->L       1.71          1.71            1.75      1.59    1.59    1.59    1.59
-   147×64kB E->L+1     1.33          1.15            1.11      1.07    0.47    0.47    0.47
-   382×64kB E->E       2.11          1.23            2.13      0.91    1.21    0.87    0.89
-   147×64kB E->E+1     1.35          1.13            1.13      1.09    0.47    0.45    0.45
-   147×64kB E+1->L     1.33          1.15            0.51      0.45    0.45    0.47    0.47
-   413×64kB E+1->L+1   2.19          2.19            3.15      3.05    1.31    1.29    1.29
-   147×64kB E+1->E     1.33          1.15            0.81      0.79    0.45    0.45    0.45
-   564×64kB E+1->E+1   1.81          1.81            4.31      1.35    1.79    1.31    1.31
-   33900×1kB L->L      1.10          1.11            1.13      1.31    1.03    1.04    1.07
-   9400×1kB L->L+1     1.17          0.93            0.91      0.86    0.29    0.29    0.27
-   24000×1kB E->E      1.70          0.80            1.68      0.92    0.74    0.75    0.75
-   196000×128B L->L    1.02          0.73            1.03      1.04    0.75    0.75    0.75
-   155000×128B E->E    1.61          0.63            1.55      1.05    0.62    0.60    0.59
-   588000×19B L->L     0.83          0.60            1.43      0.74    0.50    0.51    0.49
-   622000×18B L->L     0.81          0.51            1.43      0.77    0.51    0.49    0.51
-   663000×17B L->L     0.75          0.70            1.47      0.73    0.52    0.52    0.50
-   956000×16B L->L     0.79          0.71            1.98      1.00    0.58    0.51    0.50
-   1060000×8B L->L     0.85          0.79            1.17      1.01    0.58    0.52    0.52
-   1430000×4B L->L     0.73          0.61            1.09      1.14    0.45    0.39    0.41
-   2190000×1B L->L     0.67          0.61            0.73      0.84    0.33    0.57    0.62
- CopyMemQuick
-   565×64kB L->L       1.85          1.87            1.85      1.33    1.79    1.31    1.29
-   33900×1kB L->L      1.09          1.11            1.13      0.89    1.03    1.03    1.07
-   196000×128B L->L    0.99          0.71            1.03      0.81    0.73    0.73    0.73
-   956000×16B L->L     0.69          0.63            0.88      0.94    0.38    0.39    0.38
-   1060000×8B L->L     0.47          0.57            0.71      0.60    0.40    0.40    0.40
-   1430000×4B L->L     0.35          0.51            0.73      0.52    0.23    0.21    0.25
-
- "Test" from PCM V1.0 ("Test" moves a block of 500,000 bytes ten times).
- The columns are in the same order as in the "TestIt" results above.
-
- Fast->Fast
-   CopyMem             0.26          0.26            0.18      0.18    0.24    0.18    0.18
-   CopyMemQuick        0.26          0.26            0.18      0.20    0.26    0.18    0.18
- Chip->Fast
-   CopyMem             1.98          1.98            1.96      2.16    2.16    2.15    1.98
-   CopyMemQuick        1.98          1.98            1.98      2.16    2.16    2.16    1.98
- Fast->Chip
-   CopyMem             1.92          1.91            1.92      1.90    1.90    1.90    1.90
-   CopyMemQuick        1.92          1.92            1.92      1.90    1.90    1.88    1.90
- Chip->Chip
-   CopyMem             3.64          3.62            3.64      3.70    3.96    3.96    3.72
-   CopyMemQuick        3.62          3.62            3.62      3.70    3.94    3.94    3.72
-
-
- History:
- 1.0 (12.Sep.1998)
- - First public version.
- 1.1 (15.Sep.1998)
- - V1.0 exits with a return code of 10 (=error) if it can't find
- a 68040 or 68060 or can't get the necessary memory.
- V1.1 exits, in these cases, with a return code of 20 (=fail).
- - Fixed a mistake in the readme.
- 1.1b (19.Sep.1998)
- (I didn't change the patch itself! It's the same as V1.1)
- - Added the test results of MCP V1.30 to the readme.
- - Added CMQ060beep and CMQ060beepCMQ (see above).
- 1.2 (29.Nov.1998)
- - Added the test results of MCP V1.32b12 to the readme.
- - Changed the source code.
- There was a problem with an incorrectly written program that expects
- the address of the last source byte +1 in A0 and the address
- of the last destination byte +1 in A1.
- This version of CMQ060 solves problems with such badly written programs.
- It's now 100 bytes longer, but the speed is the same. Big moves
- by the CopyMem function will be one or two cycles faster, but
- you won't notice it.
- 1.3 (5.Jan.1999)
- None of the changes made in this version affect the speed. They
- are only there to avoid problems with future versions of AMIGA OS.
- - changed the version string to the "standard" format
- - changed BMI to BCS and BPL to BCC
- -> now CMQ030 can move blocks bigger than 2 GigaByte ;-)
- 1.4 (3.Apr.1999)
- - CMQ060 now doesn't use Move16 into/from memory below $01000000
- - added CMQ060Move16 (It's the same as CMQ060 V1.3)
- - added the test results of CMQ030 (Does never use Move16)
- 1.5 (11.Jul.1999)
- - Fixed the Move16 workaround, which in rare cases caused problems
- (thanks for the report, Jim).
- - Sped up one copy section.
-
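- A note on the BMI/BCS change listed under 1.3: the following C sketch shows
- why a sign-based test limits the block size to 2 GigaByte while a
- carry-based test does not. It assumes a loop that subtracts a chunk size
- from the remaining length and then branches; the actual loop in the source
- code may look different, and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models "sub.l chunk,remaining / bmi done": the loop ends when bit 31 of
   the result is set, so any length of $80000000 or more looks finished. */
bool done_bmi(uint32_t remaining, uint32_t chunk)
{
    return ((remaining - chunk) & 0x80000000u) != 0;
}

/* Models "sub.l chunk,remaining / bcs done": the loop ends only when the
   subtraction borrows, i.e. when remaining is really smaller than chunk. */
bool done_bcs(uint32_t remaining, uint32_t chunk)
{
    return remaining < chunk;
}
```

- For a length of $90000000 and a chunk of 16 bytes, done_bmi reports that
- the copy is finished although nothing was copied, while done_bcs continues.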