- Newsgroups: comp.windows.x.i386unix
- Path: sparky!uunet!math.fu-berlin.de!informatik.tu-muenchen.de!roell
- From: roell@informatik.tu-muenchen.de (Thomas Roell)
- Subject: Re: 32K or 16M colours?
- In-Reply-To: mouse@thunder.mcrcim.mcgill.edu's message of 23 Dec 92 17:19:33 GMT
- References: <linda.724803368@minnie> <1992Dec20.000426.21774@cbnewsj.cb.att.com>
- <1992Dec22.130619.18414@news.lrz-muenchen.de>
- <1992Dec23.171933.11736@thunder.mcrcim.mcgill.edu>
- Sender: news@Informatik.TU-Muenchen.DE (USENET Newssystem)
- Organization: Inst. fuer Informatik, Technische Univ. Muenchen, Germany
- Date: Thu, 24 Dec 1992 11:58:11 GMT
- Message-ID: <1992Dec24.115811.1540@Informatik.TU-Muenchen.DE>
- Lines: 55
-
- >>> Take a look at how the 24-bit SVGA's are implemented. Packed 3-byte
- >>> pixels. It's going to make working on them rediculous[sic].
- >> I don't think 3byte pixels is that rediculus[sic] if you look at the
- >> memory constraints given in today's SVGAs. But the price to pay is
- >> VERY high. You can only deal with 24bpp (as opposed to 32bpp) by using
- >> 32bit reads/writes. A simple statistic shows that about every second
- >> access will not be correctly aligned and hence at least two 32bit
- >> accesses to video memory will be necessary. Having this in mind, a
- >> 24bpp framebuffer will be at least 1.5 times slower than a 32bpp one.
- >
- >Only if the implementation is careless. You should be doing three
- >32-bit memory accesses for every four pixels. Only if you're doing
- >something narrow (ie, small span along the x axis) will the
- >misalignment effect make much difference. Granted, drawing lines is a
- >common operation, but I believe area copies and text drawing are much
- >more important for nearly all applications.
-
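The two access counts under discussion can be checked with a few lines of C. This is only a sketch: it assumes a pixel at index i occupies bytes 3*i .. 3*i+2 of a span that starts on a 32-bit boundary, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Number of aligned 32-bit words touched by the 3-byte pixel at index i
 * (assumes the span starts on a 32-bit boundary). */
static unsigned words_touched(size_t i)
{
    size_t first = (3 * i) / 4;      /* word holding the pixel's first byte */
    size_t last  = (3 * i + 2) / 4;  /* word holding the pixel's last byte  */
    return (unsigned)(last - first + 1);
}

/* "Careless" way: every pixel is accessed on its own. */
static size_t per_pixel_accesses(size_t n_pixels)
{
    size_t total = 0;
    for (size_t i = 0; i < n_pixels; i++)
        total += words_touched(i);
    return total;
}

/* Grouped way: each word covered by the span is accessed once. */
static size_t grouped_accesses(size_t n_pixels)
{
    return (3 * n_pixels + 3) / 4;
}

/* Four pixels cost 6 accesses pixel by pixel (1.5x the 4 accesses of
 * 32bpp), but only 3 when handled as one 96-bit group. */
static void check_ratio(void)
{
    assert(per_pixel_accesses(4) == 6);
    assert(grouped_accesses(4) == 3);
}
```

Pixels 1 and 2 of every group of four straddle a word boundary, which is where "about every second access" comes from.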
- Hmmm .... very interesting. I tried doing it the 'careless' way. Anyway,
- I have some perhaps strong arguments against your approach (or at least
- some thoughts that might make it not as good as it seems at first,
- though it is still faster than the dumb brute-force approach):
-
- 1) Much of the speed of cfb comes from the fact that basically all unary
- operations (everything except CopyArea and CopyPlane) are reduced to:
-
- (((dst) & (and)) ^ (xor))
-
- "and" and "xor" are precomputed constants (with optimisations if xor == 0 or
- and == PMSK). If you used units of 96 bits (for four pixels) you
- would have to use 6 constants instead of 2, which might break the register
- allocation of your compiler.
-
- 2) The way cfb handles ragged edges would have to be changed to make
- as few accesses as possible, instead of reading one pixel unit (32 bits
- for cfb) and simply handling the complete group. Instead you have to make
- special cases for 3, 2, or 1 unaligned pixels.
-
- 3) Cases for CopyArea where source and destination do not have the
- same alignment with respect to the 96-bit units will have to be handled by
- complex shifting code. For 32-bit units you have enough registers on
- a 386, but with 96-bit units you have to store temporary results and
- hence will be slower than using unaligned accesses.
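
To make points 1) and 2) concrete, here is a rough sketch of what the 96-bit variant might look like. It is not taken from the cfb sources: the type and function names are hypothetical, it assumes little-endian byte order (as on a 386), and it only shows the aligned inner loop plus the bookkeeping for the ragged edges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; none of this comes from the cfb sources. */
typedef struct {
    uint32_t and_w[3];  /* three AND constants ...                 */
    uint32_t xor_w[3];  /* ... plus three XOR constants: six total */
} Rop96;

/* Replicate one 24-bit value across 96 bits, i.e. four packed pixels
 * laid out little-endian in three 32-bit words. */
static void replicate24(uint32_t v, uint32_t out[3])
{
    v &= 0xFFFFFFu;
    out[0] = v | (v << 24);
    out[1] = (v >> 8) | (v << 16);
    out[2] = (v >> 16) | (v << 8);
}

static Rop96 make_rop96(uint32_t and24, uint32_t xor24)
{
    Rop96 r;
    replicate24(and24, r.and_w);
    replicate24(xor24, r.xor_w);
    return r;
}

/* dst = (dst & and) ^ xor over whole 4-pixel groups:
 * three reads and three writes per four pixels. */
static void apply_rop96(uint32_t *dst, size_t n_groups, const Rop96 *r)
{
    assert(dst != NULL && r != NULL);
    for (size_t g = 0; g < n_groups; g++, dst += 3) {
        dst[0] = (dst[0] & r->and_w[0]) ^ r->xor_w[0];
        dst[1] = (dst[1] & r->and_w[1]) ^ r->xor_w[1];
        dst[2] = (dst[2] & r->and_w[2]) ^ r->xor_w[2];
    }
}

/* Split a span of w pixels starting at pixel x into 0..3 leading
 * unaligned pixels, whole 4-pixel groups, and 0..3 trailing pixels. */
static void split_span(size_t x, size_t w,
                       size_t *head, size_t *groups, size_t *tail)
{
    size_t h = (4 - x % 4) % 4;
    if (h > w)
        h = w;
    *head = h;
    *groups = (w - h) / 4;
    *tail = (w - h) % 4;
}
```

A solid fill would use and = 0 and xor = the foreground pixel; an XOR draw would use and = 0xFFFFFF. The head and tail counts from split_span are exactly the 1, 2 and 3 pixel special cases that would each need their own masked code path.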
-
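For point 3), the 32-bit shift-and-merge copy can be sketched as follows. Again this is only an illustration under assumptions: little-endian byte order, a source that sits k bytes (1 to 3) past the destination's word alignment, and a made-up function name. With 32-bit units the only state carried between iterations is one word; a 96-bit version would have to carry the equivalent state for three words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy n_words destination words from a source that is offset by
 * k bytes (1..3) relative to word alignment. Reads n_words + 1
 * source words, each exactly once. */
static void copy_shifted(uint32_t *dst, const uint32_t *src,
                         size_t n_words, unsigned k)
{
    assert(k >= 1 && k <= 3);
    unsigned rs = 8 * k;     /* shift that drops the k skipped bytes  */
    unsigned ls = 32 - rs;   /* shift that brings in the next k bytes */
    uint32_t carry = src[0];

    for (size_t i = 0; i < n_words; i++) {
        uint32_t next = src[i + 1];
        dst[i] = (carry >> rs) | (next << ls);
        carry = next;        /* the only value kept across iterations */
    }
}
```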
-
- But after all, this way seems to be faster than even 32bpp. One would
- have to try it out. On the other hand, I feel the speed will still be
- too slow for real usage. And then most people will get accelerated graphics
- boards, which do support TrueColor, but again only in 32bpp ...
-
- - Thomas
- --
- -------------------------------------------------------------------------------
- The deer leaps high, e-mail: roell@sgcs.com
- the deer leaps far, #include <sys/pizza.h>
- what else should it do, it has the time ...
-