Newsgroups: comp.arch
Path: sparky!uunet!ukma!wupost!usc!elroy.jpl.nasa.gov!decwrl!adobe!dd
From: dd@adobe.com (David DiGiacomo)
Subject: Re: integer distance or sqrt
Message-ID: <1992Nov17.204207.530@adobe.com>
Originator: dd@neon
Sender: usenet@adobe.com (USENET NEWS)
Organization: Adobe Systems Incorporated
References: <55F0TB1w165w@tsoft.sf-bay.org>
Date: Tue, 17 Nov 1992 20:42:07 GMT
Lines: 166

In article <55F0TB1w165w@tsoft.sf-bay.org> bbs.dennis@tsoft.sf-bay.org (Dennis Yelle) writes:
>I am not sure that comp.arch is the right place for this, but
>because of the interest in Herman Rubin's optimization problems
>some people here might be interested in this.
>It is a real problem. That is, I have a program that
>spends quite a lot of time here, and I would like to change this
>if I could. Here is the current implementation:
>
>#include <math.h>    /* for sqrt() */
>
>typedef long int32;  /* 32 bit integer */
>#define SCALE 16     /* To prevent overflow when computing sq */
>
>int32 distance( int32 x1, int32 y1, int32 x2, int32 y2)
>{
>    int32 dx = (x1 - x2) / SCALE;
>    int32 dy = (y1 - y2) / SCALE;
>    int32 sq = dx * dx + dy * dy;
>
>    return (int32)(sqrt( (double)sq) * SCALE);
>}
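
(A worked illustration of why the /SCALE is there, with made-up numbers:
with dx = dy = 50000, dx*dx + dy*dy = 5.0e9, which overflows a 32-bit
int32 (max ~2.1e9). After dividing by SCALE = 16, the sum of squares is
about 1.95e7, which fits comfortably; the cost is throwing away the low
four bits of each component before squaring.)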

Here's the tail end of a previous discussion of this topic. Ken
originally cited Tom Duff's SIGGRAPH '84 tutorial on Numerical Methods
for Computer Graphics as his source for the Moler-Morrison algorithm.
BTW, it also works well in fixed point; a fixed-point sketch follows
the hypot() listing below.

From: turk@Apple.COM (Ken "Turk" Turkowski)
Newsgroups: comp.graphics,sci.math
Subject: Re: Iterative square root function?
Message-ID: <4331@internal.Apple.COM>
Date: 25 Sep 89 05:23:18 GMT

In article <1989Sep25.022915.5886@sun.soe.clarkson.edu> stadnism@image.soe!clutx.clarkson.edu writes:
>In the Ray Tracing News, vol. 2 #4 (I think), someone mentioned a cubically
>convergent method for doing Pythagorean sums and square roots. Does anyone
>have info on this method, and any methods for computing quick trig and
>inverse trig functions? It would be much appreciated.

Here's a reposting of an article of mine from a few years ago:

[Moler-Morrison, "Replacing Square Roots by Pythagorean Sums", IBM
Journal of Research and Development, vol. 27, no. 6, pp. 577-581, Nov.
1983] describes an algorithm for computing sqrt(a^2+b^2) which is fast,
robust, and portable. The naive approach to this problem (just writing
sqrt(a*a+b*b)) has the problem that in many cases where the result is a
representable floating point number, the intermediate results may
overflow or underflow; with IEEE doubles, for example, a = b = 1e200
makes a*a overflow even though the result, about 1.414e200, is easily
representable. This seriously restricts the usable range of a and b.
Moler and Morrison's method never causes harmful overflow or underflow
when the result is in range. The algorithm has cubic convergence, and
depending on implementation details it may even be faster than
sqrt(a*a+b*b). Here is a C function that implements their algorithm:

double hypot(p, q)
double p, q;
{
    double r, s;

    if (p < 0) p = -p;
    if (q < 0) q = -q;
    if (p < q) { r = p; p = q; q = r; }    /* now 0 <= q <= p */
    if (p == 0) return 0;
    for (;;) {
        r = q / p;
        r *= r;              /* r = (q/p)^2, at most 1 */
        s = r + 4;
        if (s == 4)          /* r has underflowed next to 4: converged */
            return p;
        r /= s;              /* r <- r / (4 + r) */
        p += 2 * r * p;      /* p grows toward sqrt(p^2 + q^2)... */
        q *= r;              /* ...while q shrinks toward 0 */
    }
}
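
As noted above, the iteration also works in fixed point, since it only
ever manipulates the ratio q/p and quantities derived from it. Here is
a minimal sketch in 16.16 fixed point; the function name, the fixed
three iterations, and the 64-bit intermediates are choices of mine for
illustration, not anything from the original posts:

#include <stdint.h>

#define FIX_ONE (1 << 16)   /* 1.0 in 16.16 fixed point */

/* Hypothetical fixed-point Moler-Morrison.  p and q are plain
   integers (not fixed point) and should be below 2^31 so that the
   result still fits in 32 bits. */
uint32_t fix_hypot(uint32_t p, uint32_t q)
{
    uint32_t t;
    int i;

    if (p < q) { t = p; p = q; q = t; }    /* now q <= p */
    if (p == 0) return 0;
    /* Cubic convergence: three iterations reach the precision floor
       imposed by the 16-bit fraction of the ratio. */
    for (i = 0; i < 3; i++) {
        uint64_t r = ((uint64_t)q << 16) / p;       /* r = q/p      */
        uint64_t s;
        r = (r * r) >> 16;                          /* r = (q/p)^2  */
        s = (r << 16) / (4 * FIX_ONE + r);          /* s = r/(4+r)  */
        p += (uint32_t)((2 * s * p) >> 16);         /* p += 2*s*p   */
        q  = (uint32_t)(((uint64_t)q * s) >> 16);   /* q *= s      */
    }
    return p;
}

Truncation of the 16.16 ratio caps the accuracy at roughly one part in
2^14 (e.g. fix_hypot(30000, 30000) comes out as 42424 against a true
42426); a wider fraction for r and s would buy correspondingly more.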

----------------------------------------------------------------

And a relevant response:

> From decwrl!decvax!dartvax!chuck Mon Nov 19 23:15:02 1984
> Subject: Pythagorean Sums
>
> Ken --
>
> I recently worked on our assembly language routines which compute
> the square root and absolute value of a complex number. Thus, your
> posting of this article caught my attention. I went so far as to
> visit our library and read the paper by Moler and Morrison. I then
> reread the code that we use to implement our routines in hopes of
> improving them.
>
> In their paper, Moler and Morrison write "The [algorithm] is
> robust, portable, short, and, we think, elegant. It is also
> potentially faster than a square root." Unfortunately, I've come
> to the conclusion that our existing algorithm is about as robust,
> and is shorter, faster, and more elegant. (I'm not sure about
> portability.)
>
> I thought I'd send this letter to you, hoping that if you got bored
> one afternoon, you could point out any flaws in my thinking. And I
> thought there might be a small chance you would be interested in a
> comparison of the Moler-Morrison algorithm with another algorithm.
>
> The algorithm that we use is:
>
> double hypot(p, q)
> double p, q;
> {
>     double r;
>     if (p < 0) p = -p;
>     if (q < 0) q = -q;
>     if (p == 0) return (q);
>     if (q == 0) return (p);
>     if (p < q) { r = p; p = q; q = r; }
>     r = q / p;
>     return p * sqrt(1 + r * r);
> }
>
> Moler and Morrison claim two advantages: their algorithm does
> not overflow very easily, and it converges rapidly. However, depending
> on just how "sqrt" is implemented, our algorithm will not overflow
> very easily either. This brings us to speed considerations. The
> standard method for computing the square root is the Newton-Raphson
> method, which converges quadratically; the Moler-Morrison
> algorithm converges cubically. However, the amount of work needed for
> one iteration of the M&M algorithm will allow us to perform two
> iterations of the Newton-Raphson method.
>
> Let me unroll these two methods side by side so we can see what is
> involved in computing each. We assume that the arguments 'p' and 'q'
> have been normalized so that 0 < q <= p. (Pardon my pseudo-code.)
>
> Newton_Raphson(p,q)                    Moler_Morrison(p,q)
>     r = q/p;                               r = q/p;
>     r *= r;                                r *= r;
>     r += 1;
>
>     /* Linear approx. to sqrt(r)
>        using magic constants 'c1'
>        and 'c2'.  Note that
>        1 <= r <= 2.  See "Computer
>        Evaluation of Mathematical
>        Functions" by C.T. Fike. */
>
>     g = r * c1 + c2;    /*  7 bits */      s = r / (4 + r);
>     g = (g + r/g) * .5; /* 15 bits */      p = p + 2 * s * p;
>
>     /* Here N-R has about 5 digits of accuracy, M-M about 6,
>        and N-R has performed an extra addition. */
>
>     g = (g + r/g) * .5; /* 30 bits */      q = s * q;
>     g = (g + r/g) * .5; /* 60 bits */      r = q / p;
>                                            r = r * r;
>                                            s = r / (4 + r);
>                                            p = p + 2 * s * p;
>
>     /* Now both algorithms have about 20 digits of accuracy.
>        Note that (depending on how one counts) M-M uses 2 to 3
>        more floating point multiplications than N-R on each
>        iteration.  N-R used an extra addition to set up for the
>        iterations and will need an extra multiply to return its
>        value.  On each further iteration, the number of significant
>        digits will quadruple for the N-R method but only triple
>        for the M-M method. */
>
>     return p * g;                          return p;
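
For reference, here is the Newton-Raphson column folded back into a
self-contained function. This is a reconstruction of mine, not code
from chuck's letter: the seed constants are an illustrative minimax
linear fit to sqrt(r) on [1,2] (good to about 7 bits), and the
function name is made up.

double nr_hypot(p, q)
double p, q;
{
    double r, g;

    if (p < 0) p = -p;
    if (q < 0) q = -q;
    if (p < q) { r = p; p = q; q = r; }
    if (p == 0) return 0;
    r = q / p;
    r = r * r + 1;                 /* r = 1 + (q/p)^2, in [1,2] */
    g = 0.414214 * r + 0.594677;   /* linear seed, ~7 bits      */
    g = (g + r/g) * .5;            /* ~15 bits                  */
    g = (g + r/g) * .5;            /* ~30 bits                  */
    g = (g + r/g) * .5;            /* ~60 bits                  */
    return p * g;                  /* p * sqrt(1 + (q/p)^2)     */
}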
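
Finally, to close the loop on Dennis's original question: with a
robust hypot() in hand, the SCALE fudging drops out entirely. A
sketch, assuming his int32 typedef and coordinate differences that
themselves fit in an int32:

int32 distance(int32 x1, int32 y1, int32 x2, int32 y2)
{
    /* hypot() never overflows internally when the result is in
       range, so there is no pre-division by SCALE and no precision
       thrown away; the +0.5 rounds to the nearest integer. */
    return (int32)(hypot((double)(x1 - x2), (double)(y1 - y2)) + 0.5);
}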