Newsgroups: comp.arch
Path: sparky!uunet!ukma!wupost!usc!elroy.jpl.nasa.gov!decwrl!adobe!dd
From: dd@adobe.com (David DiGiacomo)
Subject: Re: integer distance or sqrt
Message-ID: <1992Nov17.204207.530@adobe.com>
Originator: dd@neon
Sender: usenet@adobe.com (USENET NEWS)
Organization: Adobe Systems Incorporated
References: <55F0TB1w165w@tsoft.sf-bay.org>
Date: Tue, 17 Nov 1992 20:42:07 GMT
Lines: 166

In article <55F0TB1w165w@tsoft.sf-bay.org> bbs.dennis@tsoft.sf-bay.org (Dennis Yelle) writes:
>I am not sure that comp.arch is the right place for this, but
>because of the interest in Herman Rubin's optimization problems
>some people here might be interested in this.
>It is a real problem. That is, I have a program that
>spends quite a lot of time here, and I would like to change this
>if I could. Here is the current implementation:
>
>#include <math.h>    /* for sqrt() */
>
>typedef long int32;  /* 32 bit integer */
>#define SCALE 16     /* To prevent overflow when computing sq */
>
>int32 distance( int32 x1, int32 y1, int32 x2, int32 y2)
>{
>    int32 dx = (x1 - x2) / SCALE;
>    int32 dy = (y1 - y2) / SCALE;
>    int32 sq = dx * dx + dy * dy;
>
>    return (int32)(sqrt( (double)sq) * SCALE);
>}
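
(A worked illustration of why the /SCALE is there, with made-up numbers:
with dx = dy = 50000, dx*dx + dy*dy = 5.0e9, which overflows a 32-bit
int32 (max ~2.1e9). After dividing by SCALE = 16, the sum of squares is
about 1.95e7, which fits comfortably; the cost is throwing away the low
four bits of each component before squaring.)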

Here's the tail end of a previous discussion of this topic. Ken
originally cited Tom Duff's SIGGRAPH '84 tutorial on Numerical Methods
for Computer Graphics as his source for the Moler-Morrison algorithm.
BTW, it also works well in fixed point; a fixed-point sketch follows
the hypot() listing below.

From: turk@Apple.COM (Ken "Turk" Turkowski)
Newsgroups: comp.graphics,sci.math
Subject: Re: Iterative square root function?
Message-ID: <4331@internal.Apple.COM>
Date: 25 Sep 89 05:23:18 GMT

In article <1989Sep25.022915.5886@sun.soe.clarkson.edu> stadnism@image.soe!clutx.clarkson.edu writes:
>In the Ray Tracing News, vol. 2 #4 (I think), someone mentioned a cubically
>convergent method for doing Pythagorean sums and square roots. Does anyone
>have info on this method, and any methods for computing quick trig and
>inverse trig functions? It would be much appreciated.

Here's a reposting of an article of mine from a few years ago:

[Moler-Morrison, "Replacing Square Roots by Pythagorean Sums", IBM
Journal of Research and Development, vol. 27, no. 6, pp. 577-581, Nov.
1983] describes an algorithm for computing sqrt(a^2+b^2) which is fast,
robust, and portable. The naive approach to this problem (just writing
sqrt(a*a+b*b)) has the problem that in many cases where the result is a
representable floating point number, the intermediate results may
overflow or underflow; with IEEE doubles, for example, a = b = 1e200
makes a*a overflow even though the result, about 1.414e200, is easily
representable. This seriously restricts the usable range of a and b.
Moler and Morrison's method never causes harmful overflow or underflow
when the result is in range. The algorithm has cubic convergence, and
depending on implementation details it may even be faster than
sqrt(a*a+b*b). Here is a C function that implements their algorithm:

double hypot(p, q)
double p, q;
{
    double r, s;

    if (p < 0) p = -p;
    if (q < 0) q = -q;
    if (p < q) { r = p; p = q; q = r; }    /* now 0 <= q <= p */
    if (p == 0) return 0;
    for (;;) {
        r = q / p;
        r *= r;              /* r = (q/p)^2, at most 1 */
        s = r + 4;
        if (s == 4)          /* r has underflowed next to 4: converged */
            return p;
        r /= s;              /* r <- r / (4 + r) */
        p += 2 * r * p;      /* p grows toward sqrt(p^2 + q^2)... */
        q *= r;              /* ...while q shrinks toward 0 */
    }
}
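
As noted above, the iteration also works in fixed point, since it only
ever manipulates the ratio q/p and quantities derived from it. Here is
a minimal sketch in 16.16 fixed point; the function name, the fixed
three iterations, and the 64-bit intermediates are choices of mine for
illustration, not anything from the original posts:

#include <stdint.h>

#define FIX_ONE (1 << 16)   /* 1.0 in 16.16 fixed point */

/* Hypothetical fixed-point Moler-Morrison.  p and q are plain
   integers (not fixed point) and should be below 2^31 so that the
   result still fits in 32 bits. */
uint32_t fix_hypot(uint32_t p, uint32_t q)
{
    uint32_t t;
    int i;

    if (p < q) { t = p; p = q; q = t; }    /* now q <= p */
    if (p == 0) return 0;
    /* Cubic convergence: three iterations reach the precision floor
       imposed by the 16-bit fraction of the ratio. */
    for (i = 0; i < 3; i++) {
        uint64_t r = ((uint64_t)q << 16) / p;       /* r = q/p      */
        uint64_t s;
        r = (r * r) >> 16;                          /* r = (q/p)^2  */
        s = (r << 16) / (4 * FIX_ONE + r);          /* s = r/(4+r)  */
        p += (uint32_t)((2 * s * p) >> 16);         /* p += 2*s*p   */
        q  = (uint32_t)(((uint64_t)q * s) >> 16);   /* q *= s      */
    }
    return p;
}

Truncation of the 16.16 ratio caps the accuracy at roughly one part in
2^14 (e.g. fix_hypot(30000, 30000) comes out as 42424 against a true
42426); a wider fraction for r and s would buy correspondingly more.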

----------------------------------------------------------------

And a relevant response:

> From decwrl!decvax!dartvax!chuck Mon Nov 19 23:15:02 1984
> Subject: Pythagorean Sums
>
> Ken --
>
> I recently worked on our assembly language routines which compute
> the square root and absolute value of a complex number. Thus, your
> posting of this article caught my attention. I went so far as to
> visit our library and read the paper by Moler and Morrison. I then
> reread the code that we use to implement our routines in hopes of
> improving them.
>
> In their paper, Moler and Morrison write "The [algorithm] is
> robust, portable, short, and, we think, elegant. It is also
> potentially faster than a square root." Unfortunately, I've come
> to the conclusion that our existing algorithm is about as robust,
> and is shorter, faster, and more elegant. (I'm not sure about
> portability.)
>
> I thought I'd send this letter to you, hoping that if you got bored
> one afternoon, you could point out any flaws in my thinking. And I
> thought there might be a small chance you would be interested in a
> comparison of the Moler-Morrison algorithm with another algorithm.
>
> The algorithm that we use is:
>
> double hypot(p, q)
> double p, q;
> {
>     double r;
>     if (p < 0) p = -p;
>     if (q < 0) q = -q;
>     if (p == 0) return (q);
>     if (q == 0) return (p);
>     if (p < q) { r = p; p = q; q = r; }
>     r = q / p;
>     return p * sqrt(1 + r * r);
> }
>
> Moler and Morrison claim two advantages: their algorithm does
> not overflow very easily, and it converges rapidly. However, depending
> on just how "sqrt" is implemented, our algorithm will not overflow
> very easily either. This brings us to speed considerations. The
> standard method for computing the square root is the Newton-Raphson
> method, which converges quadratically; the Moler-Morrison
> algorithm converges cubically. However, the amount of work needed for
> one iteration of the M&M algorithm will allow us to perform two
> iterations of the Newton-Raphson method.
>
> Let me unroll these two methods side by side so we can see what is
> involved in computing each. We assume that the arguments 'p' and 'q'
> have been normalized so that 0 < q <= p. (Pardon my pseudo-code.)
>
> Newton_Raphson(p,q)                    Moler_Morrison(p,q)
>     r = q/p;                               r = q/p;
>     r *= r;                                r *= r;
>     r += 1;
>
>     /* Linear approx. to sqrt(r)
>        using magic constants 'c1'
>        and 'c2'.  Note that
>        1 <= r <= 2.  See "Computer
>        Evaluation of Mathematical
>        Functions" by C.T. Fike. */
>
>     g = r * c1 + c2;    /*  7 bits */      s = r / (4 + r);
>     g = (g + r/g) * .5; /* 15 bits */      p = p + 2 * s * p;
>
>     /* Here N-R has about 5 digits of accuracy, M-M about 6,
>        and N-R has performed an extra addition. */
>
>     g = (g + r/g) * .5; /* 30 bits */      q = s * q;
>     g = (g + r/g) * .5; /* 60 bits */      r = q / p;
>                                            r = r * r;
>                                            s = r / (4 + r);
>                                            p = p + 2 * s * p;
>
>     /* Now both algorithms have about 20 digits of accuracy.
>        Note that (depending on how one counts) M-M uses 2 to 3
>        more floating point multiplications than N-R on each
>        iteration.  N-R used an extra addition to set up for the
>        iterations and will need an extra multiply to return its
>        value.  On each further iteration, the number of significant
>        digits will quadruple for the N-R method but only triple
>        for the M-M method. */
>
>     return p * g;                          return p;
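
For reference, here is the Newton-Raphson column folded back into a
self-contained function. This is a reconstruction of mine, not code
from chuck's letter: the seed constants are an illustrative minimax
linear fit to sqrt(r) on [1,2] (good to about 7 bits), and the
function name is made up.

double nr_hypot(p, q)
double p, q;
{
    double r, g;

    if (p < 0) p = -p;
    if (q < 0) q = -q;
    if (p < q) { r = p; p = q; q = r; }
    if (p == 0) return 0;
    r = q / p;
    r = r * r + 1;                 /* r = 1 + (q/p)^2, in [1,2] */
    g = 0.414214 * r + 0.594677;   /* linear seed, ~7 bits      */
    g = (g + r/g) * .5;            /* ~15 bits                  */
    g = (g + r/g) * .5;            /* ~30 bits                  */
    g = (g + r/g) * .5;            /* ~60 bits                  */
    return p * g;                  /* p * sqrt(1 + (q/p)^2)     */
}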
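
Finally, to close the loop on Dennis's original question: with a
robust hypot() in hand, the SCALE fudging drops out entirely. A
sketch, assuming his int32 typedef and coordinate differences that
themselves fit in an int32:

int32 distance(int32 x1, int32 y1, int32 x2, int32 y2)
{
    /* hypot() never overflows internally when the result is in
       range, so there is no pre-division by SCALE and no precision
       thrown away; the +0.5 rounds to the nearest integer. */
    return (int32)(hypot((double)(x1 - x2), (double)(y1 - y2)) + 0.5);
}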