- Newsgroups: comp.benchmarks
- Path: sparky!uunet!cs.utexas.edu!zaphod.mps.ohio-state.edu!darwin.sura.net!jvnc.net!nuscc!iti.gov.sg!hblim
- From: hblim@iti.gov.sg (Hock-Beng Lim)
- Subject: Counting number of MFLOPs
- Message-ID: <1992Nov16.033410.11391@iti.gov.sg>
- Sender: news@iti.gov.sg (News Admin)
- Organization: Information Technology Institute, National Computer Board, Singapore.
- Date: Mon, 16 Nov 1992 03:34:10 GMT
- Lines: 367
-
- Hi !
-
- Here are the responses I have received regarding my query on the accepted
- ways to count MFLOPs in applications.
-
- HB
- --------------------------------------------------------------------------
- From: schreiber@schreiber.asd.sgi.com (Olivier Schreiber)
- Message-Id: <9211111829.AA03875@schreiber.asd.sgi.com>
- To: hblim@iti.gov.sg
- Subject: Re: Counting number of MFLOPS
- Newsgroups: comp.benchmarks
- References: <1992Nov11.074519.26172@iti.gov.sg>
- Status: RO
-
-
- I had posted a similar inquiry a while back. Here's what I got.
- Thanks for posting the results of your enquiry.
-
- From: csrdh@manta.jcu.edu.au (Rowan Hughes)
- Message-Id: <9205290324.AA04971@manta.jcu.edu.au>
- To: schreiber
- Subject: Re: How are Floating Point operations counted?
- Newsgroups: comp.benchmarks
- References: <lc5k6ro@fido.asd.sgi.com>
- Status: RO
-
- In comp.benchmarks you write:
- >Is there an accepted way industry-wide of counting floating point
- >operations or a definition of what a floating point operation is
- >for FP benchmark purposes?
- >How many FPO is x=y*z ?
- >or x=sqrt(y)*sin(z)?
- >How do most vendors measure MFLOP rates on general user programs?
-
- The accepted way (for Crays, etc.) is to count each of *,+,- as one
- flop. Generally there should be the same number of *'s as + and -.
- If not, you should state that there were few *'s. Division is
- usually not used, since it can usually be eliminated, and turned
- into a *. Division time is quoted as a multiple of * time, e.g.
- 4 times longer. Intrinsic functions are not usually counted
- (sin cos log etc.). Don't use benchmarks with these in them.
- The most useful vector benchmark is the DAXPY (REAL*8)
-
- do i=1,n
- y(i)=a*x(i)+y(i)
- enddo
-
- n should be very large (e.g. 5,000,000); much larger than the cache
- size. Each iteration counts as 2 flops, * and + are equal in number.
- SAXPY is the real*4 equivalent. The Cray YMP does 289Mflops on one
- processor for DAXPY. The memory bandwidth, in MB/sec, is 12 times this
- number (about 3500 MB/sec). Another name for DAXPY is the linked triad.
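
The DAXPY counting rule above can be sketched in Python. This is only an illustration of the bookkeeping (2 flops per iteration), not a real benchmark: interpreted Python is far slower than compiled Fortran, and the small `n` used here is an assumption chosen so the example runs quickly. The helper `bytes_per_flop` is hypothetical; it shows one way to read the factor of 12 quoted in the post (two 8-byte loads and one 8-byte store per 2 flops), which the post itself does not spell out.

```python
import time

def daxpy(a, x, y):
    """Compute y(i) = a*x(i) + y(i) in place and return the flop count."""
    for i in range(len(x)):
        y[i] = a * x[i] + y[i]   # one multiply + one add = 2 flops
    return 2 * len(x)

def bytes_per_flop(word_bytes=8):
    """Memory traffic per flop, assuming each iteration loads x(i) and y(i)
    and stores y(i): 3 words moved for 2 flops."""
    return 3 * word_bytes / 2

n = 100_000            # illustrative; the post recommends n far larger than cache
x = [1.0] * n
y = [2.0] * n

start = time.perf_counter()
flops = daxpy(3.0, x, y)
elapsed = time.perf_counter() - start

print(f"{flops} flops in {elapsed:.4f} s = {flops / elapsed / 1e6:.2f} MFLOPS")
print(f"{bytes_per_flop()} bytes of memory traffic per flop for REAL*8")
```

For SAXPY (REAL*4) the same reasoning with `bytes_per_flop(4)` gives half the memory traffic per flop.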
-
- --
- Rowan Hughes James Cook University
- Marine Modelling Unit Townsville, Australia.
- Dept. Civil and Systems Engineering csrdh@marlin.jcu.edu.au
-
-
- >From schreiber Mon Jun 1 11:07:36 1992
- Received: by schreiber.asd.sgi.com (920110.SGI/911001.SGI)
- for schreiber id AA11352; Mon, 1 Jun 92 11:07:36 -0700
- Date: Mon, 1 Jun 92 11:07:36 -0700
- From: schreiber (Olivier Schreiber)
- Message-Id: <9206011807.AA11352@schreiber.asd.sgi.com>
- To: schreiber
- Subject: flops
- Status: RO
-
- In article <lc5k6ro@fido.asd.sgi.com> schreiber@schreiber.asd.sgi.com (Olivier Schreiber) writes:
- >Is there an accepted way industry-wide of counting floating point
- >operations or a definition of what a floating point operation is
- >for FP benchmark purposes?
-
- No. There are several different "standard" approaches. One is to take the
- known number of floating point operations for a given size of the Linpack
- benchmark, time the number of seconds it takes to execute on a system,
- then divide the first number by the second to get Linpack MFLOPS. This
- gives the most commonly reported "MFLOPS" number in the workstation world.
- This can be done for single and double precision to produce single and
- double precision Linpack MFLOPS numbers.
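
A minimal sketch of the Linpack calculation described above. The operation count used here, 2/3 n^3 + 2 n^2 for an n x n system, is the nominal figure conventionally associated with the Linpack benchmark; it is not quoted in the post, so treat it as an assumption. The elapsed time in the example is a made-up value, not a measurement.

```python
def linpack_flops(n):
    """Nominal operation count for solving an n x n linear system,
    as conventionally used for Linpack: 2/3 n^3 + 2 n^2."""
    return 2.0 * n**3 / 3.0 + 2.0 * n**2

def mflops(flop_count, seconds):
    """Millions of floating point operations per second."""
    return flop_count / seconds / 1.0e6

ops = linpack_flops(100)       # the classic 100 x 100 case
hypothetical_seconds = 0.05    # assumed timing, for illustration only
print(f"{ops:.0f} operations")
print(f"{mflops(ops, hypothetical_seconds):.2f} Linpack MFLOPS")
```

The point of the method is that the numerator is fixed by the problem size, so only the elapsed time has to be measured on each system.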
-
- The authors of the Livermore Loops developed a normalizing formula that can
- be used on codes that are more complex than Linpack (i.e. that do a lot more
- than adds, subtracts and multiplies.) They count floating point operations
- as follows: (taken from page 43 of Hennessy and Patterson)
-
- Real FP Operations Normalized FP operations
- ------------------ ------------------------
- ADD,SUB,COMPARE,MULT 1
- DIVIDE,SQRT 4
- EXP,SIN,COS,TAN,.... 8
-
-
-
- >
- >How many FPO is x=y*z ?
- >or x=sqrt(y)*sin(z)?
-
- x=y*z is one FLOP.
- x=sqrt(y)*sin(z) is 3 FLOPs unnormalized, 13 FLOPs normalized by the Livermore
- method shown above.
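
The two counts above can be reproduced with a small table-driven sketch. The weights are copied from the normalization table quoted earlier; the operation names and the `count_flops` helper are illustrative choices, not part of any standard tool.

```python
# Weights from the Livermore normalization table quoted above.
WEIGHTS = {
    "add": 1, "sub": 1, "compare": 1, "mult": 1,
    "divide": 4, "sqrt": 4,
    "exp": 8, "sin": 8, "cos": 8, "tan": 8,
}

def count_flops(ops, normalized=True):
    """Count flops for a list of operation names.
    Unnormalized: every operation counts as 1.
    Normalized: each operation counts by its table weight."""
    if normalized:
        return sum(WEIGHTS[op] for op in ops)
    return len(ops)

simple = ["mult"]                 # x = y*z
mixed = ["sqrt", "sin", "mult"]   # x = sqrt(y)*sin(z)

print(count_flops(simple, normalized=False), count_flops(simple))
print(count_flops(mixed, normalized=False), count_flops(mixed))
```

For `x=sqrt(y)*sin(z)` the normalized total is 4 (sqrt) + 8 (sin) + 1 (multiply).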
-
- >How do most vendors measure MFLOP rates on general user programs?
-
- I don't know that they do. Certainly there is no "standard" for this; even
- in the more controlled world of benchmarking, there is some variation, as
- you can see.
-
- BTW, based on the Livermore normalizing table above, a MULTIPLY-and-ADD
- instruction is 2 FLOPs, which makes sense. Some posters in the past have
- tried to claim that a multiply and accumulate instruction is only 1 FLOP,
- which makes no sense at all, although they claim from memory that that
- was the "original" definition of a FLOP back in the Dark Ages (and then
- no reference for this claim is ever produced.)
-
-
-
- --
- -----------------------------------------------------------------------------
- "It is seldom that any liberty is lost all at once." David Hume
- ||| clc5q@virginia.edu (Clark L. Coleman)
-
- >From comp.benchmarks Tue Jun 2 13:55:05 1992
- Path: fido!odin!sgigate!rutgers!ucla-cs!ucla-ma!euphemia!pmontgom
- From: pmontgom@euphemia.math.ucla.edu (Peter Montgomery)
- Newsgroups: comp.benchmarks
- Subject: Re: How are Floating Point operations counted?
- Message-ID: <1992Jun1.203454.4900@math.ucla.edu>
- Date: 1 Jun 92 20:34:54 GMT
- References: <lc5k6ro@fido.asd.sgi.com> <1992May29.150530.13028@murdoch.acc.Virginia.EDU> <1992Jun1.175743.11513@news.eng.convex.com>
- Sender: news@math.ucla.edu
- Organization: UCLA Mathematics Department
- Lines: 55
- Status: RO
-
- In article <1992Jun1.175743.11513@news.eng.convex.com>
- patrick@convex.COM (Patrick F. McGehearty) writes:
- >In article <1992May29.150530.13028@murdoch.acc.Virginia.EDU>
- >clc5q@hemlock.cs.Virginia.EDU (Clark L. Coleman) writes:
-
- ...
-
- >>Real FP Operations Normalized FP operations
- >>------------------ ------------------------
- >>ADD,SUB,COMPARE,MULT 1
- >>DIVIDE,SQRT 4
- >>EXP,SIN,COS,TAN,.... 8
- >>
- >
- >Sounds reasonable to me. If we (all readers of this note) all agree to use
- >this method, does that make it a standard? :-) :-)
-
- I do extensive multiple-precision arithmetic. I consider
- my programs to be "number crunching", but they rate very low
- on this scale because the list omits many operations. We should also count
-
- a) Conversions: integer to floating,
- floating to integer, conversion between
- different floating precisions (perhaps one FP each).
-
- b) Truncating or rounding floating to integer
- while retaining floating point form;
- (FORTRAN AINT and ANINT, for example)
- (perhaps two FP operations each).
-
- c) Simple operations like the Fortran
- ABS, DIM, MAX, MIN, and SIGN functions
- (perhaps one FP operation each, with two for SIGN).
-
- d) Integer multiplies not used for subscript
- computations and where neither operand is
- a compile-time constant (one FP each).
-
- e) Integer division when denominator is
- not a power of 2 or negative thereof
- should count the same as an FP divide
- if only one of the quotient, remainder is used;
- if both are used, count an additional FP.
-
- In all cases no FP operations should be counted if all
- operands are compile time constants, such as 2.5 * 6.7 + ABS(-7.8).
-
- The list also fails to specify how much operations like X**10
- or pow(X, 10.0) (exponentiation) count. One vendor may compile this using
- four multiplies, while another uses logarithm and exponential.
- Does it matter if the 10 or 10.0 is an execution-time value rather than
- compile-time constant?
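
The X**10 ambiguity can be made concrete with a sketch. The function `power_by_squaring` below is one possible way a compiler might reach four multiplies (binary exponentiation); it is an illustration, not a claim about any particular vendor's code. The cost of the logarithm route assumes LOG carries the same weight of 8 as the other intrinsics in the normalization table, which the table's "...." suggests but does not state.

```python
import math

def power_by_squaring(x, n):
    """Return (x**n, number of multiplies used) for an integer n >= 1."""
    result = None
    multiplies = 0
    base = x
    while n > 0:
        if n & 1:
            if result is None:
                result = base            # first factor: no multiply needed
            else:
                result = result * base
                multiplies += 1
        n >>= 1
        if n > 0:
            base = base * base           # square for the next bit
            multiplies += 1
    return result, multiplies

def log_exp_normalized_cost(log_weight=8, exp_weight=8, mult_weight=1):
    """Normalized cost of exp(n * log(x)): one log, one multiply, one exp.
    The weight for log is an assumption (see lead-in)."""
    return log_weight + mult_weight + exp_weight

value, mults = power_by_squaring(2.0, 10)
print(value, mults)                       # same result, counted as 4 flops
print(math.exp(10.0 * math.log(2.0)))     # same value up to rounding
print(log_exp_normalized_cost())          # counted as 17 normalized flops
```

So the same source expression could be credited with 4 flops or 17, depending on how it is compiled and which counting rule is applied.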
- --
- Peter L. Montgomery Internet: pmontgom@math.ucla.edu
- Department of Mathematics, UCLA, Los Angeles, CA 90024-1555 USA
-
- --
- Olivier Schreiber schreiber@sgi.com (415)390 5353 Fax:(415) 964-8671 MS/9L580
- Silicon Graphics Inc., 2011 North Shoreline Blvd. Mountain View, Ca 94039-7311
-
-
- ---------------------------
- From: earl@fuji.idtinc.COM (Earl Killian)
- Message-Id: <9211111615.AA22511@fuji.qedinc.com>
- To: hblim@iti.gov.sg (Hock-Beng Lim)
- In-Reply-To: hblim@iti.gov.sg's message of Wed, 11 Nov 1992 07:45:19 GMT
- Subject: Counting number of MFLOPS
- Status: RO
-
- The pixie/pixstats programs on MIPS systems count FLOPs.
-
-
- ---------------------------
- From: Larry Meadows <lfm@pgroup.com>
- Message-Id: <199211120431.AA04770@libby.pgroup.com>
- To: hblim@iti.gov.sg
- Subject: Re: Counting number of MFLOPS
- Newsgroups: comp.benchmarks
- In-Reply-To: <1992Nov11.074519.26172@iti.gov.sg>
- Organization: The Portland Group, Portland, OR
- Cc:
- Status: RO
-
-
- pixie on a mips will tell you how many f.p. ops were executed (exactly
- I believe).
-
- lfm
-
- --
- Larry Meadows The Portland Group
- lfm@pgroup.com
-
-
- ---------------------------
- From: earl@fuji.idtinc.COM (Earl Killian)
- Message-Id: <9211122242.AA23731@fuji.qedinc.com>
- To: hblim@iti.gov.sg
- In-Reply-To: Hock-Beng Lim's message of Thu, 12 Nov 92 11:36:47 WST <9211120336.AA06744@iti.gov.sg>
- Subject: Counting number of MFLOPS
- Status: RO
-
- Pixie is an object-code instrumentation program. It adds instructions
- to a binary to count basic blocks and optionally to generate
- instruction and data address traces. The non-tracing usage is quite
- simple:
-
- pixie foo # reads foo, produces new executable foo.pixie
- # (also produces foo.Addrs)
- # (takes 2-3 seconds)
- foo.pixie fooargs < fooinput > foooutput
- # run new executable in the normal way
- # writes foo.Counts on exit
- # (takes 2-3x longer than running foo)
- pixstats foo > foo.stats # reads foo, foo.Addrs, and foo.Counts
- # and generates statistics, including
- # opcode frequencies
- # (takes <1 second)
-
- From the SPEC benchmark pixstats outputs I have lying around I
- selected the lines that would let you count floating point operations
- (to compute FLOPs as the supercomputer folks do, use
- fadd+fsub+...+fmul+fdiv*4+fsqrt*8). The complete outputs are a little
- large to mail.
-
- 013.spice2g6:
- fmul 330562977 2.10%
- fsub 222139265 1.41%
- fadd 203333823 1.29%
- fdiv 78994140 0.50%
- c.lt 62909949 0.40%
- fabs 52376825 0.33%
- fmov 22202904 0.14%
- c.le 19064771 0.12%
- c.eq 12807281 0.08%
- fcvtd 11471709 0.07%
- fneg 11089840 0.07%
- fcvtw 5138333 0.03%
- c.ole 5138308 0.03%
- c.ult 3325350 0.02%
- fcvts 1977914 0.01%
- c.olt 1555423 0.01%
- c.ule 1555373 0.01%
- fsqrt 1132405 0.01%
- c.un 1 0.00%
-
- 015.doduc:
- fmul 122995914 11.86%
- fadd 89962441 8.67%
- fsub 37023197 3.57%
- c.le 28173005 2.72%
- fdiv 23850719 2.30%
- fcvtd 14394593 1.39%
- c.lt 11241927 1.08%
- fmov 8652077 0.83%
- fcvts 3378554 0.33%
- c.eq 3355800 0.32%
- fneg 2796032 0.27%
- fabs 2738844 0.26%
- fcvtw 1279710 0.12%
- c.ole 838950 0.08%
- fsqrt 507377 0.05%
- c.un 456765 0.04%
- c.ult 5514 0.00%
-
- 020.nasa7:
- fmul 1027731023 18.52%
- fsub 600896167 10.83%
- fadd 456774097 8.23%
- fmov 44444306 0.80%
- fdiv 19186849 0.35%
- fcvtd 6549652 0.12%
- c.ult 6159939 0.11%
- fabs 4424521 0.08%
- c.lt 2220883 0.04%
- fsqrt 2060200 0.04%
- c.olt 1293562 0.02%
- fcvtw 1030288 0.02%
- fneg 809244 0.01%
- c.le 516000 0.01%
- c.ole 515000 0.01%
- c.ule 263082 0.00%
- c.eq 51010 0.00%
- fcvts 46108 0.00%
-
- 030.matrix300:
- fadd 216000300 31.65%
- fmul 216000000 31.65%
- fmov 3 0.00%
- c.eq 3 0.00%
- fabs 2 0.00%
- fsub 1 0.00%
- fdiv 1 0.00%
-
- 042.fpppp:
- fmul 306318594 23.42%
- fadd 260531821 19.92%
- fsub 7504774 0.57%
- fmov 3922913 0.30%
- fneg 3677499 0.28%
- c.lt 1685380 0.13%
- fdiv 1303147 0.10%
- fabs 1109268 0.08%
- c.le 682180 0.05%
- fcvtd 495142 0.04%
- fsqrt 377621 0.03%
- fcvts 248187 0.02%
- fcvtw 223627 0.02%
- c.ole 200322 0.02%
- c.eq 61467 0.00%
- c.un 23306 0.00%
-
- 047.tomcatv:
- fmul 156167870 17.01%
- fsub 113846282 12.40%
- fadd 110607782 12.04%
- fabs 26010198 2.83%
- c.lt 13005198 1.42%
- fdiv 6502758 0.71%
- fneg 6502500 0.71%
- fmov 47120 0.01%
- c.eq 400 0.00%
- fcvts 257 0.00%
- fcvtd 257 0.00%
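
As a sketch of how the formula quoted above (fadd+fsub+...+fmul+fdiv*4+fsqrt*8) might be applied to these opcode counts: the weights below cover only the opcodes the formula names explicitly. Whatever the "..." stands for is not filled in here, so all other opcodes (compares, conversions, fabs, fmov, fneg) get weight 0 by assumption; pass a different `weights` dictionary to count them. The counts are copied from the 030.matrix300 and 047.tomcatv listings.

```python
# Weights for the opcodes named explicitly in the quoted formula.
# Opcodes hidden behind the "..." are NOT guessed; they default to 0.
PIXSTATS_WEIGHTS = {"fadd": 1, "fsub": 1, "fmul": 1, "fdiv": 4, "fsqrt": 8}

def weighted_flops(counts, weights=PIXSTATS_WEIGHTS):
    """Sum opcode counts multiplied by their weights (unknown opcodes count 0)."""
    return sum(weights.get(op, 0) * n for op, n in counts.items())

matrix300 = {
    "fadd": 216000300, "fmul": 216000000, "fmov": 3, "c.eq": 3,
    "fabs": 2, "fsub": 1, "fdiv": 1,
}
tomcatv = {
    "fmul": 156167870, "fsub": 113846282, "fadd": 110607782,
    "fabs": 26010198, "c.lt": 13005198, "fdiv": 6502758,
    "fneg": 6502500, "fmov": 47120, "c.eq": 400,
    "fcvts": 257, "fcvtd": 257,
}

for name, counts in (("030.matrix300", matrix300), ("047.tomcatv", tomcatv)):
    print(name, weighted_flops(counts))
```

Dividing such a total by the run time of the uninstrumented program (in microseconds) would give an MFLOPS figure. Note that this formula weights fsqrt at 8, whereas the Livermore table quoted earlier in the thread weights SQRT at 4, so the two conventions give different totals for sqrt-heavy codes.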
-
- --
- -----------------------------------------------------------------------------
- Hock-Beng Lim | hblim@csrd.uiuc.edu
- Center for Supercomputing R&D, UIUC. (on leave) | hblim@iti.gov.sg
- Information Technology Institute, Singapore. | tel : (65)772-7205,772-7273
-