Received: from sloth.swcp.com (sloth.swcp.com [198.59.115.25]) by nacm.com (8.6.9/8.6.9) with ESMTP id XAA04556 for <executor@nacm.com>; Fri, 4 Nov 1994 23:22:50 -0800
Received: from iclone.UUCP (uucp@localhost) by sloth.swcp.com (8.6.9/8.6.9) with UUCP id AAA01021 for nacm.com!executor; Sat, 5 Nov 1994 00:24:11 -0700
Received: by iclone (NX5.67d/NX3.0M)
id AA14216; Sat, 5 Nov 94 00:01:31 -0700
Date: Sat, 5 Nov 94 00:01:31 -0700
From: "Mathew J. Hostetter" <iclone!mat@sloth.swcp.com>
Message-Id: <9411050701.AA14216@iclone>
To: executor@nacm.com
Subject: Re: How to generate discussion
Sender: Executor-Owner@nacm.com
Precedence: bulk
>Does executor store the native code
>it generates on disk or does it regenerate on every restart of the
>program?
It generates native code every time the program is run. 68k code is
dynamically compiled only as it is first executed, which is why you
don't notice a huge lag at program start-up.
>It seems to me that such a comparison (say 486/66 running
>the synthetic benchmarks vs IBM code on same benchmarks) would show
>how much farther the synthetic CPU has to go before matching the
>performance of compiled code for the IBM.
We don't have the source code to Speedometer, so we can't recompile it
native for the x86. Although the ratio of emulated speed to native
speed is certainly an interesting number for me (the emulator author),
absolute performance is what matters to the end user. We can give
people near-Quadra performance on a 486 DX/2 66, and that's good
enough for most people. For programs that spend time in the "ROMs",
our performance is even better than these numbers indicate.
>In theory, it seems to me
>that the synthetic CPU should be able to match the real one (it would
>take a lot of analysis... data flow and control flow analysis just like
>in a real compiler/optimizer). Am I way off base here? Or is such
>analysis too complicated for now?
"In theory" much is possible, but the analysis would be too extensive
to perform at runtime. Remember, we're emulating a big endian
processor with 16 registers on a little endian processor with 8
registers. Maintaining a big endian memory image necessitates extra
work, and full inter-block register allocation would (probably) be too
slow to do at runtime. A fair comparison might be an x86 version of
the same benchmark coded in such a way as to maintain a big endian
memory image.
>Are there reasons besides time
>constraints that heavy optimization isn't performed and then the
>results stored on disk?
No. Dynamic compilation is responsive enough and our performance is
good enough that we haven't allocated resources to opening this can of
worms. We *could* do this someday to augment our emulator's
performance, and I would personally find such a project interesting,