- %
- % TeX source for The Specification of the Z-machine
- %
-
- \newif\iffiles\filesfalse
- %
- % Manual macros
- %
- % Page layout
- %
- \newif\ifshutup\shutupfalse
- \magnification=\magstep 1
- \hoffset=0.15 true in
- \voffset=2\baselineskip
- %
- % General hacks
- %
- \def\PAR{\par}
- %
- % Font loading
- %
- \font\medfont=cmr10 scaled \magstep2
- \font\bigfont=cmr10 scaled \magstep3
- %\def\sectfont{\bf}
- \font\sectfont=cmbx12
- \def\small{\sevenrm}
- \font\rhrm=cmr8
- \font\rhit=cmsl8
- %
- % Titles
- %
- \newcount\subsectno % Subsection number
- \def\rhead{} % The running head will go here
- %
- \def\newsection#1#2{% To begin a section...
- %\global\titletrue% Declare this as a title page
- \xdef\rhead{{\rhrm #1}\quad #2}% Initialise running head and ssn
- \subsectno=0%
- \iffiles
- \write\conts{\string\sli\string{#1\string}\string{#2\string}\string{\the\pageno\string}}%
- \fi
- }
- %
- \def\section#1#2{\vskip 1 true in\goodbreak\newsection{#1}{#2}
- \noindent{\sectfont #1\quad #2}\bigskip\noindent}
- %
- % Headers and footers
- %
- \newif\iftitle
- \headline={\iftitle\hfil\global\titlefalse%
- \else{\hfil{\rhit \rhead}}%
- \fi}
- \footline={\ifnum\pageno<0\hfil{\tenbf\romannumeral -\pageno}%
- \else\hfil{\tenbf \number\pageno}\fi}
- %\footline={\ifnum\pageno=1\hfil\else\hfil{\tenbf \number\pageno}\fi}
- %
- % (Old date-stamping version:)
- % \footline={\hfil{\rm \number\pageno}\hfil{\rm \number\day/\number\month}}
- %
- % If this works I'll be impressed
- %
-
- \font\ninerm=cmr9
- \font\ninei=cmmi9
- \font\ninesy=cmsy9
- \font\ninebf=cmbx9
- \font\eightbf=cmbx8
- \font\ninett=cmtt9
- \font\nineit=cmti9
- \font\ninesl=cmsl9
- \def\ninepoint{\def\rm{\fam0\ninerm}%
- \textfont0=\ninerm
- \textfont1=\ninei
- \textfont2=\ninesy
- \textfont3=\tenex
- \textfont\itfam=\nineit \def\it{\fam\itfam\nineit}%
- \textfont\slfam=\ninesl \def\sl{\fam\slfam\ninesl}%
- \textfont\ttfam=\ninett \def\tt{\fam\ttfam\ninett}%
- \textfont\bffam=\ninebf
- \normalbaselineskip=11pt
- \setbox\strutbox=\hbox{\vrule height8pt depth3pt width0pt}%
- \normalbaselines\rm}
-
- \def\tenpoint{\def\rm{\fam0\tenrm}%
- \textfont0=\tenrm
- \textfont1=\teni
- \textfont2=\tensy
- \textfont3=\tenex
- \textfont\itfam=\tenit \def\it{\fam\itfam\tenit}%
- \textfont\slfam=\tensl \def\sl{\fam\slfam\tensl}%
- \textfont\ttfam=\tentt \def\tt{\fam\ttfam\tentt}%
- \textfont\bffam=\tenbf
- \normalbaselineskip=12pt
- \setbox\strutbox=\hbox{\vrule height8.5pt depth3.5pt width0pt}%
- \normalbaselines\rm}
-
- \parindent=30pt
- \def\inpar{\hangindent40pt\hangafter1\qquad}
- \def\onpar{\par\hangindent40pt\hangafter0}
-
- \newskip\ttglue
- \ttglue=.5em plus.25em minus.15em
-
- \def\orsign{$\mid\mid$}
-
- \outer\def\begindisplay{\obeylines\startdisplay}
- {\obeylines\gdef\startdisplay#1
- {\catcode`\^^M=5$$#1\halign\bgroup\indent##\hfil&&\qquad##\hfil\cr}}
- \outer\def\enddisplay{\crcr\egroup$$}
-
- \chardef\other=12
-
- \def\ttverbatim{\begingroup \catcode`\\=\other \catcode`\{=\other
- \catcode`\}=\other \catcode`\$=\other \catcode`\&=\other
- \catcode`\#=\other \catcode`\%=\other \catcode`\~=\other
- \catcode`\_=\other \catcode`\^=\other
- \obeyspaces \obeylines \tt}
- {\obeyspaces\gdef {\ }}
-
- \outer\def\beginstt{$$\let\par=\endgraf \ttverbatim\ninett \parskip=0pt
- \catcode`\|=0 \rightskip=-5pc \ttfinish}
- \outer\def\begintt{$$\let\par=\endgraf \ttverbatim \parskip=0pt
- \catcode`\|=0 \rightskip=-5pc \ttfinish}
- {\catcode`\|=0 |catcode`|\=\other
- |obeylines
- |gdef|ttfinish#1^^M#2\endtt{#1|vbox{#2}|endgroup$$}}
-
- \catcode`\|=\active
- {\obeylines\gdef|{\ttverbatim\spaceskip=\ttglue\let^^M=\ \let|=\endgroup}}
-
- \def\beginlines{\par\begingroup\nobreak\medskip\parindent=0pt
- \nobreak\ninepoint \obeylines \everypar{\strut}}
- \def\endlines{\endgroup\medbreak\noindent}
-
- \def\<#1>{\leavevmode\hbox{$\langle$#1\/$\rangle$}}
-
- \def\dbend{{$\triangle$}}
- \def\d@nger{\medbreak\begingroup\clubpenalty=10000
- \def\par{\endgraf\endgroup\medbreak} \noindent\hang\hangafter=-2
- \hbox to0pt{\hskip-\hangindent\dbend\hfill}\ninepoint}
- \outer\def\danger{\d@nger}
- \def\dd@nger{\medskip\begingroup\clubpenalty=10000
- \def\par{\endgraf\endgroup\medbreak} \noindent\hang\hangafter=-2
- \hbox to0pt{\hskip-\hangindent\dbend\kern 1pt\dbend\hfill}\ninepoint}
- \outer\def\ddanger{\dd@nger}
- \def\enddanger{\endgraf\endsubgroup}
-
- \def\cstok#1{\leavevmode\thinspace\hbox{\vrule\vtop{\vbox{\hrule\kern1pt
- \hbox{\vphantom{\tt/}\thinspace{\tt#1}\thinspace}}
- \kern1pt\hrule}\vrule}\thinspace}
-
- \newcount\exno
- \exno=0
-
- \def\xd@nger{%
- \begingroup\def\par{\endgraf\endgroup\medbreak}\ninepoint}
-
- \outer\def\warning{\medbreak
- \noindent\llap{$\bullet$\rm\kern.15em}%
- {\ninebf WARNING}\par\nobreak\noindent}
- \outer\def\exercise{\medbreak \global\advance\exno by 1
- \noindent\llap{$\bullet$\rm\kern.15em}%
- {\ninebf EXERCISE \bf\the\exno}\par\nobreak\noindent}
- \def\dexercise#1{\global\advance\exno by 1
- \medbreak\noindent\llap{$\bullet$\rm\kern.15em}%
- #1{\eightbf ~EXERCISE \bf\the\exno}\hfil\break}
- \outer\def\dangerexercise{\xd@nger \dexercise{\dbend}}
- \outer\def\ddangerexercise{\xd@nger \dexercise{\dbend\dbend}}
-
-
- \newwrite\ans%
- \newwrite\conts%
- \iffiles
- \immediate\openout\conts=$.games.infocom.ftp.toolkit.mandir.conts
- \fi
-
- \iffiles\else\outer\def\answer#1{\par\medbreak}\shutuptrue\fi
-
- \newwrite\inx
- \ifshutup\else
- \immediate\openout\inx=$.games.infocom.ftp.toolkit.mandir.inxdata
- \fi
- \def\marginstyle{\sevenrm %
- \vrule height6pt depth2pt width0pt } %
-
- \newif\ifsilent
- \def\specialhat{\ifmmode\def\next{^}\else\let\next=\beginxref\fi\next}
- \def\beginxref{\futurelet\next\beginxrefswitch}
- \def\beginxrefswitch{\ifx\next\specialhat\let\next=\silentxref
- \else\silentfalse\let\next=\xref\fi \next}
- \catcode`\^=\active \let ^=\specialhat
- \def\silentxref^{\silenttrue\xref}
-
- \newif\ifproofmode
- \proofmodetrue %
-
- \def\xref{\futurelet\next\xrefswitch}
- \def\xrefswitch{\begingroup\ifx\next|\aftergroup\vxref
- \else\ifx\next\<\aftergroup\anglexref
- \else\aftergroup\normalxref \fi\fi \endgroup}
- \def\vxref|{\catcode`\\=\active \futurelet\next\vxrefswitch}
- \def\vxrefswitch#1|{\catcode`\\=0
- \ifx\next\empty\def\xreftype{2}%
- \def\next{{\tt\text}}%
- \else\def\xreftype{1}\def\next{{\tt\text}}\fi %
- \edef\text{#1}\makexref}
- {\catcode`\|=0 \catcode`\\=\active |gdef\{}}
- \def\anglexref\<#1>{\def\xreftype{3}\def\text{#1}%
- \def\next{\<\text>}\makexref} %
- \def\normalxref#1{\def\xreftype{0}\def\text{#1}\let\next=\text\makexref}
-
- \def\makexref{\ifproofmode%
- \xdef\writeit{\write\inx{\text\space!\xreftype\space
- \noexpand\number\pageno.}}\iffiles\writeit\fi
- \else\ifhmode\kern0pt \fi\fi
- \ifsilent\ignorespaces\else\next\fi}
-
- \newdimen\fullhsize
- \def\fullline{\hbox to\fullhsize}
- \let\lr=L \newbox\leftcolumn
-
- \def\doubleformat{\shipout\vbox{\makeheadline
- \fullline{\box\leftcolumn\hfil\columnbox}
- \makefootline}
- \advancepageno}
- \def\tripleformat{\shipout\vbox{\makeheadline
- \fullline{\box\leftcolumn\hfil\box\midcolumn\hfil\columnbox}
- \makefootline}
- \advancepageno}
- \def\columnbox{\leftline{\pagebody}}
-
- \newbox\leftcolumn
- \newbox\midcolumn
- \def\beginindex{
- \fullhsize=6.5true in \hsize=2.1true in
- \global\def\makeheadline{\vbox to 0pt{\vskip-22.5pt
- \fullline{\vbox to8.5pt{}\the\headline}\vss}\nointerlineskip}
- \global\def\makefootline{\baselineskip=24pt \fullline{\the\footline}}
- \output={\if L\lr
- \global\setbox\leftcolumn=\columnbox \global\let\lr=M
- \else\if M\lr
- \global\setbox\midcolumn=\columnbox \global\let\lr=R
- \else\tripleformat \global\let\lr=L\fi\fi
- \ifnum\outputpenalty>-20000 \else\dosupereject\fi}
- \begingroup
- \parindent=1em \maxdepth=\maxdimen
- \def\par{\endgraf \futurelet\next\inxentry}
- \obeylines \everypar={\hangindent 2\parindent}
- \exhyphenpenalty=10000 \raggedright}
- \def\inxentry{\ifx\next\sub \let\next=\subentry
- \else\ifx\next\endindex \let\next=\vfill
- \else\let\next=\mainentry \fi\fi\next}
- \def\endindex{\mark{}\break\endgroup
- \supereject
- \if L\lr \else\null\vfill\eject\fi
- \if L\lr \else\null\vfill\eject\fi
- }
- \let\sub=\indent \newtoks\maintoks \newtoks\subtoks
- \def\mainentry#1,{\mark{}\noindent
- \maintoks={#1}\mark{\the\maintoks}#1,}
- \def\subentry\sub#1,{\mark{\the\maintoks}\indent
- \subtoks={#1}\mark{\the\maintoks\sub\the\subtoks}#1,}
-
- \def\subsection#1{\medbreak\par\noindent{\bf #1}\qquad}
-
- % For contents
-
- \def\cl#1#2{\bigskip\par\noindent{\bf #1}\quad {\bf #2}}
- \def\li#1#2#3{\smallskip\par\noindent\hbox to 5 in{{\bf #1}\quad #2\dotfill #3}}
- \def\sli#1#2#3{\par\noindent\hbox to 5 in{\qquad\item{#1}\quad #2\dotfill #3}}
- \def\fcl#1#2{\bigskip\par\noindent\hbox to 5 in{\phantom{\bf 1}\quad {\bf #1}\dotfill #2}}
-
- % Epigrams
-
- \def\poem{\begingroup\narrower\narrower\narrower\obeylines\ninepoint}
- \def\widepoem{\begingroup\narrower\narrower\obeylines\ninepoint}
- \def\verywidepoem{\begingroup\narrower\obeylines\ninepoint}
- \def\quote{\medskip\begingroup\narrower\narrower\noindent\ninepoint}
- \def\poemby#1#2{\par\smallskip\qquad\qquad\qquad\qquad\qquad -- #1, {\it #2}
- \tenpoint\endgroup\bigskip}
- \def\widepoemby#1#2{\par\smallskip\qquad\qquad\qquad -- #1, {\it #2}
- \tenpoint\endgroup\bigskip}
- \def\quoteby#1{\par\smallskip\qquad\qquad\qquad\qquad\qquad
- -- #1\tenpoint\endgroup\bigskip}
- \def\endquote{\par\tenpoint\endgroup\medskip}
-
- %
- % End of macros
-
- \def\subtitle#1{\bigbreak\noindent{\bf #1}\medskip}
-
- \titletrue
-
- \centerline{\bigfont The Specification of the Z-Machine}
- \vskip 0.3in
- \centerline{\medfont and Inform assembly language}
- \vskip 0.3in
- \centerline{\sl last updated 24/9/94}
- \vskip 0.5in
-
- \sli{1}{Introduction}{2}
- \sli{2}{Resources available}{4}
- \sli{3}{History and the six versions}{5}
- \sli{4}{How text is encoded}{7}
- \sli{5}{How instructions are encoded}{9}
- \sli{6}{The early Z-machine}{12}
- \sli{7}{The late Z-machine}{16}
- \sli{8}{Complete table of opcodes}{22}
- \sli{9}{Dictionary of opcodes}{29}
- \sli{10}{Header format through the ages}{40}
- \sli{11}{A few statistics}{41}
-
-
- \vfill\eject
- \section{1}{Introduction}
-
- \quote
- The legend that every cipher is breakable is of course absurd,
- though still quite widespread among people who should know better.
- \poemby{J. E. Littlewood}{A Mathematician's Miscellany}
- \quote
- There is an obvious resemblance between an unreadable script
- and a secret code; similar methods can be employed to break
- both. But the differences must not be overlooked. The code is
- deliberately designed to baffle the investigator; the script
- is only puzzling by accident.
- \poemby{John Chadwick}{The Decipherment of Linear B}
-
- The Z-machine is an imaginary computer originally devised by Joel Berez and
- Marc Blank in 1979 to run the Infocom adventure games. Since the demise of
- Infocom much effort by many people has gone into deciphering it and
- implementing it with new portable interpreters to allow modern-day players
- to run the classic Infocom games. The Z-machine is also the run-time code
- format of the Inform compiler, which means that there are now more
- Infocom-format games in play than the ones Infocom actually wrote.
-
- It is well-adapted to its task. Its behaviour is (very, very nearly)
- exactly specified and it has been accurately implemented on virtually every
- small computer. It maintains a hierarchy of objects and possessions, and
- does the computationally-intensive part of parsing input itself.
-
- The purpose of this paper is to fully document the Z-machine, discuss to
- what extent it is presently implemented and detail how to use Inform as an
- assembler.
-
- Only a few of the pieces in this jigsaw were placed by myself, and the
- credit belongs to many people. Old hands at the decipherment game will
- no doubt find the opcode table tiresomely familiar: but, as with a chemist
- finding Mendeleyev's periodic table on a laboratory wall, so will the hacker
- be reassured by the sight.
-
- I gratefully acknowledge the help of Paul David Doherty and Mark Howell, who
- each read a draft of this paper and sent back detailed corrections.
- Mistakes and misunderstandings remain my own.
- \medskip
-
- To begin, three general points. The fascination with the letter Z began
- with `Zork': apparently ``zork" was a nonsense word used at MIT for the
- current uninstalled program in progress, and stuck. The Z-machine runs what
- we shall call Z-code. Just as we shall use the term ``Z-machine" for both
- the machine and its loaded program, so ZIP (Zork Implementation Program) was
- used to mean either the interpreter or the object code it interpreted. Code
- was written in ZIL (Zork Implementation Language), which was derived from
- MDL (informally called ``muddle"), a particularly unhelpful form of LISP.
- It was then compiled by ZILCH to assembly code which was passed to ZAP to
- make the ZIP. We refer to code as ``Z-code" to avoid confusion with ``Zip",
- the name of Mark Howell's interpreter (by far the best available).
-
- Secondly. In talking about ``the Z-machine", what do we really mean: the
- design Infocom had in mind, the syntax which seems to be in their surviving
- game files, or what is actually done by various interpreters, theirs or
- ours? Aided by the patient detective work of my predecessors (e.g.
- disassembling Infocom-written interpreters, and going through all existing
- game files) I shall try to give all three specifications. (Inform
- assembly-language programmers will need to bear in mind that it is the third
- that really counts.)
-
- For the standard format (version 3) there are many existing games and there
- isn't much conflict. But for later versions, there are few games, not all
- the opcodes were ever used and the publicly available interpreters
- disagree about what to do with some of the obscure ones. To some extent
- this account is an attempt to settle arguments.
-
- Finally, note that the Z-machine does not provide the bulk of a game's
- parser, or its `operating system'. The parser has to be coded, and the
- tables it uses (which some investigators think are part of the Z-code
- format) are in fact the same across different Infocom games only because
- they contain similar parsers. So those are not specified here. An account
- of the parsing tables as generated by Inform can be found in the {\sl Inform
- Technical Manual}. For the usual format of Infocom's parsing tables, see
- the C source code to Mark Howell's utility ``Infodump''.
-
- \medskip
-
- Hexadecimal numbers are written with an initial dollar, as in |$ff|, while
- binary numbers are written with a double-dollar as in |$$11011|, according
- to Inform conventions. The bits in a byte are numbered 0 to 7, 0 being
- the least significant and the top bit, 7, the most.
-
- \medskip
- \hbox to\hsize{\hfill\it Graham Nelson}
- \hbox to\hsize{\hfill\it Magdalen College, Oxford}
- \hbox to\hsize{\hfill\it September 1994}
- \vfill\eject
-
- \section{2}{Resources available}
-
- \quote
- ...the dead hand of the academy had yet to stifle the unbridled
- enthusiasms of a small band of amateurs in Europe and America.
- \poemby{Michael D. Coe}{Breaking the Maya Code}
-
- (This document representing the dead hand of the academy.)
-
- The four publicly available {\bf interpreters} that I know of are:
- \item{$\bullet$} ``Zip'', the fastest and most accurate, which is currently
- being updated to interpret even version 6;
- \item{$\bullet$} ``InfoTaskForce'' (henceforth ITF), which is almost as good
- for most purposes but slightly inaccurate in some screen-handling matters
- and does not provide the necessary features for ``undo'' in Version 5 games;
- \item{$\bullet$} ``Pinfocom'', which is competent on version 3 games but
- unable to cope with higher versions;
- \item{$\bullet$} ``Zterp'', similarly primitive.
- \medskip
-
- Bryan Scattergood has made a considerable enhancement of ITF for his
- Psion and Archimedes interpreters. However, the ITF no longer seems to
- exist as such.
-
- The only existing {\bf compiler} is Inform, since Zilch no longer exists.
-
- Mark Howell's toolkit of utility programs includes a {\bf disassembler}
- called ``txd'' and a {\bf vocabulary dumper} called ``infodump'', together
- with other less generally useful programs.
-
- An enhanced version of Zip which will be a source-level {\bf debugger} for
- Inform games, called Infix, will soon be available.
-
- The {\bf Infocom story files} are, with a few exceptions (the samplers),
- copyright and are currently being sold by Activision in the collections
- `The Lost Treasures of Infocom'. They represent excellent value for
- money. They should not be present at any archive site, and any copies
- found there are held illegally.
-
- A few other {\bf story files}, such as `Curses' and `Advent', are freely
- available.
- \medskip
-
- Most of the above programs have publicly available source code (in C)
- and many have executables as well; the |if-archive| at the anonymous
- ftp site |ftp.gmd.de| is the best place to find them.
-
- A curse of these programs is that they almost all use different names for
- the opcodes internally (that is, in their source code). Mark Howell and
- I (as authors of the disassembler and assembler, respectively) have agreed
- on what we think is a reasonable standard, and these are the opcode names
- documented here. They are used from Inform 5.4 and in recent editions of
- txd.
-
-
- \section{3}{History and the six versions}
-
- \quote
- Confusion now hath made his masterpiece
- \poemby{Shakespeare}{Macbeth}
-
- There were six main versions of the Z-machine, and several minor variant
- forms. These are recognisably similar but with labyrinthine differences,
- like different archaic dialects of the same language. (And, of course, the
- job of decipherment is made harder by the fact that the archaeological
- record suddenly stops in about 1989 when the civilisation in question
- collapsed.)
-
- Broadly, these fall into two groups: early (versions 1 to 3) and late
- (4 to 6). This paper will give an expository account of versions 3 and 5
- (as representative of these two groups) but will conclude with brief tables
- and specification for all versions.
-
- The six versions are:
- \smallskip
- \settabs\+\indent Version 1\quad&\cr
- \+ Version 1 & Early Apple ][ games for DOS 3.2, and the TRS-80 Models I/II\cr
- \+ Version 2 & Early Apple ][ games for DOS 3.3, and the TRS-80 Models I/II\cr
- \smallskip
- \+ Version 3 & ``Standard'' series games\cr
- \+ Version 4 & ``Plus'' series games\cr
- \+ Version 5 & ``Advanced" series games, or, as the marketing division would\cr
- \+ & have it, ``Solid Gold Interactive Fiction" - a reference to\cr
- \+ & the colour (though not composition) of the boxes they came in\cr
- \smallskip
- \+ Version 6 & Later games with graphics, mouse support, sound effects, etc.\cr
- \smallskip
- \noindent
- Infocom called their own interpreters ZIP (versions 1 to 3), EZIP/LZIP
- (V4), XZIP (V5) and YZIP (V6).
-
- Versions 1 and 2 are thought to be extinct, though collectors have a few
- fossils and Zip and ITF implement them anyway. Many Version 3 games are
- still in circulation, and enough worthwhile Version 4 and 5 ones to make
- the format important.
-
- Most of the Infocom games exist in several different releases, and some
- were written for one version and then ported to later ones. `Zork I', for
- instance, exists in at least ten editions, two early, seven in version-3
- (with release numbers between 5 and 88 in chronological order) and one in
- version 5 (release 52 - the releases go back to 1 when the version changes).
-
- There are few version 6 games, and they are of (arguably) poorer quality.
- Few interpreters exist for them, because they are inherently difficult to
- port to different machines. However, there will be a brief discussion of
- the version-6 format here and in effect a full specification in the
- dictionary which follows the opcode table.
-
- The definitive guide to all Infocom story files known to exist is Paul David
- Doherty's ``fact sheet'' file, which can be found at |ftp.gmd.de|.
- \bigskip
-
- The Z-machine as originally constructed was surprisingly similar to that
- in use when Infocom ground to a halt. Version 1 (1979-80) had essentially
- the same object format, for instance, and a similar header, but encoded text
- with a different character table and had no concept of synonyms. Its
- addresses were all word-addresses and not byte-addresses, so presumably a
- small amount of memory was wasted in null bytes to fix parities everywhere.
-
- Version 2 was quite a minor enhancement, presumably made only because a new
- interpreter had to be written anyway. Synonyms appeared, but only in one
- 32-word bank, and the six-digit serial number appeared in the header,
- though it wasn't always the date in those days: Release 7 of `Zork II',
- for instance, is (reputedly) called |UG3AU5|.
-
- Version 3 changed the text encoding alphabets again, and tripled the number
- of synonyms possible. (Consequently the previous ``caps lock" style
- permanent changes of alphabet were dropped.) The ``verify" code and verify
- checksums appeared; and a new opcode to print the status bar at the top of
- the screen was introduced. (Previously, this was updated only when input
- was taken from the keyboard.) The earliest Version-3 releases (`Deadline',
- then `Zork I' and `II') were in March and April 1982; the latest (the
- `Minizork', a cassette-based Commodore-64 sampler of `Zork') in November
- 1987.
-
- A primitive form of screen-splitting (which, presumably, was devised in a
- hurry in 1984 and then accidentally became the foundation for the character
- graphics designs of later versions) was allowed by some interpreters, in
- order to give `Seastalker' a sonar display. In order that `Seastalker'
- should run on less enlightened interpreters, the game itself contained code
- to check whether this feature was available before using the opcodes.
- And `The Lurking Horror' (1987) has sound effects (on some machines) - another
- sign of things to come.
-
- Nevertheless by 1982 the Z-machine had stabilised to a reasonably clean
- design. It was very portable, contained everything reasonably necessary and
- most of its complications were optimisations to squeeze a few more bytes out
- of the 100K or so available on an early-1980s floppy disc. (Actually
- Zilch's code generator, although very good at exploiting these tricks, had
- little larger-scale optimisation, and some of its code makes disheartening
- reading. But then the same could be said of Inform.)
- \medskip
-
- By 1985 there were two basic pressures to change. One was that home
- computers were larger, and several fundamental restrictions (the game size
- being only 128K, the number of objects only 255, the attributes only 32,
- the properties only 30) were beginning to bite. The other was the drive for
- more gimmicks - character graphics, flashier status bars, sound effects,
- different typefaces, and so on. The former led to logical, easy to
- understand structural changes in the machine. The latter, in contrast, made
- a mess of the system of opcodes.
-
- More does not mean better: just because the price of paper falls is no
- reason to double the size of the modern novel, for instance. Nor is
- literature (pace e. e. cummings) much improved by using four different
- typefaces and illustrating it with typewriter pictures. Also, the relieving
- of size restrictions only increased design time - or lowered its quality.
-
- Nonetheless, two excellent games resulted from the lifting of size
- restrictions. In August 1985 the first version-4 game (`A Mind Forever
- Voyaging') reached production, and it was followed most notably by
- `Trinity' (which had previously been shelved as too ambitious for the
- version-3 format). Still, most of the new 1985/6 games remained in
- version 3, since there were still plenty of 8-bit home computers
- around that were too small for version-4 games. For the same reason,
- despite critical acclaim, the new games did not sell as well.
-
- Version 5 games began to appear in September 1987 with `Beyond Zork' and
- `Border Zone'. Both of these games needed new features - character graphics
- gone wild in the case of the former, and real-time keyboard interaction in
- the latter. The number of opcodes grew ever faster as a result.
-
- Although five old games were re-released in Version 5 editions (with an
- in-game hints system added, and benefiting from 9-letter word dictionaries,
- but otherwise as written), the direction was all too clearly away from
- the old text game into graphics. Having gradually moved this way (`Beyond
- Zork' can look like a parody of an early mainframe maze game, for instance)
- there was nothing left but to complete the process, and so Version 6 was
- born. After something of a hiatus in 1988, the last few
- increasingly-unrecognisable Infocom games appeared: `Zork Zero', `Shogun',
- `Journey', `Arthur'.
-
- Infocom gradually ceased to exist during 1987-9 for financial reasons
- generally said to be unrelated to their games output. Whether they would
- have continued to release text games of the classical style is arguable.
-
- \section{4}{How text is encoded}
-
- \quote
- This technique is similar to the five-bit Baudot code, which
- was used by early Teletypes before ASCII was invented.
- \endquote
- \quote Marc S. Blank and S. W. Galley, {\sl How to Fit a Large Program
- Into a Small Machine}
- \endquote
-
- Text is stored as a sequence of 2-byte words. Each of these is divided into
- three 5-bit pieces, plus 1 bit left over, arranged as
- \begintt
- --first byte-------   --second byte---
- 7    6 5 4 3 2  1 0   7 6 5  4 3 2 1 0
- bit  --first--  --second---  --third--
- \endtt
- The bit is set only on the last 2-byte word of the text, and so marks the
- end.
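-
- As a minimal sketch of the unpacking just described (written in Python,
- which this paper does not otherwise use), the function below splits a run
- of bytes into Z-characters. The name |unpack_zchars| is hypothetical, and
- the high-byte-first order of each word is the one given under ``Numbers
- and addresses'' in section 5.
-
```python
def unpack_zchars(data):
    """Split 2-byte words into 5-bit Z-characters.

    Reading stops after the first word whose top bit (the end marker) is set,
    so any bytes following the text are ignored.
    """
    zchars = []
    for i in range(0, len(data) - 1, 2):
        word = (data[i] << 8) | data[i + 1]   # high byte first
        zchars.append((word >> 10) & 0x1F)    # bits 14 to 10: first Z-character
        zchars.append((word >> 5) & 0x1F)     # bits 9 to 5: second Z-character
        zchars.append(word & 0x1F)            # bits 4 to 0: third Z-character
        if word & 0x8000:                     # bit 15: last word of the text
            break
    return zchars
```
-
- Given the two bytes |$e5| and |$aa| it returns 25, 13 and 10 and then
- stops, since the top bit of that word is set.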
-
- These pieces are called `Z-characters' and have values in the range 0 to 31.
-
- There are three alphabets, in which the numbers 6 to 31 mean:
- \begintt
- A0   abcdefghijklmnopqrstuvwxyz
- A1   ABCDEFGHIJKLMNOPQRSTUVWXYZ
- A2    ^0123456789.,!?_#'"/\-:()
- \endtt
- (Here the new-line character is written as a circumflex |^|).
-
- Character 0 is a space in all alphabets. Characters 1, 2 and 3 are used for
- abbreviations: thus, 1 followed by 14 means ``print entry 14 in the synonym
- table"; 2 followed by 5 means ``print entry 32+5=37..."; 3 followed by 20
- means ``print entry 64+20=84..." and so on.
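-
- The arithmetic in these three examples follows one rule, sketched here in
- Python. The function name |abbreviation_index| is hypothetical; the rule
- itself is just the one the examples imply, with 32 entries per bank.
-
```python
def abbreviation_index(z, n):
    """Return the synonym-table entry selected by Z-character z followed by n.

    z is 1, 2 or 3 and picks one of three banks of 32 entries;
    n (0 to 31) is the entry within that bank.
    """
    if z not in (1, 2, 3):
        raise ValueError("z must be 1, 2 or 3")
    if not 0 <= n <= 31:
        raise ValueError("n must be a Z-character, 0 to 31")
    return 32 * (z - 1) + n
```
-
- For the three cases in the text it gives 14, 37 and 84 respectively.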
-
- The Z-machine provides these for commonly occurring strings to be printed
- out as if they were characters, thus saving memory. Though they are
- actually abbreviations, by accident of history they have come to be called
- `synonyms'. (Well chosen synonyms tend to make about a 10\% space saving.)
-
- By default, a character is presumed to be in A0, i.e. to be a lower-case
- English letter. However, the character 4 means that the next one (only) is
- in A1; and 5 means the next is in A2.
-
- (Note for purists: actually the full rule is
- \begintt
-       A0      A1      A2
- 4   [A1->]  [A1->]  [A0->]
- 5   [A2->]  [A0->]  [A2->]
- \endtt
- but since alphabet changes are (in versions 3 and onward) not permanent,
- it seems pointless ever to use 4 and 5 in alphabets 1 and 2.)
-
- Notice that character 6 in A2 is blank. It isn't a space: it simply isn't
- there. The sequence 5 followed by 6 indicates that the next two characters
- define an ASCII value. This is the way to get at the characters not in any
- of the three alphabets. For example, the familiar message
- \begintt
- *** You are dead ***
- \endtt
- takes four Z-characters to produce each of the asterisks.
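-
- The rules so far can be drawn together in a short Python sketch. It is
- only an illustration: |decode_zchars| is a hypothetical name, it uses
- the simple rule that 4 and 5 affect the next character only, and it
- leaves abbreviations unexpanded. The text says only that two characters
- define the ASCII value; the sketch assumes they are its high and low
- five bits, in that order.
-
```python
A0 = "abcdefghijklmnopqrstuvwxyz"
A1 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# Index 0 stands for Z-character 6, which in A2 is the ASCII escape and is
# never printed (the space here is only a placeholder); index 1 is new-line.
A2 = " \n0123456789.,!?_#'\"/\\-:()"
ALPHABETS = (A0, A1, A2)


def decode_zchars(zchars):
    """Turn a list of Z-characters (0 to 31) into a string."""
    out = []
    shift = 0                      # alphabet used for the next character only
    i = 0
    while i < len(zchars):
        z = zchars[i]
        i += 1
        if z == 0:
            out.append(" ")
            shift = 0
        elif z in (1, 2, 3):
            # Abbreviation: consume the index character, emit a marker only.
            if i < len(zchars):
                out.append("{abbrev %d}" % (32 * (z - 1) + zchars[i]))
                i += 1
            shift = 0
        elif z == 4:
            shift = 1
        elif z == 5:
            shift = 2
        elif shift == 2 and z == 6:
            # ASCII escape (assumption: high five bits first, then low five).
            if i + 1 < len(zchars):
                out.append(chr((zchars[i] << 5) | zchars[i + 1]))
            i += 2
            shift = 0
        else:
            out.append(ALPHABETS[shift][z - 6])
            shift = 0
    return "".join(out)
```
-
- Under that assumption an asterisk (ASCII 42) is the four Z-characters
- 5, 6, 1, 10, and trailing 5's decode to nothing.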
-
- Finally, note that the end-bit only comes up once every three characters,
- so that a way is needed to safely use up any spare characters in the last
- 2-byte block. This is done by padding out with 5's. (5 followed by 5 does
- nothing.)
-
- This is especially the case with dictionary entries. Some dictionary
- entries, like ``i", ought only to take one 2-byte block, but in order to make
- all entries the same number of 2-byte blocks long and so alphabetically
- sortable by number, they are padded out by as many 5's in a row as needed
- (possibly as many as eight of them). Dictionary entries are not permitted
- to use synonyms and their letters are in lower case (though they can
- contain characters from A2).
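-
- A hedged Python sketch of this padding follows. The function name
- |encode_dictionary_word| and its |zchar_count| parameter are
- hypothetical, and only letters from A0 are handled. The figure of nine
- Z-characters used in the test is taken from the 9-letter dictionaries
- mentioned in section 3; other versions would need a different count.
-
```python
A0 = "abcdefghijklmnopqrstuvwxyz"


def encode_dictionary_word(word, zchar_count):
    """Encode a lower-case word as a fixed number of Z-characters, padded with 5's.

    Raises ValueError for characters outside A0 (str.index does this).
    """
    if zchar_count % 3 != 0:
        raise ValueError("zchar_count must be a multiple of 3")
    zchars = [A0.index(c) + 6 for c in word[:zchar_count]]   # truncate long words
    zchars += [5] * (zchar_count - len(zchars))              # pad with 5's
    out = bytearray()
    for i in range(0, zchar_count, 3):
        word16 = (zchars[i] << 10) | (zchars[i + 1] << 5) | zchars[i + 2]
        if i + 3 == zchar_count:
            word16 |= 0x8000                                 # end bit on the last word
        out += bytes([word16 >> 8, word16 & 0xFF])           # high byte first
    return bytes(out)
```
-
- With nine Z-characters the word ``i'' is followed by eight 5's. Because
- 5 is smaller than any letter code, comparing the encoded bytes
- numerically puts ``i'' before ``in'' before ``it''.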
-
- In practice the text compression factor is not really very good: for
- instance, 155000 characters of text squash into 99000 bytes. (Text
- usually accounts for about 75\% of a story file.) But the encoding does
- at least encrypt the text so that casual browsers can't read it.
- \medskip
-
- Footnotes: 1. The versions 1 and 2 formats are slightly different: see
- below.
-
- 2. In versions 5 and 6, the three alphabet blocks need not be the
- default ones A0 to A2 tabulated above, but instead can be chosen by the
- story file itself by means of an entry in the game's header.
-
- 3. In version 6, it is expected that the ASCII codes for tab (9) and
- control-K (11) are printed slightly differently: a tab at the start of a
- line should be a paragraph indentation suitable for the font being used, but
- anywhere in the middle of a line should be a space; and 11 should be
- rendered as a gap between two sentences.
-
-
- \section{5}{How instructions are encoded}
-
- \widepoem
- We do but teach bloody instructions
- Which, being taught, return to plague th' inventor
- \poemby{Shakespeare}{Macbeth}
-
- This account is to be read in conjunction with the opcode table and
- dictionary, so it does not tabulate or individually discuss opcodes.
- Experimenting with Inform as an assembler, while tracing is turned on, may
- be helpful.
-
- Except for the printing instructions |print| and |print_ret|, which are
- simply opcodes followed by an encoded string, an instruction consists of
- the following:
- \begintt
- Opcode               1 byte (possibly 2 in versions 5-6)
- (Types of operands)  1 byte; only for VAR form opcodes
- Operands             Between 0 and 4, each taking 1-2 bytes
- (Store)              1 byte; variable to store a result
- (Branch)             1-2 bytes; offset to branch to
- \endtt
- (not all opcodes take ``store'' or ``branch''; a few take both).
-
- \subtitle{Operands}
-
- Z-code understands four kinds of operand, and describes these in 2-bit
- fields:
- \begintt
- $$00    Large constant (>=256 or <0)    2 bytes
- $$01    Small constant (0 to 255)       1 byte
- $$10    Variable                        1 byte
- $$11    Omitted altogether              0 bytes
- \endtt
- Variables are described in one byte. |$00| means the top of the stack,
- |$01| to |$0f| are the local variables of the current routine and |$10| to
- |$ff| are the global variables, 0 to 239. Writing to the stack pointer,
- or variable |$00|, pushes something onto the stack; and reading from it
- pulls it off. The stack can also be manipulated with the use of opcodes.
- The stack is guaranteed to be at least 512 bytes long, and some interpreters
- are more generous. There isn't any way for a Z-code program to check stack
- overflowing, so recursion requires care.
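-
- The variable numbering can be illustrated with a small Python model.
- This is a sketch under simplifying assumptions, not a description of any
- real interpreter: the class name |Variables| is hypothetical, the locals
- are kept in one flat list rather than per routine, and no stack limit is
- enforced.
-
```python
class Variables:
    """Toy model of the one-byte variable numbers."""

    def __init__(self):
        self.stack = []
        self.locals = [0] * 15      # variable numbers $01 to $0f
        self.globals = [0] * 240    # variable numbers $10 to $ff (globals 0 to 239)

    def read(self, number):
        if number == 0x00:
            return self.stack.pop()             # reading pulls from the stack
        if number <= 0x0F:
            return self.locals[number - 1]
        return self.globals[number - 0x10]

    def write(self, number, value):
        if number == 0x00:
            self.stack.append(value)            # writing pushes onto the stack
        elif number <= 0x0F:
            self.locals[number - 1] = value
        else:
            self.globals[number - 0x10] = value
```
-
- Writing twice to variable |$00| and then reading twice returns the two
- values in reverse order, leaving the stack as it was.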
-
- \subtitle{Opcodes}
-
- In versions 1 to 4, Z-code opcodes are 1 byte only. To begin with, look at
- the top two bits. If these are |$$11|, we shall call it ``variable"; if
- |$$10|, ``short" (0OP or 1OP, i.e. 0 or 1 operands); and otherwise ``long"
- (2OP: 2 operands). In versions 5 and 6, there are also ``extended", EXT,
- opcodes two bytes long.
- \medskip
-
- For short opcodes, look at the next two bits (4 and 5). These give the kind
- of operand which the code has. If this is |$$11|, there isn't an operand and
- the opcode has no argument at all. In this event, the opcode number is the
- bottom 4 bits (see table of 0OP opcodes).
-
- If the type wasn't |$$11|, then an operand follows of the given type (large
- constant, small constant or variable), and the bottom four bits give the
- opcode number (see table of 1OP opcodes).
- \medskip
-
- Long opcodes have two operands. The bottom 5 bits of the opcode say what
- it is (see table of 2OP opcodes).
-
- The alert reader will notice that this only leaves bits 5 and 6 spare to
- hold the operand types. As there are two operands to specify, this ought
- to take up 4 bits, which obviously won't fit. So a more economical form is
- used instead. Bit 6 refers to the first operand, and bit 5 to the second.
- A value of 0 means a small constant and 1 means a variable. Now, type |$$11|
- (not really there) operands can't happen, so that's no problem, but there
- might well be type |$$00| (large constant) operands, for example in assembling
- \begintt
- @mul x #666 sp;
- \endtt
- In this event, the opcode must instead be assembled as a ``variable" opcode.
- \medskip
-
- So we must now describe the ``variable" or VAR opcode form. In addition to
- the possible opcodes which can arise from overflowing ``long" opcodes, there
- are others which can only be ``variable". In the former case bit 5 is clear
- and in the latter it is set (bits 7 and 6 are both set in every ``variable"
- opcode, so neither can carry this). In either case the bottom 5 bits contain the
- opcode number: see the 2OP or VAR tables accordingly.
-
- Some of these are only of ``variable" type because the available codes for
- the other types had run out; |print_char|, for instance. Others, especially
- |call|, need the flexibility to have between 1 and 4 operands.
-
- In the ``variable" type opcode, all eight bits of the opcode have been used
- up, so we have to add another byte describing the operands. This is divided
- into four 2-bit fields. For example, |$$00101111| means large constant
- followed by variable (and no third or fourth operand).
-
- Once the opcode is out of the way, the operands are simply stored in one or
- two-byte form as appropriate.
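The decoding rules of this section can be gathered into one rough Python sketch for the one-byte opcodes of versions 1 to 4. The function name, the returned tuple and the string labels are all illustrative assumptions. Since bits 7 and 6 are both set in any ``variable" opcode, the sketch takes the bit just above the five-bit opcode number (bit 5) as the 2OP/VAR distinction, and it assumes omitted operands only ever occur at the end of the type byte.

```python
OPERAND_TYPES = {0b00: "large", 0b01: "small", 0b10: "variable", 0b11: "omitted"}

def decode_opcode(code, pos):
    """Decode the opcode byte (and type byte, if any) at code[pos].

    Returns (form, table, number, operand_types, next_pos), where table is
    "0OP", "1OP", "2OP" or "VAR" and next_pos is where the operands begin.
    """
    byte = code[pos]
    top = byte >> 6
    if top == 0b11:                                   # "variable" form
        table = "VAR" if byte & 0x20 else "2OP"       # bit 5 picks the table
        types = []
        for shift in (6, 4, 2, 0):                    # four 2-bit fields
            kind = OPERAND_TYPES[(code[pos + 1] >> shift) & 0b11]
            if kind == "omitted":
                break                                 # assumed: nothing follows
            types.append(kind)
        return ("variable", table, byte & 0x1F, types, pos + 2)
    if top == 0b10:                                   # "short" form
        kind = OPERAND_TYPES[(byte >> 4) & 0b11]      # bits 4 and 5
        if kind == "omitted":
            return ("short", "0OP", byte & 0x0F, [], pos + 1)
        return ("short", "1OP", byte & 0x0F, [kind], pos + 1)
    # "long" form: bit 6 describes the first operand, bit 5 the second
    types = ["variable" if byte & 0x40 else "small",
             "variable" if byte & 0x20 else "small"]
    return ("long", "2OP", byte & 0x1F, types, pos + 1)
```

For instance, a 2OP opcode numbered 22 with a variable and then a large constant as operands must overflow into ``variable" form as the bytes |$d6 $8f|, and |decode_opcode(bytes([0xD6, 0x8F]), 0)| gives |("variable", "2OP", 22, ["variable", "large"], 2)|.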
-
- \subtitle{Numbers and addresses}
-
- These are two-byte words, stored in the order high-byte then
- low-byte. The top bit is treated as the sign when needed
- (e.g. for numerical comparisons) and not otherwise (e.g. for addresses).
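As a small illustration, here is how an interpreter might read such a word and apply the sign only when wanted. Both function names are assumptions of this sketch.

```python
def read_word(memory, address):
    """Return the unsigned two-byte word at `address`, high byte first."""
    return (memory[address] << 8) | memory[address + 1]

def as_signed(word):
    """Reinterpret a 16-bit word, treating the top bit as the sign."""
    return word - 0x10000 if word & 0x8000 else word
```

An address would be used straight from |read_word|; a numerical comparison would pass both sides through |as_signed| first.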
-
- When holding an address, such a number can be a byte address, which puts
- it necessarily in the bottom 64K of the memory map, or a packed address.
- Routines and static strings will be at addresses in memory which can be
- pointed to by packed addresses. Given a packed address $p$, the formula
- to obtain the corresponding `real address' in bytes is:
- $$ b = \cases{ 2p & versions 1-3\cr
- 4p & versions 4-5\cr
- 8p+o & version 6\cr} $$
- where the offset $o$ in Version 6 is given in the game header (this can
- be used to stretch the memory map another 64K or so beyond the apparent
- 512K limit).
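The formula translates directly into code. This sketch follows it exactly as stated above; |unpack_address| is a made-up name, and the Version 6 offset is simply passed in, since how it is read from the header is not covered here.

```python
def unpack_address(packed, version, offset=0):
    """Convert a packed address to a byte address, per the formula in the text.

    `offset` stands for the Version 6 header value o and is ignored otherwise.
    """
    if 1 <= version <= 3:
        return 2 * packed
    if version in (4, 5):
        return 4 * packed
    if version == 6:
        return 8 * packed + offset
    raise ValueError("only versions 1 to 6 are handled in this sketch")
```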
-
- \subtitle{Strings, stores, branches}
-
- |print| and |print_ret| are followed by text: this is assembled in the usual
- way immediately after the opcode (which may well be at an odd address, but
- this doesn't matter) and execution resumes after the last 2-byte word of text
- (the one with top bit set).
-
- ``Store" opcodes return a value: for example, |mul| multiplies its two
- arguments together, and |call| calls a routine which must return a value. Such
- instructions are followed by a single byte giving the variable (stack
- pointer, local or global as usual) to put it in. This may look like an extra
- operand but is not: there is no need to tell the Z-machine what type it has,
- since it must be a variable.
-
- Finally, there are instructions which test a condition. More opcodes than
- just the obvious branch instructions do this; e.g. |save| does so (in
- version 3), the test in question being whether or not the save was
- successful. Branches are stored in two different ways for economy reasons:
- nearby ones in a single byte at the end of the instruction, farther ones
- in two such bytes.
-
- The top bit of the first byte of a branch is the ``flag". If this is clear,
- then a branch occurs when the condition came out false. If it is set, then
- the branch occurs when it was true.
-
- If the next bit (bit 6) is set, then the branch is in abbreviated 1-byte
- format and the offset is in the bottom 6 bits (0 to 5). If not, the offset
- is in the bottom 14 bits (0 to 5 of the first byte, and all of the second).
- This offset can be positive or negative. (E.g., all 1's means -1 in the
- usual way.)
-
- In the abbreviated form, an offset of 1 in fact means ``return true from the
- current routine" and an offset of |$20| (i.e., -31) means ``return false". An
- offset of 1 is never useful but -31 might arise, and so it is essential to
- use the long form for such branches.
-
- Working out what the offset ought to be is more complicated than it appears
- because the PC has already moved on from the start of the instruction when
- it reaches the branch. The bizarre formula in question is
- \beginstt
- Offset = Destination address - Address of this instruction - Length + B
- \endtt
- where
- \beginstt
- Length = number of bytes in instruction (not counting the branch)
- \endtt
- and |B| is 1 for short branches, 0 for long ones.
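For anyone writing an assembler, the formula can be set down as a one-line Python function. The name and argument order are assumptions of this sketch; the arithmetic is just the formula above.

```python
def branch_offset(destination, instruction_address, length, short):
    """Offset to store in a branch, per the formula in the text.

    `length` counts the bytes of the instruction up to, but not including,
    the branch itself; `short` selects the one-byte branch form (B = 1).
    """
    b = 1 if short else 0
    return destination - instruction_address - length + b
```

For example, a 3-byte instruction at address 100 branching to address 110 stores offset 8 in the short form and 7 in the long form.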
-
- (For its own code Inform compiles branches in the long form, considering the
- economy to be not worth the nightmarish computation needed to make the
- long/short decision. (One problem is that the number of bytes in each
- instruction must be the same in both passes, so that the decision needs to
- be made before the value of the offset is known... in a 2-pass compiler this
- is insoluble. Another is that the offsets are affected by the size of the
- branch, confusing matters on forward branches.) However, its assembler
- mode allows you to make an explicit choice.)
-
- |jump| instructions similarly encode their address operand as an offset, but
- always as a two-byte (signed) constant.
-
- A few instructions both store results and branch: if so, the store comes
- first.
-
-
- \subtitle{Extended set of opcodes}
-
- The extended (or EXT) set only applies in versions 5 and 6. These are two
- byte opcodes, of which the first byte is always 190, the second the opcode
- number. Subsequently, they behave exactly as VAR...
-
- ...except that, actually, two of them don't. Two of them, |call_vs2| and
- |call_vn2|, have up to 8 operands and so have two bytes of type information
- instead of one. (These are provided for calling functions with up to 7
- arguments instead of only 3, the limit in earlier versions.)
-
- (Inform's assembler is unable to use these two opcodes.)
-
-
- \section{6}{The early Z-machine}
-
- Since the majority of extant Infocom story files use it, this section talks
- about version 3 unless otherwise stated. The following section will indicate
- how the late Z-machine differs.
-
- The early Z-machine has a memory map at most 128K long.
- \topinsert
- \centerline{\sl An example memory map of a small game (produced by Inform)}
- \medskip
- $$ \vbox{\offinterlineskip
- \hrule
- \halign{\vrule#&\strut\quad{\it #}\hfil\quad&\hfil # \quad&%
- \vrule#&\strut\quad\hfil#\hfil\quad&\vrule#\cr
- height2pt&\omit&\omit&&\omit&\cr
- && Start && Contains &\cr
- height2pt&\omit&\omit&&\omit&\cr
- \noalign{\hrule}
- height2pt&\omit&\omit&&\omit&\cr
- & Dynamic& |00000| && header &\cr
- & & |00040| && synonym strings &\cr
- & & |00042| && synonym table &\cr
- & & |00102| && property defaults &\cr
- & & |00140| && objects &\cr
- & & |002f0| && object descriptions &\cr
- & & && and properties &\cr
- & & |006e3| && global variables &\cr
- & & |008c3| && arrays &\cr
- & Static & |00b48| && grammar table &\cr
- & & |010a7| && actions table &\cr
- & & |01153| && preactions table &\cr
- & & |01201| && adjectives table &\cr
- & & |0124d| && dictionary &\cr
- & Paged & |01a0a| && Z-code &\cr
- & & |05d56| && static strings &\cr
- & & |06ae6| && end of file &\cr
- height2pt&\omit&\omit&&\omit&\cr
- }\hrule}$$
- \endinsert
-
- \subtitle{The Header}
-
- The first 64 bytes contain a header, to be detailed fully later. It
- contains (mainly) addresses of other tables and flags, and is a vehicle
- both for the game to tell the interpreter what to do and for the
- interpreter to tell the game what it can do.
-
- To briefly run through the essential points of the version-3 header:
- the first 4 bytes are
- \begintt
- 03 <Flags> <Release Number>
- ----2 bytes-----
- \endtt
- (The first byte is the version number.)
- Next come seven 2-byte words, mostly addresses, numbered 2 to 8:
- \begintt
- 2 <Start of Routines> Where routines begin, in bytes
- \endtt
- Actually, in some games, read-only data seems to continue here: this
- pointer actually tells the interpreter where the ``resident" data ends,
- i.e. the part of the game which is kept in memory at all times rather
- than loaded off disc as and when required. (Of course modern interpreters
- should almost certainly not be swapping pages from the disc anyway, now
- that 128K is no longer a scandalous amount of memory.)
- \begintt
- 3 <Main Routine> Address of main routine, in bytes, +1
- \endtt
- (This +1 is why the Main routine cannot have local variables - it is a
- peculiarity of the standard. Note also that this is uniquely a routine
- address in bytes and not a packed address: Main must occur in the lower 64K
- of the file. Inform always sets word 3 to be word 2, plus 1, because it
- requires Main to be the first routine defined.)
- \begintt
- 4 <Dictionary> The dictionary table address, in bytes
- 5 <Object tree> Object table address, in bytes
- 6 <Variables> Global variables address, in bytes
- 7 <Save area size> The total number of bytes in a saved game
- \endtt
- Saving the game is done by saving this many bytes from the beginning of the
- machine. (Saved games also contain the current state of the Z-machine
- stack; the stack is {\sl not} stored anywhere in the Z-machine's memory.)
- \begintt
- 8 <More flags>
- \endtt
- This is followed by the six bytes from byte 18 to 23, which are the version
- number string. (By custom these hold the compilation date in the form
- YYMMDD.) Then more words:
- \begintt
- 12 <Synonyms table> Synonym table address in bytes
- 13 <Length> Length of file, in words
- 14 <Checksum> Sum of bytes from 64 upwards, mod $10000
- \endtt
- The length and checksum are needed to perform ``verify", something which
- most games only do when explicitly asked.
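The layout above might be unpacked along the following lines. The field names are invented for this sketch, and the checksum routine assumes that the sum stops at the file length given in word 13, which the text does not say in so many words.

```python
import struct

def parse_v3_header(story):
    """Unpack the version-3 header fields listed above (names are illustrative)."""
    version, flags, release = struct.unpack_from(">BBH", story, 0)
    words_2_to_8 = struct.unpack_from(">7H", story, 4)    # word n sits at byte 2*n
    names = ("routines", "main_plus_1", "dictionary", "objects",
             "variables", "save_size", "more_flags")
    header = dict(zip(names, words_2_to_8))
    header.update(version=version, flags=flags, release=release)
    header["serial"] = bytes(story[18:24])                # by custom, YYMMDD
    (header["synonyms"], header["length_words"],
     header["checksum"]) = struct.unpack_from(">3H", story, 24)
    return header

def compute_checksum(story, length_words):
    """Sum the bytes from 64 up to the stated file length, mod $10000."""
    return sum(story[64:2 * length_words]) & 0xFFFF
```

A ``verify" would then compare |compute_checksum(story, header["length_words"])| against |header["checksum"]|.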
-
- \subtitle{Synonyms}
-
- We are now at byte address |$0040| and by convention we reach the synonyms.
- Usually, the actual strings (the expansions of the synonyms) are stored
- here, one after another, making up 96 strings. When that is out of the way,
- the actual table begins (and this is what the synonyms address points to).
- The table contains 96 word addresses in sequence.
-
- Note: extremely annoyingly (from the point of view of the compiler writer),
- these are word addresses and not packed addresses: thus a synonym string
- must lie in the bottom 128K of memory. (Inform has to go to a considerable
- amount of extra trouble because of this.) Of course in the original design
- synonym strings had to be resident (hence low in memory) anyway for speed
- reasons.
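To make the word-address point concrete, here is a sketch of looking up one synonym string. The function name is an assumption; |table_address| is whatever the header's synonyms word holds.

```python
def synonym_address(memory, table_address, index):
    """Byte address of the text of synonym `index` (0 to 95)."""
    if not 0 <= index < 96:
        raise ValueError("the table holds 96 entries")
    entry = table_address + 2 * index
    word_address = (memory[entry] << 8) | memory[entry + 1]
    return 2 * word_address          # a word address counts 2-byte words
```

A table entry of |$0020| therefore points at byte |$0040|, the usual home of the first synonym string.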
-
- \subtitle{Object Table}
-
- Next is the object table. In fact it begins with what is sometimes called
- the ``global properties table", though it is actually a table of default
- values of properties. This is a list of 31 2-byte numbers. There is no
- property 0, so the first word is always |0000|. (Recall that there are
- 30 properties in versions 1 to 3.)
-
- After these 62 bytes, the objects begin, beginning from object 1. An object
- entry consists of 9 bytes, looking like:
- \beginstt
- <the 32 attribute flags> <parent> <sibling> <child> <properties>
- ---32 bits in 4 bytes--- ---3 bytes------------------ ---2 bytes--
- \endtt
- The three parent-sibling-child bytes are |00| when the object pointed to is
- ``nothing". The |properties| pointer is the byte address of the list of
- properties attached to the given object.
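A 9-byte entry can be picked apart as follows. The function and its argument names are assumptions of this sketch; |objects_start| is taken to be the address of object 1, that is, 62 bytes beyond the start of the defaults table.

```python
def read_v3_object(memory, objects_start, number):
    """Decode the 9-byte entry for object `number` (objects count from 1).

    Returns (attributes, parent, sibling, child, properties), with the 32
    attribute flags packed into one integer.
    """
    if number < 1:
        raise ValueError("there is no object 0: it means 'nothing'")
    base = objects_start + 9 * (number - 1)
    attributes = int.from_bytes(memory[base:base + 4], "big")
    parent, sibling, child = memory[base + 4], memory[base + 5], memory[base + 6]
    properties = int.from_bytes(memory[base + 7:base + 9], "big")
    return attributes, parent, sibling, child, properties
```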
-
- When all these 9-byte entries are out of the way, the property lists
- begin. (Inform keeps these in the same order as the objects they are
- attached to but the specification does not require this.) An individual
- property table has the brief header
- \beginstt
- <text-length> <text of short name of object>
- -----byte---- --some even number of bytes---
- \endtt
- (where the |text-length| is the number of 2-byte words making up the text,
- which is stored in the usual format).
-
- Then the properties held are listed, in descending numerical order. (This
- order is essential.) An individual property is stored as
- \beginstt
- <size byte> <the actual property data>
- ---between 1 and 8 bytes--
- \endtt
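- The |size byte| is arranged as 32 times one less than the number of data
- bytes, plus the property number. (Lengths of 1 to 8 thus fit into the top
- three bits, leaving the bottom five for the property number.)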
- Each list of properties is ended by a |00| size byte. This is why there is no
- property 0.
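Walking one object's property table might look like this in Python. The name |read_v3_properties| is made up for the sketch. Since the data is between 1 and 8 bytes long and only three bits sit above the five-bit property number, the sketch takes those three bits to hold the length minus one.

```python
def read_v3_properties(memory, address):
    """Walk one object's property table.

    Returns (name, props): the still-encoded short name, and a dict mapping
    each property number to its data bytes.
    """
    text_words = memory[address]                  # length of the name in words
    pos = address + 1 + 2 * text_words
    name = bytes(memory[address + 1:pos])
    props = {}
    while memory[pos] != 0:                       # a 00 size byte ends the list
        size_byte = memory[pos]
        number = size_byte & 0x1F                 # bottom five bits
        length = (size_byte >> 5) + 1             # top three bits, plus one
        props[number] = bytes(memory[pos + 1:pos + 1 + length])
        pos += 1 + length
    return name, props
```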
-
-
- \subtitle{Global variables}
-
- When all the property tables are done, we come to the global variable table.
- Global variables are numbered from 0 to 239, and this table begins with 240
- initial 2-byte values for them. After this, space is conventionally left for
- all the arrays, dynamic strings and so on which they point to.
- \medskip
-
- We have now reached the top of the save area. Everything higher in memory
- than here is never altered (and not saved when the game is saved, hence
- the name).
-
-
- \subtitle{Grammar and parsing tables}
-
- Next is the table of grammar, an actions table, a preactions table and then
- an adjectives table. Note that this is not a part of the specification at
- all, and the Z-machine knows nothing about these tables. The old Infocom
- files have certain standards about their formats because they used roughly
- similar parsers; Inform follows these conventions to some extent (see the
- {\sl Inform Technical Manual} for the formats it writes here).
-
-
- \subtitle{The dictionary}
-
- And next the dictionary table, which has the following short header:
- \beginstt
- n <list of ASCII codes> entry-length number-of-entries
- byte ------n bytes-------- byte 2-byte word
- \endtt
- The codes listed are word-separators: typically (and under Inform
- mandatorily) these are
- \begintt
- . , "
- \endtt
- A space character (32) does not appear because these characters will not
- only divide words but also come out as words in their own right: thus,
- \begintt
- > fred,go
- \endtt
- will be lexically analysed as three words:
- \begintt
- "fred" "," "go"
- \endtt
- Each word entry has 4 bytes of text (i.e. 6 Z-characters, padded out
- with as many ``pad'' characters, that is 5s, as necessary), and
- then a few extra bytes of data: almost invariably (and under Inform
- mandatorily) three.
-
- Dictionary entries appear in alphabetical order (precisely, this means
- in numerical order, regarding the first 4 bytes as an unsigned
- integer). They use only alphabets A0 and A2 (i.e., they don't use
- upper case letters).
-
- The contents of the data bytes are not specified by the Z-machine,
- which never does anything with them. (See the {\sl Inform Technical
- Manual} for what Inform does with them.)
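The dictionary layout can be read with a few lines of Python. This is only a sketch for Version 3 (4 bytes of text per entry), and the function name and return shape are assumptions.

```python
def parse_dictionary(memory, address):
    """Read the dictionary header, then split the table into entries.

    Returns (separators, entries); each entry is (text_bytes, data_bytes),
    the text still being in encoded form.
    """
    n = memory[address]
    separators = bytes(memory[address + 1:address + 1 + n])
    entry_length = memory[address + 1 + n]
    count = int.from_bytes(memory[address + 2 + n:address + 4 + n], "big")
    start = address + 4 + n
    entries = []
    for i in range(count):
        entry = bytes(memory[start + i * entry_length:
                             start + (i + 1) * entry_length])
        entries.append((entry[:4], entry[4:]))    # 4 text bytes, then data
    return separators, entries
```

Because entries are in numerical order of their first 4 bytes, a word can be looked up by binary search on the text part alone.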
-
-
- \subtitle{The code area and static strings}
-
- Next is the code area. (In fact some Infocom games, though no Inform
- ones, put some static data here, before the code begins.)
- The code area simply contains a list of routines; the specification
- does not require the first routine to be the `main routine', and indeed
- it is not in some existing files (though it always is under Inform).
-
- All routines (and static strings) must occur at addresses which can
- be packed addresses (meaning, at even byte addresses in Version 3).
- The bytes sometimes left over in between them are unspecified (but under
- Inform, always 0).
-
- A routine begins with one byte indicating the number of local variables the
- routine has (from 0 to 15), and then with that many 2-byte numbers giving
- their initial values. When a function call takes place, the arguments --
- however many there are -- are written into the first few local variables,
- over-riding the default values here. Unlike global variables, these bytes
- are not used for the current values of the variables: they are kept on
- the stack.
-
- (Inform never makes use of these initialisation numbers, and simply stores
- zeros.)
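The routine header and the way arguments override it can be sketched as below. Both function names are invented, and the handling of a call which supplies more arguments than there are locals (here, the extras are dropped) is an assumption, since the text does not cover that case.

```python
def read_routine_header(memory, address):
    """Return (initial_locals, code_start) for the routine at `address`."""
    count = memory[address]
    if count > 15:
        raise ValueError("a routine has at most 15 local variables")
    initial = [int.from_bytes(memory[address + 1 + 2 * i:address + 3 + 2 * i], "big")
               for i in range(count)]
    return initial, address + 1 + 2 * count

def enter_routine(initial_locals, arguments):
    """Copy the call's arguments over the first few initial values."""
    values = list(initial_locals)
    for i, argument in enumerate(arguments[:len(values)]):
        values[i] = argument                      # surplus arguments are dropped
    return values
```

The returned list stands in for the routine's slice of the stack: the header bytes themselves are never written to.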
-
- Executable code follows this header. There is no special marker for the end
- of a routine; it is simply expected that in every case a legal return
- instruction will be hit.
-
- Finally, from the end of the code to the top of memory are the static
- strings. These are put up here to be out of the way, where they won't clog
- up the bottom 64K of memory. There's no table of their addresses, or pointer
- to where they begin; each is referred to by a packed address in code or
- data given earlier.
-
-
- \section{7}{The late Z-machine}
-
- \subtitle{Versions 4 and 5: Architecture}
-
-
- The bulk of this section is given over to a detailed discussion of the
- differences between version 3 and version 5, since those are the two forms
- Inform can produce. (Version 4 is nearer to version 5 than 3.) We
- begin with the architecture.
-
- The memory map doubles to 256K, a change which is surprisingly easy to make.
- But the processor remains 16-bit, so packed addresses are now multiples of
- 4. However, this only really affects addresses of routines and static
- strings (which are now aligned to longword boundaries, not word-boundaries).
-
- As mentioned in \S 6, an annoying exception is that the synonyms table
- contains word addresses still, and so assumes that the synonym strings lie
- in the lower 128K. This is understandable because the Z-machine used to
- rely on virtual memory (swapping pages of memory on and off of disc), and
- the synonyms need to be accessed at virtually all times: keeping them
- together in low memory (just after |$0040|) is therefore efficient, and
- giving them addresses divisible by four would waste bytes in the
- save-game-area.
-
- The only important change to the header, then, is that the length is in
- longwords, being a packed address.
-
- A minor new feature in Version 5 is that the game can change the alphabet
- tables used for text decoding, putting a pointer to them in the header at
- |$34-5|: this is usually left as |$0000|, meaning the default alphabets. See
- \S 10. Also, it seems to be expected that the interpreter tells the game
- the dimensions of the screen by writing them into the header itself, in
- play. Thus it is fairly safe to consult
- \begintt
- Byte 32 - Screen height
- Byte 33 - Screen width
- \endtt
- and it's hard to cope without this information, since games after Version 3
- have to construct their own status lines. (It isn't clear that the various
- interpreters all understand the same thing by ``height" and ``width",
- though.)
-
- There is effectively no limit on the number of possible objects, since an
- object number is no longer expected to fit into a single byte. This has the
- knock-on effect that in most games many properties will have to allow for a
- word and not a byte (which is why Inform defaults property definitions as
- |long| in version-5 mode), but the only architectural effect is that object
- definitions grow in size. Since the number of attributes is increased from
- 32 to 48, and of properties from 30 to 62, this would be needed anyway. Here
- is the new form:
- \beginstt
- <the 48 attribute flags> <parent> <sibling> <child> <properties>
- ---48 bits in 6 bytes--- ---3 words, i.e. 6 bytes---- ---2 bytes--
- \endtt
- giving a 14-byte block. As before, the properties field is the byte address
- of the property table.
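In code, the 14-byte block might be unpacked like this. The function name is an assumption, and |objects_start| is taken to be the address of object 1.

```python
import struct

def read_v5_object(memory, objects_start, number):
    """Decode the 14-byte entry for object `number` (objects count from 1).

    Returns (attributes, parent, sibling, child, properties), with the 48
    attribute flags packed into one integer.
    """
    base = objects_start + 14 * (number - 1)
    flags, parent, sibling, child, properties = struct.unpack_from(
        ">6s4H", memory, base)                    # 6 flag bytes, then 4 words
    return int.from_bytes(flags, "big"), parent, sibling, child, properties
```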
-
- The property table is also altered. A property is now stored as
- \beginstt
- <size and number> <the actual property data>
- --1 or 2 bytes--- --between 1 and 64 bytes--
- \endtt
- The property number now occupies the bottom 6 bits, not 5, of the first size
- byte, which is why more properties are available. But this only leaves two
- bits. If these are |$$00|, the size is taken as 1, and if |$$01|, then it
- is taken as 2. (These are the most common sizes in practice.) Otherwise
- the top bit is set, which means that the second byte is present, and
- contains the size in its bottom six bits.
-
- However, when present the second byte must also have the top bits set to
- |$$10|. The reason for this is that the size must be parsable either
- forwards or backwards - the Z-machine needs to be able to reconstruct the
- length of a property given only the address of the first byte of its data.
-
- There are very many (e.g. 2000) property entries in a story file, so this
- optimisation is probably worthwhile.
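Both directions of parsing can be sketched as follows. The function names are assumptions, and the sketch simply trusts the bottom six bits of the second byte as the size.

```python
def read_v5_property_header(memory, address):
    """Parse forwards: return (number, size, data_address)."""
    first = memory[address]
    number = first & 0x3F                         # bottom six bits
    if first & 0x80:                              # top bit: a second byte follows
        return number, memory[address + 1] & 0x3F, address + 2
    return number, (2 if first & 0x40 else 1), address + 1

def property_size_from_data(memory, data_address):
    """Parse backwards: recover the size from the data address alone."""
    previous = memory[data_address - 1]
    if previous & 0x80:                           # it was a second size byte
        return previous & 0x3F
    return 2 if previous & 0x40 else 1            # it was a lone first byte
```

The backward case works because the byte just before the data has its top bit set exactly when it is a second size byte.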
- \medskip
-
- The formats of the parsing tables are generally different in later
- versions, but this isn't part of the Z-machine specification.
-
- Whereas Version 3 dictionaries store each word in 6 Z-characters,
- those of Version 4 and above use 9 Z-characters. (I.e., four and six
- bytes of encoded text respectively.) This increases the length of entries.
- Otherwise, the specification is the same.
-
- The extra resolution makes it reasonable to include hyphenated words, which
- might not have been sensible earlier because of the number of five-bit
- blocks they would have needed.
-
- These modifications appear at first sight to make much larger, less
- efficient code, but this is misleading. The original version-3 `Curses' was
- only 3\% larger when first compiled as version-5, and a good part of that
- was the extra dictionary resolution.
-
- There is one sensible structural change to the way actual code is written:
- in Version 5 (not Version 4, though) the header of a function no longer
- contains initialisation values for its local variables. In practice these
- were very often zero, wasting a large number of bytes across the whole story
- file. On the other hand, one peculiarity of the machine is that functions
- can be called with 0, 1, 2 or 3 arguments, and routines in version-3 games
- used to be able to put a default value in their headers for any argument not
- supplied by the caller. This they can no longer do, so that they are unable
- to tell how many arguments actually were supplied: and so a new branch
- instruction |check_arg_count| exists to test this.
-
- Another improvement is in subroutine calls. In Version 3 code, a |call|
- instruction is always VAR and has a variable argument list, which wastes a
- byte even when there are no parameters. Also, every function call returns a
- value, and in Version 3 this value had to be written somewhere even when it
- wasn't wanted - wasting another byte. (In fact Inform used to return this
- to the stack, and then pop it from the stack - wasting another one.
- Nowadays it stores unwanted return values in a scratch global variable.) In
- Version 4 (and to a greater extent in Version 5), new forms of the |call|
- instruction are provided which automatically throw away the return value.
-
- This leads to the nightmarish position that there are eight variant forms of
- |call| in the Version 5 machine. Inform christens six of these as follows:
- \beginstt
- call_vs <address> <0 to 3 arguments> <place to put answer>
- \endtt
- (which is just as in version 3 |call|, and compatible with it),
- \beginstt
- call_vn <address> <0 to 3 arguments>
- \endtt
- which is the same but throws away the answer, and
- \beginstt
- call_1n <address> address();
- call_1s <address> <answer> answer=address();
- call_2n <address> <a1> address(a1);
- call_2s <address> <a1> <answer> answer=address(a1);
- \endtt
- The remaining two are called |call_vs2| and |call_vn2| by Inform: these are
- provided for function calls with up to seven arguments, circumventing the
- usual restriction on function calls to have at most three: and, uniquely,
- they have two bytes of type bits, arranged as eight two-bit fields. (Inform
- does not compile these instructions, and does not make use of them when
- coding function calls, because it would be extremely unportable to lower
- versions.) Note that the standard opcode name for all eight opcodes is
- |call|, and this is what appears in disassembly, but that Inform uses these
- eight names internally and for assembly.
-
-
- \subtitle{Versions 4 and 5: Reliable extra features}
-
-
- We now discuss those important extra features which can more or less be
- relied upon to be safely interpreted. Roughly speaking, don't rely on
- interpreters other than Zip to correctly perform an opcode not actually used
- in any existing Infocom game.
-
- But we must begin with unfortunate clashes with version 3. Chief among
- these is |pop| which used simply to throw away the top of the stack. In
- version 5 no such instruction exists (there is less need for it anyway given
- the new |n| form of the |call| opcodes).
-
- Also, the |read| opcode (although it has the same basic form,
- \beginstt
- read text_buffer parse_buffer;
- \endtt
- as before) does a subtly different job: it appends the result of parsing the
- text to the |parse_buffer|, rather than over-writing the parse buffer. It
- also no longer prints any kind of status bar. (To avoid confusion of the
- syntax, Inform calls the version-3 opcode |sread| and the version-5 opcode
- |aread|; and its higher-level command |read| translates into sensible code
- for either.)
-
- And since there is no longer any Z-machine ``status bar", the old opcode to
- display it (|show_status|) disappears and in theory becomes illegal.
-
- The |random| function now makes the random number generator predictable for
- a while if given a negative argument (some version 3 games had a |#random|
- opcode - so called because typing |#random| into the game made it happen).
-
- Cutting and pasting bits of parse buffer is a common job for Z-code parsers,
- and there are new opcodes to help with shuffling tables around. One can
- also (using |tokenise|) parse from any string, with any supplied dictionary
- table (not necessarily the default one). One may also |encode_text| to
- Z-machine text format - which might be useful for constructing dictionary
- entries at run-time.
-
- A few opcodes have been moved around, irritatingly, and there have been
- three casualties. |not| has moved. |save| and |restore| now appear in the
- extended set, as a result of which they are no longer branch instructions
- (presumably to avoid coping with branch offsets being different for extended
- opcodes), and now take a less convenient syntax:
- \beginstt
- save <variable>;
- restore <variable>;
- \endtt
- These put return codes in the variable. They return 0 if they fail;
- |restore| returns 1 if successful, |save| returns either 1 or 2. The
- ambiguity is because a successful |restore| results in execution continuing
- from immediately after the |save| instruction which produced the save game
- file... so in order that the program could know whether a restore had just
- taken place, or only a save of a game after which normal execution
- continued, the return value is altered.
-
- Being in the extended set does give them extra functions but not very useful
- ones. It is possible to imagine saving a ``preferred settings" file, for
- instance.
-
- (Inform compiles a little code to make |save| and |restore| emulate the
- version 3 opcodes, for portability between versions. To get at the raw
- opcodes, they must be assembled in |@| mode.)
-
- \subtitle{Character graphics before Version 6}
-
- Now for the graphics routines. The simplest of these allows for different
- text styles: boldface, underlining and reverse video (e.g. white on black if
- text would normally be black on white). These effects are modelled on the
- VT100 design of terminal and cannot safely be combined, even though the
- codes for them look like bit masks:
- \beginstt
- set_text_style 0 Default: Inform calls this "Roman"
- 1 Reverse video
- 2 Boldface
- 4 Underlined (or italic)
- \endtt
- An interpreter providing coloured text may implement these with colour
- changes: my own represents bold as blue lettering instead of black on white,
- for instance, which is quite pleasant.
-
- Some ports of ITF paint each newly scrolled-in line entirely in reverse when
- scrolling the screen in Reverse video, but this is incorrect. Some interpreters
- do not implement ``bold face". A stone tablet with keywords picked out in
- bold might be impossible to decipher to some players.
-
- (There is another option, 8, which forces use of a fixed-spaced font,
- used in `Beyond Zork'.)
-
- An upper (usually status line) screen can be split off from the main screen
- with:
- \beginstt
- split_window <n>
- \endtt
- creating one which is |n| lines tall. There are then two screens, 0 (the
- main screen) and 1 (the upper one). Text output can be switched between
- them by
- \beginstt
- set_window 0 to lower
- 1 to upper
- \endtt
- The lower window is just a text stream and its cursor position cannot be
- set: on the other hand, when it is returned to, the cursor will be where it
- was before it was left.
-
- Within the upper window, anyway, the cursor can be moved by
- \beginstt
- set_cursor <line> <column>
- \endtt
- where $(1,1)$ is the top left hand character. Printing on the upper window
- overlies printing on the lower, and is always done in a fixed-space font,
- and does not appear in a printed transcript of the game.
-
- However, before printing to the ``status line" screen, it is essential to
- change the printing format - this is the |buffer_mode| opcode alluded to
- earlier. Before printing, execute
- \beginstt
- buffer_mode 0
- \endtt
- and when returning to the normal screen,
- \beginstt
- buffer_mode 1
- \endtt
- Otherwise, if the cursor comes near the edge the interpreter may continue
- trying to split lines at word breaks; some ports of ITF make a horrid mess
- in this case, though Zip manages.
-
- Also, the status line screen must be tall enough to include all the cursor
- positions you want to write to. If it is not quite tall enough, different
- interpreters flounder about in different ways: some will scroll the upper
- window, some won't.
-
- A common thing to want to do is to erase areas of screen - especially a
- status bar which is being redisplayed. Opcodes
- \beginstt
- erase_window $ffff - erases whole screen, both windows
- erase_line - erases from cursor to end of line [Achtung!]
- \endtt
- are provided for this. If you are in reverse video mode, they erase to the
- reversed colour: a particularly unpleasant effect is achieved by
- \beginstt
- set_text_style 1; erase_window $ffff;
- \endtt
- Unfortunately |erase_window| (which is intended to erase window $n$, or all
- windows if $n=-1$) is not fully implemented by ITF and cannot safely be used
- except in this drastic way. (The Version 4 file `Trinity', for
- instance, only uses it thus.)
-
- |erase_line| is only sometimes implemented and does slightly unpredictable
- things in reverse video mode, which is a nuisance since it would otherwise
- be ideal for blanking out an out-of-date status bar. However, no existing V4
- or V5 game uses this opcode and so it cannot be relied upon. (It's
- interesting to note that the Version-5 edition of `Zork I' - one of the
- earliest Version 5 files - blanks out lines by looking up the screen width
- and printing that many spaces.)
-
-
- There are new arithmetic opcodes:
- \beginstt
- art_shift x y z z=x arithmetically shifted y bits
- log_shift x y z z=x logically shifted y bits
- \endtt
- Version 5 games effectively have ``undo" provided for them, though the logic
- is tricky to get right (from a programmer's point of view). The two
- relevant opcodes are |save_undo| and |restore_undo|, which work in exactly
- the same way as |save| and |restore| except that they save the game
- internally to spare memory. The idea is that if the game is saved before
- any action, then the last action can be undone by restoring this
- memory-saved game.
-
- |save_undo| provides one more return code than |save|: it returns -1 if the
- interpreter is unable to manage internal saves (presumably this was provided
- for machines tight on memory). Now, of course, an interpreter which knows
- about |save_undo| enough to return this code probably knows enough to
- implement it fully.
-
- Zip does provide this, but the ITF interpreter currently does not (and
- |save_undo| returns 0). This is probably the biggest feature it lacks.
- In any case, ``undo" is such a worthwhile feature and so easy to code that
- games probably ought to provide it in hope.
-
- Changing input/output streams and reading the keyboard in real time
- are, similarly, more reliable under Zip.
-
-
- \subtitle{Architecture: version 6}
-
-
- The architecture of the Version 6 Z-machine is extremely similar to that
- of Version 5. Packed addresses are expanded again and this allows the
- memory map to stretch yet further. (`Shogun', for instance, is about 335K
- long.)
-
- Pictures and sampled sounds are not stored in the Z-machine itself and it
- is simply expected that the interpreter has them to hand. They were
- thus stored in different formats for different machines.
-
- A few opcodes are changed (mostly the character graphics ones) and
- many new ones are added: see the dictionary.
-
- The graphical features are the most disheartening to interpreter writers,
- but most of them seem to be optional. For instance, the interpreter
- can declare itself unable to draw pictures, or to produce sound effects.
- It is not impossible to imagine that a fairly portable version-6 interpreter
- could be constructed, and Zip is currently going down this road.
-
- The display is expected to be arranged in pixels. Coordinates are usually
- given in the form $(y,x)$, with $(1,1)$ in the top left. There is a
- generalised colour scheme intended to look like the basic IBM PC colours
- (which is to say, not very pleasant). There are eight, instead of two,
- windows, and they have more elaborate possibilities, but they are essentially
- similar to the two windows in version 4 onward.
-
- There may be a mouse, but if so it is not expected to do much beyond move
- an arrow around and have one or more buttons. Similarly, there may be a
- concept of ``menus" - which seems primarily furnished for Apple Macintoshes.
-
-
- \section{8}{Complete table of opcodes}
-
-
- This table might be called a variorum edition of the Z-machine
- specification: it contains all 120 or so possible opcodes for every version
- of the Z-machine, from 1 to 6, and (taken with the accompanying dictionary)
- documents them and their corresponding Inform assembly language syntax.
-
- A few opcodes do not in fact occur in any existing files, but they can
- be deduced by disassembling Infocom-supplied interpreters. This table
- also specifies which opcodes occur in V1 to V5 files, at least.
-
- Inform names (and can assemble) all the opcodes, even the version-6 ones.
- This may be useful for preparing test files. The names here are the set
- used by Inform 5.4 and later, extended from a system worked out by Mark
- Howell for his disassembler, which we have agreed on as a standard. We hope
- that this will provide interpreter writers and others with a common lexicon.
- It would be helpful if interpreter sources use these names internally.
-
- \subtitle{Reading the opcode tables}
-
- The two columns ``St" and ``Br" (store and branch) mark whether an
- instruction stores a result in a variable, and whether it must provide a
- label to jump to, respectively.
-
- The ``Opcode" is written
- \begintt
- TYPE:Decimal
- \endtt
- where the TYPE is 2OP, 1OP, 0OP, VAR or EXT: two operands, one operand, no
- operands, a variable number of operands, and a variable number of operands but
- occurring in the ``extended" set. The extended opcodes are two-byte opcodes
- whose first byte is (decimal) 190.
-
- Briefly, single byte opcodes have types as follows:
- \beginstt
-   0 to 31, 32 to 63, 64 to 95, 96 to 127:  forms of 2OP, the opcode number
-                                            being the value mod 32
- 128 to 143, 144 to 159, 160 to 175:        forms of 1OP, the opcode number
-                                            being the value mod 16
- 176 to 191:                                0OP, the opcode number
-                                            being the value mod 16
- 192 to 223:                                2OP opcodes implemented in the
-                                            VAR form, the opcode number
-                                            being the value mod 32
- 224 to 255:                                VAR, the opcode number
-                                            being the value mod 32
- \endtt
- The decimal number is the lowest possible decimal opcode value. The hex
- number is the opcode number within each TYPE.
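-
- The ranges above can be turned into a small decoder. What follows is a
- rough Python sketch only: the name |classify_opcode| and the shape of its
- result are inventions for the purpose of illustration, and it assumes that
- byte 190 introduces an extended opcode from version 5 onwards.

```python
EXTENDED_PREFIX = 190  # first byte of a two-byte (EXT) opcode, version 5 on


def classify_opcode(first_byte, next_byte=None, version=5):
    """Return (TYPE, opcode number within TYPE, table entry 'TYPE:Decimal').

    Hypothetical helper: it simply follows the byte ranges listed above.
    """
    if not 0 <= first_byte <= 255:
        raise ValueError("opcode byte out of range")
    if first_byte == EXTENDED_PREFIX and version >= 5:
        if next_byte is None:
            raise ValueError("extended opcode needs a second byte")
        # The tables number the extended set from 256 upwards.
        return ("EXT", next_byte, "EXT:%d" % (256 + next_byte))
    if first_byte < 128:                      # the four forms of 2OP
        number = first_byte % 32
        return ("2OP", number, "2OP:%d" % number)
    if first_byte < 176:                      # the three forms of 1OP
        number = first_byte % 16
        return ("1OP", number, "1OP:%d" % (128 + number))
    if first_byte < 192:                      # 0OP
        number = first_byte % 16
        return ("0OP", number, "0OP:%d" % (176 + number))
    if first_byte < 224:                      # 2OP in the VAR form
        number = first_byte % 32
        return ("2OP", number, "2OP:%d" % number)
    number = first_byte % 32                  # VAR proper
    return ("VAR", number, "VAR:%d" % (224 + number))
```

- For example, |classify_opcode(140)| gives the entry |1OP:140| (that is,
- |jump|), and |classify_opcode(190, 9)| gives |EXT:265| (|save_undo|).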
-
- The ``V" column gives the version information. If nothing is specified, the
- opcode is as stated from version 1 onwards. Otherwise, it exists only from
- the version quoted onwards. Before this time, its use is illegal. Some
- opcodes change their meanings and these have more than one line of
- specification. Others become illegal again, and these are marked
- |[illegal]|.
-
- In a few cases, the version is given as ``3/4" or some such. The first
- number is the version number whose specification the opcode belongs to, and
- the second is the earliest version in which the opcode is known actually to
- be used. A dash means that it is never used at all (in versions 1 to 5 at
- least: possibly a few of the 5/- opcodes may be used in version 6).
-
- The table explicitly marks opcodes which remain unused in all six versions
- of the Z-machine as |------|. In principle, the interpreter is at liberty
- to crash if it finds them, though in practice ignoring them is more polite.
-
- However, the extended set, which could in principle run from |$00| to |$ff|,
- stops at |$1c|: subsequent codes |$1d| to |$ff| were never used, even in
- version 6.
-
- \subtitle{Inform assembly language}
-
- An Inform line beginning with an |@| is sent direct to the assembler. The
- syntax is as laid out in the tables below. (Remember that opcodes can only
- be used if the game version number is right.)
-
- |<variable>| and |<result>| must be variables (or |sp|, the stack pointer);
- |<label>| a label (not a routine name). In a branch instruction, the
- logical effect can be negated using a tilde |~| before the label name, so
- for instance
- \beginstt
- @je a b ~Different; ! Jump to Different if a not equal to b
- \endtt
- The programmer must specify whether a branch is in the ``near" or ``far"
- form. A question mark |?| before the label (and the tilde, if there is
- one) forces it to be ``far"; otherwise it is ``near", the default (which is
- cheaper and more likely).
-
- |<string>| must be literal text in quotation marks ``thus" and it is
- translated in the usual Inform way. When |function| is listed, a constant
- is expected to be a packed address of a function. Inform assembles these in
- the right way if you just name a function at the appropriate point.
-
- Generally speaking any Inform constant term (such as |'a'| or |'beetle'|)
- can be used as an operand but a compound expression (which would obviously
- incur extra assembly) cannot.
-
- \subtitle{Opcode names changed since Inform 5.2}
-
- In order to bring Inform into line with the agreed standard names for opcodes,
- the following changes have been made to opcode names:
- \beginstt
- From              To
- ====================================
- compare_pobj      same_parent
- colour            set_colour
- retsp             ret_popped
- show_score        show_status
- scanw             scan_table
- aparse            tokenise
- encrypt           encode_text
- check_no_args     check_arg_count
- \endtt
- \vfill\eject
- \hrule\smallskip
- \centerline{\bf Two-operand (long) opcodes 2OP}\smallskip\hrule
- \beginlines
- | St  Br  Opcode   Hex  V    Inform name and syntax|
- \endlines\smallskip\hrule\smallskip\beginlines
- |         ------   0         ------|
- |     *   2OP:1    1         je a b <label>|
- |     *   2OP:2    2         jl a b <label>|
- |     *   2OP:3    3         jg a b <label>|
- |     *   2OP:4    4         dec_chk <variable> value <label>|
- |     *   2OP:5    5         inc_chk <variable> value <label>|
- |     *   2OP:6    6         same_parent obj1 obj2 <label>|
- |     *   2OP:7    7         test bitmap flags <label>|
- | *       2OP:8    8         or a b <result>|
- | *       2OP:9    9         and a b <result>|
- |     *   2OP:10   A         test_attr object attribute <label>|
- |         2OP:11   B         set_attr object attribute|
- |         2OP:12   C         clear_attr object attribute|
- |         2OP:13   D         store <variable> value|
- |         2OP:14   E         insert_obj object destination|
- | *       2OP:15   F         loadw table index <result>|
- | *       2OP:16   10        loadb table index <result>|
- | *       2OP:17   11        get_prop object property <result>|
- | *       2OP:18   12        get_prop_addr object property <result>|
- | *       2OP:19   13        get_next_prop object property <result>|
- | *       2OP:20   14        add a b <result>|
- | *       2OP:21   15        sub a b <result>|
- | *       2OP:22   16        mul a b <result>|
- | *       2OP:23   17        div a b <result>|
- | *       2OP:24   18        mod a b <result>|
- | *       2OP:25   19   4    call_2s function arg1 arg2 <result>|
- |         2OP:26   1A   5    call_2n function arg1 arg2|
- |         2OP:27   1B   5    set_colour foreground background|
- |         2OP:28   1C   5/-  throw value stack-frame|
- |         ------   1D        ------|
- |         ------   1E        ------|
- |         ------   1F        ------|
- \endlines\smallskip\hrule\smallskip
- \vfill\eject
- \smallskip\hrule\smallskip
- \centerline{\bf One-operand opcodes \rm 1OP}\smallskip\hrule
- \beginlines
- | St  Br  Opcode   Hex  V    Inform name and syntax|
- \endlines\smallskip\hrule\smallskip\beginlines
- |     *   1OP:128  0         jz a <label>|
- | *   *   1OP:129  1         get_sibling object <result> <label>|
- | *   *   1OP:130  2         get_child object <result> <label>|
- | *       1OP:131  3         get_parent object <result>|
- | *       1OP:132  4         get_prop_len property-address <result>|
- |         1OP:133  5         inc <variable>|
- |         1OP:134  6         dec <variable>|
- |         1OP:135  7         print_addr byte-address-of-string|
- | *       1OP:136  8    4    call_1s function arg1 <result>|
- |         1OP:137  9         remove_obj object|
- |         1OP:138  A         print_obj object|
- |         1OP:139  B         ret value|
- |         1OP:140  C         jump <label>|
- |         1OP:141  D         print_paddr word-address-of-string|
- | *       1OP:142  E         load value <result>|
- | *       1OP:143  F    1/4  not value <result>|
- |                       5    call_1n function arg1|
- \endlines\smallskip\hrule\smallskip
- \centerline{\bf Zero-operand opcodes \rm 0OP}\smallskip\hrule
- \beginlines
- | St  Br  Opcode   Hex  V    Inform name and syntax|
- \endlines\smallskip\hrule\smallskip\beginlines
- |         0OP:176  0         rtrue|
- |         0OP:177  1         rfalse|
- |         0OP:178  2         print <string>|
- |         0OP:179  3         print_ret <string>|
- |         0OP:180  4    1/-  nop|
- |     *   0OP:181  5    1    save <label>|
- |                       5    [illegal]|
- |     *   0OP:182  6    1    restore <label>|
- |                       5    [illegal]|
- |         0OP:183  7         restart|
- |         0OP:184  8         ret_popped|
- |         0OP:185  9    1    pop|
- | *                     5    catch <result>|
- |         0OP:186  A         quit|
- |         0OP:187  B         new_line|
- |         0OP:188  C    3    show_status|
- |                       4    [illegal]|
- |     *   0OP:189  D    3    verify|
- |         0OP:190  E    5    [first byte of extended opcode]|
- |     *   0OP:191  F    5    piracy|
- \endlines\smallskip\hrule\smallskip
- \vfill\eject
- \smallskip\hrule\smallskip
- \centerline{\bf Variable-operand opcodes \rm VAR}\smallskip\hrule
- \beginlines
- | St  Br  Opcode   Hex  V    Inform name and syntax|
- \endlines\smallskip\hrule\smallskip\beginlines
- | *       VAR:224  0    1    call function ...args... <result>|
- |                            icall address <result>|
- |                       4    call_vs function ...args... <result>|
- |         VAR:225  1         storew table word value|
- |         VAR:226  2         storeb table byte value|
- |         VAR:227  3         put_prop object property value|
- | *       VAR:228  4    1    sread text-buffer parse-buffer|
- |                       5    aread text parse time function|
- |         VAR:229  5         print_char ascii-value|
- |         VAR:230  6         print_num value|
- | *       VAR:231  7         random range <result>|
- |         VAR:232  8         push value|
- | *       VAR:233  9    1    pull <result>|
- |                       5/-  pull stack <result>|
- |         VAR:234  A    3    split_window lines|
- |         VAR:235  B    3    set_window window|
- | *       VAR:236  C    4    call_vs2 [not properly assembled]|
- |         VAR:237  D    4    erase_window window|
- |         VAR:238  E    4/-  erase_line value|
- |                       6    erase_line pixels|
- |         VAR:239  F    4    set_cursor line row|
- |                       6    set_cursor line row window|
- |         VAR:240  10   4/-  get_cursor table|
- |         VAR:241  11   4    set_text_style style|
- |         VAR:242  12   4    buffer_mode flag|
- |         VAR:243  13   3    output_stream number|
- |                       5    output_stream number table|
- |                       6    output_stream number table width|
- |         VAR:244  14   3    input_stream number|
- |         VAR:245  15   4    beep|
- |                       5/3  sound_effect number effect volume|
- |                            sound_effect number effect repeats volume|
- |                       6    sound_effect number effect volume repeats|
- | *       VAR:246  16   4    read_char 1 time function <result>|
- | *   *   VAR:247  17   4    scan_table x table len form <result> <label>|
- | *       VAR:248  18   5/-  not value <result>|
- |         VAR:249  19   5    call_vn function ...args...|
- |         VAR:250  1A   5    call_vn2 [not properly assembled]|
- |         VAR:251  1B   5    tokenise text parse dictionary flag|
- |         VAR:252  1C   5    encode_text ascii-text length from coded-text|
- |         VAR:253  1D   5    copy_table from to size|
- |         VAR:254  1E   5    print_table ascii-text width height skip|
- |     *   VAR:255  1F   5    check_arg_count argument-number|
- \endlines\smallskip\hrule\smallskip
- \vfill\eject
- \smallskip\hrule\smallskip
- \centerline{\bf Extended opcodes \rm EXT}\smallskip\hrule
- \beginlines
- | St  Br  Opcode   Hex  V    Inform name and syntax|
- \endlines\smallskip\hrule\smallskip\beginlines
- | *       EXT:256  0    5    save table bytes name <result>|
- | *       EXT:257  1    5    restore table bytes name <result>|
- | *       EXT:258  2    5    log_shift number places <result>|
- | *       EXT:259  3    5/-  art_shift number places <result>|
- | *       EXT:260  4    5    set_font font window <result>|
- |         EXT:261  5    6    draw_picture picture-number y x|
- |     *   EXT:262  6    6    picture_data picture-number table <label>|
- |         EXT:263  7    6    erase_picture picture-number y x|
- |         EXT:264  8    6    set_margins left right window|
- | *       EXT:265  9    5    save_undo <result>|
- | *       EXT:266  A    5    restore_undo <result>|
- |         -------  B         ------|
- |         -------  C         ------|
- |         -------  D         ------|
- |         -------  E         ------|
- |         -------  F         ------|
- |         EXT:272  10   6    move_window window y x|
- |         EXT:273  11   6    window_size window y x|
- |         EXT:274  12   6    window_style window flags operation|
- | *       EXT:275  13   6    get_wind_prop window property-number <result>|
- |         EXT:276  14   6    scroll_window window pixels|
- |         EXT:277  15   6    pop_stack items stack|
- |         EXT:278  16   6    read_mouse table|
- |         EXT:279  17   6    mouse_window window|
- |     *   EXT:280  18   6    push_stack value stack <label>|
- |         EXT:281  19   6    put_wind_prop window property-number value|
- |         EXT:282  1A   6    print_form formatted-table|
- |     *   EXT:283  1B   6    make_menu number table <label>|
- |         EXT:284  1C   6    picture_table table|
- \endlines\smallskip\hrule\bigskip
-
- Notes: 1. The opcodes 5, 6, 7, 8 in the extended set were very likely in the
- V5 specification, and are named in some interpreter sources (though only
- very haphazardly implemented) but they do not occur in any existing V5 story
- file.
-
- 2. The notation ``5/3" for |sound_effect| is because this plainly version-5
- feature was used also in one solitary Version-3 game, `The Lurking Horror'
- (the sound version of which was the last V3 release, in September 1987).
- A V3 interpreter may ignore this but may not crash.
-
- 3. The opcode 0 (in the 2-operand set, i.e. the actual byte 00) was possibly
- intended for setting break-points in debugging. It was not |nop|. (At
- time of writing, the Infix debugger uses the actual |nop| instruction as
- a break-point.)
-
-
- \section{9}{Dictionary of opcodes}
-
- \quote
- The highest ideal of a translation... is achieved when the
- reader flings it impatiently into the fire, and begins
- patiently to learn the language for himself.
- \quoteby{Philip Vellacott}
-
- This dictionary is alphabetical and includes entries on every opcode listed
- in the table above, as well as brief notes on some Inform internal synonyms
- which might otherwise be confused with opcodes. Although concise, it
- essentially documents correct interpreter behaviour.
-
- The following have been corrected since the first edition: |aread|,
- |erase_line|, |get_cursor|, |get_wind_prop|, |input_stream|, |picture_data|,
- |random|, |set_cursor| and
- |split_window|. |picture_table|, the last opcode
- to be discovered, has been added.
- \bigskip
-
- \def\de{\medskip\noindent}
-
- \de |add |\inpar Signed 16-bit addition.
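- \onpar
- As a hedged illustration, here is signed 16-bit addition in Python, holding
- values as unsigned words from 0 to 65535. The helper names are made up for
- this sketch, and wrapping round on overflow is an assumption: nothing above
- says what an interpreter must do.

```python
def to_signed(word):
    """Interpret an unsigned 16-bit word as a signed value."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def to_word(value):
    """Reduce any Python integer to an unsigned 16-bit word."""
    return value & 0xFFFF


def add16(a, b):
    """Signed 16-bit addition of two words (assumed to wrap on overflow)."""
    return to_word(to_signed(a) + to_signed(b))
```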
-
- \de |and |\inpar Bitwise and.
-
- \de |"aparse"|\inpar Obsolete name for |tokenise|.
-
- \de |aread|\inpar Advanced form of |read|. This behaves just as the standard
- form does if the last two operands are not
- supplied, except that: (i) the status line is not redisplayed, and
- (ii) if the parse buffer supplied is zero, no attempt is made to parse
- the input.
- \onpar
- The parse buffer is appended to, not over-written as in version 3.
- \onpar
- If all four operands are supplied, then every |time| seconds
- while the player is working on her input, the |function| is called: if it
- returns 1 (true) then the reading process is interrupted. (The function
- obviously needs to run pretty quickly.)
- \onpar
- The |function| is called with one argument: the |time| value.
-
- \de |art_shift|\inpar Does an arithmetic shift of number by the given number of
- places, shifting left (i.e. increasing) if places is positive, right if negative.
- In a right shift, the sign bit is preserved as well as being shifted on
- down. (The alternative behaviour is |log_shift|.)
-
- \de |beep|\inpar Beeps in a more or less irksome fashion and possibly flashes the
- display.
-
- \de |buffer_mode|\inpar If set to 1, text output is buffered up so that it can be
- word-wrapped properly. If set to 0, it isn't.
-
- \de |call|\inpar The only call instruction in version 3; Inform reads this as
- |call_vs| in higher versions. It calls the function with 0, 1, 2 or 3
- arguments as supplied and stores the resulting return value.
-
- \de |call_1n|\inpar Executes |function(arg)| and throws away result.
-
- \de |call_1s|\inpar Stores |function(arg)|.
-
- \de |call_2n|\inpar Executes |function(arg1, arg2)| and throws away result.
-
- \de |call_2s|\inpar Stores |function(arg1, arg2)|.
-
- \de |call_vn|\inpar Like |call|, but throws away result.
-
- \de |call_vs|\inpar See |call|.
-
- \de |call_vn2|\inpar Call with a variable number (from 0 to 7) of arguments, then
- throw away the result. This (and |call_vs2|) uniquely have an extra byte
- of opcode types to specify the types of arguments 4 to 7.
-
- \de |call_vs2|\inpar See |call_vn2|.
-
- \de |catch|\inpar Opposite to |throw|, and occupying the same opcode that |pop| used
- to in versions 3 and 4, but now with a store argument. |catch| gives the
- stack frame of the current routine: see |throw| for what to do with it
- subsequently.
-
- \de |check_arg_count|\inpar Branches if the given argument-number (1 being the first of
- these) has been provided by the function call to the current routine.
- (Default values would otherwise be difficult to provide in versions 5
- and 6.)
-
- \de |"check_no_args"|\inpar Obsolete name for |check_arg_count|.
-
- \de |clear_attr|\inpar Make |object| not have |attribute|.
-
- \de |clear_flag|\inpar A name once used for one of the not-really-present extended v5
- opcodes.
-
- \de |"colour"|\inpar Obsolete name for |set_colour|.
-
- \de |"compare_pobj"|\inpar Obsolete name for |same_parent|.
-
- \de |copy_table|\inpar Copies |size| bytes from the first table to the second. If the
- second table is given as 0, then it instead zeroes |size| bytes of the first table.
- If |size| is positive, it copies backwards (starting from the last byte), so that
- \onpar |copy_table $1000 $1001 20|
- \onpar
- would push the first 20 bytes forward by one. However, if |size| is
- negative, it copies forwards. Thus the same operation with -20 would
- result in the byte at |$1000| being copied into the 20 following bytes.
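- \onpar
- The two directions of copying can be sketched in Python as follows, taking
- the description above literally. Here |memory| is assumed to be a
- |bytearray| holding the Z-machine's memory, and the last operand an ordinary
- signed integer; zeroing |abs(size)| bytes when the second table is 0 is also
- an assumption.

```python
def copy_table(memory, first, second, size):
    """Sketch of copy_table as described above; memory is a bytearray."""
    if second == 0:
        # Zero the bytes of the first table instead of copying.
        for i in range(abs(size)):
            memory[first + i] = 0
    elif size >= 0:
        # Positive size: copy backwards, last byte first.
        for i in range(size - 1, -1, -1):
            memory[second + i] = memory[first + i]
    else:
        # Negative size: copy forwards, first byte first.
        for i in range(-size):
            memory[second + i] = memory[first + i]
```

- \onpar
- Copying backwards is what makes the overlapping example safe: each byte is
- read before it is overwritten.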
-
- \de |dec |\inpar Decrement variable.
-
- \de |dec_chk|\inpar Decrement variable, and branch if now less than value.
-
- \de |div |\inpar Signed 16-bit division.
-
- \de |draw_picture|\inpar Displays the picture with the given number from the library of
- pictures which the interpreter is expected to have (which is not resident
- in the Z-machine itself). The Z-machine knows nothing of what picture
- format is being used. By default, this appears at the current cursor
- position in the current window. Y and X pixel coordinates from the top
- left can be given instead, though (the top left having coordinates $(1,1)$).
- \onpar
- Pictures are numbered from 1 and need not be numbered contiguously.
-
- \de |encode_text|\inpar Translates an ASCII word to the internal (z-encoded) text format,
- suitable for dictionary use. The text begins at |from| in the |ascii-text|
- and is |length| characters long. (The operand should contain the right length
- value, even though in fact the interpreter translates the word as far
- as a 0 terminator.) A 6-byte z-encoded string results: this is the
- dictionary resolution in versions 4, 5 and 6 and usually represents
- 9 characters of ASCII.
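- \onpar
- A much simplified Python sketch of the encoding, for lower-case letters
- only. It assumes the usual text format (three 5-bit Z-characters to each
- 2-byte word, |a| to |z| being Z-characters 6 to 31, padding with
- Z-character 5, and the top bit set on the final word); |encode_word| is a
- hypothetical name, and it takes a Python string rather than the opcode's
- operands.

```python
def encode_word(text, resolution=9):
    """Encode lower-case letters into a dictionary-style Z-encoded word.

    Simplified sketch: handles only 'a' to 'z'; anything else raises.
    The resolution is assumed to be a multiple of three.
    """
    zchars = []
    for ch in text[:resolution]:
        if not "a" <= ch <= "z":
            raise ValueError("this sketch only handles lower-case letters")
        zchars.append(ord(ch) - ord("a") + 6)
    zchars += [5] * (resolution - len(zchars))   # pad with Z-character 5
    out = bytearray()
    for i in range(0, resolution, 3):
        word = (zchars[i] << 10) | (zchars[i + 1] << 5) | zchars[i + 2]
        if i + 3 >= resolution:
            word |= 0x8000                       # mark the final word
        out += word.to_bytes(2, "big")
    return bytes(out)
```

- \onpar
- Under these assumptions |encode_word("beetle")| gives the six bytes
- |1D 4A 66 2A 94 A5|.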
-
- \de |"encrypt"|\inpar Obsolete name for |encode_text|.
-
- \de |erase_line|\inpar Before version 6: erase the current cursor line in the current
- window. (Badly interpreted by ITF.) In version 6: if the value is 1, do
- just that: if not, erase the given number of pixels minus one across from
- the cursor (clipped to the window size).
- In both cases, don't move the cursor.
-
- \de |erase_picture|\inpar Like |draw_picture|, but wipes the appropriate region to
- the background colour for the given window.
-
- \de |erase_window|\inpar Erases window with given number (to the background colour in
- version-6), or if -1 it unsplits the screen and clears the lot. The
- cursor moves back to top left. (In version 6, -2 means clear the whole
- screen but don't unsplit it.)
-
- \de |extended|\inpar This byte (decimal 190) is not really an instruction, but
- indicates that the opcode is ``extended": the next byte contains the
- number in the extended set.
-
- \de |get_next_prop|\inpar Gives the number of the next property provided by the
- quoted object. This may be zero, indicating the end of the property list;
- if called with zero, it gives the first property number present. (If
- called with the number of a property not present, the Z-machine may
- legitimately crash.)
-
- \de |get_prop|\inpar Read property from object (resulting in the default value if it
- had no such declared property).
-
- \de |get_prop_addr|\inpar Get address of property data for given object's property.
-
- \de |get_prop_len|\inpar Get length of property data.
-
- \de |get_child|\inpar Get first object contained in given object, branching if this
- exists (i.e., if it is not |nothing|, or 0).
-
- \de |get_cursor|\inpar Puts the current cursor row into the first word of the given
- table, and the current cursor column into the second word.
-
- \de |get_parent|\inpar Get parent object (note that, unlike |get_child| and
- |get_sibling|, this is not a branch instruction).
-
- \de |get_sibling|\inpar Get next object in tree, branching if this exists (i.e., is not |nothing|, or 0).
-
- \de |get_wind_prop|\inpar The eight windows (in version 6) have 16 properties, numbered
- 0 to 15, which can be read using this call and (mostly) written
- using |put_wind_prop|. The 16 properties are:
- \onpar|0  y coordinate    6   left margin size            12  font number|
- \onpar|1  x coordinate    7   right margin size           13  font size|
- \onpar|2  y size          8   newline interrupt function  14  attributes|
- \onpar|3  x size          9   interrupt countdown         15  line count|
- \onpar|4  y cursor        10  highlight mode|
- \onpar|5  x cursor        11  colour data|
- \onpar
- These properties are all explained elsewhere except for 8 and 9, about
- ``newline interrupts". If the countdown is set non-zero, it begins to
- count downwards, once per new-line. When it then hits zero, the
- interrupt function is called. This is provided so that text can be
- shaped past crinkly margins (e.g., to roll nicely around a picture)
- because the interrupt function can fix the margins at the crucial moment.
- The interrupt function should not attempt to print anything to the same
- window!
- \onpar
- Window coordinates are relative to the screen; cursor coordinates are
- relative to the window.
- \onpar
- Font size contains two bytes: height then width, in pixels. Colour data
- similarly gives foreground, then background colour.
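- \onpar
- The newline interrupt countdown, and the two-byte properties, might be
- modelled in Python like this. It is only a sketch: the |Window| class is
- hypothetical, the interrupt function is an ordinary Python callable rather
- than a Z-machine routine, and the first-named byte of a pair is assumed to
- be the high byte.

```python
class Window:
    """Holds just the two newline-interrupt properties (8 and 9)."""

    def __init__(self):
        self.interrupt_function = None   # property 8
        self.interrupt_countdown = 0     # property 9

    def new_line(self):
        """Call once for each new-line printed in this window."""
        if self.interrupt_countdown == 0:
            return                       # countdown not running
        self.interrupt_countdown -= 1
        if self.interrupt_countdown == 0 and self.interrupt_function:
            self.interrupt_function(self)


def split_pair(word):
    """Split a two-byte property such as font size into (first, second)."""
    return (word >> 8) & 0xFF, word & 0xFF
```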
-
- \de |icall|\inpar This is an Inform internal name for ``call to a function whose
- address is supplied, not its name". It allows calculated calls; but takes
- no arguments. It stores the result as |call| does.
-
- \de |inc |\inpar Increment variable.
-
- \de |inc_chk|\inpar Increment variable, and branch if now greater than value.
-
- \de |input_stream|\inpar Switches the input stream (the source of the player's commands).
- 0 is the keyboard, and 1 a command file (the idea is that a list of
- commands produced by |output_stream 4| can be fed back in again: Zip
- provides this useful feature).
-
- \de |insert_obj|\inpar Moves object to destination (it need not be removed from the tree
- first).
-
- \de |je |\inpar Jump if |a = b|.
-
- \de |jg |\inpar Jump if |a > b| (note: not |a>=b|).
-
- \de |"jge"|\inpar Inform used to call |jg| this, which was rather confusing, and now it
- is withdrawn.
-
- \de |jl |\inpar Jump if |a < b| (note: not |a<=b|).
-
- \de |"jle"|\inpar Inform used to call |jl| this, which was rather confusing, and now it
- is withdrawn.
-
- \de |jump|\inpar Jump (unconditionally) to the given label. It is safe to jump into a
- different routine but care is advisable. (The operand to jump is always
- a 2-byte signed offset: not an absolute routine address.)
-
- \de |jz |\inpar Jump if |a = 0|.
-
- \de |load|\inpar Results in the value of the given variable: so |load v1 v2| actually
- does |v2 = v1|. This is better done with |store| or |push| as appropriate
- and Inform never uses it in compiled code.
-
- \de |loadb|\inpar Stores |table->index|.
-
- \de |loadw|\inpar Stores |table-->index|.
-
- \de |log_shift|\inpar Does a logical shift of number by the given number of places,
- shifting left (i.e. increasing) if places is positive, right if negative.
- In a right shift, the sign is zeroed instead of being shifted on. (The
- alternative behaviour is |art_shift|.)
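- \onpar
- The difference between the two shifts shows up only when shifting right. A
- Python sketch, assuming |number| is an unsigned 16-bit word and |places| an
- ordinary signed integer (the function names simply echo the opcodes):

```python
def log_shift(number, places):
    """Logical shift of a 16-bit word: zeros are shifted in."""
    number &= 0xFFFF
    if places >= 0:
        return (number << places) & 0xFFFF
    return number >> -places


def art_shift(number, places):
    """Arithmetic shift of a 16-bit word: the sign bit is copied down."""
    number &= 0xFFFF
    if places >= 0:
        return (number << places) & 0xFFFF
    signed = number - 0x10000 if number & 0x8000 else number
    # Python's >> on a negative integer keeps the sign, as required here.
    return (signed >> -places) & 0xFFFF
```

- \onpar
- So |log_shift(0x8000, -1)| is |0x4000| while |art_shift(0x8000, -1)| is
- |0xC000|.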
-
- \de |lstore|\inpar Inform names this to force |store| to take the ``long"
- form; it is only used internally.
-
- \de |make_menu|\inpar Provided for the benefit of the Apple Macintosh, and who are
- we to object. Interpreters which don't provide menus are supposed to set
- a bit to say so in the header, but anyway this instruction can simply
- do nothing and not branch if there are no menus (or if there are too many
- already).
- \onpar
- The menu number to be added has to be more than 2 (since 0 is the Apple
- menu, 1 the File menu, 2 the Edit menu). If the table supplied is 0,
- the menu is removed. Otherwise it is a table of tables. Each table is
- an ASCII string: the first item being a menu name, subsequent ones the
- entries.
-
- \de |mod |\inpar Remainder after signed 16-bit division.
-
- \de |mouse_window|\inpar Constrain the mouse arrow to sit inside the given window.
- By default it sits in window 1. Setting to -1 takes all restriction away.
- (The mouse clicks are not reported if the arrow is outside the window
- and interpreters are presumably supposed to hold the arrow there by
- hardware means if possible.)
-
- \de |move_window|\inpar Moves the given window to pixels $(y,x)$: $(1,1)$ being
- the top left. Nothing actually happens (since windows are entirely
- notional transparencies): but any future plotting happens in the new place.
-
- \de |mul |\inpar Signed 16-bit multiplication.
-
- \de |new_line|\inpar Print carriage return.
-
- \de |nop |\inpar Probably the official ``no operation" instruction. Ironically,
- since there is hardly ever any point in using it (self-modifying code is
- illegal in the Z-machine since the code is outside the save area)
- interpreters sometimes do not bother to implement it... and crash.
- (In any event, no V1 to V5 datafile actually uses this opcode.)
-
- \de |not |\inpar Bitwise not (i.e., all 16 bits inverted). Note: in versions 3 and 4
- this was a one-operand instruction (as would be expected) but in versions
- 5 and 6 it was moved to opcode 248, among the variable-operand opcodes, to make room for |call_1n|.
- (Inform knows which to compile to.)
- (Note also that although this opcode seems to belong to V3, it is not
- in fact used until V4.)
-
- \de |or |\inpar Bitwise or.
-
- \de |output_stream|\inpar Text can be output to a variety of different ``streams",
- possibly simultaneously. 0 does nothing. +n switches stream n on,
- -n switches it off. The output streams are: 1 (the screen),
- 2 (the game transcript), 3 (memory) and 4 (script of player's commands).
- Thus, one can turn the screen off and print only to the transcript,
- for instance. Zip does now provide 4, which is extremely useful
- in debugging games. Other interpreters do not.
- \onpar
- Case 3 is more complicated. Here the syntax is:
- \onpar|output_stream 3 table|
- \onpar
- and the text is printed into the table from byte 2 onwards, the first word always holding
- the number of characters printed. Printing is never buffered in this
- stream, whatever the state of |buffer_mode|.
- \onpar
- In Version 6, the total number of pixels width is kept in a field in
- the game's header. Also, the |width| field may optionally be given,
- and the text will then be justified as if it were in the window with
- that number (if width is positive) or a box -|width| pixels wide (if
- negative). Then the table will contain not ordinary text but formatted
- text: see |print_form|.
- \onpar
- In version 3 (which does not have this opcode) transcripting is caused
- purely by setting the header bit. In higher versions games do this
- as well anyway, despite using the opcode.
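- \onpar
- Stream 3 (the memory stream) can be sketched in Python as below. The class
- |MemoryStream| is hypothetical; |memory| is assumed to be a |bytearray|,
- and words are assumed to be stored high byte first.

```python
class MemoryStream:
    """Output stream 3: text is captured in a table in memory."""

    def __init__(self, memory, table):
        self.memory = memory          # bytearray holding the Z-machine memory
        self.table = table            # byte address of the table
        self._set_count(0)

    def _set_count(self, count):
        # The first word of the table holds the number of characters printed.
        self.memory[self.table] = (count >> 8) & 0xFF
        self.memory[self.table + 1] = count & 0xFF

    def count(self):
        return (self.memory[self.table] << 8) | self.memory[self.table + 1]

    def write(self, text):
        # Characters go in one at a time, starting at byte 2 of the table.
        for ch in text:
            n = self.count()
            self.memory[self.table + 2 + n] = ord(ch) & 0xFF
            self._set_count(n + 1)
```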
-
- \de |picture_data|\inpar Asks the interpreter for data on the picture with the given
- number. This is a branch instruction: if the picture number is not valid,
- no branch is made. Otherwise information is written to the table and a
- branch occurs.
- \onpar
- If the number is zero, both the first and the second word of the table are
- simply written as the highest legal picture number.
- \onpar
- Otherwise, the first word of the table contains the height and the second
- the width.
-
- \de |piracy|\inpar Branches if the game disc is believed to be genuine by the
- interpreter (which is assumed to have some evil way of finding out).
- Earlier specifications suggested this to be an unconditional branch
- instruction... interpreter writers are urged to code it as such,
- and Z-code programmers not to use it at all.
-
- \de |pop |\inpar This exists only in versions 3 and 4, and simply throws away the top of
- the stack. (The need for it was largely circumvented by the
- call-and-throw-away-result instructions.) The same opcode was then used
- for |catch| which tends to crash the machine if used naively.
-
- \de |pop_stack|\inpar In Version 6, an honest |pop| instruction was finally re-invented.
- This throws the given number of items off the system stack, unless a stack
- is given as a second argument, in which case it pops off that one instead.
-
- \de |print|\inpar Print the quoted (literal) string.
-
- \de |print_addr|\inpar Print (Z-encoded) string at given byte address.
-
- \de |print_char|\inpar Print ASCII character.
-
- \de |print_form|\inpar Prints a formatted table of the kind produced when the output
- stream is 3. This is an elaborated version of |print_table| to cope with
- fonts, pixels and other impedimenta. It is a sequence of lines,
- terminated with a zero word. Each line is a word containing the number
- of characters, followed by that many bytes which hold the characters
- concerned.
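- \onpar
- Reading such a table back is straightforward. A Python sketch following
- the description literally (the function name is made up, words are taken to
- be high byte first, and no padding between lines is assumed):

```python
def read_formatted_table(memory, address):
    """Return the lines of a formatted table as a list of strings."""
    lines = []
    while True:
        length = (memory[address] << 8) | memory[address + 1]
        if length == 0:               # a zero word ends the table
            return lines
        start = address + 2
        lines.append(bytes(memory[start:start + length]).decode("latin-1"))
        address = start + length
```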
-
- \de |print_num|\inpar Print (signed) number in decimal.
-
- \de |print_paddr|\inpar Print the (Z-encoded) string at the given packed address.
-
- \de |print_ret|\inpar Print the quoted (literal) string, and print a new-line, and
- then return true (i.e., 1).
-
- \de |print_table|\inpar Prints a rectangle of text on screen spreading right and
- down from the current cursor position, of given |width| and |height|, from
- the table of ASCII text given. (Height is optional and defaults to 1.)
- If a |skip| value is given, then that many characters of text are skipped
- over in between each line and the next. So one could make this display,
- for instance, a 2 by 3 region of a giant 40 by 40 character graphics
- map.
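- \onpar
- The effect of |skip| is easiest to see in code. This Python sketch (with
- the made-up name |table_rows|) returns the rows that would be printed rather
- than drawing them; |memory| is assumed to be a |bytearray|.

```python
def table_rows(memory, address, width, height=1, skip=0):
    """Return the rows that print_table would display, top to bottom."""
    rows = []
    for _ in range(height):
        rows.append(bytes(memory[address:address + width]).decode("latin-1"))
        address += width + skip       # pass over the text between rows
    return rows
```

- \onpar
- For the map example, a region 3 wide and 2 high would be fetched with a
- width of 3, a height of 2 and a skip of 37.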
-
- \de |pull|\inpar Pulls value off the stack (crashing if it underflows). In versions
- 5 and 6, the stack in question may be specified as a user one. A user
- stack is just a table of words in the save area somewhere, whose first
- word always holds the number of spare slots on the stack (so the initial
- value is the capacity of the stack). Existing interpreters handle user stacks poorly.
-
- \de |push|\inpar Pushes value onto the system stack.
-
- \de |push_stack|\inpar Pushes the value onto the user-specified stack, and branches
- if successful. If the stack was full already, nothing happens and no
- branch is made.
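- \onpar
- A minimal Python sketch of a user stack, as used by |pull| and |push_stack|,
- might look like this. The text above fixes only the meaning of the first
- word (the number of spare slots); where the values themselves are stored is
- an assumption here (the slot indexed by the spare count), and the class and
- method names are hypothetical.

```python
class UserStack:
    """A table of words whose first word counts the spare slots remaining."""

    def __init__(self, capacity):
        # words[0] starts out as the capacity; the other slots hold values.
        self.words = [capacity] + [0] * capacity

    def push(self, value):
        """Return True (branch) on success, False if the stack is full."""
        spare = self.words[0]
        if spare == 0:
            return False            # full: nothing happens, no branch
        self.words[spare] = value & 0xFFFF   # assumed slot layout
        self.words[0] = spare - 1
        return True

    def pull(self):
        """Remove and return the most recently pushed value."""
        spare = self.words[0]
        if spare == len(self.words) - 1:
            raise IndexError("user stack underflow")
        self.words[0] = spare + 1
        return self.words[spare + 1]
```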
-
- \de |put_prop|\inpar Write value to the given property of the given object (this crashes
- the machine if the object has no such property). The interpreter stores
- a word or a byte as appropriate.
-
- \de |put_wind_prop|\inpar Writes a window property (see |get_wind_prop|). This should
- only be used when there is no direct command (such as |move_window|) to
- use instead, as some such operations may have side-effects.
-
- \de |quit|\inpar Exit the game. (Any ``Are you sure?" question must be asked by
- the game, not the interpreter.) It is not legal to return from the main
- routine: this must be used.
-
- \de |random|\inpar Returns a random number between 1 and range (supposing range to be
- positive).
- If range is negative, it is used as a seed for the random number generator
- (different interpreters do this in different ways), to make the generator
- predictable. Random then returns 0.
- If range is zero, some interpreters crash (though they absolutely should
- not). Correct behaviour is to reset the generator to some suitable seed
- value (say, taken from a real-time clock). Again, random should then
- return 0.
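- \onpar
- The three cases can be sketched in Python as below. This is an illustration
- only: the class name is invented, Python's own generator stands in for
- whatever generator an interpreter uses, and seeding from the clock is just
- one possible choice of ``suitable seed value".

```python
import random as _random
import time


class ZRandom:
    """Sketch of the behaviour of the random opcode described above."""

    def __init__(self):
        self._rng = _random.Random()

    def random(self, range_):
        if range_ > 0:
            # Ordinary case: a number between 1 and range inclusive.
            return self._rng.randint(1, range_)
        if range_ < 0:
            # Negative: seed the generator so that it becomes predictable.
            self._rng.seed(-range_)
        else:
            # Zero: reseed from something unpredictable (here, the clock).
            self._rng.seed(time.time_ns())
        return 0
```

- \onpar
- Seeding twice with the same negative value then yields the same sequence,
- which is what makes the generator predictable for testing.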
-
- \de |read|\inpar The two forms of |read| are called |aread| and |sread| by Inform,
- for the sake of clarity (Advanced and Standard read). |read| is actually
- a high-level Inform command which compiles suitably portable code for
- either version.
-
- \de |read_char|\inpar Reads a single character. The stream (the first operand) is
- always 1, which (for some reason) means the keyboard. Time and function are
- optional and dealt with as in |aread|. Function keys return special
- values from 129 onwards:
- \onpar |up, down, left, right, f1 ... f12, keypad 0...9,|
- \onpar |menu click, double mouse click, single mouse click|
- \onpar (Mice only being at play in version 6.)
-
- \de |read_mouse|\inpar The four words in the |table| are written with the mouse
- y coordinate, x coordinate, button bits (low bits on the right of the
- mouse, rising as one looks left), and a menu word. In the menu word,
- the upper byte is the menu number (from 1) and the lower byte is the
- item number (from 0).
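- \onpar
- Unpacking the table might be sketched in Python as follows. The function
- name and the flat |memory| array are hypothetical, and words are assumed to
- be stored most significant byte first.

```python
def decode_mouse_table(memory, addr):
    """Unpack the four words written by read_mouse into named fields."""
    words = [(memory[addr + 2 * i] << 8) | memory[addr + 2 * i + 1]
             for i in range(4)]
    y, x, buttons, menu_word = words
    return {
        "y": y,
        "x": x,
        "buttons": buttons,
        "menu": menu_word >> 8,      # upper byte: menu number (from 1)
        "item": menu_word & 0xFF,    # lower byte: item number (from 0)
    }
```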
-
- \de |restore|\inpar See |save|. In version 3, the branch is never actually made,
- since either the game has successfully picked up again from where it
- was saved, or it failed to load the save game file. From version 5
- it can have optional parameters as |save| does, and returns the number
- of bytes loaded if so. If the restore fails, 0 is returned, but once
- again a return here necessarily means failure, since otherwise control is
- already elsewhere.
-
- \de |restore_undo|\inpar Like |restore|, but restores from the internal RAM saved
- game made by |save_undo|. (The optional parameters of |restore| may not be
- supplied.)
-
- \de |restart|\inpar Restarts the game. (Any ``Are you sure?" question must be asked
- by the game, not the interpreter.)
-
- \de |ret |\inpar Returns the value given.
-
- \de |ret_popped|\inpar Pops top of stack and returns that. This is equivalent to
- |ret sp|, but is one byte cheaper.
-
- \de |"retsp"|\inpar Obsolete name for |ret_popped|.
-
- \de |rfalse|\inpar Return false (i.e., 0).
-
- \de |rtrue|\inpar Return true (i.e., 1).
-
- \de |same_parent|\inpar Compare parent objects of the two given: branch if equal.
-
- \de |save|\inpar On versions 3 and 4, this attempts to save the game (all questions
- about filenames are asked by interpreters) and branches if successful.
- From version 5 it moves to the extended set, as a result of which it is
- no longer a branch instruction, and works in a different way (see the
- explanation above). This returns 0 for failure, 1 for ``save
- succeeded" and 2 for ``the game is being restored and is resuming
- execution again from here, the point where it was saved".
- \onpar
- The extended form also takes (optional) parameters, which save just a region
- of the save area, whose address and length are given in bytes, and provide a
- suggested filename: |name| is a pointer to an array of ASCII characters
- giving this name (as usual preceded by a byte giving the number of
- characters).
-
- \de |save_undo|\inpar Like |save|, except that the optional parameters may not be
- specified: it saves the game into a cache of RAM held by the interpreter.
- (This is typically done once per turn, in order to implement ``UNDO", so
- it needs to be quick.) It may also return -1, meaning that the
- interpreter is unable to offer this feature. (Alas, most interpreters
- do not understand this opcode well enough to be able to confess to being
- unable to act on it.)
-
- \de |scan_table|\inpar Is |x| one of the words in |table|, which is |len| words
- long? If so, return the address where it first occurs and branch. If not,
- return 0 and don't.
- \onpar
- The |form| is optional (and only used in version 5?): bit 7 (|$80|) is set
- for words, clear for bytes: the remaining bits contain the length in bytes
- of each field in the table. (The first word or byte in each field being the
- one looked at.) Thus |$82| is the default.
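- \onpar
- A hedged Python sketch of the search follows. It assumes the word/byte flag
- is the top bit (|$80|) of |form|, as the default of |$82| implies, that
- words are stored most significant byte first, and that |len| counts fields.
- The function name and the flat |memory| array are hypothetical.

```python
def scan_table(memory, x, table, length, form=0x82):
    """Return the address of the first field holding x, or 0 if none does.

    A non-zero result corresponds to the branch being taken.
    """
    field_len = form & 0x7F          # bytes from one field to the next
    is_word = bool(form & 0x80)      # top bit set: compare words, else bytes
    addr = table
    for _ in range(length):
        if is_word:
            value = (memory[addr] << 8) | memory[addr + 1]
        else:
            value = memory[addr]
        if value == x:
            return addr
        addr += field_len
    return 0
```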
-
- \de |"scanw"|\inpar Obsolete name for |scan_table|.
-
- \de |scroll_window|\inpar Scrolls the given window by the given number of pixels
- (a negative value scrolls backwards, i.e., down) writing in blank
- (background colour) pixels in the new lines. This can be done to any
- window and is not related to the ``scrolling" attribute of a window
- (which controls text scrolling, a different matter).
-
- \de |set_attr|\inpar Give |object| the |attribute|.
-
- \de |set_colour|\inpar If coloured text is available, set text to be foreground-against-
- background, where colour numbers are borrowed from the IBM PC:
- 2 - black, 3 - red, 4 - green, 5 - yellow, 6 - blue, 7 - magenta, 8 - cyan,
- 9 - white: in addition, 0 means keep the current colour setting, 1 means
- use the default and -1 means the colour of the pixel under the mouse arrow.
- \onpar
- One of the V5 games, `Beyond Zork', uses this (Paul David Doherty reports
- it as used ``76 times in 870915 and 870917, 58 times in 871221'') and from
- the structure of the table it clearly logically belongs in version 5.
- \onpar
- Text styles such as bold and underline may also be realised with colour
- changes, if this is used.
-
- \de |set_cursor|\inpar Move cursor in the current window to $(x,y)$ character position
- (relative to $(1,1)$ in the top left). (In version 6 the window is supplied
- and need not be the current one.) Each window remembers its own cursor
- position. Using this call may result in any buffered text being printed
- out first (if word-wrapping is going on, for instance).
- \onpar
- In V6, |set_cursor -1| turns the cursor off, and either |set_cursor -2| or
- |set_cursor -2 0| turns it back on. It is not known what, if anything, this
- second argument means: in all known cases it is 0.
-
- \de |set_flag|\inpar See |clear_flag|.
-
- \de |set_font|\inpar The (text) font in the given window is changed. All windows
- (and this includes both windows in Version 5, contrary to common
- interpreter practice) seem to be expected to start with a non-fixed-space
- font. Anyway font 0 means ``keep current one" (this seems less than
- altogether useful), font 1 means ``default", font 3 refers to character
- graphics fonts (in versions 5 and 6) and font 4 means a fixed space font.
- \onpar
- No such opcode exists in versions 3 and 4: turning on and off the
- fixed space font is done by altering a bit in the header as usual. This
- remains the best way for interpreters to work even in higher versions.
-
- \de |set_margins|\inpar Sets the margin widths (in pixels) on the left and right
- for the given window, which are by default 0. These are only used by
- windows which have word-wrapping (i.e., |buffer_mode| 1) and do nothing
- for others.
-
- \de |set_text_style|\inpar Sets printing style: 0 means normal, 1 means inverse video,
- 2 means bold, 4 means underline. (In version 6, 8 means change to a
- fixed-width characters font.) In principle the interpreter should
- clear flags in the header according to which of these it is unable to
- provide (in practice, few bother, and it doesn't much matter).
-
- \de |set_window|\inpar Moves text output to one of the windows. 0 is the default
- (lower) window and 1 means the upper one. This only just counts as a
- version-3 instruction: it was used by `Seastalker' on some machines.
- \onpar
- In version 6 this is much more elaborate. There are 8 windows, 0 to 7,
- which can do almost anything. In addition, the window number -3
- means ``the current window", in this and all the other calls.
-
- \de |"show_score"|\inpar Obsolete name for |show_status|.
-
- \de |show_status|\inpar (In version 3 only) Display and update the status line now
- (don't wait until the next keyboard input). Ideally this should not
- crash in version 5, since the v5 release of `Wishbringer' (V23) contains
- this opcode by accident.
-
- \de |sound_effect|\inpar `The Lurking Horror' used this opcode, but no other version-3
- game did: the v5 game `Sherlock' also used its full form. See |beep|, the
- Inform name for the simpler form of this opcode in versions 4 and 5.
- \onpar
- In Version 6, this produces the given sound (1 meaning a high-pitched
- beep, 2 meaning a low one and other values corresponding to noises
- held by the interpreter) at the given volume (1 to 8: -1 being the
- default, loudest value: Mark Howell suggests |$34FB| causes fade in
- and |$3507|, fade out) repeated the given number of times (-1 now meaning
- forever). The ``effect" can be: 1 (prepare), 2 (start), 3 (stop), 4
- (finish with). (Preparation means in effect loading the sample file
- off disc.)
- \onpar
- Version 5 (and 3) is similar but the parameters seem to be less
- sensibly arranged, as shown.
-
- \de |split_window|\inpar Divides the screen into two windows, an upper one (of the
- stated number of lines) which is in effect a big status bar, and a
- lower one (all the rest). This only just counts as a version-3
- instruction: it was used by `Seastalker' on some machines.
- In V6, this seems to be used just to bound the cursor movement. `Journey'
- creates a status region which is the whole screen and then overlays it
- with two other windows.
-
- \de |sread|\inpar Standard (version 3) form of |read|. For details, see the |read|
- command's description in section (8). Note that this automatically
- redisplays the status line before the keyboard is listened to.
-
- \de |store|\inpar Set |variable| to |value|.
-
- \de |storeb|\inpar |table->byte = value|.
-
- \de |storew|\inpar |table-->word = value|.
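- \onpar
- In Python terms the two stores might be sketched as below, assuming memory
- is a flat byte array, that a word index counts in units of two bytes and
- that words are stored most significant byte first. The argument names are
- illustrative only.

```python
def storeb(memory, table, index, value):
    """table->index = value: write one byte."""
    memory[table + index] = value & 0xFF


def storew(memory, table, index, value):
    """table-->index = value: write one 2-byte word (assumed big-endian)."""
    addr = table + 2 * index
    memory[addr] = (value >> 8) & 0xFF
    memory[addr + 1] = value & 0xFF
```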
-
- \de |sub |\inpar Signed 16-bit subtraction.
-
- \de |test|\inpar Jump if all of the flags are set in bitmap
- (i.e. if |bitmap & flags == flags|).
-
- \de |test_array|\inpar See |clear_flag| (ITF makes this come out
- unconditionally false, though).
-
- \de |test_attr|\inpar Jump if |object| has |attribute|.
-
- \de |throw|\inpar Opposite of |catch|. This causes the game to behave as if the
- current routine was that whose stack-frame is given (which was found
- using |catch| at the right moment). Thus the next return to happen
- will return as if from the ``caught" routine. This is useful for getting
- out of large recursive tangles in a hurry, if an error has occurred.
- (This opcode plainly belongs to the V5 specification, but is not actually
- used in any V5 game.)
-
- \de |tokenise|\inpar The parser (strictly speaking, the lexical analyser) from |aread|.
- The given |text| is parsed into the given parse table. Unlike in version
- 3, |aread| appends to the parse table rather than overwriting it.
- \onpar
- If a non-zero |dictionary| is supplied, it is used (if not, the ordinary
- game dictionary is). If the |flag| is set, unrecognised words are not
- listed as zero in the parse table: this is presumably so that if several
- |tokenise|s are done in a row, each fills in more slots without wiping those
- filled by the others.
- \onpar
- Parsing a user dictionary is slightly different. A user dictionary
- should look just like the main one, except that it should have no
- ``separator" characters listed (the ones listed in the main one are
- valid instead), and that it need not be alphabetically sorted. If the
- number of entries is given as $-n$, then the interpreter reads this as
- ``$n$ entries unsorted". This is very convenient if the table is being
- altered in play: if, for instance, the player is naming things.
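- \onpar
- Reading the entry count of a dictionary could be sketched like this in
- Python. The helper name is hypothetical; the only assumption is that the
- count is a 16-bit word interpreted as a signed number.

```python
def dictionary_entry_count(raw_word):
    """Return (number_of_entries, is_sorted) from the 16-bit count word."""
    signed = raw_word - 0x10000 if raw_word & 0x8000 else raw_word
    if signed < 0:
        # A count of -n means "n entries, unsorted": search linearly.
        return -signed, False
    # A non-negative count means the usual alphabetically sorted table.
    return signed, True
```

- \onpar
- An interpreter would then choose a binary search for a sorted dictionary and
- a plain linear scan for an unsorted one.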
-
- \de |verify|\inpar Some version-3 interpreters are said not to implement this. It
- computes a (two-byte, unsigned) checksum of the file from |$0040| onwards and
- compares this against the value in the game header, branching if correct.
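- \onpar
- A sketch of the check in Python, using the header offsets listed in section
- 10 (file length at |$1A|, checksum at |$1C|). The function names are
- hypothetical, and header words are assumed to be stored most significant
- byte first.

```python
def story_length(story):
    """File length in bytes: the header holds words (V3) or longwords (V4-6)."""
    raw = (story[0x1A] << 8) | story[0x1B]
    return raw * (2 if story[0] <= 3 else 4)


def story_checksum(story):
    """Sum the bytes from $0040 up to the file length, modulo 65536."""
    return sum(story[0x40:story_length(story)]) & 0xFFFF


def verify(story):
    """True (branch) if the computed checksum matches the header value."""
    stored = (story[0x1C] << 8) | story[0x1D]
    return story_checksum(story) == stored
```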
-
- \de |vje |\inpar Internal Inform name for the variable-length form of |je| (for
- compiling conditions such as |a==1 or 2 or 4|).
-
- \de |window_size|\inpar Change size of window in pixels. (Does not change the current
- display.)
-
- \de |window_style|\inpar Changes the four attributes for a given window. The bits in
- question are: 1 - word wrapping (if this is off text is clipped to the
- window size instead), 2 - scrolling, 3 - text to be sent to the printer
- (if transcripting is switched on), 4 - text is buffered.
- \onpar
- The operation, by default, is 0. 0 means ``set to these settings". 1 means
- ``set the bits supplied". 2 means ``clear the ones supplied", and 3 means
- ``reverse the bits supplied" (i.e. exclusive or).
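- \onpar
- The four operations amount to the following, sketched in Python. The
- function name is invented, and masking the result to four bits is an
- assumption based on there being four attributes.

```python
def window_style(current, flags, operation=0):
    """Return the new attribute bits of a window after a window_style call."""
    if operation == 0:
        result = flags               # set to exactly these settings
    elif operation == 1:
        result = current | flags     # set the bits supplied
    elif operation == 2:
        result = current & ~flags    # clear the bits supplied
    elif operation == 3:
        result = current ^ flags     # reverse the bits supplied
    else:
        raise ValueError("operation must be 0, 1, 2 or 3")
    return result & 0x0F
```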
-
-
- \section{10}{Header format through the ages}
-
-
- The initial block of 64 bytes in the Z-machine, the ``header", is of
- particular fascination to Infocom hackers and many tables have been drawn up
- of its contents. The table here steals from its predecessors (I am
- particularly indebted to Paul David Doherty) but also fills some gaps to do
- with version 6.
-
- Once again ``V" refers to the earliest version in which the feature appeared;
- ``Dyn" is marked if the entry is dynamic, i.e. changes as the game plays;
- ``Int" if it is written by the interpreter (otherwise it is set in or by the
- game file).
-
- Bits in a byte are numbered from 0 (|$01|) up to 7 (|$80|).
-
- \pageinsert
- \smallskip\hrule\smallskip
- \beginlines
- | Hex V Dyn Int Contents|
- \endlines\smallskip\hrule\smallskip\beginlines
- | 0 1 Version number (1 to 6)|
- | 1 3 Flags 1:|
- | 3 * Bit 0 (unused: possibly a flag to indicate byte sex,|
- | i.e. LSB-MSB or MSB-LSB in 2-byte words, at|
- | a time when two different forms of game file|
- | were considered: no such forms ever emerged)|
- | 1 Status line type: clear for score/moves,|
- | set for time in hours/minutes|
- | * 2 (unused: set in V3?)|
- | * 3 The legendary "Tandy" bit (see below)|
- | * 4 The interpreter sets this if it cannot|
- | produce a status line|
- | * 5 Interpreter sets if it _can_ split the screen|
- | (only `Seastalker' uses this in V3)|
- | * 6 Interpreter sets if it uses non-fixed-space|
- | fonts|
- | 7 (unused)|
- | 4 * Flags 1: Interpreter sets bits to say what it can do:|
- | 4 * Bit 0 (always set)|
- | 6 * Colours available?|
- | 4 * 1 (always set)|
- | 6 * Picture displaying available?|
- | 4 * 2 Boldface available?|
- | 4 * 3 Underlining available?|
- | (the only one of these flags any V4/5 games|
- | actually ever looked at)|
- | 4 * 4 Fixed-space font available?|
- | 6 * 5 Sound effects available?|
- | 6,7 (unused)|
- | 2 1 Release number|
- | 4 1 Start of code area (bytes)|
- | 6 1 Main routine address (uniquely, a byte address which|
- | points to first byte of code in routine)|
- | 8 1 Dictionary address (bytes)|
- | A 1 Object table address (bytes)|
- | C 1 Global variables table address (bytes)|
- | E 1 Size of save area (bytes)|
- \endlines\smallskip\hrule\endinsert
-
- \pageinsert
- \smallskip\hrule\beginlines
- | Hex V Dyn Int Contents|
- \endlines\smallskip\hrule\smallskip\beginlines
- | 10 3 * Flags 2:|
- | * Bit 0 Printer transcripting happens when the game|
- | sets this bit|
- | * 1 The interpreter is forced to use a fixed-space|
- | font when the game sets this bit|
- | (does not apply in version 6?)|
- | 6 * * 2 If the interpreter thinks the status line needs|
- | redrawing (because, e.g., the player has|
- | dragged a menu across it) it sets this bit.|
- | The game should notice, redraw the status|
- | line and clear the bit itself.|
- | 6 3 If set, game wants to use pictures|
- | 3 4 Set in the Amiga version of The Lurking Horror|
- | so presumably to do with sound effects|
- | 5 If set, game wants to use the UNDO opcodes|
- | 6 5 If set, game wants to use a mouse|
- | 6 6 If set, game wants to use colours|
- | 6 7 If set, game wants to use sound effects|
- | 6 8 If set, game wants to use menus|
- | (In each case except bit 6, if the|
- | interpreter cannot manage the given feature,|
- | it should clear the relevant bit again.)|
- | 9 (unused)|
- | * * 10 Possibly set by interpreter to indicate an error|
- | with the printer during transcription|
- | 11-15 (unused)|
- | 12 2 Serial number (six characters of ASCII, conventionally|
- | the compilation date in the form YYMMDD)|
- | 18 2 Synonyms table address (bytes)|
- | 1A 3+ Length of file (in words (V3) or longwords (V4,5,6))|
- | 1C 3+ Checksum of file (sum of bytes from $0040 to length|
- | by unsigned 16-bit addition)|
- | 1E 4 * Interpreter number, identifying the machine as one of:|
- | 1 DECSystem-20 6 IBM PC|
- | 2 Apple IIe 7 Commodore 128|
- | 3 Macintosh 8 Commodore 64|
- | 4 Amiga 9 Apple IIc|
- | 5 Atari ST 10 Apple IIgs|
- | 11 Tandy Color|
- | The latest versions of the portable interpreters I|
- | have seen are: InfoTaskForce 2 Version A|
- | Zip 6 Version B|
- | 1F 4 * Interpreter version (a single ASCII character,|
- | conventionally running through capital letters from A)|
- \endlines\smallskip\hrule\endinsert
-
- \pageinsert
- \smallskip\hrule\beginlines
- | Hex V Dyn Int Contents|
- \endlines\smallskip\hrule\smallskip\beginlines
- | 20 4 * Screen height (lines): 255 means "infinite", i.e. never|
- | worry about screen overflow and never produce [MORE]|
- | 21 4 * Screen width (characters)|
- | 22 5 * * Leftmost screen coordinate|
- | 23 5 * * Rightmost screen coordinate|
- | 24 5 * * Highest screen coordinate|
- | 25 5 * * Lowest screen coordinate|
- | 26 5 * * Width in these coordinate terms of a character in the|
- | current font|
- | 27 5 * * Similarly, font height|
- | (Note: it is perfectly permissible for 22 to 25 to be|
- | character grid positions, and the width and height both|
- | to be 1: or they could all be in pixels.)|
- | 22 6 * Screen width in pixels|
- | 24 6 * Screen height in pixels|
- | 26 6 * * Font height in pixels|
- | 27 6 * * Font width in pixels (defined as width of a '0')|
- | (Note: 22-27 are similar in V6 to V5, with the coordinates|
- | now being pixels, but the highest and leftmost slots are|
- | dropped (both values being 1) to give room for 2-byte values,|
- | i.e. for resolutions of more than 255 pixels.)|
- | 28 6 Functions extra offset (longwords): this may be 0. It is|
- | added to all function addresses and effectively allows|
- | the program to exceed the 256K maximum address space|
- | by the size of the save area|
- | 2A 6 Static strings extra offset (longwords): similar (needed|
- | since static strings come last, after the functions)|
- | 2C 6 * Default background colour|
- | 2D 6 * Default foreground colour|
- | 2E 6 Address of terminating characters table (bytes)|
- | 30 6 * * Slot used when the output_stream is to memory, to record|
- | total width of text in pixels|
- | 32 --- (these 2 bytes unused in any version)|
- | 34 5 Character set table address (bytes), or 0 if the default|
- | character set is to be used|
- | 36 6 Mouse data table address (bytes)|
- | 38 6 * 8 bytes of ASCII: the player's user-name on Infocom's|
- | own mainframe, used for debugging purposes and|
- | possibly allowing users access to special features.|
- \endlines\smallskip\hrule\bigskip
- \noindent Some early version-3 files do not contain length and checksum
- data, hence the mysterious |3+|.
- \endinsert
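- 
- As a rough illustration of how the fixed fields in the first of these tables
- might be read, here is a Python sketch. The function and key names are
- hypothetical, only some of the fields are shown, and two-byte values are
- assumed to be stored most significant byte first.

```python
def read_word(data, offset):
    """Read a 2-byte word, most significant byte first."""
    return (data[offset] << 8) | data[offset + 1]


def decode_header(story):
    """Pick out some of the header fields from the first 64 bytes."""
    if len(story) < 64:
        raise ValueError("a story file needs at least a 64-byte header")
    return {
        "version": story[0x00],
        "flags1": story[0x01],
        "release": read_word(story, 0x02),
        "code_start": read_word(story, 0x04),
        "main_routine": read_word(story, 0x06),
        "dictionary": read_word(story, 0x08),
        "object_table": read_word(story, 0x0A),
        "globals": read_word(story, 0x0C),
        "save_area_size": read_word(story, 0x0E),
        "flags2": read_word(story, 0x10),
        # Six ASCII characters, conventionally YYMMDD.
        "serial": bytes(story[0x12:0x18]).decode("ascii", errors="replace"),
        "synonyms": read_word(story, 0x18),
    }
```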
-
- \subtitle{The ``Tandy" bit}
-
- Some early Infocom games were sold by the Tandy Corporation, who seem to
- have been sensitive souls. `Zork I' pretends not to have sequels if it
- finds this bit set. And to quote Paul David Doherty:
- \quote
- In `The Witness', the Tandy Flag can be set while playing the game,
- by typing |$DB| and then |$TA|. If it is set, some of the prose will be
- less offensive. For example, ``private dicks" become ``private eyes",
- ``bastards" are only ``idiots", and all references to ``slanteyes" and
- ``necrophilia" are removed.
- \endquote
- We live in an age of censorship.
-
- \subtitle{The character set table}
-
- Is 78 bytes long, arranged as 3 blocks of 26 ASCII values for what
- characters to print when translating text. (The first two characters of
- block 3 are ignored anyway as they correspond to newline and the literal
- escape code.) This feature is implemented by Zip but not ITF, which
- means that the German translation of `Zork I' (which uses the character
- set for non-English letters like 'sz') is illegible on it.
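- 
- Splitting the table into its three blocks could be sketched in Python as
- follows. The function name is hypothetical, and decoding the bytes as
- Latin-1 is merely a convenience for display (the table itself is described
- only as ASCII values).

```python
def split_charset_table(memory, addr):
    """Return the three 26-character blocks of a character set table."""
    table = bytes(memory[addr:addr + 78])
    if len(table) != 78:
        raise ValueError("character set table must be 78 bytes long")
    # Three consecutive blocks of 26 character codes each.
    return [table[i * 26:(i + 1) * 26].decode("latin-1") for i in range(3)]
```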
-
- \subtitle{The terminating characters table}
-
- Is a zero-terminated list of character codes which cause |read| to finish
- (other than new-line). An entry of 255 means that any function key
- terminates input.
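- 
- A hedged Python sketch of how an interpreter might consult this table
- follows. Both function names are invented, and treating every code from 129
- upwards as a function key is an assumption based on the description of
- |read_char|.

```python
def read_terminators(memory, addr):
    """Collect the zero-terminated list of terminating character codes."""
    codes = set()
    while memory[addr] != 0:
        codes.add(memory[addr])
        addr += 1
    return codes


def terminates_input(code, terminators):
    """True if this key code should finish the current read."""
    if code in terminators:
        return True
    # An entry of 255 means that any function key terminates input.
    is_function_key = code >= 129    # assumption, see the lead-in
    return 255 in terminators and is_function_key
```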
-
- \subtitle{The mouse data table}
-
- Seems to have been intended to grow at some future time, because the first
- word is the length of it. But the only data is the second and third words:
- the mouse x and y coordinates respectively. The interpreter writes these,
- and they change during play.
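- 
- Reading the table might be sketched in Python like this. The function name
- is hypothetical, and words are assumed to be stored most significant byte
- first.

```python
def read_mouse_data(memory, addr):
    """Return (length, x, y) from the mouse data table at addr."""
    def word(offset):
        return (memory[offset] << 8) | memory[offset + 1]

    length = word(addr)      # first word: length of the table
    x = word(addr + 2)       # second word: mouse x coordinate
    y = word(addr + 4)       # third word: mouse y coordinate
    return length, x, y
```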
-
-
- \section{11}{A few statistics}
-
-
- To give some idea of the sizes found in typical story files, here are a few
- statistics, mostly gathered by Paul David Doherty, whose ``fact sheet" file
- contains many more.
-
- \medskip
- (i) {\sl Length}\quad
- The shortest files are those dating from the time of the `Zork'
- trilogy, at about 85K; middle-period version 3 games are typically 105K,
- and only the latest use the full memory map. In versions 4 and 5, only
- `Trinity', `A Mind Forever Voyaging' and `Beyond Zork' use the full 256K.
- `Border Zone' and `Sherlock', for instance, are about 180K. (The author's
- short story `Balances' is about 50K, an edition of `Adventure' takes 80K,
- and `Curses' about 240K.)
-
-
- \medskip
- (ii) {\sl Code size}\quad
- `Zork I' uses only about 5500 opcodes, but the number rises
- steeply with later games; `Hollywood Hijinx' has 10355 and, e.g.
- `Moonmist' has 15900 (both these being version 3). Against this, `A Mind
- Forever Voyaging' has only 18700, and only `Trinity' and `Beyond Zork'
- reach 32000 or so. (Inform games are more efficiently compiled and make
- better use of common code - the library - so perform much better here:
- version 3, release 10 of `Curses' (128K long, and a larger game than
- any Infocom v3 game) has only 6720 opcodes.)
-
-
- \medskip
- (iii) {\sl Objects and rooms}\quad
- Obviously, this varies greatly with the style of
- game. `Zork I' has 110 rooms and 60 takeable objects, but several quite
- complex games have as few as 30 rooms (the mysteries, or `Hitch-hikers').
- The average for version-3 games is 69 rooms, 39 takeable objects.
-
- `A Mind Forever Voyaging' contains many rooms (178) but few objects (30).
- `Trinity', a more typical style of game, contains 134 rooms and 49
- objects: the version-5 `Curses' has a few more of each. Of the version-6
- games, only `Zork Zero' scores highly here, with 215 rooms and 106
- objects. The average for version 4/5 games is 105 rooms and 54 objects.
-
-
- \medskip
- (iv) {\sl Dictionary}\quad Early games such as `Zork I' know about 600 words, but
- again this rises steeply to about 1000 even in v3. Later games know from
- 1569 words (`Beyond Zork') up to the record of 2120 (`Trinity'). (This is achieved
- by heroic inclusion of unlikely synonyms: e.g. the Japanese lady with the
- umbrella can be called WOMAN, LADY, CRONE, MADAM, MADAME, MATRON, DAME or
- FACE with any of the adjectives OLD, AGED, ANCIENT, JAP, JAPANESE,
- ORIENTAL or YELLOW.) V6 games have smaller dictionaries.
-
- \vfill\eject
- \end
-