-
- Having become extremely frustrated by VIEW.EXE's penchant for windows
- that come and go, without even opening large enough to see everything
- in them, I thought I'd try to turn .INF files into something more
- conventional. While I don't have code to offer, I can tell you what I
- learned about .INF format--it was enough to produce more-or-less
- readable more-or-less plaintext from .INFs.
-
- I offer this in the hope that somebody will give the community a
- really nice, tasteful, convenient, doesn't-use-too-much-screen-real-estate
- .INF browser to replace VIEW.EXE.
-
- All of this was developed by looking at .INF files without any
- documentation of the format except what VIEW.EXE showed for a
- particular feature.
-
- I don't have a lot of personal interest in refining this document with
- additional escape sequences, etc., but I would be happy to correspond
- with someone who wanted to fill in the details, or to clarify anything
- that may be confusing. If someone could point us to an official document
- describing the format, that would be most helpful.
-
- -- Carl Hauser (chauser.parc@xerox.com)
-
- All numeric quantities are least-significant-byte first in the file
- (little-endian).
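
Since every multi-byte quantity is stored least-significant-byte first, a reader that works from a byte buffer can assemble values explicitly instead of relying on the host's byte order. The following is a minimal sketch; the helper names rd16 and rd32 are made up for illustration and do not come from any official source.

```c
#include <stdint.h>

/* Read a 16-bit unsigned little-endian value starting at p. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a 32-bit unsigned little-endian value starting at p. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

Assembling the bytes by hand keeps the reader independent of the machine it runs on and avoids unaligned accesses.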
-
- **** Types ****
-
- bit1 1 bit boolean
- int4 4 bit unsigned integer
- char8 8 bit character (ASCII more-or-less)
- int8 8 bit unsigned integer
- int16 16 bit unsigned integer
- int32 32 bit unsigned integer
-
- **** The File Header ****
-
- Starting at file offset 0 the following structure can overlay the file
- to provide some starting values:
- {
- char8 unknown[8]; // unknown purpose
- int16 ntoc; // 16 bit number of entries in the tocarray
- int32 tocstart; // 32 bit file offset of the start of the tocarray
- int32 tocstrlen; // number of bytes in file occupied by the
- // table-of-contents strings
- int32 tocstrtablestart; // 32 bit file offset of the start of the
- // strings for the table-of-contents
- int16 nslots; // number of "slots"
- int32 slotsstart; // file offset of the slots array
- int32 dictlen; // number of bytes occupied by the "dictionary"
- int16 ndict; // number of entries in the dictionary
- int32 dictstart; // file offset of the start of the dictionary
- }
- I think there's more to the header and that it describes the index,
- but I didn't decode that.
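
As a rough sketch, the header fields above could be pulled out of a byte buffer as shown here. This assumes the fields are packed back to back with no padding, which puts ntoc at offset 8 and dictstart at offset 34; that packing is an assumption, as are the names inf_header, inf_parse_header and INF_HEADER_SIZE.

```c
#include <stddef.h>
#include <stdint.h>

#define INF_HEADER_SIZE 38   /* bytes covered by the fields listed above */

struct inf_header {
    uint16_t ntoc;
    uint32_t tocstart, tocstrlen, tocstrtablestart;
    uint16_t nslots;
    uint32_t slotsstart, dictlen;
    uint16_t ndict;
    uint32_t dictstart;
};

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too short.
   Offsets ASSUME a packed layout; bytes 0..7 are the unknown field. */
static int inf_parse_header(const uint8_t *buf, size_t len,
                            struct inf_header *h)
{
    if (len < INF_HEADER_SIZE)
        return -1;
    h->ntoc             = rd16(buf + 8);
    h->tocstart         = rd32(buf + 10);
    h->tocstrlen        = rd32(buf + 14);
    h->tocstrtablestart = rd32(buf + 18);
    h->nslots           = rd16(buf + 22);
    h->slotsstart       = rd32(buf + 24);
    h->dictlen          = rd32(buf + 28);
    h->ndict            = rd16(buf + 32);
    h->dictstart        = rd32(buf + 34);
    return 0;
}
```

Reading field by field, rather than casting the buffer to a struct, sidesteps compiler padding and host byte order.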
-
- **** The table of contents array ****
-
- Beginning at file offset tocstart, this structure can overlay the
- file:
- {
- int32 tocentrystart[ntoc]; // array of file offsets of
- // tocentries
- }
-
- **** The table of contents entries ****
-
- Beginning at each file offset, tocentrystart[i]:
- {
- int8 len; // 8 bit length of the entry including this byte
- bit1 haschildren; // following nodes are a higher level
- bit1 hidden; // this entry doesn't appear in VIEW.EXE's
- // presentation of the toc
- bit1 extended; // extended entry format
- bit1 stuff; // ??
- int4 level; // nesting level
- int8 ntocslots; // number of "slots" occupied by the article for
- // this toc entry
- }
-
- If the "extended" bit is not 1, this is immediately followed by:
-
- {
- int16 tocslots[ntocslots]; // indices of the slots that make up
- // the article for this entry
- char8 title[]; // the remainder of the tocentry
- // until len bytes have been used
- }
-
- If extended is 1, there are intervening bytes that (I think) describe
- the kind, size and position of the window in which to display the
- article. I haven't decoded these bytes, though in most cases the
- following tells how many there are. Overlay the following on the next
- two bytes:
- {
- int8 w1;
- int8 w2;
- }
-
- Here's a C code fragment for computing the number of bytes to skip:
- int bytestoskip = 0;
- if (w1 & 0x8) { bytestoskip += 2; }
- if (w1 & 0x1) { bytestoskip += 5; }
- if (w1 & 0x2) { bytestoskip += 5; }
- if (w2 & 0x4) { bytestoskip += 2; }
-
- Skip over bytestoskip bytes (after w2) and find the tocslots and title
- as in the non-extended case.
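
Putting the two cases together, a table-of-contents entry might be walked as in the sketch below. The text does not say how the bit fields are ordered, so this ASSUMES they are listed from the most significant bit down (haschildren = 0x80, hidden = 0x40, extended = 0x20, level = low four bits). The names toc_entry and parse_toc_entry are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

struct toc_entry {
    int haschildren, hidden, extended;
    unsigned level, ntocslots;
    const uint8_t *tocslots;  /* ntocslots little-endian int16 values */
    const uint8_t *title;     /* titlelen bytes, not NUL-terminated */
    size_t titlelen;
};

/* e points at the entry's len byte; avail is the number of bytes
   readable from e. Returns 0 on success, -1 if the entry is malformed. */
static int parse_toc_entry(const uint8_t *e, size_t avail,
                           struct toc_entry *t)
{
    if (avail < 3 || e[0] < 3 || e[0] > avail)
        return -1;
    size_t len = e[0];
    uint8_t flags = e[1];
    t->haschildren = (flags & 0x80) != 0;   /* ASSUMED bit order */
    t->hidden      = (flags & 0x40) != 0;
    t->extended    = (flags & 0x20) != 0;
    t->level       = flags & 0x0f;
    t->ntocslots   = e[2];

    size_t pos = 3;
    if (t->extended) {
        if (len < pos + 2)
            return -1;
        uint8_t w1 = e[pos], w2 = e[pos + 1];
        pos += 2;                     /* w1 and w2 themselves */
        if (w1 & 0x8) pos += 2;       /* undecoded window bytes */
        if (w1 & 0x1) pos += 5;
        if (w1 & 0x2) pos += 5;
        if (w2 & 0x4) pos += 2;
    }
    size_t slotbytes = 2 * (size_t)t->ntocslots;
    if (pos + slotbytes > len)
        return -1;
    t->tocslots = e + pos;
    pos += slotbytes;
    t->title = e + pos;
    t->titlelen = len - pos;          /* title runs to the end of the entry */
    return 0;
}
```

The title length is never stored; it is whatever remains of len once the fixed bytes, any skipped window bytes, and the slot indices have been accounted for.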
-
- **** The Slots array ****
-
- Beginning at file offset slotsstart (provided by the file header) find
- {
- int32 slots[nslots]; // file offset of the article
- // corresponding to this slot
- }
-
- **** The Dictionary ****
-
- Beginning at file offset dictstart (provided by the file header) and
- continuing until ndict entries have been read (and dictlen bytes have
- been consumed from the file) find a sequence of null-terminated
- strings. Build a table mapping i to the ith string.
- {
- char8* strings[ndict];
- }
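
Following the description above literally (null-terminated strings laid end to end), the string table could be built along these lines. This is only a sketch under that reading; build_dictionary is a hypothetical name, and the pointers it stores refer into the caller's buffer rather than to copies.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill strings[0..ndict) with pointers to successive NUL-terminated
   strings inside dict[0..dictlen). Returns the number of entries found,
   which equals ndict when the dictionary is well formed. */
static size_t build_dictionary(const uint8_t *dict, size_t dictlen,
                               const char **strings, size_t ndict)
{
    size_t pos = 0, n = 0;
    while (n < ndict && pos < dictlen) {
        const uint8_t *nul =
            (const uint8_t *)memchr(dict + pos, 0, dictlen - pos);
        if (nul == NULL)
            break;                      /* final string is unterminated */
        strings[n++] = (const char *)(dict + pos);
        pos = (size_t)(nul - dict) + 1; /* step past the terminator */
    }
    return n;
}
```

A return value smaller than ndict signals that the dictionary ran out before ndict strings were found.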
-
- **** The Article entries ****
-
- Beginning at file offset slots[i] the following structure can overlay
- the file:
- {
- int8 stuff; // ??
- int32 localdictpos; // file offset of the local dictionary
- int8 nlocaldict; // number of entries in the local dictionary
- int16 ntext; // number of bytes in the text
- int8 text[ntext]; // encoded text of the article
- }
-
- **** The Local dictionary ****
-
- Beginning at file position localdictpos (for each article) there is an
- array:
- {
- int16 localwords[nlocaldict];
- }
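
The article header and its local dictionary might be read as sketched below. The offsets ASSUME the article fields are packed with no padding (stuff at 0, localdictpos at 1, nlocaldict at 5, ntext at 6, text at 8). The names article, parse_article and local_word are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>

struct article {
    uint32_t localdictpos;
    uint8_t  nlocaldict;
    uint16_t ntext;
    const uint8_t *text;      /* ntext encoded bytes */
};

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* slot is a file offset taken from the slots array.
   Returns 0 on success, -1 if anything would fall outside the file. */
static int parse_article(const uint8_t *file, size_t filelen,
                         uint32_t slot, struct article *a)
{
    if (slot > filelen || filelen - slot < 8)
        return -1;
    const uint8_t *p = file + slot;   /* p[0] is the unknown "stuff" byte */
    a->localdictpos = rd32(p + 1);
    a->nlocaldict   = p[5];
    a->ntext        = rd16(p + 6);
    a->text         = p + 8;
    if (filelen - slot - 8 < a->ntext)
        return -1;
    if (a->localdictpos > filelen ||
        filelen - a->localdictpos < 2u * a->nlocaldict)
        return -1;
    return 0;
}

/* Global dictionary index stored in local dictionary slot i
   (the caller guarantees i < a->nlocaldict). */
static uint16_t local_word(const uint8_t *file, const struct article *a,
                           unsigned i)
{
    return rd16(file + a->localdictpos + 2u * i);
}
```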
-
- **** The Text ****
-
- The text for an article then consists of words obtained by referencing
- strings[localwords[text[i]]] for i in [0..ntext), with the following
- exceptions. If text[i] is not a valid index into localwords (that is,
- text[i] >= nlocaldict), it means:
- 0xfa => end-of-paragraph
- 0xfc => if in-an-example then end-of-line
- else spacing = !spacing // see below
- 0xfd => if in-an-example then end-of-line else spacing = TRUE
- 0xfe => space
- 0xff => escape sequence // see below
-
- When spacing is true, each word needs a space put after it. When
- false, the words are abutted and spaces are supplied using 0xfe or the
- dictionary. Examples are entered and left with 0xff escape sequences.
- The variable "spacing" is initially TRUE.
-
- **** 0xff escape sequences ****
-
- These are used to change fonts, make cross references, enter and leave
- examples, etc. The general format is
- {
- int8 FF; // always equals 0xff
- int8 esclen; // length of the sequence including esclen (but
- // excluding FF)
- int8 escCode; // which escape function
- }
-
- escCodes I have partially deciphered are
-
- 0x02 or 0x11 => (esclen==3) goto horizontal position.
- The remaining byte is an int8
- describing the position. 0x00 is the
- left margin and starts a new line; I
- don't know the units for other values.
-
- 0x04 => (esclen==3) change font. The
- remaining byte is an int8 denoting the
- font: I've determined 0x00 is normal,
- 0x01 is italic and 0x02 is bold in
- VIEW's presentation.
-
- 0x05 or 0x07 => (esclen varies) beginning of cross
- reference. The next two bytes of the
- escape sequence are an int16 index of
- the tocentrystart array. The
- remaining bytes describe the size,
- position and characteristics of the
- window created when the
- cross-reference is followed by VIEW.
- I have not decoded this.
-
- 0x08 => (esclen==2) end of cross reference
- introduced by escape code 0x05 or 0x07
-
- 0x0B => (esclen==2) begin example. Set
- spacing to FALSE
-
- 0x0C => (esclen==2) end example. Set spacing
- to TRUE
-
- 0x0F => if esclen==5 an inlined cross
- reference: the title of the referenced
- article becomes part of the text.
- This is probably the case even if
- esclen is not 5, but I don't know the
- decoding. In the case that esclen is
- 5, I don't know the purpose of the
- byte following the escCode, but the
- two bytes after that are an int16
- index of the tocentrystart array.
-
- 0x19 => (esclen==3) change font? I haven't
- checked VIEW's decoding of the next
- byte. I used the same decoding as for
- 0x04
-
- 0x1C => (esclen==2) I don't know its function. I just ignored it.
-
- I doubt that this is an exhaustive list of the possible escape codes,
- but it covers most of what I found in the Control Program, REXX, and
- CSet/2 references. With a little more work and some playing with the
- info compiler to produce (chosen-plaintext, ciphertext) pairs it
- shouldn't be hard to pick out the whole decoding including the window
- positions.
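
To tie the text rules and the escape format together, here is a sketch of a decoder that produces rough plaintext. It treats bytes below nlocaldict as local dictionary indices, handles 0xfa/0xfc/0xfd/0xfe as described, acts only on escape codes 0x0B and 0x0C, and skips every other escape sequence by its esclen. It assumes the local dictionary has already been converted to host integers; outbuf, put and decode_text are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Caller sets p to a buffer of cap >= 1 bytes, p[0] = '\0', len = 0. */
struct outbuf { char *p; size_t cap, len; };

/* Append s if it fits; output that does not fit is silently dropped. */
static void put(struct outbuf *o, const char *s)
{
    size_t n = strlen(s);
    if (o->len + n < o->cap) {          /* leave room for the NUL */
        memcpy(o->p + o->len, s, n);
        o->len += n;
        o->p[o->len] = '\0';
    }
}

static void decode_text(const uint8_t *text, size_t ntext,
                        const uint16_t *localwords, unsigned nlocaldict,
                        const char *const *strings, size_t ndict,
                        struct outbuf *out)
{
    int spacing = 1, in_example = 0;
    size_t i = 0;
    while (i < ntext) {
        uint8_t b = text[i];
        if (b < nlocaldict) {                      /* ordinary word */
            uint16_t w = localwords[b];
            if (w < ndict) put(out, strings[w]);
            if (spacing) put(out, " ");
            i++;
        } else if (b == 0xff) {                    /* escape sequence */
            if (ntext - i < 3 || text[i + 1] < 2)
                break;                             /* truncated or malformed */
            uint8_t esclen = text[i + 1], code = text[i + 2];
            if (code == 0x0b) { in_example = 1; spacing = 0; }
            else if (code == 0x0c) { in_example = 0; spacing = 1; }
            i += 1 + (size_t)esclen;               /* FF byte plus esclen bytes */
        } else {
            switch (b) {
            case 0xfa: put(out, "\n"); break;      /* end of paragraph */
            case 0xfc:
                if (in_example) put(out, "\n"); else spacing = !spacing;
                break;
            case 0xfd:
                if (in_example) put(out, "\n"); else spacing = 1;
                break;
            case 0xfe: put(out, " "); break;
            default: break;                        /* undescribed codes ignored */
            }
            i++;
        }
    }
}
```

Because esclen counts itself but not the 0xff byte, an escape sequence occupies 1 + esclen bytes in total, which is why the index advances by that amount.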
-
- One other transformation I had to make was of the character box
- characters. Maybe these are standard but they weren't in the font I
- was using. These characters appear in strings in the dictionary.
- They are given here in octal together with their translation:
-
- 020, 021 => blank seems satisfactory
- 037 => solid down arrow: used to give direction to
- a line in the syntax diagrams
- 0263 => vertical bar
- 0264 => left connector: vertical bar with short
- horizontal bar extending left from the
- center
- 0277, 0300 => top right or bottom left corner; one is
- one, the other is the other and I
- can't tell which from my translation
- 0301 => up connector: horizontal line with vertical
- line extending up from the center
- 0302 => down connector: horizontal line with
- vertical line extending down from the
- center
- 0303 => right connector: vertical bar with short
- horizontal bar extending right from
- the center
- 0304 => horizontal bar
- 0305 => cross connector, i.e. looks like + only
- slightly larger to connect with
- adjacent chars
- 0331, 0332 => top left or bottom right corner; one is
- one, the other is the other and I
- can't tell which from my translation
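
For plain ASCII output, the table above could be applied with a small lookup such as the following. The replacement characters chosen here ('v', '|', '-', '+') are just one possible convention and not part of the format; translate_box_char is a made-up name.

```c
/* Map the box-drawing bytes listed above to ASCII stand-ins.
   Any other byte is returned unchanged. */
static char translate_box_char(unsigned char c)
{
    switch (c) {
    case 0020: case 0021:             /* blank is satisfactory */
        return ' ';
    case 0037:                        /* solid down arrow */
        return 'v';
    case 0263:                        /* vertical bar */
    case 0264: case 0303:             /* left and right connectors */
        return '|';
    case 0304:                        /* horizontal bar */
        return '-';
    case 0277: case 0300:             /* corners */
    case 0331: case 0332:
    case 0301: case 0302:             /* up and down connectors */
    case 0305:                        /* cross connector */
        return '+';
    default:
        return (char)c;
    }
}
```

Running each dictionary string through this function byte by byte keeps the syntax diagrams roughly legible in a font without line-drawing characters.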
-
-
- History:
- October 22, 1992: version for initial posting
-
-