-
- F I L E T Y P E 1 . 1
- ======================
-
- Free Software by TapirSoft Gisbert W.Selke
- August 1991
-
- This is a utility similar to the Un*x programme named file. It takes
- the name of a file of unknown purpose and tries to guess what kind of a
- file it is -- a ZIP archive, an LZH archive, an executable, an MS Word
- document, a QuattroPro spreadsheet, a Bitstream font, or whatnot. Often,
- the purpose of a file can be gleaned from its extension; but sometimes
- (e.g., after transmission by E-mail), this extension is lost, and
- sometimes it just isn't meaningful. (After all, there are only finitely
- many permissible file extensions, but probably uncountably many purposes
- to use files for.)
-
- FileType does its work by looking at specified bytes in the given
- file; it tries to match these against a number of known file signatures.
- These signatures are stored in a plain ASCII text file; this signature
- file can be extended at will and as need dictates.
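-
- The lookup described above can be sketched roughly as follows. This is
- only an illustration in Python, not the Pascal source of FileType; the
- table SIGNATURES and the function guess_type are hypothetical names,
- and the three entries are commonly known magic numbers rather than the
- contents of MAGIC.FT.

```python
# Illustrative signature table: (offset, magic bytes, description).
# Assumption: these are widely known signatures, NOT copied from MAGIC.FT.
SIGNATURES = [
    (0, b"PK\x03\x04", "ZIP archive"),
    (2, b"-lh", "LZH archive"),
    (0, b"MZ", "executable (EXE)"),
]


def guess_type(data: bytes) -> str:
    """Return the description of the first signature that matches data."""
    for offset, magic, description in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return description
    return "unknown"


if __name__ == "__main__":
    print(guess_type(b"MZ\x90\x00"))  # executable (EXE)
```

- The first matching entry wins, so the order of the table matters.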
-
- Naturally, these guesses are not always correct.
-
- The simplest way to use FileType is
-
- filetype <filename>
-
- where <filename> is replaced by the name of the file to be examined.
- (No wildcards allowed.) Output consists of a header plus a single line.
-
- E.g., if you type
-
- filetype filetype.exe
-
- you'll get this answer:
-
- filetype.exe: executable (EXE)
-
- In order to be able to work, FileType must have access to a file
- containing the magic signatures; ordinarily, this file is called
- MAGIC.FT and is located in the current directory or in the directory
- where FILETYPE.EXE itself is stored. (The latter method works only
- under MS-DOS 3.x or later.) You can specify a different magic file, or
- an explicit path, with the /m switch:
-
- filetype /mc:\stuff\mymagic.typ foo.bar
-
- (Notice: no blanks between /m and the file name!)
-
- There's another command line switch, /q, which suppresses output of the
- header, by the way; and just typing 'filetype' without any arguments
- displays usage hints.
-
- If you want to check a whole bunch of files, use something like this:
-
- for %f in (*.*) do filetype %f
-
- (If you used this inside a batch file, you wouldn't forget to double the
- percent characters, would you?)
-
- One thing remains to be told: how to extend the magic file? Just take an
- editor that stores plain ASCII files (no extraneous word processor
- information!) and add, modify, or delete lines at your leisure.
- These are the rules: (Advanced topics are marked with an asterisk.)
-
- - Maximum line length is 255 characters. Lines are CRLF-delimited.
- File must be plain (8 bit) ASCII.
- - Each line consists of a file recognizer pattern, then at least one
- blank, then either a name for the file type thusly identified or
- a continuation marker.
- - The recognizer sequence consists of a file offset (optional), a bit
- mask (optional), and a matching sequence (required). These items, if
- present, must be separated by at least one blank each; the ordering
- of these items is required.
- * The file offset starts with @, then an optional -, followed by
- a sequence of hex (!) digits which represent an offset into the file
- at which the matching should occur. (No blanks within the file
- offset sequence!) Start of file is at 0(!). A negative offset
- matches from the end of the file, with the last byte in the file
- pointed to by -1. -- Default for offset is 0.
- * The bit mask starts with & and is followed by a sequence as
- specified for matching sequences (cf. below). This bit mask will be
- ANDed bytewise to the bytes found in the file before matching takes
- place. Thus, masking with DF would make matching of 7-bit ASCII
- characters case-independent. (However, note the use of double quotes
- below.) If the bit mask is shorter than the matching sequence, it is
- extended with FF (functionally equivalent to no masking at all). --
- Default for the bit mask is all FFs.
- - The matching sequence can be any mixture of pairs of hex digits and
- ASCII strings enclosed in single (') or double (") quotes.
- - Characters in single quotes require an exact match, characters in double
- quotes are matched case-independently. (Cf. note on case-conversion below.)
- - Both subtypes may contain a question mark to stand for any
- character. (And I mean 'character', *not* hex digit!)
- - ASCII strings may contain escaped sequences: \' and \" for
- embedded quotes, \b (backspace), \t (tab), \n (newline), \v
- (vertical tab), \f (form feed), \r (carriage return), \? (question
- mark), \\ (backslash).
- - If a starting sequence in this file is identical to the beginning of
- another one, the longer sequence should come first.
- - Comment lines may start with semicolon or hash mark.
- * For case-independent matchings, FileType knows about the upper-case
- equivalents of standard 7-bit ASCII characters; under DOS 3.30+, it
- can also handle (8-bit) national characters according to your
- country code and code page. You can override this knowledge by
- including a pair of lines starting with v and ^, respectively. These
- lines must not contain any blanks after the line marker and must
- match character by character. The 'v' line contains lower-case
- characters, the '^' line the corresponding upper-case characters.
- You need specify only as many characters as are necessary. These
- lines must occur in Magic.FT before the first recognizer line in
- which they are needed. -- Note that two or more pairs of translator
- lines may be specified, but only the last one used will be in
- effect.
- * If different places of the file need to be checked for pattern
- matching, there are two ways to do so:
- - If the places are close together, specify *one* sequence and a bit
- mask to ignore the irrelevant bytes by ANDing these with 0.
- - Otherwise, use a multi-line matching: specify one recognizer
- pattern, but instead of a file type name, include a slash (/);
- then, on a new line, specify the next recognizer pattern, this
- time using the file type name. (There may be more than one slash-
- delimited line.) This way, all the slash-delimited lines *and*
- the next one are required to match.
- - The first line of the magic file is taboo (leave it as it is).
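-
- To make the rules for matching sequences more concrete, here is a
- minimal sketch in Python of how such a sequence might be parsed and
- compared. It is not taken from FileType.Pas; parse_sequence and matches
- are hypothetical names. The sketch assumes the sequence has already
- been separated from the rest of the line, and it folds case for 7-bit
- ASCII only (no national characters, no 'v' and '^' lines).

```python
import string

# Escape letters listed in the rules; any other escaped character
# (quote, question mark, backslash) stands for itself.
ESCAPES = {"b": 8, "t": 9, "n": 10, "v": 11, "f": 12, "r": 13}


def parse_sequence(text):
    """Parse a matching sequence into (byte_or_None, fold_case) items.

    None marks an unescaped '?' inside quotes, which matches any byte.
    Assumption: quoted characters have codes below 256.
    """
    items = []
    i = 0
    while i < len(text):
        if text[i] in "'\"":
            quote = text[i]
            fold = quote == '"'  # double quotes: case-independent
            i += 1
            while i < len(text) and text[i] != quote:
                if text[i] == "\\":
                    i += 1
                    if i >= len(text):
                        break
                    items.append((ESCAPES.get(text[i], ord(text[i])), fold))
                elif text[i] == "?":
                    items.append((None, fold))
                else:
                    items.append((ord(text[i]), fold))
                i += 1
            if i >= len(text):
                raise ValueError("unterminated string")
            i += 1  # skip the closing quote
        else:
            pair = text[i:i + 2]
            if len(pair) != 2 or any(c not in string.hexdigits for c in pair):
                raise ValueError("expected a pair of hex digits")
            items.append((int(pair, 16), False))
            i += 2
    return items


def matches(items, data):
    """Compare parsed items against the start of data."""
    if len(data) < len(items):
        return False
    for (want, fold), got in zip(items, data):
        if want is None:
            continue  # '?' wildcard
        if fold:
            # bytes.upper() only changes ASCII letters a-z
            if bytes([want]).upper() != bytes([got]).upper():
                return False
        elif want != got:
            return False
    return True
```

- With these definitions, matches(parse_sequence('"gif8?a"'), b"GIF89a")
- is true, while the single-quoted 'gif' does not match b"GIF".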
-
- Note that a file offset will rarely have to be used, since most files
- can be told from their first few bytes (if at all). You may consider
- offsets and masking as advanced topics which are necessary only in very
- special circumstances. Multi-line matchings will have to be used even
- more rarely. -- In any case, remember that you are invited to extend or
- change the magic file to suit your own needs.
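-
- For those who do need offsets and masks, the two rules can be sketched
- in Python like this. Again, this is an illustration under assumptions,
- not the actual implementation; resolve_offset and apply_mask are
- hypothetical names.

```python
import string


def resolve_offset(spec: str, file_size: int) -> int:
    """Turn an offset item such as '@1A' or '@-4' into a file position.

    Digits are hexadecimal; position 0 is the first byte of the file and
    -1 points to the last byte.
    """
    if not spec.startswith("@"):
        raise ValueError("offset must start with @")
    body = spec[1:]
    negative = body.startswith("-")
    digits = body[1:] if negative else body
    if not digits or any(c not in string.hexdigits for c in digits):
        raise ValueError("offset must consist of hex digits")
    value = int(digits, 16)
    return file_size - value if negative else value


def apply_mask(data: bytes, mask: bytes) -> bytes:
    """AND each file byte with its mask byte before matching.

    A mask shorter than the data is padded with FF, which leaves the
    remaining bytes unchanged.
    """
    padded = mask.ljust(len(data), b"\xff")
    return bytes(b & m for b, m in zip(data, padded))


if __name__ == "__main__":
    print(resolve_offset("@-1", 100))            # 99
    print(apply_mask(b"zip", b"\xdf\xdf\xdf"))   # b'ZIP'
```

- Since the mask is applied to the bytes read from the file, a pattern
- used together with DF would presumably have to be written in upper
- case for the comparison to succeed.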
-
- That's it. Enjoy. And if you feel I have omitted a really important sort
- of files from MAGIC.FT (as distributed) and you know its magic
- signature, why not send it to me? I can be reached at
-
- TapirSoft
- Gisbert W.Selke
- Ermekeilstrasse 28
- D-5300 Bonn 1
- Germany
- E-Mail: <s00100@dbnrhrz1.bitnet>
-
-
-
- History:
- 1.0 01 Aug 1991 It hit the world.
- 1.1 19 Aug 1991 AARGH. Wildcard handling was broken, discovered
- by Richard J. Reiner. Fixed. Added multi-line
- matching. Added automatic national character
- uppercasing via country code and code page.
- Increased I/O buffer sizes. Corrected doc bug.
- Commented source some (gasp).
-
-
-
- Oh, the legal stuff:
-
- FileType.Pas contains no material copyrighted by anyone else. I retain
- the copyright on FileType; however, there are no restrictions on using
- and copying the package, as long as no money is asked for and all files
- are distributed unaltered and together. (That is: FILETYPE.PAS,
- FILETYPE.EXE, MAGIC.FT, FILETYPE.DOC, suitably archived by your
- favourite archiver.) The usual standard disclaimers apply: I cannot be
- held responsible for this programme's doing anything at all, or nothing
- at all, or not doing what you'd like it to do. (Which, on the other
- hand, isn't meant to say that I'd ignore any sufficiently detailed bug
- report.)
-
-
- Registered trademarks etc. used in this file:
-
- Bitstream : Bitstream Inc.
- MS-DOS, Microsoft Word: Microsoft Corporation
- QuattroPro : Borland International
- ZIP : PKWare, Inc.
-
-