- The following text was captured by Jim Gilliland and posted in the Fidonet
- OS2 echo conference. His introduction follows. -Pete Norloff
-
-
- Here is a description of the extended attribute mechanism for FAT
- partitions. This document appears to have made the rounds of UseNet,
- Compuserve, BIX, etc. It appears to be reasonably well researched,
- but I cannot speak for its correctness. Hope it proves useful and/or
- educational.
-
- ==============================================================================
-
- The EA DATA. SF file, and what it does
- ======================================
-
- I originally wrote this because of all the queries about the file named
- EA DATA. SF, which is a frequent subject of discussion. I have
- tried to explain what it does, why it exists, and what you should and
- should not do with it. Various people on CompuServe have given me extra
- information; particular thanks to Dean Gibson (73427,2072) who figured
- out the format of the EA DATA. SF file and put me right on a few
- points. Some of the following information (and nearly all of Appendix
- A) is due to Dean.
-
- OS/2 1.2 and beyond support the concept of "extended attributes" (EAs)
- on files. These are used for all kinds of things, and can be very small
- or quite large (the limit is 64K per file at present). EAs might
- represent a file type, a file classification, an icon type, some free
- text...practically anything. Use the Properties entry in the File
- pulldown on the File Manager to see the EAs on a specified file
- (actually, I have found that Properties doesn't seem to tell you
- absolutely everything).
-
- EAs are supported directly by the High Performance File System (HPFS).
- They are stored in an efficient manner; most of the time a small EA
- (typically, one of less than several hundred bytes) takes effectively
- no additional space.
-
- However, for backwards compatibility the DOS (File Allocation Table, or
- FAT) file system needs to support EAs too. In order to do this, and
- keep the file system consistent for DOS if it is booted instead of OS/2
- on the same machine, some trickery is needed.
-
- FAT directory entries have ten spare bytes in them, starting at offset
- 0CH (immediately after the filename and the attribute byte); these are
- normally zero. They are there because originally the directory entry
- layout was modelled on the CP/M file system, and these bytes (among
- others) were used to describe the location of the disk extents making up
- the file; they aren't used for that purpose under DOS. Two of these
- spare bytes (at offsets 14H and 15H within the directory entry) are used
- to head a chain of disk allocation units (or clusters) which hold the
- EAs for that file. This causes interesting problems with, for example,
- early versions of the Norton Utilities, which flag the directory entry
- as one with an "illegal" format! So, effectively an OS/2 FAT directory
- entry can head two chains of clusters; one for the file itself (as
- usual) and one for the EAs attached to the file. The latter listhead is
- often null.
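
The directory entry layout described above can be sketched in a few lines of Python. This is only an illustration, not anything taken from OS/2 itself: the function name parse_dir_entry and the returned field names are made up here, and the offsets other than 14H/15H (first data cluster at 1AH, file size at 1CH) are the usual DOS FAT layout rather than something stated in this text.

```python
import struct

def parse_dir_entry(entry: bytes) -> dict:
    """Decode the parts of a 32-byte FAT directory entry discussed above.

    Hypothetical helper for illustration; all multi-byte fields are
    little-endian, as on any x86 FAT volume.
    """
    if len(entry) != 32:
        raise ValueError("a FAT directory entry is exactly 32 bytes")
    name = entry[0:8].decode("ascii", "replace").rstrip()
    ext = entry[8:11].decode("ascii", "replace").rstrip()
    attributes = entry[0x0B]
    # Offsets 14H-15H: the EA pointer; zero means the file has no EAs.
    (ea_pointer,) = struct.unpack_from("<H", entry, 0x14)
    # Offsets 1AH-1BH: first cluster of the file's own data, as under DOS.
    (first_cluster,) = struct.unpack_from("<H", entry, 0x1A)
    # Offsets 1CH-1FH: file size in bytes.
    (size,) = struct.unpack_from("<I", entry, 0x1C)
    return {
        "name": name,
        "ext": ext,
        "attributes": attributes,
        "ea_pointer": ea_pointer,
        "first_cluster": first_cluster,
        "size": size,
    }
```

A program that rewrites directory entries without preserving bytes 14H and 15H loses the EA pointer, which is the restore problem described further on.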
-
- All this would be fine until you ran CHKDSK under DOS. It would find
- all these clusters holding the EAs, and because they would appear not to
- belong to any file, they would be collected up and marked as "lost"
- clusters to be added to the free list. Disaster next time OS/2 looked
- at the file (well, eventually anyway) because the chances are that the
- clusters making up the EAs would have been allocated to another file by
- that time. To prevent this, the file named EA DATA. SF (the EA
- datafile) is used. This file is never meant to be read directly. Its
- directory entry heads a chain of clusters (as usual), but these clusters
- are the SAME ones that hold all the EAs on that disk. In other words,
- there are two references to every EA cluster; one via the file's
- directory entry and one via the EA datafile. This makes the disk appear
- consistent under DOS; all of the clusters used on the disk belong to a
- valid file. Microsoft say that the EA datafile is position dependent,
- and it shouldn't be manipulated or deleted; to make this hard, it has a
- strange name with spaces in it (which defeats a lot of software), and it
- is marked readonly, system and hidden. Observation has shown this not
- to be strictly true; it seems that you can back up and restore the file
- without any damage (of course, the EA datafile must correspond to the
- files on the disk; if you attempted to restore such a file on its own
- without also restoring the various files that reference it, you would
- have problems). The snag is that restored files won't generally have
- the entire directory entry restored, so the head of the EA cluster chain
- (in offsets 14H and 15H) will be lost (set to zero).
-
- Notice the implication for backup under OS/2. A proper, EA-aware backup
- program need not back up the EA datafile; it simply reads the EAs for
- each file as it is backed up, and of course it restores them the same
- way - with system calls. So, the fact that OS/2 locks the EA datafile
- open is actually a benefit of sorts - it saves the file being backed up
- when its contents will never be needed; in any case, a backed-up copy
- would be semi-useless unless the directory entries were also restored
- in their entirety.
-
- Why is this file so big? I can speak only for IBM OS/2 1.2 and 1.3,
- which are the ones I have run. When installing OS/2, the installation
- utility scans the OS/2 hard disk (if FAT) for any files it considers
- should have EAs on them. This means all .EXE files for a start. To
- each one it helpfully adds a short EA that marks the file as executable;
- this EA is 23 bytes long, but since each EA needs to be in a cluster
- unique to the file to which it is attached, it actually occupies a whole
- 2K cluster. Note that EAs are attached at this time even to DOS .EXE
- files found on the disk. In my case this used up 700K of disk space;
- your mileage may vary. Incidentally, the EA datafile is created when
- the first EA is attached to any file on the disk; try it out with a
- floppy; it also takes one cluster (the first one) for some kind of
- internal housekeeping information. I suspected that this cluster was
- some kind of map similar to the FAT, chaining together the clusters
- relating to one file within the EA datafile; if so, it would probably
- expand if you had a lot of EAs on your disk. Dean Gibson figured out a
- lot more about the format of the file; the details are given in Appendix
- A.
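
The space arithmetic above can be checked with a rough sketch. The figure of 350 files is not stated anywhere in this text; it is simply what 700K divided by a 2K cluster comes to, and the housekeeping cluster(s) at the start of the EA datafile are ignored. The function name ea_overhead is invented for the example.

```python
def ea_overhead(num_files: int, ea_bytes: int = 23, cluster_bytes: int = 2048):
    """Return (bytes allocated, bytes wasted) when every file carries one
    small EA and each EA must sit in a cluster of its own."""
    clusters_per_ea = -(-ea_bytes // cluster_bytes)  # ceiling division
    allocated = num_files * clusters_per_ea * cluster_bytes
    wasted = allocated - num_files * ea_bytes
    return allocated, wasted

# Assumed: 350 files, each with the 23-byte EA, on a disk with 2K clusters.
allocated, wasted = ea_overhead(350)
print(allocated // 1024)  # 700
```

Nearly all of that space is slack: the EAs themselves account for only about 8K of it.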
-
- You can safely delete the EAs from all your DOS files, and from many
- OS/2 ones. Beware, though! Some files have large EAs that are used for
- special purposes. Ones I know of include some printer drivers, and the
- VIEW utility used for the online command reference. DIR/N will show you
- the sizes of the EAs for each file. To delete the EAs from all of the
- files in my DOS directory, I used:
-
- FOR %X IN (*.EXE) DO EAUTIL /S %X
-
- This splits off the EA for each file into another file of the same name,
- in a subdirectory called EAS (which is created automatically). Delete
- this directory and its contents to free up the space. The clusters are
- automatically removed from the EA datafile at this time. I have found
- this the easiest way to remove EAs.
-
- EAs are also removed from the EA datafile if the file to which they are
- attached is deleted; this ONLY applies if deletion takes place under
- OS/2 (the DOS box will do). If deleted under vanilla DOS, the EA
- datafile retains the "lost" EA clusters; they can be reclaimed by
- running CHKDSK under OS/2 (using the installation disk if DLLs or a
- swapfile are open on the disk in question).
-
- All this of course plays havoc with defragmenters. They have to work
- round all of the scattered, immobile clusters making up the EA datafile.
- Yes, it's a kludge; but quite a good one, given the constraint that it
- has to look OK under normal DOS as well as provide the functionality
- under OS/2.
-
- Please let me know if you have any comments on the above; if I receive
- more information, I'll produce a further updated version.
-
-
- Appendix A - Notes on the format of the EA datafile
- ---------------------------------------------------
-
- Most of this information came from Dean Gibson - many thanks, Dean! I
- have made the occasional addition.
-
- The actual EA DATA. SF file format is as follows (this has been
- verified with both 128 & 512 byte sector disks):
-
- The first word is for identification and contains the ASCII characters
- 'ED'; the next 15 words seem unused. The next 240 words (call this
- "table A") contain offsets into "table B". Table B starts at file byte
- offset 512 and continues for as many contiguous 128 word segments as
- necessary.
-
- Given a non-zero 16 bit EA pointer "X" in a FAT system directory entry
- (in offsets 14H and 15H):
-
- 1. Shift X right 7 bits, and use the result as a WORD INDEX to obtain a
- word entry from table A. Note that a FAT system can have at most 64K
- clusters, and each file with EAs needs at least one cluster for its
- data and one for its EAs; that means a maximum of 32K files with EAs,
- so the EA pointer value is always <32K and the high-order bit of X is
- unused.
-
- 2. Use X as a relative WORD INDEX into table B, to obtain the word entry
- at that location. A value of FFFFH means that the entry is unused.
-
- 3. Add the values from steps 1 & 2 to obtain the relative CLUSTER of the EA
- for the target file within EA DATA. SF.
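
The three steps above can be sketched as follows, working on the raw contents of the EA datafile held in memory. This is a minimal illustration under some assumptions: the name ea_relative_cluster is made up, the identification word is taken to be the bytes 'E' then 'D' in file order, and all words are taken to be little-endian. None of this comes from Microsoft or IBM documentation.

```python
import struct

TABLE_A_OFFSET = 32    # 1 identification word + 15 unused words
TABLE_A_WORDS = 240
TABLE_B_OFFSET = 512   # table B starts at file byte offset 512
UNUSED = 0xFFFF

def ea_relative_cluster(ea_file: bytes, x: int):
    """Map a directory entry's EA pointer X to the relative cluster of
    that file's EAs within the EA datafile; None if there are no EAs."""
    if ea_file[0:2] != b"ED":
        raise ValueError("missing 'ED' identification word")
    if x == 0:
        return None                      # null pointer: no EAs
    index_a = x >> 7                     # step 1: word index into table A
    if index_a >= TABLE_A_WORDS:
        raise ValueError("EA pointer out of range for table A")
    (base,) = struct.unpack_from("<H", ea_file, TABLE_A_OFFSET + 2 * index_a)
    # Step 2: X itself is a word index relative to the start of table B.
    (entry,) = struct.unpack_from("<H", ea_file, TABLE_B_OFFSET + 2 * x)
    if entry == UNUSED:
        return None                      # entry marked unused
    return base + entry                  # step 3: add the two values
```

Multiplying the result by the cluster size of the disk should then give the byte offset of the EAs within the EA datafile, assuming "relative cluster" counts from the start of the file.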
-
- In order to keep the EA DATA. SF file logically contiguous when table B
- is expanded into a new cluster or when an EA is deleted, the FAT cluster
- chain for EA DATA. SF is altered, and values in table A and/or segments
- of table B are changed to reflect this.
-
- The first word of the EA sector is for identification and contains the
- ASCII characters 'EA'; the next word is the relative sector number of
- this sector (consistency check); then the next two words are zero; the
- next twelve bytes contain the target file name (no path); the next word
- has an as yet undeciphered meaning; then the next two words are zero;
- followed by the EA data for the target file. The first word of the EA
- data is the length of the EA data in bytes, including the count word.
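
The header layout just described works out to 26 bytes before the EA data (2 + 2 + 4 + 12 + 2 + 4). A minimal sketch of decoding it follows; the function name parse_ea_header and the field names are invented for illustration, the words are assumed to be little-endian, and the padding of the file name field (NULs or spaces) is a guess, so both are stripped.

```python
import struct

EA_DATA_OFFSET = 26   # 2 + 2 + 4 + 12 + 2 + 4 bytes of header

def parse_ea_header(block: bytes) -> dict:
    """Decode the header at the start of a file's EA area, following the
    layout described above. Hypothetical helper, not an OS/2 API."""
    if block[0:2] != b"EA":
        raise ValueError("missing 'EA' identification word")
    # Relative number of this block within the EA datafile (consistency check).
    (rel_number,) = struct.unpack_from("<H", block, 2)
    # Bytes 4-7 are zero; bytes 8-19 hold the target file name (no path).
    raw_name = block[8:20].split(b"\0", 1)[0]
    name = raw_name.decode("ascii", "replace").rstrip()
    # Byte 20: a word whose meaning is undeciphered; bytes 22-25 are zero.
    (unknown,) = struct.unpack_from("<H", block, 20)
    # First word of the EA data: its length in bytes, count word included.
    (ea_length,) = struct.unpack_from("<H", block, EA_DATA_OFFSET)
    ea_data = block[EA_DATA_OFFSET:EA_DATA_OFFSET + ea_length]
    return {
        "rel_number": rel_number,
        "name": name,
        "unknown": unknown,
        "ea_length": ea_length,
        "ea_data": ea_data,
    }
```

Comparing rel_number with the position computed from tables A and B gives the consistency check mentioned above.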
-
-
- Bob Eager
- Compuserve: 100016,2770
- USENET: rde@ukc.ac.uk
-
-