Text File | 1987-05-14 | 8.7 KB | 159 lines | [TEXT/MACA] |
- Welcome to Browser v.244! This program enables you to make and browse
- indices to very large collections of free-text data. Read the articles
- accompanying the program for comments on design philosophy, algorithms,
- data structures, etc.
-
- -----------------------------------------------------------------------------
-
- This is the help file for Browser v.244. If you have used Browser v.223 or
- earlier versions, please read the warning at the end of this help file!
-
- -----------------------------------------------------------------------------
-
- In general:
- * if you have problems, try running without a RAM Cache (turn off from
- the control panel, and reboot) and make sure you have enough memory
- allocated for the program (probably at least 500 KB).
- * use DivJoin, Microsoft Word, or other such programs to join multiple small
- text files together to make suitably-massive files for indexing.
- * when all else fails, call me up and I'll try to help....
-
- -----------------------------------------------------------------------------
-
- Under the "Browse" menu:
- * select "Open..." to open a previously-created index file for browsing
- - click on a line in the Index window to call up all occurrences of
- that term in the Context window
- - click on a line in the Context window to call up the full text of
- the database in the vicinity of that line
- - select a longer text length from the menu if speed in opening the
- text window is less essential, or if more text around a target
- is desired
- - type into the Index window to jump to a given target location (takes
- a few seconds, longer from floppy)
- - use the Subindex commands to work within a subset of the entire file
- (very useful for large databases)
- - "Empty" empties out the working subindex, "Fill" fills it up again (the
- default startup condition), and "Invert" does a boolean NOT operation
- (so anything that was in the subindex leaves it, and everything that
- was not in the subindex is now included in it)
- - hold down the shift key (cursor turns into a "+") and click on items to
- add their neighborhoods to the working subindex (boolean OR operation);
- hold down the option key (cursor turns into a "-") and click on items
- to remove their neighborhoods from the working subindex (boolean AND NOT)
- - the Subindex Proximity choices allow you to control the neighborhood or
- region of influence around shift-click and option-click selections for
- the working subindex. "Words" (the default) selects neighborhoods of
- half a dozen words or so around each click (actually, it selects all
- terms within 32 bytes of the chosen item, and some terms out as far
- as 64 bytes ... read the tech notes or subindex source code for the
- gory details). "Sentences" selects neighborhoods within a few sentences
- of the selected items, and "Paragraphs" selects neighborhoods within
- a few paragraphs.
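
The subindex commands above can be pictured as set operations. The sketch below is only an illustration, not Browser's actual code: it assumes the working subindex is a set of byte offsets of indexed terms, and it simplifies the "Words" neighborhood to a fixed 32-byte radius (the real rule also picks up some terms out to 64 bytes). The class and method names are hypothetical.

```python
class Subindex:
    """Working subindex modeled as a set of valid byte offsets (an assumption)."""

    def __init__(self, all_offsets):
        self.universe = frozenset(all_offsets)  # every indexed occurrence
        self.valid = set(self.universe)         # startup condition: full

    def empty(self):
        """"Empty": nothing is valid."""
        self.valid.clear()

    def fill(self):
        """"Fill": everything is valid again."""
        self.valid = set(self.universe)

    def invert(self):
        """"Invert": boolean NOT relative to the whole index."""
        self.valid = set(self.universe - self.valid)

    def _neighborhood(self, clicked, radius):
        # Simplified proximity rule: offsets within `radius` bytes of a click.
        return {o for o in self.universe
                if any(abs(o - c) <= radius for c in clicked)}

    def shift_click(self, clicked, radius=32):
        """Add neighborhoods to the subindex (boolean OR)."""
        self.valid |= self._neighborhood(clicked, radius)

    def option_click(self, clicked, radius=32):
        """Remove neighborhoods from the subindex (boolean AND NOT)."""
        self.valid -= self._neighborhood(clicked, radius)


sub = Subindex([0, 10, 40, 100, 130, 160, 500])
sub.empty()
sub.shift_click([100])    # valid is now {100, 130}
sub.option_click([160])   # removes 130 (and 160); valid is now {100}
sub.invert()              # valid is now {0, 10, 40, 130, 160, 500}
```

A larger radius stands in for the "Sentences" and "Paragraphs" choices; the exact sizes Browser uses for those are not given here.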
-
- EXAMPLE:
- You have indexed up the past year of on-line sessions you've had, and want
- to recall items concerned with the format of MacWrite documents. Choose
- "Open..." from under the Browse menu and open the already-indexed file.
- Scroll to "MACWRITE" in the index ... it occurs 2,345 times, far too many to
- effectively browse through. So, choose "Empty" to clear out the working
- subindex and then shift-click on MACWRITE. Now all 2,345 neighborhoods of
- the occurrences of MACWRITE are marked as valid. Scroll the index window
- to FORMAT and see that only 2 out of the 987 occurrences of FORMAT occurred
- within a few words of MACWRITE. Click on FORMAT and see those two occurrences
- in context. If they don't answer your questions, go back to MACWRITE, change
- the Proximity neighborhood from "Words" to "Sentences" (or even "Paragraphs")
- and shift-click again, to broaden out the selected subindex. Now check back
- under FORMAT and see that 31 of the 987 occurrences are marked valid; browse
- through them, and find the desired items.
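
The same workflow can be tried in miniature. This is a sketch under stated assumptions, not the program's own algorithm: it builds a toy inverted index (term to byte offsets) from a three-sentence string, then counts which occurrences of FORMAT fall within a chosen radius of any MACWRITE occurrence. The function names, the sample text, and the radii 32 and 64 are all made up for illustration.

```python
import re
from collections import defaultdict


def build_index(text):
    """Map each upper-cased term to the list of offsets where it starts."""
    index = defaultdict(list)
    for match in re.finditer(r"[A-Za-z0-9]+", text):
        index[match.group().upper()].append(match.start())
    return index


def valid_occurrences(index, anchor, target, radius):
    """Offsets of `target` lying within `radius` of any `anchor` occurrence."""
    anchors = index.get(anchor, [])
    return [p for p in index.get(target, [])
            if any(abs(p - a) <= radius for a in anchors)]


text = ("The MacWrite format is documented. "
        "Disk format notes follow later in this long log. "
        "Ask about format of MacWrite files.")

index = build_index(text)
near = valid_occurrences(index, "MACWRITE", "FORMAT", radius=32)
wider = valid_occurrences(index, "MACWRITE", "FORMAT", radius=64)
print(len(near), "of", len(index["FORMAT"]))   # prints: 2 of 3
print(len(wider), "of", len(index["FORMAT"]))  # prints: 3 of 3
```

Widening the radius plays the role of switching Proximity from "Words" to "Sentences": more occurrences of the second term become valid.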
-
- -----------------------------------------------------------------------------
-
- Under the "Index" menu:
- * select "New..." to start creating an inverted index to a text file
- - index creation goes on in the background while you browse another file
- (unless you select the Fast Index option, in which case everything else
- mostly locks up but indexing goes about 3 times faster)
- - don't quit the main program while indexing is still occurring, or you'll
- have to throw away the partially-sorted "....Index" file and the
- "Temporary Radix Sort File" before running again
- - don't attempt to index a file that already has an index, and don't index
- a file which has a name longer than 25 letters
- - be sure that the free space on your disk is at least 6 times the
- length of the original text file to be indexed (the index file
- occupies up to 3 times the space of the original text, and a temporary
- sort file of that same size is also needed during indexing)
- - use Omit Words options to leave out undesired terms from your indices
- and thereby make them smaller (and get around the 6x space requirement
- above) ... index sorting also goes faster when words are omitted
- - the status window displays progress of an index operation: gray bar shows
- amount of index-building scan completed, and black bar then shows
- proportion of index-sorting that has been finished; the window updates
- about once/second (when you're not in Fast Index mode)
- - you can turn options like Fast Index on/off during indexing (though it
- doesn't make much sense to change the Omit Words choices during the
- course of index generation); hold down the mouse button and wait
- for the disk to spin in order to get a chance to do something while
- Fast Index is on
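
The 6x disk-space rule above is simple arithmetic, sketched here as a rough worst-case check. The function names are hypothetical, and the figure ignores any savings from the Omit Words options.

```python
def free_space_needed(text_bytes, index_ratio=3):
    """Worst-case free disk space needed to index a file of `text_bytes`.

    The index may take up to `index_ratio` times the text size, and the
    temporary sort file needs the same amount again.
    """
    index_bytes = index_ratio * text_bytes
    temp_sort_bytes = index_bytes
    return index_bytes + temp_sort_bytes


def can_index(text_bytes, free_bytes):
    """True if `free_bytes` covers the worst-case requirement."""
    return free_bytes >= free_space_needed(text_bytes)


KB = 1024
print(free_space_needed(100 * KB) // KB)   # prints: 600
print(can_index(100 * KB, 400 * KB))       # prints: False
```

So a 100 KB text file calls for about 600 KB free before indexing starts.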
-
- -----------------------------------------------------------------------------
-
- Under the "File" and "Edit" menus:
- * the editor is a modified version of the Sibley Editor supplied with MacForth,
- and copyright restrictions prevent me from sending out the source code
- unless you are a MacForth purchaser (and if you have MacForth, you can
- run the Browser from within MacForth with the full-up Forth interpreting
- version of the Sibley Editor -- a powerful combination! One caveat: the
- default Sibley Editor uses numerous OUTFILE commands and thus prevents
- Browser windows from always updating properly ... if a window isn't
- refreshed when you uncover it, click in it and scroll a bit and that
- should cure the problem).
- - editor commands are pretty standard; things get slow if the files being
- edited are very big (over 50 KB or so), so stick to shorter files
- - you have up to four windows available into four different text files
- - don't launch another application while building an index, for the same
- reason given earlier for not quitting during indexing
- - "Margin" under the Edit menu re-word-wraps the current selection to fit the
- current screen margins
- - The editor is memory-based, so don't try to edit too big a file
- - The editor is the newest thing in the package, so please watch out
- carefully for bugs in it (and in its interactions with the rest of the
- program ... for instance, you may have to click on another Browser
- window before clicking on an editor window in order to get the "Edit"
- menu activated, if a Desk Accessory was the front window ... I'll try
- to fix that someday, if I can make it happen consistently).
-
- -----------------------------------------------------------------------------
-
-
- Send suggestions for improvements, and details of bugs, to:
-
- Mark Zimmermann
- 9511 Gwyndale Drive
- Silver Spring, MD 20910
-
- phone (301)565-2166 (home) or (703)482-9572 (ofc, rarely in)
-
- arpanet: science@nems.arpa
-
- CompuServe: 75066,2044
-
- -----------------------------------------------------------------------------
-
- For users of earlier Browser releases:
-
- WARNING -- in versions of Browser before v.224 there is a problem that may
- cause some words to be omitted and others to be duplicated in big indices! It
- is associated with the behavior of MacForth's WRITE.VIRTUAL command for file
- I/O, and ONLY occurs with an index file when an alphanumeric character appears
- more than 255 times in column 12 of the indexed word list. (For normal English
- text this won't happen until the file being indexed is over 1.5 MB long, and
- only 256 index terms will be messed up out of 80,000+.)
-
- THUS -- if you are using a version prior to v.224, please send me a disk
- and a self-addressed stamped envelope for a more recent release, if you need
- to work with big files. If you have used an early version to make some big
- index files, please throw those indices away and reindex.
-
-
-