-
- ═══ 1. Introduction ═══
-
- The WAIS OS/2 Client is a Public Domain software product developed at
- the Library of Congress. The Client allows OS/2 users to connect to WAIS
- Servers on the Internet and to search for and retrieve documents from those
- Servers. Documents returned can be text, pictures, or other types of data,
- depending on the type of server being accessed. The Client and WAIS Servers
- communicate using the WAIS Protocol. This allows a single user to query many
- different data servers without having to learn a new query language or
- interface.
- The Client can also be used to access local WAIS
- Servers across a local area network (LAN).
-
-
- ═══ 2. Quick Start ═══
-
- WAIS is simple to use. First, choose one or more sources using the 'Sources'
- pull-down menu. Next, enter your query in the 'Tell me about' window. Then,
- just click on the 'Search' pushbutton.
-
-
- ═══ 3. Network Requirements ═══
-
- The OS/2 Client runs on top of IBM's TCP/IP for OS/2 network software.
- The user must be able to open a socket connection to a remote WAIS Server
- machine on the network. The WAIS Client will work with either the 16-bit or
- 32-bit flavor of IBM's TCP/IP for OS/2 product. The Client will not work with
- non-IBM TCP/IP products, but conversion should not be difficult. Since the
- Client is Public Domain software, source code is
- available for porting, modification, or improvement.
-
-
- ═══ 4. Sources ═══
-
- The first step in beginning a search is to select a source to contact.
- The user lists the currently known sources by clicking on the "Sources" button.
- A window listing current known sources will appear. You can select one or more
- of these sources with the mouse and then hit the "Use Selected Sources" button
- or double click on a source. The selected sources will then appear in the "Look
- in these Sources:" window, ready to be searched. Most searches are
- single-source, but there are times when it is desirable to search multiple
- sources simultaneously.
-
- If you want to stop searching a source, select the source in the "Look
- in these Sources:" window and execute the "Stop Using Source" command in
- the "Sources" pull-down menu.
-
- Known sources are described in files with ".src" extensions. The first
- time the user lists sources, the Client loads in all the .src files in the
- local directory. To see what these files contain, select a source in the "Known
- Sources" window (just one) and then click on the "Edit Source" button.
- A window will appear, showing all the information associated with that source.
- Typically, the source description provides information on how to search that
- source, how to obtain more information on that source, whether or not the server
- service costs money, and the e-mail address of the source administrator. Be
- careful: you can click in any of these windows and edit the contents. If you
- change the network information, you may not be able to contact that source in
- the future.
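-
- The loading step described above can be pictured with a short Python sketch.
- It is an illustration only, not the Client's own code: the function name
- list_source_files is hypothetical, and the sketch merely lists the files
- without reading their contents.

```python
from pathlib import Path


def list_source_files(directory="."):
    """Return the names of the .src files in a directory, sorted by name."""
    folder = Path(directory)
    # Compare the extension case-insensitively, since OS/2 file names
    # are not case sensitive.
    return sorted(
        entry.name
        for entry in folder.iterdir()
        if entry.is_file() and entry.suffix.lower() == ".src"
    )
```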
-
- You can also select sources and hit the "Delete Selected Sources"
- button. This erases all the information related to that source and erases the
- *.src file in the local directory.
-
-
- ═══ 5. Queries ═══
-
- Once you have selected a source to use, the Client should put you back into
- the Query window, which is labeled "Tell me about:". You can now enter a
- natural language question in this window, or just type a set of words and
- phrases that are relevant to the type of information you are seeking from the
- selected source.
-
- The general algorithm for weighting words and phrases is as follows: if
- a word is rarely used in the database, it gets more weight; if a phrase matches
- exactly, it gets more weight; and if a word appears in the document title, it
- gets more weight.
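-
- As a rough illustration of these three rules, here is a toy scoring function
- in Python. It is a sketch under stated assumptions, not the formula a WAIS
- Server uses: the function name, the logarithmic rarity weight, and the 2.0 and
- 1.5 multipliers are all invented for the example.

```python
import math


def score(query, title, body, doc_freq, total_docs):
    """Toy relevance score following the three weighting rules.

    doc_freq maps a word to the number of documents that contain it.
    All numeric weights are illustrative assumptions, not WAIS values.
    """
    title_words = title.lower().split()
    body_words = body.lower().split()
    query_words = query.lower().split()
    total = 0.0
    for word in query_words:
        if word not in title_words and word not in body_words:
            continue
        # Rule 1: words that are rare in the database get more weight.
        weight = math.log(1.0 + total_docs / (1.0 + doc_freq.get(word, 0)))
        # Rule 3: words that appear in the document title get more weight.
        if word in title_words:
            weight *= 2.0
        total += weight
    # Rule 2: an exact match of the whole phrase gets more weight.
    phrase = " " + " ".join(query_words) + " "
    text = " " + " ".join(body_words) + " "
    if len(query_words) > 1 and phrase in text:
        total *= 1.5
    return total
```

- With this sketch, a query for a rare word such as "whale" scores higher than
- a query for a common word such as "the" against the same document.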
-
- Once you have entered your query, hit return or click on the
- "Search" button to begin a search.
-
- You can also enter more complex queries, depending on the type of server
- you are contacting. For example, WAIS Inc. commercial servers allow you to
- enter boolean queries by using logical words in capital letters, like AND, OR,
- and NOT. The source description should tell you what kind of server it is and
- what kinds of queries it supports. Also, the server description often contains
- a method for getting a help document about that server.
-
-
- ═══ 6. Results ═══
-
- Search results are displayed in the results window, the largest window
- in the display with the column headings "Score Size HEADLINES". The server
- should return a number of document titles or headlines, along with their score
- and size. The score runs from 0 to 1000. The highest scoring documents are
- listed first at the top of the display. By default, the size is the number
- of bytes (characters) in the document. Larger sizes are expressed in multiples
- of 1024: a size followed by "k" is in units of 1024 bytes (kilobytes), "M"
- stands for megabytes, or units of 1024 squared (slightly more than a million),
- and "G" stands for gigabytes, or units of 1024 cubed (slightly more than a
- billion).
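-
- The size units can be made concrete with a small Python sketch. The function
- name format_size is hypothetical, and the choice to show whole units only is
- an assumption; the Client's actual rounding rule is not documented here.

```python
def format_size(num_bytes):
    """Format a byte count with the k, M and G suffixes."""
    for suffix, unit in (("G", 1024 ** 3), ("M", 1024 ** 2), ("k", 1024)):
        if num_bytes >= unit:
            # Whole units only; the Client's real rounding rule is unknown.
            return f"{num_bytes // unit}{suffix}"
    return str(num_bytes)
```

- For example, format_size(153600) gives "150k", since 153600 = 150 x 1024.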
-
-
- ═══ 7. Retrieving Documents ═══
-
- You can double click on any displayed headline in the results window to
- retrieve and display the document. Before retrieving a document, it is wise to
- look at how large it is to get an idea of how long it will take to retrieve
- the document. A 150k file will take
- anywhere from 10 seconds to a minute to download, depending on network traffic,
- network bandwidth, and server workload.
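-
- The 150k figure implies a throughput somewhere between about 2.5k and 15k
- bytes per second. The arithmetic can be checked with a short Python sketch;
- the function name transfer_seconds is hypothetical, and the estimate ignores
- connection setup and server search time.

```python
def transfer_seconds(size_bytes, bytes_per_second):
    """Estimate the download time for a document of the given size."""
    if not bytes_per_second > 0:
        raise ValueError("throughput must be positive")
    return size_bytes / bytes_per_second


size = 150 * 1024   # a "150k" document, in bytes
fast = size / 10    # throughput implied by a 10-second download
slow = size / 60    # throughput implied by a one-minute download
```

- Here fast works out to 15360 bytes per second and slow to 2560 bytes per
- second, so a document ten times larger would take roughly ten times as long
- on the same link.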
-
- The Client retrieves the document and puts it into a file called
- "new_doc.tmp" and launches a viewer to display the document. The type of viewer
- depends on the type of document retrieved.
-
- The user can select which type of viewer to launch with each type of
- document by selecting the "Document Viewers" menu item from the "Options" menu
- list. Typically, editors are used for text documents, while an image viewer
- is used to display GIF, JPEG, or TIFF documents.
-
- The Client comes with default viewer settings. The OS/2 epm editor is
- called on text documents. Also included on the Client distribution disk is a
- Public Domain image viewer (pmviewjr.exe) which is called by the Client for GIF
- and JPEG images. The user can substitute his preferred editors and viewers for
- these default values.
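-
- One way to picture the viewer settings is as a table from document type to
- program. The Python sketch below is an assumption about how such a table
- could look, not the Client's actual configuration format; the type labels
- and the function name viewer_command are hypothetical. Only the epm and
- pmviewjr.exe defaults come from the text above.

```python
DEFAULT_VIEWERS = {
    "TEXT": "epm",           # OS/2 epm editor for text documents
    "GIF": "pmviewjr.exe",   # image viewer supplied on the Client disk
    "JPEG": "pmviewjr.exe",
}


def viewer_command(doc_type, viewers=DEFAULT_VIEWERS, filename="new_doc.tmp"):
    """Build the command line used to display a retrieved document."""
    try:
        program = viewers[doc_type.upper()]
    except KeyError:
        raise ValueError(f"no viewer configured for {doc_type!r}") from None
    return [program, filename]
```

- A user who prefers another editor could pass a modified table, for example
- dict(DEFAULT_VIEWERS, TEXT="myedit.exe"), where myedit.exe is a placeholder
- name.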
-
- The document viewer runs as a separate program. When you are done
- viewing a document, simply quit or close out the editor or viewer. The WAIS
- Client will still be running.
-
-
- ═══ 8. Saving Documents ═══
-
- Each document retrieval erases the previous contents of "new_doc.tmp".
- If the user wishes to permanently store a document, she should copy the file
- "new_doc.tmp" to another file before retrieving another document. In the case
- of text documents, simply use the "Save As" command in the editor to save
- the file under another name. With images, the user may have to go to another
- OS/2 command window to copy the file, unless the viewer has a "Save As" command.
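-
- The copy step can also be scripted. The following Python sketch shows the
- idea; the function name keep_document is hypothetical, and it assumes the
- script runs in the directory where the Client writes "new_doc.tmp".

```python
import shutil


def keep_document(destination, source="new_doc.tmp"):
    """Copy the most recently retrieved document to a permanent file."""
    # copyfile overwrites the destination if it already exists.
    shutil.copyfile(source, destination)
    return destination
```

- Run it before retrieving the next document, since the next retrieval
- replaces the contents of "new_doc.tmp".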
-
-
- ═══ 9. Finding New Sources ═══
-
- The Client disk comes with a few .src files, but these are only for
- demonstration purposes. The one source which is essential to have is the
- Directory of Servers. This is a WAIS Server which is a database of databases.
- Begin your search with this source in order to locate sources which are relevant
- to your query.
-
- The Directory of Servers functions like a normal WAIS Server, except
- that the documents it returns are source descriptions, not documents. To
- examine a source description, simply double click on the headline in the Results
- window. The "document" will be retrieved and displayed. At this point you have
- the option to discard the source description ("Cancel") or to save it for
- future use ("Save").
-
- If you wish to save the source, be sure to edit the "Filename" field to
- indicate the filename to use. The default name is "new-src", which will be
- overwritten the next time you save a source description
- without changing the file name. The Client will append a ".src" extension to
- the source filename. The new source should now appear in the known sources
- window, listed under the filename you chose, ready to be used.
-
- If you are running WAIS on a FAT formatted disk, you will get an error
- if you specify a filename longer than eight characters.
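-
- The naming rules in this section can be summarized in a few lines of Python.
- This is a sketch only: the function name source_filename and the fat_disk
- flag are hypothetical, and the Client's own error handling may differ.

```python
def source_filename(base="new-src", fat_disk=True):
    """Return the .src file name that results from a "Filename" entry."""
    # FAT allows at most eight characters before the extension.
    if fat_disk and len(base) > 8:
        raise ValueError(f"{base!r} is longer than eight characters")
    # The Client appends the ".src" extension itself.
    return base + ".src"
```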
-
-
- ═══ 10. Creating Source Pointers ═══
-
- You can also create source descriptions if you know the database name,
- the internet address, and the port number of the Server you are trying to
- contact. Call the "Create a New Source" command under the "Sources" pull-down
- menu. Then fill out the necessary information by clicking in each field. The
- IP Number is not required, but if you know it, put it in as it will save lookup
- time. The rest of the information is optional.
-
- You must enter the exact Database Name; the machine name and port
- number are not sufficient. Servers run under the UNIX Operating System. The
- Database Name is actually a UNIX path name which the Server uses to access the
- database. UNIX is case sensitive. This means that the database name must have
- the correct capitalization.
-
- CAUTION:
- Also note that UNIX path names use "/", not "\" as in DOS or OS/2.
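-
- The requirements above can be collected into a small checklist. The Python
- sketch below is hypothetical throughout: the SourcePointer class, its field
- names, and the problems method are invented for illustration and do not
- describe the Client's .src file format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourcePointer:
    """Hypothetical record of the fields needed to contact a Server."""

    database_name: str                # UNIX path name, case sensitive
    host: str                         # Internet address of the Server
    port: int
    ip_number: Optional[str] = None   # optional, saves lookup time

    def problems(self):
        """Return a list of human-readable problems with this pointer."""
        found = []
        if not self.database_name:
            found.append("the Database Name is required")
        if "\\" in self.database_name:
            found.append('use "/" rather than "\\" in the Database Name')
        if not self.host:
            found.append("the Internet address is required")
        if self.port not in range(1, 65536):
            found.append("the port number must be between 1 and 65535")
        return found
```

- Note that the sketch never changes the capitalization of the Database Name,
- because the Server treats "Weather" and "weather" as different paths.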
-
-
- ═══ 11. Relevance Feedback ═══
-
- One of the most powerful aspects of WAIS is the ability to say to a
- server, "find me more documents like this one." This is called relevance
- feedback. This is a quick, intuitive way of searching large databases to obtain
- the documents you are looking for. If you find a document that you want to use
- for relevance feedback, select the document headline and execute the "Use
- Document for Relevance Feedback" command under the "Documents" menu list. The
- document headline, along with the source it comes from, will appear in the
- relevance feedback window which is titled "Similar to:".
-
- You can now run the search again (by clicking on the "Search" button), but
- this time, in addition to your query, the document pointers in the relevance
- feedback window will be passed to the server to refine your search. Relevance
- feedback can be used iteratively, adding and deleting documents until you find
- what you are looking for.
-
-
- ═══ 12. Relevance Feedback and Multiple Source Searches ═══
-
- Relevance feedback works best with single-source searches with documents
- which come from that source. If you are doing a multiple-source query,
- relevance feedback becomes more complicated. For those of you who want to
- know how it really works, read on.
-
- Although all relevance feedback document IDs are sent to all the
- servers being searched, only those servers that can access the relevance
- feedback documents on their own file systems will use them; the other servers
- will ignore them. That is, relevance feedback documents from Server X cannot
- be used by Server Y, unless Servers X and Y are on the same file system.
-
- Thus, if you are simultaneously searching on two servers (X and Y) with
- relevance feedback documents from both servers, and if they are not on the
- same file system, then each server will perform its search using only the
- relevance feedback documents from its own database.
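-
- The rule can be modeled in a few lines of Python. This is a simplified
- sketch, not the Server's implementation: the function name feedback_for and
- the idea of labeling each server with a file system name are assumptions
- made for the example.

```python
def feedback_for(server, feedback_docs, filesystem_of):
    """Return the feedback documents that a given server will use.

    feedback_docs is a list of (source_server, headline) pairs, and
    filesystem_of maps each server name to a file system label.
    """
    return [
        (source, headline)
        for source, headline in feedback_docs
        # A server can only use documents it can reach on its own file system.
        if filesystem_of[source] == filesystem_of[server]
    ]
```

- With documents from X and Y on separate file systems, each server keeps only
- its own document; a third server sharing X's file system would use X's
- document as well.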
-
- Also, when a user removes a source from the "Look in these
- Sources:" window (via the "Stop Using Source" command), all the relevance
- feedback documents from that source are placed at the bottom of the list, with
- the label "These documents may be ignored:" to indicate that their source
- is no longer being used. If they exist on a file system that is still in use,
- they may still be used, but otherwise they will be ignored.
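-
- The reordering of the relevance feedback list can be sketched as a simple
- partition. The function name order_feedback is hypothetical, and the sketch
- models only the list order, not the on-screen label.

```python
def order_feedback(feedback_docs, active_sources):
    """Split feedback documents into active and possibly ignored ones.

    feedback_docs is a list of (source, headline) pairs; active_sources
    is the set of sources still shown in the "Look in these Sources:" window.
    """
    active = [doc for doc in feedback_docs if doc[0] in active_sources]
    # Documents from removed sources move to the bottom of the list.
    maybe_ignored = [doc for doc in feedback_docs if doc[0] not in active_sources]
    return active, maybe_ignored
```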
-