home *** CD-ROM | disk | FTP | other *** search
Text File | 1996-07-05 | 24.4 KB | 540 lines | [TEXT/R*ch] |
-
-
- How to build a GopherPup
-
- Don Gilbert
- January 1995
- email: gopherpup@bio.indiana.edu
-
- GopherPup is an Internet Gopher+ client with extensions to
- (a) display rich text documents, including pictures, and
- (b) allow network hypertext links inside documents it displays
-
- This document includes discussion of
- i the Gopher+ protocol extentions used between GopherPup and
- gopher+ servers
- ii how to create your own gopherpup hyper-rich-text documents
- iii Gopher+ server modifications to support GopherPup
- iv suggestions on how to add hypertext to other gopher+ clients
- v GopherPup source code and how to build it
- vi Current bugs and problems with GopherPup
- vii why not use Mosaic/WWW instead of extending Gopher+ protocol?
-
- I. Gopher+ Protocol Extensions for Hypertext
- The basic protocol extensions used in GopherPup were outlined in a
- Usenet article in November 1993 <usenet ref>, available as
- <gopher+link>. The idea is to include network hypertext capabilities
- in the gopher+ protocol as simply as possible. For that, I chose to
- use the currently defined gopher-menu and gopher+-menu data types.
- These data types, when included as a block extension +MENU, can be
- associated with any document served by a gopher+ server. In order to
- place specific link items within a document, another set of block
- extensions is needed, which are associated with each item in a +MENU
- block. These placing extensions Ive called +MENUSTRING, +MENURECT,
- +MENUBYTES, and others may be included as seems important.
-
- Another advantage of this method is that these gopher+menu data can be
- save to local disk files associated with a document, where users can
- access them in the same way as from a remote server.
-
- Here is the general scheme of a client-server transaction that
- includes +MENU data.
-
- Client connects to server and queries for a gopher-menu or gopher+-
- menu. Server responds with requested data. If gopher+-menu data is
- returned, the server will send any +MENU block extensions associated
- with any of its documents. If gopher-menu is returned, no +MENU data
- is returned, and it is the clients responsibility to ask for more
- information (!) on an item to see if it has any +MENU extensions.
-
- The format that the server sends gopher+-menu data including +MENU
- blocks is as follows (the bars indicate tab characters):
-
- +-1
- +INFO: 0MainDoc1 0/Some/Path/0MainDoc1 my.host.edu 70 +
- +ADMIN:
- Admin: blah blah
- +VIEWS:
- application/rtf En_US: <12k>
- text/plain En_US <8k>
- +MENU:
- +INFO: 0LinkDoc1 0/Some/Path/LinkDoc1
- my.host.edu 70 +
- +MENUSTRING: locate at this text in MainDoc1
- +INFO: 1LinkMenu1 1/Some/Path/LinkMenu1
- my.host.edu 70 +
- +MENUSTRING: locate at this other text in MainDoc1
- +INFO: 7LinkQuery 7Some/Path/LinkQuery
- my.host.edu 70 +
- +MENUSTRING: locate at this even different text in MainDoc1
- +QUERYSTRING: search gopher index with this string
- +INFO: 1LinkMenu2 1/Some/Path/LinkMenu2
- my.host.edu 70 +
- +MENUSTRING: locate somewhere else in MainDoc1
-
- +INFO: 0Doc2 0/Some/Path/Doc2 my.host.edu 70 +
- +ADMIN:
- Admin: blah blah
- +VIEWS:
- application/rtf En_US: <12k>
- text/plain En_US <8k>
- +MENU:
- +INFO: 0LinkDoc1 0/Some/Path/LinkDoc1
- my.host.edu 70 +
- +MENUSTRING: locate at etc in MainDoc1
- +INFO: 1LinkMenu1 1/Some/Path/LinkMenu1
- my.host.edu 70 +
- +MENUSTRING: locate at etc in MainDoc1
- +INFO: 0LinkDoc2 0/Some/Path/LinkDoc2
- my.host.edu 70 +
- +INFO: 1LinkMenu2 1/Some/Path/LinkMenu2
- my.host.edu 70 +
-
- +INFO: 0Doc3 0/Some/Path/Doc3 my.host.edu 70 +
- [ et cetera ... ]
-
-
- The basic idea here is that any document can have an associated
- gopher+-menu structure. The current U.Minnesota gopherd server can
- easily include this data by means of its block extensions method.
- Data in the format of a .cache or .cache+ file can be copied to a
- a file to be associated with a document (e.g., MainDoc1.gmenu,
- associated with MainDoc1), or one can create such a .gmenu document by
- hand.
-
- The document maintainer can then edit that file and put in +MENUSTRING
- and related locating extensions. This will produce a MainDoc1.gmenu
- file looking like this, for instance:
-
- +INFO: 0LinkDoc1 0/Some/Path/LinkDoc1 my.host.edu 70 +
- +MENUSTRING: locate at this text in MainDoc1
- +INFO: 1LinkMenu1 1/Some/Path/LinkMenu1 my.host.edu 70 +
- +MENUSTRING: locate at this other text in MainDoc1
- +INFO: 7LinkQuery 7Some/Path/LinkQuery my.host.edu 70 +
- +MENUSTRING: locate at this even different text in MainDoc1
- +QUERYSTRING: search gopher index with this string
- +INFO: 1LinkMenu2 1/Some/Path/LinkMenu2 my.host.edu 70 +
- +MENUSTRING: locate at etc in MainDoc1
-
- The gopherd server should have its configuration file modified to
- include the block extension type MENU, with some suitable filename
- suffix (e.g., .gmenu). On my server, I have a
- /usr/local/etc/gopherd.conf that has this entry:
- blockext: .gmenu MENU
-
- Then when gopherd serves out menu information, it will provide it in
- the above format with any MENU blocks right-shifted by one space.
-
- The +QUERYSTRING is another extension which can be used in conjunction
- with the +MENU blocks. Its function is to provide a given string that
- will be used to search a gopher Index database (type 7). That way, a
- document can easily include hypertext references to data in searchable
- databases. More about this later.
-
-
- When sent to a client, the client can use the same general method of
- parsing the +MENU block as it uses to parse standard gopher-menu and
- gopher+-menu data. Then it parses the +MENUSTRING and related
- placement extenstions, using that information to place the +INFO link
- at a particular word, phrase or position in the document. The client
- software can highlight these hotspots in various ways (underline,
- color, or in a character-mode client, perhaps as extra symbols in the
- text like [1]), so the person can see which areas have hypertext
- links.
-
-
- II. Making Hypertext documents for serving to GopherPup
-
- GopherPup now includes the ability to display documents that are in
- Microsoft standard Rich-Text-Format (RTF). This format was chosen
- because it is an existing, widely available standard for storing fully
- formatted documents (including font, paragraph and document styling
- and pictures). It is suitable for exchange of scientific
- documents, as it handles symbol fonts, super and subscripting, and the
- other formatting characteristics needed in most scientific and
- scholarly documents. Also importantly, the most commonly used
- wordprocessors on the common computing platforms (Macintosh and MS-
- DOS/MS-Windows) will read and write RTF format.
-
- The primary step in creating network hypertext documents is then as
- simple as writing the document in your wordprocessor, and saving it in
- RTF format (see the Save As or similar option in your wordprocessor
- for saving in this format).
-
- The next step is to associate network links with words or images in
- your document. This current release of GopherPup is sparse in its
- support for associating link data with document parts. Currently this
- involves (a) creating a separate text file that has gopher+-menu
- data, (b) inserting +MENUSTRING statements with each +INFO statement.
- Alternately, if you save a hypertext gopher document retrieved thru
- gopherpup, it will create a .go4 gopher+-menu document including the
- +MENUSTRING statements.
-
- II.a) The GopherPup .go4 data file.
- GopherPup will now look for link data associated with any document it
- opens from your local disk drives. The standard way to open a local
- disk file in GopherPup is to choose the File/Open command, and use the
- system-specific file chooser to select a document to view. When you
- make such a selection, e.g., MyThesis.rtf or MyThesis.text, the
- program now will look for an associated file with the suffix .go4,
- e.g., MyThesis.go4. The program looks in such a file for gopher+-
- menu data and if found, will use it to mark hypertext links in the
- selected document.
-
- You can create an initial .go4 data file by saving gopher menu windows
- from GopherPup (use the File/Save menu when a gopher menu window is
- the active window). Such a file can be edited with a standard text
- editor application. It consists of lines such as described above that
- are the gopher+-menu data:
-
- +INFO: 0LinkDoc1 0/Some/Path/LinkDoc1 my.host.edu 70 +
- +INFO: 1LinkMenu1 1/Some/Path/LinkMenu1 my.host.edu 70 +
- +INFO: 7LinkQuery 7Some/Path/LinkQuery my.host.edu 70 +
- +INFO: 1LinkMenu2 1/Some/Path/LinkMenu2 my.host.edu 70 +
-
- Protocols other than Gopher are supported in this format with the
- addition of +URL: lines. Currently GopherPup requires a +INFO: line
- for each item, then additional Gopher+ statements will add to and
- modify that. This is an example:
- +INFO: 0ItemTitle any/path any.host 0
- +URL: http://real.host/real/path/here
-
- II.b) Linking with +MENUSTRING statements
-
- Currently only the +MENUSTRING: some word or phrase method of
- linking a gopher item to a document location is supported in
- GopherPup. You need to edit the gopher+-menu .go4 file you created
- above with a text editor and insert these statements, one following
- each +INFO statement. The syntax of this statement is
- +MENUSTRING: word
- +MENUSTRING: a phrase in double-quotes
- +MENUSTRING: a phrase in single quotes
-
- GopherPup will search a document and mark all matches to a given
- +MENUSTRING as links to the associated +INFO gopher item.
-
- II.c) Planned MENU locating statements
- Future versions of gopherpup should expand on these capabilities. It
- is also hoped that a later version of gopherpup will included means to
- directly link gopher items to a displayed document, bypassing need to
- use a text editor. Suggested locating statements are
-
- +MENUSTRING: "some string" repeat-value
- used to locate a network link at a string in a text document. A
- phrase with embedded spaces should be enclosed in quote () or
- double quote () characters. An optional repeat-value specifies
- which occurence of the string to match, from the top of the text.
- The repeat value is not currently implemented in gopherpup -- all
- instances of the menustring are matched.
- +MENUSTRING: "bob was here" 3
-
- +MENULINE: start-paragraph start-char stop-paragraph stop-char
- specifies the item's location as a line number and character range
- in a text document. Start-paragraph indexes a paragraph, from 0
- at top, and 'start-char' indexes a character within that line.
- Likewise Stop-paragraph and stop-char indicate the end of the
- index.
- +MENULINE: 100 500 100 515
-
- +MENURECT: left top right bottom
- specifies the item's pixel location rectangle on a graphics
- document.
- +MENURECT: 300 0 400 100
-
- +MENULINERECT: start-paragraph start-char left top right bottom
- This is a combinary of +menuline and +menurect, to index a
- rectangle within a paragraph.
- +MENULINERECT: 100 500 300 0 400 100
-
- +MENUBYTES: startbyte stopbyte
- specifies the item's location as a byte range in a text document.
- If 'stopbyte' is missing, only the 'startbyte' location is indexed
- for the item location.
- +MENUBYTES: 100 500
-
- +ISMAP:
- Indicates the item is a queryable graphic map, and x-y coordinates
- of mouse clicks will be sent to the item server.
-
-
-
- II.d) NetDoc formats -- Gopher links combined with document data
-
- As an alternative to having a separate .go4 data file of link
- statements for each document, GopherPup understands a compount
- document type where the link statements are prepended to the other
- data. This method allows documents with any kind of data that
- GopherPup can display, including binary formats such as PICT and GIF,
- to contain network links. They are typed as NetDoc/Text, NetDoc/RTF,
- NetDoc/PICT, NetDoc/GIF, et cetera. The format of these NetDoc types
- is to start with the line
-
- +MENU:
-
- followed by as many Gopher+ link statements as described above as
- needed, then the line
-
- +DATA:
-
- is a key that the document data follows to the end of the file.
-
- Here is an example NetDoc header
-
- +MENU:
- +info: 1GopherPup home 1/IUBio-Software+Data/util/gopher/gopherpup
- ftp.bio.indiana.edu 70 +
- +menustring: ftp.bio.indiana.edu:/util/gopher/gopherpup
- +comment:
- +comment: To read this with a standard wordprocessor, strip out the
- lines down thru
- +comment: the "+DATA:" line, so that the first line starts with "{ \
- rtf1"
- +comment:
- +info: 1DCLAP home 1/IUBio-Software+Data/util/dclap
- ftp.bio.indiana.edu 70 +
- +menustring: ftp.bio.indiana.edu:/util/dclap
- +info: 1IUBio Archive ftp.bio.indiana.edu 70 +
- +menustring: "IUBio Biology Archive"
- +info: 7GenBank sequence search 7/.indices/genbank
- ftp.bio.indiana.edu 70 +
- +menustring: "fetch a sequence entry from Genbank"
- +info: 1RTF Info 1/IUBio-Software+Data/util/rtf
- ftp.bio.indiana.edu 70 +
- +menustring: "Rich Text Format"
- +info: 1RTF (Rich Text Format) utilities 1/RTF ftp.primate.wisc.edu
- 70 +
- +menustring: ftp.primate.wisc.edu:/pub/RTF
- +info: 0Go+hypertext-method 0/IUBio-
- Software+Data/util/gopher/go+menu-article.txt
- ftp.bio.indiana.edu 70 +
- +menustring: "network hypertext links"
- +comment: new gopherpup "mailto" kind 'm'
- +info: mGopherPup bug report GopherPup@Bio.Indiana.Edu localhost
- 0
- +menustring: GopherPup@Bio.Indiana.Edu
- +info: 1NCBI Gopher ncbi.nlm.nih.gov 70
- +menustring: "National Center for Biotechnology Information"
- +info: 1ncbi_tools 1/toolbox/ncbi_tools ncbi.nlm.nih.gov 70
- +menustring: ncbi.nlm.nih.gov:/toolbox
- +info: 1Gopher help from U.Minnesota (USA)
- gopher.micro.umn.edu 70 +
- +menustring: gopher.micro.umn.edu
- +comment:
- +DATA:
- {\rtf1\mac\deff2 {\fonttbl{\f0\fswiss Chicago;}{\f2\froman New
- York;}{\f3\fswiss Geneva;}{\f4\fmodern Monaco;}{\f13\fnil Zapf
- Dingbats;}{\f14\fnil Bookman;}{\f16\fnil Palatino;}{\f18\fnil Zapf
- Chancery;}{\f19\fnil Souvenir;}{\f20\froman Times;}
- [... and more rtf data... ]
-
- III. Gopher+ server modifications to support GopherPup
- All of the current extensions to GopherPup will work with the current
- gopher+ GopherD server from University of Minnesota, I believe. I use
- one minor extension, but it isnt essential. At this point, I havent
- tested other gopher+ servers, such as the Macintosh one.
-
- The one extension Ive added to server software for these methods, is
- to allow the combined MENU information and document DATA to be built
- from two separate files, and sent as one chunk to the client. This
- extention allows one to support both non-hypertext and hypertext
- formats from the same data files. For instance, one set of documents
- at IUBio server is available in the following formats: image/pict,
- netdoc/pict, image/gif, netdoc/gif, and text/html. The two pict forms
- use one data file, and the netdoc form is created by prepending a
- +MENU: block (as in II.d above). The other three forms are created
- from one gif file and the same menu block (for netdoc/gif) or with an
- HTML wrapper (for text/html).
-
- IV. Suggestions for adding hypertext support to other gopher+ clients
-
- The protocol extentions that are outlined here, and embodied in the
- GopherPup client, are readily transferable to other gopher+ clients.
- The basic data here is the same gopher+-menu data already parsed by
- these clients. A gopher+ server should need no modification to send
- MENU data associated with a document, if it follows the U.Minnesota
- model of block extensions. It is my intension, and suggestion for
- others, to provide at least both plain text and Rich Text Format
- versions of documents for network information servers, since it is
- usually trivial to produce both forms from a wordprocessor. Thus it
- isnt necessary for other clients to support RTF display (which is the
- hard part here -- it took me rougly one day to add just the hypertext
- portion to my gopher client). Other clients might also care to
- support other document formats. This method of MENU extensions can be
- applied to various text and graphics formats.
-
- For a client to support MENU hypertext links it basically needs to add
- recognition of the +MENU: block extention, and extension locator
- statements like +MENUSTRING:. For a graphics mode client, this would
- involve highlighting the located strings in a document on display, and
- responding to user interactions (mouse clicks) with the highlighted
- item by fetching it from the server, basically in the same way as for
- gopher items that are displayed and selected from a menu.
-
- For character-mode client, it might be easiest to insert markers into
- a display document to correspond to highlighting in a graphics client.
- If the document text were
- Please match this word, and then match this second word,
- then formats such as
- Please match this word{a}, and then match this second word{b},
- or
- Please match this word[1], and then match this second word[2],
- would display easily, and the client could respond to typing of
- letters or numbers as selection events for the linked items.
-
- V. GopherPup source code and how to build it
- Source code for GopherPup is freely available for non-commercial
- distribution and use, as are program executables for the major
- platforms available to the author. It may be modified and enhanced
- for those who wish to. The source code is available at
- <ftp://iubio.bio.indiana.edu:/util/dclap/src/> with associated source
- code in /util/dclap/from-ncbi/, documents in /util/dclap/docs/, and
- program executables in /util/dclap/apps/.
-
- An interesting feature of the GopherPup software is that it is written
- based on a cross-platform development library. This means that one-
- size-fits-all, or at least it tries to. Additions and changes to the
- program are readily available on the common computing platforms of
- Macintosh, MS Windows and X/Motif Windows.
-
- The cross platform development library includes an ANSI-C interface to
- Macintosh, MS-Windows, and X/Motif Windows graphic and system
- environments. The C library is called NCBI Toolkit, with VIBRANT the
- graphic user interface portion. These parts are written by
- biocomputing scientists at the National Center for Biotechnology
- Information (NCBI). The remainder of the framework for GopherPup is a
- C++ object oriented, class application library (DCLAP), which is still
- fairly young. It is based loosely on the Apple Computer MacApp
- development library (the parent of GopherPup, GopherApp was written to
- MacApp).
-
-
- VI. Current bugs and undeveloped features of GopherPup
-
- General:
- RTF and HTML, and in general the rich-text display, need much bug
- chasing.
- HTML Forms are not supported yet (a first hack at it is in
- progress).
- Memory management is improved but still needs much work --
- documents are not flushed & loaded as sensibly as they should be.
-
- Mac:
- seems generally stable, but not completely
-
- XMotif:
- somewhat usable, though it still tends to crash.
- no printing. This lack is fundamental to the XWindow support of
- Vibrant/DCLAP.
-
- MS-Windows:
- freezes fairly frequently still. When not frozen, all the
- features are there. The 32 bit version seems much more stable --
- several of the 16bit version crashes are known due to the small word
- size of that system for memory allocations, etc.
-
- [more bugs are known, just not documented here...]
-
- VII. Why not just use Mosaic?
-
- Internet Gopher and the network information systems that goes under
- names of WWW, the Web, HTTP and HTML are both capable of being good
- network information clients, and in very similar manners. Many of the
- features that were pioneered in gopher are now being added to
- WWW/Mosaic client-servers. Many of the same problems are now being
- addressed. My opinion is that Gopher has good features that should
- not be lost sight of due to a desire for hypertext, and a pretty
- (inter)face.
-
- The hypertext markup language (HTML) has advantages and disadvantages
- over the method discussed here. One basic difference is that HTML is
- an in-line markup language, where hypertext links are inserted in the
- document. The form described here doesnt change documents
- themselves, but adds a secondary data file, or is appended at the
- start of the data. In some cases this nicer, as it leaves the
- original document in its original form, without confusing the text
- with network link codes. GopherPup has been developed to allow
- translators to display various document formats, and as well to permit
- adding network links to any of these formats, whether they are simple
- text or some binary image format.
-
- The HTML document format as implemented in Mosaic and other WWW
- browsers does not provide enough support for the formatting features
- needed for scientific documents. Symbol and other fonts, super and
- subscripts, various formatting capabilities that any common
- wordprocessor can display are essential for electronic publication of
- scientific documents. Also, to get wordprocessing documents into HTML
- format involves various difficulties. The RTF format chosen for
- GopherPup is better suited to display of scholarly articles, and has
- the great benefit of being readily produced by many or most of the
- commonly used wordprocessing programs. As well, GopherPup has the
- markup features needed to allow you to very simply add network links
- to any part of a displayed document, by selecting words or portions of
- the display and dragging a network link item to that selection.
-
- The RTF support also includes support of embedded pictures, including
- both vector and bitmap forms. Vector pictures are not supported in
- Mosaic/www, but are very useful for scientific data (frequently
- graphs) both because they are independent of resolution of the
- display/output device, and also because they can be re-edited to
- different sizes or components can be changed easily. In one example
- of genetic map drawings, I have compared the same maps in formats
- suitable for Mosaic browsers (GIF) and for GopherPup (PICT). The low-
- weight, high-resolution vector map wins hands down any speed contests,
- at about 5 times faster in fetching and displaying the same map,
- making it much more suitable to set up interactive graphic browsing
- services with GopherPup.
-
- The hypertext transport protocol (HTTP) is essentially the same in
- functionality to the Internet Gopher protocol, it just works with
- different codes. Since gopher servers can also server HTML documents
- if those are preferred, there seems no compelling reason to run an
- HTTP server.
-
- Gopher already has mature support for client-server handling of
- multiple-format documents (Gopher+ VIEWS). This allows for instance a
- server to provide both a fancy, picture-rich document and a simple,
- plain text one that can provide the same basic information. But this
- choice of formats lets gopher serve a wider range of client and
- network transport capabilities.
-
- Mosaic and other so-called web browsers tend to be kitchen-sink
- clients. They try to speak many network protocols, but are masters
- only of HTTP/HTML. In part this means that people using these
- browsers get a poor impression of methods other than HTTP/HTML,
- because the browser only poorly support other methods. This is
- basically why I avoid using the terms web and WWW. The advocates
- of HTML/HTTP browsers are claiming more than they deliver. Gopher
- clients tend to concentrate on mastering one protocol, and rely on
- information servers to provide gateways to alternate protocols if
- desired.
-
- The HTML/HTTP developers have long talked about the standard that
- they developed. It currently looks like it will be very difficult
- for anyone to extend or improve that method, as they have a committee
- in charge of it. In my estimation, standards are not born over
- night. Where standard make sense, existing ones should be employed
- before inventing a new one. Where new methods are needed, they should
- be developed and tested before calling them a standard. That is why
- Ive (a) concentrated on using existing formats like RTF and PICT that
- are already in wide use for document display, and (b) extended and
- experimented with the flexible Gopher+ protocol to get it to do useful
- things, rather than try to do such with the restricted HTTP/HTML
- protocols.
-
- Both Internet Gopher and WWW are still evolving and changing rapidly.
- It may well be that they converge on some set of features that all
- find useful. GopherPup is one step to providing a complex client
- that supports the important information protocols. Ive hoped with
- GopherPup extentions to amend the parts of Gopher that many people
- find advantageous in Mosaic, and perhaps provide in some small ways a
- better network client for some uses.
-
- The evolving GopherPup now supports local gophering -- browsing and
- (soon) searching of local data with the same simple methods that apply
- to remote services. With this addition it will be feasible for
- information providers to offer both network services and CDROM based
- information disks relying on one set of data and client/user-
- interface.
-
-
-