home *** CD-ROM | disk | FTP | other *** search
- Path: sparky!uunet!dtix!darwin.sura.net!spool.mu.edu!olivea!mintaka.lcs.mit.edu!zurich.ai.mit.edu!jaffer
- From: jaffer@zurich.ai.mit.edu (Aubrey Jaffer)
- Newsgroups: comp.lang.scheme
- Subject: database in Scheme
- Message-ID: <JAFFER.92Dec22170313@camelot.ai.mit.edu>
- Date: 22 Dec 92 22:03:13 GMT
- Sender: news@mintaka.lcs.mit.edu
- Organization: M.I.T. Artificial Intelligence Lab.
- Lines: 163
-
- I have written the system described in this message. If other people
- are interested in using this system, I may be able to convince my
- employer (which is not MIT) to release it for public use. If you are
- interested, please send me a description of what you might use it for.
-
- ======================================================================
- WB is a disk based, sorted associative array package integrated into
- the Scheme implementation SCM. These associative arrays consist of
- keys and values which are strings of length less than 256 bytes.
- Associative arrays can be used to form a MUMPS style database which
- can be used to implement standard record structures without auxiallary
- files (see example below).
-
- Basic operations are creation, destruction, opening and closing of
- diskfiles and arrays, insertion, deletion, retrieval, successor, and
- predecessor (with respect to dictionary order of keys). Functional
- application of find-next, deletion, and modification over a range of
- consecutive key values is supported.
-
- Multiple associative arrays can be stored in one disk file.
- Simultaneous access to multiple disk files is supported. A structure
- checker, garbage collector are included. A repair program and
- ram-disk type file (for temporary structures) are in developement.
-
- The current WB implementation has a file size limit of 2^32 * block
- size (currently 2048) = 2^43 bytes (8796 Gbytes). We currently run WB
- with file sizes of several hundred Mbytes. WB does its own memory and
- disk management and maintains an in core cache of recently used
- blocks. The WB implementation adds about 45 Kbytes to the size of
- SCM.
-
- WB is implemented using a B-tree variant. B-trees give slower access
- than hashing but are dynamic and provide an efficient determination of
- successor and predecessor keys. All operations are O(log(n)) in the
- size of the database. B-trees are commonly used by database systems
- for implementing index structures. B-trees are optimized for using
- the minimum number of disk operations for large data structures.
- Prefix and suffix compression are used for storage efficiency.
-
- MUMPS Style Database Phone Book Example
-
- (open-seg 5 "mydata" 2) ;opens a previously created file.
- (define pb (create-db 5 #\T "phone-book")) ;create an array called
- ;"phone-book" which will
- ;contain the phone book
- ;records.
- (define pi (create-db 5 #\T "phone-index")) ;create an array called
- ;"phone-index" which we will
- ;use for indexing by phone
- ;number.
- (define lni (create-db 5 #\T "lastname-index"))
- ;create an array called
- ;"lastname-book" which we will
- ;use for indexing by last name
- (define record-number 0)
-
- ;;;MAKE-NAME is a routine which concatenates its arguments together
- ;;;separated by null characters. This assures that the arguments act
- ;;;as independent subscripts.
-
- (put! pb (make-name record-number "LN") "Doe") ;last name
- (put! pb (make-name record-number "FN") "Joe") ;first name
- (put! pb (make-name record-number "PN") "5551212") ;phone number
- (put! pb (make-name record-number "AD1") "13 Hi St.") ;street address
- (put! pb (make-name record-number "CITY") "Podunk")
- (put! pb (make-name record-number "ST") "NY")
- (put! pb (make-name record-number "ZIP") "10000")
- (put! lni (make-name "Doe" record-number) "")
- ;This adds index entry so that
- ;(next lni (make-name "Doe"))
- ;will find the record with
- ;complete information.
- (put! pi (make-name "5551212" record-number) "")
- ;similarly for looking up by
- ;phone number.
-
- ;;; Note we put the record number into the key. This is so that we
- ;;; can index records for more than one "Doe".
-
- (define doe-rec
- (get-subscript
- (next lni (make-name "Doe")) ;returns the a name which
- 2 ;includes the record number of
- ;the record.
-
- (get pb (make-name doe-rec "PN")) ;returns Doe's Phone number.
-
- SCHEME DATABASE CALLS
-
- (open-seg seg filename mode) Procedure
-
- Seg should be a non-negative number. Opens the database seg in
- filename and returns seg if successful, #f otherwise. The database
- seg will be read-only if the mode argument is 0. It will be
- read-write if the mode argument is 2.
-
- (close-seg seg) Procedure
-
- Closes database seg seg and the file containing it.
-
- (make-seg seg filename block-size) Procedure
-
- Makes a new empty database seg of name filename. #t is returned
- of creation was successful, #f if not. Use open-seg to get the
- new seg.
-
- (open-bt seg blknum) Procedure
-
- Returns a bt-handle open to seg number seg, block number blknum. If
- no such block exists or is not a root block #f is returned.
-
- (create-bt seg dir) Procedure
-
- Creates a new root block in seg seg of type dir and returns a
- bt-handle open to it.
-
- (get han key) Procedure
-
- Returns a string of the value associated with key in the bt which han
- is open to. Returns #f if key is not in the bt. Key is a string.
-
- (next han key) Procedure
-
- Returns the next key in bt han or #f if none.
-
- (prev han key) Procedure
-
- Returns the previous key in bt han or #f if none.
-
- (rem! han key) Procedure
-
- Removes key and it's associated value from bt han.
-
- (rem han key) Procedure
-
- Removes key and it's associated value from bt han only if key is
- present. Returns #t for success, #f for failure (not present).
-
- (rem* han key1 key2) Procedure
-
- Removes keys (and their associated values) between (including) key1
- and (not including) key2 from bt han.
-
- (put! han key val) Procedure
-
- Associates key with val in the bt han.
-
- (put han key val) Procedure
-
- Associates key with val in the bt han only if key was previously
- empty. Returns #t for success, #f for failure.
-
- (create-db seg typ name) Procedure
-
- Returns a B-tree whose name has been entered in the root directory.
- Type should be one of
- (define DIR-TYP #\D)
- (define IND-TYP #\T)
-
- (open-db seg name) Procedure
-
- Returns the B-tree whose name has been entered in the root directory
- or #f if not found
-