- Chapter 3 L.O.V.E. FORTH
-
-
-
- 3.0 L.O.V.E. Forth addressing and segmentation
- ------------------------------------------
-
-
- Almost all languages have problems running on the 8086/88 CPUs, but
- these problems for FORTH are especially severe. Most FORTH systems on
- this architecture are restricted to 64K of main memory for program and
- data, and are referred to as small memory models. This restricts the
- user and programs to a small amount of memory, but offers the highest
- possible execution speed. 32-bit FORTHs have been produced that offer
- a large address space, but at a severe cost in performance. The
- segmentation approach taken in L.O.V.E. Forth offers both a large memory
- size (320K) and very fast execution speed.
-
- Rather than offer a large contiguous memory space, L.O.V.E. Forth
- divides the Forth model by function. There are separate
- segments for machine code, threaded addresses, data, stacks and
- dictionary headers. As the source code is compiled, it is parcelled
- into these five segments. There is no execution time penalty over
- Forths with the small memory model.
-
-
- Note that this implementation is quite compatible with standard
- 16 bit models. For example, @ (fetch) and ! (store) access the data
- segment (the vast majority of FORTH programs use @ and ! to access
- data). Another example is that the assembler always puts its code into
- the code segment. The programmer need not worry that the code has been
- separated from the rest of the program. Even though segmentation is
- provided for in a logical fashion, some compiler words must be
- implemented differently than in standard FORTH.
-
- There are numerous indirect benefits to this segmentation, over
- and above that of memory conservation. Target systems can easily be
- saved without heads (the head segment is simply not written to disk).
- The segments can be compressed to provide small target systems. And
- because machine code is separated from threads, it is actually possible
- to save space in the thread segment by re-coding some words in machine
- code. (The thread segment usually fills the fastest.) This gives a
- speed advantage and a size advantage at the same time.
-
- Note that this conforms closely to the intended usage of the
- architecture of 8086/88 microprocessors. The usual programming battle
- with these processors is to overcome this limited architecture.
-
- Here is a summary of the contents of each segment:
-
- 3.1  Segment  Description                                     Name
-      -------  -----------                                     ----
-
-      CODE     Contains 8086 machine code.                     CS:
-               Pointed to by the CS register.
-
-      THREAD   Contains threaded address lists generated by    TS:
-               high-level words.
-               The code field address points here.
-               Pointed to by the DS register.
-
-      DATA     Holds data from variables, alphanumeric         VS:
-               strings, and block buffers.
-               Pointed to by the ES register.
-
-      HEAD     Holds the compile-time word headers and         HS:
-               vocabulary links.
-               (Segment value calculated when required.)
-
-      STACK    Holds the parameter, return and vocabulary      SS:
-               stacks and local variables, if used.
-               Pointed to by the SS register.
-
- Each segment has a corresponding dictionary pointer and a set
- of basic manipulation words such as CS:@ or HS:, . Note that all
- addresses within these segments are 16 bits wide. The programmer
- selects the segment to be operated upon by the choice of operator
- (e.g. @ TS:@ CS:@ etc.).
-
- As MS-DOS tends to vary the position in RAM at which a program
- is loaded, each segment also has a word that returns its actual
- position: GET:CS GET:SS etc. The handy command MEM-MAP
- displays all the segments and their respective dictionary pointers.
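-
- As a rough sketch of how these words might be used together (an
- illustration, not taken from the original manual), the lines below
- print each dictionary pointer and then the current position of each
- segment. U. is assumed to be available as the usual unsigned print
- word; the values printed will vary from one load to the next.
-
-     CS:HERE U.  TS:HERE U.  HERE U.  HS:HERE U.   ( dictionary pointers )
-     GET:CS U.  GET:TS U.  GET:VS U.  GET:HS U.  GET:SS U.
-
- MEM-MAP presents the segments and pointers in a single display.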
-
- In this documentation and elsewhere, addresses are abbreviated.
- For example TS:addr represents an address in the thread segment.
- Simply 'addr' refers to the variable segment (most often used).
- Some names assume a segment, for example 'compilation address' is always
- in the thread segment; name field address is always in the head segment.
-
-
-
- 3.2 CODE SEGMENT
- ------------
-
-
- This is the only segment that contains 8086/8088 machine code.
- Apart from the space taken by a few pointers used in CREATE DOES>
- words, this allows code to reach a full 64K. The assembler places the
- definition body into the code segment automatically.
-
- This is always the lowest of the 5 segments. Startup code in
- this segment sets up the other segments. In version 1.28 and
- earlier, the first 256 bytes of this segment contain the MS-DOS
- "PSP" (program segment prefix). In newer versions, use GET:PSP .
-
- Basic operators:
- CS:C@ CS:@ CS:! CS:C! CS:, CS:C, CS:HERE
-
- These are analogous to the standard words: C@ @ ! C! , C, and
- HERE but operate on the code segment.
-
- 'CODE operates like ' but returns the address of the executed
- code, extracted from the compilation address. For example, all :
- words return the same value from 'CODE because they all call
- the common code for nesting colon definitions. It is thus most
- useful with CODE words, where it returns the address of the
- code loaded by the assembler.
-
- CS:DUMP ( CS:addr, #bytes -- )
- is a utility that allows bytes to be dumped from this segment.
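-
- A short illustration (an editorial sketch, not from the original
- text): since 'CODE parses the next word in the same way as ' , the
- line below would dump the first 16 bytes of the machine code behind
- a CODE word. DUP is assumed here to be defined as a CODE word, which
- is typical but is not stated in this chapter.
-
-     'CODE DUP  16 CS:DUMP     ( CS:addr, #bytes -- )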
-
- There are also some system 'variables' which are used, for
- example, at start-up before all the segments have been loaded or
- properly positioned.
-
- TOPSEG STACKSIZE TOPSEG SEGPAK LOVEF
- CSEG TSEG VSEG HSEG SSEG
-
- The current segment (position in RAM) is returned by:
- GET:CS (8086 CS register)
-
-
- 3.3 THREAD SEGMENT
- --------------
-
- Forth high-level : words are compiled into a sequence of 16
- bit addresses, called threads. This segment contains these threads,
- CONSTANT and LITERAL values, and pointers to data and code.
- In the majority of applications this segment fills up the fastest.
-
- Basic operators:
- TS:@ TS:! TS:, TS:HERE
- Note that there are no single byte operators - all elements in
- this segment are two bytes.
-
- EXECUTE ( TS:addr -- )
- Accepts the code field address.
-
- TS:DUMP ( TS:addr, #bytes -- )
- Dumps bytes from the specified address.
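-
- For example (a sketch added for illustration; SQUARE is a
- hypothetical word), a colon definition can be looked up, executed
- through its code field address, and have its threads inspected:
-
-     : SQUARE  DUP * ;
-     7 ' SQUARE EXECUTE .      ( run SQUARE through its TS:addr )
-     ' SQUARE 8 TS:DUMP        ( dump 8 bytes starting at the cfa )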
-
- Many words with compile-time usage accept or return addresses in
- this segment:
-
- ' ['] -FIND ( -- TS:addr )
-
- FIND ( VS:addr -- VS:addr, 0 or TS:addr, n )
-
- Words created with the following return a thread segment address
- at run-time:
- CREATE: (alone) or CREATE: DOES:> (pair)
- The most often used words for creation are CREATE and
- CREATE DOES> (pair). See the Variable segment (below).
-
- In addition the following words add to this segment and have
- functions as expected:
- COMPILE [COMPILE] wordname LITERAL DLITERAL
-
- See also the technical note on L.O.V.E. Forth compatibility for
- examples of compile-time word usage.
-
- TS:BODY> TS:>BODY ( TS: addr -- TS: addr )
- are like >BODY and BODY> but operate on the thread segment
- only. (see discussion of 'Field access operators' below)
-
- >BODY ( TS:addr -- VS:addr )
- operates in L.O.V.E. Forth to accept a code field address of a
- VARIABLE (or word created by CREATE) and return the data field
- address.
-
- >LINK >NAME ( TS:addr -- HS:addr )
- are used to access the dictionary header of the specified word.
- If TS:addr is not a valid code field address, an error message
- is displayed.
-
- NAME> LINK> ( HS:addr -- TS:addr )
- are used to find the compilation address from the head address.
-
- FIND-1VOC FIND-VOCS
- ( addr, addr -- TS:addr, true or false )
- are used by FIND . The inputs are the address of the word to find
- (usually at HERE) and the vocabulary body; the output is the cfa
- and a true flag if the word is found, or a false flag otherwise.
-
- GET:TS - returns current segment value (8086 DS register)
-
-
- 3.4 VARIABLE (DATA) SEGMENT
- -----------------------
-
- This segment is accessed the most often by application programs.
- It contains the data for variables, alphanumeric strings compiled by
- ." and " , the BLOCK buffers, the text input buffer TIB , the areas
- at PAD and HERE , and the space allocated for programmer-defined
- data structures.
- Most standard Forth memory access words work relative to this segment.
-
- Basic operators:
- @ ! C@ C! C, , D@ D! +! +C! TYPE ALLOT
- TOGGLE BMOVE CMOVE CMOVE> FILL
- ENCLOSE EXPECT COUNT TYPE -TRAILING
- CONVERT NUMBER #>
- HERE PAD WORD
- BLOCK LIMIT FIRST BUFFER +BUF R/W
-
- Various I/O words:
- L->CRT N$ N$. 'STREAM
-
- TIB HLD and other VARIABLEs all return addresses in VS:
-
- File name strings passed into DOS words:
- <OPEN> OPEN <CREATE> FCREATE INQUIRE <CREATE-NEW> CREATE-NEW
- DELETE RENAME CHDIR
-
- Other DOS words:
- READ WRITE ENV-SRCH DIR-GET ASCIIZ. ASCIIZ"
-
- Words created by VARIABLE DVARIABLE CREATE and
- CREATE ... DOES> also return addresses in VS: .
-
- DUMP ( addr, #bytes -- ) Dumps the specified bytes.
-
- GET:VS - returns current segment value (8086 ES register)
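-
- A minimal sketch of allocating and inspecting data space with the
- words listed above (added for illustration; BUF is a hypothetical
- name, and the FILL argument order of address, count, byte is assumed
- to follow common Forth practice):
-
-     CREATE BUF  64 ALLOT      ( reserve 64 bytes in VS: )
-     BUF 64 0 FILL             ( clear them )
-     1234 BUF !  BUF @ .       ( store and fetch a 16-bit value )
-     BUF 16 DUMP               ( addr, #bytes -- )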
-
-
- 3.5 HEAD SEGMENT
- ------------
-
- The head segment is normally used during compilation only.
- It contains the header part of a Forth word definition, including name,
- dictionary links and pointers to the locations of the word in other
- segments. This segment may be discarded when creating a stand-alone
- application program. Utilities such as WORDS and FORGET access this
- segment automatically.
-
- Basic operators:
- HS:@ HS:! HS:C@ HS:C! HS:, HS:C, TRAVERSE
- N>LINK L>NAME LINK> NAME>
- .ID LAST
- HS:HERE returns the next available address in this segment.
-
- GET:HS - returns current head segment value (calculated)
- Note: TOGGLE (often used to toggle header bits) does not act on
- HS: .
- Note: the form of the head segment is subject to change in future
- versions by the authors without prior notice.
-
-
- 3.6 STACK SEGMENT
- -------------
-
- This segment holds the Forth parameter, return, vocabulary and
- local variable stacks. The operation of words on this segment is
- transparent to the programmer. During development, allowing a full 64K
- for the stack segment minimizes system crashes due to stack overflow.
-
- Basic operators: SS:@ SS:! .S
-
- SP@ RP@ LP@ ( -- SS:addr )
- These words return stack limits or current positions.
-
- S0
- is a variable that contains the address of the bottom of the stack.
-
- SS:HERE
- is the dictionary pointer in this segment. It is currently
- unused by any words in L.O.V.E. Forth and may be used by
- the programmer if so desired.
-
- GET:SS - returns current segment value (8086 SS register)
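-
- As an illustration (an editorial sketch; it assumes that SP@ returns
- the address of the top stack item as it stood before SP@ ran, as in
- most Forths), the top of the parameter stack can be read in place
- without removing it:
-
-     5 SP@ SS:@ .  .           ( both numbers printed should match )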
-
-
- 3.7 Field Access Operators
- ----------------------
-
-
- Every word in Forth has a number of parts or fields. These
- include the name, link, code and parameter fields. Field access
- operators are used to gain access to the various portions of Forth
- words. In L.O.V.E. Forth, as the parts of words are parcelled between
- segments, many of these operators accept an address in one segment and
- deliver an address in another. Here is a summary of the standard
- field access operators and their functions in L.O.V.E. Forth.
-
- >BODY ( TS:addr -- addr )
- accepts a code field address of a VARIABLE (or word created by
- CREATE) and returns the data field address (in VS:).
-
- TS:>BODY TS:BODY> ( TS: addr -- TS: addr )
- are like >BODY and BODY> but operate on the thread segment
- only. Given the compilation address, TS:>BODY returns the
- address of the first threaded address (of a : definition), the
- data field of a CONSTANT or the address pointer of a
- VARIABLE .
- Note that there are thus two types of >BODY . >BODY could be
- rewritten:
- : >BODY TS:>BODY TS:@ ;
-
- >LINK >NAME ( TS:addr -- HS:addr )
- are used to access the dictionary header of the specified word.
- If TS:addr is not a valid compilation address, an error message
- is displayed and execution is ABORTED.
-
- NAME> LINK> ( HS:addr -- TS:addr )
- are used to find the compilation address from the addresses of
- the name and link fields in the header.
-
- N>LINK L>NAME ( HS:addr -- HS:addr )
- are used to move between the name and link fields which are
- both in the head segment.
-
- Note that there is no word BODY> to move from the VS: parameter
- address of a VARIABLE or CREATED word to the compilation
- address. This is not supported in L.O.V.E. Forth.
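-
- The operators above can be tried interactively. The lines below are
- an illustrative sketch rather than part of the original manual:
- COUNTER is a hypothetical name, and .ID is assumed to take a name
- field address in HS: and print the name stored there. Each line
- ending in = . is expected to print a true flag.
-
-     VARIABLE COUNTER
-     ' COUNTER >BODY  COUNTER = .           ( data field address )
-     ' COUNTER TS:>BODY TS:@  COUNTER = .   ( same, by the long route )
-     ' COUNTER >NAME .ID                    ( print the header name )
-     ' COUNTER >NAME NAME>  ' COUNTER = .   ( back to the same TS:addr )
-     ' COUNTER >NAME N>LINK L>NAME  ' COUNTER >NAME = .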
-
-
- 3.8 Long Operators
- --------------
-
- L.O.V.E. Forth contains a set of basic operators which operate on
- any area of memory. These words allow the specification of both the
- segment and address of the location to be operated upon.
-
- Basic operators: @L !L C@L C!L BMOVEL
-
- Some disk operators will operate on any segment:
- READL WRITEL <READL> <WRITEL>
- RWTSL EXEC
-
- ENV-SRCH ( string -- seg, addr, f or t )
- returns both segment and address of DOS environment
-
- DUMPL ( seg,addr,#bytes -- )
- Allows memory to be dumped relative to any segment.
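-
- A sketch of the long operators in use (added for illustration; X is
- a hypothetical variable). The stack order seg, addr for @L is an
- assumption, chosen to match the order documented for DUMPL :
-
-     VARIABLE X  1234 X !
-     GET:VS X @L  X @ = .      ( long fetch compared with ordinary fetch )
-     GET:VS X 2 DUMPL          ( seg, addr, #bytes -- )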
-
-
- 3.9 Memory map
- ----------
-
- The dictionary pointers move up as more is compiled. Certain
- words only use certain segments (eg. a CONSTANT occupies only the thread
- and head segments). When any of the dictionary pointers reaches within
- 400 bytes of the maximum available address a warning message is
- displayed 'GETTING CLOSE TO FULL'.
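-
- The way a definition touches only some segments can be observed by
- comparing dictionary pointers before and after compiling it. This is
- an illustrative sketch (TEN is a hypothetical name); it relies on the
- statement above that a CONSTANT occupies only the thread and head
- segments.
-
-     TS:HERE HERE              ( remember thread and data pointers )
-     10 CONSTANT TEN
-     HERE = .                  ( data pointer: expected unchanged, true )
-     TS:HERE = .               ( thread pointer: expected moved, false )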
-
- The maximum available address in each segment is dependent on
- several things. Virtual vocabularies are loaded in high memory; disk
- buffers are also here (in the VS: only - minimum of 2k bytes). The
- current maximum addresses are always stored in the VARIABLE TOPS
- (contains one cell for each of CS: TS: VS: and HS:).
- If the program is very large, it is best to remove any resident virtual
- vocabulary with FORGET-SYS .
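-
- As a hedged sketch, the space remaining in each segment could be
- estimated from TOPS . The layout is an assumption: the four cells
- are taken to be 2 bytes each and stored in the order CS: TS: VS:
- HS: , which the text above suggests but does not state outright.
-
-     TOPS @       CS:HERE -  U.    ( bytes left in the code segment )
-     TOPS 2 + @   TS:HERE -  U.    ( thread segment, assumed 2nd cell )
-     TOPS 4 + @   HERE    -  U.    ( data segment, assumed 3rd cell )
-     TOPS 6 + @   HS:HERE -  U.    ( head segment, assumed 4th cell )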
-
-
-