-
-
-
-
-
-
- 8. INTERNAL
-
- You don't need to know the material in this section to
- use Elvis. You only need it if you intend to modify Elvis.
-
- You should also check out the CFLAGS, TERMCAP, ENVIRONMENT
- VARIABLES, VERSIONS, and QUESTIONS & ANSWERS sections
- of this manual.
-
- 8.1. The temporary file
-
- The temporary file is divided into blocks of 1024 bytes
- each. The functions in "blk.c" maintain a cache of the five
- most recently used blocks, to minimize file I/O.
-
- When Elvis starts up, the file is copied into the temporary
- file by the function tmpstart() in "tmp.c". Small
- amounts of extra space are inserted into the temporary file
- to ensure that no text lines cross block boundaries. This
- speeds up processing and simplifies storage management. The
- extra space is filled with NUL characters. The input file
- must not contain any NULs, to avoid confusion. This also
- limits lines to a length of 1023 characters or less.
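-
- The padding rule can be sketched as follows. This is only an
- illustration, not the code in "tmp.c"; the names BLKSIZE and
- pack_line() are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLKSIZE 1024   /* block size stated in the text */

/* Hypothetical helper: append one line (including its newline) to a
 * buffer made of BLKSIZE-byte blocks.  "used" is the number of bytes
 * already filled.  If the line would cross a block boundary, the rest
 * of the current block is padded with NULs first.
 * Returns the new fill count.
 */
size_t pack_line(char *buf, size_t used, const char *line)
{
    size_t len = strlen(line);
    size_t room = BLKSIZE - used % BLKSIZE;   /* bytes left in this block */

    assert(len > 0 && len < BLKSIZE);         /* a line must fit in one block */
    if (len > room) {
        memset(buf + used, '\0', room);       /* pad up to the boundary */
        used += room;
    }
    memcpy(buf + used, line, len);
    return used + len;
}
```

- Because the padding is made of NULs, a NUL in the input text
- would be indistinguishable from padding, which is why the
- input file must not contain any.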
-
- The data blocks aren't necessarily stored in sequence.
- For example, it is entirely possible that the data block
- containing the first lines of text will be stored after the
- block containing the last lines of text.
-
- In RAM, Elvis maintains two lists: one that describes
- the "proper" order of the disk blocks, and another that
- records the line number of the last line in each block.
- When Elvis needs to fetch a given line of text, it uses
- these tables to locate the data block which contains that
- line.
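-
- A lookup through these two tables might look like the sketch
- below. The struct layout and the name block_of_line() are
- assumptions for illustration; the real tables in Elvis may be
- organized differently.

```c
#include <assert.h>

/* Hypothetical in-RAM tables for a file stored in nblks blocks.
 * order[i]    = physical block number of the i-th logical block
 * lastline[i] = line number of the last line in the i-th logical block
 */
struct blkmap {
    int nblks;
    const unsigned short *order;
    const long *lastline;
};

/* Return the physical block holding line "lnum" (1-based), or -1 if
 * the line is past the end of the file.  A linear scan is enough for
 * a sketch.
 */
int block_of_line(const struct blkmap *map, long lnum)
{
    int i;

    assert(lnum >= 1);
    for (i = 0; i < map->nblks; i++) {
        if (lnum <= map->lastline[i])
            return map->order[i];
    }
    return -1;
}
```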
-
- Before each change is made to the file, these lists are
- copied. The copies can be used to "undo" the change. Also,
- the first list -- the one that lists the data blocks in
- their proper order -- is written to the first data block of
- the temp file. This list can be used during file recovery.
-
- When blocks are altered, they are rewritten to a different
- block in the file, and the order list is updated
- accordingly. The original block is left intact, so that
- "undo" can be performed easily. Elvis will eventually
- reclaim the original block, when it is no longer needed.
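-
- The interplay between the copied list and the rewritten block
- can be sketched like this. All names here are hypothetical,
- and the sketch tracks only the order list, not the line-number
- list or the block contents.

```c
#include <assert.h>

#define MAXBLKS 8

/* Order list: order[i] is the physical block holding logical block i. */
struct blklist {
    int nblks;
    int order[MAXBLKS];
};

/* Hypothetical editor state for this sketch. */
struct tmpstate {
    struct blklist cur;    /* list currently in use */
    struct blklist saved;  /* copy taken before the latest change */
    int nextfree;          /* next unused physical block */
};

/* Call before each change: remember the list so the change can be undone. */
void before_change(struct tmpstate *t)
{
    t->saved = t->cur;
}

/* Rewrite logical block i.  The new contents go to a fresh physical
 * block; the old physical block is left untouched.
 * Returns the new physical block number.
 */
int rewrite_block(struct tmpstate *t, int i)
{
    assert(i >= 0 && i < t->cur.nblks);
    t->cur.order[i] = t->nextfree++;
    return t->cur.order[i];
}

/* Undo the latest change by restoring the saved list. */
void undo_change(struct tmpstate *t)
{
    t->cur = t->saved;
}
```

- Since the old block was never overwritten, restoring the old
- list is all that undo needs to do in this model.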
-
- 8.2. Implementation of Editing
-
- There are three basic operations which affect text:
-
- +o delete text - delete(from, to)
- +o add text - add(at, text)
- +o yank text - cut(from, to)
-
- June 13, 1992
-
-
- To yank text, all text between two text positions is
- copied into a cut buffer. The original text is not changed.
- To copy the text into a cut buffer, you need only remember
- which physical blocks contain the cut text, the offset
- into the first block of the start of the cut, the offset
- into the last block of the end of the cut, and what kind of
- cut it was. (Cuts may be either character cuts or line
- cuts; the kind of a cut affects the way it is later "put".)
- Yanking is implemented in the function cut(), and pasting is
- implemented in the function paste(). These functions are
- defined in "cut.c".
-
- To delete text, you must modify the first and last
- blocks, and remove any reference to the intervening blocks
- in the header's list. The text to be deleted is specified
- by two marks. This is implemented in the function delete().
-
- To add text, you must specify the text to insert (as a
- NUL-terminated string) and the place to insert it (as a
- mark). The block into which the text is to be inserted may
- need to be split into as many as four blocks, with new
- intervening blocks needed as well... or it could be as sim-
- ple as modifying a single block. This is implemented in the
- function add().
-
- There is also a change() function, which generally just
- calls delete() and add(). For the special case where a sin-
- gle character is being replaced by another single character,
- though, change() will optimize things somewhat. The add(),
- delete(), and change() functions are all defined in
- "modify.c".
-
- The input() function reads text from the user and inserts
- it into the file. It makes heavy use of the add(),
- delete(), and change() functions. It inserts characters one
- at a time, as they are typed.
-
- When text is modified, an internal file-revision
- counter, called "changes", is incremented. This counter is
- used to detect when certain caches are out of date. (The
- "changes" counter is also incremented when we switch to a
- different file, and also in one or two similar situations --
- all related to invalidating caches.)
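-
- A cache guarded by the "changes" counter could look like the
- sketch below. Only the counter itself comes from the text;
- recount(), cached_count() and recount_calls are hypothetical
- names invented for this example.

```c
#include <assert.h>

long changes;        /* file-revision counter; bumped on every modification */
int  recount_calls;  /* how many times the slow path ran (illustration only) */

/* Stand-in for some expensive computation over the whole file. */
static long recount(void)
{
    recount_calls++;
    return 100 + changes;
}

/* Return the cached result, recomputing only if the file has changed. */
long cached_count(void)
{
    static long stamp = -1;   /* value of "changes" when the cache was filled */
    static long value;

    assert(changes >= 0);
    if (stamp != changes) {
        value = recount();
        stamp = changes;
    }
    return value;
}
```

- Comparing a saved copy of the counter is cheaper than having
- every editing function notify every cache explicitly.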
-
- 8.3. Marks and the Cursor
-
- Marks are places within the text. They are represented
- internally as 32-bit values which are split into two bit-
- fields: a line number and a character index. Line numbers
- start with 1, and character indexes start with 0. Lines can
- be up to 1023 characters long, so the character index is 10
- bits wide and the line number fills the remaining 22 bits in
- the long int.
-
- Since line numbers start with 1, it is impossible for a
- valid mark to have a value of 0L. 0L is therefore used to
- represent unset marks.
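-
- The bit layout described above can be sketched in C as
- follows. The helper names are assumptions for this example and
- are not necessarily the ones Elvis uses. The shifts are done on
- unsigned values so that the sketch also behaves on machines
- where a long is exactly 32 bits.

```c
#include <assert.h>

typedef long MARK;                        /* marks fit in a long int */

#define IDX_BITS   10                     /* character index: 0..1023 */
#define IDX_MASK   ((1L << IDX_BITS) - 1)
#define MAX_LINE   ((1L << 22) - 1)       /* line number: 22 bits */
#define MARK_UNSET 0L                     /* no valid mark has this value */

/* Combine a line number (from 1) and a character index (from 0). */
MARK make_mark(long line, int idx)
{
    assert(line >= 1 && line <= MAX_LINE);
    assert(idx >= 0 && idx <= IDX_MASK);
    return (MARK)(((unsigned long)line << IDX_BITS) | (unsigned long)idx);
}

/* Extract the line number from a mark. */
long mark_line(MARK m)
{
    return (long)((unsigned long)m >> IDX_BITS);
}

/* Extract the character index from a mark. */
int mark_idx(MARK m)
{
    return (int)(m & IDX_MASK);
}
```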
-
- When you do the "delete text" change, any marks that
- were part of the deleted text are unset, and any marks that
- were set to points after it are adjusted. Marks are
- adjusted similarly after new text is inserted.
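-
- As a simplified illustration of mark adjustment, the sketch
- below handles only the deletion of whole lines. The real
- delete() takes two marks and may remove partial lines; the
- function name and its arguments are assumptions.

```c
#include <assert.h>

#define IDX_BITS   10
#define MARK_UNSET 0L

/* Hypothetical helper: return the new value of mark "m" after whole
 * lines first..last (inclusive) have been deleted.
 */
long adjust_after_delete(long m, long first, long last)
{
    long line = m >> IDX_BITS;

    assert(first >= 1 && first <= last);
    if (m == MARK_UNSET || line < first)
        return m;                  /* before the deleted text: unchanged */
    if (line <= last)
        return MARK_UNSET;         /* inside the deleted text: unset */
    /* after the deleted text: move up by the number of deleted lines */
    return m - ((last - first + 1) << IDX_BITS);
}
```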
-
- The cursor is represented as a mark.
-
- 8.4. Colon Command Interpretation
-
- Colon commands are parsed, and the command name is
- looked up in an array of structures which also contain a
- pointer to the function that implements the command, and a
- description of the arguments that the command can take. If
- the command is recognized and its arguments are legal, then
- the function is called.
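-
- A table-driven dispatcher of this kind might be sketched as
- below. The struct fields, the two sample commands and the
- docmd() name are all hypothetical; the real table describes
- arguments in more detail than a single flag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*cmdfunc)(const char *args);

/* One table entry: name, implementing function, argument description. */
struct cmdinfo {
    const char *name;
    cmdfunc     fn;
    int         takes_args;   /* simplified: may the command have arguments? */
};

/* Sample commands that only return a distinctive value. */
static int cmd_quit(const char *args)  { (void)args; return 1; }
static int cmd_write(const char *args) { (void)args; return 2; }

static const struct cmdinfo cmdtable[] = {
    { "quit",  cmd_quit,  0 },
    { "write", cmd_write, 1 },
};

/* Look the command up and call it if its arguments are legal.
 * Returns the command's result, or -1 if the command is unknown
 * or was given arguments it does not accept.
 */
int docmd(const char *name, const char *args)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof cmdtable / sizeof cmdtable[0]; i++) {
        if (strcmp(cmdtable[i].name, name) == 0) {
            if (args != NULL && *args != '\0' && !cmdtable[i].takes_args)
                return -1;
            return cmdtable[i].fn(args);
        }
    }
    return -1;
}
```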
-
- Each function performs its task; this may have side effects,
- such as moving the cursor to a different line.
-
- 8.5. Screen Control
-
- In input mode or visual command mode, the screen is
- redrawn by a function called redraw(). This function is
- called in the getkey() function before each keystroke is
- read in, if necessary.
-
- The redraw() function writes to the screen via a package which looks
- like the "curses" library, but isn't. It is actually much
- simpler. Most curses operations are implemented as macros
- which copy characters into a large I/O buffer, which is then
- written with a single large write() call as part of the
- refresh() operation.
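-
- The buffering idea can be sketched as below, assuming a POSIX
- system. The names out_addch(), out_addstr() and out_refresh()
- are made up for this example and are not the names used by the
- pseudo-curses package.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#define OUTSIZE 4096

char   outbuf[OUTSIZE];  /* large output buffer */
size_t outlen;           /* number of bytes waiting in outbuf */

/* Flush everything buffered so far with a single write() call.
 * Returns 0 on success, -1 if write() failed.
 */
int out_refresh(void)
{
    ssize_t n = 0;

    assert(outlen <= OUTSIZE);
    if (outlen > 0)
        n = write(STDOUT_FILENO, outbuf, outlen);
    outlen = 0;
    return n < 0 ? -1 : 0;
}

/* Append one character, flushing first if the buffer is full.
 * This is a macro, so the common case costs only a store.
 */
#define out_addch(ch) \
    ((void)(outlen == OUTSIZE && out_refresh()), \
     (void)(outbuf[outlen++] = (char)(ch)))

/* Append a NUL-terminated string. */
void out_addstr(const char *s)
{
    while (*s != '\0')
        out_addch(*s++);
}
```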
-
- (Note: Under MS-DOS, the pseudo-curses macros check to
- see whether you're using the pcbios interface. If you are,
- then the macros call functions in "pc.c" to implement screen
- updates.)
-
- The low-level functions which modify text (namely
- add(), delete(), and change()) supply redraw() with clues to
- help redraw() decide which parts of the screen must be
- redrawn. The clues are given via a function called
- redrawrange().
-
- Most EX commands use the pseudo-curses package to per-
- form their output, like redraw().
-
- There is also a function called msg() which uses the
- same syntax as printf(). In EX mode, msg() writes the message
- to the screen and automatically adds a newline. In VI mode,
- msg() writes the message on the bottom line of the screen
- with the "standout" character attribute turned on.
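-
- A printf-style function of this kind is usually built on
- <stdarg.h>. The sketch below covers only the formatting step
- and the EX-mode newline; it writes into a caller-supplied
- buffer instead of the screen, and the names fmt_msg() and
- exmode are assumptions.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

int exmode = 1;   /* hypothetical flag: nonzero while in EX mode */

/* Format a message like printf() into buf (of "size" bytes).
 * In EX mode a newline is appended.  Returns the length of the
 * resulting string, or -1 on a formatting error.
 */
int fmt_msg(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(size >= 2);
    va_start(ap, fmt);
    n = vsnprintf(buf, size - 1, fmt, ap);   /* leave room for '\n' */
    va_end(ap);
    if (n < 0)
        return -1;
    if ((size_t)n > size - 2)
        n = (int)(size - 2);                 /* output was truncated */
    if (exmode) {
        buf[n++] = '\n';
        buf[n] = '\0';
    }
    return n;
}
```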
-
- 8.6. Options
-
- For each option available through the ":set" command,
- Elvis contains a character array variable whose name is "o_"
- followed by the option's name. For example, the "lines"
- option uses a variable called "o_lines".
-
- For boolean options, the array has a dimension of 1.
- The first (and only) character of the array will be NUL if
- the variable's value is FALSE, and some other value if it is
- TRUE. To check the value, just dereference the array
- name, as in "if (*o_autoindent)".
-
- For number options, the array has a dimension of 3.
- The array is treated as three unsigned one-byte integers.
- The first byte is the current value of the option. The
- second and third bytes are the lower and upper bounds of
- that option.
-
- For string options, the array usually has a dimension
- of about 60 but this may vary. The option's value is stored
- as a normal NUL-terminated string.
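-
- The three layouts might be declared and used as in the sketch
- below. The names "o_autoindent" and "o_lines" come from the
- text; "o_shell", the initial values, the bounds and the
- set_number() helper are assumptions for illustration.

```c
#include <assert.h>
#include <string.h>

char o_autoindent[1] = { 0 };           /* boolean: NUL means FALSE */
char o_lines[3]      = { 25, 2, 50 };   /* number: value, lower, upper */
char o_shell[60]     = "/bin/sh";       /* string: NUL-terminated */

/* The bytes of a number option are treated as unsigned. */
#define UCHAR(c) ((unsigned char)(c))

/* Hypothetical helper: store a new value in a number option if it
 * lies within the option's bounds.  Returns 0 on success, -1 if the
 * value is out of range (the option is then left unchanged).
 */
int set_number(char opt[3], int value)
{
    if (value < UCHAR(opt[1]) || value > UCHAR(opt[2]))
        return -1;
    opt[0] = (char)value;
    return 0;
}
```

- A boolean is then tested with "if (*o_autoindent)", and the
- current value of a number option is read as UCHAR(o_lines[0]).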
-
- All of the options are declared in "opts.c". Most are
- initialized to their default values; the initopts() function
- is used to perform any environment-specific initialization.
-
- 8.7. Portability
-
- To improve portability, Elvis collects as many of the
- system-dependent definitions as possible into the "config.h"
- file. This file begins with some preprocessor instructions
- which attempt to determine which compiler and operating sys-
- tem you have. After that, it conditionally defines some
- macros and constants for your system.
-
- One of the more significant macros is ttyread(). This
- macro is used to read raw characters from the keyboard,
- possibly with a timeout. For UNIX systems, this basically reads
- bytes from stdin. For MS-DOS, TOS, and OS9, ttyread() is a
- function defined in "curses.c". There is also a ttywrite()
- macro.
-
- The tread() and twrite() macros are versions of read()
- and write() that are used for text files. On UNIX systems,
- these are equivalent to read() and write(). On MS-DOS,
- these are also equivalent to read() and write(), since DOS
- libraries are generally clever enough to convert newline
- characters automatically. For Atari TOS, though, the MWC
- library is too stupid to do this, so we had to do the
- conversion explicitly.
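-
- An explicit newline conversion on input could look like the
- sketch below. It is not the code Elvis uses for TOS; the
- function name is hypothetical, and a CR that falls at the very
- end of one buffer with its LF in the next is not handled.

```c
#include <assert.h>

/* Hypothetical helper: convert CR-LF pairs to LF, in place, in the
 * first nbytes bytes of buf.  Returns the new length.
 */
int crlf_to_lf(char *buf, int nbytes)
{
    int src, dst;

    assert(nbytes >= 0);
    for (src = dst = 0; src < nbytes; src++) {
        if (buf[src] == '\r' && src + 1 < nbytes && buf[src + 1] == '\n')
            continue;   /* drop the CR; the LF is copied on the next pass */
        buf[dst++] = buf[src];
    }
    return dst;
}
```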
-
- Other macros may substitute index() for strchr(), or
- bcopy() for memcpy(), or map the "void" data type to "int",
- and so on.
-
- The file "tinytcap.c" contains a set of functions that
- emulate the termcap library for a small set of terminal
- types. The terminal-specific info is hard-coded into this
- file. It is only used for systems that don't support real
- termcap. Another alternative for screen control can be seen
- in the "curses.h" and "pc.c" files. Here, macros named
- VOIDBIOS and CHECKBIOS are used to indirectly call functions
- which perform low-level screen manipulation via BIOS calls.
-
- The stat() function must be able to come up with UNIX-
- style major/minor/inode numbers that uniquely identify a
- file or directory.
-
- Please try to keep your changes localized, and wrap them
- in #if/#endif pairs, so that Elvis can still be compiled on
- other systems. And PLEASE let me know about it, so I can
- incorporate your changes into my latest-and-greatest version
- of Elvis.