- CHAPTER 6 THE 86 INSTRUCTION SET
-
-
- Effective Addresses
-
- Most access to memory data in the 86 family is accomplished via
- the mechanism of the effective address. Wherever an effective
- address specifier "eb", "ew" or "ed" appears in the list of 8086
- instructions, you may use a wide variety of actual operands in
- that instruction. These include general registers, memory
- variables, and a variety of indexed memory quantities.
-
- GENERAL REGISTERS: Wherever an "ew" appears, you can use any of
- the 16-bit registers AX,BX,CX,DX,SI,DI,SP, or BP. Wherever an
- "eb" appears, you can use any of the 8-bit registers
- AL,BL,CL,DL,AH,BH,CH, or DH. For example, the "ADD ew,rw" form
- subsumes the 16-bit register-to-register adds, such as
- ADD AX,BX; ADD SI,BP; ADD SP,AX.
-
- MEMORY VARIABLES: Wherever an "ew" appears, you can use a word
- memory variable. Wherever an "eb" appears, you can use a byte
- memory variable. Variables are typically declared in the DATA
- segment, using a DW declaration for a word variable, or a DB
- declaration for a byte variable. For example, you can declare
- variables:
-
- DATA_PTR DW ?
- ESC_CHAR DB ?
-
- Later, you can load or store these variables:
-
- MOV SI,DATA_PTR ; load DATA_PTR into SI for use
- LODSW ; fetch the word pointed to by DATA_PTR
- MOV DATA_PTR,SI ; store the value incremented by the LODSW
- MOV BL,ESC_CHAR ; load the byte variable ESC_CHAR
-
- Alternatively, you can address specific unnamed memory locations
- by enclosing the location value in square brackets; for example,
-
- MOV AL,[02000] ; load contents of location 02000 into AL
-
- Note that A86 discerned from context (loading into AL) that a
- BYTE at 02000 was intended. Sometimes this is impossible, and
- you must specify byte or word:
-
- INC B[02000] ; increment the byte at location 02000
- MOV W[02000],0 ; set the WORD at location 02000 to zero
-
- INDEXED MEMORY: The 86 supports the use of certain registers as
- base pointers and index registers into memory. BX and BP are the
- base registers; SI and DI are the index registers. You may
- combine at most one base register, at most one index register,
- and a constant number into a run time pointer that determines the
- location of the effective address memory to be used in the
- instruction. These can be given explicitly, by enclosing the
- index registers in brackets:
-
- MOV AX,[BX]
- MOV CX,W[SI+17]
- MOV AX,[BX+SI+5]
- MOV AX,[BX][SI]5 ; another way to write the same instr.
-
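The address arithmetic behind the indexed forms above can be sketched in Python. This is an illustrative model only, not part of A86: the function name `effective_offset` and the dictionary of register values are assumptions made for the example. It sums at most one base register, at most one index register, and a constant, and wraps the result to 16 bits, since an offset always stays within its 64K segment.

```python
BASE_REGS = {"BX", "BP"}    # the base registers
INDEX_REGS = {"SI", "DI"}   # the index registers

def effective_offset(regs, parts, constant=0):
    """Return the 16-bit offset for the registers in `parts` plus `constant`.

    `regs` maps register names to their current values (a hypothetical
    stand-in for the machine state at run time).
    """
    bases = [p for p in parts if p in BASE_REGS]
    indexes = [p for p in parts if p in INDEX_REGS]
    if len(bases) + len(indexes) != len(parts):
        raise ValueError("only BX, BP, SI and DI may be used for indexing")
    if len(bases) > 1 or len(indexes) > 1:
        raise ValueError("at most one base and one index register")
    total = sum(regs[p] for p in parts) + constant
    return total & 0xFFFF   # offsets wrap around within the 64K segment

regs = {"BX": 0x1000, "BP": 0x2000, "SI": 0x0020, "DI": 0x0040}
print(hex(effective_offset(regs, ["BX", "SI"], 5)))   # models MOV AX,[BX+SI+5]
```

With the sample register values shown, the [BX+SI+5] operand resolves to offset 1025 hex. A combination such as [BX+BP] is rejected, matching the one-base, one-index rule.
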
- Or, indexing can be accomplished by declaring variables in a
- based structure (see the STRUC directive in Chapter 9):
-
- STRUC [BP] ; NOTE: based structures are unique to A86!
- BP_SAVE DW ? ; BP_SAVE is a word at [BP]
- RET_ADDR DW ? ; RET_ADDR is a word at [BP+2]
- PARM1 DW ? ; PARM1 is a word at [BP+4]
- PARM2 DW ? ; PARM2 is a word at [BP+6]
- ENDS
- INC PARM1 ; equivalent to INC W[BP+4]
-
- Finally, indexing can be done by mixing explicit components with
- declared ones:
-
- TABLE DB 4,2,1,3,5
- MOV AL,TABLE[BX] ; load byte number BX of TABLE
-
-
- Segmentation and Effective Addresses
-
- The 86 family has four segment registers, CS, DS, ES, and SS,
- used to address memory. Each segment register points to 64K
- bytes of memory within the 1-megabyte memory space of the 86.
- (The start of the 64K is calculated by multiplying the segment
- register value by 16; i.e., by shifting the value left by one hex
- digit.) If your program's code, data and stack areas can all fit
- in the same 64K bytes, you can leave all the segment registers
- set to the same value. In that case, you won't have to think
- about segment registers--no matter which one is used to address
- memory, you'll still get the same 64K. If your program needs
- more than 64K, you must point one or more segment registers to
- other parts of the memory space. In this case, you must take
- care that your memory references use the segment registers you
- intended.
-
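The multiply-by-16 rule described above can be checked with a few lines of Python. This is a sketch for illustration: the name `physical_address` is an assumption, and the wrap at 1 megabyte models the 20-bit address space of the 8086 and 8088 (later processors may behave differently).

```python
def physical_address(segment, offset):
    """Combine a 16-bit segment value and a 16-bit offset into a 20-bit address."""
    # Multiplying by 16 is the same as shifting left by one hex digit (4 bits).
    return ((segment << 4) + offset) & 0xFFFFF

print(hex(physical_address(0x1234, 0x0010)))   # segment 1234, offset 0010
```

Segment value 1234 hex starts its 64K at 12340 hex, so offset 0010 hex within it lands at 12350 hex. Two different segment:offset pairs can name the same byte, which is why a program that fits in 64K can leave all segment registers equal.
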
- Each effective address memory access has a default segment
- register, to be used if you do not explicitly specify which
- segment register you wish. For most effective addresses, the
- default segment register is DS. The exceptions are those
- effective addresses that use the BP register for indexing. All
- BP-indexed memory references have a default of SS. (This is
- because BP is intended to be used for addressing local variables,
- stored on the stack.)
-
- If you wish your memory access to use a different segment
- register, you provide a segment override byte before the
- instruction containing the effective address operand. In the A86
- language, you code the override by giving the name of the segment
- register you wish before the instruction mnemonic. For example,
- suppose you want to load the AL register with the memory byte
- pointed to by BX. If you code MOV AL,[BX], the DS register will
- be used to determine which 64K segment BX is pointing to. If you
- want the byte to come from the CS-segment instead, you code CS
- MOV AL,[BX]. Be aware that the segment override byte has effect
- only upon the single instruction that follows it. If you have a
- sequence of instructions requiring overrides, you must give an
- override byte before every instruction in the sequence. (In that
- case, you may wish to consider changing the value of the default
- segment register for the duration of the sequence.)
-
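The default-segment rule and the override byte can be modelled in Python as follows. This is a sketch under stated assumptions: the names `segment_used` and `with_override` are invented for the example, and the prefix values (26, 2E, 36, 3E hex for ES, CS, SS, DS) are the standard 86 override opcodes rather than something stated in this chapter.

```python
# Segment override prefix bytes of the 86 family
SEGMENT_PREFIX = {"ES": 0x26, "CS": 0x2E, "SS": 0x36, "DS": 0x3E}

def segment_used(parts, override=None):
    """Return the segment register a memory access will use.

    `parts` lists the registers used for indexing, e.g. ["BP", "SI"];
    an empty list stands for a simple memory variable.
    """
    if override is not None:
        return override
    return "SS" if "BP" in parts else "DS"   # BP-indexed accesses default to SS

def with_override(instruction_bytes, segment):
    """Prepend the override byte for `segment` to one encoded instruction."""
    return bytes([SEGMENT_PREFIX[segment]]) + bytes(instruction_bytes)

mov_al_bx = bytes([0x8A, 0x07])                 # MOV AL,[BX]
print(with_override(mov_al_bx, "CS").hex(" "))  # CS MOV AL,[BX]
```

The override applies only to the instruction it precedes, so in this model each instruction in a sequence would need its own call to `with_override`.
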
- NOTE: This method for providing segment overrides is unique to
- the A86 assembler! The assemblers provided by Intel and IBM (MS-
- DOS) attempt to figure out segment allocation for you, and plug
- in segment override bytes "behind your back". In order to do
- this, those assemblers require you to inform them which variables
- and structures are pointed to by which segment registers. That
- is what the ASSUME directive in those assemblers is all about. I
- wrote Intel's first 86 assembler, ASM86, so I have been watching
- the situation since day one. Over the years, I have concluded
- that the ASSUME mechanism creates far, far more confusion than it
- solves. So I scrapped it; and the result is an assembler with
- far less red tape. But if your program needs more than 64K, you
- do have to manage those segment registers yourself; so take care!
-
-
- Effective Use of Effective Addresses
-
- Remember that all of the common instructions of the 86 family
- allow effective addresses as operands. (The only major functions
- that don't are the AL/AX specific ones: multiply, divide, and
- input/output.) This means that you don't have to funnel operands
- through AL or AX just to do something with them. You can perform
- all the common arithmetic, PUSH/POP, and MOVes from any general
- register to any general register; from any memory location
- (indexed if you like) to any register; and (this is most often
- overlooked) from any register TO memory. The only thing you
- can't do in general is memory-to-memory. Among the more common
- operations that inexperienced 86 programmers overlook are:
-
- * setting memory variables to immediate values
-
- * testing memory variables, and comparing them to constants
-
- * preserving memory variables by PUSHing and POPping them
-
- * incrementing and decrementing memory variables
-
- * adding into memory variables
-
- Encoding of Effective Addresses
-
- Unless you are concerned with the nitty-gritty details of 86
- instruction encoding, you don't need to read this section.
-
- Every instruction with an effective address has an encoded byte,
- known as the effective address byte, following the 1-byte opcode
- for the instruction. (For obscure reasons, Intel calls this byte
- the ModRM byte.) If the effective address is a memory variable,
- or an indexed memory location with a non-zero constant offset,
- then the effective address byte will be immediately followed by
- the offset amount. Amounts in the range -128 to +127 are given
- by a single signed byte, denoted by "d8" in the table below.
- Amounts requiring a 2-byte representation are denoted by "d16" in
- the table below. As with all 16-bit memory quantities in the 86
- family, the word is stored with the least significant byte FIRST.
-
- The following table of effective address byte values is organized
- into 32 rows and 8 columns. The 32 rows give the possible values
- for the effective address operand: 8 registers and 24 memory
- indexing modes. A 25th indexing mode, [BP] with zero
- displacement, has been pre-empted by the simple-memory-variable
- case. If you code [BP] with no displacement, you will get
- [BP]+d8, with a d8-value of zero.
-
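The displacement rules above, including the [BP] special case, can be sketched in Python. The function name `displacement_bytes` and its parameters are assumptions made for this illustration. It returns the row group (0 for no displacement, 1 for d8, 2 for d16) together with the bytes that follow the effective address byte.

```python
RM_BP = 6   # row position shared by [BP] and the simple-variable (d16) case

def displacement_bytes(disp, rm, direct=False):
    """Return (group, bytes) for a memory operand with constant `disp`.

    rm     : position of the indexing mode within its group, 0..7
             (ignored when `direct` is True)
    direct : True for a simple memory variable, which uses no registers
    """
    if direct:
        # Simple variable: always a 16-bit offset, least significant byte first
        return 0, (disp & 0xFFFF).to_bytes(2, "little")
    if disp == 0 and rm != RM_BP:
        return 0, b""                                   # no displacement needed
    if -128 <= disp <= 127:
        return 1, (disp & 0xFF).to_bytes(1, "little")   # d8, one signed byte
    return 2, (disp & 0xFFFF).to_bytes(2, "little")     # d16, low byte first

print(displacement_bytes(0, RM_BP))     # [BP] with no displacement
print(displacement_bytes(0x1234, 7))    # [BX] + d16
```

In this model, [BP] with no displacement yields group 1 with a single zero byte, because the group 0 slot at that position belongs to the simple memory variable.
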
- The 8 columns of the table reflect further information given by
- the effective address byte. Usually, this is the identity of the
- other (always a register) operand of a 2-operand instruction.
- Those instructions are identified by a "/r" following the opcode
- byte in the instruction list. Sometimes, the information given
- supplements the opcode byte in identifying the instruction
- itself. Those instructions are identified by a "/" followed by a
- digit from 0 through 7. The digit tells which of the 8 columns
- you should use to find the effective address byte.
-
- For example, suppose you have a perverse wish to know the precise
- bytes encoded by the instruction SUB B[BX+17],100. This
- instruction subtracts an immediate quantity, 100, from an
- effective address quantity, B[BX+17]. By consulting the
- instruction list, you find the general form SUB eb,ib. The
- opcode bytes given there are 80 /5 ib. The "/5" denotes an
- effective address byte, whose value will be taken from column 5
- of the table below. The offset 17 decimal, which is 11 hex, will
- fit in a single "d8" byte, so we take our value from the "[BX] +
- d8" row. The table tells us that the effective address byte is
- 6F. Immediately following the 6F is the offset, 11 hex.
- Following that is the ib-value of 100 decimal, which is 64 hex.
- So the bytes generated by SUB B[BX+17],100 are 80 6F 11 64.
-
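The table lookup in the example above follows a regular bit pattern, so it can also be computed. The Python sketch below is illustrative: the name `ea_byte` is an assumption, and the three fields are what Intel's documentation calls mod, reg, and r/m. The top two bits select the row group, the middle three bits select the column, and the low three bits select the row within the group.

```python
def ea_byte(group, column, row):
    """Build the effective address byte from its three fields.

    group  : 0 = no displacement, 1 = d8, 2 = d16, 3 = register operand
    column : register number or "/digit" value, 0..7
    row    : position within the group, 0..7 (7 is [BX])
    """
    return (group << 6) | (column << 3) | row

# SUB B[BX+17],100  ->  opcode 80 /5 ib, using the "[BX] + d8" row
encoded = bytes([0x80, ea_byte(1, 5, 7), 17, 100])
print(encoded.hex(" "))   # 80 6f 11 64
```

The computed byte agrees with the 6F found in column 5 of the "[BX] + d8" row, and the full sequence matches the 80 6F 11 64 derived in the text.
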
-
-
- Table of Effective Address byte values
-
-     s =  ES  CS  SS  DS
-    rb =  AL  CL  DL  BL  AH  CH  DH  BH
-    rw =  AX  CX  DX  BX  SP  BP  SI  DI
- digit =   0   1   2   3   4   5   6   7
-                                          Effective
- EA byte                                  address:
- values:  00  08  10  18  20  28  30  38  [BX + SI]
-          01  09  11  19  21  29  31  39  [BX + DI]
-          02  0A  12  1A  22  2A  32  3A  [BP + SI]
-          03  0B  13  1B  23  2B  33  3B  [BP + DI]
-
-          04  0C  14  1C  24  2C  34  3C  [SI]
-          05  0D  15  1D  25  2D  35  3D  [DI]
-          06  0E  16  1E  26  2E  36  3E  d16 (simple var)
-          07  0F  17  1F  27  2F  37  3F  [BX]
-
-          40  48  50  58  60  68  70  78  [BX + SI] + d8
-          41  49  51  59  61  69  71  79  [BX + DI] + d8
-          42  4A  52  5A  62  6A  72  7A  [BP + SI] + d8
-          43  4B  53  5B  63  6B  73  7B  [BP + DI] + d8
-
-          44  4C  54  5C  64  6C  74  7C  [SI] + d8
-          45  4D  55  5D  65  6D  75  7D  [DI] + d8
-          46  4E  56  5E  66  6E  76  7E  [BP] + d8
-          47  4F  57  5F  67  6F  77  7F  [BX] + d8
-
-          80  88  90  98  A0  A8  B0  B8  [BX + SI] + d16
-          81  89  91  99  A1  A9  B1  B9  [BX + DI] + d16
-          82  8A  92  9A  A2  AA  B2  BA  [BP + SI] + d16
-          83  8B  93  9B  A3  AB  B3  BB  [BP + DI] + d16
-
-          84  8C  94  9C  A4  AC  B4  BC  [SI] + d16
-          85  8D  95  9D  A5  AD  B5  BD  [DI] + d16
-          86  8E  96  9E  A6  AE  B6  BE  [BP] + d16
-          87  8F  97  9F  A7  AF  B7  BF  [BX] + d16
-
-          C0  C8  D0  D8  E0  E8  F0  F8  ew=AX  eb=AL
-          C1  C9  D1  D9  E1  E9  F1  F9  ew=CX  eb=CL
-          C2  CA  D2  DA  E2  EA  F2  FA  ew=DX  eb=DL
-          C3  CB  D3  DB  E3  EB  F3  FB  ew=BX  eb=BL
-
-          C4  CC  D4  DC  E4  EC  F4  FC  ew=SP  eb=AH
-          C5  CD  D5  DD  E5  ED  F5  FD  ew=BP  eb=CH
-          C6  CE  D6  DE  E6  EE  F6  FE  ew=SI  eb=DH
-          C7  CF  D7  DF  E7  EF  F7  FF  ew=DI  eb=BH
-
- d8 denotes an 8-bit displacement following the EA byte, to be
- sign-extended and added to the index.
-
- d16 denotes a 16-bit displacement following the EA byte, to be
- added to the index.
-
- Default segment register is SS for effective addresses containing
- a BP index; DS for other memory effective addresses.
-
-
-
- How to Read the Instruction Set Chart
-
- The following chart summarizes the machine instructions you can
- program with A86. In order to use the chart, you need to learn
- the meanings of the specifiers (each given by 2 lower case
- letters) that follow most of the instruction mnemonics. Each
- specifier indicates the type of operand (register byte, immediate
- word, etc.) that follows the mnemonic to produce the given
- opcodes.
-
-
- "c" means the operand is a code label, pointing to a part of the
- program to be jumped to or called. A86 will also accept a
- constant offset in this place (or a constant segment-offset
- pair in the case of "cd"). "cb" is a label within about 128
- bytes (in either direction) of the current location. "cw" is
- a label within the same code segment as this program; "cd" is
- a pair of constants separated by a colon-- the segment value
- to the left of the colon, and the offset to the right. Note
- that in both the cb and cw cases, the object code generated
- is the offset from the location following the current
- instruction, not the absolute location of the label operand.
- In some assemblers (most notably for the Z-80 processor) you
- have to code this offset explicitly by putting "$-" before
- every relative jump operand in your source code. You do NOT
- need to, and should not do so with A86.
-
- "e" means the operand is an Effective Address. The concept of
- an Effective Address is central to the 86 machine
- architecture, and thus to 86 assembly language programming.
- It is described in detail at the start of this chapter. We
- summarize here by saying that an Effective Address is either
- a general purpose register, a memory variable, or an indexed
- memory quantity. For example, the instruction form "ADD rb,eb"
- includes the instructions ADD AL,BL; ADD CH,BYTEVAR; and
- ADD DL,B[BX+17].
-
- "i" means the operand is an immediate constant, provided as part
- of the instruction itself. "ib" is a byte-sized constant;
- "iw" is a constant occupying a full 16-bit word. The operand
- can also be a label, defined with a colon. In that case, the
- immediate constant which is the location of the label is
- used. Examples: "MOV rw,iw" includes the instructions: MOV
- AX,17, or MOV SI,VAR_ARRAY, where "VAR_ARRAY:" appears
- somewhere in the program, defined with a colon. NOTE that if
- VAR_ARRAY were defined without a colon, e.g., "VAR_ARRAY DW
- 1,2,3", then "MOV SI,VAR_ARRAY" would be a "MOV rw,ew" NOT a
- "MOV rw,iw". The MOV would move the contents of memory at
- VAR_ARRAY (in this case 1) into SI, instead of the location
- of the memory. To load the location, you can code "MOV
- SI,OFFSET VAR_ARRAY".
-
- "m" means a memory variable or an indexed memory quantity; i.e.,
- any Effective Address EXCEPT a register.
-
- "r" means the operand is a general purpose register. The 8 "rb"
- registers are AL,BL,CL,DL,AH,BH,CH,DH; the 8 "rw" registers
- are AX,BX,CX,DX,SI,DI,BP,SP.
-
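The "cb" and "cw" entries above state that the generated operand is an offset from the location following the instruction. A Python sketch of that calculation follows. The name `relative_operand` and its parameters are assumptions for illustration; the EB opcode used in the demonstration is the standard 86 short jump.

```python
def relative_operand(instr_addr, instr_len, target, size):
    """Return the cb (size=1) or cw (size=2) operand bytes for a jump or call.

    instr_addr : offset of the first byte of the instruction
    instr_len  : total length of the instruction in bytes
    target     : offset of the label being jumped to or called
    """
    offset = target - (instr_addr + instr_len)   # measured from the next instruction
    if size == 1 and not -128 <= offset <= 127:
        raise ValueError("target out of range for a cb operand")
    mask = (1 << (8 * size)) - 1
    return (offset & mask).to_bytes(size, "little")   # low byte first

# A 2-byte short jump at 0100 hex that jumps to itself
print((b"\xEB" + relative_operand(0x100, 2, 0x100, 1)).hex(" "))   # eb fe
```

A jump to its own address needs an offset of -2, because the offset is counted from the end of the 2-byte instruction. A86 performs this subtraction for you, which is why the label alone is written in the source.
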
- WARNING: Instruction forms marked with "*" by the mnemonic are
- part of the extended 186/286/NEC instruction set. Instructions
- marked with "#" are unique to the NEC processors. These
- instructions will NOT work on the 8088 of the IBM-PC; nor will
- they work on the 8086; nor will the NEC instructions work on the
- 186 or 286. If you wish your programs to run on all PCs, do not
- use these instructions!
-
-