- CHAPTER 8 NUMBERS AND EXPRESSIONS
-
-
- Numbers and Bases
-
- A86 supports a variety of formats for numbers. In non-computer
- life, we write numbers in a decimal format. There are ten
- digits, 0 through 9, that we use to describe numbers; and each
- digit position is ten times as significant as the position to its
- right. The number ten is called the "base" of the decimal
- format. Computer programmers often find it convenient to use
- other bases to specify numbers used in their programs. The most
- commonly-used bases are two (binary format), sixteen (hexadecimal
- format), and eight (octal format).
-
- The hexadecimal format requires sixteen digits. The extra six
- digits beyond 0 through 9 are denoted by the first six letters of
- the alphabet: A for ten, B for eleven, C for twelve, D for
- thirteen, E for fourteen, and F for fifteen.
-
- In A86, a number must always begin with a digit from 0 through 9,
- even if the base is hexadecimal. This is so that A86 can
- distinguish between a number and a symbol that happens to have
- digits in its name. If a hexadecimal number would begin with a
- letter, you precede the letter with a zero. For example, hex A0,
- which is the same as decimal 160, would be written 0A0.
-
- Because it is necessary for you to prepend leading zeroes to many
- hex numbers, and because you never have to do so for decimal
- numbers, I decided to make hexadecimal the default base for
- numbers with leading zeroes. Decimal is still the default base
- for numbers beginning with 1 through 9.
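-
- A minimal sketch of the default-base rule. The decimal values in
- the comments are worked out from the rule above, not taken from an
- actual A86 listing:
-
-      DB 0A0        ; leading zero: hex A0 = 160 decimal
-      DB 160        ; leading digit 1: decimal 160, the same byte
-      DB 010        ; leading zero: hex 10 = 16 decimal
-      DB 10         ; leading digit 1: decimal 10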
-
- Large numbers can be given as the operands to DD, DQ, or DT
- directives. For readability, you may freely intersperse
- underscore characters anywhere within your numbers.
-
- The default base can be overridden, with a letter or letters at
- the end of the number: B or xB for binary, O or Q for octal, H
- for hexadecimal, and D or xD for decimal. Examples:
-
- 077Q octal, value is 8*7 + 7 = 63 in decimal notation
- 123O octal if the "O" is a letter: 64 + 2*8 + 3 = 83 decimal
- 1230 decimal 1230: shows why you should use "Q" for octal!!
- 01234567H large constant
- 0001_0000_0000_0000_0003R real number specified in hexadecimal
- 100D superfluous D indicates decimal base
- 0100D hex number 100D, which is 4096 + 13 = 4109 in decimal
- 0100xD decimal 100, since xD overrides the default hex format
- 0110B hex 110B, which is 4096 + 256 + 11 = 4363 in decimal
- 0110xB binary 4+2 = 6 in decimal notation
- 110B also binary 4+2 = 6, since "B" is not a decimal digit
-
- The last five examples above illustrate why an "x" is sometimes
- necessary before the base-override letter "B" or "D". If that
- letter can be interpreted as a hex digit, it is; the "x" forces
- an override interpretation for the "B" or "D". By the way, the
- usage of lower case for x and upper case for the following
- override letter is simply a recommendation; A86 treats upper- and
- lower-case letters equivalently.
-
-
- The RADIX Directive
-
- The above-mentioned set of defaults (hex if leading zero, decimal
- otherwise) can be overridden with the RADIX directive. The RADIX
- directive consists of the word RADIX followed by a number from 2
- to 16. That number is itself ALWAYS read as decimal,
- regardless of any (or no) previous RADIX commands. The number
- gives the default base for ALL subsequent numbers, up to (but not
- including) the next RADIX command. If there is no number
- following RADIX, then A86 returns to its initial mixed default of
- hex for leading zeroes, decimal for other leading digits.
-
- For compatibility with IBM's assembler, RADIX can appear with a
- leading period; although I curse the pinhead designer who put
- that period into IBM's language.
-
- As an alternative to the RADIX directive, I provide the D switch,
- which causes A86 to start with decimal defaults. You can put +D
- into the A86 command invocation, or into the A86 environment
- variable. The first RADIX command in the program will override
- the D switch setting.
-
- Following are examples of radix usage. The numbers in the
- comments are all in decimal notation.
-
- DB 10,010 ; produces 10,16 if RADIX was not seen yet
- ; and +D switch was not specified
- RADIX 10
- DB 10,010 ; produces 10,10
- RADIX 16
- DB 10,010 ; produces 16,16
- RADIX 2
- DB 10,01010 ; produces 2,10
- RADIX 3 ; for Martian programmers in Heinlein novels
- DB 10,100 ; produces 3,9
- RADIX
- DB 10,010 ; produces 10,16
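-
- The point that the operand of RADIX is itself always read as
- decimal can be sketched as follows (comment values are derived
- from the rules above and are in decimal notation):
-
-      RADIX 16
-      DB 10         ; produces 16
-      RADIX 10      ; read as decimal ten, even with RADIX 16 in effect
-      DB 10         ; produces 10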
-
- Floating Point Initializations
-
- A86 allows floating point numbers as the operands to DD, DQ, and
- DT directives. The numbers are encoded according to the IEEE
- standard, which the 8087 and 287 coprocessors follow. The format
- for floating point constants is as follows: First, there is a
- decimal number containing a decimal point. There must be a
- decimal point, or else the number is interpreted as an integer.
- There must also be at least one decimal digit, either to the left
- or right of the decimal point, or else the decimal point is
- interpreted as an addition (structure element) operator.
- Optionally, there may follow immediately after the decimal number
- the letter E followed by a decimal number. The E stands for
- "exponent", and means "times 10 raised to the power of". You may
- provide a + or - between the E and its number. Examples:
-
- 0.1 constant one-tenth
- .1 the same
- 300. floating point three hundred
- 30.E1 30 * 10**1; i.e., three hundred
- 30.E+1 the same
- 30.E-1 30 * 10**-1; i.e., three
- 30E1 not floating point: hex integer 030E1
- 1.234E20 scientific notation: 1.234 times 10 to the 20th
- 1.234E-20 a tiny number: 1.234 divided by 10 to the 20th
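-
- A sketch of how such constants might appear in data definitions.
- The variable names are made up for illustration; the sizes follow
- from the directives (DD is 4 bytes, DQ is 8, DT is 10):
-
-      TENTH   DD 0.1         ; 4-byte real, one-tenth
-      AVOG    DQ 6.022E23    ; 8-byte real in scientific notation
-      THREE   DT 30.E-1      ; 10-byte real, value three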
-
-
-
- Overview of Expressions
-
- Most of the operands that you code into your instructions and
- data initializations will be simple register names, variable
- names, or constants. However, you will regularly wish to code
- operands that are the results of arithmetic calculations,
- performed either by the machine when the program is running (for
- indexing), or by the assembler (to determine the value to
- assemble into the program). A86 has a full set of operators that
- you can use to create expressions to cover these cases:
-
- * Arithmetic Operators
- byte isolation and combination (HIGH, LOW, BY)
- addition and subtraction (+,-)
- multiplication and division (* , /, MOD)
- shifting operators (SHR, SHL, BIT)
-
- * Logical Operators
- (AND, OR, XOR, NOT)
-
- * Boolean Negation Operator
- (!)
-
- * Relational Operators
- (EQ, LE, LT, GE, GT, NE)
-
- * String Comparison Operators
- (EQ, NE, =)
-
- * Attribute Operators/Specifiers
- size specifiers (B=BYTE,W=WORD,F=FAR,SHORT,LONG)
- attribute specifiers (OFFSET,NEAR,brackets)
- segment addressing specifier (:)
- compatibility operators (PTR,ST)
- built-in value specifiers (TYPE,THIS,$)
-
- * Special Data Duplication Operator
- (DUP) --see Chapter 9 for a description
-
-
- Types of Expression Operands
-
- Numbers and Label Addresses
-
- A number or constant (16-bit number) can be used in most
- expressions. A label (defined with a colon) is also treated as
- a constant and so can be used in expressions, except when it is a
- forward reference.
-
- Variables
-
- A variable stands for a byte- or word-memory location. You may
- add or subtract constants from variables; when you do so, the
- constant is added to the address of the variable. You typically
- do this when the variable is the name of a memory array.
-
- Index Expressions
-
- An index expression consists of a combination of a base register
- [BX] or [BP], and/or an index register [SI] or [DI], with an
- optional constant added or subtracted. You will usually want to
- precede the bracketed expression with B, W, or F; to specify the
- kind of memory unit (byte, word, or far pointer) you are
- referring to. The expression stands for the memory unit whose
- address is the run-time value(s) of the base and/or index
- registers added to the constant. See the Effective Address
- section and the beginning of this chapter for more details on
- indexed memory.
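-
- A few index expressions, as a sketch. The B and W prefixes give the
- size of the memory unit; the offsets chosen are arbitrary:
-
-      MOV AL,B[BX]        ; byte at the address held in BX
-      MOV AX,W[SI+2]      ; word two bytes past the address in SI
-      MOV AX,W[BP-4]      ; word four bytes below the address in BP
-      MOV AL,B[BX+SI]     ; byte at the sum of a base and an index register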
-
-
- Arithmetic Operators
-
-
- HIGH/LOW
-
- Syntax: HIGH operand
- LOW operand
-
- These operators are called the "byte isolation" operators. The
- operand must evaluate to a 16-bit number. HIGH returns the
- high order byte of the number; LOW the low order byte.
-
- For example,
-
- MOV AL,HIGH(01234) ; AL = 012
- TENHEX EQU LOW(0FF10) ; TENHEX = 010
-
- These operators can be applied to each other. The following
- identities apply:
-
- LOW LOW Q = LOW Q
- LOW HIGH Q = HIGH Q
- HIGH LOW Q = 0
- HIGH HIGH Q = 0
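-
- Worked through for Q = 01234 (hex), with parentheses added so that
- no two operators sit next to each other. The comment values follow
- from the definitions of HIGH and LOW:
-
-      DB LOW 01234            ; 034
-      DB HIGH 01234           ; 012
-      DB LOW (LOW 01234)      ; 034, same as LOW 01234
-      DB LOW (HIGH 01234)     ; 012, same as HIGH 01234
-      DB HIGH (LOW 01234)     ; 0, since LOW leaves nothing in the high byte
-      DB HIGH (HIGH 01234)    ; 0, for the same reason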
-
-
- BY
-
- Syntax: operand BY operand
-
- This operator is a "byte combination" operator. It returns the
- word whose high byte is the left operand, and whose low byte is
- the right operand. For example, the expression 3 BY 5 is the
- same as hexadecimal 0305. The BY operator is exclusive to A86. I
- added it to cover the following situation: Suppose you are
- initializing your registers to immediate values. Suppose you
- want to initialize AH to the ASCII value 'A', and AL to decimal
- 10. You could code this as two instructions MOV AH,'A' and MOV
- AL,10; but you realize that a single load into the AX register
- would save both program space and execution time. Without the BY
- operator, you would have to code MOV AX,0410A, which disguises
- the types of the individual byte operands you were thinking
- about. With BY, you can code it properly: MOV AX,'A' BY 10.
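-
- The same idea in a few lines, as a sketch; the register values in
- the comments are computed by hand from the definition of BY:
-
-      MOV AX,'A' BY 10     ; AX = 0410A: AH = 041 ('A'), AL = 0A (ten)
-      MOV AX,3 BY 5        ; AX = 0305
-      DW  0FF BY 0         ; word value 0FF00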
-
-
- Addition (combination)
-
- Syntax: operand + operand
- operand.operand
- operand PTR operand
- operand operand
-
- As shown in the above syntax, addition can be accomplished in
- four ways: with a plus sign, with a dot operator, with a PTR
- operator, and simply by juxtaposing two operands next to each
- other. The dot and PTR operators are provided for compatibility
- with Intel/IBM assemblers. The dot is used in structure field
- notation; PTR is used in expressions such as BYTE PTR 0. (See
- Chapter 12 for recommendations concerning PTR.)
-
- If either operand is a constant, the answer is an expression with
- the typing of the other operand, with the offsets added. For
- example, if BVAR is a byte variable, then BVAR + 100 is the byte
- variable 100 bytes beyond BVAR.
-
- Other examples:
-
- DB 100+17 ; simple addition
- CTRL EQU -040
- MOV AL,CTRL'D' ; a nice notation for control-D!
- MOV DX,[BP].SMEM ; --where SMEM was in an unindexed structure
- DQ 10.0 + 7.0 ; floating point addition
-
- Subtraction
-
- Syntax: operand - operand
-
- The subtraction operator may have operands that are:
-
- a. both absolute numbers
-
- b. variable names that have the same type
-
- The result is an absolute number; the difference between the two
- operands.
-
- Subtraction is also allowed between floating point numbers; the
- answer is the floating point difference.
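-
- One common use, sketched with made-up names: the difference between
- two byte variables gives the length of the data lying between them.
- MSG_END is declared with an operand-less DB, so it names the location
- just past the string:
-
-      MSG      DB 'HELLO'
-      MSG_END  DB
-               MOV CX,MSG_END - MSG   ; CX = 5, the length of the string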
-
-
- Multiplication and Division
-
- Syntax: operand * operand (multiplication)
- operand / operand (division)
- operand MOD operand (modulo)
-
- You may only use these operators with absolute or floating point
- numbers, and the result is always the same type. Either operand
- may be a numeric expression, as long as the expression evaluates
- to an absolute or floating point number. Examples:
-
- CMP AL,2 * 4 ; compare AL to 8
- MOV BX,0123/16 ; BX = 012
- DT 1.0 / 7.0
-
-
-
- Shifting Operators
-
- Syntax: operand SHR count (shift right)
- operand SHL count (shift left)
- BIT count (bit number)
-
- The shift operators will perform a "bit-wise" shift of the
- operand. The operand will be shifted "count" bits either to the
- right or the left. Bits shifted into the operand will be set to
- 0.
-
- The expression "BIT count" is equivalent to "1 SHL count"; i.e.,
- BIT returns the mask of the single bit whose number is "count".
- The operands must be numeric expressions that evaluate to
- absolute numbers. Examples:
-
- MOV BX, 0FACBH SHR 4 ; BX = 0FACH
- OR AL,BIT 6 ; AL = AL OR 040; 040 is the mask for bit 6
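-
- A few more cases, as a sketch with hand-computed results:
-
-      MOV AX,1 SHL 4          ; AX = 010 (16 decimal)
-      MOV AX,0FF00 SHR 8      ; AX = 0FF
-      MOV AL,BIT 0            ; AL = 1, the mask for bit 0
-      MOV AL,BIT 7            ; AL = 080, the mask for bit 7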
-
- Logical Operators
-
- Syntax: operand OR operand
- operand XOR operand
- operand AND operand
- NOT operand
-
- The logical operators may only be used with absolute numbers.
- They always return an absolute number.
-
- Logical operators operate on individual bits. Each bit of the
- answer depends only on the corresponding bit in the operand(s).
-
- The functions performed are as follows:
-
- 1. OR: An answer bit is 1 if either or both of the operand bits
- is 1. An answer bit is 0 only if both operand bits are 0.
-
- Example:
-
- 11110000xB OR 00110011xB = 11110011xB
-
-
- 2. XOR: This is "exclusive OR." An answer bit is 1 if the
- operand bits are different; an answer bit is 0 if the operand
- bits are the same. Example:
-
- 11110000xB XOR 00110011xB = 11000011xB
-
-
- 3. AND: An answer bit is 1 only if both operand bits are 1. An
- answer bit is 0 if either or both operand bits are 0.
- Example:
-
- 11110000xB AND 00110011xB = 00110000xB
-
- 4. NOT: An answer bit is the opposite of the operand bit. It
- is 1 if the operand bit is 0; 0 if the operand bit is 1.
- Example:
-
- NOT 00110011xB = 11001100xB
-
-
- Boolean Negation Operator
-
- Syntax: ! operand
-
- The exclamation-point operator, rather than reversing each
- individual bit of the operand, considers the entire operand as a
- boolean variable to be negated. If the operand is non-zero (any
- of the bits are 1), the answer is 0. If the operand is zero, the
- answer is 0FFFF.
-
- Because ! is intended to be used in conditional assembly
- expressions (described in Chapter 11), there is also a special
- action when ! is applied to an undefined name: the answer is the
- defined value 0FFFF, meaning it is TRUE that the symbol is
- undefined. Similarly, when ! is applied to some defined quantity
- other than an absolute constant, the answer is 0, meaning it is
- FALSE that the operand is undefined.
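-
- A sketch contrasting ! with NOT on absolute constants. The results
- are worked out by hand, assuming 16-bit evaluation as described for
- constants elsewhere in this chapter:
-
-      DW ! 0          ; 0FFFF: the operand is zero
-      DW ! 5          ; 0: the operand is non-zero
-      DW NOT 5        ; 0FFFA: every individual bit reversed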
-
-
- Relational Operators
-
- Syntax: operand EQ operand (equal)
- operand NE operand (not equal)
- operand LT operand (less than)
- operand LE operand (less or equal)
- operand GT operand (greater than)
- operand GE operand (greater or equal)
-
- The relational operators may have operands that are:
-
- a. both absolute numbers
-
- b. variable names that have the same type
-
- The result of a relational operation is always an absolute
- number. They return an 8- or 16-bit result of all 1's for TRUE
- and all 0's for FALSE. Examples:
-
- MOV AL, 3 EQ 0 ; AL = 0 (false)
- MOV AX, 2 LE 15 ; AX = 0FFFFH (true)
-
-
- String Comparison Operators
-
- Syntax: string EQ string (equal)
- string NE string (not equal)
- string = string (equal ignoring case)
-
- In order to subsume the string comparison facilities offered by
- That Other Assembler's special conditional-assembly directives
- IFIDN and IFDIF, A86 allows the relational operators EQ and NE to
- accept string arguments. For this syntax to be accepted by A86,
- both strings must be bounded using the same delimiter (either
- single quotes for both strings, or double quotes for both
- strings). For a match (EQ returns TRUE or NE returns FALSE), the
- strings must be the same length, and every character must match
- exactly.
-
- An additional A86-exclusive feature is the = operator, which
- returns TRUE if the characters of the strings differ only in the
- bit masked by the value 020. Thus you may use = to compare a
- macro parameter to a string containing nothing but letters. The
- comparison will be TRUE whether the macro parameter is upper-case
- or lower-case. No checking is made to detect non-letters, so if
- you use = on strings containing non-letters, you may get some
- false TRUE results. Also, = is accepted when it is applied to
- non-strings as well-- the corresponding values are interpreted as
- two-byte strings, with the 020 bits masked away before
- comparison.
-
-
-
- Attribute Operators/Specifiers
-
-
- B,W,D,Q,T memory variable specifiers
-
- Syntax: B operand Q operand
- operand B operand Q
- W operand T operand
- operand W operand T
- D operand
- operand D
-
- B, W, D, F, Q, and T convert the operand into a byte, word,
- doubleword, far, quadword, and ten-byte variable, respectively.
- The operand can be a constant, or a variable of another type.
- Examples:
-
- ARRAY_PTR:
- DB 100 DUP (?)
- WVAR DW ?
- MOV AL,ARRAY_PTR B ; load first byte of ARRAY_PTR array into AL
- MOV AL,WVAR B ; load the low byte of WVAR into AL
- MOV AX,W[01000] ; load AX with the memory word at loc. 01000
- LDS BX,D[01000] ; load DS:BX with the doubleword at loc. 01000
- JMP F[01000] ; jump far to the 4-byte location at 01000
- FLD T[BX] ; load ten-byte number at [BX] to 87 stack
-
-
- For compatibility with Intel/IBM assemblers, A86 accepts the more
- verbose synonyms BYTE, WORD, DWORD, FAR, QWORD, and TBYTE for
- B,W,D,F,Q,T, respectively.
-
-
- SHORT and LONG Operators
-
- Syntax: SHORT label
- LONG label
-
- The SHORT operator is used to specify that the label referenced
- by a JMP instruction is within 127 bytes of the end of the
- instruction. The LONG operator specifies the opposite: that the
- label is not within 127 bytes. The appropriate operator can (and
- sometimes must) be used if the label is forward referenced in the
- instruction.
-
- When a non-local label is forward referenced, the assembler
- assumes that it will require two bytes to represent the relative
- offset of the label (so the instruction including the opcode byte
- will be three bytes). By correctly using the SHORT operator, you
- can save a byte of code when you use a forward reference. If the
- label is not within the specified range, an error will occur. The
- following example illustrates the use of the SHORT operator.
-
- JMP FWDLAB ; three byte instruction
- JMP SHORT FWDLAB ; two byte instruction
- JMP >L1 ; two byte instruction assumed for a local label
-
- Because the assembler assumes that a forward-referenced local
- label is SHORT, you may sometimes be forced to override this
- assumption if the label is in fact not within 127 bytes of the
- JMP. This is why LONG is provided:
-
- JMP LONG >L9 ; three byte instruction
-
- If you are bothered by this possibility, you can specify the +L
- switch, which causes A86 to pessimistically generate the three
- byte JMP for all forward references, unless specifically told not
- to with SHORT.
-
- NOTE that LONG has an effect only on the operand of an
- unconditional JMP instruction, not on conditional jumps. This is
- because the conditional jumps don't have 3-byte forms; the only
- conditional jumps are short ones. If you run into this problem,
- then chances are your code is getting out of control--time to
- rearrange, or to break off some of the intervening code into
- separate procedures. If you insist upon leaving the code intact,
- you can replace the conditional jump with an "IF cond JMP".
-
-
- OFFSET Operator
-
- Syntax: OFFSET var-name
-
- OFFSET is used to convert a variable into the constant pointer to
- the variable. For example, if you have declared XX DW ?, and
- you want to load SI with the pointer to the variable XX, you can
- code: MOV SI,OFFSET XX. The simpler instruction MOV SI,XX moves
- the variable contents of XX into SI, not the constant pointer to
- XX.
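-
- The distinction, laid out as a short sketch (the comments describe
- the behavior stated above):
-
-      XX  DW ?
-          MOV SI,OFFSET XX    ; SI = address of XX, a constant
-          MOV AX,[SI]         ; AX = contents of XX, fetched through SI
-          MOV SI,XX           ; SI = contents of XX, not its address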
-
- NEAR Operator
-
- Syntax: NEAR operand
-
- NEAR converts the operand to have the type of a code label, as if
- it were defined by appearing at the beginning of a program line
- with a colon after it. NEAR is provided mainly for compatibility
- with Intel/IBM assemblers.
-
-
- Square Brackets Operator
-
- Syntax: [operand]
-
- Square brackets around an operand give the operand a memory
- variable type. Square brackets are generally used to enclose the
- names of base and index registers: BX, BP, SI, and DI. When the
- size of the memory variable can be deduced from the context of
- the expression, square brackets are also used to turn numeric
- constants into memory variables. Examples:
-
- MOV B[BX+50],047 ; move imm value 047 into mem byte at BX+50
- MOV AL,[050] ; move byte at memory location 050 into AL
- MOV AL,050 ; move immediate value 050 into AL
-
-
- Colon Operator
-
- Syntax: constant:operand
- segreg:operand
- seg_or_group_name:operand
-
- The colon operator is used to attach a segment register value to
- an operand. The segment register value appears to the left of
- the colon; the rest of the operand appears to the right of the
- colon.
-
- There are three forms to the colon operator. The first form has
- a constant as the segment register value. This form is used to
- create an operand to a long (inter-segment) JMP or CALL
- instruction. An example of this is the instruction JMP 0FFFF:0,
- which jumps to the cold-boot reset location of the 86 processor.
-
- The only context other than JMP or CALL in which this first form
- is legal, is as the operand to a DD directive or an EQU
- directive. The EQU case has a further restriction: the offset
- (the part to the right of the colon) must have a value less than
- 256. This is because there simply isn't room in a symbol table
- entry for a segment register value AND a 2-byte offset. I don't
- think you will be hurt by this restriction, since references to
- other segments are usually to jump tables at the beginning of
- those segments.
-
- The second form has a segment register name to the left of the
- colon. This is the segment override form, provided for
- compatibility with Intel/IBM assemblers. A86 will generate a
- segment override byte when it sees this form, unless the operand
- to the right of the colon already has a default segment register
- that is the same as the given override.
-
- I prefer the more explicit method of overrides, exclusive to A86:
- simply place the segment register name before the instruction
- mnemonic. For example, I prefer ES MOV AL,[BX] to MOV
- AL,ES:[BX].
-
- The third form has a segment or group name before the colon. This
- form is ignored by A86; it is provided for compatibility with
- Turbo C, which likes to include spurious DGROUP: overrides, to
- satisfy MASM's ASSUME-checking.
-
-
- ST Operator
-
- ST is ignored whenever it occurs in an expression. It is
- provided for compatibility with Intel and IBM assemblers. For
- example, you can code FLD ST(0),ST(1), which will be taken by A86
- as FLD 0,1.
-
-
- TYPE Operator
-
- Syntax: TYPE operand
-
- The TYPE operator returns 1 if the operand is a byte variable; 2
- if the operand is a word variable; 4 if the operand is a
- doubleword variable; 8 if the operand is a quadword variable; 10
- if the operand is a ten-byte variable; and the number of bytes
- allocated by the structure if the operand is a structure name
- (see STRUC in the next chapter).
-
- A common usage of the TYPE operator is to represent the number of
- bytes of a named structure. For example, if you have declared a
- structure named LINE (as described in the next chapter) that
- defines 82 bytes of storage, then two ways you might refer to the
- value symbolically are as follows:
-
- MOV CX,TYPE LINE ; loads the size of LINE into CX
- DB TYPE LINE DUP ? ; allocates an area of memory for a LINE
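-
- For simple variables the same operator yields the element size. A
- sketch with made-up names; the values follow from the list above:
-
-      BVAR  DB ?
-      WVAR  DW ?
-            MOV AL,TYPE BVAR   ; AL = 1
-            MOV AL,TYPE WVAR   ; AL = 2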
-
-
-
- THIS and $ Specifiers
-
- THIS returns the value of the current location counter. It is
- provided for compatibility with Intel/IBM assemblers. The dollar
- sign $ is the more standard and familiar specifier for this
- purpose; it is equivalent to THIS NEAR. THIS is typically used
- with the BYTE and WORD specifiers to create alternate-typed
- symbols at the same memory location:
-
- BVAR EQU THIS BYTE
- WVAR DW ?
-
- I don't recommend the use of THIS. If you wish to retain Intel
- compatibility, you can use the less verbose LABEL directive:
-
- BVAR LABEL BYTE
- WVAR DW ?
-
- If you are not concerned with compatibility to lesser assemblers,
- A86 offers a variety of less verbose forms. The most concise is
- DB without an operand:
-
- BVAR DB
- WVAR DW ?
-
- If this is too cryptic for you, there is always BVAR EQU B[$].
-
-
- Operator Precedence
-
- Consider the expression 1 + 2 * 3. When A86 sees this
- expression, it could perform the multiplication first, giving an
- answer of 1+6 = 7; or it could do the addition first, giving an
- answer of 3*3 = 9. In fact, A86 does the multiplication first,
- because A86 assigns a higher precedence to multiplication than it
- does addition.
-
- The following list specifies the order of precedence A86 assigns
- to expression operators. All expressions are evaluated from left
- to right following the precedence rules. You may override this
- order of evaluation and precedence through the use of parentheses
- ( ). In the example above, you could override the precedence by
- parenthesizing the addition: (1+2) * 3.
-
- Some symbols that we have referred to as operators, are treated
- by the assembler as operands having built-in values. These
- include B, W, F, $, and ST. In a similar vein, a segment
- override term (a segment register name followed by a colon) is
- recorded when it is scanned, but not acted upon until the entire
- containing expression is scanned and evaluated.
-
- If two operators are adjacent, the rightmost operator must have
- precedence; otherwise, parentheses must be used. For example,
- the expression BIT ! 1 is illegal because the leftmost operator
- BIT has the higher precedence of the two adjacent operators BIT
- and "!". You can code BIT (! 1).
-
- --Highest Precedence--
-
- 1. Parenthesized expressions
- 2. Period
- 3. OFFSET, SEG, TYPE, and PTR
- 4. HIGH, LOW, and BIT
- 5. Multiplication and division: *, /, MOD, SHR, SHL
- 6. Addition and subtraction: +,-
- a. unary
- b. binary
- 7. Relational: EQ, NE, LT, LE, GT, GE, =
- 8. Logical NOT and !
- 9. Logical AND
- 10. Logical OR and XOR
- 11. Colon for long pointer, SHORT, LONG, and BY
- 12. DUP
-
- --Lowest Precedence--
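-
- A few expressions worked through against the list above, as a
- sketch. All numbers are decimal and the results are computed by
- hand:
-
-      DB 1 + 2 * 3        ; 7: * is done before +
-      DB (1 + 2) * 3      ; 9: parentheses override
-      DB 2 + 3 SHL 1      ; 8: SHL shares level 5 with *, so 3 SHL 1 = 6 first
-      DB 8 / 2 * 4        ; 16: equal precedence, evaluated left to right
-      DB 4 OR 3 AND 1     ; 5: AND is done before OR
-      DB (4 OR 3) AND 1   ; 1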
-
-