-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- TABLE OF CONTENTS
-
- Description ..................2
-
- Syntax .......................2
-
- Labels .....................3
-
- Opcodes ....................3
-
- Operands ...................4
-
- Pseudo ops .................5
-
- Error Messages ................6
-
- Source code ...................6
-
- Disclaimer ....................7
-
- Op code table .................8
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- 1
-
-
-
-
-
-
-
-
-
-
-
- DESCRIPTION
- ASSEMBLE.COM is a two-pass assembler written in Turbo
- Pascal. I wrote it because I was a new owner of an IBM PCjr
- with little software, and I was not aware of any public domain
- assemblers with reasonable performance. Additionally, I had
- recently purchased Turbo Pascal from Borland International and
- wanted a project to help me learn Pascal.
-
- ASSEMBLE.COM is a simple assembler for the Intel 8088/8086
- instruction set. It closely follows the syntax of the instruction
- set described in THE 8086 BOOK by Russell Rector and George Alexy.
- It is also patterned after CHASM.BAS version 1.9, written in
- BASIC by David Whitman of Whitman Software (with numerous
- changes to avoid copyright violations). It is not a macro
- assembler and therefore recognizes only a few pseudo op codes. A
- complete list of the op codes and pseudo ops recognized by
- ASSEMBLE.COM appears later in this documentation.
-
- The input to ASSEMBLE.COM is an ordinary DOS text file,
- which can be created with most text editors. The output is a
- listing sent to the screen, printer or a disk file, plus a .COM
- file. This assembler was designed to create executable code or
- subroutines for use in BASIC or Turbo Pascal programs. It will
- not generate code that can be linked to other programs. See your
- BASIC or Turbo Pascal manuals for instructions on including
- executable code in your programs. Since this assembler is
- intended for small projects, you will probably find it a
- convenience to bypass the link step and the conversion from .EXE
- to .COM. If you intend to write large software projects, I
- recommend a macro assembler such as IBM's or CHASM version 4.0
- (also written in Turbo Pascal). Speaking of size: since both the
- input file and the output file(s) are on disk, the only limits on
- the size of your program are disk space, the number of labels
- you use, and your endurance. Labels and their memory addresses
- are stored on the first pass in ASSEMBLE.COM's area for dynamic
- variables. This area uses all available memory above the program
- area, so even on a 64K byte system you should have plenty of
- room for several hundred labels. My goal is to keep this program
- under 35K bytes so it can run on small systems.
-
- SYNTAX
- Each line of source code begins with a label or a blank space,
- followed by the op code (8088 instruction or pseudo op) and then
- the operand(s), if the op code requires any. The source code may
- be followed by a semicolon (;) and any comments for that line.
- Optionally, a line may consist only of comments if it starts with
- a semicolon. A blank space must precede and follow the op code
- field. For readability I recommend one or more spaces between the
- label and the op code and between the op code and the operand. A
- comma must be used to separate operands.
-
-
-
-
- 2
-
-
-
-
-
-
-
-
- EXAMPLE:
-
- 1stLabel mov ax,90H ;format example
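
The line format described above can be modeled in a few lines of Python. This is only a sketch of the rules as stated in this manual, not the actual Turbo Pascal parsing routine; the function name parse_line and the tuple it returns are assumptions made for illustration.

```python
def parse_line(line):
    """Split one source line into (label, opcode, operands, comment).

    Hypothetical helper: it follows the rules stated in the manual
    (label in column one, ';' starts a comment, text outside single
    quotes is upper-cased), nothing more.
    """
    code, comment, in_quote = [], "", False
    for i, ch in enumerate(line):
        if ch == "'":
            in_quote = not in_quote
        if ch == ";" and not in_quote:
            comment = line[i + 1:].strip()
            break
        code.append(ch if in_quote else ch.upper())
    code = "".join(code).rstrip()
    if not code.strip():
        return None, None, [], comment          # comment-only or empty line
    has_label = not code[0].isspace()           # a label starts in column one
    fields = code.split(None, 2 if has_label else 1)
    label = fields.pop(0) if has_label else None
    opcode = fields.pop(0) if fields else None
    # Simplification: a comma inside a quoted string would also split here.
    operands = [op.strip() for op in fields[0].split(",")] if fields else []
    return label, opcode, operands, comment


print(parse_line("1stLabel mov ax,90H ;format example"))
```

Note that this sketch returns the label at full length; the seven character limit is a separate rule covered in the section on labels.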
-
- Labels:
- Labels may be longer, but only the first 7 characters of a
- label are stored for future reference, so any labels whose first
- seven characters match will cause a duplicate label error. Also,
- since the line parsing routine converts all source code not in
- single quotes to upper case before decoding the line, you cannot
- use upper and lower case to distinguish labels. For example,
- LongLabel1 and longlabel2 are both stored as LONGLAB and would
- therefore cause a duplicate label error. If you are going to use
- numbers to distinguish labels, I suggest you put them at the
- beginning of the label, as in 2LongLabel.
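
The label rule can be illustrated with a short Python sketch. It assumes a plain dictionary as the symbol table; the names stored_name and define_label are hypothetical and are not taken from the assembler's source.

```python
LABEL_LENGTH = 7  # the manual says only the first 7 characters are stored


def stored_name(label):
    """Return the form in which a label would be kept: upper case, truncated."""
    return label.upper()[:LABEL_LENGTH]


def define_label(symbols, label, address):
    """Record a label, rejecting a second definition of the same stored name."""
    name = stored_name(label)
    if name in symbols:
        raise ValueError("Error: Duplicate label " + name)
    symbols[name] = address


symbols = {}
define_label(symbols, "LongLabel1", 0x100)   # stored as LONGLAB
define_label(symbols, "2LongLabel", 0x104)   # stored as 2LONGLA, no clash
print(sorted(symbols))
```

Defining longlabel2 after LongLabel1 would raise the duplicate label error here, because both reduce to LONGLAB.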
-
- Op Codes:
- All of the op codes specified in THE 8086 BOOK are supported;
- however, the syntax was modified to resolve some ambiguous op
- codes. The first ambiguous op code is JMP, which can be either
- an 8 or 16 bit displacement jump. Eight bit displacement jumps
- are specified with JMPS, for short jump. Jumps using mem/reg
- (indirect) addressing for their destination must specify Near or
- Far to indicate a jump within the current CS or an intersegment
- jump. These jumps are coded as JMPN or JMPF. The same logic
- applies to CALLN and CALLF when using the mem/reg addressing
- mode. The other major area of ambiguity comes from op codes that
- do not specify a register as either the destination or source.
- This assembler requires you to append a B or W to the op code to
- distinguish between bytes and words, i.e. MOVSB to move a string
- of bytes, or MOVW [bx],8 to load 8 into the word at the address
- pointed to by BX.
-
- Normally all data moves are assumed to be relative to the DS
- (data segment) register. This default can be overridden one
- instruction at a time by using the SEG op code on the line
- before the instruction to be overridden.
- EXAMPLE:
- SEG ES
- MOV AX,[BX]
- This moves a word into the accumulator from the address in the
- extra segment offset by the BX register. This is a little used
- function, since ASSEMBLE.COM assumes all of the segment registers
- are set to the same location, as is the case at the start of a
- .COM program. System resources should be accessed with BIOS or
- DOS calls when possible rather than by going directly to a
- hardware memory location outside your program.
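
On the 8086 a segment override is a one-byte prefix placed in front of the instruction it modifies. The Python sketch below shows the standard 8086 encoding for the SEG ES example; the manual does not list the bytes ASSEMBLE.COM emits, so treat the output as what the processor expects rather than a confirmed listing. The names used are illustrative.

```python
# Standard 8086 segment override prefix bytes.
SEGMENT_OVERRIDE_PREFIX = {"ES": 0x26, "CS": 0x2E, "SS": 0x36, "DS": 0x3E}

# Standard 8086 encoding of MOV AX,[BX]: opcode 8B, ModRM byte 07.
MOV_AX_FROM_BX = bytes([0x8B, 0x07])


def with_segment_override(segment, instruction):
    """Return the instruction bytes preceded by the override prefix."""
    return bytes([SEGMENT_OVERRIDE_PREFIX[segment.upper()]]) + instruction


encoded = with_segment_override("ES", MOV_AX_FROM_BX)
print(encoded.hex())  # prefix byte followed by the two MOV bytes
```

Because the prefix applies only to the instruction that follows it, the override has to be repeated for every instruction that needs it.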
-
- The op code table lists the available op codes and pseudo ops
- and the addressing modes associated with each. Please note that
- the mem/reg addressing mode includes several sub modes such as
- base relative (using BX as an offset), stack relative (using BP
- as an offset), and indexed (using SI or DI). See THE 8086 BOOK
- for an explanation of each mode.
-
-
-
- 3
-
-
-
-
-
-
-
-
-
- Operands:
- The operands tell the assembler the destination and source of
- the data to be operated on. The 8088 uses a number of addressing
- modes to determine where that data is and where it should go. As
- the op code table shows, not every mode can be used with every
- op code. The addressing modes are:
-
- Accumulator - data is transferred to/from the accumulator.
- Displacement - the displacement value is added to the present IP.
- Immediate - data is assembled into the instruction.
- Memory/Reg - data is transferred to/from the address pointed to
- by [mem] or [reg].
- Register - data is transferred to/from the register.
-
- Accumulator: The accumulator(s) are AX or AL and AH where AX is a
- 16 bit accumulator, AL is the lower 8 bits of AX and AH is the
- higher 8 bits.
-
- Displacement: A displacement value to be added to the instruction
- pointer (IP) is included as immediate data in the instruction. The
- assembler calculates the displacement from the location of the
- instruction and the location of the address in the operand. The
- address in the operand can be expressed as a number (binary, hex
- or decimal) but is most commonly expressed as a label. EXAMPLE:
-
- LABEL MOV AX,[BX]
- CMP AX,10H
- JLE LABEL
-
- In this example the assembler calculates a negative
- displacement to jump back to LABEL when the value in AX is less
- than or equal to 10H.
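
The displacement arithmetic can be sketched in Python. On the 8086 a relative jump is measured from the address of the instruction after the jump, and a conditional jump such as JLE is two bytes long with a signed 8 bit displacement. The addresses below are assumptions for illustration: LABEL at the default origin of 100H, a 2 byte MOV AX,[BX] and a 3 byte CMP AX,10H, which would place the JLE at 105H.

```python
def short_displacement(jump_address, target_address, jump_length=2):
    """Return the displacement byte for a short (8 bit) relative jump.

    The displacement is counted from the instruction after the jump.
    The hardware range is -128 to +127; the manual quotes + or - 127.
    """
    next_ip = jump_address + jump_length
    displacement = target_address - next_ip
    if not -128 <= displacement <= 127:
        raise ValueError("Error: Too far for short jump")
    return displacement & 0xFF          # two's complement byte


# JLE at 105H jumping back to LABEL at 100H (assumed addresses).
print(hex(short_displacement(0x105, 0x100)))
```

The result is the byte that would follow the jump op code; a backward jump produces a value with the high bit set.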
-
- Immediate: All immediate data is assembled into the instruction
- code. This data can be represented in two ways. First, immediate
- data can be given in binary, decimal or hexadecimal format in a
- signed range of -32768 to 32767 (8000H to 7FFFH) or, if the sign
- bit is not used, 0 to 65535 (0000H to FFFFH). As in these
- examples, an 'H' is appended to the number to indicate
- hexadecimal. Binary numbers are expressed as a series of up to 16
- ones and zeros followed by a B, i.e. 11010B represents 26. The
- other method of representing immediate data is with labels. The
- value of a label is the address at which the label was used or
- the value assigned to the label in an EQU pseudo op. When a label
- is used in the operand for data access, the assembler assumes you
- are referring to the data at the referenced address. If you want
- to load the value of the address itself, the modifier OFFSET(..)
- must be used.
- EXAMPLE:
-
- MOV BX,LABEL ;Load BX with the value at address Label
- MOV BX,OFFSET(LABEL) ;Load BX with the address of Label
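
The number formats and the 16 bit range described above can be sketched in Python. The function names parse_number and to_word are hypothetical, and labels are left out; this only models the H and B suffixes and the signed/unsigned word range.

```python
def parse_number(text):
    """Convert a literal in the manual's notation to an integer.

    A trailing H means hexadecimal, a trailing B means binary,
    anything else is read as decimal.
    """
    token = text.strip().upper()
    if token.endswith("H"):
        return int(token[:-1], 16)
    if token.endswith("B"):
        return int(token[:-1], 2)
    return int(token, 10)


def to_word(value):
    """Return the 16 bit pattern for a signed or unsigned value."""
    if not -32768 <= value <= 65535:
        raise ValueError("value does not fit in 16 bits")
    return value & 0xFFFF


print(parse_number("90H"), parse_number("11010B"), hex(to_word(-32768)))
```

The H suffix is tested first, so a hexadecimal value whose last digit is B (for example 1BH) is still read as hexadecimal.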
-
-
-
- 4
-
-
-
-
-
-
-
-
- Some math is supported in immediate data; however, the number
- parser is very simple, so the format is restricted. Only one
- math operator is permitted between numbers (-3*4 is ok but 4*-3
- gives an error). Parsing is done from left to right with no
- precedence among the math operators (+ and * are applied in the
- order they appear in the input string, rather than the usual
- practice of evaluating multiplication and division before
- addition and subtraction).
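
A left to right evaluator of this kind can be sketched in Python. This is a model of the behavior described, not the assembler's own parser. The function names are hypothetical, labels are not handled, and the choice of integer division that truncates toward zero is an assumption.

```python
import re


def parse_number(token):
    """Read a literal with an optional H (hex) or B (binary) suffix."""
    token = token.upper()
    if token.endswith("H"):
        return int(token[:-1], 16)
    if token.endswith("B"):
        return int(token[:-1], 2)
    return int(token, 10)


def evaluate(expression):
    """Evaluate strictly left to right, with no operator precedence."""
    tokens = re.findall(r"[-+*/]|[^-+*/\s]+", expression)
    sign = 1
    if tokens and tokens[0] in ("+", "-"):      # one leading sign is allowed
        sign = -1 if tokens[0] == "-" else 1
        tokens = tokens[1:]
    if len(tokens) % 2 == 0:                    # must be: number (op number)*
        raise ValueError("only one operator is allowed between numbers")
    result = sign * parse_number(tokens[0])
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        value = parse_number(operand)
        if op == "+":
            result += value
        elif op == "-":
            result -= value
        elif op == "*":
            result *= value
        elif op == "/":
            result = int(result / value)        # assumed truncating division
        else:
            raise ValueError("expected an operator, found " + op)
    return result


print(evaluate("2+3*4"))   # left to right: (2+3)*4
print(evaluate("-3*4"))
```

With conventional precedence 2+3*4 would be 14; evaluated left to right it is 20. The expression 4*-3 is rejected because two operators appear in a row.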
-
- Memory/register: This addressing mode is also called indirect
- addressing. The operand points to a memory location that
- contains the data, rather than the instruction containing the
- data as in the immediate addressing mode. The operand can be a
- memory location expressed as a label, a decimal number or a
- hexadecimal number, or it can be a memory location pointed to by
- a register. The following indirect modes are allowed:
- MOV Reg,[BP]
- MOV [BX],Reg
- MOV [BX+SI],Reg ;BX plus SI displacement equal location
- MOV Reg,[BX+DI] ;BX plus DI " " "
- MOV Reg,[BP+SI] ;BP plus SI " " "
- MOV [BP+DI],Reg ;BP plus DI " " "
- MOV [DI],Reg
- MOV Reg,[SI]
- MOV LABEL,Reg
- Any of the general purpose registers can be used in place of Reg
- in these examples. Immediate data may also be substituted for a
- register; however, the op code must then be appended with W or B
- so the assembler knows whether you are pointing to a word or byte
- address. In addition, when an indirect address using a register
- is chosen, a displacement may also be used.
- EXAMPLE:
-
- MOV Reg,10H[BP] ;Source address equal BP+10H
- MOV Reg,-5[BP+SI] ;Source address equal BP+SI-5
- DEMO EQU FFH
- MOV DEMO[BX+DI],Reg ;Destination address equal BX+DI+255
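
The address arithmetic in these examples can be checked with a small Python sketch. The register contents below are made up for illustration, and effective_address is a hypothetical helper; the only hardware fact relied on is that 8086 offsets wrap around at 16 bits.

```python
def effective_address(registers, base=None, index=None, displacement=0):
    """Return the 16 bit offset formed by base + index + displacement."""
    offset = displacement
    if base is not None:
        offset += registers[base]
    if index is not None:
        offset += registers[index]
    return offset & 0xFFFF              # offsets wrap at 64K


# Illustrative register contents (assumed values).
registers = {"BX": 0x1000, "BP": 0x2000, "SI": 0x0010, "DI": 0x0020}
DEMO = 0xFF                              # DEMO EQU FFH

print(hex(effective_address(registers, "BP", None, 0x10)))   # 10H[BP]
print(hex(effective_address(registers, "BP", "SI", -5)))     # -5[BP+SI]
print(hex(effective_address(registers, "BX", "DI", DEMO)))   # DEMO[BX+DI]
```

The result is an offset within a segment; the segment itself comes from DS, or from SS when BP is the base, unless an override is given.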
-
- Register: In this addressing mode the data is contained in or is
- to be stored in one of the 8088 registers. The registers are AX
- (AL+AH), BX (BL+BH), CX (CL+CH), DX (DL+DH), BP, DI, SI and the
- four segment registers CS, DS, ES, SS. All math operations use
- the accumulator (AX, AL or AH), and MUL and DIV also use the DX
- register when 32 bit numbers are involved. The BX and BP
- registers can be used as base pointers in the Data and Stack
- segments respectively. The CX register can be used as an
- automatic counter by some instructions. As demonstrated in
- earlier examples, the SI and DI registers can be used as index
- registers. Any of the numerous assembly language books for the
- IBM PC or PCjr will give you an explanation of each of the
- processor registers and their uses.
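
The relationship between a 16 bit register and its two 8 bit halves, and the use of DX for 32 bit results, can be shown in Python. The helper names are hypothetical; the sketch models only the bit layout (AH is the high byte of AX, and a 16 bit MUL leaves the high word of the product in DX and the low word in AX).

```python
def split_word(word):
    """Return (high byte, low byte) of a 16 bit value, e.g. (AH, AL) for AX."""
    return (word >> 8) & 0xFF, word & 0xFF


def join_bytes(high, low):
    """Rebuild the 16 bit register from its two halves."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


def mul16(ax, operand):
    """Model an unsigned 16 bit MUL: return the product as (DX, AX)."""
    product = (ax & 0xFFFF) * (operand & 0xFFFF)
    return (product >> 16) & 0xFFFF, product & 0xFFFF


ah, al = split_word(0x1234)
dx, ax = mul16(0xFFFF, 0xFFFF)
print(hex(ah), hex(al), hex(dx), hex(ax))
```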
-
- PSEUDO OPCODES
- Pseudo opcodes are assembler directives that control the
- generation of the object code. The available pseudo ops are DB,
-
-
- 5
-
-
-
-
-
-
-
-
- DS, DW, ENDP, EQU, ORG and PROC.
- DB = define byte and has operands of one or more bytes and/or
- a string ( DB 20H,'Demo' ). Strings are set off by single
- quotes. Numbers must be less than 256 and are expressed in
- binary, decimal or hexadecimal.
- DS = define segment and initializes a string of memory
- locations. The first operand defines the number of bytes to
- be initialized. If included, the second operand defines the
- value the memory is to be initialized to. The default value
- is zero. ( DS 20,FFH ;initialize 20 bytes to 255).
- DW = define word and its operand(s) must be numeric or a
- label. With DW the low order byte is stored first in memory,
- as this is the format used by the 8088 for integer storage
- (i.e. dw 1020H = db 20H,10H).
- EQU= define the value of a label. All label definitions must
- occur at the beginning of your program or errors may occur
- in the assembly process. The most common error message
- received from defining a label late in the program is 'PHASE
- ERROR'. A phase error indicates the assembler generated an
- address for a label on the second pass different from that
- of the first pass.
- ORG= reset the location counter to a new origin. Since all .COM
- programs start at 100H, the default setting for ASSEMBLE.COM
- is 100H; however, you may need to start at 00H for a
- driver routine or a machine language routine for BASIC.
- PROC and ENDP are used together to define a program or procedure
- as Near or Far. This information is used to determine the
- type of return to be generated when a RET is encountered.
- If no procedure is defined a Near procedure is assumed. The
- syntax is:
- PROC NEAR ;Proc must be followed by Near or Far
- ....
- ....
- ENDP
- If PROC is used an ENDP must be used.
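
The bytes produced by the three data pseudo ops can be modeled in Python. This is a sketch based on the descriptions above; the functions db, ds and dw are illustrative stand-ins, not part of the assembler.

```python
import struct


def db(*items):
    """DB: each operand is a byte value (0 to 255) or a quoted string."""
    out = bytearray()
    for item in items:
        if isinstance(item, str):
            out += item.encode("ascii")
        elif 0 <= item <= 255:
            out.append(item)
        else:
            raise ValueError("Error: Data too long")
    return bytes(out)


def ds(count, fill=0):
    """DS: reserve count bytes, each set to fill (zero by default)."""
    return bytes([fill]) * count


def dw(*words):
    """DW: each operand is stored as a 16 bit word, low order byte first."""
    return b"".join(struct.pack("<H", word & 0xFFFF) for word in words)


print(db(0x20, "Demo"))
print(dw(0x1020).hex())      # low byte first, same bytes as db 20H,10H
print(len(ds(20, 0xFF)))
```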
-
-
- ERROR MESSAGES
- All error messages and diagnostics are printed immediately
- before the line in which the error occurs. The total number of
- error and diagnostic messages is displayed at the end of the
- source code printout, immediately before the symbol table dump.
- I have tried to make the error messages as user friendly as
- possible. The most cryptic of the error messages is the one you
- receive when there is a syntax error. This message consists of
- the op code and ASSEMBLE.COM's interpretation of the type of data
- included in the operands.
- EXAMPLE:
- *** Syntax Error: MOV (16 bit immediate or 8 bit immediate), (none)
-
- This message would appear immediately before a line of code
- containing the instruction MOV 45H. By reviewing the type of data
- and the allowable operands for each instruction you should be
- able to locate the error.
- 'Phase Error' is most commonly caused by referencing an
-
-
- 6
-
-
-
-
-
-
-
-
- equate before defining it. I strongly recommend you use the
- EQU pseudo op only at the beginning of your source code. This
- practice should prevent this error and will make your source
- code easier to read.
- 'Error: EQU without symbol' is received when you use the equate
- pseudo-op without a label.
- 'Error: EQU with forward refference' is received if you
- attempt to use a forward reference when equating a label.
- 'Error: ENDP without PROC'. You must specify where the
- procedure begins.
- 'Error: Missing ENDP'. You must specify where the procedure
- ends if PROC is used.
- 'Error: Procedures nested too deeply'. Only 10 levels of
- nesting are allowed.
- 'Error: Duplicate label'. See section on labels.
- 'Error: Data too long' indicates use of a byte operand where
- the data is out of the range of 0 to 255.
- 'Error: Too far for short jump' indicates an attempt to jump
- further than + or - 127 bytes.
- 'Error: Undefined Symbol' plus the operand being interpreted
- is displayed when no match is found in the symbol table.
- Frequently caused by bad syntax in the operand.
- 'Error: Illegal or undefined argument for OFFSET' is similar
- to 'Undefined Symbol'.
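
How a late definition can lead to a phase error in a two pass assembler can be shown with a toy model in Python. Everything here is an assumption made for illustration: the instruction sizes are invented (3 bytes when a symbol is not yet known, 2 bytes when it is), and the manual does not say that ASSEMBLE.COM sizes instructions this way. The point is only that if a line's size differs between passes, every label after it moves.

```python
def fixed(size):
    """A line whose size never changes (hypothetical)."""
    return lambda symbols: size


def uses(name, known_size=2, unknown_size=3):
    """A line whose size depends on whether name is already defined."""
    return lambda symbols: known_size if name in symbols else unknown_size


def run_pass(program, symbols, origin=0x100):
    """Walk the program once and return the address found for each label."""
    found = {}
    location = origin
    for label, size_of in program:
        if label is not None:
            found[label] = location
            symbols.setdefault(label, location)
        location += size_of(symbols)
    return found


def phase_errors(program):
    """Return the labels whose address differs between pass 1 and pass 2."""
    symbols = {}
    first = run_pass(program, symbols)
    second = run_pass(program, symbols)
    return sorted(name for name in first if first[name] != second[name])


# LIMIT is defined after its first use, so the first line changes size.
late = [(None, uses("LIMIT")), ("NEXT", fixed(1)), ("LIMIT", fixed(0))]
# LIMIT is defined first, so both passes agree.
early = [("LIMIT", fixed(0)), (None, uses("LIMIT")), ("NEXT", fixed(1))]

print(phase_errors(late))
print(phase_errors(early))
```

In this model, moving the definition to the top of the program removes the disagreement, which matches the advice to keep EQU lines at the beginning of the source.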
-
- Two diagnostic messages may be given. The first follows a
- syntax error and is 'Specify word or byte operand' if the
- assembler could not determine which to use. The opcode must be
- corrected by appending a B or W to it. The other message is just
- a notice that you used a long jump where you could have used a
- short jump and saved a byte of object code.
-
- SOURCE CODE
- Turbo Pascal source code for this assembler is available for
- those who wish to customize it for their own needs (or who
- would like to see what makes it tick). If you would like a copy
- of the source code, send a formatted disk and $10 to
- George Fulford
- RR 1 Box 163c
- Shellsburg Ia 52332
- Although I have no intentions of entering the software market at
- this time I do plan to make corrections to this program as the
- bugs are found. If you obtain a copy of the source code from me
- it will be the most up to date version.
-
- WARRANTY/GUARANTEE
- There is NONE.
- This assembler runs on my PCjr and since I used all standard
- Turbo Pascal it should run on any PC DOS machine. If it does not
- I probably won't be able to help you.
-
- I have spent quite a bit of time debugging but I am sure there are
- still a few bugs lurking in the code. I will attempt to stamp out
- any that are reported.
-
-
-
- 7
-
-
-
-
-
-
-
-
-
- OP CODE TABLE
- Addressing modes supported:
- A   = accumulator reg (ax, ah, al)
- b/w = must add B or W to opcode for this addressing mode
- D   = displacement (8 or 16 bit as required by the instruction)
- I   = immediate (byte for 8 bit registers, word for 16 bit reg)
- M/R = memory or register (indirect addressing)
- N   = none
- R   = register (bx, cx, dx, bp, si, di)
- S   = segment register (cs, ds, es, ss)
- Op      Operand
-         types
- dest.     N  |  A  |  A  |  R  |  R  | M/R |  R  | M/R |  I  |  I  |  D  | M/R
- source    N  |  I  | M/R |  I  |  N  |  R  | M/R |  I  |  I  |  N  |  N  |  N
- AAA       x  |     |     |     |     |     |     |     |     |     |     |
- AAD       x  |     |     |     |     |     |     |     |     |     |     |
- AAM       x  |     |     |     |     |     |     |     |     |     |     |
- AAS       x  |     |     |     |     |     |     |     |     |     |     |
- ADC          |  x  |     |  x  |     |  x  |  x  | b/w |     |     |     |
-
- AND          |  x  |     |  x  |     |  x  |  x  | b/w |     |     |     |
- CALL         |     |     |     |     |     |     |     |  x  |     |  x  |
- CALLF        |     |     |     |     |     |     |     |     |     |     |  x
- CALLN        |     |     |     |     |     |     |     |     |     |     |  x
- CBW       x  |     |     |     |     |     |     |     |     |     |     |
-
- CLC       x  |     |     |     |     |     |     |     |     |     |     |
- CLD       x  |     |     |     |     |     |     |     |     |     |     |
- CLI       x  |     |     |     |     |     |     |     |     |     |     |
- CMC       x  |     |     |     |     |     |     |     |     |     |     |
- CMP          |  x  |     |  x  |     |  x  |  x  | b/w |     |     |     |
-
- CMPS     b/w |     |     |     |     |     |     |     |     |     |     |
- CWD       x  |     |     |     |     |     |     |     |     |     |     |
- DAA       x  |     |     |     |     |     |     |     |     |     |     |
- DAS       x  |     |     |     |     |     |     |     |     |     |     |
- DB           |     |     |     |     |     |     |     |  x  |  x  |     |
-
- DEC          |     |     |     |  x  |     |     |     |     |     |     | b/w
- DIV          |     |  x  |     |     |     |     |     |     |     |     |
- DS           |     |     |     |     |     |     |     |  x  |  x  |     |
- DW           |     |     |     |     |     |     |     |  x  |  x  |     |
- ENDP      x  |     |     |     |     |     |     |     |     |     |     |
-
- EQU          |     |     |     |     |     |     |     |     |  x  |     | memory
- HLT       x  |     |     |     |     |     |     |     |     |     |     |
- IDIV         |     |  x  |     |     |     |     |     |     |     |     |
- IMUL         |     |  x  |     |     |     |     |     |     |     |     |
- IN           |  x  |note1|     |     |     |     |     |     |     |     |
-
-
-
-
-
-
-
- 8
-
-
-
-
-
-
-
-
- Op      Operand
-         types
- dest.     N  |  A  |  A  |  R  |  R  | M/R |  R  | M/R |  I  |  I  |  D  | M/R
- source    N  |  I  | M/R |  I  |  N  |  R  | M/R |  I  |  I  |  N  |  N  |  N
- INC          |     |     |     |  x  |     |     |     |     |     |     | b/w
- INT       x  |     |     |     |     |     |     |     |     |  x  |     |
- INTO      x  |     |     |     |     |     |     |     |     |     |     |
- IRET      x  |     |     |     |     |     |     |     |     |     |     |
- JA           |     |     |     |     |     |     |     |     |     |  x  |
-
- JAE          |     |     |     |     |     |     |     |     |     |  x  |
- JB           |     |     |     |     |     |     |     |     |     |  x  |
- JBE          |     |     |     |     |     |     |     |     |     |  x  |
- JCXZ         |     |     |     |     |     |     |     |     |     |  x  |
- JE           |     |     |     |     |     |     |     |     |     |  x  |
-
- JG           |     |     |     |     |     |     |     |     |     |  x  |
- JGE          |     |     |     |     |     |     |     |     |     |  x  |
- JL           |     |     |     |     |     |     |     |     |     |  x  |
- JLE          |     |     |     |     |     |     |     |     |     |  x  |
- JMP          |     |     |     |     |     |     |     |     |     |  x  |
-
- JMPF         |     |     |     |     |     |     |     |     |     |  x  |
- JMPN         |     |     |     |     |     |     |     |     |     |  x  |
- JMPS         |     |     |     |     |     |     |     |     |     |  x  |
- JNE          |     |     |     |     |     |     |     |     |     |  x  |
- JNO          |     |     |     |     |     |     |     |     |     |  x  |
-
- JNP          |     |     |     |     |     |     |     |     |     |  x  |
- JNS          |     |     |     |     |     |     |     |     |     |  x  |
- JNZ          |     |     |     |     |     |     |     |     |     |  x  |
- JO           |     |     |     |     |     |     |     |     |     |  x  |
- JP           |     |     |     |     |     |     |     |     |     |  x  |
-
- JPE          |     |     |     |     |     |     |     |     |     |  x  |
- JPO          |     |     |     |     |     |     |     |     |     |  x  |
- JS           |     |     |     |     |     |     |     |     |     |  x  |
- JZ           |     |     |     |     |     |     |     |     |     |  x  |
- LAHF      x  |     |     |     |     |     |     |     |     |     |     |
-
- LDS          |     |     |     |     |     |note2|     |     |     |     |
- LEA          |     |     |     |     |     |note2|     |     |     |     |
- LES          |     |     |     |     |     |note2|     |     |     |     |
- LOCK      x  |     |     |     |     |     |     |     |     |     |     |
- LODS     b/w |     |     |     |     |     |     |     |     |     |     |
-
- LOOP         |     |     |     |     |     |     |     |     |     |  x  |
- LOOPE        |     |     |     |     |     |     |     |     |     |  x  |
- LOOPNE       |     |     |     |     |     |     |     |     |     |  x  |
- LOOPNZ       |     |     |     |     |     |     |     |     |     |  x  |
- LOOPZ        |     |     |     |     |     |     |     |     |     |  x  |
-
-
-
-
-
-
- 9
-
-
-
-
-
-
-
-
- Op      Operand
-         types
- dest.     N  |  A  |  A  |  R  |  R  | M/R |  R  | M/R |  I  |  I  |  D  | M/R
- source    N  |  I  | M/R |  I  |  N  |  R  | M/R |  I  |  I  |  N  |  N  |  N
- MOV     note3|     |  x  |     |     |  x  |  x  | b/w |     |     |     |
- MOVS     b/w |     |     |     |     |     |     |     |     |     |     |
- MUL          |     |  x  |     |     |     |     |     |     |     |     |
- NEG          |     |     |     |  x  |     |     |     |     |     |     |
- NOP       x  |     |     |     |     |     |     |     |     |     |     |
-
- NOT          |     |     |     |  x  |     |     |     |     |     |     | b/w
- OR           |  x  |     |  x  |     |  x  |  x  | b/w |     |     |     |
- ORG          |     |     |     |     |     |     |     |     |  x  |     |
- OUT          |     |note1|     |     |     |     |     |     |     |     |
- POP          |     |     |     |  x  |     |     |     |     |     |     |x or seg
-
- POPF      x  |     |     |     |     |     |     |     |     |     |     |
- PROC    note4|     |     |     |     |     |     |     |     |     |     |
- PUSH         |     |     |     |  x  |     |     |     |     |     |     |x or seg
- PUSHF     x  |     |     |     |     |     |     |     |     |     |     |
- RCL          |     |     |     |  x  |     |     |     |     |     |     | b/w
-
- RCR          |     |     |     |  x  |     |     |     |     |     |     | b/w
- REP       x  |     |     |     |     |     |     |     |     |     |     |
- REPE      x  |     |     |     |     |     |     |     |     |     |     |
- REPNE     x  |     |     |     |     |     |     |     |     |     |     |
- REPNZ     x  |     |     |     |     |     |     |     |     |     |     |
-
- REPZ      x  |     |     |     |     |     |     |     |     |     |     |
- RET       x  |     |     |     |     |     |     |     |     |     |  x  |
- ROL          |     |     |     |  x  |     |     |     |     |     |     | b/w
- ROR          |     |     |     |  x  |     |     |     |     |     |     | b/w
- SAHF      x  |     |     |     |     |     |     |     |     |     |     |
-
- SAR          |     |     |     |  x  |     |     |     |     |     |     | b/w
- SBB          |  x  |     |  x  |     |  x  |  x  | b/w |     |     |     |
- SCAS     b/w |     |     |     |     |     |     |     |     |     |     |
- SEG          |     |     |     |     |     |     |     |     |  x  |     |
- SHL          |     |     |     |  x  |     |     | b/w |     |     |     |
-
- SHR          |     |     |     |  x  |     |     | b/w |     |     |     |
- STC       x  |     |     |     |     |     |     |     |     |     |     |
- STD       x  |     |     |     |     |     |     |     |     |     |     |
- STI       x  |     |     |     |     |     |     |     |     |     |     |
- STOS     b/w |     |     |     |     |     |     |     |     |     |     |
-
- SUB          |  x  |     |  x  |     |  x  |  x  | b/w |     |     |     |
- TEST         |  x  |     |  x  |     |  x  |  x  | b/w |     |     |     |
- WAIT      x  |     |     |     |     |     |     |     |     |     |     |
- XCHG         |     |note5|     |     |  x  |  x  |     |     |     |     |
- XLAT      x  |     |     |     |     |     |     |     |     |     |     |
-
- XOR          |  x  |     |  x  |     |  x  |  x  | b/w |     |     |     |
-
- note 1 IN/OUT supports DX<-accum (8 or 16) and port<-accum (8 or 16).
-
-
- 10
-
-
-
-
-
-
-
-
- note 2 These instructions can use only a memory reference
- in the source operand.
- note 3 MOV also supports mem<-accum, seg<-M/R and M/R<-seg (or CS).
- note 4 Must specify near or far PROCedure.
- note 5 The accumulator can be exchanged with any of the registers
- using the form XCHG AX,BX or XCHG BX.