home *** CD-ROM | disk | FTP | other *** search
Text File | 1995-08-10 | 26.0 KB | 465 lines | [TEXT/MPS ] |
- (**---------------------------------------------------------------------------*
- | |
- | <<< Disassembler.p >>> |
- | |
- | Power[PC] Disassembler Interfaces |
- | |
- | Ira L. Ruben |
- | 5/9/93 |
- | |
- | Translated from C to Pascal by |
- | Greg Branche |
- | 7/20/94 |
- | |
- | Copyright Apple Computer, Inc. 1993-1995 |
- | All rights reserved. |
- | |
- *---------------------------------------------------------------------------**)
-
- (*$TAGS-*)
- (*$CALLING PASCAL*)
- MODULE Disassembler;
-
- IMPORT SYSTEM, Types;
-
- (* $PUSH*)
-
- CONST
-
- (** The following defines the "options" that can be passed to the disassembler. All **)
- (** except ONE of the target architecture options have preset defaults. **)
-
- (** Target architecture (one must be set): **)
- Disassemble_Power* = $00000001; (** Power **)
- Disassemble_PowerPC32* = $00000002; (** 32-bitPowerPC **)
- Disassemble_PowerPC64* = $00000004; (** 64-bit PowerPC **)
- Disassemble_PowerPC601* = $00000008; (** PowerPC 601 **)
- (** Error detection options*: **)
- Disassemble_RsvBitsErr* = $80000000; (** invalid reserved bits is error **)
- Disassemble_FieldErr* = $40000000; (** invalid field (regs, BO, etc.) error **)
- (** Formatting options (reverses presets): **)
- Disassemble_Extended* = $08000000; (** extended mnemonics (ppc only) **)
- Disassemble_BasicComm* = $04000000; (** basic form in comment if extended **)
- Disassemble_DecSI* = $02000000; (** SI fields formatted as decimal **)
- Disassemble_DecUI* = $01000000; (** UI fields formatted as decimal **)
- Disassemble_DecField* = $00800000; (** fields shown as decimal **)
- Disassemble_DecOffset* = $00400000; (** D of D(RA) shown in decimal **)
- Disassemble_DecPCRel* = $00200000; (** $+decimal offset instead of $+hex **)
- Disassemble_DollarHex* = $00100000; (** $XXX... instead of 0xXXX... **)
- Disassemble_Hex2sComp* = $00080000; (** negative hex shown in 2s compliment **)
- Disassemble_MinHex* = $00040000; (** min nbr of hex digits for values >= 0 **)
- Disassemble_CRBits* = $00020000; (** crN_LT, crN_GT, crN_EQ, crN_SO **)
- Disassemble_CRFltBits* = $00010000; (** crN_FX, crN_FEX, crN_VX, crN_OX **)
- Disassemble_BranchBO* = $00008000; (** branch BO meaning if not extended **)
- Disassemble_TrapTO* = $00004000; (** trap TO meaning if not extended **)
- Disassemble_IBM* = $00002000; (** IBM assembler conventions **)
-
- (**
- Except for the target architecture options, ONE of which must be set, here's an explanation
- of the other options and their preset default.
-
- Disassemble_RsvBitsErr - Reserved bits in PowerPC instructions are considered a "warning"
- and causes the return status to be set to indicate whether
- reserved bits were incorrectly coded (1's that should be 0's and
- vice versa). The option indicates incorrectly coded reserved bits
- cause the instruction to be treated as "invalid".
-
- Disassemble_FieldErr - Attempted use of a field value not valid for a target is
- considered a "warning" and causes the return status to be set to
- indicate that fact. The option indicates that use of a field
- whose value is not valid for the target is "invalid". An example
- of an invalid field would be the use of a SPR not supported for
- the target architecture like the "HIDx" SPRs which are only valid
- for the 601. Another example is non zero bits in the bc[l][a] BO
- field that are supposed to be zero. Note this is NOT the same as
- Disassemble_RsvBitsErr. But if a field has NO valid decoding
- value for ANY target, that is always considered as an invalid
- instruction.
-
- Disassemble_Extended - Extended mnemonics are NOT generated. The option allows the
- extended mnemonic generation (recommended). Only PowerPC32,
- PowerPC64, and PowerPC32 and PowerPC64 instructions used on the
- 601 are supported.
-
- Disassemble_BasicComm - The basic instruction form is NOT placed in the comment field.
- The option causes the basic form of the instruction to be placed
- in the comment if an extended mnemonic is generated for it. This
- option is not recommended since it is mainly for debugging and it
- tends to "clutter" up the comment field making it harder to see
- branch addresses.
-
- Disassemble_DecSI - SIs (signed immediate integers) are formatted as hex. The option
- causes SI operands to be generated as decimal integers.
-
- Disassemble_DecUI - UIs (unsigned immediate integers) are formatted as hex. The
- option causes UI operands to be generated as decimal integers.
-
- Disassemble_DecField - All fields (e.g., shift/rotate constants) are shown as hex. The
- option causes the offsets to be generated as decimal integers.
-
- Disassemble_DecOffset - The "D" offsets in operands of the form D(RA) are shown in hex.
- The option causes these to be generated as decimal.
-
- Disassemble_DecPCRel - PC-relative branch addresses are formatted as "$+n" or "$-n", with
- the offset ("n") generated in hex. The option causes the offset
- to be generated as decimal.
-
- Disassemble_DollarHex - Hex values are prefixed with "0x". The option causes hex values
- to be formatted as "$XXX...".
-
- Disassemble_Hex2sComp - Signed negative values that are shown in hex are negated and
- prefixed with a "-" (e.g. "-0x0001"). The option causes these
- values to be shown in their two's complement form (e.g.,
- "0xFFFFFFFF").
-
- Disassemble_MinHex - Positive hex values or negated negative values are always shown
- with the number of digits attempting to indicate the size of the
- instruction field which produced the value or the implied value
- size. Thus 32-bit target addresses are shown as 8 hex digits,
- 16-bit field values are shown with 4 hex digits, byte field values
- as 2 hex digits. 5 or six-bit values are also shown as 2 hex
- digits since the minimum is always at least 2. The option forces
- the generation to always use 2 as the minimum even if the value
- came from a bigger field (e.g., "0x1234" address, "0x01" or
- "-0x01" from a 16-bit field).
-
- Disassemble_CRBits - Condition register field bits are referenced as bit numbers 0:31
- in the basic instruction operand forms. The option causes these
- bits to be referenced using the format “crN_X”, where N is a 4-bit
- CR field (0:7) and X is the bit “name” in the field (“LT”, “GT”,
- “EQ”, “SO” for bits 0, 1, 2, and 3 respectively). Note, this
- notation is always used with extended mnemonics.
-
- Disassemble_CRFltBits - Condition register field bits are referenced as bit numbers 0:31
- in the basic instruction operand forms. The option is identical
- to Disassemble_CRBits to generate the references as “crN_X”,
- except that the bits (X) are referenced as “FX”, “FEX”, “VX” and
- “OX” for the four bits 0,1, 2, and 3 respectively. This option
- can be used if the context of floating-point operations, but it's
- up to the caller to determine that context.
-
- Disassemble_BranchBO - Branch test BO encodings are referenced as values 0:31 in the
- basic instruction operand forms. The option causes the BO value
- to be referenced as more meaningful names (e.g., "dCTR_NZERO_NOT",
- "ALWAYS", etc.).
-
- Disassemble_TrapTO - Trap TO operand encodings are referenced as values 0:31 in the
- basic instruction operand forms. The option causes the TO value
- to be an expression of the form "x|y|...", where the "x", "y",
- and so are the meaning of each of the five TO bits; "LT", "GT",
- "EQ", "LOW", "HI" for bits 0, 1, 2, 3, and 4 respectively.
-
- Disassemble_IBM - Apple assembler conventions are used for comments and invalid
- instructions. The option causes IBM assembler conventions to be
- used for these. A “#” is used instead of a “;” as the comment
- character, and “.long” is used instead of “dc.l” for the invalid
- instruction directive mnemonic.
-
- [Are we having fun yet?]
- **)
-
- (** The following defines a set of the above options which seem to give "acceptable" **)
- (** results*: **)
-
- DisStdOptions* = (Disassemble_Extended + (** permit extended mnemonics **)
- Disassemble_DecSI + (** decimal SIs but hex UIs **)
- Disassemble_DecField + (** decimal field numbers **)
- Disassemble_BranchBO + (** meaning of branch BO **)
- Disassemble_TrapTO + (** meaning of trap TO **)
- Disassemble_CRBits); (** CR bits references as crN_X **)
-
-
- (** Return status flags*: **)
-
- Disassembler_OK* = $0001; (** instruction successfully decoded **)
- Disassembler_InvRsvBits* = $0002; (** invalidly coded reserved bits **)
- Disassembler_InvField* = $0004; (** invalidly coded field(s) **)
- Disassembler_InvSprMaybe* = $0008; (** possibly invalid SPR **)
- Disassembler_601Power* = $0010; (** power instruction used with 601 **)
- Disassembler_Privileged* = $0020; (** privileged instruction **)
- Disassembler_Optional* = $0040; (** optional instruction **)
- Disassembler_Branch* = $0080; (** branch instruction **)
- Disassembler_601SPR* = $0100; (** SPR valid only for 601 has been used **)
- Disassembler_HasExtended* = $4000; (** possible extended mnemonic **)
- Disassembler_ExtendedUsed* = $8000; (** the extended mnemonic was generated **)
-
- DisInvalid* = $0000; (** invalid instruction **)
-
- (**
- Unless DisInvalid (0) is returned as the function result, Disassembler_OK will always be
- set. The other flags have the following meaning*:
-
- Disassembler_InvRsvBits - The instruction had some or all of its reserved bits
- incorrectly coded, and the Disassemble_RsvBitsErr option was
- NOT set. This is something like a "warning". With the option
- set, this condition is considered as an "error" and the
- "invalid instruction" is generated ("dc.l 0xXXXXXXXX").
-
- Disassembler_InvField - The instruction had fields incorrectly coded for the
- target, but is is still valid for some target (e.g., not
- valid for the 601 but valid for the PowerPC64), and the
- Disassemble_FieldErr option was NOT set.
-
- Disassembler_InvSprMaybe - A mfspr or mtspr instruction references a POSSIBLY invalid
- SPR. This occurs when an SPR value is not for one of the
- predefined SPR names (see list above) and there is no lookup
- routine, or it does not supply a substitution name. In that
- case the SPR register number is generated. Since there is
- no way of the disassembler knowing whether the register is
- valid for the architecture of interest, this flag is set
- instead of Disassembler_InvField to indicate the possibility
- that the SPR may be invalid.
-
- Disassembler_601Power - The options specified that the target architecture is the
- 601 (Disassemble_PowerPC601), and a Power instruction was
- disassembled. The 601 is basically an ORing of the Power
- and PowerPC32 architectures. But this flag could be useful
- for "weeding" Power instructions out in preparation for use
- on a "pure" PowerPC32 or PowerPC64 architecture.
-
- Disassembler_601SPR - The options specified that the target architecture is the
- 601 (Disassemble_PowerPC601), and a mfspr or mtspr
- instruction references a SPR valid ONLY for the 601.
-
- Disassembler_Privileged - The instruction is privileged.
-
- Disassembler_Optional - The instruction is optional.
-
- Disassembler_Branch - Branch instruction; bc[l][a], b[l][a], bclr[l], bcctr[l] and
- Power bcr[l], bcc[l]. If any of these instructions are
- processed the flag is set. Branches are signaled because
- the caller might want to do some additional processing on
- these. For example, a debugger might want to dynamically
- show which way the branch is taken, or static analysis might
- want to know possible exit points from a function or show
- the branch in some graphical way. Although the caller could
- determine if the instruction is a branch, the disassembler
- always has to classify the instructions passed to it, so
- there is no sense having both do it if the information is
- already available. Note, the caller might still, however,
- need to extract the BO and BI fields to determine the
- condition of the branch, but at least it only needs to be
- done when the flag is set.
-
- Disassembler_HasExtended - The instruction POSSIBLY has an extended mnemonic, whether
- used or not used (as a function of the Disassemble_Extended
- option). Note, "possibly has an extended mnemonic"; the
- instruction could have extendeds, but not for all
- values of its operands.
-
- Disassembler_ExtendedUsed - The instruction has an extended mnemonic, and it was used
- because the option (Disassemble_Extended) permits it. The
- operand is formatted appropriate to the extended mnemonic.
- Whether the original basic form is placed in the comment or
- not is controlled by the Disassemble_BasicComm option.
- **)
-
-
- (** All assembler options are of type DisassemblerOptions*: **)
-
- TYPE
- DisassemblerOptions* = LONGINT;
-
- DisassemblerStatus* = INTEGER; (** disassembler return status (see above) **)
-
- (** The optional lookup function (NULL could be passed) is used to allow the caller to **)
- (** substitute name strings for various objects that can occur in an operand. It should **)
- (** return a pointer to a non-null string if substitution is desired. If NULL or a null **)
- (** string is returned, the disassembler uses its own default names. The following **)
- (** defines the possible substitable objects*: **)
-
- DisassemblerLookupType* = SHORTINT; (*ΔΔ ( *) (** Types of substitutable objects*: **)
- CONST
- Disassembler_Lookup_GPRegister* = 0; (*ΔΔ ,*) (** general purpose register **)
- Disassembler_Lookup_FPRegister* = 1; (*ΔΔ ,*) (** floating point register **)
- Disassembler_Lookup_UImmediate* = 2; (*ΔΔ ,*) (** unsigned immediate value **)
- Disassembler_Lookup_SImmediate* = 3; (*ΔΔ ,*) (** signed (32-bit) immediate value **)
- Disassembler_Lookup_AbsAddress* = 4; (*ΔΔ ,*) (** absolute addresse **)
- Disassembler_Lookup_RelAddress* = 5; (*ΔΔ ,*) (** relocatable addresse **)
- Disassembler_Lookup_RegOffset* = 6; (*ΔΔ ,*) (** offset from a base register **)
- Disassembler_Lookup_SPRegister* = 7; (*ΔΔ ,*) (** special purpose register **)
- (*ΔΔ );*)
- TYPE
-
- (** Here's a definition of an object (value) which is a function of each **)
- (** DisassemblerLookupType*: **)
-
- (* $ALIGN MAC68K*)
- DisLookupValue* = LONGINT; (*ΔΔ RECORD (** A "meaningful" name for each value type*: **)
- CASE INTEGER OF
- 0: (gpr: LONGINT;); (** Disassembler_Lookup_GPRegister **)
- 1: (fpr: LONGINT;); (** Disassembler_Lookup_FPRegister **)
- 2: (ui: LONGINT;); (** Disassembler_Lookup_UImmediate **)
- 3: (si: LONGINT;); (** Disassembler_Lookup_SImmediate **)
- 4: (absAddress: LONGINT;); (** Disassembler_Lookup_AbsAddress **)
- 5: (relAddress: LONGINT;); (** Disassembler_Lookup_RelAddress **)
- 6: (spr: LONGINT;); (** Disassembler_Lookup_SPRegister **)
- 7: ( (* regOffset*)
- offset*: INTEGER;
- baseReg*: INTEGER;
- ); (** Disassembler_Lookup_RegOffset **)
- END;*)
- (* $ALIGN RESET*)
-
- DisLookupValuePtr* = SYSTEM.PTR (*ΔΔ POINTER TO DisLookupValue*);
-
- (** Finally, at long last, here's the definition of the disassembler... **)
-
- PROCEDURE ppcDisassembler*( VAR instruction : LONGINT;
- dstAdjust : LONGINT;
- options : DisassemblerOptions;
- mnemonic : Types.CStringPtr;
- operand : Types.CStringPtr;
- comment : Types.CStringPtr;
- refCon : (*ΔΔUNIVΔΔ*) Types.Ptr;
- lookupRoutine : (*ΔΔUNIVΔΔ*) Types.Ptr) : DisassemblerStatus; (*ΔΔC;ΔΔ*)
- EXTERNAL (*•• C*);
- (**
- Takes the four bytes pointed to by instruction and disassembles it, placing the mnemonic,
- operand, and comment in the strings provided. The caller is then free to format or use
- the output strings any way appropriate to the application. Any of these strings may be a
- null pointer, in which case that portion of the disassembled instruction is not returned.
- If they are not null, it is ASSUMED that the associated buffers are large enough to hold
- the disassembled output.
-
- Comments are formatted starting with a "; " (or "#" if the appropriate "IBM" option is
- set). Invalid instructions generate a "dc.l" (".long" for IBM), an operand of the form
- 0xXXXXXXXX showing the actual instruction, and a comment with a message indicating what
- is wrong with the instruction.
-
- For PC-relative branches, the comment generated is the destination address, the only
- address that the disassembler "knows" about is the address of the code pointed to by the
- instruction. Generally, that may be a buffer that has no relation to "reality", i.e.,
- the actual code loaded into the buffer. Therefore, to allow the address comment to be
- mapped back to some actual address, the caller may specify an adjustment factor,
- specified by dstAdjust that is ADDED to the value that normally would be placed in the
- comment.
-
- Many operands usually consist of registers, absolute and relocatable addresses, and
- signed and unsigned values. In places where these occur, the disassembler can call a
- user specified routine to do the substitution using the lookupRoutine parameter if it
- is not NULL. A "refcon" is passed to the disassembler that is, in turn, passed on to
- the lookup routine to allow a communication path between the disassembler caller and its
- lookup routine. The refcon can be anything. The disassembler does not look at it.
-
- The caller also can control some aspects of the formatting with the DisassemblerOptions
- as described above. The options also specify the target architecture; Power, PowerPC32,
- PowerPC64, or PowerPC601.
-
- The disassembler returns as its function result the DisassemblerStatus. This may be
- tested for 0 ("false" or DisInvalid defined below) to find out if an invalid instruction
- was detected. For valid instructions, the DisassemblerStatus is non zero and indicates
- various attributes about the instruction as follows*: **)
-
- (** The "lookup" substitution routine for the objects is defined as follows*:
-
- PROCEDURE DisassemblerLookups*( refCon : (*ΔΔUNIVΔΔ*) Types.Ptr,
- VAR cia : LongInt,
- lookupType : DisassemblerLookupType,
- thingToReplace : DisLookupValue) : CStringPtr; (*ΔΔC;ΔΔ*)
- EXTERNAL (*•• C*);
-
- where, refCon* = A "reference constant" that can be used as a communication link
- between the lookup routine and the caller of the disassembler.
- It is the same refCon passed to the disassembler.
-
- cia* = The instruction address passed to the disassembler.
-
- lookupType and
- thingToReplace* = The kind of object and the associated value of that object to be
- replaced. As defined by DisLookupValue, the thingToReplace has
- the following value for each lookupType.
-
- lookupType value
- =============================================
- Disassembler_Lookup_GPRegister 0:31
- Disassembler_Lookup_FPRegister 0:31
- Disassembler_Lookup_UImmediate integer
- Disassembler_Lookup_SImmediate integer
- Disassembler_Lookup_AbsAddress address [1]
- Disassembler_Lookup_RelAddress address [2]
- Disassembler_Lookup_RegOffset D + Ra [3]
- Disassembler_Lookup_SPRegister spr [4]
- =============================================
-
- Notes*:
-
- [1] This is an absolute target branch address, i.e., the "a" bit
- in the branch instruction IS set. The passed absAddress
- is the address contained in the instruction.
-
- [2] This is a relocatable target branch address, i.e., the "a"
- bit in the branch instruction was NOT set. The relAddress
- is relative to the current instruction address adjusted
- by the dstAdjust. Thus,
-
- relAddress* = destinationAddress + dstAdjust + cia
-
- where cia is the current instruction address, i.e, the value
- of the instruction address passed to the disassembler.
-
- [3] Both the offset (D) and base register (Ra) are passed. The
- DisLookupValue.regOffset value defines how they are packed
- in the thingToReplace. The offset should be assigned to a
- long to get its true 32-bit value. It is valid to pass it
- as a signed short since the instruction field from which it
- came is never more than 16 bits wide.
-
- [4] The lookup for SPRs is slightly different in that it is only
- done as an ESCAPE mechanism, i.e., only when the SPR number is
- NOT one of the predefined Power, 601, PowerPC32, or
- PowerPC64 SPR names. This is done because a different
- PowerPC architectures can have additional SPRs specific to
- those architectures! The lookup routine is called only if
- the SPR is NOT one of the following predefined numbers*:
-
- 0 MQ 272 SPRG0 528 IBAT0U 536 DBAT0U 1008 HID0
- 1 XER 273 SPRG1 529 IBAT0L 537 DBAT0L 1009 HID1
- 4 RTCU 274 SPRG2 530 IBAT1U 538 DBAT1U 1010 IABR
- 5 RTCL 275 SPRG3 531 IBAT1L 539 DBAT1L 1013 DABR
- 6 DEC 280 ASR 532 IBAT2U 540 DBAT2U 1023 PIR
- 8 LR 282 EAR 533 IBAT2L 541 DBAT2L
- 9 CTR 284 TB 534 IBAT3U 542 DBAT3U
- 18 DSIAR 285 TBU 535 IBAT3L 543 DBAT3L
- 19 DAR 287 PVR
- 22 DEC
- 25 SDR1
- 26 SRR0
- 27 SRR1
-
- Not all of these SPRs are valid for all targets. The
- disassembler will check to see if these SPRs are valid for
- the specified target architecture. If they are not, the SPR
- number is treated as an invalid field and processed
- according to the Disassemble_FieldErr option, i.e., it’s
- accepted but returns a status warning, or the instruction is
- treated as invalid (“DC.L 0xXXXXXXXX”).
-
- SPR numbers which are not on the list, and also do not have
- a lookup substitution name, are always accepted. But since
- there is no way for the disassembler to validate these
- against the target, the Disassembler_InvSprMaybe return
- status flag will be set. **)
-
- (** NOTES*: 1. The disassembler library uses the convention that, with the exception of **)
- (** the called routine name itself, i.e., "ppcDisassembler", all externally **)
- (** visible names (linker symbols and macro names) begin with the letters "dis"**)
- (** (in any case). The user should keep this in mind to avoid possible name **)
- (** conflicts. **)
-
- (** 2. Except for statically declared (read only) tables, the disassembler uses no**)
- (** other global data. **)
-
- (** 3. The disassembler is fully self contained in that it has no explicit **)
- (** references to any runtime library routines (e.g., strcpy). There may, **)
- (** however, be implicit references generated by the (C) compiler. **)
-
- (** 4. The disassembler is written in standard ANSI C making it possible to easily**)
- (** port to other platforms. **)
-
-
- (* $ALIGN RESET*)
- (* $POP*)
-
- END Disassembler.
-