- Assembly HOWTO
- François-René Rideau fare@tunes.org
- v0.4p, 6 June 1999
-
- This is the Linux Assembly HOWTO. This document describes how to
- program in assembly using FREE programming tools, focusing on development
- for or from the Linux Operating System on i386 platforms. Included
- material may or may not be applicable to other hardware and/or
- software platforms. Contributions about these would be gladly accepted.
- keywords: assembly, assembler, free, macroprocessor, preprocessor,
- asm, inline asm, 32-bit, x86, i386, gas, as86, nasm
- ______________________________________________________________________
-
- Table of Contents
-
- 1. INTRODUCTION
-
- 1.1 Legal Blurb
- 1.2 Important Note
- 1.3 Foreword
- 1.3.1 How to use this document
- 1.3.2 Other related documents
- 1.4 History
- 1.5 Credits
-
- 2. DO YOU NEED ASSEMBLY?
-
- 2.1 Pros and Cons
- 2.1.1 The advantages of Assembly
- 2.1.2 The disadvantages of Assembly
- 2.1.3 Assessment
- 2.2 How to NOT use Assembly
- 2.2.1 General procedure to achieve efficient code
- 2.2.2 Languages with optimizing compilers
- 2.2.3 General procedure to speed your code up
- 2.2.4 Inspecting compiler-generated code
-
- 3. ASSEMBLERS
-
- 3.1 GCC Inline Assembly
- 3.1.1 Where to find GCC
- 3.1.2 Where to find docs for GCC Inline Asm
- 3.1.3 Invoking GCC to have it properly inline assembly code ?
- 3.2 GAS
- 3.2.1 Where to find it
- 3.2.2 What is this AT&T syntax
- 3.2.3 Limited 16-bit mode
- 3.3 GASP
- 3.3.1 Where to find GASP
- 3.3.2 How it works
- 3.4 NASM
- 3.4.1 Where to find NASM
- 3.4.2 What it does
- 3.5 AS86
- 3.5.1 Where to get AS86
- 3.5.2 How to invoke the assembler?
- 3.5.3 Where to find docs
- 3.5.4 What if I can't compile Linux anymore with this new version ?
- 3.6 OTHER ASSEMBLERS
- 3.6.1 Win32Forth assembler
- 3.6.2 Terse
- 3.6.3 Non-free and/or Non-32bit x86 assemblers.
-
- 4. METAPROGRAMMING/MACROPROCESSING
-
- 4.1 What's integrated into the above
- 4.1.1 GCC
- 4.1.2 GAS
- 4.1.3 GASP
- 4.1.4 NASM
- 4.1.5 AS86
- 4.1.6 OTHER ASSEMBLERS
- 4.2 External Filters
- 4.2.1 CPP
- 4.2.2 M4
- 4.2.3 Macroprocessing with yer own filter
- 4.2.4 Metaprogramming
- 4.2.4.1 Backends from compilers
- 4.2.4.2 The New-Jersey Machine-Code Toolkit
- 4.2.4.3 TUNES
-
- 5. CALLING CONVENTIONS
-
- 5.1 Linux
- 5.1.1 Linking to GCC
- 5.1.2 ELF vs a.out problems
- 5.1.3 Direct Linux syscalls
- 5.1.4 I/O under Linux
- 5.1.5 Accessing 16-bit drivers from Linux/i386
- 5.2 DOS
- 5.3 Winblows and suches
- 5.4 Yer very own OS
-
- 6. TODO & POINTERS
-
-
-
- ______________________________________________________________________
-
- 1. INTRODUCTION
-
- 1.1. Legal Blurb
-
- Copyright © 1996-1999 by François-René Rideau.
-
- This document is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation; either version 2 of the License, or (at
- your option) any later version.
-
-
-
- 1.2. Important Note
-
- This is an interactively evolving document: you are especially invited
- to ask questions, to answer questions, to correct given answers, to
- add new FAQ answers, to give pointers to other software, and to point
- the current maintainer to bugs or deficiencies in the pages. If you're
- motivated, you could even take over the maintenance of the HOWTO. In
- one word, contribute!
-
- To contribute, please contact whoever appears to maintain the
- Assembly-HOWTO. At the time of this writing, it's me, i.e.
- François-René Rideau <mailto:fare@tunes.org>.
-
- However, I have been looking for some time for a serious hacker to
- replace me as maintainer of this document. The disadvantages are that
- you must spend some time updating and correcting the document, and
- learning the LDP publication tools. The advantages are that you get
- some fame and can receive complimentary copies of HOWTO compendiums.
-
-
-
- 1.3. Foreword
-
- This document aims at answering frequently asked questions of people
- who program or want to program 32-bit x86 assembly using free
- software, particularly under the Linux operating system. It may also
- point to other documents about non-free, non-x86, or non-32-bit
- assemblers, though such is not its primary goal.
-
- Because the main interest of assembly programming is to write the
- guts of operating systems, interpreters, compilers, and games,
- where a C compiler fails to provide the needed expressiveness
- (performance is more and more seldom an issue), we stress the
- development of such software.
-
- 1.3.1. How to use this document
-
- This document contains answers to some frequently asked questions. In
- many places, Uniform Resource Locators (URLs) are given for some
- software or documentation repository. Please note that the most useful
- repositories are mirrored, and that by accessing a nearer mirror site,
- you relieve the whole Internet of unneeded network traffic, while
- saving your own precious time. In particular, there are large
- repositories all over the world that mirror other popular
- repositories. You should learn and note which of those places are near
- you (networkwise). Sometimes, the list of mirrors is given in a
- file, or in a login message. Please heed the advice. Otherwise, you
- should ask archie about the software you're looking for...
-
- The most recent version of this document sits at
- <http://www.tunes.org/~fare/files/asm/Assembly-HOWTO.en.sgml> but
- what's in Linux HOWTO repositories should be fairly up to date, too (I
- can't know): <http://metalab.unc.edu/LDP/HOWTO/>. A French
- translation of this HOWTO can be found around
- <ftp://ftp.lip6.fr/pub/linux/french/HOWTO/>.
-
-
-
- 1.3.2. Other related documents
-
-
- · If you don't know what free software is, please read carefully
- the GNU General Public License, which is used in a lot of free
- software, and is a model for most of their licenses. It generally
- comes in a file named COPYING, with a library version in a file
- named COPYING.LIB. Literature from the FSF <http://www.fsf.org>
- (Free Software Foundation) might help you, too.
-
- · Particularly, the interesting kind of free software comes with
- sources that you can consult and correct, or sometimes even borrow
- from. Read your particular license carefully, and do comply with it.
-
- · There is a FAQ for comp.lang.asm.x86 that answers generic questions
- about x86 assembly programming, and questions about some commercial
- assemblers in a 16-bit DOS environment. Some of it applies to free
- 32-bit asm programming, so you may want to read this FAQ...
-
- <http://www2.dgsys.com/~raymoon/faq/asmfaq.zip>
-
- · FAQs and docs exist about programming on your favorite platform,
- whichever it is, that you should consult for platform-specific
- issues not directly related to programming in assembler.
-
-
-
- 1.4. History
-
- Each version includes a few fixes and minor corrections, which need
- not be mentioned every time.
-
- Version 0.1 23 Apr 1996
- Francois-Rene "Faré" Rideau <fare@tunes.org> creates and
- publishes the first mini-HOWTO, because ``I'm sick of answering
- ever the same questions on comp.lang.asm.x86''
-
- Version 0.2 4 May 1996
- *
-
- Version 0.3c 15 Jun 1996
- *
-
- Version 0.3f 17 Oct 1996
- *
-
- Version 0.3g 2 Nov 1996
- Created the History. Added pointers in cross-compiling section.
- Added section about I/O programming under Linux (particularly
- video).
-
- Version 0.3h 6 Nov 1996
- more about cross-compiling -- See on sunsite: devel/msdos/
-
- Version 0.3i 16 Nov 1996
- NASM is getting pretty slick
-
- Version 0.3j 24 Nov 1996
- point to French translated version
-
- Version 0.3k 19 Dec 1996
- What? I had forgotten to point to terse???
-
- Version 0.3l 11 Jan 1997
- *
-
- Version 0.4pre1 13 Jan 1997
- text mini-HOWTO transformed into a full linuxdoc-sgml HOWTO, to
- see what the SGML tools are like.
-
- Version 0.4 20 Jan 1997
- first release of the HOWTO as such.
-
- Version 0.4a 20 Jan 1997
- CREDITS section added
-
- Version 0.4b 3 Feb 1997
- NASM moved: now is before AS86
-
- Version 0.4c 9 Feb 1997
- Added section "DO YOU NEED ASSEMBLY?"
-
- Version 0.4d 28 Feb 1997
- Vapor announcement of a new Assembly-HOWTO maintainer.
-
- Version 0.4e 13 Mar 1997
- Release for DrLinux
-
- Version 0.4f 20 Mar 1997
- *
-
- Version 0.4g 30 Mar 1997
- *
-
- Version 0.4h 19 Jun 1997
- still more on "how not to use assembly"; updates on NASM, GAS.
-
- Version 0.4i 17 July 1997
- info on 16-bit mode access from Linux.
-
- Version 0.4j 7 September 1997
- *
-
- Version 0.4k 19 October 1997
- *
-
- Version 0.4l 16 November 1997
- release for LSL 6th edition.
-
- Version 0.4m 23 March 1998
- corrections about gcc invocation
-
- Version 0.4o 1 December 1998
- *
-
- Version 0.4p 6 June 1999
- clean up and updates.
-
- This is yet another ``last release by Faré before new maintainer
- takes over''. Only nobody knows who the new maintainer might
- be.
-
-
-
-
-
- 1.5. Credits
-
- I would like to thank the following people, in order of appearance:
-
- · Linus Torvalds <mailto:buried.alive@in.mail> for Linux
-
- · Bruce Evans <mailto:bde@zeta.org.au> for bcc from which as86 is
- extracted
-
- · Simon Tatham <mailto:anakin@pobox.com> and Julian Hall
- <mailto:jules@earthcorp.com> for NASM
-
- · Greg Hankins <mailto:gregh@metalab.unc.edu> and now Tim Bynum
- <mailto:linux-howto@metalab.unc.edu> for maintaining HOWTOs
-
- · Raymond Moon <mailto:raymoon@moonware.dgsys.com> for his FAQ
-
- · Eric Dumas <mailto:dumas@linux.eu.org> for his translation of the
- mini-HOWTO into French (sad thing for the original author to be
- French and write in English)
-
- · Paul Anderson <mailto:paul@geeky1.ebtech.net> and Rahim Azizarab
- <mailto:rahim@megsinet.net> for helping me, if not for taking over
- the HOWTO.
-
- · Marc Lehman <mailto:pcg@goof.com> for his insight on GCC
- invocation.
-
- · All the people who have contributed ideas, remarks, and moral
- support.
-
-
-
-
- 2. DO YOU NEED ASSEMBLY?
-
- Well, I wouldn't want to interfere with what you're doing, but here
- is some advice from hard-earned experience.
-
-
-
- 2.1. Pros and Cons
-
-
-
- 2.1.1. The advantages of Assembly
-
- Assembly can express very low-level things:
-
- · you can access machine-dependent registers and I/O.
-
- · you can control the exact behavior of code in critical sections
- that might otherwise involve deadlock between multiple software
- threads or hardware devices.
-
- · you can break the conventions of your usual compiler, which might
- allow some optimizations (like temporarily breaking rules about
- memory allocation, threading, calling conventions, etc).
-
- · you can build interfaces between code fragments using incompatible
- conventions (e.g. produced by different compilers, or
- separated by a low-level interface).
-
- · you can get access to unusual programming modes of your processor
- (e.g. 16-bit mode to interface startup, firmware, or legacy code on
- Intel PCs)
-
- · you can produce reasonably fast code for tight loops to cope with a
- bad non-optimizing compiler (but then, there are free optimizing
- compilers available!)
-
- · you can produce code where (but only on CPUs with known instruction
- timings, which generally excludes all current ....
-
- · you can produce hand-optimized code that's perfectly tuned for your
- particular hardware setup, though not to anyone else's.
-
- · you can write some code for your new language's optimizing compiler
- (that's something few will ever do, and even they, not often).
-
-
-
-
- 2.1.2. The disadvantages of Assembly
-
- Assembly is a very low-level language (the lowest above hand-coding
- the binary instruction patterns). This means
-
- · it's long and tedious to write initially,
-
- · it's very bug-prone,
-
- · your bugs will be very difficult to chase,
-
- · it's very difficult to understand and modify, i.e. to maintain,
-
- · the result is very non-portable to other architectures, existing or
- future,
-
- · your code will be optimized only for a certain implementation of a
- given architecture: for instance, among Intel-compatible platforms,
- each CPU design and its variations (relative latency, throughput,
- and capacity of processing units, caches, RAM, bus, disks,
- presence of FPU, MMX extensions, etc) implies potentially
- completely different optimization techniques. CPU designs already
- include Intel 386, 486, Pentium, PPro, Pentium II; Cyrix 5x86,
- 6x86; AMD K5, K6. New designs keep popping up, so don't expect
- either this listing or your code to be up-to-date.
-
- · your code might also be unportable across different OS platforms
- on the same architecture, for lack of proper tools (well, GAS
- seems to work on all platforms; NASM seems to work or be workable
- on all Intel platforms).
-
- · you spend more time on a few details, and can't focus on small and
- large algorithmic design, which is known to bring the largest part
- of the speed-up [e.g. you might spend some time building very
- fast list/array manipulation primitives in assembly, when a hash
- table would have sped up your program much more; or, in another
- context, a binary tree; or some high-level structure distributed
- over a cluster of CPUs].
-
- · a small change in algorithmic design might completely invalidate
- all your existing assembly code, so that either you're ready (and
- able) to rewrite it all, or you're tied to a particular algorithmic
- design;
-
- · On code that ain't too far from what's in standard benchmarks,
- commercial optimizing compilers outperform hand-coded assembly
- (well, that's less true on the x86 architecture than on RISC
- architectures, and perhaps less true for widely available/free
- compilers; anyway, for typical C code, GCC is fairly good);
-
- · And in any case, as moderator John Levine says on comp.compilers,
- ``compilers make it a lot easier to use complex data structures,
- and compilers don't get bored halfway through and generate reliably
- pretty good code.'' They will also correctly propagate code
- transformations throughout the whole (huge) program when optimizing
- code between procedures and module boundaries.
-
-
-
- 2.1.3. Assessment
-
- All in all, you might find that though using assembly is sometimes
- needed, and might even be useful in a few cases where it is not,
- you'll want to:
-
- · minimize the use of assembly code,
-
- · encapsulate this code in well-defined interfaces,
-
- · have your assembly code automatically generated from patterns
- expressed in a higher-level language than assembly (e.g. GCC inline
- assembly macros),
-
- · have automatic tools translate these programs into assembly code,
-
- · have this code be optimized if possible,
-
- · all of the above, i.e. write (an extension to) an optimizing
- compiler back-end.
-
- Even in cases when Assembly is needed (e.g. OS development), you'll
- find that not so much of it is, and that the above principles hold.
-
- See the sources for the Linux kernel about it: as little assembly as
- needed, resulting in a fast, reliable, portable, maintainable OS.
- Even a successful game like DOOM was written almost entirely in C,
- with only a tiny part written in assembly for speed.
-
-
-
- 2.2. How to NOT use Assembly
-
-
-
-
-
-
- 2.2.1. General procedure to achieve efficient code
-
- As Charles Fiterman says on comp.compilers about human vs
- computer-generated assembly code,
-
- ``The human should always win and here is why.
-
- · First the human writes the whole thing in a high level language.
-
- · Second he profiles it to find the hot spots where it spends its
- time.
-
- · Third he has the compiler produce assembly for those small sections
- of code.
-
- · Fourth he hand tunes them looking for tiny improvements over the
- machine generated code.
-
- The human wins because he can use the machine.''
-
-
-
- 2.2.2. Languages with optimizing compilers
-
- Languages like ObjectiveCAML, SML, CommonLISP, Scheme, ADA, Pascal, C,
- C++, among others, all have free optimizing compilers that'll optimize
- the bulk of your programs, and often do better than hand-coded
- assembly even for tight loops, while allowing you to focus on
- higher-level details, and without forbidding you to grab a few percent of
- extra performance in the above-mentioned way, once you've reached a
- stable design. Of course, there are also commercial optimizing
- compilers for most of these languages!
-
- Some languages have compilers that produce C code, which can be
- further optimized by a C compiler. LISP, Scheme, Perl, and many others
- are such languages. Speed is fairly good.
-
-
-
-
- 2.2.3. General procedure to speed your code up
-
- As for speeding code up, you should do it only for parts of a program
- that a profiling tool has consistently identified as being a
- performance bottleneck.
-
- Hence, if you identify some code portion as being too slow, you should
-
- · first try to use a better algorithm;
-
- · then try to compile it rather than interpret it;
-
- · then try to enable and tweak optimization from your compiler;
-
- · then give the compiler hints about how to optimize (typing
- information in LISP; register usage with GCC; lots of options in
- most compilers, etc);
-
- · then possibly fall back to assembly programming.
-
- Finally, before you end up writing assembly, you should inspect
- generated code, to check that the problem really is with bad code
- generation, as this might really not be the case: compiler-generated
- code might be better than what you'd have written, particularly on
- modern multi-pipelined architectures! Slow parts of a program might
- be intrinsically so. The biggest problems on modern architectures with
- fast processors are due to delays from memory access, cache misses,
- TLB misses, and page faults; register optimization becomes useless,
- and you'll more profitably re-think data structures and threading to
- achieve better locality in memory access. Perhaps a completely
- different approach to the problem might help, then.
-
-
-
- 2.2.4. Inspecting compiler-generated code
-
- There are many reasons to inspect compiler-generated assembly code.
- Here is what you'll do with such code:
-
- · check whether generated code can be obviously enhanced with
- hand-coded assembly (or by tweaking compiler switches)
-
- · when that's the case, start from generated code and modify it
- instead of starting from scratch
-
- · more generally, use generated code as stubs to modify, which at
- least gets right the way your assembly routines interface to the
- external world
-
- · track down bugs in your compiler (hopefully rarer)
-
- The standard way to have assembly code be generated is to invoke your
- compiler with the -S flag. This works with most Unix compilers,
- including the GNU C Compiler (GCC), but YMMV. As for GCC, it will
- produce more understandable assembly code with the -fverbose-asm
- command-line option. Of course, if you want to get good assembly
- code, don't forget your usual optimization options and hints!
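-
- As a minimal sketch of the procedure above, here is a small C function
- to try the flags on. The file name sum.c and the function sum_ints are
- made up for this illustration; only the flags come from the text above.

```c
/* sum.c (hypothetical file name, used only for this illustration).
 *
 * To get annotated assembly instead of an object file, compile with:
 *     gcc -O2 -S -fverbose-asm sum.c
 * and read the resulting sum.s.
 */
#include <stddef.h>

/* Sum n ints; small enough that the generated loop is easy to follow. */
long sum_ints(const int *v, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += v[i];
    return total;
}
```

- Comparing the sum.s obtained with and without -O2 is a quick way to
- see what the optimizer does before deciding to hand-tune anything.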
-
-
-
-
- 3. ASSEMBLERS
-
-
-
- 3.1. GCC Inline Assembly
-
- The well-known GNU C/C++ Compiler (GCC), an optimizing 32-bit compiler
- at the heart of the GNU project, supports the x86 architecture quite
- well, and includes the ability to insert assembly code in C programs,
- in such a way that register allocation can be either specified or left
- to GCC. GCC works on most available platforms, notably Linux, *BSD,
- VSTa, OS/2, *DOS, Win*, etc.
-
-
- 3.1.1. Where to find GCC
-
- The original GCC site is the GNU FTP site
- <ftp://prep.ai.mit.edu/pub/gnu/gcc/> together with all the released
- application software from the GNU project. Linux-configured and
- precompiled versions can be found in
- <ftp://metalab.unc.edu/pub/Linux/GCC/>. There are many FTP
- mirrors of both sites all around the world, as well as CD-ROM
- copies.
-
- GCC development has split into two branches recently. See more about
- the experimental version, egcs, at <http://www.cygnus.com/egcs/>
-
- Sources adapted to your favorite OS, and binaries precompiled for it,
- should be found at your usual FTP sites.
-
-
- The most popular DOS port of GCC is named DJGPP, and can be found in
- directories of that name on FTP sites. See:
-
- <http://www.delorie.com/djgpp/>
-
-
- There is also a port of GCC to OS/2 named EMX, that also works under
- DOS, and includes lots of unix-emulation library routines. See around
- the following site: <ftp://ftp-os2.cdrom.com/pub/os2/emx09c/>. Other
- URLs listed in previous versions of this HOWTO seem to be as dead as
- OS/2.
-
-
- 3.1.2. Where to find docs for GCC Inline Asm
-
- The documentation of GCC includes documentation files in texinfo
- format. You can compile them with tex and print the result,
- convert them to .info and browse them with emacs, convert them to
- .html, or (with the right tools) convert them to nearly whatever you
- like, or just read them as is. The .info files are generally
- found on any good installation for GCC.
-
- The right section to look for is: C Extensions::Extended Asm::
-
- Section Invoking GCC::Submodel Options::i386 Options:: might help too.
- Particularly, it gives the i386 specific constraint names for
- registers: abcdSDB correspond to %eax, %ebx, %ecx, %edx, %esi, %edi,
- %ebp respectively (no letter for %esp).
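-
- As a hedged sketch of how such constraint letters are used in extended
- asm, the hypothetical function mul32 below (a name invented here, not
- taken from the GCC documentation) asks GCC to load one operand into
- %eax and to collect the two halves of the product from %eax and %edx,
- using only the a and d letters. On non-x86 targets it falls back to
- plain C so that the file still compiles.

```c
/* Multiply two 32-bit values into a 64-bit product.
 * "a" and "d" are the i386 constraint letters for %eax and %edx;
 * "rm" lets GCC choose any register or a memory operand for y. */
unsigned long long mul32(unsigned int x, unsigned int y)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned int lo, hi;

    __asm__ ("mull %3"               /* %edx:%eax = %eax * operand 3 */
             : "=a" (lo), "=d" (hi)  /* outputs: low half, high half */
             : "a" (x), "rm" (y)     /* inputs: x in %eax, y anywhere */
             : "cc");                /* mull changes the flags */
    return ((unsigned long long) hi << 32) | lo;
#else
    /* Portable fallback for other compilers and architectures. */
    return (unsigned long long) x * y;
#endif
}
```

- Leaving y as "rm" rather than pinning it to a register illustrates the
- point made earlier: register allocation can be specified where the
- instruction demands it, and left to GCC everywhere else.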
-
- The DJGPP Games resource (not only for game hackers) had a page
- specifically about assembly, but it's down. Its data have nonetheless
- been recovered on the DJGPP site <http://www.delorie.com/djgpp/>, which
- contains a mine of other useful information:
- <http://www.delorie.com/djgpp/doc/brennan/>
-
- GCC depends on GAS for assembling, and follows its syntax (see below);
- do mind that inline asm needs percent characters to be quoted so that
- they are passed to GAS. See the section about GAS below.
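-
- A minimal sketch of the quoting rule, using a hypothetical function
- add_via_eax invented for this illustration: in extended asm (the form
- with operand lists), %0, %1, ... refer to operands, so a literal
- register name has to be written with a doubled percent sign, which
- reaches GAS as a single one. Non-x86 targets get a plain C fallback.

```c
/* Add two ints by going through %eax explicitly. */
int add_via_eax(int x, int y)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int result;

    __asm__ ("movl %1, %%eax\n\t"    /* %%eax is passed to GAS as %eax */
             "addl %2, %%eax\n\t"
             "movl %%eax, %0"
             : "=r" (result)         /* operand 0: any register */
             : "r" (x), "r" (y)      /* operands 1 and 2 */
             : "eax", "cc");         /* tell GCC that %eax and flags change */
    return result;
#else
    /* Portable fallback for other compilers and architectures. */
    return x + y;
#endif
}
```

- Naming the register by hand is shown here only to demonstrate the
- quoting; letting GCC pick registers through constraints is usually
- the better choice.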
-
- Find lots of useful examples in the linux/include/asm-i386/
- subdirectory of the sources for the Linux kernel.
-
-
-
-
- 3.1.3. Invoking GCC to have it properly inline assembly code ?
-
- Because assembly routines from the kernel headers (and most likely
- your own headers, if you try making your assembly programming as clean
- as it is in the Linux kernel) are embedded in extern inline functions,
- GCC must be invoked with the -O flag (or -O2, -O3, etc), for these
- routines to be available. If not, your code may compile, but not link
- properly, since it will be looking for non-inlined extern functions in
- the libraries against which your program is being linked! Another
- way is to link against libraries that include fallback versions of the
- routines.
-
- Inline assembly can be disabled with -fno-asm, which will have the
- compiler die when using extended inline asm syntax, or else generate
- calls to an external function named asm() that the linker can't
- resolve. To counter such a flag, -fasm restores treatment of the asm
- keyword.
-
- More generally, good compile flags for GCC on the x86 platform are
-
-
- ______________________________________________________________________
- gcc -O2 -fomit-frame-pointer -W -Wall
- ______________________________________________________________________
-
-
-
- -O2 is a good optimization level in most cases. Optimizing beyond
- it takes longer, and yields code that is a lot larger, but only a bit
- faster; such overoptimization might be useful for tight loops only (if
- any), which you may be doing in assembly anyway. In cases when you
- need really strong compiler optimization for a few files, do consider
- using up to -O6.
-
- -fomit-frame-pointer allows generated code to skip the stupid frame
- pointer maintenance, which makes code smaller and faster, and frees a
- register for further optimizations. It precludes the easy use of
- debugging tools (gdb), but when you use these, you just don't care
- about size and speed anymore anyway.
-
- -W -Wall enables all warnings and helps you catch obvious stupid
- errors.
-
- You can add a CPU-specific flag such as -m486 so that GCC will
- produce code that is better adapted to your precise computer. Note that
- EGCS (and perhaps GCC 2.8) have -mpentium and such flags, whereas GCC
- 2.7.x and older versions do not. A good choice of CPU-specific flags
- should be in the Linux kernel. Check the texinfo documentation of
- your current GCC installation for more.
-
- -m386 will help optimize for size, hence also for speed on computers
- whose memory is tight and/or loaded, since big programs cause swap,
- which more than counters any "optimization" intended by the larger
- code. In such settings, it might be useful to stop using C, and use
- instead a language that favors code factorization, such as a
- functional language and/or FORTH, and use a bytecode- or wordcode-
- based implementation.
-
- Note that you can vary code generation flags from file to file, so
- that performance-critical files use maximal optimization, whereas
- other files are optimized for size.
-
- To optimize even more, option -mregparm=2 and/or the corresponding
- function attribute might help, but might pose lots of problems when
- linking to foreign code, including the libc. There are ways to
- correctly declare foreign functions so that the right call sequences are
- generated, or you might want to recompile the foreign libraries to use
- the same register-based calling convention...
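-
- As a rough sketch of the function-attribute route, assuming GCC
- targeting i386: the regparm attribute can be attached to individual
- declarations, so that only the functions you choose use register
- arguments. The macro REGPARM2 and the function scaled_add are
- hypothetical names for this illustration; on other targets the macro
- expands to nothing.

```c
/* regparm is an i386-only GCC attribute, hence the guard. */
#if defined(__GNUC__) && defined(__i386__)
#define REGPARM2 __attribute__((regparm(2)))
#else
#define REGPARM2
#endif

/* The declaration (as it would appear in a header) and the definition
 * must carry the same attribute, so that callers and callee agree on
 * where the arguments live. */
REGPARM2 int scaled_add(int a, int b, int c);

REGPARM2 int scaled_add(int a, int b, int c)
{
    /* With regparm(2) on i386, a and b arrive in registers and c on
     * the stack. */
    return a + b * c;
}
```

- Keeping the attribute in a shared header is one way to make sure the
- right call sequence is generated in every file that calls the function.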
-
- Note that you can make these flags the default by editing file
- /usr/lib/gcc-lib/i486-linux/2.7.2.3/specs or wherever that is on your
- system (better not add -Wall there, though). The exact location of
- the GCC specs files on your system can be found by asking gcc -v.
-
-
-
- 3.2. GAS
-
- GAS is the GNU Assembler, which GCC relies upon.
-
-
-
- 3.2.1. Where to find it
-
- Find it at the same place where you found GCC, in a package named
- binutils.
-
- 3.2.2. What is this AT&T syntax
-
- Because GAS was invented to support a 32-bit unix compiler, it uses
- standard ``AT&T'' syntax, which closely resembles the syntax for
- standard m68k assemblers, and is standard in the UNIX world. This
- syntax is no worse, no better than the ``Intel'' syntax. It's just
- different. When you get used to it, you find it much more regular
- than the Intel syntax, though a bit boring.
-
- Here are the major caveats about GAS syntax:
-
- · Register names are prefixed with %, so that registers are %eax, %dl
- and so on instead of just eax, dl, etc. This makes it possible to
- include external C symbols directly in assembly source, without any
- risk of confusion, or any need for ugly underscore prefixes.
-
- · The order of operands is source(s) first, and destination last, as
- opposed to the Intel convention of destination first and sources
- last. Hence, what in Intel syntax is mov ax,dx (move contents of
- register dx into register ax) will be in AT&T syntax mov %dx, %ax.
-
- · The operand length is specified as a suffix to the instruction
- name. The suffix is b for (8-bit) byte, w for (16-bit) word, and l
- for (32-bit) long. For instance, the correct syntax for the above
- instruction would have been movw %dx,%ax. However, gas does not
- require strict AT&T syntax, so the suffix is optional when length
- can be guessed from register operands, and else defaults to 32-bit
- (with a warning).
-
- · Immediate operands are marked with a $ prefix, as in addl $5,%eax
- (add immediate long value 5 to register %eax).
-
- · No prefix to an operand indicates it is a memory-address; hence
- movl $foo,%eax puts the address of variable foo in register %eax,
- but movl foo,%eax puts the contents of variable foo in register
- %eax.
-
- · Indexing or indirection is done by enclosing the index register or
- indirection memory cell address in parentheses, as in testb
- $0x80,17(%ebp) (test the high bit of the byte value at offset 17
- from the cell pointed to by %ebp).
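-
- The points above can be sketched with GCC inline assembly, which
- passes AT&T-syntax text on to GAS. The two functions below are
- hypothetical examples written for this illustration; they reuse the
- addl $5 and testb $0x80,17(...) forms from the list, with GCC choosing
- the registers, and fall back to plain C on non-x86 targets.

```c
/* Immediate operand: the $ prefix marks 5 as a constant, and the l
 * suffix makes the operation 32 bits wide. */
int add_five(int x)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ ("addl $5, %0" : "+r" (x) : : "cc");
    return x;
#else
    return x + 5;
#endif
}

/* Indirection with displacement: 17(%reg) is the byte at offset 17
 * from the address held in the register; source comes first. */
int high_bit_at_17(const unsigned char *base)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned char flag;

    __asm__ ("testb $0x80, 17(%1)\n\t"  /* AND the byte with 0x80, set flags */
             "setnz %0"                 /* flag = 1 if the high bit was set */
             : "=q" (flag)              /* a byte-addressable register */
             : "r" (base),
               "m" (base[17])           /* unused in the template: tells GCC
                                           that this byte is read */
             : "cc");
    return flag;
#else
    return (base[17] & 0x80) != 0;
#endif
}
```

- Compiling this file with -S shows the same instructions with the
- actual register names filled in, each prefixed with %.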
-
-
- A program exists to help you convert programs from TASM syntax to AT&T
- syntax. See
- <ftp://x2ftp.oulu.fi/pub/msdos/programming/convert/ta2asv08.zip>.
- (Since the original x2ftp site is closing, use a mirror site
- <ftp://ftp.lip6.fr/pub/pc/x2ftp/README.mirror_sites>). There also
- exists a program for the reverse conversion:
- <http://www.multimania.com/placr/a2i.html>.
-
-
- GAS has comprehensive documentation in TeXinfo format, which comes at
- least with the source distribution. Browse extracted .info pages with
- Emacs or whatever. There used to be a file named gas.doc or as.doc
- around the GAS source package, but it was merged into the TeXinfo
- docs. Of course, in case of doubt, the ultimate documentation is the
- sources themselves! A section that will particularly interest you is
- Machine Dependencies::i386-Dependent::
-
-
- Again, the sources for Linux (the OS kernel) serve as good
- examples; see under linux/arch/i386, the following files: kernel/*.S,
- boot/compressed/*.S, math-emu/*.S
-
-
- If you are writing some kind of language, thread package, etc., you
- might as well see how other languages (OCaml, gforth, etc), or thread
- packages (QuickThreads, MIT pthreads, LinuxThreads, etc), or whatever,
- do it.
-
- Finally, just compiling a C program to assembly might show you the
- syntax for the kind of instructions you want. See section ``Do you
- need Assembly?'' above.
-
-
-
-
- 3.2.3. Limited 16-bit mode
-
- GAS is a 32-bit assembler, meant to support a 32-bit compiler. It
- currently has only limited support for 16-bit mode, which consists of
- prepending the 32-bit prefixes to instructions, so you write 32-bit
- code that runs in 16-bit mode on a 32-bit CPU. In both modes, it
- supports 16-bit register usage, but what is unsupported is 16-bit
- addressing. Use the directives .code16 and .code32 to switch between
- modes. Note that an inline assembly statement asm(".code16\n") will
- allow GCC to produce 32-bit code that'll run in real mode!
-
- I've been told that most code needed to fully support 16-bit mode
- programming was added to GAS by Bryan Ford (please confirm?), but at
- least, it doesn't show up in any of the distributions I tried, up to
- binutils-2.8.1.x ... more info on this subject would be welcome.
-
- A cheap solution is to define macros (see below) that somehow produce
- the binary encoding (with .byte) for just the 16-bit mode instructions
- you need (almost nothing if you use code16 as above, and can safely
- assume the code will run on a 32-bit capable x86 CPU). To find the
- proper encoding, you can get inspiration from the sources of 16-bit
- capable assemblers.
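-
- A minimal sketch of the .byte technique, under a simplifying
- assumption: instead of a real 16-bit mode instruction, the
- hypothetical macro EMIT_NOP below emits the single byte 0x90, the
- encoding of nop, so that the example stays harmless and runnable. A
- real macro would list the bytes of the instruction you need, looked up
- as described above.

```c
/* Emit raw machine code through GAS's .byte directive.
 * 0x90 is the one-byte x86 encoding of nop. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define EMIT_NOP() __asm__ __volatile__ (".byte 0x90")
#else
#define EMIT_NOP() ((void) 0)   /* no-op on other targets */
#endif

/* Run the hand-encoded instruction and return the argument unchanged. */
int pass_through_nop(int x)
{
    EMIT_NOP();
    return x;
}
```

- Disassembling the object file (for instance with NDISASM or objdump)
- is a simple way to check that the emitted bytes decode to the
- instruction you intended.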
-
-
-
- 3.3. GASP
-
- GASP is the GAS Preprocessor. It adds macros and some nice syntax to
- GAS.
-
-
-
- 3.3.1. Where to find GASP
-
- GASP comes together with GAS in the GNU binutils archive.
-
-
-
- 3.3.2. How it works
-
- It works as a filter, much like cpp and the like. I have no idea about
- the details, but it comes with its own texinfo documentation, so just
- browse it (in .info), print it, grok it. GAS with GASP looks
- like a regular macro-assembler to me.
-
-
-
- 3.4. NASM
-
- The Netwide Assembler project is producing yet another i386 assembler,
- written in C, that should be modular enough to eventually support all
- known syntaxes and object formats.
-
-
- 3.4.1. Where to find NASM
-
- <http://www.cryogen.com/Nasm>
-
- Binary release on your usual metalab mirror in devel/lang/asm/. It
- should also be available as .rpm or .deb in your usual RedHat/Debian
- distributions' contrib.
-
-
- 3.4.2. What it does
-
- At the time this HOWTO is written, version 0.98 of NASM is just out.
-
- The syntax is Intel-style. Some macroprocessing support is
- integrated.
-
- Supported object file formats are bin, aout, coff, elf, as86, (DOS)
- obj, win32, (their own format) rdf.
-
- NASM can be used as a backend for the free LCC compiler (support files
- included).
-
-
- Surely NASM evolves too fast for this HOWTO to be kept up to date.
- Unless you're using BCC as a 16-bit compiler (which is out of scope of
- this 32-bit HOWTO), you should definitely use NASM instead of say AS86
- or MASM, because it is actively supported online, and runs on all
- platforms.
-
- Note: NASM also comes with a disassembler, NDISASM.
-
- Its hand-written parser makes it much faster than GAS, though of
- course, it doesn't support three bazillion different architectures.
- For the x86 target, it should be the assembler of choice...
-
-
-
- 3.5. AS86
-
- AS86 is an 80x86 assembler, both 16-bit and 32-bit, part of Bruce
- Evans' C Compiler (BCC). It has mostly Intel syntax, though it
- differs slightly in its addressing modes.
-
-
-
- 3.5.1. Where to get AS86
-
- A completely outdated version of AS86 is distributed by HJLu just to
- compile the Linux kernel, in a package named bin86 (current version
- 0.4), available in any Linux GCC repository. But I advise no one to
- use it for anything other than compiling Linux. This version supports
- only a hacked minix object file format, which is not supported by the
- GNU binutils or anything else, and it has a few bugs in 32-bit mode,
- so you really should keep it only for compiling Linux.
-
- The most recent versions by Bruce Evans (bde@zeta.org.au) are
- published together with the FreeBSD distribution. Well, they were: I
- could not find the sources from distribution 2.1 on :( Hence, I put
- the sources at my place:
- <http://www.tunes.org/~fare/files/asm/bcc-95.3.12.src.tgz>
-
- The Linux/8086 (aka ELKS) project is somehow maintaining bcc (though I
- don't think they included the 32-bit patches). See around
- <http://www.linux.org.uk/ELKS-Home/index.html> and
- <ftp://linux.mit.edu/pub/linux/ELKS/>. I haven't followed these
- developments, and would appreciate a reader contributing on this
- topic.
-
- Among other things, these more recent versions, unlike HJLu's,
- support Linux GNU a.out format, so you can link your code to Linux
- programs, and/or use the usual tools from the GNU binutils package to
- manipulate your data. This version can co-exist without any harm with
- the previous one (see the corresponding question below).
-
- BCC from 12 March 1995 and earlier versions have a misfeature that
- makes all segment pushing/popping 16-bit, which is quite annoying when
- programming in 32-bit mode. I wrote a patch at a time when the TUNES
- Project used as86:
- <http://www.tunes.org/~fare/files/asm/as86.bcc.patch.gz>. Bruce Evans
- accepted this patch, but since as far as I know he hasn't published a
- new release of bcc, the ones to ask about integrating it (if not done
- yet) are the ELKS developers.
-
-
-
- 3.5.2. How to invoke the assembler?
-
- Here's the GNU Makefile entry for using bcc to transform .s asm into
- both GNU a.out .o object and .l listing:
-
-
- ______________________________________________________________________
- %.o %.l: %.s
- bcc -3 -G -c -A-d -A-l -A$*.l -o $*.o $<
- ______________________________________________________________________
-
-
-
- Remove the %.l, -A-l, and -A$*.l, if you don't want any listing. If
- you want something other than GNU a.out, you can see the docs of bcc
- about the other supported formats, and/or use the objcopy utility from
- the GNU binutils package.
-
-
-
- 3.5.3. Where to find docs
-
- The docs are what is included in the bcc package. I salvaged the man
- pages that used to be available from the FreeBSD site at
- <http://www.tunes.org/~fare/files/asm/bcc-95.3.12.src.tgz>. Maybe
- ELKS developers know better. When in doubt, the sources themselves
- are often good docs: they are not very well commented, but the
- programming style is straightforward. You might try to see how as86
- is used in ELKS or Tunes 0.0.0.25...
-
-
-
- 3.5.4. What if I can't compile Linux anymore with this new version ?
-
- Linus is buried alive in mail, and since HJLu (official bin86
- maintainer) chose to write hacks around an obsolete version of as86
- instead of building clean code around the latest version, I don't
- think my patch for compiling Linux with a modern as86 has any chance
- to be accepted if resubmitted. Now, this shouldn't matter: just keep
- your as86 from the bin86 package in /usr/bin, and let bcc install the
- good as86 as /usr/local/libexec/i386/bcc/as where it should be. You
- never need to explicitly call this ``good'' as86, because bcc does
- everything right, including conversion to Linux a.out, when invoked
- with the right options; so assemble files exclusively with bcc as a
- frontend, not directly with as86.
-
-
- 3.6. OTHER ASSEMBLERS
-
- These are other, non-regular, options, in case the previous ones
- didn't satisfy you (why?). I don't recommend them in the usual (?)
- case, but they could prove quite useful if the assembler must be
- integrated in the software you're designing (i.e. an OS or development
- environment).
-
-
-
- 3.6.1. Win32Forth assembler
-
- Win32Forth is a free 32-bit ANS FORTH system that successfully runs
- under Win32s, Win95, Win/NT. It includes a free 32-bit assembler
- (either prefix or postfix syntax) integrated into the reflective FORTH
- language. Macro processing is done with the full power of the
- reflective language FORTH; however, the only supported input and
- output context is Win32For itself (no dumping of .obj file, but you
- could add that feature yourself, of course). Find it at
- <ftp://ftp.forth.org/pub/Forth/Compilers/native/windows/Win32For/>.
-
-
-
- 3.6.2. Terse
-
- Terse is a programming tool that provides THE most compact assembler
- syntax for the x86 family! See <http://www.terse.com>. However, it
- is not quite free software. It is said that there was a project for a
- free clone somewhere, that was abandoned after worthless claims
- that the syntax would be owned by the original author. Thus, if
- you're looking for a nifty programming project related to assembly
- hacking, I invite you to develop a terse-syntax frontend to NASM, if
- you like that syntax.
-
-
-
- 3.6.3. Non-free and/or Non-32bit x86 assemblers.
-
- You may find more about them, together with the basics of x86 assembly
- programming, in Raymond Moon's FAQ for comp.lang.asm.x86:
- <http://www2.dgsys.com/~raymoon/faq/asmfaq.zip>.
-
- Note that all DOS-based assemblers should work inside the Linux DOS
- Emulator, as well as other similar emulators, so that if you already
- own one, you can still use it inside a real OS. Recent DOS-based
- assemblers also support COFF and/or other object file formats that are
- supported by the GNU BFD library, so that you can use them together
- with your free 32-bit tools, perhaps using GNU objcopy (part of the
- binutils) as a conversion filter.
-
-
-
-
- 4. METAPROGRAMMING/MACROPROCESSING
-
- Assembly programming is a bore, except for critical parts of programs.
-
- You should use the appropriate tool for the right task, so don't
- choose assembly when it's not fit; C, OCAML, perl, Scheme, might be a
- better choice for most of your programming.
-
- However, there are cases when these tools do not give fine enough
- control over the machine, and assembly is useful or needed. In those
- cases, you'll appreciate a system of macroprocessing and
- metaprogramming that'll allow each recurring pattern to be factored
- into one indefinitely reusable definition, which allows safer
- programming, automatic propagation of pattern modification, etc. A
- ``plain'' assembler is often not enough, even when one is doing only
- small routines to link with C.
-
-
-
- 4.1. What's integrated into the above
-
-
- Yes I know this section does not contain much useful up-to-date
- information. Feel free to contribute what you discover the hard
- way...
-
-
-
- 4.1.1. GCC
-
- GCC allows (and requires) you to specify register constraints in your
- ``inline assembly'' code, so the optimizer always knows about it; thus,
- inline assembly code is really made of patterns, not necessarily exact
- code.
-
- Thus, you can put your assembly into CPP macros and inline C
- functions, so anyone can use it like any C function/macro. Inline
- functions resemble macros very much, but are sometimes cleaner to use.
- Beware that in all those cases, code will be duplicated, so only local
- labels (of 1: style) should be defined in that asm code. However, a
- macro would allow the name for a non-local defined label to be passed
- as a parameter (or else, you should use additional meta-programming
- methods). Also, note that propagating inline asm code will spread
- potential bugs in it; so watch out doubly for register constraints
- in such inline asm code.
-
- Lastly, the C language itself may be considered as a good abstraction
- to assembly programming, which relieves you from most of the trouble
- of assembling.
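
As an illustration of the points above, here is a minimal sketch of an inline function wrapping an asm pattern with a register constraint and a local label. The function names abs_branchy and manhattan are invented for this example; the asm is only used when GCC targets x86, with plain C as a fallback elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: absolute value written with a branch, only to
   show a local label inside inline asm. */
static inline int32_t abs_branchy(int32_t x)
{
    assert(x != INT32_MIN);        /* -INT32_MIN is not representable */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__("testl %0, %0\n\t"     /* set the sign flag from x */
            "jns 1f\n\t"           /* non-negative: skip the negation */
            "negl %0\n"
            "1:"
            : "+r"(x)              /* read and written, in any register */
            :
            : "cc");               /* the flags are modified */
#else
    if (x < 0)
        x = -x;
#endif
    return x;
}

/* Both calls may be inlined here, so the asm text appears twice in one
   function. The two copies of "1:" do not clash, because a local label
   may be defined many times; a named label would be defined twice. */
static inline int32_t manhattan(int32_t dx, int32_t dy)
{
    return abs_branchy(dx) + abs_branchy(dy);
}
```

The compiler picks the register for %0 at each expansion, which is what makes the asm a pattern rather than fixed code.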
-
-
-
- 4.1.2. GAS
-
- GAS has some macro capability included, as detailed in the texinfo
- docs. Moreover, while GCC recognizes .s files as raw assembly to send
- to GAS, it also recognizes .S files as files to pipe through CPP
- before feeding them to GAS. Again and again, see Linux sources for
- examples.
-
-
-
- 4.1.3. GASP
-
- It adds all the usual macroassembly tricks to GAS. See its texinfo
- docs.
-
-
-
- 4.1.4. NASM
-
- NASM has some macro support, too. See the corresponding docs. If you have
- some bright idea, you might wanna contact the authors, as they are
- actively developing it. Meanwhile, see about external filters below.
-
-
-
- 4.1.5. AS86
-
- It has some simple macro support, but I couldn't find docs. Now the
- sources are very straightforward, so if you're interested, you should
- understand them easily. If you need more than the basics, you should
- use an external filter (see below).
-
-
-
- 4.1.6. OTHER ASSEMBLERS
-
-
- · Win32FORTH: CODE and END-CODE are normal words that do not switch
- from interpretation mode to compilation mode, so you have access to
- the full power of FORTH while assembling.
-
- · TUNES: it doesn't work yet, but the Scheme language is a real high-
- level language that allows arbitrary meta-programming.
-
-
-
- 4.2. External Filters
-
- Whatever macro support your assembler has, and whatever
- language you use (even C!), if the language is not expressive enough
- for you, you can have files passed through an external filter with a
- Makefile rule like this:
-
-
- ______________________________________________________________________
- %.s: %.S other_dependencies
- $(FILTER) $(FILTER_OPTIONS) < $< > $@
- ______________________________________________________________________
-
-
-
-
-
- 4.2.1. CPP
-
- CPP is truly not very expressive, but it's enough for easy things,
- it's standard, and called transparently by GCC.
-
- As an example of its limitations, you can't declare objects so that
- destructors are automatically called at the end of the declaring
- block; you don't have diversions or scoping, etc.
-
- CPP comes with any C compiler. However, considering how mediocre it
- is, stay away from it if by chance you can make do without C.
-
-
-
- 4.2.2. M4
-
- M4 gives you the full power of macroprocessing, with a Turing
- equivalent language, recursion, regular expressions, etc. You can do
- with it everything that CPP cannot.
-
- See macro4th (this4th)
- <ftp://ftp.forth.org/pub/Forth/Compilers/native/unix/this4th.tar.gz>
- or the Tunes 0.0.0.25 sources
- <ftp://ftp.tunes.org/pub/tunes/obsolete/dist/tunes.0.0.0/tunes.0.0.0.25.src.zip>
- as examples of advanced macroprogramming using m4.
-
- However, its dysfunctional quoting and unquoting semantics force you
- to use explicit continuation-passing tail-recursive macro style if you
- want to do advanced macro programming (which is reminiscent of TeX --
- BTW, has anyone tried to use TeX as a macroprocessor for anything
- other than typesetting?). This is NOT worse than CPP, which does not
- allow quoting and recursion anyway.
-
- The right version of m4 to get is GNU m4 1.4 (or later if it exists),
- which has the most features and the fewest bugs or limitations of all.
- m4 is designed to be slow for anything but the simplest uses, which
- might still be ok for most assembly programming (you're not writing
- million-line assembly programs, are you?).
-
-
-
- 4.2.3. Macroprocessing with yer own filter
-
- You can write your own simple macro-expansion filter with the usual
- tools: perl, awk, sed, etc. That's quick to do, and you control
- everything. But of course, any power in macroprocessing must be
- earned the hard way.
-
-
-
- 4.2.4. Metaprogramming
-
- Instead of using an external filter that expands macros, one way to do
- things is to write programs that write part or all of other programs.
-
- For instance, you could use a program outputting source code
-
- · to generate sine/cosine/whatever lookup tables,
-
- · to extract a source-form representation of a binary file,
-
- · to compile your bitmaps into fast display routines,
-
- · to extract documentation, initialization/finalization code,
- description tables, as well as normal code from the same source
- files,
-
- · to have customized assembly code, generated from a
- perl/shell/scheme script that does arbitrary processing,
-
- · to propagate data defined at one point only into several cross-
- referencing tables and code chunks,
-
- · etc.
-
- Think about it!
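
As a small sketch of the first item in the list, here is a C program fragment that writes a lookup table as GAS source. A bit-count table stands in for the sine table so that the example needs no math library; the function names and the symbol bit_count_table are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Number of set bits in the low 8 bits of value. */
static unsigned bit_count(unsigned value)
{
    unsigned count = 0;
    for (value &= 0xffu; value != 0; value >>= 1)
        count += value & 1u;
    return count;
}

/* Format eight table entries, starting at index first, as one GAS
   ".byte" line. Returns the length written, as snprintf does. */
int format_table_line(char *buf, size_t size, unsigned first)
{
    assert(first % 8 == 0 && first < 256);
    return snprintf(buf, size, "\t.byte %u, %u, %u, %u, %u, %u, %u, %u\n",
                    bit_count(first),     bit_count(first + 1),
                    bit_count(first + 2), bit_count(first + 3),
                    bit_count(first + 4), bit_count(first + 5),
                    bit_count(first + 6), bit_count(first + 7));
}

/* Write the whole 256-entry table as assembly source. */
void emit_bit_count_table(FILE *out)
{
    char line[64];
    fputs("\t.data\n\t.globl bit_count_table\nbit_count_table:\n", out);
    for (unsigned first = 0; first < 256; first += 8) {
        format_table_line(line, sizeof line, first);
        fputs(line, out);
    }
}
```

Calling emit_bit_count_table(stdout) from a small main and redirecting the output to a .s file gives a source file that a Makefile rule can regenerate whenever the generator changes.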
-
-
-
- 4.2.4.1. Backends from compilers
-
- Compilers like GCC, SML/NJ, Objective CAML, MIT-Scheme, CMUCL, etc, do
- have their own generic assembler backend, which you might choose to
- use, if you intend to generate code semi-automatically from the
- corresponding languages, or from a language you hack: rather than write
- great assembly code, you may instead modify a compiler so that it
- dumps great assembly code!
-
-
-
- 4.2.4.2. The New-Jersey Machine-Code Toolkit
-
- There is a project, using the programming language Icon (with an
- experimental ML version), to build a basis for producing assembly-
- manipulating code. See around
- <http://www.cs.virginia.edu/~nr/toolkit/>
-
-
-
- 4.2.4.3. TUNES
-
-
- The TUNES Project <http://www.tunes.org/> for a Free Reflective
- Computing System is developing its own assembler as an extension to
- the Scheme language, as part of its development process. It doesn't
- run at all yet, though help is welcome.
-
- The assembler manipulates abstract syntax trees, so it could equally
- serve as the basis for an assembly syntax translator, a disassembler, a
- common assembler/compiler back-end, etc. Also, the full power of a
- real language, Scheme, makes it unchallenged as for
- macroprocessing/metaprogramming.
-
-
-
-
-
- 5. CALLING CONVENTIONS
-
-
-
-
- 5.1. Linux
-
-
-
- 5.1.1. Linking to GCC
-
- That's the preferred way. Check GCC docs and examples from Linux
- kernel .S files that go through gas (not those that go through as86).
-
- 32-bit arguments are pushed down stack in reverse syntactic order
- (hence accessed/popped in the right order), above the 32-bit near
- return address. %ebp, %esi, %edi, %ebx are callee-saved, other
- registers are caller-saved; %eax is to hold the result, or %edx:%eax
- for 64-bit results.
-
- FP stack: I'm not sure, but I think it's result in st(0), whole stack
- caller-saved.
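
The stack layout described above can be sketched as a tiny assembly function embedded in a C file. This is an illustration under assumptions: the function name add3 is invented, the assembly is only assembled when GCC targets i386 with ELF, and elsewhere an ordinary C definition takes its place.

```c
#include <assert.h>

/* Hypothetical example function; the prototype is all C callers see. */
int add3(int a, int b, int c);

#if defined(__GNUC__) && defined(__i386__) && defined(__ELF__)
/* On entry, (%esp) holds the return address and the arguments sit
   above it at 4(%esp), 8(%esp) and 12(%esp), first argument lowest. */
__asm__(
    ".pushsection .text\n"
    ".globl add3\n"
    ".type add3, @function\n"
    "add3:\n"
    "\tmovl 4(%esp), %eax\n"      /* a */
    "\taddl 8(%esp), %eax\n"      /* a + b */
    "\taddl 12(%esp), %eax\n"     /* a + b + c, returned in %eax */
    "\tret\n"
    ".size add3, .-add3\n"
    ".popsection\n"
);
#else
/* Anywhere else, let the compiler apply the platform's own convention. */
int add3(int a, int b, int c)
{
    return a + b + c;
}
#endif
```

The assembly version only touches %eax, which is caller-saved, so it has nothing to save or restore; a function using %ebx, %esi, %edi or %ebp would have to preserve them.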
-
- Note that GCC has options to modify the calling conventions by
- reserving registers, having arguments in registers, not assuming the
- FPU, etc. Check the i386 .info pages.
-
- Beware that you must then declare the cdecl or regparm(0) attribute
- for a function that will follow standard GCC calling conventions. See
- in the GCC info pages the section: C Extensions::Extended Asm::. See
- also how Linux defines its asmlinkage macro...
-
-
-
-
- 5.1.2. ELF vs a.out problems
-
- Some C compilers prepend an underscore before every symbol, while
- others do not.
-
- Particularly, Linux a.out GCC does such prepending, while Linux ELF
- GCC does not.
-
- If you need to cope with both behaviors at once, see how existing
- packages do it. For instance, get an old Linux source tree, the Elk,
- qthreads, or OCAML...
-
- You can also override the implicit C->asm renaming by inserting
- statements like
-
- ______________________________________________________________________
- void foo (void) asm("bar");
- ______________________________________________________________________
-
-
- to be sure that the C function foo will really be called bar in
- assembly.
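
A minimal sketch of such renaming, assuming GCC or a compatible compiler. The names increment, increment_alias and asm_increment are invented for this example; __asm__ is the spelling of the asm keyword that also works in strict ISO C modes.

```c
#include <assert.h>

/* The C name is increment; the assembly-level symbol is forced to be
   "asm_increment", used verbatim with no underscore added. */
int increment(int x) __asm__("asm_increment");

int increment(int x)
{
    return x + 1;
}

/* A second C name bound to the same assembly symbol. Calling it
   reaches the function defined above, which shows that the label, not
   the C identifier, decides the symbol name. */
int increment_alias(int x) __asm__("asm_increment");
```

Because the label is taken literally, the same declaration gives the same symbol name under both a.out and ELF, which is the point of the technique.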
-
- Note that the utility objcopy, from the binutils package, should allow
- you to transform your a.out objects into ELF objects, and perhaps the
- contrary too, in some cases. More generally, it will do lots of file
- format conversions.
-
-
-
-
- 5.1.3. Direct Linux syscalls
-
- This is specifically NOT recommended, because the conventions change
- from time to time or from kernel flavor to kernel flavor (cf L4Linux),
- plus it's not portable, it's a burden to write, it's redundant with
- the libc effort, AND it precludes fixes and extensions that are made
- to the libc, like, for instance the zlibc package, that does on-the-
- fly transparent decompression of gzip-compressed files. The standard,
- recommended way to call Linux system services is, and will stay, to go
- through the libc.
-
- Shared objects should keep your stuff small. And if you really want
- smaller binaries, do use #! stuff, with the interpreter having all the
- overhead you want to keep out of your binaries.
-
- Now, if for some reason, you don't want to link to the libc, go get
- the libc and understand how it works! After all, you're claiming to
- replace it, aren't you? You might also take a look at how my eforth
- 1.0c
- <ftp://ftp.forth.org/pub/Forth/Compilers/native/unix/Linux/linux-eforth-1.0c.tar.gz>
- does it.
-
- The sources for Linux come in handy, too, particularly the
- asm/unistd.h header file, that describes how to do system calls...
-
- Basically, you issue an int $0x80, with the __NR_syscallname number
- (from asm/unistd.h) in %eax, and parameters (up to five) in %ebx,
- %ecx, %edx, %esi, %edi respectively. Result is returned in %eax, with
- a negative result being an error whose opposite is what libc would put
- in errno. The user-stack is not touched, so you needn't have a valid
- one when doing a syscall.
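
The convention just described can be sketched in C with GCC inline assembly. This is an illustration, not how libc does it: the names raw_close and checked_close are invented, the int $0x80 path is only compiled for Linux on i386, and other platforms fall back to libc while reproducing the same return convention.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

#if defined(__GNUC__) && defined(__linux__) && defined(__i386__)
#include <sys/syscall.h>    /* pulls in __NR_close from asm/unistd.h */

/* close(2) without libc: syscall number in %eax, argument in %ebx. */
long raw_close(int fd)
{
    long result;
    __asm__ volatile("int $0x80"
                     : "=a"(result)
                     : "0"((long)__NR_close), "b"((long)fd)
                     : "memory");
    return result;          /* 0, or a negative errno value */
}
#else
/* Other platforms: go through libc and rebuild the raw convention. */
long raw_close(int fd)
{
    return close(fd) == 0 ? 0 : -(long)errno;
}
#endif

/* What a libc-style wrapper does with the raw result. */
int checked_close(int fd)
{
    long result = raw_close(fd);
    if (result < 0) {
        errno = (int)-result;   /* the opposite of the raw result */
        return -1;
    }
    return 0;
}
```

Closing an invalid descriptor such as -1 makes raw_close return -EBADF, while checked_close returns -1 with errno set to EBADF.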
-
-
-
- 5.1.4. I/O under Linux
-
- If you want to do direct I/O under Linux, either it's something very
- simple that doesn't need OS arbitration, and you should see the
- IO-Port-Programming mini-HOWTO; or it needs a kernel device driver,
- and you should try to learn more about kernel hacking, device driver
- development, kernel modules, etc, for which there are other excellent
- HOWTOs and documents from the LDP.
-
-
- Particularly, if what you want is Graphics programming, then do join
- the GGI project: <http://www.ggi-project.org/>
-
- Anyway, in all these cases, you'll be better off using GCC inline
- assembly with the macros from linux/asm/*.h than writing full assembly
- source files.
-
-
-
- 5.1.5. Accessing 16-bit drivers from Linux/i386
-
- Such a thing is theoretically possible (proof: see how DOSEMU
- <http://www.dosemu.org> can selectively grant hardware port access to
- programs), and I've heard rumors that someone somewhere did actually
- do it (in the PCI driver? Some VESA access stuff? ISA PnP? dunno). If
- you have some more precise information on that, you'll be most
- welcome. Anyway, good places to look for more information are the
- Linux kernel sources, DOSEMU sources (and other programs in the DOSEMU
- repository <ftp://tsx-11.mit.edu/pub/linux/ALPHA/dosemu/>), and
- sources for various low-level programs under Linux... (perhaps GGI if
- it supports VESA).
-
- Basically, you must either use 16-bit protected mode or vm86 mode.
-
- The first is simpler to set up, but only works with well-behaved code
- that won't do any kind of segment arithmetic or absolute segment
- addressing (particularly addressing segment 0), unless by chance it
- happens that all segments used can be set up in advance in the LDT.
-
- The latter allows for more "compatibility" with vanilla 16-bit
- environments, but requires more complicated handling.
-
- In both cases, before you can jump to 16-bit code, you must
-
- · mmap any absolute address used in the 16-bit code (such as ROM,
- video buffers, DMA targets, and memory-mapped I/O) from /dev/mem to
- your process' address space,
-
- · set up the LDT and/or vm86 mode monitor,
-
- · grab proper I/O permissions from the kernel (see the above section).
-
- Again, carefully read the source for the stuff contributed to the
- DOSEMU project, particularly these mini-emulators for running ELKS
- and/or simple .COM programs under Linux/i386.
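
The first step in the list above (mapping an absolute address range from /dev/mem) can be sketched as follows. The helper name map_device_region is invented for this example, the mapping is read-only, and the kernel picks the virtual address; 16-bit code that expects fixed addresses would need more care than is shown here.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map length bytes of device, starting at the
   given page-aligned offset, read-only. Returns NULL on any failure. */
void *map_device_region(const char *device, off_t offset, size_t length)
{
    assert(device != NULL && length > 0);

    int fd = open(device, O_RDONLY);
    if (fd < 0)
        return NULL;

    void *base = mmap(NULL, length, PROT_READ, MAP_SHARED, fd, offset);
    close(fd);      /* the mapping stays valid after the close */

    return base == MAP_FAILED ? NULL : base;
}
```

Calling it as map_device_region("/dev/mem", 0xC0000, 0x8000) would be one way to look at the area where a video ROM is conventionally found, assuming the process has whatever privileges the kernel requires for /dev/mem.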
-
-
-
- 5.2. DOS
-
- Most DOS extenders come with some interface to DOS services. Read
- their docs about that, but often, they just simulate int $0x21 and
- such, so you do ``as if'' you were in real mode (I doubt they have
- more than stubs and extend things to work with 32-bit operands; they
- most likely will just reflect the interrupt into the real-mode or vm86
- handler).
-
- Docs about DPMI and such (and much more) can be found on
- <ftp://x2ftp.oulu.fi/pub/msdos/programming/> (again, the original
- x2ftp site is closing, so use a mirror site
- <ftp://ftp.lip6.fr/pub/pc/x2ftp/README.mirror_sites>).
-
- DJGPP comes with its own (limited) glibc
- derivative/subset/replacement, too.
-
-
- It is possible to cross-compile from Linux to DOS; see the
- devel/msdos/ directory of your local FTP mirror for metalab.unc.edu.
- Also see the MOSS dos-extender from the Flux project
- <http://www.cs.utah.edu/projects/flux/> from the University of Utah.
-
- Other documents and FAQs are more DOS-centered. We do not recommend
- DOS development.
-
-
-
- 5.3. Winblows and suches
-
- Hey, this document covers only free software. Ring me when Winblows
- becomes free, or when there are free dev tools for it!
-
- Well, after all there are: Cygnus Solutions <http://www.cygnus.com>
- has developed the cygwin32.dll library, for GNU programs to run on
- MacroShit platforms. Thus, you can use GCC, GAS, all the GNU tools,
- and many other Unix applications. Have a look around their homepage.
- I (Faré) don't intend to expand on Losedoze programming, but I'm sure
- you can find lots of documents about it everywhere...
-
-
-
- 5.4. Yer very own OS
-
- Control being what attracts many programmers to assembly, want of OS
- development is often what leads to or stems from assembly hacking.
- Note that any system that allows self-development could be qualified
- an "OS" even though it might run "on top" of an underlying system that
- provides multitasking or I/O (much like Linux over Mach or OpenGenera
- over Unix), etc. Hence, for easier debugging purposes, you might like
- to develop your ``OS'' first as a process running on top of Linux
- (despite the slowness), then use the Flux OS kit
- <http://www.cs.utah.edu/projects/flux/oskit/> (which grants use of
- Linux and BSD drivers in yer own OS) to make it standalone. When your
- OS is stable, there is still time to write your own hardware drivers
- if you really love that.
-
- This HOWTO will not itself cover topics such as boot loader code and
- getting into 32-bit mode, handling interrupts, the basics about Intel
- ``protected mode'' or ``V86/R86'' braindeadness, or defining your
- object format and calling conventions. The main place to find reliable
- information about all of that is the source code of existing OSes and
- bootloaders. Lots of pointers lie in the following WWW page:
- <http://www.tunes.org/Review/OSes.html>
-
-
-
- 6. TODO & POINTERS
-
-
-
- · find someone who has got some time to take over the maintenance
-
- · fill incomplete sections
-
- · add more pointers to software and docs
-
- · add simple examples from real life to illustrate the syntax, power,
- and limitations of each proposed solution.
-
- · ask people to help with this HOWTO
-
- · perhaps give a few words for assembly on architectures other than
- i386?
-
- A few pointers (in addition to those already in the rest of the
- HOWTO):
-
- · 80x86 CPU family references: Intel manuals
- <http://www.intel.com/design/pentium/manuals/>; bugs
- <http://www.xs4all.nl/~feldmann/86bugs.htm>.
-
- · ftp.luth.se <ftp://ftp.luth.se/pub/msdos/> mirrors the former hornet
- and x2ftp archives of msdos assembly coding stuff.
-
- · A few starting points on the web about assembly programming: Jannes
- Faber's <http://www.fys.ruu.nl/~faber/Amain.html>; QZX's
- <http://www.qzx.com/library/>; JanW's <http://bewoner.dma.be/JanW>;
- this one (?) <ftp://zfja-gate.fuw.edu.pl/user/net/ka9q/guest/>
-
- · Fun stuff: CoreWars <http://www.koth.org>, a fun way to learn
- assembly in general.
-
- · USENET: comp.lang.asm.x86 <news://comp.lang.asm.x86>;
- alt.os.assembly <news://alt.os.assembly>.
-
- · And of course, do use your usual Internet Search Tools to look for
- more information, and tell me anything interesting you find!
-
-
- Author's .sig:
-
- ## Faré | VN: Đặng-Vũ Bân | Join the TUNES project! http://www.tunes.org/ ##
- ## FR: François-René Rideau | TUNES is a Useful, Not Expedient System ##
- ## Reflection&Cybernethics | Project for a Free Reflective Computing System ##
-
-