- Abstract:
- This brief document describes how to allocate and deallocate memory correctly,
- i.e. in a way compatible with the Os (and, as a result, compatible with PoolMem).
- ______________________________________________________________________________
- The following rules apply to all programs that are supposed to run in an
- Os-friendly way. I didn't make them up myself. What you find here is more or
- less a copy of the rules taken from the ROM Kernel Reference Manuals, the
- official Amiga developer documentation.
- Breaking these rules will result in unstable programs, with or without
- any additional memory tools. A program that seems to run fine without
- PoolMem, but crashes with PoolMem is, nevertheless, unstable and might crash
- in certain situations, even without this tool.
- Allocation of memory:
- o) The MEMF_PUBLIC bit:
- Set the MEMF_PUBLIC bit (exec/memory.h). You usually want it!
- NOT setting this bit results in memory that is
- a) *private* to your task, i.e. can't be read from any other task
- b) and can't be read safely within a Forbid()/Permit() or
- Disable()/Enable() pair.
- The current Os DOES NOT implement any checks for this rule, neither
- does PoolMem. However, future memory managers might see this bit as
- a hint to assign "virtual memory" to the allocation, i.e. memory that
- can be swapped out to disk.
- As an example, VMM requires correct usage of this bit.
- All data that is supposed to hold Os structures MUST BE ALLOCATED
- WITH THE MEMF_PUBLIC flag set, any memory that is passed to other
- tasks, interrupts, exceptions, I/O buffers MUST BE ALLOCATED WITH
- MEMF_PUBLIC.
- The only exception is private structures that are only read
- or written to by your task, that are never passed to nor read or
- written to by other processes or Os functions and that are not
- accessed with multitasking disabled.
- o) Memory flushes:
- Be prepared that a memory allocation might flush unused libraries,
- fonts and devices from memory. In particular, DO NOT USE CLOSED
- RESOURCES. Using a "FindName()" on the exec resource lists IS NOT
- ENOUGH to use a resource.
- If you do NOT want resources to be flushed, set the
- MEMF_NO_EXPUNGE flag as memory attribute. See exec/memory.h.
- o) Memory and custom chips:
- Memory that should be read by the Amiga custom chip set MUST BE
- ALLOCATED with the MEMF_CHIP attribute set or the custom chips
- won't be able to address this memory. That goes for:
- o) display buffers (native bitmaps)
- o) hardware copper lists (but not their gfx abstractions for the
- CMove(),CWait() etc... family)
- o) image bitmaps (struct Image->ImageData)
- o) floppy hardware buffers (but since V37 not required for the
- trackdisk.device I/O buffer)
- o) hardware audio buffers
- o) hardware sprites and image data of "Bobs"
- o) everything else the custom chip set might access
- o) Order of memory blocks:
- Do not make *any* assumptions about the order in which you get
- memory. The second allocation is not necessarily at the higher
- address!
- o) The MEMF_FAST bit:
- Do not use the MEMF_FAST bit unnecessarily if chip mem would be
- O.K. for you, too. The operating system is smart enough to
- allocate fast memory for you if that is available. It will fall
- back to chip mem if fast mem is not available. There's usually
- no reason to ask for fast mem explicitly.
- o) Alignment:
- It is guaranteed that all memory allocated by AllocMem() is aligned
- to two long word boundaries, i.e. the bits 0 to 2 of the address will
- always be zero. NOT MORE! If you need more alignment, see the kludge
- below.
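The alignment guarantee above can be checked with a simple mask test. The following is a minimal sketch in portable C, not Amiga-specific code: the helper name is made up for illustration, and an address is modelled as a plain integer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if bits 0 to 2 of the address are zero,
   i.e. the address lies on an eight byte (two long word) boundary. */
bool is_allocmem_aligned(uintptr_t address)
{
    return (address & 7u) == 0;
}
```

Anything stricter than this, e.g. a sixteen byte boundary, is not guaranteed by AllocMem() and has to be arranged by the caller.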
- o) Size of buffers:
- Make sure you allocate enough memory even for the worst case. A
- C style string needs n+1 bytes memory to hold a string of length n.
- Some Os functions require, due to bugs, a slightly larger buffer
- than you might think, check the "BUGS" section of the autodocs.
- (Mostly dos functions suffer from this bug, but some intuition
- functions require this as well).
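As a small illustration of the n+1 rule for C style strings, here is a sketch in portable C. The helper name is hypothetical; on the Amiga the returned size would be the value passed to AllocMem() and, later, the same value passed to FreeMem().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of bytes to request for a copy of a
   C string, including the terminating NUL byte. */
size_t string_buffer_size(const char *text)
{
    return strlen(text) + 1;
}
```

This covers only the terminator; any extra room an Os function needs because of a documented bug has to be added on top.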
- o) Memory attributes:
- Do NOT set ANY undocumented bits for the memory attributes of
- AllocMem(). They *might* be ignored in this version of the Os,
- but probably won't be in the next version. Check the exec/memory.h
- file for valid flags. As for the current (V40) version of the Os,
- the following flags have been defined:
- #define MEMF_ANY (0L) /* Any type of memory will do */
- #define MEMF_PUBLIC (1L<<0) /* Damn important, see caveats above !*/
- #define MEMF_CHIP (1L<<1) /* for custom chips */
- #define MEMF_FAST (1L<<2) /* explicitly fast mem, see caveats! */
- #define MEMF_LOCAL (1L<<8) /* Memory that does not go away at RESET */
- #define MEMF_24BITDMA (1L<<9) /* DMAable memory within 24 bits of address */
- #define MEMF_KICK (1L<<10) /* Memory that can be used for KickTags */
- #define MEMF_CLEAR (1L<<16) /* AllocMem: NULL out area before return */
- #define MEMF_LARGEST (1L<<17) /* AvailMem: return the largest chunk size */
- #define MEMF_REVERSE (1L<<18) /* AllocMem: allocate from the top down */
- #define MEMF_TOTAL (1L<<19) /* AvailMem: return total size of memory */
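One way to guard against undocumented bits is to mask the attributes against the documented ones before calling AllocMem(). The sketch below is portable C with the flag values copied from the list above; the mask name and the helper are hypothetical, and MEMF_LARGEST and MEMF_TOTAL are left out of the mask because the list marks them as AvailMem flags. MEMF_NO_EXPUNGE, mentioned earlier, is not in the list above; if you use it, add it with the value from your exec/memory.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as in the V40 list above, written as unsigned constants. */
#define MEMF_ANY      UINT32_C(0)
#define MEMF_PUBLIC   (UINT32_C(1) << 0)
#define MEMF_CHIP     (UINT32_C(1) << 1)
#define MEMF_FAST     (UINT32_C(1) << 2)
#define MEMF_LOCAL    (UINT32_C(1) << 8)
#define MEMF_24BITDMA (UINT32_C(1) << 9)
#define MEMF_KICK     (UINT32_C(1) << 10)
#define MEMF_CLEAR    (UINT32_C(1) << 16)
#define MEMF_LARGEST  (UINT32_C(1) << 17)
#define MEMF_REVERSE  (UINT32_C(1) << 18)
#define MEMF_TOTAL    (UINT32_C(1) << 19)

/* Hypothetical mask: every listed bit that makes sense for AllocMem(). */
#define ALLOCMEM_DOCUMENTED_BITS \
    (MEMF_PUBLIC | MEMF_CHIP | MEMF_FAST | MEMF_LOCAL | \
     MEMF_24BITDMA | MEMF_KICK | MEMF_CLEAR | MEMF_REVERSE)

/* Hypothetical helper: true if no bit outside the mask is set. */
bool attributes_are_documented(uint32_t attributes)
{
    return (attributes & ~ALLOCMEM_DOCUMENTED_BITS) == 0;
}
```

A debug build could call such a check in front of every allocation and complain loudly when it fails.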
- o) Memory contents:
- Do NOT MAKE any assumptions about the contents of the memory block
- unless you specified the MEMF_CLEAR attribute to erase the memory
- block. Not setting this bit is a bit faster, but results in a
- memory block with whatever contents you might dream of.
- o) Self modifying code:
- Self modifying code should be avoided.
- (What do you think this is? A C64? :-)
- If you absolutely MUST play with this and can't get around it,
- use the following Os call to flush the CPU caches once you've
- placed your code in memory and need to run it:
- CacheClearU()
- Do NOT expect the code to be visible to the CPU BEFORE you have called
- this routine. This is even more important for routines like interrupts
- that are called asynchronously.
- o) Failures:
- Be prepared that your memory request might fail. An explicit
- check is REQUIRED after an AllocMem() call. Just "going guru" in
- this case *IS NOT ENOUGH*. Print a warning message, abort your
- program safely, CHECK YOUR CODE!
- Assembly language authors: NO, IT'S NOT DOCUMENTED THAT AllocMem()
- SETS THE CONDITION CODES. Test the result explicitly.
- If your calling task is indeed a process, OS versions V37 and
- above guarantee to set the result code for IoErr() to
- ERROR_NO_FREE_STORE if the allocation fails.
- o) Memory flushers:
- The following is a safe memory flush:
- AllocMem(0x7ffffff0,MEMF_PUBLIC);
- (The flush used by the "avail flush" command).
- o) AllocMem() and context switches:
- Neither AllocMem() nor FreeMem() breaks a "Forbid" state. This is
- important because copying is the only way to "print" a list that is
- access protected via Forbid() thru the dos.library or other
- functions that may wait.
- The following code sequence is legal for this purpose, and should
- stay legal:
- - call Forbid() first,
- - make a copy of that list element by element, using AllocMem()
- - call Permit().
- - print the copy of the list
- - deallocate the copy.
- Running into a Wait(), as a semaphore used for access protection of
- the memory list would cause, would be fatal here.
- o) AllocMem,FreeMem,AllocAbs and interrupts:
- NONE of these functions can be called from interrupts or in the
- supervisor mode.
- Remember, however, that "input handlers" of the input.device
- are not run as interrupts but in the context of the input.device
- task, even though they are built on top of an interrupt structure.
- Thus, calling AllocMem() here to make a copy of an input event IS
- LEGAL.
- _____________________________________________________________________________
- Usage of stack for storing: (Or, how to allocate memory without allocating it)
- It's a somewhat vague point whether the stack can be used for storing
- system/Os structures or for passing structures to other tasks. The following
- paragraph is my own interpretation of this technique and should be used with
- some care:
- The CURRENT Os implementation allows this technique. The stack is allocated
- with the MEMF_PUBLIC flag set, AND MUST BE ALLOCATED THIS WAY. This is simply
- due to the fact that the memory for the stack is allocated by the task that
- creates a new process, and not by the new process itself. Since the AmigaOs
- doesn't know the unix fork() style of creating new processes, this is the
- only way of allocating the stack for the new process anyways. Thus, the stack
- is kept in memory that is passed across task boundaries and must be,
- therefore, public. Thus, it can be used for storing Os structures and for
- passing data across processes. It's furthermore common practice to use
- the stack to pass "taglists" to Os functions that might be read by a
- different process, and even to keep complete Os structures on the stack, as
- done by some CBM shell commands, routines in the dos.library and others.
- (However, see the note below about how strictly CBM/AI read their own
- design rules!)
- HOWEVER, Ralph Babel writes in "The Amiga Guru Book" (2nd ed., 1993):
- "The stack is private memory ... and should not be considered MEMF_CHIP
- or MEMF_PUBLIC, nor should it be used for storing system structures or code.
- The latter is important, since there is no guarantee that the stack is
- aligned to an even address, as these processors also allow nonbyte data
- accesses from any base address, although opcodes must still be word aligned."
- I do not agree with Ralph on this point, except that the stack is indeed
- usually not MEMF_CHIP and shouldn't be considered to be. Storing code on
- the stack is truly considered "higher magic" and should be avoided.
- (Also see above for caveats IF YOU ABSOLUTELY HAVE to do this.)
- However, I would suppose that stack memory is always MEMF_PUBLIC for
- reasons stated above, and it's always word aligned since the MC68K keeps
- track of this itself unless you really attempt to screw the stack up.
- Normal usage of stack does not break this alignment as even a
- move.b d0,-(a7)
- instruction will decrement the stack pointer BY TWO BYTES, NOT BY ONE.
- This is one of the lesser known features of the MC68K series, indeed, and
- goes for all processors, from the MC68000 to the MC68060.
- Citing Motorola's "Programmer's Reference Manual" M68000PM/AD Rev.1,
- Page 2-28:
- "To keep data on the system stack aligned for maximum efficiency, the active
- stack pointer is automatically decremented or incremented by two for all
- byte-size operands moved to or from the stack."
- You should, however, still remember that you must align stack memory
- to four byte boundaries by hand. The following code snippet shows how to
- reserve 256 bytes of stack aligned to a longword boundary:
- lea -$104(a7),a7 ;reserve 256 bytes plus 2+2 for alignment.
- ;we use the extra two bytes to keep
- ;a possible long word alignment of the
- ;stack and to avoid speed penalties for
- ;the more advanced processors. If you write
- ;your own routines, you should always allocate
- ;stack memory this way since the Os always
- ;generates tasks with the stack pointer
- ;aligned to four byte boundaries.
- move.l a7,d0
- addq.l #2,d0 ;round up
- and.b #$fc,d0 ;to next four byte boundary
- move.l d0,a0 ;pass pointer in a0
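The arithmetic of the last three instructions can be modelled in portable C to see why four spare bytes are enough. This is only a sketch: the function name is hypothetical and the stack pointer is treated as a plain 32 bit number that is already word aligned, as argued above.

```c
#include <stdint.h>

/* Model of:  move.l a7,d0 / addq.l #2,d0 / and.b #$fc,d0
   Clearing bits 0 and 1 of the low byte is the same as clearing
   bits 0 and 1 of the whole address. */
uint32_t align_stack_buffer(uint32_t stack_pointer)
{
    return (stack_pointer + 2u) & ~UINT32_C(3);
}
```

For a stack pointer that is already long word aligned the result is the stack pointer itself; for one that is only word aligned the result is two bytes higher. In both cases 256 bytes starting at the result still lie inside the $104 bytes reserved by the lea instruction.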
- However, most C compilers are not smart enough for this technique. Even
- AI fell into that pitfall when writing the "List" and "Dir" commands. Both
- don't align DOS structures to long words correctly. (Urgh!) But since a
- similar code sequence is used successfully by the "DoPkt()" routine
- inside the dos library, I would still say that using stack for Os structures
- is legal and continues to stay legal. Allocating each tiny structure with
- AllocMem() instead would create a huge overhead and would fragment memory
- a lot.
- However, as I said, this is a somewhat vague point, you don't have to
- agree with me and I'm open to discussion.
- _____________________________________________________________________________
- o) Size of deallocation:
- Do NOT
- - round the size yourself, because the rounding algorithm of the
- operating system might change in future to support special hardware
- (e.g. PowerPC cache lines which are 32 bytes wide)
- Now to another rule that hasn't been formulated in the RKRMs: Do NOT
- - free a partial memory block, i.e. parts of an array.
- Freeing part of a memory block requires knowledge of
- the alignment rules of the Os and may break code if these rules
- change in future versions.
- I would therefore strongly recommend NOT to use this technique.
- o) Access to deallocated memory:
- Do not touch deallocated memory. If it's gone it's gone and you're
- no longer allowed to use it, address it, read it or write data to
- it. Another task might want it.
- A tiny exception that hasn't been formulated in the RKRMs, but is
- unfortunately widely used:
- Deallocation of memory WITHIN a Forbid()/Permit() pair. The memory
- is guaranteed to stay unmodified and ready for use as long as
- the multitasking is disabled. Running into a Wait(), directly or
- indirectly, will break the Forbid() state and will therefore make
- the memory unusable.
- Be warned! Even though this access is sort of legal, hence
- tolerated by MungWall, MemSniff, PoolMem and others, it's IMHO
- still ugly and therefore highly discouraged. One of the very few
- exceptions where this feature might be helpful is the following
- code segment that unloads the segment of a load- and stay resident
- program:
- move.l SysBase(a4),a6
- jsr _LVOForbid(a6)
- move.l DOSBase(a4),a6
- move.l Segment(a4),d1
- jsr _LVOUnloadSeg(a6) ;Unload own code
- move.l a6,a1 ;THIS CODE STAYS LEGAL because
- move.l SysBase(a4),a6 ;of the Forbid()
- jsr _LVOCloseLibrary(a6) ;close dos
- moveq #0,d0
- rts ;exit.
- Note that you must be absolutely sure that the segment
- is not an overlaid segment because UnloadSeg() WILL break the
- Forbid() state in this case. However, overlays don't work for
- load- and stay-resident programs anyway.
- o) Chip memory and blitter access.
- The custom "blitter logic" uses DMA and accesses the chip memory
- independent of the CPU. If you use a temporary buffer for the blitter,
- make sure the blitter no longer accesses this buffer before you
- deallocate it. To be on the safe side, call WaitBlit() before
- deallocating memory that has been used as blitter buffer.
- o) Memory and hardware DMA access.
- Modern hard disk interfaces might access memory by DMA, parallel
- to the CPU. If you're planning to use this hardware DMA directly
- because you're writing a device driver for this hardware, be
- prepared to flush the CPU caches properly. In particular, call
- CachePreDMA(...)
- prior to the DMA operation and
- CachePostDMA(...)
- afterwards.
- Check the autodocs for details about these functions and their
- parameters.
- o) Return value:
- FreeMem() DOES NOT return any useful value, nor does it set any
- condition codes.
- ______________________________________________________________________________
- AllocAbs and other weirdos:
- AllocAbs is for specialized usage of allocating memory from a predefined
- address.
- o) Range of allocated memory:
- AllocAbs performs some rounding. Be prepared that the memory block
- you get is not identical to the memory block you requested.
- However, IF the memory allocation could be satisfied, the requested
- memory block is guaranteed to be contained in the returned memory
- block.
- Be prepared that the memory request cannot be satisfied because
- the requested memory is already in use by a different task.
- AllocAbs() returns NULL in this case. You have to check for this
- explicitly! It does NOT set any condition codes.
- AllocAbs() WILL NOT set the ERROR_NO_FREE_STORE return code for
- IoErr().
- o) Contents of allocated memory:
- Do not make any assumptions about the contents of the
- allocated memory block. The OS uses parts of the free memory blocks
- for administrative purposes and might have trashed parts of the
- memory block.
- That means especially for reset resident programs - whose memory is
- allocated this way by the exec KickMemPtr mechanism - that the
- first eight bytes will be trashed. Be prepared for that feature!
- o) Deallocation of AllocAbs()-ed memory:
- To be sure that the allocated memory is really deallocated completely,
- call FreeMem with the memory address and size you REQUESTED, NOT
- with the return value of AllocAbs(). This might sound strange indeed,
- but the FreeMem() logic performs the same rounding of size and address
- as the AllocAbs() logic. If, however, you pass in a different address,
- as the return value instead of the requested address, it is not
- guaranteed that really all memory is deallocated.
- A tiny example might be helpful (assuming the current rounding
- algorithm):
- AllocAbs(0x07,0x300007);
- allocates 16 bytes and returns 0x300000. Calling now
- FreeMem(0x300000,0x07);
- will only free EIGHT bytes starting from 0x300000 instead of
- 16 bytes. However,
- FreeMem(0x300007,0x07);
- will work as required.
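The numbers in this example can be reproduced with a small model in portable C. It is a sketch under the assumption stated above, namely that both AllocAbs() and FreeMem() round the start address down and the end address up to a multiple of eight bytes; the type and function names are hypothetical and nothing here calls the Os.

```c
#include <stdint.h>

#define BLOCK_SIZE 8u   /* assumed rounding granularity in bytes */

typedef struct {
    uint32_t start;     /* first byte actually affected */
    uint32_t size;      /* number of bytes actually affected */
} Range;

/* Model of the rounding: start goes down, end goes up. */
Range rounded_range(uint32_t address, uint32_t byte_size)
{
    Range range;
    uint32_t end = (address + byte_size + BLOCK_SIZE - 1u) & ~(BLOCK_SIZE - 1u);

    range.start = address & ~(BLOCK_SIZE - 1u);
    range.size  = end - range.start;
    return range;
}
```

With this model, a request for 7 bytes at 0x300007 covers 16 bytes from 0x300000, while 7 bytes at 0x300000 cover only 8 bytes, which is why the requested address and size have to be passed to FreeMem().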
- I'm sorry to say that the kludge documented in the last revision
- of this file failed for the same reason; this has been fixed.
- o) Using AllocAbs() for aligned memory allocation:
- The following code segment is a kludge for allocating memory aligned
- to a boundary:
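- #include <string.h> /* memset() */
- #include <exec/types.h>
- #include <exec/memory.h>
- #include <clib/exec_protos.h>
- void *AllocAligned(ULONG bytesize,ULONG attributes,ULONG alignment)
- {
- UBYTE *mem,*res;
- alignment--; /* turn the power of two into a bit mask */
- /* over-allocate so that an aligned block surely fits */
- if ((mem=AllocMem(bytesize+alignment,attributes & (~MEMF_CLEAR)))) {
- Forbid(); /* nobody may grab the block before AllocAbs() */
- FreeMem(mem,bytesize+alignment);
- mem = (UBYTE *)(((ULONG)mem + alignment)&(~alignment)); /* round up */
- res = AllocAbs(bytesize,mem);
- Permit();
- if (res) {
- if (attributes & MEMF_CLEAR)
- memset(mem,0,bytesize);
- } else mem = NULL;
- }
- return mem;
- }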
- I.e., call this routine with "alignment" set to 16 for an alignment
- to a sixteen byte boundary.
- Calling this routine with anything but a power of two for the
- alignment doesn't make much sense and is illegal.
- Note that the memory is cleared MANUALLY if MEMF_CLEAR is set.
- This MUST be done since AllocAbs() does not guarantee the
- contents of the memory, even if the former AllocMem() already
- cleared the memory.
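The address arithmetic behind this kludge can be tried out separately from the Os calls. The following portable C sketch uses hypothetical helper names and models addresses as 32 bit numbers; it shows the power of two check, the rounding step and the size of the temporary over-allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True for 1, 2, 4, 8, ...; false for 0 and everything else. */
bool is_power_of_two(uint32_t value)
{
    return value != 0 && (value & (value - 1u)) == 0;
}

/* Round an address up to the next multiple of a power of two. */
uint32_t align_up(uint32_t address, uint32_t alignment)
{
    uint32_t mask = alignment - 1u;

    assert(is_power_of_two(alignment));
    return (address + mask) & ~mask;
}

/* Size of the temporary block that is allocated and freed again. */
uint32_t padded_size(uint32_t bytesize, uint32_t alignment)
{
    return bytesize + (alignment - 1u);
}
```

Since rounding up moves the address by at most alignment-1 bytes, the aligned block of "bytesize" bytes always lies inside the temporary block, which is what makes the following AllocAbs() likely to succeed.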
- _____________________________________________________________________________
- Any program that obeys these rules won't have *any* problems with PoolMem!
- _____________________________________________________________________________
- Debugging tools (memory related):
- The following three debugging tools are "official" AI tools and should be used
- by any serious developer:
- -Enforcer: Detects memory accesses to the vector base and to unmapped
- memory regions.
- -MungWall: Detects a lot of illegal accesses as in the list above,
- such as failing to initialize memory properly, accessing
- deallocated memory and others. However, it *could* do more.
- -SegTracker: Keeps program names together with their loaded segments
- for easy identification of code.
- Even a program that runs without problems with these tools is not
- necessarily bug free!
- The use of the following debugging tools is highly recommended:
- -PatchWork: (by Richard Körber)
- Detects invalid parameters to Os calls.
- I would also recommend the following combination: Since this is my own
- stuff, I can't be very objective. You might want to check them out....
- - COP: (my own stuff)
- Catches gurus and exceptions "on line" for straightforward
- debugging.
- - MemSniff: Even pickier than MungWall. It detects software failures
- and memory problems MungWall can't find. However, check
- the documentation as this tool has its special "caveats".
- It should be used in conjunction with COP since it doesn't
- generate output as complete as that of MungWall or Enforcer.
- - SaferPatches: Detects illegal function patches. If this one crashes with
- a guru, something is wrong. For details, check the doc
- of the SaferPatches archive.
- _____________________________________________________________________________
- Another set of weirdos for the "enlightened". (-:
- The following is a list of "OS features" you should be aware of if you
- consider writing your own memory tool. I found them when writing PoolMem,
- so they are here for your information. However, DO NOT USE THESE TECHNIQUES
- in your own code.
- Even though the above rules have been set up for the developer, that doesn't
- mean that the Os respects these rules ("Quod licet Iovi non licet bovi.").
- I found the following "OS features":
- - The FFS (all versions V37 thru V43) expects a return value of "-1" for
- FreeMem(). This has been fixed for release 43.20.
- PoolMem contains a kludge for that. MemSniff and MungWall will mess up
- the registers on purpose. The result code for the FreeMem() can be set
- with FREEMEMRESULT. The default value is -1 to fix the earlier FFS releases.
- Other programs, for example the "RexxPlus" compiler, require different
- result codes to work properly; the default value "-1" conflicts with a bug
- in the compiled code. Setting the result code to "-2" might help in these
- cases.
- - The layers.library allocates memory in large blocks, but deallocates this
- large block of memory in a series of small deallocations. In other words,
- it breaks up large memory blocks in smaller ones.
- The current version of PoolMem respects this behaviour, not only for the
- layers.library. MungWall and MemSniff include special kludges to allow this
- EXCLUSIVELY for the layers.library. However, THIS HAS BEEN ILLEGAL, IS
- ILLEGAL AND WILL CONTINUE TO BE ILLEGAL. I hope that this mess will be
- cleaned up in a future Os revision.
- - Some programs expect the Z (zero) flag of the CPU after an AllocMem()
- call to be set on failure and to be cleared if the allocation worked.
- The current version of PoolMem contains a kludge to make these programs
- work.
- - Some Os functions allocate Os structures from the stack and pass them
- to other tasks. (The DoPkt() routine is one example, but there are others).
- I would still say that this is O.K., but if you don't want to follow me in
- this point, it's an Os bug.
- _____________________________________________________________________________
- Thomas Richter, November 1998