AnsiString Primer
This is a concise background primer for those unfamiliar with the new dynamic long string type introduced with 32-bit Delphi, otherwise known as AnsiString.
Structure
Under the hood, an AnsiString is fundamentally a pointer to a dynamic memory block. Implicit pointer de-referencing, provided by the compiler, tends to obscure this. In response to an AnsiString declaration, only the pointer itself is allocated, and it starts out as nil. Memory storage for the actual string data (referenced by the pointer) is dynamically allocated on assignment, together with a small header that records the string length. In contrast, the older Pascal strings were, by default, allocated a static 256-byte memory block. The first byte of the block held the effective length; therefore, maximum effective string length was limited to 255 bytes.
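To make the difference in storage concrete, here is a minimal sketch that compares the size of the two kinds of string variable and checks when a memory block actually exists. The program and variable names are illustrative only, and the `{$H+}` directive is an assumption that matters only if you compile with Free Pascal rather than Delphi.

```pascal
program StringLayout;
{$H+} // assumption: enables long strings under Free Pascal (Delphi's default)

var
  LongS: AnsiString;
  ShortS: ShortString;
begin
  // The variable itself is only a pointer, whatever the string's length.
  WriteLn('SizeOf(AnsiString)  = ', SizeOf(LongS));
  WriteLn('SizeOf(Pointer)     = ', SizeOf(Pointer));

  // An older Pascal string reserves its whole block up front.
  WriteLn('SizeOf(ShortString) = ', SizeOf(ShortS));  // 256

  // An empty AnsiString owns no memory block: the pointer is nil.
  WriteLn('Empty is nil: ', Pointer(LongS) = nil);

  // Giving the string a length makes the memory manager allocate a block.
  SetLength(LongS, 1000);
  WriteLn('Allocated:    ', Pointer(LongS) <> nil);

  // The variable is still just a pointer; only the block it refers to grew.
  WriteLn('SizeOf after  = ', SizeOf(LongS));
end.
```

The typecast `Pointer(LongS)` is used here only to peek at the variable; ordinary code never needs it.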
Power and Performance
The dynamic memory used by AnsiStrings is transparently managed (allocated, re-allocated and freed as required) by the Delphi memory manager. As a result, an AnsiString can be assigned any length within the limits of available memory. A string that holds 2 bytes can be almost effortlessly expanded to hold 2 million bytes if needed. As you can see, this is much more powerful than the older string implementation. However, with power comes responsibility. Memory manager abuse is tempting and can adversely affect performance. Consider these simple examples:
Poor:
    S2 := '';
    for I := 2 to Length(S1) do S2 := S2 + S1[I];
Better:
    SetLength(S2, Length(S1) - 1);
    for I := 2 to Length(S1) do S2[I-1] := S1[I];
The same result is achieved in each case but the second is potentially much more efficient. In the first case, string S2 is being built incrementally, byte by byte. When the string outgrows its memory allocation, a whirlwind of behind-the-scenes activity is triggered. A new, larger memory block is allocated, existing string data is copied to the new block and finally the old block is freed. This is all handled transparently but it still takes time. Depending upon Length(S1), memory re-allocation may be triggered dozens of times from within the loop. In the second example, memory is only allocated once, prior to the loop.
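For readers who want to try the two approaches side by side, the sketch below wraps each one in a function. The function names `TailIncremental` and `TailPreallocated` are hypothetical, chosen for this illustration, and the guard for strings shorter than two characters is an addition that the snippets above do not include.

```pascal
program TailCopy;
{$H+} // assumption: enables long strings under Free Pascal (Delphi's default)

// "Poor" version: S2 grows one byte at a time and may be re-allocated often.
function TailIncremental(const S1: AnsiString): AnsiString;
var
  I: Integer;
  S2: AnsiString;
begin
  S2 := '';
  for I := 2 to Length(S1) do
    S2 := S2 + S1[I];
  TailIncremental := S2;
end;

// "Better" version: one allocation up front, then plain byte assignments.
function TailPreallocated(const S1: AnsiString): AnsiString;
var
  I: Integer;
  S2: AnsiString;
begin
  S2 := '';
  if Length(S1) > 1 then  // added guard: nothing to copy otherwise
  begin
    SetLength(S2, Length(S1) - 1);
    for I := 2 to Length(S1) do
      S2[I - 1] := S1[I];
  end;
  TailPreallocated := S2;
end;

begin
  WriteLn(TailIncremental('xHello'));   // Hello
  WriteLn(TailPreallocated('xHello'));  // Hello
end.
```

Both functions return the same string; any speed difference depends on the string length and on the memory manager in use, so it is worth timing with your own data.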
Complaints regarding performance with AnsiString can usually be traced to memory manager abuse and overuse. As shown above, such abuse can be avoided with some simple changes in coding style. Aside from their more powerful, dynamic nature, AnsiStrings are just strings: a linear sequence of bytes in memory, devoid of any inherent performance penalties.
Compatibility
Windows is largely written in C. As a result, the WinAPI functions expect C-style, null-terminated strings. For compatibility, AnsiStrings are also null terminated. Outside the API, this terminating null is not normally accessible and thus cannot and does not serve as an indicator of string length.
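The terminating null can be observed by typecasting the string to a C-style pointer, which is the same cast used when handing an AnsiString to an API function. The sketch below is illustrative only: `CLength` is a hypothetical helper, written here to mimic the way a C routine scans for the terminator, and is not part of any library.

```pascal
program NullTerminator;
{$H+} // assumption: enables long strings under Free Pascal (Delphi's default)

// Hypothetical helper: counts bytes up to the first null, as C code would.
function CLength(P: PAnsiChar): Integer;
var
  N: Integer;
begin
  N := 0;
  if P <> nil then
    while P[N] <> #0 do
      Inc(N);
  CLength := N;
end;

var
  S: AnsiString;
begin
  S := 'Hello';
  WriteLn(Length(S));                     // 5, the null is not counted
  WriteLn(CLength(PAnsiChar(S)));         // 5, found by scanning for the null
  WriteLn(PAnsiChar(S)[Length(S)] = #0);  // TRUE, the null sits just past the end
end.
```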
If a null doesn’t do it, what does determine the length of an AnsiString? Instead of relying on a terminator, the effective string length is stored in a small header kept in the dynamic memory block, just ahead of the string data. This requires a very small amount of storage overhead but the benefits are well worth it. String length is readily accessible using the Length() function. In C, a time-consuming scan is required to determine the length. Length can be set implicitly by assignment or explicitly using SetLength(). As was shown above, more efficient memory management can often be achieved by explicitly pre-setting string length.
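As a small illustration of implicit versus explicit length, the following sketch grows a two-byte string to two million bytes and then shrinks it again. The helper `Resized` is a hypothetical name introduced only to keep the example compact.

```pascal
program LengthDemo;
{$H+} // assumption: enables long strings under Free Pascal (Delphi's default)

// Hypothetical helper: returns a copy of S with its length set to NewLen.
function Resized(S: AnsiString; NewLen: Integer): AnsiString;
begin
  SetLength(S, NewLen);  // S is a by-value parameter, so the caller is unaffected
  Resized := S;
end;

var
  S: AnsiString;
begin
  S := 'Hi';               // length set implicitly by assignment
  WriteLn(Length(S));      // 2

  S := Resized(S, 2000000);  // length set explicitly
  WriteLn(Length(S));        // 2000000
  WriteLn(S[1], S[2]);       // Hi, existing contents are preserved

  S := Resized(S, 1);      // shrinking truncates
  WriteLn(S);              // H
end.
```

Note that bytes added by SetLength() are not initialized; fill them before reading them.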
Versatility
Within an AnsiString, a character is a character. No single character has any special significance over any other, not even a null. Therefore, an AnsiString is in effect a managed, dynamic buffer capable of holding not only text but binary data as well. As such, AnsiStrings can be applied to a wide range of problems beyond the reach of other strings. For example, a dynamically allocated buffer (GetMem(), GlobalAlloc(), etc.) with pointer addressing and mandatory cleanup (FreeMem(), GlobalFree(), etc.) can often be replaced by a safer, more convenient AnsiString buffer. To illustrate, HyperString features an unusual implementation of dynamic numeric arrays using AnsiStrings as container structures.
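The sketch below shows the general idea of using an AnsiString as a binary buffer: first a string with an embedded null, then a block of integers copied in and read back with Move(). The helpers `PackInts` and `IntAt` are hypothetical names invented for this example; this is not HyperString's API and makes no claim about how HyperString implements its arrays.

```pascal
program BinaryBuffer;
{$H+} // assumption: enables long strings under Free Pascal (Delphi's default)

// Hypothetical helper: copies the raw bytes of Values into a string buffer.
function PackInts(const Values: array of LongInt): AnsiString;
var
  Buf: AnsiString;
begin
  Buf := '';
  if High(Values) >= 0 then
  begin
    SetLength(Buf, (High(Values) + 1) * SizeOf(LongInt));
    Move(Values[0], Buf[1], Length(Buf));
  end;
  PackInts := Buf;
end;

// Hypothetical helper: reads the zero-based Index-th integer back out.
// No range checking is done; Index must be valid for the buffer.
function IntAt(const Buf: AnsiString; Index: Integer): LongInt;
var
  V: LongInt;
begin
  Move(Buf[1 + Index * SizeOf(LongInt)], V, SizeOf(V));
  IntAt := V;
end;

var
  Raw, Nums: AnsiString;
begin
  Raw := 'ab' + #0 + 'cd';   // the embedded null is just another byte
  WriteLn(Length(Raw));      // 5

  Nums := PackInts([10, -20, 30]);
  WriteLn(Length(Nums));     // 12, three 4-byte integers
  WriteLn(IntAt(Nums, 1));   // -20
  // No FreeMem-style cleanup: the buffers are released automatically.
end.
```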
Summary
AnsiStrings are a new, more powerful string type available for the first time with Delphi32. With proper coding, these new strings are just as efficient as the older Pascal strings, yet much more powerful and versatile, and they provide compatibility with the C strings used by the WinAPI.
With power, versatility and compatibility, why use anything but AnsiStrings?