Update v2.0.1: FileFlex International WorldFlex Functions

Understanding Character-Level Sort Order
Custom Character Sort Orders
Creating a Single-Byte Custom Sort Order Table
Creating Single-Byte Sort Order Utility Scripts
Understanding Double-Byte Sort Order Tables
Creating Double-Byte Sort Order Tables
Tricks with Sort Order
Setting the Sort Order with FileFlex
Character Translation
Creating Character Translation Utility Scripts
Translating Characters Using FileFlex
Case Translation
Creating Case Translation Utility Scripts
Intelligent Case Conversion Using FileFlex
Standalone Intelligent Case Conversion Function

FileFlex is used within multimedia productions throughout the world. While standard ASCII is prevalent, it is certainly not ubiquitous. When dealing with international languages, it's necessary to account for differences in character sorting order, for differences in case conversion, for differences in character values, and for double-byte characters.

FileFlex' new WorldFlex technology now gives you the ability to build international flexibility into your applications with unprecedented power. FileFlex' WorldFlex technology gives you true dynamic localization. Unlike virtually all other so-called "world-aware" implementations, you're not forced to rely on a particular operating system revision or a country-nationalized version of an application. FileFlex allows you to define your own international conversion tables and apply them on-the-fly to any data management task. This dynamic localization functionality allows you to switch languages, character sets, sort orders, and conversions instantly, at any time while your multimedia production is running, with virtually no impact upon FileFlex' already blazing performance.

FileFlex' WorldFlex technology falls into three broad categories:

Dynamic character-level sort order: FileFlex allows you to use indexes and queries that dynamically switch between sort-order tables. Finally, an accented "a" character is treated like a regular "a", rather than something from Mars. Sort orders can be specified for either single-byte or double-byte languages.
Character translation: As many FileFlex users have discovered, the special diacritical characters have different values between Macintosh and Windows, and even between DOS and Windows. FileFlex allows you to convert characters so that all the diacritical marks (and any other conversions you may need) are all in the right places and your characters look just right.
Case conversion: Normal case conversion routines apply a simple heuristic to determine the upper case value of a character. Converting an "a" to an "A" is simply a matter of subtracting 32. But what about converting a "u" with an umlaut to an upper case value? What about converting vowels with accents to their equivalent upper case characters? FileFlex provides two standalone functions that allow you to use custom case conversion tables so that your case conversions make sense in your language. FileFlex' intrinsic index and query functions also take custom case conversion tables into account, so your data can be case insensitive when desired (as opposed to case insane).

Before we proceed with details of these functions, we'd like to thank our customers throughout the world for working with us to understand the individual needs of different languages and customs and how those needs apply to the authoring of multimedia productions worldwide.

Understanding Character-Level Sort Order

Note: The character-level sorting features in FileFlex require that you have a measurable amount of programming expertise. These features let you modify the very core of FileFlex data management and require both care to use and experience to understand. If you're not a pretty advanced scripter or programmer, you may want to find an experienced "buddy" to team up with before attempting to utilize these powerful capabilities.

FileFlex uses index files to sort information. When you create an index file, you're choosing a field that will determine the sort order of the database. For example, you might choose to sort on zipcode (a numeric code in the US that helps the post office tell where to deliver mail--in other countries this is often called the postal code), meaning that records containing 08553 in the zipcode field will be earlier in the database than records with 94404 in the zipcode field. Likewise, if you chose to organize your data based on last name, then "Clinton" would come before "Kennedy".

When you switch indexes, FileFlex doesn't reorder the entire database of records. Instead it adopts a different sort order based on the data in the fields. FileFlex creates the order of information in an index file when DBCreateIndex is called. It maintains and updates that order of information as part of the process of writing a record.

When FileFlex updates an index file, it's comparing the values in two different records. When it looks at "Clinton" and "Kennedy", it looks at the first characters (i.e., "C" and "K") and determines that "C" comes before "K" and therefore "Clinton" comes before "Kennedy".

This comparison of "C" vs. "K" is based on the standard ordered table we call ASCII (American Standard Code for Information Interchange). When FileFlex compares "C" against "K", it's really getting the ASCII value of "C" (67 decimal) and comparing it to the ASCII value of "K" (75 decimal). Since 67 comes before 75, then "C" comes before "K".

Note: Character sorting is case sensitive. A lower case "c" is ASCII 99 while an upper case "C" is ASCII 67. If you were to compare "clinton" (note the lower case "c") against "Kennedy", "Kennedy" would come first because the ASCII value of "K" (ASCII 75) is less than that of lower case "c".

So, when FileFlex looks at "CLINTON" and "KENNEDY", it's really looking at the comparative weights (or priorities) of the individual characters, according to their representation in ASCII. Here are the two strings and their corresponding values:

         C  L  I  N  T  O  N
         67 76 73 78 84 79 78
         |  |  |  |  |  |  |
         75 69 78 78 69 68 89
         K  E  N  N  E  D  Y
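
The comparison described above can be sketched in Lingo. The handler below is a hypothetical illustration (compareByCharCode is not a FileFlex function); it only mimics a plain ASCII comparison so you can see why one string lands before another.

```lingo
-- Hypothetical helper, not part of FileFlex.
-- Returns -1 if a sorts first, 1 if b sorts first, 0 if they are equal.
on compareByCharCode a, b
  put length(a) into lenA
  put length(b) into lenB
  put lenA into shortest
  if lenB < shortest then put lenB into shortest
  repeat with i = 1 to shortest
    put charToNum(char i of a) into codeA
    put charToNum(char i of b) into codeB
    if codeA < codeB then return -1
    if codeA > codeB then return 1
  end repeat
  -- every shared character matched, so the shorter string sorts first
  if lenA < lenB then return -1
  if lenA > lenB then return 1
  return 0
end compareByCharCode
```

With this sketch, comparing "CLINTON" against "KENNEDY" returns -1 because 67 is less than 75, while comparing "clinton" against "Kennedy" returns 1 because the lower case "c" is 99.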

Custom Character Sort Orders

FileFlex' new WorldFlex technology allows you to customize the character-level sort order used by the FileFlex indexing routines. There are two primary reasons you might want to do this:

To sort in descending rather than ascending order
To sort according to sorting rules different than ASCII, in particular for languages other than English.

In fact, a very important part of WorldFlex technology is the ability to change the sort order of your characters, and thereby sort your database according to the sorting rules you feel are currently appropriate.

Many so-called "internationalized", "localized", or "world-aware" systems do provide support for character sorting order for multi-country use. But they are usually available only when you're running the localized version of the operating system or database application. While many of our friends outside the US are grateful for any mechanism that recognizes their native language, FileFlex doesn't stop there. FileFlex' new WorldFlex technology is vastly more powerful. FileFlex allows you to change your sorting order on-the-fly, as you switch index files. Nothing else can do this!

Here's an example of where this is so powerful: Imagine you're a multi-national firm with customers throughout the world. When you do a query to list your customers in the US, the ASCII sort order is just fine. But when you do a query to list customers in Japan, you want the customers' names sorted by the appropriate sorting conventions for the Japanese language and character sets--not according to the rather provincial expectations of ASCII. With FileFlex, you can switch from an ASCII index to an index ordered according to Japanese sort order absolutely instantly.

Creating a Single-Byte Custom Sort Order Table

Character sort orders are controlled by a custom sort order table. For applications and languages that use single-byte characters (typically, "roman" languages), each character can be represented by a single byte. Since a byte is 8 bits wide, this allows for 256 characters.

You create a sort order table in your host development environment's programming language (our examples will be in Director's Lingo). We do this by building a table containing three bytes of data for each character in the sort order:

Leader Flag Byte: For single-byte languages, this byte is always set to 255.
Priority Multiplier Byte: For single-byte languages, this byte is also always set to 255.
Priority Value Byte: This value signifies the priority of the character in the list (never use 0).

At the end of all of the three-byte sets, a single byte containing the value 0 is used to terminate the table.

Before we look in more detail at the Priority Value Byte, let's first look at how ASCII prioritizes its characters:

    A  B  C  D  E     ...  V  W  X  Y  Z
    65 66 67 68 69    ...  86 87 88 89 90

Since "A" is an ASCII 65, it's got a lower value than "D", which is an ASCII 68. The numbers 65 and 68 correspond to the priority value of the various letters. Likewise, in FileFlex' custom sort order tables, the lower the priority value, the earlier in the sort the character will be placed. If we wanted to sort in reverse order ("Z" before "A"), we could assign different priority values, giving "Z" a much lower number than "A", as in the following list:

    Z  Y  X  W  V     ...  E  D  C  B  A
    65 66 67 68 69    ...  86 87 88 89 90

With the priorities shown above, if we looked up a "D", we'd see its value was 87. Since an "A" has a priority value of 90, the "D" would come earlier in the list. If we used this set of priority values, "KENNEDY" would certainly appear before "CLINTON".

It's important to remember that the priority value is entirely up to you. If you wanted all words beginning with vowels (A, E, I, O, and U) to come at the beginning of the list, you might create the following table of priority values:

    A  E  I  O  U  B  C  D  F  G  H  J  K  L
    65 66 67 68 69 70 71 72 73 74 75 76 77 78 ...

FileFlex determines where in the sort order table to find a priority value based on the character's actual computer-code value (usually ASCII). So, since "A" has the ASCII code value of 65, FileFlex will look at entry position 65 in the sort order table (counting from position 0) to retrieve the priority value. Let's make this a bit clearer by constructing a partial sort order table for traditional ASCII (note, we're showing all three data bytes as described above and all numbers are in base-10):

  Entry Pos   65           66           67           68
  US Char     "A"          "B"          "C"          "D"           
  Data Bytes  255 255 065  255 255 066  255 255 067  255 255 068

  Entry Pos   69           70           71           72
  US Char     "E"          "F"          "G"          "H"           
  Data Bytes  255 255 069  255 255 070  255 255 071  255 255 072

  Entry Pos   73           74           75           76
  US Char     "I"          "J"          "K"          "L"           
  Data Bytes  255 255 073  255 255 074  255 255 075  255 255 076

So, to create a FileFlex sort order table that matches traditional ASCII in ascending order, you'd want "A" to have a sort order priority of 65, so the third data byte at position 65 would be the value 65.

Now let's look at how the table would change if we wanted to sort everything in reverse order (note that we've reversed the entire ASCII character set):

  Entry Pos   65           66           67           68
  US Char     "A"          "B"          "C"          "D"           
  Data Bytes  255 255 190  255 255 189  255 255 188  255 255 187

  Entry Pos   69           70           71           72
  US Char     "E"          "F"          "G"          "H"           
  Data Bytes  255 255 186  255 255 185  255 255 184  255 255 183

  Entry Pos   73           74           75           76
  US Char     "I"          "J"          "K"          "L"           
  Data Bytes  255 255 182  255 255 181  255 255 180  255 255 179

Using the above table, when FileFlex encounters the character "A", which has the ASCII value of 65, it looks at the entry at position 65 in the table. It then retrieves the priority value, which is 190. If FileFlex then looks for "C" (the entry at position 67 in the table), it retrieves the priority value of 188. Since 188 is less than 190, FileFlex will put "C" before "A".
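
To make the lookup concrete, here is a small Lingo sketch. The handler name sortPriority is hypothetical (it is not a FileFlex function), and it assumes the three-byte entries are packed back to back starting with entry 0, which is how the builder scripts in this section lay them out.

```lingo
-- Hypothetical helper, not part of FileFlex.
-- Returns the priority value byte stored for a single-byte character.
on sortPriority theTable, theChar
  put charToNum(theChar) into charCode   -- e.g. 65 for "A"
  -- entry 0 fills chars 1 to 3 of the Lingo string, entry 1 fills
  -- chars 4 to 6, and so on; the priority value is the third byte
  put (charCode * 3) + 3 into bytePos
  return charToNum(char bytePos of theTable)
end sortPriority
```

Under that assumption, looking up "A" in the reversed table above would return 190 and looking up "C" would return 188.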

Creating Single-Byte Sort Order Utility Scripts

The best way to create the sort order table is to write a simple utility script. Here's an example script that builds a sort order table matching standard ASCII order:

  on buildSortOrder_ASCII
    global ASCII
    put "" into theTable
    repeat with i = 0 to 255
      put numToChar(255) after theTable -- leader flag: no leader char
      put numToChar(255) after theTable -- priority multiplier: none
      if i = 0 then
        put numToChar(255) after theTable -- use 255 in entry 0 (never use 0)
      else
        put numToChar(i) after theTable -- priority value
      end if
    end repeat
    put numToChar(0) after theTable -- terminator byte code
    put theTable into ASCII
  end buildSortOrder_ASCII

Note the name of the handler is "BuildSortOrder_ASCII". We've developed a convention where the routine that builds the sort order is called "BuildSortOrder_" and the name of the sort order itself is appended to the end. The sort order table is placed in a global variable of the same name. So, for a sort order for French Canadian, we recommend naming the handler "BuildSortOrder_FrenchCanadian" and the global variable containing the sort order "FrenchCanadian".

Note that the routine above places the actual byte value into the string by using numToChar(x). This places a single byte value corresponding to the number in the string location. Each set of data bytes in the table gets two bytes of 255 (the leader flag and the priority multiplier), followed by the byte corresponding to the priority value. Finally, after all the data byte sets are added to the string, BuildSortOrder_ASCII appends a terminator byte (value 0).

Here's an example routine that reverses the ASCII sort order, placing the table in the global ASCIIReverse:

  on buildSortOrder_ASCIIReverse
    global ASCIIReverse
    put "" into theTable
    repeat with i = 0 to 255
      put numToChar(255) after theTable -- leader flag: no leader char
      put numToChar(255) after theTable -- priority multiplier: none
      put 255 - i into priority -- entry 0 gets 255, "A" (65) gets 190
      -- never use 0 as a priority value; entry 255 shares priority 1
      -- with entry 254
      if priority < 1 then put 1 into priority
      put numToChar(priority) after theTable -- priority value
    end repeat
    put numToChar(0) after theTable -- terminator byte code
    put theTable into ASCIIReverse
  end buildSortOrder_ASCIIReverse

WARNING: Make absolutely certain you end each sort order table with a numToChar(0) terminator byte. Failure to do this could cause FileFlex to scan beyond the end of the sort order table. The results could be unpredictable, and your program could terminate abnormally.
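
One way to guard against a malformed table is a quick sanity check before handing it to FileFlex. The handler below is a hypothetical sketch (checkSortOrderTable is not a FileFlex function) for single-byte tables only. It assumes Lingo's length() counts the trailing zero byte, which is worth confirming in your version of Director.

```lingo
-- Hypothetical helper, not part of FileFlex.
-- Returns 1 if theTable looks like a well-formed single-byte
-- sort order table, otherwise 0.
on checkSortOrderTable theTable
  put 256 * 3 into dataBytes
  -- 256 three-byte entries plus one terminator byte
  if length(theTable) <> dataBytes + 1 then return 0
  -- no data byte may be 0, since 0 is reserved for the terminator
  repeat with i = 1 to dataBytes
    if charToNum(char i of theTable) = 0 then return 0
  end repeat
  -- the final byte must be the terminator
  if charToNum(char (dataBytes + 1) of theTable) <> 0 then return 0
  return 1
end checkSortOrderTable
```

You might call this right after a BuildSortOrder_ routine during development and alert yourself if it returns 0.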

Understanding Double-Byte Sort Order Tables

If the language you're sorting uses double-byte characters (like certain Japanese and Chinese character sets), you'll need to create double-byte sort order tables. Double-byte character sets are different because they use two bytes for many characters. The computer distinguishes between a standard single-byte character and a double-byte character by the existence of a leader byte. This leader byte tells the computer that the byte that follows the leader byte is to be treated as a special character, rather than simply part of the standard ASCII table.

FileFlex sort order tables are not limited to 256 entries. Instead, they can hold anywhere from 256 entries to 65,280 entries (255 * 256), where each entry is one three-byte set. Each group of 256 entries in the sort order table is called a "sort order page" and the maximum number of sort order pages allowed by FileFlex is 255.

If you recall from earlier, each character value is represented in the sort order table by three bytes: a leader flag byte, a priority multiplier byte, and a priority value byte. Also, if you recall, the leader flag byte for single-byte sort order tables was always 255. That told FileFlex to look in the very first page of the sort order table (i.e., the very first group of 256 entries) for the character's priority value.

When you're using double-byte character sets, you'll need more than one 256-byte page to represent the sort order. The value that's placed in the leader character tells FileFlex in which sort order page to look for the priority value of the character which follows the leader character. Let's diagram that out:

Suppose that your language character set uses characters with the value of 128 as a leader character. Now, let's suppose your database has a double-byte character with the values 128 and 065 respectively for the two bytes. Here's how the sort order table might be defined:

  Sort order page 0
  ---------------------------
  Position #128:  001 255 255

  Sort order page 1
  ---------------------------
  Position #65:   255 255 015

When reading the character stream, FileFlex would read the first byte and determine its value was 128. It would then go to position 128 in the sort order table and read the first byte of that entry. Since the first byte (the leader byte flag) is not a 255, it would know that 128 was a leader byte. Since the leader byte flag is 1, FileFlex would know that the next character retrieved should be compared against sort order page 1 (located in the second bank of 256 entries).

FileFlex would now read the second byte of the character. Since it knows that this byte is the second half of a double-byte character, FileFlex will then determine the byte's value (in this case 65) and jump 65 entries into the second sort order page (or to entry 321...256+65...of the full sort order table). At entry 321 of the table (position 65 in the second page) FileFlex would look at the priority value byte and determine that the priority of the character represented by 128 065 is 15.
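
The two-step lookup can be sketched in Lingo as follows. This is only an illustration of the walk-through above, not FileFlex' internal code: the handler name doubleBytePriority is hypothetical, and it assumes every entry occupies three consecutive bytes with page 0 stored first.

```lingo
-- Hypothetical helper, not part of FileFlex.
-- firstByte and secondByte are byte values from 0 to 255.
on doubleBytePriority theTable, firstByte, secondByte
  -- the leader flag is the first byte of the page 0 entry for firstByte
  put charToNum(char (firstByte * 3 + 1) of theTable) into pageNum
  if pageNum = 255 then
    -- not a leader byte: firstByte is an ordinary single-byte character
    return charToNum(char (firstByte * 3 + 3) of theTable)
  end if
  -- entry number within the whole table, e.g. (1 * 256) + 65 = 321
  put (pageNum * 256) + secondByte into entryNum
  -- the priority value is the third byte of that entry
  return charToNum(char (entryNum * 3 + 3) of theTable)
end doubleBytePriority
```

For the table diagrammed above, calling this with the byte values 128 and 65 would follow the leader flag to page 1 and return 15.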

Creating Double-Byte Sort Order Tables

You create a double-byte sort order table very much like you would a single-byte table. You create sets of three-byte sequences for each character. For each sort order page, you create 256 of these three-byte sets. At the very end, you place a single byte with the value 0 that signifies the termination of the table.

You should probably lay out the sort order tables on paper before you attempt to write the code to generate a table.

First, you should determine those byte values that are leader bytes. For every unique leader byte value, assign a sort order page, from page 1 to 254. Obviously, you want to keep the total number of sort order pages down as much as possible to make things run faster and to use less memory. In the triplet for each leader byte, put the assigned page number in the first byte and make sure you've set the following two bytes to 255.

Next, fill in all the other remaining values in the first sort order page. For each character, assign a weighted value and place that in the third byte of the data triplet.

Note: you can use the second byte of the data triplet as a priority multiplier. If you need priorities higher than 255, use the priority multiplier byte by setting it to anything from 1 (earliest in the priority order) to 254 (last in the priority order).

After you've filled in the first sort order page, you can then create the subsequent pages. In these pages, the first byte of the triplet will always be 255, the second byte between 1 and 254 depending on your desired priority multiplier, and the third value byte also between 1 and 254.

Finally, append a terminator byte--which needs to be a numToChar(0) value.

Once you've laid all this out on paper, you can write a BuildSortOrder_ routine that will create a global variable containing your sort order.
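
As a rough starting point, here is a sketch of such a routine for the tiny example diagrammed earlier: one leader byte (128) that points to page 1, where the second byte 065 gets priority 15. The handler and global name DoubleByteDemo are hypothetical, and the placeholder priorities for the untouched entries are our assumption. A real table would fill in every character your language uses.

```lingo
on buildSortOrder_DoubleByteDemo
  global DoubleByteDemo
  put "" into theTable
  -- page 0: ordinary single-byte entries, plus one leader byte at 128
  repeat with i = 0 to 255
    if i = 128 then
      put numToChar(1) after theTable   -- leader flag: continue in page 1
      put numToChar(255) after theTable
      put numToChar(255) after theTable
    else
      put numToChar(255) after theTable -- not a leader byte
      put numToChar(255) after theTable -- no priority multiplier
      if i = 0 then
        put numToChar(255) after theTable -- never use 0 as a priority
      else
        put numToChar(i) after theTable   -- ASCII order
      end if
    end if
  end repeat
  -- page 1: second bytes of characters that begin with 128
  repeat with i = 0 to 255
    put numToChar(255) after theTable
    put numToChar(255) after theTable
    if i = 65 then
      put numToChar(15) after theTable  -- 128 065 gets priority 15
    else
      put numToChar(255) after theTable -- placeholder (assumed): sorts last
    end if
  end repeat
  put numToChar(0) after theTable -- terminator byte
  put theTable into DoubleByteDemo
end buildSortOrder_DoubleByteDemo
```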

Tricks with Sort Order

You can do some pretty interesting things with sort orders besides handling international issues. For example, let's assume you wanted to sort numerical data which you stored in a character field.

Note: You can generally do this because the DBF format stores numbers as ASCII values internally. And if you use character fields to store numbers, you get to manipulate values with more control (i.e., sort order).

So, again, let's assume you've got a character field containing numeric data. Sometimes, in a numeric field, you might want to have spaces or asterisks instead of zeros, like in the following example:

      "0002598"      "   2598"      "***2598"

When creating a custom sort order table for numerical sorts in character fields, you can give the space character (ASCII 32), the asterisk character (ASCII 42), and the zero all the same priority value weighting. This would cause the sorting/seeking routines to treat all three characters the same.
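
Here is a sketch of a builder for such a table. The name NumericPad is hypothetical. The routine follows the same pattern as the ASCII builder, except that the space (32) and asterisk (42) entries are given the priority value of the zero character (ASCII 48).

```lingo
on buildSortOrder_NumericPad
  global NumericPad
  put "" into theTable
  repeat with i = 0 to 255
    put numToChar(255) after theTable -- not a leader byte
    put numToChar(255) after theTable -- no priority multiplier
    if i = 0 then
      put numToChar(255) after theTable -- never use 0 as a priority
    else
      if i = 32 or i = 42 then
        put numToChar(48) after theTable -- space and "*" weigh the same as "0"
      else
        put numToChar(i) after theTable -- everything else stays in ASCII order
      end if
    end if
  end repeat
  put numToChar(0) after theTable -- terminator byte
  put theTable into NumericPad
end buildSortOrder_NumericPad
```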

This kind of "equalizing" of sorting values also applies to those special international characters, like letters with umlauts (e.g., the double-dots) or accent marks over characters. You might want to treat a lower case 'a' and a lower-case 'a' with an accent mark as the same character in sort order.

You can also do this with upper and lower case values. If you want upper case and lower case letters to be sorted together, give them the same priority value.
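
A case-insensitive table can also be made by patching an existing table rather than building one from scratch. The sketch below starts from the ASCII table built by buildSortOrder_ASCII and overwrites the priority byte of each lower case letter with the value of its upper case partner. The name ASCIINoCase is hypothetical, and the byte offset assumes three-byte entries packed from entry 0.

```lingo
on buildSortOrder_ASCIINoCase
  global ASCII, ASCIINoCase
  buildSortOrder_ASCII
  put ASCII into theTable
  repeat with i = 97 to 122
    -- the priority byte of entry i is char (i * 3) + 3 of the string
    put numToChar(i - 32) into char (i * 3 + 3) of theTable
  end repeat
  put theTable into ASCIINoCase
end buildSortOrder_ASCIINoCase
```

The same patching approach could be used to give an accented letter the priority of its unaccented form, using whatever character values your platform assigns.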

Setting the Sort Order with FileFlex

You can tell FileFlex to use a new sort order with the FileFlex command DBSetSortOrder. Unlike most FileFlex commands, DBSetSortOrder is a wrapper script that does not call FileFlex directly. Instead, DBSetSortOrder sets two FileFlex global properties: gDBWorldSort and gDBSortOrder.

Note: I almost named the gDBSortOrder variable gDBWorldOrder. Then the function would have been DBSetWorldOrder. But that seemed far too Republican, so I restrained myself. Wouldn't it be great if you could write a new translation table, give a quick call to DBSetWorldOrder, and--poof--a new world order emerges? It gives new (and terrifying) meaning to the phrase "FileFlex users rule!" [chuckle] [[shiver]].

Here's the Lingo code for DBSetSortOrder:

  on DBSetSortOrder order
    global gDBWorldSort
    global gDBSortOrder
    if order = EMPTY then
      put EMPTY into gDBWorldSort
    else
      put "1" into gDBWorldSort
      put order into gDBSortOrder
    end if
    return 0
  end DBSetSortOrder

When you call DBSetSortOrder, you want to pass your sort order table. Here's an example:

  put DBSetSortOrder(ASCII) into DBResult

To disable custom sort order processing, set the sort order to the empty string:

  put DBSetSortOrder("") into DBResult

Inside of FileFlex is a C++ function called worldCompare(). When a DBCreateIndex or DBSeek command is executed, at some point the internal worldCompare routine is called upon to compare two strings. When worldCompare is called, it asks the host development environment (i.e., Director) for the value of the reserved global variable gDBWorldSort. If worldCompare discovers that gDBWorldSort is not empty, it then asks the host environment for the contents of the global variable gDBSortOrder and uses that to control the comparison of two strings.

Hint: One of the reasons building a sort order table is so complex and precise is you're building an actual binary data structure that FileFlex can use directly. While the table may be a bit painful to design once, this mechanism allows FileFlex to do custom comparisons and switch sort order tables at blinding speed.

To turn off a sort order table, send the empty string to DBSetSortOrder. When this happens, the global gDBWorldSort is set to the empty string. FileFlex then knows to skip the extra processing inherent in comparing world-aware data strings.

Cautions: The sort order impacts the internal compare functions; it does not reorder the dataset or the index. As a result, you should set your sort order BEFORE you call DBCreateIndex and you should always use the appropriate sort order table when doing a DBSeek or DBSelectIndex. Failure to do this could cause your data to appear out of order. When writing records, try not to get in the situation where two different sort orders need to be active when writing one record.

Here's a sample script from the Sort Order demo file:

  on mouseUp
    global ASCIIReverse   -- the reverse sort order table
    -- initialize FF session
    put DBOpenSession() into dbResult
    if dbResult < 0 then
      alert "FileFlex could not initialize!"
      exit
    end if
    -- open a database file
    put dbUse(field "theDBFile") into dbID
    if dbID < 0 then errorClose "Could not open database file."
    --
    -- create a custom index on TITLE using ASCIIReverse
    --
    buildSortOrder_ASCIIReverse -- build the sort order
    put DBSetSortOrder(ASCIIReverse) into dbResult
    put "Creating index file..." into field "status"
    updateStage
    put dbCreateIndex("REVASCII","TITLE","0","0") into ndxID
    if ndxID < 0 then errorClose "Could not create index file."
    -- fill the list
    put "Scanning data file..." into field "status"
    updateStage
    put DBSelectIndex(ndxID) into dbResult
    if dbResult < 0 then errorClose "Could not select index file."
    put "" into theList
    put DBTop() into dbResult
    repeat while 1 = 1  -- forever
      if theList <> "" then put return after theList
      put DBGetFieldByName("TITLE") into title
      updateStage
      put title after theList
      if DBSkip(1) = 3 then exit repeat
    end repeat
    put theList into field "movie list"
    updateStage
    put DBSetSortOrder(EMPTY) into dbResult -- turn off
    put DBCloseSession() into dbResult
    if dbResult < 0 then
      alert "FileFlex could not terminate!"
      exit
    end if
    put "Processing complete..." into field "status"
    updateStage
  end

  on errorClose s
    alert s
    put DBCloseSession() into dbResult
    if dbResult < 0 then
      alert "FileFlex could not terminate!"
      abort
    end if
    abort
  end errorClose

Important: FileFlex uses the xBASE/dBASE III standard format. This format does not permit 8-bit deep characters in memo fields contained within DBT files. Attempting to do character translation to characters greater than 128 can cause this format difficulties. If you need to store non-ASCII text in memo fields, you should either use a custom translation table or store your data in text files and refer to those files from FileFlex fixed-length fields.

Character Translation

If you're using a language that has special characters in its character set (e.g., accent marks, umlauts, and other specialty characters), you may run into an interesting problem moving documents from Macintosh to Windows or vice-versa. That's because while ASCII is cleanly defined for the US English character set of "a-zA-Z", that does not mean that character values of special characters are uniformly used across platforms.

FileFlex user Antonio Lucena of Madrid, Spain describes the conversion issue as it pertains to DOS vs. Windows files as well:

"The problem is that Windows uses different character set than MS-DOS (and the databases created with dBASE). MS-DOS uses OEM Char set, and Windows uses ANSI. For example in OEM, a diacritical "e" is numbered 130, but in ANSI, same "e" is numbered 233. The same problem appears when you open a document (with diachitical vowels on it) made with the EDIT tool from MS-DOS and you try to open it with the WRITE tool from Windows and no previous conversion was made."

Note: The above message illustrates the value of the free fileflex-talk mailing list. Another user had discovered the translation problem and by asking questions to this user and making that dialog public via fileflex-talk, Antonio was able to see the message and contribute his feedback. With feedback from him and others, we were able to identify the need for the new DBTranslateChars function described below.

FileFlex' WorldFlex technology provides for character-level translation using much the same mechanism as used for developing sort order tables. You develop a translation table that describes the new and old values and pass it to FileFlex along with a container of characters to be translated.

Setting up a character translation table is very straightforward. You need to build a Lingo string consisting of 256 characters. The position in the string is the value of the old character and the value at that position becomes the new character.

Note: The first character in the string is considered "position 0" by FileFlex. Also note that you cannot place a 0 into any character position. If you do not want translation, place the corresponding character value into that position or the value 255.
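
To see how a translation table is read, here is a pure-Lingo stand-in for the lookup. It is only an illustration (translateWithTable is a hypothetical name; the real work is done by FileFlex), and it reflects our reading of the note above that a table value of 255 means "leave this character alone".

```lingo
-- Hypothetical helper, not part of FileFlex.
on translateWithTable theString, theTable
  put "" into outString
  repeat with i = 1 to length(theString)
    put char i of theString into oldChar
    -- table position 0 is char 1 of the Lingo string, hence the + 1
    put char (charToNum(oldChar) + 1) of theTable into newChar
    if charToNum(newChar) = 255 then
      put oldChar after outString -- assumed: 255 means no translation
    else
      put newChar after outString
    end if
  end repeat
  return outString
end translateWithTable
```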

Creating Character Translation Utility Scripts

The best way to create the character translation table is to write a simple utility script. Here's an example script that builds a table in which every ASCII character translates to itself:

  on buildTranslateTable_ASCIIX
    global ASCIIX
    put "" into theTable
    repeat with i = 0 to 255
      if i = 0 then
        put numToChar(255) after theTable -- use 255 in byte 0
      else
        put numToChar(i) after theTable -- position in table
      end if
    end repeat
    put theTable into ASCIIX
  end buildTranslateTable_ASCIIX

Note the name of the handler is "BuildTranslateTable_ASCIIX". We've developed a convention where the routine that builds the translation table is called "BuildTranslateTable_" and the name of the translation itself is appended to the end. In order to prevent confusion with sort order tables, we've also placed an X after every translation table ("X" for an often used abbreviation for translate, which is "Xlate"). The translation table is placed in a global variable of the same name. So, for a translation table that converts to Windows diacriticals, we recommend naming the handler "BuildTranslateTable_WinCharX" and the global variable containing the translation table "WinCharX".
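
As a minimal sketch of that convention, the routine below starts from the ASCIIX table and changes a single entry, using the one pair of values given in Antonio's message earlier (OEM 130 becomes ANSI 233). A usable WinCharX table would need an entry for every diacritical character you care about. Those values are not listed here, so the sketch leaves them untranslated.

```lingo
on buildTranslateTable_WinCharX
  global WinCharX, ASCIIX
  buildTranslateTable_ASCIIX
  put ASCIIX into theTable
  -- OEM 130 (the accented "e") becomes ANSI 233;
  -- position 130 is char 131 because strings begin at 1, not 0
  put numToChar(233) into char 131 of theTable
  put theTable into WinCharX
end buildTranslateTable_WinCharX
```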

Here's an example routine that converts upper case to lower case (and the reverse):

  on buildTranslateTable_CaseReverseX
    global CaseReverseX, ASCIIX
    buildTranslateTable_ASCIIX
    put ASCIIX into theTable
    -- fill in lower case
    repeat with i = 65 to 90
      put numToChar(i+32) into char i+1 of theTable 
      -- using i+1 above because strings begin at 1, not 0
    end repeat
    -- fill in upper case
    repeat with i = 97 to 122
      put numToChar(i-32) into char i+1 of theTable 
    end repeat
    put theTable into CaseReverseX
  end buildTranslateTable_CaseReverseX

The above routine reverses the case, so an upper case "A" becomes a lower case "a" and vice versa. To create a routine that always converts to upper case, make both sets of characters upper case. Likewise, to create a routine that always converts to lower case, make both sets of characters lower case. Here's an UpperX routine:

  on buildTranslateTable_UpperX
    global UpperX, ASCIIX
    buildTranslateTable_ASCIIX
    put ASCIIX into theTable
    -- fill in upper case
    repeat with i = 97 to 122
      put numToChar(i-32) into char i+1 of theTable 
      -- using i+1 above because strings begin at 1, not 0
    end repeat
    put theTable into UpperX
  end buildTranslateTable_UpperX
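
For completeness, here is what the matching lower case routine might look like. It mirrors UpperX, filling the upper case positions (65 to 90) with their lower case values. The name LowerX simply follows the naming convention above.

```lingo
on buildTranslateTable_LowerX
  global LowerX, ASCIIX
  buildTranslateTable_ASCIIX
  put ASCIIX into theTable
  -- fill in lower case
  repeat with i = 65 to 90
    put numToChar(i+32) into char i+1 of theTable
    -- using i+1 above because strings begin at 1, not 0
  end repeat
  put theTable into LowerX
end buildTranslateTable_LowerX
```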

WARNING: Make absolutely certain you fill in all 256 bytes. Failure to do this could cause FileFlex to scan beyond the end of the translation table. The results could be unpredictable, and your program could terminate abnormally.

Translating Characters Using FileFlex

You can use FileFlex to translate character sets within a text container using the DBTranslateChars function. DBTranslateChars takes two parameters: the string to be translated and the pre-built translation table described above. It returns the translated string:

  put DBTranslateChars(myString,CaseReverseX) into newString

Here's a sample routine that will do the character translation (it presupposes that FileFlex has been initialized properly with DBOpenSession):

  on mouseUp
    global CaseReverseX
    
    buildTranslateTable_CaseReverseX
    put DBTranslateChars(field "text data",CaseReverseX) into field "text data"
  end mouseUp

Case Translation

If you're using a language that has special characters in its character set (e.g., accented letters, umlauts, and other specialty characters), you may run into an interesting problem converting between upper and lower case. With standard ASCII, case conversion is easy: just add 32 to an upper case letter's value or subtract 32 from a lower case letter's value. That's because in ASCII, the upper and lower case forms of a letter are always exactly 32 apart. However, when dealing with international character sets where lower case characters might have diacritical marks, it becomes much harder. That's because the characters have a wide variety of values and because there is little standardization.

FileFlex WorldFlex technology provides for intelligent case translation using much the same mechanism as used for developing character translation tables. You develop a translation table that describes the new and old values and pass it to FileFlex along with a container of characters to be translated.

You'll need to set up two case translation tables: one going to upper case and one going to lower case. For each table, you must build a Lingo string consisting of 256 characters. The position in the string is the value of the old character, and the character stored at that position becomes the new character.

Note: The first character in the string is considered "position 0" by FileFlex. Also note that you cannot place a 0 into any character position. If you do not want translation, place the corresponding character value into that position or the value 255.
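
To make the position rule concrete, here's a minimal sketch in plain Lingo of how a single character would be looked up in such a table. The handler name lookupInCaseTable is hypothetical and is not a FileFlex function. It only illustrates the rule described above:

  on lookupInCaseTable theTable, theChar
    -- hypothetical helper for illustration; not a FileFlex function
    -- FileFlex counts table positions from 0, Lingo counts chars from 1
    put charToNum(theChar) + 1 into thePosition
    return char thePosition of theTable
  end lookupInCaseTable

With a table that maps lower case to upper case, passing "a" (value 97) reads the 98th Lingo character of the table, which is where the replacement "A" would be stored.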

Creating Case Translation Utility Scripts

The best way to create the case translation table is to write a simple utility script. Here's an example script that simply converts ASCII lower case to ASCII upper case:

  on buildCaseTable_AsciiUC
    global AsciiUC
    put "" into theTable
    -- Although it takes a few extra cycles, consider
    -- building a full table first, then modifying it below.
    -- This is much easier to understand and test.
    repeat with i = 0 to 255
      if i = 0 then
        put numToChar(255) after theTable -- use 255 in byte 0
      else
        put numToChar(i) after theTable -- position in table
      end if
    end repeat
    -- fill in upper case
    repeat with i = 97 to 122
      put numToChar(i-32) into char i+1 of theTable 
      -- using i+1 above because strings begin at 1, not 0
    end repeat
    put theTable into AsciiUC
  end buildCaseTable_AsciiUC

Note the name of the handler is "BuildCaseTable_AsciiUC". We've developed a convention where the routine that builds the case table is called "BuildCaseTable_" and the name of the table itself is appended to the end. In order to prevent confusion with other tables, we've also placed "UC" at the end of the name of every table that translates to upper case (use "LC" for translation to lower case). The upper case table is placed in a global variable of the same name.

Here's the routine that translates back down to lower case:

  on buildCaseTable_AsciiLC
    global AsciiLC
    put "" into theTable
    -- Although it takes a few extra cycles, consider
    -- building a full table first, then modifying it below.
    -- This is much easier to understand and test.
    repeat with i = 0 to 255
      if i = 0 then
        put numToChar(255) after theTable -- use 255 in byte 0
      else
        put numToChar(i) after theTable -- position in table
      end if
    end repeat
    -- fill in lower case
    repeat with i = 65 to 90
      put numToChar(i+32) into char i+1 of theTable 
      -- using i+1 above because strings begin at 1, not 0
    end repeat
    put theTable into AsciiLC
  end buildCaseTable_AsciiLC
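
The ASCII tables above only cover unaccented letters. As a sketch of a table that also handles accented characters, here's an upper case routine that assumes the Windows Latin-1 (code page 1252) layout, where the accented lower case letters occupy values 224 through 254 and their upper case forms sit 32 positions lower. The name "WinLatinUC" is our own for illustration. Other character sets (the Macintosh set, for example) place these letters at different values, so verify the values against the character set your production actually uses:

  on buildCaseTable_WinLatinUC
    global WinLatinUC
    put "" into theTable
    -- build a full table first, then modify it below
    repeat with i = 0 to 255
      if i = 0 then
        put numToChar(255) after theTable -- use 255 in byte 0
      else
        put numToChar(i) after theTable -- position in table
      end if
    end repeat
    -- fill in upper case for the unaccented letters
    repeat with i = 97 to 122
      put numToChar(i-32) into char i+1 of theTable
    end repeat
    -- fill in upper case for the accented letters (assumed layout)
    -- 247 is the division sign, not a letter, so leave it alone
    repeat with i = 224 to 254
      if i <> 247 then
        put numToChar(i-32) into char i+1 of theTable
      end if
    end repeat
    put theTable into WinLatinUC
  end buildCaseTable_WinLatinUC

Values 223 and 255 are left untranslated in this sketch because their upper case forms are not 32 positions lower in that layout.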

Intelligent Case Conversion Using FileFlex

Case translation is used in a number of important ways within FileFlex, in particular within the intrinsic functions used in indexes and queries, and through special utility functions provided to perform simple case conversion.

You can tell FileFlex to use a case translation table with the FileFlex command DBSetCaseTables. Unlike most FileFlex commands, DBSetCaseTables is a wrapper script that does not call FileFlex directly. Instead, DBSetCaseTables sets three FileFlex global properties: gDBWorldCase, gDBWorldUpper and gDBWorldLower.

Here's the Lingo code for DBSetCaseTables:

  on DBSetCaseTables upperTable, lowerTable
    global gDBWorldCase
    global gDBWorldUpper, gDBWorldLower
    if (upperTable = EMPTY or lowerTable = EMPTY) then
      put EMPTY into gDBWorldCase
    else
      put "1" into gDBWorldCase
      put upperTable into gDBWorldUpper
      put lowerTable into gDBWorldLower
    end if
    return 0
  end DBSetCaseTables

When you call DBSetCaseTables, pass your upper case table first and your lower case table second. Here's an example:

  put DBSetCaseTables(AsciiUC, AsciiLC) into DBResult

To disable custom case conversion processing, pass the empty string to DBSetCaseTables:

  put DBSetCaseTables("") into DBResult

Inside of FileFlex is a C++ function called worldUpper(). When an intrinsic UPPER function is executed, the internal worldUpper routine is called upon to do the case conversion. When worldUpper is called, it asks the host development environment (i.e., Director) for the value of the reserved global variable gDBWorldCase. If worldUpper discovers that gDBWorldCase is not empty, it then asks the host environment for the contents of the global variables gDBWorldUpper and gDBWorldLower and uses them to control the conversion of the strings.

To turn off custom case conversion, send the empty string to DBSetCaseTables. When this happens, the global gDBWorldCase is set to the empty string. FileFlex then knows to skip the extra processing inherent in case conversion of world-aware data strings.

Cautions: Be careful that the first parameter is the upper case table and the second parameter is the lower case table. Also make sure you pass two tables. Failure to pass two complete case conversion tables could cause unpredictable results and might lead to abnormal termination.
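
If you'd like to catch a missing or short table before it reaches FileFlex, you could route calls through a small checking wrapper. This is only a sketch: the handler name setCaseTablesChecked and the -1 return value are our own assumptions, not part of FileFlex:

  on setCaseTablesChecked upperTable, lowerTable
    -- hypothetical wrapper; not a FileFlex function
    -- returns -1 (an arbitrary value chosen for this sketch) when
    -- either table is missing or is not exactly 256 characters long
    if voidP(upperTable) or voidP(lowerTable) then return -1
    if length(upperTable) <> 256 then return -1
    if length(lowerTable) <> 256 then return -1
    return DBSetCaseTables(upperTable, lowerTable)
  end setCaseTablesChecked

Because this wrapper rejects empty tables, you would still call DBSetCaseTables directly with the empty string to turn custom case conversion off.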

Standalone Intelligent Case Conversion Functions

In addition to doing intelligent case conversions within index and query functions, FileFlex provides you with the ability to do intelligent case conversions of standalone strings.

The function DBUpper will convert a string intelligently from lower case to upper case. If case tables have already been set with DBSetCaseTables, DBUpper will use those tables; otherwise, it will use the standard ASCII upper case conversion. Here's how to call DBUpper:

  put DBUpper(string) into newString

Likewise, DBLower will convert a string intelligently from upper case to lower case. If case tables have already been set with DBSetCaseTables, DBLower will use those tables; otherwise, it will use the standard ASCII lower case conversion. Here's how to call DBLower:

  put DBLower(string) into newString
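
Putting the pieces together, here's a sketch of a handler that builds the ASCII case tables with the utility scripts shown earlier, registers them with DBSetCaseTables, and then converts a field to upper case. The field name "text data" is borrowed from the earlier translation sample, and the handler presupposes that FileFlex has been initialized properly with DBOpenSession:

  on mouseUp
    global AsciiUC, AsciiLC
    -- build both tables; DBSetCaseTables expects upper first, lower second
    buildCaseTable_AsciiUC
    buildCaseTable_AsciiLC
    put DBSetCaseTables(AsciiUC, AsciiLC) into DBResult
    -- DBUpper uses the tables registered above
    put DBUpper(field "text data") into field "text data"
  end mouseUp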