This chapter explains how to design new filters based on the Replace Text filter type. The Replace Text filter type is similar to the find and replace command found in many word processors and text editors, and is just as easy to use. Unlike common find and replace commands, however, the Replace Text filter supports case masks, break lists, wild card characters, and replace masks. These features allow the Replace Text filter to perform advanced fixed-width text search and replace operations.
This chapter assumes that you already know how to create new filters and open filter editors, and that you know how to use the elements common to all filter editors. If not, please read Editing Filters before proceeding.
The Replace Text editor has three fields named Find, Replace, and Break List. You use the Find and Replace fields to enter the search and replace strings for the filter. You use the Break List field to enter the break characters which, if present, restrict the search to separate text or whole words. There's also a button named Edit Case Mask which brings up a Case Mask Editor for defining the case mask to be used during the search.
Replacing all occurrences of one string with another is very easy using the Replace Text filter. Simply enter the string you want to find (the find string) in the Find field, and the string you want to replace it with (the replace string) in the Replace field. If your find or replace string contains the '*' character, you must enter it as '\*'. You'll learn why in the Wild Card Characters section. Whenever you apply the filter to some text, all occurrences of the find string are replaced with the replace string.
It's important to note that at this point the filter does exactly what it's told: only text which exactly matches the find string you entered will be found, and it will be replaced exactly by the replace string. If you entered "cat" for the find string, and "dog" for the replace string, "catalog" would become "dogalog" because you have not entered a Break List for a whole word search. Further, "Cat" will remain "Cat" because it does not match "cat".
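The exact-match behavior described above can be imitated in a few lines of Python. This is only an illustrative sketch, not the filter's actual implementation; the function name replace_exact is hypothetical.

```python
def replace_exact(text, find, replace):
    """Replace every exact, case-sensitive occurrence of `find`.

    Hypothetical helper that mimics a Replace Text filter with no
    break list, no case mask, and no wild card characters.
    """
    if not find:
        return text  # nothing to search for
    return text.replace(find, replace)


sample = "cat catalog Cat"
result = replace_exact(sample, "cat", "dog")
# "catalog" is changed because there is no break list;
# "Cat" is left alone because the match is case sensitive.
```

With the sample text, both 'cat' and the 'cat' inside 'catalog' are replaced, while 'Cat' is untouched.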
Sometimes this is the kind of search you want, but often you will want to define whole word and/or case insensitive searches. Read on to learn how.
The Replace Text filter supports wild card characters in both the Find and Replace fields. A wild card character is defined by entering the '*' character. If you need to search for a literal '*', type '\*' for every '*' in your find string.
In the Find field, typing '*' tells the filter to accept any character in that position. For instance, a find string of '*at' will find cat, bat, mat, etc. This allows you to search for variations of text where one or more characters are not known or do not matter.
In the Replace field the '*' character acts as a replace mask when replacing the found text. Typing '*' tells the filter to keep whatever character is currently in that position when replacing the text. For instance, if you enter a find string of 'cat' and a replace string of '*og', the text 'cat' will become 'cog'. The first character in the found text, whatever it is in each occurrence, is retained because the first character in the replace string is '*'. This allows you to perform replacements where certain characters, which vary with each occurrence, should remain the same.
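The following Python sketch shows one way the wild card and replace mask rules could work together. It is an approximation under stated assumptions, not the filter's real code: the names parse_pattern and replace_wild are hypothetical, matches are assumed to be found left to right without overlapping, and a '*' in the replace string that has no corresponding character in the found text is assumed to produce nothing.

```python
def parse_pattern(pattern):
    # Split a field into (character, is_wild) pairs.
    # A bare '*' is a wild card; a backslash followed by '*' is a literal '*'.
    items = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "\\" and i + 1 < len(pattern) and pattern[i + 1] == "*":
            items.append(("*", False))
            i += 2
        elif pattern[i] == "*":
            items.append(("*", True))
            i += 1
        else:
            items.append((pattern[i], False))
            i += 1
    return items


def replace_wild(text, find, replace):
    find_items = parse_pattern(find)
    repl_items = parse_pattern(replace)
    width = len(find_items)  # fixed width of every match
    if width == 0:
        return text
    out = []
    i = 0
    while i < len(text):
        window = text[i:i + width]
        matched = len(window) == width and all(
            wild or ch == w for (ch, wild), w in zip(find_items, window)
        )
        if matched:
            for pos, (ch, wild) in enumerate(repl_items):
                if not wild:
                    out.append(ch)           # literal replacement character
                elif pos < width:
                    out.append(window[pos])  # replace mask: keep found character
                # Assumption: a mask position past the found text adds nothing.
            i += width
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

For example, replace_wild("cat bat", "*at", "*og") keeps the first character of each match and yields "cog bog".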
The Replace Text filter editor has a field for entering a break list. In any search where there is a break list, an occurrence of the find string counts as a find only if a character from the break list appears immediately before it and immediately after it. Any single character from the list will do on each side, and the two characters can be different from each other. This is useful for defining "separate text" or "whole word" searches.
For instance, say you want to replace 'cat' with 'dog' but don't want to affect words like 'catalog'. You could enter 'cat' as the find string, 'dog' as the replace string, and ' ' (a space character) for the break list. With this filter, only occurrences of 'cat' with a space on both sides, ' cat ', will actually be found and replaced.
When using a break list you need to remember to enter all of the characters which should break the text. In the above example, ' ' is an incomplete break list. The string ' cat.', even though it's a separate word, would not be found because on the right side there is a period, not a space. A more complete break list for whole words would include punctuation, such as ' .?!,'. This would find and successfully replace ' cat.' since at least one character from the break list is on either side of the find string, a space on the left and a period on the right.
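The break list rule can be sketched in Python as follows. This is a simplified model rather than the filter's actual logic: the name replace_with_breaks is hypothetical, and the sketch assumes that the start and end of the text do not count as breaks, which the description above does not specify.

```python
def replace_with_breaks(text, find, replace, break_list):
    """Replace `find` only where a break character sits on both sides.

    Hypothetical helper. An empty break_list disables the check.
    Assumption: the very start and end of the text are NOT breaks.
    """
    if not find:
        return text
    n = len(find)
    out = []
    i = 0
    while i < len(text):
        if text.startswith(find, i):
            # Check the character just before and just after the occurrence.
            left_ok = not break_list or (i > 0 and text[i - 1] in break_list)
            right_ok = not break_list or (
                i + n < len(text) and text[i + n] in break_list
            )
            if left_ok and right_ok:
                out.append(replace)
                i += n
                continue
        out.append(text[i])
        i += 1
    return "".join(out)


incomplete = replace_with_breaks("the cat.", "cat", "dog", " ")
complete = replace_with_breaks("the cat.", "cat", "dog", " .?!,")
```

With the incomplete break list ' ', the text 'the cat.' is left unchanged; with ' .?!,' the period counts as a break and the word is replaced.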
The Break List field has two popup menus next to it. The left popup menu is the standard character menu useful for entering control and special characters. The right popup menu lists several useful break lists, such as Whole Word Search. To use one of these break lists, simply select it from the menu and it will be entered into the field.
The break list feature is set up to support the most common types of separate text searches, such as whole word searches. For more advanced searches you may need to use the Replace Pattern filter type instead.
The Replace Text filter supports case masks in its search. A case mask is a table which maps normal ASCII characters to replacement, or mask, characters. Case masks are used in the Character Table filter to actually map and replace characters at very high speed. In the Replace Text filter the characters are not actually changed as they are in the Character Table filter. Instead, the filter 'sees' the mapped characters rather than the normal characters during the search. This is useful for implementing case insensitive searches of various kinds.
Consider the example used throughout this chapter, the endless search for the word 'cat'. What if you wanted your filter to find and replace both 'cat' and 'Cat'? First you would enter 'cat' for your find string. Then you would enter a case mask which told the filter to see a 'c' everywhere there was a 'C'. When the filter performed its search it would find both 'cat' and 'Cat'.
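A case mask can be modeled in Python as a lookup table that is applied only while searching. The sketch below is illustrative and makes its own assumptions: the names make_case_mask and replace_with_mask are hypothetical, the table is a plain dictionary over 256 character codes, and the find string is compared as typed, without being passed through the mask.

```python
def make_case_mask(overrides):
    # Start from an identity table for 256 character codes,
    # then apply the requested mappings, e.g. {"C": "c"}.
    mask = {chr(code): chr(code) for code in range(256)}
    mask.update(overrides)
    return mask


def replace_with_mask(text, find, replace, mask):
    """Search the masked view of `text`, but edit the original text."""
    if not find:
        return text
    # What the filter 'sees'; one character in, one character out,
    # so positions in `seen` line up with positions in `text`.
    seen = "".join(mask.get(ch, ch) for ch in text)
    out = []
    i = 0
    while i < len(text):
        if seen.startswith(find, i):
            out.append(replace)
            i += len(find)
        else:
            out.append(text[i])  # unmatched text keeps its original case
            i += 1
    return "".join(out)


c_mask = make_case_mask({"C": "c"})
fold_mask = make_case_mask({chr(c): chr(c).lower() for c in range(65, 91)})
```

With c_mask, a find string of 'cat' matches both 'cat' and 'Cat' but not 'CAT'. With fold_mask, which maps every uppercase letter A to Z to its lowercase form, all three are matched.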
To edit and save a new case mask for a Replace Text filter, click the Edit Case Mask button in the Replace Text filter editor. A new dialog will appear which lists all 256 characters in the 8-bit Mac ASCII character set. The left side of each list item shows the normal character and its ASCII code. The right side shows the replacement character and its ASCII code.
Now select the character you wish to map and type the replacement character. For the example above, you would scroll to find 'C' (ASCII 67), select the list item by clicking on it, and type 'c'. The right side of the list item will change to reflect the replacement character. Repeat for every character you wish to mask, and then click Save to save the new mask, or Cancel to discard the new mask and retain the old one.
You can also type the ASCII code for the replacement character while holding down the Control key, or select a character from the Character popup menu.
Finally, there are a number of existing case masks under the Tables popup menu. Simply select one from the menu and click Save to use it.
(Under Construction)