Text File | 2000-12-07 | 4KB | 126 lines
ZAPLET FORMAT
=============
Zaplets are configuration files for blocking URLs and filtering HTML
data.
Each file can have any number of zaplets (including zero).
Each zaplet can have any number of blockers or filters (including
zero).
A zaplet is an XML-formatted list of tags, which are described below:
[XXX make formal DTD specification]
1) <zaplet version="1.0" description="blubb" lang="Perl"></zaplet>
The version attribute is for backward compatibility. If a parser does
not understand a specific version, the zaplet is ignored.
The description can be displayed by configuration tools which
enable/disable some zaplets.
The lang attribute restricts the zaplet to the named implementation
language (for example Perl or Python).
If there is no lang attribute, this zaplet can be used by all
languages.
If there is a lang attribute, all lang attributes occurring in
the rules are ignored.
For language-specific things see below.
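As a rough illustration of the version and lang rules above, a parser
might decide whether to use a zaplet along these lines. This is a
minimal Python sketch, not the actual implementation: the function name
zaplet_applies, the set SUPPORTED_VERSIONS, and the case-insensitive
comparison of language names are assumptions made for the example. The
default version "1.0" is taken from the NOTES section.

```python
import xml.etree.ElementTree as ET

# Assumption: this hypothetical parser understands only version 1.0.
SUPPORTED_VERSIONS = {"1.0"}


def zaplet_applies(zaplet_xml, my_lang):
    """Return True if a <zaplet> element should be used by a parser
    written in my_lang (for example "Python" or "Perl")."""
    elem = ET.fromstring(zaplet_xml)
    if elem.tag != "zaplet":
        return False
    # A missing version attribute defaults to "1.0" (see NOTES).
    version = elem.get("version", "1.0")
    if version not in SUPPORTED_VERSIONS:
        return False  # unknown version: the zaplet is ignored
    lang = elem.get("lang")
    if lang is None:
        return True  # no lang attribute: usable by all languages
    # Assumption: language names are compared case-insensitively.
    return lang.lower() == my_lang.lower()
```

Called with the element from item 1 and my_lang="Perl" the function
returns True; with my_lang="Python" it returns False.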
2) <filter
     description=""
     tag=""
     attr=""
     attrvalue=""
     lang=""
     replace_tag
     replace_tag_name
     replace_enclosed_block
     replace_attribute
     replace_attribute_value
     replace_alternate_content
     replace_ifnotmatch
   >this is the replacement text (or none)</filter>
The filter tag applies to HTML content and can replace (or delete if
the replacement is empty) arbitrary HTML tag blocks.
Option        Description of this option
-------------------------------------------------------------------
description   this text can be displayed by configuration
              tools which enable/disable some filters.
tag           specifies a regular expression to match an
              HTML tag.
attr          specifies a regular expression to match an
              attribute of an HTML tag.
attrvalue     specifies a regular expression to match an
              attribute value of an HTML tag.
lang          restricts this rule to the specified
              language
-------------------------------------------------------------------
Option                    What it does when replacement text is 'foo'
-------------------------------------------------------------------
replace_tag               <blink>text</blink>  => footextfoo
replace_tag_name          <blink>text</blink>  => <foo>text</foo>
replace_enclosed_block    <blink>text</blink>  => <blink>foo</blink>
replace_attribute         <a href="bla">..</a> => <a foo>..</a>
replace_attribute_value   <a href="bla">..</a> => <a href="foo">..</a>
replace_ifnotmatch        replace if matchers do not match a tag
-------------------------------------------------------------------
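The effect of the first five replace options can be reproduced with a
small Python sketch. It is only an approximation for illustration: the
function name apply_option is hypothetical, the HTML is handled with
plain regular expressions, attribute values are assumed to be
double-quoted, and nested tags are not considered. The options
replace_alternate_content and replace_ifnotmatch are left out because
the table gives no example output for them.

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL


def apply_option(html, option, replacement, tag, attr=None):
    """Apply one replace_* option; tag and attr are regex strings."""
    if option == "replace_tag":
        # Replace the opening and the closing tag, keep the content.
        pattern = r"</?(?:%s)\b[^>]*>" % tag
        return re.sub(pattern, lambda m: replacement, html, flags=FLAGS)
    if option == "replace_tag_name":
        # Swap only the tag name in opening and closing tags.
        pattern = r"<(/?)(?:%s)\b" % tag
        return re.sub(pattern, lambda m: "<" + m.group(1) + replacement,
                      html, flags=FLAGS)
    if option == "replace_enclosed_block":
        # Keep both tags, replace what stands between them.
        pattern = r"(<(?:%s)\b[^>]*>).*?(</(?:%s)\s*>)" % (tag, tag)
        return re.sub(pattern,
                      lambda m: m.group(1) + replacement + m.group(2),
                      html, flags=FLAGS)
    if attr is None:
        raise ValueError("option %r needs an attr pattern" % option)
    if option == "replace_attribute":
        # Replace the whole name="value" pair.
        pattern = r'(<(?:%s)\b[^>]*?\s)(?:%s)\s*=\s*"[^"]*"' % (tag, attr)
        return re.sub(pattern, lambda m: m.group(1) + replacement,
                      html, flags=FLAGS)
    if option == "replace_attribute_value":
        # Replace only the text between the quotes.
        pattern = r'(<(?:%s)\b[^>]*?\s(?:%s)\s*=\s*")[^"]*(")' % (tag, attr)
        return re.sub(pattern,
                      lambda m: m.group(1) + replacement + m.group(2),
                      html, flags=FLAGS)
    raise ValueError("unsupported option %r" % option)
```

For example, apply_option('<blink>text</blink>', 'replace_tag_name',
'foo', 'blink') yields '<foo>text</foo>', matching the table row.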
These options can be combined in several ways; the valid combinations
are listed below.
An invalid combination causes the filter rule to be dropped.
[XXX list all combinations]
3) <block description="" host="" path="" lang=""/>
Here we can specify URLs to block. Each URL is split into host and
path, and a regular expression is specified for each.
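A block rule could be evaluated roughly as follows. This Python sketch
is an assumption-laden illustration, not the real blocker: the function
name is_blocked is hypothetical, "host" is taken to be the network
location including any port, "path" is taken to be the URL path plus
query string (both guessed from the patterns in the EXAMPLES section),
and a URL is treated as blocked only when both expressions match. A
pattern that is not given matches everything and matching ignores case,
as stated in NOTES.

```python
import re
from urllib.parse import urlsplit


def is_blocked(url, host=None, path=None):
    """Return True if url matches the host and path patterns.

    host and path are regex strings; None means "not given"."""
    parts = urlsplit(url)
    # Assumption: the path pattern also sees the query string.
    target_path = parts.path
    if parts.query:
        target_path += "?" + parts.query
    host_ok = (host is None or
               re.search(host, parts.netloc, re.IGNORECASE) is not None)
    path_ok = (path is None or
               re.search(path, target_path, re.IGNORECASE) is not None)
    # Assumption: both parts must match for the URL to be blocked.
    return host_ok and path_ok
```

With the patterns host=r"^[\d.:]+$" and path=r"([=&?]|\.gif$|banner)",
a URL such as http://192.168.0.1:8080/ads/banner.gif is blocked, while
the same path on a host with a DNS name is not.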
NOTES
=====
All regular expressions are matched case-insensitively.
If a regular expression is not given, it matches everything (this is
not the same as giving an empty regular expression!).
Default zaplet version is "1.0".
Default filter options are "replace_tag replace_enclosed_block".
Zaplet files have a .zap extension.
XXX quote XML metacharacters!
EXAMPLES
========
<zaplet description="BLINK tag">
  <filter description="Replace BLINK with B"
          tag="blink" replace_tag_name>b</filter>
</zaplet>

<zaplet description="Advertisements">
  <block description="data from hosts without DNS name"
         host="^[\d.:]+$" path="([=&?]|\.gif$|banner)"/>
  <filter description="CGI adverts"
          tag="a" attr="href"
          attrvalue="http://.*/cgi-bin/ads?(log)?.*([=&?]|\.gif)"/>
</zaplet>

# Restricted to Python because of the ?P<replace> named submatch.
<zaplet description="Redirects" lang="Python">
  <filter description="No redirection"
          tag="a" attr="href"
          attrvalue="redirect\.cgi\?.*?location=(?P<replace>[^="&]+)"
          replace_attribute_value/>
</zaplet>
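
The "Redirects" zaplet relies on Python's named groups. The format
description does not say how the replace submatch is used, so the
following Python sketch simply assumes that the text captured by the
group named replace becomes the new attribute value. The function name
unredirect is hypothetical; the pattern is copied from the example.

```python
import re

# Pattern taken from the "No redirection" filter example.
ATTRVALUE = r'redirect\.cgi\?.*?location=(?P<replace>[^="&]+)'


def unredirect(href):
    """Return the redirect target if href matches, else href unchanged."""
    match = re.search(ATTRVALUE, href, re.IGNORECASE)
    if match is None:
        return href
    # Assumption: the named submatch replaces the attribute value.
    return match.group("replace")
```

A link to redirect.cgi?id=5&location=http://target.example/page would
then point directly at http://target.example/page.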