ZAPLET FORMAT
=============

Zaplets are configuration files for blocking URLs and filtering HTML
data. Each file can hold any number of zaplets (including zero), and
each zaplet can hold any number of blockers or filters (including zero).

A zaplet is an XML-formatted list of tags, which are described below:
[XXX make formal DTD specification]

1) The version tag is for backward compatibility. If a parser does not
   understand a specific version, the zaplet is ignored.
   The description can be displayed by configuration tools which
   enable/disable some zaplets.
   The lang attribute can restrict the zaplet to a single language.
   If there is no lang attribute, the zaplet can be used by all
   languages. If there is a lang attribute, all lang attributes
   occurring in the rules are ignored.
   For language-specific things see below.

2) this is the replacement text (or none)
   The filter tag applies to HTML content and can replace (or delete,
   if the replacement is empty) arbitrary HTML tag blocks.

   Option        Description of this option
   -------------------------------------------------------------------
   description   this text can be displayed by configuration tools
                 which enable/disable some filters.
   tag           specifies a regular expression to match an HTML tag.
   attr          specifies a regular expression to match an attribute
                 of an HTML tag.
   attrvalue     specifies a regular expression to match an attribute
                 value of an HTML tag.
   lang          restricts this rule to the specified language
   -------------------------------------------------------------------

   Option                   What it does when replacement text is 'foo'
   -------------------------------------------------------------------
   replace_tag              text => footextfoo
   replace_tag_name         text => text
   replace_enclosed_block   text => foo
   replace_attribute        .. => ..
   replace_attribute_value  .. => ..
   replace_ifnotmatch       replace if matchers do not match a tag

   There are several ways to combine these options. We will now list
   all valid combinations. An invalid combination causes the filter
   rule to be dropped.
   [XXX list all combinations]

3) Here we can specify URLs to block. We split each URL into host and
   path and specify a regular expression for each.

NOTES
=====

All regular expressions are matched case-insensitively.
If a regular expression is not given, it matches everything (this is
not the same as giving an empty regular expression!).
Default zaplet version is "1.0".
Default filter options are "replace_tag replace_enclosed_block".
Zaplet files have a .zap extension.
XXX quote XML metacharacters!

EXAMPLES
========

b # restricted to Python because of ?P name submatches.
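
The URL blocking rule from item 3, together with the matching rules in
NOTES, can be sketched in Python as follows. This is only an
illustration under stated assumptions, not the parser's actual code:
the names BlockRule and compile_or_none are hypothetical, a pattern
that is "not given" is modelled as None, and the choice of an unanchored
search (rather than a full match) is a guess.

```python
import re
from urllib.parse import urlsplit


def compile_or_none(pattern):
    """Compile a pattern case-insensitively; None means 'not given'."""
    if pattern is None:
        return None
    return re.compile(pattern, re.IGNORECASE)


class BlockRule:
    """Hypothetical blocker: one regular expression each for host and path."""

    def __init__(self, host=None, path=None):
        self.host = compile_or_none(host)
        self.path = compile_or_none(path)

    def matches(self, url):
        # Split the URL into host and path, as the format describes.
        parts = urlsplit(url)
        host = parts.hostname or ""
        path = parts.path or "/"
        # A pattern that was not given matches everything.
        host_ok = self.host is None or self.host.search(host) is not None
        path_ok = self.path is None or self.path.search(path) is not None
        return host_ok and path_ok
```

With this sketch, BlockRule(host=r"ads\.", path=r"/banner") would block
http://ADS.example.com/Banner/1.gif but not
http://www.example.com/banner/1.gif, and a BlockRule() with no patterns
would block every URL.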