- wsregexp(3W)                                                  wsregexp(3W)
-
-
-
- NAME
- wsregexp: wsrecompile, wsrestep, wsrematch, wsreerr - wide character
- based regular expression compile and match routines
-
- SYNOPSIS
- #include <wsregexp.h>
- #include <widec.h>
-
- long wsrecompile(struct rexdata *prex, long *expbuf,
-                  long *endbuf, wchar_t eof);
- int wsrestep(struct rexdata *prex, wchar_t *wstr, long *expbuf);
- int wsrematch(struct rexdata *prex, wchar_t *wstr, long *expbuf);
- char *wsreerr(int err);
-
- DESCRIPTION
- These functions are general-purpose internationalized regular expression
- matching routines for programs that perform regular expression matching.
- They are declared in the wsregexp.h header file.
-
- The function wsrecompile takes as input an internationalized regular
- expression as defined below (in addition to the normal regular expressions
- defined by regexp) and produces a compiled expression that can be used
- with wsrestep or wsrematch.
-
- struct rexdata {
-     short    sed;     /* flag for sed */
-     wchar_t *str;     /* regular expression */
-     int      err;     /* returned error code, 0 = no error */
-     wchar_t *loc1;
-     wchar_t *loc2;
-     int      circf;
-     ...
- };
-
- The first parameter, prex, is a pointer to the specification of the
- regular expression. prex->sed should be non-zero if sed-style delimiter
- syntax is to be adopted. prex->str should point to the regular expression
- to be compiled, which must be in wide character format. prex->err
- indicates any error that occurred during the compilation or use of this
- regular expression. expbuf points to the place where the compiled regular
- expression will be stored. endbuf points to the first long after the space
- where the compiled regular expression may be placed; (endbuf-expbuf)
- should be large enough for the compiled regular expression to fit. eof is
- the wide character that marks the end of the regular expression. This
- character is usually a / (slash).
-
- If wsrecompile is successful, it returns the pointer to the end of the
- regular expression, endbuf. Otherwise, 0 is returned and the error code
- is set in prex->err.
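Because prex->str must be in wide character format, a pattern written as an ordinary multibyte string has to be converted before wsrecompile is called. The following is a minimal sketch of that step using only standard C. The helper name to_wide_pattern is hypothetical and is not part of wsregexp.h; the sketch assumes the pattern is valid in the current locale.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/*
 * to_wide_pattern is a hypothetical helper, not part of wsregexp.h.
 * It converts the multibyte pattern src into dst, which has room for
 * dst_len wide characters including the terminating null.
 * Returns the number of wide characters written, excluding the null,
 * or (size_t)-1 if src is invalid in the current locale or does not fit.
 */
size_t to_wide_pattern(wchar_t *dst, size_t dst_len, const char *src)
{
    size_t n;

    assert(dst != NULL && src != NULL && dst_len > 0);

    n = mbstowcs(dst, src, dst_len);
    if (n == (size_t)-1)
        return (size_t)-1;      /* invalid multibyte sequence */
    if (n == dst_len)
        return (size_t)-1;      /* no room left for the null terminator */
    return n;
}
```

On success the converted buffer can be assigned to prex->str, and the end of the compile buffer (for example &expbuf[BUFSIZ]) passed as endbuf.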
-
- The functions wsrestep and wsrematch do pattern matching given a
- null-terminated wide character string wstr and a compiled regular
- expression expbuf as input. expbuf for these functions should be the
- compiled regular expression obtained by a call to the function
- wsrecompile.
-
- The function wsrestep returns non-zero if some substring of wstr matches
- the regular expression in expbuf and zero if there is no match. The
- function wsrematch returns non-zero if a substring of wstr starting from
- the beginning matches the regular expression in expbuf and zero if there
- is no match. If there is a match, prex->loc1 and prex->loc2 are set.
- prex->loc1 points to the first wide character that matched the regular
- expression; prex->loc2 points to the wide character after the last wide
- character that matches the regular expression. Thus if the regular
- expression matches the entire input string, prex->loc1 will point to the
- first wide character of wstr and prex->loc2 will point to the null at the
- end of wstr.
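The two pointers describe a half-open range: the match starts at loc1 and ends just before loc2, so its length is loc2 - loc1. The following sketch shows one way an application might copy the matched text out of the input line. The helper copy_match is hypothetical and is not part of wsregexp.h; its loc1 and loc2 arguments stand for the values found in prex->loc1 and prex->loc2 after a successful match.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/*
 * copy_match is a hypothetical helper, not part of wsregexp.h.
 * loc1 is the first matched wide character, loc2 is one past the last.
 * Copies the match into out (capacity out_len wide characters) and
 * returns its length, or (size_t)-1 if it does not fit.
 */
size_t copy_match(const wchar_t *loc1, const wchar_t *loc2,
                  wchar_t *out, size_t out_len)
{
    size_t len;

    assert(loc1 != NULL && loc2 != NULL && loc1 <= loc2);

    len = (size_t)(loc2 - loc1);    /* half-open range [loc1, loc2) */
    if (out == NULL || len >= out_len)
        return (size_t)-1;
    wmemcpy(out, loc1, len);
    out[len] = L'\0';
    return len;
}
```

When the expression matches the whole input, loc2 is the terminating null of wstr and the returned length equals wcslen(wstr).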
-
- wsrestep uses the variable circf of struct rexdata, which is set by
- wsrecompile if the regular expression begins with ^ (caret). If it is
- set, wsrestep will try to match the regular expression at the beginning
- of the string only. If more than one regular expression is to be compiled
- before the first is executed, the value of prex->circf should be saved
- for each compiled expression and restored to that saved value before each
- call to wsrestep.
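The save-and-restore rule for circf can be sketched as follows. Everything in this block is illustrative: struct rexdata_stub stands in for the circf member of the real struct rexdata so that the sketch compiles without wsregexp.h, and struct saved_re, RE_BUF_LONGS, remember_circf, and restore_circf are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the circf member of struct rexdata (assumption). */
struct rexdata_stub {
    int circf;
};

#define RE_BUF_LONGS 256    /* hypothetical compile buffer size */

/* One compiled expression together with its own anchoring flag. */
struct saved_re {
    long expbuf[RE_BUF_LONGS];
    int  circf;
};

/* Call right after compiling into re->expbuf. */
void remember_circf(struct saved_re *re, const struct rexdata_stub *prex)
{
    assert(re != NULL && prex != NULL);
    re->circf = prex->circf;
}

/* Call right before matching against re->expbuf. */
void restore_circf(const struct saved_re *re, struct rexdata_stub *prex)
{
    assert(re != NULL && prex != NULL);
    prex->circf = re->circf;
}
```

Keeping the flag next to the compiled buffer means each expression carries its own anchoring state, whichever expression was compiled last.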
-
- wsreerr returns the error message corresponding to the error code in the
- language of the current locale. The error code err should be one returned
- by the wsregexp functions in the err variable of struct rexdata.
-
- The internationalized regular expressions available for use with the
- wsregexp functions are constructed as follows:
-
- Expression     Meaning
-
- c              the character c, where c is not a special character.
-
- [[:class:]]    class is any character type as defined by the LC_CTYPE
-                locale category. class can be one of the following:
-
-                alpha     a letter
-
-                upper     an upper-case letter
-
-                lower     a lower-case letter
-
-                digit     a decimal digit
-
-                xdigit    a hexadecimal digit
-
-                alnum     an alphanumeric character
-
-                space     any whitespace character
-
-                punct     a punctuation character
-
-                print     a printable character
-
-                graph     a character that has a visible representation
-
-                cntrl     a control character
-
- [[=c=]]        An equivalence class, that is, any collation element defined
-                as having the same relative order in the current collation
-                sequence as c. As an example, if A and a belong to the same
-                equivalence class, then both [[=A=]b] and [[=a=]b] are
-                equivalent to [Aab].
-
- [[.cc.]]       A multi-character collating symbol. Multi-character
-                collating elements must be represented as collating symbols
-                to distinguish them from single-character collating
-                elements. As an example, if the string ab is a valid
-                collating element, then [[.ab.]] will be treated as an
-                element and will match the same string of characters, while
-                ab will match the list of characters a and b. If the
-                multi-character collating symbol is not a valid collating
-                element in the current collating sequence definition, the
-                symbol will be treated as an invalid expression.
-
- [[c-c]]        Any collation element in the character expression range
-                c-c, where c can identify a collating symbol or an
-                equivalence class. If the character - (hyphen) appears
-                immediately after an opening square bracket, e.g. [-c], or
-                immediately prior to a closing square bracket, e.g. [c-],
-                it has no special meaning.
-
- Immediately following an opening square bracket, ^ means the complement
- of, e.g. [^c]. Otherwise, it has no special meaning.
-
- Within square brackets, a . that is not part of a [[.cc.]] sequence, or
- a : that is not part of a [[:class:]] sequence, matches itself.
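The bracket forms above follow the same syntax as POSIX bracket expressions, so their behavior can be tried out on systems that lack wsregexp.h. The sketch below does that with the POSIX regcomp() and regexec() routines operating on ordinary char strings; it is a swapped-in illustration and does not use the wsregexp functions. The helper name find_class_match is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * find_class_match is a hypothetical helper built on POSIX regcomp()
 * and regexec(), not on the wsregexp routines.
 * Returns the byte offset of the first match of pattern in text,
 * -1 if there is no match, or -2 if the pattern does not compile.
 */
long find_class_match(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m;
    long where = -1;

    assert(pattern != NULL && text != NULL);

    if (regcomp(&re, pattern, 0) != 0)      /* 0 selects basic REs */
        return -2;
    if (regexec(&re, text, 1, &m, 0) == 0)
        where = (long)m.rm_so;              /* start of the match */
    regfree(&re);
    return where;
}
```

For instance, the pattern [[:space:]][[:upper:]] applied to the text "see Spot run" matches at the space before "Spot", and [-c] treats the leading hyphen as an ordinary character.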
-
- SEE ALSO
- regexp(5)
-
- DIAGNOSTICS
- Errors are:
-
- ERR_NORMBR     no remembered search string
-
- ERR_REOVFLOW   regexp overflow. This happens when wsrecompile cannot fit
-                the compiled regular expression in (endbuf-expbuf).
-
- ERR_BRA        ( ) imbalance
-
- ERR_DELIM      illegal or missing delimiter
-
- ERR_NBR        bad number in { }
-
- ERR_2MNBR      more than 2 numbers given in { }
-
- ERR_DIGIT      digit out of range
-
- ERR_2MLBRA     too many (
-
- ERR_RANGE      range number too large
-
- ERR_MISSB      } expected after \
-
- ERR_BADRNG     first number exceeds second in { }
-
- ERR_SIMBAL     [ ] imbalance
-
- ERR_SYNTAX     illegal regular expression
-
- ERR_ILLCLASS   illegal [:class:]
-
- ERR_EQUIL      illegal [=class=]
-
- ERR_COLL       illegal [.cc.]
-
- EXAMPLE
- The following is an example of how the regular expression macros and
- calls might be defined by an application program:
-
- #include <stdio.h>      /* BUFSIZ, fprintf */
- #include <stdlib.h>     /* mbstowcs, mbtowc */
- #include <string.h>     /* strlen */
- #include <wsregexp.h>
- #include <widec.h>
- . . .
- struct rexdata rex;
- long expbuf [BUFSIZ];       /* Buffer for the compiled RE */
-
- /* Define a RE to identify a capitalized word */
- char *regexp = "[[:space:]][[:upper:]]";
- wchar_t wregexp [512];
- wchar_t weof;               /* The end of regular expression */
- char eof = '\0';
-
- wchar_t linebuf [BUFSIZ];   /* Buffer for the input string */
- . . .
- /* Convert the pattern and its end marker to wide characters */
- (void) mbstowcs(wregexp, regexp, strlen(regexp)+1);
- (void) mbtowc(&weof, &eof, 1);
- rex.str = wregexp;
- rex.sed = 0;
- rex.err = 0;
- /* A zero return means the compile failed; rex.err holds the code */
- if (!wsrecompile(&rex, expbuf, &expbuf[BUFSIZ], weof))
-     fprintf(stderr, "%s\n", wsreerr(rex.err));
- . . .
- /* Search linebuf for any substring that matches */
- if (wsrestep(&rex, linebuf, expbuf))
-     succeed;