char |
Definition |
. |
(period) Matches
any character, except the end-of-line. |
^ |
(caret) Matches the actual beginning-of-line position
or the preceding line-delimiter character pair (CHR$(13,10) or $CRLF),
as taken from the start& character position. The line-delimiter
characters themselves are not included in the iLen& result.
(Also see [^] below for usage within a character class definition.) |
$ |
(dollar) Matches
the end-of-line position, which may be either the first line-delimiter
character pair (CHR$(13,10) or $CRLF) that is encountered in the search
to the right of the start& position, or the actual end of the
main$ string, whichever occurs first. The line-delimiter characters
themselves are not included in the iLen& result. |
| |
(stile) Specifies
alternation (the OR operator), so that an
expression on either side can match. Precedence is from left-to-right,
as encountered in the expression. |
? |
(question mark)
Specifies that zero or one match of the preceding sub-pattern is allowed.
Cannot be used with a Tag. |
+ |
(plus) Specifies
that one or more matches of the preceding sub-pattern are allowed.
Cannot be used with a Tag. |
* |
(asterisk) Specifies that zero or more matches of
the preceding sub-pattern are allowed. Cannot be used with a Tag. |
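The three quantifiers above can be tried out with a small sketch. It does not use PowerBASIC; Python's re module stands in as an analogue, and the helper name find_pos_len is hypothetical. The helper reports a 1-based position and a length, in the spirit of the start& and iLen& values mentioned above. Returning (0, 0) for "no match" is a convention of this sketch only. re.IGNORECASE is set because the engine described here ignores case unless \c is given.

```python
import re

def find_pos_len(mask, main, start=1):
    """Return (position, length) of the first match, 1-based; (0, 0) if none.

    Hypothetical helper: Python's re is only a stand-in for the engine
    described in this table, so treat the results as illustrative.
    """
    m = re.compile(mask, re.IGNORECASE).search(main, start - 1)
    if m is None:
        return (0, 0)
    return (m.start() + 1, m.end() - m.start())

print(find_pos_len("colou?r", "my color"))  # ? : zero or one "u"  -> (4, 5)
print(find_pos_len("ab+c", "xabbbc"))       # + : one or more "b"  -> (2, 5)
print(find_pos_len("ab*c", "ac"))           # * : zero or more "b" -> (1, 2)
print(find_pos_len("ab+c", "ac"))           # + needs at least one -> (0, 0)
```

Alternation is deliberately left out of this sketch, because its precedence in this engine (described under Tags/sub-patterns) differs from Python's.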
Character classes |
[ ] |
(square brackets)
Identifies a user-defined class of characters, any of which will match:
[abc] will match a, b, or c. Only three special metacharacters are
recognized within a class definition: the caret ^ for complemented characters,
the hyphen - for a range of characters, and the backslash \ in one of the
following escape sequences: |
|
\\ \- \] \e \f
\n \q \r \t \v \x## |
|
Any other use of a backslash within a class
definition produces undefined behavior and should be avoided. |
[-] |
(hyphen)
The hyphen identifies a range of characters to match. For example,
[a-f] will match a, b, c, d, e, or f. |
|
The start and end characters of an individual range must be given
in ascending order of their position in the character set. For example,
[f-a] will match nothing. |
|
Lists of characters, and one or more ranges
of characters, may be intermixed in a single class definition. The
start and end of a range may be specified by a literal character, or one
of the \ backslash escape sequences: |
|
\\ \- \] \e \f
\n \q \r \t \v \x## |
|
Any other use of a backslash within a class
definition produces undefined behavior. |
|
Multiple ranges in a class are valid.
For example, [a-d2-5] matches a, b, c, d, 2, 3, 4, or 5. |
|
When the hyphen is escaped, it is treated
as a literal. For example, [a\-c] is a list, not a range, and matches
a, -, or c due to the \ backslash escape sequence. |
[^] |
(caret)
When the caret appears as the first item in a class definition, it identifies
a complemented class: the listed characters will not match, and any other
character will. For example, [^abc] matches any character except a, b, or c. |
|
A range can also be specified for the complemented
class. For example, [^a-z] matches any character except a through
z. |
|
A caret located in any position other than
the first is treated as a literal character. |
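The class rules above (lists, ranges, an escaped hyphen, a leading caret, and a caret in a later position) can be checked with a short sketch. Python's re module is used as a stand-in, and class_members is a hypothetical helper. Two differences from this table are assumptions to keep in mind: Python is case-sensitive by default, and Python rejects a reversed range such as [f-a] with an error instead of matching nothing.

```python
import re

def class_members(char_class, candidates):
    """Return the candidate characters that the class matches (hypothetical helper)."""
    pattern = re.compile(char_class)
    return [ch for ch in candidates if pattern.fullmatch(ch)]

candidates = "abcdef-^12345"

print(class_members(r"[a-f]", candidates))     # a range
print(class_members(r"[a-d2-5]", candidates))  # two ranges in one class
print(class_members(r"[a\-c]", candidates))    # escaped hyphen: a list, not a range
print(class_members(r"[^abc]", candidates))    # leading caret: complemented class
print(class_members(r"[a^c]", candidates))     # caret elsewhere: a literal
```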
Tags/sub-patterns |
( ) |
(parentheses)
Parentheses are used to match a Tag, or sub-pattern, within the full search
pattern, and remember the match. The matched sub-pattern can be
retrieved later in the mask (or in a replace operation with REGREPL),
with \01 through \99, based upon the left-to-right position of the opening
parentheses. |
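As a rough illustration of retrieving Tags in a replace operation, the sketch below uses Python's re.sub as a stand-in for REGREPL. The helper to_python_template is hypothetical: it rewrites the two-digit references described above (\01 through \99) into Python's \g<n> form, because Python would otherwise read \01 as an octal escape. The exact behavior of REGREPL is not shown here and may differ.

```python
import re

def to_python_template(repl):
    r"""Rewrite two-digit tag references such as \01 as Python's \g<1>.

    Hypothetical helper for this sketch only.
    """
    return re.sub(r"\\(\d\d)", lambda m: r"\g<%d>" % int(m.group(1)), repl)

# Two tags, numbered by the left-to-right position of their opening parentheses.
mask = r"([a-z]+)=([0-9]+)"
template = to_python_template(r"\02=\01")

print(template)
print(re.sub(mask, template, "width=80"))  # swaps the two tagged parts
```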
|
Parentheses may also be used to force precedence
of evaluation with the alternation operator. For example, "(Begin)|(End)File"
would match either "BeginFile" or "EndFile", but without
the Tag designations, "Begin|EndFile" would only match either
"BeginndFile" or "BegiEndFile". |
Escaped characters |
\ |
(backslash)
The escape operator (single-character quote). The following character
will be treated as a literal value rather than being interpreted as a
special character. Note that the character following the backslash
must actually be a special character, as follows: |
\b |
A word boundary.
The start or end of a word, where a word is defined as one or more consecutive
characters, each of which is an alphabetic character (A-Z or a-z), a numeric
character (0-9), or an underscore. For example, "abc_123" is considered
a single word and "abc-123" is considered two words. |
\c |
Case-sensitive
search. Without the \c operator, the default is to ignore case
when matching. Unlike some other implementations of regular expressions,
case-insensitivity is recognized in all operations, even a range of characters
such as "[6-Z]". The \c operator may appear at any position
in the mask. |
\e |
Escape character:
CHR$(27) or $ESC. |
\f |
Formfeed character:
CHR$(12) or $FF. |
\n |
Linefeed (or
new-line) character: CHR$(10) or $LF. |
\q |
Double-quote
mark ("): CHR$(34) or $DQ.
\q is included for ease of inclusion within a literal string. For
example: "\qHello\q". |
\r |
Carriage-return
character: CHR$(13) or $CR. |
\s |
Shortest match
flag: The \s flag causes the shortest matching string to be returned,
rather than the longest (the default). For example, when searching
for the mask "abc.*abc" in "abcdabcabc", the default
setting would return position 1 and length 10. With the \s switch
set, it returns position 1 and length 7. This option may cause a
slight increase in processing time. The \s flag must appear at the
beginning of the mask string. |
\t |
Horizontal
tab character: CHR$(9) or $TAB. |
\v |
Vertical tab
character: CHR$(11) or $VT. |
\x## |
Hex character
code: Indicates that a character code follows, given by two hexadecimal
digits. For example, \xFF = CHR$(&HFF) (which is equivalent
to CHR$(255)). ## must be in the range 00 through FF (0 through 255). |
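The \s example above (position 1 with length 10 versus length 7) can be reproduced in a sketch that uses Python's re module as a stand-in. Python has no \s flag with this meaning; the closest analogue assumed here is the lazy quantifier *?, which applies to a single quantifier rather than to the whole mask, so the two engines may disagree on other inputs. The helper pos_len is hypothetical and reports a 1-based position and a length. The last two lines check the \x## form, which Python happens to share.

```python
import re

def pos_len(pattern, text):
    """Return (position, length) of the first match, 1-based; (0, 0) if none."""
    m = re.search(pattern, text)
    return (m.start() + 1, m.end() - m.start()) if m else (0, 0)

text = "abcdabcabc"

print(pos_len(r"abc.*abc", text))   # longest match (the default)  -> (1, 10)
print(pos_len(r"abc.*?abc", text))  # lazy analogue of \s          -> (1, 7)

# \x## names a character by two hexadecimal digits.
print(re.fullmatch(r"\x41", "A") is not None)       # True: &H41 is "A"
print(re.fullmatch(r"\xFF", chr(255)) is not None)  # True: &HFF is 255
```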