:: DEVELOPER ZONE
A regular expression is a powerful way of specifying a pattern for a complex search.
MySQL uses Henry Spencer's implementation of regular expressions,
which is aimed at conformance with POSIX 1003.2. See
Appendix B, Credits. MySQL uses the extended version to support
pattern-matching operations performed with the
REGEXP operator in SQL statements. See
Section 3.3.4.7, “Pattern Matching”.
This appendix is a summary, with examples, of the special characters
and constructs that can be used in MySQL for REGEXP
operations. It does not contain all the details that can be found in
Henry Spencer's regex(7) manual page. That manual
page is included in MySQL source distributions, in the
regex.7 file under the regex
directory.
A regular expression describes a set of strings. The simplest regular
expression is one that has no special characters in it. For example,
the regular expression hello matches
hello and nothing else.
Non-trivial regular expressions use certain special constructs so that
they can match more than one string. For example, the regular
expression hello|word matches either the string
hello or the string word.
As a more complex example, the regular expression
B[an]*s matches any of the strings
Bananas, Baaaaas,
Bs, and any other string starting with a
B, ending with an s, and
containing any number of a or n
characters in between.
A regular expression for the REGEXP operator may
use any of the following special characters and constructs:
^
Match the beginning of a string.
mysql> SELECT 'fo\nfo' REGEXP '^fo$'; -> 0 mysql> SELECT 'fofo' REGEXP '^fo'; -> 1
$
Match the end of a string.
mysql> SELECT 'fo\no' REGEXP '^fo\no$'; -> 1 mysql> SELECT 'fo\no' REGEXP '^fo$'; -> 0
.
Match any character (including carriage return and newline).
mysql> SELECT 'fofo' REGEXP '^f.*$'; -> 1 mysql> SELECT 'fo\r\nfo' REGEXP '^f.*$'; -> 1
a*
Match any sequence of zero or more a characters.
mysql> SELECT 'Ban' REGEXP '^Ba*n'; -> 1 mysql> SELECT 'Baaan' REGEXP '^Ba*n'; -> 1 mysql> SELECT 'Bn' REGEXP '^Ba*n'; -> 1
a+
Match any sequence of one or more a characters.
mysql> SELECT 'Ban' REGEXP '^Ba+n'; -> 1 mysql> SELECT 'Bn' REGEXP '^Ba+n'; -> 0
a?
Match either zero or one a character.
mysql> SELECT 'Bn' REGEXP '^Ba?n'; -> 1 mysql> SELECT 'Ban' REGEXP '^Ba?n'; -> 1 mysql> SELECT 'Baan' REGEXP '^Ba?n'; -> 0
de|abc
Match either of the sequences de or
abc.
mysql> SELECT 'pi' REGEXP 'pi|apa'; -> 1 mysql> SELECT 'axe' REGEXP 'pi|apa'; -> 0 mysql> SELECT 'apa' REGEXP 'pi|apa'; -> 1 mysql> SELECT 'apa' REGEXP '^(pi|apa)$'; -> 1 mysql> SELECT 'pi' REGEXP '^(pi|apa)$'; -> 1 mysql> SELECT 'pix' REGEXP '^(pi|apa)$'; -> 0
(abc)*
Match zero or more instances of the sequence abc.
mysql> SELECT 'pi' REGEXP '^(pi)*$'; -> 1 mysql> SELECT 'pip' REGEXP '^(pi)*$'; -> 0 mysql> SELECT 'pipi' REGEXP '^(pi)*$'; -> 1
{1} , {2,3}
{n} or {m,n} notation provides
a more general way of writing regular expressions that match many
occurrences of the previous atom (or ``piece'') of the pattern.
m and n are integers.
a*
Can be written as a{0,}.
a+
Can be written as a{1,}.
a?
Can be written as a{0,1}.
To be more precise, a{n} matches exactly
n instances of a.
a{n,} matches n or more
instances of a. a{m,n} matches
m through n instances of
a, inclusive.
m and n must be in the range
from 0 to RE_DUP_MAX (default
255), inclusive. If both m and
n are given, m must be less
than or equal to n.
mysql> SELECT 'abcde' REGEXP 'a[bcd]{2}e'; -> 0
mysql> SELECT 'abcde' REGEXP 'a[bcd]{3}e'; -> 1
mysql> SELECT 'abcde' REGEXP 'a[bcd]{1,10}e'; -> 1
[a-dX] , [^a-dX]
Matches any character that is (or is not, if ^ is used) either
a, b, c,
d or X. A -
character between two other characters forms a range that matches all
characters from the first character to the second. For example,
[0-9] matches any decimal digit. To include a
literal ] character, it must immediately follow
the opening bracket [. To include a literal
- character, it must be written first or last. Any
character that does not have a defined special meaning inside a
[] pair matches only itself.
mysql> SELECT 'aXbc' REGEXP '[a-dXYZ]'; -> 1 mysql> SELECT 'aXbc' REGEXP '^[a-dXYZ]$'; -> 0 mysql> SELECT 'aXbc' REGEXP '^[a-dXYZ]+$'; -> 1 mysql> SELECT 'aXbc' REGEXP '^[^a-dXYZ]+$'; -> 0 mysql> SELECT 'gheis' REGEXP '^[^a-dXYZ]+$'; -> 1 mysql> SELECT 'gheisa' REGEXP '^[^a-dXYZ]+$'; -> 0
[.characters.]
Within a bracket expression (written using [ and
]), matches the sequence of characters of that
collating element. characters is either a single
character or a character name like newline. You
can find the full list of character names in the
regexp/cname.h file.
mysql> SELECT '~' REGEXP '[[.~.]]'; -> 1 mysql> SELECT '~' REGEXP '[[.tilde.]]'; -> 1
[=character_class=]
Within a bracket expression (written using [ and
]), [=character_class=]
represents an equivalence class. It matches all characters with the
same collation value, including itself. For example, if
o and (+) are the members of an
equivalence class, then [[=o=]],
[[=(+)=]], and [o(+)] are all
synonymous. An equivalence class may not be used as an endpoint of a
range.
[:character_class:]
Within a bracket expression (written using [ and
]), [:character_class:]
represents a character class that matches all characters belonging to
that class. The standard class names are:
alnum
|
Alphanumeric characters |
alpha
|
Alphabetic characters |
blank
|
Whitespace characters |
cntrl
|
Control characters |
digit
|
Digit characters |
graph
|
Graphic characters |
lower
|
Lowercase alphabetic characters |
print
|
Graphic or space characters |
punct
|
Punctuation characters |
space
|
Space, tab, newline, and carriage return |
upper
|
Uppercase alphabetic characters |
xdigit
|
Hexadecimal digit characters |
These stand for the character classes defined in the
ctype(3) manual page. A particular locale may
provide other class names. A character class may not be used as an
endpoint of a range.
mysql> SELECT 'justalnums' REGEXP '[[:alnum:]]+'; -> 1 mysql> SELECT '!!' REGEXP '[[:alnum:]]+'; -> 0
[[:<:]] , [[:>:]]
These markers stand for word boundaries. They match the beginning and
end of words, respectively. A word is a sequence of word characters
that is not preceded by or followed by word characters. A word
character is an alphanumeric character in the
alnum class or an underscore
(_).
mysql> SELECT 'a word a' REGEXP '[[:<:]]word[[:>:]]'; -> 1 mysql> SELECT 'a xword a' REGEXP '[[:<:]]word[[:>:]]'; -> 0
To use a literal instance of a special character in a regular
expression, precede it by two backslash (\) characters. The MySQL
parser interprets one of the backslashes, and the regular expression
library interprets the other. For example, to match the string
1+2 that contains the special +
character, only the last of the following regular expressions is the
correct one:
mysql> SELECT '1+2' REGEXP '1+2'; -> 0 mysql> SELECT '1+2' REGEXP '1\+2'; -> 0 mysql> SELECT '1+2' REGEXP '1\\+2'; -> 1
© 1995-2005 MySQL AB. All rights reserved.
| Provided by Onino - product information, price comparison and reviews on: Software, Radar Detectors, DVD Recorders, Music, Garden, Kettle Shop, and more. Also see Electrical Goods and Computing Shop. |

User Comments
Warning: query failed: Unknown column 'user.firstname' in 'field list' in /data0/sites/live/web-main/lib/mysql-cxn.php on line 69
Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource in /data0/sites/live/web-main/lib/docbook.php on line 245
Add your own comment.