If you want to use regular expressions in your PHP program, the best way is to use the so-called preg functions. They wrap the Perl-Compatible Regular Expressions library, so they are sometimes called PCRE functions. There are other function sets, such as ereg (deprecated in PHP 5.3 and removed in PHP 7) and mb_ereg, but this article focuses on the preg functions only.
Does it match?
OK, let’s look at how the preg functions work. We will start with preg_match(), which returns 1 if the pattern matches, 0 if it does not, and false if an error occurs. The first parameter is the regex string and the second is the input string:
preg_match('/java/', "regex in php");
returns 0, but
preg_match('/p/', "regex in php");
returns 1. Why not 2? Because preg_match() always stops after the first match; if you need to continue, use preg_match_all().
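The difference between the two return values can be sketched as follows. The variable names are only illustrative; the strict comparison is there because 0 (no match) and false (error) are both falsy.

```php
<?php
$subject = "regex in php";

// preg_match() stops at the first match, so the result is 1, 0, or false.
$found    = preg_match('/p/', $subject);     // 1
$notFound = preg_match('/java/', $subject);  // 0

// preg_match_all() keeps going and returns the number of full matches.
$count = preg_match_all('/p/', $subject, $all);  // 2

// Use === so that 0 (no match) is not confused with false (error).
if ($found === 1) {
    echo "found, total occurrences: $count\n";
}
```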
If you need to retrieve the captured parenthesized subpatterns, specify a third parameter:
preg_match('/(\w+)\s(\w+)\s+(\w+)/', "regex in php", $matches);
fills the $matches array with the following values: “regex in php” (the full match), “regex”, “in” and “php”.
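A minimal sketch of reading the captured groups, assuming a pattern with three groups: index 0 holds the full match and the groups follow in order. The variable names are illustrative.

```php
<?php
$matches = [];

if (preg_match('/(\w+)\s(\w+)\s+(\w+)/', "regex in php", $matches) === 1) {
    // $matches[0] is the full match; $matches[1..3] are the three groups.
    list($full, $first, $second, $third) = $matches;
    echo "$full | $first | $second | $third\n";
}
```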
As mentioned before, if you need all the matches (not just the subpatterns of the first one), use preg_match_all():
preg_match_all('/\w+/', "regex in php", $matches);
returns 3 and fills $matches[0] with every full match: “regex”, “in”, “php”. There is no “regex in php” entry here, because \w+ matches one word at a time.
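When the pattern has groups, preg_match_all() returns a nested array. The sketch below shows the default layout and the alternative PREG_SET_ORDER layout; the pattern with a single group is just an illustration.

```php
<?php
$subject = "regex in php";

// Default layout (PREG_PATTERN_ORDER): one sub-array per group.
// $byPattern[0] holds the full matches, $byPattern[1] the first group.
preg_match_all('/(\w)\w*/', $subject, $byPattern);

// PREG_SET_ORDER: one sub-array per match, each holding full match + groups.
preg_match_all('/(\w)\w*/', $subject, $bySet, PREG_SET_ORDER);

print_r($byPattern);
print_r($bySet);
```

PREG_SET_ORDER is usually more convenient when you loop over the matches one by one.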
What does match?
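If you need to select the array elements that match a pattern, use preg_grep(). In the simple case it takes only two arguments: a pattern string and an array of input strings. For example:
preg_grep('/php/', array("regex in php", "regex in java", "php language"));
returns the two elements that contain “php”: “regex in php” and “php language”. Their original keys (0 and 2) are preserved. With the PREG_GREP_INVERT flag as a third argument you get the elements that do not match instead:
preg_grep('/php/', array("regex in php", "regex in java", "php language"), PREG_GREP_INVERT);
returns “regex in java” (under key 1).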
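Because preg_grep() keeps the keys of the input array, the result may have gaps in its indexes. A small sketch, with illustrative variable names, that also reindexes the result with array_values():

```php
<?php
$input = array("regex in php", "regex in java", "php language");

// Elements that match; keys 0 and 2 are kept from the input.
$with = preg_grep('/php/', $input);

// Elements that do not match; key 1 is kept.
$without = preg_grep('/php/', $input, PREG_GREP_INVERT);

// Reindex from 0 if the gaps in the keys are a problem.
$reindexed = array_values($with);

print_r($with);
print_r($without);
print_r($reindexed);
```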
Want to replace?
To replace all the matches, use preg_replace(). It takes the pattern, the replacement and the input string:
preg_replace('/(\w+) (\d{1,2})th/i', '$2 $1', 'March 12th');
returns “12 March”. Here the month and the day are captured as numbered groups ($1 and $2), and the replacement swaps their order.
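As a sketch of the optional arguments: the fourth parameter limits the number of replacements (-1 means no limit) and the fifth receives the number of replacements made. The ${2} form is the same reference as $2, written so that it can be followed directly by text. The second input string is made up for illustration.

```php
<?php
$pattern = '/(\w+) (\d{1,2})th/i';

// Swap the two groups.
$swapped = preg_replace($pattern, '$2 $1', 'March 12th');

// ${2} keeps the group reference separate from the literal "th".
$rewritten = preg_replace(
    $pattern,
    '${2}th of $1',
    'March 12th and April 5th',
    -1,       // no limit on the number of replacements
    $count    // filled with the number of replacements done
);

echo "$swapped\n$rewritten ($count replacements)\n";
```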
Want to split?
To split a string, use preg_split(). Let’s split a list whose delimiter is any run of dots, commas, semicolons or whitespace characters:
preg_split('/[.,;\s]+/', 'jan,feb mar. apr, may;june');
returns an array of 6 elements: ‘jan’, ‘feb’, ‘mar’, ‘apr’, ‘may’, ‘june’.
The important thing here is that the input can mix different delimiters and repeat a delimiter several times in a row.
Note: If you don't need the power of regex, PHP has faster functions: explode() splits on a fixed delimiter string, and str_split() cuts a string into fixed-length chunks.
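If the input starts or ends with a delimiter, preg_split() returns an empty string at that position. The sketch below shows this with a list that has a leading space, how the PREG_SPLIT_NO_EMPTY flag removes the empty piece, and what explode() does with a single fixed delimiter. The sample strings are illustrative.

```php
<?php
$list = ' jan,feb mar. apr, may;june';   // note the leading space

// The leading space is a delimiter, so the first piece is an empty string.
$raw = preg_split('/[.,;\s]+/', $list);

// -1 means "no limit"; PREG_SPLIT_NO_EMPTY drops empty pieces.
$clean = preg_split('/[.,;\s]+/', $list, -1, PREG_SPLIT_NO_EMPTY);

// With one fixed delimiter, explode() is enough.
$simple = explode(',', 'jan,feb,mar');

print_r($raw);
print_r($clean);
print_r($simple);
```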
Regex Pattern Modifiers
Sometimes you need to change how a regex behaves. To do that, add a letter (a pattern modifier) after the closing delimiter. For example:
i – makes search case-insensitive:
preg_match('/SQL query/i', "the sql query process");
returns 1
m – enables “multi-line mode”. In this mode “start of line” (^) and “end of line” ($) also match at the beginning and end of each line, not only at the boundaries of the whole string:
preg_match_all('/^.+$/m', "line 1 \n line 2", $matches);
fills $matches[0] with “line 1 ” and “ line 2” (the spaces around the newline belong to the lines). With the ‘m’ modifier, /^.+$/ considers line boundaries for a match rather than string boundaries.
s – enables “dot matches all” mode, in which a dot matches every character, including newlines (otherwise newlines are excluded):
preg_match_all('/^.+$/s', "line 1 \n line 2", $matches);
fills $matches[0] with one element: the whole string “line 1 \n line 2”.
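To contrast the m and s modifiers with the default behaviour, here is a small sketch that runs the same pattern three times on a two-line string. The string and variable names are illustrative.

```php
<?php
$text = "line 1\nline 2";

// No modifier: ^ and $ anchor to the whole string and the dot stops at
// the newline, so the pattern cannot match at all.
$plain = preg_match_all('/^.+$/', $text, $a);

// m: ^ and $ also match around each newline, giving one match per line.
$multi = preg_match_all('/^.+$/m', $text, $b);

// s: the dot matches the newline too, giving one match for everything.
$dotAll = preg_match_all('/^.+$/s', $text, $c);

echo "$plain $multi $dotAll\n";
```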
x – enables “free-spacing mode” (In this mode, whitespace between regex tokens is ignored, and an unescaped # starts a comment):
preg_match('/reg ex # regex sample/x', 'regex', $matches);
fills $matches with “regex”.
U – inverts greediness: quantifiers become lazy by default and greedy only when followed by “?”:
preg_match_all('/.+/U', 'navi', $matches);
fills $matches[0] with “n”, “a”, “v” and “i”, since + is lazy here and matches one character at a time.
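Modifiers can be combined by writing several letters after the closing delimiter. The sketch below combines i and m, and also shows how U flips the meaning of “?” after a quantifier. The sample strings are illustrative.

```php
<?php
// i and m together: case-insensitive, and ^/$ match at every line.
$text = "First line\nsecond LINE";
$lines = preg_match_all('/^\w+ line$/im', $text, $m);

// U flips greediness: + becomes lazy, +? becomes greedy.
preg_match('/a+/U', 'aaa', $lazy);
preg_match('/a+?/U', 'aaa', $greedy);

print_r($m[0]);
echo $lazy[0], " ", $greedy[0], "\n";
```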
Want more?
If you want to test regexes online, try one of the online regex testers available for different regex engines.
The PHP manual has a more comprehensive reference on the PCRE functions.
It also lists the whole set of PCRE pattern modifiers.
Other Resources
- PHP’s online documentation at http://www.php.net/pcre.
- http://www.phpbuilder.com/columns/dario19990616.php3