PCRE –– examples 2
written by: admin
Date Written: 2/14/10
Last Updated: 11/7/19
replace content between anchor tags –– example 1
$text=preg_replace('/(<a.+>)(.+)(<\/a>)/Ue',"'$1'.str_replace('-','–','$2').'$3'",$text);
This was really hard to figure out. It uses the e modifier so that the replacement string is evaluated as PHP code. Notice that each captured subpattern is placed within single quotes and the pieces are concatenated with periods. This is also the first time I have used the U modifier. The U modifier reverses the greediness of the quantifiers: they are ungreedy by default when it is set, and a trailing ? makes a quantifier greedy (a maximizer) instead of ungreedy (a minimizer).
Note: the e modifier was deprecated in PHP 5.5.0 and removed in PHP 7.0.0; use preg_replace_callback instead.
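On current PHP versions the same replacement can be sketched with preg_replace_callback. This is a minimal sketch rather than the original script: the sample string and the example.com URL are made up for illustration, and the pattern is the one above with only the e removed.

```php
<?php
// Sample input (hypothetical, for illustration only).
$text = 'see <a href="http://example.com/a-b">yo-me-yo</a> and x-y';

// Same pattern as above without the e modifier; U still makes the quantifiers ungreedy.
$text = preg_replace_callback(
    '/(<a.+>)(.+)(<\/a>)/U',
    function (array $m): string {
        // $m[1] = opening tag, $m[2] = link text, $m[3] = closing tag
        return $m[1] . str_replace('-', '–', $m[2]) . $m[3];
    },
    $text
);

echo $text;
```

The callback receives the captured groups as an array, so nothing has to be quoted and evaluated as a string of PHP code.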
Replace pattern between anchor tags –– example 2
This is another variation of the example above, but without using the e modifier. Basically, it will find and replace a pattern found within a pattern.
<?php
$text="hi-ll-upu<a href=\"http://www.animev-iews.com/index.php\">yo-me-yo</a>tipsy-me";
$text=preg_replace('/-(?=((?!<\/?a\b|>).)*<\/a)/','XX',$text);
echo "$text";
?>
produces (as raw HTML):
hi-ll-upu<a href="http://www.animev-iews.com/index.php">yoXXmeXXyo</a>tipsy-me
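The lookahead in this pattern may be easier to follow when it is spelled out with the x modifier, which allows whitespace and comments inside the pattern. The sketch below is meant to behave the same as the one-line version; the ~ delimiter and the non-capturing group are my own choices, not part of the original.

```php
<?php
$text = 'hi-ll-upu<a href="http://www.animev-iews.com/index.php">yo-me-yo</a>tipsy-me';

// The ~ delimiter avoids having to escape the slash in </a.
$pattern = '~
    -                        # the hyphen to replace
    (?=                      # keep it only if, looking ahead from here,
        (?:
            (?! </?a\b | > )   # the next character does not open or close an anchor tag and is not ">"
            .                  # step over that one character
        )*
        </a                  # and the scan stops at a closing anchor tag
    )
~x';

$result = preg_replace($pattern, 'XX', $text);
echo $result;
```

Stopping at ">" is what keeps hyphens inside the href attribute untouched: from there the scan hits the end of the opening tag before it can reach </a.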
replace a pattern that is found between tags and is not preceded or followed by a period
example 1
<?php
$text = "<strong>34D-23.48</strong>";
$text=preg_replace('/>\s?([\w\-]{1,}\.?)(?=((?!<\/?strong\b|>|\.\s?<).)*<\/strong)/','>stock # $1',$text);
echo "$text";
?>
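matches:
<strong>34D-23.48</strong>
<strong>34d</strong>
<strong>34-435k53.fr4.4</strong>
<strong>34d </strong>
<strong> 34d</strong>
The following fail:
<strong>.34d</strong>
<strong> .34d</strong>
The following are meant to fail too, but with the pattern as written they still match, because the optional \.? consumes the trailing period before the lookahead runs:
<strong>34d.</strong>
<strong>34d. </strong>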
I like this example because it demonstrates a workaround for needing a lookbehind and a lookahead at the same time. Here the leading > is matched and put back by the replacement instead of being checked with a lookbehind. The need for a lookbehind arises when you are looking for a pattern within a pattern, which is more complicated than finding a pattern between a simple match as demonstrated in example 2 below.
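A quick way to see what the pattern does with each kind of input is to run it over a handful of samples. The following is a small test sketch; the sample list is a subset of the inputs discussed above, with the closing tags written out in full.

```php
<?php
$pattern = '/>\s?([\w\-]{1,}\.?)(?=((?!<\/?strong\b|>|\.\s?<).)*<\/strong)/';

// Sample inputs to try against the pattern.
$samples = [
    '<strong>34D-23.48</strong>',
    '<strong>34d</strong>',
    '<strong> 34d</strong>',
    '<strong>.34d</strong>',
    '<strong>34d.</strong>',
];

$results = [];
foreach ($samples as $sample) {
    // An unchanged string means the pattern did not match.
    $results[$sample] = preg_replace($pattern, '>stock # $1', $sample);
    echo $sample, ' => ', $results[$sample], PHP_EOL;
}
```

Running the samples is more reliable than reasoning about them by hand, since the greedy \.? and the lookahead interact in ways that are easy to misjudge.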
example 2
This example is very similar but easier to understand. The only differences are that it requires a period somewhere in the middle, rather than treating it as optional, and that it is stricter about being within <strong> tags.
<?php
$text = "<strong>34D-23.48</strong>";
$text=preg_replace('/>\s?([\w\-]{1,}\.[\w\-]{1,})(?=((?!<\/?strong\b|>|\.\s?<).)*<\/strong)/','>stock # $1',$text);
echo "$text";
?>
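To check the effect of the required period, the pattern can be tried against a few inputs with preg_match. This is only a sketch; the three samples are chosen for illustration.

```php
<?php
$pattern = '/>\s?([\w\-]{1,}\.[\w\-]{1,})(?=((?!<\/?strong\b|>|\.\s?<).)*<\/strong)/';

$samples = [
    '<strong>34D-23.48</strong>',  // period in the middle
    '<strong>34d</strong>',        // no period at all
    '<strong>34d.</strong>',       // period only at the end
];

$matched = [];
foreach ($samples as $sample) {
    // preg_match() returns 1 on a match and 0 otherwise
    $matched[$sample] = preg_match($pattern, $sample) === 1;
    echo $sample, ' => ', $matched[$sample] ? 'match' : 'no match', PHP_EOL;
}
```

Because the period must be followed by at least one more word character or hyphen, a value with a trailing period or with no period at all is rejected.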
Replace pattern that is not between anchor tags
<?php
$text="hi-ll-upu<a href=\"http://www.anime-views.com/index.php\">yo-me-yo</a>tipsy-me";
$text=preg_replace('/-(?!((?!<\/?a\b).)*<\/a)/','XX',$text);
echo "$text";
?>
produces (as raw HTML):
hiXXllXXupu<a href="http://www.anime-views.com/index.php">yo-me-yo</a>tipsyXXme
The dashes that are replaced are not between the anchor tags or within them.
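Putting the positive and negative lookahead versions side by side on the same input makes the difference visible. A small sketch; the variable names $inside and $outside are mine.

```php
<?php
$text = 'hi-ll-upu<a href="http://www.anime-views.com/index.php">yo-me-yo</a>tipsy-me';

// Positive lookahead: only hyphens in the link text are replaced.
$inside = preg_replace('/-(?=((?!<\/?a\b|>).)*<\/a)/', 'XX', $text);

// Negative lookahead: only hyphens that cannot reach a closing anchor tag
// without crossing another anchor tag are replaced.
$outside = preg_replace('/-(?!((?!<\/?a\b).)*<\/a)/', 'XX', $text);

echo $inside, PHP_EOL, $outside, PHP_EOL;
```

The hyphen in the URL is left alone by both patterns: the first stops at the ">" that ends the opening tag, and the second treats everything before </a as being inside the link.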
replace hyphens with dashes
$text=preg_replace("/-(?!((?!<\/?(script|style|a|object|iframe)\b).)*(<\/(script|style|a>|object|iframe)))/is","–",$text);
$text=preg_replace('/-(?=((?!<\/?a\b|>).)*<\/a)/','–',$text);
$text=preg_replace_callback('/(style=(\"|\'))(.+)(\'|\")/U',fn($m)=>$m[1].str_replace('–','-',$m[3]).$m[4],$text);
$text=preg_replace_callback('/(<button\sonclick=(\"|\'))(.+)(\'|\")/U',fn($m)=>$m[1].str_replace('–','-',$m[3]).$m[4],$text);
This script replaces hyphens with dashes unless they are part of inline styling, JavaScript, a button's onclick handler, CSS, an iframe, an object, or the URL of a hyperlink. It does still work on the hyperlinked text. One limitation is that it stops executing if the string is approximately 11,633 characters in length. I still use this script because I have a workaround in place to selectively disable it if needed. You can try it out here.
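The length cutoff looks like PCRE giving up on a long subject (for example by hitting pcre.backtrack_limit), though that is an assumption and not something verified here. What is certain is that preg_replace returns NULL when PCRE reports an error, so one possible guard is to fall back to the untouched text in that case. The helper name safe_preg_replace is hypothetical.

```php
<?php
// Hypothetical helper: run a replacement, but keep the original text
// if PCRE reports an error (preg_replace() returns null in that case).
function safe_preg_replace(string $pattern, string $replacement, string $subject): string
{
    $result = preg_replace($pattern, $replacement, $subject);
    if ($result === null) {
        // preg_last_error() holds one of the PREG_*_ERROR constants here,
        // e.g. PREG_BACKTRACK_LIMIT_ERROR.
        return $subject;
    }
    return $result;
}

$text = safe_preg_replace('/-(?=((?!<\/?a\b|>).)*<\/a)/', '–', '<a href="#">yo-me-yo</a>');
echo $text;
```

Without a check like this, assigning the NULL result back to $text silently throws the whole page content away.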
Convert BB Code to HTML Code using recursion
<?php
$text = '[one]text [two]text [three]text[/thre] text[/two] text[/one]';
$regexp = '{((\[([^\]]+)\])((?:(?:(?!\[/?\3\]).)*+|(?1))*)(\[/\3\]))}si';
while (preg_match($regexp, $text)) {
    $text = preg_replace($regexp, '<$3>$4</$3>', $text);
}
echo $text;
?>
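Recursion is difficult to understand, and good examples are hard to find. In the pattern, (?1) recurses into capture group 1 (a whole tag pair), and \3 requires the closing tag name to match the opening one. The above will output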
<one>text <two>text [three]text[/thre] text</two> text</one>
Mismatched BB Code is ignored.
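A smaller recursion example may help before tackling the BB Code pattern. The sketch below uses the common balanced-parentheses pattern, which is not from the original script; the sample string is made up.

```php
<?php
// (?1) re-enters capture group 1, so every "(" has to be closed by its own ")".
$pattern = '/(\((?:[^()]++|(?1))*\))/';

preg_match_all($pattern, 'f(a(b)c) (d', $matches);
print_r($matches[0]);
```

The unbalanced "(d" at the end is skipped, in the same way the mismatched [three]...[/thre] pair is left alone above.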
TAGS: pcre, php