[nycphp-talk] Regex for P Elements
Dan Cech
dcech at phpwerx.net
Wed Jan 12 09:20:51 EST 2011
On 1/12/2011 9:00 AM, Donald J. Organ IV wrote:
> $blockpattern='/<p*[^>]>.*?/m';
>
> Notice the m after the last / this says it can span multiple
> lines....
Good call, I missed the multiple-line thing. In this situation, though,
you'd actually want the /s modifier, like:
$blockpattern='/<p[^>]*>.*?<\/p>/s';
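A quick way to see the difference is to run the same pattern with each modifier. This is only a sketch: the $html sample string and the variable names are made up for illustration, and they are not from the original question.

```php
<?php
// Hypothetical sample input: one <p> element whose content spans two lines.
$html = "<p class=\"intro\">first line\nsecond line</p>";

// Same pattern, different modifier.
$withM = '/<p[^>]*>.*?<\/p>/m'; // /m only changes how ^ and $ behave
$withS = '/<p[^>]*>.*?<\/p>/s'; // /s lets . match newlines too

// With /m the dot still stops at the newline, so </p> is never reached.
$matchedM = preg_match($withM, $html);

// With /s the dot crosses the newline and the whole element matches.
$matchedS = preg_match($withS, $html, $matches);

var_dump($matchedM, $matchedS);
```

With this input the /m version should find nothing, while the /s version captures the whole element in $matches[0].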
From the manual
[http://php.net/manual/en/reference.pcre.pattern.modifiers.php]:
> s (PCRE_DOTALL) If this modifier is set, a dot metacharacter in the
> pattern matches all characters, including newlines. Without it,
> newlines are excluded. This modifier is equivalent to Perl's /s
> modifier. A negative class such as [^a] always matches a newline
> character, independent of the setting of this modifier.
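The last sentence of that excerpt is worth a small illustration, since it is why the [^>]* part of the pattern already copes with a tag whose attributes are split across lines. A minimal sketch, with a made-up $subject string:

```php
<?php
// Hypothetical subject: "a" and "b" separated by a single newline.
$subject = "a\nb";

// Without /s the dot refuses to match the newline.
$dotDefault = preg_match('/a.b/', $subject);

// With /s the dot matches any character, newline included.
$dotAll = preg_match('/a.b/s', $subject);

// A negated class matches the newline with or without /s.
$negClass = preg_match('/a[^x]b/', $subject);

var_dump($dotDefault, $dotAll, $negClass);
```

So only the .*? between the tags needs /s; the negated class inside the opening tag is unaffected by it.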
The /m modifier is used to control how the ^ and $ characters match:
> m (PCRE_MULTILINE) By default, PCRE treats the subject string as
> consisting of a single "line" of characters (even if it actually
> contains several newlines). The "start of line" metacharacter (^)
> matches only at the start of the string, while the "end of line"
> metacharacter ($) matches only at the end of the string, or before a
> terminating newline (unless D modifier is set). This is the same as
> Perl. When this modifier is set, the "start of line" and "end of
> line" constructs match immediately following or immediately before
> any newline in the subject string, respectively, as well as at the
> very start and end. This is equivalent to Perl's /m modifier. If
> there are no "\n" characters in a subject string, or no occurrences
> of ^ or $ in a pattern, setting this modifier has no effect.
So /m isn't really applicable in this case: the pattern contains no ^ or $
for it to affect.
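For comparison, here is a case where /m does matter, because the pattern is anchored with ^ and $. Again just a sketch, with an invented $text string:

```php
<?php
// Hypothetical subject: three words, one per line.
$text = "one\ntwo\nthree";

// Without /m, ^ and $ anchor to the whole string, which is not a single word.
$countPlain = preg_match_all('/^\w+$/', $text, $plain);

// With /m, ^ and $ also match just after and just before each newline.
$countMulti = preg_match_all('/^\w+$/m', $text, $multi);

var_dump($countPlain, $countMulti, $multi[0]);
```

The unanchored <p> pattern never consults line boundaries, so adding or removing /m there changes nothing.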
Dan