WordPress.org

Ready to get started?Download WordPress

Forums

Subscribe2
[resolved] PCRE replacements - possibly buggy (4 posts)

  1. jimshell
    Member
    Posted 1 year ago #

    Hello,

    Just a quick thought on some PCRE replacements I bumped into while looking for something else in "class-s2-core.php" :

    $plaintext = preg_replace('|<s*>(.*)<\/s>|','', $plaintext);
    $plaintext = preg_replace('|<strike*>(.*)<\/strike>|','', $plaintext);
    $plaintext = preg_replace('|<del*>(.*)<\/del>|','', $plaintext);
    1. not sure what "<s*>", "<strike*>", and "<del*>" are supposed to do; shouldn't it be "<s[^>]*>", "<strike[^>]*>", and "<del[^>]*>" instead?
    2. as un undefined number of characters (except newlines) are used as "(.*)", the regexp should be ungreedy - otherwise everything could be deleted between the first opening tag and the last closing tag, which isn't what is wanted (need to remove what's between one opening tag and the matching, immediately following, closing tag, whatever the number of times those tags are used in the string)

    So I guess it should be:

    $plaintext = preg_replace('|<s[^>]*>(.*)<\/s>|Ui','', $plaintext);
    $plaintext = preg_replace('|<strike[^>]>(.*)<\/strike>|Ui','', $plaintext);
    $plaintext = preg_replace('|<del[^>]>(.*)<\/del>|Ui','', $plaintext);

    (used the "i" flag in addition to the "U" one as it makes the regexp case-insensitive, which could be handy in some cases?)

    It that right? Or did I miss somehting?

    Thank you for your excellent plugin!

    Jimshell

    http://wordpress.org/extend/plugins/subscribe2/

  2. mattyrob
    Member
    Plugin Author

    Posted 1 year ago #

    @jimshell,

    I don't even remotely class myself as an expert in regular expressions - rather I tend to stumble and fumble around until I find something that works!

    I'll give your suggestions a whirl in my test code though as I think I understand what you are saying and I like the case insensitive part too.

  3. jimshell
    Member
    Posted 1 year ago #

    Hi MattyRob!

    I'm not a regexp expert either - and they do require time and patience till you find what works and outputs the expected results!

    Noticed I wrote a wrong explanation on point 2! Without the "U" flag, the regexp (let's consider "|<strike*>(.*)<\/strike>|") would actually erase everything from the first "<strike" as (.*) does match any character - thus matches </strike> as well, which means it would not stop at it - until it reaches a newline ("\n", as "." doesn't match newlines) or the end of the string.

    For instance, applying the replacement on this string:

    Hello <strike>people</strike> John, glad to see you! <strike>What's up?</strike> How are you?

    would only output:

    Hello

    On a string with one or several newlines, my guess is that it would not do anything - if the newline or newlines are after a "<strike" opening tag, but this should be tested to be sure ;-)

    Using the "U" flag turns the expression into an ungreedy one, which makes the pattern mean "anything but \n between <strike [possible attributes ?]> and the closest next </strike>.

    I remember I had very tough times with regular expressions before I discovered the ungreedy behaviour!

    I decided to let you know about my thoughts on this point because I assume that the use of <del>, <strike> or such may be pretty rare, and thus that this could be a very "silent" hard to reproduce possible issue.

    Hope that helps anyway, and again: many thanks for your awesome plugin!

    Cheers
    Jimshell

  4. mattyrob
    Member
    Plugin Author

    Posted 1 year ago #

    @jimshell,

    I appreciate your time and explanation, I'll get your changes into the next version and stop any potential issues before they arise.

Topic Closed

This topic has been closed to new replies.

About this Plugin

About this Topic

Tags

No tags yet.