• I asked a similar question quite a while ago and didn’t get any helpful responses.

    I’m trying to replace specific content across my entire website.

        /**
         * Replace content
         */
    
        add_filter('the_content', 'wr_replace_text', 100);
        add_filter('the_excerpt', 'wr_replace_text', 100);
        add_filter('the_title', 'wr_replace_text', 100);
        add_filter('category_description', 'wr_replace_text', 100);
        add_filter('term_description', 'wr_replace_text', 100);
        add_filter('pre_user_description', 'wr_replace_text', 100);
    
        function wr_replace_text($text){
            $replace = array(
                '[myname]' => '<span class="name">MyName</span>',
                '¶' => '<span class="para">¶</span>',
                '...' => '…',
                '(c)' => '©',
                '‹' => '«',
                '›' => '»',
                '„' => '«',
                '“' => '»',
                ' - ' => ' — ',
                ' – ' => ' — ',
                ' –,' => ' —, ',
            );
            $text = str_replace(array_keys($replace), $replace, $text);
            return $text;
        }

    The snippet above is all I’m working with. It works inconsistently. For example, the replacement of [myname] works almost everywhere.

    However, the replacement of the quotes doesn’t always work. On some posts and pages " is transformed to «, and on others it simply doesn’t happen.

    For example: as mentioned above, [myname] works almost everywhere except in my search results. Even though my search results use the_excerpt() to display a teaser text for each result, the [myname] placeholder doesn’t get replaced there. For normal blog teasers on my front page, where I also use the_excerpt(), it does work.

    Why could there be such an inconsistency? Any ideas on that?

    Thank you in advance.

    Matt

Viewing 8 replies - 1 through 8 (of 8 total)
  • Moderator bcworkz

    (@bcworkz)

    This is only speculation, but there may be some character interpretation issues somewhere along the path from the code page on your computer to your server, the PHP interpreter, the SQL database, and out to the client browser. I think you’d get better consistency if you matched any character higher than ascii 127 by its character code, and sent html entity codes instead of the characters themselves to the client browser.

    Thread Starter sepp88

    (@sepp88)

    Thank you for your answer. What would my function look like if I used ascii codes instead of normal characters?

    Moderator bcworkz

    (@bcworkz)

    For example, the array element '‹' => '«' would become chr(8249) => '&laquo;' . Unfortunately, I’m not sure this syntax works for array definitions, but hopefully it at least illustrates the concept.

    I misspoke earlier: this isn’t really checking for ascii codes so much as defining the key string with an ascii code. If you print_r the array, the output would still look like '‹' => '&laquo;'
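
    One way to define the key by character code is sketched below. chr() only accepts byte values 0 to 255, so it cannot return a multi-byte character such as ‹ (code point 8249). The sketch instead decodes a numeric HTML entity into UTF-8 and assumes the text being filtered is UTF-8 as well. The helper name wr_char_from_codepoint is hypothetical and not part of the original snippet.

    ```php
    <?php
    // Hypothetical helper: return the UTF-8 string for a Unicode code point.
    // html_entity_decode() turns a numeric entity such as &#8249; into the
    // matching character in the requested charset.
    function wr_char_from_codepoint($codepoint) {
        return html_entity_decode('&#' . (int) $codepoint . ';', ENT_QUOTES, 'UTF-8');
    }

    // Keys are built from code points, values are HTML entities, so the array
    // does not depend on how this source file itself was saved.
    $replace = array(
        wr_char_from_codepoint(8249) => '&laquo;', // U+2039, single left angle quote
        wr_char_from_codepoint(8250) => '&raquo;', // U+203A, single right angle quote
    );

    $text   = "\xE2\x80\xB9quoted\xE2\x80\xBA"; // UTF-8 bytes for the two quotes around "quoted"
    $result = str_replace(array_keys($replace), $replace, $text);
    echo $result, "\n"; // &laquo;quoted&raquo;
    ```

    On PHP 7.2 or later with the mbstring extension, mb_chr(8249, 'UTF-8') should produce the same key.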

    To truly check by ascii, which shouldn’t be required, you’d have to step through each character one by one and do something like

    if (isset($replace[ord($chr)])) {
        $str = $replace[ord($chr)];
        // then stuff $str back into the original string in place of $chr
    }

    In this case, $replace would be defined with elements like 8249 => '«'
    – indexed with the integers associated with ascii codes instead of strings.
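
    A minimal sketch of that character-by-character lookup follows, assuming UTF-8 input. Because ord() reads a single byte, the sketch splits the text into whole characters with preg_split() and the /u flag, then derives each code point through iconv(). The function name wr_replace_by_codepoint and the two-entry table are illustrative assumptions.

    ```php
    <?php
    // Hypothetical lookup table keyed by Unicode code point.
    $replace = array(
        8249 => '&laquo;', // U+2039
        8250 => '&raquo;', // U+203A
    );

    function wr_replace_by_codepoint($text, array $replace) {
        // With the /u flag preg_split() yields whole UTF-8 characters, not bytes.
        $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
        if ($chars === false) {
            return $text; // invalid UTF-8: leave the text untouched
        }
        $out = '';
        foreach ($chars as $chr) {
            // Convert the character to 4 big-endian bytes and read them as an integer.
            $parts = unpack('N', iconv('UTF-8', 'UTF-32BE', $chr));
            $code  = $parts[1];
            // Swap in the replacement if one is defined, else keep the character.
            $out .= isset($replace[$code]) ? $replace[$code] : $chr;
        }
        return $out;
    }

    $result = wr_replace_by_codepoint("\xE2\x80\xB9a\xE2\x80\xBA", $replace);
    echo $result, "\n"; // &laquo;a&raquo;
    ```

    On PHP 7.2 or later with mbstring, mb_ord($chr, 'UTF-8') could replace the iconv() and unpack() pair.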

    Moderator bcworkz

    (@bcworkz)

    In the last sentence I mean:

    In this case, $replace would be defined with elements like 8249 => '&laquo;'
    – indexed with the integers associated with ascii codes instead of strings.

    The forum parser has fits when I try to show html entities. It finally got fed up and wouldn’t let me edit my post any more.

    Thread Starter sepp88

    (@sepp88)

    Phew, first off, thank you very much. However, since I’m really new to PHP, this is very complicated for me, even though your explanation sounds logical.

    What do you mean by

    // then stuff $str back into the original string in place of $chr

    I’m ashamed to ask, but would you mind implementing your if-statement into my function?

    Thank you in advance!

    Moderator bcworkz

    (@bcworkz)

    I understand. Unfortunately, the script to decode and check by ascii would require a complete rewrite. I didn’t even really see the need, as I thought the chr(8249) bit should have sufficed. To be sure, I set up a little test page.

    You will probably not be surprised when I report that this did not work well. Some odd character code interpretation is going on in PHP. I’m using version 5.3.8 FWIW. I then experimented with ord('‹'); in anticipation of the need for full ascii code interpretation. The function does not return the number you would expect. In fact, ord('›'); (close quote instead of open) returns the same number! Not useful for substituting double quotes.
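
    The identical return values are what one would expect if the characters are stored as UTF-8: ord() only looks at the first byte of the string, and both angle quotes begin with the same byte. A small sketch, assuming UTF-8 encoding (the byte sequences are written out explicitly so the result does not depend on how the file is saved):

    ```php
    <?php
    $open  = "\xE2\x80\xB9"; // U+2039 (single left angle quote) encoded as UTF-8
    $close = "\xE2\x80\xBA"; // U+203A (single right angle quote) encoded as UTF-8

    // ord() returns the value of the first byte only.
    var_dump(ord($open));      // int(226)
    var_dump(ord($close));     // int(226)

    // The two characters differ only in their third byte.
    var_dump(bin2hex($open));  // string(6) "e280b9"
    var_dump(bin2hex($close)); // string(6) "e280ba"
    ```

    str_replace() compares whole byte sequences, so it can still tell the two characters apart as long as the search keys and the text use the same encoding.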

    I played around with the php.ini settings and default_charset (I try to set everything I do to UTF-8 when I can), and also with the iconv settings. I still could not produce reliable results. So as things stand, there is no use trying to implement any of my suggestions. PHP does not behave as I would have expected. Using the actual character (as you originally had done) instead of ascii codes is more reliable, though not fully reliable, as you well know.

    Sorry to give you false hope of a resolution. I’m currently out of ideas on how to resolve this. I’ll post back if I figure anything out.

    halferdev

    sepp88, what is the character set of the page, as defined by the HTTP header, and by any meta tags in the HTML <head>? Make sure this is UTF-8 for both.

    I’m not an expert on character sets, but I would imagine that your files should be edited and saved in UTF-8 too.

    If you think you’re seeing intermittent behaviour, then save the before and after of your replacements to a file, perhaps with a timestamp in the filename, so you can review what input your code is struggling with. You can do this with file_put_contents().
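
    A rough sketch of that logging idea follows. The helper name wr_log_replacement, the file name pattern, and the use of the system temp directory are assumptions for illustration; on a real site a writable directory of your choice would take its place.

    ```php
    <?php
    // Hypothetical helper: write the text before and after replacement to a file.
    function wr_log_replacement($before, $after, $dir) {
        // Timestamp plus a unique suffix so two calls in the same second do not collide.
        $file = $dir . '/wr-replace-' . date('Ymd-His') . '-' . uniqid() . '.log';
        $body = "BEFORE:\n" . $before . "\n\nAFTER:\n" . $after . "\n";
        file_put_contents($file, $body);
        return $file;
    }

    $before  = 'Hello [myname]';
    $after   = str_replace('[myname]', '<span class="name">MyName</span>', $before);
    $logfile = wr_log_replacement($before, $after, sys_get_temp_dir());
    echo $logfile, "\n";
    ```

    Inside wr_replace_text() the same call could sit just before the return statement, with the original $text kept in a second variable for comparison.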

    Moderator bcworkz

    (@bcworkz)

    halferdev makes some good points. The conversion doesn’t work well as it is, and it doesn’t need wrong charsets making things worse.

    I have a partial solution that should work for the conversions to angle quotes you’re having trouble with. If you want to convert the more esoteric characters, say transliterated Vietnamese, this will not work. Basically, first apply htmlentities() to the input, and do str_replace based solely on html entities. An example to convert single angle to double angle quotes:

    $replace = array('&lsaquo;' => '&laquo;',
      '&rsaquo;' => '&raquo;'
    );
    $text = htmlentities($text, ENT_COMPAT, 'UTF-8');
    $text = str_replace(array_keys($replace), $replace, $text);

    This can be extended to work with low and curly quotes as well, or any other character that has an html entity defined.
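
    A possible extension to the low and curly quotes from the original array is sketched below. One caveat: htmlentities() also escapes &, <, > and ", which would turn HTML tags in filtered content into visible text, so this sketch reverses just that part with htmlspecialchars_decode() afterwards. The function name wr_replace_quotes is hypothetical, and UTF-8 input is assumed.

    ```php
    <?php
    function wr_replace_quotes($text) {
        $replace = array(
            '&lsaquo;' => '&laquo;', // single left angle quote
            '&rsaquo;' => '&raquo;', // single right angle quote
            '&bdquo;'  => '&laquo;', // low double quote
            '&ldquo;'  => '&raquo;', // left curly double quote
        );
        // Turn every character that has a named entity into that entity.
        $text = htmlentities($text, ENT_COMPAT, 'UTF-8');
        $text = str_replace(array_keys($replace), $replace, $text);
        // Undo only the escaping of &, <, > and " so existing markup survives.
        return htmlspecialchars_decode($text, ENT_COMPAT);
    }

    // UTF-8 bytes for a low quote and a left curly quote around "hi".
    $result = wr_replace_quotes("<p>\xE2\x80\x9Ehi\xE2\x80\x9C</p>");
    echo $result, "\n"; // <p>&laquo;hi&raquo;</p>
    ```

    Other characters with named entities stay as entities in the output, which is consistent with the earlier suggestion to send entity codes to the browser.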

  • The topic ‘add_filter sometimes working, sometimes not working … simply replacing text’ is closed to new replies.