Title: dpromies's Replies | WordPress.org

---

# dpromies



## Forum Replies Created

Viewing 6 replies - 1 through 6 (of 6 total)

 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[Relevanssi - A Better Search] Indexing blocked by special characters](https://wordpress.org/support/topic/indexing-blocked-by-special-characters/)
 *  Thread Starter [dpromies](https://wordpress.org/support/users/dpromies/)
 * (@dpromies)
 * [4 years, 3 months ago](https://wordpress.org/support/topic/indexing-blocked-by-special-characters/#post-15183307)
 * That’s a good question. I see two possible enhancements. First, the punctuation
   function could check whether a non-empty incoming string becomes empty after
   going through the punctuation regex. This should not happen, whatever the
   punctuation filtering is instructed to do. In that case a workaround could be
   used to handle the non-Unicode characters.
 * Second, you could give users information about indexed posts without terms,
   either as a stat on the indexing tab (after index building) or as a function
   on the debugging tab (“check index status of posts”). I think this could be
   checked quite easily in the database. Of course this will lead to further
   questions from users, but in my case it would have helped to see that something
   was going wrong.
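
A minimal sketch of the first suggestion, assuming PHP 7.1+ with the mbstring extension. The function name `sketch_remove_punct_checked` and its `$warning` parameter are hypothetical and not part of Relevanssi; only the regex comes from this thread. It relies on `preg_replace()` returning `null` when the `/u` modifier meets invalid UTF-8.

```php
<?php
/**
 * Hypothetical helper (not Relevanssi's API): strip punctuation and flag
 * the case where the regex fails instead of silently returning nothing.
 */
function sketch_remove_punct_checked( string $text, ?string &$warning = null ): string {
    $pattern = '/[[:punct:]]+/u';
    $warning = null;
    $result  = preg_replace( $pattern, ' ', $text );

    if ( null === $result ) {
        // With /u, an invalid UTF-8 subject makes preg_replace() return null.
        $warning = 'punctuation regex failed, preg error code ' . preg_last_error();

        // Workaround: let mbstring substitute the invalid bytes, then retry.
        $scrubbed = mb_convert_encoding( $text, 'UTF-8', 'UTF-8' );
        $result   = (string) preg_replace( $pattern, ' ', $scrubbed );
    }

    // A non-empty input that yields nothing indexable is worth reporting too.
    if ( '' !== $text && '' === trim( $result ) ) {
        $warning = $warning ?? 'non-empty input produced no indexable text';
    }

    return $result;
}

// Stand-in for the problematic content: a lone 0xA0 byte is invalid UTF-8.
$out = sketch_remove_punct_checked( "<p>\xA0Media", $warning );
```

Here `$out` still contains "Media" and `$warning` carries a message that could be shown on the indexing or debugging tab.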
 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[Relevanssi - A Better Search] Indexing blocked by special characters](https://wordpress.org/support/topic/indexing-blocked-by-special-characters/)
 *  Thread Starter [dpromies](https://wordpress.org/support/users/dpromies/)
 * (@dpromies)
 * [4 years, 3 months ago](https://wordpress.org/support/topic/indexing-blocked-by-special-characters/#post-15173081)
 * I have checked the encoding of the strings, and it’s UTF-8. But somehow some
   non-UTF-8 characters are making their way through the string handling functions.
   I think they may come from editors working on a Mac, and the server settings
   are not suited to cope with them.
 * Thank you for the recommendation to use a filter; I will try this. Maybe a
   future update could add error handling to the relevanssi_remove_punct function?
   In my case the function just returned an empty string without any hint that
   the indexed post would not have any terms. Thanks for your help.
 * David
 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[Relevanssi - A Better Search] Indexing blocked by special characters](https://wordpress.org/support/topic/indexing-blocked-by-special-characters/)
 *  Thread Starter [dpromies](https://wordpress.org/support/users/dpromies/)
 * (@dpromies)
 * [4 years, 3 months ago](https://wordpress.org/support/topic/indexing-blocked-by-special-characters/#post-15169103)
 * No, it’s the Unicode modifier that is preventing the string from being processed.
   When I take it away it nearly works correctly, but it can’t handle the German
   character “ö” (it replaces it with a question mark).
 * With `$a = preg_replace( '/[[:punct:]]+/', apply_filters( 'relevanssi_default_punctuation_replacement', ' ' ), $a );`
   it’s better. There is now just a question mark in the string, at the place
   where the non-breaking space had been replaced by a � after
   `$a = html_entity_decode( $a, ENT_QUOTES );`
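
One plausible explanation for the � (an assumption; the thread does not confirm it): `html_entity_decode()` is called here without an explicit charset, so since PHP 5.6 it falls back to the `default_charset` setting. If that resolves to ISO-8859-1 on the live server, `&nbsp;` decodes to the single byte 0xA0, which is not valid UTF-8 on its own. The sketch below compares both decodings; the sample string is made up and mbstring is assumed to be available.

```php
<?php
$raw = "&nbsp;Media \xC3\xB6"; // "ö" written as its two UTF-8 bytes

// Same call as in the thread, but with the charset spelled out.
$as_latin1 = html_entity_decode( $raw, ENT_QUOTES, 'ISO-8859-1' );
$as_utf8   = html_entity_decode( $raw, ENT_QUOTES, 'UTF-8' );

// Bytes that &nbsp; was decoded to in each case.
echo bin2hex( substr( $as_latin1, 0, 1 ) ), "\n"; // a0
echo bin2hex( substr( $as_utf8, 0, 2 ) ), "\n";   // c2a0

// Only the UTF-8 decoding is a valid UTF-8 string afterwards.
var_dump( mb_check_encoding( $as_latin1, 'UTF-8' ) ); // bool(false)
var_dump( mb_check_encoding( $as_utf8, 'UTF-8' ) );   // bool(true)
```

If this is the cause, passing `'UTF-8'` as the third argument (or fixing `default_charset` on the server) would keep the string valid for a `/u` regex.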
 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[Relevanssi - A Better Search] Indexing blocked by special characters](https://wordpress.org/support/topic/indexing-blocked-by-special-characters/)
 *  Thread Starter [dpromies](https://wordpress.org/support/users/dpromies/)
 * (@dpromies)
 * [4 years, 3 months ago](https://wordpress.org/support/topic/indexing-blocked-by-special-characters/#post-15162592)
 * I think I have found the bug now. It seems to be a server-related charset problem
   that keeps strings from being handled correctly within the function relevanssi_remove_punct.
 * I could reproduce it by going through this function step by step:
 * 1) String in Post:
    `&nbsp;`Media
 * 2) String after `$a = html_entity_decode( $a, ENT_QUOTES )`:
    <p>�Media
 * 3) String after `$a = preg_replace( '/[[:punct:]]+/u', apply_filters( 'relevanssi_default_punctuation_replacement', '' ),
   $a )`:
    empty
 * When I use another regular expression instead of `'/[[:punct:]]+/u'` the function
   does not fail:
 * 4) String after `$a = preg_replace('/\p{P}/', '', $a)`:
    <p>?Media
    -  This reply was modified 4 years, 3 months ago by [dpromies](https://wordpress.org/support/users/dpromies/).
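
The failure in step 3 can be reproduced outside WordPress. A minimal sketch, using a hard-coded stand-in for the step 2 string (the lone byte 0xA0 is an assumption about what the � represents): with the `/u` modifier PCRE rejects the invalid UTF-8 subject, so `preg_replace()` returns `null` and `preg_last_error()` reports `PREG_BAD_UTF8_ERROR`.

```php
<?php
// Stand-in for the string after html_entity_decode(): 0xA0 alone is invalid UTF-8.
$decoded = "<p>\xA0Media";

// With /u the whole call fails rather than skipping the bad byte.
$with_u         = preg_replace( '/[[:punct:]]+/u', '', $decoded );
$failed_on_utf8 = ( null === $with_u && PREG_BAD_UTF8_ERROR === preg_last_error() );

// Without /u the subject is treated as plain bytes, so the call succeeds.
$without_u = preg_replace( '/[[:punct:]]+/', '', $decoded );

var_dump( $with_u );         // NULL
var_dump( $failed_on_utf8 ); // bool(true)
var_dump( is_string( $without_u ) );
```

A `null` result that is later treated as a string behaves like an empty string, which would match the "empty" result in step 3.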
 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[Relevanssi - A Better Search] Indexing blocked by special characters](https://wordpress.org/support/topic/indexing-blocked-by-special-characters/)
 *  Thread Starter [dpromies](https://wordpress.org/support/users/dpromies/)
 * (@dpromies)
 * [4 years, 3 months ago](https://wordpress.org/support/topic/indexing-blocked-by-special-characters/#post-15158431)
 * Honestly I don’t think that you can reproduce this behaviour. Could you give 
   me a hint where to insert debugging code in the indexing functions to get more
   information about what is happening with the content?
 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[Relevanssi - A Better Search] Indexing blocked by special characters](https://wordpress.org/support/topic/indexing-blocked-by-special-characters/)
 *  Thread Starter [dpromies](https://wordpress.org/support/users/dpromies/)
 * (@dpromies)
 * [4 years, 3 months ago](https://wordpress.org/support/topic/indexing-blocked-by-special-characters/#post-15158003)
 * Hi Mikko,
 * thanks for your response. The error log doesn’t contain any Relevanssi-related
   errors. But I took a closer look at the source code of the unindexed pages. It
   seems that in some cases a hardcoded `&nbsp;` put in by the editors is causing
   the trouble. I can reproduce it only on my live server. Maybe there is a problem
   writing this to the database?
