dpromies's Replies | WordPress.org

Forum Replies Created

Viewing 6 replies - 1 through 6 (of 6 total)

Forum: Plugins
In reply to: [Relevanssi - A Better Search] Indexing blocked by special characters

Thread Starter dpromies
(@dpromies)

4 years, 3 months ago

That’s a good question. I see two possible enhancements. First, the punctuation function could check whether a non-empty incoming string ends up empty after going through the punctuation regex. That should never happen, whatever the punctuation filtering is instructed to do. In that case a workaround could be used to handle the non-Unicode characters.

Second, you could give users information about indexed posts that have no terms, either as a statistic on the indexing tab (after the index is built) or as a function on the debugging tab (“check index status of posts”). I think this could be checked quite easily in the database. Of course this will prompt further questions from users, but in my case it would have helped to see that something was going wrong.
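A minimal sketch of the first idea, not Relevanssi's actual code. The function name `remove_punct_guarded` is hypothetical, the pattern is assumed to be `'/[[:punct:]]+/u'`, and the scrubbing step assumes the mbstring extension is available. It relies on the fact that `preg_replace()` returns `null` (which later string handling turns into an empty string) when the subject is not valid UTF-8:

```php
<?php
// Hypothetical helper, not part of Relevanssi: strips punctuation but never
// silently loses the input when the regex engine rejects it.
function remove_punct_guarded(string $text, string $replacement = ' '): string
{
    $result = preg_replace('/[[:punct:]]+/u', $replacement, $text);

    // preg_replace() returns null on error, for example PREG_BAD_UTF8_ERROR
    // when the subject contains bytes that are not valid UTF-8.
    if ($result === null && preg_last_error() === PREG_BAD_UTF8_ERROR) {
        // Assumed workaround: replace invalid byte sequences, then retry.
        $scrubbed = mb_convert_encoding($text, 'UTF-8', 'UTF-8');
        $result   = preg_replace('/[[:punct:]]+/u', $replacement, $scrubbed);
    }

    if ($result === null) {
        // Still failing: leave a hint in the log and keep the original text.
        error_log('remove_punct_guarded: punctuation regex failed, input kept as-is');
        return $text;
    }

    return $result;
}

var_dump(remove_punct_guarded("\xA0Media")); // no longer ends up empty
```

Whether scrubbing or skipping the post is the better reaction is a design choice; the point is only that the failure becomes visible instead of producing a post without terms.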

Forum: Plugins
In reply to: [Relevanssi - A Better Search] Indexing blocked by special characters

Thread Starter dpromies
(@dpromies)

4 years, 3 months ago

I have checked the encoding of the strings and it’s UTF-8. But somehow some non-UTF-8 characters are making their way through the string handling functions. I think they may come from editors working on a Mac, and the server settings are not able to cope with them.

Thank you for the recommendation to use a filter. I will try this. Maybe in a future update you could add some error handling to the relevanssi_remove_punct function? In my case the function just returned an empty string without any hint that the indexed post would not have any terms. Thanks for your help.
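For illustration, one way to spot such stray bytes, using only core PHP functions. The helper name `describe_encoding` is made up for this sketch. An empty pattern with the `u` modifier only matches when the subject is valid UTF-8, and the hex dump makes a lone byte such as `a0` easy to see:

```php
<?php
// Hypothetical debugging helper: reports whether a string is valid UTF-8
// and dumps its bytes as hex so that stray bytes stand out.
function describe_encoding(string $text): string
{
    // preg_match() returns 1 here only if the subject is valid UTF-8;
    // on invalid UTF-8 it returns false.
    $valid = preg_match('//u', $text) === 1;

    return ($valid ? 'valid UTF-8' : 'INVALID UTF-8') . ': ' . bin2hex($text);
}

echo describe_encoding("\xC2\xA0Media"), PHP_EOL; // non-breaking space encoded as UTF-8
echo describe_encoding("\xA0Media"), PHP_EOL;     // lone single-byte non-breaking space
```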

David

Forum: Plugins
In reply to: [Relevanssi - A Better Search] Indexing blocked by special characters

Thread Starter dpromies
(@dpromies)

4 years, 3 months ago

No, it’s the Unicode modifier that is preventing the string from being processed. When I take it away, it nearly works correctly, but it can’t handle the German character “ö” (it gets replaced with a question mark).

With $a = preg_replace( '/[[:punct:]]+/', apply_filters( 'relevanssi_default_punctuation_replacement', ' ' ), $a ); it’s better. There is now just a question mark in the string where the non-breaking space was replaced by a � after $a = html_entity_decode( $a, ENT_QUOTES );
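One possible explanation, which is an assumption and not something confirmed in this thread: `html_entity_decode()` takes an optional third argument for the target charset. When it is omitted, the result depends on the PHP version (ISO-8859-1 before PHP 5.4, UTF-8 from 5.4, the `default_charset` ini setting from 5.6). A sketch of the call with the charset named explicitly, using a sample string chosen for illustration:

```php
<?php
$html = '<p>&nbsp;Media';

// Naming the target charset explicitly removes the dependency on the PHP
// version and on the default_charset ini setting.
$decoded = html_entity_decode($html, ENT_QUOTES, 'UTF-8');

// &nbsp; is now the two-byte UTF-8 sequence c2 a0, so a pattern with the
// u modifier accepts the subject instead of failing.
var_dump(bin2hex($decoded));
$cleaned = preg_replace('/[[:punct:]]+/u', ' ', $decoded);
var_dump($cleaned !== null); // bool(true)
```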

Forum: Plugins
In reply to: [Relevanssi - A Better Search] Indexing blocked by special characters

Thread Starter dpromies
(@dpromies)

4 years, 3 months ago

I think I have found the bug now. It seems to be a server-related charset problem that causes strings not to be handled correctly within the function relevanssi_remove_punct.

I could reproduce the following steps while going through this function:

1) String in Post:
<p>&nbsp;Media

2) String after $a = html_entity_decode( $a, ENT_QUOTES ):
<p>�Media

3) String after $a = preg_replace( '/[[:punct:]]+/u', apply_filters( 'relevanssi_default_punctuation_replacement', ' ' ), $a ):
(empty)

When I use another regular expression instead of '/[[:punct:]]+/u', the function does not fail:

4) String after $a = preg_replace('/\p{P}/', '', $a):
<p>?Media
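
The steps above can be reproduced outside WordPress with a short script. This is a sketch under assumptions: the ISO-8859-1 target charset stands in for whatever the live server actually does, and the `apply_filters()` call is replaced by its default value `' '`:

```php
<?php
// Step 1: the string as stored in the post.
$a = '<p>&nbsp;Media';

// Step 2: decode entities. The ISO-8859-1 target is an assumption used to
// simulate the server; it turns &nbsp; into the single byte 0xA0.
$a = html_entity_decode($a, ENT_QUOTES, 'ISO-8859-1');

// Step 3: with the u modifier, PCRE rejects the subject as invalid UTF-8
// and preg_replace() returns null instead of a string.
$step3 = preg_replace('/[[:punct:]]+/u', ' ', $a);
$error = preg_last_error();
var_dump($step3);                         // NULL
var_dump($error === PREG_BAD_UTF8_ERROR); // bool(true)

// Step 4: without the u modifier the subject is not validated as UTF-8,
// so the call returns a string and the 0xA0 byte passes through.
$step4 = preg_replace('/\p{P}/', '', $a);
var_dump(bin2hex($step4));
```

Checking `preg_last_error()` right after the call is what distinguishes "nothing left after filtering" from "the regex engine refused the input".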
Forum: Plugins
In reply to: [Relevanssi - A Better Search] Indexing blocked by special characters

Thread Starter dpromies
(@dpromies)

4 years, 3 months ago

Honestly, I don’t think that you will be able to reproduce this behaviour. Could you give me a hint where to insert debugging code in the indexing functions to get more information about what is happening with the content?

Forum: Plugins
In reply to: [Relevanssi - A Better Search] Indexing blocked by special characters

Thread Starter dpromies
(@dpromies)

4 years, 3 months ago

Hi Mikko,

Thanks for your response. The error log doesn’t contain any Relevanssi-related errors. But I took a closer look at the source code of the unindexed pages. It seems that in some cases a hardcoded &nbsp; put in by the editors is causing the trouble. I can reproduce it only on my live server. Maybe there is a problem writing this to the database?
