Lemmas are now transliterated to ASCII with a single transliteration table, used both for creating anchor names and for sorting. The characters ÄäÖöÜü (graphemes corresponding to German umlauts) still receive special treatment: they are expanded à la ä → ae for anchor names (but not for sorting). This expansion is now done as a preprocessing step.
The transliteration table now covers almost every Latin-derived letter in the Unicode blocks Latin-1 Supplement, Latin Extended-A, Latin Extended-B and Latin Extended Additional. Most transliterations are "glyph-oriented" in that they involve only removing diacritic marks, decomposing ligatures and rotating letters back. A few transliterations are more "usage-oriented", such as ß → ss, þ → th or Ɣ → g. Some effort was made to keep the transliteration table sane, consistent and language-neutral. Letters that are still missing are listed in comments in the table. Suggestions for additions and improvements are more than welcome!
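The two-stage scheme described above can be sketched roughly as follows. This is a minimal illustration in Python, not the plugin's actual code: the names `UMLAUTS`, `TABLE`, `transliterate`, `sort_key` and `anchor_name` are hypothetical, the table is a tiny excerpt containing only the usage-oriented mappings named above, the uppercase umlaut expansions are an assumption, and diacritic removal is approximated with Unicode decomposition instead of a hand-written table.

```python
import unicodedata

# Assumed umlaut expansions, applied for anchor names only (not for sorting).
UMLAUTS = {"Ä": "Ae", "ä": "ae", "Ö": "Oe", "ö": "oe", "Ü": "Ue", "ü": "ue"}

# Hypothetical excerpt of the shared table: usage-oriented mappings only.
TABLE = {"ß": "ss", "þ": "th", "Ɣ": "g"}


def transliterate(text):
    """Apply the shared table, then strip diacritic marks (glyph-oriented)."""
    replaced = "".join(TABLE.get(ch, ch) for ch in text)
    decomposed = unicodedata.normalize("NFD", replaced)
    # Drop combining marks left over from the decomposition.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def sort_key(lemma):
    """Sorting uses the shared table directly: ä sorts like a."""
    return transliterate(lemma).lower()


def anchor_name(lemma):
    """Anchor names expand umlauts first, as a preprocessing step."""
    composed = unicodedata.normalize("NFC", lemma)
    expanded = "".join(UMLAUTS.get(ch, ch) for ch in composed)
    return transliterate(expanded)
```

Under these assumptions, `anchor_name("Äpfel")` yields `Aepfel` while `sort_key("Äpfel")` yields `apfel`, so both paths share `transliterate` and differ only in the preprocessing step.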
Bugfix: generated links were broken when the permalink structure was not /%postname%/.
Bugfix: unpublished posts/pages were being indexed on installation.
DB_CHARSET is now observed when creating the database table. This fixes a problem where non-ASCII characters were replaced by question marks when inserted into the table via a UTF-8 connection.
Tested with WordPress 3.3.2.
Index can now be inserted in widgets.
Tested with WordPress 3.3.
Lemmas are now removed from the index when the post/page containing them is deleted or otherwise unpublished.
The index now uses absolute links.
Tested with WordPress 2.9.1.
Requires: 2.8.4 or higher
Compatible up to: 3.3.2
Last Updated: 2012-5-6
Downloads: 1,917