For a plugin that needs to cope with Chinese names and characters, I'm using:
htmlentities(trim($value), ENT_QUOTES, "UTF-8");
(for data fields that might also end up being post titles).
When the data is echoed back to the user in a form, is there any reason not to do something like:
stripslashes(attribute_escape($value))
(I figure attribute_escape, being native to the WordPress platform, deserves to be kept, but I don't want to show escaped quotes such as \' because the quote character is a critical part of many Chinese names.)