Thanks for all your work on the update, Zack. Once again, I want to point out that valid UTF-8 != valid XML, so your mb_encode solution will not work.
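Forms that contain UTF-8 control characters will be rejected by the SOAP interface: the characters are valid UTF-8, but XML 1.0 does not allow most control characters (anything below U+0020 other than tab, line feed, and carriage return), so the payload is not valid XML.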
How do you get UTF-8 control characters in a form, you ask? Just cut and paste from Word. If you have a longish form where someone has composed the answer offline in Word and then cuts and pastes it into the form, the submission stands a chance of being rejected.
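
To make the point concrete, here is a rough sketch in Python (not the project's own code, and not a drop-in replacement for the mb_encode approach). It assumes the pasted text contains a vertical tab (U+000B), which is one control character that can come along from Word. The names `strip_invalid_xml_chars`, `is_well_formed`, and the `<answer>` element are made up for illustration. The idea is that the string passes a UTF-8 round trip but still breaks an XML parser until the characters outside the XML 1.0 `Char` range are removed.

```python
import re
import xml.etree.ElementTree as ET

# Everything NOT matched by the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Remove characters that XML 1.0 does not allow (hypothetical helper)."""
    return _INVALID_XML_CHARS.sub("", text)


def is_well_formed(xml_text: str) -> bool:
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Assumed sample input: text with an embedded vertical tab (U+000B).
pasted = "line one\x0bline two"

# The string survives a UTF-8 round trip, so a UTF-8 check alone passes it.
utf8_ok = pasted.encode("utf-8").decode("utf-8") == pasted

# Wrapped in an element, it is rejected by the XML parser.
raw_ok = is_well_formed("<answer>" + pasted + "</answer>")

# After stripping the disallowed characters, the same payload parses.
cleaned = strip_invalid_xml_chars(pasted)
clean_ok = is_well_formed("<answer>" + cleaned + "</answer>")
```

Whether to strip these characters, replace them with a space, or reject the submission with a clear message is a separate decision; the sketch only shows that the check has to be against the XML character range and not against UTF-8 validity.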