I recently discovered a lot of problematic characters that our users have entered. Apart from being nearly impossible to spot by eye, these characters can cause problems in automation. As one example, an XML parser might get stuck when reading cell data that contains them.
Most likely these characters come from people copying and pasting text from a Microsoft Office document into ProactivePack. Office automatically replaces some characters as you type. One well-known example is the plain hyphen-minus (U+002D) being replaced by an en dash (U+2013). The difference is hard to see, but because it is a different character it can cause unexpected results when the data is used.
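To make such look-alike characters visible, here is a minimal Python sketch that lists every character outside the printable ASCII range together with its Unicode name. The function name `find_suspect_chars` is my own invention for illustration and is not part of ProactivePack.

```python
import unicodedata


def find_suspect_chars(text):
    """Return (index, code point, Unicode name) for each character outside 0x20..0x7E.

    Hypothetical helper for illustration only.
    """
    findings = []
    for index, ch in enumerate(text):
        if not 0x20 <= ord(ch) <= 0x7E:
            # Control characters have no Unicode name, so fall back to a placeholder.
            name = unicodedata.name(ch, "<unnamed>")
            findings.append((index, f"U+{ord(ch):04X}", name))
    return findings


# The dash in this sample is an en dash (U+2013), not the ASCII hyphen-minus.
print(find_suspect_chars("10 \u2013 20"))  # [(3, 'U+2013', 'EN DASH')]
```

Running something like this over existing data would show which characters actually occur before deciding how strictly to filter.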
My recommendation is to implement an optional filter that only allows characters by their numeric byte value. Let me call it a "forceASCII" option: it would only allow bytes in the range 0x20 (32) to 0x7E (126), which filters out control characters as well as all non-ASCII characters and so ensures reliable data.
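As a rough sketch of what I have in mind (not an actual ProactivePack API), the filter could look like this in Python. The name `force_ascii` and the `replacement` parameter are assumptions of mine, and the check is done on code points rather than raw bytes, which gives the same result for the range 0x20 to 0x7E.

```python
def force_ascii(text, replacement=""):
    """Keep only characters in the range 0x20..0x7E (printable ASCII).

    Hypothetical sketch: every other character is replaced by `replacement`,
    which defaults to the empty string (the character is dropped).
    """
    return "".join(
        ch if 0x20 <= ord(ch) <= 0x7E else replacement
        for ch in text
    )


print(force_ascii("a\u2013b\tc"))       # abc
print(force_ascii("a\u2013b\tc", "?"))  # a?b?c
```

Note that this range also removes tab (0x09), line feed (0x0A) and carriage return (0x0D), so multi-line cell content would be collapsed. If that is not wanted, those values would have to be allowed explicitly.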