Unicode: English text is not ASCII only


A common assumption is that when you deal only with English text, you are unlikely to encounter characters outside the ASCII character set. To avoid the work of handling Unicode correctly, people are tempted to take shortcuts such as stripping non-ASCII characters or removing the accents from letters.

These examples show that this assumption is wrong: even for English text, you should take care to handle Unicode characters correctly.
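
To make the cost of the two shortcuts mentioned above concrete, here is a minimal Python sketch of each one. The helper names `strip_non_ascii` and `remove_accents` and the sample sentence are made up for illustration. They are just one plausible way such shortcuts get written, not a recommended approach.

```python
import unicodedata


def strip_non_ascii(text: str) -> str:
    """Hypothetical shortcut: drop every character outside ASCII (lossy)."""
    return text.encode("ascii", errors="ignore").decode("ascii")


def remove_accents(text: str) -> str:
    """Hypothetical shortcut: decompose to NFD, then drop combining marks (lossy)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# An ordinary English sentence with accented loanwords and a typographic apostrophe.
sample = "She sent her résumé to the café’s owner."

print(strip_non_ascii(sample))  # She sent her rsum to the cafs owner.
print(remove_accents(sample))   # She sent her resume to the cafe’s owner.
```

Stripping non-ASCII characters deletes letters and the apostrophe outright, leaving non-words behind. Removing accents looks tidier, but it turns "résumé" into the different word "resume" and still leaves the non-ASCII apostrophe in place.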