[^0-9a-zA-Z]
This will match all characters that are neither numbers nor letters (alphanumerical characters). If the underscore character _ is also to be negated, the expression can be shortened to:
[^\w]
Or:
\W
In the following sentences:
Hi, what's up?
I can't wait for 2017!!!
The following characters match:
,,,',?and the end of line character.
',,!and the end of line character.
UNICODE NOTE
Note that some flavors with Unicode character properties support may interpret \w and \W as [\p{L}\p{N}_] and [^\p{L}\p{N}_] which means other Unicode letters and numeric characters will be included as well (see PCRE docs). Here is a PCRE \w test:
In .NET, \w = [\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Lm}\p{Mn}\p{Nd}\p{Pc}], and note it does not match \p{Nl} and \p{No} unlike PCRE (see the \w .NET documentation):
Note that for some reason, Unicode 3.1 lowercase letters (like 𝐚𝒇𝓌𝔨𝕨𝗐𝛌𝛚) are not matched.
Java's (?U)\w will match a mix of what \w matches in PCRE and .NET: