In C, some expressions yield undefined behavior. The standard explicitly chooses to not define how a compiler should behave if it encounters such an expression. As a result, a compiler is free to do whatever it sees fit and may produce useful results, unexpected results, or even crash.
Code that invokes UB may work as intended on a specific system with a specific compiler, but will likely not work on another system, or with a different compiler, compiler version or compiler settings.
What is Undefined Behavior (UB)?
Undefined behavior is a term used in the C standard. The C11 standard (ISO/IEC 9899:2011) defines the term undefined behavior as
behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements
What happens if there is UB in my code?
These are the results which can happen due to undefined behavior according to standard:
NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
The following quote is often used to describe (less formally though) results happening from undefined behavior:
“When the compiler encounters [a given undefined construct] it is legal for it to make demons fly out of your nose” (the implication is that the compiler may choose any arbitrarily bizarre way to interpret the code without violating the ANSI C standard)
Why does UB exist?
If it's so bad, why didn't they just define it or make it implementation-defined?
Undefined behavior allows more opportunities for optimization; The compiler can justifiably assume that any code does not contain undefined behaviour, which can allow it to avoid run-time checks and perform optimizations whose validity would be costly or impossible to prove otherwise.
Why is UB hard to track down?
There are at least two reasons why undefined behavior creates bugs that are difficult to detect:
Consider null-pointer dereference: the compiler is not required to diagnose null-pointer dereference, and even could not, as at run-time any pointer passed into a function, or in a global variable might be null. And when the null-pointer dereference occurs, the standard does not mandate that the program needs to crash. Rather, the program might crash earlier, later, or not crash at all; it could even behave as if the null pointer pointed to a valid object, and behave completely normally, only to crash under other circumstances.
In the case of null-pointer dereference, C language differs from managed languages such as Java or C#, where the behavior of null-pointer dereference is defined: an exception is thrown, at the exact time (NullPointerException
in Java, NullReferenceException
in C#), thus those coming from Java or C# might incorrectly believe that in such a case, a C program must crash, with or without the issuance of a diagnostic message.
Additional information
There are several such situations that should be clearly distinguished:
Also have in mind that in many places the behavior of certain constructs is deliberately undefined by the C standard to leave room for compiler and library implementors to come up with their own definitions. A good example are signals and signal handlers, where extensions to C, such as the POSIX operating system standard, define much more elaborated rules. In such cases you just have to check the documentation of your platform; the C standard can't tell you anything.
Also note that if undefined behavior occurs in program it doesn't mean that just the point where undefined behavior occurred is problematic, rather entire program becomes meaningless.
Because of such concerns it is important (especially since compilers don't always warn us about UB) for person programming in C to be at least familiar with the kind of things that trigger undefined behavior.
It should be noted there are some tools (e.g. static analysis tools such as PC-Lint) which aid in detecting undefined behavior, but again, they can't detect all occurrences of undefined behavior.