C Language Copying overlapping memory


Many standard library functions copy byte sequences from one memory region to another as part of their work. Most of them have undefined behavior when the source and destination regions overlap.

For example, this ...

#include <string.h> /* for memcpy() */

char str[19] = "This is an example";
memcpy(str + 7, str, 10); /* undefined behavior: the regions overlap */

... attempts to copy 10 bytes where the source (str[0] through str[9]) and destination (str[7] through str[16]) memory areas overlap by three bytes. To visualize:

               overlapping area
               _ _
              |   |
              v   v
T h i s   i s   a n   e x a m p l e \0
^             ^
|             |
|             destination
|
source

Because of the overlap, the resulting behavior is undefined.
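
The size of the overlap follows from simple arithmetic on the two starting positions. The following is a minimal sketch rather than anything the standard library provides: overlap_bytes() is a hypothetical helper, and it assumes both regions lie in the same array, have the same length n, and are described by their offsets from the start of that array.

```c
#include <stddef.h> /* for size_t */

/* Hypothetical helper: number of bytes shared by the regions
   [src_off, src_off + n) and [dst_off, dst_off + n) of one array. */
size_t overlap_bytes(size_t src_off, size_t dst_off, size_t n)
{
    /* Distance between the two starting offsets. */
    size_t gap = src_off > dst_off ? src_off - dst_off : dst_off - src_off;

    /* The regions share bytes only if they start less than n bytes apart. */
    return gap < n ? n - gap : 0;
}
```

For the memcpy() call above, overlap_bytes(0, 7, 10) yields 3, which matches the three bytes marked in the diagram.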

Among the standard library functions with a limitation of this kind are memcpy(), strcpy(), strcat(), sprintf(), and sscanf(). The standard says of these and several other functions:

If copying takes place between objects that overlap, the behavior is undefined.

The memmove() function is the principal exception to this rule. Its definition specifies that the function behaves as if the source data were first copied into a temporary buffer and then written to the destination address. There is no exception for overlapping source and destination regions, nor any need for one, so memmove() has well-defined behavior in such cases.
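
As an illustration, the two helpers below use memmove() where memcpy() or strcpy() would be undefined. This is only a sketch: copy_within() and drop_prefix() are hypothetical names introduced here, and the caller is assumed to pass offsets and lengths that stay inside the buffer.

```c
#include <stddef.h> /* for size_t */
#include <string.h> /* for memmove(), strlen() */

/* Hypothetical helper: copies n bytes inside buf from src_off to dst_off.
   The regions may overlap, so memmove() is used instead of memcpy(). */
void copy_within(char *buf, size_t dst_off, size_t src_off, size_t n)
{
    memmove(buf + dst_off, buf + src_off, n);
}

/* Hypothetical helper: removes the first k characters of the string s
   in place. strcpy(s, s + k) would be undefined here, because the
   source and destination overlap. */
void drop_prefix(char *s, size_t k)
{
    size_t len = strlen(s);

    if (k > len)
        k = len;                    /* never read past the terminator */
    memmove(s, s + k, len - k + 1); /* + 1 copies the terminating null */
}
```

With str declared as in the example above, copy_within(str, 7, 0, 10) has well-defined behavior and leaves str holding "This isThis is ane": the first ten bytes are reproduced intact at offset 7, and the final 'e' and the terminator are untouched.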

The distinction reflects an efficiency vs. generality tradeoff. The copying these functions perform usually occurs between disjoint regions of memory, and it is often possible to know at development time whether a particular copy falls into that category. Assuming non-overlap permits comparatively more efficient implementations, which do not reliably produce correct results when the assumption does not hold. Most C library functions are allowed the more efficient implementations, and memmove() fills in the gaps, serving the cases where the source and destination may or do overlap. To produce the correct effect in all cases, however, it must perform additional tests and/or employ a comparatively less efficient implementation.
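
To make the "additional tests" concrete, here is one way an overlap-safe copy could be written. It is a simplified sketch, not how any particular C library implements memmove(): sketch_memmove() is a hypothetical name, the copy is done one byte at a time, and the sketch assumes that uintptr_t is available and that converting pointers to it preserves their relative order, which holds on common flat-address platforms but is not guaranteed by the standard.

```c
#include <stddef.h> /* for size_t */
#include <stdint.h> /* for uintptr_t (an optional type) */

/* Hypothetical overlap-safe copy: chooses the copy direction so that
   no source byte is overwritten before it has been read. */
void *sketch_memmove(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    if ((uintptr_t)d < (uintptr_t)s) {
        /* Destination starts below the source: copy front to back. */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    } else if ((uintptr_t)d > (uintptr_t)s) {
        /* Destination starts above the source: copy back to front. */
        for (size_t i = n; i > 0; i--)
            d[i - 1] = s[i - 1];
    }
    /* If d == s there is nothing to do. */
    return dest;
}
```

The pointer comparison is the extra work that a function permitted to assume non-overlap can skip; such a function is free to copy in whichever direction is fastest.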