I have the following code:
uint8_t buffer[16];
uint8_t data[16];
uint8_t buffer_length = 16;
uint8_t data_length = 0;
memcpy(buffer + buffer_length, data, data_length);
memcpy should be a no-op, because data_length is zero. However, buffer + buffer_length points just outside of the allocated memory. I wonder if it could trigger some kind of undefined behaviour? Should I wrap this memcpy with an additional if?
I understand that any reasonable implementation of memcpy would work fine; however, this question is more about code correctness and avoiding undefined behaviour.
3 Answers
As the answer of Stephen C points out, the C17 specification is a bit vague about whether or not this is well-defined.
However, the C23 specification clarifies this in a footnote to the part of 7.1.4 stating
If a function argument is described as being an array, the pointer passed to the function shall
have a value such that all address computations and accesses to objects (that would be valid if
the pointer did point to the first element of such an array) are valid.
The footnote (235) reads:
This includes, for example, passing a valid pointer that points one-past-the-end of an array along with a size of 0, or using any valid pointer with a size of 0.
The first part of the sentence explicitly defines the OP case as well-defined.
Adding this statement can be seen as admitting that the C17 specification is not sufficiently clear on this point and thus, it cannot be ruled out that an implementer of a C17 compiler may in good faith interpret the standard such that this case is not defined behavior.
However, C23 should remove that uncertainty.
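For code that has to build with pre-C23 compilers and wants to avoid relying on either reading of C17, one option is to never make the zero-length call at all. Below is a minimal sketch of that idea. The helper name append_bytes and its parameters are hypothetical and not part of the question; it assumes the caller tracks how many bytes of the buffer are already in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the question): appends n bytes from src to
 * buf, which holds `used` of its `cap` bytes. Returns the new fill level,
 * or `used` unchanged if there is nothing to copy or the data does not fit. */
static size_t append_bytes(uint8_t *buf, size_t cap, size_t used,
                           const uint8_t *src, size_t n)
{
    assert(used <= cap);            /* buf + used is at most one past the end */
    if (n == 0 || n > cap - used) { /* nothing to copy, or it would not fit */
        return used;                /* memcpy is never called in this case */
    }
    memcpy(buf + used, src, n);
    return used + n;
}
```

With the values from the question, append_bytes(buffer, 16, 16, data, 0) returns 16 without forming buffer + 16 or calling memcpy. Under C23 the extra check is not needed for correctness; it is only a defensive choice.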
-
That's footnote 235) in the C23 N3096 draft. Either way, footnotes are not normative, although this one ought to point a knowledgeable reader straight to the actual normative text in 6.5.6 (C17/C23), as quoted in my answer.
– Lundin, 14 hours ago
-
@Lundin That is correct, but in this case, I see the footnote as a clarification of how the normative paragraph should be interpreted. It is not adding to the specification.
– nielsen, 14 hours ago
-
Yeah it's a good find – I don't think the clarification was necessary prior to C23 but apparently some people need one. It's generally muddy thinking to say that something in chapter 7 (the standard library) invalidates the rules laid out in chapter 6 (the C language). The standard library need not be implemented in the C language, but the function APIs definitely need to be and therefore (for the most part) the same rules apply to standard library functions as to any C function.
– Lundin, 14 hours ago
-
@Lundin: Such clarification shouldn't have been necessary, but there are many situations where the authors of the Standard didn't think it necessary to explicitly specify aspects of behavior that implementations to date had processed either identically or in one of a few discrete ways (possibly chosen in unspecified fashion), and where clever compiler writers have interpreted such omissions as invitations to deviate in arbitrary fashion. The real problem, though, should be handled by recognizing adherence to precedent as a quality-of-implementation matter.
– supercat, 6 hours ago
The code has well-defined behavior.
The subclause on "string handling functions", under which the memcpy function falls, states (C17 7.24.1):
Where an argument declared as size_t n specifies the length of the array for a function, n can have the value zero on a call to that function. Unless explicitly stated otherwise in the description of a particular function in this subclause, pointer arguments on such a call shall still have valid values, as described in 7.1.4.
The part in C17 7.1.4 regarding array parameters passed to standard library functions is somewhat relevant:
If a function argument is described as being an array, the pointer actually passed
to the function shall have a value such that all address computations and accesses to objects (that
would be valid if the pointer did point to the first element of such an array) are in fact valid.
(The arguments to memcpy need not necessarily be arrays, however. But in this case they both are.)
Address computations and the following access to an item of the array are defined by the rules for pointer arithmetic, specifically C17 6.5.6 §8 about the additive operators, the relevant part being this one:
If both the pointer operand and the result point
to elements of the same array object, or one past the last element of the array object, the evaluation
shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past
the last element of the array object, it shall not be used as the operand of a unary * operator that is
evaluated.
Therefore buffer + buffer_length is explicitly allowed by this "point one item past the end of an array" special rule, as long as we don't dereference that location, which will not happen in this case. Had we written buffer + buffer_length + 1, it would have been an invalid address computation and undefined behavior.
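The same 6.5.6 rule is what makes the common "end pointer" loop idiom valid. Here is a minimal sketch: the function name sum_bytes is made up for illustration and does not come from the question. It forms a one-past-the-end pointer, compares against it, and never dereferences it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: sums `count` bytes starting at `first`, using a
 * one-past-the-end pointer purely as a loop bound. */
static unsigned sum_bytes(const uint8_t *first, size_t count)
{
    assert(first != NULL);
    const uint8_t *end = first + count; /* valid: at most one past the last element */
    unsigned total = 0;
    for (const uint8_t *p = first; p != end; ++p) {
        total += *p;                    /* p never equals end inside the loop */
    }
    return total;
}
```

Calling sum_bytes(a + 4, 0) on a four-element array a is the analogue of the question's call: the pointer argument is one past the end, the count is zero, and nothing is dereferenced.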
-
But doesn't 7.24.1 explicitly state that the pointers passed have to be valid? If, for example, a particular implementation probes the destination pointer and discovers (by means of a hardware trap) that it is an invalid location, where would this be disallowed in the standard? The fact that buffer + buffer_length is a well-defined operation does not mean that passing it to a function is well-defined.
– Jens, 5 hours ago
If the question is whether a conforming implementation could process such pointer constructs in gratuitously wacky fashion, the answer is almost certainly yes. If the question is whether programmers should be expected to jump through hoops to allow for such a possibility, the answer is no. The Standard treats many such judgments as quality-of-implementation issues outside its jurisdiction.
In both clang and gcc, an equality comparison between a pointer to the start of an object and a legitimately formed "one past" pointer for the object that happens to immediately precede it in memory may have side effects which are consistent neither with the comparison yielding 0 nor with it yielding 1. This would probably not, however, render such compilers non-conforming, because there would almost certainly exist some possible program which nominally exercises the translation limits in N1570 5.2.4.1 and which clang and gcc would process correctly.
If there were an implementation that was incapable of correctly processing any program that exercises the translation limits in N1570 5.2.4.1, other than one which passes a just-past pointer to memcpy with a size argument of zero, then one could argue that failure to treat that operation as a no-op might render the implementation non-conforming. Alternatively, one could argue that because the Standard fails to make unambiguously clear that such behavior is required, the Standard would impose no requirements on a program that performs such an operation. Neither argument would detract from the fact that quality implementations should be expected to treat the operation as a no-op, while the Standard would allow poor-quality implementations to process such constructs in nonsensical fashion. Since the Standard deliberately waives jurisdiction over such quality-of-implementation issues, it should not be used as a source of guidance on such matters.
You're allowed to create a pointer to just after the end of an object. You're not allowed to dereference it. Since this doesn't dereference anything because the length is 0, I think it should be OK.
17 hours ago
Using buffer + buffer_length doesn't violate the C standard, but is a loaded gun waiting to go off… Why would you want to form the address outside of the destination array in the first place?
17 hours ago
In practice Barmar is correct. And so is Rusty. In theory, you would need to look at the precise wording of the specification of memcpy to decide if that call violates the spec's preconditions.
17 hours ago
@klutt – You would need to know that for every possible implementation of memcpy. Including those that haven't been written yet. Or … read the specification!!
17 hours ago
@12431234123412341234123 That question uses null pointers though, which are explicitly not valid pointers, by definition.
10 hours ago