Is it undefined behaviour to use pointer after allocated memory?

Question

I have the following code:

uint8_t buffer[16];
uint8_t data[16];
uint8_t buffer_length = 16;
uint8_t data_length = 0;

memcpy(buffer + buffer_length, data, data_length);

memcpy should be a no-op, because data_length is zero. However, buffer + buffer_length points just past the end of the allocated memory. Could this trigger some kind of undefined behaviour? Should I wrap this memcpy in an additional if?

I understand that any reasonable implementation of memcpy would work fine; however, this question is asked from the perspective of code correctness and avoiding undefined behaviour.

You're allowed to create a pointer to just after the end of an object. You're not allowed to dereference it. Since this doesn't dereference anything because the length is 0, I think it should be OK. – 17 hours ago
Using buffer + buffer_length doesn't violate the C standard, but is a loaded gun waiting to go off… Why would you want to form the address outside of the destination array in the first place? – 17 hours ago
In practice Barmar is correct. And so is Rusty. In theory, you would need to look at the precise wording of the specification of memcpy to decide if that call violates the spec's preconditions. – 17 hours ago
@klutt – You would need to know that for every possible implementation of memcpy. Including those that haven't been written yet. Or … read the specification!! – 17 hours ago
@12431234123412341234123 That question uses null pointers though, which are explicitly not valid pointers, by definition. – 10 hours ago

score 12 · Accepted Answer · 2023-09-26 11:28:48Z


As the answer of Stephen C points out, the C17 specification is a bit vague about whether or not this is well-defined.

However, the C23 specification clarifies this in a footnote to the part of 7.1.4 stating

If a function argument is described as being an array, the pointer passed to the function shall
have a value such that all address computations and accesses to objects (that would be valid if
the pointer did point to the first element of such an array) are valid.

The footnote (235) reads:

This includes, for example, passing a valid pointer that points one-past-the-end of an array along with a size of 0, or using any valid pointer with a size of 0.

The first part of that sentence explicitly covers the OP's case, making it well-defined.

Adding this statement can be seen as admitting that the C17 specification is not sufficiently clear on this point and thus, it cannot be ruled out that an implementer of a C17 compiler may in good faith interpret the standard such that this case is not defined behavior.

However, C23 should remove that uncertainty.
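For code that must also build against pre-C23 toolchains, one option is to skip the call when the length is zero. The following is only a minimal sketch: `append_bytes` and its parameters are hypothetical names, not something from the question, and the guard is there purely to avoid depending on how C17 is read, not because the unguarded call is known to fail anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (assumption, not from the question): copies n bytes
 * from src to dst + offset, where dst has room for cap bytes in total.
 * Returns the offset just past the copied bytes. */
static size_t append_bytes(uint8_t *dst, size_t cap, size_t offset,
                           const uint8_t *src, size_t n)
{
    /* The copy must stay inside dst. */
    assert(offset <= cap && n <= cap - offset);

    /* With n == 0, dst + offset may be the one-past-the-end pointer.
     * The C23 footnote says passing it to memcpy is fine; the guard only
     * avoids relying on that reading under older standards. */
    if (n != 0) {
        memcpy(dst + offset, src, n);
    }
    return offset + n;
}
```

With the arrays from the question, `append_bytes(buffer, sizeof buffer, buffer_length, data, data_length)` returns 16 and never reaches memcpy.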

edited 11 hours ago

answered 14 hours ago

nielsen

5,869 reputation · 10 silver badges · 28 bronze badges

4

That's footnote 235) in the C23 N3096 draft. Either way, footnotes are not normative, although this one ought to point a knowledgeable reader straight to the actual normative text in 6.5.6 (C17/C23), as quoted in my answer.

– Lundin

14 hours ago
4

@Lundin That is correct, but in this case, I see the footnote as a clarification of how the normative paragraph should be interpreted. It is not adding to the specification.

– nielsen

14 hours ago
2

Yeah it's a good find – I don't think the clarification was necessary prior to C23 but apparently some people need one. It's generally muddy thinking to say that something in chapter 7 (the standard library) invalidates the rules laid out in chapter 6 (the C language). The standard library need not be implemented in the C language, but the function APIs definitely need to be and therefore (for the most part) the same rules apply to standard library functions as to any C function.

– Lundin

14 hours ago
@Lundin: Such clarification shouldn't have been necessary, but there are many situations where the authors of the Standard didn't think it necessary to explicitly specify aspects of behavior that implementations to date had processed either identically or in one of a few discrete ways (possibly chosen in unspecified fashion), and where clever compiler writers have interpreted such omissions as invitations to deviate in arbitrary fashion. The real problem, though, should be handled by recognizing adherence to precedent as a quality-of-implementation matter.

– supercat

6 hours ago

score 7 · Accepted Answer · 2023-09-26 11:48:45Z

The code has well-defined behavior.

The section on "string handling functions", under which memcpy falls, states (C17 7.24.1):

Where an argument declared as size_t n specifies the length of the array for a function, n can have the value zero on a call to that function. Unless explicitly stated otherwise in the description of a
particular function in this subclause, pointer arguments on such a call shall still have valid values, as described in 7.1.4.

The part in C17 7.1.4 regarding array parameters passed to standard library functions is somewhat relevant:

If a function argument is described as being an array, the pointer actually passed
to the function shall have a value such that all address computations and accesses to objects (that
would be valid if the pointer did point to the first element of such an array) are in fact valid.

(The arguments to memcpy need not be arrays, however. In this case both of them are.)

Address computations and the following access to an item of the array are defined by the rules for pointer arithmetic, specifically C17 6.5.6 §8 about the additive operators, the relevant part being this one:

If both the pointer operand and the result point
to elements of the same array object, or one past the last element of the array object, the evaluation
shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past
the last element of the array object, it shall not be used as the operand of a unary * operator that is
evaluated.

Therefore buffer + buffer_length is explicitly allowed by this "point one item past the end of an array" special rule, as long as we don't dereference that location, which will not happen in this case. Had we written buffer + buffer_length + 1, it would have been an invalid address computation and undefined behavior.
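The one-past-the-end rule is what makes the usual "iterate until end" idiom legal. As a rough illustration (the function `sum_bytes` is a made-up name for this sketch, not part of the question), the end pointer is formed and compared but never dereferenced:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (assumption): sums `length` bytes starting at `first`.
 * `first` must point into an array with at least `length` elements left,
 * or one past its end when `length` is 0. */
static unsigned sum_bytes(const uint8_t *first, size_t length)
{
    /* Valid per 6.5.6 even when it points one past the last element. */
    const uint8_t *end = first + length;
    unsigned sum = 0;

    /* `end` is only compared against; `*p` is evaluated only while p != end,
     * so the one-past-the-end location is never accessed. */
    for (const uint8_t *p = first; p != end; ++p) {
        sum += *p;
    }
    return sum;
}
```

Calling `sum_bytes(buffer + 16, 0)` on a 16-element array forms the one-past-the-end pointer twice and reads nothing, which mirrors the memcpy call in the question. Forming `buffer + 17` would already be undefined, before any access takes place.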

But doesn't 7.24.1 explicitly state that the pointers passed have to be valid? If, for example, a particular implementation probes the destination pointer and discovers (by means of a hardware trap) that it is an invalid location, where would this be disallowed in the standard? The fact that buffer + buffer_length is a well-defined operation does not mean that passing it to a function is well-defined. – 5 hours ago

supercat (77.9k reputation, 9 gold badges, 168 silver badges, 211 bronze badges) · Accepted Answer · 2023-09-26 20:13:00Z

If the question is whether a conforming implementation could process such pointer constructs in gratuitously wacky fashion, the answer is almost certainly yes. If the question is whether programmers should be expected to jump through hoops to allow for such a possibility, the answer is no. The Standard treats many such judgments as quality-of-implementation issues outside its jurisdiction.

In both clang and gcc, an equality comparison between a pointer to the start of an object and a legitimately formed "one past" pointer for the object that happens to immediately precede it in memory may have side effects which are consistent neither with the comparison yielding 0 nor with it yielding 1. This would probably not, however, render such compilers non-conforming, because there would almost certainly exist some possible program which nominally exercises the translation limits in N1570 5.2.4.1 and which clang and gcc would process correctly.

If there were an implementation that were incapable of correctly processing any program that exercises the translation limits in N1570 5.2.4.1, other than one which passes a just-past pointer to memcpy with a size argument of zero, then one could argue that failure to treat that operation as a no-op might render the implementation non-conforming. One could equally argue that, because the Standard fails to make unambiguously clear that such behavior is required, it would impose no requirements on a program that performs such an operation. Neither argument detracts from the fact that quality implementations should be expected to treat the operation as a no-op, while the Standard would allow poor-quality implementations to process such constructs in nonsensical fashion. Since the Standard deliberately waives jurisdiction over such quality-of-implementation issues, it should not be used as a source of guidance on such matters.
