I have the following C/C++ code snippet:
#define ARRAY_LENGTH 666
int g_sum = 0;
extern int *g_ptrArray[ ARRAY_LENGTH ];
void test()
{
    unsigned int idx = 0;

    // either enable or disable the check "idx < ARRAY_LENGTH" in the while loop
    while( g_ptrArray[ idx ] != nullptr /* && idx < ARRAY_LENGTH */ )
    {
        g_sum += *g_ptrArray[ idx ];
        ++idx;
    }

    return;
}
When I compile the above code with GCC 12.2.0 and the option -Os, for both cases:
- the while loop condition is g_ptrArray[ idx ] != nullptr
- the while loop condition is g_ptrArray[ idx ] != nullptr && idx < ARRAY_LENGTH
I get the following assembly:
test():
ldr r2, .L4
ldr r1, .L4+4
.L2:
ldr r3, [r2], #4
cbnz r3, .L3
bx lr
.L3:
ldr r3, [r3]
ldr r0, [r1]
add r3, r3, r0
str r3, [r1]
b .L2
.L4:
.word g_ptrArray
.word .LANCHOR0
g_sum:
.space 4
As you can see, the assembly does NOT do any checking of the variable idx against the value ARRAY_LENGTH.
My question:
How is that possible? How can the compiler generate exactly the same assembly for both cases and ignore the idx < ARRAY_LENGTH condition when it is present in the code? Please explain the rule or procedure by which the compiler comes to the conclusion that it can completely remove the condition.
The output assembly shown in Compiler Explorer is identical for both cases:
- the while condition is g_ptrArray[ idx ] != nullptr
- the while condition is g_ptrArray[ idx ] != nullptr && idx < ARRAY_LENGTH
NOTE: If I swap the order of the conditions to idx < ARRAY_LENGTH && g_ptrArray[ idx ] != nullptr, the output assembly does contain the check of the value of idx, as you can see here: https://godbolt.org/z/fvbsTfr9P
4 Answers
Accessing an array out of bounds is undefined behavior, so the compiler can assume that it never happens in the LHS of the && expression. It is then jumping through hoops (optimizations) to notice that, since ARRAY_LENGTH is the length of the array, the RHS condition must necessarily hold true (otherwise UB would ensue in the LHS). Hence the result you see.
The correct check would be idx < ARRAY_LENGTH && g_ptrArray[idx] != nullptr.
Even potential undefined behavior can do nasty things like that!
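As a sketch, the loop with the operands swapped into the safe order could look like this (the name test_fixed is only for illustration; declarations are as in the question):

void test_fixed()
{
    unsigned int idx = 0;

    // Bounds check first: the element is only read once idx is known to be
    // in range, so no out-of-bounds access (and hence no UB) can occur.
    while( idx < ARRAY_LENGTH && g_ptrArray[ idx ] != nullptr )
    {
        g_sum += *g_ptrArray[ idx ];
        ++idx;
    }
}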
A blatant case of compiler perversion: instead of warning about an obvious programmer error, the compiler optimizes code away silently. I wish the compiler folks would add -Wdont-assume-the-programmer-is-always-right – chqrlie yesterday
"Even potential undefined behavior can do nasty things like that!" – an easier way to express that is: "The compiler can assume UB doesn't happen in your program". (The C++ standard is clear that this is retroactive, that later code can imply value-ranges for earlier code if there isn't a branch that would avoid the UB. ISO C is not clear on that: see Does undefined behaviour retroactively mean that earlier visible side-effects aren't guaranteed? which compares the standards) – Peter Cordes yesterday
In this case, it's UB that would have necessarily already happened for idx < ARRAY_LENGTH to be reachable with an idx that makes the comparison false. That's just a matter of terminology in how to describe things, I think; the concept seems relatively simple. It gets trickier if you had code like if (x < 0) *ptr = 1; arr[x] = 2; where arr is an actual array, not just a pointer, so the compiler can be sure that a negative index would be out of bounds and thus UB. – Peter Cordes yesterday
Sorry, this is mostly a nitpick; as Peter said, a matter of terminology. By "can happen at runtime given the right conditions" I mean that there's a certain moment at runtime when the UB will inevitably happen as the program continues; that's when the adverse effects kick in. As long as it remains merely "potential", the program doesn't break; it's impossible to notice that the condition was removed from the loop without causing UB. – HolyBlackCat yesterday
@chqrlie you are just questioning the fundamental principles of C and C++. They are made for the hypothetical developer “who knows what they are doing” or “knows better than the compiler”. While I’d agree with you regarding the design philosophy, I don’t see a point in discussing it in the context of such a language while alternative languages exist. As Passer By said, the code might not have been written in this form literally. At the same time, compiler writers try to implement warnings for cases that look like a mistake. Would be interesting to check which compilers do already warn here.
– Holger 15 hours ago
The C standard (C17 6.5.6 §8) says that we may not do pointer arithmetic outside an array nor access it beyond the array – doing so is undefined behavior and anything can happen.
Therefore, strictly speaking, the array out-of-bounds check is superfluous, since your loop condition goes as "stop upon spotting a null pointer within the array". And in case g_ptrArray[ idx ] is an out-of-bounds access, you invoke undefined behavior, so the program would in theory be toasted at this point. So there's no need to evaluate the right operand of &&. (As you probably know, && has strict left-to-right evaluation.) The compiler can assume that the access is always inside the array of known size.
We can smack the compiler back in line by adding something that makes it impossible for the compiler to predict the code:
int** I_might_change_at_any_time = g_ptrArray;
void test2()
{
    unsigned int idx = 0;

    // check for idx value is NOT present in code
    while( I_might_change_at_any_time[ idx ] != nullptr && idx < ARRAY_LENGTH )
    {
        g_sum += *g_ptrArray[ idx ];
        ++idx;
    }
}
Here the pointer acts as "middle man". It's a file scope variable with external linkage and so it may change at any time. The compiler can no longer assume that it always points at g_ptrArray. It will now be possible for the left operand of && to be a well-defined access. And therefore gcc now adds the out-of-bounds check to the assembler:
cmp QWORD PTR [rdx+rax*8], 0
je .L6
cmp eax, 666
je .L6
“The C standard (C17 6.5.6 §8) says that we may not do pointer arithmetic outside an array”. You know this, but there's one exception: we can calculate a pointer one element past the end of an array, and compare it to a pointer within the array. That is, p < &array[0] + ARRAY_SIZE for p between &array[0] and endp is valid. – Davislor yesterday
@Davislor: The C Standard waives jurisdiction over how compiler writers process pointer arithmetic outside an array, so as to allow them to process it in whatever manner would best serve the needs of their customers, who were expected to be the programmers that would be writing code for use with the compilers. The only things that would be forbidden within a "conforming C program" would be constructs that would compel a conforming C implementation to reject it.
– supercat yesterday
@supercat That is not correct. The Standard specifically allows it: “If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.”
– Davislor yesterday
"It's a file scope variable with external linkage and so it may change at any time". That doesn't follow, and in fact here it cannot change during the loop; doing so would be either be a data race or a violation of strict aliasing.
– Ben Voigtyesterday
@supercat: No one is arguing with you about what an implementation may do with UB. Davislor is correctly pointing out that it's not the pointer arithmetic here that is UB, it's the lvalue-to-rvalue conversion (or C's equivalent). g_ptrArray + idx is fine even when idx == ARRAY_LENGTH, but *(g_ptrArray + idx) (assuming it is in an evaluated context and not the operand of &) is not. – Ben Voigt yesterday
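To make that distinction concrete, here is a small sketch using the declarations from the question (the names past_end and pointer_rules_sketch are mine, purely for illustration):

void pointer_rules_sketch()
{
    // Forming the one-past-the-end pointer is well defined (C17 6.5.6 §8).
    int **past_end = g_ptrArray + ARRAY_LENGTH;

    // Comparing pointers inside the array against past_end is also fine.
    for( int **p = g_ptrArray; p < past_end; ++p )
    {
        // Dereferencing is only allowed while p < past_end; *past_end itself
        // would be exactly the out-of-bounds read (UB) discussed above.
        if( *p == nullptr )
            break;
    }
}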
The explanation: as documented by Marco Bonelli, the compiler assumes that the programmer who wrote the first test g_ptrArray[idx] != nullptr knows that this code has defined behavior, hence it assumes that idx is in the proper range, i.e. idx < ARRAY_LENGTH. It follows that the next test && idx < ARRAY_LENGTH is redundant, as idx is already known to be < ARRAY_LENGTH, hence the code can be elided.
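Spelled out on the loop condition itself (the comments are my annotation of the inference, not compiler output):

while( g_ptrArray[ idx ] != nullptr   // UB unless idx < ARRAY_LENGTH, so the
                                      // compiler assumes idx < ARRAY_LENGTH here
       && idx < ARRAY_LENGTH )        // thus always true -> folded away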
It is unclear where this range analysis occurs and whether the compiler could also warn the programmer about a redundant test being elided, the way it flags potential programming errors in if (a = b) ... or a = b << 1 + 2;
What makes this optimisation perverse IMHO is the lack of a warning for a non-obvious optimisation.
Unlike compilers, programmers are human and make mistakes. Even 10x programmers sometimes make stupid mistakes, and compilers should not assume that programmers are always right, unless they explicitly boast about it as in if ((a = b)) ...
The incorrect order of the tests g_ptrArray[idx] != nullptr and idx < ARRAY_LENGTH should be obvious in a code review. Conversely, the fact that idx < ARRAY_LENGTH is redundant if g_ptrArray[idx] != nullptr is assumed to be legitimate is non-obvious to human readers of the code. The compiler assumes that the programmer knows that g_ptrArray[idx] != nullptr can be performed, hence assumes that idx is in the proper range and deduces that the second test is redundant. This is perverse: if the programmer was shrewd enough to correctly assume idx is always in the proper range, they surely would not have written the redundant test. Conversely, if they made a mistake and wrote the tests in the wrong order, flagging the redundant code as such would help fix a blatant bug.
When compilers become smart enough to detect such redundancy, this level of analysis should profit the programmer, and help detect programming errors, not make debugging harder than it already is.
Hopefully tools like gcc -fsanitize=undefined (or perhaps Valgrind?) would be able to catch the out-of-bounds read in g_ptrArray[idx] != nullptr. But only if you had a test-case where all the pointers were non-null, so you'd have to already be looking for possible errors in loop termination to catch it. Good point about the lack of warning. To me, this optimization does seem obvious, but you're right, I wouldn't have written it that way in the first place (at least not on purpose) because I know you need to check things before you use them at all. – Peter Cordes 7 hours ago
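For what it's worth, a throwaway harness along these lines (the main and g_values below are my own assumptions, not from the comment) could exercise that idea: fill the array with non-null pointers so the loop runs off the end, and build everything with the sanitizer flags, e.g. g++ -Os -fsanitize=address,undefined:

#define ARRAY_LENGTH 666

int g_sum = 0;
int  g_values[ ARRAY_LENGTH ];
int *g_ptrArray[ ARRAY_LENGTH ];    // definition matching the extern declaration

void test();    // the function from the question; compile it with the same flags

int main()
{
    // Every entry is non-null, so the loop never sees a nullptr and the read
    // of g_ptrArray[ ARRAY_LENGTH ] should be reported by the sanitizer.
    for( unsigned int i = 0; i < ARRAY_LENGTH; ++i )
        g_ptrArray[ i ] = &g_values[ i ];

    test();
    return 0;
}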
In the language that Dennis Ritchie invented in the early 1970s and documented in the 1974 C Language Reference Manual, an expression like arr[index] would take the address of arr, use the platform's natural pointer indexing method to displace it by the size of index elements of arr's element type, and access the resulting storage using the platform's natural means of reading an object of that type, with whatever consequences would result. If the address would fall outside the array, but the programmer knew enough about how arr would be placed and the way the platform would respond to reads of that address to know that the behavior of reading it would satisfy application requirements, then code which read outside the array would be non-portable but possibly correct. On the flip side, if on the target platform, attempting to read from a certain address within a second of the last floppy disk access would cause the most recently accessed track on that disk to be completely overwritten (actual behavior of the popular Apple II computer) and a programmer didn't ensure that no stray reads would hit such addresses, such code should have been viewed as broken. There was no need for a compiler to distinguish between correct but not portable code and broken code.
The C Standard allows implementations to process array accesses in a manner consistent with Dennis Ritchie's language, but it makes no attempt to catalog all of the actions that should be usable within non-portable programs for various platforms. Instead, it waives jurisdiction over how implementations process actions such as array indexing beyond the immediate confines of an array, using the phrase "Undefined Behavior" as a catch-all for everything from constructs that are non-portable but may be correct on some implementations to those that are simply erroneous, without making any effort to distinguish them.
The authors of clang and gcc operate under the philosophy that when the Standard says a construct or corner case is "non-portable or erroneous", what it really means is "non-portable, and therefore erroneous", and that because the authors of the Standard don’t care how implementations treat such constructs or corner cases, programmers have no right to view any way the implementation might handle such cases as being inferior to any other.
Incidentally, the loop may have been written in such a weird fashion because the original compiler it targeted would process that loop slightly more efficiently than a version which checks the boundary condition first. Such optimizations tend to be futile when using the clang and gcc optimizers even when UB isn’t an issue, because those implementations will substitute their own judgment of how the loop should be processed for what the programmer wrote, even if what the programmer wrote would have been more efficient.
You just accessed g_ptrArray[ idx ] to check for NULL, and the compiler can see the array declaration so it knows its size. It would be UB for idx to be past the end of the array, so the compiler can assume it doesn't happen. – yesterday
See also: examples of UB and optimization – yesterday
There is no point in validating idx after indexing with it; when it's out-of-bounds it is already too late. – yesterday
You should be aware that "undefined behavior means anything can happen, including but not limited to the program giving your expected output. But never rely on the output of a program that has UB. The program may just crash." – yesterday
@Lundin: It's not unrolling in that -Os code-gen. It "rotated" the loop and partially peeled the first iteration (loading g_ptrArray[0] ahead of the loop with mov rsi, [rax], and testing it for null). This is part of "loop inversion", re-arranging a while loop so there's a conditional branch at the bottom: Why are loops always compiled into "do…while" style (tail jump)? So cmp rdx, 664 / ja done is exiting the loop when i = 665 or higher, since it's about to load from g_arrayPtr[i+1]. (Clang doesn't optimize away the idx check) – yesterday
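A rough source-level picture of that rotation, as a hand-written sketch rather than actual compiler output (the name test_rotated is mine):

void test_rotated()
{
    unsigned int idx = 0;
    int *p = g_ptrArray[ 0 ];         // first load peeled out of the loop

    if( p != nullptr )                // guard replaces the first while-test
    {
        do
        {
            g_sum += *p;
            ++idx;
            p = g_ptrArray[ idx ];    // load for the next iteration; the bottom
        } while( p != nullptr );      // branch is the only conditional jump
    }
}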