GCC removes a bounds check in the right operand of &&, but not in the left operand, why?


I have the following C/C++ code snippet:

#define ARRAY_LENGTH 666

int g_sum = 0;
extern int *g_ptrArray[ ARRAY_LENGTH ];

void test()
{
    unsigned int idx = 0;

    // either enable or disable the check "idx < ARRAY_LENGTH" in the while loop
    while( g_ptrArray[ idx ] != nullptr /* && idx < ARRAY_LENGTH */ )
    {
        g_sum += *g_ptrArray[ idx ];
        ++idx;
    }

    return;
}

When I compile the above code with GCC 12.2.0 and the option -Os, for both of the following cases:

  1. the while loop condition is g_ptrArray[ idx ] != nullptr
  2. the while loop condition is g_ptrArray[ idx ] != nullptr && idx < ARRAY_LENGTH

I get the following assembly:

test():
        ldr     r2, .L4
        ldr     r1, .L4+4
.L2:
        ldr     r3, [r2], #4
        cbnz    r3, .L3
        bx      lr
.L3:
        ldr     r3, [r3]
        ldr     r0, [r1]
        add     r3, r3, r0
        str     r3, [r1]
        b       .L2
.L4:
        .word   g_ptrArray
        .word   .LANCHOR0
g_sum:
        .space  4

As you can see, the assembly does NOT do any check of the variable idx against the value ARRAY_LENGTH.


My question

How is that possible?
How can the compiler generate exactly the same assembly for both cases and ignore the idx < ARRAY_LENGTH condition when it is present in the code? Please explain the rule or procedure by which the compiler comes to the conclusion that it can completely remove the condition.

The output assembly shown in Compiler Explorer (note that both assemblies are identical):

  1. the while condition is g_ptrArray[ idx ] != nullptr:

  2. the while condition is g_ptrArray[ idx ] != nullptr && idx < ARRAY_LENGTH:

NOTE: If I swap the order of conditions to be idx < ARRAY_LENGTH && g_ptrArray[ idx ] != nullptr, the output assembly contains the check for value of idx as you can see here: https://godbolt.org/z/fvbsTfr9P.


  • You just accessed g_ptrArray[ idx ] to check for NULL, and the compiler can see the array declaration so it knows its size. It would be UB for idx to be past the end of the array, so the compiler can assume it doesn't happen.

    – Peter Cordes

    yesterday

  • See also – examples of UB and optimization

    – Richard Critten

    yesterday

  • There is no point in validating idx after indexing with it; when it's out-of-bounds it is already too late.

    – molbdnilo

    yesterday

  • You should be aware that undefined behavior means anything can happen, including but not limited to the program giving your expected output. Never rely on the output of a program that has UB; it may just crash.

    – user12002570

    yesterday

  • @Lundin: It's not unrolling in that -Os code-gen. It "rotated" the loop and partially peeled the first iteration (loading g_ptrArray[0] ahead of the loop with mov rsi, [rax], and testing it for null). This is part of "loop inversion", re-arranging a while loop so there's a conditional branch at the bottom: Why are loops always compiled into "do…while" style (tail jump)? So cmp rdx, 664 / ja done is exiting the loop when i = 665 or higher, since it's about to load from g_ptrArray[i+1]. (Clang doesn't optimize away the idx check)

    – Peter Cordes

    yesterday


4 Answers



Accessing an array out of bounds is undefined behavior, so the compiler can assume that it never happens in the LHS of the && expression. It then jumps through hoops (optimizations) to notice that, since ARRAY_LENGTH is the length of the array, the RHS condition must necessarily hold true (otherwise UB would already have ensued in the LHS). Hence the result you see.

The correct check would be idx < ARRAY_LENGTH && g_ptrArray[idx] != nullptr.
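For illustration, a minimal sketch of the loop with the operands in that order, reusing the declarations from the question (the function name is made up). Because idx is validated before it is ever used as an index, no out-of-bounds access can occur and the compiler has to keep the check:

void test_checked()    // hypothetical name; same globals as in the question
{
    unsigned int idx = 0;

    // idx is compared against ARRAY_LENGTH *before* it is used as an index,
    // so the access is always in bounds and the check cannot be removed
    while( idx < ARRAY_LENGTH && g_ptrArray[ idx ] != nullptr )
    {
        g_sum += *g_ptrArray[ idx ];
        ++idx;
    }
}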

Even potential undefined behavior can do nasty things like that!


  • A blatant case of compiler perversion: instead of warning about an obvious programmer error, the compiler optimizes code away silently. I wish the compiler folks would add -Wdont-assume-the-programmer-is-always-right

    – chqrlie

    yesterday


  • Even potential undefined behavior can do nasty things like that! – an easier way to express that is: "The compiler can assume UB doesn't happen in your program". (The C++ standard is clear that this is retroactive, that later code can imply value-ranges for earlier code if there isn't a branch that would avoid the UB. ISO C is not clear on that: see Does undefined behaviour retroactively mean that earlier visible side-effects aren't guaranteed? which compares the standards)

    – Peter Cordes

    yesterday

  • In this case, it's UB that would have necessarily already happened for idx < ARRAY_LENGTH to be reachable with an idx that makes the comparison false. That's just a matter of terminology in how to describe things, I think; the concept seems relatively simple. It gets trickier if you had code like if (x < 0) *ptr = 1; arr[x] = 2; where arr is an actual array, not just a pointer, so the compiler can be sure that a negative index would be out of bounds and thus UB.

    – Peter Cordes

    yesterday

  • Sorry, this is mostly a nitpick, as Peter said a matter of terminology. By "can happen at runtime given the right conditions" I mean that there's a certain moment at runtime when the UB will inevitably happen as the program continues; that's when the adverse effects kick in. As long as it remains merely "potential", the program doesn't break; it's impossible to notice that the condition was removed from the loop without causing UB.

    – HolyBlackCat

    yesterday


  • @chqrlie you are just questioning the fundamental principles of C and C++. They are made for the hypothetical developer "who knows what they are doing" or "knows better than the compiler". While I'd agree with you regarding the design philosophy, I don't see a point in discussing it in the context of such a language while alternative languages exist. As Passer By said, the code might not have been written in this form literally. At the same time, compiler writers try to implement warnings for cases that look like a mistake. It would be interesting to check which compilers already warn here.

    – Holger

    15 hours ago



The C standard (C17 6.5.6 §8) says that we may not do pointer arithmetic outside an array nor access it beyond the array – doing so is undefined behavior and anything can happen.

Therefore, strictly speaking, the out-of-bounds check is superfluous, since your loop condition amounts to "stop upon spotting a null pointer within the array". And in case g_ptrArray[ idx ] is an out-of-bounds access, you invoke undefined behavior, so the program would in theory be toast at this point; there is no need to evaluate the right operand of &&. (As you probably know, && has strict left-to-right evaluation.) The compiler can assume that the access is always inside the array of known size.
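A rough way to picture the deduction (a sketch of the reasoning, not actual compiler output): once the left operand has been evaluated, the access itself is taken as evidence about idx, so the compiler may treat the loop roughly as if it had been written like this:

// Sketch only: evaluating g_ptrArray[ idx ] is defined only if idx <= 665,
// so after the access the compiler may assume idx < ARRAY_LENGTH (666)
// holds and fold the right operand of && to a constant.
while( g_ptrArray[ idx ] != nullptr && true )
{
    g_sum += *g_ptrArray[ idx ];
    ++idx;
}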

We can smack the compiler back in line by adding something that makes it impossible for the compiler to predict the code:

int** I_might_change_at_any_time = g_ptrArray;
void test2()
{
    unsigned int idx = 0;

    // the idx < ARRAY_LENGTH check is now kept in the generated code
    while( I_might_change_at_any_time[ idx ] != nullptr && idx < ARRAY_LENGTH )
    {
        g_sum += *g_ptrArray[ idx ];
        ++idx;
    }
}

Here the pointer acts as a "middle man". It's a file scope variable with external linkage and so it may change at any time. The compiler can no longer assume that it always points at g_ptrArray. It is now possible for the left operand of && to be a well-defined access, and therefore gcc now adds the out-of-bounds check to the assembly:

        cmp     QWORD PTR [rdx+rax*8], 0
        je      .L6
        cmp     eax, 666
        je      .L6

16

  • “The C standard (C17 6.5.6 §8) says that we may not do pointer arithmetic outside an array”. You know this, but there’s one exception: we can calculate a pointer one element past the end of an array, and compare it to a pointer within the array. That is, p < &array[0] + ARRAY_SIZE for p between &array[0] and endp is valid.

    – Davislor

    yesterday


  • @Davislor: The C Standard waives jurisdiction over how compiler writers process pointer arithmetic outside an array, so as to allow them to process it in whatever manner would best serve the needs of their customers, who were expected to be the programmers that would be writing code for use with the compilers. The only things that would be forbidden within a "conforming C program" would be constructs that would compel a conforming C implementation to reject it.

    – supercat

    yesterday

  • @supercat That is not correct. The Standard specifically allows it: “If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.”

    – Davislor

    yesterday


  • "It's a file scope variable with external linkage and so it may change at any time". That doesn't follow, and in fact here it cannot change during the loop; doing so would be either be a data race or a violation of strict aliasing.

    – Ben Voigt

    yesterday

  • @supercat: No one is arguing with you about what an implementation may do with UB. Davislor is correctly pointing out that it's not the pointer arithmetic here that is UB, it's the lvalue-to-rvalue conversion (or C's equivalent). g_ptrArray + idx is fine even when idx == ARRAY_LENGTH, but *(g_ptrArray + idx) (assuming it is in an evaluated context and not the operand of &) is not.

    – Ben Voigt

    yesterday




The explanation: as documented by Marco Bonelli, the compiler assumes that the programmer who wrote the first test g_ptrArray[idx] != nullptr knows that this code has defined behavior, hence it assumes that idx is in the proper range, i.e. idx < ARRAY_LENGTH. It follows that the next test && idx < ARRAY_LENGTH is redundant, as idx is already known to be < ARRAY_LENGTH, hence the code can be elided.

It is unclear where this range analysis occurs and whether the compiler could also warn the programmer about a redundant test being elided, the way it flags potential programming errors in if (a = b) ... or a = b << 1 + 2;

What makes this optimisation perverse IMHO is the lack of a warning for a non-obvious optimisation.

Unlike compilers, programmers are human and make mistakes. Even 10x programmers sometimes make stupid mistakes, and compilers should not assume that programmers are always right, unless they explicitly boast about it as in if ((a = b)) ...

The incorrect order of the tests g_ptrArray[idx] != nullptr and idx < ARRAY_LENGTH should be obvious in a code review. Conversely, the fact that idx < ARRAY_LENGTH is redundant if g_ptrArray[idx] != nullptr is assumed to be legitimate is non-obvious to human readers of the code. The compiler assumes that the programmer knows that g_ptrArray[idx] != nullptr can be performed, hence assumes that idx is in the proper range and deduces that the second test is redundant. This is perverse: if the programmer was shrewd enough to correctly assume idx is always in the proper range, they surely would not have written the redundant test. Conversely, if they made a mistake and wrote the tests in the wrong order, flagging the redundant code as such would help fix a blatant bug.

When compilers become smart enough to detect such redundancy, this level of analysis should profit the programmer, and help detect programming errors, not make debugging harder than it already is.


  • Hopefully tools like gcc -fsanitize=undefined (or perhaps Valgrind?) would be able to catch the out-of-bounds read in g_ptrArray[idx] != nullptr. But only if you had a test-case where all the pointers were non-null, so you'd have to already be looking for possible errors in loop termination to catch it. Good point about the lack of warning. To me, this optimization does seem obvious, but you're right, I wouldn't have written it that way in the first place (at least not on purpose) because I know you need to check things before you use them at all.

    – Peter Cordes

    7 hours ago



In the language that Dennis Ritchie invented in the early 1970s and documented in the 1974 C Language Reference Manual, an expression like arr[index] would take the address of arr, use the platform’s natural pointer indexing method to displace it by the size of index elements of arr‘s element type, and access the resulting storage using the platform’s natural means of reading an object of that type, with whatever consequences would result. If the address fell outside the array, but the programmer knew enough about how arr would be placed and about the way the platform would respond to reads of that address to know that reading the address would satisfy application requirements, then code which read outside the array would be non-portable but possibly correct. On the flip side, if on the target platform attempting to read from a certain address within a second of the last floppy disk access would cause the most recently accessed track on that disk to be completely overwritten (actual behavior of the popular Apple II computer), and a programmer didn’t ensure that no stray reads would hit such addresses, such code should have been viewed as broken. There was no need for a compiler to distinguish between correct-but-non-portable code and broken code.
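As a sketch of that traditional reading (illustrative only; the names are made up, and no particular compiler is claimed to emit exactly this):

/* arr[index] was understood as *(arr + index): take the base address,
   displace it by index elements of the element type, and read whatever
   happens to be stored at the resulting address. The language itself
   implied no bounds check. */
extern int arr[666];

int read_at(unsigned index)
{
    return *(arr + index);   /* the same operation as arr[index] */
}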

The C Standard allows implementations to process array accesses in a manner consistent with Dennis Ritchie’s language, but it makes no attempt to catalog all of the actions that should be usable within non-portable programs for various platforms. Instead, it waives jurisdiction over how implementations process actions such as array indexing beyond the immediate confines of an array, using the phrase "Undefined Behavior" as a catch-all for both constructs that are non-portable but may be correct on some implementations, to those that are simply erroneous, without making any effort to distinguish them.

The authors of clang and gcc operate under the philosophy that when the Standard says a construct or corner case is "non-portable or erroneous", what it really means is "non-portable, and therefore erroneous", and that because the authors of the Standard don’t care how implementations treat such constructs or corner cases, programmers have no right to view any way the implementation might handle such cases as being inferior to any other.

Incidentally, the loop may have been written in such a weird fashion because the original compiler it targeted would process that loop slightly more efficiently than a version which checks the boundary condition first. Such optimizations tend to be futile when using the clang and gcc optimizers even when UB isn’t an issue, because those implementations will substitute their own judgment of how the loop should be processed for what the programmer wrote, even if what the programmer wrote would have been more efficient.


