If a C program has undefined behavior, anything can happen. Therefore compilers may assume that any given program does not contain UB. So, suppose our program contains the following:
x += 5;
/* Do something else without x in the meantime. */
x += 7;
Of course, this can be optimized to
/* Do something without x. */
x += 12;
or, equivalently, with the combined x += 12 placed before the intervening code.
If x has type unsigned int, then there is no possibility of UB in the above program. On the other hand, if x has type signed int, then there is a chance of overflow and hence UB. Since the compiler may assume that our program contains no UB, it can do the same optimization as above. In fact, in this case the compiler can even assume that x <= INT_MAX - 12 held before the additions (i.e. that they cannot overflow).
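(For reference, the distinction being relied on here can be illustrated with a short sketch; the globals and the function name are made up for illustration.)

#include <limits.h>

unsigned int u = UINT_MAX;   /* largest unsigned int value */
int s = INT_MAX;             /* largest signed int value */

void demo(void) {
    u += 5;   /* well-defined: unsigned arithmetic wraps modulo UINT_MAX + 1, so u becomes 4 */
    s += 5;   /* signed overflow: undefined behaviour in ISO C */
}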
However, this seems to contradict Jens Gustedt’s famous "Modern C" (pg 42):
But such an optimization can also be forbidden because the compiler can’t prove that a certain operation will not force program termination. In our example, much depends on the type of x. If the current value of x could be close to the upper limit of the type, the innocent-looking operation x += 7 may produce an overflow. Such overflows are handled differently according to the type. As we have seen, overflow of an unsigned type is not a problem, and the result of the condensed operation will always be consistent with the two separate ones. For other types, such as signed integer types (signed) and floating-point types (double), an overflow may raise an exception and terminate the program. In that case, the optimization cannot be performed.
(Emphasis mine). If the compiler can (and does) assume our program has no UB, why can this optimization not be performed?
[1]: Jens Gustedt. Modern C. Manning, 2019. ISBN 9781617295812. hal-02383654
1 Answer
TL:DR: You're right, such an optimization is not forbidden for signed int, only for float/double, and not just because of exceptions in that case.
One reason for things to be UB is that some obscure machine could raise an exception. But hitting UB is not guaranteed to raise an exception on all machines (unless you're compiling with gcc -fsanitize=undefined, for the types of UB it or clang can reliably detect).
Signed integer overflow is UB in ISO C, so anything can happen at any point in the execution of a program where that happens. Some implementations might define the behaviour as raising an exception. If they define where that exception gets raised, then that would prevent the optimization. But that would be defining the behaviour, the exact opposite of UB meaning absolutely zero guarantees about what your program actually does, even before the point where the abstract machine encounters UB. (e.g. it would be legal to do x+=12 with a MIPS signed-overflow-trapping add before the block of code separating x+=5; from x+=7, as long as that block can't exit or otherwise fail to reach the x+=7.)
The compiler just has to avoid introducing exceptions on paths of execution that shouldn't have any. (Which isn't a problem for integer addition on mainstream ISAs; most don't even have a trapping signed-add instruction, and compilers targeting MIPS use addu even for signed math so they can optimize freely, and because historically programmers didn't want trapping on int math.)
Both int and unsigned can do this optimization; it's only FP types that can't, but that's (also) because of rounding, even if you compile with gcc -fno-trapping-math (an FP math option). See it in action on Godbolt with GCC 13 and Clang 16:
int sink;
int foo_signed(int x) {
x += 5;
sink = 1;
x += 7;
return x;
}
// and same for unsigned and float versions
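The comment above mentions unsigned and float versions; presumably they follow the same pattern, something like this sketch (the exact source isn't shown in the answer; the same sink global is reused, matching the assembly below):

unsigned foo_unsigned(unsigned x) {
    x += 5;
    sink = 1;       // intervening store that doesn't touch x
    x += 7;
    return x;
}

float foo_float(float x) {
    x += 5;         // the FP constants show up as the .LC0/.LC1 loads in the asm
    sink = 1;
    x += 7;
    return x;
}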
# GCC -O3 -fno-trapping-math
foo_signed: # input in EDI, retval in EAX
mov DWORD PTR sink[rip], 1
lea eax, [rdi+12] # x86 can use LEA as a copy-and-add instruction
ret
foo_unsigned:
mov DWORD PTR sink[rip], 1
lea eax, [rdi+12]
ret
foo_float: # first arg and retval in XMM0
addss xmm0, DWORD PTR .LC0[rip] # two separate FP adds.
mov DWORD PTR sink[rip], 1
addss xmm0, DWORD PTR .LC1[rip]
ret
You're correct; assuming x is a local variable so literally nothing can possibly use the x += 5 result, it's safe to optimize x+=5; ... ; x+=7 to x+=12 for both signed and unsigned integer types.
Unsigned integer math is of course fine.
Signed integer math has to produce the right result in any case where the abstract machine doesn't encounter UB. x+=12 does that. There's no guarantee that signed overflow raises an exception at any specific point in your program; that's the whole point of C's concept of undefined behaviour. For an execution that will encounter UB, literally anything can happen anywhere, before or after that point. This optimization would be safe even for turning x-=5; x+=7 into x+=2, where the abstract machine could wrap twice (encountering UB) where the asm doesn't wrap, since "happens to work" is an allowed behaviour, and a common one in practice.
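A concrete sketch of that back-and-forth wrapping case (the function name and starting value are made up for illustration; the abstract machine hits UB on the first step, but the combined form doesn't wrap for this input):

#include <limits.h>

int wobble(int x) {   /* imagine x == INT_MIN + 3 at the call site */
    x -= 5;           /* abstract machine wraps below INT_MIN: UB */
    x += 7;           /* ...and wraps back up */
    return x;         /* compiling this as x += 2 doesn't wrap for this input, and is allowed */
}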
If you use compiler options like gcc -fwrapv, that defines the behaviour of signed integer math to be 2's complement wrapping, removing the UB and making the situation identical to unsigned.
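A minimal sketch of the difference -fwrapv makes (without it this overflow is UB; with it the wrap is guaranteed):

#include <limits.h>
#include <stdio.h>

int main(void) {
    int x = INT_MAX;
    x += 1;              /* UB in ISO C; defined to wrap with gcc -fwrapv */
    printf("%d\n", x);   /* prints INT_MIN (-2147483648 with 32-bit int) under -fwrapv */
    return 0;
}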
GCC does sometimes miss optimizations with signed integer math because of some reluctance in GCC's internals to create signed overflow in a temporary in the asm where none would have existed in the abstract machine. This is a missed optimization when compiling for a machine that allows non-trapping integer math (i.e. all modern ISAs). For example, GCC will optimize a+b+c+d+e+f into (a+b)+(c+d)+(e+f) for unsigned int but not for signed int without -fwrapv. Clang does for both for AArch64 and RISC-V, although it chooses not to for x86 (Godbolt). Again, this is a missed optimization due to GCC being over-cautious for some unknown reason; it's perfectly valid. 2's complement signed math is identical to unsigned binary math, so it is associative; the final result will be correct even in cases where the optimized computation wrapped back and forth but the abstract machine didn't.
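The reassociation example above, as it might appear on Godbolt (a sketch; the function and parameter names are made up):

unsigned sum6_u(unsigned a, unsigned b, unsigned c,
                unsigned d, unsigned e, unsigned f) {
    return a + b + c + d + e + f;    /* GCC may evaluate as (a+b)+(c+d)+(e+f) */
}

int sum6_s(int a, int b, int c, int d, int e, int f) {
    return a + b + c + d + e + f;    /* without -fwrapv, GCC keeps the left-to-right chain */
}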
Signed overflow UB is only a thing in the abstract machine, not in asm; most mainstream ISAs don't even have integer addition instructions that trap on overflow. (MIPS does, but compilers don't use them for int math, so they can do optimizations that produce values that didn't exist in the abstract machine.)
Semi-related: Why doesn't GCC optimize a*a*a*a*a*a to (a*a*a)*(a*a*a)? (Answers there show that compilers do optimize to three multiplies for integer math, even for signed int.)
Floating-point math can’t do this optimization because it could give a different result in non-overflowing cases, due to different rounding. Two smaller numbers could both round down, vs. one larger number overcoming the threshold.
e.g. for a number large enough that the nearest representable double values are 16 apart from each other, adding 8 would get half-way and round to nearest-even (assuming the default rounding mode). But anything less, like 7 or 5, will always round back down: x + 7 == x, so both the 5 and the 7 would be lost, but x+12 all in one go would get over the hump to the next representable float or double, producing x+16.
(The magnitude of 1 unit-in-the-last-place (of the mantissa) depends on the exponent of a float/double. For large enough FP values, it's 1.0. For even larger values, e.g. double from 2^53 to 2^54, only even numbers are representable, and so on with larger exponents.)
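That effect can be demonstrated directly, assuming IEEE 754 double and the default round-to-nearest mode (2^56 is picked because adjacent doubles are exactly 16 apart at that magnitude):

#include <stdio.h>

int main(void) {
    double x = 72057594037927936.0;      /* 2^56: adjacent doubles are 16 apart here */
    double two_adds = (x + 5.0) + 7.0;   /* each add rounds back down to x */
    double one_add  = x + 12.0;          /* rounds up to the next representable double */
    printf("%.1f\n%.1f\n", two_adds, one_add);   /* prints x, then x + 16 */
    return 0;
}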
If you compile with GCC's buggy default of -ftrapping-math, it will try to respect FP exception semantics. It doesn't reliably generate 2 FP exceptions if overflow happens twice, so it might not care about that.
- Re "But that would be defining the behaviour, the exact opposite of UB meaning absolutely zero guarantees": "Undefined behavior" has a specific meaning in the C standard, and it is not "absolutely" free rein. It is very definitely a qualified free rein; the standard is clear that "undefined behavior" means only that the C standard does not impose any requirements. It does not negate or nullify any requirements from sources outside the C standard. …
  – Eric Postpischil
- … You've acknowledged in this answer that the compiler may define behavior beyond the standard, but it is not clear from your writing what you are saying about that with regard to this statement about undefined behavior.
  – Eric Postpischil
- @EricPostpischil: What I meant was that the optimizer doesn't have to care what happens for cases that involve UB. The quote in the question seems to be saying that signed-integer overflow UB means the program needs to produce an exception at a certain point in its execution (before or after the intervening code, depending on the initial value of x). But that's of course not the case for UB, only for an implementation that defines int math as trapping on signed overflow, a bit like clang -fsanitize=something.
  – Peter Cordes
- I'm pretty sure my answer could make the same point with much less text, and that I ended up with at least 3 different versions of an answer all crammed into one :/
  – Peter Cordes
- @EricPostpischil: I rephrased that paragraph; I think that helps some (with avoiding the implication of "free rein" applying more broadly than I intended, not with the answer being too long and redundantly repeating its points :P)
  – Peter Cordes
"If x has type unsigned int then there is no possibility of UB in the above program." — false if you allow arbitrary code in between, one can also get an array out-of-bounds there and UB in theoretical terms, overwrite of
x
in practical terms. However you cannot really talk about "practical terms" if you're talking about "UB not happening".18 hours ago
- Good question. I'm not sure what Jens Gustedt meant here either. For the record, it's in chapter 5 "Basic values and data", section 5.1.4 "Optimization", currently available on manning.com/books/modern-c if anyone else would like to check the full context.
- Sometimes books are wrong.
- @EricPostpischil After this discussion the author provides the takeaway: "Type determines optimization opportunities." So, presumably, the implication is that a compiler can only make this optimization if x is unsigned and not signed. But my contention is that this distinction is irrelevant, since the compiler can safely assume our program has no UB.
- I think he's simply wrong. The compiler doesn't have to prove that program termination doesn't happen, because it's allowed to assume it. This often enables optimizations, like the one you show, that would not be possible without that assumption.