How to get a warning when comparing unsigned integers of different size in C and C++?

How to get a warning when comparing unsigned integers of different size in C and C++?

18

A common source of bugs in C or C++ is code like

size_t n = // ...

for (unsigned int i = 0; i < n; i++) // ...

which can infinite-loop when the unsigned int overflows.

For example on Linux unsigned int is 32-bit, while size_t is 64-bit, so if n = 5000000000, we get an infinite loop.

How can I get a warning about this with GCC or clang?

GCC’s -Wall -Wextra doesn’t do it:

#include <stdint.h>

void f(uint64_t n)
{
    for (uint32_t i = 0; i < n; ++i) {
    }
}
gcc-13 -std=c17 
       -Wall -Wextra -Wpedantic 
       -Warray-bounds -Wconversion 
       -fanalyzer 
       -c -o 76840686.o 76840686.c

(no output)


Edit: Answering some of the questions in comments/answers:

  • I am looking for a solution that does not requre n to be a compile-time constant.
  • Ideally the solution would work on existing C/C++ projects without having to rewrite them entirely.
  • Suggesting other tools than compiler warnings would also be useful, but compiler warnings themselves would be best.

Share
Improve this question

7

  • 1

    Would you consider using an additional checker tool? There are products which work compiler-like on code. Some work on single code files (lint, QAC, MISRA), some work on whole linkable projects (Klocwork). They are highly configurable and I am sure that you can configure them as nosy and annoying, sorry, as thorough and in-depth, as you like.

    – Yunnosch

    yesterday

  • 3

    I don't accept the premiss. Is this really a common source of bugs?

    – user207421

    yesterday

  • I hardly believe the solution is in warnings. The responsibility lies on the programmer to use the correct type when doing variable comparisons.

    – machine_1

    yesterday

  • 2

    @machine_1: You could say that about anything warnings exist for. The responsibility for writing correct code always lies on the programmer. But warnings can help with doing that.

    – user2357112

    18 hours ago

  • 4

    @user207421 Yes, it is a common source of bugs. For example, I just hit and fixed it in the most-used algorithm for 3D surface reconstruction. I also found and fixed a similar 32-bit overflow bug in the glibc realloc() function, which is one of the most used functions in one of the most used libraries in the world.

    – nh2

    16 hours ago

5 Answers
5

Reset to default

Highest score (default)

Trending (recent votes count more)

Date modified (newest first)

Date created (oldest first)

8

There does not appear to be a warning option built in to gcc or
clang that does what is requested. However, we can use
clang-query
instead.

Below is a clang-query command that will report comparison of
32-bit and 64-bit integers, on the assumption that int is 32 bits and
long is 64 bits. (More about that below.)

#!/bin/sh

PATH=$HOME/opt/clang+llvm-14.0.0-x86_64-linux-gnu-ubuntu-18.04/bin:$PATH

# In this query, the comments are ignored because clang-query (not the
# shell) recognizes and discards them.
query='m
  binaryOperator(                            # Find a binary operator expression
    anyOf(                                   #  such that any of:
      hasOperatorName("<"),                  #   is operator <, or
      hasOperatorName("<="),                 #   is operator <=, or
      hasOperatorName(">"),                  #   is operator >, or
      hasOperatorName(">="),                 #   is operator >=, or
      hasOperatorName("=="),                 #   is operator ==, or
      hasOperatorName("!=")                  #   is operator !=;
    ),

    hasEitherOperand(                        #  and where either operand
      implicitCastExpr(                      #   is an implicit cast
        has(                                 #    from
          expr(                              #     an expression
            hasType(                         #      whose type
              hasCanonicalType(              #       after resolving typedefs
                anyOf(                       #        is either
                  asString("int"),           #         int or
                  asString("unsigned int")   #         unsigned int,
                )
              )
            )
          )
        ),
        hasImplicitDestinationType(          #    and to a type
          hasCanonicalType(                  #     that after typedefs
            anyOf(                           #      is either
              asString("long"),              #       long or
              asString("unsigned long")      #       unsigned long.
            )
          )
        )
      ).bind("operand")
    )
  )
'

# Run the query on test.c.
clang-query 
  -c="set bind-root false" 
  -c="$query" 
  test.c -- -w

# EOF

When run on the following test.c it reports all of the indicated cases:

// test.c
// Demonstrate reporting comparisons of different-size operands.

#include <stddef.h>          // size_t
#include <stdint.h>          // int32_t, etc.

void test(int32_t i32, int64_t i64, uint32_t u32, uint64_t u64)
{
  i32 < i32;                 // Not reported: same sizes.
  i32 < i64;                 // reported
  i64 < i64;

  u32 < u32;
  u32 < u64;                 // reported
  u64 < u64;

  i32 < u64;                 // reported
  u32 < i64;                 // reported

  i32 <= i64;                // reported

  i64 > i32;                 // reported
  i64 >= i32;                // reported

  i32 == i64;                // reported
  u64 != u32;                // reported

  i32 + i64;                 // Not reported: not a comparison operator.

  ((int64_t)i32) < i64;      // Not reported: explicit cast.

  // Example #1 in question.
  size_t n = 0;
  for (unsigned int i = 0; i < n; i++) {}        // reported
}

// Example #2 in question.
void f(uint64_t n)
{
  for (uint32_t i = 0; i < n; ++i) {             // reported
  }
}

// EOF

Some details about the clang-query command:

  • The command passes -w to clang-tidy to suppress other warnings.
    That’s just because I wrote the test in a way that provokes warnings
    about unused values, and is not necessary with normal code.

  • It passes set bind-root false so the only reported site is the
    operand of interest rather than also reporting the entire expression.

The unsatisfying aspect of the query is it explicitly lists the source
and destination types. Unfortunately, clang-query does not have a
matcher to, say, report any 32-bit type, so they have to be listed
individually. You might want to add [unsigned] long long on the
destination side. You might also need to remove [unsigned] long if running this
code with compiler options that target an IL32 platform like Windows.

Relatedly, note that clang-query accepts compiler options after
the --, or alternatively in a
compile_commands.json
file.

Finally, I’ll note that I haven’t done any "tuning" of this query on
real code. It is likely to produce a lot of noise.

Share
Improve this answer

2

For C++ you may be able to do even better than a compiler warning, assuming n is a compile time constant. This also works for non-gcc compilers. This logic is not available for C code though.

The idea is basically encoding the value information in the variable type instead of the variable value.

template<std::integral T, auto N>
constexpr bool operator<(T value, std::integral_constant<decltype(N), N>)
{
    static_assert(std::is_signed_v<T> == std::is_signed_v<decltype(N)>, "the types have different signs");
    static_assert((std::numeric_limits<T>::max)() >= N, "the maximum of type T is smaller than N");
    return value < N;
}

// todo: overload with swapped operator parameter types

int main()
{
    constexpr std::integral_constant<size_t, 500'000'000> n; // go with 5'000'000'000, and you'll get a compiler error 

    for (unsigned int i = 0; i < n; i++)
    {

    }
}

If the value is not a compile time constant, you could still create a wrapper template type for the integer and overload the < operator for comparisons with integral values, adding the static_asserts into the body of this operator.

template<std::integral T>
class IntWrapper
{
    T m_value;
public:
    constexpr IntWrapper(T value)
        : m_value(value)
    {}

    template<std::integral U>
    friend constexpr bool operator<(U o1, IntWrapper o2)
    {
        static_assert(std::is_signed_v<U> == std::is_signed_v<T>, "types have different signedness");
        static_assert((std::numeric_limits<U>::max)() >= (std::numeric_limits<T>::max)(),
            "the comparison may never yield false because of the maxima of the types involved");

        return o1 < o2.m_value;
    }
};

void f(IntWrapper<uint64_t> n)
{
    for (uint32_t i = 0; i < n; ++i) {
    }
}

Note that the necessity of changing the type for one of the operands of the comparison operator can be both a benefit and a drawback: it requires you to modify the code, but it also allows you to apply check on a per-variable basis…

Share
Improve this answer

5

  • This adds complexity and overhead to the code. I think the OP was asking for gcc flag.

    – machine_1

    yesterday

  • @machine_1 what overhead are you talking about? Any optimizer worthy of it's name will optimize the logic to the same instructions for both the code in the question and mine. As for the added complexity: That's a tradeoff between making this work for any standard compliant compiler vs possibly having to adjust the compiler flags for every single compiler and possibly not even having diagnostics available for some compilers. It may not be exaclty what the OP is looking for, but it may be of interest for people looking for a solution for a similar issue in the future…

    – fabian

    yesterday

  • @machine_1 I guess to avoid such problem is more a programmer-issue rather than the compiler's job. Maybe we can use a configure file to check the environment or something else written by ourselves, rather than depending these things on the compiler.

    – LiuYuan

    yesterday

  • Are you sure that it's legal to overload operators for builtin and std types? I'm pretty sure at least one argument must be of user defined type.

    – chrysante

    21 hours ago

  • @chrysante I skimmed though the section of the standard that seems relevant. However I didn't find this kind of restriction (I may have overlooked it though). Assuming the operator is implemented in a namespace other than std and sub-namespaces. It should be ok to add this implementation.

    – fabian

    1 hour ago

2

Recent versions of gcc seem to support -Warith-conversion for this purpose:

-Warith-conversion

Do warn about implicit conversions from arithmetic operations even when conversion of the operands to the same type cannot change their values. This affects warnings from -Wconversion, -Wfloat-conversion, and -Wsign-conversion.

void f (char c, int i)
{
    c = c + i; // warns with -Wconversion
    c = c + 1; // only warns with -Warith-conversion
}

Yet it does not work for your example, probably because i < n is not an arithmetic expression. There does not seem to be a variant of this warning for generic binary expressions.

Share
Improve this answer

1

  • @DavidRanieri: this explains it. Too bad there is no equivalent for more general binary operations

    – chqrlie

    19 hours ago

1

PVS Studio can issue such warning (and many more), here is almost identical example from their docs:

https://pvs-studio.com/en/docs/warnings/v104/

It is a paid tool, but they give free license to Open Source projects.

I did not find such a warning in Clang-tidy, a free linter tool from LLVM project, but it would be very simple to add a check for comparison of integers of different sizes (a later reply by Scott McPeak with excellent clang-query did most of the work – the remaining part is just plugging this query to clang-tidy). It would be very noisy check though. One can restrict the noise by limiting the check to conditions of loops, that can be done with Clang-tidy too, but a bit more work with AST matchers.

Share
Improve this answer

0

This does not directly answer the question (providing a warning), but would you consider an alternative which avoids the problem entirely?

    size_t n = // ...

    for (typeof(n) i = 0; i < n; i++) // ...

It now doesn’t matter what type n is, since i will always be the same type as n, you should never have trouble with infinite loops resulting from i being smaller or having a smaller range then n.

Share
Improve this answer

Your Answer

Draft saved
Draft discarded

Post as a guest

Required, but never shown


By clicking “Post Your Answer”, you agree to our terms of service and acknowledge that you have read and understand our privacy policy and code of conduct.

Not the answer you're looking for? Browse other questions tagged

or ask your own question.

Leave a Reply

Your email address will not be published. Required fields are marked *