Emulating signed integers using unsigned integers in C

Emulating signed integers using unsigned integers in C


11

In Jens Gustedt’s book Modern C, on page 59, he explains how signed integers can be emulated using unsigned integers.
His example code shows how one can implement a comparison of two unsigned integers reinterpreted as signed integers:

bool is_negative(unsigned a) { 
   unsigned const int_max = UINT_MAX /2;
   return a > int_max;
}

bool is_signed_less(unsigned a, unsigned b) {
   if (is_negative(b) && !is_negative(a)) return false;
   else return a < b; 
} 

Do I misunderstand something here or does he miss the second special case where is_negative(a) = true and is_negative(b) = false?

For example if we want to have a = -1 and b = 1, then, using two’s complement, we would represent them as

unsigned int a = UINT_MAX; 
unsigned int b = 1;    

(e.g. for a 4 bit integer we would have a = 1111 and b = 0001).
Now we have is_negative(a) returns true, and is_negative(b) returns false. When calling is_signed_less(a, b) we end up in the else clause and a < b (now interpreted as unsigned integers) will return false. However, it is clearly true that -1 < 1, so the function returns the wrong result.

Is this a typo in the code of the book or is there something that I do not understand?

3

  • 1

    Hmm, when the signs are the same, with 2's complement, return a < b suffices. When signs differ, return a > b should apply.

    – chux – Reinstate Monica

    17 hours ago

  • 3

    It is wrong, and a simpler solution is #define Offset (UINT_MAX/2+1) / bool is_signed_less(unsigned a, unsigned b) { return a+Offset < b+Offset; }.

    – Eric Postpischil

    16 hours ago


  • All this case-by-case stuff is inherently error-prone and it only gets worse when you start using comparisons (or XOR) between booleans.

    – harold

    9 hours ago

5 Answers
5


6

This is what happens when people try to be "smart" instead of following "keep it simple, stupid" best practices. Good engineering involves writing the code as simple as you possibly can, for example:

bool is_signed_less_correct (unsigned a, unsigned b) {
  bool is_neg_a = is_negative(a);
  bool is_neg_b = is_negative(b);

  if(is_neg_a != is_neg_b) // one is negative
  {
    return is_neg_a; // if one is negative and it is a, return true otherwise false
  } 

  // both are negative or both are positive
  return a < b;
}

Even this code a bit too "smart" still, since it implicitly uses the fact that -1 == 0xFFFF… is the largest 2’s complement signed number and therefore a < b holds true no matter if these are negative or not, as long as they are both of the same signedness.

Then of course you would always write a little unit test to sanity check it: https://godbolt.org/z/h4nKsffqr

Output:

-2 < -1 ? true  (is_signed_less_gustedt)
-1 < -1 ? false (is_signed_less_gustedt)
-1 <  0 ? false (is_signed_less_gustedt)
 0 < -1 ? false (is_signed_less_gustedt)
 0 <  1 ? true  (is_signed_less_gustedt)
 1 <  0 ? false (is_signed_less_gustedt)
 1 <  1 ? false (is_signed_less_gustedt)

-2 < -1 ? true  (is_signed_less_correct)
-1 < -1 ? false (is_signed_less_correct)
-1 <  0 ? true  (is_signed_less_correct)
 0 < -1 ? false (is_signed_less_correct)
 0 <  1 ? true  (is_signed_less_correct)
 1 <  0 ? false (is_signed_less_correct)
 1 <  1 ? false (is_signed_less_correct)


3

Yes, this is an error for the reason you specified.

In the case the signs are the same, using a < b works as normal. If they differ, then we need to use a > b instead as all "negative" values are greater than all positive values. We can check this with an XOR:

bool is_signed_less(unsigned a, unsigned b) {
   if (is_negative(b) ^ is_negative(a)) return b < a;
   else return a < b;
}


3

As already suggested by Eric Postpischil in the comments above, this is surely simpler and more efficient than the book version (or any of the other answers posted so far, AFAICT):

bool is_signed_less(unsigned a, unsigned b) {
   unsigned const offset = (unsigned)INT_MIN;
   return (a - offset) < (b - offset); 
}

This code simply subtracts an offset from both of the inputs, so that the number representing the lowest possible signed integer value (INT_MIN) is mapped to the lowest possible unsigned integer (0), and then does an unsigned comparison.

In particular, the subtraction maps the lowest signed integer (INT_MIN) to 0, the second-lowest signed integer (INT_MIN + 1) to 1, the third-lowest (INT_MIN + 2) to 2, and so on. Meanwhile, non-negative signed integers (whose bit patterns correspond to unsigned integers less than offset) will cause the subtraction to wrap around, yielding values that are higher than those corresponding to any negative input.


Note that there are several possible (and equivalent) variations of this code. For example, that constant offset could be equivalently defined (in terms of the standard integer limit constants) as either (unsigned)INT_MIN or UINT_MAX / 2 + 1; both of these evaluate to an unsigned integer with only the most significant bit set (i.e. 0x80000000 on platforms with 32-bit ints).

Also, a curious numeric quirk of how fixed-width binary arithmetic works means that offset is the unique unsigned integer for which x + offset, x - offset and x ^ offset are all equal (for any x) — all of these operations just flip the most significant bit of x. Thus, in the code above, the - operators can be replaced by either + or ^ without changing the behavior of the code in any way!

(There could theoretically be a small performance difference, if some of these operations happened to run faster or pipeline better on your CPU, and if your compiler wasn’t smart enough to optimize them all into the same assembly code. If you’re worried that might be the case, benchmark them all and find out. Most likely they’ll all be equally fast, though. Also, if you really want to obfuscate the code and confuse readers, try mixing and matching the operators into something like return (a ^ offset) < (offset + b). :P)


2

Yes, it’s an error. One way of resolving it is this:

bool is_signed_less(unsigned a, unsigned b) {
    bool negativeA = is_negative(a);
    bool negativeB = is_negative(b);
    return (
        (
            negativeA != negativeB ?
            negativeA :
            a < b
        )
    );
}

First I computed negativeA and negativeB, respectively to avoid computing them again and again. Then, I checked if their sign differs.

Because if the sign of a and b differs and a is negative, then a < b.

Otherwise, if the sign of a and b is the same and a is positive, then a > b.

In the other case, when their sign is the same, if a is negative, then a is smaller than b if it’s represented as a greater unsigned number.

Otherwise, if their sign is the same, but a is positive, that means b is also positive and they can be compared normally.

5

  • Anyway, the real point here is that we shouldn't try to be "smart" about supposedly simple stuff like this. I'm pretty sure that both you and Jens Gustedt are seasoned programmers, and yet here we are.

    – Lundin

    15 hours ago


  • @Lundin apparently I had an error indeed but it was a misthinking on my part which may happen anytime when we did not test a code. I have applied a fix, thanks for pointing out the shortcoming of the answer and the downvote as well, but I have no issue using ternaries in such situations.

    – Lajos Arpad

    15 hours ago

  • I'll delete my comments and I've revoked the downvote too. And well, there are 3 very similar answers now, just different choices of operators and local variables.

    – Lundin

    14 hours ago

  • @Lundin thanks for the comments and downvote anyways. It's always good to push the answerers to fix whatever mistakes they leave in their answers. Yes, the answers are pretty similar and differ in style. I think that's good, readers may choose what fits them the best for whatever reason they might have.

    – Lajos Arpad

    14 hours ago

  • 1

    Btw a minor nitpick is that your version contains an implicit conversion to int and then back to bool. It's actually 100% equivalent to (bool) (negativeA != negativeB ? (int)negativeA : a < b). Didn't affect the result here, but implicit conversions are to be avoided. This too was caused by the ?: operator. Something like gcc -Wconversion might decide to whine over stuff like this too.

    – Lundin

    14 hours ago



-1

Assuming we’re using a 2s-complement representation, it looks like an error. I’d write

   return is_negative(a) && !is_negative(b) || a < b;

(And I’d write some tests for it, too).

3

  • Thank you for confirming that it is an error. However, your suggestion does not work either. if a=1 and b = -1 then the logic expression you wrote would be true, but a is not less than b. The first && part (which takes precedence) is evaluated to is_neg(a) && !is_neg(b) = false && false = false and the second part is a < b gives 1 < UINT_MAX which is true. So you get false || true = true.

    – JezuzStardust

    17 hours ago

  • That's why I'd write some unit-tests.

    – Toby Speight

    14 hours ago

  • is_negative(a) == is_negative(b) ? a < b : a > b is better, as others have written.

    – Toby Speight

    14 hours ago



Leave a Reply

Your email address will not be published. Required fields are marked *