Unexpected uint64 behaviour 0xFFFF’FFFF’FFFF’FFFF – 1 = 0?

28

Consider the following brief NumPy session showcasing the uint64 data type:

import numpy as np
 
a = np.zeros(1,np.uint64)
 
a
# array([0], dtype=uint64)
 
a[0] -= 1
a
# array([18446744073709551615], dtype=uint64)
# this is 0xffff ffff ffff ffff, as expected

a[0] -= 1
a
# array([0], dtype=uint64)
# what the heck?

I’m utterly confused by this last output.

I would expect 0xFFFF’FFFF’FFFF’FFFE.

What exactly is going on here?

My setup:

>>> import sys
>>> sys.platform
'linux'
>>> sys.version
'3.10.5 (main, Jul 20 2022, 08:58:47) [GCC 7.5.0]'
>>> np.version.version
'1.23.1'


  • 5

    Interestingly, I get an OverflowError: long too big to convert for the second decrement. What version of numpy is this, and in what environment?

    – Dan Mašek

    yesterday

  • 3

    Does it matter if you make the 1 an np array of uint64 type also?

    – ProfDFrancis

    yesterday

  • 4

    Same as Dan, I get OverflowError: Python int too large to convert to C long

    – njzk2

    yesterday

  • 4

    b = np.zeros(1,np.uint64); b[0] = 1; a - b gives, as one may expect, array([18446744073709551614], dtype=uint64). Similarly, a[0] -= np.uint64(1) also works. I suspect the -= is allowing a conversion to the type of the right-hand-side value.

    – njzk2

    yesterday

  • 3

    I'm seeing the same issue as OP. Running 3.11 on MacOS. a[0] -= 1 leaves the value at zero, while a -= 1 sets the value to FF..FF.

    – Frank Yellin

    yesterday

2 Answers
28

By default, NumPy converts Python int objects to numpy.int_, a signed integer dtype corresponding to C long. (This decision was made back in the early days when Python int also corresponded to C long.)

There is no integer dtype big enough to hold all values of numpy.uint64 dtype and numpy.int_ dtype, so operations between numpy.uint64 scalars and Python int objects produce float64 results instead of integer results. (Operations between uint64 arrays and Python ints may behave differently, as the int is converted to a dtype based on its value in such operations, but a[0] is a scalar.)
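
The promotion rule described above can be checked directly with np.result_type. This is a minimal sketch; the variable names are illustrative and not from the original post, and it only inspects dtype promotion between the two dtypes. How a bare Python int is treated next to a uint64 scalar is version-dependent, so that part is deliberately left out.

```python
import numpy as np

# uint64 combined with a signed 64-bit integer: no integer dtype can hold
# every value of both, so NumPy's promotion rules fall back to float64.
mixed = np.result_type(np.uint64, np.int64)
print(mixed)  # float64

# Two uint64 operands need no promotion and stay in uint64.
same = np.result_type(np.uint64, np.uint64)
print(same)  # uint64
```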

Your first subtraction produces a float64 with value -1, and your second subtraction produces a float64 with value 2**64 (since float64 doesn’t have enough precision to perform the subtraction exactly). Both of these values are out of range for uint64 dtype, so converting back to uint64 for the assignment to a[0] produces undefined behavior (inherited from C – NumPy just uses a C cast).
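
The precision loss in the second subtraction can be reproduced without touching uint64 at all. The sketch below (names are illustrative) stays entirely in float64 and deliberately skips the cast back to uint64, since that cast is the part with undefined behavior.

```python
import numpy as np

# 2**64 - 1 needs 64 significant bits; float64 has only 53,
# so the nearest representable value is 2**64 itself.
as_float = np.float64(2**64 - 1)
print(as_float == 2.0**64)  # True

# The closest float64 below 2**64 is 2048 away, so subtracting 1
# rounds straight back to 2**64.
below = np.nextafter(as_float, 0.0)
print(below == 2.0**64 - 2048)  # True

after_decrement = as_float - 1
print(after_decrement == 2.0**64)  # True
```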

On your machine, this happened to produce wraparound behavior, so -1 wrapped around to 18446744073709551615 and 2**64 wrapped around to 0, but that’s not a guarantee. You might see different behavior on other setups. People in the comments did see different behavior.
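
One way to sidestep the float64 round trip, in line with the suggestions in the comments on the question, is to keep both operands in uint64. The following is a sketch under that assumption, not a statement of what NumPy recommends; the name `one` is made up for illustration.

```python
import numpy as np

a = np.zeros(1, np.uint64)
one = np.uint64(1)  # same dtype as the array, so no promotion to float64

# Array-level in-place subtraction stays in uint64 and wraps modulo 2**64.
a -= one
first = int(a[0])
print(hex(first))   # 0xffffffffffffffff

a -= one
second = int(a[0])
print(hex(second))  # 0xfffffffffffffffe
```

Here the subtraction is applied to the whole array rather than to the scalar a[0], so the arithmetic is ordinary unsigned integer arithmetic and never passes through a floating-point value.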


  • 4

    I am curious, as someone who only uses numpy occasionally: does the community feel this is a defect? This is incredibly surprising, since most languages with explicitly specified fixed-width types have their behaviour defined similarly or identically to C.

    – Chuu

    22 hours ago

  • 3

    @Chuu: I dunno about the general community, but I'd personally prefer if mixing uint64 and signed integers produced uint64 output instead of float64 output. (C handles this specific case better, but it's got its own issues – for example, integer arithmetic on two unsigned operands can produce signed overflow and undefined behavior in C, due to how unsigned types smaller than int get promoted to signed int. I'm not sure if NumPy has any protections in place to avoid this issue.)

    – user2357112

    22 hours ago

  • 2

    Isn’t inheriting UB from C rather worrying, given that the C standard allows a program that invokes UB to do literally anything? Admittedly, most (all?) of the fun stuff that has surfaced lately, like not generating any code at all for an execution path that always invokes UB, hinges on the UB being detected at compile time, which this won’t be, but still…

    – Ture Pålsson

    21 hours ago

  • @TurePålsson: "Isn’t inheriting UB from C rather worrying" – absolutely! The behavior we're seeing here already sucks, but it could be arbitrarily worse some day, and if it does get worse, there's no telling how long the problem might pass unnoticed.

    – user2357112

    20 hours ago

  • @TurePålsson: Yup. Fortunately, real-world C implementations will produce a value for out-of-range FP to int conversions (at least when the value being converted is a run-time variable that might be in range). Some compilers even go as far as officially defining the behaviour, such as MSVC: devblogs.microsoft.com/cppblog/… (article re: changing the out-of-range conversion result for FP-to-unsigned to match AVX-512 instructions which produce all-ones, 0xfff… in that case.)

    – Peter Cordes

    17 hours ago

8

a[0] - 1 is 1.8446744073709552e+19, a numpy.float64. That can’t retain all the precision, so its value is 18446744073709551616 = 2**64, which, when written back into a with dtype np.uint64, becomes 0.


  • 7

    NumPy floating->integer conversion with a floating point value out of the integer type's range is undefined behavior (inherited from C), so there's no guarantee the result is actually 0 – it could be anything.

    – user2357112

    yesterday

  • 1

    @user2357112 Yeah I had actually written a second paragraph about that, also noting harold's different result, but then decided to keep this brief, just explaining the observed behavior of the OP.

    – Kelly Bundy

    23 hours ago
