For "unsigned int max(-5, 4) " , what should it return? -5 or 4?

Hi everyone,

When I use the integer function unsigned long int max(const unsigned long int a, const long int b), and I provide the inputs -5 and 4, it returns the two’s complement representation of -5, which is a very large number. This behavior seems a little different from how max works in C or C++.

Is this because the return type is unsigned? Is this behavior by design? Sorry if this is a basic question, but I’m a beginner.

I am using the sm_86 architecture, and here is my code:

#include <stdio.h>

__global__ void kernel_max(unsigned long int a, long int b, unsigned long int *result) {
    *result = max(a, b);
}

int main() {
    unsigned long int a = 4;
    long int b = -5;
    unsigned long int *dev_result, result;

    cudaMalloc(&dev_result, sizeof(unsigned long int));

    kernel_max<<<1, 1>>>(a, b, dev_result);

    cudaMemcpy(&result, dev_result, sizeof(unsigned long int), cudaMemcpyDeviceToHost);
    cudaFree(dev_result);
    printf("The result of max(4, -5) is: %lu\n", result);
    return 0;
}

This is the max function’s definition in the documentation:
device unsigned long int max(const unsigned long int a, const long int b)

This is the PTX code:

//
// Generated by NVIDIA NVVM Compiler
//
// Compiler Build ID: CL-34714021
// Cuda compilation tools, release 12.6, V12.6.68
// Based on NVVM 7.0.1
//

.version 8.5
.target sm_52
.address_size 64

        // .globl       _Z10kernel_maxmlPm

.visible .entry _Z10kernel_maxmlPm(
        .param .u64 _Z10kernel_maxmlPm_param_0,
        .param .u64 _Z10kernel_maxmlPm_param_1,
        .param .u64 _Z10kernel_maxmlPm_param_2
)
{
        .reg .b64       %rd<6>;


        ld.param.u64    %rd1, [_Z10kernel_maxmlPm_param_0];
        ld.param.u64    %rd2, [_Z10kernel_maxmlPm_param_1];
        ld.param.u64    %rd3, [_Z10kernel_maxmlPm_param_2];
        cvta.to.global.u64      %rd4, %rd3;
        max.u64         %rd5, %rd1, %rd2;
        st.global.u64   [%rd4], %rd5;
        ret;

}

It looks like max treats the long int value -5 as an unsigned integer initially.

std::max() from algorithms does not support mixed-type arguments, according to the C++11 standard I consulted (section 24.5.7). I also tried compiling with clang and -std=c++23 and it throws an error error: no matching function for call to 'max'.

The closest equivalent to CUDA’s max() that will compile is therefore

*result = (a > b) ? a : b;

With this, the sample program below returns

The result of max(4, -5) is: 18446744073709551611

The reason for this is that when comparing a signed and an unsigned integer of the same storage size in C++, the signed integer is implicitly converted to an unsigned integer prior to the comparison. This is because unsigned T is considered a wider type than T, and by the rules governing implicit type conversions, for a two-input operator where one argument has a narrower type and the other has a wider type, the argument of the narrower type is implicitly converted to the wider type prior to performing the operation. In other words, this applies to an arithmetic operator like + or * just as it applies to a comparison operator like >.

Because this can lead to “surprising” behavior of C++ programs, as you noted, it is a best practice to avoid mixing signed and unsigned integer types in the same expression.

#include <cstdio>
#include <cstdlib>
#include <algorithm>

void kernel_max (unsigned long int a, long int b, unsigned long int *result) {
#if 0 // does not compile with C++23
    *result = std::max (a, b);
#else
    *result = (a > b) ? a : b;
#endif    
}

int main (void)
{
    unsigned long int a = 4;
    long int b = -5;
    unsigned long int result;

    kernel_max (a, b, &result);
    printf("The result of max(4, -5) is: %lu\n", result);
    return EXIT_SUCCESS;
}

I wonder what the function documentation indicates:

__device__ unsigned long int max(const unsigned long int a, const long int b)

Calculate the maximum value of the input unsigned long int and long int arguments.

Calculate the maximum value of the arguments a and b, perform integer promotion first.

I wonder what “perform integer promotion first” means in this context. Does that refer to promotion to signed integer? Or unsigned integer? The function prototypes that don’t include mixed parameter types don’t have that notation.

FWIW I tried using max in host code and did not have any difficulty, and FWIW it produced the same answer as device code:

# cat t316.cu
#include <stdio.h>
#include <math.h>
__global__ void kernel_max(unsigned long int a, long int b, unsigned long int *result) {
    *result = max(a, b);
}

int main() {
    unsigned long int a = 4;
    long int b = -5;
    unsigned long int *dev_result, result;

    cudaMalloc(&dev_result, sizeof(unsigned long int));

    kernel_max<<<1, 1>>>(a, b, dev_result);

    cudaMemcpy(&result, dev_result, sizeof(unsigned long int), cudaMemcpyDeviceToHost);
    cudaFree(dev_result);
    printf("The result of max(4, -5) is: %lu\n", result);
    printf("host: %lu\n", max(a,b));
    return 0;
}
# nvcc -o t316 t316.cu
# ./t316
The result of max(4, -5) is: 18446744073709551611
host: 18446744073709551611
#

CUDA’s max() and C++ std::max() from algorithms are not the same thing. To my recollection, min() and max() in CUDA predate CUDA’s transition from C to C++, which presumably explains the different semantics in the form of support for mixed-type arguments in CUDA. You can double check with the NVIDIA compiler folks, but I am fairly certain that

(1) max(a,b) is essentially implemented as (((a)>(b))?(a):(b)), presumably in a way that avoids multiple evaluations of a and b.

(2) This implementation is supplied for both host and device code to achieve matching behavior, which is important when this is used in __host__ __device__ functions.

“perform integer promotion” is standard terminology for the various forms of silent integer type conversions applied in C++. See section 4.5 integral promotions in the C++11 standard document (I don’t own a copy of later versions, the sections might be numbered differently). What I loosely called “width” of an integer type is related but not identical to the “rank” in the terminology of the standard. The section on relational operators in the standard dryly specifies:

The usual arithmetic conversions are performed on operands of arithmetic or enumeration type

Section 5 Expressions of the C++11 standard specifies:

Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other operand, the operand with signed integer type shall be converted to the type of the operand with unsigned integer type.

The implicit conversion to the wider type on two-operand operator works just the same as if the programmer had supplied an explicit cast:

unsigned T a, result;
T b;
result = (a > (unsigned T)b) ? a : b;

Given the values in the example (a=, b=-5), this leads to b being selected, and subsequently being converted to unsigned T on assignment to result.

The ‘trunk’ for the C++ standard is available here, it may contain some C++26 features:

The integral promotions are now in chapter 7.3.7 or conv.prom (chapter id has more stability than the chapter number, if other chapters are inserted or deleted, and also the link uses the chapter id)
https://eel.is/c++draft/conv.prom#:promotion,integral

For an easier read, also see the quite faithful CppReference, there is a section integral promotion further down the page

https://en.cppreference.com/w/cpp/language/implicit_conversion


The mentioned specific wordings are still present:

https://eel.is/c++draft/expr#mul-2

The usual arithmetic conversions are performed on the operands and determine the type of the result.

with

https://eel.is/c++draft/expr#arith.conv-1.5

Otherwise, each operand is converted to a common type C.

The integral promotion rules ([conv.prom]) are used to determine a type T1 and type T2 for each operand.48

Then the following rules are applied to determine C:

  • If T1 and T2 are the same type, C is that type.

  • Otherwise, if T1 and T2 are both signed integer types or are both unsigned integer types, C is the type with greater rank.

  • Otherwise, let U be the unsigned integer type and S be the signed integer type.

    • If U has rank greater than or equal to the rank of S, C is U.

    • Otherwise, if S can represent all of the values of U, C is S.

    • Otherwise, C is the unsigned integer type corresponding to S.

Thank you all for your kindness and help. I learned a lot from this.

1 Like