If you've ever written a computer program, you've probably compared integers at least a few times.

Of course, as with all things in C and C++, this is a potentially dangerous operation by default. What is the result value of the following program?

bool bad_less(signed s, unsigned u) {
    return s < u;
}

/// ...
bool a = bad_less(-27, 44);
bool b = bad_less(44, UINT_MAX);

In any reasonable context, a and b will both by true. In C and C++, this will (non-obviously) end with a = false and b = true.

If we compile with GCC and the -Wsign-compare flag, we at least get a helpful warning:

warning: comparison of integer expressions of different signedness: 'int' and 'unsigned int'

The initial temptation is to fix this with a cast: 

(unsigned)s < u

or

s < (int)u

However: Only the latter cast will actually change the program result, and it is still broken, as now a = true and b = false. Patching over this warning with a cast does nothing to add correctness to the program, because we're just making the perilous implicit conversions explicit, when the conversions themselves are the problem.

The correct way to do the comparison is:

    return s < 0 || (unsigned)s < u;

It's easy enough to write this inline, but when you have to remember to do it correctly every time, or when you don't know the actual integer types (e.g. it may be a library's opaque integer typedef), it becomes a challenge.

What we want is to actually compare the integers properly and generically. This post will explore how we can do that using only C++-compatible C99 (No _Generic, no typeof()).

NOTE: The technique in this post assumes two important factors: The compiler uses two's complement signed integer encoding, and casting between signed/unsigned types leaves the bit pattern unchanged.

This isn't strictly guaranteed by C99, but almost all existing compilers and platforms behave this way, so you will probably find the method herein to be useful for your target.

Embiggening Integers

What we need is a way to store the full range of integers, whether signed or unsigned, in a way that loses no information.

For all signed and unsigned integer types smaller than intmax_t, we can simply upcast them to an intmax_t imax, and imax will contain the same value.

The problem is with uintmax_t and intmax_t, which cannot be cast to intmax_t without losing the information on which of the two types we had initially.

For this reason, we'll add a new type that allows us to encode the full range by adding an additional bit of information:

typedef struct wide_integer {
    uintmax_t bits;
    bool is_signed;
} wide_integer;

Where bits encodes the bits of the original integer, and is_signed tells us whether to treat bits as a signed two's complement integer or a regular unsigned integer.

Widening the Bits

To widen the bit pattern of any integer type, we just pass it through two casts:

#define widen_bits(Value) ((uintmax_t)(intmax_t)(Value))

The effect of the cast from Value to intmax_t differs based on the type of Value:

  • If Value is unsigned, then the bit pattern is left unchanged and zero-extended.

  • If Value is signed, then the sign bit of Value is dragged into the position of the sign bit in intmax_t, and additional zero bits are added (sign-extension).

Converting from intmax_t to uintmax_t does nothing but change the interpretation of the same bit pattern.

If we cast from uintmax_t to intmax_t then back to uintmax_t, the value of the original uintmax_t will be preserved, regardless of value.

Detecting the Signedness

Whether to treat the widen_bits(V) result as signed or unsigned is simple in one case, but trickier in another.

In the simple case: If V is smaller than intmax_t, we can treat the resulting bit pattern as a signed integer, because intmax_t can hold all unsigned values of all smaller integers.

#define should_treat_widened_bits_as_signed(V) \
       (sizeof((V)) < sizeof(intmax_t))        \
    || /* something else ... */

The tricky case is detecting whether V is either intmax_t or uintmax_t. The solution is to write an arithmetic expression that returns a different answer depending on the type.

But we don't know the value of V, so we can't do arithmetic on it reliably, right?

Actually, there is something we can do, although it's not immediately obvious:

#define IS_MAXINT_SIGNED(V) \
    (((1 ? 1 : (V)) - 2) < (1 ? 1 : (V)))

This relies on C's integer promotion rules to do something useful for us:

  • The operands of a ternary expression are promoted to have the same type.

  • The right-hand operand V will be larger than the int literal 1 (because we are only calling this macro if V is larger than int)

  • Thus the integer 1 will be upcast to match exactly the type of V (signed or unsigned)

  • Only the left-hand operand is evaluated in the ternary, unconditionally yielding a value of 1 in the type of V.

  • We subtract 2 from 1 in the type of V.

  • If the type of V is unsigned, then 1 - 2 will wrap around to a very large value, and 1 - 2 < 1 will be false.

  • Otherwise, 1 - 2 will yield -1 in the appropriate type, and 1 - 2 < 1 will be true.

This gives us our final detection macro:

#define should_treat_widened_bits_as_signed(V) \
     (sizeof((V)) < sizeof(intmax_t))         \
  || IS_MAXINT_SIGNED(V)

Put it Together

The final integer-widening code looks like this:

inline wide_integer
_make_wide_integer(uintmax_t bits, bool treat_as_signed) {
    wide_integer r;
    r.bits = bits;
    r.is_signed = treat_as_signed;
    return r;
}

#define widen_integer(V) \
    _make_wide_integer(  \
        widen_bits((V)), \
        should_treat_widened_bits_as_signed((V)))

Comparing Wide Integers

Comparing wide integers is very simple, and is just straightforward code:

inline int
compare_wide_integers(wide_integer lhs, wide_integer rhs) {
    const uintmax_t uleft  = lhs.bits;
    const intmax_t  ileft  = (intmax_t)uleft;
    const uintmax_t uright = rhs.bits;
    const intmax_t  iright = (intmax_t)uright;
    if (lhs.is_signed) {
        // S <=> S
        if (rhs.is_signed) {
            if (ileft < iright)
                return -1;
            else if (ileft > iright)
                return 1;
        }
        // S <=> U
        else {
            if (ileft < 0 || uleft < uright)
                return -1;
            else if (uleft > uright)
                return 1;
        }
    } else {
        // U <=> U
        if (!rhs.is_signed) {
            if (uleft < uright)
               return -1;
            else if (uleft > uright)
                return 1;
        }
        // U <=> S
        else {
            if (iright < 0 || uleft > uright)
                return 1;
            else if (uleft < uright)
                return -1;
        }
    }

    // The operands are equivalent
    return 0;
}

This function returns -1 if the left-hand operand is less than the right, +1 if the right-hand operand is greater than the left, and 0 if the operands are equivalent.

A Macro Magic

We can get concise with a special magic macro:

#define safe_cmp(Left, Operator, Right) \
    (compare_wide_integers(widen_integer(Left), \
                           widen_integer(Right)) \
        Operator 0)

And use it for comparison with our familiar infix operators:

bool number_is_less(some_integer_type       a,
                    some_other_integer_type b) {
    return safe_cmp(a, <, b);
}

C++ Bonus

If we are in C++, we can add some bonus features to wide_integer:

typedef struct wide_integer {
    uintmax_t bits;
    bool is_signed;

#ifdef __cplusplus
    // Default-construct uninitialized
    wide_integer() = default;

    // Implicit convert from any standard integral type
    wide_integer(std::integral auto v) noexcept
        : bits(widen_bits(v))
        , is_signed(should_treat_widened_bits_as_signed(v))
    {}

    // Default equality
    bool operator==(wide_integer r) = default;

    // Add comparison operators
    std::strong_ordering
    operator<=>(wide_integer other) const noexcept {
        return compare_wide_integers(*this, other) <=> 0;
    }
#endif
} wide_integer;

This will give us lossless implicit conversions from any integral value to a wide_integer, and allows us to use the standard relational operators on wide_integer objects.