Saturday, August 11, 2007

Unsigned and Signed Integer

* Two attributes for char, short, int, long and long long:
bitwidth (8, 16, 32, 64 bits) and sign (unsigned, signed).

* Conversion Rule
- When an expression operates on (signed/unsigned) char, (signed/unsigned) short, a bit-field or an enum, the operand is first promoted to int (or to unsigned int if int cannot represent every value of the original type). This is called integer promotion. A float is promoted to double only when it is passed to a variadic or unprototyped function (the default argument promotions); in ordinary ANSI C expressions float arithmetic may stay in float.
- When an expression contains variables or numbers whose bitwidths still differ after promotion, the narrower operand is converted to the wider data type (signed or unsigned) and the operation continues in that type. These are called the usual arithmetic conversions. They are defined on values, not on memory layout, so the result does not depend on whether the system is big or little endian. On two's complement machines, widening a signed value copies the sign into the new high-order bits, and widening an unsigned value pads with zeros. In the opposite direction, from wider to narrower, the high-order bits are discarded: for an unsigned target the result is the value modulo 2^n, and for a signed target the result is implementation-defined when the value does not fit.
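
The two rules can be illustrated with a minimal sketch. The function names below are made up for this example and are not part of any library.

```c
#include <limits.h>

/* Integer promotion: both operands become int before the addition,
   so the sum of two unsigned char values is not truncated to 8 bits. */
int sum_of_uchars(unsigned char a, unsigned char b)
{
    return a + b;   /* computed in int */
}

/* Narrowing to an unsigned type keeps the value modulo 2^n,
   where n is the bitwidth of unsigned char (CHAR_BIT). */
unsigned char narrow_to_uchar(int v)
{
    return (unsigned char)v;
}

/* Widening preserves the value: -1 stays -1, whatever the byte order. */
long widen_to_long(signed char v)
{
    return v;
}
```

For instance, sum_of_uchars(200, 100) yields 300 rather than 44, and narrow_to_uchar(-1) yields UCHAR_MAX.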

* Arithmetic Operation of Unsigned
- When an expression contains variables or numbers with the SAME bitwidth but different signedness, the signed operand is converted to the unsigned version. This causes trouble when it happens in a condition check, like this:
unsigned int a = 6;
int b = -20;
int c = (a+b>6) ? a : b;
Here b is converted to a huge unsigned value, so a+b>6 is true and c is always equal to a.
- The general arithmetic operations on unsigned:
c = (a +/- b) mod 2^n
where n is the bitwidth of the data type. Unsigned arithmetic therefore never overflows or underflows; it wraps around silently, which may not be what you expect.
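
A small sketch of both effects follows. The helper names are hypothetical, and the signed variant assumes that long long is wider than unsigned int, which holds on common platforms but is not guaranteed by the standard.

```c
#include <limits.h>

/* Mirrors the condition from the example: b is converted to unsigned
   before the addition, so a + b wraps instead of going negative. */
int mixed_sum_exceeds_six(unsigned int a, int b)
{
    return a + b > 6;
}

/* One way to get the intended signed comparison, assuming long long
   can hold every unsigned int value (an assumption of this sketch). */
int signed_sum_exceeds_six(unsigned int a, int b)
{
    return (long long)a + b > 6;
}

/* Unsigned subtraction wraps modulo 2^n: 0 - 1 yields UINT_MAX. */
unsigned int wrap_sub(unsigned int a, unsigned int b)
{
    return a - b;
}
```

With a = 6 and b = -20, the first function reports true and the second reports false.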

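If both operands are signed, the result of overflow/underflow is UNDEFINED, so the test cannot simply inspect the result after the fact. Assembly code can check the flags of a status register, but C has no portable access to them. However, if x and y are two ints known to be non-negative, the sum can be computed in unsigned arithmetic and compared with INT_MAX (from <limits.h>):
if ((unsigned)x + (unsigned)y > INT_MAX)
    complain();
Casting the unsigned sum back to int and testing for a negative value is less portable, because that conversion is implementation-defined when the sum exceeds INT_MAX.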
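
When the operands may also be negative, one common approach is to test against the limits before adding. This is a sketch under that approach; add_would_overflow is a name invented for this example.

```c
#include <limits.h>

/* Returns 1 if x + y would overflow int, 0 otherwise.
   Only comparisons and in-range subtractions are performed,
   so the check itself never overflows. */
int add_would_overflow(int x, int y)
{
    if (y > 0)
        return x > INT_MAX - y;   /* sum would exceed INT_MAX */
    if (y < 0)
        return x < INT_MIN - y;   /* sum would drop below INT_MIN */
    return 0;
}
```

A caller would check add_would_overflow(x, y) first and only compute x + y when it returns 0.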

* Shift Operations
- If the item is left shifted, zeros are padded on the right. Do not left shift negative signed values: the result is undefined.
- "If the item being right shifted is unsigned, zeroes are shifted in. If the item is signed, the implementation is permitted to fill vacated bit positions either with zeroes or with copies of the sign bit. If you care about vacated bits in a right shift, declare the variable in question as unsigned. You are then entitled to assume that vacated bits will be set to zero."
- "if the item being right or left shifted is n bits long, then the shift count must be greater than or equal to zero and strictly less than n. Thus, it is not possible to shift all the bits out of a value in a single operation."
- For unsigned values, shifting is a safe and correct way to multiply or divide by a power of two (a product that does not fit wraps modulo 2^n). For signed values, a left shift multiplies correctly only when the value is non-negative and the result is representable. "Note that a right shift of a signed integer is generally not equivalent to division by a power of two, even if the implementation copies the sign into vacated bits. To prove this, consider that the value of (-1)>>1 cannot possibly be zero."
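
The shift rules above can be sketched as follows. The function names are hypothetical, and checked_shl assumes that unsigned int has no padding bits, so its width is sizeof(unsigned int) * CHAR_BIT.

```c
#include <limits.h>

/* Right shift of an unsigned value fills with zeros, so it matches
   division by 2^k. The caller must keep k below the width of unsigned int. */
unsigned int udiv_pow2(unsigned int v, unsigned int k)
{
    return v >> k;
}

/* Signed division truncates toward zero (C99 and later): -1 / 2 is 0,
   whereas a sign-copying right shift of -1 would give -1. */
int sdiv_by_two(int v)
{
    return v / 2;
}

/* Left shift that returns 0 for an out-of-range count instead of
   invoking undefined behaviour. */
unsigned int checked_shl(unsigned int v, unsigned int k)
{
    return k < sizeof v * CHAR_BIT ? v << k : 0u;
}
```

The out-of-range policy in checked_shl (returning 0) is only one possible choice; an application might prefer to report an error instead.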

* size_t
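- size_t is an unsigned integer type chosen by the implementation. It is often unsigned long int, but not always (for example, it is unsigned long long on 64-bit Windows).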
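- The return data type of sizeof is size_t. Keep in mind the rules of conversion and unsigned arithmetic operations. For example,
#define TOTAL (sizeof(array)/sizeof(array[0]))
{
     int d = -1;
     if (d <= TOTAL-2)   /* d is converted to size_t, so this is false */
         x = array[d+1];
}
The assignment is never executed: TOTAL-2 has type size_t, so d is converted to a very large unsigned value before the comparison. Writing the test as d <= (int)TOTAL-2 restores the intended signed comparison.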
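
A self-contained sketch of the same pitfall is given here. The array contents and function names are placeholders chosen for illustration, and the sketch assumes size_t is at least as wide as int, as it is on typical platforms.

```c
#include <stddef.h>

/* Placeholder data: seven elements, so TOTAL is 7. */
static int array[] = { 23, 34, 12, 17, 204, 99, 16 };
#define TOTAL (sizeof(array) / sizeof(array[0]))

/* d is converted to size_t before the comparison, so a negative d
   becomes a huge value and the test fails. */
int unsigned_compare(int d)
{
    return d <= TOTAL - 2;
}

/* Casting the element count to int keeps the comparison signed. */
int signed_compare(int d)
{
    return d <= (int)TOTAL - 2;
}
```

For d = -1 the first function returns 0 and the second returns 1, while both agree for small non-negative values of d.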

* Post-fix UL and L
If an intermediate result of an expression might overflow, consider putting the suffixes U, L or UL on integer constants so that the computation is carried out in a wider type. For example, write a routine to calculate 1 + 2 + ... + n, assuming the result does not overflow long int.
long foo(int n)
{
     return ((n+1L) * n / 2);
}
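
On platforms where long has the same width as int (64-bit Windows, for example), the L suffix does not widen the computation. A variant using the LL suffix, which guarantees at least 64 bits, might look like this; sum_wide is a name made up for this sketch.

```c
/* Sum 1 + 2 + ... + n computed in long long: the LL suffix on the
   constant converts n to long long before the multiplication. */
long long sum_wide(int n)
{
    return (n + 1LL) * n / 2;
}
```

With n = 100000 the product 100001 * 100000 does not fit in a 32-bit int, yet the result is computed correctly because every operation happens in long long.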

* Usage of Unsigned Data in C
- Use the unsigned version ONLY for BIT OPERATIONS (&, |, ~, >>, <<). Otherwise, use the signed version, CASTING where necessary.

* Identify Whether a Data Type or Variable Is Unsigned
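- For variables: #define ISUNSIGNED(a) ((a) >= (char)0 && ~(a) >= (char)0)
- For data type: #define ISUNSIGNED(type) ((type)0 - (char)1 > (char)0)
- Because of integer promotion, both macros give the right answer only for types at least as wide as int. An unsigned char or unsigned short operand is promoted to int first and is then reported as signed.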
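
A variant of the data-type test that also handles types narrower than int can be sketched as below. IS_UNSIGNED_TYPE is a hypothetical name, not a standard macro.

```c
/* True when the arithmetic type `type` is unsigned.
   Converting -1 to an unsigned type gives its maximum value, which stays
   positive after integer promotion; for a signed type it stays -1. */
#define IS_UNSIGNED_TYPE(type) ((type)-1 > (type)0)
```

For example, IS_UNSIGNED_TYPE(unsigned char) evaluates to 1 and IS_UNSIGNED_TYPE(long) evaluates to 0.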
