Saturday, August 11, 2007

Struct and Union

* Typical Usage of Struct and Union
struct A
{
     int a;
     char b;
};
struct B
{
     short c;
     char d[2];
};
struct C
{
     int e;
     long f;
     short g;
};

struct CommonPacket
{
     int PacketType;
     union
    {
         struct A PacketA;
         struct B PacketB;
         struct C PacketC;
    };   /* anonymous union: C11 or a compiler extension */
};

In general, a struct is an effective way to describe a contiguous block of memory or registers, which can then be accessed through a pointer to the start of that block.
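As a rough sketch of how such a packet might be consumed, the code below switches on PacketType and reads only the matching union member. The tag values PACKET_A, PACKET_B, PACKET_C and the helper first_field are made up for illustration; the structs are the ones declared above.

```c
struct A { int a; char b; };
struct B { short c; char d[2]; };
struct C { int e; long f; short g; };

/* Hypothetical tag values; the text above does not define them. */
enum { PACKET_A = 1, PACKET_B = 2, PACKET_C = 3 };

struct CommonPacket
{
    int PacketType;
    union
    {
        struct A PacketA;
        struct B PacketB;
        struct C PacketC;
    };   /* anonymous union: C11 or a compiler extension */
};

/* Returns the first integer field of the member selected by the tag,
   or -1 when the tag is unknown. */
long first_field(const struct CommonPacket *p)
{
    switch (p->PacketType) {
    case PACKET_A: return p->PacketA.a;
    case PACKET_B: return p->PacketB.c;
    case PACKET_C: return p->PacketC.e;
    default:       return -1;
    }
}
```

Only the member that was last written should be read, so the tag and the union are always updated together.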

* Alignment in Struct (How to estimate the size of struct?)
- "Although you can never be absolutely sure how your compiler will pad the members within a structure, the Standard guarantees there will be no padding before the first member. The Standard also mandates that each member in a structure must be allocated in the order in which it's declared."

- Natural Alignment:
By default, each data member is aligned to the alignment requirement of its type, which for basic types is usually the size of the type. Padding is inserted after a member so that the next member starts on its own boundary. The alignment of the whole struct is the largest alignment among its members, and the size of the struct is rounded up to a multiple of that alignment. For example,
struct A
{
     char a;
     long b;
};
struct B
{
     short c;
     struct A d;
};
For struct A, a has an alignment of 1 byte (sizeof(char)). Assuming a 4-byte long, as on typical 32-bit targets, b has an alignment of 4, so three bytes are padded after a to place b at an address that is a multiple of 4. The final size of struct A is 8, which is a multiple of 4, and the alignment of struct A is 4. (Think about what happens if a and b switch places in struct A.) struct B contains a member of type struct A, so the alignment of that member is considered first; it is 4, as explained above. Two bytes are padded after c, and the final size of struct B is 12, again a multiple of 4.
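The layout described here can be checked on a given compiler with offsetof and the C11 _Alignof operator. This is only a sketch: the helper names padding_after_a and size_is_multiple_of_alignment are invented for this example, and the numbers depend on the size of long on the target.

```c
#include <stddef.h>   /* offsetof, size_t */

struct A { char a; long b; };
struct B { short c; struct A d; };

/* Number of padding bytes the compiler placed between a and b in struct A. */
size_t padding_after_a(void)
{
    return offsetof(struct A, b) - (offsetof(struct A, a) + sizeof(char));
}

/* 1 if sizeof(struct B) is a whole multiple of its alignment, 0 otherwise. */
int size_is_multiple_of_alignment(void)
{
    return sizeof(struct B) % _Alignof(struct B) == 0;
}
```

With a 4-byte long, padding_after_a() gives 3 and sizeof(struct B) is 12. With an 8-byte long, as on typical 64-bit Linux, the same code gives 7 and 24.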
Note: 1) The final size of a struct is not simply (N x Len), where N is the number of data members and Len is the largest member size. Padding is added only where alignment requires it. Look at this example:
struct C
{
     char x1;
     short x2;
     int x3;
     char x4;
};
The size of struct C is 12, not 16, assuming a 2-byte short and a 4-byte int: one byte of padding follows x1 and three bytes follow x4.
2) For arrays, the alignment is that of the element type, not the size of the array. For example,
long int c[20];
The alignment of c is that of long, not sizeof(c).
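Both notes can be tested directly. The sketch below compares the naive N x Len estimate with what the compiler reports, and checks that an array type has the alignment of its element type. The function names are made up for this example, and _Alignof needs a C11 compiler.

```c
#include <stddef.h>   /* size_t */

struct C { char x1; short x2; int x3; char x4; };

/* The naive estimate: number of members times the largest member size. */
size_t naive_size_of_C(void)  { return 4 * sizeof(int); }

/* What the compiler actually allocates, padding included. */
size_t actual_size_of_C(void) { return sizeof(struct C); }

/* 1 if long[20] has the same alignment as a single long. */
int array_uses_element_alignment(void)
{
    return _Alignof(long[20]) == _Alignof(long);
}
```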
- Alignment with #pragma
Limit member alignment to at most n bytes: #pragma pack(n)
Restore the default alignment: #pragma pack()
The effective alignment of each data member is the smaller of its natural alignment and n. Note that #pragma pack is a compiler extension, not part of standard C.
In summary: 1) determine the alignment of each data member (nested structs first), 2) lay out the members in declaration order, 3) insert only as much padding as alignment requires.
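To see the effect of packing, the sketch below declares the same two members with and without #pragma pack(1). It assumes a compiler that supports this pragma (GCC, Clang and MSVC do); the struct and function names are invented for the example.

```c
#include <stddef.h>   /* offsetof, size_t */

struct Natural { char a; long b; };

#pragma pack(1)        /* members below are aligned to at most 1 byte */
struct Packed { char a; long b; };
#pragma pack()         /* restore the default packing */

/* Offset of b with natural alignment: padding follows a. */
size_t natural_offset_of_b(void) { return offsetof(struct Natural, b); }

/* Offset of b with pack(1): b follows a immediately. */
size_t packed_offset_of_b(void)  { return offsetof(struct Packed, b); }
```

Packed structs save space but can make member access slower, or even fault on targets that do not allow unaligned loads.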
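- Offset Calculation
(size_t)((char *)&((struct A *)0)->f - (char *)((struct A *)0))
Here f stands for any member of struct A. The standard offsetof macro from <stddef.h> gives the same result portably: offsetof(struct A, f).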
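The same pointer subtraction can be done on a real object instead of a null pointer, and compared with the standard offsetof macro from <stddef.h>. This is a small sketch; struct C is borrowed from the first section because it has a member named f, and manual_offset_of_f is a name made up here.

```c
#include <stddef.h>   /* offsetof, size_t */

struct C { int e; long f; short g; };

/* Distance in bytes from the start of an object to its member f. */
size_t manual_offset_of_f(void)
{
    struct C obj;
    return (size_t)((char *)&obj.f - (char *)&obj);
}
```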

* Initialization of struct
- struct A a = {'t', 'c', 8, 0.99, "example"}; or
struct A a = {0}; /* Every member is 0 now, no matter which type.*/
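A minimal sketch of both forms follows. The struct Record and its member names are hypothetical; they were chosen only so that the member types match the initializer list shown above.

```c
/* Hypothetical struct whose member types match the initializer in the text. */
struct Record { char k; char c; int n; double x; const char *s; };

/* Each value is assigned to the members in declaration order. */
struct Record make_full(void)
{
    struct Record r = {'t', 'c', 8, 0.99, "example"};
    return r;
}

/* The first member is set to 0 explicitly; the rest are zero-initialized,
   so the pointer becomes null and the double becomes 0.0. */
struct Record make_zero(void)
{
    struct Record r = {0};
    return r;
}
```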

* Assignment of struct
struct A
{
     char *p;
     char c;
} a, b;
char cc = 'c';
a.p = &cc;
a.c = 15;
b = a;
*b.p = 30;      /* cc is now 30 */
If a struct contains a pointer, assigning one variable to another copies only the pointer value, so both pointers refer to the same memory.
Although arrays cannot be assigned to each other directly, they can be when they are members of a struct, like this:
struct A
{
     char array[10];
} a, b;
for (int i = 0; i < 10; ++i)
     a.array[i] = i;
b = a;     /* b.array now has the same contents as a.array */
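The difference between the two cases can be seen side by side in the sketch below: the pointer member ends up shared, while the array member is copied element by element. The struct and function names are invented for this example.

```c
struct WithPointer { char *p; char c; };
struct WithArray   { char array[10]; };

/* Writes through the copy's pointer and returns the value of the original target. */
char write_through_copy(void)
{
    char cc = 'c';
    struct WithPointer a, b;
    a.p = &cc;
    a.c = 15;
    b = a;          /* copies the pointer value, not the char it points to */
    *b.p = 30;      /* changes cc, which a.p also points to */
    return cc;
}

/* Returns 1 if changing the copy's array leaves the original untouched. */
int array_copy_is_independent(void)
{
    struct WithArray a, b;
    for (int i = 0; i < 10; ++i)
        a.array[i] = (char)i;
    b = a;          /* copies all ten elements */
    b.array[0] = 99;
    return a.array[0] == 0 && b.array[9] == 9;
}
```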

* Struct For Bit Map
Under some circumstances, a struct with bit fields can be used to map individual bits of a block of memory, like this
struct A
{
     int a:1;
     int b:7;
} t;
t.a = 1;
t.b = 0x7f;
- The total number of bits should be reasonable. It usually matches the size of one of the basic data types, such as char, short, int or long int.
- Be cautious that t.a and t.b are declared as int. Whether a plain int bit field is signed is implementation-defined; where it is signed, one bit serves as the sign, so the value ranges are [-1, 0] and [-64, 63], and both t.a and t.b above typically read back as -1. An unsigned type is usually more useful for this kind of struct, since every bit in it should be meaningful.
- Almost everything about bit fields is implementation-dependent. Check the details, such as which end the bit order starts from and whether a field may cross a byte boundary, before using this data structure.
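An unsigned version of the same struct avoids the sign problem. In this sketch, Flags and make_flags are made-up names; values that do not fit are reduced modulo 2 and modulo 128, which is well defined for unsigned bit fields.

```c
struct Flags
{
    unsigned int a : 1;   /* holds 0..1 */
    unsigned int b : 7;   /* holds 0..127 */
};

/* Stores the two values, keeping only the low 1 and low 7 bits. */
struct Flags make_flags(unsigned int a, unsigned int b)
{
    struct Flags t;
    t.a = a;
    t.b = b;
    return t;
}
```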

* Union in Memory
All data members of a union start at the same address, the lowest address of the union. This property is the basis for some special uses of unions.
union bits32
{
     char bytes[4];
     int whole;
} t;
t.whole = 0x12345678;
t.bytes[0] = 0x90;   /* t.whole is now 0x12345690 on a little-endian system (4-byte int) */
Another classic use of a union is to check the endianness of the system:
t.whole = 1;
return (t.bytes[0] == 1); /* true: little endian; false: big endian */
Or:
union bits32 endian_test = { { 'l', '?', '?', 'b' } };
#define ENDIANNESS ((char)endian_test.whole)   /* 'l' on little endian, 'b' on big endian */
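The two ideas can be combined into a small sketch that patches the least significant byte regardless of byte order. It uses uint32_t and unsigned char instead of int and char so that the sizes and byte values are fixed; that substitution and the name patch_low_byte are choices made for this example, and only little-endian and big-endian layouts are considered.

```c
#include <stdint.h>   /* uint32_t */

union bits32
{
    unsigned char bytes[4];
    uint32_t whole;
};

/* 1 on a little-endian system, 0 on a big-endian one. */
int is_little_endian(void)
{
    union bits32 t;
    t.whole = 1;
    return t.bytes[0] == 1;
}

/* Replaces the least significant byte of value through the byte view. */
uint32_t patch_low_byte(uint32_t value, unsigned char low)
{
    union bits32 t;
    t.whole = value;
    t.bytes[is_little_endian() ? 0 : 3] = low;
    return t.whole;
}
```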
