Monday, December 3, 2007

Syntax elements of H.264

* Syntax for video sequence stream
- The VCL (Video Coding Layer) represents the content of the video data. The NAL (Network Abstraction Layer) formats that data and adds header information for transmission and storage.
- Bytes of the video stream are in big-endian order. Within one byte the LSBit is written on the right, and the MSBit always comes first in the bit stream.
- One coded slice NAL unit contains all the data of one slice: the slice header, the slice data and the trailing bits. A NAL unit therefore represents one slice of a picture, not a whole picture.
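
The MSBit-first rule above can be illustrated with a small bit reader. This is only a sketch: the class name `BitReader` and its method `read_bits` are made up for this note and are not taken from the standard or from any decoder.

```python
class BitReader:
    """Reads bits from a byte string, most significant bit first.

    Hypothetical helper for illustration only.
    """

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, counted in bits from the start

    def read_bits(self, n: int) -> int:
        """Return the next n bits as an unsigned integer."""
        value = 0
        for _ in range(n):
            byte = self.data[self.pos // 8]
            shift = 7 - (self.pos % 8)  # bit 7 (the MSBit) is consumed first
            value = (value << 1) | ((byte >> shift) & 1)
            self.pos += 1
        return value
```

Reading 1, 2 and 5 bits in a row from one byte splits it into its leftmost bit, the next two bits and the remaining five bits.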

- Syntax elements for NAL
1). nal_ref_idc: (2 bits, [0, 3])
For seq and pic parameter sets NAL, it shall not be 0.
For slices of reference pic, it shall not be 0.
For slices of non reference pic, it shall be 0.
2). nal_unit_type:
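
Both elements live in the first byte of a NAL unit, which is laid out as forbidden_zero_bit (1 bit), nal_ref_idc (2 bits) and nal_unit_type (5 bits). A minimal sketch of unpacking that byte follows. The function name `parse_nal_header` is made up for this note.

```python
def parse_nal_header(first_byte: int) -> dict:
    """Split the first byte of a NAL unit into its three fields.

    Hypothetical helper for illustration only.
    """
    return {
        "forbidden_zero_bit": (first_byte >> 7) & 0x01,  # top bit
        "nal_ref_idc": (first_byte >> 5) & 0x03,         # next 2 bits
        "nal_unit_type": first_byte & 0x1F,              # low 5 bits
    }
```

For example, the byte 0x67 gives nal_ref_idc 3 and nal_unit_type 7 (a seq parameter set), and 0x65 gives nal_ref_idc 3 and nal_unit_type 5 (a slice of an IDR picture).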

- Syntax elements for Seq Parameter Set (SPS)
0) seq_parameter_set_id: [0, 31]
1) log2_max_frame_num_minus4: [0, 12] (MaxFrameNum = 2^(log2_max_frame_num_minus4 + 4), so at most 2^16)
2) pic_order_cnt_type: specify the method to decode picture order count. [0, 2]
3) log2_max_pic_order_cnt_lsb_minus4: [0, 12] (MaxPicOrderCntLsb)
4) several elements for the decoding of picture order count
5) num_ref_frames: [0, MaxDpbSize], the maximum total number of reference frames, complementary reference field pairs and non-paired reference fields
6) frame_mbs_only_flag: when 1, only coded frames (with frame MBs only) exist in the video seq
7) mb_adaptive_frame_field_flag:
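
The two log2 elements above only store an exponent. The derived variables are MaxFrameNum = 2^(log2_max_frame_num_minus4 + 4) and MaxPicOrderCntLsb = 2^(log2_max_pic_order_cnt_lsb_minus4 + 4). A small sketch of that derivation, with function names invented for this note:

```python
def _pow2_from_minus4(value_minus4: int, name: str) -> int:
    """Shared range check and 2^(x + 4) computation (hypothetical helper)."""
    if not 0 <= value_minus4 <= 12:
        raise ValueError(f"{name} must be in [0, 12], got {value_minus4}")
    return 1 << (value_minus4 + 4)


def max_frame_num(log2_max_frame_num_minus4: int) -> int:
    """MaxFrameNum derived from the SPS element."""
    return _pow2_from_minus4(log2_max_frame_num_minus4,
                             "log2_max_frame_num_minus4")


def max_pic_order_cnt_lsb(log2_max_pic_order_cnt_lsb_minus4: int) -> int:
    """MaxPicOrderCntLsb derived from the SPS element."""
    return _pow2_from_minus4(log2_max_pic_order_cnt_lsb_minus4,
                             "log2_max_pic_order_cnt_lsb_minus4")
```

With the smallest allowed value 0 both variables are 16, and with the largest allowed value 12 both are 65536.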

- Syntax elements for Pic Parameter Set (PPS)
0) pic_parameter_set_id: [0, 255]
1) mb to slice group map
2) QP initial value for Y/C

- Syntax elements for Slice header
0) first_mb_in_slice: MB index in general and MB pair index for MBAFF
1) slice_type: an IDR picture only contains I/SI slices, and so does the whole video seq when num_ref_frames is 0
2) frame_num: identifies the picture and is used to number reference pictures. It wraps modulo MaxFrameNum.
3) field_pic_flag: when 1, this slice belongs to a coded field, i.e. the picture is a field picture. The picture structure is defined by this flag. When it is 0 the picture is a frame, but the MB structure may still be either frame or field (MBAFF).
4) bottom_field_flag: when 1, this slice is part of a coded bottom field, i.e. the picture is a bottom field.
5) pic_order_cnt_lsb: the picture order count modulo MaxPicOrderCntLsb for the top field of a coded frame or for a coded field.
6) delta_pic_order_cnt_bottom
7) delta_pic_order_cnt[0-1]?
8) idr_pic_id: identifies an IDR picture. All slices in one IDR have the same value of idr_pic_id.
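
Two of the slice header rules above are easy to write down as code: the picture structure follows from field_pic_flag and bottom_field_flag, and the address of the first MB is first_mb_in_slice * (1 + MbaffFrameFlag), where MbaffFrameFlag is 1 only when mb_adaptive_frame_field_flag is 1 and field_pic_flag is 0. The function names below are invented for this sketch.

```python
def picture_structure(field_pic_flag: int, bottom_field_flag: int = 0) -> str:
    """Return 'frame', 'top field' or 'bottom field' for a slice."""
    if not field_pic_flag:
        return "frame"  # bottom_field_flag is not present in this case
    return "bottom field" if bottom_field_flag else "top field"


def first_mb_address(first_mb_in_slice: int,
                     mb_adaptive_frame_field_flag: int,
                     field_pic_flag: int) -> int:
    """Address of the first MB of the slice.

    In MBAFF frames first_mb_in_slice counts MB pairs, so it is doubled.
    """
    mbaff_frame_flag = 1 if (mb_adaptive_frame_field_flag
                             and not field_pic_flag) else 0
    return first_mb_in_slice * (1 + mbaff_frame_flag)
```

For example, first_mb_in_slice = 10 points at MB 20 in an MBAFF frame and at MB 10 otherwise.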

- Syntax elements for slice data
0) mb_field_decoding_flag: identifies whether the current MB pair is field or frame structure in MBAFF mode
