How to get Raw data from CR2 format. page7 Decoding Lossless JPEG - DHT

since:Aug 31,2014
last update:Sep 6,2014
previous page : next page

top > photograph and camera > How to get Raw data from CR2 format. page1 > Decoding Lossless JPEG - DHT

4. Decoding Lossless JPEG

4.1 Structure of Lossless JPEG in CR2 format

Here is the part that how to decode lossless JPEG in CR2 format.
To understanding more easier, the file (IMG_2026_IFD3_JPEG.jpg) which is clipped lossless JPEG section is used.

In CR2 format, Lossless JPEG has five markers including SOI and EOI.

MarkerSymbolDescription
0xFFD8SOIStart Of Image
0xFFC4DHT Define Huffman Tables
0xFFC3SOF3 Start Of Frame Lossless (sequential) (non-differential, Huffman coding)
0xFFDASOS Start Of Scan
0xFFD9EOI End Of Image

4.2 DHT - Define Huffman Tables

Below figure is a partial dump of lossless JPEG in the sample.



In JPEG, the byte order is used the big endian.
The 2 byte value (0xFF 0xC4) in address 0x0000 0002 is DHT Marker.

The 2 byte value (0x00 0x42) in address 0x0000 0004 is the segment length including itself (Lh).
This segment length is 66.

The high 4 bit of next 1 byte (address 0x0000 0006) means Table Class (Tc). 0 is DC table or lossless table. 1 is AC table. Now here is lossless JPEG, this is always 0.
The low 4 bit of this 1 byte means Huffman table destination identifier (Th). In the sample, this value is 0, so this segment is ID #0 huffman table.

The next 16 byte (from address 0x0000 0007) means Number of Huffman codes of length i (Li).

Bit length (i)12345678 910111213141516
Number of Huffman codes0x000x010x040x020x030x01 0x010x010x010x010x000x000x000x00 0x000x00

That is,
0 Huffman code of 1 bit length is.
1 Huffman code of 2 bit length is.
4 Huffman codes of 3 bit length are.
:

Now, we make a huffman tree from these data.

The number of 1 bit length Huffman code is 0, and we need not consider it.

The number of 2 bit length Huffman code is 1 in above table.
2 bit length code are of 4 types as follows:
00
01
10
11
And we should select 1 code in the most minimum and have not yet selected.
In the sample, we can get 00.

The number of 3 bit length Huffman codes is 4 in above table.
3 bit length code are of 8 types as follows:
000
001
010
011
100
101
110
111
And we should select 4 code in the most minimum and have not yet selected in the same way.
000 and 001 have been selected in 2 bit length, and they must be excluded.
Therefore in the sample, we can get 010, 011, 100 and 101.

This can be easily expressed by the tree structure like following figure.



The top "R" circle means the root. The white circle means node. The yellow circle means the leaf.
The depth of tree means bit length. The left side number is count of leaves in the depth. It is same as number of Huffman codes.

In the sample, the Huffman code list is below table.

Code
00
010
011
100
101
1100
1101
11100
11101
11110
111110
1111110
11111110
111111110
1111111110

The next 15 byte (from address 0x0000 0017) means Value associated with each Huffman code (Vi).
These values are sequentially put on the Huffman code table as follows. This is #0 Huffman table.

CodeValue
000x06
0100x04
0110x08
1000x05
1010x07
11000x03
11010x09
1110 00x00
1110 10x0A
1111 00x02
1111 100x01
1111 1100x0C
1111 11100x0B
1111 1111 00x0D
1111 1111 100x0E

In address 0x0000 0026, Tc and Th exist as address 0x0000 0006.
And Th is 1, the following data is for ID #1 Huffman table.
In the sample, in fact, data for #1 is same as data for #0. Therefore #1 Huffman table is same as #0 Huffman table.


next page : Decoding Lossless JPEG part2 - SOF3 and SOS

previous page [1] [2] [3] [4] [5] [6] [7] [8] [9] next page