Saturday, February 23, 2008

Where Does All The Computing Power Go?

Warning: Huge Post

I've started using Adobe Photoshop very often (I even use it for simple image resizing and cropping; it's the only photo editor I use). I have a 1 GB DDR2 memory stick and an Intel 945 mobo, a pretty good config that has worked really well for me. I use my comp only for the internet, music and movies, no games, no junk software etc. So I've always been content with the computing power I had (I know this config is a mere pittance compared with current h/w trends). Now that I use high-end media editing s/w like Photoshop and Flash, I'm thinking maybe my computing power isn't enough. That does not mean I'm going to upgrade my memory anytime soon (or will I?), but it got me thinking: why do these image editors use up so much memory?

This question led me to study the .jpeg (or .jpg) file format (don't ask me how), the most common image compression format. And the wiki lover in me refuses to go anyplace else when researching, so I headed over to Wikipedia and keyed in "jpeg". What followed was 20 minutes of total bliss (of course I'm exaggerating; I'm just trying to say it was interesting). I now have an overall idea of how this file format works and also understand why image editing is such a complex process.

The JPEG File Format

JPEG - Joint Photographic Experts Group.

The basics first: this is a lossy compression format, i.e. the compressed file carries less information than the original (the other kind being lossless compression, where all the components of the original are retained). But JPEG is an intelligent compression format, in that it chooses to leave out the data that is least visible to the naked eye, so we don't see much degradation in quality even after considerable compression.

Most digital images use the RGB format (red-green-blue), i.e. every distinct color in the palette is defined using a combination of the R, G, B components.

JPEG Compression Process.

- The RGB image is converted into the YCbCr format: a luma component (Y) representing the brightness of a pixel, and two chroma components (Cb and Cr) representing its color.

- The human eye is less sensitive to fine color detail than to brightness detail, so the brightness of a pixel (the Y component) is more important and is retained completely, whereas the resolution of the two chroma components (Cb and Cr) is reduced by a factor of 2. A rough sketch of these two steps follows.
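
For the curious, here's a rough sketch of these two steps in Python (numpy assumed; the conversion constants are the standard BT.601 ones that JPEG/JFIF uses, but the function names are just mine):

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an (H, W, 3) uint8 RGB image into Y, Cb, Cr planes
    using the BT.601 constants that JPEG/JFIF uses."""
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299    * r + 0.587    * g + 0.114    * b
    cb = -0.168736 * r - 0.331264 * g + 0.5      * b + 128
    cr =  0.5      * r - 0.418688 * g - 0.081312 * b + 128
    return y, cb, cr

def subsample(chroma):
    """Halve a chroma plane in both directions by averaging each
    2x2 neighbourhood -- the 'reduced by a factor of 2' step."""
    h, w = chroma.shape
    return (chroma[:h // 2 * 2, :w // 2 * 2]
            .reshape(h // 2, 2, w // 2, 2)
            .mean(axis=(1, 3)))
```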

- The image, now in YCbCr form, is split into blocks of 8×8 pixels; each such sub-image is held in an 8×8 matrix of integers ranging from -128 to 127. This is because in an 8-bit image each pixel can take 256 possible values, so centered around zero the range becomes -128 to 127 (a sketch of this step follows the figure below).



The sub-image matrix and the corresponding sub-image.
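
A minimal sketch of the block-and-shift step (again just illustrative Python, not anything from a real codec):

```python
import numpy as np

def level_shifted_block(plane, row, col):
    """Cut one 8x8 block out of a channel plane and centre its
    values around zero: 0..255 becomes -128..127."""
    return plane[row:row + 8, col:col + 8].astype(np.int16) - 128
```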


- The above sub-image matrix then undergoes a two-dimensional DCT (discrete cosine transform); this transforms the Y, Cb, Cr data in the matrix into a frequency spectrum.

The DCT is computed through a fairly simple formula (which I guess is not required here).



The matrix obtained after taking the two-dimensional DCT of the sub-image matrix.

Do you see that the first element (1,1) is a very high value compared to all the other elements in the matrix? This top-left element, called the DC coefficient, holds the bulk of the information about the pixel block (it corresponds to the block's average value). This is the advantage of using the discrete cosine transform: most of the information is aggregated in one corner element, and all the other elements are reduced to lower values.
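
In Python the transform is a one-liner if you borrow scipy (an assumption on my part; real encoders use their own fast integer DCTs):

```python
import numpy as np
from scipy.fft import dctn

def dct2(block):
    """Two-dimensional type-II DCT with orthonormal scaling, applied
    to a level-shifted 8x8 block. Low frequencies land near the
    top-left corner, with the DC coefficient at position (1,1)."""
    return dctn(block.astype(np.float64), norm='ortho')
```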

Quantization.

Now the large (1,1) element falls outside the 8-bit range (the 8-bit range is 0-255; -415 would require 9-10 bits), so it needs to be brought back within 8 bits, and for that we go in for quantization. This is the actual lossy stage in the compression. But remember, I spoke about JPEG using an intelligent compression technique.

The human eye is not so good at distinguishing the strength of a high-frequency brightness variation, so we can afford to lose some of the information in the high-frequency components without any appreciable degradation in image quality (to the naked eye). So the high-frequency components are rounded off to zero or to small integer values, while the low-frequency components are preserved almost unchanged.

This is done using a fixed (constant, predetermined) 8×8 quantization matrix: the elements of the DCT matrix are divided by the corresponding elements of the quantization matrix (simple one-to-one division) and the results are rounded off. The final quantized matrix, holding all the image data of the block, thus ends up with very low values. This is where the real compression happens: instead of storing large values in a matrix, we use an (intelligent) algorithm to reduce the value range.
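
Here's what that looks like in Python. The table below is the standard luminance quantization table from the JPEG spec (Annex K); the `quantize` helper is my own naming:

```python
import numpy as np

# Standard JPEG luminance quantization table (Annex K of the spec).
Q50 = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def quantize(dct_block, q=Q50):
    """One-to-one division by the quantization table, rounded to the
    nearest integer -- the only lossy step in the whole chain."""
    return np.round(dct_block / q).astype(np.int16)
```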



The end matrix after the quantization stage. Notice the large number of trailing zeros? The string of trailing zeros is discarded (this is the compression).

The compression rate can be varied by using different quantization matrices (higher values in the Q matrix produce a larger number of trailing zeros); this is what decides the overall quality of the image.
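
To be precise about where those zeros go: the 64 coefficients are read out in a zigzag order, lowest frequencies first, so the zeros bunch up at the tail, where run-length and Huffman coding squeeze them down. A sketch of the zigzag read-out (my own implementation, for illustration):

```python
def zigzag(block):
    """Read an 8x8 block diagonal by diagonal, lowest frequencies
    first, so the zeros from quantization bunch up at the end."""
    coords = [(r, c) for r in range(8) for c in range(8)]
    coords.sort(key=lambda rc: (rc[0] + rc[1],                     # diagonal
                                rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))
    return [block[r][c] for r, c in coords]
```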

Now, all this is just the encoding process; the decoding process is just the reverse. But this, if you remember, is a lossy compression technique, so the matrix obtained after the decoding process differs from the original matrix.
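
Under the same assumptions as the earlier sketches, the decode side looks like this (pass in the same quantization table that was used to encode):

```python
import numpy as np
from scipy.fft import idctn

def decode_block(quantized, q):
    """Reverse the pipeline: multiply back by the quantization table,
    take the inverse DCT and undo the level shift. The rounding done
    during quantization is what makes the result differ from the
    original block."""
    block = idctn(quantized * q, norm='ortho') + 128
    return np.clip(np.round(block), 0, 255).astype(np.uint8)
```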





The original sub-image on the left and the decompressed one (after an encoding-decoding cycle) on the right. The variations are easily seen in the bottom left corner.

Now, why did I write all this? Two reasons: first, I was really interested, and second, it kind of answers my question: why does image editing require so much computing power?

I sometimes deal with high-res images (11814 × 8862-like resolutions are high, right?), and at such resolutions I can't even begin to count the number of calculations required to render the image. So I guess the amount of computing power used is justified. Rendering a high-quality image is itself a big task, and these image editing programs offer options to play around with those pixel blocks, so for every little image effect I choose, the engine has to re-render the picture with all the changes made to the individual pixel blocks. And I'm just talking about a 2-D image here; 3-D graphics editors and video editors handle even larger amounts of data per second.
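
A quick back-of-the-envelope count for that resolution:

```python
import math

# An 11814 x 8862 image carved into 8x8 blocks (one channel only):
blocks = math.ceil(11814 / 8) * math.ceil(8862 / 8)
print(blocks)  # 1477 * 1108 = 1,636,516 blocks -- and every effect that
               # touches the pixels means redoing the transform work on them
```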

Hmm.. so now that I know the reason, I can learn to shut up and wait till Photoshop opens an image; no more bitching about my comp being slow. I respect pshop now for all that it does..

It actually feels good to know your technology...

Cheers.

Image courtesy: [Wikipedia.]
Content Source: [Click here.]

6 ADDITIONAL THOUGHTS:

VIJAY a.k.a VJ said...

Interesting info.....
I have seen jpeg coding somewhere in our Simon Haykin book, but never minded to go thru it....

The wiki makes it completely readable...nice work da..

And now to show all of you that i know something--

guru - the (1,1)th element isn't big, it's actually the smallest of 'em all... (he he he he he)...

Guru said...

^^ Yep... But then, when storing data in the form of bits, it's the magnitude that matters, not the sign of the data. Right?

Next up I'm trying to read about a few other popular file formats like mp3, avi, png. But I won't be burdening my blog readers with more file format posts.

But these make very good reading, so if you have the time just go ahead and read them all.

Cheers..

VIJAY a.k.a VJ said...

Ya ....

Ela said...

But still I would like Photoshop to do things as fast as it can, and whenever I want it to! After all, its job is to assist me! I too bitch a lot when I wait for it to do the processing.... I use the TIFF format rather than jpeg since the former retains the resolution... but without Photoshop I would be doomed, since I process all my image data with it!

Guru said...

TIFF is even higher resolution (than jpeg) if I'm right.. so it must take even longer to render a TIFF file after adding effects or something..

I now understand why you need a faster pshop. You must use it a lot for all the imaging from the lab, right?

Ela said...

Yeap... TIFF is of higher quality, which could be used for publishing... most of the time Photoshop is fast, but at times I tax it too much by opening a lot of images (which I need to work on simultaneously); then it takes a few seconds to read all the images, open them and process them! But as I agreed, it is the best tool for us to work with our data... it gives quite a lot of options to work with!

 