Reading BMP Files

BMP is an abbreviation for Bitmap. It's a file format used for storing images.

The quote below is extracted from Wikipedia:

The BMP file format, also known as bitmap image file or device independent bitmap (DIB) file format or simply a bitmap, is a raster graphics image file format used to store bitmap digital images, independently of the display device (such as a graphics adapter), especially on Microsoft Windows[1] and OS/2[2] operating systems.

Here’s a diagram from Wikipedia showing the structure of a bitmap file:


To read a BMP file, we have to know a few things about what it stores. It's not just raw pixel data: the file usually begins with 54 bytes of header information (overhead). The header can be broken down into 2 parts, the file header and the info header.

The file header stores the essential information that identifies the file. It takes up 14 bytes of the header information. The table below lists the contents of the file header along with the storage each field requires:

| Field                                  | Size         |
| Signature, the uppercase characters BM | 2 bytes      |
| File size, bytes                       | 4 bytes      |
| Two "reserved values" that are unused  | 2 bytes each |
| Offset to beginning of image data      | 4 bytes      |

The signature bytes are ASCII 66 ('B', 0x42) and 77 ('M', 0x4D). Read as a little-endian 16-bit integer they give 0x4D42, which is 19778 in base 10.
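As a rough illustration of the file header layout, here's a minimal sketch that decodes the 14 bytes from a raw buffer. The names BmpFileHeader, readU16, readU32, parseFileHeader and isBmp are made up for this example and are not from my original source code. It assumes only that BMP fields are stored little-endian, so bytes are combined lowest first.

```cpp
#include <cstdint>

// Hypothetical struct mirroring the file header fields (14 bytes on disk).
struct BmpFileHeader {
    std::uint16_t type;        // "BM" read little-endian: 0x4D42 = 19778
    std::uint32_t fileSize;    // size of the whole file in bytes
    std::uint16_t reserved1;   // unused
    std::uint16_t reserved2;   // unused
    std::uint32_t dataOffset;  // offset from file start to the pixel data
};

// Combine 2 bytes, least significant byte first.
constexpr std::uint16_t readU16(const std::uint8_t* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Combine 4 bytes, least significant byte first.
constexpr std::uint32_t readU32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Decode the header field by field; b must point to at least 14 bytes.
constexpr BmpFileHeader parseFileHeader(const std::uint8_t* b) {
    return BmpFileHeader{readU16(b), readU32(b + 2), readU16(b + 6),
                         readU16(b + 8), readU32(b + 10)};
}

// A file is treated as BMP when the signature is "BM".
constexpr bool isBmp(const BmpFileHeader& h) { return h.type == 0x4D42; }
```

Reading field by field like this avoids any dependence on how the compiler lays out the struct in memory, at the cost of a little more code than copying the bytes straight into a packed struct.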

The info header stores information about the image itself. The table below shows the fields stored in the info header:

| Field                                             | Size         |
| Header size, bytes (should be 40)                 | 4 bytes      |
| Image width, pixels                               | 4 bytes      |
| Image height, pixels                              | 4 bytes      |
| Number of color planes                            | 2 bytes      |
| Bits per pixel, 1 to 32                           | 2 bytes      |
| Compression type (assumed 0, uncompressed)        | 4 bytes      |
| Image size, bytes                                 | 4 bytes      |
| X-resolution and Y-resolution, pixels per meter   | 4 bytes each |
| Number of colors and number of "important colors" | 4 bytes each |

File Type: 19778
File Size: 1749656
Reserved 1: 0
Reserved 2: 0
Offset bytes: 54
Struct Size: 40
Width: 810
Height: 540
Plane: 1
Bit/Pixel: 32
Compression: 0
Image Size: 1749602
X Pixel/Meter: 13776
Y Pixel/Meter: 13776
Num Color: 0
important Color: 0

The listing above shows the header information extracted from an input image. It can be seen that File Size - Image Size = Offset Bytes (1749656 - 1749602 = 54).
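That relationship makes a handy sanity check after parsing. Here's a minimal sketch, assuming an uncompressed image with the 40-byte info header and no color table (images with 8 bits per pixel or fewer carry a palette between the headers and the pixels, and some writers leave the image size field at 0, so the check would not apply as written). BmpSummary and headerLooksConsistent are hypothetical names for this example.

```cpp
#include <cstdint>

// Hypothetical summary of the header fields involved in the check.
struct BmpSummary {
    std::uint32_t fileSize;    // "File Size"
    std::uint32_t dataOffset;  // "Offset bytes"
    std::uint32_t headerSize;  // "Struct Size"
    std::uint32_t imageSize;   // "Image Size"
};

// True when the headers agree with each other:
//   - the info header is the common 40-byte variant,
//   - pixel data starts right after the 14 + 40 header bytes,
//   - file size = offset + image size.
constexpr bool headerLooksConsistent(const BmpSummary& s) {
    return s.headerSize == 40 &&
           s.dataOffset == 14 + s.headerSize &&
           s.fileSize == s.dataOffset + s.imageSize;
}
```

With the values from the listing (1749656, 54, 40, 1749602) the check passes, and changing any one of them makes it fail.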

An image can store 1, 2, 4, 8, 16, 24 or 32 bits per pixel.

  • 1 bit is usually used for binary (black-and-white) images.
  • 8 bits for greyscale images.
  • 24 bits for RGB images.
  • 32 bits for RGB images with an alpha (opacity/transparency) channel.

Of these, 24-bit and 32-bit are the most commonly used formats.

The pixel data is laid out as follows (quoted from Wikipedia):

Normally pixels are stored “upside-down” with respect to normal image raster scan order, starting in the lower left corner, going from left to right, and then row by row from the bottom to the top of the image.[4] Unless BITMAPCOREHEADER is used, uncompressed Windows bitmaps also can be stored from the top to bottom, when the Image Height value is negative.
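The row order described in the quote can be handled with a small index calculation. This is only a sketch: fileRowIndex and rowCount are names I made up, and y = 0 is assumed to mean the top row of the picture as displayed.

```cpp
#include <cstdint>

// Number of rows in the image, whatever the sign of the height field.
constexpr std::int32_t rowCount(std::int32_t height) {
    return height < 0 ? -height : height;
}

// Index of the stored row that holds displayed row y (0 = top of picture).
//   height > 0: bottom-up file, the first stored row is the bottom row.
//   height < 0: top-down file, rows are already in display order.
constexpr std::int32_t fileRowIndex(std::int32_t y, std::int32_t height) {
    return height < 0 ? y : height - 1 - y;
}
```

For the 540-row sample image, the top displayed row is stored last (row 539 in the file), while a file with a height of -540 would store it first.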

Bitmap bits are stored immediately following the header information, consisting of an array of BYTE values representing consecutive rows, or "scan lines", of the bitmap. Each scan line consists of consecutive bytes representing the pixels in the scan line, in left-to-right order. The number of bytes representing a scan line depends on the color format and the width, in pixels, of the bitmap. A scan line must be zero-padded to end on a 32-bit boundary. Padding can only be needed when the bits stored per pixel is not 32, because a 32-bit row is always a multiple of 4 bytes. A quick check is to compute (image width * bits per pixel) mod 32: if the remainder is 0, the row needs no padding.
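The padding rule can be written as a short calculation. The sketch below rounds the row's bit count up to the next multiple of 32 and converts it to bytes; rowStride and rowPadding are illustrative names, not functions from my original code.

```cpp
#include <cstdint>

// Bytes per stored scan line, including padding to a 32-bit boundary.
constexpr std::uint32_t rowStride(std::uint32_t width, std::uint32_t bitsPerPixel) {
    return ((width * bitsPerPixel + 31) / 32) * 4;
}

// Padding bytes at the end of each scan line:
// stride minus the bytes actually occupied by pixel bits.
constexpr std::uint32_t rowPadding(std::uint32_t width, std::uint32_t bitsPerPixel) {
    return rowStride(width, bitsPerPixel) - (width * bitsPerPixel + 7) / 8;
}
```

For the 810-pixel-wide sample image at 32 bits per pixel, rowStride gives 3240 bytes with no padding. The same width at 24 bits per pixel would need 2432 bytes per row, 2 of them padding, since 810 * 24 = 19440 leaves a remainder of 16 when divided by 32.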

Part of my source code is adapted from Stack Overflow, so I'm not its original author.

A few parts of the code need clarification. One is the reordering of BGR to RGB when we're reading from a BMP file: pixels are stored in BGR order in BMP files, whereas we usually operate on images in RGB order (at least I do). The other part that needs attention is #pragma pack(push, 1). It tells the compiler to pack struct members on 1-byte boundaries, so that no padding is inserted between them and each member occupies exactly its own size. For instance (partly taken from Stack Overflow):

struct Test
{
   char AA;
   int BB;
   char CC;
};

The compiler could choose to lay the struct out in memory like this:

|   1   |   2   |   3   |   4   |  

| AA(1) | pad.................. |
| BB(1) | BB(2) | BB(3) | BB(4) | 
| CC(1) | pad.................. |

With #pragma pack(push, 1) on, this would be the result:

|   1   |

| AA(1) |
| BB(1) |
| BB(2) |
| BB(3) |
| BB(4) |
| CC(1) |
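The two layouts above can be checked at compile time. Here's a minimal sketch, assuming a compiler that supports #pragma pack (GCC, Clang and MSVC do, but it is an extension rather than standard C++). Padded, Packed and PackedFileHeader are names I chose for this example; PackedFileHeader simply mirrors the 14-byte file header described earlier.

```cpp
#include <cstddef>   // offsetof
#include <cstdint>

// Default layout: the compiler may insert padding to align BB.
struct Padded {
    char AA;
    int  BB;
    char CC;
};

#pragma pack(push, 1)   // pack members on 1-byte boundaries from here on
struct Packed {
    char AA;
    int  BB;
    char CC;
};

// Hypothetical packed struct matching the on-disk BMP file header.
struct PackedFileHeader {
    std::uint16_t type;
    std::uint32_t fileSize;
    std::uint16_t reserved1;
    std::uint16_t reserved2;
    std::uint32_t dataOffset;
};
#pragma pack(pop)       // restore the previous packing setting

// With packing on, the size is exactly the sum of the member sizes.
static_assert(sizeof(Packed) == sizeof(char) + sizeof(int) + sizeof(char),
              "no padding between members");
static_assert(offsetof(Packed, BB) == 1, "BB directly follows AA");
static_assert(sizeof(PackedFileHeader) == 14, "matches the 14-byte file header");
```

Without the pragma, PackedFileHeader would typically come out at 16 bytes because fileSize gets aligned to a 4-byte boundary, and reading 14 bytes from the file straight into it would put every field after type in the wrong place.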

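Coming back to the BGR ordering mentioned earlier, the reordering itself is just a swap of the first and third byte of every pixel. The sketch below is one way to do it, not the code I took from Stack Overflow; Rgb, bgrToRgb and swapRowBgrToRgb are hypothetical names. It assumes 24-bit or 32-bit pixels, where each channel is one byte (in the 32-bit case the fourth byte is left where it is).

```cpp
#include <cstddef>
#include <cstdint>

// A pixel in the order we usually work with.
struct Rgb {
    std::uint8_t r, g, b;
};

// Read one pixel stored as blue, green, red and return it as red, green, blue.
constexpr Rgb bgrToRgb(const std::uint8_t* p) {
    return Rgb{p[2], p[1], p[0]};
}

// Swap blue and red in place for every pixel of one scan line.
// bytesPerPixel is 3 for 24-bit images and 4 for 32-bit images.
inline void swapRowBgrToRgb(std::uint8_t* row, std::size_t widthPixels,
                            std::size_t bytesPerPixel) {
    for (std::size_t x = 0; x < widthPixels; ++x) {
        std::uint8_t* p = row + x * bytesPerPixel;
        const std::uint8_t blue = p[0];
        p[0] = p[2];   // red moves to the front
        p[2] = blue;   // blue moves to the back
    }
}
```

The in-place version should be called once per scan line, so that any padding bytes at the end of a row are skipped rather than treated as pixels.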

