Understanding Block Blobs

The Azure blob storage service provides two flavours of blob:

  1. Block blob
  2. Page blob

In this article I would like to focus on the block blob, since it is the more widely used of the two.
I’ll go over the block blob concepts and follow up with some examples for clarification.

Block vs Page Blob

Block blobs let you upload large amounts of data very quickly. Why?
Because the blocks can be uploaded in parallel, and the size of each block can vary (up to 4MB).
In contrast, a page blob is optimized for random access, meaning I can specify an offset in the blob for reading or writing.
Each page in a page blob has a fixed size of 512 bytes (so the storage service can optimize access internally).
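
To make the two size limits concrete, here is a minimal sketch of the arithmetic involved. BlobLayout and its members are hypothetical helpers written for this article, not part of the StorageClient library. The sketch assumes the 4MB block limit and the 512-byte page size quoted above, and it assumes that a page blob write must start and end on a page boundary.

```csharp
using System;

// Hypothetical helper for illustration only; not part of any Azure SDK.
public static class BlobLayout
{
    public const int MaxBlockSize = 4 * 1024 * 1024; // 4MB block limit quoted above
    public const int PageSize = 512;                 // fixed page size quoted above

    // Number of blocks needed to hold blobSize bytes (ceiling division).
    public static long BlockCount(long blobSize, int blockSize = MaxBlockSize)
    {
        if (blobSize < 0) throw new ArgumentOutOfRangeException("blobSize");
        if (blockSize <= 0 || blockSize > MaxBlockSize) throw new ArgumentOutOfRangeException("blockSize");
        return (blobSize + blockSize - 1) / blockSize;
    }

    // Length of the block at the given index; only the last block can be shorter.
    public static long BlockLength(long blobSize, long index, int blockSize = MaxBlockSize)
    {
        if (index < 0 || index >= BlockCount(blobSize, blockSize)) throw new ArgumentOutOfRangeException("index");
        long remaining = blobSize - index * blockSize;
        return Math.Min(remaining, blockSize);
    }

    // True when a page blob write range starts and ends on 512-byte boundaries.
    public static bool IsPageAligned(long offset, long length)
    {
        return offset >= 0 && length > 0 && offset % PageSize == 0 && length % PageSize == 0;
    }
}
```

Since each block is independent of the others, the ranges computed this way can be handed to separate threads and uploaded side by side, which is where the upload speed of a block blob comes from.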

Block Blob Concepts

If you go over the Storage Service REST APIs or the Microsoft.WindowsAzure.StorageClient namespace, you have probably noticed that there
isn’t any way to delete a single block. Why?
To answer this question, let's first go over some concepts:

The blocks in the storage service are organized into two lists:

  1. Committed
  2. Uncommitted

Let's look at this from the Azure Storage Service perspective:

Any time you upload a block (by calling PutBlock) to the storage service, it goes directly to the Uncommitted block list.
Blocks on the Uncommitted list are kept by the storage service for up to one week; if they are not committed (that is, made part of the Committed list) by then, they are discarded.
To commit blocks, call PutBlockList and specify their block ids.

So how do I delete a block?
Simply leave out the id of the block to be deleted from the list when calling PutBlockList.

How do I update a block's content?
By using the same block id concept as for deleting. You upload a new block that has the same id as the old one, then commit the list again (the blob service will take the latest block).
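
The delete and update rules are easier to follow with a small in-memory model of the two lists. This is only a sketch of the behaviour described above, not the service's implementation: BlockBlobModel and its members are hypothetical names, block content is reduced to strings, and the one-week expiry of uncommitted blocks is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory model of a block blob; not part of any Azure SDK.
public class BlockBlobModel
{
    private readonly Dictionary<string, string> uncommitted = new Dictionary<string, string>();
    private Dictionary<string, string> committed = new Dictionary<string, string>();
    private List<string> committedOrder = new List<string>();

    // Mirrors PutBlock: the block is only staged on the uncommitted list.
    public void PutBlock(string blockId, string content)
    {
        uncommitted[blockId] = content;
    }

    // Mirrors PutBlockList: the blob becomes exactly the listed blocks, in order.
    // For each id the latest (uncommitted) version wins over the committed one.
    public void PutBlockList(IEnumerable<string> blockIds)
    {
        var newCommitted = new Dictionary<string, string>();
        var newOrder = new List<string>();
        foreach (var id in blockIds)
        {
            string content;
            if (uncommitted.TryGetValue(id, out content) || committed.TryGetValue(id, out content))
            {
                newCommitted[id] = content;
                newOrder.Add(id);
            }
            else
            {
                throw new ArgumentException("Unknown block id: " + id);
            }
        }
        committed = newCommitted;
        committedOrder = newOrder;
        // Anything that was not listed is dropped, staged blocks included.
        uncommitted.Clear();
    }

    // The blob content is the concatenation of the committed blocks.
    public string Read()
    {
        return string.Concat(committedOrder.Select(id => committed[id]));
    }
}
```

In this model, committing the ids "a", "b", "c" and then committing only "a" and "c" removes block "b" from the blob. Staging a new block under the id "c" changes nothing that Read returns until PutBlockList is called again with "c" in the list.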

Live example

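The following simple example demonstrates how to delete or update blocks in a blob.

I used the Microsoft.WindowsAzure.StorageClient APIs.

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

const string storageName = "[Put_Storage__Name]";
const string storageKey = "[Put_Private_Key]";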
CloudStorageAccount storage = CloudStorageAccount.Parse(String.Format("DefaultEndpointsProtocol=https;AccountName={0};AccountKey={1}", storageName, storageKey));
CloudBlobClient cloudBlobClient = storage.CreateCloudBlobClient();

CloudBlockBlob blob = cloudBlobClient.GetBlockBlobReference("mycontainer/testfile.txt");

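// Get the committed blocks
List<string> committedBlocks = new List<string>();
// Grab the first 100 block ids
committedBlocks.AddRange(blob.DownloadBlockList(BlockListingFilter.Committed).Take(100).Select(block => block.Name));
// Note that blocks left out of the list passed to PutBlockList are removed from the blob
// and garbage collected by the storage service

// Update the first block
// (assumes the existing blocks were uploaded with this same id scheme)
const String newBlockContent = "testblock";
var blockID = Convert.ToBase64String(BitConverter.GetBytes(0));
// Upload the content; the new block lands on the uncommitted list
blob.PutBlock(blockID, new MemoryStream(Encoding.Default.GetBytes(newBlockContent)), null);
// Make sure the id is part of the list, then commit the blocks
if (!committedBlocks.Contains(blockID))
{
    committedBlocks.Insert(0, blockID);
}
blob.PutBlockList(committedBlocks);
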
About Shay Yannay
Shay Yannay is a Software Developer and Technology Evangelist. He is experienced in designing and developing highly scalable, distributed, complex systems with 24x7 availability. Shay also specializes in performance management and diagnostics of multi-tier applications. He is passionate about cloud technologies and trends, specifically Microsoft Azure. He currently works for Quest Software's cloud tools division as an Azure Specialist. Shay holds a B.Sc. in Communication Systems Engineering from Ben-Gurion University.

