Azure Table Storage: Design, Manage, and Scale Table Partitions

About Azure Table Storage Service and Entity

The Azure Table service is highly scalable: it can store and serve massive amounts of structured data, up to terabytes. This scalability is achieved through table partitioning.

Table Entity

A table entity is a table row. Every entity contains three system properties: PartitionKey, RowKey, and Timestamp.

Table Primary key

The combination of PartitionKey and RowKey forms the primary key of an entity, and this combination acts as a clustered index on the Azure table.

This clustered index enables table data to be sorted by partition key in ascending order and subsequently by RowKey in ascending order.
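The sort order described above can be illustrated with a small sketch. The sample entities and the helper name `primary_key` are hypothetical and exist only for this example; no Azure SDK is involved.

```python
# Hypothetical sample entities; only PartitionKey and RowKey matter here.
entities = [
    {"PartitionKey": "IT", "RowKey": "002"},
    {"PartitionKey": "HR", "RowKey": "010"},
    {"PartitionKey": "IT", "RowKey": "001"},
    {"PartitionKey": "HR", "RowKey": "002"},
]


def primary_key(entity):
    """Return the (PartitionKey, RowKey) pair that identifies an entity."""
    return (entity["PartitionKey"], entity["RowKey"])


# Sort by PartitionKey first, then by RowKey, both ascending.
ordered = sorted(entities, key=primary_key)
ordered_keys = [primary_key(e) for e in ordered]
print(ordered_keys)
# [('HR', '002'), ('HR', '010'), ('IT', '001'), ('IT', '002')]
```

Both keys are strings, so they compare lexically. A RowKey of "10" would sort before "2", which is why the sample values are zero-padded.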

Table Partition

The partition key is used to group related and similar entities; the factors for choosing a partition key during table design are discussed later in this tutorial. The Table service spreads the partitions across different physical servers (partition servers) while ensuring that all entities with the same partition key remain on one partition server. A single partition server may, however, hold the entities of more than one partition key, each as a separate logical partition.

In the context of table partitioning for scalability, the partition key is therefore the unit of scale: it determines how many partition servers the table can be spread across.

A single partition has a scalability target of 500 entities per second. Requests are throttled if the storage node becomes hot, that is, very active.

For better scalability and good throughput

As the request is served from a partition of a partition server, the efficiency and throughput depend on the good health of the server.

Please refer to the image below, which shows physical and logical partitions.

SERVER 1 contains the entities of 2 partition keys, 'IT' and 'HR'; these are logical partitions in a physical server.

SERVER 2 contains the entities of 1 partition key, 'Support'.

Figure: Azure Table Storage: Scale the Partitions

  • If a partition server (SERVER 1) that holds multiple logical partitions encounters high traffic, it may not be able to provide high throughput.
  • To keep the table storage scalable, a new partition server is brought in and the traffic is distributed across the partition servers.
  • More partitions should therefore be used for optimal load balancing of traffic; this helps the Azure Table service spread the logical partitions across more physical partition servers.
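The idea in the points above can be sketched as a toy model. This is not how the Azure Table service balances load internally; the function `rebalance`, the `HOT_THRESHOLD` value, and the load figures are all assumptions made for illustration.

```python
# Illustrative model only: the real load balancer is internal to the service.
HOT_THRESHOLD = 500  # assumed requests/second one server handles comfortably


def rebalance(servers, load):
    """Move the busiest logical partition off each hot server onto a new server.

    servers: dict of server name -> list of partition keys hosted there
    load:    dict of partition key -> requests per second
    """
    result = {name: list(keys) for name, keys in servers.items()}
    for name, keys in servers.items():
        total = sum(load[k] for k in keys)
        # A server hosting a single partition cannot be split further by key.
        if total > HOT_THRESHOLD and len(keys) > 1:
            busiest = max(keys, key=lambda k: load[k])
            result[name].remove(busiest)
            result[f"SERVER {len(result) + 1}"] = [busiest]
    return result


servers = {"SERVER 1": ["IT", "HR"], "SERVER 2": ["Support"]}
load = {"IT": 450, "HR": 300, "Support": 100}
balanced = rebalance(servers, load)
print(balanced)
# {'SERVER 1': ['HR'], 'SERVER 2': ['Support'], 'SERVER 3': ['IT']}
```

Note that the model can only move whole logical partitions. If every entity shared one partition key, there would be nothing to move, which is the reason more partitions give the service more room to balance.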

Partition Key in Group Transactions

The partition key groups related or similar entities, and a group transaction can only operate within one such group.

An entity group transaction is a set of storage operations that are executed atomically on entities that share a particular PartitionKey.

An entity group transaction comprises no more than 100 storage operations and may be no more than 4 MB in size. All operations must be on the same partition key.

Entity group transaction improves throughput by reducing the number of individual storage operations.
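The three constraints above can be checked before a batch is submitted. The following is a minimal sketch under stated assumptions: `validate_batch` is a hypothetical helper, not an Azure SDK function, and the size check uses JSON encoding as a rough stand-in for the real wire format.

```python
import json

MAX_OPERATIONS = 100
MAX_PAYLOAD_BYTES = 4 * 1024 * 1024  # the 4 MB limit, taken here as 4 MiB


def validate_batch(entities):
    """Return a list of problems that would make the batch invalid (empty if none)."""
    problems = []
    if len(entities) > MAX_OPERATIONS:
        problems.append("more than 100 operations")
    if len({e["PartitionKey"] for e in entities}) > 1:
        problems.append("more than one PartitionKey")
    # Rough size estimate: JSON-encoded entities, not the actual request body.
    size = sum(len(json.dumps(e).encode("utf-8")) for e in entities)
    if size > MAX_PAYLOAD_BYTES:
        problems.append("payload larger than 4 MB")
    return problems


good = [{"PartitionKey": "IT", "RowKey": str(i)} for i in range(100)]
mixed = good[:2] + [{"PartitionKey": "HR", "RowKey": "1"}]
too_many = good + [{"PartitionKey": "IT", "RowKey": "100"}]

print(validate_batch(good))      # []
print(validate_batch(mixed))     # ['more than one PartitionKey']
print(validate_batch(too_many))  # ['more than 100 operations']
```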

Partition Sizing

From the "Scalability" perspective, for better load balancing, more partitions are required.

Partition size determines the number of entities within a partition. Partition key's granularity determines the partition size.

At the coarsest level, a table may use a single partition key for all entities; at the finest level, it may use a separate partition key for each entity.
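The effect of key granularity on partition count and size can be shown with a short sketch. The employee records and the helper `partition_sizes` are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Hypothetical employee entities used only for illustration.
employees = [
    {"id": "e1", "department": "IT"},
    {"id": "e2", "department": "IT"},
    {"id": "e3", "department": "HR"},
    {"id": "e4", "department": "Support"},
]


def partition_sizes(entities, key_func):
    """Count how many entities fall into each partition for a given key choice."""
    return Counter(key_func(e) for e in entities)


coarsest = partition_sizes(employees, lambda e: "all")  # one partition for everything
by_department = partition_sizes(employees, lambda e: e["department"])
finest = partition_sizes(employees, lambda e: e["id"])  # one partition per entity

print(len(coarsest), len(by_department), len(finest))  # 1 3 4
print(by_department["IT"])  # 2
```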

Let's view the advantages and disadvantages of different partition keys with their respective size; we can consider these types of partitions during table design.

Figure: Azure Table Storage Partition Key and Partition Sizing

Partition considerations

PartitionKey granularity determines the number and size of partitions, which in turn affect scalability.

Considering the PartitionKey for table design

Table design (considering PartitionKey) should be based on:

  • Scalability. A good partition key allows the table to scale better.
  • The types of queries to be run, because a query performs best when it includes the PartitionKey.
  • The types of storage operations to be performed, such as inserts.
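Why queries benefit from including the PartitionKey can be seen in a simplified in-memory model. The nested dictionary below is a hypothetical stand-in for a table, and each function returns the number of entities it had to examine alongside its result.

```python
# Hypothetical in-memory table keyed by PartitionKey, then RowKey.
table = {
    "IT": {"001": "Asha", "002": "Ben"},
    "HR": {"001": "Chen"},
    "Support": {"001": "Dana", "002": "Eli", "003": "Fay"},
}


def point_query(partition_key, row_key):
    """Both keys known: go straight to one entity."""
    return table[partition_key][row_key], 1


def partition_scan(partition_key, name):
    """Only PartitionKey known: examine the entities of that partition only."""
    rows = table[partition_key]
    matches = [v for v in rows.values() if v == name]
    return matches, len(rows)


def table_scan(name):
    """No PartitionKey: every partition has to be examined."""
    matches, examined = [], 0
    for rows in table.values():
        examined += len(rows)
        matches += [v for v in rows.values() if v == name]
    return matches, examined


print(point_query("IT", "002"))          # ('Ben', 1)
print(partition_scan("Support", "Eli"))  # (['Eli'], 3)
print(table_scan("Eli"))                 # (['Eli'], 6)
```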

The PartitionKey should be chosen so that:

  • It has a wide range of distinct values.
  • It distributes entities evenly across the partitions.
  • The table can keep scaling: as the number of entities grows, the service scales by distributing partitions across physical partition servers, so the PartitionKey determines the number of table partitions.
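One way to compare candidate partition keys against these criteria is to count distinct values and measure how much of the data lands in the largest partition. The sketch below assumes a hypothetical set of order entities and a hypothetical helper `key_stats`.

```python
from collections import Counter


def key_stats(entities, field):
    """Return (number of distinct values, share of entities in the largest partition)."""
    counts = Counter(e[field] for e in entities)
    largest_share = max(counts.values()) / len(entities)
    return len(counts), largest_share


# Hypothetical order entities: 'country' is skewed, 'customer_id' is spread out.
orders = [
    {"country": "US" if i % 10 else "CA", "customer_id": f"c{i % 50}"}
    for i in range(1000)
]

print(key_stats(orders, "country"))      # (2, 0.9)
print(key_stats(orders, "customer_id"))  # (50, 0.02)
```

In this made-up data, `country` offers only two partitions with 90% of the entities in one of them, while `customer_id` offers fifty evenly sized partitions and is the better candidate by both criteria.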

Tips for Azure 70-532: Developing Microsoft Azure Solutions

Please note the following points:

  • An entity group transaction comprises no more than 100 storage operations.
  • An entity group transaction may be no more than 4 MB in size.
  • All operations must be on the same partition key.