
Monday, August 19, 2019

Migrate an app to Azure Cosmos DB from Azure Storage tables


If you have decided to move to Azure Cosmos DB, and you currently have data in one or more Azure Storage tables, you must consider how to move that data into Azure Cosmos DB. Microsoft provides two tools to complete this task:
  • The Azure Cosmos DB Data Migration Tool. This open-source tool is built specifically to import data into Azure Cosmos DB from many different sources, including tables in Azure Storage, SQL databases, MongoDB, text files in JSON and CSV formats, HBase, and other databases. The tool has both a command-line version and a GUI version. You supply the connection strings for the data source and the Azure Cosmos DB target, and you can filter the data before migration.
  • AzCopy. This command-line-only tool is designed to let developers copy data to and from Azure Storage accounts. The process has two stages:
    • Export the data from the source table to a local file.
    • Import from that local file into a database in Azure Cosmos DB, specifying the destination by its URL and access key.
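
The two-stage export and import flow described for AzCopy can be sketched in a few lines of Python. This is only an illustration of the idea, not how AzCopy or the Data Migration Tool work internally. The `InMemoryTable` class is a hypothetical stand-in; its two methods are named after `list_entities` and `upsert_entity` on the `TableClient` in the `azure-data-tables` package, so a real client could plausibly be passed in instead, but that is an assumption to verify against your SDK version.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Iterable


class InMemoryTable:
    """Hypothetical stand-in for a table client, used only to exercise the sketch."""

    def __init__(self, entities: Iterable[dict[str, Any]] = ()) -> None:
        # Key entities the way a table does: by (PartitionKey, RowKey).
        self.rows = {(e["PartitionKey"], e["RowKey"]): dict(e) for e in entities}

    def list_entities(self) -> list[dict[str, Any]]:
        return list(self.rows.values())

    def upsert_entity(self, entity: dict[str, Any]) -> None:
        self.rows[(entity["PartitionKey"], entity["RowKey"])] = dict(entity)


def export_table(source, path: Path) -> int:
    """Stage 1: write every source entity to a local JSON Lines file."""
    count = 0
    with path.open("w", encoding="utf-8") as f:
        for entity in source.list_entities():
            # default=str flattens dates and similar values to strings;
            # a real migration would need to preserve property types.
            f.write(json.dumps(dict(entity), default=str) + "\n")
            count += 1
    return count


def import_table(path: Path, target) -> int:
    """Stage 2: read the local file back and upsert each entity into the target."""
    count = 0
    with path.open(encoding="utf-8") as f:
        for line in f:
            target.upsert_entity(json.loads(line))
            count += 1
    return count


def round_trip_demo() -> InMemoryTable:
    """Run both stages against in-memory tables and return the target."""
    source = InMemoryTable([
        {"PartitionKey": "2019", "RowKey": "A1", "Total": 10},
        {"PartitionKey": "2019", "RowKey": "A2", "Total": 25},
    ])
    target = InMemoryTable()
    with tempfile.TemporaryDirectory() as tmp:
        dump = Path(tmp) / "orders.jsonl"
        export_table(source, dump)
        import_table(dump, target)
    return target
```

Using an upsert in the import stage means the import can be rerun after a partial failure without creating duplicates.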

Azure Cosmos DB vs Azure Storage tables

Azure Storage Tables is a service that provides a way to store semi-structured data in the cloud. The data is highly available to clients because it is replicated to multiple nodes or locations.
Storage tables are an example of a NoSQL database. Such databases don't impose a strict schema on each table as a SQL database does. Instead, each entity in the table can have a different set of properties. It's up to you to keep these properties organized, and to ensure that apps that query the data can work with results in which entities carry different sets of properties. A primary advantage of this semi-structured approach is that the database can evolve more quickly to meet changing business requirements.
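
As a small illustration of what "different sets of properties" means for querying code, here is a sketch in Python. The entities, property names, and the `contact_for` helper are all made up for this example; only `PartitionKey` and `RowKey` are properties that every table entity has.

```python
# Two entities from the same (hypothetical) table with different property sets.
entities = [
    {"PartitionKey": "customers", "RowKey": "001", "Name": "Ada", "Email": "ada@example.com"},
    {"PartitionKey": "customers", "RowKey": "002", "Name": "Grace", "Phone": "555-0100"},
]


def contact_for(entity: dict) -> str:
    """Return the best available contact detail, tolerating missing properties."""
    return entity.get("Email") or entity.get("Phone") or "no contact on file"


contacts = [contact_for(e) for e in entities]
```

The point is that the reading code, not the database, decides what to do when a property is absent.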

Azure Cosmos DB is Microsoft's globally distributed, multi-model database service on Azure.
Multi-model means that you can use one of many data access methods and APIs to read and write data. For example, you can use SQL, but if you prefer a NoSQL approach, you can use MongoDB, Cassandra, or Gremlin. Azure Cosmos DB also includes the Table API, which means that if you move your data from Azure Storage tables into Azure Cosmos DB, you don't have to rewrite your apps. Instead, you just change their connection strings.
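
To make the connection-string switch concrete, here is a minimal sketch. The two strings use placeholder account names and keys, and their exact shape (in particular the `TableEndpoint` setting and host name for Azure Cosmos DB) is an assumption; copy the real values from the Azure portal. The `table_backend` helper is a hypothetical heuristic for illustration, not part of any SDK.

```python
# Placeholder values only; the exact format is an assumption to check in the portal.
STORAGE_CONN = (
    "DefaultEndpointsProtocol=https;AccountName=mystorage;"
    "AccountKey=PLACEHOLDER==;EndpointSuffix=core.windows.net"
)
COSMOS_CONN = (
    "DefaultEndpointsProtocol=https;AccountName=mycosmos;"
    "AccountKey=PLACEHOLDER==;"
    "TableEndpoint=https://mycosmos.table.cosmos.azure.com:443/;"
)


def parse_connection_string(conn_str: str) -> dict[str, str]:
    """Split 'Key=Value;Key=Value' pairs. Values may contain '=' (base64 keys do)."""
    settings = {}
    for part in conn_str.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)  # split on the first '=' only
            settings[key.strip()] = value.strip()
    return settings


def table_backend(conn_str: str) -> str:
    """Heuristic: a TableEndpoint mentioning 'cosmos' suggests the Cosmos DB target."""
    endpoint = parse_connection_string(conn_str).get("TableEndpoint", "")
    return "cosmos-db" if "cosmos" in endpoint.lower() else "storage"
```

If the app reads its connection string from configuration rather than hard-coding it, switching targets becomes a configuration change rather than a code change.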
Azure Cosmos DB can replicate data for read and write access to multiple regions. Clients can connect to a local replica both to query and to modify data, which is not possible in Azure Storage tables.

Differences between Azure Storage tables and Azure Cosmos DB tables

There are some differences in behavior between Azure Storage tables and Azure Cosmos DB tables to remember if you are considering a migration. For example:
  • You are charged for the capacity of an Azure Cosmos DB table as soon as it is created, even if that capacity isn't used. This is because Azure Cosmos DB uses a reserved-capacity model to ensure that clients can read data within 10 ms. In Azure Storage tables, you are only charged for used capacity, but read access is only guaranteed within 10 seconds.
  • Query results from Azure Cosmos DB are not sorted in order of partition key and row key as they are from Storage tables.
  • Row keys in Azure Cosmos DB are limited to 255 bytes.
  • Batch operations are limited to 2 MB.
  • Cross-Origin Resource Sharing (CORS) is not currently supported by Azure Cosmos DB.
  • Table names are case-sensitive in Azure Cosmos DB. They are not case-sensitive in Storage tables.
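
Some of the differences above can be checked or worked around in application code before migrating. The following is a rough sketch with hypothetical helper names. It assumes the 255-byte row key limit is measured on the UTF-8 encoding and that Python's default string ordering is close enough to the ordering Storage tables return; both are assumptions to verify.

```python
from collections import defaultdict

MAX_ROW_KEY_BYTES = 255  # row key limit from the list above


def row_key_too_long(entity: dict) -> bool:
    """Flag entities whose RowKey exceeds the limit (UTF-8 size is an assumption)."""
    return len(entity["RowKey"].encode("utf-8")) > MAX_ROW_KEY_BYTES


def sort_like_storage(entities: list[dict]) -> list[dict]:
    """Re-create the PartitionKey, RowKey ordering on the client side."""
    return sorted(entities, key=lambda e: (e["PartitionKey"], e["RowKey"]))


def case_collisions(table_names: list[str]) -> dict[str, list[str]]:
    """Find names the app uses that differ only by case.

    Such names refer to one table in Storage tables but would be
    treated as different names in Azure Cosmos DB.
    """
    groups = defaultdict(set)
    for name in table_names:
        groups[name.lower()].add(name)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}
```

For example, running `case_collisions` over the table names that appear in an app's source code would reveal references such as `Orders` and `orders` that need to be made consistent.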

How to choose between the two?

| Priority | Azure Storage Tables | Azure Cosmos DB Tables |
|---|---|---|
| Latency | Responses are fast, but there is no guaranteed response time. | < 10 ms for reads, < 15 ms for writes |
| Throughput | Maximum 20,000 operations/sec | No upper limit on throughput. Over 10 million operations/sec/table. |
| Global distribution | Single region for writes. A secondary read-only region is possible with read-access geo-redundant replication. | Replication of data for read and write to more than 30 regions. |
| Indexes | A single primary key on the partition key and the row key. No other indexes. | Indexes are created automatically on all properties. |
| Data consistency | Strong in the primary region. If you are using read-access geo-redundant replication, it may take time for changes to reach the secondary region. | You can choose from five different consistency levels depending on your needs for availability, latency, throughput, and consistency. |
| Pricing | Optimized for storage. | Optimized for throughput. |
| SLAs | 99.99% availability. | 99.99% availability for single region and relaxed consistency databases. 99.999% availability for multi-region databases. |
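
To put the availability percentages in perspective, the short calculation below converts them into a maximum downtime per year. It is plain arithmetic over a 365-day year, not a statement of how either service's SLA is actually measured or credited.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def max_downtime_minutes(availability_percent: float) -> float:
    """Minutes per year a service may be unavailable at the given availability."""
    return MINUTES_PER_YEAR * (100 - availability_percent) / 100


four_nines = max_downtime_minutes(99.99)   # about 52.6 minutes per year
five_nines = max_downtime_minutes(99.999)  # about 5.3 minutes per year
```

The extra "nine" for multi-region databases cuts the permitted downtime by a factor of ten.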