In today's world, the digital economy runs in the cloud. Most consumers, and many business executives, don't know much about the inner workings of the cloud or its architecture, even though they expect a lot from it. We expect clouds -- private, public and hybrid -- to perform well and deliver information quickly, even if the data centers are scattered around the world.
These expectations demand better cloud architectures and applications. They also drive the need for databases that scale in the cloud, can be geographically distributed, scale “horizontally” to add compute power quickly and inexpensively, and preserve the transactional integrity common to relational databases. These drivers have given rise to geo-distributed databases.
Briefly, a geo-distributed database is a database that is spread across two or more geographically distinct locations and runs without degraded transaction performance.
But beyond this base definition of a geo-distributed database, what should business leaders, information technology (IT) managers and others involved with migrating systems to the cloud know about geo-distributed databases?
Here are three key considerations to keep in mind about geographically distributed databases that support cloud-scale systems.
To easily accommodate compute-load spikes (or declines), a cloud database must be able to scale horizontally and do so quickly. That way you can add commodity servers to handle more load without having to shut down and migrate workloads. Past computing architectures were designed to scale vertically -- to increase capacity, you added a bigger, more powerful server, even if that meant some downtime. Relational databases can sometimes scale horizontally, but only to a limited extent, and doing so typically means more software, more administration, and costly hardware.
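As a rough illustration of the scale-out idea, the sketch below estimates how many commodity servers a given peak load would need and how many to add or remove. The function names and throughput figures are hypothetical, and the sketch assumes capacity grows linearly with node count, which real systems only approximate.

```python
def nodes_needed(peak_tps: int, per_node_tps: int) -> int:
    """Smallest number of equal-sized nodes that can absorb peak_tps.

    Assumes (optimistically) that throughput scales linearly with node count.
    """
    if per_node_tps <= 0:
        raise ValueError("per_node_tps must be positive")
    # Integer ceiling division; always keep at least one node running.
    return max(1, -(-peak_tps // per_node_tps))


def scale_plan(current_nodes: int, peak_tps: int, per_node_tps: int) -> int:
    """Nodes to add (positive) or remove (negative) to match the load."""
    return nodes_needed(peak_tps, per_node_tps) - current_nodes


# Hypothetical figures: each commodity server handles 2,000 transactions/sec.
print(scale_plan(current_nodes=3, peak_tps=9000, per_node_tps=2000))  # add nodes for a spike
print(scale_plan(current_nodes=5, peak_tps=3000, per_node_tps=2000))  # release nodes after it
```

Scaling vertically would instead mean replacing one server with a larger one, which is where the downtime and migration cost come from.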
Cloud databases also should be transactionally consistent. If a database has been sliced up or “partitioned” as part of its geo-distribution, it is more challenging for cloud applications to maintain transactional integrity.
A better way is to always maintain transactional consistency with a database architecture that separates transaction management from data storage and presents what is, in effect, a single logical database to applications, even when it is geographically distributed. Most relational databases are transactionally consistent, a property often referred to as “ACID” compliance, for “atomicity, consistency, isolation and durability.” These properties guarantee that transactions are processed reliably. So two questions to keep in mind with a cloud database are: Can it be geographically distributed, and is it ACID compliant?
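To make the "atomicity" part of ACID concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for any transactional database. It is a single-node example, not a geo-distributed one, and the accounts table and transfer function are invented for illustration. The point is only that a failed transaction leaves no partial changes behind.

```python
import sqlite3


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> None:
    """Move amount from src to dst as one all-or-nothing transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")  # triggers the rollback
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )


def balances(conn: sqlite3.Connection) -> dict:
    """Return the current balance of every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


# Hypothetical data for the illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

transfer(conn, "alice", "bob", 30)       # succeeds and is committed
try:
    transfer(conn, "alice", "bob", 500)  # fails; the debit is rolled back
except ValueError:
    pass
print(balances(conn))
```

A geo-distributed database that presents a single logical database would be expected to give applications this same all-or-nothing behavior, regardless of where the data physically lives.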
Another benefit of a database that looks like one logical database even when it is geographically distributed is that data residency can be established at a specific location. This property supports compliance with the data sovereignty laws emerging in some countries, which mandate that data about a citizen remain in-country, even if the person or other authorized users can access the data via the cloud from outside that country. When you start slicing and dicing a database architecture to support geo-distribution, data residency can be harder to achieve.
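One way to picture a data residency rule is as a mapping from a citizen's country to the location where that person's data must be stored. The sketch below is purely illustrative: the country codes, region names and function names are assumptions, and it does not reflect any particular law or product.

```python
# Hypothetical residency rules: citizen's country code -> region where data must live.
RESIDENCY_RULES = {"DE": "eu-frankfurt", "AU": "ap-sydney"}
DEFAULT_REGION = "us-east"  # used when no residency rule applies


def storage_region(country_code: str) -> str:
    """Region where a citizen's data should be stored."""
    return RESIDENCY_RULES.get(country_code.upper(), DEFAULT_REGION)


def violates_residency(country_code: str, stored_in: str) -> bool:
    """True if a rule exists for the country and the data is stored elsewhere."""
    required = RESIDENCY_RULES.get(country_code.upper())
    return required is not None and stored_in != required


print(storage_region("de"))                 # eu-frankfurt
print(violates_residency("DE", "us-east"))  # True
```

Note that the rule constrains where the data is stored, not where it can be read from, which matches the distinction drawn above between residency and access.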