blog

Architecture: Database Sharding

Database sharding gets reached for as a cure for scaling problems far more often than it should. Before you split anything up, it’s worth being clear on what sharding actually is and, more importantly, when it’s the right call.

What is database sharding?

Database sharding is like spreading out your data over many computers. You do this when one computer isn’t enough to handle all the work.

This is horizontal scaling: adding more machines instead of making one machine stronger (vertical scaling). Adding moving parts is tempting, but a bigger, faster single machine is usually the better choice as long as it can still solve the problem.

There’s also the question of how you lay the data out. Think of it like a closet: you put different types of clothes in different drawers. In a microservice setup, each part of the app might get its own drawer, or database.

Another way to do it is by spreading out the rows of a table across multiple database “drawers.” This introduces concepts like sharding keys, which we’ll talk about later.

Some newer databases, like Cassandra, handle all this organization behind the scenes, so you don’t have to worry about it in your code.

Try these alternatives first

Before you reach for sharding, it’s worth exhausting the simpler options:

Option 1: leave it alone

Sharding sounds appealing, but if your system isn’t actually struggling, the best move is often to do nothing. It solves a problem you might not have yet.

Option 2: vertical scaling

Adding RAM, CPU cores, or storage to the machine you already have avoids any architectural change at all. It’s the cheapest fix whenever it works.

Option 3: replication

flowchart LR    
		Master -- replication --> Replica1;
    Master -- replication --> Replica2;
    Replica1 <-- reads --> Client1;
    Replica2 <-- reads --> Client2;
    Client --> Master;
    subgraph Write Load
	    Client
	  end
    subgraph Main Database
        Master
    end
    subgraph Replicas
        Replica1
        Replica2
    end
    subgraph Read Load
        Client1
        Client2
    end

If you mostly need to read your data, replication can make it more available and speed up reading. It’s like making photocopies of a book so more people can read it at the same time. This can help avoid the complications of splitting up your data across multiple servers (sharding).

However, if you’re writing a lot of new data or changing existing data frequently, replication can make things more complicated. This is because every change needs to be copied to all the copies of the database, which can slow things down.

💡 A WAL (write-ahead log) is like a backup diary stored on your computer’s hard drive. It’s a place where all changes to the database are written down before they’re officially added to the database itself. This log is crucial because it helps recover lost data if there’s a crash or if something goes wrong during a transaction.

Option 4: specialized databases

If your database is running slow, it might be because it’s not set up well for the work it needs to do. For instance, using a traditional database for search data might not be the best idea. It could be more efficient to use specialized tools like Elasticsearch for that.

Similarly, storing large files in a regular database might not be efficient. Using something like Amazon S3 for that can be a big improvement. Sometimes, it’s better to use specialized services for specific tasks rather than trying to split up your whole database.

Sharding, if you must

When you genuinely need it, sharding raises your capacity, storage, and availability by breaking the database into smaller pieces. Each piece can then be scaled on its own, with its own replicas.

However, this comes with some trade-offs. It adds complexity to operations, requires more work for your applications, and can be costly to set up and maintain.

So, how does it actually work?

First, you need to answer some key questions:

  1. How do we split the data among different shards? We need to make sure the data is evenly distributed to avoid overloading some shards.
  2. What kind of queries do we run, and how do our tables interact with each other?
  3. How will our data grow over time, and how will we handle redistributing it later if needed?

Some terms you’ll run into:

  • Shard Key: This is a part of the primary key that tells us how to distribute data. By using a shard key, we can quickly find and modify data by sending operations to the right database.
  • Logical Shard: This is a group of data that shares the same shard key. Each logical shard is stored on one node, also called a physical shard.
  • Separating Logical Shards: It’s important to note that a logical shard can’t span multiple nodes; it’s like a single chunk of storage. If a shard gets too big for one node, it can cause problems for the whole cluster.

So sharding splits your database into smaller pieces, each with its own rules for storing and reaching data. Done well, that lets the system take on more users and more data without grinding to a halt.

Sharding approaches

Sharding databases can be done in different ways, depending on how you want to split up your data.

Key-based sharding

Key-based sharding uses a special function (a hash function) to decide where data should go in the database. This means that when you have a specific key for your data, the hash function tells you which part of the database to look in.

With key-based sharding, data distribution relies solely on this hash function. It doesn’t consider how much data there is or how much space it takes up. This hashing method helps evenly spread out the data, even if you don’t have a perfect way to divide it up.

However, there are some downsides to this approach. It can be tough to change how the data is divided up (resharding), and keeping everything consistent and available can be even trickier.

Range-based sharding

This approach requires a lookup table to determine where each chunk of data should be stored. It’s crucial to carefully choose the ranges for this division and maintain consistency in the lookup table.

When picking a shard key for range-based sharding, it’s important to choose one with high cardinality, meaning it has a large number of possible values. For example, a key with values like North, South, East, and West would have low cardinality because there are only four options. Ideally, you want a key with a wide range of values to ensure even distribution across shards.

If too much data falls within a narrow range of values, some shards may become overloaded (hotspots). You can test this by simulating different scenarios with actual data to see how evenly the data is distributed among the shards.

Relationship-based sharding

In this approach, related data is stored together on one physical shard. For instance, in a relational database, data for a user and all associated information might be kept on the same shard. In an application like Instagram, this could include posts and comments.

By grouping related data together, you can improve consistency within each shard and reduce the need for queries that span multiple shards. This means that operations involving related data can be faster and more reliable.

Cross-shard transactions and two-phase commit

Cross-shard transactions get awkward when a single transaction has to touch more than one shard. However carefully you plan the layout, the longer a service lives the more likely you are to hit them.

This essentially means you require transaction guarantees provided by an ACID-compliant database, but across shards. The challenge arises because the database can’t ensure such compliance when the data you’re working with spans multiple shards.

This scenario is often referred to as a global transaction, where multiple sub-transactions must coordinate and succeed. Generally, the longer a transaction is open, the more potential for contention and failure.

One common approach to manage cross-shard transactions is the two-phase commit method:

  1. Initiation: The leader writes a durable transaction record indicating a cross-shard transaction.
  2. Participation: Participants write a permanent record indicating their willingness to commit and notify the leader.
  3. Commitment: The leader commits the transaction by updating the durable transaction record after receiving responses from all participants. If no one responds, it can abort the transaction.
  4. Finalization: Participants update their state after the leader announces the commit decision. They remove the staged state if the leader aborts the transaction.

However, implementing the two-phase commit is challenging in practice. It can lead to write amplification, lock contention, and application instability due to excessive writes. Additionally, the database must filter every read to ensure it doesn’t access state dependent on a pending cross-shard transaction, affecting all reads in the system, even non-transactional ones.

When it’s worth it

Sharding is a solid answer for very large datasets with heavy read and write load. It also costs you a lot in complexity, so be honest with yourself before you start: do the gains really outweigh that cost, or would one of the simpler options have done the job?