Snapshot management

Cassandra backs up data by taking a snapshot of all on-disk data files (SSTable files) stored in the data directory. Snapshots are taken per keyspace and while the system is online. However, nodes must be taken offline in order to restore a snapshot.

Using a parallel ssh tool (such as pssh), you can snapshot an entire cluster. This provides an eventually consistent backup. Although no one node is guaranteed to be consistent with its replica nodes at the time a snapshot is taken, a restored snapshot can resume consistency using Cassandra’s built-in consistency mechanisms.

After a system-wide snapshot has been taken, you can enable incremental backups on each node (disabled by default) to back up data that has changed since the last snapshot was taken. Each time an SSTable is flushed, a hard link to it is created in a /backups subdirectory of the data directory.

Taking a Snapshot

Snapshots are taken per node using the nodetool snapshot command. If you want to take a global snapshot (capture all nodes in the cluster at the same time), run the nodetool snapshot command using a parallel ssh utility, such as pssh. A snapshot first flushes all in-memory writes to disk, then makes a hard link of the SSTable files for each keyspace. The snapshot files are stored in the snapshots directory of each keyspace, under the data directory (/var/lib/cassandra/data by default).
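
As a rough sketch of the global snapshot described above, the function below runs the same nodetool snapshot command on every host through pssh. It assumes pssh is installed and that its -h (hosts file) and -i (inline output) options are available; the hosts file name and the PSSH override variable are hypothetical, introduced only for illustration.

```shell
# Sketch: run the same nodetool snapshot command on every host in a file.
# PSSH can be overridden, for example PSSH=echo for a dry run (illustrative only).
snapshot_cluster() {
    hosts_file=$1       # one hostname per line (hypothetical file)
    snapshot_name=$2
    # On each remote host, "localhost" refers to that node itself.
    "${PSSH:-pssh}" -h "$hosts_file" -i \
        "nodetool -h localhost -p 7199 snapshot $snapshot_name"
}
```

For example, snapshot_cluster hosts.txt 12022011 would ask every listed node to take a snapshot named 12022011 at roughly the same time.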

You must have enough free disk space on the node to accommodate making snapshots of your data files. A single snapshot requires little disk space. However, snapshots cause your disk usage to grow more quickly over time, because a snapshot prevents obsolete data files from being deleted. After the snapshot is complete, you can move the backup files to another location if needed, or you can leave them in place.
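
The disk-space behavior follows from how hard links work, which can be illustrated outside Cassandra. The sketch below uses a scratch directory and a made-up file name (Users-1-Data.db) standing in for an SSTable; it is not a real Cassandra data layout.

```shell
# Illustration only: a scratch directory, not a real Cassandra data directory.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/snapshots/demo"

# Stand-in for an SSTable file (hypothetical name).
printf 'sstable contents\n' > "$demo_dir/Users-1-Data.db"

# A snapshot entry is a hard link: a second name for the same data blocks,
# so creating it uses almost no additional space.
ln "$demo_dir/Users-1-Data.db" "$demo_dir/snapshots/demo/Users-1-Data.db"

# Simulate the live file being removed once it is obsolete.
rm "$demo_dir/Users-1-Data.db"

# The data is still reachable through the snapshot link, so its space is not released.
cat "$demo_dir/snapshots/demo/Users-1-Data.db"
```

The space is only reclaimed once the last link, here the one under snapshots, is removed as well.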

To create a snapshot of a node

Run the nodetool snapshot command, specifying the hostname, JMX port and snapshot name. For example:

$ nodetool -h localhost -p 7199 snapshot 12022011

The snapshot is created in <data_directory_location>/<keyspace_name>/snapshots/<snapshot_name>. Each snapshot folder contains numerous .db files that contain the data at the time of the snapshot.
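
To check what a snapshot produced, one option is to list the files under every keyspace's snapshots directory for a given snapshot name. This is a minimal sketch that assumes the directory layout described above; the function name is hypothetical.

```shell
# Sketch: list the files of one named snapshot across all keyspaces.
list_snapshot_files() {
    data_dir=$1         # for example /var/lib/cassandra/data
    snapshot_name=$2
    find "$data_dir" -type f -path "*/snapshots/$snapshot_name/*" | sort
}
```

For example, list_snapshot_files /var/lib/cassandra/data 12022011 would print the .db files captured by the snapshot taken above.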

Clearing Snapshot Files

When taking a snapshot, previous snapshot files are not automatically deleted. To maintain the snapshot directories, old snapshots that are no longer needed should be removed.

The nodetool clearsnapshot command removes all existing snapshot files from the snapshot directory of each keyspace. You may want to make it part of your backup process to clear old snapshots before taking a new one.

If you want to clear snapshots on all nodes at once, run the nodetool clearsnapshot command using a parallel ssh utility, such as pssh.

To clear all snapshots for a node

Run the nodetool clearsnapshot command. For example:

$ nodetool -h localhost -p 7199 clearsnapshot
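
If clearing old snapshots is made part of the backup process, the two commands can be chained so that the new snapshot is only attempted when the clear succeeds. This is a sketch, not a recommended procedure; the function name and the NODETOOL override variable are hypothetical.

```shell
# Sketch: clear existing snapshots, then take a new one.
# NODETOOL can be overridden, for example NODETOOL=echo for a dry run.
rotate_snapshot() {
    snapshot_name=$1
    nodetool_cmd=${NODETOOL:-nodetool}
    "$nodetool_cmd" -h localhost -p 7199 clearsnapshot &&
        "$nodetool_cmd" -h localhost -p 7199 snapshot "$snapshot_name"
}
```

Because the old snapshots are removed first, it may be safer to copy any snapshot you still need to another location before running this.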

Enabling Incremental Backups

When incremental backups are enabled (disabled by default), Cassandra hard-links each flushed SSTable to a backups directory under the keyspace data directory. This allows you to store backups offsite without transferring entire snapshots. Also, incremental backups combine with snapshots to provide a dependable, up-to-date backup mechanism.

To enable incremental backups, edit the cassandra.yaml configuration file on each node in the cluster and change the value of incremental_backups to true.
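
The edit can also be scripted. The sketch below assumes the file already contains a line starting with incremental_backups: and rewrites only that line; the function name is hypothetical, and the path to cassandra.yaml depends on your installation.

```shell
# Sketch: set incremental_backups to true in a cassandra.yaml file.
enable_incremental_backups() {
    yaml_file=$1
    tmp_file=$(mktemp)
    # Rewrite only the incremental_backups line; all other lines pass through unchanged.
    sed 's/^incremental_backups:.*/incremental_backups: true/' "$yaml_file" > "$tmp_file" &&
        cat "$tmp_file" > "$yaml_file"   # cat keeps the original file's owner and mode
    rm -f "$tmp_file"
}
```

Since the configuration file is read when the node starts, assume that a running node needs a restart before the change takes effect.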

As with snapshots, Cassandra does not automatically clear incremental backup files. DataStax recommends setting up a process to clear incremental backup hard-links each time a new snapshot is created.
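
One possible shape for such a cleanup process is sketched below. It assumes the layout described above, with one backups directory per keyspace directly under the data directory; the function name is hypothetical.

```shell
# Sketch: remove incremental backup files from every keyspace's backups directory.
clear_incremental_backups() {
    data_dir=$1         # for example /var/lib/cassandra/data
    for backup_dir in "$data_dir"/*/backups; do
        # Skip the unexpanded pattern when no keyspace has a backups directory.
        [ -d "$backup_dir" ] || continue
        # Delete the hard links only; live SSTables and snapshots are not touched.
        find "$backup_dir" -type f -exec rm -f {} +
    done
}
```

Running this right after a new snapshot completes keeps the backups directories limited to files flushed since that snapshot.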

Restoring from a Snapshot

To restore a keyspace from a snapshot, you will need all of the snapshot files for the keyspace, and if using incremental backups, any incremental backup files created after the snapshot was taken.

If restoring a single node, you must first shut down the node. If restoring an entire cluster, you must shut down all nodes, restore the snapshot data, and then start all nodes again.

Note

Restoring from snapshots and incremental backups temporarily causes intensive CPU and I/O activity on the node being restored.

To restore a node from a snapshot and incremental backups:

1. Shut down the node to be restored.
2. Clear all files in the commit log directory (/var/lib/cassandra/commitlog by default).
3. Clear all *.db files in <data_directory_location>/<keyspace_name>, but DO NOT delete the /snapshots and /backups subdirectories.
4. Locate the most recent snapshot folder in <data_directory_location>/<keyspace_name>/snapshots/<snapshot_name>, and copy its contents into <data_directory_location>/<keyspace_name>.
5. If using incremental backups as well, copy all contents of <data_directory_location>/<keyspace_name>/backups into <data_directory_location>/<keyspace_name>.
6. Restart the node, keeping in mind that the restore temporarily causes intensive CPU and I/O activity on the node.
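
The file-handling part of this procedure can be sketched as a shell function. It is an illustration under stated assumptions, not a tested restore tool: the function name and argument order are hypothetical, it handles one keyspace at a time, and shutting down and restarting the node are left to the operator because those commands depend on how Cassandra was installed.

```shell
# Sketch: restore one keyspace from a snapshot plus any incremental backups.
# Run only while the node is shut down.
restore_keyspace() {
    commitlog_dir=$1    # for example /var/lib/cassandra/commitlog
    keyspace_dir=$2     # <data_directory_location>/<keyspace_name>
    snapshot_name=$3
    snapshot_dir=$keyspace_dir/snapshots/$snapshot_name

    # Refuse to delete anything unless the snapshot actually exists.
    [ -d "$snapshot_dir" ] || { echo "no such snapshot: $snapshot_dir" >&2; return 1; }

    # Clear the commit log files.
    find "$commitlog_dir" -type f -exec rm -f {} +

    # Remove the live *.db files; the snapshots and backups subdirectories are untouched.
    rm -f "$keyspace_dir"/*.db

    # Copy the snapshot contents into the keyspace directory.
    cp -p "$snapshot_dir"/* "$keyspace_dir"/ || return 1

    # Copy incremental backups too, if the directory exists and is not empty.
    if [ -d "$keyspace_dir/backups" ] && [ -n "$(ls -A "$keyspace_dir/backups")" ]; then
        cp -p "$keyspace_dir/backups"/* "$keyspace_dir"/ || return 1
    fi
}
```

The existence check comes first on purpose: without it, a mistyped snapshot name would clear the commit log and data files and leave nothing to copy back.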