Backup & Restore Methods for Syndeia Cloud Keyspace in Cassandra

1. Overview

Cassandra offers several ways to back up and restore data. The right method depends on the scenario and on how much data you need to back up or restore.

This document provides instructions for backing up and restoring the Cassandra database schema and data for Syndeia Cloud using cqlsh's DESCRIBE and COPY commands. The process covers the schema and data of multiple keyspaces and tables. Backup and restore scripts automate these steps, making the process simpler and more portable, and they can be run in either interactive or automated mode.
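
As a rough illustration of the two cqlsh commands this method relies on, the sketch below builds a DESCRIBE statement for a keyspace schema and a COPY ... TO statement for a table export. It is not the content of the shipped scripts: the helper names, the CSV file naming, the HEADER option, and the keyspace and table names (demo_ks, demo_table) are assumptions chosen for the example.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; the shipped backup script may use different
# statements, options, and file names.
BACKUP_DIR="${BACKUP_DIR:-SC_backups}"   # same directory name the backup script uses

# schema_stmt KEYSPACE: print the CQL that dumps a keyspace's schema.
schema_stmt() {
  printf 'DESCRIBE KEYSPACE %s;\n' "$1"
}

# export_stmt KEYSPACE TABLE: print the CQL that exports one table to CSV.
# The "<keyspace>.<table>.csv" file name is a hypothetical convention.
export_stmt() {
  printf "COPY %s.%s TO '%s/%s.%s.csv' WITH HEADER = TRUE;\n" \
    "$1" "$2" "$BACKUP_DIR" "$1" "$2"
}

# Example use against a reachable node (commented out: needs Cassandra):
#   cqlsh -u syndeia_admin -p "$PW" localhost -e "$(schema_stmt demo_ks)" > "$BACKUP_DIR/demo_ks.cql"
#   cqlsh -u syndeia_admin -p "$PW" localhost -e "$(export_stmt demo_ks demo_table)"
```

Printing the statements first makes it easy to review exactly what would run before pointing cqlsh at a live cluster.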

Please note, this document does not cover the backup and restore of the application software or configurations (i.e., dependencies installed or symlinked into /opt). For that, use conventional tools like tar, rsync, or any other backup solution you're comfortable with.

In addition to backing up and restoring the data, this document also explains how to zip the backup directory for archiving or transferring and how to unzip it before restoring.

Types of Data:

  1. Schema: Describes the structure of the database (keyspaces, tables, and columns).

  2. Data: The actual application data stored in the tables.

  3. Metadata: (Not covered in this method) Data related to node token assignments or other Cassandra system information.

For more advanced backup methods, such as using sstableloader with nodetool snapshots, or third-party tools, please refer to the following references:

  1. ICX knowledge base article "How to Backup & Restore Syndeia Cloud"

  2. https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/migrating.html

  3. https://web.archive.org/web/20161227010514/http://datascale.io/cloning-cassandra-clusters-fast-way/

 


2. Prerequisites

This process requires two shell scripts: one for backup and one for restore. Both can be downloaded from the password-protected link provided in the Intercax Helpdesk ticket where you originally requested your Syndeia Cloud license. The required files and their location within the Syndeia Cloud 3.6 SP2 release package are listed below:

  • Location of Backup and Restore Scripts:
    Syndeia_Cloud_3.6_SP2/Syndeia_Cloud_Utilities/Backup_and_Restore

    • Backup Script: syndeia_cloud-3.6_backup.bash

    • Restore Script: syndeia_cloud-3.6_restore.bash


3. Backup

Backup Steps:

  1. Download syndeia_cloud-3.6_backup.bash from the password-protected link provided in the Intercax Helpdesk ticket where you originally requested your Syndeia Cloud license. It is located in the folder Syndeia_Cloud_3.6_SP2/Syndeia_Cloud_Utilities/Backup_and_Restore.

  2. Make the script executable by running:

    chmod ug+x syndeia_cloud-3.6_backup.bash

  3. To run the backup process, use the following command:

    ./syndeia_cloud-3.6_backup.bash [--automated|-a] [--syndeia_admin_pw|-sa_pw] [cassandra_jg_host]

    • If --automated is provided, the script runs non-interactively using environment variables, if available.

    • If --syndeia_admin_pw is provided, it specifies the Cassandra syndeia_admin password (default is myPw).

    • cassandra_jg_host is optional and defaults to localhost.

  4. The backup files will be saved into a directory called SC_backups.

Zipping the Backup Directory:

After the backup is complete, you can zip the backup directory for easy storage or transfer:

zip -r SC_backups.zip SC_backups

You can then move, upload, or store this zip file as needed.
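
Before moving the archive elsewhere, it can be worth confirming that it is readable and recording a checksum to compare after the transfer. The following is a minimal sketch, assuming zip, unzip, and sha256sum are installed (on macOS, shasum -a 256 is the usual substitute). The function name archive_backup is hypothetical and is not part of the shipped scripts.

```shell
#!/usr/bin/env bash
# archive_backup DIR: zip DIR into DIR.zip, test the archive, and write a
# SHA-256 checksum file next to it. Returns non-zero if any step fails.
archive_backup() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing directory: $dir" >&2
    return 1
  fi
  zip -qr "$dir.zip" "$dir" &&             # create the archive quietly
    unzip -tq "$dir.zip" >/dev/null &&     # test that every entry is readable
    sha256sum "$dir.zip" > "$dir.zip.sha256"
}

# Example: archive_backup SC_backups
# After copying both files, verify on the destination with:
#   sha256sum -c SC_backups.zip.sha256
```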


4. Restore

Restore Steps:

  1. Download syndeia_cloud-3.6_restore.bash from the password-protected link provided in the Intercax Helpdesk ticket where you originally requested your Syndeia Cloud license. It is located in the folder Syndeia_Cloud_3.6_SP2/Syndeia_Cloud_Utilities/Backup_and_Restore.

  2. Make the script executable by running:

    chmod ug+x syndeia_cloud-3.6_restore.bash

  3. Ensure the SC_backups folder is in the same directory as the restore script. The script expects to find the backup files in this folder. If the folder or the backup files are missing, the restore process will fail.

  4. If you have a zipped backup, unzip it using:

    unzip SC_backups.zip

  5. To run the restore process, use the following command:

    ./syndeia_cloud-3.6_restore.bash [--automated|-a] [--syndeia_admin_pw|-sa_pw] [cassandra_jg_host]

    • If --automated is provided, the script runs non-interactively and will unconditionally drop the existing Syndeia Cloud keyspaces.

    • If --syndeia_admin_pw is provided, it specifies the Cassandra syndeia_admin password (default is myPw).

    • cassandra_jg_host is optional and defaults to localhost.

  6. The restore process will recreate the schema and import the data from the backup files.
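
Because the restore fails when SC_backups is missing or empty, and automated mode drops the existing keyspaces unconditionally, a quick pre-flight check can catch a misplaced backup before anything is run. This is a sketch under the assumption that any regular file inside the folder counts as a backup file. The function name preflight_restore is hypothetical, and the check does not validate the files' contents.

```shell
#!/usr/bin/env bash
# preflight_restore [DIR]: succeed only if DIR (default: SC_backups) exists
# and contains at least one regular file.
preflight_restore() {
  local dir="${1:-SC_backups}"
  if [ ! -d "$dir" ]; then
    echo "Backup directory '$dir' not found; unzip SC_backups.zip next to the restore script." >&2
    return 1
  fi
  if [ -z "$(find "$dir" -type f | head -n 1)" ]; then
    echo "Backup directory '$dir' contains no files." >&2
    return 1
  fi
  echo "Found backup files in '$dir'."
}

# Example: only start the restore when the check passes.
#   preflight_restore SC_backups && ./syndeia_cloud-3.6_restore.bash
```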


5. Troubleshooting

5.1 Import Errors

Q1: Timeout error during import

Error:

Failed to import 20 rows: OperationTimedOut - errors={<Host: x.x.x.x dc1>: ConnectionException('Host has been marked down or removed',)}, last_host=y.y.y.y, will retry later, attempt 1 of 5

Solution: Increase the timeout in your ~/.cassandra/cqlshrc configuration file (create it if it does not exist):

[connection]
request_timeout=6000
client_timeout=3600
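
If the file does not exist yet, the settings can be written from the shell. The sketch below uses the same values as above and deliberately refuses to touch a file that already has a [connection] section, since merging into an existing section is safer done by hand. The function name write_cqlshrc_timeouts is hypothetical.

```shell
#!/usr/bin/env bash
# write_cqlshrc_timeouts [FILE]: add a [connection] section with the timeout
# settings to FILE (default: ~/.cassandra/cqlshrc), creating it if needed.
# Returns 1 without changing anything if a [connection] section already exists.
write_cqlshrc_timeouts() {
  local rc="${1:-$HOME/.cassandra/cqlshrc}"
  mkdir -p "$(dirname "$rc")"
  if [ -f "$rc" ] && grep -q '^\[connection\]' "$rc"; then
    echo "$rc already has a [connection] section; edit it by hand." >&2
    return 1
  fi
  # Separate from any existing content with a blank line.
  if [ -s "$rc" ]; then
    printf '\n' >> "$rc"
  fi
  printf '[connection]\nrequest_timeout=6000\nclient_timeout=3600\n' >> "$rc"
}

# Example: write_cqlshrc_timeouts
```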

Q2: "Pickling" error during import

Error:

PicklingError: Can't pickle <class 'cqlshlib.copyutil.ImmutableDict'>

Solution: Add the following parameters to the COPY command during import:

WITH MINBATCHSIZE=1 AND MAXBATCHSIZE=1 AND PAGESIZE=10
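
For reference, a complete import statement with these parameters might look like the output of the sketch below. The keyspace, table, and CSV path are placeholders, and the helper name copy_from_stmt is hypothetical. Any other options (for example HEADER) would need to match how the file was exported, which this document does not specify.

```shell
#!/usr/bin/env bash
# copy_from_stmt KEYSPACE TABLE CSV: print a COPY ... FROM statement that
# carries the small-batch parameters from the solution above.
copy_from_stmt() {
  printf "COPY %s.%s FROM '%s' WITH MINBATCHSIZE=1 AND MAXBATCHSIZE=1 AND PAGESIZE=10;\n" \
    "$1" "$2" "$3"
}

# Example against a reachable node (commented out: needs Cassandra):
#   cqlsh -u syndeia_admin -p "$PW" localhost \
#     -e "$(copy_from_stmt demo_ks demo_table SC_backups/demo_ks.demo_table.csv)"
```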