Manual Set Up of a Spark Cluster

This page describes how to set up a Spark cluster on your own infrastructure and use it to perform a migration.

Follow the official documentation to install Spark on each node of your cluster, and start the Spark master and the Spark workers.
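For a standalone cluster, the launch scripts shipped with Spark can start the master and the workers. The following is a minimal sketch, not a definitive procedure. It assumes Spark 3.1 or later (older releases name the worker script start-slave.sh), that SPARK_HOME points at your Spark installation (the /opt/spark default is hypothetical), and that spark-master is a placeholder hostname. The run helper and the DRY_RUN variable are introduced here for illustration only: by default the commands are printed, and they are executed only when DRY_RUN=0.

```shell
#!/bin/sh
# Assumptions: SPARK_HOME is the Spark install directory (hypothetical default),
# MASTER_HOST is a placeholder for the hostname of your master node.
SPARK_HOME="${SPARK_HOME:-/opt/spark}"
MASTER_HOST="${MASTER_HOST:-spark-master}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# Run this on the master node.
run "$SPARK_HOME/sbin/start-master.sh"

# Run this on each worker node; 7077 is the master's default port.
run "$SPARK_HOME/sbin/start-worker.sh" "spark://$MASTER_HOST:7077"
```

The two commands are shown together for brevity; in practice each one runs on the node named in its comment.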

On the Spark master node, download the latest release of the Migrator.

wget https://github.com/scylladb/scylla-migrator/releases/latest/download/scylla-migrator-assembly.jar

Alternatively, download a specific release of scylla-migrator-assembly.jar.

On the Spark master node, download the file config.yaml.example from our Git repository and save it as config.yaml.

wget https://github.com/scylladb/scylla-migrator/raw/master/config.yaml.example \
  --output-document=config.yaml
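Before going further, it can help to confirm that both downloads succeeded, since a failed wget may leave a missing or empty file. The following is a minimal sketch, assuming both commands above were run in the current directory; check_artifact is a hypothetical helper name, not part of the Migrator.

```shell
#!/bin/sh
# Return 0 if the file exists and is non-empty, 1 otherwise.
check_artifact() {
  if [ -s "$1" ]; then
    echo "ok: $1"
  else
    echo "missing or empty: $1" >&2
    return 1
  fi
}

# File names match the wget commands above.
check_artifact scylla-migrator-assembly.jar || echo "re-download the jar"
check_artifact config.yaml || echo "re-download config.yaml"
```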

Edit config.yaml to configure the migration according to your needs.

Finally, run the migration from the Spark master node as follows.

spark-submit --class com.scylladb.migrator.Migrator \
  --master spark://<spark-master-hostname>:7077 \
  --conf spark.scylla.config=<path to config.yaml> \
  <... other arguments> \
  <path to scylla-migrator-assembly.jar>

See the Run the Migration page for a complete description of the arguments that spark-submit expects, and replace “<spark-master-hostname>”, “<path to config.yaml>”, “<... other arguments>”, and “<path to scylla-migrator-assembly.jar>” above with appropriate values.
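To avoid editing the placeholders by hand on every run, the command above can be wrapped in a small script. This is a sketch under stated assumptions: the variable names, their default values, the submit_migration function, and the DRY_RUN switch are all hypothetical and introduced only for illustration. By default the assembled command is printed so it can be reviewed; it is executed only when DRY_RUN=0.

```shell
#!/bin/sh
# All values below are placeholders (assumptions); substitute your own.
SPARK_MASTER_HOST="${SPARK_MASTER_HOST:-spark-master}"
CONFIG_PATH="${CONFIG_PATH:-$PWD/config.yaml}"
JAR_PATH="${JAR_PATH:-$PWD/scylla-migrator-assembly.jar}"
DRY_RUN="${DRY_RUN:-1}"

submit_migration() {
  # Arguments passed to this function stand in for "<... other arguments>".
  set -- spark-submit --class com.scylladb.migrator.Migrator \
    --master "spark://$SPARK_MASTER_HOST:7077" \
    --conf "spark.scylla.config=$CONFIG_PATH" \
    "$@" \
    "$JAR_PATH"
  # Print the assembled command; execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

submit_migration
```

Extra spark-submit options can be passed through the function, for example `submit_migration --executor-memory 4G`.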

You can monitor progress from the Spark web UI.
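As a rough guide to where the web UI is served, the sketch below prints the addresses to open in a browser. It assumes Spark's default ports, 8080 for the standalone master UI and 4040 for the UI of the running application's driver, and that the job was submitted from the master node so the driver runs there. MASTER_HOST and ui_url are hypothetical names; adjust the ports if your cluster overrides the defaults.

```shell
#!/bin/sh
# Assumption: default Spark ports; MASTER_HOST is a placeholder hostname.
MASTER_HOST="${MASTER_HOST:-spark-master}"

# Build a web UI address from a host ($1) and a port ($2).
ui_url() {
  echo "http://$1:$2/"
}

echo "Master UI:      $(ui_url "$MASTER_HOST" 8080)"
echo "Application UI: $(ui_url "$MASTER_HOST" 4040)"
```

If port 4040 is already in use when the application starts, Spark binds the application UI to the next free port (4041, 4042, and so on).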
