This page describes how to set up a Spark cluster on your own infrastructure and use it to perform a migration.
Follow the official documentation to install Spark on each node of your cluster, and start the Spark master and the Spark workers.
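For a standalone cluster, starting the daemons might look like the sketch below. It assumes a recent Spark release in which the worker launcher is named start-worker.sh, that SPARK_HOME points at the Spark installation directory, and that the master listens on the default port 7077. The function names and the hostname in the usage comment are hypothetical; check the official Spark documentation for your version.

```shell
# Sketch: helpers to start a Spark standalone master and worker.
# Assumes SPARK_HOME is set to the Spark installation directory.

start_spark_master() {
  # Run on the master node; the master listens on port 7077 by default.
  "${SPARK_HOME:?SPARK_HOME must be set}/sbin/start-master.sh"
}

start_spark_worker() {
  # Run on each worker node; $1 is the hostname of the Spark master.
  master_host="${1:?usage: start_spark_worker <spark-master-hostname>}"
  "${SPARK_HOME:?SPARK_HOME must be set}/sbin/start-worker.sh" "spark://${master_host}:7077"
}

# Usage (the hostname is a placeholder):
#   start_spark_master
#   start_spark_worker spark-master.example.internal
```

Defining the helpers does not start anything by itself; each node calls the function that matches its role.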
On the Spark master node, download the latest release of the Migrator.
wget https://github.com/scylladb/scylla-migrator/releases/latest/download/scylla-migrator-assembly.jar
Alternatively, download a specific release of scylla-migrator-assembly.jar.
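As a rough sketch, the URL of a specific release can be derived from its tag using GitHub's usual layout for release assets. The helper name migrator_jar_url is hypothetical, and the sketch assumes the release you pick publishes an asset named scylla-migrator-assembly.jar; copy the actual tag from the project's releases page.

```shell
# Sketch: build the download URL for a specific Migrator release.
# The URL layout follows GitHub's convention for release assets.

migrator_jar_url() {
  # $1 is a release tag copied from the project's releases page.
  release_tag="${1:?usage: migrator_jar_url <release-tag>}"
  echo "https://github.com/scylladb/scylla-migrator/releases/download/${release_tag}/scylla-migrator-assembly.jar"
}

# Usage (replace <release-tag> with a real tag):
#   wget "$(migrator_jar_url '<release-tag>')"
```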
On the Spark master node, copy the file config.yaml.example from our Git repository.
wget https://github.com/scylladb/scylla-migrator/raw/master/config.yaml.example \
--output-document=config.yaml
Configure the migration according to your needs.
Finally, run the migration from the Spark master node as follows.
spark-submit --class com.scylladb.migrator.Migrator \
--master spark://<spark-master-hostname>:7077 \
--conf spark.scylla.config=<path to config.yaml> \
<... other arguments> \
<path to scylla-migrator-assembly.jar>
For a complete description of the arguments expected by spark-submit, see the page Run the Migration. Replace “<spark-master-hostname>”, “<path to config.yaml>”, “<… other arguments>”, and “<path to scylla-migrator-assembly.jar>” above with appropriate values.
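To make the substitution concrete, the command above could be wrapped in a small shell function that takes the three site-specific values as parameters. This is only a sketch: the function name run_migration and the values in the usage comment are hypothetical, and the other spark-submit arguments are deliberately left out because they depend on your setup.

```shell
# Sketch: wrap the spark-submit invocation in a function.

run_migration() {
  master_host="${1:?usage: run_migration <spark-master-hostname> <config.yaml> <assembly.jar>}"
  config_path="${2:?missing path to config.yaml}"
  jar_path="${3:?missing path to scylla-migrator-assembly.jar}"

  # Other spark-submit arguments are intentionally omitted here.
  spark-submit --class com.scylladb.migrator.Migrator \
    --master "spark://${master_host}:7077" \
    --conf "spark.scylla.config=${config_path}" \
    "${jar_path}"
}

# Usage (all three values are placeholders):
#   run_migration spark-master.example.internal "$PWD/config.yaml" "$PWD/scylla-migrator-assembly.jar"
```

Quoting each expansion keeps paths that contain spaces intact when they are passed to spark-submit.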
You can monitor progress from the Spark web UI.