Caution
You're viewing documentation for a previous version of ScyllaDB Migrator. Switch to the latest stable version.
An Ansible playbook is provided in the ansible folder of our Git repository. The playbook installs the prerequisites, including Spark, on the master and workers listed in the ansible/inventory/hosts file. The ScyllaDB Migrator is installed on the Spark master node.
The Ansible playbook expects to be run in an Ubuntu environment, by a user named ubuntu (like you get in AWS EC2 Ubuntu-based images).
Clone the Migrator Git repository:
git clone https://github.com/scylladb/scylla-migrator.git
cd scylla-migrator/ansible
Update the ansible/inventory/hosts file with the master and worker instances.
Update ansible/ansible.cfg with the location of the private key, if necessary.
The ansible/template/spark-env-master-sample and ansible/template/spark-env-worker-sample files contain environment variables that determine the number of workers, CPUs per worker, and memory allocations, as well as considerations for setting them.
Run ansible-playbook scylla-migrator.yml
On the Spark master node:
cd scylla-migrator
./start-spark.sh
On the Spark worker nodes:
./start-slave.sh
Open the Spark web console:
Ensure networking is configured to allow you to access the Spark master node via TCP ports 8080 and 4040.
Visit http://<spark-master-hostname>:8080
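Before opening the console in a browser, it can help to confirm that the ports are reachable from your workstation. The following is a minimal sketch, not part of the Migrator itself: the function name check_port and the example hostname are hypothetical, and it assumes bash (for its /dev/tcp pseudo-device) and the coreutils timeout command are available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a TCP port on a host accepts connections.
# Assumes bash's /dev/tcp pseudo-device and coreutils `timeout` are available.
check_port() {
  local host="$1" port="$2"
  # Try to open a TCP connection; give up after 3 seconds.
  if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "${host}:${port} reachable"
  else
    echo "${host}:${port} NOT reachable"
    return 1
  fi
}

# Usage (the hostname is a placeholder for your Spark master):
#   for port in 8080 4040; do check_port spark-master.example.com "$port"; done
```

Port 4040 is expected to be unreachable until a Spark job is running, so a failure there before you submit a migration does not by itself indicate a firewall problem.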
Review and modify config.yaml based on whether you’re performing a migration to CQL or Alternator:
If you’re migrating to the ScyllaDB CQL interface (from Apache Cassandra, ScyllaDB, or another CQL source), make a copy of config.yaml.example, review the comments in it, and edit as directed.
If you’re migrating to Alternator (from DynamoDB or another ScyllaDB Alternator), make a copy of config.dynamodb.yml, review the comments in it, and edit as directed.
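The copy step above can be sketched as a small shell helper. This is an illustration under stated assumptions: the function name prepare_config and the destination file name are hypothetical, and it assumes you run it from the directory that contains the sample files named in this step.

```shell
# Hypothetical helper: copy the sample configuration that matches the target.
# Usage: prepare_config <cql|alternator> <destination-file>
# Assumes the sample files are in the current directory.
prepare_config() {
  local target="$1" dest="$2" template
  case "$target" in
    cql)        template="config.yaml.example" ;;
    alternator) template="config.dynamodb.yml" ;;
    *) echo "unknown target: $target (expected cql or alternator)" >&2; return 2 ;;
  esac
  # Never clobber a configuration that has already been edited.
  if [ -e "$dest" ]; then
    echo "refusing to overwrite existing $dest" >&2
    return 1
  fi
  cp "$template" "$dest" && echo "copied $template to $dest"
}

# Usage (destination name is a placeholder):
#   prepare_config cql my-migration.yaml
```

Working on a copy keeps the commented sample intact for reference while you edit.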
As part of the Ansible deployment, sample submit jobs were created. You may edit and use these submit jobs.
For CQL migration: edit scylla-migrator/submit-cql-job.sh and change the line --conf spark.scylla.config=config.yaml \ to point to whatever you named the config.yaml file in the previous step.
For Alternator migration: edit scylla-migrator/submit-alternator-job.sh and change the line --conf spark.scylla.config=/home/ubuntu/scylla-migrator/config.dynamodb.yml \ to reference the configuration file you created and modified in the previous step.
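If you prefer not to edit the submit script by hand, the change can be scripted. The sketch below is one possible approach rather than a tool shipped with the Migrator: the function name set_migrator_config is hypothetical, it assumes GNU sed (as found on Ubuntu) for in-place editing, and it assumes the configuration path contains no | or & characters.

```shell
# Hypothetical helper: point the "--conf spark.scylla.config=..." line of a
# submit script at a different configuration file.
# Usage: set_migrator_config <submit-script> <config-path>
# Assumes GNU sed and a config path without '|' or '&' characters.
set_migrator_config() {
  local script="$1" config="$2"
  if ! grep -q -- '--conf spark.scylla.config=' "$script"; then
    echo "no spark.scylla.config line found in $script" >&2
    return 1
  fi
  # Replace only the value after '=', keeping the trailing " \" continuation.
  sed -i "s|\(--conf spark\.scylla\.config=\)[^[:space:]]*|\1${config}|" "$script"
}

# Usage (the config path is a placeholder):
#   set_migrator_config scylla-migrator/submit-cql-job.sh /home/ubuntu/scylla-migrator/my-migration.yaml
```

Afterwards, running grep -- 'spark.scylla.config' on the script is a quick way to confirm the line now references the intended file.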
Ensure the table has been created in the target environment.
Submit the migration by running the appropriate job:
CQL migration: ./submit-cql-job.sh
Alternator migration: ./submit-alternator-job.sh
You can monitor progress by observing the Spark web console you opened earlier. Additionally, after the job has started, you can track progress via http://<spark-master-hostname>:4040.
Note: When no Spark job is actively running, the Spark progress page on port 4040 is unavailable. It renders, and is useful, only while a Spark job is in progress.