Caution
You're viewing documentation for a previous version of ScyllaDB Migrator. Switch to the latest stable version.
This page describes how to use the Migrator in Amazon EMR. This approach is useful if you already have an AWS account, or if you do not want to manage your infrastructure manually.
Download the config.yaml.example file from our Git repository.
wget https://github.com/scylladb/scylla-migrator/raw/master/config.yaml.example \
--output-document=config.yaml
Configure the migration according to your needs.
Download the latest release of the Migrator.
wget https://github.com/scylladb/scylla-migrator/releases/latest/download/scylla-migrator-assembly.jar
Alternatively, download a specific release of scylla-migrator-assembly.jar.
Upload both files to an S3 bucket.
aws s3 cp config.yaml s3://<your-bucket>/scylla-migrator/config.yaml
aws s3 cp scylla-migrator-assembly.jar s3://<your-bucket>/scylla-migrator/scylla-migrator-assembly.jar
Replace <your-bucket> with the name of an S3 bucket that you manage.
Each time you change the migration configuration, re-upload it to the bucket.
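Because the configuration has to be re-uploaded after every change, it may be convenient to wrap the two upload commands above in a small script. The following is a minimal sketch and not part of the Migrator: the function names s3_uri and upload_migrator_files and the BUCKET and DRY_RUN variables are assumptions introduced for illustration. With DRY_RUN=1 (the default here), the script only prints the commands it would run.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical settings; replace <your-bucket> with a bucket you manage.
BUCKET="${BUCKET:-<your-bucket>}"
DRY_RUN="${DRY_RUN:-1}"

# Build the S3 URI of a file under the scylla-migrator/ prefix.
s3_uri() {
  printf 's3://%s/scylla-migrator/%s\n' "$BUCKET" "$1"
}

# Upload the given local files, or only print the commands when DRY_RUN=1.
upload_migrator_files() {
  local file
  for file in "$@"; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "aws s3 cp $file $(s3_uri "$file")"
    else
      aws s3 cp "$file" "$(s3_uri "$file")"
    fi
  done
}

upload_migrator_files config.yaml scylla-migrator-assembly.jar
```

Running the script with DRY_RUN=0 and BUCKET set to your bucket name performs the actual uploads.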
Create a script named copy-files.sh that downloads the files config.yaml and scylla-migrator-assembly.jar from your S3 bucket.
#!/bin/bash
# Copy the migration configuration and the Migrator JAR from S3 to /mnt1 on the node.
aws s3 cp s3://<your-bucket>/scylla-migrator/config.yaml /mnt1/config.yaml
aws s3 cp s3://<your-bucket>/scylla-migrator/scylla-migrator-assembly.jar /mnt1/scylla-migrator-assembly.jar
Upload the script to your S3 bucket as well.
aws s3 cp copy-files.sh s3://<your-bucket>/scylla-migrator/copy-files.sh
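If you prefer not to edit the bucket name by hand, copy-files.sh can also be generated. The sketch below is only an illustration: the helper write_copy_files_script and the BUCKET variable are hypothetical names. It writes the same two commands as the script above and uses bash -n to check the result for syntax errors without executing it.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: write copy-files.sh for the given bucket.
# $1 is the bucket name, $2 is the path of the script to create.
write_copy_files_script() {
  local bucket="$1" out="$2"
  cat > "$out" <<EOF
#!/bin/bash
aws s3 cp s3://${bucket}/scylla-migrator/config.yaml /mnt1/config.yaml
aws s3 cp s3://${bucket}/scylla-migrator/scylla-migrator-assembly.jar /mnt1/scylla-migrator-assembly.jar
EOF
  # Parse the generated script without running it.
  bash -n "$out"
}

# BUCKET is an assumed environment variable holding your bucket name.
write_copy_files_script "${BUCKET:-<your-bucket>}" copy-files.sh
```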
Log in to the AWS EMR console.
Choose “Create cluster” to create a new cluster based on EC2.
Configure the cluster as follows:
Choose the EMR release emr-7.1.0, or any EMR release that is compatible with the Spark version used by the Migrator.
Make sure to include Spark in the application bundle.
Choose all-purpose EC2 instance types (e.g., i4i).
Make sure to include at least one task node.
Add a Step to run the Migrator:
Type: Custom JAR
JAR location: command-runner.jar
Arguments:
spark-submit --deploy-mode cluster --class com.scylladb.migrator.Migrator --conf spark.scylla.config=/mnt1/config.yaml <... other arguments> /mnt1/scylla-migrator-assembly.jar
See the Run the Migration page for a complete description of the arguments expected by spark-submit, and replace <... other arguments> above with the appropriate arguments.
Add a Bootstrap action to download the Migrator and the migration configuration:
Script location: s3://<your-bucket>/scylla-migrator/copy-files.sh
Finalize your cluster configuration according to your needs, then choose “Create cluster”.
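The console steps above can also be approximated with the AWS CLI. The following is a rough sketch under several assumptions, not the documented procedure: the cluster and step names, the instance type i4i.xlarge, the instance counts, the use of --use-default-roles (which requires the default EMR roles to exist in your account), and the BUCKET, INSTANCE_TYPE, and DRY_RUN variables are all chosen for illustration. Your account may need further options, such as network settings. With DRY_RUN=1 (the default here), the script only prints the arguments it would pass to aws emr create-cluster.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical settings for illustration; adjust them to your environment.
BUCKET="${BUCKET:-<your-bucket>}"
INSTANCE_TYPE="${INSTANCE_TYPE:-i4i.xlarge}"
DRY_RUN="${DRY_RUN:-1}"

# Fill the global array ARGS with the arguments of `aws emr create-cluster`.
# $1 is the bucket name, $2 is the EC2 instance type.
build_create_cluster_args() {
  local bucket="$1" instance_type="$2"
  local prefix="s3://${bucket}/scylla-migrator"

  # Step arguments; add the other spark-submit arguments that your
  # migration needs before the path of the JAR.
  local step_args="spark-submit,--deploy-mode,cluster"
  step_args+=",--class,com.scylladb.migrator.Migrator"
  step_args+=",--conf,spark.scylla.config=/mnt1/config.yaml"
  step_args+=",/mnt1/scylla-migrator-assembly.jar"

  ARGS=(
    --name scylla-migrator
    --release-label emr-7.1.0
    --applications Name=Spark
    --use-default-roles
    --instance-groups
    "InstanceGroupType=MASTER,InstanceCount=1,InstanceType=${instance_type}"
    "InstanceGroupType=CORE,InstanceCount=1,InstanceType=${instance_type}"
    "InstanceGroupType=TASK,InstanceCount=1,InstanceType=${instance_type}"
    --bootstrap-actions "Path=${prefix}/copy-files.sh,Name=copy-files"
    --steps "Type=CUSTOM_JAR,Name=scylla-migrator,ActionOnFailure=CONTINUE,Jar=command-runner.jar,Args=[${step_args}]"
  )
}

build_create_cluster_args "$BUCKET" "$INSTANCE_TYPE"

if [ "$DRY_RUN" = "1" ]; then
  # Print the command, one argument per line, without creating anything.
  echo "aws emr create-cluster"
  printf '  %s\n' "${ARGS[@]}"
else
  aws emr create-cluster "${ARGS[@]}"
fi
```

The TASK instance group reflects the requirement above to include at least one task node.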
The migration starts automatically once the cluster is fully up.
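To follow the migration from the command line instead of the console, the state of the step can be polled with aws emr list-steps. This is a minimal sketch, assuming that the cluster runs a single step and that its identifier is available in a hypothetical CLUSTER_ID variable; the helper names step_is_final and first_step_state are illustrative. When CLUSTER_ID is not set, the script does nothing.

```shell
#!/bin/bash
set -euo pipefail

# Return success when an EMR step state is final.
step_is_final() {
  case "$1" in
    COMPLETED|CANCELLED|FAILED|INTERRUPTED) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the state of the first step listed for the cluster given as $1.
first_step_state() {
  aws emr list-steps --cluster-id "$1" \
    --query 'Steps[0].Status.State' --output text
}

# Poll every 30 seconds until the step reaches a final state.
if [ -n "${CLUSTER_ID:-}" ]; then
  state="$(first_step_state "$CLUSTER_ID")"
  until step_is_final "$state"; do
    echo "Step state: $state"
    sleep 30
    state="$(first_step_state "$CLUSTER_ID")"
  done
  echo "Final step state: $state"
fi
```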