Caution

You're viewing documentation for a previous version of ScyllaDB Migrator. Switch to the latest stable version.

Manual Set Up of a Spark Cluster

This page describes how to set up a Spark cluster on your own infrastructure and use it to perform a migration.

  1. Follow the official documentation to install Spark on each node of your cluster, and start the Spark master and the Spark workers.

  2. On the Spark master node, download the latest release of the Migrator.

    wget https://github.com/scylladb/scylla-migrator/releases/latest/download/scylla-migrator-assembly.jar
    

    Alternatively, download a specific release of scylla-migrator-assembly.jar.

  3. On the Spark master node, download the file config.yaml.example from our Git repository and save it as config.yaml.

    wget https://github.com/scylladb/scylla-migrator/raw/master/config.yaml.example \
      --output-document=config.yaml
    

  4. Edit config.yaml to configure the migration according to your needs.

  5. Finally, run the migration as follows from the Spark master node.

    spark-submit --class com.scylladb.migrator.Migrator \
      --master spark://<spark-master-hostname>:7077 \
      --conf spark.scylla.config=<path to config.yaml> \
      <... other arguments> \
      <path to scylla-migrator-assembly.jar>
    

    Replace “<spark-master-hostname>”, “<path to config.yaml>”, “<... other arguments>”, and “<path to scylla-migrator-assembly.jar>” above with appropriate values. See the page Run the Migration for a complete description of the arguments that spark-submit expects.

  6. You can monitor progress from the Spark web UI.
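
Besides the web pages, a running Spark application also serves a monitoring REST API under /api/v1 on its application UI port. The sketch below only builds the URL that lists applications. It assumes the default application UI port 4040 (Spark picks 4041, 4042, and so on when that port is taken) and that spark-submit was started on the Spark master node, so the UI is served from that host. The function name spark_applications_url is hypothetical and not part of Spark or the Migrator.

```shell
# Hypothetical helper: print the URL of the Spark monitoring REST endpoint
# that lists applications.
# Usage: spark_applications_url <host> [port]
# Assumption: the application UI listens on port 4040 unless a port is given.
spark_applications_url() {
  echo "http://$1:${2:-4040}/api/v1/applications"
}

# Example (needs curl and a running application):
#   curl --silent "$(spark_applications_url <spark-master-hostname>)"
```

The Spark master itself serves its own web UI, by default on port 8080, which lists the workers and the running applications.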

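As a sketch of step 5, the spark-submit invocation can be wrapped in a small shell function that first checks that config.yaml and the assembly jar exist. The function names require_file and run_migration and the DRY_RUN variable are hypothetical, introduced only for this illustration. The sketch omits the other arguments described in the page Run the Migration, so treat it as a starting point rather than a complete command.

```shell
# Hypothetical helper functions; the names are illustrative and not part of the Migrator.

# Succeed only if the file exists and is not empty.
require_file() {
  if [ ! -s "$1" ]; then
    echo "missing or empty file: $1" >&2
    return 1
  fi
}

# Run the spark-submit command from step 5.
# Usage: run_migration <spark-master-hostname> <path to config.yaml> <path to scylla-migrator-assembly.jar>
# With DRY_RUN=1 the command is printed instead of executed.
run_migration() {
  require_file "$2" || return 1
  require_file "$3" || return 1
  # Replace the function's positional parameters with the full command line.
  set -- spark-submit \
    --class com.scylladb.migrator.Migrator \
    --master "spark://$1:7077" \
    --conf "spark.scylla.config=$2" \
    "$3"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

Setting DRY_RUN=1 before calling run_migration prints the assembled command, which makes it easy to review the master URL and the paths before starting the actual migration.
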
Last updated on 28 Apr 2025.