Validate the Migration

In addition to the monitoring user interface provided by Spark, you can run a validator program after a migration to check that the destination table contains exactly the same items as the source table.

Running the validator consists of submitting the same Spark job as the migration, but with a different entry point.

Before running the validator, adjust the settings under the top-level validation property of the file config.yml:

validation:
  # Should WRITETIMEs and TTLs be compared?
  compareTimestamps: true
  # What difference should we allow between TTLs?
  ttlToleranceMillis: 60000
  # What difference should we allow between WRITETIMEs?
  writetimeToleranceMillis: 1000
  # How many differences to fetch and print
  failuresToFetch: 100
  # What difference should we allow between floating point numbers?
  floatingPointTolerance: 0.001
  # What difference in ms should we allow between timestamps?
  timestampMsTolerance: 0

The exact way to run the validator depends on the way you set up the Spark cluster.

Run the Validator in the Ansible-based Setup

Submit the job with the submit-cql-job-validator.sh script.
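For instance, assuming the Ansible playbook has already provisioned the cluster and you are logged in to the Spark master node in the directory where the submit scripts were installed, the invocation is simply:

./submit-cql-job-validator.sh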

Run the Validator in a Manual Spark Setup

Pass the argument --class com.scylladb.migrator.Validator to the spark-submit invocation:

spark-submit --class com.scylladb.migrator.Validator \
  --master spark://<spark-master-hostname>:7077 \
  --conf spark.scylla.config=<path to config.yaml> \
  <path to scylla-migrator-assembly.jar>

Run the Validator in AWS EMR

Use the following arguments for the Cluster Step that runs a custom JAR:

spark-submit --deploy-mode cluster --class com.scylladb.migrator.Validator --conf spark.scylla.config=/mnt1/config.yaml /mnt1/scylla-migrator-assembly.jar
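If you prefer the AWS CLI to the console, a sketch of an equivalent step submission, assuming the configuration file and the assembly JAR are already available under /mnt1 on the cluster and using a placeholder cluster ID, could look like this:

aws emr add-steps \
  --cluster-id j-XXXXXXXXXXXXX \
  --steps 'Type=CUSTOM_JAR,Name=ValidateMigration,ActionOnFailure=CONTINUE,Jar=command-runner.jar,Args=[spark-submit,--deploy-mode,cluster,--class,com.scylladb.migrator.Validator,--conf,spark.scylla.config=/mnt1/config.yaml,/mnt1/scylla-migrator-assembly.jar]'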

Run the Validator in a Docker-based Spark Setup

Pass the argument --class com.scylladb.migrator.Validator to the spark-submit invocation:

docker compose exec spark-master /spark/bin/spark-submit --class com.scylladb.migrator.Validator \
  --master spark://spark-master:7077 \
  --conf spark.driver.host=spark-master \
  --conf spark.scylla.config=/app/config.yaml \
  /jars/scylla-migrator-assembly.jar
