Validate the Migration

In addition to the monitoring user interface provided by Spark, you can run a separate validation program after a migration to check that the destination table contains exactly the same items as the source table.

Running the validator consists of submitting the same Spark job as the migration, but with a different entry point.
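Concretely, only the entry point (the --class argument) changes between the two runs. As a minimal sketch, assuming the migration itself was submitted with the default entry point com.scylladb.migrator.Migrator:

spark-submit --class com.scylladb.migrator.Migrator <other arguments>   # migration run
spark-submit --class com.scylladb.migrator.Validator <other arguments>  # validation run, same job and configuration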

Before running the validator, adjust the settings under the top-level validation property of your config.yaml file:

validation:
  # Should WRITETIMEs and TTLs be compared?
  compareTimestamps: true
  # What difference should we allow between TTLs?
  ttlToleranceMillis: 60000
  # What difference should we allow between WRITETIMEs?
  writetimeToleranceMillis: 1000
  # How many differences to fetch and print
  failuresToFetch: 100
  # What difference should we allow between floating point numbers?
  floatingPointTolerance: 0.001
  # What difference in ms should we allow between timestamps?
  timestampMsTolerance: 0
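With the defaults shown above, for example, a TTL that differs by up to 60 seconds or a WRITETIME that differs by up to 1 second between the source and the destination is tolerated and not reported as a difference.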

The exact way to run the validator depends on how you set up your Spark cluster.

Run the Validator in the Ansible-based Setup

Submit the job by running the submit-cql-job-validator.sh script.
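For example (a minimal sketch; the script's location depends on where your Ansible playbook placed it, so the directory below is a placeholder):

cd <directory containing the submission scripts>
./submit-cql-job-validator.sh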

Run the Validator in a Manual Spark Setup

Pass the argument --class com.scylladb.migrator.Validator to the spark-submit invocation:

spark-submit --class com.scylladb.migrator.Validator \
  --master spark://<spark-master-hostname>:7077 \
  --conf spark.scylla.config=<path to config.yaml> \
  <path to scylla-migrator-assembly.jar>

Run the Validator in AWS EMR

Use the following arguments for the Cluster Step that runs a custom JAR:

spark-submit --deploy-mode cluster --class com.scylladb.migrator.Validator --conf spark.scylla.config=/mnt1/config.yaml /mnt1/scylla-migrator-assembly.jar
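The paths above assume that config.yaml and scylla-migrator-assembly.jar were uploaded to /mnt1 on the cluster, as for the migration step; adjust them if you stored the files elsewhere.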

Run the Validator in a Docker-based Spark Setup

Pass the argument --class com.scylladb.migrator.Validator to the spark-submit invocation:

docker compose exec spark-master /spark/bin/spark-submit --class com.scylladb.migrator.Validator \
  --master spark://spark-master:7077 \
  --conf spark.driver.host=spark-master \
  --conf spark.scylla.config=/app/config.yaml \
  /jars/scylla-migrator-assembly.jar
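As with the migration run, you can follow the validation job in the Spark master web UI if your Docker setup exposes it (commonly on port 8080), or tail the master container's logs, for example:

docker compose logs --follow spark-master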
