Spring Data and Cassandra Consistency

A month or so ago I had to solve a problem: a couple of the CQL queries in a Spring Boot application needed a different consistency level than all the others in the microservice. It was hard to find information on how to customize the consistency of a single Cassandra query while using Spring Data Cassandra. I looked into CassandraTemplate at first, as it was the first thing that popped up in a Google search, but I found that CqlTemplate doesn’t support custom query options… Luckily, a colleague of mine told me about the @Consistency annotation.

This article has three parts:

  • Cassandra part – really short introduction to Cassandra consistency.
  • Spring Data part – brief description of Spring Data project.
  • Cassandra Query Options part – example of how to customize CQL query consistency while using Spring Data.

Cassandra

If you have any working experience with Cassandra, you probably know that it is rarely run as a single node, except maybe in a dev lab or on your laptop. Usually you have a Cassandra cluster running, or at least a single data center with one or more racks.

A Cassandra cluster is often called a Cassandra ring, because it uses a consistent hashing algorithm to distribute data. Each node in the ring is responsible for a certain part of the data, which is assigned by the partitioner.

Data partitioning in Cassandra could easily be a separate article, as there is so much to it. Without getting into too much detail: data is distributed and replicated across the nodes, and when the database is queried, a certain number of replica nodes have to respond before the result is accepted. How many is determined by the consistency level, which depends entirely on your use case. Nothing is free in this world, though: the higher the consistency, the slower the queries, and the other way around.

The most common consistency levels are (described here for writes):

  • ANY – A write must be written to at least one node. If no replica for the data is reachable, a hint stored on the coordinator is enough, meaning only one node in the cluster has to be alive.
  • LOCAL_ONE – A write must be sent to, and successfully acknowledged by, at least one replica node in the local datacenter.
  • LOCAL_QUORUM – A majority of the replica nodes in the local datacenter have to acknowledge the write.
  • QUORUM – A majority of the replica nodes across all datacenters have to acknowledge the write.
  • EACH_QUORUM – A majority of the replica nodes in each datacenter have to acknowledge the write.
  • ALL – All replica nodes for the written data have to acknowledge the write.
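
To make “majority” a bit more concrete, here is a small sketch of the arithmetic. A quorum is floor(RF / 2) + 1, where RF is the replication factor, and a read is guaranteed to overlap with the latest write when the number of replicas read plus the number of replicas written is greater than RF. The class and method names are made up for illustration; nothing here comes from the Cassandra driver or Spring.

```java
final class QuorumMath {

    /** Replicas that must respond for a quorum: floor(rf / 2) + 1. */
    static int quorum(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replication factor must be >= 1");
        }
        return replicationFactor / 2 + 1;
    }

    /** True when every read is guaranteed to hit at least one replica holding the latest write. */
    static boolean readOverlapsWrite(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;
        int q = quorum(rf);
        System.out.println("Quorum for RF=" + rf + ": " + q);
        // QUORUM writes combined with QUORUM reads
        System.out.println("QUORUM/QUORUM overlaps: " + readOverlapsWrite(q, q, rf));
        // ONE writes combined with ONE reads
        System.out.println("ONE/ONE overlaps: " + readOverlapsWrite(1, 1, rf));
    }
}
```

With a replication factor of 3, a quorum is 2 replicas, so one replica can be down and QUORUM reads and writes still succeed, while ALL would fail.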

Spring Data

Spring Data is one of many Spring projects created to make developers’ lives easier. Its main purpose is to simplify access to different kinds of persistence stores, both relational and NoSQL databases, and Cassandra is one of the supported options. Support for it lives in the spring-data-cassandra module.

Once spring-data-cassandra is added to your project (using Maven or Gradle), it is quite straightforward to configure.

For example, in your Spring Boot application.yml you can add:

spring:
  data:
    cassandra:
      contact-points: 127.0.0.1 # IP of your cluster
      port: 9042 # default Cassandra port
      keyspace-name: codespacelab # name of your keyspace/database
      username: codespacelab
      password: codespacelab
      consistency-level: ALL

In this example we have a really high consistency level (ALL): every replica of the data has to acknowledge each write. This level of consistency might be needed where data accuracy is extremely important, for example when building a system for financial transactions, and even then EACH_QUORUM should be enough.

Now let’s say we also need to read customer transaction data for a personalized spending analytics chart, or something like that, from the same Cassandra cluster in the same Spring application. This time we don’t need such high consistency; it’s OK if some of the historic data shows up on the chart a bit later.

But how can I set the consistency level for one query without changing the overall configuration?

Cassandra Query Options

Query options to the rescue! If we weren’t using the Spring Data Cassandra library, one way to change the consistency level would be the QueryOptions class from the DataStax Cassandra driver, the same driver Spring Data Cassandra uses at its core. Note that QueryOptions passed to the Cluster builder set the default for every query run through that Cluster; to override it for one specific query, you would call setConsistencyLevel on the individual statement, for example one built with QueryBuilder.

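import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.QueryOptions;

// Default consistency level for every query executed through this Cluster
QueryOptions queryOptions = new QueryOptions()
    .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);

Cluster cluster = Cluster.builder()
    // ips is assumed to be a Collection<String>; casting toArray() to String[] fails at runtime
    .addContactPoints(ips.toArray(new String[0]))
    .withPort(sourcePort)
    .withQueryOptions(queryOptions)
    .build();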

But since we have the Spring Data Cassandra module imported, we can use the @Consistency annotation to tweak the consistency level of individual queries! It is really simple to use:

public interface TransactionRepository extends CrudRepository<Transaction, String> {

    @Consistency(ConsistencyLevel.LOCAL_QUORUM)
    List<Transaction> findByLastname(String lastname);
}

Just annotate the desired query method, pass the desired consistency level as a parameter, and you are good to go. You don’t even need to write a custom query; it works just like that.
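
If you are curious how an annotation on a method can override an application-wide default, here is a rough, plain-Java sketch of the idea. It is not Spring Data’s actual implementation: the QueryConsistency annotation, the Level enum, DemoRepository and levelFor are all hypothetical stand-ins, used only to show a per-method value winning over a default when looked up via reflection.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.List;

final class ConsistencyOverrideSketch {

    // Hypothetical stand-in for the driver's consistency level enum
    enum Level { LOCAL_ONE, LOCAL_QUORUM, ALL }

    // Hypothetical stand-in for a per-method consistency annotation
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface QueryConsistency {
        Level value();
    }

    interface DemoRepository {
        @QueryConsistency(Level.LOCAL_QUORUM)
        List<String> findByLastname(String lastname);

        // No annotation: falls back to the default level
        List<String> findAll();
    }

    /** Returns the annotated level of the named method, or the default if it has none. */
    static Level levelFor(String methodName, Level defaultLevel) {
        for (Method method : DemoRepository.class.getMethods()) {
            if (method.getName().equals(methodName)) {
                QueryConsistency override = method.getAnnotation(QueryConsistency.class);
                return override != null ? override.value() : defaultLevel;
            }
        }
        throw new IllegalArgumentException("No such repository method: " + methodName);
    }

    public static void main(String[] args) {
        System.out.println("findByLastname -> " + levelFor("findByLastname", Level.ALL));
        System.out.println("findAll -> " + levelFor("findAll", Level.ALL));
    }
}
```

The point is only that the configured default (ALL in this article’s application.yml) stays in place for every method that carries no annotation.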

Thanks for reading! I hope you enjoyed this post and maybe even learned something new. See also the official documentation of the Consistency annotation.
