Introduction to Cassandra DB (Part 1)

Photo by Alina Grubnyak / Unsplash

Cassandra DB is a highly scalable, distributed (and decentralised) NoSQL database designed for handling large amounts of data across many commodity servers without any single point of failure. It provides high availability and fault tolerance while supporting replication across multiple data centres.

Cassandra is highly configurable, with options such as replication factor, consistency level, replication strategy and partitioning strategy.

This post is going to cover the basics to get up and running quickly.

If you need to install Cassandra DB to follow the code examples, you can get it from the official Cassandra Docker image.
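As a rough sketch of the Docker route, the script below starts a single node from the official `cassandra` image and waits until `cqlsh` inside the container can connect. The container name `cassandra-dev`, the `start_cassandra` helper and the two-minute polling limit are assumptions made for illustration; adjust them to your setup.

```shell
# Illustrative container name; any name works as long as later commands match.
CONTAINER="cassandra-dev"

start_cassandra() {
  # Run the official image in the background and publish the CQL port (9042).
  docker run --name "$CONTAINER" -d -p 9042:9042 cassandra:latest || return 1

  # A fresh node needs a while to boot. Poll for up to about two minutes
  # until cqlsh inside the container can connect.
  attempts=0
  until docker exec "$CONTAINER" cqlsh -e "DESCRIBE KEYSPACES" >/dev/null 2>&1; do
    attempts=$((attempts + 1))
    [ "$attempts" -ge 24 ] && return 1
    sleep 5
  done
}

# Only attempt the start when Docker is actually installed.
if command -v docker >/dev/null 2>&1; then
  if start_cassandra; then
    echo "Ready. Open a CQL shell with: docker exec -it $CONTAINER cqlsh"
  else
    echo "Cassandra did not come up; check: docker logs $CONTAINER" >&2
  fi
else
  echo "Docker not found; install it before running this sketch." >&2
fi
```

Once the node is up, `docker exec -it cassandra-dev cqlsh` gives you the same interactive prompt used in the rest of this post.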

Create a table

Let's start with the basics, and create a table.

Cassandra DB executes statements written in Cassandra Query Language (CQL). CQL will seem familiar if you have previous experience with SQL.

One way to do that is to create a schema file users_schema.cql:

CREATE KEYSPACE IF NOT EXISTS mykeyspace WITH REPLICATION = {'class' : 'SimpleStrategy','replication_factor' : 1};

USE mykeyspace;

CREATE TABLE IF NOT EXISTS users (
    user_id UUID PRIMARY KEY,
    first_name TEXT,
    last_name TEXT
);
users_schema.cql

The first statement:

CREATE KEYSPACE IF NOT EXISTS mykeyspace WITH REPLICATION = {'class' : 'SimpleStrategy','replication_factor' : 1};

This will seem strange at first, especially if you're coming from a traditional relational database such as PostgreSQL or MySQL. Although the two are not identical, you can think of a keyspace in Cassandra DB as analogous to a schema in PostgreSQL. The replication settings here (SimpleStrategy with a replication factor of 1) keep a single copy of each row, which suits a single-node development setup. We will cover replication factors in more detail in a future post.

With the .cql file complete, execute it using cqlsh:

cqlsh -f users_schema.cql

This is a very simple table with only three columns: user_id of type UUID, which is the table's primary key (and partition key; more on this later), and TEXT columns for first_name and last_name.
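If Cassandra is running inside Docker rather than installed locally, cqlsh -f has to run inside the container, so the schema file needs to be copied in first. Here is a minimal sketch, assuming a running container named `cassandra-dev` (a hypothetical name) and a made-up `load_schema` helper:

```shell
CONTAINER="cassandra-dev"        # illustrative name of the running container
SCHEMA_FILE="users_schema.cql"   # the schema file created above

load_schema() {
  # Copy the schema into the container, execute it there,
  # then print the resulting table definition as a quick check.
  docker cp "$SCHEMA_FILE" "${CONTAINER}:/tmp/${SCHEMA_FILE}" &&
    docker exec "$CONTAINER" cqlsh -f "/tmp/${SCHEMA_FILE}" &&
    docker exec "$CONTAINER" cqlsh -e "DESCRIBE TABLE mykeyspace.users"
}

# Only run when Docker is installed and the schema file exists.
if command -v docker >/dev/null 2>&1 && [ -f "$SCHEMA_FILE" ]; then
  load_schema || echo "Loading the schema failed; is $CONTAINER running?" >&2
fi
```

The final DESCRIBE TABLE call is optional, but it is a cheap way to confirm that the keyspace and table were created as expected.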

CRUD Actions

Now that we have a table in place, we can use CQL to perform Create, Read, Update and Delete actions. Start cqlsh to execute statements interactively:

cqlsh

Then execute an INSERT CQL statement:

cqlsh> INSERT INTO mykeyspace.users (user_id, first_name, last_name) VALUES (uuid(), 'John', 'Doe');

Great, now let's read it back:

cqlsh> SELECT * FROM mykeyspace.users;

 user_id                              | first_name   | last_name
--------------------------------------+--------------+-------------
 7f875173-856f-4e48-a5d2-5960191e39f4 | John         | Doe

(1 rows)
💡
Instead of prefixing table names with the keyspace each time, you can first execute USE mykeyspace; so that every following statement is understood to be inside the context of mykeyspace. The prompt changes to cqlsh:mykeyspace> to show the active keyspace.
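The keyspace can also be chosen when launching cqlsh. As a small sketch, assuming cqlsh is on your PATH and can reach the node, the -k option selects the keyspace and -e runs a single statement before exiting:

```shell
KEYSPACE="mykeyspace"

# Only run when cqlsh is installed locally.
if command -v cqlsh >/dev/null 2>&1; then
  # -k selects the keyspace up front; -e executes one statement and quits.
  cqlsh -k "$KEYSPACE" -e "SELECT * FROM users;"
fi
```

Dropping the -e part (just `cqlsh -k mykeyspace`) opens the interactive prompt with the keyspace already selected.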

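Update the record. Note that a UUID value is written without quotes; a quoted value is treated as a string and rejected for a UUID column:

cqlsh:mykeyspace> UPDATE users SET first_name = 'Jane' WHERE user_id = 7f875173-856f-4e48-a5d2-5960191e39f4;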

Delete the record (again with the UUID unquoted):

cqlsh:mykeyspace> DELETE FROM users WHERE user_id = 7f875173-856f-4e48-a5d2-5960191e39f4;
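To replay the whole cycle without copying IDs by hand, one option is to generate the UUID on the client and write the statements to a file. The sketch below assumes `uuidgen` (or the Linux `/proc` fallback) is available and uses a made-up file name, `crud_demo.cql`. The UUID is interpolated without quotes, because CQL treats a quoted value as a string.

```shell
# Generate the key up front so every statement can refer to the same row.
# uuidgen prints uppercase on some systems, so normalise to lowercase.
USER_ID=$( (uuidgen 2>/dev/null || cat /proc/sys/kernel/random/uuid) | tr 'A-F' 'a-f')

# Write the CRUD statements to a file; $USER_ID is expanded by the shell.
cat > crud_demo.cql <<CQL
USE mykeyspace;
INSERT INTO users (user_id, first_name, last_name) VALUES ($USER_ID, 'John', 'Doe');
SELECT * FROM users WHERE user_id = $USER_ID;
UPDATE users SET first_name = 'Jane' WHERE user_id = $USER_ID;
SELECT * FROM users WHERE user_id = $USER_ID;
DELETE FROM users WHERE user_id = $USER_ID;
SELECT * FROM users WHERE user_id = $USER_ID;
CQL

# Run the file only when cqlsh is installed locally.
if command -v cqlsh >/dev/null 2>&1; then
  cqlsh -f crud_demo.cql
fi
```

Each SELECT filters on the primary key, so it reads back only the row in question; the last one should come back empty once the DELETE has run.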

Summary

To wrap up, we've walked through the first steps with Cassandra DB: creating a keyspace and a table using Cassandra Query Language (CQL), which will feel familiar to anyone who already knows SQL.

We then performed CRUD (Create, Read, Update and Delete) operations on that table, the actions that form the backbone of most database work.

In upcoming posts, we'll dig deeper into more complex functionality and configuration in Cassandra DB, such as replication factors, partitioning strategies and consistency levels.

As you continue with Cassandra, refer back to the official documentation or community forums whenever in doubt; both are excellent resources.

See you in the next post!