Module Review: Cluster Operations

Operating a Cassandra cluster requires mastering node lifecycle, data safety, and security.

1. Key Takeaways

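  • Node Lifecycle: A new node bootstraps automatically when it starts with auto_bootstrap enabled (the default); nodetool bootstrap resume only resumes an interrupted bootstrap. Use nodetool decommission to remove a live node safely, and reserve nodetool removenode for nodes that are dead and cannot be restarted.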
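  • Repair: Anti-entropy repair using Merkle Trees is essential for eventual consistency. Run incremental repairs frequently, and make sure every node is repaired at least once within gc_grace_seconds (10 days by default) so that deleted data cannot reappear.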
  • Backups: Snapshots are cheap (hard links), but incremental backups accumulate in each table's backups directory and are never deleted automatically, so they need regular pruning. Always test your restore procedure.
  • Security: Enable PasswordAuthenticator and configure mTLS for both client-to-node and node-to-node communication in production.

2. Interactive Flashcards

Bootstrap

The process of a new node joining the cluster, gossiping with seeds, and streaming data for its assigned token range from existing nodes.

Merkle Tree

A hash tree used during repair to efficiently compare data on replicas. Replicas exchange tree hashes, and only the token ranges whose hashes mismatch are streamed.
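
The comparison described above can be sketched in a few lines of Python. This is an illustration of the general technique, not Cassandra's implementation: the use of SHA-256, the power-of-two leaf count, and the names `build_tree` and `differing_leaves` are all assumptions made for the example. Each leaf stands in for the data of one token sub-range.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """Hash helper (SHA-256 is an assumption chosen for this sketch)."""
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return the tree as a list of levels: level 0 holds leaf hashes, the last level holds the root."""
    n = len(leaves)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("this sketch assumes a power-of-two number of leaves")
    level = [_h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        # Each parent is the hash of its two children concatenated.
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_leaves(tree_a, tree_b):
    """Walk both trees from the root, descending only into branches whose hashes differ."""
    stack = [(len(tree_a) - 1, 0)]  # (level, index), starting at the root
    diffs = []
    while stack:
        level, idx = stack.pop()
        if tree_a[level][idx] == tree_b[level][idx]:
            continue  # identical subtree: nothing below needs streaming
        if level == 0:
            diffs.append(idx)
        else:
            stack.append((level - 1, 2 * idx))
            stack.append((level - 1, 2 * idx + 1))
    return sorted(diffs)


replica_a = build_tree([b"range0", b"range1", b"range2", b"range3"])
replica_b = build_tree([b"range0", b"range1", b"range2-stale", b"range3"])
out_of_sync = differing_leaves(replica_a, replica_b)  # indexes of ranges to stream
```

Because matching subtrees are skipped at the first equal hash, the amount of comparison work grows with the number of differences rather than with the total data size.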

Hard Link

A filesystem feature used by snapshots. It allows multiple filenames to point to the same disk inode, creating an instant backup without duplicating data.
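
To see why a hard link gives an instant copy, the following Python sketch links a file into a snapshot directory and then deletes the original name. The file name, directory layout, and the helper `snapshot_by_hard_link` are hypothetical and only loosely imitate what a snapshot does; the standard-library calls (`os.link`, `os.stat`) are real.

```python
import os
import tempfile


def snapshot_by_hard_link(src_path, snapshot_dir):
    """Create a second name for src_path inside snapshot_dir without copying any bytes."""
    os.makedirs(snapshot_dir, exist_ok=True)
    dst = os.path.join(snapshot_dir, os.path.basename(src_path))
    os.link(src_path, dst)  # both names now point at the same inode
    return dst


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical SSTable file name, used purely for illustration.
    sstable = os.path.join(tmp, "nb-1-big-Data.db")
    with open(sstable, "wb") as f:
        f.write(b"immutable sstable bytes")

    snap = snapshot_by_hard_link(sstable, os.path.join(tmp, "snapshots", "demo"))
    same_inode = os.stat(sstable).st_ino == os.stat(snap).st_ino
    link_count = os.stat(sstable).st_nlink

    # Removing the original name (as compaction eventually would) leaves the snapshot readable.
    os.remove(sstable)
    with open(snap, "rb") as f:
        survived = f.read()
```

The flip side is that disk space is only released once every name is gone, which is why old snapshots still need to be cleared.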

RBAC

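Role-Based Access Control. A security model where permissions (such as SELECT and MODIFY) are granted to roles, and roles are granted to users. In Cassandra a user is itself a role with login enabled, and roles can be granted to other roles.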
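
As a rough mental model, the relationship between roles and permissions can be expressed as a small Python class. This is a toy sketch, not Cassandra's authorizer: the `Role` class, the role names, and the `shop.orders` resource are hypothetical, and resource hierarchies (keyspace versus table) are deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    name: str
    permissions: set = field(default_factory=set)  # set of (permission, resource) pairs
    member_of: list = field(default_factory=list)  # roles granted to this role

    def grant(self, permission, resource):
        """Attach a permission on a resource directly to this role."""
        self.permissions.add((permission, resource))

    def effective_permissions(self):
        """Collect this role's permissions plus those inherited from granted roles."""
        seen, perms, stack = set(), set(), [self]
        while stack:
            role = stack.pop()
            if role.name in seen:
                continue  # guard against cycles in role grants
            seen.add(role.name)
            perms |= role.permissions
            stack.extend(role.member_of)
        return perms

    def is_allowed(self, permission, resource):
        return (permission, resource) in self.effective_permissions()


analyst = Role("analyst")
analyst.grant("SELECT", "shop.orders")

alice = Role("alice", member_of=[analyst])  # a login role that inherits from analyst
can_read = alice.is_allowed("SELECT", "shop.orders")
can_write = alice.is_allowed("MODIFY", "shop.orders")
```

Granting permissions to a shared role rather than to each login keeps access changes in one place.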

Decommission

The graceful removal of a live node using `nodetool decommission`, run on the node that is leaving. The node streams its data to the remaining replicas before it leaves the ring.

mTLS

Mutual TLS. A security protocol where both the client and the server authenticate each other using certificates.
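
The "both sides authenticate" part is easiest to see in code. The sketch below uses Python's standard `ssl` module to build a server context that demands a client certificate and a client context that verifies the server. It illustrates the protocol idea only and is not how Cassandra itself is configured; the function names and the optional certificate paths are assumptions for the example.

```python
import ssl


def build_mtls_server_context(certfile=None, keyfile=None, cafile=None):
    """Server side: present our certificate AND require one from the client."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.verify_mode = ssl.CERT_REQUIRED  # this line is what makes the TLS "mutual"
    if certfile:
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    if cafile:
        ctx.load_verify_locations(cafile=cafile)  # CA that signed the client certificates
    return ctx


def build_mtls_client_context(certfile=None, keyfile=None, cafile=None):
    """Client side: verify the server (the default for this protocol) and present our own certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if certfile:
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    if cafile:
        ctx.load_verify_locations(cafile=cafile)  # CA that signed the server certificate
    return ctx


# No certificate files are loaded here, so the contexts only demonstrate the settings.
server_ctx = build_mtls_server_context()
client_ctx = build_mtls_client_context()
```

With ordinary one-way TLS the server context would keep the default `ssl.CERT_NONE`, so any client could connect without proving its identity.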

3. Operations Cheat Sheet

| Category | Command / Concept | Description |
|---|---|---|
| Node Mgmt | nodetool status | Check cluster state (UN = Up/Normal, DN = Down/Normal, UJ = Up/Joining). |
| Node Mgmt | nodetool decommission | Safely remove a live node. |
| Node Mgmt | nodetool removenode | Force remove a dead node. |
| Node Mgmt | nodetool cleanup | Remove data no longer owned by node. |
| Repair | nodetool repair --full | Run full repair (heavy IO). Since Cassandra 2.2, plain nodetool repair runs an incremental repair. |
| Repair | nodetool repair -pr | Repair primary range (recommended); must be run on every node. |
| Backup | nodetool snapshot | Create hard-link backup. |
| Backup | nodetool listsnapshots | List active snapshots. |
| Backup | nodetool clearsnapshot | Delete snapshots. |
| Security | cqlsh -u <user> -p <pass> | Login with credentials. |
| Security | nodetool --ssl | Run nodetool with SSL enabled. |
| Files | /var/lib/cassandra/data | Default SSTable location. |
| Files | /var/lib/cassandra/commitlog | Write-ahead log location. |
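
The choice between decommission and removenode depends on the state code shown by nodetool status, which the following Python sketch makes explicit. The sample output is hand-written for illustration (addresses, loads, and host IDs are made up), and `parse_status` and `removal_command` are hypothetical helpers; real output may differ in columns between versions.

```python
import re

UUID_RE = re.compile(r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")
CODE_RE = re.compile(r"^[UD][NLJM]$")  # Up/Down followed by Normal/Leaving/Joining/Moving

# Illustrative sample only; not captured from a real cluster.
SAMPLE = """\
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.0.0.1   1.2 GiB    16      66.7%             11111111-1111-1111-1111-111111111111  rack1
DN  10.0.0.2   1.1 GiB    16      66.7%             22222222-2222-2222-2222-222222222222  rack1
UJ  10.0.0.3   200 MiB    16      ?                 33333333-3333-3333-3333-333333333333  rack1
"""


def parse_status(text):
    """Extract the state code, address, and host ID from each node line."""
    nodes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2 or not CODE_RE.match(parts[0]):
            continue  # skip headers and separators
        match = UUID_RE.search(line)
        nodes.append({
            "code": parts[0],
            "address": parts[1],
            "host_id": match.group(0) if match else None,
        })
    return nodes


def removal_command(node):
    """Suggest how to remove a node: decommission if live, removenode if down."""
    if node["code"] == "UN":
        return "nodetool decommission"  # run on the node that is leaving
    if node["code"].startswith("D"):
        return f"nodetool removenode {node['host_id']}"  # run from any live node
    return None  # joining, leaving, or moving: wait for the operation to finish


nodes = parse_status(SAMPLE)
suggestions = {n["address"]: removal_command(n) for n in nodes}
```

Returning `None` for transitional states is a deliberate choice in this sketch: a node that is still joining or leaving should finish streaming before any removal is attempted.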

4. Next Steps

Now that you can operate a cluster, you are ready to explore the rest of the course or dive deeper into specific topics.