Module Review: Cluster Operations
Operating a Cassandra cluster requires mastering node lifecycle, data safety, and security.
1. Key Takeaways
- Node Lifecycle: Add nodes by letting them bootstrap on startup (`nodetool bootstrap resume` resumes an interrupted bootstrap) and remove live nodes safely with `nodetool decommission`. Never use `removenode` unless the node is dead.
- Repair: Anti-entropy repair using Merkle Trees is essential for eventual consistency. Run incremental repairs frequently.
- Backups: Snapshots are cheap (hard links), but incremental backups require log management. Always test your restore procedure.
- Security: Enable `PasswordAuthenticator` and configure mTLS for both client-to-node and node-to-node communication in production.
2. Interactive Flashcards
Bootstrap
The process of a new node joining the cluster, gossiping with seeds, and streaming data for its assigned token range from existing nodes.
Merkle Tree
A hash tree used during repair to efficiently compare data on replicas. Only the branches that differ (hashes mismatch) are streamed.
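The comparison described above can be sketched in Python. This is a toy model, not Cassandra's implementation: each leaf stands in for the data of one token sub-range, SHA-256 is an assumed hash function, and the names `build_tree` and `differing_leaves` are hypothetical. It assumes both replicas expose the same power-of-two number of leaves.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """Hash helper (SHA-256 is an assumption made for this sketch)."""
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return the tree as a list of levels: levels[0] holds the leaf hashes,
    levels[-1] holds the single root hash."""
    n = len(leaves)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch needs a power-of-two number of leaves")
    level = [_h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        # Each parent is the hash of its two children concatenated.
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_leaves(tree_a, tree_b):
    """Walk down from the root, descending only into branches whose hashes
    differ, and return the indices of the leaves that need streaming."""
    if len(tree_a) != len(tree_b):
        raise ValueError("trees must have the same depth")
    top = len(tree_a) - 1
    if tree_a[top][0] == tree_b[top][0]:
        return []  # identical roots: the replicas agree
    mismatched = [0]
    for depth in range(top - 1, -1, -1):
        next_level = []
        for idx in mismatched:
            for child in (2 * idx, 2 * idx + 1):
                if tree_a[depth][child] != tree_b[depth][child]:
                    next_level.append(child)
        mismatched = next_level
    return mismatched


replica_a = [b"range0-v1", b"range1-v1", b"range2-v1", b"range3-v1"]
replica_b = [b"range0-v1", b"range1-v1", b"range2-v2", b"range3-v1"]
print(differing_leaves(build_tree(replica_a), build_tree(replica_b)))  # prints [2]
```

Only the subtree containing leaf 2 is visited below the root; the matching half of the tree is skipped after a single hash comparison.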
Hard Link
A filesystem feature used by snapshots. It allows multiple filenames to point to the same disk inode, creating an instant backup without duplicating data.
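To make the hard-link idea concrete, here is a minimal Python sketch that "snapshots" a directory by linking every file into a second directory. It illustrates the filesystem mechanism only and is not how `nodetool snapshot` is implemented; the function names and any directory layout you pass in are assumptions for illustration.

```python
import os


def snapshot(data_dir, snapshot_dir):
    """Hard-link every regular file in data_dir into snapshot_dir.

    No file contents are copied; each link is a second name for the
    same inode. Both directories must be on the same filesystem.
    """
    os.makedirs(snapshot_dir, exist_ok=True)
    linked = []
    for name in sorted(os.listdir(data_dir)):
        src = os.path.join(data_dir, name)
        if os.path.isfile(src):
            os.link(src, os.path.join(snapshot_dir, name))
            linked.append(name)
    return linked


def shares_inode(path_a, path_b):
    """True if both paths refer to the same inode on the same device."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)
```

Because the snapshot entry is just another name for the same inode, deleting the original file leaves the data readable through the snapshot path, and disk space is only reclaimed once every name is removed.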
RBAC
Role-Based Access Control. A security model where permissions (SELECT, MODIFY) are assigned to Roles, and Users are assigned to Roles.
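The model can be sketched as a small in-memory Python class. This is a toy illustration of the permission-to-role and user-to-role mappings, not Cassandra's implementation (in Cassandra itself the same relationships are expressed with CQL statements such as `CREATE ROLE` and `GRANT`). The class name, method names, and the `shop.orders` resource are hypothetical.

```python
class AccessControl:
    """Toy RBAC store: permissions attach to roles, users attach to roles."""

    def __init__(self):
        self.role_permissions = {}  # role -> set of (permission, resource)
        self.user_roles = {}        # user -> set of role names

    def grant_permission(self, role, permission, resource):
        self.role_permissions.setdefault(role, set()).add((permission, resource))

    def assign_role(self, user, role):
        self.user_roles.setdefault(user, set()).add(role)

    def is_allowed(self, user, permission, resource):
        """A user is allowed if any of their roles holds the permission."""
        return any(
            (permission, resource) in self.role_permissions.get(role, set())
            for role in self.user_roles.get(user, set())
        )


acl = AccessControl()
acl.grant_permission("analyst", "SELECT", "shop.orders")
acl.assign_role("alice", "analyst")
print(acl.is_allowed("alice", "SELECT", "shop.orders"))  # prints True
print(acl.is_allowed("alice", "MODIFY", "shop.orders"))  # prints False
```

Permissions are never attached to a user directly, so revoking access for a whole group means changing one role rather than every user.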
Decommission
The graceful removal of a node using `nodetool decommission`. The node streams its data to other nodes before shutting down.
mTLS
Mutual TLS. A security protocol where both the client and the server authenticate each other using certificates.
3. Operations Cheat Sheet
| Category | Command / Concept | Description |
|---|---|---|
| Node Mgmt | `nodetool status` | Check cluster state (UN, DN, Joining). |
| Node Mgmt | `nodetool decommission` | Safely remove a live node. |
| Node Mgmt | `nodetool removenode` | Force remove a dead node. |
| Node Mgmt | `nodetool cleanup` | Remove data no longer owned by node. |
| Repair | `nodetool repair` | Run full repair (heavy IO). |
| Repair | `nodetool repair -pr` | Repair primary range (recommended). |
| Backup | `nodetool snapshot` | Create hard-link backup. |
| Backup | `nodetool listsnapshots` | List active snapshots. |
| Backup | `nodetool clearsnapshot` | Delete snapshots. |
| Security | `cqlsh -u <user> -p <pass>` | Login with credentials. |
| Security | `nodetool --ssl` | Run nodetool with SSL enabled. |
| Files | `/var/lib/cassandra/data` | Default SSTable location. |
| Files | `/var/lib/cassandra/commitlog` | Write-ahead log location. |
4. Next Steps
Now that you can operate a cluster, you are ready to explore the rest of the course or dive deeper into specific topics.