Hinted Handoff
In a distributed system, nodes fail. It’s not a matter of if, but when. Network glitches, GC pauses, or hardware restarts happen all the time.
If a node is down when a write comes in, does the write fail? In Cassandra, no, provided the remaining replicas can still satisfy your Consistency Level.
Cassandra uses a mechanism called Hinted Handoff to store the write temporarily and deliver it when the node comes back.
1. How it Works
- Write Request: The client sends a write to a Coordinator Node.
- Failure Detection: The coordinator tries to forward the write to all replicas (e.g., Nodes A, B, and C).
- The “Hint”: If Node C is down (or unresponsive), the coordinator writes a “hint” locally.
- Success Response: If the coordinator can still satisfy the requested Consistency Level (e.g., `QUORUM` is met by Nodes A and B), it returns SUCCESS to the client. The client has no idea Node C missed the write.
- Handoff: When the coordinator detects (via Gossip) that Node C is back up, it “hands off” (replays) the stored hints to Node C.
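The flow above can be sketched as a toy simulation in plain Java (no driver involved; the node names, the RF=3 cluster, and the hard-coded quorum of 2 are all illustrative assumptions):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HandoffFlow {
    // Replica -> up/down status (illustrative three-replica cluster)
    static Map<String, Boolean> replicas = new LinkedHashMap<>();
    // Hints stored locally on the coordinator, keyed by target node
    static Map<String, List<String>> hints = new LinkedHashMap<>();

    /** Returns true if the write meets QUORUM (2 of 3 replicas ack). */
    static boolean write(String mutation) {
        int acks = 0;
        for (Map.Entry<String, Boolean> e : replicas.entrySet()) {
            if (e.getValue()) {
                acks++; // replica is up: it acknowledges the write
            } else {
                // replica is down: the coordinator stores a hint instead
                hints.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(mutation);
            }
        }
        return acks >= 2; // QUORUM for RF=3
    }

    /** Called when gossip reports the node is back: replay and clear its hints. */
    static List<String> handoff(String node) {
        replicas.put(node, true);
        return hints.remove(node);
    }

    public static void main(String[] args) {
        replicas.put("A", true);
        replicas.put("B", true);
        replicas.put("C", false); // Node C is down

        // QUORUM is met by A and B, so the client sees success
        System.out.println(write("INSERT ...") ? "SUCCESS" : "FAIL");
        // Node C comes back: its hints are replayed and removed
        System.out.println(handoff("C"));
    }
}
```

Note that the client never learns that Node C missed the write; only the coordinator's local `hints` map records the debt.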
What is a “Hint”?
A hint is not just a log entry. It is a specialized record stored in the hints directory on the coordinator. It contains:
- Target Node ID: The UUID of the node that missed the data.
- Hint ID: Unique identifier for the hint.
- Message Time: The timestamp of the original mutation.
- Mutation: The actual data (INSERT/UPDATE/DELETE) serialized as a blob.
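The four fields above can be modeled as a small Java record (a simplified illustration, not Cassandra's actual on-disk hint format; the serialized mutation is reduced to a plain byte array):

```java
import java.util.UUID;

/** Simplified model of a hint record (illustrative, not Cassandra's real format). */
public record Hint(UUID targetNodeId,   // the node that missed the data
                   UUID hintId,         // unique identifier for the hint
                   long messageTimeMs,  // timestamp of the original mutation
                   byte[] mutation) {   // the INSERT/UPDATE/DELETE, serialized as a blob

    public static void main(String[] args) {
        Hint h = new Hint(UUID.randomUUID(), UUID.randomUUID(),
                          System.currentTimeMillis(),
                          "INSERT INTO users ...".getBytes());
        System.out.println("Hint for node " + h.targetNodeId());
    }
}
```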
[!NOTE] Hinted Handoff is an optimization for availability. It helps the cluster recover consistency faster after a temporary outage. It is NOT a replacement for permanent repairs (like Anti-Entropy Repair).
2. Interactive Simulator: The Handoff
Visualize how a hint is stored and replayed.
3. Constraints and Limits
Hinted Handoff is awesome, but it’s not magic.
- Time Limit: Hints are not stored forever. By default, hints are stored for 3 hours (`max_hint_window_in_ms` in `cassandra.yaml`). If a node is down longer than this, the hints are discarded to save space. You must run a full repair (Anti-Entropy) when the node returns.
- Space Limit: Hints are stored on the coordinator’s disk. If the coordinator runs out of disk space, it stops saving hints.
- Performance Impact: Replaying hints puts extra load on the coordinator.
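The time limit boils down to a simple predicate. A minimal sketch in Java, assuming the default 3-hour window from `cassandra.yaml` (the method and variable names here are illustrative, not Cassandra internals):

```java
public class HintWindow {
    // Default max_hint_window_in_ms: 3 hours
    static final long MAX_HINT_WINDOW_MS = 10_800_000L;

    /**
     * Hints are only worth storing/replaying while the node has been
     * down for less than the configured window.
     */
    static boolean withinWindow(long nodeDownSinceMs, long nowMs) {
        return nowMs - nodeDownSinceMs < MAX_HINT_WINDOW_MS;
    }

    public static void main(String[] args) {
        long downSince = 0L;
        System.out.println(withinWindow(downSince, 3_600_000L));  // down 1 hour
        System.out.println(withinWindow(downSince, 14_400_000L)); // down 4 hours
    }
}
```

Once a node has been down past the window, hints stop accumulating, which is exactly why a full repair becomes mandatory after a long outage.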
4. Configuration & Code
While Hinted Handoff is a server-side feature, understanding its configuration is vital for operations.
cassandra.yaml Configuration
```yaml
# Enable or disable hinted handoff
hinted_handoff_enabled: true

# Max time to store hints (default 3 hours)
max_hint_window_in_ms: 10800000

# Directory where hints are stored
hints_directory: /var/lib/cassandra/hints
```
Client-Side: Specifying Consistency
Wait, if hinted handoff makes writes “durable”, does ConsistencyLevel.ANY mean my data is safe?
NO!
- `ConsistencyLevel.ANY`: Allows a write to succeed even if all replicas are down, as long as the coordinator can store a hint.
- Danger: If the coordinator crashes before replaying the hint, DATA IS LOST.
- Recommendation: Never use `ANY` for important data. Use at least `ONE` or `QUORUM`.
Java Example: Custom Retry Policy
If a node is down and hinted handoff is triggered, the client usually sees a success. But if too many nodes are down (cannot meet CL), the client gets an UnavailableException. You can handle this with a Retry Policy.
Below is a complete implementation of a custom Retry Policy that decides, per failure type, whether to retry or rethrow.
```java
import com.datastax.oss.driver.api.core.ConsistencyLevel;
import com.datastax.oss.driver.api.core.retry.RetryDecision;
import com.datastax.oss.driver.api.core.retry.RetryPolicy;
import com.datastax.oss.driver.api.core.servererrors.CoordinatorException;
import com.datastax.oss.driver.api.core.servererrors.WriteType;
import com.datastax.oss.driver.api.core.session.Request;
import edu.umd.cs.findbugs.annotations.NonNull;

public class MyCustomRetryPolicy implements RetryPolicy {

    @Override
    public RetryDecision onWriteTimeout(
            @NonNull Request request,
            @NonNull ConsistencyLevel cl,
            @NonNull WriteType writeType,
            int blockFor,
            int received,
            int retryCount) {
        // If we received enough acks but some were timeouts (Hinted Handoff might
        // handle it), we might want to ignore or return success based on business logic.
        // Example: if we are writing a log (CL=ONE) and it timed out, try once more.
        if (writeType == WriteType.SIMPLE && retryCount < 1) {
            return RetryDecision.RETRY_SAME;
        }
        // Default: re-throw the exception if we didn't meet the consistency level.
        return RetryDecision.RETHROW;
    }

    @Override
    public RetryDecision onReadTimeout(
            @NonNull Request request,
            @NonNull ConsistencyLevel cl,
            int blockFor,
            int received,
            boolean dataPresent,
            int retryCount) {
        // Retry a read timeout once on the same node, then give up.
        if (retryCount < 1) {
            return RetryDecision.RETRY_SAME;
        }
        return RetryDecision.RETHROW;
    }

    @Override
    public RetryDecision onUnavailable(
            @NonNull Request request,
            @NonNull ConsistencyLevel cl,
            int required,
            int alive,
            int retryCount) {
        // If not enough nodes are alive, fail fast.
        return RetryDecision.RETHROW;
    }

    @Override
    public RetryDecision onRequestAborted(
            @NonNull Request request,
            @NonNull Throwable error,
            int retryCount) {
        return RetryDecision.RETHROW;
    }

    @Override
    public RetryDecision onErrorResponse(
            @NonNull Request request,
            @NonNull CoordinatorException error,
            int retryCount) {
        return RetryDecision.RETHROW;
    }

    @Override
    public void close() {
        // No resources to close
    }
}
```
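To make the driver actually use this policy, it is registered in the driver's `application.conf` (DataStax Java driver 4.x configuration; use the fully qualified class name if the class lives in a package):

```
datastax-java-driver {
  advanced.retry-policy {
    class = MyCustomRetryPolicy
  }
}
```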
Go Example: Handling Unavailable
In Go, you check for specific errors.
```go
package main

import (
	"log"

	"github.com/gocql/gocql"
)

func main() {
	cluster := gocql.NewCluster("127.0.0.1")
	cluster.Consistency = gocql.Quorum
	session, err := cluster.CreateSession()
	if err != nil {
		log.Fatal("Cannot connect:", err)
	}
	defer session.Close()

	// Attempt a write
	err = session.Query("INSERT INTO logs (id, message) VALUES (?, ?)",
		gocql.TimeUUID(), "User logged in").Exec()

	if err != nil {
		// gocql surfaces server-side failures as typed request errors,
		// so inspect the concrete type rather than comparing sentinels.
		switch e := err.(type) {
		case *gocql.RequestErrUnavailable:
			log.Printf("Cluster cannot meet Consistency Level! Need %d replicas, only %d alive.",
				e.Required, e.Alive)
			// Logic: maybe retry with lower consistency if acceptable?
		case *gocql.RequestErrWriteTimeout:
			log.Println("Write timed out. Hinted Handoff may have stored it, or it failed.")
		default:
			log.Println("Other error:", err)
		}
	} else {
		log.Println("Write success!")
	}
}
```
5. Summary
- Hinted Handoff masks temporary node failures.
- The Coordinator stores writes (hints) for down nodes.
- Hints are replayed when the node recovers.
- Limitations: Hints expire (default 3 hours). They are not a replacement for backups or repair.
- Caution: `CL.ANY` relies entirely on hints and is risky.