RPC and gRPC: The Microservices Standard
[!TIP] Interview Tip: “REST is for Humans (Public APIs). gRPC is for Machines (Internal Microservices).” If you are building a public-facing service like Stripe, use REST. If you are building internal services like Uber’s ride matching, use gRPC (and protect it with Rate Limiting).
1. What is gRPC?
gRPC (gRPC Remote Procedure Call) is Google’s open-source framework for high-performance communication.
- Protocol Buffers (Protobuf): Binary serialization (not JSON).
- HTTP/2: Multiplexing and streaming built-in.
- Strict Contracts: You define the API in .proto files first.
2. The Power of Protobuf (vs JSON)
Why is gRPC often much faster than JSON over REST?
- Size: JSON repeats keys in every object ("name": "Alice", "name": "Bob"). Protobuf uses numbered tags (1: "Alice", 1: "Bob").
- Parsing: Parsing text (JSON) is CPU expensive. Parsing binary (Protobuf) is far cheaper.
Parsing text (JSON) requires scanning every character for quotes, colons, and brackets. Protobuf reads a tag, then a fixed-size or length-prefixed value, so the decoder moves from field to field without searching for delimiters.
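To make the size difference concrete, here is a minimal sketch that encodes the same value both ways. The Protobuf side is hand-rolled for illustration (real code would use classes generated from a .proto file), and it assumes name is a string field with tag number 1.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a Protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Encode a string field: tag (wire type 2), length, then UTF-8 bytes."""
    payload = text.encode("utf-8")
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(payload)) + payload


# Compact JSON (no spaces) versus the Protobuf wire format for the same value.
json_bytes = json.dumps({"name": "Alice"}, separators=(",", ":")).encode("utf-8")
proto_bytes = encode_string_field(1, "Alice")  # assumes `name` is field 1

print(len(json_bytes), json_bytes)             # 16 bytes
print(len(proto_bytes), proto_bytes.hex(" "))  # 7 bytes: 0a 05 41 6c 69 63 65
```

The key name costs 6 bytes in JSON (with quotes) and a single tag byte in Protobuf, and that saving repeats for every field of every message.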
3. The “Load Balancing” Nightmare
This is the most common gRPC interview trap. “How do you load balance gRPC?”
The Problem: Sticky Connections
- REST (HTTP/1.1): Each connection carries one request at a time, so clients open many connections and close or recycle them often. The Load Balancer (LB) can easily round-robin them across servers.
- gRPC (HTTP/2): Client opens One Persistent Connection, multiplexes every request over it, and keeps it open for days.
- If you put a standard L4 LB (AWS NLB) in the middle, it just forwards that one TCP connection to one server.
- Result: Server A gets 100% of traffic. Server B gets 0%. (See L4 vs L7 Load Balancing).
The Solutions
- L7 Load Balancing (Proxy): Use a smart proxy (e.g., Envoy, Nginx). It terminates the HTTP/2 connection, inspects individual requests, and distributes them. (Most common).
- Client-Side Balancing (Lookaside): The Client asks a Service Registry (e.g., Consul) for a list of IPs and connects to all of them, doing its own Round Robin. (Complex client logic).
L4 vs L7 Load Balancing
Why L4 fails for gRPC:
- L4: The balancer sees one persistent TCP connection and sticks to it. All requests follow that single connection to Server 1. Server 2 is idle.
- L7: The proxy opens connections to both servers. Requests are distributed evenly.
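The difference between the two modes can be sketched as a tiny simulation. This is not a real load balancer: the server names are hypothetical, and round robin stands in for whatever policy an L7 proxy would actually apply.

```python
from collections import Counter
from itertools import cycle

SERVERS = ["server-1", "server-2"]  # hypothetical backend names


def l4_balance(num_requests: int) -> Counter:
    """L4: the balancer picks a server once, when the TCP connection opens.

    A gRPC client holds a single connection, so every request lands there.
    """
    pinned = SERVERS[0]  # decided at connection time, never revisited
    return Counter(pinned for _ in range(num_requests))


def l7_balance(num_requests: int) -> Counter:
    """L7: the proxy picks a server for each request (round robin here)."""
    rotation = cycle(SERVERS)
    return Counter(next(rotation) for _ in range(num_requests))


l4_counts = l4_balance(100)
l7_counts = l7_balance(100)
print("L4:", dict(l4_counts))  # every request on server-1
print("L7:", dict(l7_counts))  # split evenly between the two servers
```

Client-side balancing behaves like the L7 function, except that the rotation lives inside the client instead of a proxy.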
4. Schema Evolution (Protobuf)
Protobuf is “Backward Compatible” because fields are identified by tag number, not by name or position.
- We start with a simple message.
- We add an email field under a new tag number.
- The encoded output grows, but the original bytes (Tag 1 and 2) stay exactly the same. Old clients skip the tag they do not recognize and can still read the name and ID!
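A rough sketch of that behavior, using a hand-written encoder and decoder rather than generated Protobuf classes. The field numbering is an assumption made for illustration: name is tag 1, id is tag 2, and the new email field is tag 3.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, pos: int) -> tuple:
    """Return (value, next_position) for the varint starting at pos."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def string_field(number: int, text: str) -> bytes:
    payload = text.encode("utf-8")
    return encode_varint((number << 3) | 2) + encode_varint(len(payload)) + payload


def int_field(number: int, value: int) -> bytes:
    return encode_varint((number << 3) | 0) + encode_varint(value)


def encode_user(name: str, user_id: int, email=None) -> bytes:
    """Assumed layout: name = 1, id = 2, email = 3 (added later)."""
    data = string_field(1, name) + int_field(2, user_id)
    if email is not None:
        data += string_field(3, email)
    return data


def old_client_decode(data: bytes) -> dict:
    """A decoder that only knows tags 1 and 2 and skips everything else."""
    known = {1: "name", 2: "id"}
    message, pos = {}, 0
    while pos < len(data):
        tag, pos = decode_varint(data, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:    # varint
            value, pos = decode_varint(data, pos)
        elif wire_type == 2:  # length-delimited
            length, pos = decode_varint(data, pos)
            value = data[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if number in known:   # unknown tags are read past and dropped
            message[known[number]] = value
    return message


v1 = encode_user("Alice", 150)
v2 = encode_user("Alice", 150, email="alice@example.com")
print(v1.hex(" "))
print(v2.hex(" "))
print(old_client_decode(v2))  # {'name': 'Alice', 'id': 150}
```

The second hex line begins with exactly the bytes of the first, which is why a client built against the old schema keeps working.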
System Walkthrough: The gRPC Call
When you run client.GetUser({id: 150}), what happens?
- Stub: Code generated from .proto takes your object.
- Serialization: Converts {id: 150} into 08 96 01 (Protobuf).
- Framing (HTTP/2): Wraps it in a DATA frame.
  - Adds a 5-byte prefix: [Compressed Flag (1 byte)] [Length (4 bytes)].
- Network: Sends over persistent TCP connection.
- Server: Decodes frame -> Deserializes Protobuf -> Calls actual Go/Java function.
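The serialization and framing steps can be reproduced by hand. This is a sketch of the byte layout only, not of a real gRPC client: it assumes id is field 1 of the request message, and the function names are made up for the example.

```python
import struct


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def serialize_get_user_request(user_id: int) -> bytes:
    """Protobuf-encode {id: user_id}, assuming `id` is field 1 (varint)."""
    return encode_varint((1 << 3) | 0) + encode_varint(user_id)


def frame_grpc_message(payload: bytes, compressed: bool = False) -> bytes:
    """Add the 5-byte prefix: compressed flag (1 byte) + big-endian length (4 bytes)."""
    return struct.pack(">BI", int(compressed), len(payload)) + payload


def unframe_grpc_message(frame: bytes) -> bytes:
    """Server side: read the prefix and return the Protobuf payload."""
    compressed, length = struct.unpack(">BI", frame[:5])
    if compressed:
        raise NotImplementedError("compression is out of scope for this sketch")
    return frame[5:5 + length]


payload = serialize_get_user_request(150)
frame = frame_grpc_message(payload)
print(payload.hex(" "))  # 08 96 01
print(frame.hex(" "))    # 00 00 00 00 03 08 96 01
```

The framed bytes are what travels inside the HTTP/2 DATA frame. The length prefix lets the receiver find message boundaries, which is what makes streaming several messages over one call possible.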
5. Can I use gRPC in the Browser?
No, not directly.
The Problem
gRPC relies heavily on HTTP/2 Trailers (headers sent after the body) to send the Status Code (e.g., grpc-status: 0).
Browser JavaScript APIs (fetch, XHR) generally do not give you access to HTTP/2 Trailers. If the request fails, the browser hides the specific gRPC error.
The Solution: gRPC-Web
gRPC-Web is a protocol that wraps the gRPC data in a way browsers can understand (often base64 encoded text). You need a “Translation Layer” (Proxy) in the middle.
Browser -> (gRPC-Web over HTTP/1.1 or 2) -> Proxy (translates the encoding) -> (gRPC over HTTP/2) -> gRPC server
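One way to picture the translation: because the browser cannot read trailers, gRPC-Web moves them into the response body as an extra frame whose flag byte has its highest bit set, and the text variant base64-encodes the whole body. The sketch below builds and parses such a body by hand. The helper names are hypothetical, and a real application would rely on a gRPC-Web client library rather than code like this.

```python
import base64
import struct

TRAILER_FLAG = 0x80  # highest bit of the flag byte marks a trailer frame


def make_frame(flag: int, payload: bytes) -> bytes:
    """Build one length-prefixed frame: flag (1 byte) + length (4 bytes) + payload."""
    return struct.pack(">BI", flag, len(payload)) + payload


def parse_grpc_web_body(body: bytes) -> tuple:
    """Split a gRPC-Web response body into (messages, trailers)."""
    messages, trailers = [], {}
    pos = 0
    while pos < len(body):
        flag, length = struct.unpack(">BI", body[pos:pos + 5])
        payload = body[pos + 5:pos + 5 + length]
        pos += 5 + length
        if flag & TRAILER_FLAG:
            # Trailers arrive as "key: value" lines, like HTTP/1 headers.
            for line in payload.decode("ascii").split("\r\n"):
                if line:
                    key, _, value = line.partition(":")
                    trailers[key.strip().lower()] = value.strip()
        else:
            messages.append(payload)
    return messages, trailers


# What a proxy might send back: one data frame, then the trailers as a frame.
body = make_frame(0x00, bytes.fromhex("089601")) + make_frame(TRAILER_FLAG, b"grpc-status: 0\r\n")
text_body = base64.b64encode(body)  # the base64 text variant of the same body

messages, trailers = parse_grpc_web_body(base64.b64decode(text_body))
print(messages)  # the Protobuf payload(s)
print(trailers)  # {'grpc-status': '0'}
```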
6. gRPC vs HTTP Status Codes
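gRPC doesn’t use 200/404 to report the outcome of a call. It uses its own Enum, sent in the grpc-status trailer. The table lists the closest HTTP equivalent for each status.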
| gRPC Status | HTTP Code | Meaning |
|---|---|---|
| OK (0) | 200 | Success. |
| INVALID_ARGUMENT (3) | 400 | Bad Request (Validation failed). |
| NOT_FOUND (5) | 404 | Resource missing. |
| PERMISSION_DENIED (7) | 403 | Caller is identified but not allowed to do this. |
| UNAUTHENTICATED (16) | 401 | Missing or invalid credentials (e.g., no token). |
| RESOURCE_EXHAUSTED (8) | 429 | Rate limit hit. |
| UNAVAILABLE (14) | 503 | Server down / Maintenance. |
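If a gateway has to translate gRPC results into HTTP responses, the table can be written down as a lookup. This is only a sketch covering the statuses listed above. The fallback to 500 for any other code is an assumption of this example, not something the table specifies.

```python
from enum import IntEnum


class GrpcStatus(IntEnum):
    """The subset of gRPC status codes listed in the table."""
    OK = 0
    INVALID_ARGUMENT = 3
    NOT_FOUND = 5
    PERMISSION_DENIED = 7
    RESOURCE_EXHAUSTED = 8
    UNAVAILABLE = 14
    UNAUTHENTICATED = 16


GRPC_TO_HTTP = {
    GrpcStatus.OK: 200,
    GrpcStatus.INVALID_ARGUMENT: 400,
    GrpcStatus.NOT_FOUND: 404,
    GrpcStatus.PERMISSION_DENIED: 403,
    GrpcStatus.UNAUTHENTICATED: 401,
    GrpcStatus.RESOURCE_EXHAUSTED: 429,
    GrpcStatus.UNAVAILABLE: 503,
}


def to_http_status(code: int) -> int:
    """Translate a gRPC status code; unlisted codes fall back to 500 (assumption)."""
    try:
        return GRPC_TO_HTTP[GrpcStatus(code)]
    except (ValueError, KeyError):
        return 500


print(to_http_status(5))   # 404
print(to_http_status(16))  # 401
print(to_http_status(2))   # 500, because code 2 is not in the table
```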
6.5 The Silent Killer: No Deadlines (Timeouts)
In microservices, if Service A calls B, and B calls C, and C hangs… the whole chain hangs. gRPC solves this with Deadlines (Context Propagation).
- Service A: “I need this done in 100ms.” (Sends request to B with grpc-timeout: 100m, where m means milliseconds).
- Service B: Takes 20ms to process, then calls Service C, forwarding the remaining time (80ms).
- Service C: Takes 90ms.
- Result: After waiting 80ms for C (100ms into the overall call), Service B cancels the request to C and returns DEADLINE_EXCEEDED to A. The system fails fast instead of hanging.
[!TIP] Always set Deadlines. The default is “Infinite”, which is a production outage waiting to happen.
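The A -> B -> C chain can be modelled with plain arithmetic to show how the budget shrinks at each hop. This is a simulation under stated assumptions, not gRPC code: the processing times are the ones from the example, nothing actually sleeps, and the exception class is a stand-in for the real status.

```python
class DeadlineExceeded(Exception):
    """Stands in for the gRPC DEADLINE_EXCEEDED status."""


def service_c(budget_ms: int) -> str:
    work_ms = 90  # C's processing time in the example
    if work_ms > budget_ms:
        raise DeadlineExceeded(f"C needed {work_ms}ms but only had {budget_ms}ms")
    return "result from C"


def service_b(budget_ms: int) -> str:
    own_work_ms = 20  # B's processing time in the example
    remaining_ms = budget_ms - own_work_ms
    if remaining_ms <= 0:
        raise DeadlineExceeded("B ran out of time before calling C")
    # Forward only what is left of the budget, never a fresh timeout.
    return service_c(remaining_ms)


def service_a(deadline_ms: int = 100) -> str:
    try:
        return service_b(deadline_ms)
    except DeadlineExceeded as exc:
        return f"DEADLINE_EXCEEDED: {exc}"


print(service_a(100))  # DEADLINE_EXCEEDED: C needed 90ms but only had 80ms
print(service_a(200))  # result from C
```

The important design choice is in service_b: it passes down the remaining budget instead of starting a new 100ms timer, so no service keeps working on a request whose caller has already given up.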
Summary: REST vs gRPC
| Feature | REST (Open API) | gRPC (Internal) |
|---|---|---|
| Payload | JSON (Text) | Protobuf (Binary) |
| Contract | Loose (Swagger) | Strict (.proto) |
| Streaming | Request/Response only | Bi-directional Streaming |
| Best For | Mobile apps, Public APIs | Microservices, High throughput |
[!IMPORTANT] gRPC-Web: Browsers cannot speak raw gRPC because they don’t have access to HTTP/2 trailers. You need a proxy like Envoy to translate between the browser and the gRPC backend.