Chain-of-Thought

[!IMPORTANT] Key Insight: LLMs do not “think” before they speak. They think while they speak. If you ask for a direct answer to a complex problem, the model has to commit to the result in its very first output tokens, each produced by a single forward pass with a fixed amount of computation. This often leads to errors.

Chain-of-Thought (CoT) is a technique that forces the model to generate intermediate reasoning steps before giving the final answer. This allows the model to decompose complex problems into simpler, manageable parts.

1. The Problem: System 1 vs System 2

In psychology, Kahneman’s System 1 is fast and intuitive, while System 2 is slow and analytical.

  • Standard Prompting is like System 1: Immediate, gut-reaction answers.
  • CoT Prompting induces System 2: Deliberate, step-by-step reasoning.

Failure Case: The Bat and Ball Problem

Consider this classic riddle:

“A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?”

Standard Prompt (Zero-Shot):

Q: A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?

A: 10 cents. (Incorrect)

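The model (like many humans) jumps to 10 cents because $1.10 - $1.00 = $0.10 is the first calculation that comes to mind. But if the ball cost 10 cents, the bat would cost $1.10 and the total would be $1.20, not $1.10.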

2. Zero-Shot CoT

The simplest way to enable reasoning is to append a magic phrase to your prompt:

“Let’s think step by step.”

Zero-Shot CoT Prompt:

Q: A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost? Let’s think step by step.

A: Let’s denote the cost of the ball as x. Then the cost of the bat is x + 1.00. The total cost is x + (x + 1.00) = 1.10. This means 2x + 1.00 = 1.10. Subtracting 1.00 from both sides, we get 2x = 0.10. Dividing by 2, we get x = 0.05. So the ball costs 5 cents. (Correct)
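The algebra in the answer above can be checked mechanically. The following is a minimal sketch rather than part of the prompt itself; it works in integer cents to avoid floating-point rounding, and the helper name `satisfies_riddle` is chosen here purely for illustration.

```python
def satisfies_riddle(ball_cents: int, total_cents: int = 110, diff_cents: int = 100) -> bool:
    """Return True if a ball price meets both conditions of the riddle."""
    bat_cents = ball_cents + diff_cents           # the bat costs $1.00 more than the ball
    return ball_cents + bat_cents == total_cents  # together they must cost $1.10


# Solve 2x + 100 = 110 for x, the ball price in cents.
ball_cents = (110 - 100) // 2

print(satisfies_riddle(10))          # intuitive answer: 10 + 110 = 120, so False
print(satisfies_riddle(ball_cents))  # step-by-step answer: 5 + 105 = 110, so True
```

Checking a candidate answer against every stated condition is the same move the step-by-step response makes when it writes out the equation instead of subtracting the two numbers it sees.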

[Figure: Standard Prompting goes straight from Q to A: "10 cents" (hallucination/guess). Chain-of-Thought goes from Q through Thought 1, Thought 2, and Thought 3 to A: "Let x = ball..." → "x + (x+1) = 1.1" → "2x = 0.1" → "5 cents".]

3. Why It Works

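An autoregressive model generates its output one token at a time, and each token gets roughly the same fixed amount of computation. CoT helps in two ways: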

  1. Computation Time: By forcing the model to output “The bat costs $1.00…”, you are giving it more “time” (tokens) to process the logic before it commits to the final number.
  2. State Management: The intermediate tokens serve as a “scratchpad”. The model attends to its own previous reasoning steps to generate the next step.
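The two points above can be sketched as a decoding loop. This is a toy illustration, not a real model: `scripted_model` is a hypothetical stand-in for one forward pass that simply replays a fixed list of reasoning steps, and each "token" here is a whole step for readability. The part that mirrors real decoding is the loop structure, where every call receives the full context, including everything generated so far.

```python
from typing import Callable, List


def generate(prompt: List[str], next_token: Callable[[List[str]], str],
             stop: str = "<end>", max_steps: int = 20) -> List[str]:
    """Autoregressive loop: one call to next_token per generated token."""
    context = list(prompt)
    for _ in range(max_steps):
        token = next_token(context)  # one "forward pass" sees the whole context
        if token == stop:
            break
        context.append(token)        # the output becomes input for later steps
    return context[len(prompt):]


# Hypothetical stand-in for a model: replays scripted reasoning steps.
SCRIPT = ["2x + 1.00 = 1.10", "2x = 0.10", "x = 0.05", "<end>"]


def scripted_model(context: List[str]) -> str:
    steps_so_far = len(context) - 1  # the prompt occupies one slot
    return SCRIPT[steps_so_far]


steps = generate(["Q: bat and ball"], scripted_model)
print(steps)
```

More intermediate tokens mean more passes through the loop before the final answer is emitted, and the growing `context` list plays the role of the scratchpad.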

4. Few-Shot CoT

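For production systems, relying on “Let’s think step by step” alone can be unreliable: the model decides for itself how to structure its reasoning, so the output format can change from one request to the next. Instead, use Few-Shot CoT: provide worked examples of the reasoning process you want.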

Prompt:

Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 2 * 3 = 6 tennis balls. 5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?
A:
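A few-shot CoT prompt like the one above is usually assembled from a list of exemplars rather than written by hand each time. The following is a minimal sketch, assuming the `Q:`/`A:` layout shown above and assuming every worked answer ends with "The answer is N."; the names `build_few_shot_prompt` and `extract_answer` are illustrative and do not come from any library.

```python
import re
from typing import List, Optional, Tuple

# Worked exemplars: (question, reasoning that ends with "The answer is N.")
EXEMPLARS: List[Tuple[str, str]] = [
    (
        "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
        "Each can has 3 tennis balls. How many tennis balls does he have now?",
        "Roger started with 5 balls. 2 cans of 3 tennis balls each is "
        "2 * 3 = 6 tennis balls. 5 + 6 = 11. The answer is 11.",
    ),
]


def build_few_shot_prompt(exemplars: List[Tuple[str, str]], question: str) -> str:
    """Join the worked exemplars, then leave the final 'A:' open for the model."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in exemplars]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


def extract_answer(completion: str) -> Optional[str]:
    """Return the number after the last 'The answer is', or None if absent."""
    matches = re.findall(r"The answer is\s+(-?\d+(?:\.\d+)?)", completion)
    return matches[-1] if matches else None


prompt = build_few_shot_prompt(
    EXEMPLARS,
    "The cafeteria had 23 apples. If they used 20 to make lunch "
    "and bought 6 more, how many apples do they have?",
)
print(prompt)
print(extract_answer(EXEMPLARS[0][1]))  # prints 11
```

Because the exemplars fix the closing sentence, the final answer can be pulled out with a simple pattern. If the model drifts from that format, `extract_answer` returns `None`, which gives the caller a clear signal to retry or fall back.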

5. Code Implementation

Python:

from openai import OpenAI

client = OpenAI()

def solve_math_problem(problem):
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful math tutor."},
            # Zero-Shot CoT injection
            {"role": "user", "content": problem + "\nLet's think step by step."}
        ],
        temperature=0.5
    )
    return response.choices[0].message.content

print(solve_math_problem("If I bake 3 cakes an hour for 5 hours, but eat 2, how many do I have?"))

Java:

import dev.langchain4j.model.openai.OpenAiChatModel;

public class CoTExample {
    public static void main(String[] args) {
        OpenAiChatModel model = OpenAiChatModel.builder()
            .apiKey(System.getenv("OPENAI_API_KEY"))
            .modelName("gpt-4-turbo")
            .build();

        String problem = "If I bake 3 cakes an hour for 5 hours, but eat 2, how many do I have?";
        // Append trigger phrase
        String prompt = problem + "\nLet's think step by step.";

        String response = model.generate(prompt);
        System.out.println(response);
    }
}

Go:

package main

import (
	"context"
	"fmt"
	"os"

	openai "github.com/sashabaranov/go-openai"
)

func main() {
	client := openai.NewClient(os.Getenv("OPENAI_API_KEY"))

	problem := "If I bake 3 cakes an hour for 5 hours, but eat 2, how many do I have?"
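	// Zero-Shot CoT: append the trigger phrase
	prompt := problem + "\nLet's think step by step."

	resp, err := client.CreateChatCompletion(
		context.Background(),
		openai.ChatCompletionRequest{
			Model: openai.GPT4,
			Messages: []openai.ChatCompletionMessage{
				{Role: openai.ChatMessageRoleUser, Content: prompt},
			},
		},
	)
	if err != nil {
		// On failure resp.Choices is empty, so indexing it below would panic
		fmt.Fprintln(os.Stderr, "chat completion failed:", err)
		os.Exit(1)
	}

	fmt.Println(resp.Choices[0].Message.Content)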
}
