Advanced Prompting

[!NOTE] Once you master Chain-of-Thought, the next frontier is structure and autonomy. Advanced techniques treat LLMs not just as text generators, but as reasoning engines that can plan, search, and use tools.

1. Self-Consistency

Chain-of-Thought is powerful, but it's "greedy": the model commits to a single sampled reasoning path. Self-Consistency improves accuracy by generating multiple reasoning paths and taking a majority vote over the final answers.

How it Works

  1. Prompt the model with CoT multiple times (e.g., 5 times).
  2. Use a non-zero temperature (e.g., 0.7) to ensure diversity.
  3. Extract the final answer from each path.
  4. Select the most common answer.
Q → Reasoning A (Ans: 10) / Reasoning B (Ans: 42) / Reasoning C (Ans: 42) → Majority Vote → 42
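The voting step above can be sketched in a few lines. Here `generate` is a hypothetical stand-in for any sampling-enabled model call (it should run with a non-zero temperature so the paths differ):

```python
from collections import Counter

def extract_answer(text):
    """Pull the text after the final 'Answer:' marker."""
    return text.rsplit("Answer:", 1)[-1].strip()

def self_consistency(generate, prompt, n=5):
    """Sample n CoT completions and return the majority-vote answer."""
    answers = [extract_answer(generate(prompt)) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]
```

With three sampled completions ending in `Answer: 10`, `Answer: 42`, `Answer: 42`, the vote returns `42`.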

2. Tree of Thoughts (ToT)

Tree of Thoughts generalizes CoT by exploring multiple reasoning paths as a tree search (BFS or DFS).

The Algorithm

  1. Decomposition: Break the problem into steps.
  2. Generation: Generate multiple candidates for the next step.
  3. Evaluation: Critique each candidate (State → Value).
  4. Search: Keep the best states, discard the bad ones, and backtrack if necessary.

This is famously used for creative writing or solving 24-game puzzles.
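The four steps can be sketched as a breadth-first beam search. `propose` (generate candidate next states) and `evaluate` (score a state) are hypothetical callbacks that you would back with LLM calls in practice:

```python
def tot_bfs(root, propose, evaluate, depth=3, beam=2):
    """Tree-of-Thoughts sketch: expand, score, and prune at each depth."""
    frontier = [root]
    for _ in range(depth):
        # Generation: expand every surviving state into candidates
        candidates = [s for state in frontier for s in propose(state)]
        # Evaluation + Search: keep only the top-`beam` states
        frontier = sorted(candidates, key=evaluate, reverse=True)[:beam]
    return max(frontier, key=evaluate)
```

As a toy stand-in for the 24-game, searching over "add 1 or double" moves with "closeness to 24" as the evaluator finds 24 from a root of 1 within five levels.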

3. ReAct (Reason + Act)

ReAct is the foundation of modern AI Agents. It combines Reasoning (thinking about what to do) with Acting (interacting with external tools/APIs).

The Loop

  1. Thought: The model analyzes the current state.
  2. Action: The model decides to call a tool (e.g., search_google("weather in Tokyo")).
  3. Observation: The tool executes and returns output (“25°C”).
  4. Repeat: The model uses this new information to think again.
Think → Act (Call API) → Observe → (loop until Answer)

4. Code Implementation (ReAct Loop)

```python
# Conceptual ReAct loop. `llm`, `parse_action`, and `tools` are
# placeholders for your model client and tool registry.
prompt = """
Question: What is the capital of France?
Thought: I should search for the capital of France.
Action: Search[capital of France]
Observation: Paris
Thought: The observation is Paris. I have the answer.
Answer: Paris
"""

def react_loop(question):
    history = f"Question: {question}\n"
    for _ in range(5):
        # 1. Generate the next step (a Thought followed by an Action or Answer)
        response = llm.generate(history + "Thought:")
        step = response.split("Observation:")[0].strip()  # never trust a hallucinated observation
        history += f"Thought: {step}\n"
        if "Answer:" in step:
            return step.split("Answer:", 1)[1].strip()
        if "Action:" in step:
            action = parse_action(step)
            # 2. Execute the tool
            result = tools.execute(action)
            # 3. Append the Observation so the model can reason over it
            history += f"Observation: {result}\n"
    return "Timeout"
```
```java
// LangChain4j Agents perform this loop automatically
import dev.langchain4j.service.AiServices;

interface Assistant {
    String chat(String message);
}

public class ReActExample {
    public static void main(String[] args) {
        Assistant assistant = AiServices.builder(Assistant.class)
                .chatLanguageModel(model)
                .tools(new CalculatorTool(), new WeatherTool()) // Tools available
                .build();

        // The model will automatically "Reason" -> "Call Tool" -> "Observe" -> "Answer"
        String answer = assistant.chat("What is 25 * 48?");
        System.out.println(answer);
    }
}
```
```go
// In Go, you often implement the loop manually or use frameworks like LangChainGo
func ReActLoop(ctx context.Context, question string) string {
	history := fmt.Sprintf("Question: %s\n", question)
	for i := 0; i < 5; i++ {
		// 1. Predict the next step (Thought + Action)
		resp, _ := client.CreateChatCompletion(...)
		content := resp.Choices[0].Message.Content
		history += content + "\n"

		// 2. Check for an Action
		if strings.Contains(content, "Action: Search") {
			query := extractQuery(content)
			// 3. Execute the tool and record the Observation
			result := searchTool(query)
			history += fmt.Sprintf("Observation: %s\n", result)
		} else if strings.Contains(content, "Answer:") {
			return extractAnswer(content)
		}
	}
	return "No answer"
}
```
