Lesson 2: ReAct Planning Loops

Most human tasks cannot be solved in a single thought. They require a loop of **Reasoning** and **Acting**. The **ReAct** (Reason + Act) framework, published by Yao et al. in 2022, teaches LLMs to alternate between reasoning traces (Thought) and execution steps (Action), adapting dynamically based on real-time feedback (Observation).

Why Linear Prompting Fails

When faced with multi-step logical problems, standard zero-shot prompting and even Chain-of-Thought (CoT) prompting share a critical flaw: error propagation. If the model makes a minor mathematical or factual error in Step 1, it receives no external signal with which to detect or correct it, so every subsequent step inherits the mistake.

ReAct introduces environmental feedback into the thinking process. The loop pauses generation after the model emits an Action, runs the requested tool, and feeds the result back as an Observation, so the model can adjust its subsequent reasoning steps.

1. THOUGHT

The model reasons about the user query and plans what it needs to discover next.

2. ACTION

The model decides to call a specific tool with concrete parameter inputs.

3. OBSERVATION

The code runs the tool and returns the raw environment result to the model.

4. REPEAT

The loop repeats until the model generates a "Final Answer" containing the solution.
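
The four steps above can be sketched without any API access by swapping the LLM for a scripted stand-in. This is a minimal illustration under stated assumptions, not the lesson's engine: `make_scripted_model`, `lookup_capital`, and `run_loop` are hypothetical names, and the canned replies stand in for real model output.

```python
from typing import Callable, Dict, List


def make_scripted_model(replies: List[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM: returns canned replies in order."""
    remaining = iter(replies)
    return lambda transcript: next(remaining)


def lookup_capital(country: str) -> str:
    """Toy tool holding a single hard-coded fact."""
    return "Paris" if country.strip().lower() == "france" else "unknown"


def run_loop(model: Callable[[str], str],
             tools: Dict[str, Callable[[str], str]],
             question: str,
             max_turns: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_turns):
        reply = model(transcript)                 # steps 1-2: Thought and Action
        transcript += reply + "\n"
        for line in reply.splitlines():
            if line.startswith("Final Answer:"):
                return line[len("Final Answer:"):].strip()
            if line.startswith("Action:"):
                # Assumed action syntax: tool_name[argument]
                name, _, rest = line[len("Action:"):].strip().partition("[")
                arg = rest.rstrip("]")
                tool = tools.get(name)
                observation = tool(arg) if tool else f"Error: unknown tool '{name}'"
                transcript += f"Observation: {observation}\n"   # step 3: Observation
                break
        # step 4: fall through and ask the model again
    return "No final answer within max_turns"


model = make_scripted_model([
    "Thought: I need the capital of France.\nAction: lookup_capital[France]",
    "Thought: I know the final answer.\nFinal Answer: Paris",
])
answer = run_loop(model, {"lookup_capital": lookup_capital},
                  "What is the capital of France?")
print(answer)
```

Because the model is just a callable that maps the transcript to the next reply, the same loop can later be pointed at a real LLM client without changing the control flow.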

Anatomy of a ReAct Prompt

The heart of a ReAct agent is its system prompt, which forces the model to follow a strict text-based format. If the model deviates from this syntax, the parsing code will fail to find the action. A typical ReAct system-instruction block looks like this:

You are an autonomous solver agent. Loop through Thought, Action, Observation steps.
You have access to these tools:
- calculate[expression]: Solve a mathematical formula.
- search_web[query]: Perform an index query search on Google.

Use the following strict format:

Thought: Write down what you need to do next.
Action: tool_name[arguments_string]
Observation: (This will be filled in by the environment, do not write this yourself)

... (repeat Thought/Action/Observation if needed)

Thought: I know the final answer.
Final Answer: The final output to the user.
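
One way to keep the tool list in the prompt from drifting away from the tools your code actually dispatches is to generate the prompt from a small table. The sketch below is an assumption about how you might organize this, not part of the lesson's reference code; `TOOL_SPECS`, `FORMAT_RULES`, and `build_system_prompt` are hypothetical names, and the wording is copied from the block above.

```python
from typing import Dict, Tuple

# Tool name -> (argument placeholder, description), copied from the prompt above.
TOOL_SPECS: Dict[str, Tuple[str, str]] = {
    "calculate": ("expression", "Solve a mathematical formula."),
    "search_web": ("query", "Perform an index query search on Google."),
}

FORMAT_RULES = """Use the following strict format:

Thought: Write down what you need to do next.
Action: tool_name[arguments_string]
Observation: (This will be filled in by the environment, do not write this yourself)

... (repeat Thought/Action/Observation if needed)

Thought: I know the final answer.
Final Answer: The final output to the user."""


def build_system_prompt(tool_specs: Dict[str, Tuple[str, str]]) -> str:
    """Assemble the role line, the tool list, and the format rules into one string."""
    lines = [
        "You are an autonomous solver agent. "
        "Loop through Thought, Action, Observation steps.",
        "You have access to these tools:",
    ]
    for name, (arg, description) in tool_specs.items():
        lines.append(f"- {name}[{arg}]: {description}")
    return "\n".join(lines) + "\n\n" + FORMAT_RULES


SYSTEM_PROMPT = build_system_prompt(TOOL_SPECS)
print(SYSTEM_PROMPT)
```

Adding a tool then means adding one entry to the table, and the prompt picks it up automatically.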

Implementing a ReAct Engine from Scratch

To understand ReAct, it is best to code the execution loop yourself in Python, avoiding bulky frameworks like LangChain. This loop uses regular expressions to catch actions, execute functions, and resume text generation:

import re
import google.generativeai as genai

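# Define local mock tools
def calculate(expression: str) -> str:
    # NOTE: disabling builtins narrows what eval can reach, but it is not a
    # real sandbox. Acceptable for a local mock; do not expose to untrusted input.
    try:
        return str(eval(expression, {"__builtins__": None}, {}))
    except Exception as e:
        return f"Error: {str(e)}"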

def search_web(query: str) -> str:
    if "population of tokyo" in query.lower():
        return "Tokyo population is estimated at 14,043,239 in 2024."
    return "No matching search results."

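# Core execution loop
# Assumes SYSTEM_PROMPT holds the instruction block from the previous section
# and that an API key has already been set, e.g. via genai.configure(api_key=...).
def run_react_agent(user_prompt: str, max_turns: int = 5) -> None:
    model = genai.GenerativeModel("gemini-1.5-flash")

    # Load the system instructions and the user query into a single running prompt
    history = SYSTEM_PROMPT + f"\n\nUser: {user_prompt}\n"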
    
    for turn in range(max_turns):
        print(f"\n--- TURN {turn + 1} ---")
        response = model.generate_content(history)
        model_output = response.text
        print(model_output)
        
        # Append thoughts to running context
        history += model_output
        
        # Check if the model has finalized the answer
        if "Final Answer:" in model_output:
            break
            
        # Parse the Action using Regex
        action_match = re.search(r"Action:\s*(\w+)\[([^\]]+)\]", model_output)
        if action_match:
            tool_name = action_match.group(1)
            tool_arg = action_match.group(2)
            
            # Execute corresponding tool
            if tool_name == "calculate":
                obs = calculate(tool_arg)
            elif tool_name == "search_web":
                obs = search_web(tool_arg)
            else:
                obs = f"Error: Tool '{tool_name}' does not exist."
                
            print(f"Observation: {obs}")
            history += f"\nObservation: {obs}\n"
        else:
            # Fallback if model failed to output a tool action
            print("No action found or formatting error. Stopping loop.")
            break
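
The mock `calculate` tool relies on `eval` with builtins disabled. That restriction can be bypassed through attribute access on literals, so it should not be treated as a sandbox. As one possible alternative, the sketch below walks the parsed expression and allows only numeric literals and basic arithmetic. `safe_calculate` is a hypothetical name, and the set of permitted operators is an assumption you would adjust to your needs.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node: ast.AST):
    """Recursively evaluate a node, refusing anything outside the whitelist."""
    if isinstance(node, ast.Constant):
        value = node.value
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return value
    elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    raise ValueError(f"Unsupported expression element: {type(node).__name__}")


def safe_calculate(expression: str) -> str:
    """Drop-in replacement for calculate() that never calls eval()."""
    try:
        tree = ast.parse(expression, mode="eval")
        return str(_evaluate(tree.body))
    except Exception as e:
        # Errors come back as text so the model can read them as an Observation.
        return f"Error: {e}"


print(safe_calculate("2 + 3 * 4"))
print(safe_calculate("__import__('os').getcwd()"))
```

Returning errors as strings rather than raising keeps the loop alive: the model sees the failure in the next Observation and can try a corrected expression.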

Interactive Playroom: ReAct Challenges

Navigate to your coding workspace on the right. You will build a custom ReAct parser and a guard that bounds the number of loop iterations:

[ ] Task 1: Complete the system prompt to explicitly prevent the model from generating its own Observation: lines.
[ ] Task 2: Write a robust regex parser extract_action(generation: str) -> (str, str) that can handle leading whitespace and newlines before the Action: token.
[ ] Task 3: Implement a safety counter inside the loop. If the iteration count exceeds 5, terminate the loop and raise a MaxIterationsReachedException.
[ ] Task 4: Execute a multi-turn prompt: *"Search for the population of Tokyo, multiply it by 1.12 using the calculator, and present the result."* Ensure the script successfully invokes both tools sequentially!
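
A pitfall worth anticipating in Task 4: the mock search result formats the population with thousands separators, and the calculator evaluates Python syntax. The sketch below, which reuses the lesson's mock `calculate` unchanged, shows what happens if the digits are passed through verbatim versus with commas removed. The variable names `raw` and `clean` are illustrative only.

```python
import math


def calculate(expression: str) -> str:
    # Same mock tool as in the lesson's engine.
    try:
        return str(eval(expression, {"__builtins__": None}, {}))
    except Exception as e:
        return f"Error: {str(e)}"


raw = "14,043,239 * 1.12"       # digits copied verbatim from the search Observation
clean = raw.replace(",", "")     # "14043239 * 1.12"

# The comma-grouped form is not a valid Python number ("043" has a leading zero),
# so the tool reports an error string instead of a product.
print(calculate(raw))

# With separators stripped, the product is approximately 15,728,427.68.
print(calculate(clean))
```

This is exactly the kind of failure ReAct is meant to recover from: the error arrives as an Observation, and the model can retry with a cleaned expression. Note that other comma-grouped numbers, such as `1,234`, would not raise an error at all and would be evaluated as a tuple, so stripping separators before calling the tool is the safer habit.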

Once your ReAct planner is executing and self-correcting reliably, step up to the next level of orchestration: Multi-Agent Collaboration, where separate specialized agents work as a cohesive team to solve complex problems.
