Skip to main content

7. Planning & Decision-Making

From "How" to "What"

So far, we have focused on the "how" of robotics: how to walk, how to grasp, how to see. This chapter moves up the stack to the "what": what should the robot do to achieve a goal? This is the realm of planning and decision-making, the cognitive engine that transforms a high-level objective into a sequence of executable actions.

This process is typically broken into two layers: Motion Planning and Task Planning.

Motion Planning: The Art of Not Colliding

Motion planning answers the question: "How do I get from point A to point B without hitting anything?"

Configuration Space (C-Space)

The key to modern motion planning is to stop thinking about the robot's complex shape in the 3D world. Instead, we think about the robot as a single point in a high-dimensional Configuration Space (C-Space).

  • Configuration: A "configuration" is a list of numbers that perfectly describes the robot's pose—typically, all its joint angles. For a 6-joint arm, the C-Space has 6 dimensions.
  • C-Obstacles: An obstacle in the real world (like a table) doesn't just occupy a 3D space; it creates a complex, high-dimensional "forbidden region" in C-Space. Any point inside this C-Obstacle corresponds to a robot configuration where it is colliding with the table.
  • The Goal: Motion planning is thus transformed into finding a valid path for a single point from a start configuration to a goal configuration, navigating around the C-Obstacles.

Sampling-Based Planners

For robots with many joints, C-Space is too vast to search completely. Sampling-based planners are the solution.

  • RRT (Rapidly-exploring Random Tree): This is the most popular algorithm. It works by "growing" a tree of valid configurations from the starting point. It randomly samples a point in C-space and tries to extend the nearest branch of the tree towards it. This process naturally biases exploration towards large, open spaces, allowing it to find a path quickly.
  • PRM (Probabilistic Roadmap): This algorithm first scatters thousands of random, valid points across C-Space. It then connects nearby points to form a "roadmap." Finally, it searches this roadmap for a path from start to goal.

Task Planning: The Logic of Action

Motion planning finds a single path. Task planning figures out the entire sequence of paths and actions needed to complete a complex goal. It answers: "Given the goal 'make coffee,' what are the steps?"

Behavior Trees (BTs): A Modular Approach

While classical planners using formal languages like PDDL exist, Behavior Trees have become extremely popular in robotics due to their modularity and reactivity. A BT is a tree of nodes where each node is a simple behavior, and the tree structure defines the logic of how they are executed.

The key node types are:

  • Sequence (→): Executes its children in order. If any child fails, the whole sequence fails. It's an "AND" operator.
  • Selector (?) : Tries each child in order until one succeeds. If all fail, the selector fails. It's an "OR" operator.
  • Action: Performs a task, like Move To Object or Close Gripper. It returns Success, Failure, or Running (if the action takes time).
  • Condition: Checks a state of the world, like Is Gripper Open? and returns Success or Failure.

This tree defines a robust "pick up" behavior. It first checks if it's already holding the object. If not, it makes sure the gripper is open (opening it if necessary), moves to a pre-grasp position, and finally closes the gripper. The modularity means you can create a library of these behaviors (Pick Up, Put Down, Walk To) and compose them into incredibly complex tasks.

Code Example: Conceptual Behavior Tree

A full BT implementation requires an execution engine, but we can represent the logical structure easily in code. This shows how simple functions can be combined to create emergent, intelligent behavior.

# --- Define individual actions and conditions (our leaf nodes) ---
def is_object_in_hand():
print("CHECK: Is object in hand?")
return False # Let's assume we start with an empty hand

def find_object():
print("ACTION: Finding the object...")
return True # Assume we found it

def move_to_object():
print("ACTION: Moving to the object...")
return True

def grasp_object():
print("ACTION: Grasping the object...")
return True

# --- Define control flow nodes ---
def selector(children):
"""Tries each child until one succeeds. An OR gate."""
for child in children:
if child():
print("SELECTOR: Child succeeded.")
return True
print("SELECTOR: All children failed.")
return False

def sequence(children):
"""Executes each child in order. An AND gate."""
for child in children:
if not child():
print("SEQUENCE: Child failed.")
return False
print("SEQUENCE: All children succeeded.")
return True

# --- Build and run the Behavior Tree ---
print("---
--- Executing Behavior Tree for 'Acquire Object' ---
")

# This is the high-level task.
# It's a selector: either we already have the object, OR we execute the pickup sequence.
acquire_object_task = lambda: selector([
is_object_in_hand,
lambda: sequence([
find_object,
move_to_object,
grasp_object
])
])

# Run the tree
acquire_object_task()