
Goal Oriented Action Planning (GOAP)

Nick Pedalino
7 December 2014



As the name implies, goal oriented action planning is a planning architecture in which game characters generate sequences of actions in an attempt to fulfill a goal. The sequence of actions depends on the goal as well as the current state of the world and of the character. Because the action plan depends not just on a goal but also on the world and character states, the same AI can produce a totally different sequence of actions even when pursuing the same goal.

The decision making is achieved by taking a finite state machine and removing the web of transitions (Figure 1).

Figure 1: FSM (Intel Developer Zone)

This turns all actions in the architecture into encapsulated, independent structures that can be added to and removed from the action queue. It takes away the complexity of modifying an FSM (adding an action, removing an action, etc.) and replaces it with a modular structure in which you can add and remove actions easily, without having to change any existing actions, because they are no longer linked to each other.
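To make that modularity concrete, here's a minimal sketch in C# (the language of my demo below) of what an encapsulated action might look like. The class and member names are my own for illustration, not from any particular GOAP library:

```csharp
using System.Collections.Generic;

// A self-contained GOAP action: it knows only its own cost, the facts that
// must be true before it can run, and the facts it makes true afterwards.
// It holds no references to other actions, so actions can be added or removed
// without touching the rest of the system.
public abstract class GoapAction
{
    // How expensive this action is for the planner to include in a plan.
    public float Cost { get; protected set; } = 1f;

    // Facts that must hold in the world/character state before running,
    // e.g. ("hasShovel", true).
    public Dictionary<string, bool> Preconditions { get; } = new Dictionary<string, bool>();

    // Facts this action makes true once it completes, e.g. ("holeDug", true).
    public Dictionary<string, bool> Effects { get; } = new Dictionary<string, bool>();

    // Runs one step of the action on the given agent; returns true when done.
    public abstract bool Perform(object agent);
}
```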

Pretty simple so far. Let's get into how the actions for an action sequence are chosen. Each action has a cost. Actions with lower costs take precedence; actions with higher costs are less preferable for the algorithm to put in an action sequence. If you are familiar with the A* algorithm for optimal pathfinding, GOAP uses a heuristic in the same way to find the most efficient "path" (action plan). For example, a plan of two or three expensive actions may be less optimal than a plan of five or six cheap ones.

Consider a scenario in which a character's goal is to dig a hole. The character does not have a shovel. The character has three actions to choose from: getShovel(), digWithHands(), and digWithShovel(). The character cannot digWithShovel() unless the character has a shovel. First, let's assign costs to these actions:
  • getShovel()  :  Cost = 4
  • digWithHands()  :  Cost = 10
  • digWithShovel()  :  Cost = 4
Let's outline all possible paths the character can choose, assuming the same goal of "make hole":
  • getShovel() + digWithShovel() = "make hole" (Total Cost: 8)
  • getShovel() + digWithHands()  = "make hole" (Total Cost: 14)
  • digWithHands() = "make hole" (Total Cost: 10)
Since getting a shovel and then digging with it is the cheapest way to produce the outcome of "make hole," our AI chooses that plan. Under only these conditions, the character will always make the same decision, and GOAP doesn't seem incredibly useful.
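As a toy illustration of that comparison (not a real planner, and all names are hypothetical), summing the costs over each candidate plan and picking the cheapest looks something like this:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class PlanCostDemo
{
    static void Main()
    {
        // Action costs from the example above.
        var costs = new Dictionary<string, int>
        {
            ["getShovel"] = 4,
            ["digWithHands"] = 10,
            ["digWithShovel"] = 4
        };

        // Candidate plans that all satisfy the "make hole" goal.
        var plans = new List<string[]>
        {
            new[] { "getShovel", "digWithShovel" }, // total 8
            new[] { "getShovel", "digWithHands" },  // total 14
            new[] { "digWithHands" }                // total 10
        };

        // The planner prefers the plan with the lowest total cost.
        var best = plans.OrderBy(p => p.Sum(a => costs[a])).First();
        Console.WriteLine(string.Join(" -> ", best)); // getShovel -> digWithShovel
    }
}
```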

Let's look at the same example, except this time introduce a world state. The world state is a collection of data that informs the AI of the state of the world, and its contents are taken into consideration when forming the path. Let's assume one piece of the world state is that there is no shovel near the character. Looking back at our possible paths for this scenario, we would choose the one with the smallest total cost that is feasible given the world state. getShovel() + digWithShovel() would be the immediate answer, but our world state does not allow it, as there is no shovel for the character to pick up. This leaves the character with the digWithHands() path, at a cost of 10.
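Continuing the toy sketch above (same hypothetical names, reusing its costs and plans), the world state can be checked to rule out infeasible plans before comparing costs:

```csharp
// With no shovel anywhere in the world, any plan containing getShovel()
// is infeasible. A real planner simulates preconditions and effects step
// by step; this sketch just filters on the one fact we care about.
var worldState = new Dictionary<string, bool> { ["shovelAvailable"] = false };

bool IsFeasible(string[] plan) =>
    !plan.Contains("getShovel") || worldState["shovelAvailable"];

var bestFeasible = plans
    .Where(IsFeasible)
    .OrderBy(p => p.Sum(a => costs[a]))
    .First();   // digWithHands, total cost 10
```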

The AI takes into consideration not only the world's state but also the character's state. Here's another example using the hole-digging scenario: the character has a state called "hasShovelsForHands," and in this state the digWithHands() action only costs 4. Looking back at our possible action sequences, the most efficient way to dig a hole is now clear: digWithHands().
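In code, this just means an action's cost can be a function of the character's state rather than a constant. A small sketch using the hypothetical hasShovelsForHands flag:

```csharp
// Sketch: digWithHands() gets cheaper when the character has shovels for hands.
int DigWithHandsCost(Dictionary<string, bool> characterState) =>
    characterState.TryGetValue("hasShovelsForHands", out var hasThem) && hasThem ? 4 : 10;

// With hasShovelsForHands = true, digWithHands (4) now beats
// getShovel + digWithShovel (8) as the cheapest route to "make hole".
```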

It's not the greatest example; GOAP is intended to be used for AIs with a lot of different actions to choose from, and it would be overkill to implement GOAP for our hole-digging example.


For a more in-depth overview of GOAP, on which the above summary is based, check out Brent Owens' tutorial: Goal Oriented Action Planning for a Smarter AI


David Pittman's master's thesis on practical GOAP helped me understand the inner workings of GOAP:


Figure 2:  Relationship between primary GOAP classes (David Pittman)

This diagram clearly illustrates the simple cycle that runs GOAP. The Planner takes in the AI's goal, the set of possible actions, and the world state. The Planner then hands the most efficient queue of actions to the AI, which carries out those actions. After carrying those actions out, or possibly after some type of interruption, the cycle repeats itself.
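A hedged sketch of that cycle in C# (the GoapAction type is the one sketched earlier; IGoapPlanner, GoapAgent, and the rest of the names are my own invention):

```csharp
using System.Collections.Generic;

public interface IGoapPlanner
{
    // Returns an ordered list of actions that achieves the goal given the
    // available actions and current world state, or an empty list if none exists.
    List<GoapAction> Plan(List<GoapAction> availableActions,
                          Dictionary<string, bool> worldState,
                          Dictionary<string, bool> goal);
}

public class GoapAgent
{
    private readonly Queue<GoapAction> currentPlan = new Queue<GoapAction>();

    public void Update(IGoapPlanner planner,
                       List<GoapAction> availableActions,
                       Dictionary<string, bool> worldState,
                       Dictionary<string, bool> goal,
                       object agent)
    {
        // Replan when the queue is empty (plan finished, failed, or interrupted).
        if (currentPlan.Count == 0)
        {
            foreach (var action in planner.Plan(availableActions, worldState, goal))
                currentPlan.Enqueue(action);
        }

        // Execute the next queued action; drop it once it reports completion.
        if (currentPlan.Count > 0 && currentPlan.Peek().Perform(agent))
            currentPlan.Dequeue();
    }
}
```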


Jeff Orkin's Applying Goal-Oriented Action Planning to Games gave some good insight into the abstract, almost nebulous way in which all GOAP Actions are modular nodes, independent of each other. 


Figure 3: GOAP plan formulation process

Look familiar? It's pathfinding, but as a means of reaching the optimal sequence of decisions. One could implement any number of pathfinding techniques, most notably A*, given the dynamic game conditions a meaningful GOAP implementation would have.
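To show what that search could look like, here's a hedged sketch of plan formulation as a graph search over simulated world states (essentially A* with a zero heuristic, i.e. uniform-cost search). It reuses the GoapAction sketch from earlier; everything else is illustrative rather than any particular implementation:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PlanSearch
{
    public static List<GoapAction> Plan(
        Dictionary<string, bool> start,
        Dictionary<string, bool> goal,
        List<GoapAction> actions)
    {
        // Each frontier entry is a simulated world state, the actions taken
        // to reach it, and the total cost so far.
        var frontier = new List<(Dictionary<string, bool> state, List<GoapAction> plan, float cost)>
        {
            (start, new List<GoapAction>(), 0f)
        };

        while (frontier.Count > 0)
        {
            // Expand the cheapest partial plan first.
            var node = frontier.OrderBy(n => n.cost).First();
            frontier.Remove(node);

            // Goal test: every goal fact is satisfied in this state.
            if (goal.All(g => node.state.TryGetValue(g.Key, out var v) && v == g.Value))
                return node.plan;

            foreach (var action in actions)
            {
                // Only actions whose preconditions hold here are applicable.
                bool applicable = action.Preconditions.All(
                    p => node.state.TryGetValue(p.Key, out var v) && v == p.Value);
                if (!applicable) continue;

                // Apply the action's effects to get the successor state.
                var next = new Dictionary<string, bool>(node.state);
                foreach (var e in action.Effects) next[e.Key] = e.Value;

                frontier.Add((next, new List<GoapAction>(node.plan) { action },
                              node.cost + action.Cost));
            }
        }

        // No plan satisfies the goal. (A production planner would also track
        // visited states to avoid re-expanding them.)
        return new List<GoapAction>();
    }
}
```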

GOAP is incredibly versatile; I could see it being used in a multitude of different genres. Realistic group combat tactics, The Sims, great stealth AI, combined with fuzzy logic to make even more realistic AI, any strategy game... the list goes on (most any game with moderately complex AI). I would not recommend GOAP for a game with rudimentary AI that does not need even this moderately complex structure, as it can be more time consuming than just making a non-GOAP AI that works just as well.



Also! Here's a small video of a demo of GOAP I put together in C#. Mind the current state as it pertains to GOAP in the bottom right corner, next to the current time. Actions like the food bowl refill and the mail truck driving by happen at certain times. Notice the dog buries the bone to relieve his boredom. This puts the world into a state in which there is no bone, so the bone is no longer an option to satiate his hunger. And if he's already eaten his food, then he must dig up his bone first, then chew it. (Also, a couple of seconds in... he's in his sleep state.)