G.O.A.P
Goal Oriented Action Planning
Simulation
In this simulation, you will see two drones: the red one is the "enemy", and the other uses the GOAP system. You will be able to observe which goals are completed, achieved, or not running, and you will see a new plan on the left side every time one can be executed.
For this simulation, I was allowed to use some of the models from our game project. The environment was created by Christoffer Ryrvall, and the two drones were designed by Mathilda Ekberg.
Basics
To understand the foundation of the GOAP (Goal-Oriented Action Planning) system, I first needed to explore which games use this approach, how it is implemented, and the core principles behind it. Through my research into the game F.E.A.R., I discovered that the system consists of three key components: a Goal, an Action, and an Agent.
Click here to see my git repository for GOAP
Goal
Represents what the agent wants to achieve. It contains a priority, used to determine which goal to pursue first, and a method to check whether the goal is achieved given the current world state.
Action
Represents a task that the agent can execute. It includes a check for preconditions, the conditions that must be met before the action can be performed, and applies effects that alter the world state once the action is executed. Each action is associated with a cost, which is considered during the planning process to determine the optimal sequence of actions.
Agent
It is responsible for planning and executing actions, maintaining a list of available actions and goals. The planning process uses the A* algorithm to determine the optimal sequence of actions needed to achieve the highest-priority goal. Once a plan is formed, the executePlan method carries out the actions while updating the world state accordingly.
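To make the structure concrete, here is a minimal sketch of what these three interfaces can look like in C++. The method names AchievedState, CheckPreconditions, ApplyEffects, plan, and executePlan match the ones described on this page; the rest of the layout (member names, exact signatures) is illustrative.

```cpp
#include <memory>
#include <string>
#include <vector>

class WorldState;  // placeholder for this sketch; the real type is shown in the World states section below

// A goal has a priority and can tell whether it is achieved in a given world state.
class Goal
{
public:
    Goal(std::string aName, int aPriority) : myName(std::move(aName)), myPriority(aPriority) {}
    virtual ~Goal() = default;

    virtual bool AchievedState(const WorldState& aState) const = 0;
    int GetPriority() const { return myPriority; }

protected:
    std::string myName;
    int myPriority;
};

// An action has preconditions, effects on the world state, and a cost used by the planner.
class Action
{
public:
    Action(std::string aName, float aCost) : myName(std::move(aName)), myCost(aCost) {}
    virtual ~Action() = default;

    virtual bool CheckPreconditions(const WorldState& aState) const = 0;
    virtual void ApplyEffects(WorldState& aState) const = 0;
    float GetCost() const { return myCost; }

protected:
    std::string myName;
    float myCost;
};

// The agent keeps shared references to its available actions and goals,
// builds a plan with A*, and executes it while updating the world state.
class Agent
{
public:
    void AddAction(std::shared_ptr<Action> anAction) { myActions.push_back(std::move(anAction)); }
    void AddGoal(std::shared_ptr<Goal> aGoal) { myGoals.push_back(std::move(aGoal)); }

    void plan(const WorldState& aCurrentState);  // A* over actions (see "A deeper dive")
    void executePlan(WorldState& aState);        // carries out the current plan

private:
    std::vector<std::shared_ptr<Action>> myActions;
    std::vector<std::shared_ptr<Goal>> myGoals;
    std::vector<std::shared_ptr<Action>> myCurrentPlan;
};
```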
The smart pointers are used for the cached resources primarily for three reasons:
First, resource sharing: multiple agents may need to reference the same actions or goals, eliminating the need for redundant copies.
Second, cache efficiency: since actions and goals remain unchanged after creation, storing them once and sharing them improves performance.
Lastly, memory management: shared pointers automatically handle deallocation when no references remain, reducing the risk of memory leaks.
World states
To unify all components, a World State is used, allowing the agent to track environmental changes and plan actions accordingly. Each actor maintains its own copy of the World State for several reasons:
Local perception: every agent perceives the world differently, so having individual copies ensures they can make decisions based on their own perspective.
Predictive planning: during the planning phase, agents can simulate different world states without affecting the actual game state.
Performance: local copies prevent the need for constant locking, which would be required if all agents shared a single World State.
In my implementation, the World State is defined as:
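The exact snippet is not reproduced on this page, but sketched out it amounts to something like this (every enum value except PlantedTheBomb is just an example):

```cpp
#include <unordered_map>
#include <variant>

// Enum keys prevent spelling errors when reading and writing the state.
// Only PlantedTheBomb is referenced elsewhere on this page; the rest are examples.
enum class WorldName
{
    PlantedTheBomb,
    AtBombSite,
    EnemyVisible,
    InCover
};

// A value can hold an integer, a float, or a boolean.
using WorldStateValue = std::variant<int, float, bool>;

// The world state maps each key to its current value.
using WorldState = std::unordered_map<WorldName, WorldStateValue>;
```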
Here, WorldName is an enum designed to prevent spelling errors, while WorldStateValue is the value type stored in the unordered map and can hold integers, floats, and booleans. Looking back, I should have made this a const reference, as it would have simplified handling if multiple AIs were using it. However, since I originally intended to use only one AI, I decided against it, as the performance impact on the program was negligible.
Creating the AI
When developing my GOAP AI, I first needed to establish a theme to guide its design. After revisiting Counter-Strike 2, I initially assumed that creating an AI focused on planting a bomb and attacking enemies would be a straightforward task. However, I quickly realized that the complexity of AI behavior extends far beyond these basic actions.
Designing the behavior
To begin the design process, I delved into Valve's documentation on how their bots are structured to gain some inspiration. Given the limited time I had to develop my AI, I opted for a simpler version to better explore the GOAP system. After several iterations and becoming more familiar with the system, I realized that additional goals and actions were necessary to make the AI feel more dynamic and capable of decision-making, rather than simply moving from point A to point B.
Goals
Plant Bomb
Prepare to Attack
Eliminate Enemy
Retreat to Cover
Navigate to Bomb Site
Seek Cover
Find Peek Position
Actions
Hide
Attack
Peek
Retreat
Move to Bomb Site
Move to Cover
Move to Peek Position
Hold Position
Plant Bomb
Adding the world states
The required world states were those that applied not only to actions but also to goals, allowing the AI to gather information about how the environment is evolving and adapt its behavior accordingly.
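As a small sketch, the agent's starting state can be filled in using the world-state types from above (every key except PlantedTheBomb is an example):

```cpp
// Everything the AI needs to reason about starts out as "not yet true".
WorldState state;
state[WorldName::PlantedTheBomb] = false;
state[WorldName::AtBombSite] = false;
state[WorldName::EnemyVisible] = false;
state[WorldName::InCover] = false;
```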
To create a goal, you need to inherit from the Goal class and define the desired behavior. For instance, the BombPlanted goal inherits from the Goal class and specifies the conditions under which the goal is considered achieved.
In the code below, the constructor BombPlanted() calls the base class Goal constructor and assigns the name "BombPlanted" along with a priority value of 3. The AchievedState function is then overridden to check whether the world state indicates that the bomb has been planted.
The function searches for the PlantedTheBomb key in the world state. If the key is found and its value is a boolean, the function returns that value. If the key is not present or the value is not of type bool, it defaults to returning false. This allows the AI to determine whether the bomb planting goal has been successfully achieved based on the current state of the world.
Creating a goal
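A sketch of what that class can look like, built on the Goal base and world-state types sketched earlier:

```cpp
// Goal that is achieved once the bomb has been planted.
class BombPlanted : public Goal
{
public:
    BombPlanted() : Goal("BombPlanted", 3) {}

    bool AchievedState(const WorldState& aState) const override
    {
        // Look up the PlantedTheBomb key in the world state.
        auto it = aState.find(WorldName::PlantedTheBomb);
        if (it != aState.end())
        {
            // Only trust the value if it actually holds a bool.
            if (const bool* planted = std::get_if<bool>(&it->second))
            {
                return *planted;
            }
        }
        // Key missing or not a bool: the goal is not achieved.
        return false;
    }
};
```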
Creating an action
When defining an action, it is essential to specify both a name and a cost. The name provides an identifier for the action, while the cost determines its relative value or priority within the planning process, influencing the decision-making of the AI.
The CheckPreconditions function is used during the planning phase. It determines whether the conditions that must be met for a plan to be valid are satisfied. By performing these checks, the AI becomes more responsive and dynamic, creating a sense of liveliness. The ability to rule out certain behaviors, or exclude them from the plan based on preconditions, ensures that the AI's actions are purposeful.
Moreover, the ApplyEffects function demonstrates how the world state is modified when a plan is executed. It plays an essential role in updating the environment, reflecting the AI’s actions and progress.
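As a concrete example, here is a sketch of what a Plant Bomb action could look like, reusing the classes above (the precondition key AtBombSite and the cost of 1.0 are placeholders):

```cpp
// Action that plants the bomb once the agent has reached the bomb site.
class PlantBombAction : public Action
{
public:
    PlantBombAction() : Action("PlantBomb", 1.0f) {}

    bool CheckPreconditions(const WorldState& aState) const override
    {
        // The plan is only valid if the agent is already at the bomb site.
        auto it = aState.find(WorldName::AtBombSite);
        if (it == aState.end())
            return false;
        const bool* atSite = std::get_if<bool>(&it->second);
        return atSite && *atSite;
    }

    void ApplyEffects(WorldState& aState) const override
    {
        // Planning (and execution) marks the bomb as planted in the world state.
        aState[WorldName::PlantedTheBomb] = true;
    }
};
```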
Creating an agent
For the Agent, it is enough to define it as a variable; within the constructor, you then specify the actions and goals associated with that particular AI. This ensures that the agent is equipped with the necessary behaviors and objectives for its decision-making process.
In the AI update, you can determine how frequently the GOAP system should revise its plan or continue along the current one. For an FPS AI, I wanted it to replan often, because the position and distance information needs to be refreshed frequently.
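A rough sketch of how that can be wired up, with a simple timer controlling how often the agent replans (the DroneAI name and the 0.2 second interval are just examples):

```cpp
// Ties the pieces together: the AI owns an Agent and registers the goals and
// actions it is allowed to use (BombPlanted and PlantBombAction are the ones
// sketched above; the remaining ones would be registered the same way).
class DroneAI
{
public:
    DroneAI()
    {
        myAgent.AddGoal(std::make_shared<BombPlanted>());
        myAgent.AddAction(std::make_shared<PlantBombAction>());
        // ...remaining goals and actions...
    }

    void Update(float aDeltaTime, WorldState& aState)
    {
        myReplanTimer -= aDeltaTime;
        if (myReplanTimer <= 0.0f)
        {
            myAgent.plan(aState);     // rebuild the plan from fresh position/distance data
            myReplanTimer = 0.2f;     // illustrative replanning interval
        }
        myAgent.executePlan(aState);  // keep following the current plan
    }

private:
    Agent myAgent;
    float myReplanTimer = 0.0f;
};
```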
The plan function uses the A* algorithm, traditionally used for pathfinding, which is an interesting way to handle GOAP. By adapting A* to the planning phase, the AI can determine the optimal sequence of actions needed to achieve its highest-priority goal.
A deeper dive
For those who wish to dive deeper into the planning phase, I can explain it a bit more.
The function starts by sorting the agent's goals based on priority. This ensures that the AI first attempts to achieve the most important goal before moving on to others.
For each goal that has not yet been achieved, a priority queue, openSet, is initialized. This queue holds potential plans, sorted by their cost, with lower costs prioritized.
The visitedStates set keeps track of already explored world states to avoid unnecessary loops and improve performance.
The algorithm continuously evaluates the action plan with the lowest cost. If the goal’s target world state is achieved, the current plan is returned.
Otherwise, the algorithm iterates through all possible actions, checking if their preconditions are met. If so, the action’s effects are applied to the world state, and the new state is added to the priority queue along with the updated plan and cost.
To prevent looping over already explored world states, the algorithm keeps track of visited states. If a state has already been explored, it is skipped.
Planning
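Condensed into code, the search described above can be sketched roughly like this, reusing the classes from the earlier sketches. It is written as a free function to keep the sketch self-contained, and without a heuristic term it behaves like uniform-cost A*, which is enough to find the cheapest action sequence:

```cpp
#include <algorithm>
#include <memory>
#include <queue>
#include <vector>

// One candidate in the open set: a simulated world state, the actions that lead to it,
// and the accumulated cost of those actions.
struct PlanNode
{
    WorldState state;
    std::vector<std::shared_ptr<Action>> actions;
    float cost = 0.0f;
};

// Orders the priority queue so that the cheapest plan is expanded first.
struct HigherCost
{
    bool operator()(const PlanNode& a, const PlanNode& b) const { return a.cost > b.cost; }
};

std::vector<std::shared_ptr<Action>> Plan(std::vector<std::shared_ptr<Goal>> goals,
                                          const std::vector<std::shared_ptr<Action>>& actions,
                                          const WorldState& currentState)
{
    // Sort goals so the highest-priority goal is attempted first.
    std::sort(goals.begin(), goals.end(),
              [](const auto& a, const auto& b) { return a->GetPriority() > b->GetPriority(); });

    for (const auto& goal : goals)
    {
        if (goal->AchievedState(currentState))
            continue;  // this goal is already met, try the next one

        std::priority_queue<PlanNode, std::vector<PlanNode>, HigherCost> openSet;
        openSet.push({ currentState, {}, 0.0f });

        // States already expanded; a linear search keeps the sketch short.
        std::vector<WorldState> visitedStates;

        while (!openSet.empty())
        {
            PlanNode node = openSet.top();
            openSet.pop();

            // Goal reached in the simulated state: this sequence of actions is the plan.
            if (goal->AchievedState(node.state))
                return node.actions;

            // Skip world states we have already explored to avoid loops.
            if (std::find(visitedStates.begin(), visitedStates.end(), node.state) != visitedStates.end())
                continue;
            visitedStates.push_back(node.state);

            // Expand: try every action whose preconditions hold in this simulated state.
            for (const auto& action : actions)
            {
                if (!action->CheckPreconditions(node.state))
                    continue;

                PlanNode next = node;
                action->ApplyEffects(next.state);  // simulate the action's effects
                next.actions.push_back(action);
                next.cost += action->GetCost();
                openSet.push(std::move(next));
            }
        }
    }

    // No goal could be planned for.
    return {};
}
```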
Results
The final outcome is the video at the beginning of this webpage. Throughout the development process, I encountered several challenges. One key improvement would have been implementing the simulation in 2D rather than 3D. While constructing the level, designing steering behaviors, and understanding NVIDIA PhysX were both an engaging challenge and a valuable learning experience, these tasks required significant time, time that could have been allocated to developing additional AI agents using the GOAP system. Additionally, I would have made more optimizations to enhance the efficiency of the code, making sure it could effectively support multiple AI agents rather than just one.