Adding Plan-and-Execute Planner
All sources can be found in our GitHub history.
When using LLMs for complex tasks like hacking, a common problem is that they become hyper-focused on a single attack vector and ignore all others. They go down a "depth-first" rabbit hole and never leave it. Both I and others have experienced this.
Plan-and-Execute Pattern
One potential solution is the 'plan-and-solve' pattern (often also called the 'plan-and-execute' pattern). In this strategy, one LLM (the `planner`) is given the task of creating a high-level task plan based on the user-given objective. The task plan is processed by another LLM module (the `agent` or `executor`): the next step is taken from the task plan and forwarded to the executor, which tries to solve it within a limited number of steps or amount of time.
The executor's result is passed back to another LLM module (the `replan` module) that updates the task plan with the new findings and, if the overall objective has not been achieved yet, calls the executor agent with the next task step. The `replan` and `plan` LLM modules are typically very similar to each other, as we will see in our code example later.
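The loop described above can be sketched in a few lines of plain Python. Everything here is a stub standing in for real LLM calls; the function names and return shapes are our own simplification, not the code from this post:

```python
# Minimal plan-and-execute loop with stub "LLM" modules.

def planner(objective: str) -> list[str]:
    # A real planner would ask an LLM for a step list.
    return [f"recon for: {objective}", f"exploit for: {objective}"]

def executor(step: str) -> str:
    # A real executor would run a (tool-using) agent on the step.
    return f"result of '{step}'"

def replanner(objective: str, plan: list[str], past_steps: list[tuple[str, str]]):
    # A real replanner would update the plan or decide to finish.
    if not plan:                   # nothing left to do -> final response
        return None, f"done: {objective}"
    return plan, None              # keep the remaining plan, continue

def plan_and_execute(objective: str) -> str:
    plan = planner(objective)
    past_steps: list[tuple[str, str]] = []
    while True:
        step = plan.pop(0)                       # take the next task step
        past_steps.append((step, executor(step)))
        plan, response = replanner(objective, plan, past_steps)
        if response is not None:                 # replanner answered the user
            return response
```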
An advanced version is Gelei's Penetration Task Tree, detailed in the PentestGPT paper.
Let's build a simple plan-and-execute prototype, highly influenced by the LangGraph plan-and-execute example.
The High-Level Graph
One benefit of using this blog to document our journey is that we can give the explanation in a non-linear order (with regard to the source code).
Let's start with the overall graph as defined through `create_plan_and_execute_graph`:
The overall flow is defined in line 94 and following. You can see the mentioned nodes: `planner`, `agent` (the executor), and `replan`, and a graph that follows the outline described in the introduction.
`should_end` (line 75) is the exit condition: if the replanner does not call the sub-agent (`agent`), it can only send a message to the initial human (within the field `response`). The function detects this response and subsequently exits the graph.
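A rough sketch of what such an exit condition can look like (the state layout mirrors the description above; the real implementation may differ in detail):

```python
from typing import TypedDict

class State(TypedDict, total=False):
    response: str    # only set once the replanner answers the user
    plan: list[str]  # remaining task steps

def should_end(state: State) -> str:
    # If the replanner produced a final response, leave the graph;
    # otherwise route back to the executor agent node.
    return "END" if state.get("response") else "agent"
```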
Shared State
The shared state describes the data that is stored within the graph, i.e., the data that all our nodes will have access to. It is defined through `PlanExecute`:
graphs/plan_and_execute.py: Shared State
We store the following data:

- `input`: the initially given user question/objective, e.g., "I want to become root"
- `response`: the final answer given by our LLM to the user question
- `plan`: a (string) list of planning steps that need to be performed to hopefully solve the user question
- `past_steps`: a list of already performed planning steps. In our implementation, this also contains a short summary (given by the execution agent) of the operations performed by the execution agent, stored for each past step.
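Based on these field descriptions, the shared state can be sketched as a TypedDict; the field names follow the list above, while the concrete types are our assumption:

```python
from typing import TypedDict

class PlanExecute(TypedDict, total=False):
    input: str                          # initial user objective
    response: str                       # final answer to the user
    plan: list[str]                     # remaining planning steps
    past_steps: list[tuple[str, str]]   # (step, executor summary) pairs
```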
Graph Nodes/Actions
`planner` and `replan` are implemented through `plan_step` and `replan_step` respectively. The `agent` (or executor) is passed in as the `execute_step` function parameter, as this allows us to easily reuse the generic plan-and-execute graph for different use cases.
Planner
Let's look at the planner next. It is implemented as an LLM call using `llm.with_structured_output` to allow for automatic output parsing into the `Plan` data structure:
graphs/plan_and_execute.py: Plan data structure
The output is thus a simple string list with the different future planning steps. The LLM prompt itself is defined as:
This is rather generic. The initial user question will be passed in as the first message within `{messages}`, and that's more or less it.
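As a sketch, the `Plan` structure boils down to a wrapper around a list of step strings. We use a plain dataclass here instead of the Pydantic model the real code likely uses, and the prompt text is our paraphrase, not the original wording:

```python
from dataclasses import dataclass, field

@dataclass
class Plan:
    """Structured planner output: an ordered list of task steps."""
    steps: list[str] = field(default_factory=list)

# Paraphrased planner prompt (not the original wording).
PLANNER_PROMPT = (
    "For the given objective, come up with a simple step-by-step plan. "
    "Each step should be a self-contained task; do not add superfluous steps."
)

# With LangChain, this structure would be wired up roughly as:
#   planner = prompt | llm.with_structured_output(Plan)
```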
Replanner
The result of the replanner node's action/step will be the following:
graphs/plan_and_execute.py: Replanner data structure
So it's either a user `Response` (consisting of a string) signalling that we have finished, or an updated `Plan` (the previously mentioned list of strings) which the `executor` will act upon next.
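This either/or output can be sketched as a union of two small types; the names follow the text above, while the exact definitions are our assumption:

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Response:
    """Final answer sent back to the user."""
    response: str

@dataclass
class Plan:
    """Updated list of remaining task steps."""
    steps: list[str]

# The replanner returns one or the other.
Act = Union[Response, Plan]

def is_finished(action: Act) -> bool:
    # A Response means the graph should terminate.
    return isinstance(action, Response)
```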
Let's look at the prompt:
The prompt's input is the initial objective (`input`), the current plan containing all future high-level task steps (`plan`), and a list of previously executed planning steps (`past_steps`). In our implementation, each past step also contains an LLM-derived summary of the actions performed by the `executor` while trying to solve the planning step, as well as its results. This should help the `replan` agent to better update subsequent plans.
We also tell the LLM to stop after 15 high-level task steps and give a final summary to the user. If the objective has been solved before, the LLM will detect this too and auto-magically stop execution.
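Put together, a replanner prompt template along these lines would do the job; the wording below is our paraphrase, not the prompt from this post:

```python
# Paraphrased replanner prompt; the real wording differs.
REPLANNER_PROMPT = """Your objective was:
{input}

Your current plan of future high-level task steps is:
{plan}

You have already executed the following steps (with result summaries):
{past_steps}

Update the plan accordingly. Only include steps that still NEED to be done.
If the objective has been achieved, or more than 15 high-level task steps
have been used, respond to the user with a final summary instead."""

filled = REPLANNER_PROMPT.format(
    input="I want to become root",
    plan="1. enumerate SUID binaries",
    past_steps="scanned open ports -> found ssh on port 22",
)
```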
Agent/Executor
The `executor` node/function is passed into our generic graph as a callback function. This allows us to easily modify the generic graph to solve different objectives with their respective specialized executor agents.
Let's start with our simple implementation:
We are reusing our initial simple agent as executor on line 46. On lines 29-31 we create a new connection to OpenAI and configure some SSH-based tools (as mentioned in the original post) for our executor agent. This fully separates the LLM connection, graph history and supported tools from the LLM configuration used by the plan-and-execute graph, and would allow for using different LLMs for the `planner` and `executor` respectively.
Starting on line 49, we execute our sub-agent and output its steps before returning the final step on line 59 as `past_steps`. This will append our agent's output (which includes a generated summary of its results) to `past_steps` within our shared state (which will subsequently be used by the `replanner` agent to refine future planning steps).
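The shape of such an `execute_step` callback can be sketched as follows; the sub-agent here is a stub, whereas the real version runs our simple SSH-tool-equipped agent and streams its intermediate steps:

```python
def execute_step(state: dict) -> dict:
    """Run the sub-agent on the next plan step; return a state update."""
    task = state["plan"][0]

    # Stub standing in for the real sub-agent invocation; the real
    # implementation executes the agent, prints its intermediate steps,
    # and keeps the agent's final summary message.
    summary = f"executed '{task}' and summarized the results"

    # Returning only `past_steps` appends this entry to the shared state.
    return {"past_steps": [(task, summary)]}
```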
Wiring it up and starting it
The only thing left is to wire everything up, provide the initial template, and output the occurring events (so we can see what our LLM agent is doing):
And that's it! Enjoy your multi-agent driven plan-and-execute architecture!
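The event-printing driver amounts to a small streaming loop. `FakeApp` below stands in for the compiled graph so the sketch is self-contained; with a real compiled LangGraph app, the same `stream` loop applies:

```python
class FakeApp:
    """Stands in for a compiled graph; yields per-node update events."""
    def stream(self, state):
        yield {"planner": {"plan": ["step 1"]}}
        yield {"agent": {"past_steps": [("step 1", "done")]}}
        yield {"replan": {"response": "objective achieved"}}

def run(app, objective: str) -> list[str]:
    """Print (and collect) every event emitted while the graph runs."""
    lines = []
    for event in app.stream({"input": objective}):
        for node, update in event.items():
            lines.append(f"[{node}] {update}")
            print(lines[-1])
    return lines
```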
Improvement Ideas
Before we move further with our exploration of offensive graphs, we might want to investigate logging and tracing options. As we are now starting subgraphs (or might even run subgraphs/agents in parallel), traditional console output becomes confusing to follow. Stay tuned!