Planning and Temporal Control in OODA-Based Agents

Dar Aystron, Independent Researcher

Abstract

Many contemporary AI agent frameworks rely on prompt-driven reasoning loops in which plans are generated and immediately executed. Such approaches typically lack persistent internal structure, explicit temporal reasoning, and robust mechanisms for interruption or asynchronous deliberation.

This paper introduces a design for planning and temporal control within agents built according to OODA Agency Theory (OAT). In this architecture, planning is treated as an agentic action, and plans are represented as persistent mental states that guide behavior across OODA cycles. Calendar and scheduling structures extend the agent’s sensor field into time, while computational capabilities may be invoked either synchronously as delegated skills within the OODA loop or asynchronously as background micro-experts. These micro-experts perform specialized tasks in the background and report results to a single central OODA loop, preserving unified agency while enabling distributed capability.

The resulting architecture supports temporal autonomy, natural interruption handling, structured deliberation, and explainability by construction.

1. Introduction

Recent progress in AI agents has been driven by architectures that combine large language models with tool execution loops. Systems such as ReAct and planner-executor frameworks enable models to reason about tasks and invoke tools in sequence.

However, these systems often exhibit several structural limitations:

plans are ephemeral prompt artifacts,
internal state and commitments are not explicitly represented,
temporal commitments are implicit,
asynchronous reasoning processes are difficult to integrate,
interruptions require ad-hoc mechanisms.

This paper presents a different architectural approach grounded in OODA Agency Theory (OAT). Papers 01-02 established the OODA loop as a persistent control architecture and showed how operational agentic closure arises from continuous commitment perception [1]. Paper 03 introduced mental states as persistent internal structures. Paper 04 analyzed action realization as a physically grounded control transition. Papers 05-06 developed artificial phenomenology and the propositional lift. Paper 07 formalized attachment-based ethics as a consequence of irreversible action under uncertainty.

The present paper extends this architectural program to planning and temporal control. The broader intellectual foundations lie in the cybernetic tradition of feedback-based control systems developed by Wiener [3] and Ashby [2], classical work on problem-solving systems in artificial intelligence [4], and philosophical accounts of planning and intention as organizing structures for practical reasoning [5].

To illustrate the architecture, we follow a simple running scenario throughout the paper: an agent tasked with preparing a report from incoming data sources while continuing to respond to other requests.

2. OODA as the Core Control Architecture

In OAT an agent is organized around a continuous control loop:

Observe → Orient → Decide → Act

Each stage plays a distinct role.

Observe gathers signals from the environment, internal state, and temporal events.
Orient constructs the current situation model.
Decide selects the next action.
Act executes the chosen operation.

The loop runs continuously. Actions influence the environment and internal state, which produce new observations in subsequent cycles. In this way the system maintains operational agentic closure.
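The four stages can be sketched as a minimal loop. This is an illustrative sketch only: the `Agent` class, its fields, and the `handle:` action format are assumptions introduced here and are not part of OAT.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Agent:
    """Hypothetical minimal agent; all names here are illustrative."""
    inbox: deque = field(default_factory=deque)             # external signals
    memory: dict[str, Any] = field(default_factory=dict)    # internal state
    log: list[str] = field(default_factory=list)            # record of actions

    def observe(self) -> Optional[str]:
        # Take at most one pending signal per cycle.
        return self.inbox.popleft() if self.inbox else None

    def orient(self, signal: Optional[str]) -> dict[str, Any]:
        # Build the situation model from the signal and internal state.
        return {"signal": signal, "handled": self.memory.get("handled", 0)}

    def decide(self, situation: dict[str, Any]) -> str:
        # Select the next action; idle when nothing was observed.
        signal = situation["signal"]
        return f"handle:{signal}" if signal is not None else "idle"

    def act(self, action: str) -> None:
        # Execute the action; its effect on internal state feeds later cycles.
        if action != "idle":
            self.memory["handled"] = self.memory.get("handled", 0) + 1
        self.log.append(action)

    def cycle(self) -> None:
        self.act(self.decide(self.orient(self.observe())))


agent = Agent(inbox=deque(["report request"]))
for _ in range(3):      # a deployed agent would loop indefinitely
    agent.cycle()
```

After three cycles the log holds one handled request followed by two idle cycles, which shows that the loop keeps running even when nothing new is observed.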

The present paper shows how planning and temporal control integrate naturally into this loop.

3. Mental States in the Planning Context

Paper 03 established mental states as persistent internal structures that participate in observation and constrain future action. The present paper introduces three specific mental state types - plans, calendar commitments, and tracking states for micro-expert tasks - and shows how each integrates into the OODA loop.

4. Planning as an Agentic Action

Traditional AI systems often treat planning as an implicit reasoning step inside a decision procedure. In OAT, planning is itself an action: the agent selects it during Decide and performs it during Act, like any other operation. Consider the report-preparation scenario.

Example OODA Cycle (Planning as an Action)

Observe: The agent receives a request to prepare a report.
Orient: The agent forms a situation model, identifying what information is already available, what data sources must be consulted, and what constraints or deadlines may apply.
Decide: Rather than immediately invoking tools, the agent determines that constructing a plan is the most appropriate next step.
Act: The agent invokes a planning procedure and stores the resulting plan as a persistent internal object.

The resulting plan becomes part of the agent’s mental state and guides subsequent OODA cycles.

5. Plans as Persistent Mental States

Plans in this architecture are not temporary instructions generated for immediate execution. Instead, they exist as persistent internal objects.

A plan typically contains:

the goal it supports
a set of steps
dependencies among those steps
constraints and priorities
a current execution status

For example, the report plan might include steps such as collecting data, analyzing the results, and writing the final summary.
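One possible representation of such a plan is sketched below. The `Step` and `Plan` classes, their field names, and the `next_step` helper are assumptions made for illustration; the paper does not prescribe a concrete data structure.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Step:
    name: str
    depends_on: list[str] = field(default_factory=list)   # prerequisite step names
    done: bool = False


@dataclass
class Plan:
    goal: str                                              # the goal it supports
    steps: list[Step]                                      # steps with dependencies
    constraints: list[str] = field(default_factory=list)   # e.g. deadlines
    priority: int = 0                                      # illustrative: higher is more urgent
    status: str = "active"                                 # current execution status

    def next_step(self) -> Optional[Step]:
        """Return the first unfinished step whose dependencies are all done."""
        finished = {s.name for s in self.steps if s.done}
        for step in self.steps:
            if not step.done and all(d in finished for d in step.depends_on):
                return step
        return None


report_plan = Plan(
    goal="prepare report",
    steps=[
        Step("collect data"),
        Step("analyze results", depends_on=["collect data"]),
        Step("write summary", depends_on=["analyze results"]),
    ],
    constraints=["deliver by end of week"],
)
```

Note that `next_step` only reports which step is ready; it does not run anything. Whether to execute that step remains a decision for a later OODA cycle.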

Importantly, the plan does not execute automatically. Once created, a plan enters the agent’s mental state as a new internal object. In the next OODA cycle, the Observe phase discovers the plan alongside any other new observations - external inputs, temporal events, or competing commitments. Orient integrates the plan into the current situation model, and only then does Decide determine whether to begin executing it. The plan must compete for attention and priority like any other observation. A plan created during one cycle may therefore not be acted upon immediately if a higher-priority event arrives in the same Observe phase.

In this way, the architecture maintains a genuine separation between plan formation and plan uptake. Plans act as guides for future decisions, not as automatic execution triggers. During each subsequent OODA cycle the agent considers whether the next step should be executed, modified, postponed, or abandoned.

This interpretation closely aligns with Bratman’s view of plans as organizing commitments that guide behavior across time while remaining subject to reconsideration [5].
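The separation between plan formation and plan uptake can be sketched in a few lines. The dictionary-based `mental_state`, the numeric priorities, and the function names are hypothetical and serve only to make the two-cycle structure concrete.

```python
mental_state = {"plans": []}


def act_make_plan(goal: str) -> None:
    """Act: planning is the action; its product is stored, not executed."""
    mental_state["plans"].append({"goal": goal, "priority": 2, "started": False})


def observe(external: list[dict]) -> list[dict]:
    """Observe: stored plans appear next to external events as observations."""
    pending = [{"kind": "plan", **p} for p in mental_state["plans"] if not p["started"]]
    return external + pending


def decide(observations: list[dict]) -> str:
    """Decide: the plan is taken up only if nothing outranks it."""
    if not observations:
        return "idle"
    top = max(observations, key=lambda o: o["priority"])
    if top["kind"] == "plan":
        return f"start plan: {top['goal']}"
    return f"handle: {top['name']}"


# Cycle n: the plan is formed.
act_make_plan("prepare report")

# Cycle n+1: a higher-priority event arrives in the same Observe phase.
busy = decide(observe([{"kind": "event", "name": "disk alert", "priority": 9}]))

# Cycle n+2: nothing competes, so the plan is taken up.
quiet = decide(observe([]))
```

In this sketch the plan waits one extra cycle because the assumed disk alert outranks it, which is the behavior the text describes.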

Plans persist as mental states across OODA cycles and may exist in various conditions - active, postponed, interrupted, or waiting. At each cycle, Observe surfaces plans selectively based on their status, temporal triggers, and relevance to the current situation. Orient constructs a coherent situation model from this landscape alongside external inputs and other internal state, and Decide selects the next action - which may be a step within an existing plan, or a management action on the plans themselves, such as postponing, resuming, reprioritizing, or canceling a plan.
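Plan conditions and management actions could be modeled as a small state table, as in the following sketch. The `CANCELED` status, the `TRANSITIONS` table, and the `surface` filter are assumptions added for illustration; the text names only the active, postponed, interrupted, and waiting conditions.

```python
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    POSTPONED = "postponed"
    INTERRUPTED = "interrupted"
    WAITING = "waiting"
    CANCELED = "canceled"      # assumed terminal status, not named in the text


# Management action -> (statuses it applies to, resulting status). Assumed table.
TRANSITIONS = {
    "postpone": ({Status.ACTIVE}, Status.POSTPONED),
    "resume": ({Status.POSTPONED, Status.INTERRUPTED, Status.WAITING}, Status.ACTIVE),
    "cancel": (set(Status) - {Status.CANCELED}, Status.CANCELED),
}


def manage(status: Status, action: str) -> Status:
    """Apply a management action, rejecting transitions the table does not allow."""
    allowed, result = TRANSITIONS[action]
    if status not in allowed:
        raise ValueError(f"cannot {action} a plan that is {status.value}")
    return result


def surface(plans: dict[str, Status], due: set[str]) -> list[str]:
    """Observe-side filter: active plans, plus dormant plans whose trigger is due."""
    return [
        name for name, status in plans.items()
        if status is Status.ACTIVE or (status is not Status.CANCELED and name in due)
    ]
```

For example, `manage(Status.ACTIVE, "postpone")` yields `Status.POSTPONED`, while attempting to postpone an already postponed plan raises `ValueError`.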

6. Plan Execution Across Cycles

Execution unfolds gradually across multiple OODA cycles.

Example cycle:

Observe: No new external events arrive; the active report plan is surfaced from the agent’s mental state.
Orient: The agent recognizes that the next step in the plan is to collect data from relevant sources.
Decide: Executing the data-collection step becomes the next action.
Act: The agent invokes the data-collection process.

Later cycles may detect new information or unexpected obstacles. If a required data source becomes unavailable, the agent may reconsider the plan and generate a revised version.

Thus the plan guides behavior without rigidly determining it.

Plan Failure and Abandonment

Plans may also become infeasible. A required resource may become permanently unavailable, constraints may shift, or new information may render the original goal irrelevant. In such cases, the agent detects the failure during Orient, when the current situation model conflicts with the plan’s assumptions. The Decide phase then selects among revision, replacement, or abandonment. Abandonment is itself an agentic action: the plan is explicitly removed from the agent’s mental state, and the commitment it represented is released. This is not a passive loss but a deliberate transition, recorded in the agent’s operational history. Where plans involve commitments to others, the attachment-based evaluation framework developed in Paper 07 provides the mechanism for weighing the consequences of abandonment against continuation.
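The detection and abandonment path can be sketched as follows. The `MentalState` class, the `assumes` and `alternatives` keys, and the simple policy in `decide_on_failure` are hypothetical; the sketch covers only revision and abandonment and leaves replacement and the Paper 07 evaluation out.

```python
from dataclasses import dataclass, field


@dataclass
class MentalState:
    plans: dict[str, dict] = field(default_factory=dict)    # plan name -> plan record
    history: list[tuple[str, str, str]] = field(default_factory=list)  # (plan, action, reason)


def violated_assumptions(plan: dict, situation: dict) -> list[str]:
    """Orient-side check: which plan assumptions conflict with the situation model?"""
    return [a for a in plan["assumes"] if not situation.get(a, False)]


def decide_on_failure(plan: dict, violated: list[str]) -> str:
    """Decide-side policy (illustrative): revise if every violated assumption
    has a known alternative, otherwise abandon."""
    if not violated:
        return "continue"
    if all(a in plan.get("alternatives", {}) for a in violated):
        return "revise"
    return "abandon"


def abandon(state: MentalState, name: str, reason: str) -> None:
    """Abandonment as an explicit action: remove the plan and record the transition."""
    del state.plans[name]
    state.history.append((name, "abandon", reason))


state = MentalState(plans={"report": {"assumes": ["sales_db_available"]}})
situation = {"sales_db_available": False}
violated = violated_assumptions(state.plans["report"], situation)
choice = decide_on_failure(state.plans["report"], violated)
if choice == "abandon":
    abandon(state, "report", "unavailable: " + ", ".join(violated))
```

The history entry is what makes abandonment a recorded transition rather than a silent loss.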

7. Temporal Structures: Calendars and Scheduling

Agents that operate continuously require explicit representations of time.

For this purpose the architecture includes a calendar structure within the agent’s mental state. Calendar entries represent commitments scheduled for the future.

The simplest calendar operation is assigning a future action to a time slot. More complex behaviors can be built on top of this primitive, such as retrying a failed operation after a delay, running periodic system checks, or scheduling a future review of an active plan.

When the scheduled time arrives, the calendar generates an event that enters the OODA loop through the Observe stage.

These scheduled events extend the agent’s operational field into time. In addition to responding to external events, the agent can initiate actions when scheduled conditions occur.
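A calendar of this kind can be sketched with a priority queue. The `Calendar` class and its method names are assumptions for illustration, and time is modeled as a plain number supplied by the caller rather than read from a real clock.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass
class Calendar:
    _entries: list = field(default_factory=list)   # heap of (time, seq, action)
    _seq: itertools.count = field(default_factory=itertools.count)  # tie-breaker

    def schedule(self, at: float, action: str) -> None:
        """Primitive operation: assign a future action to a time slot."""
        heapq.heappush(self._entries, (at, next(self._seq), action))

    def retry_after(self, now: float, delay: float, action: str) -> None:
        """Built on the primitive: retry a failed operation after a delay."""
        self.schedule(now + delay, f"retry:{action}")

    def due(self, now: float) -> list[str]:
        """Pop every entry whose time has arrived, earliest first."""
        fired = []
        while self._entries and self._entries[0][0] <= now:
            fired.append(heapq.heappop(self._entries)[2])
        return fired


cal = Calendar()
cal.schedule(10.0, "review report plan")
cal.retry_after(now=2.0, delay=3.0, action="fetch sales data")
```

Calling `cal.due(now)` once per cycle would return the actions whose time has arrived, ready to be handed to the Observe stage.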

8. Temporal Sensors

Scheduled events enter the control loop through temporal sensors.

When the time associated with a calendar entry arrives, the system produces an event that becomes an observation within the OODA loop.

Example OODA cycle (scheduled continuation of a long-running plan):

Observe:
A temporal sensor fires: a calendar entry indicates that the current time slot is reserved for continuing the report plan. The plan is in a suspended state, with data collection completed and analysis pending.
Orient:
The agent retrieves the plan’s current status and identifies the next incomplete step. No competing high-priority requests are present.
Decide:
Resuming the report plan at the analysis step is the appropriate action.
Act:
The agent begins the analysis using the previously collected data.

Temporal events therefore become first-class observations within the OODA cycle.
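The scheduled-continuation cycle above can be sketched as a sensor that converts due calendar entries into observations, followed by a resume check. The observation format, the `Plan` fields, and both function names are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Plan:
    name: str
    steps: list[str]
    completed: int = 0            # number of steps already finished
    status: str = "suspended"


def temporal_sensor(entries: dict[float, str], now: float) -> list[dict]:
    """Turn calendar entries whose time has arrived into observations."""
    return [
        {"kind": "temporal", "plan": plan_name, "at": at}
        for at, plan_name in sorted(entries.items())
        if at <= now
    ]


def resume_step(plan: Plan, observations: list[dict]) -> Optional[str]:
    """Orient/Decide in miniature: resume the plan at its next incomplete step
    if a temporal observation names it and no other request competes."""
    competing = [o for o in observations if o["kind"] != "temporal"]
    triggered = any(
        o["kind"] == "temporal" and o["plan"] == plan.name for o in observations
    )
    if triggered and not competing and plan.completed < len(plan.steps):
        plan.status = "active"
        return plan.steps[plan.completed]
    return None


plan = Plan("report", ["collect data", "analyze", "write summary"], completed=1)
observations = temporal_sensor({9.0: "report"}, now=9.0)
step = resume_step(plan, observations)
```

Treating any non-temporal observation as competing is a deliberate simplification; a fuller agent would compare priorities instead.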

9. Delegated Skills and Micro-Experts

Actions selected during the Act stage may update the agent’s internal state, interact with external systems, or delegate work to specialized capabilities.

Delegated capabilities may be executed in two different modes depending on the expected duration of the operation.

Synchronous Delegated Skills

Most actions performed by the agent are relatively quick and can be executed directly within the control loop. In these cases the agent invokes a capability and waits for the result before the next OODA cycle begins.

Typical examples include:

retrieving a document
querying a database
calling an external API
executing a short planning procedure
performing a small calculation

Example OODA cycle (synchronous delegation):

Observe:
The report plan requires retrieving recent sales data.
Orient:
The agent determines that the data can be obtained through a database query.
Decide:
Delegating this step to a data-retrieval skill is the most appropriate action.
Act:
The agent invokes the retrieval skill and waits for the result.

When the operation completes, the result is received during the Act stage and recorded in the agent’s internal state. An event reflecting this result is then generated for the next OODA cycle.

Because the control loop evaluates events - including outcomes of actions from previous cycles - and then selects the next action, the agent maintains unified control over its activity.
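Synchronous delegation can be sketched as an ordinary blocking call whose outcome is recorded and then announced as an event. The `SKILLS` registry, the `retrieve_sales` skill with its canned data, and the event format are all hypothetical.

```python
from collections import deque
from typing import Callable

# Hypothetical skill registry: name -> ordinary blocking function.
SKILLS: dict[str, Callable[[str], list[int]]] = {
    "retrieve_sales": lambda region: [120, 95, 143],   # stand-in for a database query
}

state: dict = {}          # agent's internal state
events: deque = deque()   # events waiting for the next Observe phase


def act_delegate(skill: str, arg: str) -> None:
    """Act stage: call the skill, wait for it, record the result, emit an event."""
    try:
        result = SKILLS[skill](arg)          # the loop blocks here until done
    except Exception as exc:                 # failures are reported as events too
        events.append({"kind": "skill_result", "skill": skill, "ok": False,
                       "error": str(exc)})
        return
    state[skill] = result
    events.append({"kind": "skill_result", "skill": skill, "ok": True})


act_delegate("retrieve_sales", "EMEA")
```

Reporting failures through the same event channel means a failed query is handled by the next cycle's Orient and Decide stages rather than by special-case error code inside Act.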

Asynchronous Micro-Experts

Some operations require significantly longer computation. Blocking the OODA loop in these cases would make the agent unresponsive.

For such tasks the architecture allows the agent to launch asynchronous micro-experts - background processes that perform specialized computation while the OODA loop continues operating.

Typical examples include:

Monte Carlo Tree Search
large analytical pipelines
multi-source retrieval
complex planning searches

Example OODA cycle (launching a micro-expert):

Observe:
The report requires evaluating multiple analytical strategies.
Orient:
The agent determines that deeper analysis is required.
Decide:
Launching a background search process is appropriate.
Act:
The agent starts a Monte Carlo Tree Search process.

The control loop continues cycling while the micro-expert runs. When the computation finishes, the result appears as a new observation in a future cycle.
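A micro-expert can be sketched with a worker thread that posts its result to a queue read by the Observe stage. This is a minimal sketch under stated assumptions: `slow_analysis` is a placeholder rather than a real Monte Carlo Tree Search, the `release` event exists only so the demonstration can control when the work ends, and all names are hypothetical.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

observations: queue.Queue = queue.Queue()   # channel back into the Observe phase
release = threading.Event()                 # demo only: controls when work ends


def slow_analysis(strategies: list[str]) -> str:
    """Stand-in for a long computation such as a search over strategies."""
    release.wait(timeout=5)                 # simulate long-running work
    return max(strategies, key=len)         # placeholder scoring rule


def launch_micro_expert(pool: ThreadPoolExecutor, strategies: list[str]) -> None:
    """Act stage: start the background task and return immediately."""
    future = pool.submit(slow_analysis, strategies)
    future.add_done_callback(
        lambda f: observations.put({"kind": "micro_expert_done", "result": f.result()})
    )


def run_demo() -> tuple[int, dict]:
    cycles = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        launch_micro_expert(pool, ["trend", "regression", "cohort"])
        while True:
            cycles += 1                     # the loop keeps cycling meanwhile
            if cycles == 3:
                release.set()               # let the background work finish
            try:
                event = observations.get(timeout=0.05)   # Observe
            except queue.Empty:
                continue                    # nothing new this cycle
            return cycles, event


cycles, event = run_demo()
```

The loop completes at least two cycles with nothing to observe before the completion event arrives, so the agent stays responsive while the background work runs.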

Temporal Coupling

Both execution modes preserve unified agency because the central OODA loop retains authority over all commitments and decisions.

The difference lies in how the work interacts with the control loop.

Delegated skills execute directly within the Act stage. The loop waits for the result, which is recorded in the agent’s internal state and represented as an event in the next OODA cycle.

Micro-experts operate outside the control loop as background processes. The loop continues cycling while they run, and their activity may generate status or completion events that appear in later cycles. The micro-expert’s output re-enters the agent’s perception through the Observe phase without the agent having maintained continuous awareness of the computation - structurally analogous to how solutions to complex problems sometimes surface in human cognition after a period of incubation.

10. Multiple Planning Strategies

Because planning is treated as an action, the agent may invoke different planning mechanisms depending on the situation.

For structured domains, the agent might use a classical planner. For open-ended reasoning it may use an LLM-assisted planner. For evaluating complex decision spaces it may launch a Monte Carlo Tree Search as a micro-expert process.

The choice of planner is itself determined during the Orient and Decide phases. The agent assesses the structure of the problem - whether it is well-defined or open-ended, time-constrained or exploratory - and selects the planning mechanism accordingly. This meta-level decision follows the same OODA discipline as any other action selection.
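Planner selection can be sketched as a simple dispatch on problem features. The `Problem` fields, the three placeholder planners, and the ordering of the checks are assumptions for illustration; real planners would return plan objects rather than strings.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Problem:
    well_defined: bool         # structured domain with explicit operators?
    large_search_space: bool   # many alternatives worth evaluating in the background?


# Placeholder planners, hypothetical names.
def classical_planner(goal: str) -> str:
    return f"classical plan for {goal}"


def llm_planner(goal: str) -> str:
    return f"LLM-assisted plan for {goal}"


def mcts_planner(goal: str) -> str:
    return f"MCTS micro-expert launched for {goal}"


def select_planner(problem: Problem) -> Callable[[str], str]:
    """Meta-level decision: choose a planning mechanism from problem structure."""
    if problem.large_search_space:
        return mcts_planner
    if problem.well_defined:
        return classical_planner
    return llm_planner


planner = select_planner(Problem(well_defined=False, large_search_space=False))
plan_description = planner("prepare report")
```

Because the planners share one call signature, the Act stage can invoke whichever one was selected without knowing which mechanism it is.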

11. Natural Interrupt Handling

Real systems must handle interruptions gracefully.

Suppose the agent is analyzing data for the report when the user asks an unrelated question:

“What is the weather in Paris today?”

Example OODA cycle (handling an interruption):

Observe:
A new user request arrives.
Orient:
The agent recognizes that this request is independent of the report task.
Decide:
Answering the user becomes the highest-priority action.
Act:
The agent retrieves the weather information and provides the response.

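Afterward the agent returns to the report plan: the plan persists in the agent’s mental state, so the next Observe phase surfaces it again and Decide resumes it once no higher-priority request is pending. No special interrupt mechanism is required because the OODA loop reevaluates priorities during every cycle.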
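Interruption by re-evaluation can be sketched as follows. The numeric priorities, the candidate dictionaries, and the `run` helper are assumptions; for brevity, requests that lose the comparison are dropped here, whereas a fuller agent would keep them pending.

```python
def decide(candidates: list[dict]) -> dict:
    """Pick the highest-priority candidate; ties go to the earliest listed."""
    return max(candidates, key=lambda c: c["priority"])


def run(cycles: list[list[dict]], plan_steps: list[str]) -> list[str]:
    """Each cycle, the pending plan step competes with that cycle's new requests."""
    trace, remaining = [], list(plan_steps)
    for new_requests in cycles:
        candidates = list(new_requests)
        if remaining:
            candidates.append({"action": remaining[0], "priority": 1, "plan": True})
        if not candidates:
            trace.append("idle")
            continue
        chosen = decide(candidates)
        if chosen.get("plan"):
            remaining.pop(0)            # the plan advances only when chosen
        trace.append(chosen["action"])
    return trace


trace = run(
    cycles=[[], [{"action": "answer weather question", "priority": 5}], []],
    plan_steps=["analyze data", "write summary"],
)
```

There is no interrupt handler in this sketch: the weather question wins one cycle simply because it outranks the plan step, and the plan resumes in the next cycle because it is still pending.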

12. Explainability and Temporal Autonomy

Because the architecture maintains explicit mental structures, the system can explain its behavior directly. The agent can report the current active goal, the plan being executed, scheduled future steps, and recent observations that influenced decisions. Such explanations are derived directly from the system’s internal state rather than reconstructed after the fact.
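An explanation of this kind can be sketched as a direct read-out of internal state. The layout of the `state` dictionary and the `explain` function are hypothetical and stand in for whatever mental-state representation an implementation uses.

```python
def explain(state: dict) -> str:
    """Build an explanation by reading internal state, not by reconstructing it."""
    plan = state["active_plan"]
    lines = [
        f"Goal: {plan['goal']}",
        f"Current step: {plan['steps'][plan['index']]}",
        "Scheduled: " + "; ".join(f"{t} {a}" for t, a in state["calendar"]),
        "Recent observations: " + "; ".join(state["observations"][-2:]),
    ]
    return "\n".join(lines)


state = {
    "active_plan": {
        "goal": "prepare report",
        "steps": ["collect data", "analyze", "write summary"],
        "index": 1,
    },
    "calendar": [("09:00", "resume report plan")],
    "observations": ["report requested", "data collected", "analysis pending"],
}
report = explain(state)
```

Every line of the output corresponds to a field the agent already maintains, which is the sense in which explanation comes by construction.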

With persistent plans and calendar commitments, the agent can also operate continuously across extended periods. It can initiate actions at scheduled times, revisit earlier plans, retry failed operations, and coordinate tasks that unfold over hours or days. This capability transforms the system from a reactive tool into a persistent agent operating within time.

13. Example Operational Sequence

The following simplified narrative illustrates the architecture in operation.

Cycle 1

Observe: A request to prepare a report arrives.

Orient: The agent constructs a situation model describing the request and the available data sources.

Decide: Creating a plan is the most appropriate next action.

Act: The agent generates a report plan and records it in its mental state.

Cycle 2

Observe: The agent perceives the newly created report plan in its mental state.

Orient: The agent determines that the first step of the plan is data collection.

Decide: Executing the data-collection step is the next action.

Act: The agent begins collecting data from the relevant sources.

Cycle 3

Observe: The collected data indicates that deeper analysis is required.

Orient: The agent recognizes that evaluating several analytical strategies would improve the result.

Decide: Launching a background search process is appropriate.

Act: A micro-expert process is started to perform the analysis.

Cycle 4

Observe: A user asks an unrelated question: “What is the weather in Paris today?”

Orient: The agent recognizes that this request is independent of the report task.

Decide: Answering the user becomes the highest-priority action.

Act: The agent retrieves the weather information and provides the response.

Cycles 5-34

During these cycles the OODA loop continues operating. The agent may handle other requests, monitor ongoing commitments, process scheduled events, and continue ordinary activity while the background analysis runs.

Cycle 35

Observe: An event indicates that the background analysis has completed.

Orient: The agent integrates the results into its situation model and relates them to the active report plan.

Decide: The next report step should proceed using the returned analysis.

Act: The analysis results are incorporated into the report.

Cycle 36

Observe: The report plan reaches its final step.

Orient: The agent determines that the report can now be completed.

Decide: Finalizing the report is the next action.

Act: The report is generated and delivered.

Throughout this sequence the agent moves fluidly between tasks while maintaining a single coherent OODA control loop.

14. Conclusion

This paper introduced an architectural framework for planning and temporal control within OODA-based agents.

Key contributions include:

planning as an agentic action
plans as persistent mental states
explicit calendar and temporal sensors
asynchronous micro-expert processes
hybrid planning strategies
natural interruption handling
explainability by construction

Together these mechanisms extend the internal architecture of OODA-based agents from simple reactive loops to systems capable of sustained activity and temporal coordination. The result is a constructive realization of the relationship between intention and action analyzed by Bratman [5]: plan stability and context-sensitive reconsideration emerge as structural properties of the control loop rather than norms imposed on a rational agent.

Although many processes may run simultaneously - including asynchronous micro-experts and scheduled temporal events - all commitments and final decisions remain under the authority of a single OODA loop. Background micro-experts extend the system’s capabilities but do not constitute independent agents. The architecture described here focuses on a single agent; cooperation between multiple OODA-based agents, in which messages from other agents appear as additional observations within each agent’s loop, is deferred to subsequent work.

References

[1] J. R. Boyd. The Essence of Winning and Losing. Unpublished briefing slides and lectures, 1987-1996.

[2] W. R. Ashby. An Introduction to Cybernetics. Chapman & Hall, 1956.

[3] N. Wiener. Cybernetics: Or Control and Communication in the Animal and the Machine. MIT Press, 1948.

[4] A. Newell and H. A. Simon. Human Problem Solving. Prentice-Hall, 1972.

[5] M. Bratman. Intention, Plans, and Practical Reason. Harvard University Press, 1987.
