Acting in Time, Watching Itself: Planning and Critics

The agent can reason, reflect, and care. Now it must plan across time - and notice when something goes wrong.

Papers 01-07 established what an OODA agent is: a closed control loop with persistent mental states, a propositional layer for reflection, and an attachment-based ethics grounded in the structure of agency itself. But that agent operates one cycle at a time. It has no sustained relationship with the future, and no way to detect its own failures.

Papers 08 and 09 address both gaps.

What’s in This Release

Paper 08 - Planning and Temporal Control in OODA-Based Agents treats planning as what it is: an action. The agent does not follow a plan handed to it from outside. It decides to plan, constructs the plan as a persistent mental state, and then reconsiders it on every subsequent cycle. Plans compete for attention alongside external events, interruptions, and other commitments. Calendar structures extend the agent’s sensor field into time, making future moments observable. Synchronous delegated skills handle quick operations within the loop; asynchronous micro-experts run in the background and report back as new observations. A single OODA loop retains authority over everything. The result is temporal autonomy - not a reactive tool that responds when prompted, but a persistent agent that initiates, schedules, interrupts, resumes, and abandons across extended time horizons.
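To make the shape of that loop concrete, here is a minimal Python sketch. It is not taken from Paper 08: the `Plan` and `Agent` classes, the tick-based calendar, and the rule that any observation pre-empts a plan step are all illustrative assumptions. It shows only three of the ideas above: a calendar entry surfaces as an observation, deciding to plan is the action taken in that cycle, and the plan persists while an external event interrupts it.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    """A plan held as persistent mental state (hypothetical structure)."""
    goal: str
    steps: list
    next_step: int = 0

    @property
    def done(self):
        return self.next_step >= len(self.steps)


@dataclass
class Agent:
    calendar: dict = field(default_factory=dict)  # tick -> scheduled goal
    plans: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def observe(self, tick, external):
        # The calendar acts as a sensor: a due entry becomes an observation.
        obs = list(external)
        if tick in self.calendar:
            obs.append("calendar:" + self.calendar.pop(tick))
        return obs

    def cycle(self, tick, external=()):
        obs = self.observe(tick, external)
        if obs:
            # Simplification: attend to the first observation only;
            # a fuller version would queue the rest.
            kind, _, detail = obs[0].partition(":")
            if kind == "calendar":
                # Deciding to plan is itself the action taken this cycle.
                steps = ["prepare " + detail, "do " + detail]
                self.plans.append(Plan(detail, steps))
                self.log.append("plan " + detail)
            else:
                # External events pre-empt plan steps; the plan resumes later.
                self.log.append("handle " + obs[0])
            return
        for plan in self.plans:
            if not plan.done:
                self.log.append(plan.steps[plan.next_step])
                plan.next_step += 1
                return
        self.log.append("idle")


agent = Agent(calendar={1: "report"})
agent.cycle(0)
agent.cycle(1)
agent.cycle(2, ["email"])  # an interruption arrives mid-plan
agent.cycle(3)
agent.cycle(4)
agent.cycle(5)
print(agent.log)
```

In this run the agent idles, creates a plan when the calendar entry comes due, handles the email, and only then works through the two plan steps. The plan is never executed as a batch: each step has to win its own cycle.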

Paper 09 - Critics: Monitoring Mechanisms for Learning and Adaptation in OODA Agents introduces critics as a specialized class of internal sensor. An agent’s observation stream includes not only external signals but internal monitoring of its own mental states - beliefs, goals, plans, predictions, and decisions. Critics detect anomalies: contradictions between beliefs, persistent prediction failures, plans that repeatedly cannot find adequate actions. Their diagnostic observations enter the OODA loop and may generate corrective goals that unfold into multi-step plans through the same machinery Paper 08 describes. Learning follows from the depth of the anomaly: factographic corrections fix concrete facts, general-level responses retrain models or revise rules, and categorical failures demand structural reorganization of the agent’s representational vocabulary. Critics can operate at the implementation layer - subconscious monitoring that never surfaces - or be lifted into the propositional layer when deliberate attention is required.
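One way to picture critics as internal sensors is the following Python sketch. Everything in it is an assumption made for illustration: the `MentalState` fields, the thresholds, and especially the pairing of each anomaly with a learning depth, which Paper 09 may assign differently. The point is only that each critic reads mental state and emits a diagnostic observation, and that the depth of the diagnosis selects the kind of response.

```python
from dataclasses import dataclass, field


@dataclass
class MentalState:
    """Hypothetical slice of the agent's inspectable mental state."""
    beliefs: dict = field(default_factory=dict)            # proposition -> bool
    prediction_errors: list = field(default_factory=list)  # recent absolute errors
    failed_decides: int = 0                                # consecutive Decide failures


def contradiction_critic(state):
    # Flags a proposition believed together with its explicit negation.
    for prop, value in state.beliefs.items():
        if value and state.beliefs.get("not " + prop):
            return ("factographic", "contradiction: " + prop)
    return None


def prediction_critic(state, threshold=0.5, window=3):
    # Fires only when every error in the recent window exceeds the threshold.
    recent = state.prediction_errors[-window:]
    if len(recent) == window and all(e > threshold for e in recent):
        return ("general", "persistent prediction failure")
    return None


def decide_critic(state, limit=3):
    if state.failed_decides >= limit:
        return ("categorical", "no adequate action found repeatedly")
    return None


CRITICS = [contradiction_critic, prediction_critic, decide_critic]

# Assumed mapping from anomaly depth to the kind of learning it calls for.
RESPONSES = {
    "factographic": "correct the fact",
    "general": "retrain the model or revise the rule",
    "categorical": "reorganize the representational vocabulary",
}


def run_critics(state):
    """Return diagnostic observations as (depth, message, response) tuples."""
    findings = []
    for critic in CRITICS:
        result = critic(state)
        if result:
            depth, message = result
            findings.append((depth, message, RESPONSES[depth]))
    return findings


state = MentalState(
    beliefs={"door open": True, "not door open": True},
    prediction_errors=[0.1, 0.9, 0.8, 0.7],
    failed_decides=1,
)
findings = run_critics(state)
for finding in findings:
    print(finding)
```

In a full agent the tuples returned by `run_critics` would not be acted on directly. They would join the observation stream and compete for attention like any other observation.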

The Connection

The two papers complement each other. Paper 08 gives the agent the ability to act across time. Paper 09 gives it the ability to watch itself act and detect when things are going wrong. A critic that notices persistent Decide-phase failures can generate a corrective goal; that goal becomes a plan under Paper 08’s architecture; the plan may delegate model retraining to a micro-expert; the improved model feeds back into future cycles. The full chain - from anomaly detection through planning through delegation through learning - flows through one control loop.
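The chain can be traced end to end in a short `asyncio` sketch. The names (`retrain_micro_expert`, `inbox`, `model_version`) and the failure threshold of 3 are hypothetical, and the retraining itself is a stand-in that does no real work. What the sketch tries to show is the division of authority: the micro-expert runs in the background and only reports back, while every change to the agent's state is made inside the one loop.

```python
import asyncio


async def retrain_micro_expert(inbox):
    # Stand-in for background model retraining (hypothetical).
    await asyncio.sleep(0)
    # The micro-expert changes nothing itself; it only reports.
    await inbox.put("report: model retrained")


async def run(max_cycles=10):
    inbox = asyncio.Queue()  # observations waiting for the loop
    failed_decides, model_version = 3, 1
    plan, trace, tasks = [], [], []
    for _ in range(max_cycles):
        # Observe: reports and critic diagnoses are both observations.
        obs = None
        if not inbox.empty():
            obs = inbox.get_nowait()
        elif failed_decides >= 3:
            obs = "critic: persistent Decide failures"
        # Decide and act: one action per cycle.
        if obs and obs.startswith("critic"):
            # The corrective goal unfolds into a multi-step plan.
            plan = ["delegate retraining", "await report"]
            failed_decides = 0
            trace.append("plan")
        elif obs and obs.startswith("report"):
            # Learning is applied by the loop, not by the micro-expert.
            model_version += 1
            plan = []
            trace.append("learn")
            break
        elif plan and plan[0] == "delegate retraining":
            tasks.append(asyncio.create_task(retrain_micro_expert(inbox)))
            plan.pop(0)
            trace.append("delegate")
        else:
            trace.append("wait")
        await asyncio.sleep(0)  # yield so background work can progress
    await asyncio.gather(*tasks)
    return trace, model_version


trace, model_version = asyncio.run(run())
print(trace, model_version)
```

The trace begins with the plan and the delegation and ends with the learning step. How many waiting cycles fall in between depends on when the background task gets scheduled, which is the property that makes the report an observation and not a return value.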

The Progression

Papers 01-04 built the subject. Papers 05-06 gave it a point of view and the means to examine it. Paper 07 gave it reasons to care. Paper 08 gives it a relationship with time. Paper 09 gives it the capacity to learn from its own failures.

An agent that can plan, monitor itself, and adapt is no longer merely autonomous. It is becoming self-regulating.