← Yiwen Lu

Is 200K Context Enough for Anybody?

"640K ought to be enough for anybody" is almost certainly not something Bill Gates actually said, but the quote survived because it is such a perfect joke about capacity planning.

I have started using a nearby joke for coding agents:

Is 200K context enough for anybody?

The answer is less interesting than the premise. A larger context window helps, but it does not remove the need to decide what kind of information the agent is reading. Durable working philosophy, project-specific facts, this week's task list, and a one-turn "please keep going" instruction should not live in the same flat prompt.

A big context window can still become a big junk drawer.

Four-layer context hierarchy

The practice I have settled into is a four-layer context hierarchy. The layers differ by scope and half-life:

| Layer | Scope | Lifetime | Job |
| --- | --- | --- | --- |
| Global instructions | All work | Longest | Stable working style: evidence standards, coding taste, failure handling |
| Project instructions and skills | One project | Relatively stable | Repo structure, environment facts, local workflows, project invariants |
| PROMPT.md | Current task bundle | Days to weeks | Objective, next workbench, task specs, constraints |
| Unattended-mode prompt | Current unattended stretch | Shortest | Execution policy for continuing safely without constant human nudges |
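As a rough illustration of keeping the layers separate rather than merging them into one flat prompt, here is a minimal Python sketch. The `Layer` class, the `assemble` function, and the sample text are hypothetical names invented for this example; they are not part of codoxear or any agent tool, and real agents usually load these layers through their own mechanisms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str   # label taken from the table above
    scope: str  # what the layer applies to
    text: str   # the instructions themselves (placeholder content here)


def assemble(layers: list[Layer]) -> str:
    """Join non-empty layers in the given order, each under its own label."""
    parts = [
        f"[{layer.name} | {layer.scope}]\n{layer.text.strip()}"
        for layer in layers
        if layer.text.strip()
    ]
    return "\n\n".join(parts)


# Ordered from longest-lived to shortest-lived, as in the table.
layers = [
    Layer("Global instructions", "all work", "Read the artifact before judging it."),
    Layer("Project instructions and skills", "one project", "Hypothetical repo fact goes here."),
    Layer("PROMPT.md", "current task bundle", "Hypothetical objective goes here."),
    Layer("Unattended-mode prompt", "current unattended stretch", ""),  # not active yet
]

prompt = assemble(layers)
```

The empty unattended-mode layer is simply skipped, so the shortest-lived layer only appears when it actually has something to say.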

I built codoxear around this way of working. It lets me watch and steer multiple Codex/Pi sessions from desktop or phone while the actual sessions keep running in terminals or tmux. It also gives the hierarchy an operating surface: I can see which sessions are active, blocked, snoozed, or ready for another unattended stretch.

Global instructions: stable method

Global instructions hold the rules I want every agent to follow in every project. Mine are mostly about epistemic hygiene:

- Increase correctness probability, not perceived progress.
- Read the artifact before judging it.
- No silent fallbacks; fail loudly on contract violations.
- Separate observation from inference.
- Understand before action.
- Use primary sources when facts may have changed.
- Every sentence should reduce uncertainty.

They are not facts about codoxear, MinT, or any other repository. They are habits I want preserved across tasks: inspect before summarizing, expose contract violations, distinguish what was seen from what was inferred, and avoid producing work that only looks finished.

Putting these rules in the global layer keeps them from being rewritten for every task. It also keeps temporary project needs from polluting the general operating style.

Project instructions and skills: local truth

Project instructions hold things the model cannot infer from pretraining. In a real repository, that usually means paths, ports, machines, logs, validation routes, deployment cautions, and conventions that only exist inside the team.

I do not buy the claim that skills are just patches for weak models.

Some generic skills are model patches. A downloaded "better code review" skill may matter less as models improve. Project-local skills are different: they preserve local truth. A stronger model still will not know which SSH alias points to dev, where the server log is written, which command touches the real cluster, or which validation path has fooled the team before.

I like project guides for architecture and skills for procedures. A guide can explain the shape of the system. A skill can encode a repeatable operation: which files to inspect, which command to run, what output counts as evidence, and which tempting shortcut is invalid in this repo.
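The four parts of a skill listed above can be written down as a small record. This is a sketch only: the `Skill` class, its field names, and the sample values (including the log path and command) are hypothetical and do not reflect any particular skill format.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    files_to_inspect: list[str]
    command: str
    evidence: str  # what output counts as proof
    invalid_shortcuts: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the skill as plain text an agent can read."""
        lines = [f"Skill: {self.name}", "Inspect:"]
        lines += [f"  - {path}" for path in self.files_to_inspect]
        lines.append(f"Run: {self.command}")
        lines.append(f"Evidence: {self.evidence}")
        if self.invalid_shortcuts:
            lines.append("Do not:")
            lines += [f"  - {shortcut}" for shortcut in self.invalid_shortcuts]
        return "\n".join(lines)


# Every value below is a made-up example, not a real project fact.
skill = Skill(
    name="check-server-log",
    files_to_inspect=["logs/server.log"],
    command="tail -n 200 logs/server.log",
    evidence="a log line written after the change was deployed",
    invalid_shortcuts=["inferring success from a zero exit code alone"],
)
text = skill.render()
```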

PROMPT.md: the current task bundle

PROMPT.md is the layer I use for work that is too large for one prompt but too temporary for project memory.

My template is plain:

## Objective
// What must be achieved and when it is done.

## Workbench
// Short, prioritized next actions.

## Context
// Paths, references, and facts the agent needs.

## Task specifications
// Requirements, details, edge cases, verification criteria.

## Constraints
// Operational rules for this task.

The headings matter because they separate different jobs.

Objective names the desired end state. Workbench says what to do next. Context points to files and references instead of pasting everything inline. Task specifications carry the actual requirements. Constraints keep hard rules visible.
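Because each heading has its own job, a missing or empty section is worth catching before a session starts. The following Python sketch checks a PROMPT.md string for the five headings in the template. The function names and the sample text are assumptions made for illustration; nothing here is part of codoxear.

```python
REQUIRED = ["Objective", "Workbench", "Context", "Task specifications", "Constraints"]


def parse_sections(markdown: str) -> dict[str, str]:
    """Split a document on '## ' headings into {heading: body}."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in markdown.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}


def missing_sections(markdown: str) -> list[str]:
    """Return required headings that are absent or have an empty body."""
    found = parse_sections(markdown)
    return [name for name in REQUIRED if not found.get(name)]


# Hypothetical task file: Context is empty and Task specifications is absent.
sample = """## Objective
Ship the hypothetical fix and confirm it with the test suite.

## Workbench
1. Reproduce the failure.

## Context

## Constraints
Do not touch the real cluster.
"""

problems = missing_sections(sample)
```

Here `problems` lists the sections that still need attention, in template order.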

PROMPT.md can survive a day, a week, or a messy multi-session task. When a session gets compacted, restarted, or moved to another device, the file gives the agent a clean re-entry point.

It is also not allowed to be wrong for long. As the task changes, I rewrite it. When a lesson stops being task-specific, I move it outward into project instructions or a skill.

Unattended mode: execution policy

The innermost layer is the shortest-lived one: unattended mode.

codoxear can inject a short prompt when a session is idle and eligible. In my current setup, eligibility checks include busy state, queued input, whether the last message came from the assistant, and cooldown. Those checks keep unattended mode from becoming a blind resend loop; a suitable session gets a policy for continuing.
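The four checks named above can be expressed as a single predicate. This is a hedged sketch, not codoxear's implementation: `SessionState`, `is_eligible`, the field names, and the 300-second default cooldown are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    busy: bool                      # agent is mid-turn
    queued_input: int               # messages waiting to be delivered
    last_message_role: str          # "assistant" or "user"
    seconds_since_injection: float  # time since the last unattended prompt


def is_eligible(state: SessionState, cooldown_seconds: float = 300.0) -> bool:
    """Return True only when every check passes; any failure blocks injection."""
    return (
        not state.busy
        and state.queued_input == 0
        and state.last_message_role == "assistant"
        and state.seconds_since_injection >= cooldown_seconds
    )


idle = SessionState(busy=False, queued_input=0,
                    last_message_role="assistant", seconds_since_injection=600.0)
too_soon = SessionState(busy=False, queued_input=0,
                        last_message_role="assistant", seconds_since_injection=30.0)
waiting_on_agent = SessionState(busy=False, queued_input=0,
                                last_message_role="user", seconds_since_injection=600.0)
```

Requiring all conditions at once is what separates this from a blind resend loop: a session that was just prompted, or that still has input queued, is left alone.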

The policy prompt asks the agent to keep four internal buckets:

Deliverables
Completed
Next actions
Parked user decisions

The buckets encode four different states. Deliverables keeps the owed output visible. Completed contains verified facts rather than optimistic summaries. Next actions keeps the continuation concrete. Parked user decisions stops the agent from treating every missing preference as a blocker.
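One way to make the four buckets concrete is a small state object. The sketch below is illustrative only: the `UnattendedState` class and its methods are hypothetical, and the rule that a session yields once no next actions remain is an assumption, not something the policy prompt is known to specify.

```python
from dataclasses import dataclass, field


@dataclass
class UnattendedState:
    deliverables: list[str] = field(default_factory=list)
    completed: list[str] = field(default_factory=list)
    next_actions: list[str] = field(default_factory=list)
    parked_user_decisions: list[str] = field(default_factory=list)

    def complete(self, action: str, evidence: str) -> None:
        """Move an action to Completed only when evidence is supplied."""
        if not evidence.strip():
            raise ValueError("Completed entries need evidence, not a summary")
        self.next_actions.remove(action)  # raises ValueError if never planned
        self.completed.append(f"{action} (evidence: {evidence})")

    def park(self, question: str) -> None:
        """Record a missing preference without blocking the remaining work."""
        self.parked_user_decisions.append(question)

    def should_yield(self) -> bool:
        """Assumed rule: nothing concrete left to do, so hand back control."""
        return not self.next_actions


state = UnattendedState(deliverables=["patch"], next_actions=["run tests"])
state.park("Which log level does the user prefer?")
state.complete("run tests", "test output captured in the session")
```

Refusing to mark work complete without evidence keeps the Completed bucket from filling up with optimistic summaries, and parking a question leaves the next actions untouched.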

Goal mode and Ralph-style loops are the nearby ideas here. They solve the obvious annoyance: the agent should keep working toward a declared outcome without a human sending a fresh prompt after every turn.

Because those tools already cover continuation, "keeps going" is not the interesting definition of unattended mode. In this hierarchy, unattended mode is the local policy layer for an already-running codoxear session: what state the agent should keep in its head, what counts as verified progress, what should remain parked for the user, and when the session should yield.

Operationally, PROMPT.md says what the task is. Unattended mode says how the agent should behave during the next stretch of unattended execution.

Where codoxear fits

codoxear is the UI around the practice, not the practice itself.

It gives me a queue of running sessions rather than a pile of terminal tabs. Sessions can be snoozed, deprioritized, blocked on dependencies, resumed from another device, or announced through desktop/mobile notifications. The actual agent process remains where it started; the phone is just a controller.

Long-running work creates state: active sessions, blocked sessions, snoozed sessions, and sessions that only need a small steering prompt. Without an operating surface, I lose track of the work. Without the context hierarchy, the agent loses track of which instructions matter at which timescale.

The placement question

Before adding context, I ask how long it should live.

Global instructions hold durable method. Project instructions and skills hold local truth. PROMPT.md holds the current task bundle. Unattended mode holds continuation policy.

The context window can be large, the agent can have a goal command, and a loop can keep turning. The operator still has to place information where it belongs.