Context is King
What you bring to your agent is the foundation of agentic impact
There’s a hierarchy in AI-assisted work that most practitioners get backwards.
They obsess over model selection. They debate Claude vs GPT vs Gemini. They study orchestration frameworks and agent architectures. They optimize token costs and latency.
None of this matters if you get context wrong.
Context—the information you pass to the model—is the single most important factor determining whether an agentic task succeeds or devolves into a hallucination-filled mess. Model choice matters. Orchestration plays its role. But without the right context on a complicated query, you’re navigating with a broken compass.
I’ve spent months building agentic workflows for development and research. The pattern is unmistakable: invest in context engineering, and everything else falls into place.
The Prompt Era is Over
Remember when ChatGPT hit the scene? Everyone was collecting prompts like rare baseball cards. “This persona pattern unlocks 10x productivity.” “This Chain of Thought template changes everything.” We saved them in Notion, shared them on Twitter, treated them like secret weapons.
That era is over.
Here’s what I’ve learned for complicated tasks: just ask the LLM to write the prompt for you. Describe what you’re trying to accomplish, and the model will generate a better prompt than any you’d craft by hand.
Research backs this up. Zhou et al.’s Automatic Prompt Engineer (APE) tested automated prompt generation against human-authored prompts; the generated instructions matched or beat the human-written ones on 24 of 24 Instruction Induction tasks and 17 of 21 curated BIG-Bench tasks. The models are better at prompting themselves than we are at prompting them.
This flips the mental model. Instead of memorizing prompt patterns, focus on clearly articulating what you want. The model handles the translation.
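Here’s a minimal sketch of that loop, assuming the OpenAI Python SDK; the model name and the example task are placeholders, not a recommendation:

```python
# Meta-prompting sketch: ask the model to author the prompt, then run it.
# Assumes the OpenAI Python SDK; "gpt-4o" and the task are placeholders.
from openai import OpenAI

client = OpenAI()

def generate_prompt(goal: str) -> str:
    """Ask the model to write a prompt for the stated goal."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": (
                "Write a detailed, self-contained prompt that would get an "
                f"LLM to do this task well: {goal}\n"
                "Return only the prompt text."
            ),
        }],
    )
    return response.choices[0].message.content

def run_prompt(prompt: str) -> str:
    """Execute the generated prompt in a fresh call."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(run_prompt(generate_prompt("Summarize a PR diff into release notes")))
```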
One interesting pattern I’ve seen in practice: Ticket-Driven Development (TkDD) approaches where prompts are auto-generated at ticket creation time and stored as part of the ticket metadata. The ticket captures intent; the system generates the optimal execution prompt. Context engineering at the point of capture.
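A hypothetical sketch of what that might look like, with the LLM call stubbed out; the Ticket fields and generate_prompt are my assumptions, not any particular tracker’s schema:

```python
# TkDD sketch: the execution prompt is compiled once, at ticket creation,
# and stored as ticket metadata. generate_prompt stands in for the LLM
# call shown in the previous sketch.
from dataclasses import dataclass, field
from datetime import datetime, timezone

def generate_prompt(goal: str) -> str:
    # Placeholder for an LLM call that authors the execution prompt.
    return f"You are implementing the following ticket: {goal} ..."

@dataclass
class Ticket:
    title: str
    intent: str                 # what the human wants, in plain language
    execution_prompt: str = ""  # compiled at the point of capture
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def create_ticket(title: str, intent: str) -> Ticket:
    """Capture intent and compile the optimal prompt at creation time."""
    return Ticket(title=title, intent=intent,
                  execution_prompt=generate_prompt(intent))

ticket = create_ticket(
    "Add OAuth to payment service",
    "Wire the payment service into the shared auth library",
)
print(ticket.execution_prompt)
```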
The Automated Memory Problem
Most consumer LLM clients now ship with automated memory. Claude, ChatGPT, Gemini—they all try to remember things about you across conversations. It happens behind the scenes, and for casual use, it works fine.
For serious agentic work, automated memory is dangerous.
The fundamental issue is incompleteness and lack of transparency. These systems decide what to remember based on heuristics you can’t see or correct. OpenAI’s forums are full of users reporting duplicated memories, ignored context, and the system aggressively storing irrelevant details while missing important ones.
I experienced this firsthand. I was demonstrating LLM capabilities to a friend in finance and used an example about someone making $50,000 annually. That number got stored. For weeks afterward, financial questions were answered with assumptions about my $50k salary. Every investment recommendation, every budgeting suggestion, every career question—all biased by a throwaway example I’d forgotten about.
The system remembered the wrong thing, and I had no visibility into the error until its effects became obvious.
This is the core tension: automated memory optimizes for user-friendliness by hiding complexity. But for agentic work, you need control. You need to know exactly what context is being injected. You need the ability to curate, correct, and evolve your knowledge deliberately.
Deliberate Long-Term Memory
The alternative is deliberate knowledge construction. Instead of hoping the system remembers the right things, you build structured knowledge stores that you control.
The tactical toolbox here is broader than most people realize.
Markdown as universal substrate. Markdown files are both human-readable and agent-readable. Any file system becomes a knowledge base. You don’t need a specialized database—you need a folder of well-organized .md files.
Tools like Obsidian store data as markdown while giving you powerful tooling for discovery, search, linking, and extension through hundreds of community plugins. Store the vault in git, and you gain change history, the ability to revert, and visibility into how your knowledge evolves.
The key insight: markdown future-proofs your investment. If a better tool emerges, your knowledge migrates trivially. No lock-in, no export nightmares.
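As a concrete illustration, here’s a minimal sketch that treats any folder of markdown files as an agent-readable knowledge base by indexing Obsidian-style [[wiki-links]]; the vault path is a placeholder:

```python
# Index a folder of markdown notes and the [[wiki-links]] between them.
# Assumes Obsidian-style links; the vault path is a placeholder.
import re
from pathlib import Path

LINK = re.compile(r"\[\[([^\]|]+)")  # matches [[Note]] and [[Note|alias]]

def index_vault(root: str) -> dict[str, list[str]]:
    """Map each note title to the notes it links to."""
    index = {}
    for path in Path(root).expanduser().rglob("*.md"):
        text = path.read_text(encoding="utf-8")
        index[path.stem] = LINK.findall(text)
    return index

for note, links in index_vault("~/vault").items():
    print(note, "->", links)
```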
Establish an ontology. An ontology is a formal structure for modeling knowledge—defining concepts and their relationships within a domain. Think about how Netflix understands the relationship between “genre” and “actor” to power recommendations. They’re not just storing data; they’ve built a conceptual framework that makes data meaningful.
For personal knowledge management, this means folders for specific content types—Goals, Projects, Resources, People—with metadata linking them together. A goal links to the projects pursuing it. A project links to the resources it uses and the people involved. A person links to the conversations you’ve had with them.
This isn’t bureaucracy. It’s leverage. As new information enters your system, the ontology ensures it connects to existing knowledge automatically. Every new note enriches the whole.
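To make that concrete, here’s a sketch of the content types as plain data structures; in a markdown vault the same links would live in frontmatter fields, and all the names here are illustrative:

```python
# Ontology sketch: each note type declares which other types it links to,
# so new notes connect to existing knowledge by construction.
from dataclasses import dataclass, field

@dataclass
class Person:
    name: str
    conversations: list[str] = field(default_factory=list)  # note titles

@dataclass
class Resource:
    title: str

@dataclass
class Project:
    title: str
    resources: list[Resource] = field(default_factory=list)
    people: list[Person] = field(default_factory=list)

@dataclass
class Goal:
    title: str
    projects: list["Project"] = field(default_factory=list)

# A goal links to the projects pursuing it; a project links to the
# resources it uses and the people involved.
alice = Person("Alice", conversations=["2024-05-02 sync"])
notes = Resource("Graph RAG notes")
migration = Project("Ship Graph RAG", resources=[notes], people=[alice])
goal = Goal("Better retrieval", projects=[migration])
```

The design point is that the link fields are part of the type: you can’t capture a project without deciding what it connects to.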
From Retrieval to Graph RAG
I wrote about Graph RAG in detail, but it’s worth revisiting in this context.
Traditional RAG treats your knowledge as a bag of disconnected paragraphs. You embed chunks, find the most “similar” ones to a query, and hope coherence emerges.
It works. Sort of. Until it doesn’t.
The failure mode appears when you need context that isn’t semantically similar to your query but is structurally essential. Ask “What’s the authentication approach for the payment service?” and vector search finds chunks mentioning “authentication” and “payment.” But it misses the shared auth library that all services inherit from. It misses the decision doc explaining why you chose OAuth. It misses the three other services using the same pattern, which would tell you this is established convention rather than a one-off.
These aren’t similar in embedding space. They’re related in structure.
Graph RAG combines vector search for semantic similarity with graph traversal for structural relationships. When you query, the system retrieves not just similar content but connected context—decisions that led here, patterns this follows, components this touches.
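A toy sketch of the idea: vector search picks the semantically closest chunk, then a graph walk pulls in the structurally connected context. The embedding function is a stand-in for a real model, and the corpus and edges are invented:

```python
# Graph RAG sketch: semantic ranking plus one-hop graph expansion.
import math

def embed(text: str) -> list[float]:
    # Placeholder: a real system would call an embedding model.
    return [float(ord(c) % 7) for c in text[:16].ljust(16)]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

chunks = {
    "payment-auth": "Payment service authentication flow",
    "oauth-adr": "Decision record: why we chose OAuth",
    "shared-auth-lib": "Shared auth library all services inherit",
}
# Structural edges the embeddings can't see.
graph = {
    "payment-auth": ["oauth-adr", "shared-auth-lib"],
    "oauth-adr": [],
    "shared-auth-lib": [],
}

def graph_rag(query: str, top_k: int = 1, hops: int = 1) -> list[str]:
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(chunks[c])),
                    reverse=True)
    selected = ranked[:top_k]
    frontier = list(selected)
    for _ in range(hops):  # expand along edges to pull connected context
        frontier = [n for c in frontier for n in graph.get(c, [])]
        selected.extend(n for n in frontier if n not in selected)
    return selected

print(graph_rag("What's the authentication approach for the payment service?"))
```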
Databricks’ research on long-context RAG performance found that across 2,000+ experiments on 13 LLMs, retrieval quality—not model capability—was the determining factor in RAG system effectiveness. More relevant context beats larger context windows.
The Context Engineering Discipline
Google’s Agent Development Kit team recently published their framework for production-grade context management. Their core thesis: “Context is a compiled view over a richer stateful system.”
This is the mental shift required. Stop thinking about context as “stuff I paste into the prompt.” Start thinking about it as a compiled artifact—transformed, filtered, and optimized from underlying knowledge stores.
The ADK framework separates:
Sessions, memory, and artifacts as sources—the full, structured state
Flows and processors as the compiler pipeline—transformations that shape context
Working context as the compiled view shipped to the LLM for a single invocation
Once you adopt this model, context engineering stops being “prompt gymnastics” and starts looking like systems engineering. You ask systems questions: What’s the intermediate representation? Where do we apply compaction? How do we make transformations observable?
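A minimal sketch of that mental model, loosely following the ADK separation of sources, pipeline, and compiled view; none of these names are actual ADK APIs:

```python
# Context-as-compiled-view sketch: structured sources in, an ordered
# (and therefore observable) processor pipeline in the middle, one
# compiled string out. All names are illustrative, not ADK APIs.
from typing import Callable

Processor = Callable[[list[str]], list[str]]

def dedupe(items: list[str]) -> list[str]:
    return list(dict.fromkeys(items))          # drop exact duplicates

def compact(items: list[str]) -> list[str]:
    return [i for i in items if len(i) < 500]  # crude compaction stand-in

def compile_context(sources: dict[str, list[str]],
                    pipeline: list[Processor]) -> str:
    """Compile session, memory, and artifact state into working context."""
    items = [item for bucket in sources.values() for item in bucket]
    for process in pipeline:                   # each pass is inspectable
        items = process(items)
    return "\n\n".join(items)

sources = {
    "session":   ["User is migrating the payment service to OAuth."],
    "memory":    ["All services inherit the shared auth library."],
    "artifacts": ["ADR-12: why we chose OAuth."],
}
working_context = compile_context(sources, [dedupe, compact])
print(working_context)  # the compiled view shipped to the LLM
```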
The Manus agent team shares similar learnings. Their KV-cache hit rate—essentially measuring how much context can be reused across agent steps—is their single most important production metric. It directly affects both latency and cost. With Claude Sonnet, cached input tokens cost $0.30/MTok versus $3/MTok uncached. A 10x difference.
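To see why that metric dominates, here’s the back-of-the-envelope arithmetic using the prices quoted above; the hit rates are illustrative:

```python
# Blended cost per million input tokens as a function of KV-cache hit
# rate, using the Claude Sonnet prices quoted above.
CACHED = 0.30    # $/MTok, cache read
UNCACHED = 3.00  # $/MTok

def blended_cost(hit_rate: float) -> float:
    return hit_rate * CACHED + (1 - hit_rate) * UNCACHED

for rate in (0.0, 0.5, 0.9):
    print(f"{rate:.0%} cache hits -> ${blended_cost(rate):.2f}/MTok")
# 90% cache hits cut input cost from $3.00 to $0.57 per million tokens.
```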
Context stability matters. Context structure matters. Context management is infrastructure.
Practical Tactics
Beyond the architecture, here are specific tactics I’ve found valuable:
Keep context window utilization intentional. Research on context window optimization shows that there’s an optimal balance between “enough context” and “too much noise.” Stuffing everything in degrades performance. Models perform better with fewer, more relevant documents than large volumes of unfiltered data.
Use the LLM to generate context for the LLM. Before complex tasks, ask the model what information it would need to do the task well. Let it tell you what’s missing. Then provide it.
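A sketch of that elicitation step as a reusable template; the wording is just one way to phrase it:

```python
# Context elicitation sketch: before the real task, ask the model to
# list what it's missing. Send this first, then supply the gaps.
ELICIT = """I'm about to ask you to do this task:

{task}

Before I do: list the specific documents, decisions, or facts you would
need to do this well. Do not attempt the task yet."""

task = "Refactor the payment service to use the shared auth library"
print(ELICIT.format(task=task))
```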
Layer context by scope. Some context applies to all tasks (who you are, what you’re working on, your preferences). Some applies to a session (the specific project, the current goal). Some applies only to a single query. Structure your knowledge stores to match these scopes.
Make context visible. Whatever system you build, ensure you can inspect what context is being passed to any given query. Debug your context the way you’d debug code.
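A minimal sketch combining the last two tactics, assuming global- and session-scoped notes live as files; the paths are placeholders for however you store them:

```python
# Scope-layered context with a debug view of exactly what was injected.
from pathlib import Path

LAYERS = {
    "global":  Path("context/global.md"),   # who you are, your preferences
    "session": Path("context/session.md"),  # current project and goal
}

def build_context(query: str, debug: bool = True) -> str:
    parts, used = [], []
    for name, path in LAYERS.items():
        if path.exists():
            parts.append(path.read_text(encoding="utf-8"))
            used.append(name)
    parts.append(query)  # query-scoped context goes last
    if debug:
        print("context layers injected:", used + ["query"])
    return "\n\n---\n\n".join(parts)
```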
Invest in capture habits. The best context engineering means nothing if you don’t capture knowledge as it emerges. Daily notes, inbox processing, regular reviews—the discipline of capture compounds into the value of retrieval.
The Path Forward
Cognizant recently announced they’re deploying 1,000 “context engineers” powered by a new platform called ContextFabric. Whether that specific initiative succeeds or not, the signal is clear: enterprises are recognizing that context management is the bottleneck to agentic AI value.
MIT Technology Review’s 2025 retrospective characterized the year as a shift from “vibe coding” to “context engineering”—moving from a loose, intuition-based approach to systematic management of how AI systems process context.
The models will keep getting better. The orchestration frameworks will mature. But the fundamental constraint remains: agents can only work with what you give them.
Context is king. Everything else is optimization at the margins.
I’m building an AI-first development workflow that combines dual track agile, TkDD (Ticket-Driven Development), and structured context engineering. More on the practical implementation in future posts.