CC - Context Engineering Theory
How the LLM Works
Everything Claude does starts with text. You send a message, and behind the scenes, that message — along with a lot of other content — gets assembled into a single string and sent to the language model. The model reads it, predicts what should come next, and that prediction is Claude's response.
That string is called the context. It's literally what gets fed into the model. And because the context is the only input the model sees for a request, context is everything. Better context → better output. The wrong context → hallucinations, mistakes, missed requirements.
Claude Code's job is to assemble the right context string for every request. It pulls together:
System-level content: the system prompt, descriptions of available tools, your instructions file (CLAUDE.md), skill descriptions
User-level content: your current message, the conversation history, the results of any tool calls Claude has already made
When you write an instructions file or a skill, you're contributing to that context string. You're shaping what the model sees, and therefore what it does.
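As a rough illustration of the assembly described above, the sketch below joins the system-level and user-level pieces into one string. The names `ContextParts` and `assemble_context`, the section labels, and the ordering are all assumptions made for illustration; this is not Claude Code's actual prompt format.

```python
from dataclasses import dataclass, field


@dataclass
class ContextParts:
    """Hypothetical container for the pieces listed above."""
    system_prompt: str
    tool_descriptions: list[str]
    instructions_file: str          # contents of CLAUDE.md
    skill_descriptions: list[str]
    history: list[str] = field(default_factory=list)
    tool_results: list[str] = field(default_factory=list)
    user_message: str = ""


def assemble_context(parts: ContextParts) -> str:
    """Join every piece into the single string the model would read.

    The section labels and their order are illustrative assumptions.
    """
    sections = [
        ("SYSTEM PROMPT", parts.system_prompt),
        ("TOOLS", "\n".join(parts.tool_descriptions)),
        ("INSTRUCTIONS (CLAUDE.md)", parts.instructions_file),
        ("SKILLS", "\n".join(parts.skill_descriptions)),
        ("HISTORY", "\n".join(parts.history)),
        ("TOOL RESULTS", "\n".join(parts.tool_results)),
        ("USER", parts.user_message),
    ]
    # Skip empty sections so they do not take up space in the context.
    return "\n\n".join(f"[{label}]\n{body}" for label, body in sections if body)


parts = ContextParts(
    system_prompt="You are a coding assistant.",
    tool_descriptions=["read_file: read a file from disk"],
    instructions_file="Use pytest for all tests.",
    skill_descriptions=["release-notes: draft release notes from commits"],
    user_message="Add a test for the parser.",
)
context = assemble_context(parts)
print(context)
```

Anything you put in the instructions file ends up inside `context` on every call, which is why its wording has so much influence on the result.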
The Agent Loop
Here's what actually happens when you send Claude a message:
1. Claude assembles the context string and sends it to the LLM
2. The LLM decides: is the task done? If not, it picks a tool (read a file, write a file, call an MCP server, load a skill)
3. The tool runs, and its output gets added back to the context string
4. The updated context string goes back to the LLM
5. Repeat until the LLM decides the task is complete
This is why Claude can do complex multi-step work. It's not solving the problem in one pass — it's going around this loop many times, accumulating context as it goes, until it's confident the work is done.
Skills slot into step 2. When the LLM decides a skill is relevant, it loads the skill content into the context string. That's how a skill influences behavior: it adds more text to the context, and that text shapes what the LLM does next.
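The loop can be sketched in a few lines. Everything here is a stand-in: `fake_llm` is a hard-coded function that plays the role of the model, the two tools return canned text, and the `[tool:...]` tag format is invented for the example. The point is only the shape of the loop: decide, run a tool, append the output, repeat.

```python
from typing import Callable

# Hypothetical tools: each takes one string argument and returns text.
TOOLS: dict[str, Callable[[str], str]] = {
    "read_file": lambda path: f"(contents of {path})",
    "load_skill": lambda name: f"(instructions from skill '{name}')",
}


def fake_llm(context: str) -> dict:
    """Stand-in for the model: picks the next action from the context alone."""
    if "(instructions from skill" not in context:
        return {"action": "tool", "tool": "load_skill", "arg": "write-tests"}
    if "(contents of" not in context:
        return {"action": "tool", "tool": "read_file", "arg": "parser.py"}
    return {"action": "done", "response": "Added tests for parser.py."}


def agent_loop(context: str, llm=fake_llm, max_turns: int = 10) -> tuple[str, str]:
    """Run the loop and return (final response, final context)."""
    for _ in range(max_turns):
        decision = llm(context)            # send the current context to the model
        if decision["action"] == "done":   # the model decides the task is complete
            return decision["response"], context
        # Otherwise run the chosen tool and add its output to the context.
        output = TOOLS[decision["tool"]](decision["arg"])
        context += f"\n[tool:{decision['tool']}] {output}"
    raise RuntimeError("turn limit reached before the task was finished")


response, final_context = agent_loop("[user] Add tests for the parser.")
print(response)
print(final_context)
```

Note that loading a skill is just another tool call in this sketch: its only effect is to append more text to the context, which changes what the stand-in model does on the next turn.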
The Context Window
The context window is the maximum amount of text the LLM can process at once. Think of it as a fixed-size container. Everything has to fit: the system prompt, your instructions file, the conversation history, tool outputs, skill content, and the space reserved for Claude's response.
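To make the fixed-size container concrete, here is a minimal budgeting sketch. The window size, the response reserve, and the four-characters-per-token estimate are assumptions chosen for illustration; real limits depend on the model, and real token counts come from the model's tokenizer.

```python
WINDOW_TOKENS = 200_000     # assumed window size, for illustration only
RESPONSE_RESERVE = 8_000    # assumed space held back for the reply


def estimate_tokens(text: str) -> int:
    """Very rough heuristic (about 4 characters per token); real tokenizers differ."""
    return max(1, len(text) // 4) if text else 0


def window_report(components: dict[str, str]) -> dict[str, int]:
    """Return estimated tokens per component plus the room left in the window."""
    usage = {name: estimate_tokens(text) for name, text in components.items()}
    used = sum(usage.values()) + RESPONSE_RESERVE
    usage["response_reserve"] = RESPONSE_RESERVE
    usage["free"] = WINDOW_TOKENS - used   # negative means the content does not fit
    return usage


report = window_report({
    "system_prompt": "x" * 12_000,
    "instructions_file": "x" * 4_000,
    "history": "x" * 400_000,
    "tool_outputs": "x" * 200_000,
})
print(report)
```

In this made-up session the conversation history and tool outputs dominate the budget, which is typical of long sessions: the fixed pieces stay the same size while the accumulated pieces keep growing.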
When the context window fills up, Claude starts to lose track of things. It might forget earlier instructions, repeat itself, or make decisions that seem inconsistent with what you asked. This is called context rot.
When the window is close to full, Claude will compact the context, summarizing older parts of the conversation to free up space. This helps, but it's not perfect: the compacted version loses some detail.
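The idea behind compaction can be sketched as follows. This is not how Claude performs it: the `summarize` function here is a placeholder that simply truncates each message (a real system would ask the model to write the summary), and the character threshold and `keep_recent` parameter are hypothetical.

```python
def summarize(messages: list[str]) -> str:
    """Placeholder summarizer: keeps only the first 40 characters of each message."""
    return "[summary] " + " | ".join(m[:40] for m in messages)


def compact(history: list[str], max_chars: int, keep_recent: int = 2) -> list[str]:
    """Replace older messages with one summary when history exceeds max_chars."""
    split = len(history) - keep_recent
    if split <= 0 or sum(len(m) for m in history) <= max_chars:
        return history                      # nothing to compact
    older, recent = history[:split], history[split:]
    return [summarize(older)] + recent      # recent messages are kept verbatim


history = [
    "user: Please always use tabs, never spaces, in every Python file you touch.",
    "assistant: Understood, I will use tabs.",
    "user: Now refactor parser.py.",
    "assistant: Refactored parser.py.",
]
compacted = compact(history, max_chars=100)
for message in compacted:
    print(message)
```

The first instruction is cut off partway through in the summary, which is the kind of lost detail the paragraph above warns about.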
What this means practically:
Keep your CLAUDE.md concise, since it's loaded with every message
Start a new conversation if you've been working in one for a long time
For very long tasks, break them into smaller conversations
✏️ Exercise: Check the Context Window
To see where you stand, run this in Claude:
/context
Claude will give you a breakdown of the context window: how full it is, what's taking up space, and how much room is left. This is useful any time you're doing a long session and things start to feel a little off.
If the window is getting full, it's a good time to start a fresh conversation.
Key Takeaways
Skills: markdown files that define how to do something. Claude invokes them automatically when relevant, or you can call them with /skill-name. The description is what triggers auto-invocation.
Instructions files: CLAUDE.md at the project root. Loaded with every request. Defines the universe Claude operates in.
Context: the full string of text sent to the LLM. Everything you do in context engineering is shaping this string.
Context window: finite capacity. Keep instructions lean and start fresh conversations when things get long.
Context rot: what happens when the window fills up. Claude loses focus and consistency.